![]() |
The New Guidelines |
Copyright 2000
John F. Raffensperger, Ph.D.
Department of Management, Private Bag 4800,
University of Canterbury, Christchurch 8004, New Zealand
+ 64-3-364-2987, ext. 8616
My University home page.
Related reference: Raffensperger, John F. (2003) "New Guidelines for Spreadsheets," International Journal of Business and Economics, 2, 2, 141-154.
Current prescriptions for spreadsheet style specify modular separation of data, calculation, and output, based on the notion that writing a spreadsheet is like writing a computer program. Users do not want and do not use this style.
This article presents a new style, based on writing prose, graphics, and mathematics. The new style emphasizes readability.
Some examples are taken from Assign.xls (or modifications of Assign.xls), a spreadsheet produced by Lindo Systems, Inc. I gratefully acknowledge their gracious support.
Assign.xls uses What'sBest!, a math modelling add-in from Lindo Systems, hence you will see the occasional “WBMAX” range name, the “WB(…)” formulas, and the “>=” cell displays.
The problem: “Writing a spreadsheet is like writing a computer program.”
Rule 1: Make your spreadsheet read from left to right and top to bottom.
Corollary to Rule 1: Have short arcs of precedence.
Be concise with sheets.
Be concise with blocks.
Be concise with cells.
Be concise with formulas.
Be concise with blank space.
Rule 3: Format for description, not decoration.
Rule 4: Show - and describe - your assumptions.
It is common knowledge that spreadsheets contain errors like a dog has fleas ( Panko & Sprague 1998). Panko's web site has a large bibliography, with many references in newspapers & magazines.
| The key problem? Overconfidence in a spreadsheet! Never assume a spreadsheet is right, even your own. This takes some humility! You cannot error-proof a spreadsheet. Errors are too easy to add. You can only be careful. |
It has mostly been specified by accountants (References: Bromley (1985), Stone & Black (1989), Kee (1988), Kee & Mason (1988), Bissell (1986), Edge and Wilson (1990), Crain & Fleenor (1989).)
They prescribe a modular “standard format” of data, formulas, and output.
![]()
These authors also recommend some or all of the following: heading, date, file name, author, approval signoff, contents, error summary box, instructions, range names, validation formulas, and no constants in formulas.
They promise, “If you use modular spreadsheets, you will avoid errors.”
They demand, “People must be forced to used standard formats.”
They think, “Writing a spreadsheet is like writing a computer program.”
With all this in a spreadsheet, it's big before we start! Wouldn't you agree that a big spreadsheet is more likely to have errors than a small one?
Operations researchers have done lots with spreadsheets, but not about spreadsheets. Edwards and Finlay (1997) and Winston (1994) discussed modelling with spreadsheets, but say nothing about how we should write with spreadsheets.
In particular, the OR literature recommends nothing about structure. Is a decision variable data, calculation, or output? It looks like a constant, but is produced by an algorithm.
Yet we operations researchers write big models, we produce spreadsheets in consulting. Introductory Management Science is usually where students are taught spreadsheets for the first time. So we operations researchers need to think about spreadsheet style.
It is true that every spreadsheet has data, formulas and output. However, this is an abstract structure for batch-processed punch cards, designed for the computer's needs, not the reader's needs.
The modular structure implies that formulas are a “black box” that the reader should ignore rather than understand. Therefore, rather than attempt to make them clear, formulas are in fact hidden.
But most users do not have a computer background and will therefore not naturally understand this paradigm. Also, managers are hostile to perceived interference, the idea that spreadsheet structure must be standardized. In fact, managers use spreadsheets for the freedom from internal information technology departments (Cragg and King 1993, Davis 1996).
Readers do not want to read a computer program.

Literature, graphic design, film, and management science have awards for good style. These well-established forms of communication have well-established rules for style. But there are no public examples of good spreadsheets, and the available rules of style for spreadsheets leave much to be desired. We are left with a vacuum.
Conway & Ragsdale (1997) reported that “...the spreadsheet design which communicates its purpose most clearly will also be the most reliable, auditable and modifiable design.”
Text + graphics + math = spreadsheet. Take away the text and graphics, and we have a calculator. Take away the math, and we have a presentation authoring package. The spreadsheet is a synergistic sum of these. Since a spreadsheet is a combination of text, graphics, and math, then rules of style for those forms of communication should apply to spreadsheets.
This article gives rules that are common for text, graphics, and mathematics, and applies those rules to spreadsheets.
In an excellent article, Gopen and Swan (1990) observed, “Since we read from left to right, we prefer the context on the left...We prefer the new, important information on the right, since its job is to intrigue the reader.” Gopen and Swan wrote about technical writing, how to write scientific prose.
In Windows Interface Guidelines for Software Design, Microsoft (1995, p. 384) said we should write interfaces the way we read.
To see whether a spreadsheet reads from left to right, we can use an Excel tool, the arcs of
precedence. Many concepts discussed here have to do with formula
dependants and precedents. Arcs of precedence (Tools Auditing in Excel) should
flow down and to the right.
| Bad | Good |
| Precedents surround the cell. | All precedents are above and left. |
![]() |
![]() |
![]() |
![]() |
As in Gopen and Swan, in a spreadsheet, old information is the input data, since the reader is expected to know it. The new information is the formula, derived information that the reader seeks.
We expect to see the data first, and when we have digested that, we expect the output formula, to the right or below nearby. Intermediate formulas logically are data for later formulas, so the rule applies recursively to all numeric cells.
Therefore, to write clearly, each formula should depend only on cells above and to the left.
Gopen and Swan (1990) wrote about prose, for technical journal articles. Old information should be followed immediately by the related new information.
Higham (1993) wrote about writing mathematics. “Try to minimize the distance between a definition and its place of first use.”
Conway and Ragsdale (1997) wrote about spreadsheets. “Things which are logically related…should be arranged in close physical proximity and in the same columnar or row orientation.” Conway and Ragsdale wrote the only article to suggest anything about spreadsheet readability.
If a cell is close to its dependants, the reader will more easily see the relationship between them. The spreadsheet will be naturally organized by blocks of meaning, blocks that reflect the business rather than the mechanical requirements of the spreadsheet.
Long arcs of precedence are bad. Long arcs of precedence make a spreadsheet hard to understand and hard to audit.

Off-screen arcs of precedence are worse. Below, where is the information about preferences? The precedents of the Preference Total cell are not even on this sheet. If you hunt, you can see that Preferences are on a separate sheet.

Arranging by input-calculation-output separates related cells by a large visual distance. A cell in an input block and a cell in a calculation block will be far from each other, so their logical relationship is harder to see. The “input calculation output” structure induces long arcs of precedence, because related cells are not together.
The problems are (1) long arcs of precedence, (2) abstract data types rather than concrete data types, and (3) the use of multiple worksheets when just one sheet would do.
Below is the same spreadsheet, rearranged. Now we see in one click that Preference Total is related to Assignments and Preferences.

So rule number one is “Make your spreadsheet read from left to right and top to bottom.” Related to that, have short arcs of precedence. Put related cells together.
This usually means putting the whole model on one sheet.
Strunk and White (Elements of Style 1979), in writing about prose, wrote, “Vigorous writing is concise.”
Higham (1993), writing about mathematics, under “Otiose symbols”, said, “Do not use mathematical symbols unless they serve a purpose.”
Daellenbach, Systems and Decision Making, 1994, writing about math models, said, “A good model is a model that is as parsimonious as possible in terms of the variables/aspects included.”
Tufte (1983), writing about graphics wrote, “Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space…”
This is an old rule for text, math, and graphics, but it is entirely novel for spreadsheets. It is appropriate for virtually every other form of communication. But some people consider it a virtue to violate this with spreadsheets.
Before we go on, I must define one term: A cell with exactly one precedent is spurious.
Here, D8 = B24. So D8 is spurious. If this were text, D8 would be edited out as repetitive.

If this were graphics, D8 would be erased as redundant. If this were a math model, variable D8 would be substituted out. Also, the writer forgot to format D8 like B24. So we see that spurious cells breed errors and create work for the writer and reader.
Various bad habits motivate the use of spurious references, but mainly the problem is from long arcs of precedence.
Following the structure of a spreadsheet, we shall see that we need to be concise with sheets, blocks, cells, and formulas.
If it will fit on one sheet, put it on one sheet. Why? What is wrong with multiple sheets?
Multiple sheets breed spurious cells and labels, as information is “copied” from other sheets. The writer cannot see data on another sheet, so he adds cells that simply refer to it.
Multiple sheets dislocate related blocks; the reader must remember a context from one screen to the next.
| An arc of precedence to off-sheet cells displays an uninformative “off the sheet” icon, so multiple sheets render auditing tools useless. | ![]() |
Multiple sheets are hard to navigate and make information hard to find.
Key and mouse controls to change sheets are more complicated than navigating a single sheet. Many spreadsheet readers do not know the key sequence for changing sheets (Ctrl-Pg Up, Ctrl-Pg Dn), but most can use keys to move around a single sheet.
| The search function in Excel 97 does not scan different sheets by default, but only the selected sheet. | ![]() |
| This single sheet (only 3 screens) was derived from the monster above.
It is the same model, but with the trivia eliminated and simpler formatting. Which would you rather audit? So be concise in the number of sheets. |
![]() |
The primary data type should be concrete, not abstract.
“People” rather than “data, formula, output.” Time.
The spreadsheet below is organized abstractly, like a linear program. The objective function is at the top, as in LINDO. The different sheets contain staffing constraints and right hand sides, decision variables, objective coefficients, and work capacities.

But the user is likely to be a manager who does not know or care much about LP.
To expand this spreadsheet, say to add another person to the schedule, the writer will have to adjust 3 different sheets.
Repeated labels suggest that the writer could consolidate blocks. A concrete data type, such as “person”, is probably a better structure than an abstract data type, such as “data” or “constraints.” It will make more sense to the manager.

Conway and Ragsdale (1997) wrote, “A design that results in formulas that can be copied is probably better than one that does not.”
Below, the sheet is organized in a concrete way, by person, rather than in an abstract way. Many redundant labels have been erased, making the spreadsheet smaller. To add another person, the manager can copy just one block.

Transposing by formula makes construction and modification difficult. Transposition with formulas is begging for trouble. Notice the blank row inserted in the table at right.

Why didn't the writer just enter the data vertically, the way he needed, it in the first place?
When you build a table, think about which way is the longer. Then put it so the long way goes down. Since most spreadsheets have more rows than columns, and since we prefer to read by paging down, it will be more readable aligned vertically.
However, this applies to more than just one table. Align all parts of the spreadsheet vertically.
The “bulletin board” arrangement below looks like the “for rent” board at the student union. It is almost impossible to find anything. Maybe you could use Excel's search function, if you knew a word to search for, but that fails on multiple sheets.
| Bad | Good |
![]() |
![]() |
We read by paging down, as in a web page, not by wandering randomly across a huge field. Place tables vertically.
| First, another definition. A cell with no dependants is dangling. No other cell needs it. Below, I clicked on the “Trace dependents” button, with the cursor on cell B9. It just went “ding.” Cell B9 has no dependents, so it is dangling. It's there “in case you want to see it.” We don't. |
![]() |
Erase dangling cells. Dangling cells include:
- useless calculations,
- unused inputs,
- validation formulas,
- interpreted output.
Nest and erase to reduce complexity and the number of cells.
Eliminate relics.
| For cell L7, I tried to click the “Show dependants” button, but L7 has no dependants. It is not used in the objective function or in any constraint. Cell L7 is dangling. It is an irrelevant calculation. | ![]() |
Spurious cells are repetitive. But dangling cells are irrelevant - they display unneeded numbers.
| An unused input is also a dangling cell. Why would the writer have an input that
isn't used anywhere? Usually, an unused input is a relic of a bygone version. Cell G15, a constant, is dangling. It is an unused input. |
![]() |
Many authors recommended that formulas be used to check the validity of input and output. These extra formulas are dangling by definition.
| Validation formulas can contain errors themselves and make the spreadsheet harder to
read. They may be useful for debugging, but eventually should be erased. Validations distract the reader and may have errors. Can you see the error here? It is not a serious error. |
![]() |
The fourth kind of dangling cell is the interpreted output. Interpreted output is like a comedian repeating the punch line, hoping for an extra laugh.
Here, the real output is cell K33 on the Input sheet, of all places. This monster cell here is unnecessary.
Don't mix a model's formulas with its documentation. Use formulas only for the model, and use concise constant text for labels.

| Keep dangling cells only for a compelling reason. Here, the What'sBest! constraint formula is dangling, but is required to model the linear program. So there is a compelling reason to keep it. |
![]() |
When a cell has just one dependent, the spreadsheet has an unneeded variable that can be substituted out. In mathematics, we wouldn't think twice about this. In prose, again, we would cut out the extra verbiage. Your grammar school English teacher would attack it with a red pen. In graphics, we would not put a logo twice in advertising copy. It would look silly. But somehow, people feel that extra cells are commendable in a spreadsheet.
| Bad: B18 = B11 - B16 and B19 = B2 - B18. |
![]() |
| Good: B19 = B2 - B11 + B16. Looks like I forgot to correct the label in A19. |
![]() |
| Another example of nest and erase. D8= C4 + C5 + C6 + C7 + C8. E8 = WB(D8,">=",B8). |
![]() |
| Cells D4:D10 are used only in E4:D10. So we can nest D4:D10 into E4:E10. Now D8 = WB(C4+C5+C6+C7+C8, ">=", B8). Now constraints refer only to constants and decision variables. |
![]() |
Here is another example. In this one, the formulas are long because there are multiple sheets.
However, the formulas can also be simplified. Here is the original:
I24=IF(M23<0, 'Accounting Summary'!H24,'Accounting Summary'!H24+'Acct Prep'!M23)
K24=IF(I24=0,0,(+V24*'Accounting Variables'!$B$26*0.5))
M24=+I24-K24
Put on one sheet and simplified, this could be:
I24=H24+MAX(0,M23)
K24=I24-IF(I24=0,0,V24*$B$26/2)
Now the formulas are simpler, smaller, easier to understand.
This contradicts many loud voices telling us to “split complicated formulas into multiple cells.”
I am saying that splitting formulas into multiple cells is generally bad advice. Splitting formulas results in bloated hard-to-read spreadsheets.
Instead, make your spreadsheets as concise as possible, with the fewest number of cells.
A relic is anything no longer needed, but appears because the writer has not cleaned it up. Relics confuse the reader.
A simple way to look for spreadsheet relics is to press the End key, then the Home key. This moves the cursor to the lowest and furthest right cell for which computer memory has been allocated. If this last cell is not the bottom right cell of the intended spreadsheet, then there may be relics. Most of the time, extra rows and columns just contain old formats.
| Hmmm, I wonder if the “Surplus = $1 Million” depends on cell F63? | ![]() |
| Bad: Formulas contain constants. | |
| Good: Formulas reference constants (constants are yellow). |
![]() |
| Bad: References are repeated. References appear randomly. |
|
| Good: References are collected. References read left to right in row and column order. |
|
| Bad: Division appears randomly. | |
| Good: Division appears at the end. |
Try to put division at the end of the formula.
The Number 1 cause of verbose spreadsheets: doing vector products the long way.
This model has 4 constraints and an objective function. Why make it harder than it is?
Bad:

Good:

The writer knew something was wrong here. When he ran the model, things went sour, so he had to add this kludge of 0.000001 to make it work. What's wrong is that he has AJ2 in the numerator and the denominator, so AJ2 has no effect whatsoever as long as AJ2 is a nonzero number.
| Bad: The formula has unneeded spaces and parentheses; it can be simplified. | |
| Good: The formula has the fewest characters necessary. |
Spreadsheet blank ¹ graphical blank.
Graphical blank cannot be covered by nearby information. But a blank cell may appear to have data that is really in another cell.
| Is K32 blank? We could see which cell contains “Surplus = $ 1 Million,” if the cursor were right on it. It can be in any of at least a dozen different cells. | ![]() |
Graphical blank hides nothing. Spreadsheet blank can hide information.
I33 is not blank. Cell I33 contains a big formula. It's not truly blank, though it looks blank. Do not write in invisible ink.

If cells depend on blanks, the spreadsheet is perverse.
Graphic blank has only a visual effect. A cell can be blank, but perversely another cell can depend on it. It is wrong because it is blank. This error is most common when precedence arcs are (1) too long and (2) do not point the way we read.
| Here, if the writer had put a zero in cell Z19, we wouldn't be worried. It's just a strange structure. But Z19 is blank. The writer probably made a mistake. | ![]() |
A reference to a blank is as dangerous as vagueness in legal writing. So be careful with blank space.
Below, in the first two sheets, it is unclear which cells contain some of the labels. Is Preference Total in column B, C, or D? Columns A and C just confuse the reader. In the bottom right sheet, there is no ambiguity.

For better readability, cells should be the way they look.
Use a minimum of blank space, and only to divide blocks visually. A minimum of blank space helps keep the spreadsheet on fewer screens, and helps avoid an impression of hidden or misplaced information.
A format is descriptive if it gives the reader immediately recognizable information. A format is decorative if it displays no information.
Decoration in a spreadsheet is like crayon in a business letter. Format to explain to the reader, not to impress the reader.
| Bad. Meaningless colour and format. Numbers are not formatted. | |
| Good. Text is clean. Formats give information. |
For readability, do colours help or hurt? Would you write a business letter like the spreadsheet below?
In the first sheet, the formats do not give us any information.
To look professional, strive for a clean look rather than distracting big titles and heavy colours. To get attention on a few specific cells, use a logical layout and concise labels.
A graph that is cool looking, but gives little information is what Edward Tufte calls a duck, so named for a restaurant in California that is shaped like that. The whole thing is one big decoration. This sheet looks like a duck - it is mostly decoration.

Below is the same sheet, reformatted. Inputs are boxed in yellow, though lately I've switched to a calmer grey.

Below, at left, there is so much formatting, it is meaningless.
In the centre, I turned it off the format - it's a relief to get rid of the noise.
At right, I formatted the constants in grey. Why are there constants on the Formulas sheet? Shouldn't they be on the Inputs sheet?

Think about how the format will look to your audience. On a laser printer, grey prints beautifully, but yellow becomes white. Blue prints as black.
I used to like all constants in yellow, but I'm switching over to putting formulas in grey. Most of my spreadsheets are constant, so if the constants are formatted, then most of the spreadsheet is formatted, which looks too heavy.
| Yellow constants. But yellow does not print. | Grey formulas, which print. Less format, cleaner, calmer. |
![]() |
![]() |
What'Best! makes decision variables blue, but blue looks black when printed. So I also make decision variables italic, when the spreadsheet must be printed in black and white.
| Pretty, but it hides the business logic. It's a duck! The sheet is formatted to look cool, not to help you understand the spreadsheet. The green title block actually uses a graphic frame. | ![]() |
| Calmer format. Much more fits on a screen. Once the formats are killed, we see this so-called template is trivial. | ![]() |
Readers try to find meaning in column widths. There should be none.
If we see a special column width, it jumps out. So we look for information in it. This suggests you should not use thin columns simply as dividers. It will confuse your reader. Instead, try to make columns all about the same width. This is also visually more pleasing.
| Bad | Good |
![]() |
![]() |
Do not use abbreviations, not even for ones that “everybody knows.” Abbreviations inhibit comprehension. Abbreviations require interpretation, even when the reader knows what they mean. It is better to have the extra space of a wider column or an extra row than to take the chance that the reader will not understand.
Even commonly understood symbols (such as Q for the economic order quantity, A for fixed order cost) should be written out in a spreadsheet (Economic order quantity Q, Fixed order cost A).
Text is most legible when the letters look most different. Upper case is hard to read because the letters tend to be uniform widths and heights, and there are no visual cues (like proper capitalization) that suggest punctuation.
| Bad: Labels have all capitals. Labels are abbreviated. Labels are misspelled. | Good: Labels have proper case. Labels are spelled out. Labels are correctly spelled. |
![]() |
![]() |
Labelling is related to the length of the arcs of precedence. A cell that is far from its dependants is out of context. A cell that is close to its dependants is in context. It is not just labelling or being close to related cells that we seek, it is improved meaning. If the label is unclear, but the cell is close to its dependants, the reader has a chance to figure it out from the context. If the label is unclear, and the cell is far from its dependants, understanding it will take all day.
| Bad: Labels are cryptic. The cell is not near its dependants. |
![]() |
| Good: Labels give information. The cell is near its dependants. |
![]() |
Since we read left to right, we look for the name, the introduction, the description, on the left. Then, when we have understood that, we look for the number on the right. So put labels on the left.
| Bad: Labels are on the right, where we see them last. | ![]() |
| Good: Labels are on the left, where we see them first. | ![]() |
Hiding cells in a spreadsheet is patronizing, irritating, and reduces the reader's confidence in the spreadsheet. If a reader takes the time to audit a spreadsheet with hidden cells, the model cannot be proved correct, because he cannot see the formulas.
Below, cell I32 is formatted as hidden. Auditing - or understanding - this formula is impossible.

(Recently, I did hear of a case where a clever writer used hidden cells to enable the reader to have Autocomplete. In that case, the writer hid irrelevant information to provide added functionality, and a better experience for the reader. Most likely, a drop-down box would have worked just as well, but this is a clever exception to the rule, “Don't hide cells.”)
Password protection can be used to keep information hidden, or or to lock sheets.
Like hiding information, preventing changes is similarly annoying. Writers sometimes keep formulas hidden with password protection to hide the fact that they don't know what they're doing, because they are in fact not sure of their formulas. The writer insults the reader by assuming the reader cannot improve the work, and preventing derivations from it diminishes its utility.
Locking a spreadsheet can help prevent accidental changes by spreadsheet-illiterate readers. Just avoid password protection, or at least give the password out freely on request.
New features in Excel allow the writer to password protect the spreadsheet in such a way that he can track changes made by others in a shared spreadsheet. Tracking changes is good. Hiding information that others deserve to see is bad.
Use password protection only to hide confidential data or to track changes.

Idiot-proofing, a heinous attempt to simplify, involves lots of formatting, heavy lines around the inputs, many colours, an input sheet, a summary sheet, password protection, etc. In short, idiot-proofing means creating a monster.
Instead, have the fewest cells necessary to produce the result, flow the logic from top to bottom and left to right, and put related cells close together. And put it all on one sheet. “Keep it simple” could be restated as “Keep it small.”
11 screens, 5 sheets versus 3 screens, 1 sheet. Which would you prefer to audit?
While these ideas are rather ordinary for mathematics, for writing prose, or for drawing graphics, they are controversial for spreadsheets. Presenting quantitative information is tricky, but if we rely on well-understood rules in the more established arts, we can produce better spreadsheets.
My University home page.
Updated 15 July 2003, jfr