Glam Prestige Journal

Bright entertainment trends with youth appeal.

I am just picking up Excel, but have experience with R and Stata. Does Excel have a missing symbol convention? That is, does Excel have something like R's NA or Stata's .?

For example, I evaluate an IF() statement and want to return a value that will be omitted from later calculations I use "NA". Is this the correct approach in Excel? It seems like I am missing a basic concept, but I can't get Google to give me a better answer.

To make this more concrete, I have a conditional like =IF([@[Div Dummy]]=1,"NA",EOMONTH(A8,0)). Is there a way to get a numeric "missing value" placeholder so that I don't get warnings about conflicted data types?

2

3 Answers

You're almost there. You can use NA() in your formulas to return that "error" value. It's very useful when charting because it's not charted (unlike 0).

4

You can leave cells empty and use (if I remember correctly) =isblank() to do the conditional. And it's worth noting that many numerical functions like =sum() and =product() will ignore empty cells, ala sum(x, na.rm=TRUE) in R.

But no, this is one of dozens of reasons why Excel is not a serious tool for statistical data. (Although it's not terrible for simple simulation and optimization problems.)

2

I find array formulas in Excel very useful for solving a lot of these kinds of problems. If you don't know array formulas yet, here's a example:

Data Col A Col B Col C
Row 1: Fruit Number Color
Row 2: Apple 5 Green
Row 3: (leave A3 blank) 10 Yellow
Row 4: Peach 6 (leave C4 blank)
Row 5: Grape 6 Purple

Then, enter this in another cell, somewhere:

=AVERAGE(IF((A2:A5<>"")*(C2:C5<>""),B2:B5))

To make it an array, you MUST use Ctrl+Shift+Enter.  Your formula will look like this:

{=AVERAGE(IF((A2:A5<>"")*(C2:C5<>""),B2:B5))}

but do not type the curly braces; they will appear automatically when you use Ctrl+Shift+Enter.

In that formula "" stands for missing, <> means not equal to, and the * means AND. So, that formula says, If A2:A5 is not missing and C2:C5 is not missing, then return the average of B2:B5. In this case, the result is 5 (average of 4 and 6). 5 and 10 are excluded because values are missing in A3 and C4.

Now try this:

=AVERAGE(IF((A2:A5<>"")+(C2:C5<>""),B2:B5))

Enter as an array.

The only difference is + instead of *. + means OR. So, in this case the answer is 6.25, because all of the values in B have a non-missing value in A or C.

AVERAGE can be replaced by a number of other functions such as MAX, MIN, SUM, COUNT, and you can put conditionals in front of the array: e.g.,

=IF(B2>3,AVERAGE(IF((A2:A5<>"")+(C2:C5<>""),B2:B5)),"N/A")

Enter as an array formula. So, if some value gives permission, then run the array, otherwise return "N/A". The permission value can be anywhere else on your spreadsheet and does not have to be contained in an array.

You can also filter by values within the target column: for example:

=AVERAGE(IF(((A2:A5<>"")+(C2:C5<>""))*(B2:B5>4),B2:B5))

so, if (A2:A5 is not missing OR C2:C5 is not missing) AND B2:B5 is greater than 4, the answer is 7 because it will take the average of 5, 6, and 10.

You can also do this...

=AVERAGE(IF((A2:A5="Apple")+(A2:A5="Pear"),B2:B5))

(note that "Apple" and "Pear" can be replaced with cell references).  The answer is 4.5 because it will average 4 and 5.

or this: First enter =10/0 in B3. You will get a divide by zero error that will replace the value 10.

=SUM(IF((ISNUMBER(B2:B5)),B2:B5))

Enter as array.

So, this looks across the array B2:B5 and will only include numbers. Since #DIV/0 is not a number, it will sum 4, 5, 6 for an answer of 15.

I'm sure there are limits somewhere on the number of conditionals within an array but I've never run into a situation where I ran out. As well, this is such a flexible tool, only really limited by your creativity. Lot's of interesting things you can do with embedded MAX and MIN functions within an array formula :)

Be aware though that like other kinds of formulas in Excel, array formulas use resources and can slow your spreadsheet.

I find these particularly useful for creating summary reporting in tables that can be highly customized. Because these formulas can be dragged, if you are careful with your absolute references ($), you can very quickly create a large number of these.

Your Answer

Sign up or log in

Sign up using Google Sign up using Facebook Sign up using Email and Password

Post as a guest

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy