I want to compute the mean of a list while ignoring missing values, but I can’t find much information on how to deal with missing data in Mathematica.

I have a list like:

a = {{{0, 1}, {2, Na}} , {{0, 3}, {2, 5}}}

and I want to compute the mean but ignoring the Na values. So my output would be:

a = {{0, 2}, {2, 5}}

How can I do that? Thanks.

=================

What is the expected output for {{{0, 1}, {2, Na}} , {{0, 3}, {2, Na}}}?

– Dr. belisarius

Jan 8 ’13 at 17:27

The question does make sense but it has an important ambiguity. Literally, a is a list of two elements and the first element contains an Na. Thus, “ignoring the Na values,” its mean equals its second element, {{0,3}, {2,5}}. The example, though, suggests (inconclusively) that this list is supposed to be thought of as a list of ordered pairs (each of which is a list of two elements) and that the means should be obtained independently for each component. That’s why we need more information in order to answer this question objectively.

– whuber

Jan 8 ’13 at 18:19

Sorry for the ambiguity. The expected output for {{{0, 1}, {2, Na}} , {{0, 3}, {2, Na}}} would be {{0, 2}, {2, Na}}. Yes, the mean should be obtained independently for each component. Imagine that each sublist is a set of data:

– Sulli

Jan 9 ’13 at 8:55

I was saying, to rephrase the question, imagine that I’m measuring temperature at time t=0 and t=2. I get the data {{0, 1}, {2, Na}}. Then I repeat this experiment and get the data {{0, 3}, {2, 5}}. Now I’d like to compute the average to get only one dataset but ignoring the Na values.

– Sulli

Jan 9 ’13 at 9:04

=================

3 Answers

=================

Assuming that you just want to replicate the behavior of Mean with missing elements then the following may work:

meanWithNa[a_, na_] := MapThread[Mean[{##} /. na -> Sequence[]] &, a, 2]

And we have that

a = {{{0, 1}, {2, Na}, {3, 4}},

{{0, 7}, {2, 5}, {5, 6}},

{{1, 2}, {3, Na}, {7, 8}}

}

Mean[a]

(* => {{1/3, 10/3}, {7/3, 1/3 (5 + 2 Na)}, {5, 6}} *)

meanWithNa[a, Na]

(* => {{1/3, 10/3}, {7/3, 5}, {5, 6}} *)

Which may be what you want.

Here is a slight generalization:

meanWithNa1[a_, na_] := MapThread[Mean[{##} /. na -> Sequence[]] &, a, Depth[a] - 2]

which makes the function work with vectors and matrices.
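
One case the definitions above do not cover is a position where every value is missing, as in the example raised in the comments on the question: the list passed to Mean is then empty, and Mean has nothing to average. A minimal sketch that returns the missing marker in that case, assuming that is the desired result; meanOrNa and meanWithNaSafe are hypothetical names, not built-ins:

```mathematica
(* Hypothetical helper: mean of the non-missing values,
   or the marker itself if no values remain *)
meanOrNa[vals_List, na_] := With[{kept = DeleteCases[vals, na]},
  If[kept === {}, na, Mean[kept]]]

(* Same threading as meanWithNa1, with the helper in place of Mean *)
meanWithNaSafe[a_, na_] := MapThread[meanOrNa[{##}, na] &, a, Depth[a] - 2]

meanWithNaSafe[{{{0, 1}, {2, Na}}, {{0, 3}, {2, Na}}}, Na]
(* => {{0, 2}, {2, Na}} *)
```

The check is made per position, so positions that still have at least one numeric value are averaged exactly as before.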

This one does what I want and is simple enough for me to understand. :) So basically Sequence[] gives an empty input to Mean and Mean is OK with that; is that what Sequence is for?

– Sulli

Jan 9 ’13 at 9:30

Interesting, this effect of an empty Sequence[] seems to be undocumented. Learn something new every day.

– george2079

Jan 9 ’13 at 14:51

Sullivan, it eliminates the missing values from the list. For example: {a,na,b} /. na -> Sequence[] returns {a, b}. In the above, Mean operates over the resulting lists (without the missing values). You can see what’s going on by replacing Mean by some other undefined function.

– ecoxlinux

Jan 9 ’13 at 22:10
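
The explanation in the comment above can be tried directly. A small sketch, assuming x, y, f and Na are symbols with no definitions attached:

```mathematica
(* The rule removes the marker: Sequence[] splices nothing into the list *)
{x, Na, y} /. Na -> Sequence[]
(* => {x, y} *)

(* With an undefined f in place of Mean, the lists that Mean
   would receive become visible *)
MapThread[f[{##} /. Na -> Sequence[]] &, {{{0, 1}, {2, Na}}, {{0, 3}, {2, 5}}}, 2]
(* => {{f[{0, 0}], f[{1, 3}]}, {f[{2, 2}], f[{5}]}} *)
```

The last entry, f[{5}], shows that the position containing Na is averaged over the single remaining value.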

By the way, you could instead do MapThread[Mean[DeleteCases[{##}, na]] &, a, Depth[a] - 2], which seems to be faster.

– ecoxlinux

Jan 9 ’13 at 22:49

And I just noticed that you may also want to wrap na -> Sequence[] in parentheses, so that it works when na is an integer.

– ecoxlinux

Jan 10 ’13 at 2:49

Here is my interpretation of the question:

a = {{{0, 1}, {2, Na}}, {{0, 3}, {2, 5}}, {{3, 8}, {0, 7}}};

DeleteCases[%, Na, -1]

Flatten[%, {{2}, {3}}]

Map[Mean, %, {2}]

{{{0, 1}, {2}}, {{0, 3}, {2, 5}}, {{3, 8}, {0, 7}}}

{{{0, 0, 3}, {1, 3, 8}}, {{2, 2, 0}, {5, 7}}}

{{1, 4}, {4/3, 6}}

More generally as a function:

thread[f_, a_?ArrayQ, pat_] :=

Map[f, Flatten[DeleteCases[a, pat, -1], List /@ Range[2, # + 1]], {#}] &[ArrayDepth@a - 1]

thread[Mean, a, Na]

{{1, 4}, {4/3, 6}}

Edit: my original post produced a transposed result (not apparent due to the symmetry of the example).

a = {{{0, 1}, {2, Na}}, {{0, 3}, {2, 5}}}

Table[ Mean[

Select[ Flatten[Take[a, All, {j}, {i}] ] , NumericQ]] , {j, 2}, {i, 2}]

{{0, 2}, {2, 5}}

Looking at Michael’s example:

a = {{{0, 1}, {2, na}}, {{0, 3}, {2, 5}}, {{3, 8}, {0, 7}}};

The built-in Mean gives:

Mean[a] -> {{1, 4}, {4/3, (12 + na)/3}}

My interpretation of the question is that the last component should be Mean[{na, 5, 7}] “ignoring” the na, which is 6.

Table[Mean[Select[Flatten[Take[a, All, {j}, {i}]], NumericQ]], {j, 2},

{i, 2}] -> {{1, 4}, {4/3, 6}}

Could you add something about how your answer works?

– rcollyer

Jan 8 ’13 at 18:41

In a nutshell: convert the list of 2×2 matrices to a 2×2 matrix of lists of the individual components. Use Select on each component list to drop non-numeric values, then take the mean.

– george2079

Jan 8 ’13 at 19:56
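
The steps described in that comment can be written out one at a time. This is only a sketch: it assumes the Flatten level specification from the DeleteCases/Flatten answer is used for the regrouping (the Table/Take code reaches the same lists by indexing), and components is a hypothetical name:

```mathematica
a = {{{0, 1}, {2, Na}}, {{0, 3}, {2, 5}}};

(* Regroup: for each position {j, i}, collect the values from all matrices *)
components = Flatten[a, {{2}, {3}}]
(* => {{{0, 0}, {1, 3}}, {{2, 2}, {Na, 5}}} *)

(* Drop non-numeric entries from each component list, then average *)
Map[Mean[Select[#, NumericQ]] &, components, {2}]
(* => {{0, 2}, {2, 5}} *)
```

Selecting with NumericQ means any non-numeric marker is dropped, not only Na, which may or may not be what is wanted for a given dataset.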