I am a newbie and I’m trying to use Mathematica to obtain the symbolic Maximum Likelihood Estimation for a cumulative normal distribution. So far I have reached the step where I have the derivative of the log likelihood for both μ\mu and σ\sigma, i.e.

score={∂logL∂μ,∂logL∂σ} \text{score}=\left\{\frac{\partial \text{logL}}{\partial \mu },\frac{\partial \text{logL}}{\partial \sigma }\right\}

, or

{n∑i=1−√2πe−(μ−xi)22σ2σerfc(μ−xi√2σ),n∑i=1√2π(μ−xi)e−(μ−xi)22σ2σ2erfc(μ−xi√2σ)}

\left\{\sum _{i=1}^n -\frac{\sqrt{\frac{2}{\pi }} e^{-\frac{\left(\mu -x_i\right){}^2}{2 \sigma ^2}}}{\sigma \text{erfc}\left(\frac{\mu -x_i}{\sqrt{2} \sigma }\right)},\sum _{i=1}^n \frac{\sqrt{\frac{2}{\pi }} \left(\mu -x_i\right) e^{-\frac{\left(\mu -x_i\right){}^2}{2 \sigma ^2}}}{\sigma ^2 \text{erfc}\left(\frac{\mu -x_i}{\sqrt{2} \sigma }\right)}\right\}

The problem is that when I try to solve when these expressions are 0:

Solve[score == {0, 0} && Ïƒ > 0, {Î¼, Ïƒ}]

Mathematica tells me that it can’t solve it (Solve::nsmet: This system cannot be solved with the methods available to Solve).

Is this really such a hard problem or am I missing something?

=================

Try Solve[ Thread[ score={0,0}]…]

– chris

Mar 28 ’13 at 17:17

1

and please typeset mathematica expressions. cf the FAQ.

– chris

Mar 28 ’13 at 17:20

@chris I tried it with Thread but it still doesn’t work, also it works for other distributions such as Poisson, so if Thread is just for the syntax, i don’t think I have a syntax issue

– zamazalotta

Mar 28 ’13 at 17:39

What is your ‘cumulative normal distribution’? Do you mean a Normal distribution, or the distribution of the sample sum, or something else?

– wolfies

Mar 28 ’13 at 17:39

1

Eh? The CDF of the Normal is not the pdf. The MLE is calculated wrt the pdf, not the cdf.

– wolfies

Mar 28 ’13 at 17:45

=================

1 Answer

1

=================

Let us try with a Normal distribution. Start by the log normal of it:

ff = PowerExpand[Log@ PDF[NormalDistribution[Î¼, Ïƒ], x],

Assumptions -> {Ïƒ > 0,Î¼ âˆˆ Reals, x âˆˆ Reals}]

Now the log likelihood obey

Q = Sum[ff /. x -> Subscript[x, i], {i, n}]

We can define the condition of stastionarity

eqn = Thread[D[Q, #] & /@ {Ïƒ, Î¼} == 0]

We need to tell mathematica to expand the sums

eqn2 =

eqn /. Sum[a_, b_] :> sum[Expand[a], b] //.

sum[a_ + b_, c_] :> sum[a, c] + sum[b, c] /.

sum[a_ b_, c_] :> b sum[a, c] /; FreeQ[b, Subscript[x, i]] /.

sum[a_, b_] :> a sum[1, b] /; FreeQ[a, Subscript[x, i]] /.

sum -> Sum

now we can solve

Solve[eqn2, {Î¼, Ïƒ }][[2]]

QED

EDIT

You might want to look at the LogLikelihood function

With this function, taking a definite (n=3) case

Q = LogLikelihood[NormalDistribution[Î¼, Ïƒ], Table[Subscript[x, i], {i, 3}]]

Solve[Thread[D[Q, #] & /@ {Î¼, Ïƒ} == 0], {Î¼, Ïƒ}]

With a numerical example:

data = RandomVariate[NormalDistribution[1, 2], 100];

Q = LogLikelihood[NormalDistribution[Î¼, Ïƒ], data];

sol=NSolve[Thread[D[Q, #] & /@ {Î¼, Ïƒ} == 0], {Î¼, Ïƒ}][[2]]

(* ==> {Î¼-> 0.92576, Ïƒ-> 2.03064} *)

Show[ContourPlot[Q, {Î¼, -1, 3}, {Ïƒ, 1, 3}, ColorFunction -> “Heat”],

Graphics[Point[{Î¼, Ïƒ}] /. sol]]

For a cumulative distribution? What sort of density would you suggest this cumulative distribution has?

– wolfies

Mar 28 ’13 at 17:52

@chris so this means that the only way to fit a dataset with normcdf is numerical optimization?

– zamazalotta

Mar 28 ’13 at 17:54

@wolfies I guess its a good point: the CDF is not normalized…

– chris

Mar 28 ’13 at 17:55

1

@chris thank you very much for your detailed answer 🙂

– zamazalotta

Mar 28 ’13 at 18:07

1

The problem is that a normal cumulative distribution is not a distribution as its integral diverges.

– chris

Mar 29 ’13 at 6:38