I would like to efficiently find the $(\sigma,\lambda)$ that maximises the log-likelihood of the derived distribution below. I only need $\sigma$ and $\lambda$ to one decimal place, so not very precise. I have tried lowering both AccuracyGoal and PrecisionGoal to as low as 1 or 2, but this doesn’t appear to affect how quickly the solution is obtained.

The code below defines the distribution, generates some test data, then attempts the maximisation:

aDist[σ_, λ_] := TruncatedDistribution[{0, ∞},
  MixtureDistribution[{1, 1}, {NormalDistribution[0, σ], ExponentialDistribution[λ]}]];

data = If[# > 0, #, 0] & /@ RandomVariate[aDist[4, 1/3], {20}];

NMaximize[{LogLikelihood[aDist[σ, λ], data], 10 > σ > 0, 1 > λ > 0}, {σ, λ}]

Whilst this method returns a solution in a short time for 20 data points, my actual dataset has 1000+ points. At that size the method above is too slow to be usable.

I have tried some of the different methods available to NMaximize, including “RandomSearch”, and “DifferentialEvolution”. However, the method I choose does not seem to make the maximisation run faster.

I have also tried FindMaximum, starting the solver close to the actual parameter values, but this calculation appears to hang forever. I have also tried differentiating the log-likelihood and then using NSolve and FindRoot, but again without success. Finally, I tried a variant of the EM algorithm, where I alternate between maximising over $\sigma$ and then $\lambda$, but again this doesn’t help.

Does anyone have any ideas here? I know I could use MCMC if I turn the problem into a Bayesian one, but for reasons I won’t mention here I don’t want to do this.

As an aside, I can see why it is difficult to find a reasonable maximum to this function, since there is a high degree of correlation between the parameters of the distribution. However, I can’t help but think there’s a solution of which I’m neglecting to think.
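
For reference, the density that aDist defines can be written out explicitly. This is a sketch, assuming the weights {1, 1} in MixtureDistribution are normalised to 1/2 each and that NormalDistribution[0, σ] takes σ as the standard deviation. The untruncated mixture then puts mass 1/4 + 1/2 = 3/4 on (0, ∞), so for $x > 0$ and data $x_1,\dots,x_n$:

```latex
f(x;\sigma,\lambda)
  = \frac{\tfrac{1}{2}\,\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^{2}/(2\sigma^{2})}
          + \tfrac{1}{2}\,\lambda e^{-\lambda x}}{3/4}
  = \frac{2}{3}\left[\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x^{2}/(2\sigma^{2})}
          + \lambda e^{-\lambda x}\right],

\ell(\sigma,\lambda)
  = n\log\frac{2}{3}
  + \sum_{i=1}^{n}\log\!\left[\frac{1}{\sigma\sqrt{2\pi}}\,e^{-x_i^{2}/(2\sigma^{2})}
          + \lambda e^{-\lambda x_i}\right].
```

The constant $n\log\frac{2}{3}$ does not affect where the maximum lies, so only the sum depends on $(\sigma,\lambda)$.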

=================

Have you seen FindDistributionParameters[]?

– J. M.♦

Feb 21 at 1:31

I had tried it, but must have made a mistake. I have tried it now, and it works much, much better than previous methods. Thank you very much. I will close the question now. Best, Ben

– ben18785

Feb 21 at 1:38

Why do you use both If[# > 0, #, 0] and TruncatedDistribution?

– Coolwater

Feb 21 at 9:17

Further to Coolwater’s comment, what do you think the If[# > 0, #, 0] is doing in data = If[# > 0, #, 0] & /@ RandomVariate[aDist[4, 1/3], {20}]? It is doing absolutely nothing. aDist is defined on the positive real line, so every sample is already positive and your If statement is superfluous.

– wolfies

Feb 21 at 15:54

=================

1 Answer

=================

As per J.M.’s comment, FindDistributionParameters works really well here. In particular, to get a low-precision answer I used:

FindDistributionParameters[data, aDist[σ, λ], WorkingPrecision -> 3]

This solves the 1,000 data points case in seconds. Thanks again to J.M.!
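
For anyone wanting to reproduce the larger case, here is a minimal sketch. The seed, the name bigData, and the starting values {σ, 2}, {λ, 0.5} are illustrative assumptions and were not part of the original run; the three-argument form of FindDistributionParameters simply supplies initial guesses for the parameters.

```mathematica
(* Same truncated mixture as in the question *)
aDist[σ_, λ_] := TruncatedDistribution[{0, ∞},
  MixtureDistribution[{1, 1},
   {NormalDistribution[0, σ], ExponentialDistribution[λ]}]];

(* Arbitrary seed so the simulated sample is reproducible (assumption) *)
SeedRandom[1];

(* 1000 draws; no If[...] wrapper is needed because the truncated
   distribution only produces positive values *)
bigData = RandomVariate[aDist[4, 1/3], 1000];

(* Fit with hypothetical starting values and report the elapsed time *)
AbsoluteTiming[
 fit = FindDistributionParameters[bigData, aDist[σ, λ],
   {{σ, 2}, {λ, 0.5}}]
 ]

(* Round the estimates to the one decimal place asked for *)
Round[{σ, λ} /. fit, 0.1]
```

Rounding the default-precision result is one alternative to lowering WorkingPrecision; which of the two is faster on a given dataset is something to check with AbsoluteTiming instead of assuming.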