Why is FindMinimum slow for this convex cost function of one variable?

I implemented a Kalman filter, which works perfectly. This filter has four parameters, denoted here by Q1, Q2, R1, R2. I'd like to find the optimal parameters, in the sense of minimizing an error measured by a function error. To keep things simple, I fix Q2, R1, R2, so the optimization problem is in the single variable Q1.

My problem in short: the cost function to optimize is quite inexpensive to compute and is (at least locally) convex:

ListLinePlot[Table[{Q1,error[Q1,0.01,0.1,0.1]},{Q1,0,1,0.01}]] // AbsoluteTiming

However, when trying to optimize:

FindMinimum[{error[Q1, 0.01, 0.1, 0.1], 0 <= Q1 <= 1.}, {Q1, 0.1}]

never seems to end. My goal is to be able to optimize all four parameters, but I fail even for one. This is unclear to me, since the cost function seems well behaved. Any ideas on why FindMinimum is slow?

Full code:

(* generate data (vList) from reference (vFun) *)
n = 200;
SeedRandom[1234]
vData = RandomReal[{-1, 1}, 2*n];
vFun[t_] = Fit[vData, Table[t^n, {n, 0, 50}], t];
vList = Array[vFun, n, {0, n}] + .2*RandomReal[{-.3, .3}, n];
aList = Array[vFun', n, {0, n}] + .2*RandomReal[{-.05, .05}, n];

(* definitions *)
A = {{1., 1}, {0, 1}};
y = Transpose[{vList, aList}];
x[0] = {0, 0};
p[0] = DiagonalMatrix[{10, 10.}];

(* process iteration i *)
step[i_] := Block[{},
  pttm1[i] = A.p[i - 1].Transpose[A] + Q;
  Kt[i] = pttm1[i].Inverse[(pttm1[i] + R)];
  x[i] = A.x[i - 1] + Kt[i].(y[[i]] - A.x[i - 1]);
  p[i] = (IdentityMatrix[2] - Kt[i]).pttm1[i];]

(* run n steps *)
run[Q1_, Q2_, R1_, R2_] := Block[{},
  Q = DiagonalMatrix[{Q1, Q2}];
  R = DiagonalMatrix[{R1, R2}];
  Array[step, n];
  Array[x[#][[1]] &, n]]

(* cost function *)
error[Q1_, Q2_, R1_, R2_] := Block[{xKal},
  xKal = run[Q1, Q2, R1, R2];
  Total@Table[(vFun[i] - xKal[[i]])^2, {i, 1, Length[xKal]}]]

Just for information, the aim of the Kalman filter is to denoise a signal. Example:

sol = run[0.0001, 0.01, 0.3, 0.1];
Show[Plot[vFun[t], {t, 1, n}, PlotStyle -> Red],
 ListLinePlot[{vList, sol}, PlotRange -> Full, DataRange -> {0, n}]]

The red curve is the objective, the blue curve is the input data, the orange curve is the filtered data. The parameters are not optimal here, i.e. with some better parameters the orange curve would better match the red one.




You have 0 <= Q1 < 0 inside the FindMinimum call... Is that intentional? – Rahul Aug 18 at 19:56

@Rahul Thank you for pointing that out, that's a typo. The problem still holds. – anderstood Aug 18 at 20:11

Maybe you should consider the command KalmanFilter? – bill s Aug 18 at 20:31

@bills My goal is to really understand what happens in a Kalman filter, so I prefer to write it with my own hands. It also proves to be a good way of learning MMA 🙂 – anderstood Aug 18 at 20:49

================= 1 Answer 1 =================

The problem seems to be that time is spent evaluating error symbolically. If I define

fn[q_?NumericQ] := error[q, 0.01, 0.1, 0.1]

then

FindMinimum[{fn[Q1], 0 <= Q1 <= 1.}, {Q1, 0.1}]

evaluates quickly, giving

{0.566953, {Q1 -> 0.0883772}}
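Since the stated goal is to optimize all four parameters, the same guard extends naturally. Here is a sketch along the lines of this answer (fn4 is a name invented for illustration; it assumes the error function from the question, and the starting values and bounds are guesses, not tested values):

```mathematica
(* guard every argument so error is only called with explicit numbers *)
fn4[q1_?NumericQ, q2_?NumericQ, r1_?NumericQ, r2_?NumericQ] :=
  error[q1, q2, r1, r2]

FindMinimum[{fn4[Q1, Q2, R1, R2],
   0 <= Q1 <= 1 && 0 <= Q2 <= 1 && 0 <= R1 <= 1 && 0 <= R2 <= 1},
  {{Q1, 0.1}, {Q2, 0.01}, {R1, 0.1}, {R2, 0.1}}]
```

With all arguments guarded, FindMinimum treats fn4 as a black-box numeric objective rather than attempting a symbolic analysis of the full filter recursion.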



Seems to solve the problem. How did you find this? I thought MMA would treat everything as numeric, since all the variables have numeric values.
– anderstood
Aug 18 at 20:48



I think that MMA tries to understand the symbolic form of the expression it is being asked to minimise, before attempting minimisation. In your case, evaluating error[Q1, 0.01, 0.1, 0.1] takes a long time (presumably your code is not intended to work with symbolic inputs). The technique of adding ?NumericQ to any input that only makes sense when defined as an explicit number is a very general one, and is a good “defensive” coding technique.
– mikado
Aug 18 at 20:54
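The ?NumericQ guard described above can be illustrated with a toy pair of definitions (hypothetical functions, not from the thread):

```mathematica
(* without a guard, the definition matches any expression *)
g[x_] := x^2;
(* with the guard, it matches only explicit numbers *)
h[x_?NumericQ] := x^2;

g[a]    (* evaluates symbolically to a^2 *)
h[a]    (* returns unevaluated as h[a], since the symbol a is not NumericQ *)
h[2.5]  (* evaluates to 6.25 *)
```

This is why FindMinimum ends up in pure numerical mode: when it probes fn[Q1] with the bare symbol Q1, the definition simply doesn't fire, and error is only ever evaluated at explicit numeric points.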