Monday, July 6, 2009

numpy/scipy

The more I use numpy and scipy the more I like them. Matplotlib has some quirks, but all in all its easy to quickly write something powerful that just works.

Update (09 July 2009): That said, I am surprised at how simplistic the fitting routines (aka optimization, minimization, etc) are in scipy, and the lack of good generalized fitting packages in python. Its all very well if you want to use the routines provided in scipy.optimize and uncertainties in your data aren't a big issue but that's not at all the realm of data I experience daily.

It may be that scipy.optimize's routines can handle data uncertainties properly through some more complicated interface, but all the examples I've seen don't seem to suggest that as they all totally ignore data errors. Nor do they mention even simple fitting concepts like chi squared (I'd survive with chi squared, I don't need log likelihood all the time).

Update (10 July 2009): OK, you can handle data uncertainties using the scipy.optimize routines, you just make sure the function you want to minimize calculates chi squared or log likelihood, and you effectively have chi^2 fitting or Cash statistic fitting. I haven't yet worked out how to calculate the uncertainties on the fit parameters using scipy.optimize -- the most naive way is to use the covariance matrix (not robust, I know).

Space Telescope's stsci_python provides pytools.nmpfit (a numpy version of python mpfit, itself a version of IDL mpfit). This does Levenberg-Marquardt fitting and does provide estimates of the best-fit parameter uncertainties (using the covariance matrix). After giving it a spin I find it works just as well as scipy.optimize, although at the cost of having to install the entire stsci_python package. More interestingly (n)mpfit has a well developed interface for fixing (and unfixing) fit parameters as well as handling limits on the fit parameters. I ended up using this for the HST ACS data analysis routines I'm writing. Still, none of these approach the sophistication of xspec or sherpa. If it weren't for the huge size of these packages I'd try to link to these behemoths (well, sherpa anyway, as it has a python interface), but I'm trying to (a) use more generalized packages, (b) wean myself away from relying on the same software I've used for years and learn something new...

2 comments:

Unknown said...

I wonder what the intended use of scipy is after all. Without a more useful and complete minimization system the package seems almost useless. In order to do a basic fit with scipy.optimize.leastsq I have to define two anonymous functions? Ok, fine, that's a little more work that gets you a lot more power and control but aren't easy things supposed to be easy? In every use case I've encountered, the weakness of the minimization has consistently been an annoying, if not prohibitory difficulty.

Scipy/numpy presents itself as "the right (python) way to do it", but I can't help feeling consistently bereft of what I need. I keep telling myself how easy it would be to do in ROOT, but I can't bring myself to use that again. Ironic that the ugly Java-style C++ package from the mid 90s written by physicists is so much more useful than the premier python package. Ironic or typical, I'm not sure.

Unknown said...

pyminuit looks interesting.