Posts Tagged ‘pandas’

“AttributeError: exp” from numpy when calling predict_proba()

Thursday, November 13th, 2014

If you’ve been trying out different types of Scikit-learn classifier algorithms, and have been merrily going along calling predict(X) and predict_proba(X) on various classifiers (e.g. DecisionTreeClassifier, RandomForestClassifier), you might decide to try something else (like LogisticRegression), which will seem to work for calling predict(X) but maddenly fails with “AttributeError: exp”

If you follow the stack trace and the error is when “np.exp” is being called within _predict_proba_lr, you might have my problem, namely, you have some un-casted booleans within your X. This causes the predict_proba method to fail with linear models (though not with classifiers).

You can fix this by converting your X to floats with X.astype(float) explicitly before passing X to predict_proba. Careful; if you have values that ACTUALLY don’t cast to float intelligently this will probably do terrible things to your model.

If you formed up your X as an np.array natively, you probably don’t get this behavior, since np.array’s constructor seems to convert your bools for you. But if you started with a pd.DataFrame or pd.Series, *even if you converted it to an np.array*, it will consider the bools as objects and they will bomb out in predict_proba.

(np = numpy, pd = pandas by convention)


import numpy as np
import pandas as pd
a = np.array([1,2,3,True, False])
b = pd.Series([1,2,3,True, False])
c = np.array(b)
d = c.astype(float)

## native np.array is OK, because bools were converted.
np.exp(a)
Out[64]: array([ 2.71828183, 7.3890561 , 20.08553692, 2.71828183, 1. ])

## pd.Series can usually be used where np.array, but not when exp can't handle bools:
np.exp(b)
Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2883, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
np.exp(b)
AttributeError: exp

## merely explicitly creating a np.array first won't solve your problem.
np.exp(c)
Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2883, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 1, in
np.exp(c)
AttributeError: exp

## explicit cast to float works.
np.exp(d)
Out[67]: array([ 2.71828183, 7.3890561 , 20.08553692, 2.71828183, 1. ])