Reshape Pandas.Df To Use In GridSearch
I am trying to use multiple feature columns in GridSearch with a Pipeline. I pass two columns that I want to run through a TfidfVectorizer, but I get into trouble when running the Gri
Solution 1:
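TfidfVectorizer expects its input to be a list (or other iterable) of strings. That explains "AttributeError: 'numpy.ndarray' object has no attribute 'lower'": you passed a 2-D array, so each item the vectorizer iterates over is an array (one row) rather than a string, and it fails as soon as it tries to lowercase that row.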
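A minimal sketch that reproduces the error, assuming a small made-up two-column text array (the data and variable names are hypothetical, not taken from the question):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical two-column text data, shape (3, 2)
X = np.array([
    ["red apple", "sweet fruit"],
    ["green pear", "juicy fruit"],
    ["blue car", "fast vehicle"],
], dtype=object)

vectorizer = TfidfVectorizer()

# 1-D input: every element is a string, so this works
tfidf_one_column = vectorizer.fit_transform(X[:, 0])

# 2-D input: iterating over X yields rows (arrays), not strings,
# so the vectorizer's lowercasing step fails
try:
    vectorizer.fit_transform(X)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

Passing a single column (a 1-D sequence of strings) is fine; passing both columns at once is what triggers the AttributeError.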
So you have two choices: either concatenate the two columns into one beforehand (in pandas), or, if you want to keep them as two columns, use a FeatureUnion in the pipeline (http://scikit-learn.org/stable/modules/pipeline.html#feature-union).
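Both options could look roughly like the sketch below. The column names ("title", "body", "label"), the toy data, the LogisticRegression classifier, and the helper functions select_column and text_branch are assumptions made for illustration; substitute your own columns and estimator.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer

# Hypothetical data: two text columns and a binary label
df = pd.DataFrame({
    "title": ["cheap pills", "meeting notes", "win money", "project plan",
              "free prize", "lunch menu", "cash bonus", "team update"],
    "body": ["buy cheap pills now", "notes from the meeting", "win money fast",
             "plan for the project", "claim your free prize", "menu for lunch",
             "bonus cash for you", "update from the team"],
    "label": [1, 0, 1, 0, 1, 0, 1, 0],
})

# Option 1: concatenate the two columns into one 1-D series of strings
combined = df["title"] + " " + df["body"]
single = Pipeline([("tfidf", TfidfVectorizer()), ("clf", LogisticRegression())])
single.fit(combined, df["label"])

# Option 2: keep both columns and vectorize each in its own FeatureUnion branch
def select_column(frame, name):
    # Return one column as a 1-D series of strings
    return frame[name]

def text_branch(name):
    # Select a single column, then apply TF-IDF to it
    return Pipeline([
        ("select", FunctionTransformer(select_column, kw_args={"name": name})),
        ("tfidf", TfidfVectorizer()),
    ])

union_pipeline = Pipeline([
    ("features", FeatureUnion([
        ("title", text_branch("title")),
        ("body", text_branch("body")),
    ])),
    ("clf", LogisticRegression()),
])

# Parameters of nested steps are addressed with double underscores
param_grid = {
    "features__title__tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0],
}
search = GridSearchCV(union_pipeline, param_grid, cv=2)
search.fit(df[["title", "body"]], df["label"])
predictions = search.predict(df[["title", "body"]])
```

With the second option each column gets its own vectorizer, so the grid search can tune them independently; the first option is simpler but shares one vocabulary across both columns.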
As for the first exception, my guess is that it comes from the way pandas and sklearn interact. It is hard to tell for sure, though, until the error above is fixed in the code.