Reshape Pandas.Df To Use In GridSearch

October 30, 2022

I am trying to use multiple feature columns in GridSearch with a Pipeline. I pass two columns to which I want to apply a TfidfVectorizer, but I run into trouble when running the Gri

Solution 1:

TfidfVectorizer expects its input to be a list (or other iterable) of strings. That explains "AttributeError: 'numpy.ndarray' object has no attribute 'lower'": you passed a 2D array, so each item the vectorizer iterates over is itself an array rather than a string.
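A minimal sketch of how that error arises, assuming a recent scikit-learn and NumPy. The sample strings and variable names are made up for illustration and are not taken from the question:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical data: a flat list of strings vs. a 2D array of strings.
docs_1d = ["red apple", "green pear"]
docs_2d = np.array([["red apple", "sweet"], ["green pear", "sour"]], dtype=object)

# A flat list of strings is what the vectorizer expects.
TfidfVectorizer().fit(docs_1d)

# Iterating over a 2D array yields one 1D array per row, and the
# vectorizer then tries to lowercase that array as if it were a string.
try:
    TfidfVectorizer().fit(docs_2d)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```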

So you have two choices: either concatenate the two columns into one beforehand (in pandas), or, if you want to keep the two columns separate, use a feature union in the pipeline (http://scikit-learn.org/stable/modules/pipeline.html#feature-union).
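Both options can be sketched roughly as follows. The toy DataFrame, the column names "title", "body" and "label", the LogisticRegression classifier, and the helpers pick_column and text_branch are all assumptions made for illustration, since the original code is not shown:

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import FunctionTransformer

# Hypothetical toy data standing in for the real DataFrame.
df = pd.DataFrame({
    "title": ["cheap pills", "meeting notes", "win money", "project plan",
              "free prize", "team lunch", "cash bonus", "status report"],
    "body": ["buy now", "see agenda", "click here", "draft attached",
             "claim today", "noon friday", "act fast", "weekly update"],
    "label": [1, 0, 1, 0, 1, 0, 1, 0],
})

# Option 1: concatenate the two columns into a single Series of strings.
text = df["title"] + " " + df["body"]
single = Pipeline([("tfidf", TfidfVectorizer()), ("clf", LogisticRegression())])
grid_single = GridSearchCV(single, {"tfidf__ngram_range": [(1, 1), (1, 2)]}, cv=2)
grid_single.fit(text, df["label"])


# Option 2: keep both columns and vectorize each one in its own branch.
def pick_column(X, column):
    # Return one DataFrame column as a 1D Series of strings.
    return X[column]


def text_branch(column):
    # Select a single column, then apply TF-IDF to it.
    return Pipeline([
        ("select", FunctionTransformer(pick_column, kw_args={"column": column})),
        ("tfidf", TfidfVectorizer()),
    ])


features = FeatureUnion([("title", text_branch("title")),
                         ("body", text_branch("body"))])
both = Pipeline([("features", features), ("clf", LogisticRegression())])
param_grid = {"features__title__tfidf__ngram_range": [(1, 1), (1, 2)]}
grid_both = GridSearchCV(both, param_grid, cv=2)
grid_both.fit(df[["title", "body"]], df["label"])

print(grid_single.best_params_, grid_both.best_params_)
```

In the second option each branch receives the full DataFrame and hands a single column to its own TfidfVectorizer, so the grid can tune each column's vectorizer independently through the nested parameter names.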

As for the first exception, my guess is that it comes from the interaction between pandas and sklearn. However, you cannot tell for sure until the error above is fixed.
