Onehotencoder Only A Single Feature Which Is String

I want ONLY ONE of my features to be converted to separate binary features:

df['pattern_id']
Out[202]:
0       3
1       3
...
7440    2
7441    2
7442    3
Name: patt

Solution 1:

If you take a look at the documentation for OneHotEncoder, you can see that the categorical_features argument expects '“all” or array of indices or mask', not a string. You can make your code work by changing it to the following lines:

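import numpy as np
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Create a dataframe of random ints
df = pd.DataFrame(np.random.randint(0, 4, size=(100, 4)),
                  columns=['pattern_id', 'B', 'C', 'D'])
# categorical_features takes column indices, so look up the position of 'pattern_id'.
# Note: this argument was deprecated in scikit-learn 0.20 and removed in 0.22.
onehotencoder = OneHotEncoder(categorical_features=[df.columns.tolist().index('pattern_id')])
# The result is a matrix, not a DataFrame
df = onehotencoder.fit_transform(df)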

However, df will no longer be a DataFrame (by default fit_transform returns a sparse matrix), so I would suggest working directly with the numpy arrays.

Solution 2:

You can also do it like this:

import pandas as pd
from sklearn.preprocessing import OneHotEncoder

onehotenc = OneHotEncoder()
# required_column is a placeholder for the column you want to encode
X = onehotenc.fit_transform(df.required_column.values.reshape(-1, 1)).toarray()

We need to reshape the column, because fit_transform requires a 2-D array. Then you can add columns to this numpy array and then merge it with your DataFrame.
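The merge step described above could look roughly like the following. This is only a sketch: the toy DataFrame, the column name pattern_id, and the pattern_id_ prefix for the new columns are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data; 'pattern_id' stands in for the column to encode.
df = pd.DataFrame({'pattern_id': ['a', 'b', 'a', 'c'],
                   'B': [1, 2, 3, 4]})

encoder = OneHotEncoder()
# Selecting with a list of column names yields a 2-D input,
# which is what fit_transform expects.
encoded = encoder.fit_transform(df[['pattern_id']]).toarray()

# Name the new columns after the categories the encoder learned
# (the 'pattern_id_' prefix is an arbitrary choice).
new_cols = ['pattern_id_' + str(c) for c in encoder.categories_[0]]
encoded_df = pd.DataFrame(encoded, columns=new_cols, index=df.index)

# Drop the original column and attach the binary columns.
result = pd.concat([df.drop(columns='pattern_id'), encoded_df], axis=1)
print(result)
```

Reusing df.index for encoded_df keeps the rows aligned when the two frames are concatenated, which matters if the original DataFrame does not have a default 0..n-1 index.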

Seen from this link here

Solution 3:

The recommended way to work with different column types is detailed in the sklearn documentation here.

Representative example:

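from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Numeric columns are standardized
numeric_features = ['age', 'fare']
numeric_transformer = Pipeline(steps=[('scaler', StandardScaler())])

# Categorical columns are one-hot encoded; unseen categories are ignored at transform time
categorical_features = ['embarked', 'sex', 'pclass']
categorical_transformer = OneHotEncoder(handle_unknown='ignore')

# Apply each transformer only to its own list of columns
preprocessor = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numeric_features),
        ('cat', categorical_transformer, categorical_features)])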
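Applied to the question, where only a single column should be encoded, a ColumnTransformer might be set up as sketched below. The toy data and the column names are assumptions; remainder='passthrough' keeps the other columns unchanged, and sparse_threshold=0 is set here only so that the result is a dense array that is easy to inspect.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: one string column to encode, one column to keep as-is.
df = pd.DataFrame({'pattern_id': ['x', 'y', 'x'],
                   'B': [10, 20, 30]})

ct = ColumnTransformer(
    transformers=[
        # Encode only 'pattern_id'; the name 'onehot' is arbitrary.
        ('onehot', OneHotEncoder(handle_unknown='ignore'), ['pattern_id'])],
    remainder='passthrough',   # leave all other columns untouched
    sparse_threshold=0)        # always return a dense array

out = ct.fit_transform(df)
print(out)
```

In the output, the encoded columns come first, followed by the passed-through columns in their original order.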
