
Check Pandas Dataframe Column For String Type

I have a fairly large pandas dataframe (11k rows and 20 columns). One column has a mixed data type, mostly numeric (float) with a handful of strings scattered throughout. I subset

Solution 1:

One way is to test each value with isinstance() in a list comprehension. I'm not sure it can be vectorised.

import pandas as pd

df = pd.DataFrame({'A': [1, None, 'hello', True, 'world', 'mystr', 34.11]})

df['stringy'] = [isinstance(x, str) for x in df.A]

#        A  stringy
# 0      1    False
# 1   None    False
# 2  hello     True
# 3   True    False
# 4  world     True
# 5  mystr     True
# 6  34.11    False
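The same per-element check can also serve as a boolean mask to select or drop the string rows. This is only a sketch on the sample frame from this answer; the names is_str, strings_only and non_strings are made up for illustration, and map() still loops in Python rather than being truly vectorised.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 'hello', True, 'world', 'mystr', 34.11]})

# Element-wise isinstance check; returns a boolean Series aligned with df.
is_str = df['A'].map(lambda x: isinstance(x, str))

strings_only = df[is_str]    # rows whose value in A is a str
non_strings = df[~is_str]    # everything else, including None and True

print(strings_only['A'].tolist())
print(non_strings.index.tolist())
```

Keeping the mask as a separate Series, rather than adding a column, leaves the original frame unchanged.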

Solution 2:

Here's a different way. pd.to_numeric() with errors='coerce' converts the values of column A to numeric without raising on failures: strings that cannot be parsed are replaced by NaN. The notnull() call then builds a mask that drops those NaN rows.

df = df[pd.to_numeric(df.A, errors='coerce').notnull()]

However, any missing values (None or NaN) already in the column will be removed too. Strings that parse as numbers, such as '3.5', are converted rather than removed.
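If the existing missing values should be kept, one possible workaround is to combine the coerced mask with a check on the original column. This is a sketch, not part of the original answer: the sample data is shortened, and the names numeric, keep and kept are hypothetical.

```python
import pandas as pd

# Shortened sample: two numbers, one missing value, two strings.
df = pd.DataFrame({'A': [1, None, 'hello', 'world', 34.11]})

# Unparseable strings become NaN; the original None is NaN here as well.
numeric = pd.to_numeric(df['A'], errors='coerce')

# Keep rows that converted cleanly, plus rows that were missing to begin with.
keep = numeric.notnull() | df['A'].isnull()
kept = df[keep]

print(kept.index.tolist())
```

The second term of the mask puts back only the rows that were already missing before conversion, so the strings are still dropped.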

See also: Select row from a DataFrame based on the type of the object(i.e. str)
