
How To Convert Numbers Represented As Characters For Short Into Numeric In Python

I have a column in my data frame which has values like '3.456B' which actually stands for 3.456 Billion (and similar notation for Million). How to convert this string form to corre

Solution 1:

I'd use a dictionary to replace each suffix letter with the matching exponent (so '3.456B' becomes '3.456E9'), then convert the result to float.

import pandas as pd

# Each suffix letter becomes an exponent that to_numeric can parse.
mapping = dict(K='E3', M='E6', B='E9')

df['Market Cap'] = pd.to_numeric(df['Market Cap'].replace(mapping, regex=True))
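For reference, here is a self-contained sketch of the same idea on made-up sample data (the values are illustrative, not from the question). Anchoring each pattern with `$` is an extra precaution added here, so that only a trailing suffix letter gets rewritten.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({'Market Cap': ['3.456B', '6.46M', '120K']})

# Map each trailing suffix to the equivalent scientific-notation exponent.
# The `$` anchor (an addition, not in the answer above) limits the match
# to the end of the string.
mapping = {'K$': 'E3', 'M$': 'E6', 'B$': 'E9'}

as_text = df['Market Cap'].replace(mapping, regex=True)  # '3.456B' -> '3.456E9'
df['Market Cap'] = pd.to_numeric(as_text)
print(df)
```

After this runs, the column holds plain floats rather than strings, so it can be sorted and aggregated numerically.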

Solution 2:

Assuming all entries have a letter at the end, you can do this:

d = {'K': 1000, 'M': 1000000, 'B': 1000000000}
# .map(d) looks up each suffix and gives a numeric multiplier column.
df['Market Cap'] = pd.to_numeric(df['Market Cap'].str[:-1]) * \
    df['Market Cap'].str[-1].map(d)

This converts everything but the last character into a numeric value, then multiplies it by the number equivalent to the letter in the last character.
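If some entries might lack a suffix letter, the assumption above no longer holds. One possible workaround, sketched here on hypothetical data, is to split each string with a regular expression and treat a missing suffix as a factor of 1. The pattern and the names `multipliers`, `parts`, `number` and `factor` are choices made for this sketch.

```python
import pandas as pd

# Hypothetical sample data; the last entry has no suffix letter.
s = pd.Series(['6.46M', '2.25B', '120K', '950'])

multipliers = {'K': 1e3, 'M': 1e6, 'B': 1e9}

# Split each string into a numeric part and an optional trailing suffix.
parts = s.str.extract(r'^([\d.]+)([KMB]?)$')
number = pd.to_numeric(parts[0])

# A missing suffix has no entry in the dict, so map() gives NaN there;
# fillna(1.0) turns that into a multiplier of 1.
factor = parts[1].map(multipliers).fillna(1.0)

result = number * factor
print(result)
```

Strings that do not match the pattern at all come out as NaN instead of raising, which makes malformed rows easy to spot afterwards.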

Solution 3:

First extract the unit as the last character of each string. Then convert the remaining part of each value to float and rescale where needed:

df = pd.DataFrame({'Market Cap':['6.46M','2.25B','0.23B']})
units = df['Market Cap'].str[-1]
df['Market Cap'] = df['Market Cap'].str[:-1].astype(float)
df.loc[units=='M','Market Cap'] *= 0.001
#    Market Cap
# 0     0.00646
# 1     2.25000
# 2     0.23000

Now everything is in billions.
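If the column also contains other suffixes, the same approach can be extended by looking up a per-unit factor relative to one billion. This is only a sketch: the sample values, the `to_billions` dictionary and the `Market Cap (B)` column name are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data with three different suffixes.
df = pd.DataFrame({'Market Cap': ['6.46M', '2.25B', '0.23B', '500K']})

# Factor that converts a value with the given suffix into billions.
to_billions = {'K': 1e-6, 'M': 1e-3, 'B': 1.0}

units = df['Market Cap'].str[-1]                   # trailing suffix letter
values = df['Market Cap'].str[:-1].astype(float)   # numeric part
df['Market Cap (B)'] = values * units.map(to_billions)
print(df)
```

Writing to a separate column keeps the original strings available for checking the conversion.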
