
Python Pandas Compare Two Dataframes To Assign Country To Phone Number

I have two dataframes that I read in via csv. Dataframe one consists of a phone number and some additional data. The second dataframe contains country codes and country names. I wa

Solution 1:

I would do it this way:

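import pandas as pd

# read the code and phone columns as strings so prefixes and leading zeros survive
cl = pd.read_csv('country_list.csv', sep=';', dtype={'country_code':str})
ll = pd.read_csv('phones.csv', skipinitialspace=True, dtype={'phonenumber':str})

# lookup Series indexed by country code; .get() returns None when a prefix is unknown
lookup = pd.Series(cl['country_code'].values, index=cl['country_code'])

ll['country_code'] = (
    ll['phonenumber']
    # try the 4-, 3-, 2- and 1-digit prefixes, longest first
    .apply(lambda x: pd.Series([lookup.get(x[:4]), lookup.get(x[:3]),
                                lookup.get(x[:2]), lookup.get(x[:1])]))
    # keep the first (longest) prefix that matched; None if nothing matched
    .apply(lambda x: x.get(x.first_valid_index()), axis=1)
)

# remove the `how='left'` parameter if you don't need "unmatched" phone numbers
result = ll.merge(cl, on='country_code', how='left')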

Output:

In [195]: result
Out[195]:
    phonenumber add_info country_code      country  order_info
0   34123425209    info1           34        Spain         1.0
1   92654321762    info2           92     Pakistan         4.0
2   12018883637    info3            1          USA         2.0
3   12428883637   info31         1242      Bahamas         3.0
4    6323450001    info4           63  Philippines         3.0
5  496789521134    info5           49      Germany         4.0
6   00000000000      BAD         None          NaN         NaN

Explanation:

In [216]: (ll['phonenumber']
   .....:   .apply(lambda x: pd.Series([lookup.get(x[:4]), lookup.get(x[:3]),
   .....:                               lookup.get(x[:2]), lookup.get(x[:1])]))
   .....: )
Out[216]:
      0     1     2     3
0  None  None    34  None
1  None  None    92  None
2  None  None  None     1
3  1242  None  None     1
4  None  None    63  None
5  None  None    49  None
6  None  None  None  None
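
Each column of that intermediate frame holds the lookup result for one prefix length (4, 3, 2 and 1 digits), and the second `.apply` keeps the first non-null value in each row, i.e. the longest prefix that matched. The same idea can be sketched without the intermediate frame. This is only an illustration: the `longest_prefix` helper and the hard-coded `codes` set are hypothetical names, with the codes taken from the output above.

```python
import pandas as pd

# Hypothetical subset of country codes, copied from the output shown above.
codes = {"34", "92", "1", "1242", "63", "49"}

def longest_prefix(number, codes, max_len=4):
    """Return the longest prefix of `number` that is in `codes`, or None."""
    # Walk from the longest candidate prefix down to a single digit.
    for n in range(max_len, 0, -1):
        if number[:n] in codes:
            return number[:n]
    return None

phones = pd.Series(["34123425209", "12018883637", "12428883637", "00000000000"])
matched = phones.map(lambda p: longest_prefix(p, codes))
print(matched)
```

Checking the longest prefix first is what makes the Bahamas number resolve to 1242 instead of stopping at 1.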

phones.csv (I've intentionally added one Bahamas number (1242...) and one invalid number (00000000000)):

phonenumber, add_info
34123425209, info1
92654321762, info2
12018883637, info3
12428883637, info31
6323450001, info4
496789521134, info5
00000000000, BAD
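
To try this without files on disk, both CSVs can be read from in-memory strings. In the sketch below the `country_list.csv` contents are an assumption: they are reconstructed from the output above, because the original file is not shown. The loop over prefix lengths is also a swapped-in, vectorized alternative to the nested `.apply` from the answer, not the answer's own code.

```python
import io
import pandas as pd

# Assumed contents of country_list.csv, reconstructed from the output above.
country_csv = """country_code;country;order_info
34;Spain;1.0
92;Pakistan;4.0
1;USA;2.0
1242;Bahamas;3.0
63;Philippines;3.0
49;Germany;4.0
"""

phones_csv = """phonenumber, add_info
34123425209, info1
92654321762, info2
12018883637, info3
12428883637, info31
6323450001, info4
496789521134, info5
00000000000, BAD
"""

cl = pd.read_csv(io.StringIO(country_csv), sep=';', dtype={'country_code': str})
ll = pd.read_csv(io.StringIO(phones_csv), skipinitialspace=True,
                 dtype={'phonenumber': str})

# Start with "no match" everywhere, then fill in the longest prefix first.
code = pd.Series(None, index=ll.index, dtype=object)
for n in (4, 3, 2, 1):
    prefix = ll['phonenumber'].str[:n]
    # Only fill rows that are still unmatched and whose prefix is a known code.
    hit = code.isna() & prefix.isin(cl['country_code'])
    code = code.mask(hit, prefix)

ll['country_code'] = code
result = ll.merge(cl, on='country_code', how='left')
print(result)
```

With `how='left'` the invalid number stays in `result` with missing country columns; an inner merge would drop it.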
