Python Pandas Compare Two Dataframes To Assign Country To Phone Number
I have two dataframes that I read in via csv. Dataframe one consists of a phone number and some additional data. The second dataframe contains country codes and country names. I wa
Solution 1:
I would do it this way:
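import pandas as pd

# read codes and phone numbers as strings so that prefixes and leading zeros survive
cl = pd.read_csv('country_list.csv', sep=';', dtype={'country_code': str})
ll = pd.read_csv('phones.csv', skipinitialspace=True, dtype={'phonenumber': str})

# Series mapping each country code to itself: lookup.get(prefix) returns the code or None
lookup = pd.Series(cl['country_code'].values, index=cl['country_code'])

# try the 4-, 3-, 2- and 1-digit prefixes and keep the first (longest) one that matches
ll['country_code'] = (
    ll['phonenumber']
    .apply(lambda x: pd.Series([lookup.get(x[:4]), lookup.get(x[:3]),
                                lookup.get(x[:2]), lookup.get(x[:1])]))
    .apply(lambda x: x.get(x.first_valid_index()), axis=1)
)

# remove `how='left'` parameter if you don't need "unmatched" phone-numbers
result = ll.merge(cl, on='country_code', how='left')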
Output:
In [195]: result
Out[195]:
    phonenumber add_info country_code      country  order_info
0   34123425209    info1           34        Spain         1.0
1   92654321762    info2           92     Pakistan         4.0
2   12018883637    info3            1          USA         2.0
3   12428883637   info31         1242      Bahamas         3.0
4    6323450001    info4           63  Philippines         3.0
5  496789521134    info5           49      Germany         4.0
6   00000000000      BAD         None          NaN         NaN
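To see what `how='left'` changes in the merge, here is a minimal sketch on two tiny hand-made frames. The rows are borrowed from the output above, and the `_merge` column comes from the `indicator=True` option of `merge`; this is an illustration only, not part of the original answer.

```python
import pandas as pd

# Two small stand-ins for `ll` (after the country_code column was added) and `cl`
ll = pd.DataFrame({"phonenumber": ["34123425209", "00000000000"],
                   "add_info": ["info1", "BAD"],
                   "country_code": ["34", None]})
cl = pd.DataFrame({"country_code": ["34", "49"],
                   "country": ["Spain", "Germany"]})

# how='left' keeps every phone number, matched or not;
# indicator=True adds a `_merge` column that says where each row came from
left = ll.merge(cl, on="country_code", how="left", indicator=True)

# the default (inner) merge drops phone numbers without a matching country code
inner = ll.merge(cl, on="country_code")

# phone numbers that found no country
unmatched = left.loc[left["_merge"] == "left_only", "phonenumber"].tolist()
```

With `how='left'` the invalid number stays in the result with `NaN` country data, while the inner merge keeps only the Spanish number.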
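Explanation: the first apply produces one column per prefix length (4, 3, 2 and 1 digits), and the second apply picks the first non-null value in each row, which is the longest matching code: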
In [216]: (ll['phonenumber']
   .....:    .apply(lambda x: pd.Series([lookup.get(x[:4]), lookup.get(x[:3]),
   .....:                                lookup.get(x[:2]), lookup.get(x[:1])]))
   .....: )
Out[216]:
      0     1     2     3
0  None  None    34  None
1  None  None    92  None
2  None  None  None     1
3  1242  None  None     1
4  None  None    63  None
5  None  None    49  None
6  None  None  None  None
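The same longest-prefix idea can also be written as a plain Python function and applied with `Series.map`. This is an alternative sketch, not the code from the answer: the helper `longest_prefix` and the `codes` set are hypothetical names, and the set holds only the codes that appear in the output above.

```python
import pandas as pd

# Assumed set of known country codes (taken from the sample output, not from country_list.csv)
codes = {"1", "1242", "34", "49", "63", "92"}

def longest_prefix(number, codes=codes, max_len=4):
    """Return the longest leading substring of `number` found in `codes`, else None."""
    for n in range(max_len, 0, -1):  # try 4 digits first, then 3, 2, 1
        prefix = number[:n]
        if prefix in codes:
            return prefix
    return None

phones = pd.Series(["34123425209", "12018883637", "12428883637", "00000000000"])
matched = phones.map(longest_prefix)
```

Checking the longer prefixes first is what lets the Bahamas number resolve to 1242 rather than to the shorter code 1.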
phones.csv - I've intentionally added one Bahamas number (1242...) and one invalid number (00000000000):
phonenumber, add_info
34123425209, info1
92654321762, info2
12018883637, info3
12428883637, info31
6323450001, info4
496789521134, info5
00000000000, BAD
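The `dtype={'phonenumber': str}` and `skipinitialspace=True` arguments matter for a file laid out like this. A small sketch, reading two of the rows above from an in-memory string (the `io.StringIO` wrapper is only there so the example runs without a file on disk):

```python
import io
import pandas as pd

csv_text = """phonenumber, add_info
34123425209, info1
00000000000, BAD
"""

# As in the answer: keep phone numbers as strings, strip the space after each comma
as_str = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True,
                     dtype={"phonenumber": str})

# Without the dtype, pandas parses the column as integers and the zeros collapse
as_default = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
```

Here `as_str` keeps `00000000000` intact, while `as_default` turns it into the integer `0`, so slicing prefixes with `x[:4]` would no longer work.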