Pandas Populate New Dataframe Column Based On Matching Columns In Another Dataframe
Solution 1:
Consider the following dataframes, df and df2:
import pandas as pd

df = pd.DataFrame(dict(
    AUTHOR_NAME=list('AAABBCCCCDEEFGG'),
    title=list('zyxwvutsrqponml')
))
df2 = pd.DataFrame(dict(
    AUTHOR_NAME=list('AABCCEGG'),
    title=list('zwvtrpml'),
    CATEGORY=list('11223344')
))
Option 1: merge
# With no `on` argument, merge uses the columns both frames share (AUTHOR_NAME and title)
df.merge(df2, how='left')
Option 2: join
cols = ['AUTHOR_NAME', 'title']
df.join(df2.set_index(cols), on=cols)
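Both options yield the same result: df with an added CATEGORY column, filled in where the (AUTHOR_NAME, title) pair also appears in df2 and NaN everywhere else.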
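As a quick check, the two options can be run side by side on the sample data above. This is only a sketch: the names merged and joined are introduced here for illustration, and the merge keys are passed explicitly through on=cols rather than inferred.

```python
import pandas as pd

df = pd.DataFrame(dict(
    AUTHOR_NAME=list('AAABBCCCCDEEFGG'),
    title=list('zyxwvutsrqponml'),
))
df2 = pd.DataFrame(dict(
    AUTHOR_NAME=list('AABCCEGG'),
    title=list('zwvtrpml'),
    CATEGORY=list('11223344'),
))

cols = ['AUTHOR_NAME', 'title']

# Option 1: left merge on the shared key columns
merged = df.merge(df2, how='left', on=cols)

# Option 2: join against df2 indexed by the same key columns
joined = df.join(df2.set_index(cols), on=cols)

# Rows of df with no matching (AUTHOR_NAME, title) pair in df2 get NaN
print(merged)
print(merged['CATEGORY'].notna().sum(), 'rows matched out of', len(merged))
```

Note that the pair ('A', 'w') in df2 has no counterpart in df, so a left merge simply ignores it.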
Solution 2:
APPROACH 1:
You could use concat instead and drop the rows that are duplicated across the Index and AUTHOR_NAME columns combined. After that, use isin to check membership:
df_concat = pd.concat([df2, df]).reset_index().drop_duplicates(['Index', 'AUTHOR_NAME'])
df_concat.set_index('Index', inplace=True)
df_concat[df_concat.index.isin(df.index)]
Note: The column Index is assumed to be set as the index of both dataframes.
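A minimal sketch of Approach 1 on made-up data, assuming both dataframes carry an index named Index. The sample values and the name result are hypothetical and not part of the original answer.

```python
import pandas as pd

# Hypothetical data: both frames use an index named 'Index'
df = pd.DataFrame(
    {'AUTHOR_NAME': ['A', 'B', 'C', 'D']},
    index=pd.Index([10, 11, 12, 13], name='Index'),
)
df2 = pd.DataFrame(
    {'AUTHOR_NAME': ['A', 'C', 'Z'], 'CATEGORY': ['1', '2', '9']},
    index=pd.Index([10, 12, 99], name='Index'),
)

# df2 goes first so that, for duplicated (Index, AUTHOR_NAME) pairs,
# drop_duplicates keeps the row that carries a CATEGORY value
df_concat = (
    pd.concat([df2, df])
    .reset_index()
    .drop_duplicates(['Index', 'AUTHOR_NAME'])
    .set_index('Index')
)

# Keep only the rows whose index label also occurs in df
result = df_concat[df_concat.index.isin(df.index)].sort_index()
print(result)
```

The order of the frames passed to concat matters here: with df first, the rows without CATEGORY would be the ones kept.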
APPROACH 2:
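Use join after setting Index and AUTHOR_NAME as the index of both dataframes, as shown. Index has to be an ordinary column at this point; call reset_index() first if it is already the index.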
df2.set_index(['Index', 'AUTHOR_NAME'], inplace=True)
df.set_index(['Index', 'AUTHOR_NAME'], inplace=True)
df.join(df2).reset_index()
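The same idea can be written without inplace=True, which leaves the original dataframes untouched. The data below is made up for illustration, and Index is assumed to be a regular column.

```python
import pandas as pd

# Hypothetical data: 'Index' is an ordinary column in both frames
df = pd.DataFrame({
    'Index': [10, 11, 12, 13],
    'AUTHOR_NAME': ['A', 'B', 'C', 'D'],
    'title': ['z', 'y', 'x', 'w'],
})
df2 = pd.DataFrame({
    'Index': [10, 12, 99],
    'AUTHOR_NAME': ['A', 'C', 'Z'],
    'CATEGORY': ['1', '2', '9'],
})

keys = ['Index', 'AUTHOR_NAME']

# join defaults to a left join on the index, so every row of df is kept
result = df.set_index(keys).join(df2.set_index(keys)).reset_index()
print(result)
```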
Solution 3:
While the other answers here give good, elegant solutions to the question as asked, I found a resource that also answers it and gives a clear, straightforward set of examples on joining and merging dataframes, covering LEFT, RIGHT, INNER and OUTER joins.
Join And Merge Pandas Dataframe
I think anyone else looking into this topic will want to go through the author's examples as well.
Solution 4:
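You may also try the following, which merges the two datasets on the specified key columns. The keys must be present in both dataframes, so CATEGORY, which exists only in df2, cannot serve as one.
expected_result = pd.merge(df, df2, on=['AUTHOR_NAME', 'title'], how='left')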
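When merging on explicit keys, it can help to see which rows actually found a match. The sketch below uses made-up data and two optional pd.merge parameters, indicator and validate; it is one way to inspect the result, not part of the original answer.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    'AUTHOR_NAME': ['A', 'A', 'B', 'C'],
    'title': ['z', 'y', 'v', 'u'],
})
df2 = pd.DataFrame({
    'AUTHOR_NAME': ['A', 'B'],
    'title': ['z', 'v'],
    'CATEGORY': ['1', '2'],
})

expected_result = pd.merge(
    df, df2,
    on=['AUTHOR_NAME', 'title'],
    how='left',
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
    validate='many_to_one',  # raises if the key pairs in df2 are not unique
)
print(expected_result)
```

Rows marked left_only in the _merge column are the ones whose CATEGORY stays NaN.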