How To Detect Change In Last 2 Months Starting From Specific Row In Pandas DataFrame
Solution 1:
I think adding a "transaction number column" for each policy will make this easier. Then you can just de-dupe the transactions to see if there are "changed" rows.
Look at the following for example:
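import pandas as pd

# Toy data: one row per transaction, in order within each policy
dat = [['b123', 234, 522], ['b123', 234, 522], ['c123', 34, 23],
       ['c123', 38, 23], ['c123', 34, 23]]
cols = ['Policy_id', 'Fee1', 'Fee2']
df = pd.DataFrame(dat, columns=cols)
# Number the transactions 1, 2, 3, ... within each policy
df['transaction_id'] = df.groupby('Policy_id').cumcount() + 1
# Keep only the first occurrence of each distinct (Policy_id, Fee1, Fee2) row
df2 = df[cols].drop_duplicates()
# Re-attach the transaction number by index
final_df = df2.join(df[['transaction_id']])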
The output is:
  Policy_id  Fee1  Fee2  transaction_id
0      b123   234   522               1
2      c123    34    23               1
3      c123    38    23               2
Since b123 has only one transaction left after de-duping, you know that nothing changed for it. Something had to change with c123.
You can get all the changed transactions with final_df[final_df.transaction_id > 1].
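If you only need the list of policies that changed, rather than the changed rows themselves, a minimal sketch along the same lines is to count the distinct fee combinations per policy. The names distinct_rows, versions_per_policy and changed_policies are just illustrative, and the data is the same toy data as above.

```python
import pandas as pd

# Same toy data as in the example above.
cols = ['Policy_id', 'Fee1', 'Fee2']
df = pd.DataFrame(
    [['b123', 234, 522], ['b123', 234, 522], ['c123', 34, 23],
     ['c123', 38, 23], ['c123', 34, 23]],
    columns=cols,
)

# A policy has changed if it has more than one distinct (Fee1, Fee2) pair.
distinct_rows = df[cols].drop_duplicates()
versions_per_policy = distinct_rows.groupby('Policy_id').size()
changed_policies = versions_per_policy[versions_per_policy > 1].index.tolist()

print(changed_policies)  # ['c123']
```

This avoids the transaction counter entirely, at the cost of not telling you which transaction introduced the change.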
As mentioned, you might have to do some other math with the dates, but this should get you most of the way there.
Edit: If you want to only look at the last two months, you can filter the DataFrame prior to running the above.
How to do this:
Make a variable for your cutoff date like so (60 days is used here as an approximation of two months):
from datetime import date, timedelta
filtered_date = date.today() - timedelta(days=60)
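Then I would use the pyjanitor package and its filter_date method. Filter on whichever date column makes sense for your data; Start_date seems the most reasonable to me. Since the idea is to filter before running the de-duplication above, apply it to the original df (which is assumed to have that date column) rather than to final_df, and assign the result, because filter_date returns a new DataFrame.
import janitor
df = df.filter_date("Start_date", start=filtered_date)
Once you run import janitor, your DataFrames will have the filter_date method available, because pyjanitor registers its functions as DataFrame methods on import.
Note that the keyword name may differ by pyjanitor version: more recent releases appear to call it start_date rather than start, so check the signature in your installed version.
You can find more filter_date examples in the pyjanitor documentation.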
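If you would rather not add a dependency, a plain pandas boolean mask does the same kind of filtering. The sketch below is only an illustration: the Start_date column, the sample dates and the fixed as_of reference date are assumptions made up for the example, not part of the original data.

```python
import pandas as pd

# Hypothetical data: Start_date is assumed to be the transaction date column.
df = pd.DataFrame({
    'Policy_id': ['b123', 'b123', 'c123', 'c123'],
    'Fee1': [234, 234, 34, 38],
    'Start_date': pd.to_datetime(
        ['2021-01-05', '2021-06-10', '2021-02-01', '2021-06-20']),
})

# Fixed reference date so the example is reproducible;
# in practice you could use pd.Timestamp.today().normalize() instead.
as_of = pd.Timestamp('2021-07-01')
# DateOffset(months=2) steps back two calendar months rather than 60 days.
cutoff = as_of - pd.DateOffset(months=2)

# Keep only the transactions on or after the cutoff.
recent = df[df['Start_date'] >= cutoff]
print(recent)
```

You would then run the de-duplication steps on recent instead of df. Keep in mind that a policy with only one transaction inside the window will look unchanged, even if it differs from an older transaction that was filtered out.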