Perform An Operation On All Pairs Of Rows In A Column
Assume the following DataFrame:
id  A
1   0
2   10
3   200
4   3000
I would like to perform a calculation between every row and every other row. For example, if the calcul
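To follow along with the answers, the frame can be rebuilt roughly like this. This is only a sketch: it assumes `id` is an ordinary column and that the index is the default 0..3, which is what the pair labels in the answer outputs suggest.

```python
import pandas as pd

# Assumption: `id` is a regular column and the index is the default RangeIndex.
df = pd.DataFrame({"id": [1, 2, 3, 4], "A": [0, 10, 200, 3000]})
print(df)
```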
Solution 1:
If I understand correctly, you can use itertools.combinations:
import itertools
s = list(itertools.combinations(df.index, 2))
pd.Series([df.A.loc[x[1]] - df.A.loc[x[0]] for x in s])
Out[495]:
0      10
1     200
2    3000
3     190
4    2990
5    2800
dtype: int64
Update
s = list(itertools.combinations(df.index, 2))
pd.DataFrame([x + (df.A.loc[x[1]] - df.A.loc[x[0]],) for x in s])
Out[518]:
   0  1     2
0  0  1    10
1  0  2   200
2  0  3  3000
3  1  2   190
4  1  3  2990
5  2  3  2800
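The same idea extends to any two-argument calculation, not just subtraction. Here is a minimal sketch; `pairwise_apply` and the column names `first`, `second`, and `result` are hypothetical names chosen for illustration, not part of pandas.

```python
import itertools
import operator

import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3, 4], "A": [0, 10, 200, 3000]})


def pairwise_apply(series, func):
    """Apply func(later_value, earlier_value) to every unordered pair of rows.

    Hypothetical helper for illustration; not a pandas function.
    """
    rows = []
    for i, j in itertools.combinations(series.index, 2):
        rows.append((i, j, func(series.loc[j], series.loc[i])))
    return pd.DataFrame(rows, columns=["first", "second", "result"])


# operator.sub reproduces the difference A[j] - A[i] used in the answer.
diffs = pairwise_apply(df["A"], operator.sub)
print(diffs)
```

Because this loops in Python, it is probably only suitable for small frames; the vectorised approach in Solution 2 should scale better.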
Solution 2:
Use broadcasted subtraction, then np.tril_indices_from
to extract the strictly lower triangle (the positive values here, since A is increasing).
# <= 0.23
# u = df['A'].values
# 0.24+
u = df['A'].to_numpy()
u2 = (u[:, None] - u)
pd.Series(u2[np.tril_indices_from(u2, k=-1)])
0      10
1     200
2     190
3    3000
4    2990
5    2800
dtype: int64
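Note that the lower triangle lists the pairs in a different order from itertools.combinations. If the same ordering as Solution 1 matters, one option (a sketch, not part of the original answer) is to flip the subtraction and read the strictly upper triangle instead:

```python
import itertools

import numpy as np
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3, 4], "A": [0, 10, 200, 3000]})
u = df["A"].to_numpy()

# m[i, j] = u[j] - u[i]: a row vector minus a column vector, via broadcasting.
m = u[None, :] - u[:, None]

# The strictly upper triangle (k=1) is read row by row, so the pairs come out
# as (0, 1), (0, 2), (0, 3), (1, 2), ... just like itertools.combinations.
i, j = np.triu_indices_from(m, k=1)
upper = pd.Series(m[i, j])

# Cross-check against the explicit pair loop.
expected = [int(u[b] - u[a]) for a, b in itertools.combinations(range(len(u)), 2)]
print(upper.tolist())
```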
Or, use subtract.outer
to avoid the conversion to array beforehand.
u2 = np.subtract.outer(*[df.A]*2)
pd.Series(u2[np.tril_indices_from(u2, k=-1)])
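The snippet above passes the Series straight to the ufunc. I believe some newer pandas releases reject a Series in `ufunc.outer`, so treat the following as a fallback sketch under that assumption: it converts to a NumPy array first and otherwise does the same thing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3, 4], "A": [0, 10, 200, 3000]})

# Assumption: convert to a NumPy array first, in case the installed pandas
# does not accept a Series in ufunc.outer.
a = df["A"].to_numpy()

u2 = np.subtract.outer(a, a)  # u2[i, j] = a[i] - a[j]
lower = pd.Series(u2[np.tril_indices_from(u2, k=-1)])
print(lower.tolist())
```

Other binary ufuncs expose the same `.outer` method, so the calculation can be swapped without changing the rest.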
If you need the index as well, use
idx = np.tril_indices_from(u2, k=-1)
pd.DataFrame({'val': u2[idx], 'row': idx[0], 'col': idx[1]})
    val  row  col
0    10    1    0
1   200    2    0
2   190    2    1
3  3000    3    0
4  2990    3    1
5  2800    3    2
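The `row` and `col` values here are positions, not the `id` values from the question. If the ids are wanted, a sketch like the following could map them back; it assumes `id` is a regular column, and the output column names `id_row` and `id_col` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3, 4], "A": [0, 10, 200, 3000]})
u = df["A"].to_numpy()
u2 = u[:, None] - u  # u2[i, j] = u[i] - u[j]

row, col = np.tril_indices_from(u2, k=-1)
ids = df["id"].to_numpy()

labelled = pd.DataFrame({
    "val": u2[row, col],
    "id_row": ids[row],  # id of the row whose A value is subtracted from
    "id_col": ids[col],  # id of the row whose A value is subtracted
})
print(labelled)
```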