Sort Rows Of Array To Match Order Of Another Array Using An Identifier Column
Solution 1:
Here's a vectorized approach using np.searchsorted:
import numpy as np

# Store the sorted indices of A
sidx = A[:,0].argsort()
# Find the indices of col-0 of B in col-0 of sorted A
l_idx = np.searchsorted(A[:,0],B[:,0],sorter = sidx)
# Create a mask that indicates which entries of B's col-0
# match up with A's col-0
valid_mask = l_idx != np.searchsorted(A[:,0],B[:,0],sorter = sidx,side='right')
# Initialize output array with NaNs.
# Use l_idx to set rows from A into output array. Use valid_mask to select
# indices from l_idx and output rows that are to be set.
out = np.full((B.shape[0],A.shape[1]),np.nan)
out[valid_mask] = A[sidx[l_idx[valid_mask]]]
Please note that valid_mask could also be created using np.in1d: np.in1d(B[:,0],A[:,0]) for a more intuitive answer. But we are using np.searchsorted, as that's better in terms of performance, as also discussed in greater detail in this other solution.
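To make the equivalence of the two masks concrete, here is a minimal sketch using the sample arrays from this answer. It uses np.isin, which newer NumPy versions recommend over np.in1d for the same membership test; the variable names left, right, mask_searchsorted and mask_isin are only illustrative.

```python
import numpy as np

A = np.array([[45, 11, 86], [18, 74, 59], [30, 68, 13], [55, 47, 78]])
B = np.array([[45, 11, 88], [55, 83, 46], [95, 87, 77],
              [30, 9, 37], [14, 97, 98], [18, 48, 53]])

sidx = A[:, 0].argsort()
# Left and right insertion points differ only where the value exists in A.
left = np.searchsorted(A[:, 0], B[:, 0], sorter=sidx)
right = np.searchsorted(A[:, 0], B[:, 0], sorter=sidx, side='right')
mask_searchsorted = left != right

# Plain membership test giving the same boolean mask.
mask_isin = np.isin(B[:, 0], A[:, 0])

print(mask_searchsorted)
print(np.array_equal(mask_searchsorted, mask_isin))
```

Both masks mark the rows of B whose identifier appears in A; which one is faster depends on the array sizes, so it is worth timing on your own data.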
Sample run:
In [184]: A
Out[184]:
array([[45, 11, 86],
       [18, 74, 59],
       [30, 68, 13],
       [55, 47, 78]])
In [185]: B
Out[185]:
array([[45, 11, 88],
       [55, 83, 46],
       [95, 87, 77],
       [30,  9, 37],
       [14, 97, 98],
       [18, 48, 53]])
In [186]: out
Out[186]:
array([[ 45.,  11.,  86.],
       [ 55.,  47.,  78.],
       [ nan,  nan,  nan],
       [ 30.,  68.,  13.],
       [ nan,  nan,  nan],
       [ 18.,  74.,  59.]])
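Because the output is initialized with NaN, out comes back as a float array even though A holds integers. If integer output matters, one option is to fill with a sentinel value in A's dtype instead. The sketch below assumes a hypothetical helper match_rows and an arbitrary sentinel of -1; neither is part of the original answer.

```python
import numpy as np

def match_rows(A, B, fill=-1):
    """Reorder rows of A by B's col-0 identifiers; unmatched rows get `fill`."""
    sidx = A[:, 0].argsort()
    l_idx = np.searchsorted(A[:, 0], B[:, 0], sorter=sidx)
    r_idx = np.searchsorted(A[:, 0], B[:, 0], sorter=sidx, side='right')
    valid_mask = l_idx != r_idx
    # Keep A's dtype by filling with a sentinel rather than NaN (assumption).
    out = np.full((B.shape[0], A.shape[1]), fill, dtype=A.dtype)
    out[valid_mask] = A[sidx[l_idx[valid_mask]]]
    return out

A = np.array([[45, 11, 86], [18, 74, 59], [30, 68, 13], [55, 47, 78]])
B = np.array([[45, 11, 88], [55, 83, 46], [95, 87, 77],
              [30, 9, 37], [14, 97, 98], [18, 48, 53]])
out_int = match_rows(A, B)
print(out_int)
```

The sentinel only makes sense if it cannot collide with real data; otherwise the NaN version above is the safer choice.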
Solution 2:
The simple approach is to build a dict from A and then use it to map identifiers found in B to the new array.
Constructing the dict:
>>> A = [[1,"a"], [2,"b"], [3,"c"]]
>>> A_dict = {x[0]: x for x in A}
>>> A_dict
{1: [1, 'a'], 2: [2, 'b'], 3: [3, 'c']}
Mapping:
>>> B = [[3,"..."], [2,"..."], [1,"..."]]
>>> result = (A_dict[x[0]] for x in B)
>>> list(result)
[[3, 'c'], [2, 'b'], [1, 'a']]
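Note that A_dict[x[0]] raises KeyError when B contains an identifier that A lacks. A minimal sketch of one way to handle that, assuming None is an acceptable placeholder for missing rows (the placeholder and the extra identifier 4 are illustrative choices, not part of the original answer):

```python
A = [[1, "a"], [2, "b"], [3, "c"]]
B = [[3, "..."], [4, "..."], [1, "..."]]  # identifier 4 has no row in A

A_dict = {row[0]: row for row in A}
# dict.get returns None instead of raising KeyError for unknown identifiers.
result = [A_dict.get(row[0]) for row in B]
print(result)  # [[3, 'c'], None, [1, 'a']]
```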
Solution 3:
It's not clear if you wish to concatenate the values in B onto A. Let's assume not; then the simplest way is probably to just build a dictionary mapping identifier to row and then reorder A:
import numpy

def match_order(A, B):
    # identifier -> row
    by_id = {A[i, 0]: A[i] for i in range(len(A))}
    # make up a fill row and rearrange according to B
    fill_row = [-1] * A.shape[1]
    return numpy.array([by_id.get(k, fill_row) for k in B[:, 0]])
As an example, if we have:
A = numpy.array([[111, 1], [222, 2], [333, 3], [555, 5]])
B = numpy.array([[222, 2], [111, 1], [333, 3], [444, 4], [555, 5]])
Then:
>>> match_order(A, B)
array([[222,   2],
       [111,   1],
       [333,   3],
       [ -1,  -1],
       [555,   5]])
If you wish to concatenate B, then you can do so simply as:
>>> numpy.hstack( (match_order(A, B), B[:, 1:]) )
array([[222,   2,   2],
       [111,   1,   1],
       [333,   3,   3],
       [ -1,  -1,   4],
       [555,   5,   5]])
Solution 4:
>>> A = [[3,'d', 'e', 'f'], [1,'a','b','c'], [2,'n','n','n']]
>>> B = [[1,'a','b','c'], [3,'d','e','f']]
>>> A_dict = {x[0]:x[1:] for x in A}
>>> A_dict
{1: ['a', 'b', 'c'], 2: ['n', 'n', 'n'], 3: ['d', 'e', 'f']}
>>> B_dict = {x[0]:x[1:] for x in B}
>>> B_dict
{1: ['a', 'b', 'c'], 3: ['d', 'e', 'f']}
>>> result=[[x] + A_dict[x] for x in A_dict if x in B_dict and A_dict[x]==B_dict[x]]
>>> result
[[1, 'a', 'b', 'c'], [3, 'd', 'e', 'f']]
Here A[0] and B[1] are identical, as are A[1] and B[0]. Converting the lists into dicts makes the problem easier to deal with.
Step 1: Create a dict for each 2-D list, keyed by the identifier.
Step 2: Iterate over each key in A_dict and check: a. whether the key exists in B_dict; b. if so, whether both dicts hold the same value for that key.
Step 3: Prepend the key to its value to form each row of the 2-D result list.
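The steps above return rows in the iteration order of A_dict, which on Python 3.7+ is A's insertion order rather than B's. If the goal is to follow B's order, a small variant (a sketch, not the original author's code) is to iterate over B and look each identifier up in A_dict:

```python
A = [[3, 'd', 'e', 'f'], [1, 'a', 'b', 'c'], [2, 'n', 'n', 'n']]
B = [[1, 'a', 'b', 'c'], [3, 'd', 'e', 'f']]

A_dict = {row[0]: row[1:] for row in A}
# Walk B so the result follows B's row order; keep only rows whose
# identifier exists in A with the same remaining values.
result = [[row[0]] + A_dict[row[0]]
          for row in B
          if A_dict.get(row[0]) == row[1:]]
print(result)  # [[1, 'a', 'b', 'c'], [3, 'd', 'e', 'f']]
```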
Cheers!