Is The Mask Of A Structured Array Supposed To Be Structured Itself?
Solution 1:
The errors in your 1st case indicate that the methods expect the mask to have the same number (and names) of fields as the base array:
__getitem__: dout._mask = _mask[indx]
_recursive_printoption: (curdata, curmask) = (result[name], mask[name])
If the masked array is made with the 'main' constructor, the mask has the same structure:
Rn = np.ma.masked_array(R, mask=R['A']>5)
Rn.mask.dtype: dtype([('A', '?'), ('B', '?')])
In other words, there is a mask value for each field of each element.
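As a rough illustration of that point, the sketch below builds a small structured array and passes a plain boolean condition to the constructor. The sample values and field dtypes are assumptions made up for the example; only the field names 'A' and 'B' come from the output shown above.

```python
import numpy as np

# Hypothetical sample data; only the field names 'A' and 'B' come from the text.
R = np.array([(1, 10.0), (6, 20.0), (9, 30.0)],
             dtype=[('A', 'i4'), ('B', 'f8')])

# A per-record boolean condition is handed to the constructor,
# which stores it as a mask with one boolean per field.
Rn = np.ma.masked_array(R, mask=R['A'] > 5)

print(Rn.mask.dtype)   # a structured dtype with boolean fields 'A' and 'B'
print(Rn.mask['A'])    # mask values for field 'A'
print(Rn.mask['B'])    # mask values for field 'B'
```

Each record's condition ends up repeated in both fields, which is what "a mask value for each field of each element" means in practice.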
The masked_array doc evidently intends for 'same shape' to include dtype structure:
Mask: Must be convertible to an array of booleans with the same shape as 'data'.
If I try to set the mask in the same way that masked_where does,
Rn._mask=R['A']>5
I get the same print error. The structured mask gets overwritten with the new boolean array, changing its dtype. In contrast, if I use
Rn.mask=R['A']<5
Rn prints fine. .mask is a property, whose set method evidently handles the structured mask correctly.
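The difference between the two assignments can be seen by inspecting the mask dtype afterwards. This is a minimal sketch with made-up sample data; the names good and bad are hypothetical, and the array with the overwritten _mask is deliberately not printed, since that is the step reported to fail.

```python
import numpy as np

# Hypothetical sample data with the field names used in the discussion.
R = np.array([(1, 10.0), (6, 20.0), (9, 30.0)],
             dtype=[('A', 'i4'), ('B', 'f8')])

good = np.ma.masked_array(R, mask=R['A'] > 5)
bad = np.ma.masked_array(R, mask=R['A'] > 5)

# Assigning through the property keeps one boolean per field.
good.mask = R['A'] < 5
print(good.mask.dtype.names)   # field names are still present

# Assigning to the private attribute swaps in the plain boolean array.
bad._mask = R['A'] > 5
print(bad._mask.dtype.names)   # no field names any more
```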
Without digging into the code history (on github) my guess is that masked_where is a convenience function that wasn't updated when structured dtypes were added to other parts of the ma code. Compared to ma.masked_array it's a simple function that does not look at the dtype at all. Other convenience functions like ma.masked_greater use masked_where. Changing result._mask = cond to result.mask = cond might be all that is needed to correct this issue.
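To make the suggested change concrete, here is a stripped-down sketch of a masked_where-like function that assigns through the property. It is not the library's implementation: masked_where_via_property is a hypothetical name, the real function's shape checks and handling of already-masked input are left out, and the sample data is invented.

```python
import numpy as np

def masked_where_via_property(condition, a):
    """Hypothetical variant of masked_where that assigns through .mask."""
    # Plain boolean mask, as make_mask produces by default.
    cond = np.ma.make_mask(condition, shrink=False)
    # Copy the data and view it as a MaskedArray.
    result = np.array(a, copy=True, subok=True).view(np.ma.MaskedArray)
    # Property setter instead of result._mask = cond.
    result.mask = cond
    return result

# Hypothetical sample data with the field names used in the discussion.
R = np.array([(1, 10.0), (6, 20.0), (9, 30.0)],
             dtype=[('A', 'i4'), ('B', 'f8')])

Rm = masked_where_via_property(R['A'] > 5, R)
print(Rm.mask.dtype.names)   # the mask keeps the field names of R
```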
How thoroughly have you tested the consequences of an unstructured mask?
Rm.flatten() returns an array with a structured mask, even when it started with an unstructured one. That's because it uses Rm.__setmask__, which is sensitive to fields. And that's the set function for the mask property.
Rm.tolist() # same error as str()
masked_where starts with:
cond = make_mask(condition)
make_mask returns the simple 'bool' dtype. It can also be called with a dtype, producing a structured mask: np.ma.make_mask(R['A']<5,dtype=R.dtype). But such a structured mask gets flattened when used in masked_where. masked_where not only allows an unstructured mask, it forces it to be unstructured.
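The two make_mask calls can be compared side by side. This is only a sketch with invented sample data; the variable names plain and structured are hypothetical.

```python
import numpy as np

# Hypothetical sample data with the field names used in the discussion.
R = np.array([(1, 10.0), (6, 20.0), (9, 30.0)],
             dtype=[('A', 'i4'), ('B', 'f8')])

cond = R['A'] < 5

# Default call: a simple boolean mask, one value per record.
plain = np.ma.make_mask(cond)
print(plain.dtype)

# With a dtype: a structured mask, one boolean per field.
structured = np.ma.make_mask(cond, dtype=R.dtype)
print(structured.dtype)
```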
Your unstructured mask is already partly implemented, as the recordmask property:
recordmask = property(fget=_get_recordmask)
I say partly because it has a get method, but the set method is not yet implemented. See def _set_recordmask(self):
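Reading recordmask on a structured masked array gives back a per-record boolean view of the mask. The sketch below uses invented sample data; the attempt to assign to recordmask is wrapped in try/except because the exact exception raised by the missing setter is an assumption and may depend on the NumPy version.

```python
import numpy as np

# Hypothetical sample data with the field names used in the discussion.
R = np.array([(1, 10.0), (6, 20.0), (9, 30.0)],
             dtype=[('A', 'i4'), ('B', 'f8')])

Rn = np.ma.masked_array(R, mask=R['A'] > 5)

# The getter collapses the structured mask to one boolean per record.
print(Rn.recordmask)

# The setter is reported as not implemented; the exception type is assumed.
try:
    Rn.recordmask = [True, False, False]
except (NotImplementedError, AttributeError) as exc:
    print("cannot set recordmask:", exc)
```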
The more I look at this the more I'm convinced that masked_where is wrong. It could be changed to set a structured mask, but then it's not much different from masked_array. It might be better if it raised an error when the array is structured (has dtype.names). That way masked_where will remain useful for unstructured numeric arrays, while preventing misapplication to structured ones.
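Until something like that exists in the library, the same guard can be applied from user code. The wrapper below is a hypothetical helper, not part of numpy.ma; the name masked_where_strict, the choice of TypeError, and the sample data are all assumptions.

```python
import numpy as np

def masked_where_strict(condition, a, copy=True):
    """Hypothetical wrapper: refuse structured input, else defer to masked_where."""
    if np.asanyarray(a).dtype.names is not None:
        raise TypeError("structured arrays are not supported here; "
                        "use np.ma.masked_array(a, mask=...) instead")
    return np.ma.masked_where(condition, a, copy=copy)

# Unstructured numeric input behaves like masked_where.
x = np.arange(5)
masked_x = masked_where_strict(x > 2, x)
print(masked_x)

# Structured input is rejected up front (hypothetical sample data).
R = np.array([(1, 10.0), (6, 20.0), (9, 30.0)],
             dtype=[('A', 'i4'), ('B', 'f8')])
try:
    masked_where_strict(R['A'] > 5, R)
except TypeError as exc:
    print(exc)
```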
I should also look at the test code.