
Is The Mask Of A Structured Array Supposed To Be Structured Itself?

I was looking into numpy issue 2972 and several related problems. It turns out that all those problems are related to the situation where the array itself is structured, but its m

Solution 1:

The errors in your first case indicate that the methods expect the mask to have the same number (and names) of fields as the base array:

__getitem__:  dout._mask = _mask[indx]
_recursive_printoption: (curdata, curmask) = (result[name], mask[name])

If the masked array is made with the 'main' constructor, the mask has the same structure:

Rn = np.ma.masked_array(R, mask=R['A']>5)
Rn.mask.dtype: dtype([('A', '?'), ('B', '?')])

In other words, there is a mask value for each field of each element.
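A minimal sketch of this behaviour, assuming a small stand-in for R with fields 'A' and 'B' (the original post does not show how R was built, so the values here are hypothetical):

```python
import numpy as np

# Hypothetical stand-in for R; the original definition is not shown in the post.
R = np.zeros(4, dtype=[('A', int), ('B', float)])
R['A'] = [1, 4, 7, 9]
R['B'] = [0.5, 1.5, 2.5, 3.5]

# The main constructor applies the per-record condition to every field.
Rn = np.ma.masked_array(R, mask=R['A'] > 5)

print(Rn.mask.dtype.names)   # field names of the mask mirror those of R
print(Rn.mask['A'])          # one boolean per record for field 'A'
print(Rn.mask['B'])          # and one per record for field 'B'
```

Each record therefore carries one flag per field, even though the condition supplied only one boolean per record.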

The masked_array documentation evidently intends 'same shape' to include the dtype structure. It describes the mask parameter as: Must be convertible to an array of booleans with the same shape as 'data'.

If I try to set the mask in the same way that masked_where does:

Rn._mask=R['A']>5

I get the same print error. The structured mask gets overwritten with the new boolean array, which changes its dtype. In contrast, if I use

Rn.mask=R['A']<5

Rn prints fine. .mask is a property, whose set method evidently handles the structured mask correctly.
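The difference between the two assignments can be checked by inspecting the mask dtype directly, without printing the array. This is a sketch using a hypothetical R with fields 'A' and 'B'; the names good and bad are introduced only for illustration:

```python
import numpy as np

# Hypothetical stand-in for R.
R = np.zeros(4, dtype=[('A', int), ('B', float)])
R['A'] = [1, 4, 7, 9]

# Going through the public property keeps the structured mask dtype.
good = np.ma.masked_array(R, mask=R['A'] > 5)
good.mask = R['A'] < 5
print(good.mask.dtype.names)   # still has fields 'A' and 'B'

# Writing to the private attribute replaces the mask object wholesale.
bad = np.ma.masked_array(R, mask=R['A'] > 5)
bad._mask = R['A'] < 5
print(bad._mask.dtype.names)   # None: the mask is now a plain boolean array
```

The second array is left in the inconsistent state described above, which is why the sketch avoids calling str() or repr() on it.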

Without digging into the code history (on GitHub), my guess is that masked_where is a convenience function that wasn't updated when structured dtypes were added to other parts of the ma code. Compared to ma.masked_array, it's a simple function that does not look at the dtype at all. Other convenience functions like ma.masked_greater use masked_where. Changing result._mask = cond to result.mask = cond might be all that is needed to correct this issue.


How thoroughly have you tested the consequences of an unstructured mask?

Rm.flatten()

returns an array with a structured mask, even when it started with an unstructured one. That's because it uses Rm.__setmask__, which is sensitive to fields. And that's the set function for the mask property.

Rm.tolist()  # same error as str()

masked_where starts with:

cond = make_mask(condition)

make_mask returns the simple 'bool' dtype. It can also be called with a dtype, producing a structured mask: np.ma.make_mask(R['A']<5,dtype=R.dtype). But such a structured mask gets flattened when used in masked_where. masked_where not only allows an unstructured mask, it forces it to be unstructured.
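The two forms of make_mask can be compared side by side. A short sketch, again with a hypothetical R; np.ma.make_mask_descr is included only to show the boolean dtype derived from R.dtype:

```python
import numpy as np

# Hypothetical stand-in for R.
R = np.zeros(4, dtype=[('A', int), ('B', float)])
R['A'] = [1, 4, 7, 9]

# Default call: a plain boolean mask, one flag per record.
plain = np.ma.make_mask(R['A'] < 5)
print(plain.dtype)

# With dtype=R.dtype: a structured mask, one flag per field of each record.
structured = np.ma.make_mask(R['A'] < 5, dtype=R.dtype)
print(structured.dtype.names)

# make_mask_descr returns the all-boolean dtype that corresponds to R.dtype.
print(np.ma.make_mask_descr(R.dtype))
```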

Your unstructured mask is already partly implemented, the recordmask property:

recordmask = property(fget=_get_recordmask)

I say partly because it has a get method, but the set method is not yet implemented. See def _set_recordmask(self):
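For reference, a sketch of what the getter returns, assuming a hypothetical R and an explicit per-field mask chosen so that one record is only partly masked:

```python
import numpy as np

# Hypothetical stand-in for R.
R = np.zeros(4, dtype=[('A', int), ('B', float)])
R['A'] = [1, 4, 7, 9]

# Record 0 has only field 'B' masked; records 2 and 3 are fully masked.
field_mask = [(False, True), (False, False), (True, True), (True, True)]
Rn = np.ma.masked_array(R, mask=field_mask)

# recordmask collapses the structured mask to one boolean per record,
# which is True only where every field of that record is masked.
print(Rn.recordmask)
print(Rn.recordmask.dtype.names)   # None: the result is unstructured
```

A partly masked record therefore counts as unmasked in recordmask.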

The more I look at this, the more I'm convinced that masked_where is wrong. It could be changed to set a structured mask, but then it would not be much different from masked_array. It might be better if it raised an error when the array is structured (has dtype.names). That way masked_where would remain useful for unstructured numeric arrays, while preventing misapplication to structured ones.
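The guard suggested here could look roughly like the wrapper below. This is only an illustration of the idea, not a patch to numpy: masked_where_unstructured is a hypothetical name, and the choice of TypeError is an assumption.

```python
import numpy as np

def masked_where_unstructured(condition, a, copy=True):
    """Hypothetical guard: reject structured input, else defer to masked_where."""
    arr = np.asanyarray(a)
    if arr.dtype.names is not None:
        raise TypeError(
            "structured arrays are not supported here; "
            "use np.ma.masked_array with a structured mask instead"
        )
    return np.ma.masked_where(condition, arr, copy=copy)

# Plain numeric arrays pass straight through to masked_where.
x = np.arange(5)
result = masked_where_unstructured(x > 2, x)
print(result)

# Structured arrays are rejected before any mask is attached.
R = np.zeros(3, dtype=[('A', int), ('B', float)])
try:
    masked_where_unstructured(R['A'] > 5, R)
except TypeError as exc:
    print("rejected:", exc)
```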

I should also look at the test code.
