xorbits.numpy.isin

xorbits.numpy.isin(element: Union[xorbits._mars.typing.TileableType, numpy.ndarray], test_elements: Union[xorbits._mars.typing.TileableType, numpy.ndarray, list], assume_unique: bool = False, invert: bool = False)

Calculates element in test_elements, broadcasting over element only. Returns a boolean array of the same shape as element that is True where an element of element is in test_elements and False otherwise.

Parameters
  • element (array_like) – Input array.

  • test_elements (array_like) – The values against which to test each value of element. This argument is flattened if it is an array or array_like. See notes for behavior with non-array-like parameters.

  • assume_unique (bool, optional) – If True, the input arrays are both assumed to be unique, which can speed up the calculation. Default is False.

  • invert (bool, optional) – If True, the values in the returned array are inverted, as if calculating element not in test_elements. Default is False. np.isin(a, b, invert=True) is equivalent to (but faster than) np.invert(np.isin(a, b)).

  • kind ({None, 'sort', 'table'}, optional (Not supported yet)) –

    The algorithm to use. This will not affect the final result, but will affect the speed and memory use. The default, None, will select automatically based on memory considerations.

    • If ‘sort’, will use a mergesort-based approach. This will have a memory usage of roughly 6 times the sum of the sizes of ar1 and ar2 (the names in1d uses for element and test_elements, respectively), not accounting for size of dtypes.

    • If ‘table’, will use a lookup table approach similar to a counting sort. This is only available for boolean and integer arrays. This will have a memory usage of the size of ar1 plus the max-min value of ar2. assume_unique has no effect when the ‘table’ option is used.

    • If None, will automatically choose ‘table’ if the required memory allocation is less than or equal to 6 times the sum of the sizes of ar1 and ar2, otherwise will use ‘sort’. This is done to not use a large amount of memory by default, even though ‘table’ may be faster in most cases. If ‘table’ is chosen, assume_unique will have no effect.
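
The trade-off between the two strategies can be illustrated with a small pure-Python sketch. This is not the NumPy or xorbits implementation: the helper names isin_sort and isin_table are hypothetical, both work on flat lists of integers only, and sorting plus binary search stands in for the mergesort-based approach.

```python
from bisect import bisect_left


def isin_sort(ar1, ar2):
    """Sort ar2 once, then binary-search it for each item of ar1."""
    ordered = sorted(ar2)
    result = []
    for item in ar1:
        pos = bisect_left(ordered, item)
        result.append(pos < len(ordered) and ordered[pos] == item)
    return result


def isin_table(ar1, ar2):
    """Mark every value of ar2 in a boolean table indexed by value - min(ar2).

    The table has max(ar2) - min(ar2) + 1 slots, so memory grows with the
    value range of ar2 rather than with its length.
    """
    if not ar2:
        return [False] * len(ar1)
    low, high = min(ar2), max(ar2)
    table = [False] * (high - low + 1)
    for value in ar2:
        table[value - low] = True
    # Values outside [low, high] cannot be in ar2 and must not index the table.
    return [low <= item <= high and table[item - low] for item in ar1]


ar1 = [0, 2, 4, 6]
ar2 = [1, 2, 4, 8]
print(isin_sort(ar1, ar2))
print(isin_table(ar1, ar2))
```

Both helpers return the same mask. Duplicates in ar2 simply mark the same table slot twice, which is consistent with assume_unique having no effect on the table strategy.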

Returns

isin – Has the same shape as element. The values element[isin] are in test_elements.

Return type

ndarray, bool

See also

in1d

Flattened version of this function.

numpy.lib.arraysetops

Module with a number of other functions for performing set operations on arrays.

Notes

isin is an element-wise function version of the Python keyword in. isin(a, b) is roughly equivalent to np.array([item in b for item in a]) if a and b are 1-D sequences.
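
The same idea extends to nested input. The following is a pure-Python sketch under stated assumptions: isin_reference is a hypothetical helper written only for illustration, it works on nested lists instead of arrays, and it mirrors the documented behavior of flattening test_elements while keeping the shape of element.

```python
def isin_reference(element, test_elements, invert=False):
    """Apply the `in` keyword to every leaf of a nested list, keeping the nesting."""

    def flatten(values):
        # test_elements is flattened before the membership test.
        for value in values:
            if isinstance(value, (list, tuple)):
                yield from flatten(value)
            else:
                yield value

    candidates = list(flatten(test_elements))

    def walk(node):
        if isinstance(node, (list, tuple)):
            return [walk(child) for child in node]
        # `!= invert` flips the result when invert is True.
        return (node in candidates) != invert

    return walk(element)


element = [[0, 2], [4, 6]]
print(isin_reference(element, [1, 2, 4, 8]))
print(isin_reference(element, [[1, 2], [4, 8]], invert=True))
```

The first call reproduces the mask from the Examples section; the second shows that a 2-D test_elements is flattened and that invert negates every entry.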

element and test_elements are converted to arrays if they are not already. If test_elements is a set (or other non-sequence collection) it will be converted to an object array with one element, rather than an array of the values contained in test_elements. This is a consequence of the array constructor’s way of handling non-sequence collections. Converting the set to a list usually gives the desired behavior.
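
The conversion described above can be inspected directly. This sketch assumes plain NumPy (import numpy as np) rather than xorbits.numpy, so that the arrays are created eagerly; the variable names are illustrative.

```python
import numpy as np

test_set = {1, 2, 4, 8}

# The array constructor does not iterate over a set: the whole set becomes
# one Python object stored in a zero-dimensional object array.
wrapped = np.array(test_set)

# Converting to a list first yields one array element per value.
unpacked = np.array(list(test_set))

print(wrapped.dtype, wrapped.shape, wrapped.size)
print(unpacked.shape)
```

Because wrapped holds a single object (the set itself), no number in element can compare equal to it, which is why the membership test against a set returns all False.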

Using kind='table' tends to be faster than kind='sort' if the following relationship is true: log10(len(ar2)) > (log10(max(ar2)-min(ar2)) - 2.27) / 0.927, but may use greater memory. The default value for kind will be automatically selected based only on memory usage, so one may manually set kind='table' if memory constraints can be relaxed.
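
The rule of thumb above can be evaluated ahead of time with a few lines of Python. This is only a sketch: table_likely_faster is a hypothetical helper, and its handling of the two edge cases (an empty ar2, and an ar2 whose values are all equal so that max - min is 0) is an assumption, since the inequality itself does not define them.

```python
import math


def table_likely_faster(ar2):
    """Evaluate log10(len(ar2)) > (log10(max(ar2) - min(ar2)) - 2.27) / 0.927."""
    if len(ar2) == 0:
        return False  # assumption: nothing to build a table from
    value_range = max(ar2) - min(ar2)
    if value_range == 0:
        # assumption: log10(0) is treated as minus infinity, so the inequality holds
        return True
    threshold = (math.log10(value_range) - 2.27) / 0.927
    return math.log10(len(ar2)) > threshold


print(table_likely_faster(list(range(1000))))  # many values, small range
print(table_likely_faster([0, 10**9]))         # few values, huge range
```

A dense ar2 (many values over a small range) satisfies the inequality, while a sparse one (two values spread over a range of 10**9) does not.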

New in version 1.13.0 (numpy).

Examples

>>> element = 2*np.arange(4).reshape((2, 2))  
>>> element  
array([[0, 2],
       [4, 6]])
>>> test_elements = [1, 2, 4, 8]  
>>> mask = np.isin(element, test_elements)  
>>> mask  
array([[False,  True],
       [ True, False]])
>>> element[mask]  
array([2, 4])

The indices of the matched values can be obtained with nonzero:

>>> np.nonzero(mask)  
(array([0, 1]), array([1, 0]))

The test can also be inverted:

>>> mask = np.isin(element, test_elements, invert=True)  
>>> mask  
array([[ True, False],
       [False,  True]])
>>> element[mask]  
array([0, 6])

Because of how array handles sets, the following does not work as expected:

>>> test_set = {1, 2, 4, 8}  
>>> np.isin(element, test_set)  
array([[False, False],
       [False, False]])

Casting the set to a list gives the expected result:

>>> np.isin(element, list(test_set))  
array([[False,  True],
       [ True, False]])

This docstring was copied from numpy.