xorbits.numpy.fromregex

xorbits.numpy.fromregex(file, regexp, dtype, encoding=None)

Construct an array from a text file, using regular expression parsing.

The returned array is always a structured array, and is constructed from all matches of the regular expression in the file. Groups in the regular expression are converted to fields of the structured array.

Parameters
  • file (path or file) –

    Filename or file object to read.

    Changed in version 1.22.0(numpy): Now accepts os.PathLike implementations.

  • regexp (str or regexp) – Regular expression used to parse the file. Groups in the regular expression correspond to fields in the dtype.

  • dtype (dtype or list of dtypes) – Dtype for the structured array; must be a structured datatype.

  • encoding (str, optional) –

    Encoding used to decode the input file. Does not apply to input streams.

    New in version 1.14.0(numpy).
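A minimal sketch of how the four parameters fit together when file is a path rather than a stream. It calls numpy directly, on the assumption that xorbits.numpy forwards to the same function; the file name, sample text, and the 'U3' field type are illustrative choices, not part of the original docstring.

```python
import os
import tempfile

import numpy as np

# Write a small UTF-8 text file to parse (hypothetical sample data).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("1312 foo\n1534 bar\n444 qux\n")

    # One regex group per dtype field:
    # the digits go to 'num', the three characters after the whitespace to 'key'.
    result = np.fromregex(
        path,
        r"(\d+)\s+(...)",
        [("num", np.int64), ("key", "U3")],
        encoding="utf-8",  # only used because `path` is a file name
    )

print(result["num"])
print(result["key"])
```

Because the file is opened by name, encoding controls how its bytes are decoded; for an already-open stream the argument would be ignored.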

Returns

output – The output array, containing the part of the content of file that was matched by regexp. output is always a structured array.

Return type

ndarray

Raises

TypeError – When dtype is not a valid dtype for a structured array.
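As a rough illustration of the error case, the sketch below passes a plain scalar dtype instead of a structured one. It uses numpy directly (assumed to behave the same as the Xorbits fallback), and the sample text is made up for the example.

```python
from io import StringIO

import numpy as np

text = StringIO("1312\n1534\n444")

error = None
try:
    # np.int64 has no named fields, so it is not a structured dtype.
    np.fromregex(text, r"(\d+)", np.int64)
except TypeError as exc:
    error = exc

print(type(error).__name__)
```

Wrapping the same type in a one-field structured dtype, for example `[('num', np.int64)]`, avoids the error.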

Notes

Dtypes for structured arrays can be specified in several forms, but all forms specify at least the data type and field name. For details, see the NumPy guide to structured arrays (basics.rec).
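The sketch below shows three of the forms that numpy accepts for a structured dtype. The field names 'num' and 'key' are illustrative; in the comma-separated string form numpy assigns default field names itself.

```python
import numpy as np

# List of (field name, type) tuples.
as_list = np.dtype([("num", np.int64), ("key", "S3")])

# Dictionary with parallel 'names' and 'formats' lists.
as_dict = np.dtype({"names": ["num", "key"], "formats": [np.int64, "S3"]})

# Comma-separated string; fields get the default names 'f0' and 'f1'.
as_string = np.dtype("i8, S3")

print(as_list == as_dict)
print(as_string.names)
```

Any of these can be passed as the dtype argument, provided the number of fields matches the number of groups in the regular expression.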

Examples

>>> import numpy as np
>>> from io import StringIO
>>> text = StringIO("1312 foo\n1534  bar\n444   qux")
>>> regexp = r"(\d+)\s+(...)"  # match [digits, whitespace, any three characters]
>>> output = np.fromregex(text, regexp,
...                       [('num', np.int64), ('key', 'S3')])
>>> output  
array([(1312, b'foo'), (1534, b'bar'), ( 444, b'qux')],
      dtype=[('num', '<i8'), ('key', 'S3')])
>>> output['num']  
array([1312, 1534,  444])

Warning

This method has not been implemented yet. Xorbits will try to execute it with numpy.

This docstring was copied from numpy.