xorbits.pandas.read_orc

xorbits.pandas.read_orc(path: FilePath | ReadBuffer[bytes], columns: list[str] | None = None, dtype_backend: DtypeBackend | lib.NoDefault = _NoDefault.no_default, filesystem: pyarrow.fs.FileSystem | fsspec.spec.AbstractFileSystem | None = None, **kwargs: Any) -> DataFrame

Load an ORC object from the file path, returning a DataFrame.

Parameters
  • path (str, path object, or file-like object) – String, path object (implementing os.PathLike[str]), or file-like object implementing a binary read() function. The string could be a URL. Valid URL schemes include http, ftp, s3, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.orc.

  • columns (list, default None) – If not None, only these columns will be read from the file. Output always follows the ordering of the file and not the columns list. This mirrors the original behaviour of pyarrow.orc.ORCFile.read().

  • dtype_backend ({'numpy_nullable', 'pyarrow'}, default 'numpy_nullable') –

    Back-end data type applied to the resultant DataFrame (still experimental). Behaviour is as follows:

    • "numpy_nullable": returns nullable-dtype-backed DataFrame (default).

    • "pyarrow": returns pyarrow-backed nullable ArrowDtype DataFrame.

    New in version 2.0 (pandas).

  • filesystem (fsspec or pyarrow filesystem, default None) –

    Filesystem object to use when reading the ORC file.

    New in version 2.1.0 (pandas).

  • **kwargs – Any additional kwargs are passed to pyarrow.

Return type

DataFrame

Notes

Before using this function you should read the user guide about ORC and install optional dependencies.

If path is a URI pointing to a local or remote file (e.g. “s3://”), pandas will try to read the file through a pyarrow.fs filesystem. You can also pass a pyarrow or fsspec filesystem object to the filesystem keyword to override this behavior.

Examples

>>> import xorbits.pandas as pd
>>> result = pd.read_orc("example_pa.orc")

Warning

This method has not been implemented yet. Xorbits will try to execute it with pandas.

This docstring was copied from pandas.