xorbits.pandas.DataFrame

class xorbits.pandas.DataFrame(*args, **kwargs)

Two-dimensional, size-mutable, potentially heterogeneous tabular data.

The data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. It can be thought of as a dict-like container for Series objects, and it is the primary pandas data structure.
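As an illustration of label alignment and dict-like access, here is a minimal sketch. It uses plain pandas, on the assumption that xorbits.pandas behaves the same way because this docstring was copied from pandas. The frame names `left` and `right` are made up for the example.

```python
import pandas as pd  # plain pandas; xorbits.pandas is assumed to mirror this behaviour

left = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])
right = pd.DataFrame({"b": [10, 20], "c": [30, 40]}, index=["y", "z"])

# Addition aligns on both row and column labels. Only the cell (y, b)
# exists in both operands; every other cell of the result is NaN.
total = left + right
print(total.loc["y", "b"])  # 14.0

# Dict-like behaviour: membership and iteration work on column labels,
# and item access returns a Series.
print("a" in left)   # True
print(list(left))    # ['a', 'b']
column_a = left["a"]
```

The result covers the union of the row labels and the union of the column labels, which is why most cells are missing in this small example.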

Parameters

data (ndarray (structured or homogeneous), Iterable, dict, or DataFrame (Not supported yet)) –
Dict can contain Series, arrays, constants, dataclass or list-like objects. If data is a dict, column order follows insertion order. If a dict contains Series which have an index defined, it is aligned by its index. This alignment also occurs if data is a Series or a DataFrame itself.

If data is a list of dicts, column order follows insertion order.
index (Index or array-like (Not supported yet)) – Index to use for the resulting frame. Defaults to RangeIndex if the input data carries no indexing information and no index is provided.
columns (Index or array-like (Not supported yet)) – Column labels to use for the resulting frame when data does not have them, defaulting to RangeIndex(0, 1, 2, …, n). If data contains column labels, column selection is performed instead.
dtype (dtype, default None) – Data type to force. Only a single dtype is allowed. If None, infer.
copy (bool or None, default None (Not supported yet)) –
Copy data from inputs. For dict data, the default of None behaves like copy=True. For DataFrame or 2d ndarray input, the default of None behaves like copy=False. If data is a dict containing one or more Series (possibly of different dtypes), copy=False will ensure that these inputs are not copied.

Changed in version 1.3.0(pandas).
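The column-ordering rules described for `data` can be sketched as follows. This is an illustration in plain pandas, assuming xorbits.pandas follows the same rules. The variable names are hypothetical.

```python
import pandas as pd  # plain pandas; assumed to match xorbits.pandas for this case

# Dict input: columns keep the insertion order of the keys.
from_dict = pd.DataFrame({"z": [1, 2], "a": [3, 4]})
print(list(from_dict.columns))  # ['z', 'a']

# List of dicts: columns appear in the order the keys are first seen,
# and keys missing from a record become NaN in that row.
records = [{"b": 1, "a": 2}, {"a": 3, "c": 4}]
from_records = pd.DataFrame(records)
print(list(from_records.columns))  # ['b', 'a', 'c']
```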

See also

DataFrame.from_records: Constructor from tuples, also record arrays.
DataFrame.from_dict: From dicts of Series, arrays, or dicts.
read_csv: Read a comma-separated values (csv) file into DataFrame.
read_table: Read general delimited file into DataFrame.
read_clipboard: Read text from clipboard into DataFrame.
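For orientation, the two alternative constructors listed above can be used roughly as follows. The sketch calls the plain pandas methods. Whether xorbits.pandas supports every argument shown here is an assumption, not something this page confirms.

```python
import pandas as pd  # plain pandas; xorbits.pandas support for these calls is assumed

# from_dict with orient="index": dict keys become row labels.
by_row = pd.DataFrame.from_dict(
    {"row1": [1, 2], "row2": [3, 4]}, orient="index", columns=["a", "b"]
)

# from_records: each tuple becomes one row.
from_tuples = pd.DataFrame.from_records(
    [(1, "x"), (2, "y")], columns=["num", "letter"]
)
```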

Notes

Please refer to the User Guide for more information.

Examples

Constructing DataFrame from a dictionary:

>>> import numpy as np
>>> import xorbits.pandas as pd
>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> df = pd.DataFrame(data=d)  
>>> df  
   col1  col2
0     1     3
1     2     4

Notice that the inferred dtype is int64.

>>> df.dtypes  
col1    int64
col2    int64
dtype: object

To enforce a single dtype:

>>> df = pd.DataFrame(data=d, dtype=np.int8)  
>>> df.dtypes  
col1    int8
col2    int8
dtype: object

Constructing DataFrame from a dictionary including Series:

>>> d = {'col1': [0, 1, 2, 3], 'col2': pd.Series([2, 3], index=[2, 3])}  
>>> pd.DataFrame(data=d, index=[0, 1, 2, 3])  
   col1  col2
0     0   NaN
1     1   NaN
2     2   2.0
3     3   3.0

Constructing DataFrame from numpy ndarray:

>>> df2 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),  
...                    columns=['a', 'b', 'c'])
>>> df2  
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

Constructing DataFrame from a numpy ndarray that has labeled columns:

>>> data = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)],  
...                 dtype=[("a", "i4"), ("b", "i4"), ("c", "i4")])
>>> df3 = pd.DataFrame(data, columns=['c', 'a'])
>>> df3  
   c  a
0  3  1
1  6  4
2  9  7

Constructing DataFrame from dataclass:

>>> from dataclasses import make_dataclass  
>>> Point = make_dataclass("Point", [("x", int), ("y", int)])  
>>> pd.DataFrame([Point(0, 0), Point(0, 3), Point(2, 3)])  
   x  y
0  0  0
1  0  3
2  2  3

Constructing DataFrame from Series/DataFrame:

>>> ser = pd.Series([1, 2, 3], index=["a", "b", "c"])  
>>> df = pd.DataFrame(data=ser, index=["a", "c"])  
>>> df  
   0
a  1
c  3

>>> df1 = pd.DataFrame([1, 2, 3], index=["a", "b", "c"], columns=["x"])  
>>> df2 = pd.DataFrame(data=df1, index=["a", "c"])  
>>> df2  
   x
a  1
c  3

This docstring was copied from pandas.

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

Attributes

`at`	Access a single value for a row/column label pair.
`iat`	Access a single value for a row/column pair by integer position.
`iloc`	Purely integer-location based indexing for selection by position.
`loc`	Access a group of rows and columns by label(s) or a boolean array.
`shape`
`data`
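The difference between the label-based and position-based accessors in the table above can be sketched as follows. The example uses plain pandas with a string index so that labels and positions are easy to tell apart. It assumes xorbits.pandas exposes the same semantics, and since this page marks the `index` parameter as not yet supported, it may not run unchanged there.

```python
import pandas as pd  # plain pandas; xorbits.pandas is assumed to behave the same

df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": [4, 5, 6]}, index=["a", "b", "c"]
)

by_label = df.at["b", "col2"]               # single value by row/column label
by_position = df.iat[1, 1]                  # the same cell by integer position
label_subset = df.loc[["a", "c"], "col1"]   # rows 'a' and 'c' of column 'col1'
first_two_rows = df.iloc[0:2]               # rows at positions 0 and 1

print(by_label, by_position)  # 5 5
print(df.shape)               # (3, 2)
```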