xorbits.pandas.DataFrame.value_counts

DataFrame.value_counts(subset: IndexLabel | None = None, normalize: bool = False, sort: bool = True, ascending: bool = False, dropna: bool = True) → Series

Return a Series containing the frequency of each distinct row in the DataFrame.

Parameters
  • subset (label or list of labels, optional) – Columns to use when counting unique combinations.

  • normalize (bool, default False) – Return proportions rather than frequencies.

  • sort (bool, default True) – Sort by frequencies when True. Sort by DataFrame column values when False.

  • ascending (bool, default False) – Sort in ascending order.

  • dropna (bool, default True) –

    Don’t include counts of rows that contain NA values.

    New in version 1.3.0 (pandas).
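The parameters above can be combined. The following is a minimal sketch using plain pandas (an assumption made so the snippet runs without an Xorbits installation; the variable names `wing_share` and `rarest_first` are illustrative only). It restricts the count to one column with `subset`, returns proportions with `normalize`, and reverses the order with `ascending`.

```python
import pandas as pd

df = pd.DataFrame({
    "num_legs": [2, 4, 4, 6],
    "num_wings": [2, 0, 0, 0],
})

# Count only the values of num_wings and report them as proportions.
wing_share = df.value_counts(subset="num_wings", normalize=True)

# Count full rows, listing the least frequent combinations first.
rarest_first = df.value_counts(ascending=True)

print(wing_share)
print(rarest_first)
```

With `normalize=True` the values sum to 1, so the result reads as a share of rows rather than a raw count.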

Return type

Series

See also

Series.value_counts

Equivalent method on Series.

Notes

The returned Series will have a MultiIndex with one level per input column but an Index (non-multi) for a single label. By default, rows that contain any NA values are omitted from the result. By default, the resulting Series will be in descending order so that the first element is the most frequently-occurring row.
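The index behaviour described in this note can be checked directly. This is a small sketch in plain pandas (assumed here so it runs without Xorbits; `all_columns` and `single_label` are illustrative names) that compares the index type when counting over every column with the index type when a single label is passed.

```python
import pandas as pd

df = pd.DataFrame({
    "first_name": ["John", "Anne", "John", "Beth"],
    "middle_name": ["Smith", None, None, "Louise"],
})

# Counting over both columns: one index level per column.
all_columns = df.value_counts()

# Counting over a single label: a flat (non-multi) Index.
single_label = df.value_counts("first_name")

is_multi = isinstance(all_columns.index, pd.MultiIndex)
single_is_multi = isinstance(single_label.index, pd.MultiIndex)

print(is_multi, all_columns.index.nlevels)
print(single_is_multi)
# Rows with a missing middle_name are dropped by default, so only two remain.
print(len(all_columns))
```

Because the two rows with a missing `middle_name` are omitted by default, the full-row count keeps only two entries, while the single-label count still sees all four rows.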

Examples

>>> import xorbits.pandas as pd
>>> df = pd.DataFrame({'num_legs': [2, 4, 4, 6],
...                    'num_wings': [2, 0, 0, 0]},
...                   index=['falcon', 'dog', 'cat', 'ant'])
>>> df  
        num_legs  num_wings
falcon         2          2
dog            4          0
cat            4          0
ant            6          0
>>> df.value_counts()  
num_legs  num_wings
4         0            2
2         2            1
6         0            1
Name: count, dtype: int64
>>> df.value_counts(sort=False)  
num_legs  num_wings
2         2            1
4         0            2
6         0            1
Name: count, dtype: int64
>>> df.value_counts(ascending=True)  
num_legs  num_wings
2         2            1
6         0            1
4         0            2
Name: count, dtype: int64
>>> df.value_counts(normalize=True)  
num_legs  num_wings
4         0            0.50
2         2            0.25
6         0            0.25
Name: proportion, dtype: float64

With dropna set to False we can also count rows with NA values.

>>> df = pd.DataFrame({'first_name': ['John', 'Anne', 'John', 'Beth'],  
...                    'middle_name': ['Smith', pd.NA, pd.NA, 'Louise']})
>>> df  
  first_name middle_name
0       John       Smith
1       Anne        <NA>
2       John        <NA>
3       Beth      Louise
>>> df.value_counts()  
first_name  middle_name
Beth        Louise         1
John        Smith          1
Name: count, dtype: int64
>>> df.value_counts(dropna=False)  
first_name  middle_name
Anne        NaN            1
Beth        Louise         1
John        Smith          1
            NaN            1
Name: count, dtype: int64
>>> df.value_counts("first_name")  
first_name
John    2
Anne    1
Beth    1
Name: count, dtype: int64

Warning

This method has not been implemented yet. Xorbits will try to execute it with pandas.

This docstring was copied from pandas.core.frame.DataFrame.