xorbits.pandas.groupby.SeriesGroupBy.agg
- SeriesGroupBy.agg(func=None, method='auto', combine_size=None, *args, **kwargs)
Aggregate using one or more operations over the specified axis.
- Parameters
func (function, str, list, dict or None) –
Function to use for aggregating the data. If a function, must either work when passed a Series or when passed to Series.apply.
Accepted combinations are:
function
string function name
list of functions and/or function names, e.g. [np.sum, 'mean']
None, in which case **kwargs are used with Named Aggregation. Here the output has one column for each element in **kwargs. The name of each column is the keyword, whereas the value determines the aggregation used to compute the values in the column.
Can also accept a Numba JIT function with engine='numba' specified. Only passing a single function is supported with this engine.
If the 'numba' engine is chosen, the function must be a user-defined function with values and index as the first and second arguments respectively in the function signature. Each group’s index will be passed to the user-defined function and is optionally available for use.
Deprecated since version 2.1.0 (pandas): Passing a dictionary is deprecated and will raise in a future version of pandas. Pass a list of aggregations instead.
*args – Positional arguments to pass to func.
engine (str, default None (Not supported yet)) –
'cython': Runs the function through C-extensions from cython.
'numba': Runs the function through JIT compiled code from numba.
None: Defaults to 'cython' or globally setting compute.use_numba
engine_kwargs (dict, default None (Not supported yet)) –
For 'cython' engine, there are no accepted engine_kwargs.
For 'numba' engine, the engine can accept nopython, nogil and parallel dictionary keys. The values must either be True or False. The default engine_kwargs for the 'numba' engine is {'nopython': True, 'nogil': False, 'parallel': False} and will be applied to the function.
**kwargs –
If func is None, **kwargs are used to define the output names and aggregations via Named Aggregation. See func entry.
Otherwise, keyword arguments to be passed into func.
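As a rough illustration of the last point, the sketch below forwards a keyword argument to a user-defined aggregation function. It uses plain pandas, from which this docstring was copied; scaled_min and its factor parameter are hypothetical names made up for the example, and it is an assumption (not confirmed here) that xorbits.pandas forwards keywords in the same way.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])
grouped = s.groupby([1, 1, 2, 2])


def scaled_min(group, factor=1):
    """Hypothetical helper: minimum of the group multiplied by `factor`."""
    return group.min() * factor


# func is not None, so `factor=10` is passed through to scaled_min
# rather than being interpreted as a named aggregation.
result = grouped.agg(scaled_min, factor=10)
print(result)
```

Passing the extra argument by keyword keeps it from being confused with the other positional parameters in the signature above.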
- Return type
See also
Series.groupby.apply
Apply function func group-wise and combine the results together.
Series.groupby.transform
Transforms the Series on each group based on the given function.
Series.aggregate
Aggregate using one or more operations over the specified axis.
Notes
When using engine='numba', there will be no “fall back” behavior internally. The group data and group index will be passed as numpy arrays to the JITed user defined function, and no alternative execution attempts will be tried.
Functions that mutate the passed object can produce unexpected behavior or errors and are not supported. See gotchas.udf-mutation for more details.
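A minimal sketch of an aggregation function that respects the no-mutation rule, assuming plain pandas (the source of this docstring). clipped_mean is a hypothetical helper; it derives a new Series with clip instead of modifying the group it receives.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])


def clipped_mean(group):
    """Hypothetical helper: mean of the group after capping values at 3."""
    # clip() returns a new Series, so the group passed in is left untouched.
    capped = group.clip(upper=3)
    return capped.mean()


result = s.groupby([1, 1, 2, 2]).agg(clipped_mean)
print(result)
```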
Changed in version 1.3.0 (pandas): The resulting dtype will reflect the return value of the passed func, see the examples below.
Examples
>>> s = pd.Series([1, 2, 3, 4])
>>> s
0    1
1    2
2    3
3    4
dtype: int64
>>> s.groupby([1, 1, 2, 2]).min()
1    1
2    3
dtype: int64
>>> s.groupby([1, 1, 2, 2]).agg('min')
1    1
2    3
dtype: int64
>>> s.groupby([1, 1, 2, 2]).agg(['min', 'max'])
   min  max
1    1    2
2    3    4
The output column names can be controlled by passing the desired column names and aggregations as keyword arguments.
>>> s.groupby([1, 1, 2, 2]).agg(
...     minimum='min',
...     maximum='max',
... )
   minimum  maximum
1        1        2
2        3        4
Changed in version 1.3.0 (pandas): The resulting dtype will reflect the return value of the aggregating function.
>>> s.groupby([1, 1, 2, 2]).agg(lambda x: x.astype(float).min())
1    1.0
2    3.0
dtype: float64
Extra Parameters
groupby (Mars Groupby) – Groupby data.
method ({'auto', 'shuffle', 'tree'}, default 'auto') – The 'tree' method provides better performance; 'shuffle' is recommended if the aggregated result is very large; 'auto' uses the 'shuffle' method in distributed mode and 'tree' in local mode.
combine_size (int) – The number of chunks to combine when method is 'tree'.
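To give a feel for what a tree-style aggregation with a combine size means, here is a conceptual sketch in pure Python. It is not Xorbits' implementation: the names partial_sum, combine and tree_sum are hypothetical, and the sketch only assumes the general idea that each chunk is aggregated on its own and the partial results are then merged a few at a time until one remains.

```python
def partial_sum(chunk):
    """Aggregate one chunk of (group_key, value) pairs into {key: sum}."""
    out = {}
    for key, value in chunk:
        out[key] = out.get(key, 0) + value
    return out


def combine(partials):
    """Merge several partial {key: sum} results into one."""
    merged = {}
    for partial in partials:
        for key, value in partial.items():
            merged[key] = merged.get(key, 0) + value
    return merged


def tree_sum(chunks, combine_size=2):
    """Grouped sum via a reduction tree; illustrative only."""
    if combine_size < 2:
        raise ValueError("combine_size must be at least 2")
    level = [partial_sum(chunk) for chunk in chunks]
    # Merge `combine_size` partial results at a time until one is left.
    while len(level) > 1:
        level = [
            combine(level[i:i + combine_size])
            for i in range(0, len(level), combine_size)
        ]
    return level[0] if level else {}


# Three chunks of (group_key, value) pairs.
chunks = [
    [(1, 1), (1, 2)],
    [(2, 3), (2, 4)],
    [(1, 5), (2, 6)],
]
totals = tree_sum(chunks, combine_size=2)
print(totals)
```

In this sketch a larger combine_size means fewer merge rounds but more partial results handled in each merge, and the final totals do not depend on the value chosen.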
This docstring was copied from pandas.core.groupby.generic.SeriesGroupBy.