pandas_openscm.grouping

Support for grouping in various ways

Functions:

| Name | Description |
| --- | --- |
| fix_index_name_after_groupby_quantile | Fix the index name after performing a groupby(...).quantile(...) operation |
| groupby_except | Group by all index levels except specified levels |

fix_index_name_after_groupby_quantile

fix_index_name_after_groupby_quantile(
    pandas_obj: P,
    new_name: str = "quantile",
    copy: bool = False,
) -> P

Fix the index name after performing a groupby(...).quantile(...) operation

By default, pandas doesn't assign a name to the quantile level when doing an operation of the form given above. This function assigns one, but it assumes that the quantile level is the only unnamed level in the index.

Parameters:

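| Name | Type | Description | Default |
| --- | --- | --- | --- |
| pandas_obj | P | Object of which we want to fix the name | required |
| new_name | str | New name to give to the quantile level | 'quantile' |
| copy | bool | Whether to copy the object before manipulating the index name. If False, the index of the input object is modified in place. | False |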

Returns:

| Type | Description |
| --- | --- |
| P | Object, with the unnamed (quantile) level in its index renamed to new_name. |

Source code in src/pandas_openscm/grouping.py
def fix_index_name_after_groupby_quantile(
    pandas_obj: P, new_name: str = "quantile", copy: bool = False
) -> P:
    """
    Fix the index name after performing a `groupby(...).quantile(...)` operation

    By default, pandas doesn't assign a name to the quantile level
    when doing an operation of the form given above.
    This fixes this, but it does assume
    that the quantile level is the only unnamed level in the index.

    Parameters
    ----------
    pandas_obj
        Object of which we want to fix the name

    new_name
        New name to give to the quantile column

    copy
        Whether to copy the object before manipulating the index name

    Returns
    -------
    :
        Object, with the last level in its index renamed to `new_name`.
    """
    if copy:
        res = pandas_obj.copy()  # ty: ignore[invalid-argument-type]
    else:
        res = pandas_obj

    new_names = [v if v is not None else new_name for v in res.index.names]
    res.index = res.index.set_names(new_names)

    return res  # ty: ignore[invalid-return-type]
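As a rough illustration of the problem this function addresses, the sketch below builds a small, made-up DataFrame, takes quantiles per group, and then applies the same renaming logic as the source above inline. The data, the level names `scenario` and `run`, and the variable names are assumptions for illustration only; with the package installed, the renaming lines would presumably be replaced by a call to `fix_index_name_after_groupby_quantile(quantiles, copy=True)`.

```python
import pandas as pd

# Hypothetical example data: two scenarios with three runs each.
index = pd.MultiIndex.from_product(
    [["scen_a", "scen_b"], [0, 1, 2]], names=["scenario", "run"]
)
df = pd.DataFrame({2020: [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=index)

# Passing a list of quantiles adds an extra, unnamed index level.
quantiles = df.groupby("scenario").quantile([0.25, 0.75])
names_before = list(quantiles.index.names)

# Same renaming logic as in the source above, applied to a copy
# so that `quantiles` itself is left untouched.
fixed = quantiles.copy()
fixed.index = fixed.index.set_names(
    [v if v is not None else "quantile" for v in fixed.index.names]
)
names_after = list(fixed.index.names)

print(names_before)
print(names_after)
```

Note that the extra level only appears when a list of quantiles is passed; with a single scalar quantile, pandas returns one row per group and there is no unnamed level to fix.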

groupby_except

groupby_except(
    pandas_obj: DataFrame,
    non_groupers: str | list[str],
    observed: bool = True,
) -> DataFrameGroupBy[Any, Any]
groupby_except(
    pandas_obj: Series[Any],
    non_groupers: str | list[str],
    observed: bool = True,
) -> SeriesGroupBy[Any, Any]
groupby_except(
    pandas_obj: DataFrame | Series[Any],
    non_groupers: str | list[str],
    observed: bool = True,
) -> DataFrameGroupBy[Any, Any] | SeriesGroupBy[Any, Any]

Group by all index levels except specified levels

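This is the inverse of pd.DataFrame.groupby: instead of naming the index levels to group by, you name the levels to leave out, and the object is grouped by all remaining index levels.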

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| pandas_obj | DataFrame \| Series[Any] | Object to group | required |
| non_groupers | str \| list[str] | Index level(s) to exclude from the grouping | required |
| observed | bool | Whether to only return observed combinations or not | True |

Returns:

| Type | Description |
| --- | --- |
| DataFrameGroupBy[Any, Any] \| SeriesGroupBy[Any, Any] | Object, grouped by all index levels except non_groupers. |

Source code in src/pandas_openscm/grouping.py
def groupby_except(
    pandas_obj: pd.DataFrame | pd.Series[Any],
    non_groupers: str | list[str],
    observed: bool = True,
) -> (
    pandas.core.groupby.generic.DataFrameGroupBy[Any, Any]
    | pandas.core.groupby.generic.SeriesGroupBy[Any, Any]
):
    """
    Group by all index levels except specified levels

    This is the inverse of [pd.DataFrame.groupby][pandas.DataFrame.groupby].

    Parameters
    ----------
    pandas_obj
        Object to group

    non_groupers
        Columns to exclude from the grouping

    observed
        Whether to only return observed combinations or not

    Returns
    -------
    :
        Object, grouped by all columns except `non_groupers`.
    """
    if isinstance(non_groupers, str):
        non_groupers = [non_groupers]

    groupers = [v for v in pandas_obj.index.names if v not in non_groupers]

    return pandas_obj.groupby(groupers, observed=observed)
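A minimal sketch of the idea, using made-up data: to take a median across runs, it is enough to name the `run` level as the one to leave out, and the grouping then uses every other index level. The level names, values, and variable names here are assumptions for illustration, and the grouping logic from the source above is reproduced inline rather than imported.

```python
import pandas as pd

# Hypothetical example data: 2 scenarios x 2 variables x 2 runs.
index = pd.MultiIndex.from_product(
    [["scen_a", "scen_b"], ["Temperature", "CO2"], [0, 1]],
    names=["scenario", "variable", "run"],
)
df = pd.DataFrame(
    {2020: [1.0, 3.0, 10.0, 30.0, 2.0, 4.0, 20.0, 40.0]}, index=index
)

# Same logic as in the source above: group by every index level
# that is not listed in `non_groupers`.
non_groupers = ["run"]
groupers = [v for v in df.index.names if v not in non_groupers]
medians = df.groupby(groupers, observed=True).median()

print(groupers)
print(medians)
```

Expressing the grouping this way means the code keeps working if further index levels (for example a `region` level) are added later, because only the levels to collapse are named explicitly.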