pandas_openscm.accessors.series#

Accessor for pd.Series

Classes:

Name | Description
PandasSeriesOpenSCMAccessor | pd.Series accessor

PandasSeriesOpenSCMAccessor #

Bases: Generic[S]

pd.Series accessor

For details, see pandas' docs.
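
The registration mechanism those docs describe can be sketched in a few lines. This is a minimal illustration using pandas' public `pd.api.extensions.register_series_accessor` decorator. The `demo` namespace, the `DemoAccessor` class and its `double` method are hypothetical names chosen for the example; they are not part of pandas_openscm.

```python
import pandas as pd


# "demo" is a hypothetical namespace used only for this illustration.
@pd.api.extensions.register_series_accessor("demo")
class DemoAccessor:
    def __init__(self, series: pd.Series):
        # pandas passes the Series being accessed to the constructor.
        self._series = series

    def double(self) -> pd.Series:
        """Return the wrapped series with every value doubled."""
        return self._series * 2


s = pd.Series([1.0, 2.0])
doubled = s.demo.double()
```

Once registered, every `pd.Series` exposes the accessor under the chosen namespace, which is the same pattern the class documented here relies on.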

Methods:

Name | Description
__init__ | Initialise
convert_unit | Convert units
convert_unit_like | Convert units to match another supported pandas object
eiim | Ensure that the index is a pd.MultiIndex
ensure_index_is_multiindex | Ensure that the index is a pd.MultiIndex
fix_index_name_after_groupby_quantile | Fix the index name after performing a groupby(...).quantile(...) operation
groupby_except | Group by all index levels except specified levels
mi_loc | Select data, being slightly smarter than the default pandas.Series.loc.
set_index_levels | Set the index levels
to_category_index | Convert the index's values to categories
update_index_levels | Update the index levels
update_index_levels_from_other | Update the index levels based on other index levels

Source code in src/pandas_openscm/accessors/series.py
class PandasSeriesOpenSCMAccessor(Generic[S]):
    """
    [pd.Series][pandas.Series] accessor

    For details, see
    [pandas' docs](https://pandas.pydata.org/docs/development/extending.html#registering-custom-accessors).
    """

    def __init__(self, series: S):
        """
        Initialise

        Parameters
        ----------
        series
            [pd.Series][pandas.Series] to use via the accessor
        """
        # It is possible to validate here.
        # However, it's probably better to do validation closer to the data use.
        self._series = series

    def convert_unit(
        self,
        desired_units: str | Mapping[str, str] | pd.Series[str],
        unit_level: str = "unit",
        ur: pint.UnitRegistry | None = None,
    ) -> S:
        """
        Convert units

        This uses [convert_unit_from_target_series][pandas_openscm.unit_conversion.].
        If you want to understand the details of how the conversion works,
        see that function's docstring.

        Parameters
        ----------
        desired_units
            Desired unit(s) for `series`

            If this is a string,
            we attempt to convert all timeseries to the given unit.

            If this is a mapping,
            we convert the given units to the target units.
            Be careful using this form - you need to be certain of the units.
            If any of your keys don't match the existing units
            (even by a single whitespace character)
            then the unit conversion will not happen.

            If this is a [pd.Series][pandas.Series],
            then it will be passed to
            [convert_unit_from_target_series][pandas_openscm.unit_conversion.]
            after filling any rows in the [pd.Series][pandas.Series]
            that are not in `desired_units`
            with the existing unit (i.e. unspecified rows are not converted).

            For further details, see the examples
            in [convert_unit][pandas_openscm.unit_conversion.].

        unit_level
            Level in the index which holds unit information

            Passed to
            [convert_unit_from_target_series][pandas_openscm.unit_conversion.].

        ur
            Unit registry to use for the conversion.

            Passed to
            [convert_unit_from_target_series][pandas_openscm.unit_conversion.].

        Returns
        -------
        :
            Data with converted units
        """
        res = convert_unit(
            self._series, desired_units=desired_units, unit_level=unit_level, ur=ur
        )

        # The type hinting is impossible to get right here
        # because the casting doesn't work to match the return type
        # (the return type is the same as the input,
        # but we would have to cast to make sure it's numeric
        # and we can't do a runtime check because pd.Series
        # is not subscriptable at runtime).
        # Hence just ignore the type stuff,
        # it's impossible to get right with pandas' accessor pattern.
        # If users want correct type hints, they should use the functional form.
        return res  # type: ignore

    def convert_unit_like(
        self,
        target: pd.DataFrame | pd.Series[Any],
        unit_level: str = "unit",
        target_unit_level: str | None = None,
        ur: pint.UnitRegistry | None = None,
    ) -> S:
        """
        Convert units to match another supported pandas object

        For further details, see the examples
        in [convert_unit_like][pandas_openscm.unit_conversion.].

        This is essentially a helper for
        [convert_unit_from_target_series][pandas_openscm.unit_conversion.].
        It implements one set of logic for extracting desired units
        and tries to be clever, handling differences in index levels
        between the data and `target` sensibly wherever possible.

        If you want behaviour other than what is implemented here,
        use [convert_unit_from_target_series][pandas_openscm.unit_conversion.] directly.

        Parameters
        ----------
        target
            Supported [pandas][] object whose units should be matched

        unit_level
            Level in the data's index which holds unit information

        target_unit_level
            Level in `target`'s index which holds unit information

            If not supplied, we use `unit_level`.

        ur
            Unit registry to use for the conversion.

            Passed to
            [convert_unit_from_target_series][pandas_openscm.unit_conversion.].

        Returns
        -------
        :
            Data with converted units
        """
        res = convert_unit_like(
            self._series,
            target=target,
            unit_level=unit_level,
            target_unit_level=target_unit_level,
            ur=ur,
        )

        # The type hinting is impossible to get right here
        # because the casting doesn't work to match the return type
        # (the return type is the same as the input,
        # but we would have to cast to make sure it's numeric
        # and we can't do a runtime check because pd.Series
        # is not subscriptable at runtime).
        # Hence just ignore the type stuff,
        # it's impossible to get right with pandas' accessor pattern.
        # If users want correct type hints, they should use the functional form.
        return res  # type: ignore

    def ensure_index_is_multiindex(self, copy: bool = True) -> S:
        """
        Ensure that the index is a [pd.MultiIndex][pandas.MultiIndex]

        Parameters
        ----------
        copy
            Whether to copy `series` before manipulating the index name

        Returns
        -------
        :
            `series` with a [pd.MultiIndex][pandas.MultiIndex]

            If the index was already a [pd.MultiIndex][pandas.MultiIndex],
            this is a no-op (although the value of copy is respected).
        """
        res = ensure_index_is_multiindex(self._series, copy=copy)

        return res  # type: ignore # something wrong with generic type hinting

    def eiim(self, copy: bool = True) -> S:
        """
        Ensure that the index is a [pd.MultiIndex][pandas.MultiIndex]

        Alias for [ensure_index_is_multiindex][pandas_openscm.index_manipulation.]

        Parameters
        ----------
        copy
            Whether to copy `series` before manipulating the index name

        Returns
        -------
        :
            `series` with a [pd.MultiIndex][pandas.MultiIndex]

            If the index was already a [pd.MultiIndex][pandas.MultiIndex],
            this is a no-op (although the value of copy is respected).
        """
        return self.ensure_index_is_multiindex(copy=copy)

    def fix_index_name_after_groupby_quantile(
        self, new_name: str = "quantile", copy: bool = False
    ) -> S:
        """
        Fix the index name after performing a `groupby(...).quantile(...)` operation

        By default, pandas doesn't assign a name to the quantile level
        when doing an operation of the form given above.
        This fixes this, but it does assume
        that the quantile level is the only unnamed level in the index.

        Parameters
        ----------
        new_name
            New name to give to the quantile level

        copy
            Whether to copy `series` before manipulating the index name

        Returns
        -------
        :
            `series`, with the last level in its index renamed to `new_name`.
        """
        res = fix_index_name_after_groupby_quantile(
            self._series, new_name=new_name, copy=copy
        )

        # Ignore return type
        # because I've done something wrong with how I've set this up.
        # Figuring this out is a job for another day
        return res  # type: ignore

    def groupby_except(
        self, non_groupers: str | list[str], observed: bool = True
    ) -> pandas.core.groupby.generic.SeriesGroupBy[Any, Any]:
        """
        Group by all index levels except specified levels

        This is the inverse of [pd.Series.groupby][pandas.Series.groupby]:
        you name the levels to leave out rather than the levels to group by.

        Parameters
        ----------
        non_groupers
            Index levels to exclude from the grouping

        observed
            Whether to only return observed combinations or not

        Returns
        -------
        :
            The [pd.Series][pandas.Series],
            grouped by all index levels except `non_groupers`.
        """
        return groupby_except(
            self._series, non_groupers=non_groupers, observed=observed
        )

    def mi_loc(
        self,
        locator: pd.Index[Any] | pd.MultiIndex | pix.selectors.Selector,
    ) -> S:
        """
        Select data, being slightly smarter than the default [pandas.Series.loc][].

        Parameters
        ----------
        locator
            Locator to apply

            If this is a multi-index, we use
            [multi_index_lookup][pandas_openscm.indexing.]
            to ensure correct alignment.

            If this is an index that has a name,
            we use the name to ensure correct alignment.

        Returns
        -------
        :
            Selected data

        Notes
        -----
        If you have [pandas_indexing][] installed,
        you can get the same (perhaps even better) functionality
        using something like the following instead

        ```python
        ...
        pandas_obj.loc[pandas_indexing.isin(locator)]
        ...
        ```
        """
        res = mi_loc(self._series, locator)

        # Ignore return type
        # because I've done something wrong with how I've set this up.
        # Figuring this out is a job for another day
        return res  # type: ignore

    def set_index_levels(
        self,
        levels_to_set: dict[str, Any | Collection[Any]],
        copy: bool = True,
    ) -> S:
        """
        Set the index levels

        Parameters
        ----------
        levels_to_set
            Mapping of level names to values to set

        copy
            Should the [pd.Series][pandas.Series] be copied before returning?

        Returns
        -------
        :
            [pd.Series][pandas.Series] with updates applied to its index
        """
        res = set_index_levels_func(
            self._series,
            levels_to_set=levels_to_set,
            copy=copy,
        )

        # Ignore return type
        # because I've done something wrong with how I've set this up.
        # Figuring this out is a job for another day
        return res  # type: ignore

    def to_category_index(self) -> S:
        """
        Convert the index's values to categories

        This can save a lot of memory and improve the speed of processing.
        However, it comes with some pitfalls.
        For a nice discussion of some of them,
        see [this article](https://towardsdatascience.com/staying-sane-while-adopting-pandas-categorical-datatypes-78dbd19dcd8a/).

        Returns
        -------
        :
            [pd.Series][pandas.Series] with all index levels
            converted to category type.
        """
        res = convert_index_to_category_index(self._series)

        # Ignore return type
        # because I've done something wrong with how I've set this up.
        # Figuring this out is a job for another day
        return res  # type: ignore

    def update_index_levels(
        self,
        updates: dict[Any, Callable[[Any], Any]],
        copy: bool = True,
        remove_unused_levels: bool = True,
    ) -> S:
        """
        Update the index levels

        Parameters
        ----------
        updates
            Updates to apply to the index levels

            Each key is the index level to which the updates will be applied.
            Each value is a function which updates the levels to their new values.

        copy
            Should the [pd.Series][pandas.Series] be copied before returning?

        remove_unused_levels
            Remove unused levels before applying the update

            Specifically, call
            [pd.MultiIndex.remove_unused_levels][pandas.MultiIndex.remove_unused_levels].

            This avoids trying to update levels that aren't being used.

        Returns
        -------
        :
            [pd.Series][pandas.Series] with updates applied to its index
        """
        res = update_index_levels_func(
            self._series,
            updates=updates,
            copy=copy,
            remove_unused_levels=remove_unused_levels,
        )

        # Ignore return type
        # because I've done something wrong with how I've set this up.
        # Figuring this out is a job for another day
        return res  # type: ignore

    def update_index_levels_from_other(
        self,
        update_sources: dict[
            Any, tuple[Any, Callable[[Any], Any] | dict[Any, Any] | pd.Series[Any]]
        ],
        copy: bool = True,
        remove_unused_levels: bool = True,
    ) -> S:
        """
        Update the index levels based on other index levels

        Parameters
        ----------
        update_sources
            Updates to apply to the index levels

            Each key is the level to which the updates will be applied
            (or the level that will be created if it doesn't already exist).

            Each value is a tuple of which the first element
            is the level to use to generate the values (the 'source level')
            and the second is a mapper of the form used by
            [pd.Index.map][pandas.Index.map]
            which will be applied to the source level
            to update/create the level of interest.

        copy
            Should the [pd.Series][pandas.Series] be copied before returning?

        remove_unused_levels
            Remove unused levels before applying the update

            Specifically, call
            [pd.MultiIndex.remove_unused_levels][pandas.MultiIndex.remove_unused_levels].

            This avoids trying to update levels that aren't being used.

        Returns
        -------
        :
            [pd.Series][pandas.Series] with updates applied to its index
        """
        res = update_index_levels_from_other_func(
            self._series,
            update_sources=update_sources,
            copy=copy,
            remove_unused_levels=remove_unused_levels,
        )

        # Ignore return type
        # because I've done something wrong with how I've set this up.
        # Figuring this out is a job for another day
        return res  # type: ignore

__init__ #

__init__(series: S)

Initialise

Parameters:

Name Type Description Default
series S

pd.Series to use via the accessor

required
Source code in src/pandas_openscm/accessors/series.py
def __init__(self, series: S):
    """
    Initialise

    Parameters
    ----------
    series
        [pd.Series][pandas.Series] to use via the accessor
    """
    # It is possible to validate here.
    # However, it's probably better to do validation closer to the data use.
    self._series = series

convert_unit #

convert_unit(
    desired_units: str | Mapping[str, str] | Series[str],
    unit_level: str = "unit",
    ur: UnitRegistry | None = None,
) -> S

Convert units

This uses convert_unit_from_target_series. If you want to understand the details of how the conversion works, see that function's docstring.

Parameters:

Name Type Description Default
desired_units str | Mapping[str, str] | Series[str]

Desired unit(s) for series

If this is a string, we attempt to convert all timeseries to the given unit.

If this is a mapping, we convert the given units to the target units. Be careful using this form - you need to be certain of the units. If any of your keys don't match the existing units (even by a single whitespace character) then the unit conversion will not happen.

If this is a pd.Series, then it will be passed to convert_unit_from_target_series after filling any rows in the pd.Series that are not in desired_units with the existing unit (i.e. unspecified rows are not converted).

For further details, see the examples in convert_unit.

required
unit_level str

Level in the index which holds unit information

Passed to convert_unit_from_target_series.

'unit'
ur UnitRegistry | None

Unit registry to use for the conversion.

Passed to convert_unit_from_target_series.

None

Returns:

Type Description
S

Data with converted units
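
The mapping form described above can be pictured with a plain-pandas sketch. This is not the library's implementation: the accessor delegates the actual conversion to pint, whereas here the single conversion factor is hard-coded, and `FACTORS` and `convert_with_mapping` are hypothetical names introduced only for illustration. The sketch also shows why a key that differs by one whitespace character results in no conversion.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("emissions|co2", "Mt CO2/yr"), ("emissions|ch4", "Mt CH4/yr")],
    names=["variable", "unit"],
)
series = pd.Series([1000.0, 300.0], index=index)

# Hard-coded factor for illustration only (assumption); the accessor uses pint.
FACTORS = {("Mt CO2/yr", "Gt CO2/yr"): 1e-3}


def convert_with_mapping(s, desired_units, unit_level="unit"):
    """Convert rows whose unit is a key of `desired_units`; leave the rest alone."""
    current = s.index.get_level_values(unit_level)
    # Units that are not keys of the mapping map to themselves.
    target = [desired_units.get(u, u) for u in current]
    scale = [1.0 if u == t else FACTORS[(u, t)] for u, t in zip(current, target)]

    out = s * pd.Series(scale, index=s.index)
    # Rebuild the index with the new unit labels.
    frame = s.index.to_frame(index=False)
    frame[unit_level] = target
    out.index = pd.MultiIndex.from_frame(frame)
    return out


converted = convert_with_mapping(series, {"Mt CO2/yr": "Gt CO2/yr"})
# A key with a stray trailing space matches nothing, so nothing is converted.
unchanged = convert_with_mapping(series, {"Mt CO2/yr ": "Gt CO2/yr"})
```

In `converted`, only the CO2 row is rescaled and relabelled; in `unchanged`, the mistyped key leaves every row as it was.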

Source code in src/pandas_openscm/accessors/series.py
def convert_unit(
    self,
    desired_units: str | Mapping[str, str] | pd.Series[str],
    unit_level: str = "unit",
    ur: pint.UnitRegistry | None = None,
) -> S:
    """
    Convert units

    This uses [convert_unit_from_target_series][pandas_openscm.unit_conversion.].
    If you want to understand the details of how the conversion works,
    see that function's docstring.

    Parameters
    ----------
    desired_units
        Desired unit(s) for `series`

        If this is a string,
        we attempt to convert all timeseries to the given unit.

        If this is a mapping,
        we convert the given units to the target units.
        Be careful using this form - you need to be certain of the units.
        If any of your keys don't match the existing units
        (even by a single whitespace character)
        then the unit conversion will not happen.

        If this is a [pd.Series][pandas.Series],
        then it will be passed to
        [convert_unit_from_target_series][pandas_openscm.unit_conversion.]
        after filling any rows in the [pd.Series][pandas.Series]
        that are not in `desired_units`
        with the existing unit (i.e. unspecified rows are not converted).

        For further details, see the examples
        in [convert_unit][pandas_openscm.unit_conversion.].

    unit_level
        Level in the index which holds unit information

        Passed to
        [convert_unit_from_target_series][pandas_openscm.unit_conversion.].

    ur
        Unit registry to use for the conversion.

        Passed to
        [convert_unit_from_target_series][pandas_openscm.unit_conversion.].

    Returns
    -------
    :
        Data with converted units
    """
    res = convert_unit(
        self._series, desired_units=desired_units, unit_level=unit_level, ur=ur
    )

    # The type hinting is impossible to get right here
    # because the casting doesn't work to match the return type
    # (the return type is the same as the input,
    # but we would have to cast to make sure it's numeric
    # and we can't do a runtime check because pd.Series
    # is not subscriptable at runtime).
    # Hence just ignore the type stuff,
    # it's impossible to get right with pandas' accessor pattern.
    # If users want correct type hints, they should use the functional form.
    return res  # type: ignore

convert_unit_like #

convert_unit_like(
    target: DataFrame | Series[Any],
    unit_level: str = "unit",
    target_unit_level: str | None = None,
    ur: UnitRegistry | None = None,
) -> S

Convert units to match another supported pandas object

For further details, see the examples in convert_unit_like.

This is essentially a helper for convert_unit_from_target_series. It implements one set of logic for extracting desired units and tries to be clever, handling differences in index levels between the data and target sensibly wherever possible.

If you want behaviour other than what is implemented here, use convert_unit_from_target_series directly.

Parameters:

Name Type Description Default
target DataFrame | Series[Any]

Supported pandas object whose units should be matched

required
unit_level str

Level in the data's index which holds unit information

'unit'
target_unit_level str | None

Level in target's index which holds unit information

If not supplied, we use unit_level.

None
ur UnitRegistry | None

Unit registry to use for the conversion.

Passed to convert_unit_from_target_series.

None

Returns:

Type Description
S

Data with converted units
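
To make "extracting desired units" from a target with different index levels more concrete, here is a plain-pandas sketch. It assumes the two objects share a `variable` level and matches units on that level alone; this matching rule is an assumption for illustration and is not necessarily the logic pandas_openscm applies.

```python
import pandas as pd

data = pd.Series(
    [1000.0, 300.0],
    index=pd.MultiIndex.from_tuples(
        [("scen_a", "co2", "Mt CO2/yr"), ("scen_a", "ch4", "Mt CH4/yr")],
        names=["scenario", "variable", "unit"],
    ),
)
# The target has no "scenario" level, so units are matched on "variable" only.
target = pd.Series(
    [0.0],
    index=pd.MultiIndex.from_tuples(
        [("co2", "Gt CO2/yr")], names=["variable", "unit"]
    ),
)

# Unit that the target uses for each variable it contains.
target_units = dict(
    zip(
        target.index.get_level_values("variable"),
        target.index.get_level_values("unit"),
    )
)

# One desired unit per row of `data`; rows without a match keep their unit.
desired_units = pd.Series(
    [
        target_units.get(variable, unit)
        for variable, unit in zip(
            data.index.get_level_values("variable"),
            data.index.get_level_values("unit"),
        )
    ],
    index=data.index,
)
```

The result holds one desired unit per row of the data, with rows absent from the target left in their existing unit.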

Source code in src/pandas_openscm/accessors/series.py
def convert_unit_like(
    self,
    target: pd.DataFrame | pd.Series[Any],
    unit_level: str = "unit",
    target_unit_level: str | None = None,
    ur: pint.UnitRegistry | None = None,
) -> S:
    """
    Convert units to match another supported pandas object

    For further details, see the examples
    in [convert_unit_like][pandas_openscm.unit_conversion.].

    This is essentially a helper for
    [convert_unit_from_target_series][pandas_openscm.unit_conversion.].
    It implements one set of logic for extracting desired units
    and tries to be clever, handling differences in index levels
    between the data and `target` sensibly wherever possible.

    If you want behaviour other than what is implemented here,
    use [convert_unit_from_target_series][pandas_openscm.unit_conversion.] directly.

    Parameters
    ----------
    target
        Supported [pandas][] object whose units should be matched

    unit_level
        Level in the data's index which holds unit information

    target_unit_level
        Level in `target`'s index which holds unit information

        If not supplied, we use `unit_level`.

    ur
        Unit registry to use for the conversion.

        Passed to
        [convert_unit_from_target_series][pandas_openscm.unit_conversion.].

    Returns
    -------
    :
        Data with converted units
    """
    res = convert_unit_like(
        self._series,
        target=target,
        unit_level=unit_level,
        target_unit_level=target_unit_level,
        ur=ur,
    )

    # The type hinting is impossible to get right here
    # because the casting doesn't work to match the return type
    # (the return type is the same as the input,
    # but we would have to cast to make sure it's numeric
    # and we can't do a runtime check because pd.Series
    # is not subscriptable at runtime).
    # Hence just ignore the type stuff,
    # it's impossible to get right with pandas' accessor pattern.
    # If users want correct type hints, they should use the functional form.
    return res  # type: ignore

eiim #

eiim(copy: bool = True) -> S

Ensure that the index is a pd.MultiIndex

Alias for ensure_index_is_multiindex

Parameters:

Name Type Description Default
copy bool

Whether to copy series before manipulating the index name

True

Returns:

Type Description
S

series with a pd.MultiIndex

If the index was already a pd.MultiIndex, this is a no-op (although the value of copy is respected).

Source code in src/pandas_openscm/accessors/series.py
def eiim(self, copy: bool = True) -> S:
    """
    Ensure that the index is a [pd.MultiIndex][pandas.MultiIndex]

    Alias for [ensure_index_is_multiindex][pandas_openscm.index_manipulation.]

    Parameters
    ----------
    copy
        Whether to copy `series` before manipulating the index name

    Returns
    -------
    :
        `series` with a [pd.MultiIndex][pandas.MultiIndex]

        If the index was already a [pd.MultiIndex][pandas.MultiIndex],
        this is a no-op (although the value of copy is respected).
    """
    return self.ensure_index_is_multiindex(copy=copy)

ensure_index_is_multiindex #

ensure_index_is_multiindex(copy: bool = True) -> S

Ensure that the index is a pd.MultiIndex

Parameters:

Name Type Description Default
copy bool

Whether to copy series before manipulating the index name

True

Returns:

Type Description
S

series with a pd.MultiIndex

If the index was already a pd.MultiIndex, this is a no-op (although the value of copy is respected).
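
A minimal plain-pandas sketch of the behaviour described above, assuming a single-level index is simply wrapped in a one-level `pd.MultiIndex`. The helper name `as_multiindex` is hypothetical and the library's own function may differ in its details.

```python
import pandas as pd


def as_multiindex(series: pd.Series, copy: bool = True) -> pd.Series:
    """Return `series` with a MultiIndex, wrapping a flat index if needed."""
    out = series.copy() if copy else series
    if not isinstance(out.index, pd.MultiIndex):
        # Wrap the existing index as the only level, keeping its name.
        out.index = pd.MultiIndex.from_arrays(
            [out.index.values], names=[out.index.name]
        )
    return out


flat = pd.Series([1.0, 2.0], index=pd.Index(["a", "b"], name="scenario"))
wrapped = as_multiindex(flat)
```

With `copy=True`, the input keeps its flat index and only the returned object carries the `pd.MultiIndex`.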

Source code in src/pandas_openscm/accessors/series.py
def ensure_index_is_multiindex(self, copy: bool = True) -> S:
    """
    Ensure that the index is a [pd.MultiIndex][pandas.MultiIndex]

    Parameters
    ----------
    copy
        Whether to copy `series` before manipulating the index name

    Returns
    -------
    :
        `series` with a [pd.MultiIndex][pandas.MultiIndex]

        If the index was already a [pd.MultiIndex][pandas.MultiIndex],
        this is a no-op (although the value of copy is respected).
    """
    res = ensure_index_is_multiindex(self._series, copy=copy)

    return res  # type: ignore # something wrong with generic type hinting

fix_index_name_after_groupby_quantile #

fix_index_name_after_groupby_quantile(
    new_name: str = "quantile", copy: bool = False
) -> S

Fix the index name after performing a groupby(...).quantile(...) operation

By default, pandas doesn't assign a name to the quantile level when doing an operation of the form given above. This fixes this, but it does assume that the quantile level is the only unnamed level in the index.

Parameters:

Name Type Description Default
new_name str

New name to give to the quantile level

'quantile'
copy bool

Whether to copy series before manipulating the index name

False

Returns:

Type Description
S

series, with the last level in its index renamed to new_name.
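
The situation this method addresses can be reproduced with plain pandas. The sketch below assumes, as the description does, that the quantile level is the only unnamed level. It fills in the name by hand rather than calling the accessor, and the data are made up for illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["scen_a", "scen_b"], [1, 2, 3]], names=["scenario", "run"]
)
series = pd.Series([1.0, 2.0, 3.0, 10.0, 20.0, 30.0], index=index)

quantiles = series.groupby(level="scenario").quantile([0.25, 0.75])

# pandas leaves the new quantile level unnamed, so fill in the missing name.
fixed = quantiles.copy()
fixed.index = fixed.index.set_names(
    ["quantile" if name is None else name for name in fixed.index.names]
)
```

After the rename, the quantile level can be addressed by name like any other index level.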

Source code in src/pandas_openscm/accessors/series.py
def fix_index_name_after_groupby_quantile(
    self, new_name: str = "quantile", copy: bool = False
) -> S:
    """
    Fix the index name after performing a `groupby(...).quantile(...)` operation

    By default, pandas doesn't assign a name to the quantile level
    when doing an operation of the form given above.
    This fixes this, but it does assume
    that the quantile level is the only unnamed level in the index.

    Parameters
    ----------
    new_name
        New name to give to the quantile level

    copy
        Whether to copy `series` before manipulating the index name

    Returns
    -------
    :
        `series`, with the last level in its index renamed to `new_name`.
    """
    res = fix_index_name_after_groupby_quantile(
        self._series, new_name=new_name, copy=copy
    )

    # Ignore return type
    # because I've done something wrong with how I've set this up.
    # Figuring this out is a job for another day
    return res  # type: ignore

groupby_except #

groupby_except(
    non_groupers: str | list[str], observed: bool = True
) -> SeriesGroupBy[Any, Any]

Group by all index levels except specified levels

This is the inverse of pd.Series.groupby: you name the levels to leave out rather than the levels to group by.

Parameters:

Name Type Description Default
non_groupers str | list[str]

Index levels to exclude from the grouping

required
observed bool

Whether to only return observed combinations or not

True

Returns:

Type Description
SeriesGroupBy[Any, Any]

The pd.Series, grouped by all index levels except non_groupers.
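
In plain pandas, the same idea amounts to computing the complement of the excluded levels and grouping by what remains. The following is a sketch under that assumption, with made-up data; it does not call the accessor itself.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["scen_a", "scen_b"], ["co2", "ch4"], [1, 2]],
    names=["scenario", "variable", "run"],
)
series = pd.Series(range(8), index=index, dtype=float)

non_groupers = ["run"]
# Group by every index level that is not explicitly excluded.
group_levels = [name for name in series.index.names if name not in non_groupers]
totals = series.groupby(level=group_levels, observed=True).sum()
```

Here `totals` is indexed by `scenario` and `variable`, with the `run` level summed away.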

Source code in src/pandas_openscm/accessors/series.py
def groupby_except(
    self, non_groupers: str | list[str], observed: bool = True
) -> pandas.core.groupby.generic.SeriesGroupBy[Any, Any]:
    """
    Group by all index levels except specified levels

    This is the inverse of [pd.Series.groupby][pandas.Series.groupby]:
    you name the levels to leave out rather than the levels to group by.

    Parameters
    ----------
    non_groupers
        Index levels to exclude from the grouping

    observed
        Whether to only return observed combinations or not

    Returns
    -------
    :
        The [pd.Series][pandas.Series],
        grouped by all index levels except `non_groupers`.
    """
    return groupby_except(
        self._series, non_groupers=non_groupers, observed=observed
    )

mi_loc #

mi_loc(locator: Index[Any] | MultiIndex | Selector) -> S

Select data, being slightly smarter than the default pandas.Series.loc.

Parameters:

Name Type Description Default
locator Index[Any] | MultiIndex | Selector

Locator to apply

If this is a multi-index, we use multi_index_lookup to ensure correct alignment.

If this is an index that has a name, we use the name to ensure correct alignment.

required

Returns:

Type Description
S

Selected data

Notes

If you have pandas_indexing installed, you can get the same (perhaps even better) functionality using something like the following instead:

...
pandas_obj.loc[pandas_indexing.isin(locator)]
...
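
For readers without either package to hand, the alignment problem can be sketched in plain pandas. The locator below covers only a subset of the data's levels, and in a different order, so the data's index is reduced and reordered by level name before matching. This is an illustration of the idea with made-up data, not the library's `multi_index_lookup`.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["scen_a", "scen_b"], ["co2", "ch4"], [1, 2]],
    names=["scenario", "variable", "run"],
)
series = pd.Series(range(8), index=index, dtype=float)

# The locator uses a subset of the levels, in a different order.
locator = pd.MultiIndex.from_tuples(
    [("co2", "scen_a"), ("ch4", "scen_b")], names=["variable", "scenario"]
)

# Drop the levels the locator does not know about, then align by level name.
extra = [name for name in series.index.names if name not in locator.names]
reduced = series.index.droplevel(extra).reorder_levels(locator.names)
selected = series[reduced.isin(locator)]
```

All runs for the two requested combinations are kept, and the selected rows retain their full three-level index.
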
Source code in src/pandas_openscm/accessors/series.py
def mi_loc(
    self,
    locator: pd.Index[Any] | pd.MultiIndex | pix.selectors.Selector,
) -> S:
    """
    Select data, being slightly smarter than the default [pandas.Series.loc][].

    Parameters
    ----------
    locator
        Locator to apply

        If this is a multi-index, we use
        [multi_index_lookup][pandas_openscm.indexing.]
        to ensure correct alignment.

        If this is an index that has a name,
        we use the name to ensure correct alignment.

    Returns
    -------
    :
        Selected data

    Notes
    -----
    If you have [pandas_indexing][] installed,
    you can get the same (perhaps even better) functionality
    using something like the following instead

    ```python
    ...
    pandas_obj.loc[pandas_indexing.isin(locator)]
    ...
    ```
    """
    res = mi_loc(self._series, locator)

    # Ignore return type
    # because I've done something wrong with how I've set this up.
    # Figuring this out is a job for another day
    return res  # type: ignore

set_index_levels #

set_index_levels(
    levels_to_set: dict[str, Any | Collection[Any]],
    copy: bool = True,
) -> S

Set the index levels

Parameters:

Name Type Description Default
levels_to_set dict[str, Any | Collection[Any]]

Mapping of level names to values to set

required
copy bool

Should the pd.Series be copied before returning?

True

Returns:

Type Description
S

pd.Series with updates applied to its index
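The sketch below shows, in plain pandas, what setting index levels from a mapping of level names to values can look like: a single value is broadcast to every row and a collection supplies one value per row. It is not the library's implementation. The level names and data are hypothetical, and where a new level ends up in the index is a property of this sketch only.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("sa", 1), ("sa", 2), ("sb", 1)], names=["scenario", "run"]
)
series = pd.Series([1.0, 2.0, 3.0], index=idx)

levels_to_set = {
    "model_id": "m1",       # single value: applied to every row, creates a new level
    "run": [10, 11, 12],    # collection: one value per row, replaces an existing level
}

# Work on the index as a table so each level is a column
frame = series.index.to_frame(index=False)
for level, value in levels_to_set.items():
    frame[level] = value

# Copy first so the original series keeps its index
res = series.copy()
res.index = pd.MultiIndex.from_frame(frame)
```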

Source code in src/pandas_openscm/accessors/series.py
def set_index_levels(
    self,
    levels_to_set: dict[str, Any | Collection[Any]],
    copy: bool = True,
) -> S:
    """
    Set the index levels

    Parameters
    ----------
    levels_to_set
        Mapping of level names to values to set

    copy
        Should the [pd.Series][pandas.Series] be copied before returning?

    Returns
    -------
    :
        [pd.Series][pandas.Series] with updates applied to its index
    """
    res = set_index_levels_func(
        self._series,
        levels_to_set=levels_to_set,
        copy=copy,
    )

    # Ignore return type
    # because I've done something wrong with how I've set this up.
    # Figuring this out is a job for another day
    return res  # type: ignore

to_category_index #

to_category_index() -> S

Convert the index's values to categories

This can save a lot of memory and improve the speed of processing. However, it comes with some pitfalls. For a nice discussion of some of them, see this article.

Returns:

Type Description
S

pd.Series with all index levels converted to category type.
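For orientation, here is a minimal sketch in plain pandas of converting every index level to a categorical type. It is one way to get this result, not necessarily how the library does it, and the data and level names are made up.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("sa", "va"), ("sa", "vb"), ("sb", "va")], names=["scenario", "variable"]
)
series = pd.Series([1.0, 2.0, 3.0], index=idx)

# Treat the index as a table, cast every column to category, then rebuild the index
category_frame = series.index.to_frame(index=False).astype("category")
res = series.copy()
res.index = pd.MultiIndex.from_frame(category_frame)

# Each level now reports a categorical dtype
level_dtypes = [level.dtype for level in res.index.levels]
```

One pitfall to keep in mind is that operations which combine objects with different categories may silently fall back to object dtype, which loses the memory benefit.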

Source code in src/pandas_openscm/accessors/series.py
def to_category_index(self) -> S:
    """
    Convert the index's values to categories

    This can save a lot of memory and improve the speed of processing.
    However, it comes with some pitfalls.
    For a nice discussion of some of them,
    see [this article](https://towardsdatascience.com/staying-sane-while-adopting-pandas-categorical-datatypes-78dbd19dcd8a/).

    Returns
    -------
    :
        [pd.Series][pandas.Series] with all index levels
        converted to category type.
    """
    res = convert_index_to_category_index(self._series)

    # Ignore return type
    # because I've done something wrong with how I've set this up.
    # Figuring this out is a job for another day
    return res  # type: ignore

update_index_levels #

update_index_levels(
    updates: dict[Any, Callable[[Any], Any]],
    copy: bool = True,
    remove_unused_levels: bool = True,
) -> S

Update the index levels

Parameters:

Name Type Description Default
updates dict[Any, Callable[[Any], Any]]

Updates to apply to the index levels

Each key is the index level to which the updates will be applied. Each value is a function which updates the levels to their new values.

required
copy bool

Should the pd.Series be copied before returning?

True
remove_unused_levels bool

Remove unused levels before applying the update

Specifically, call pd.MultiIndex.remove_unused_levels.

This avoids trying to update levels that aren't being used.

True

Returns:

Type Description
S

pd.Series with updates applied to its index
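The role of `remove_unused_levels` is easiest to see with a small example. The sketch below uses plain pandas only and is not the library's implementation. The data, the level names, and the helper `strip_prefix` are hypothetical.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("sa", "Emissions|CO2"), ("sa", "Emissions|CH4"), ("sb", "Temperature")],
    names=["scenario", "variable"],
)
series = pd.Series([1.0, 2.0, 3.0], index=idx)

# After slicing, pandas keeps the full set of levels, including unused ones
subset = series.iloc[:2]
variable_pos = subset.index.names.index("variable")
stale_levels = list(subset.index.levels[variable_pos])  # still includes "Temperature"

# Drop the unused levels so the update function only sees values that are in use
cleaned = subset.index.remove_unused_levels()


def strip_prefix(variable: str) -> str:
    """Hypothetical update function: drop the leading 'Emissions|'."""
    return variable.replace("Emissions|", "")


# Apply the function to the unique level values rather than to every row
new_levels = cleaned.levels[variable_pos].map(strip_prefix)
res = subset.copy()
res.index = cleaned.set_levels(new_levels, level="variable")
```

Because the function is applied to the unique level values rather than to every row, it runs once per distinct value, which is cheap for long series with few distinct labels.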

Source code in src/pandas_openscm/accessors/series.py
def update_index_levels(
    self,
    updates: dict[Any, Callable[[Any], Any]],
    copy: bool = True,
    remove_unused_levels: bool = True,
) -> S:
    """
    Update the index levels

    Parameters
    ----------
    updates
        Updates to apply to the index levels

        Each key is the index level to which the updates will be applied.
        Each value is a function which updates the levels to their new values.

    copy
        Should the [pd.Series][pandas.Series] be copied before returning?

    remove_unused_levels
        Remove unused levels before applying the update

        Specifically, call
        [pd.MultiIndex.remove_unused_levels][pandas.MultiIndex.remove_unused_levels].

        This avoids trying to update levels that aren't being used.

    Returns
    -------
    :
        [pd.Series][pandas.Series] with updates applied to its index
    """
    res = update_index_levels_func(
        self._series,
        updates=updates,
        copy=copy,
        remove_unused_levels=remove_unused_levels,
    )

    # Ignore return type
    # because I've done something wrong with how I've set this up.
    # Figuring this out is a job for another day
    return res  # type: ignore

update_index_levels_from_other #

update_index_levels_from_other(
    update_sources: dict[
        Any,
        tuple[
            Any,
            Callable[[Any], Any]
            | dict[Any, Any]
            | Series[Any],
        ],
    ],
    copy: bool = True,
    remove_unused_levels: bool = True,
) -> S

Update the index levels based on other index levels

Parameters:

Name Type Description Default
update_sources dict[Any, tuple[Any, Callable[[Any], Any] | dict[Any, Any] | Series[Any]]]

Updates to apply to the index levels

Each key is the level to which the updates will be applied (or the level that will be created if it doesn't already exist).

Each value is a tuple of which the first element is the level to use to generate the values (the 'source level') and the second is a mapper of the form used by pd.Index.map which will be applied to the source level to update/create the level of interest.

required
copy bool

Should the pd.Series be copied before returning?

True
remove_unused_levels bool

Remove unused levels before applying the update

Specifically, call pd.MultiIndex.remove_unused_levels.

This avoids trying to update levels that aren't being used.

True

Returns:

Type Description
S

pd.Series with updates applied to its index
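The structure of `update_sources` can be sketched in plain pandas as follows. This only illustrates the "target level from (source level, mapper)" idea with `pd.Index.map`; it is not the library's implementation, and the level names, units, and data are made up.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("sa", "Emissions|CO2"), ("sa", "Emissions|CH4"), ("sb", "Emissions|CO2")],
    names=["scenario", "variable"],
)
series = pd.Series([1.0, 2.0, 3.0], index=idx)

# target level -> (source level, mapper accepted by pd.Index.map)
update_sources = {
    # dict mapper: look up each source value
    "unit": ("variable", {"Emissions|CO2": "Mt CO2/yr", "Emissions|CH4": "Mt CH4/yr"}),
    # callable mapper: compute the new value from each source value
    "gas": ("variable", lambda variable: variable.split("|")[-1]),
}

frame = series.index.to_frame(index=False)
for target, (source, mapper) in update_sources.items():
    # Map the source level's values and store them as the target level
    frame[target] = series.index.get_level_values(source).map(mapper).to_numpy()

res = series.copy()
res.index = pd.MultiIndex.from_frame(frame)
```

Neither target level exists in the original index here, so both are created. A target that names an existing level would be overwritten instead.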

Source code in src/pandas_openscm/accessors/series.py
def update_index_levels_from_other(
    self,
    update_sources: dict[
        Any, tuple[Any, Callable[[Any], Any] | dict[Any, Any] | pd.Series[Any]]
    ],
    copy: bool = True,
    remove_unused_levels: bool = True,
) -> S:
    """
    Update the index levels based on other index levels

    Parameters
    ----------
    update_sources
        Updates to apply to the index levels

        Each key is the level to which the updates will be applied
        (or the level that will be created if it doesn't already exist).

        Each value is a tuple of which the first element
        is the level to use to generate the values (the 'source level')
        and the second is a mapper of the form used by
        [pd.Index.map][pandas.Index.map]
        which will be applied to the source level
        to update/create the level of interest.

    copy
        Should the [pd.Series][pandas.Series] be copied before returning?

    remove_unused_levels
        Remove unused levels before applying the update

        Specifically, call
        [pd.MultiIndex.remove_unused_levels][pandas.MultiIndex.remove_unused_levels].

        This avoids trying to update levels that aren't being used.

    Returns
    -------
    :
        [pd.Series][pandas.Series] with updates applied to its index
    """
    res = update_index_levels_from_other_func(
        self._series,
        update_sources=update_sources,
        copy=copy,
        remove_unused_levels=remove_unused_levels,
    )

    # Ignore return type
    # because I've done something wrong with how I've set this up.
    # Figuring this out is a job for another day
    return res  # type: ignore