Skip to content

pandas_openscm.io#

Serialisation/deserialisation (i.e. input/output) support

Functions:

Name Description
load_timeseries_csv

Load a CSV holding timeseries

load_timeseries_csv #

load_timeseries_csv(
    fp: Path,
    lower_column_names: bool = True,
    index_columns: list[str] | None = None,
    out_columns_type: type | None = None,
    out_columns_name: str | None = None,
) -> DataFrame

Load a CSV holding timeseries

In other words, a CSV that has metadata columns and then some time columns.

Parameters:

Name Type Description Default
fp Path

File path to load

required
lower_column_names bool

Convert the column names to all lower case as part of loading.

Note, if lower_col_names is True, the column names are converted to lower case before the index is set so a) you should only use lower case in index_columns and b) the lowering will affect values that do not end up in the index too.

True
index_columns list[str] | None

Columns to treat as metadata from the loaded CSV.

At the moment, if this is not provided, a NotImplementedError is raised. In future, if not provided, we will try and infer the columns based on whether they look like time columns or not.

None
out_columns_type type | None

The type to apply to the output columns that are not part of the index.

If not supplied, the raw type returned by pandas is returned.

None
out_columns_name str | None

The name for the columns in the output.

If not supplied, the raw name returned by pandas is returned.

This can also be set with pd.DataFrame.rename_axis but we provide it here for convenience (and in case you couldn't find this trick for ages, like us).

None

Returns:

Type Description
DataFrame

Loaded data

Source code in src/pandas_openscm/io.py
def load_timeseries_csv(
    fp: Path,
    lower_column_names: bool = True,
    index_columns: list[str] | None = None,
    out_columns_type: type | None = None,
    out_columns_name: str | None = None,
) -> pd.DataFrame:
    """
    Load a CSV holding timeseries

    In other words, a CSV that has metadata columns
    and then some time columns.

    Parameters
    ----------
    fp
        File path to load

    lower_column_names
        Convert the column names to all lower case as part of loading.

        Note, if `lower_col_names` is `True`,
        the column names are converted to lower case
        before the index is set so
        a) you should only use lower case in `index_columns`
        and b) the lowering will affect values that do not end up in the index too.

    index_columns
        Columns to treat as metadata from the loaded CSV.

        At the moment, if this is not provided, a `NotImplementedError` is raised.
        In future, if not provided, we will try and infer the columns
        based on whether they look like time columns or not.

    out_columns_type
        The type to apply to the output columns that are not part of the index.

        If not supplied, the raw type returned by pandas is returned.

    out_columns_name
        The name for the columns in the output.

        If not supplied, the raw name returned by pandas is returned.

        This can also be set with
        [pd.DataFrame.rename_axis][pandas.DataFrame.rename_axis]
        but we provide it here for convenience
        (and in case you couldn't find this trick for ages, like us).

    Returns
    -------
    :
        Loaded data
    """
    out = pd.read_csv(fp)

    if lower_column_names:
        out.columns = out.columns.str.lower()

    if index_columns is None:
        raise NotImplementedError(index_columns)

    out = out.set_index(index_columns)

    if out_columns_type is not None:
        out.columns = out.columns.astype(out_columns_type)

    if out_columns_name is not None:
        out = out.rename_axis(out_columns_name, axis="columns")

    return out