Unit conversion

Here we detail pandas-openscm's unit conversion support.

Imports

In [1]:

import traceback

import numpy as np
import openscm_units
import pandas as pd
import pandas_indexing as pix
import pint

from pandas_openscm import register_pandas_accessors
from pandas_openscm.testing import create_test_df
from pandas_openscm.unit_conversion import (
    AmbiguousTargetUnitError,
    convert_unit,
    convert_unit_from_target_series,
    convert_unit_like,
)

Setup

In [2]:

# Register the openscm accessor for pandas objects
# (we don't do this on import
# as we have had bad experiences with implicit behaviour like that)
register_pandas_accessors()

Basics

Convert all data to a given unit

Imagine we start with some data.

In [3]:

df_basic = create_test_df(
    variables=(("Warming", "K"),),
    n_scenarios=2,
    n_runs=2,
    timepoints=np.arange(1950.0, 1965.0),
)
df_basic

Out[3]:

				1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	run	unit
scenario_0	Warming	0	K	0.833020	1.596905	3.094557	3.992015	4.373712	6.107554	7.197416	7.505911	8.627713	9.671078	10.841117	12.781138	13.251307	14.248473	15.323850
scenario_0	Warming	1	K	0.235816	2.147355	4.778399	5.724058	7.638744	9.625288	12.165216	13.419185	15.786075	17.392358	19.923693	21.866557	23.407121	25.251397	27.026891
scenario_1	Warming	0	K	0.788474	2.848990	5.562105	8.218011	11.110182	14.104328	16.778587	19.593635	22.718554	25.553278	27.384330	31.046353	33.810148	35.909439	38.591891
scenario_1	Warming	1	K	0.366913	3.752108	7.531107	11.588520	15.099883	18.601441	22.251733	25.213163	28.887288	32.881978	35.717950	39.586653	43.210651	46.717081	50.754983

If we want to convert the entire dataset to a different unit, we can simply call convert_unit with the desired unit.

In [4]:

df_basic.openscm.convert_unit("degC")

Out[4]:

				1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	run	unit
scenario_0	Warming	0	degC	-272.316980	-271.553095	-270.055443	-269.157985	-268.776288	-267.042446	-265.952584	-265.644089	-264.522287	-263.478922	-262.308883	-260.368862	-259.898693	-258.901527	-257.826150
scenario_0	Warming	1	degC	-272.914184	-271.002645	-268.371601	-267.425942	-265.511256	-263.524712	-260.984784	-259.730815	-257.363925	-255.757642	-253.226307	-251.283443	-249.742879	-247.898603	-246.123109
scenario_1	Warming	0	degC	-272.361526	-270.301010	-267.587895	-264.931989	-262.039818	-259.045672	-256.371413	-253.556365	-250.431446	-247.596722	-245.765670	-242.103647	-239.339852	-237.240561	-234.558109
scenario_1	Warming	1	degC	-272.783087	-269.397892	-265.618893	-261.561480	-258.050117	-254.548559	-250.898267	-247.936837	-244.262712	-240.268022	-237.432050	-233.563347	-229.939349	-226.432919	-222.395017
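
As a sanity check on the output above, the K to degC conversion is a pure offset of 273.15. The snippet below is a minimal sketch in plain Python (no pint involved) that reproduces the first converted value. The helper name kelvin_to_celsius is ours for illustration and is not part of pandas-openscm.

```python
# Plain-Python check of the K -> degC conversion shown above.
# `kelvin_to_celsius` is a hypothetical helper, for illustration only.
KELVIN_OFFSET = 273.15


def kelvin_to_celsius(value_in_kelvin: float) -> float:
    """Convert an absolute temperature from kelvin to degrees Celsius."""
    return value_in_kelvin - KELVIN_OFFSET


# First value of scenario_0, run 0 in df_basic (as printed above)
converted = kelvin_to_celsius(0.833020)
print(f"{converted:.6f}")
```

Because degC is an offset unit, this conversion treats the values as absolute temperatures. pint also defines delta_degC for temperature differences; whether that is the more appropriate target for a warming timeseries depends on what your data represents.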

The functional equivalent is the convert_unit function imported above from pandas_openscm.unit_conversion. It does the same thing.

In [5]:

pd.testing.assert_frame_equal(
    df_basic.openscm.convert_unit("degC"), convert_unit(df_basic, "degC")
)

By default, this assumes that the unit information is in an index level called 'unit'. If this isn't the case, the unit_level argument should be used.

In [6]:

df_other_unit_col = create_test_df(
    variables=(("Warming", "K"),),
    n_scenarios=2,
    n_runs=2,
    timepoints=np.arange(1950.0, 1965.0),
).rename_axis(["scenario", "variable", "run", "units"])

# Notice that unit information is in an index level called "units", not "unit"
df_other_unit_col

Out[6]:

				1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	run	units
scenario_0	Warming	0	K	0.121723	1.807267	3.014577	3.330916	5.023547	5.463518	7.016332	8.206352	8.680376	10.110922	10.981748	12.577318	13.704795	14.635089	15.251599
scenario_0	Warming	1	K	0.039855	2.126347	4.565873	6.025437	8.254715	9.543748	11.748486	14.116716	15.653353	17.514174	19.675970	21.061125	23.149377	25.513571	27.152355
scenario_1	Warming	0	K	0.578164	2.738325	6.241563	8.797144	11.444365	14.053786	17.203503	19.709319	22.794930	24.830284	28.175205	30.283175	33.703585	36.574880	38.443465
scenario_1	Warming	1	K	0.313213	4.335544	7.499656	10.951590	15.057613	18.850291	21.769198	25.289064	28.607206	32.271472	36.216251	39.913319	43.058160	47.217991	50.610226

In [7]:

df_other_unit_col.openscm.convert_unit("kK", unit_level="units")

Out[7]:

				1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	run	units
scenario_0	Warming	0	kK	0.000122	0.001807	0.003015	0.003331	0.005024	0.005464	0.007016	0.008206	0.008680	0.010111	0.010982	0.012577	0.013705	0.014635	0.015252
scenario_0	Warming	1	kK	0.000040	0.002126	0.004566	0.006025	0.008255	0.009544	0.011748	0.014117	0.015653	0.017514	0.019676	0.021061	0.023149	0.025514	0.027152
scenario_1	Warming	0	kK	0.000578	0.002738	0.006242	0.008797	0.011444	0.014054	0.017204	0.019709	0.022795	0.024830	0.028175	0.030283	0.033704	0.036575	0.038443
scenario_1	Warming	1	kK	0.000313	0.004336	0.007500	0.010952	0.015058	0.018850	0.021769	0.025289	0.028607	0.032271	0.036216	0.039913	0.043058	0.047218	0.050610

More specific conversions

Above we have shown how to convert the entire dataset to a given unit. This works well when such a conversion is possible. However, we often have cases where different timeseries have different dimensionality and therefore cannot all be converted to the same unit.

Once again, start with some example data. Here we have data with different dimensionality.

In [8]:

df_multi_unit = create_test_df(
    variables=(
        ("Warming", "K"),
        ("Ocean Heat Content", "ZJ"),
        ("SLR", "mm"),
    ),
    n_scenarios=2,
    n_runs=2,
    timepoints=np.arange(1950.0, 1965.0),
)
df_multi_unit

Out[8]:

				1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	run	unit
scenario_0	Warming	0	K	0.591109	1.562867	2.679513	4.030553	4.825483	5.737766	6.954924	8.372727	8.623694	10.261342	11.325057	11.890421	13.820463	14.303354	15.043330
	Warming	1	K	0.336865	1.370294	3.160775	4.675830	5.966644	7.076796	8.240343	9.493297	11.120161	12.500829	13.776882	14.359810	16.256197	17.362932	18.647938
	Ocean Heat Content	0	ZJ	0.650901	1.718926	3.599942	5.545718	6.139408	8.141200	9.664617	11.646145	13.155223	14.116252	15.429051	16.991799	18.336360	20.597745	21.642114
	Ocean Heat Content	1	ZJ	0.865208	2.043763	3.774768	5.922776	7.079142	9.545313	10.882294	12.897646	14.617578	16.108130	17.700154	19.860012	21.511893	23.670213	24.590182
	SLR	0	mm	0.705512	2.477574	4.259611	6.136331	8.920068	10.794516	12.424736	14.444088	16.085314	18.178475	20.133402	21.992966	23.936006	26.231796	28.589251
	SLR	1	mm	0.270463	2.377309	5.036762	6.774614	8.882436	11.289773	13.499017	16.123079	18.389886	19.968141	22.614139	24.411224	27.092495	29.658975	30.989785
scenario_1	Warming	0	K	0.259090	2.885206	5.209213	8.237810	10.385821	12.465355	15.335473	17.286317	19.549002	22.367454	24.438071	27.245716	30.142752	32.459841	34.266800
	Warming	1	K	0.965964	2.882328	5.559345	8.926928	10.948453	13.619826	16.690716	18.757951	22.213460	24.682051	27.069908	29.696053	32.128800	34.954508	37.892366
	Ocean Heat Content	0	ZJ	0.263489	2.942555	6.115062	9.179288	12.320463	14.783798	17.607771	20.525367	23.388455	26.534521	29.332296	32.016790	35.656566	37.657090	41.448901
	Ocean Heat Content	1	ZJ	0.892679	3.237138	6.472755	9.945021	13.166714	15.949202	19.266764	22.750679	25.422125	28.239576	32.140174	34.588350	37.834550	40.553825	43.798472
	SLR	0	mm	0.076160	3.590359	7.339651	10.334739	14.203034	17.025183	20.985771	24.158424	27.308888	31.025325	33.503596	37.171544	40.857656	43.486541	47.465708
	SLR	1	mm	0.099283	4.103452	7.949211	11.086579	15.209196	17.984787	22.160941	25.300687	29.478183	32.345733	36.161467	40.057720	43.015301	46.650004	50.908462

If we try to convert everything to a single unit, we will get a dimensionality error for whatever units aren't compatible.

In [9]:

try:
    df_multi_unit.openscm.convert_unit("cm")
except pint.DimensionalityError:
    traceback.print_exc(limit=0)

pint.errors.DimensionalityError: Cannot convert from 'kelvin' ([temperature]) to 'centimeter' ([length])

To support unit conversion in such a case, we can do one of three things:

1. filter the data first before converting the unit
   - this obviously defeats the purpose of this API a bit, as you would end up filtering, converting and recombining everywhere. As a result, we ignore this option from here.
2. specify the conversion as a mapping from the current unit to the desired unit
3. specify the conversion as a pd.Series of the units we would like to end up with

Specifying the unit as a mapping

One option is to specify the conversion as a mapping from the current unit to the desired unit. This option is useful if all you need to specify the desired unit is the current unit. Any current units which don't appear in the mapping are simply left alone i.e. the data for these rows is simply returned as is. The API is quite straightforward and demonstrated below.

In [10]:

# Note also that no conversion is done for temperature (units of K)
# as "K" does not appear in the mapping
df_multi_unit.openscm.convert_unit({"mm": "cm", "ZJ": "PJ"})

Out[10]:

				1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	run	unit
scenario_0	Warming	0	K	0.591109	1.562867e+00	2.679513e+00	4.030553e+00	4.825483e+00	5.737766e+00	6.954924e+00	8.372727e+00	8.623694e+00	1.026134e+01	1.132506e+01	1.189042e+01	1.382046e+01	1.430335e+01	1.504333e+01
	Warming	1	K	0.336865	1.370294e+00	3.160775e+00	4.675830e+00	5.966644e+00	7.076796e+00	8.240343e+00	9.493297e+00	1.112016e+01	1.250083e+01	1.377688e+01	1.435981e+01	1.625620e+01	1.736293e+01	1.864794e+01
	Ocean Heat Content	0	PJ	650901.313413	1.718926e+06	3.599942e+06	5.545718e+06	6.139408e+06	8.141200e+06	9.664617e+06	1.164614e+07	1.315522e+07	1.411625e+07	1.542905e+07	1.699180e+07	1.833636e+07	2.059774e+07	2.164211e+07
	Ocean Heat Content	1	PJ	865208.241574	2.043763e+06	3.774768e+06	5.922776e+06	7.079142e+06	9.545313e+06	1.088229e+07	1.289765e+07	1.461758e+07	1.610813e+07	1.770015e+07	1.986001e+07	2.151189e+07	2.367021e+07	2.459018e+07
	SLR	0	cm	0.070551	2.477574e-01	4.259611e-01	6.136331e-01	8.920068e-01	1.079452e+00	1.242474e+00	1.444409e+00	1.608531e+00	1.817847e+00	2.013340e+00	2.199297e+00	2.393601e+00	2.623180e+00	2.858925e+00
	SLR	1	cm	0.027046	2.377309e-01	5.036762e-01	6.774614e-01	8.882436e-01	1.128977e+00	1.349902e+00	1.612308e+00	1.838989e+00	1.996814e+00	2.261414e+00	2.441122e+00	2.709250e+00	2.965897e+00	3.098979e+00
scenario_1	Warming	0	K	0.259090	2.885206e+00	5.209213e+00	8.237810e+00	1.038582e+01	1.246536e+01	1.533547e+01	1.728632e+01	1.954900e+01	2.236745e+01	2.443807e+01	2.724572e+01	3.014275e+01	3.245984e+01	3.426680e+01
	Warming	1	K	0.965964	2.882328e+00	5.559345e+00	8.926928e+00	1.094845e+01	1.361983e+01	1.669072e+01	1.875795e+01	2.221346e+01	2.468205e+01	2.706991e+01	2.969605e+01	3.212880e+01	3.495451e+01	3.789237e+01
	Ocean Heat Content	0	PJ	263488.878579	2.942555e+06	6.115062e+06	9.179288e+06	1.232046e+07	1.478380e+07	1.760777e+07	2.052537e+07	2.338845e+07	2.653452e+07	2.933230e+07	3.201679e+07	3.565657e+07	3.765709e+07	4.144890e+07
	Ocean Heat Content	1	PJ	892678.664963	3.237138e+06	6.472755e+06	9.945021e+06	1.316671e+07	1.594920e+07	1.926676e+07	2.275068e+07	2.542213e+07	2.823958e+07	3.214017e+07	3.458835e+07	3.783455e+07	4.055382e+07	4.379847e+07
	SLR	0	cm	0.007616	3.590359e-01	7.339651e-01	1.033474e+00	1.420303e+00	1.702518e+00	2.098577e+00	2.415842e+00	2.730889e+00	3.102532e+00	3.350360e+00	3.717154e+00	4.085766e+00	4.348654e+00	4.746571e+00
	SLR	1	cm	0.009928	4.103452e-01	7.949211e-01	1.108658e+00	1.520920e+00	1.798479e+00	2.216094e+00	2.530069e+00	2.947818e+00	3.234573e+00	3.616147e+00	4.005772e+00	4.301530e+00	4.665000e+00	5.090846e+00

The main thing to be careful of here is a typo in a current unit (i.e. a mapping key). A mistyped key matches nothing, so the affected rows are silently left unconverted. This can cause confusion in later code that assumes the conversion has been done.

In [11]:

# There is a typo, "zJ" is given below rather than "ZJ"
# so the ocean heat content data is not converted to PJ.
# This happens silently i.e. no warning or error.
df_multi_unit.openscm.convert_unit({"mm": "cm", "zJ": "PJ"})

Out[11]:

				1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	run	unit
scenario_0	Warming	0	K	0.591109	1.562867	2.679513	4.030553	4.825483	5.737766	6.954924	8.372727	8.623694	10.261342	11.325057	11.890421	13.820463	14.303354	15.043330
	Warming	1	K	0.336865	1.370294	3.160775	4.675830	5.966644	7.076796	8.240343	9.493297	11.120161	12.500829	13.776882	14.359810	16.256197	17.362932	18.647938
	Ocean Heat Content	0	ZJ	0.650901	1.718926	3.599942	5.545718	6.139408	8.141200	9.664617	11.646145	13.155223	14.116252	15.429051	16.991799	18.336360	20.597745	21.642114
	Ocean Heat Content	1	ZJ	0.865208	2.043763	3.774768	5.922776	7.079142	9.545313	10.882294	12.897646	14.617578	16.108130	17.700154	19.860012	21.511893	23.670213	24.590182
	SLR	0	cm	0.070551	0.247757	0.425961	0.613633	0.892007	1.079452	1.242474	1.444409	1.608531	1.817847	2.013340	2.199297	2.393601	2.623180	2.858925
	SLR	1	cm	0.027046	0.237731	0.503676	0.677461	0.888244	1.128977	1.349902	1.612308	1.838989	1.996814	2.261414	2.441122	2.709250	2.965897	3.098979
scenario_1	Warming	0	K	0.259090	2.885206	5.209213	8.237810	10.385821	12.465355	15.335473	17.286317	19.549002	22.367454	24.438071	27.245716	30.142752	32.459841	34.266800
	Warming	1	K	0.965964	2.882328	5.559345	8.926928	10.948453	13.619826	16.690716	18.757951	22.213460	24.682051	27.069908	29.696053	32.128800	34.954508	37.892366
	Ocean Heat Content	0	ZJ	0.263489	2.942555	6.115062	9.179288	12.320463	14.783798	17.607771	20.525367	23.388455	26.534521	29.332296	32.016790	35.656566	37.657090	41.448901
	Ocean Heat Content	1	ZJ	0.892679	3.237138	6.472755	9.945021	13.166714	15.949202	19.266764	22.750679	25.422125	28.239576	32.140174	34.588350	37.834550	40.553825	43.798472
	SLR	0	cm	0.007616	0.359036	0.733965	1.033474	1.420303	1.702518	2.098577	2.415842	2.730889	3.102532	3.350360	3.717154	4.085766	4.348654	4.746571
	SLR	1	cm	0.009928	0.410345	0.794921	1.108658	1.520920	1.798479	2.216094	2.530069	2.947818	3.234573	3.616147	4.005772	4.301530	4.665000	5.090846
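
One way to guard against such typos is to compare the mapping's keys with the units actually present in the data before converting. The snippet below is a minimal sketch using only pandas; the helper find_unused_mapping_keys and the small stand-in DataFrame are hypothetical and not part of pandas-openscm.

```python
import pandas as pd

# Small stand-in for df_multi_unit: only the index matters for this check
index = pd.MultiIndex.from_tuples(
    [
        ("scenario_0", "Warming", 0, "K"),
        ("scenario_0", "Ocean Heat Content", 0, "ZJ"),
        ("scenario_0", "SLR", 0, "mm"),
    ],
    names=["scenario", "variable", "run", "unit"],
)
df = pd.DataFrame(
    [[0.5, 1.5], [0.6, 1.7], [0.7, 2.4]],
    index=index,
    columns=[1950.0, 1951.0],
)


def find_unused_mapping_keys(df, mapping, unit_level="unit"):
    """Return the mapping keys that match no unit in the data's index."""
    units_in_data = set(df.index.get_level_values(unit_level))
    return sorted(set(mapping) - units_in_data)


# "zJ" is the typo from the example above, so it is reported as unused
unused = find_unused_mapping_keys(df, {"mm": "cm", "zJ": "PJ"})
print(unused)  # ['zJ']
```

If the returned list is not empty, you can raise an error or emit a warning before calling convert_unit, depending on how strict you want to be.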

Specifying the unit as a series

If you want really fine-grained, i.e. timeseries-level, control then you can also specify the desired units as a pd.Series. The pd.Series should specify the desired unit for each timeseries and have an index which matches the data's index, except for the unit-information level (those are the pd.Series's values, not part of the index).

This is the hardest option to set up, but gives you the most control in exchange. As with the mapping option, any timeseries for which no desired unit is specified is simply returned as it is.

In [12]:

# There are lots of ways to make a series like this.
# Here we go with a hand-woven, but simple option.
# For your own work, you may want/need something
# that includes much more programming and logic.
desired_unit = pd.Series(
    ["mK", "PJ", "cm", "cm"],
    index=pd.MultiIndex.from_tuples(
        [
            ("scenario_0", "Warming", 0),
            ("scenario_0", "Ocean Heat Content", 1),
            ("scenario_0", "SLR", 0),
            ("scenario_1", "SLR", 1),
        ],
        # Note: no unit level here
        names=["scenario", "variable", "run"],
    ),
)
desired_unit

Out[12]:

scenario    variable            run
scenario_0  Warming             0      mK
            Ocean Heat Content  1      PJ
            SLR                 0      cm
scenario_1  SLR                 1      cm
dtype: object

In [13]:

# Note that only the rows which appear in `desired_unit`
# are converted, all others are unchanged.
df_multi_unit.openscm.convert_unit(desired_unit)

Out[13]:

				1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	run	unit
scenario_0	Warming	0	mK	591.109485	1.562867e+03	2.679513e+03	4.030553e+03	4.825483e+03	5.737766e+03	6.954924e+03	8.372727e+03	8.623694e+03	1.026134e+04	1.132506e+04	1.189042e+04	1.382046e+04	1.430335e+04	1.504333e+04
	Warming	1	K	0.336865	1.370294e+00	3.160775e+00	4.675830e+00	5.966644e+00	7.076796e+00	8.240343e+00	9.493297e+00	1.112016e+01	1.250083e+01	1.377688e+01	1.435981e+01	1.625620e+01	1.736293e+01	1.864794e+01
	Ocean Heat Content	0	ZJ	0.650901	1.718926e+00	3.599942e+00	5.545718e+00	6.139408e+00	8.141200e+00	9.664617e+00	1.164614e+01	1.315522e+01	1.411625e+01	1.542905e+01	1.699180e+01	1.833636e+01	2.059774e+01	2.164211e+01
	Ocean Heat Content	1	PJ	865208.241574	2.043763e+06	3.774768e+06	5.922776e+06	7.079142e+06	9.545313e+06	1.088229e+07	1.289765e+07	1.461758e+07	1.610813e+07	1.770015e+07	1.986001e+07	2.151189e+07	2.367021e+07	2.459018e+07
	SLR	0	cm	0.070551	2.477574e-01	4.259611e-01	6.136331e-01	8.920068e-01	1.079452e+00	1.242474e+00	1.444409e+00	1.608531e+00	1.817847e+00	2.013340e+00	2.199297e+00	2.393601e+00	2.623180e+00	2.858925e+00
	SLR	1	mm	0.270463	2.377309e+00	5.036762e+00	6.774614e+00	8.882436e+00	1.128977e+01	1.349902e+01	1.612308e+01	1.838989e+01	1.996814e+01	2.261414e+01	2.441122e+01	2.709250e+01	2.965897e+01	3.098979e+01
scenario_1	Warming	0	K	0.259090	2.885206e+00	5.209213e+00	8.237810e+00	1.038582e+01	1.246536e+01	1.533547e+01	1.728632e+01	1.954900e+01	2.236745e+01	2.443807e+01	2.724572e+01	3.014275e+01	3.245984e+01	3.426680e+01
	Warming	1	K	0.965964	2.882328e+00	5.559345e+00	8.926928e+00	1.094845e+01	1.361983e+01	1.669072e+01	1.875795e+01	2.221346e+01	2.468205e+01	2.706991e+01	2.969605e+01	3.212880e+01	3.495451e+01	3.789237e+01
	Ocean Heat Content	0	ZJ	0.263489	2.942555e+00	6.115062e+00	9.179288e+00	1.232046e+01	1.478380e+01	1.760777e+01	2.052537e+01	2.338845e+01	2.653452e+01	2.933230e+01	3.201679e+01	3.565657e+01	3.765709e+01	4.144890e+01
	Ocean Heat Content	1	ZJ	0.892679	3.237138e+00	6.472755e+00	9.945021e+00	1.316671e+01	1.594920e+01	1.926676e+01	2.275068e+01	2.542213e+01	2.823958e+01	3.214017e+01	3.458835e+01	3.783455e+01	4.055382e+01	4.379847e+01
	SLR	0	mm	0.076160	3.590359e+00	7.339651e+00	1.033474e+01	1.420303e+01	1.702518e+01	2.098577e+01	2.415842e+01	2.730889e+01	3.102532e+01	3.350360e+01	3.717154e+01	4.085766e+01	4.348654e+01	4.746571e+01
	SLR	1	cm	0.009928	4.103452e-01	7.949211e-01	1.108658e+00	1.520920e+00	1.798479e+00	2.216094e+00	2.530069e+00	2.947818e+00	3.234573e+00	3.616147e+00	4.005772e+00	4.301530e+00	4.665000e+00	5.090846e+00
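
Rather than writing the target series by hand, you can derive it from the data's own index, which avoids mistyped index values. The snippet below is a minimal sketch using only pandas; the stand-in DataFrame and the unit_by_variable dictionary are assumptions for illustration.

```python
import pandas as pd

# Small stand-in for df_multi_unit
index = pd.MultiIndex.from_tuples(
    [
        ("scenario_0", "Warming", 0, "K"),
        ("scenario_0", "Ocean Heat Content", 0, "ZJ"),
        ("scenario_0", "SLR", 0, "mm"),
        ("scenario_1", "SLR", 0, "mm"),
    ],
    names=["scenario", "variable", "run", "unit"],
)
df = pd.DataFrame({1950.0: [0.5, 0.6, 0.7, 0.1]}, index=index)

# Desired unit per variable; variables not listed get no entry
unit_by_variable = {"Warming": "mK", "SLR": "cm"}

# The target series is indexed like the data, minus the unit level
target_index = df.index.droplevel("unit")
variables = target_index.get_level_values("variable")
desired_unit = pd.Series(
    [unit_by_variable.get(v) for v in variables],
    index=target_index,
).dropna()  # drop rows for which no target unit was given
print(desired_unit)
```

The resulting series has the same shape as the hand-written desired_unit above, so it could be passed to convert_unit in the same way. Because the index values come from the data itself, only the keys of unit_by_variable can be mistyped.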

As above, the main gotcha is that conversions can be skipped silently. This happens if you make typos in the specification. Given that the specification is more involved here, such typos can be much harder to spot.

In [14]:

desired_unit_typo = pd.Series(
    ["mK", "PJ", "cm", "cm"],
    index=pd.MultiIndex.from_tuples(
        [
            ("scenario_0", "Warming", 0),
            ("scenario_0", "Ocean Heat Content", 1),
            # Typo here
            ("scenario_0", "SLr", 0),
            ("scenario_1", "SLR", 1),
        ],
        # Note: no unit level here
        names=["scenario", "variable", "run"],
    ),
)
desired_unit_typo

Out[14]:

scenario    variable            run
scenario_0  Warming             0      mK
            Ocean Heat Content  1      PJ
            SLr                 0      cm
scenario_1  SLR                 1      cm
dtype: object

In [15]:

# Note that scenario_0, SLR, run 0 isn't converted because of the typo
df_multi_unit.openscm.convert_unit(desired_unit_typo)

Out[15]:

				1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	run	unit
scenario_0	Warming	0	mK	591.109485	1.562867e+03	2.679513e+03	4.030553e+03	4.825483e+03	5.737766e+03	6.954924e+03	8.372727e+03	8.623694e+03	1.026134e+04	1.132506e+04	1.189042e+04	1.382046e+04	1.430335e+04	1.504333e+04
	Warming	1	K	0.336865	1.370294e+00	3.160775e+00	4.675830e+00	5.966644e+00	7.076796e+00	8.240343e+00	9.493297e+00	1.112016e+01	1.250083e+01	1.377688e+01	1.435981e+01	1.625620e+01	1.736293e+01	1.864794e+01
	Ocean Heat Content	0	ZJ	0.650901	1.718926e+00	3.599942e+00	5.545718e+00	6.139408e+00	8.141200e+00	9.664617e+00	1.164614e+01	1.315522e+01	1.411625e+01	1.542905e+01	1.699180e+01	1.833636e+01	2.059774e+01	2.164211e+01
	Ocean Heat Content	1	PJ	865208.241574	2.043763e+06	3.774768e+06	5.922776e+06	7.079142e+06	9.545313e+06	1.088229e+07	1.289765e+07	1.461758e+07	1.610813e+07	1.770015e+07	1.986001e+07	2.151189e+07	2.367021e+07	2.459018e+07
	SLR	0	mm	0.705512	2.477574e+00	4.259611e+00	6.136331e+00	8.920068e+00	1.079452e+01	1.242474e+01	1.444409e+01	1.608531e+01	1.817847e+01	2.013340e+01	2.199297e+01	2.393601e+01	2.623180e+01	2.858925e+01
	SLR	1	mm	0.270463	2.377309e+00	5.036762e+00	6.774614e+00	8.882436e+00	1.128977e+01	1.349902e+01	1.612308e+01	1.838989e+01	1.996814e+01	2.261414e+01	2.441122e+01	2.709250e+01	2.965897e+01	3.098979e+01
scenario_1	Warming	0	K	0.259090	2.885206e+00	5.209213e+00	8.237810e+00	1.038582e+01	1.246536e+01	1.533547e+01	1.728632e+01	1.954900e+01	2.236745e+01	2.443807e+01	2.724572e+01	3.014275e+01	3.245984e+01	3.426680e+01
	Warming	1	K	0.965964	2.882328e+00	5.559345e+00	8.926928e+00	1.094845e+01	1.361983e+01	1.669072e+01	1.875795e+01	2.221346e+01	2.468205e+01	2.706991e+01	2.969605e+01	3.212880e+01	3.495451e+01	3.789237e+01
	Ocean Heat Content	0	ZJ	0.263489	2.942555e+00	6.115062e+00	9.179288e+00	1.232046e+01	1.478380e+01	1.760777e+01	2.052537e+01	2.338845e+01	2.653452e+01	2.933230e+01	3.201679e+01	3.565657e+01	3.765709e+01	4.144890e+01
	Ocean Heat Content	1	ZJ	0.892679	3.237138e+00	6.472755e+00	9.945021e+00	1.316671e+01	1.594920e+01	1.926676e+01	2.275068e+01	2.542213e+01	2.823958e+01	3.214017e+01	3.458835e+01	3.783455e+01	4.055382e+01	4.379847e+01
	SLR	0	mm	0.076160	3.590359e+00	7.339651e+00	1.033474e+01	1.420303e+01	1.702518e+01	2.098577e+01	2.415842e+01	2.730889e+01	3.102532e+01	3.350360e+01	3.717154e+01	4.085766e+01	4.348654e+01	4.746571e+01
	SLR	1	cm	0.009928	4.103452e-01	7.949211e-01	1.108658e+00	1.520920e+00	1.798479e+00	2.216094e+00	2.530069e+00	2.947818e+00	3.234573e+00	3.616147e+00	4.005772e+00	4.301530e+00	4.665000e+00	5.090846e+00

If you are trying to figure out why something isn't being converted, pandas provides some quite helpful APIs.

In [16]:

rows_that_wont_be_used = desired_unit_typo.index.difference(
    df_multi_unit.index.droplevel("unit")
)
rows_that_wont_be_used

Out[16]:

MultiIndex([('scenario_0', 'SLr', 0)],
           names=['scenario', 'variable', 'run'])

Unit registries and pint

The unit conversion is all done with the pint package by default. Pint is built around the idea of unit registries. The registry to use can be passed via the ur argument. If it is not specified, we use whatever is returned from pint.get_application_registry(). This is pint's way of setting the default registry for whatever you are doing. By default, it returns pint's default registry but you can set a different registry for whatever work you're doing with pint.set_application_registry().

If you're doing climate work, especially related to emissions, you often want 'emissions units' like "Mt CO2/yr". These are not recognised by default by Pint so you get errors if you try to convert them.

In [17]:

df_emissions = create_test_df(
    variables=(
        ("co2", "Mt CO2 / yr"),
        ("ch4", "Mt CH4 / yr"),
        ("hfc23", "kt HFC23 / yr"),
    ),
    n_scenarios=2,
    n_runs=1,
    timepoints=np.arange(1950.0, 1965.0),
).reset_index("run", drop=True)
df_emissions

Out[17]:

			1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	unit
scenario_0	co2	Mt CO2 / yr	0.334778	1.343045	3.124794	3.272307	4.958702	5.944578	6.680654	7.912122	8.619314	9.952941	10.776197	12.215116	13.353068	14.414149	15.376208
	ch4	Mt CH4 / yr	0.998931	2.016692	3.866751	5.128289	6.864509	8.165403	9.513507	11.520793	13.008829	14.763937	16.285959	17.693262	19.739902	20.869994	22.763815
	hfc23	kt HFC23 / yr	0.096459	2.362880	4.649200	6.267509	8.347073	10.849019	13.307531	15.084751	17.535367	18.670190	21.656035	23.644925	25.407146	27.373390	29.178703
scenario_1	co2	Mt CO2 / yr	0.352810	2.784526	5.512495	7.798160	10.597410	13.338805	15.476393	18.432925	21.189186	23.373402	25.960067	29.278525	31.041590	34.372871	36.309774
	ch4	Mt CH4 / yr	0.898024	4.009457	6.696396	9.721175	12.543608	15.813440	18.839671	21.642711	25.236076	28.185402	30.872498	33.859299	37.657214	40.230637	43.425710
	hfc23	kt HFC23 / yr	0.274082	4.547892	7.396640	11.445659	15.069731	18.415992	22.302893	25.495154	28.598960	32.982734	35.801677	39.711390	43.080513	46.818480	50.332368

As noted above, the default unit registry does not know about emissions units, so if we try to convert this data, we receive an error.

In [18]:

try:
    df_emissions.openscm.convert_unit({"Mt CO2 / yr": "GtC / yr"})
except pint.UndefinedUnitError:
    traceback.print_exc(limit=0)

pint.errors.UndefinedUnitError: 'CO2' is not defined in the unit registry

If we specify openscm-units' registry instead, the conversion will work.

In [19]:

df_emissions.openscm.convert_unit(
    {"Mt CO2 / yr": "GtC / yr"}, ur=openscm_units.unit_registry
)

Out[19]:

			1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	unit
scenario_0	co2	GtC / yr	0.000091	0.000366	0.000852	0.000892	0.001352	0.001621	0.001822	0.002158	0.002351	0.002714	0.002939	0.003331	0.003642	0.003931	0.004194
	ch4	Mt CH4 / yr	0.998931	2.016692	3.866751	5.128289	6.864509	8.165403	9.513507	11.520793	13.008829	14.763937	16.285959	17.693262	19.739902	20.869994	22.763815
	hfc23	kt HFC23 / yr	0.096459	2.362880	4.649200	6.267509	8.347073	10.849019	13.307531	15.084751	17.535367	18.670190	21.656035	23.644925	25.407146	27.373390	29.178703
scenario_1	co2	GtC / yr	0.000096	0.000759	0.001503	0.002127	0.002890	0.003638	0.004221	0.005027	0.005779	0.006375	0.007080	0.007985	0.008466	0.009374	0.009903
	ch4	Mt CH4 / yr	0.898024	4.009457	6.696396	9.721175	12.543608	15.813440	18.839671	21.642711	25.236076	28.185402	30.872498	33.859299	37.657214	40.230637	43.425710
	hfc23	kt HFC23 / yr	0.274082	4.547892	7.396640	11.445659	15.069731	18.415992	22.302893	25.495154	28.598960	32.982734	35.801677	39.711390	43.080513	46.818480	50.332368
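
To see roughly where the converted numbers come from, the conversion from Mt CO2 / yr to GtC / yr can be approximated by hand: scale by the mass fraction of carbon in CO2, then convert Mt to Gt. The snippet below is a plain-Python sketch. It assumes rounded molar masses (12 for C, 44 for CO2), so it agrees with the output above only to the precision shown; openscm-units' own factor may differ in later digits.

```python
# Approximate check of the Mt CO2 / yr -> GtC / yr conversion shown above.
# Assumption: rounded molar masses of 12 g/mol (C) and 44 g/mol (CO2).
C_PER_CO2 = 12.0 / 44.0  # mass of carbon per unit mass of CO2
MT_PER_GT = 1000.0


def mt_co2_to_gtc(value_mt_co2: float) -> float:
    """Approximate conversion from Mt CO2 to Gt C (hypothetical helper)."""
    return value_mt_co2 * C_PER_CO2 / MT_PER_GT


# First value of scenario_0, co2 in df_emissions (as printed above)
approx = mt_co2_to_gtc(0.334778)
print(f"{approx:.6f}")
```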

If we set the application registry to openscm-units' registry, then we do not need to pass the registry every time we want to do such a conversion.

In [20]:

pint.set_application_registry(openscm_units.unit_registry)

In [21]:

# Now the conversion works without specifying the registry
df_emissions.openscm.convert_unit({"Mt CO2 / yr": "GtC / yr"})

Out[21]:

			1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	unit
scenario_0	co2	GtC / yr	0.000091	0.000366	0.000852	0.000892	0.001352	0.001621	0.001822	0.002158	0.002351	0.002714	0.002939	0.003331	0.003642	0.003931	0.004194
	ch4	Mt CH4 / yr	0.998931	2.016692	3.866751	5.128289	6.864509	8.165403	9.513507	11.520793	13.008829	14.763937	16.285959	17.693262	19.739902	20.869994	22.763815
	hfc23	kt HFC23 / yr	0.096459	2.362880	4.649200	6.267509	8.347073	10.849019	13.307531	15.084751	17.535367	18.670190	21.656035	23.644925	25.407146	27.373390	29.178703
scenario_1	co2	GtC / yr	0.000096	0.000759	0.001503	0.002127	0.002890	0.003638	0.004221	0.005027	0.005779	0.006375	0.007080	0.007985	0.008466	0.009374	0.009903
	ch4	Mt CH4 / yr	0.898024	4.009457	6.696396	9.721175	12.543608	15.813440	18.839671	21.642711	25.236076	28.185402	30.872498	33.859299	37.657214	40.230637	43.425710
	hfc23	kt HFC23 / yr	0.274082	4.547892	7.396640	11.445659	15.069731	18.415992	22.302893	25.495154	28.598960	32.982734	35.801677	39.711390	43.080513	46.818480	50.332368
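
As a rough sanity check on the co2 rows above, the conversion can be reproduced by hand. The sketch below assumes the registry converts CO2 mass to carbon mass using the mass fraction 12/44 (an assumption, not something stated on this page) and that 1 Gt = 1000 Mt. The helper name `mt_co2_to_gtc` is hypothetical and is not part of pandas-openscm.

```python
# Assumption: CO2 mass is converted to carbon mass with the
# mass fraction 12 / 44; this factor is not stated on this page.
C_PER_CO2 = 12.0 / 44.0
MT_PER_GT = 1000.0


def mt_co2_to_gtc(value_mt_co2: float) -> float:
    """Convert a flux in Mt CO2 / yr to GtC / yr by hand (hypothetical helper)."""
    return value_mt_co2 * C_PER_CO2 / MT_PER_GT


# 1950 value of co2 in scenario_0, in Mt CO2 / yr
converted = mt_co2_to_gtc(0.334778)
print(f"{converted:.6f}")  # prints 0.000091
```

The result agrees with the 1950 entry for scenario_0 in the table above, to the six decimal places shown.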

Contexts

Pint supports the idea of contexts. Within a context, conversions that would normally be disallowed become possible. Pint's docs give good examples of cases where this is useful. For emissions work, the key one is CO₂-equivalent units. Thanks to Pint's contexts and the unit conversion API, converting to CO₂-equivalent units is straightforward.

In [22]:

with openscm_units.unit_registry.context("AR6GWP100"):
    df_emissions_co2_eq = df_emissions.openscm.convert_unit(
        "Mt CO2 / yr", ur=openscm_units.unit_registry
    )

df_emissions_co2_eq

Out[22]:

			1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	variable	unit
scenario_0	co2	Mt CO2 / yr	0.334778	1.343045	3.124794	3.272307	4.958702	5.944578	6.680654	7.912122	8.619314	9.952941	10.776197	12.215116	13.353068	14.414149	15.376208
	ch4	Mt CO2 / yr	27.870180	56.265714	107.882341	143.079267	191.519795	227.814756	265.426838	321.430128	362.946320	411.913849	454.378261	493.642006	550.743270	582.272836	635.110428
	hfc23	Mt CO2 / yr	1.408295	34.498050	67.878316	91.505630	121.867259	158.395677	194.289958	220.237363	256.016356	272.584769	316.178111	345.215909	370.944330	399.651490	426.009063
scenario_1	co2	Mt CO2 / yr	0.352810	2.784526	5.512495	7.798160	10.597410	13.338805	15.476393	18.432925	21.189186	23.373402	25.960067	29.278525	31.041590	34.372871	36.309774
	ch4	Mt CO2 / yr	25.054861	111.863837	186.829462	271.220770	349.966658	441.194977	525.626823	603.831637	704.086534	786.372710	861.342703	944.674442	1050.636258	1122.434767	1211.577296
	hfc23	Mt CO2 / yr	4.001590	66.399219	107.990941	167.106622	220.018077	268.873490	325.622239	372.229255	417.544822	481.547911	522.704479	579.786292	628.975491	683.549812	734.852566
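
The ch4 and hfc23 rows above can be checked by hand. The GWP100 factors in the sketch below (27.9 for CH4, 14600 for HFC23) are inferred from the ratio of converted to original values in this notebook, so treat them as assumptions about the AR6GWP100 context rather than authoritative values.

```python
# Assumed GWP100 factors, inferred from the tables on this page
# (converted value divided by original value)
GWP100 = {"CH4": 27.9, "HFC23": 14600.0}

# 1950 values for scenario_0 before conversion
ch4_mt = 0.998931  # Mt CH4 / yr
hfc23_kt = 0.096459  # kt HFC23 / yr

ch4_co2_eq_mt = ch4_mt * GWP100["CH4"]
# kt -> Mt needs an extra factor of 1 / 1000
hfc23_co2_eq_mt = hfc23_kt * GWP100["HFC23"] / 1000.0

print(round(ch4_co2_eq_mt, 4), round(hfc23_co2_eq_mt, 4))
```

Both values agree with the 1950 column of the converted table to within the rounding of the displayed inputs.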

In [23]:

# From here, calculating total CO2-equivalent emissions
# is straightforward, e.g.
df_emissions_co2_eq.openscm.groupby_except("variable").sum().pix.assign(variable="ghg")

Out[23]:

			1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	1960.0	1961.0	1962.0	1963.0	1964.0
scenario	unit	variable
scenario_0	Mt CO2 / yr	ghg	29.613252	92.106810	178.885452	237.857205	318.345756	392.155011	466.397450	549.579612	627.581990	694.451559	781.332569	851.073031	935.040667	996.338475	1076.495699
scenario_1	Mt CO2 / yr	ghg	29.409261	181.047582	300.332898	446.125552	580.582145	723.407272	866.725455	994.493817	1142.820542	1291.294023	1410.007249	1553.739258	1710.653340	1840.357451	1982.739636
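
For readers less familiar with groupby_except, the sum above can be approximated in plain pandas by grouping on every index level other than variable. This is a sketch of the idea under that assumption, not pandas-openscm's implementation. The frame below holds only the 1950 column for scenario_0, copied from the CO₂-equivalent table above.

```python
import pandas as pd

# 1950 CO2-equivalent values for scenario_0, copied from the table above
index = pd.MultiIndex.from_tuples(
    [
        ("scenario_0", "co2", "Mt CO2 / yr"),
        ("scenario_0", "ch4", "Mt CO2 / yr"),
        ("scenario_0", "hfc23", "Mt CO2 / yr"),
    ],
    names=["scenario", "variable", "unit"],
)
df = pd.DataFrame({1950.0: [0.334778, 27.870180, 1.408295]}, index=index)

# Group by every index level except "variable", then sum within each group
group_levels = [name for name in df.index.names if name != "variable"]
total = df.groupby(level=group_levels).sum()

print(total)
```

The single resulting row sums to roughly 29.6133 Mt CO2 / yr, in line with the 1950 value for scenario_0 above.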

Convert unit like

A common scenario is wanting to compare two datasets. In such cases, life is much easier if they have the same units. To support this case, we provide the convert_unit_like API. This is essentially a wrapper around convert_unit_from_target_series which figures out the desired units from the data we would like to match. If the logic in convert_unit_like doesn't fit your use case, we suggest creating your desired units by hand and then using convert_unit_from_target_series or convert_unit directly.

Let's imagine we have scenario data like the below.

In [24]:

df_scenarios = create_test_df(
    variables=(
        ("co2", "Mt CO2 / yr"),
        ("ch4", "Mt CH4 / yr"),
        ("hfc23", "kt HFC23 / yr"),
    ),
    n_scenarios=2,
    n_runs=1,
    timepoints=np.arange(2025.0, 2100.0 + 1.0),
).reset_index("run", drop=True)
df_scenarios

Out[24]:

			2025.0	2026.0	2027.0	2028.0	2029.0	2030.0	2031.0	2032.0	2033.0	2034.0	...	2091.0	2092.0	2093.0	2094.0	2095.0	2096.0	2097.0	2098.0	2099.0	2100.0
scenario	variable	unit
scenario_0	co2	Mt CO2 / yr	0.630829	0.374979	1.082395	0.672892	1.647849	1.460875	1.667553	2.055225	2.580004	2.606394	...	13.661322	14.218882	14.532644	14.521115	14.417093	14.997389	14.927472	14.802409	15.562887	15.202732
	ch4	Mt CH4 / yr	0.222510	1.257801	0.604924	1.345336	1.646566	1.607872	2.543852	2.624976	3.152217	3.554689	...	20.112408	20.078811	20.536215	21.187019	21.165569	20.991694	21.522563	22.409479	21.946998	22.139521
	hfc23	kt HFC23 / yr	0.907473	0.761665	1.352878	1.591107	1.988541	2.354604	2.801807	3.638481	3.854258	4.320065	...	25.942678	26.596298	27.239053	26.792063	27.380232	28.124666	28.204998	28.594711	28.775431	29.442063
scenario_1	co2	Mt CO2 / yr	0.333148	0.722961	0.982739	1.705736	2.422920	2.841952	3.028106	4.066724	4.333335	4.768300	...	32.645639	32.864509	33.055400	33.461216	33.668035	34.977745	34.963627	35.109885	36.382614	36.341592
	ch4	Mt CH4 / yr	0.760049	1.455638	1.874968	1.879505	2.596289	3.134631	3.501749	4.417489	5.080045	5.316956	...	38.503701	38.823542	39.769538	40.194249	40.160799	41.322017	42.050710	41.899587	42.455124	43.850286
	hfc23	kt HFC23 / yr	0.696459	0.896275	2.092961	2.249874	2.802661	3.345677	4.618293	4.898255	6.250212	6.213745	...	44.342354	45.581363	45.723373	46.069332	47.138450	47.379370	48.533851	48.734898	49.388509	50.959111

6 rows × 76 columns

Then we have some historical data.

In [25]:

df_history = create_test_df(
    variables=(
        ("co2", "Gt CO2 / yr"),
        ("ch4", "kt CH4 / yr"),
        ("hfc23", "t HFC23 / yr"),
    ),
    n_scenarios=1,
    n_runs=1,
    timepoints=np.arange(1950.0, 2024.0 + 1.0),
).reset_index(["run", "scenario"], drop=True)
df_history

Out[25]:

		1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	...	2015.0	2016.0	2017.0	2018.0	2019.0	2020.0	2021.0	2022.0	2023.0	2024.0
variable	unit
co2	Gt CO2 / yr	0.978043	0.773526	0.912909	1.452641	1.636037	1.155063	1.692588	1.591687	2.135544	2.410116	...	14.148612	13.686054	14.140349	14.744365	14.366217	14.316261	15.365074	15.358737	15.158790	15.171308
ch4	kt CH4 / yr	0.783886	0.804794	1.004127	1.346120	1.772816	2.696717	3.012730	3.242307	3.956110	4.423999	...	28.619005	29.193156	29.756717	30.491717	31.096479	31.225626	31.814345	32.176674	32.337564	33.020764
hfc23	t HFC23 / yr	0.855126	1.413841	1.694239	2.726967	2.738134	3.550249	4.659124	5.119329	6.033273	6.537820	...	44.683132	44.757387	46.246930	46.552409	46.816337	47.885555	48.101853	49.577580	49.781345	50.897692

3 rows × 75 columns

We can convert the scenario data to the same units as the history data with convert_unit_like.

In [26]:

df_scenarios_like_history = df_scenarios.openscm.convert_unit_like(df_history)
df_scenarios_like_history

Out[26]:

			2025.0	2026.0	2027.0	2028.0	2029.0	2030.0	2031.0	2032.0	2033.0	2034.0	...	2091.0	2092.0	2093.0	2094.0	2095.0	2096.0	2097.0	2098.0	2099.0	2100.0
scenario	variable	unit
scenario_0	co2	Gt CO2 / yr	0.000631	0.000375	0.001082	0.000673	0.001648	0.001461	0.001668	0.002055	0.002580	0.002606	...	0.013661	0.014219	0.014533	0.014521	0.014417	0.014997	0.014927	0.014802	0.015563	0.015203
	ch4	kt CH4 / yr	222.509659	1257.801096	604.923605	1345.336175	1646.565964	1607.872465	2543.852483	2624.976289	3152.217194	3554.688682	...	20112.408086	20078.811141	20536.214563	21187.019150	21165.568877	20991.693947	21522.563090	22409.478620	21946.997626	22139.521296
	hfc23	t HFC23 / yr	907.473110	761.664878	1352.878260	1591.107459	1988.541339	2354.603867	2801.806841	3638.480910	3854.257636	4320.064950	...	25942.677829	26596.297594	27239.053313	26792.062682	27380.232365	28124.665905	28204.998273	28594.710871	28775.430798	29442.062543
scenario_1	co2	Gt CO2 / yr	0.000333	0.000723	0.000983	0.001706	0.002423	0.002842	0.003028	0.004067	0.004333	0.004768	...	0.032646	0.032865	0.033055	0.033461	0.033668	0.034978	0.034964	0.035110	0.036383	0.036342
	ch4	kt CH4 / yr	760.048637	1455.638181	1874.968334	1879.505340	2596.288576	3134.630982	3501.749307	4417.489063	5080.044930	5316.956134	...	38503.701140	38823.541952	39769.537508	40194.248583	40160.799016	41322.016568	42050.710298	41899.587030	42455.124228	43850.285509
	hfc23	t HFC23 / yr	696.459075	896.275118	2092.961092	2249.873631	2802.661319	3345.676609	4618.293460	4898.255256	6250.212111	6213.744511	...	44342.353562	45581.362736	45723.372766	46069.332367	47138.449518	47379.369970	48533.851142	48734.897994	49388.508796	50959.110554

6 rows × 76 columns

The functional equivalent of the accessor call above is the convert_unit_like function.

In [27]:

pd.testing.assert_frame_equal(
    df_scenarios.openscm.convert_unit_like(df_history),
    convert_unit_like(df_scenarios, df_history),
)

For scenarios like this, where the desired units are clear, we can also do the reverse operation.

In [28]:

df_history.openscm.convert_unit_like(df_scenarios)

Out[28]:

		1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	...	2015.0	2016.0	2017.0	2018.0	2019.0	2020.0	2021.0	2022.0	2023.0	2024.0
variable	unit
co2	Mt CO2 / yr	978.042673	773.525960	912.909168	1452.641459	1636.036852	1155.062555	1692.587556	1591.686524	2135.544372	2410.116117	...	14148.611951	13686.053976	14140.349136	14744.364768	14366.217023	14316.260674	15365.073914	15358.737229	15158.790498	15171.307770
ch4	Mt CH4 / yr	0.000784	0.000805	0.001004	0.001346	0.001773	0.002697	0.003013	0.003242	0.003956	0.004424	...	0.028619	0.029193	0.029757	0.030492	0.031096	0.031226	0.031814	0.032177	0.032338	0.033021
hfc23	kt HFC23 / yr	0.000855	0.001414	0.001694	0.002727	0.002738	0.003550	0.004659	0.005119	0.006033	0.006538	...	0.044683	0.044757	0.046247	0.046552	0.046816	0.047886	0.048102	0.049578	0.049781	0.050898

3 rows × 75 columns

However, if the scenarios themselves had different units, then the target unit for a given timeseries in history would be ambiguous and we would get an error.

In [29]:

df_scenarios_differing_units = df_scenarios.openscm.set_index_levels(
    {
        "unit": [
            "Gt CO2 / yr",
            "kt CH4 / yr",
            "kt HFC23 / yr",
            "Mt CO2 / yr",
            "Mt CH4 / yr",
            "kt HFC23 / yr",
        ]
    }
)

# Note that e.g. scenario_0 uses Gt CO2 / yr for co2
# while scenario_1 uses Mt CO2 / yr
df_scenarios_differing_units

Out[29]:

			2025.0	2026.0	2027.0	2028.0	2029.0	2030.0	2031.0	2032.0	2033.0	2034.0	...	2091.0	2092.0	2093.0	2094.0	2095.0	2096.0	2097.0	2098.0	2099.0	2100.0
scenario	variable	unit
scenario_0	co2	Gt CO2 / yr	0.630829	0.374979	1.082395	0.672892	1.647849	1.460875	1.667553	2.055225	2.580004	2.606394	...	13.661322	14.218882	14.532644	14.521115	14.417093	14.997389	14.927472	14.802409	15.562887	15.202732
	ch4	kt CH4 / yr	0.222510	1.257801	0.604924	1.345336	1.646566	1.607872	2.543852	2.624976	3.152217	3.554689	...	20.112408	20.078811	20.536215	21.187019	21.165569	20.991694	21.522563	22.409479	21.946998	22.139521
	hfc23	kt HFC23 / yr	0.907473	0.761665	1.352878	1.591107	1.988541	2.354604	2.801807	3.638481	3.854258	4.320065	...	25.942678	26.596298	27.239053	26.792063	27.380232	28.124666	28.204998	28.594711	28.775431	29.442063
scenario_1	co2	Mt CO2 / yr	0.333148	0.722961	0.982739	1.705736	2.422920	2.841952	3.028106	4.066724	4.333335	4.768300	...	32.645639	32.864509	33.055400	33.461216	33.668035	34.977745	34.963627	35.109885	36.382614	36.341592
	ch4	Mt CH4 / yr	0.760049	1.455638	1.874968	1.879505	2.596289	3.134631	3.501749	4.417489	5.080045	5.316956	...	38.503701	38.823542	39.769538	40.194249	40.160799	41.322017	42.050710	41.899587	42.455124	43.850286
	hfc23	kt HFC23 / yr	0.696459	0.896275	2.092961	2.249874	2.802661	3.345677	4.618293	4.898255	6.250212	6.213745	...	44.342354	45.581363	45.723373	46.069332	47.138450	47.379370	48.533851	48.734898	49.388509	50.959111

6 rows × 76 columns

In [30]:

try:
    df_history.openscm.convert_unit_like(df_scenarios_differing_units)
except AmbiguousTargetUnitError:
    traceback.print_exc(limit=0)

pandas_openscm.unit_conversion.AmbiguousTargetUnitError: `pobj` has pobj.index.names=FrozenList(['variable', 'unit']). `target` has target.index.names=FrozenList(['scenario', 'variable', 'unit']). The index levels in `target` that are also in `pobj` are ['variable']. When we only look at these levels, the desired unit looks like:
variable
co2           Gt CO2 / yr
ch4           kt CH4 / yr
hfc23       kt HFC23 / yr
co2           Mt CO2 / yr
ch4           Mt CH4 / yr
Name: unit, dtype: object
The unit to use isn't unambiguous for the following metadata:
variable
co2         Gt CO2 / yr
ch4         kt CH4 / yr
co2         Mt CO2 / yr
ch4         Mt CH4 / yr
Name: unit, dtype: object
The drivers of this ambiguity are the following metadata levels in `target`
MultiIndex([('scenario_0', 'co2', 'Gt CO2 / yr'),
            ('scenario_0', 'ch4', 'kt CH4 / yr'),
            ('scenario_1', 'co2', 'Mt CO2 / yr'),
            ('scenario_1', 'ch4', 'Mt CH4 / yr')],
           names=['scenario', 'variable', 'unit'])
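
The ambiguity reported in the error above can be illustrated with plain pandas. The sketch below is an assumption about the general idea (reduce the target to the index levels shared with the data being converted, then look for variables that map to more than one unit). It is not pandas-openscm's actual implementation, and the names `target_index`, `shared` and `ambiguous_variables` are hypothetical.

```python
import pandas as pd

# Hypothetical target index, mirroring df_scenarios_differing_units above
target_index = pd.MultiIndex.from_tuples(
    [
        ("scenario_0", "co2", "Gt CO2 / yr"),
        ("scenario_0", "ch4", "kt CH4 / yr"),
        ("scenario_0", "hfc23", "kt HFC23 / yr"),
        ("scenario_1", "co2", "Mt CO2 / yr"),
        ("scenario_1", "ch4", "Mt CH4 / yr"),
        ("scenario_1", "hfc23", "kt HFC23 / yr"),
    ],
    names=["scenario", "variable", "unit"],
)

# Keep only the level shared with the data to convert (variable), plus unit,
# and drop repeated variable-unit pairs
shared = target_index.to_frame(index=False)[["variable", "unit"]].drop_duplicates()

# A variable is ambiguous if it maps to more than one unit
n_units = shared.groupby("variable")["unit"].nunique()
ambiguous_variables = sorted(n_units[n_units > 1].index)

print(ambiguous_variables)  # prints ['ch4', 'co2']
```

Here hfc23 is not flagged because both scenarios use kt HFC23 / yr, which matches the metadata listed in the error message.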

In such a case, we can instead create our desired units ourselves and directly call convert_unit or convert_unit_from_target_series.

In [31]:

desired_units = (
    df_scenarios_differing_units.loc[pix.isin(scenario="scenario_0")]
    .index.droplevel("scenario")
    .to_frame()["unit"]
    .reset_index("unit", drop=True)
)
desired_units

Out[31]:

variable
co2        Gt CO2 / yr
ch4        kt CH4 / yr
hfc23    kt HFC23 / yr
Name: unit, dtype: object

In [32]:

df_history.openscm.convert_unit(desired_units)

Out[32]:

		1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	...	2015.0	2016.0	2017.0	2018.0	2019.0	2020.0	2021.0	2022.0	2023.0	2024.0
variable	unit
co2	Gt CO2 / yr	0.978043	0.773526	0.912909	1.452641	1.636037	1.155063	1.692588	1.591687	2.135544	2.410116	...	14.148612	13.686054	14.140349	14.744365	14.366217	14.316261	15.365074	15.358737	15.158790	15.171308
ch4	kt CH4 / yr	0.783886	0.804794	1.004127	1.346120	1.772816	2.696717	3.012730	3.242307	3.956110	4.423999	...	28.619005	29.193156	29.756717	30.491717	31.096479	31.225626	31.814345	32.176674	32.337564	33.020764
hfc23	kt HFC23 / yr	0.000855	0.001414	0.001694	0.002727	0.002738	0.003550	0.004659	0.005119	0.006033	0.006538	...	0.044683	0.044757	0.046247	0.046552	0.046816	0.047886	0.048102	0.049578	0.049781	0.050898

3 rows × 75 columns

In [33]:

# The functional equivalent using convert_unit_from_target_series
convert_unit_from_target_series(df_history, desired_units)

Out[33]:

		1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	...	2015.0	2016.0	2017.0	2018.0	2019.0	2020.0	2021.0	2022.0	2023.0	2024.0
variable	unit
co2	Gt CO2 / yr	0.978043	0.773526	0.912909	1.452641	1.636037	1.155063	1.692588	1.591687	2.135544	2.410116	...	14.148612	13.686054	14.140349	14.744365	14.366217	14.316261	15.365074	15.358737	15.158790	15.171308
ch4	kt CH4 / yr	0.783886	0.804794	1.004127	1.346120	1.772816	2.696717	3.012730	3.242307	3.956110	4.423999	...	28.619005	29.193156	29.756717	30.491717	31.096479	31.225626	31.814345	32.176674	32.337564	33.020764
hfc23	kt HFC23 / yr	0.000855	0.001414	0.001694	0.002727	0.002738	0.003550	0.004659	0.005119	0.006033	0.006538	...	0.044683	0.044757	0.046247	0.046552	0.046816	0.047886	0.048102	0.049578	0.049781	0.050898

3 rows × 75 columns

Converting the units of two pd.DataFrames so that they match makes further operations, like concatenating the two datasets or adding/subtracting them, much simpler.

In [34]:

df_full_timeseries = pd.concat(
    [
        v.dropna(how="all", axis="columns")
        for v in df_history.align(df_scenarios_like_history)
    ],
    axis="columns",
)
df_full_timeseries

Out[34]:

			1950.0	1951.0	1952.0	1953.0	1954.0	1955.0	1956.0	1957.0	1958.0	1959.0	...	2091.0	2092.0	2093.0	2094.0	2095.0	2096.0	2097.0	2098.0	2099.0	2100.0
variable	unit	scenario
ch4	kt CH4 / yr	scenario_0	0.783886	0.804794	1.004127	1.346120	1.772816	2.696717	3.012730	3.242307	3.956110	4.423999	...	20112.408086	20078.811141	20536.214563	21187.019150	21165.568877	20991.693947	21522.563090	22409.478620	21946.997626	22139.521296
ch4	kt CH4 / yr	scenario_1	0.783886	0.804794	1.004127	1.346120	1.772816	2.696717	3.012730	3.242307	3.956110	4.423999	...	38503.701140	38823.541952	39769.537508	40194.248583	40160.799016	41322.016568	42050.710298	41899.587030	42455.124228	43850.285509
co2	Gt CO2 / yr	scenario_0	0.978043	0.773526	0.912909	1.452641	1.636037	1.155063	1.692588	1.591687	2.135544	2.410116	...	0.013661	0.014219	0.014533	0.014521	0.014417	0.014997	0.014927	0.014802	0.015563	0.015203
co2	Gt CO2 / yr	scenario_1	0.978043	0.773526	0.912909	1.452641	1.636037	1.155063	1.692588	1.591687	2.135544	2.410116	...	0.032646	0.032865	0.033055	0.033461	0.033668	0.034978	0.034964	0.035110	0.036383	0.036342
hfc23	t HFC23 / yr	scenario_0	0.855126	1.413841	1.694239	2.726967	2.738134	3.550249	4.659124	5.119329	6.033273	6.537820	...	25942.677829	26596.297594	27239.053313	26792.062682	27380.232365	28124.665905	28204.998273	28594.710871	28775.430798	29442.062543
hfc23	t HFC23 / yr	scenario_1	0.855126	1.413841	1.694239	2.726967	2.738134	3.550249	4.659124	5.119329	6.033273	6.537820	...	44342.353562	45581.362736	45723.372766	46069.332367	47138.449518	47379.369970	48533.851142	48734.897994	49388.508796	50959.110554

6 rows × 151 columns

Summary

This page has covered pandas-openscm's unit-conversion APIs. We hope they are helpful. If you run into any problems, please raise an issue.