Unit conversion¶
Here we detail pandas-openscm's unit conversion support.
Imports¶
import traceback
import numpy as np
import openscm_units
import pandas as pd
import pandas_indexing as pix
import pint
from pandas_openscm import register_pandas_accessors
from pandas_openscm.testing import create_test_df
from pandas_openscm.unit_conversion import (
AmbiguousTargetUnitError,
convert_unit,
convert_unit_from_target_series,
convert_unit_like,
)
Setup¶
# Register the openscm accessor for pandas objects
# (we don't do this on import
# as we have had bad experiences with implicit behaviour like that)
register_pandas_accessors()
Basics¶
Convert all data to a given unit¶
Imagine we start with some data.
df_basic = create_test_df(
variables=(("Warming", "K"),),
n_scenarios=2,
n_runs=2,
timepoints=np.arange(1950.0, 1965.0),
)
df_basic
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | run | unit | |||||||||||||||
| scenario_0 | Warming | 0 | K | 0.833020 | 1.596905 | 3.094557 | 3.992015 | 4.373712 | 6.107554 | 7.197416 | 7.505911 | 8.627713 | 9.671078 | 10.841117 | 12.781138 | 13.251307 | 14.248473 | 15.323850 |
| 1 | K | 0.235816 | 2.147355 | 4.778399 | 5.724058 | 7.638744 | 9.625288 | 12.165216 | 13.419185 | 15.786075 | 17.392358 | 19.923693 | 21.866557 | 23.407121 | 25.251397 | 27.026891 | ||
| scenario_1 | Warming | 0 | K | 0.788474 | 2.848990 | 5.562105 | 8.218011 | 11.110182 | 14.104328 | 16.778587 | 19.593635 | 22.718554 | 25.553278 | 27.384330 | 31.046353 | 33.810148 | 35.909439 | 38.591891 |
| 1 | K | 0.366913 | 3.752108 | 7.531107 | 11.588520 | 15.099883 | 18.601441 | 22.251733 | 25.213163 | 28.887288 | 32.881978 | 35.717950 | 39.586653 | 43.210651 | 46.717081 | 50.754983 |
If we want to convert the entire dataset to a different unit,
we can simply call convert_unit with the desired unit.
df_basic.openscm.convert_unit("degC")
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | run | unit | |||||||||||||||
| scenario_0 | Warming | 0 | degC | -272.316980 | -271.553095 | -270.055443 | -269.157985 | -268.776288 | -267.042446 | -265.952584 | -265.644089 | -264.522287 | -263.478922 | -262.308883 | -260.368862 | -259.898693 | -258.901527 | -257.826150 |
| 1 | degC | -272.914184 | -271.002645 | -268.371601 | -267.425942 | -265.511256 | -263.524712 | -260.984784 | -259.730815 | -257.363925 | -255.757642 | -253.226307 | -251.283443 | -249.742879 | -247.898603 | -246.123109 | ||
| scenario_1 | Warming | 0 | degC | -272.361526 | -270.301010 | -267.587895 | -264.931989 | -262.039818 | -259.045672 | -256.371413 | -253.556365 | -250.431446 | -247.596722 | -245.765670 | -242.103647 | -239.339852 | -237.240561 | -234.558109 |
| 1 | degC | -272.783087 | -269.397892 | -265.618893 | -261.561480 | -258.050117 | -254.548559 | -250.898267 | -247.936837 | -244.262712 | -240.268022 | -237.432050 | -233.563347 | -229.939349 | -226.432919 | -222.395017 |
The same conversion is also available as a plain function, convert_unit,
which takes the data as its first argument and returns the same result.
pd.testing.assert_frame_equal(
df_basic.openscm.convert_unit("degC"), convert_unit(df_basic, "degC")
)
By default, this assumes that the unit information
is in an index level called 'unit'.
If this isn't the case, the unit_level argument should be used.
df_other_unit_col = create_test_df(
variables=(("Warming", "K"),),
n_scenarios=2,
n_runs=2,
timepoints=np.arange(1950.0, 1965.0),
).rename_axis(["scenario", "variable", "run", "units"])
# Notice that unit information is in an index level called "units" not "unit"
df_other_unit_col
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | run | units | |||||||||||||||
| scenario_0 | Warming | 0 | K | 0.121723 | 1.807267 | 3.014577 | 3.330916 | 5.023547 | 5.463518 | 7.016332 | 8.206352 | 8.680376 | 10.110922 | 10.981748 | 12.577318 | 13.704795 | 14.635089 | 15.251599 |
| 1 | K | 0.039855 | 2.126347 | 4.565873 | 6.025437 | 8.254715 | 9.543748 | 11.748486 | 14.116716 | 15.653353 | 17.514174 | 19.675970 | 21.061125 | 23.149377 | 25.513571 | 27.152355 | ||
| scenario_1 | Warming | 0 | K | 0.578164 | 2.738325 | 6.241563 | 8.797144 | 11.444365 | 14.053786 | 17.203503 | 19.709319 | 22.794930 | 24.830284 | 28.175205 | 30.283175 | 33.703585 | 36.574880 | 38.443465 |
| 1 | K | 0.313213 | 4.335544 | 7.499656 | 10.951590 | 15.057613 | 18.850291 | 21.769198 | 25.289064 | 28.607206 | 32.271472 | 36.216251 | 39.913319 | 43.058160 | 47.217991 | 50.610226 |
df_other_unit_col.openscm.convert_unit("kK", unit_level="units")
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | run | units | |||||||||||||||
| scenario_0 | Warming | 0 | kK | 0.000122 | 0.001807 | 0.003015 | 0.003331 | 0.005024 | 0.005464 | 0.007016 | 0.008206 | 0.008680 | 0.010111 | 0.010982 | 0.012577 | 0.013705 | 0.014635 | 0.015252 |
| 1 | kK | 0.000040 | 0.002126 | 0.004566 | 0.006025 | 0.008255 | 0.009544 | 0.011748 | 0.014117 | 0.015653 | 0.017514 | 0.019676 | 0.021061 | 0.023149 | 0.025514 | 0.027152 | ||
| scenario_1 | Warming | 0 | kK | 0.000578 | 0.002738 | 0.006242 | 0.008797 | 0.011444 | 0.014054 | 0.017204 | 0.019709 | 0.022795 | 0.024830 | 0.028175 | 0.030283 | 0.033704 | 0.036575 | 0.038443 |
| 1 | kK | 0.000313 | 0.004336 | 0.007500 | 0.010952 | 0.015058 | 0.018850 | 0.021769 | 0.025289 | 0.028607 | 0.032271 | 0.036216 | 0.039913 | 0.043058 | 0.047218 | 0.050610 |
More specific conversions¶
Above we have shown how to convert the entire dataset to a given unit. This works well when such a conversion is possible. However, we often have cases where different timeseries have different dimensionality and therefore cannot all be converted to the same unit.
Once again, start with some example data. Here we have data with different dimensionality.
df_multi_unit = create_test_df(
variables=(
("Warming", "K"),
("Ocean Heat Content", "ZJ"),
("SLR", "mm"),
),
n_scenarios=2,
n_runs=2,
timepoints=np.arange(1950.0, 1965.0),
)
df_multi_unit
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | run | unit | |||||||||||||||
| scenario_0 | Warming | 0 | K | 0.591109 | 1.562867 | 2.679513 | 4.030553 | 4.825483 | 5.737766 | 6.954924 | 8.372727 | 8.623694 | 10.261342 | 11.325057 | 11.890421 | 13.820463 | 14.303354 | 15.043330 |
| 1 | K | 0.336865 | 1.370294 | 3.160775 | 4.675830 | 5.966644 | 7.076796 | 8.240343 | 9.493297 | 11.120161 | 12.500829 | 13.776882 | 14.359810 | 16.256197 | 17.362932 | 18.647938 | ||
| Ocean Heat Content | 0 | ZJ | 0.650901 | 1.718926 | 3.599942 | 5.545718 | 6.139408 | 8.141200 | 9.664617 | 11.646145 | 13.155223 | 14.116252 | 15.429051 | 16.991799 | 18.336360 | 20.597745 | 21.642114 | |
| 1 | ZJ | 0.865208 | 2.043763 | 3.774768 | 5.922776 | 7.079142 | 9.545313 | 10.882294 | 12.897646 | 14.617578 | 16.108130 | 17.700154 | 19.860012 | 21.511893 | 23.670213 | 24.590182 | ||
| SLR | 0 | mm | 0.705512 | 2.477574 | 4.259611 | 6.136331 | 8.920068 | 10.794516 | 12.424736 | 14.444088 | 16.085314 | 18.178475 | 20.133402 | 21.992966 | 23.936006 | 26.231796 | 28.589251 | |
| 1 | mm | 0.270463 | 2.377309 | 5.036762 | 6.774614 | 8.882436 | 11.289773 | 13.499017 | 16.123079 | 18.389886 | 19.968141 | 22.614139 | 24.411224 | 27.092495 | 29.658975 | 30.989785 | ||
| scenario_1 | Warming | 0 | K | 0.259090 | 2.885206 | 5.209213 | 8.237810 | 10.385821 | 12.465355 | 15.335473 | 17.286317 | 19.549002 | 22.367454 | 24.438071 | 27.245716 | 30.142752 | 32.459841 | 34.266800 |
| 1 | K | 0.965964 | 2.882328 | 5.559345 | 8.926928 | 10.948453 | 13.619826 | 16.690716 | 18.757951 | 22.213460 | 24.682051 | 27.069908 | 29.696053 | 32.128800 | 34.954508 | 37.892366 | ||
| Ocean Heat Content | 0 | ZJ | 0.263489 | 2.942555 | 6.115062 | 9.179288 | 12.320463 | 14.783798 | 17.607771 | 20.525367 | 23.388455 | 26.534521 | 29.332296 | 32.016790 | 35.656566 | 37.657090 | 41.448901 | |
| 1 | ZJ | 0.892679 | 3.237138 | 6.472755 | 9.945021 | 13.166714 | 15.949202 | 19.266764 | 22.750679 | 25.422125 | 28.239576 | 32.140174 | 34.588350 | 37.834550 | 40.553825 | 43.798472 | ||
| SLR | 0 | mm | 0.076160 | 3.590359 | 7.339651 | 10.334739 | 14.203034 | 17.025183 | 20.985771 | 24.158424 | 27.308888 | 31.025325 | 33.503596 | 37.171544 | 40.857656 | 43.486541 | 47.465708 | |
| 1 | mm | 0.099283 | 4.103452 | 7.949211 | 11.086579 | 15.209196 | 17.984787 | 22.160941 | 25.300687 | 29.478183 | 32.345733 | 36.161467 | 40.057720 | 43.015301 | 46.650004 | 50.908462 |
If we try to convert everything to a single unit, we will get a dimensionality error for whatever units aren't compatible.
try:
df_multi_unit.openscm.convert_unit("cm")
except pint.DimensionalityError:
traceback.print_exc(limit=0)
pint.errors.DimensionalityError: Cannot convert from 'kelvin' ([temperature]) to 'centimeter' ([length])
To support unit conversion in such a case, we can do one of three things:
- filter the data first before converting the unit
  - this defeats the purpose of this API somewhat, as you would end up filtering, converting and recombining everywhere, so we ignore this option from here on
- specify the conversion as a mapping from the current unit to the desired unit
- specify the conversion as a pd.Series of the units we would like to end up with
Specifying the unit as a mapping¶
One option is to specify the conversion as a mapping from the current unit to the desired unit. This option is useful when the current unit alone determines the desired unit. Any current units which don't appear in the mapping are left alone, i.e. the data for these rows is returned as is. The API is demonstrated below.
# Note that no conversion is done for temperature (units of K)
# as "K" does not appear in the mapping
df_multi_unit.openscm.convert_unit({"mm": "cm", "ZJ": "PJ"})
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | run | unit | |||||||||||||||
| scenario_0 | Warming | 0 | K | 0.591109 | 1.562867e+00 | 2.679513e+00 | 4.030553e+00 | 4.825483e+00 | 5.737766e+00 | 6.954924e+00 | 8.372727e+00 | 8.623694e+00 | 1.026134e+01 | 1.132506e+01 | 1.189042e+01 | 1.382046e+01 | 1.430335e+01 | 1.504333e+01 |
| 1 | K | 0.336865 | 1.370294e+00 | 3.160775e+00 | 4.675830e+00 | 5.966644e+00 | 7.076796e+00 | 8.240343e+00 | 9.493297e+00 | 1.112016e+01 | 1.250083e+01 | 1.377688e+01 | 1.435981e+01 | 1.625620e+01 | 1.736293e+01 | 1.864794e+01 | ||
| Ocean Heat Content | 0 | PJ | 650901.313413 | 1.718926e+06 | 3.599942e+06 | 5.545718e+06 | 6.139408e+06 | 8.141200e+06 | 9.664617e+06 | 1.164614e+07 | 1.315522e+07 | 1.411625e+07 | 1.542905e+07 | 1.699180e+07 | 1.833636e+07 | 2.059774e+07 | 2.164211e+07 | |
| 1 | PJ | 865208.241574 | 2.043763e+06 | 3.774768e+06 | 5.922776e+06 | 7.079142e+06 | 9.545313e+06 | 1.088229e+07 | 1.289765e+07 | 1.461758e+07 | 1.610813e+07 | 1.770015e+07 | 1.986001e+07 | 2.151189e+07 | 2.367021e+07 | 2.459018e+07 | ||
| SLR | 0 | cm | 0.070551 | 2.477574e-01 | 4.259611e-01 | 6.136331e-01 | 8.920068e-01 | 1.079452e+00 | 1.242474e+00 | 1.444409e+00 | 1.608531e+00 | 1.817847e+00 | 2.013340e+00 | 2.199297e+00 | 2.393601e+00 | 2.623180e+00 | 2.858925e+00 | |
| 1 | cm | 0.027046 | 2.377309e-01 | 5.036762e-01 | 6.774614e-01 | 8.882436e-01 | 1.128977e+00 | 1.349902e+00 | 1.612308e+00 | 1.838989e+00 | 1.996814e+00 | 2.261414e+00 | 2.441122e+00 | 2.709250e+00 | 2.965897e+00 | 3.098979e+00 | ||
| scenario_1 | Warming | 0 | K | 0.259090 | 2.885206e+00 | 5.209213e+00 | 8.237810e+00 | 1.038582e+01 | 1.246536e+01 | 1.533547e+01 | 1.728632e+01 | 1.954900e+01 | 2.236745e+01 | 2.443807e+01 | 2.724572e+01 | 3.014275e+01 | 3.245984e+01 | 3.426680e+01 |
| 1 | K | 0.965964 | 2.882328e+00 | 5.559345e+00 | 8.926928e+00 | 1.094845e+01 | 1.361983e+01 | 1.669072e+01 | 1.875795e+01 | 2.221346e+01 | 2.468205e+01 | 2.706991e+01 | 2.969605e+01 | 3.212880e+01 | 3.495451e+01 | 3.789237e+01 | ||
| Ocean Heat Content | 0 | PJ | 263488.878579 | 2.942555e+06 | 6.115062e+06 | 9.179288e+06 | 1.232046e+07 | 1.478380e+07 | 1.760777e+07 | 2.052537e+07 | 2.338845e+07 | 2.653452e+07 | 2.933230e+07 | 3.201679e+07 | 3.565657e+07 | 3.765709e+07 | 4.144890e+07 | |
| 1 | PJ | 892678.664963 | 3.237138e+06 | 6.472755e+06 | 9.945021e+06 | 1.316671e+07 | 1.594920e+07 | 1.926676e+07 | 2.275068e+07 | 2.542213e+07 | 2.823958e+07 | 3.214017e+07 | 3.458835e+07 | 3.783455e+07 | 4.055382e+07 | 4.379847e+07 | ||
| SLR | 0 | cm | 0.007616 | 3.590359e-01 | 7.339651e-01 | 1.033474e+00 | 1.420303e+00 | 1.702518e+00 | 2.098577e+00 | 2.415842e+00 | 2.730889e+00 | 3.102532e+00 | 3.350360e+00 | 3.717154e+00 | 4.085766e+00 | 4.348654e+00 | 4.746571e+00 | |
| 1 | cm | 0.009928 | 4.103452e-01 | 7.949211e-01 | 1.108658e+00 | 1.520920e+00 | 1.798479e+00 | 2.216094e+00 | 2.530069e+00 | 2.947818e+00 | 3.234573e+00 | 3.616147e+00 | 4.005772e+00 | 4.301530e+00 | 4.665000e+00 | 5.090846e+00 |
The main thing to be careful of here is a typo in a current unit (i.e. a mapping key). If a key doesn't match any unit in the data, that conversion is silently skipped, which may cause confusion in later code that expects the conversion to have been done.
# There is a typo, "zJ" is given below rather than "ZJ"
# so the ocean heat content data is not converted to PJ.
# This happens silently i.e. no warning or error.
df_multi_unit.openscm.convert_unit({"mm": "cm", "zJ": "PJ"})
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | run | unit | |||||||||||||||
| scenario_0 | Warming | 0 | K | 0.591109 | 1.562867 | 2.679513 | 4.030553 | 4.825483 | 5.737766 | 6.954924 | 8.372727 | 8.623694 | 10.261342 | 11.325057 | 11.890421 | 13.820463 | 14.303354 | 15.043330 |
| 1 | K | 0.336865 | 1.370294 | 3.160775 | 4.675830 | 5.966644 | 7.076796 | 8.240343 | 9.493297 | 11.120161 | 12.500829 | 13.776882 | 14.359810 | 16.256197 | 17.362932 | 18.647938 | ||
| Ocean Heat Content | 0 | ZJ | 0.650901 | 1.718926 | 3.599942 | 5.545718 | 6.139408 | 8.141200 | 9.664617 | 11.646145 | 13.155223 | 14.116252 | 15.429051 | 16.991799 | 18.336360 | 20.597745 | 21.642114 | |
| 1 | ZJ | 0.865208 | 2.043763 | 3.774768 | 5.922776 | 7.079142 | 9.545313 | 10.882294 | 12.897646 | 14.617578 | 16.108130 | 17.700154 | 19.860012 | 21.511893 | 23.670213 | 24.590182 | ||
| SLR | 0 | cm | 0.070551 | 0.247757 | 0.425961 | 0.613633 | 0.892007 | 1.079452 | 1.242474 | 1.444409 | 1.608531 | 1.817847 | 2.013340 | 2.199297 | 2.393601 | 2.623180 | 2.858925 | |
| 1 | cm | 0.027046 | 0.237731 | 0.503676 | 0.677461 | 0.888244 | 1.128977 | 1.349902 | 1.612308 | 1.838989 | 1.996814 | 2.261414 | 2.441122 | 2.709250 | 2.965897 | 3.098979 | ||
| scenario_1 | Warming | 0 | K | 0.259090 | 2.885206 | 5.209213 | 8.237810 | 10.385821 | 12.465355 | 15.335473 | 17.286317 | 19.549002 | 22.367454 | 24.438071 | 27.245716 | 30.142752 | 32.459841 | 34.266800 |
| 1 | K | 0.965964 | 2.882328 | 5.559345 | 8.926928 | 10.948453 | 13.619826 | 16.690716 | 18.757951 | 22.213460 | 24.682051 | 27.069908 | 29.696053 | 32.128800 | 34.954508 | 37.892366 | ||
| Ocean Heat Content | 0 | ZJ | 0.263489 | 2.942555 | 6.115062 | 9.179288 | 12.320463 | 14.783798 | 17.607771 | 20.525367 | 23.388455 | 26.534521 | 29.332296 | 32.016790 | 35.656566 | 37.657090 | 41.448901 | |
| 1 | ZJ | 0.892679 | 3.237138 | 6.472755 | 9.945021 | 13.166714 | 15.949202 | 19.266764 | 22.750679 | 25.422125 | 28.239576 | 32.140174 | 34.588350 | 37.834550 | 40.553825 | 43.798472 | ||
| SLR | 0 | cm | 0.007616 | 0.359036 | 0.733965 | 1.033474 | 1.420303 | 1.702518 | 2.098577 | 2.415842 | 2.730889 | 3.102532 | 3.350360 | 3.717154 | 4.085766 | 4.348654 | 4.746571 | |
| 1 | cm | 0.009928 | 0.410345 | 0.794921 | 1.108658 | 1.520920 | 1.798479 | 2.216094 | 2.530069 | 2.947818 | 3.234573 | 3.616147 | 4.005772 | 4.301530 | 4.665000 | 5.090846 |
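One way to catch this kind of typo before converting is to compare the mapping's keys with the units that are actually in the data. The sketch below uses only pandas, with a small stand-in DataFrame instead of the test data above. The check is a suggestion of ours, not something pandas-openscm does for you.

```python
import pandas as pd

# Stand-in for the data: only the index matters for this check
index = pd.MultiIndex.from_tuples(
    [
        ("scenario_0", "Warming", 0, "K"),
        ("scenario_0", "Ocean Heat Content", 0, "ZJ"),
        ("scenario_0", "SLR", 0, "mm"),
    ],
    names=["scenario", "variable", "run", "unit"],
)
df = pd.DataFrame(
    [[0.1, 0.2], [1.0, 2.0], [10.0, 20.0]],
    index=index,
    columns=[1950.0, 1951.0],
)

mapping = {"mm": "cm", "zJ": "PJ"}  # "zJ" is the typo

# Mapping keys that do not match any unit in the data will be ignored
units_in_data = set(df.index.get_level_values("unit"))
unused_keys = sorted(set(mapping) - units_in_data)
print(unused_keys)
```

Any entry in unused_keys is a mapping key that would have no effect, which is usually a sign of a typo.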
Specifying the unit as a series¶
If you want really fine-grained, i.e. timeseries-level, control
then you can also specify the desired units as a pd.Series.
The pd.Series should specify the desired unit for each timeseries
and have an index which matches the data's index,
except for the unit-information level
(those are the pd.Series's values, not part of the index).
This is the hardest option to set up, but gives you the most control in exchange. As for the mapping option, any timeseries for which the desired unit is not specified is returned as it is.
# There are lots of ways to make a series like this.
# Here we go with a hand-woven, but simple option.
# For your own work, you may want/need something
# that includes much more programming and logic.
desired_unit = pd.Series(
["mK", "PJ", "cm", "cm"],
index=pd.MultiIndex.from_tuples(
[
("scenario_0", "Warming", 0),
("scenario_0", "Ocean Heat Content", 1),
("scenario_0", "SLR", 0),
("scenario_1", "SLR", 1),
],
# Note: no unit level here
names=["scenario", "variable", "run"],
),
)
desired_unit
scenario variable run
scenario_0 Warming 0 mK
Ocean Heat Content 1 PJ
SLR 0 cm
scenario_1 SLR 1 cm
dtype: object
# Note that only the rows which appear in `desired_unit`
# are converted, all others are unchanged.
df_multi_unit.openscm.convert_unit(desired_unit)
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | run | unit | |||||||||||||||
| scenario_0 | Warming | 0 | mK | 591.109485 | 1.562867e+03 | 2.679513e+03 | 4.030553e+03 | 4.825483e+03 | 5.737766e+03 | 6.954924e+03 | 8.372727e+03 | 8.623694e+03 | 1.026134e+04 | 1.132506e+04 | 1.189042e+04 | 1.382046e+04 | 1.430335e+04 | 1.504333e+04 |
| 1 | K | 0.336865 | 1.370294e+00 | 3.160775e+00 | 4.675830e+00 | 5.966644e+00 | 7.076796e+00 | 8.240343e+00 | 9.493297e+00 | 1.112016e+01 | 1.250083e+01 | 1.377688e+01 | 1.435981e+01 | 1.625620e+01 | 1.736293e+01 | 1.864794e+01 | ||
| Ocean Heat Content | 0 | ZJ | 0.650901 | 1.718926e+00 | 3.599942e+00 | 5.545718e+00 | 6.139408e+00 | 8.141200e+00 | 9.664617e+00 | 1.164614e+01 | 1.315522e+01 | 1.411625e+01 | 1.542905e+01 | 1.699180e+01 | 1.833636e+01 | 2.059774e+01 | 2.164211e+01 | |
| 1 | PJ | 865208.241574 | 2.043763e+06 | 3.774768e+06 | 5.922776e+06 | 7.079142e+06 | 9.545313e+06 | 1.088229e+07 | 1.289765e+07 | 1.461758e+07 | 1.610813e+07 | 1.770015e+07 | 1.986001e+07 | 2.151189e+07 | 2.367021e+07 | 2.459018e+07 | ||
| SLR | 0 | cm | 0.070551 | 2.477574e-01 | 4.259611e-01 | 6.136331e-01 | 8.920068e-01 | 1.079452e+00 | 1.242474e+00 | 1.444409e+00 | 1.608531e+00 | 1.817847e+00 | 2.013340e+00 | 2.199297e+00 | 2.393601e+00 | 2.623180e+00 | 2.858925e+00 | |
| 1 | mm | 0.270463 | 2.377309e+00 | 5.036762e+00 | 6.774614e+00 | 8.882436e+00 | 1.128977e+01 | 1.349902e+01 | 1.612308e+01 | 1.838989e+01 | 1.996814e+01 | 2.261414e+01 | 2.441122e+01 | 2.709250e+01 | 2.965897e+01 | 3.098979e+01 | ||
| scenario_1 | Warming | 0 | K | 0.259090 | 2.885206e+00 | 5.209213e+00 | 8.237810e+00 | 1.038582e+01 | 1.246536e+01 | 1.533547e+01 | 1.728632e+01 | 1.954900e+01 | 2.236745e+01 | 2.443807e+01 | 2.724572e+01 | 3.014275e+01 | 3.245984e+01 | 3.426680e+01 |
| 1 | K | 0.965964 | 2.882328e+00 | 5.559345e+00 | 8.926928e+00 | 1.094845e+01 | 1.361983e+01 | 1.669072e+01 | 1.875795e+01 | 2.221346e+01 | 2.468205e+01 | 2.706991e+01 | 2.969605e+01 | 3.212880e+01 | 3.495451e+01 | 3.789237e+01 | ||
| Ocean Heat Content | 0 | ZJ | 0.263489 | 2.942555e+00 | 6.115062e+00 | 9.179288e+00 | 1.232046e+01 | 1.478380e+01 | 1.760777e+01 | 2.052537e+01 | 2.338845e+01 | 2.653452e+01 | 2.933230e+01 | 3.201679e+01 | 3.565657e+01 | 3.765709e+01 | 4.144890e+01 | |
| 1 | ZJ | 0.892679 | 3.237138e+00 | 6.472755e+00 | 9.945021e+00 | 1.316671e+01 | 1.594920e+01 | 1.926676e+01 | 2.275068e+01 | 2.542213e+01 | 2.823958e+01 | 3.214017e+01 | 3.458835e+01 | 3.783455e+01 | 4.055382e+01 | 4.379847e+01 | ||
| SLR | 0 | mm | 0.076160 | 3.590359e+00 | 7.339651e+00 | 1.033474e+01 | 1.420303e+01 | 1.702518e+01 | 2.098577e+01 | 2.415842e+01 | 2.730889e+01 | 3.102532e+01 | 3.350360e+01 | 3.717154e+01 | 4.085766e+01 | 4.348654e+01 | 4.746571e+01 | |
| 1 | cm | 0.009928 | 4.103452e-01 | 7.949211e-01 | 1.108658e+00 | 1.520920e+00 | 1.798479e+00 | 2.216094e+00 | 2.530069e+00 | 2.947818e+00 | 3.234573e+00 | 3.616147e+00 | 4.005772e+00 | 4.301530e+00 | 4.665000e+00 | 5.090846e+00 |
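A series like this does not have to be written by hand. As a rough sketch, it can be derived from the data's own index, which keeps the index values consistent with the data by construction. The rule used here (choosing the unit from the variable name via the hypothetical unit_by_variable dictionary) and the small stand-in DataFrame are illustrative assumptions only.

```python
import pandas as pd

# Stand-in for the data, with the unit held in the "unit" index level
index = pd.MultiIndex.from_tuples(
    [
        ("scenario_0", "Warming", 0, "K"),
        ("scenario_0", "Ocean Heat Content", 0, "ZJ"),
        ("scenario_0", "SLR", 0, "mm"),
        ("scenario_1", "SLR", 0, "mm"),
    ],
    names=["scenario", "variable", "run", "unit"],
)
df = pd.DataFrame({1950.0: [0.1, 1.0, 10.0, 11.0]}, index=index)

# Hypothetical rule: choose the desired unit from the variable name
unit_by_variable = {"Warming": "mK", "SLR": "cm"}

# The series' index is the data's index without the unit level
index_without_unit = df.index.droplevel("unit")
variables = index_without_unit.get_level_values("variable")

# Variables without a rule map to NaN and are dropped,
# so those timeseries are simply not mentioned in the series
desired_unit = pd.Series(
    variables.map(unit_by_variable).to_numpy(),
    index=index_without_unit,
).dropna()
print(desired_unit)
```

Because the index comes from the data itself, only the keys of unit_by_variable can be mistyped.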
As above, the main gotcha is silently not doing conversions. If you make typos in the specification, this will happen. Given that the specification is more involved than a simple mapping, such typos can be much harder to spot.
desired_unit_typo = pd.Series(
["mK", "PJ", "cm", "cm"],
index=pd.MultiIndex.from_tuples(
[
("scenario_0", "Warming", 0),
("scenario_0", "Ocean Heat Content", 1),
# Typo here
("scenario_0", "SLr", 0),
("scenario_1", "SLR", 1),
],
# Note: no unit level here
names=["scenario", "variable", "run"],
),
)
desired_unit_typo
scenario variable run
scenario_0 Warming 0 mK
Ocean Heat Content 1 PJ
SLr 0 cm
scenario_1 SLR 1 cm
dtype: object
# Note that scenario_0, SLR, run 0 isn't converted because of the typo
df_multi_unit.openscm.convert_unit(desired_unit_typo)
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | run | unit | |||||||||||||||
| scenario_0 | Warming | 0 | mK | 591.109485 | 1.562867e+03 | 2.679513e+03 | 4.030553e+03 | 4.825483e+03 | 5.737766e+03 | 6.954924e+03 | 8.372727e+03 | 8.623694e+03 | 1.026134e+04 | 1.132506e+04 | 1.189042e+04 | 1.382046e+04 | 1.430335e+04 | 1.504333e+04 |
| 1 | K | 0.336865 | 1.370294e+00 | 3.160775e+00 | 4.675830e+00 | 5.966644e+00 | 7.076796e+00 | 8.240343e+00 | 9.493297e+00 | 1.112016e+01 | 1.250083e+01 | 1.377688e+01 | 1.435981e+01 | 1.625620e+01 | 1.736293e+01 | 1.864794e+01 | ||
| Ocean Heat Content | 0 | ZJ | 0.650901 | 1.718926e+00 | 3.599942e+00 | 5.545718e+00 | 6.139408e+00 | 8.141200e+00 | 9.664617e+00 | 1.164614e+01 | 1.315522e+01 | 1.411625e+01 | 1.542905e+01 | 1.699180e+01 | 1.833636e+01 | 2.059774e+01 | 2.164211e+01 | |
| 1 | PJ | 865208.241574 | 2.043763e+06 | 3.774768e+06 | 5.922776e+06 | 7.079142e+06 | 9.545313e+06 | 1.088229e+07 | 1.289765e+07 | 1.461758e+07 | 1.610813e+07 | 1.770015e+07 | 1.986001e+07 | 2.151189e+07 | 2.367021e+07 | 2.459018e+07 | ||
| SLR | 0 | mm | 0.705512 | 2.477574e+00 | 4.259611e+00 | 6.136331e+00 | 8.920068e+00 | 1.079452e+01 | 1.242474e+01 | 1.444409e+01 | 1.608531e+01 | 1.817847e+01 | 2.013340e+01 | 2.199297e+01 | 2.393601e+01 | 2.623180e+01 | 2.858925e+01 | |
| 1 | mm | 0.270463 | 2.377309e+00 | 5.036762e+00 | 6.774614e+00 | 8.882436e+00 | 1.128977e+01 | 1.349902e+01 | 1.612308e+01 | 1.838989e+01 | 1.996814e+01 | 2.261414e+01 | 2.441122e+01 | 2.709250e+01 | 2.965897e+01 | 3.098979e+01 | ||
| scenario_1 | Warming | 0 | K | 0.259090 | 2.885206e+00 | 5.209213e+00 | 8.237810e+00 | 1.038582e+01 | 1.246536e+01 | 1.533547e+01 | 1.728632e+01 | 1.954900e+01 | 2.236745e+01 | 2.443807e+01 | 2.724572e+01 | 3.014275e+01 | 3.245984e+01 | 3.426680e+01 |
| 1 | K | 0.965964 | 2.882328e+00 | 5.559345e+00 | 8.926928e+00 | 1.094845e+01 | 1.361983e+01 | 1.669072e+01 | 1.875795e+01 | 2.221346e+01 | 2.468205e+01 | 2.706991e+01 | 2.969605e+01 | 3.212880e+01 | 3.495451e+01 | 3.789237e+01 | ||
| Ocean Heat Content | 0 | ZJ | 0.263489 | 2.942555e+00 | 6.115062e+00 | 9.179288e+00 | 1.232046e+01 | 1.478380e+01 | 1.760777e+01 | 2.052537e+01 | 2.338845e+01 | 2.653452e+01 | 2.933230e+01 | 3.201679e+01 | 3.565657e+01 | 3.765709e+01 | 4.144890e+01 | |
| 1 | ZJ | 0.892679 | 3.237138e+00 | 6.472755e+00 | 9.945021e+00 | 1.316671e+01 | 1.594920e+01 | 1.926676e+01 | 2.275068e+01 | 2.542213e+01 | 2.823958e+01 | 3.214017e+01 | 3.458835e+01 | 3.783455e+01 | 4.055382e+01 | 4.379847e+01 | ||
| SLR | 0 | mm | 0.076160 | 3.590359e+00 | 7.339651e+00 | 1.033474e+01 | 1.420303e+01 | 1.702518e+01 | 2.098577e+01 | 2.415842e+01 | 2.730889e+01 | 3.102532e+01 | 3.350360e+01 | 3.717154e+01 | 4.085766e+01 | 4.348654e+01 | 4.746571e+01 | |
| 1 | cm | 0.009928 | 4.103452e-01 | 7.949211e-01 | 1.108658e+00 | 1.520920e+00 | 1.798479e+00 | 2.216094e+00 | 2.530069e+00 | 2.947818e+00 | 3.234573e+00 | 3.616147e+00 | 4.005772e+00 | 4.301530e+00 | 4.665000e+00 | 5.090846e+00 |
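If you are trying to figure out why something isn't being converted, pandas' index set operations are quite helpful. For example, the difference between the specification's index and the data's index (without its unit level) shows which rows of the specification match nothing in the data.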
rows_that_wont_be_used = desired_unit_typo.index.difference(
df_multi_unit.index.droplevel("unit")
)
rows_that_wont_be_used
MultiIndex([('scenario_0', 'SLr', 0)],
names=['scenario', 'variable', 'run'])
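If you would rather fail loudly than investigate afterwards, the same difference can be wrapped in a small guard that runs before the conversion. This is only a sketch: check_all_rows_match is a hypothetical helper of our own, not part of pandas-openscm, and it assumes the series' index has the same levels as the data's index minus the unit level.

```python
import pandas as pd


def check_all_rows_match(desired_unit, data_index, unit_level="unit"):
    """Raise if `desired_unit` names rows that are not in `data_index`."""
    unmatched = desired_unit.index.difference(data_index.droplevel(unit_level))
    if len(unmatched) > 0:
        raise KeyError(f"No matching timeseries for: {unmatched.to_list()}")


# Stand-in for the data's index
data_index = pd.MultiIndex.from_tuples(
    [
        ("scenario_0", "SLR", 0, "mm"),
        ("scenario_1", "SLR", 1, "mm"),
    ],
    names=["scenario", "variable", "run", "unit"],
)

desired_unit_typo = pd.Series(
    ["cm", "cm"],
    index=pd.MultiIndex.from_tuples(
        [
            ("scenario_0", "SLr", 0),  # typo here
            ("scenario_1", "SLR", 1),
        ],
        names=["scenario", "variable", "run"],
    ),
)

error_message = None
try:
    check_all_rows_match(desired_unit_typo, data_index)
except KeyError as exc:
    error_message = str(exc)

print(error_message)
```

Calling the guard just before convert_unit turns the silent skip into an explicit error that names the offending rows.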
Unit registries and pint¶
The unit conversion is all done with the pint package by default.
Pint is built around the idea of unit registries.
The registry to use can be passed via the ur argument.
If it is not specified, we use whatever is returned from
pint.get_application_registry().
This is pint's way of setting the default registry for whatever you are doing.
By default, it returns pint's default registry
but you can set a different registry for whatever work you're doing
with pint.set_application_registry().
If you're doing climate work, especially related to emissions, you often want 'emissions units' like "Mt CO2/yr". These are not recognised by default by Pint so you get errors if you try to convert them.
df_emissions = create_test_df(
variables=(
("co2", "Mt CO2 / yr"),
("ch4", "Mt CH4 / yr"),
("hfc23", "kt HFC23 / yr"),
),
n_scenarios=2,
n_runs=1,
timepoints=np.arange(1950.0, 1965.0),
).reset_index("run", drop=True)
df_emissions
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | unit | |||||||||||||||
| scenario_0 | co2 | Mt CO2 / yr | 0.334778 | 1.343045 | 3.124794 | 3.272307 | 4.958702 | 5.944578 | 6.680654 | 7.912122 | 8.619314 | 9.952941 | 10.776197 | 12.215116 | 13.353068 | 14.414149 | 15.376208 |
| ch4 | Mt CH4 / yr | 0.998931 | 2.016692 | 3.866751 | 5.128289 | 6.864509 | 8.165403 | 9.513507 | 11.520793 | 13.008829 | 14.763937 | 16.285959 | 17.693262 | 19.739902 | 20.869994 | 22.763815 | |
| hfc23 | kt HFC23 / yr | 0.096459 | 2.362880 | 4.649200 | 6.267509 | 8.347073 | 10.849019 | 13.307531 | 15.084751 | 17.535367 | 18.670190 | 21.656035 | 23.644925 | 25.407146 | 27.373390 | 29.178703 | |
| scenario_1 | co2 | Mt CO2 / yr | 0.352810 | 2.784526 | 5.512495 | 7.798160 | 10.597410 | 13.338805 | 15.476393 | 18.432925 | 21.189186 | 23.373402 | 25.960067 | 29.278525 | 31.041590 | 34.372871 | 36.309774 |
| ch4 | Mt CH4 / yr | 0.898024 | 4.009457 | 6.696396 | 9.721175 | 12.543608 | 15.813440 | 18.839671 | 21.642711 | 25.236076 | 28.185402 | 30.872498 | 33.859299 | 37.657214 | 40.230637 | 43.425710 | |
| hfc23 | kt HFC23 / yr | 0.274082 | 4.547892 | 7.396640 | 11.445659 | 15.069731 | 18.415992 | 22.302893 | 25.495154 | 28.598960 | 32.982734 | 35.801677 | 39.711390 | 43.080513 | 46.818480 | 50.332368 |
As noted above, the default unit registry does not know about emissions units, so if we try to convert this data, we receive an error.
try:
df_emissions.openscm.convert_unit({"Mt CO2 / yr": "GtC / yr"})
except pint.UndefinedUnitError:
traceback.print_exc(limit=0)
pint.errors.UndefinedUnitError: 'CO2' is not defined in the unit registry
If we specify openscm-units' registry instead, the conversion will work.
df_emissions.openscm.convert_unit(
{"Mt CO2 / yr": "GtC / yr"}, ur=openscm_units.unit_registry
)
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | unit | |||||||||||||||
| scenario_0 | co2 | GtC / yr | 0.000091 | 0.000366 | 0.000852 | 0.000892 | 0.001352 | 0.001621 | 0.001822 | 0.002158 | 0.002351 | 0.002714 | 0.002939 | 0.003331 | 0.003642 | 0.003931 | 0.004194 |
| ch4 | Mt CH4 / yr | 0.998931 | 2.016692 | 3.866751 | 5.128289 | 6.864509 | 8.165403 | 9.513507 | 11.520793 | 13.008829 | 14.763937 | 16.285959 | 17.693262 | 19.739902 | 20.869994 | 22.763815 | |
| hfc23 | kt HFC23 / yr | 0.096459 | 2.362880 | 4.649200 | 6.267509 | 8.347073 | 10.849019 | 13.307531 | 15.084751 | 17.535367 | 18.670190 | 21.656035 | 23.644925 | 25.407146 | 27.373390 | 29.178703 | |
| scenario_1 | co2 | GtC / yr | 0.000096 | 0.000759 | 0.001503 | 0.002127 | 0.002890 | 0.003638 | 0.004221 | 0.005027 | 0.005779 | 0.006375 | 0.007080 | 0.007985 | 0.008466 | 0.009374 | 0.009903 |
| ch4 | Mt CH4 / yr | 0.898024 | 4.009457 | 6.696396 | 9.721175 | 12.543608 | 15.813440 | 18.839671 | 21.642711 | 25.236076 | 28.185402 | 30.872498 | 33.859299 | 37.657214 | 40.230637 | 43.425710 | |
| hfc23 | kt HFC23 / yr | 0.274082 | 4.547892 | 7.396640 | 11.445659 | 15.069731 | 18.415992 | 22.302893 | 25.495154 | 28.598960 | 32.982734 | 35.801677 | 39.711390 | 43.080513 | 46.818480 | 50.332368 |
If we set the application registry to openscm-units' registry, then we do not need to pass the registry every time we want to do such a conversion.
pint.set_application_registry(openscm_units.unit_registry)
# Now the conversion works without specifying the registry
df_emissions.openscm.convert_unit({"Mt CO2 / yr": "GtC / yr"})
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | unit | |||||||||||||||
| scenario_0 | co2 | GtC / yr | 0.000091 | 0.000366 | 0.000852 | 0.000892 | 0.001352 | 0.001621 | 0.001822 | 0.002158 | 0.002351 | 0.002714 | 0.002939 | 0.003331 | 0.003642 | 0.003931 | 0.004194 |
| ch4 | Mt CH4 / yr | 0.998931 | 2.016692 | 3.866751 | 5.128289 | 6.864509 | 8.165403 | 9.513507 | 11.520793 | 13.008829 | 14.763937 | 16.285959 | 17.693262 | 19.739902 | 20.869994 | 22.763815 | |
| hfc23 | kt HFC23 / yr | 0.096459 | 2.362880 | 4.649200 | 6.267509 | 8.347073 | 10.849019 | 13.307531 | 15.084751 | 17.535367 | 18.670190 | 21.656035 | 23.644925 | 25.407146 | 27.373390 | 29.178703 | |
| scenario_1 | co2 | GtC / yr | 0.000096 | 0.000759 | 0.001503 | 0.002127 | 0.002890 | 0.003638 | 0.004221 | 0.005027 | 0.005779 | 0.006375 | 0.007080 | 0.007985 | 0.008466 | 0.009374 | 0.009903 |
| ch4 | Mt CH4 / yr | 0.898024 | 4.009457 | 6.696396 | 9.721175 | 12.543608 | 15.813440 | 18.839671 | 21.642711 | 25.236076 | 28.185402 | 30.872498 | 33.859299 | 37.657214 | 40.230637 | 43.425710 | |
| hfc23 | kt HFC23 / yr | 0.274082 | 4.547892 | 7.396640 | 11.445659 | 15.069731 | 18.415992 | 22.302893 | 25.495154 | 28.598960 | 32.982734 | 35.801677 | 39.711390 | 43.080513 | 46.818480 | 50.332368 |
Contexts¶
Pint supports the idea of contexts. Within a context, conversions that would normally not be allowed can be allowed. Pint's docs give good examples of cases where this is useful. For emissions work, the key one is CO2-equivalent units. Thanks to Pint's contexts and the unit conversion API, converting to CO2-equivalent units becomes trivial.
with openscm_units.unit_registry.context("AR6GWP100"):
df_emissions_co2_eq = df_emissions.openscm.convert_unit(
"Mt CO2 / yr", ur=openscm_units.unit_registry
)
df_emissions_co2_eq
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | unit | |||||||||||||||
| scenario_0 | co2 | Mt CO2 / yr | 0.334778 | 1.343045 | 3.124794 | 3.272307 | 4.958702 | 5.944578 | 6.680654 | 7.912122 | 8.619314 | 9.952941 | 10.776197 | 12.215116 | 13.353068 | 14.414149 | 15.376208 |
| ch4 | Mt CO2 / yr | 27.870180 | 56.265714 | 107.882341 | 143.079267 | 191.519795 | 227.814756 | 265.426838 | 321.430128 | 362.946320 | 411.913849 | 454.378261 | 493.642006 | 550.743270 | 582.272836 | 635.110428 | |
| hfc23 | Mt CO2 / yr | 1.408295 | 34.498050 | 67.878316 | 91.505630 | 121.867259 | 158.395677 | 194.289958 | 220.237363 | 256.016356 | 272.584769 | 316.178111 | 345.215909 | 370.944330 | 399.651490 | 426.009063 | |
| scenario_1 | co2 | Mt CO2 / yr | 0.352810 | 2.784526 | 5.512495 | 7.798160 | 10.597410 | 13.338805 | 15.476393 | 18.432925 | 21.189186 | 23.373402 | 25.960067 | 29.278525 | 31.041590 | 34.372871 | 36.309774 |
| ch4 | Mt CO2 / yr | 25.054861 | 111.863837 | 186.829462 | 271.220770 | 349.966658 | 441.194977 | 525.626823 | 603.831637 | 704.086534 | 786.372710 | 861.342703 | 944.674442 | 1050.636258 | 1122.434767 | 1211.577296 | |
| hfc23 | Mt CO2 / yr | 4.001590 | 66.399219 | 107.990941 | 167.106622 | 220.018077 | 268.873490 | 325.622239 | 372.229255 | 417.544822 | 481.547911 | 522.704479 | 579.786292 | 628.975491 | 683.549812 | 734.852566 |
# From here, calculating quantities such as total CO2-equivalent emissions
# is straightforward, e.g.
df_emissions_co2_eq.openscm.groupby_except("variable").sum().pix.assign(variable="ghg")
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | 1960.0 | 1961.0 | 1962.0 | 1963.0 | 1964.0 | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | unit | variable | |||||||||||||||
| scenario_0 | Mt CO2 / yr | ghg | 29.613252 | 92.106810 | 178.885452 | 237.857205 | 318.345756 | 392.155011 | 466.397450 | 549.579612 | 627.581990 | 694.451559 | 781.332569 | 851.073031 | 935.040667 | 996.338475 | 1076.495699 |
| scenario_1 | Mt CO2 / yr | ghg | 29.409261 | 181.047582 | 300.332898 | 446.125552 | 580.582145 | 723.407272 | 866.725455 | 994.493817 | 1142.820542 | 1291.294023 | 1410.007249 | 1553.739258 | 1710.653340 | 1840.357451 | 1982.739636 |
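The arithmetic behind the CO2-equivalent conversion and the sum can be checked by hand. The sketch below is not how Pint or pandas-openscm do it. It assumes GWP100 factors of 27.9 for CH4 and 14600 for HFC23, which we inferred from the ratios between the converted and unconverted values in the outputs in this section, and all names in it are hypothetical.

```python
# Hand-rolled cross-check, independent of pint and pandas-openscm.
# Assumption: GWP100 factors (mass of CO2 per mass of gas) inferred from
# the ratio of converted to unconverted values in the notebook output.
GWP100 = {"co2": 1.0, "ch4": 27.9, "hfc23": 14600.0}

# Factor to go from each variable's original mass unit to Mt
TO_MT = {"co2": 1.0, "ch4": 1.0, "hfc23": 1.0e-3}  # hfc23 is reported in kt

# scenario_0 values for 1950 in their original units
emissions_1950 = {"co2": 0.334778, "ch4": 0.998931, "hfc23": 0.096459}

# Convert every gas to Mt CO2 / yr, then sum to get total GHG emissions
co2_eq = {
    gas: value * TO_MT[gas] * GWP100[gas] for gas, value in emissions_1950.items()
}
total_ghg = sum(co2_eq.values())
print(co2_eq)
print(total_ghg)
```

Up to rounding of the displayed inputs, the result matches the 1950.0 values for scenario_0 in both the converted output and the ghg total.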
Convert unit like¶
A common scenario is wanting to compare two datasets.
In such cases, life is much easier if they have the same unit.
To support this case, we provide the convert_unit_like API.
This is essentially a wrapper around
convert_unit_from_target_series that works out the desired units
from the data we would like to match.
If the logic included in convert_unit_like doesn't fit your use case,
then we suggest making your desired units by hand and then directly using
convert_unit_from_target_series or convert_unit instead.
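To make the idea of working out the desired units concrete, here is a minimal pure-Python sketch of one way such matching could work. It is not the implementation used by convert_unit_like. The function infer_target_units, the row dictionaries, and the choice to raise ValueError and to keep the current unit for unmatched rows are all assumptions made for illustration.

```python
from collections import defaultdict


def infer_target_units(source_rows, target_rows):
    """
    Look up the desired unit for each source row (hypothetical helper).

    Each row is a dict of index-level name -> value, including "unit".
    """
    # Levels (other than unit) that appear in both datasets
    shared = sorted((set(source_rows[0]) & set(target_rows[0])) - {"unit"})

    # Collect every unit the target uses for each combination of shared levels
    candidates = defaultdict(set)
    for row in target_rows:
        candidates[tuple(row[level] for level in shared)].add(row["unit"])

    # More than one candidate unit means we cannot pick a target unit
    ambiguous = {key: units for key, units in candidates.items() if len(units) > 1}
    if ambiguous:
        raise ValueError(f"Ambiguous target units: {ambiguous}")

    result = {}
    for row in source_rows:
        key = tuple(row[level] for level in shared)
        # Assumption of this sketch: unmatched rows keep their current unit
        units = candidates.get(key)
        result[key] = next(iter(units)) if units else row["unit"]
    return result


scenario_rows = [
    {"scenario": "scenario_0", "variable": "co2", "unit": "Mt CO2 / yr"},
    {"scenario": "scenario_0", "variable": "ch4", "unit": "Mt CH4 / yr"},
]
history_rows = [
    {"variable": "co2", "unit": "Gt CO2 / yr"},
    {"variable": "ch4", "unit": "kt CH4 / yr"},
]
units = infer_target_units(scenario_rows, history_rows)
print(units)

# If the target uses two different units for the same variable,
# there is no single answer, so the sketch raises instead of guessing.
differing_rows = [
    {"scenario": "scenario_0", "variable": "co2", "unit": "Gt CO2 / yr"},
    {"scenario": "scenario_1", "variable": "co2", "unit": "Mt CO2 / yr"},
]
try:
    infer_target_units(history_rows, differing_rows)
except ValueError as exc:
    print(exc)
```

Only the index levels shared by both datasets are used for the lookup, so a target with extra levels (such as scenario) is only usable when those extra levels do not change the unit.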
Let's imagine we have scenario data like the below.
df_scenarios = create_test_df(
variables=(
("co2", "Mt CO2 / yr"),
("ch4", "Mt CH4 / yr"),
("hfc23", "kt HFC23 / yr"),
),
n_scenarios=2,
n_runs=1,
timepoints=np.arange(2025.0, 2100.0 + 1.0),
).reset_index("run", drop=True)
df_scenarios
| 2025.0 | 2026.0 | 2027.0 | 2028.0 | 2029.0 | 2030.0 | 2031.0 | 2032.0 | 2033.0 | 2034.0 | ... | 2091.0 | 2092.0 | 2093.0 | 2094.0 | 2095.0 | 2096.0 | 2097.0 | 2098.0 | 2099.0 | 2100.0 | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | unit | |||||||||||||||||||||
| scenario_0 | co2 | Mt CO2 / yr | 0.630829 | 0.374979 | 1.082395 | 0.672892 | 1.647849 | 1.460875 | 1.667553 | 2.055225 | 2.580004 | 2.606394 | ... | 13.661322 | 14.218882 | 14.532644 | 14.521115 | 14.417093 | 14.997389 | 14.927472 | 14.802409 | 15.562887 | 15.202732 |
| ch4 | Mt CH4 / yr | 0.222510 | 1.257801 | 0.604924 | 1.345336 | 1.646566 | 1.607872 | 2.543852 | 2.624976 | 3.152217 | 3.554689 | ... | 20.112408 | 20.078811 | 20.536215 | 21.187019 | 21.165569 | 20.991694 | 21.522563 | 22.409479 | 21.946998 | 22.139521 | |
| hfc23 | kt HFC23 / yr | 0.907473 | 0.761665 | 1.352878 | 1.591107 | 1.988541 | 2.354604 | 2.801807 | 3.638481 | 3.854258 | 4.320065 | ... | 25.942678 | 26.596298 | 27.239053 | 26.792063 | 27.380232 | 28.124666 | 28.204998 | 28.594711 | 28.775431 | 29.442063 | |
| scenario_1 | co2 | Mt CO2 / yr | 0.333148 | 0.722961 | 0.982739 | 1.705736 | 2.422920 | 2.841952 | 3.028106 | 4.066724 | 4.333335 | 4.768300 | ... | 32.645639 | 32.864509 | 33.055400 | 33.461216 | 33.668035 | 34.977745 | 34.963627 | 35.109885 | 36.382614 | 36.341592 |
| ch4 | Mt CH4 / yr | 0.760049 | 1.455638 | 1.874968 | 1.879505 | 2.596289 | 3.134631 | 3.501749 | 4.417489 | 5.080045 | 5.316956 | ... | 38.503701 | 38.823542 | 39.769538 | 40.194249 | 40.160799 | 41.322017 | 42.050710 | 41.899587 | 42.455124 | 43.850286 | |
| hfc23 | kt HFC23 / yr | 0.696459 | 0.896275 | 2.092961 | 2.249874 | 2.802661 | 3.345677 | 4.618293 | 4.898255 | 6.250212 | 6.213745 | ... | 44.342354 | 45.581363 | 45.723373 | 46.069332 | 47.138450 | 47.379370 | 48.533851 | 48.734898 | 49.388509 | 50.959111 |
6 rows × 76 columns
Then we have some historical data.
df_history = create_test_df(
variables=(
("co2", "Gt CO2 / yr"),
("ch4", "kt CH4 / yr"),
("hfc23", "t HFC23 / yr"),
),
n_scenarios=1,
n_runs=1,
timepoints=np.arange(1950.0, 2024.0 + 1.0),
).reset_index(["run", "scenario"], drop=True)
df_history
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | ... | 2015.0 | 2016.0 | 2017.0 | 2018.0 | 2019.0 | 2020.0 | 2021.0 | 2022.0 | 2023.0 | 2024.0 | ||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| variable | unit | |||||||||||||||||||||
| co2 | Gt CO2 / yr | 0.978043 | 0.773526 | 0.912909 | 1.452641 | 1.636037 | 1.155063 | 1.692588 | 1.591687 | 2.135544 | 2.410116 | ... | 14.148612 | 13.686054 | 14.140349 | 14.744365 | 14.366217 | 14.316261 | 15.365074 | 15.358737 | 15.158790 | 15.171308 |
| ch4 | kt CH4 / yr | 0.783886 | 0.804794 | 1.004127 | 1.346120 | 1.772816 | 2.696717 | 3.012730 | 3.242307 | 3.956110 | 4.423999 | ... | 28.619005 | 29.193156 | 29.756717 | 30.491717 | 31.096479 | 31.225626 | 31.814345 | 32.176674 | 32.337564 | 33.020764 |
| hfc23 | t HFC23 / yr | 0.855126 | 1.413841 | 1.694239 | 2.726967 | 2.738134 | 3.550249 | 4.659124 | 5.119329 | 6.033273 | 6.537820 | ... | 44.683132 | 44.757387 | 46.246930 | 46.552409 | 46.816337 | 47.885555 | 48.101853 | 49.577580 | 49.781345 | 50.897692 |
3 rows × 75 columns
We can simply convert the scenario data to have the same units as the history
with convert_unit_like.
df_scenarios_like_history = df_scenarios.openscm.convert_unit_like(df_history)
df_scenarios_like_history
| 2025.0 | 2026.0 | 2027.0 | 2028.0 | 2029.0 | 2030.0 | 2031.0 | 2032.0 | 2033.0 | 2034.0 | ... | 2091.0 | 2092.0 | 2093.0 | 2094.0 | 2095.0 | 2096.0 | 2097.0 | 2098.0 | 2099.0 | 2100.0 | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | unit | |||||||||||||||||||||
| scenario_0 | co2 | Gt CO2 / yr | 0.000631 | 0.000375 | 0.001082 | 0.000673 | 0.001648 | 0.001461 | 0.001668 | 0.002055 | 0.002580 | 0.002606 | ... | 0.013661 | 0.014219 | 0.014533 | 0.014521 | 0.014417 | 0.014997 | 0.014927 | 0.014802 | 0.015563 | 0.015203 |
| ch4 | kt CH4 / yr | 222.509659 | 1257.801096 | 604.923605 | 1345.336175 | 1646.565964 | 1607.872465 | 2543.852483 | 2624.976289 | 3152.217194 | 3554.688682 | ... | 20112.408086 | 20078.811141 | 20536.214563 | 21187.019150 | 21165.568877 | 20991.693947 | 21522.563090 | 22409.478620 | 21946.997626 | 22139.521296 | |
| hfc23 | t HFC23 / yr | 907.473110 | 761.664878 | 1352.878260 | 1591.107459 | 1988.541339 | 2354.603867 | 2801.806841 | 3638.480910 | 3854.257636 | 4320.064950 | ... | 25942.677829 | 26596.297594 | 27239.053313 | 26792.062682 | 27380.232365 | 28124.665905 | 28204.998273 | 28594.710871 | 28775.430798 | 29442.062543 | |
| scenario_1 | co2 | Gt CO2 / yr | 0.000333 | 0.000723 | 0.000983 | 0.001706 | 0.002423 | 0.002842 | 0.003028 | 0.004067 | 0.004333 | 0.004768 | ... | 0.032646 | 0.032865 | 0.033055 | 0.033461 | 0.033668 | 0.034978 | 0.034964 | 0.035110 | 0.036383 | 0.036342 |
| ch4 | kt CH4 / yr | 760.048637 | 1455.638181 | 1874.968334 | 1879.505340 | 2596.288576 | 3134.630982 | 3501.749307 | 4417.489063 | 5080.044930 | 5316.956134 | ... | 38503.701140 | 38823.541952 | 39769.537508 | 40194.248583 | 40160.799016 | 41322.016568 | 42050.710298 | 41899.587030 | 42455.124228 | 43850.285509 | |
| hfc23 | t HFC23 / yr | 696.459075 | 896.275118 | 2092.961092 | 2249.873631 | 2802.661319 | 3345.676609 | 4618.293460 | 4898.255256 | 6250.212111 | 6213.744511 | ... | 44342.353562 | 45581.362736 | 45723.372766 | 46069.332367 | 47138.449518 | 47379.369970 | 48533.851142 | 48734.897994 | 49388.508796 | 50959.110554 |
6 rows × 76 columns
The functional equivalent of the above is convert_unit_like.
pd.testing.assert_frame_equal(
df_scenarios.openscm.convert_unit_like(df_history),
convert_unit_like(df_scenarios, df_history),
)
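The converted scenario values can also be sanity-checked by hand, since these conversions only change the mass prefix. The sketch below is independent of Pint. The lookup table and the rescale helper are hypothetical names introduced for illustration, and the inputs are the displayed (rounded) 2025 values for scenario_0.

```python
# Size of each mass unit expressed in tonnes
MASS_UNIT_IN_TONNES = {"t": 1.0, "kt": 1.0e3, "Mt": 1.0e6, "Gt": 1.0e9}


def rescale(value: float, from_unit: str, to_unit: str) -> float:
    """Rescale a value between mass units (hypothetical helper)."""
    return value * MASS_UNIT_IN_TONNES[from_unit] / MASS_UNIT_IN_TONNES[to_unit]


# scenario_0 values for 2025, converted from the scenario units to the history units
checks = {
    "co2": rescale(0.630829, "Mt", "Gt"),  # Mt CO2 / yr -> Gt CO2 / yr
    "ch4": rescale(0.222510, "Mt", "kt"),  # Mt CH4 / yr -> kt CH4 / yr
    "hfc23": rescale(0.907473, "kt", "t"),  # kt HFC23 / yr -> t HFC23 / yr
}
print(checks)
```

Small differences from the converted output come from the inputs here being rounded to six decimal places.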
For scenarios like this, where the desired units are clear, we can also do the reverse operation.
df_history.openscm.convert_unit_like(df_scenarios)
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | ... | 2015.0 | 2016.0 | 2017.0 | 2018.0 | 2019.0 | 2020.0 | 2021.0 | 2022.0 | 2023.0 | 2024.0 | ||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| variable | unit | |||||||||||||||||||||
| co2 | Mt CO2 / yr | 978.042673 | 773.525960 | 912.909168 | 1452.641459 | 1636.036852 | 1155.062555 | 1692.587556 | 1591.686524 | 2135.544372 | 2410.116117 | ... | 14148.611951 | 13686.053976 | 14140.349136 | 14744.364768 | 14366.217023 | 14316.260674 | 15365.073914 | 15358.737229 | 15158.790498 | 15171.307770 |
| ch4 | Mt CH4 / yr | 0.000784 | 0.000805 | 0.001004 | 0.001346 | 0.001773 | 0.002697 | 0.003013 | 0.003242 | 0.003956 | 0.004424 | ... | 0.028619 | 0.029193 | 0.029757 | 0.030492 | 0.031096 | 0.031226 | 0.031814 | 0.032177 | 0.032338 | 0.033021 |
| hfc23 | kt HFC23 / yr | 0.000855 | 0.001414 | 0.001694 | 0.002727 | 0.002738 | 0.003550 | 0.004659 | 0.005119 | 0.006033 | 0.006538 | ... | 0.044683 | 0.044757 | 0.046247 | 0.046552 | 0.046816 | 0.047886 | 0.048102 | 0.049578 | 0.049781 | 0.050898 |
3 rows × 75 columns
However, if the scenarios themselves had different units, then the target unit for a given timeseries in history would be ambiguous and we would get an error.
df_scenarios_differing_units = df_scenarios.openscm.set_index_levels(
{
"unit": [
"Gt CO2 / yr",
"kt CH4 / yr",
"kt HFC23 / yr",
"Mt CO2 / yr",
"Mt CH4 / yr",
"kt HFC23 / yr",
]
}
)
# Note that e.g. scenario_0 uses Gt CO2 / yr for co2
# while scenario_1 uses Mt CO2 / yr
df_scenarios_differing_units
| 2025.0 | 2026.0 | 2027.0 | 2028.0 | 2029.0 | 2030.0 | 2031.0 | 2032.0 | 2033.0 | 2034.0 | ... | 2091.0 | 2092.0 | 2093.0 | 2094.0 | 2095.0 | 2096.0 | 2097.0 | 2098.0 | 2099.0 | 2100.0 | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| scenario | variable | unit | |||||||||||||||||||||
| scenario_0 | co2 | Gt CO2 / yr | 0.630829 | 0.374979 | 1.082395 | 0.672892 | 1.647849 | 1.460875 | 1.667553 | 2.055225 | 2.580004 | 2.606394 | ... | 13.661322 | 14.218882 | 14.532644 | 14.521115 | 14.417093 | 14.997389 | 14.927472 | 14.802409 | 15.562887 | 15.202732 |
| ch4 | kt CH4 / yr | 0.222510 | 1.257801 | 0.604924 | 1.345336 | 1.646566 | 1.607872 | 2.543852 | 2.624976 | 3.152217 | 3.554689 | ... | 20.112408 | 20.078811 | 20.536215 | 21.187019 | 21.165569 | 20.991694 | 21.522563 | 22.409479 | 21.946998 | 22.139521 | |
| hfc23 | kt HFC23 / yr | 0.907473 | 0.761665 | 1.352878 | 1.591107 | 1.988541 | 2.354604 | 2.801807 | 3.638481 | 3.854258 | 4.320065 | ... | 25.942678 | 26.596298 | 27.239053 | 26.792063 | 27.380232 | 28.124666 | 28.204998 | 28.594711 | 28.775431 | 29.442063 | |
| scenario_1 | co2 | Mt CO2 / yr | 0.333148 | 0.722961 | 0.982739 | 1.705736 | 2.422920 | 2.841952 | 3.028106 | 4.066724 | 4.333335 | 4.768300 | ... | 32.645639 | 32.864509 | 33.055400 | 33.461216 | 33.668035 | 34.977745 | 34.963627 | 35.109885 | 36.382614 | 36.341592 |
| ch4 | Mt CH4 / yr | 0.760049 | 1.455638 | 1.874968 | 1.879505 | 2.596289 | 3.134631 | 3.501749 | 4.417489 | 5.080045 | 5.316956 | ... | 38.503701 | 38.823542 | 39.769538 | 40.194249 | 40.160799 | 41.322017 | 42.050710 | 41.899587 | 42.455124 | 43.850286 | |
| hfc23 | kt HFC23 / yr | 0.696459 | 0.896275 | 2.092961 | 2.249874 | 2.802661 | 3.345677 | 4.618293 | 4.898255 | 6.250212 | 6.213745 | ... | 44.342354 | 45.581363 | 45.723373 | 46.069332 | 47.138450 | 47.379370 | 48.533851 | 48.734898 | 49.388509 | 50.959111 |
6 rows × 76 columns
try:
df_history.openscm.convert_unit_like(df_scenarios_differing_units)
except AmbiguousTargetUnitError:
traceback.print_exc(limit=0)
pandas_openscm.unit_conversion.AmbiguousTargetUnitError: `pobj` has pobj.index.names=FrozenList(['variable', 'unit']). `target` has target.index.names=FrozenList(['scenario', 'variable', 'unit']). The index levels in `target` that are also in `pobj` are ['variable']. When we only look at these levels, the desired unit looks like:
variable
co2 Gt CO2 / yr
ch4 kt CH4 / yr
hfc23 kt HFC23 / yr
co2 Mt CO2 / yr
ch4 Mt CH4 / yr
Name: unit, dtype: object
The unit to use isn't unambiguous for the following metadata:
variable
co2 Gt CO2 / yr
ch4 kt CH4 / yr
co2 Mt CO2 / yr
ch4 Mt CH4 / yr
Name: unit, dtype: object
The drivers of this ambiguity are the following metadata levels in `target`
MultiIndex([('scenario_0', 'co2', 'Gt CO2 / yr'),
('scenario_0', 'ch4', 'kt CH4 / yr'),
('scenario_1', 'co2', 'Mt CO2 / yr'),
('scenario_1', 'ch4', 'Mt CH4 / yr')],
names=['scenario', 'variable', 'unit'])
In such a case, we can instead create our desired units ourselves
and directly call convert_unit or convert_unit_from_target_series.
desired_units = (
df_scenarios_differing_units.loc[pix.isin(scenario="scenario_0")]
.index.droplevel("scenario")
.to_frame()["unit"]
.reset_index("unit", drop=True)
)
desired_units
variable
co2 Gt CO2 / yr
ch4 kt CH4 / yr
hfc23 kt HFC23 / yr
Name: unit, dtype: object
df_history.openscm.convert_unit(desired_units)
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | ... | 2015.0 | 2016.0 | 2017.0 | 2018.0 | 2019.0 | 2020.0 | 2021.0 | 2022.0 | 2023.0 | 2024.0 | ||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| variable | unit | |||||||||||||||||||||
| co2 | Gt CO2 / yr | 0.978043 | 0.773526 | 0.912909 | 1.452641 | 1.636037 | 1.155063 | 1.692588 | 1.591687 | 2.135544 | 2.410116 | ... | 14.148612 | 13.686054 | 14.140349 | 14.744365 | 14.366217 | 14.316261 | 15.365074 | 15.358737 | 15.158790 | 15.171308 |
| ch4 | kt CH4 / yr | 0.783886 | 0.804794 | 1.004127 | 1.346120 | 1.772816 | 2.696717 | 3.012730 | 3.242307 | 3.956110 | 4.423999 | ... | 28.619005 | 29.193156 | 29.756717 | 30.491717 | 31.096479 | 31.225626 | 31.814345 | 32.176674 | 32.337564 | 33.020764 |
| hfc23 | kt HFC23 / yr | 0.000855 | 0.001414 | 0.001694 | 0.002727 | 0.002738 | 0.003550 | 0.004659 | 0.005119 | 0.006033 | 0.006538 | ... | 0.044683 | 0.044757 | 0.046247 | 0.046552 | 0.046816 | 0.047886 | 0.048102 | 0.049578 | 0.049781 | 0.050898 |
3 rows × 75 columns
# The functional equivalent using convert_unit_from_target_series
convert_unit_from_target_series(df_history, desired_units)
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | ... | 2015.0 | 2016.0 | 2017.0 | 2018.0 | 2019.0 | 2020.0 | 2021.0 | 2022.0 | 2023.0 | 2024.0 | ||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| variable | unit | |||||||||||||||||||||
| co2 | Gt CO2 / yr | 0.978043 | 0.773526 | 0.912909 | 1.452641 | 1.636037 | 1.155063 | 1.692588 | 1.591687 | 2.135544 | 2.410116 | ... | 14.148612 | 13.686054 | 14.140349 | 14.744365 | 14.366217 | 14.316261 | 15.365074 | 15.358737 | 15.158790 | 15.171308 |
| ch4 | kt CH4 / yr | 0.783886 | 0.804794 | 1.004127 | 1.346120 | 1.772816 | 2.696717 | 3.012730 | 3.242307 | 3.956110 | 4.423999 | ... | 28.619005 | 29.193156 | 29.756717 | 30.491717 | 31.096479 | 31.225626 | 31.814345 | 32.176674 | 32.337564 | 33.020764 |
| hfc23 | kt HFC23 / yr | 0.000855 | 0.001414 | 0.001694 | 0.002727 | 0.002738 | 0.003550 | 0.004659 | 0.005119 | 0.006033 | 0.006538 | ... | 0.044683 | 0.044757 | 0.046247 | 0.046552 | 0.046816 | 0.047886 | 0.048102 | 0.049578 | 0.049781 | 0.050898 |
3 rows × 75 columns
Converting the units of two pd.DataFrames so that they match makes further operations,
like concatenating the two datasets
or adding/subtracting them,
much simpler.
df_full_timeseries = pd.concat(
[
v.dropna(how="all", axis="columns")
for v in df_history.align(df_scenarios_like_history)
],
axis="columns",
)
df_full_timeseries
| 1950.0 | 1951.0 | 1952.0 | 1953.0 | 1954.0 | 1955.0 | 1956.0 | 1957.0 | 1958.0 | 1959.0 | ... | 2091.0 | 2092.0 | 2093.0 | 2094.0 | 2095.0 | 2096.0 | 2097.0 | 2098.0 | 2099.0 | 2100.0 | |||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| variable | unit | scenario | |||||||||||||||||||||
| ch4 | kt CH4 / yr | scenario_0 | 0.783886 | 0.804794 | 1.004127 | 1.346120 | 1.772816 | 2.696717 | 3.012730 | 3.242307 | 3.956110 | 4.423999 | ... | 20112.408086 | 20078.811141 | 20536.214563 | 21187.019150 | 21165.568877 | 20991.693947 | 21522.563090 | 22409.478620 | 21946.997626 | 22139.521296 |
| scenario_1 | 0.783886 | 0.804794 | 1.004127 | 1.346120 | 1.772816 | 2.696717 | 3.012730 | 3.242307 | 3.956110 | 4.423999 | ... | 38503.701140 | 38823.541952 | 39769.537508 | 40194.248583 | 40160.799016 | 41322.016568 | 42050.710298 | 41899.587030 | 42455.124228 | 43850.285509 | ||
| co2 | Gt CO2 / yr | scenario_0 | 0.978043 | 0.773526 | 0.912909 | 1.452641 | 1.636037 | 1.155063 | 1.692588 | 1.591687 | 2.135544 | 2.410116 | ... | 0.013661 | 0.014219 | 0.014533 | 0.014521 | 0.014417 | 0.014997 | 0.014927 | 0.014802 | 0.015563 | 0.015203 |
| scenario_1 | 0.978043 | 0.773526 | 0.912909 | 1.452641 | 1.636037 | 1.155063 | 1.692588 | 1.591687 | 2.135544 | 2.410116 | ... | 0.032646 | 0.032865 | 0.033055 | 0.033461 | 0.033668 | 0.034978 | 0.034964 | 0.035110 | 0.036383 | 0.036342 | ||
| hfc23 | t HFC23 / yr | scenario_0 | 0.855126 | 1.413841 | 1.694239 | 2.726967 | 2.738134 | 3.550249 | 4.659124 | 5.119329 | 6.033273 | 6.537820 | ... | 25942.677829 | 26596.297594 | 27239.053313 | 26792.062682 | 27380.232365 | 28124.665905 | 28204.998273 | 28594.710871 | 28775.430798 | 29442.062543 |
| scenario_1 | 0.855126 | 1.413841 | 1.694239 | 2.726967 | 2.738134 | 3.550249 | 4.659124 | 5.119329 | 6.033273 | 6.537820 | ... | 44342.353562 | 45581.362736 | 45723.372766 | 46069.332367 | 47138.449518 | 47379.369970 | 48533.851142 | 48734.897994 | 49388.508796 | 50959.110554 |
6 rows × 151 columns
Summary¶
Here you have seen pandas-openscm's unit conversion APIs. We hope these are helpful. If you run into any problems, please raise an issue.