OpenSCMDB back-end speed#
Here we present an analysis of the speed of different back-ends available to use with OpenSCMDB. For each back-end, we show how fast it is to read and write a collection of timeseries. We consider use cases that mirror the use cases we have. If you have a use case that you would like us to include in this analysis, please raise an issue.
Run-time conditions#
- pandas-openscm version: 0.1.1a1
- pandas-openscm commit: 53479d6df2ca6c8ea5e5001221bdc25e3e86b4d8
- python version: 3.11.11
- platform information: macOS-14.3-arm64-arm-64bit
- number of workers used in parallel tests: 8
Simple climate model full output#
Here we consider the case of output that mirrors the output produced by simple climate models. We have a number of scenarios and variables, each of which has hundreds of ensemble members. The output also spans a relatively long time period.
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 9.2 | 3,000 | 401 | 1 | 0.00 | 0.81 | 0.05 | 0.18 |
| 5 | 0.00 | 0.82 | 0.10 | 0.05 | |||
| 3,000 | 0.69 | 4.59 | 50.97 | 16.63 | |||
| 91.9 | 30,000 | 401 | 1 | 0.00 | 7.51 | 0.17 | 0.21 |
| 50 | 0.04 | 7.07 | 1.06 | 0.48 | |||
| 275.8 | 90,000 | 401 | 1 | 0.00 | 23.92 | 0.67 | 0.26 |
| 150 | 0.12 | 24.36 | 4.16 | 1.35 | |||
| 919.3 | 300,000 | 401 | 1 | 0.01 | 80.25 | 1.38 | 1.13 |
| 500 | 0.42 | 81.60 | 12.61 | 5.55 |
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 9.2 | 3,000 | 401 | 1 | nan | 1.30 | 0.66 | 0.73 |
| 5 | nan | 1.07 | 0.80 | 0.91 | |||
| 3,000 | nan | 2.92 | 10.71 | 5.64 | |||
| 91.9 | 30,000 | 401 | 1 | nan | 7.86 | 0.85 | 0.86 |
| 50 | nan | 2.03 | 1.00 | 1.20 | |||
| 275.8 | 90,000 | 401 | 1 | nan | 23.74 | 1.87 | 1.26 |
| 150 | nan | 5.79 | 1.80 | 1.59 | |||
| 919.3 | 300,000 | 401 | 1 | nan | 78.68 | 3.05 | 2.80 |
| 500 | nan | 21.31 | 4.05 | 2.86 |
*Note that there are no values for parallel operations with an in-memory back-end because such a setup doesn't make sense (you just duplicate the Python process for no gain).
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 9.2 | 3,000 | 401 | 1 | 0.00 | 21.37 | 9.41 | 9.49 |
| 5 | 0.00 | 21.38 | 10.03 | 9.59 | |||
| 3,000 | 0.00 | 29.69 | 469.18 | 61.47 | |||
| 91.9 | 30,000 | 401 | 1 | 0.00 | 213.67 | 92.65 | 94.35 |
| 50 | 0.00 | 213.83 | 100.27 | 95.66 | |||
| 275.8 | 90,000 | 401 | 1 | 0.00 | 641.11 | 277.66 | 282.95 |
| 150 | 0.00 | 641.64 | 300.79 | 286.93 | |||
| 919.3 | 300,000 | 401 | 1 | 0.00 | 2,137.17 | 925.19 | 943.06 |
| 500 | 0.00 | 2,139.08 | 1,002.63 | 956.38 |
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 9.2 | 3,000 | 401 | 1 | 0.00 | 0.12 | 0.01 | 0.06 |
| 5 | 0.00 | 0.14 | 0.04 | 0.04 | |||
| 3,000 | 0.11 | 8.34 | 12.36 | 11.43 | |||
| 91.9 | 30,000 | 401 | 1 | 0.00 | 0.90 | 0.05 | 0.09 |
| 50 | 0.03 | 1.13 | 0.32 | 0.38 | |||
| 275.8 | 90,000 | 401 | 1 | 0.00 | 3.20 | 0.15 | 0.31 |
| 150 | 0.05 | 4.12 | 1.01 | 0.92 | |||
| 919.3 | 300,000 | 401 | 1 | 0.00 | 10.27 | 0.71 | 1.20 |
| 500 | 0.20 | 15.19 | 7.13 | 3.95 |
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 9.2 | 3,000 | 401 | 1 | nan | 0.54 | 0.52 | 0.66 |
| 5 | nan | 0.71 | 0.65 | 0.73 | |||
| 3,000 | nan | 3.42 | 6.51 | 4.71 | |||
| 91.9 | 30,000 | 401 | 1 | nan | 1.50 | 0.70 | 0.77 |
| 50 | nan | 1.00 | 0.93 | 1.35 | |||
| 275.8 | 90,000 | 401 | 1 | nan | 4.10 | 1.08 | 1.17 |
| 150 | nan | 2.23 | 1.35 | 1.75 | |||
| 919.3 | 300,000 | 401 | 1 | nan | 10.49 | 2.66 | 2.93 |
| 500 | nan | 5.59 | 3.37 | 3.19 |
*Note that there are no values for parallel operations with an in-memory back-end because such a setup doesn't make sense (you just duplicate the Python process for no gain).
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 9.2 | 3,000 | 401 | 1 | 0.00 | 0.00 | 0.00 | 0.00 |
| 5 | 0.00 | 0.00 | 0.00 | 0.00 | |||
| 3,000 | 0.07 | 0.13 | 0.16 | 0.17 | |||
| 91.9 | 30,000 | 401 | 1 | 0.00 | 0.01 | 0.00 | 0.00 |
| 50 | 0.00 | 0.01 | 0.01 | 0.01 | |||
| 275.8 | 90,000 | 401 | 1 | 0.00 | 0.05 | 0.01 | 0.04 |
| 150 | 0.00 | 0.04 | 0.01 | 0.02 | |||
| 919.3 | 300,000 | 401 | 1 | 0.00 | 0.31 | 0.01 | 0.02 |
| 500 | 0.01 | 0.25 | 0.04 | 0.15 |
Simple climate model future quantile output#
Here we consider the case of output that mirrors processed output produced from simple climate models. We have a number of scenarios and variables, each of which has been processed to a few quantiles. The output is restricted to the future time period.
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 0.1 | 75 | 126 | 1 | 0.00 | 0.01 | 0.01 | 0.02 |
| 5 | 0.00 | 0.01 | 0.03 | 0.04 | |||
| 0.7 | 750 | 126 | 1 | 0.00 | 0.06 | 0.01 | 0.02 |
| 50 | 0.01 | 0.10 | 0.26 | 0.31 | |||
| 7.3 | 7,500 | 126 | 1 | 0.00 | 0.69 | 0.02 | 0.03 |
| 5 | 0.00 | 0.62 | 0.07 | 0.05 | |||
| 21.8 | 22,500 | 126 | 1 | 0.00 | 1.89 | 0.04 | 0.07 |
| 5 | 0.01 | 1.92 | 0.09 | 0.08 | |||
| 72.6 | 75,000 | 126 | 1 | 0.00 | 6.35 | 0.16 | 0.19 |
| 5 | 0.02 | 6.56 | 0.14 | 0.18 | |||
| 217.4 | 225,000 | 126 | 1 | 0.01 | 18.61 | 0.64 | 0.30 |
| 50 | 0.07 | 18.87 | 0.60 | 0.82 | |||
| 725.5 | 750,000 | 126 | 1 | 0.01 | 60.37 | 1.78 | 1.08 |
| 5 | 0.17 | 65.08 | 2.24 | 1.14 |
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 0.1 | 75 | 126 | 1 | nan | 0.47 | 0.63 | 0.53 |
| 5 | nan | 0.65 | 0.72 | 0.70 | |||
| 0.7 | 750 | 126 | 1 | nan | 0.47 | 0.42 | 0.50 |
| 50 | nan | 0.72 | 0.80 | 0.93 | |||
| 7.3 | 7,500 | 126 | 1 | nan | 1.13 | 0.55 | 0.62 |
| 5 | nan | 0.86 | 0.69 | 0.93 | |||
| 21.8 | 22,500 | 126 | 1 | nan | 2.46 | 0.63 | 0.66 |
| 5 | nan | 1.41 | 0.82 | 0.93 | |||
| 72.6 | 75,000 | 126 | 1 | nan | 6.81 | 0.76 | 0.82 |
| 5 | nan | 2.52 | 0.90 | 1.05 | |||
| 217.4 | 225,000 | 126 | 1 | nan | 19.10 | 1.32 | 1.18 |
| 50 | nan | 5.08 | 1.92 | 1.91 | |||
| 725.5 | 750,000 | 126 | 1 | nan | 57.81 | 2.99 | 2.78 |
| 5 | nan | 18.87 | 3.11 | 2.85 |
*Note that there are no values for parallel operations with an in-memory back-end because such a setup doesn't make sense (you just duplicate the Python process for no gain).
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 0.1 | 75 | 126 | 1 | 0.00 | 0.17 | 0.13 | 0.11 |
| 5 | 0.00 | 0.18 | 0.33 | 0.16 | |||
| 0.7 | 750 | 126 | 1 | 0.00 | 1.71 | 0.79 | 0.82 |
| 50 | 0.00 | 1.76 | 3.25 | 1.49 | |||
| 7.3 | 7,500 | 126 | 1 | 0.00 | 17.08 | 7.44 | 7.89 |
| 5 | 0.00 | 17.08 | 7.64 | 7.97 | |||
| 21.8 | 22,500 | 126 | 1 | 0.00 | 51.27 | 22.21 | 23.60 |
| 5 | 0.00 | 51.27 | 22.42 | 23.74 | |||
| 72.6 | 75,000 | 126 | 1 | 0.00 | 170.93 | 73.93 | 78.60 |
| 5 | 0.00 | 170.93 | 74.18 | 79.00 | |||
| 217.4 | 225,000 | 126 | 1 | 0.00 | 513.01 | 221.63 | 235.26 |
| 50 | 0.00 | 513.23 | 224.20 | 237.07 | |||
| 725.5 | 750,000 | 126 | 1 | 0.00 | 1,710.69 | 739.01 | 785.35 |
| 5 | 0.00 | 1,710.69 | 739.57 | 788.20 |
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 0.1 | 75 | 126 | 1 | 0.00 | 0.01 | 0.01 | 0.02 |
| 5 | 0.00 | 0.01 | 0.02 | 0.03 | |||
| 0.7 | 750 | 126 | 1 | 0.00 | 0.01 | 0.01 | 0.01 |
| 50 | 0.00 | 0.07 | 0.10 | 0.20 | |||
| 7.3 | 7,500 | 126 | 1 | 0.00 | 0.12 | 0.02 | 0.03 |
| 5 | 0.00 | 0.11 | 0.02 | 0.05 | |||
| 21.8 | 22,500 | 126 | 1 | 0.00 | 0.35 | 0.03 | 0.05 |
| 5 | 0.01 | 0.31 | 0.04 | 0.08 | |||
| 72.6 | 75,000 | 126 | 1 | 0.00 | 0.90 | 0.05 | 0.11 |
| 5 | 0.01 | 0.89 | 0.09 | 0.13 | |||
| 217.4 | 225,000 | 126 | 1 | 0.00 | 2.53 | 0.22 | 0.30 |
| 50 | 0.05 | 3.11 | 0.41 | 0.56 | |||
| 725.5 | 750,000 | 126 | 1 | 0.00 | 7.95 | 0.66 | 1.22 |
| 5 | 0.10 | 9.23 | 0.96 | 1.35 |
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 0.1 | 75 | 126 | 1 | nan | 0.38 | 0.38 | 0.50 |
| 5 | nan | 0.53 | 0.50 | 0.66 | |||
| 0.7 | 750 | 126 | 1 | nan | 0.37 | 0.34 | 0.41 |
| 50 | nan | 0.75 | 0.78 | 0.94 | |||
| 7.3 | 7,500 | 126 | 1 | nan | 0.59 | 0.45 | 0.50 |
| 5 | nan | 0.60 | 0.58 | 0.73 | |||
| 21.8 | 22,500 | 126 | 1 | nan | 0.92 | 0.51 | 0.59 |
| 5 | nan | 0.82 | 0.72 | 1.36 | |||
| 72.6 | 75,000 | 126 | 1 | nan | 1.42 | 0.60 | 0.82 |
| 5 | nan | 1.16 | 0.81 | 0.92 | |||
| 217.4 | 225,000 | 126 | 1 | nan | 3.46 | 0.91 | 1.11 |
| 50 | nan | 2.32 | 1.53 | 1.87 | |||
| 725.5 | 750,000 | 126 | 1 | nan | 9.34 | 2.36 | 2.51 |
| 5 | nan | 5.72 | 3.23 | 2.88 |
*Note that there are no values for parallel operations with an in-memory back-end because such a setup doesn't make sense (you just duplicate the Python process for no gain).
| Back-end | In memory | csv | feather | netCDF | |||
|---|---|---|---|---|---|---|---|
| In-memory size (MB) | Time series | Time points | Files in database (after grouping) | ||||
| 0.1 | 75 | 126 | 1 | 0.00 | 0.00 | 0.00 | 0.00 |
| 5 | 0.00 | 0.00 | 0.00 | 0.00 | |||
| 0.7 | 750 | 126 | 1 | 0.00 | 0.00 | 0.00 | 0.00 |
| 50 | 0.00 | 0.00 | 0.00 | 0.00 | |||
| 7.3 | 7,500 | 126 | 1 | 0.00 | 0.00 | 0.00 | 0.00 |
| 5 | 0.00 | 0.00 | 0.00 | 0.00 | |||
| 21.8 | 22,500 | 126 | 1 | 0.00 | 0.00 | 0.00 | 0.00 |
| 5 | 0.00 | 0.01 | 0.00 | 0.00 | |||
| 72.6 | 75,000 | 126 | 1 | 0.00 | 0.01 | 0.00 | 0.01 |
| 5 | 0.00 | 0.01 | 0.00 | 0.00 | |||
| 217.4 | 225,000 | 126 | 1 | 0.00 | 0.05 | 0.01 | 0.01 |
| 50 | 0.00 | 0.04 | 0.01 | 0.01 | |||
| 725.5 | 750,000 | 126 | 1 | 0.00 | 0.20 | 0.05 | 0.01 |
| 5 | 0.00 | 0.17 | 0.02 | 0.04 |