pandas_openscm.db.reader#
Database reader
A small optimisation to allow for a reader that holds the index in memory, rather than loading it from disk on every operation.
Classes:
| Name | Description |
|---|---|
| `OpenSCMDBReader` | Reader for reading data out of a database created with OpenSCMDB |
OpenSCMDBReader #
Reader for reading data out of a database created with OpenSCMDB.
Holds the database file map and index in memory, which can make repeated read operations faster than using an OpenSCMDB instance.
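To illustrate why holding the index in memory can help, here is a minimal, standard-library-only sketch. It does not use pandas_openscm at all: `ToyDB`, `ToyReader`, the JSON index file and the `index_reads` counter are hypothetical stand-ins invented for this illustration, not the library's implementation.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path


class ToyDB:
    """Hypothetical stand-in: re-reads its index from disk on every load."""

    def __init__(self, db_dir: Path) -> None:
        self.db_dir = db_dir
        self.index_reads = 0  # counts how often the index file is read

    def load_index(self) -> dict[str, str]:
        self.index_reads += 1
        return json.loads((self.db_dir / "index.json").read_text())

    def load(self, key: str) -> str:
        index = self.load_index()  # one index read per call
        return (self.db_dir / index[key]).read_text()

    def create_reader(self) -> ToyReader:
        # The index is read once here and then handed to the reader
        return ToyReader(db_dir=self.db_dir, db_index=self.load_index())


class ToyReader:
    """Hypothetical stand-in: keeps the index in memory after creation."""

    def __init__(self, db_dir: Path, db_index: dict[str, str]) -> None:
        self.db_dir = db_dir
        self.db_index = db_index

    def load(self, key: str) -> str:
        # No index read here, only the data file itself
        return (self.db_dir / self.db_index[key]).read_text()


with tempfile.TemporaryDirectory() as tmp:
    db_dir = Path(tmp)
    (db_dir / "0.txt").write_text("scenario-a data")
    (db_dir / "1.txt").write_text("scenario-b data")
    (db_dir / "index.json").write_text(json.dumps({"a": "0.txt", "b": "1.txt"}))

    db = ToyDB(db_dir)
    direct = [db.load(key) for key in ("a", "b", "a")]
    reads_direct = db.index_reads

    reader = db.create_reader()
    via_reader = [reader.load(key) for key in ("a", "b", "a")]
    reads_for_reader = db.index_reads - reads_direct
```

In this toy setup, three loads through `ToyDB` read the index three times, while three loads through `ToyReader` cost a single index read at creation. The trade-off is that the in-memory index can go stale if the database changes underneath the reader.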
Methods:
| Name | Description |
|---|---|
| `__enter__` | If the reader has a lock, acquire it |
| `__exit__` | If the reader has a lock, release it |
| `load` | Load data |
Attributes:
| Name | Type | Description |
|---|---|---|
| `backend_data` | `OpenSCMDBDataBackend` | The backend for reading data from disk |
| `db_dir` | `Path` | The directory in which the database lives |
| `db_file_map` | `Series[Path]` | The file map of the database from which we are reading. |
| `db_index` | `DataFrame` | The index of the database from which we are reading. |
| `lock` | `BaseFileLock \| None` | Lock for the database from which data is being read |
| `metadata` | `MultiIndex` | Database's metadata |
Source code in src/pandas_openscm/db/reader.py
backend_data class-attribute instance-attribute #
backend_data: OpenSCMDBDataBackend = field(kw_only=True)
The backend for reading data from disk
db_dir class-attribute instance-attribute #
The directory in which the database lives
db_file_map class-attribute instance-attribute #
The file map of the database from which we are reading.
db_index class-attribute instance-attribute #
The index of the database from which we are reading.
lock class-attribute instance-attribute #
lock: BaseFileLock | None = field(kw_only=True)
Lock for the database from which data is being read
If None, we don't hold the lock and automatic locking is not enabled.
__enter__ #
__enter__() -> OpenSCMDBReader
If the reader has a lock, acquire it
__exit__ #
__exit__(
exc_type: type[BaseException] | None,
exc_value: BaseException | None,
traceback: TracebackType | None,
) -> None
If the reader has a lock, release it
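The acquire-on-enter, release-on-exit behaviour with an optional lock can be sketched as below. This is an illustration of the pattern only, not the source of `OpenSCMDBReader`: `ToyLock` and `ToyLockedReader` are hypothetical names, and `ToyLock` merely imitates the `acquire()`/`release()` pair that a file lock offers.

```python
from __future__ import annotations

from dataclasses import dataclass
from types import TracebackType


class ToyLock:
    """Hypothetical stand-in for a file lock; records whether it is held."""

    def __init__(self) -> None:
        self.held = False

    def acquire(self) -> None:
        self.held = True

    def release(self) -> None:
        self.held = False


@dataclass
class ToyLockedReader:
    """Hypothetical reader whose lock may be None (no automatic locking)."""

    lock: ToyLock | None = None

    def __enter__(self) -> ToyLockedReader:
        # Only acquire if a lock was supplied
        if self.lock is not None:
            self.lock.acquire()
        return self

    def __exit__(
        self,
        exc_type: type[BaseException] | None,
        exc_value: BaseException | None,
        traceback: TracebackType | None,
    ) -> None:
        # Only release if a lock was supplied; exceptions still propagate
        if self.lock is not None:
            self.lock.release()


lock = ToyLock()
with ToyLockedReader(lock=lock):
    held_inside = lock.held  # the lock is held inside the block
held_after = lock.held  # and released once the block exits

# With lock=None, entering and exiting the context does nothing lock-related
with ToyLockedReader(lock=None) as unlocked_reader:
    no_lock = unlocked_reader.lock is None
```

Using the reader in a `with` block keeps the lock held for the whole sequence of reads, and guarantees its release even if one of those reads raises.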
load #
load(
selector: Index[Any]
| MultiIndex
| Selector
| None = None,
*,
out_columns_type: type | None = None,
out_columns_name: str | None = None,
parallel_op_config: ParallelOpConfig | None = None,
progress: bool = False,
max_workers: int | None = None,
) -> DataFrame
Load data
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `selector` | `Index[Any] \| MultiIndex \| Selector \| None` | Selector to use to choose the data to load | `None` |
| `out_columns_type` | `type \| None` | Type to set the output columns to. If not supplied, we don't set the output columns' type. | `None` |
| `out_columns_name` | `str \| None` | The name for the columns in the output. If not supplied, we don't set the output columns' name. This can also be set with pd.DataFrame.rename_axis but we provide it here for convenience (and in case you couldn't find this trick for ages, like us). | `None` |
| `parallel_op_config` | `ParallelOpConfig \| None` | Configuration for executing the operation in parallel with progress bars. If not supplied, we use the values of | `None` |
| `progress` | `bool` | Should progress bar(s) be used to display the progress of the loading? Only used if | `False` |
| `max_workers` | `int \| None` | Maximum number of workers to use for parallel processing. If supplied, we create an instance of concurrent.futures.ProcessPoolExecutor with the provided number of workers. A process pool seems to be the sensible default from our experimentation, but it is not a universally better choice. If you need something else because of how your database is set up, simply pass If not supplied, the loading is executed serially. Only used if | `None` |
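The `max_workers` behaviour described above (serial execution when it is not supplied, a process pool when it is) can be sketched roughly as follows. This is not the library's code: `apply_to_files`, its `executor_cls` argument and `fake_load` are hypothetical names introduced here to show the dispatch pattern, including how a different executor could be swapped in.

```python
from __future__ import annotations

from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def apply_to_files(
    func: Callable[[T], R],
    items: Iterable[T],
    max_workers: int | None = None,
    executor_cls: type[Executor] = ProcessPoolExecutor,
) -> list[R]:
    """Apply func to each item, serially or in a pool (hypothetical helper)."""
    if max_workers is None:
        # Not supplied: execute serially in the current process
        return [func(item) for item in items]

    # Supplied: create a pool with the requested number of workers.
    # executor.map returns results in the same order as the inputs.
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(func, items))


def fake_load(file_id: int) -> int:
    """Stand-in for loading one file; top-level so a process pool can pickle it."""
    return file_id * 10


# Serial path
serial_result = apply_to_files(fake_load, [1, 2, 3])

# Pool path, here with threads swapped in for the default process pool
threaded_result = apply_to_files(
    fake_load, [1, 2, 3], max_workers=2, executor_cls=ThreadPoolExecutor
)
```

A process pool sidesteps the GIL for CPU-heavy deserialisation but pays for pickling inputs and results, which is one reason it is not a universally better choice than threads or serial execution.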
Returns:
| Type | Description |
|---|---|
| `DataFrame` | Loaded data |