quantify_core.data
types
Module containing the core data concepts of quantify.
- class TUID(value: str)[source]#
A human-readable unique identifier based on the timestamp. This class does not wrap the passed-in object but simply verifies and returns it.
A TUID is a string formatted as YYYYmmDD-HHMMSS-sss-******. The TUID serves as a unique identifier for experiments in quantify.
See also
The handling module.
- classmethod is_valid(tuid)[source]#
Test if tuid is valid. A valid TUID is a string formatted as YYYYmmDD-HHMMSS-sss-******.
- Parameters:
tuid (str) – A TUID string.
- Return type:
bool
- Returns:
True if the string is a valid TUID.
- Raises:
ValueError – Invalid format.
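As an illustration, the format check can be sketched with a regular expression (a hypothetical stand-in for TUID.is_valid, not the actual implementation; the assumption that the trailing ****** part is six lowercase hexadecimal characters is ours):

```python
import re

# Illustrative re-implementation of the TUID format check; the real
# validation lives in quantify_core.data.types.TUID.is_valid.
# Assumption: the trailing ****** part is six lowercase hex characters.
TUID_PATTERN = re.compile(r"\d{8}-\d{6}-\d{3}-[0-9a-f]{6}")

def looks_like_tuid(value: str) -> bool:
    """Return True if value matches YYYYmmDD-HHMMSS-sss-******."""
    return TUID_PATTERN.fullmatch(value) is not None

print(looks_like_tuid("20230512-170206-573-85ee9a"))  # True
print(looks_like_tuid("not-a-tuid"))                  # False
```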
handling
Utilities for handling data.
- class DecodeToNumpy(list_to_ndarray=False, *args, **kwargs)[source]#
Decodes a JSON object to Python/NumPy objects.
- __init__(list_to_ndarray=False, *args, **kwargs)[source]#
Decodes a JSON object to Python/NumPy objects.
Example
json.loads(json_string, cls=DecodeToNumpy, list_to_ndarray=True)
- Parameters:
list_to_ndarray (bool (default: False)) – If True, will try to convert Python lists to a NumPy array.
args – Additional args to be passed to json.JSONDecoder.
kwargs – Additional kwargs to be passed to json.JSONDecoder.
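A minimal sketch of the idea using only the standard library (array.array stands in for the NumPy array that DecodeToNumpy would produce; this is an illustration of the decoding pattern, not the actual implementation):

```python
import json
from array import array

def lists_to_arrays(obj):
    # Recursively replace numeric lists with array.array, standing in for
    # the list-to-ndarray conversion DecodeToNumpy performs with NumPy.
    if isinstance(obj, dict):
        return {k: lists_to_arrays(v) for k, v in obj.items()}
    if isinstance(obj, list):
        if obj and all(isinstance(x, (int, float)) for x in obj):
            return array("d", obj)
        return [lists_to_arrays(v) for v in obj]
    return obj

decoded = lists_to_arrays(json.loads('{"setpoints": [0.0, 0.5, 1.0]}'))
print(decoded["setpoints"])  # array('d', [0.0, 0.5, 1.0])
```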
- concat_dataset(tuids, dim='dim_0', name=None, analysis_name=None)[source]#
Takes in a list of TUIDs and concatenates the corresponding datasets. It adds the TUIDs as a coordinate in the new dataset.
By default, we will extract the unprocessed dataset from each directory, but if analysis_name is specified, we will extract the processed dataset for that analysis.
- Parameters:
dim (str (default: 'dim_0')) – Dimension along which to concatenate the datasets.
analysis_name (Optional[str] (default: None)) – In the case that we want to extract the processed dataset for a given analysis, this is the name of the analysis.
name (Optional[str] (default: None)) – The name of the concatenated dataset. If None, use the name of the first dataset in the list.
- Return type:
- Returns:
Concatenated dataset with new TUID and references to the old TUIDs.
- create_exp_folder(tuid, name=None, datadir=None)[source]#
Creates an empty folder to store an experiment container.
If the folder already exists, simply returns the experiment folder corresponding to the TUID.
- Parameters:
- Return type:
- Returns:
Full path of the experiment folder following the format /datadir/YYYYmmDD/YYYYmmDD-HHMMSS-sss-******-name/.
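The folder layout can be sketched with pathlib (experiment_folder is a hypothetical helper, not the actual implementation; the example TUID and name are invented):

```python
from pathlib import Path
from typing import Optional

def experiment_folder(datadir: str, tuid: str, name: Optional[str] = None) -> Path:
    # The day folder is the date part of the TUID; the experiment folder
    # name is the TUID itself, with the optional experiment name appended.
    day = tuid.split("-")[0]
    folder = tuid if name is None else f"{tuid}-{name}"
    return Path(datadir) / day / folder

path = experiment_folder("/datadir", "20230512-170206-573-85ee9a", name="rabi")
print(path.as_posix())  # /datadir/20230512/20230512-170206-573-85ee9a-rabi
```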
- default_datadir(verbose=True)[source]#
Returns (and optionally prints) a default datadir path.
Intended for fast prototyping, tutorials, examples, etc.
- extract_parameter_from_snapshot(snapshot, parameter)[source]#
A function which takes a parameter and extracts it from a snapshot, including in the case where the parameter is part of a nested submodule within a QCoDeS instrument.
- Parameters:
- Return type:
- Returns:
The dict specifying the parameter properties which was extracted from the snapshot.
- get_datadir()[source]#
Returns the current data directory.
The data directory can be changed using set_datadir().
- Return type:
- Returns:
The current data directory.
- get_latest_tuid(contains='')[source]#
Returns the most recent TUID.
Tip
This function is similar to get_tuids_containing() but is preferred if one is only interested in the most recent TUID, for performance reasons.
- Parameters:
contains (str (default: '')) – An optional string contained in the experiment name.
- Return type:
- Returns:
The latest TUID.
- Raises:
FileNotFoundError – No data found.
- get_tuids_containing(contains='', t_start=None, t_stop=None, max_results=9223372036854775807, reverse=False)[source]#
Returns a list of TUIDs containing a specific label.
Tip
If one is only interested in the most recent TUID, get_latest_tuid() is preferred for performance reasons.
- Parameters:
contains (str (default: '')) – A string contained in the experiment name.
t_start (Union[datetime, str, None] (default: None)) – datetime to search from, inclusive. If a string is specified, it will be converted to a datetime object using parse. If no value is specified, will use the year 1 as a reference t_start.
t_stop (Union[datetime, str, None] (default: None)) – datetime to search until, exclusive. If a string is specified, it will be converted to a datetime object using parse. If no value is specified, will use the current time as a reference t_stop.
max_results (int (default: 9223372036854775807)) – Maximum number of results to return. Defaults to unlimited.
reverse (bool (default: False)) – If False, sorts TUIDs chronologically; if True, sorts by most recent first.
- Return type:
- Returns:
- Raises:
FileNotFoundError – No data found.
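The search-window semantics above (t_start inclusive, t_stop exclusive, with year 1 and the current time as defaults) can be sketched as follows; tuid_in_window is a hypothetical helper for illustration, not part of the API:

```python
from datetime import datetime

def tuid_in_window(tuid, t_start=None, t_stop=None):
    # The first 15 characters of a TUID are the YYYYmmDD-HHMMSS timestamp;
    # the window is inclusive at t_start and exclusive at t_stop.
    timestamp = datetime.strptime(tuid[:15], "%Y%m%d-%H%M%S")
    t_start = t_start or datetime.min      # default: year 1
    t_stop = t_stop or datetime.now()      # default: current time
    return t_start <= timestamp < t_stop

tuids = ["20230512-170206-573-85ee9a", "20230601-090000-000-1a2b3c"]
selected = [t for t in tuids if tuid_in_window(t, t_start=datetime(2023, 6, 1))]
print(selected)  # ['20230601-090000-000-1a2b3c']
```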
- get_varying_parameter_values(tuids, parameter)[source]#
A function that gets a parameter which varies over multiple experiments and puts it in an ndarray.
- Parameters:
- Return type:
- Returns:
: The values of the varying parameter.
- initialize_dataset(settable_pars, setpoints, gettable_pars)[source]#
Initialize an empty dataset based on settable_pars, setpoints and gettable_pars.
- load_dataset(tuid, datadir=None, name='dataset.hdf5')[source]#
Loads a dataset specified by a TUID.
Tip
This method also works when specifying only the first part of a TUID.
Note
This method uses load_dataset() to ensure the file is closed after loading, as datasets are intended to be immutable after performing the initial experiment.
- Parameters:
- Return type:
- Returns:
The dataset.
- Raises:
FileNotFoundError – No data found for specified date.
- load_dataset_from_path(path)[source]#
Loads a Dataset with a specific engine preference.
Before returning the dataset, AdapterH5NetCDF.recover() is applied.
This function tries to load the dataset until success, with the following engine preference:
- "h5netcdf"
- "netcdf4"
- No engine specified (load_dataset() default)
- load_processed_dataset(tuid, analysis_name)[source]#
Given an experiment TUID and the name of an analysis previously run on it, retrieves the processed dataset resulting from that analysis.
- load_quantities_of_interest(tuid, analysis_name)[source]#
Given an experiment TUID and the name of an analysis previously run on it, retrieves the corresponding “quantities of interest” data.
- load_snapshot(tuid, datadir=None, list_to_ndarray=False, file='snapshot.json')[source]#
Loads a snapshot specified by a TUID.
- Parameters:
tuid (TUID) – A TUID string. It is also possible to specify only the first part of a TUID.
datadir (Union[Path, str, None] (default: None)) – Path of the data directory. If None, uses get_datadir() to determine the data directory.
list_to_ndarray (bool (default: False)) – Uses an internal DecodeToNumpy decoder which allows a user to automatically convert a list to a NumPy array during deserialization of the snapshot.
file (str (default: 'snapshot.json')) – Filename to load.
- Return type:
- Returns:
The snapshot.
- Raises:
FileNotFoundError – No data found for specified date.
- locate_experiment_container(tuid, datadir=None)[source]#
Returns the path to the experiment container of the specified tuid.
- Parameters:
- Return type:
- Returns:
The path to the experiment container.
- Raises:
FileNotFoundError – Experiment container not found.
- multi_experiment_data_extractor(experiment, parameter, *, new_name=None, t_start=None, t_stop=None, analysis_name=None, dimension='dim_0')[source]#
A data extraction function which loops through multiple quantify data directories and extracts the selected varying parameter value and corresponding datasets, then compiles this data into a single dataset for further analysis.
By default, we will extract the unprocessed dataset from each directory, but if analysis_name is specified, we will extract the processed dataset for that analysis.
- Parameters:
experiment (str) – The experiment to be included in the new dataset. For example, "Pulsed spectroscopy".
parameter (str) – The name and address of the QCoDeS parameter from which to get the value, including the instrument name and all submodules. For example, "current_source.module0.dac0.current".
new_name (Optional[str] (default: None)) – The name of the new multifile dataset. If no new name is given, it will create a new name as experiment vs instrument.
t_start (Optional[str] (default: None)) – Datetime to search from, inclusive. If a string is specified, it will be converted to a datetime object using parse. If no value is specified, will use the year 1 as a reference t_start.
t_stop (Optional[str] (default: None)) – Datetime to search until, exclusive. If a string is specified, it will be converted to a datetime object using parse. If no value is specified, will use the current time as a reference t_stop.
analysis_name (Optional[str] (default: None)) – In the case that we want to extract the processed dataset for a given analysis, this is the name of the analysis.
dimension (str | None (default: 'dim_0')) – The name of the dataset dimension to concatenate over.
- Return type:
- Returns:
: The compiled quantify dataset.
- snapshot(update=False, clean=True)[source]#
State of all instruments set up as a JSON-compatible dictionary (everything that the custom JSON encoder class NumpyJSONEncoder supports).
- to_gridded_dataset(quantify_dataset, dimension='dim_0', coords_names=None)[source]#
Converts a flattened (a.k.a. "stacked") dataset, such as the one generated by initialize_dataset(), to a dataset in which the measured values are mapped onto a grid in the xarray format.
This will be meaningful only if the data itself corresponds to a gridded measurement.
Note
Each individual (x0[i], x1[i], x2[i], ...) setpoint must be unique.
Conversions applied:
- The names "x0", "x1", ... will correspond to the names of the Dimensions.
- The unique values for each of the x0, x1, ... Variables are converted to Coordinates.
- The y0, y1, ... Variables are reshaped into a (multi-)dimensional grid and associated to the Coordinates.
- Parameters:
quantify_dataset (Dataset) – Input dataset in the format generated by initialize_dataset.
dimension (str (default: 'dim_0')) – The flattened xarray Dimension.
coords_names (Optional[Iterable] (default: None)) – Optionally specify explicitly which Variables correspond to orthogonal coordinates, e.g. when the dataset holds values for ("x0", "x1") but only "x0" is independent: to_gridded_dataset(dset, coords_names=["x0"]).
- Return type:
- Returns:
The new dataset.
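The reshaping that to_gridded_dataset() performs can be sketched in plain Python for a small 2x3 grid (an illustration of the concept only, not the xarray-based implementation; the example values are invented):

```python
# Unrolled setpoints of a 2x3 grid, as a flattened measurement stores them.
x0 = [0, 0, 0, 1, 1, 1]
x1 = [10, 20, 30, 10, 20, 30]
y0 = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

# The unique values of each setpoint variable become the coordinates.
x0_vals = sorted(set(x0))
x1_vals = sorted(set(x1))

# Reshape y0 onto the grid; each (x0[i], x1[i]) setpoint must be unique.
grid = [[None] * len(x1_vals) for _ in x0_vals]
for a, b, y in zip(x0, x1, y0):
    grid[x0_vals.index(a)][x1_vals.index(b)] = y

print(grid)  # [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
```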
- trim_dataset(dataset)[source]#
Trim NaNs from a dataset, useful in the case of a dynamically resized dataset (e.g. adaptive loops).
dataset_adapters
Utilities for dataset (Python object) handling.
- class AdapterH5NetCDF[source]#
Quantify dataset adapter for the h5netcdf engine.
It has the functionality of adapting the Quantify dataset to a format compatible with the h5netcdf xarray backend engine that is used to write and load the dataset to/from disk.
Warning
The h5netcdf engine has minor issues when performing a two-way trip of the dataset. The type of some attributes is not preserved, e.g. list- and tuple-like objects are loaded as NumPy arrays of dtype=object.
- classmethod adapt(dataset)[source]#
Serializes the dataset and variables attributes to JSON.
To prevent the JSON serialization of specific items, their names should be listed under the attribute named json_serialize_exclude (for each attrs dictionary).
- static attrs_convert(attrs, inplace=False, vals_converter=<function dumps>)[source]#
Converts to/from JSON string the values of the keys which are not listed in the json_serialize_exclude list.
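The key-wise conversion can be sketched as follows (attrs_to_json is a hypothetical stand-in for attrs_convert with vals_converter=json.dumps; it does not handle the inplace flag or the reverse, from-JSON direction):

```python
import json

def attrs_to_json(attrs, json_serialize_exclude=()):
    # JSON-encode every attrs value except the keys listed in
    # json_serialize_exclude, which are passed through unchanged.
    return {
        key: value if key in json_serialize_exclude else json.dumps(value)
        for key, value in attrs.items()
    }

converted = attrs_to_json(
    {"unit": "V", "grid": True, "extra": [1, 2]},
    json_serialize_exclude=("unit",),
)
print(converted)  # {'unit': 'V', 'grid': 'true', 'extra': '[1, 2]'}
```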
- class DatasetAdapterBase[source]#
A generic interface for a dataset adapter.
Note
It might be difficult to grasp the generic purpose of this class. See AdapterH5NetCDF for a specialized use case.
A dataset adapter is intended to "adapt"/"convert" a dataset to a format compatible with some other piece of software, such as a function, interface, or read/write back end. The main use case is to define the interface of the AdapterH5NetCDF that converts the Quantify dataset for loading and writing to/from disk.
Subclasses implementing this interface are intended to be a two-way bridge to some other object/interface/backend, which we refer to as the "Target" of the adapter.
The function .adapt() should return a dataset to be consumed by the Target.
The function .recover() should receive a dataset generated by the Target.
- class DatasetAdapterIdentity[source]#
A dataset adapter that does not modify the datasets in any way.
Intended to be used just as an object that respects the adapter interface defined by DatasetAdapterBase.
A particular use case is backwards compatibility for loading and writing older versions of the Quantify dataset.
dataset_attrs
Utilities for handling the attributes of xarray.Dataset and xarray.DataArray (Python objects).
- class QCoordAttrs(unit='', long_name='', is_main_coord=None, uniformly_spaced=None, is_dataset_ref=False, json_serialize_exclude=<factory>)[source]#
A dataclass representing the required attrs attribute of main and secondary coordinates.
- is_dataset_ref: bool = False#
Flags if it is an array of quantify_core.data.types.TUIDs of other datasets.
- is_main_coord: bool | None = None#
When set to True, flags the xarray coordinate to correspond to a main coordinate, otherwise (False) it corresponds to a secondary coordinate.
- json_serialize_exclude: List[str]#
A list of strings corresponding to the names of other attributes that should not be json-serialized when writing the dataset to disk. Empty by default.
- long_name: str = ''#
A long name for this coordinate.
- uniformly_spaced: bool | None = None#
Indicates if the values are uniformly spaced.
- unit: str = ''#
The units of the values.
- class QDatasetAttrs(tuid=None, dataset_name='', dataset_state=None, timestamp_start=None, timestamp_end=None, quantify_dataset_version='2.0.0', software_versions=<factory>, relationships=<factory>, json_serialize_exclude=<factory>)[source]#
A dataclass representing the attrs attribute of the Quantify dataset.
See also
- dataset_name: str = ''#
The dataset name, usually the same as the experiment name included in the name of the experiment container.
- dataset_state: Literal[None, 'running', 'interrupted (safety)', 'interrupted (forced)', 'done'] = None#
Denotes the last known state of the experiment/data acquisition that served to ‘build’ this dataset. Can be used later to filter ‘bad’ datasets.
- json_serialize_exclude: List[str]#
A list of strings corresponding to the names of other attributes that should not be json-serialized when writing the dataset to disk. Empty by default.
- quantify_dataset_version: str = '2.0.0'#
A string identifying the version of this Quantify dataset for backwards compatibility.
- relationships: List[QDatasetIntraRelationship]#
A list of relationships within the dataset, specified as a list of dictionaries that comply with QDatasetIntraRelationship.
- software_versions: Dict[str, str]#
A mapping of other software packages that are relevant to log for this dataset. Another example is the git tag or hash of a commit of a lab repository.
- timestamp_end: Union[str, None] = None#
Human-readable timestamp (ISO8601) as returned by datetime.datetime.now().astimezone().isoformat(). Specifies when the experiment/data acquisition ended.
- timestamp_start: Union[str, None] = None#
Human-readable timestamp (ISO8601) as returned by datetime.datetime.now().astimezone().isoformat(). Specifies when the experiment/data acquisition started.
- tuid: Union[str, None] = None#
The time-based unique identifier of the dataset. See quantify_core.data.types.TUID.
- class QDatasetIntraRelationship(item_name=None, relation_type=None, related_names=<factory>, relation_metadata=<factory>)[source]#
A dataclass representing a dictionary that specifies a relationship between dataset variables.
A prominent example are calibration points contained within one variable or several variables that are necessary to interpret correctly the data of another variable.
- item_name: str | None = None#
The name of the coordinate/variable to which we want to relate other coordinates/variables.
- related_names: List[str]#
A list of names related to the item_name.
- relation_metadata: Dict[str, Any]#
A free-form dictionary to store additional information relevant to this relationship.
- relation_type: str | None = None#
A string specifying the type of relationship.
Reserved relation types:
"calibration" – Specifies a list of main variables used as calibration data for the main variables whose name is specified by the item_name.
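For illustration, a relationship entry matching these dataclass fields could look like this (the variable names q0_iq and q0_cal are invented for the example):

```python
# Hypothetical entry: q0_cal holds calibration data for the variable q0_iq.
relationship = {
    "item_name": "q0_iq",
    "relation_type": "calibration",
    "related_names": ["q0_cal"],
    "relation_metadata": {},  # free-form extra information
}
print(relationship["related_names"])  # ['q0_cal']
```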
- class QVarAttrs(unit='', long_name='', is_main_var=None, uniformly_spaced=None, grid=None, is_dataset_ref=False, has_repetitions=False, json_serialize_exclude=<factory>)[source]#
A dataclass representing the required attrs attribute of main and secondary variables.
- grid: bool | None = None#
Indicates if the variable's data are located on a grid, which does not need to be uniformly spaced along all dimensions. In other words, specifies if the corresponding main coordinates are the 'unrolled' points (also known as 'unstacked') corresponding to a grid.
If True, then it is possible to use quantify_core.data.handling.to_gridded_dataset() to convert the variables to a 'stacked' version.
- has_repetitions: bool = False#
Indicates that the outermost dimension of this variable is a repetitions dimension. This attribute is intended to allow easy programmatic detection of such a dimension. It can be used, for example, to average along this dimension before an automatic live plotting or analysis.
- is_dataset_ref: bool = False#
Flags if it is an array of quantify_core.data.types.TUIDs of other datasets. See also Dataset for a "nested MeasurementControl" experiment.
- is_main_var: bool | None = None#
When set to True, flags this xarray data variable to correspond to a main variable, otherwise (False) it corresponds to a secondary variable.
- json_serialize_exclude: List[str]#
A list of strings corresponding to the names of other attributes that should not be json-serialized when writing the dataset to disk. Empty by default.
- long_name: str = ''#
A long name for this variable.
- uniformly_spaced: bool | None = None#
Indicates if the values are uniformly spaced. This does not apply to ‘true’ main variables but, because a MultiIndex is not supported yet by xarray when writing to disk, some coordinate variables have to be stored as main variables instead.
- unit: str = ''#
The units of the values.
- get_main_coords(dataset)[source]#
Finds the main coordinates in the dataset (except secondary coordinates).
Finds the xarray coordinates in the dataset that have their attribute is_main_coord set to True (inside the xarray.DataArray.attrs dictionary).
- get_main_dims(dataset)[source]#
Determines the 'main' dimensions in the dataset.
Each of the dimensions returned is the outermost dimension for a main coordinate/variable, OR the second one when a repetitions dimension is present (see has_repetitions).
These dimensions are detected based on the is_main_coord and is_main_var attributes.
Warning
The dimensions listed here should be considered "incompatible" in the sense that a main coordinate/variable must lie on one and only one of these dimensions.
Note
The dimensions on which the secondary coordinates/variables lie are not included in this list. See also get_secondary_dims().
- get_main_vars(dataset)[source]#
Finds the main variables in the dataset (except secondary variables).
Finds the xarray data variables in the dataset that have their attribute is_main_var set to True (inside the xarray.DataArray.attrs dictionary).
- get_secondary_coords(dataset)[source]#
Finds the secondary coordinates in the dataset.
Finds the xarray coordinates in the dataset that have their attribute is_main_coord set to False (inside the xarray.DataArray.attrs dictionary).
- get_secondary_dims(dataset)[source]#
Returns the ‘main’ secondary dimensions.
For details see get_main_dims(), is_main_var and is_main_coord.
- get_secondary_vars(dataset)[source]#
Finds the secondary variables in the dataset.
Finds the xarray data variables in the dataset that have their attribute is_main_var set to False (inside the xarray.DataArray.attrs dictionary).
experiment
Utilities for managing experiment data.
- class QuantifyExperiment(tuid, dataset=None)[source]#
Class which represents all data related to an experiment. This allows the user to run experiments and store data without the quantify_core.measurement.control.MeasurementControl. The class serves as an initial interface for other data storage backends.
- load_dataset()[source]#
Loads the quantify dataset associated with the TUID set within the class.
- Return type:
- Returns:
- Raises:
FileNotFoundError – If no file with a dataset can be found
- load_metadata()[source]#
Loads the metadata from the directory specified by ~.experiment_directory.
- Return type:
- Returns:
: The loaded metadata from disk. None if no file is found.
- Raises:
FileNotFoundError – If no file with metadata can be found
- load_snapshot()[source]#
Loads the snapshot from the directory specified by ~.experiment_directory.
- Return type:
- Returns:
: The loaded snapshot from disk
- Raises:
FileNotFoundError – If no file with a snapshot can be found
- load_text(rel_path)[source]#
Loads a string from a text file from the path specified by ~.experiment_directory / rel_path.
- Parameters:
rel_path (str) – Path relative to the base directory of the experiment, e.g. "data.json" or "my_folder/data.txt".
- Return type:
- Returns:
: The loaded text from disk
- Raises:
FileNotFoundError – If no file can be found at rel_path
- save_metadata(metadata=None)[source]#
Writes the metadata to disk as specified by ~.experiment_directory.
- save_snapshot(snapshot=None)[source]#
Writes the snapshot to disk as specified by ~.experiment_directory.
- save_text(text, rel_path)[source]#
Saves a string to a text file in the path specified by ~.experiment_directory / rel_path.
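The path resolution shared by load_text and save_text can be sketched with pathlib (text_file_path is a hypothetical helper for illustration; the directory and file names are invented):

```python
from pathlib import Path

def text_file_path(experiment_directory: str, rel_path: str) -> Path:
    # Both load_text and save_text resolve files relative to the experiment
    # container, so nested rel_paths like "my_folder/data.txt" are allowed.
    return Path(experiment_directory) / rel_path

p = text_file_path(
    "/datadir/20230512/20230512-170206-573-85ee9a-rabi",
    "my_folder/data.txt",
)
print(p.as_posix())
# /datadir/20230512/20230512-170206-573-85ee9a-rabi/my_folder/data.txt
```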