Dataset#

class pydidas.core.Dataset(array: ndarray, **kwargs: dict)#

Bases: ndarray

Dataset class, a subclass of a numpy.ndarray with metadata.

Parameters:

array (np.ndarray) – The data array.
**kwargs (dict) – Optional keyword arguments.
**axis_labels (Union[dict, list, tuple], optional) – The labels for the axes. The length must correspond to the array dimensions. The default is None.
**axis_ranges (Union[dict, list, tuple], optional) – The ranges for the axes. The length must correspond to the array dimensions. The default is None.
**axis_units (Union[dict, list, tuple], optional) – The units for the axes. The length must correspond to the array dimensions. The default is None.
**metadata (Union[dict, None], optional) – A dictionary with metadata. The default is None.
**data_unit (str, optional) – The description of the data unit. The default is an empty string.
**data_label (str, optional) – The description of the data. The default is an empty string.
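
The axis keywords above accept a dict, list or tuple, while the axis_labels, axis_ranges and axis_units properties are documented as returning dictionaries keyed by dimension. The following is a minimal sketch of how such input could be normalized. The helper normalize_axis_metadata and its default argument are hypothetical names used for illustration only; they are not part of pydidas, and the actual handling of None may differ.

```python
def normalize_axis_metadata(value, ndim, default=None):
    """Return a dict {dimension index: entry} for dict, list, tuple or None input.

    Hypothetical helper for illustration; not part of pydidas.
    """
    if value is None:
        # No metadata given: fill every dimension with the default entry.
        return {dim: default for dim in range(ndim)}
    if isinstance(value, dict):
        items = dict(value)
    elif isinstance(value, (list, tuple)):
        # Sequence entries are assigned to dimensions in order.
        items = dict(enumerate(value))
    else:
        raise TypeError("Expected a dict, list, tuple or None.")
    if sorted(items) != list(range(ndim)):
        raise ValueError("The length must correspond to the array dimensions.")
    return items


labels = normalize_axis_metadata(["y", "x"], ndim=2)
units = normalize_axis_metadata({0: "mm", 1: "mm"}, ndim=2)
```

Both calls yield a dictionary with the keys 0 and 1, which matches the shape of the values described for the axis properties.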

property array: ndarray#

Get the raw array data of the dataset.

Returns:: The array data.
Return type:: np.ndarray

property axis_labels: dict#

Get the axis_labels.

Returns:: The axis labels: A dictionary with keys corresponding to the dimension in the array and respective values.
Return type:: dict

property axis_ranges: dict#

Get the axis ranges.

For every dimension, these arrays give the range of the data (in conjunction with the units).

Returns:: The axis ranges: A dictionary with keys corresponding to the dimension in the array and respective values.
Return type:: dict

property axis_units: dict#

Get the axis units.

Returns:: The axis units: A dictionary with keys corresponding to the dimension in the array and respective values.
Return type:: dict

copy(order: Literal['C', 'F', 'A', 'K'] = 'C') → Self#

Overload the generic np.ndarray copy method to copy metadata as well.

Parameters:: order (Literal["C", "F", "A", "K"], optional) – The memory layout. The default is “C”.
Returns:: The copied dataset.
Return type:: Dataset

property data_description: str#

Get a descriptive string for the data.

Returns:: The descriptive string for the data.
Return type:: str

property data_label: str#

Get the data label.

Returns:: The data label.
Return type:: str

property data_unit: str#

Get the data unit.

Returns:: The data unit.
Return type:: str

flatten(order: Literal['C', 'F', 'A', 'K'] = 'C') → Self#

Clear the metadata when flattening the array.

Parameters:: order ({'C', 'F', 'A', 'K'}, optional) – ‘C’ means to flatten in row-major (C-style) order. ‘F’ means to flatten in column-major (Fortran-style) order. ‘A’ means to flatten in column-major order if a is Fortran contiguous in memory, row-major order otherwise. ‘K’ means to flatten a in the order the elements occur in memory. The default is ‘C’.

flatten_dims(*args: tuple, new_dim_label: str = 'Flattened', new_dim_unit: str = '', new_dim_range: None | ndarray | Iterable = None)#

Flatten the specified dimensions in place in the Dataset.

This method will reduce the dimensionality of the Dataset by len(args) - 1, because all specified dimensions are merged into a single new dimension.

Warning: Flattening dimensions that are distributed throughout the dataset would destroy the data organisation. Therefore, only adjacent dimensions can be processed.

Parameters:

*args (tuple) – The tuple of the dimensions to be flattened. Each dimension must be an integer entry.
new_dim_label (str, optional) – The label for the new, flattened dimension. The default is ‘Flattened’.
new_dim_unit (str, optional) – The unit for the new, flattened dimension. The default is ‘’.
new_dim_range (Union[None, np.ndarray, Iterable], optional) – The new range for the flattened dimension. If None, a simple The default is None.
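
The effect of merging adjacent dimensions on the shape and the axis labels can be sketched with plain Python containers. The function flatten_dims_metadata below is a hypothetical stand-in written for illustration and is not the pydidas implementation; it only tracks the shape and labels and assumes that non-adjacent dimensions are rejected with a ValueError.

```python
def flatten_dims_metadata(shape, labels, *dims, new_dim_label="Flattened"):
    """Return (new_shape, new_labels) after merging the adjacent dimensions `dims`.

    Hypothetical stand-in for illustration; not the pydidas implementation.
    """
    dims = sorted(dims)
    if len(dims) < 2:
        # Nothing to merge.
        return tuple(shape), dict(labels)
    if dims != list(range(dims[0], dims[-1] + 1)):
        raise ValueError("Only adjacent dimensions can be flattened.")
    first, last = dims[0], dims[-1]
    # The merged dimension holds the product of the flattened lengths.
    merged = 1
    for dim in dims:
        merged *= shape[dim]
    new_shape = tuple(shape[:first]) + (merged,) + tuple(shape[last + 1:])
    # Keep the labels of all untouched dimensions in their original order.
    kept = [labels[i] for i in range(len(shape)) if i < first or i > last]
    new_label_list = kept[:first] + [new_dim_label] + kept[first:]
    return new_shape, dict(enumerate(new_label_list))


shape = (4, 5, 6)
labels = {0: "scan y", 1: "scan x", 2: "2theta"}
new_shape, new_labels = flatten_dims_metadata(shape, labels, 0, 1)
```

In this example, the two scan dimensions of lengths 4 and 5 become one dimension of length 20 labelled "Flattened", and the "2theta" label moves from key 2 to key 1.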

get_axis_description(index: int) → str#

Get the description for the given axis, based on the axis label and unit.

Parameters:: index (int) – The axis index.
Returns:: The description for the given axis.
Return type:: str

get_description_of_point(indices: Iterable) → str#

Get the metadata description of a single point in the array.

An index value of None will be interpreted as a request to skip the respective axis.

Parameters:: indices (Iterable) – The indices for each dimension.
Returns:: A string description of the selected point.
Return type:: str
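
The skipping of None entries can be sketched as follows. The function describe_point is a hypothetical stand-in, and the output format ("label: value unit", joined by semicolons) is an assumption made for illustration; the string produced by pydidas may look different.

```python
def describe_point(indices, labels, units, ranges):
    """Build a description of one point, skipping axes whose index is None.

    Hypothetical stand-in with an assumed output format; not the pydidas method.
    """
    parts = []
    for dim, index in enumerate(indices):
        if index is None:
            # A None index requests that this axis is left out.
            continue
        value = ranges[dim][index]
        # strip() removes the trailing space if the unit is empty.
        parts.append(f"{labels[dim]}: {value} {units[dim]}".strip())
    return "; ".join(parts)


labels = {0: "y", 1: "x", 2: "chi"}
units = {0: "mm", 1: "mm", 2: "deg"}
ranges = {0: [0.0, 0.5, 1.0], 1: [0.0, 1.0], 2: [10.0, 20.0]}
description = describe_point((2, None, 0), labels, units, ranges)
```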

get_rebinned_copy(binning: int) → Self#

Get a binned copy of the Dataset.

This method will create a binned copy and copy all axis metadata. It will also modify the ranges, if required.

Parameters:: binning (int) – The binning factor.
Returns:: The binned Dataset.
Return type:: pydidas.core.Dataset
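
One plausible way to modify an axis range during binning is to average each group of binning consecutive values and to drop an incomplete trailing group. The sketch below assumes exactly this behaviour; rebin_range is a hypothetical helper, and the way pydidas treats the remainder is not confirmed here.

```python
def rebin_range(axis_range, binning):
    """Average consecutive groups of `binning` values, dropping any remainder.

    Hypothetical helper for illustration; not the pydidas implementation.
    """
    if binning < 1:
        raise ValueError("The binning factor must be a positive integer.")
    n_bins = len(axis_range) // binning
    return [
        sum(axis_range[i * binning:(i + 1) * binning]) / binning
        for i in range(n_bins)
    ]


binned = rebin_range([0.0, 1.0, 2.0, 3.0, 4.0], 2)
```

With a binning factor of 2, the five input values give two bins centred at 0.5 and 2.5, and the last value is discarded.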

property metadata: dict#

Get the dataset metadata.

Returns:: The metadata dictionary. There is no enforced structure of the dictionary.
Return type:: dict

property property_dict: dict#

Get a copy of the properties dictionary.

Returns:: A dictionary with copies of all properties.
Return type:: dict

squeeze(axis: None | int = None) → Self#

Squeeze the array and remove dimensions of length one.

Parameters:: axis (Union[None, int], optional) – The axis to be squeezed. If None, all axes of length one will be squeezed. The default is None.
Returns:: The squeezed Dataset.
Return type:: pydidas.core.Dataset
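
Removing dimensions of length one also requires the axis metadata keys to be renumbered. A minimal sketch of this bookkeeping follows, using a hypothetical function squeeze_metadata that works on a shape tuple and a label dictionary; it is not the pydidas implementation.

```python
def squeeze_metadata(shape, labels, axis=None):
    """Return (new_shape, new_labels) with dimensions of length one removed.

    Hypothetical stand-in for illustration; not the pydidas implementation.
    """
    if axis is None:
        # Keep every dimension whose length differs from one.
        keep = [dim for dim, size in enumerate(shape) if size != 1]
    else:
        if shape[axis] != 1:
            raise ValueError("Cannot squeeze an axis whose length is not one.")
        keep = [dim for dim in range(len(shape)) if dim != axis]
    new_shape = tuple(shape[dim] for dim in keep)
    # Renumber the remaining labels so the keys run from 0 to ndim - 1.
    new_labels = {new: labels[old] for new, old in enumerate(keep)}
    return new_shape, new_labels


shape = (1, 5, 1, 3)
labels = {0: "frame", 1: "y", 2: "channel", 3: "x"}
all_squeezed = squeeze_metadata(shape, labels)
one_squeezed = squeeze_metadata(shape, labels, axis=2)
```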

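take(indices: int | ArrayLike, axis: None | int = None, out: None | ndarray = None, mode: str = 'raise') → Self#

Take elements from an array along an axis.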

This method overloads the ndarray.take method to process the axis properties as well.

Parameters:

indices (Union[int, ArrayLike]) – The indices of the values to extract.
axis (Union[None, int], optional) – The axis to take the data from. If None, data will be taken from the flattened array. The default is None.
out (Union[np.ndarray, None], optional) – An optional output array. If None, a new array is created. The default is None.
mode (str, optional) – Specifies how out-of-bounds indices will behave. The default is “raise”.

Returns:

new – The new dataset.

Return type:

pydidas.core.Dataset

transpose(*axes: tuple) → Self#

Overload the generic transpose method to transpose the metadata as well.

Note that, unlike the generic method, transpose creates a deep copy of the data rather than a view. This prevents inconsistent metadata.

Parameters:: *axes (tuple) – The axes to be transposed. If not given, the generic order is used.
Returns:: The transposed Dataset.
Return type:: pydidas.core.Dataset
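
Transposing the metadata amounts to applying the same axis permutation to the dictionary keys. The sketch below uses a hypothetical function transpose_metadata on a plain label dictionary and is not the pydidas implementation. It follows the NumPy convention that axis i of the result corresponds to axis axes[i] of the input, and that the axis order is reversed if no axes are given.

```python
def transpose_metadata(labels, *axes):
    """Return the labels re-keyed according to the axis permutation `axes`.

    Hypothetical stand-in for illustration; not the pydidas implementation.
    """
    ndim = len(labels)
    if not axes:
        # Without arguments, the order of the axes is reversed.
        axes = tuple(reversed(range(ndim)))
    if sorted(axes) != list(range(ndim)):
        raise ValueError("The axes must be a permutation of all dimensions.")
    # The new axis `new` takes the metadata of the old axis `old`.
    return {new: labels[old] for new, old in enumerate(axes)}


labels = {0: "y", 1: "x", 2: "chi"}
reversed_labels = transpose_metadata(labels)
swapped_labels = transpose_metadata(labels, 1, 0, 2)
```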

update_axis_label(index: int, item: str)#

Update a single axis label value.

Parameters:

index (int) – The dimension to be updated.
item (str) – The new label for the selected dimension.

Raises:

ValueError – If the index is not in range of the Dataset dimensions or if the item is not a string.
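
The two documented error conditions can be sketched with a plain dictionary. The function below is a hypothetical stand-in that mirrors the validation described above; it is not the pydidas method, and the error messages are made up for illustration.

```python
def update_axis_label(labels, index, item):
    """Set labels[index] = item after validating both arguments.

    Hypothetical stand-in operating on a plain dict; not the pydidas method.
    """
    if index not in labels:
        raise ValueError(f"The index {index!r} is not a valid dimension.")
    if not isinstance(item, str):
        raise ValueError("The new axis label must be a string.")
    labels[index] = item


labels = {0: "y", 1: "x"}
update_axis_label(labels, 1, "detector x")
```

After the call, only the entry for dimension 1 has changed. An index of 2 or a non-string item would raise a ValueError in this sketch.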

update_axis_range(index: int, item: ndarray | Iterable)#

Update a single axis range value.

Parameters:

index (int) – The dimension to be updated.
item (Union[np.ndarray, Iterable]) – The new item for the range of the selected dimension.

Raises:

ValueError – If the index is not in range of the Dataset dimensions.

update_axis_unit(index: int, item: str)#

Update a single axis unit value.

Parameters:

index (int) – The dimension to be updated.
item (str) – The new unit for the selected dimension.

Raises:

ValueError – If the index is not in range of the Dataset dimensions or if the item is not a string.