Dataset

class pydidas.core.Dataset(array: ndarray, **kwargs: dict)

Bases: ndarray

Dataset class, a subclass of numpy.ndarray that carries metadata.

Parameters:
  • array (np.ndarray) – The data array.

  • **kwargs (dict) – Optional keyword arguments.

  • **axis_labels (Union[dict, list, tuple], optional) – The labels for the axes. The length must correspond to the array dimensions. The default is None.

  • **axis_ranges (Union[dict, list, tuple], optional) – The ranges for the axes. The length must correspond to the array dimensions. The default is None.

  • **axis_units (Union[dict, list, tuple], optional) – The units for the axes. The length must correspond to the array dimensions. The default is None.

  • **metadata (Union[dict, None], optional) – A dictionary with metadata. The default is None.

  • **data_unit (str, optional) – The description of the data unit. The default is an empty string.

  • **data_label (str, optional) – The description of the data. The default is an empty string.
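
The general mechanism of an ndarray subclass that carries metadata can be sketched with plain NumPy. The class below is a hypothetical stand-in (the name MiniDataset and its reduced set of keywords are assumptions made for illustration); it is not the pydidas implementation and only accepts labels as a list or tuple.

```python
import numpy as np


class MiniDataset(np.ndarray):
    """Hypothetical stand-in: an ndarray that carries axis labels and metadata."""

    def __new__(cls, array, axis_labels=None, metadata=None):
        obj = np.asarray(array).view(cls)
        # Store the labels as a dict keyed by the dimension index.
        labels = axis_labels if axis_labels is not None else [""] * obj.ndim
        if len(labels) != obj.ndim:
            raise ValueError("axis_labels must have one entry per dimension.")
        obj.axis_labels = dict(enumerate(labels))
        obj.metadata = dict(metadata or {})
        return obj

    def __array_finalize__(self, obj):
        # Also called for views and slices; fall back to empty defaults.
        if obj is None:
            return
        self.axis_labels = getattr(obj, "axis_labels", {})
        self.metadata = getattr(obj, "metadata", {})


data = MiniDataset(np.zeros((3, 4)), axis_labels=["y", "x"], metadata={"scan": 1})
```

The object behaves like a regular array (data.shape is (3, 4)) while data.axis_labels and data.metadata travel with it.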

property array: ndarray

Get the raw array data of the dataset.

Returns:

The array data.

Return type:

np.ndarray

property axis_labels: dict

Get the axis_labels.

Returns:

The axis labels: A dictionary whose keys are the dimension indices of the array and whose values are the respective labels.

Return type:

dict

property axis_ranges: dict

Get the axis ranges.

For every dimension, the range array gives the range of the data along that axis (in conjunction with the units).

Returns:

The axis ranges: A dictionary whose keys are the dimension indices of the array and whose values are the respective ranges.

Return type:

dict

property axis_units: dict

Get the axis units.

Returns:

The axis units: A dictionary whose keys are the dimension indices of the array and whose values are the respective units.

Return type:

dict
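
The axis properties return dictionaries keyed by dimension, while the constructor accepts a dict, list or tuple. One way such a normalization could look is sketched below; the helper name as_axis_dict and its handling of None are assumptions for illustration and are not taken from pydidas.

```python
def as_axis_dict(entries, ndim):
    """Return a dict {dimension: value} from a dict, list, tuple or None."""
    if entries is None:
        return {dim: None for dim in range(ndim)}
    if isinstance(entries, dict):
        items = dict(entries)
    elif isinstance(entries, (list, tuple)):
        items = dict(enumerate(entries))
    else:
        raise TypeError("entries must be a dict, list, tuple or None.")
    # The length must correspond to the array dimensions.
    if set(items) != set(range(ndim)):
        raise ValueError("Need exactly one entry per array dimension.")
    return items


axis_units = as_axis_dict(("mm", "deg"), ndim=2)
axis_labels = as_axis_dict({0: "x position", 1: "chi"}, ndim=2)
```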

copy(order: Literal['C', 'F', 'A', 'K'] = 'C') → Self

Overload the generic np.ndarray copy method to copy metadata as well.

Parameters:

order (Literal["C", "F", "A", "K"], optional) – The memory layout. The default is “C”.

Returns:

The copied dataset.

Return type:

Dataset

property data_description: str

Get a descriptive string for the data.

Returns:

The descriptive string for the data.

Return type:

str

property data_label: str

Get the data label.

Returns:

The data label.

Return type:

str

property data_unit: str

Get the data unit.

Returns:

The data unit.

Return type:

str

flatten(order: Literal['C', 'F', 'A', 'K'] = 'C') → Self

Flatten the array, clearing the metadata.

Parameters:

order ({'C', 'F', 'A', 'K'}, optional) – ‘C’ means to flatten in row-major (C-style) order. ‘F’ means to flatten in column-major (Fortran-style) order. ‘A’ means to flatten in column-major order if a is Fortran contiguous in memory, row-major order otherwise. ‘K’ means to flatten a in the order the elements occur in memory. The default is ‘C’.
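
The difference between the row-major and column-major orders can be seen on a small plain NumPy array. This example uses np.ndarray.flatten directly and does not involve the Dataset metadata handling.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

row_major = grid.flatten(order="C")     # rows one after another
column_major = grid.flatten(order="F")  # columns one after another
```

Here row_major is [1, 2, 3, 4, 5, 6] and column_major is [1, 4, 2, 5, 3, 6].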

flatten_dims(*args: tuple, new_dim_label: str = 'Flattened', new_dim_unit: str = '', new_dim_range: None | ndarray | Iterable = None)

Flatten the specified dimensions in place in the Dataset.

This method merges the selected dimensions into a single new dimension and therefore reduces the dimensionality of the Dataset by len(args) - 1.

Warning: Flattening dimensions which are distributed throughout the dataset would destroy the data organisation. Only adjacent dimensions can be processed.

Parameters:
  • *args (tuple) – The tuple of the dimensions to be flattened. Each dimension must be an integer entry.

  • new_dim_label (str, optional) – The label for the new, flattened dimension. The default is ‘Flattened’.

  • new_dim_unit (str, optional) – The unit for the new, flattened dimension. The default is ‘’.

  • new_dim_range (Union[None, np.ndarray, Iterable], optional) – The new range for the flattened dimension. If None, a simple The default is None.
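
Merging adjacent dimensions amounts to a reshape in which the selected axes are replaced by one axis holding the product of their lengths. The sketch below uses plain NumPy and a hypothetical helper flatten_adjacent; unlike flatten_dims, it returns a reshaped array instead of working in place and does not handle labels, units or ranges.

```python
import numpy as np


def flatten_adjacent(array, dims):
    """Merge the adjacent dimensions `dims` of `array` into a single one."""
    dims = tuple(dims)
    if len(dims) < 2 or any(b - a != 1 for a, b in zip(dims, dims[1:])):
        raise ValueError("dims must be two or more adjacent, ascending axes.")
    merged = int(np.prod([array.shape[d] for d in dims]))
    new_shape = array.shape[: dims[0]] + (merged,) + array.shape[dims[-1] + 1 :]
    return array.reshape(new_shape)


frames = np.arange(2 * 3 * 4).reshape(2, 3, 4)
flat = flatten_adjacent(frames, (0, 1))
```

The result has shape (6, 4): two dimensions were merged into one, so the number of dimensions dropped from three to two.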

get_axis_description(index: int) → str

Get the description for the given axis, based on the axis label and unit.

Parameters:

index (int) – The axis index.

Returns:

The description for the given axis.

Return type:

str

get_description_of_point(indices: Iterable) → str

Get the metadata description of a single point in the array.

Index values of “None” will be interpreted as a request to skip this axis.

Parameters:

indices (Iterable) – The indices for each dimension.

Returns:

A string description of the selected point.

Return type:

str
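
How the axis labels, ranges and units can be combined into a description of one point, with None skipping an axis, is sketched below. The helper describe_point and the output format are assumptions for illustration; the string returned by pydidas may look different.

```python
def describe_point(indices, labels, ranges, units):
    """Build a text description of one point; `None` indices skip that axis."""
    parts = []
    for dim, index in enumerate(indices):
        if index is None:
            continue
        value = ranges[dim][index]
        parts.append(f"{labels[dim]}: {value} {units[dim]}".strip())
    return "; ".join(parts)


labels = {0: "x", 1: "y", 2: "energy"}
units = {0: "mm", 1: "mm", 2: "keV"}
ranges = {0: [0.0, 0.5, 1.0], 1: [10, 20], 2: [8.0, 9.0, 10.0]}

# Axis 1 is skipped because its index is None.
text = describe_point((2, None, 0), labels, ranges, units)
```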

get_rebinned_copy(binning: int) → Self

Get a binned copy of the Dataset.

This method will create a binned copy and copy all axis metadata. It will also modify the ranges, if required.

Parameters:

binning (int) – The binning factor.

Returns:

The binned Dataset.

Return type:

pydidas.core.Dataset
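
Why the ranges may need to change under binning can be illustrated in one dimension. The sketch assumes that binning averages neighbouring points and drops a trailing incomplete bin; both choices, and the helper name rebin, are assumptions and may differ from what get_rebinned_copy does.

```python
import numpy as np


def rebin(array, axis_range, binning):
    """Average `binning` neighbouring points of a 1d array and of its range."""
    n_bins = array.size // binning
    usable = n_bins * binning  # drop points that do not fill a complete bin
    binned = array[:usable].reshape(n_bins, binning).mean(axis=1)
    new_range = np.asarray(axis_range)[:usable].reshape(n_bins, binning).mean(axis=1)
    return binned, new_range


signal = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
binned, new_x = rebin(signal, x, binning=2)
```

With a binning factor of 2, the five input points yield two bins with the values [2.0, 6.0] at the positions [0.5, 2.5].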

property metadata: dict

Get the dataset metadata.

Returns:

The metadata dictionary. There is no enforced structure of the dictionary.

Return type:

dict

property property_dict: dict

Get a copy of the properties dictionary.

Returns:

A dictionary with copies of all properties.

Return type:

dict

squeeze(axis: None | int = None) → Self

Squeeze the array and remove dimensions of length one.

Parameters:

axis (Union[None, int], optional) – The axis to be squeezed. If None, all axes of length one will be squeezed. The default is None.

Returns:

The squeezed Dataset.

Return type:

pydidas.core.Dataset
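
When axes of length one are removed, the axis metadata of the remaining dimensions has to be renumbered. A minimal sketch with plain NumPy and a label dictionary follows; the helper squeeze_with_labels is hypothetical and only covers the case axis=None.

```python
import numpy as np


def squeeze_with_labels(array, labels):
    """Drop all axes of length one and renumber the remaining labels."""
    kept = [dim for dim, size in enumerate(array.shape) if size != 1]
    new_labels = {new: labels[old] for new, old in enumerate(kept)}
    return np.squeeze(array), new_labels


stack = np.zeros((1, 5, 1, 7))
labels = {0: "scan", 1: "y", 2: "repeat", 3: "x"}
squeezed, squeezed_labels = squeeze_with_labels(stack, labels)
```

The squeezed array has shape (5, 7) and the labels become {0: "y", 1: "x"}.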

take(indices: int | ArrayLike, axis: int | None = None, out: None | ndarray = None, mode: Literal['raise', 'wrap', 'clip'] = 'raise') → Self

Take elements from an array along an axis.

This method overloads the ndarray.take method to process the axis properties as well.

Parameters:
  • indices (Union[int, ArrayLike]) – The indices of the values to extract.

  • axis (Union[None, int], optional) – The axis to take the data from. If None, data will be taken from the flattened array. The default is None.

  • out (Union[np.ndarray, None], optional) – An optional output array. If None, a new array is created. The default is None.

  • mode (str, optional) – Specifies how out-of-bounds indices will behave. The default is “raise”.

Returns:

The new dataset.

Return type:

pydidas.core.Dataset
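
The parameters mirror those of numpy.take. The plain NumPy sketch below shows the selection along an axis and, as an illustration of what processing the axis properties involves, applies the same indices to a separately stored range array. The variable names are made up for this example.

```python
import numpy as np

image = np.arange(12).reshape(3, 4)
x_range = np.array([0.0, 0.1, 0.2, 0.3])  # range of axis 1

columns = [0, 2]
subset = np.take(image, columns, axis=1)
subset_x_range = np.take(x_range, columns)  # keep the range in sync

# Without an axis, the values are taken from the flattened array.
from_flat = np.take(image, [5])

# mode="wrap" maps out-of-bounds indices back into the valid range.
wrapped = np.take(x_range, [5], mode="wrap")
```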

transpose(*axes: tuple) → Self

Overload the generic transpose method to transpose the metadata as well.

Note that, contrary to the generic method, transpose creates a deep copy of the data rather than a view, to prevent inconsistent metadata.

Parameters:

*axes (tuple) – The axes to be transposed. If not given, the generic order is used.

Returns:

The transposed Dataset.

Return type:

pydidas.core.Dataset
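
The difference between a view and a copy, and the reordering of axis metadata, can be sketched with plain NumPy. The helper transpose_with_labels is hypothetical and only handles labels; it is not the pydidas implementation.

```python
import numpy as np


def transpose_with_labels(array, labels, axes):
    """Return a transposed copy of `array` and the reordered labels."""
    new_array = np.transpose(array, axes).copy()  # copy, not a view
    new_labels = {new: labels[old] for new, old in enumerate(axes)}
    return new_array, new_labels


original = np.arange(6).reshape(2, 3)
labels = {0: "y", 1: "x"}
swapped, swapped_labels = transpose_with_labels(original, labels, (1, 0))

# The generic NumPy transpose shares memory with the original array.
view_shares_memory = np.shares_memory(original.T, original)
copy_shares_memory = np.shares_memory(swapped, original)
```

Here view_shares_memory is True and copy_shares_memory is False, so later changes to the original array do not affect the transposed copy.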

update_axis_label(index: int, item: str)

Update a single axis label value.

Parameters:
  • index (int) – The dimension to be updated.

  • item (str) – The new label for the selected dimension.

Raises:

ValueError – If the index is not in range of the Dataset dimensions or if the item is not a string.

update_axis_range(index: int, item: ndarray | Iterable)

Update a single axis range value.

Parameters:
  • index (int) – The dimension to be updated.

  • item (Union[np.ndarray, Iterable]) – The new item for the range of the selected dimension.

Raises:

ValueError – If the index is not in range of the Dataset dimensions.

update_axis_unit(index: int, item: str)

Update a single axis unit value.

Parameters:
  • index (int) – The dimension to be updated.

  • item (str) – The new unit for the selected dimension.

Raises:

ValueError – If the index is not in range of the Dataset dimensions or if the item is not a string.
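
The checks described for the label and unit updates can be sketched as follows. The helper update_entry is hypothetical, and rejecting negative indices is an assumption; pydidas may treat them differently.

```python
def update_entry(entries, index, item, ndim):
    """Set entries[index] = item after checking the index and the item type."""
    if not isinstance(index, int) or not 0 <= index < ndim:
        raise ValueError(f"The index {index!r} is out of range for {ndim} dimensions.")
    if not isinstance(item, str):
        raise ValueError(f"The item {item!r} is not a string.")
    entries[index] = item


axis_labels = {0: "", 1: ""}
update_entry(axis_labels, 1, "detector x", ndim=2)
```

After the call, axis_labels is {0: "", 1: "detector x"}; an index of 2 or a non-string item would raise a ValueError.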