audata package

Submodules

audata.dataset module

Classes for wrapping HDF5 datasets.

class audata.dataset.Dataset(au_parent, name)

Bases: Element

Maps to an HDF5 dataset, maintaining the audata schema and facilitating translation of higher-level data types. Generally should not be instantiated directly.

Parameters:
  • au_parent (Element) –

  • name (str) –

append(data, direct=False, time_cols=None, timedelta_cols=None)

Append additional data to a dataset.

Parameters:
  • data (DataFrame | recarray) –

  • direct (bool) –

  • time_cols (AbstractSet[str] | None) –

  • timedelta_cols (AbstractSet[str] | None) –

property columns: Dict[str, Any]

Get dictionary of column specifications.

get(idx=slice(None, -1, None), raw=False, datetimes=None)

Return a dataset as a pandas DataFrame.

Parameters:
  • raw (bool | None) –

  • datetimes (bool | None) –

Return type:

DataFrame

property ncol: int

Number of columns in dataset.

classmethod new(au_parent, name, value, overwrite=False, **kwargs)

Create a new Dataset object.

Parameters:
  • au_parent (Element) –

  • name (str) –

  • value (Dataset | ndarray | recarray | DataFrame) –

  • overwrite (bool) –

Return type:

Dataset

property nrow: int

Number of rows in dataset.

property shape: Tuple[int, int]

Get dataset shape tuple (rows, cols).

audata.element module

Base element class.

class audata.element.Element(parent=None, name='')

Bases: object

Represents an abstract audata element (e.g., files, groups, etc.). It should not be necessary to interact with this class directly.

Parameters:
  • parent (File | Element | None) –

  • name (str) –

clear()

Clear members to reset class.

property file_meta: Dict[str, Any]

File metadata (HDF5 .meta attribute of the built-in root group) (JSON dictionary, read-only)

property filename: str | None

Path to file (Optional[str], read-only)

property hdf: HLObject | None

Get wrapped HDF object.

property meta: Dict[str, Any]

Element metadata (HDF5 .meta attribute) (JSON dictionary)

property name: str | None

Element name (Optional[str], read-only)

property time_reference: dt.datetime | None

File time reference (Optional[dt.datetime], read-only)

property valid: bool

Is element valid? (bool, read-only)

audata.file module

HDF5 file wrapper class.

class audata.file.File(file, time_reference=None, return_datetimes=True)

Bases: Group

Wrapper around an HDF5 file.

The wrapper simplifies handling of audata files by automatically maintaining the correct underlying data schema while providing an intuitive interface and some convenience functions. Datasets can be accessed and updated by using a file object as a dictionary, a dictionary of dictionaries, or a dictionary where hierarchy is implied by use of the forward slash as a “directory” delimiter, similar to how datasets are accessed using h5py. Conversions needed to store higher-level data types that HDF5 does not support natively (e.g., timestamps, ranges, or categorical variables) are handled implicitly.

Generally, files are opened or created using the open or new class methods, respectively, instead of the constructor.

Example

Creating a new file and adding a dataset:

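>>> import datetime as dt
>>> import numpy as np
>>> import pandas as pd
>>> import audata
>>> UTC = dt.timezone.utc  # assumed definition of UTC for this example
>>> f = audata.File.new('test.h5', time_reference=dt.datetime(2020, 5, 4, tzinfo=UTC))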
>>> f['data'] = pd.DataFrame(data={
...     'time': f.time_reference + dt.timedelta(hours=1)*np.arange(3),
...     'a': [1,2,3],
...     'b': pd.Categorical(['a', 'b', 'c'])})
>>> f['data']
/data: Dataset [3 rows x 3 cols]
  time: time
  a: integer (signed)
  b: factor with 3 levels [a, b, c]
>>> f['data'][:]
                       time  a  b
0 2020-05-04 00:00:00+00:00  1  a
1 2020-05-04 01:00:00+00:00  2  b
2 2020-05-04 02:00:00+00:00  3  c
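The forward-slash addressing described above can be pictured with a toy nested mapping. This is only an illustrative sketch using plain dictionaries: the PathDict class is hypothetical and is not how audata or h5py implement path lookup. It shows that a slash-delimited key and chained dictionary access can reach the same object.

```python
class PathDict(dict):
    """Toy nested mapping (hypothetical; NOT audata's implementation)."""

    def __getitem__(self, key):
        # Split off the first path component; the remainder is resolved
        # recursively by the child mapping.
        head, _, rest = key.strip('/').partition('/')
        child = dict.__getitem__(self, head)
        return child[rest] if rest else child

    def __setitem__(self, key, value):
        head, _, rest = key.strip('/').partition('/')
        if rest:
            # Create intermediate "groups" on demand.
            if head not in self:
                dict.__setitem__(self, head, PathDict())
            dict.__getitem__(self, head)[rest] = value
        else:
            dict.__setitem__(self, head, value)


store = PathDict()
store['group/data'] = [1, 2, 3]

# Both access styles reach the same stored object.
print(store['group/data'])                            # [1, 2, 3]
print(store['group']['data'] is store['group/data'])  # True
```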

Instantiates the File object.

Generally, new or open is called instead; the constructor is not called directly.

Parameters:
  • file (File) – The opened HDF5 file object.

  • time_reference (datetime | None) – The file-level time reference.

  • return_datetimes (bool) – True if timestamps should be converted to dt.datetime objects, False if Unix timestamps (UTC) should be returned instead.
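To make the two timestamp representations named by return_datetimes concrete, the following standard-library sketch converts a timezone-aware dt.datetime to a Unix timestamp and back. It does not call audata; it only illustrates what the two forms look like for the same instant.

```python
import datetime as dt

# A timezone-aware instant: one hour after the time reference used in
# the example above.
stamp = dt.datetime(2020, 5, 4, 1, 0, tzinfo=dt.timezone.utc)

# The same instant as a Unix timestamp (seconds since 1970-01-01 UTC).
unix_seconds = stamp.timestamp()
print(unix_seconds)  # 1588554000.0

# Converting back yields an equal timezone-aware datetime.
restored = dt.datetime.fromtimestamp(unix_seconds, tz=dt.timezone.utc)
print(restored == stamp)  # True
```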

DateTimeFormat = '%Y-%m-%d %H:%M:%S.%f %Z'

close()

Close the file handle.

flush()

Flush changes to disk.

classmethod new(filename, overwrite=False, time_reference='now', metadata={}, return_datetimes=True, **kwargs)

Create a new file.

Parameters:
  • filename (str) – The path of the file to create.

  • overwrite (bool) – If True, existing files will be truncated. Otherwise, an existing file will cause an exception.

  • time_reference (str | datetime) – The time reference to use, or ‘now’ to use the time of file creation.

  • metadata (Dict[str, Any]) – An optional dict containing global metadata for the file.

  • return_datetimes (bool) – If True, times will be converted to dt.datetime objects; otherwise, Unix (UTC) timestamps are returned.

  • **kwargs – Additional keyword arguments will be passed on to h5.File’s constructor.

Returns:

The newly opened file object.

Return type:

File

classmethod open(filename, create=False, readonly=True, return_datetimes=True, **kwargs)

Open an audata file.

Parameters:
  • filename (str) – The path to the file to open.

  • create (bool) – If True, missing files will be created. Otherwise, missing files cause an exception.

  • readonly (bool) – If True, the file is opened read-only; otherwise it is opened in mutable mode.

  • return_datetimes (bool) – If True, times will be converted to dt.datetime objects; otherwise, Unix (UTC) timestamps are returned.

  • **kwargs – Additional keyword arguments will be passed on to h5.File’s constructor if a file is to be created.

Returns:

The opened file object.

Return type:

File

property time_reference: datetime

The timezone-aware time reference.

Can be set with either a dt.datetime object or a str that can be parsed as a datetime. If a naive datetime is provided, the local timezone will be inferred.
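The difference between naive and timezone-aware datetimes can be sketched with the standard library alone. The snippet below is not audata's code. It shows one way a naive datetime can be given the local timezone (datetime.astimezone() with no argument), which is an assumption about what "inferred" means here. It also formats an aware datetime with the DateTimeFormat string listed for the File class.

```python
import datetime as dt

# Format string copied from File.DateTimeFormat in this reference.
DATE_TIME_FORMAT = '%Y-%m-%d %H:%M:%S.%f %Z'

# A naive datetime carries no timezone information.
naive = dt.datetime(2020, 5, 4, 12, 0)

# astimezone() without arguments treats a naive value as local time and
# attaches the system's local timezone (the result depends on the machine).
aware_local = naive.astimezone()
print(aware_local.tzinfo is not None)  # True

# An explicitly UTC-aware datetime formats deterministically.
aware_utc = dt.datetime(2020, 5, 4, tzinfo=dt.timezone.utc)
text = aware_utc.strftime(DATE_TIME_FORMAT)
print(text)  # 2020-05-04 00:00:00.000000 UTC
```

Passing an explicitly timezone-aware datetime avoids any dependence on the machine's local timezone setting.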

audata.group module

Wrapper for Group types.

class audata.group.Group(au_parent, name='')

Bases: Element

Group element. Acts largely like a container for datasets. Generally should not be instantiated directly.

Parameters:
  • au_parent (Element) –

  • name (str) –

list()

List all child attributes, groups, and datasets.

Return type:

Dict[str, List[str]]

new_dataset(name, value, **kwargs)

Create a new dataset.

Parameters:
  • name (str) –

  • value (Dataset | np.ndarray | np.recarray | pd.DataFrame) –

recurse()

Recursively find all datasets. Groups and datasets prefixed with a period are ignored.

Returns:

Iterable (generator) of tuples of (object: Element, name: str).
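The traversal rule described for recurse() (walk the hierarchy, skip anything whose name starts with a period, yield object/name pairs lazily) can be sketched over plain nested dictionaries. This is a hypothetical stand-in, not audata's implementation. In particular, yielding the full slash-delimited path as the name is an assumption, since the reference does not say whether the name is a full path or a leaf name.

```python
from typing import Any, Dict, Iterator, Tuple


def recurse(tree: Dict[str, Any], prefix: str = '') -> Iterator[Tuple[Any, str]]:
    """Yield (leaf, path) pairs, skipping names that start with a period."""
    for name, child in tree.items():
        if name.startswith('.'):
            continue  # hidden groups and datasets are ignored
        path = f'{prefix}/{name}'
        if isinstance(child, dict):
            # A nested dict plays the role of a group: descend into it.
            yield from recurse(child, path)
        else:
            # Anything else plays the role of a dataset.
            yield child, path


# Nested dicts stand in for groups; lists stand in for datasets.
tree = {
    'data': [1, 2, 3],
    '.meta': {'hidden': [0]},
    'group': {'inner': [4, 5], '.index': [9]},
}

found = [path for _, path in recurse(tree)]
print(found)  # ['/data', '/group/inner']
```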

Module contents

Define version and import high-level objects.