audata package¶
Submodules¶
audata.annotation module¶
audata.dataset module¶
Classes for wrapping HDF5 datasets.
-
class
audata.dataset.Dataset(au_parent, name)¶
Bases: audata.element.Element
Maps to an HDF5 dataset, maintaining the audata schema and facilitating translation of higher-level data types. Generally should not be instantiated directly.
- Parameters
au_parent (audata.element.Element) –
name (str) –
-
append(data, direct=False, time_cols=None, timedelta_cols=None)¶ Append additional data to a dataset.
- Parameters
data (Union[pandas.core.frame.DataFrame, numpy.recarray]) –
direct (bool) –
time_cols (Optional[AbstractSet[str]]) –
timedelta_cols (Optional[AbstractSet[str]]) –
-
property
columns¶ Get dictionary of column specifications.
-
get(idx=slice(None, -1, None), raw=False, datetimes=None)¶ Return a dataset as a pandas DataFrame.
- Parameters
raw (bool) –
datetimes (Optional[bool]) –
- Return type
pandas.core.frame.DataFrame
-
property
ncol¶ Number of columns in dataset.
-
classmethod
new(au_parent, name, value, overwrite=False, **kwargs)¶ Create a new Dataset object.
- Parameters
au_parent (audata.element.Element) –
name (str) –
value (Union[h5py._hl.dataset.Dataset, numpy.ndarray, numpy.recarray, pandas.core.frame.DataFrame]) –
overwrite (bool) –
- Return type
-
property
nrow¶ Number of rows in dataset.
-
property
shape¶ Get dataset shape tuple (rows, cols).
audata.element module¶
Base element class.
-
class
audata.element.Element(parent=None, name='')¶
Bases: object
Represents an abstract audata element (e.g., a file or a group). It should not be necessary to interact with this class directly.
- Parameters
parent (Optional[Union[h5py._hl.files.File, Element]]) –
name (str) –
-
clear()¶ Clear members to reset class.
-
property
filename¶ Path to file (Optional[str], read-only)
-
property
hdf¶ Get wrapped HDF object.
-
property
meta¶ Element meta data (HDF5 .meta attribute) (JSON dictionary)
-
property
meta_audata¶ File audata meta, ‘.meta/audata’ attribute (JSON dictionary, read-only)
-
property
meta_data¶ File data meta, ‘.meta/data’ attribute (JSON dictionary, read-only)
-
property
name¶ Element name (Optional[str], read-only)
-
property
time_reference¶ File time reference (Optional[dt.datetime], read-only)
-
property
valid¶ Is element valid? (bool, read-only)
audata.file module¶
HDF5 file wrapper class.
-
class
audata.file.File(file, time_reference=None, return_datetimes=True)¶
Bases: audata.group.Group
Wrapper around an HDF5 file.
The wrapper simplifies working with audata files by automatically maintaining the correct underlying data schema while providing an intuitive interface and some convenience functions. Datasets can be accessed and updated by using a file object as a dictionary, a dictionary of dictionaries, or a dictionary where hierarchy is implied by use of the forward slash as a “directory” delimiter, similar to how datasets are accessed using h5py. Conversion of higher-level data types that HDF5 does not support natively (e.g., timestamps, ranges, or categorical variables) is handled implicitly.
Generally, files are opened or created using the open or new class methods, respectively, instead of the constructor.
Example
Creating a new file and adding a dataset:
>>> f = audata.File.new('test.h5', time_reference=dt.datetime(2020, 5, 4, tzinfo=UTC))
>>> f['data'] = pd.DataFrame(data={
...     'time': f.time_reference + dt.timedelta(hours=1)*np.arange(3),
...     'a': [1,2,3],
...     'b': pd.Categorical(['a', 'b', 'c'])})
>>> f['data']
/data: Dataset [3 rows x 3 cols]
  time: time
  a: integer (signed)
  b: factor with 3 levels [a, b, c]
>>> f['data'][:]
                       time  a  b
0 2020-05-04 00:00:00+00:00  1  a
1 2020-05-04 01:00:00+00:00  2  b
2 2020-05-04 02:00:00+00:00  3  c
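The dictionary-style access described above, where a forward slash implies hierarchy, can be pictured with a small stand-alone sketch. This is a toy model built on plain nested dictionaries, not audata's implementation; the helper names `resolve` and `assign` and the `waveforms/ecg` path are hypothetical.

```python
# Toy illustration only: NOT audata's implementation.
# It mimics how a slash-delimited key can address nested containers.

def resolve(tree, path):
    """Walk a nested dict using a '/'-delimited path such as 'group/data'."""
    node = tree
    for part in path.strip('/').split('/'):
        node = node[part]  # raises KeyError if a component is missing
    return node


def assign(tree, path, value):
    """Store value at a '/'-delimited path, creating intermediate dicts."""
    *parents, leaf = path.strip('/').split('/')
    node = tree
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


tree = {}
assign(tree, 'waveforms/ecg', [1, 2, 3])

# The slash-delimited key and the dictionary-of-dictionaries style
# reach the same object.
same = resolve(tree, 'waveforms/ecg') is tree['waveforms']['ecg']
```

With a real file object, the equivalent forms would presumably be `f['waveforms/ecg']` and `f['waveforms']['ecg']`.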
Instantiates the File object.
Generally, new or open is called instead; the constructor is not called directly.
- Parameters
file (h5py._hl.files.File) – The opened HDF5 file object.
time_reference (Optional[datetime.datetime]) – The file-level time reference.
return_datetimes (bool) – True if timestamps should be converted to dt.datetime objects, False if Unix timestamps (UTC) should be returned instead.
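The two representations selected by return_datetimes can be compared using only the standard library. The sketch below is an assumption-laden illustration, not audata code: `present_time` is a hypothetical helper, and it only shows the relationship between a UTC Unix timestamp and the matching timezone-aware datetime.

```python
import datetime as dt


def present_time(unix_seconds, return_datetimes=True):
    """Hypothetical helper: return an aware datetime or the raw Unix timestamp."""
    if return_datetimes:
        # Interpret the number as seconds since the Unix epoch, in UTC.
        return dt.datetime.fromtimestamp(unix_seconds, tz=dt.timezone.utc)
    return unix_seconds


stamp = dt.datetime(2020, 5, 4, 1, tzinfo=dt.timezone.utc).timestamp()
as_datetime = present_time(stamp)                        # aware datetime in UTC
as_unix = present_time(stamp, return_datetimes=False)    # plain float
```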
-
DateTimeFormat= '%Y-%m-%d %H:%M:%S.%f %Z'¶
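The format string above follows the standard strftime directives. The source does not state where audata applies this constant, so the sketch below only shows what a timezone-aware datetime looks like when rendered with the same pattern; the name `DATE_TIME_FORMAT` is a local copy made for illustration.

```python
import datetime as dt

# Local copy of the pattern shown for File.DateTimeFormat.
DATE_TIME_FORMAT = '%Y-%m-%d %H:%M:%S.%f %Z'

reference = dt.datetime(2020, 5, 4, tzinfo=dt.timezone.utc)
text = reference.strftime(DATE_TIME_FORMAT)
print(text)  # 2020-05-04 00:00:00.000000 UTC
```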
-
close()¶ Close the file handle.
-
flush()¶ Flush changes to disk.
-
classmethod
new(filename, overwrite=False, time_reference='now', title=None, author=None, organization=None, return_datetimes=True, **kwargs)¶ Create a new file.
- Parameters
filename (str) – The path of the file to create.
overwrite (bool) – If True, existing files will be truncated. Otherwise, an existing file will cause an exception.
time_reference (Union[Literal['now'], datetime.datetime]) – The time reference to use, or ‘now’ to use the time of file creation.
title (Optional[str]) – A title for the dataset.
author (Optional[str]) – The dataset author.
organization (Optional[str]) – The dataset organization.
return_datetimes (bool) – If True, times will be converted to dt.datetime objects; otherwise, Unix (UTC) timestamps are returned.
**kwargs – Additional keyword arguments will be passed on to h5py.File’s constructor.
- Returns
The newly opened file object.
- Return type
-
classmethod
open(filename, create=False, readonly=True, return_datetimes=True, **kwargs)¶ Open an audata file.
- Parameters
filename (str) – The path to the file to open.
create (bool) – If True, missing files will be created. Otherwise, missing files cause an exception.
readonly (bool) – Whether to open the file in read-only mode (True) or writable mode (False).
return_datetimes (bool) – If True, times will be converted to dt.datetime objects; otherwise, Unix (UTC) timestamps are returned.
**kwargs – Additional keyword arguments will be passed on to h5py.File’s constructor if a file is to be created.
- Returns
The opened file object.
- Return type
-
property
time_reference¶ The timezone-aware time reference.
Can be set with either a dt.datetime object or a str that can be parsed as a datetime. If a naive datetime is provided, the local timezone will be inferred.
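A minimal sketch of the behaviour described above, using only the standard library. It is not audata's implementation: `to_aware` is a hypothetical helper, and parsing strings with `datetime.fromisoformat` is an assumption, since the source does not say which parser audata uses.

```python
import datetime as dt


def to_aware(value):
    """Hypothetical helper: accept a datetime or ISO-format str, return an aware datetime."""
    if isinstance(value, str):
        # Assumption: ISO 8601 text; the real parser may accept more formats.
        value = dt.datetime.fromisoformat(value)
    if value.tzinfo is None or value.utcoffset() is None:
        # A naive datetime is treated as local time and given the local timezone.
        value = value.astimezone()
    return value


from_naive = to_aware(dt.datetime(2020, 5, 4))
from_text = to_aware('2020-05-04 00:00:00+00:00')
```

The naive input keeps its wall-clock value and gains the system's local timezone, while input that already carries an offset is returned with that offset unchanged.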
audata.group module¶
Wrapper for Group types.
-
class
audata.group.Group(au_parent, name='')¶
Bases: audata.element.Element
Group element. Acts largely like a container for datasets. Generally should not be instantiated directly.
- Parameters
au_parent (audata.element.Element) –
name (str) –
-
list()¶ List all child attributes, groups, and datasets.
- Return type
Dict[str, List[str]]
-
new_dataset(name, value, **kwargs)¶ Create a new dataset.
- Parameters
name (str) –
value (Union[h5py._hl.dataset.Dataset, np.ndarray, np.recarray, pd.DataFrame]) –
-
recurse()¶ Recursively find all datasets. Groups and datasets prefixed with a period are ignored.
- Returns
Iterable (generator) of tuples of (object: Element, name: str).
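The traversal that recurse() describes can be sketched as a generator over nested dictionaries. This is a toy model, not audata's code: dictionaries stand in for groups, strings stand in for datasets, and yielding the full slash-delimited path as the name is an assumption.

```python
def recurse(tree, prefix=''):
    """Yield (object, name) pairs for every leaf, skipping '.'-prefixed names."""
    for name, child in tree.items():
        if name.startswith('.'):
            continue  # hidden groups and datasets are ignored
        path = f'{prefix}/{name}'
        if isinstance(child, dict):
            yield from recurse(child, path)  # descend into the sub-group
        else:
            yield child, path


# Dicts play the role of groups; strings play the role of datasets.
tree = {
    'data': 'dataset-1',
    '.meta': {'audata': 'hidden'},
    'group': {'inner': 'dataset-2', '.hidden': 'dataset-3'},
}
found = [name for _, name in recurse(tree)]
print(found)  # ['/data', '/group/inner']
```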
Module contents¶
Define version and import high-level objects.