Data Tiers¶
DataJoint assigns all tables to one of the following data tiers that differentiate how the data originate.
Tier |
Superclass |
Description |
---|---|---|
Lookup |
|
Small tables containing general facts and settings of the data pipeline; not specific to any experiment or dataset. |
Manual |
|
Data entered from outside the pipeline, either by hand or with external helper scripts. |
Imported |
|
Data ingested automatically inside the pipeline but requiring access to data outside the pipeline. |
Computed |
|
Data computed automatically entirely inside the pipeline. |
Table data tiers indicate to database administrators how valuable the data are. Manual data are the most valuable, as re-entry may be tedious or impossible. Computed data are safe to delete, as the data can always be recomputed from within DataJoint. Imported data are safer than manual data but less safe than computed data because of dependency on external data sources. With these considerations, database administrators may opt not to back up computed data, for example, or to back up imported data less frequently than manual data.
The data tier of a table is specified by the superclass of its class.
For example, the User class in Table Definition uses the dj.Manual
superclass.
Therefore, the corresponding User table on the database would be of the Manual tier.
Furthermore, the classes for imported and computed tables have additional capabilities for automated processing as described in Auto-populate.
Internal conventions for naming tables¶
On the server side, DataJoint uses a naming scheme to generate a table name corresponding to a given class. The naming scheme includes prefixes specifying each table’s data tier.
First, the name of the class is converted from CamelCase
to snake_case
(separation by underscores).
Then the name is prefixed according to the data tier.
Manual
tables have no prefix.
Lookup
tables are prefixed with#
.
Imported
tables are prefixed with_
, a single underscore.
Computed
tables are prefixed with__
, two underscores.
For example:
The table for the class StructuralScan
subclassing dj.Manual
will be named structural_scan
.
The table for the class SpatialFilter
subclassing dj.Lookup
will be named #spatial_filter
.
Again, the internal table names including prefixes are used only on the server side. These are never visible to the user, and DataJoint users do not need to know these conventions However, database administrators may use these naming patterns to set backup policies or to restrict access based on data tiers.
Part tables¶
Part tables do not have their own tier. Instead, they share the same tier as their master table. The prefix for part tables also differs from the other tiers. They are prefixed by the name of their master table, separated by two underscores.
For example, the table for the class Channel(dj.Part)
with the master Ephys(dj.Imported)
will be named _ephys__channel
.