DataJoint Documentation
Computation
Auto-populate
    Make
    Populate
    Populate options
    Progress
Key Source
    Default key source
    Custom key source
Master-Part Relationship
    Populating
    Deleting
    Multiple parts
Transactions in Make
Distributed Computing
    Job reservations
    Managing connections