Configure Database Connection¶
Set up your DataJoint database connection.
Tip: DataJoint.com handles database configuration automatically with fully managed infrastructure and support.
Example: See lcms-demo for a complete DataJoint project with configuration examples.
Configuration Structure¶
DataJoint separates configuration into two parts:
datajoint.jsonโ Non-sensitive settings (checked into version control).secrets/directory โ Credentials and secrets (never committed)
Project Configuration (datajoint.json)¶
Create datajoint.json in your project root for non-sensitive settings:
{
"database.host": "db.example.com",
"database.port": 3306,
"database.use_tls": true,
"safemode": true
}
This file should be committed to version control.
Secrets Directory (.secrets/)¶
Store credentials in .secrets/datajoint.json:
{
"database.user": "myuser",
"database.password": "mypassword"
}
Important: Add .secrets/ to your .gitignore:
.secrets/
Environment Variables¶
For CI/CD and production, use environment variables:
export DJ_HOST=db.example.com
export DJ_USER=myuser
export DJ_PASS=mypassword
Environment variables take precedence over config files.
Configuration Settings¶
| Setting | Environment | Default | Description |
|---|---|---|---|
database.host |
DJ_HOST |
localhost |
Database server hostname |
database.port |
DJ_PORT |
Auto | Database server port (3306 for MySQL, 5432 for PostgreSQL) |
database.user |
DJ_USER |
โ | Database username |
database.password |
DJ_PASS |
โ | Database password |
database.backend |
DJ_BACKEND |
mysql |
Database backend: mysql or postgresql |
database.name |
DJ_DATABASE_NAME |
None |
Database name (PostgreSQL only). Defaults to "postgres" |
database.use_tls |
DJ_TLS |
True |
Use TLS encryption |
database.reconnect |
โ | True |
Auto-reconnect on timeout |
safemode |
โ | True |
Prompt before destructive operations |
Test Connection¶
import datajoint as dj
# Connects using configured credentials
conn = dj.conn()
print(f"Connected to {conn.host}")
Programmatic Configuration¶
For scripts, you can set configuration programmatically:
import datajoint as dj
dj.config['database.host'] = 'localhost'
# Credentials from environment or secrets file
Temporary Override¶
with dj.config.override(database={'host': 'test-server'}):
# Uses test-server for this block only
conn = dj.conn()
Configuration Precedence¶
- Programmatic settings (highest priority)
- Environment variables
.secrets/datajoint.jsondatajoint.json- Default values (lowest priority)
TLS Configuration¶
For production, always use TLS:
{
"database.use_tls": true
}
For local development without TLS:
{
"database.use_tls": false
}
PostgreSQL Backend¶
New in 2.1
PostgreSQL is now supported as an alternative database backend.
DataJoint supports both MySQL and PostgreSQL backends. To use PostgreSQL:
Configuration File¶
{
"database": {
"host": "localhost",
"backend": "postgresql"
}
}
The port defaults to 5432 when backend is set to postgresql.
Database Name¶
New in 2.2.1
The database.name setting specifies which PostgreSQL database to connect to.
PostgreSQL requires connecting to a specific database. By default, DataJoint connects to the postgres database. To use a different database:
{
"database": {
"host": "localhost",
"backend": "postgresql",
"name": "my_database"
}
}
Or via environment variable:
export DJ_DATABASE_NAME=my_database
Or programmatically:
dj.config['database.name'] = 'my_database'
This setting only applies to PostgreSQL. Setting it with the MySQL backend emits a warning.
Environment Variable¶
export DJ_BACKEND=postgresql
export DJ_HOST=localhost
export DJ_USER=postgres
export DJ_PASS=password
export DJ_DATABASE_NAME=my_database # optional, defaults to "postgres"
Programmatic Configuration¶
import datajoint as dj
dj.config['database.backend'] = 'postgresql'
dj.config['database.host'] = 'localhost'
dj.config['database.name'] = 'my_database' # optional
Docker Compose for Local Development¶
services:
postgres:
image: postgres:15
environment:
- POSTGRES_PASSWORD=password
- POSTGRES_USER=postgres
- POSTGRES_DB=test
ports:
- "5432:5432"
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
timeout: 30s
retries: 5
Backend Compatibility¶
DataJoint's core types and query operators work identically on both backends. Table definitions, queries, and computations are portable between MySQL and PostgreSQL without code changes.
Connection Lifecycle¶
Persistent Connection (Default)¶
DataJoint uses a persistent singleton connection by default:
import datajoint as dj
# First call establishes connection
conn = dj.conn()
# Subsequent calls return the same connection
conn2 = dj.conn() # Same as conn
# Reset to create a new connection
conn3 = dj.conn(reset=True) # New connection
This is ideal for interactive sessions and notebooks.
Context Manager (Explicit Cleanup)¶
For serverless environments (AWS Lambda, Cloud Functions) or when you need explicit connection lifecycle control, use the context manager:
import datajoint as dj
with dj.Connection(host, user, password) as conn:
schema = dj.Schema('my_schema', connection=conn)
MyTable().insert(data)
# Connection automatically closed when exiting the block
The connection closes automatically even if an exception occurs:
try:
with dj.Connection(**creds) as conn:
schema = dj.Schema('my_schema', connection=conn)
MyTable().insert(data)
raise SomeError()
except SomeError:
pass
# Connection is still closed properly
Manual Close¶
You can also close a connection explicitly:
conn = dj.conn()
# ... do work ...
conn.close()
Instance-Based Connections¶
New in 2.2
dj.Instance provides isolated connections independent of global config.
For applications that need multiple connections or thread safety, use dj.Instance instead of global config:
import datajoint as dj
inst = dj.Instance(host="db.example.com", user="myuser", password="mypassword")
schema = inst.Schema("my_schema")
Each Instance has its own config and connection. This is useful for:
- Web servers: One Instance per request or tenant
- Testing: Isolated databases per test
- Multi-database: Connect to production and staging simultaneously
- Thread safety: Set
DJ_THREAD_SAFE=trueto enforce Instance usage
# Multiple simultaneous connections
prod = dj.Instance(host="prod.example.com", user="analyst", password="...")
staging = dj.Instance(host="staging.example.com", user="dev", password="...")
See Use Isolated Instances for a complete guide.