Skip to content

Configure Database Connection

Set up your DataJoint database connection.

Tip: DataJoint.com handles database configuration automatically with fully managed infrastructure and support.

Example: See lcms-demo for a complete DataJoint project with configuration examples.

Configuration Structure

DataJoint separates configuration into two parts:

  1. datajoint.json โ€” Non-sensitive settings (checked into version control)
  2. .secrets/ directory โ€” Credentials and secrets (never committed)

Project Configuration (datajoint.json)

Create datajoint.json in your project root for non-sensitive settings:

{
  "database.host": "db.example.com",
  "database.port": 3306,
  "database.use_tls": true,
  "safemode": true
}

This file should be committed to version control.

Secrets Directory (.secrets/)

Store credentials in .secrets/datajoint.json:

{
  "database.user": "myuser",
  "database.password": "mypassword"
}

Important: Add .secrets/ to your .gitignore:

.secrets/

Environment Variables

For CI/CD and production, use environment variables:

export DJ_HOST=db.example.com
export DJ_USER=myuser
export DJ_PASS=mypassword

Environment variables take precedence over config files.

Configuration Settings

Setting Environment Default Description
database.host DJ_HOST localhost Database server hostname
database.port DJ_PORT Auto Database server port (3306 for MySQL, 5432 for PostgreSQL)
database.user DJ_USER โ€” Database username
database.password DJ_PASS โ€” Database password
database.backend DJ_BACKEND mysql Database backend: mysql or postgresql
database.name DJ_DATABASE_NAME None Database name (PostgreSQL only). Defaults to "postgres"
database.use_tls DJ_TLS True Use TLS encryption
database.reconnect โ€” True Auto-reconnect on timeout
safemode โ€” True Prompt before destructive operations

Test Connection

import datajoint as dj

# Connects using configured credentials
conn = dj.conn()
print(f"Connected to {conn.host}")

Programmatic Configuration

For scripts, you can set configuration programmatically:

import datajoint as dj

dj.config['database.host'] = 'localhost'
# Credentials from environment or secrets file

Temporary Override

with dj.config.override(database={'host': 'test-server'}):
    # Uses test-server for this block only
    conn = dj.conn()

Configuration Precedence

  1. Programmatic settings (highest priority)
  2. Environment variables
  3. .secrets/datajoint.json
  4. datajoint.json
  5. Default values (lowest priority)

TLS Configuration

For production, always use TLS:

{
  "database.use_tls": true
}

For local development without TLS:

{
  "database.use_tls": false
}

PostgreSQL Backend

New in 2.1

PostgreSQL is now supported as an alternative database backend.

DataJoint supports both MySQL and PostgreSQL backends. To use PostgreSQL:

Configuration File

{
  "database": {
    "host": "localhost",
    "backend": "postgresql"
  }
}

The port defaults to 5432 when backend is set to postgresql.

Database Name

New in 2.2.1

The database.name setting specifies which PostgreSQL database to connect to.

PostgreSQL requires connecting to a specific database. By default, DataJoint connects to the postgres database. To use a different database:

{
  "database": {
    "host": "localhost",
    "backend": "postgresql",
    "name": "my_database"
  }
}

Or via environment variable:

export DJ_DATABASE_NAME=my_database

Or programmatically:

dj.config['database.name'] = 'my_database'

This setting only applies to PostgreSQL. Setting it with the MySQL backend emits a warning.

Environment Variable

export DJ_BACKEND=postgresql
export DJ_HOST=localhost
export DJ_USER=postgres
export DJ_PASS=password
export DJ_DATABASE_NAME=my_database  # optional, defaults to "postgres"

Programmatic Configuration

import datajoint as dj

dj.config['database.backend'] = 'postgresql'
dj.config['database.host'] = 'localhost'
dj.config['database.name'] = 'my_database'  # optional

Docker Compose for Local Development

services:
  postgres:
    image: postgres:15
    environment:
      - POSTGRES_PASSWORD=password
      - POSTGRES_USER=postgres
      - POSTGRES_DB=test
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      timeout: 30s
      retries: 5

Backend Compatibility

DataJoint's core types and query operators work identically on both backends. Table definitions, queries, and computations are portable between MySQL and PostgreSQL without code changes.

Connection Lifecycle

Persistent Connection (Default)

DataJoint uses a persistent singleton connection by default:

import datajoint as dj

# First call establishes connection
conn = dj.conn()

# Subsequent calls return the same connection
conn2 = dj.conn()  # Same as conn

# Reset to create a new connection
conn3 = dj.conn(reset=True)  # New connection

This is ideal for interactive sessions and notebooks.

Context Manager (Explicit Cleanup)

For serverless environments (AWS Lambda, Cloud Functions) or when you need explicit connection lifecycle control, use the context manager:

import datajoint as dj

with dj.Connection(host, user, password) as conn:
    schema = dj.Schema('my_schema', connection=conn)
    MyTable().insert(data)
# Connection automatically closed when exiting the block

The connection closes automatically even if an exception occurs:

try:
    with dj.Connection(**creds) as conn:
        schema = dj.Schema('my_schema', connection=conn)
        MyTable().insert(data)
        raise SomeError()
except SomeError:
    pass
# Connection is still closed properly

Manual Close

You can also close a connection explicitly:

conn = dj.conn()
# ... do work ...
conn.close()

Instance-Based Connections

New in 2.2

dj.Instance provides isolated connections independent of global config.

For applications that need multiple connections or thread safety, use dj.Instance instead of global config:

import datajoint as dj

inst = dj.Instance(host="db.example.com", user="myuser", password="mypassword")
schema = inst.Schema("my_schema")

Each Instance has its own config and connection. This is useful for:

  • Web servers: One Instance per request or tenant
  • Testing: Isolated databases per test
  • Multi-database: Connect to production and staging simultaneously
  • Thread safety: Set DJ_THREAD_SAFE=true to enforce Instance usage
# Multiple simultaneous connections
prod = dj.Instance(host="prod.example.com", user="analyst", password="...")
staging = dj.Instance(host="staging.example.com", user="dev", password="...")

See Use Isolated Instances for a complete guide.