DataJoint Elements for 2-Photon Calcium Imaging
Open-source data pipeline for processing and analyzing fluorescent imaging datasets.
Welcome to the tutorial for the DataJoint Element for calcium imaging. This tutorial
aims to provide a comprehensive understanding of the open-source data pipeline created
using `element-calcium-imaging`.

This package is designed to seamlessly process, ingest, and track calcium imaging data,
along with its associated parameters, such as those used for image segmentation or motion
correction, and scan-level metadata. By the end of this tutorial you will have a clear
grasp of setting up and integrating `element-calcium-imaging` into your specific
research projects and lab.
Prerequisites
Please see the datajoint tutorials GitHub repository before proceeding.
A basic understanding of the following DataJoint concepts will be beneficial to your understanding of this tutorial:
- The `Imported` and `Computed` table types in `datajoint-python`.
- The functionality of the `.populate()` method.
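Conceptually, calling `.populate()` on an `Imported` or `Computed` table finds every upstream key that does not yet have a result and calls the table's `make()` method for each one. The following is a purely conceptual mimic in plain Python, not DataJoint code; the names (`populate`, `make`, the "session" keys) are made up for this sketch:

```python
def populate(upstream_keys, computed, make):
    """Conceptual mimic of DataJoint's .populate(): call make() for every
    upstream key that has no computed entry yet, skipping finished ones."""
    for key in upstream_keys:
        if key not in computed:
            computed[key] = make(key)
    return computed

# Two upstream "sessions"; one result already exists, so only the
# missing one is computed.
upstream = ["session1", "session2"]
results = {"session1": "already done"}
populate(upstream, results, make=lambda key: f"processed {key}")
```

In the real pipeline, `make()` reads the raw data for a key, computes the results, and inserts them, so re-running `.populate()` is always safe: entries that already exist are skipped.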
Tutorial Overview
- Setup
- Activate the DataJoint pipeline.
- Insert subject, session, and scan metadata.
- Populate scan-level metadata from image files.
- Run the image processing task.
- Curate the results (optional).
- Visualize the results.
Setup
This tutorial examines calcium imaging data acquired with ScanImage and processed via
`suite2p`. The goal is to store, track, and manage sessions of calcium imaging data,
including all outputs of image segmentation, fluorescence traces, and deconvolved
activity traces.

The results of this Element can be combined with other modalities to create
a complete, customizable data pipeline for your specific lab or study. For instance, you
can combine `element-calcium-imaging` with `element-array-ephys` and
`element-deeplabcut` to characterize neural activity along with markerless
pose estimation during behavior.
Let's start this tutorial by importing the packages necessary to run the notebook.
import datajoint as dj
import datetime
import matplotlib.pyplot as plt
import numpy as np
If the tutorial is run in Codespaces, a private, local database server is created and made available for you. This is where we will insert and store our processed results. Let's connect to the database server.
dj.conn()
Activate the DataJoint Pipeline
This tutorial activates the `imaging.py` module from `element-calcium-imaging`, along
with upstream dependencies from `element-animal` and `element-session`. Please refer to
the `tutorial_pipeline.py` file for the source code.
from tutorial_pipeline import (
lab,
subject,
session,
scan,
imaging,
imaging_report,
Equipment,
)
[2023-06-30 17:56:21,374][WARNING]: lab.Project and related tables will be removed in a future version of Element Lab. Please use the project schema.
[2023-06-30 17:56:21,377][INFO]: Connecting root@fakeservices.datajoint.io:3306
[2023-06-30 17:56:21,401][INFO]: Connected root@fakeservices.datajoint.io:3306
We can represent the tables in the `scan` and `imaging` schemas, as well as some of the
upstream dependencies to the `session` and `subject` schemas, as a diagram.
(
dj.Diagram(subject.Subject)
+ dj.Diagram(session.Session)
+ dj.Diagram(scan)
+ dj.Diagram(imaging)
)
As evident from the diagram, this data pipeline encompasses tables associated with
scan metadata, results of image processing, and optional curation of image processing
results. A few tables, such as `subject.Subject` or `session.Session`,
while important for a complete pipeline, fall outside the scope of the `element-calcium-imaging`
tutorial and will therefore not be explored extensively here. The primary focus of
this tutorial will be on the `scan` and `imaging` schemas.
Insert subject, session, and scan metadata
Let's start with the first table in the schema diagram (i.e., the `subject.Subject` table).

To know what data to insert into the table, we can view its dependencies and attributes using the `.describe()` and `.heading` methods.
subject.Subject()
subject | subject_nickname | sex | subject_birth_date | subject_description |
---|---|---|---|---|
Total: 0
print(subject.Subject.describe())
subject              : varchar(8)
---
subject_nickname     : varchar(64)
sex                  : enum('M','F','U')
subject_birth_date   : date
subject_description  : varchar(1024)
subject.Subject.heading
subject              : varchar(8)
---
subject_nickname     : varchar(64)
sex                  : enum('M','F','U')
subject_birth_date   : date
subject_description  : varchar(1024)
The cells above show all attributes of the subject table. We will insert data into the
`subject.Subject` table.
subject.Subject.insert1(
dict(
subject="subject1",
subject_nickname="subject1_nickname",
sex="F",
subject_birth_date="2020-01-01",
subject_description="ScanImage acquisition. Suite2p processing.",
)
)
subject.Subject()
subject | subject_nickname | sex | subject_birth_date | subject_description |
---|---|---|---|---|
subject1 | subject1_nickname | F | 2020-01-01 | ScanImage acquisition. Suite2p processing. |
Total: 1
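`insert1()` adds a single row; its plural counterpart `insert()` accepts a sequence of dictionaries, which is convenient when registering several subjects at once. A sketch with made-up subject names follows; only the list construction is shown here, and in the notebook you would pass it to `subject.Subject.insert(...)`:

```python
# Hypothetical additional subjects; rows are plain dictionaries whose keys
# match the attributes shown by subject.Subject.heading above.
new_subjects = [
    dict(
        subject=f"subject{i}",
        subject_nickname=f"subject{i}_nickname",
        sex="U",
        subject_birth_date="2020-01-01",
        subject_description="Batch-registered example subject.",
    )
    for i in (2, 3)
]
# subject.Subject.insert(new_subjects)  # plural insert takes an iterable of rows
```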
Let's repeat the steps above for the `Session` table and see how the output varies
between `.describe` and `.heading`.
print(session.Session.describe())
-> subject.Subject
session_datetime : datetime
session.Session.heading
subject              : varchar(8)
session_datetime     : datetime
Notice that `describe` displays the table's structure and highlights its dependencies, such as its reliance on the `Subject` table. These dependencies represent foreign key references, linking data across tables.

On the other hand, `heading` provides an exhaustive list of the table's attributes. This
list includes both the attributes declared in this table and any inherited from upstream
tables.
With this understanding, let's move on to insert a session associated with our subject.
We will insert into the `session.Session` table by passing a dictionary to the `insert1`
method.
session_key = dict(subject="subject1", session_datetime="2021-04-30 12:22:15")
session.Session.insert1(session_key)
session.Session()
subject | session_datetime |
---|---|
subject1 | 2021-04-30 12:22:15 |
Total: 1
Every experimental session produces a set of data files. The purpose of the `SessionDirectory`
table is to locate these files. It references a directory path relative to a root directory defined in `dj.config["custom"]`. More information about `dj.config` is provided in the documentation.
session.SessionDirectory.insert1(dict(**session_key, session_dir="subject1/session1"))
session.SessionDirectory()
subject | session_datetime | session_dir Path to the data directory for a session |
---|---|---|
subject1 | 2021-04-30 12:22:15 | subject1/session1 |
Total: 1
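For reference, the root directory that `SessionDirectory` paths are resolved against is set in `dj.config` before the pipeline is activated. A minimal sketch follows; the path below is a placeholder, and in Codespaces this is already configured for you:

```python
import datajoint as dj

# Root directory against which session_dir (e.g., "subject1/session1")
# is resolved; replace the placeholder with your own data location.
dj.config["custom"] = {"imaging_root_data_dir": "/path/to/imaging/root"}
```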
As the diagram indicates, the tables in the `scan` schema need to
contain data before the tables in the `imaging` schema accept any data. Let's
start by inserting into `scan.Scan`, a table containing metadata about a calcium imaging
scan.
print(scan.Scan.describe())
-> session.Session
scan_id : int
---
-> [nullable] Equipment
-> scan.AcquisitionSoftware
scan_notes : varchar(4095)
The `Scan` table's attributes include foreign key references to the `Session` and
`Equipment` tables. Let's insert into the `Equipment` table and then `Scan`.
Equipment.insert1(
dict(
device="Mesoscope1",
modality="Calcium imaging",
description="Example microscope",
)
)
scan.Scan.insert1(
dict(
**session_key,
scan_id=0,
device="Mesoscope1",
acq_software="ScanImage",
scan_notes="",
)
)
scan.Scan()
subject | session_datetime | scan_id | device | acq_software | scan_notes |
---|---|---|---|---|---|
subject1 | 2021-04-30 12:22:15 | 0 | Mesoscope1 | ScanImage | |
Total: 1
Populate calcium imaging scan metadata
In the upcoming cells, the .populate()
method will automatically extract and store the
recording metadata for each experimental session in the scan.ScanInfo
table and its part table scan.ScanInfo.Field
.
scan.ScanInfo()
# General data about the resoscans/mesoscans from header
subject              : varchar(8)
session_datetime     : datetime
scan_id              : int
---
nfields              : tinyint       # number of fields
nchannels            : tinyint       # number of channels
ndepths              : int           # Number of scanning depths (planes)
nframes              : int           # number of recorded frames
nrois                : tinyint       # number of ROIs (see scanimage's multi ROI imaging)
x=null               : float         # (um) ScanImage's 0 point in the motor coordinate system
y=null               : float         # (um) ScanImage's 0 point in the motor coordinate system
z=null               : float         # (um) ScanImage's 0 point in the motor coordinate system
fps                  : float         # (Hz) frames per second - Volumetric Scan Rate
bidirectional        : tinyint       # true = bidirectional scanning
usecs_per_line=null  : float         # microseconds per scan line
fill_fraction=null   : float         # raster scan temporal fill fraction (see scanimage)
scan_datetime=null   : datetime      # datetime of the scan
scan_duration=null   : float         # (seconds) duration of the scan
bidirectional_z=null : tinyint       # true = bidirectional z-scan
scan.ScanInfo.Field()
# field-specific scan information
subject              : varchar(8)
session_datetime     : datetime
scan_id              : int
field_idx            : int
---
px_height            : smallint      # height in pixels
px_width             : smallint      # width in pixels
um_height=null       : float         # height in microns
um_width=null        : float         # width in microns
field_x=null         : float         # (um) center of field in the motor coordinate system
field_y=null         : float         # (um) center of field in the motor coordinate system
field_z=null         : float         # (um) relative depth of field
delay_image=null     : longblob      # (ms) delay between the start of the scan and pixels in this field
roi=null             : int           # the scanning roi (as recorded in the acquisition software) containing this field - only relevant to mesoscale scans
# duration depends on your network bandwidth to s3
scan.ScanInfo.populate(display_progress=True)
ScanInfo: 0%| | 0/1 [00:00<?, ?it/s]
ScanInfo: 100%|██████████| 1/1 [05:41<00:00, 341.51s/it]
Let's view the information that was entered into each of these tables.
scan.ScanInfo()
subject | session_datetime | scan_id | nfields number of fields | nchannels number of channels | ndepths Number of scanning depths (planes) | nframes number of recorded frames | nrois number of ROIs (see scanimage's multi ROI imaging) | x (um) ScanImage's 0 point in the motor coordinate system | y (um) ScanImage's 0 point in the motor coordinate system | z (um) ScanImage's 0 point in the motor coordinate system | fps (Hz) frames per second - Volumetric Scan Rate | bidirectional true = bidirectional scanning | usecs_per_line microseconds per scan line | fill_fraction raster scan temporal fill fraction (see scanimage) | scan_datetime datetime of the scan | scan_duration (seconds) duration of the scan | bidirectional_z true = bidirectional z-scan |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
subject1 | 2021-04-30 12:22:15 | 0 | 1 | 1 | 1 | 3000 | 0 | 13441.9 | 15745.0 | -205821.0 | 29.2398 | 1 | 63.0981 | 0.712867 | None | 102.6 | None |
Total: 1
scan.ScanInfo.Field()
subject | session_datetime | scan_id | field_idx | px_height height in pixels | px_width width in pixels | um_height height in microns | um_width width in microns | field_x (um) center of field in the motor coordinate system | field_y (um) center of field in the motor coordinate system | field_z (um) relative depth of field | delay_image (ms) delay between the start of the scan and pixels in this field | roi the scanning roi (as recorded in the acquisition software) containing this field - only relevant to mesoscale scans |
---|---|---|---|---|---|---|---|---|---|---|---|---|
subject1 | 2021-04-30 12:22:15 | 0 | 0 | 512 | 512 | nan | nan | 13441.9 | 15745.0 | -205821.0 | =BLOB= | None |
Total: 1
Run the Processing Task
We're almost ready to perform image processing with `suite2p`. An important step before
processing is managing the parameters which will be used in that step. To do so, we will
define the suite2p parameters in a dictionary and insert them into a DataJoint table
`ProcessingParamSet`. This table keeps track of all combinations of your image
processing parameters. You can choose which parameter set is used during processing in a
later step.
Let's view the attributes and insert data into `imaging.ProcessingParamSet`.
imaging.ProcessingParamSet.heading
import suite2p
params_suite2p = suite2p.default_ops()
params_suite2p["nonrigid"] = False
imaging.ProcessingParamSet.insert_new_params(
processing_method="suite2p",
paramset_idx=0,
params=params_suite2p,
paramset_desc="Calcium imaging analysis with Suite2p using default parameters",
)
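Because `ProcessingParamSet` stores every parameter combination under its own `paramset_idx`, comparing settings later just means registering another set. Below is a sketch of preparing a variant with non-rigid motion correction enabled; the small dictionary is a stand-in for `suite2p.default_ops()` so the example is self-contained, and in the notebook you would reuse the `params_suite2p` defined above:

```python
# Stand-in for suite2p.default_ops(); in practice start from the full defaults.
params_suite2p = {"nonrigid": False, "batch_size": 500}

# Copy the defaults and enable non-rigid motion correction for the variant.
params_nonrigid = dict(params_suite2p)
params_nonrigid["nonrigid"] = True

# Register under a new index so both parameter sets remain selectable:
# imaging.ProcessingParamSet.insert_new_params(
#     processing_method="suite2p",
#     paramset_idx=1,
#     params=params_nonrigid,
#     paramset_desc="Suite2p with non-rigid motion correction",
# )
```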
DataJoint uses a `ProcessingTask` table to manage which `Scan` and `ProcessingParamSet`
should be used during processing.

This table defines several important aspects of downstream processing. Let's view its attributes to get a better understanding.
imaging.ProcessingTask.heading
# Manual table for defining a processing task ready to be run
subject              : varchar(8)
session_datetime     : datetime
scan_id              : int
paramset_idx         : smallint      # Unique parameter set ID.
---
processing_output_dir : varchar(255) # Output directory of the processed scan relative to root data directory
task_mode            : enum('load','trigger') # 'load': load computed analysis results, 'trigger': trigger computation
The `ProcessingTask` table contains two important attributes:
- `paramset_idx` - Allows the user to choose the parameter set with which you want to run image processing.
- `task_mode` - Can be set to `load` or `trigger`. When set to `load`, running the processing step initiates a search for existing output files of the image processing algorithm defined in `ProcessingParamSet`. When set to `trigger`, the processing step will run image processing on the raw data.
imaging.ProcessingTask.insert1(
dict(
**session_key,
scan_id=0,
paramset_idx=0,
task_mode="load", # load or trigger
processing_output_dir="subject1/session1/suite2p",
)
)
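For comparison, if no Suite2p output existed yet, the same task could be registered with `task_mode="trigger"` so that populating `Processing` runs Suite2p on the raw data instead of loading results. A sketch follows; only the dictionary is built here, to be passed to `imaging.ProcessingTask.insert1(...)` in the notebook, and the output directory is whatever location you want results written to:

```python
# Same primary-key values as the session used throughout this tutorial.
session_key = dict(subject="subject1", session_datetime="2021-04-30 12:22:15")

trigger_task = dict(
    **session_key,
    scan_id=0,
    paramset_idx=0,
    task_mode="trigger",  # run Suite2p rather than load existing output
    processing_output_dir="subject1/session1/suite2p",  # where results will be written
)
# imaging.ProcessingTask.insert1(trigger_task)
```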
Let's call populate on the `Processing` table, which checks for Suite2p results since `task_mode=load`.
imaging.Processing.populate(session_key, display_progress=True)
Processing: 0%| | 0/1 [00:00<?, ?it/s]
Processing: 100%|██████████| 1/1 [00:00<00:00, 1.05it/s]
Curate the results (Optional)
While image processing is complete in the step above, you can optionally curate the
output of image processing using the `Curation` table. For this demo, we will simply use
the results of image processing output from the `Processing` task.
imaging.Curation.heading
# Curation(s) results
subject              : varchar(8)
session_datetime     : datetime
scan_id              : int
paramset_idx         : smallint      # Unique parameter set ID.
curation_id          : int
---
curation_time        : datetime      # Time of generation of this set of curated results
curation_output_dir  : varchar(255)  # Output directory of the curated results, relative to root data directory
manual_curation      : tinyint       # Has manual curation been performed on this result?
curation_note        : varchar(2000)
imaging.Curation.insert1(
dict(
**session_key,
scan_id=0,
paramset_idx=0,
curation_id=0,
curation_time="2021-04-30 12:22:15.032",
curation_output_dir="subject1/session1/suite2p",
manual_curation=False,
curation_note="",
)
)
Once the `Curation` table receives an entry, we can populate the remaining tables in the
workflow, including `MotionCorrection`, `Segmentation`, and `Fluorescence`.
imaging.MotionCorrection.populate(display_progress=True)
imaging.Segmentation.populate(display_progress=True)
imaging.Fluorescence.populate(display_progress=True)
imaging.Activity.populate(display_progress=True)
imaging_report.ScanLevelReport.populate(display_progress=True)
imaging_report.TraceReport.populate(display_progress=True)
MotionCorrection: 0%| | 0/1 [00:00<?, ?it/s]
MotionCorrection: 100%|██████████| 1/1 [00:09<00:00, 9.92s/it]
Segmentation: 100%|██████████| 1/1 [00:08<00:00, 8.57s/it]
Fluorescence: 100%|██████████| 1/1 [00:07<00:00, 7.97s/it]
Activity: 100%|██████████| 1/1 [00:02<00:00, 2.74s/it]
ScanLevelReport: 100%|██████████| 1/1 [00:01<00:00, 1.20s/it]
TraceReport: 100%|██████████| 1276/1276 [01:21<00:00, 15.71it/s]
Now that we've populated the tables in this DataJoint pipeline, there are several
possible next steps. If you have an existing pipeline for aligning activity traces to
behavioral data or other stimuli, you can easily invoke `element-event` or define your
own custom DataJoint tables to extend the pipeline.
Visualize the results
In this tutorial, we will do some exploratory analysis by fetching the data from the database and creating a few plots.
Next, we will fetch the `fluorescence` attribute for `mask=10` with the `fetch1` method by passing the attribute as an argument to the method.

By default, `fetch1()` returns all attributes of one entry in the table. If a query contains multiple entries, `fetch1()` returns the first entry in the table.
trace = (imaging.Fluorescence.Trace & "mask = '10'").fetch1("fluorescence")
In the query above, we fetch the fluorescence trace from the `Trace` part table
belonging to the `Fluorescence` parent table.

Let's plot this trace after fetching the sampling rate of the data to define the x-axis values.
sampling_rate = (scan.ScanInfo & session_key & "scan_id=0").fetch1("fps")
plt.plot(np.r_[: trace.size] * 1 / sampling_rate, trace)
plt.title("Fluorescence trace for mask 10")
plt.xlabel("Time (s)")
plt.ylabel("Activity (a.u.)")
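The x-axis in the plot above converts frame indices to seconds by dividing by the frame rate. As a quick plain-Python sanity check of that conversion (no database required; the helper name `frame_times` is made up for this sketch):

```python
def frame_times(n_frames, fps):
    """Acquisition time (s) of each frame, assuming frame 0 occurs at t = 0."""
    return [i / fps for i in range(n_frames)]

# With the ~29.24 Hz volumetric scan rate from scan.ScanInfo above,
# the 3000 recorded frames cover about 102.6 s, matching scan_duration.
times = frame_times(3000, 29.2398)
```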
DataJoint queries are a highly flexible tool to manipulate and visualize your data. After all, visualizing traces or generating rasters is likely just the start of your analysis workflow. This can also make the queries seem more complex at first. However, we'll walk through them slowly to simplify their content in this notebook.
The examples below perform several operations using DataJoint queries:
- Fetch the primary key attributes of the scan with `scan_id=0`.
- Use multiple restrictions to fetch the average motion-corrected image for this scan with `field_idx=0`.
- Use a join operation and multiple restrictions to fetch ROI mask coordinates and overlay them on the average motion-corrected image.
scan_key = (scan.Scan & "scan_id=0").fetch1("KEY")
average_image = (imaging.MotionCorrection.Summary & scan_key & "field_idx=0").fetch1(
"average_image"
)
plt.imshow(average_image)
<matplotlib.image.AxesImage at 0x7f38b3f6f8e0>
We will fetch mask coordinates and overlay these on the average image.
mask_xpix, mask_ypix = (
imaging.Segmentation.Mask * imaging.MaskClassification.MaskType
& session_key
& "mask_center_z=0"
& "mask_npix > 130"
).fetch("mask_xpix", "mask_ypix")
mask_image = np.zeros(np.shape(average_image), dtype=bool)
for xpix, ypix in zip(mask_xpix, mask_ypix):
mask_image[ypix, xpix] = True
plt.imshow(average_image)
plt.contour(mask_image, colors="white", linewidths=0.5)
This Element includes an interactive widget to plot the segmentations and traces to visualize the results after processing with Suite2p, CaImAn, or EXTRACT.
from element_calcium_imaging.plotting.widget import main
main(imaging)
VBox(children=(HBox(children=(Dropdown(description='Result:', layout=Layout(display='flex', flex_flow='row', g…
Summary
Following this tutorial, we have:
- Covered the essential functionality of `element-calcium-imaging`.
- Learned how to manually insert data into tables.
- Executed and ingested results of image processing with `suite2p`.
- Visualized the results.
Documentation and DataJoint Tutorials
- Detailed documentation on `element-calcium-imaging`.
- General `datajoint-python` tutorials covering fundamentals, such as table tiers, query operations, fetch operations, automated computations with the make function, and more.
- Documentation for `datajoint-python`.
Run this tutorial on your own data
To run this tutorial notebook on your own data, please use the following steps:
- Download the mysql-docker image for DataJoint and run the container according to the instructions provided in the repository.
- Create a fork of this repository to your GitHub account.
- Clone the repository and open the files using your IDE.
- Add a code cell immediately after the first code cell in the notebook - we will set up the local connection using this cell. In this cell, type the following code.
import datajoint as dj
dj.config["database.host"] = "localhost"
dj.config["database.user"] = "<your-username>"
dj.config["database.password"] = "<your-password>"
dj.config["custom"] = {"imaging_root_data_dir": "path/to/your/data/dir",
"database_prefix": "<your-username_>"}
dj.config.save_local()
dj.conn()
- Run the code block above and proceed with the rest of the notebook.