Master-Part Relationship

Often an entity in one table is inseparably associated with a group of entities in another, forming a master-part relationship. The master-part relationship ensures that all parts of a complex representation appear together or not at all. This has become one of the most powerful data integrity principles in DataJoint.

As an example, imagine segmenting an image to identify regions of interest. The resulting segmentation is inseparable from the ROIs that it produces. In this case, the two tables might be called Segmentation and Segmentation.ROI.

In Python, the master-part relationship is expressed by making the part a nested class of the master. The part is subclassed from dj.Part and does not need the @schema decorator.

import itertools
import datajoint as dj

@schema
class Segmentation(dj.Computed):
     definition = """  # image segmentation
     -> Image
     """

     class ROI(dj.Part):
          definition = """  # Region of interest resulting from segmentation
          -> Segmentation
          roi  : smallint   # roi number
          ---
          roi_pixels  : longblob   #  indices of pixels
          roi_weights : longblob   #  weights of pixels
          """

     def make(self, key):
          image = (Image & key).fetch1('image')
          # insert the master entity first, then its parts, within the same make call
          self.insert1(key)
          count = itertools.count()
          Segmentation.ROI.insert(
               dict(key, roi=next(count), roi_pixels=roi_pixels, roi_weights=roi_weights)
               for roi_pixels, roi_weights in mylib.segment(image))

Populating

Master-part relationships can form in any data tier, but DataJoint observes them more strictly for auto-populated tables. To populate both the master Segmentation and the part Segmentation.ROI, it is sufficient to call the populate method of the master:

Segmentation.populate()

Note that the entities in the master and the matching entities in the part are inserted within a single make call of the master, which means that they are processed inside a single transaction: either all are inserted and committed or the entire transaction is rolled back. This ensures that partial results never appear in the database.

For example, imagine that a segmentation is performed, but an error occurs halfway through inserting the results. If this situation were allowed to persist, then it might appear that 20 ROIs were detected where 45 had actually been found.
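The all-or-nothing behavior described above can be illustrated outside DataJoint. The following is a minimal sketch that uses Python's standard-library sqlite3 module rather than DataJoint itself. The table names segmentation and roi, the helper insert_segmentation, and its fail_at parameter are hypothetical and exist only for this illustration. It simulates an error after 20 of 45 part rows have been inserted and shows that neither the master row nor the partial part rows survive.

```python
import sqlite3

# In-memory database standing in for a real backend (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE segmentation (image_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE roi ("
    " image_id INTEGER REFERENCES segmentation(image_id),"
    " roi INTEGER,"
    " PRIMARY KEY (image_id, roi))"
)
conn.commit()


def insert_segmentation(conn, image_id, n_rois, fail_at=None):
    """Insert one master row and its part rows in a single transaction.

    fail_at is a hypothetical hook that simulates an error partway through.
    """
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO segmentation VALUES (?)", (image_id,))
        for roi in range(n_rois):
            if roi == fail_at:
                raise RuntimeError("simulated failure during part insert")
            conn.execute("INSERT INTO roi VALUES (?, ?)", (image_id, roi))


def count(conn, table):
    """Return the number of rows in the given table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# First attempt fails after 20 of 45 part rows; everything is rolled back.
try:
    insert_segmentation(conn, image_id=1, n_rois=45, fail_at=20)
except RuntimeError:
    pass
counts_after_failure = (count(conn, "segmentation"), count(conn, "roi"))

# Second attempt succeeds; master and all parts appear together.
insert_segmentation(conn, image_id=1, n_rois=45)
counts_after_success = (count(conn, "segmentation"), count(conn, "roi"))
```

Because the failed attempt leaves no trace, the same key can simply be processed again, which is the property that makes repeated populate calls safe.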

Deleting

When deleting from a master-part pair, never delete from the part tables directly. The only valid way to delete from a part table is to delete the corresponding entities from the master. This has been an unenforced rule, but upcoming versions of DataJoint will prohibit direct deletes from part tables. DataJoint's delete operation is also enclosed in a transaction.
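The effect of deleting through the master can be sketched by analogy. The example below again uses the standard-library sqlite3 module, not DataJoint, and relies on SQLite's ON DELETE CASCADE clause as a stand-in; DataJoint implements its own cascading delete, which is not shown here. The table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE segmentation (image_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE roi ("
    " image_id INTEGER REFERENCES segmentation(image_id) ON DELETE CASCADE,"
    " roi INTEGER,"
    " PRIMARY KEY (image_id, roi))"
)

# Two master entities, each with three parts.
with conn:
    for image_id in (1, 2):
        conn.execute("INSERT INTO segmentation VALUES (?)", (image_id,))
        conn.executemany(
            "INSERT INTO roi VALUES (?, ?)", [(image_id, r) for r in range(3)]
        )

# Delete one master entity inside a transaction; its parts are removed with it.
with conn:
    conn.execute("DELETE FROM segmentation WHERE image_id = ?", (1,))

remaining_masters = [
    row[0] for row in conn.execute("SELECT image_id FROM segmentation ORDER BY image_id")
]
remaining_parts = conn.execute(
    "SELECT image_id, roi FROM roi ORDER BY image_id, roi"
).fetchall()
```

After the delete, only the second master entity and its three parts remain, so no part is ever left without its master.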

Together, the rules of master-part relationships ensure a key aspect of data integrity: results of computations involving multiple components and steps appear in their entirety or not at all.

Multiple parts

The master-part relationship cannot be chained or nested. DataJoint does not allow part tables of other part tables per se. However, it is common to have a master table with multiple part tables that depend on each other. For example:

@schema
class ArrayResponse(dj.Computed):
    definition = """
    array: int
    """

    class ElectrodeResponse(dj.Part):
        definition = """
        -> master
        electrode: int    # electrode number on the probe
        """

    class ChannelResponse(dj.Part):
        definition = """
        -> ElectrodeResponse
        channel: int
        ---
        response: longblob  # response of a channel
        """

Conceptually, one or more channels belong to an electrode, and one or more electrodes belong to an array. This example assumes that all information about an array's response, including its electrodes and channels, is entered together. The array's response ultimately consists of the responses of multiple electrodes, each of which consists of multiple channel responses.