Subsections of Profiles

Profile Structure

In the Karnak user interface, the Profiles page can be accessed using the menu bar on the left. It displays the list of existing profiles and offers to import new profiles.

A profile file contains one or more profile elements, each defined for a group of DICOM attributes and associated with a particular action. During de-identification or tag morphing, Karnak applies the profile elements to the applicable DICOM attributes. Only one profile element can be applied to a given DICOM attribute. The profile elements are evaluated in the order defined in the YAML file, so the first applicable profile element is the one that modifies the value of a DICOM attribute. Any later profile element that would also match that tag is not applied, since the attribute has already been handled.
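As a minimal sketch of this first-match rule, the two elements below both match Study Description (0008,1030). The element names are illustrative, and the codename, action and tags fields are the ones described later on this page. Because the Keep element comes first, the attribute should be kept, and the Remove element should only affect the remaining attributes of group 0008.

```yaml
profileElements:
  # Evaluated first: matches Study Description and keeps it
  - name: "Keep Study Description"
    codename: "action.on.specific.tags"
    action: "K"
    tags:
      - "(0008,1030)"

  # Evaluated second: also matches (0008,1030), but that attribute
  # has already been handled by the element above
  - name: "Remove group 0008"
    codename: "action.on.specific.tags"
    action: "X"
    tags:
      - "(0008,xxxx)"
```

Swapping the two elements would be expected to remove Study Description, since the Remove element would then be the first one to match it.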

Currently, the profile must be a YAML file (MIME type: application/x-yaml) and follow the definition below.

Profile metadata

All these metadata are optional, but for a better user experience we recommend defining at least the name and the version. They will be used to identify and select your profile in a Project.

  • name - The name of your profile

  • version - The version of your profile

  • minimumKarnakVersion - The version of Karnak when the profile has been imported

  • defaultIssuerOfPatientID - Default value used when the IssuerOfPatientID value is not available in the DICOM file. It is used to build the patient’s pseudonym when applying de-identification

  • profileElements - The list of profile elements. The elements are applied according to their position in the list

Profile element

A profile element is defined as below in the yaml file.

  • name - The name of your profile element

  • codename - The codename represents the type of profile element. The available types of profile elements and their codenames are described in detail in this documentation

  • condition - A boolean condition that defines some requirements to apply this profile element

  • action - The type of action that will be applied

  • option - Required for certain types of profile elements, contains a single value

  • arguments - Required for certain types of profile elements, contains a list of key-value pairs

  • tags - List of tags or patterns that identify the DICOM attributes this profile element should be applied to

  • excludedTags - List of tags or patterns that identify the DICOM attributes this profile element should not be applied to. These attributes can then be modified by another profile element if applicable
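The fields above can be sketched together in a single profile element. This is only an illustration: the station name 'R2D2' and the choice of tags are hypothetical, and the option and arguments fields are left out because the element type used here (action.on.specific.tags) does not list them among its parameters.

```yaml
- name: "Remove group 0010 on a given station"             # free-text description
  codename: "action.on.specific.tags"                       # type of profile element
  condition: "tagValueContains(#Tag.StationName, 'R2D2')"   # hypothetical station name
  action: "X"                                               # Remove
  tags:
    - "(0010,XXXX)"                                         # attributes targeted by the element
  excludedTags:
    - "(0010,0040)"                                         # Patient's Sex is left to later elements
```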

Tag

DICOM tags can be written in any of the following formats: (0010,0010); 0010,0010; 00100010.

A tag pattern represents a group of tags. For example, (0010,XXXX) represents all the tags of group 0010, and the pattern (XXXX,XXXX) targets all the DICOM attributes.
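As a short illustration, the formats listed above can be mixed in the same tags list. The tags chosen here are arbitrary examples.

```yaml
tags:
  - "(0010,0010)"   # with parentheses and comma
  - "0010,0020"     # with comma only
  - "00100030"      # eight hexadecimal digits, no separator
  - "(0018,XXXX)"   # pattern: every tag of group 0018
```

The examples on this page write the wildcard both as X and as x.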

Condition

A condition can be added to any type of profile element. It contains an expression that will be evaluated for each tag the profile element is applied to.

The syntax and usage of these conditions are detailed in the Conditions page.

Validation

The content of the yaml file is validated upon import. If the structure or parameters are not defined correctly, detailed errors will be displayed to the user.

Please refer to the Profiles page for more information.

Basic DICOM Profile

The Basic DICOM Profile is defined by DICOM to remove all the attributes that could contain Individually Identifying Information (III) about the patient or other individuals or organizations associated with the data. The details of this profile element can be found in the DICOM Standard.

Further details on this profile element and its implementation in Karnak can be found in the How does de-identification work? page.

We strongly recommend including this profile element as the basis for de-identification.

This profile element can be included in the profile definition by referencing its codename:

- name: "DICOM basic profile"
  codename: "basic.dicom.profile"

Example of a complete and valid profile yaml file that applies only the Basic DICOM Profile:

name: "Dicom Basic Profile"
version: "1.0"
minimumKarnakVersion: "0.9.2"
defaultIssuerOfPatientID:
profileElements:
  - name: "DICOM basic profile"
    codename: "basic.dicom.profile"

Actions on tags

Actions on specific tags

This profile element applies an action on a tag or a group of tags defined by the user. Its codename is action.on.specific.tags.

This profile element requires the following parameters:

  • name: description of the action applied
  • codename: action.on.specific.tags
  • action: Remove (X) / Keep (K)
  • tags: list of tags the action should be applied to

This profile element can have these optional parameters:

  • condition: optional, defines a condition to evaluate if this profile element should be applied to this DICOM instance
  • excludedTags: list of tags that will be ignored by this action

In this example, all the tags starting with 0028 will be removed except (0028,1199), which will be kept.

- name: "Remove tags"
  codename: "action.on.specific.tags"
  action: "X"
  tags:
    - "(0028,xxxx)"
  excludedTags:
    - "(0028,1199)"

- name: "Keep tags 0028,1199"
  codename: "action.on.specific.tags"
  action: "K"
  tags:
    - "0028,1199"

Actions on private tags

This profile element applies an action on a private tag or a group of private tags defined by the user. This action won’t be applied in case the tag is not private.

Its codename is action.on.privatetags.

This profile element requires the following parameters:

  • name: description of the action applied
  • codename: action.on.privatetags
  • action: Remove (X) / Keep (K)

This profile can have these optional parameters:

  • condition: optional, defines a condition to evaluate if this profile element should be applied to this DICOM instance
  • tags: list of tags the action should be applied to. If not specified, the action is applied to all private tags
  • excludedTags: list of tags that will be ignored by this action

In this example, all tags starting with 0009 will be kept and all the other private tags will be removed.

- name: "Keep private tags starting with 0009"
  codename: "action.on.privatetags"
  action: "K"
  tags:
    - "(0009,xxxx)"

- name: "Remove all private tags"
  codename: "action.on.privatetags"
  action: "X"
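A similar selection can be sketched with a single element by using excludedTags, which is listed among the optional parameters above. The result is not strictly equivalent: the excluded attributes are simply ignored by this element rather than explicitly kept, so a later profile element could still modify them.

```yaml
- name: "Remove all private tags except group 0009"
  codename: "action.on.privatetags"
  action: "X"
  excludedTags:
    - "(0009,xxxx)"
```

If the private tags of group 0009 must be protected from every later profile element, the two-element form with an explicit Keep action is the safer choice.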

Add new tags

This profile element adds a tag if it is not already present in the instance. This action will be ignored if the tag is already present in the instance.

Its codename is action.add.tag.

This profile element requires the following parameters:

  • name: description of the action applied
  • codename: action.add.tag
  • arguments:
    • value: required value to set the tag’s value to
    • vr: VR of the tag, if not specified, its VR will be retrieved from DICOM Standard
  • tags: must contain exactly one tag, the one to add

This profile can have these optional parameters:

  • condition: optional, defines a condition to evaluate if this profile element should be applied to this DICOM instance

Regarding the application of profile elements, if the tag is not initially present in the instance, then it will be added by this action. Further actions that match this tag won’t be applied. If the tag was already present in the instance, then the add action is ignored and a further action can be applied to that tag.

This feature is especially useful when applying masks to non-compliant SOPs by using the attribute Burned In Annotation. Please refer to the Pixel Data Cleaning Exceptions section for a complete example.

In this example, we add the optional tag Recognizable Visual Features (0028,0302).

- name: "Add Recognizable Visual Features tag"
  codename: "action.add.tag"
  arguments:
    value: "YES"
    vr: "CS"
  tags:
    - "(0028,0302)"

Example of a complete and valid profile yaml file that applies the following profile elements in order:

  • Remove the tags that match (0008,00XX) and (0010,00XX) except for (0008,0008) and (0008,0013)
  • Keep specifically the previously excluded tags (so that they won’t be removed by a profile element applied afterward)
  • Remove all the private tags
  • Apply the Basic DICOM Profile to all the tags that did not undergo a modification previously

name: "De-identification profile"
version: "1.0"
minimumKarnakVersion: "0.9.2"
defaultIssuerOfPatientID:
profileElements:
  - name: "Remove tags"
    codename: "action.on.specific.tags"
    action: "X"
    tags:
      - "(0008,00XX)"
      - "0010,00XX"
    excludedTags:
      - "0008,0008"
      - "0008,0013"

  - name: "Keep tags"
    codename: "action.on.specific.tags"
    action: "K"
    tags:
      - "0008,0008"
      - "0008,0013"

  - name: "Remove all private tags"
    codename: "action.on.privatetags"
    action: "X"

  - name: "DICOM basic profile"
    codename: "basic.dicom.profile"

This example keeps the Study Description tag if its value contains ‘R2D2’, keeps the Philips PET private group and applies the Basic DICOM Profile.

The tag patterns (7053,xx00) and (7053,xx09) are defined in the Philips PET Private Group by DICOM.

name: "Profile Example"
version: "1.0"
minimumKarnakVersion: "0.9.7"
profileElements:
  - name: "Keep StudyDescription tag according to a condition"
    codename: "action.on.specific.tags"
    condition: "tagValueContains(#Tag.StudyDescription, 'R2D2')"
    action: "K"
    tags:
      - "(0008,1030)"

  - name: "Keep Philips PET Private Group"
    codename: "action.on.privatetags"
    action: "K"
    tags:
      - "(7053,xx00)"
      - "(7053,xx09)"

  - name: "DICOM basic profile"
    codename: "basic.dicom.profile"

Actions on dates

This profile element applies a specific action on a tag or a group of tags containing a date. This action can only be applied to the following Value Representations: Age String (AS), Date (DA), Date Time (DT) and Time (TM).

This profile element requires the following parameters:

  • name: description of the action applied
  • codename: action.on.dates
  • option: the action to be applied, detailed below
  • arguments: additional parameters depending on the chosen option

This profile can have these optional parameters:

  • condition: optional, defines a condition to evaluate if this profile element should be applied to this DICOM instance
  • tags: list of tags the action should be applied to. If not specified, the profile element will be applied to all the tags that have an AS, DA, DT or TM VR in the instance
  • excludedTags: list of tags that will be ignored by this action

The parameter option can have one of the following values:

  • shift
  • shift_range
  • shift_by_tag
  • date_format

Shift option

The shift option applies a shift to a date according to the following required arguments:

  • seconds: integer representing the number of seconds the shift operation should apply
  • days: integer representing the number of days the shift operation should apply

In the case of a shift action applied to an Age String (AS) Value Representation, the seconds and days are added to the existing value.

In the case of a shift action applied to a Date (DA), Date Time (DT) or Time (TM) Value Representation, the seconds and days are subtracted from the existing value.

In this example, all the tags starting with 0010 that have an AS, DA, DT or TM VR will be shifted by 30 seconds and 10 days.

- name: "Shift Date"
  codename: "action.on.dates"
  arguments:
    seconds: 30
    days: 10
  option: "shift"
  tags:
    - "0010,XXXX"
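The sign convention described above can be made concrete with a days-only variant. This is a sketch under two assumptions: that 0 is an accepted value for seconds (both arguments are listed as required, so it is given explicitly), and that the rule for DA values applies exactly as stated.

```yaml
- name: "Shift Study Date by 10 days"
  codename: "action.on.dates"
  arguments:
    seconds: 0    # assumption: 0 is accepted and leaves the time part untouched
    days: 10
  option: "shift"
  tags:
    - "(0008,0020)"   # Study Date, VR DA
# Under the subtraction rule for DA values, a Study Date of 20230512
# would be expected to become 20230502.
```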

Shift Range option

The shift range option applies a random shift to a date parametrized by ranges defined by the user as follows:

  • max_seconds (required): integer representing the upper bound of the number of seconds the shift operation should apply
  • max_days (required): integer representing the upper bound of the number of days the shift operation should apply
  • min_seconds (optional): integer representing the lower bound of the number of seconds the shift operation should apply. Defaults to 0 if not specified
  • min_days (optional): integer representing the lower bound of the number of days the shift operation should apply. Defaults to 0 if not specified

The random operation is deterministic and reproducible based on the patient and the project. For more details about the usage of randomization in Karnak, please refer to the Karnak Project.

In this example, all the tags starting with 0008,002 that have an AS, DA, DT or TM VR will be shifted randomly in a range of 0 to 60 seconds and 50 to 100 days.

  - name: "Shift Range Date"
    codename: "action.on.dates"
    arguments:
      max_seconds: 60
      min_days: 50
      max_days: 100
    option: "shift_range"
    tags:
      - "0008,002X"

Date Format option

The date format option applies a partial deletion to a date depending on the value of the remove argument:

  • day: this option will delete the day information and replace it with the first day of the month
  • month_day: this option will delete the day and month information and replace it with the first month and first day of the month

This action can only be applied to the following Value Representations: Date (DA) and Date Time (DT).

day example

In this example, all the tags starting with 0008,003 and that have a DA or DT VR will have the day information removed.

  - name: "Date format"
    codename: "action.on.dates"
    arguments:
      remove: "day"
    option: "date_format"
    tags:
      - "0008,003X"

For example, if the value contained in the tag is 20230512, the output value will be 20230501.

month_day example

In this example, all the tags starting with 0008,003 and that have a DA or DT VR will have the day and month information removed.

  - name: "Date format"
    codename: "action.on.dates"
    arguments:
      remove: "month_day"
    option: "date_format"
    tags:
      - "0008,003X"

For example, if the value contained in the tag is 20230512, the output value will be 20230101.

Shift By Tag option

The shift by tag option applies a shift to a date according to a value contained in a specified DICOM tag. The arguments are detailed below:

  • seconds_tag: tag that contains the number of seconds the shift operation should apply
  • days_tag: tag that contains the number of days the shift operation should apply

Both arguments are not required, but at least one of them must be specified in order to apply the action.

In this example, all the tags starting with 0010 that have an AS, DA, DT or TM VR will be shifted by the number of days stored in the private tag (0015,0011).

- name: "Shift Date By Tag"
  codename: "action.on.dates"
  arguments:
    days_tag: "(0015,0011)"
  option: "shift_by_tag"
  tags:
    - "0010,XXXX"
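Both arguments can also be given together. The sketch below additionally guards the element with a condition so that it is only applied when both offset tags are present. The private tag (0015,0012) is hypothetical, and whether such a guard is needed is not stated on this page. The tagIsPresent function is described in the Conditions section.

```yaml
- name: "Shift Date By Tag when both offsets are available"
  codename: "action.on.dates"
  condition: "tagIsPresent('0015,0011') && tagIsPresent('0015,0012')"
  arguments:
    days_tag: "(0015,0011)"
    seconds_tag: "(0015,0012)"   # hypothetical private tag holding the seconds offset
  option: "shift_by_tag"
  tags:
    - "0010,XXXX"
```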

Example of a complete and valid profile yaml file that applies the following profile elements in order:

  • Randomly shift the tags that match (0008,003X) and (0008,0012), except for (0008,0030) and (0008,0032), and that have an AS, DA, DT or TM VR, in a range of 0 to 60 seconds and 10 to 50 days
  • Remove the day and month information contained in the tags (0008,0023) and (0008,0021) if they have a DA or DT VR
  • Shift the tags that match (0010,XXXX), except for (0010,1010), and that have an AS, DA, DT or TM VR, by 30 seconds and 10 days
  • Apply the Basic DICOM Profile to all the tags that did not undergo a modification previously

name: "De-identification profile"
version: "1.0"
minimumKarnakVersion: "0.9.2"
defaultIssuerOfPatientID:
profileElements:
  - name: "Shift Range Date with arguments"
    codename: "action.on.dates"
    arguments:
      max_seconds: 60
      min_days: 10
      max_days: 50
    option: "shift_range"
    tags:
      - "0008,0012"
      - "0008,003X"
    excludedTags:
      - "0008,0030"
      - "0008,0032"
      
  - name: "Date Format"
    codename: "action.on.dates"
    arguments:
      remove: "month_day"
    option: "date_format"
    tags:
      - "0008,0023"
      - "0008,0021"

  - name: "Shift Date with arguments"
    codename: "action.on.dates"
    arguments:
      seconds: 30
      days: 10
    option: "shift"
    tags:
      - "0010,XXXX"
    excludedTags:
      - "0010,1010"

  - name: "DICOM basic profile"
    codename: "basic.dicom.profile"

Expressions

Expressions on tags

This profile element applies an expression to a tag or a group of tags defined by the user. An expression is based on Spring Expression Language (SpEL) and returns a value or an action according to a certain condition. Its codename is expression.on.tags.

This profile element requires the following parameters:

  • name: description of the action applied
  • codename: expression.on.tags
  • arguments: definition of the expression
  • tags: list of tags the action should be applied to

This profile can have these optional parameters:

  • condition: optional, defines a condition to evaluate if this profile element should be applied to this DICOM instance
  • excludedTags: list of tags that will be ignored by this action

The expression is contained in the expr argument. The expression will be executed and can either return an action that will be executed, or a null value that will do nothing and move on to the next profile element.

Some custom variables are defined in the context of the expression. They contain information relative to the current attribute the profile element is being applied to:

  • tag contains the current attribute tag, for example (0010,0010)
  • vr contains the current attribute VR, for example PN
  • stringValue contains the current element value, for example ‘John^Doe’

Some constants are also defined to improve readability and reduce errors.

  • #Tag is a constant that can be used to retrieve any tag integer value in the DICOM standard. For example, #Tag.PatientBirthDate corresponds to (0010,0030).
  • #VR is a constant that can be used to retrieve any VR value in the DICOM standard. For example, #VR.LO corresponds to the Long String type.

Some utility functions are defined to work with the tags:

  • getString(int tag) returns the value of the given tag in the DICOM instance, returns null if the tag is not present

  • tagIsPresent(int tag) returns true if the tag is present in the DICOM instance, false otherwise

The actions are defined as functions, they are used to set the action and its parameters for the profile element.

The possible actions are:

  • ReplaceNull(): Sets to empty the current tag value
  • Replace(String dummyValue): Replaces the current tag value
  • Remove(): Removes the tag
  • Keep(): Keeps the tag unchanged
  • UID(): Replaces the current tag value with a newly generated UID and sets the tag’s VR to UI
  • Add(int tagToAdd, int vr, String value): Adds a tag to the DICOM instance
  • ComputePatientAge(): Replaces the current tag value with a computed value of the patient’s age at the time of the exam

Examples

If the current tag corresponds to the Patient Name attribute and its value is ‘John’, its content will be replaced by the value of the Institution Name attribute, otherwise the tag is kept unchanged. This example is applied to all the attributes of the DICOM instance, and returns either a Keep or Replace action for every tag, implying that the following profile elements, if any, will be ignored since all the attributes already had an action applied to them.

  - name: "Expression"
    codename: "expression.on.tags"
    arguments:
      expr: "stringValue == 'John' and tag == #Tag.PatientName? Replace(getString(#Tag.InstitutionName)) : Keep()"
    tags: 
      - "(xxxx,xxxx)"

If the current tag value is UNDEFINED, it will be kept as is (and it will implicitly block any further potential modifications applied by other profile elements), otherwise the tag is removed. This expression is applied to 2 tags, (0010,0010) and (0010,0212).

  - name: "Expression"
    codename: "expression.on.tags"
    arguments:
      expr: "stringValue == 'UNDEFINED'? Keep() : Remove()"
    tags: 
      - "(0010,0010)" #PatientName
      - "(0010,0212)" #StrainDescription

Replacement of the Study Description tag value with a concatenation of the Institution Name and the Station Name of the DICOM instance.

- name: "Expression"
  codename: "expression.on.tags"
  arguments:
    expr: "Replace(getString(#Tag.InstitutionName) + '-' + getString(#Tag.StationName))"
  tags: 
    - "(0008,1030)" #StudyDescription

Computation of the patient’s age at the time of the exam.

- name: "Expression"
  codename: "expression.on.tags"
  arguments:
    expr: "ComputePatientAge()"
  tags: 
    - "(0010,1010)" #PatientAge
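The null return value described earlier can be used to fall through to later profile elements. In the sketch below, Study Description is replaced only when Institution Name is present. Otherwise the expression returns null, so nothing is done and a later profile element can still handle the attribute. The ternary operator and the null literal are standard SpEL; the fall-through behaviour is the one stated on this page.

```yaml
- name: "Expression"
  codename: "expression.on.tags"
  arguments:
    expr: "tagIsPresent(#Tag.InstitutionName) ? Replace(getString(#Tag.InstitutionName)) : null"
  tags:
    - "(0008,1030)" #StudyDescription
```

Guarding with tagIsPresent also avoids passing the null that getString returns for a missing tag to Replace.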

Image modifications

Defacing

This profile applies defacing to the image data of the DICOM instance.

This profile can only be applied to images in the Axial orientation that belong to one of the following SOP classes:

  • 1.2.840.10008.5.1.4.1.1.2 - CT Image Storage
  • 1.2.840.10008.5.1.4.1.1.2.1 - Enhanced CT Image Storage

This profile element requires the following parameters:

  • name: description of the action applied
  • codename: clean.recognizable.visual.features
  • condition: optional, defines a condition to evaluate if this profile element should be applied to this DICOM instance
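A defacing element could be declared as follows. This is a minimal sketch that uses only the parameters listed above; the element name and the station name in the condition are illustrative.

```yaml
profileElements:
  - name: "Defacing"
    codename: "clean.recognizable.visual.features"
    # Optional: only deface images produced by a given station (hypothetical name)
    condition: "tagValueContains(#Tag.StationName, 'ICT256')"
```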

Pixel Data Cleaning

This profile applies a mask defined by the user on the DICOM instance pixel data to remove identifying information burned in the image.

The details on the masks definition can be found below.

This profile is applied only to the following SOP classes:

  • 1.2.840.10008.5.1.4.1.1.6.1 - Ultrasound Image Storage
  • 1.2.840.10008.5.1.4.1.1.7.1 - Multiframe Single Bit Secondary Capture Image Storage
  • 1.2.840.10008.5.1.4.1.1.7.2 - Multiframe Grayscale Byte Secondary Capture Image Storage
  • 1.2.840.10008.5.1.4.1.1.7.3 - Multiframe Grayscale Word Secondary Capture Image Storage
  • 1.2.840.10008.5.1.4.1.1.7.4 - Multiframe True Color Secondary Capture Image Storage
  • 1.2.840.10008.5.1.4.1.1.3.1 - Ultrasound Multiframe Image Storage
  • 1.2.840.10008.5.1.4.1.1.77.1.1 - VL Endoscopic Image Storage

It is also applied if the value of the Burned In Annotation tag (0028,0301) is “YES”.

This profile element requires the following parameters:

  • name: description of the action applied
  • codename: clean.pixel.data
  • condition: optional, defines a condition to evaluate if this profile element should be applied to this DICOM instance

The condition parameter can be used to exclude the images coming from a specific machine for example.

profileElements:
  - name: "Clean pixel data"
    codename: "clean.pixel.data"
    condition: "!tagValueContains(#Tag.StationName,'ICT256')"

Masks Definition

The mask definition requires the following parameters:

  • stationName: source station name that is matched against the attribute Station Name in the DICOM instance. It allows the mask to be specific depending on the station that generated the image. The value can also be set to * to match any station.
  • color: color of the mask in hexadecimal
  • rectangles: defines the list of rectangles to apply to mask identifying information

The mask definition can have these optional parameters:

  • imageWidth: mask specific to an image of a given width in pixels, this value will be matched against the value of the Columns attribute in the DICOM instance
  • imageHeight: mask specific to an image of a given height in pixels, this value will be matched against the value of the Rows attribute in the DICOM instance.

The selection of the mask based on the image size requires both attributes, height and width, to be set. Defining only the width or only the height is not supported.

A rectangle is defined by the following required parameters:

  • x: x coordinate of the upper left corner of the rectangle
  • y: y coordinate of the upper left corner of the rectangle
  • width: width of the rectangle
  • height: height of the rectangle

The upper left corner of the image corresponds to the coordinates (0,0).

The figure below illustrates the definition of a rectangle having the following parameters (25, 75, 150, 50).

[Figure: Rectangles example]
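The same rectangle can be sketched in a mask definition. This assumes that each rectangle is written as a single string in the order "x y width height", which is the form used by the mask examples on this page.

```yaml
masks:
  - stationName: "*"
    color: "ffff00"
    rectangles:
      # x=25, y=75 (upper left corner), width=150, height=50
      - "25 75 150 50"
```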

The example below shows how to define a default mask (stationName: *), a mask specific to the R2D2 station and a more specific mask applied only depending on the image size.

Depending on the instance image size and station name, the following actions will be performed:

  • The instance Rows, Columns and Station Name attributes are retrieved.
  • These values are matched against the masks defined in the masks list. If an exact match is found using the imageWidth, imageHeight and stationName values, this mask is used for the pixel cleaning action. In this example, if the instance contains the value 1024 in both the Rows and Columns attributes and “R2D2” in the Station Name attribute, the third mask will be selected.
  • If no match is found, the image size attributes are ignored and a match is performed only on the Station Name attribute. In this example, if the instance contains “R2D2” in the Station Name attribute, the second mask will be selected regardless of the image size of the instance.
  • If no match is found, the default mask will be selected, here the first one.

masks:
  - stationName: "*"
    color: "ffff00"
    rectangles:
      - "25 75 150 50"
  - stationName: "R2D2"
    color: "00ff00"
    rectangles:
      - "25 25 150 50"
      - "350 15 150 50"
  - stationName: "R2D2"
    imageWidth: 1024
    imageHeight: 1024
    color: "00ffff"
    rectangles:
      - "50 25 100 100"

Pixel Data Cleaning Exceptions

In some cases, often depending on the manufacturer and equipment, pixel data can contain embedded identifying information. Below is a complete example illustrating how to apply the pixel data cleaning profile element depending on the station that produced the image, applicable to any DICOM modality.

name: "Clean pixel data"
version: "1.0"
minimumKarnakVersion: "0.9.2"
defaultIssuerOfPatientID:
profileElements:
  - name: "Add tag BurnedInAnnotation if does not exist"
    codename: "action.add.tag"
    condition: "tagValueContains(#Tag.StationName, 'ICT256') && !tagIsPresent(#Tag.BurnedInAnnotation)"
    arguments:
      value: "YES"
      vr: "CS"
    tags:
      - "(0028,0301)"

  - name: "Set BurnedInAnnotation to YES"
    codename: "expression.on.tags"
    condition: "tagValueContains(#Tag.StationName, 'ICT256')"
    arguments:
      expr: "Replace('YES')"
    tags:
      - "(0028,0301)"

  - name: "Clean pixel data"
    codename: "clean.pixel.data"

  - name: "DICOM basic profile"
    codename: "basic.dicom.profile"

masks:
  - stationName: "*"
    color: "ffff00"
    rectangles:
      - "25 75 150 50"
  - stationName: "ICT256"
    color: "00ff00"
    rectangles:
      - "25 25 150 50"
      - "350 15 150 50"

Conditions

A condition is an expression evaluated in a certain context and that returns a boolean value (true or false).

Some constants are also defined to improve readability and reduce errors.

  • #Tag is a constant that can be used to retrieve any tag integer value in the DICOM standard. For example, #Tag.PatientBirthDate corresponds to (0010,0030).
  • #VR is a constant that can be used to retrieve any VR value in the DICOM standard. For example, #VR.LO corresponds to the Long String type.

Utility functions are available to define the conditions and are detailed below.


tagValueIsPresent(int tag, String value) or tagValueIsPresent(String tag, String value)

This function will retrieve the tag value of the DICOM and check if the value parameter is the same as the tag value.

# Check if the study description equals "755523-st222-GE"
tagValueIsPresent(#Tag.StudyDescription, "755523-st222-GE")
tagValueIsPresent("0008,1030", "755523-st222-GE")

# Check if the study description does not equal "755523-st222-GE"
!tagValueIsPresent(#Tag.StudyDescription, "755523-st222-GE")
!tagValueIsPresent("0008,1030", "755523-st222-GE")

tagValueContains(int tag, String value) or tagValueContains(String tag, String value)

This function will retrieve the tag value of the DICOM and check if the value parameter appears in the tag value.

# Check if the study description contains "st222"
tagValueContains(#Tag.StudyDescription, "st222")
tagValueContains("0008,1030", "st222")

# Check if the study description does not contain "st222"
!tagValueContains(#Tag.StudyDescription, "st222")
!tagValueContains("0008,1030", "st222")

tagValueBeginsWith(int tag, String value) or tagValueBeginsWith(String tag, String value)

This function will retrieve the tag value of the DICOM instance and check if the tag value begins with the parameter value.

# Check if the study description begins with "755523"
tagValueBeginsWith(#Tag.StudyDescription, "755523")
tagValueBeginsWith("0008,1030", "755523")

# Check if the study description does not begin with "755523"
!tagValueBeginsWith(#Tag.StudyDescription, "755523")
!tagValueBeginsWith("0008,1030", "755523")

tagValueEndsWith(int tag, String value) or tagValueEndsWith(String tag, String value)

This function will retrieve the tag value of the DICOM instance and check if the tag value ends with the parameter value.

# Check if the study description ends with "GE"
tagValueEndsWith(#Tag.StudyDescription, "GE")
tagValueEndsWith("0008,1030", "GE")

# Check if the study description does not end with "GE"
!tagValueEndsWith(#Tag.StudyDescription, "GE")
!tagValueEndsWith("0008,1030", "GE")

tagIsPresent(int tag) or tagIsPresent(String tag)

This function will check if the tag is present in the DICOM instance.

# Check if the tag study description is present in the DICOM file
tagIsPresent(#Tag.StudyDescription)
tagIsPresent("0008,1030")

# Check if the tag study description is not present in the DICOM file
!tagIsPresent(#Tag.StudyDescription)
!tagIsPresent("0008,1030")

Multiple conditions can be combined using logical operators.

&& corresponds to the AND logical operator

|| corresponds to the OR logical operator

# Check if the study description ends with "GE" and if the station name contains "CT1234"
tagValueEndsWith(#Tag.StudyDescription, "GE") && tagValueContains(#Tag.StationName, "CT1234")

# Check if the study description ends with "GE" or if the station name contains "CT1234"
tagValueEndsWith(#Tag.StudyDescription, "GE") || tagValueContains(#Tag.StationName, "CT1234")
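When a combined condition is placed in a profile, it has to fit inside a YAML string. A minimal sketch, following the quoting style of the profile examples in this documentation, wraps the whole condition in double quotes and uses single quotes for the string literals inside it. The element name and the action chosen here are illustrative.

```yaml
- name: "Remove Study Description for selected studies"
  codename: "action.on.specific.tags"
  condition: "tagValueEndsWith(#Tag.StudyDescription, 'GE') || tagValueContains(#Tag.StationName, 'CT1234')"
  action: "X"
  tags:
    - "(0008,1030)"
```

Keeping the inner literals in single quotes avoids having to escape double quotes within the YAML value.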

How does de-identification work?

Karnak is a gateway for sending DICOM files to one or multiple Application Entity Titles (AETs). Karnak offers the possibility to configure multiple destinations for an AET. These destinations can communicate using the DICOM or DICOMweb protocol.

A destination contains several configurations for the DICOM endpoint, the credentials, and is also linked to a project.

A project defines the de-identification or tag morphing method and a secret that will be used to generate deterministic random values such as UIDs or date shift arguments.

Basic Profile

The reference profile for de-identifying DICOM objects is provided by the DICOM standard. This profile defines an exhaustive list of DICOM tags and their related action to allow the de-identification of the instance.

In order to properly de-identify the sensitive data, five different actions are defined in the standard:

  • D – Replace with a dummy value
  • Z – Set to null
  • X – Remove
  • K – Keep
  • U – Replace with a new UID

The Basic Profile defines one or more actions to be applied to a list of tags.

The type of a DICOM attribute often depends on the Information Object Definition (IOD) of the instance. To avoid DICOM corruption, multiple actions can be defined for a tag, ensuring that a destructive action like REMOVE won’t be applied on a Type 1 or Type 2 attribute.

  • Z/D – Z unless D is required to maintain IOD conformance (Type 2 versus Type 1)
  • X/Z – X unless Z is required to maintain IOD conformance (Type 3 versus Type 2)
  • X/D – X unless D is required to maintain IOD conformance (Type 3 versus Type 1)
  • X/Z/D – X unless Z or D is required to maintain IOD conformance (Type 3 versus Type 2 versus Type 1)
  • X/Z/U* – X unless Z or replacement of contained instance UIDs (U) is required to maintain IOD conformance (Type 3 versus Type 2 versus Type 1 sequences containing UID references)

Karnak loads the SOPs and attributes as specified in the DICOM Standard. Based on the tag’s type in the current instance, the proper action is set and applied. If the tag cannot be identified in the SOP or its type cannot be inferred, the strictest action will be applied (U/D > Z > X).

Below is a concrete illustration of the action applied in case of multiple actions defined in the Basic Profile:

  • Z/D, X/D, X/Z/D → apply action D
  • X/Z → apply action Z
  • X/Z/U, X/Z/U* → apply action U
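
The selection rule can be sketched in Python as follows. This is a minimal illustration and not Karnak's implementation: the function name resolve_action, the numeric strictness ranks, and the representation of the attribute type as the strings "1", "2", or "3" are assumptions made for this example.

```python
from typing import Optional

# Strictness ranks following the order U/D > Z > X (assumed encoding).
STRICTNESS = {"X": 0, "Z": 1, "D": 2, "U": 2}

# Minimum strictness needed to keep each attribute type conformant:
# Type 3 may be removed, Type 2 may be emptied, Type 1 needs a value.
REQUIRED_RANK = {"3": 0, "2": 1, "1": 2}


def resolve_action(multi_action: str, attribute_type: Optional[str] = None) -> str:
    """Pick a single action out of a combined action such as "X/Z/D"."""
    candidates = multi_action.rstrip("*").split("/")
    strictest = max(candidates, key=STRICTNESS.__getitem__)
    if attribute_type not in REQUIRED_RANK:
        # Type unknown or not inferable: fall back to the strictest action.
        return strictest
    required = REQUIRED_RANK[attribute_type]
    # Candidates are written from least to most strict, so the first
    # one that satisfies the type requirement is the one to apply.
    for action in candidates:
        if STRICTNESS[action] >= required:
            return action
    return strictest
```

Calling resolve_action("X/Z/D") without a type returns "D", while resolve_action("X/Z/D", "3") returns "X" because a Type 3 attribute can safely be removed.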

Action D, without a dummy value

The action D replaces the tag value with a dummy one. This value must be consistent with the Value Representation (VR) of the tag.

When no dummy value is provided, Karnak uses a default value based on the VR, as defined below:

  • AE, CS, LO, LT, PN, SH, ST, UN, UT, UC, UR → “UNKNOWN”
  • DS, IS → “0”
  • AS, DA, DT, TM → a date is generated using shiftRange(), as explained in the Shift Date section
  • UI → a new UID is generated using the Action U

The shiftRange() action returns a random shift bounded by a given maximum number of days and seconds. By default, the maximum number of days is 365 and the maximum number of seconds is 86400.

The VRs FL, FD, SL, SS, UL, and US are binary types. By default, Karnak sets the value of attributes with these VRs to null.
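
The VR-based defaults above can be summarized as a small lookup. The sketch below is illustrative only: the function name default_dummy and the returned (strategy, value) pairs are assumptions for this example, and VRs not listed in this section raise an error because their handling is not described here.

```python
from typing import Optional, Tuple

TEXT_VRS = {"AE", "CS", "LO", "LT", "PN", "SH", "ST", "UN", "UT", "UC", "UR"}
NUMERIC_STRING_VRS = {"DS", "IS"}
DATE_TIME_VRS = {"AS", "DA", "DT", "TM"}
BINARY_VRS = {"FL", "FD", "SL", "SS", "UL", "US"}


def default_dummy(vr: str) -> Tuple[str, Optional[str]]:
    """Return a (strategy, value) pair describing the default for a VR."""
    if vr in TEXT_VRS:
        return ("value", "UNKNOWN")
    if vr in NUMERIC_STRING_VRS:
        return ("value", "0")
    if vr in DATE_TIME_VRS:
        # The value is produced by shiftRange(), see the Shift Date section.
        return ("shift_range", None)
    if vr == "UI":
        # A new UID is generated with Action U.
        return ("new_uid", None)
    if vr in BINARY_VRS:
        return ("null", None)
    raise ValueError(f"no default described for VR {vr!r}")
```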

Action U, Generate a new UID

For each U action, Karnak will hash the input value. A one-way function is created to ensure that it is not possible to revert to the original UID. This function will hash the input UID and generate a new UID from the hashed input.

Context

It’s possible for a DICOM study to be de-identified several times and in different ways, potentially implying the use of multiple hashing and one-way functions. Karnak ensures deterministic generation of UIDs in order to maintain the quality and usability of the data.

To achieve that behavior, a project must be created and associated with the destination that requires de-identification. A project defines a de-identification method and a secret, either generated randomly or imported by the user. The project’s secret will be used as the key for the HMAC.

Project secret

The secret is 16 bytes long and randomly defined when the project is created.

A user can upload their own secret, but it must be 16 bytes long. It can be uploaded as a String in hexadecimal format.

Hash function

The algorithm used for hashing is the “Message Authentication Code” (MAC). Karnak uses the MAC not for message authentication, but as a one-way function. Below is a definition from the Java Mac class used in Karnak:

« A MAC provides a way to check the integrity of information transmitted over or stored in an unreliable medium, based on a secret key. Typically, message authentication codes are used between two parties that share a secret key in order to validate information transmitted between these parties.

A MAC mechanism that is based on cryptographic hash functions is referred to as HMAC. HMAC can be used with any cryptographic hash function, e.g., SHA256 or SHA384, in combination with a secret shared key. HMAC is specified in RFC 2104. »

Each HMAC computation in Karnak uses the SHA256 hash function with the project’s secret as the key.
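
As an illustration, the keyed hash can be reproduced with Python's standard hmac and hashlib modules. This is a sketch, not Karnak's code (which relies on the Java Mac class): the helper names parse_secret and hmac_hash and the UTF-8 encoding of the input value are assumptions.

```python
import hashlib
import hmac
import secrets


def parse_secret(hex_secret: str) -> bytes:
    """Decode a user-supplied hexadecimal secret and check its length."""
    key = bytes.fromhex(hex_secret)
    if len(key) != 16:
        raise ValueError("the project secret must be exactly 16 bytes long")
    return key


def hmac_hash(secret: bytes, value: str) -> bytes:
    """HMAC-SHA256 of the value, keyed with the project secret (32 bytes)."""
    # Encoding the input as UTF-8 is an assumption of this sketch.
    return hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()


# A randomly generated 16-byte secret, as created with a new project.
random_secret = secrets.token_bytes(16)
```

The same secret and input always yield the same digest, which is what makes the generated values deterministic within a project.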

Generate UID

« What DICOM calls a “UID” is referred to in the ISO OSI world as an Object Identifier (OID) » [1]

To generate a new DICOM UID, Karnak will create an OID beginning with “2.25”, that is an OID encoded UUID.

« The value after “2.25.” is the straight decimal encoding of the UUID as an integer. It MUST be a direct decimal encoding of the single integer, all 128 bits. » [2]

The generated UUID will use the first 16 bytes (128 bits) of the hash value. The UUID is a version 4, variant 1 UUID. The pseudocode below sets the version and variant bits accordingly:

// Version 4: set the high nibble of byte 6 to 0100
uuid[6] &= 0x0F
uuid[6] |= 0x40

// Variant 1: set the two high bits of byte 8 to 10
uuid[8] &= 0x3F
uuid[8] |= 0x80

The hashed value will be converted to a positive decimal number and appended to the OID root, separated by a dot. See the example below:

OID_ROOT = "2.25"
uid = OID_ROOT + "." + HashedValue[0:16].toPositiveDecimal()
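
Put together, the UID generation described in this section can be sketched in Python. This is an illustrative sketch under stated assumptions, not the Karnak implementation: the function name generate_uid, the UTF-8 encoding of the input UID, and the big-endian interpretation of the 16 bytes are choices made for this example.

```python
import hashlib
import hmac

OID_ROOT = "2.25"


def generate_uid(secret: bytes, original_uid: str) -> str:
    """Derive a deterministic "2.25." UID from the original UID."""
    digest = hmac.new(secret, original_uid.encode("utf-8"), hashlib.sha256).digest()
    uuid_bytes = bytearray(digest[:16])  # first 128 bits of the hash

    # Version 4: high nibble of byte 6 becomes 0100.
    uuid_bytes[6] &= 0x0F
    uuid_bytes[6] |= 0x40

    # Variant 1: two high bits of byte 8 become 10.
    uuid_bytes[8] &= 0x3F
    uuid_bytes[8] |= 0x80

    # Direct decimal encoding of the 128-bit integer (big-endian assumed).
    return OID_ROOT + "." + str(int.from_bytes(uuid_bytes, "big"))
```

A 128-bit integer has at most 39 decimal digits, so the result stays within the 64-character limit of the DICOM UI value representation.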

Shift Date, Generate a random date

Karnak implements a randomized date shifting action. For data consistency, the shift must be identical for a given project and patient. For example, if a random shift is made for the birthdate of the patient “José Santos”, it must be the same for each instance associated with “José Santos”, even if the instance is loaded later.

The random shift date action will use the HMAC defined above and a range of days or seconds defined by the user. If the minimum isn’t specified, it defaults to 0.

The patientID, along with the project’s secret, will be passed to the HMAC, ensuring data consistency by patient.

The code below illustrates how a random value is generated within a given minimum (inclusive) and maximum (exclusive) range.

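scaleHash(PatientID, scaledMin, scaledMax):
    patientHashed = hmac.hash(PatientID)
    scale = scaledMax - scaledMin

    // The first 6 bytes (48 bits) of the hash are normalized to [0, 1)
    // so that the result stays within [scaledMin, scaledMax)
    fraction = patientHashed[0:6].toPositiveDecimal() / 2^48
    shift = (fraction * scale) + scaledMin
    return shift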
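
A runnable Python version of this idea is sketched below. It is not Karnak's code: the names scale_hash and shift_date, the UTF-8 encoding of the PatientID, the integer arithmetic used to map the 48-bit value into the range, and the choice to subtract the shift from the original date are all assumptions of this example.

```python
import hashlib
import hmac
from datetime import date, timedelta


def scale_hash(secret: bytes, patient_id: str, scaled_min: int, scaled_max: int) -> int:
    """Map the patient's hash to an integer in [scaled_min, scaled_max)."""
    digest = hmac.new(secret, patient_id.encode("utf-8"), hashlib.sha256).digest()
    value = int.from_bytes(digest[:6], "big")  # 48-bit positive integer
    scale = scaled_max - scaled_min
    # value / 2**48 lies in [0, 1), so the offset lies in [0, scale).
    return scaled_min + (value * scale) // 2**48


def shift_date(original: date, secret: bytes, patient_id: str,
               max_days: int = 365, min_days: int = 0) -> date:
    """Shift a date by a per-patient, per-project number of days."""
    days = scale_hash(secret, patient_id, min_days, max_days)
    # Shifting towards the past is an assumption of this sketch.
    return original - timedelta(days=days)
```

Because the shift depends only on the project's secret and the PatientID, every instance of the same patient in the same project receives the same shift, whenever it is processed.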

Pseudonym

This chapter details the problems linked to the PatientID generation for different de-identification methods.

A patient can participate in several studies using different de-identification methods. Depending on the project or the clinical research, the de-identification profile can keep some of the patient’s metadata. A pseudonym and a patientID are generated and assigned to the patient in order to identify them in the context of the project.

Most of the patient’s identifying information is contained in the Patient Module.

The figure below illustrates a case where some of the patient’s data could be leaked. During the de-identification, the patient is associated with a pseudonym provided by an external service or mapping table. In this example, a patient’s study falls in the scope of two different projects in Karnak. The first de-identification removes the patient birthdate (3. Apply Project 1) and the second keeps it (5. Apply Project 2). If the patient pseudonym is used as patient identification in the Patient Module, and the data is reconciled between the two studies, the birthdate will be leaked.

[Figure: Pseudonymization]

The figure below illustrates an alternative way of handling pseudonyms during the de-identification.

In this example, the patient is still associated with a pseudonym provided by an external service or mapping table. However, Karnak generates a PatientID based on this pseudonym and on characteristics specific to the project. This ID identifies the patient in the context of the project. When different projects use different de-identification methods, the patients cannot be reconciled across projects and no data is leaked.

[Figure: Generate PatientID]

PatientID generation

Karnak generates a PatientID to solve the problem explained previously. The PatientID is generated using the HMAC function defined in the project, see chapter Action U, Generate a new UID for more details.

The de-identified patientID is generated as follows:

  • the patient’s pseudonym is retrieved from an external service or a mapping table
  • the pseudonym is hashed using the HMAC function and the project’s secret, making it unique and deterministic in the context of the project
  • the patientID is set to the first 16 bytes of the hashed pseudonym

The pseudonym will be used as the patient’s name if no other action has been defined during de-identification.
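
The steps above can be sketched in Python. This is an illustration only: the function name generate_patient_id, the UTF-8 encoding of the pseudonym, and the hexadecimal rendering of the 16 bytes are assumptions, since this section does not specify how the bytes are encoded in the Patient ID attribute.

```python
import hashlib
import hmac


def generate_patient_id(project_secret: bytes, pseudonym: str) -> str:
    """Derive a project-specific PatientID from the patient's pseudonym."""
    digest = hmac.new(project_secret, pseudonym.encode("utf-8"), hashlib.sha256).digest()
    # Keep the first 16 bytes of the hash; hex rendering is an assumption.
    return digest[:16].hex()
```

With two projects that use different secrets, the same pseudonym yields two unrelated PatientIDs, which is what prevents the reconciliation described in the Pseudonym section.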

Keep the correspondence between pseudonym and patient

The patient’s identity can still be retrieved after de-identification.

The patientID is generated from the pseudonym and the project’s secret using the HMAC function, making it irreversible. The pseudonym, however, is stored in the attribute Clinical Trial Subject ID (0012,0040). Using the mapping table or the service that provided the pseudonym, it is possible to retrieve the original patient identity.

« The Clinical Trial Subject ID (0012,0040) identifies the subject within the investigational protocol specified by Clinical Trial Protocol ID (0012,0020). » [3]

Attributes added by Karnak

Some attributes are automatically set by Karnak during de-identification in the following modules.

SOP Common

The following attributes are set in the SOP Common module during de-identification:

  • Instance Creation Time (0008,0013) is set to the time the SOP instance was created. The value representation used is TM (HHMMSS.FFFFFF).
  • Instance Creation Date (0008,0012) is set to the date the SOP instance was created. The value representation used is DA (YYYYMMDD).

Patient Module

The following attributes are set in the Patient Module during de-identification:

  • Patient ID (0010,0020) is set to the hashed pseudonym, see PatientID generation for more details
  • Patient Name (0010,0010) is set to the pseudonym if no other action is applied to that tag during de-identification
  • Patient Identity Removed (0012,0062) is set to YES
  • De-identification Method (0012,0063) is set to the concatenated profile element codenames applied to the instance in order of application

The profile element codenames are concatenated and separated by -. For example, a profile composed of the profile elements action.on.specific.tags and basic.dicom.profile will appear as action.on.specific.tags-basic.dicom.profile.

Clinical Trial Subject Module

The following attributes are set in the Clinical Trial Subject Module during de-identification:

  • Clinical Trial Sponsor Name (0012,0010) is set to the project name
  • Clinical Trial Protocol ID (0012,0020) is set to the profile codename (concatenated profile elements’ codename)
  • Clinical Trial Protocol Name (0012,0021) is set to null
  • Clinical Trial Site ID (0012,0030) is set to null
  • Clinical Trial Site Name (0012,0031) is set to null
  • Clinical Trial Subject ID (0012,0040) is set to the patient’s pseudonym