Thursday, April 30, 2026

Considerations for Implementing the SLS RI

In a previous post I introduced a Reference Implementation of a Security Labeling Service. I published it as open source in the SHIFT SLS RI GitHub Repository. I also published the API definition, including a profile on ValueSet for defining a ValueSet of codes for sensitive tagging, along with an IG that holds many prototype ValueSets of sensitive topics. I also explained how sensitive topics are a subset of data, distinct from Normal health data.

The SLS Reference Implementation is designed to provide an informative example of how to implement the Security Labeling Service (SLS) for Health Information tagging so that fine-grained access controls can be implemented. Data tagging applies a categorization code to a FHIR resource based only on the content of that FHIR resource. The tag does not indicate what kind of access control is applied. The access control rules are separate.

When is tagging needed?

  1. Data does not need to be tagged if there are no access control policies (e.g. Consent, Business Rules, or Regulations) that would apply different rules to different categories of data.
  2. Data does not need to be tagged if the current state of Consent is a blanket permit or deny. That is, when the Patient Consent has no specific rules per category, tagging is not needed.
  3. Data need only be tagged to the degree sufficient to support the categories used by the access control policies. The SLS RI is configured by loading the SLS Policies as ValueSets.
  4. When the tagging policies change (e.g. the SLS ValueSets are updated), any data that was tagged under the old policy needs to be retagged. The SLS RI implements a timestamp on the tags so that data need not be retagged if the SLS policies have not changed since the last tagging.

The Reference Implementation of the SLS is designed to make the concept of tagging clear. It is not designed to be fast or efficient. In a real-world system, the tagging of the data would be designed into the system utilizing features of that system (e.g., leveraging database indexing).

When to apply the SLS RI?

Executing the SLS against any data is an expensive operation, and the cost grows as the number of entries in the SLS policies increases. The SLS must look at all codes in the data and detect if any of the codes match any of the configured SLS policies. Thus, the more policies (ValueSets) and the more entries in those ValueSets, the more computationally expensive the tagging process becomes. The SLS RI includes a timestamp so that data are not inspected unless the data timestamp is older than the SLS Policies. See the ValueSet Profile.
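The matching step described above can be sketched as follows. This is a minimal illustration in Python and not the SLS RI's actual code: the policy table, the `http://example.org/codes` system, the demo codes, and the function names are all hypothetical, and a real deployment would expand its policies from the configured ValueSets. The sketch walks a resource, collects every (system, code) pair, and intersects them with each policy's code set.

```python
# Hypothetical SLS policies: sensitivity topic -> set of (system, code) pairs.
# In practice these would be expanded from the configured ValueSets.
EXAMPLE_SYSTEM = "http://example.org/codes"  # made-up code system
POLICIES = {
    "SEX": {(EXAMPLE_SYSTEM, "demo-sti-test")},
    "ETH": {(EXAMPLE_SYSTEM, "demo-opioid-screen")},
}


def collect_codings(node):
    """Yield (system, code) for every Coding-shaped dict found in the data."""
    if isinstance(node, dict):
        if "system" in node and "code" in node:
            yield (node["system"], node["code"])
        for value in node.values():
            yield from collect_codings(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_codings(item)


def sensitivities_for(resource, policies=POLICIES):
    """Return the topics whose policy matches any code in the resource."""
    # Skip meta so existing security tags are not treated as clinical content.
    content = {k: v for k, v in resource.items() if k != "meta"}
    found = set(collect_codings(content))
    return {topic for topic, codes in policies.items() if codes & found}


observation = {
    "resourceType": "Observation",
    "code": {"coding": [{"system": EXAMPLE_SYSTEM, "code": "demo-sti-test"}]},
}
print(sensitivities_for(observation))  # {'SEX'}
```

Every policy is checked against every resource, so the work grows with both the number of ValueSets and the amount of data inspected.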

There is a concern with legacy databases not having an element to hold the security tags. Thus there needs to be a way to support SLS in those cases.

There are a few ways to apply the SLS RI:

Executed on all data Creation and Update

This is the most comprehensive approach, ensuring that all data are tagged appropriately. However, it may have performance implications due to the need to tag every resource. It also relies on the system being able to persist the tags with the data. This approach also must retag data when the tagging policy (e.g. SLS ValueSets) changes.

This model can likely be implemented efficiently by designing an SLS into the database system itself. Thus, the SLS RI is not likely to be directly usable if this is the desired model.


Executed on demand when a Patient's data are accessed


When a Patient's data are accessed, a task examines all of that Patient's data and applies the appropriate tags. This approach allows for dynamic tagging based on the current state of the data and the applicable access control policies. However, it may lead to performance issues due to the need to tag data at the time of access, which could introduce latency. It also relies on the system being able to persist the tags with the data. This approach has the benefit of not changing data to add tags unless that Patient is actively being accessed. Thus historic patients that are no longer being accessed would not need to be tagged.

The system should keep some timestamp at the Patient level to know when the Patient was last tagged, so as to keep from running this task unnecessarily often.
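A Patient-level timestamp check along these lines might look like the sketch below. The `PatientTagTracker` class and its parameters are hypothetical names introduced for illustration. Comparing against the latest data change, in addition to the policy change time, is an assumption beyond what is stated above.

```python
from datetime import datetime, timezone


class PatientTagTracker:
    """Remembers when each Patient was last tagged (hypothetical helper)."""

    def __init__(self, policies_updated):
        self.policies_updated = policies_updated
        self._last_tagged = {}  # patient id -> datetime of last tagging run

    def ensure_tagged(self, patient_id, latest_data_change, tag_patient, now=None):
        """Run tag_patient only if policies or data changed since the last run.

        Returns True if the tagging task ran, False if it was skipped.
        """
        now = now or datetime.now(timezone.utc)
        last = self._last_tagged.get(patient_id)
        if (
            last is not None
            and last >= self.policies_updated
            and last >= latest_data_change
        ):
            return False  # tags are still current, skip the expensive task
        tag_patient(patient_id)
        self._last_tagged[patient_id] = now
        return True
```

Here `tag_patient` stands in for whatever task examines and tags all of that Patient's data. A persistent system would store the timestamp with the Patient record rather than in memory.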



Executed on demand with each Search with writeback


When a search is executed, the SLS RI is executed to inspect the Search Bundle and any new tags are written back to the database. This approach allows for dynamic tagging based on the current state of the data and the applicable access control policies at the time of search. However, it may lead to significant performance issues due to the need to tag data at the time of search, which could introduce latency. It also relies on the system being able to persist the tags with the data. This approach has the benefit of not changing data to add tags unless that data is actively being searched for. This inspection of the Search Bundle would only be done if the Access Control decision has residual rules to further remove categories of sensitive data.

Executed on demand with each Search doing only inline tagging

When a search is executed, the SLS RI is executed to inspect the Search Bundle and any new tags are only added to the Search Bundle in memory and not written back to the database. This approach allows for dynamic tagging based on the current state of the data and the applicable access control policies at the time of search without changing the underlying data. However, it may lead to performance issues due to the need to tag data at the time of search, which could introduce latency. This approach has the benefit of not changing data to add tags unless that data is actively being searched for and does not require that the system be able to persist the tags with the data. This inspection of the Search Bundle would only be done if the Access Control decision has residual rules to further remove categories of sensitive data.
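Inline tagging of a Search Bundle can be sketched as below. This is an illustration under stated assumptions, not the SLS RI's API: `filter_search_bundle`, `denied_topics`, and the `classify` callable (resource in, set of sensitivity codes out) are hypothetical names. The sketch tags a copy of the Bundle in memory, removes entries that carry a denied topic, and skips inspection entirely when there are no residual rules.

```python
import copy

ACT_CODE = "http://terminology.hl7.org/CodeSystem/v3-ActCode"


def security_codes(resource):
    """Codes already present in meta.security (system ignored for brevity)."""
    return {c.get("code") for c in resource.get("meta", {}).get("security", [])}


def filter_search_bundle(bundle, denied_topics, classify):
    """Tag a copy of the Bundle in memory and drop entries with denied topics.

    classify is a hypothetical callable: resource -> set of sensitivity codes.
    """
    if not denied_topics:
        return bundle  # no residual rules, so no inspection is needed
    result = copy.deepcopy(bundle)  # the stored data is never modified
    kept = []
    for entry in result.get("entry", []):
        resource = entry.get("resource", {})
        security = resource.setdefault("meta", {}).setdefault("security", [])
        # Add only the tags that are not already on the resource.
        for topic in sorted(classify(resource) - security_codes(resource)):
            security.append({"system": ACT_CODE, "code": topic})
        if not (security_codes(resource) & set(denied_topics)):
            kept.append(entry)
    result["entry"] = kept
    return result
```

Because the function works on a deep copy, nothing needs to be persisted, which suits legacy databases that have no element to hold security tags.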


🎁Note, this is one of those projects I do pro bono but would love if someone would care enough about it to contract with me. Sustaining the Work That Sustains Trust: Why I’m Seeking Support for Some of My Standards Efforts

Tuesday, April 28, 2026

Sensitive data as Venn diagram

In Healthcare, Normal data is all data that is linked to an identified Patient and not specifically sensitive. Any data that is sensitive would be Restricted. "Normal" refers to the normal (bell) curve, and thus to the majority of data. Sensitive data can be categorized into sensitivity topics, and some data may fall into multiple sensitive categories, as illustrated in the Venn diagram below.

Sensitive topics are generally potentially stigmatizing information, for which exposure would present high risk of harm to an individual's reputation and sense of privacy. 

In a data tagging architecture, sensitivity topics are indicated as a "sensitivity" code in the FHIR Resource.meta.security tag of FHIR resources, and can be used for access control decisions in a Privacy Consent driven access control model.

Normal data is often not tagged as Normal, but rather is just the absence of any sensitive tag. This is recognizing that the vast majority of medical data are Normal (algorithmically average). The presence of any sensitive tag would make the data Restricted, indicated by the R (restricted) Confidentiality code.

The data are tagged with the kind of sensitivity purely due to their data content, and not due to any other factors such as the Patient consent status. The labeling does not imply that there is any particular access control policy in place, but rather that the data is sensitive and may require special handling. 
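As an illustration of this tagging model, the sketch below shows a resource carrying a sensitivity tag and an R Confidentiality code in meta.security, plus a helper that derives Normal vs Restricted from the presence of any sensitivity tag. The code system URLs are the HL7 v3 ActCode and Confidentiality systems. The `SENSITIVITY_CODES` subset and the `confidentiality_of` helper are assumptions made for this example.

```python
ACT_CODE = "http://terminology.hl7.org/CodeSystem/v3-ActCode"
CONFIDENTIALITY = "http://terminology.hl7.org/CodeSystem/v3-Confidentiality"

# Illustrative subset of sensitivity codes; a real system would use the
# topics covered by its configured SLS policies.
SENSITIVITY_CODES = {"SEX", "ETH", "PSY", "HIV"}

tagged_observation = {
    "resourceType": "Observation",
    "id": "example",
    "meta": {
        "security": [
            {"system": ACT_CODE, "code": "SEX"},
            {"system": CONFIDENTIALITY, "code": "R"},
        ]
    },
}


def confidentiality_of(resource):
    """Return 'R' if any sensitivity tag is present, otherwise 'N' (Normal)."""
    security = resource.get("meta", {}).get("security", [])
    sensitive = any(
        c.get("system") == ACT_CODE and c.get("code") in SENSITIVITY_CODES
        for c in security
    )
    return "R" if sensitive else "N"


print(confidentiality_of(tagged_observation))             # R
print(confidentiality_of({"resourceType": "Observation"}))  # N
```

An untagged resource is treated as Normal simply because no sensitivity tag is present.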

The access control policies would be defined separately and could use the presence of these sensitivity tags to make decisions about who can access the data and under what circumstances. 

For example, a given patient indicates that broad treatment use of their data is not restricted, but that their Sexual Health sensitive data must not be shared beyond their PCP and never for a non-Treatment Purpose of Use. In this case, the other sensitive tags beyond Sexual Health have no effect on accessibility. Note that this Consent policy just needs to see the Sexual Health tags; it does not care about Normal vs Restricted. 
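The example Consent above could be evaluated roughly as in the following sketch. The `permit` function and its parameters are hypothetical. `SEX` is used as the Sexual Health sensitivity code and `TREAT` as the Treatment purpose of use. The example does not say how non-Treatment requests for other data are handled, so denying them here is purely an assumption to keep the sketch small.

```python
def permit(security_tags, requester_is_pcp, purpose_of_use):
    """Decide access for one resource under the example Consent.

    security_tags: set of sensitivity codes found on the resource.
    """
    if purpose_of_use != "TREAT":
        # Assumption: the example only grants broad Treatment use.
        return False
    if "SEX" in security_tags:
        # Sexual Health data is shared only with the patient's PCP.
        return requester_is_pcp
    # Other sensitivity tags have no effect under this Consent.
    return True


print(permit({"ETH"}, requester_is_pcp=False, purpose_of_use="TREAT"))  # True
print(permit({"SEX"}, requester_is_pcp=False, purpose_of_use="TREAT"))  # False
print(permit({"SEX"}, requester_is_pcp=True, purpose_of_use="TREAT"))   # True
```

The rule never looks at the Confidentiality code, only at the Sexual Health tag, matching the point that this policy does not care about Normal vs Restricted.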

Resources:
  • An Implementation Guide with various ValueSet(s) that could be used by a Security Labeling Service (SLS) to tag data according to specific sensitivity. - SLS ValueSets
  • A Reference Implementation (OpenSource) of a Security Labeling Service (SLS) that I created using Vibe coding with AI - SLS RI GitHub Repository
  • An Implementation Guide defining that API, and specifically a Profile on ValueSet to identify sensitivity type and the codes for that type. -- SLS RI Implementation Guide
  • Example Patient Data SHIFT Demo Scenarios IG
