Thursday, February 4, 2021

From Implementation-Guide to IHE-Connectathon

So you have an Implementation Guide (aka IHE-Profile) and want to test it at an IHE-Connectathon....

The following is mostly based on what happens when an IHE Profile (aka Implementation Guide) is written. Much of what I outline is not as visible as it should be. Often these tasks are done by the IHE Connectathon staff, with a bit of help and oversight from the Profile writers and co-chairs. I am working to make more of this visible, and to move it earlier in the process.

I am not yet convinced that IHE is ready to take on the task of testing a specification not written by IHE. This is talked about a lot, but not many resources have been put toward it. First up was SANER, but it is a bit stalled. I started a TestPlan in the SANER implementation guide, and Keith has improved it. I am just not sure how comprehensive the SANER test plan is.

I am a fan of using IHE-Connectathon for more comprehensive and formal testing, versus using the FHIR-Connectathon more for specification validation and experimentation. See my past articles on "What is a Connectathon", "Maturing FHIR Connectathon without confusing the marketplace", and "Introduction to IHE Connectathon and Projectathon".

So the idea of IHE-Connectathon testing of a developed IG goes something like this.
  1. IHE develops a test plan -- this is the overall plan for how the actors would be tested independently, and how scenarios would test a set of products.
  2. IHE develops test procedures for everything in the plan
  3. IHE develops or selects test tools to simulate peer actors for actor testing
  4. IHE enters the IG, actors, and any optional pathways into Gazelle
  5. Products sign up for testing
  6. An IHE-Connectathon happens (virtual, physical, or ad hoc)
  7. Products test using the actor test procedures and tools
  8. Products submit their results to proctors
  9. Proctor checks the results against the expected results and either passes them or sends them back to try again
  10. Products are paired up with peers for cross-product testing
  11. Proof of cross-testing is submitted to the proctor
  12. After final review, products are given a gold star

Generally today this starts with #4, and only after 3 or more products sign up for connectathon are steps #1-3 done. We could do this with HL7 Implementation Guides, but I would think we should look for interest before we enter them into Gazelle. That said, the theory is that with IG publication and the CapabilityStatements in these IGs, this Gazelle registration could be automated.
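To make that idea concrete, here is a minimal sketch (mine, not an existing tool) that pulls a CapabilityStatement published in an IG and summarizes the resources and interactions an actor claims to support. The URL and function name are placeholders, and nothing here assumes an actual Gazelle API.

```python
# Hypothetical sketch: summarize a CapabilityStatement from a published IG so
# the result could feed whatever registration Gazelle needs. Placeholder URL;
# no Gazelle API is assumed.
import requests

def summarize_capability_statement(url):
    """Return (resource type, supported interactions) pairs from a CapabilityStatement."""
    cs = requests.get(url, headers={"Accept": "application/fhir+json"}).json()
    summary = []
    for rest in cs.get("rest", []):
        for resource in rest.get("resource", []):
            interactions = [i["code"] for i in resource.get("interaction", [])]
            summary.append((resource["type"], interactions))
    return summary

# e.g. summarize_capability_statement(
#         "https://example.org/ig/CapabilityStatement-document-responder.json")
```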

Generally today steps 1-3 are done by two IHE experts. These steps are often done in isolation, and in the first year done very quickly, because the signal that an Implementation Guide needs these written comes very close to the time at which the IHE-Connectathon happens. This is why I want step 1 to be done as part of the specification writing, which will also assure that the goal of the Implementation Guide is clear.

Step 1 is where I and a few others are thinking Gherkin comes in. Step 1 is a critical step for cooperation between the specification writers, product implementers, and test writers. The theory is that if we had a mature Gherkin infrastructure and writing practice, then many of the other steps would be easier, and the testing could potentially be automated. Gherkin fits nicely because it is very behavior based, and is considered a critical tool in Behavior Driven Development (BDD). Gherkin promises a well-patterned sentence structure (Given, When, Then) so that the sentences can be parsed by regular expressions and glue-code. These regular expressions and glue-code are where the magic happens, and are specific to every project. The theory is that IHE might find some of them reusable.
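To make the Given/When/Then and glue-code idea concrete, here is a minimal sketch using Python's behave library with its regular-expression step matcher. The scenario wording, step patterns, and the fhir_client helper are invented for illustration; they are not taken from any IHE test plan.

```python
# steps/document_sharing_steps.py -- hypothetical Gherkin glue-code using the
# Python behave library. Each Given/When/Then sentence in a .feature file is
# matched against a regular expression below, and captured groups are passed
# into the step function.
#
# Matching feature text might read:
#   Given a Document Source has prepared a submission bundle with 2 documents
#   When the Document Source submits the bundle to the Document Recipient
#   Then the Document Recipient returns an HTTP 200 response
from behave import given, when, then, use_step_matcher

use_step_matcher("re")  # interpret step patterns as regular expressions

@given(r"a Document Source has prepared a submission bundle with (?P<count>\d+) documents")
def step_prepare_bundle(context, count):
    # Build a canned example bundle (fhir_client is an assumed helper).
    context.bundle = context.fhir_client.build_example_bundle(documents=int(count))

@when(r"the Document Source submits the bundle to the Document Recipient")
def step_submit_bundle(context):
    context.response = context.fhir_client.post("/", context.bundle)

@then(r"the Document Recipient returns an HTTP (?P<status>\d{3}) response")
def step_check_status(context, status):
    assert context.response.status_code == int(status), (
        "expected %s, got %s" % (status, context.response.status_code))
```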

Here is an example of a test plan as part of my MHD IG that I am developing; this test plan only covers about 25% so far. It is not using Gherkin (yet), but rather is just a minimally expressed set of the test scenarios that are envisioned to be necessary to test comprehensively.

Here is the SANER test plan. I started this page, but it has taken on a life beyond my efforts, so I am not exactly sure if it is a good example. But it is again a high-level set of scenarios.

A similar setup can be done with a "Projectathon", which starts with the above already done and focuses on project-specific further refinement, where the projects tend to be regions, countries, or other communities. Possibly I will write about this in a future blog article.

The above has not been outlined like this before, so this is just my first try at expressing it. Each step is likely 20-60 hours of work to do it right; thus getting to step 6 means at least 100 hours expended. I will see if others agree.


4 comments:

  1. Hi John,

    Thanks for the excellent write-up. For our Swiss projectathon, where we have local implementation guides, we are working closely with IHE services to develop the testing around them.

    We encounter some overlap in the roles in steps 1-2. Is it the responsibility of the IG authors to develop the test plan, or can this be provided by the base IHE test plan? In any case, starting to document this with a Gherkin test plan sounds like a good approach to share the intention of the testing directly in the IG.

    For step 3, for validation of the Implementation Guide content, we have had some first experience connecting the Gazelle tooling with the FHIR validator so that the reports are directly integrated.

    Is proctor a new word for what we call 'monitor' or is this a new role?

    I agree with your time estimates; hopefully, if this is integrated into the development of the IGs, parts of it can be reused for derived IGs.

    1. I have not expressed what would happen in a projectathon case. I would guess that steps 1-2 would be more team-based. I am very interested in how this might look different.

      I used Proctor rather than Monitor only because the English word Monitor can also be a verb that could be misunderstood. The key I was pointing out is the manual step of checking the work, which is more classically assigned to a proctor than to a classic monitor. This manual step is something to be reduced as much as possible through automation and tools.

      With FHIR the validator is a critical tool, and it is fantastic to have such a thing. However, we also need generators that can initiate interactions; I am looking to Inferno for that. And with complex workflows there are complex interactions between trigger events and incoming transactions.
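      For illustration only, here is a minimal sketch of scripting that validator (the HL7 validator_cli.jar) from a test harness so its report can be folded into other tooling; the file names and IG package id are placeholders.

```python
# Hypothetical sketch: run validator_cli.jar on one resource and write the
# validation outcome to a file for later reporting. Placeholder file names.
import subprocess

def validate_resource(resource_path, ig_package, report_path):
    """Run the HL7 FHIR validator against one resource and write its outcome."""
    cmd = [
        "java", "-jar", "validator_cli.jar",
        resource_path,
        "-version", "4.0.1",
        "-ig", ig_package,       # e.g. the published package id of the IG under test
        "-output", report_path,  # machine-readable validation outcome
    ]
    return subprocess.run(cmd).returncode

# e.g. validate_resource("DocumentReference-example.json",
#                        "example.fhir.my-ig", "validation-report.json")
```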

  2. Thanks John, for yet another informative post! I agree that Gherkin is a great high-level BDD DSL for (1). Are you aware of a FHIR-specific Gherkin test runner for (3)? If not, there may be a gap, in that someone would need to write executable tests based off the Gherkin in (1) [^1]. Partial duplication, and an invitation for (human) error, drift, and incomplete test coverage.

    Since Cucumber is already implemented in many languages, I can imagine a sushi/fsh executor for Cucumber. One writes sushi/fsh Gherkin which is executable by Cucumber.

    I found one mention of this approach in the FHIR chat, but as of yet no implementation.
    > FWIW - The Gemini #SDPi+FHIR group has been looking at using Gherkin (and possibly Cucumber Studio) to capture detailed use case / scenario requirements that can eventually be mapped to test tooling configuration & execution; possibly leveraging ReqIF as well.
    https://chat.fhir.org/#narrow/stream/179166-implementers/topic/testscript.20executor

    [^1] Alternatively, are you of the opinion that there is no gap? That the human-readable Gherkin is _intended_ as a non-executable contract between the Actors and the IG authors? That the executable tests need to be written separately in (3) is a feature, not a bug: https://news.ycombinator.com/item?id=10194242

    1. I certainly don't have all the answers; I am pushing forward bit by bit. My primary interest in Gherkin is that it is both human-readable and well-formed: human-readable with the intention that humans will be able to see when they have said something wrong or incomplete, and well-formed enough to hope for reliable tests. I can get this benefit even without leveraging Cucumber for test automation.

      But I do expect that some test automation will be possible right away. I have put some of these expectations into one of my IGs that is not yet public. It proposes glue-code that can (a) given an id, retrieve the resource, (b) given a resource, use an external validator to validate it against a given profile, (c) given an example from the IG, adjust given fields, and (d) given a resource, submit it to a server endpoint. From this I can do much of the interop testing. And this would not be the end, just a core base to build upon.
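      As a rough sketch of what those four pieces of glue-code could look like, here is an example using Python's behave library and the requests HTTP client; the server URL, step wording, and example store are my own assumptions, not content from that IG.

```python
# steps/ig_glue_steps.py -- hypothetical behave glue-code for the four building
# blocks listed above. Placeholder base URL and assumed example store.
import copy
import requests
from behave import given, when, then

BASE = "http://example.org/fhir"  # placeholder test-server endpoint
HEADERS = {"Content-Type": "application/fhir+json",
           "Prefer": "return=representation"}

@given('the IG example "{example_id}" with "{field}" set to "{value}"')
def step_adjust_example(context, example_id, field, value):
    # (c) take a canned example from the IG and adjust a named top-level field
    resource = copy.deepcopy(context.ig_examples[example_id])  # assumed example store
    resource[field] = value
    context.resource = resource

@when("the resource is submitted to the server")
def step_submit(context):
    # (d) submit the resource to a server endpoint
    context.response = requests.post(
        BASE + "/" + context.resource["resourceType"],
        json=context.resource, headers=HEADERS)

@then("it can be retrieved again by its id")
def step_retrieve(context):
    # (a) given an id, retrieve the resource
    created = context.response.json()
    fetched = requests.get(BASE + "/" + created["resourceType"] + "/" + created["id"])
    assert fetched.status_code == 200

@then('it validates against the profile "{profile}"')
def step_validate(context, profile):
    # (b) hand the resource to a validator; here the server's $validate
    #     operation stands in for an external FHIR validator
    outcome = requests.post(
        BASE + "/" + context.resource["resourceType"] + "/$validate",
        params={"profile": profile}, json=context.resource, headers=HEADERS)
    assert outcome.status_code == 200
```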

      I do hope that IHE can bring a community together to achieve this. There is high interest from within IHE. I intend to support this as an open collaboration. I welcome all that are interested.
