Friday, April 4, 2014

Murky Research Award

I am going to take a page from Keith, and his Ad Hoc Motorcycle Guy Harley Award. This is an authorized pillage of his idea. I thus create the Murky Research Award, a tip of the hat to Car Talk - Click and Clack - Murky Research. I am constantly reminded of Murky Research when I explain to people how to pronounce my name. (Keith also recommended this title.) Sorry my graphic isn't as nice as the Ad Hoc Motorcycle Guy Harley Award.

The first Murky Research Award goes to Josh Mandel, who showed tremendous research ability, transparency, and ultimate professionalism in his pursuit of knowledge on security vulnerabilities he discovered in some EHR products: malformed CDA (an XML format) documents that are not robustly sanitized and validated before being displayed using a simple stylesheet and an off-the-shelf browser (or browser framework). The details of this are far better explained by Josh.


Dear Strucdoc and Security WGs,

In this era of personal health records and Direct messaging, it's increasingly unrealistic to assume that an EHR can trust every (C-)CDA document that arrives in a clinician's inbox. Here's an article I've published on the SMART Platforms blog describing a set of security considerations for the display of potentially malicious C-CDA documents:


This post describes a set of security considerations that are probably well-known to many of you -- but that have been overlooked by multiple real-world EHR products, leading to serious vulnerabilities. 

Bringing "best practices" to real-world implementations is critical, and as a community we should think about how HL7 might help. (In this specific case, for example, by hardening stylesheets and including warnings that these stylesheets are unsafe for use with untrusted documents. In general, by advocating for well-defined vulnerability reporting protocols and bounty programs.)

Best,

  Josh

Not only did Josh do the research into the deep details, and write them up in exacting detail, but what you all don't yet know is that he has been working one-on-one with the vendor community to help them understand the problem, multiple times delaying his release to give a vendor another week. He did all this with the utmost discretion and professionalism. I know he is going to publish even deeper details.

It is not easy for someone who knows this level of problem to be so professional and to follow the rules of responsible disclosure. My hat goes off to Josh Mandel. Thank you.

Tuesday, April 1, 2014

HIPAA Risk Assessment reader

HHS/ONC has released a fantastic and easy to use HIPAA Security Risk Assessment tool:
New Security Risk Assessment (SRA) tool

In collaboration with the HHS Office for Civil Rights, we released this morning a new tool designed to help practices conduct and document a comprehensive assessment to identify risks in their organizations from the Health Insurance Portability and Accountability Act (HIPAA) Security Rule. The SRA tool also produces a report that can be useful during audits. You can read the news release announcing the new tool here.
Okay, in case you didn't notice, today is April 1st... This tool from HHS/ONC is potentially useless to someone unwilling to read the HIPAA Security Rule, and unwilling to contract with even a low-end security consultant. The big news is that this tool is just a 'wizard' that walks you through reading the HIPAA Security Rule. Once you are done using this tool, you WILL HAVE read the HIPAA Security Rule. You are likely no smarter, and you end up with a spreadsheet that just recorded your clicks through the wizard.
I must provide a little bit of reality. I really do think (and this is not an April 1st joke) that HHS/ONC have tried. The HIPAA Security Rule is not easy for some to grasp. Unfortunately, I really don't think that a pretty wizard is going to make it any more readable. So I must give them some positive credit for trying. I just think you would be better off reading the regulation itself, and hiring even a low-end security consultant.

Wednesday, March 26, 2014

Health Information Exchange: Centralized, Federated, or Distributed

I was asked why don't we just centralize all our health information so that it can be decomposed and harmonized once, rather than presuming that every little doctor office has the ability to have high-powered algorithms to decompose and harmonize complex healthcare information?

Well, I am always willing to say something stupid, so here goes my non-medically-trained viewpoint. Feel free to tell me how I am wrong in comments.

People often skip straight to the defined end-goal. If this was indeed a useful goal, then this would be a useful solution. The problem is that although we humans today are very mobile, we are not actually seen at all possible care settings. Thus the re-analysis of the longitudinal data only needs to be done at a few places, and that place is our GP. When the GP does this, they use the data they find to create "their view" of the patient. If you change GPs, the data is likely re-analyzed, if it wasn't shared by the previous GP. I am not saying that a centralized view wouldn't be useful; I am saying that federated summation isn't really suboptimal.

A problem with global summation of the longitudinal view is that there is no universal medical view that is accepted globally (glacially). Radiology has had structured and coded forms in DICOM for a long time now. Why in DICOM do we keep each image independent? Why are they not all harmonized into a perfect 3D view of the body? Surely radiologists would love to see this view. Surely they would prefer not to bend a chest X-ray around in their mind to fit the curves of the body, and mentally integrate that shoulder injury from 1998 into the image. This is what is done in Star Trek, so clearly this is where we will end up. Right?

Which brings up the other thought that I have. Some past data are useless, or are only relevant at specific times. Even when these data are available via XDS, they are not incorporated into the GP view. In fact I expect the data shared via XDS is seen as reference material and is not often put into the GP view, at least not in whole. It is an emergency room visit summary, a referral to a specialist, a request for overview, reports from a personal health measurement device, etc. It is important, but the GP will likely take advice from that external data, not take it all.

The last thought I have on the topic is that if all possible data were incorporated into a singular view, there would need to be provenance and change-tracking on each element back to the source. These record-keeping aspects would need to be very 'good', as life depends on them. That is, we would need to think through how one would prove that the summary view is perfect, or more specifically prove who is at fault when it is wrong. Which brings up medical-liability issues related to your GP making decisions based on data that they must trust as perfect. Trust is not going to come quickly, and perfection of algorithms is clearly not here. But, more to my point, the amount of data needed to prove all this technically is likely to be more than the medical data itself, and the original (XDS) data would still need to be maintained as perfect copies too.

There are other points I can think of but want to stop here. The massive database of all data has been envisioned by many. I just think that we have a huge number of baby-steps to experience before we can do that. I am hopeful that maturity will bring these things. I am also confident that this maturity will take time.

Which leads me to the conclusion that:
  1. the concept of Document is important, especially longitudinally. It is self-contained, with context, provenance, and testable completeness. Yes, there are bad documents. 
  2. the concept of Federated is important, to enable expansion of our health information and our travels. Yes this initially appears complex.
  3. the concept of agility is important, to enable change over time. Because things will change, maturity happens. 

Document Sharing Management (Health Information Exchange - HIE)

Thursday, March 20, 2014

What does the SAML assertion mean in a XDS/XCA query/retrieve?

I have received this question from multiple sites. I will take this as a really good thing: it shows a maturing of the HIE market. The question comes from a few different perspectives, but ultimately it boils down to what that SAML assertion put into the XDS or XCA query and retrieve transactions actually means. The reality is that the problem is not unique to XDS/XCA; it is true of any transaction that uses the PULL model. That is, where a request is made for some information, and some information is returned back.

This is especially hard in a federated environment (like XDS and XCA), as the Access Control decision can't be made ONCE. There is no central authority; the Access Control decisions and enforcement are federated / distributed. The requesting environment (e.g. an EHR) is expected to control things within its own environment; it has access controls inside the EHR system. The responding environment may trust that the requester has done the right access control, but it still would like the identity for audit logging purposes. The responding environment might also want to apply further access control rules.

In XCA (XCPD is shown for emphasis that this isn't just XCA, but any PULL-style transaction) there are multiple intermediaries, and the request will end up at many endpoints. Those endpoints can't be predicted at request time. This picture is simplified, showing only two levels; the nesting can go almost infinitely deep.



What is the SAML claiming?


A common mistake is to presume that the SAML assertion (i.e. XUA) is a claim about the 'user' that caused the transaction to happen. This isn't wrong, if that is indeed going to be the scope of the access that will be given to the response. This perception is common, because we sometimes need to use the example of 'user' to help differentiate the SAML assertion from the TLS identity (i.e. ATNA). The reality is that they can be the same, if that is the right scope. One might do this too, because the SAML assertion is far more accessible to service-layer application code, while the TLS identity is hard to get at.

The most likely case is that the response to a request will be incorporated into the requesting system (e.g. EHR), and thus further managed by that requesting system's access control rules. Thus the SAML assertion should identify the 'entity' that represents the scope of where the data will be available. This should then be the sum total of the roles, where your local roles are harmonized into the role vocabulary used in the Exchange.

This is also true about the PurposeOfUse statements. Meaning one must ask for all intended PurposeOfUse. One can't presume that 'treatment' is understood by the recipient as meaning both 'treatment' and 'payment'; and certainly doesn't mean 'research' and 'marketing'.

Deeper dive on the DS4P use-case

This issue was brought to light during the DS4P efforts. It is especially troubling in this environment because the legal release is a targeted release, and thus a broad request should be rejected. So we have a mismatch between the desire to get the broadest access to the data, while the data might need to be fine-grain controlled. How to resolve this is not yet clear. DS4P presumes that a broad request can be responded to with constricted rules, but there is no pathway for returning this restriction.

Authorization vs Identity

The point to be made here is that the SAML assertion we generally use (e.g. XUA) is an assertion of identity, yet a SAML assertion can also be an assertion of authorization. These authorization assertions are more commonly associated with XACML environments. Note that in OAuth the tokens are all primarily authorization tokens, where often the authorization is ignored and the identity is used. Same result, different perspective.

The XACML way to do this is through a two-step mechanism. That is, I ask the Access Control engine for authorization to access broad data. I get back an authorization assertion, possibly with a constrained list of users/roles/purposeOfUse/etc. I use that authorization to do the XCA query. Whereas today the SAML assertion is an identity assertion, not an authorization assertion. This is the topic I cover for OAuth token scope constraint. In both XACML and OAuth there is presumed to be a central all-knowing Access Control authority. This is simply not true in a federated architecture. So we tend to make multiple levels of decision, allowing the resource holder to be the final decider.
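As a toy sketch of that two-step flow (all the names, the decision logic, and the token shape here are my own illustration, not any real XACML or XCA API), the decision point may narrow the requested scope before the query is made:

```python
# Toy sketch of the two-step flow: first obtain an authorization
# assertion (possibly narrowed by the access-control engine), then
# attach it to the actual query. All names are illustrative.

def request_authorization(pdp, who, requested_scope):
    """Ask the access-control engine; it may grant a narrower scope."""
    granted = [s for s in requested_scope if s in pdp["allowed"][who]]
    return {"subject": who, "scope": granted}

def xca_query(assertion, query):
    # The responder sees the (possibly constrained) authorization,
    # not just a bare identity assertion.
    return {"query": query, "authorized_scope": assertion["scope"]}

# A policy that only permits treatment use for this user.
pdp = {"allowed": {"dr.bob": ["TREATMENT"]}}
token = request_authorization(pdp, "dr.bob", ["TREATMENT", "PAYMENT"])
result = xca_query(token, "find documents")
```

The key design point is that the narrowing happens at the decision point, before the request travels, rather than as a rejection afterward.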

So what do we do with a SAML Identity Assertion environment?

The current architecture, using SAML identity assertions, means that the requester needs to make a rather broad request, yet recognize that this broad request might be more than the access control rules will allow. Which means you might get rejected. The rejection code should indicate that the rejection in this case is due to too broad a request. You could then try again with a more constrained SAML identity assertion.
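A minimal sketch of that automatic backoff, with an invented error code and invented scope shapes (no real XCA toolkit works exactly this way), might look like:

```python
# Hypothetical sketch of the "ask broad, narrow on rejection" backoff.
# query_with_backoff and the ERR_SCOPE_TOO_BROAD code are assumptions
# for illustration, not part of any real standard or toolkit.

BROAD = {"roles": ["doctor", "billing"], "purposes": ["TREATMENT", "PAYMENT"]}
NARROW = {"roles": ["doctor"], "purposes": ["TREATMENT"]}

def query_with_backoff(query, responder):
    """Try the broad assertion first; on a 'too broad' rejection,
    retry automatically with a narrower identity assertion."""
    result = responder(query, BROAD)
    if result == "ERR_SCOPE_TOO_BROAD":
        # Retry with narrower scope; no user involvement needed.
        result = responder(query, NARROW)
    return result

# A toy responder that only accepts treatment-only requests.
def strict_responder(query, assertion):
    if assertion["purposes"] != ["TREATMENT"]:
        return "ERR_SCOPE_TOO_BROAD"
    return ["document-1"]
```

The requesting software absorbs the retry entirely; the user never sees the first rejection.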

Conclusion

So, the currently used SAML identity assertion (e.g. XUA) should identify the widest scope, and resources should recognize this. This is reality, not some hack. This backoff mechanism is cumbersome, but if you tune your initial requests to 80% of the cases (normal medical records for treatment use only), then the result is mostly success. The other 20% (I am not claiming real values) are then recognized as needing to be handled differently (probably individually, and without automatic incorporation into the EHR). This can all be handled automatically by the requesting software. It does NOT need to involve the user.

Moving beyond this model is very hard. It requires a backbone of Access Control decision points that are 'all-knowing', and 'fully trusted'. I don't see this happening. I think Federation is more sustainable, and scales.

I have other articles on the Access Control topic:


Tuesday, March 4, 2014

Testing ATNA Secure Communications

I wrote a blog article that used the Apple "goto fail" problem as an opportunity to stress the need to test both 'happy path' as well as failure-modes. Grahame further wrote a blog article that enhanced this discussion with excellent observations and recommendations.

The happy-path is the successful completion of a feature, what you expect to happen, what should happen 99.999% of the time. It is the easy stuff to write tests for. The failure-modes are much harder, as this article will show for what is seen as a simple IHE Profile. Skip to the end to see the conclusion, or read through the derivation...

Writing Tests for failure-modes

The failure-modes are the hard thing to write tests for, and are considered no-fun by many. The failure-mode tests are trying to prove that nothing wrong happens, which is testing a negative. It takes a specific type of person to think through all the various failure-modes. This is the kind of person you really want to make sure you get onto your project, as there are few of them and they are valuable over the long term. These are not specifically negative people; their goal is not always to break things, but they can put themselves into that 'not-happy place' and dream up devious tests. These are critical people to have for Quality. These are critical people to have for Safety, Security, and Privacy: all 'risk domains' that one can only mitigate, as one can never bring risk to zero.

IHE-ATNA Authenticate Node transaction

The secure communications transaction in IHE-ATNA is leveraged by almost every Profile in IHE. Many people think that this is only made up of Transport Level Security (TLS). This is central, but not the only form of secure communications. In fact the name of the transaction should be a hint -- Authenticate Node [ITI-19]. The prime purpose of the secure communications transaction is to authenticate the endpoints of the communications, which was the part that Apple 'goto fail' failed to confirm. In IHE this authentication is 'mutual' authentication as well.

In normal browser-based SSL one only gets to know who the server is; the server never gets to know who the client is. In the case of Internet HTTP browsers this is not too much of a problem, as the access is either to non-sensitive information or to information where the human identity is more important than the client machine. In the IHE use-cases the communications are system-to-system, so it is important to know what 'system' initiated the communications as well as what 'system' is the server. Thus in IHE use-cases mutual authentication of the endpoints of the communication is critical.

This is mostly done with TLS, forcing both client and server authentication. This is a part of the TLS protocol, but not one that is used often. But TLS is not the only solution in the ITI-19 transaction. The transaction also recognizes that for e-mail the endpoints can be mutually authenticated using S/MIME, just as was adopted by the USA "Direct Project". The ITI-19 transaction also recognizes that for web-services WS-Security can be used to do message-level end-to-end security. In both of these alternatives the certificate management is much more difficult; it isn't just a CA-trust problem, but also a certificate discovery problem.
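As an illustration of the concept (IHE does not mandate any particular toolkit, and the file paths below are placeholders), with Python's standard ssl module requiring the client to authenticate is a single policy setting on the server side:

```python
import ssl

# Sketch of configuring mutual (client + server) authentication with
# the Python ssl module. The certificate file paths are placeholders.

def make_server_context():
    # Server side: require the client to present a certificate too,
    # so both ends of the connection are authenticated.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED   # mutual authentication
    # ctx.load_cert_chain("server.pem", "server.key")
    # ctx.load_verify_locations("trusted_peers.pem")
    return ctx
```

The default for a TLS server context is to not ask the client for a certificate at all, which is exactly the browser-style behavior described above; the one-line change is what turns it into mutual authentication.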

Don't build it yourself - Trust your platform

For someone building any of these protocols, the tests are far more intricate, as one can imagine by looking at the Apple 'goto fail' code. I am not going to cover that level of testing, because I want everyone to utilize their platform for protocols like this. Yes, even in the face of the Apple 'goto fail' failure, I want you to trust your platform and not try to code these protocols yourself. I do want you to test that the platform is providing you what you need, and this is a proper subset of the total tests. The main reasons to utilize your platform are that protocols (especially security protocols) are hard to write, these protocols are used widely, and if a bug is found it will be resolved and transparently patched. Yes, there are many Apple devices that have not yet been fixed, but many have been fixed, and more every day. The platform is more likely to get it right, and more capable of fixing it too.

But Verify your platform

Testing that the communications succeed is first. One doesn't even need special tools for this. But one does need to make sure you have success with more than yourself. As with Connectathon, test with some 'gold standard', and test with 3 peers that have implementations of applications that are as different from yours as possible (testing between three peers that all use the same open-source toolkit is not helpful). ATNA is tested before and during connectathon. This is the happy path.

So, how might we break secure communications? 

I would not put much effort into trying to crack the cryptography. If you have successfully communicated with three peers and a 'gold standard', then one must 'trust' the algorithm writers and all their cryptanalysis. This is a level of smarts that is in rarefied air. Yes, there are suspicions of these people and their procedures. I find it really hard to believe these stories; it is far easier to break the endpoints, or the people at the endpoints, than to engineer an unnoticed bug into a cryptographic algorithm. Even the Apple 'goto fail' is far more likely to be an accident than intentional.

One must make sure you are testing the cryptographic algorithms that you are using. The simple 'happy path' will test the default algorithms, but you do need to force all algorithms that you are accepting. You do know that by default there are multiple algorithms? IHE defines that 'at least' the RSA_WITH_AES_128_CBC_SHA ciphersuite is available; IHE does not say that nothing else can be used. This is especially true for TLS, as TLS has a real-time negotiation that is intended to pick the 'best' of the available ciphersuites, which might not be this one. So you need to test ALL combinations. How is this done?

Using a monitoring tool that tells you what ciphersuite was just successfully tested (e.g. wireshark), go and remove that ciphersuite from the list of acceptable ciphersuites. As you remove them, you know which ones have been successfully tested. When the system no longer works, you know that the rest have not been tested. The rest might be perfectly good algorithms, but you don't know. You might see a ciphersuite in the list that you think you should keep; if so, then you need to figure out a way to test it. Note that at IHE Connectathon they will only test RSA_WITH_AES_128_CBC_SHA, so they can stop there, but you likely need to go further. Note that as you remove ciphersuites your system will be forced to choose worse and worse algorithms. At some point you should decide that these worse algorithms, although tested, are not worthy of keeping.
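As an illustration (not a recommendation of any particular cipher string), Python's ssl module can show you exactly which ciphersuites a context will offer, and lets you restrict it to the ones you have actually tested:

```python
import ssl

# Sketch: enumerate the ciphersuites a TLS context will offer, then
# restrict it to a tested set. "AES128-SHA" is the OpenSSL name for
# RSA_WITH_AES_128_CBC_SHA; the ECDHE pattern is included because some
# builds disable the plain-RSA suite. Note that TLS 1.3 suites are
# controlled separately and will still appear in the list.

def configured_suites(ctx):
    return [c["name"] for c in ctx.get_ciphers()]

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
default_suites = configured_suites(ctx)      # everything offered by default

ctx.set_ciphers("ECDHE+AES128:AES128-SHA")   # keep only what you have tested
tested_suites = configured_suites(ctx)
```

Comparing the two lists is also a cheap automated check that an update has not silently reset your configuration back to the defaults.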

Should you allow the non-tested ciphersuites? That is a good exercise for a risk-assessment: What is the risk of keeping them in (What could go wrong)? How likely is that to happen? If it did happen how bad would it be? Weigh this against the risk that this ciphersuite might be necessary in production at some point, how likely is that, and how bad would it be? Using a risk-assessment system one can determine if they should keep or un-configure these untested ciphersuites. Most likely it is not helpful to keep them.

Note that your final list of ciphersuites has been carefully selected, and you should re-test often. You should at least test often that your configuration is still set the way you want it to be set. It is not unlikely that an operating system patch might reset the ciphersuites back to default, and thus to untested ones.

With S/MIME and WS-Security end-to-end secure communications this is somewhat easier, as the set of ciphersuites is far more constrained by configuration. This is because one must be far better at picking the ONE ciphersuite that will work, as there is only one chance (not completely true, but close enough for this high-level testing vs protocol-level testing).

Certificate success

There are many 'happy-path' methods for certificate (authentication) validation to succeed. In IHE two major methods are identified in Volume 2a:3.19.6.1. The first is where you test that the certificate used has a proper chain to a certificate authority that you trust. The second is where you test that the certificate used is one that you directly trust. These two modes recognize the scale differences between a large-scale network and a small-scale network. Certificate validation often needs to scale from a very small network of a few interconnected systems up to a complex nationwide network. The larger the scale, the more likely one must out-source trust to a certificate 'authority'. At the smaller scale one can 'trust' an administrator to walk around to each system with the certificates of the others. IHE wants both modes to be possible; the one you actually choose is an operational choice, and it is possible that both modes are used. In the "Direct Project" they called these 'trust anchors': the explicitly trusted certs and/or the certificate authorities that you trust.
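Sketching this with Python's ssl module (an illustration only; the file names are hypothetical placeholders), the two modes differ only in what you load into the verify store:

```python
import ssl

# Sketch of the two trust modes: a CA you trust (chain validation)
# versus directly trusting specific peer certificates. With the Python
# ssl module both end up in the same verify store; for a self-signed
# peer certificate, the certificate itself is the trust anchor.

def make_context(trust_anchor_files):
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.verify_mode = ssl.CERT_REQUIRED
    for pem in trust_anchor_files:
        ctx.load_verify_locations(cafile=pem)
    return ctx

# Large-scale: out-source trust to a certificate authority.
# ca_ctx = make_context(["hie_root_ca.pem"])

# Small-scale: an administrator distributes each peer's certificate.
# direct_ctx = make_context(["peer_a.pem", "peer_b.pem"])
```

Either way, the operational question is the same: who is allowed into that store, and how do you keep it current?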

Note that part of 'happy-path' is also testing the process of getting a certificate. This involves the 'happy-path' of creating a certificate request, communicating that to the certificate authority, and communicating the signed certificate back. Or the self-signing ceremony if that is being used.

Failure-modes

So far I have not covered any failure-modes. I have also not fully tested all the happy-paths; I have only identified a reasonable set of happy-path tests. Yes, this is difficult. But the re-testing is easy to automate.

This is just a small but reasonable set of certificate failure-modes.
  • Certificates that fail the verification of the signature across the certificate (your test bench uses a corrupted certificate)
  • Certificates that don't match the private key (This one is hard to do with off-the-shelf test-tools)
  • Certificates that are expired (Just keep your test bench in the past, where it thinks things are fine)
  • Certificates that are not directly trusted and not issued by a CA you trust
  • Certificates that are revoked -- provided you have certificate revocation protocols in place.
  • Certificates that are self-signed, when that is unacceptable
  • Communications that are not secured properly when mutual authentication is necessary 
Note that these 'negative' tests do have the 'prove a negative' problem. Thus the best you can do is refine these high-level 'negative' tests into an explicit list of representative negative tests. For example: using exactly one expired certificate doesn't test all possible expired certificates, just a representative one; but that one is likely enough.
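A toy example of turning two of those bullets into concrete, automatable negative tests (the Certificate record and validate() here are stand-ins for illustration, not a real TLS stack):

```python
import datetime

# A minimal sketch of representative negative tests: an expired or
# revoked certificate must be rejected. The Certificate record and
# validate() function are toy stand-ins, not a real certificate parser.

class Certificate:
    def __init__(self, not_before, not_after, revoked=False):
        self.not_before = not_before
        self.not_after = not_after
        self.revoked = revoked

def validate(cert, now):
    if cert.revoked:
        return "rejected: revoked"
    if not (cert.not_before <= now <= cert.not_after):
        return "rejected: expired"
    return "accepted"

# One representative certificate per failure-mode, plus a good one.
now = datetime.datetime(2014, 3, 4)
expired = Certificate(datetime.datetime(2012, 1, 1), datetime.datetime(2013, 1, 1))
revoked = Certificate(datetime.datetime(2013, 1, 1), datetime.datetime(2015, 1, 1), revoked=True)
good = Certificate(datetime.datetime(2013, 1, 1), datetime.datetime(2015, 1, 1))
```

Exactly as noted above, one expired certificate does not prove all expired certificates are rejected; it is a representative test, which is usually enough.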


Often I find people want to make sure the certificate contains the 'subject' that is the 'system' it is claiming to be; for example, that an S/MIME certificate contains the email address of the endpoint, or for TLS that the certificate includes the hostname of the system. These are NOT useful. They will only cause your system to be fragile to legitimate changes. The proof that the system has the private key, and that the certificate is not revoked or expired, is all that is necessary. This is especially true of controlled environments; it is legitimately less true in the great uncontrolled environment of browsers utilizing the World-Wide-Web.

Changes over time

You will notice that certificates expire; this is normal too. You need to have "happy-path" tests for those times when a certificate naturally expires. Do you have mechanisms in place to notice when a certificate is about to expire? You should know how long it takes to get a new certificate issued and distributed (manually, if using direct certificate validation). Well in advance of the expiration, one needs to get a new certificate issued. You already know that you are testing certificate expiration, so this is a happy-path test that you can replace certificates.

Does your system behave properly during the certificate request phase? During re-issuing of a certificate? Does it handle getting a re-issued certificate (same private/public keys)? Does it handle getting a new certificate (new private/public keys)?

Robustness everywhere

Both happy-path and failure-modes must respond in a robust way, and likely with a touch of audit logging. Most failures need to be recorded in the audit log, but repeated failures should not result in repeated egregious audit log entries. The reason is that repeated failure attempts are likely an attack, likely one intended to cause a denial-of-service. Adding extra overhead to record redundant audit log entries just adds to the 'success' of the attack, meaning you would be adding to the denial-of-service, not protecting from it.
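One simple way to sketch that suppression (the threshold, the key, and the message wording are all invented for illustration):

```python
# Sketch of suppressing repeated audit entries for the same failure, so
# an attacker cannot use the audit log itself as a denial-of-service
# lever. The repeat_limit and message shapes are illustrative only.

class AuditLog:
    def __init__(self, repeat_limit=3):
        self.entries = []
        self.counts = {}
        self.repeat_limit = repeat_limit

    def record_failure(self, peer, reason):
        key = (peer, reason)
        n = self.counts.get(key, 0) + 1
        self.counts[key] = n
        if n <= self.repeat_limit:
            self.entries.append(f"auth failure from {peer}: {reason}")
        elif n == self.repeat_limit + 1:
            # One summary line, then stay quiet for this peer/reason.
            self.entries.append(
                f"repeated auth failures from {peer}: suppressing further entries")
```

The counts are still kept, so the facts survive for later analysis; only the redundant log writes are avoided.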

This is an example of robustness, as are the protections on any of the failure-modes. But generally robustness tests are beyond known-misuse. Robustness is the principle of being strong against the attacks you didn't think of.

Conclusion:


Happy-path -- clearly dependent on what "ALL" means to you; the more constrained, the more 'reasonable' continuous testing will be
  • Test that certificate issuing works
  • Test ALL of your acceptable ciphersuites with a gold-standard and 3 peers
  • Test ALL of your acceptable peer certificates

Failure modes -- Listed here as 'negative' tests, which one must refine in your environment.
  • Test that your configuration disallows non-approved ciphersuites
  • Test that your configuration disallows non-approved certificates
  • Test that your configuration disallows corrupted certificates or corrupt CAs
  • Test that your configuration disallows expired certificates
  • Test that your configuration disallows revoked certificates
  • Test that your configuration disallows connections from a non-authenticated peer
  • All negative tests must be handled efficiently (not susceptible to denial-of-service-attack) and record sufficient facts in audit log for analysis.
Updated March 5, 2014 - Added check for corrupted certificates that fail the signature check

 Secure Communications

Monday, March 3, 2014

HIT Standards - privacy and security workgroup -- NSTIC testimony

Please plan to virtually attend the HIT Standards privacy and security workgroup committee meeting next week: Wednesday, March 12, 2014, 10:00 am to 2:45 pm Eastern Time.


There will be testimony from NSTIC pilot projects. I have written about NSTIC before (see articles below), it is really very useful and progressive work. The hearing will be used to determine how viable the technology is, how soon Healthcare can utilize it, and what it might be best used for. I hope also that standards gaps are made clear, and that policy and operational issues are brought to light. The core 'interoperability' standards are simply the stuff IHE has been profiling for years: ATNA, XUA, and IUA. The gaps are in the maturity of the glue standards, and the operational issues. Trust is very much a policy problem first, that is enabled with a little bit of technology.

Some articles I have written on the topic:


Sunday, March 2, 2014

Testing - governance

Testing is critical, both early and often. Please learn from others' failures. The Apple “goto fail” provides a chance to learn a testing-governance lesson. It is just one in a long line of failures that one can learn the following governance lesson from. Learning these lessons is more than just knowing the right thing to do; it is also putting it into practice.

The governance lesson I am speaking of is: testing, re-testing, and testing again and again and again. In software one needs to create complete tests and continuously run them. These tests are not just for software development, but also for operational deployment. Yes, you, running an HIE or a Hospital or a Clinic: what are the essential functions you expect? More important, what are the anticipated failures you don't want?

Lesson – Test failure modes

As one develops any product one will test failure-modes. That is, making sure that the system fails properly, safely. This is the classic ‘crash test’ in automobiles, or the UL test for kitchen appliances. These examples are external tests, but I assure you that the manufacturer has tested many times against these same kinds of failures before they are ever made into products.

In physical products these tests are re-executed on random samples from the manufacturing line. In software the manufacturing line is ‘perfect’, so there is little reason to randomly sample copies of software (actually, software distribution does have continuous tests built into the installers and such). Software just needs to apply this principle in a different way, one that recognizes that failures happen when 'change' happens.

As one develops any product one creates tests to make sure that the product is doing what they want it to do, and not what they don’t want it to do. This is true of software as well as other things. In software there is even the concept that one builds the tests first: Test Driven Development. 

It is common for tests to be created for the success case, the ‘happy path’. These are the easy tests. Indeed most of the Meaningful Use tests are ‘happy path’ tests. Can your software create a CDA with ‘this’ content? Can your software display a well-formed CDA document with ‘this’ content? Can you send a Direct message to this recipient? These tests will notice when something is not working successfully; they are not trying to force errors. Back to the “Crash Test”, where they deliberately drive a car into a wall: one must come up with failure tests. The failure tests are not as pervasive, yet there should be many, many more of these kinds of tests. These negative tests are hard to create, but critical overall.

For every test where you expect your ‘feature’ to shine, you likely need 10 or 100 more that push the limits. In robustness testing, especially of web-services interfaces, there are test tools that generate random input – “fuzzing”. These are useful, but they don’t take the place of deliberately created ‘negative’ tests.
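The idea behind fuzzing can be sketched with the Python standard library alone (the parser under test here is `xml.etree`; a real fuzzing tool would be far more thorough and far smarter about generating inputs):

```python
import random
import xml.etree.ElementTree as ET

def fuzz_parser(iterations=200, seed=42):
    """Feed random byte soup to the XML parser. The only acceptable
    outcomes are a clean parse or a clean ParseError; any other
    exception escapes this function and fails the fuzz run."""
    rng = random.Random(seed)  # fixed seed so any failure is reproducible
    rejected = 0
    for _ in range(iterations):
        blob = bytes(rng.randrange(256) for _ in range(rng.randrange(1, 64)))
        try:
            ET.fromstring(blob)
        except ET.ParseError:
            rejected += 1
    return rejected
```

The fixed seed matters: when a fuzz run does surface a crash, you need to be able to reproduce it exactly.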

Lesson – Continuous testing.

Testing your product once is not sufficient. Testing your subroutine once is not sufficient. Once you create a test, get it into an automated test tool, meaning that every time you build, the test is run to make sure that what you just built is still what you intended. It does not matter why you just changed the source code; run all of the tests you have. You might have changed only a color or a font, but test everything. This is especially true when you check in your code. Automated build systems are common as projects get large; they need to be mandatory and universal.

If this had been done in the Apple "goto fail" case, then it is likely that someone would have created a negative test for the checks that the extra "goto fail" was skipping. That test would have started failing, someone would have noticed, and the change would not have been allowed to continue.
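The point can be made with a toy Python analogue of the verify chain (the real bug was in Apple's C code; this sketch and its names are mine): the negative test below is exactly the kind that a duplicated, unconditional "goto fail" would have broken.

```python
def verify_handshake(hash_ok, signature_ok):
    """Toy analogue of an SSL verify chain: every check must pass.
    The duplicated "goto fail" effectively skipped the signature
    check, so verification succeeded regardless of its result."""
    if not hash_ok:
        return False
    if not signature_ok:  # the check the stray "goto fail" bypassed
        return False
    return True

# The negative test that would have caught it: a bad signature must
# fail verification even when every earlier check passes.
def test_bad_signature_rejected():
    assert verify_handshake(hash_ok=True, signature_ok=False) is False
```

A happy-path test (everything valid, verification succeeds) would have kept passing throughout; only the negative test notices.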

Governance failure modes

In the case of the Apple "goto fail": I can imagine that there were indeed positive and negative tests, and that they were part of continuous testing. Those tests might well have been behaving properly. I can imagine that the extra "goto fail" was actually added while testing some new feature. That new feature might have been a redundant check, so the first check needed to be circumvented in order to exercise the redundant one. A mistake not to remove the extra "goto fail", followed by a later removal of the redundant code. I have no idea if this is what happened, but I can imagine how multiple accidents can combine to cause something like this. There are theories that this Apple SSL failure was deliberate; I, and others, don't think so.

This is why one needs to “Trust but Verify”. That is: create positive tests, create even more negative tests, and execute these tests constantly. Note that being open source is no protection; this code was openly published. Note that being pervasively used is no protection; this code was very widely deployed by Apple. Note that this was a failure of a failure mode, so the happy path working properly was not enough.

Operational governance

This is not just a software-development problem; this governance needs to be applied at many levels. Everyone who deploys software should have their own sets of tests, both positive and negative. Yes, these tests will not be as invasive, but they should hit upon the critical things.

Software updates happen, most likely to fix a discovered bug (proof that bugs happen). When these updates are applied to your operational environment they change the system slightly. These slight changes should only be positive, but unintended consequences can happen. Thus, if you have positive and negative tests in your deployment environment, you will know that the system is working as expected from both the positive and the negative perspective.

Hospitals that rely on good security need to make sure that the security they have is indeed good and stays good. This means that when they say TLS will be used with mutual authentication, they regularly test that requirement. Testing the happy path is not hard; it is tested by normal business. Testing failure modes is much more important, and requires some creativity. Have tests of expired certificates, revoked certificates, and valid certificates that are not authorized for that purpose. Have tests for the failure modes of the failure modes: what happens when the certificate revocation check can't be done due to a network failure? Testing the failures here also tests the happy path of your audit-log system and your business-continuity system. Having these tests in your operational deployment environment would have shown you that Apple was not 'trustable'.
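One of these failure-mode checks can be sketched with Python's standard `ssl` module (the certificate dicts below are fabricated examples in the shape returned by `ssl.SSLSocket.getpeercert()`; a real deployment test would also cover revocation, authorization, and the network-failure cases above):

```python
import ssl
import time

def certificate_expired(cert, now=None):
    """Failure-mode check: is this certificate past its notAfter date?
    `cert` is the dict shape returned by ssl.SSLSocket.getpeercert()."""
    now = time.time() if now is None else now
    not_after = ssl.cert_time_to_seconds(cert["notAfter"])
    return now > not_after

# Fabricated examples: an expired certificate must be flagged,
# a still-valid one must not.
expired_cert = {"notAfter": "Jun  1 12:00:00 2014 GMT"}
current_cert = {"notAfter": "Jan  1 00:00:00 2199 GMT"}
```

The point is that a check like this runs on a schedule in the operational environment, not once at go-live.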

Trust but Verify.