When Entrust initially posted the incident report for “Entrust: EV TLS Certificate cPSuri missing,” my immediate reaction was “welp, another silly mistake. I’m sure they’ll handle this properly.” As I interacted more and more with Entrust’s incidents, though, I started to see a troubling pattern emerge. To confirm my gut feeling, I compiled a list of historical incidents to see whether the troubling behavior is in fact a long-running pattern or not.
What I found was so, so much worse than I thought. I was planning for this to be a single post. However, the behavior I’ve seen in these incidents made me realize that a single thread would take hours to read, so I have to split this up to keep it readable. In this series of posts I will discuss every incident Entrust has had, track their claims, and check whether they have been truthful.
Spoiler Alert: I do not think Entrust should remain a CA. Every day the root programs delay removing Entrust from their root stores, they erode the public’s trust in the root programs’ ability to regulate the WebPKI. I don’t expect everyone to agree with my viewpoint here, but I hope that even those who disagree can find this series of posts helpful.
In this analysis, I will end each incident with a subjective score (SIRQ - Subjective Incident Response Quality), between -1 and 5:
-1 means the response can’t even be called incident response.
0 means there was an incident response, but it contained effectively no useful information.
5 means a high-quality incident response.
Please note that this score is completely subjective.
Entrust: Non-BR-Compliant Certificate Issuance
2017-08-16
Note, this is the first Entrust incident on Bugzilla.
Entrust did not self report this incident. In this incident, Entrust was found to be issuing certificates with a hyphen (‘-’) in the OU field.
Entrust responds to this issue with:
“Confirmed, although as indicated below it’s not clear that including a hyphen in the OU field at the customer’s request (as opposed to at the decision of the CA) is a violation of the BRs.”
First thing to keep track of: “Did Entrust question the requirements when faced with an incident?” In this case, yes.
Second thing to keep track of: “Did Entrust find an excuse to not revoke certificates when they needed to revoke?” Yes:
“Since the inclusion of this value (a hyphen) is not viewed as a security risk the unexpired certificates will not be revoked.”
Entrust claims:
“No indication was found in the vetting file as to why the hyphen was allowed by the verification people, but our best guess is that they were approved due to an interpretation of the BRs concerning allowable values for the OU field …”
So they don’t even know why they’ve misissued?
Then, Jonathan Rudenberg finds that there are more misissued certificates, this time because they contain IP addresses as a `dNSName`. What does Entrust do in response?
In response to the CA/Browser Forum discussion of this issue in 2016, Entrust stopped issuing these certificates in August 2016. Because no security issues have been found from these certificates, we decided it was not necessary to revoke the certificates previously issued but rather allow them to expire.
Weirdly, no one asks Entrust to file an incident for this as well. Either way, Entrust says that this won’t be a problem because they’ve stopped issuing these certificates.
Beyond these problems, Entrust has difficulty providing an incident response with accurate and precise timelines. So, something new to keep track of: “Does Entrust provide a complete incident response without being prompted by someone else?” It is not the job of the community to keep reminding a CA to do proper incident response, so this is worth tracking.
As a final update, Entrust says they’ve done all the remediation items.
SIRQ: 1
Entrust: Non-BR-Compliant OCSP Responder
2018-01-08
Entrust once again has not self-reported this incident.
Entrust takes 4 days to acknowledge this incident.
In this incident, Entrust’s OCSP system is responding “good” for invalid serial numbers. This effectively means that, for this period, Entrust’s OCSP system was useless, since the information from it could not be trusted.
Entrust once again starts partially arguing about the requirements, and highlights their misunderstanding of the requirements:
Although Mozilla policy 1.1(2) does state that CAs which are not technically constrained or are used for SSL or S/MIME are in scope, we did not consider that Mozilla policy 2.3 imposed the BRs on S/MIME CAs. The reason is 2.3 appears to be applied to “CA operations relating to issuance of certificates capable of being used for SSL-enabled servers.”
Even if you weren’t sure about these rules, in what system is having an OCSP responder answer “good” for an invalid serial number a proper design?
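To make the failure concrete: a responder that answers “good” for serials it never issued is trivially detectable from the outside. Here is a minimal Go sketch of such a probe, assuming a placeholder issuer file and responder URL (`issuer.der` and `ocsp.example.com` are stand-ins, not Entrust’s actual endpoints). A correctly configured responder should answer “unknown” (or reject the request) for a serial it has no record of:

```go
package main

import (
	"bytes"
	"crypto"
	"crypto/sha1"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/asn1"
	"fmt"
	"io"
	"log"
	"math/big"
	"net/http"
	"os"

	"golang.org/x/crypto/ocsp"
)

func main() {
	// Placeholder: DER-encoded certificate of the issuing CA.
	issuerDER, err := os.ReadFile("issuer.der")
	if err != nil {
		log.Fatal(err)
	}
	issuer, err := x509.ParseCertificate(issuerDER)
	if err != nil {
		log.Fatal(err)
	}

	// Hash the issuer's name and public key, as RFC 6960 requires
	// for identifying the certificate in an OCSP request.
	var spki struct {
		Algorithm pkix.AlgorithmIdentifier
		PublicKey asn1.BitString
	}
	if _, err := asn1.Unmarshal(issuer.RawSubjectPublicKeyInfo, &spki); err != nil {
		log.Fatal(err)
	}
	nameHash := sha1.Sum(issuer.RawSubject)
	keyHash := sha1.Sum(spki.PublicKey.RightAlign())

	// Ask about a serial number this CA almost certainly never issued.
	req := ocsp.Request{
		HashAlgorithm:  crypto.SHA1,
		IssuerNameHash: nameHash[:],
		IssuerKeyHash:  keyHash[:],
		SerialNumber:   new(big.Int).SetBytes([]byte("bogus-serial")),
	}
	der, err := req.Marshal()
	if err != nil {
		log.Fatal(err)
	}

	// Placeholder responder URL.
	resp, err := http.Post("http://ocsp.example.com", "application/ocsp-request", bytes.NewReader(der))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}

	// Passing the issuer also lets ParseResponse verify the
	// responder's signature.
	parsed, err := ocsp.ParseResponse(body, issuer)
	if err != nil {
		log.Fatal(err)
	}
	if parsed.Status == ocsp.Good {
		fmt.Println("BUG: responder says 'good' for a serial it never issued")
	}
}
```

A probe like this takes minutes to write, which is part of why a “good for everything” responder is such an unforgivable design.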
It takes two weeks for Entrust to fix this broken OCSP system.
In this incident, no one asked Entrust to fix their incident response. So at least that goes well. Generally, this incident response was “alright”, but it did raise some questions about how Entrust handles their infrastructure internally.
SIRQ: 1
Entrust: IP Address in dNSName form
2018-03-26
Remember the first incident discussed here, where Jonathan Rudenberg found certificates being issued with IP addresses in `dNSName`?
Entrust’s response to that was that “Entrust stopped issuing these certificates in August 2016.”
Clearly, that was not true, since it is now 2018-03-26 and Entrust has a similar incident again. It should also be noted that Entrust received a notice about this 4 days before they filed the preliminary incident report.
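For context on how mechanical this defect is: a `dNSName` SAN entry must contain a DNS name, and RFC 5280 provides a separate `iPAddress` SAN type for IPs, so any `dNSName` value that parses as an IP address is misissued. A minimal Go check of the kind any linting pass would include (reading a hypothetical `cert.der` file) might look like this:

```go
package main

import (
	"crypto/x509"
	"fmt"
	"log"
	"net"
	"os"
)

// ipsInDNSNames returns every dNSName SAN entry that is actually an
// IP address. Per RFC 5280, IPs belong in the iPAddress SAN type.
func ipsInDNSNames(cert *x509.Certificate) []string {
	var bad []string
	for _, name := range cert.DNSNames {
		if net.ParseIP(name) != nil {
			bad = append(bad, name)
		}
	}
	return bad
}

func main() {
	// Placeholder path to a DER-encoded certificate to check.
	der, err := os.ReadFile("cert.der")
	if err != nil {
		log.Fatal(err)
	}
	cert, err := x509.ParseCertificate(der)
	if err != nil {
		log.Fatal(err)
	}
	for _, name := range ipsInDNSNames(cert) {
		fmt.Printf("misissued: IP address %q in a dNSName SAN\n", name)
	}
}
```

A check like this is a dozen lines; the defect is not subtle.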
During their multi-day investigation, they did not stop issuance at any point. Their incident response was also very weak, until Ryan Sleevi asked for more information.
In this incident, Entrust is asked if they’re planning to implement pre-issuance linting. Entrust agrees to implement pre-issuance linting “in the first release of 2019.” That’s at least ~7 months after this incident.
On 2019-01-04, Ryan Sleevi asks if Entrust has implemented the pre-issuance linting yet. On 2019-01-14, 10 days after the first question, he asks Entrust for an update again. On the same day, Entrust says that:
We currently have post-issuance linting in place to cover all public and private trust certificate types including SSL/TLS, code signing, S/MIME, and document signing. We are progressing on pre-issuance linting, but have not yet deployed.
Ryan once again asks for a concrete timeframe. Entrust’s response is, once again, a combative posture, months after this incident and years after the same mistake first surfaced. It takes multiple back-and-forths on this issue until Entrust says that the first release of 2019 will be in April of 2019. Further down, this date gets changed to the 1st of May, and then to the 14th of May.
Even by the end of this incident, it is unclear what actual linting has been added to Entrust’s pre-issuance process. It took Entrust over a year to add a basic control to their operations.
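For reference, here is roughly what a minimal pre-issuance linting gate looks like with zlint, the open-source linter most CAs eventually adopted. To be clear, this is my sketch, not Entrust’s implementation; real deployments typically lint a tbsCertificate signed with a throwaway key before the production key ever signs anything, and `candidate.der` below is a placeholder:

```go
package main

import (
	"fmt"
	"log"
	"os"

	"github.com/zmap/zcrypto/x509"
	"github.com/zmap/zlint/v3"
	"github.com/zmap/zlint/v3/lint"
)

// lintBeforeIssuance runs the zlint suite over a candidate
// certificate and refuses issuance on any error- or fatal-level
// result.
func lintBeforeIssuance(candidateDER []byte) error {
	cert, err := x509.ParseCertificate(candidateDER)
	if err != nil {
		return fmt.Errorf("unparseable candidate certificate: %w", err)
	}
	results := zlint.LintCertificate(cert)
	if !results.ErrorsPresent && !results.FatalsPresent {
		return nil // candidate is clean; safe to sign for real
	}
	for name, r := range results.Results {
		if r.Status == lint.Error || r.Status == lint.Fatal {
			fmt.Printf("lint %s failed: %v\n", name, r.Status)
		}
	}
	return fmt.Errorf("pre-issuance lint failed, refusing to issue")
}

func main() {
	// Placeholder path to the to-be-issued certificate, DER-encoded.
	der, err := os.ReadFile("candidate.der")
	if err != nil {
		log.Fatal(err)
	}
	if err := lintBeforeIssuance(der); err != nil {
		log.Fatal(err)
	}
	fmt.Println("lint passed")
}
```

The whole gate is one function call on the candidate bytes; “over a year” to wire in something of this shape is hard to justify.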
One point that Wayne Thayer brought up in this process is a clearly defined threat of consequences against Entrust, and one that I believe should be realized.
I once again want to emphasize that Entrust made this same misissuance, putting IPs in `dNSName` entries, in 2016 and then once again in 2018.
SIRQ: -1
Entrust: Certificate issued with '-' in ST field
2018-12-04
This is extremely similar to the “Entrust: Non-BR-Compliant Certificate Issuance” incident from over a year earlier! In this case, Entrust has placed the hyphen character in the ST field.
Somehow, after all this time, Entrust once again forgets that there is an incident response template until Ryan Sleevi points it out to them. Even after that, Ryan has to point out that the incident response is still incomplete and faulty.
In that same comment, Ryan also has to push Entrust to file an incident for the delayed revocation, once again showing that Entrust either does not care or cannot seem to remember what the rules require of them.
SIRQ: 0
Entrust: Late mis-issue certificate revocation
2019-01-17
As a result of the previous incident, Entrust files this one. This time they do follow the incident response template; however, the incident gets filed over a month after the previous one.
Entrust says these are the steps the CA is taking to resolve the situation:
At the time that a miss-issuance has been determined, a revocation deadline will be set. The deadline will be based on the time of notification and not the time the investigation is complete. A 24 hour alarm will be set in our Support system with a notice to a distribution list. Managers on the distribution list will ensure that the certificate gets revoked before the deadline.
Anyone familiar with the Baseline Requirements will immediately realize that this won’t be compliant with them, and Ryan Sleevi pointed this out. Entrust agreed and stated: “You are correct as BR 4.9.1.1 states "The CA obtains evidence that the Certificate was misused." So the 5 days may start as late as this time.”
I’m glad that they agreed. However, this isn’t a class for learning the requirements. These requirements are very clearly spelled out in the BRs, and the BRs aren’t that long a document to read and keep track of. Entrust took a month to file this incident, and this was the result of their incident response?
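To spell the distinction out: under BR 4.9.1.1 the revocation clock starts when the CA is notified of, or obtains evidence of, the problem, not when its investigation wraps up. A trivial sketch of the computation, simplified to the 24-hour versus 5-day split (real code would switch on the specific 4.9.1.1 reason):

```go
package main

import (
	"fmt"
	"time"
)

// revocationDeadline computes the BR 4.9.1.1 deadline. The clock is
// anchored to when the CA first obtains evidence of, or is notified
// about, the problem, not to when its investigation finishes.
// Reasons like key compromise get 24 hours; most misissuance
// reasons get 5 days.
func revocationDeadline(evidenceAt time.Time, within24h bool) time.Time {
	if within24h {
		return evidenceAt.Add(24 * time.Hour)
	}
	return evidenceAt.Add(5 * 24 * time.Hour)
}

func main() {
	// Hypothetical notification timestamp, for illustration only.
	notified := time.Date(2018, 12, 4, 9, 0, 0, 0, time.UTC)
	fmt.Println("revoke by:", revocationDeadline(notified, false))
}
```

Everything after picking the anchor timestamp is mechanical, which is why arguing about it a month into an incident is so telling.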
SIRQ: 1
Entrust: Late revocation of underscore certificate
2019-01-21
Entrust once again had to rely on an external party telling them they had messed up before they filed this incident, and once again did not provide any root cause analysis until prodded by Ryan Sleevi.
This was one of Entrust’s least problematic responses so far, even though it showed a clear gap in their operations.
SIRQ: 1
Entrust: IP in dnsName
2019-02-03
Once again, Entrust relies on an external party to tell them they have an issue. Jonathan Rudenberg is here to report that, once again, Entrust has messed up their certificate crafting where IPs are involved. What’s concerning here is Entrust’s response to Jonathan directly:
We've taken a look at the certificates you brought forward internally and have an answer for you. In August 2016, we modified our issuance policies to correct the dnsName discrepancy however a decision was made at that time to allow existing certificates to expire, rather than be revoked.
The reasoning behind this decision is that the dnsName field containing an IP address was required to ensure support from some browsers, and there was no direct security concern brought forward by this discrepancy.
Currently, that plan has not changed, and we will continue to allow these mentioned certificates to expire in line with our original decision.
So, they’ve misissued, and they say that is okay. The following is something we’ve seen from Entrust very recently too: “and we will continue to allow these mentioned certificates to expire in line with our original decision.” But let’s not get ahead of ourselves yet.
Ryan Sleevi also points out that Entrust has claimed different things before, and that there is a discrepancy here. Entrust’s answer once again leaves me with a “How on earth did you think this was ok?” moment.
Once Entrust provides the incident report, they once again blame the BRs. In that same report, Entrust sets a revocation deadline of 2019-02-22, despite this already being way beyond the limit, and way beyond the limit counted from when this problem was initially detected in #1448986 on 2018-03-26. They then fail to revoke even within their own self-set deadline, finally revoking everything on 2019-02-28.
It should be noted that these certificates belong to important government institutions. In this case, the Department of Homeland Security was using a bad certificate for years thanks to Entrust’s inability to do their job as a public CA.
SIRQ: 0
Entrust: Issued Certificates to incorrect Organization
2019-03-15
In this incident, there is a misrepresentation between Entrust’s timeline and their comment on why it took so long to file an incident report, a misrepresentation that Ryan was quick to ask about.
Entrust also keeps misunderstanding the BRs’ requirements, and Ryan has to effectively spoon-feed them the procedures needed to be compliant, once again displaying that Entrust does not have the expertise to effectively run a CA at the scale they do.
While Entrust took a day to fix this problem, they never stopped issuance for this organization. I’m not entirely sure if they ended up issuing anything while they were working on it, but it was clear that they never made the conscious choice to stop issuance.
SIRQ: 0
Entrust: AffirmTrust Issuing CA Impacted by EJBCA Serial Number Issue
2019-03-18
In this incident, Entrust makes it clear they don’t even have an inventory of the software they run for their CA.
One of the foundational requirements for a CA is to keep an inventory of the software and hardware run as part of managing the CA. The fact that they didn’t realize for weeks after this problem was disclosed that they had EJBCA in their stack once again goes to show that the operational knowledge is simply missing.
SIRQ: 1
Entrust: Outdated audit statement for intermediate cert
2019-05-07
Entrust allowed a subCA to operate without an up-to-date audit for four months. This entire incident is concerning, but what stood out to me is this back-and-forth between Ryan Sleevi and Entrust, showcasing that Entrust seemingly can’t even use a search engine:
Ryan asks:
Considering that multiple other CAs have had issues with providing timely audits for their sub-CAs, why did Entrust not have such procedures already in place, having learned from these incidents? It seems this list is still missing controls that other CAs have implemented for their subordinates, and it's not clear why that decision was made.
Could you share a bit more detail about the related Incident Reports from other CAs that Entrust has examined, and the motivations for omitting the controls other CAs have implemented?
Entrust responds (I’ve highlighted the interesting part in bold):
Entrust has been managing third party sub-CAs for about 15 years. We have worked through managing these sub-CAs as the requirements and the audit criteria has changed over the years. We did not consider reviewing what other CA's have done to address their incidents to provide timely audits. This sounds like a good idea. **I did a quick review, but my search did not provide me any benefit. Do you have any recommendations on which ones to review?**
Ryan responds:
I am deeply concerned that Entrust has not actively followed discussions of m.d.s.p., as required since April 2017. It seems Entrust has relied on its own experience, which is demonstrably deficient, rather than working to understand how the industry has changed. I am absolutely appalled at Entrust's inability to find issues like Bug 1566162 or Bug 1539296. These were trivial to find, as it simply required looking through https://wiki.mozilla.org/CA/Incident_Dashboard . Of course, if you also examined closed issues, you will find others.
What this also shows is that Entrust has no proper procedure for keeping track of other incidents. I know this problem hasn’t gotten better either: skipping ahead a tiny bit, Entrust only just (2024-02-09) found out about ocsp_watch, a tool which has been around for years at this point.
Entrust has demonstrated, in 2019 and again in 2024 (at least), that they do not effectively monitor Bugzilla for incidents and apply the lessons from them to their own CA. Beyond that, I’m very surprised that not a single engineer at Entrust knew about this tool. Once again, it makes me question their expertise.
SIRQ: -1
Entrust: Question marks in certificate O and L fields
2019-05-17
Once again, in this incident, Entrust has to be prodded to provide any useful information on what the incident actually was. One of my favorite lines in this incident is:
we were not transparent and did no file a miss-issue report ~ Entrust
I know I’m pulling this partially out of context, but honestly, yeah Entrust. That is the theme of this write up.
In this incident, Entrust once again demonstrates a lack of training and knowledge: they revoked these certificates but never filed an incident report for them until their auditors noticed the bug.
SIRQ: 1
Entrust: Certificate Issued with Incorrect Country Code
2019-06-14
In this incident, Entrust describes using a SQL query on a live database to modify information. Ryan Sleevi wonders how queries like this can be compliant with the auditing and data retention requirements of the BRs. Unfortunately, we never really got an answer. I wish auditors would focus more on “Can a random engineer change data without it being recorded somewhere?”
Either way, this incident was one of the first examples of Entrust responding somewhat properly to an incident.
SIRQ: 1
Entrust: Certificate issued with validity greater than 825-days
2019-06-24
In this incident, Entrust’s non-production software stack is somehow able to produce a production (publicly trusted) certificate. Reading through this incident response, I still do not know how this actually happened. Entrust claimed there was no pre-issuance linting because it was a “non production” service.
What I think they’re trying to say is: even though these certs were being issued by a real CA, the websites they were intended for are the test sites CAs put up for clients to test against; therefore, these were “non production” services. To make this very clear: if your system is issuing certs chaining to your publicly trusted roots, that is a production system.
Entrust once again has to be prodded to give more information, during which they find an extra two certs that were misissued. As Entrust began to address this problem, they misissued another three certificates. This is simply unacceptable; they’re treating their production CA as some sort of public testing tool. Wayne Thayer seems to agree:
Bruce: it's not acceptable to brush off these additional misissuances of as manual errors. When remediating a prior misissuance, I would expect the Entrust team to follow documented and peer-reviewed procedures and to double- and triple-check configurations before triggering production issuance. This sounds more like testing in production, with publicly-trusted certificates.
Because of this, Wayne creates the next incident (Entrust: SHA-1 Issuance and other misissuance while testing) for Entrust on 2019-07-19, and Entrust doesn’t respond to it for another 6 days.
SIRQ: -1
Entrust: SHA-1 Issuance and other misissuance while testing
2019-07-19
Entrust takes 6 days to acknowledge and respond to this incident, and seems to basically copy-paste their incident report from the previous incident, once again repeating this same line:
The issuing CA is not set up for production as there are no third party certificates currently being issued. There is no automated issuance nor is there any pre-issuance linting set for a non-production CA. The certificates to support the BR required test sites were issued manually.
To me, none of their answers were satisfactory here. Their solution was effectively to change from a user interface driving software to a user interface driving an API that drives some software. Either way, the real issue seems to be that Entrust, once again, does not understand what constitutes a production system.
Also, Entrust claims:
Entrust does monitor Mozilla discussions, which may include incident reporting. This monitoring may flag action to improve our deployment of certificate management systems. We will plan to increase our level of monitoring and action.
Except, as I mentioned above, they somehow missed the ocsp_watch tool that has existed for years. Someone should ask them to provide a history of all the incidents they’ve monitored and triaged, and what their conclusions were.
SIRQ: 1
This marks the end of part 1 of this saga. To summarize:
In the 14 incidents we’ve analyzed, Entrust created timely incident responses exactly once, with 1559376.
In the case of the IP in dNSName issue, Entrust went back and forth and back and forth, and took over a year to implement pre-issuance linting. That linting somehow didn’t make its way into the systems involved in the last two incidents we discussed, because those were “non production”.
Out of these 14 incidents, Entrust discovered the issue on their own in only four. Of the remaining ten, one was discovered through an audit and eight came from external reports about misissuance. For the last one (1559376), it was not clear how Entrust actually found out about the incident.
Out of these 14 incidents, Entrust argued about the requirements in some capacity in 8 of them.
Edit: A couple of folks on SomethingAwful have done an awesome job putting together a list of historical incidents Entrust has had. Find those here: