Over the weekend I enabled DNSSEC on this site and blogged about it [1]. Today certbot tried to renew the Let's Encrypt certificate on the Origin Server via DNS-01. Its setup uses CNAMEs to allow internal devices to get public certs, as described in an earlier blog post [2].
The certificate renewal failed. Considering it has worked since I created the site in 2021 and I’ve just enabled DNSSEC, could this have been an unintended side effect?
As an ex-4th line support engineer at a global telco, the first thing I do is not jump to conclusions. The change is noteworthy, but let's take it step by step and analyse what is actually happening.
First, we can see from the syslog that certbot is clearly failing to renew the certificate, but the log does not give much more information than that:
Let's run certbot renew manually and see what it reports:
This is interesting – we can see that one domain doesn’t see any TXT record but the other does.
On the DNS server we can also see that the certbot hook script is correctly updating the zone, and elsewhere that the update is being propagated correctly. Also of note: the second certbot domain is only seeing the first TXT update. We can also query the main domain to verify that the link still works.
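For anyone wanting to repeat these checks by hand, here is a rough sketch of the kind of queries involved. It assumes dig is installed. The function names and the example hostnames in the comments are made up for illustration and are not taken from my setup. The only fixed part is the _acme-challenge prefix that DNS-01 uses.

```shell
#!/bin/sh
# Illustrative helpers only; names and hostnames here are hypothetical.

# challenge_name DOMAIN: print the DNS-01 record name for DOMAIN.
challenge_name() {
    printf '_acme-challenge.%s\n' "$1"
}

# show_challenge DOMAIN [SERVER]: print the CNAME and TXT data for the
# challenge name. SERVER is optional, e.g. an authoritative nameserver.
show_challenge() {
    name=$(challenge_name "$1")
    echo "CNAME for $name:"
    dig +short CNAME "$name" ${2:+"@$2"}
    echo "TXT for $name:"
    dig +short TXT "$name" ${2:+"@$2"}
}

# Example (not run here):
#   show_challenge www.example.com ns1.example.net
```

Asking the authoritative server directly, rather than a caching resolver, avoids being misled by a cached answer.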
So rather than DNSSEC being the cause, it appears that the bot on the Let's Encrypt side isn’t waiting for DNS propagation (or retrying on failure) and instead expects us to wait before proceeding with the next stage of the ACME challenge.
We can also verify in detail that the challenges correlate with what we are adding to DNS by running certbot in challenge debug mode, with the verbose option to see the contents of the requests.
certbot --debug-challenges -v renew
We can test our theory with a crude and simple modification to the hook script – we sleep for 5 seconds after the update.
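As a rough illustration of that change, here is a minimal sketch of a manual auth hook with the pause added. It is not my actual hook script: publish_txt is a hypothetical placeholder for whatever really updates the zone, and PROPAGATION_DELAY is a name invented for this example. CERTBOT_DOMAIN and CERTBOT_VALIDATION are the environment variables certbot passes to manual auth hooks.

```shell
#!/bin/sh
# Sketch of a certbot manual auth hook with a crude propagation delay.

# Hypothetical placeholder: replace with the real zone update
# (nsupdate, a provider API call, ...).
publish_txt() {
    echo "would publish TXT _acme-challenge.$1 = $2"
}

# Seconds to pause after the update; 5 is the value tried in this post.
PROPAGATION_DELAY="${PROPAGATION_DELAY:-5}"

auth_hook() {
    publish_txt "$CERTBOT_DOMAIN" "$CERTBOT_VALIDATION" || return 1
    # Crude fix: give the record time to propagate before certbot
    # tells Let's Encrypt to validate.
    sleep "$PROPAGATION_DELAY"
}

# Only act when certbot has supplied a domain.
if [ -n "${CERTBOT_DOMAIN:-}" ]; then
    auth_hook
fi
```

A fixed sleep is only a guess at how long propagation takes, which is why it is good enough to test the theory but not much more.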
Re-running the renew command now succeeds.
Now we just need to update our hook script to validate that the DNS update has propagated before returning control to certbot.
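One possible shape for that check is sketched below, assuming dig is available. The function names, retry count and interval are illustrative choices rather than anything certbot provides. The loop polls for the expected TXT value and gives up after a fixed number of attempts.

```shell
#!/bin/sh
# Sketch: wait until a TXT value is visible in DNS.

# lookup_txt NAME [SERVER]: print the TXT values currently served for NAME.
# SERVER is optional, e.g. an authoritative nameserver.
lookup_txt() {
    if [ -n "${2:-}" ]; then
        dig +short TXT "$1" "@$2"
    else
        dig +short TXT "$1"
    fi
}

# wait_for_txt NAME VALUE [TRIES] [INTERVAL] [SERVER]
# Returns 0 once VALUE appears among the TXT records, 1 if it never does.
# Note: POSIX sh has no local variables, so these names are global.
wait_for_txt() {
    name=$1
    value=$2
    tries=${3:-12}
    interval=${4:-5}
    server=${5:-}
    attempt=1
    while [ "$attempt" -le "$tries" ]; do
        # dig prints TXT data wrapped in double quotes; match the whole line.
        if lookup_txt "$name" "$server" | grep -qxF "\"$value\""; then
            return 0
        fi
        # Pause between attempts, but not after the last one.
        if [ "$attempt" -lt "$tries" ]; then
            sleep "$interval"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

The hook could call something like wait_for_txt "_acme-challenge.$CERTBOT_DOMAIN" "$CERTBOT_VALIDATION" after publishing the record. Pointing the lookup at the authoritative servers is probably the safer choice, since a caching resolver may hold on to an earlier negative answer.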
By systematically analysing the problem we quickly deduced that enabling DNSSEC was a red herring and that it was a simple DNS propagation timing issue.
References
[1] https://simulatedattack.com/2024/02/dnssec-all-the-things-an-easy-and-free-way/
[2] https://simulatedattack.com/2022/01/lets-encrypt-inside/