This is part two of the Certificate Transparency series.
Here I show how certificate transparency leaks valuable information about companies, because their IT or DevOps teams are either unfamiliar with it or oblivious to it.
Overview
As explained in part one Certificate Transparency 101, certificate transparency makes the details of all trusted SSL certificates public for anyone to see. While it’s useful for browsers to validate information about certificates issued by a CA, it is also useful for attackers to gain valuable information about companies’ infrastructure.
While I don’t want to publish proof, most of the cyber startups I’ve checked are leaking their entire infrastructure layout through certificate transparency.
Why subdomains are created
A company buys its domain, let’s say example.com. It serves the site over HTTPS, so it has to obtain a certificate. Before Let’s Encrypt, companies had to pay for a certificate that was valid for a year. It wasn’t very expensive, but it was enough to limit the number of certificates a company issued. As the company grows, it usually uses subdomains to simplify its internal infrastructure layout and to separate internal servers from the user interface, so we start seeing app.example.com, login.example.com and others.
Let’s Encrypt allows generating free certificates, as long as the requester can prove it controls the domain the certificate is for. To do this, when a certificate is requested for newcert.example.com, Let’s Encrypt presents a challenge for that specific domain, and the server must respond appropriately to prove control. For more details on challenge methods, see Let’s Encrypt challenge types.
As the company grows, its DevOps team will probably request more control over the subdomains it manages. If, for instance, a log storage service is developed in the company, the logs.example.com subdomain will be created. This is done as a convenience. In theory they could use some arbitrary name like wolf.example.com or simba.example.com, but they pick logs because it’s more readable and easier to remember.
After a while, the company will have subdomains such as the following:
- example.com - main website
- login.example.com - authentication methods, the auth servers are here
- app.example.com - the UI after login is here
- logs.example.com - the user uploads logs to this website
As the company grows further, it might have multiple UIs for various customers and might install a wildcard certificate, so that *.console.example.com serves both customerA.console.example.com and customerB.console.example.com. Let’s Encrypt issues wildcard certificates only through the DNS-01 challenge, which requires automating DNS record updates, and the alternative is buying one from a commercial CA. As both options cost more money or effort, the DevOps team will often stick with the default Let’s Encrypt flow and issue a specific certificate per customer.
So now the company has the following subdomains with certificates:
- example.com - main website
- login.example.com - authentication methods, the auth servers are here
- app.example.com - the UI after login is here
- logs.example.com - the user uploads logs to this website
- customerA.console.example.com - customer A UI
- customerB.console.example.com - customer B UI
Kubernetes
As the company grows (or god forbid from its beginning…) it will start using Kubernetes as it makes spinning up new services trivially easy, and each service will have its own internal subdomain.
Internally, teams will push to have their own deployment area, like staging or dev, and this is a very easy thing to do in K8s. So staging.api.example.com or staging.app.example.com will be created.
As the company’s deployment flow becomes more complicated, it will start using tools like cert-manager. Once that is in place, the entire flow of creating subdomains and certificates is automated, and there is almost nothing left for humans to do.
DNS information
As long as the subdomains exist only in DNS, gaining information about a company requires brute-forcing with a word list. You’d have to query the DNS server for names such as app.example.com or ui.example.com, one by one, to learn which names exist under a given company’s domain and which don’t.
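The word-list approach can be sketched roughly as follows. This is a minimal illustration rather than a real enumeration tool: the function names, the word list and the injectable `resolve` parameter are assumptions made for this example, and the lookup simply goes through the system resolver.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def system_resolve(hostname: str) -> Optional[str]:
    """Return an IPv4 address for hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def brute_force_subdomains(
    domain: str,
    words: Iterable[str],
    resolve: Callable[[str], Optional[str]] = system_resolve,
) -> Dict[str, str]:
    """Try each candidate label under `domain` and keep the names that resolve."""
    found = {}
    for word in words:
        hostname = f"{word}.{domain}"
        address = resolve(hostname)
        if address is not None:
            found[hostname] = address
    return found
```

Calling `brute_force_subdomains("example.com", ["app", "ui", "login"])` returns only the guesses that exist, so the result is limited by how good the word list is. A name like customerA.console.example.com is very unlikely to be on anyone's list.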
The info leak
As described in part one Certificate Transparency 101, every certificate issued for a domain is recorded in the public, append-only certificate transparency logs, where it effectively stays visible forever.
Admittedly, a name like app.example.com doesn’t reveal much, but what about sailpoint.example.com, or okta.example.com? These names reveal that the company uses or integrates with SailPoint or Okta, which is valuable information.
Because DevOps teams are unfamiliar with the public logging of certificates, a vast number of companies freely leak their entire infrastructure layout through the combination of k8s + cert-manager + Let’s Encrypt.
When I contacted Let’s Encrypt, they basically told me that people should be aware of this and that it’s their own problem.
With the help of a website called crt.sh, anyone can query any domain they want, and the website will dump the company’s entire infra. The reconnaissance step of understanding how a company works is done in a single search.
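The same search can be scripted. The sketch below assumes crt.sh's JSON output (`output=json`) returns a list of entries with a `name_value` field holding one or more hostnames separated by newlines. The function names are made up for this example, and the response format may change.

```python
import json
import urllib.parse
import urllib.request
from typing import List


def crtsh_url(domain: str) -> str:
    """Build a crt.sh search URL for every name under `domain` ('%' is a wildcard)."""
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    return f"https://crt.sh/?{query}"


def extract_hostnames(crtsh_json: str) -> List[str]:
    """Collect the unique hostnames found in a crt.sh JSON response."""
    names = set()
    for entry in json.loads(crtsh_json):
        # Assumption: name_value may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name:
                names.add(name)
    return sorted(names)


def fetch_hostnames(domain: str, timeout: float = 30.0) -> List[str]:
    """Download the crt.sh results for `domain` and return the hostnames."""
    with urllib.request.urlopen(crtsh_url(domain), timeout=timeout) as response:
        return extract_hostnames(response.read().decode("utf-8"))
```

Unlike the DNS brute-force approach, `fetch_hostnames("example.com")` needs no word list: every name that ever appeared in a logged certificate comes back, whether or not anyone could have guessed it.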
Summarizing the subdomain leak with an LLM
To make everything worse, LLMs are used everywhere right now. Query crt.sh, strip the domain name to keep just a list of subdomains, and tell an LLM to summarize the company infrastructure with something like:
You are a cybersecurity and infrastructure analyst.
Analyze the provided list of subdomains and generate a concise summary of the company's infrastructure.
Identify patterns such as:
- Cloud providers (aws, azure, gcp, cloudflare, etc.)
- Development/staging/production environments
- Services and technologies (api, mail, vpn, jenkins, gitlab, etc.)
- Geographic regions
- Third-party integrations
- Security-related services
- Customer Names?
Provide a structured summary with bullet points.
When combined with the subdomain list, this prompt produces a summary that pretty much describes the company’s entire infrastructure, the names of its customers and the tools it uses.
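Preparing the input for the LLM takes only a few lines. The sketch below strips the base domain from each hostname and appends the resulting list to the prompt. The function names and the "Subdomains:" separator are assumptions made for this example, and `PROMPT_HEADER` is an abbreviated version of the prompt above.

```python
from typing import Iterable, List

# Abbreviated version of the prompt shown above.
PROMPT_HEADER = (
    "You are a cybersecurity and infrastructure analyst.\n"
    "Analyze the provided list of subdomains and generate a concise summary "
    "of the company's infrastructure."
)


def strip_domain(hostnames: Iterable[str], domain: str) -> List[str]:
    """Reduce hostnames to their subdomain part, e.g. 'logs.example.com' -> 'logs'."""
    suffix = "." + domain.lower()
    labels = set()
    for host in hostnames:
        host = host.strip().lower()
        # Skip the bare domain and anything that belongs to another domain.
        if host.endswith(suffix):
            labels.add(host[: -len(suffix)])
    return sorted(labels)


def build_prompt(hostnames: Iterable[str], domain: str, header: str = PROMPT_HEADER) -> str:
    """Append the deduplicated subdomain list to the prompt text."""
    subdomains = strip_domain(hostnames, domain)
    return header + "\n\nSubdomains:\n" + "\n".join(subdomains)
```

The string returned by `build_prompt` can be pasted into any chat interface, so no special tooling or API access is needed.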
Examples of what companies leak
- Environment details
- Servers layout
- Internal Tools
- External Integrations
- Customer Names
Your new integration for version 3.0 with that huge undisclosed company? Leaked before release.
Your highly confidential customer name? Leaked even with NDA.
Your auth server names? Leaked.
Your monitoring tools and how you debug? Leaked.
This is valuable information which should not be visible to the external world.
If you use any of the tools described above, at least know that this information is public.