I didn't know before that I could post my advice here, so I wrote an e-mail to [email protected]; it was my first time sending an e-mail to the IETF. I don't know whether e-mail was the right channel, so I am pasting it here as well.
I'm not a native English speaker, so I apologize in advance for my poor English.
DNS record disadvantages
1. Impossible for wildcard A or AAAA records
We can't set a TXT record that covers a wildcard A or AAAA record (_esni.*.example.com. 60S IN TXT "..." "...").
This is the same problem that #79 mentions.
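A minimal sketch of why the name in the example cannot act as a wildcard, assuming the RFC 4592 rule that `*` is only a wildcard when it is the entire leftmost label. The helper name `is_dns_wildcard` is hypothetical and is not part of any draft.

```python
def is_dns_wildcard(owner_name: str) -> bool:
    """Return True if owner_name is a wildcard name in the RFC 4592 sense.

    Only a leftmost label that is exactly "*" acts as a wildcard; a "*"
    in any other position is treated as an ordinary literal label.
    """
    labels = owner_name.rstrip(".").split(".")
    return len(labels) > 1 and labels[0] == "*"


for name in ("*.example.com.", "_esni.*.example.com."):
    print(name, is_dns_wildcard(name))
```

The second name has the `*` behind the `_esni` prefix, so a resolver would only ever match it literally.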
2. Publishing a public key through DNS is inconvenient and insecure
In general, changing a DNS record is not convenient (except with DNS providers that offer an API), so most domain owners won't rotate the key very often.
In some cases, domain owners host their sites with a shared hosting provider (a provider that uses one physical or virtual server to host many sites). When the provider wants to change the ESNI public key, every owner on that server has to update their TXT record at the same time, which is very inconvenient and nearly impossible.
Without frequent key rotation and without Perfect Forward Secrecy, the scheme provides only limited security.
3. DNS records limit ESNI to domains that actually exist
Because a DNS record has to be set, only the owner of an existing, valid domain can enable ESNI.
However, sometimes we want to use encrypted SNI as a private entry point. For example, I configured my server so that a ClientHello with the SNI "vps.panel" leads to the VPS control panel (I issued a cert for "vps.panel" from my own PKI). Although the server verifies my identity with a client certificate, I don't want this private SNI to be known to the public. Your draft's method doesn't allow this use case: "vps.panel" isn't a valid domain name, so I can't own it and set a DNS record for it.
4. Local DNS (hosts) won't work
The hosts file can only stand in for A and AAAA records, and it is unlikely to gain a way to carry a public key in the near future.
5. The requirements on the server are too high
This ESNI draft requires the server to be able to decrypt an SNI encrypted with any of the published public keys. That means the server has to hold all the ESNI private keys, which I don't think is appropriate.
For a CDN provider like Cloudflare, it means that a single Cloudflare server has to store the private ESNI keys of all the websites hosted on it.
In my opinion, the browser should encrypt the SNI with every public key found in the DNS record and add a few invalid ESNI values alongside them, so that an attacker can't identify the real SNI from the number of public keys.
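The decoy idea can be sketched roughly as follows. This is only an illustration: the encryption itself is not shown, the byte strings stand in for already-encrypted SNI values, and it assumes that real ciphertexts all have the same length and look like random bytes. The function name `pad_with_decoys` and its parameters are hypothetical.

```python
import secrets


def pad_with_decoys(real_blobs, total, blob_len):
    """Mix real encrypted-SNI blobs with random decoys up to a fixed count.

    real_blobs: stand-ins for the SNI encrypted under each public key.
    total:      the fixed number of blobs to send, whatever the real count is.
    blob_len:   the length in bytes that every blob must have.
    """
    if len(real_blobs) > total:
        raise ValueError("more real blobs than the fixed total")
    if any(len(blob) != blob_len for blob in real_blobs):
        raise ValueError("all blobs must have the same length")
    decoys = [secrets.token_bytes(blob_len)
              for _ in range(total - len(real_blobs))]
    blobs = list(real_blobs) + decoys
    # Shuffle with a CSPRNG so position does not reveal which blobs are real.
    secrets.SystemRandom().shuffle(blobs)
    return blobs
```

With a fixed `total`, an observer sees the same number of values whether the DNS record listed one key or several.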
Purpose Enhancement - Wall-resistant
In some countries (e.g. China, North Korea, Iran), people may want to use ESNI to get around the government's Great Firewall, which blocks people from visiting certain websites.
The wall isn't just a wall but a very complicated system that can modify DNS resolution results (DNSSEC isn't common). It can simply block all _esni TXT records, so that browsers can't get the public key and downgrade to cleartext SNI.
If ESNI could be used with an SNI proxy, it would be a good way to get past the wall. However, the DNS record requirement means only the domain owner can enable ESNI (another disadvantage of DNS :D).
My suggestions
I think some ideas in https://datatracker.ietf.org/doc/draft-ietf-tls-sni-encryption/ are quite good. One good idea is to use a certificate to authenticate the Client-Facing Server instead of publishing a public key. My main idea is to introduce a series of HTTP headers similar to HSTS and HPKP. I will give an example below:
First, I would introduce these headers
(These names may not be well chosen due to my poor English.)
ESNI:(preload);(includeSubdomains);max-age=??
If the ESNI header is present in the response, it means this domain enables ESNI. Otherwise, the browser MUST ignore the other ESNI-* headers.
e.g.
ESNI:preload;includeSubdomains;max-age=31536000
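A minimal sketch of how a client might parse the proposed ESNI header value. The names `EsniDirective` and `parse_esni_header` are hypothetical, and treating max-age as required (because it is the only part not in parentheses above) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class EsniDirective:
    preload: bool
    include_subdomains: bool
    max_age: int  # lifetime in seconds


def parse_esni_header(value: str) -> EsniDirective:
    """Parse a value such as 'preload;includeSubdomains;max-age=31536000'."""
    preload = False
    include_subdomains = False
    max_age = None
    for token in value.split(";"):
        token = token.strip()
        lowered = token.lower()
        if lowered == "preload":
            preload = True
        elif lowered == "includesubdomains":
            include_subdomains = True
        elif lowered.startswith("max-age="):
            max_age = int(token.split("=", 1)[1])
        # Empty or unknown tokens are ignored in this sketch.
    if max_age is None:
        # Assumption: max-age is mandatory.
        raise ValueError("max-age is required")
    return EsniDirective(preload, include_subdomains, max_age)


print(parse_esni_header("preload;includeSubdomains;max-age=31536000"))
```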
ESNI-Resolve:{address};max-age=??
{address} can be either an FQDN or IP(s); the purpose is to hide the real domain during DNS resolution. The server MUST keep this header up to date. This header works like a cacheable CNAME record.
e.g.
ESNI-Resolve: server233.domain-cdn.com;max-age=31536000
ESNI-Resolve: 0.0.0.0/16;max-age=2592000
ESNI-Resolve: 1.1.1.1;max-age=2592000
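As an illustration, a client could tell the FQDN form from the IP form of an ESNI-Resolve value like this. The function name `parse_esni_resolve` and the returned tuple shape are hypothetical; the sketch uses Python's standard `ipaddress` module and treats anything that does not parse as an IP address or network as an FQDN.

```python
import ipaddress


def parse_esni_resolve(value: str):
    """Split an ESNI-Resolve value into (kind, target, max_age).

    kind is "ip" when the address parses as an IP address or CIDR range,
    otherwise "fqdn". max_age is None if the parameter is missing.
    """
    address, _, rest = value.partition(";")
    address = address.strip()
    max_age = None
    for param in rest.split(";"):
        key, _, val = param.strip().partition("=")
        if key.lower() == "max-age":
            max_age = int(val)
    try:
        target = ipaddress.ip_network(address, strict=False)
        kind = "ip"
    except ValueError:
        target = address.lower()
        kind = "fqdn"
    return kind, target, max_age


for example in ("server233.domain-cdn.com;max-age=31536000",
                "0.0.0.0/16;max-age=2592000",
                "1.1.1.1;max-age=2592000"):
    print(parse_esni_resolve(example))
```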
ESNI-Trust:{address or pin-hash};max-age=??
{address} is the same as above. The only difference is that the FQDN can contain a wildcard.
{pin-hash} is similar to HPKP's pin-hash.
The HTTP response can carry more than one of this header. (Explained later.)
e.g.
ESNI-Trust:fqdn{ *.domain-cdn.com};max-age=31536000
ESNI-Trust:ip{0.0.0.0/16};max-age=31536000
ESNI-Trust:pin-hash{X523zEOQCuEJeU6PzewOGkKCRX+YLvfAsCYJbQubCuE=};max-age=31536000
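A rough sketch of parsing the three ESNI-Trust forms shown above and matching a name or address against them. All function names are hypothetical. Letting a wildcard cover exactly one extra leftmost label is an assumption borrowed from how certificate wildcards usually behave; the proposal itself does not specify it.

```python
import ipaddress
import re

_TRUST_RE = re.compile(
    r"^\s*(fqdn|ip|pin-hash)\{\s*([^}]*?)\s*\}\s*;\s*max-age=(\d+)\s*$"
)


def parse_esni_trust(value: str):
    """Return (kind, body, max_age) for a value like 'ip{0.0.0.0/16};max-age=60'."""
    match = _TRUST_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognised ESNI-Trust value: {value!r}")
    kind, body, max_age = match.groups()
    return kind, body, int(max_age)


def fqdn_matches(pattern: str, name: str) -> bool:
    """Match name against pattern; '*.' covers exactly one leftmost label."""
    pattern = pattern.lower().rstrip(".")
    name = name.lower().rstrip(".")
    if pattern.startswith("*."):
        suffix = pattern[1:]  # e.g. ".domain-cdn.com"
        head = name[:-len(suffix)]
        return name.endswith(suffix) and head != "" and "." not in head
    return pattern == name


def ip_matches(network: str, address: str) -> bool:
    """Return True if address falls inside the given CIDR network."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)


kind, body, _ = parse_esni_trust("fqdn{ *.domain-cdn.com};max-age=31536000")
print(kind, body, fqdn_matches(body, "server233.domain-cdn.com"))
```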
ESNI-Policy:{policy};max-age=??
{policy} can be "force-encrypted-sni", "retry-clear-sni", "allow-dns-re-resolve", or "disallow-dns-re-resolve"; it defines the action to take if an error happens while using ESNI.
The response can carry more than one of this header. (Explained below.)
HTTP does not generally allow several headers with the same name unless the field is defined as a comma-separated list, so we can pack these values into one structure and encode it (e.g. base64) into a single header. The header may be very big, but don't worry: only the first response has to carry it in full, and HTTP/2 header compression will shrink it to a few bytes in later responses.
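One possible way to pack several values into a single header, sketched under the assumption that the structure is JSON and the encoding is standard base64. The proposal only says "a structure" and "e.g. base64", so the JSON layout, the field names, and the function names here are all hypothetical.

```python
import base64
import json


def encode_header_values(entries) -> str:
    """Serialise a list of entries into one base64 header value."""
    raw = json.dumps(entries, separators=(",", ":")).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_header_values(value: str):
    """Reverse encode_header_values."""
    return json.loads(base64.b64decode(value))


policies = [
    {"policy": "force-encrypted-sni", "max-age": 31536000},
    {"policy": "disallow-dns-re-resolve", "max-age": 2592000},
]
packed = encode_header_values(policies)
print("ESNI-Policy:", packed)
print(decode_header_values(packed) == policies)
```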
The first time I visit https://www.example.com, the browser does a normal TLS 1.3 handshake with cleartext SNI. The server does what it normally does, but the response MUST contain these ESNI headers if ESNI is enabled. The browser MUST remember these headers (unless in incognito mode).
The second time I visit https://www.example.com, the browser SHOULD NOT query DNS for www.example.com; instead, it should connect to the FQDN/IP given in the ESNI-Resolve header. The browser MUST NOT send cleartext SNI in the ClientHello; instead, it sends a ClientHello without SNI, or a special ClientHello that indicates ESNI. The Client-Facing Server sends back its certificate, and the browser validates it and makes sure it contains at least one FQDN/IP listed in the ESNI-Trust header. On a mismatch, the browser checks ESNI-Policy to determine whether to retry.
The browser then handshakes with the server again, carrying the encrypted ClientHello in 0-RTT data or in another ClientHello extension
(similar to the draft-ietf-tls-sni-encryption method).
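The certificate check and policy fallback on the second visit could look roughly like this. The function names and the returned action strings are hypothetical, and the precedence between policies (re-resolving via DNS before falling back to cleartext SNI, and "force-encrypted-sni" forbidding the cleartext fallback) is an assumption, since the proposal does not define it.

```python
def name_trusted(name, patterns):
    """True if name equals a pattern or matches a one-label '*.' wildcard."""
    name = name.lower().rstrip(".")
    head, _, parent = name.partition(".")
    for pattern in patterns:
        pattern = pattern.lower().rstrip(".")
        if pattern == name:
            return True
        if pattern.startswith("*.") and head and parent == pattern[2:]:
            return True
    return False


def next_action(cert_names, trusted_patterns, policies):
    """Decide what the browser does after seeing the Client-Facing Server cert.

    cert_names:       names found in the presented certificate.
    trusted_patterns: FQDN entries remembered from ESNI-Trust.
    policies:         set of ESNI-Policy values remembered for the site.
    """
    if any(name_trusted(name, trusted_patterns) for name in cert_names):
        return "send-encrypted-client-hello"
    # Mismatch: consult the remembered policy (ordering is an assumption).
    if "allow-dns-re-resolve" in policies:
        return "re-resolve-dns"
    if "retry-clear-sni" in policies and "force-encrypted-sni" not in policies:
        return "retry-clear-sni"
    return "abort"


print(next_action(["server233.domain-cdn.com"],
                  ["*.domain-cdn.com"],
                  {"force-encrypted-sni"}))
```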
The advantages and disadvantages of my suggestion are nearly the same as those of the draft-ietf-tls-sni-encryption method. It provides PFS, doesn't require changes to the Split-Mode Server, and solves part of the DNS sniffing problem. But the disadvantages are obvious too: the first handshake can't be encrypted, and ESNI-Resolve may become incorrect over time. It needs two handshakes; however, it provides PFS, and I think it is worth it. And if multiple sites are hosted on the same server (e.g. CDN, hosting provider), only the second handshake is needed, which reduces the risk of DoS attacks to some extent. I also think 0-RTT symmetric decryption is better than asymmetric decryption.
Setting max-age too large or too small both carry risks, and I don't know how to solve that.
If you don't like the idea of authenticating the Client-Facing Server, you can simply replace the content of the ESNI-Trust header with a public key and use your method. But I don't think setting a static public key is a good idea (explained in the DNS disadvantages above).
All in all, we can't have efficiency and performance together with security and reliability. You should make up your mind and choose whether to provide perfect security or perfect performance. I prefer the former.
I know that my idea is closer to draft-ietf-tls-sni-encryption than to yours. However, yours is the candidate for the TLS WG, so I hope you can think it over. But I do agree with your idea that there is no need for "don't stick out", or simply "we can stick out" :D