Why you need Network Error Logging (NEL)
Introduction
Wouldn't it be great if every visitor to your site took the time to notify you when they were experiencing connectivity issues? Wouldn't it be even better if they told you exactly what caused the issue? Things like DNS lookup failures, connection timeouts, reset connections, or dead links that result in 404 errors?
Most of these issues are not detectable server-side because, by definition, the client may never have managed to establish a connection with the server. That is what makes client-side feedback extremely valuable!
Well, stop fantasizing: this is now possible by adding just two response headers to your website. By adding the Report-To and NEL headers, you instruct supporting user agents (browsers) to send you reports about connectivity issues. Aside from logging errors, NEL can also log successful requests, allowing you to determine error rates across different client populations.
Report-To response header
The Report-To header tells the user agent where to send reports. Its value is a JSON-formatted array of objects, written without the enclosing brackets, so a single endpoint group looks like one JSON object. After adding this header, you'll receive deprecation, intervention, and crash reports. I won't dive into these right now, but you can find more information about these reports here. Example:
Report-To: {"group":"endpoint-1","max_age":10886400,"endpoints":[{"url":"https://example.uriports.com/reports"}],"include_subdomains":true}
You'll find more information about this header's elements here.
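As a rough illustration, the header value above could be generated rather than typed by hand. This is a minimal sketch, not an official API: the function name build_report_to is hypothetical, and the group name, endpoint URL, and max_age are simply the values from the example above.

```python
import json


def build_report_to(group, endpoint_url, max_age, include_subdomains=True):
    """Return a Report-To header value as a compact JSON string.

    Hypothetical helper for illustration; the keys mirror the example header.
    """
    policy = {
        "group": group,
        "max_age": max_age,  # lifetime of the policy in seconds
        "endpoints": [{"url": endpoint_url}],
        "include_subdomains": include_subdomains,
    }
    # Compact separators keep the header value free of extra whitespace.
    return json.dumps(policy, separators=(",", ":"))


report_to_value = build_report_to(
    "endpoint-1", "https://example.uriports.com/reports", 10886400
)
print("Report-To:", report_to_value)
```

Generating the value from a dictionary avoids hand-written JSON mistakes such as a missing quote or bracket, which would make the user agent ignore the header.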
NEL response header
The NEL header is also a JSON-formatted array of objects and refers to a reporting endpoint defined in the previously mentioned Report-To header. This header is an extension to the Report-To header that instructs the user agent to send network error reports. Therefore, it is not possible to configure a NEL header without a Report-To header. Example:
NEL: {"report_to":"endpoint-1","max_age":2592000,"include_subdomains":true,"failure_fraction":0.5}
Find more information about this header's elements here.
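Because a NEL policy only works when it points at a group defined in Report-To, it can help to build the NEL value in code and check the pairing before deploying. The sketch below assumes the two example headers shown above; build_nel and refers_to_group are hypothetical helper names, not part of any library.

```python
import json


def build_nel(report_to, max_age, include_subdomains=True,
              failure_fraction=None, success_fraction=None):
    """Return a NEL header value as a compact JSON string (illustrative helper)."""
    policy = {
        "report_to": report_to,
        "max_age": max_age,
        "include_subdomains": include_subdomains,
    }
    # Only emit the optional sampling fields when they were supplied.
    for name, value in (("success_fraction", success_fraction),
                        ("failure_fraction", failure_fraction)):
        if value is not None:
            if not 0 <= value <= 1:
                raise ValueError(f"{name} must be between 0 and 1")
            policy[name] = value
    return json.dumps(policy, separators=(",", ":"))


def refers_to_group(nel_value, report_to_value):
    """Check that the NEL policy names the group defined in Report-To."""
    return json.loads(nel_value)["report_to"] == json.loads(report_to_value)["group"]


nel_value = build_nel("endpoint-1", 2592000, failure_fraction=0.5)
report_to_value = (
    '{"group":"endpoint-1","max_age":10886400,'
    '"endpoints":[{"url":"https://example.uriports.com/reports"}],'
    '"include_subdomains":true}'
)
print("NEL:", nel_value)
print("NEL points at a defined group:", refers_to_group(nel_value, report_to_value))
```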
Sampling report volume
Two optional fraction values (success_fraction and failure_fraction) allow you to define a sampling rate between 0 and 1, inclusive. If you have a high-traffic website, it is a good idea to set a value lower than 1 to reduce the number of reports being sent. For instance, a value of 0.25 instructs browsers to send only 1 out of 4 reports (25%). The default value for success_fraction is 0, so you will not receive any success reports by default, while failure_fraction defaults to 1, so every failure is reported unless you lower it. Be sure to start small when enabling success reports, as they will undoubtedly generate a large number of reports. Start at 0.001 and work your way up if you need more.
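To make the effect of sampling concrete, here is a small sketch of both sides: how a sampling decision can be modeled, and how a collector could scale sampled reports back up to an estimate of the real number of events. It assumes each report carries the sampling_fraction that applied to it (as in the example report in this article); the function names are hypothetical, and real user agents may implement the decision differently.

```python
import random


def should_report(fraction, rng=random.random):
    """Model of a sampling decision: report with probability `fraction`."""
    # rng() returns a value in [0, 1), so 0 never reports and 1 always does.
    return rng() < fraction


def estimate_total_events(reports):
    """Estimate how many events occurred, given only the sampled reports."""
    # A report sampled at 0.25 stands in for roughly 4 real events.
    return sum(
        1.0 / r["body"]["sampling_fraction"]
        for r in reports
        if r["body"]["sampling_fraction"] > 0
    )


sampled = [{"body": {"sampling_fraction": 0.25}} for _ in range(3)]
print(estimate_total_events(sampled))  # 3 reports at 25% sampling
```

Weighting each report by the inverse of its sampling fraction keeps success and failure counts comparable even when the two fractions differ.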
Example report
Below is an example report from the W3C Editor's Draft. The report is delivered using a POST to the endpoint that was specified in the Report-To response header.
{
  "age": 0,
  "type": "network-error",
  "url": "https://widget.com/thing.js",
  "body": {
    "sampling_fraction": 1.0,
    "referrer": "https://www.example.com/",
    "server_ip": "",
    "protocol": "",
    "method": "GET",
    "request_headers": {},
    "response_headers": {},
    "status_code": 0,
    "elapsed_time": 143,
    "phase": "dns",
    "type": "dns.name_not_resolved"
  }
}
The example report above indicates that the user agent attempted to fetch https://widget.com/thing.js from https://www.example.com/. However, the user agent was unable to resolve the DNS name (widget.com) and the request was aborted by the user agent after 143 milliseconds. Because a previous request to widget.com delivered a valid NEL policy, the user agent generates a network error report for this request. The report was uploaded immediately after the network error was encountered (i.e., the report age is 0).
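On the receiving side, a collector needs to decode the POST body and pull out the fields discussed above. The following is a minimal sketch under two assumptions: that the body may contain either a single report or a JSON array of batched reports, and that only reports of type network-error are of interest. The function names parse_reports and describe are hypothetical.

```python
import json


def parse_reports(payload):
    """Decode a POST body and keep only the network-error reports."""
    data = json.loads(payload)
    # Accept a single report object as well as a batch (array) of reports.
    reports = data if isinstance(data, list) else [data]
    return [r for r in reports if r.get("type") == "network-error"]


def describe(report):
    """Build a one-line, human-readable summary of a network error report."""
    body = report["body"]
    return (
        f'{body["type"]} in phase {body["phase"]} for {report["url"]} '
        f'(referrer {body["referrer"]}, {body["elapsed_time"]} ms)'
    )


payload = json.dumps([{
    "age": 0,
    "type": "network-error",
    "url": "https://widget.com/thing.js",
    "body": {
        "sampling_fraction": 1.0,
        "referrer": "https://www.example.com/",
        "elapsed_time": 143,
        "phase": "dns",
        "type": "dns.name_not_resolved",
    },
}])

for report in parse_reports(payload):
    print(describe(report))
```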
Network error types
Below is a list of predefined error codes and their descriptions. As you can see, you can gain a lot of detailed information from adding a NEL header to your website.
dns.unreachable
DNS server is unreachable
dns.name_not_resolved
DNS server responded but is unable to resolve the address
dns.failed
Request to the DNS server failed due to reasons not covered by previous errors
dns.address_changed
Indicates that the resolved IP address for a request's origin has changed since the corresponding NEL policy was received
tcp.timed_out
TCP connection to the server timed out
tcp.closed
The TCP connection was closed by the server
tcp.reset
The TCP connection was reset
tcp.refused
The TCP connection was refused by the server
tcp.aborted
The TCP connection was aborted
tcp.address_invalid
The IP address is invalid
tcp.address_unreachable
The IP address is unreachable
tcp.failed
The TCP connection failed due to reasons not covered by previous errors
tls.version_or_cipher_mismatch
The TLS connection was aborted due to version or cipher mismatch
tls.bad_client_auth_cert
The TLS connection was aborted due to invalid client certificate
tls.cert.name_invalid
The TLS connection was aborted due to invalid name
tls.cert.date_invalid
The TLS connection was aborted due to invalid certificate date
tls.cert.authority_invalid
The TLS connection was aborted due to invalid issuing authority
tls.cert.invalid
The TLS connection was aborted due to invalid certificate
tls.cert.revoked
The TLS connection was aborted due to revoked server certificate
tls.cert.pinned_key_not_in_cert_chain
The TLS connection was aborted due to a key pinning error
tls.protocol.error
The TLS connection was aborted due to a TLS protocol error
tls.failed
The TLS connection failed due to reasons not covered by previous errors
http.error
The user agent successfully received a response, but it had a 4xx or 5xx status code
http.protocol.error
The connection was aborted due to an HTTP protocol error
http.response.invalid
Response is empty, has a content-length mismatch, has improper encoding, and/or other conditions that prevent the user agent from processing the response
http.response.redirect_loop
The request was aborted due to a detected redirect loop
http.failed
The connection failed due to errors in HTTP protocol not covered by previous errors
abandoned
The user aborted the resource fetch before it completed
unknown
The error type is unknown
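Since the error types listed above share a dotted prefix per network layer, a collector can roll them up into a handful of categories for dashboards. This is only a sketch of one possible grouping; the names category and tally and the catch-all label "other" are assumptions made for illustration.

```python
from collections import Counter

# Prefixes taken from the error type list above.
KNOWN_CATEGORIES = ("dns", "tcp", "tls", "http")


def category(error_type):
    """Map an error type such as 'tls.cert.revoked' to its top-level category."""
    prefix = error_type.split(".", 1)[0]
    # Types without a known prefix (abandoned, unknown) fall into "other".
    return prefix if prefix in KNOWN_CATEGORIES else "other"


def tally(error_types):
    """Count reported error types per category."""
    return Counter(category(t) for t in error_types)


seen = ["dns.unreachable", "tls.cert.revoked", "tls.failed", "abandoned"]
print(tally(seen))
```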
Notifications
Collecting and analyzing these reports can reveal valuable data that allows you to improve your website. At URIports, we automatically detect issues and send notifications when something is wrong. You will receive an email or push notification when multiple sources (different IP/browser combinations) experience the same issue, such as 404 not-found errors, expired or wrongly configured certificates, or connectivity or DNS issues. This keeps you up to date and allows you to resolve issues quickly, without site visitors having to alert you personally.
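The idea of alerting only when several independent sources report the same problem can be sketched as follows. This is not URIports' actual implementation, just an illustration of the principle: the record fields client_ip, user_agent, and error_type, the function name, and the threshold of three sources are all assumptions. The client IP and user agent are presumed to be taken by the collector from the incoming request that delivers the report.

```python
from collections import defaultdict


def issues_to_notify(records, min_sources=3):
    """Return error types reported by at least `min_sources` distinct sources."""
    sources = defaultdict(set)
    for rec in records:
        # A "source" is a distinct IP/browser combination.
        sources[rec["error_type"]].add((rec["client_ip"], rec["user_agent"]))
    return sorted(t for t, seen in sources.items() if len(seen) >= min_sources)


records = [
    {"client_ip": "192.0.2.1", "user_agent": "Chrome", "error_type": "tls.cert.date_invalid"},
    {"client_ip": "192.0.2.2", "user_agent": "Edge", "error_type": "tls.cert.date_invalid"},
    {"client_ip": "192.0.2.3", "user_agent": "Opera", "error_type": "tls.cert.date_invalid"},
    # The same source repeating an error counts only once.
    {"client_ip": "192.0.2.1", "user_agent": "Chrome", "error_type": "dns.unreachable"},
    {"client_ip": "192.0.2.1", "user_agent": "Chrome", "error_type": "dns.unreachable"},
]
print(issues_to_notify(records))
```

Requiring several distinct sources filters out problems that are local to one visitor, such as a flaky home connection, while still surfacing site-wide issues like an expired certificate.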
Adoption
Based on the (January 2020) data from Scott Helme's Crawler.Ninja, only 1.2% of the websites in the Alexa top 1 million use Network Error Logging. In my opinion, that would make NEL the most undervalued monitoring technique available today.
Conclusion
Setting up Network Error Logging is easy, and best of all, it is free and already supported by many widely used browsers, including Chrome (Mobile), Opera, Yandex, Electron, Vivaldi, and Edge. Furthermore, combining it with URIports makes it even more valuable, allowing you to easily navigate and analyze the data, with notifications as a bonus.
Getting Started with URIports
We have a Getting Started page to help you set up everything by adding a few response headers and DNS records. You'll have your free 30-day URIports trial account set up and working in less than 30 minutes. No commitment or credit card details are required, and there are no strings attached. An additional advantage of using URIports is that the reports are collected outside the website's network. This way, they can be delivered even when the website's entire network is down.
As always, if you have any questions, please find me on Twitter @freddieleeman.