Cloudflare Blog

MultiCloud... flare

If you want to start an intense conversation in the halls of Cloudflare, try describing us as a "CDN". CDNs don't generally provide you with load balancing, they don't allow you to deploy serverless applications, and they certainly don't get installed onto your phone. One of the costs of that confusion is that many people don't realize everything Cloudflare can do for people who want to operate in multiple public clouds, or want to operate in both the cloud and on their own hardware.

Load Balancing

Cloudflare has countless servers located in 180 data centers around the world. Each one is capable of acting as a Layer 7 load balancer, directing incoming traffic between origins wherever they may be. You could, for example, add load balancing between a set of machines you have in AWS EC2 and another set you keep in Google Cloud.

This load balancing isn't just round-robining traffic. It supports weighting, to let you control how much traffic goes to each cluster. It supports latency-based routing, to automatically route traffic to the closest cluster (so adding geographic distribution can be as simple as spinning up machines). It even supports health checks, allowing it to automatically direct traffic to whichever cloud is currently healthy.

Most importantly, it doesn't run in any of the providers' clouds and isn't dependent on them to function properly. Better still, since the load balancing runs near virtually every Internet user around the world, it doesn't come at any performance cost. (With our Argo technology, performance often gets better!)

Argo Tunnel

One of the hardest parts of managing a multi-cloud deployment is networking. Each provider has its own method of defining networks and firewalls, and even tools which can deploy clusters across multiple clouds often can't quite manage to get the networking configuration to work in the same way.
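The weighting and health-check behaviour can be pictured with a toy sketch. This is not Cloudflare's implementation; the pool names and weights are invented for illustration:

```python
import random

# Two hypothetical origin pools, one in EC2 and one in Google Cloud.
POOLS = {
    "aws-ec2":      {"weight": 3, "healthy": True},
    "google-cloud": {"weight": 1, "healthy": True},
}

def pick_pool(pools, rng=random):
    """Pick a healthy pool, with probability proportional to its weight."""
    healthy = {name: p for name, p in pools.items() if p["healthy"]}
    if not healthy:
        raise RuntimeError("no healthy pools")
    names = list(healthy)
    weights = [healthy[n]["weight"] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

# With a 3:1 weighting, roughly three quarters of requests land on the
# EC2 pool. If a health check marks it down, all traffic shifts over.
POOLS["aws-ec2"]["healthy"] = False
assert pick_pool(POOLS) == "google-cloud"
```

The point of the sketch is only that weighting and health are evaluated per request, so failover needs no client-side change.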
The task of setting it up can often be a trial-and-error game where the final config is best never touched again, leaving 'going multi-cloud' as a failed experiment within organizations.

At Cloudflare we have a technology called Argo Tunnel which flips networking on its head. Rather than opening ports and directing incoming traffic, each of your virtual machines (or k8s pods) makes outbound tunnels to the nearest Cloudflare PoPs. All of your Internet traffic then flows over those tunnels. You keep all your ports closed to inbound traffic, and never have to think about Internet networking again.

What's so powerful about this configuration is that it makes it trivial to spin up machines in new locations. Want a dozen machines in Australia? As long as they start the Argo Tunnel daemon, they will start receiving traffic. Don't need them any more? Shut them down and the traffic will be routed elsewhere. And, of course, none of this relies on any one public cloud provider, making it reliable even if they should have issues.

Argo Tunnel makes it trivial to add machines in new clouds, or to keep machines on-prem even as you start shifting workloads into the cloud.

Access Control

One thing you'll realize about using Argo Tunnel is that you now have secure tunnels which connect your infrastructure with Cloudflare's network. Once traffic reaches that network, it doesn't necessarily have to flow directly to your machines. It could, for example, have access control applied, where we use your identity provider (like Okta or Active Directory) to decide who should be able to access what. Rather than wrestling with VPCs and VPN configurations, you can move to a zero-trust model where you use policies to decide exactly who can access what on a per-request basis.

In fact, you can now do this with SSH as well!
You can manage all your user accounts in a single place and control with precision who can access which piece of infrastructure, irrespective of which cloud it's in.

Our Reliability

No computer system is perfect, and ours is no exception. We make mistakes, have bugs in our code, and deal with the pain of operating at the scale of the Internet every day. One great innovation in the recent history of computing, however, is the idea that it is possible to build a reliable system on top of many individually unreliable components.

Each of Cloudflare's PoPs is designed to function without communication or support from the others, or from a central data center. That alone greatly increases our tolerance for network partitions, and moves us from maintaining a single system closer to maintaining 180 independent clouds, any of which can serve all traffic.

We are also a system built on anycast, which allows us to tap into the fundamental reliability of the Internet. The Internet uses a protocol called BGP, in which each system that would like to receive traffic for a particular IP address 'advertises' it. Each router then decides where to forward traffic based on which advertiser of an address is the closest. We advertise all of our IP addresses in every one of our data centers. If a data center goes down, it stops advertising BGP routes, and the very same packets which would have been destined for it arrive at another PoP seamlessly.

Ultimately we are trying to help build a better Internet. We don't believe the Internet is built on the back of a single provider. Many of the services provided by these cloud providers are simply too complex to be as reliable as the Internet demands.

True reliability and cost control both require existing on multiple clouds. It is clear that the tools which the Internet of the 80s and 90s gave us may be insufficient to move into that future. With a smarter network we can do more, better.
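The anycast failover described above can be reduced to a toy model. The PoP names and "distances" here are invented, and real BGP path selection considers much more than distance, but the shape of the failover is the same:

```python
# Toy anycast model: every PoP advertises the same IP prefix, and a
# client's router forwards to the nearest advertising PoP.
ADVERTISING = {"SJC": 10, "LHR": 4, "SIN": 25}  # PoP -> distance from one client

def route(advertising):
    """Forward to the nearest PoP currently advertising the prefix."""
    if not advertising:
        raise RuntimeError("prefix unreachable")
    return min(advertising, key=advertising.get)

assert route(ADVERTISING) == "LHR"

# If LHR goes down it withdraws its routes; the same packets now arrive
# at the next-nearest PoP, with no change on the client side.
remaining = {pop: d for pop, d in ADVERTISING.items() if pop != "LHR"}
assert route(remaining) == "SJC"
```

The failover is a property of routing itself, which is why no health-check round trip or DNS update is needed.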

Just Write Code: Improving Developer Experience for Cloudflare Workers

We’re excited to announce that starting today, Cloudflare Workers® gets a CLI, new and improved docs, multiple scripts for everyone, the ability to run applications on workers.dev without bringing your own domain, and a free tier to make experimentation easier than ever. We are building the serverless platform of the future, and want you to build your application on it, today. In this post, we’ll elaborate on what a serverless platform of the future looks like, how it changes today’s paradigms, and our commitment to making building on it a great experience.

Three years ago, I was interviewing with Cloudflare for a Solutions Engineering role. As part of an interview assignment, I had to set up an origin behind Cloudflare on my own domain. I spent my weekend, frustrated and lost in configurations, trying to figure out how to set up an EC2 instance, connect to it over IPv6, and install NGINX on Ubuntu 16.04, just so I could end up with a static site with a picture of my cat on it. I have a computer science degree, and spent my career up until that point as a software engineer; building this simple app was a horrible experience. A weekend spent coding, without worrying about servers, would have yielded a much richer application. And this is just one rung in the ladder, the first one. While the primitives have moved up the stack, the fact is that developing an application, putting it on the Internet, and growing it from MVP to a scalable, performant product all still remain distinct steps in the development process.

This is what “serverless” has promised to solve: abstract away the servers at all stages of the process, and allow developers to do what they do best: develop, without having to worry about infrastructure.

And yet, with many serverless offerings today, the first thing they do is the thing that they promised you they wouldn’t: they make you think about servers.
“What region would you like?” (the first question that comes to mind: why are you forcing me to think about which customers I care more about, East Coast or West Coast? Why can’t you solve this for me?). Or: “How much memory do you think you’ll need?” (again: why are you making this my problem? You figure it out!).

We don’t think it should work like this.

I often think back to the problem I faced three years ago, and that I know developers all around the world face every day. Developers should be able to just focus on the code; someone else should deal with everything else, from setting up infrastructure to making that infrastructure fast and scalable. While we’ve made some architectural decisions in building Workers that enable us to do this better than anyone else, today isn’t about expounding on them (though if you’d like to read more, here’s a great blog post detailing some of them). Today is about honing Workers in on the needs of developers.

We want Workers to bring the dream of serverless to life: letting developers worry only about bugs in their code. Today marks the start of a sustained push that Cloudflare is making towards building a great developer experience with Workers. We have some exciting things to announce today, but this is just the beginning.

Wrangler: the official Workers CLI

Wrangler, originally open sourced as the Rust CLI for Workers, has graduated into being the official Workers CLI, supporting all your Workers deployment needs.
Get started now by installing Wrangler:

npm install -g @cloudflare/wrangler

Generate your first project from our template gallery:

wrangler generate <name> <template> --type=["webpack", "javascript", "rust"]

Wrangler will take care of webpacking your project, compiling to WebAssembly, and uploading your project to Workers, all in one simple step:

wrangler publish

A few of the other goodies we’re excited for you to use Wrangler for:

- Compile Rust, C, and C++ to WebAssembly
- Create single or multi-file JavaScript applications
- Install npm dependencies (we take care of webpack for you)
- Add KV namespaces and bindings
- Get started with pre-made templates

New and Improved Docs

We’ve updated our docs (and used Wrangler to do so!) to make it easier than ever for you to get started and deploy your first application with Workers. Check out our new tutorials:

- Deploy a Slack bot with Workers
- Build a QR code generator
- Serve and cache files from cloud storage

Multiscript for All

You asked, we listened. When we introduced Workers, we wanted to keep things as simple as possible. But as a developer, you want to break up your code into logical components. Rather than having a single monolithic script, we now allow you to deploy your code in a way that makes sense to you.

no-domain-required.workers.dev

Writing software is a creative process: a new project means creating something out of nothing. You may not entirely know what exactly it’s going to be yet, let alone what to name it. We are changing the way you get started on Workers by allowing you to deploy to a workers.dev subdomain. You may have heard about this announcement back in February, and we’re excited to deliver. For those of you who pre-registered, your subdomains will be waiting for you upon signing up and clicking into Workers.

A Free Tier to Experiment

Great products don’t always come from great ideas; they often come from the freedom to tinker.
When tinkering comes at a price, even if it’s $5, we realized we were limiting people’s ability to experiment.

Starting today, we are announcing a free tier for Workers. The free tier will allow you to use Workers at up to 100,000 requests per day, on your own domain or on your workers.dev subdomain. You can learn more about the limits here.

New and improved UI

We have packaged this up into a clean and easy experience that allows you to go from sign-up to a deployed Worker in less than 2 minutes.

Our commitment

We have a long way to go. This is not about crossing developer experience off our list; rather, it is about emphasizing our commitment to it. As our co-founder Michelle likes to say, “we’re only getting started”. There’s a lot here, and there’s a lot more to come. Join us to find out more, and if you’re ready to give it a spin, you can sign up and try it out.

We’re excited to see what you build!

Join Cloudflare India Forum in Bangalore on 6 June 2019!

Please join us for an exclusive gathering to discover the latest in cloud solutions for Internet security and performance.

Cloudflare Bangalore Meetup

Thursday, 6 June, 2019: 15:30 - 20:00

Location: The Oberoi (37-39, MG Road, Yellappa Garden, Yellappa Chetty Layout, Sivanchetti Gardens, Bengaluru)

We will discuss the newest security trends and introduce serverless solutions. We have invited renowned leaders from across industries, including big brands and some of the fastest-growing startups. You will learn the insider strategies and tactics that will help you protect your business, accelerate performance, and identify the quick wins in a complex Internet environment.

Speakers:

- Vaidik Kapoor, Head of Engineering, Grofers
- Nithyanand Mehta, VP of Technical Services & GM India, Catchpoint
- Viraj Patel, VP of Technology, BookMyShow
- Kailash Nadh, CTO, Zerodha
- Trey Guinn, Global Head of Solution Engineering, Cloudflare

Agenda:

- 15:30 - 16:00 - Registration and refreshments
- 16:00 - 16:30 - DDoS landscapes and security trends
- 16:30 - 17:15 - Workers overview and demo
- 17:15 - 18:00 - Panel discussion: best practices for a successful cyber security and performance strategy
- 18:00 - 18:30 - Keynote #1: The future of edge computing
- 18:30 - 19:00 - Keynote #2: Cyber attacks are evolving, so should you: how to adopt a quick-win security policy
- 19:00 - 20:00 - Happy hour

View Event Details & Register Here »

We look forward to meeting you there!

Cloudflare Repositories FTW

This is a guest post by Jim “Elwood” O’Gorman, one of the maintainers of Kali Linux. Kali Linux is a Debian-based GNU/Linux distribution popular amongst the security research community.

Kali Linux turned six years old this year!

In this time, Kali has established itself as the de-facto standard open source penetration testing platform. On a quarterly basis, we release updated ISOs for multiple platforms, pre-configured virtual machines, Kali Docker, WSL, Azure, and AWS images, tons of ARM devices, Kali NetHunter, and on and on and on. This has led to Kali being trusted and relied on to always be there for both security professionals and enthusiasts alike.

But that popularity has always brought one complication: how do we get Kali to people?

With so many different downloads plus the apt repository, we have to move a lot of data. To accomplish this, we have always relied on our network of first- and third-party mirrors. The way this works is that we run a master server that pushes out to a number of mirrors. We pay to host a number of servers that are geographically dispersed and use them as our first-party mirrors. Then, a number of third parties donate storage and bandwidth to operate third-party mirrors, ensuring that we have even more systems that are geographically close to you. When you go to download, you hit a redirector that sends you to a mirror close to you, ideally allowing you to download your files quickly.

This solution has always been pretty decent, but it has some drawbacks. First, our network of first-party mirrors is expensive. Second, some mirrors are not as good as others. Nothing is worse than trying to download Kali and getting sent to a slow mirror, where your download might drag on for hours.
Third, we always need more mirrors as Kali continues to grow in popularity.

This situation led to us encountering Cloudflare, thanks to some extremely generous outreach:

"...we can chat more about your specific use case." — Justin (@xxdesmus), June 29, 2018

I will be honest, we are a bunch of security nerds, so we were a bit skeptical at first. We have some pretty unique needs, we use a lot of bandwidth, syncing an apt repository to a CDN is no small task, and, well, we are paranoid. We average 1,000,000 downloads a month on just our ISO images. Add in our apt repos and you are talking some serious, serious traffic. So how much help could we really expect from Cloudflare anyway? Were we really going to be able to put this to use, or would this just be a nice fancy front end to our website and nothing else?

On the other hand, it was a chance to use something new and shiny, and it is an expensive product, so of course we dove right in to play with it.

Initially we had some sync issues. A package repository is a mix of static data (binary and source packages) and dynamic data (package lists are updated every 6 hours). To make things worse, the cryptographic sealing of the metadata means that we need atomic updates of all the metadata (the signed top-level ‘Release’ file contains checksums of all the binary and source package lists).

The default behavior of a CDN is not appropriate for this purpose, as it caches all files for a certain amount of time after they have been fetched for the first time. This means that you could have different versions of various metadata files in the cache, resulting in invalid-checksum errors returned by apt-get. So we had to implement a few tweaks to make it work and reap the full benefits of Cloudflare’s CDN network.

First we added an “Expires” HTTP header to disable expiration of all files that will never change.
Then we added another HTTP header to tag all metadata files so that we could manually purge those files from the CDN cache through an API call, which we integrated at the end of the repository update procedure on our backend server.

With nginx on our backend, the configuration looks like this:

location /kali/dists/ {
    add_header Cache-Tag metadata,dists;
}
location /kali/project/trace/ {
    add_header Cache-Tag metadata,trace;
    expires 1h;
}
location /kali/pool/ {
    add_header Cache-Tag pool;
    location ~ \.(deb|udeb|dsc|changes|xz|gz|bz2)$ {
        expires max;
    }
}

The API call is a simple shell script launched by a hook of the repository mirroring script:

#!/bin/sh
curl -sS -X POST "" \
    -H "Content-Type: application/json" \
    -H "X-Auth-Key: XXXXXXXXXXXXX" \
    -H "" \
    --data '{"tags":["metadata"]}'

With this simple yet powerful feature, we ensure that the CDN cache always contains consistent versions of the metadata files. Going further, we might want to configure prefetching, so that Cloudflare downloads all the package lists as soon as a user downloads the top-level ‘Release’ file.

In short, we were using this system in a way that was never intended, but it worked! This really reduced the load on our backend, as a single server could feed the entire CDN, while putting the files geographically close to users and allowing the classic apt dist-upgrade to occur much, much faster than ever before.

It was a huge benefit, and not really a lot of work to set up. Sevki Hasirci was there with us the entire time as we worked through this process, ensuring any questions we had were answered promptly. A great win.

However, there was just one problem. Looking at our logs, while the apt repo was working perfectly, our image distribution was not so great. None of those images were getting cached, and our origin server was dying.

Talking with Sevki, it turned out there were limits to how large a file Cloudflare would cache.
He upped our limit to the system capacity, but that still was not enough for how large some of our images are. At this point, we just assumed that was that: we could use this solution for the repo, but for our image distribution it would not help. However, Sevki told us to wait a bit. He had a surprise in the works for us.

After some development time, Cloudflare pushed out an update to address our issue, allowing us to cache very large files. With that in place, everything just worked, with no additional tweaking. Even items like partial downloads for users using download accelerators worked just fine. Amazing!

To show an example of what this translated into, let’s look at some graphs. Once the very large file support was added and we started to push out our images through Cloudflare, you could see that there was no real increase in requests. However, looking at bandwidth, there is a clear increase. After it had been implemented for a while, we saw a clear pattern.

This pushed us from around 80 TB a week when we had just the repo, to now around 430 TB a month with the repo and images. As you can imagine, that’s an amazing bandwidth saving for an open source project such as ours.

Performance is great, and with a cache hit rate of over 97% (amazingly high considering how often and frequently files in our repo change), we could not be happier.

So what’s next? That’s the question we are asking ourselves. This solution has worked so well that we are looking at other ways to leverageage it, and there are a lot of options. One thing is for sure: we are not done with this.

Thanks to Cloudflare, Sevki, Justin, and Matthew for helping us along this path. It is fair to say this is the single largest contribution to Kali that we have received outside of the support of Offensive Security.

The support we received from Cloudflare was amazing. The Kali project and community thank you immensely every time they update their distribution or download an image.
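To put that hit rate in perspective, a quick back-of-the-envelope calculation using the figures quoted above shows how little traffic still reaches the origin:

```python
# Back-of-the-envelope: at ~430 TB/month served and a 97% cache hit
# rate, only the ~3% of misses have to be fetched from the origin.
total_tb_per_month = 430
hit_rate = 0.97

origin_tb = total_tb_per_month * (1 - hit_rate)
cdn_tb = total_tb_per_month * hit_rate

print(f"served from cache: {cdn_tb:.1f} TB, from origin: {origin_tb:.1f} TB")
# A single backend server only needs to sustain roughly 13 TB/month of egress.
```

This is the arithmetic behind "a single server could feed the entire CDN".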

Join Cloudflare & PicsArt at our meetup in Yerevan!

Cloudflare is partnering with PicsArt to host a meetup this month at the PicsArt office in Yerevan. We would love for you to join us to learn about the newest developments in the Internet industry. You'll join Cloudflare's users, stakeholders from the tech community, and engineers from both Cloudflare and PicsArt.

Tuesday, 4 June, 18:30 - 21:00

PicsArt office, Yerevan

Agenda:

- 18:30 - 19:00 - Doors open, food and drinks
- 19:00 - 19:30 - Areg Harutyunyan, Engineering Lead of Argo Tunnel at Cloudflare: "Cloudflare Overview / Cloudflare Security: How Argo Tunnel and Cloudflare Access enable effortless security for your team"
- 19:30 - 20:00 - Gerasim Hovhannisyan, Director of IT Infrastructure Operations at PicsArt: "Scaling to 10PB Content Delivery with Cloudflare's Global Network"
- 20:00 - 20:30 - Olga Skobeleva, Solutions Engineer at Cloudflare: "Security: the Serverless Future"
- 20:30 - 21:00 - Networking, food and drinks

View Event Details & Register Here »

We hope to meet you soon. Here are some photos from the meetup at PicsArt last year:

Stopping SharePoint’s CVE-2019-0604

On Saturday, 11th May 2019, we got news of a critical web vulnerability being actively exploited in the wild by advanced persistent threats (APTs), affecting Microsoft’s SharePoint server (versions 2010 through 2019). This was CVE-2019-0604, a remote code execution vulnerability in Microsoft SharePoint Servers which was not previously known to be exploitable via the web.

Several cyber security centres, including the Canadian Centre for Cyber Security and Saudi Arabia’s National Center, put out alerts for this threat, indicating it was being exploited to download and execute malicious code which would in turn take complete control of servers.

The affected software versions:

- Microsoft SharePoint Foundation 2010 Service Pack 2
- Microsoft SharePoint Foundation 2013 Service Pack 1
- Microsoft SharePoint Server 2010 Service Pack 2
- Microsoft SharePoint Server 2013 Service Pack 1
- Microsoft SharePoint Enterprise Server 2016
- Microsoft SharePoint Server 2019

Introduction

The vulnerability was initially given a critical CVSS v3 rating of 8.8 on the Zero Day Initiative advisory (however, the advisory states authentication is required). This would imply that only an insider threat, someone with authorisation within SharePoint, such as an employee on the local network, could exploit the vulnerability.

We discovered that was not always the case, since there were paths which could be reached without authentication, via external-facing websites. Using the NIST NVD calculator, the actual base score comes out at 9.8 out of 10 without the authentication requirement.

As part of our internal vulnerability scoring process, we decided this was critical enough to require immediate attention, for a number of reasons. The first was that it was a critical CVE affecting a major software ecosystem, primarily aimed at enterprise businesses. There appeared to be no stable patch available at the time.
And there were several reports of it being actively exploited in the wild by APTs.

We deployed an initial firewall rule the same day: rule 100157. This allowed us to analyse traffic and request frequency before making a decision on the default action. At the same time, it gave our customers the ability to protect their online assets from these attacks in advance of a patch.

We observed the first probes at around 4:47 PM on the 11th of May, which went on until 9:12 PM. We have reason to believe these were not successful attacks, and were simply reconnaissance probes at this point.

The online vulnerable hosts exposed to the web were largely made up of high-traffic enterprise businesses, which makes sense based on the below graph from W3Techs.

Figure 1: Depicts SharePoint’s market position (image attributed to W3Techs)

The publicly accessible proof-of-concept exploit code found online did not work out of the box; therefore it was not immediately widely used, since it required weaponisation by a more skilled adversary.

We give customers advance notice of most rule changes. In this case, however, we decided that the risk was high enough that we needed to act upon it, and so made the decision to make an immediate rule release to block this malicious traffic for all of our customers on May 13th.

We were confident enough in going default-block here, as the requests we’d analysed so far did not appear to be legitimate. We took several factors into consideration to determine this, some of which are detailed below.

The bulk of requests we’d seen so far, a couple of hundred, originated from cloud instances within the same IP ranges. They were enumerating the subdomains of many websites over a short time period. This is a fairly common scenario: malicious actors perform reconnaissance using various methods in an attempt to find a vulnerable host to attack before actually exploiting the vulnerability.
The query string parameters also appeared suspicious, having only the ones necessary to exploit the vulnerability and nothing more.

The rule was deployed in default block mode, protecting our customers, before security researchers discovered how to weaponise the exploit and before a stable patch from Microsoft was widely adopted.

The vulnerability

Zero Day Initiative did a good job of drilling down on the root cause of this vulnerability, and how it could potentially be exploited in practice. From debugging the .NET executable, they discovered that the following functions could eventually reach the deserialisation call, and so may potentially be exploitable.

Figure 2: Depicts the affected function calls (image attributed to Trend Micro Zero Day Initiative)

The most interesting ones here are the “.Page_Load” and “.OnLoad” methods, as these can be directly accessed by visiting a webpage. However, only one appears not to require authentication: ItemPicker.ValidateEntity, which can be reached via the Picker.aspx webpage.

The vulnerability lies in the following function calls:

EntityInstanceIdEncoder.DecodeEntityInstanceId(encodedId);
Microsoft.SharePoint.BusinessData.Infrastructure.EntityInstanceIdEncoder.DecodeEntityInstanceId(pe.Key);

Figure 3: PickerEntity validation (image attributed to Trend Micro Zero Day Initiative)

The PickerEntity ValidateEntity function takes “pe” (PickerEntity) as an argument. After checking that pe.Key is not null, and that it matches the necessary format via a call to:

Microsoft.SharePoint.BusinessData.Infrastructure.EntityInstanceIdEncoder.IsEncodedIdentifier(pe.Key)

it continues to define an object of identifierValues from the result of:

Microsoft.SharePoint.BusinessData.Infrastructure.EntityInstanceIdEncoder.DecodeEntityInstanceId(pe.Key)

where the actual deserialisation takes place. Otherwise, it raises an AuthenticationException, which displays an error page to the user.

The affected function call can be seen below.
First, there is a conditional check on the encodedId argument which is passed to DecodeEntityInstanceId(): if it begins with __, it continues on to deserialising the XML schema with xmlSerializer.Deserialize().

Figure 4: DecodeEntityInstanceId leading to the deserialisation (image attributed to Trend Micro Zero Day Initiative)

When reached, the encodedId (in the form of an XML serialised payload) would be deserialised and eventually executed on the system in a SharePoint application pool context, leading to a full system compromise. One such XML payload spawns a calculator (calc.exe) instance via a call to command prompt (cmd.exe):

<ResourceDictionary
    xmlns=""
    xmlns:x=""
    xmlns:System="clr-namespace:System;assembly=mscorlib"
    xmlns:Diag="clr-namespace:System.Diagnostics;assembly=system">
  <ObjectDataProvider x:Key="LaunchCalch" ObjectType="{x:Type Diag:Process}" MethodName="Start">
    <ObjectDataProvider.MethodParameters>
      <System:String>cmd.exe</System:String>
      <System:String>/c calc.exe</System:String>
    </ObjectDataProvider.MethodParameters>
  </ObjectDataProvider>
</ResourceDictionary>

Analysis

When we first deployed the rule in log mode, we did not initially see many hits, other than a hundred probes later that evening. We believe this was largely due to the unknowns of the vulnerability and its exploitation, as a number of conditions had to be met to craft a working exploit that were not yet widely known.

It wasn’t until after we had set the rule to default drop mode that we saw the attacks really start to ramp up.
On Monday the 13th we observed our first exploit attempts, and on the 14th we saw what we believe to be individuals manually attempting to exploit sites via this vulnerability. Given that this was a weekend, it realistically gave organisations one working day to roll out a patch before malicious actors started attempting to exploit this vulnerability.

Figure 5: Depicts requests matched; rule 100157 was set as default block early on 13th May.

Further into the week, we started seeing smaller spikes for the rule. And on the 16th of May, the same day the UK’s NCSC put out an alert reporting highly successful exploitation attempts against UK organisations, thousands of requests were dropped, primarily launched at larger enterprises and government entities. This is often the nature of such targeted attacks: malicious actors will try to automate exploits to have the biggest possible impact, and that’s exactly what we saw here.

So far in our analysis, we’ve seen malicious hits for the following paths:

/_layouts/15/Picker.aspx
/_layouts/Picker.aspx
/_layouts/15/downloadexternaldata.aspx

The bulk of attacks we’ve seen have been targeting the unauthenticated Picker.aspx endpoint, as one would expect, using the ItemPickerDialog type:

/_layouts/15/picker.aspx?PickerDialogType=Microsoft.SharePoint.Portal.WebControls.ItemPickerDialog

We expect the vulnerability to be exploited more once a complete exploit is publicly available, so it is important to update your systems if you have not already. We also recommend isolating these systems to the internal network in cases where they do not need to be external-facing, in order to avoid an unnecessary attack surface.

Sometimes it’s not practical to isolate such systems to an internal network; this is usually the case for global organisations with teams spanning multiple locations. In these cases, we highly recommend putting these systems behind an access management solution, like Cloudflare Access.
This gives you granular control over who can access resources, and has the additional benefit of auditing user access.

Microsoft initially released a patch, but it did not address all vulnerable functions. Customers were therefore left vulnerable, with the only options being to virtually patch their systems or to shut their services down entirely until an effective fix became available.

This is a prime example of why firewalls like Cloudflare’s WAF are critical to keeping a business online. Sometimes patching is not an option, and even when it is, it can take time to roll out effectively across an enterprise.
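As an illustration of the kind of screening involved, the vulnerable paths listed earlier can be matched with a simple case-insensitive check. This is a toy sketch, not Cloudflare's actual rule 100157 (whose logic is not public), and a real WAF inspects far more than the path:

```python
import re

# Toy screen for the attack paths reported in this post.
SUSPICIOUS = re.compile(
    r"/_layouts/(?:15/)?(?:picker\.aspx|downloadexternaldata\.aspx)",
    re.IGNORECASE,
)

def looks_like_cve_2019_0604_probe(path: str) -> bool:
    """Flag request paths that match the probed SharePoint endpoints."""
    return bool(SUSPICIOUS.search(path))

assert looks_like_cve_2019_0604_probe(
    "/_layouts/15/picker.aspx?PickerDialogType="
    "Microsoft.SharePoint.Portal.WebControls.ItemPickerDialog")
assert not looks_like_cve_2019_0604_probe("/sites/home/default.aspx")
```

Matching on paths alone would generate false positives for legitimate authenticated SharePoint use, which is one reason real rules also weigh query parameters and request bodies.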

The Serverlist Newsletter: Connecting the Serverless Ecosystem

Check out our fifth edition of The Serverlist below. Get the latest scoop on the serverless space, get your hands dirty with new developer tutorials, engage in conversations with other serverless developers, and find upcoming meetups and conferences to attend.

Sign up below to have The Serverlist sent directly to your mailbox.
magic.contentDocument.head.appendChild(injectedStyle) function onFinish() { setTimeout(() => { = '' }, 80) } if (iframeDoc.readyState === 'loading') { iframeWindow.addEventListener('load', onFinish) } else { onFinish() } } async function fetchURL(url) { magic.addEventListener('load', resizeIframe) const call = await fetch(`${url}`) const text = await call.text() const divie = document.createElement("div") divie.innerHTML = text const listie = divie.getElementsByTagName("a") for (var i = 0; i < listie.length; i++) { listie[i].setAttribute("target", "_blank") } magic.scrolling = "no" magic.srcdoc = divie.innerHTML } fetchURL("")

NGINX structural enhancements for HTTP/2 performance

Introduction

My team, the Cloudflare PROTOCOLS team, is responsible for termination of HTTP traffic at the edge of the Cloudflare network. We deal with features related to TCP, QUIC, TLS and Secure Certificate management, and HTTP/1 and HTTP/2. Over Q1, we were responsible for implementing the Enhanced HTTP/2 Prioritization product that Cloudflare announced during Speed Week.

This was a very exciting project to be part of, and doubly exciting to see the results of, but during the course of the project we had a number of interesting realisations about NGINX: the HTTP-oriented server onto which Cloudflare currently deploys its software infrastructure. We quickly became certain that our Enhanced HTTP/2 Prioritization project could not achieve even moderate success if the internal workings of NGINX were not changed.

Due to these realisations, we embarked upon a number of significant changes to the internal structure of NGINX in parallel with the work on the core prioritization product. This blog post describes the motivation behind the structural changes, how we approached them, and what impact they had.
We also identify additional changes that we plan to add to our roadmap, which we hope will improve performance further.

Background

Enhanced HTTP/2 Prioritization aims to do one thing to web traffic flowing between a client and a server: it provides a means to shape the many HTTP/2 streams as they flow from upstream (server or origin side) into a single HTTP/2 connection that flows downstream (client side).

Enhanced HTTP/2 Prioritization allows site owners and the Cloudflare edge systems to dictate the rules about how various objects should combine into the single HTTP/2 connection: whether a particular object should have priority and dominate that connection and reach the client as soon as possible, or whether a group of objects should evenly share the capacity of the connection and put more emphasis on parallelism.

As a result, Enhanced HTTP/2 Prioritization allows site owners to tackle two problems that exist between a client and a server: how to control the precedence and ordering of objects, and how to make the best use of a limited connection resource, which may be constrained by a number of factors such as bandwidth, volume of traffic, and CPU workload at the various stages on the path of the connection.

What did we see?

The key to prioritisation is being able to compare two or more HTTP/2 streams in order to determine which one's frame is to go down the pipe next. The Enhanced HTTP/2 Prioritization project necessarily drew us into the core NGINX codebase, as our intention was to fundamentally alter the way that NGINX compared and queued HTTP/2 data frames as they were written back to the client.

Very early in the analysis phase, as we rummaged through the NGINX internals to survey the site of our proposed features, we noticed a number of shortcomings in the structure of NGINX itself, in particular: how it moved data from upstream (server side) to downstream (client side), and how it temporarily stored (buffered) that data in its various internal stages.
The main conclusion of our early analysis of NGINX was that it largely failed to give the stream data frames any 'proximity'. Either streams were processed in the NGINX HTTP/2 layer in isolated succession, or frames of different streams spent very little time in the same place, such as a shared queue. The net effect was a reduction in the opportunities for useful comparison.

We coined a new, barely scientific but useful measurement, Potential, to describe how effectively the Enhanced HTTP/2 Prioritization strategies (or even the default NGINX prioritization) can be applied to queued data streams. Potential is not so much a measurement of the effectiveness of prioritization per se (that metric would be left for later on in the project); it is more a measurement of the levels of participation during the application of the algorithm. In simple terms, it considers the number of streams, and frames thereof, that are included in an iteration of prioritization, with more streams and more frames leading to higher Potential.

What we could see from early on was that, by default, NGINX displayed low Potential, rendering prioritization instructions from either the browser (as is the case in the traditional HTTP/2 prioritization model) or from our Enhanced HTTP/2 Prioritization product fairly useless.

What did we do?

With the goal of improving the specific problems related to Potential, and also improving the general throughput of the system, we identified some key pain points in NGINX.
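To make the informal Potential measurement concrete, here is an illustrative toy in JavaScript (not Cloudflare's actual code; all names are hypothetical): Potential grows with the number of streams, and frames thereof, that participate in a single prioritization iteration.

```javascript
// Toy illustration of "Potential": more streams and more frames queued
// together in one prioritization iteration means higher Potential.
function potential(streamQueues) {
  // streamQueues: Map of streamId -> array of queued frames
  const streams = [...streamQueues.values()].filter(q => q.length > 0);
  const frames = streams.reduce((n, q) => n + q.length, 0);
  return { streams: streams.length, frames };
}

// Low Potential: streams handled in isolated succession, so each
// iteration only ever sees a single stream's frames.
const isolated = new Map([[1, ["f1"]]]);

// Higher Potential: several streams' frames are in the same place at the
// same time, so the prioritization comparison has material to work with.
const batched = new Map([[1, ["f1", "f2"]], [3, ["f3"]], [5, ["f4", "f5"]]]);

console.log(potential(isolated)); // { streams: 1, frames: 1 }
console.log(potential(batched));  // { streams: 3, frames: 5 }
```

The default NGINX behaviour described above corresponds to the `isolated` case; the structural changes below aim to move the system towards the `batched` case.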
These points, which will be described below, have either been worked on and improved as part of our initial release of Enhanced HTTP/2 Prioritization, or have now branched out into meaningful projects of their own that we will put engineering effort into over the course of the next few months.

HTTP/2 frame write queue reclamation

Write queue reclamation was successfully shipped with our release of Enhanced HTTP/2 Prioritization and, ironically, it wasn't a change made to the original NGINX; it was in fact a change made against our Enhanced HTTP/2 Prioritization implementation when we were part way through the project. It serves as a good example of something one may call conservation of data, which is a good way to increase Potential.

Similar to the original NGINX, our Enhanced HTTP/2 Prioritization algorithm will place a cohort of HTTP/2 data frames into a write queue as a result of an iteration of the prioritization strategies being applied to them. The contents of the write queue are destined to be written to the downstream TLS layer. Also similar to the original NGINX, the write queue may only be partially written to the TLS layer due to back-pressure from a network connection that has temporarily reached write capacity.

Early on in our project, if the write queue was only partially written to the TLS layer, we would simply leave the frames in the write queue until the backlog was cleared, then re-attempt to write that data to the network in a future write iteration, just like the original NGINX.

The original NGINX takes this approach because the write queue is the only place that waiting data frames are stored.
However, in our NGINX modified for Enhanced HTTP/2 Prioritization, we have a unique structure that the original NGINX lacks: per-stream data frame queues, where we temporarily store data frames before our prioritization algorithms are applied to them.

We came to the realisation that, in the event of a partial write, we were able to restore the unwritten frames back into their per-stream queues. If a subsequent data cohort arrived behind the partially unwritten one, then the previously unwritten frames could participate in an additional round of prioritization comparisons, thus raising the Potential of our algorithms. The following diagram illustrates this process:

We were very pleased to ship Enhanced HTTP/2 Prioritization with the reclamation feature included, as this single enhancement greatly increased Potential and made up for the fact that we had to withhold the next enhancement for Speed Week due to its delicacy.

HTTP/2 frame write event re-ordering

In the Cloudflare infrastructure, we map the many streams of a single HTTP/2 connection from the eyeball to multiple HTTP/1.1 connections to the upstream Cloudflare control plane.

As a note: it may seem counterintuitive that we downgrade protocols like this, and it may seem doubly counterintuitive when I reveal that we also disable HTTP keepalive on these upstream connections, resulting in only one transaction per connection. However, this arrangement offers a number of advantages, particularly in the form of improved CPU workload distribution.

When NGINX monitors its upstream HTTP/1.1 connections for read activity, it may detect readability on many of those connections and process them all in a batch.
However, within that batch, each of the upstream connections is processed sequentially, one at a time, from start to finish: from HTTP/1.1 connection read, to framing in the HTTP/2 stream, to HTTP/2 connection write to the TLS layer. The existing NGINX workflow is illustrated in this diagram:

By committing each stream's frames to the TLS layer one stream at a time, many frames may pass entirely through the NGINX system before backpressure on the downstream connection allows the queue of frames to build up, providing an opportunity for these frames to be in proximity and allowing prioritization logic to be applied. This negatively impacts Potential and reduces the effectiveness of prioritization.

The Cloudflare Enhanced HTTP/2 Prioritization modified NGINX aims to re-arrange the internal workflow described above into the following model:

Although we continue to frame upstream data into HTTP/2 data frames in separate iterations for each upstream connection, we no longer commit these frames to a single write queue within each iteration; instead, we arrange the frames into the per-stream queues described earlier. We then post a single event to the end of the per-connection iterations, and perform the prioritization, queuing and writing of the HTTP/2 data frames of all streams in that single event.

This single event finds the cohort of data conveniently stored in their respective per-stream queues, all in close proximity, which greatly increases the Potential of the Edge Prioritization algorithms.

In a form closer to actual code, the core of this modification changes this:

```
ngx_http_v2_process_data(ngx_http_v2_connection *h2_conn,
                         ngx_http_v2_stream *h2_stream,
                         ngx_buffer *buffer)
{
    while ( !ngx_buffer_empty(buffer) ) {
        ngx_http_v2_frame_data(h2_conn, h2_stream->frames, buffer);
    }
    ngx_http_v2_prioritise(h2_conn->queue, h2_stream->frames);
    ngx_http_v2_write_queue(h2_conn->queue);
}
```

To this:

```
ngx_http_v2_process_data(ngx_http_v2_connection *h2_conn,
                         ngx_http_v2_stream *h2_stream,
                         ngx_buffer *buffer)
{
    while ( !ngx_buffer_empty(buffer) ) {
        ngx_http_v2_frame_data(h2_conn, h2_stream->frames, buffer);
    }
    ngx_list_add(h2_conn->active_streams, h2_stream);
    ngx_call_once_async(ngx_http_v2_write_streams, h2_conn);
}

ngx_http_v2_write_streams(ngx_http_v2_connection *h2_conn)
{
    ngx_http_v2_stream *h2_stream;

    while ( !ngx_list_empty(h2_conn->active_streams) ) {
        h2_stream = ngx_list_pop(h2_conn->active_streams);
        ngx_http_v2_prioritise(h2_conn->queue, h2_stream->frames);
    }
    ngx_http_v2_write_queue(h2_conn->queue);
}
```

There is a high level of risk in this modification: even though it is remarkably small, we are taking the well-established and debugged event flow in NGINX and switching it around to a significant degree. Like taking a number of Jenga pieces out of the tower and placing them in another location, we risk race conditions, event misfires, and event blackholes leading to lockups during transaction processing. Because of this level of risk, we did not release this change in its entirety during Speed Week, but we will continue to test and refine it for future release.

Upstream buffer partial re-use

NGINX has an internal buffer region to store connection data it reads from upstream. To begin with, the entirety of this buffer is Ready for use. When data is read from upstream into the Ready buffer, the part of the buffer that holds the data is passed to the downstream HTTP/2 layer.
Since HTTP/2 takes responsibility for that data, that portion of the buffer is marked as Busy, and it will remain Busy for as long as it takes the HTTP/2 layer to write the data into the TLS layer, a process that may take some time (in computer terms!).

During this gulf of time, the upstream layer may continue to read more data into the remaining Ready sections of the buffer, and continue to pass that incremental data to the HTTP/2 layer, until there are no Ready sections available. When Busy data is finally finished in the HTTP/2 layer, the buffer space that contained that data is then marked as Free. The process is illustrated in this diagram:

You may ask: when the leading part of the upstream buffer is marked as Free (in blue in the diagram), even though the trailing part of the upstream buffer is still Busy, can the Free part be re-used for reading more data from upstream?

The answer to that question is: no. Because just a small part of the buffer is still Busy, NGINX will refuse to allow any of the entire buffer space to be re-used for reads. Only when the entirety of the buffer is Free can the buffer be returned to the Ready state and used for another iteration of upstream reads. So, in summary, data can be read from upstream into Ready space at the tail of the buffer, but not into Free space at the head of the buffer.

This is a shortcoming in NGINX and is clearly undesirable, as it interrupts the flow of data into the system. We asked: what if we could cycle through this buffer region and re-use parts at the head as they became Free? We seek to answer that question in the near future by testing the following buffering model in NGINX:

TLS layer Buffering

On a number of occasions in the above text, I have mentioned the TLS layer, and how the HTTP/2 layer writes data into it.
In the OSI network model, TLS sits just below the protocol (HTTP/2) layer, and in many consciously designed networking software systems such as NGINX, the software interfaces are separated in a way that mimics this layering.

The NGINX HTTP/2 layer will collect the current cohort of data frames and place them in priority order into an output queue, then submit this queue to the TLS layer. The TLS layer makes use of a per-connection buffer to collect HTTP/2 layer data before performing the actual cryptographic transformations on that data.

The purpose of the buffer is to give the TLS layer a more meaningful quantity of data to encrypt: if the buffer were too small, or the TLS layer simply relied on the units of data from the HTTP/2 layer, then the overhead of encrypting and transmitting the multitude of small blocks could negatively impact system throughput. The following diagram illustrates this undersize buffer situation:

If the TLS buffer is too big, then an excessive amount of HTTP/2 data will be committed to encryption, and if it fails to write to the network due to backpressure, it will be locked into the TLS layer and not be available to return to the HTTP/2 layer for the reclamation process, thus reducing the effectiveness of reclamation. The following diagram illustrates this oversize buffer situation:

In the coming months, we will embark on a process to attempt to find the 'goldilocks' spot for TLS buffering: to size the TLS buffer so it is big enough to maintain the efficiency of encryption and network writes, but not so big as to reduce the responsiveness to incomplete network writes and the efficiency of reclamation.

Thank you - Next!

The Enhanced HTTP/2 Prioritization project has the lofty goal of fundamentally re-shaping how we send traffic from the Cloudflare edge to clients, and as the results of our testing and feedback from some of our customers show, we have certainly achieved that!
However, one of the most important lessons we took away from the project was the critical role that the internal data flow within our NGINX software infrastructure plays in the traffic observed by our end users. We found that changing a few lines of (albeit critical) code could have significant impacts on the effectiveness and performance of our prioritization algorithms. Another positive outcome is that, in addition to improving HTTP/2, we are looking forward to carrying our newfound skills and lessons learned over to HTTP/3 over QUIC.

We are eager to share our modifications to NGINX with the community, so we have opened this ticket, through which we will discuss upstreaming the event re-ordering change and the buffer partial re-use change with the NGINX team.

As Cloudflare continues to grow, our requirements on our software infrastructure also shift. Cloudflare has already moved beyond proxying of HTTP/1 over TCP to support termination and Layer 3 and 4 protection for any UDP and TCP traffic. Now we are moving on to other technologies and protocols such as QUIC and HTTP/3, and full proxying of a wide range of other protocols such as messaging and streaming media.

For these endeavours we are looking at new ways to answer questions on topics such as scalability, localised performance, wide-scale performance, introspection and debuggability, release agility, and maintainability.

If you would like to help us answer these questions, and know a bit about hardware and software scalability, network programming, asynchronous event and futures based software design, TCP, TLS, QUIC, HTTP, RPC protocols, Rust, or maybe something else, then have a look here.

Building a To-Do List with Workers and KV

In this tutorial, we'll build a todo list application in HTML, CSS and JavaScript, with a twist: all the data should be stored inside of the newly-launched Workers KV, and the application itself should be served directly from Cloudflare's edge network, using Cloudflare Workers.

To start, let's break this project down into a couple different discrete steps. In particular, it can help to focus on the constraint of working with Workers KV, as handling data is generally the most complex part of building an application:

- Build a todos data structure
- Write the todos into Workers KV
- Retrieve the todos from Workers KV
- Return an HTML page to the client, including the todos (if they exist)
- Allow creation of new todos in the UI
- Allow completion of todos in the UI
- Handle todo updates

This task order is pretty convenient, because it's almost perfectly split into two parts: first, understanding the Cloudflare/API-level things we need to know about Workers and KV, and second, actually building up a user interface to work with the data.

Understanding Workers

In terms of implementation, a great deal of this project is centered around KV - although that may be the case, it's useful to break down what Workers are exactly.

Service Workers are background scripts that run in your browser, alongside your application. Cloudflare Workers are the same concept, but super-powered: your Worker scripts run on Cloudflare's edge network, in-between your application and the client's browser. This opens up a huge amount of opportunity for interesting integrations, especially considering the network's massive scale around the world. Here are some of the use-cases that I think are the most interesting:

- Custom security/filter rules to block bad actors before they ever reach the origin
- Replacing/augmenting your website's content based on the request content (i.e. user agents and other headers)
- Caching requests to improve performance, or using Cloudflare KV to optimize high-read tasks in your application
- Building an application directly on the edge, removing the dependence on origin servers entirely

For this project, we'll lean heavily towards the latter end of that list, building an application that clients communicate with, served on Cloudflare's edge network. This means that it'll be globally available, with low latency, while still allowing the ease-of-use of building applications directly in JavaScript.

Setting up a canvas

To start, I wanted to approach this project from the bare minimum: no frameworks, JS utilities, or anything like that. In particular, I was most interested in writing a project from scratch and serving it directly from the edge. Normally, I would deploy a site to something like GitHub Pages, but avoiding the need for an origin server altogether seems like a really powerful (and performant) idea - let's try it!

I also considered using TodoMVC as the blueprint for building the functionality for the application, but even the Vanilla JS version is a pretty impressive amount of code, including a number of Node packages - it wasn't exactly a concise chunk of code to just dump into the Worker itself.

Instead, I decided to approach the beginnings of this project by building a simple, blank HTML page, and including it inside of the Worker. To start, we'll sketch something out locally, like this:

```html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <title>Todos</title>
  </head>
  <body>
    <h1>Todos</h1>
  </body>
</html>
```

Hold on to this code - we'll add it later, inside of the Workers script.
For the purposes of the tutorial, I'll be serving up this project at My personal website was already hosted on Cloudflare, and since I'll be serving, it was time to create my first Worker.

Creating a worker

Inside of my Cloudflare account, I hopped into the Workers tab and launched the Workers editor. This is one of my favorite features of the editor - working with your actual website, and understanding how the Worker will interface with your existing project.

The process of writing a Worker should be familiar to anyone who's used the fetch library before. In short, the default code for a Worker hooks into the fetch event, passing the request of that event into a custom function, handleRequest:

```javascript
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})
```

Within handleRequest, we make the actual request, using fetch, and return the response to the client. In short, we have a place to intercept the response body, but by default, we let it pass through:

```javascript
async function handleRequest(request) {
  console.log('Got request', request)
  const response = await fetch(request)
  console.log('Got response', response)
  return response
}
```

So, given this, where do we begin actually doing stuff with our Worker? Unlike the default code given to you in the Workers interface, we want to skip fetching the incoming request: instead, we'll construct a new Response, and serve it directly from the edge:

```javascript
async function handleRequest(request) {
  const response = new Response("Hello!")
  return response
}
```

Given that very small functionality we've added to the Worker, let's deploy it. Moving into the "Routes" tab of the Worker editor, I added the route* and attached it to the cloudflare-worker-todos script. Once attached, I deployed the Worker, and voila! Visiting in-browser gives me my simple "Hello!" response back.

Writing data to KV

The next step is to populate our todo list with actual data.
To do this, we'll make use of Cloudflare's Workers KV - it's a simple key-value store that you can access inside of your Worker script to read (and write, although it's less common) data.

To get started with KV, we need to set up a "namespace". All of our cached data will be stored inside that namespace, and given just a bit of configuration, we can access that namespace inside the script with a predefined variable. I'll create a new namespace called KRISTIAN_TODOS, and in the Worker editor, I'll expose the namespace by binding it to the variable KRISTIAN_TODOS.

Given the presence of KRISTIAN_TODOS in my script, it's time to understand the KV API. At the time of writing, a KV namespace has three primary methods you can use to interface with your cache: get, put, and delete. Pretty straightforward!

Let's start storing data by defining an initial set of data, which we'll put inside of the cache using the put method. I've opted to define an object, defaultData, instead of a simple array of todos: we may want to store metadata and other information inside of this cache object later on. Given that data object, I'll use JSON.stringify to put a simple string into the cache:

```javascript
async function handleRequest(request) {
  // ...previous code

  const defaultData = {
    todos: [
      {
        id: 1,
        name: 'Finish the Cloudflare Workers blog post',
        completed: false
      }
    ]
  }
  KRISTIAN_TODOS.put("data", JSON.stringify(defaultData))
}
```

The Workers KV data store is eventually consistent: writing to the cache means that it will become available eventually, but it's possible to attempt to read a value back from the cache immediately after writing it, only to find that the cache hasn't been updated yet.

Given the presence of data in the cache, and the assumption that our cache is eventually consistent, we should adjust this code slightly: first, we should actually read from the cache, parsing the value back out, and using it as the data source if it exists.
If it doesn't, we'll refer to defaultData, setting it as the data source for now (remember, it should be set in the future… eventually), while also setting it in the cache for future use. After breaking out the code into a few functions for simplicity, the result looks like this:

```javascript
const defaultData = {
  todos: [
    {
      id: 1,
      name: 'Finish the Cloudflare Workers blog post',
      completed: false
    }
  ]
}

const setCache = data => KRISTIAN_TODOS.put("data", data)
const getCache = () => KRISTIAN_TODOS.get("data")

async function getTodos(request) {
  // ... previous code

  let data
  const cache = await getCache()
  if (!cache) {
    await setCache(JSON.stringify(defaultData))
    data = defaultData
  } else {
    data = JSON.parse(cache)
  }
}
```

Rendering data from KV

Given the presence of data in our code, which is the cached data object for our application, we should actually take this data and make it available on screen.

In our Workers script, we'll make a new variable, html, and use it to build up a static HTML template that we can serve to the client. In handleRequest, we can construct a new Response (with a Content-Type header of text/html), and serve it to the client:

```javascript
const html = `
<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <title>Todos</title>
  </head>
  <body>
    <h1>Todos</h1>
  </body>
</html>
`

async function handleRequest(request) {
  const response = new Response(html, {
    headers: { 'Content-Type': 'text/html' }
  })
  return response
}
```

We have a static HTML site being rendered, and now we can begin populating it with data!
In the body, we'll add a ul tag with an id of todos:

```html
<body>
  <h1>Todos</h1>
  <ul id="todos"></ul>
</body>
```

Given that body, we can also add a script after the body that takes a todos array, loops through it, and for each todo in the array, creates a li element and appends it to the todos list:

```html
<script>
  window.todos = [];
  var todoContainer = document.querySelector("#todos");
  window.todos.forEach(todo => {
    var el = document.createElement("li");
    el.innerText =;
    todoContainer.appendChild(el);
  });
</script>
```

Our static page can take in window.todos, and render HTML based on it, but we haven't actually passed in any data from KV. To do this, we'll need to make a couple changes.

First, our html variable will change to a function. The function will take in an argument, todos, which will populate the window.todos variable in the above code sample:

```javascript
const html = todos => `
<!doctype html>
<html>
  <!-- ... -->
  <script>
    window.todos = ${todos || []}
    var todoContainer = document.querySelector("#todos");
    // ...
  </script>
</html>
`
```

In handleRequest, we can use the retrieved KV data to call the html function, and generate a Response based on it:

```javascript
async function handleRequest(request) {
  let data // Set data using cache or defaultData from the previous section...

  const body = html(JSON.stringify(data.todos))
  const response = new Response(body, {
    headers: { 'Content-Type': 'text/html' }
  })
  return response
}
```

The finished product looks something like this:

Adding todos from the UI

At this point, we've built a Cloudflare Worker that takes data from Cloudflare KV and renders a static page based on it. That static page reads the data, and generates a todo list based on that data. Of course, the piece we're missing is creating todos from inside the UI.
We know that we can add todos using the KV API - we could simply update the cache by saying KRISTIAN_TODOS.put(newData), but how do we update it from inside the UI?

It's worth noting here that Cloudflare's Workers documentation suggests that any writes to your KV namespace happen via their API - that is, at its simplest form, a cURL statement:

```shell
curl "<$ACCOUNT_ID/storage/kv/namespaces/$NAMESPACE_ID/values/first-key>" \
  -X PUT \
  -H "X-Auth-Email: $CLOUDFLARE_EMAIL" \
  -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" \
  --data 'My first value!'
```

We'll implement something similar by handling a second route in our worker, designed to watch for PUT requests to /. When a body is received at that URL, the worker will send the new todo data to our KV store.

I'll add this new functionality to my worker: in handleRequest, if the request method is a PUT, it will take the request body and update the cache:

```javascript
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

const setCache = data => KRISTIAN_TODOS.put("data", data)

async function updateTodos(request) {
  const body = await request.text()
  const ip = request.headers.get("CF-Connecting-IP")
  const cacheKey = `data-${ip}`
  try {
    JSON.parse(body)
    await setCache(body)
    return new Response(body, { status: 200 })
  } catch (err) {
    return new Response(err, { status: 500 })
  }
}

async function handleRequest(request) {
  if (request.method === "PUT") {
    return updateTodos(request)
  } else {
    // Defined in previous code block
    return getTodos(request)
  }
}
```

The script is pretty straightforward - we check that the request is a PUT, and wrap the remainder of the code in a try/catch block. First, we parse the body of the request coming in, ensuring that it is JSON, before we update the cache with the new data and return it to the user. If anything goes wrong, we simply return a 500.
If the route is hit with an HTTP method other than PUT - that is, GET, DELETE, or anything else - we return a 404. With this script, we can now add some "dynamic" functionality to our HTML page to actually hit this route.

First, we'll create an input for our todo "name", and a button for "submitting" the todo:

```html
<div>
  <input type="text" name="name" placeholder="A new todo"></input>
  <button id="create">Create</button>
</div>
```

Given that input and button, we can add a corresponding JavaScript function to watch for clicks on the button - once the button is clicked, the browser will PUT to / and submit the todo:

```javascript
var createTodo = function() {
  var input = document.querySelector("input[name=name]");
  if (input.value.length) {
    fetch("/", {
      method: 'PUT',
      body: JSON.stringify({ todos: todos })
    });
  }
};

document.querySelector("#create")
  .addEventListener('click', createTodo);
```

This code updates the cache, but what about our local UI? Remember that the KV cache is eventually consistent - even if we were to update our worker to read from the cache and return it, we have no guarantees it'll actually be up-to-date.
Instead, let's just update the list of todos locally, by taking our original code for rendering the todo list, making it a re-usable function called populateTodos, and calling it when the page loads and when the cache request has finished:

```javascript
var populateTodos = function() {
  var todoContainer = document.querySelector("#todos");
  todoContainer.innerHTML = null;
  window.todos.forEach(todo => {
    var el = document.createElement("li");
    el.innerText =;
    todoContainer.appendChild(el);
  });
};
populateTodos();

var createTodo = function() {
  var input = document.querySelector("input[name=name]");
  if (input.value.length) {
    todos = [].concat(todos, {
      id: todos.length + 1,
      name: input.value,
      completed: false,
    });
    fetch("/", {
      method: 'PUT',
      body: JSON.stringify({ todos: todos })
    });
    populateTodos();
    input.value = "";
  }
};

document.querySelector("#create")
  .addEventListener('click', createTodo);
```

With the client-side code in place, deploying the new Worker should put all these pieces together. The result is an actual dynamic todo list!

Updating todos from the UI

For the final piece of our (very) basic todo list, we need to be able to update todos - specifically, marking them as completed.

Luckily, a great deal of the infrastructure for this work is already in place. We can currently update the todo list data in our cache, as evidenced by our createTodo function. Performing updates on a todo, in fact, is much more of a client-side task than a Worker-side one!

To start, let's update the client-side code for generating a todo. Instead of a ul-based list, we'll migrate the todo container and the todos themselves into using divs:

```html
<!-- <ul id="todos"></ul> becomes... -->
<div id="todos"></div>
```

The populateTodos function can be updated to generate a div for each todo.
In addition, we’ll move the name of the todo into a child element of that div: var populateTodos = function() { var todoContainer = document.querySelector("#todos"); todoContainer.innerHTML = null; window.todos.forEach(todo => { var el = document.createElement("div"); var name = document.createElement("span"); name.innerText = todo.name; el.appendChild(name); todoContainer.appendChild(el); }); } So far, we’ve designed the client-side part of this code to take an array of todos in, and given that array, render out a list of simple HTML elements. There are a number of things that we’ve been doing that we haven’t quite had a use for, yet: specifically, the inclusion of IDs, and updating the completed value on a todo. Luckily, these things work well together, in order to support actually updating todos in the UI.To start, it would be useful to signify the ID of each todo in the HTML. By doing this, we can then refer to the element later, in order to correspond it to the todo in the JavaScript part of our code. Data attributes, and the corresponding dataset method in JavaScript, are a perfect way to implement this. When we generate our div element for each todo, we can simply attach a data attribute called todo to each div: window.todos.forEach(todo => { var el = document.createElement("div"); el.dataset.todo = todo.id; // ... more setup todoContainer.appendChild(el); }); Inside our HTML, each div for a todo now has an attached data attribute, which looks like: <div data-todo="1"></div> <div data-todo="2"></div> Now we can generate a checkbox for each todo element. This checkbox will default to unchecked for new todos, of course, but we can mark it as checked as the element is rendered in the window: window.todos.forEach(todo => { var el = document.createElement("div"); el.dataset.todo = todo.id; var name = document.createElement("span"); name.innerText = todo.name; var checkbox = document.createElement("input"); checkbox.type = "checkbox"; checkbox.checked = todo.completed ? 1 : 0; el.appendChild(checkbox); el.appendChild(name); todoContainer.appendChild(el); }) The checkbox is set up to correctly reflect the value of completed on each todo, but it doesn’t yet update when we actually check the box! To do this, we’ll add an event listener on the click event, calling completeTodo. Inside the function, we’ll inspect the checkbox element, finding its parent (the todo div), and using the todo data attribute on it to find the corresponding todo in our data. Given that todo, we can toggle the value of completed, update our data, and re-render the UI: var completeTodo = function(evt) { var checkbox = evt.target; var todoElement = checkbox.parentNode; var newTodoSet = [].concat(window.todos); var todo = newTodoSet.find(t => t.id == todoElement.dataset.todo); todo.completed = !todo.completed; todos = newTodoSet; updateTodos(); } The final result of our code is a system that simply checks the todos variable, updates our Cloudflare KV cache with that value, and then does a straightforward re-render of the UI based on the data it has locally.Conclusions and next stepsWith this, we’ve created a pretty remarkable project: an almost entirely static HTML/JS application, transparently powered by Cloudflare KV and Workers, served at the edge. There are a number of additions to be made to this application, whether you want to implement a better design (I’ll leave this as an exercise for readers), security, speed, etc.One interesting and fairly trivial addition is implementing per-user caching. Of course, right now, the cache key is simply “data”: anyone visiting the site will share a todo list with any other user. Because we have the request information inside of our worker, it’s easy to make this data user-specific.
For instance, we can implement per-user caching by generating the cache key based on the requesting IP: const ip = request.headers.get("CF-Connecting-IP") const cacheKey = `data-${ip}`; const getCache = key => KRISTIAN_TODOS.get(key) getCache(cacheKey) One more deploy of our Workers project, and we have a full todo list application, with per-user functionality, served at the edge!The final version of our Workers script looks like this: const html = todos => ` <!DOCTYPE html> <html> <head> <meta charset="UTF-8"> <meta name="viewport" content="width=device-width,initial-scale=1"> <title>Todos</title> <link href="" rel="stylesheet"></link> </head> <body class="bg-blue-100"> <div class="w-full h-full flex content-center justify-center mt-8"> <div class="bg-white shadow-md rounded px-8 pt-6 py-8 mb-4"> <h1 class="block text-grey-800 text-md font-bold mb-2">Todos</h1> <div class="flex"> <input class="shadow appearance-none border rounded w-full py-2 px-3 text-grey-800 leading-tight focus:outline-none focus:shadow-outline" type="text" name="name" placeholder="A new todo"></input> <button class="bg-blue-500 hover:bg-blue-800 text-white font-bold ml-2 py-2 px-4 rounded focus:outline-none focus:shadow-outline" id="create" type="submit">Create</button> </div> <div class="mt-4" id="todos"></div> </div> </div> </body> <script> window.todos = ${todos || []} var updateTodos = function() { fetch("/", { method: 'PUT', body: JSON.stringify({ todos: window.todos }) }) populateTodos() } var completeTodo = function(evt) { var checkbox = evt.target var todoElement = checkbox.parentNode var newTodoSet = [].concat(window.todos) var todo = newTodoSet.find(t => t.id == todoElement.dataset.todo) todo.completed = !todo.completed window.todos = newTodoSet updateTodos() } var populateTodos = function() { var todoContainer = document.querySelector("#todos") todoContainer.innerHTML = null window.todos.forEach(todo => { var el = document.createElement("div") el.className = "border-t py-4" el.dataset.todo = todo.id var name = document.createElement("span") name.className = todo.completed ? "line-through" : "" name.innerText = todo.name var checkbox = document.createElement("input") checkbox.className = "mx-4" checkbox.type = "checkbox" checkbox.checked = todo.completed ? 1 : 0 checkbox.addEventListener('click', completeTodo) el.appendChild(checkbox) el.appendChild(name) todoContainer.appendChild(el) }) } populateTodos() var createTodo = function() { var input = document.querySelector("input[name=name]") if (input.value.length) { window.todos = [].concat(todos, { id: window.todos.length + 1, name: input.value, completed: false }) input.value = "" updateTodos() } } document.querySelector("#create").addEventListener('click', createTodo) </script> </html> ` const defaultData = { todos: [] } const setCache = (key, data) => KRISTIAN_TODOS.put(key, data) const getCache = key => KRISTIAN_TODOS.get(key) async function getTodos(request) { const ip = request.headers.get('CF-Connecting-IP') const cacheKey = `data-${ip}` let data const cache = await getCache(cacheKey) if (!cache) { await setCache(cacheKey, JSON.stringify(defaultData)) data = defaultData } else { data = JSON.parse(cache) } const body = html(JSON.stringify(data.todos || [])) return new Response(body, { headers: { 'Content-Type': 'text/html' }, }) } async function updateTodos(request) { const body = await request.text() const ip = request.headers.get('CF-Connecting-IP') const cacheKey = `data-${ip}` try { JSON.parse(body) await setCache(cacheKey, body) return new Response(body, { status: 200 }) } catch (err) { return new Response(err.toString(), { status: 500 }) } } async function handleRequest(request) { if (request.method === 'PUT') { return updateTodos(request) } else { return getTodos(request) } } addEventListener('fetch', event => { event.respondWith(handleRequest(event.request)) }) You can find the source code for this project, as well as a README with deployment instructions, on GitHub.
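The client-side flow above (append a todo with a sequential id, then toggle completed via the data-todo attribute) can be captured as two pure functions and unit-tested outside the browser. This is a minimal sketch; the names addTodo and toggleTodo are illustrative, not part of the original script:

```javascript
// Append a new todo, assigning the next sequential id (as createTodo does).
function addTodo(todos, name) {
  return [].concat(todos, { id: todos.length + 1, name: name, completed: false });
}

// Toggle the completed flag for the todo whose id matches the element's
// data-todo attribute (as completeTodo does). The loose == comparison is
// deliberate: dataset values are always strings, while ids are numbers.
function toggleTodo(todos, dataTodoId) {
  var next = todos.map(t => Object.assign({}, t));
  var todo = next.find(t => t.id == dataTodoId);
  if (todo) todo.completed = !todo.completed;
  return next;
}
```

In the page itself, each event handler would set window.todos to the result, PUT JSON.stringify({ todos }) to the Worker, and call populateTodos() to re-render.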

Workers KV — Cloudflare's distributed database

Today, we’re excited to announce Workers KV is entering general availability and is ready for production use!What is Workers KV?Workers KV is a highly distributed, eventually consistent key-value store that spans Cloudflare's global edge. It allows you to store billions of key-value pairs and read them with ultra-low latency anywhere in the world. Now you can build entire applications with the performance of a CDN static cache.Why did we build it?Workers is a platform that lets you run JavaScript on Cloudflare's global edge of 175+ data centers. With only a few lines of code, you can route HTTP requests, modify responses, or even create new responses without an origin server.// A Worker that handles a single redirect, // such a humble beginning... addEventListener("fetch", event => { event.respondWith(handleOneRedirect(event.request)) }) async function handleOneRedirect(request) { let url = new URL(request.url) let device = request.headers.get("CF-Device-Type") // If the device is mobile, add a prefix to the hostname. // (eg. example.com becomes mobile.example.com) if (device === "mobile") { url.hostname = "mobile." + url.hostname return Response.redirect(url, 302) } // Otherwise, send request to the original hostname. return await fetch(request) } Customers quickly came to us with use cases that required a way to store persistent data. Following our example above, it's easy to handle a single redirect, but what if you want to handle billions of them? You would have to hard-code them into your Workers script, fit it all in under 1 MB, and re-deploy it every time you wanted to make a change - yikes! That’s why we built Workers KV.// A Worker that can handle billions of redirects, // now that's more like it! addEventListener("fetch", event => { event.respondWith(handleBillionsOfRedirects(event.request)) }) async function handleBillionsOfRedirects(request) { let prefix = "/redirect" let url = new URL(request.url) // Check if the URL is a special redirect.
// (eg. a path like /redirect/<random-hash>) if (url.pathname.startsWith(prefix)) { // REDIRECTS is a custom variable that you define, // it binds to a Workers KV "namespace." (aka. a storage bucket) let redirect = await REDIRECTS.get(url.pathname.replace(prefix, "")) if (redirect) { url.pathname = redirect return Response.redirect(url, 302) } } // Otherwise, send request to the original path. return await fetch(request) } With only a few changes from our previous example, we scaled from one redirect to billions - that's just a taste of what you can build with Workers KV.How does it work?Distributed data stores are often modeled using the CAP Theorem, which states that distributed systems can only pick between 2 out of the 3 following guarantees:Consistency - is my data the same everywhere?Availability - is my data accessible all the time?Partition tolerance - is my data resilient to regional outages?Workers KV chooses to guarantee Availability and Partition tolerance. This combination is known as eventual consistency, which presents Workers KV with two unique competitive advantages:Reads are ultra fast (median of 12 ms) since it's powered by our caching technology.Data is available across 175+ edge data centers and resilient to regional outages.There are tradeoffs to eventual consistency, however. If two clients write different values to the same key at the same time, the last client to write eventually "wins" and its value becomes globally consistent.
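This last-write-wins behaviour can be illustrated with a toy model, assuming each write carries a timestamp and the write with the highest timestamp is the one that eventually becomes globally visible (a deliberate simplification, not Cloudflare's actual implementation):

```javascript
// Toy last-write-wins resolution: given the buffered writes from several
// clients to one key, convergence keeps the write with the latest timestamp.
function converge(writes) {
  // writes: [{ value, ts }] — one entry per client write to the same key.
  return writes.reduce((winner, w) => (w.ts > winner.ts ? w : winner));
}
```

Whichever order the writes arrive in, every replica that applies this rule ends up agreeing: the write with the latest timestamp wins.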
This also means that if a client writes to a key and that same client reads that same key, the values may be inconsistent for a short amount of time.To help visualize this scenario, here's a real-life example amongst three friends:Suppose Matthew, Michelle, and Lee are planning their weekly lunch.Matthew decides they're going out for sushi.Matthew tells Michelle their sushi plans, Michelle agrees.Lee, not knowing the plans, tells Michelle they're actually having pizza.An hour later, Michelle and Lee are waiting at the pizza parlor while Matthew is sitting alone at the sushi restaurant — what went wrong? We can chalk this up to eventual consistency, because after waiting for a few minutes, Matthew looks at his updated calendar and eventually finds the new truth, they're going out for pizza instead.While it may take minutes in real-life, Workers KV is much faster. It can achieve global consistency in less than 60 seconds. Additionally, when a Worker writes to a key, then immediately reads that same key, it can expect the values to be consistent if both operations came from the same location.When should I use it?Now that you understand the benefits and tradeoffs of using eventual consistency, how do you determine if it's the right storage solution for your application? Simply put, if you want global availability with ultra-fast reads, Workers KV is right for you.However, if your application is frequently writing to the same key, there is an additional consideration. We call it "the Matthew question": Are you okay with the Matthews of the world occasionally going to the wrong restaurant?You can imagine use cases (like our redirect Worker example) where this doesn't make any material difference. 
But if you decide to keep track of a user’s bank account balance, you would not want the possibility of two balances existing at once, since a user could purchase something with money they’ve already spent.What can I build with it?Here are a few examples of applications that have been built with KV:Mass redirects - handle billions of HTTP redirects.User authentication - validate user requests to your API.Translation keys - dynamically localize your web pages.Configuration data - manage who can access your origin.Step functions - sync state data between multiple API functions.Edge file store - host large amounts of small files.We’ve highlighted several of those use cases in our previous blog post. We also have some more in-depth code walkthroughs, including a recently published blog post on how to build an online To-do list with Workers KV.What's new since beta?By far, our most common request was to make it easier to write data to Workers KV. That's why we're releasing three new ways to make that experience even better:1. Bulk WritesIf you want to import your existing data into Workers KV, you don't want to go through the hassle of sending an HTTP request for every key-value pair. That's why we added a bulk endpoint to the Cloudflare API. Now you can upload up to 10,000 pairs (up to 100 MB of data) in a single PUT request.curl " \ $ACCOUNT_ID/storage/kv/namespaces/$NAMESPACE_ID/bulk" \ -X PUT \ -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" \ -H "X-Auth-Email: $CLOUDFLARE_AUTH_EMAIL" \ -d '[ {"key": "built_by", "value": "kyle, alex, charlie, andrew, and brett"}, {"key": "reviewed_by", "value": "joaquin"}, {"key": "approved_by", "value": "steve"} ]' Let's walk through an example use case: you want to off-load your website translation to Workers.
Since you're reading translation keys frequently and only occasionally updating them, this application works well with the eventual consistency model of Workers KV.In this example, we hook into Crowdin, a popular platform to manage translation data. This Worker responds to a /translate endpoint, downloads all your translation keys, and bulk writes them to Workers KV so you can read them later at our edge:addEventListener("fetch", event => { if (new URL(event.request.url).pathname === "/translate") { event.respondWith(uploadTranslations()) } }) async function uploadTranslations() { // Ask crowdin for all of our translations. var response = await fetch( "" + "/:ci_project_id/download/") // If crowdin is responding, parse the response into // a single json with all of our translations. if (response.ok) { var translations = await zipToJson(response) return await bulkWrite(translations) } // Return the errored response from crowdin. return response } async function bulkWrite(keyValuePairs) { return fetch( "" + "/:cf_account_id/storage/kv/namespaces/:cf_namespace_id/bulk", { method: "PUT", headers: { "Content-Type": "application/json", "X-Auth-Key": ":cf_auth_key", "X-Auth-Email": ":cf_email" }, body: JSON.stringify(keyValuePairs) } ) } async function zipToJson(response) { // ... omitted for brevity ... // (eg. return [ {key: "hello.EN", value: "Hello World"}, {key: "hello.ES", value: "Hola Mundo"} ] } Now, when you want to translate a page, all you have to do is read from Workers KV:async function translate(keys, lang) { // You bind your translations namespace to the TRANSLATIONS variable. return Promise.all(keys.map(key => TRANSLATIONS.get(key + "." + lang))) } 2. Expiring KeysBy default, key-value pairs stored in Workers KV last forever. However, sometimes you want your data to auto-delete after a certain amount of time. That's why we're introducing the expiration and expirationTtl options for write operations.// Key expires 60 seconds from now.
NAMESPACE.put("myKey", "myValue", {expirationTtl: 60}) // Key expires if the UNIX epoch is in the past. NAMESPACE.put("myKey", "myValue", {expiration: 1247788800}) # You can also set keys to expire from the Cloudflare API. curl " \ $ACCOUNT_ID/storage/kv/namespaces/$NAMESPACE_ID/ \ values/$KEY?expiration_ttl=$EXPIRATION_IN_SECONDS" -X PUT \ -H "X-Auth-Key: $CLOUDFLARE_AUTH_KEY" \ -H "X-Auth-Email: $CLOUDFLARE_AUTH_EMAIL" \ -d "$VALUE" Let's say you want to block users who have been flagged as inappropriate from your website, but only for a week. With an expiring key, you can set the expiration time and not have to worry about deleting it later.In this example, we assume users and IP addresses are one and the same. If your application has authentication, you could use access tokens as the key identifier.addEventListener("fetch", event => { var url = new URL(event.request.url) // An internal API that blocks a new user IP. // (eg. /block/<ip>) if (url.pathname.startsWith("/block")) { var ip = url.pathname.split("/").pop() event.respondWith(blockIp(ip)) } else { // Other requests check if the IP is blocked. event.respondWith(handleRequest(event.request)) } }) async function blockIp(ip) { // Values are allowed to be empty in KV, // we don't need to store any extra information anyway. await BLOCKED.put(ip, "", {expirationTtl: 60*60*24*7}) return new Response("ok") } async function handleRequest(request) { var ip = request.headers.get("CF-Connecting-IP") if (ip) { var blocked = await BLOCKED.get(ip) // If we detect an IP and it's blocked, respond with a 403 error. if (blocked) { return new Response("You are blocked!", { status: 403 }) } } // Otherwise, pass through the original request. return fetch(request) } 3. Larger ValuesWe've increased our size limit on values from 64 kB to 2 MB.
This is quite useful if you need to store buffer-based or file data in Workers KV.Consider this scenario: you want to let your users upload their favorite GIF to their profile without having to store these GIFs as binaries in your database or managing another cloud storage bucket.Workers KV is a great fit for this use case! You can create a Workers KV namespace for your users’ GIFs that is fast and reliable wherever your customers are located.In this example, users upload a link to their favorite GIF, then a Worker downloads it and stores it to Workers KV.addEventListener("fetch", event => { var url = new URL(event.request.url) var arg = url.pathname.split("/").pop() // User sends a URI encoded link to the GIF they wish to upload. // (eg. /api/upload_gif/<encoded-uri>) if (url.pathname.startsWith("/api/upload_gif")) { event.respondWith(uploadGif(event.request, arg)) // Profile contains link to view the GIF. // (eg. /api/view_gif/<username>) } else if (url.pathname.startsWith("/api/view_gif")) { event.respondWith(getGif(arg)) } }) async function uploadGif(request, url) { // Fetch the GIF from the Internet. var gif = await fetch(decodeURIComponent(url)) var buffer = await gif.arrayBuffer() // Upload the GIF as a buffer to Workers KV, keyed by the uploader // (as before, we treat the requesting IP as the username). var username = request.headers.get("CF-Connecting-IP") await GIFS.put(username, buffer) return gif } async function getGif(username) { var gif = await GIFS.get(username, "arrayBuffer") // If the user has set one, respond with the GIF. if (gif) { return new Response(gif, {headers: {"Content-Type": "image/gif"}}) } else { return new Response("User has no GIF!", { status: 404 }) } } Lastly, we want to thank all of our beta customers. It was your valuable feedback that led us to develop these changes to Workers KV. Make sure to stay in touch with us; we're always looking ahead for what's next and we love hearing from you!PricingWe’re also ready to announce our GA pricing.
If you're one of our Enterprise customers, your pricing obviously remains unchanged.$0.50 / GB of data stored, 1 GB included$0.50 / million reads, 10 million included$5 / million write, list, and delete operations, 1 million includedDuring the beta period, we learned that customers don't just want to read values at our edge; they want to write values from our edge too. Since there is high demand for these edge operations, which are more costly, we now charge for non-read operations beyond the included monthly amounts.LimitsAs mentioned earlier, we increased our value size limit from 64 kB to 2 MB. We've also removed our cap on the number of keys per namespace - it's now unlimited. Here are our GA limits:Up to 20 namespaces per account, each with unlimited keysKeys of up to 512 bytes and values of up to 2 MBUnlimited writes per second for different keysOne write per second for the same keyUnlimited reads per second per keyTry it out now!Now open to all customers, you can start using Workers KV today from your Cloudflare dashboard under the Workers tab. You can also look at our updated documentation.We're really excited to see what you all can build with Workers KV!
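For local unit testing of Worker logic that uses expiring keys, a small in-memory stand-in for a KV namespace can be handy. This mock is our own sketch, not a Cloudflare API; the injected clock (seconds since the UNIX epoch) lets tests control time:

```javascript
// Minimal in-memory mock of a KV namespace supporting the expiration and
// expirationTtl write options. The clock parameter is injected so tests
// can advance time deterministically.
class MockKV {
  constructor(clock = () => Date.now() / 1000) {
    this.store = new Map();
    this.clock = clock;
  }
  async put(key, value, opts = {}) {
    let expiration = opts.expiration;
    if (opts.expirationTtl) expiration = this.clock() + opts.expirationTtl;
    this.store.set(key, { value, expiration });
  }
  async get(key) {
    const entry = this.store.get(key);
    if (!entry) return null;
    // Expired entries behave as if deleted.
    if (entry.expiration && entry.expiration <= this.clock()) {
      this.store.delete(key);
      return null;
    }
    return entry.value;
  }
}
```

Binding a Worker's namespace variable (e.g. BLOCKED) to an instance of this class lets the blocking example above be exercised without deploying.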

One night in Beijing

As the old saying goes, good things come in pairs, 好事成双! The month of May marks a double celebration in China for our customers, partners and Cloudflare.First and ForemostA Beijing Customer Appreciation Cocktail was held in the heart of Beijing at Yintai Centre Xiu Rooftop Garden Bar on 10 May 2019, an RSVP event graced by our supportive group of partners and customers. We have been blessed with almost 10 years of strong growth at Cloudflare, sharing our belief in providing access to Internet security and performance to customers of all sizes and industries. This success has been the result of collaboration between our developers and our product team, represented at the event by our special guest Jen Taylor, our Global Head of Product, along with business leaders Xavier Cai, Head of our China business; Aliza Knox, Head of our APAC business; and James Ball, our Head of Solutions Engineering for APAC. Most importantly, it has been the result of the trust and faith that our partners, such as Baidu, and customers have placed in us.Double Happiness, 双喜That same week, we embarked on another exciting journey in China with our grand office opening at WeWork. The Beijing team spans functions from Customer Development to Solutions Engineering and Customer Success, led by Xavier, Head of our China business. The team has doubled in size since it started last year.We continue to invest in China, growing our customer base and, importantly, our methods for supporting our customers here as well. Those of us who came from different parts of the world are also looking to learn from the wisdom and experience of our customers in this market. And to that end, we look forward to many more years of openness, trust, and mutual success.感谢所有花时间来参加我们这次北京鸡尾酒会的客户和合作伙伴,谢谢各位对此活动的大力支持与热烈交流! (Thank you to all the customers and partners who took the time to attend our Beijing cocktail reception; we are grateful for your great support and lively exchanges!)

One more thing... new Speed Page

Congratulations on making it through Speed Week. In the last week, Cloudflare has: described how our global network speeds up the Internet, launched an HTTP/2 prioritisation model that will improve web experiences on all browsers, launched an image resizing service which will deliver the optimal image to every device, optimized live video delivery, detailed how to stream progressive images so that they render twice as fast - using the flexibility of our new HTTP/2 prioritisation model - and finally, prototyped a new over-the-wire format for JavaScript that could improve application start-up performance especially on mobile devices. As a bonus, we’re also rolling out one more new feature: “TCP Turbo” automatically chooses the TCP settings to further accelerate your website.As a company, we want to help every one of our customers improve web experiences. The growth of Cloudflare, along with the increase in features, has often made simple questions difficult to answer:How fast is my website?How should I be thinking about performance features?How much faster would the site be if I were to enable a particular feature?This post will describe the exciting changes we have made to the Speed Page on the Cloudflare dashboard to give our customers a much clearer understanding of how their websites are performing and how they can be made even faster. The new Speed Page consists of:A visual comparison of your website loading on Cloudflare, with caching enabled, compared to connecting directly to the origin.The measured improvement expected if any performance feature is enabled.A report describing how fast your website is on desktop and mobile.We want to simplify the complexity of making web experiences fast and give our customers control. Take a look - we hope you like it.Why do fast web experiences matter?Customer experience: No one likes slow service.
Imagine if you go to a restaurant and the service is slow, especially when you arrive; you are not likely to go back or recommend it to your friends. It turns out the web works in the same way and Internet customers are even more demanding. As many as 79% of customers who are “dissatisfied” with a website’s performance are less likely to buy from that site again.Engagement and Revenue: There are many studies explaining how speed affects customer engagement, bounce rates and revenue.Reputation: There is also brand reputation to consider, as customers associate an online experience with the brand. One study found that for 66% of those sampled, website performance influences their impression of the company.Diversity: Mobile traffic has grown to be larger than its desktop counterpart over the last few years. Mobile customers have become increasingly demanding and expect seamless Internet access regardless of location. Mobile provides a new set of challenges that includes the diversity of device specifications. When testing, be aware that the average mobile device is significantly less capable than the top-of-the-range models. For example, there can be orders-of-magnitude disparity in the time different mobile devices take to run JavaScript. Another challenge is the variance in mobile performance, as customers move from a strong, high-quality office network to mobile networks of different speeds (3G/5G) and quality within the same browsing session.New Speed PageThere is compelling evidence that a faster web experience is important for anyone online. Most of the major studies involve the largest tech companies, who have whole teams dedicated to measuring and improving web experiences for their own services. At Cloudflare we are on a mission to help build a better and faster Internet for everyone - not just the selected few. Delivering fast web experiences is not a simple matter.
That much is clear.To know what to send and when requires a deep understanding of every layer of the stack, from TCP tuning, protocol-level prioritisation and content delivery formats through to the intricate mechanics of browser rendering. You will also need a global network that strives to be within 10 ms of every Internet user. The intrinsic value of such a network should be clear to everyone. Cloudflare has this network, but it also offers many additional performance features.With the Speed Page redesign, we are emphasizing the performance benefits of using Cloudflare and the additional improvements possible from our features.The de facto standard for measuring website performance has been WebPageTest. Having its creator in-house at Cloudflare encouraged us to use it as the basis for website performance measurement. So, what is the easiest way to understand how a web page loads? A list of statistics does not paint a full picture of actual user experience. One of the cool features of WebPageTest is that it can generate a filmstrip of screen snapshots taken during a web page load, enabling us to quantify how a page loads, visually. This view makes it significantly easier to determine how long the page is blank for, and how long it takes for the most important content to render. Being able to look at the results in this way provides the ability to empathise with the user.How fast on Cloudflare?After moving your website to Cloudflare, you may have asked: How fast did this decision make my website? Well, now we provide the answer:Comparison of website performance using Cloudflare. As well as the increase in speed, we provide filmstrips of before and after, so that it is easy to compare and understand how differently a user will experience the website.
If our tests are unable to reach your origin and you are already set up on Cloudflare, we will test with development mode enabled, which disables caching and minification.Site performance statisticsHow can we measure the user experience of a website?Traditionally, page load was the important metric. Page load is a technical measurement used by browser vendors that has no bearing on the presentation or usability of a page. The metric reports on how long it takes not only to load the important content but also all of the 3rd party content (social network widgets, advertising, tracking scripts, etc.). A user may very well not see anything until after all the page content has loaded, or they may be able to interact with a page immediately, while content continues to load.A user will not decide whether a page is fast by a single measure or moment. A user will perceive how fast a website is from a combination of factors:when they see any responsewhen they see the content they expectwhen they can interact with the pagewhen they can perform the task they intendedExperience has shown that if you focus on one measure, it will likely be to the detriment of the others.Importance of Visual responseIf an impatient user navigates to your site and sees no content for several seconds, or no valuable content, they are likely to get frustrated and leave. The paint timing spec defines a set of paint metrics (points in time when content appears on a page) to measure the key moments in how a user perceives performance. First Contentful Paint (FCP) is the time when the browser first renders any DOM content. First Meaningful Paint (FMP) is the point in time when the page’s “primary” content appears on the screen. This metric should relate to what the user has come to the site to see and is designed as the point in time when the largest visible layout change happens. Speed Index attempts to quantify the value of the filmstrip rather than using a single paint timing.
The speed index measures the rate at which content is displayed - essentially the area above the curve. In the chart below from our progressive image feature you can see reaching 80% happens much earlier for the parallelized (red) load rather than the regular (blue) one.Importance of interactivityThe same impatient user is now happy that the content they want to see has appeared. They will still become frustrated if they are unable to interact with the site. Time to Interactive is the time it takes for content to be rendered and the page to be ready to receive input from the user. Technically this is defined as when the browser’s main processing thread has been idle for several seconds after first meaningful paint.The Speed Tab displays these key metrics for mobile and desktop.How much faster on Cloudflare?The Cloudflare Dashboard provides a list of performance features which can, admittedly, be both confusing and daunting. What would be the benefit of turning on Rocket Loader, and on which performance metrics will it have the most impact? If you upgrade to Pro, what will be the value of the enhanced HTTP/2 prioritisation? The optimization section answers these questions. Tests are run with each performance feature turned on and off. The values for the tests for the appropriate performance metrics are displayed, along with the improvement. You can enable or upgrade the feature from this view. Here are a few examples:If Rocket Loader were enabled for this website, the render-blocking JavaScript would be deferred, causing first paint time to drop from 1.25s to 0.81s - an improvement of 32% on desktop.Image heavy sites do not perform well on slow mobile connections.
If you enable Mirage, your customers on 3G connections would see meaningful content 1s sooner - an improvement of 29.4%.

So how about our new features?

We tested the enhanced HTTP/2 prioritisation feature in an Edge browser on desktop and saw meaningful content display 2s sooner - an improvement of 64%.

A more interesting result comes from the blog example used to illustrate progressive image streaming. At first glance the improvement of 29% in speed index looks good, but the filmstrip comparison shows a more significant difference. In this case the page with no images shown is already 43% visually complete in both scenarios after 1.5s; at 2.5s the difference is 77% compared to 50%. This is a great example of how metrics do not tell the full story. They cannot completely replace viewing the page loading flow and understanding what is important for your site.

How to try

This is our first iteration of the new Speed Page and we are eager to get your feedback. We will be rolling it out to beta customers who are interested in seeing how their sites perform. To be added to the queue for activation of the new Speed Page, please click on the banner on the overview page or on the existing Speed Page.

EU election season and securing online democracy

It’s election season in Europe, as European Parliament seats are contested across the European Union by national political parties. With approximately 400 million people eligible to vote, this is one of the biggest democratic exercises in the world - second only to India - and it takes place once every five years. Over the course of four days, 23-26 May 2019, each of the 28 EU countries will elect a different number of Members of the European Parliament (“MEPs”) roughly mapped to population size and based on a proportional system. The 751 newly elected MEPs (a number which includes the UK’s allocation for the time being) will take their seats in July. These elections are not only important because the European Parliament plays a large role in the EU democratic system, being a co-legislator alongside the European Council, but as the French President Emmanuel Macron has described, these European elections will be decisive for the future of the continent.

Election security: an EU political priority

Political focus on the potential cybersecurity threat to the EU elections has been extremely high, and various EU institutions and agencies have been engaged in a long campaign to drive awareness among EU Member States and to help political parties prepare. Last month for example, more than 80 representatives from the European Parliament, EU Member States, the European Commission and the European Agency for Network and Information Security (ENISA) gathered for a table-top exercise to test the EU's response to potential incidents. The objective of the exercise was to test the efficacy of EU Member States’ practices and crisis plans, to acquire an overview of the level of resilience across the EU, and to identify potential gaps and adequate mitigation measures.
Earlier this year, ENISA published a paper on EU-wide election security which described how, as a result of the large attack surface inherent to elections, the risks concern not only government election systems but also extend to individual candidates and individual political campaigns. Examples of attack vectors that affect election processes include spear phishing, data theft, online disinformation, malware, and DDoS attacks. ENISA went on to propose that election systems, processes and infrastructures be classified as critical infrastructure, and that a legal obligation be put in place requiring political organisations to deploy a high level of cybersecurity.

Last September, in his State of the Union address, European Commission President Juncker announced a package of initiatives aimed at ensuring that the EU elections are organised in a free, fair and secure manner. EU Member States subsequently set up a national cooperation network of relevant authorities – such as electoral, cybersecurity, data protection and law enforcement authorities – and appointed contact points to take part in a European cooperation network for elections. In July 2018, the Cooperation Group set up under the EU NIS Directive (composed of Member States, the European Commission and ENISA) issued a detailed report, "Compendium on Cyber Security of Election Technology". The report outlined how election processes typically extend over a long life cycle consisting of several phases, and that the presentation layer is as important as the correct vote count and the protection of the interface where citizens learn of the election results.

Estonia - a country known to be a digital leader when it comes to eGovernment services - is currently the only EU country that offers its citizens the option to cast their ballot online.
However, even electoral systems that rely exclusively on paper voting typically take advantage of digital tools and services in compiling voter rolls, candidate registration, or result tabulation and communication. The report described various election/cyber incidents witnessed at EU Member State level and the methods used. As electoral systems vary greatly across the EU, the NIS Cooperation Group ultimately recommended that tools, procedures, technologies and protection measures follow a “pick and mix” approach, which can include DDoS protection, network flow analysis and monitoring, and use of a CDN. Cloudflare provides all these services and more, helping to prevent the defacement of public-facing websites and Denial of Service attacks, and ensuring the high availability and performance of web pages which need to be capable of withstanding a significant traffic load at peak times.

Cloudflare’s election security experience

Cloudflare’s CTO John Graham-Cumming recently spoke at a session in Brussels which explored Europe’s cyber-readiness for the EU elections. He outlined that while sophisticated cyber attacks are on the rise, humans can often be the weakest link. Strong password protection, two-factor authentication and a keen eye for phishing scams can go a long way in thwarting attackers’ attempts to penetrate campaign and voting web properties. John also described Cloudflare’s experience in running the Athenian Project, which provides free enterprise-level services to government election and voter registration websites.

Source: Politico

Cloudflare has protected most of the major U.S. presidential campaign websites from cyberattacks, including the Trump/Pence campaign website, the website for the campaign of Senator Bernie Sanders, and websites for 14 of the 15 leading candidates from the two political parties. We have also protected election websites in countries like Peru, Ecuador and, most recently, North Macedonia.
Is Europe cyber-ready?

Thanks to the high-profile awareness campaign across the EU, Europeans have had time to prepare and to look for solutions according to their needs. Election interference is certainly not a new phenomenon; however, the scale of the current threat is unprecedented, and clever disinformation campaigns are also now in play. Experts have recently identified techniques such as spear phishing and DDoS attacks as particular threats to watch for, and the European Commission has been monitoring industry progress under the Code of Practice on Disinformation, which has encouraged platforms such as Google, Twitter and Facebook to take action to fight against malicious bots and fake accounts.

What is clear is that this can only ever be a coordinated effort, with both governments and industry working together to ensure a robust response to any threats to the democratic process. For its part, Cloudflare is protecting a number of political group websites across the EU, and we have been seeing Layer 4 and Layer 7 DDoS attacks, as well as pen testing and firewall probing attempts. Incidents this month have included attacks against Swedish, French, Spanish and UK web properties, with particularly high activity across the board around 8th May. As the elections approach, we can expect the volume and spread of attacks to increase.

Further information about the European elections can be found here - and if you are based in Europe, don’t forget to vote!

Cloudflare architecture and how BPF eats the world

Recently at Netdev 0x13, the Conference on Linux Networking in Prague, I gave a short talk titled "Linux at Cloudflare". The talk ended up being mostly about BPF. It seems, no matter the question - BPF is the answer. Here is a transcript of a slightly adjusted version of that talk.

At Cloudflare we run Linux on our servers. We operate two categories of data centers: large "Core" data centers, processing logs, analyzing attacks and computing analytics, and the "Edge" server fleet, delivering customer content from 180 locations across the world. In this talk, we will focus on the "Edge" servers. It's here where we use the newest Linux features, optimize for performance and care deeply about DoS resilience.

Our edge service is special due to our network configuration - we extensively use anycast routing. Anycast means that the same set of IP addresses is announced by all our data centers. This design has great advantages. First, it guarantees the optimal speed for end users: no matter where you are located, you will always reach the closest data center. Second, anycast helps us to spread out DoS traffic. During attacks each of the locations receives a small fraction of the total traffic, making it easier to ingest and filter out unwanted traffic.

Anycast allows us to keep the networking setup uniform across all edge data centers. We applied the same design inside our data centers - our software stack is uniform across the edge servers. All software pieces are running on all the servers. In principle, every machine can handle every task - and we run many diverse and demanding tasks. We have a full HTTP stack, the magical Cloudflare Workers, two sets of DNS servers - authoritative and resolver - and many other publicly facing applications like Spectrum and Warp.

Even though every server has all the software running, requests typically cross many machines on their journey through the stack.
For example, an HTTP request might be handled by a different machine during each of the 5 stages of the processing.

Let me walk you through the early stages of inbound packet processing:

(1) First, the packets hit our router. The router does ECMP, and forwards packets onto our Linux servers. We use ECMP to spread each target IP across many, at least 16, machines. This is used as a rudimentary load balancing technique.

(2) On the servers we ingest packets with XDP eBPF. In XDP we perform two stages. First, we run volumetric DoS mitigations, dropping packets belonging to very large layer 3 attacks.

(3) Then, still in XDP, we perform layer 4 load balancing. All the non-attack packets are redirected across the machines. This is used to work around the ECMP problems, gives us fine-granularity load balancing and allows us to gracefully take servers out of service.

(4) Following the redirection the packets reach a designated machine. At this point they are ingested by the normal Linux networking stack, go through the usual iptables firewall, and are dispatched to an appropriate network socket.

(5) Finally, packets are received by an application. For example, HTTP connections are handled by a "protocol" server, responsible for performing TLS encryption and processing HTTP, HTTP/2 and QUIC protocols.

It's in these early phases of request processing where we use the coolest new Linux features. We can group useful modern functionalities into three categories:

- DoS handling
- Load balancing
- Socket dispatch

Let's discuss DoS handling in more detail. As mentioned earlier, the first step after ECMP routing is Linux's XDP stack where, among other things, we run DoS mitigations. Historically our mitigations for volumetric attacks were expressed in classic BPF and iptables-style grammar. Recently we adapted them to execute in the XDP eBPF context, which turned out to be surprisingly hard.
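Step (1) above, ECMP spreading, boils down to hashing the flow so that packets belonging to one connection always land on the same machine. A Python sketch of the idea (the hash function and server names are illustrative; real routers use their own hardware hash):

```python
# Sketch: ECMP-style spreading of flows across a pool of servers.
# hashlib and the server names here are illustrative, not how a
# router actually computes its hash.
import hashlib

SERVERS = [f"server-{i}" for i in range(16)]  # at least 16 per target IP

def pick_server(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Hash the flow 5-tuple so all packets of one connection land on
    the same machine, while distinct flows spread across the pool."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}/{proto}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(SERVERS)
    return SERVERS[index]

# The same flow is always routed to the same server:
a = pick_server("198.51.100.7", 40000, "203.0.113.1", 443)
b = pick_server("198.51.100.7", 40000, "203.0.113.1", 443)
assert a == b
```

Because only the 5-tuple feeds the hash, a connection survives as long as the pool membership is stable - which is exactly why the finer-grained XDP load balancing in step (3) is needed when servers come and go.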
Read on about our adventures:

- L4Drop: XDP DDoS Mitigations
- xdpcap: XDP Packet Capture
- XDP based DoS mitigation talk by Arthur Fabre
- XDP in practice: integrating XDP into our DDoS mitigation pipeline (PDF)

During this project we encountered a number of eBPF/XDP limitations. One of them was the lack of concurrency primitives. It was very hard to implement things like race-free token buckets. Later we found that Facebook engineer Julia Kartseva had the same issues. In February this problem was addressed with the introduction of the bpf_spin_lock helper.

While our modern volumetric DoS defenses are done in the XDP layer, we still rely on iptables for layer 7 application mitigations. Here, higher-level firewall features are useful: connlimit, hashlimits and ipsets. We also use the xt_bpf iptables module to run cBPF in iptables to match on packet payloads. We talked about this in the past:

- Lessons from defending the indefensible (PPT)
- Introducing the BPF tools

After XDP and iptables, we have one final kernel-side DoS defense layer.

Consider a situation when our UDP mitigations fail. In such a case we might be left with a flood of packets hitting our application UDP socket. This might overflow the socket, causing packet loss. This is problematic - both good and bad packets will be dropped indiscriminately. For applications like DNS it's catastrophic. In the past, to reduce the harm, we ran one UDP socket per IP address. An unmitigated flood was bad, but at least it didn't affect the traffic to other server IP addresses.

Nowadays that architecture is no longer suitable. We are running more than 30,000 DNS IPs, and running that number of UDP sockets is not optimal. Our modern solution is to run a single UDP socket with a complex eBPF socket filter on it - using the SO_ATTACH_BPF socket option.
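Conceptually, such a socket filter performs per-source-IP rate limiting. Here is a Python sketch of that idea, with a dict standing in for the eBPF map and a lock standing in for bpf_spin_lock; the rate and burst values are invented for illustration:

```python
# Sketch (in Python) of per-source-IP rate limiting of the kind an
# eBPF socket filter can perform. In eBPF the per-IP state lives in a
# map and updates are protected by bpf_spin_lock; a dict and a
# threading.Lock stand in here. RATE and BURST are invented values.
import threading

RATE, BURST = 10.0, 5.0   # tokens per second, bucket size

class RateLimiter:
    def __init__(self):
        self.buckets = {}             # the "eBPF map": src_ip -> (tokens, last)
        self.lock = threading.Lock()  # the bpf_spin_lock stand-in

    def allow(self, src_ip, now):
        """Return True if a packet from src_ip may pass at time `now`."""
        with self.lock:
            tokens, last = self.buckets.get(src_ip, (BURST, now))
            tokens = min(BURST, tokens + (now - last) * RATE)
            if tokens >= 1.0:
                self.buckets[src_ip] = (tokens - 1.0, now)
                return True
            self.buckets[src_ip] = (tokens, now)
            return False

rl = RateLimiter()
# A flood from one IP gets dropped after its burst allowance...
flood = [rl.allow("203.0.113.9", now=0.0) for _ in range(10)]
# ...while a well-behaved client is unaffected by the flooded bucket.
ok = rl.allow("198.51.100.7", now=0.0)
print(flood.count(True), ok)  # 5 True
```

Keeping one bucket per source IP is what guarantees that a single flooded address cannot exhaust the budget of everyone else sharing the socket.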
We talked about running eBPF on network sockets in past blog posts:

- eBPF, Sockets, Hop Distance and manually writing eBPF assembly
- SOCKMAP - TCP splicing of the future

The aforementioned eBPF program rate-limits the packets. It keeps the state - packet counts - in an eBPF map. We can be sure that a single flooded IP won't affect other traffic. This works well, though during work on this project we found a rather worrying bug in the eBPF verifier:

- eBPF can't count?!

I guess running eBPF on a UDP socket is not a common thing to do.

Apart from DoS handling, in XDP we also run a layer 4 load balancing layer. This is a new project, and we haven't talked much about it yet. Without getting into many details: in certain situations we need to perform a socket lookup from XDP. The problem is relatively simple - our code needs to look up the "socket" kernel structure for a 5-tuple extracted from a packet. This is generally easy - there is a bpf_sk_lookup helper available for this. Unsurprisingly, there were some complications. One problem was the inability to verify whether a received ACK packet was a valid part of a three-way handshake when SYN cookies are enabled. My colleague Lorenz Bauer is working on adding support for this corner case.

After the DoS and load balancing layers, the packets are passed onto the usual Linux TCP / UDP stack. Here we do a socket dispatch - for example, packets going to port 53 are passed onto a socket belonging to our DNS server. We do our best to use vanilla Linux features, but things get complex when you use thousands of IP addresses on the servers. Convincing Linux to route packets correctly is relatively easy with the "AnyIP" trick. Ensuring packets are dispatched to the right application is another matter. Unfortunately, standard Linux socket dispatch logic is not flexible enough for our needs. For popular ports like TCP/80 we want to share the port between multiple applications, each handling it on a different IP range. Linux doesn't support this out of the box.
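The dispatch behaviour we want can be sketched in a few lines of Python - one port, several applications, each owning an IP prefix. The prefixes and application names here are made up for illustration:

```python
# Sketch: dispatching one port across applications by IP prefix.
# The prefixes and application names are invented for illustration.
import ipaddress

# Multiple applications share TCP/80, each owning a different IP range.
BINDINGS = [
    (ipaddress.ip_network("192.0.2.0/25"), 80, "http-proxy"),
    (ipaddress.ip_network("192.0.2.128/25"), 80, "spectrum"),
]

def dispatch(dst_ip, dst_port):
    """Pick the application whose bound prefix contains the destination."""
    addr = ipaddress.ip_address(dst_ip)
    for prefix, port, app in BINDINGS:
        if dst_port == port and addr in prefix:
            return app
    return None  # no listener for this address and port

assert dispatch("192.0.2.10", 80) == "http-proxy"
assert dispatch("192.0.2.200", 80) == "spectrum"
```

Vanilla bind() can only express "one address" or "all addresses", not a prefix like the table above - which is the gap described next.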
You can call bind() either on a specific IP address or on all IPs (with 0.0.0.0). In order to fix this, we developed a custom kernel patch which adds a SO_BINDTOPREFIX socket option. As the name suggests, it allows us to call bind() on a selected IP prefix. This solves the problem of multiple applications sharing popular ports like 53 or 80.

Then we run into another problem. For our Spectrum product we need to listen on all 65535 ports. Running so many listen sockets is not a good idea (see our old war story blog), so we had to find another way. After some experiments we learned to utilize an obscure iptables module - TPROXY - for this purpose. Read about it here:

- Abusing Linux's firewall: the hack that allowed us to build Spectrum

This setup is working, but we don't like the extra firewall rules. We are working on solving this problem correctly - actually extending the socket dispatch logic. You guessed it - we want to extend socket dispatch logic by utilizing eBPF. Expect some patches from us.

There are also ways to use eBPF to improve applications. Recently we got excited about doing TCP splicing with SOCKMAP:

- SOCKMAP - TCP splicing of the future

This technique has great potential for improving tail latency across many pieces of our software stack. The current SOCKMAP implementation is not quite ready for prime time yet, but the potential is vast. Similarly, the new TCP-BPF, aka BPF_SOCK_OPS, hooks provide a great way of inspecting performance parameters of TCP flows. This functionality is super useful for our performance team.

Some Linux features didn't age well and we need to work around them. For example, we are hitting limitations of networking metrics. Don't get me wrong - the networking metrics are awesome, but sadly they are not granular enough. Things like TcpExtListenDrops and TcpExtListenOverflows are reported as global counters, while we need to know them on a per-application basis. Our solution is to use eBPF probes to extract the numbers directly from the kernel.
My colleague Ivan Babrou wrote a Prometheus metrics exporter called "ebpf_exporter" to facilitate this. Read on:

- Introducing ebpf_exporter

With "ebpf_exporter" we can generate all manner of detailed metrics. It is very powerful and has saved us on many occasions.

In this talk we discussed six layers of BPF running on our edge servers:

- Volumetric DoS mitigations running on XDP eBPF
- Iptables xt_bpf cBPF for application-layer attacks
- SO_ATTACH_BPF for rate limits on UDP sockets
- Load balancer, running on XDP
- eBPF running application helpers like SOCKMAP for TCP socket splicing, and TCP-BPF for TCP measurements
- "ebpf_exporter" for granular metrics

And we're just getting started! Soon we will be doing more with eBPF-based socket dispatch, eBPF running on the Linux TC (Traffic Control) layer, and more integration with cgroup eBPF hooks. Meanwhile, our SRE team maintains an ever-growing list of BCC scripts useful for debugging.

It feels like Linux has stopped developing new APIs, and that all new features are implemented as eBPF hooks and helpers. This is fine, and it has strong advantages: it's easier and safer to upgrade an eBPF program than to recompile a kernel module. Some things, like TCP-BPF exposing high-volume performance tracing data, would probably be impossible without eBPF.

Some say "software is eating the world"; I would say: "BPF is eating the software".

Join Cloudflare & Yandex at our Moscow meetup! Присоединяйтесь к митапу в Москве!

Photo by Serge Kutuzov / Unsplash

Are you based in Moscow? Cloudflare is partnering with Yandex to produce a meetup this month at Yandex's Moscow headquarters. We would love to invite you to join us to learn about the newest developments in the Internet industry. You'll join Cloudflare's users, stakeholders from the tech community, and engineers and product managers from both Cloudflare and Yandex.

Cloudflare Moscow Meetup

Tuesday, May 30, 2019: 18:00 - 22:00
Location: Yandex - Ulitsa L'va Tolstogo, 16, Moskva, Russia, 119021

Talks will include "Performance and scalability at Cloudflare", "Security at Yandex Cloud", and "Edge computing".

Speakers will include Evgeny Sidorov, Information Security Engineer at Yandex; Ivan Babrou, Performance Engineer at Cloudflare; Alex Cruz Farmer, Product Manager for Firewall at Cloudflare; and Olga Skobeleva, Solutions Engineer at Cloudflare.

Agenda:

18:00 - 19:00 - Registration and welcome cocktail
19:00 - 19:10 - Cloudflare overview
19:10 - 19:40 - Performance and scalability at Cloudflare
19:40 - 20:10 - Security at Yandex Cloud
20:10 - 20:40 - Cloudflare security solutions and industry security trends
20:40 - 21:10 - Edge computing
Q&A

The talks will be followed by food, drinks, and networking.

View Event Details & Register Here »

We hope to meet you soon.

Developers, join Cloudflare and Yandex at our upcoming meetup in Moscow! Cloudflare is partnering with Yandex to host an event this month at Yandex's headquarters. We invite you to join a meetup dedicated to the latest developments in the Internet industry.
The event will bring together Cloudflare customers, professionals from the tech community, and engineers from Cloudflare and Yandex.

Tuesday, May 30: 18:00 - 22:00
Location: Yandex - Ulitsa L'va Tolstogo, 16, Moskva, Russia, 119021

Talks will cover topics such as "Cloudflare security solutions and industry security trends", "Security at Yandex Cloud", "Performance and scalability at Cloudflare" and "Edge computing", with speakers from Cloudflare and Yandex.

Speakers will include Evgeny Sidorov, Deputy Head of the Service Security Group at Yandex; Ivan Babrou, Performance Engineer at Cloudflare; Alex Cruz Farmer, Product Manager for Firewall at Cloudflare; and Olga Skobeleva, Solutions Engineer at Cloudflare.

Agenda:

18:00 - 19:00 - Registration, drinks and networking
19:00 - 19:10 - Cloudflare overview
19:10 - 19:40 - Performance and scalability at Cloudflare
19:40 - 20:10 - Security solutions at Yandex
20:10 - 20:40 - Cloudflare security solutions and industry security trends
20:40 - 21:10 - Examples of serverless security solutions
Q&A

The talks will be followed by networking, food and drinks.

View event details and register here »

We look forward to meeting you!

Faster script loading with BinaryAST?

JavaScript cold starts

The performance of applications on the web platform is becoming increasingly bottlenecked by startup (load) time. Large amounts of JavaScript code are required to create the rich web experiences that we’ve become used to. When we look at the total size of JavaScript requested on mobile devices from HTTPArchive, we see that an average page loads 350KB of JavaScript, while 10% of pages go over the 1MB threshold. The rise of more complex applications can push these numbers even higher.

While caching helps, popular websites regularly release new code, which makes cold start (first load) times particularly important. With browsers moving to separate caches for different domains to prevent cross-site leaks, the importance of cold starts is growing even for popular subresources served from CDNs, as they can no longer be safely shared.

Usually, when talking about cold start performance, the primary factor considered is raw download speed. However, on modern interactive pages one of the other big contributors to cold starts is JavaScript parsing time. This might seem surprising at first, but makes sense - before starting to execute the code, the engine has to first parse the fetched JavaScript, make sure it doesn’t contain any syntax errors and then compile it to the initial bytecode. As networks become faster, parsing and compilation of JavaScript could become the dominant factor.

Device capability (CPU or memory performance) is the most important factor in the variance of JavaScript parsing times and, correspondingly, the time to application start. A 1MB JavaScript file will take on the order of 100 ms to parse on a modern desktop or high-end mobile device, but can take over a second on an average phone (Moto G4).

A more detailed post on the overall cost of parsing, compiling and executing JavaScript shows how JavaScript boot time can vary on different mobile devices.
For example, it can range from 4s on a Pixel 2 to 28s on a low-end device.

While engines continuously improve raw parsing performance - with V8 in particular doubling it over the past year, as well as moving more things off the main thread - parsers still have to do lots of potentially unnecessary work that consumes memory and battery, and might delay the processing of useful resources.

The “BinaryAST” Proposal

This is where BinaryAST comes in. BinaryAST is a new over-the-wire format for JavaScript, proposed and actively developed by Mozilla, that aims to speed up parsing while keeping the semantics of the original JavaScript intact. It does so by using an efficient binary representation for code and data structures, as well as by storing and providing extra information to guide the parser ahead of time.

The name comes from the fact that the format stores the JavaScript source as an AST encoded into a binary file. The specification is being worked on by engineers from Mozilla, Facebook, Bloomberg and Cloudflare.

“Making sure that web applications start quickly is one of the most important, but also one of the most challenging parts of web development. We know that BinaryAST can radically reduce startup time, but we need to collect real-world data to demonstrate its impact. Cloudflare's work on enabling use of BinaryAST with Cloudflare Workers is an important step towards gathering this data at scale.” - Till Schneidereit, Senior Engineering Manager, Developer Technologies, Mozilla

Parsing JavaScript

For regular JavaScript code to execute in a browser, the source is parsed into an intermediate representation known as an AST that describes the syntactic structure of the code.
This representation can then be compiled into bytecode or native machine code for execution.

A simple example of adding two numbers can be represented in an AST as:

Parsing JavaScript is not an easy task; no matter which optimisations you apply, it still requires reading the entire text file char by char, while tracking extra context for syntactic analysis. The goal of BinaryAST is to reduce the complexity and the amount of work the browser parser has to do overall, by providing additional information and context at the time and place where the parser needs it.

To execute JavaScript delivered as BinaryAST the only steps required are:

Another benefit of BinaryAST is that it makes it possible to parse only the critical code necessary for start-up, completely skipping over the unused bits. This can dramatically improve the initial loading time.

This post will now describe some of the challenges of parsing JavaScript in more detail, explain how the proposed format addresses them, and how we made it possible to run its encoder in Workers.

Hoisting

JavaScript relies on hoisting for all declarations - variables, functions, classes. Hoisting is a property of the language that allows you to declare items after the point they’re syntactically used.

Let's take the following example:

function f() { return g(); }
function g() { return 42; }

Here, when the parser is looking at the body of f, it doesn’t know yet what g is referring to - it could be an already existing global function or something declared further in the same file - so it can’t finalise parsing of the original function and start the actual compilation.

BinaryAST fixes this by storing all the scope information and making it available upfront, before the actual expressions, as shown by the difference between the initial AST and the enhanced AST in a JSON representation:

Lazy parsing

One common technique used by modern engines to improve parsing times is lazy parsing.
It utilises the fact that lots of websites include more JavaScript than they actually need, especially for start-up.

Working around this involves a set of heuristics that try to guess when any given function body in the code can be safely skipped by the parser initially and delayed for later. A common example of such a heuristic is immediately running the full parser for any function that is wrapped in parentheses:

(function(...

Such a prefix usually indicates that the following function is going to be an IIFE (immediately-invoked function expression), and so the parser can assume that it will be compiled and executed ASAP, and wouldn’t benefit from being skipped over and delayed for later.

(function() { … })();

These heuristics significantly improve the performance of the initial parsing and cold starts, but they’re not completely reliable or trivial to implement. One of the reasons is the same as in the previous section - even with lazy parsing, you still need to read the contents, analyse them and store additional scope information for the declarations. Another reason is that the JavaScript specification requires reporting any syntax errors immediately during load time, and not when the code is actually executed. A class of these errors, called early errors, covers mistakes like usage of reserved words in invalid contexts, strict mode violations, variable name clashes and more. All of these checks require not only lexing the JavaScript source, but also tracking extra state even during lazy parsing.

Having to do such extra work means you need to be careful about marking functions as lazy too eagerly, especially if they actually end up being executed during the page load.
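The paren heuristic can be sketched with a toy scanner - a Python sketch, not how a real engine tokenises:

```python
# Sketch: the "preceding paren" lazy-parsing heuristic, on a toy
# scanner. Real engines apply this during tokenisation; the regex
# scan here is only illustrative.
import re

def classify_functions(source):
    """Return (name-or-<anon>, 'eager'|'lazy') for each function keyword.
    A '(' immediately before 'function' suggests an IIFE, so parse it now."""
    result = []
    for m in re.finditer(r"function\s*(\w*)", source):
        before = source[:m.start()].rstrip()
        eager = before.endswith("(")
        result.append((m.group(1) or "<anon>", "eager" if eager else "lazy"))
    return result

src = "function helper() { return 1; } (function() { boot(); })();"
print(classify_functions(src))  # [('helper', 'lazy'), ('<anon>', 'eager')]
```

If `helper` is in fact called during startup, this guess is wrong and the function gets parsed twice - which is exactly the risk of marking functions as lazy too eagerly.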
Getting this wrong makes cold start costs even worse: every function that is erroneously marked as lazy needs to be parsed twice - once by the lazy parser and then again by the full one.

Because BinaryAST is meant to be an output format of other tools such as Babel, TypeScript and bundlers such as Webpack, the browser parser can rely on the JavaScript having already been analysed and verified by the initial parser. This allows it to skip function bodies completely, making lazy parsing essentially free.

It reduces the cost of completely unused code - while including it is still a problem in terms of network bandwidth (don’t do this!), at least it no longer affects parsing times. These benefits apply equally to code that is used later in the page lifecycle (for example, invoked in response to user actions) but is not required during startup.

Last but not least, a benefit of this approach is that BinaryAST encodes lazy annotations as part of the format, giving tools and developers direct and full control over the heuristics. For example, a tool targeting the Web platform or a framework CLI can use its domain-specific knowledge to mark some event handlers as lazy or eager depending on the context and the event type.

Avoiding ambiguity in parsing

Using a text format for a programming language is great for readability and debugging, but it's not the most efficient representation for parsing and execution. For example, parsing low-level types like numbers, booleans and even strings from text requires extra analysis and computation, which is unnecessary when you can just store them as native binary-encoded values in the first place and read them directly on the other side.

Another problem is ambiguity in the grammar itself. It was already an issue in the ES5 world, but could usually be resolved with some extra bookkeeping based on the previously seen tokens.
However, in ES6+ there are productions that can be ambiguous all the way through until they’re parsed completely. For example, a token sequence like:

(a, {b: c, d}, [e = 1])...

can start either a parenthesized comma expression with nested object and array literals and an assignment:

(a, {b: c, d}, [e = 1]); // it was an expression

or a parameter list of an arrow function expression with nested object and array patterns and a default value:

(a, {b: c, d}, [e = 1]) => … // it was a parameter list

Both representations are perfectly valid, but have completely different semantics, and you can’t know which one you’re dealing with until you see the final token. To work around this, parsers usually have to either backtrack, which can easily get exponentially slow, or parse the contents into intermediate node types that are capable of holding both expressions and patterns, with a subsequent conversion. The latter approach preserves linear performance, but makes the implementation more complicated and requires preserving more state.

In the BinaryAST format this issue doesn't exist in the first place, because the parser sees the type of each node before it even starts parsing its contents.

Cloudflare Implementation

Currently, the format is still in flux, but the very first version of the client-side implementation was released under a flag in Firefox Nightly several months ago.
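To make the type-tag-first idea concrete, here is a toy Python encoder/decoder in which each node's type tag precedes its contents, so the decoder never needs lookahead or backtracking. The tags and byte layout are invented for illustration and are not the actual BinaryAST encoding:

```python
# Sketch: a type tag before each node removes ambiguity. A toy
# encoder/decoder for two node kinds that in textual JS would both
# start with the same "(" token. Tags and layout are invented.
TAG_EXPR, TAG_PARAMS = 0x01, 0x02

def encode(kind, names):
    body = ",".join(names).encode()
    return bytes([kind, len(body)]) + body   # tag, length, contents

def decode(data):
    kind, length = data[0], data[1]          # type known before contents
    names = data[2:2 + length].decode().split(",")
    return ("comma-expression" if kind == TAG_EXPR else "parameter-list", names)

# The same contents decode without lookahead or backtracking:
assert decode(encode(TAG_EXPR, ["a", "b"])) == ("comma-expression", ["a", "b"])
assert decode(encode(TAG_PARAMS, ["a", "b"])) == ("parameter-list", ["a", "b"])
```

A single leading byte replaces what, in text, could only be resolved by reaching the final token of the production.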
Keep in mind this is only an initial unoptimised prototype, and there are already several experiments changing the format to provide improvements to both size and parsing performance.

On the producer side, there is a publicly available reference implementation. Our goal was to take this reference implementation and consider how we would deploy it at Cloudflare scale. If you dig into the codebase, you will notice that it currently consists of two parts.

One is the encoder itself, which is responsible for taking a parsed AST, annotating it with scope and other relevant information, and writing out the result in one of the currently supported formats. This part is written in Rust and is fully native.

The other part is what produces that initial AST - the parser. Interestingly, unlike the encoder, it's implemented in JavaScript. Unfortunately, there is currently no battle-tested native JavaScript parser with an open API, let alone one implemented in Rust. There have been a few attempts, but, given the complexity of the JavaScript grammar, it's better to wait a bit and make sure they're well-tested before incorporating one into the production encoder.

On the other hand, over the last few years the JavaScript ecosystem has grown to rely extensively on developer tools implemented in JavaScript itself. In particular, this gave a push to rigorous parser development and testing. There are several JavaScript parser implementations that have been proven to work on thousands of real-world projects.

With that in mind, it makes sense that the BinaryAST implementation chose to use one of them - in particular, Shift - and integrated it with the Rust encoder, instead of attempting to use a native parser.

Connecting Rust and JavaScript

Integration is where things get interesting. Rust is a native language that can compile to an executable binary, but JavaScript requires a separate engine to be executed.
To connect them, we need some way to transfer data between the two without sharing memory.

Initially, the reference implementation generated JavaScript code with an embedded input on the fly, passed it to Node.js, and then read the output when the process had finished. That code contained a call to the Shift parser with an inlined input string, and produced the AST back in JSON format.

This doesn't scale well when parsing lots of JavaScript files, so the first thing we did was transform the Node.js side into a long-living daemon. Now Rust could spawn the required Node.js process just once, and keep passing inputs into it and getting responses back as individual messages.

Running in the cloud

While the Node.js solution worked fairly well after these optimisations, shipping both a Node.js instance and a native bundle to production requires some effort. It's also potentially risky and requires manual sandboxing of both processes to make sure we don't accidentally start executing malicious code.

On the other hand, the only thing we needed from Node.js was the ability to run the JavaScript parser code. And we already have an isolated JavaScript engine running in the cloud - Cloudflare Workers! By additionally compiling the native Rust encoder to Wasm (which is quite easy with the native toolchain and wasm-bindgen), we can even run both parts of the code in the same process, making cold starts and communication much faster than in the previous model.

Optimising data transfer

The next logical step is to reduce the overhead of data transfer. JSON worked fine for communication between separate processes, but with a single process we should be able to retrieve the required bits directly from the JavaScript-based AST.

To attempt this, first of all, we needed to move away from direct JSON usage to something more generic that would allow us to support various input formats.
The Rust ecosystem already has an amazing serialisation framework for that - Serde.

Aside from allowing us to be more flexible in regard to the inputs, rewriting to Serde helped an existing native use case too. Now, instead of parsing JSON into an intermediate representation and then walking through it, all the native typed AST structures can be deserialized directly from the stdout pipe of the Node.js process in a streaming manner. This significantly improved both CPU usage and memory pressure.

But there is one more thing we can do: instead of serializing and deserializing via an intermediate format (let alone a text format like JSON), we should be able to operate [almost] directly on JavaScript values, saving memory and repetitive work.

How is this possible? wasm-bindgen provides a type called JsValue that stores a handle to an arbitrary value on the JavaScript side. This handle internally contains an index into a predefined array. Each time a JavaScript value is passed to the Rust side as a result of a function call or a property access, it's stored in this array and the index is sent to Rust. The next time Rust wants to do something with that value, it passes the index back, and the JavaScript side retrieves the original value from the array and performs the required operation.

By reusing this mechanism, we could implement a Serde deserializer that requests only the required values from the JS side and immediately converts them to their native representation. It's now open-sourced.

At first, we got much worse performance out of this due to the overhead of more frequent calls between 1) Wasm and JavaScript - SpiderMonkey has improved these recently, but other engines still lag behind - and 2) JavaScript and C++, which also can't be optimised well in most engines.

The JavaScript <-> C++ overhead comes from the usage of TextEncoder to pass strings between JavaScript and Wasm in wasm-bindgen, and, indeed, it showed up as the highest in the benchmark profiles.
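The handle mechanism described above can be modeled in a few lines of plain JavaScript. This is a simplified sketch of the idea, not wasm-bindgen's actual generated glue, and the function names here are invented for illustration:

```javascript
// Simplified model of wasm-bindgen's JsValue handles: JavaScript values
// stay in a table on the JS side, and only small integer indices cross
// the Wasm boundary. (Names are made up for this sketch.)
const heap = [];

function storeValue(value) {
  heap.push(value);
  return heap.length - 1; // this index is what Rust holds as a JsValue
}

// Rust passes a handle back to perform a property access; JS does the
// access and returns a fresh handle to the result.
function getProperty(handle, name) {
  return storeValue(heap[handle][name]);
}

// Primitive leaves are converted to native values at the boundary.
function takeNumber(handle) {
  return heap[handle];
}

// A deserializer can pull just the fields it needs, e.g. `node.start`,
// without ever serializing the whole AST object to an intermediate format:
const node = storeValue({ type: 'Script', start: 0 });
const start = takeNumber(getProperty(node, 'start'));
console.log(start); // 0
```

The point of the sketch is that only integers move between the two worlds; the AST object itself never has to be copied or stringified.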
This wasn't surprising - after all, strings can appear not only in the value payloads, but also in property names, which have to be serialized and sent between JavaScript and Wasm over and over when using a generic JSON-like structure.

Luckily, because our deserializer doesn't have to be compatible with JSON anymore, we can use our knowledge of Rust types to cache all the serialized property names as JavaScript value handles just once, and then keep reusing them for further property accesses.

This, combined with some changes to wasm-bindgen which we have upstreamed, allows our deserializer to be up to 3.5x faster in benchmarks than the original Serde support in wasm-bindgen, while saving ~33% off the resulting code size. Note that for string-heavy data structures it might still be slower than the current JSON-based integration, but the situation is expected to improve over time when the reference types proposal lands natively in Wasm.

After implementing and integrating this deserializer, we used the wasm-pack plugin for Webpack to build a Worker with both Rust and JavaScript parts combined, and shipped it to some test zones.

Show me the numbers

Keep in mind that this proposal is in very early stages, and current benchmarks and demos are not representative of the final outcome (which should improve the numbers much further).

As mentioned earlier, BinaryAST can mark functions that should be parsed lazily ahead of time. By using different levels of lazification in the encoder and running tests against some popular JavaScript libraries, we found the following speed-ups.

Level 0 (no functions are lazified)

With lazy parsing disabled in both parsers we got a raw parsing speed improvement of between 3 and 10%.
Name            Source size (KB)   JavaScript parse time (avg ms)   BinaryAST parse time (avg ms)   Diff (%)
React                 20                  0.403                            0.385                      -4.56
D3 (v5)              240                 11.178                           10.525                      -6.018
Angular              180                  6.985                            6.331                      -9.822
Babel                780                 21.255                           20.599                      -3.135
Backbone              32                  0.775                            0.699                     -10.312
wabtjs              1720                 64.836                           59.556                      -8.489
Fuzzball (1.2)        72                  3.165                            2.768                     -13.383

Level 3 (functions up to 3 levels deep are lazified)

But with lazification set to skip nested functions of up to 3 levels, we see much more dramatic improvements in parsing time - between 90 and 97%. As mentioned earlier in the post, BinaryAST makes lazy parsing essentially free by completely skipping over the marked functions.

Name            Source size (KB)   JavaScript parse time (avg ms)   BinaryAST parse time (avg ms)   Diff (%)
React                 20                  0.407                            0.032                     -92.138
D3 (v5)              240                 11.623                            0.224                     -98.073
Angular              180                  7.093                            0.680                     -90.413
Babel                780                 21.100                            0.895                     -95.758
Backbone              32                  0.898                            0.045                     -94.989
wabtjs              1720                 59.802                            1.601                     -97.323
Fuzzball (1.2)        72                  2.937                            0.089                     -96.970

All the numbers are from manual tests on a Linux x64 Intel i7 with 16 GB of RAM.

While these synthetic benchmarks are impressive, they are not representative of real-world scenarios. Normally you will use at least some of the loaded JavaScript during the startup.
To check this scenario, we decided to test some realistic pages and demos on desktop and mobile Firefox, and found speed-ups in page loads too.

For a sample application weighing in at around 1.2 MB of JavaScript, we got the following numbers for initial script execution:

Device                JavaScript   BinaryAST
Desktop               338ms        314ms
Mobile (HTC One M8)   2019ms       1455ms

Here is a video that will give you an idea of the improvement as seen by a user on mobile Firefox (in this case showing the entire page startup time).

The next step is to start gathering data on real-world websites, while improving the underlying format.

How do I test BinaryAST on my website?

We've open-sourced our Worker so that it can be installed on any Cloudflare zone. One thing to be wary of currently is that, even though the result gets stored in the cache, the initial encoding is still an expensive process, and might easily hit CPU limits on any non-trivial JavaScript files and fall back to the unencoded variant. We are working to improve this situation by releasing the BinaryAST encoder as a separate feature with more relaxed limits in the following few days.

Meanwhile, if you want to play with BinaryAST on larger real-world scripts, an alternative option is to use the static binjs_encode tool to pre-encode JavaScript files ahead of time. Then, you can use a Worker to serve the resulting BinaryAST assets when supported and requested by the browser.

On the client side, you'll currently need to download Firefox Nightly, go to about:config and enable unrestricted BinaryAST support. Now, when opening a website with either of the Workers installed, Firefox will get BinaryAST instead of JavaScript automatically.

Summary

The amount of JavaScript in modern apps is presenting performance challenges for all consumers.
Engine vendors are experimenting with different ways to improve the situation - some are focusing on raw decoding performance, some on parallelizing operations to reduce overall latency, some are researching new optimised formats for data representation, and some are inventing and improving protocols for network delivery.

No matter which one it is, we all have a shared goal of making the Web better and faster. On Cloudflare's side, we're always excited about collaborating with all the vendors and combining various approaches to bring that goal closer with every step.

Live video just got more live: Introducing Concurrent Streaming Acceleration

Today we're excited to introduce Concurrent Streaming Acceleration, a new technique for reducing the end-to-end latency of live video on the web when using Stream Delivery. Let's dig into live-streaming latency, why it's important, and what folks have done to improve it.

How "live" is "live" video?

Live streaming makes up an increasing share of video on the web. Whether it's a TV broadcast, a live game show, or an online classroom, users expect video to arrive quickly and smoothly. And the promise of "live" is that the user is seeing events as they happen. But just how close to "real-time" is "live" Internet video?

Delivering live video on the Internet is still hard and adds lots of latency:

1. The content source records video and sends it to an encoding server.
2. The origin server transforms this video into a format like DASH, HLS or CMAF that can be delivered to millions of devices efficiently.
3. A CDN is typically used to deliver encoded video across the globe.
4. Client players decode the video and render it on the screen.

And all of this is under a time constraint - the whole process needs to happen in a few seconds, or video experiences will suffer. We call the total delay between when the video was shot and when it can be viewed on an end-user's device "end-to-end latency" (think of it as the time from the camera lens to your phone's screen).

Traditional segmented delivery

Video formats like DASH, HLS, and CMAF work by splitting video into small files, called "segments". A typical segment duration is 6 seconds. If a client player needs to wait for a whole 6s segment to be encoded, sent through a CDN, and then decoded, it can be a long wait! It takes even longer if you want the client to build up a buffer of segments to protect against any interruptions in delivery.
A typical player buffer for HLS is 3 segments, so clients may have to buffer three 6-second chunks, introducing at least 18s of latency. When you consider encoding delays, it's easy to see why live streaming latency on the Internet has typically been about 20-30 seconds. We can do better.

Reduced latency with chunked transfer encoding

A natural way to solve this problem is to enable client players to start playing the chunks while they're downloading, or even while they're still being created. Making this possible requires a clever bit of cooperation to encode and deliver the files in a particular way, known as "chunked encoding." This involves splitting up segments into smaller, bite-sized pieces, or "chunks". Chunked encoding can typically bring live latency down to 5 or 10 seconds.

Confusingly, the word "chunk" is overloaded to mean two different things:

- CMAF or HLS chunks, which are small pieces of a segment (typically 1s) that are aligned on key frames
- HTTP chunks, which are just a way of delivering any file over the web

HTTP chunks are important because web clients have limited ability to process streams of data. Most clients can only work with data once they've received the full HTTP response, or at least a complete HTTP chunk. By using HTTP chunked transfer encoding, we enable video players to start parsing and decoding video sooner.

CMAF chunks are important so that decoders can actually play the bits that are in the HTTP chunks. Without encoding video in a careful way, decoders would have random bits of a video file that can't be played.

CDNs can introduce additional buffering

Chunked encoding with HLS and CMAF is growing in use across the web today.
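The latency figures above follow from simple arithmetic. Here's a rough model of the player-side buffering component only (it ignores encoding and network delays, which add more on top):

```javascript
// Rough model of minimum player-side latency in segmented delivery:
// the player buffers N whole pieces before playback starts, so the
// viewer is at least N * duration seconds behind the live edge.
function bufferLatencySeconds(pieceDuration, bufferedPieces) {
  return pieceDuration * bufferedPieces;
}

// Typical HLS numbers from the post: 6-second segments, 3-segment buffer.
console.log(bufferLatencySeconds(6, 3)); // 18 seconds behind, before other delays

// With 1-second CMAF chunks, the same 3-piece safety margin costs far
// less, which is how chunked encoding reaches the 5-10 second range.
console.log(bufferLatencySeconds(1, 3)); // 3
```

This is why shrinking the unit of delivery (segment to chunk) matters more than any single optimization elsewhere in the pipeline.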
Part of what makes this technique great is that HTTP chunked encoding is widely supported by CDNs - it's been part of the HTTP spec for 20 years. CDN support is critical because it allows low-latency live video to scale up and reach audiences of thousands or millions of concurrent viewers - something that's currently very difficult to do with other, non-HTTP-based protocols.

Unfortunately, even if you enable chunking to optimise delivery, your CDN may be working against you by buffering the entire segment. To understand why, consider what happens when many people request a live segment at the same time. If the file is already in cache, great! CDNs do a great job of delivering cached files to huge audiences. But what happens when the segment isn't in cache yet? Remember - this is the typical request pattern for live video!

Typically, CDNs are able to "stream on cache miss" from the origin. But again - what happens when multiple people request the file at once? CDNs typically need to pull the entire file into cache before serving additional viewers: only one viewer can stream video, while other clients wait for the segment to buffer at the CDN.

This behavior is understandable. CDN data centers consist of many servers. To avoid overloading origins, these servers typically coordinate amongst themselves using a "cache lock" (mutex) that allows only one server to request a particular file from the origin at a given time. A side effect of this is that while a file is being pulled into cache, it can't be served to any user other than the first one that requested it.
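The two behaviors can be contrasted in a toy model (a JavaScript sketch, not Cloudflare's actual cache code): with the lock, a follower gets nothing until the writer has finished the whole file; without it, followers consume chunks as they arrive.

```javascript
// Toy model of one segment being pulled into cache while other viewers
// want it. (Illustration only; class and method names are invented.)
class CacheEntry {
  constructor() { this.chunks = []; this.done = false; this.waiters = []; }
  append(chunk) { this.chunks.push(chunk); this.notify(); }   // writer adds data from origin
  finish() { this.done = true; this.notify(); }
  notify() { this.waiters.splice(0).forEach(resolve => resolve()); }
  wait() { return new Promise(resolve => this.waiters.push(resolve)); }

  // Cache-lock behavior: followers receive nothing until the writer has
  // finished the entire file -- a whole segment's worth of extra latency.
  async lockedRead() {
    while (!this.done) await this.wait();
    return this.chunks.join('');
  }

  // Concurrent-streaming behavior: any number of followers read chunks
  // as they are written, each tracking its own offset.
  async *streamingRead() {
    let i = 0;
    while (true) {
      if (i < this.chunks.length) { yield this.chunks[i++]; continue; }
      if (this.done) return;
      await this.wait();
    }
  }
}
```

In the streaming case a second viewer starts receiving bytes as soon as the first chunk lands, which is exactly the property chunked encoding needs from the CDN.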
Unfortunately, this cache lock also defeats the purpose of using chunked encoding! To recap thus far:

- Chunked encoding splits up video segments into smaller pieces
- This can reduce end-to-end latency by allowing chunks to be fetched and decoded by players, even while segments are being produced at the origin server
- Some CDNs neutralize the benefits of chunked encoding by buffering entire files inside the CDN before they can be delivered to clients

Cloudflare's solution: Concurrent Streaming Acceleration

As you may have guessed, we think we can do better. Put simply, we now have the ability to deliver un-cached files to multiple clients simultaneously while we pull the file once from the origin server. This sounds like a simple change, but there's a lot of subtlety to doing it safely. Under the hood, we've made deep changes to our caching infrastructure to remove the cache lock and enable multiple clients to safely read from a single file while it's still being written.

The best part is - all of Cloudflare now works this way! There's no need to opt in, or even make a config change, to get the benefit. We rolled this feature out a couple of months ago and have been really pleased with the results so far. We measure success by the "cache lock wait time," i.e. how long a request must wait for other requests - a direct component of Time To First Byte. One OTT customer saw this metric drop from 1.5s at P99 to nearly 0, as expected. This directly translates into a 1.5-second improvement in end-to-end latency. Live video just got more live!

Conclusion

New techniques like chunked encoding have revolutionized live delivery, enabling publishers to deliver low-latency live video at scale.
Concurrent Streaming Acceleration helps you unlock the power of this technique at your CDN, potentially shaving precious seconds off end-to-end latency.

If you're interested in using Cloudflare for live video delivery, contact our enterprise sales team. And if you're interested in working on projects like this and helping us improve live video delivery for the entire Internet, join our engineering team!

Announcing Cloudflare Image Resizing: Simplifying Optimal Image Delivery

In the past three years, the amount of image data on the median mobile webpage has doubled. Growing images translate directly to users hitting data transfer caps, experiencing slower websites, and even leaving if a website doesn't load in a reasonable amount of time. The crime is that many of these images are slow because they are larger than they need to be, sending data over the wire which has absolutely no (positive) impact on the user's experience.

To provide a concrete example, let's consider this photo of Cloudflare's Lava Lamp Wall. On the left you see the photo, scaled to 300 pixels wide. On the right you see the same image delivered in its original high resolution, scaled in a desktop web browser. On a regular-DPI screen they both look the same, yet the image on the right takes more than twenty times more data to load. Even for the best and most conscientious developers, resizing every image to handle every possible device geometry consumes valuable time, and it's exceptionally easy to forget to do this resizing altogether.

Today we are launching a new product, Image Resizing, to fix this problem once and for all.

Announcing Image Resizing

With Image Resizing, Cloudflare adds another important product to its suite of available image optimizations. This product allows customers to perform a rich set of key actions on images:

Resize - The source image will be resized to the specified height and width. This allows multiple differently sized variants to be created for each specific use.

Crop - The source image will be resized to a new size that does not maintain the original aspect ratio, and a portion of the image will be removed. This can be especially helpful for headshots and product images where different formats must be achieved by keeping only a portion of the image.

Compress - The source image will have its file size reduced by applying lossy compression.
This should be used when a slight quality reduction is an acceptable trade for file size reduction.

Convert to WebP - When the user's browser supports it, the source image will be converted to WebP. Delivering a WebP image takes advantage of the modern, highly optimized image format.

By using a combination of these actions, customers store a single high-quality image on their server, and Image Resizing can be leveraged to create specialized variants for each specific use case. Without any additional effort, each variant will also automatically benefit from Cloudflare's global caching.

Examples

Ecommerce Thumbnails

Ecommerce sites typically store a high-quality image of each product. From that image, they need to create different variants depending on how that product will be displayed. One example is creating thumbnails for a catalog view. If the high-quality image of a product is stored at /images/shoe123.jpg, here is how to display a 75x75 pixel thumbnail using Image Resizing:

<img src="/cdn-cgi/image/width=75,height=75/images/shoe123.jpg">

Responsive Images

When tailoring a site to work on various device types and sizes, it's important to always use correctly sized images. This can be difficult when images are intended to fill a particular percentage of the screen. To solve this problem, <img srcset sizes> can be used. Without Image Resizing, multiple versions of the same image would need to be created and stored.
In this example, a single high-quality copy of hero.jpg is stored, and Image Resizing is used to resize it for each particular size as needed.

<img width="100%"
     srcset="/cdn-cgi/image/fit=contain,width=320/assets/hero.jpg 320w,
             /cdn-cgi/image/fit=contain,width=640/assets/hero.jpg 640w,
             /cdn-cgi/image/fit=contain,width=960/assets/hero.jpg 960w,
             /cdn-cgi/image/fit=contain,width=1280/assets/hero.jpg 1280w,
             /cdn-cgi/image/fit=contain,width=2560/assets/hero.jpg 2560w"
     src="/cdn-cgi/image/width=960/assets/hero.jpg">

Enforce Maximum Size Without Changing URLs

Image Resizing is also available from within a Cloudflare Worker. Workers allow you to write code which runs close to your users all around the world. For example, you might wish to add Image Resizing to your images while keeping the same URLs. Your users and clients would be able to use the same image URLs as always, but the images will be transparently modified in whatever way you need.

You can install a Worker on a route which matches your image URLs, and resize any images larger than a limit:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  return fetch(request, {
    cf: { image: { width: 800, height: 800, fit: 'scale-down' } }
  })
}

As a Worker is just code, it is also easy to run this Worker only on URLs with image extensions, or even to resize only images being delivered to mobile clients.

Cloudflare and Images

Cloudflare has a long history of building tools to accelerate images. Our caching has always helped reduce latency by storing a copy of images closer to the user. Polish automates options for both lossless and lossy image compression to remove unnecessary bytes from images. Mirage accelerates image delivery based on device type.
We are continuing to invest in all of these tools, as they each serve a unique role in improving the image experience on the web. Image Resizing is different because it is the first image product at Cloudflare to give developers full control over how their images are served. You should choose Image Resizing if you are comfortable defining the sizes you wish your images to be served at, either in advance or within a Cloudflare Worker.

Next Steps and Simple Pricing

Image Resizing is available today for Business and Enterprise Customers. To enable it, log in to the Cloudflare Dashboard and navigate to the Speed tab. There you'll find the section for Image Resizing, which you can enable with one click.

This product is included in the Business and Enterprise plans at no additional cost, with generous usage limits. Business Customers have a 100k requests per month limit and will be charged $10 for each additional 100k requests per month. Enterprise Customers have a 10M requests per month limit with discounted tiers for higher usage. Requests are defined as a hit on a URI that contains Image Resizing or a call to Image Resizing from a Worker.

Now that you've enabled Image Resizing, it's time to resize your first image:

1. Using your existing site, store an image at a path such as /images/yourimage.jpg.
2. Use a URL of the form /cdn-cgi/image/width=100,height=100,quality=75/images/yourimage.jpg to resize that image.
3. Experiment with changing width=, height=, and quality=.

The instructions above use the Default URL Format for Image Resizing. For details on options, use cases, and compatibility, refer to our Developer Documentation.

Parallel streaming of progressive images

Progressive image rendering and HTTP/2 multiplexing technologies have existed for a while, but now we've combined them in a new way that makes them much more powerful. With Cloudflare's progressive streaming, images appear to load in half the time, and browsers can start rendering pages sooner.

In HTTP/1.1 connections, servers didn't have any choice about the order in which resources were sent to the client; they had to send responses, as a whole, in the exact order they were requested by the web browser. HTTP/2 improved this by adding multiplexing and prioritization, which allow servers to decide exactly what data is sent and when. We've taken advantage of these new HTTP/2 capabilities to improve the perceived loading speed of progressive images by sending the most important fragments of image data sooner.

This feature is compatible with all major browsers, and doesn't require any changes to page markup, so it's very easy to adopt. Sign up for the Beta to enable it on your site!

What is progressive image rendering?

Basic images load strictly from top to bottom. If a browser has received only half of an image file, it can show only the top half of the image. Progressive images have their content arranged not from top to bottom, but from a low level of detail to a high level of detail. Receiving a fraction of the image data allows browsers to show the entire image, only with lower fidelity. As more data arrives, the image becomes clearer and sharper.

This works great in the JPEG format, where only about 10-15% of the data is needed to display a preview of the image, and at 50% of the data the image looks almost as good as when the whole file is delivered. Progressive JPEG images contain exactly the same data as baseline images, merely reshuffled in a more useful order, so progressive rendering doesn't add any cost to the file size. This is possible because JPEG doesn't store the image as pixels.
Instead, it represents the image as frequency coefficients, which are like a set of predefined patterns that can be blended together, in any order, to reconstruct the original image. The inner workings of JPEG are really fascinating, and you can learn more about them from my recent conference talk.

The end result is that images can look almost fully loaded in half the time, for free! The page appears visually complete and can be used much sooner. The rest of the image data arrives shortly after, upgrading images to their full quality before visitors have time to notice anything is missing.

HTTP/2 progressive streaming

But there's a catch. Websites have more than one image (sometimes even hundreds of images). When the server sends image files naïvely, one after another, progressive rendering doesn't help that much, because overall the images still load sequentially. Having complete data for half of the images (and no data for the other half) doesn't look as good as having half of the data for all images.

And there's another problem: when the browser doesn't know image sizes yet, it lays out the page with placeholders instead, and re-lays out the page as each image loads. This can make pages jump during loading, which is inelegant, distracting and annoying for the user.

Our new progressive streaming feature greatly improves the situation: we can send all of the images at once, in parallel. This way the browser gets size information for all of the images as soon as possible, can paint a preview of all images without having to wait for a lot of data, and large images don't delay the loading of styles, scripts and other more important resources.

This idea of streaming progressive images in parallel is as old as HTTP/2 itself, but it needs special handling in low-level parts of web servers, and so far it hasn't been implemented at a large scale. When we were improving our HTTP/2 prioritization, we realized it can also be used to implement this feature.
Image files as a whole are neither high nor low priority. The priority changes within each file, and dynamic re-prioritization gives us the behavior we want:

- The image header that contains the image size is very high priority, because the browser needs to know the size as soon as possible to do page layout. The image header is small, so it doesn't hurt to send it ahead of other data.
- The minimum amount of data required to show a preview of the image has a medium priority (we'd like to plug "holes" left for unloaded images as soon as possible, but also leave some bandwidth available for scripts, fonts and other resources).
- The remainder of the image data is low priority. Browsers can stream it last to refine image quality once there's no rush, since the page is already fully usable.

Knowing the exact amount of data to send in each phase requires understanding the structure of image files, but it seemed weird to us to make our web server parse image responses and have format-specific behavior hardcoded at a protocol level. By framing the problem as a dynamic change of priorities, we were able to elegantly separate low-level networking code from knowledge of image formats. We can use Workers or offline image processing tools to analyze the images, and instruct our server to change HTTP/2 priorities accordingly.

The great thing about parallel streaming of images is that it doesn't add any overhead. We're still sending the same data, and the same amount of data; we're just sending it in a smarter order. This technique takes advantage of existing web standards, so it's compatible with all browsers.

The waterfall

Here are waterfall charts from WebPageTest showing a comparison of regular HTTP/2 responses and progressive streaming. In both cases the files were exactly the same, the amount of data transferred was the same, and the overall page loading time was the same (within measurement noise).
In the charts, blue segments show when data was transferred, and green shows when each request was idle.

The first chart shows typical server behavior that makes images load mostly sequentially. The chart itself looks neat, but the actual experience of loading that page was not great - the last image didn't start loading until almost the end.

The second chart shows images loaded in parallel. The blue vertical streaks throughout the chart are image headers sent early, followed by a couple of stages of progressive rendering. You can see that useful data arrived sooner for all of the images. You may notice that one of the images was sent in one chunk, rather than split like all the others. That's because at the very beginning of a TCP/IP connection we don't know the true speed of the connection yet, and we have to sacrifice some opportunity to do prioritization in order to maximize the connection speed.

The metrics compared to other solutions

There are other techniques intended to provide image previews quickly, such as low-quality image placeholders (LQIP), but they have several drawbacks: they add unnecessary data for the placeholders, usually interfere with browsers' preload scanner, and delay loading of full-quality images due to a dependence on the JavaScript needed to upgrade the previews to full images.

- Our solution doesn't cause any additional requests, and doesn't add any extra data. Overall page load time is not delayed.
- Our solution doesn't require any JavaScript. It takes advantage of functionality supported natively in browsers.
- Our solution doesn't require any changes to a page's markup, so it's very safe and easy to deploy site-wide.

The improvement in user experience is reflected in performance metrics such as the SpeedIndex metric and time to visually complete.
Notice that with regular image loading the visual progress is linear, but with progressive streaming it quickly jumps to mostly complete.

Getting the most out of progressive rendering

Avoid ruining the effect with JavaScript. Scripts that hide images and wait until the onload event to reveal them (with a fade-in, etc.) will defeat progressive rendering. Progressive rendering works best with the good old <img> element.

Is it JPEG-only?

Our implementation is format-independent, but progressive streaming is useful only for certain file types. For example, it wouldn't make sense to apply it to scripts or stylesheets: those resources are rendered all-or-nothing. Prioritizing of image headers (which contain the image size) works for all file formats.

The benefits of progressive rendering are unique to JPEG (supported in all browsers) and JPEG 2000 (supported in Safari). GIF and PNG have interlaced modes, but those modes come at the cost of worse compression. WebP doesn't support progressive rendering at all. This creates a dilemma: WebP is usually 20%-30% smaller than a JPEG of equivalent quality, but progressive JPEG appears to load 50% faster. There are next-generation image formats that support progressive rendering better than JPEG and compress better than WebP, but they're not supported in web browsers yet. In the meantime you can choose between the bandwidth savings of WebP and the better perceived performance of progressive JPEG by changing the Polish settings in your Cloudflare dashboard.

Custom header for experimentation

We also support a custom HTTP header that allows you to experiment with, and optimize, streaming of other resources on your site. For example, you could make our servers send the first frame of animated GIFs with high priority and deprioritize the rest. Or you could prioritize loading of resources mentioned in the <head> of HTML documents before the <body> is loaded. The custom header can be set only from a Worker.
The syntax is a comma-separated list of file positions, each with a priority and concurrency. The priority and concurrency are the same as in the whole-file cf-priority header described in the previous blog post:

<offset in bytes>:<priority>/<concurrency>, ...

For example, for a progressive JPEG we use something like this (a fragment of JavaScript to use in a Worker):

let headers = new Headers(response.headers);
headers.set("cf-priority", "30/0");
headers.set("cf-priority-change", "512:20/1, 15000:10/n");
return new Response(response.body, {headers});

This instructs the server to use priority 30 initially, while it sends the first 512 bytes; then to switch to priority 20 with some concurrency (/1); and finally, after sending 15000 bytes of the file, to switch to low priority and high concurrency (/n) to deliver the rest of the file.

We'll try to split HTTP/2 frames to match the offsets specified in the header, so that the sending priority changes as soon as possible. However, priorities don't guarantee that data of different streams will be multiplexed exactly as instructed: the server can prioritize only when it has data from multiple streams waiting to be sent at the same time. If some responses arrive much sooner from the upstream server or the cache, the server may send them right away, without waiting for the others.

Try it!

You can use our Polish tool to convert your images to progressive JPEG. Sign up for the beta to have them elegantly streamed in parallel.
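As an aside: to derive byte offsets like the 512 and 15000 in the cf-priority-change example above for your own images, one approach is to locate the start-of-scan (SOS, 0xFFDA) markers in a progressive JPEG, since each scan refines the image. The sketch below is our own illustration, not part of Cloudflare's API, and a production parser should walk JPEG segments properly (embedded EXIF thumbnails can contain stray marker bytes):

```javascript
// Naive scan for SOS (0xFF 0xDA) markers in a progressive JPEG.
// The first offset approximates the end of the header; an intermediate
// scan offset approximates the "preview is visible" boundary.
function findScanOffsets(bytes) {
  const offsets = [];
  for (let i = 0; i + 1 < bytes.length; i++) {
    if (bytes[i] === 0xff && bytes[i + 1] === 0xda) offsets.push(i);
  }
  return offsets;
}
```

In entropy-coded JPEG data a 0xFF byte is always followed by 0x00 or a restart marker, so 0xFFDA between scans genuinely marks a new scan; only metadata segments can produce false positives.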

Better HTTP/2 Prioritization for a Faster Web

HTTP/2 promised a much faster web and Cloudflare rolled out HTTP/2 access for all our customers long, long ago. But one feature of HTTP/2, prioritization, didn't live up to the hype. Not because it was fundamentally broken, but because of the way browsers implemented it.

Today Cloudflare is pushing out a change to HTTP/2 prioritization that gives our servers control of prioritization decisions and truly makes the web much faster. Historically the browser has been in control of deciding how and when web content is loaded. Today we are introducing a radical change to that model for all paid plans that puts control directly into the hands of the site owner. Customers can enable "Enhanced HTTP/2 Prioritization" in the Speed tab of the Cloudflare dashboard: this overrides the browser defaults with an improved scheduling scheme that results in a significantly faster visitor experience (we have seen 50% faster on multiple occasions). With Cloudflare Workers, site owners can take this a step further and fully customize the experience to their specific needs.

Background

Web pages are made up of dozens (sometimes hundreds) of separate resources that are loaded and assembled by the browser into the final displayed content. This includes the visible content the user interacts with (HTML, CSS, images), the application logic (JavaScript) for the site itself, ads, analytics for tracking site usage, and marketing tracking beacons. The sequencing of how those resources are loaded can have a significant impact on how long it takes for the user to see the content and interact with the page.

A browser is basically an HTML processing engine that goes through the HTML document and follows the instructions in order from the start of the HTML to the end, building the page as it goes along.
References to stylesheets (CSS) tell the browser how to style the page content, and the browser will delay displaying content until it has loaded the stylesheet (so it knows how to style the content it is going to display).

Scripts referenced in the document can have several different behaviors. If a script is tagged as "async" or "defer", the browser can keep processing the document and just run the script code whenever the script is available. If a script is not tagged as async or defer, the browser MUST stop processing the document until the script has downloaded and executed before continuing. These are referred to as "blocking" scripts because they block the browser from continuing to process the document until they have been loaded and executed.

The HTML document is split into two parts. The <head> of the document is at the beginning and contains stylesheets, scripts and other instructions for the browser that are needed to display the content. The <body> of the document comes after the head and contains the actual page content that is displayed in the browser window (though scripts and stylesheets are allowed to be in the body as well). Until the browser gets to the body of the document there is nothing to display to the user and the page will remain blank, so getting through the head of the document as quickly as possible is important. "HTML5 Rocks" has a great tutorial on how browsers work if you want to dive deeper into the details.

The browser is generally in charge of determining the order in which to load the different resources it needs to build the page and continue processing the document. In the case of HTTP/1.x, the browser is limited in how many things it can request from any one server at a time (generally 6 connections, and only one resource at a time per connection), so the ordering is strictly controlled by the browser through how things are requested. With HTTP/2 things change pretty significantly.
The browser can request all of the resources at once (at least as soon as it knows about them) and it provides detailed instructions to the server for how the resources should be delivered.

Optimal Resource Ordering

For most parts of the page loading cycle there is an optimal ordering of the resources that will result in the fastest user experience (and the difference between optimal and not can be significant: as much as a 50% improvement or more).

As described above, early in the page load cycle, before the browser can render any content, it is blocked on the CSS and blocking JavaScript in the <head> section of the HTML. During that part of the loading cycle it is best for 100% of the connection bandwidth to be used to download the blocking resources, and for them to be downloaded one at a time in the order they are defined in the HTML. That lets the browser parse and execute each item while it is downloading the next blocking resource, allowing the download and execution to be pipelined. The scripts take the same amount of time to download whether they are fetched in parallel or one after the other, but by downloading them sequentially the first script can be processed and executed while the second script is downloading.

Once the render-blocking content has loaded, things get a little more interesting, and the optimal loading may depend on the specific site or even business priorities (user content vs ads vs analytics, etc). Fonts in particular can be difficult, as the browser only discovers which fonts it needs after the stylesheets have been applied to the content that is about to be displayed. By the time the browser knows about a font, it is already needed to display text that is ready to be drawn to the screen. Any delays in getting the font loaded end up as periods of blank text on the screen (or text displayed in the wrong font).
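The pipelining benefit can be sketched with a toy timing model. This is our own simplification (two identical blocking scripts, fixed download and execute times, execution of only one script at a time), not a browser simulation:

```javascript
// Why sequential download beats parallel for blocking scripts:
// executing script 1 overlaps with downloading script 2.
// d = download time per script, e = execute time per script (seconds).
function finishTime(mode, d, e) {
  if (mode === "sequential") {
    // script 1 downloads during [0, d] and executes during [d, d + e];
    // script 2 downloads during [d, 2d] and can start executing once
    // its download AND script 1's execution are both done.
    const start2 = Math.max(2 * d, d + e);
    return start2 + e;
  }
  // parallel: bandwidth is split, so both downloads finish at 2d,
  // then the two scripts still have to execute one after the other.
  return 2 * d + 2 * e;
}
```

With d = 1s and e = 0.5s, sequential finishes at 2.5s while parallel finishes at 3s: the total download time is identical, but sequencing hides most of the execution time inside the downloads.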
Generally there are some tradeoffs that need to be considered:

- Custom fonts and images in the visible part of the page (the viewport) should be loaded as quickly as possible. They directly impact the user's visual experience of the page loading.
- Non-blocking JavaScript should be downloaded serially relative to other JavaScript resources so the execution of each can be pipelined with the downloads. The JavaScript may include user-facing application logic as well as analytics tracking and marketing beacons, and delaying it can cause a drop in the metrics that the business tracks.
- Images benefit from downloading in parallel. The first few bytes of an image file contain the image dimensions, which may be necessary for browser layout, and progressive images downloading in parallel can look visually complete with around 50% of the bytes transferred.

Weighing the tradeoffs, one strategy that works well in most cases is:

- Custom fonts download sequentially and split the available bandwidth with visible images.
- Visible images download in parallel, splitting the "images" share of the bandwidth among them.
- When there are no more fonts or visible images pending: non-blocking scripts download sequentially and split the available bandwidth with non-visible images, while non-visible images download in parallel, splitting the "images" share of the bandwidth among them.

That way the content visible to the user is loaded as quickly as possible, the application logic is delayed as little as possible, and the non-visible images are loaded in such a way that layout can be completed as quickly as possible.

Example

For illustrative purposes, we will use a simplified product category page from a typical e-commerce site. In this example the page has:

- The HTML file for the page itself, represented by a blue box.
- 1 external stylesheet (CSS file), represented by a green box.
- 4 external scripts (JavaScript), represented by orange boxes.
2 of the scripts are blocking at the beginning of the page and 2 are asynchronous; the blocking script boxes use a darker shade of orange.
- 1 custom web font, represented by a red box.
- 13 images, represented by purple boxes. The page logo and 4 of the product images are visible in the viewport, and 8 of the product images require scrolling to see. The 5 visible images use a darker shade of purple.

For simplicity, we will assume that all of the resources are the same size and each takes 1 second to download on the visitor's connection. Loading everything takes a total of 20 seconds, but HOW it is loaded can have a huge impact on the experience. This is what the described optimal loading would look like in the browser as the resources load:

- The page is blank for the first 4 seconds while the HTML, CSS and blocking scripts load, all using 100% of the connection. At the 4-second mark the background and structure of the page is displayed with no text or images.
- One second later, at 5 seconds, the text for the page is displayed.
- From 5 to 10 seconds the images load, starting out blurry but sharpening very quickly. By around the 7-second mark the page is almost indistinguishable from the final version.
- At the 10-second mark all of the visual content in the viewport has finished loading.
- Over the next 2 seconds the asynchronous JavaScript is loaded and executed, running any non-critical logic (analytics, marketing tags, etc).
- For the final 8 seconds the rest of the product images load so they are ready for when the user scrolls.

Current Browser Prioritization

All of the current browser engines implement different prioritization strategies, none of which are optimal.

Microsoft Edge and Internet Explorer do not support prioritization, so everything falls back to the HTTP/2 default, which is to load everything in parallel, splitting the bandwidth evenly among everything.
Microsoft Edge is moving to the Chromium browser engine in future Windows releases, which will help improve the situation. In our example page this means that the browser is stuck in the head for the majority of the loading time, since the images slow down the transfer of the blocking scripts and stylesheets. Visually that results in a pretty painful experience of staring at a blank screen for 19 seconds before most of the content displays, followed by a 1-second delay for the text to display. Be patient when watching the animated progress, because for the 19 seconds of blank screen it may feel like nothing is happening (even though it is):

Safari loads all resources in parallel, splitting the bandwidth between them based on how important Safari believes they are (with render-blocking resources like scripts and stylesheets being more important than images). Images load in parallel, but also at the same time as the render-blocking content. While similar to Edge in that everything downloads at the same time, by allocating more bandwidth to the render-blocking resources Safari can display the content much sooner:

- At around 8 seconds the stylesheet and scripts have finished loading, so the page can start to be displayed. Since the images were loading in parallel, they can also be rendered in their partial state (blurry for progressive images). This is still twice as slow as the optimal case, but much better than what we saw with Edge.
- At around 11 seconds the font has loaded, so the text can be displayed, and more image data has been downloaded, so the images are a little sharper. This is comparable to the experience around the 7-second mark of the optimal loading case.
- For the remaining 9 seconds of the load the images get sharper as more data downloads, until the page is finally complete at 20 seconds.
Firefox builds a dependency tree that groups resources and then schedules the groups to either load one after another or share bandwidth between groups. Within a given group the resources share bandwidth and download concurrently. The images are scheduled to load after the render-blocking stylesheets and to load in parallel, but the render-blocking scripts and stylesheets also load in parallel and do not get the benefits of pipelining. In our example this ends up being a slightly faster experience than with Safari, since the images are delayed until after the stylesheets complete:

- At the 6-second mark the initial page content is rendered with the background and blurry versions of the product images (compared to 8 seconds for Safari and 4 seconds for the optimal case).
- At 8 seconds the font has loaded and the text can be displayed, along with slightly sharper versions of the product images (compared to 11 seconds for Safari and 7 seconds for the optimal case).
- For the remaining 12 seconds of the load the product images get sharper as the remaining content loads.

Chrome (and all Chromium-based browsers) prioritizes resources into a list. This works really well for the render-blocking content that benefits from loading in order, but less well for images: each image loads to 100% completion before the next image starts. In practice this is almost as good as the optimal loading case, with the only difference being that the images load one at a time instead of in parallel:

- Up until the 5-second mark the Chrome experience is identical to the optimal case, displaying the background at 4 seconds and the text content at 5.
- For the next 5 seconds the visible images load one at a time until they are all complete at the 10-second mark (compared to the optimal case, where they are just slightly blurry at 7 seconds and sharpen up over the remaining 3 seconds).
After the visual part of the page is complete at 10 seconds (identical to the optimal case), the remaining 10 seconds are spent running the async scripts and loading the hidden images (just like in the optimal loading case).

Visual Comparison

Visually, the impact can be quite dramatic, even though all of the browsers take the same amount of time to technically load all of the content:

Server-Side Prioritization

HTTP/2 prioritization is requested by the client (browser), and it is up to the server to decide what to do based on the request. A good number of servers don't support doing anything at all with prioritization, but those that do all honor the client's request. Another option is to decide on the best prioritization to use on the server side, taking the client's request into account.

Per the specification, HTTP/2 prioritization is a dependency tree that requires full knowledge of all of the in-flight requests in order to prioritize resources against each other. That allows for incredibly complex strategies, but it is difficult to implement well on either the browser or server side (as evidenced by the different browser strategies and varying levels of server support). To make prioritization easier to manage, we have developed a simpler prioritization scheme that still has all of the flexibility needed for optimal scheduling.

The Cloudflare prioritization scheme consists of 64 priority "levels", and within each priority level there are groups of resources that determine how the connection is shared between them. All of the resources at a higher priority level are transferred before moving on to the next lower priority level. Within a given priority level, there are 3 different "concurrency" groups:

- 0: All of the resources in the concurrency "0" group are sent sequentially in the order they were requested, using 100% of the bandwidth. Only after all of the concurrency "0" group resources have been downloaded are other groups at the same level considered.
- 1: All of the resources in the concurrency "1" group are sent sequentially in the order they were requested. The available bandwidth is split evenly between the concurrency "1" group and the concurrency "n" group.
- n: The resources in the concurrency "n" group are sent in parallel, splitting the bandwidth available to the group between them.

Practically speaking, the concurrency "0" group is useful for critical content that needs to be processed sequentially (scripts, CSS, etc). The concurrency "1" group is useful for less-important content that can share bandwidth with other resources, but where the resources themselves still benefit from processing sequentially (async scripts, non-progressive images, etc). The concurrency "n" group is useful for resources that benefit from processing in parallel (progressive images, video, audio, etc).

Cloudflare Default Prioritization

When enabled, the enhanced prioritization implements the "optimal" scheduling of resources described above. The specific prioritizations applied look like this:

This prioritization scheme allows sending the render-blocking content serially, followed by the visible images in parallel, and then the rest of the page content with some level of sharing to balance application and content loading. The "* If Detectable" caveat is that not all browsers differentiate between the different types of stylesheets and scripts, but even then loading will still be significantly faster in all cases. 50% faster by default, particularly for Edge and Safari visitors, is not unusual:

Customizing Prioritization with Workers

Faster-by-default is great, but where things get really interesting is that the ability to configure the prioritization is also exposed to Cloudflare Workers, so sites can override the default prioritization for resources or implement their own complete prioritization schemes.
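Before overriding priorities, it helps to have a concrete mental model of the level/concurrency semantics. The toy function below is our own illustration of how bandwidth is divided at a single instant, not Cloudflare's implementation:

```javascript
// Toy model of the scheme: higher priority levels go first; within a level,
// group "0" runs alone and sequentially, otherwise groups "1" and "n" split
// the bandwidth (group "1" sequentially, group "n" in parallel).
// Each resource is {name, level, concurrency}; returns name -> bandwidth share.
function bandwidthShares(resources) {
  const top = Math.max(...resources.map(r => r.level));
  const level = resources.filter(r => r.level === top);
  const g0 = level.filter(r => r.concurrency === "0");
  const g1 = level.filter(r => r.concurrency === "1");
  const gn = level.filter(r => r.concurrency === "n");
  const shares = {};
  if (g0.length) {
    shares[g0[0].name] = 1; // group "0" preempts everything else at this level
    return shares;
  }
  const groups = (g1.length ? 1 : 0) + (gn.length ? 1 : 0);
  if (g1.length) shares[g1[0].name] = 1 / groups;      // sequential: first only
  gn.forEach(r => { shares[r.name] = 1 / groups / gn.length; }); // parallel
  return shares;
}
```

For example, a priority-30 stylesheet in group "0" takes the whole connection on its own, while an async script at 20/1 alongside two progressive images at 20/n gets 50% of the bandwidth, with the images taking 25% each.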
If a Worker adds a "cf-priority" header to the response, Cloudflare edge servers will use the specified priority and concurrency for that response. The format of the header is <priority>/<concurrency>, so something like response.headers.set('cf-priority', '30/0'); would set the priority to 30 with a concurrency of 0 for the given response. Similarly, '30/1' would set concurrency to 1 and '30/n' would set concurrency to n. With this level of flexibility a site can tweak resource prioritization to meet its needs: boosting the priority of some critical async scripts, for example, or increasing the priority of hero images before the browser has identified that they are in the viewport.

To help inform prioritization decisions, the Workers runtime also exposes the browser-requested prioritization information in the request object passed to the Worker's fetch event listener. The incoming requested priority is a semicolon-delimited list of attributes that looks something like this: "weight=192;exclusive=0;group=3;group-weight=127".

- weight: the browser-requested weight for the HTTP/2 prioritization.
- exclusive: the browser-requested HTTP/2 exclusive flag (1 for Chromium-based browsers, 0 for others).
- group: HTTP/2 stream ID for the request group (only non-zero for Firefox).
- group-weight: HTTP/2 weight for the request group (only non-zero for Firefox).

This is Just the Beginning

The ability to tune and control the prioritization of responses is a basic building block that a lot of future work will benefit from. We will be implementing our own advanced optimizations on top of it, but by exposing it in Workers we have also opened it up to sites and researchers to experiment with different prioritization strategies. With the Apps Marketplace, it is also possible for companies to build new optimization services on top of the Workers platform and make them available to other sites to use.
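For experimentation, the requested-priority string is straightforward to parse inside a Worker. This sketch assumes only the attribute format shown above ("weight=192;exclusive=0;group=3;group-weight=127"); how you obtain the string from the request object is covered in the Workers documentation:

```javascript
// Parse a semicolon-delimited priority string into an object of numbers,
// e.g. "weight=192;exclusive=0" -> {weight: 192, exclusive: 0}.
function parseRequestPriority(str) {
  const out = {};
  for (const part of str.split(";")) {
    const [key, value] = part.split("=");
    if (key) out[key.trim()] = Number(value);
  }
  return out;
}
```

A Worker could use the parsed values to, say, bump the cf-priority of responses the browser already marked as high-weight, or to log how different browsers prioritize the same resources.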
If you are on a Pro plan or above, head over to the speed tab in the Cloudflare dashboard and turn on “Enhanced HTTP/2 Prioritization” to accelerate your site.

