Service Provider Blogs

Announcing the Cloudflare Access App Launch

Cloudflare Blog

Every person joining your team has the same question on Day One: how do I find and connect to the applications I need to do my job?

Since launch, Cloudflare Access has helped improve how users connect to those applications. When you protect an application with Access, users never have to connect to a private network and never have to deal with a clunky VPN client. Instead, they reach on-premises apps as if they were SaaS tools. Behind the scenes, Access evaluates and logs every request to those apps for identity, giving administrators more visibility and security than a traditional VPN.

Administrators need about an hour to deploy Access. End user logins take about 20 ms, and that response time is consistent globally. Unlike VPN appliances, Access runs in every data center in Cloudflare's network in 200 cities around the world. When Access works well, it should be easy for administrators and invisible to the end user.

However, users still need to locate the applications behind Access, and for internally managed applications, traditional dashboards require constant upkeep. As organizations grow, that roster of links keeps expanding. Department leads and IT administrators can create and publish manual lists, but those become a chore to maintain. Teams need to publish custom versions for contractors or partners that only make certain tools visible.

Starting today, teams can use Cloudflare Access to solve that challenge. We're excited to announce the first feature in Access built specifically for end users: the Access App Launch portal. The Access App Launch is a dashboard for all the applications protected by Access. Once enabled, end users can log in and connect to every app behind Access with a single click.

How does it work?

When administrators secure an application with Access, any request to the hostname of that application stops at Cloudflare's network first. Once there, Cloudflare Access checks the request against the list of users who have permission to reach the application.

To check identity, Access relies on the identity provider that the team already uses. Access integrates with providers like OneLogin, Okta, AzureAD, G Suite and others to determine who a user is. If the user has not logged in yet, Access will prompt them to do so at the configured identity provider.

When the user logs in, they are redirected through a subdomain unique to each Access account. Access assigns that subdomain based on a hostname already active in the account. For example, an account with the hostname "widgetcorp.tech" will be assigned "widgetcorp.cloudflareaccess.com".

The Access App Launch uses the unique subdomain assigned to each Access account. Now, when users visit that URL directly, Cloudflare Access checks their identity and displays only the applications that the user has permission to reach. When a user clicks on an application, they are redirected to the application behind it. Since they are already authenticated, they do not need to log in again. In the background, the Access App Launch decodes and validates the token stored in the cookie on the account's subdomain.
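
Applications behind Access can perform the same kind of token check in their own origin code. The sketch below is only an illustration, not Cloudflare's implementation: the cookie name, the key endpoint on the account subdomain, and the audience tag are assumptions to be checked against the current Access documentation, and it uses the third-party PyJWT library.

    import jwt  # PyJWT (pip install "pyjwt[crypto]")

    # Assumed values -- substitute your own Access account subdomain and application audience tag.
    TEAM_DOMAIN = "https://widgetcorp.cloudflareaccess.com"
    CERTS_URL = f"{TEAM_DOMAIN}/cdn-cgi/access/certs"   # assumed location of the signing keys (JWKS)
    AUDIENCE = "your-application-aud-tag"               # assumed audience value for the protected app

    def validate_access_token(cookie_value: str) -> dict:
        """Verify the signature and claims of an Access token taken from the session cookie."""
        signing_key = jwt.PyJWKClient(CERTS_URL).get_signing_key_from_jwt(cookie_value)
        # Raises jwt.InvalidTokenError if the signature, expiry or audience check fails.
        return jwt.decode(
            cookie_value,
            signing_key.key,
            algorithms=["RS256"],
            audience=AUDIENCE,
        )
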
How is it configured?

The Access App Launch can be configured in the Cloudflare dashboard in three steps. First, navigate to the Access tab in the dashboard. Next, enable the feature in the "App Launch Portal" card. Finally, define who should be able to use the Access App Launch in the modal that appears and click "Save".

Permissions to use the Access App Launch portal do not impact existing Access policies for who can reach protected applications. Administrators do not need to manually configure each application that appears in the portal. The Access App Launch uses the policies already created in the account to generate a page unique to each individual user, automatically.

Defense-in-depth against phishing attacks

Phishing attacks attempt to trick users by masquerading as a legitimate website. In the case of business users, team members think they are visiting an authentic application. Instead, an attacker can present a spoofed version of the application at a URL that looks like the real thing. Take "example.com" vs "examрle.com" - they look identical, but one uses the Cyrillic "р" and becomes an entirely different hostname. If an attacker can lure a user to visit "examрle.com", and make the site look like the real thing, that user could accidentally leak credentials or information.

To be successful, the attacker needs to get the victim to visit that fraudulent URL. That frequently happens via email from untrusted senders.

The Access App Launch can help prevent these attacks from targeting internal tools. Teams can instruct users to only navigate to internal applications through the Access App Launch dashboard. When users select a tile in the page, Access will send users to that application using the organization's SSO.

Cloudflare Gateway can take it one step further. Gateway's DNS resolver filtering can help defend against phishing attacks that use lookalike sites for applications that do not sit behind Access. To learn more about adding Gateway in conjunction with Access, sign up to join the beta here.

What's next?

As part of last week's announcement of Cloudflare for Teams, the Access App Launch is now available to all Access customers today. You can get started with instructions here.

Interested in learning more about Cloudflare for Teams? Read more about the announcement and features here.
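
As a short aside on the lookalike hostname earlier in this post: the two names only appear identical. A small sketch, using nothing but the Python standard library, shows the difference a resolver actually sees.

    legit = "example.com"
    spoof = "exam\u0440le.com"   # looks identical in many fonts, but contains Cyrillic U+0440

    print(legit == spoof)        # False: they are different strings
    print(spoof.encode("idna"))  # the ASCII (xn--, Punycode) form the spoofed name resolves under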

Introducing Cloudflare for Campaigns

Cloudflare Blog

During the past year, we saw nearly 2 billion global citizens go to the polls to vote in democratic elections. There were major elections in more than 50 countries, including India, Nigeria, and the United Kingdom, as well as elections for the European Parliament. In 2020, we will see a similar number of elections in countries from Peru to Myanmar. In November, U.S. citizens will cast their votes for the 46th President, 435 seats in the U.S. House of Representatives, 35 of the 100 seats in the U.S. Senate, and many state and local elections.

Recognizing the importance of maintaining public access to election information, Cloudflare launched the Athenian Project in 2017, providing U.S. state and local government entities with the tools needed to secure their election websites for free. As we've seen, however, political parties and candidates for office all over the world are also frequent targets for cyberattack. Cybersecurity needs for campaign websites and internal tools are at an all-time high.

Although Cloudflare has helped improve the security and performance of political parties and candidates for office all over the world for years, we've long felt that we could do more. So today, we're announcing Cloudflare for Campaigns, a suite of Cloudflare services tailored to campaign needs. Cloudflare for Campaigns is designed to make it easier for all political campaigns and parties, especially those with small teams and limited resources, to get access to cybersecurity services.

Risks faced by political campaigns

Since Russians attempted to use cyberattacks to interfere in the U.S. Presidential election in 2016, the news has been filled with reports of cyber threats against political campaigns, both in the United States and around the world. Hackers targeted the election campaigns of Emmanuel Macron in France and Angela Merkel in Germany with phishing attacks, the main political parties in the UK with DDoS attacks, and congressional campaigns in California with a combination of malware, DDoS attacks and brute force login attempts. Both because of our services to state and local government election websites through the Athenian Project and because a significant number of political parties and candidates for office use our services, Cloudflare has seen many attacks on election infrastructure and political campaigns firsthand.

During the 2020 U.S. election cycle, Cloudflare has provided services to 18 major presidential campaigns, as well as a range of congressional campaigns. On a typical day, Cloudflare blocks 400,000 attacks against political campaigns, and, on a busy day, Cloudflare blocks more than 40 million attacks against campaigns.

What is Cloudflare for Campaigns?

Cloudflare for Campaigns is a suite of Cloudflare products focused on the needs of political campaigns, particularly smaller campaigns that don't have the resources to bring significant cybersecurity resources in house.

To ensure the security of a campaign website, the Cloudflare for Campaigns package includes Business-level service, as well as security tools particularly helpful for political campaign websites, such as the web application firewall, rate limiting, load balancing, Enterprise-level "I am Under Attack" support, bot management, and multi-user account enablement.

To ensure the security of internal campaign teams, Cloudflare for Campaigns will also provide Cloudflare Access, allowing campaigns to secure, authenticate, and monitor user access to any domain, application, or path on Cloudflare, without using a VPN. Along with Access, we will be providing Cloudflare Gateway with DNS-based filtering at multiple locations to protect campaign staff as they navigate the Internet, keeping malicious content off the campaign's network and helping prevent users from running into phishing scams or malware sites. Campaigns can use Gateway after the product's public release.

Cloudflare for Campaigns also includes a Cloudflare reliability and security guide, a set of best practices for political campaigns to maintain their campaign sites and secure their internal teams.

Regulatory Challenges

Although there is widespread agreement that campaigns and political parties face threats of cyberattack, there is less consensus on how best to get political campaigns the help they need. Many political campaigns and political parties operate under resource constraints, without the technological capability and financial resources to dedicate to cybersecurity. At the same time, campaigns around the world are subject to a variety of different regulations intended to prevent corruption of democratic processes. As a practical matter, that means that, although campaigns may not have the resources needed to access cybersecurity services, donation of cybersecurity services to campaigns may not always be allowed.

In the U.S., campaign finance regulations prohibit corporations from providing any contributions of either money or services to federal candidates or political party organizations. These rules prevent companies from offering free or discounted services if those services are not provided on the same terms and conditions to similarly situated members of the general public. The Federal Election Commission (FEC), which enforces U.S. campaign finance laws, has struggled with the issue of how best to apply those rules to the provision of free or discounted cybersecurity services to campaigns. In a number of advisory opinions, it has publicly wrestled with the competing priorities of securing campaigns from cyberattack while not opening a backdoor to donations of goods and services intended to curry favor with particular candidates.

The FEC has issued two advisory opinions to tech companies seeking to provide free or discounted cybersecurity services to campaigns. In 2018, the FEC approved a request by Microsoft to offer a package of enhanced online account security protections for "election-sensitive" users.
The FEC reasoned that Microsoft was offering the services to its paid users "based on commercial rather than political considerations, in the ordinary course of its business and not merely for promotional consideration or to generate goodwill." In July 2019, the FEC approved a request by a cybersecurity company to provide low-cost anti-phishing services to campaigns because those services would be provided in the ordinary course of business and on the same terms and conditions as offered to similarly situated non-political clients.

In September 2018, a month after Microsoft submitted its request, Defending Digital Campaigns (DDC), a nonprofit established with the mission to "secure our democratic campaign process by providing eligible campaigns and political parties, committees, and related organizations with knowledge, training, and resources to defend themselves from cyber threats," submitted a request to the FEC to offer free or reduced-cost cybersecurity services, including from technology corporations, to federal candidates and parties. Over the following months, the FEC issued and requested comment on multiple draft opinions on whether the donation was permissible and, if so, on what basis. As described by the FEC, to support its position, DDC represented that "federal candidates and parties are singularly ill-equipped to counteract these threats." The FEC's advisory opinion to DDC noted:

"You [DDC] state that presidential campaign committees and national party committees require expert guidance on cybersecurity and you contend that the 'vast majority of campaigns' cannot afford full-time cybersecurity staff and that 'even basic cybersecurity consulting software and services' can overextend the budgets of most congressional campaigns. AOR004. For instance, you note that a congressional candidate in California reported a breach to the Federal Bureau of Investigation (FBI) in March of this year but did not have the resources to hire a professional cybersecurity firm to investigate the attack, or to replace infected computers. AOR003."

In May 2019, the FEC approved DDC's request to partner with technology companies to provide free and discounted cybersecurity services "[u]nder the unusual and exigent circumstances" presented by the request and "in light of the demonstrated, currently enhanced threat of foreign cyberattacks against party and candidate committees."

All of these opinions demonstrate the FEC's desire to allow campaigns to access affordable cybersecurity services because of the heightened threat of cyberattack, while still being cautious to ensure that those services are offered transparently and consistent with the goals of campaign finance laws.

Partnering with DDC to Provide Free Services to US Candidates

We share the view of both DDC and the FEC that political campaigns -- which are central to our democracy -- must have the tools to protect themselves against foreign cyberattack.
Cloudflare is therefore excited to announce a new partnership with DDC to provide Cloudflare for Campaigns for free to candidates and parties that meet DDC's criteria.

To receive free services under DDC, political campaigns must meet the following criteria, as the DDC laid out to the FEC:

- A House candidate's committee that has at least $50,000 in receipts for the current election cycle, and a Senate candidate's committee that has at least $100,000 in receipts for the current election cycle;
- A House or Senate candidate's committee for candidates who have qualified for the general election ballot in their respective elections; or
- Any presidential candidate's committee whose candidate is polling above five percent in national polls.

For more information on eligibility for these services under DDC and the next steps, please visit cloudflare.com/campaigns/usa.

Election package

Although political campaigns are regulated differently all around the world, Cloudflare believes that the integrity of all political campaigns should be protected against powerful adversaries. With this in mind, Cloudflare will therefore also be offering Cloudflare for Campaigns as a paid service, designed to help campaigns all around the world as we attempt to address regulatory hurdles. For more information on how to sign up for the Cloudflare election package, please visit cloudflare.com/campaigns.

A cost-effective and extensible testbed for transport protocol development

Cloudflare Blog

This was originally published on Perf Planet's 2019 Web Performance Calendar.

At Cloudflare, we develop protocols at multiple layers of the network stack. In the past, we focused on HTTP/1.1, HTTP/2, and TLS 1.3. Now, we are working on QUIC and HTTP/3, which are still in IETF draft, but gaining a lot of interest.

QUIC is a secure and multiplexed transport protocol that aims to perform better than TCP under some network conditions. It is specified in a family of documents: a transport layer which specifies packet format and basic state machine, recovery and congestion control, security based on TLS 1.3, and an HTTP application layer mapping, which is now called HTTP/3.

Let's focus on the transport and recovery layer first. This layer provides a basis for what is sent on the wire (the packet binary format) and how we send it reliably. It includes how to open the connection, how to handshake a new secure session with the help of TLS, how to send data reliably, and how to react when there is packet loss or reordering of packets. It also includes flow control and congestion control to interact well with other transport protocols in the same network. With confidence in the basic transport and recovery layer, we can take a look at higher application layers such as HTTP/3.

To develop such a transport protocol, we need multiple stages of the development environment. Since this is a network protocol, it's best to test in an actual physical network to see how it works on the wire. We may start the development using localhost, but after some time we may want to send and receive packets with other hosts. We can build a lab with a couple of virtual machines, using VirtualBox, VMware or even Docker. We also have a local testing environment with a Linux VM. But sometimes these have a limited network (localhost only) or are noisy due to other processes on the same host or virtual machines.

The next step is to have a test lab, typically an isolated network focused on protocol analysis only, consisting of dedicated x86 hosts. Lab configuration is particularly important for testing various cases - there is no one-size-fits-all scenario for protocol testing. For example, EDGE is still running in production mobile networks, but LTE is dominant and 5G deployment is in its early stages. WiFi is very common these days. We want to test our protocol in all those environments. Of course, we can't buy every type of machine or have a very expensive network simulator for every type of environment, so using cheap hardware and an open source OS where we can configure similar environments is ideal.

The QUIC Protocol Testing Lab

The goal of the QUIC testing lab is to aid transport layer protocol development. To develop a transport protocol we need a way to control our network environment and a way to get as many different types of debugging data as possible. We also need to get metrics for comparison with other protocols in production.

The QUIC Testing Lab has the following goals:

- Help with transport protocol development: Developing a new transport layer requires many iterations, from building and validating packets as per the protocol spec, to making sure everything works fine under moderate load, to very harsh conditions such as low bandwidth and high packet loss. We need a way to run tests with various network conditions reproducibly in order to catch unexpected issues.
- Debugging transport protocol development: Recording as much debugging info as we can is important for fixing bugs.
Looking into packet captures definitely helps, but we also need a detailed debugging log of the server and client to understand the what and why for each packet. For example, when a packet is sent, we want to know why. Is this because there is an application which wants to send some data? Or is this a retransmit of data previously known as lost? Or is this a loss probe which is not an actual packet loss but sent to see if the network is lossy?
- Performance comparison between protocols: We want to understand the performance of a new protocol by comparison with existing protocols such as TCP, or with a previous version of the protocol under development. We also want to test with varying parameters such as changing the congestion control mechanism, changing various timeouts, or changing the buffer sizes at various levels of the stack.
- Finding a bottleneck or errors easily: Running tests we may see an unexpected error - a transfer that timed out, or ended with an error, or a transfer that was corrupted at the client side. We need to make sure every test run completes correctly, by using a checksum of the original file to compare with what is actually downloaded, or by checking various error codes at the protocol or API level.

Having a test lab with separate hardware gives us the following benefits:

- We can configure the testing lab without public Internet access - safe and quiet.
- We have handy access to the hardware and its console for maintenance purposes, or for adding or updating hardware.
- We can try other CPU architectures. For clients we use the Raspberry Pi for regular testing because this is an ARM architecture (32-bit or 64-bit), similar to modern smartphones. Testing with the ARM architecture therefore helps with compatibility testing before going into a smartphone OS.
- We can add a real smartphone for testing, such as an Android phone or iPhone. We can test with WiFi, but these devices also support Ethernet, so we can test them with a wired network for better consistency.

Lab Configuration

Here is a diagram of our QUIC Protocol Testing Lab:

This is a conceptual diagram, and we need to configure a switch for connecting each machine. Currently, we have Raspberry Pis (2 and 3) as an Origin and a Client, and small Intel x86 boxes for the Traffic Shaper and Edge server, plus Ethernet switches for interconnectivity.

- Origin simply serves HTTP and HTTPS test objects using a web server. Client may download a file from Origin directly to simulate a download direct from a customer's origin server.
- Client downloads a test object from Origin or Edge, using a different protocol. In a typical configuration, Client connects to Edge instead of Origin, to simulate an edge server in the real world. For TCP/HTTP we are using the curl command line client and for QUIC, quiche's http3_client with some modification.
- Edge runs Cloudflare's web server to serve HTTP/HTTPS via TCP and also the QUIC protocol using quiche. The Edge server is installed with the same Linux kernel used on Cloudflare's production machines in order to have the same low level network stack.
- Traffic Shaper sits between Client and Edge (and Origin), controlling network conditions. Currently we are using FreeBSD and ipfw + dummynet. Traffic shaping can also be done using Linux's netem, which provides additional network simulation features.

The goal is to run tests with various network conditions, such as bandwidth, latency and packet loss, upstream and downstream.
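
On a Linux-based shaper, those conditions can also be applied programmatically from an automation script. The wrapper below is only an illustrative sketch: the interface name and the profile values are assumptions, and a real setup would pick netem or dummynet parameters to match the link being simulated.

    import subprocess

    def apply_profile(iface: str, delay_ms: int, rate_mbit: float, loss_pct: float) -> None:
        """Replace the current netem discipline on iface with the given (assumed) profile."""
        subprocess.run(
            ["tc", "qdisc", "replace", "dev", iface, "root", "netem",
             "delay", f"{delay_ms}ms",
             "rate", f"{rate_mbit}mbit",
             "loss", f"{loss_pct}%"],
            check=True,
        )

    # e.g. a rough LTE-like profile on the shaper's client-facing interface
    apply_profile("eth0", delay_ms=25, rate_mbit=22, loss_pct=0.0)
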
The lab is able to run a plaintext HTTP test, but currently our focus of testing is HTTPS over TCP and HTTP/3 over QUIC. Since QUIC runs over UDP, both TCP and UDP traffic need to be controlled.

Test Automation and Visualization

In the lab, we have a script installed on Client which can run a batch of tests with various configuration parameters. For each test combination, we can define a test configuration, including:

- Network condition - bandwidth, latency, packet loss (upstream and downstream). For example, using the netem traffic shaper we can simulate an LTE network as below (RTT=50ms, BW=22Mbps upstream and downstream, with BDP queue size):

    $ tc qdisc add dev eth0 root handle 1:0 netem delay 25ms
    $ tc qdisc add dev eth0 parent 1:1 handle 10: tbf rate 22mbit buffer 68750 limit 70000

- Test object sizes - 1KB, 8KB, ... 32MB
- Test protocols: HTTPS (TCP) and QUIC (UDP)
- Number of runs and number of requests in a single connection

The test script outputs a CSV file of results for importing into other tools for data processing and visualization - such as Google Sheets, Excel or even a Jupyter notebook. It is also able to post the results to a database (ClickHouse in our case), so we can query and visualize the results.

Sometimes a whole test combination takes a long time - the current standard test set with simulated 2G, 3G, LTE, WiFi and various object sizes repeated 10 times for each request may take several hours to run. Large object testing on a slow network takes most of the time, so sometimes we also need to run a limited test (e.g. testing LTE-like conditions only for a sanity check) for quick debugging.

Chart using Google Sheets:

The following comparison chart shows the total transfer time in msec for TCP vs QUIC for different network conditions. The QUIC protocol used here is a development version.
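
The batch script itself is not published here, but a minimal sketch of that kind of runner could look like the following. Everything in it is illustrative: the edge hostname, profile names and object names are assumptions, and a real version would also reconfigure the traffic shaper between profiles and drive quiche's http3_client alongside curl for the QUIC runs.

    #!/usr/bin/env python3
    """Illustrative batch test runner: fetch objects under different (assumed) network
    profiles and record transfer times to a CSV for later charting."""
    import csv
    import subprocess
    import time

    EDGE = "https://edge.lab.example"        # assumed lab hostname
    PROFILES = ["2g", "3g", "lte", "wifi"]   # applied on the traffic shaper out of band
    OBJECTS = ["1KB.bin", "8KB.bin", "1MB.bin", "32MB.bin"]
    RUNS = 10

    def fetch_tcp(url: str) -> float:
        """Download over HTTPS/TCP with curl and return the transfer time in seconds."""
        start = time.monotonic()
        subprocess.run(["curl", "--silent", "--output", "/dev/null", url], check=True)
        return time.monotonic() - start

    with open("results.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["profile", "protocol", "object", "run", "seconds"])
        for profile in PROFILES:
            # A real runner would push the matching netem/dummynet settings to the
            # traffic shaper here (for example over SSH) before measuring.
            for obj in OBJECTS:
                for run in range(RUNS):
                    seconds = fetch_tcp(f"{EDGE}/{obj}")
                    writer.writerow([profile, "tcp", obj, run, f"{seconds:.3f}"])
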
Debugging and performance analysis using a smartphone

Mobile devices have become a crucial part of our day to day life, so testing the new transport protocol on mobile devices is critically important for mobile app performance. To facilitate that, we need a mobile test app which will proxy data over the new transport protocol under development. With this we have the ability to analyze protocol functionality and performance on mobile devices with different network conditions.

Adding a smartphone to the testbed mentioned above gives an advantage in terms of understanding real performance issues. The major smartphone operating systems, iOS and Android, have quite different networking stacks. Adding a smartphone to the testbed gives the ability to understand these operating system network stacks in depth, which aids new protocol designs.

The above figure shows the network block diagram of another similar lab testbed used for protocol testing where a smartphone is connected both wired and wirelessly. A Linux netem based traffic shaper sits in between the client and server, shaping the traffic. Various networking profiles are fed to the traffic shaper to mimic real world scenarios. The client can be either an Android or iOS based smartphone; the server is a vanilla web server serving static files. Client, server and traffic shaper are all connected to the Internet along with the private lab network for management purposes.

The above lab has mobile devices for both Android and iOS installed with a test app built with proprietary client proxy software for proxying data over the new transport protocol under development. The test app also has the ability to make HTTP requests over TCP for comparison purposes.

The Android or iOS test app can be used to issue multiple HTTPS requests of different object sizes, sequentially and concurrently, using TCP and QUIC as the underlying transport protocol. Later, TTOTAL (total transfer time) of each HTTPS request is used to compare TCP and QUIC performance over different network conditions. One such comparison is shown below.

The table above shows the total transfer time taken for TCP and QUIC requests over an LTE network profile, fetching different objects with different concurrency levels using the test app. Here TCP goes over the native OS network stack and QUIC goes over the Cloudflare QUIC stack.

Debugging network performance issues is hard when it comes to mobile devices. By adding an actual smartphone into the testbed itself we have the ability to take packet captures at different layers. These are very critical in analyzing and understanding protocol performance.

It's easy and straightforward to capture packets and analyze them using the tcpdump tool on x86 boxes, but it's a challenge to capture packets on iOS and Android devices. On an iOS device, 'rvictl' lets us capture packets on an external interface, but 'rvictl' has some drawbacks, such as inaccurate timestamps. Since we are dealing with millisecond level events, timestamps need to be accurate to analyze the root cause of a problem.

We can capture packets on internal loopback interfaces on jailbroken iPhones and rooted Android devices. Jailbreaking a recent iOS device is nontrivial. We also need to make sure that auto-update of any sort is disabled on such a phone, otherwise it will disable the jailbreak and we have to start the whole process again. With a jailbroken phone we have root access to the device, which lets us take packet captures as needed using tcpdump.

Packet captures taken using jailbroken iOS devices or rooted Android devices connected to the lab testbed help us analyze performance bottlenecks and improve protocol performance. iOS and Android devices have different network stacks in their core operating systems. These packet captures also help us understand the network stacks of these mobile devices; for example, on iOS devices, packets punted through the loopback interface had a mysterious delay of 5 to 7 ms.

Conclusion

Cloudflare is actively involved in helping to drive forward the QUIC and HTTP/3 standards by testing and optimizing these new protocols in simulated real world environments. By simulating a wide variety of networks we are working on our mission of helping build a better Internet - for everyone, everywhere.

We would like to thank SangJo Lee, Hiren Panchasara, Lucas Pardue and Sreeni Tellakula for their contributions.

Helping mitigate the Citrix NetScaler CVE with Cloudflare Access

Cloudflare Blog

Yesterday, Citrix sent an updated notification to customers warning of a vulnerability in their Application Delivery Controller (ADC) product. If exploited, malicious attackers can bypass the login page of the administrator portal, without authentication, to perform arbitrary code execution.

No patch is available yet. Citrix expects to have a fix for certain versions on January 20 and for others at the end of the month. In the interim, Citrix has asked customers to attempt to mitigate the vulnerability. The recommended steps involve running a number of commands from an administrator command line interface.

The vulnerability relied on by attackers requires that they first be able to reach a login portal hosted by the ADC. Cloudflare can help teams secure that page and the resources protected by the ADC. Teams can place the login page, as well as the administration interface, behind Cloudflare Access' identity proxy to prevent unauthenticated users from making requests to the portal.

Exploiting URL paths

Citrix ADC, also known as Citrix NetScaler, is an application delivery controller that provides Layer 3 through Layer 7 security for applications and APIs. Once deployed, administrators manage the installation of the ADC through a portal available at a dedicated URL on a hostname they control.

Users and administrators can reach the ADC interface over multiple protocols, but it appears that the vulnerability stems from HTTP paths that contain "/vpn/../vpns/" via the VPN or AAA endpoints, from which a directory traversal exploit is possible (a short sketch at the end of this post illustrates the normalization involved).

The suggested mitigation steps ask customers to run commands which enforce new responder policies for the ADC interface. Those policies return 403s when certain paths are requested, blocking unauthenticated users from reaching directories that sit behind the authentication flow.

Protecting administrator portals with Cloudflare Access

To exploit this vulnerability, attackers must first be able to reach a login portal hosted by the ADC. As part of a defense-in-depth strategy, Cloudflare Access can prevent attackers from ever reaching the panel over HTTP or SSH.

Cloudflare Access, part of Cloudflare for Teams, protects internally managed resources by checking each request for identity and permission. When administrators secure an application behind Access, any request to the hostname of that application stops at Cloudflare's network first. Once there, Cloudflare Access checks the request against the list of users who have permission to reach the application.

Deploying Access does not require exposing new holes in corporate firewalls. Teams connect their resources through a secure outbound connection, Argo Tunnel, which runs in your infrastructure to connect the applications and machines to Cloudflare. That tunnel makes outbound-only calls to the Cloudflare network, and organizations can replace complex firewall rules with just one: disable all inbound connections.

To defend against attackers addressing IPs directly, Argo Tunnel can help secure the interface and force outbound requests through Cloudflare Access. With Argo Tunnel, and firewall rules preventing inbound traffic, no request can reach those IPs without first hitting Cloudflare, where Access can evaluate the request for authentication.

Administrators then build rules to decide who should authenticate to and reach the tools protected by Access.
Whether those resources are virtual machines powering business operations or internal web applications, like Jira or iManage, when a user needs to connect, they pass through Cloudflare first.

When users need to connect to the tools behind Access, they are prompted to authenticate with their team's SSO and, if valid, are instantly connected to the application without being slowed down. Internally managed apps suddenly feel like SaaS products, and the login experience is seamless and familiar.

Behind the scenes, every request made to those internal tools hits Cloudflare first, where we enforce identity-based policies. Access evaluates and logs every request to those apps for identity, giving administrators more visibility and security than a traditional VPN.

Cloudflare Access can also be bundled with the Cloudflare WAF, and WAF rules can be applied to guard against this vulnerability as well. Adding Cloudflare Access, the Cloudflare WAF, and the mitigation commands from Citrix together provides layers of security while a patch is in development.

How to get started

We recommend that users of the Citrix ADC follow the mitigation steps recommended by Citrix. Cloudflare Access adds another layer of security by enforcing identity-based authentication for requests made over HTTP and SSH to the ADC interface. Together, these steps can help form a defense-in-depth strategy until a patch is released by Citrix.

To get started, Citrix ADC users can place their ADC interface and exposed endpoints behind a bastion host secured by Cloudflare Access. On that bastion host, administrators can use Cloudflare Argo Tunnel to open outbound-only connections to Cloudflare through which HTTP and SSH requests can be proxied.

Once deployed, users of the login portal can connect to the protected hostname. Cloudflare Access will prompt them to log in with their identity provider, and Cloudflare will validate the user against the rules created to control who can reach the interface. If authenticated and allowed, the user will be able to connect. No other requests will be able to reach the interface over HTTP or SSH without authentication.

The first five seats of Cloudflare Access are free. Teams can sign up here to get started.
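
As promised earlier, here is a small sketch of why paths containing "/vpn/../vpns/" are dangerous: once the ".." segment is normalized, the request lands inside the restricted /vpns/ directory, sidestepping checks applied only to the literal "/vpn/" prefix. The path below is made up for illustration; it is not the actual exploit URL.

    import posixpath

    requested = "/vpn/../vpns/some/internal/script"   # hypothetical request path
    print(posixpath.normpath(requested))              # -> /vpns/some/internal/script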

Accelerating UDP packet transmission for QUIC

Cloudflare Blog

This was originally published on Perf Planet's 2019 Web Performance Calendar.

QUIC, the new Internet transport protocol designed to accelerate HTTP traffic, is delivered on top of UDP datagrams, to ease deployment and avoid interference from network appliances that drop packets from unknown protocols. This also allows QUIC implementations to live in user-space, so that, for example, browsers will be able to implement new protocol features and ship them to their users without having to wait for operating system updates.

But while a lot of work has gone into optimizing TCP implementations as much as possible over the years, including building offloading capabilities in both software (like in operating systems) and hardware (like in network interfaces), UDP hasn't received quite as much attention as TCP, which puts QUIC at a disadvantage. In this post we'll look at a few tricks that help mitigate this disadvantage for UDP, and by association QUIC.

For the purpose of this blog post we will only be concentrating on measuring throughput of QUIC connections, which, while necessary, is not enough to paint an accurate overall picture of the performance of the QUIC protocol (or its implementations) as a whole.

Test Environment

The client used in the measurements is h2load, built with QUIC and HTTP/3 support, while the server is NGINX, built with the open-source QUIC and HTTP/3 module provided by Cloudflare, which is based on quiche (github.com/cloudflare/quiche), Cloudflare's own open-source implementation of QUIC and HTTP/3. The client and server are run on the same host (my laptop) running Linux 5.3, so the numbers don't necessarily reflect what one would see in a production environment over a real network, but it should still be interesting to see how much of an impact each of the techniques has.

Baseline

Currently the code that implements QUIC in NGINX uses the sendmsg() system call to send a single UDP packet at a time.

    ssize_t sendmsg(int sockfd, const struct msghdr *msg, int flags);

The struct msghdr carries a struct iovec which can in turn carry multiple buffers. However, all of the buffers within a single iovec will be merged together into a single UDP datagram during transmission. The kernel will then take care of encapsulating the buffer in a UDP packet and sending it over the wire.

The throughput of this particular implementation tops out at around 80-90 MB/s, as measured by h2load when performing 10 sequential requests for a 100 MB resource.

sendmmsg()

Due to the fact that sendmsg() only sends a single UDP packet at a time, it needs to be invoked quite a lot in order to transmit all of the QUIC packets required to deliver the requested resources, as illustrated by the following bpftrace command:

    % sudo bpftrace -p $(pgrep nginx) -e 'tracepoint:syscalls:sys_enter_sendm* { @[probe] = count(); }'
    Attaching 2 probes...
    @[tracepoint:syscalls:sys_enter_sendmsg]: 904539

Each of those system calls causes an expensive context switch between the application and the kernel, thus impacting throughput. But while sendmsg() only transmits a single UDP packet at a time for each invocation, its close cousin sendmmsg() (note the additional "m" in the name) is able to batch multiple packets per system call:

    int sendmmsg(int sockfd, struct mmsghdr *msgvec, unsigned int vlen, int flags);

Multiple struct mmsghdr structures can be passed to the kernel as an array, each in turn carrying a single struct msghdr with its own struct iovec, with each element in the msgvec array representing a single UDP datagram.

Let's see what happens when NGINX is updated to use sendmmsg() to send QUIC packets:

    % sudo bpftrace -p $(pgrep nginx) -e 'tracepoint:syscalls:sys_enter_sendm* { @[probe] = count(); }'
    Attaching 2 probes...
    @[tracepoint:syscalls:sys_enter_sendmsg]: 2437
    @[tracepoint:syscalls:sys_enter_sendmmsg]: 15676

The number of system calls went down dramatically, which translates into an increase in throughput, though not quite as big as the decrease in syscalls:

UDP segmentation offload

With sendmsg() as well as sendmmsg(), the application is responsible for separating each QUIC packet into its own buffer in order for the kernel to be able to transmit it. While the implementation in NGINX uses static buffers, so there is no overhead in allocating them, all of these buffers need to be traversed by the kernel during transmission, which can add significant overhead.

Linux supports a feature, Generic Segmentation Offload (GSO), which allows the application to pass a single "super buffer" to the kernel, which will then take care of segmenting it into smaller packets. The kernel will try to postpone the segmentation as much as possible to reduce the overhead of traversing outgoing buffers (some NICs even support hardware segmentation, but it was not tested in this experiment due to lack of capable hardware).

Originally GSO was only supported for TCP, but support for UDP GSO was added in Linux 4.18. This feature can be controlled using the UDP_SEGMENT socket option:

    setsockopt(fd, SOL_UDP, UDP_SEGMENT, &gso_size, sizeof(gso_size));

as well as via ancillary data, to control segmentation for each sendmsg() call:

    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_UDP;
    cm->cmsg_type = UDP_SEGMENT;
    cm->cmsg_len = CMSG_LEN(sizeof(uint16_t));
    *((uint16_t *) CMSG_DATA(cm)) = gso_size;

where gso_size is the size of each segment that forms the "super buffer" passed to the kernel from the application. Once configured, the application can provide one contiguous large buffer containing a number of packets of gso_size length (as well as a final smaller packet), that will then be segmented by the kernel (or the NIC if hardware segmentation offloading is supported and enabled). Up to 64 segments can be batched with the UDP_SEGMENT option.

GSO with plain sendmsg() already delivers a significant improvement:

And indeed the number of syscalls also went down significantly, compared to plain sendmsg():

    % sudo bpftrace -p $(pgrep nginx) -e 'tracepoint:syscalls:sys_enter_sendm* { @[probe] = count(); }'
    Attaching 2 probes...
    @[tracepoint:syscalls:sys_enter_sendmsg]: 18824

GSO can also be combined with sendmmsg() to deliver an even bigger improvement. The idea is that each struct msghdr can be segmented in the kernel by setting the UDP_SEGMENT option using ancillary data, allowing an application to pass multiple "super buffers", each carrying up to 64 segments, to the kernel in a single system call. The improvement is again fairly significant:

Evolving from AFAP

Transmitting packets as fast as possible is easy to reason about, and there's much fun to be had in optimizing applications for that, but in practice this is not always the best strategy when optimizing protocols for the Internet. Bursty traffic is more likely to cause or be affected by congestion on any given network path, which will inevitably defeat any optimization implemented to increase transmission rates.

Packet pacing is an effective technique to squeeze out more performance from a network flow.
The idea is that adding a short delay between each outgoing packet will smooth out bursty traffic and reduce the chance of congestion and packet loss. For TCP this was originally implemented in Linux via the fq packet scheduler, and later by the BBR congestion control algorithm implementation, which implements its own pacer.

Due to the nature of current QUIC implementations, which reside entirely in user-space, pacing of QUIC packets conflicts with any of the techniques explored in this post: pacing each packet separately during transmission will prevent any batching on the application side, and in turn batching will prevent pacing, as batched packets will be transmitted as fast as possible once received by the kernel.

However, Linux provides some facilities to offload the pacing to the kernel and give back some control to the application:

- SO_MAX_PACING_RATE: an application can set this socket option to instruct the fq packet scheduler to pace outgoing packets up to the given rate. This works for UDP sockets as well, but it is yet to be seen how this can be integrated with QUIC, as a single UDP socket can be used for multiple QUIC connections (unlike TCP, where each connection has its own socket). In addition, this is not very flexible, and might not be ideal when implementing the BBR pacer.
- SO_TXTIME / SCM_TXTIME: an application can use these options to schedule transmission of specific packets at specific times, essentially instructing fq to delay packets until the provided timestamp is reached. This gives the application a lot more control, and can be easily integrated into sendmsg() as well as sendmmsg(). But it does not yet support specifying different times for each packet when GSO is used, as there is no way to define multiple timestamps for packets that need to be segmented (each segmented packet essentially ends up being sent at the same time anyway).

While the performance gains achieved by using the techniques illustrated here are fairly significant, there are still open questions around how any of this will work with pacing, so more experimentation is required.
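
For readers who want to poke at UDP GSO without writing C, here is a minimal sketch of the per-call ancillary-data approach described above, written in Python purely for brevity. It assumes a Linux 4.18+ kernel; the UDP_SEGMENT option number (103) is hard-coded because Python's socket module does not export it, and the destination address is a placeholder.

    import socket
    import struct

    UDP_SEGMENT = 103                 # from linux/udp.h; not exported by Python's socket module
    SOL_UDP = socket.IPPROTO_UDP      # on Linux, SOL_UDP == IPPROTO_UDP == 17

    GSO_SIZE = 1200                   # size of each segment the kernel will carve out
    SEGMENTS = 4
    payload = b"\x00" * (GSO_SIZE * SEGMENTS)   # one contiguous "super buffer"

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    # Ancillary data on this one call tells the kernel to segment the buffer
    # into GSO_SIZE-byte UDP datagrams during transmission.
    ancdata = [(SOL_UDP, UDP_SEGMENT, struct.pack("@H", GSO_SIZE))]
    sock.sendmsg([payload], ancdata, 0, ("192.0.2.1", 4433))  # placeholder destination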

Prototyping optimizations with Cloudflare Workers and WebPageTest

Cloudflare Blog

This article was originally published as part of Perf Planet's 2019 Web Performance Calendar.

Have you ever wanted to quickly test a new performance idea, or see if the latest performance wisdom is beneficial to your site? As web performance appears to be a stochastic process, it is really important to be able to iterate quickly and review the effects of different experiments. The challenge is to be able to arbitrarily change requests and responses without the overhead of setting up another Internet-facing server. This can be straightforward to implement by combining two of my favourite technologies: WebPageTest and Cloudflare Workers. Pat Meenan sums this up with the following slide from a recent "getting the most out of WebPageTest" presentation:

So what is Cloudflare Workers and why is it ideally suited to easy prototyping of optimizations?

Cloudflare Workers

From the documentation:

Cloudflare Workers provides a lightweight JavaScript execution environment that allows developers to augment existing applications or create entirely new ones without configuring or maintaining infrastructure.

A Cloudflare Worker is a programmable proxy which brings the simplicity and flexibility of the Service Workers event-based fetch API from the browser to the edge. This allows a worker to intercept and modify requests and responses. With the Service Worker API you can add an EventListener to any fetch event that is routed through the worker script and modify the request to come from a different origin. Cloudflare Workers also provides a streaming HTMLRewriter to enable on-the-fly modification of HTML as it passes through the worker. The streaming nature of this parser ensures latency is minimised, as the entire HTML document does not have to be buffered before rewriting can happen.

Setting up a worker

It is really quick and easy to sign up for a free subdomain at workers.dev, which provides you with 100,000 free requests per day. There is a quick-start guide available here.

To be able to run the examples in this post you will need to install Wrangler, the CLI tool for deploying workers. Once Wrangler is installed, run the following command to download the example worker project:

    wrangler generate wpt-proxy https://github.com/xtuc/WebPageTest-proxy

You will then need to update the wrangler.toml with your account_id, which can be found in the dashboard in the right sidebar. Then configure an API key with the command:

    wrangler config

Finally, you can publish the worker with:

    wrangler publish

At this point, the worker will be active at https://wpt-proxy.<your-subdomain>.workers.dev.

WebPageTest overrideHost

Now that your worker is configured, the next step is to configure WebPageTest to redirect requests through the worker. WebPageTest has a feature where it can re-point arbitrary origins to a different domain. To access the feature in WebPageTest, you need to use the WebPageTest scripting language "overrideHost" command, as shown:

This example will redirect all network requests for www.bbc.co.uk to wpt-proxy.prf.workers.dev instead. WebPageTest also adds an x-host header to each redirected request so that the destination can determine for which host the request was originally intended:

    x-host: www.bbc.co.uk

The script can process multiple overrideHost commands to override multiple different origins.
If HTTPS is used, WebPageTest can use HTTP/2 and benefit from connection coalescing:

    overrideHost www.bbc.co.uk wpt-proxy.prf.workers.dev
    overrideHost nav.files.bbci.co.uk wpt-proxy.prf.workers.dev
    navigate https://www.bbc.co.uk

It also supports wildcards:

    overrideHost *bbc.co.uk wpt-proxy.prf.workers.dev
    navigate https://www.bbc.co.uk

There are a few special strings that can be used in a script when bulk testing, so a single script can be re-used across multiple URLs:

- %URL% - replaced with the URL of the current test
- %HOST% - replaced with the hostname of the URL of the current test
- %HOSTR% - replaced with the hostname of the final URL in case the test URL does a redirect

A more generic script would look like this:

    overrideHost %HOSTR% wpt-proxy.prf.workers.dev
    navigate %URL%

Basic worker

In the base example below, the worker listens for the fetch event, looks for the x-host header that WebPageTest has set and responds by fetching the content from the original URL:

    /*
     * Handle all requests.
     * Proxy requests with an x-host header and return 403
     * for everything else.
     */
    addEventListener("fetch", event => {
      const host = event.request.headers.get('x-host');
      if (host) {
        const url = new URL(event.request.url);
        const originUrl = url.protocol + '//' + host + url.pathname + url.search;
        let init = {
          method: event.request.method,
          redirect: "manual",
          headers: [...event.request.headers]
        };
        event.respondWith(fetch(originUrl, init));
      } else {
        const response = new Response('x-host header missing', {status: 403});
        event.respondWith(response);
      }
    });

The source code can be found here, and instructions to download and deploy this worker are described in the earlier section.

So what happens if we point all the domains on the BBC website through this worker, using the following config, with a 3G Fast setting from a UK test location?

    overrideHost *bbci.co.uk wpt.prf.workers.dev
    overrideHost *bbc.co.uk wpt.prf.workers.dev
    navigate https://www.bbc.co.uk

Comparison of the BBC website when using a single connection: Before / After

The potential performance improvement of loading a page over a single connection, eliminating the additional DNS lookups, TCP connections and TLS handshakes, can be seen by comparing the filmstrips and waterfalls. There are several reasons why you may not want or be able to move everything to a single domain, but at least it is now easy to see what the performance difference would be.

HTMLRewriter

With the HTMLRewriter, it is possible to change the HTML response as it passes through the worker. A jQuery-like syntax provides CSS-selector matching and a standard set of DOM mutation methods. For instance, you could rewrite your page to measure the effects of different preload/prefetch strategies, review the performance savings of removing or using different third-party scripts, or stock-take the HEAD of your document.

One piece of performance advice is to self-host some third-party scripts. This example script invokes the HTMLRewriter to listen for a script tag with a src attribute.
If the script is from a proxiable domain, the src is rewritten to be first-party, with a specific path prefix.

    async function rewritePage(request) {
      const response = await fetch(request);
      return new HTMLRewriter()
        .on("script[src]", {
          element: el => {
            let src = el.getAttribute("src");
            if (PROXIED_URL_PREFIXES_RE.test(src)) {
              el.setAttribute("src", createProxiedScriptUrl(src));
            }
          }
        })
        .transform(response);
    }

Subsequently, when the browser makes a request with the specific prefix, the worker fetches the asset from the original URL. This example can be downloaded with this command:

    wrangler generate test https://github.com/xtuc/rewrite-3d-party.git

Request Mangling

As well as rewriting content, it is also possible to change or delay a request. Below is an example of how to randomly add a delay of a second to a request:

    addEventListener("fetch", event => {
      const host = event.request.headers.get('x-host');
      if (host) {
        //....
        event.respondWith(delayedFetch(originUrl, init));
        //...
      }
    });

    // The delay has to happen inside an async helper whose promise is passed to
    // respondWith; the event handler itself cannot await before responding.
    async function delayedFetch(originUrl, init) {
      // Add the delay if necessary
      if (Math.random() * 100 < DELAY_PERCENT) {
        await new Promise(resolve => setTimeout(resolve, DELAY_MS));
      }
      return fetch(originUrl, init);
    }

HTTP/2 prioritization

What if you want to see what effect changing the HTTP/2 prioritization of assets would have on your website? Cloudflare Workers provide custom HTTP/2 prioritization schemes that can be applied by setting a custom header on the response. The cf-priority header is defined as <priority>/<concurrency>, so adding:

    response.headers.set('cf-priority', "30/0");

would set the priority of that response to 30 with a concurrency of 0. Similarly, "30/1" would set concurrency to 1 and "30/n" would set concurrency to n. With this flexibility, you can prioritize the bytes that are important for your website, or run a bulk test to prove that your new prioritization scheme is better than any of the existing browser implementations.

Summary

A major barrier to understanding and innovation is the amount of time it takes to get feedback. Having a quick and easy framework to try out a new idea and comprehend its impact is key. I hope this post has convinced you that combining WebPageTest and Cloudflare Workers is an easy solution to this problem - and is indeed magic.

Introducing Cloudflare for Teams

Cloudflare Blog

Ten years ago, when Cloudflare was created, the Internet was a place that people visited. People still talked about 'surfing the web' and the iPhone was less than two years old, but on July 4, 2009 large scale DDoS attacks were launched against websites in the US and South Korea. Those attacks highlighted how fragile the Internet was and how all of us were becoming dependent on access to the web as part of our daily lives.

Fast forward ten years, and the speed, reliability and safety of the Internet is paramount as our private and work lives depend on it. We started Cloudflare to solve one half of every IT organization's challenge: how do you ensure the resources and infrastructure that you expose to the Internet are safe from attack, fast, and reliable? We saw that the world was moving away from hardware and software to solve these problems and instead wanted a scalable service that would work around the world.

To deliver that, we built one of the world's largest networks. Today our network spans more than 200 cities worldwide and is within milliseconds of nearly everyone connected to the Internet. We have built the capacity to stand up to nation-state scale cyberattacks and a threat intelligence system powered by the immense amount of Internet traffic that we see.

Today we're expanding Cloudflare's product offerings to solve the other half of every IT organization's challenge: ensuring the people and teams within an organization can access the tools they need to do their job and are safe from malware and other online threats. The speed, reliability, and protection we've brought to public infrastructure is extended today to everything your team does on the Internet.

In addition to protecting an organization's infrastructure, IT organizations are charged with ensuring that employees of an organization can access the tools they need safely. Traditionally, these problems would be solved by hardware products like VPNs and firewalls. VPNs let authorized users access the tools they needed and firewalls kept malware out.

Castle and Moat

The dominant model was the idea of a castle and a moat. You put all your valuable assets inside the castle. Your firewall created the moat around the castle to keep anything malicious out. When you needed to let someone in, a VPN acted as the drawbridge over the moat.

This is still the model most businesses use today, but it's showing its age. The first challenge is that if an attacker is able to find its way over the moat and into the castle, then it can cause significant damage. Unfortunately, few weeks go by without a news story about how an organization had significant data compromised because an employee fell for a phishing email, or a contractor was compromised, or someone was able to sneak into an office and plug in a rogue device.

The second challenge of the model is the rise of cloud and SaaS. Increasingly, an organization's resources aren't in just one castle anymore, but instead in different public cloud and SaaS vendors.

Services like Box, for instance, provide better storage and collaboration tools than most organizations could ever hope to build and manage themselves. But there's literally nowhere you can ship a hardware box to Box in order to build your own moat around their SaaS castle. Box provides some great security tools themselves, but they are different from the tools provided by every other SaaS and public cloud vendor.
Where IT organizations used to try to have a single pane of glass with a complex mess of hardware to see who was getting stopped by their moats and who was crossing their drawbridges, SaaS and cloud make that visibility increasingly difficult.

The third challenge to the traditional castle and moat strategy of IT is the rise of mobile. Where once upon a time your employees would all show up to work in your castle, now people are working around the world. Requiring everyone to login to a limited number of central VPNs becomes obviously absurd when you picture it as villagers having to sprint back from wherever they are across a drawbridge whenever they want to get work done. It's no wonder VPN support is one of the top IT organization tickets and likely always will be for organizations that maintain a castle and moat approach.

But it's worse than that. Mobile has also introduced a culture where employees bring their own devices to work. Or, even if on a company-managed device, they work from the road or home - beyond the protected walls of the castle and without the security provided by a moat.

If you'd looked at how we managed our own IT systems at Cloudflare four years ago, you'd have seen us following this same model. We used firewalls to keep threats out and required every employee to login through our VPN to get their work done. Personally, as someone who travels extensively for my job, it was especially painful.

Regularly, someone would send me a link to an internal wiki article asking for my input. I'd almost certainly be working from my mobile phone in the back of a cab running between meetings. I'd try to access the link and be prompted to login to our VPN in San Francisco. That's when the frustration would start.

Corporate mobile VPN clients, in my experience, all seem to be powered by some 100-sided die that will only allow you to connect if the number of miles you are from your home office is less than 25 times whatever number is rolled. Much frustration, and several IT tickets later, with a little luck I might be able to connect. And, even then, the experience was horribly slow and unreliable.

When we audited our own system, we found that the frustration with the process had caused multiple teams to create workarounds that were, effectively, unauthorized drawbridges over our carefully constructed moat. And, as we increasingly adopted SaaS tools like Salesforce and Workday, we lost much of our visibility into how these tools were being used.

Around the same time we were realizing the traditional approach to IT security was untenable for an organization like Cloudflare, Google published their paper titled "BeyondCorp: A New Approach to Enterprise Security." The core idea was that a company's intranet should be no more trusted than the Internet. And, rather than the perimeter being enforced by a singular moat, each application and data source should authenticate the individual and device each time it is accessed.

The BeyondCorp idea, which has come to be known as a Zero Trust model for IT security, was influential for how we thought about our own systems.
Powerfully, because Cloudflare had a flexible global network, we were able to use it both to enforce policies as our team accessed tools and to protect ourselves from malware as we did our jobs.

Cloudflare for Teams

Today, we're excited to announce Cloudflare for Teams™: the suite of tools we built to protect ourselves, now available to help any IT organization, from the smallest to the largest.

Cloudflare for Teams is built around two complementary products: Access and Gateway. Cloudflare Access™ is the modern VPN — a way to ensure your team members get fast access to the resources they need to do their job while keeping threats out. Cloudflare Gateway™ is the modern Next Generation Firewall — a way to ensure that your team members are protected from malware and follow your organization's policies wherever they go online.

Powerfully, both Cloudflare Access and Cloudflare Gateway are built atop the existing Cloudflare network. That means they are fast, reliable, scalable to the largest organizations, DDoS resistant, and located everywhere your team members are today and wherever they may travel. Have a senior executive going on a photo safari to see giraffes in Kenya, gorillas in Rwanda, and lemurs in Madagascar? Don't worry, we have Cloudflare data centers in all those countries (and many more) and they all support Cloudflare for Teams.

All Cloudflare for Teams products are informed by the threat intelligence we see across all of Cloudflare's products. We see such a large diversity of Internet traffic that we often see new threats and malware before anyone else. We've supplemented our own proprietary data with additional data sources from leading security vendors, ensuring Cloudflare for Teams provides a broad set of protections against malware and other online threats.

Moreover, because Cloudflare for Teams runs atop the same network we built for our infrastructure protection products, we can deliver them very efficiently. That means we can offer these products to our customers at extremely competitive prices. Our goal is to make the return on investment (ROI) for all Cloudflare for Teams customers nothing short of a no-brainer. If you're considering another solution, contact us before you decide.

Both Cloudflare Access and Cloudflare Gateway also build off products we've launched and battle-tested already. For example, Gateway builds, in part, off our 1.1.1.1 public DNS resolver. Today, more than 40 million people trust 1.1.1.1 as the fastest public DNS resolver globally. By adding malware scanning, we were able to create our entry-level Cloudflare Gateway product.

Cloudflare Access and Cloudflare Gateway build off our WARP and WARP+ products. We intentionally built a consumer mobile VPN service because we knew it would be hard. The millions of WARP and WARP+ users who have put the product through its paces have ensured that it's ready for the enterprise. That we have 4.5 stars across more than 200,000 ratings, just on iOS, is a testament to how reliable the underlying WARP and WARP+ engines have become. Compare that with the ratings of any corporate mobile VPN client, which are unsurprisingly abysmal.

We've partnered with some incredible organizations to create the ecosystem around Cloudflare for Teams. These include endpoint security solutions such as VMware Carbon Black, Malwarebytes, and Tanium; SIEM and analytics solutions such as Datadog, Sumo Logic, and Splunk; and identity platforms such as Okta, OneLogin, and Ping Identity.
Feedback from these partners and more is at the end of this post. If you're curious about more of the technical details of Cloudflare for Teams, I encourage you to read Sam Rhea's post.

Serving Everyone

Cloudflare has always believed in the power of serving everyone. That's why we've offered a free version of Cloudflare for Infrastructure since we launched in 2010. That belief doesn't change with our launch of Cloudflare for Teams. For both Cloudflare Access and Cloudflare Gateway, there will be free versions to protect individuals, home networks, and small businesses. We remember what it was like to be a startup and believe that everyone deserves to be safe online, regardless of their budget.

With both Cloudflare Access and Gateway, the products are segmented along a Good, Better, Best framework. For Access, that breaks out into Access Basic, Access Pro, and Access Enterprise. You can see the features available with each tier in the table below, including Access Enterprise features that will roll out over the coming months.

We wanted a similar Good, Better, Best framework for Cloudflare Gateway. Gateway Basic can be provisioned in minutes through a simple change to your network's recursive DNS settings. Once in place, network administrators can set rules on which domains should be allowed and which filtered on the network. Cloudflare Gateway is informed both by the malware data gathered from our global sensor network and by a rich corpus of domain categorization, allowing network operators to set whatever policy makes sense for them. Gateway Basic leverages the speed of 1.1.1.1 with granular network controls.

Gateway Pro, which we're announcing today and which you can sign up to beta test as its features roll out over the coming months, extends the DNS-provisioned protection to a full proxy. Gateway Pro can be provisioned via the WARP client — which we are extending beyond iOS and Android mobile devices to also support Windows, MacOS, and Linux — or via network policies, including MDM-provisioned proxy settings or GRE tunnels from office routers. This allows a network operator to filter not merely by domain but by the specific URL.

Building the Best-in-Class Network Gateway

While Gateway Basic (provisioned via DNS) and Gateway Pro (provisioned as a proxy) made sense, we wanted to imagine what the best-in-class network gateway would be for enterprises that value the highest level of performance and security. As we talked to these organizations, we heard an ever-present concern: just surfing the Internet created a risk of unauthorized code compromising devices. With every page that every user visited, third-party code (JavaScript, etc.) was being downloaded and executed on their devices.

The solution, they suggested, was to isolate the local browser from third-party code and have websites render in the network. This technology is known as browser isolation. And, in theory, it's a great idea. Unfortunately, in practice with current technology, it doesn't perform well. The most common way browser isolation technology works is to render the page on a server and then push a bitmap of the page down to the browser. This is known as pixel pushing. The challenge is that this approach can be slow and bandwidth-intensive, and it breaks many sophisticated web applications.

We were hopeful that we could solve some of these problems by moving the rendering of the pages to Cloudflare's network, which would be closer to end users. So we talked with many of the leading browser isolation companies about potentially partnering.
Unfortunately, as we experimented with their technologies, even with our vast network, we couldn't overcome the sluggish feel that plagues existing browser isolation solutions.

Enter S2 Systems

That's when we were introduced to S2 Systems. I clearly remember first trying the S2 demo, because my first reaction was: "This can't be working correctly, it's too fast." The S2 team had taken a different approach to browser isolation. Rather than trying to push down a bitmap of what the screen looked like, they pushed down the vector instructions to draw what's on the screen. The result was an experience that was typically at least as fast as browsing locally, and without broken pages.

The best, albeit imperfect, analogy I've come up with to describe the difference between S2's technology and other browser isolation approaches is the difference between Windows XP and MacOS X when they were both launched in 2001. Windows XP's original graphics were based on bitmapped images. MacOS X's were based on vectors. Remember the magic of watching an application "genie" in and out of the MacOS X dock? Check it out in a video from the launch…

At the time, watching a window slide in and out of the dock seemed like magic compared with what you could do with bitmapped user interfaces. You can hear the awe in the reaction from the audience. That awe, which we've all gotten used to in UIs today, comes from the power of vector images. And, if you've been underwhelmed by the pixel-pushed bitmaps of existing browser isolation technologies, just wait until you see what is possible with S2's technology.

We were so impressed with the team and the technology that we acquired the company. We will be integrating the S2 technology into Cloudflare Gateway Enterprise. The browser isolation technology will run across Cloudflare's entire global network, bringing it within milliseconds of virtually every Internet user. You can learn more about this approach in Darren Remington's blog post.

Once the rollout is complete in the second half of 2020, we expect to be able to offer the first full browser isolation technology that doesn't force you to sacrifice performance. In the meantime, if you'd like a demo of the S2 technology in action, let us know.

The Promise of a Faster Internet for Everyone

Cloudflare's mission is to help build a better Internet. With Cloudflare for Teams, we've extended that network to protect the people and organizations that use the Internet to do their jobs. We're excited to help a more modern, mobile, and cloud-enabled Internet be safer and faster than it ever was with traditional hardware appliances.

But the same technology we're deploying now to improve enterprise security holds further promise. The most interesting Internet applications keep getting more complicated and, in turn, require more bandwidth and processing power to use.

For those of us fortunate enough to be able to afford the latest iPhone, we continue to reap the benefits of an increasingly powerful set of Internet-enabled tools. But try to use the Internet on a mobile phone from a few generations back, and you can see how quickly the latest Internet applications leave legacy devices behind. That's a problem if we want to bring the next 4 billion Internet users online.

If the sophistication of applications and the complexity of interfaces continue to keep pace with only the latest generation of devices, we will need a paradigm shift.
To make the best of the Internet available to everyone, we may need to shift the work of the Internet off the end devices we all carry around in our pockets and let the network — where power, bandwidth, and CPU are relatively plentiful — carry more of the load.That’s the long term promise of what S2’s technology combined with Cloudflare’s network may someday power. If we can make it so a less expensive device can run the latest Internet applications — using less battery, bandwidth, and CPU than ever before possible — then we can make the Internet more affordable and accessible for everyone.We started with Cloudflare for Infrastructure. Today we’re announcing Cloudflare for Teams. But our ambition is nothing short of Cloudflare for Everyone.Early Feedback on Cloudflare for Teams from Customers and Partners"Cloudflare Access has enabled Ziff Media Group to seamlessly and securely deliver our suite of internal tools to employees around the world on any device, without the need for complicated network configurations,” said Josh Butts, SVP Product & Technology, Ziff Media Group.“VPNs are frustrating and lead to countless wasted cycles for employees and the IT staff supporting them,” said Amod Malviya, Cofounder and CTO, Udaan. “Furthermore, conventional VPNs can lull people into a false sense of security. With Cloudflare Access, we have a far more reliable, intuitive, secure solution that operates on a per user, per access basis. I think of it as Authentication 2.0 — even 3.0”“Roman makes healthcare accessible and convenient,” said Ricky Lindenhovius, Engineering Director, Roman Health. “Part of that mission includes connecting patients to physicians, and Cloudflare helps Roman securely and conveniently connect doctors to internally managed tools. With Cloudflare, Roman can evaluate every request made to internal applications for permission and identity, while also improving speed and user experience.”“We’re excited to partner with Cloudflare to provide our customers an innovative approach to enterprise security that combines the benefits of endpoint protection and network security," said Tom Barsi, VP Business Development, VMware. "VMware Carbon Black is a leading endpoint protection platform (EPP) and offers visibility and control of laptops, servers, virtual machines, and cloud infrastructure at scale. In partnering with Cloudflare, customers will have the ability to use VMware Carbon Black’s device health as a signal in enforcing granular authentication to a team’s internally managed application via Access, Cloudflare’s Zero Trust solution. Our joint solution combines the benefits of endpoint protection and a zero trust authentication solution to keep teams working on the Internet more secure."“Rackspace is a leading global technology services company accelerating the value of the cloud during every phase of our customers’ digital transformation,” said Lisa McLin, vice president of alliances and channel chief at Rackspace. “Our partnership with Cloudflare enables us to deliver cutting edge networking performance to our customers and helps them leverage a software defined networking architecture in their journey to the cloud.”“Employees are increasingly working outside of the traditional corporate headquarters. Distributed and remote users need to connect to the Internet, but today’s security solutions often require they backhaul those connections through headquarters to have the same level of security,” said Michael Kenney, head of strategy and business development for Ingram Micro Cloud. 
“We’re excited to work with Cloudflare whose global network helps teams of any size reach internally managed applications and securely use the Internet, protecting the data, devices, and team members that power a business.”"At Okta, we’re on a mission to enable any organization to securely use any technology. As a leading provider of identity for the enterprise, Okta helps organizations remove the friction of managing their corporate identity for every connection and request that their users make to applications. We’re excited about our partnership with Cloudflare and bringing seamless authentication and connection to teams of any size,” said Chuck Fontana, VP, Corporate & Business Development, Okta."Organizations need one unified place to see, secure, and manage their endpoints,” said Matt Hastings, Senior Director of Product Management at Tanium. “We are excited to partner with Cloudflare to help teams secure their data, off-network devices, and applications. Tanium’s platform provides customers with a risk-based approach to operations and security with instant visibility and control into their endpoints. Cloudflare helps extend that protection by incorporating device data to enforce security for every connection made to protected resources.”“OneLogin is happy to partner with Cloudflare to advance security teams' identity control in any environment, whether on-premise or in the cloud, without compromising user performance," said Gary Gwin, Senior Director of Product at OneLogin. "OneLogin’s identity and access management platform securely connects people and technology for every user, every app, and every device. The OneLogin and Cloudflare for Teams integration provides a comprehensive identity and network control solution for teams of all sizes.”“Ping Identity helps enterprises improve security and user experience across their digital businesses,” said Loren Russon, Vice President of Product Management, Ping Identity. “Cloudflare for Teams integrates with Ping Identity to provide a comprehensive identity and network control solution to teams of any size, and ensures that only the right people get the right access to applications, seamlessly and securely.""Our customers increasingly leverage deep observability data to address both operational and security use cases, which is why we launched Datadog Security Monitoring," said Marc Tremsal, Director of Product Management at Datadog. "Our integration with Cloudflare already provides our customers with visibility into their web and DNS traffic; we're excited to work together as Cloudflare for Teams expands this visibility to corporate environments."“As more companies support employees who work on corporate applications from outside of the office, it is vital that they understand each request users are making. They need real-time insights and intelligence to react to incidents and audit secure connections," said John Coyle, VP of Business Development, Sumo Logic. "With our partnership with Cloudflare, customers can now log every request made to internal applications and automatically push them directly to Sumo Logic for retention and analysis."“Cloudgenix is excited to partner with Cloudflare to provide an end-to-end security solution from the branch to the cloud.  As enterprises move off of expensive legacy MPLS networks and adopt branch to internet breakout policies, the CloudGenix CloudBlade platform and Cloudflare for Teams together can make this transition seamless and secure. 
We’re looking forward to Cloudflare’s roadmap with this announcement and partnership opportunities in the near term,” said Aaron Edwards, Field CTO, CloudGenix.

“In the face of limited cybersecurity resources, organizations are looking for highly automated solutions that work together to reduce the likelihood and impact of today’s cyber risks,” said Akshay Bhargava, Chief Product Officer, Malwarebytes. “With Malwarebytes and Cloudflare together, organizations are deploying more than twenty layers of security defense-in-depth. Using just two solutions, teams can secure their entire enterprise, from device, to the network, to their internal and external applications.”

"Organizations' sensitive data is vulnerable in transit over the Internet and when it's stored at its destination in public cloud, SaaS applications, and endpoints,” said Pravin Kothari, CEO of CipherCloud. “CipherCloud is excited to partner with Cloudflare to secure data in all stages, wherever it goes. Cloudflare’s global network secures data in transit without slowing down performance. CipherCloud CASB+ provides a powerful cloud security platform with end-to-end data protection and adaptive controls for cloud environments, SaaS applications and BYOD endpoints. Working together, teams can rely on the integrated Cloudflare and CipherCloud solution to keep data always protected without compromising user experience.”

Security on the Internet with Cloudflare for Teams

CloudFlare Blog -

Your experience using the Internet has continued to improve over time. It’s gotten faster, safer, and more reliable. However, you probably have to use a different, worse, equivalent of it when you do your work. While the Internet kept getting better, businesses and their employees were stuck using their own private networks.In those networks, teams hosted their own applications, stored their own data, and protected all of it by building a castle and moat around that private world. This model hid internally managed resources behind VPN appliances and on-premise firewall hardware. The experience was awful, for users and administrators alike. While the rest of the Internet became more performant and more reliable, business users were stuck in an alternate universe.That legacy approach was less secure and slower than teams wanted, but the corporate perimeter mostly worked for a time. However, that began to fall apart with the rise of cloud-delivered applications. Businesses migrated to SaaS versions of software that previously lived in that castle and behind that moat. Users needed to connect to the public Internet to do their jobs, and attackers made the Internet unsafe in sophisticated, unpredictable ways - which opened up every business to  a new world of never-ending risks.How did enterprise security respond? By trying to solve a new problem with a legacy solution, and forcing the Internet into equipment that was only designed for private, corporate networks. Instead of benefitting from the speed and availability of SaaS applications, users had to backhaul Internet-bound traffic through the same legacy boxes that made their private network miserable.Teams then watched as their bandwidth bills increased. More traffic to the Internet from branch offices forced more traffic over expensive, dedicated links. Administrators now had to manage a private network and the connections to the entire Internet for their users, all with the same hardware. More traffic required more hardware and the cycle became unsustainable.Cloudflare’s first wave of products secured and improved the speed of those sites by letting customers, from free users to some of the largest properties on the Internet, replace that hardware stack with Cloudflare’s network. We could deliver capacity at a scale that would be impossible for nearly any company to build themselves. We deployed data centers in over 200 cities around the world that help us reach users wherever they are.We built a unique network to let sites scale how they secured infrastructure on the Internet with their own growth. But internally, businesses and their employees were stuck using their own private networks.Just as we helped organizations secure their infrastructure by replacing boxes, we can do the same for their teams and their data. Today, we’re announcing a new platform that applies our network, and everything we’ve learned, to make the Internet faster and safer for teams.Cloudflare for Teams protects enterprises, devices, and data by securing every connection without compromising user performance. 
The speed, reliability, and protection we brought to securing infrastructure are extended to everything your team does on the Internet.

The legacy world of corporate security

Organizations all share three problems they need to solve at the network level:

- Secure team member access to internally managed applications
- Secure team members from threats on the Internet
- Secure the corporate data that lives in both environments

Each of these challenges poses a real risk to any team. If any component is compromised, the entire business becomes vulnerable.

Internally managed applications

Solving the first bucket, internally managed applications, started with building a perimeter around those internal resources. Administrators deployed applications on a private network, and users outside of the office connected to them with client VPN agents through VPN appliances that lived back on-site.

Users hated it, and they still do, because it made it harder to get their jobs done. A sales team member traveling to a customer visit in the back of a taxi had to start a VPN client on their phone just to review details about the meeting. An engineer working remotely had to sit and wait as every connection they made to developer tools was backhauled through a central VPN appliance.

Administrators and security teams also had issues with this model. Once a user connects to the private network, they're typically able to reach multiple resources without having to prove they're authorized to do so. Just because I'm able to enter the front door of an apartment building doesn't mean I should be able to walk into any individual apartment. However, on private networks, enforcing additional security within the bounds of the private network required complicated microsegmentation, if it was done at all.

Threats on the Internet

The second challenge, securing users connecting to SaaS tools on the public Internet and applications in the public cloud, required security teams to protect against known threats and potential zero-day attacks as their users left the castle and moat.

How did most companies respond? By forcing all traffic leaving branch offices or remote users back through headquarters and using the same hardware that secured their private network to try to build a perimeter around the Internet, or at least the Internet their users accessed. All of the Internet-bound traffic leaving a branch office in Asia, for example, would be sent back through a central location in Europe, even if the destination was just down the street.

Organizations needed those connections to be stable, and to prioritize certain functions like voice and video, so they paid carriers to support dedicated multi-protocol label switching (MPLS) links. MPLS delivered improved performance by applying label switching to traffic, which downstream routers can forward without needing to perform an IP lookup, but it was eye-wateringly expensive.

Securing data

The third challenge, keeping data safe, became a moving target. Organizations had to keep data secure in a consistent way as it lived and moved between private tools on corporate networks and SaaS applications like Salesforce or Office 365.

The answer? More of the same. Teams backhauled traffic over MPLS links to a place where data could be inspected, adding more latency and introducing more hardware that had to be maintained.

What changed?

The balance of internal versus external traffic began to shift as SaaS applications became the new default for small businesses and Fortune 500s alike.
Users now do most of their work on the Internet, with tools like Office 365 continuing to gain adoption. As those tools become more popular, more data leaves the moat and lives on the public Internet.User behavior also changed. Users left the office and worked from multiple devices, both managed and unmanaged. Teams became more distributed and the perimeter was stretched to its limit.This caused legacy approaches to failLegacy approaches to corporate security pushed the  castle and moat model further out. However, that model simply cannot scale with how users do work on the Internet today.Internally managed applicationsPrivate networks give users headaches, but they’re also a constant and complex chore to maintain. VPNs require expensive equipment that must be upgraded or expanded and, as more users leave the office, that equipment must try and scale up.The result is a backlog of IT help desk tickets as users struggle with their VPN and, on the other side of the house, administrators and security teams try to put band-aids on the approach.Threats on the InternetOrganizations initially saved money by moving to SaaS tools, but wound up spending more money over time as their traffic increased and bandwidth bills climbed.Additionally, threats evolve. The traffic sent back to headquarters was secured with static models of scanning and filtering using hardware gateways. Users were still vulnerable to new types of threats that these on-premise boxes did not block yet.Securing dataThe cost of keeping data secure in both environments also grew. Security teams attempted to inspect Internet-bound traffic for threats and data loss by backhauling branch office traffic through on-premise hardware, degrading speed and increasing bandwidth fees.Even more dangerous, data now lived permanently outside of that castle and moat model. Organizations were now vulnerable to attacks that bypassed their perimeter and targeted SaaS applications directly.How will Cloudflare solve these problems?Cloudflare for Teams consists of two products, Cloudflare Access and Cloudflare Gateway.We launched Access last year and are excited to bring it into Cloudflare for Teams. We built Cloudflare Access to solve the first challenge that corporate security teams face: protecting internally managed applications.Cloudflare Access replaces corporate VPNs with Cloudflare’s network. Instead of placing internal tools on a private network, teams deploy them in any environment, including hybrid or multi-cloud models, and secure them consistently with Cloudflare’s network.Deploying Access does not require exposing new holes in corporate firewalls. Teams connect their resources through a secure outbound connection, Argo Tunnel, which runs in your infrastructure to connect the applications and machines to Cloudflare. That tunnel makes outbound-only calls to the Cloudflare network and organizations can replace complex firewall rules with just one: disable all inbound connections.Administrators then build rules to decide who should authenticate to and reach the tools protected by Access. Whether those resources are virtual machines powering business operations or internal web applications, like Jira or iManage, when a user needs to connect, they pass through Cloudflare first.When users need to connect to the tools behind Access, they are prompted to authenticate with their team’s SSO and, if valid, are instantly connected to the application without being slowed down. 
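Origins protected this way can also verify the result themselves as a defense-in-depth measure: Access forwards a signed JSON Web Token with each authenticated request, and an application can validate it against the account's public signing keys. Below is a minimal TypeScript sketch using the jose library; the team domain and audience tag are hypothetical placeholders, and the header and certificate endpoint names should be checked against the current Access documentation rather than taken from this sketch.

```typescript
// Sketch: an origin service double-checking the JWT that Cloudflare Access
// attaches to authenticated requests. Placeholder values are marked below;
// confirm header and endpoint names against the Access docs before use.
import { createRemoteJWKSet, jwtVerify } from "jose";

const TEAM_DOMAIN = "https://example.cloudflareaccess.com"; // hypothetical team subdomain
const AUDIENCE = "your-access-application-audience-tag";    // hypothetical audience tag
const JWKS = createRemoteJWKSet(new URL(`${TEAM_DOMAIN}/cdn-cgi/access/certs`));

// `token` is the value of the Cf-Access-Jwt-Assertion request header.
export async function verifyAccessToken(token: string): Promise<string> {
  const { payload } = await jwtVerify(token, JWKS, {
    issuer: TEAM_DOMAIN,
    audience: AUDIENCE,
  });
  // Once verified, the application can trust the identity carried in the token.
  return String(payload.email ?? payload.sub);
}
```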
Internally managed apps suddenly feel like SaaS products, and the login experience is seamless and familiar.

Behind the scenes, every request made to those internal tools hits Cloudflare first, where we enforce identity-based policies. Access evaluates and logs every request to those apps for identity, to give administrators more visibility and to offer more security than a traditional VPN.

Every Cloudflare data center, in 200 cities around the world, performs the entire authentication check. Users connect faster, wherever they are working, versus having to backhaul traffic to a home office.

Access also saves time for administrators. Instead of configuring complex and error-prone network policies, IT teams build policies that enforce authentication using their identity provider. Security leaders can control who can reach internal applications in a single pane of glass and audit comprehensive logs from one source.

In the last year, we've released features that expand how teams can use Access so they can fully eliminate their VPN. We've added support for RDP and SSH, and released support for short-lived certificates that replace static keys. However, teams also use applications that do not run in infrastructure they control, such as SaaS applications like Box and Office 365. To solve that challenge, we're releasing a new product, Cloudflare Gateway.

Cloudflare Gateway secures teams by making a nearby Cloudflare data center the first destination for all outbound traffic. The product places Cloudflare's global network between users and the Internet, rather than forcing the Internet through legacy hardware on-site.

Cloudflare Gateway's first feature prevents users from running into phishing scams or malware sites by combining the world's fastest DNS resolver with Cloudflare's threat intelligence. The Gateway resolver can be deployed to office networks and user devices in a matter of minutes. Once configured, Gateway actively blocks potential malware and phishing sites while also applying content filtering based on policies administrators configure.

However, threats can be hidden in otherwise healthy hostnames. To protect users from more advanced threats, Gateway will audit URLs and, if enabled, inspect packets to find potential attacks before they compromise a device or office network. That same deep packet inspection can then be applied to prevent the accidental or malicious export of data.

Organizations can add Gateway's advanced threat prevention in two models: by connecting office networks to the Cloudflare security fabric through GRE tunnels, and by distributing forward proxy clients to mobile devices.

The first model, delivered through Cloudflare Magic Transit, will give enterprises a way to migrate to Gateway without disrupting their current workflow. Instead of backhauling office traffic to centralized on-premise hardware, teams will point traffic to Cloudflare over GRE tunnels. Once the outbound traffic arrives at Cloudflare, Gateway can apply file type controls, in-line inspection, and data loss protection without impacting connection performance. Simultaneously, Magic Transit protects a corporate IP network from inbound attacks.

When users leave the office, Gateway's client application will deliver the same level of Internet security. Every connection from the device will pass through Cloudflare first, where Gateway can apply threat prevention policies.
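To make the DNS layer described above a little more tangible, the sketch below performs the kind of DNS-over-HTTPS lookup a filtering resolver answers. It uses Cloudflare's public 1.1.1.1 JSON API; an actual Gateway deployment points office networks or devices at a resolver tied to the account's policies, and that endpoint is deliberately left out here rather than guessed.

```typescript
// Sketch: a DNS-over-HTTPS lookup against the public 1.1.1.1 JSON API.
// A filtering resolver answers the same kind of query, but can respond to
// blocked categories with a block page or a non-answer instead.
async function resolve(name: string): Promise<void> {
  const url = `https://cloudflare-dns.com/dns-query?name=${encodeURIComponent(name)}&type=A`;
  const res = await fetch(url, { headers: { accept: "application/dns-json" } });
  const data = await res.json();

  // Status 0 is NOERROR in the DNS JSON format; Answer holds the records.
  if (data.Status === 0 && data.Answer) {
    console.log(name, "=>", data.Answer.map((a: { data: string }) => a.data).join(", "));
  } else {
    console.log(name, "=> no answer (DNS status", data.Status + ")");
  }
}

resolve("example.com");
```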
Cloudflare can also deliver that security without compromising user experience, building on new technologies like the WireGuard protocol and integrating features from Cloudflare Warp, our popular individual forward proxy.In both environments, one of the most common vectors for attacks is still the browser. Zero-day threats can compromise devices by using the browser as a vehicle to execute code.Existing browser isolation solutions attempt to solve this challenge in one of two approaches: 1) pixel pushing and 2) DOM reconstruction. Both approaches lead to tradeoffs in performance and security. Pixel pushing degrades speed while also driving up the cost to stream sessions to users. DOM reconstruction attempts to strip potentially harmful content before sending it to the user. That tactic relies on known vulnerabilities and is still exposed to the zero day threats that isolation tools were meant to solve.Cloudflare Gateway will feature always-on browser isolation that not only protects users from zero day threats, but can also make browsing the Internet faster. The solution will apply a patented approach to send vector commands that a browser can render without the need for an agent on the device. A user’s browser session will instead run in a Cloudflare data center where Gateway destroys the instance at the end of each session, keeping malware away from user devices without compromising performance.When deployed, remote browser sessions will run in one of Cloudflare’s 200 data centers, connecting users to a faster, safer model of navigating the Internet without the compromises of legacy approaches. If you would like to learn more about this approach to browser isolation, I'd encourage you to read Darren Remington's blog post on the topic.Why Cloudflare?To make infrastructure safer, and web properties faster, Cloudflare built out one of the world’s largest and most sophisticated networks. Cloudflare for Teams builds on that same platform, and all of its unique advantages.FastSecurity should always be bundled with performance. Cloudflare’s infrastructure products delivered better protection while also improving speed. That’s possible because of the network we’ve built, both its distribution and how the data we have about the network allows Cloudflare to optimize requests and connections.Cloudflare for Teams brings that same speed to end users by using that same network and route optimization. Additionally, Cloudflare has built industry-leading components that will become features of this new platform. All of these components leverage Cloudflare’s network and scale to improve user performance.Gateway’s DNS-filtering features build on Cloudflare’s 1.1.1.1 public DNS resolver, the world’s fastest resolver according to DNSPerf. To protect entire connections, Cloudflare for Teams will deploy the same technology that underpins Warp, a new type of VPN with consistently better reviews than competitors.Massive scalabilityCloudflare’s 30 TBps of network capacity can scale to meet the needs of nearly any enterprise. 
Customers can stop worrying about buying enough hardware to meet their organization's needs and, instead, replace it with Cloudflare.

Near users, wherever they are — literally

Cloudflare's network operates in 200 cities and more than 90 countries around the world, putting Cloudflare's security and performance close to users, wherever they work.

That network includes a presence in global hubs, like London and New York, but also in traditionally underserved regions around the world.

Cloudflare data centers operate within 100 milliseconds of 99% of the Internet-connected population in the developed world, and within 100 milliseconds of 94% of the Internet-connected population globally. All of your end users should feel like they have the performance traditionally only available to those in headquarters.

Easier for administrators

When security products are confusing, teams make mistakes that become incidents. Cloudflare's solution is straightforward and easy to deploy. Most security providers in this market built features first and never considered usability or implementation.

Cloudflare Access can be deployed in less than an hour; Gateway features will build on top of that dashboard and workflow. Cloudflare for Teams brings the same ease of use from the tools that protect infrastructure to the products that now secure users, devices, and data.

Better threat intelligence

Cloudflare's network already secures more than 20 million Internet properties and blocks 72 billion cyber threats each day. We build products using the threat data we gather from protecting 11 million HTTP requests per second on average.

What's next?

Cloudflare Access is available right now. You can start replacing your team's VPN with Cloudflare's network today. Certain features of Cloudflare Gateway are available in beta now, and others will be added in beta over time. You can sign up to be notified about Gateway now.

Cloudflare + Remote Browser Isolation

CloudFlare Blog -

Cloudflare announced today that it has purchased S2 Systems Corporation, a Seattle-area startup that has built an innovative remote browser isolation solution unlike any other currently in the market. The majority of endpoint compromises involve web browsers — by putting space between users’ devices and where web code executes, browser isolation makes endpoints substantially more secure. In this blog post, I’ll discuss what browser isolation is, why it is important, how the S2 Systems cloud browser works, and how it fits with Cloudflare’s mission to help build a better Internet.What’s wrong with web browsing?It’s been more than 30 years since Tim Berners-Lee wrote the project proposal defining the technology underlying what we now call the world wide web. What Berners-Lee envisioned as being useful for “several thousand people, many of them very creative, all working toward common goals”[1] has grown to become a fundamental part of commerce, business, the global economy, and an integral part of society used by more than 58% of the world’s population[2].The world wide web and web browsers have unequivocally become the platform for much of the productive work (and play) people do every day. However, as the pervasiveness of the web grew, so did opportunities for bad actors. Hardly a day passes without a major new cybersecurity breach in the news. Several contributing factors have helped propel cybercrime to unprecedented levels: the commercialization of hacking tools, the emergence of malware-as-a-service, the presence of well-financed nation states and organized crime, and the development of cryptocurrencies which enable malicious actors of all stripes to anonymously monetize their activities.The vast majority of security breaches originate from the web. Gartner calls the public Internet a “cesspool of attacks” and identifies web browsers as the primary culprit responsible for 70% of endpoint compromises.[3] This should not be surprising. Although modern web browsers are remarkable, many fundamental architectural decisions were made in the 1990’s before concepts like security, privacy, corporate oversight, and compliance were issues or even considerations. Core web browsing functionality (including the entire underlying WWW architecture) was designed and built for a different era and circumstances.In today’s world, several web browsing assumptions are outdated or even dangerous. Web browsers and the underlying server technologies encompass an extensive – and growing – list of complex interrelated technologies. These technologies are constantly in flux, driven by vibrant open source communities, content publishers, search engines, advertisers, and competition between browser companies. As a result of this underlying complexity, web browsers have become primary attack vectors. According to Gartner, “the very act of users browsing the internet and clicking on URL links opens the enterprise to significant risk. […] Attacking thru the browser is too easy, and the targets too rich.”[4] Even “ostensibly ‘good’ websites are easily compromised and can be used to attack visitors” (Gartner[5]) with more than 40% of malicious URLs found on good domains (Webroot[6]). (A complete list of vulnerabilities is beyond the scope of this post.)The very structure and underlying technologies that power the web are inherently difficult to secure. 
Some browser vulnerabilities result from illegitimate use of legitimate functionality: enabling browsers to download files and documents is good, but allowing downloading of files infected with malware is bad; dynamic loading of content across multiple sites within a single webpage is good, but cross-site scripting is bad; enabling an extensive advertising ecosystem is good, but the inability to detect hijacked links or malicious redirects to malware or phishing sites is bad; etc.Enterprise Browsing IssuesEnterprises have additional challenges with traditional browsers.Paradoxically, IT departments have the least amount of control over the most ubiquitous app in the enterprise – the web browser. The most common complaints about web browsers from enterprise security and IT professionals are: Security (obviously). The public internet is a constant source of security breaches and the problem is growing given an 11x escalation in attacks since 2016 (Meeker[7]). Costs of detection and remediation are escalating and the reputational damage and financial losses for breaches can be substantial. Control. IT departments have little visibility into user activity and limited ability to leverage content disarm and reconstruction (CDR) and data loss prevention (DLP) mechanisms including when, where, or who downloaded/upload files. Compliance. The inability to control data and activity across geographies or capture required audit telemetry to meet increasingly strict regulatory requirements. This results in significant exposure to penalties and fines. Given vulnerabilities exposed through everyday user activities such as email and web browsing, some organizations attempt to restrict these activities. As both are legitimate and critical business functions, efforts to limit or curtail web browser use inevitably fail or have a substantive negative impact on business productivity and employee morale.Current approaches to mitigating security issues inherent in browsing the web are largely based on signature technology for data files and executables, and lists of known good/bad URLs and DNS addresses. The challenge with these approaches is the difficulty of keeping current with known attacks (file signatures, URLs and DNS addresses) and their inherent vulnerability to zero-day attacks. Hackers have devised automated tools to defeat signature-based approaches (e.g. generating hordes of files with unknown signatures) and create millions of transient websites in order to defeat URL/DNS blacklists.While these approaches certainly prevent some attacks, the growing number of incidents and severity of security breaches clearly indicate more effective alternatives are needed.What is browser isolation?The core concept behind browser isolation is security-through-physical-isolation to create a “gap” between a user’s web browser and the endpoint device thereby protecting the device (and the enterprise network) from exploits and attacks. Unlike secure web gateways, antivirus software, or firewalls which rely on known threat patterns or signatures, this is a zero-trust approach.There are two primary browser isolation architectures: (1) client-based local isolation and (2) remote isolation.Local browser isolation attempts to isolate a browser running on a local endpoint using app-level or OS-level sandboxing. 
In addition to leaving the endpoint at risk when there is an isolation failure, these systems require significant endpoint resources (memory + compute), tend to be brittle, and are difficult for IT to manage as they depend on support from specific hardware and software components.Further, local browser isolation does nothing to address the control and compliance issues mentioned above.Remote browser isolation (RBI) protects the endpoint by moving the browser to a remote service in the cloud or to a separate on-premises server within the enterprise network:On-premises isolation simply relocates the risk from the endpoint to another location within the enterprise without actually eliminating the risk.Cloud-based remote browsing isolates the end-user device and the enterprise’s network while fully enabling IT control and compliance solutions.Given the inherent advantages, most browser isolation solutions – including S2 Systems – leverage cloud-based remote isolation. Properly implemented, remote browser isolation can protect the organization from browser exploits, plug-ins, zero-day vulnerabilities, malware and other attacks embedded in web content.How does Remote Browser Isolation (RBI) work?In a typical cloud-based RBI system (the blue-dashed box ❶ below), individual remote browsers ❷ are run in the cloud as disposable containerized instances – typically, one instance per user. The remote browser sends the rendered contents of a web page to the user endpoint device ❹ using a specific protocol and data format ❸. Actions by the user, such as keystrokes, mouse and scroll commands, are sent back to the isolation service over a secure encrypted channel where they are processed by the remote browser and any resulting changes to the remote browser webpage are sent back to the endpoint device.In effect, the endpoint device is “remote controlling” the cloud browser. Some RBI systems use proprietary clients installed on the local endpoint while others leverage existing HTML5-compatible browsers on the endpoint and are considered ‘clientless.’Data breaches that occur in the remote browser are isolated from the local endpoint and enterprise network. Every remote browser instance is treated as if compromised and terminated after each session. New browser sessions start with a fresh instance. Obviously, the RBI service must prevent browser breaches from leaking outside the browser containers to the service itself. Most RBI systems provide remote file viewers negating the need to download files but also have the ability to inspect files for malware before allowing them to be downloaded.A critical component in the above architecture is the specific remoting technology employed by the cloud RBI service. The remoting technology has a significant impact on the operating cost and scalability of the RBI service, website fidelity and compatibility, bandwidth requirements, endpoint hardware/software requirements and even the user experience. Remoting technology also determines the effective level of security provided by the RBI system.All current cloud RBI systems employ one of two remoting technologies:(1)    Pixel pushing is a video-based approach which captures pixel images of the remote browser ‘window’ and transmits a sequence of images to the client endpoint browser or proprietary client. This is similar to how remote desktop and VNC systems work. 
Although considered to be relatively secure, there are several inherent challenges with this approach:Continuously encoding and transmitting video streams of remote webpages to user endpoint devices is very costly. Scaling this approach to millions of users is financially prohibitive and logistically complex.Requires significant bandwidth. Even when highly optimized, pushing pixels is bandwidth intensive.Unavoidable latency results in an unsatisfactory user experience. These systems tend to be slow and generate a lot of user complaints.Mobile support is degraded by high bandwidth requirements compounded by inconsistent connectivity.HiDPI displays may render at lower resolutions. Pixel density increases exponentially with resolution which means remote browser sessions (particularly fonts) on HiDPI devices can appear fuzzy or out of focus.(2) DOM reconstruction emerged as a response to the shortcomings of pixel pushing. DOM reconstruction attempts to clean webpage HTML, CSS, etc. before forwarding the content to the local endpoint browser. The underlying HTML, CSS, etc., are reconstructed in an attempt to eliminate active code, known exploits, and other potentially malicious content. While addressing the latency, operational cost, and user experience issues of pixel pushing, it introduces two significant new issues:Security. The underlying technologies – HTML, CSS, web fonts, etc. – are the attack vectors hackers leverage to breach endpoints. Attempting to remove malicious content or code is like washing mosquitos: you can attempt to clean them, but they remain inherent carriers of dangerous and malicious material. It is impossible to identify, in advance, all the means of exploiting these technologies even through an RBI system.Website fidelity. Inevitably, attempting to remove malicious active code, reconstructing HTML, CSS and other aspects of modern websites results in broken pages that don’t render properly or don’t render at all. Websites that work today may not work tomorrow as site publishers make daily changes that may break DOM reconstruction functionality. The result is an infinite tail of issues requiring significant resources in an endless game of whack-a-mole. Some RBI solutions struggle to support common enterprise-wide services like Google G Suite or Microsoft Office 365 even as malware laden web email continues to be a significant source of breaches.Customers are left to choose between a secure solution with a bad user experience and high operating costs, or a faster, much less secure solution that breaks websites. These tradeoffs have driven some RBI providers to implement both remoting technologies into their products. However, this leaves customers to pick their poison without addressing the fundamental issues.Given the significant tradeoffs in RBI systems today, one common optimization for current customers is to deploy remote browsing capabilities to only the most vulnerable users in an organization such as high-risk executives, finance, business development, or HR employees. 
Like vaccinating half the pupils in a classroom, this results in a false sense of security that does little to protect the larger organization.

Unfortunately, the largest “gap” created by current remote browser isolation systems is the void between the potential of the underlying isolation concept and the implementation reality of currently available RBI systems.

S2 Systems Remote Browser Isolation

S2 Systems remote browser isolation is a fundamentally different approach based on S2-patented technology called Network Vector Rendering (NVR).

The S2 remote browser is based on the open-source Chromium engine on which Google Chrome is built. In addition to powering Google Chrome, which has a ~70% market share[8], Chromium powers twenty-one other web browsers, including the new Microsoft Edge browser.[9] As a result, significant ongoing investment in the Chromium engine ensures the highest levels of website support, compatibility and a continuous stream of improvements.

A key architectural feature of the Chromium browser is its use of the Skia graphics library. Skia is a widely used cross-platform graphics engine for Android, Google Chrome, Chrome OS, Mozilla Firefox, Firefox OS, FitbitOS, Flutter, the Electron application framework and many other products. Like Chromium, the pervasiveness of Skia ensures ongoing broad hardware and platform support.

(Figure: Skia code fragment)

Everything visible in a Chromium browser window is rendered through the Skia rendering layer. This includes application window UI such as menus, but more importantly, the entire contents of the webpage window are rendered through Skia. Chromium compositing, layout and rendering are extremely complex, with multiple parallel paths optimized for different content types, device contexts, etc. The following figure is an egregious simplification, for illustration purposes, of how S2 works (apologies to Chromium experts):

S2 Systems NVR technology intercepts the remote Chromium browser’s Skia draw commands ❶, tokenizes and compresses them, then encrypts and transmits them across the wire ❷ to any HTML5-compliant web browser ❸ (Chrome, Firefox, Safari, etc.) running locally on the user's endpoint desktop or mobile device. The Skia API commands captured by NVR are pre-rasterization, which means they are highly compact.

On first use, the S2 RBI service transparently pushes an NVR WebAssembly (Wasm) library ❹ to the local HTML5 web browser on the endpoint device, where it is cached for subsequent use. The NVR Wasm code contains an embedded Skia library and the necessary code to unpack, decrypt and “replay” the Skia draw commands from the remote RBI server to the local browser window. WebAssembly’s ability to “execute at native speed by taking advantage of common hardware capabilities available on a wide range of platforms”[10] results in near-native drawing performance.

The S2 remote browser isolation service uses headless Chromium-based browsers in the cloud, transparently intercepts draw layer output, transmits the draw commands efficiently and securely over the web, and redraws them in the windows of local HTML5 browsers.
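To give a feel for the general idea of shipping draw commands instead of pixels (and only the general idea: S2's actual NVR protocol, command set, and Skia integration are far richer and are not public), here is a toy TypeScript sketch that replays an invented command format onto an HTML5 canvas.

```typescript
// Toy illustration of vector-style remoting: instead of shipping pixels, the
// remote side ships drawing commands and the local browser replays them on a
// canvas. The command format is invented for this example and has nothing to
// do with S2's real NVR protocol or Skia's actual API surface.
type DrawCommand =
  | { op: "clear"; color: string }
  | { op: "rect"; x: number; y: number; w: number; h: number; color: string }
  | { op: "text"; x: number; y: number; value: string; font: string };

function replay(ctx: CanvasRenderingContext2D, commands: DrawCommand[]): void {
  for (const cmd of commands) {
    switch (cmd.op) {
      case "clear":
        ctx.fillStyle = cmd.color;
        ctx.fillRect(0, 0, ctx.canvas.width, ctx.canvas.height);
        break;
      case "rect":
        ctx.fillStyle = cmd.color;
        ctx.fillRect(cmd.x, cmd.y, cmd.w, cmd.h);
        break;
      case "text":
        ctx.font = cmd.font;
        ctx.fillStyle = "#000";
        ctx.fillText(cmd.value, cmd.x, cmd.y);
        break;
    }
  }
}

// In a real system the commands would arrive compressed and encrypted over a
// socket; here they are just an in-memory frame.
const canvas = document.querySelector("canvas")!;
const frame: DrawCommand[] = [
  { op: "clear", color: "#fff" },
  { op: "rect", x: 20, y: 20, w: 200, h: 80, color: "#f6821f" },
  { op: "text", x: 30, y: 60, value: "hello from the remote browser", font: "16px sans-serif" },
];
replay(canvas.getContext("2d")!, frame);
```

In the real system the commands are tokenized, compressed, and encrypted in flight, and the replay side is the Wasm-embedded Skia library described above.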
This architecture has a number of technical advantages:(1)    Security: the underlying data transport is not an existing attack vector and customers aren’t forced to make a tradeoff between security and performance.(2)    Website compatibility: there are no website compatibility issues nor long tail chasing evolving web technologies or emerging vulnerabilities.(3)    Performance: the system is very fast, typically faster than local browsing (subject of a future blog post).(4)    Transparent user experience: S2 remote browsing feels like native browsing; users are generally unaware when they are browsing remotely.(5)    Requires less bandwidth than local browsing for most websites. Enables advanced caching and other proprietary optimizations unique to web browsers and the nature of web content and technologies.(6)    Clientless: leverages existing HTML5 compatible browsers already installed on user endpoint desktop and mobile devices.(7)    Cost-effective scalability: although the details are beyond the scope of this post, the S2 backend and NVR technology have substantially lower operating costs than existing RBI technologies. Operating costs translate directly to customer costs. The S2 system was designed to make deployment to an entire enterprise and not just targeted users (aka: vaccinating half the class) both feasible and attractive for customers.(8)    RBI-as-a-platform: enables implementation of related/adjacent services such as DLP, content disarm & reconstruction (CDR), phishing detection and prevention, etc.S2 Systems Remote Browser Isolation Service and underlying NVR technology eliminates the disconnect between the conceptual potential and promise of browser isolation and the unsatisfying reality of current RBI technologies.Cloudflare + S2 Systems Remote Browser IsolationCloudflare’s global cloud platform is uniquely suited to remote browsing isolation. Seamless integration with our cloud-native performance, reliability and advanced security products and services provides powerful capabilities for our customers.Our Cloudflare Workers architecture enables edge computing in 200 cities in more than 90 countries and will put a remote browser within 100 milliseconds of 99% of the Internet-connected population in the developed world. With more than 20 million Internet properties directly connected to our network, Cloudflare remote browser isolation will benefit from locally cached data and builds on the impressive connectivity and performance of our network. Our Argo Smart Routing capability leverages our communications backbone to route traffic across faster and more reliable network paths resulting in an average 30% faster access to web assets.Once it has been integrated with our Cloudflare for Teams suite of advanced security products, remote browser isolation will provide protection from browser exploits, zero-day vulnerabilities, malware and other attacks embedded in web content. Enterprises will be able to secure the browsers of all employees without having to make trade-offs between security and user experience. The service will enable IT control of browser-conveyed enterprise data and compliance oversight. Seamless integration across our products and services will enable users and enterprises to browse the web without fear or consequence.Cloudflare’s mission is to help build a better Internet. This means protecting users and enterprises as they work and play on the Internet; it means making Internet access fast, reliable and transparent. 
Reimagining and modernizing how web browsing works is an important part of helping build a better Internet.

[1] https://www.w3.org/History/1989/proposal.html
[2] “Internet World Stats,” https://www.internetworldstats.com/, retrieved December 21, 2019.
[3] Gartner, Inc., Neil MacDonald, “Innovation Insight for Remote Browser Isolation” (report ID: G00350577), 8 March 2018.
[4] Gartner, Inc., Neil MacDonald, “Innovation Insight for Remote Browser Isolation”, 8 March 2018.
[5] Gartner, Inc., Neil MacDonald, “Innovation Insight for Remote Browser Isolation”, 8 March 2018.
[6] “2019 Webroot Threat Report: Forty Percent of Malicious URLs Found on Good Domains”, February 28, 2019.
[7] “Kleiner Perkins 2018 Internet Trends”, Mary Meeker.
[8] https://www.statista.com/statistics/544400/market-share-of-internet-browsers-desktop/, retrieved December 21, 2019.
[9] https://en.wikipedia.org/wiki/Chromium_(web_browser), retrieved December 29, 2019.
[10] https://webassembly.org/, retrieved December 30, 2019.

Cloudflare Expanded to 200 Cities in 2019

CloudFlare Blog -

We have exciting news: Cloudflare closed out the decade by reaching our 200th city* across 90+ countries. Each new location increases the security, performance, and reliability of the 20-million-plus Internet properties on our network. Over the last quarter, we turned up seven data centers spanning from Chattogram, Bangladesh all the way to the Hawaiian Islands:

Chattogram & Dhaka, Bangladesh. These data centers are our first in Bangladesh, ensuring that its 161 million residents will have a better experience on our network.

Honolulu, Hawaii, USA. Honolulu is one of the most remote cities in the world; with our Honolulu data center up and running, Hawaiian visitors can be served 2,400 miles closer than ever before! Hawaii is a hub for many submarine cables in the Pacific, meaning that some Pacific Islands will also see significant improvements.

Adelaide, Australia. Our 7th Australasian data center can be found “down under” in the capital of South Australia. Despite being Australia’s fifth-largest city, Adelaide is often overlooked for Australian interconnection. We, for one, are happy to establish a presence in it and its unique UTC+9:30 time zone!

Thimphu, Bhutan. Bhutan is the seventh SAARC (South Asian Association for Regional Cooperation) country with a Cloudflare network presence. Thimphu is our first Bhutanese data center, continuing our mission of security and performance for all.

St George’s, Grenada. Our Grenadian data center is joining the Grenada Internet Exchange (GREX), the first non-profit Internet Exchange (IX) in the English-speaking Caribbean.

We’ve come a long way since our launch in 2010, moving from colocating in key Internet hubs to fanning out across the globe and partnering with local ISPs. This has allowed us to offer security, performance, and reliability to Internet users in all corners of the world. In addition to the 35 cities we added in 2019, we expanded our existing data centers behind-the-scenes. We believe there are a lot of opportunities to harness in 2020 as we look to bring our network and its edge-computing power closer and closer to everyone on the Internet.

*Includes cities where we have data centers with active Internet ports and those where we are configuring our servers to handle traffic for more customers (at the time of publishing).

Adopting a new approach to HTTP prioritization

CloudFlare Blog -

Friday the 13th is a lucky day for Cloudflare for many reasons. On December 13, 2019 Tommy Pauly, co-chair of the IETF HTTP Working Group, announced the adoption of the "Extensible Prioritization Scheme for HTTP" - a new approach to HTTP prioritization.

Web pages are made up of many resources that must be downloaded before they can be presented to the user. The role of HTTP prioritization is to load the right bytes at the right time in order to achieve the best performance. This is a collaborative process between client and server: a client sends priority signals that the server can use to schedule the delivery of response data. In HTTP/1.1 the signal is basic: clients order requests smartly across a pool of about 6 connections. In HTTP/2 a single connection is used and clients send a signal per request, as a frame, which describes the relative dependency and weighting of the response. HTTP/3 tried to use the same approach but dependencies don't work well when signals can be delivered out of order.

HTTP/3 is being standardised as part of the QUIC effort. As a Working Group (WG) we've been trying to fix the problems that non-deterministic ordering poses for HTTP priorities. However, in parallel some of us have been working on an alternative solution, the Extensible Prioritization Scheme, which fixes problems by dropping dependencies and using an absolute weighting. This is signalled in an HTTP header field, meaning it can be backported to work with HTTP/2 or carried over HTTP/1.1 hops. The alternative proposal is documented in the Individual-Draft draft-kazuho-httpbis-priority-04, co-authored by Kazuho Oku (Fastly) and myself. This has now been adopted by the IETF HTTP WG as the basis of further work; its adopted name will be draft-ietf-httpbis-priority-00.

To some extent document adoption is the end of one journey and the start of the next; sometimes the authors of the original work are not the best people to oversee the next phase. However, I'm pleased to say that Kazuho and I have been selected as co-editors of this new document. In this role we will reflect the consensus of the WG and help steward the next chapter of HTTP prioritization standardisation. Before the next journey begins in earnest, I wanted to take the opportunity to share my thoughts on the story of developing the alternative prioritization scheme through 2019.

I'd love to explain all the details of this new approach to HTTP prioritization, but the truth is I expect the standardization process to refine the design and for things to go stale quickly. However, it doesn't hurt to give a taste of what's in store; just be aware that it is all subject to change.

A recap on priorities

The essence of HTTP prioritization comes down to trying to download many things over constrained connectivity. To borrow some text from Pat Meenan: Web pages are made up of dozens (sometimes hundreds) of separate resources that are loaded and assembled by a browser into the final displayed content. Since it is not possible to download everything immediately, we prefer to fetch more important things before less important ones. The challenge comes in signalling the importance from client to server.

In HTTP/2, every connection has a priority tree that expresses the relative importance between requests. Servers use this to determine how to schedule sending response data. The tree starts with a single root node and as requests are made they either depend on the root or each other.
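Before working through the examples that follow, here is a minimal illustrative sketch of that tree model in TypeScript. It is not how any particular server implements it; the stream IDs and the default weight of 16 are simply HTTP/2 conventions, and the bandwidth-sharing function is a simplification of what a real scheduler does.

// Illustrative sketch of the HTTP/2 priority tree described above.
// Each stream depends on a parent (0 means "the root") and carries a weight;
// siblings share bandwidth in proportion to their weights.
interface StreamNode {
  id: number;      // client-initiated HTTP/2 streams use odd IDs: 1, 3, 5, ...
  parent: number;  // 0 = depends on the root of the tree
  weight: number;  // HTTP/2 default is 16
}

// For streams that all depend on the same parent, a simple scheduler gives each
// one a share of the next write proportional to its weight (round-robin when equal).
function shareOfNextWrite(siblings: StreamNode[], id: number): number {
  const total = siblings.reduce((sum, s) => sum + s.weight, 0);
  const me = siblings.find((s) => s.id === id);
  return me ? me.weight / total : 0;
}

// Three GET requests that all depend on the root with the default weight:
const tree: StreamNode[] = [
  { id: 1, parent: 0, weight: 16 },
  { id: 3, parent: 0, weight: 16 },
  { id: 5, parent: 0, weight: 16 },
];
console.log(shareOfNextWrite(tree, 1)); // 0.333...: equal weights mean round-robin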
Servers may use the tree to decide how to schedule sending resources, but clients cannot force a server to behave in any particular way.

To illustrate, imagine a client that makes three simple GET requests that all depend on root. As the server receives each request it grows its view of the priority tree:

The server starts with only the root node of the priority tree. As requests arrive, the tree grows. In this case all requests depend on the root, so the requests are priority siblings.

Once all requests are received, the server determines that all requests have equal priority and that it should send response data using round-robin scheduling: send some fraction of response 1, then a fraction of response 2, then a fraction of response 3, and repeat until all responses are complete.

A single HTTP/2 request-response exchange is made up of frames that are sent on a stream. A simple GET request would be sent using a single HEADERS frame:

HTTP/2 HEADERS frame

Each region of a frame is a named field; a '?' indicates the field is optional, and the value in parentheses is the length in bytes, with '*' meaning variable length. The Header Block Fragment field holds compressed HTTP header fields (using HPACK), Pad Length and Padding relate to optional padding, and E, Stream Dependency and Weight combined are the priority signal that controls the priority tree.

The Stream Dependency and Weight fields are optional, but their absence is interpreted as a signal to use the default values: dependency on the root with a weight of 16, meaning that the default priority scheduling strategy is round-robin. However, this is often a bad choice because important resources like HTML, CSS and JavaScript are tied up with things like large images. The following animation demonstrates this in the Edge browser, causing the page to be blank for 19 seconds. Our deep dive blog post explains the problem further.

The HEADERS frame E field is the interesting bit (pun intended). A request with the field set to 1 (true) means that the dependency is exclusive and nothing else can depend on the indicated node. To illustrate, imagine a client that sends three requests which set the E field to 1. As the server receives each request, it interprets this as an exclusive dependency on the root node. Because all requests have the same dependency on root, the tree has to be shuffled around to satisfy the exclusivity rules.

Each request has an exclusive dependency on the root node. The tree is shuffled as each request is received by the server.

The final version of the tree looks very different from our previous example. The server would schedule all of response 3, then all of response 2, then all of response 1. This could help load all of an HTML file before an image and thus improve the visual load behaviour.

In reality, clients load a lot more than three resources and use a mix of priority signals. To understand the priority of any single request, we need to understand all requests. That presents some technological challenges, especially for servers that act like proxies such as the Cloudflare edge network. Some servers have problems applying prioritization effectively.

Because not all clients send the most optimal priority signals, we were motivated to develop Cloudflare's Enhanced HTTP/2 Prioritization, announced last May during Speed Week. This was a joint project between the Speed team (Andrew Galloni, Pat Meenan, Kornel Lesiński), the Protocols team (Nick Jones, Shih-Chiang Chien) and others.
It replaces the complicated priority tree with a simpler scheme that is well suited to web resources. Because the feature is implemented on the server side, we avoid requiring any modification of clients or the HTTP/2 protocol itself. Be sure to check out my colleague Nick's blog post that details some of the technical challenges and changes needed to let our servers deliver smarter priorities.

The Extensible Prioritization Scheme proposal

The scheme specified in draft-kazuho-httpbis-priority-04 defines a way for priorities to be expressed in absolute terms. It replaces HTTP/2's dependency-based relative prioritization: the priority of a request is independent of others, which makes it easier to reason about and easier to schedule.

Rather than send the priority signal in a frame, the scheme defines an HTTP header - tentatively named "Priority" - that can carry an urgency on a scale of 0 (highest) to 7 (lowest). For example, a client could express the priority of an important resource by sending a request with:

Priority: u=0

And a less important background resource could be requested with:

Priority: u=7

(A short illustrative sketch of this header in use appears at the end of this post.)

While Kazuho and I are the main authors of this specification, we were inspired by several ideas in the Internet community, and we have incorporated feedback or direct input from many of our peers over several drafts. The text today reflects the efforts-so-far of cross-industry work involving many engineers and researchers, including organizations such as Adobe, Akamai, Apple, Cloudflare, Fastly, Facebook, Google, Microsoft, Mozilla and UHasselt. Adoption in the HTTP Working Group means that we can help improve the design and specification by spending some IETF time and resources for broader discussion, feedback and implementation experience.

The backstory

I work in Cloudflare's Protocols team, which is responsible for terminating HTTP at the edge. We deal with things like TCP, TLS, QUIC, HTTP/1.x, HTTP/2 and HTTP/3, and since joining the company I've worked with Alessandro Ghedini, Junho Choi and Lohith Bellad to make QUIC and HTTP/3 generally available last September.

Working on emerging standards is fun. It involves an eclectic mix of engineering, meetings, document review, specification writing, time zones, personalities, and organizational boundaries. So while working on the codebase of quiche, our open source implementation of QUIC and HTTP/3, I am also mulling over design details of the protocols and discussing them in cross-industry venues like the IETF.

Because of HTTP/3's lineage, it carries over a lot of features from HTTP/2, including the priority signals and tree described earlier in the post.

One of the key benefits of HTTP/3 is that it is more resilient to the effect of lossy network conditions on performance; head-of-line blocking is limited because requests and responses can progress independently. This is, however, a double-edged sword because sometimes ordering is important. In HTTP/3 there is no guarantee that the requests are received in the same order that they were sent, so the priority tree can get out of sync between client and server. Imagine a client that makes two requests that include priority signals stating request 1 depends on root, request 2 depends on request 1. If request 2 arrives before request 1, the dependency cannot be resolved and becomes dangling. In such a case what is the best thing for a server to do? Ambiguity in behaviour leads to assumptions and disappointment.
We should try to avoid that.Request 1 depends on root and request 2 depends on request 1. If an HTTP/3 server receives request 2 first, the dependency cannot be resolved.This is just one example where things get tricky quickly. Unfortunately the WG kept finding edge case upon edge case with the priority tree model. We tried to find solutions but each additional fix seemed to create further complexity to the HTTP/3 design. This is a problem because it makes it hard to implement a server that handles priority correctly.In parallel to Cloudflare's work on implementing a better prioritization for HTTP/2, in January 2019 Pat posted his proposal for an alternative prioritization scheme for HTTP/3 in a message to the IETF HTTP WG.Arguably HTTP/2 prioritization never lived up to its hype. However, replacing it with something else in HTTP/3 is a challenge because the QUIC WG charter required us to try and maintain parity between the protocols. Mark Nottingham, co-chair of the HTTP and QUIC WGs responded with a good summary of the situation. To quote part of that response:My sense is that people know that we need to do something about prioritisation, but we're not yet confident about any particular solution. Experimentation with new schemes as HTTP/2 extensions would be very helpful, as it would give us some data to work with. If you'd like to propose such an extension, this is the right place to do it.And so started a very interesting year of cross-industry discussion on the future of HTTP prioritization.A year of prioritizationThe following is an account of my personal experiences during 2019. It's been a busy year and there may be unintentional errors or omissions, please let me know if you think that is the case. But I hope it gives you a taste of the standardization process and a look behind the scenes of how new Internet protocols that benefit everyone come to life.JanuaryPat's email came at the same time that I was attending the QUIC WG Tokyo interim meeting hosted at Akamai (thanks to Mike Bishop for arrangements). So I was able to speak to a few people face-to-face on the topic. There was a bit of mailing list chatter but it tailed off after a few days.February to AprilThings remained quiet in terms of prioritization discussion. I knew the next best opportunity to get the ball rolling would be the HTTP Workshop 2019 held in April. The workshop is a multi-day event not associated with a standards-defining-organization (even if many of the attendees also go to meetings such as the IETF or W3C). It is structured in a way that allows the agenda to be more fluid than a typical standards meeting and gives plenty of time for organic conversation. This sometimes helps overcome gnarly problems, such as the community finding a path forward for WebSockets over HTTP/2 due to a productive discussion during the 2017 workshop. HTTP prioritization is a gnarly problem, so I was inspired to pitch it as a talk idea. It was selected and you can find the full slide deck here.During the presentation I recounted the history of HTTP prioritization. The great thing about working on open standards is that many email threads, presentation materials and meeting materials are publicly archived. It's fun digging through this history. Did you know: HTTP/2 is based on SPDY and inherited its weight-based prioritization scheme, the tree-based scheme we are familiar with today was only introduced in draft-ietf-httpbis-http2-11? One of the reasons for the more-complicated tree was to help HTTP intermediaries (a.k.a. 
proxies) implement clever resource management. However, it became clear during the discussion that no intermediaries implement this, and none seem to plan to. I also explained a bit more about Pat's alternative scheme and Nick described his implementation experiences. Despite some interesting discussion around the topic however, we didn't come to any definitive solution. There were a lot of other interesting topics to discover that week.MayIn early May, Ian Swett (Google) restarted interest in Pat's mailing list thread. Unfortunately he was not present at the HTTP Workshop so had some catching up to do. A little while later Ian submitted a Pull Request to the HTTP/3 specification called "Strict Priorities". This incorporated Pat's proposal and attempted to fix a number of those prioritization edge cases that I mentioned earlier.In late May, another QUIC WG interim meeting was held in London at the new Cloudflare offices, here is the view from the meeting room window. Credit to Alessandro for handling the meeting arrangements.Thanks to @cloudflare for hosting our interop and interim meetings in London this week! pic.twitter.com/LIOA3OqEjr— IETF QUIC WG (@quicwg) May 23, 2019 Mike, the editor of the HTTP/3 specification presented some of the issues with prioritization and we attempted to solve them with the conventional tree-based scheme. Ian, with contribution from Robin Marx (UHasselt), also presented an explanation about his "Strict Priorities" proposal. I recommend taking a look at Robin's priority tree visualisations which do a great job of explaining things. From that presentation I particularly liked "The prioritization spectrum", it's a concise snapshot of the state of things at that time:An overview of HTTP/3 prioritization issues, fixes and possible alternatives. Presented by Ian Swett at the QUIC Interim Meeting May 2019.June and JulyFollowing the interim meeting, the prioritization "debate" continued electronically across GitHub and email. Some time in June Kazuho started work on a proposal that would use a scheme similar to Pat and Ian's absolute priorities. The major difference was that rather than send the priority signal in an HTTP frame, it would use a header field. This isn't a new concept, Roy Fielding proposed something similar at IETF 83.In HTTP/2 and HTTP/3 requests are made up of frames that are sent on streams. Using a simple GET request as an example: a client sends a HEADERS frame that contains the scheme, method, path, and other request header fields. A server responds with a HEADERS frame that contains the status and response header fields, followed by DATA frame(s) that contain the payload.To signal priority, a client could also send a PRIORITY frame. In the tree-based scheme the frame carries several fields that express dependencies and weights. Pat and Ian's proposals changed the contents of the PRIORITY frame. Kazuho's proposal encodes the priority as a header field that can be carried in the HEADERS frame as normal metadata, removing the need for the PRIORITY frame altogether.I liked the simplification of Kazuho's approach and the new opportunities it might create for application developers. HTTP/2 and HTTP/3 implementations (in particular browsers) abstract away a lot of connection-level details such as stream or frames. That makes it hard to understand what is happening or to tune it.The lingua franca of the Web is HTTP requests and responses, which are formed of header fields and payload data. 
In browsers, APIs such as Fetch and Service Worker allow handling of these primitives. In servers, there may be ways to interact with the primitives via configuration or programming languages. As part of Enhanced HTTP/2 Prioritization, we have exposed prioritization to Cloudflare Workers to allow rich behavioural customization. If a Worker adds the "cf-priority" header to a response, Cloudflare’s edge servers use the specified priority to serve the response. This might be used to boost the priority of a resource that is important to the load time of a page. To help inform this decision making, the incoming browser priority signal is encapsulated in the request object passed to a Worker's fetch event listener (request.cf.requestPriority).Standardising approaches to problems is part of helping to build a better Internet. Because of the resonance between Cloudflare's work and Kazuho's proposal, I asked if he would consider letting me come aboard as a co-author. He kindly accepted and on July 8th we published the first version as an Internet-Draft.Meanwhile, Ian was helping to drive the overall prioritization discussion and proposed that we use time during IETF 105 in Montreal to speak to a wider group of people. We kicked off the week with a short presentation to the HTTP WG from Ian, and Kazuho and I presented our draft in a side-meeting that saw a healthy discussion. There was a realization that the concepts of prioritization scheme, priority signalling and server resource scheduling (enacting prioritization) were conflated and made effective communication and progress difficult. HTTP/2's model was seen as one aspect, and two different I-Ds were created to deprecate it in some way (draft-lassey-priority-setting, draft-peon-httpbis-h2-priority-one-less). Martin Thomson (Mozilla) also created a Pull Request that simply removed the PRIORITY frame from HTTP/3.To round off the week, in the second HTTP session it was decided that there was sufficient interest in resolving the prioritization debate via the creation of a design team. I joined the team led by Ian Swett along with others from Adobe, Akamai, Apple, Cloudflare, Fastly, Facebook, Google, Microsoft, and UHasselt.August to OctoberMartin's PR generated a lot of conversation. It was merged under proviso that some solution be found before the HTTP/3 specification was finalized. Between May and August we went from something very complicated (e.g. Orphan placeholder, with PRIORITY only on control stream, plus exclusive priorities) to a blank canvas. The pressure was now on!The design team held several teleconference meetings across the months. Logistics are a bit difficult when you have team members distributed across West Coast America, East Coast America, Western Europe, Central Europe, and Japan. However, thanks to some late nights and early mornings we managed to all get on the call at the same time.In October most of us travelled to Cupertino, CA to attend another QUIC interim meeting hosted at Apple's Infinite Loop (Eric Kinnear helping with arrangements).  The first two days of the meeting were used for interop testing and were loosely structured, so the design team took the opportunity to hold the first face-to-face meeting. We made some progress and helped Ian to form up some new slides to present later in the week. 
Again, there was some useful discussion and signs that we should put some time in the agenda at IETF 106.

November

The design team came to an agreement that draft-kazuho-httpbis-priority was a good basis for a new prioritization scheme. We decided to consolidate the various I-Ds that had sprung up during IETF 105 into the document, making it a single source that was easier for people to track progress and open issues if required. This is why, even though Kazuho and I are the named authors, the document reflects broad input from the community. We published draft 03 in November, just ahead of the deadline for IETF 106 in Singapore.

Many of us travelled to Singapore ahead of the actual start of IETF 106. This wasn't to squeeze in some sightseeing (sadly) but rather to attend the IETF Hackathon. These are events where engineers and researchers can really put the concept of "running code" to the test. I really enjoy attending and I'm grateful to Charles Eckel and the team that organised it. If you'd like to read more about the event, Charles wrote up a nice blog post that, through some strange coincidence, features a picture of me, Kazuho and Robin talking at the QUIC table.

Link: https://t.co/8qP78O6cPS— Lucas Pardue (@SimmerVigor) December 17, 2019

The design team held another face-to-face during a Hackathon lunch break and decided that we wanted to make some tweaks to the design written up in draft 03. Unfortunately the freeze was still in effect, so we could not issue a new draft. Instead, we presented the most recent thinking to the HTTP session on Monday, where Ian put forward draft-kazuho-httpbis-priority as the group's proposed design solution. Ian and Robin also shared results of prioritization experiments. We received some great feedback in the meeting and during the week pulled out all the stops to issue a new draft 04 before the next HTTP session on Thursday. The question now was: did the WG think this was suitable to adopt as the basis of an alternative prioritization scheme? I think we addressed a lot of the feedback in this draft and there was a general feeling of support in the room. However, in the IETF consensus is declared via mailing lists, and so Tommy Pauly, co-chair of the HTTP WG, put out a Call for Adoption on November 21st.

December

In the Cloudflare London office, preparations begin for mince pie acquisition and assessment.

The HTTP priorities team played the waiting game and watched the mailing list discussion. On the whole people supported the concept, but there was one topic that divided opinion: some people loved the use of headers to express priorities, some people didn't and wanted to stick to frames.

On December 13th Tommy announced that the group had decided to adopt our document and assign Kazuho and me as editors. The header/frame divide was noted as something that needed to be resolved.

The next step of the journey

Just because the document has been adopted does not mean we are done. In some ways we are just getting started. Perfection is often the enemy of getting things done, and so sometimes adoption occurs at the first incarnation of a "good enough" proposal.

Today HTTP/3 has no prioritization signal. Without priority information there is a small danger that servers pick a scheduling strategy that is not optimal, which could cause the web performance of HTTP/3 to be worse than HTTP/2. To avoid that happening we'll refine and complete the design of the Extensible Priority Scheme.
To do so there are open issues that we have to resolve, we'll need to square the circle on headers vs. frames, and we'll no doubt hit unknown unknowns. We'll need the input of the WG to make progress and their help to document the design that fits the need, and so I look forward to continued collaboration across the Internet community.2019 was quite a ride and I'm excited to see what 2020 brings.If working on protocols is your interest and you like what Cloudflare is doing, please visit our careers page. Our journey isn’t finished, in fact far from it.
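To close with something concrete: under the adopted scheme as it stood in draft-kazuho-httpbis-priority, a client hints urgency with a header rather than a frame. The sketch below is illustrative only; whether a given client lets you set the header yourself, and the exact header syntax, are assumptions that may well change as the WG refines the design.

// Illustrative only: urgency runs from 0 (highest) to 7 (lowest), carried in a
// "Priority" request header per the draft discussed above. The URLs are made up.
const critical = fetch("https://example.com/app.css", {
  headers: { Priority: "u=0" }, // render-blocking stylesheet: highest urgency
});
const background = fetch("https://example.com/beacon.js", {
  headers: { Priority: "u=7" }, // analytics beacon: lowest urgency
});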

Happy Holidays!

CloudFlare Blog -

I joined Cloudflare in July of 2019, but I've known of Cloudflare for years. I always read the blog posts and looked at the way the company was engaging with the community. I also noticed the diversity in the names of many of the blog post authors. There are over 50 languages spoken at Cloudflare, as we have natives from many countries on our team, with different backgrounds, religions, gender and cultures. And it is this diversity that makes us a great team.A few days ago I asked one of my colleagues how he would say "Happy Holidays!" in Arabic. When I heard him say it, I instantly got the idea of recording a video in as many languages as possible of our colleagues wishing all of you, our readers and customers, a happy winter season.It only took one internal message for people to start responding and sending their videos to me. Some did it themselves, others flocked in a meeting room and helped each other record their greeting. It took a few days and some video editing to put together an informal video that was entirely done by the team, to wish you all the best as we close this year and decade.So here it is: Happy Holidays from all of us at Cloudflare!Let us know if you speak any of the languages in the video. Or maybe you can tell us how you greet each other, at this time of the year, in your native language.

This holiday's biggest online shopping day was... Black Friday

CloudFlare Blog -

What’s the biggest day of the holiday season for holiday shopping? Black Friday, the day after US Thanksgiving, has been embraced globally as the day retail stores announce their sales. But it was believed that the following Monday, dubbed “Cyber Monday,” may be even bigger. Or, with the explosion of reliable 2-day and even 1-day shipping, maybe another day closer to Christmas has taken the crown. At Cloudflare, we aimed to answer this question for the 2019 holiday shopping season.Black Friday was the biggest online shopping day but the second biggest wasn't Cyber Monday... it was Thanksgiving Day itself (the day before Black Friday!). Cyber Monday was the fourth biggest day. Here's a look at checkout events seen across Cloudflare's network since before Thanksgiving in the US.Checkout events as a percentage of checkouts on Black FridayThe weekends are shown in yellow and Black Friday and Cyber Monday are shown in green. You can see that checkouts ramped up during Thanksgiving week and then continued through the weekend into Cyber Monday.Black Friday had twice the number of checkouts as the preceding Friday and the entire Thanksgiving week dominates. Post-Cyber Monday, no day reached 50% of the number of checkouts we saw on Black Friday. And Cyber Monday was just 60% of Black Friday.So, Black Friday is the peak day but Thanksgiving Day is the runner up. Perhaps it deserves its own moniker: Thrifty Thursday anyone?Checkouts occur more frequently from Monday to Friday and then drop off over the weekend.  After Cyber Monday only one other day showed an interesting peak. Looking at last week it does appear that Tuesday, December 17 was the pre-Christmas peak for online checkouts. Perhaps fast online shipping made consumers feel they could use online shopping as long as they got their purchases by the weekend before Christmas.Happy Holidays from everyone at Cloudflare!

First Half 2019 Transparency Report and an Update on a Warrant Canary

CloudFlare Blog -

Today, we are releasing Cloudflare’s transparency report for the first half of 2019. We recognize the importance of keeping the reports current, but it’s taken us a little longer than usual to put it together. We have a few notable updates.

Pulling a warrant canary

Since we issued our very first transparency report in 2014, we’ve maintained a number of commitments - known as warrant canaries - about what actions we will take and how we will respond to certain types of law enforcement requests. We supplemented those initial commitments earlier this year, so that our current warrant canaries state that Cloudflare has never:

Turned over our encryption or authentication keys or our customers' encryption or authentication keys to anyone.
Installed any law enforcement software or equipment anywhere on our network.
Terminated a customer or taken down content due to political pressure*
Provided any law enforcement organization a feed of our customers' content transiting our network.
Modified customer content at the request of law enforcement or another third party.
Modified the intended destination of DNS responses at the request of law enforcement or another third party.
Weakened, compromised, or subverted any of its encryption at the request of law enforcement or another third party.

These commitments serve as a statement of values to remind us what is important to us as a company, to convey not only what we do, but what we believe we should do. For us to maintain these commitments, we have to believe not only that we’ve met them in the past, but that we can continue to meet them. Unfortunately, there is one warrant canary that no longer meets the test for remaining on our website. After Cloudflare terminated the Daily Stormer’s service in 2017, Matthew observed:

"We're going to have a long debate internally about whether we need to remove the bullet about not terminating a customer due to political pressure. It's powerful to be able to say you've never done something. And, after today, make no mistake, it will be a little bit harder for us to argue against a government somewhere pressuring us into taking down a site they don't like."

We addressed this issue in our subsequent transparency reports by retaining the statement, but adding an asterisk identifying the Daily Stormer debate and the criticism that we had received in the wake of our decision to terminate services. Our goal was to signal that we remained committed to the principle that we should not terminate a customer due to political pressure, while not ignoring the termination. We also sought to be public about the termination and our reasons for the decision, ensuring that it would not go unnoticed.

Although that termination sparked significant debate about whether infrastructure companies should be making decisions about what content remains online, we haven’t yet seen politically accountable actors put forth real alternatives to address deeply troubling content and behavior online. Since that time, we’ve seen even more real world consequences from the vitriol and hateful content spread online, from the screeds posted in connection with the terror attacks in Christchurch, Poway and El Paso to the posting of video glorifying those attacks. Indeed, in the absence of true public policy initiatives to address those concerns, the pressure on tech companies -- even deep Internet infrastructure companies like Cloudflare -- to make judgments about what stays online has only increased.
In August 2019, Cloudflare terminated service to 8chan based on their failure to moderate their hate-filled platform in a way that inspired murderous acts. Although we don’t think removing cybersecurity services to force a site offline is the right public policy approach to the hate festering online, a site’s failure to take responsibility to prevent or mitigate the harm caused by its platform leaves service providers like us with few choices. We’ve come to recognize that the prolonged and persistent lawlessness of others might require action by those further down the technical stack. Although we’d prefer that governments recognize that need, and build mechanisms for due process, if they fail to act, infrastructure companies may be required to take action to prevent harm. And that brings us back to our warrant canary. If we believe we might have an obligation to terminate customers, even in a limited number of cases, retaining a commitment that we will never terminate a customer “due to political pressure” is untenable. We could, in theory, argue that terminating a lawless customer like 8chan was not a termination “due to political pressure.”  But that seems wrong.  We shouldn’t be parsing specific words of our commitments to explain to people why we don’t believe we’ve violated the standard. We remain committed to the principle that providing cybersecurity services to everyone, regardless of content, makes the Internet a better place. Although we’re removing the warrant canary from our website, we believe that to earn and maintain our users’ trust, we must be transparent about the actions we take. We therefore commit to reporting on any action that we take to terminate a user that could be viewed as a termination “due to political pressure.”UK/US Cloud agreementAs we’ve described previously, governments have been working to find ways to improve law enforcement access to digital evidence across borders. Those efforts resulted in a new U.S. law, the Clarifying Lawful Overseas Use of Data (CLOUD) Act, premised on the idea that law enforcement around the world should be able to get access to electronic content related to their citizens when conducting law enforcement investigations, wherever that data is stored, as long as they are bound by sufficient procedural safeguards to ensure due process.On October 3, 2019, the US and UK signed the first Executive Agreement under this law.  According to the requirements of U.S. law, that Agreement will go into effect in 180 days, in March 2020, unless Congress takes action to block it.  There is an ongoing debate as to whether the agreement includes sufficient due process and privacy protections. We’re going to take a wait and see approach, and will closely monitor any requests we receive after the agreement goes into effect.For the time being, Cloudflare intends to comply with appropriately scoped and targeted requests for data from UK law enforcement, provided that those requests are consistent with the law and international human rights standards. Information about the legal requests that Cloudflare receives from non-U.S. governments pursuant to the CLOUD Act will be included in future transparency reports.

An Update on CDNJS

CloudFlare Blog -

When you loaded this blog, a file was delivered to your browser called jquery-3.2.1.min.js. jQuery is a library which makes it easier to build websites, and was at one point included on as many as 74.1% of all websites. A full eighteen million sites include jQuery and other libraries using one of the most popular tools on Earth: CDNJS. Beginning about a month ago Cloudflare began to take a more active role in the operation of CDNJS. This post is here to tell you more about CDNJS’ history and explain why we are helping to manage CDNJS.

What CDNJS Does

Virtually every site is composed of not just the code written by its developers, but also dozens or hundreds of libraries. These libraries make it possible for websites to extend what a web browser can do on its own. For example, libraries can allow a site to include powerful data visualizations, respond to user input, or even get more performant.

These libraries created wondrous and magical new capabilities for web browsers, but they can also cause the size of a site to explode. Particularly a decade ago, connections were not always fast enough to permit the use of many libraries while maintaining performance. But if so many websites are all including the same libraries, why was it necessary for each of them to load their own copy?

If we all load jQuery from the same place, the browser can do a much better job of not actually needing to download it for every site. When the user visits the first jQuery-powered site it will have to be downloaded, but it will already be cached on the user's computer for any subsequent jQuery-powered site they might visit.

The first visit might take time to load:

But any future visit to any website pointing to this common URL would already be cached:

<!-- Loaded only on my site, will need to be downloaded by every user -->
<script src="./jquery.js"></script>

<!-- Loaded from a common location across many sites -->
<script src="https://cdnjs.cloudflare.com/jquery.js"></script>

Beyond the performance advantage, including files this way also made it very easy for users to experiment and create. When using a web browser as a creation tool users often didn't have elaborate build systems (this was also before npm), so being able to include a simple script tag was a boon. It's worth noting that it's not clear a massive performance advantage was ever actually provided by this scheme. It is becoming even less of a performance advantage now that browser vendors are beginning to use separate caches for each website you visit, but with millions of sites using CDNJS there's no doubt it is a critical part of the web.

A CDN for all of us

My first Pull Request into the CDNJS project was in 2013. Back then if you created a JavaScript project it wasn't possible to have it included in the jQuery CDN, or the ones provided by large companies like Google and Microsoft. They were only for big, important projects. Of course, however, even the biggest project starts small. The community needed a CDN which would agree to host nearly all JavaScript projects, even the ones which weren't world-changing (yet). In 2011, that project was launched by Ryan Kirkman and Thomas Davis as CDNJS.

The project was quickly wildly successful, far beyond their expectations. Their CDN bill quickly began to skyrocket (it would now be over a million dollars a year on AWS). Under the threat of having to shut down the service, Cloudflare was approached by the CDNJS team to see if we could help.
We agreed to support their efforts and created cdnjs.cloudflare.com which serves the CDNJS project free of charge.CDNJS has been astonishingly successful. The project is currently installed on over eighteen million websites (10% of the Internet!), offers files totaling over 1.5 billion lines of code, and serves over 173 billion requests a month. CDNJS only gets more popular as sites get larger, with 34% of the top 10k websites using the service. Each month we serve almost three petabytes of JavaScript, CSS, and other resources which power the web via cdnjs.cloudflare.com.Spikes can happen when a very large or popular site installs CDNJS, or when a misbehaving web crawler discovers a CDNJS link.The future value of CDNJS is now in doubt, as web browsers are beginning to use a separate cache for every website you visit. It is currently used on such a wide swath of the web, however, it is unlikely it will be disappearing any time soon.How CDNJS WorksCDNJS starts with a Github repo. That project contains every file served by CDNJS, at every version which it has ever offered. That’s 182 GB without the commit history, over five million files, and over 1.5 billion lines of code. Given that it stores and delivers versioned code files, in many ways it was the Internet’s first JavaScript package manager. Unlike other package managers and even other CDNs everything CDNJS serves is publicly versioned. All 67,724 commits! This means you as a user can verify that you are being served files which haven’t been tampered with.To make changes to CDNJS a commit has to be made. For new projects being added to CDNJS, or when projects change significantly, these commits are made by humans, and get reviewed by other humans. When projects just release new versions there is a bot made by Peter and maintained by Sven which sucks up changes from npm and automatically creates commits.Within Cloudflare’s infrastructure there is a set of machines which are responsible for pulling the latest version of the repo periodically. Those machines then become the origin for cdnjs.cloudflare.com, with Cloudflare’s Global Load Balancer automatically handling failures. Cloudflare’s cache automatically stores copies of many of the projects making it possible for us to deliver them quickly from all 195 of our data centers.The Internet on a Shoestring BudgetThe CDNJS project has always been administered independently of Cloudflare. In addition to the founders, the project has additionally been maintained by exceptionally hard-working caretakers like Peter and Matt Cowley. Maintaining a single repo of nearly every frontend project on Earth is no small task, and it has required a substantial amount of both manual work and bot development.Unfortunately approximately thirty days ago one of those bots stopped working, preventing updated projects from appearing in CDNJS. The bot's open-source maintainer was not able to invest the time necessary to keep the bot running. After several weeks we were asked by the community and the CDNJS founders to take over maintenance of the CDNJS repo itself. This means the Cloudflare engineering team is taking responsibility for keeping the contents of github.com/cdnjs/cdnjs up to date, in addition to ensuring it is correctly served on cdnjs.cloudflare.com.We agreed to do this because we were, frankly, scared. Like so many open-source projects CDNJS was a critical part of our world, but wasn’t getting the attention it needed to survive. 
The Internet relies on CDNJS as much as on any other single project, losing it or allowing it to be commandeered would be catastrophic to millions of websites and their visitors. If it began to fail, some sites would adapt and update, others would be broken forever. CDNJS has always been, and remains, a project for and by the community. We are invested in making all decisions in a transparent and inclusive manner. If you are interested in contributing to CDNJS or in the topics we're currently discussing please visit the CDNJS Github Issues page.A Plan for the FutureOne example of an area where we could use your help is in charting a path towards a CDNJS which requires less manual moderation. Nothing can replace the intelligence and creativity of a human (yet), but for a task like managing what resources go into a CDN, it is error prone and time consuming. At present a human has to review every new project to be included, and often has to take additional steps to include new versions of a project. As a part of our analysis of the project we examined a snapshot of the still-open PRs made against CDNJS for several months:The vast majority of these PRs were changes which ultimately passed the automated review but nevertheless couldn't be merged without manual review.There is consensus that we should move to a model which does not require human involvement in most cases. We would love your input and collaboration on the best way for that to be solved. If this is something you are passionate about, please contribute here.Our plan is to support the CDNJS project in whichever ways it requires for as long as the Internet relies upon it. We invite you to use CDNJS in your next project with the full assurance that it is backed by the same network and team who protect and accelerate over twenty million of your favorite websites across the Internet. We are also planning more posts diving further into the CDNJS data, subscribe to this blog if you would like to be notified upon their release.
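As a toy illustration of the kind of npm-watching automation described above, the sketch below polls the public npm registry for a library's latest version and flags when it differs from what is currently served. The registry endpoint is real, but the package choice, the way the "currently served" version is tracked, and the absence of the actual commit-creation step are all simplifications; the real CDNJS bot is considerably more involved.

// Toy sketch: check npm for a newer version of a package than the one served.
// Requires a runtime with a global fetch (for example Node 18+ or Deno).
async function latestOnNpm(pkg: string): Promise<string> {
  const res = await fetch(`https://registry.npmjs.org/${pkg}/latest`);
  if (!res.ok) throw new Error(`registry lookup failed: ${res.status}`);
  const meta = (await res.json()) as { version: string };
  return meta.version;
}

async function checkForUpdate(pkg: string, servedVersion: string): Promise<void> {
  const latest = await latestOnNpm(pkg);
  if (latest !== servedVersion) {
    // A real bot would fetch the new files here and create a commit or pull request.
    console.log(`${pkg}: ${servedVersion} -> ${latest} (update needed)`);
  } else {
    console.log(`${pkg}: up to date at ${servedVersion}`);
  }
}

checkForUpdate("jquery", "3.2.1").catch(console.error);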

Announcing the CSAM Scanning Tool, Free for All Cloudflare Customers

CloudFlare Blog -

Two weeks ago we wrote about Cloudflare's approach to dealing with child sexual abuse material (CSAM). We first began working with the National Center for Missing and Exploited Children (NCMEC), the US-based organization that acts as a clearinghouse for removing this abhorrent content, within months of our public launch in 2010. Over the last nine years, our Trust & Safety team has worked with NCMEC, Interpol, and nearly 60 other public and private agencies around the world to design our program. And we are proud of the work we've done to remove CSAM from the Internet.The most repugnant cases, in some ways, are the easiest for us to address. While Cloudflare is not able to remove content hosted by others, we will take steps to terminate services to a website when it becomes clear that the site is dedicated to sharing CSAM or if the operators of the website and its host fail to take appropriate steps to take down CSAM content. When we terminate websites, we purge our caches — something that takes effect within seconds globally — and we block the website from ever being able to use Cloudflare's network again.Addressing the Hard CasesThe hard cases are when a customer of ours runs a service that allows user generated content (such as a discussion forum) and a user uploads CSAM, or if they’re hacked, or if they have a malicious employee that is storing CSAM on their servers. We've seen many instances of these cases where services intending to do the right thing are caught completely off guard by CSAM that ended up on their sites. Despite the absence of intent or malice in these cases, there’s still a need to identify and remove that content quickly.Today we're proud to take a step to help deal with those hard cases. Beginning today, every Cloudflare customer can login to their dashboard and enable access to the CSAM Scanning Tool. As the CSAM Scanning Tool moves through development to production, the tool will check all Internet properties that have enabled CSAM Scanning for this illegal content. Cloudflare will automatically send a notice to you when it flags CSAM material, block that content from being accessed (with a 451 “blocked for legal reasons” status code), and take steps to support proper reporting of that content in compliance with legal obligations.CSAM Scanning will be available via the Cloudflare dashboard at no cost for all customers regardless of their plan level. You can find this tool under the “Caching” tab in your dashboard. We're hopeful that by opening this tool to all our customers for free we can help do even more to counter CSAM online and help protect our customers from the legal and reputational risk that CSAM can pose to their businesses.It has been a long journey to get to the point where we could commit to offering this service to our millions of users. To understand what we're doing and why it has been challenging from a technical and policy perspective, you need to understand a bit about the state of the art of tracking CSAM.Finding Similar ImagesAround the same time as Cloudflare was first conceived in 2009, a Dartmouth professor named Hany Farid was working on software that could compare images against a list of hashes maintained by NCMEC. Microsoft took the lead in creating a tool, PhotoDNA, that used Prof. Farid’s work to identify CSAM automatically. In its earliest days, Microsoft used PhotoDNA for their services internally and, in late 2009, donated the technology to NCMEC to help manage its use by other organizations. 
Social networks were some of the first adopters. In 2011, Facebook rolled out an implementation of the technology as part of their abuse process. Twitter incorporated it in 2014. The underlying technique is known as fuzzy hashing.

Traditional hash algorithms like MD5, SHA1, and SHA256 take a file (such as an image or document) of arbitrary length and output a fixed length number that is, effectively, the file’s digital fingerprint. For instance, if you take the MD5 of this picture then the resulting fingerprint is 605c83bf1bba62e85f4f5fccc56bc128.

The base image

If we change a single pixel in the picture to be slightly off white rather than pure white, it's visually identical but the fingerprint changes completely to 42ea4fb30a440d8787477c6c37b9daed. As you can see from the two fingerprints, a small change to the image results in a massive and unpredictable change to the output of a traditional hash.

The base image with a single pixel changed

This is great for some uses of hashing where you want to definitively identify if the document you're looking at is exactly the same as the one you've seen before. For example, if an extra zero is added to a digital contract, you want the hash of the document used in its signature to no longer be valid.

Fuzzy Hashing

However, in the case of CSAM, this characteristic of traditional hashing is a liability. In order to avoid detection, the criminals producing CSAM resize, add noise, or otherwise alter the image in such a way that it looks the same but would result in a radically different hash.

Fuzzy hashing works differently. Instead of determining if two photos are exactly the same, it instead attempts to get at the essence of a photograph. This allows the software to calculate hashes for two images and then compare the "distance" between the two. While the fuzzy hashes may still be different between two photographs that have been altered, unlike with traditional hashing, you can compare the two and see how similar the images are.

So, in the two photos above, the fuzzy hash of the first image is

00e308346a494a188e1042333147267a 653a16b94c33417c12b433095c318012 5612442030d1484ce82c613f4e224733 1dd84436734e4a5c6e25332e507a8218 6e3b89174e30372d

and the fuzzy hash of the second image is

00e308346a494a188e1042333147267a 653a16b94c33417c12b433095c318012 5612442030d1484ce82c613f4e224733 1dd84436734e4a5c6e25332e507a8218 6e3b89174e30372d

There's only a slight difference between the two in terms of pixels, and the fuzzy hashes are identical.

Fuzzy hashing is designed to be able to identify images that are substantially similar. For example, we modified the image of dogs by first enhancing its color, then changing it to sepia, then adding a border and finally adding random noise.

The base image after increasing the saturation, changing to sepia, adding a border and then adding random noise.

The fuzzy hash of the new image is

00d9082d6e454a19a20b4e3034493278 614219b14838447213ad3409672e7d13 6e0e4a2033de545ce731664646284337 1ecd4038794a485d7c21233f547a7d2e 663e7c1c40363335

This looks quite different from the hash of the unchanged image above, but fuzzy hashes are compared by seeing how close they are to each other. The largest possible distance between two images is about 5 million units. These two fuzzy hashes are just 4,913 units apart (the smaller the number, the more similar the images), indicating that they are substantially the same image.

Compare that with two unrelated photographs.
The photograph below has a fuzzy hash of

011a0d0323102d048148c92a4773b60d 0d343c02120615010d1a47017d108b14 d36fff4561aebb2f088a891208134202 3e21ff5b594bff5eff5bff6c2bc9ff77 1755ff511d14ff5b

The photograph below has a fuzzy hash of

062715154080356b8a52505955997751 9d221f4624000209034f1227438a8c6a 894e8b9d675a513873394a2f3d000722 781407ff475a36f9275160ff6f231eff 465a17f1224006ff

The distance between the two hashes is calculated as 713,061. Through experimentation, it's possible to set a distance threshold under which you can consider two photographs to be likely related. (A toy code sketch of this threshold comparison appears at the end of this post.)

Fuzzy Hashing's Intentionally Black Box

How does it work? While there has been lots of work on fuzzy hashing published, the innards of the process are intentionally a bit of a mystery. The New York Times recently wrote a story that was probably the most public discussion of how one such technology works. The challenge is that if criminal producers and distributors of CSAM knew exactly how such tools worked, they might be able to craft alterations to their images in order to beat them. To be clear, Cloudflare will be running the CSAM Scanning Tool on behalf of the website operator from within our secure points of presence. We will not be distributing the software directly to users. We will remain vigilant for potential attempted abuse of the platform, and will take prompt action as necessary.

Tradeoff Between False Negatives and False Positives

We have been working with a number of authorities on how we can best roll out this functionality to our customers. One of the challenges for a network with as diverse a set of customers as Cloudflare's is what the appropriate threshold should be to set the comparison distance between the fuzzy hashes.

If you set the threshold too strict — meaning that it's closer to a traditional hash and two images need to be virtually identical to trigger a match — then you're more likely to have many false negatives (i.e., CSAM that isn't flagged). If you set the threshold too loose, then it's possible to have many false positives. False positives may seem like the lesser evil, but there are legitimate concerns that increasing the possibility of false positives at scale could waste limited resources and further overwhelm the existing ecosystem. We will work to iterate the CSAM Scanning Tool to provide more granular control to the website owner while supporting the ongoing effectiveness of the ecosystem. Today, we believe we can offer a good first set of options for our customers that will allow us to more quickly flag CSAM without overwhelming the resources of the ecosystem.

Different Thresholds for Different Customers

The same desire for a granular approach was reflected in our conversations with our customers.
When we asked what was appropriate for them, the answer varied radically based on the type of business, how sophisticated its existing abuse process was, and its likely exposure to and tolerance for the risk of CSAM being posted on its site.

For instance, a mature social network using Cloudflare with a sophisticated abuse team may want the threshold set quite loose, but not want the material to be automatically blocked, because they have the resources to manually review whatever is flagged.

A new startup dedicated to providing a forum to new parents may want the threshold set quite loose and want any hits automatically blocked, because they haven't yet built a sophisticated abuse team and the risk to their brand is so high if CSAM is posted, even if that will result in some false positives.

A commercial financial institution may want to set the threshold quite strict, because they're less likely to have user generated content and would have a low tolerance for false positives, but then automatically block anything that's detected, because if their systems are somehow compromised to host known CSAM they want to stop it immediately.

Different Requirements for Different Jurisdictions

There also may be challenges based on where our customers are located and the laws and regulations that apply to them. Depending on where a customer's business is located and where they have users, they may choose to use one, more than one, or all of the different available hash lists.

In other words, one size does not fit all. Ideally, we believe allowing individual site owners to set the parameters that make the most sense for their particular site will result in lower false negative rates (i.e., more CSAM being flagged) than if we try to set one global standard for every one of our customers.

Improving the Tool Over Time

Over time, we are hopeful that we can improve CSAM screening for our customers. We expect that we will add additional lists of hashes from numerous global agencies for our customers with users around the world to subscribe to. We're committed to enabling this flexibility without overly burdening the ecosystem that is set up to fight this horrible crime.

Finally, we believe there may be an opportunity to help build the next generation of fuzzy hashing. For example, the software can only scan images that are at rest in memory on a machine, not those that are streaming. We're talking with Hany Farid, the former Dartmouth professor who now teaches at UC Berkeley, about ways that we may be able to build a more flexible fuzzy hashing system in order to flag images before they're even posted.

Concerns and Responsibility

One question we asked ourselves back when we began to consider offering CSAM scanning was whether we were the right place to be doing this at all. We share the universal concern about the distribution of depictions of horrific crimes against children, and believe such content should have no place on the Internet. However, Cloudflare is a network infrastructure provider, not a content platform. Still, we thought there was an appropriate role for us to play in this space. Fundamentally, Cloudflare delivers tools to our more than 2 million customers that were previously reserved for only the Internet giants.
The security, performance, and reliability services that we offer, often for free, would otherwise have been extremely expensive or limited to Internet giants like Facebook and Google.

Today there are startups working to build the next Internet giant and compete with Facebook and Google because they can use Cloudflare to be secure, fast, and reliable online. But, as the regulatory hurdles around dealing with incredibly difficult issues like CSAM continue to increase, many of them lack access to sophisticated tools to scan proactively for CSAM. You have to get big to get into the club that gives you access to these tools, and, concerningly, being in the club is increasingly a prerequisite to getting big.

If we want more competition for the Internet giants, we need to make these tools available more broadly, including to smaller organizations. From that perspective, we think it makes perfect sense for us to help democratize this powerful tool in the fight against CSAM.

We hope this will help enable our customers to build more sophisticated content moderation teams appropriate for their own communities and will allow them to scale in a responsible way to compete with the Internet giants of today. That is directly aligned with our mission of helping build a better Internet, and it's why we're announcing that we will be making this service available for free to all our customers.

A new mom’s guide to pumping milk while traveling for work

CloudFlare Blog -

Recently, I deployed a human to production. Shipped at 11 lbs 3 oz, he rapidly doubled in size in his first six months. At Cloudflare, I run the Developer Relations team, and in my first quarter back from parental leave, I had 3 business trips: 2 international, 1 domestic. As an exclusive breastfeeder, that meant solving the logistical puzzle of moving a large quantity of milk home, to the tune of 40-50 oz (1200 - 1500 mL) per day given the size of my baby.

Since I ferried milk home to my baby and did extensive research in preparation, I figured I'd pay it forward, share my own learnings, and publish the guide that I wish someone had written for me. In the further reading section at the end, I've linked many of the articles I read in preparation, although some of their advice is rather dated. I'm including them because I'm grateful to be standing on the shoulders of giants and accumulating the wisdom of all the parents who went on this adventure before me. What's possible in 2019 is truly amazing compared to a generation ago, or even five to ten years ago.

Before I dive into the advice, I'd like to thank the other parents in our Parents group for their advice and help, our Office Team for maintaining a mother's room in every HQ (even 2 in San Francisco), our People Team for their support of the Parents Employee Resource Group (ERG) and for helping me research insurance related questions, and Cloudflare in general for family friendly policies. (See career opportunities at Cloudflare)

What's in my pump bag?

When packing my pump bag, I packed for 2 pumping use cases: 1) pumping on the airplane or in a non-ideal (nursing room) area of an airport, and 2) pumping in a conference-provided mother's room or a non-ideal private area. Here's my packing list.

Pump bag packing list (and notes):

Insulated cooler backpack. I used an Igloo cooler bag because it was large enough to accommodate a smaller insulated milk bag and had separate compartments, so I could access pump items without subjecting the inside to warm / room temperature air.

Insulated milk bag.

Travel pump and bottles: the Baby Buddha pump. It charges via USB so I can use my power brick as a backup. This was recommended by another parent in the Parents ERG group. On my first trip I packed my Baby Buddha, my Willow set, and my manual pumps, but I really relied on the Baby Buddha for all my subsequent trips. (At home I use a Spectra, and at work we share a Medela hospital grade pump. I suppose I'm pumped to be a pump enthusiast.) On subsequent trips, I no longer packed bottles and went exclusively Kiinde + Baby Buddha.

Pump cleaning wipes. I used Medela pump wipes. A large box came with a microwave sterilizer bag.

2 refrigerator thermometers (see the temperature management section below).

Extra gallon ziplock bags (a lot of them).

Printouts. If you are traveling in the U.S., I recommend printing out these two TSA policy pages. Many airlines allow your medical device (e.g., breast pump) to be a separate carry-on item. Before my trips, I printed the airline policy page that states this for each airline I had flights with and stored it in my pump bag. Although it never turned out to be necessary, I'm glad it was there just in case.
Each airline may have a different policy, so call each airline and confirm that the pump bag does not count against carry-on limits, even though you're also printing out the policy from their website.

Sharpie to label milk bags.

Travel pump parts cleaning kit.

Ice packs (must be frozen solid to pass through security). It is possible for an insulated milk bag inside a cooler backpack to maintain <4C for a 10 hour flight; I've done it. However, it will require multiple replenishments of ziplock bags of ice. Use the long lasting type of ice pack (not the ones for treating injuries) so they stay frozen solid, not only through security but throughout the flight.

Manual pump as a backup. I have both a Haakaa and a NatureBond manual pump for worst case scenario planning. Although I didn't end up needing them during the flight, I would still pack them in the future to literally take the pressure off in tight situations.

Regular personal item packing list:

Hand sanitizer wipes (wipes are not a liquid and don't count toward your quota, hooray!)

Baby bottle dish soap, travel size (with your liquids)

Big tupperware container (makeshift dishpan for washing pump parts)

Battery pack backup in case I can't find an outlet (optional)

Nursing cover for pumping in my seat (optional)

Pro-tips and lessons I learned from my journey:

Pre-assemble pump parts and store them in ziplock bags, each pump separately. My preference for a 10-12 hour flight is to pack 3 kits: each kit is one pump part set, fully assembled with adapter and Kiinde bag, plus an extra Kiinde bag. I'd then pump one side at a time, using the same kit, and swap bags between sides (or when one was full). Since the Baby Buddha comes with 2 sets (one for each side), I used a Spectra set from home and the unofficial Baby Buddha component hacks. Even though I had pump wipes, I saved all my pump sets for a proper deep clean after getting to the hotel.

If/when I do need to clean pump parts on the plane (e.g., if, due to delays, I need to pump more times), I use my Medela pump wipes.

Make friends with a flight attendant and let them know how they can be helpful, e.g., by filling your ziplock bags with a steady supply of ice.

Pack a lot of gallon ziplock bags. I packed over a dozen gallon ziplock bags for my first trip. It wasn't nearly enough. I recommend packing half a package of gallon size and half a package of quart size, with the zipper top. Asking for ice from airport vendors or flight attendants, storing a used pump I'll wash later: everything uses a fresh bag.

Stockpile frozen milk before your trip. I estimated that my baby consumes 40-50 oz per day, and I had nearly enough for a one week trip. It turned out that he consumed less milk than expected, and there was still some frozen milk in the freezer when I got home.

What happens if I need to be away from my hotel or a fridge for more than 3-4 hours? I pack my manual pump in my purse for post-conference social outings, go to the bathroom and pump and dump just enough to take a little pressure off the top, and pump as soon as I get back to my hotel.

Once on the plane, you have several options as to where to pump. Some people pump in the bathrooms, but I prefer getting a window seat and pumping under a tulip style nursing top, which provides a similar amount of privacy to a nursing cover but is much more maneuverable.

Liquid or gel hand sanitizer counts as a liquid for security purposes. My strategy is to rely on a combination of Medela pump wipes (FSA eligible) and hand sanitizer wipes.
Your own comfort level with pumping at your seat may differ from mine, but I used hand sanitizer wipes on my hands (and arm rests, etc.) and another to wipe down the tray table for my pumping items. All cleaning after those 2 wipes were with the alcohol free pump wipes.Where can a mother pump around this airport / conference center / anywhere?Many airports have mother's rooms, family bathrooms, nursery rooms, (see list and list) or Mamava pods. Personally, I didn't want to be in a rush to finish pumping or washing parts while boarding for my flight gets announced, even though I’m very grateful they exist and would use them when there’s a flight delay.The ANA checkin agent thoughtfully warned me that Tokyo airport security won't allow milk as a carry-on liquid above the volume limits on the way back. Luckily, I was already planning to check it in a Milk Stork box, and cool my last batch in an ice bath before sealing the box. Milk Stork has compiled this helpful list of links to the airport security policy of different countries. The Points Guy blog compiled a helpful list of airline policies on whether a medical device (breast pump) qualifies as an additional carry on.On my phone, I have installed these 2 apps for locating mother’s rooms: Pumpspotting and Mamava. Your luck with either of them will depend on the country you’re in. In Japan, for instance, nearly every shopping mall and hotel lobby seemed to have a “nursery” room which fit the bill, but almost none are on the apps. North America is better represented on the apps.As a first resort, however, check with the conference organizers about a mother’s room. Developer Week and dotJS have both done a phenomenal job in making a mother’s room available. In one case, when I asked, the organizers learned that the venue already had a fully equipped mother’s room on site. In another case, the organizers worked with the venue to create a lockable private room with a refrigerator. Don’t be afraid of being the first person to request a mothers’ room of a conference organizer. You get zero percent of the things you don’t ask for, and worst case scenario, if there isn’t one, you may need to take a trip back to your hotel room for a mid-day pump session.Temperature management: is my cooler or refrigerator cold enough?My general approach to temperature management, be it hotel room refrigerators, my cooler backpack or my mini cooler bag, is trust but verify. I bought 2 little thermometers, and on the road, I had one in the small bag, and one in the large bag but outside the small bag. This allowed me to be sure that my milk is cold enough no matter where I was.Left: small milk cooler bag (Skip Hop from Target); Right: Igloo cooler backpack. I liked this bag because it has a top pocket, where I can quick access my pump and pump parts without opening the main compartment and exposing it to room temperature air.The main igloo compartment stabilized at 13C and the internal bag stabilized at 8C with 2 long lasting gel ice packs. When I asked for additional ice from the flight attendants, it stabilized at 10C in the main compartment and 3C in the internal bag. After a while, I got an ice refresh and the internal compartment stabilized at 1C.To prevent newly pumped warm milk from warming up the cooler, I used an external ziplock ice bath to rapid chill the milk bag before storing it in the cooler. For this reason, I preferred the stability of the Kiinde twist top bags and not being afraid of bursting a ziplock seal.Always trust but verify temperature. 
Milk bags in a full size hotel refrigerator with fridge thermometerSome hotel room mini fridges are cold enough and others aren't. Same with large refrigerators. Just like with cooler backpacks, my general advice is trust but verify. With the two little thermometers: I took the one outside the internal insulated pouch and put it in the fridge to measure the refrigerator temperature before unpacking the cooler backpack.At an Extended Stay Austin, I had to switch rooms to get a cold enough fridge. In the first room, the full sized refrigerator stabilized at 8C at its max, and couldn’t get colder, and the front desk people were happy to switch me to another room with a colder fridge, which was cold enough.My fridge in the Tokyo hotel stabilized at 8-9C when I put milk bags in, but can get down to 4C when it's not trying to cool warm milk. So I had the hotel store my milk in their fridge with my igloo cooler backpack. 1 thermometer in hotel fridge, one in backpack, so I can confirm their fridge is cold enough at 3-4C.My fridge in Paris was an old and weak little fridge that can get to 10C in ideal conditions, so I kept my milk at 4C in that fridge with a twice daily addition of ziplock bags full of ice provided by the hotel. Lastly, some rooms have their power modulated by the key card in a slot by the door, and the refrigerator turns off when you're not there. Don't feel bad about using a business card to keep the power on so the refrigerator can stay on.Milk Stork vs. OrderBoxesNowMilk Stork came highly recommended by parents in our Parent Chat channel, as used by their spouses at other companies and there’s currently internal discussion about potentially offering it as a benefit in the future.Since my baby is very large (99th percentile), he consumes 40 - 50 oz per day (1200 - 1500 mL per day). That means Milk Stork’s large 100 oz box is 2 - 2.5 days supply for my baby, whereas that’s close to a one week supply for a regular sized baby of a similar age. So I decided to try Milk Stork kits for some trips and compare the experience with buying replacement engines and/or boxes myself for other trips in order to compare the experience. And oh what a difference. I don’t have enough words for Milk Stork customer service. Milk Stork isn’t just a box with a refrigeration engine. You give them your trip information and they ship an entire kit to your hotel, which includes: the box and the refrigeration engine, milk storage bags, tamper evident seals, etc. Although you have to arrange the FedEx pickup yourself (and coordinate with your hotel front desk), they will pay for the freight and take care of the rest. When there was a hiccup with my FedEx pickup, and when I got a surprise FedEx invoice for import taxes on my milk, Milk Stork customer support got on the phone with FedEx to reverse the charges, saving me the headache of multiple phone calls.It is incredibly easy to make minor mistakes when buying replacement engines instead. On one trip, I brought my empty Milk Stork boxes to re-use and shipped replacement engines to the hotel. Not only did I have a slight panic because the hotel at first thought they didn’t have my replacement engines, it also turned out that I had ordered the wrong size. After a last minute trip to Home Depot for some supplies (zip ties, tape, bubble wrap), I was able to disassemble the two Milk Stork coolers into panels and MacGyver them together into a functional franken-cooler that was the correct size for the refrigeration engine that I used for multiple trips. 
Since this required pulling an all-nighter due to regular pumping interruptions, this is not for the faint of heart.MacGyvered franken-cooler box, constructed from 2 Milk Stork boxes, zipties, and packing tape in order to fit a larger size refrigeration engine.Reasons you might consider buying the boxes instead of a kit:You need a bigger volume box (e.g., shipping over 120 oz)You are comfortable re-using the boxes and are buying replacement refrigeration enginesYou’re comfortable with some last minute MacGyvering in case of errorsYou (and baby’s caretaker) really prefer Kiinde bags for feeding (our family does) and you need a larger box to fit more bags inSince I use Kiinde bags, I used plastic shrink bands instead of tamper evident seal stickers.Final thoughts and Shout-outsI would like to give a shout-out to the other parents in our Parents ERG (employee resource group). I especially want to thank Renee, my parent buddy in our returning-parent buddy system, for her contributions to the research, and Marina from our People Team for setting up that buddy system and also for helping research policy, from company internal to FSA related questions. Jill for recommending the Baby Buddha pump, Dane for recommending Milk Stork, Amy for keeping not just one but two nice mother's rooms in the SF office to keep up with our demand, and Nicole who always lets me borrow ice packs when I forget mine at home. And thank you Rebecca and all the other parents who trod down this path before me. Every time more parents take on the challenges, we collectively increase the demand for the products and services that makes the challenges easier, and maybe a version of this post in 2025 will be a piece of cake. (See career opportunities at Cloudflare≫ )We have come such a long way from the days of shipping frozen breastmilk packed with dry ice. I am so grateful that I was not out trying to source dry ice in a country where I don’t speak the language.Last but not least, I want to thank my husband and my mother in law whose backs and wrists strain under the weight of our very large baby since I have been recovering from a wrist injury.Further ReadingI count myself lucky to be able to stand on the shoulders of giants, that is, all the parents who have gone on this adventure before me who have shared their wisdom. Reference Guides:Airport security by country (print the relevant section for your trip, just in case)Tips for checking your milk as checked luggageArticles (and a few notes that I took whilst reading them):Breastfeeding in Combat Boots (book)What Nursing Parents Need to Know About Pumping During Work Travel (HBR)Check with hotel on fridge temperature. Bringing my own small thermometers.10 Tips to Prep for Pumping on the Road (Working Mother)Pump 45 minutes before landing so you have enough time to make it to your hotel in a variety of traffic conditions.Pumping on a Business Trip (Medium)Tips on Pumping While Traveling for Work (Apres Group)The Working Mom's Guide to Breastfeeding and Business Trips (Parents.com)Avoid airplane water for washing your hands; use hand sanitizer, pump wipes, and store parts in your cooler. Ask for a microwave to steam sterilize parts. Bring steam sterilizer bag.Breastfeeding While Traveling? Here’s How to Make It Work (TheBump)Mark item clearly as perishable. Large planes have some type of refrigerator that you can use to refrigerate your cooler. 
Smaller planes can provide extra ice.Your Ultimate Guide To Pumping While Traveling (Land & Air)(Mom Loves Best) How to Fly With Breast Milk in the United States (The Points Guy) The Best Airline Seats, Suites, Lactation Rooms and Lounges When Breastfeeding (The Points Guy) Milk Stork Review (The Points Guy) Can I Take Breast Milk on a Plane? Your Guide to Pumping and Traveling (What to Expect)

How we used our new GraphQL Analytics API to build Firewall Analytics

CloudFlare Blog -

Firewall Analytics is the first product in the Cloudflare dashboard to utilize the new GraphQL Analytics API. All Cloudflare dashboard products are built using the same public APIs that we provide to our customers, allowing us to understand the challenges they face when interfacing with our APIs. This parity helps us build and shape our products, most recently the new GraphQL Analytics API that we're thrilled to release today.

By defining the data we want, along with the response format, our GraphQL Analytics API has enabled us to prototype new functionality and iterate quickly from our beta user feedback. It is helping us deliver more insightful analytics tools within the Cloudflare dashboard to our customers.

Our user research and testing for Firewall Analytics surfaced common use cases in our customers' workflow:

Identifying spikes in firewall activity over time
Understanding the common attributes of threats
Drilling down into granular details of an individual event to identify potential false positives

We can address all of these use cases using our new GraphQL Analytics API.

GraphQL Basics

Before we look into how to address each of these use cases, let's take a look at the format of a GraphQL query and how our schema is structured.

A GraphQL query is composed of a structured set of fields, for which the server provides corresponding values in its response. The schema defines which fields are available and their type. You can find more information about the GraphQL query syntax and format in the official GraphQL documentation.

To run some GraphQL queries, we recommend downloading a GraphQL client, such as GraphiQL, to explore our schema and run some queries. You can find documentation on getting started with this in our developer docs.

At the top level of the schema is the viewer field. This represents the top level node of the user running the query.
Within this, we can query the zones field to find zones the current user has access to, providing a filter argument with a zoneTag set to the identifier of the zone we'd like to narrow down to.

{
  viewer {
    zones(filter: { zoneTag: "YOUR_ZONE_ID" }) {
      # Here is where we'll query our firewall events
    }
  }
}

Now that we have a query that finds our zone, we can start querying the firewall events which have occurred in that zone, to help solve some of the use cases we've identified.

Visualising spikes in firewall activity

It's important for customers to be able to visualise and understand anomalies and spikes in their firewall activity, as these could indicate an attack or be the result of a misconfiguration. Plotting events in a timeseries chart, by their respective action, provides users with a visual overview of the trend of their firewall events.

Within the zones field in the query we created earlier, we can query our firewall event aggregates using the firewallEventsAdaptiveGroups field, providing arguments for:

A limit for the count of groups
A filter for the date range we're looking for (combined with any user-entered filters)
A list of fields to orderBy (in this case, just the datetimeHour field that we're grouping by)

By adding the dimensions field, we're querying for groups of firewall events, aggregated by the fields nested within dimensions. In this case, our query includes the action and datetimeHour fields, meaning the response will be groups of firewall events which share the same action and fall within the same hour. We also add a count field, to get a numeric count of how many events fall within each group.

query FirewallEventsByTime($zoneTag: string, $filter: FirewallEventsAdaptiveGroupsFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      firewallEventsAdaptiveGroups(
        limit: 576
        filter: $filter
        orderBy: [datetimeHour_DESC]
      ) {
        count
        dimensions {
          action
          datetimeHour
        }
      }
    }
  }
}

Note: each of our groups queries requires a limit to be set. A firewall event can have one of 8 possible actions, and we are querying over a 72 hour period. At most, we'll end up with 576 groups, so we can set that as the limit for our query.

This query would return a response in the following format:

{
  "viewer": {
    "zones": [
      {
        "firewallEventsAdaptiveGroups": [
          {
            "count": 5,
            "dimensions": {
              "action": "jschallenge",
              "datetimeHour": "2019-09-12T18:00:00Z"
            }
          }
          ...
        ]
      }
    ]
  }
}

We can then take these groups and plot each as a point on a time series chart. Mapping over the firewallEventsAdaptiveGroups array, we can use the group's count property on the y-axis for our chart, then use the nested fields within the dimensions object, using action as the unique series and datetimeHour as the timestamp on the x-axis.

Top Ns

After identifying a spike in activity, our next step is to highlight events with commonality in their attributes. For example, if a certain IP address or individual user agent is causing many firewall events, this could be a sign of an individual attacker, or could be surfacing a false positive.

Similarly to before, we can query aggregate groups of firewall events using the firewallEventsAdaptiveGroups field.
However, in this case, instead of supplying action and datetimeHour to the group's dimensions, we can add the individual fields that we want to find common groups of. By ordering by descending count, we'll retrieve the groups with the highest commonality first, limiting to the top 5 of each. We can add a single field nested within dimensions to group by it. For example, adding clientIP will give five groups with the IP addresses causing the most events.

We can also add a firewallEventsAdaptiveGroups field with no nested dimensions. This will create a single group which allows us to find the total count of events matching our filter.

query FirewallEventsTopNs($zoneTag: string, $filter: FirewallEventsAdaptiveGroupsFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      topIPs: firewallEventsAdaptiveGroups(
        limit: 5
        filter: $filter
        orderBy: [count_DESC]
      ) {
        count
        dimensions {
          clientIP
        }
      }
      topUserAgents: firewallEventsAdaptiveGroups(
        limit: 5
        filter: $filter
        orderBy: [count_DESC]
      ) {
        count
        dimensions {
          userAgent
        }
      }
      total: firewallEventsAdaptiveGroups(
        limit: 1
        filter: $filter
      ) {
        count
      }
    }
  }
}

Note: we can add the firewallEventsAdaptiveGroups field multiple times within a single query, each aliased differently. This allows us to fetch multiple different groupings by different fields, or with no groupings at all. In this case, we get a list of top IP addresses, top user agents, and the total events.

We can then reference each of these aliases in the UI, mapping over their respective groups to render each row with its count, alongside a bar representing the proportion of all events that each row accounts for.

Are these firewall events false positives?

After users have identified spikes, anomalies, and common attributes, we wanted to surface more information as to whether these have been caused by malicious traffic or are false positives. To do this, we wanted to provide additional context on the events themselves, rather than just counts. We can do this by querying the firewallEventsAdaptive field for these events.

Our GraphQL schema uses the same filter format for both the aggregate firewallEventsAdaptiveGroups field and the raw firewallEventsAdaptive field. This allows us to use the same filters to fetch the individual events which sum to the counts and aggregates in the visualisations above.

query FirewallEventsList($zoneTag: string, $filter: FirewallEventsAdaptiveFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      firewallEventsAdaptive(
        filter: $filter
        limit: 10
        orderBy: [datetime_DESC]
      ) {
        action
        clientAsn
        clientCountryName
        clientIP
        clientRequestPath
        clientRequestQuery
        datetime
        rayName
        source
        userAgent
      }
    }
  }
}

Once we have our individual events, we can render all of the individual fields we've requested, giving users the additional context on each event they need to determine whether it is a false positive or not.

That's how we used our new GraphQL Analytics API to build Firewall Analytics, helping solve some of our customers' most common security workflow use cases. We're excited to see what you build with it, and the problems you can help tackle.

You can find out how to get started querying our GraphQL Analytics API using GraphiQL in our developer documentation, or learn more about writing GraphQL queries on the official GraphQL Foundation documentation.
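Outside of GraphiQL, the same time-series query can be issued with any HTTP client. The sketch below is a minimal Python example, assuming the public GraphQL endpoint at https://api.cloudflare.com/client/v4/graphql, an API token with Analytics read access, and the requests library; the zone ID, token, and date range are placeholders to replace with your own.

import requests  # third-party dependency: pip install requests

API_URL = "https://api.cloudflare.com/client/v4/graphql"
API_TOKEN = "YOUR_API_TOKEN"   # placeholder: a token with Analytics read permission
ZONE_TAG = "YOUR_ZONE_ID"      # placeholder: the zone identifier

QUERY = """
query FirewallEventsByTime($zoneTag: string, $filter: FirewallEventsAdaptiveGroupsFilter_InputObject) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      firewallEventsAdaptiveGroups(limit: 576, filter: $filter, orderBy: [datetimeHour_DESC]) {
        count
        dimensions {
          action
          datetimeHour
        }
      }
    }
  }
}
"""

variables = {
    "zoneTag": ZONE_TAG,
    # A 72-hour window, matching the limit of 576 groups (8 actions x 72 hours).
    "filter": {"datetime_geq": "2019-09-10T00:00:00Z", "datetime_lt": "2019-09-13T00:00:00Z"},
}

response = requests.post(
    API_URL,
    json={"query": QUERY, "variables": variables},
    headers={"Authorization": f"Bearer {API_TOKEN}"},
)
response.raise_for_status()

# Print one line per hourly group: timestamp, action, and event count.
groups = response.json()["data"]["viewer"]["zones"][0]["firewallEventsAdaptiveGroups"]
for group in groups:
    dims = group["dimensions"]
    print(f'{dims["datetimeHour"]}  {dims["action"]:<12}  {group["count"]}')

The same pattern works for the Top N and event-list queries above; only the query string and the shape of the decoded response change.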

Introducing the GraphQL Analytics API: exactly the data you need, all in one place

CloudFlare Blog -

Today we’re excited to announce a powerful and flexible new way to explore your Cloudflare metrics and logs, with an API conforming to the industry-standard GraphQL specification. With our new GraphQL Analytics API, all of your performance, security, and reliability data is available from one endpoint, and you can select exactly what you need, whether it’s one metric for one domain or multiple metrics aggregated for all of your domains. You can ask questions like “How many cached bytes have been returned for these three domains?” Or, “How many requests have all the domains under my account received?” Or even, “What effect did changing my firewall rule an hour ago have on the responses my users were seeing?”The GraphQL standard also has strong community resources, from extensive documentation to front-end clients, making it easy to start creating simple queries and progress to building your own sophisticated analytics dashboards.From many APIs...Providing insights has always been a core part of Cloudflare’s offering. After all, by using Cloudflare, you’re relying on us for key parts of your infrastructure, and so we need to make sure you have the data to manage, monitor, and troubleshoot your website, app, or service. Over time, we developed a few key data APIs, including ones providing information regarding your domain’s traffic, DNS queries, and firewall events. This multi-API approach was acceptable while we had only a few products, but we started to run into some challenges as we added more products and analytics. We couldn’t expect users to adopt a new analytics API every time they started using a new product. In fact, some of the customers and partners that were relying on many of our products were already becoming confused by the various APIs.Following the multi-API approach was also affecting how quickly we could develop new analytics within the Cloudflare dashboard, which is used by more people for data exploration than our APIs. Each time we built a new product, our product engineering teams had to implement a corresponding analytics API, which our user interface engineering team then had to learn to use. This process could take up to several months for each new set of analytics dashboards....to one Our new GraphQL Analytics API solves these problems by providing access to all Cloudflare analytics. It offers a standard, flexible syntax for describing exactly the data you need and provides predictable, matching responses. This approach makes it an ideal tool for: Data exploration. You can think of it as a way to query your own virtual data warehouse, full of metrics and logs regarding the performance, security, and reliability of your Internet property.Building amazing dashboards, which allow for flexible filtering, sorting, and drilling down or rolling up. Creating these kinds of dashboards would normally require paying thousands of dollars for a specialized analytics tool. You get them as part of our product and can customize them for yourself using the API.In a companion post that was also published today, my colleague Nick discusses using the GraphQL Analytics API to build dashboards. So, in this post, I’ll focus on examples of how you can use the API to explore your data. 
To make the queries, I'll be using GraphiQL, a popular open-source querying tool that takes advantage of GraphQL's capabilities.

Introspection: what data is available?

The first thing you may be wondering: if the GraphQL Analytics API offers access to so much data, how do I figure out what exactly is available, and how can I ask for it? GraphQL makes this easy by offering "introspection," meaning you can query the API itself to see the available data sets, the fields and their types, and the operations you can perform. GraphiQL uses this functionality to provide a "Documentation Explorer," query auto-completion, and syntax validation. For example, here is how I can see all the data sets available for a zone (domain):

If I'm writing a query, and I'm interested in data on firewall events, auto-complete will help me quickly find relevant data sets and fields:

Querying: examples of questions you can ask

Let's say you've made a major product announcement and expect a surge in requests to your blog, your application, and several other zones (domains) under your account. You can check if this surge materializes by asking for the requests aggregated under your account, in the 30 minutes after your announcement post, broken down by the minute:

{
  viewer {
    accounts(filter: { accountTag: $accountTag }) {
      httpRequests1mGroups(
        limit: 30
        filter: { datetime_geq: "2019-09-16T20:00:00Z", datetime_lt: "2019-09-16T20:30:00Z" }
        orderBy: [datetimeMinute_ASC]
      ) {
        dimensions {
          datetimeMinute
        }
        sum {
          requests
        }
      }
    }
  }
}

Here is the first part of the response, showing requests for your account, by the minute:

Now, let's say you want to compare the traffic coming to your blog versus your marketing site over the last hour. You can do this in one query, asking for the number of requests to each zone:

{
  viewer {
    zones(filter: { zoneTag_in: [$zoneTag1, $zoneTag2] }) {
      httpRequests1hGroups(
        limit: 2
        filter: { datetime_geq: "2019-09-16T20:00:00Z", datetime_lt: "2019-09-16T21:00:00Z" }
      ) {
        sum {
          requests
        }
      }
    }
  }
}

Here is the response:

Finally, let's say you're seeing an increase in error responses. Could this be correlated to an attack? You can look at error codes and firewall events over the last 15 minutes, for example:

{
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      httpRequests1mGroups(
        limit: 100
        filter: { datetime_geq: "2019-09-16T21:00:00Z", datetime_lt: "2019-09-16T21:15:00Z" }
      ) {
        sum {
          responseStatusMap {
            edgeResponseStatus
            requests
          }
        }
      }
      firewallEventsAdaptiveGroups(
        limit: 100
        filter: { datetime_geq: "2019-09-16T21:00:00Z", datetime_lt: "2019-09-16T21:15:00Z" }
      ) {
        dimensions {
          action
        }
        count
      }
    }
  }
}

Notice that, in this query, we're looking at multiple datasets at once, using a common zone identifier to "join" them. Here are the results:

By examining both data sets in parallel, we can see a correlation: 31 requests were "dropped" or blocked by the Firewall, which is exactly the same as the number of "403" responses. So, the 403 responses were a result of Firewall actions.

Try it today

To learn more about the GraphQL Analytics API and start exploring your Cloudflare data, follow the "Getting started" guide in our developer documentation, which also has details regarding the current data sets and time periods available. We'll be adding more data sets over time, so take advantage of the introspection feature to see the latest available.

Finally, to make way for the new API, the Zone Analytics API is now deprecated and will be sunset on May 31, 2020. The data that Zone Analytics provides is available from the GraphQL Analytics API.
If you’re currently using the API directly, please follow our migration guide to change your API calls. If you get your analytics using the Cloudflare dashboard or our Datadog integration, you don’t need to take any action.One more thing....In the API examples above, if you find it helpful to get analytics aggregated for all the domains under your account, we have something else you may like: a brand new Analytics dashboard (in beta) that provides this same information. If your account has many zones, the dashboard is helpful for knowing summary information on metrics such as requests, bandwidth, cache rate, and error rate. Give it a try and let us know what you think using the feedback link above the new dashboard.

New tools to monitor your server and avoid downtime

CloudFlare Blog -

When your server goes down, it's a big problem. Today, Cloudflare is introducing two new tools to help you understand and respond faster to origin downtime, plus a new service to automatically avoid downtime. The new features are:

Standalone Health Checks, which notify you as soon as we detect problems at your origin server, without needing a Cloudflare Load Balancer.
Passive Origin Monitoring, which lets you know when your origin cannot be reached, with no configuration required.
Zero-Downtime Failover, which can automatically avert failures by retrying requests to origin.

Standalone Health Checks

Our first new tool is Standalone Health Checks, which will notify you as soon as we detect problems at your origin server, without needing a Cloudflare Load Balancer.

A Health Check is a service that runs on our edge network to monitor whether your origin server is online. Health Checks are a key part of our load balancing service because they allow us to quickly and actively route traffic to origin servers that are live and ready to serve requests. Standalone Health Checks allow you to monitor the health of your origin even if you only have one origin or do not yet need to balance traffic across your infrastructure.

We've provided many dimensions for you to home in on exactly what you'd like to check, including response code, protocol type, and interval. You can specify a particular path if your origin serves multiple applications, or you can check a larger subset of response codes for your staging environment. All of these options allow you to properly target your Health Check, giving you a precise picture of what is wrong with your origin.

If one of your origin servers becomes unavailable, you will receive a notification letting you know of the health change, along with detailed information about the failure so you can take action to restore your origin's health.

Lastly, once you've set up your Health Checks across the different origin servers, you may want to see trends or the top unhealthy origins. With Health Check Analytics, you'll be able to view all the change events for a given health check, isolate origins that may be top offenders or not performing up to par, and move forward with a fix. On top of this, in the near future, we are working to provide you with access to all Health Check raw events to ensure you have the detailed lens to compare Cloudflare Health Check Event logs against internal server logs.

Users on the Pro, Business, or Enterprise plan will have access to Standalone Health Checks and Health Check Analytics to promote top-tier application reliability and help maximize brand trust with their customers. You can access Standalone Health Checks and Health Check Analytics through the Traffic app in the dashboard.

Passive Origin Monitoring

Standalone Health Checks are a super flexible way to understand what's happening at your origin server. However, they require some forethought to configure before an outage happens. That's why we're excited to introduce Passive Origin Monitoring, which will automatically notify you when a problem occurs, with no configuration required.

Cloudflare knows when your origin is down, because we're the ones trying to reach it to serve traffic! When we detect downtime lasting longer than a few minutes, we'll send you an email. Starting today, you can configure origin monitoring alerts to go to multiple email addresses. Origin Monitoring alerts are available in the new Notification Center (more on that below!)
in the Cloudflare dashboard:

Passive Origin Monitoring is available to customers on all Cloudflare plans.

Zero-Downtime Failover

What's better than getting notified about downtime? Never having downtime in the first place! With Zero-Downtime Failover, we can automatically retry requests to origin, even before Load Balancing kicks in.

How does it work? If a request to your origin fails, and Cloudflare has another record for your origin server, we'll just try another origin within the same HTTP request. The alternate record could be either an A/AAAA record configured via Cloudflare DNS, or another origin server in the same Load Balancing pool.

Consider a website, example.com, that has web servers at two different IP addresses: 203.0.113.1 and 203.0.113.2. Before Zero-Downtime Failover, if 203.0.113.1 became unavailable, Cloudflare would attempt to connect, fail, and ultimately serve an error page to the user. With Zero-Downtime Failover, if 203.0.113.1 cannot be reached, then Cloudflare's proxy will seamlessly attempt to connect to 203.0.113.2. If the second server can respond, then Cloudflare can avert serving an error to example.com's user. (A simplified sketch of this connection-phase retry pattern appears at the end of this post.)

Since we rolled out Zero-Downtime Failover a few weeks ago, we've prevented tens of millions of requests per day from failing! Zero-Downtime Failover works in conjunction with Load Balancing, Standalone Health Checks, and Passive Origin Monitoring to keep your website running without a hitch. Health Checks and Load Balancing can avert failure, but take time to kick in. Zero-Downtime Failover works instantly, but adds latency on each connection attempt. In practice, Zero-Downtime Failover is helpful at the start of an event, when it can instantly recover from errors; once a Health Check has detected a problem, a Load Balancer can then kick in and properly re-route traffic. And if no origin is available, we'll send an alert via Passive Origin Monitoring.

To see an example of this in practice, consider a recent incident at one of our customers. They saw a spike in errors at their origin that would ordinarily cause availability to plummet (red line), but thanks to Zero-Downtime Failover, their actual availability stayed flat (blue line). During a 30 minute time period, Zero-Downtime Failover improved overall availability from 99.53% to 99.98%, and prevented 140,000 HTTP requests from resulting in an error.

It's important to note that we only attempt to retry requests that have failed during the TCP or TLS connection phase, which ensures that HTTP headers and payload have not been transmitted yet. Thanks to this safety mechanism, we're able to make Zero-Downtime Failover Cloudflare's default behavior for Pro, Business, and Enterprise plans. In other words, Zero-Downtime Failover makes connections to your origins more reliable with no configuration or action required.

Coming soon: more notifications, more flexibility

Our customers are always asking us for more insights into the health of their critical edge infrastructure. Health Checks and Passive Origin Monitoring are a significant step towards Cloudflare taking a proactive instead of reactive approach to insights.

To support this work, today we're announcing the Notification Center as the central place to manage notifications. This is available in the dashboard today, accessible from your Account Home. From here, you can create new notifications, as well as view any existing notifications you've already set up.
Today's release allows you to configure Passive Origin Monitoring notifications and set multiple email recipients.

We're excited about today's launches and about helping our customers avoid downtime. Based on your feedback, we have lots of improvements planned that can help you get the timely insights you need:

New notification delivery mechanisms
More events that can trigger notifications
Advanced configuration options for Health Checks, including added protocols, threshold-based notifications, and threshold-based status changes
More ways to configure Passive Health Checks, like the ability to add thresholds and filter to specific status codes
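Cloudflare performs the Zero-Downtime Failover retry inside its own proxy, so there is nothing to implement on your side. Still, the underlying pattern is easy to picture. The sketch below is a simplified, client-side illustration of the same idea, not Cloudflare's implementation: try each origin address in turn, and only fail over when the TCP/TLS connection phase fails, before any request bytes have been sent. The IP addresses and hostname are the ones from the example.com example above; everything else is illustrative.

import socket
import ssl

ORIGINS = ["203.0.113.1", "203.0.113.2"]  # the two origin IPs from the example above
HOSTNAME = "example.com"                  # used for SNI and certificate validation
PORT = 443
CONNECT_TIMEOUT = 5.0

def connect_with_failover(origins):
    # Return a TLS connection to the first origin that completes the handshake.
    # Failover happens only during the TCP/TLS connection phase, before any
    # HTTP bytes are written, mirroring the safety property described above.
    context = ssl.create_default_context()
    last_error = None
    for ip in origins:
        try:
            raw = socket.create_connection((ip, PORT), timeout=CONNECT_TIMEOUT)
            return context.wrap_socket(raw, server_hostname=HOSTNAME)
        except OSError as exc:  # covers refusals, timeouts, and TLS failures
            last_error = exc
    raise ConnectionError(f"all origins failed: {last_error}")

conn = connect_with_failover(ORIGINS)
try:
    conn.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
    print(conn.recv(200).decode(errors="replace"))
finally:
    conn.close()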
