The Ubiquity of Caching in Modern Computer Systems

The title is a fancy sentence that I read in the book "Computer Systems: A Programmer's Perspective," and I thought it deserved an article. The book explains caching strategies at the hardware level, but it does not cover how much caching we rely on nowadays, largely because of the internet.

I want to show you how caching is everywhere with a simple example: visiting a website such as ""

What Is Caching

A cache is a component that stores data so that future requests can be served without fetching or computing the data again.

Caching is applied at all levels in computer systems, from caching in the CPU with registers to browsers caching CSS files.
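
To make the definition concrete, here is a minimal sketch of a cache in Python. The names `SimpleCache` and `slow_fetch` are made up for this illustration, and `slow_fetch` only stands in for whatever expensive work (a network call, a query, a computation) a real system would want to avoid repeating.

```python
from typing import Callable, Dict


class SimpleCache:
    """Keeps the results of an expensive fetch so repeated requests skip it."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch                  # the slow operation we want to avoid
        self._store: Dict[str, str] = {}     # key -> previously fetched value
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._store:
            self.hits += 1                   # served from the cache, no fetch
            return self._store[key]
        self.misses += 1
        value = self._fetch(key)             # first request: do the real work
        self._store[key] = value             # remember it for future requests
        return value


def slow_fetch(key: str) -> str:
    # Hypothetical stand-in for a network call or a heavy computation.
    return key.upper()
```

The first `get("home")` is a miss and calls `slow_fetch`; the second one is a hit and returns the stored value. Every cache in the rest of this article is a variation of this idea, with different rules about where the copy lives and when it is thrown away.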

A Simple Example: Visit a Website

To show that caching is everywhere, let's take a simple example and then present all the caching strategies that could be implemented.

How many caching strategies can we find in this simple process of accessing the Netflix home page?

No Caching At All

First, let's look at the example as if there were no caching.

This schema is a simplified version of what actually happens. I kept the process as simple as possible, omitting the exact architecture of Netflix. Yet even this simple process gives us plenty of opportunities to discuss different caching strategies.

When accessing a website, we have two distinct network operations: resolving the domain to an IP address and making the HTTP requests. We look at them separately because both are subject to caching.

First, getting the IP of the domain: ""

The browser needs the IP address, so it asks the operating system for it. The OS forwards the request to the first DNS server it has contact with, which in turn forwards the request to other DNS servers.

I recommend reading this article for more details.

The second part is getting the actual page of "" with HTTP requests.

Though Netflix uses multiple servers and databases, I assume a highly simplified schema of just one server and one database. Remember that browsers need to request all the files and data needed for the website, not just the initial HTML. These extra requests fetch static files such as CSS, images, or JS, and they also include requests triggered by JavaScript running in the browser. All this happens behind the scenes without the user knowing about it.

How is it possible that all this happens in just under a second?

On one side, getting the IP address already means connecting and interacting with many nodes. On the other side, the server of "" is bombarded with requests just to visit the home page. How can browsers do this so fast?

Caching in IP Lookup

Let's look at the caching strategies in the first part, getting the IP address for the domain.

The first cache is at the browser level. If the user has recently visited the page, the browser doesn't bother checking outside and uses the cached IP.

The second cache is at the operating system level. The browser asks the OS to resolve the domain, and the OS decides whether to respond with the cached IP or forward the request to the DNS servers.

The DNS servers also have a cache and can respond with an IP address instead of forwarding the request to other DNS servers. There can be several caches at this level, depending on how each server is set up.

We learned of three different caches at this stage: browser, OS, and DNS Server.
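
The three layers above can be sketched as a chain of caches that are checked from nearest to farthest. This is only an illustration under my own assumptions: the `TTLCache` class, the `resolve` function, the expiry times, and the `example.com` domain are all made up, and real browsers, operating systems, and DNS resolvers are considerably more involved.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TTLCache:
    """A cache whose entries expire `ttl` seconds after they are stored."""

    def __init__(self, name: str, ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.name = name
        self.ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # domain -> (ip, expiry)

    def get(self, domain: str) -> Optional[str]:
        entry = self._entries.get(domain)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[domain]        # stale entry: drop it, report a miss
            return None
        return ip

    def put(self, domain: str, ip: str) -> None:
        self._entries[domain] = (ip, self._clock() + self.ttl)


def resolve(domain: str, layers: List[TTLCache],
            authoritative_lookup: Callable[[str], str]) -> Tuple[str, str]:
    """Return (ip, source), checking each cache layer in order, nearest first."""
    for index, layer in enumerate(layers):
        ip = layer.get(domain)
        if ip is not None:
            # Refill the nearer layers that missed so the next lookup stops sooner.
            for nearer in layers[:index]:
                nearer.put(domain, ip)
            return ip, layer.name
    ip = authoritative_lookup(domain)        # every cache missed: do the full lookup
    for layer in layers:
        layer.put(domain, ip)
    return ip, "authoritative"


if __name__ == "__main__":
    # Expiry times are arbitrary values chosen for the example.
    layers = [TTLCache("browser", 60), TTLCache("os", 300),
              TTLCache("dns-resolver", 3600)]
    lookup = lambda domain: "192.0.2.1"      # address reserved for documentation
    print(resolve("example.com", layers, lookup))  # ('192.0.2.1', 'authoritative')
    print(resolve("example.com", layers, lookup))  # ('192.0.2.1', 'browser')
```

The expiry time is the important design choice here: an IP address can change, so each layer only trusts its copy for a limited time before asking the next layer again.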

Networking and Server Caching

Let's go to the other part, which begins once the browser has the IP address.

We start with the caching that happens at the network level, outside the browser and the user's computer.

We can identify at least four different caching strategies at this level.

  • Database cache. Many database services can keep the results of recent queries in memory to answer the same queries faster. This caching strategy allows the database to handle more queries and respond more quickly.
  • Server cache. Servers can also keep the responses to common client requests in memory. As long as these requests return the same content, the responses can be retrieved from memory and sent back. This cache avoids computation and external connections (to databases, for example), which speeds up the server's response time.
  • Proxy servers for static files. CDNs are popular because they store copies of static files on servers close to the users. These servers handle requests for static files so that the regular server does not have to.
  • Proxy servers for requests. Proxy servers can also respond with data instead of forwarding the request to the server, reducing the load on the central server.

This is just a small list, yet it offers a good overview of the omnipresence of caching on the internet.
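
As an illustration of the server cache from the list, here is a minimal sketch in which a server keeps answers in memory and only queries the database on a miss. `FakeDatabase`, `Server`, and their methods are hypothetical names invented for this example; they do not describe how Netflix or any particular framework works.

```python
from typing import Dict, Optional


class FakeDatabase:
    """Hypothetical stand-in for a real database; counts how often it is read."""

    def __init__(self, rows: Dict[int, str]) -> None:
        self._rows = dict(rows)
        self.query_count = 0

    def find_title(self, movie_id: int) -> Optional[str]:
        self.query_count += 1                # every call here is a "slow" trip
        return self._rows.get(movie_id)

    def save_title(self, movie_id: int, title: str) -> None:
        self._rows[movie_id] = title


class Server:
    """Answers from an in-memory cache when it can, from the database otherwise."""

    def __init__(self, db: FakeDatabase) -> None:
        self._db = db
        self._cache: Dict[int, str] = {}

    def get_title(self, movie_id: int) -> Optional[str]:
        if movie_id in self._cache:
            return self._cache[movie_id]     # hit: no database connection needed
        title = self._db.find_title(movie_id)
        if title is not None:
            self._cache[movie_id] = title    # keep it for the next request
        return title

    def update_title(self, movie_id: int, title: str) -> None:
        self._db.save_title(movie_id, title)
        # The cached copy no longer matches the database, so discard it.
        self._cache.pop(movie_id, None)
```

Two calls to `get_title(1)` reach the database only once. The `update_title` method shows the other half of the bargain: a cached response is only valid as long as the content stays the same, so the server has to drop its copy when the data changes.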

Client-Side Caching

Caching is as old as computers, and it's not a solution used only in networking and distributed systems.

Inside the computer, from the browser all the way down to the CPU and GPU, we can find at least four more caching strategies.

  • Browser cache. Browsers store a copy of the files they request from servers and can serve the cached copy instead of waiting for the server's response. Part of this behavior can be controlled with HTTP headers such as Cache-Control and ETag.
  • Code cache. Application code can also keep computed data and the responses to requests made with JavaScript in memory. This way, the next time the data is needed, it can be delivered immediately. Memoization is a typical example.
  • CPU cache. Central Processing Units work in cycles, and the number of cycles a read takes depends on where the data is stored. More cycles mean slower processing, so keeping data close to the CPU increases performance. CPUs have several cache layers that hold copies of data likely to be used again. This caching happens at the hardware level, and most high-level programming languages cannot control it directly.
  • GPU cache. The Graphics Processing Unit also keeps copies of data close to its cores to speed up processing. This happens at the hardware level as well.
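
The memoization mentioned under "Code cache" can be sketched in a few lines. I use Python and its standard `functools.lru_cache` decorator here for brevity; the same idea applies to JavaScript running in a browser. The `fibonacci` function is just a convenient example of a computation that repeats the same work many times.

```python
from functools import lru_cache


@lru_cache(maxsize=None)                     # remember every result, never evict
def fibonacci(n: int) -> int:
    """Naive recursion made fast: each fibonacci(k) is computed only once."""
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


if __name__ == "__main__":
    print(fibonacci(30))                     # 832040
    print(fibonacci.cache_info().misses)     # 31: one real computation per n in 0..30
```

Without the decorator, the recursion recomputes the same values over and over, and the number of calls grows exponentially with `n`. With it, every value is computed once and then served from memory.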

Caching, caching everywhere.

We started with a simple example of visiting a website.

And we learned that reality is more complex.

The book I talked about at the beginning presents eleven levels of caching. Five of them are hardware strategies (which I grouped as CPU cache), and only one is outside the computer: the web cache.

In this article, we also saw a total of eleven caching strategies when visiting a website.

  • Three to look up the IP address.
    • Browser IP cache, OS IP cache, and DNS Server cache
  • Four at the networking level.
    • Database service cache, server cache, a proxy server for static files, and a proxy server for data requests.
  • Four at the computer level.
    • Code (like memoization), browser, CPU, and GPU cache.

Note that this is not an exhaustive list. Instead, this article is an exercise to show how caching is everywhere.

Or paraphrasing Billy Mack from the movie Love Actually: "🎵 Caching is all around 🎵."

If you like this post, consider sharing it with your friends on Twitter or forwarding this email to them 🙈

Don't hesitate to reach out to me if you have any questions or see an error. I highly appreciate it.

And thanks to Michal and Sebastià for reviewing this article 🙏

Thanks for reading, don't be a stranger 👋
