
At Instagram, we treat performance as a feature, and as we build products, we constantly think about how we can make things faster and more efficient. We've found that improving performance can actually drive usage, and by making small changes we can also improve user experience with the service. Here, we explore one case study where identifying and fixing low-hanging fruit reduced network usage, improved client-side reliability, reduced server-side latency, and in turn we saw an improvement in app-wide user experience.


Before a mobile client downloads actual media (i.e. photos or videos) from our CDNs, it must first fetch JSON metadata (media bundles) from our Django webserver endpoints. Depending on the endpoint, the compressed response is typically 10-60 kB. Each bundle contains information like media id, metadata about the author, the number of likes, the caption, and the most recent comments (called preview or summary comments).

{
  "items": [{
    "id": "###",
    "author": {
      "id": "###"
    },
    "like_count": 500,
    "caption": "tbt",
    "comments": [{
      "id": "###",
      "text": "Great pic!",
      "user": {
        "id": "###"
      }
    },
    {(another comment)}]
  },
  {(another media bundle)}]
}
Simplified illustrative example of a media bundle JSON

When you open Instagram to the main feed, you will notice that you only see up to three preview comments below each photo (in addition to the caption). In grid view (e.g. in the Search Explore tab or user profiles), no preview comments are visible at all.

3 preview comments per item are visible in feed, and no preview comments are visible in grid view.

However, we had been sending up to 20 comments to the client with each bundle. Originally, this was intended as an optimization to make the View all comments screen load faster. But when viewed holistically, this now seems like a poor trade-off for these reasons:

* Media are viewed more commonly than their comments, and we should optimize for the common case.

* Comment bundles are particularly heavy: generating profile picture URLs is a CPU-inefficient operation because we must dynamically compute the correct CDN URL. The more comments we load, the more profile picture URLs we need to generate.

* When a user clicks on "View all # comments" we ask the server for new comments anyway!
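The fix the list above points toward is conceptually simple: cap the number of preview comments included when the server serializes each media bundle. A minimal sketch of the idea in Python follows; the names (`MAX_PREVIEW_COMMENTS`, `build_media_bundle`) are illustrative, not Instagram's actual code, and real serialization would run inside Django view/serializer machinery.

```python
# Illustrative sketch: cap the preview comments serialized into each
# media bundle. All names here are hypothetical.

MAX_PREVIEW_COMMENTS = 5  # down from the previous cap of 20

def build_media_bundle(media, comments):
    """Return JSON-serializable metadata for one media item,
    including at most MAX_PREVIEW_COMMENTS of the newest comments."""
    preview = comments[-MAX_PREVIEW_COMMENTS:]  # keep the most recent only
    return {
        "id": media["id"],
        "author": {"id": media["author_id"]},
        "like_count": media["like_count"],
        "caption": media["caption"],
        # Total count still ships so the client can render
        # "View all # comments" without the full comment list.
        "comment_count": len(comments),
        "comments": [
            {"id": c["id"], "text": c["text"], "user": {"id": c["user_id"]}}
            for c in preview
        ],
    }
```

Note that the full comment count travels separately from the truncated list, so the client can still show "View all # comments" and fetch the rest on demand.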

For these reasons, reducing the maximum number of summary comments in each media bundle seemed like an obvious thing to do. But it was still unclear how much of a user-facing impact it would have. After all, this only makes a difference on the order of tens of kilobytes per payload, a difference that is dominated by the size of photo or video files. We hypothesized that the impact on network latency should be negligible: if a user were on a connection slow enough that downloading a few extra kilobytes mattered, just about any Internet service would probably be too difficult to use anyway. But, considering the possible bandwidth and CPU savings, we decided to run an experiment to see if there were in fact any user-facing effects.

The experiment

We ran an A/B experiment that reduced the maximum number of summary comments in each bundle from 20 to 5. This dropped the median response size of the main newsfeed endpoint from 15 KB to 10 KB, while the median response size of the Explore Posts endpoint dropped from 46 KB to 23 KB. The drop is even more pronounced at higher percentiles: at the 95th percentile, the response size of the main feed endpoint dropped from 32 KB to 16 KB.


As expected, reducing the size of the payload by a few kilobytes had no perceptible impact on network latency. But it had a surprising impact on memory usage: reducing the average memory usage of each feed screen ended up significantly improving the stability of the entire app. Android out-of-memory (OOM) errors dropped 30%! We hypothesize that this platform difference stems from the makeup of the Android device market: some Android phones ship with very low amounts of RAM, and correspondingly high memory pressure.

Median CPU usage on our most popular endpoints, like the main feed endpoint, dropped 20%! This translated into a median savings of 30ms in server-side wall time (and thus reduced end-to-end latency), and at the 95th percentile, we saved 70ms in server-side wall time. That makes a difference!

Infra improvements

When we launched this across all our users, CPU usage across our entire Django fleet dropped about 8% and egress dropped about 25%. Egress is a measure of site health, and such a drop would normally be alarming. But in this case, it's a good sign that we're reducing the load on our infrastructure!


During the A/B test, we saw app-wide impressions across all mobile platforms increase by 0.7%, and likes increase by 0.4%. This was driven by increases in impressions on all surfaces: for instance, Explore Photos impressions increased over 3%, and user profile scrolls increased 2.7%. These trends continued over time, confirming that good performance brings users back.

Percent increase in user profile scrolls over 3-month period


Thanks to Lisa Guo, Hao Chen, Tyler Kieft, Jimmy Zhang, Kang Zhang, and William Liu.
