The Google File System [GFS]

Introduction

The Google File System is a scalable distributed file system for large distributed data-intensive applications. Like MapReduce, it is designed to run on inexpensive commodity hardware.

The largest cluster at the time (2003) had over 1000 storage nodes and over 300 TB of disk storage, and was heavily accessed by hundreds of clients on distinct machines.

The main problem GFS focuses on is meeting the rapidly growing demands of Google’s data processing needs. Building on inexpensive commodity hardware keeps the system cheap and easy to scale.

At the same time, it doesn’t compromise on the qualities expected of earlier distributed file systems: performance, scalability, reliability, and availability.

Their findings

  1. Component failures are common; commodity hardware and software bugs only make this worse. So constant monitoring, error detection, fault tolerance, and automatic recovery are a must.
  2. Files are huge, on the order of multiple GBs; each typically packs many application objects, such as web documents, into one archive. Data sets grow to many TBs in total, which would otherwise mean managing billions of KB-sized files. So I/O operation and file block sizes are a key design consideration.
  3. Most new data is appended to existing files rather than overwriting old data. There are practically no random writes (so no HDD seeks); once written, files are only read, and mostly sequentially.
  4. Co-designing the applications and the file system API adds flexibility. For example, there is an atomic append operation so that multiple clients can append to a file concurrently, as shown in the sketch after this list.
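
Here is a minimal client-side sketch of what concurrent record append might look like. The `GFSClient` class and its `record_append` method are assumed names for illustration, not the paper’s actual API; the key property is that GFS, not the caller, chooses the offset, so producers need no coordination among themselves.

```python
# Hypothetical sketch of GFS record append from the client side: several
# producers append to the same file and the system (not the callers) picks
# the offset. Names here are illustrative, not the real GFS client library.
import threading

class GFSClient:
    """Stand-in for a GFS client library (assumed interface)."""
    def __init__(self):
        self._data = []                 # simulate the file's contents
        self._lock = threading.Lock()   # the real system serializes at the chunkserver

    def record_append(self, path: str, record: bytes) -> int:
        # Append the record atomically and return the offset GFS chose;
        # the caller never supplies an offset.
        with self._lock:
            offset = sum(len(r) for r in self._data)
            self._data.append(record)
            return offset

def producer(client: GFSClient, worker_id: int):
    for i in range(3):
        offset = client.record_append("/logs/crawl-output",
                                      f"worker{worker_id}-rec{i}\n".encode())
        print(f"worker {worker_id} appended record {i} at offset {offset}")

client = GFSClient()
threads = [threading.Thread(target=producer, args=(client, w)) for w in range(4)]
for t in threads: t.start()
for t in threads: t.join()
```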

Design Overview

Assumptions

Workloads will consist mostly of large streaming (sequential) reads plus some small random reads.

Applications often batch and sort their small reads so they advance steadily through the file rather than going back and forth.
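
As a rough illustration (not from the paper), batching and sorting read offsets turns scattered small reads into one forward pass over the file:

```python
# Rough illustration of batching/sorting small reads so a client moves
# strictly forward through the file instead of seeking back and forth.
# The file object is any seekable handle; names are illustrative.
import io

def batched_reads(f, requests):
    """requests: list of (offset, length); returns {offset: data}."""
    results = {}
    # Sort by offset so the file is traversed in one forward sweep.
    for offset, length in sorted(requests):
        f.seek(offset)
        results[offset] = f.read(length)
    return results

# Example: scattered 1 KB reads issued in arbitrary order.
f = io.BytesIO(bytes(64 * 1024))
out = batched_reads(f, [(40_960, 1024), (0, 1024), (8_192, 1024)])
print(sorted(out.keys()))   # [0, 8192, 40960] -- served in ascending order
```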

Writes follow a similar pattern: many large, sequential appends. Small writes at arbitrary positions are supported but do not have to be efficient.

Files are also used as producer-consumer queues and for many-way merging (a k-way merge, e.g. with a tournament tree).
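
For the many-way merge use case, here is a minimal k-way merge sketch using a min-heap, which plays the same role as a tournament tree. Inputs are plain sorted lists for illustration; in the GFS setting they would be sorted files or streams.

```python
# Minimal k-way merge: repeatedly pop the smallest head element across k
# sorted streams using a min-heap (stand-in for a tournament tree).
import heapq

def k_way_merge(streams):
    iters = [iter(s) for s in streams]
    heap = []
    # Seed the heap with the first element of each non-empty stream.
    for i, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heapq.heappush(heap, (first, i))
    # Pop the global minimum and refill from the stream it came from.
    while heap:
        value, i = heapq.heappop(heap)
        yield value
        nxt = next(iters[i], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt, i))

print(list(k_way_merge([[1, 4, 7], [2, 5, 8], [3, 6, 9]])))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```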

High sustained bandwidth is preferred over low latency.

Interface & Architecture