Most developers think Git stores the differences between file versions. It doesn't. Git stores complete snapshots of every file at every commit. Here's how.
The three objects
Nearly everything in Git is stored as one of three object types (a fourth, the tag object, exists only for annotated tags):
1. Blob (Binary Large Object)
A blob is the contents of a file. Not the filename, just the contents. If two files have identical contents, Git stores one blob and points to it twice.
# See the blob hash for a file
git hash-object README.md
# e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
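To make the hashing concrete, here is a minimal Python sketch of how a blob ID can be reproduced without Git. It assumes the default SHA-1 object format; the function name `git_blob_hash` is made up for illustration. The hash in the example above happens to be the ID of an empty file, which the sketch reproduces.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    print(git_blob_hash(b""))         # the empty blob
    print(git_blob_hash(b"hello\n"))  # same input as: echo hello | git hash-object --stdin
```

Because the filename never enters the hash, two files with the same bytes always map to the same blob.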
2. Tree
A tree is a directory listing. It maps filenames to blobs (files) and other trees (subdirectories).
tree abc123
├── README.md → blob e69de2
├── src/ → tree def456
│   ├── index.js → blob 789abc
│   └── utils.js → blob 012def
└── package.json → blob 345fed
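On disk a tree is a compact binary listing rather than a drawing. The sketch below shows roughly how such a listing could be serialized and hashed, assuming SHA-1 object IDs and the common modes `100644` (regular file) and `40000` (subdirectory). The helper names `object_id` and `tree_body` are illustrative, and the files are assumed to be empty so the blob ID is a known value.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """SHA-1 of '<kind> <size>\\0' + body, the ID Git gives any object."""
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries):
    """Serialize (mode, name, hex_id) entries the way a tree object stores them."""
    def sort_key(entry):
        mode, name, _ = entry
        # Git orders entries by name, comparing subtrees as if they ended in "/".
        return (name + "/" if mode == "40000" else name).encode()

    body = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        # Each entry is "<mode> <name>\0" followed by the 20 raw hash bytes.
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_id)
    return body


EMPTY_BLOB = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"

# A hypothetical src/ directory holding two empty files.
src_tree = tree_body([("100644", "index.js", EMPTY_BLOB),
                      ("100644", "utils.js", EMPTY_BLOB)])
root_tree = tree_body([("100644", "README.md", EMPTY_BLOB),
                       ("40000", "src", object_id("tree", src_tree))])
print(object_id("tree", root_tree))
```

The root tree only stores the ID of the `src` tree, so an unchanged subdirectory is shared between snapshots just like an unchanged file.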
3. Commit
A commit points to a tree (the snapshot of your entire project at that moment), plus metadata: author, date, message, and parent commit(s).
commit 1a2b3c
├── tree: abc123 (the root tree)
├── parent: 9f8e7d (previous commit)
├── author: Alice <alice@example.com>
├── date: 2026-03-16 10:30:00
└── message: "Fix login bug"
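A commit object is plain text, which makes it easy to approximate. The following sketch builds a commit body in the layout Git uses (tree, parents, author, committer, blank line, message). The 40-character IDs are placeholders, the committer is assumed to be the same person as the author, and the date from the diagram is assumed to be UTC.

```python
import hashlib
from datetime import datetime, timezone


def commit_body(tree, parents, author, timestamp, tz, message):
    """Build the plain-text body of a commit object."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]      # zero, one, or several parents
    lines.append(f"author {author} {timestamp} {tz}")
    lines.append(f"committer {author} {timestamp} {tz}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


def object_id(kind, body):
    header = f"{kind} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


# Placeholder 40-character IDs standing in for real tree and parent hashes.
tree_id = "ab" * 20
parent_id = "cd" * 20
# Assumption: the date in the diagram is interpreted as UTC.
timestamp = int(datetime(2026, 3, 16, 10, 30, tzinfo=timezone.utc).timestamp())
body = commit_body(tree_id, [parent_id], "Alice <alice@example.com>",
                   timestamp, "+0000", "Fix login bug")
print(body.decode())
print(object_id("commit", body))
```

Since the parent ID is part of the hashed text, changing any ancestor changes the ID of every commit built on top of it.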
What happens when you commit
Step 1: git add
When you run git add file.js, Git:
- Computes the SHA-1 hash of the file contents (prefixed with a short type-and-size header)
- Compresses the contents and stores them as a blob in .git/objects/
- Updates the staging area (.git/index) to point to this blob
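The storage half of these steps can be sketched in a few lines of Python. This is an approximation under stated assumptions: it writes into a throwaway temporary directory rather than a real repository, the function names are invented, and a plain dictionary stands in for the staging area (the real `.git/index` is a binary file).

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_loose_object(git_dir: Path, content: bytes) -> str:
    """Store content as a loose blob under git_dir/objects and return its ID."""
    store = b"blob %d\x00" % len(content) + content
    oid = hashlib.sha1(store).hexdigest()
    # The first two hex digits name the directory, the remaining 38 the file.
    path = git_dir / "objects" / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return oid


def read_loose_object(git_dir: Path, oid: str) -> bytes:
    """Inflate a loose object and strip its header."""
    raw = zlib.decompress((git_dir / "objects" / oid[:2] / oid[2:]).read_bytes())
    _header, _, content = raw.partition(b"\x00")
    return content


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"           # a throwaway directory, not a real repository
    oid = write_loose_object(git_dir, b"console.log('hi');\n")
    index = {"file.js": oid}               # toy stand-in for the staging area
    restored = read_loose_object(git_dir, index["file.js"])
    print(oid, restored)
```

Note that the blob exists as soon as the file is staged; committing later only adds the tree and commit objects that refer to it.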
Step 2: git commit
When you run git commit, Git:
- Creates a tree object from the staging area (mapping filenames to blobs)
- Creates a commit object pointing to that tree, with your message and the parent commit
- Updates the branch pointer (e.g., refs/heads/main) to this new commit
That's it. A commit is just a pointer to a snapshot.
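Both steps fit into a small in-memory model. The sketch below is a toy, not Git's implementation: dictionaries stand in for `.git/objects`, the index, and the refs, and the tree is a simplified text listing, so the IDs it prints will not match those of a real repository.

```python
import hashlib

objects = {}                      # object ID -> (kind, body), a toy .git/objects
index = {}                        # path -> blob ID, a toy staging area
refs = {"refs/heads/main": None}  # branch name -> commit ID


def store(kind, body):
    oid = hashlib.sha1(f"{kind} {len(body)}".encode() + b"\x00" + body).hexdigest()
    objects[oid] = (kind, body)
    return oid


def add(path, content):
    """git add: store the blob and record it in the staging area."""
    index[path] = store("blob", content)


def commit(message, branch="refs/heads/main"):
    """git commit: snapshot the index as a tree, wrap it in a commit, move the branch."""
    # Simplified text listing; real trees are binary and nest subdirectories.
    listing = "\n".join(f"{oid} {path}" for path, oid in sorted(index.items()))
    tree = store("tree", listing.encode())
    parent = refs[branch]
    body = f"tree {tree}\n" + (f"parent {parent}\n" if parent else "") + f"\n{message}\n"
    refs[branch] = store("commit", body.encode())
    return refs[branch]


add("file.js", b"v1\n")
first = commit("Initial commit")
add("file.js", b"v2\n")
second = commit("Fix login bug")
print(refs["refs/heads/main"] == second)
```

After the second commit the branch names `second`, and `second` names `first` as its parent; nothing in the first commit was modified.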
How branches work
A branch is literally a file containing a commit hash. That's all.
cat .git/refs/heads/main
# 1a2b3c4d5e6f7890abcdef1234567890abcdef12
When you create a branch, Git creates a new file with the same commit hash. When you commit on that branch, Git updates the file to the new commit hash.
HEAD is a pointer to the current branch:
cat .git/HEAD
# ref: refs/heads/main
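Resolving HEAD is therefore just a matter of reading two small files. Here is a rough Python sketch that does so against a mock directory layout built in a temporary folder. It assumes loose ref files only; a real repository may also keep refs in `.git/packed-refs`, which this sketch ignores. The function names are hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path) -> str:
    """Follow HEAD to the commit hash it currently names."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # Symbolic ref: HEAD names a branch file that holds the hash.
        return (git_dir / head[len("ref: "):]).read_text().strip()
    return head  # detached HEAD: the file holds a hash directly


def create_branch(git_dir: Path, name: str) -> None:
    """Creating a branch copies the current commit hash into a new ref file."""
    ref_file = git_dir / "refs" / "heads" / name
    ref_file.write_bytes((resolve_head(git_dir) + "\n").encode())


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp)                       # mock layout, not a real repository
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "refs" / "heads" / "main").write_text(
        "1a2b3c4d5e6f7890abcdef1234567890abcdef12\n")
    (git_dir / "HEAD").write_text("ref: refs/heads/main\n")
    create_branch(git_dir, "feature")
    main_hash = resolve_head(git_dir)
    feature_size = (git_dir / "refs" / "heads" / "feature").stat().st_size
    print(main_hash, feature_size)
```

The new branch file holds 40 hex characters plus a newline, and creating it copies no project data at all.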
How Git is efficient (if it stores full snapshots)
"Wait, if every commit stores every file, doesn't that use insane amounts of space?"
Three tricks:
1. Content-addressable storage
If a file hasn't changed between commits, the blob hash is the same. Git reuses the existing blob. Only changed files get new blobs.
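A dictionary keyed by content hash is enough to see this effect. In this small sketch (a toy store with made-up file contents), two snapshots of three files each end up needing only four blobs, because two of the files did not change.

```python
import hashlib

blobs = {}  # blob ID -> content; identical content always lands on the same key


def snapshot(files):
    """Store every file of a snapshot and return its path -> blob ID mapping."""
    listing = {}
    for path, content in files.items():
        oid = hashlib.sha1(b"blob %d\x00" % len(content) + content).hexdigest()
        blobs[oid] = content      # re-storing an unchanged file is a no-op
        listing[path] = oid
    return listing


v1 = snapshot({"README.md": b"docs\n", "index.js": b"v1\n", "utils.js": b"helpers\n"})
v2 = snapshot({"README.md": b"docs\n", "index.js": b"v2\n", "utils.js": b"helpers\n"})

print(len(blobs))                             # 4 blobs for 6 file versions
print(v1["README.md"] == v2["README.md"])    # True: unchanged file, same blob
```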
2. Compression
All objects are zlib-compressed. Text files compress extremely well.
3. Packfiles
Periodically (and during git gc), Git packs objects into .pack files using delta compression. It stores the full content of one version and deltas for similar objects. This is where Git gets its space efficiency, but it's an optimization layer, not the core model.
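The idea behind a delta can be illustrated with copy and insert instructions against a base object. The sketch below uses Python's `difflib` to find matching regions; this is a stand-in chosen for brevity, not Git's actual packfile encoding or its heuristics for picking a base, and the function names are invented.

```python
from difflib import SequenceMatcher


def make_delta(base: bytes, target: bytes):
    """Describe target as copy-from-base and insert-new-bytes instructions."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))        # offset and length in base
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))    # literal new bytes
        # "delete" needs no instruction: those base bytes are simply not copied
    return ops


def apply_delta(base: bytes, ops) -> bytes:
    """Rebuild the target by replaying the instructions against the base."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


base = b"function login(user) {\n  return check(user);\n}\n" * 20
target = base.replace(b"check(user)", b"check(user, token)", 1)
delta = make_delta(base, target)
literal = sum(len(op[1]) for op in delta if op[0] == "insert")
print(len(target), literal)   # full size vs. bytes the delta stores literally
```

The full snapshot is still what you get back when the object is read; the delta only changes how the bytes are stored inside the pack.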
The mental model
HEAD → main → commit C → tree → blobs (your files)
                 ↓
              commit B → tree → blobs
                 ↓
              commit A → tree → blobs
Every commit is a complete snapshot. Branches are movable pointers. Tags are fixed pointers. That's Git.
Why this matters
Understanding this model explains everything:
- Why git checkout is fast: Git only has to update the files whose blobs differ between the two trees
- Why branches are cheap: they're a 41-byte file
- Why rebasing rewrites history: it creates new commit objects with new hashes
- Why force-pushing is dangerous: you're moving a pointer, and the old commits become unreachable
- Why git reflog can save you: it tracks where HEAD pointed, even after "deleted" commits
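The last three points come down to reachability, which a tiny graph model can show. In this sketch the commit names, the `parents` table, and the list standing in for the reflog are all hypothetical; real Git keeps reflogs per ref and eventually expires old entries.

```python
# Toy history: commit -> list of parents (hypothetical short IDs).
parents = {"A": [], "B": ["A"], "C": ["B"], "C2": ["B"]}
refs = {"main": "C"}
reflog = ["C"]                      # every commit main has pointed at, oldest first


def reachable(tips):
    """Return every commit reachable from the given tips by following parents."""
    seen, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def move_branch(name, commit):
    """Model a rebase or force-push: point the branch at a different commit."""
    refs[name] = commit
    reflog.append(commit)


move_branch("main", "C2")                            # C is replaced by a rewritten C2
lost = set(parents) - reachable(refs.values())
recoverable = lost & reachable(reflog)
print(sorted(lost), sorted(recoverable))             # ['C'] ['C']
```

Commit C still exists after the move; it has only lost every branch that led to it, which is why the recorded pointer history is enough to find it again.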