πŸ“š Learning Hub
Β· 3 min read

How npm install Actually Works Behind the Scenes


You run npm install dozens of times a week. But what actually happens between pressing Enter and seeing β€œadded 847 packages”?

Step 1: Read package.json

npm reads your package.json and builds a dependency tree. Every package you listed, plus every package those packages need, recursively.

A project with 15 dependencies in package.json might resolve to 847 packages because of transitive dependencies β€” dependencies of dependencies of dependencies.

Step 2: Resolve versions

For each package, npm needs to figure out which exact version to install. Your package.json says "react": "^18.2.0" β€” that’s a range, not a specific version.

npm checks the registry (registry.npmjs.org) for all versions of react that satisfy ^18.2.0. That means any version >= 18.2.0 and < 19.0.0.

It picks the highest matching version unless your package-lock.json already specifies one. This is why the lockfile matters β€” without it, two developers running npm install on the same package.json might get different versions.

Step 3: Check the cache

Before downloading anything, npm checks its local cache (~/.npm/_cacache/). If you’ve installed react@18.2.0 before on this machine, it’s already cached. No network request needed.

This is why your second npm install is faster than the first.

Step 4: Fetch packages

For anything not cached, npm downloads tarballs from the registry. Each package is a .tgz file containing the source code, package.json, and whatever the author published.

npm downloads in parallel β€” multiple packages at once. The progress bar you see is tracking these parallel downloads.

Step 5: Extract to node_modules

npm extracts each tarball into node_modules/. But the structure isn’t straightforward.

The flat structure (npm v3+):

node_modules/
  react/
  react-dom/
  scheduler/        ← dependency of react-dom
  loose-envify/     ← dependency of react

npm tries to flatten everything to the top level. This avoids deeply nested paths (Windows has a 260-character path limit) and allows shared dependencies.

When flattening fails: If two packages need different versions of the same dependency, npm nests the conflicting one:

node_modules/
  package-a/
  package-b/
    node_modules/
      lodash@3.0.0/   ← package-b needs an older lodash
  lodash@4.0.0/        ← everyone else uses this

Step 6: Run lifecycle scripts

After installation, npm runs scripts in this order:

  1. preinstall
  2. install (often compiles native modules with node-gyp)
  3. postinstall

This is where things sometimes break β€” native modules need compilers (python, make, g++), and if they’re missing, you get cryptic errors.

Step 7: Write package-lock.json

npm writes the exact resolved tree to package-lock.json. Every package, every version, every integrity hash. This file is the source of truth for reproducible installs.

npm install vs npm ci

npm install:

  • Reads package.json
  • May update package-lock.json
  • Installs missing packages, keeps existing ones

npm ci:

  • Reads package-lock.json only
  • Deletes node_modules/ entirely
  • Installs exactly what the lockfile says
  • Faster, deterministic, used in CI/CD

Why node_modules is so big

A typical Next.js project has 200-400MB in node_modules. Why?

  1. Transitive dependencies. You install 20 packages, they bring 800 friends.
  2. No deduplication across versions. If three packages need three different versions of the same library, you get three copies.
  3. Published junk. Many packages include test files, documentation, TypeScript source, and build artifacts that aren’t needed at runtime.

The full timeline

npm install
  β”œβ”€β”€ Read package.json (1ms)
  β”œβ”€β”€ Resolve dependency tree (200-500ms)
  β”œβ”€β”€ Check cache (50ms)
  β”œβ”€β”€ Fetch missing packages (1-30s, depends on network)
  β”œβ”€β”€ Extract to node_modules (2-10s)
  β”œβ”€β”€ Run lifecycle scripts (0-60s)
  └── Write package-lock.json (100ms)

Total: anywhere from 3 seconds (everything cached) to 2 minutes (fresh install, slow network, native modules).

Now you know what’s happening behind that progress bar.

πŸ“˜