Johnny Unar

One npm package nearly owned our Next.js 15 app

A postinstall hook, a remote loader, and a few ugly hours in CI. The fixes were simple once we stopped pretending JavaScript supply chain risk was theoretical.

how we caught it

This happened on a fairly normal weekday release, on a Next.js 15 app with App Router, a couple of internal packages, a PostgreSQL-backed API, and the usual JavaScript dependency pile that quietly grows while nobody's looking. One pull request updated a small transitive package. Nothing exciting, just a lockfile churn diff that most teams would wave through because the top-level package.json barely moved. That was exactly the problem: the dangerous code didn't arrive through an obvious dependency decision, it arrived through the trust we place in package managers to keep the graph boring.

The first useful signal came from a CI job that started hanging during install, then made outbound requests to a host nobody recognized. We run install steps in a constrained environment, which helped, because the package tried to execute a postinstall script and fetch a remote payload. The request failed, loudly. If that same install had run on a less restricted runner, or on a developer laptop with broader access to internal tokens, we'd have had a much uglier incident.

The suspicious package had a package.json snippet that looked roughly like this:

json
{
  "scripts": {
    "postinstall": "node ./dist/install.js"
  }
}

That alone isn't proof of anything; plenty of packages do dumb things in lifecycle hooks. But dist/install.js was obfuscated: string arrays, hex escapes, dynamic Function(...), the usual garbage. Inside it was a tiny loader that pulled in Node's https module, fetched a second-stage script, wrote it to a temp path, and executed it. That wasn't analytics. That wasn't a binary download. That was remote code execution during dependency installation.

We diffed the lockfile immediately. The meaningful lines were small enough to miss in review:

diff
- suspicious-lib@1.4.2:
-   resolution: {integrity: sha512-old...}
+ suspicious-lib@1.4.3:
+   resolution: {integrity: sha512-new...}
+   hasBin: true
+   requiresBuild: true

requiresBuild: true on a package that previously had no build step should stop people in their tracks. It usually doesn't. That one field earned us a postmortem.
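A blunt grep over the lockfile diff is enough to catch that field automatically. A self-contained sketch; the diff line is inlined here, while in CI it would come from git diff against the base branch:

```shell
# Sketch: fail a dependency PR when the lockfile diff introduces an
# install-time build step. In a real pipeline diff_text would be the
# output of: git diff origin/main -- pnpm-lock.yaml
diff_text='+  requiresBuild: true'

if printf '%s\n' "$diff_text" | grep -qE '^\+.*requiresBuild: true'; then
  echo "lockfile introduces requiresBuild -- block and review"
fi
```

It's deliberately dumb, which is the point: a field flipping on in a patch release should require a human to say yes.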

the blast radius

The part people underestimate with package compromise is timing. You don't need your production containers to npm install at boot for this to become serious. You need one CI runner with write access to artifacts, one GitHub Actions token with default permissions, one build machine that injects SENTRY_AUTH_TOKEN, NEXT_PUBLIC_* values, cloud credentials, or a private package registry token, and suddenly the attacker doesn't care about your frontend package anymore, they care about your release pipeline.

Our Next.js build had the usual footprint, next build generating server bundles, image optimization config, environment reads during compile, and a deploy step that pushed images downstream. The malicious package executed before any of that. If the loader had succeeded, it could have scraped environment variables, modified emitted assets, patched next.config.ts, tampered with generated server code under .next/server, or written a credential stealer into a build artifact that looked like a harmless helper module.

One ugly detail: Next.js apps tend to blur runtime boundaries in ways people forget. Client code, edge code, server actions, route handlers, all sit in one repo, often one build, often one install. A compromised package in a shared utility path can influence much more than a static marketing page. On one of our internal reviews at steezr we found teams pulling markdown renderers, analytics helpers, and tiny AST utilities into both server and client bundles because tree shaking made it feel cheap. Cheap until the package author's account gets hijacked.

The nearest miss for us wasn't browser compromise. It was CI compromise. Our GitHub Actions workflow still had broader permissions than it needed:

yaml
permissions:
  contents: write
  packages: write
  id-token: write

That config existed because it was convenient months earlier, and convenience lingers. A malicious install step on that runner had enough room to pivot into package publication or release tampering. We got lucky once. I don't count on luck twice.

the diffs that mattered

Once we isolated the package, we stopped treating the lockfile as machine noise and read it like source code. Most teams claim lockfiles are sacred, then review them with the same care they'd give a minified vendor blob. That's backwards. The lockfile is where supply-chain compromise becomes visible.

We pulled three diffs. First, the package.json of the compromised version, fetched directly from the registry tarball. Second, the tarball file listing. Third, the lockfile change between the last good build and the bad one. The package diff told the story fast:

diff
 {
   "name": "suspicious-lib",
-  "version": "1.4.2",
+  "version": "1.4.3",
   "main": "dist/index.js",
+  "scripts": {
+    "postinstall": "node dist/install.js"
+  },
   "files": [
-    "dist/index.js"
+    "dist/index.js",
+    "dist/install.js"
   ]
 }

That should never have made it through unchallenged. New lifecycle hooks in a patch release are a giant red light. New executable files in dist/ are another. The lockfile then showed the integrity hash rotating, expected on version change, plus metadata that indicated install-time execution. We now automatically fail CI if a dependency introduces preinstall, install, postinstall, prepare, or prepublishOnly unless the package is on an explicit allowlist.

The script we added is intentionally blunt:

bash
node scripts/audit-lifecycle-hooks.mjs pnpm-lock.yaml allowlist/lifecycle-hooks.json

It resolves package tarballs, inspects package.json, compares against a checked-in allowlist, and exits non-zero on new hooks. No AI. No fuzzy heuristics. Just deterministic rejection.
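For illustration, the core of that check can be sketched like this. The names and data shapes below are hypothetical, not the script's actual internals; the real version reads package.json out of each tarball resolved from the lockfile:

```javascript
// Sketch of an allowlist check like the one in audit-lifecycle-hooks.mjs.
// Function and field names here are illustrative assumptions.
const DENIED_HOOKS = ["preinstall", "install", "postinstall", "prepare", "prepublishOnly"];

// Returns violations: packages declaring a denied lifecycle hook
// without being on the explicit allowlist.
function auditLifecycleHooks(packages, allowlist) {
  const allowed = new Set(allowlist);
  const violations = [];
  for (const pkg of packages) {
    if (allowed.has(pkg.name)) continue;
    const scripts = pkg.scripts || {};
    for (const hook of DENIED_HOOKS) {
      if (hook in scripts) {
        violations.push({ name: pkg.name, hook, command: scripts[hook] });
      }
    }
  }
  return violations;
}

// One clean package, one that sneaks in a postinstall hook.
const result = auditLifecycleHooks(
  [
    { name: "left-pad-ish", scripts: {} },
    { name: "suspicious-lib", scripts: { postinstall: "node dist/install.js" } },
  ],
  ["esbuild"] // packages explicitly permitted to run hooks
);
// result holds one violation for suspicious-lib; CI exits non-zero on any.
```

The allowlist file is checked in and reviewed like code, so every exception has an author and a diff.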

We also started generating a review artifact on every dependency PR, a plain text summary of added packages, removed packages, new scripts, new binaries, native addons, and changed registries. Senior engineers will read a 40-line risk summary. They won't read 2,700 lines of lockfile churn unless they're already suspicious.
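A minimal sketch of how such a summary can be generated, assuming the package metadata has already been fetched. Field names are illustrative, and the real artifact also covers binaries, native addons, and registry changes:

```javascript
// Illustrative dependency-diff summary for a PR artifact. `before` and
// `after` map package names to metadata pulled from registry tarballs;
// the shapes here are assumptions, not our exact internal format.
function summarizeDependencyDiff(before, after) {
  const added = Object.keys(after).filter((name) => !(name in before));
  const removed = Object.keys(before).filter((name) => !(name in after));
  // Packages whose new version declares lifecycle hooks the old one lacked.
  const newScripts = Object.keys(after).filter((name) => {
    const prev = (before[name] && before[name].scripts) || {};
    const next = after[name].scripts || {};
    return Object.keys(next).some((hook) => !(hook in prev));
  });
  return [
    `added packages: ${added.join(", ") || "none"}`,
    `removed packages: ${removed.join(", ") || "none"}`,
    `new scripts: ${newScripts.join(", ") || "none"}`,
  ].join("\n");
}

const summary = summarizeDependencyDiff(
  { "suspicious-lib": { scripts: {} } },
  {
    "suspicious-lib": { scripts: { postinstall: "node dist/install.js" } },
    "tiny-helper": {},
  }
);
// summary lists tiny-helper as added and suspicious-lib under "new scripts".
```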

installs must be boring

The fix wasn't one silver bullet, it was a stack of boring constraints that make install-time code execution much harder to weaponize. We standardized on pnpm 8 for existing projects that haven't moved yet, and on newer pnpm where the repo already supports it, with strict flags wired into local dev, CI, and container builds. The exact command matters less than the posture, installs should be reproducible, offline where possible, and hostile to surprise mutation.

For CI, we now use:

bash
pnpm install --frozen-lockfile --ignore-scripts --prefer-offline

If a project genuinely requires build scripts for a known package, we allow that in a separate, tightly scoped step after validation. Defaulting to --ignore-scripts closes off the laziest attack path immediately. For npm-based repos we use:

bash
npm ci --ignore-scripts --prefer-offline

People object that some packages break. Fine, then identify those packages and justify them in code review. Hidden script execution during install is a terrible default.

Immutable lockfiles are non-negotiable. CI must fail if the lockfile changes. Docker builds must copy package.json and lockfile first, install, then copy the rest, because you want the dependency graph pinned before application files enter the picture. We also block install commands that would rewrite the lockfile in CI by running builds in read-only workspaces where possible.
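That layering looks roughly like this in a Dockerfile; base image and paths are illustrative:

```dockerfile
# Stage 1: pin and install the dependency graph before any app code exists.
FROM node:20-slim AS deps
WORKDIR /app
RUN corepack enable
# Only the manifests: this layer changes when the lockfile changes, nothing else.
COPY package.json pnpm-lock.yaml ./
RUN pnpm install --frozen-lockfile --ignore-scripts --prefer-offline

# Stage 2: application files enter only after the graph is fixed.
FROM node:20-slim AS build
WORKDIR /app
RUN corepack enable
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN pnpm build
```

Beyond caching, the ordering is a security property: by the time application files are copied in, the dependency graph is already resolved and frozen.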

Package allowlists help more than most people expect. Not a giant list of every dependency, that becomes theater, but an allowlist for exceptions: packages permitted to run lifecycle hooks, packages allowed to ship native addons, registries allowed for fetches. Everything else gets denied by default. One of our internal services now fails if a package resolves from anything other than https://registry.npmjs.org/ and our private scope.
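The registry check is a single pass over resolved URLs. A sketch, assuming the lockfile has already been parsed into name and resolved-URL pairs; the private registry URL below is a stand-in, not our real one:

```javascript
// Hypothetical helper: deny any package resolved from a registry
// outside the allowlist.
const ALLOWED_REGISTRIES = [
  "https://registry.npmjs.org/",
  "https://npm.internal.example.com/", // stand-in for the private scope registry
];

function findForeignRegistries(resolutions) {
  return resolutions.filter(
    ({ resolved }) => !ALLOWED_REGISTRIES.some((origin) => resolved.startsWith(origin))
  );
}

const bad = findForeignRegistries([
  { name: "react", resolved: "https://registry.npmjs.org/react/-/react-18.3.1.tgz" },
  { name: "suspicious-lib", resolved: "https://evil.example.net/suspicious-lib-1.4.3.tgz" },
]);
// bad contains only suspicious-lib; the build fails when bad.length > 0.
```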

Boring installs are good installs. If package installation feels dynamic, clever, or magical, you've already lost half the fight.

provenance and signatures

Package integrity hashes in lockfiles protect against some classes of tampering, but they don't tell you whether the thing you pinned came from a trustworthy build process. That's where provenance helps, and yes, the JavaScript ecosystem is still uneven here, which means you need to do more work than you'd like.

We added Sigstore verification to packages we publish ourselves and to any internal release process that emits artifacts consumed by other apps. For container images and release bundles we verify with cosign in CI before promotion. A stripped down check looks like this:

bash
cosign verify \
  --certificate-identity "https://github.com/steezr/portal/.github/workflows/release.yml@refs/heads/main" \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
  ghcr.io/steezr/portal-web:sha-2f4c9d1

That won't save you from a malicious upstream npm package by itself, but it's still worth doing, because it closes the gap between compromised build runners and trusted deployment artifacts. If an attacker modifies a build outside the expected workflow identity, verification fails and deployment stops.

For npm packages, provenance support has improved, though not uniformly enough that I'd tell anyone to rely on it exclusively. We verify what we can, record publisher metadata, and flag changes in maintainers, publish tooling, and tarball structure. A package that suddenly flips from a clean source publish to an opaque prebuilt bundle deserves scrutiny. Same for packages that start shipping extra files, especially executable installers.

One practical rule we enforce now: internal packages consumed by our Next.js apps must be published from GitHub Actions with OIDC-backed signing, and the workflow file path is pinned in policy. That gives us a concrete chain of custody. It also forces teams to stop publishing from random laptops at 23:40 before a launch, which was always a bad habit and now has a security reason attached to it.
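A publish job under that kind of policy can look roughly like this sketch. One way to get OIDC-backed provenance is npm's --provenance flag from GitHub Actions; the placeholders marked pinned-sha stand for the full commit SHAs we pin, and the package details are invented:

```yaml
# Illustrative publish workflow; <pinned-sha> stands for a full commit SHA.
name: release
on:
  push:
    tags: ["v*"]
permissions:
  contents: read
  id-token: write # required so npm can mint OIDC-backed provenance
jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@<pinned-sha>
      - uses: actions/setup-node@<pinned-sha>
        with:
          node-version: 20
          registry-url: "https://registry.npmjs.org"
      - run: npm ci --ignore-scripts --prefer-offline
      - run: npm publish --provenance --access public
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
```

Pinning the workflow file path in policy means a publish from any other workflow, or any other machine, simply doesn't verify.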

lock down actions

GitHub Actions was the second half of this incident, because a malicious dependency is dangerous in direct proportion to what the runner can reach. Most teams leave the defaults wide open, then act surprised when a compromised step has enough permissions to rewrite tags, push packages, mint cloud credentials, or open pull requests that smuggle in persistence.

We tightened every workflow. Default permissions are now read-only unless a job proves it needs more:

yaml
permissions:
  contents: read

Jobs that publish images get only what's necessary:

yaml
permissions:
  contents: read
  packages: write
  id-token: write

Nothing else. No blanket write access on test jobs, no inherited token scopes because a starter template included them. We also moved secrets out of workflows that don't need them, disabled untrusted pull request access to sensitive jobs, pinned third-party actions to full commit SHAs, and banned pull_request_target except in narrowly reviewed cases. That event is a footgun.

One specific mitigation mattered a lot: dependency install happens in a job with no deploy credentials, no package publish token, and no write permission to the repo. Build artifacts move forward only after policy checks pass. If install-time malware runs in that first stage, it hits a wall instead of a hallway.
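Sketched as workflow structure, with job names invented for illustration and pinned-sha standing in for full commit SHAs:

```yaml
# Sketch: install runs with read-only permissions; credentials appear
# only downstream, after policy checks.
jobs:
  install-and-build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@<pinned-sha>
      - run: corepack enable
      - run: pnpm install --frozen-lockfile --ignore-scripts --prefer-offline
      - run: pnpm build
      - uses: actions/upload-artifact@<pinned-sha>
        with:
          name: build-output
          path: .next
  publish:
    needs: install-and-build
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
      id-token: write
    steps:
      - uses: actions/download-artifact@<pinned-sha>
        with:
          name: build-output
      # policy checks run here, then the image is built and pushed
```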

Runtime mitigations matter too. Our production images are built once, promoted unchanged, and run with a read-only root filesystem where the platform supports it. Egress is constrained. Containers don't need to reach arbitrary hosts. Next.js servers don't get shell tools unless there's a compelling reason, and there usually isn't. A lot of supply-chain hardening advice stops at package install. The smarter move is assuming one layer fails, then making the next pivot expensive.

the rules now

We wrote the policy the same afternoon because vague security intentions decay fast. The current rules for our Next.js projects are blunt, and I prefer blunt rules to elegant exceptions nobody remembers six months later.

Every repo must use a committed lockfile, CI uses pnpm install --frozen-lockfile --ignore-scripts --prefer-offline or npm ci --ignore-scripts --prefer-offline, lifecycle hooks are denied unless allowlisted, third-party GitHub Actions are pinned by SHA, workflow permissions default to contents: read, and dependency updates get a generated risk summary attached to the pull request. Internal packages must publish through signed CI workflows. Builds that need scripts run them in a second stage with no secrets and with the exact package names documented.

We also added a tiny check that catches one of the dumbest, most common mistakes: accidental lockfile drift from a developer using a different package manager or a different major version. If packageManager in package.json says pnpm, CI fails on npm-generated changes. If the repo expects npm, a stray pnpm-lock.yaml is a hard stop. Mixed tooling produces messy diffs, and messy diffs hide bad things.
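The check itself is a few lines. A simplified sketch with hypothetical function names; the lockfile-to-tool mapping is the standard convention:

```javascript
// Sketch of the package-manager drift check. The lockfile names are the
// real conventions; the policy logic is simplified for illustration.
function detectManagerDrift(packageManagerField, presentLockfiles) {
  // e.g. "pnpm@8.15.0" -> "pnpm"
  const expected = packageManagerField ? packageManagerField.split("@")[0] : null;
  const lockfileOwners = {
    "pnpm-lock.yaml": "pnpm",
    "package-lock.json": "npm",
    "yarn.lock": "yarn",
  };
  const errors = [];
  for (const file of presentLockfiles) {
    const owner = lockfileOwners[file];
    if (owner && expected && owner !== expected) {
      errors.push(`${file} does not match packageManager "${expected}"`);
    }
  }
  return errors;
}

const errors = detectManagerDrift("pnpm@8.15.0", ["pnpm-lock.yaml", "package-lock.json"]);
// One error: package-lock.json in a pnpm repo is a hard stop.
```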

None of this is exotic. That's the point. You can wire it into a real team without turning delivery into a ceremony marathon. At steezr we still ship quickly, for startups, internal systems, AI-heavy products, native apps, the whole spread, and speed survives just fine once the defaults are sane. What doesn't survive is the fantasy that npm install is a harmless setup step. It's remote code execution against your build chain, wrapped in a convenience command, and it deserves to be treated that way every single time.

Written by

Johnny Unar
