CI/CD Caching Strategies That Cut Pipeline Time by 60%
Every minute your pipeline spends re-downloading dependencies is a minute wasted. I've watched teams accept 45-minute builds as "just how things are" while their caches sit misconfigured or completely unused. Last quarter, I cut a client's average pipeline time from 38 minutes to 14 minutes—a 63% reduction—by fixing their caching strategy. No hardware upgrades, no parallelization magic. Just proper caching.
Here's the thing: most CI/CD caching documentation tells you that you should cache, but not how to cache effectively. Let's fix that.
Understanding What Actually Needs Caching
Before touching any configuration, audit where your pipeline spends time. Run this in your CI environment:
# Add timing to each major step
time npm ci # or pip install, bundle install, etc.
time docker build .
time npm run build
time npm test
In 90% of projects I've audited, dependency installation and Docker layer building consume 60-80% of total pipeline time. These are your primary cache targets.
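To make that audit repeatable, the timing commands can be wrapped in a small helper that logs each step and ranks the steps by share of total time. This is only a sketch: `timed`, `timing_report`, and the `timings.tsv` file name are hypothetical names, not something your CI provider ships.

```shell
# Hypothetical helpers for auditing step durations in a CI script.
# TIMING_LOG is an assumed file name; override it as needed.
TIMING_LOG="${TIMING_LOG:-timings.tsv}"

# timed LABEL COMMAND [ARGS...]
# Runs the command, appends "label<TAB>seconds" to the log,
# and returns the command's own exit status.
timed() {
  label="$1"; shift
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  printf '%s\t%s\n' "$label" "$((end - start))" >> "$TIMING_LOG"
  return "$status"
}

# Prints the logged steps, slowest first, with each step's share of the total.
timing_report() {
  sort -t "$(printf '\t')" -k2,2nr "$TIMING_LOG" |
    awk -F '\t' '{ name[NR] = $1; secs[NR] = $2; total += $2 }
      END { for (i = 1; i <= NR; i++)
              printf "%-12s %5ds %3d%%\n", name[i], secs[i],
                     (total ? secs[i] * 100 / total : 0) }'
}

# Usage inside a CI script:
#   timed install npm ci
#   timed build   npm run build
#   timed test    npm test
#   timing_report
```

Whole-second resolution is coarse, but it is enough to show which steps dominate a multi-minute pipeline.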
The mental model is simple: cache anything that's (1) slow to generate, (2) changes infrequently, and (3) can be keyed deterministically. Dependencies fit perfectly—they change when your lockfile changes, not on every commit.
Dependency Caching Done Right
Most teams cache their node_modules or .venv directories directly. This works but creates cache bloat and invalidation issues. Cache the package manager's cache directory instead—let the package manager handle the extraction.
Here's a GitLab CI configuration that caches npm properly:
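stages:
  - setup
  - build

variables:
  npm_config_cache: "$CI_PROJECT_DIR/.npm"

# Shared cache settings, reused below through a YAML anchor.
# A job-level cache: block replaces the global one rather than merging with it.
cache: &npm_cache
  key:
    files:
      - package-lock.json
  paths:
    - .npm/
  policy: pull-push

install_dependencies:
  stage: setup
  script:
    - npm ci --prefer-offline
  cache:
    <<: *npm_cache
    policy: pull-push

build:
  stage: build
  script:
    - npm ci --prefer-offline  # node_modules is not cached, so reinstall from .npm
    - npm run build
  cache:
    <<: *npm_cache
    policy: pull  # Don't update cache, just read it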
Key details that matter:
- npm ci over npm install: ci is deterministic and faster; it doesn't modify your lockfile.
- --prefer-offline: Uses cached packages without checking the registry when possible.
- Cache policy separation: Only the install job pushes to cache. Build and test jobs pull only. This prevents cache corruption from parallel jobs.
- File-based cache key: The cache invalidates only when package-lock.json changes, not on every commit.
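For Python projects, cache pip's cache directory (~/.cache/pip on Linux; pip cache dir prints the actual path) with a key based on requirements.txt or poetry.lock. For Ruby, point Bundler at a project-local path such as vendor/bundle and cache that directory keyed on Gemfile.lock; ~/.bundle mostly holds Bundler configuration rather than installed gems.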
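The same lockfile-keyed idea works on any CI system, including ones without a built-in file-hash key. The sketch below derives a key from a lockfile's SHA-256 digest. The function name `cache_key` and the 16-character truncation are assumptions made for illustration, not a convention of any CI provider.

```shell
# cache_key PREFIX LOCKFILE
# Prints "<prefix>-<first 16 hex chars of the lockfile's SHA-256>".
# Identical lockfile contents always produce the same key.
cache_key() {
  prefix="$1"; lockfile="$2"
  if command -v sha256sum >/dev/null 2>&1; then
    digest=$(sha256sum "$lockfile" | cut -d ' ' -f 1)      # GNU coreutils
  else
    digest=$(shasum -a 256 "$lockfile" | cut -d ' ' -f 1)  # macOS fallback
  fi
  printf '%s-%s\n' "$prefix" "$(printf '%s' "$digest" | cut -c 1-16)"
}

# Usage:
#   cache_key pip requirements.txt
#   cache_key gem Gemfile.lock
```

Because the key depends only on file contents, two branches with the same lockfile share one cache entry.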
Docker Layer Caching: The Biggest Win You're Ignoring
Docker builds in CI are notoriously slow because, by default, every build starts cold. You're re-downloading base images and re-running every RUN instruction from scratch. BuildKit's cache mounts and registry-based caching change this entirely.
Here's a GitHub Actions workflow with proper Docker caching:
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write  # lets GITHUB_TOKEN push the image and cache to ghcr.io
    steps:
      - uses: actions/checkout@v4
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Login to Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and Push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
          cache-from: type=registry,ref=ghcr.io/${{ github.repository }}:buildcache
          cache-to: type=registry,ref=ghcr.io/${{ github.repository }}:buildcache,mode=max
The mode=max flag is crucial—it caches all layers, not just the final image layers. Without it, intermediate build stages aren't cached.
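Outside GitHub Actions, the same registry cache can be requested directly from the `docker buildx build` CLI. The sketch below only prints the command so it can be inspected before running. The function name `buildx_cache_cmd` and the example image name are hypothetical; the `buildcache` tag mirrors the workflow above.

```shell
# buildx_cache_cmd IMAGE TAG
# Prints (does not run) a docker buildx command that pushes IMAGE:TAG and
# reads/writes a registry-backed layer cache stored at IMAGE:buildcache.
buildx_cache_cmd() {
  image="$1"; tag="$2"
  cache_ref="${image}:buildcache"
  printf 'docker buildx build --push --tag %s:%s --cache-from type=registry,ref=%s --cache-to type=registry,ref=%s,mode=max .\n' \
    "$image" "$tag" "$cache_ref" "$cache_ref"
}

# Usage (image name is an example only):
#   buildx_cache_cmd ghcr.io/acme/app "$CI_COMMIT_SHA"
```

Printing first is a deliberate choice: a typo in the cache reference fails quietly as a cold build, so it helps to see the exact flags in the job log.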
For even better results, structure your Dockerfile to maximize cache hits:
FROM node:20-alpine AS builder
WORKDIR /app
# Copy dependency files first - these change less frequently
COPY package*.json ./
# Install all dependencies; the build step usually needs devDependencies too
RUN npm ci
# Copy source code last - this changes on every commit
COPY . .
RUN npm run build
This ordering means your npm ci layer stays cached unless dependencies change, even when source code changes every commit.
Cache Storage: Local vs. Distributed
CI providers offer different caching backends with vastly different performance characteristics:
- GitHub Actions: Cache stored in Azure Blob, 10GB limit per repo, ~30-60 seconds to restore large caches
- GitLab CI: Distributed cache via S3/GCS/MinIO, configurable limits
- CircleCI: ~15-second restore times with their storage
For GitHub Actions, cache restore time becomes a bottleneck with large caches. I've seen teams cache 2GB of node_modules and spend 45 seconds restoring it—sometimes longer than a fresh npm ci would take. Measure your restore times:
- name: Restore cache with timing
  id: cache-restore
  uses: actions/cache@v4
  with:
    path: ~/.npm
    key: npm-${{ hashFiles('**/package-lock.json') }}
- name: Report cache performance
  run: |
    echo "Cache hit: ${{ steps.cache-restore.outputs.cache-hit }}"
    echo "Restored at: $(date)"
If restore times exceed 30 seconds, consider splitting caches by directory or using the actions/cache/restore and actions/cache/save actions separately for more control.
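Once you have the numbers, the decision is simple arithmetic: a cache pays off only if restore time plus the cached install is shorter than a fresh install. A rough sketch of that check follows. The function name `cache_verdict` is hypothetical, and the 30-second cutoff is the rule of thumb from the paragraph above, not a hard limit.

```shell
# cache_verdict RESTORE_SECS CACHED_INSTALL_SECS FRESH_INSTALL_SECS
# Compares the cached path (restore + install from cache) with a fresh install.
cache_verdict() {
  restore="$1"; cached_install="$2"; fresh="$3"
  saved=$((fresh - restore - cached_install))
  if [ "$saved" -le 0 ]; then
    echo "drop: cache costs $(( -saved ))s more than a fresh install"
  elif [ "$restore" -gt 30 ]; then
    echo "split: saves ${saved}s but restore takes ${restore}s"
  else
    echo "keep: saves ${saved}s per run"
  fi
}

# Usage with measured values, all in seconds:
#   cache_verdict 45 10 50
#   cache_verdict 10 15 90
```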
Cache Invalidation Strategies That Don't Break
The hardest problem in caching is invalidation. Too aggressive, and you never hit the cache. Too lenient, and you ship bugs from stale dependencies.
My baseline strategy uses two cache key tiers: an exact key built from the lockfile hash, and a prefix fallback:
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: npm-${{ runner.os }}-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      npm-${{ runner.os }}-
The restore-keys fallback means a partial cache hit still provides value. If your lockfile changed slightly, you restore the old cache and only download the delta.
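To make the lookup order concrete, here is a small emulation of how I understand the matching to work: the exact key is tried first, then each restore prefix in order, and the newest stored key with a matching prefix wins. It is an illustration, not the actions/cache implementation. `resolve_cache` is a hypothetical name, and the function assumes the stored keys arrive on stdin sorted newest first.

```shell
# resolve_cache KEY "PREFIX [PREFIX...]" < stored keys, one per line, newest first
# Prints "hit <key>", "partial <key>", or "miss" (and returns 1 on a miss).
resolve_cache() {
  key="$1"; prefixes="$2"
  stored=$(cat)
  # Tier 1: exact key match.
  if printf '%s\n' "$stored" | grep -Fxq -- "$key"; then
    echo "hit $key"
    return 0
  fi
  # Tier 2: first stored key (the newest) that starts with a restore prefix.
  for prefix in $prefixes; do
    match=$(printf '%s\n' "$stored" |
      awk -v p="$prefix" 'index($0, p) == 1 { print; exit }')
    if [ -n "$match" ]; then
      echo "partial $match"
      return 0
    fi
  done
  echo "miss"
  return 1
}

# Usage:
#   printf '%s\n' npm-Linux-bbb npm-Linux-aaa | resolve_cache npm-Linux-zzz 'npm-Linux-'
```

A partial result means the job starts from an older cache and saves a new entry under the exact key afterwards.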
For scheduled cache rotation (preventing months-old stale caches), add a date component:
key: npm-${{ runner.os }}-week${{ steps.date.outputs.week }}-${{ hashFiles('**/package-lock.json') }}
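This forces a full cache refresh weekly while maintaining cache hits within each week. The refresh only happens if any restore-keys you keep also include the week component; a bare npm-${{ runner.os }}- prefix would still match last week's cache.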
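The key above reads steps.date.outputs.week, which implies an earlier step with id date that is not shown. A minimal sketch of what that step's script might run is below. The function names are hypothetical, and the year-plus-ISO-week format is an assumption, chosen so that week 07 of one year never collides with week 07 of the next.

```shell
# Prints the ISO year and week number in UTC, e.g. 2024-07.
week_stamp() {
  date -u +%G-%V
}

# Writes "week=<stamp>" as a step output.
# In GitHub Actions, GITHUB_OUTPUT names the file that collects step outputs;
# when it is unset (local runs), the line goes to stdout instead.
emit_week_output() {
  echo "week=$(week_stamp)" >> "${GITHUB_OUTPUT:-/dev/stdout}"
}

# Usage as the run script of a step with id "date":
#   emit_week_output
```

Using UTC keeps runners in different time zones on the same week boundary.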
Measuring Your Gains
Don't optimize blindly. Track these metrics before and after implementing caching:
- P50 and P95 pipeline duration: Median shows typical experience, P95 catches cache miss impact
- Cache hit rate: Below 80% usually means your cache keys are too specific
- Cache restore time vs. fresh install time: If restore approaches fresh install time, your cache is too large
In GitHub Actions, extract these from the workflow run API. In GitLab, the CI/CD Analytics dashboard shows pipeline duration trends.
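However you export the raw numbers, the metrics themselves take a few lines of awk. This sketch assumes you already have one pipeline duration per line and one cache-hit flag per line. The function names `percentile` and `hit_rate` are hypothetical, and the percentile uses the nearest-rank method with an integer percentage.

```shell
# percentile P < one numeric duration per line
# Nearest-rank percentile: the value at position ceil(P * N / 100) after sorting.
percentile() {
  sort -n | awk -v p="$1" '{ v[NR] = $1 }
    END { if (NR == 0) exit 1
          rank = int((p * NR + 99) / 100)   # integer ceil of p * NR / 100
          if (rank < 1) rank = 1
          print v[rank] }'
}

# hit_rate < one "true" or "false" per line
# Prints the share of "true" lines as a whole-number percentage (truncated).
hit_rate() {
  awk '$1 == "true" { hits++ }
    END { if (NR == 0) exit 1; printf "%d\n", hits * 100 / NR }'
}

# Usage:
#   percentile 50 < durations.txt
#   percentile 95 < durations.txt
#   hit_rate < cache_hits.txt
```

A P95 far above the P50 is the signature of occasional cache misses rather than a uniformly slow pipeline.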
One team I worked with discovered their cache hit rate was 95% on feature branches but 0% on main—they'd configured branch-specific cache keys. Removing the branch component from their cache key immediately improved main branch builds by 8 minutes.
What To Do Monday Morning
Pull your last 20 pipeline runs and calculate average duration. Identify your two slowest steps—I guarantee one is dependency installation or Docker build. Implement the caching patterns above for those specific steps. Push the change and compare your next 20 runs.
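The before-and-after comparison is plain arithmetic, sketched below with hypothetical helper names. It assumes one duration per line in the same unit for both samples, and it truncates results to whole numbers.

```shell
# average < one numeric duration per line (result truncated to an integer)
average() {
  awk '{ sum += $1 } END { if (NR == 0) exit 1; printf "%d\n", sum / NR }'
}

# reduction_pct BEFORE AFTER
# Whole-number percentage reduction from BEFORE to AFTER (integers only).
reduction_pct() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# Usage:
#   before=$(average < runs_before.txt)
#   after=$(average < runs_after.txt)
#   reduction_pct "$before" "$after"
```

With the figures from the introduction, `reduction_pct 38 14` prints 63.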
Targeting a 60% reduction isn't aggressive. With proper caching of dependencies and Docker layers, I consistently see 50-70% improvements. The pipelines that don't improve usually have other bottlenecks—test parallelization, network-bound operations, or genuinely compute-heavy builds that can't be cached.
Start with one cache, measure, iterate. Your future self waiting for green builds will thank you.
Written by GeekOnCloud
DevOps & Infrastructure engineer at geekoncloud.com