
How I Cut Our Docker Images by 90% (And Why It Matters)
A few months ago, I was working on a project where deployments were painfully slow. Every time we pushed a change, it took forever for containers to pull and start. Our images were over 1GB each. That's when I finally sat down and learned multi-stage builds properly.
The result? Our main application image went from 1.2GB to 89MB. Deployments that took 10 minutes now take less than 2. And our security team stopped complaining about all the unnecessary packages we were shipping.
If you're dealing with bloated Docker images, slow deployments, or security headaches from oversized containers, this post will show you exactly how to fix it. I'll walk through the specific techniques that made the biggest difference, the mistakes I made along the way, and why these optimizations matter more than you might think.
Why Docker Image Size Actually Matters
Before diving into solutions, let's talk about why image size isn't just a vanity metric. When I started containerizing applications, I thought storage was cheap and network speeds were fast enough that a few hundred extra megabytes wouldn't matter. I was wrong.
Network transfer time is the most obvious impact. In our case, pulling a 1.2GB image over our corporate network took 6-8 minutes during peak hours. Multiply that by multiple services and frequent deployments, and you're looking at serious productivity hits.
Security surface area expands dramatically with larger images. Every package, library, and binary you include is a potential attack vector. Our security scans were flagging vulnerabilities in build tools and development dependencies that had no business being in production containers.
Storage costs add up quickly when you're running at scale. We had 15 microservices, each with multiple versions stored in our registry. Large images meant higher registry storage costs and longer backup times.
Cold start performance suffers when container orchestrators need to pull massive images before starting new instances. This directly impacts auto-scaling responsiveness and deployment rollback speed.
What Was Going Wrong
Our old Dockerfiles were simple but inefficient. Install Node, copy everything, run npm install, build, done. Here's what a typical Dockerfile looked like:
FROM node:16
WORKDIR /app
COPY . .
RUN npm install
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]
The problem is that "everything" included build tools, dev dependencies, test files, and all sorts of stuff we didn't need in production. We were essentially shipping our entire development environment.
Build tools and compilers like gcc, make, and python were installed as dependencies for native npm packages, then left sitting in the final image. These tools alone can add 200-300MB.
Development dependencies included testing frameworks, linters, and development servers that served no purpose in production. Our package.json had over 40 dev dependencies that were being installed in every image.
Source files and artifacts like TypeScript files, test directories, documentation, and git history were all copied into the container. The built JavaScript was tiny, but we were shipping all the source code too.
Package manager caches from npm, apt, and other tools were left behind after installation, adding unnecessary bloat.
Every extra package is a potential security hole. Every extra megabyte slows down network transfers. It adds up fast when you're running dozens of containers.
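Part of the reason leftovers cost so much is that image layers are additive: deleting a file in a later instruction hides it, but the layer that first wrote it still carries the bytes. As a rough single-stage sketch (the `.build-deps` name and the package list are illustrative assumptions, not what our project used), cleanup only helps when it happens in the same `RUN` instruction that created the files:

```dockerfile
# Illustrative sketch; the base image tag and package list are assumptions.
FROM node:16-alpine
WORKDIR /app
COPY package*.json ./
# Install the native-build toolchain, install dependencies, then remove the
# toolchain and the npm cache within one RUN instruction, so none of it is
# committed to a layer.
RUN apk add --no-cache --virtual .build-deps python3 make g++ \
    && npm ci \
    && npm cache clean --force \
    && apk del .build-deps
COPY . .
```

This trims compilers and caches, but it still ships dev dependencies and source files, which is the gap multi-stage builds close.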
The Fix Is Simpler Than You Think
Multi-stage builds let you use one container to build your app and another to run it. The build container has all the tools you need - compilers, package managers, test frameworks. The runtime container only has what's actually needed to run.
Here's how we transformed our Dockerfile:
# Production dependencies stage
FROM node:16-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production --silent
# Development dependencies for building
FROM node:16-alpine AS dev-builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --silent
COPY . .
RUN npm run build
# Production stage
FROM node:16-alpine AS production
RUN addgroup -g 1001 -S nodejs
RUN adduser -S -u 1001 -G nodejs nextjs
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY --from=dev-builder /app/dist ./dist
COPY --from=dev-builder /app/package.json ./package.json
USER nextjs
EXPOSE 3000
CMD ["node", "dist/index.js"]
You just copy the production dependencies and the built files from the earlier stages into the final one. The build tools stay behind. It sounds obvious once you know it, but I wish someone had explained it to me years earlier.
Breaking Down the Multi-Stage Approach
Stage 1: Production Dependencies - Install only the packages needed at runtime. Using `npm ci --only=production` ensures dev dependencies are skipped. Newer npm releases deprecate this flag in favor of the equivalent `--omit=dev`.
Stage 2: Build Process - Install all dependencies (including dev dependencies) and build the application. This stage includes TypeScript compilation, bundling, and any other build processes.
Stage 3: Final Runtime - Start with a clean base image and copy only the essential files from previous stages. No source code, no build tools, no dev dependencies.
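One possible refinement of the three stages above, sketched under the assumption of npm 8 or later (for `--omit=dev`) and with hypothetical stage names `base`, `prod-deps`, and `build`: a shared `base` stage removes the duplicated `WORKDIR` and `COPY package*.json` lines. With BuildKit, stages that do not depend on each other can also be built in parallel.

```dockerfile
# Shared setup used by both dependency stages.
FROM node:16-alpine AS base
WORKDIR /app
COPY package*.json ./

# Runtime dependencies only.
FROM base AS prod-deps
RUN npm ci --omit=dev

# All dependencies, plus the compiled output in /app/dist.
FROM base AS build
RUN npm ci
COPY . .
RUN npm run build

# Final image: a clean base plus the artifacts copied in.
FROM node:16-alpine AS production
WORKDIR /app
COPY --from=prod-deps /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
COPY package.json ./
# The official node images ship an unprivileged "node" user.
USER node
EXPOSE 3000
CMD ["node", "dist/index.js"]
```

Running `docker build --target build .` stops at the named stage, which can be handy when debugging a failing build step.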
Advanced Optimization Techniques
Leveraging Alpine and Distroless Images
Switching to Alpine Linux as our base image was a game-changer. Alpine images are typically 5-10x smaller than their Debian-based counterparts (the default `node` images are built on Debian):
- node:16 - 993MB
- node:16-alpine - 174MB
- node:16-alpine (multi-stage final) - 89MB
For even more security-conscious applications, consider distroless images. These contain only your application and runtime dependencies, with no shell, package manager, or other utilities:
FROM gcr.io/distroless/nodejs16-debian11
COPY --from=dev-builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
EXPOSE 3000
CMD ["dist/index.js"]
Optimizing Layer Caching
Layer ordering dramatically impacts build performance. Docker caches each layer, so structure your Dockerfile to maximize cache hits:
# Good: Dependencies change less frequently than code
COPY package*.json ./
RUN npm ci --only=production
COPY . .
# Bad: Every code change invalidates package installation
COPY . .
RUN npm ci --only=production
Using .dockerignore Effectively
A comprehensive `.dockerignore` file prevents unnecessary files from being copied into build contexts:
node_modules
npm-debug.log
.git
.gitignore
README.md
.env
coverage
.nyc_output
.DS_Store
*.md
.vscode
.idea
tests
**/*.test.js
**/*.spec.js
A Few Things I Learned the Hard Way
First, order matters more than you think. Put things that change rarely at the top of your Dockerfile. Put things that change often at the bottom. Docker caches each layer, so if you copy your code before installing dependencies, you'll reinstall everything on every change.
I spent a frustrating week wondering why our build times were still slow after implementing multi-stage builds. The issue was dependency installation running on every code change because I had the COPY commands in the wrong order.
Second, use Alpine or distroless images for your final stage. They're tiny and have almost nothing that attackers could exploit. Our security scans went from hundreds of warnings to almost none. The reduced attack surface was immediately noticeable in our vulnerability reports.
Third, don't forget your .dockerignore file. I once spent an hour debugging why my image was still huge, only to realize I was copying the entire node_modules and .git folders. This single file can save you hundreds of megabytes.
Fourth, test your optimized images thoroughly. Aggressive optimization can sometimes remove dependencies your application needs at runtime. We discovered this when our image optimization removed a system library that one of our npm packages required.
Fifth, consider your specific runtime needs. Not every optimization applies to every use case. If you're running multiple processes in a container (though generally not recommended), you might need some of the tools that Alpine strips out.
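Coming back to the first lesson: correct ordering can be combined with a BuildKit cache mount, so that even a changed `package.json` does not force npm to download every package again. A minimal sketch, assuming BuildKit is enabled, npm 8 or later, and that npm runs as root with its default cache directory `/root/.npm`:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:16-alpine AS builder
WORKDIR /app
# Rarely changing inputs come first, so this layer is reused when only
# source files change.
COPY package*.json ./
# The cache mount keeps npm's download cache between builds without
# writing it into an image layer.
RUN --mount=type=cache,target=/root/.npm npm ci --omit=dev
```

The layer cache still decides whether the `RUN` step executes at all. The cache mount only makes the step cheaper when it does have to run.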
Real-World Impact and Monitoring
The improvements went beyond just faster deployments. Here's what we measured after implementation:
Deployment velocity increased by 400%. Our CI/CD pipeline went from 10-minute deployments to 2-minute deployments, enabling more frequent releases and faster rollbacks.
Registry storage costs dropped by 70%. With smaller images and more efficient layer sharing, our Docker registry bills decreased significantly.
Security posture improved dramatically. Vulnerability scans went from flagging 200+ issues to fewer than 10, mostly in base OS packages rather than unnecessary development tools.
Developer experience got better. Developers could pull and run services locally much faster, improving the onboarding experience and development workflow.
Common Pitfalls to Avoid
Over-optimizing can lead to runtime failures. Always test your optimized images in staging environments that mirror production. We caught several issues where Alpine's minimal environment lacked utilities our application expected.
Ignoring layer sharing between images. If you have multiple services, consider using shared base images to maximize Docker's layer caching benefits across your entire stack.
Forgetting about multi-architecture builds. If you're deploying to ARM-based instances (like AWS Graviton), ensure your optimization strategy works across architectures.
Skipping security scanning on optimized images. Smaller doesn't always mean more secure. Continue running security scans on your final images to catch any remaining vulnerabilities.
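For the multi-architecture pitfall, one approach that may help is to run the compile step on the build machine's native architecture and install runtime dependencies on the target architecture. This is a sketch, not what we shipped. It assumes BuildKit and npm 8 or later, and that the output in `dist/` is plain JavaScript with no compiled binaries:

```dockerfile
# Compile on the builder's native architecture. BUILDPLATFORM is set
# automatically by BuildKit. Assumes dist/ is architecture-independent.
FROM --platform=$BUILDPLATFORM node:16-alpine AS dev-builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Install runtime dependencies on the target architecture, because native
# addons in node_modules are compiled per architecture.
FROM node:16-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

FROM node:16-alpine AS production
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY --from=dev-builder /app/dist ./dist
COPY --from=dev-builder /app/package.json ./package.json
# The official node images ship an unprivileged "node" user.
USER node
EXPOSE 3000
CMD ["node", "dist/index.js"]
```

A build for both architectures might then look like `docker buildx build --platform linux/amd64,linux/arm64 -t <image> --push .`, where `<image>` is a placeholder for your own registry path.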
Looking Forward
Small changes, big impact. That's what I love about this kind of optimization work. The techniques I've shared here reduced our image sizes by 90%, but more importantly, they made our entire development and deployment pipeline more efficient and secure.
Start with multi-stage builds, add a comprehensive .dockerignore file, and switch to Alpine-based images. These three changes alone will likely get you 70-80% of the benefits. Then you can fine-tune based on your specific requirements and constraints.
The time invested in optimizing Docker images pays dividends in faster deployments, lower costs, and better security posture. In our case, the few days spent learning and implementing these techniques saved us hours every week in deployment time alone.