AWS Lambda cold starts have long been the Achilles’ heel of serverless architectures. In 2026, with Lambda’s SnapStart now supporting Node.js and Python runtimes alongside Java, and new provisioned concurrency optimizations, there are more ways than ever to eliminate cold start latency. This guide walks you through seven battle-tested techniques to get your Lambda functions responding in under 100 milliseconds — every single time.
Understanding Cold Starts in 2026
A cold start occurs when AWS Lambda needs to spin up a new execution environment for your function. This involves downloading your code, initializing the runtime, and running your initialization code. In 2026, the baseline cold start times are:
- Python 3.13: ~200-400ms
- Node.js 22: ~150-350ms
- Java 21 (without SnapStart): ~2-6 seconds
- Rust/Go: ~50-100ms
Let’s explore how to bring all of these down dramatically.
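You can observe the init/invoke split from inside a function: module-level code runs once per execution environment, while the handler runs once per request. A minimal sketch (the `is_cold` flag and field names are illustrative, not an AWS API):

```python
import time

# Module scope executes during the init phase, once per execution environment
COLD_START_AT = time.time()
_invocation_count = 0

def handler(event, context=None):
    global _invocation_count
    _invocation_count += 1
    return {
        "is_cold": _invocation_count == 1,  # True only for the first request here
        "env_age_s": round(time.time() - COLD_START_AT, 3),
    }
```

Deploy something like this and you can watch warm environments report `is_cold: false` until Lambda recycles them.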
1. Use Lambda SnapStart (Now for Python & Node.js)
Lambda SnapStart, originally launched for Java, added support for Python and Node.js runtimes in late 2024. It works by taking a snapshot of the initialized execution environment and caching it, so new environments resume from the snapshot instead of re-running your initialization code.
```yaml
# serverless.yml configuration
functions:
  myFunction:
    handler: handler.main
    runtime: python3.13
    snapStart: true
    environment:
      POWERTOOLS_SERVICE_NAME: my-service
```

Enable it in your SAM or Serverless Framework config. For Python functions, this typically reduces cold starts from ~300ms to ~50-80ms.
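For SAM, the equivalent looks like this; SnapStart only applies to published versions, so you need an alias via `AutoPublishAlias` (resource names here are illustrative):

```yaml
# template.yaml (AWS SAM)
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: handler.main
      Runtime: python3.13
      AutoPublishAlias: live  # SnapStart applies to published versions only
      SnapStart:
        ApplyOn: PublishedVersions
```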
2. Minimize Your Deployment Package
Every megabyte in your deployment package adds to cold start time. In 2026, Lambda supports container images up to 10GB, but that doesn’t mean you should use them carelessly.
```
# Use Lambda layers for shared dependencies

# requirements-layer.txt (goes into a layer)
boto3==1.35.0
requests==2.32.0
pydantic==2.10.0

# requirements.txt (your function - keep it tiny)
# Only function-specific deps here
```

Pro tips for package optimization:
- Use `lambda-layer-builder` to strip unnecessary files (`.pyc`, tests, docs)
- Replace heavy SDKs with lightweight alternatives (e.g., `httpx` instead of `requests` + `urllib3`)
- Use tree-shaking with esbuild for Node.js functions
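If you'd rather not depend on a dedicated tool, the same stripping can be done by hand before zipping. A rough sketch — the directory layout and the set of pruned names are assumptions; audit your own dependencies before deleting anything:

```python
import pathlib
import shutil

# Directory names assumed safe to drop from a layer build dir
PRUNE_DIRS = {"__pycache__", "tests", "docs"}

def prune_layer(root: str) -> int:
    """Remove bytecode caches, tests, and docs under root; return items removed."""
    removed = 0
    # Deepest paths first, so files inside pruned dirs are handled before the dirs
    for path in sorted(pathlib.Path(root).rglob("*"), reverse=True):
        if path.is_dir() and path.name in PRUNE_DIRS:
            shutil.rmtree(path)
            removed += 1
        elif path.is_file() and path.suffix == ".pyc":
            path.unlink()
            removed += 1
    return removed
```

Run it against the layer staging directory (e.g. `prune_layer("python")`) right before creating the zip.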
```javascript
// esbuild config for minimal Node.js Lambda bundle
import { build } from 'esbuild';

await build({
  entryPoints: ['src/handler.ts'],
  bundle: true,
  minify: true,
  platform: 'node',
  target: 'node22',
  outfile: 'dist/handler.js',
  external: ['@aws-sdk/*'], // AWS SDK v3 is included in the runtime
  treeShaking: true,
});
```

3. Lazy-Load Heavy Dependencies
Don’t import everything at the module level. Defer heavy imports until they’re actually needed:
```python
import json
import os

def handler(event, context):
    action = event.get('action')
    if action == 'analyze':
        # Heavy import only when needed
        import pandas as pd
        import numpy as np
        return analyze_data(pd, np, event['data'])
    elif action == 'notify':
        # Light path - no heavy imports
        import urllib.request
        return send_notification(event['message'])
```

This pattern ensures that requests hitting the lightweight code path never pay the cost of importing heavy libraries.
4. Provisioned Concurrency with Auto-Scaling
For production workloads where cold starts are unacceptable, provisioned concurrency keeps execution environments warm. In 2026, AWS has improved auto-scaling for provisioned concurrency:
```yaml
# CloudFormation template
MyFunctionAlias:
  Type: AWS::Lambda::Alias
  Properties:
    FunctionName: !Ref MyFunction
    FunctionVersion: !GetAtt MyFunction.Version
    Name: prod
    ProvisionedConcurrencyConfig:
      ProvisionedConcurrentExecutions: 5

ScalingTarget:
  Type: AWS::ApplicationAutoScaling::ScalableTarget
  Properties:
    MaxCapacity: 50
    MinCapacity: 5
    ResourceId: !Sub function:${MyFunction}:prod
    ScalableDimension: lambda:function:ProvisionedConcurrency
    ServiceNamespace: lambda

ScalingPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: provisioned-concurrency-tracking
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref ScalingTarget
    TargetTrackingScalingPolicyConfiguration:
      TargetValue: 0.7
      PredefinedMetricSpecification:
        PredefinedMetricType: LambdaProvisionedConcurrencyUtilization
```

5. Connection Pooling & SDK Client Reuse
One of the most common cold start mistakes is initializing SDK clients inside the handler. Always initialize them outside:
```typescript
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, GetCommand } from '@aws-sdk/lib-dynamodb';

// Initialize OUTSIDE the handler - reused across invocations
const client = new DynamoDBClient({
  maxAttempts: 2,
  requestHandler: {
    connectionTimeout: 3000,
    socketTimeout: 3000,
  },
});
const docClient = DynamoDBDocumentClient.from(client);

export const handler = async (event) => {
  const result = await docClient.send(
    new GetCommand({
      TableName: process.env.TABLE_NAME,
      Key: { pk: event.id },
    })
  );
  return { statusCode: 200, body: JSON.stringify(result.Item) };
};
```

6. Choose the Right Memory (and CPU) Configuration
Lambda allocates CPU proportionally to memory. More memory means faster initialization. Use AWS Lambda Power Tuning to find the sweet spot:
```shell
# Deploy the power tuning state machine
sam deploy --template-file powertuning.yml --stack-name power-tuning

# Run it against your function
aws stepfunctions start-execution \
  --state-machine-arn arn:aws:states:us-east-1:123456789:stateMachine:powerTuning \
  --input '{
    "lambdaARN": "arn:aws:lambda:us-east-1:123456789:function:my-func",
    "powerValues": [128, 256, 512, 1024, 1769, 2048],
    "num": 50,
    "payload": "{}",
    "parallelInvocation": true
  }'
```

Often, bumping from 128MB to 512MB reduces cold start time by 40-60% and barely increases cost, because the function finishes faster.
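The cost claim is easy to check with arithmetic: Lambda bills memory × duration, so if 4× the memory (and CPU) cuts duration roughly 4×, the price is nearly unchanged. A sketch using the published x86 per-GB-second rate — treat both the rate and the example timings as assumptions, and verify against current pricing:

```python
# Published x86 Lambda compute rate at the time of writing (USD per GB-second)
PRICE_PER_GB_SECOND = 0.0000166667

def invocation_cost(memory_mb: int, duration_ms: float) -> float:
    """Billed compute cost of one invocation (excludes the per-request charge)."""
    return (memory_mb / 1024) * (duration_ms / 1000) * PRICE_PER_GB_SECOND

slow = invocation_cost(128, 800)   # CPU-starved at 128MB
fast = invocation_cost(512, 210)   # 4x the CPU often cuts duration close to 4x
# fast is only a few percent more expensive despite finishing ~4x sooner
```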
7. Use ARM64 (Graviton3) Architecture
Graviton3-based Lambda functions (arm64) offer up to 20% better price-performance and noticeably faster cold starts compared to x86_64:
```yaml
# Switch to ARM64 in your config
functions:
  myFunction:
    handler: handler.main
    runtime: python3.13
    architecture: arm64  # Graviton3 in 2026
    memorySize: 512
```

Most Python and Node.js packages work seamlessly on ARM64. For compiled dependencies, ensure your build pipeline targets the correct architecture.
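One way to catch an architecture mismatch before deploying is to inspect the native extensions in your build output: every ELF `.so` records its target machine in its header. A rough checker — it assumes little-endian ELF, which covers both Lambda architectures:

```python
import pathlib
import struct

# ELF e_machine values for the two Lambda architectures
EM_X86_64 = 62
EM_AARCH64 = 183

def elf_machine(data: bytes):
    """Return the e_machine field of an ELF image, or None if not ELF."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        return None
    # e_machine is a 16-bit field at byte offset 18 of the ELF header
    return struct.unpack_from("<H", data, 18)[0]

def mismatched_libs(build_dir: str, expected: int = EM_AARCH64):
    """List native extensions under build_dir that don't target the expected machine."""
    return [
        str(so)
        for so in pathlib.Path(build_dir).rglob("*.so")
        if elf_machine(so.read_bytes()[:20]) not in (None, expected)
    ]
```

Run `mismatched_libs("package/")` in CI after the build step; a non-empty result means an x86_64 wheel slipped into an arm64 bundle (or vice versa).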
Benchmarking Your Cold Starts
Measure before and after with CloudWatch Logs Insights:
```
filter @type = "REPORT"
| stats
    avg(@initDuration) as avg_cold_start,
    max(@initDuration) as max_cold_start,
    pct(@initDuration, 99) as p99_cold_start,
    count(@initDuration) as cold_start_count
| sort avg_cold_start desc
```

Wrapping Up
In 2026, there’s no excuse for slow Lambda cold starts. Between SnapStart’s expanded runtime support, smarter provisioned concurrency auto-scaling, and Graviton3 performance gains, you can realistically achieve sub-100ms cold starts across all major runtimes. Start with SnapStart and package optimization — they’re free and deliver the biggest wins. Then layer on provisioned concurrency for latency-critical paths.
The serverless tax on latency is shrinking every year. Apply these techniques, measure with CloudWatch Insights, and your users won’t even know they’re hitting a serverless backend.
