TL;DR — fix in this order
- DB query optimisation (indexes, N+1) — biggest win, lowest effort
- Caching layer (Redis) for hot reads
- Background jobs for heavy/slow ops
- CDN for static + media
- Read replicas if DB still bottlenecked
- Horizontal scaling (multi-server) — last, only when truly needed
What breaks first (almost always)
1. The N+1 query (90% of slowdowns)
A page that loads 10 items also fires 10 extra queries, one per item, to fetch related data. Works at 100 users, dies at 10K. Fix with proper joins / eager loading. Single biggest performance win for any growing app.
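A minimal sketch of the fix, using stdlib `sqlite3` as a stand-in for your real database and ORM (the `posts` / `comments` tables are hypothetical): the first function does 1 + N round trips, the second does the same work in one JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
    INSERT INTO posts VALUES (1, 'a'), (2, 'b');
    INSERT INTO comments VALUES (1, 1, 'c1'), (2, 1, 'c2'), (3, 2, 'c3');
""")

def n_plus_one():
    # N+1: one query for the page, then one more per row.
    posts = conn.execute("SELECT id, title FROM posts").fetchall()
    return [
        (title, conn.execute(
            "SELECT body FROM comments WHERE post_id = ?", (pid,)).fetchall())
        for pid, title in posts
    ]

def eager():
    # Eager load: a single JOIN, one round trip regardless of page size.
    rows = conn.execute("""
        SELECT p.id, p.title, c.body
        FROM posts p LEFT JOIN comments c ON c.post_id = p.id
        ORDER BY p.id, c.id
    """).fetchall()
    grouped = {}
    for pid, title, body in rows:
        grouped.setdefault((pid, title), []).append(body)
    return grouped
```

Most ORMs do the JOIN version for you once you ask for it (`includes` in Rails, `select_related` / `prefetch_related` in Django, `joinedload` in SQLAlchemy).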
2. Missing indexes
Every column used in a WHERE / JOIN / ORDER BY is an index candidate. EXPLAIN ANALYZE on slow queries reveals which ones are missing. 60-90% of "slow page" complaints are fixed here.
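A quick before/after sketch, again using stdlib `sqlite3` for illustration (the `orders` table and index name are hypothetical). SQLite's EXPLAIN QUERY PLAN is a rough analogue of Postgres's EXPLAIN ANALYZE: it tells you whether the query scans the whole table or hits an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")

def plan(sql: str) -> str:
    # Last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT total FROM orders WHERE user_id = 42"
before = plan(query)   # contains "SCAN": full table scan

conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
after = plan(query)    # now "USING INDEX idx_orders_user_id": an index search
```

On Postgres the workflow is the same idea: run `EXPLAIN ANALYZE`, look for `Seq Scan` on a large table, add the index, and confirm it flips to an `Index Scan`.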
3. Synchronous email / webhook calls
API endpoint waits for a transactional email to send → user sees 3-5s of latency. Move it to a background queue (Sidekiq / BullMQ / Celery). The endpoint returns instantly.
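The shape of the fix, sketched with stdlib `queue` and `threading` standing in for a real job queue like the ones above (the `signup_endpoint` function and 50ms "SMTP call" are illustrative): the request handler only enqueues, so it returns in microseconds while the worker eats the slow I/O.

```python
import queue
import threading
import time

email_queue: queue.Queue = queue.Queue()
sent_log: list = []

def worker() -> None:
    # Stand-in for a Sidekiq / BullMQ / Celery worker process.
    while True:
        job = email_queue.get()
        time.sleep(0.05)          # simulate a slow SMTP round trip
        sent_log.append(job["to"])
        email_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

def signup_endpoint(email: str) -> dict:
    # Enqueue and return immediately; the user never waits on SMTP.
    email_queue.put({"to": email})
    return {"status": "ok"}

resp = signup_endpoint("user@example.com")
email_queue.join()                # for the demo only; a real worker runs forever
```

With a real queue the worker also survives restarts and retries failures, which an in-process thread does not, so treat this as the pattern rather than the deployment.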
What NOT to optimise yet
- Microservices: a well-structured monolith is fine to 1M MAU. Splitting prematurely triples your devops burden.
- Multi-region: India + US users? A single Mumbai region serves both, with ~200ms latency for US users, which is fine for most SaaS. Multi-region pays off only for real-time apps.
- Custom load balancer / autoscaling: a Vercel / Heroku-style PaaS handles you to 100K MAU. Skip Kubernetes until you have a real need.
- NoSQL migration: Postgres handles 100K MAU fine if indexed well. Don't switch to MongoDB / DynamoDB based on theory.
When to refactor architecture
| Symptom | Likely cause | Fix |
|---|---|---|
| Pages slow at 10K MAU | N+1 queries, missing indexes | EXPLAIN + index + eager-load |
| Database CPU 80%+ at 30K MAU | Hot reads, no cache | Redis cache + read replica |
| Background jobs piling up | Single worker, slow tasks | Multiple workers + queue priority |
| Memory leaks at 50K MAU | Long-running process state | Forking / process recycling |
| Deploy-time downtime | Single instance | Blue-green deploys + load balancer |
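For the "hot reads, no cache" row, the standard pattern is cache-aside with a TTL. A minimal sketch, with a dict plus `time.monotonic` standing in for Redis (in production you'd use redis-py's `get` / `setex`; the `get_user` function and 60s TTL are illustrative):

```python
import time

_cache: dict = {}        # stand-in for Redis
TTL_SECONDS = 60
db_hits = 0              # counts round trips to the database

def fetch_user_from_db(user_id: int) -> dict:
    global db_hits
    db_hits += 1
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    entry = _cache.get(key)
    if entry and entry[1] > time.monotonic():
        return entry[0]                    # cache hit: no DB round trip
    user = fetch_user_from_db(user_id)     # cache miss: read through
    _cache[key] = (user, time.monotonic() + TTL_SECONDS)
    return user

get_user(7)
get_user(7)   # second call is served from cache; db_hits stays at 1
```

The TTL bounds staleness; for data that must never be stale, invalidate the key on write instead of waiting for expiry.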
Indian-specific scaling considerations
- UPI peak: 10am-12pm + 7-9pm see 3-5x payment volume. Cache user lookups.
- Mobile network variance: 70% of Indian users are on mobile, often on flaky 4G. Use HTTP/2, aggressive lazy loading, and smaller payloads.
- Tier-2/3 cities: latency to Mumbai region 30-80ms vs 10ms in Bangalore. Test on real devices in target geographies.
- Ad-hoc Razorpay webhook bursts: when you run a campaign, expect 100x baseline webhook volume for an hour. Queue with retry.
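The "queue with retry" advice for webhook bursts can be sketched as exponential backoff with a capped attempt count. Everything here is illustrative: `FlakyEndpoint` simulates a downstream buckling under the burst, and the delays are shortened for the demo.

```python
import time

class FlakyEndpoint:
    """Fails the first `failures` calls, then succeeds."""
    def __init__(self, failures: int):
        self.failures = failures
        self.calls = 0

    def deliver(self, payload: dict) -> None:
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("downstream timeout")

def handle_with_retry(endpoint, payload, max_attempts=5, base_delay=0.01):
    # Exponential backoff (0.01s, 0.02s, 0.04s, ...) smooths the burst
    # instead of dropping payment events on the first timeout.
    for attempt in range(max_attempts):
        try:
            endpoint.deliver(payload)
            return True
        except ConnectionError:
            time.sleep(base_delay * 2 ** attempt)
    return False  # attempts exhausted: park in a dead-letter queue for review

flaky = FlakyEndpoint(failures=2)
ok = handle_with_retry(flaky, {"event": "payment.captured"})  # True after 3 calls
```

In production the retry usually lives in the job queue itself (Sidekiq, BullMQ and Celery all ship retry-with-backoff), and the webhook handler's only job is to verify the signature and enqueue.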
Performance audit + scale refactor: ₹50K-2L. Typical 6-week engagement: profile → fix top 10 bottlenecks → load-test → handover. ROI: pages 5-10x faster, infra cost often drops 20-40%. Engagement options →
Last reviewed: 27 April 2026.
Want this built for you?
Talk to Kashvi — 30-min call, honest assessment, no pitch deck.