How to Design a URL Shortener Like
2025-08-27
URL shorteners like Bit.ly or TinyURL look simple — take a long URL and return a short link. But at scale, they serve billions of redirects per day and need to stay fast, reliable, and secure. Here’s a step-by-step system design guide you can use in interviews.
Step 1: Clarify Requirements
Functional
- Shorten a long URL into a unique short code
- Redirect short URL → original long URL
- Handle custom aliases (optional)
- Track analytics (clicks, location, referrer)
Non-functional
- Low latency (redirect < 50ms)
- High availability (99.9%+)
- Handle billions of requests
- Prevent abuse (spam, malicious links)
Step 2: Define APIs
POST /shorten
Body: { "url": "https://example.com/abc" }
Resp: { "short_url": "https://sho.rt/xyz123" }
GET /:code
→ 301 Redirect to original URL
POST /custom
Body: { "url": "...", "alias": "promo2025" }
GET /stats/:code
→ analytics data
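To make the endpoint contracts concrete, here is a minimal in-memory sketch with no HTTP framework. The class name `ShortenerAPI`, the counter-based codes, and the 201/409/404 status codes are assumptions made for illustration; only the paths, the payload shapes, and the 301 come from the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

BASE_URL = "https://sho.rt"  # domain used in the example response above


@dataclass
class ShortenerAPI:
    """In-memory stand-in for the endpoints (hypothetical, not a real service)."""
    links: Dict[str, str] = field(default_factory=dict)
    next_id: int = 1

    def post_shorten(self, url: str) -> Tuple[int, dict]:
        # POST /shorten: hand out the next free counter-based code (assumption).
        while True:
            code = f"c{self.next_id}"
            self.next_id += 1
            if code not in self.links:
                break
        self.links[code] = url
        return 201, {"short_url": f"{BASE_URL}/{code}"}

    def post_custom(self, url: str, alias: str) -> Tuple[int, dict]:
        # POST /custom: reject an alias that is already taken.
        if alias in self.links:
            return 409, {"error": "alias already in use"}
        self.links[alias] = url
        return 201, {"short_url": f"{BASE_URL}/{alias}"}

    def get_code(self, code: str) -> Tuple[int, dict]:
        # GET /:code: redirect if the code is known, otherwise 404.
        long_url = self.links.get(code)
        if long_url is None:
            return 404, {"error": "unknown code"}
        return 301, {"Location": long_url}
```

In an interview, it is usually enough to state the status codes and the conflict case for custom aliases; the handler bodies are secondary.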
Step 3: Data Model
Table: urls
- id (primary key, unique integer or Snowflake ID)
- short_code (string, unique, 6–8 chars)
- long_url (text)
- created_at (timestamp)
- click_count (int, incremented on each redirect)
- metadata (optional JSON for analytics)
Indexes:
- short_code: unique index
- long_url: index (optional, for deduplication)
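The table and indexes above could be expressed roughly as follows. This is a sketch using SQLite through Python's standard `sqlite3` module purely for illustration; the column types, the index names, and the helper functions `open_db`, `insert_url`, and `find_long_url` are assumptions, and a production system would likely use a different database.

```python
import sqlite3
from typing import Optional

SCHEMA = """
CREATE TABLE urls (
    id          INTEGER PRIMARY KEY,
    short_code  TEXT NOT NULL,
    long_url    TEXT NOT NULL,
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    click_count INTEGER NOT NULL DEFAULT 0,
    metadata    TEXT  -- optional JSON blob for analytics
);
CREATE UNIQUE INDEX idx_urls_short_code ON urls (short_code);
CREATE INDEX idx_urls_long_url ON urls (long_url);  -- optional, for deduplication
"""


def open_db() -> sqlite3.Connection:
    # In-memory database so the sketch runs without any setup.
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def insert_url(conn: sqlite3.Connection, short_code: str, long_url: str) -> int:
    # Raises sqlite3.IntegrityError if short_code already exists.
    cur = conn.execute(
        "INSERT INTO urls (short_code, long_url) VALUES (?, ?)",
        (short_code, long_url),
    )
    conn.commit()
    return cur.lastrowid


def find_long_url(conn: sqlite3.Connection, short_code: str) -> Optional[str]:
    row = conn.execute(
        "SELECT long_url FROM urls WHERE short_code = ?", (short_code,)
    ).fetchone()
    return row[0] if row else None
```

The unique index on short_code is what turns a duplicate code into a hard error instead of a silent overwrite.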
Step 4: Core Components
- API Servers – handle shorten/redirect requests (stateless)
- Database – persistent store (SQL or NoSQL, sharded at scale)
- Cache Layer – Redis/Memcached for fast lookup of popular codes
- CDN – serve redirects from edge locations worldwide, with edge caching for popular links
- Analytics Pipeline – log clicks asynchronously via Kafka/stream workers
Step 5: Generating Short Codes
Options:
- Hashing (MD5/SHA-256) → take first 6–8 chars (collision handling needed).
- Base62 encoding of IDs → sequential IDs encoded as alphanumeric (fast, no collisions).
- Random strings → generate with uniqueness check in DB.
Preferred in interviews: Base62 (0-9, A-Z, a-z). Because every ID is unique, its Base62 encoding is unique too, and the resulting codes stay compact.
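A minimal sketch of the Base62 option, assuming the 0-9, A-Z, a-z alphabet order given above and non-negative integer IDs. The function names `encode_base62` and `decode_base62` are illustrative.

```python
import string

# Alphabet order follows the text: digits, then uppercase, then lowercase.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)  # 62


def encode_base62(n: int) -> str:
    """Encode a non-negative integer ID as a Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Inverse of encode_base62; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

For example, `encode_base62(61)` gives `"z"` and `encode_base62(62)` gives `"10"`. Since encoding is a bijection on non-negative integers, no uniqueness check against the database is needed as long as the IDs themselves are unique.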
Step 6: Handling Redirects (Read Path)
- User clicks short link → request hits CDN/load balancer.
- Lookup short code in cache (Redis).
- If miss, fallback to DB, then warm the cache.
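- Return a 301 redirect to the original URL. Browsers may cache a 301 and skip the shortener on repeat visits, so a 302 is the safer choice when every click must be counted.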
- Send click event asynchronously to analytics pipeline.
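The read path above is a cache-aside lookup, which can be sketched as below. Plain dictionaries stand in for Redis and the database, and `publish_click` is a hypothetical callback standing in for the asynchronous analytics producer; none of these names come from a real library.

```python
from typing import Callable, Dict, Optional, Tuple


class RedirectService:
    def __init__(self, db: Dict[str, str], publish_click: Callable[[str], None]):
        self.db = db                      # stand-in for the urls table
        self.cache: Dict[str, str] = {}   # stand-in for Redis
        self.publish_click = publish_click

    def handle(self, code: str) -> Tuple[int, Optional[str]]:
        # 1. Try the cache first.
        long_url = self.cache.get(code)
        if long_url is None:
            # 2. On a miss, fall back to the database.
            long_url = self.db.get(code)
            if long_url is None:
                return 404, None
            # 3. Warm the cache for the next request.
            self.cache[code] = long_url
        # 4. Emit the click event; in a real system this would be non-blocking.
        self.publish_click(code)
        return 301, long_url
```

Unknown codes are not cached in this sketch, so repeated requests for a missing code always reach the database.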
Step 7: Scaling the System
- Caching: Hot short links stay in Redis for sub-millisecond lookups.
- Sharding: Partition DB by short code hash or ID range.
- Replication: Read replicas to scale redirect lookups.
- CDN Edge nodes: Serve redirects close to users worldwide.
- Queue + Stream (Kafka/Pulsar): Click events processed asynchronously so redirects stay fast.
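As a rough illustration of partitioning by short code hash, the sketch below maps a code to a shard index. The function name `shard_for` and the choice of SHA-256 with a simple modulo are assumptions; a stable hash is used because Python's built-in `hash()` for strings varies between processes.

```python
import hashlib


def shard_for(code: str, num_shards: int) -> int:
    """Map a short code to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(code.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo sharding moves most keys when the shard count changes, which is worth mentioning as a limitation if the interviewer asks about resharding.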
Step 8: Analytics Pipeline
- Click logs pushed to Kafka.
- Stream processors aggregate counts, geolocation, device info.
- Results stored in an OLAP DB (like ClickHouse, BigQuery, or Redshift).
- GET /stats/:code queries aggregated data.
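The aggregation step can be sketched in a few lines. Here an in-memory list stands in for the Kafka topic, and the `ClickEvent` fields and the `aggregate_clicks` function are hypothetical names chosen for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class ClickEvent:
    code: str
    country: str
    referrer: str


def aggregate_clicks(events: Iterable[ClickEvent]) -> Dict[str, dict]:
    """Roll raw click events up into per-code totals and breakdowns."""
    stats = defaultdict(
        lambda: {"clicks": 0, "by_country": Counter(), "by_referrer": Counter()}
    )
    for event in events:
        entry = stats[event.code]
        entry["clicks"] += 1
        entry["by_country"][event.country] += 1
        entry["by_referrer"][event.referrer] += 1
    return dict(stats)
```

A stats endpoint would then read the entry for one code from the aggregated store instead of scanning raw click logs.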
Step 9: Security and Abuse Prevention
- Blacklist malicious domains.
- Rate limit link creation per user/IP.
- reCAPTCHA for anonymous shorten requests.
- Monitoring for unusual redirect traffic.
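Two of these checks are simple enough to sketch: a domain blocklist and a per-key rate limit. The blocked domain, the fixed-window algorithm, and the names `is_blocked` and `FixedWindowLimiter` are assumptions for illustration; the clock is passed in explicitly so the behavior is easy to test.

```python
from typing import Dict, Tuple
from urllib.parse import urlsplit

BLOCKED_DOMAINS = {"malware.example"}  # hypothetical blocklist entry


def is_blocked(url: str) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each fixed time window."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts: Dict[str, Tuple[int, int]] = {}  # key -> (window index, count)

    def allow(self, key: str, now: float) -> bool:
        window = int(now // self.window_seconds)
        start, count = self.counts.get(key, (window, 0))
        if start != window:
            # A new window has begun, so reset the counter.
            start, count = window, 0
        if count >= self.limit:
            self.counts[key] = (start, count)
            return False
        self.counts[key] = (start, count + 1)
        return True
```

The key would typically be a user ID or client IP. A fixed window allows short bursts at window boundaries, which a sliding window or token bucket would smooth out.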
Step 10: Fault Tolerance & Reliability
- All services stateless → scale horizontally.
- Cache with eviction policy and fallback to DB.
- DB replication and automated failover.
- Logs written to durable storage before analytics processing.
- Graceful degradation: if analytics pipeline fails, redirects should still work.
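Graceful degradation for analytics can be as small as a guarded call. In this sketch, `lookup` and `publish_click` are hypothetical callables standing in for the cache/DB lookup and the analytics producer; the point is only that a publish failure is logged and swallowed instead of failing the redirect.

```python
import logging
from typing import Callable, Optional, Tuple

logger = logging.getLogger("redirect")


def redirect_with_best_effort_analytics(
    code: str,
    lookup: Callable[[str], Optional[str]],
    publish_click: Callable[[str], None],
) -> Tuple[int, Optional[str]]:
    long_url = lookup(code)
    if long_url is None:
        return 404, None
    try:
        publish_click(code)
    except Exception:
        # Analytics is non-critical: record the failure and keep serving.
        logger.warning("analytics publish failed for %s", code, exc_info=True)
    return 301, long_url
```

The lookup itself is deliberately left outside the guard, since a failed lookup is a real error for the redirect path.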
Example Walkthrough
- Alice shortens https://longsite.com/article.
- API generates ID=123456 → Base62(123456) = W7E (using the 0-9, A-Z, a-z alphabet).
- Stores mapping: W7E → https://longsite.com/article.
- Alice shares https://sho.rt/W7E.
. - Bob clicks link → lookup in Redis → redirect in <10ms.
- Click logged to Kafka for later analytics.
Quick Tradeoffs to Mention
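- SQL vs NoSQL: SQL is fine early; at billions of records, a sharded SQL setup or a NoSQL key-value store becomes the practical choice.
- Code length: 6 Base62 chars give 62^6 ≈ 56.8 billion codes. Shorter codes run out sooner and, with hashed or random generation, collide more often at scale.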
- Consistency vs Availability: Eventual consistency OK for analytics, but redirects must be strongly consistent.
- TTL on unused links: Expire old links to save storage.
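The code-length tradeoff is easy to quantify. The sketch below computes the size of the code space and, for hashed or random codes, a birthday-bound approximation of the chance that at least two codes collide. The function names are illustrative, and the approximation assumes codes are drawn uniformly at random.

```python
import math


def code_space(length: int, alphabet_size: int = 62) -> int:
    """Number of distinct codes of exactly `length` characters."""
    return alphabet_size ** length


def collision_probability(num_codes: int, length: int, alphabet_size: int = 62) -> float:
    """Approximate chance that `num_codes` uniformly random codes contain a collision."""
    space = code_space(length, alphabet_size)
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)).
    return -math.expm1(-num_codes * (num_codes - 1) / (2 * space))
```

`code_space(6)` is 56,800,235,584 and `code_space(7)` is 3,521,614,606,208. With random 6-character codes, a collision somewhere among the first million links is close to certain, which is why random or hashed generation needs a uniqueness check while Base62-encoded sequential IDs do not.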
Final Note
A URL shortener is a classic system design problem because it covers IDs, caching, scaling, and tradeoffs. Keep your answer structured: requirements → API → data model → components → scaling → tradeoffs.