How to Design a URL Shortener Like

2025-08-27

URL shorteners like Bit.ly or TinyURL look simple — take a long URL and return a short link. But at scale, they serve billions of redirects per day and need to stay fast, reliable, and secure. Here’s a step-by-step system design guide you can use in interviews.


Step 1: Clarify Requirements

Functional

  • Shorten a long URL into a unique short code
  • Redirect short URL → original long URL
  • Handle custom aliases (optional)
  • Track analytics (clicks, location, referrer)

Non-functional

  • Low latency (redirect < 50ms)
  • High availability (99.9%+)
  • Handle billions of requests
  • Prevent abuse (spam, malicious links)

Step 2: Define APIs

POST /shorten
Body: { "url": "https://example.com/abc" }
Resp: { "short_url": "https://sho.rt/xyz123" }

GET /:code
→ 301 redirect to the original URL (browsers cache 301s, so use 302 if every click must reach the server for analytics)

POST /custom
Body: { "url": "...", "alias": "promo2025" }
Resp: { "short_url": "https://sho.rt/promo2025" } (or an error if the alias is already taken)

GET /stats/:code
→ analytics data
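
To make the contract concrete, here is a minimal in-memory sketch of the shorten, custom, and redirect endpoints as plain functions that return a status and a body. The status codes for errors (400, 404, 409), the 201 on creation, and the placeholder code generator are assumptions for illustration; the sho.rt domain comes from the examples above.

```python
import itertools

BASE_URL = "https://sho.rt"      # example domain from the API spec above
REDIRECT_STATUS = 301            # swap for 302 if clicks must always hit the server

_counter = itertools.count(1)    # placeholder ID source (assumption)
_store = {}                      # short code or alias -> long URL


def _valid(url):
    """Accept only absolute http(s) URLs."""
    return isinstance(url, str) and url.startswith(("http://", "https://"))


def shorten(body):
    """POST /shorten"""
    url = body.get("url")
    if not _valid(url):
        return 400, {"error": "url must start with http:// or https://"}
    code = f"c{next(_counter)}"
    while code in _store:        # skip codes already taken by a custom alias
        code = f"c{next(_counter)}"
    _store[code] = url
    return 201, {"short_url": f"{BASE_URL}/{code}"}


def custom(body):
    """POST /custom"""
    url, alias = body.get("url"), body.get("alias")
    if not _valid(url) or not alias:
        return 400, {"error": "url and alias are required"}
    if alias in _store:
        return 409, {"error": "alias already taken"}
    _store[alias] = url
    return 201, {"short_url": f"{BASE_URL}/{alias}"}


def redirect(code):
    """GET /:code"""
    url = _store.get(code)
    if url is None:
        return 404, {"error": "unknown code"}
    return REDIRECT_STATUS, {"Location": url}
```

In an interview it is usually enough to name the error cases (invalid URL, alias conflict, unknown code) rather than write handlers like these.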

Step 3: Data Model

Table: urls

  • id (primary key, unique integer or Snowflake ID)
  • short_code (string, unique, 6–8 chars)
  • long_url (text)
  • created_at (timestamp)
  • click_count (int, updated from click events; batch or async updates keep the redirect path fast)
  • metadata (optional JSON for analytics)

Indexes:

  • short_code unique index
  • long_url index (optional for deduplication)
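
As a rough sketch, the table and indexes above could look like this in SQLite via Python's standard sqlite3 module. SQLite is only a stand-in for whatever database you pick, and the column types, defaults, and helper names (open_db, insert_url, find_long_url) are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE urls (
    id          INTEGER PRIMARY KEY,
    short_code  TEXT    NOT NULL UNIQUE,   -- UNIQUE also creates the lookup index
    long_url    TEXT    NOT NULL,
    created_at  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    click_count INTEGER NOT NULL DEFAULT 0,
    metadata    TEXT                        -- optional JSON blob
);
-- Optional: only needed if identical long URLs should be deduplicated.
CREATE INDEX idx_urls_long_url ON urls (long_url);
"""


def open_db(path=":memory:"):
    """Open a database and create the schema (in-memory by default)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def insert_url(conn, short_code, long_url):
    """Store a mapping; raises sqlite3.IntegrityError on a duplicate code."""
    cur = conn.execute(
        "INSERT INTO urls (short_code, long_url) VALUES (?, ?)",
        (short_code, long_url),
    )
    conn.commit()
    return cur.lastrowid


def find_long_url(conn, short_code):
    """Return the long URL for a code, or None if it does not exist."""
    row = conn.execute(
        "SELECT long_url FROM urls WHERE short_code = ?", (short_code,)
    ).fetchone()
    return row[0] if row else None
```

Letting the unique constraint reject duplicates is what makes custom aliases and random codes safe under concurrent writes.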

Step 4: Core Components

  • API Servers – handle shorten/redirect requests (stateless)
  • Database – persistent store (SQL or NoSQL, sharded at scale)
  • Cache Layer – Redis/Memcached for fast lookup of popular codes
  • CDN – cache redirect responses at edge locations so popular links resolve close to users
  • Analytics Pipeline – log clicks asynchronously via Kafka/stream workers

Step 5: Generating Short Codes

Options:

  1. Hashing (MD5/SHA-256) → take first 6–8 chars (collision handling needed).
  2. Base62 encoding of IDs → sequential IDs encoded as alphanumeric (fast, no collisions).
  3. Random strings → generate with uniqueness check in DB.

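Preferred in interviews: Base62 encoding (0-9, A-Z, a-z) of a unique numeric ID. Each ID maps to exactly one code, so uniqueness comes from the ID generator, and codes stay compact: 7 characters cover about 3.5 trillion IDs.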
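
A minimal Base62 sketch, assuming the 0-9, A-Z, a-z digit order listed above. Other orderings are equally valid but produce different codes for the same ID, so the alphabet has to be fixed once and never changed.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BASE = len(ALPHABET)  # 62
_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode(n):
    """Convert a non-negative integer ID into a Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))  # most significant digit first


def decode(code):
    """Convert a Base62 string back into the integer ID."""
    n = 0
    for ch in code:
        n = n * BASE + _INDEX[ch]  # KeyError on characters outside the alphabet
    return n
```

For example, encode(61) gives "z", encode(62) gives "10", and decode reverses both. Because decode recovers the ID, the database lookup can use the primary key directly.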


Step 6: Handling Redirects (Read Path)

  1. User clicks short link → request hits CDN/load balancer.
  2. Lookup short code in cache (Redis).
  3. If miss, fallback to DB, then warm the cache.
  4. Return 301 Redirect to the original URL.
  5. Send click event asynchronously to analytics pipeline.
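
The read path above is the cache-aside pattern. Here is a small sketch using plain dicts as stand-ins for Redis and the database and a deque as a stand-in for the analytics queue; the class and field names are hypothetical.

```python
from collections import deque


class RedirectService:
    def __init__(self, db):
        self.db = db            # stand-in for the database: code -> long URL
        self.cache = {}         # stand-in for Redis
        self.events = deque()   # stand-in for the analytics queue

    def resolve(self, code):
        """Return (status, long_url) for a short code."""
        url = self.cache.get(code)
        cache_hit = url is not None
        if not cache_hit:
            url = self.db.get(code)      # cache miss: fall back to the DB
            if url is None:
                return 404, None
            self.cache[code] = url       # warm the cache for later requests
        # Enqueue only; the redirect never waits on analytics processing.
        self.events.append({"code": code, "cache_hit": cache_hit})
        return 301, url
```

The first request for a code pays the database lookup, and repeat requests are served from the cache.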

Step 7: Scaling the System

  • Caching: Hot short links stay in Redis for sub-millisecond lookups.
  • Sharding: Partition DB by short code hash or ID range.
  • Replication: Read replicas to scale redirect lookups.
  • CDN Edge nodes: Serve redirects close to users worldwide.
  • Queue + Stream (Kafka/Pulsar): Click events processed asynchronously so redirects stay fast.
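
For sharding by short code hash, a simple sketch is a stable hash taken modulo the shard count. The function name and the choice of SHA-256 are assumptions; any hash that is stable across processes would do.

```python
import hashlib


def shard_for(code, num_shards):
    """Map a short code to a shard index in [0, num_shards)."""
    # Python's built-in hash() is salted per process, so use a stable digest.
    digest = hashlib.sha256(code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo moves most keys when the shard count changes, so it is worth mentioning consistent hashing or a fixed number of virtual shards as the way to reshard without a full migration.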

Step 8: Analytics Pipeline

  • Click logs pushed to Kafka.
  • Stream processors aggregate counts, geolocation, device info.
  • Results stored in an OLAP DB (like ClickHouse, BigQuery, or Redshift).
  • GET /stats/:code queries aggregated data.
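
The aggregation step can be pictured as a fold over click events. This sketch runs over an in-memory list instead of a Kafka stream, and the event fields (code, country, referrer) are assumed names, not a fixed schema.

```python
from collections import Counter, defaultdict


def aggregate(events):
    """Roll click events up into per-code totals and breakdowns."""
    stats = defaultdict(
        lambda: {"clicks": 0, "by_country": Counter(), "by_referrer": Counter()}
    )
    for event in events:
        entry = stats[event["code"]]
        entry["clicks"] += 1
        entry["by_country"][event.get("country", "unknown")] += 1
        entry["by_referrer"][event.get("referrer", "direct")] += 1
    return dict(stats)
```

A real stream processor would do the same thing per time window and write the results to the OLAP store that GET /stats/:code reads from.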

Step 9: Security and Abuse Prevention

  • Block known-malicious domains (check a blocklist when a link is created).
  • Rate limit link creation per user/IP.
  • ReCAPTCHA for anonymous shorten requests.
  • Monitoring for unusual redirect traffic.
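
Two of these checks are easy to sketch: a domain blocklist and a per-client rate limit. The blocked domain, the limits, and the fixed-window algorithm are all illustrative assumptions; sliding windows or token buckets are common alternatives.

```python
import time
from urllib.parse import urlsplit

BLOCKED_DOMAINS = {"malware.example"}  # hypothetical blocklist entry


def is_blocked(url):
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each time window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counts = {}  # client -> (window index, requests seen)

    def allow(self, client):
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self.counts.get(client, (window, 0))
        if seen_window != window:   # a new window started: reset the counter
            seen_window, count = window, 0
        if count >= self.limit:
            return False
        self.counts[client] = (seen_window, count + 1)
        return True
```

The client key would be a user ID or IP address. In a multi-server deployment the counters would need to live in a shared store rather than in process memory.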

Step 10: Fault Tolerance & Reliability

  • All services stateless → scale horizontally.
  • Cache with eviction policy and fallback to DB.
  • DB replication and automated failover.
  • Logs written to durable storage before analytics processing.
  • Graceful degradation: if analytics pipeline fails, redirects should still work.
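
Graceful degradation can be as simple as never letting the analytics write block or fail the redirect. In this sketch a bounded in-process queue stands in for the producer buffer, and dropping events when it is full is one possible policy (an assumption here). Spilling to local disk is the alternative if every click must be kept.

```python
import queue


class Redirector:
    def __init__(self, urls, max_pending=10_000):
        self.urls = urls                                  # code -> long URL
        self.pending = queue.Queue(maxsize=max_pending)   # click-event buffer
        self.dropped = 0                                  # events lost while degraded

    def _record_click(self, code):
        try:
            self.pending.put_nowait({"code": code})       # never blocks
        except queue.Full:
            self.dropped += 1                             # degrade: count and move on

    def redirect(self, code):
        url = self.urls.get(code)
        if url is None:
            return 404, None
        self._record_click(code)   # cannot raise or stall the response
        return 301, url
```

Exposing the dropped counter as a metric lets monitoring show when the pipeline is falling behind.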

Example Walkthrough

  • Alice shortens https://longsite.com/article.
  • API generates ID=123456 → Base62(123456) = W7E (using the 0-9, A-Z, a-z alphabet).
  • Stores mapping: W7E → https://longsite.com/article.
  • Alice shares https://sho.rt/W7E.
  • Bob clicks link → lookup in Redis → redirect in <10ms.
  • Click logged to Kafka for later analytics.

Quick Tradeoffs to Mention

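  • SQL vs NoSQL: SQL is fine early on; at billions of records you need either a sharded SQL setup or a NoSQL key-value store, since the access pattern is a simple lookup by code.
  • Code length: 6 Base62 chars give about 56.8 billion codes. Random or hashed codes start colliding long before that space fills up; codes derived from sequential IDs never collide but eventually need a 7th character.
  • Consistency vs Availability: Eventual consistency is OK for analytics, but a newly created link should resolve immediately (read-your-writes), so avoid serving redirect lookups from lagging replicas right after creation.
  • TTL on unused links: Expiring old links saves storage but breaks any copies still in circulation.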
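
The code-length tradeoff is easy to back with arithmetic. This sketch computes the size of the code space and, for randomly generated codes only, the standard birthday approximation of the chance that at least two codes collide. The function names are hypothetical.

```python
import math


def capacity(length, base=62):
    """Number of distinct codes of exactly `length` characters."""
    return base ** length


def collision_probability(n_codes, length, base=62):
    """Birthday approximation: P(any collision) among n random codes."""
    space = capacity(length, base)
    # 1 - exp(-n(n-1) / 2N), computed with expm1 for small probabilities.
    return -math.expm1(-n_codes * (n_codes - 1) / (2 * space))
```

With 6 characters there are 56,800,235,584 possible codes, yet a million random codes already make at least one collision nearly certain, which is why random generation needs a uniqueness check. Codes encoded from sequential IDs avoid the problem entirely.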

Final Note

A URL shortener is a classic system design problem because it covers IDs, caching, scaling, and tradeoffs. Keep your answer structured: requirements → API → data model → components → scaling → tradeoffs.
