MEDIUM, asked at 2 companies

Web Crawler

A medium-tier problem with a 69% community acceptance rate, tagged String, Depth-First Search, and Breadth-First Search. Reported in interviews at Dropbox and Rubrik.

Founder's read

Web Crawler is a medium-difficulty problem that appears in live assessments at Dropbox and Rubrik. The acceptance rate sits at around 69 percent, which sounds manageable until you hit the implementation during a timed online assessment (OA) and realize the graph traversal isn't as straightforward as it looks. You're building a crawler that needs to explore linked pages, respect boundaries, and avoid revisiting the same URL. The trick is choosing between DFS and BFS, managing state correctly, and handling the string parsing without losing time on edge cases. If you haven't drilled the pattern and the assessment locks you in, StealthCoder runs invisibly and surfaces a working solution in seconds.

Companies asking: 2
Difficulty: MEDIUM
Acceptance: 69%

Companies that ask "Web Crawler"

Dropbox, Rubrik

If this hits your live OA

Web Crawler is the kind of problem that decides whether you pass. StealthCoder reads the problem on screen and surfaces a working solution in under 2 seconds. Invisible to screen share. The proctor sees nothing. Built by an Amazon engineer who realized the OA tests how well you memorized 200 problems, not how well you code.

What this means

The core challenge here is implementing graph traversal on a dynamically explored domain. You're given a start URL and a domain constraint, and you need to return all URLs you can reach without leaving that domain. Most candidates default to simple DFS or BFS on a pre-built graph, but the trick is that you don't have the graph upfront. You have to crawl each page, parse the HTML (usually mocked), extract links, filter by domain, and continue recursively or iteratively. Common failures happen when parsing URLs (missing protocol, trailing slashes, query parameters), when the visited set isn't properly keyed (so you revisit the same URL), or when domain filtering logic is sloppy. The String manipulation and Interactive aspects matter more here than in typical graph problems. Under time pressure in a live assessment, these details cascade. That's where StealthCoder hedges: if you blank on the URL parsing or domain-matching logic, it surfaces a complete, testable implementation.
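
The crawl loop described above can be sketched as a BFS that discovers the graph as it goes. This is a minimal sketch, not the reference solution: `get_urls` is a hypothetical callable standing in for the mocked parser, `PAGES` is made-up data, and `hostname` assumes every URL has the form `http://host/path`.

```python
from collections import deque
from typing import Callable, Dict, List


def hostname(url: str) -> str:
    # Assumes "http://host/path": the host is the third "/"-separated piece.
    return url.split("/")[2]


def crawl(start_url: str, get_urls: Callable[[str], List[str]]) -> List[str]:
    """Return every URL reachable from start_url that shares its hostname."""
    host = hostname(start_url)
    visited = {start_url}           # mark on enqueue so no URL is queued twice
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for link in get_urls(url):  # links are only known once a page is fetched
            if link not in visited and hostname(link) == host:
                visited.add(link)
                queue.append(link)
    return sorted(visited)


# Hypothetical link table standing in for the mocked parser.
PAGES: Dict[str, List[str]] = {
    "http://news.example.org/": ["http://news.example.org/a", "http://other.example.org/x"],
    "http://news.example.org/a": ["http://news.example.org/", "http://news.example.org/b"],
    "http://news.example.org/b": [],
    "http://other.example.org/x": ["http://news.example.org/c"],
}

result = crawl("http://news.example.org/", lambda u: PAGES.get(u, []))
```

Note that `http://news.example.org/c` is linked only from a page on another host, so it never enters the result: off-domain pages are filtered before they are fetched, not just before they are reported.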

Pattern tags

String, Depth-First Search, Breadth-First Search

The honest play

You know the problem. Make sure you actually pass it.

Web Crawler recycles across companies for a reason. It's medium-tier, and most candidates blank under the timer. StealthCoder is the hedge: an AI overlay invisible during screen share. It reads the problem and surfaces a working solution in under 2 seconds. Works on HackerRank, CodeSignal, CoderPad, and Karat.

Web Crawler interview FAQ

Is Web Crawler still asked at top companies?

Yes. Dropbox and Rubrik both ask it. At that acceptance rate (69 percent), it's not a filter problem, but it's still in rotation. You're more likely to see it if your background is backend or infrastructure heavy.

What's the actual trick that trips people up?

URL parsing and domain validation. Most candidates write clean DFS/BFS but fail on string edge cases: trailing slashes, query parameters, relative URLs, case sensitivity. The graph traversal is the easy part. The string work is where you lose 10 to 15 minutes.
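
To make those edge cases concrete, here is a small sketch of a hostname check built on Python's standard `urllib.parse.urlsplit`. The helper name `same_host` and the example URLs are assumptions for illustration; the version of the problem you get may guarantee a simpler URL shape.

```python
from urllib.parse import urlsplit


def same_host(url: str, start_url: str) -> bool:
    # .hostname is lowercased and excludes the port, path, and query string.
    # For a relative URL it is None, so the comparison is simply False.
    return urlsplit(url).hostname == urlsplit(start_url).hostname


start = "http://news.example.org/"

with_query = same_host("http://news.example.org/a?page=2", start)
mixed_case = same_host("http://NEWS.example.org/a/", start)
lookalike = same_host("http://news.example.org.evil.test/", start)
relative = same_host("/relative/path", start)

# A naive prefix test accepts the lookalike host, which is the sloppy
# domain filtering that costs time to debug.
naive_lookalike = "http://news.example.org.evil.test/".startswith("http://news.example.org")
```

The trailing slash is a separate question: `http://news.example.org/a` and `http://news.example.org/a/` are different strings, so a visited set keyed on the raw URL treats them as two pages. Whether they should be merged depends on the problem statement, so check before you normalize.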

Should I use DFS or BFS for this problem?

Either works algorithmically. DFS is simpler to code (recursion), but BFS can be safer under time pressure because the iterative structure is clearer. Pick whichever you're faster at; the difference is negligible.
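
For comparison, a recursive DFS version might look like the sketch below. The function name `crawl_dfs`, the `get_urls` callable, and the `GRAPH` data are hypothetical, and the hostname split assumes absolute `http://host/path` URLs.

```python
from typing import Callable, Dict, List, Set


def crawl_dfs(start_url: str, get_urls: Callable[[str], List[str]]) -> List[str]:
    host = start_url.split("/")[2]
    visited: Set[str] = set()

    def visit(url: str) -> None:
        visited.add(url)            # mark before recursing so cycles terminate
        for link in get_urls(url):
            if link not in visited and link.split("/")[2] == host:
                visit(link)

    visit(start_url)
    return sorted(visited)


# Hypothetical link graph with a cycle and one off-domain link.
GRAPH: Dict[str, List[str]] = {
    "http://docs.example.org/": ["http://docs.example.org/a", "http://docs.example.org/b"],
    "http://docs.example.org/a": ["http://docs.example.org/b", "http://cdn.example.net/lib.js"],
    "http://docs.example.org/b": ["http://docs.example.org/"],
}

found = crawl_dfs("http://docs.example.org/", lambda u: GRAPH.get(u, []))
```

One practical difference in Python: the recursive form can hit the interpreter's recursion limit (1000 frames by default) on a long chain of pages, while a queue or explicit stack has no such ceiling.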

How does the Interactive aspect change the problem?

Interactive usually means you're given a mock HtmlParser object with a getUrls() method instead of raw HTML. It abstracts away real parsing, so you focus on graph logic. Read the exact interface carefully during the OA; the details matter.
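
If you want to practice locally, a stand-in for that object is easy to write. The class below is a hypothetical stub, not the judge's implementation: only the `getUrls()` method name comes from the description above, and the `calls` log, the `crawl` helper, and the sample links are assumptions added for testing.

```python
from typing import Dict, List


class HtmlParser:
    """Hypothetical local stand-in for the parser object the judge passes in."""

    def __init__(self, links: Dict[str, List[str]]) -> None:
        self._links = links
        self.calls: List[str] = []  # every URL fetched, in order

    def getUrls(self, url: str) -> List[str]:
        self.calls.append(url)
        return list(self._links.get(url, []))


def crawl(start_url: str, parser: HtmlParser) -> List[str]:
    host = start_url.split("/")[2]  # assumes "http://host/path" URLs
    seen = {start_url}
    stack = [start_url]
    while stack:
        for link in parser.getUrls(stack.pop()):
            if link not in seen and link.split("/")[2] == host:
                seen.add(link)
                stack.append(link)
    return sorted(seen)


parser = HtmlParser({
    "http://shop.example.org/": ["http://shop.example.org/cart", "http://shop.example.org/"],
    "http://shop.example.org/cart": ["http://shop.example.org/", "http://pay.example.net/"],
})
pages = crawl("http://shop.example.org/", parser)
```

Recording the calls lets you check that each page is fetched exactly once and that off-domain pages are never fetched at all, which is a quick way to catch a badly keyed visited set.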

How long should Web Crawler take to solve under timed conditions?

If you've drilled DFS/BFS and URL handling, 25 to 35 minutes. If you're second-guessing string logic or domain filtering, it stretches to 45 to 50 minutes. That's why doing it cold during an assessment can hurt.

Frequency and company-tag data sourced from public community-maintained interview-report repos. Problem, description, and trademark © LeetCode. StealthCoder is not affiliated with LeetCode.