Log Ingestion and Query
Reported by candidates from Datadog's online assessment. Pattern, common pitfall, and the honest play if you blank under the timer.
You're building a log system for Datadog in April 2026. This is a storage-and-filter problem disguised as systems design. You'll ingest logs with timestamp, service, level, and message, then answer queries that filter on any or all of those fields plus substring matching on the message. The trick is efficient lookup: hash on the exact match fields, then linear scan the candidates for substring. StealthCoder will catch the schema edge case if you blank on the filtering logic.
The problem
Implement a simplified logging system that supports log ingestion and conditional queries. Each operation is one of the following strings:
A log matches a query if startTs, service and level match exactly unless the filter is *, and keyword appears as a substring of the message unless the filter is *. Process all operations in order and return the outputs produced by the QUERY commands.
Function Description
Complete the function runLogIngestionQueries in the editor below. runLogIngestionQueries has the following parameter:
Returns
The three queries count all logs, then all api logs in the first two timestamps, then only the single error log whose message contains failed. Substring matching on the message field allows both successful and failed login events to match the keyword login.
Reported by candidates. Source: FastPrep
Pattern and pitfall
The pattern is a hash table with a secondary filter pass. Store logs in a map keyed by a (timestamp, service, level) tuple, or flatten them into a list and filter on demand. The catch: a '*' in the timestamp, service, or level field of a query means match any value in that field, and keyword='*' means skip the substring check. For each QUERY, iterate over the stored logs, check the exact-match fields (applying the wildcard rule), then check whether the keyword is a substring of the message (or skip that check if the keyword is '*'). Common pitfall: treating '*' as a literal string instead of a wildcard. Datadog's logging volume is huge in production, but here you're just filtering in memory. The filter pass touches every stored log once per query, so it is O(n) log checks per query plus the cost of each substring test, which is acceptable at OA scale.
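The list-and-filter version of that loop can be sketched in a few lines. This is a minimal sketch, not the reference solution: the names Log, LogStore, ingest, and query are hypothetical, timestamps are assumed to be plain strings compared for equality, and a query is assumed to return a count of matching logs. The operation string format is not reproduced here, so the sketch takes already-parsed fields.

```python
from dataclasses import dataclass

WILDCARD = "*"  # sentinel from the problem statement: the filter matches anything


@dataclass(frozen=True)
class Log:
    # Field names are assumptions for illustration.
    ts: str
    service: str
    level: str
    message: str


class LogStore:
    """Hypothetical in-memory store: append on ingest, linear scan on query."""

    def __init__(self):
        self._logs = []

    def ingest(self, ts, service, level, message):
        self._logs.append(Log(ts, service, level, message))

    def query(self, ts, service, level, keyword):
        """Count logs passing all four filters (assumed output format)."""
        count = 0
        for log in self._logs:
            # Exact-match fields: a wildcard filter skips the comparison.
            if ts != WILDCARD and log.ts != ts:
                continue
            if service != WILDCARD and log.service != service:
                continue
            if level != WILDCARD and log.level != level:
                continue
            # Keyword filter: a wildcard skips the substring test.
            if keyword != WILDCARD and keyword not in log.message:
                continue
            count += 1
        return count


store = LogStore()
store.ingest("1", "api", "INFO", "user login ok")
store.ingest("2", "api", "ERROR", "user login failed")
store.ingest("3", "db", "INFO", "connection opened")

print(store.query("*", "*", "*", "*"))           # 3
print(store.query("*", "api", "*", "login"))     # 2
print(store.query("*", "*", "ERROR", "failed"))  # 1
```

The cheap equality checks run before the substring test, so most non-matching logs are rejected without scanning their message.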
StealthCoder is the hedge for the one pattern you didn't drill. It runs invisibly during the screen share.
You can drill Log Ingestion and Query cold, or you can hedge it. StealthCoder runs invisibly during screen share and surfaces a working solution in under 2 seconds. The proctor sees the IDE. They don't see what's behind it. If you're reading this with an OA window open, you're who this was built for.
You've seen the question.
Make sure you actually pass Datadog's OA.
Datadog reuses patterns across OAs. Works on HackerRank, CodeSignal, CoderPad, and Karat.
Log Ingestion and Query FAQ
What does the asterisk wildcard actually mean?
In a query field, '*' means match any value for that field. So service='*' matches logs from any service. Timestamp and level can also be '*'. The keyword field uses '*' to skip substring matching entirely. Treat it as a sentinel, not a literal string.
Do I need a complex index structure or is a simple list okay?
A simple list of all logs works fine for the OA. On each QUERY, filter the list by checking each log's timestamp, service, and level (with wildcard rules), then check the substring. That is roughly O(n*m) per query, where n is the number of stored logs and m is the message length. No need for suffix trees or complex indexing here.
How do I handle substring matching correctly?
Use the language's built-in substring test. In Python, use 'in'. In Java, use contains, or check that indexOf returns a value other than -1. If keyword is '*', skip the substring check entirely. Case sensitivity follows the problem statement: compare strings as-is unless it says otherwise.
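Here is a small sketch of that keyword check in Python. The helper name keyword_matches is made up for illustration, and the comparison is assumed to be case-sensitive because nothing above says otherwise.

```python
def keyword_matches(keyword, message):
    """Hypothetical helper: wildcard passes everything, otherwise substring test."""
    if keyword == "*":
        return True  # sentinel: skip the substring check entirely
    return keyword in message  # case-sensitive, compares strings as-is


print(keyword_matches("login", "user login failed"))  # True
print(keyword_matches("Login", "user login failed"))  # False: case differs
print(keyword_matches("*", ""))                       # True: wildcard skips the check
```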
What's the difference between exact match and wildcard match?
Exact match means the field value must equal the query value. Wildcard means if the query value is '*', any log value passes that filter. So a log with service='web' matches a query with service='*', but not a query with service='api'.
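The web versus api case above fits in a one-line predicate. This is a sketch only, and field_matches is a hypothetical name, not something from the problem statement.

```python
def field_matches(query_value, log_value):
    """True when the query field is the '*' wildcard or equals the log's value."""
    return query_value == "*" or query_value == log_value


print(field_matches("*", "web"))    # True: wildcard accepts any service
print(field_matches("api", "web"))  # False: exact match required
print(field_matches("web", "web"))  # True
```

The same predicate works unchanged for the timestamp and level fields, since all three follow the same rule.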
Should I validate input or assume it's well-formed?
Assume the input is well-formed. Focus on the filtering logic, not error handling. Parse the operation type, extract the fields, and either store or query. No need to handle malformed commands or missing fields.