Filter events in a Grepr pipeline
You can use filter transforms in a Grepr pipeline to include or exclude log events based on specific conditions. Filters are configured using familiar query interfaces: SQL, or query syntaxes similar to those of Datadog, Splunk (SPL), and New Relic.
The filtering configuration options support everything from simple to complex event routing scenarios, such as applying transformations to only a subset of events or routing different types of events to the appropriate processing stages.
Filtering options
The following is a summary of the options available when configuring filters in a Grepr pipeline:
- Drop late-arriving events.
- Pass all or only some events to the next step.
- Process all or only some events using SQL before passing to a configurable pipeline step.
- Pass all records to the next pipeline step and also through the SQL processing path.
Drop late-arriving data
You can configure a maximum threshold for the difference between an event's timestamp and the time it is received. The threshold is a duration in ISO-8601 format; for example, PT10M drops any event that arrives more than ten minutes after its event timestamp. Events that exceed the threshold are dropped from the pipeline. This is useful for scenarios where timely data is critical and late-arriving events could affect analysis or reporting.
Pass events to the next pipeline step
You can configure filters to pass events to the next step in the pipeline. You can pass all events through, use a filter query to pass only events that meet specific conditions, or disable passthrough entirely and optionally route events to SQL processing instead.
Filter queries use familiar query syntax similar to that of popular observability platforms, such as Datadog, Splunk, and New Relic. To learn more about the supported query syntaxes, see Query log data in Grepr.
Process events with SQL
The SQL processing path allows you to define complex transformations using ANSI standard SQL queries. You define the SQL query to process events and the name of a view to contain the results of query processing. You can create multiple queries and views that process data sequentially, with each view able to reference input events through the logs input table or previously defined views. Based on the filter’s location in the pipeline, the results of SQL processing can be routed to specific pipeline steps, such as the JSON or Grok parsers, the Data Lake, the Reducer, or directly to the pipeline Sinks.
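For illustration, here is a minimal sketch of two views defined in sequence. The view names, the message and host fields, and the filtering conditions are all hypothetical; the fields available depend on your log schema.

```sql
-- First view, named "error_events" (hypothetical): selects error logs
-- directly from the logs input table.
SELECT *
FROM logs
WHERE message LIKE '%ERROR%'

-- Second view, named "host1_errors" (hypothetical): narrows the results
-- further by referencing the previously defined view.
SELECT *
FROM error_events
WHERE host = 'host-1'
```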
You can choose to process all events or only events that don't meet the criteria for passthrough to the next pipeline step. You can further narrow the selection using a filter query, so that SQL transformations apply only to relevant events.
You can also choose to materialize the results of query processing for reuse in subsequent views, improving performance for complex transformation scenarios.
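For example, when more than one later view reads from the same intermediate view, materializing that view lets downstream views reuse its results instead of recomputing them. A sketch, again with hypothetical names and fields:

```sql
-- Intermediate view "high_severity", with materialization enabled so its
-- results can be shared by the views that reference it.
SELECT *
FROM logs
WHERE severity > 16

-- Two later views (each defined separately) that both reference the
-- materialized high_severity view.
SELECT * FROM high_severity WHERE host = 'host-1'
SELECT * FROM high_severity WHERE host = 'host-4'
```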
When you transform events using SQL, the results of processing replace the original event. If you want to retain all fields from the original event, include the wildcard character (*) in your SELECT statement. For example, to transform the severity while keeping all other fields: SELECT *, CASE WHEN severity > 16 THEN 21 ELSE severity END AS severity FROM logs.
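Conversely, a query that selects only named fields replaces each event with just those fields. A minimal sketch, using the same severity example:

```sql
-- Without the wildcard, output events contain only the severity field;
-- every other field from the original events is dropped.
SELECT CASE WHEN severity > 16 THEN 21 ELSE severity END AS severity
FROM logs
```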
To learn more about using SQL in Grepr pipelines, see Transform events with SQL.
Configure filters in the Grepr pipelines UI
In the Grepr pipelines UI, filters are available in predefined locations in a pipeline. To add one of these filters to your pipeline:
1. In the Grepr pipelines UI, in the left-hand navigation menu, select the Pre-parsing, Post-parsing, or Post-warehouse filter to open the filter detail pane.
2. In the filter detail pane, click the pencil icon to display the filter configuration form.

3. Configure the filtering and routing of events:
    - In the Max Lateness field, optionally enter a duration in ISO-8601 format to drop late-arriving data.
    - In the Next Step Passthrough section, to pass events to the next pipeline step, ensure that Enable data passthrough to next step is selected. To optionally select events for passthrough based on specific criteria, enter a query in the Filter Query field.
    - To process events using SQL, select Enable SQL processing and select whether to process all events or only events that are not passed to the next step. Optionally enter a query in the Filter Query field to conditionally select events for SQL processing. This screenshot shows a filter configuration that passes events associated with host-1 and host-4 to the next pipeline step. All other events are processed using SQL.

4. Configure how to process events with SQL:
    - To optionally apply SQL processing to events that meet specific criteria, enter a query in the SQL Filter Query field.
    - In the SQL Views section, click the plus sign to add one or more SQL Input definitions. In the View name field, enter a name for the view to contain the results of processing, and in the SQL field, enter the SQL query to process input events.
    - Optionally select Reuse calculation to materialize the results of query processing for reuse in subsequent views.
    - In the Outputs section, configure one or more pipeline steps to receive the results of SQL processing. This screenshot shows a configuration that selects events where the message includes No data received and saves them to a view named no_data_messages. The view is then routed to the pipeline sinks, which forward it to your observability platform or other destinations. A sketch of such a view's SQL follows this procedure.
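For reference, the SQL behind a view like no_data_messages might look like the following minimal sketch; the message field name is an assumption about the logs input table's schema.

```sql
-- Hypothetical SQL for a view named "no_data_messages": keeps only events
-- whose message contains the phrase "No data received", retaining all
-- original fields via the wildcard.
SELECT *
FROM logs
WHERE message LIKE '%No data received%'
```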
