Question 1

What is an exception in a Grepr pipeline?

Accepted Answer

An exception is a rule that you configure on a pipeline to identify log messages that should bypass aggregation by the reducer. Messages that match an exception are forwarded to your sinks unmodified. Some exception types also trigger a backfill from the data lake and temporarily pause aggregation for related messages.

Question 2

What types of exceptions can I configure?

Accepted Answer

You can configure five types of exceptions: a pattern exception that always forwards messages matching a query, a query trigger exception that backfills and pauses aggregation when a query matches, an integration exception that excludes queries used by alerts in your observability tool, a trace sampler exception that forwards a representative sample of full traces, and an external trigger exception that backfills and pauses aggregation in response to a call to the Grepr External Triggers API.

Question 3

How do I add an exception to a pipeline?

Accepted Answer

Open the pipeline in the Grepr UI, click Exceptions in the left-hand navigation menu, and click the add button. In the Add Exception dialog, select the type of exception, complete the configuration fields, and click Add.

Question 4

What query languages can I use in an exception?

Accepted Answer

You write exception queries using the same query syntaxes available for the filtering and transformation steps in your pipeline. Queries are used on pattern exceptions, query trigger exceptions, and as the optional scoping query on trace sampler exceptions.

Question 5

What are scoping keys on an exception?

Accepted Answer

Scoping keys are the attribute paths and tag keys you configure on a query trigger exception or external trigger exception. They limit the backfill and the paused aggregation to messages whose values for those keys match the values from the triggering event.
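
The scoping rule can be illustrated with a minimal Python sketch. It is not Grepr's implementation: the flat dictionary message shape, the function name scoped_match, and the choice to treat a key missing from the triggering event as a non-match are all assumptions made for illustration.

```python
def scoped_match(trigger_event: dict, message: dict, scoping_keys: list) -> bool:
    """Return True when the message shares the triggering event's value
    for every scoping key.

    Assumption: events and messages are flat dicts, and a key that is
    missing from the triggering event never matches.
    """
    return all(
        key in trigger_event and message.get(key) == trigger_event[key]
        for key in scoping_keys
    )


# Hypothetical triggering event and candidate messages.
trigger = {"service": "checkout", "host": "web-1", "level": "error"}
same_service = {"service": "checkout", "host": "web-2", "level": "info"}
other_service = {"service": "search", "host": "web-1", "level": "info"}

# Scoped to "service": only messages from the checkout service qualify.
in_scope = [m for m in (same_service, other_service)
            if scoped_match(trigger, m, ["service"])]
```

With both "service" and "host" as scoping keys, same_service would no longer qualify because its host differs from the triggering event's host.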

Question 6

What does Auto-sync exceptions do on an integration exception?

Accepted Answer

When Auto-sync exceptions is enabled, Grepr automatically adds new queries discovered in the integration as exceptions, instead of using only the queries you initially selected. You can clear specific queries to exclude them from auto-sync, and you can use the Max % field to prevent any single exception from exempting more than a specified percentage of incoming logs from reduction.

Question 7

How do I prevent an auto-synced exception from forwarding too many logs?

Accepted Answer

On an integration exception with Auto-sync exceptions enabled, set the Max % field to the maximum percentage of incoming logs that any single exception is allowed to exclude from reduction. If the estimated percentage of logs excluded by an auto-synced exception exceeds this threshold, Grepr does not activate the exception.
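
The threshold check described above amounts to a simple percentage comparison. The following Python sketch shows the arithmetic only. The function name, its parameters, and the handling of an empty sample are hypothetical, and how Grepr actually estimates the excluded percentage is not covered here.

```python
def should_activate(estimated_matching: int, total_incoming: int,
                    max_percent: float) -> bool:
    """Decide whether an auto-synced exception stays within the Max % limit.

    Assumption: the estimate is a count of matching logs out of a count of
    incoming logs over some observation window.
    """
    if total_incoming == 0:
        # Assumption: with no traffic, nothing is excluded from reduction.
        return True
    excluded_percent = 100.0 * estimated_matching / total_incoming
    # The exception is not activated only when the estimate exceeds the limit.
    return excluded_percent <= max_percent


# Example: 1,200 of 10,000 logs match, which is 12% of incoming logs.
within_limit = should_activate(1200, 10000, max_percent=15.0)   # 12% <= 15%
over_limit = should_activate(1200, 10000, max_percent=10.0)     # 12% > 10%
```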

Question 8

How does trace sampling work in Grepr?

Accepted Answer

Grepr reads the trace ID from each log message using the fields you configure, samples a configurable percentage of trace IDs, and forwards every log message that belongs to a sampled trace unmodified. The trace ID does not have to come from APM tracing; any value that groups logs together, such as a request ID, user ID, or session ID, can be used.

Question 9

What's the difference between Trace ID Attribute Paths and Trace ID Tag Keys on a trace sampler exception?

Accepted Answer

Trace ID Attribute Paths are paths inside the structured attributes of a log message that contain the trace ID, expressed in dot notation or as a JSON array. Trace ID Tag Keys are top-level tag keys that contain the trace ID. You must specify at least one path or one tag key.
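
To make the two path notations concrete, here is a minimal Python sketch of resolving a trace ID from a message. The message shape (an "attributes" dictionary and a "tags" dictionary with single string values) and all function names are assumptions for illustration, not Grepr's data model. One plausible use for the JSON array form, shown here, is a key that itself contains a dot.

```python
import json


def parse_path(path: str) -> list:
    """Split an attribute path given in dot notation or as a JSON array."""
    path = path.strip()
    if path.startswith("["):
        return [str(segment) for segment in json.loads(path)]
    return path.split(".")


def lookup(attributes: dict, path: str):
    """Walk nested dicts along the path; return None if any segment is missing."""
    node = attributes
    for segment in parse_path(path):
        if not isinstance(node, dict) or segment not in node:
            return None
        node = node[segment]
    return node


def find_trace_id(message: dict, attribute_paths: list, tag_keys: list):
    """Return the first trace ID found in the attribute paths, then the tag keys."""
    for path in attribute_paths:
        value = lookup(message.get("attributes", {}), path)
        if value is not None:
            return str(value)
    for key in tag_keys:
        value = message.get("tags", {}).get(key)
        if value is not None:
            return str(value)
    return None


# Hypothetical message with a nested attribute and a tag.
message = {
    "attributes": {"http": {"request.id": "req-42"}},
    "tags": {"session": "sess-7"},
}
from_array_path = find_trace_id(message, ['["http", "request.id"]'], [])
from_tag = find_trace_id(message, ["trace.id"], ["session"])
```

In this sketch, the dot-notation path "http.request.id" would not resolve, because it is split into three segments, while the JSON array form addresses the two-segment path exactly.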

Question 10

How does the Sample Percentage on a trace sampler exception work?

Accepted Answer

The Sample Percentage is the percentage of distinct trace IDs that Grepr samples. Grepr forwards every log message that belongs to a sampled trace, so a sample percentage of 1% forwards complete logs for approximately 1% of the traces seen by the reducer.
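
One common way to sample by trace ID, so that every log in a trace gets the same decision, is to hash the ID into a bucket and compare the bucket with the sample percentage. The Python sketch below illustrates that technique under stated assumptions. It is not a description of how Grepr selects traces, and the function name and the "trace_id" field are hypothetical.

```python
import hashlib


def is_sampled(trace_id: str, sample_percent: float) -> bool:
    """Deterministically decide whether a trace ID falls in the sample.

    The ID is hashed to a number in [0, 1). IDs whose number is below
    sample_percent / 100 are sampled, so the same ID always gets the same
    decision and roughly sample_percent of distinct IDs are selected.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_percent / 100.0


# Hypothetical logs: three belong to request "a", two to request "b".
logs = [
    {"trace_id": "a", "msg": "start"},
    {"trace_id": "b", "msg": "start"},
    {"trace_id": "a", "msg": "query"},
    {"trace_id": "b", "msg": "done"},
    {"trace_id": "a", "msg": "done"},
]

# Every log of a sampled trace is forwarded; no trace is forwarded partially.
forwarded = [log for log in logs if is_sampled(log["trace_id"], 50.0)]
```

Because the decision depends only on the trace ID, a trace is either forwarded in full or not forwarded by the sampler at all, which matches the behavior described above.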

Question 11

In what order does Grepr apply trace samplers?

Accepted Answer

Grepr applies trace samplers in the order they are listed. After a message matches a trace sampler, Grepr does not evaluate it against any of the remaining trace samplers.
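
The ordering rule is a first-match-wins evaluation, sketched below in Python. The TraceSampler class, its fields, and the predicate used for matching are hypothetical stand-ins for a configured trace sampler and its optional scoping query.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TraceSampler:
    name: str
    matches: Callable[[dict], bool]  # stand-in for the sampler's scoping query
    sample_percent: float


def select_sampler(message: dict, samplers: list) -> Optional[TraceSampler]:
    """Return the first sampler, in list order, that the message matches."""
    for sampler in samplers:
        if sampler.matches(message):
            return sampler  # later samplers are not evaluated
    return None


# A specific sampler listed before a catch-all sampler.
samplers = [
    TraceSampler("checkout", lambda m: m.get("service") == "checkout", 10.0),
    TraceSampler("catch-all", lambda m: True, 1.0),
]

checkout_choice = select_sampler({"service": "checkout"}, samplers)
search_choice = select_sampler({"service": "search"}, samplers)
```

Here checkout messages are handled by the 10% sampler and never reach the catch-all. If the two entries were listed in the opposite order, the catch-all would match every message first and the checkout sampler would never apply, so more specific samplers belong earlier in the list.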

Question 12

Can Grepr import exceptions from my observability platform's monitors and dashboards?

Accepted Answer

Yes. Some Grepr integrations, such as the Datadog and New Relic integrations, parse the queries used by your monitors, dashboards, and alerts and suggest them as exceptions you can apply. Each suggested exception includes a link back to the source of the alert in your observability vendor's UI.

Question 13

How does an external trigger exception get activated?

Accepted Answer

An external trigger exception is activated by a call to the Grepr External Triggers API. After you save the exception, the row for the exception in the Exceptions list provides a sample curl command that you can use to call the API. Scoping keys configured on the exception take their values from the API call.

Question 14

How does the query trigger exception work?

Accepted Answer

A query trigger exception activates when an incoming log message matches the configured query. When it activates, Grepr backfills matching messages from the data lake for a configured duration and pauses aggregation of matching messages for a configured duration. You can also choose to skip the backfill and only pause aggregation, or to keep aggregating and only run the backfill. You can use scoping keys to limit the backfill and paused aggregation to messages that share the triggering message's values for those keys.

Question 15

Why might I need to rewrite alert or dashboard queries after adding exceptions?

Accepted Answer

When you first deploy log reduction, you typically add exceptions for existing alerts and dashboards to minimize disruption. Some of those alerts and dashboards are powered by counts of high-volume events, such as HTTP requests with status 200, and exempting those events from aggregation limits how much log volume the reducer can remove. Rewriting those queries lets the reducer aggregate the underlying messages while keeping the dashboards accurate.

Question 16

Can I rewrite my dashboards to use reduced logs?

Accepted Answer

Yes. You can rewrite dashboard and alert queries to use grepr.repeatCount, which keeps metrics accurate on aggregated data. Add the processor:grepr tag to those queries so that they match only logs processed by Grepr. This lets you compute metrics from reduced logs at lower cost.
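
The idea behind counting with grepr.repeatCount can be sketched in Python: instead of counting log records, sum the repeat count carried by each record. The flat "grepr.repeatCount" key, the record shape, and the default of 1 for records without the field are assumptions for illustration. In practice the equivalent sum would be written in your observability platform's query language.

```python
def true_event_count(logs: list) -> int:
    """Estimate the number of original events represented by reduced logs.

    Assumption: an aggregated record carries "grepr.repeatCount" with the
    number of original messages it represents, and a record without the
    field stands for exactly one message.
    """
    return sum(int(log.get("grepr.repeatCount", 1)) for log in logs)


# Hypothetical reduced logs for HTTP 200 requests.
reduced_logs = [
    {"status": 200, "grepr.repeatCount": 950},  # one summary of 950 requests
    {"status": 200, "grepr.repeatCount": 48},
    {"status": 200},                            # forwarded without aggregation
    {"status": 200},
]

record_count = len(reduced_logs)                # counts records only
event_count = true_event_count(reduced_logs)    # counts represented events
```

A dashboard that counts records would report 4 requests here, while summing the repeat counts reports 1000, which is why count-based queries need rewriting once the underlying messages are aggregated.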
