Creating and managing pipelines
After you start a pipeline, you will be redirected to the pipeline details view.
This page will allow you to manage your pipeline job graph.
As soon as you start editing, you enter edit mode. A black "Changes pending" bar will appear at the top, as in the following screen:
When you create a new pipeline, you will already start in edit mode.
Pipeline job graph
The pipeline job graph is a visual representation of your pipeline. It is composed of the following elements:
- Sources
- Filters
- Parser
- Data warehouse
- Exceptions (Rule Engine)
- Log reducer
- Sinks
Sources
Sources are the entry point of your pipeline. They are where your logs come from. You can have multiple sources in a pipeline.
A source can be an integration like Datadog.
If you don't have any integrations at this point, you will see the following screen:
Click on the create integration button to create a new integration.
Once you have created your integration, you can select it as a source for your pipeline by clicking on the add source button.
Once added, you can see the source in the pipeline job graph. For Datadog agents, you will have to add the ingestion URL to your agent configuration.
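As a rough sketch, pointing a Datadog Agent at Grepr typically means overriding the Agent's log intake endpoint in datadog.yaml. The endpoint value below is a placeholder; use the ingestion URL shown in the Grepr UI for your source, and confirm the configuration keys against your Agent version.

```yaml
# datadog.yaml -- sketch only; replace the placeholder with the ingestion URL
# shown in the Grepr UI, and verify these keys against your Agent version.
logs_enabled: true

logs_config:
  use_http: true
  # Placeholder for the Grepr-provided ingestion endpoint for this source
  logs_dd_url: "<GREPR_INGESTION_URL>:443"
```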
Filters
Filters are used to filter out logs that you don't want to keep. You can have multiple filters in a pipeline.
There are three places where you can add filters if you want to drop logs at a certain point in the pipeline:
- After the source
- After the parser
- After the log reducer
To add a filter, click on the add filter button.
You will then see an input where you can put your query. This is a Datadog-syntax query for the logs you would like to keep. Everything else is dropped.
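For example, a keep-query like the following (the service and status values are hypothetical) would keep only warning and error logs from a checkout service and drop everything else:

```
service:checkout AND status:(error OR warn)
```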
Parser
Grepr allows you to add parsing rules to your logs using Grok patterns. To add a parser, click on the add parser button.
You will then see an input where you can put your parsing rules.
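For instance, a rule built from standard Grok patterns might split a plain-text line into a timestamp, a level, and a message. The field names here are only illustrative; refer to the parser documentation for the exact rule format Grepr expects.

```
%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}
```

Applied to a line like `2024-05-01T12:34:56Z ERROR connection refused`, this would extract the level as `ERROR` and put the remainder into `message`.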
Data warehouse
The data warehouse is where your raw logs will be stored before being reduced. To add a data warehouse, click on the add data warehouse button. This will allow you to select a data warehouse integration.
Exceptions
You can specify exceptions as part of the Rule Engine that will dynamically affect the processing of logs.
You can add multiple exceptions to your pipeline, and there are multiple types that you may choose from. See more details here.
External trigger exceptions
In some situations, you might want to trigger an exception from an external callback. For example, you might have an alert configured in your observability tool that might indicate an incident. You might want to make sure that all relevant logs for that incident are not aggregated, so that when an engineer goes to troubleshoot, those logs are already available.
This feature enables a webhook in Grepr to trigger an "exception". When this webhook is called, Grepr can stop aggregating data for a customizable time period, and can also load some historical data. To configure this trigger, there are a few parts to set up:
- Creating an API key to call the hook externally
- Defining relevant data via scoping keys
- Backfill timespan
- How long to stop aggregation
- Generating callback payload and URL
Creating an API key
If you haven't already, create an API key by going to the API key page at https://[ORG_ID].app.grepr.ai/api-keys. Copy and save the API key so you can use it in your call when you configure it.
Defining relevant data
When a trigger call arrives, it needs to tell Grepr what the "scope" of the exception is. The scope defines the tags and attributes, what we call "scope keys", that should be used for backfills and for selecting messages to pass through unaggregated. For tags, just use the tag key, such as `host`. For attributes, prepend a `@` and use a dotted JSON path, such as `@url.host`.
The API call payload will need to provide these keys as part of the call. One wrinkle here is that you'll need to prepend `__` (a double underscore) to the names of all these scoping keys in the payload.
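For example, assuming the scope keys are the `host` tag and the `@url.host` attribute (the values shown are placeholders), the corresponding payload entries would look like this:

```json
{
  "__host": "i-0abc123",
  "__@url.host": "api.example.com"
}
```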
Backfill timespan
This defines the length of time before the current time for which to load unaggregated data back to your observability tool. Note that the backfill usually takes a couple of minutes before it actually starts, to allow all the data to make it to the data lake and become available for querying.
How long to stop aggregation
You can also optionally define how long to stop aggregating relevant, in-scope data. After this time period ends, data will be aggregated again.
Generating the callback payload and URL
The API call is a POST request to a URL like https://[ORG_ID].app.grepr.ai/api/v1/triggers. The payload is in JSON and consists of the following:

```
{
  "__@<attribute path>": "<attribute value>",
  "__<tag key>": "<tag value>",
  ...
}
```
To make this easier, we provide a cURL command example that you can copy and use, based on the configuration of the trigger so far. You will need to update the values to match your needs.
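The command below is only a hand-written sketch of what such a call might look like; the generated command is authoritative. It assumes the API key is passed as a bearer token and reuses the hypothetical `__host` and `__@url.host` scope keys from above.

```sh
# Sketch only -- copy the generated cURL command from the UI for the exact
# header name and payload keys. [ORG_ID], the API key, and all values are placeholders.
curl -X POST "https://[ORG_ID].app.grepr.ai/api/v1/triggers" \
  -H "Authorization: Bearer $GREPR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "__host": "i-0abc123",
        "__@url.host": "api.example.com"
      }'
```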
If you are creating a new pipeline, you will see the Generate cURL command button disabled, because the pipeline needs to be created first.
Once your pipeline is successfully created, you will be able to see the button enabled and the following dialog:
From here you can copy the cURL command and run it in your terminal to trigger the exception as you need.
If you don't have an API key yet, you will see the following dialog instead:
You can click on the API key link, which redirects you to the API key page where you can create a new API key.
Once your API key is created, you can go back to the pipeline page and copy the cURL command.
Log reducer
The log reducer is used to reduce the volume of logs that get stored.
In the log reducer, you can choose how logs are grouped and the aggregation threshold at which reduction starts.
Sinks
Sinks are the exit point of your pipeline. They are where your reduced logs are sent.
You can choose your sink by clicking on the add sink button.
You can also add any additional tags you want to apply to your logs.
By default, we add `processor:grepr` and `pipeline:{YOUR_PIPELINE_NAME}`.
Once you have added all the elements to your pipeline, you can click on the save button to save your pipeline!