Operations
Operations are the building blocks of a Grepr pipeline. They are the individual steps applied to each event as it passes through the pipeline. Operations can be combined to create complex pipelines that transform, filter, and enrich log events. Each operation appears as a vertex in the Grepr job graph.
More detailed information is available in the API spec.
Parsing in Grepr
JSON Message Parsing
Grepr can automatically detect and parse JSON log messages into structured objects. All fields in the JSON object are parsed and added to the attributes field of the log event. Using this along with the Log Attributes Remapper allows users to enrich their logs.
Note: This operation is performed automatically when creating a pipeline through the UI. However, it needs to be explicitly added to the Grepr job graph when creating a pipeline through the API.
Operation execution example:
Incoming log message:
{
"message": "{\"msg\": \"Example log message\", \"@timestamp\": \"2007-12-03T10:15:30.00Z\", \"log\": {\"level\": \"INFO\"}}"
}
Parsed log message:
{
"message": "",
"attributes": {
"@timestamp": "2007-12-03T10:15:30.00Z",
"msg": "Example log message",
"log": {
"level": "INFO"
}
}
}
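When creating a pipeline through the API, this parsing step is added as its own vertex in the job graph. A minimal sketch is shown below; the type identifier "json-parser" is an assumption based on the naming of the other operations, so consult the API spec for the exact value:
{
  "name": "log-message-json-parser",
  "type": "json-parser"
}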
Grok Parser
For non-JSON formats, Grepr provides a Grok parser that can be used to parse semi-structured log messages and enrich logs with extracted fields. Grok parsing rules take the form %{MATCHER:EXTRACT:TRANSFORM} and are supplied to the Grok Parser operation as configuration.
- Matcher: An identifier (possibly a reference to another custom matcher) that describes what to expect (number, word, notSpace, etc.) when parsing the message.
- Extract (optional): An identifier representing the piece of text matched by the Matcher.
- Transform (optional): A transformer applied to the match to update the extracted value before enriching the message.
Note: All the extracted fields are added to the attributes field of the log event.
Example:
Grok pattern:
MyParsingRule %{word:user} connected on %{date("MM/dd/yyyy"):date} from %{ipv4:ip}
applied to the log message:
{
"message": "john connected on 11/08/2017 from 127.0.0.1"
}
would result in the following enriched log message:
{
"message": "john connected on 11/08/2017 from 127.0.0.1",
"attributes": {
"user": "john",
"date": 15759590400000, // (Milliseconds since epoch)
"ip": "127.0.0.1"
}
}
The user can define multiple grok parsing rules (each with a unique name, like MyParsingRule above) which can reference each other. The user can also define helper rules, which are groks that act as references for the main grok rules applied to the log messages.
Currently supported Grok Matchers are:
- All the ones defined in Logstash
- Additional Matchers supported (these are compatible with the Grok Matchers supported by Datadog):
  - boolean("truePattern", "falsePattern"): Matches and parses a Boolean, optionally defining the true and false patterns (defaults to true and false, ignoring case).
  - notSpace: Matches any string until the next space.
  - regex("pattern"): Matches a regex.
  - numberStr: Matches a decimal floating point number and parses it as a string.
  - number: Matches a decimal floating point number and parses it as a double precision number.
  - numberExtStr: Matches a floating point number (with scientific notation support) and parses it as a string.
  - numberExt: Matches a floating point number (with scientific notation support) and parses it as a double precision number.
  - integerStr: Matches an integer number and parses it as a string.
  - integer: Matches an integer number and parses it as an integer number.
  - integerExtStr: Matches an integer number (with scientific notation support) and parses it as a string.
  - integerExt: Matches an integer number (with scientific notation support) and parses it as an integer number.
  - word: Matches characters from a-z, A-Z, 0-9, including the _ (underscore) character.
  - doubleQuotedString: Matches a double-quoted string.
  - singleQuotedString: Matches a single-quoted string.
  - quotedString: Matches a double-quoted or single-quoted string.
  - uuid: Matches a UUID.
  - mac: Matches a MAC address.
  - ipv4: Matches an IPV4 address.
  - ipv6: Matches an IPV6 address.
  - ip: Matches an IP (v4 or v6).
  - hostname: Matches a hostname.
  - ipOrHost: Matches a hostname or IP.
  - data: Matches any string including spaces and newlines. Equivalent to .* in regex.
- List of Transformers supported (these are compatible with Datadog):
  - number: Parses a match as a double precision number.
  - integer: Parses a match as an integer number.
  - boolean: Parses true and false strings as booleans, ignoring case.
  - nullIf("value"): Returns null if the match is equal to the provided value.
  - json: Parses properly formatted JSON.
  - useragent([decodeuricomponent:true/false]): Parses a user-agent and returns a JSON object that contains the device, OS, and the browser represented by the agent.
Note: More Matchers and Filters can easily be added by our team. If you have a specific requirement, please reach out to us.
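As a hedged illustration of the TRANSFORM component (the rule name and field names below are made up for this example), the following rule extracts a trailing JSON payload into an attribute and parses it with the json transformer:
PayloadRule %{word:user} sent %{data:payload:json}
Applied to the message "john sent {"items": 3}", this would add user: "john" and a structured payload object containing items: 3 to the attributes field.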
Example of how a Grok Parser operation looks in the Grepr job graph:
{
"name": "log-message-grok-parser",
"type": "grok-parser",
"grokParsingRules": [
"MyParsingRule %{user} %{connection} %{server}",
"ParsingRule2 %{MyParsingRule} %{user}"
],
"grokHelperRules": [
"user %{word:user_name} id:%{integer:user_id}",
"connection connected %{integer:times} times",
"server on server %{notSpace:server.name} in %{notSpace:server.env}"
]
}
Log Attributes Remapper
Maps attributes to top-level fields and tags. It sets values in the log event model by extracting information from the attributes.
For example, this event:
{
"id": "ABCDEF",
"timestamp": "",
"message": "",
"severity": "",
"service": "",
"attributes": {
"syslog": {
"severity": "10",
"appname": "test name"
},
"status": "5",
"message": "message 1",
"timestamp": {
"ms_since_epoch": 9001
},
"eventTime": " "
}
}
Would be transformed to look like the following:
{
"id": "ABCDEF",
"timestamp": "",
"message": "message 1",
"severity": "5",
"service": "test name",
"attributes": {
"syslog": {
"severity": "10",
"appname": "test name"
},
"status": "5",
"timestamp": {
"ms_since_epoch": 9001
},
"eventTime": " "
}
}
Note the following behaviors:
- severity uses status: "5" instead of syslog.severity: "10" because status has a higher priority in the default statusReservedAttributes.
- Also note that syslog.appname: "test name" was still used, even though syslog.severity: "10" was skipped.
- The attribute message was removed because it's marked as removed once remapped.
- If the message attribute had been log.message, then message would have been removed, but its parent log would still exist, even if empty.
- timestamp: {} and eventTime: " " are not used at all because they are not non-blank string values.
- If maxNestedDepthForReservedFields was set to 1, then only message and severity would be set.
The Log Attributes Remapper has the following configuration options:
Variables | Description |
---|---|
name | The name of the operation |
maxNestedDepthForReservedFields | The maximum depth to expand. Can be [1, 10]. Defaults to 3. |
timestampReservedAttributes | Override default list of timestamp attributes |
hostReservedAttributes | Override default list of host attributes |
serviceReservedAttributes | Override default list of service attributes |
statusReservedAttributes | Override default list of status attributes |
messageReservedAttributes | Override default list of message attributes |
traceIdReservedAttributes | Override default list of trace attributes |
The default reserved attributes, and whether each is removed from the attributes map once remapped, are:
Attribute | Removed | Default names |
---|---|---|
timestamp | false | "@timestamp", "timestamp", "_timestamp", "Timestamp", "eventTime", "date", "published_date", "syslog.timestamp" |
host | false | "host", "hostname", "syslog.hostname" |
service | false | "service", "syslog.appname", "dd.service" |
status | false | "log.level", "status", "severity", "level", "syslog.severity" |
message | true | "message", "msg", "log" |
trace | false | "dd.trace_id", "contextMap.dd.trace_id" |
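For reference, a Log Attributes Remapper vertex in the Grepr job graph might look like the sketch below. The type identifier "log-attributes-remapper" is an assumption, and the overridden attribute lists are illustrative; by default the lists in the table above are used:
{
  "name": "my-attributes-remapper",
  "type": "log-attributes-remapper",
  "maxNestedDepthForReservedFields": 3,
  "serviceReservedAttributes": ["service", "syslog.appname"],
  "statusReservedAttributes": ["status", "log.level"]
}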
Log Reducer
The Log Reducer is the core IP that Grepr has developed to identify and merge similar log messages. The reducer works by clustering similar messages into patterns. Messages continue to be passed through with no modification until the number of messages passed for a pattern hits a threshold. At that point, the messages for that pattern are merged together for two minutes and not passed through. At the end of the two-minute window a summary is emitted for the aggregated patterns.
The reducer works in three steps (an illustrative sketch follows the list):
- Masking: the reducer masks any values that are known to change often such as numbers, UUIDs, or timestamps. This allows the reducer to be more efficient since it knows that a certain word has to be of a certain type.
- Tokenizing: the reducer tokenizes the messages into words based on a configurable set of punctuation characters.
- Clustering: the reducer uses a similarity metric to cluster messages into patterns. The reducer uses a similarity threshold to determine if two messages are similar enough to be in the same pattern.
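As an illustrative sketch of these steps (the exact placeholder format used for masked values is an internal detail, so the output shown here is only indicative), consider two messages that differ only in a number and an IP address:
user 42 connected from 10.0.0.1
user 57 connected from 10.0.0.2
With the default "number" and "IP & port" masks enabled, both messages mask and tokenize to the same token sequence, so they are clustered into a single pattern such as:
user <number> connected from <IP & port>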
The Log Reducer has the following configuration options:
Variables | Description |
---|---|
name | The name of the operation |
dedupThreshold | The number of messages that must be seen for a pattern before the reducer starts aggregating messages. |
partitionByTags | A list of tags to partition the messages by. Messages with different values of these tags will be split into different patterns. This field can be used to limit aggregation across domains or teams. |
similarityThreshold | Percent of tokens that have to match exactly between two messages before being considered similar. Default is 70. |
masks | A list of masks to apply to the message. Each mask is a list of two elements, the mask id such as "uuid" and the regular expression matching the mask. By default, the reducer has regular expressions for "URL", "IP & port", "UUID", and "number". |
enabledMasks | A list of the mask ids from the masks field that should be enabled. |
delimiters | An array of characters used to split the message into tokens. The default set includes ':', '#', '[', ']', '(', ')', ',', and ' ' (space). |
logReducerExceptions | A list of event predicates that when matched to a message will skip the aggregation of that message. This is used to skip aggregating messages that should always be passed through. |
exceptionProviderIntegrationIds | A list of integration ids that can provide exceptions to the log reducer. Add an observability vendor integration ID here so Grepr automatically skips aggregating messages used in alerts, dashboards, etc. |
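A Log Reducer vertex in the job graph might look like the sketch below. The type identifier "log-reducer" is an assumption and the values are illustrative:
{
  "name": "my-log-reducer",
  "type": "log-reducer",
  "dedupThreshold": 100,
  "similarityThreshold": 70,
  "partitionByTags": ["service", "team"],
  "enabledMasks": ["number", "UUID", "IP & port"]
}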
Rule Engine
The Rule Engine can dynamically change how logs are processed based on the provided context. The user configures the Rule Engine via the attributes described below, which dictate the actions it takes. One example scenario: when an incident occurs, the user may want to stop log aggregation and backfill already-aggregated data for a period of time for a service.
The Rule Engine has the following configuration options:
Variables | Description |
---|---|
triggers | Map of trigger id to trigger objects. The ids are a unique name for the trigger. |
conditions | Map of condition id to condition objects. The ids are a unique name for the conditions. |
Example definition of a Rule Engine as a vertex in the Grepr job graph:
{
"name": "my-rule-engine",
"type": "rule-engine",
"triggers": {
"trigger1": {
"type": "event-predicate",
"predicate": {
"type": "datadog-query",
"query": "service:my-service"
},
"conditionIds": ["condition1"],
"duration": "5m",
"variables": {
"__host": "host" // Extracts host tag from the matching event
}
}
},
"conditions": {
"condition1": {
"actionRules": [
{
"type": "event-rule",
"actionPredicate": {
"type": "datadog-query",
"query": "host:__host" // Matches events with host tag value extracted from the event that fired trigger1
},
"actions": [
{
"type": "tag-action",
"order": 0,
"modification": "ADD",
"tagKey": "new-tag",
"values": ["new-value"] // Adds {new-tag: [new-value]} to tags
}
]
},
{
"type": "job-rule",
"actions": [
{
"type": "backfill-job-action",
"name": "my_backfill_job",
"order": 1,
"backfillTimespan": "PT10M",
"jobTags": {
"job-type": "backfill"
},
"backfillQuery": { // Setting the query to define the context for the backfill
"type": "datadog-query",
"query": "host:__host"
},
"rawLogsTable": "my_rawlogs_table",
"processedLogsTable": "my_processedlogs_table",
"sinkOperationName": "sink",
"limit": 50000
}
]
}
]
}
}
}
Condition
Conditions describe the state of the environment. When a condition is "triggered", Grepr can take actions, which are described by the condition itself. A condition has a duration, and depending on the type of action, actions can be taken when the condition starts or on all messages while the condition is ongoing.
The actions are described in the condition as action rules. They can be of the following types:
- Event Action Rule: This takes actions on events that match a certain predicate while the condition is active.
- Job Action Rule: This executes a set of associated job actions when a condition is first triggered.
Event Action Rule
This is executed on events that match a certain predicate, for the entire duration that the associated condition is active.
The event action rule has the following configuration options:
Variables | Description |
---|---|
actionPredicate | Event predicate that triggers actions on an event. |
actions | List of event actions to be executed on the event. |
As an example, the UI configures the Rule Engine and conditions so that, when there's an abnormal event or an external API call hits Grepr, events related to the incident are tagged with a special tag that tells the Log Reducer to skip aggregating them.
Job Action Rule
A set of job actions that are executed once per associated condition. This means that if an already active condition is extended by the (re-)firing of a trigger, the job actions will not be executed again. The job action rule has the following configuration options:
Variables | Description |
---|---|
actions | List of job actions to be executed. |
As an example, the UI sets up the Rule Engine to kick off a backfill job when a trigger fires.
Trigger
A trigger kicks off one or more conditions. Triggers may continue to fire, which extends the activation time of the associated conditions. Each trigger has a unique identifier and is one of two types:
- Event Trigger: This trigger is activated by a matching predicate on an event. The predicate is provided by the user as part of the Rule Engine configuration.
- External Trigger: This trigger is activated by an external source, for example, an alert from an observability tool. These allow users to feed external stimuli into the Rule Engine to take appropriate actions.
The Event Trigger has the following configuration options:
Variables | Description |
---|---|
predicate | Event predicate on events that will fire a trigger. |
conditionIds | The ids for the conditions that are activated by this trigger. |
duration | The duration for which the associated conditions will be active. |
variables | Map from variable names to paths for variables to extract from a matching event. |
The External Trigger has the following configuration options:
Variables | Description |
---|---|
conditionIds | The ids for the conditions that are activated by this trigger. |
duration | The duration for which the associated conditions will be active. |
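For example, an External Trigger entry in the triggers map might look like the sketch below. Only conditionIds and duration come from the table above; the type identifier "external-trigger" is an assumption:
"trigger2": {
  "type": "external-trigger",
  "conditionIds": ["condition1"],
  "duration": "10m"
}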
Variables
Variables are defined as part of the trigger configuration. These are used to extract values from the event that fired the trigger. The extracted values can be used as context for the conditions that are activated by the trigger. For example, if the trigger is activated by an event with a service tag, the service tag value could be extracted and used in the conditions to perform actions on events that have the same service tag value.
It's a map from variable names to paths for variables to extract from a matching event. The way to reference an attribute is by using an @ in the variable path. For example, @syslog.appname will extract the value of the attribute syslog.appname from the event. If @ is not specified, the variable will be extracted from the tags of the event. For example, the path app will extract the value of the tag app from the event.
The variable names must start with '__' and be unique. Note that __timestamp (and __severity for logs) are automatically extracted from the respective top-level fields of the matching event and are available for use in conditions out of the box.
Note: If the path provided does not exist for a variable, the path itself will be taken as a literal value for the variable.
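For example, using the path syntax above, a trigger's variables map could look like the following (the variable names are illustrative):
"variables": {
  "__host": "host",               // extracted from the "host" tag
  "__appname": "@syslog.appname"  // extracted from the "syslog.appname" attribute
}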
Event Action
These are actions taken on an event. The EventAction can be one of the following:
Tag Action
Variables | Description |
---|---|
order | What order to apply this action in, lower is earlier. |
modification | One of ADD, REMOVE, SET, or DELETE |
tagKey | The key of the tag to be modified |
values | The values to add, remove, or set |
Attribute Remove Action
Variables | Description |
---|---|
order | What order to apply this action in, lower is earlier. |
targetPath | The path of the map attribute to modify. If the attribute does not exist or is not a map, there is no effect. |
values | The values to remove from the attribute. |
Attribute Add Action
Variables | Description |
---|---|
order | What order to apply this action in, lower is earlier. |
targetPath | The path of the list attribute to create or modify. If the attribute is not a list, there is no effect. |
values | The values to add to a list attribute. |
Attribute Merge Action
Variables | Description |
---|---|
order | What order to apply this action in, lower is earlier. |
targetPath | The path of the map attribute to create or modify. If the attribute is not a map, there is no effect. |
value | The values to set in the map attribute. Any existing values are overwritten. |
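For illustration, an Attribute Merge Action might be configured as sketched below. The type identifier "attribute-merge-action" is an assumption modeled on the tag-action type from the Rule Engine example; only the fields come from the table above, and the values are made up:
{
  "type": "attribute-merge-action",
  "order": 0,
  "targetPath": "syslog",
  "value": {
    "appname": "checkout-service"
  }
}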
Job Action
These are Grepr asynchronous batch jobs configured to run automatically as part of a condition. The JobAction can be one of the following:
Backfill Job Action
This action runs an asynchronous batch job to backfill data from the raw logs table to the configured sink. The backfilled data is also added to the processed logs table to ensure that the same log is not backfilled twice, which would lead to duplicates. The scope of what needs to be backfilled is provided by the backfillQuery, which filters the logs to backfill. It has the following configuration options:
Variables | Description |
---|---|
name | The name of the backfill job action. |
order | Action order. Lower is applied first. |
backfillTimespan | The timespan (Duration) for which to backfill data |
jobTags | The tags added to the backfill job, for ease of search. |
backfillQuery | The query to use for the backfill. This scopes the data that will be backfilled into the sink. This query can use variables to parameterize the data that will be backfilled based on the trigger |
rawLogsTable | The name of the table that has the raw logs. |
processedLogsTable | The name of the table that has the processed logs. |
sinkOperationName | The name of the Sink Operation to which the backfilled data should be written. This must be the name of a sink vertex in the same Job that defines this action. |
limit | Maximum number of logs to backfill. Default and max is 500,000. |
Event Predicate
Event predicates are used to match events based on certain parameters. The EventPredicate can be one of the following types:
Datadog Query Predicate
This is a predicate that checks if an event matches a Datadog query. It has the following configuration options:
Variables | Description |
---|---|
query | The Datadog query to match. See the Datadog query syntax documentation for more info on syntax |
Log Transform
Applies transformations to log events.
Variables | Description |
---|---|
name | The name of the operation |
transforms | A list of Event Actions to apply |
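A Log Transform vertex in the job graph might look like the sketch below. The type identifier "log-transform" is an assumption; the transform shown reuses the tag-action from the Rule Engine example above:
{
  "name": "my-log-transform",
  "type": "log-transform",
  "transforms": [
    {
      "type": "tag-action",
      "order": 0,
      "modification": "ADD",
      "tagKey": "pipeline",
      "values": ["grepr"]
    }
  ]
}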