Operations

Operations are the building blocks of a Grepr pipeline. They are the individual steps applied to each event as it passes through the pipeline. Operations can be combined to create complex pipelines that transform, filter, and enrich log events. Each operation appears as a vertex in the Grepr job graph.

More detailed information is available in the API spec.

Parsing in Grepr

JSON Message Parsing

Grepr can automatically detect and parse JSON log messages into structured objects. All fields in the JSON object are parsed and added to the attributes field of the log event. Combined with the Log Attributes Remapper, this allows users to enrich their logs.

Note: This operation is performed automatically when creating a pipeline through the UI. However, it needs to be explicitly added to the Grepr job graph when creating a pipeline through the API.

Operation execution example:

Incoming log message:

{
	"message": "{\"msg\": \"Example log message\", \"@timestamp\": \"2007-12-03T10:15:30.00Z\", \"log\": {\"level\": \"INFO\"}}"
}

Parsed log message:

{
    "message": "",
    "attributes": {
        "@timestamp": "2007-12-03T10:15:30.00Z",
        "msg": "Example log message",
        "log": {
          "level": "INFO"
        }
    }
}
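
When creating a pipeline through the API, this parser is added as its own vertex in the Grepr job graph. A minimal sketch of such a vertex is shown below; the type identifier "json-parser" is an assumption, so check the API spec for the exact name:

{
  "name": "log-message-json-parser",
  "type": "json-parser" // assumed type identifier; see the API spec
}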

Grok Parser

Jump to API spec

For non-JSON formats, Grepr provides a Grok parser that can parse semi-structured log messages and enrich logs with extracted fields. Grok parsing rules take the form %{MATCHER:EXTRACT:TRANSFORM} and are supplied to the Grok Parser operation as configuration.

  • Matcher: An identifier (possibly a reference to another custom matcher) that describes what to expect (number, word, notSpace, etc.) when parsing the message.
  • Extract (optional): An identifier representing the piece of text matched by the Matcher.
  • Transform (optional): A transformer applied to the match to update the extracted value before enriching the message.

Note: All the extracted fields are added to the attributes field of the log event.

Example:

Grok pattern:

MyParsingRule %{word:user} connected on %{date("MM/dd/yyyy"):date} from %{ipv4:ip}

applied to the log message:

{
 "message": "john connected on 11/08/2017 from 127.0.0.1"
}

would result in the following enriched log message:

{
  "message": "john connected on 11/08/2017 from 127.0.0.1",
  "attributes": {
    "user": "john",
    "date": 15759590400000, // (Milliseconds since epoch)
    "ip": "127.0.0.1"
  }
}

The user can define multiple grok parsing rules (each with a unique name, like MyParsingRule above) which can reference each other. The user can also define helper rules: grok patterns that act as references within the main rules applied to the log messages.

Currently supported Grok Matchers are:

  1. All the matchers defined in Logstash
  2. Additional matchers, compatible with the Grok Matchers supported by Datadog:
    • boolean("truePattern", "falsePattern") : Matches and parses a Boolean, optionally defining the true and false patterns (defaults to true and false, ignoring case).
    • notSpace: Matches any string until the next space.
    • regex("pattern"): Matches a regex.
    • numberStr: Matches a decimal floating point number and parses it as a string.
    • number: Matches a decimal floating point number and parses it as a double precision number.
    • numberExtStr: Matches a floating point number (with scientific notation support) and parses it as a string.
    • numberExt: Matches a floating point number (with scientific notation support) and parses it as a double precision number.
    • integerStr: Matches an integer number and parses it as a string.
    • integer: Matches an integer number and parses it as an integer number.
    • integerExtStr: Matches an integer number (with scientific notation support) and parses it as a string.
    • integerExt: Matches an integer number (with scientific notation support) and parses it as an integer number.
    • word: Matches characters from a-z, A-Z, 0-9, including the _ (underscore) character.
    • doubleQuotedString: Matches a double-quoted string.
    • singleQuotedString: Matches a single-quoted string.
    • quotedString: Matches a double-quoted or single-quoted string.
    • uuid: Matches a UUID.
    • mac: Matches a MAC address.
    • ipv4: Matches an IPv4 address.
    • ipv6: Matches an IPv6 address.
    • ip: Matches an IP (v4 or v6).
    • hostname: Matches a hostname.
    • ipOrHost: Matches a hostname or IP.
    • data: Matches any string including spaces and newlines. Equivalent to .* in regex.
  3. List of Transformers supported, compatible with Datadog's (an example combining a matcher with a transformer follows this list):
    • number: Parses a match as double precision number.
    • integer: Parses a match as an integer number.
    • boolean: Parses true and false strings as booleans ignoring case.
    • nullIf("value"): Returns null if the match is equal to the provided value.
    • json: Parses properly formatted JSON.
    • useragent([decodeuricomponent:true/false]): Parses a user-agent and returns a JSON object that contains the device, OS, and the browser represented by the Agent.
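
As an illustration of combining matchers with transformers, a rule like the following (all names and values here are hypothetical):

LoginRule %{word:user} login=%{word:outcome:nullIf("unknown")} attempts=%{notSpace:attempts:integer}

applied to the message "alice login=unknown attempts=3" would produce attributes along the lines of:

{
  "user": "alice",
  "outcome": null,
  "attempts": 3
}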

Note: More Matchers and Filters can easily be added by our team. If you have a specific requirement, please reach out to us.

Example of how a Grok Parser operation looks in the Grepr job graph:

{
  "name": "log-message-grok-parser",
  "type": "grok-parser",
  "grokParsingRules": [
      "MyParsingRule %{user} %{connection} %{server}",
      "ParsingRule2 %{MyParsingRule} %{user}"
  ],
  "grokHelperRules": [
      "user %{word:user_name} id:%{integer:user_id}",
      "connection connected %{integer:times} times",
      "server on server %{notSpace:server.name} in %{notSpace:server.env}"
  ]
}

Log Attributes Remapper

Jump to API spec

Maps attributes to top-level fields and tags. It sets values in the log event model by extracting information out of the attributes.

For example, this event:

{
    "id": "ABCDEF",
    "timestamp": "",
    "message": "",
    "severity": "",
    "service": "",
    "attributes": {
        "syslog": {
            "severity": "10",
            "appname": "test name"
        },
        "status": "5",
        "message": "message 1",
        "timestamp": {
            "ms_since_epoch": 9001
        },
        "eventTime": " "
    }
}

Would be transformed to look like the following:

{
    "id": "ABCDEF",
    "timestamp": "",
    "message": "message 1",
    "severity": "5",
    "service": "test name",
    "attributes": {
        "syslog": {
            "severity": "10",
            "appname": "test name"
        },
        "status": "5",
        "timestamp": {
            "ms_since_epoch": 9001
        },
        "eventTime": " "
    }
}

Note the following behaviors:

  • severity uses status: "5" instead of syslog.severity: "10" because status has a higher priority in the default statusReservedAttributes.
  • Also note that syslog.appname: "test name" was still used, even though syslog.severity: "10" was skipped.
  • The attribute message was removed because it's marked as removed once remapped.
  • If the message attribute were log.message, then message would have been removed, but its parent log would still exist, even if empty.
  • timestamp (an object) and eventTime: " " are not used at all because neither is a non-blank string value.
  • If maxNestedDepthForReservedFields was set to 1, then only message and severity would be set.

The Log Attributes Remapper has the following configuration options:

Variables | Description
name | The name of the operation
maxNestedDepthForReservedFields | The maximum depth to expand. Can be [1, 10]. Defaults to 3.
timestampReservedAttributes | Override default list of timestamp attributes
hostReservedAttributes | Override default list of host attributes
serviceReservedAttributes | Override default list of service attributes
statusReservedAttributes | Override default list of status attributes
messageReservedAttributes | Override default list of message attributes
traceIdReservedAttributes | Override default list of trace attributes

The default reserved attributes, and whether each is removed once remapped, are:

Attribute | Removed | Default names
timestamp | false | "@timestamp", "timestamp", "_timestamp", "Timestamp", "eventTime", "date", "published_date", "syslog.timestamp"
host | false | "host", "hostname", "syslog.hostname"
service | false | "service", "syslog.appname", "dd.service"
status | false | "log.level", "status", "severity", "level", "syslog.severity"
message | true | "message", "msg", "log"
trace | false | "dd.trace_id", "contextMap.dd.trace_id"
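
For reference, a Log Attributes Remapper vertex in the Grepr job graph might look like the sketch below; the type identifier "log-attributes-remapper" and the override values are illustrative assumptions, so consult the API spec for the exact schema:

{
    "name": "my-attributes-remapper",
    "type": "log-attributes-remapper", // assumed type identifier; see the API spec
    "maxNestedDepthForReservedFields": 3,
    "serviceReservedAttributes": ["service", "syslog.appname", "app.name"]
}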

Log Reducer

Jump to API spec

The Log Reducer is the core IP that Grepr has developed to identify and merge similar log messages. The reducer works by clustering similar messages into patterns. Messages continue to be passed through with no modification until the number of messages passed for a pattern hits a threshold. At that point, the messages for that pattern are merged together for two minutes and not passed through. At the end of the two-minute window a summary is emitted for the aggregated patterns.

The reducer works in 3 steps (a short illustrative walkthrough follows this list):

  1. Masking: the reducer masks any values that are known to change often such as numbers, UUIDs, or timestamps. This allows the reducer to be more efficient since it knows that a certain word has to be of a certain type.
  2. Tokenizing: the reducer tokenizes the messages into words based on a configurable set of punctuation characters.
  3. Clustering: the reducer uses a similarity metric to cluster messages into patterns. The reducer uses a similarity threshold to determine if two messages are similar enough to be in the same pattern.
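
To make these steps concrete, consider two messages (the placeholder syntax below is illustrative only, not Grepr's exact output format):

"user 1234 connected from 10.1.2.3"
"user 5678 connected from 10.9.8.7"

After masking, both become "user <number> connected from <ip>"; after tokenizing, both yield the tokens ["user", "<number>", "connected", "from", "<ip>"], so clustering places them in the same pattern. Once the pattern's message count passes the dedupThreshold, subsequent matching messages are aggregated into a single summary.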

The Log Reducer has the following configuration options:

Variables | Description
name | The name of the operation
dedupThreshold | The number of messages that must be seen for a pattern before the reducer starts aggregating messages.
partitionByTags | A list of tags to partition the messages by. Messages with different values of these tags will be split into different patterns. This field can be used to limit aggregation across domains or teams.
similarityThreshold | Percent of tokens that have to match exactly between two messages before being considered similar. Default is 70.
masks | A list of masks to apply to the message. Each mask is a list of two elements: the mask id, such as "uuid", and the regular expression matching the mask. By default, the reducer has regular expressions for "URL", "IP & port", "UUID", and "number".
enabledMasks | A list of the mask ids from the masks field that should be enabled.
delimiters | An array of characters that are used to split the message into tokens. Default is ':', '#', '[', ']', '(', ')', ', ', '
logReducerExceptions | A list of event predicates that when matched to a message will skip the aggregation of that message. This is used to skip aggregating messages that should always be passed through.
exceptionProviderIntegrationIds | A list of integration ids that can provide exceptions to the log reducer. Add an observability vendor integration ID here so Grepr automatically skips aggregating messages used in alerts, dashboards, etc.
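
A sketch of how a Log Reducer vertex using these options might look in the Grepr job graph follows; the type identifier "log-reducer" and the specific values are illustrative assumptions, so consult the API spec for the exact schema:

{
    "name": "my-log-reducer",
    "type": "log-reducer", // assumed type identifier; see the API spec
    "dedupThreshold": 100,
    "partitionByTags": ["service"],
    "similarityThreshold": 70,
    "enabledMasks": ["UUID", "number"],
    "logReducerExceptions": [
        {
            "type": "datadog-query",
            "query": "status:error"
        }
    ]
}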

Rule Engine

Jump to API spec

The Rule Engine can dynamically affect the processing of logs based on the provided context. Users configure the Rule Engine through the attributes described below, which dictate the actions it takes. One example scenario: an incident occurs and the user wants to stop log aggregation and backfill already-aggregated data for a period of time for a service.

The Rule Engine has the following configuration options:

Variables | Description
triggers | Map of trigger id to trigger objects. The ids are a unique name for the trigger.
conditions | Map of condition id to condition objects. The ids are a unique name for the condition.

Example definition of a Rule Engine as a vertex in the Grepr job graph:

{
    "name": "my-rule-engine",
    "type": "rule-engine",
    "triggers": {
        "trigger1": {
            "type": "event-predicate",
            "predicate": {
                "type": "datadog-query",
                "query": "service:my-service"
            },
            "conditionIds": ["condition1"],
            "duration": "5m",
            "variables": {
                "__host": "host" // Extracts host tag from the matching event
            }
        }
    },
    "conditions": {
        "condition1": {
            "actionRules": [
                {
                    "type": "event-rule",
                    "actionPredicate": {
                        "type": "datadog-query",
                        "query": "host:__host" // Matches events with host tag value extracted from the event that fired trigger1
                    },
                    "actions": [
                        {
                            "type": "tag-action",
                            "order": 0,
                            "modification": "ADD",
                            "tagKey": "new-tag",
                            "values": ["new-value"] // Adds {new-tag: [new-value]} to tags
                        }
                    ]
                },
                {
                    "type": "job-rule",
                    "actions": [
                        {
                            "type": "backfill-job-action",
                            "name": "my_backfill_job",
                            "order": 1,
                            "backfillTimespan": "PT10M",
                            "jobTags": {
                                "job-type": "backfill"
                            },
                            "backfillQuery": { // Setting the query to define the context for the backfill
                                "type": "datadog-query",
                                "query": "host:__host"
                            },
                            "rawLogsTable": "my_rawlogs_table",
                            "processedLogsTable": "my_processedlogs_table",
                            "sinkOperationName": "sink",
                            "limit": 50000
                        }
                    ]
                }
            ]
        }
    }
}

Condition

Conditions describe the state of the environment. When a condition is "triggered", Grepr can take actions, which are described by the condition itself. A condition has a duration; depending on the type of action, actions can be taken when the condition starts or for all messages while the condition is ongoing.

The actions are described in the condition as action rules. They can be of the following types:

  1. Event Action Rule: This takes actions on events that match a certain predicate while the condition is active.
  2. Job Action Rule: This executes a set of associated job actions when a condition is first triggered.

Event Action Rule

This is executed on events that match a certain predicate, for the entire duration that the associated condition is active. The event action rule has the following configuration options:

Variables | Description
actionPredicate | Event predicate that triggers actions on an event.
actions | List of event actions to be executed on the event.

As an example, the UI configures the rule engine and conditions so that, when an abnormal event occurs or an external API call hits Grepr, events related to the incident are tagged with a special tag that tells the Log Reducer to skip aggregating them.

Job Action Rule

A set of job actions that are executed once per associated condition. This means that if an already-active condition is extended by the (re-)firing of a trigger, the job actions will not be executed again. The job action rule has the following configuration options:

Variables | Description
actions | List of job actions to be executed.

As an example, the UI sets up the Rule Engine to kick off a backfill job when a trigger fires.

Trigger

A trigger kicks off one or more conditions. Triggers may continue to fire, which extends the activation time of the associated conditions. Each trigger has a unique identifier and is one of two types:

  1. Event Trigger: This trigger is activated by a matching predicate on an event. The predicate is provided by the user as part of the Rule Engine configuration.
  2. External Trigger: This trigger is activated by an external source, for example, an alert from an observability tool. These will allow users to feed external stimuli into the Rule Engine to take appropriate actions.

The Event Trigger has the following configuration options:

Variables | Description
predicate | Event predicate on events that will fire a trigger.
conditionIds | The ids for the conditions that are activated by this trigger.
duration | The duration for which the associated conditions will be active.
variables | Map from variable names to paths for variables to extract from a matching event.

The External Trigger has the following configuration options:

Variables | Description
conditionIds | The ids for the conditions that are activated by this trigger.
duration | The duration for which the associated conditions will be active.
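
A sketch of how an External Trigger entry might appear in the triggers map is below; the type identifier "external-trigger" is an assumption, so check the API spec for the exact name:

"externalAlert": {
    "type": "external-trigger", // assumed type identifier; see the API spec
    "conditionIds": ["condition1"],
    "duration": "15m"
}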

Variables

Variables are defined as part of the trigger configuration. These are used to extract values from the event that fired the trigger. The extracted values could be used as context for the conditions that are activated by the trigger. For example, if the trigger is activated by an event with a service tag, the service tag value could be extracted and used in the conditions to perform actions on the events that have the same service tag value.

It's a map from variable names to paths for variables to extract from a matching event. The way to reference an attribute is by using an @ in the variable path. For example, @syslog.appname will extract the value of the attribute syslog.appname from the event. If @ is not specified, the variable will be extracted from the tags of the event. For example, the path app will extract the value of the tag app from the event.

The variable names must start with '__' and be unique. Note that __timestamp (and, for logs, __severity) are automatically extracted from the respective top-level fields of the matching event and are available for use in conditions out of the box.

Note: If the path provided does not exist for a variable, the path itself will be taken as a literal value for the variable.
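
For example, a trigger's variables map might look like the following (the paths shown are hypothetical):

"variables": {
    "__service": "service",          // extracted from the "service" tag
    "__appname": "@syslog.appname",  // extracted from the syslog.appname attribute
    "__team": "platform"             // if no "platform" tag exists, the literal value "platform" is used
}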

Event Action

These are actions taken on an event. The EventAction can be one of the following:

Tag Action

Variables | Description
order | What order to apply this action in, lower is earlier.
modification | One of ADD, REMOVE, SET, or DELETE
tagKey | The key of the tag to be modified
values | The values to add, remove, or set

Attribute Remove Action

Variables | Description
order | What order to apply this action in, lower is earlier.
targetPath | The path of the map attribute to modify. If the attribute does not exist or is not a map, there is no effect.
values | The values to remove from the attribute.

Attribute Add Action

Variables | Description
order | What order to apply this action in, lower is earlier.
targetPath | The path of the list attribute to create or modify. If the attribute is not a list, there is no effect.
values | The values to add to a list attribute.

Attribute Merge Action

Variables | Description
order | What order to apply this action in, lower is earlier.
targetPath | The path of the map attribute to create or modify. If the attribute is not a map, there is no effect.
value | The values to set in the map attribute. Any existing values are overwritten.
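
As an illustration, an Attribute Merge Action inside an event rule's actions list might look like the following; the type identifier "attribute-merge-action" is an assumption (only "tag-action" appears verbatim elsewhere on this page):

{
    "type": "attribute-merge-action", // assumed type identifier; see the API spec
    "order": 1,
    "targetPath": "incident",
    "value": {
        "id": "INC-123",
        "skipAggregation": true
    }
}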

Job Action

These are Grepr asynchronous batch jobs configured to run automatically as part of a condition. The JobAction can be one of the following:

Backfill Job Action

This action runs an asynchronous batch job to backfill data from the raw logs table to the configured sink. The backfilled data is also added to the processed logs table to ensure that the same log is not backfilled twice, which would create duplicates. The backfillQuery provides the context for the backfill by filtering the logs that will be backfilled. It has the following configuration options:

Variables | Description
name | The name of the backfill job action.
order | Action order. Lower is applied first.
backfillTimespan | The timespan (Duration) for which to backfill data.
jobTags | The tags added to the backfill job, for ease of search.
backfillQuery | The query to use for the backfill. This scopes the data that will be backfilled into the sink. This query can use variables to parameterize the data that will be backfilled based on the trigger.
rawLogsTable | The name of the table that has the raw logs.
processedLogsTable | The name of the table that has the processed logs.
sinkOperationName | The name of the Sink Operation to which the backfilled data should be written. This must be the name of a sink vertex in the same Job that defines this action.
limit | Maximum number of logs to backfill. Default and max is 500,000.

Event Predicate

Event predicates are used to match events based on certain parameters. The EventPredicate can be one of the following types:

Datadog Query Predicate

This is a predicate that checks if an event matches a datadog query. It has the following configuration options:

Variables | Description
query | The Datadog query to match. See the Datadog documentation for more information on the query syntax.

Log Transform

Jump to API spec

Applies transformations to log events.

Variables | Description
name | The name of the operation
transforms | A list of Event Actions to apply
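
A sketch of a Log Transform vertex that adds a tag to every event is shown below; the type identifier "log-transform" is an assumption, while the tag-action structure matches the Tag Action event action described above:

{
    "name": "add-env-tag",
    "type": "log-transform", // assumed type identifier; see the API spec
    "transforms": [
        {
            "type": "tag-action",
            "order": 0,
            "modification": "ADD",
            "tagKey": "env",
            "values": ["production"]
        }
    ]
}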