Parse log messages to enrich log events with the Grok parser
You can use the Grok parser in the Grepr platform to parse and extract fields from semi-structured log messages, and then add those extracted fields as attributes, tags, or top-level fields in output log events. Use the Grok parser in your pipelines to:
- Extract message context into trackable tags and attributes that can be referenced during incident investigations.
- Extract specific fields and add them as attributes or tags to ensure those fields are available in summarized events instead of being aggregated and masked by the reducer.
- Parse timestamps from log messages and set them as the event timestamp for tasks such as time-series analysis.
- Convert unstructured text into structured fields that can be queried and analyzed efficiently.
To use the Grok parser, you define one or more Grok rules that specify how to parse log messages. Each rule consists of a name, a Grok pattern that describes what to look for in the log message, and optionally the fields to extract and how to transform those fields. See Grok rules overview.
You can add one or more Grok parsers to your pipelines in the Grepr UI or the REST API. To use Grok in the UI, see Configure the Grok parser in the pipelines UI.
To use Grok in the REST API, you define a `GrokParser` object and add it as a vertex in a Grepr job graph. See Create and manage jobs with the Grepr REST API.
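The exact request schema is documented in the REST API reference. For orientation only, here is a rough sketch of what a `GrokParser` vertex might look like in a job graph request body; apart from `predicate`, which this page references as `GrokParser.predicate`, the field names and structure below are assumptions, not the actual API.

```
{
  "type": "grok-parser",
  "id": "parse-auth-logs",
  "rules": [
    "MyParsingRule %{word:user} %{word:tags.lastname} connected on %{date(\"MM/dd/yyyy\"):date} from %{ipv4:ip}"
  ],
  "predicate": null
}
```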
Grok rules overview
The syntax for a Grok rule is:
```
%{MATCHER:EXTRACTOR:TRANSFORMER}
```

- The matcher describes what to look for when parsing. For example, `word` for alphanumeric characters or `integer` for integer numbers.
- The extractor names the destination for the parsed text. Extractors are optional when you define a Grok rule.
- A transformer, also referred to as a filter, converts extracted values before adding them to the output event. For example, a transformer can convert a JSON string into a JSON object. Transformers are also optional when you define a Grok rule.
The following Grok rule:
- Matches a string of alphanumeric characters and extracts it as the `user` field.
- Matches the next string of alphanumeric characters and extracts it as a tag named `lastname` by prepending `tags.` to the destination.
- Matches a date in `MM/dd/yyyy` format, extracts it as the `date` field, and converts it to a timestamp.
- Matches an IPv4 address and extracts it as the `ip` field.

```
MyParsingRule %{word:user} %{word:tags.lastname} connected on %{date("MM/dd/yyyy"):date} from %{ipv4:ip}
```

Applying this rule to the log message:

```
john smith connected on 11/08/2017 from 127.0.0.1
```

results in the following enriched log event:
```
{
  "message": "john smith connected on 11/08/2017 from 127.0.0.1",
  "attributes": {
    "user": "john",
    "date": 1510099200000,
    "ip": "127.0.0.1"
  },
  "tags": [
    "lastname:smith"
  ]
}
```

All extracted fields are added to the attributes of the output log event, with the following exceptions:
- If you prefix the extractor with `tags.`, the extracted field is added to the tags of the log event.
- If the extractor is `severity`, the extracted field is used to set a top-level `severity` field in the output log event.
- If the extractor is `eventTimestamp`, the extracted field is used to set a top-level `timestamp` field in the output log event.
For example:
```
%{word:user}              # Adds to attributes: {"user": "john"}
%{word:tags.environment}  # Adds to tags: ["environment:production"]
%{word:severity}          # Sets log severity level
%{integer:eventTimestamp} # Sets log event timestamp in milliseconds
```

Create multiple parsing rules
You can define multiple Grok parsing rules, each with a unique name. The parser tries each rule in order until it finds a match. This approach is useful when different types of log events might require different parsing patterns.
Example with multiple rules:
```
LoginRule %{word:user} logged in from %{ipv4:ip}
LogoutRule %{word:user} logged out at %{date("HH:mm:ss"):time}
ErrorRule ERROR: %{data:error_message}
```

You can also define helper rules that act as reusable components in your main parsing rules. Helper rules don’t parse log messages directly but provide building blocks for your main rules.
Using the previous Grok rule as an example:
```
MyParsingRule %{word:user} %{word:tags.lastname} connected on %{date("MM/dd/yyyy"):date} from %{ipv4:ip}
```

The following shows how this rule can be rewritten using helper rules:

```
MyParsingRule %{user} %{connection} %{ip_address}

# Helper rules
user %{word:user} %{word:tags.lastname}
connection connected on %{date("MM/dd/yyyy"):date}
ip_address from %{ipv4:ip}
```

Supported matchers
The Grepr Grok parser supports all matchers from Logstash plus additional Datadog-compatible matchers.
For a reference to all supported Logstash matchers, including descriptions and examples, see The Grepr Grok parser: Logstash matchers.
For a reference to all supported Datadog-compatible matchers, including descriptions and examples, see The Grepr Grok parser: Datadog-compatible matchers.
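For illustration, the following hypothetical rules show the two styles side by side. The rule names and log formats are made up for this example; refer to the matcher references above for the authoritative lists.

```
# Logstash-style matchers use uppercase pattern names
AccessRule %{IP:client} %{WORD:method} %{NUMBER:bytes}

# Datadog-compatible matchers use lowercase names, as in the examples on this page
LoginRule %{word:user} connected from %{ipv4:ip}
```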
Supported transformers
For a reference to all supported Grok transformers, including descriptions and examples, see The Grepr Grok parser: Supported transformers.
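As a brief sketch of how transformers slot into the rule syntax: the example below assumes the JSON-conversion transformer mentioned earlier on this page is named `json`, and uses the `integer` transformer referenced under Best practices; the rule names and log shapes are hypothetical.

```
# Convert a matched numeric string to an integer (integer transformer)
RetryRule retry count: %{word:retries:integer}

# Parse an embedded JSON string into a JSON object (json transformer, name assumed)
PayloadRule payload=%{data:payload:json}
```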
Configure the Grok parser in the pipelines UI
You can configure Grok parser rules in the Grepr UI when creating or editing a pipeline. To add a rule for the Grok parser:
1. On the details page for your pipeline, in the left-hand navigation menu, click Grok Parser to open the Grok Parsing Rules pane.

2. Click Add to open the Grok Parser dialog.

3. In the Message field, enter a sample log message that you want to parse.

4. (Optional) To add more sample log messages, click + Add and enter additional log messages.

5. In the Parsing rules text box, enter your Grok parsing rules. Each rule must be on a separate line. As you type, the parser validates your rules against the log samples you provided. The validation results appear in the Extracted attributes using Matching rule field.

6. (Optional) To filter which log events the parser processes:
   - In the Query language menu, select a query syntax.
   - In the Filter which logs to parse with a query field, enter your filter query. Only logs that match the query are parsed. If you leave this field empty, all logs are parsed. To learn about supported query syntaxes, see Query log data in the Grepr data lake.

7. (Optional) To configure advanced settings, click Advanced Settings.
   - By default, the Grok parser extracts data from the `message` field in the input log events. To instead specify an attribute containing the message or other unstructured text, enter the attribute name in the Extract from field.
   - To configure Grok patterns that can be referenced in multiple parsing rules, enter them in the Helper Rules text box. Enter each pattern on a separate line.

8. After you’ve reviewed and validated the results in Extracted attributes using Matching rule:, Extracted Tags:, and Extracted top-level fields:, click Save.

9. To add another Grok parser rule, click Add.
Best practices
Follow these best practices when using the Grok parser:
- Test with real log samples: Always validate your parsing rules with actual log messages from your system before deploying to production. Use the log samples feature in the Grok Parser dialog to test your patterns.
- Start with simple patterns: Build complex patterns incrementally. Start with a simple rule that matches the most important fields, then add more fields as needed.
- Use helper rules for reusable patterns: If you have complex patterns that appear multiple times, define them as helper rules. This approach makes your parsing rules more maintainable and easier to read.
- Be specific with matchers: Use the most specific matcher possible. For example, use `ipv4` instead of `data` when matching IP addresses. Specific matchers provide better validation and type conversion.
- Use a filter to specify which logs to parse: When only a subset of log messages requires parsing, you can limit which log events are processed with the Filter which logs to parse with a query field in the pipelines UI or the `GrokParser.predicate` object in the REST API. Avoiding unnecessary parsing attempts improves performance and reduces cost.
- Use appropriate transformers: Apply transformers to convert extracted values to the correct type. For example, use `::integer` for numeric fields that should be integers, not strings.
- Order rules from specific to general: When defining multiple parsing rules, place more specific patterns first. The parser tries each rule in order and uses the first match.
- Avoid greedy matchers at the start: The `data` matcher is greedy and matches everything. Avoid using it at the beginning of a pattern, as it may match more than intended.
- Extract timestamps: If your logs contain timestamps, extract them using the `date` matcher and set them as `eventTimestamp` for accurate time-series analysis.
- Extract severity levels: Extract log severity levels (INFO, WARN, ERROR) and set them as the `severity` field to enable filtering and alerting based on log level. A short sketch combining several of these practices follows this list.
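The following sketch pulls several of these practices together. It is illustrative only: the rule names, log formats, and date pattern are hypothetical, while the matchers and the `severity` and `eventTimestamp` extractors are the ones described earlier on this page.

```
# Specific rule first: typed matchers plus severity and timestamp extraction
ErrorRule %{date("yyyy-MM-dd HH:mm:ss"):eventTimestamp} %{word:severity} error from %{ipv4:ip}: %{data:error_message}

# General fallback last: the greedy data matcher appears only at the end
FallbackRule %{date("yyyy-MM-dd HH:mm:ss"):eventTimestamp} %{data:message_body}
```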
Limitations
The Grok parser has the following limitations:
- Performance impact: Complex Grok patterns with many rules can impact processing performance. Test your patterns with representative log volumes to ensure acceptable performance.
- Regex complexity: Very complex regular expressions in `regex()` matchers can be slow to evaluate. Keep regex patterns as simple as possible.
- No backtracking: Once a parsing rule matches a log message, the parser doesn’t try other rules. Ensure your rules are ordered correctly.
- Character encoding: The Grok parser expects UTF-8 encoded log messages. Other character encodings may cause parsing errors.
- Timezone handling: When using the `date` matcher without a timezone parameter, the parser assumes UTC. Always specify the timezone if your logs use a different timezone, as in the sketch after this list.
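As a sketch of the timezone recommendation, assuming the timezone is passed as a second argument to `date()` (verify the exact argument format against the matcher reference):

```
# Without a timezone, the parsed timestamp is interpreted as UTC
UtcRule %{date("MM/dd/yyyy HH:mm:ss"):eventTimestamp} %{data:msg}

# With an explicit timezone (second argument is an assumption)
LocalRule %{date("MM/dd/yyyy HH:mm:ss", "America/New_York"):eventTimestamp} %{data:msg}
```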