Skip to Content

SQL transform supported data types

This page provides details on the three data types supported as input to and output from a SQL transform: LOG_EVENT, VARIANT, and COMPLETE_SPAN. These are not scalar data types. Instead, they are structured or semi-structured data formats.

The LOG_EVENT data type

The LOG_EVENT type is a structured format representing a log event, and has the following schema:

ROW< id STRING, eventtimestamp TIMESTAMP_LTZ(3), receivedtimestamp TIMESTAMP_LTZ(3), message STRING, severity INT, tags MAP<STRING, ARRAY<STRING>>, attributes STRING >
ColumnTypeDescription
idSTRINGGlobally unique identifier for the log event.
eventtimestampTIMESTAMP_LTZ(3)When the event occurred, formatted as TIMESTAMP(3) WITH LOCAL TIME ZONE. This field is parsed from the input log event.
receivedtimestampTIMESTAMP_LTZ(3)When Grepr received the event, formatted as TIMESTAMP(3) WITH LOCAL TIME ZONE.
messageSTRINGThe log message content.
severityINTThe severity level of the event. This is a value between 1 and 24, following the OpenTelemetry convention.
tagsMAP<STRING, ARRAY<STRING>>Key-value pairs for filtering and routing.
attributesSTRINGJSON string containing structured data associated with the event.

If you’re transforming a field like message or severity, the transformed field must appear before the * wildcard in your SELECT clause.

Create custom log events

You can use a transformation to construct new LOG_EVENT records with specific fields. For example:

-- Explicitly construct a new LogEvent with specific fields. SELECT UPPER(message) as message, CASE WHEN severity > 16 THEN 21 ELSE severity END as severity, MAP['transformed', ARRAY['true']] as tags, JSON_OBJECT('original_message' VALUE message) as attributes FROM logs

Grepr automatically adds default values for fields that are not specified:

  • id: When not specified or NULL, defaults to a new random UUID.
  • eventtimestamp: When not specified or NULL, defaults to the current timestamp in milliseconds.
  • receivedtimestamp: When not specified or NULL, defaults to the current timestamp in milliseconds.
  • severity: When not specified or NULL, defaults to 9 (INFO level).
  • message: When not specified or NULL, defaults to an empty string.
  • tags: When not specified or NULL, defaults to an empty map.
  • attributes: When not specified or NULL, defaults to an empty JSON object.

The following shows an example input event to the above transformation and the resulting output event:

Input:

{ "id": "evt1", "eventtimestamp": 1724214074062, "receivedtimestamp": 1724214074188, "severity": 17, "message": "user login successful", "tags": {"service": ["auth"]}, "attributes": {} }

Output:

{ "id": "12345678-1234-5678-1234-567812345678", // A new generated UUID "eventtimestamp": 174000000000, // Current timestamp in milliseconds "receivedtimestamp": 174000000002, // Current timestamp in milliseconds "severity": 21, // Transformed severity "message": "USER LOGIN SUCCESSFUL", // Transformed message "tags": {"transformed": ["true"]}, // New tags "attributes": { // New attributes "original_message": "user login successful" } }

The VARIANT data type

A VARIANT data type contains timestamped, semi-structured data with flexible JSON content:

ROW< receivedtimestamp TIMESTAMP_LTZ(3), eventtimestamp TIMESTAMP_LTZ(3), data STRING >

The following is an example of a simple VARIANT:

{ "receivedtimestamp": 174000000000, "eventtimestamp": 174000000000, "data": { "count": 10 } }

The COMPLETE_SPAN data type

The COMPLETE_SPAN data type represents an OpenTelemetry span with complete resource and scope context. See Spans  in the OpenTelemetry documentation.

Complex nested ROW structure with resource, scope, and span fields
Last updated on