
Tutorial: Troubleshooting incidents with Grepr-processed data and your observability tool

This tutorial shows how to use your observability tool to examine Grepr-processed log data when you investigate and troubleshoot issues in your infrastructure or applications.

This tutorial describes using the Datadog UI to work with log data, but you can follow the same steps with any Grepr-supported observability tool. For the complete list of supported observability vendors, see Supported integrations.

Requirements

To use the example in this tutorial, you must first complete the Build your first Grepr pipeline tutorial, which creates the processed logs used by this tutorial.

Troubleshooting example

Suppose you need to find the cause of a production incident. As part of your investigation, you need to view the raw log messages that Grepr aggregated into a summary message. To find the raw log messages, follow these steps:

  1. In the Datadog Log Explorer, open one of the summary messages. These are messages that start with Repeats NNx in the past.... Grepr replaces parameters that change between aggregated messages with a placeholder, such as <number>, <timestamp>, or <any>.

    Summary message view in the Datadog Log Explorer.

    In this example message, Grepr has identified a timestamp parameter and a number parameter that change between log messages. You can also see some Grepr-specific fields in the details, including:

    • firstTimestamp and lastTimestamp are the timestamps of the first and last log message that Grepr aggregated into this summary message.
    • patternId is the identifier of the pattern matched by messages in this summary.
    • rawLogsUrl is a URL that you can click on to see the raw log messages that Grepr aggregated into this summary message.
    • repeatCount is the number of log messages that Grepr aggregated into this summary message.
  2. To find the other messages that belong to this summary’s pattern, hover over patternId and click **Filter by @grepr.patternId:<pattern-id>**, where pattern-id is the unique pattern identifier of the aggregated messages.

    Filter messages in the Datadog Log Explorer detail view.

    This selection filters the logs to display only log messages with the same pattern as the selected message.

    View of messages filtered by pattern in the Datadog Log Explorer.

    In some cases, you might see multiple summary messages with the same pattern ID. This is expected and does not mean that Grepr is aggregating the messages incorrectly. Grepr creates separate summary messages to ensure that no messages are lost in the following scenarios:

    • Late-arriving data: If a message arrives after Grepr has created the summary message for its pattern, Grepr creates a new summary message that includes the late-arriving message.
    • Summary messages that exceed observability tool limits: To avoid errors if your observability tool receives an aggregated message that exceeds its limits, Grepr creates a new summary message for aggregating additional messages.
  3. Reopen the summary message and click the URL in the rawLogsUrl field.

    Raw log URL in the Datadog Log Explorer summary view.

    The Grepr Data Explorer opens and displays the results of a search for the raw messages belonging to the same hosts, service, and time period as the summarized messages, highlighting all the messages with the same pattern ID.

    Raw log view in the Grepr Data Explorer.

  4. To open a side panel displaying the details of a message, click one of the messages.

    Log message detail in the Grepr Data Explorer.

  5. To load all of these raw log messages into Datadog so that you can search and analyze them in the Datadog UI, expand the menu next to Search in the top right of the Data Explorer and select Backfill.

    Selecting backfill from the Search menu.

    Clicking Backfill starts a Grepr backfill job that loads the raw logs into Datadog. You can see the status of the job by clicking Jobs in the top bar of the Grepr Data Explorer.

    Viewing the status of a backfill job in the Grepr Data Explorer.

  6. To view the backfilled messages in Datadog after the job completes, select the job in the Jobs menu. Selecting the job takes you to the Datadog Log Explorer, where you can view the backfilled messages. Because Datadog needs to index the new messages, there might be a short delay before they appear in the Datadog UI.

    Backfilled logs in the Datadog Log Explorer.

    When you run a backfill, Grepr automatically deduplicates messages, so you do not get multiple copies of a log message if you backfill logs across multiple searches.

    You can now use the Datadog Log Explorer to search and analyze the backfilled messages.
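
If you export summary messages for scripted analysis, the pattern filter from step 2 and the repeat counts from step 1 can be handled roughly as in the following Python sketch. It is an illustration under stated assumptions, not a Grepr or Datadog API: the functions `pattern_filter` and `total_repeats` are hypothetical helpers, the dictionary layout (fields nested under a `grepr` key) is inferred only from the `@grepr.patternId` filter shown above, and all sample values are made up. It also assumes that summary messages with the same pattern ID each count a different set of raw messages, so their `repeatCount` values can be added.

```python
from collections import defaultdict


def pattern_filter(pattern_id: str) -> str:
    """Return the filter string that step 2 applies in the Log Explorer."""
    return f"@grepr.patternId:{pattern_id}"


def total_repeats(summaries: list[dict]) -> dict[str, int]:
    """Sum repeatCount per patternId across several summary messages."""
    totals: dict[str, int] = defaultdict(int)
    for summary in summaries:
        grepr = summary["grepr"]  # assumed nesting, not confirmed by the tutorial
        totals[grepr["patternId"]] += grepr["repeatCount"]
    return dict(totals)


# Hypothetical summary messages; every value below is invented for illustration.
summaries = [
    {"grepr": {"patternId": "pattern-a", "repeatCount": 40}},
    {"grepr": {"patternId": "pattern-a", "repeatCount": 2}},  # e.g. late-arriving data
    {"grepr": {"patternId": "pattern-b", "repeatCount": 7}},
]

print(pattern_filter("pattern-a"))
print(total_repeats(summaries))
```

Grouping by pattern ID before summing reflects the scenarios described in step 2, where one pattern can legitimately appear in more than one summary message.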
