Overview of the Grepr Platform
The Grepr platform enables efficient collection, storage, processing, and analysis of log data. Because Grepr is built on a general-purpose stream processing engine, you can create processing pipelines for your logs using standard observability pipeline capabilities such as filtering, parsing, transforming, and routing of processed logs. Grepr log processing also optimizes the logs shipped to observability platforms, reducing log volumes and observability costs.
This document provides an overview of the core components of the Grepr platform that support this functionality and explains how to access platform features. It also covers security, high availability, and scalability support in the platform.
The Grepr platform is provided using a software-as-a-service (SaaS) model. For larger deployments or for customers with strict compliance requirements, Grepr also offers a private cloud deployment model. Grepr runs in AWS for both models.
Reducing log volumes with dynamic aggregation
Grepr uses machine learning to identify patterns in your log data and aggregate similar log messages. The output of this aggregation includes summaries and samples of similar messages, which are then forwarded to your observability vendor’s platform. By sending summaries of similar messages instead of multiple events with redundant information, Grepr significantly reduces the size of the shipped logs and your observability costs, while retaining the ability to analyze and troubleshoot issues effectively. Grepr retains the original raw logs in low-cost object storage, allowing you to access the full logs for troubleshooting or analysis.
This aggregation occurs automatically and requires no configuration. When you need to modify the default aggregation, several configuration options are available to tune the aggregation behavior based on your requirements.
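The idea of grouping similar messages into a summary with a count and a few samples can be pictured with a small sketch. This is not Grepr's algorithm: Grepr uses machine learning to find patterns, while this illustration swaps in a simple regular expression that masks numbers. The names `to_pattern`, `aggregate`, and `samples_per_pattern`, and the shape of the summary dictionaries, are assumptions made for this example only.

```python
import re
from collections import defaultdict

# Assumption for this sketch: messages are "similar" when they are identical
# after every run of digits is replaced with a placeholder.
NUMBER = re.compile(r"\d+")


def to_pattern(message: str) -> str:
    """Mask the variable (numeric) parts of a message."""
    return NUMBER.sub("<num>", message)


def aggregate(messages, samples_per_pattern=1):
    """Group messages by pattern and return one summary per pattern."""
    groups = defaultdict(list)
    for message in messages:
        groups[to_pattern(message)].append(message)
    return [
        {
            "pattern": pattern,
            "count": len(group),
            "samples": group[:samples_per_pattern],
        }
        for pattern, group in groups.items()
    ]


logs = [
    "Request 4821 completed in 35 ms",
    "Request 4822 completed in 41 ms",
    "Request 4823 completed in 38 ms",
    "Disk usage at 91 percent on node 7",
]

for summary in aggregate(logs):
    print(summary)
```

In this toy example, four log lines collapse into two summaries, and the unmasked originals remain available separately, which mirrors the split between forwarded summaries and raw logs kept in object storage.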
When an incident occurs or alerts are raised in your infrastructure, Grepr can automatically ensure that you have a complete set of logs to debug the issue. Grepr does this by temporarily increasing the granularity of related logs forwarded to your observability platform and by backfilling relevant logs from the raw data store. This capability can be triggered manually or automatically based on alerts from your monitoring system or when specific matches are found in the log data.
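As a rough illustration of backfilling, the sketch below selects raw log records for one service inside a time window around an alert. How Grepr actually decides which logs are relevant is not described here, so the `backfill` function, the `service` and `timestamp` keys, and the window sizes are all hypothetical choices for this example.

```python
from datetime import datetime, timedelta


def backfill(raw_logs, alert_time, service,
             before=timedelta(minutes=10), after=timedelta(minutes=5)):
    """Return raw records for `service` within a window around `alert_time`.

    Hypothetical selection rule: match on service name and timestamp only.
    """
    start = alert_time - before
    end = alert_time + after
    return [
        record
        for record in raw_logs
        if record["service"] == service and start <= record["timestamp"] <= end
    ]


raw_store = [
    {"timestamp": datetime(2024, 1, 1, 11, 40), "service": "checkout", "message": "cart loaded"},
    {"timestamp": datetime(2024, 1, 1, 11, 55), "service": "checkout", "message": "payment retry"},
    {"timestamp": datetime(2024, 1, 1, 12, 3), "service": "search", "message": "query served"},
    {"timestamp": datetime(2024, 1, 1, 12, 4), "service": "checkout", "message": "payment failed"},
]

alert = datetime(2024, 1, 1, 12, 0)
for record in backfill(raw_store, alert, "checkout"):
    print(record["timestamp"].isoformat(), record["message"])
```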
To learn more about how Grepr reduces log volumes with dynamic aggregation, see Optimizing logs with Grepr’s intelligent aggregation.
To learn how to use summarized logs in your observability platform along with raw logs stored in Grepr, see Troubleshoot incidents with Grepr-processed data and your observability tool.
Understanding the Grepr processing and data models
Grepr uses a flexible processing model that allows you to create complex log processing pipelines and run queries. The processing model supports standard observability pipeline capabilities, including filtering, parsing, transforming, and routing of processed logs. Grepr also uses a standardized data model for processed log events that represents log data as records with fields and attributes. This data model enables efficient processing and querying of log data.
See The Grepr processing and data models.
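To make the four pipeline stages concrete, here is a minimal sketch of parsing, filtering, transforming, and routing applied to a few log lines. It does not use Grepr's pipeline configuration or API. The `LogRecord` class is a hypothetical stand-in for a record with fields and attributes, and the input line format, the `env` attribute, and the destination names `alerts` and `archive` are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LogRecord:
    # Hypothetical record shape for this sketch; not Grepr's actual schema.
    message: str
    severity: str = "INFO"
    attributes: dict = field(default_factory=dict)


def parse(line: str) -> Optional[LogRecord]:
    """Parse 'SEVERITY service message' lines; return None if malformed."""
    parts = line.split(" ", 2)
    if len(parts) != 3:
        return None
    severity, service, message = parts
    return LogRecord(message=message, severity=severity,
                     attributes={"service": service})


def keep(record: LogRecord) -> bool:
    """Filter: drop debug-level records."""
    return record.severity != "DEBUG"


def transform(record: LogRecord) -> LogRecord:
    """Transform: enrich the record with an extra attribute."""
    record.attributes["env"] = "prod"  # assumed value for the example
    return record


def route(record: LogRecord) -> str:
    """Route: choose a destination name based on severity."""
    return "alerts" if record.severity == "ERROR" else "archive"


def run_pipeline(lines):
    destinations = {"alerts": [], "archive": []}
    for line in lines:
        record = parse(line)
        if record is None or not keep(record):
            continue
        destinations[route(record)].append(transform(record))
    return destinations


lines = [
    "ERROR checkout payment failed",
    "DEBUG checkout cache miss",
    "INFO search query served",
    "malformed",
]
for name, records in run_pipeline(lines).items():
    print(name, [r.message for r in records])
```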
Raw data storage in the Grepr data lake
The Grepr data lake uses Amazon S3 to provide low-cost object storage for your full raw logs. You can use an S3 bucket provided by Grepr or a bucket in your own account. Logs are stored as Apache Parquet files in the Apache Iceberg table format, providing efficient storage and querying of the logs. To query logs stored in the data lake, you can use the Data Explorer in the Grepr UI, which provides a visual interface.
To learn more about the data lake, see The Grepr data lake.
To learn more about querying log data in the Grepr data lake, see Query log data in Grepr.
For programmatic access, the Grepr CLI and REST API also support querying logs in the data lake. See The Grepr CLI and Automate Grepr processing with the Grepr REST APIs.
Security in the Grepr platform
The Grepr platform offers comprehensive security and compliance features to safeguard your data and fulfill enterprise security requirements. These features include secure authentication and identity management with OAuth 2.0, SOC 2 Type 2 compliance, HIPAA compliance, infrastructure and networking security, and data encryption.
See Security in the Grepr platform.
High availability and scalability
The Grepr platform is designed for high availability, ensuring minimal downtime for your log processing pipelines. The platform is also designed for scalability, automatically adding and removing capacity as needed to match processing loads.
See How does Grepr ensure high availability?
Accessing Grepr functionality
Grepr offers three primary ways to access the platform’s capabilities: a web-based user interface (UI), a set of RESTful APIs, and a command-line interface (CLI). You can use any combination of these access methods based on your requirements and preferences.
The Grepr web-based user interface
The Grepr UI provides:
- A streamlined, task-oriented experience for creating and managing pipelines.
- Visual tools for searching and exploring log data.
- Dashboards that visualize pipeline performance and data flows.
- Guided workflows for common observability tasks.
The Grepr REST API
To help you access advanced features and automate tasks, the REST API provides:
- Complete programmatic control over all Grepr capabilities.
- The ability to integrate Grepr into your existing workflows and tools.
- Customization options for complex observability pipelines.
- Support for infrastructure-as-code and GitOps approaches.
See Automate Grepr processing with the Grepr REST APIs.
The Grepr command-line interface (CLI)
The Grepr CLI allows you to manage jobs and run queries in the Grepr platform from the command line. You can also use the CLI in your scripts or other clients to automate tasks.
See The Grepr CLI.