
The Grepr CLI

The Grepr command-line interface (CLI) allows you to manage jobs and run queries in the Grepr platform. You can also use the CLI in your scripts or other clients to automate tasks. The CLI supports the following features:

  • Secure authentication with OAuth 2.0 and automatic refresh and storage of tokens.
  • Management of multiple named configuration profiles.
  • Real-time streaming of job results in a standard, structured format.
  • A choice of output format for results, including enhanced tables, CSV, pretty-printed JSON, raw, and single-line JSON without indentation (compact).
  • Saving results directly to files with the --output option.
  • Running queries against datasets.

The Grepr CLI uses the Grepr REST APIs to provide a convenient command-line interface for common operations. For more advanced programmatic access, you can use the APIs directly. See Automate Grepr processing with the Grepr REST APIs.

Requirements

Installing the Grepr CLI requires Node.js version 20 or later.

Installation

Option 1: Install Globally

# Install globally
npm install -g @grepr/cli

# Or with yarn
yarn global add @grepr/cli

# Or with pnpm
pnpm add -g @grepr/cli

After installation, you can run the CLI directly:

grepr --help

Option 2: Run with npx (No Installation Required)

# Run directly from npm registry
npx @grepr/cli --help

Example: Create a CLI configuration, create and run a job, and run a query

# Save a configuration for easy reuse
grepr --org-name myorg config:save myconfig

# Or save and set as default (no need to specify --conf for future commands)
grepr --org-name myorg config:save myconfig --default

# Use the saved configuration to create and run a job
grepr --conf myconfig job:create templates/default-job.json

# Or, if you set a default configuration, run commands without --conf or --org-name
grepr job:create templates/default-job.json

# Execute a quick query without job files, using the default config
grepr query --dataset-name "production-logs" --query "level:ERROR"

Configuration files are stored in ~/.grepr/cli-config.json:

{
  "_default": "dev",
  "dev": {
    "orgName": "myorg",
    "apiBaseUrl": "https://myorg.app.grepr.ai/api",
    "authBaseUrl": "https://grepr-prod.us.auth0.com",
    "clientId": "${CLIENT_ID}",
    "authMethod": "oauth"
  }
}
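Because the configuration file is plain JSON, scripts can resolve the active profile themselves. The following Python sketch reads a config file in the format shown above and falls back to the `_default` entry; the helper name and any behavior beyond what the example file shows are assumptions, not part of the CLI:

```python
import json
from pathlib import Path

def load_profile(name=None, path="~/.grepr/cli-config.json"):
    """Load a named CLI profile; fall back to the _default entry (illustrative helper)."""
    config = json.loads(Path(path).expanduser().read_text())
    profile_name = name or config.get("_default")
    if profile_name is None or profile_name not in config:
        raise KeyError(f"No profile named {profile_name!r} in {path}")
    return config[profile_name]

# Demo with a temporary copy of the sample config above
sample = {
    "_default": "dev",
    "dev": {"orgName": "myorg", "apiBaseUrl": "https://myorg.app.grepr.ai/api"},
}
demo_path = "/tmp/grepr-cli-config.json"
Path(demo_path).write_text(json.dumps(sample))

profile = load_profile(path=demo_path)
print(profile["orgName"])  # -> myorg
```

A script could use the resolved `apiBaseUrl` to call the REST APIs directly with the same settings the CLI uses.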

Authentication

The Grepr CLI redirects you to a login form when you run your first CLI command or when your authentication token has expired. After you sign in with your Grepr credentials, return to the command line and wait for your command to complete.

If you experience an authentication issue, you can often resolve it by re-authenticating. To re-authenticate, remove any currently cached tokens so that the next command you run requires authentication:

# Clear cached tokens and re-authenticate
rm -rf ~/.grepr/auth/
grepr --conf dev integration:list

Command Reference

Global Options

The following options are available for all commands:

--conf <name>            Use a saved configuration
--org-name <name>        Organization name (required)
--api-base-url <url>     API server base URL (default: https://<orgName>.app.grepr.ai/api)
--auth-base-url <url>    Auth0 base URL (default: https://grepr-prod.us.auth0.com)
--auth-method <method>   Authentication method: oauth
--client-id <id>         OAuth Client ID (default: web client ID)
--timezone <tz>          Timezone for timestamp formatting (default: system)
-o, --output <file>      Output results to a file instead of stdout
-d, --debug              Enable debug output
-q, --quiet              Suppress non-essential output

Troubleshoot issues with debug mode

If you experience issues when running CLI commands, such as network issues, you can use the --debug option to capture the full logging output:

# Enable debug mode for detailed logging
grepr --conf dev --debug job:create job.json

CLI Commands

To display the help for a specific command, use:

grepr <command> --help

For example:

grepr job:create --help

The Grepr CLI supports the following commands:

Command group          | Description                                                  | Commands
Job Management         | Commands for creating and managing Grepr jobs.               | job:create, job:list, job:get, job:update, job:delete
Queries                | Run a query against data in the Grepr data lake.             | query
Integration Management | Manage integrations with external services and data sources. | integration:list, integration:get
Dataset Management     | Manage datasets for log storage and querying.                | dataset:create, dataset:list, dataset:get, dataset:update, dataset:delete
Configuration          | Configure CLI settings and preferences.                      | config:save, config:delete, config:list, config:show, config:default
Documentation          | Search and retrieve Grepr documentation.                     | docs:search, docs:get

Command examples: run queries

# Query by dataset name with time range
grepr --conf dev query \
  --dataset-name "production-logs" \
  --start "2024-01-01T00:00:00Z" \
  --end "2024-01-01T01:00:00Z" \
  --query "service:web-api AND level:ERROR"

# Query by dataset ID with CSV output to file
grepr --conf dev query \
  --dataset-id "abc123-def456-ghi789" \
  --format csv \
  --output errors.csv \
  --limit 1000 \
  --query "level:ERROR"

# Get all logs for the last hour, sorted by timestamp
grepr --conf dev query \
  --dataset-name "production-logs" \
  --sort-order "ASC" \
  --sort "eventTimestamp:asc"

Command examples: search documentation

The CLI includes built-in semantic search over Grepr documentation, enabling you to find relevant information without leaving the terminal. This is particularly useful when working with AI assistants or scripting workflows.

# Search for information about Datadog integration
grepr docs:search "datadog integration"

# Search only API operations
grepr docs:search "create job" --type api

# Search only data schemas
grepr docs:search "grok parser" --type schema

# Search only user documentation
grepr docs:search "log reduction" --type doc

# Search and output as JSON for scripting
grepr docs:search "log reduction" -f json

# Disable colors for piping to other tools
grepr docs:search "grok patterns" --no-color

# Custom preview length (default: 400 characters)
grepr docs:search "flink configuration" -p 200

After finding a document, retrieve its full content:

# Get full document by URI from search results
grepr docs:get "doc://integrations/datadog/page.mdx"

# Pipe to AI assistant or other tools
grepr docs:get "doc://tutorials/first-pipeline/page.mdx" | claude

# Get API operation documentation
grepr docs:get "api://api/Jobs/submitSyncJob"

# Get schema definition
grepr docs:get "schema://GrokParser"

# Save documentation locally
grepr docs:get "doc://integrations/datadog/page.mdx" > datadog-docs.md

The documentation search includes three types of content:

  • User documentation (doc://): Tutorials, guides, and integration instructions
  • API operations (api://): REST endpoint documentation with parameters, request/response schemas, and examples
  • Data schemas (schema://): Complete schema definitions for job configurations and API payloads

For example, searching for “create grok parser” will return both the GrokParser schema definition and the API endpoint for validating Grok rules. This makes it easy to find the exact structure needed for job configurations or API requests.

The typical workflow combines both commands:

  1. Search to find relevant documentation: grepr docs:search "topic" -f compact
  2. Get the full content using the URI from results: grepr docs:get "doc://path/to/file.mdx"

Job Definition Format

Job definitions are JSON files that specify the query to execute. The CLI automatically handles both synchronous and asynchronous jobs based on the execution field.

To learn more about job definitions, see Create and manage jobs with the Grepr REST API.

The following fields are required in a job definition:

  • name: A name identifying the job.
  • execution:
    • SYNCHRONOUS for a job that streams results in real time to the CLI. The output format is set with the --format option.
    • ASYNCHRONOUS for a job that runs in the background. These jobs can be monitored with the job:list and job:get commands.
  • processing: BATCH or STREAMING.
  • jobGraph: the data processing pipeline definition.

Example: define a synchronous job

Streams results in real time:

{
  "name": "realtime-query-job",
  "execution": "SYNCHRONOUS",
  "processing": "BATCH",
  "jobGraph": {
    "vertices": [
      {
        "type": "logs-iceberg-table-source",
        "name": "data_source",
        "start": "2025-10-13T00:00:00Z",
        "end": "2025-10-15T00:00:00Z",
        "query": {
          "type": "datadog-query",
          "query": "level:ERROR"
        },
        "sortOrder": "UNSORTED",
        "datasetId": "your-dataset-id"
      },
      {
        "type": "logs-sync-sink",
        "name": "sink"
      }
    ],
    "edges": [
      "data_source -> sink"
    ]
  }
}

Example: define an asynchronous job

Creates a job that can be monitored and managed:

{
  "name": "background-processing-job",
  "execution": "ASYNCHRONOUS",
  "processing": "BATCH",
  "jobGraph": {
    "vertices": [
      {
        "type": "logs-iceberg-table-source",
        "name": "data_source",
        "start": "2025-10-13T00:00:00Z",
        "end": "2025-10-15T00:00:00Z",
        "datasetId": "your-dataset-id"
      },
      {
        "type": "logs-sink",
        "name": "output_sink"
      }
    ],
    "edges": [
      "data_source -> output_sink"
    ]
  }
}

Output formats

Table format (default)

Displays results in a formatted table with:

  • Column wrapping (80 characters max).
  • Pretty-printed JSON in cells.
  • Configurable sorting.
  • Proper alignment.
+----------+------------------------+------------------+ | id | eventTimestamp | message | +----------+------------------------+------------------+ | abc123 | 10/19/2025, 6:00:00 PM | Error occurred | | def456 | 10/19/2025, 6:01:00 PM | Request | | | | completed | +----------+------------------------+------------------+

Pretty format

Syntax-highlighted JSON with colors:

{
  "id": "abc123",
  "eventTimestamp": "2025-10-19T18:00:00Z",
  "tags": {
    "service": "web-api",
    "environment": "prod"
  }
}

Raw format

Unprocessed JSON strings, one per line:

{"id":"abc123","eventTimestamp":"2025-10-19T18:00:00Z"}
{"id":"def456","eventTimestamp":"2025-10-19T18:01:00Z"}

CSV format

Comma-separated values with proper escaping:

id,eventTimestamp,message,tags
abc123,"10/19/2025, 6:00:00 PM","Error occurred","{""service"":[""web-api""]}"
def456,"10/19/2025, 6:01:00 PM","Request completed","{""service"":[""web-api""]}"

Useful for data analysis, spreadsheets, and data pipelines. Handles complex JSON data by properly escaping nested objects and arrays.
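Because nested objects arrive as escaped JSON strings inside CSV cells, a downstream script should decode those cells after parsing the CSV itself. A minimal Python sketch using the sample rows above (the column names come from the example; decoding only the tags column is an assumption about which cells hold JSON):

```python
import csv
import io
import json

# Sample CSV output, as shown in the example above
raw = '''id,eventTimestamp,message,tags
abc123,"10/19/2025, 6:00:00 PM","Error occurred","{""service"":[""web-api""]}"
def456,"10/19/2025, 6:01:00 PM","Request completed","{""service"":[""web-api""]}"
'''

rows = []
for row in csv.DictReader(io.StringIO(raw)):
    # The tags cell is a JSON object; csv unescapes the doubled quotes,
    # then json.loads turns the string into a dict
    row["tags"] = json.loads(row["tags"])
    rows.append(row)

print(rows[0]["tags"]["service"])  # -> ['web-api']
```

The same approach works on a file saved with --output: open the file instead of the in-memory string.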

Compact format

Single-line JSON without indentation:

{"id":"abc123","eventTimestamp":"2025-10-19T18:00:00Z","message":"Error"}
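Compact output is one JSON object per line, which makes it easy to stream through scripts: read a line, parse it, act on it. A small Python sketch that filters error events from such a stream (the field names follow the examples above; the in-memory stream stands in for piped CLI output):

```python
import io
import json

# Two compact-format lines, standing in for piped CLI output
stream = io.StringIO(
    '{"id":"abc123","eventTimestamp":"2025-10-19T18:00:00Z","message":"Error"}\n'
    '{"id":"def456","eventTimestamp":"2025-10-19T18:01:00Z","message":"Request completed"}\n'
)

error_ids = []
for line in stream:
    event = json.loads(line)          # each line is a complete JSON object
    if "Error" in event.get("message", ""):
        error_ids.append(event["id"])

print(error_ids)  # -> ['abc123']
```

In a real pipeline you would replace the in-memory stream with `sys.stdin` and pipe the CLI into the script.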

Track job status in real time

The CLI displays real-time job status updates:

Status        | Description
[CONNECTING]  | Establishing connection.
[CONNECTED]   | Connection established.
[RUNNING]     | The job is processing data.
[FINISHED]    | The job completed successfully.
[FAILED]      | The job failed.
[CANCELLED]   | The job was cancelled.
[TIMED_OUT]   | The job exceeded the time limit.
[SCANNED_MAX] | The job exceeded the data scan limit.