Grepr CLI

The Grepr CLI (command-line interface) allows you to manage jobs and run queries in the Grepr platform. You can also use the CLI in your scripts or other clients to automate tasks. The CLI supports the following features:

  • Secure authentication with OAuth 2.0 and automatic refresh and storage of tokens.
  • Management of multiple named configuration profiles.
  • Real-time streaming of job results in a standard, structured format.
  • A choice of output formats, including formatted tables, CSV, pretty-printed JSON, raw JSON lines, and compact single-line JSON.
  • Saving results directly to files with the --output option.
  • Running queries against datasets.

The Grepr CLI uses the Grepr REST APIs to provide a convenient command-line interface for common operations. For more advanced programmatic access, you can use the APIs directly. See Automate Grepr processing with the Grepr REST APIs.

Requirements

Installing the Grepr CLI requires Node.js version 20 or later.

Installation

Option 1: Install Globally

# Install globally
npm install -g @grepr/cli

# Or with yarn
yarn global add @grepr/cli

# Or with pnpm
pnpm add -g @grepr/cli

After installation, you can run the CLI directly:

grepr --help

Option 2: Run with npx (No Installation Required)

# Run directly from npm registry
npx @grepr/cli --help

Example: Create a CLI configuration, create and run a job, and run a query

# Save a configuration for easy reuse
grepr --org-name myorg config:save myconfig

# Use the saved configuration to create and run a job
grepr --conf myconfig job:create templates/default-job.json

# Or specify all options directly
grepr --org-name myorg job:create templates/default-job.json

# Execute a quick query without job files
grepr --conf myconfig query --dataset-name "production-logs" --query "level:ERROR"

Configuration files are stored in ~/.grepr/cli-config.json:

{ "dev": { "orgName": "myorg", "apiBaseUrl": "https://myorg.app.grepr.ai/api", "authBaseUrl": "https://grepr-prod.us.auth0.com", "clientId": "${CLIENT_ID}", "authMethod": "oauth" } }

Authentication

The Grepr CLI redirects you to a login form when you run your first CLI command or when your authentication token has expired. After you sign in with your Grepr credentials, return to the command line and wait for your command to complete.

If you experience an authentication issue, you can often resolve it by re-authenticating. To re-authenticate, remove any currently cached tokens so that the next command you run requires authentication:

# Clear cached tokens and re-authenticate
rm -rf ~/.grepr/auth/
grepr --conf dev integration:list

Command Reference

Global Options

The following options are available for all commands:

--conf <name>            Use saved configuration
--org-name <name>        Organization name (required)
--api-base-url <url>     API server base URL (default: https://<orgName>.app.grepr.ai/api)
--auth-base-url <url>    Auth0 base URL (default: https://grepr-prod.us.auth0.com)
--auth-method <method>   Authentication method: oauth
--client-id <id>         OAuth client ID (default: web client ID)
--timezone <tz>          Timezone for timestamp formatting (default: system)
-o, --output <file>      Output results to file instead of stdout
-d, --debug              Enable debug output
-q, --quiet              Suppress non-essential output
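Global options can be combined on any command. For example, a sketch that runs a query with a saved profile, formats timestamps for a specific timezone (assuming IANA names such as America/New_York), and writes the results to a file:

# Saved profile + timezone + quiet mode + file output
grepr --conf dev --timezone "America/New_York" --quiet \
  --output results.json query --dataset-name "production-logs" --query "level:ERROR"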

Troubleshoot issues with debug mode

If you encounter issues when running CLI commands, such as network errors, you can use the --debug option to capture the full logging output:

# Enable debug mode for detailed logging
grepr --conf dev --debug job:create job.json

CLI Commands

To display the help for a specific command, use:

grepr <command> --help

For example:

grepr job:create --help

The Grepr CLI supports the following commands:

Command group           Description and commands
----------------------  --------------------------------------------------------------
Job Management          Commands for creating and managing Grepr jobs.
                        job:create, job:list, job:get, job:update, job:delete
Queries                 Run a query against data in the Grepr data lake.
                        query
Integration Management  Manage integrations with external services and data sources.
                        integration:list, integration:get
Dataset Management      Manage datasets for log storage and querying.
                        dataset:create, dataset:list, dataset:get, dataset:update, dataset:delete
Configuration           Configure CLI settings and preferences.
                        config:save, config:delete, config:list, config:show
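A saved profile works across all command groups. A minimal sketch (assuming integration:list and dataset:list need no arguments, as the re-authentication example above suggests for integration:list):

# Save a profile once, then reuse it across command groups
grepr --org-name myorg config:save myconfig
grepr --conf myconfig integration:list
grepr --conf myconfig dataset:list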

Command examples: run queries

# Query by dataset name with time range
grepr --conf dev query \
  --dataset-name "production-logs" \
  --start "2024-01-01T00:00:00Z" \
  --end "2024-01-01T01:00:00Z" \
  --query "service:web-api AND level:ERROR"

# Query by dataset ID with CSV output to file
grepr --conf dev query \
  --dataset-id "abc123-def456-ghi789" \
  --format csv \
  --output errors.csv \
  --limit 1000 \
  --query "level:ERROR"

# Get all logs for the last hour, sorted by timestamp
grepr --conf dev query \
  --dataset-name "production-logs" \
  --sort-order "ASC" \
  --sort "eventTimestamp:asc"

Job Definition Format

Job definitions are JSON files that specify the query to execute. The CLI automatically handles both synchronous and asynchronous jobs based on the execution field.

To learn more about job definitions, see Create and manage jobs with the Grepr REST API.

The following fields are required in a job definition:

  • name: A name identifying the job.
  • execution:
    • "SYNCHRONOUS" for a job that streams results in real-time to the CLI. The output format is set with the --format option.
    • "ASYNCHRONOUS" for a job that runs in the background. These jobs can be monitored with the job:list and job:get commands.
  • processing: "BATCH" or "STREAMING".
  • jobGraph: The data processing pipeline definition.

Example: define a synchronous job

This job streams results in real time:

{ "name": "realtime-query-job", "execution": "SYNCHRONOUS", "processing": "BATCH", "jobGraph": { "vertices": [ { "type": "logs-iceberg-table-source", "name": "data_source", "start": "2025-10-13T00:00:00Z", "end": "2025-10-15T00:00:00Z", "query": { "type": "datadog-query", "query": "level:ERROR" }, "sortOrder": "UNSORTED", "datasetId": "your-dataset-id" }, { "type": "logs-sync-sink", "name": "sink" } ], "edges": [ "data_source -> sink" ] } }

Example: define an asynchronous job

Creates a job that can be monitored and managed:

{ "name": "background-processing-job", "execution": "ASYNCHRONOUS", "processing": "BATCH", "jobGraph": { "vertices": [ { "type": "logs-iceberg-table-source", "name": "data_source", "start": "2025-10-13T00:00:00Z", "end": "2025-10-15T00:00:00Z", "datasetId": "your-dataset-id" }, { "type": "logs-sink", "name": "output_sink" } ], "edges": [ "data_source -> output_sink" ] } }

Output formats

Table format (default)

Displays results in a formatted table with:

  • Column wrapping (80 characters max)
  • Pretty-printed JSON in cells
  • Configurable sorting
  • Proper alignment
+----------+------------------------+------------------+ | id | eventTimestamp | message | +----------+------------------------+------------------+ | abc123 | 10/19/2025, 6:00:00 PM | Error occurred | | def456 | 10/19/2025, 6:01:00 PM | Request | | | | completed | +----------+------------------------+------------------+

Pretty format

Syntax-highlighted JSON with colors:

{ "id": "abc123", "eventTimestamp": "2025-10-19T18:00:00Z", "tags": { "service": "web-api", "environment": "prod" } }

Raw format

Unprocessed JSON strings, one per line:

{"id":"abc123","eventTimestamp":"2025-10-19T18:00:00Z"} {"id":"def456","eventTimestamp":"2025-10-19T18:01:00Z"}

CSV format

Comma-separated values with proper escaping:

id,eventTimestamp,message,tags
abc123,"10/19/2025, 6:00:00 PM","Error occurred","{""service"":[""web-api""]}"
def456,"10/19/2025, 6:01:00 PM","Request completed","{""service"":[""web-api""]}"

Useful for data analysis, spreadsheets, and data pipelines. Handles complex JSON data by properly escaping nested objects and arrays.

Compact format

Single-line JSON without indentation:

{"id":"abc123","eventTimestamp":"2025-10-19T18:00:00Z","message":"Error"}

Track job status in real time

The CLI displays real-time job status updates:

Status         Description
-------------  --------------------------------------
[CONNECTING]   Establishing connection
[CONNECTED]    Connection established
[RUNNING]      The job is processing data
[FINISHED]     The job completed successfully
[FAILED]       The job failed
[CANCELLED]    The job was cancelled
[TIMED_OUT]    The job exceeded the time limit
[SCANNED_MAX]  The job exceeded the data scan limit