Grepr CLI
The Grepr CLI (command-line interface) allows you to manage jobs and run queries in the Grepr platform. You can also use the CLI in your scripts or other clients to automate tasks. The CLI supports the following features:
- Secure authentication with OAuth 2.0 and automatic refresh and storage of tokens.
- Management of multiple named configuration profiles.
- Real-time streaming of job results in a standard, structured format.
- A choice of output format for results, including enhanced tables, CSV, pretty-printed JSON, raw, and single-line JSON without indentation (compact).
- Saving results directly to files with the --output option.
- Running queries against datasets.
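For example, the following command combines several of these features. It is a minimal sketch that assumes a saved configuration named dev and a dataset named production-logs, matching the examples later on this page:
# Query a dataset and save the results to a file (configuration and dataset names are assumptions)
grepr --conf dev query --dataset-name "production-logs" --query "level:ERROR" --output results.json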
The Grepr CLI uses the Grepr REST APIs to provide a convenient command-line interface for common operations. For more advanced programmatic access, you can use the APIs directly. See Automate Grepr processing with the Grepr REST APIs.
Requirements
Installing the Grepr CLI requires Node.js version 20 or above.
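To confirm that your environment meets this requirement, check the installed version:
# Verify that Node.js is version 20 or above
node --version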
Installation
Option 1: Install from npm Registry (Recommended)
# Install globally
npm install -g @grepr/cli
# Or with yarn
yarn global add @grepr/cli
# Or with pnpm
pnpm add -g @grepr/cli
After installation, you can run the CLI directly:
grepr --help
Option 2: Run with npx (No Installation Required)
# Run directly from npm registry
npx @grepr/cli --help
Example: Create a CLI configuration, create and run a job, and run a query
# Save a configuration for easy reuse
grepr --org-name myorg config:save myconfig
# Use the saved configuration to create and run a job
grepr --conf myconfig job:create templates/default-job.json
# Or specify all options directly
grepr --org-name myorg job:create templates/default-job.json
# Execute a quick query without job files
grepr --conf myconfig query --dataset-name "production-logs" --query "level:ERROR"
Configuration files are stored in ~/.grepr/cli-config.json:
{
"dev": {
"orgName": "myorg",
"apiBaseUrl": "https://myorg.app.grepr.ai/api",
"authBaseUrl": "https://grepr-prod.us.auth0.com",
"clientId": "${CLIENT_ID}",
"authMethod": "oauth"
}
}
Authentication
The Grepr CLI redirects you to a login form when you run your first CLI command or when your authentication token has expired. After you sign in with your Grepr credentials, return to the command line and wait for your command to complete.
If you experience an authentication issue, you can often resolve it by re-authenticating. To re-authenticate, remove any currently cached tokens so that the next command you run requires authentication:
# Clear cached tokens and re-authenticate
rm -rf ~/.grepr/auth/
grepr --conf dev integration:list
Command Reference
Global Options
The following options are available for all commands:
--conf <name> Use saved configuration
--org-name <name> Organization name (required)
--api-base-url <url> API server base URL (default: https://<orgName>.app.grepr.ai/api)
--auth-base-url <url> Auth0 base URL (default: https://grepr-prod.us.auth0.com)
--auth-method <method> Authentication method: oauth
--client-id <id> OAuth Client ID (default: web client ID)
--timezone <tz> Timezone for timestamp formatting (default: system)
-o, --output <file> Output results to file instead of stdout
-d, --debug Enable debug output
-q, --quiet Suppress non-essential output
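For example, the following command combines several global options. It is a sketch that assumes a saved configuration named dev and that --timezone accepts IANA names such as UTC:
# List jobs with UTC timestamps and minimal output (timezone name assumed)
grepr --conf dev --timezone "UTC" --quiet job:list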
Troubleshoot issues with debug mode
If you experience issues when running CLI commands, such as network issues, you can use the --debug option to capture the full logging output:
# Enable debug mode for detailed logging
grepr --conf dev --debug job:create job.json
CLI Commands
To display the help for a specific command, use:
grepr <command> --help
For example:
grepr job:create --help
The Grepr CLI supports the following commands:
| Command group | Description and commands |
|---|---|
| Job Management | Commands for creating and managing Grepr jobs: job:create, job:list, job:get, job:update, job:delete |
| Queries | Run a query against data in the Grepr data lake: query |
| Integration Management | Manage integrations with external services and data sources: integration:list, integration:get |
| Dataset Management | Manage datasets for log storage and querying: dataset:create, dataset:list, dataset:get, dataset:update, dataset:delete |
| Configuration | Configure CLI settings and preferences: config:save, config:delete, config:list, config:show |
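For example, you can inspect the resources listed above with the list and show commands. The argument forms here are assumptions; run grepr <command> --help for the exact usage:
# List saved CLI configurations
grepr config:list
# Show a specific configuration (name argument assumed)
grepr config:show dev
# List the datasets available for querying
grepr --conf dev dataset:list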
Command examples: run queries
# Query by dataset name with time range
grepr --conf dev query \
--dataset-name "production-logs" \
--start "2024-01-01T00:00:00Z" \
--end "2024-01-01T01:00:00Z" \
--query "service:web-api AND level:ERROR"
# Query by dataset ID with CSV output to file
grepr --conf dev query \
--dataset-id "abc123-def456-ghi789" \
--format csv \
--output errors.csv \
--limit 1000 \
--query "level:ERROR"
# Get all logs for the last hour, sorted by timestamp
grepr --conf dev query \
--dataset-name "production-logs" \
--sort-order "ASC" \
--sort "eventTimestamp:asc"
Job Definition Format
Job definitions are JSON files that specify the query to execute. The CLI automatically handles both synchronous and asynchronous jobs based on the execution field.
To learn more about job definitions, see Create and manage jobs with the Grepr REST API.
The following fields are required in a job definition:
- name: A name identifying the job.
- execution: "SYNCHRONOUS" for a job that streams results in real time to the CLI (the output format is set with the --format option), or "ASYNCHRONOUS" for a job that runs in the background and can be monitored with the job:list and job:get commands.
- processing: "BATCH" or "STREAMING".
- jobGraph: The data processing pipeline definition.
Example: define a synchronous job
Streams results in real time:
{
"name": "realtime-query-job",
"execution": "SYNCHRONOUS",
"processing": "BATCH",
"jobGraph": {
"vertices": [
{
"type": "logs-iceberg-table-source",
"name": "data_source",
"start": "2025-10-13T00:00:00Z",
"end": "2025-10-15T00:00:00Z",
"query": {
"type": "datadog-query",
"query": "level:ERROR"
},
"sortOrder": "UNSORTED",
"datasetId": "your-dataset-id"
},
{
"type": "logs-sync-sink",
"name": "sink"
}
],
"edges": [
"data_source -> sink"
]
}
}
Example: define an asynchronous job
Creates a job that can be monitored and managed:
{
"name": "background-processing-job",
"execution": "ASYNCHRONOUS",
"processing": "BATCH",
"jobGraph": {
"vertices": [
{
"type": "logs-iceberg-table-source",
"name": "data_source",
"start": "2025-10-13T00:00:00Z",
"end": "2025-10-15T00:00:00Z",
"datasetId": "your-dataset-id"
},
{
"type": "logs-sink",
"name": "output_sink"
}
],
"edges": [
"data_source -> output_sink"
]
}
}
Output formats
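You select a format with the --format option shown in the query examples above. Only csv is confirmed by those examples; the other format names used below (table, pretty, raw, compact) are assumptions matching the section titles:
# Stream query results as syntax-highlighted JSON (format name assumed)
grepr --conf dev query --dataset-name "production-logs" --format pretty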
Table format (default)
Displays results in a formatted table with:
- Column wrapping (80 characters max)
- Pretty-printed JSON in cells
- Configurable sorting
- Proper alignment
+----------+------------------------+------------------+
| id | eventTimestamp | message |
+----------+------------------------+------------------+
| abc123 | 10/19/2025, 6:00:00 PM | Error occurred |
| def456 | 10/19/2025, 6:01:00 PM | Request |
| | | completed |
+----------+------------------------+------------------+
Pretty format
Syntax-highlighted JSON with colors:
{
"id": "abc123",
"eventTimestamp": "2025-10-19T18:00:00Z",
"tags": {
"service": "web-api",
"environment": "prod"
}
}
Raw format
Unprocessed JSON strings, one per line:
{"id":"abc123","eventTimestamp":"2025-10-19T18:00:00Z"}
{"id":"def456","eventTimestamp":"2025-10-19T18:01:00Z"}CSV format
Comma-separated values with proper escaping:
id,eventTimestamp,message,tags
abc123,"10/19/2025, 6:00:00 PM","Error occurred","{""service"":[""web-api""]}"
def456,"10/19/2025, 6:01:00 PM","Request completed","{""service"":[""web-api""]}"
The CSV format is useful for data analysis, spreadsheets, and data pipelines. It handles complex JSON data by properly escaping nested objects and arrays.
Compact format
Single-line JSON without indentation:
{"id":"abc123","eventTimestamp":"2025-10-19T18:00:00Z","message":"Error"}Track job status in real time
The CLI displays real-time job status updates:
| Status | Description |
|---|---|
| [CONNECTING] | Establishing connection |
| [CONNECTED] | Connection established |
| [RUNNING] | The job is processing data |
| [FINISHED] | The job completed successfully |
| [FAILED] | The job failed |
| [CANCELLED] | The job was cancelled |
| [TIMED_OUT] | The job exceeded the time limit |
| [SCANNED_MAX] | The job exceeded the data scan limit |
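For example, a minimal workflow for tracking an asynchronous job, assuming a saved configuration named dev, the asynchronous job definition above saved as async-job.json, and that job:get takes a job ID (check grepr job:get --help for the exact usage):
# Create an asynchronous job that runs in the background
grepr --conf dev job:create async-job.json
# List jobs to find the job ID and its current status
grepr --conf dev job:list
# Check the status of a specific job (ID argument form assumed)
grepr --conf dev job:get <job-id>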