Pipelines
Using Pipelines
Pipelines define how data moves through Datable — from source to transformation to destination. Each pipeline is made up of connected nodes that apply transformations, filter or enrich data, and route it to various outputs.
Creating a Pipeline
From the Pipelines page, click + New Pipeline. You’ll be prompted to select an initial source and destination. You can add or modify these later.
Building Your Flow
To expand your pipeline, hover over a node and click the + button to add a new node. You can insert:
- Source nodes
- Transformation nodes
- Routing nodes
- Destination nodes
- JavaScript nodes
Node Types
Source Nodes
The entry point for your data. Sources can include tools like Datadog, New Relic, or custom OpenTelemetry agents.
Transformation Nodes
These nodes apply step-based logic to your data. Each transformation node can include up to 10 sequential steps. The order of operations matters: each step builds on the result of the previous one.
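Conceptually, each step is a function applied to the output of the step before it. The sketch below illustrates that chaining in plain JavaScript; it is an illustration only, not Datable's internal API, and the field names are examples.

```javascript
// Conceptual sketch only -- not Datable's internal API.
// Each step receives the previous step's output, so reordering steps changes the result.
const steps = [
  (event) => ({ ...event, message: event.message.trim() }), // step 1: trim whitespace
  (event) => ({ ...event, env: event.env ?? "unknown" }),    // step 2: default a missing field
];

const applySteps = (event) => steps.reduce((acc, step) => step(acc), event);

console.log(applySteps({ message: "  user login  " }));
// -> { message: "user login", env: "unknown" }
```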
Routing Nodes
Routing nodes allow you to branch data based on attribute values. For example, you might route failed login events to a security destination while allowing successful logins to continue unmodified.
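The branching idea can be pictured as a predicate over event attributes. The condition syntax below is an assumption for illustration, not Datable's actual routing configuration.

```javascript
// Illustrative only -- the condition and destination names are assumptions,
// not Datable's routing configuration format.
const routeEvent = (event) => {
  if (event.action === "login" && event.outcome === "failure") {
    return "security-destination"; // failed logins branch to a security tool
  }
  return "default-destination";    // successful logins continue unmodified
};

console.log(routeEvent({ action: "login", outcome: "failure" })); // "security-destination"
console.log(routeEvent({ action: "login", outcome: "success" })); // "default-destination"
```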
Destination Nodes
These are the endpoints for your processed data. Examples include:
- Object storage (e.g., S3)
- Monitoring tools (e.g., Datadog, Splunk, New Relic)
- Alerting systems (e.g., PagerDuty, Slack, webhooks)
You can send the same data to multiple destinations or conditionally route it based on logic earlier in the pipeline.
Adding Transformation Steps
To insert a transformation node and add a step:
- Hover over an existing node and click the + button to add a new node
- Select Transformation Node
- Click the node to open the Transformation Panel
- Click + Add Step
- Choose the desired transformation step type (e.g., Drop, Mask, Lookup)
- Configure the step’s conditions and logic
- Select the applicable data types: Logs, Spans, OCSF
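For example, a Mask step configured through this panel behaves roughly like the sketch below. The field names and masking format are illustrative assumptions, not the product's exact configuration.

```javascript
// Illustrative sketch of what a Mask step does conceptually.
// The field list and the "****" placeholder are examples, not a fixed schema.
const SENSITIVE_FIELDS = ["email", "credit_card", "user_id"];

const maskStep = (event) => {
  const masked = { ...event };
  for (const field of SENSITIVE_FIELDS) {
    if (field in masked) masked[field] = "****"; // redact the value, keep the key
  }
  return masked;
};

console.log(maskStep({ email: "a@example.com", status: "ok" }));
// -> { email: "****", status: "ok" }
```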
Available Steps
| Step | Description |
| --- | --- |
| Drop | Removes events based on conditions |
| Sample | Reduces event volume by applying a percentage filter |
| Regex | Matches or extracts values using regular expressions |
| Select | Passes through only selected attributes |
| Extract | Retrieves values using a regex or structured field path |
| Mask | Redacts sensitive fields (e.g., email, credit card, user ID) |
| Deduplicate | Removes duplicate events by field |
| Lookup Table | Enriches events using a user-provided or system-managed lookup table |
| Geo IP Lookup | Adds location metadata based on IP address |
| Parse Log Formats | Parses raw log text into structured format |
| Code | Executes custom JavaScript on each event |
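The Code step runs custom JavaScript against each event. The sketch below assumes the step receives one event at a time as a plain object and returns the modified event; the exact signature and field names are assumptions, so confirm them against the Code step documentation.

```javascript
// Hedged sketch: assumes the Code step receives one event as a plain object
// and returns the (possibly modified) event. Field names are examples only.
function handleEvent(event) {
  const status = Number(event.http_status ?? 0);
  return {
    ...event,
    severity: status >= 500 ? "error" : status >= 400 ? "warn" : "info",
  };
}

console.log(handleEvent({ http_status: 503 }));
// -> { http_status: 503, severity: "error" }
```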
Data Type Awareness
Steps are tagged by the data types they apply to:
- Blue: Logs
- Green: Spans
- Orange: OCSF
Incompatible steps are greyed out when a data type is excluded.
Example Transformation
A typical transformation might include:
- Parse Log Formats – Convert raw text to JSON
- Drop – Remove development environment logs
- Mask – Redact sensitive user fields
- Geo IP Lookup – Enrich IP addresses with location data
- Lookup Table – Enrich with department metadata
- Code – Normalize or tag events for downstream processing
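The final Code step in this example might look something like the following sketch. The tag value, timestamp handling, and field names are illustrative assumptions.

```javascript
// Illustrative sketch of the final Code step above. Assumes events are plain
// objects with a parseable timestamp; field names are examples only.
function tagForDownstream(event) {
  return {
    ...event,
    source_pipeline: "web-logs",                        // tag for downstream routing
    timestamp: new Date(event.timestamp).toISOString(), // normalize to ISO 8601
  };
}

console.log(tagForDownstream({ timestamp: 1700000000000, msg: "ok" }));
```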
Output Panel
The Output Panel lets you inspect the data flowing through your pipeline in real time. It helps you validate transformations, test changes, and debug data discrepancies.
Located at the bottom of the pipeline editor, the Output Panel has two views: Live and Simulate.
Live Tab
Displays real-time data passing through the pipeline as it runs.
- Updates automatically
- Shows saved transformation output
- Reflects current published logic
- Allows you to filter or pin events
- Supports switching between data types (Logs, Spans, OCSF)
Use this tab to verify that data is flowing as expected in your production or sandbox environment.
Simulate Tab
Displays how data will look after applying unsaved transformation steps.
- Pulls a static sample from the input
- Shows before/after view for each event
- Updates as you configure or reorder steps
- Does not impact the live pipeline
Use Simulate to test changes safely before publishing.
Live vs Simulate
| Feature | Live | Simulate |
| --- | --- | --- |
| Data Source | Real-time | Static sample |
| Reflects Saved State | Yes | No |
| Editable | No | Yes |
| Refresh Behavior | Auto | Manual (Fetch More Data) |
| Use Case | Monitoring output | Testing step logic |
Tips for Using the Output Panel
- If no data appears in Live:
  - Make sure the pipeline is published
  - Confirm the correct data type is selected
  - Check that the source is active
- Use Simulate before publishing any step changes
- Pin events to keep important samples in view while you iterate