Quickstart Guide
Learn how to build your first data pipeline in minutes. Connect a mock data source, define a target warehouse destination, and let DataFlow AI generate the underlying Directed Acyclic Graph (DAG) for you.
Your First Pipeline in DataFlow AI
With the DataFlow AI natural-language pipeline interface, you can build your first end-to-end data pipeline without hours of manual configuration.
Follow the steps below to create a working ETL (Extract, Transform, Load) pipeline that extracts from a mock application API and loads the results into your managed cloud warehouse.
Connect a Data Source
Navigate to the Connectors page. Select the "HTTP API" connector and enter a mock REST API URL (e.g., https://dummyjson.com/users).
DataFlow AI proxies this connection to fetch the endpoint's structure without importing the full payload, which keeps the initial schema-detection step fast and light on bandwidth.
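Conceptually, schema detection amounts to fetching one sample record and mapping each field name to its JSON type. The sketch below illustrates that idea in plain Python; the `infer_schema` helper and the trimmed sample record are illustrative assumptions, not DataFlow AI's actual implementation.

```python
import json

def infer_schema(record: dict) -> dict:
    """Map each top-level field of a record to its JSON type name."""
    type_names = {str: "string", int: "integer", float: "number",
                  bool: "boolean", list: "array", dict: "object",
                  type(None): "null"}
    return {field: type_names[type(value)] for field, value in record.items()}

# A trimmed sample record, shaped like one user returned by
# https://dummyjson.com/users (fields reduced for illustration).
sample = json.loads("""
{"id": 1, "firstName": "Emily", "age": 28,
 "address": {"city": "Phoenix"}, "email": "emily@example.com"}
""")

print(infer_schema(sample))
# → {'id': 'integer', 'firstName': 'string', 'age': 'integer',
#    'address': 'object', 'email': 'string'}
```

Inferring types from a single sample keeps the connector step cheap; a production system would sample several records to catch nullable or mixed-type fields.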
Access the AI Prompt Canvas
Return to the main Pipeline Canvas. Instead of dragging and dropping dozens of components by hand, click the Copilot icon and describe your intent in plain English.
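The prompt is free-form natural language. A hypothetical example (the wording below is illustrative, not a required syntax):

```
Extract all users from the HTTP API source, keep only id, firstName,
and email, rename firstName to first_name, and load the result into
the warehouse table analytics.users.
```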
Review & Deploy
The integrated LLM Copilot interprets your prompt and generates the underlying Python and PySpark code. Verify the generated nodes on the canvas, click Run, and watch the ephemeral processing cluster execute the pipeline, streaming logs in real time.
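To make the generated extract-transform-load stages concrete, here is a minimal plain-Python sketch of what such a pipeline does; real generated pipelines use PySpark, and the function names, column choices, and in-memory "warehouse" below are hypothetical stand-ins.

```python
# Hypothetical sketch of the three ETL stages a generated pipeline performs.
# Real output targets PySpark; plain Python is used here so it runs anywhere.

def extract(payload: dict) -> list[dict]:
    # The dummyjson.com/users endpoint wraps records under a "users" key.
    return payload["users"]

def transform(users: list[dict]) -> list[dict]:
    # Keep a few columns and rename firstName to snake_case.
    return [{"id": u["id"], "first_name": u["firstName"], "email": u["email"]}
            for u in users]

def load(rows: list[dict], table: list[dict]) -> int:
    # Stand-in for a warehouse write: append rows to an in-memory "table".
    table.extend(rows)
    return len(rows)

payload = {"users": [
    {"id": 1, "firstName": "Emily", "email": "emily@example.com", "age": 28},
    {"id": 2, "firstName": "Michael", "email": "michael@example.com", "age": 35},
]}

warehouse_table: list[dict] = []
loaded = load(transform(extract(payload)), warehouse_table)
print(loaded)  # → 2
```

Each stage maps to one node in the generated DAG, which is why you can verify and re-run stages independently on the canvas.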
© 2026 DataFlow AI Docs