Getting Started

Welcome to the Creao dashboard! This guide will help you quickly get started with setting up and using the pipeline for large-scale data generation.

What is a Pipeline?

The Creao pipeline is a framework for building and managing workflows that generate synthetic data using Large Language Models (LLMs). Each provided component handles a specific task. By connecting components, you can build a data generation pipeline tailored to your use case.

Components Overview

Creao provides several components, each performing a different task. The rest of the documentation describes each component in detail.

Building Your Pipeline

To build your pipeline:

  1. Define the Input Dataset:

    • Specify the input dataset path of the input_data component.
    • Click "Pipeline Config" and input the Hugging Face dataset path.
    • Click "Update Dataset" to load the dataset.
  2. Add New Components:

    • Custom variables (e.g., extractInterests, rewriteQuestions) can be added to the pipeline.
    • Click the "Add Component" button to add new components to the pipeline.
    • Configure the components based on your requirements in the provided fields on the right side of the dashboard.
  3. Submit the Pipeline:

    • Click the "Submit Pipeline" button to run the pipeline.
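
The three steps above can be pictured as a small data structure. The Python sketch below is purely illustrative: the dashboard performs these steps through its UI, and the names `PipelineConfig`, `Component`, `add_component`, and the dataset path `example-org/example-dataset` are hypothetical, not part of any Creao API. The names `extractInterests` and `rewriteQuestions` are borrowed from this guide and used here as component names, which is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One pipeline step (hypothetical model, not a Creao class)."""
    name: str
    settings: dict = field(default_factory=dict)


@dataclass
class PipelineConfig:
    """Hypothetical stand-in for what the dashboard collects before submission."""
    dataset_path: str  # Hugging Face dataset path entered under "Pipeline Config"
    components: list = field(default_factory=list)

    def __post_init__(self):
        # Step 1: the pipeline starts from the input_data component.
        self.components.insert(0, Component("input_data", {"path": self.dataset_path}))

    def add_component(self, name, **settings):
        # Step 2: append a component together with its configuration.
        self.components.append(Component(name, settings))
        return self

    def component_names(self):
        # The ordered list of steps that step 3 (submission) would run.
        return [c.name for c in self.components]


config = PipelineConfig(dataset_path="example-org/example-dataset")
config.add_component("extractInterests").add_component("rewriteQuestions")
print(config.component_names())
```

The ordering is the point of the sketch: the input dataset comes first, and each added component runs after the ones before it.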

Data & Variables

Creao interacts with three types of data and variables:

  • Output from the First Preceding LLM Component:

    • Access the output of the nearest LLM component that precedes the current one, tracing back through the pipeline if necessary.
  • Custom Pipeline Variables:

    • Define global variables in the "Add New Variable" section, accessible to all components.
  • Dataset Variables:

    • Define variables from specific columns of your input dataset. Once you select the parse option, these variables are available to all components.
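
To make the three sources concrete, here is a minimal Python sketch of how a component's available variables could be assembled. It is a conceptual model only, not Creao's implementation. The functions `nearest_llm_output` and `build_context`, the `kind` field, the key `previous_llm_output`, and the sample values are all hypothetical. It assumes that "first preceding" means the closest earlier LLM component, and that custom variables win over dataset variables when names collide.

```python
def nearest_llm_output(steps, current_index):
    """Trace back from the current step to the closest earlier LLM step."""
    for step in reversed(steps[:current_index]):
        if step["kind"] == "llm":
            return step["output"]
    return None  # no LLM component precedes the current one


def build_context(steps, current_index, custom_vars, dataset_row, parsed_columns):
    """Collect the three variable sources available to one component."""
    # Dataset variables: only the columns selected for parsing.
    context = {col: dataset_row[col] for col in parsed_columns}
    # Custom pipeline variables: global, visible to every component.
    # Assumption: they override dataset variables with the same name.
    context.update(custom_vars)
    # Output of the nearest preceding LLM component, if any.
    context["previous_llm_output"] = nearest_llm_output(steps, current_index)
    return context


steps = [
    {"name": "input_data", "kind": "input", "output": None},
    {"name": "extractInterests", "kind": "llm", "output": "hiking, jazz"},
    {"name": "filter", "kind": "other", "output": None},
    {"name": "rewriteQuestions", "kind": "llm", "output": None},
]
context = build_context(
    steps,
    current_index=3,
    custom_vars={"tone": "friendly"},
    dataset_row={"question": "What should I do this weekend?", "id": 7},
    parsed_columns=["question"],
)
print(context)
```

In this example the `rewriteQuestions` step skips over the non-LLM `filter` step and picks up the output of `extractInterests`, which is what "tracing back" refers to under the stated assumption.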

Conclusion

With these steps, you can set up and customize your Creao pipeline for efficient data processing! For more detailed information about each component, please refer to the rest of the documentation.

If you have any questions or need further assistance, feel free to reach out!