
datadreamer-dev/DataDreamer

DataDreamer

DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤

Builder

datadreamer-dev

datadreamer-dev • individual

Stars

1,106

Using upstream star count

Forks

60

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jun 2, 2023

Project creation date

README Summary

DataDreamer is a Python framework for generating synthetic data from prompts and for training and aligning machine learning models. It provides tools for prompt-based data generation and model fine-tuning workflows, with the aim of simplifying the synthetic data pipeline from generation through model training.

AI Dev Skills

Unmapped

Synthetic Data Generation, Model Fine-tuning, Prompt Engineering, Large Language Model Training, Model Alignment, Data Augmentation, Machine Learning Pipeline Development

Tags

Synthetic Data Generation, Model Fine-tuning, Prompt Engineering, Large Language Model Training, Model Alignment, Data Augmentation, Machine Learning Pipeline Development, AI Training Data Generation, Model Fine-tuning Workflows, Custom Model Training Pipelines, Custom Model Training, Prompt-based Data Augmentation, Self-hosted, Synthetic Data, Text, Synthetic Dataset Creation, Cloud API, Python

Taxonomy

Recent Activity

Updated 1 year ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: medium
Maturity: prototype

Categories

Foundation Models (Primary), AI Agents, Safety & Alignment, Other AI / ML, MLOps & Infrastructure, Evals & Benchmarking, ML Platform & Infrastructure, Model Training

PM Skills

Scale & Reliability, Developer Platform

Languages

Python 100.0%

Timeline

Project created: Jun 2, 2023
Forked: Mar 22, 2026
Your last push: 1 year ago
Upstream last push: 1 year ago
Tracked since: Feb 2, 2025
