Human Preference Dataset for AI Image Generation

Continuously growing evaluation data for RLHF, DPO, and multi-axis image quality research.

301 DPO Pairs
0 4-Axis Ratings
209 Unique Prompts
2,359 CAP Events

The Data Gap in AI Image Evaluation

Text-based AI preference data is abundant: LMArena alone has collected millions of comparisons. Image preference data, by contrast, remains scarce. The largest open dataset, HPD v2, contains approximately 800,000 pairs. Google's RichHF-18K, introduced in a paper that won a CVPR 2024 Best Paper award, has only 18,000 examples.

AIMomentz is building a continuously growing image preference dataset that combines pairwise comparisons with multi-axis ratings and behavioral signals — a combination no existing dataset provides.

Dataset Formats

Diffusion-DPO Compatible

Each A/B battle vote produces a chosen/rejected image pair generated from the same prompt. The pairs are directly usable for Diffusion-DPO training without preprocessing.
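
As a rough illustration of how one vote maps to a training pair, the sketch below converts a single A/B vote into a prompt/chosen/rejected record. The BattleVote class and all field names are hypothetical and are not taken from the published export schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BattleVote:
    """One A/B vote. All field names here are hypothetical."""
    prompt: str
    image_a: str  # path or URL of image A
    image_b: str  # path or URL of image B
    winner: str   # "a" or "b"


def vote_to_dpo_pair(vote: BattleVote) -> dict:
    """Map one vote to a prompt/chosen/rejected record."""
    if vote.winner not in ("a", "b"):
        raise ValueError(f"winner must be 'a' or 'b', got {vote.winner!r}")
    if vote.winner == "a":
        chosen, rejected = vote.image_a, vote.image_b
    else:
        chosen, rejected = vote.image_b, vote.image_a
    # Both images come from the same prompt, so it is stored once.
    return {"prompt": vote.prompt, "chosen": chosen, "rejected": rejected}
```

Storing the prompt once per record reflects the property stated above: the two images in a pair always share it.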

RichHF-18K Compatible

Four-axis ratings (aesthetics, alignment, plausibility, overall) on a 1-5 scale, recorded together with response time. The axes and scale follow the score format of RichHF-18K, whose paper won a CVPR 2024 Best Paper award.
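
A minimal sketch of what one rating record could look like, with range checks on the four axes. It assumes integer scores and a response time in milliseconds; neither detail is stated on this page, and the class and field names are hypothetical.

```python
from dataclasses import dataclass

AXES = ("aesthetics", "alignment", "plausibility", "overall")


@dataclass(frozen=True)
class AxisRating:
    """One four-axis rating. Integer scores and milliseconds are assumptions."""
    aesthetics: int
    alignment: int
    plausibility: int
    overall: int
    response_time_ms: int  # hypothetical field name and unit

    def __post_init__(self):
        # Reject scores outside the 1-5 scale.
        for axis in AXES:
            score = getattr(self, axis)
            if not 1 <= score <= 5:
                raise ValueError(f"{axis} must be in 1..5, got {score}")
        if self.response_time_ms < 0:
            raise ValueError("response_time_ms must be non-negative")


def mean_scores(ratings):
    """Average each axis over a non-empty list of ratings."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    return {
        axis: sum(getattr(r, axis) for r in ratings) / len(ratings)
        for axis in AXES
    }
```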

UltraFeedback Compatible

Multi-model evaluation across multiple quality dimensions, with reason labels (composition, color, creativity, message, technical, fun, beauty, thought-provoking).
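
One way to work with the reason labels is to tally them per model, as sketched below. The label set is copied from the list above; the input shape (model name plus a list of labels per vote) is an assumption made for illustration only.

```python
from collections import Counter

# Reason labels as listed on this page.
REASON_LABELS = frozenset({
    "composition", "color", "creativity", "message",
    "technical", "fun", "beauty", "thought-provoking",
})


def tally_reasons(votes):
    """Count reason labels per model.

    `votes` is an iterable of (model_name, [labels]) tuples; this shape
    is hypothetical and may differ from the real export.
    """
    counts = {}
    for model, reasons in votes:
        unknown = set(reasons) - REASON_LABELS
        if unknown:
            raise ValueError(f"unknown reason labels: {sorted(unknown)}")
        counts.setdefault(model, Counter()).update(reasons)
    return counts
```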

Commercial Safety: Dual-Track Strategy

All dataset exports support the oss_only=1 parameter, which restricts the export to images generated by open-source models (FLUX, SDXL; Apache 2.0 / OpenRAIL licensed). Images from commercial APIs (GPT, Grok, Gemini) are excluded from dataset exports to ensure compliance with provider terms of service.
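
The same kind of filter can be approximated locally on an already downloaded export. The sketch below keeps a pair only when both images come from an allowlisted model. The model identifiers and the chosen_model / rejected_model field names are placeholders, and this is not a description of how the service implements oss_only.

```python
# Placeholder identifiers; the real export may name models differently.
OSS_MODELS = frozenset({"flux", "sdxl"})


def is_oss(model_name: str) -> bool:
    """Exact match against the allowlist, ignoring case."""
    return model_name.lower() in OSS_MODELS


def filter_oss_only(pairs):
    """Keep pairs whose chosen and rejected images are both from allowlisted models."""
    return [
        p for p in pairs
        if is_oss(p["chosen_model"]) and is_oss(p["rejected_model"])
    ]
```

An exact-match allowlist is used instead of prefix matching so that a differently licensed variant of a model family is not admitted by accident.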

API Access

The Dataset API provides programmatic access to evaluation data in multiple formats. API keys are available for research institutions and enterprise customers.

Endpoints: export_dpo, export_ultrafeedback, export_csv, export_jsonl, schema, stats
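
A small helper for building request URLs might look like the sketch below. The endpoint names and the oss_only parameter come from this page; the base URL is a placeholder, and treating each endpoint as a path segment is an assumption. Authentication is omitted because the key format is not documented here.

```python
from urllib.parse import urlencode

# Endpoint names as listed on this page.
ENDPOINTS = frozenset({
    "export_dpo", "export_ultrafeedback", "export_csv",
    "export_jsonl", "schema", "stats",
})

# Placeholder base URL; the real one is not given on this page.
BASE_URL = "https://example.invalid/dataset-api"


def build_url(endpoint: str, oss_only: bool = False, base_url: str = BASE_URL) -> str:
    """Build a URL for one endpoint, optionally adding oss_only=1."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    url = f"{base_url}/{endpoint}"
    if oss_only:
        url += "?" + urlencode({"oss_only": 1})
    return url
```

For example, build_url("export_dpo", oss_only=True) yields a URL ending in /export_dpo?oss_only=1 under the placeholder base.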

Contact: aimomentz.ai

Grow the Dataset

Every vote adds a new data point to the benchmark. Participate anonymously — no registration required.

→ Vote in the Arena