Continuously growing evaluation data for RLHF, DPO, and multi-axis image quality research.
Text-based AI preference data is abundant: LMArena alone has collected millions of comparisons. Image preference data, by contrast, remains scarce. The largest open dataset, HPD v2, contains roughly 798,000 preference choices over about 434,000 image pairs. Google's RichHF-18K, the dataset behind a CVPR 2024 Best Paper, covers only 18,000 images.
AIMomentz is building a continuously growing image preference dataset that combines pairwise comparisons with multi-axis ratings and behavioral signals — a combination no existing dataset provides.
Each A/B battle vote produces a chosen/rejected image pair generated from the same prompt, so the pair is directly usable for Diffusion-DPO training without preprocessing.
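As a rough illustration of how one vote maps to a training pair, here is a minimal Python sketch. The `Vote` class and the field names (`prompt`, `image_a`, `image_b`, `winner`, `chosen`, `rejected`) are assumptions made for this example and are not the documented export schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One A/B battle vote (hypothetical structure for illustration)."""
    prompt: str
    image_a: str  # identifier or URL of the first image
    image_b: str  # identifier or URL of the second image
    winner: str   # "a" or "b"


def vote_to_pair(vote: Vote) -> dict:
    """Turn a vote into a prompt/chosen/rejected record."""
    if vote.winner not in ("a", "b"):
        raise ValueError(f"winner must be 'a' or 'b', got {vote.winner!r}")
    if vote.winner == "a":
        chosen, rejected = vote.image_a, vote.image_b
    else:
        chosen, rejected = vote.image_b, vote.image_a
    # Both images share the same prompt, which is what a DPO-style
    # preference pair requires.
    return {"prompt": vote.prompt, "chosen": chosen, "rejected": rejected}
```

The same prompt appears once per record, and only the order of the two images depends on the vote.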
4-axis ratings (aesthetics, alignment, plausibility, overall) on a 1-5 scale, with response time recorded for each rating. The axes follow the score format of RichHF-18K, the dataset from the CVPR 2024 Best Paper.
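A rating record of this kind could be checked as sketched below. The four axis names and the 1-5 range come from the description above. The function name, the `response_time_ms` field, and its unit are assumptions for illustration only.

```python
AXES = ("aesthetics", "alignment", "plausibility", "overall")


def validate_rating(scores: dict, response_time_ms: int) -> dict:
    """Check a 4-axis rating and return a normalized record.

    Hypothetical helper: field names and the millisecond unit are assumed.
    """
    missing = [axis for axis in AXES if axis not in scores]
    if missing:
        raise ValueError(f"missing axes: {missing}")
    for axis in AXES:
        value = scores[axis]
        # Each axis must be an integer score from 1 to 5 inclusive.
        if not isinstance(value, int) or not 1 <= value <= 5:
            raise ValueError(f"{axis} must be an integer from 1 to 5, got {value!r}")
    if response_time_ms < 0:
        raise ValueError("response_time_ms must be non-negative")
    record = {axis: scores[axis] for axis in AXES}
    record["response_time_ms"] = response_time_ms
    return record
```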
Multi-model evaluation across multiple quality dimensions, with reason labels (composition, color, creativity, message, technical, fun, beauty, thought-provoking).
All dataset exports support the oss_only=1 parameter, which filters to include only images generated by open-source models (FLUX, SDXL — Apache 2.0 / OpenRAIL licensed). Images from commercial APIs (GPT, Grok, Gemini) are excluded from dataset exports to ensure compliance with provider terms of service.
The Dataset API provides programmatic access to evaluation data in multiple formats. API keys are available for research institutions and enterprise customers.
Endpoints: export_dpo, export_ultrafeedback, export_csv, export_jsonl, schema, stats
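For orientation, the following Python sketch builds a request URL for one of the listed endpoints with the oss_only=1 filter. The endpoint names and the `oss_only` parameter are taken from this page. The base URL is a placeholder, and the path layout is an assumption, so consult the API documentation supplied with a key for the real values.

```python
from urllib.parse import urlencode

# Endpoint names as listed on this page.
ENDPOINTS = {
    "export_dpo", "export_ultrafeedback", "export_csv",
    "export_jsonl", "schema", "stats",
}

# Placeholder only: the real base URL is not stated on this page.
BASE_URL = "https://api.example.invalid/dataset"


def build_url(endpoint: str, oss_only: bool = False) -> str:
    """Build a request URL for a dataset endpoint (assumed path layout)."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint}")
    url = f"{BASE_URL}/{endpoint}"
    if oss_only:
        # oss_only=1 restricts the export to images from open-source models.
        url += "?" + urlencode({"oss_only": 1})
    return url
```

No request is sent here. How the API key is passed (header or query parameter) is not specified on this page, so the sketch leaves it out.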
Contact: aimomentz.ai
Every vote adds a new data point to the benchmark. Participate anonymously — no registration required.