Show HN: We collected detailed annotations for text-to-image generation

2 points by maalber 2 months ago

Recently, the most popular modality for text-to-image annotations has been preference data, where annotators usually choose between two images to indicate their favorite. While this does work for fine-tuning models, it lacks additional information about what might be wrong with the images, e.g., which part of the image is misaligned with the prompt. Google Research proposed a modality for more information-rich annotations ( Based on this, we produced this dataset of ~13k images. In total, we collected ~1.5 million annotations from 150k annotators using our annotation API. If you are interested, you can learn more about the API at

Let me know if you have any questions about the dataset or Rapidata in general!