MIT and IBM Research Release ChartNet: 1.5M Synthetic Charts for Training Vision Models
MIT, in collaboration with IBM Research, has introduced ChartNet, a synthetic dataset of 1.5 million charts designed to train vision models to interpret graphs and visualizations. The team converted real-world charts into executable plotting code and then programmatically modified that code to generate diverse new examples, spanning 24 chart types across 6 visualization libraries. Each sample includes four components: the generated image, its source code, a data table, and a text description.
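To make the code-driven generation idea concrete, here is a minimal sketch of how one seed chart might be turned into several variants by editing its plotting code. It is not the ChartNet pipeline. The `ChartSample` record, the `TEMPLATE` string, the `make_variant` helper, the choice of matplotlib syntax, and the seed data are all assumptions made for illustration. The generated code is only produced as text here and is not executed.

```python
import random
from dataclasses import dataclass


# Hypothetical record mirroring the four parts of a sample named in the article.
@dataclass
class ChartSample:
    code: str          # plotting source code that would render the chart
    data_table: list   # rows of (label, value)
    description: str   # short text description of the chart
    image_path: str    # where the rendered image would be saved


# Illustrative plotting-code template (matplotlib syntax is an assumption).
TEMPLATE = """import matplotlib.pyplot as plt
labels = {labels!r}
values = {values!r}
fig, ax = plt.subplots()
ax.{method}(labels, values, color={color!r})
ax.set_title({title!r})
fig.savefig({image_path!r})
"""

# Axes method name -> human-readable chart type.
CHART_KINDS = {"bar": "bar", "plot": "line", "scatter": "scatter"}
COLORS = ["tab:blue", "tab:orange", "tab:green"]


def make_variant(seed_labels, seed_values, title, index, rng):
    """Create one variant by changing the chart type, color, and data values."""
    method = rng.choice(sorted(CHART_KINDS))
    color = rng.choice(COLORS)
    # Rescale each seed value by a random factor so the data differs per variant.
    values = [round(v * rng.uniform(0.5, 1.5), 1) for v in seed_values]
    image_path = f"chart_{index:04d}.png"
    code = TEMPLATE.format(
        labels=list(seed_labels),
        values=values,
        method=method,
        color=color,
        title=title,
        image_path=image_path,
    )
    description = (
        f"A {CHART_KINDS[method]} chart titled '{title}' "
        f"with {len(values)} data points."
    )
    return ChartSample(code, list(zip(seed_labels, values)), description, image_path)


rng = random.Random(0)  # fixed seed so the variants are reproducible
samples = [
    make_variant(["Q1", "Q2", "Q3", "Q4"], [10.0, 14.0, 9.0, 17.0],
                 "Quarterly sales", i, rng)
    for i in range(3)
]
```

Because every variant is derived from code, the data table and description can be generated alongside the image and stay consistent with it, which is the main appeal of this approach over annotating rendered images by hand.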
According to the researchers, fine-tuning on ChartNet enabled compact models to outperform larger proprietary systems on specialized visual information extraction benchmarks. The dataset is available on Hugging Face.
MIT News: Teaching AI models to interpret charts · Paper on arXiv