About Chart Analysis AI
Hybrid AI system for extracting structured data from chart images in Vietnamese academic papers
Chart Analysis AI is a 5-stage pipeline system that processes PDF documents and chart images to extract structured data, answer questions, and generate analytical insights. It combines object detection (YOLO), vision-language models (DePlot, PaddleOCR-VL, Vintern), and fine-tuned small language models (6 SLM variants from 0.5B to 7B parameters) with cloud AI fallback (Gemini).
The system was designed for Vietnamese academic papers and supports bilingual (English/Vietnamese) chart analysis. A key finding: feeding DePlot-extracted table data to the SLMs improves their accuracy by 3.0 percentage points on average, a larger effect than scaling model size from 0.5B to 7B parameters.
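The "SLM with cloud fallback" idea mentioned above can be illustrated with a small routing function. This is a minimal sketch, not the project's actual router: the names `route`, `Answer`, `stub_slm`, and `stub_cloud` are hypothetical, and the fallback rule is an assumption, since the page does not say when the system hands a request to Gemini. Here the cloud backend is used only when the local model raises or returns nothing.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Answer:
    text: str
    source: str  # which backend produced the answer: "local" or "cloud"


# Hypothetical backend signature: takes a question and the extracted table
# text, and returns an answer string, or None when it cannot answer.
Backend = Callable[[str, str], Optional[str]]


def route(question: str, table: str, local: Backend, cloud: Backend) -> Answer:
    """Try the local model first; fall back to the cloud model on failure.

    Assumption: "failure" means the local backend raised or returned an
    empty result. The real system may use a different criterion.
    """
    try:
        result = local(question, table)
    except Exception:
        result = None  # treat a crashed local backend like "no answer"
    if result:
        return Answer(result, "local")

    fallback = cloud(question, table)
    if not fallback:
        raise RuntimeError("no backend produced an answer")
    return Answer(fallback, "cloud")


# Stub backends standing in for a fine-tuned SLM and a cloud model.
def stub_slm(question: str, table: str) -> Optional[str]:
    # Pretend the small model only handles simple lookup questions.
    return "2021" if "year" in question.lower() else None


def stub_cloud(question: str, table: str) -> Optional[str]:
    return "Values rise from 2020 to 2021."


if __name__ == "__main__":
    table = "year | value\n2020 | 10\n2021 | 15"
    print(route("Which year has the highest value?", table, stub_slm, stub_cloud))
    print(route("Describe the trend.", table, stub_slm, stub_cloud))
```

Passing the backends in as plain callables keeps the routing rule separate from any particular model, so the stubs can be swapped for real inference calls without touching `route`.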
| Stage | Task | Components |
|---|---|---|
| 1 | Parse PDF/DOCX/images into clean page images | PyMuPDF, Pillow |
| 2 | Locate and crop chart regions from pages | YOLOv8-M (93.5% mAP@50) |
| 3 | Classify chart type and extract structured table data | EfficientNet-B0 + DePlot VLM |
| 4 | Refine data, answer questions, generate descriptions | AI Router (7 SLM + 3 Gemini + Vintern) |
| 5 | Generate insights, format output as JSON/Markdown/CSV | Template engine + insight rules |

6 models were screened in 2 rounds: Round 1 (text-only) and Round 2 (with DePlot extraction). Trained on 4,000 bilingual samples with QLoRA 4-bit quantization.

| Model | R1 (no DePlot) | R2 (+ DePlot) | Delta | VRAM |
|---|---|---|---|---|
| Llama-3.2-3B (winner) | 83.3% | 86.0% | +2.7% | 5.5 GB |
| Qwen-2.5-7B | 82.7% | 85.7% | +2.9% | 10.6 GB |
| Llama-3.2-1B | 82.3% | 85.1% | +2.8% | 3.6 GB |
| Qwen-2.5-3B | 81.7% | 84.9% | +3.1% | 5.3 GB |
| Qwen-2.5-1.5B | 81.1% | 84.3% | +3.2% | 4.0 GB |
| Qwen-2.5-0.5B | 80.1% | 83.6% | +3.5% | 2.8 GB |
Key finding: DePlot adds about 3.0 percentage points of accuracy on average across all six models, a bigger impact than scaling Qwen-2.5 from 0.5B to 7B parameters (a 2.6-point gap in Round 1, per the table). Data quality matters more than model size.
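The comparison behind this finding can be rechecked directly from the table. The sketch below copies the published accuracies and recomputes the per-model gain, the average gain, and two size-related gaps. Because the inputs are already rounded to one decimal, recomputed values can differ from the table's Delta column by about 0.1; the variable names are illustrative only.

```python
# Accuracy figures copied from the screening table, in percent:
# model name -> (Round 1 without DePlot, Round 2 with DePlot).
results = {
    "Llama-3.2-3B": (83.3, 86.0),
    "Qwen-2.5-7B": (82.7, 85.7),
    "Llama-3.2-1B": (82.3, 85.1),
    "Qwen-2.5-3B": (81.7, 84.9),
    "Qwen-2.5-1.5B": (81.1, 84.3),
    "Qwen-2.5-0.5B": (80.1, 83.6),
}

# Gain from adding DePlot extraction, per model, in percentage points.
deltas = {name: r2 - r1 for name, (r1, r2) in results.items()}
mean_delta = sum(deltas.values()) / len(deltas)

# Gain from scaling within one model family (Qwen-2.5, 0.5B -> 7B), Round 1.
scaling_gap = results["Qwen-2.5-7B"][0] - results["Qwen-2.5-0.5B"][0]

# Spread between the best and worst Round 1 models, regardless of family.
round1 = [r1 for r1, _ in results.values()]
round1_spread = max(round1) - min(round1)

if __name__ == "__main__":
    for name, delta in deltas.items():
        print(f"{name:>14}: +{delta:.1f} points")
    print(f"mean DePlot gain:            {mean_delta:.2f} points")
    print(f"Qwen 0.5B -> 7B gap (R1):    {scaling_gap:.1f} points")
    print(f"best vs worst model (R1):    {round1_spread:.1f} points")
```

On these rounded inputs the average DePlot gain is just over 3 points, the Qwen 0.5B to 7B gap in Round 1 is 2.6 points, and the best-to-worst spread in Round 1 is 3.2 points.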