StarVector: A Multimodal Model for Generating SVG Code from Images and Text

What is StarVector?

StarVector is a multimodal vision-language model (VLM) designed specifically for Scalable Vector Graphics (SVG) generation. It produces semantically rich SVG code in two modes: Image-to-SVG and Text-to-SVG. Unlike traditional vectorization techniques based on curve fitting, StarVector operates directly at the SVG code level. This lets it use SVG primitives (such as ellipses, rectangles, polygons, and text) where they fit, avoiding the distortions and artifacts common in conventional methods.


Core Technologies of StarVector

1. Multimodal Architecture

StarVector employs a multimodal architecture capable of processing both images and text as inputs:

  • Image-to-SVG: Converts images into visual tokens and then generates SVG code.
  • Text-to-SVG: Creates new SVGs purely from text instructions, without needing an image.

The model is built upon StarCoder, a code-generation language model, so its coding capabilities transfer to the SVG domain and help it produce concise, syntactically correct code.
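
Because the model emits SVG as plain text, its output can be checked like any other code. The snippet below is a minimal sketch of such a check, not part of StarVector itself: the helper name is_well_formed_svg and the two sample strings are made up for illustration, and the check only covers XML well-formedness and the root tag, not full SVG validity.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def is_well_formed_svg(code: str) -> bool:
    """Return True if `code` parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(code)
    except ET.ParseError:
        return False
    # ElementTree reports namespaced tags as "{namespace}svg".
    return root.tag in ("svg", f"{{{SVG_NS}}}svg")


# Hypothetical model outputs: one complete, one cut off mid-element.
good = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10">'
    '<rect width="10" height="10"/></svg>'
)
truncated = '<svg xmlns="http://www.w3.org/2000/svg"><rect width="10"'

print(is_well_formed_svg(good))       # True
print(is_well_formed_svg(truncated))  # False
```

A check like this is useful as a cheap filter before rendering, since generation that stops early tends to leave unclosed elements.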


Challenges in SVG Generation & StarVector’s Advantages

1. Overcoming Traditional Method Limitations

Traditional vectorization tools such as AutoTrace, Potrace, and VTracer rely primarily on curve fitting and have no semantic understanding of the image. The result is often distorted or overly complex path data, and shapes that a single SVG primitive could express end up as long paths.

StarVector’s advantages:

  • Semantic Understanding: The model can analyze image content and correctly select appropriate SVG primitives (e.g., circles, rectangles, polylines).
  • Concise Code: Directly outputs structured and compact SVG code instead of complex <path> data.
  • Supports Various SVG Generation Scenarios: Including logos, technical diagrams, and icons.
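
To make the "Concise Code" point concrete, here is a small sketch that writes the same circle twice: once as a primitive and once as path data. Both fragments are hand-written for illustration (they are not StarVector or tracer output), and the describe helper is a made-up name.

```python
import xml.etree.ElementTree as ET

# The same circle (centre 50,50, radius 40) written two ways.
primitive = '<circle cx="50" cy="50" r="40"/>'
# Two half-circle arcs. Curve-fitting tools typically emit Bezier
# segments instead, which are usually longer still.
as_path = '<path d="M 10 50 A 40 40 0 1 0 90 50 A 40 40 0 1 0 10 50 Z"/>'


def describe(fragment: str):
    """Return (tag, sorted attribute names, character count) for a fragment."""
    element = ET.fromstring(fragment)
    return element.tag, sorted(element.attrib), len(fragment)


print(describe(primitive))
print(describe(as_path))
```

The primitive version names the shape and exposes its parameters (cx, cy, r) directly, so it is easier to edit by hand than the opaque d string.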

2. More Accurate Evaluation Metrics

Many previous SVG generation methods relied on pixel-level evaluation metrics (e.g., MSE), which fail to measure the true semantic accuracy of SVGs. To address this, the StarVector team developed SVG-Bench, a benchmark specifically designed to assess SVG generation quality, covering 10 datasets and 3 types of SVG generation tasks:

  1. Image-to-SVG
  2. Text-to-SVG
  3. Diagram-to-SVG
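
The weakness of pixel-level metrics can be seen in a toy example. The sketch below uses tiny 8x8 grayscale grids built by hand (the helper names are illustrative only) and compares a target image against two candidates: the same square drawn at a different position, and an empty image.

```python
def blank(size):
    """Return a size x size image filled with zeros."""
    return [[0.0] * size for _ in range(size)]


def with_square(size, top, left, side):
    """Return a blank image with a filled square at (top, left)."""
    img = blank(size)
    for r in range(top, top + side):
        for c in range(left, left + side):
            img[r][c] = 1.0
    return img


def mse(a, b):
    """Mean squared error between two equally sized images."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


target = with_square(8, 1, 1, 2)   # 2x2 square near the top-left corner
shifted = with_square(8, 5, 5, 2)  # same square, different position
empty = blank(8)                   # no shape at all

print(mse(target, shifted))  # 0.125
print(mse(target, empty))    # 0.0625
```

The empty image gets the lower (better) MSE even though it contains no shape, while the candidate with the correct shape in the wrong place is penalized twice as much. This is the kind of mismatch that motivates a benchmark built around more semantic measures.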

StarVector Models & Evaluation Results

Currently, StarVector offers two model versions, both available for download on Hugging Face:

  • 💫 StarVector-8B
  • 💫 StarVector-1B

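In SVG-Bench testing, StarVector-8B achieved the highest DinoScore on every dataset listed, tying VTracer on SVG-Emoji. StarVector-1B leads the baselines on SVG-Fonts, SVG-Icons, and SVG-Diagrams but trails several of them on SVG-Stack and SVG-Emoji: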

Method            SVG-Stack  SVG-Fonts  SVG-Icons  SVG-Emoji  SVG-Diagrams
AutoTrace         0.942      0.954      0.946      0.975      0.874
Potrace           0.898      0.967      0.972      0.882      0.875
VTracer           0.954      0.964      0.940      0.981      0.882
Im2Vec            0.692      0.733      0.754      0.732      -
LIVE              0.934      0.956      0.959      0.969      0.870
DiffVG            0.810      0.821      0.952      0.814      0.822
GPT-4-V           0.852      0.842      0.848      0.850      -
💫 StarVector-1B  0.926      0.978      0.975      0.929      0.943
💫 StarVector-8B  0.966      0.982      0.984      0.981      0.959
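
The ranking in the table can be checked with a few lines of code. The sketch below copies the DinoScore values from the table (missing entries become None) and reports the top score and the method or methods that reach it for each dataset. The function name best_per_dataset is just an illustrative choice.

```python
DATASETS = ["SVG-Stack", "SVG-Fonts", "SVG-Icons", "SVG-Emoji", "SVG-Diagrams"]

# DinoScore values copied from the table above; None marks a missing entry.
SCORES = {
    "AutoTrace": [0.942, 0.954, 0.946, 0.975, 0.874],
    "Potrace": [0.898, 0.967, 0.972, 0.882, 0.875],
    "VTracer": [0.954, 0.964, 0.940, 0.981, 0.882],
    "Im2Vec": [0.692, 0.733, 0.754, 0.732, None],
    "LIVE": [0.934, 0.956, 0.959, 0.969, 0.870],
    "DiffVG": [0.810, 0.821, 0.952, 0.814, 0.822],
    "GPT-4-V": [0.852, 0.842, 0.848, 0.850, None],
    "StarVector-1B": [0.926, 0.978, 0.975, 0.929, 0.943],
    "StarVector-8B": [0.966, 0.982, 0.984, 0.981, 0.959],
}


def best_per_dataset(scores, datasets):
    """Map each dataset to (top score, list of methods achieving it)."""
    result = {}
    for i, name in enumerate(datasets):
        column = {m: s[i] for m, s in scores.items() if s[i] is not None}
        top = max(column.values())
        result[name] = (top, [m for m, v in column.items() if v == top])
    return result


for name, (top, winners) in best_per_dataset(SCORES, DATASETS).items():
    print(name, top, winners)
```

Running it shows StarVector-8B alone at the top of four columns and sharing the top SVG-Emoji score (0.981) with VTracer.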

Note: StarVector is not designed for natural images or illustrations since its training data primarily consists of icons, technical diagrams, charts, and logos.


SVG-Bench Dataset Overview

StarVector’s training and evaluation data come from SVG-Bench, which comprises 10 sub-datasets, each targeting a different SVG generation scenario. Most provide training, validation, and test splits; SVG-Diagrams is test-only:

Dataset        Training Set  Validation Set  Test Set  Avg. Token Length  Supported SVG Primitives  Annotation Type
SVG-Stack      2.1M          108k            5.7k      1,822 ± 1,808      All SVG Primitives        Image Annotation
SVG-Stack_sim  601k          30.1k           1.5k      2,000 ± 918        Vector path               -
SVG-Diagrams   -             -               472       3,486 ± 1,918      All SVG Primitives        -
SVG-Fonts      1.8M          91.5k           4.8k      2,121 ± 1,868      Vector path               Font Annotation
SVG-Fonts_sim  1.4M          71.7k           3.7k      1,722 ± 723        Vector path               Font Annotation
SVG-Emoji      8.7k          667             668       2,551 ± 1,805      All SVG Primitives        -
SVG-Emoji_sim  580           57              96        2,448 ± 1,026      Vector path               -
SVG-Icons      80.4k         6.2k            2.4k      2,449 ± 1,543      Vector path               -
SVG-Icons_sim  80.4k         2.8k            1.2k      2,005 ± 824        Vector path               -
SVG-FIGR       270k          27k             3k        5,342 ± 2,345      Vector path               Image Classification & Annotation

Conclusion: Why StarVector Matters

SVGs play a crucial role in icons, branding, technical diagrams, and map design. In the SVG-Bench results above, StarVector-8B matches or exceeds every baseline on DinoScore. Compared to traditional curve-fitting methods, StarVector offers:

  • Semantic Understanding: more accurate recognition of image structure
  • Concise Code: SVGs built from primitives rather than long path data
  • More Accurate Evaluation Metrics: SVG-Bench goes beyond pixel-based measures
  • Availability on Hugging Face: model weights that developers can download and test

StarVector makes AI-generated SVGs more precise and reliable, opening up new possibilities for vector graphics applications. 💡
