# DeTi*k*Zify<br><sub><sup>Synthesizing Graphics Programs for Scientific Figures and Sketches with Ti*k*Z</sup></sub>
[OpenReview](https://openreview.net/forum?id=bcVLFQCOjc)
[arXiv](https://arxiv.org/abs/2405.15306)
[Hugging Face](https://huggingface.co/collections/nllg/detikzify-664460c521aa7c2880095a8b)
[Colab](https://colab.research.google.com/drive/1hPWqucbPGTavNlYvOBvSNBAwdcPZKe8F)
Creating high-quality scientific figures can be time-consuming and challenging,
even though sketching ideas on paper is relatively easy. Furthermore,
recreating existing figures that are not stored in formats preserving semantic
information is equally complex. To tackle this problem, we introduce
[DeTi*k*Zify](https://github.com/potamides/DeTikZify), a novel multimodal
language model that automatically synthesizes scientific figures as
semantics-preserving [Ti*k*Z](https://github.com/pgf-tikz/pgf) graphics
programs based on sketches and existing figures. We also introduce an
MCTS-based inference algorithm that enables DeTi*k*Zify to iteratively refine
its outputs without the need for additional training.
https://github.com/potamides/DeTikZify/assets/53401822/203d2853-0b5c-4a2b-9d09-3ccb65880cd3
## News
* **2025-03-17**: We release
[Ti*k*Zero](https://huggingface.co/nllg/tikzero-adapter) adapters which plug
directly into [DeTi*k*Zify<sub>v2</sub>
(8b)](https://huggingface.co/nllg/detikzify-v2-8b) and enable zero-shot
text-conditioning, and
[Ti*k*Zero+](https://huggingface.co/nllg/tikzero-plus-10b) with additional
end-to-end fine-tuning. For more information see our
[paper](https://arxiv.org/abs/2503.11509) and usage examples [below](#usage).
* **2024-12-05**: We release [DeTi*k*Zify<sub>v2</sub>
(8b)](https://huggingface.co/nllg/detikzify-v2-8b), our latest model which
surpasses all previous versions in our evaluation, and make it the new default
model in our [Hugging Face
Space](https://huggingface.co/spaces/nllg/DeTikZify). Check out the [model
card](https://huggingface.co/nllg/detikzify-v2-8b-preview#model-card-for-detikzifyv2-8b)
for more information.
* **2024-09-24**: DeTi*k*Zify was accepted at [NeurIPS
2024](https://neurips.cc/Conferences/2024) as a [spotlight
paper](https://neurips.cc/virtual/2024/poster/94474)!
## Installation
> [!TIP]
> If you encounter difficulties with installation and inference on your own
> hardware, consider visiting our [Hugging Face
> Space](https://huggingface.co/spaces/nllg/DeTikZify) (please note that
> restarting the space can take up to 30 minutes). Should you experience long
> queues, you have the option to
> [duplicate](https://huggingface.co/spaces/nllg/DeTikZify?duplicate=true) it
> with a paid private GPU runtime for a more seamless experience. Additionally,
> you can try our demo on [Google
> Colab](https://colab.research.google.com/drive/1hPWqucbPGTavNlYvOBvSNBAwdcPZKe8F).
> However, setting up the environment there might take some time, and the free
> tier only supports inference for the 1b models.
The Python package of DeTi*k*Zify can be easily installed using
[pip](https://pip.pypa.io/en/stable):
```sh
pip install 'detikzify[legacy] @ git+https://github.com/potamides/DeTikZify'
```
The `[legacy]` extra is only required if you plan to use the
DeTi*k*Zify<sub>v1</sub> models. If you only plan to use
DeTi*k*Zify<sub>v2</sub> you can remove it. If your goal is to run the included
[examples](examples), it is easier to clone the repository and install it in
editable mode like this:
```sh
git clone https://github.com/potamides/DeTikZify
pip install -e DeTikZify[examples]
```
In addition, DeTi*k*Zify requires a full
[TeX Live 2023](https://www.tug.org/texlive) installation,
[ghostscript](https://www.ghostscript.com), and
[poppler](https://poppler.freedesktop.org) which you have to install through
your package manager or via other means.
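Before running inference, it can help to confirm that these external tools are reachable. The following is a minimal sketch, not part of DeTi*k*Zify itself. The executable names (`pdflatex`, `gs`, `pdftoppm`) are assumptions about what a typical installation provides, and finding `pdflatex` on the `PATH` does not prove that a *full* TeX Live installation is present.

```python
import shutil

# Assumed executable names for each dependency; adjust them for your system.
REQUIRED_TOOLS = {
    "TeX Live": "pdflatex",
    "ghostscript": "gs",
    "poppler": "pdftoppm",
}


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of dependencies whose executable is not on PATH."""
    return [name for name, exe in tools.items() if shutil.which(exe) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```

If the script reports a missing tool, install it through your package manager before continuing.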
## Usage
> [!TIP]
> For interactive use and general [usage tips](detikzify/webui#usage-tips),
> we recommend checking out our [web UI](detikzify/webui), which can be started
> directly from the command line (use `--help` for a list of all options):
> ```sh
> python -m detikzify.webui --light
> ```
If all required dependencies are installed, the full range of DeTi*k*Zify
features such as compiling, rendering, and saving Ti*k*Z graphics, and
MCTS-based inference can be accessed through its programming interface:
<details open><summary>DeTi<i>k</i>Zify Example</summary>
```python
from operator import itemgetter

from detikzify.model import load
from detikzify.infer import DetikzifyPipeline

image = "https://w.wiki/A7Cc"
pipeline = DetikzifyPipeline(*load(
    model_name_or_path="nllg/detikzify-v2-8b",
    device_map="auto",
    torch_dtype="bfloat16",
))

# generate a single TikZ program
fig = pipeline.sample(image=image)

# if it compiles, rasterize it and show it
if fig.is_rasterizable:
    fig.rasterize().show()

# run MCTS for 10 minutes and generate multiple TikZ programs
figs = set()
for score, fig in pipeline.simulate(image=image, timeout=600):
    figs.add((score, fig))

# save the best TikZ program
best = sorted(figs, key=itemgetter(0))[-1][1]
best.save("fig.tex")
```
</details>
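The MCTS loop in the example above collects every `(score, fig)` pair and sorts them afterwards. When only the best candidate matters, the same selection can be done while streaming. The helper below is a hypothetical sketch (`best_of` is not part of the DeTi*k*Zify API) that works on any iterable of `(score, item)` pairs, assuming that higher scores are better, as the sorting in the example implies.

```python
from typing import Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")


def best_of(scored: Iterable[Tuple[float, T]]) -> Optional[Tuple[float, T]]:
    """Return the highest-scoring (score, item) pair, or None if there is none."""
    best = None
    for score, item in scored:
        # On ties, the later candidate replaces the earlier one.
        if best is None or score >= best[0]:
            best = (score, item)
    return best


# Intended use with the pipeline from the example above (not executed here):
# result = best_of(pipeline.simulate(image=image, timeout=600))
# if result is not None:
#     score, fig = result
#     fig.save("fig.tex")
```

This keeps only one candidate in memory, at the cost of discarding the alternatives that the original loop retains.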
Through [Ti*k*Zero](https://huggingface.co/nllg/tikzero-adapter) adapters and
[Ti*k*Zero+](https://huggingface.co/nllg/tikzero-plus-10b) it is also possible
to synthesize graphics programs conditioned on text (cf. our
[paper](https://arxiv.org/abs/2503.11509) for
details). Note that this is currently only supported through the programming
interface:
<details open><summary>Ti<i>k</i>Zero+ Example</summary>
```python
from detikzify.model import load
from detikzify.infer import DetikzifyPipeline

caption = "A multi-layer perceptron with two hidden layers."
pipeline = DetikzifyPipeline(*load(
    model_name_or_path="nllg/tikzero-plus-10b",
    device_map="auto",
    torch_dtype="bfloat16",
))

# generate a single TikZ program
fig = pipeline.sample(text=caption)

# if it compiles, rasterize it and show it
if fig.is_rasterizable:
    fig.rasterize().show()
```
</details>
<details><summary>Ti<i>k</i>Zero Example</summary>
```python
from detikzify.model import load, load_adapter
from detikzify.infer import DetikzifyPipeline

caption = "A multi-layer perceptron with two hidden layers."
pipeline = DetikzifyPipeline(
    *load_adapter(
        *load(
            model_name_or_path="nllg/detikzify-v2-8b",
            device_map="auto",
            torch_dtype="bfloat16",
        ),
        adapter_name_or_path="nllg/tikzero-adapter",
    )
)

# generate a single TikZ program
fig = pipeline.sample(text=caption)

# if it compiles, rasterize it and show it
if fig.is_rasterizable:
    fig.rasterize().show()
```
</details>
More involved examples, such as those for evaluation and training, can be found
in the [examples](examples) folder.
## Model Weights & Datasets
We upload all our DeTi*k*Zify models and datasets to the [Hugging Face
Hub](https://huggingface.co/collections/nllg/detikzify-664460c521aa7c2880095a8b)
(Ti*k*Zero models are available
[here](https://huggingface.co/collections/nllg/tikzero-67d1952fab69f5bd172de1fe)).
However, please note that for the public release of the
[DaTi*k*Z<sub>v2</sub>](https://huggingface.co/datasets/nllg/datikz-v2)
and [DaTi*k*Z<sub>v3</sub>](https://huggingface.co/datasets/nllg/datikz-v3)
datasets, we had to remove a considerable portion of Ti*k*Z drawings
originating from [arXiv](https://arxiv.org), as the [arXiv non-exclusive
license](https://arxiv.org/licenses/nonexclusive-distrib/1.0/license.html) does
not permit redistribution. We do, however, release our [dataset creation
scripts](https://github.com/potamides/DaTikZ) and encourage anyone to recreate
the full version of DaTi*k*Z themselves.
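For the publicly released datasets, one way to fetch them is the Hugging Face `datasets` library. This is a minimal sketch that assumes `datasets` is installed (`pip install datasets`). The repository id comes from the link above, while the available splits and column names are not stated here, so inspect the returned object before relying on them. The helper name `load_datikz` is hypothetical.

```python
DATASET_ID = "nllg/datikz-v3"  # repository id taken from the dataset link


def load_datikz(dataset_id: str = DATASET_ID):
    """Download (or reuse the cached copy of) a public DaTikZ release."""
    # Imported lazily so that defining this helper does not require `datasets`.
    from datasets import load_dataset

    return load_dataset(dataset_id)


# Intended use (requires network access, not executed here):
# data = load_datikz()
# print(data)  # lists the available splits and columns
```

Keep in mind that this public release omits the arXiv-derived drawings mentioned above.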
## Citation
If DeTi*k*Zify and Ti*k*Zero have been beneficial for your research or
applications, we kindly request you to acknowledge this by citing them as
follows:
```bibtex
@inproceedings{belouadi2024detikzify,
title={{DeTikZify}: Synthesizing Graphics Programs for Scientific Figures and Sketches with {TikZ}},
author={Jonas Belouadi and Simone Paolo Ponzetto and Steffen Eger},
booktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},
year={2024},
url={https://openreview.net/forum?id=bcVLFQCOjc}
}
```
```bibtex
@misc{belouadi2025tikzero,
title={{TikZero}: Zero-Shot Text-Guided Graphics Program Synthesis},
author={Jonas Belouadi and Eddy Ilg and Margret Keuper and Hideki Tanaka and Masao Utiyama and Raj Dabre and Steffen Eger and Simone Paolo Ponzetto},
year={2025},
eprint={2503.11509},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2503.11509},
}
```
## Acknowledgments
The implementation of the DeTi*k*Zify model architecture is based on
[LLaVA](https://github.com/haotian-liu/LLaVA) and
[AutomaTikZ](https://github.com/potamides/AutomaTikZ) (v1), and [Idefics
3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3) (v2). Our MCTS
implementation is based on
[VerMCTS](https://github.com/namin/llm-verified-with-monte-carlo-tree-search).
The Ti*k*Zero architecture draws inspiration from
[Flamingo](https://deepmind.google/discover/blog/tackling-multiple-tasks-with-a-single-visual-language-model/)
and [LLaMA
3.2-Vision](https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices).