# BoxMOT: pluggable SOTA tracking modules for segmentation, object detection and pose estimation models
<div align="center">
<img width="640"
src="https://github.com/mikel-brostrom/boxmot/releases/download/v12.0.0/output_640.gif"
alt="BoxMot demo">
<br> <!-- one blank line -->
[CI](https://github.com/mikel-brostrom/yolov8_tracking/actions/workflows/ci.yml)
[PyPI](https://badge.fury.io/py/boxmot)
[Downloads](https://pepy.tech/project/boxmot)
[License](https://github.com/mikel-brostrom/boxmot/blob/master/LICENSE)
[Colab](https://colab.research.google.com/drive/18nIqkBr68TkK8dHdarxTco6svHUJGggY?usp=sharing)
[DOI](https://doi.org/10.5281/zenodo.8132989)
[Docker](https://hub.docker.com/r/boxmot/boxmot)
[Discord](https://discord.gg/3w4aYGbU)
</div>
## Introduction
This repository addresses the fragmented nature of the multi-object tracking (MOT) field by providing a standardized collection of pluggable, state-of-the-art trackers. Designed to seamlessly integrate with segmentation, object detection, and pose estimation models, the repository streamlines the adoption and comparison of MOT methods. For trackers employing appearance-based techniques, we offer a range of automatically downloadable state-of-the-art re-identification (ReID) models, from heavyweight ([CLIPReID](https://arxiv.org/pdf/2211.13977.pdf)) to lightweight options ([LightMBN](https://arxiv.org/pdf/2101.10774.pdf), [OSNet](https://arxiv.org/pdf/1905.00953.pdf)). Additionally, clear and practical examples demonstrate how to effectively integrate these trackers with various popular models, enabling versatility across diverse vision tasks.
<div align="center">
<!-- START TRACKER TABLE -->
| Tracker | Status | HOTA↑ | MOTA↑ | IDF1↑ | FPS |
| :-----: | :-----: | :---: | :---: | :---: | :---: |
| [boosttrack](https://arxiv.org/abs/2408.13003) | ✅ | 69.253 | 75.914 | 83.206 | 25 |
| [botsort](https://arxiv.org/abs/2206.14651) | ✅ | 68.885 | 78.222 | 81.344 | 46 |
| [strongsort](https://arxiv.org/abs/2202.13514) | ✅ | 68.05 | 76.185 | 80.763 | 17 |
| [bytetrack](https://arxiv.org/abs/2110.06864) | ✅ | 67.68 | 78.039 | 79.157 | 1265 |
| [deepocsort](https://arxiv.org/abs/2302.11813) | ✅ | 67.509 | 75.83 | 79.976 | 12 |
| [ocsort](https://arxiv.org/abs/2203.14360) | ✅ | 66.441 | 74.548 | 77.899 | 1483 |
<!-- END TRACKER TABLE -->
<sub> NOTES: Evaluation was conducted on the second half of the MOT17 training set, as the validation set is not publicly available and the ablation detector was trained on the first half. We employed [pre-generated detections and embeddings](https://github.com/mikel-brostrom/boxmot/releases/download/v11.0.9/runs2.zip). Each tracker was configured using the default parameters from their official repositories. </sub>
</div>
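For readers unfamiliar with the table columns, the sketch below computes two of them, MOTA and IDF1, from raw counts using their standard definitions. It is only an illustration: the function names and the counts are hypothetical, and the reported numbers come from the evaluation tooling, not from this snippet. HOTA is omitted because its definition is considerably more involved.

```python
def mota(num_gt: int, false_negatives: int, false_positives: int, id_switches: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, where GT is the number of ground-truth boxes."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN), the F1 score of identity matches."""
    denominator = 2 * idtp + idfp + idfn
    if denominator <= 0:
        raise ValueError("at least one count must be positive")
    return 2 * idtp / denominator


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example_mota = mota(num_gt=1000, false_negatives=150, false_positives=50, id_switches=10)
example_idf1 = idf1(idtp=800, idfp=100, idfn=200)
print(f"MOTA = {example_mota:.3f}, IDF1 = {example_idf1:.3f}")
```

Both metrics are reported as percentages in the table, so a value of 0.79 here would appear as 79.0.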
## Why BoxMOT?
Multi-object tracking solutions today depend heavily on the computational capabilities of the underlying hardware. BoxMOT addresses this by offering a wide array of tracking methods tailored to accommodate diverse hardware constraints, ranging from CPU-only setups to high-end GPUs. Furthermore, we provide scripts designed for rapid experimentation, enabling users to save detections and embeddings once and subsequently reuse them with any tracking algorithm. This approach eliminates redundant computations, significantly speeding up the evaluation and comparison of multiple trackers.
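The save-once, reuse-many idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not BoxMOT's implementation: `run_detector`, `cached_detections`, `evaluate`, the in-memory cache, and the sequence name are all hypothetical stand-ins, and the real scripts persist detections and embeddings to disk.

```python
from typing import Dict, List, Tuple

# One detection: (x1, y1, x2, y2, confidence, class_id)
Detection = Tuple[float, float, float, float, float, int]

# Hypothetical in-memory cache keyed by (detector name, sequence name).
_cache: Dict[Tuple[str, str], List[Detection]] = {}
detector_calls = 0  # counts how often the expensive step actually runs


def run_detector(detector: str, sequence: str) -> List[Detection]:
    """Stand-in for an expensive detector pass (returns one fixed dummy box)."""
    global detector_calls
    detector_calls += 1
    return [(10.0, 20.0, 50.0, 80.0, 0.9, 0)]


def cached_detections(detector: str, sequence: str) -> List[Detection]:
    """Run the detector only on a cache miss; otherwise reuse the stored output."""
    key = (detector, sequence)
    if key not in _cache:
        _cache[key] = run_detector(detector, sequence)
    return _cache[key]


def evaluate(tracker: str, detections: List[Detection]) -> str:
    """Stand-in for running one tracker on precomputed detections."""
    return f"{tracker}: {len(detections)} detections consumed"


# Three trackers share a single detector pass over the same sequence.
reports = [
    evaluate(tracker, cached_detections("yolov8n", "example-sequence"))
    for tracker in ("bytetrack", "ocsort", "botsort")
]
print(reports, "detector runs:", detector_calls)
```

The detector runs once no matter how many trackers are compared, which is where the time saving comes from.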
## Installation
Install the `boxmot` package, including all requirements, in a Python>=3.9 environment:
```bash
pip install boxmot
```
BoxMOT provides a unified CLI `boxmot` with the following subcommands:
```bash
Usage: boxmot COMMAND [ARGS]...

Commands:
  track                 Run tracking only
  generate-dets-embs    Generate detections and embeddings
  generate-mot-results  Generate MOT evaluation results based on pregenerated detections and embeddings
  eval                  Evaluate tracking performance using the official trackeval repository
  tune                  Tune tracker hyperparameters based on selected detections and embeddings
```
## YOLOv12 | YOLOv11 | YOLOv10 | YOLOv9 | YOLOv8 | RFDETR | YOLOX examples
<details>
<summary>Tracking</summary>
```bash
boxmot track --yolo-model rf-detr-base.pt   # bboxes only
boxmot track --yolo-model yolox_s.pt        # bboxes only
boxmot track --yolo-model yolo12n.pt        # bboxes only
boxmot track --yolo-model yolo11n.pt        # bboxes only
boxmot track --yolo-model yolov10n.pt       # bboxes only
boxmot track --yolo-model yolov9c.pt        # bboxes only
boxmot track --yolo-model yolov8n.pt        # bboxes only
boxmot track --yolo-model yolov8n-seg.pt    # bboxes + segmentation masks
boxmot track --yolo-model yolov8n-pose.pt   # bboxes + pose estimation
```
</details>
<details>
<summary>Tracking methods</summary>
```bash
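boxmot track --tracking-method deepocsort
boxmot track --tracking-method strongsort
boxmot track --tracking-method ocsort
boxmot track --tracking-method bytetrack
boxmot track --tracking-method botsort
boxmot track --tracking-method boosttrack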
```
</details>
<details>
<summary>Tracking sources</summary>
Tracking can be run on most image and video sources:
```bash
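boxmot track --source 0                               # webcam
boxmot track --source img.jpg                         # image
boxmot track --source vid.mp4                         # video
boxmot track --source path/                           # directory
boxmot track --source 'path/*.jpg'                    # glob
boxmot track --source 'https://youtu.be/Zgi9g1ksQHc'  # YouTube
boxmot track --source 'rtsp://example.com/media.mp4'  # RTSP, RTMP, HTTP stream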
```
</details>
<details>
<summary>Select ReID model</summary>
Some tracking methods combine appearance description and motion during tracking. For those that use appearance, you can choose a ReID model that fits your needs from this [ReID model zoo](https://kaiyangzhou.github.io/deep-person-reid/MODEL_ZOO). These models can be further optimized for your needs with the [reid_export.py](https://github.com/mikel-brostrom/yolo_tracking/blob/master/boxmot/appearance/reid_export.py) script.
```bash
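boxmot track --source 0 --reid-model lmbn_n_cuhk03_d.pt               # lightweight
boxmot track --source 0 --reid-model osnet_x0_25_market1501.pt
boxmot track --source 0 --reid-model mobilenetv2_x1_4_msmt17.engine
boxmot track --source 0 --reid-model resnet50_msmt17.onnx
boxmot track --source 0 --reid-model osnet_x1_0_msmt17.pt
boxmot track --source 0 --reid-model clip_market1501.pt               # heavy
boxmot track --source 0 --reid-model clip_vehicleid.pt
# ...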
```
</details>
<details>
<summary>Filter tracked classes</summary>
By default the tracker tracks all MS COCO classes.
If you want to track only a subset of the classes that your model predicts, add their indices after the `--classes` flag:
```bash
boxmot track --source 0 --yolo-model yolov8s.pt --classes 15 16  # COCO-trained YOLOv8 model: track only cats (15) and dogs (16)
```
[Here](https://tech.amikelive.com/node-718/what-object-categories-labels-are-in-coco-dataset/) is a list of all the objects that a YOLOv8 model trained on MS COCO can detect. Note that class indexing in this repo starts at zero, so subtract one from any index read off a one-based list.
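Conceptually, class filtering keeps only the detections whose class index is in the requested set. The sketch below illustrates that idea on plain tuples; `filter_by_class` and the sample boxes are hypothetical and are not part of the BoxMOT API. The class IDs follow the zero-based COCO convention used by YOLOv8 (0 is person, 2 is car).

```python
from typing import Iterable, List, Tuple

# One detection: (x1, y1, x2, y2, confidence, class_id)
Detection = Tuple[float, float, float, float, float, int]


def filter_by_class(detections: Iterable[Detection], keep: Iterable[int]) -> List[Detection]:
    """Return only the detections whose zero-based class_id is in `keep`."""
    keep_set = set(keep)
    return [det for det in detections if det[5] in keep_set]


# Hypothetical detections for a single frame.
detections = [
    (12.0, 30.0, 80.0, 200.0, 0.91, 0),    # person
    (150.0, 40.0, 300.0, 180.0, 0.88, 2),  # car
    (20.0, 15.0, 60.0, 90.0, 0.75, 7),     # truck
]

kept = filter_by_class(detections, keep=[0, 2])
print([det[5] for det in kept])
```

Only the boxes that survive this filter would be handed to the tracker.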
</details>
<details>
<summary>Evaluation</summary>
Evaluate a combination of detector, tracking method and ReID model on a standard MOT dataset or your own custom one with:
```bash
$ boxmot eval --yolo-model yolov8n.pt --reid-model osnet_x0_25_msmt17.pt --tracking-method deepocsort --verbose --source ./assets/MOT17-mini/train
$ boxmot eval --yolo-model yolov8n.pt --reid-model osnet_x0_25_msmt17.pt --tracking-method ocsort --verbose --source ./tracking/val_utils/MOT17/train
```
Add `--gsi` to your command to postprocess the MOT results with Gaussian-smoothed interpolation. Detections and embeddings are stored for the selected YOLO and ReID models, respectively. They can then be loaded into any tracking algorithm, which avoids the overhead of repeatedly generating this data.
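To give an intuition for what interpolation-based postprocessing does to a track, here is a deliberately simplified sketch: it fills missing frames by linear interpolation and then applies a Gaussian-weighted moving average. This is a stand-in for illustration only; the function names are hypothetical, it handles a single coordinate, and the actual `--gsi` postprocessing may work differently.

```python
import math
from typing import Dict, List


def interpolate_gaps(track: Dict[int, float]) -> Dict[int, float]:
    """Linearly fill the missing frames between observed frames of one track."""
    frames = sorted(track)
    if not frames:
        return {}
    filled: Dict[int, float] = {}
    for start, end in zip(frames, frames[1:]):
        span = end - start
        for frame in range(start, end):
            t = (frame - start) / span
            filled[frame] = (1 - t) * track[start] + t * track[end]
    filled[frames[-1]] = track[frames[-1]]
    return filled


def gaussian_smooth(values: List[float], sigma: float = 1.0) -> List[float]:
    """Smooth a sequence with a normalized Gaussian-weighted moving average."""
    radius = max(1, int(3 * sigma))
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - radius), min(len(values), i + radius + 1)
        weights = [math.exp(-((j - i) ** 2) / (2 * sigma**2)) for j in range(lo, hi)]
        weighted_sum = sum(w * values[j] for w, j in zip(weights, range(lo, hi)))
        smoothed.append(weighted_sum / sum(weights))
    return smoothed


# x-centre of one hypothetical track seen at frames 1, 2, 5 and 6 (frames 3-4 missed).
observed = {1: 100.0, 2: 104.0, 5: 119.0, 6: 121.0}
dense = interpolate_gaps(observed)
frames = sorted(dense)
smooth_x = gaussian_smooth([dense[f] for f in frames], sigma=1.0)
print(frames, [round(x, 2) for x in smooth_x])
```

In a full pipeline the same two steps would be applied to every box coordinate of every track ID.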
</details>
<details>
<summary>Evolution</summary>
We use a fast and elitist multiobjective genetic algorithm for tracker hyperparameter tuning. By default the objectives are HOTA, MOTA and IDF1. Run it with:
```bash
# saves dets and embs under ./runs/dets_n_embs separately for each selected yolo and reid model
$ boxmot generate-dets-embs --source ./assets/MOT17-mini/train --yolo-model yolov8n.pt yolov8s.pt --reid-model weights/osnet_x0_25_msmt17.pt
# evolve parameters for specified tracking method using the selected detections and embeddings generated in the previous step
$ boxmot tune --dets yolov8n --embs osnet_x0_25_msmt17 --n-trials 9 --tracking-method botsort --source ./assets/MOT17-mini/train
```
The set of hyperparameters leading to the best HOTA result is written to the tracker's config file.
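With three objectives there is usually no single best trial, only a set of non-dominated ones (the Pareto front). The sketch below shows that selection step followed by the best-HOTA pick described above. It is not the genetic algorithm itself, and `dominates` and `pareto_front` are hypothetical helper names; the scores are the rows of the tracker table above, reused as stand-ins for tuning trials.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, float, float]  # (HOTA, MOTA, IDF1), all higher-is-better


def dominates(a: Scores, b: Scores) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(results: Dict[str, Scores]) -> List[str]:
    """Names of the candidates that no other candidate dominates."""
    return [
        name
        for name, scores in results.items()
        if not any(dominates(other, scores) for other in results.values())
    ]


# Rows of the tracker table above, standing in for individual tuning trials.
results = {
    "boosttrack": (69.253, 75.914, 83.206),
    "botsort": (68.885, 78.222, 81.344),
    "strongsort": (68.05, 76.185, 80.763),
    "bytetrack": (67.68, 78.039, 79.157),
    "deepocsort": (67.509, 75.83, 79.976),
    "ocsort": (66.441, 74.548, 77.899),
}

front = pareto_front(results)
best_by_hota = max(front, key=lambda name: results[name][0])
print("Pareto front:", front, "| best HOTA:", best_by_hota)
```

A multiobjective optimizer keeps refining candidates near this front; choosing by HOTA at the end turns the front into one final configuration.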
</details>
<details>
<summary>Export</summary>
We support ReID model export to ONNX, OpenVINO, TorchScript and TensorRT:
```bash
# export to ONNX
$ python3 boxmot/appearance/reid_export.py --include onnx --device cpu
# export to OpenVINO
$ python3 boxmot/appearance/reid_export.py --include openvino --device cpu
# export to TensorRT with dynamic input
$ python3 boxmot/appearance/reid_export.py --include engine --device 0 --dynamic
```
</details>
## Custom tracking examples
<div align="center">
| Example Description | Notebook |
|---------------------|----------|
| Torchvision bounding box tracking with BoxMOT | [Notebook](examples/det/torchvision_boxmot.ipynb) |
| Torchvision pose tracking with BoxMOT | [Notebook](examples/pose/torchvision_boxmot.ipynb) |
| Torchvision segmentation tracking with BoxMOT | [Notebook](examples/seg/torchvision_boxmot.ipynb) |
</div>
## Contributors
<a href="https://github.com/mikel-brostrom/yolo_tracking/graphs/contributors">
<img src="https://contrib.rocks/image?repo=mikel-brostrom/yolo_tracking" />
</a>
## Contact
For BoxMOT bugs and feature requests please visit [GitHub Issues](https://github.com/mikel-brostrom/boxmot/issues).
For business inquiries or professional support requests please send an email to:
[email protected]