Featured

Computer vision

New 2026

Official Implementation of MultiWorld: Scalable Multi-Agent Multi-View Video World Models

New 2026

Analyse · Learn · Ingest · Curate · Export — AI-powered YOLO dataset management toolkit

New 2026

A feed-forward 3D foundation model for reconstructing scenes from streaming data

New 2026

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

New 2026

AR 3D object detection for iPhone with LiDAR — YOLO 2D + BoxerNet 3D lifting

New 2026

Fine-tune Gemma 4 and 3n with audio, images and text on Apple Silicon, using PyTorch and Metal Performance Shaders.

New 2026

SteerViT is a framework that equips any ViT with the ability to steer both its global and local visual representations with natural language.

New 2026

Control OpenLayers, Google Maps, and Leaflet with hand gestures via webcam. Uses MediaPipe for real-time hand tracking to pan, zoom, and navigate maps hands-free in the browser. No backend required.

New 2026

Allen Institute for AI: WildDet3D: Scaling Promptable 3D Detection in the Wild

New 2026

A simple video streaming baseline that outperforms state-of-the-art methods.

New 2026

Give Claude the ability to watch and understand videos — Claude Code plugin with frame extraction and multimodal audio analysis

New 2026

"Single-image Layer Decomposition for Anime Characters" (SIGGRAPH 2026, Conditionally Accepted)

New 2026

Inference repo for the Falcon-Perception and Falcon-OCR models: early-fusion, natively multimodal, dense autoregressive Transformer models.

New 2026

[CVPR 2026] From None to All: Self-Supervised 3D Reconstruction via Novel View Synthesis

New 2026

[CVPR 2026 Highlight] A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens

New 2026

MegaFlow: Zero-Shot Large Displacement Optical Flow

New 2026

Official implementation and models for OVIE (One View Is Enough! Monocular Training for In-the-Wild Novel View Generation)

New 2026

Fast GPU OCR server. 270 img/s on FUNSD. TensorRT FP16, PP-OCRv5, HTTP + gRPC.