Computer vision

New 2026

A tiny, fully-reproducible JEPA world model that learns the physics of a noisy bouncing DVD logo in representation space, dreams its future

New 2026

Gesture-controlled computer vision system featuring selective invisibility, privacy-focused focus regions, and AR effects using OpenCV and MediaPipe.

New 2026

AgentVision: Visual Perception System for AI Agents

New 2026

Local SAM3 desktop app for text-prompt image segmentation.

New 2026

Code Repository for "LEAP: Layer-skipping Efficiency via Adaptive Progression for Vision Transformer Distillation"

New 2026

Draw math in the air with your finger and let AI solve it: MediaPipe hand tracking + Claude vision

New 2026

A deep learning framework for 4-stage Alzheimer's Disease classification using T1 MRI scans. It features a ResNet-18 architecture with Squeeze-and-Excitation (SE) blocks. To handle severe class imbalance, Focal Loss and Weighted Sampling are used. Achieves 78.89% accuracy and 100% recall for Moderate Demented cases.

New 2026

[ECCV 2026] Official PyTorch implementation for See & Sniff: Learning Visuo-Olfactory Representations

New 2026

A Python script that automatically detects faces using the MediaPipe Tasks API and applies a smart padded Gaussian blur with OpenCV to protect privacy.

New 2026

A FiftyOne Plugin that flags dark, elongated, smooth blobs as raw carbon-fiber aircraft candidates in aerial survey imagery

New 2026

Human Universal Grasping

New 2026

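Brief summary: 1. A local OCR workbench built on PP-OCRv6 (Tiny/Small/Medium versions) 2. CoreML hardware acceleration on Apple Silicon 3. One-click model switching 4. Accompanying OmniDocBench evaluation support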

New 2026

Optical neural computation on a commodity smartphone: OLED screen + mirror + front camera as an analog matrix engine. 101 experiments, bilingual docs, and 6 Zenodo papers.

New 2026

Give an AI eyes for video — turn a clip into a numbered frame grid + transcript a vision LLM can read.

New 2026

Experiment: upload an image and generate a 3D point cloud.

New 2026

Model-agnostic state layer for world models — turn any detector (YOLO · VLM · DINO) into one standard, queryable stream of events + latent state. numpy-only, runs on CPU at the edge.

New 2026

✨️ A Convolutional Neural Network implemented completely from scratch in C++ featuring custom tensors, Conv2D and Dense layers, forward and backward propagation, max pooling, dropout, cross-entropy loss, Adam optimization, and a custom JPEG decoder. No TensorFlow, PyTorch, OpenCV, or neural network libraries.

New 2026

OpenCode plugin that lets text-only models work with user-provided and tool-returned images by sending each image to a vision-capable model first, then replacing the image with a text description.

New 2026

Open research demo for dental CBCT anatomy segmentation using a fine-tuned NVIDIA NV-Segment-CT workflow, with 2D slice review, 3D mesh visualization