Trendshift - Ask AI

base on null # SAM.cpp Inference of Meta's [Segment Anything Model](https://github.com/facebookresearch/segment-anything/) in pure C/C++ https://github.com/YavorGIvanov/sam.cpp/assets/1991296/a69be66f-8e27-43a0-8a4d-6cfe3b1d9335 ## Quick start ```bash git clone --recursive https://github.com/YavorGIvanov/sam.cpp cd sam.cpp ``` Note: you need to download the model checkpoint below (`sam_vit_b_01ec64.pth`) first from [here](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth) and place it in the `checkpoints` folder ```bash # Convert PTH model to ggml. Requires python3, torch and numpy python convert-pth-to-ggml.py checkpoints/sam_vit_b_01ec64.pth . 1 # You need CMake and SDL2 SDL2 - Used for GUI windows & input [libsdl](https://www.libsdl.org) [Ubuntu] $ sudo apt install libsdl2-dev [Mac OS with brew] $ brew install sdl2 [MSYS2] $ pacman -S git cmake make mingw-w64-x86_64-dlfcn mingw-w64-x86_64-gcc mingw-w64-x86_64-SDL2 # Build sam.cpp. mkdir build && cd build cmake .. && make -j4 # run inference ./bin/sam -t 16 -i ../img.jpg -m ../checkpoints/ggml-model-f16.bin ``` Note: The optimal threads parameter ("-t") value should be manually selected based on the specific machine running the inference. Note: If you have problems with the Windows build, you can check [this issue](https://github.com/YavorGIvanov/sam.cpp/issues/8) for more details ## Downloading and converting the model checkpoints You can download a [model checkpoint](https://github.com/facebookresearch/segment-anything/tree/main#model-checkpoints) and convert it to `ggml` format using the script `convert-pth-to-ggml.py`: ``` # Convert PTH model to ggml python convert-pth-to-ggml.py sam_vit_b_01ec64.pth . 1 ``` ## Example output on M2 Ultra ``` $ ▶ make -j sam && time ./bin/sam -t 8 -i img.jpg [ 28%] Built target common [ 71%] Built target ggml [100%] Built target sam main: seed = 1693224265 main: loaded image 'img.jpg' (680 x 453) sam_image_preprocess: scale = 0.664062 main: preprocessed image (1024 x 1024) sam_model_load: loading model from 'models/sam-vit-b/ggml-model-f16.bin' - please wait ... sam_model_load: n_enc_state = 768 sam_model_load: n_enc_layer = 12 sam_model_load: n_enc_head = 12 sam_model_load: n_enc_out_chans = 256 sam_model_load: n_pt_embd = 4 sam_model_load: ftype = 1 sam_model_load: qntvr = 0 operator(): ggml ctx size = 202.32 MB sam_model_load: ...................................... done sam_model_load: model size = 185.05 MB / num tensors = 304 embd_img dims: 64 64 256 1 f32 First & Last 10 elements: -0.05117 -0.06408 -0.07154 -0.06991 -0.07212 -0.07690 -0.07508 -0.07281 -0.07383 -0.06779 0.01589 0.01775 0.02250 0.01675 0.01766 0.01661 0.01811 0.02051 0.02103 0.03382 sum: 12736.272313 Skipping mask 0 with iou 0.705935 below threshold 0.880000 Skipping mask 1 with iou 0.762136 below threshold 0.880000 Mask 2: iou = 0.947081, stability_score = 0.955437, bbox (371, 436), (144, 168) main: load time = 51.28 ms main: total time = 2047.49 ms real 0m2.068s user 0m16.343s sys 0m0.214s ``` Input point is (414.375, 162.796875) (currently hardcoded) Input image: ![llamas](https://user-images.githubusercontent.com/8558655/261301565-37b7bf4b-bf91-40cf-8ec1-1532316e1612.jpg) Output mask (mask_out_2.png in build folder): ![mask_glasses](https://user-images.githubusercontent.com/8558655/265732931-e7e31285-7efc-4009-98c8-57fd819bdfc1.png) ## References - [ggml](https://github.com/ggerganov/ggml) - [ggml SAM example](https://github.com/ggerganov/ggml/tree/master/examples/sam) - [SAM](https://segment-anything.com/) - [SAM demo](https://segment-anything.com/demo) ## Next steps - [X] Reduce memory usage by utilizing the new ggml-alloc - [X] Remove redundant graph nodes - [X] Fix the difference in output masks compared to the PyTorch implementation - [X] Filter masks based on stability score - [X] Add support for point user input - [X] Support bigger model checkpoints - [ ] Make inference faster - [ ] Support F16 for heavy F32 ops - [ ] Test quantization - [ ] Add support for mask and box input + #14 - [ ] GPU support ", Assign "at most 3 tags" to the expected json: {"id":"144","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine "},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"

AI prompts