base on Lightweight geocoder in pure Rust

# 📫 Airmail 📫

Airmail is an extremely lightweight geocoder[^1] written in pure Rust. Built on top of [tantivy](https://github.com/quickwit-oss/tantivy), it offers a low memory footprint and fast performance. Airmail aims to support international queries in several languages, but in practice it's still very early days and there are definitely bugs preventing correct behavior.

[^1]: A geocoder is a search engine for places. When you type "vegan donut shop" into your maps app of choice, a geocoder is what shows you nearby places that fit your query.

## Features

Airmail's killer feature is the ability to query remote indices, e.g. on S3. This lets you keep your index hosting costs fixed while you scale horizontally. The baseline cost of a global Airmail deployment is about $5 per month.

## Roadmap

- [x] Index OpenStreetMap data, from osmx or pbf file.
- [ ] Index OpenAddresses data (not currently used in demo).
- [ ] Index WhosOnFirst data.
- [x] API server.
- [x] Address queries.
- [x] Named POI queries.
- [ ] Administrative area (city, province/state, country etc) queries.
- [x] Prefix queries.
- [x] Query remote indices.
- [x] Support and test planet-scale indices.
- [x] International address queries.
- [ ] Categorical search, e.g. "coffee shop seattle".
- [x] Typo tolerance (limited to >=8 character input tokens)
- [x] Bounding box restriction.
- [ ] Focus point queries.
- [ ] Systematic/automatic quality testing in CI.

## Quickstart

This guide will create an index for a chosen geographical region (or the planet!) and run Airmail.

### Requirements

- Rust environment, Docker with Docker Compose, or Podman with Podman Compose.
- ~16GB memory and 10-100GB of free space.

### Clone the Repo

```bash
git clone git@github.com:ellenhp/airmail.git
cd airmail
mkdir ./data
```

### Fetch Data

It's a good idea to build a smaller region first, and then the planet if you have the need and space. This guide references Australia, but you can use any region.

1. Download the OSM protobuf file (`.pbf`) for your region of interest from <https://download.geofabrik.de> or <https://download.bbbike.org/osm/planet/> and place it in the `./data` folder.
2. Download Who's On First (SpatiaLite format). For the planet, see: <https://geocode.earth/data/whosonfirst/combined/> and <https://data.geocode.earth/wof/dist/spatial/whosonfirst-data-admin-latest.spatial.db.bz2>
3. Ensure the files are present and decompressed in the `./data/` directory.

### Option 1 - Docker

```bash
# Build the images
docker compose build

# Build the index (from a pbf)
docker compose run indexer \
  indexer --wof-db /data/whosonfirst-data-admin-latest.spatial.db \
  --index /data/index/ \
  load-osm-pbf /data/australia-oceania-latest.osm.pbf

# Launch the service
docker compose up airmail
```

### Option 2 - From Source

```bash
# Install deps
apt-get install -y libssl-dev capnproto clang pkg-config libzstd-dev libsqlite3-mod-spatialite

# Run indexer
cargo run --bin indexer -- \
  --wof-db /data/whosonfirst-data-admin-latest.spatial.db \
  --index /data/index/ \
  load-osm-pbf /data/australia-oceania-latest.osm.pbf

# Run service
cargo run --bin airmail_service -- \
  --index /data/index/
```

## License

Dual MIT/Apache 2 license, at your option.
", Assign "at most 3 tags" to the expected json: {"id":"7913","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine 
"},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"