base on jq extension for Postgres # pgJQ: Use `jq` in Postgres <a href="https://hub.docker.com/repository/docker/florents/pgjq"><img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/florents/pgjq"></a> <img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/Florents-Tselai/pgJQ"> <a href="https://www.linkedin.com/in/florentstselai/"><img src="https://img.shields.io/badge/LinkedIn-0077B5?logo=linkedin&logoColor=white"></a> <a href="https://github.com/sponsors/Florents-Tselai/"><img src="https://img.shields.io/static/v1?label=Sponsor&message=%E2%9D%A4&logo=GitHub&link=https://github.com/sponsors/Florents-Tselai/"></a> <a href="https://opensource.org/licenses/MIT license"><img src="https://img.shields.io/badge/MIT license-blue.svg"></a> *Note*: If you like this idea check out: [liteJQ: jq extension for SQLite](https://github.com/Florents-Tselai/liteJQ) The **pgJQ** extension embeds the standard jq compiler and brings the much loved [jq](https://github.com/jqlang/jq) lang to Postgres. It adds a `jqprog` data type to express `jq` programs and a `jq(jsonb, jqprog)` function to execute them on `jsonb` objects. It works seamlessly with standard `jsonb` functions, operators, and `jsonpath`. ```sql SELECT jq('[{"bar": "baz", "balance": 7.77, "active":false}]'::jsonb, '.[0].bar'); ``` ``` jq ------- "baz" (1 row) ``` ![til](./pgjq-demo.gif) ## Usage ### Filters You can run basic filters: ```sql SELECT jq('[{"bar": "baz", "balance": 7.77, "active":false}]'::jsonb, '.[0].bar'); ``` ``` jq ------- "baz" (1 row) ``` ### `jsonb` `@@` `jqprog` If you're a syntactic sugar addict, you can use the `@@` operator to achieve the same. It's better be explicit with the `::jqprog` when using operators. ```sql SELECT '[{"bar": "baz", "balance": 7.77, "active":false}]' @@ '.[0].bar'::jqprog; ``` ``` jq ------- "baz" (1 row) ``` ### Complex Programs You can run more complex `jq` programs too: ```sql SELECT jq('[true,false,[5,true,[true,[false]],false]]', '(..|select(type=="boolean")) |= if . then 1 else 0 end'); ``` ``` jq ----------------------------- [1, 0, [5, 1, [1, [0]], 0]] (1 row) ``` ```sql SELECT jq('[1,5,3,0,7]' , '(.[] | select(. >= 2)) |= empty'); ``` ``` jq -------- [1, 0] (1 row) ``` ### Passing Arguments to `jqprog` If you want to pass dynamic arguments to `jqprog`, you can pass them as a `jsonb` object and refer to them as `$var`. ```sql select jq('{ "runner": 1, "message": "jobStatus", "jobs": [ { "id": 9, "is_successfull": true }, { "id": 100, "is_successfull": false, "metdata": { "environ": "prod" } } ] }'::jsonb, '.jobs[] | select(.is_successfull == $is_success and .id == 100) | .', '{"is_success": false, "id": 100}'); ``` ``` jq ---------------------------------------------------------------------- {"id": 100, "metdata": {"environ": "prod"}, "is_successfull": false} (1 row) ``` ### `jq` and `jsonpath` You can even chain `jq` and `jsonpath` together! Note here that the later part `- '{trans}' @> '{"cust": "baz"}'` is `jsonpath`, not `jq` code. ```sql SELECT jq('[ { "cust": "baz", "trans": { "balance": 100, "date": "2023-08-01" }, "active": true, "geo": { "branch": "paloukia" } } ]', '(.[] | select(.active == true))') - '{trans}' @> '{"cust": "baz"}'; ``` ``` ?column? ---------- t (1 row) ``` If you opt for using operators here, you should help the parser by adding parentheses and explicit casts. ```sql SELECT ('[ { "cust": "baz", "trans": { "balance": 100, "date": "2023-08-01" }, "active": true, "geo": { "branch": "paloukia" } } ]' @@ '(.[] | select(.active == true))'::jqprog) - '{trans}' @> '{"cust": "baz"}'; ``` It is strongly recommended to be explicit with type casts and ordering when using overloaded operators, especially when you're working a lot with text. Otherwise, you'll find yourself in an obfuscated labyrinth of `jqprog`s, `jsonb`s, `jsonpath`s and possibly `tsvector`s , impossible to escape from. ### Working with Files If you have superuser privileges in Postgres you can use the `pg_read_file` to run your queries on JSON files. ```sql SELECT jq(pg_read_file('/path/to/f.json', '.[]')) ``` You can see more examples in the [test cases](test/sql/basic.sql) or try reproducing the [`jq` manual](https://jqlang.github.io/jq/manual/) . ## Installation ```sh git clone https://github.com/Florents-Tselai/pgJQ.git cd pgJQ make install # set PG_CONFIG=/path/to/bin/pg_config if necessary. make installcheck ``` In a Postgres session run ```sql CREATE EXTENSION pgjq ``` ## How it Works pgJQ does not re-implement the `jq` lang in Postgres. It instead embeds the standard `jq` compiler and uses it to parse `jq` programs supplied in SQL queries. These programs are fed with `jsonb` objects as input. ## Issues `jq` has evolved from *just a cli tool* to a full-fledged DSL, but it still remains a 20-80 tool. **pgJQ** has been TDDed against those 20% of the cases. If you come across regressions between vanilla `jq` and pgJQ, especially around piped filters or complex functions, please do add an issue, along with a test case! Keeping in mind, though, that there's probably not much point reproducing the whole DSL in an RDBMS context. Some known issues are: * Only string, bool and numeric arguments can be passed to `jqprog`. * Currently, `jq` programs including pipes, like `.[] | .name` are buggy and unpredictable. * Modules are not supported, but they could be theoretically supported, given that Postgres is fairly open to dynamic loading. ", Assign "at most 3 tags" to the expected json: {"id":"8811","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine "},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"