AI prompts
base on A prototype of a tool that generates web forms from document forms # Form Extractor Prototype
This tool extracts the structure from a PDF or image of a form.
By default it uses the [Claude 3 LLM](https://claude.ai) model by Anthropic.
But it can also use the OpenAI LLM.
A single extraction of an A4 form page costs about 10p.
It replicates the form structure in JSON, following the schema used by [GOV.UK Forms](https://www.forms.service.gov.uk/).
It then uses that to generate a multi-page web form in the GOV.UK style.
Here's a short demo video:
https://github.com/timpaul/form-extractor-prototype/assets/1590604/81580afb-e3c9-41dd-a451-be418372ef2d
You'll notice that it doesn't try to faithfully replicate every field in a question.
Instead, it uses the relevant components and patterns from the [GOV.UK Design System](https://design-system.service.gov.uk/).
This is a feature not a bug ;-)
## Install
You'll need either an [Anthropic API key](https://www.anthropic.com/api), or an [Open AI one](https://openai.com/index/openai-api/).
Add the key as a local environment variable called `ANTHROPIC_API_KEY`, or `OPENAI_API_KEY`.
Install the app locally with `npm install`.
You'll also need to install GraphicsMagick. It's used to convert PDF pages into images.
[There's a guide for doing that here](https://github.com/yakovmeister/pdf2image/blob/HEAD/docs/gm-installation.md).
## Run
Start the app locally with `npm start dev`.
It'll be available at http://localhost:3000/
## Current capabilities
- processing PDF forms or images of forms
- breaking a form down into questions
- distinguishing between question, hint and field text
- distinguishing between single-choice and multiple-choice questions
- recognising common question types like 'name', 'address', 'date' etc.
- recognising when an image isn't a form
- recognising when a question has conditional routing
- processing hand drawn forms
- browsing previously processed forms
## Current limitations
- it only knows about certain kinds of question types
- you can't provide your own API key via the UI
- like a lot of Gen AI, it can be unpredictable
## How it works
*Disclaimer: This is a prototype and I am not a developer ;-).*
The main UI is in [app/views/index.html](https://github.com/timpaul/form-extractor-prototype/blob/main/app/views/index.html).
Other Nunjucks page templates and macros are in [app/views](https://github.com/timpaul/form-extractor-prototype/tree/main/app/views).
Additional CSS styles are in [assets/style.scss](https://github.com/timpaul/form-extractor-prototype/blob/main/assets/style.scss).
Generate updates to the CSS with `sass assets/style.scss public/assets/style.css`.
The script in [public/assets/scripts.js](https://github.com/timpaul/form-extractor-prototype/blob/main/assets/scripts.js) enhances file upload and adds loading spinners.
The form in [index.html](https://github.com/timpaul/form-extractor-prototype/blob/main/app/views/index.html) uploads the file to the server.
If it's a PDF it uses GraphicsMagick to convert the pages into image files.
Form files are stored in subfolders in [public/results](https://github.com/timpaul/form-extractor-prototype/blob/main/public/results).
The images are sent to an LLM, along with a prompt and JSON schema, via the 'SendToLLM' function in [server.js](https://github.com/timpaul/form-extractor-prototype/blob/main/server.js).
The JSON schema for each LLM is specified in [data/](https://github.com/timpaul/form-extractor-prototype/blob/main/data/).
The results are saved as a JSON files in the subfolders in [public/results](https://github.com/timpaul/form-extractor-prototype/blob/main/public/results).
Those files are used to generate the pages that are loaded into iframes in [app/views/index.html](https://github.com/timpaul/form-extractor-prototype/blob/main/app/views/index.html).
The form components are specificed in [app/views/answer-types.njk](https://github.com/timpaul/form-extractor-prototype/blob/main/app/views/answer-types.njk)
They are built using the Nunjucks components in [GOV.UK Frontend](https://www.npmjs.com/package/govuk-frontend).
Page rendering is defined in the URL routing rules found at the bottom of [server.js](https://github.com/timpaul/form-extractor-prototype/blob/main/server.js).
", Assign "at most 3 tags" to the expected json: {"id":"9465","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine "},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"