# Ollama JavaScript Library

The Ollama JavaScript library provides the easiest way to integrate your JavaScript project with [Ollama](https://github.com/jmorganca/ollama).

## Getting Started

```
npm i ollama
```

## Usage

```javascript
import ollama from 'ollama'

const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
console.log(response.message.content)
```

### Browser Usage

To use the library without Node.js, import the browser module.

```javascript
import ollama from 'ollama/browser'
```

## Streaming responses

Response streaming can be enabled by setting `stream: true`, which modifies the function call to return an `AsyncGenerator` where each part is an object in the stream.

```javascript
import ollama from 'ollama'

const message = { role: 'user', content: 'Why is the sky blue?' }
const response = await ollama.chat({ model: 'llama3.1', messages: [message], stream: true })
for await (const part of response) {
  process.stdout.write(part.message.content)
}
```

## Cloud Models

Run larger models by offloading to Ollama's cloud while keeping your local workflow. [You can see the models currently available on Ollama's cloud here.](https://ollama.com/search?c=cloud)

### Run via local Ollama

1) Sign in (one-time):

```
ollama signin
```

2) Pull a cloud model:

```
ollama pull gpt-oss:120b-cloud
```

3) Use as usual (offloads automatically):

```javascript
import { Ollama } from 'ollama'

const ollama = new Ollama()
const response = await ollama.chat({
  model: 'gpt-oss:120b-cloud',
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
  stream: true,
})
for await (const part of response) {
  process.stdout.write(part.message.content)
}
```

### Cloud API (ollama.com)

Access cloud models directly by pointing the client at `https://ollama.com`.

1) Create an [API key](https://ollama.com/settings/keys), then set the `OLLAMA_API_KEY` environment variable:

```
export OLLAMA_API_KEY=your_api_key
```

2) Generate a response via the cloud API:

```javascript
import { Ollama } from 'ollama'

const ollama = new Ollama({
  host: 'https://ollama.com',
  headers: { Authorization: 'Bearer ' + process.env.OLLAMA_API_KEY },
})

const response = await ollama.chat({
  model: 'gpt-oss:120b',
  messages: [{ role: 'user', content: 'Explain quantum computing' }],
  stream: true,
})
for await (const part of response) {
  process.stdout.write(part.message.content)
}
```

## API

The Ollama JavaScript library's API is designed around the [Ollama REST API](https://github.com/jmorganca/ollama/blob/main/docs/api.md).

### chat

```javascript
ollama.chat(request)
```

- `request` `<Object>`: The request object containing chat parameters.
  - `model` `<string>`: The name of the model to use for the chat.
  - `messages` `<Message[]>`: Array of message objects representing the chat history.
    - `role` `<string>`: The role of the message sender ('user', 'system', or 'assistant').
    - `content` `<string>`: The content of the message.
    - `images` `<Uint8Array[] | string[]>`: (Optional) Images to be included in the message, either as Uint8Array or base64 encoded strings.
    - `tool_name` `<string>`: (Optional) The name of the tool that was executed, to inform the model of the result.
  - `format` `<string>`: (Optional) Set the expected format of the response (`json`).
  - `stream` `<boolean>`: (Optional) When true, an `AsyncGenerator` is returned.
  - `think` `<boolean | "high" | "medium" | "low">`: (Optional) Enable model thinking. Use `true`/`false` or specify a level. Requires model support.
  - `logprobs` `<boolean>`: (Optional) Return log probabilities for tokens. Requires model support.
  - `top_logprobs` `<number>`: (Optional) Number of top log probabilities to return per token when `logprobs` is enabled.
  - `keep_alive` `<string | number>`: (Optional) How long to keep the model loaded. A number (seconds) or a string with a duration unit suffix ("300ms", "1.5h", "2h45m", etc.).
  - `tools` `<Tool[]>`: (Optional) A list of tools the model may call.
  - `options` `<Options>`: (Optional) Options to configure the runtime.
- Returns: `<ChatResponse>`
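For example, here is a minimal sketch of a non-streaming `chat` call that constrains the reply with `format: 'json'`; the model name is illustrative, and the `JSON.parse` step assumes the model returns valid JSON:

```javascript
import ollama from 'ollama'

// Ask for a structured reply; `format: 'json'` constrains the output to JSON.
const response = await ollama.chat({
  model: 'llama3.1', // assumed: any locally pulled chat model
  messages: [
    { role: 'user', content: 'List three primary colors as JSON under the key "colors".' },
  ],
  format: 'json',
})

// The reply arrives as a JSON string in message.content.
console.log(JSON.parse(response.message.content))
```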
### generate

```javascript
ollama.generate(request)
```

- `request` `<Object>`: The request object containing generate parameters.
  - `model` `<string>`: The name of the model to use for the generation.
  - `prompt` `<string>`: The prompt to send to the model.
  - `suffix` `<string>`: (Optional) Suffix is the text that comes after the inserted text.
  - `system` `<string>`: (Optional) Override the model system prompt.
  - `template` `<string>`: (Optional) Override the model template.
  - `raw` `<boolean>`: (Optional) Bypass the prompt template and pass the prompt directly to the model.
  - `images` `<Uint8Array[] | string[]>`: (Optional) Images to be included, either as Uint8Array or base64 encoded strings.
  - `format` `<string>`: (Optional) Set the expected format of the response (`json`).
  - `stream` `<boolean>`: (Optional) When true, an `AsyncGenerator` is returned.
  - `think` `<boolean | "high" | "medium" | "low">`: (Optional) Enable model thinking. Use `true`/`false` or specify a level. Requires model support.
  - `logprobs` `<boolean>`: (Optional) Return log probabilities for tokens. Requires model support.
  - `top_logprobs` `<number>`: (Optional) Number of top log probabilities to return per token when `logprobs` is enabled.
  - `keep_alive` `<string | number>`: (Optional) How long to keep the model loaded. A number (seconds) or a string with a duration unit suffix ("300ms", "1.5h", "2h45m", etc.).
  - `options` `<Options>`: (Optional) Options to configure the runtime.
- Returns: `<GenerateResponse>`

### pull

```javascript
ollama.pull(request)
```

- `request` `<Object>`: The request object containing pull parameters.
  - `model` `<string>`: The name of the model to pull.
  - `insecure` `<boolean>`: (Optional) Pull from servers whose identity cannot be verified.
  - `stream` `<boolean>`: (Optional) When true, an `AsyncGenerator` is returned.
- Returns: `<ProgressResponse>`

### push

```javascript
ollama.push(request)
```

- `request` `<Object>`: The request object containing push parameters.
  - `model` `<string>`: The name of the model to push.
  - `insecure` `<boolean>`: (Optional) Push to servers whose identity cannot be verified.
  - `stream` `<boolean>`: (Optional) When true, an `AsyncGenerator` is returned.
- Returns: `<ProgressResponse>`

### create

```javascript
ollama.create(request)
```

- `request` `<Object>`: The request object containing create parameters.
  - `model` `<string>`: The name of the model to create.
  - `from` `<string>`: The base model to derive from.
  - `stream` `<boolean>`: (Optional) When true, an `AsyncGenerator` is returned.
  - `quantize` `<string>`: Quantization precision level (`q8_0`, `q4_K_M`, etc.).
  - `template` `<string>`: (Optional) The prompt template to use with the model.
  - `license` `<string | string[]>`: (Optional) The license(s) associated with the model.
  - `system` `<string>`: (Optional) The system prompt for the model.
  - `parameters` `<Record<string, unknown>>`: (Optional) Additional model parameters as key-value pairs.
  - `messages` `<Message[]>`: (Optional) Initial chat messages for the model.
  - `adapters` `<Record<string, string>>`: (Optional) A key-value map of LoRA adapter configurations.
- Returns: `<ProgressResponse>`

Note: The `files` parameter is not currently supported in `ollama-js`.

### delete

```javascript
ollama.delete(request)
```

- `request` `<Object>`: The request object containing delete parameters.
  - `model` `<string>`: The name of the model to delete.
- Returns: `<StatusResponse>`

### copy

```javascript
ollama.copy(request)
```

- `request` `<Object>`: The request object containing copy parameters.
  - `source` `<string>`: The name of the model to copy from.
  - `destination` `<string>`: The name of the model to copy to.
- Returns: `<StatusResponse>`

### list

```javascript
ollama.list()
```

- Returns: `<ListResponse>`

### show

```javascript
ollama.show(request)
```

- `request` `<Object>`: The request object containing show parameters.
  - `model` `<string>`: The name of the model to show.
  - `system` `<string>`: (Optional) Override the model system prompt returned.
  - `template` `<string>`: (Optional) Override the model template returned.
  - `options` `<Options>`: (Optional) Options to configure the runtime.
- Returns: `<ShowResponse>`

### embed

```javascript
ollama.embed(request)
```

- `request` `<Object>`: The request object containing embedding parameters.
  - `model` `<string>`: The name of the model used to generate the embeddings.
  - `input` `<string | string[]>`: The input used to generate the embeddings.
  - `truncate` `<boolean>`: (Optional) Truncate the input to fit the maximum context length supported by the model.
  - `keep_alive` `<string | number>`: (Optional) How long to keep the model loaded. A number (seconds) or a string with a duration unit suffix ("300ms", "1.5h", "2h45m", etc.).
  - `options` `<Options>`: (Optional) Options to configure the runtime.
- Returns: `<EmbedResponse>`
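As a brief sketch of `embed`, passing an array of strings yields one vector per input; the model name is illustrative (any pulled embedding-capable model works):

```javascript
import ollama from 'ollama'

// Embed multiple inputs in one call.
const response = await ollama.embed({
  model: 'all-minilm', // assumed: an embedding model pulled locally
  input: ['Why is the sky blue?', 'Why is grass green?'],
})

// One embedding vector per input string.
console.log(response.embeddings.length) // 2
console.log(response.embeddings[0].length) // dimensionality of the model's vectors
```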
### web search

Web search requires an Ollama account:

- [Sign up on ollama.com](https://ollama.com/signup)
- Create an API key by visiting https://ollama.com/settings/keys

```javascript
ollama.webSearch(request)
```

- `request` `<Object>`: The search request parameters.
  - `query` `<string>`: The search query string.
  - `max_results` `<number>`: (Optional) Maximum results to return (default 5, max 10).
- Returns: `<SearchResponse>`

### web fetch

```javascript
ollama.webFetch(request)
```

- `request` `<Object>`: The fetch request parameters.
  - `url` `<string>`: The URL to fetch.
- Returns: `<FetchResponse>`

### ps

```javascript
ollama.ps()
```

- Returns: `<ListResponse>`

### version

```javascript
ollama.version()
```

- Returns: `<VersionResponse>`

### abort

```javascript
ollama.abort()
```

This method will abort **all** streamed generations currently running with the client instance. If there is a need to manage streams with timeouts, it is recommended to have one Ollama client per stream.

All asynchronous threads listening to streams (typically the `for await (const part of response)` loop) will throw an `AbortError` exception. See [examples/abort/abort-all-requests.ts](examples/abort/abort-all-requests.ts) for an example.

## Custom client

A custom client can be created with the following fields:

- `host` `<string>`: (Optional) The Ollama host address. Default: `"http://127.0.0.1:11434"`.
- `fetch` `<Object>`: (Optional) The fetch library used to make requests to the Ollama host.
- `headers` `<Object>`: (Optional) Custom headers to include with every request.

```javascript
import { Ollama } from 'ollama'

const ollama = new Ollama({ host: 'http://127.0.0.1:11434' })
const response = await ollama.chat({
  model: 'llama3.1',
  messages: [{ role: 'user', content: 'Why is the sky blue?' }],
})
```
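Building on the custom client above, here is a minimal sketch of the one-client-per-stream pattern recommended in the `abort` section; the five-second timeout and the `err.name === 'AbortError'` check are illustrative assumptions:

```javascript
import { Ollama } from 'ollama'

// A dedicated client per stream lets abort() cancel just this generation.
const client = new Ollama()

const stream = await client.chat({
  model: 'llama3.1', // assumed: any locally pulled model
  messages: [{ role: 'user', content: 'Write a very long story.' }],
  stream: true,
})

// Illustrative timeout: abort this client's stream after 5 seconds.
const timer = setTimeout(() => client.abort(), 5000)

try {
  for await (const part of stream) {
    process.stdout.write(part.message.content)
  }
} catch (err) {
  // Assumption: the thrown abort error is identifiable by its name.
  if (err.name !== 'AbortError') throw err
  console.log('\nGeneration aborted')
} finally {
  clearTimeout(timer)
}
```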
## Custom Headers

You can set custom headers that will be included with every request:

```javascript
import { Ollama } from 'ollama'

const ollama = new Ollama({
  host: 'http://127.0.0.1:11434',
  headers: {
    Authorization: 'Bearer <api key>',
    'X-Custom-Header': 'custom-value',
    'User-Agent': 'MyApp/1.0',
  },
})
```

## Building

To build the project files run:

```sh
npm run build
```