AI prompts
base on Lightpanda: the headless browser designed for AI and automation <p align="center">
<a href="https://lightpanda.io"><img src="https://cdn.lightpanda.io/assets/images/logo/lpd-logo.png" alt="Logo" height=170></a>
</p>
<h1 align="center">Lightpanda Browser</h1>
<p align="center"><a href="https://lightpanda.io/">lightpanda.io</a></p>
<div align="center">
[](https://github.com/lightpanda-io/browser/blob/main/LICENSE)
[](https://twitter.com/lightpanda_io)
[](https://github.com/lightpanda-io/browser)
</div>
Lightpanda is the open-source browser made for headless usage:
- Javascript execution
- Support of Web APIs (partial, WIP)
- Compatible with Playwright, Puppeteer through CDP (WIP)
Fast web automation for AI agents, LLM training, scraping and testing:
- Ultra-low memory footprint (9x less than Chrome)
- Exceptionally fast execution (11x faster than Chrome)
- Instant startup
[<img width="350px" src="https://cdn.lightpanda.io/assets/images/github/execution-time.svg">
](https://github.com/lightpanda-io/demo)
 
[<img width="350px" src="https://cdn.lightpanda.io/assets/images/github/memory-frame.svg">
](https://github.com/lightpanda-io/demo)
</div>
_Puppeteer requesting 100 pages from a local website on a AWS EC2 m5.large instance.
See [benchmark details](https://github.com/lightpanda-io/demo)._
## Quick start
### Install from the nightly builds
You can download the last binary from the [nightly
builds](https://github.com/lightpanda-io/browser/releases/tag/nightly) for
Linux x86_64 and MacOS aarch64.
*For Linux*
```console
curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-x86_64-linux && \
chmod a+x ./lightpanda
```
*For MacOS*
```console
curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/nightly/lightpanda-aarch64-macos && \
chmod a+x ./lightpanda
```
*For Windows + WSL2*
The Lightpanda browser is compatible to run on windows inside WSL. Follow the Linux instruction for installation from a WSL terminal.
It is recommended to install clients like Puppeteer on the Windows host.
### Dump a URL
```console
./lightpanda fetch --dump https://lightpanda.io
```
```console
info(browser): GET https://lightpanda.io/ http.Status.ok
info(browser): fetch script https://api.website.lightpanda.io/js/script.js: http.Status.ok
info(browser): eval remote https://api.website.lightpanda.io/js/script.js: TypeError: Cannot read properties of undefined (reading 'pushState')
<!DOCTYPE html>
```
### Start a CDP server
```console
./lightpanda serve --host 127.0.0.1 --port 9222
```
```console
info(websocket): starting blocking worker to listen on 127.0.0.1:9222
info(server): accepting new conn...
```
Once the CDP server started, you can run a Puppeteer script by configuring the
`browserWSEndpoint`.
```js
'use strict'
import puppeteer from 'puppeteer-core';
// use browserWSEndpoint to pass the Lightpanda's CDP server address.
const browser = await puppeteer.connect({
browserWSEndpoint: "ws://127.0.0.1:9222",
});
// The rest of your script remains the same.
const context = await browser.createBrowserContext();
const page = await context.newPage();
// Dump all the links from the page.
await page.goto('https://wikipedia.com/');
const links = await page.evaluate(() => {
return Array.from(document.querySelectorAll('a')).map(row => {
return row.getAttribute('href');
});
});
console.log(links);
await page.close();
await context.close();
await browser.disconnect();
```
### Telemetry
By default, Lightpanda collects and sends usage telemetry. This can be disabled by setting an environment variable `LIGHTPANDA_DISABLE_TELEMETRY=true`. You can read Lightpanda's privacy policy at: [https://lightpanda.io/privacy-policy](https://lightpanda.io/privacy-policy).
## Status
Lightpanda is still a work in progress and is currently at a Beta stage.
:warning: You should expect most websites to fail or crash.
Here are the key features we have implemented:
- [x] HTTP loader
- [x] HTML parser and DOM tree (based on Netsurf libs)
- [x] Javascript support (v8)
- [x] Basic DOM APIs
- [x] Ajax
- [x] XHR API
- [x] Fetch API
- [x] DOM dump
- [x] Basic CDP/websockets server
NOTE: There are hundreds of Web APIs. Developing a browser (even just for headless mode) is a huge task. Coverage will increase over time.
You can also follow the progress of our Javascript support in our dedicated [zig-js-runtime](https://github.com/lightpanda-io/zig-js-runtime#development) project.
## Build from sources
### Prerequisites
Lightpanda is written with [Zig](https://ziglang.org/) `0.14.0`. You have to
install it with the right version in order to build the project.
Lightpanda also depends on
[zig-js-runtime](https://github.com/lightpanda-io/zig-js-runtime/) (with v8),
[Netsurf libs](https://www.netsurf-browser.org/) and
[Mimalloc](https://microsoft.github.io/mimalloc).
To be able to build the v8 engine for zig-js-runtime, you have to install some libs:
For Debian/Ubuntu based Linux:
```
sudo apt install xz-utils \
python3 ca-certificates git \
pkg-config libglib2.0-dev \
gperf libexpat1-dev \
cmake clang
```
For MacOS, you only need cmake:
```
brew install cmake
```
### Install and build dependencies
#### All in one build
You can run `make install` to install deps all in one (or `make install-dev` if you need the development versions).
Be aware that the build task is very long and cpu consuming, as you will build from sources all dependencies, including the v8 Javascript engine.
#### Step by step build dependency
The project uses git submodules for dependencies.
To init or update the submodules in the `vendor/` directory:
```
make install-submodule
```
**iconv**
libiconv is an internationalization library used by Netsurf.
```
make install-libiconv
```
**Netsurf libs**
Netsurf libs are used for HTML parsing and DOM tree generation.
```
make install-netsurf
```
For dev env, use `make install-netsurf-dev`.
**Mimalloc**
Mimalloc is used as a C memory allocator.
```
make install-mimalloc
```
For dev env, use `make install-mimalloc-dev`.
Note: when Mimalloc is built in dev mode, you can dump memory stats with the
env var `MIMALLOC_SHOW_STATS=1`. See
[https://microsoft.github.io/mimalloc/environment.html](https://microsoft.github.io/mimalloc/environment.html).
**v8**
First, get the tools necessary for building V8, as well as the V8 source code:
```
make get-v8
```
Next, build v8. This build task is very long and cpu consuming, as you will build v8 from sources.
```
make build-v8
```
For dev env, use `make build-v8-dev`.
## Test
### Unit Tests
You can test Lightpanda by running `make test`.
### End to end tests
To run end to end tests, you need to clone the [demo
repository](https://github.com/lightpanda-io/demo) into `../demo` dir.
You have to install the [demo's node
requirements](https://github.com/lightpanda-io/demo?tab=readme-ov-file#dependencies-1)
You also need to install [Go](https://go.dev) > v1.24.
```
make end2end
```
### Web Platform Tests
Lightpanda is tested against the standardized [Web Platform
Tests](https://web-platform-tests.org/).
The relevant tests cases are committed in a [dedicated repository](https://github.com/lightpanda-io/wpt) which is fetched by the `make install-submodule` command.
All the tests cases executed are located in the `tests/wpt` sub-directory.
For reference, you can easily execute a WPT test case with your browser via
[wpt.live](https://wpt.live).
#### Run WPT test suite
To run all the tests:
```
make wpt
```
Or one specific test:
```
make wpt Node-childNodes.html
```
#### Add a new WPT test case
We add new relevant tests cases files when we implemented changes in Lightpanda.
To add a new test, copy the file you want from the [WPT
repo](https://github.com/web-platform-tests/wpt) into the `tests/wpt` directory.
:warning: Please keep the original directory tree structure of `tests/wpt`.
## Contributing
Lightpanda accepts pull requests through GitHub.
You have to sign our [CLA](CLA.md) during the pull request process otherwise
we're not able to accept your contributions.
## Why?
### Javascript execution is mandatory for the modern web
In the good old days, scraping a webpage was as easy as making an HTTP request, cURL-like. It’s not possible anymore, because Javascript is everywhere, like it or not:
- Ajax, Single Page App, infinite loading, “click to display”, instant search, etc.
- JS web frameworks: React, Vue, Angular & others
### Chrome is not the right tool
If we need Javascript, why not use a real web browser? Take a huge desktop application, hack it, and run it on the server. Hundreds or thousands of instances of Chrome if you use it at scale. Are you sure it’s such a good idea?
- Heavy on RAM and CPU, expensive to run
- Hard to package, deploy and maintain at scale
- Bloated, lots of features are not useful in headless usage
### Lightpanda is built for performance
If we want both Javascript and performance in a true headless browser, we need to start from scratch. Not another iteration of Chromium, really from a blank page. Crazy right? But that’s what we did:
- Not based on Chromium, Blink or WebKit
- Low-level system programming language (Zig) with optimisations in mind
- Opinionated: without graphical rendering
", Assign "at most 3 tags" to the expected json: {"id":"12815","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine "},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"