<picture>
<source media="(prefers-color-scheme: dark)" srcset="./static/browser-use-dark.png">
<source media="(prefers-color-scheme: light)" srcset="./static/browser-use.png">
<img alt="Shows a black Browser Use Logo in light color mode and a white one in dark color mode." src="./static/browser-use.png" width="full">
</picture>
<h1 align="center">Enable AI to control your browser 🤖</h1>
[GitHub stars](https://github.com/gregpr07/browser-use/stargazers)
[Discord](https://link.browser-use.com/discord)
[Cloud](https://cloud.browser-use.com)
[Documentation](https://docs.browser-use.com)
[@gregpr07 on X](https://x.com/gregpr07)
[@mamagnus00 on X](https://x.com/mamagnus00)
[Workweave repository report](https://app.workweave.ai/reports/repository/org_T5Pvn3UBswTHIsN1dWS3voPg/881458615)
Browser-use is the easiest way to connect your AI agents with the browser.
💡 See what others are building and share your projects in our [Discord](https://link.browser-use.com/discord)! Want swag? Check out our [Merch store](https://browsermerch.com).
🌤️ Skip the setup - try our <b>hosted version</b> for instant browser automation! <b>[Try the cloud ☁️](https://cloud.browser-use.com)</b>.
# Quick start
With pip (Python>=3.11):
```bash
pip install browser-use
```
Install Playwright:
```bash
playwright install chromium
```
Spin up your agent:
```python
from langchain_openai import ChatOpenAI
from browser_use import Agent
import asyncio
from dotenv import load_dotenv

# Load API keys (e.g. OPENAI_API_KEY) from the local .env file.
load_dotenv()


async def main():
    # The agent uses the given LLM to carry out the task in the browser.
    agent = Agent(
        task="Compare the price of gpt-4o and DeepSeek-V3",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    await agent.run()


asyncio.run(main())
```
Add your API keys for the provider you want to use to your `.env` file.
```bash
OPENAI_API_KEY=
```
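Before starting the agent, it can help to confirm that the key was actually picked up. The following is a minimal sketch using only the standard library; the helper name `missing_keys` is hypothetical and not part of browser-use, and it assumes the key names you pass match what your provider expects.

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # Run this after load_dotenv() so values from .env are visible in os.environ.
    missing = missing_keys(["OPENAI_API_KEY"])
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All required API keys are set.")
```

Failing early on a missing key is usually easier to debug than an authentication error raised partway through an agent run.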
For other settings, models, and more, check out the [documentation](https://docs.browser-use.com).
### Test with UI
You can test browser-use with the [web-ui repository](https://github.com/browser-use/web-ui).
Or simply run the Gradio example:
```bash
uv pip install gradio
```
```bash
python examples/ui/gradio_demo.py
```
# Demos
<br/><br/>
[Task](https://github.com/browser-use/browser-use/blob/main/examples/use-cases/shopping.py): Add grocery items to cart, and checkout.
[Watch the demo on YouTube](https://www.youtube.com/watch?v=L2Ya9PYNns8)
<br/><br/>
Prompt: Add my latest LinkedIn follower to my leads in Salesforce.

<br/><br/>
[Prompt](https://github.com/browser-use/browser-use/blob/main/examples/use-cases/find_and_apply_to_jobs.py): Read my CV & find ML jobs, save them to a file, and then start applying for them in new tabs. If you need help, ask me.
https://github.com/user-attachments/assets/171fb4d6-0355-46f2-863e-edb04a828d04
<br/><br/>
[Prompt](https://github.com/browser-use/browser-use/blob/main/examples/browser/real_browser.py): Write a letter in Google Docs to my Papa, thanking him for everything, and save the document as a PDF.

<br/><br/>
[Prompt](https://github.com/browser-use/browser-use/blob/main/examples/custom-functions/save_to_file_hugging_face.py): Look up models with a license of cc-by-sa-4.0 and sort by most likes on Hugging Face, save top 5 to file.
https://github.com/user-attachments/assets/de73ee39-432c-4b97-b4e8-939fd7f323b3
<br/><br/>
## More examples
For more examples, see the [examples](examples) folder or join the [Discord](https://link.browser-use.com/discord) and show off your project.
# Vision
Tell your computer what to do, and it gets it done.
## Roadmap
### Agent
- [ ] Improve agent memory (summarize, compress, RAG, etc.)
- [ ] Enhance planning capabilities (load website-specific context)
- [ ] Reduce token consumption (system prompt, DOM state)
### DOM Extraction
- [ ] Improve extraction for datepickers, dropdowns, special elements
- [ ] Improve state representation for UI elements
### Rerunning tasks
- [ ] LLM as fallback
- [ ] Make it easy to define workflow templates where the LLM fills in the details
- [ ] Return playwright script from the agent
### Datasets
- [ ] Create datasets for complex tasks
- [ ] Benchmark various models against each other
- [ ] Fine-tune models for specific tasks
### User Experience
- [ ] Human-in-the-loop execution
- [ ] Improve the generated GIF quality
- [ ] Create various demos for tutorial execution, job application, QA testing, social media, etc.
## Contributing
We love contributions! Feel free to open issues for bugs or feature requests. To contribute to the docs, check out the `/docs` folder.
## Local Setup
To learn more about the library, check out the [local setup](https://docs.browser-use.com/development/local-setup).
## Cooperations
We are forming a commission to define best practices for UI/UX design for browser agents.
Together, we're exploring how redesigning software improves the performance of AI agents and gives companies a competitive advantage by putting their existing software at the forefront of the agent age.
Email [Toby](mailto:[email protected]?subject=I%20want%20to%20join%20the%20UI/UX%20commission%20for%20AI%20agents&body=Hi%20Toby%2C%0A%0AI%20found%20you%20in%20the%20browser-use%20GitHub%20README.%0A%0A) to apply for a seat on the commission.
## Swag
Want to show off your Browser-use swag? Check out our [Merch store](https://browsermerch.com). Good contributors will receive swag for free.
## Citation
If you use Browser Use in your research or project, please cite:
```bibtex
@software{browser_use2024,
  author = {Müller, Magnus and Žunić, Gregor},
  title = {Browser Use: Enable AI to control your browser},
  year = {2024},
  publisher = {GitHub},
  url = {https://github.com/browser-use/browser-use}
}
```
<div align="center"> <img src="https://github.com/user-attachments/assets/06fa3078-8461-4560-b434-445510c1766f" width="400"/>
</div>
<div align="center">
Made with ❤️ in Zurich and San Francisco
</div>