
Based on The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

<h1 align="center">
  <a href="https://github.com/SakanaAI/AI-Scientist/blob/main/docs/logo_2.png">
    <img src="docs/logo_2.png" width="215" /></a>
  The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
</h1>

📚 <a href="https://arxiv.org/abs/2408.06292">[Paper]</a> |
📝 <a href="https://sakana.ai/ai-scientist/">[Blog Post]</a> |
📂 <a href="https://drive.google.com/drive/folders/1G7A0wTqfXVa-cpexjk0oaXakaSJwffEt">[Drive Folder]</a>

One of the grand challenges of artificial intelligence is developing agents capable of conducting scientific research and discovering new knowledge. While frontier models have already been used to aid human scientists, for example by brainstorming ideas or writing code, they still require extensive manual supervision or are heavily constrained to specific tasks.

We're excited to introduce **The AI Scientist**, the first comprehensive system for fully automatic scientific discovery, enabling Foundation Models such as Large Language Models (LLMs) to perform research independently.

We provide all runs and data from our paper [here](https://drive.google.com/drive/folders/1G7A0wTqfXVa-cpexjk0oaXakaSJwffEt?usp=sharing), where we run each base model on each template for approximately 50 ideas. We *highly* recommend reading through some of the [Claude papers](https://drive.google.com/drive/folders/1Mmpz6M1FK4q8e-SewgZcUzdeD0Q2zC39?usp=sharing) to get a sense of the system's strengths and weaknesses. Here are some example papers generated by **The AI Scientist** 📝:

1. [DualScale Diffusion: Adaptive Feature Balancing for Low-Dimensional Generative Models](https://github.com/SakanaAI/AI-Scientist/blob/main/example_papers/adaptive_dual_scale_denoising.pdf)
2. [Multi-scale Grid Noise Adaptation: Enhancing Diffusion Models For Low-dimensional Data](https://github.com/SakanaAI/AI-Scientist/blob/main/example_papers/grid_based_noise_adaptation.pdf)
3. [GAN-Enhanced Diffusion: Boosting Sample Quality and Diversity](https://github.com/SakanaAI/AI-Scientist/blob/main/example_papers/gan_diffusion.pdf)
4. [DualDiff: Enhancing Mode Capture in Low-dimensional Diffusion Models via Dual-expert Denoising](https://github.com/SakanaAI/AI-Scientist/tree/main/example_papers/dual_expert_denoiser.pdf)
5. [StyleFusion: Adaptive Multi-style Generation in Character-Level Language Models](https://github.com/SakanaAI/AI-Scientist/blob/main/example_papers/multi_style_adapter.pdf)
6. [Adaptive Learning Rates for Transformers via Q-Learning](https://github.com/SakanaAI/AI-Scientist/tree/main/example_papers/rl_lr_adaptation.pdf)
7. [Unlocking Grokking: A Comparative Study of Weight Initialization Strategies in Transformer Models](https://github.com/SakanaAI/AI-Scientist/tree/main/example_papers/weight_initialization_grokking.pdf)
8. [Grokking Accelerated: Layer-wise Learning Rates for Transformer Generalization](https://github.com/SakanaAI/AI-Scientist/tree/main/example_papers/layerwise_lr_grokking.pdf)
9. [Grokking Through Compression: Unveiling Sudden Generalization via Minimal Description Length](https://github.com/SakanaAI/AI-Scientist/tree/main/example_papers/mdl_grokking_correlation.pdf)
10. [Accelerating Mathematical Insight: Boosting Grokking Through Strategic Data Augmentation](https://github.com/SakanaAI/AI-Scientist/tree/main/example_papers/data_augmentation_grokking.pdf)

> **Note:**
> **Caution!** This codebase will execute LLM-written code. There are various risks and challenges associated with this autonomy, including the use of potentially dangerous packages, web access, and potential spawning of processes. Use at your own discretion. Please make sure to [containerize](#containerization) and restrict web access appropriately.

<a href="https://github.com/SakanaAI/AI-Scientist/blob/main/example_papers/adaptive_dual_scale_denoising/adaptive_dual_scale_denoising.pdf"><img src="https://github.com/SakanaAI/AI-Scientist/blob/main/docs/anim-ai-scientist.gif" alt="Adaptive Dual Scale Denoising" width="80%" /></a>

## Table of Contents

1. [Introduction](#introduction)
2. [Requirements](#requirements)
   - [Installation](#installation)
   - [Supported Models and API Keys](#supported-models-and-api-keys)
3. [Setting Up the Templates](#setting-up-the-templates)
   - [NanoGPT Template](#nanogpt-template)
   - [2D Diffusion Template](#2d-diffusion-template)
   - [Grokking Template](#grokking-template)
4. [Run AI Scientist Paper Generation Experiments](#run-ai-scientist-paper-generation-experiments)
5. [Getting an LLM-Generated Paper Review](#getting-an-llm-generated-paper-review)
6. [Making Your Own Template](#making-your-own-template)
   - [Community-Contributed Templates](#community-contributed-templates)
7. [Template Resources](#template-resources)
8. [Citing The AI Scientist](#citing-the-ai-scientist)
9. [Frequently Asked Questions](#frequently-asked-questions)
10. [Containerization](#containerization)

## Introduction

We provide three templates, which were used in our paper, covering the following domains: **NanoGPT**, **2D Diffusion**, and **Grokking**. These templates enable The AI Scientist to generate ideas and conduct experiments in these areas. We accept contributions of new templates from the community, but please note that they are not maintained by us. All other templates beyond the three provided are community contributions.

## Requirements

This code is designed to run on Linux with NVIDIA GPUs using CUDA and PyTorch. Support for other GPU architectures may be possible by following the [PyTorch guidelines](https://pytorch.org/get-started/locally/). The current templates would likely take an infeasible amount of time on CPU-only machines.
Running on other operating systems may require significant adjustments.

### Installation

```bash
conda create -n ai_scientist python=3.11
conda activate ai_scientist
# Install pdflatex
sudo apt-get install texlive-full

# Install PyPI requirements
pip install -r requirements.txt
```

**Note:** Installing `texlive-full` can take a long time. You may need to [hold Enter](https://askubuntu.com/questions/956006/pregenerating-context-markiv-format-this-may-take-some-time-takes-forever) during the installation.

### Supported Models and API Keys

We support a wide variety of models, including open-weight and API-only models. In general, we recommend using only frontier models above the capability of the original GPT-4. To see a full list of supported models, see [here](https://github.com/SakanaAI/AI-Scientist/blob/main/ai_scientist/llm.py).

#### OpenAI API (GPT-4o, GPT-4o-mini, o1 models)

By default, this uses the `OPENAI_API_KEY` environment variable.

#### Anthropic API (Claude Sonnet 3.5)

By default, this uses the `ANTHROPIC_API_KEY` environment variable.

##### Claude Models via Bedrock

For Claude models provided by [Amazon Bedrock](https://aws.amazon.com/bedrock/), please install these additional packages:

```bash
pip install anthropic[bedrock]
```

Next, specify a set of valid [AWS Credentials](https://docs.aws.amazon.com/cli/v1/userguide/cli-configure-envvars.html) and the target [AWS Region](https://docs.aws.amazon.com/bedrock/latest/userguide/bedrock-regions.html) by setting the environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_REGION_NAME`.

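As a quick pre-flight check, a small script can confirm that the environment variables named in this section are set before a run starts. The sketch below is an illustration only and is not part of the repository: the provider labels (`openai`, `anthropic`, `bedrock`) and the helper `missing_env_vars` are hypothetical names, and only the variable names themselves come from the text above.

```python
import os

# Variable names are taken from the sections above. The provider labels
# used as dictionary keys are hypothetical and exist only for this sketch.
REQUIRED_ENV_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME"],
}


def missing_env_vars(provider, environ=None):
    """Return the required variables that are unset or empty for `provider`."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS[provider] if not env.get(name)]


if __name__ == "__main__":
    # Report the status of every provider known to this sketch.
    for provider in REQUIRED_ENV_VARS:
        missing = missing_env_vars(provider)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{provider}: {status}")
```

Passing a plain dictionary as `environ` makes the check easy to try without touching the real environment.
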
##### Claude Models via Vertex AI

For Claude models provided by [Vertex AI Model Garden](https://cloud.google.com/model-garden?hl=en), please install these additional packages:

```bash
pip install google-cloud-aiplatform
pip install anthropic[vertex]
```

Next, set up valid authentication for a [Google Cloud project](https://cloud.google.com/vertex-ai/docs/authentication), for example by providing the region and project ID:

```bash
export CLOUD_ML_REGION="REGION"                  # for Model Garden call
export ANTHROPIC_VERTEX_PROJECT_ID="PROJECT_ID"  # for Model Garden call
export VERTEXAI_LOCATION="REGION"                # for Aider/LiteLLM call
export VERTEXAI_PROJECT="PROJECT_ID"             # for Aider/LiteLLM call
```

#### DeepSeek API (deepseek-chat, deepseek-reasoner)

By default, this uses the `DEEPSEEK_API_KEY` environment variable.

#### OpenRouter API (Llama3.1)

By default, this uses the `OPENROUTER_API_KEY` environment variable.

#### Google Gemini

We support Google Gemini models (e.g., "gemini-1.5-flash", "gemini-1.5-pro") via the [google-generativeai](https://pypi.org/project/google-generativeai) Python library. By default, it uses the environment variable:

```bash
export GEMINI_API_KEY="YOUR GEMINI API KEY"
```

#### Semantic Scholar API (Literature Search)

Our code can also optionally use a Semantic Scholar API Key (`S2_API_KEY`) for higher throughput [if you have one](https://www.semanticscholar.org/product/api), though it should work without it in principle. If you have problems with Semantic Scholar, you can skip the literature search and citation phases of paper generation.

Be sure to provide the key for the model used for your runs, e.g.:

```bash
export OPENAI_API_KEY="YOUR KEY HERE"
export S2_API_KEY="YOUR KEY HERE"
```

#### OpenAlex API (Literature Search Alternative)

The OpenAlex API can be used as an alternative if you do not have a Semantic Scholar API Key. OpenAlex does not require an API key.

```bash
pip install pyalex
export OPENALEX_MAIL_ADDRESS="YOUR EMAIL ADDRESS"
```

Then specify `--engine openalex` when you run the AI Scientist code.

Note that this is experimental for those who do not have a Semantic Scholar API Key.

## Setting Up the Templates

This section provides instructions for setting up each of the three templates used in our paper. Before running The AI Scientist experiments, please ensure you have completed the setup steps for the templates you are interested in.

### NanoGPT Template

**Description:** This template investigates transformer-based autoregressive next-token prediction tasks.

**Setup Steps:**

1. **Prepare the data:**

   ```bash
   python data/enwik8/prepare.py
   python data/shakespeare_char/prepare.py
   python data/text8/prepare.py
   ```

2. **Create baseline runs (machine dependent):**

   ```bash
   # Set up NanoGPT baseline run
   # NOTE: YOU MUST FIRST RUN THE PREPARE SCRIPTS ABOVE!
   cd templates/nanoGPT
   python experiment.py --out_dir run_0
   python plot.py
   ```

### 2D Diffusion Template

**Description:** This template studies improving the performance of diffusion generative models on low-dimensional datasets.

**Setup Steps:**

1. **Install dependencies:**

   ```bash
   # Set up 2D Diffusion
   git clone https://github.com/gregversteeg/NPEET.git
   cd NPEET
   pip install .
   pip install scikit-learn
   ```

2. **Create baseline runs:**

   ```bash
   # Set up 2D Diffusion baseline run
   cd templates/2d_diffusion
   python experiment.py --out_dir run_0
   python plot.py
   ```

### Grokking Template

**Description:** This template investigates questions about generalization and learning speed in deep neural networks.

**Setup Steps:**

1. **Install dependencies:**

   ```bash
   # Set up Grokking
   pip install einops
   ```

2. **Create baseline runs:**

   ```bash
   # Set up Grokking baseline run
   cd templates/grokking
   python experiment.py --out_dir run_0
   python plot.py
   ```

## Run AI Scientist Paper Generation Experiments

**Note:** Please ensure the setup steps above are completed before running these experiments.

```bash
conda activate ai_scientist
# Run the paper generation.
python launch_scientist.py --model "gpt-4o-2024-05-13" --experiment nanoGPT_lite --num-ideas 2
python launch_scientist.py --model "claude-3-5-sonnet-20241022" --experiment nanoGPT_lite --num-ideas 2
```

If you have more than one GPU, use the `--parallel` option to parallelize ideas across multiple GPUs.

## Getting an LLM-Generated Paper Review

```python
import openai
from ai_scientist.perform_review import load_paper, perform_review

client = openai.OpenAI()
model = "gpt-4o-2024-05-13"

# Load paper from PDF file (raw text)
paper_txt = load_paper("report.pdf")

# Get the review dictionary
review = perform_review(
    paper_txt,
    model,
    client,
    num_reflections=5,
    num_fs_examples=1,
    num_reviews_ensemble=5,
    temperature=0.1,
)

# Inspect review results
review["Overall"]     # Overall score (1-10)
review["Decision"]    # 'Accept' or 'Reject'
review["Weaknesses"]  # List of weaknesses (strings)
```

To run batch analysis:

```bash
cd review_iclr_bench
python iclr_analysis.py --num_reviews 500 --batch_size 100 --num_fs_examples 1 --num_reflections 5 --temperature 0.1 --num_reviews_ensemble 5
```

## Making Your Own Template

If there is an area of study you would like **The AI Scientist** to explore, it is straightforward to create your own templates. In general, follow the structure of the existing templates, which consist of:

- `experiment.py`: the main script, where the core content lives. It takes an argument `--out_dir`, which specifies where it should create the folder and save the relevant information from the run.
- `plot.py`: this script takes the information from the `run` folders and creates plots.
  The code should be clear and easy to edit.
- `prompt.json`: put information about your template here.
- `seed_ideas.json`: place example ideas here. You can also try to generate ideas without any examples and then pick the best one or two to put here.
- `latex/template.tex`: we recommend using our LaTeX folder, but be sure to replace the pre-loaded citations with ones that you expect to be more relevant.

The key to making new templates work is matching the base filenames and output JSONs to the existing format; everything else is free to change. You should also ensure that the `template.tex` file is updated to use the correct citation style / base plots for your template.

### Community-Contributed Templates

We welcome community contributions in the form of new templates. While these are not maintained by us, we are delighted to highlight your templates to others. Below, we list community-contributed templates along with links to their pull requests (PRs):

- Infectious Disease Modeling (`seir`) - [PR #137](https://github.com/SakanaAI/AI-Scientist/pull/137)
- Image Classification with MobileNetV3 (`mobilenetV3`) - [PR #141](https://github.com/SakanaAI/AI-Scientist/pull/141)
- Sketch RNN (`sketch_rnn`) - [PR #143](https://github.com/SakanaAI/AI-Scientist/pull/143)
- AI in Quantum Chemistry (`MACE`) - [PR #157](https://github.com/SakanaAI/AI-Scientist/pull/157)
- Earthquake Prediction (`earthquake-prediction`) - [PR #167](https://github.com/SakanaAI/AI-Scientist/pull/167)
- Tensorial Radiance Fields (`tensorf`) - [PR #175](https://github.com/SakanaAI/AI-Scientist/pull/175)
- Large Language Model Steering / Probes (`probes`) - [PR #215](https://github.com/SakanaAI/AI-Scientist/pull/215)

*This section is reserved for community contributions. Please submit a pull request to add your template to the list!
Please describe the template in the PR description, and also show examples of the generated papers.*

## Template Resources

We provide three templates, which heavily use code from other repositories, credited below:

- **NanoGPT Template** uses code from [NanoGPT](https://github.com/karpathy/nanoGPT) and this [PR](https://github.com/karpathy/nanoGPT/pull/254).
- **2D Diffusion Template** uses code from [tiny-diffusion](https://github.com/tanelp/tiny-diffusion), [ema-pytorch](https://github.com/lucidrains/ema-pytorch), and [Datasaur](https://www.research.autodesk.com/publications/same-stats-different-graphs/).
- **Grokking Template** uses code from [Sea-Snell/grokking](https://github.com/Sea-Snell/grokking) and [danielmamay/grokking](https://github.com/danielmamay/grokking).

We would like to thank the developers of the open-source models and packages for their contributions and for making their work available.

## Citing The AI Scientist

If you use **The AI Scientist** in your research, please cite it as follows:

```
@article{lu2024aiscientist,
  title={The {AI} {S}cientist: Towards Fully Automated Open-Ended Scientific Discovery},
  author={Lu, Chris and Lu, Cong and Lange, Robert Tjarko and Foerster, Jakob and Clune, Jeff and Ha, David},
  journal={arXiv preprint arXiv:2408.06292},
  year={2024}
}
```

## Frequently Asked Questions

We recommend reading our paper first for any questions you have on The AI Scientist.

**Why am I missing files when running The AI Scientist?**

Ensure you have completed all the setup and preparation steps before the main experiment script.

**Why has a PDF or a review not been generated?**

The AI Scientist finishes an idea with a success rate that depends on the template, the base foundation model, and the complexity of the idea. We advise referring to our main paper. The highest success rates are observed with Claude Sonnet 3.5.
Reviews are best done with GPT-4o; all other models have issues with positivity bias or failure to conform to required outputs.

**What is the cost of each idea generated?**

Typically less than $15 per paper with Claude Sonnet 3.5. We recommend DeepSeek Coder V2 for a much more cost-effective approach. A good place to look for new models is the [Aider leaderboard](https://aider.chat/docs/leaderboards/).

**How do I change the base conference format associated with the write-ups?**

Change the base `template.tex` files contained within each template.

**How do I run The AI Scientist for different subject fields?**

Please refer to the instructions for different templates. In this current iteration, this is restricted to ideas that can be expressed in code. However, lifting this restriction would represent exciting future work! :)

**How do I add support for a new foundation model?**

You may modify `ai_scientist/llm.py` to add support for a new foundation model. We do not advise using any model that is significantly weaker than GPT-4 level for **The AI Scientist**.

**Why do I need to run the baseline runs myself?**

These appear as `run_0` and should be run per machine you execute **The AI Scientist** on for accurate run-time comparisons due to hardware differences.

**What if I have problems accessing the Semantic Scholar API?**

We use the Semantic Scholar API to check ideas for novelty and collect citations for the paper write-up. You may be able to skip these phases if you don't have an API key or the API is slow to access.

## Containerization

We include a [community-contributed](https://github.com/SakanaAI/AI-Scientist/pull/21) Docker image that may assist with your containerization efforts in `experimental/Dockerfile`.

You can use this image like this:

```bash
# Endpoint Script
docker run -e OPENAI_API_KEY=$OPENAI_API_KEY -v `pwd`/templates:/app/AI-Scientist/templates <AI_SCIENTIST_IMAGE> \
  --model gpt-4o-2024-05-13 \
  --experiment 2d_diffusion \
  --num-ideas 2
```

```bash
# Interactive
docker run -it -e OPENAI_API_KEY=$OPENAI_API_KEY \
  --entrypoint /bin/bash \
  <AI_SCIENTIST_IMAGE>
```

## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=SakanaAI/AI-Scientist&type=Date)](https://star-history.com/#SakanaAI/AI-Scientist&Date)

", Assign "at most 3 tags" to the expected json: {"id":"13208","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine 
"},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"
