base on Incredibly fast crawler designed for OSINT.

<h1 align="center">
  <br>
  <a href="https://github.com/s0md3v/Photon"><img src="https://image.ibb.co/h5OZAK/photonsmall.png" alt="Photon"></a>
  <br>
  Photon
  <br>
</h1>

<h4 align="center">Incredibly fast crawler designed for OSINT.</h4>

<p align="center">
  <a href="https://github.com/s0md3v/Photon/releases">
    <img src="https://img.shields.io/github/release/s0md3v/Photon.svg">
  </a>
  <a href="https://pypi.org/project/photon/">
    <img src="https://img.shields.io/badge/[email protected]?style=style=flat-square" alt="pypi">
  </a>
  <a href="https://github.com/s0md3v/Photon/issues?q=is%3Aissue+is%3Aclosed">
    <img src="https://img.shields.io/github/issues-closed-raw/s0md3v/Photon.svg">
  </a>
  <a href="https://travis-ci.com/s0md3v/Photon">
    <img src="https://img.shields.io/travis/com/s0md3v/Photon.svg">
  </a>
</p>

<p align="center">
  <a href="https://github.com/s0md3v/Photon/wiki">Photon Wiki</a> •
  <a href="https://github.com/s0md3v/Photon/wiki/Usage">How To Use</a> •
  <a href="https://github.com/s0md3v/Photon/wiki/Compatibility-&-Dependencies">Compatibility</a> •
  <a href="https://github.com/s0md3v/Photon/wiki/Photon-Library">Photon Library</a> •
  <a href="#contribution--license">Contribution</a> •
  <a href="https://github.com/s0md3v/Photon/projects/1">Roadmap</a>
</p>

**Sponsored By [Thordata](https://www.thordata.com/?ls=github&lk=Photon)**

<a href="https://www.thordata.com/?ls=github&lk=Photon"><img src="https://github.com/user-attachments/assets/2cfb6f56-3547-4f82-9d47-2eb14ee3f099"/></a>

### Key Features

#### Data Extraction
Photon can extract the following data while crawling:

- URLs (in-scope & out-of-scope)
- URLs with parameters (`example.com/gallery.php?id=2`)
- Intel (emails, social media accounts, amazon buckets etc.)
- Files (pdf, png, xml etc.)
- Secret keys (auth/API keys & hashes)
- JavaScript files & Endpoints present in them
- Strings matching custom regex pattern
- Subdomains & DNS related data

The extracted information is saved in an organized manner or can be [exported as json](https://github.com/s0md3v/Photon/wiki/Usage#export-formatted-result).

![save demo](https://image.ibb.co/dS1BqK/carbon_2.png)

#### Flexible
Control timeout, delay, add seeds, exclude URLs matching a regex pattern and other cool stuff.
The extensive range of [options](https://github.com/s0md3v/Photon/wiki/Usage) provided by Photon lets you crawl the web exactly the way you want.

#### Genius
Photon's smart thread management & refined logic give you top-notch performance.

Still, crawling can be resource intensive, but Photon has some tricks up its sleeve. You can fetch URLs archived by [archive.org](https://archive.org/) to be used as seeds with the `--wayback` option.

#### Plugins
- **[wayback](https://github.com/s0md3v/Photon/wiki/Usage#use-urls-from-archiveorg-as-seeds)**
- **[dnsdumpster](https://github.com/s0md3v/Photon/wiki/Usage#dumping-dns-data)**
- **[Exporter](https://github.com/s0md3v/Photon/wiki/Usage#export-formatted-result)**

#### Docker
Photon can be launched using a lightweight Python-Alpine (103 MB) Docker image.

```bash
$ git clone https://github.com/s0md3v/Photon.git
$ cd Photon
$ docker build -t photon .
$ docker run -it --name photon photon:latest -u google.com
```

To view results, either head over to the local Docker volume, which you can find by running `docker inspect photon`, or mount the target loot folder:

```bash
$ docker run -it --name photon -v "$PWD:/Photon/google.com" photon:latest -u google.com
```

#### Frequent & Seamless Updates
Photon is under heavy development, and updates that fix bugs, optimize performance & add new features are rolled out regularly.
If you would like to see features and issues that are being worked on, you can do that on [Development](https://github.com/s0md3v/Photon/projects/1) project board.

Updates can be installed & checked for with the `--update` option. Photon has seamless update capabilities which means you can update Photon without losing any of your saved data.

![demo](https://image.ibb.co/kQSUcz/demo.png)
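The crawl controls described under Flexible and Genius can be combined in one run. The following is a minimal sketch, not an official recipe: it assumes Photon is started as `python photon.py` from a cloned checkout, and the flag spellings (`--timeout`, `--delay`, `--seeds`, `--exclude`, `--wayback`, `--export`) are taken from the Usage wiki as understood here, so confirm them with `python photon.py --help`. The target, the seed URL and the exclude pattern are placeholders. The command is only printed unless `RUN=1` is set.

```bash
# Placeholder target; replace with a site you are allowed to crawl.
target="https://example.com"

# Assemble the invocation as positional parameters so quoting is preserved.
# All values below are illustrative assumptions, not recommended defaults.
set -- python photon.py -u "$target" \
  --timeout 5 \
  --delay 1 \
  --seeds "$target/sitemap.html" \
  --exclude "/blog/" \
  --wayback \
  --export json

# Show the command that would be executed.
echo "$*"

# Execute only when explicitly requested, e.g. RUN=1 sh crawl.sh
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Keeping the arguments as separate words rather than one string avoids re-parsing the regex and URLs through the shell when the command is finally run.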
### Contribution & License
You can contribute in the following ways:

- Report bugs
- Develop plugins
- Add more "APIs" for ninja mode
- Give suggestions to make it better
- Fix issues & submit a pull request

Please read the [guidelines](https://github.com/s0md3v/Photon/wiki/Guidelines) before submitting a pull request or issue.

Do you want to have a conversation in private? Hit me up on [Twitter](https://twitter.com/s0md3v/), inbox is open :)

**Photon** is licensed under the [GPL v3.0 license](https://www.gnu.org/licenses/gpl-3.0.en.html)
", Assign "at most 3 tags" to the expected json: {"id":"11576","tags":[]} "only from the tags list I provide: [{"id":77,"name":"3d"},{"id":89,"name":"agent"},{"id":17,"name":"ai"},{"id":54,"name":"algorithm"},{"id":24,"name":"api"},{"id":44,"name":"authentication"},{"id":3,"name":"aws"},{"id":27,"name":"backend"},{"id":60,"name":"benchmark"},{"id":72,"name":"best-practices"},{"id":39,"name":"bitcoin"},{"id":37,"name":"blockchain"},{"id":1,"name":"blog"},{"id":45,"name":"bundler"},{"id":58,"name":"cache"},{"id":21,"name":"chat"},{"id":49,"name":"cicd"},{"id":4,"name":"cli"},{"id":64,"name":"cloud-native"},{"id":48,"name":"cms"},{"id":61,"name":"compiler"},{"id":68,"name":"containerization"},{"id":92,"name":"crm"},{"id":34,"name":"data"},{"id":47,"name":"database"},{"id":8,"name":"declarative-gui "},{"id":9,"name":"deploy-tool"},{"id":53,"name":"desktop-app"},{"id":6,"name":"dev-exp-lib"},{"id":59,"name":"dev-tool"},{"id":13,"name":"ecommerce"},{"id":26,"name":"editor"},{"id":66,"name":"emulator"},{"id":62,"name":"filesystem"},{"id":80,"name":"finance"},{"id":15,"name":"firmware"},{"id":73,"name":"for-fun"},{"id":2,"name":"framework"},{"id":11,"name":"frontend"},{"id":22,"name":"game"},{"id":81,"name":"game-engine 
"},{"id":23,"name":"graphql"},{"id":84,"name":"gui"},{"id":91,"name":"http"},{"id":5,"name":"http-client"},{"id":51,"name":"iac"},{"id":30,"name":"ide"},{"id":78,"name":"iot"},{"id":40,"name":"json"},{"id":83,"name":"julian"},{"id":38,"name":"k8s"},{"id":31,"name":"language"},{"id":10,"name":"learning-resource"},{"id":33,"name":"lib"},{"id":41,"name":"linter"},{"id":28,"name":"lms"},{"id":16,"name":"logging"},{"id":76,"name":"low-code"},{"id":90,"name":"message-queue"},{"id":42,"name":"mobile-app"},{"id":18,"name":"monitoring"},{"id":36,"name":"networking"},{"id":7,"name":"node-version"},{"id":55,"name":"nosql"},{"id":57,"name":"observability"},{"id":46,"name":"orm"},{"id":52,"name":"os"},{"id":14,"name":"parser"},{"id":74,"name":"react"},{"id":82,"name":"real-time"},{"id":56,"name":"robot"},{"id":65,"name":"runtime"},{"id":32,"name":"sdk"},{"id":71,"name":"search"},{"id":63,"name":"secrets"},{"id":25,"name":"security"},{"id":85,"name":"server"},{"id":86,"name":"serverless"},{"id":70,"name":"storage"},{"id":75,"name":"system-design"},{"id":79,"name":"terminal"},{"id":29,"name":"testing"},{"id":12,"name":"ui"},{"id":50,"name":"ux"},{"id":88,"name":"video"},{"id":20,"name":"web-app"},{"id":35,"name":"web-server"},{"id":43,"name":"webassembly"},{"id":69,"name":"workflow"},{"id":87,"name":"yaml"}]" returns me the "expected json"