E-mails, subdomains and names Harvester - OSINT
 
About
-----
theHarvester is a simple to use, yet powerful tool designed to be used during the reconnaissance stage of a red
team assessment or penetration test. It performs open source intelligence (OSINT) gathering to help determine
a domain's external threat landscape. The tool gathers names, emails, IPs, subdomains, and URLs by using
multiple public resources, which are listed in the Passive modules section.
Install and dependencies
------------------------
* Python 3.12 or higher.
* https://github.com/laramies/theHarvester/wiki/Installation
Install uv:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
Clone the repository:
```bash
git clone https://github.com/laramies/theHarvester
cd theHarvester
```
Install dependencies and create a virtual environment:
```bash
uv sync
```
Run theHarvester:
```bash
uv run theHarvester
```
## Development
To install development dependencies:
```bash
uv sync --all-groups
```
To run tests:
```bash
uv run pytest
```
To run linting and formatting:
```bash
uv run ruff check
```
```bash
uv run ruff format
```
Passive modules
---------------
* baidu: Baidu search engine (https://www.baidu.com)
* bevigil: CloudSEK BeVigil scans mobile applications for OSINT assets (https://bevigil.com/osint-api)
* brave: Brave search engine - now uses official Brave Search API (https://api-dashboard.search.brave.com)
* bufferoverun: Fast domain name lookups for TLS certificates in IPv4 space (https://tls.bufferover.run)
* builtwith: Find out what websites are built with (https://builtwith.com)
* censys: Uses certificate searches to enumerate subdomains and gather emails (https://censys.io)
* certspotter: Cert Spotter monitors Certificate Transparency logs (https://sslmate.com/certspotter)
* criminalip: Specialized Cyber Threat Intelligence (CTI) search engine (https://www.criminalip.io)
* crtsh: Comodo Certificate search (https://crt.sh)
* dehashed: Take your data security to the next level (https://dehashed.com)
* dnsdumpster: Domain research tool that can discover hosts related to a domain (https://dnsdumpster.com)
* duckduckgo: DuckDuckGo search engine (https://duckduckgo.com)
* fofa: FOFA search engine (https://en.fofa.info)
* fullhunt: Next-generation attack surface security platform (https://fullhunt.io)
* github-code: GitHub code search engine (https://www.github.com)
* hackertarget: Online vulnerability scanners and network intelligence to help organizations (https://hackertarget.com)
* haveibeenpwned: Check if your email address is in a data breach (https://haveibeenpwned.com)
* hunter: Hunter search engine (https://hunter.io)
* hunterhow: Internet search engines for security researchers (https://hunter.how)
* intelx: Intelx search engine (https://intelx.io)
* leakix: LeakIX search engine (https://leakix.net)
* leaklookup: Data breach search engine (https://leak-lookup.com)
* netlas: A Shodan or Censys competitor (https://app.netlas.io)
* onyphe: Cyber defense search engine (https://www.onyphe.io)
* otx: AlienVault open threat exchange (https://otx.alienvault.com)
* pentesttools: Cloud-based toolkit for offensive security testing, focused on web applications and network penetration testing (https://pentest-tools.com)
* projectdiscovery: Actively collects and maintains internet-wide assets data, to enhance research and analyse changes around DNS for better insights (https://chaos.projectdiscovery.io)
* rapiddns: DNS query tool that makes it easy to query subdomains or sites sharing the same IP (https://rapiddns.io)
* rocketreach: Access real-time verified personal/professional emails, phone numbers, and social media links (https://rocketreach.co)
* securityscorecard: Helps TPRM and SOC teams detect, prioritize, and remediate vendor risk across their entire supplier ecosystem at scale (https://securityscorecard.com)
* securityTrails: Security Trails search engine, the world's largest repository of historical DNS data (https://securitytrails.com)
* -s, --shodan: Shodan search engine will search for ports and banners from discovered hosts (https://shodan.io)
* subdomaincenter: A subdomain finder tool used to find subdomains of a given domain (https://www.subdomain.center)
* subdomainfinderc99: A subdomain finder is a tool used to find the subdomains of a given domain (https://subdomainfinder.c99.nl)
* thc: Free subdomain enumeration service with no API key required (https://ip.thc.org)
* threatminer: Data mining for threat intelligence (https://www.threatminer.org)
* tomba: Tomba search engine (https://tomba.io)
* urlscan: A sandbox for the web that is a URL and website scanner (https://urlscan.io)
* venacus: Venacus search engine (https://venacus.com)
* virustotal: Domain search (https://www.virustotal.com)
* whoisxml: Subdomain search (https://subdomains.whoisxmlapi.com/api/pricing)
* yahoo: Yahoo search engine (https://www.yahoo.com)
* windvane: Windvane search engine (https://windvane.lichoin.com)
* zoomeye: China's version of Shodan (https://www.zoomeye.org)
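
As a minimal sketch of how the module names above are used: each name is passed to the `-b` option, several names can be combined with commas, `-d` sets the target domain, and `-l` caps the number of results. The domain, source selection, and limit below are placeholder assumptions, and `harvest` is a hypothetical wrapper (not part of theHarvester) that prints the command unless `DRY_RUN=0` is set.

```bash
# Assumed values: replace example.com with a domain you are authorized to assess.
DOMAIN="example.com"
SOURCES="crtsh,duckduckgo"   # comma-separated names from the Passive modules list
LIMIT=200                    # maximum number of search results to process

# Hypothetical helper: prints the command by default, runs it when DRY_RUN=0.
harvest() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "uv run theHarvester $*"
  else
    uv run theHarvester "$@"
  fi
}

harvest -d "$DOMAIN" -b "$SOURCES" -l "$LIMIT"
```

Run `uv run theHarvester -h` to confirm the options available in your installed version before setting `DRY_RUN=0`.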
Active modules
--------------
* DNS brute force: dictionary brute force enumeration
* Screenshots: Take screenshots of subdomains that were found
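
Both active modules are switched on by command-line options. The sketch below assumes that `-c` enables the DNS brute force and that `--screenshot` takes an output directory, as in recent releases; confirm this with `uv run theHarvester -h`. The domain, source, and directory are placeholders, and the script only prints the assembled command so it can be reviewed before it is run.

```bash
# Assumed values: replace example.com with a domain you are authorized to assess.
DOMAIN="example.com"
BRUTE=1                     # 1 adds -c (DNS brute force)
SHOT_DIR="./screenshots"    # non-empty adds --screenshot <dir>

# Start from a passive lookup and append the active options that are enabled.
OPTS="-d $DOMAIN -b crtsh"
if [ "$BRUTE" = "1" ]; then
  OPTS="$OPTS -c"
fi
if [ -n "$SHOT_DIR" ]; then
  OPTS="$OPTS --screenshot $SHOT_DIR"
fi

# Print the command for review; copy it into your shell to execute it.
echo "uv run theHarvester $OPTS"
```

Active modules send traffic directly to the target, so they are kept opt-in here rather than enabled by default.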
Modules that require an API key
-------------------------------
Documentation for setting up API keys can be found at https://github.com/laramies/theHarvester/wiki/Installation#api-keys
* bevigil - 50 free queries/month. 1k queries/month $50
* brave - free plan available. Pro plans for higher limits
* bufferoverun - 100 free queries/month. 10k/month $25
* builtwith - 50 free queries ever. $2950/yr
* censys - 500 credits $100
* criminalip - 100 free queries/month. 700k/month $59
* dehashed - 500 credits $15, 5k credits $150
* dnsdumpster - 50 free queries/day, $49
* fofa - query credits 10,000/month. 100k results/month $25
* fullhunt - 50 free queries. 200 queries $29/month, 500 queries $59
* github-code
* haveibeenpwned - 10 email searches/min $4.50, 50 email searches/min $22
* hunter - 50 free credits/month. 12k credits/yr $34
* hunterhow - 10k free API results per 30 days. 50k API results per 30 days $10
* intelx - free account is very limited. Business account $2900
* leakix - free 25 results pages, 3000 API requests/month. Bounty Hunter $29
* leaklookup - 20 credits $10, 50 credits $20, 140 credits $50, 300 credits $100
* netlas - 50 free requests/day. 1k requests $49, 10k requests $249
* onyphe - 10M results/month $587
* pentesttools - 5 assets netsec $95/month, 5 assets webnetsec $140/month
* projectdiscovery - requires work email. Free monthly discovery and vulnerability scans on sign-up email domain, enterprise $
* rocketreach - 100 email lookups/month $48, 250 email lookups/month $108
* securityscorecard - requires a work email
* securityTrails - 50 free queries/month. 20k queries/month $500
* shodan - Freelancer $69/month, Small Business $359/month
* tomba - 25 free searches/month. 1k searches/month $39, 5k searches/month $89
* venacus - 1 free search/day. 10 searches/day $12, 30 searches/day $36
* virustotal - 500 free lookups/day, 15.5k lookups/month. Business accounts require a work email
* whoisxml - 2k queries $50, 5k queries $105
* windvane - 100 free queries
* zoomeye - 5 free results/day. 30 results/day $190/yr
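
API keys are read from an `api-keys.yaml` file. The sketch below writes a two-entry template to a scratch directory, so an existing configuration is never overwritten. The `apikeys:` / module name / `key:` layout and the placeholder values are assumptions based on the file shipped with recent releases; some modules may need additional fields, so check the wiki page linked above before relying on it.

```bash
# Write the template to a scratch directory rather than the live config location.
OUT_DIR="$(mktemp -d)"
KEY_FILE="$OUT_DIR/api-keys.yaml"

cat > "$KEY_FILE" <<'EOF'
# Assumed layout; verify against the wiki before use.
apikeys:
  hunter:
    key: YOUR_HUNTER_KEY
  shodan:
    key: YOUR_SHODAN_KEY
EOF

# Keys are secrets: restrict the file to owner read/write.
chmod 600 "$KEY_FILE"
echo "Review $KEY_FILE, then merge it into your theHarvester api-keys.yaml"
```

Keeping the file out of version control and limiting its permissions reduces the chance of leaking paid credentials.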
## Package versions
[Packaging status on Repology](https://repology.org/project/theharvester/versions)
Comments, bugs, and requests
----------------------------
* Christian Martorella [@laramies](https://twitter.com/laramies)
[email protected]
* Matthew Brown [@NotoriousRebel1](https://twitter.com/NotoriousRebel1)
* Jay "L1ghtn1ng" Townsend [@jay_townsend1](https://twitter.com/jay_townsend1)
Main contributors
-----------------
* Matthew Brown [@NotoriousRebel1](https://twitter.com/NotoriousRebel1)
* Jay "L1ghtn1ng" Townsend [@jay_townsend1](https://twitter.com/jay_townsend1)
* Lee Baird [@discoverscripts](https://twitter.com/discoverscripts)
Thanks
------
* John Matherly - Shodan project
* Ahmed Aboul Ela - subdomain names dictionaries (big and small)