Xiaohongshu (小红书) crawler for data collection and an all-round Xiaohongshu operations solution
<p align="center">
<a href="https://github.com/cv-cat/Spider_XHS" target="_blank" align="center" alt="Go to XHS_Spider Website">
<picture>
<img width="220" src="https://github.com/user-attachments/assets/b817a5d2-4ca6-49e9-b7b1-efb07a4fb325" alt="Spider_XHS logo">
</picture>
</a>
</p>
<div align="center">
<a href="https://www.python.org/">
<img src="https://img.shields.io/badge/python-3.7%2B-blue" alt="Python 3.7+">
</a>
<a href="https://nodejs.org/zh-cn/">
<img src="https://img.shields.io/badge/nodejs-18%2B-blue" alt="NodeJS 18+">
</a>
</div>
# Spider_XHS
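**✨ A professional Xiaohongshu data collection solution that crawls notes and saves them as Excel or media files**
**✨ An all-round Xiaohongshu operations solution: rewrite notes (image-text or video) with AI in one click and upload them directly**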
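## ⭐ Feature List
**⚠️ Any operation involving data injection is prohibited. This project is for learning and exchange only; you bear the consequences of any violation.**
| Module | Implemented |
|----------|---------------------------------------------------------------------------------|
| Xiaohongshu Creator Platform | ✅ QR code login (not open source)<br/>✅ SMS verification code login (not open source)<br/>✅ Upload works (image sets, videos) (not open source)<br/>✅ View your own uploaded works (not open source) |
| Xiaohongshu PC | ✅ QR code login (not open source)<br/> ✅ SMS verification code login (not open source) <br/> ✅ Get watermark-free images (open source)<br/> ✅ Get watermark-free videos (open source)<br/> ✅ Get all homepage channels (open source)<br/>✅ Get recommended notes from the homepage (open source)<br/>✅ Get a user's profile information (open source)<br/>✅ Get your own account information (open source)<br/>✅ Get notes posted by a user (open source)<br/>✅ Get all notes liked by a user (open source)<br/>✅ Get all notes collected by a user (open source)<br/>✅ Get the details of a note (open source)<br/>✅ Search notes (open source)<br/>✅ Search users (open source)<br/>✅ Get comments on a note (open source)<br/>✅ Get unread message information (open source)<br/>✅ Get received comments and @ mentions (open source)<br/>✅ Get received likes and collects (open source)<br/>✅ Get new follower information (open source)|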
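## 🌟 Features
- ✅ **Multi-dimensional data collection**
  - User profile information
  - Note details
  - Smart search result scraping
- 🚀 **High-performance architecture**
  - Automatic retry mechanism
- 🔒 **Secure and stable**
  - Adapted to the latest Xiaohongshu API
  - Exception handling
  - Proxy support
- 🎨 **Easy management**
  - Structured directory storage
  - Formatted output (JSON/EXCEL/MEDIA)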
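The automatic retry mechanism listed above can be pictured roughly as follows. This is a minimal sketch, not the project's actual implementation: the `retry` decorator and its `max_attempts`, `base_delay`, and `exceptions` parameters are hypothetical names chosen for illustration.

```python
import time
from functools import wraps


def retry(max_attempts=3, base_delay=0.5, exceptions=(Exception,)):
    """Retry the wrapped function with exponential backoff (hypothetical helper)."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    # Wait base_delay, then 2x, then 4x, ... before retrying.
                    time.sleep(base_delay * 2 ** (attempt - 1))
        return wrapper
    return decorator
```

Wrapping a request function with `@retry(max_attempts=3)` would then re-run it on failure and re-raise the last exception once the attempts are used up.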
## 🎨 Screenshots
### All users after processing

### All notes of a user

### Details of a note

### Saved Excel file

## 🛠️ Quick Start
### ⛳ Requirements
- Python 3.7+
- Node.js 18+
### 🎯 Install Dependencies
```
pip install -r requirements.txt
npm install
```
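### 🎨 Configuration
The configuration file is the `.env` file in the project root. Put your own login cookie in it. To get the cookie, press F12 in your browser to open the developer tools, select the Network tab, filter by Fetch, and open any request.

Copy the cookie into the `.env` file (note: only a cookie obtained after logging in to Xiaohongshu is valid; a cookie from a logged-out session will not work)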
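To illustrate the configuration step, here is a minimal sketch of reading the cookie string from `.env` and turning it into a dictionary. The key name `COOKIES` and both helper functions are assumptions made for this example; check the project's own `.env` and code for the names it actually uses.

```python
from pathlib import Path


def read_env_value(env_path, key):
    """Return the value of `key` from a simple KEY=VALUE .env file, or None."""
    for line in Path(env_path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        name, _, value = line.partition("=")  # split on the first "=" only
        if name.strip() == key:
            return value.strip().strip("'\"")  # drop surrounding quotes
    return None


def cookie_str_to_dict(cookie_str):
    """Split a browser cookie header ("a=1; b=2") into a dict."""
    cookies = {}
    for pair in cookie_str.split(";"):
        if "=" in pair:
            name, _, value = pair.strip().partition("=")
            cookies[name] = value
    return cookies
```

For example, `cookie_str_to_dict(read_env_value(".env", "COOKIES"))` would produce a dictionary that an HTTP client can send as cookies, assuming the key is present in the file.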

### 🚀 Run the Project
```
python main.py
```
### 🗝️ Notes
- The code in main.py is the crawler's entry point; modify it to suit your needs
- The code in apis/pc_apis.py contains all the API endpoints; modify it to suit your needs
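## 🍥 Changelog
| Date | Description |
|----------| --------------------------- |
| 23/08/09 | - Initial commit |
| 23/09/13 | - API change: added two fields to params; fixed images failing to download and errors caused by inaccessible pages |
| 23/09/16 | - Fixed an encoding problem with larger videos; added exception handling |
| 23/09/18 | - Refactored the code; added retry on failure |
| 23/09/19 | - Added downloading of search results |
| 23/10/05 | - Added skipping of already-downloaded items; fetches more detailed note and user information |
| 23/10/08 | - Published the code to PyPI; the project can be installed via pip install |
| 23/10/17 | - Added sort options for search downloads (1. general, 2. most popular, 3. latest) |
| 23/10/21 | - Added a graphical interface; uploaded to release v2.1.0 |
| 23/10/28 | - Bug fix: fixed a hidden issue in the search feature |
| 25/03/18 | - Updated the API; fixed some issues |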
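## 🧸 Additional Notes
1. Thanks for your stars ⭐ and follows 📰! The project is updated from time to time
2. The author's contact details are on the profile page; feel free to get in touch with any questions
3. Check out the author's other projects; PRs and issues are welcome
4. Thanks for sponsoring! If this project helps you, buy the author a milk tea~~ (it will make my day 😊😊)
5. Thank you~~~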
<div align="center">
<img src="./author/wx_pay.png" width="400px" alt="WeChat tip QR code">
<img src="./author/zfb_pay.jpg" width="400px" alt="Alipay payment QR code">
</div>
## 📈 Star History
<a href="https://www.star-history.com/#cv-cat/Spider_XHS&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=cv-cat/Spider_XHS&type=Date&theme=dark" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=cv-cat/Spider_XHS&type=Date" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=cv-cat/Spider_XHS&type=Date" />
</picture>
</a>