Ever thought about putting the crawlers you write into a phone?
For example:
- Want music? A background crawler fetches the audio source and plays it.
- Want pictures? A background crawler hits an HD-image endpoint and downloads them.
- Want to look someone up? One tap aggregates user search across social platforms.
Today we'll build one hands-on: MoonMusic. Its core is not the UI but a powerful asynchronous data-collection layer.
🔧 Core Tech Stack
- Data collection (Crawler): httpx (async HTTP requests), BeautifulSoup4 (HTML parsing)
- Concurrency control: asyncio (coroutine scheduling)
- Data visualization (GUI): Flet (a Flutter-based Python UI framework)
- Deployment: Android APK / iOS IPA
Part 1: Hardcore Crawler Design
1. Reverse-Engineering and Wrapping the APIs
By capturing traffic in the browser devtools (F12 → Network), we worked out each platform's search endpoint. To call them all in a uniform way, we define a CrawlerService class.
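Conceptually, each platform gets one async method, and every method normalizes its response into the same dict shape, so the UI layer never cares which site a row came from. A condensed skeleton (the full class, with error handling, appears in the listing at the end):

```python
import httpx

class CrawlerService:
    def __init__(self, helper):
        self.helper = helper  # DataHelper: supplies per-platform Cookies/Headers

    async def search_netease(self, keyword: str) -> list[dict]:
        """Every platform method returns the same normalized shape:
        {"name", "artist", "id", "media_id", "pic", "url", "source"}"""
        url = "https://music.163.com/api/search/get/web"
        params = {"s": keyword, "type": 1, "offset": 0, "limit": 10}
        async with httpx.AsyncClient() as client:
            resp = await client.post(url, headers=self.helper.get_headers("netease"), data=params)
            songs = resp.json()["result"]["songs"]
        return [{
            "name": s["name"],
            "artist": s["artists"][0]["name"],
            "id": s["id"],
            "media_id": s["id"],
            "pic": s.get("album", {}).get("picUrl", ""),
            "url": f"http://music.163.com/song/media/outer/url?id={s['id']}.mp3",
            "source": "网易",
        } for s in songs]
```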
2. True Concurrency with asyncio.gather
This is the project's technical highlight: the user enters one keyword, and we fire requests at 3-5 platforms at the same time.
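The fan-out itself is only a few lines; here is a condensed sketch of the search_all method from the full listing below:

```python
import asyncio

async def search_all(self, keyword, platform="all"):
    tasks = []
    if platform in ("all", "netease"): tasks.append(self.search_netease(keyword))
    if platform in ("all", "qq"):      tasks.append(self.search_qq(keyword))
    if platform in ("all", "kugou"):   tasks.append(self.search_kugou(keyword))
    # All requests are in flight simultaneously: total latency is roughly
    # that of the slowest platform, not the sum of all of them.
    per_platform = await asyncio.gather(*tasks)
    # Interleave results (netease[0], qq[0], kugou[0], netease[1], ...)
    merged = []
    for i in range(max((len(r) for r in per_platform), default=0)):
        merged.extend(r[i] for r in per_platform if i < len(r))
    return merged
```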
3. Anti-Crawling and Cookie Management
To cope with the big platforms' anti-crawling measures, we design a DataHelper class dedicated to managing Cookies and Headers (a sketch follows this list):
- Reads Cookies dynamically from config.json (e.g. a VIP account).
- Generates random User-Agents.
- Sets the Referer to get past hotlink protection.
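The DataHelper internals are not part of the abridged listing below, so this is an illustrative sketch of the three bullets above. Only get_headers and qq_uin are names the real code depends on; the config keys and UA pool here are assumptions:

```python
import json
import random

class DataHelper:
    """Sketch: Cookie/Header management as described above (details assumed)."""

    UA_POOL = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36",
    ]
    REFERERS = {  # anti-hotlinking: each platform expects its own Referer
        "netease": "https://music.163.com/",
        "qq": "https://y.qq.com/",
        "kugou": "https://www.kugou.com/",
    }

    def __init__(self, config_path="config.json"):
        try:
            with open(config_path, encoding="utf-8") as f:
                # Hypothetical layout, e.g. {"netease_cookie": "...", "qq_uin": "0"}
                self.config = json.load(f)
        except FileNotFoundError:
            self.config = {}
        self.qq_uin = self.config.get("qq_uin", "0")

    def get_headers(self, platform: str) -> dict:
        headers = {
            "User-Agent": random.choice(self.UA_POOL),  # random UA per request
            "Referer": self.REFERERS.get(platform, ""),
        }
        cookie = self.config.get(f"{platform}_cookie")  # e.g. a VIP account's Cookie
        if cookie:
            headers["Cookie"] = cookie
        return headers
```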
📱 Part 2: Flet Visualization
With a powerful crawler backend in place, we need a "skin" to present the data. Flet lets us write Flutter-style native interfaces in pure Python.
1. List Rendering (ListView)
The JSON the crawler returns maps directly onto a list of Flet UI controls.
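A minimal sketch of that mapping. The control names are standard Flet APIs; render_results is a hypothetical helper, and the dict keys match the crawler's normalized schema:

```python
import flet as ft

def render_results(page: ft.Page, results: list[dict]):
    # One crawler dict -> one ListTile; the ListView scrolls the whole set.
    lv = ft.ListView(expand=True, spacing=8)
    for song in results:
        lv.controls.append(
            ft.ListTile(
                leading=ft.Image(src=song["pic"], width=48, height=48, border_radius=8),
                title=ft.Text(song["name"]),
                subtitle=ft.Text(f"{song['artist']} · {song['source']}"),
            )
        )
    page.add(lv)
    page.update()
```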
2. Audio Streaming on Mobile
For the scraped .mp3 / .m4a links we skip Pygame (poor mobile compatibility) and call Flet's ft.Audio directly; on Android it is backed by ExoPlayer and supports streaming playback.
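A minimal usage sketch. Note that ft.Audio is a non-visual control that must go into page.overlay, and on recent Flet releases it ships in the separate flet-audio package, so check your version:

```python
import flet as ft

def play_stream(page: ft.Page, stream_url: str):
    # Non-visual control: append it to page.overlay, not the layout tree.
    audio = ft.Audio(src=stream_url, autoplay=True)
    page.overlay.append(audio)
    page.update()  # playback starts once the control reaches the page
```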
📦 Part 3: From Python Script to Android APK
This is the skill crawler engineers want most: how do you get your script to run off the computer?
- Environment: install the flet package.
- Command: run `flet build apk` in the project root.
- How it works: Flet automatically pulls the Flutter engine, compiles your Python crawler code into bytecode, and packs it into the APK.
📥 Running It & Source Code
This project is an excellent "crawler + GUI" practice case, covering reverse engineering, concurrency, UI design, and mobile packaging.
Screenshots: *(images omitted)*
Code (reading it on GitHub is recommended: the layering there is clearer and more complete; the listing below is abridged):
```python
import warnings

warnings.filterwarnings("ignore", category=UserWarning, module="pygame")
warnings.filterwarnings("ignore", category=DeprecationWarning)

import flet as ft
import httpx
import asyncio
import json
import base64
import os
import random
import re
import time
import uuid
import urllib.parse
from bs4 import BeautifulSoup
import pygame
from mutagen.mp3 import MP3


# ==========================================
# 4. Service layer
# ==========================================

class CrawlerService:
    def __init__(self, helper):
        self.helper = helper

    async def search_netease(self, keyword):
        url = "https://music.163.com/api/search/get/web"
        params = {"s": keyword, "type": 1, "offset": 0, "total": "true", "limit": 10}
        async with httpx.AsyncClient(verify=False) as client:
            try:
                headers = self.helper.get_headers("netease")
                resp = await client.post(url, headers=headers, data=params)
                data = resp.json()
                songs = data['result']['songs']
                results = []
                for s in songs:
                    pic_url = s.get('album', {}).get('picUrl', '')
                    if not pic_url and s.get('artists'):
                        pic_url = s['artists'][0].get('img1v1Url', '')
                    results.append({
                        "name": s['name'],
                        "artist": s['artists'][0]['name'],
                        "id": s['id'],
                        "media_id": s['id'],
                        "pic": pic_url,
                        "url": f"http://music.163.com/song/media/outer/url?id={s['id']}.mp3",
                        "source": "网易"
                    })
                return results
            except Exception:
                return []

    async def get_qq_purl(self, songmid, media_id=None):
        if not media_id:
            media_id = songmid
        guid = str(random.randint(1000000000, 9999999999))
        file_types = [{"prefix": "M500", "ext": "mp3", "mid": media_id},
                      {"prefix": "C400", "ext": "m4a", "mid": media_id}]
        url = "https://u.y.qq.com/cgi-bin/musicu.fcg"
        data = {
            "req": {"module": "CDN.SrfCdnDispatchServer", "method": "GetCdnDispatch",
                    "param": {"guid": guid, "calltype": 0, "userip": ""}},
            "req_0": {
                "module": "vkey.GetVkeyServer",
                "method": "CgiGetVkey",
                "param": {
                    "guid": guid,
                    "songmid": [songmid] * 2,
                    "songtype": [0] * 2,
                    "uin": self.helper.qq_uin,
                    "loginflag": 1,
                    "platform": "20",
                    # NOTE: loop variable renamed from `ft` to `ftype` so it no
                    # longer shadows the `flet as ft` import.
                    "filename": [f"{ftype['prefix']}{ftype['mid']}.{ftype['ext']}" for ftype in file_types]
                }
            }
        }
        async with httpx.AsyncClient(verify=False) as client:
            try:
                headers = self.helper.get_headers("qq")
                resp = await client.get(url, params={"data": json.dumps(data)}, headers=headers)
                js = resp.json()
                midurlinfos = js.get('req_0', {}).get('data', {}).get('midurlinfo', [])
                sip = js.get('req_0', {}).get('data', {}).get('sip', [])
                for info in midurlinfos:
                    if info.get('purl'):
                        base = sip[0] if sip else "http://ws.stream.qqmusic.qq.com/"
                        return f"{base}{info['purl']}"
                return ""
            except Exception:
                return ""

    async def search_qq(self, keyword):
        search_url = f"https://c.y.qq.com/soso/fcgi-bin/client_search_cp?p=1&n=10&w={keyword}&format=json"
        async with httpx.AsyncClient(verify=False) as client:
            try:
                headers = self.helper.get_headers("qq")
                resp = await client.get(search_url, headers=headers)
                text = resp.text
                # The endpoint may wrap the JSON in a JSONP callback; strip it.
                if text.startswith("callback("):
                    text = text[9:-1]
                elif text.endswith(")"):
                    text = text[text.find("(") + 1:-1]
                data = json.loads(text)
                songs = data['data']['song']['list']
                results = []
                for s in songs:
                    songmid = s['songmid']
                    media_mid = s.get('media_mid', s.get('strMediaMid', songmid))
                    albummid = s['albummid']
                    pic = f"https://y.gtimg.cn/music/photo_new/T002R300x300M000{albummid}.jpg" if albummid else ""
                    results.append({
                        "name": s['songname'],
                        "artist": s['singer'][0]['name'],
                        "id": songmid,
                        "media_id": media_mid,
                        "pic": pic,
                        "url": "",
                        "source": "QQ"
                    })
                return results
            except Exception:
                return []

    async def search_kugou(self, keyword):
        search_url = f"http://mobilecdn.kugou.com/api/v3/search/song?format=json&keyword={keyword}&page=1&pagesize=6"
        async with httpx.AsyncClient(verify=False) as client:
            try:
                headers = self.helper.get_headers("kugou")
                resp = await client.get(search_url, headers=headers)
                data = resp.json()
                songs = data['data']['info']
                tasks = []
                for s in songs:
                    tasks.append(client.get(
                        f"http://www.kugou.com/yy/index.php?r=play/getdata&hash={s['hash']}&album_id={s.get('album_id', '')}",
                        headers=headers))
                detail_resps = await asyncio.gather(*tasks, return_exceptions=True)
                results = []
                for r in detail_resps:
                    if isinstance(r, httpx.Response):
                        try:
                            d = r.json()['data']
                            if d['play_url']:
                                results.append({
                                    "name": d['audio_name'],
                                    "artist": d['author_name'],
                                    "id": d['hash'],
                                    "media_id": d['hash'],
                                    "pic": d['img'],
                                    "url": d['play_url'],
                                    "source": "酷狗"
                                })
                        except Exception:
                            pass
                return results
            except Exception:
                return []

    async def search_all(self, keyword, platform="all"):
        tasks = []
        if platform in ["all", "netease"]:
            tasks.append(self.search_netease(keyword))
        if platform in ["all", "qq"]:
            tasks.append(self.search_qq(keyword))
        if platform in ["all", "kugou"]:
            tasks.append(self.search_kugou(keyword))
        results = await asyncio.gather(*tasks)
        # Interleave the per-platform lists so sources are mixed in the UI.
        merged = []
        if results:
            max_len = max(len(r) for r in results)
            for i in range(max_len):
                for r in results:
                    if i < len(r):
                        merged.append(r[i])
        return merged

    async def search_images_bing(self, keyword):
        url = f"https://www.bing.com/images/search?q={keyword}&form=HDRSC2&first=1"
        async with httpx.AsyncClient(verify=False, follow_redirects=True) as client:
            try:
                headers = {
                    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
                    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
                    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
                    "Referer": "https://www.bing.com/"
                }
                resp = await client.get(url, headers=headers, timeout=8)
                soup = BeautifulSoup(resp.text, 'html.parser')

                results = []
                iusc_links = soup.select('a.iusc')
                for link in iusc_links:
                    try:
                        m_str = link.get('m')
                        if m_str:
                            m_data = json.loads(m_str)
                            img_url = m_data.get('turl') or m_data.get('murl')
                            full_url = m_data.get('murl')
                            if img_url:
                                results.append({"url": full_url, "thumb": img_url})
                    except Exception:
                        continue

                # Fallback: plain <img> tags when the JSON attributes are missing
                if not results:
                    imgs = soup.select('img.mimg')
                    for img in imgs:
                        src = img.get('src') or img.get('data-src')
                        if src and src.startswith('http'):
                            results.append({"url": src, "thumb": src})

                random.shuffle(results)
                return results[:24]
            except Exception as e:
                print(f"Image search failed: {e}")
                return []

    async def search_social_users(self, keyword, platform="all"):
        results = []

        # Inner helper: Bilibili user search
        async def fetch_bili(client):
            try:
                bili_url = f"https://api.bilibili.com/x/web-interface/search/type?search_type=bili_user&keyword={urllib.parse.quote(keyword)}"
                headers = self.helper.get_headers("bilibili")
                if "Cookie" not in headers:
                    headers["Cookie"] = "buvid3=infoc;"
                resp = await client.get(bili_url, headers=headers)
                data = resp.json()
                local_res = []
                if data.get('code') == 0 and data.get('data') and data['data'].get('result'):
                    for user in data['data']['result'][:4]:
                        local_res.append({
                            "platform": "Bilibili",
                            "name": user['uname'],
                            "desc": f"粉丝: {user.get('fans', 0)} | {user.get('usign', '')[:20]}...",
                            "pic": user['upic'].replace("http://", "https://"),
                            "url": f"https://space.bilibili.com/{user['mid']}"
                        })
                return local_res
            except Exception:
                return []

        # Inner helper: Weibo user search
        async def fetch_weibo(client):
            try:
                encoded_q = urllib.parse.quote(keyword)
                weibo_url = f"https://m.weibo.cn/api/container/getIndex?containerid=100103type%3D3%26q%3D{encoded_q}&page_type=searchall"
                headers = {
                    "User-Agent": "Mozilla/5.0 (Linux; Android 10; K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36",
                    "Referer": "https://m.weibo.cn/"
                }
                resp = await client.get(weibo_url, headers=headers)
                data = resp.json()
                local_res = []
                cards = data.get('data', {}).get('cards', [])
                count = 0
                for card in cards:
                    if count >= 3:
                        break
                    if 'card_group' in card:
                        for item in card['card_group']:
                            if item.get('card_type') == 11 and 'user' in item:
                                u = item['user']
                                local_res.append({
                                    "platform": "微博",
                                    "name": u.get('screen_name'),
                                    "desc": f"粉丝: {u.get('followers_count', 0)} | {u.get('description', '')[:20]}",
                                    "pic": u.get('profile_image_url', ''),
                                    "url": f"https://m.weibo.cn/u/{u.get('id')}"
                                })
                                count += 1
                return local_res
            except Exception:
                return []

        # Short 4s timeout so a slow platform cannot block the UI for long
        async with httpx.AsyncClient(verify=False, timeout=4.0) as client:
            tasks = []
            if platform in ["all", "bilibili"]:
                tasks.append(fetch_bili(client))
            if platform in ["all", "weibo"]:
                tasks.append(fetch_weibo(client))

            # Run the lookups in parallel
            if tasks:
                task_results = await asyncio.gather(*tasks, return_exceptions=True)
                for tr in task_results:
                    if isinstance(tr, list):
                        results.extend(tr)

        # "Deep link" cards for platforms we only jump into
        if platform in ["all", "douyin"]:
            results.append({
                "platform": "抖音",
                "name": f"搜索: {keyword}",
                "desc": "点击直接跳转抖音网页版搜索",
                "pic": "https://lf1-cdn-tos.bytegoofy.com/goofy/ies/douyin_web/public/favicon.ico",
                "url": f"https://www.douyin.com/search/{urllib.parse.quote(keyword)}"
            })

        if platform in ["all", "xiaohongshu"]:
            results.append({
                "platform": "小红书",
                "name": f"搜索: {keyword}",
                "desc": "点击直接跳转小红书搜索页",
                "pic": "https://ci.xiaohongshu.com/fd579468-69cb-4190-8457-377eb60c1d68",
                "url": f"https://www.xiaohongshu.com/search_result?keyword={urllib.parse.quote(keyword)}"
            })

        return results


if __name__ == "__main__":
    ft.run(main)  # `main` (the Flet UI entry point) is defined in the full GitHub source
```
GitHub repo: MoonPointer-Byte/MoonMusic
Key files:
- services/crawler.py: core crawler logic (recommended starting point)
- core/player.py: player logic
- main.py: UI logic
⚠️ Disclaimer
- Technical scope: this article discusses only httpx-based async request techniques and the app packaging workflow.
- For learning and exchange only: everything here that touches on scraping NetEase Cloud Music data was written purely for personal study and exchange. The aim is to share the exploration process and help beginners understand how network data retrieval works; there is no commercial intent whatsoever. If you put these techniques to commercial use, all resulting legal liability and financial disputes are yours and have nothing to do with the author.
- Data-use restrictions: NetEase Cloud Music data obtained with the methods described here must be handled in strict accordance with the source platform's rules and applicable laws and regulations. Using it for illegal purposes such as infringing others' intellectual property, malicious redistribution, or unfair competition is prohibited; otherwise you bear the legal consequences yourself.
- Technical risk: the techniques described may stop working as NetEase Cloud Music updates its platform or adjusts its anti-crawling measures. While reproducing them you may run into all kinds of technical problems, possibly including account restrictions or device issues. The author can offer no guarantees and accepts no responsibility for these risks; proceed carefully.
- **Legal responsibility is your own:** web scraping raises many legal questions, and rules on data collection and use differ across jurisdictions. Before applying anything in this article, make sure you fully understand and comply with your local laws. If your actions break the law, the author bears no legal responsibility; all consequences are yours.
- Accuracy and timeliness: although the author has tried hard to keep the content accurate, technology evolves quickly and platforms keep changing, so information here may be outdated or inaccurate. If you find errors or things that need updating, please point them out, but the author is not liable for any loss caused by inaccurate or outdated content.
⚠️ Copyright Notice
This article and all of its content belong to MoonPointer-Byte, the author. Others may study it, but commercial use is not allowed, and the author assumes no responsibility for any issues that arise!
