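Many free novel sites do not offer a way to download their text. To make it easier to read novels on an MP4 player or other offline device, I wrote a simple downloader. (This article is for reference only; do not use it for anything illegal!)
How it works: requests fetches the page content, BeautifulSoup parses the response as HTML, and the script then narrows the data down step by step by matching the page's tags.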
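The tag-by-tag filtering described above can be sketched without touching the network. The HTML below is a made-up sample (an assumption, not the real page), shaped only after the `odd`/`even` cell classes that the script looks for; `parse_rows` and its dictionary keys are hypothetical names chosen for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one header row and one result row, mimicking the
# class names ("odd" / "even") that the downloader filters on.
SAMPLE_HTML = """
<table>
  <tr><th>Title</th><th>Latest chapter</th><th>Author</th></tr>
  <tr>
    <td class="odd"><a href="/book/1/">Sample Book</a></td>
    <td class="even">Chapter 10</td>
    <td class="odd">Sample Author</td>
  </tr>
</table>
"""


def parse_rows(html):
    """Return one dict per table row that looks like a search result."""
    soup = BeautifulSoup(html, 'html.parser')
    books = []
    for row in soup.find_all('tr'):
        odd = row.find_all('td', class_='odd')
        even = row.find_all('td', class_='even')
        if len(odd) < 2 or not even:
            continue  # header rows have no matching cells
        link = row.find('a', href=True)
        books.append({
            'title': odd[0].text.strip(),
            'author': odd[1].text.strip(),
            'latest': even[0].text.strip(),
            'href': link['href'] if link else None,
        })
    return books


books = parse_rows(SAMPLE_HTML)
print(books)
```

Checking the number of matching cells up front skips the header row explicitly, so the loop does not have to rely on catching an exception for rows that are not results.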
OK, without further ado, here is the code:
import os
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

index_url = 'http://www.ibiqu.net/'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36'}


def get_soup(url, **kwargs):
    """Fetch a page and parse it with BeautifulSoup."""
    response = requests.get(url, headers=headers, timeout=10, **kwargs)
    return BeautifulSoup(response.text, 'html.parser')


while True:
    title = input('Book title >> ')
    limit = input('Number of results >> ')
    if not limit.isdecimal() or int(limit) < 1:
        print(f"[-] '{limit}' is not a valid number of results")
        continue
    limit = int(limit)

    # Each <tr> in the search results describes one book.
    search_url = urljoin(index_url, 'modules/article/search.php')
    books = []
    for row in get_soup(search_url, params={'searchkey': title}).find_all('tr'):
        try:
            books.append({
                'Title': row.find('td', class_='odd').text,
                'Author': row.find_all('td', class_='odd')[1].text,
                'Latest chapter': row.find('td', class_='even').text,
                'Word count': f"{row.find_all('td', class_='even')[1].text} (0k means still being serialized)",
                'Updated': row.find('td', align='center', class_='odd').text,
                'Status': row.find('td', align='center', class_='even').text,
                # urljoin avoids a double slash when href starts with '/'.
                'url': urljoin(index_url, row.find('a', href=True).get('href')),
            })
        except (AttributeError, IndexError):
            continue  # header rows and other non-result rows
        if len(books) >= limit:
            break

    if not books:
        print(f"[-] No book found for '{title}'")
        continue

    # Number the results from 0 so they match the list indices.
    for number, book in enumerate(books):
        print(f'Book number: {number}')
        for key, value in book.items():
            print(f'{key}: {value}')
        print('')

    while True:
        book_index = input('\nChoose a book number >> ')
        if book_index.isdecimal() and int(book_index) < len(books):
            book_index = int(book_index)
            break
        print(f"[-] '{book_index}' is not in the list above")

    # Write into a folder named after the book, without changing the working
    # directory, so a second download does not end up nested inside the first.
    book = books[book_index]
    book_dir = book['Title']
    os.makedirs(book_dir, exist_ok=True)

    for link in get_soup(book['url']).find_all('a', href=True):
        # Chapter links contain "第" (as in "第1章"); every other link is skipped.
        if '第' not in link.text:
            continue
        chapter = link.text
        chapter_url = urljoin(index_url, link.get('href'))
        try:
            for content in get_soup(chapter_url).find_all('div', id='content'):
                path = os.path.join(book_dir, f'{chapter}.txt')
                with open(path, 'w', encoding='utf8') as file:
                    file.write(content.text.replace(' ', '\n'))
            print(f'[+] {chapter} downloaded')
        except (requests.RequestException, OSError):
            print(f'[-] {chapter} failed to download')
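One detail worth guarding against: the script uses the chapter title directly as a file name (`f'{chapter}.txt'`), so a title containing characters such as `/`, `?` or `:` can make `open` fail. A minimal sketch of one way to handle this is below; `safe_filename` is a hypothetical helper, not part of the original script, and the set of replaced characters is an assumption based on what Windows rejects in file names.

```python
import re


def safe_filename(title, replacement='_'):
    """Replace characters that are not allowed in file names (assumed set)."""
    # \ / : * ? " < > | and control characters are rejected by Windows;
    # '/' is also the path separator on Linux and macOS.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', replacement, title)
    # Leading or trailing spaces and dots cause trouble on Windows as well.
    cleaned = cleaned.strip(' .')
    return cleaned or 'untitled'


print(safe_filename('第1章 开始?'))
print(safe_filename('a/b:c'))
```

With this helper, the file name in the download loop would become `f'{safe_filename(chapter)}.txt'`.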