Automated Scraping with Selenium in Python

2025/7/6 1:40:15 Source: https://blog.csdn.net/weixin_42289273/article/details/141567112 Keywords: automated scraping with Selenium in Python

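Note that automated scraping must comply with the applicable laws and regulations; do not use it for anything illegal or in violation of a site's rules.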

1. Install the dependency

pip install selenium
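
To confirm the install worked, a small check like the one below can help. It is only a sketch: the helper names `installed_version` and `major_version` are made up for illustration, and it relies on the standard-library `importlib.metadata` module (Python 3.8+). The example in section 3 imports `Service`, which is the Selenium 4 style of starting a driver, so a major version of 4 or higher is what you would want to see.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def major_version(version_string):
    """Extract the leading number from a dotted version string."""
    return int(version_string.split(".")[0])


# Prints the installed Selenium version, or None if pip install has not been run.
print(installed_version("selenium"))
```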

2. Download the driver that controls Chrome

Note: the ChromeDriver version must match the version of Chrome installed on your machine.
Download listing (JSON): https://googlechromelabs.github.io/chrome-for-testing/known-good-versions-with-downloads.json
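
Since the link above is a JSON listing rather than a download page, picking the right driver by hand can be tedious. The sketch below shows one way to look up a matching ChromeDriver URL. It assumes the listing has a top-level `versions` array whose entries carry `version` and `downloads.chromedriver` (a list of `platform`/`url` pairs); the function names are hypothetical. When there is no exact match it falls back to the newest entry sharing the same first three version components, which is an assumption about what counts as "corresponding", so verify the result against your browser.

```python
import json
from urllib.request import urlopen

# URL taken from the article above.
KNOWN_GOOD_URL = (
    "https://googlechromelabs.github.io/chrome-for-testing/"
    "known-good-versions-with-downloads.json"
)


def fetch_known_good_versions(url=KNOWN_GOOD_URL):
    """Download and parse the version listing (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def version_key(version):
    """Turn '128.0.6613.84' into (128, 0, 6613, 84) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def find_chromedriver_url(data, chrome_version, platform="win64"):
    """Return (version, url) of the best-matching chromedriver, or None.

    An exact version match wins; otherwise the newest entry with the same
    first three version components is used (an assumption, see the lead-in).
    """
    wanted = version_key(chrome_version)
    candidates = []
    for entry in data.get("versions", []):
        key = version_key(entry["version"])
        if key[:3] != wanted[:3]:
            continue
        # Older entries in the listing may have no chromedriver downloads.
        for download in entry.get("downloads", {}).get("chromedriver", []):
            if download.get("platform") == platform:
                candidates.append((key, entry["version"], download["url"]))
    if not candidates:
        return None
    for key, version, url in candidates:
        if key == wanted:
            return version, url
    _, version, url = max(candidates)
    return version, url


# Usage (not run here because it needs network access):
# data = fetch_known_good_versions()
# print(find_chromedriver_url(data, "128.0.6613.84", platform="win64"))
```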

3. Selenium usage example

import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Create a ChromeOptions instance (optional)
# chrome_options = Options()
# chrome_options.add_argument("--headless")  # e.g. run in headless mode

# Use a proxy
# chrome_options = Options()
# chrome_options.add_argument("--proxy-server=http://127.0.0.1:8080")

# Path to the chromedriver executable
service = Service('D:/chromedirve_128_0_6613_84/chromedriver-win64/chromedriver.exe')

# Instantiate the Chrome browser object
# driver = webdriver.Chrome(service=service, options=chrome_options)
driver = webdriver.Chrome(service=service)
driver.get("https://www.mayanan.cn/vnc.html")
# driver.get("http://www.baidu.com")

# Take a screenshot and save it
# driver.save_screenshot("vnc.png")

# Find a page element and click it
# element = driver.find_element(By.ID, "noVNC_connect_button")
# element.click()

# Get the page's HTML source
# print(driver.page_source)

# Wait for the element to finish loading
# element = WebDriverWait(driver, 10).until(
#     EC.presence_of_element_located((By.ID, "noVNC_connect_button"))
# )

# Close the browser and stop the driver process
driver.quit()
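
The commented-out pieces above (options, explicit wait, click, page source) can be combined into one function. The following is a rough sketch rather than the author's method: `build_chrome_arguments` and `click_when_present` are hypothetical names, and wrapping the work in `try`/`finally` is an added design choice so the browser is closed even if the wait times out. Selenium is imported inside the function so that the argument helper stays usable on a machine without Selenium installed.

```python
def build_chrome_arguments(headless=False, proxy=None):
    """Collect the Chrome command-line flags used in the example above."""
    arguments = []
    if headless:
        arguments.append("--headless")
    if proxy:
        arguments.append(f"--proxy-server={proxy}")
    return arguments


def click_when_present(url, element_id, driver_path,
                       headless=False, proxy=None, timeout=10):
    """Open url, wait for the element with the given id, click it,
    and return the page source. Always quits the browser afterwards."""
    # Imported lazily; these are the same imports as in the example above.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    for argument in build_chrome_arguments(headless, proxy):
        options.add_argument(argument)

    driver = webdriver.Chrome(service=Service(driver_path), options=options)
    try:
        driver.get(url)
        # Raises TimeoutException if the element is not in the DOM in time.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.ID, element_id))
        )
        element.click()
        return driver.page_source
    finally:
        driver.quit()


# Usage with the values from the example above (needs Chrome and chromedriver):
# html = click_when_present(
#     "https://www.mayanan.cn/vnc.html",
#     "noVNC_connect_button",
#     "D:/chromedirve_128_0_6613_84/chromedriver-win64/chromedriver.exe",
# )
```

`presence_of_element_located` only waits for the element to exist in the DOM; if the click fails because the button is not yet interactable, `EC.element_to_be_clickable` is the stricter condition to try.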
