PyCNKi Batch Literature Downloader: Usage Tutorial

2021-01-28 10:31

PyCNKi downloader usage tutorial

PyCNKi downloader source code

(The Baidu link includes the source code in .ipynb format.)

1. Import the libraries

from selenium import webdriver
from selenium.webdriver import ChromeOptions
from selenium.webdriver.chrome.options import Options
import openpyxl
import re
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.select import Select
import urllib.error

2. Open CNKI and apply the initial settings

# Run Chrome without a visible window (headless)
def wu_visual():
    chrome_options = Options()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--disable-gpu')
    return chrome_options

# Anti-detection: drop the "enable-automation" switch
def fan_jiance():
    option = ChromeOptions()
    option.add_experimental_option('excludeSwitches', ['enable-automation'])
    # option.add_argument('-kiosk')
    return option

# Load a URL and report errors; returns None on success, otherwise the error reason
def url_error_test(url, bro):
    try:
        bro.get(url)
        print("OK")
    except urllib.error.HTTPError as e:
        print(e.code)
        print(e.reason)
        return e.reason
    except urllib.error.URLError as e:
        print(e.reason)
        return e.reason

chrome_options = wu_visual()
option = fan_jiance()
chrome_path = r'./chromedriver.exe'
# Selenium 3 style call; Selenium 4 removed executable_path and chrome_options
bro = webdriver.Chrome(executable_path=chrome_path, chrome_options=chrome_options, options=option)

# Firefox users can simply remove the "#" from the next line
# bro = webdriver.Firefox()

bro.maximize_window()  # maximize the window
url = r'http://kns.cnki.net'  # CNKI address
bro.get(url)

3. Keyword search

# Simulate typing a keyword query
# Choose whichever query mode you need; this code only provides search by title
input_title = bro.find_element_by_id("txt_SearchText")
input_title.click()
time.sleep(2)
key_value = input("Enter the title of the paper you want to download: ")
input_title.send_keys(key_value)

# Click the search button
div_search = bro.find_element_by_xpath('/html/body/div[1]/div[2]/div/div[1]/input[2]')
div_search.click()
time.sleep(1)

# Click "journal articles"
default_1 = 20  # number of results per page
bro.find_element_by_xpath("/html/body/div[5]/div[1]/div/ul[1]/li[1]/a/span").click()
time.sleep(10)
total_num = bro.find_element_by_xpath("/html/body/div[5]/div[1]/div/ul[1]/li[1]/a/em")
if int(total_num.text) <= default_1:
    print("Found " + total_num.text + " results in total")
    print("One page in total")
    num = 1  # single page; the page-number check in the download step needs this
else:
    print("Found " + total_num.text + " results in total")
    total_page = bro.find_element_by_xpath('//*[@id="gridTable"]/div[2]/span[1]')
    print(total_page.text)
    num = int(total_page.text[1:-1])  # strip the first and last character around the number
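The slice `total_page.text[1:-1]` only works when the pager label is exactly one character, the number, and one more character. A slightly more tolerant way to read the number is sketched below with `re`, which section 1 already imports. The helper name `parse_total_pages` and the sample label `共12页` are assumptions made for illustration; the real label text on CNKI may differ.

```python
import re


def parse_total_pages(label: str) -> int:
    """Return the first run of digits found in a pager label (hypothetical helper)."""
    match = re.search(r"\d+", label)
    if match is None:
        raise ValueError(f"no page count found in {label!r}")
    return int(match.group())


# Sample label is an assumption, not text captured from the site.
print(parse_total_pages("共12页"))
```

With such a helper, the last line of the search step could become `num = parse_total_pages(total_page.text)`.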

4. Choose the download format and the last page to download

print("1: PDF format  2: CAJ format  Enter the number of the format to download:")
load_num = int(input("Enter 1 or 2: "))

print("Enter the last page number you want to download:")
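Both prompts in this step call `int(input(...))` directly, so a non-numeric answer raises `ValueError` and stops the script. The sketch below shows one possible validation loop. `ask_int` and its `read` parameter are hypothetical names that are not part of PyCNKi; `read` exists only so the demo can run without a keyboard.

```python
def ask_int(prompt, low, high, read=input):
    """Keep asking until the answer is an integer in [low, high]."""
    while True:
        raw = read(prompt)
        try:
            value = int(raw)
        except ValueError:
            print("Please enter a whole number.")
            continue
        if low <= value <= high:
            return value
        print(f"Please enter a number between {low} and {high}.")


# Demo with canned answers instead of real keyboard input.
answers = iter(["abc", "0", "2"])
choice = ask_int("Enter 1 or 2: ", 1, 2, read=lambda _prompt: next(answers))
print(choice)
```

In the tutorial's flow, the format choice could then be read as `load_num = ask_int("Enter 1 or 2: ", 1, 2)`.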

5. Start the batch download

load_page = int(input())
while load_page > num or load_page <= 0:
    print("Invalid page number, please enter it again:")
    load_page = int(input("Enter the last page number to download: "))

# XPath of the paper title on a detail page
title_xpath = "/html/body/div[2]/div[1]/div[3]/div/div[1]/div[3]/div[1]/h1"

if int(total_num.text) <= default_1:
    # All results fit on a single page
    url_link = bro.find_elements_by_xpath('//*[@id="gridTable"]/table/tbody/tr/td[2]/a')
    count = 1
    for link_1 in url_link:
        link = url + r'/kcms/detail/detail.aspx?' + link_1.get_attribute("href")[20:]
        bro_new = webdriver.Chrome(executable_path=chrome_path, chrome_options=chrome_options, options=option)
        bro_new.get(link)
        bro_new.maximize_window()
        # print("Paper No. " + str(count) + ": " + bro_new.find_element_by_xpath(title_xpath).text + " - downloading")
        time.sleep(10)
        # "URL参数错误" is the text of CNKI's "URL parameter error" page
        if bro_new.find_element_by_xpath('/html/body/div[2]/div').text == "URL参数错误":
            print("Paper No. " + str(count) + ": " + bro_new.find_element_by_xpath(title_xpath).text + " - download failed!")
            bro_new.quit()
            count += 1
            continue
        if load_num == 1:
            bro_new.find_element_by_id('pdfDown').click()
            time.sleep(10)
            print("Paper No. " + str(count) + ": " + bro_new.find_element_by_xpath(title_xpath).text + " - downloaded successfully")
            count += 1
            bro_new.quit()
        if load_num == 2:
            bro_new.find_element_by_id('cajDown').click()
            time.sleep(10)
            print("Paper No. " + str(count) + ": " + bro_new.find_element_by_xpath(title_xpath).text + " - downloaded successfully")
            count += 1
            bro_new.quit()
else:
    # Walk through the result pages up to load_page
    for ii in range(0, load_page):
        count = 1
        url_link = bro.find_elements_by_xpath('//*[@id="gridTable"]/table/tbody/tr/td[2]/a')
        for link_1 in url_link:
            link = url + r'/kcms/detail/detail.aspx?' + link_1.get_attribute("href")[20:]
            bro_new = webdriver.Chrome(executable_path=chrome_path, chrome_options=chrome_options, options=option)
            bro_new.get(link)
            bro_new.maximize_window()
            time.sleep(10)
            if bro_new.find_element_by_xpath('/html/body/div[2]/div').text == "URL参数错误":
                # Read the title before quitting the browser
                print("Paper No. " + str(count) + ": " + bro_new.find_element_by_xpath(title_xpath).text + " - download failed!")
                bro_new.quit()
                count += 1
                continue
            if load_num == 1:
                bro_new.find_element_by_name('pdfDown').click()
                time.sleep(10)
                print("Paper No. " + str(count) + ": " + bro_new.find_element_by_xpath(title_xpath).text + " - downloaded successfully")
                count += 1
                bro_new.quit()
            if load_num == 2:
                bro_new.find_element_by_name('cajDown').click()
                time.sleep(5)
                print("Paper No. " + str(count) + ": " + bro_new.find_element_by_xpath(title_xpath).text + " - downloaded successfully")
                count += 1
                bro_new.quit()
        # Go to the next results page unless this was the last requested one
        if ii < load_page - 1:
            bro.find_element_by_xpath('//*[@id="PageNext"]').click()
            time.sleep(10)
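The download loop relies on fixed `time.sleep(10)` pauses, which waste time when a page loads quickly and still fail when it loads slowly. Selenium's `WebDriverWait` (imported in section 1 but never used) polls for a condition instead. The library-independent sketch below shows the same idea; `wait_until` and the stand-in `ready` condition are hypothetical names, not part of PyCNKi or Selenium.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value; raises TimeoutError if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(interval)


# Demo: a stand-in condition that becomes true on the third poll.
calls = {"n": 0}


def ready():
    calls["n"] += 1
    return calls["n"] >= 3


wait_until(ready, timeout=2.0, interval=0.01)
print(calls["n"])
```

In the loop above, the condition might be something like `lambda: bro_new.find_elements_by_id('pdfDown')`, which returns an empty (falsy) list until the download button is present. Whether that is a sufficient readiness signal on CNKI's pages is an assumption.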