
Building a Document Scanner with OpenCV

2022-05-09 17:15

磐創AI


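This article uses OpenCV to build a simple document scanner, much like the camera scanner apps in common use. The idea is simple: assuming the document is a quadrilateral, we find the positions of its edges and use them to extract the document itself, discarding the useless image background.

The pipeline is simple: load the image >> detect edges and grab their positions >> apply those positions to the image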

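Importing packages

First, we import the packages we may need to process the image. The threshold_local function may look new to you, but there is nothing special about it: it comes from the scikit-image package.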

# import packages
from skimage.filters import threshold_local
import numpy as np
import cv2
import imutils

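Loading the image.

Here we load the image and keep a copy. The untouched original matters because the final scan is cut from it at full resolution. For processing, I resize the image to a reasonable scale, convert it to grayscale to drop the color information, and blur it (which helps remove high-frequency noise from the image background). All of this is in service of finding the document's edges.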

# load the image and keep an untouched copy
image = cv2.imread("images/questions.jpg")
orig = image.copy()

# resize the image
height = image.shape[0]
width = image.shape[1]
ratio = 0.2
width = int(ratio * width)
height = int(ratio * height)
image = cv2.resize(image, (width, height))

# convert to grayscale before looking for edges
gray_scaled = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# blur the image
gray_scaled = cv2.GaussianBlur(gray_scaled, (5, 5), 0)

# edge detection
edged = cv2.Canny(gray_scaled, 50, 200)

cv2.imshow("Image", image)
cv2.waitKey(0)
cv2.imshow("Edges detected", edged)
cv2.waitKey(0)
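To make the "reduce the colors" step concrete, here is a minimal NumPy sketch of the weighted channel sum that OpenCV documents for its RGB-to-gray conversion (Y = 0.299 R + 0.587 G + 0.114 B). The helper name bgr_to_gray and the tiny sample image are my own assumptions for illustration; the real cv2.cvtColor may differ by a rounding step.

```python
import numpy as np

def bgr_to_gray(image_bgr):
    """Hypothetical helper: weighted sum of the B, G, R channels."""
    b = image_bgr[..., 0].astype(np.float64)
    g = image_bgr[..., 1].astype(np.float64)
    r = image_bgr[..., 2].astype(np.float64)
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    # round to the nearest integer and store as 8-bit, like an image
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# 2 x 2 sample image in BGR order: white, black, pure blue, pure red
sample = np.array([[[255, 255, 255], [0, 0, 0]],
                   [[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
gray = bgr_to_gray(sample)
```

Three channels collapse into one, so every later step (blur, Canny) works on a single intensity value per pixel.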

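Finding contours.

We find the contours with cv2.findContours(). Next, we use the imutils library to grab the contours from its return value, and finally we sort them by contour area, largest first. In this case, I keep the 5 largest.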

# find contours in the edged image; keep only the largest contours
contours = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

# grab contours
contours = imutils.grab_contours(contours)

# select contours based on size
contours = sorted(contours, key=cv2.contourArea, reverse=True)[:5]
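To see what the sort is doing, here is a minimal sketch that ranks polygons by area using the shoelace formula in plain NumPy. The helper polygon_area and the sample shapes are assumptions for illustration; in the scanner itself the key is cv2.contourArea.

```python
import numpy as np

def polygon_area(points):
    """Hypothetical helper: shoelace area of a closed polygon."""
    # reshape so that OpenCV-style (N, 1, 2) contours also work
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 2)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

shapes = [
    np.array([[0, 0], [4, 0], [4, 3], [0, 3]]),      # 4 x 3 rectangle
    np.array([[0, 0], [10, 0], [10, 10], [0, 10]]),  # 10 x 10 square
    np.array([[0, 0], [2, 0], [0, 2]]),              # small triangle
]

# same pattern as above: sort by area, largest first, keep the top 2
largest = sorted(shapes, key=polygon_area, reverse=True)[:2]
areas = [polygon_area(shape) for shape in largest]
```

The square comes first and the triangle is dropped, which is the behavior we rely on to discard small noise contours.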

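Processing the contours further.

First, we loop over the contours and compute each one's perimeter, which is needed to approximate the contour as a polygon with only a few points. We then look for a contour whose approximation has exactly 4 points, which is most likely the roughly rectangular sheet of paper. Once found, we take the coordinates of those points and store them as the paper outline.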

# loop over the contours
paper_outline = None
for contour in contours:
    perimeter = cv2.arcLength(contour, True)
    # approximate the contour with a simpler polygon
    approximation = cv2.approxPolyDP(contour, 0.02 * perimeter, True)
    # a contour with 4 points is most likely the paper
    if len(approximation) == 4:
        paper_outline = approximation
        break

# stop here if none of the candidates had four corners
if paper_outline is None:
    raise RuntimeError("No four-point contour found")
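For a feel of the numbers involved, here is a small sketch of the closed perimeter and the resulting tolerance passed to cv2.approxPolyDP. The helper closed_perimeter and the 300 x 400 outline are made up for illustration; the scanner itself uses cv2.arcLength.

```python
import numpy as np

def closed_perimeter(points):
    """Hypothetical helper: sum of edge lengths, including the closing edge."""
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 2)
    edges = np.roll(pts, -1, axis=0) - pts
    return float(np.linalg.norm(edges, axis=1).sum())

# an axis-aligned 300 x 400 outline
outline = np.array([[0, 0], [300, 0], [300, 400], [0, 400]])
perimeter = closed_perimeter(outline)

# the tolerance used above: 2% of the perimeter
epsilon = 0.02 * perimeter
```

Here epsilon comes out at roughly 28 pixels, the maximum distance the simplified polygon may stray from the original contour. A larger factor yields fewer points, a smaller one keeps more detail.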

With the coordinates in hand, the next step is to draw the outline, which is simple.

# draw the found contour
cv2.drawContours(image, [paper_outline], -1, (225, 0, 0), 2)
cv2.imshow("Found outline", image)
cv2.waitKey(0)

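The question on your mind: are we done?

You might say yes, since there is now a neat outline around the document. The answer is no. To get the best scan, we need a top-down, 90-degree view of the document, especially when it was photographed at an angle. To do this, we will write functions that handle the task.

Pipeline: arrange the points >> label the points >> pick the points from the real image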

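The arrange_points function.

The approach is very simple and is credited to Adrian Rosebrock (PhD). The intuition behind this function is that we take the coordinates of the document's four corners and arrange them in the order we expect them to be in. I spent some time putting together a graphical representation of the description.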

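Sum of the point coordinates

1) In image coordinates, where the origin is the top-left corner and Y grows downward, the point with the smallest coordinate sum (X + Y) is the top-left corner.

2) The point with the largest sum is the bottom-right corner.

Difference of the point coordinates

3) The point with the smallest difference (Y - X, which is what np.diff computes) is the top-right corner.

4) The point with the largest difference is the bottom-left corner.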
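A quick worked example of the sum and difference rules, in image coordinates (origin at the top-left, Y growing downward). The four corner values are invented for illustration and deliberately listed out of order.

```python
import numpy as np

# corners of a slightly skewed page, in shuffled order
points = np.array([[320, 40],    # top-right
                   [30, 60],     # top-left
                   [350, 420],   # bottom-right
                   [20, 400]],   # bottom-left
                  dtype="float32")

sums = points.sum(axis=1)                # x + y -> 360, 90, 770, 420
diffs = np.diff(points, axis=1).ravel()  # y - x -> -280, 30, 70, 380

top_left = points[np.argmin(sums)]       # smallest sum
bottom_right = points[np.argmax(sums)]   # largest sum
top_right = points[np.argmin(diffs)]     # smallest difference
bottom_left = points[np.argmax(diffs)]   # largest difference
```

Each rule picks out exactly one corner, so the four lookups together recover the order top-left, top-right, bottom-right, bottom-left regardless of how the contour listed them. The rule can break down for quadrilaterals rotated close to 45 degrees, where two corners may share the same sum or difference.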

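Code.

The function takes a points argument. I initialize a NumPy array to represent the rectangle: a 4 x 2 matrix, since we have 4 points with 2 coordinates each (X, Y).

Then, as described above, I fill in the rectangle's points using the sums and the differences of the coordinates. Finally, I return the correctly ordered coordinates of the rectangle.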

def arrange_points(points):
    # initialize an array of coordinates that will be ordered:
    # first entry is the top-left point, second entry is the top-right,
    # third entry is the bottom-right, fourth/last is the bottom-left.
    rectangle = np.zeros((4, 2), dtype="float32")

    # the top-left point has the smallest sum (x + y),
    # the bottom-right point has the largest sum.
    sum_points = points.sum(axis=1)
    rectangle[0] = points[np.argmin(sum_points)]
    rectangle[2] = points[np.argmax(sum_points)]

    # the top-right point has the smallest difference (y - x),
    # the bottom-left point has the largest difference.
    diff_points = np.diff(points, axis=1)
    rectangle[1] = points[np.argmin(diff_points)]
    rectangle[3] = points[np.argmax(diff_points)]

    # return the ordered coordinates
    return rectangle

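Setting the four points.

This function is simple. The idea is to straighten the paper and extract only the region we need. The inputs are 1) the image itself and 2) the points, or coordinates. First, we order the points with the arrange_points function we just wrote. Next, I unpack the points into named corners, which is straightforward because they are already ordered.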

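Computation.

For the computation, the distance between two points gives the length of each side. With these lengths we can size the output correctly and avoid distorting the image. As the name suggests, the destination is the new view of the image: [0, 0] is the top-left corner, [maxwidth - 1, 0] is the top-right corner, [maxwidth - 1, maxheight - 1] is the bottom-right corner, and [0, maxheight - 1] is the bottom-left corner.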
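The side lengths and destination corners can be sketched on their own with np.linalg.norm, which computes the same Euclidean distance as the explicit square-root formula. The helper names output_size and destination_corners and the sample corners are assumptions for illustration only.

```python
import numpy as np

def output_size(rectangle):
    """Hypothetical helper: (max width, max height) of an ordered quadrilateral."""
    top_left, top_right, bottom_right, bottom_left = rectangle
    top_width = np.linalg.norm(top_right - top_left)
    bottom_width = np.linalg.norm(bottom_right - bottom_left)
    left_height = np.linalg.norm(bottom_left - top_left)
    right_height = np.linalg.norm(bottom_right - top_right)
    return (max(int(top_width), int(bottom_width)),
            max(int(left_height), int(right_height)))

def destination_corners(max_width, max_height):
    """Hypothetical helper: target corners in the same order as the rectangle."""
    return np.array([[0, 0],
                     [max_width - 1, 0],
                     [max_width - 1, max_height - 1],
                     [0, max_height - 1]], dtype="float32")

# ordered as top-left, top-right, bottom-right, bottom-left
rectangle = np.array([[10, 20], [310, 20], [330, 420], [10, 420]], dtype="float32")
max_width, max_height = output_size(rectangle)
destination = destination_corners(max_width, max_height)
```

Taking the maximum of each pair of opposite sides means the longer edge sets the output size, so no part of the page gets squeezed.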

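Transformation matrix

With that done, all that remains is to compute the transformation matrix with cv2.getPerspectiveTransform(), which takes the rectangle of source points and the destination. Once we have the matrix, we apply it with cv2.warpPerspective(), which takes the image you passed to the function, the transformation matrix, and finally the (width, height) of the output scan. That is everything; we return the transformed image.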

# set four points
def set_four_points(image, points):
    # obtain the ordered points and unpack them
    rectangle = arrange_points(points)
    (top_left, top_right, bottom_right, bottom_left) = rectangle

    # compute the side lengths of the rectangle
    # using the formula for the distance between two points
    left_height = np.sqrt(((top_left[0] - bottom_left[0]) ** 2) + ((top_left[1] - bottom_left[1]) ** 2))
    right_height = np.sqrt(((top_right[0] - bottom_right[0]) ** 2) + ((top_right[1] - bottom_right[1]) ** 2))
    top_width = np.sqrt(((top_right[0] - top_left[0]) ** 2) + ((top_right[1] - top_left[1]) ** 2))
    bottom_width = np.sqrt(((bottom_right[0] - bottom_left[0]) ** 2) + ((bottom_right[1] - bottom_left[1]) ** 2))

    maxheight = max(int(left_height), int(right_height))
    maxwidth = max(int(top_width), int(bottom_width))

    destination = np.array([
        [0, 0],
        [maxwidth - 1, 0],
        [maxwidth - 1, maxheight - 1],
        [0, maxheight - 1]], dtype="float32")

    matrix = cv2.getPerspectiveTransform(rectangle, destination)
    warped = cv2.warpPerspective(image, matrix, (maxwidth, maxheight))

    return warped
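If you are curious what the 3 x 3 matrix actually is, here is a minimal NumPy sketch that solves for a perspective transform from four point pairs. It is a conceptual illustration under my own assumptions (the helpers perspective_matrix and apply_matrix and the sample points are hypothetical), not OpenCV's actual implementation.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Hypothetical helper: solve the 8 unknowns of a 3 x 3 homography."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), same for v
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows), np.array(rhs))
    # the last entry of the matrix is fixed to 1
    return np.append(h, 1.0).reshape(3, 3)

def apply_matrix(matrix, point):
    """Map one (x, y) point through the matrix, dividing by the scale term."""
    x, y, w = matrix @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])

source = [[30, 60], [320, 40], [350, 420], [20, 400]]
target = [[0, 0], [319, 0], [319, 399], [0, 399]]
matrix = perspective_matrix(source, target)
mapped = np.array([apply_matrix(matrix, p) for p in source])
```

Each skewed corner lands on its destination corner, and cv2.warpPerspective uses the same kind of matrix to map every pixel in between.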

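Applying the functions

With the functions written, we apply them to the original image we saved at the start. The second input is the paper outline. I scale the outline back to the original size by undoing the resize ratio applied at the beginning. For a black-and-white look, apply threshold_local; if you want a color scan, you do not need it at all. Finally, I resize and display the result.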

warped = set_four_points(orig, paper_outline.reshape(4, 2) * (1 / ratio))

# uncomment the next three lines for a black-and-white scan
# warped = cv2.cvtColor(warped, cv2.COLOR_BGR2GRAY)
# threshold = threshold_local(warped, 11, offset=10, method="gaussian")
# warped = (warped > threshold).astype("uint8") * 255

# show the original and scanned images
print("Image Reset in progress")
cv2.imshow("Original", cv2.resize(orig, (width, height)))
cv2.imshow("Scanned", cv2.resize(warped, (width, height)))
cv2.waitKey(0)
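The rescaling of the outline is easy to check in isolation. The sketch below uses an invented contour in OpenCV's (4, 1, 2) layout. It also shows a hypothetical helper, scale_points, that derives the scale from the two image shapes instead of from ratio, which may be slightly more exact because the resized width and height were truncated with int().

```python
import numpy as np

ratio = 0.2

# an outline found on the small image, in OpenCV's (4, 1, 2) contour layout
outline_small = np.array([[[6, 12]], [[64, 8]], [[70, 84]], [[4, 80]]], dtype=np.int32)

# undo the resize: multiply by 1 / ratio, as in the call above
outline_full = outline_small.reshape(4, 2) * (1 / ratio)

def scale_points(points, small_shape, full_shape):
    """Hypothetical helper: per-axis scale from (height, width) shapes."""
    scale_y = full_shape[0] / small_shape[0]
    scale_x = full_shape[1] / small_shape[1]
    return points.reshape(-1, 2) * np.array([scale_x, scale_y])

outline_exact = scale_points(outline_small, (100, 80), (500, 400))
```

With a clean ratio both approaches agree; they only drift apart by a fraction of a pixel when the original dimensions are not exact multiples.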