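Computer Vision, Chapter 7: Image Search

7.1 Content-Based Image Retrieval

Content-based image retrieval (CBIR) is used to find visually similar images in large image databases. The returned images may be similar in color, in texture, or in the objects or scenes they contain; essentially, any information the images themselves share can serve as the basis for similarity.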

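The vector space model is a model for representing and searching text documents. Each document is represented by a vector built from a histogram of word frequencies: the vector holds the number of times each word occurs in the document and is zero for every word that does not occur, so it is usually sparse. Because the model ignores the order and position of the words, it is also known as the bag-of-words (BoW) representation.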
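As a minimal sketch of this representation, the snippet below builds word-count vectors for two short documents over a shared vocabulary. The helper name `bow_vector` and the sample sentences are illustrative assumptions, not taken from the text.

```python
from collections import Counter

def bow_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in text; word order is ignored."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

docs = ["the cat sat on the mat", "the dog chased the cat"]
# Shared, sorted vocabulary over the whole (toy) corpus.
vocabulary = sorted({w for d in docs for w in d.split()})
vectors = [bow_vector(d, vocabulary) for d in docs]

for d, v in zip(docs, vectors):
    print(v, "<-", d)
```

Each vector has one entry per vocabulary word, and words that a document does not contain contribute zeros.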

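Documents are indexed by counting words to build a document histogram vector v. In general, the importance of a word is proportional to the number of times it occurs in the document and inversely proportional to how common it is across the dataset (or corpus).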

The most common weighting is tf-idf (term frequency-inverse document frequency). The term frequency of word w in document d is:

$\mathrm{tf}_{w,d}=\frac{n_w}{\sum_jn_j}$

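Here $n_w$ is the number of times word w occurs in document d. To normalize, $n_w$ is divided by the total number of words in the document. The inverse document frequency is:

$\mathrm{idf}_{w,d}=\log\frac{|D|}{|\{d : w\in d\}|}$

|D| is the number of documents in the corpus D, and the denominator is the number of documents d in the corpus that contain the word w. Multiplying the two terms gives the tf-idf weight of the word.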
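The two formulas can be sketched in a few lines of Python. This is an illustrative example only: the function names and the toy corpus are assumptions, and `idf` assumes the word occurs in at least one document (otherwise the denominator would be zero).

```python
import math

def tf(word, doc):
    """Term frequency: occurrences of word divided by the document length."""
    return doc.count(word) / len(doc)

def idf(word, corpus):
    """Inverse document frequency: log(|D| / number of documents containing word)."""
    n_containing = sum(1 for doc in corpus if word in doc)
    return math.log(len(corpus) / n_containing)

def tf_idf(word, doc, corpus):
    """tf-idf weight: the product of the two terms."""
    return tf(word, doc) * idf(word, corpus)

# Toy corpus: each document is a list of words.
corpus = [
    ["image", "search", "image"],
    ["image", "retrieval"],
    ["text", "search"],
]
print(tf_idf("image", corpus[0], corpus))      # frequent in the document, common in the corpus
print(tf_idf("retrieval", corpus[1], corpus))  # rarer in the corpus, so weighted higher
```

A word that appears in every document gets an idf of log(1) = 0, so it contributes nothing to the weighted vector.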

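7.2 Visual Words

To apply text-mining techniques to images, we first need a visual equivalent of words; this is usually done with local descriptors such as SIFT. The idea is to quantize the descriptor space into a number of typical examples and to assign each descriptor in an image to one of them. These typical examples are determined by analyzing a set of training images and are treated as visual words. The set of all visual words is called the visual vocabulary.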
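The assignment step can be sketched as a nearest-neighbour lookup followed by counting. The snippet below is an illustration under simplifying assumptions: the function name `project`, the toy 2-D "descriptors", and the three-word vocabulary are made up for the example (real SIFT descriptors have 128 dimensions).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def project(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word and return the word histogram."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda k: squared_distance(d, vocabulary[k]))
        hist[nearest] += 1
    return hist

# Hypothetical vocabulary of three visual words and five descriptors from one image.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (4.8, 5.1), (5.3, 4.9), (1.1, 0.8)]
print(project(descriptors, vocabulary))
```

The resulting histogram plays the same role for an image as the word-count vector does for a text document.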

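Creating a vocabulary

To create a visual vocabulary, we use SIFT feature descriptors. With imlist holding the image filenames, the feature-extraction step computes the descriptors of every image and saves the descriptors of each image to a file.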
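Once descriptors are available, the visual words have to be derived from them. The text does not say which clustering method is used; k-means is one common choice, so the sketch below should be read as an assumption. It is a plain implementation of Lloyd's algorithm on toy 2-D points, and the function name `kmeans` and its parameters are illustrative.

```python
import random

def kmeans(points, k, iterations=10, seed=0):
    """Plain Lloyd's algorithm: return k cluster centres (the visual words)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # start from k distinct training points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centres[j])))
            clusters[nearest].append(p)
        # Update step: move each centre to the mean of its members.
        for j, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster ends up empty
                centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centres

# Two well-separated groups of toy "descriptors".
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centres = sorted(kmeans(points, 2))
print(centres)
```

With real data, the descriptors of all training images would be stacked into one list before clustering, and the number of words k would be much larger.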

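7.3 Image Indexing

7.3.1 Setting Up the Database

Before images can be indexed, we need to set up a database. Indexing an image here means extracting its descriptors, converting them to visual words using the vocabulary, and storing the visual words together with the image's word histogram. The database can then be queried with an image and returns similar images as search results.

Before starting, we need to create the tables and indexes, along with an Indexer class that writes image data to the database. First, create a file named imagesearch.py and add the following code: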

import pickle
import sqlite3
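An indexer of the kind described above might look like the following minimal sketch. It is not the original Indexer class: the method names `create_tables` and `add_to_index` are hypothetical, and the schema is inferred from the table and column names that appear in the search queries (imlist, imwords, imhistograms). It assumes the word histogram of an image has already been computed.

```python
import pickle
import sqlite3

class Indexer(object):
    """Hypothetical indexer: writes visual words and histograms to SQLite."""

    def __init__(self, db):
        self.con = sqlite3.connect(db)

    def create_tables(self):
        """Create the three tables used by the search queries, plus two indexes."""
        self.con.execute("create table imlist(filename)")
        self.con.execute("create table imwords(imid, wordid)")
        self.con.execute("create table imhistograms(imid, histogram)")
        self.con.execute("create index im_idx on imlist(filename)")
        self.con.execute("create index wordid_idx on imwords(wordid)")
        self.con.commit()

    def add_to_index(self, imname, histogram):
        """Store one image: its filename, the words it contains, and its histogram."""
        cur = self.con.execute("insert into imlist(filename) values (?)", (imname,))
        imid = cur.lastrowid
        for wordid, count in enumerate(histogram):
            if count > 0:
                self.con.execute("insert into imwords(imid, wordid) values (?, ?)",
                                 (imid, wordid))
        # The histogram is pickled so that it can be stored as a single value.
        self.con.execute("insert into imhistograms(imid, histogram) values (?, ?)",
                         (imid, pickle.dumps(histogram)))
        self.con.commit()
        return imid

# Usage with an in-memory database and two made-up histograms.
indexer = Indexer(":memory:")
indexer.create_tables()
first = indexer.add_to_index("a.jpg", [2, 0, 1])
second = indexer.add_to_index("b.jpg", [0, 3, 1])
rows = indexer.con.execute("select distinct imid from imwords where wordid=?", (2,)).fetchall()
print(sorted(r[0] for r in rows))
```

Because one row is inserted into imlist and imhistograms per image, the rowids of the two tables stay in step, which is what a lookup of histograms by rowid relies on.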

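7.4 Searching the Database for Images

Once the images are indexed, we can search the database for similar images. Here we use a BoW representation of the whole image, but the procedure is general and can be used to find similar objects, similar faces, similar colors, and so on; it depends entirely on the images and the descriptors used.

To implement search, we add a Searcher class to imagesearch.py: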

import pickle
import sqlite3

from PIL import Image
from pylab import *  # NumPy names (array, sqrt, sum, zeros) and plotting functions


class Searcher(object):

    def __init__(self, db, voc):
        """Initialize with the name of the database."""
        self.con = sqlite3.connect(db)
        self.voc = voc

    def __del__(self):
        self.con.close()

    def get_imhistogram(self, imname):
        """Return the word histogram for an image."""
        im_id = self.con.execute(
            "select rowid from imlist where filename=?", (imname,)).fetchone()
        s = self.con.execute(
            "select histogram from imhistograms where rowid=?", im_id).fetchone()
        # use pickle to decode the NumPy array stored in the database
        return pickle.loads(s[0])

    def candidates_from_word(self, imword):
        """Get list of images containing imword."""
        im_ids = self.con.execute(
            "select distinct imid from imwords where wordid=?", (int(imword),)).fetchall()
        return [i[0] for i in im_ids]

    def candidates_from_histogram(self, imwords):
        """Get list of images with similar words."""
        # get the word ids
        words = imwords.nonzero()[0]

        # find candidates
        candidates = []
        for word in words:
            candidates += self.candidates_from_word(word)

        # take all unique image ids and sort them by occurrence, most frequent first
        tmp = [(w, candidates.count(w)) for w in set(candidates)]
        tmp.sort(key=lambda t: t[1], reverse=True)

        # return sorted list, best matches first
        return [w[0] for w in tmp]

    def query(self, imname):
        """Find a list of matching images for imname."""
        h = self.get_imhistogram(imname)
        candidates = self.candidates_from_histogram(h)

        matchscores = []
        for imid in candidates:
            # get the name
            cand_name = self.get_filename(imid)
            cand_h = self.get_imhistogram(cand_name)
            cand_dist = sqrt(sum(self.voc.idf * (h - cand_h) ** 2))
            matchscores.append((cand_dist, imid))

        # return a sorted list of distances and database ids
        matchscores.sort()
        return matchscores

    def get_filename(self, imid):
        """Return the filename for an image id."""
        s = self.con.execute(
            "select filename from imlist where rowid=?", (imid,)).fetchone()
        return s[0]


def tf_idf_dist(voc, v1, v2):
    """idf-weighted distance between two normalized word histograms."""
    # normalize into new arrays so the caller's histograms are left untouched
    v1 = v1 / sum(v1)
    v2 = v2 / sum(v2)
    return sqrt(sum(voc.idf * (v1 - v2) ** 2))


def compute_ukbench_score(src, imlist):
    """Return the average number of correct images
    among the top four results of queries."""
    nbr_images = len(imlist)
    pos = zeros((nbr_images, 4))

    # get first four results for each image
    for i in range(nbr_images):
        pos[i] = [w[1] - 1 for w in src.query(imlist[i])[:4]]

    # compute score and return average
    score = array([(pos[i] // 4) == (i // 4) for i in range(nbr_images)]) * 1.0
    return sum(score) / nbr_images


def plot_results(src, res):
    """Show images in result list 'res'."""
    figure()
    nbr_results = len(res)
    for i in range(nbr_results):
        imname = src.get_filename(res[i])
        subplot(1, nbr_results, i + 1)
        imshow(array(Image.open(imname)))
        axis('off')
    show()

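get_filename() converts an image id to the image's filename so that the image can be loaded when the search results are displayed. plot_results() uses it to show the results of queries on our dataset.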

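In the Searcher class, __init__() opens the database by name and stores the vocabulary, and __del__() makes sure the database connection is closed. candidates_from_word() returns the ids of all images that contain a given word. candidates_from_histogram() builds a list of word ids from the nonzero entries of an image's word histogram, looks up the candidates for each word, and merges them into the candidates list. It then creates a list of tuples, each consisting of an image id and a count, where the count is the number of times that id occurs in the candidates list. This list is sorted on the second element of each tuple using sort() with an inline lambda function, which is convenient for one-line function definitions. The result is a list of image ids with the best matches first. get_imhistogram() returns the word histogram of an image.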
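The candidate-ranking idea can be illustrated without a database. In this sketch a plain dictionary stands in for the imwords table, and `collections.Counter` does the counting; the function name `rank_candidates` and the toy index are assumptions made for the example.

```python
from collections import Counter

def rank_candidates(query_words, inverted_index):
    """Return image ids ordered by how many of the query's words they share."""
    votes = Counter()
    for word in query_words:
        # Every image containing this word gets one vote.
        votes.update(inverted_index.get(word, []))
    return [imid for imid, _ in votes.most_common()]

# Hypothetical inverted index: word id -> ids of the images containing that word.
inverted_index = {0: [1, 2], 1: [2, 3], 2: [2], 3: [3]}
ranking = rank_candidates([0, 1, 2], inverted_index)
print(ranking)
```

Image 2 shares all three query words and therefore comes first. Counting with a Counter takes one pass over the candidates, whereas calling list.count() for every unique id rescans the list each time.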


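7.5 Ranking Results Using Geometry

Let us briefly look at a common way to improve the results retrieved with a BoW model. A major drawback of the BoW model is that the visual words representing an image carry no information about where the features are located in the image; this is the price paid for speed and scalability.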

import pickle

import sift2
import imagesearch
import homography

# load the image list and the vocabulary
with open('ukbench_imlist.pkl', 'rb') as f:
    imlist = pickle.load(f)
    featlist = pickle.load(f)
nbr_images = len(imlist)

with open('vocabulary.pkl', 'rb') as f:
    voc = pickle.load(f)

src = imagesearch.Searcher('test.db', voc)

# index of the query image and number of results to return
q_ind = 50
nbr_results = 20

# regular query
res_reg = [w[1] for w in src.query(imlist[q_ind])[:nbr_results]]
print('top match (regular):', res_reg)

# load the features of the query image
q_locs, q_descr = sift2.read_features_from_file(featlist[q_ind])
fp = homography.make_homog(q_locs[:, :2].T)

# RANSAC model for homography fitting
model = homography.RansacModel()
rank = {}

# load the features of each search result
for ndx in res_reg[1:]:
    locs, descr = sift2.read_features_from_file(featlist[ndx])

    # get matches
    matches = sift2.match(q_descr, descr)
    ind = matches.nonzero()[0]
    ind2 = matches[ind]
    tp = homography.make_homog(locs[:, :2].T)

    # compute the homography and count inliers;
    # use an empty list if there are not enough matches
    try:
        H, inliers = homography.H_from_ransac(fp[:, ind], tp[:, ind2], model, match_theshold=4)
    except Exception:
        inliers = []

    # store the inlier count
    rank[ndx] = len(inliers)

# sort the dictionary so that the results with the most inliers come first
sorted_rank = sorted(rank.items(), key=lambda t: t[1], reverse=True)
res_geom = [res_reg[0]] + [s[0] for s in sorted_rank]
print('top matches (homography):', res_geom)

# show the top results
imagesearch.plot_results(src, res_reg[:8])
imagesearch.plot_results(src, res_geom[:8])

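First, the image list, the feature list (holding the image filenames and the SIFT feature files, respectively), and the vocabulary are loaded. Then a Searcher object is created, a regular query is performed, and the result is stored in the list res_reg. The features of each image in res_reg are then loaded and matched against those of the query image. From the matches, a homography is estimated with RANSAC and its inliers are counted. Finally, the dictionary of image indices and inlier counts is sorted by decreasing number of inliers. The result lists are printed to the console and the top images are visualized.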
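The inlier count that drives the re-ranking can be sketched as follows. This is a simplified illustration, not the homography module used above: it assumes a homography H is already known rather than estimating it with RANSAC, and the function names, the pixel threshold, and the toy data are assumptions.

```python
def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w

def count_inliers(H, src_pts, dst_pts, threshold=4.0):
    """Count correspondences whose transfer error is below threshold (in pixels)."""
    inliers = 0
    for p, q in zip(src_pts, dst_pts):
        u, v = apply_homography(H, p)
        if ((u - q[0]) ** 2 + (v - q[1]) ** 2) ** 0.5 < threshold:
            inliers += 1
    return inliers

def rerank(candidates, inlier_counts):
    """Order candidates by decreasing number of inliers."""
    return sorted(candidates, key=lambda c: inlier_counts[c], reverse=True)

# Toy example: H is a pure translation by (10, 5); the third match is an outlier.
H = [[1, 0, 10], [0, 1, 5], [0, 0, 1]]
src_pts = [(0, 0), (1, 1), (2, 3)]
dst_pts = [(10, 5), (11, 6), (50, 50)]
print(count_inliers(H, src_pts, dst_pts))
print(rerank(["a", "b", "c"], {"a": 3, "b": 12, "c": 7}))
```

Images whose matched features agree with a single geometric transformation collect many inliers and move up the list, which restores some of the position information that the BoW histogram discards.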

