|Diffusion Processes for Retrieval|
|In this work we revisit diffusion processes on affinity graphs for capturing the intrinsic manifold structure defined by pairwise affinity matrices. Such diffusion processes have already proved the ability to significantly improve subsequent applications such as retrieval. We give a thorough overview of the state-of-the-art in this field and discuss obvious similarities and differences. Based on our observations, we are then able to derive a generic framework for diffusion processes in the scope of retrieval applications, where the related work represents specific instances of our generic formulation. We evaluate our framework in the scope of retrieval applications and are able to derive algorithms that e.g. achieve a 100% bullseye score on the popular MPEG-7 shape retrieval data set. The general formulation of our diffusion framework enables potential applications in several other computer vision related fields, such as clustering, graph matching, semi-supervised learning, segmentation, saliency detection and image based localization.|
Figure 1: Illustration of usefulness of diffusion processes for retrieval. Affinities before (left) and after (right) the diffusion are used to assign each element to the highlighted query samples according to their pairwise affinities. As can be seen, pairwise affinities alone are not sufficient to capture the intrinsic structure of the data manifold. Applying a diffusion process spreads affinities and allows to significantly improve retrieval performance.
Video of the corresponding presentation at CVPR in Portland, USA at techtalks.tv
Example: Image-based localization
|Example of a diffusion process application for image-based localization data. We have given a 3D point cloud obtained by any available Structure-from-Motion pipeline and for each 3D point we have a set of 2D SIFT descriptors, representing the 2D points that were used to triangulate the corresponding 3D point. Each 3D point has a different amount of 2D descriptors assigned, i.e. was seen by a different amount of cameras. In the affinity matrix each element represents a SIFT descriptor, and elements are sorted according to the 3D points (thus showing a block structure). Affinities are calculated using the Euclidean distance between the SIFT descriptors. For brute-force 2D-to-3D localization one has to identify nearest neighbors of the 2D descriptors. Diffusion improves the nearest neighbor retrieval score from 88.5% to 96.63%.|
Figure 1: Affinity matrix before (left) and after diffusion (right) for an image-based localization task.
Example: Shape Retrieval
|Example of a diffusion process application for shape retrieval evaluated on the popular MPEG-7 database. For this database we use the best performing shape retrieval method proposed in "Articulation-invariant representation of non-planar shapes" from Gopolan et al., ECCV 2010 to define the affinities. We then apply our diffusion method which re-ranks all the silhouettes considering the underlying affinity manifold. In such a way we are able to improve the bullseye rating (the average percentage of correct instances of the same class within the first 40 ranked elements per query) from 93.67% to 100%.|
Figure 1: Rankings obtained for the query elements on the left before (first row) and after (second row) diffusion. Correct elements are highlighted in green. As can be seen, due to considering context from the other database elements the ranking is improved yielding a higher bullseye score.
|We provide an implementation for Matlab which should run on all platforms. The package contains
|The code contains all scripts and the data sets used for evaluation: Download CODE Ver.1.1 (Matlab) + Data (~100MB).|