In lieu of the impending project I need to complete for ECE420, I have decided that I will pursue this implementation of image segmentation in order to output multiple images to be upscaled individually (my partner’s work) and then later merged back together to form a fully upscaled version of the image. The motivation of studying and adapting their implementation of this algorithm is due to the simplicity of the adaptation.
Although I have begun implementation of Normalized Cut in C, I have realized the port from matlab to C is horrid. This is not only due to the size of the code, but the incredible time complexity of its current iteration. The second problem is the need to the implementation of Matlab native functions (which cannot be converted into C MEX functions). Although octave offers implementations of things like spdiags and eigs, an easy port of things such as imread, will be close to impossible given that matlab has closed off all the source code.
With that I would like to begin discussion of the algorithm discussed in this paper. Efficient Graph-Based Image Segmentation by Pedro F. Felzenszwalb and Daniel P. Huttenlocher.
G=Undirected graph composed of G=(V,E)
S=Array of segments , composed of components which are sections of an image, a disjoint-set forest
C=component, or an individual section of the image which is connected to another component in the graph, a minimum spanning tree
This implementation of image segmentation like Ncut is based on pairwise region comparison. An image is stored in S, a disjoint-set forest, using path compression and union by rank in order to create a flatter and faster to access tree. Each tree stored inside this disjoint-set forest represents a single segment of the image.
This algorithm takes into account a factor which allows one to prioritize larger or smaller segments based upon perceptible differences. This is huge when it comes to our usage as it means tweaking the factor will allow us to in a way effectively tweak edge detection in segmentation.
Condition for Segmentation
In the paper Felzenszwalb and Huttenlocher discuss a predicate D which is the decision factor in which two segments should be joined or should not be joined. The main idea behind this is to evaluate whether the boundary between a pair of components exists. In doing so Felzenszwalb and Huttenlocher compare the difference between two components (in the same segment) to the internal difference of the same two components.
A simple explanation of the terms mentioned above is as follows.
Internal Difference: Maximum weight edge in a component
Difference: Minimum weight edge connecting two components. If the two components are not connected this value is infinity.
Minimum internal Difference: min(minimum edge weight of component 1+ threshold, minimum edge weight of component 2+ threshold)
here tau is defined as where k (controls sensitivity of segmentation) is a variable to be adjusted and |C| represents the cardinality of the component. Larger k will bias the algorithm to produce larger components.
The reasoning for these definitions can be explained as follows.
Internal Difference: If the maximum weight edge of a component is connected we know that this component has relevance to be connected or segmented.
Difference: This is sort of like a “median” of the weights in a graph giving an idea of how the weights are biased in both components.Minimum Internal Difference: This is the same as internal difference except it allows us to account for a threshold to control the “relevance” of a component and whether two components should be merged.
Taking all of this in account Felzenszwalb and Huttenlocher state the decision criteria as follows.
The predicate above will check whether a boundary exists between these two components by checking if the difference between the components is large relative to the internal difference with at least one of the components. If it evaluates to true, it will merge if false it will not merge.
This works because recall Int(C) will return the maximum weight of C and all weights greater than Int(C) will need to be merged. Thus setting the condition to check for weights larger than Int(C) will produce a merge between both components.
A note on
- Large k will produce a preference for larger components
- This will require which parameters are correlated with large k components in the case of checking for long and thin segments, large k will discourage such segments.
- Small k will produce a preference for smaller components
- In the case of checking for long and thin segments a small k will encourage such segments.
The algorithm can be described in 4 steps as follows.
- Construct an array of edges with non decreasing weight
- Start with the first segmentation S_0 with each vertex as its own edge.
- Generate segmentation S_1 to S_m-1 (where m is the number of edges) using the predicate D
- Return S=S_m
Next up: Analyzing the source code and adaptaion
P. Felzenszwalb, D. Huttenlocher
International Journal of Computer Vision, Vol. 59, No. 2, September 2004