Statistical 3D Segmentation With Greedy Connected Component Labeling
Authors: Jiuxiang Hu, Gerald Farin, Matthew Holecko II, Stephen P. Massia, Gregory Nielson, Anshuman Razdan
URL: http://prism.asu.edu/research/data/publications/paper02_s3dswgcclr.pdf#search=%223D%20connected%20components%20greedy%22
Problem: Segmenting protein and nuclei and electrode implant structures to reveal important biological information.
Data Structure: 3D data is represented as 2D plane slices. Typical size: Example A: 512 x 512 with 58 slices. Example B: 512 x 512 with 78 slices. An equivalence table is used to store the information about the equivalence of labels from the same component.
Efficiency: classical CCL pays more attention to the cost of memory then time complexity compared to GCCL (greedy connected component labeling).
Data | Data Size | Number of Components | Largest Component | Time(s) |
GFAP & Nuclei | 29.9MB | 2514 | 87536 | 4 |
GFAP & Implant | 63.9MB | 3115 | 234976 | 8 |
Implementation: Semi-automatic using 2 phases.
Phase 1: Initial Segmentation of image
Phase 2: Refined Image Segmentation using Greedy Connected Component Labeling
Why do we need greedy connected component labeling? After phase 1 the segmentation results in many small isolated regions, therefore we use greedy connected component algorithm to isolate the significant components.
Advantages:
- Wiebull parameters allow local and global information are taken into account for segmentation
Disadvantages:
- Main Problem: GCCL may repeat the labeling of a connected component
- Semi-automatic: requires user input and expertise
Towards Modeling the Performance of a Fast Connected Components Algorithm on Parallel Machines
Authors: Steven S Lumetta, Arvind Krishnamurthy, and David E. Culler
URL: http://www.eecs.berkeley.edu/Research/Projects/CS/parallel/castle/split-c-apps/connect/
Propose: Presents a fast, portable, general purpose algorithm for finding connected components on a distributed memory machine.
Implementation: We want to minimize the high cost of remote access which is typically hundreds of time higher then local access, but still utilize the performance advantage of using parallel processes. Therefore a parallel hybrid algorithm that blends a sequential algorithm optimized for local access with an efficient algorithm optimized for the global(remote) phases
Local: variation of depth-first search Global: powerful efficient variant of Shiloach-Vishkin
Programming Language: Split-C: Split-C is a parallel extension of the C programming language that supports efficient access to a global address space on distributed memory multiprocessors.
Algorithm:
- Local Phase: Perform a local DFS on each processor's portion of the graph, collapsing each local connected component into a representative node. Mark each component of this global graph with a unique value for the global phase. Execution time is O(n), where n is the number of nodes
- Global Initialization: Complete the preparation of the global graph by pointing each remote edge at one of the local representative nodes selected in the local phase.
- Global: Beginning with the reduced global graph of representative nodes and remote edges, repeat the following:
⚠️* <em>Termination Check:</em> If no components remain to be processed on any processor, quit.
⚠️* <em>Hooking Step:</em> Attempt to merge each component into another component by collapsing an edge. Conditions on merging guarantee that representative nodes remain unique.
⚠️* <em>Star Formation Step:</em> Collapse newly formed components into trees of height one (stars) to ensure that representative nodes in a component have a single, consistent value.
⚠️* <em>Self-Loop Removal Step:</em> For each component, remove edges that point to nodes within the component, i.e., nodes with the same value.
⚠️* <em>Edge-List Concatenation Step:</em> For all leaf of components, move remaining edges to the root.
4. Local Cleanup: For each node not selected as a representative node in the global graph, update the value of the node from its representative.
Efficiency: Algorithm is used on three large-scale parallel machines: Cray T3D, the Meiko CS-2 and the Thinking Machines CM-5
T3D | CS-2 | CM-5 | |
Clock (MHz) | 150 (1.00) | 90 (0.60) | 33 (0.22) |
Comm. (Mreads/s) | 1.00 (1.00) | 0.05 (0.05) | 0.08 (0.08) |
DFS (Mn/s) | 0.78 (1.00) | 0.50 (0.64) | 0.22 (0.29) |
Advantages:
⚠️*Hybrid solution has best performance to date on connected component problem (or so they claim). The paper does not provide any information on the size of the datasets in the number of voxels.
Disadvantages:
⚠️*Requires high communication performance to obtain good speed ups in performance
⚠️*Although sequential solution can be simple, parallel solution can be very challenging in practice
Tracking and Matching Connected Components from 3D Video
Authors: David da Silva Pires, Roberto M. Cesar-Jr.
Purpose: Presents new real-time solution for tracking geometrical connected components and texture information provided by the 3D video stream.
Implementation:
corresponding connected components.
Efficiency: Real-time corresponding to a frame rate of 30 fps using high resolution video. Therefore identification and tracking of connected components happens at video rate time.
Applications: Tracking Methods are important in computer vision for feature detection and extraction, matching, image alignment and stitching panorama generation.
Coronary Vessel Cores from 3D Imagery: a Topological Approach
Authors: Andrzej Szymczak and Allen Tannenbaum and Konstantin Mischaikow
URL: http://www-static.cc.gatech.edu/~andrzej/papers/spie.pdf
The Algorithm: Two steps and is a variant of the union-find algorithm
1) Compute Robust Maxima
2) Vessel Reconstruction from the Robust Maxima
Advantages:
- Medical datasets (including those from CT scans) are noisy. A topological method is good at finding and describing persistent features that aren’t strongly influenced by local noise.
- Modeling of an object that has a one-dimensional structure (blood vessels)
Disadvantages:
- Algorithm exhibits far from optimal run time performance.
Efficiency:
⚠️*Datasets are between 10 and 15 million voxels
⚠️*Run-Time is about 6 minutes on a Pentium III-850MHz workstation
Current Work
- To add more interesting 3D connected component implementations to the literature survey
- Provide all references in the literature survey in Bibtex form
- To brainstorm a creative idea(s) for a research project. Some possible ideas below
Real-time product testing: I thought of this idea after taking with a chemical engineer working for Gillette. Gillette is marketing a new kind of razor that has a hardened soup-like solution attached to the razor so the customer does not require shaving cream. Currently they manually inspect each razor to see if there are cracks in the solution. If there are cracks, the product does not meet testing requirements and isn't used. Could we automate this process or a process similar to this? Possibly using X-rays to provide the datasets. What about detecting cracks in glass?Another possibility is to find an application when we need to know whether two voxels are in the same component. We know if two voxels are in the same component if both their parents point to the same root. This can be used to determine if there is a path from one end of a structure to another. This might have applications in determining how water penetrates concrete.Video Tracking: We could create a video from a bunch of still frames running at about 30fps. We might be able to track components as they move in real time. This is similar to the 3D video tracking and matching work done above, but with a different implementation based off Connect.