W. Randolph Franklin, Salles V. G. Magalhães, and Marcus V. A. Andrade.
Data structures for parallel spatial algorithms on large datasets (vision paper).
In Proceedings of BigSpatial’18: 7th ACM SIGSPATIAL Workshop on Analytics for Big Geospatial Data. Seattle, USA, 6 Nov 2018.
[full text] [slides] [BibTeX▼]
This paper describes data structures and algorithms for efficient implementation of GIS operations for large datasets on multicore Intel CPUs and on NVIDA GPUs. Typical operations are boolean combinations of polygons and map overlay. Efficient parallelization prefers simple regular data structures, such as structures of arrays of plain old datatypes. Warps of 32 threads are required to execute the same instruction (or be idle). Ideally, the data used by adjacent threads is adjacent in memory. Minimizing storage is important, as is accessing it in a regular pattern. That disparages pointers, linked lists, and trees. That implies that explicitly representing global topology is bad. If using only local topological formulae is sufficient, then it will be much faster. E.g., for many operations on a 2-D map (aka planar graph), the set of oriented edges suffices. Each edge knows the locations of its endpoints and the ids of its adjacent polygons. Any mass operation, such as area computation or point location, can be implemented as a map-reduce. All these techniques also apply in 3D to CAD/CAM and additive manufacturing. Indeed they are more important there.