Geospatial Big Data Analytics for Quality Control of Surveys
USMA Research Unit Affiliation
Army Cyber Institute, Mathematical Sciences, Electrical Engineering and Computer Science
Geospatial big data analytics allows survey quality control analysts to draw conclusions about survey data quality that would otherwise require excessive time and resources. In this work, we explored two algorithmic methods that can help ensure the reliability of survey interviews by detecting geospatial outliers. Focusing on geospatial data collected from surveys, we implemented outlier detection techniques with two different distance metrics to identify statistical anomalies in real-world datasets that may have qualitative interpretations. We found that one algorithm, which considers the local distribution of points in a dataset, identifies a different set of outliers than the other, which considers the global distribution of points. Because only a small share (10-19%) of flagged points was common to both algorithms, analysts may find it helpful to focus on the smaller set of “outlier” points flagged by both methods rather than on every point flagged by either one. Finally, analysts should weigh computational cost, which differs significantly between the two algorithms.
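The contrast between a local and a global view of the data, and the idea of keeping only points flagged by both, can be illustrated with a minimal Python sketch. The abstract does not name the algorithms or the distance metrics used, so everything here is an assumption made for illustration: the global method flags points far from the dataset centroid (by z-score), the local method flags points whose mean distance to their k nearest neighbors is much larger than that of those neighbors, and both use haversine distance. The function names, the thresholds `z_cut` and `ratio_cut`, and the neighbor count `k` are hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def global_outliers(points, z_cut=2.0):
    """Assumed global method: flag points unusually far from the dataset centroid."""
    n = len(points)
    # Arithmetic mean of lat/lon; a reasonable approximation only for small regions.
    centroid = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    dists = [haversine_km(p, centroid) for p in points]
    mean = sum(dists) / n
    std = math.sqrt(sum((d - mean) ** 2 for d in dists) / n)
    if std == 0:
        return set()
    return {i for i, d in enumerate(dists) if (d - mean) / std > z_cut}


def local_outliers(points, k=3, ratio_cut=2.0):
    """Assumed local method: flag points far sparser than their own neighborhood."""
    n = len(points)
    # For each point, its k nearest neighbors as (distance, index) pairs.
    knn = []
    for i in range(n):
        ranked = sorted((haversine_km(points[i], points[j]), j)
                        for j in range(n) if j != i)
        knn.append(ranked[:k])
    mean_knn = [sum(d for d, _ in nb) / k for nb in knn]
    flagged = set()
    for i in range(n):
        neighborhood = sum(mean_knn[j] for _, j in knn[i]) / k
        if neighborhood > 0 and mean_knn[i] / neighborhood > ratio_cut:
            flagged.add(i)
    return flagged


def flagged_by_both(points, k=3, z_cut=2.0, ratio_cut=2.0):
    """Indices flagged by both methods: the smaller set an analyst might review first."""
    return global_outliers(points, z_cut) & local_outliers(points, k, ratio_cut)
```

Calling `flagged_by_both(points)` on a list of `(lat, lon)` tuples returns the indices on which the two assumed methods agree. The local method compares every pair of points, so its cost grows quadratically with dataset size, whereas the global method makes a single pass; this is one way the computational costs of two such approaches can diverge.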