Data Science & Analytics

Bias Tracking and Reduction Methods for High-Dimensional Exploratory Visual Analysis and Selection



Description

Exploratory visualization and analysis of large, complex datasets is increasingly common across a range of domains, and large data repositories are being created with the goal of supporting data-driven, evidence-based decision making. However, today's visualization tools are often overwhelmed when applied to high-dimensional datasets (i.e., datasets with large numbers of variables). Real-world datasets often have many thousands of variables, a stark contrast to the much smaller number of dimensions that most visualizations support. This gap in dimensionality puts the validity of an analysis at great risk of bias, potentially leading to serious, hidden errors. This research project developed a new approach to high-dimensional exploratory visualization that helps detect and reduce selection bias and other data interpretation problems. The project's results, including open-source software, are broadly applicable across domains and have been evaluated with users in a health outcomes research setting, where they offer significant potential to improve health care around the world.


RENCI's Role

David Borland is Co-PI and one of the lead developers of Cadence, the visual analytics tool created as part of this project.


Team Members