Project 2: A Machine Learning Approach to Anomaly Detection in Water Quality Data

Project abstract: This project applies machine learning tools for anomaly detection and data cleaning to water quality data collected by the US Geological Survey (USGS). Current practices for data validation and verification involve burdensome examination of data “by hand,” i.e., without any automation. However, many of these tasks can be appropriately modeled as machine learning problems, particularly anomaly detection and imputation of missing data. The student will work with existing Python libraries (especially scikit-learn) to automate this process.
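
As a rough illustration of the two tasks named in the abstract (imputation of missing data and anomaly detection), the following is a minimal sketch and not the project's actual method. It assumes a synthetic, evenly sampled series standing in for a sensor record (no real USGS data is used), fills gaps with linear interpolation from pandas, and flags outliers with scikit-learn's IsolationForest. IsolationForest is an unsupervised detector chosen here only for illustration; a supervised approach, as suggested by the keywords, would require labeled anomalies. The window size, contamination rate, and all variable names are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for a water quality record (assumption: NOT real USGS data).
rng = np.random.default_rng(0)
n = 500
t = np.arange(n)
values = 8.0 + 0.5 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 0.05, n)

# Inject a few artificial spikes (anomalies) and gaps (missing readings).
spike_idx = np.array([50, 200, 350])
values[spike_idx] += 5.0
missing_idx = np.array([120, 121, 300])
values[missing_idx] = np.nan

series = pd.Series(values, name="reading")

# Imputation: fill gaps by linear interpolation between neighboring readings.
filled = series.interpolate(method="linear", limit_direction="both")

# Features: the reading itself and its deviation from a centered rolling median.
# The window of 25 samples is an illustrative choice.
rolling_median = filled.rolling(window=25, center=True, min_periods=1).median()
features = pd.DataFrame({"value": filled, "residual": filled - rolling_median})

# Anomaly detection: IsolationForest labels outliers as -1 and inliers as 1.
# The contamination rate of 1% is an assumption about how rare anomalies are.
model = IsolationForest(contamination=0.01, random_state=0)
labels = model.fit_predict(features)
flagged = np.flatnonzero(labels == -1)

print("Imputed positions:", missing_idx.tolist())
print("Flagged positions:", flagged.tolist())
```

In practice, the flagged positions would be passed to a reviewer rather than removed automatically, since the contamination rate only controls how many points are flagged, not whether they are true errors.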

Keywords:
  • supervised learning
  • time-series modeling
  • anomaly detection
  • water quality
Faculty Mentor: John Lipor  http://ece.pdx.edu/~lipor/
Department: ECE
Community Partner(s): USGS
Desired skills: Programming experience (Python preferred but not required), strong background in mathematics
Tools to be used: Python libraries including scikit-learn, pandas, numpy
Involves teamwork: No