About Missing Data

This blog is an attempt to organize working notes related to research on missing values. If you’d like to read a draft abstract, click here.

Traditionally, data management has treated missing values mainly as an obstacle: the concern has been how to analyze data sets that contain them. This has led folks to come up with all kinds of novel ways of working around missing information. For example, some statistical approaches in data mining replace missing values with the average value from that field, which lets the mining algorithms “pretend” the data is there.
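To make the "replace it with the average" idea concrete, here is a minimal sketch in Python. It is only an illustration, not any particular tool's implementation: the function name impute_with_mean, the use of None to mark a missing value, and the sample ages list are all assumptions made up for this example.

```python
from statistics import mean


def impute_with_mean(values):
    """Return a copy of `values` with each None replaced by the mean
    of the observed (non-None) entries.

    Hypothetical helper for illustration; None stands in for "missing".
    """
    observed = [v for v in values if v is not None]
    if not observed:
        # With nothing observed there is no average to fall back on.
        raise ValueError("cannot impute: every value is missing")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Made-up sample field with two missing entries.
ages = [30, None, 50, 40, None]
filled = impute_with_mean(ages)
print(filled)
```

Notice that once the gaps are filled, nothing in the result says which entries were originally missing. The workaround keeps the algorithm running, but it throws away the fact that the value was absent.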

My interest focuses on what kinds of conclusions can be drawn from the fact that the information is missing. Knowing that you’re missing something is itself an important piece of information. This blog is devoted to cataloging current work (and hopefully coming up with new observations) about what can be done with the knowledge that something is missing.

[Image: Overview of missing data]


