A few observations about these approaches, which illustrate the predispositions of data mining:
- They don’t seem to care about what the value of the missing data actually is; they primarily care about the missing data’s impact on the value of the particular data observation (or row).
- For expediency, they tend to assume that missing values are distributed the same way as the non-missing (or observed) values.
- They focus on a large corpus of observations, so the impact of any individual observation is small.
These are all reasonable constraints given what data mining is doing. As I come across themes like these, though, I’m trying to keep them documented, because they would probably be interesting to contrast with an approach aimed at using missing values as an information channel.
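To make the second observation concrete, here is a minimal Python sketch of that assumption in action: each missing value is filled by drawing at random from the observed values in the same column. This illustrates the general idea rather than the method of any particular tool. The function name `impute_from_observed`, the synthetic column, and the 20% missing rate are all made up for the example.

```python
import random
import statistics

def impute_from_observed(values, rng):
    """Fill each None by drawing at random from the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("need at least one observed value to impute from")
    return [v if v is not None else rng.choice(observed) for v in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical column: 1,000 numeric values with roughly 20% knocked out at random.
column = [rng.gauss(50, 10) if rng.random() > 0.2 else None for _ in range(1000)]
filled = impute_from_observed(column, rng)

observed_mean = statistics.mean(v for v in column if v is not None)
filled_mean = statistics.mean(filled)
print(f"observed mean: {observed_mean:.2f}, mean after imputation: {filled_mean:.2f}")
```

Nothing here asks why a value was missing. The fill only has to keep the row usable and the column's overall shape roughly intact, which is exactly the attitude described in the observations above, and the opposite of treating missingness as a signal in its own right.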