Data Mining by Companies
Data mining: the large-scale collection and correlation of consumer data, and the drawing of inferences from it.
Example (Target): a Guest ID links each customer's purchase history, coupons, surveys, web visits, and customer service calls. The linked records were used to infer pregnancy status and target advertising accordingly.
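To make the linking step concrete, here is a minimal sketch of how a shared ID lets records from separate channels be merged into one profile and then scored. It is an illustration only, not Target's actual system: the field names, the sample records, the `SIGNAL_ITEMS` list, and the threshold of 2 are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records from separate channels, each tagged with a guest ID.
purchases = [
    {"guest_id": "G1", "item": "unscented lotion"},
    {"guest_id": "G2", "item": "garden hose"},
    {"guest_id": "G1", "item": "prenatal vitamins"},
]
web_visits = [{"guest_id": "G1", "page": "/baby-registry"}]
coupons = [{"guest_id": "G2", "coupon": "lawn-care-10"}]


def build_profiles(**sources):
    """Merge every source into one profile per guest ID."""
    profiles = defaultdict(lambda: defaultdict(list))
    for source_name, records in sources.items():
        for record in records:
            details = {k: v for k, v in record.items() if k != "guest_id"}
            profiles[record["guest_id"]][source_name].append(details)
    return profiles


# Made-up list of items treated as signals for the inference (assumption).
SIGNAL_ITEMS = {"unscented lotion", "prenatal vitamins"}


def signal_score(profile):
    """Count purchases in a profile that appear in the signal list."""
    return sum(1 for p in profile["purchases"] if p["item"] in SIGNAL_ITEMS)


profiles = build_profiles(
    purchases=purchases, web_visits=web_visits, coupons=coupons
)
# Flag guests whose linked history contains at least two signal purchases.
flagged = sorted(g for g, p in profiles.items() if signal_score(p) >= 2)
print(flagged)
```

The point of the sketch is that no single record is revealing; the inference only becomes possible once the ID ties the records together.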
Government Data Mining
- Harder to restrict than private-sector mining.
- Often occurs without public announcement; some programs are intentionally kept secret.
- Erroneous conclusions from faulty data are difficult to correct once embedded in government records.
Privacy-Preserving Data Mining
- Removing overtly identifying data is insufficient — re-identification from remaining fields is often possible.
- Data perturbation: adds random noise to individual data values. This limits privacy risk without significantly affecting aggregate analysis, because the noise largely cancels out across many records (correlation and aggregation remain reliable on the perturbed data).
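The perturbation idea can be sketched in a few lines. This is one possible form under stated assumptions, not a prescribed method: it uses independent zero-mean Gaussian noise, and the data set, the noise scale of 5,000, and the helper name `perturb` are all hypothetical.

```python
import random
import statistics


def perturb(values, scale, rng):
    """Return a copy of values with independent zero-mean Gaussian noise added."""
    return [v + rng.gauss(0.0, scale) for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical sensitive values (for example, salaries) for 10,000 people.
true_values = [rng.uniform(30_000, 90_000) for _ in range(10_000)]
noisy_values = perturb(true_values, scale=5_000, rng=rng)

# Average amount by which a single record was changed.
typical_shift = statistics.fmean(
    abs(a - b) for a, b in zip(true_values, noisy_values)
)
# Amount by which the aggregate (the mean) was changed.
mean_shift = abs(statistics.fmean(true_values) - statistics.fmean(noisy_values))

print(f"average change per record: {typical_shift:.0f}")
print(f"change in the overall mean: {mean_shift:.0f}")
```

Each individual value moves by thousands, so a single published record says little about the person behind it, while the mean over all records moves far less. The noise scale is a trade-off: more noise gives more protection but degrades aggregate statistics more, and the effect is stronger on small data sets.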