The right to be let alone; freedom from interference or intrusion. A fundamental right. Part of confidentiality. An important aspect of computer security.
Extent of Privacy
No universal standard. Disagreement is legitimate and has cultural, historical, or personal roots. Laws and ethics (which depend on the region) set baseline expectations for privacy.
Privacy-focused infrastructure costs money and time. An organization's finances constrain how much privacy it can provide. Users themselves often prefer giving up their information in exchange for free services (for example Google and Facebook) over paying for alternatives.
Conflicts
Privacy is part of confidentiality. Confidentiality can conflict with availability.
For example:
- Unlisted telephone number
Some callers cannot reach you.
- Withholding data from a shop
Loss of loyalty discount.
- Not signing up for popular social media platforms
Fear of missing out.
Information Privacy
The right to control how personal information is collected and used. Computers enable data collection, correlation, and storage at unprecedented scale.
It has three aspects: sensitive data, affected parties, and controlled disclosure.
Sensitive Data
What is being protected. What constitutes sensitive data is subject-dependent. No objective universal standard exists. Context (who is affected, social norms) determines sensitivity.
- Identity
Name, identifying info, control over private data disclosure.
- Finances
Credit rating, bank details, tax info.
- Legal
Criminal records, civil suits, marriage history.
- Health
Medical conditions, DNA, genetic predispositions.
- Opinions/preferences/membership
Voting records, religion, political party, browsing habits.
- Biometrics
Fingerprints, polygraph results, physical characteristics.
- Documentary evidence
Mail, diaries, correspondence.
- Privileged communications
Lawyer, doctor, clergy.
- Academic/employment
Grades, performance ratings.
- Location data
Current location, travel patterns.
- Digital footprint
Email, social media, web searches, call logs.
And many more.
Affected Parties
Who is involved with the data.
- Subject — person or entity described by the data.
- Owner — person or entity that holds the data.
Controlled Disclosure
How and when data is shared. Subject voluntarily decides who can know which information.
Once data is disclosed, the subject loses control over it. The recipient is trusted to comply with the subject’s wishes.
Organizational Privacy
Companies, schools, hospitals, and governments all hold sensitive data. They typically have more at stake than individual users, so they tend to care more about protecting it.
- Companies
Product plans, profit margins, customer lists.
- Schools
Students, teachers and grades.
- Hospitals
Patients, doctors and donor records.
- Governments
Military, diplomatic, and citizen tax data.
Issues
- Data collection
Massive storage and processing enable collection at scale; users are often unaware.
- Notice and consent
Often impossible to know what is collected.
- Control and ownership
Once data is given, control is largely ceded; data may be held indefinitely or resold.
- Hidden or excessive data collection
- Weak, unclear, or missing consent
- Unknown storage, reuse, resale
- Discrimination
Merchants may use tracking data to charge different prices to different users. For example, Amazon has reportedly shown higher prices to certain users based on their history.
IoT
The growing number of smart devices means more opportunities for data collection, and collection of more kinds of data.
Spyware
Code designed to collect user data covertly.
Types:
- General spyware
Advertising, identity theft.
- Hijackers
Repurpose existing programs (e.g., file-sharing software) to exfiltrate data.
- Adware
Displays ads in pop-up or browser windows; typically bundled with other software.
Cross-site tracking
Websites use cookies for authentication. They may also embed third-party links or images (also known as web bugs). Because the browser sends the third party's own cookie along with each such request, the third-party site can track users across different websites.
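The mechanism can be sketched as a toy simulation. This is only an illustration, not how any real browser or ad network is implemented: the `ThirdPartyTracker` and `Browser` classes, the `tracker.example` domain, and the page names are all hypothetical.

```python
from collections import defaultdict
from itertools import count


class ThirdPartyTracker:
    """Hypothetical tracker whose image (web bug) is embedded on many sites."""

    def __init__(self):
        self._ids = count(1)
        self.history = defaultdict(list)  # tracker cookie -> pages seen

    def serve_bug(self, cookie, referring_page):
        # First contact: hand out a cookie that identifies this browser.
        if cookie is None:
            cookie = f"uid-{next(self._ids)}"
        # The referring page reveals which site embedded the bug.
        self.history[cookie].append(referring_page)
        return cookie


class Browser:
    """Toy browser that stores one cookie per third-party domain."""

    def __init__(self):
        self.cookies = {}

    def visit(self, page, tracker, tracker_domain="tracker.example"):
        # Loading the page also loads the embedded bug, so the tracker's
        # cookie (if any) is sent to the tracker and then stored again.
        sent = self.cookies.get(tracker_domain)
        self.cookies[tracker_domain] = tracker.serve_bug(sent, page)


tracker = ThirdPartyTracker()
alice = Browser()
for page in ["news.example/politics", "shop.example/shoes", "health.example/asthma"]:
    alice.visit(page, tracker)

# The tracker now holds one browsing profile spanning three unrelated sites.
profile = tracker.history[alice.cookies["tracker.example"]]
print(profile)
```

None of the three sites shared anything with each other; the shared third-party cookie alone is enough to join the visits.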
Solutions
Individual Solutions
- Anonymity → no identity
- Pseudonymity → fake identity
- Multiple identities → separation of contexts
Industrial Solutions
- Privacy-focused laws
- Anonymization of user data
- Collecting only the required information
Privacy Policies
8 elements of a sound privacy policy:
- Information collection
Collected only with knowledge and explicit consent.
- Information usage
Used only for specified purposes.
- Information retention
Retained only for a set period.
- Information disclosure
Disclosed only to an authorized set.
- Information security
Appropriate protection mechanisms applied.
- Access control
All access modes to all collected data are controlled.
- Monitoring
Logs maintained for all data accesses.
- Policy changes
Less restrictive policies never applied retroactively.
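A few of these elements (usage, retention, disclosure, monitoring) can be enforced mechanically. The following is a minimal sketch under assumed names: `Record`, `PolicyStore`, and the example parties and purposes are hypothetical, and a real system would need far more, such as authentication and tamper-resistant logs.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Record:
    subject: str
    value: str
    purpose: str            # purpose stated at collection time
    collected_at: datetime
    retention: timedelta    # how long the record may be kept


@dataclass
class PolicyStore:
    authorized: set                                   # parties allowed to read
    records: list = field(default_factory=list)
    access_log: list = field(default_factory=list)

    def read(self, who, purpose, now):
        allowed = who in self.authorized
        # Monitoring: log every access attempt, allowed or not.
        self.access_log.append((now, who, purpose, allowed))
        if not allowed:
            return []
        # Usage: release only records collected for this purpose.
        return [r for r in self.records if r.purpose == purpose]

    def purge_expired(self, now):
        # Retention: drop records whose retention period has elapsed.
        kept = [r for r in self.records if now - r.collected_at <= r.retention]
        removed = len(self.records) - len(kept)
        self.records = kept
        return removed


t0 = datetime(2024, 1, 1)
store = PolicyStore(authorized={"billing"})
store.records.append(Record("alice", "card on file", "billing", t0, timedelta(days=30)))
store.records.append(Record("alice", "newsletter opt-in", "marketing", t0, timedelta(days=365)))

billing_view = store.read("billing", "billing", t0 + timedelta(days=1))
denied_view = store.read("ads-partner", "marketing", t0 + timedelta(days=1))
purged = store.purge_expired(t0 + timedelta(days=60))
```

Here the unauthorized reader gets nothing but still leaves a log entry, and the 30-day billing record is removed once 60 days have passed.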
Fair Practices
- Data obtained lawfully and fairly.
- Data relevant, accurate, complete, and up to date.
- Purpose identified; data destroyed when no longer needed.
- Secondary use requires consent or legal authority.
- Safeguards against loss, corruption, misuse.
- Subjects have right to access and challenge their data.
- A designated data controller accountable for compliance.
U.S. Privacy Laws
- 1974 Privacy Act
Applies to U.S. government data collection.
- HIPAA
Healthcare data.
- GLBA
Financial data.
- COPPA
Online collection of data from children under 13.
- FERPA
Student records.
State law varies widely.
European Privacy Directive (1995)
Applies Fair Information Practices to governments and businesses. Adds:
- Extra protection for sensitive data.
- Strong limits on cross-border data transfers.
- Independent oversight body for compliance.
Data Access Risks
Recognized risks when government acquires third-party data:
- Data error — transcription or analytical errors.
- Inaccurate linking — correct data items incorrectly joined.
- Difference of form/content — precision, format, or semantic mismatch.
- Purposely wrong — data from intentionally falsified sources.
- False accusation — incorrect or outdated conclusions, unverifiable.
- Mission creep — data acquired for one purpose repurposed for another.
- Poorly protected — integrity undermined by poor data management.
Steps to Protect Against Privacy Loss
- Data minimization — collect only the minimum required.
- Data anonymization — replace identifiers with untraceable codes.
- Auditing — log all data accesses; identify responsible parties after a breach.
- Security and controlled access — protect and restrict access to sensitive data.
- Training — ensure handlers understand what and how to protect.
- Quality — assess data fitness: purpose, age, storage method.
- Restricted usage — review all uses for consistency with collection purpose.
- Data left in place — leave data with original owner/collector where possible.
- Policy — establish and enforce clear data privacy policies.
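The first two steps can be combined in a short sketch. It assumes a keyed hash (HMAC-SHA256) as the way to derive codes, which is a swapped-in technique and strictly pseudonymization: whoever holds the key can recompute the mapping. The key, field names, and record below are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored apart from the data.
SECRET_KEY = b"demo-key-not-for-real-use"

# Hypothetical set of fields the stated purpose actually requires.
REQUIRED_FIELDS = {"zip3", "diagnosis"}


def pseudonym(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable code from an identifier using a keyed hash."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def minimize_and_code(record: dict) -> dict:
    """Keep only required fields and replace the identifier with a code."""
    out = {k: v for k, v in record.items() if k in REQUIRED_FIELDS}
    out["subject_code"] = pseudonym(record["patient_id"])
    return out


raw = {
    "patient_id": "P-1001",
    "name": "Alice Example",
    "phone": "555-0100",
    "zip3": "021",
    "diagnosis": "asthma",
}
safe = minimize_and_code(raw)
print(sorted(safe))
```

The same input identifier always yields the same code, so records for one subject can still be grouped without exposing the name or phone number. The remaining fields may still act as quasi-identifiers.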
Breach Notification
When a breach occurs, affected parties must be notified.
- GDPR
Notification to the supervisory authority required within 72 hours of becoming aware of the breach.
- California law
Requires notification to affected residents; delay permitted only if law enforcement determines it would impede a criminal investigation.
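As a small worked example of the 72-hour window, the deadline can be computed from the moment the organization learns of the breach. The function names and the sample timestamp are hypothetical, and the sketch does not model exceptions or the California rule.

```python
from datetime import datetime, timedelta, timezone

GDPR_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Latest time to notify, counted from when the breach became known."""
    return became_aware_at + GDPR_WINDOW


def is_overdue(became_aware_at: datetime, now: datetime) -> bool:
    return now > notification_deadline(became_aware_at)


aware = datetime(2024, 3, 1, 9, 0, tzinfo=timezone.utc)
deadline = notification_deadline(aware)
print(deadline.isoformat())
```

Using timezone-aware timestamps avoids ambiguity when the breach and the notification are handled in different regions.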
Authentication and Privacy
Authentication vs. Identification
- Authentication — verifying that a claimant is who they say they are. One comparison: “Is this person X?”
- Identification — determining who a person is from authenticating data. Requires n comparisons across the full database.
Identification is harder: subject may not be in the database; partial matches are ambiguous.
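The difference in cost and ambiguity can be sketched as follows. The bit-string templates, the `similarity` measure, and the 0.9 threshold are illustrative assumptions, not a real biometric matcher.

```python
def similarity(a: str, b: str) -> float:
    """Fraction of positions where two equal-length templates agree."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def authenticate(database: dict, claimed_id: str, sample: str, threshold: float = 0.9) -> bool:
    """One comparison: does the sample match the claimed identity?"""
    template = database.get(claimed_id)
    if template is None:
        return False
    return similarity(template, sample) >= threshold


def identify(database: dict, sample: str, threshold: float = 0.9) -> list:
    """n comparisons: which enrolled identities could the sample be?"""
    return [
        name
        for name, template in database.items()
        if similarity(template, sample) >= threshold
    ]


db = {
    "alice": "1011001110",
    "bob": "1011001010",   # differs from alice in one position
    "carol": "0100110001",
}
sample = "1011001110"

print(authenticate(db, "alice", sample))
print(identify(db, sample))
print(identify(db, "0000000000"))
```

With this data, authentication against "alice" succeeds after a single comparison, while identification scans every entry and returns both "alice" and "bob" because bob's template is a partial match above the threshold. The all-zero sample matches no one, which models a subject who is not in the database.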
Confusion Between Authentication and Identification
Privacy violations arise when data items serve multiple roles.
Example: U.S. Social Security Number — intended as an identifier; now used as authenticator, database key, and identifier simultaneously. Acquiring it for one purpose enables use for others.
Individual Authentication Chain
Birth certificate → school ID → passport/national ID → multiple numbered credentials throughout life. Each credential links to others; the chain can be traced.
Connecting Identities
Multiple identities (credit card, toll device, hotel keycard, meal plan) may or may not be linkable.
- Credit card links to card payer — not necessarily the user.
- Toll device links to registered owner — not necessarily the driver.
- Phone call authentication links to account holder.
Disassociating Actions from Identity
Techniques to break linkage:
- Use public phone or internet café for anonymous reporting.
- Register under a pseudonym.
- Use temporary or disposable email addresses.
- Provide false telephone numbers when not legally required to give real ones.
Anonymized Records
Records with identifying information removed. Used in research to preserve privacy.
- Individual data points may be non-sensitive; the linkage is what becomes sensitive.
- Re-identification is often possible from remaining quasi-identifiers (e.g., phone number, zip code, birthdate).
Latanya Sweeney estimated from 1990 U.S. census data that about 87% of the U.S. population can be uniquely identified by the combination of birthdate + gender + 5-digit zip code.
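A linkage attack of this kind can be sketched by joining an "anonymized" dataset with a public one on the shared quasi-identifiers. All records, names, and field names below are made up for illustration.

```python
# Hypothetical research dataset with names removed.
anonymized_health = [
    {"birthdate": "1980-02-14", "gender": "F", "zip": "02139", "diagnosis": "asthma"},
    {"birthdate": "1975-07-30", "gender": "M", "zip": "02139", "diagnosis": "diabetes"},
    {"birthdate": "1980-02-14", "gender": "M", "zip": "02139", "diagnosis": "migraine"},
]

# Hypothetical public list (for example a voter roll) that includes names.
public_roll = [
    {"name": "Alice Example", "birthdate": "1980-02-14", "gender": "F", "zip": "02139"},
    {"name": "Bob Example", "birthdate": "1975-07-30", "gender": "M", "zip": "02139"},
    {"name": "Dan Example", "birthdate": "1990-01-01", "gender": "M", "zip": "02139"},
]

QUASI_IDENTIFIERS = ("birthdate", "gender", "zip")


def quasi_key(row: dict) -> tuple:
    return tuple(row[f] for f in QUASI_IDENTIFIERS)


def reidentify(anon_rows: list, public_rows: list) -> dict:
    """Attach names to anonymized rows whose quasi-identifiers match exactly one person."""
    names_by_key = {}
    for person in public_rows:
        names_by_key.setdefault(quasi_key(person), []).append(person["name"])

    matches = {}
    for row in anon_rows:
        names = names_by_key.get(quasi_key(row), [])
        if len(names) == 1:  # a unique match means the row is re-identified
            matches[names[0]] = row["diagnosis"]
    return matches


leaked = reidentify(anonymized_health, public_roll)
print(leaked)
```

Two of the three health records are re-identified even though neither dataset alone links a name to a diagnosis.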
Privacy on the Internet
Apps and SDKs
- SDK (Software Development Kit) — third-party code for data transmission, embedded in apps.
- SDK developers receive data in exchange for providing the code.
- Data extraction via SDK may occur before explicit user permission is granted.
- Little regulation specifically governs what SDK developers can do with collected data.
Site Registrations
- Sites collect demographics in exchange for access.
- Email-as-username — becomes a cross-site identity key; enables merging identity across services.
- Stated purpose (enhancing user experience) masks real purpose (selling demographic data to marketers).
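How an email address acts as a join key across services can be sketched as below. The services, fields, and addresses are hypothetical, and the normalization is deliberately naive (trim and lowercase only).

```python
def normalize(email: str) -> str:
    """Naive normalization: trim whitespace and lowercase."""
    return email.strip().lower()


def merge_profiles(*services: list) -> dict:
    """Combine account data from several services, keyed by email address."""
    merged = {}
    for service in services:
        for account in service:
            profile = merged.setdefault(normalize(account["email"]), {})
            profile.update({k: v for k, v in account.items() if k != "email"})
    return merged


# Hypothetical registration data held by three unrelated sites.
forum = [{"email": "Alice@example.com", "political_interest": "local elections"}]
pharmacy = [
    {"email": "alice@example.com", "prescription": "inhaler"},
    {"email": "bob@example.com", "prescription": "none"},
]
shop = [{"email": "alice+shop@example.com", "shoe_size": 38}]

merged = merge_profiles(forum, pharmacy, shop)
print(merged["alice@example.com"])
```

The forum and pharmacy accounts collapse into one profile because they share an address. The shop account stays separate under this naive matching because it was registered with a different address, though a more aggressive matcher could strip the "+shop" suffix and link it as well.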
Payments on the Internet
- Credit card — exposes number, CVV, expiry, and billing address to merchant; once given, reusable by that merchant.
- Online payment schemes (PayPal, Google Pay, Zelle) — intermediary reduces direct card exposure.
- Cryptocurrency (e.g., Bitcoin) — higher anonymity; pseudonymous by design.
6. Good Privacy Practice
Policies should ensure:
- Consent-based collection
- Limited, specific use
- Defined retention
- Secure storage
- Restricted access
- Monitoring/logging
8. Risks in Data Use
- Errors and mismatches
- Wrong linking of data
- False or low-quality data
- Misuse beyond purpose (mission creep)
- Weak protection
9. Protection Techniques
- Data minimization
- Access control
- Auditing/logging
- Staff training
- Keep data with source when possible
10. Identity & Privacy
Key distinction
- Authentication → verify claim
- Identification → determine identity
Risks
- Reuse of identifiers (e.g., SSN)
- Linking across systems (identity chains)
Mitigation
- Pseudonyms
- Multiple identities
- Minimal data sharing
11. Anonymity Limits
- De-identified data can be re-identified
- Combining datasets increases risk
12. Internet Privacy
Data collection sources
- IoT devices
- Apps & SDKs
- Websites (cookies, trackers)
User trade-off
- Access/services ↔ personal data
13. Payments & Identity
- Credit cards → high exposure
- Payment services → reduced exposure
- Crypto → partial anonymity
14. Breach Response
- Notify quickly (e.g., GDPR: 72 hours)
- Inform affected users