The advent of artificial intelligence and, more notably, generative AI and large language models (LLMs) has raised alarms all over the world. As with any innovation, the world is split between those excited about the opportunities it unlocks and those wary of the dangers it may pose.

We're going to take a closer look at the current landscape of AI and data privacy, AI regulation, and what companies and consumers can do to preserve privacy and privacy rights.

What We Mean When We Talk About AI

Defining artificial intelligence can actually be kind of tricky.

Stanford Professor John McCarthy, who coined the term “artificial intelligence,” defined it as “the science and engineering of making intelligent machines, especially intelligent computer programs.” He further defined intelligence as “the computational part of the ability to achieve goals in the world.”

Lots of technologies could fall under that definition: content algorithms on social media networks, search engines, and more. But for modern audiences and regulators, artificial intelligence generally refers to systems that rely on machine learning techniques, large language models, generative AI systems, and the like. ChatGPT is probably the best-known example of such systems.

AI becomes problematic for privacy leaders, policymakers, and consumers alike when it’s trained on data that's sourced from a mix of pre-existing datasets, user interactions, sensors, corporate records, simulations, crowdsourcing, and APIs. 

AI technology is starving for data, and its appetite is only growing. For example, the models behind ChatGPT jumped from 1.5 billion parameters in GPT-2 to 175 billion parameters in GPT-3 in little over a year, with a corresponding leap in the amount of training data required.

AI's dependence on training data raises concerns over how it collects, stores, shares, or uses data. Privacy-centric policies—both at the corporate and legislative level—are needed to regulate where AI gets this data, how it gets it, and from whom.

What We Mean When We Talk About Data Privacy

Data privacy refers to consumers' right to know about and control what happens to their personal information. By extension, data privacy also refers to the policies and practices implemented by businesses to protect and enable those rights.

Personal information is any information about a person that can be linked (directly or indirectly) back to them. That could mean a person's name, address, date of birth, social security number, etc.

Data privacy also concerns itself with sensitive personal information, which is information that could cause harm if it were mishandled or leaked, such as health records, banking information, biometrics, racial background, sexual orientation, religious views, and similar information. 

Considering the huge amounts of data consumed by AI systems, you can imagine how much of that is personal, sensitive, private, or, at the very least, used without the consumer's full understanding or consent (as we've seen with Clearview AI and numerous other companies).

How Does Artificial Intelligence Collect and Use Personal Data?

AI collects data from a wide variety of sources, often without consumers' knowledge or consent. AI systems might scrape data from photos posted online, social media post interactions, security camera footage, fitness tracker data, purchase histories—pretty much any digital interaction can be used to train an AI.

Sometimes users are very clearly informed that their actions provide training data to an AI. Most people, for example, know that their interactions with ChatGPT help train the large language model. Other times, users are “informed” and “give consent” by accepting lengthy terms and conditions for a given product or service. 

Other times, there isn’t even the pretense of informing a user or securing their consent before their data is scraped into an AI model—as was the case with Clearview.

Concerns About AI and Privacy Risks 

The benefits of AI—whether that’s enabling us to complete our daily work more efficiently or making the world more accessible—likely outweigh the risks. But that doesn’t mean we shouldn’t try to mitigate those risks anyway. AI governance is still an evolving practice, which means our privacy rights in the context of AI are receiving fewer protections than they deserve.

Here are some of the risks to our privacy rights that we need to be cognizant of when working with AI.

Bias, Discrimination, & Inequality

While responsible AI developers aim to weed out or reduce the influence of harmful data in training sets (i.e., data that perpetuates misinformation, harmful ideas, bigotry, sexism, inequality, or discrimination of any kind), many AI models have already been trained on biased data. As a result, they are more likely to make biased decisions (e.g., denying a loan application) that can impact users on the commercial end of various AI applications.

Manipulation  

AI has been known to influence and control public perception. Technologies like deepfakes can be leveraged to spread misinformation, sway public opinion, and deceive individuals on a massive scale by exploiting their personal data and creating highly personalized and persuasive content.

Strong data privacy measures can protect people from AI-driven manipulation. By limiting unauthorized access to personal data, privacy protections can help prevent AI from being used to influence decision-making without consent.

Increased Vulnerability to Data Breaches

The more personal data businesses collect and store, the greater the risk and impact of a breach. AI systems need vast quantities of data, and that appetite incentivizes digital hoarding, which makes it all the more important for companies using AI to take stronger measures to protect their customer data.

Consider Clearview AI, a for-profit facial recognition company serving federal agencies and retail companies alike. When Clearview AI was breached in 2020, it put countless people at risk. The company's technology relied on scraping over three billion user-uploaded photos from the web to help law enforcement track down suspects through facial recognition—but without consumers’ knowledge or consent. 

What Are Governments Doing About AI & Data?

While lawmakers around the world could be doing a lot more to protect sensitive information from being used as AI training data, the EU AI Act and the Colorado AI Act have shown us that things are starting to change.

What Is the European Union Artificial Intelligence Act?

Having come into force on August 1, 2024, the Act establishes a legal framework for AI and the entities that build and use AI systems on a professional level. Like the GDPR and CPRA, the EU AI Act applies to institutions and businesses from outside the European Union if their users are within the EU.

Highlights of the act include:

  • Established risk categories: AI applications are divided into five categories based on their risk of causing harm to individuals, from minimal to unacceptable, the latter of which is banned except under certain circumstances, such as national security, research, and non-professional uses. 
  • Transparency and accountability: High-risk applications must meet certain transparency, security, and quality requirements and undergo compliance testing. 
  • New governing bodies: Four new bodies have been charged with enforcing the Act, though member states are required to set up their own authorities. These new bodies are the AI Office, the European Artificial Intelligence Board, the Advisory Forum, and the Scientific Panel of Independent Experts.

AI Risk Categories

Unacceptable Risk

Any AI app that can manipulate people, identify people in real time using biometric data (facial recognition), or be used for social scoring.

Exceptions are apps used for military, national security, and pure scientific research.

High-Risk AI

AI apps that could potentially impact health, safety, or fundamental human rights, such as those used in healthcare, education, recruitment, law enforcement, and the justice system.

Require evaluations before they can be released to the market; transparency and quality requirements include human oversight and safety obligations, and some apps may need a fundamental rights impact assessment.

General-Purpose AI

Apps that can perform a wide range of functions, e.g., large language models like ChatGPT. These apps are harder to classify and regulate within traditional risk categories.

If the model is open source, developers only need to provide a summary of the training data and a copyright compliance policy. If the model isn’t open source, developers must meet stricter transparency requirements, such as creating detailed documentation and risk disclosures.

Limited-Risk AI

AI applications that make it possible to generate or manipulate images, sound, or video.

Transparency obligations include making sure users know that they're interacting with AI so they can make informed decisions about their usage.

Minimal-Risk AI

AI systems used for video games or spam filters, for example.

Unregulated but should follow a code of conduct and best practices.

What Is the Colorado Artificial Intelligence Act?

In May 2024, Colorado enacted the nation’s first comprehensive piece of AI legislation. Scheduled to go into effect in February 2026, the Colorado Artificial Intelligence Act, or CAIA, will aim to protect consumers from biased treatment and “algorithmic discrimination” resulting from high-risk AI systems.

A "high-risk AI system" is a system that can influence a person’s ability to access:

  • Education 
  • Employment
  • A financial or lending service
  • Government services
  • Healthcare services
  • Housing
  • Insurance
  • Legal services

According to CAIA, “algorithmic discrimination" is when an AI system unfairly treats or negatively impacts a person or group of people based on factors like their age (or perceived age), race, gender, disability, national origin, religion, or other protected characteristics.

Under the act, deployers and developers of AI systems have certain responsibilities and must meet certain requirements. For example:

  • Deployers of high-risk AI systems have to complete annual impact assessments.
  • AI developers and deployers have to meet certain transparency requirements, such as making users aware that they are using AI systems. 

How Do the EU AI Act and the Colorado AI Act Affect Data Privacy Laws?

Data privacy is still the purview of the GDPR and other US and international comprehensive privacy laws. The EU and Colorado AI Acts govern the development and use of AI technology, even if no personal information is being processed. That’s not to say AI systems won’t have to maintain compliance with both the relevant AI and data privacy laws; it’s just that in some cases the GDPR and other privacy laws won’t come into play.

While AI in and of itself is not automatically a consumer privacy issue, these two topics will intersect if your business uses, develops, or partners with AI, such as using generative AI or ChatGPT in your products or services. 

What Can Businesses Do to Get Ahead of AI?

If you develop or use AI in your business and you fall below the unacceptable risk category, what can your company do to make sure you're not only compliant but that you're also following business best practices and a set of ethical standards?

Start with a Privacy Impact Assessment (PIA)

These impact assessments will help your company identify, evaluate, and mitigate privacy risks associated with processing someone’s data. They may also help your business categorize these risks, which you’ll need to do in order to comply with the EU AI Act.

You’ll have to conduct a PIA whenever you introduce a new product or feature or update an existing system to assess potential privacy risks. Key factors to evaluate include the following (see the sketch after this list):

  • How personal data will be collected, used, shared, and stored
  • The categories of PI being handled
  • Why the information is being collected and handled
  • What consumers expect regarding PI processing
  • The purpose, benefits, and potential negative impacts of data processing
  • The measures that will be put in place to address negative impacts
  • An assessment of whether the negative impacts outweigh the benefits
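
To make these factors easier to track across projects, some teams capture each PIA as a structured record. Below is a minimal, hypothetical sketch in Python; the class and field names are illustrative only and aren’t prescribed by the GDPR, the EU AI Act, or Osano.

```python
from dataclasses import dataclass, field

# Hypothetical PIA record mirroring the factors listed above.
# Field names are illustrative, not mandated by any law or framework.
@dataclass
class PrivacyImpactAssessment:
    feature_name: str                  # the new product, feature, or system update under review
    data_categories: list[str]         # categories of PI being handled
    collection_purpose: str            # why the information is being collected and handled
    processing_description: str        # how data will be collected, used, shared, and stored
    consumer_expectations: str         # what consumers expect for PI processing
    benefits: list[str] = field(default_factory=list)
    negative_impacts: list[str] = field(default_factory=list)
    mitigations: list[str] = field(default_factory=list)

    def needs_escalation(self) -> bool:
        """Flag assessments where impacts may outweigh benefits. This is a crude
        placeholder: the real weighing step requires human judgment."""
        return len(self.negative_impacts) > len(self.benefits) + len(self.mitigations)
```

A record like this won’t replace the assessment itself, but it does make it easier to report on risks consistently and to map each project to the EU AI Act’s risk categories.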

Have Another Look at Your Third-Party Data Processing Agreements

Now is the time to review your contracts with AI vendors and take a closer look at any clauses related to data, such as data transfers, data processing, and internal practices. You may need to revisit their privacy and security practices and be prepared to take the next steps if those third parties aren’t compliant.

Establish a Legal Basis for Processing Data

What’s the legal basis for processing someone’s data? For any data collection, usage, storage, or sharing under the GDPR and other data privacy laws, you need a valid legal basis (and that applies to AI data collection, usage, storage, and sharing, too). These could include legitimate interest, contractual obligations, or user consent (the most common). While other state privacy laws don’t always list out the legal bases described in the GDPR, the same concepts broadly apply.

Get User Consent

User consent is the shining star of information privacy regulations. Remember that your consumers need to understand what they’re consenting to and how their personal information will be used for AI (and any other purpose). They also have the right to withdraw their consent, as well as to access, change, and request that their sensitive information be deleted.

Exceptions apply only to anonymized data or when processing is required for legal obligations.
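
A minimal sketch of what this looks like in practice, assuming a simple in-memory consent store (a real deployment would use a consent management platform); the function and record names below are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical consent ledger; in production this would live in your
# consent management platform, not an in-memory dict.
consent_records = {
    "user-123": {"ai_training": True, "withdrawn_at": None},
    "user-456": {"ai_training": False, "withdrawn_at": None},
}

def may_use_for_ai_training(user_id: str) -> bool:
    """Return True only if the user granted consent for AI training and hasn't withdrawn it."""
    record = consent_records.get(user_id)
    if record is None:
        return False  # no record means no consent
    return record["ai_training"] and record["withdrawn_at"] is None

def withdraw_consent(user_id: str) -> None:
    """Honor a withdrawal request: stop future use and timestamp the change."""
    record = consent_records.setdefault(user_id, {"ai_training": False, "withdrawn_at": None})
    record["ai_training"] = False
    record["withdrawn_at"] = datetime.now(timezone.utc)

# Only users with active consent make it into the training set.
training_users = [uid for uid in consent_records if may_use_for_ai_training(uid)]
```

The key design choice here is that consent is checked at the moment data is used, not just at collection time, so a withdrawal made today is reflected in tomorrow’s training run.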

Make Provisions to Protect Minors 

If your organization processes children’s data, make sure their information is sufficiently protected. Keep in mind that age requirements differ from country to country and state to state, so be sure to check local regulations and always err on the side of caution to avoid potentially expensive fines.

Adopt the Basic Principles of Information Privacy

  • Purpose Limitation: Only collect data for a specific, legitimate purpose, which you must clearly define at the point of collection. The data you collect should only be used for the exact reason for which it was collected. If you plan on using user data for any other reason, you need consent.
  • Data Minimization: You should only collect, process, and store the very minimum amount of personal information you need to meet the purpose of the collection. A brief sketch of both principles in practice follows this list.
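
For example, the purpose declared at collection can determine which fields are kept, with everything else dropped before storage. The purposes and field names in this hypothetical sketch are made up for illustration.

```python
# Hypothetical mapping from a declared collection purpose to the fields it actually needs.
ALLOWED_FIELDS_BY_PURPOSE = {
    "order_fulfillment": {"name", "shipping_address", "email"},
    "product_analytics": {"user_id", "page_viewed", "timestamp"},
}

def minimize(raw_record: dict, purpose: str) -> dict:
    """Keep only the fields needed for the declared purpose (data minimization).
    Reusing the record for a different purpose would require fresh consent."""
    allowed = ALLOWED_FIELDS_BY_PURPOSE.get(purpose)
    if allowed is None:
        raise ValueError(f"No declared purpose named {purpose!r}; collect nothing.")
    return {key: value for key, value in raw_record.items() if key in allowed}

# Date of birth and phone number are discarded because the declared purpose
# (order fulfillment) doesn't need them.
submitted = {
    "name": "Ada",
    "shipping_address": "1 Main St",
    "email": "ada@example.com",
    "date_of_birth": "1990-01-01",
    "phone": "555-0100",
}
print(minimize(submitted, "order_fulfillment"))
```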

Manage AI and Privacy Challenges with Osano

Although privacy and AI are distinct disciplines, we’re still early in AI’s timeline. Having a solid privacy program will help future-proof your business against emerging AI innovation and regulation while keeping your customer data protected. Prioritizing privacy today will pave the way for future AI success. Find out how Osano can help you harness the benefits of AI while maintaining data compliance.

Schedule a demo of Osano today

DPIA Template

Planning a project that might involve AI or other data collection and processing activities? Use this assessment template to determine what steps you need to take to ensure your project is compliant.

Download Now