Bill Text: CA AB3211 | 2023-2024 | Regular Session | Amended

NOTE: There are more recent revisions of this legislation.
Bill Title: California Digital Content Provenance Standards.

Spectrum: Partisan Bill (Democrat 1-0)

Status: (Engrossed - Dead) 2024-08-31 - Ordered to inactive file at the request of Senator Gonzalez. [AB3211 Detail]


Amended  IN  Senate  June 10, 2024
Amended  IN  Assembly  April 18, 2024
Amended  IN  Assembly  March 21, 2024

CALIFORNIA LEGISLATURE— 2023–2024 REGULAR SESSION

Assembly Bill
No. 3211


Introduced by Assembly Member Wicks

February 16, 2024


An act to add Chapter 41 (commencing with Section 22949.90) to Division 8 of the Business and Professions Code, relating to artificial intelligence.


LEGISLATIVE COUNSEL'S DIGEST


AB 3211, as amended, Wicks. California Provenance, Authenticity and Watermarking Standards.
Existing law requires the Secretary of Government Operations to develop a coordinated plan to, among other things, investigate the feasibility of, and obstacles to, developing standards and technologies for state departments to determine digital content provenance. For the purpose of informing that coordinated plan, existing law requires the secretary to evaluate, among other things, the impact of the proliferation of deepfakes, as defined.
Beginning February 1, 2025, this bill, the California Provenance, Authenticity and Watermarking Standards Act, would require a generative artificial intelligence (AI) provider to, among other things, place an imperceptible and maximally indelible watermark into synthetic content produced or significantly modified by a generative AI system that the provider makes available, as those terms are defined. The bill would require, within 96 hours of discovering a material vulnerability or failure in a generative AI system related to the erroneous or malicious inclusion or removal of provenance information or watermarks, a generative AI provider to report the vulnerability or failure to the Department of Technology and to notify other generative AI providers, as specified. The bill would also require a conversational AI system, as defined, to clearly and prominently disclose to users that the conversational AI system generates synthetic content.
Beginning March 1, 2025, this bill would require a large online platform, as defined, to, among other things, use labels to prominently disclose the provenance data found in watermarks or digital signatures of content distributed on its platform, as specified. If a large online platform is not able to detect the provenance data of content, the bill would require the platform to label the content as unknown provenance. If content uploaded to or distributed on a large online platform by a user does not contain provenance data, or if the content’s provenance data cannot be interpreted or detected by the platform, the bill would require the platform to require the user to disclose whether the content is synthetic content, as specified.
Beginning January 1, 2026, this bill would require newly manufactured digital cameras and recording devices sold, offered for sale, or distributed in California to offer users the option to place a watermark into content produced by that device. The bill would require the watermark to be compatible with widely used industry standards. If technically feasible, the bill would require a camera and recording device manufacturer, as defined, to offer to a user of a digital camera or recording device purchased in California before January 1, 2026, a software or firmware update enabling the user to place a watermark on the content created by the device and decode the provenance data.
Beginning January 1, 2026, and annually thereafter, this bill would also require generative AI providers and large online platforms to produce a Risk Assessment and Mitigation Report that assesses the risks posed and harms caused by synthetic content generated by their generative AI systems or hosted on their generative AI hosting platforms, as prescribed. The bill would require the report to be audited by qualified, independent auditors who are required to assess and either validate or invalidate the claims made in the report, as specified.
This bill would provide that a violation of its provisions may result in an administrative penalty, assessed by the department, of up to $1,000,000 or 5% of the violator’s annual global revenue, whichever is greater. The bill would require the department to adopt regulations as necessary to implement and carry out the purposes of this act and to review and update those regulations as needed.
Vote: MAJORITY   Appropriation: NO   Fiscal Committee: YES   Local Program: NO  

The people of the State of California do enact as follows:


SECTION 1.

 The Legislature finds and declares all of the following:
(a) Generative artificial intelligence (GenAI) technologies are increasingly able to synthesize images, audio, video, and text content in ways that are harmful to society.
(b) In order to reduce the severity of the harms caused by GenAI, it is important for GenAI content to be clearly disclosed and labeled.
(c) Failing to appropriately label GenAI content can skew election results, enable academic dishonesty, and erode trust in the online information ecosystem.
(d) The Legislature should act to adopt standards pertaining to the clear disclosure and labeling of GenAI content, in order to alleviate harms caused by the misuse of these technologies.
(e) The Legislature should push for the creation of tools that allow Californians to assess the provenance of online content and the extent to which content has been doctored or completely synthesized by GenAI.
(f) The Legislature should require online platforms to label synthetic content produced by GenAI.
(g) Through these actions, the Legislature can help to ensure that Californians remain safe and informed.

SEC. 2.

 Chapter 41 (commencing with Section 22949.90) is added to Division 8 of the Business and Professions Code, to read:
CHAPTER  41. California Provenance, Authenticity, and Watermarking Standards

22949.90.
 For purposes of this chapter, the following definitions apply:
(a) “AI red-teaming” means a structured testing effort to find flaws and vulnerabilities in a generative AI system, including, but not limited to, harmful or discriminatory outputs, unforeseen or undesirable system behaviors, limitations, or potential risks associated with misuse of the generative AI system.
(b) “Artificial intelligence” or “AI” means an engineered or machine-based system that varies in its level of autonomy and that can, for explicit or implicit objectives, infer from the input it receives how to generate outputs that can influence physical or virtual environments.
(c) “Conversational AI system” means chatbots and other audio- or video-based systems that can hold humanlike conversations through digital media, including, but not limited to, online calling, phone calling, video conferencing, messaging, application or web-based chat interfaces, or other conversational interfaces. Conversational AI systems include, but are not limited to, chatbots for customer service or entertainment purposes embedded in internet websites and applications.
(d) “Digital fingerprint” means a unique value that can be used to identify identical or similar digital content.
(e) “Digital signature” means a method based on cryptography that allows a user or entity to digitally sign content with provenance data in order to verify that the user or entity participated in the creation of the content.
(f) “Generative AI hosting platform” means an online repository or other internet website that makes generative AI systems available for download.
(g) “Generative AI provider” means an organization or individual that creates, codes, substantially modifies, or otherwise produces a generative AI system.
(h) “Generative AI system” means an artificial intelligence system that generates synthetic content, including images, videos, audio, text, and other digital content.
(i) “Large online platform” means a public-facing internet website, web application, or digital application, including a social network, video-sharing platform, messaging platform, advertising network, or search engine that had at least 1,000,000 California users during the preceding 12 months and can facilitate the sharing of synthetic content.
(j) “Maximally indelible watermark” means a watermark that is designed to be as difficult to remove as possible using state-of-the-art techniques and relevant industry standards.
(k) “Nonsynthetic content” means images, videos, audio, or text created by human beings without any modifications or with only minor modifications that do not lead to significant changes to the perceived contents or meaning of the content. Minor modifications include, but are not limited to, changes to brightness or contrast of images, removal of background noise in audio, and spelling or grammar corrections in text.
(l) “Potentially deceptive content” means synthetic content that is so similar to nonsynthetic content that it could reasonably be mistaken as nonsynthetic content.
(m) “Provenance data” means information about the history of the content, including, but not limited to, the following:
(1) The name of the generative AI provider or the camera or recording device manufacturer.
(2) The name and version number of the AI system that generated the content or the operating system, version of the operating system, or the application used to capture, create, or record the content.
(3) The time and date of the content’s creation and any additional modifications of the content.
(4) The portions of content that have been changed by a generative AI system, if applicable.
(n) “Synthetic content” means information, including images, videos, audio, and text, that has been produced or significantly modified by a generative AI system.
(o) “Watermark” means information that is embedded into content, including image, audio, video, text, or computer code, for the purpose of communicating the provenance, history of modification, or history of conveyance.
(p) “Watermark decoder” means a freely available software tool or online service that can read or interpret a watermark and output the provenance data associated with the watermark.
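
As an illustration only, not bill text, the provenance data defined in subdivision (m) could be modeled as a simple record like the sketch below; the field names and JSON encoding are assumptions, since the chapter leaves concrete formats to the industry standards contemplated in Section 22949.90.6.

    # Illustrative sketch; field names and encoding are assumptions, not statutory text.
    from dataclasses import dataclass, field, asdict
    from typing import List, Optional
    import json

    @dataclass
    class ProvenanceData:
        provider_name: str                 # (1) generative AI provider or device manufacturer
        system_name: str                   # (2) AI system, operating system, or capture application
        system_version: str
        created_at: str                    # (3) time and date of creation (ISO 8601 assumed)
        modified_at: Optional[str] = None  #     and of any additional modifications
        synthetic_regions: List[str] = field(default_factory=list)  # (4) portions changed by a generative AI system

        def to_json(self) -> str:
            return json.dumps(asdict(self))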

22949.90.1.
 (a) A generative AI provider shall do all of the following:
(1) Place an imperceptible and maximally indelible watermark into synthetic content produced or significantly modified by a generative AI system that the provider makes available.
(A) If a sample of synthetic content is too small to directly contain the required provenance data, the provider shall, at minimum, attempt to embed provenance data that identifies the content as synthetic and communicate the following data in order of priority, with clause (i) being the most important, and clause (v) being the least important:
(i) The synthetic nature of the content.
(ii) The name of the generative AI provider.
(iii) The name and version number of the AI system that generated the content.
(iv) The time and date of the creation of the content.
(v) If applicable, the specific portions of the content that are synthetic.
(B) To the greatest extent possible, watermarks shall be designed to communicate information that identifies content as synthetic and identifies the provider in the event that a sample of synthetic content is corrupted, downscaled, cropped, or otherwise damaged.
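
Illustratively, and reusing the ProvenanceData sketch above, the clause (i) to (v) priority ordering lends itself to a packing routine that drops the least important fields first when a watermark’s capacity is limited; the byte budget, delimiter, and encoding below are assumptions, not statutory requirements.

    # Sketch: emit provenance fields in statutory priority order until the
    # assumed watermark capacity (in bytes) is exhausted.
    def pack_watermark_payload(record: ProvenanceData, capacity_bytes: int) -> bytes:
        fields_by_priority = [
            "synthetic",                                      # (i) synthetic nature of the content
            record.provider_name,                             # (ii) name of the generative AI provider
            f"{record.system_name}/{record.system_version}",  # (iii) system name and version number
            record.created_at,                                # (iv) time and date of creation
            ";".join(record.synthetic_regions),               # (v) synthetic portions, if applicable
        ]
        payload = b""
        for item in fields_by_priority:
            encoded = (item + "|").encode("utf-8")
            if len(payload) + len(encoded) > capacity_bytes:
                break  # drop the lowest-priority fields first
            payload += encoded
        return payload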

(2) Make available to the public a watermark decoder that meets both of the following criteria:
(A) Is easy to use by an individual seeking to quickly assess the provenance of a single piece of content.
(B) Adheres, to the greatest extent possible, to relevant national or international standards.
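
A decoder meeting these criteria could be as simple as the inverse of the packing sketch above; this is illustrative only, and a production decoder would instead implement the relevant national or international standard.

    # Sketch: invert pack_watermark_payload; lower-priority fields may be absent.
    def decode_watermark_payload(payload: bytes) -> dict:
        keys = ["synthetic_flag", "provider_name", "system", "created_at", "synthetic_regions"]
        parts = payload.decode("utf-8").rstrip("|").split("|")
        return dict(zip(keys, parts))
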
(3) Conduct AI red-teaming exercises involving third-party experts to test whether watermarks can be easily removed from synthetic content produced by the provider’s generative AI systems, as well as whether the provider’s generative AI systems can be used to falsely add watermarks to otherwise nonsynthetic content. Red-teaming exercises shall be conducted before the release of any new generative AI system and annually thereafter.
(A) If a provider allows their generative AI systems to be downloaded and modified, the provider shall additionally conduct AI red-teaming to assess whether their systems’ watermarking functionalities can be disabled.
(B) A provider shall make summaries of its AI red-teaming exercises publicly available in a location linked from the home page of the provider’s internet website, using a clearly labeled link that has a similar look, feel, and size relative to other links on the same web page. The provider shall remove from the summaries any details that pose an immediate risk to public safety or provide information that could be used to disable or circumvent the functionality of watermarks specified in this chapter.
(C) A provider shall submit full reports of its AI red-teaming exercises to the Department of Technology within six months of conducting a red-teaming exercise pursuant to this section.

(b) A generative AI system capable of producing potentially deceptive content shall generate and store, in a searchable online database in a manner that can be retrieved by a viewer of the content, a digital fingerprint of and provenance data for any piece of potentially deceptive content that it produces. This provenance data shall not include personally identifiable information.
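
For illustration, a digital fingerprint as defined in subdivision (d) of Section 22949.90 could be computed with an ordinary cryptographic hash, though a perceptual hash would also match similar content as that definition contemplates; the record shape and provider name below are hypothetical, not a prescribed schema.

    import hashlib

    def digital_fingerprint(content: bytes) -> str:
        # Exact-match fingerprint; perceptual hashing would tolerate re-encoding.
        return hashlib.sha256(content).hexdigest()

    # Fingerprint plus provenance data, with no personally identifiable information.
    record = {
        "fingerprint": digital_fingerprint(b"<content bytes>"),
        "provenance": {"provider_name": "ExampleAI", "created_at": "2026-01-01T00:00:00Z"},
    }
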
(c) Providers and distributors of software and online services shall not make available a system, application, tool, or service that is designed to remove watermarks from synthetic content.
(d) Generative AI hosting platforms shall not make available a generative AI system that does not place maximally indelible watermarks communicating provenance data into content created or substantially modified by the system in a manner consistent with specifications set forth in paragraph (1) of subdivision (a).
(e) (1) Within 96 hours of discovering a material vulnerability or failure in a generative AI system related to the erroneous or malicious inclusion or removal of provenance information or watermarks, a generative AI provider shall report the vulnerability or failure to the Department of Technology.
(A) A provider shall notify other generative AI providers that may be affected by similar vulnerabilities or failures in a manner that allows those providers to harden their own AI systems against similar risks, but that does not compromise the reporting provider’s systems or disclose the reporting provider’s confidential or proprietary information.
(B) A provider shall use commercially reasonable efforts to notify affected parties, including, but not limited to, online platforms, researchers or users who received incorrect results from a watermark decoder, or users who produced AI content that contained incorrect or insufficient provenance data. A provider shall not be required to notify an affected party whose contact information the provider has not previously collected or retained.
(2) A provider shall make any report to the Department of Technology under this subdivision publicly available in a location linked from the home page of the provider’s internet website with a clearly labeled link that has a similar look, feel, and size relative to other links on the same web page. If public disclosure of the report under this subdivision could pose public safety risks, a provider may instead do either of the following:
(A) Post a summary disclosure of the reported material vulnerability or failure.
(B) Delay, for no longer than 30 days, the public disclosure of the report until the public safety risks have been mitigated. If a provider delays public disclosure, they shall document their efforts to resolve the material vulnerability or failure as quickly as possible in order to meet the reporting requirements under this subdivision.
(f) (1) A conversational AI system shall clearly and prominently disclose to users that the conversational AI system generates synthetic content.
(A) In visual interfaces, including, but not limited to, text chats or video calling, a conversational AI system shall place the disclosure required under this subdivision in the interface itself and maintain the disclosure’s visibility in a prominent location throughout any interaction with the interface.
(B) In audio-only interfaces, including, but not limited to, phone or other voice calling systems, a conversational AI system shall verbally make the disclosure required under this subdivision at the beginning and end of a call.
(2) In all conversational interfaces of a conversational AI system, the conversational AI system shall, at the beginning of a user’s interaction with the system, obtain a user’s affirmative consent acknowledging that the user has been informed that they are interacting with a conversational AI system. A conversational AI system shall obtain a user’s affirmative consent before beginning the conversation.
(3) Disclosures and affirmative consent opportunities shall be made available to a user in the language in which the conversational AI system is communicating with the user and may also be provided in additional languages.
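
One way a text-based interface might implement the disclosure and affirmative-consent requirements of paragraphs (1) and (2) is sketched below; the wording and flow are assumptions, not statutory language.

    # Sketch of an affirmative-consent gate for a text interface.
    DISCLOSURE = "Notice: you are interacting with an AI system that generates synthetic content."

    def obtain_affirmative_consent() -> bool:
        # Disclose before the conversation begins, then require affirmative consent.
        print(DISCLOSURE)
        answer = input("Type 'yes' to acknowledge and continue: ")
        return answer.strip().lower() == "yes"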

(g) This section shall become operative on February 1, 2025.

22949.90.2.
 (a) (1) Beginning January 1, 2026, newly manufactured digital cameras and recording devices sold, offered for sale, or distributed in California shall offer users the option to place a watermark into content produced by that device.
(2) A user shall have the option to customize the types of provenance data communicated by these watermarks, including by removing any personally identifiable information. Personally identifiable information, including geolocation, shall not be included in provenance data by default.
(3) Recording devices subject to the requirements of this subdivision shall clearly inform users of the existence of the watermark settings upon a user’s first use of the recording function on the recording device.
(4) When a recording device’s recording function is in use, the recording device shall contain a clear indicator that a watermark is being applied.
(5) A watermark shall, if enabled, be applied to nonsynthetic content produced using third-party applications that bypass default camera or recording applications in order to offer camera or audio recording functionalities.
(6) The watermark shall be compatible with widely used industry standards.
(b) Beginning January 1, 2026, if technically feasible, a recording device manufacturer shall offer a software or firmware update enabling a user of a recording device manufactured before January 1, 2026, and purchased in California to do both of the following:
(1) Place a watermark on the content created by the device.
(2) Decode the provenance data.

22949.90.3.
 (a) Beginning March 1, 2025, a large online platform shall use labels to prominently disclose the provenance data found in watermarks or digital signatures of content distributed on its platform.
(1) The labels shall prominently display whether content is fully synthetic, partially synthetic, nonsynthetic, nonsynthetic with minor modifications, or does not contain a watermark.
(2) A user shall be able to click or tap on a label to inspect provenance data in an easy-to-understand format.
(b) The disclosure required under subdivision (a) shall be readily legible to an average viewer or, if the content is in audio format, shall be clearly audible. A disclosure in audio content shall occur at the beginning and end of a piece of content and shall be presented in a prominent manner and at a comparable volume and speaking cadence as other spoken words in the content. A disclosure in video content should be legible for the full duration of the video.
(c) A large online platform shall use state-of-the-art techniques to detect and label synthetic content that has had watermarks removed or that was produced by generative AI systems without watermarking functionality. If the platform is not able to detect the provenance data of content, then the platform shall label the content as unknown provenance.
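
Taken together, subdivisions (a) through (c) amount to a small labeling decision procedure, sketched below for illustration; the label strings and provenance fields are assumptions, not statutory text.

    from typing import Optional

    # Sketch: map decoded provenance data (or None when no watermark is
    # detected or interpretable) to one of the required labels.
    def choose_label(provenance: Optional[dict]) -> str:
        if provenance is None:
            return "unknown provenance"
        if provenance.get("synthetic_flag"):
            return "partially synthetic" if provenance.get("synthetic_regions") else "fully synthetic"
        if provenance.get("minor_modifications"):
            return "nonsynthetic with minor modifications"
        return "nonsynthetic"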

(d) (1) If content uploaded to or distributed on a large online platform by a user does not contain provenance data, or if the content’s provenance data cannot be interpreted or detected by the platform, a platform shall require the user to disclose whether the content is synthetic content.
(2) A large online platform shall include prominent warnings to users that uploading or distributing synthetic content without disclosing that it is synthetic content may be a violation of platform policy.
(3) A large online platform may provide users with an option to indicate that the user is uncertain whether the content they are uploading or distributing is synthetic content. If a user uploads or distributes content and indicates that they are uncertain of whether the content is synthetic content, a large online platform shall prominently disclose that the uploaded or distributed content is of unknown provenance.
(e) A large online platform shall use state-of-the-art techniques to detect and label text-based potentially deceptive content that is uploaded by users.
(f) A large online platform shall make accessible some functionality for users to apply a digital signature to nonsynthetic content. The functionality shall include options that do not require disclosure of personally identifiable information.
(g) A large online platform that can detect potentially deceptive content that does not contain watermarks that comply with applicable industry standards shall generate and store, in an online database to be shared and privately accessible by all other online platforms and the Department of Technology, digital fingerprints and any associated provenance data for these images. This provenance data shall not include personally identifiable information. The Department of Technology may choose to share access to these databases with coordinating bodies acting to facilitate more rapid and computationally efficient detection and labeling of synthetic content.

22949.90.4.
 (a) (1) Beginning January 1, 2026, and annually thereafter, generative AI providers and large online platforms shall produce a Risk Assessment and Mitigation Report that assesses the risks posed and harms caused by synthetic content generated by their generative AI systems or hosted on their generative AI hosting platforms.
(2) The report shall include, but not be limited to, assessments of the distribution of illegal generative AI-generated child sexual abuse materials, nonconsensual intimate imagery, disinformation related to elections or public health, plagiarism, or other instances where synthetic or potentially deceptive content caused or may have the potential to cause harm.
(3) The report shall incorporate information known to the generative AI provider or large online platform about known harms caused by synthetic content generated by their systems or hosted on their platforms, as informed by reports submitted to, and confirmed by, the provider or platform, and independent investigation as appropriate, including, for example, illegal material.
(b) The report required under subdivision (a) shall be audited by qualified, independent auditors who shall assess and either validate or invalidate the claims made in the report. Auditors shall use state-of-the-art techniques to assess reports, and shall adhere to relevant national and international standards.

22949.90.5.
 A violation of this chapter may result in an administrative penalty, assessed by the Department of Technology, of up to one million dollars ($1,000,000) or 5 percent of the violator’s annual global revenue, whichever is greater.

22949.90.6.
 Within 90 days of the date upon which this act takes effect, the Department of Technology shall adopt regulations to implement and carry out the purposes of this chapter. The department shall review and update its regulations relating to the implementation of this chapter as needed, including, but not limited to, adopting specific national or international standards for provenance, authenticity, watermarking, and digital signatures, as long as the standards do not weaken the provisions of this chapter.

22949.91.
 The provisions of this chapter are severable. If any provision of this chapter or its application is held invalid, that invalidity shall not affect other provisions or applications that can be given effect without the invalid provision or application.
