APAA e-Newsletter (Issue No. 43, October 2024)
General understanding on AI and copyright in Japan
AOKI Shujiro, SEIWA PATENT & LAW (Japan)
1. Introduction
The Legal Subcommittee under the Copyright Subdivision of the Cultural Council for Cultural Affairs (hereinafter referred to as “the committee”) published a report on the general understanding of AI and copyright in Japan (hereinafter referred to as “the report”) on March 15, 2024. This article introduces the committee’s opinion on the use of copyrighted works in AI, particularly on whether generative AI infringes copyright. It does not discuss measures relating to copyright infringement or whether AI-produced works are protected by copyright.
2. Background
Unlike the U.S., the Japanese copyright system does not have a doctrine of “fair use”. Instead, restrictions on the exercise of copyright are enumerated in Japan’s Copyright Act (Articles 30 to 47).
Article 30-4 of the Copyright Act restricts the enforcement of copyright against prescribed actions that are not intended for enjoying the expression of ideas or emotions conveyed in copyrighted works. The statute does not apply to actions that are intended for enjoyment or that would unreasonably prejudice the interests of the copyright owner, considering the nature or purpose of the work and the circumstances of its use.
Article 30-4 (English translation of Copyright Act Article 30-4 provided by the Japanese Ministry of Justice)
It is permissible to exploit a work, in any way and to the extent considered necessary, in any of the following cases, or in any other case in which it is not a person’s purpose to personally enjoy or cause another person to enjoy the thoughts or sentiments expressed in that work; provided, however, that this does not apply if the action would unreasonably prejudice the interests of the copyright owner in light of the nature or purpose of the work or the circumstances of its exploitation:
(i) if it is done for use in testing to develop or put into practical use technology that is connected with the recording of sounds or visuals of a work or other such exploitation;
(ii) if it is done for use in data analysis (meaning the extraction, comparison, classification, or other statistical analysis of the constituent language, sounds, images, or other elemental data from a large number of works or a large volume of other such data; the same applies in Article 47-5, paragraph (1), item (ii));
(iii) if it is exploited in the course of computer data processing or otherwise exploited in a way that does not involve what is expressed in the work being perceived by the human senses (for works of computer programming, such exploitation excludes the execution of the work on a computer), beyond as set forth in the preceding two items.
(Japanese Law Translation: https://www.japaneselawtranslation.go.jp/ja/laws/view/4207#je_ch2sc3sb5at4)
The premise of this statute is that actions not aimed at enjoying the thoughts or emotions expressed in copyrighted works do not impair the copyright owner’s profits which the Act aims to protect. More specifically, the idea is that such actions do not impact the copyright owner’s ability to derive value from those who seek to enjoy thoughts or sentiments expressed in copyrighted works.
The committee has outlined its perspectives regarding the relationship between this statute and the use of copyrighted works in generative AI, as follows:
3. Relationship between usage of copyrighted works in AI and copyright
3-1. Usage of copyrighted works in AI
Copyrighted works can be used in AI, particularly in generative AI, during both the development and training stages, as well as the generation and utilization stages.
Regarding the use of copyrighted works in the development and training stages (see Fig. 1 below), the collection and processing of training data (a), as well as inputting the data into a training program (b), may involve the reproduction of copyrighted works. Additionally, if Retrieval Augmented Generation (RAG) is used, the collection and processing of RAG data for the RAG database (c) may also involve the reproduction of copyrighted works.
[Fig. 1]
In the generation and utilization stage, creating AI-generated content and distributing it over the Internet may be regarded as an exploitation of copyrighted works.
3-2. Development/Training Stage
The collection and processing of training data (a), as well as inputting that data into a training program (b), are considered exploitation of copyrighted works if the data includes such works. However, these actions are primarily for the purpose of information analysis and do not typically aim to enjoy the expression of ideas or emotions contained in copyrighted works. Therefore, these actions are generally not regarded as copyright infringement under Article 30-4.
However, even if these actions are for information analysis, the statute does not apply if the actions are intended to “enjoy” the “expression of thoughts or emotions” conveyed in the copyrighted works, or if they “would unreasonably prejudice the interests of the copyright owner”.
(1) Purpose of “enjoying” “expression of thoughts or emotions”
The committee has highlighted the following examples of specific cases that may be considered as having the purpose of “enjoying” “expression of thoughts or emotions”:
- Collecting or processing training data, or additional training data for existing models, with the specific intent of outputting or generating all or part of the creatively produced expressions contained in the copyrighted works included in the training data (see (a) in Fig. 1). These actions are commonly referred to as fine-tuning and intentional “overfitting”.
- Creating a RAG database, etc. with the specific intent of producing AI-generated content that includes the creative expressions from copyrighted works contained in the training data or publicly available on the Internet, etc. (see (c) in Fig. 1). However, under Article 47-5 of the Copyright Act, such actions may not be considered infringement if the exploitation is minimal in nature.
(2) Unreasonable prejudice of interests of copyright owners
The committee has given the following examples bearing on whether an action may be considered to unreasonably prejudice the interests of copyright owners:
- In principle, creating AI-generated content that incorporates ideas or imitates the literary style, etc. of existing copyrighted works does not unreasonably prejudice the interests of copyright owners, as the Copyright Act does not protect ideas or literary styles.
- Using a database designed for information analysis (for example, as training data) without paying the required fee may unreasonably prejudice the interests of copyright owners. In particular, some databases implement “technical measures” to prevent the reproduction of the database or of the copyrighted works it contains. Using such a database by circumventing those “technical measures” may also unreasonably prejudice the interests of copyright owners.
- The committee has emphasized that collecting AI training data from websites that are known to distribute copyright-infringing content should be strictly avoided. It also stated that if a business (e.g., an AI developer or AI service provider) knowingly collects AI training data that includes infringing content, it may be held liable for copyright infringement by the generative AI, depending on the circumstances.
3-3. Generation/Utilization Stage
When AI-generated content is shared over the Internet, for example by being uploaded to social media or sold, whether these actions constitute copyright infringement will be judged by the same standards as traditional infringement. Thus, if AI-generated content is both similar to and dependent on an existing copyrighted work, producing such content would be considered copyright infringement.
The committee has stated that AI-generated content is dependent on existing copyrighted works if an AI user is aware of an existing copyrighted work or the copyrighted work is included in the training data. Otherwise, the content is not dependent on the copyrighted work even if the AI-generated content is similar to the copyrighted work.
4. Summary
The opinions of the committee in its report are not legally binding, nor are they a legal evaluation of generative AI. It will be necessary to await precedents and court decisions before concerns regarding the relationship between generative AI and copyright can be resolved. However, these opinions may influence the use of copyrighted works in AI, in particular generative AI, in Japan.