DeepSeek, Model Distillation, and the Future of AI IP Protection

By: Robert Hulse, Tyler G. Newby, Stuart P. Meyer, Fredrick Tsang

What You Need To Know

  • DeepSeek’s R1 release has generated heated discussions on the topic of model distillation and how companies may protect against unauthorized distillation.
  • Model distillation has broad IP implications as AI agents and domain-specific AI expert models can also be distilled.
  • Current copyright frameworks may not offer sufficient protection against distillation, particularly because a student model resulting from distillation may have a different architecture than the teacher model from which it is derived.
  • Unauthorized distillation would likely breach the teacher model’s Terms of Service, but whether remedies for such a breach would sufficiently protect the teacher-model owner’s IP remains to be determined.
  • With the right planning and strategy, patents can protect not only the teacher model but also against unauthorized student models derived through distillation.

A flurry of developments in late January 2025 has caused quite a buzz in the AI world. On January 20, DeepSeek released a new open-source AI model called R1 and an accompanying research paper. R1 was touted as performing comparably to proprietary frontier models at a fraction of the cost, with drastically reduced computational resources needed both to train and run the model. DeepSeek explained that it used new techniques in reinforcement learning, but others suspect that it might also have benefitted from unauthorized model distillation. Within a week, there was a strong market reaction, with AI-related semiconductor stocks dropping more than 10% on Monday, January 27.

Just a few days later, on January 29, the U.S. Copyright Office published the second installment of its three-part Report on Copyright and Artificial Intelligence—this part focusing on the copyrightability of AI-generated outputs. In that report, the Copyright Office analyzed the amount and nature of human contribution needed to bring AI outputs within the scope of U.S. copyright protection. It concluded that existing law remains appropriate for the new AI technologies and that mere prompts generally do not meet the standard of human contribution needed for copyright protection. It gathered and considered public comments for its analysis but rejected arguments that legal change is necessary to counter international competition, such as foreign entities distilling U.S. frontier models.

Protection against unauthorized model distillation is an emerging issue within the longstanding theme of safeguarding intellectual property (IP). Existing countermeasures have primarily focused on technical solutions, such as restricting access through application programming interface (API) rate limiting or embedding traceable watermarks in AI outputs. This article will examine the legal protections available under the current legal framework and explore why patents may serve as a crucial safeguard against unauthorized distillation.

What Is AI Model Distillation?

Simply put, model distillation is one model teaching another model. Various techniques exist, but typically, a powerful and knowledgeable teacher model generates outputs in specific formats, which are then used to train or fine-tune a student model. The student model is often smaller, faster, and more cost-efficient than the teacher model. Leading AI labs frequently release both their most advanced models and smaller versions, with the latter often partially created through distillation. Authorized model distillation has broad applications, including enhancing efficiency for edge computing and adapting models for specialized domains, such as life sciences.
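
For readers who want a concrete picture, below is a minimal sketch of the classic distillation recipe in PyTorch. The toy models, data, and temperature are illustrative assumptions; real LLM distillation pipelines typically fine-tune the student on the teacher’s generated text rather than on raw output distributions.

```python
# Minimal, illustrative knowledge-distillation loop (toy models and random data).
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 10))  # larger "teacher"
student = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))    # smaller, cheaper "student"

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0  # softens the teacher's output distribution

for step in range(100):
    x = torch.randn(64, 32)                  # stand-in for real training inputs
    with torch.no_grad():
        teacher_logits = teacher(x)          # the teacher generates outputs ("soft labels")
    student_logits = student(x)
    # The student is trained to match the teacher's output distribution (KL divergence).
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Note that the student above has a different, smaller architecture than the teacher; only the teacher’s outputs pass between the two models.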

Model distillation can also occur without authorization. A company may extract knowledge from a proprietary or restricted AI model without permission—often by repeatedly scraping its API to amass large volumes of training data—and then use this data to train a new student model. Notably, the student model may have an entirely different architecture from the teacher model, and conventional IP frameworks may not classify distillation as direct copying. This raises a critical question: What legal protection exists to ensure that frontier AI labs, which invest billions of dollars in developing cutting-edge models, remain incentivized to innovate despite the risk of others distilling their results?
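
The unauthorized variant usually looks less like the training loop above and more like data collection. The sketch below is a hypothetical illustration of amassing a distillation dataset by querying a model’s API; the endpoint, credential, and response schema are invented for illustration and do not describe any real provider.

```python
# Hypothetical illustration only: collecting prompt/response pairs from a model API
# to build a fine-tuning dataset for a student model.
import json
import time
import requests

API_URL = "https://api.example.com/v1/chat"  # hypothetical endpoint
API_KEY = "sk-..."                           # placeholder credential

prompts = ["Explain CRISPR in one paragraph.", "Summarize the Clean Air Act."]  # in practice, millions

with open("distillation_dataset.jsonl", "w") as f:
    for prompt in prompts:
        resp = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"prompt": prompt},
            timeout=30,
        )
        answer = resp.json().get("completion", "")
        # Each teacher output becomes one supervised training example for the student.
        f.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")
        time.sleep(1)  # simple pacing between requests
```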

Protection Against Unauthorized Distillation Is Equally Important to Model/AI Agent Deployers

While discussions around unauthorized model distillation have largely centered on the AI race among frontier labs, the issue has significant implications for model deployers—particularly companies commercializing AI agents or domain-specific models. Simply put, AI agents and specialized models can be distilled too. For example, a life sciences company that develops a best-in-class protein simulation model may find that its generated protein structures are easily repurposed to train another model. Without legal protections against distillation, companies may struggle to differentiate their products based on superior AI outputs, leading to the rapid commoditization of even the most advanced AI tools.

Copyright Protection Appears to Be Lacking in This Area

Under the current legal framework, copyright is unlikely to offer meaningful protection—if any—against model distillation. Three components in the distillation process could potentially be subject to copyright:

  • Training source code
  • Training data
  • Generated models

However, copyright protection for these components may be quite limited in the context of AI model distillation.

The training source code consists of the algorithms used to train a model. In cases of unauthorized model distillation, the student-model trainer is unlikely to use the same training source code as the teacher-model owner. Training source code is typically proprietary, even in open-weight models, making it improbable that a third party would have direct access to or copy the original source code, unless corporate espionage is involved.

The training data relevant to distillation consists of the teacher model’s outputs. In unauthorized distillation, the student-model trainer often scrapes large volumes of these outputs—such as through repeated API requests—to create a high-quality training dataset for a student model. Whether this constitutes copyright infringement depends on whether AI-generated outputs qualify as copyrighted material. To date, most courts and the U.S. Copyright Office have indicated that AI outputs that lack sufficient human-generated expressive inputs are generally not copyrightable, making copyright protection unlikely in this context. As noted above, the January 29, 2025, Copyright Office report indicates that the office’s position remains largely unchanged in this regard.

That said, the third part of the Copyright Office report, expected later this year, is specifically directed at the legal implications of training AI models on copyrighted works, and the Copyright Office may consider distillation training in its analysis as well, providing further guidance. Of course, the Copyright Office’s view directly affects only its own operations (e.g., processing copyright registration applications) and is not controlling authority in infringement lawsuits.

The generated model presents another potential copyright issue. One could argue that a student model is a derivative or “copy” of the teacher model and that its creation or storage without authorization constitutes copyright infringement. However, this argument would likely face significant legal challenges. From a technical perspective, the student model may have an entirely different architecture than the teacher model. Legally, copyright protects only expressive elements, whereas AI models primarily consist of large datasets of numerical weight values. Given this, it remains uncertain whether a student model retains enough expressive similarity to the teacher model for copyright protection to apply.

Breach of Terms of Service and Computer Fraud and Abuse Act

To prevent unauthorized model distillation, companies can include contractual provisions in their Terms of Service (ToS) prohibiting users from distilling model outputs. Violating these terms would likely constitute a breach of contract. Key questions remain, however, as to whether such a breach would entitle the teacher-model owner to recover lost profits or justify non-monetary relief, such as an injunction against the downstream use of a student model trained on misappropriated outputs. Although atypical, injunctive relief can be awarded for breach of contract where damages are insufficient to remedy the harm.

In some cases, breaching ToS by engaging in unauthorized model distillation may be considered a form of trespass to chattels, as recognized in eBay v. Bidder’s Edge, 100 F. Supp. 2d 1058 (N.D. Cal. 2000). However, trespass to chattels typically requires proof of some detrimental impact on a computer system’s availability or performance, which may be challenging if the teacher-model owner is not able to pinpoint when the distillation took place. Additionally, remedies for trespass to chattels typically focus on prohibiting further access to the trespassed system and compensating for the negative impact on the computer system. Whether a plaintiff could obtain an injunction against the downstream use of a student model trained on misused data remains an open legal question. Securing an injunction against access to the teacher model itself may be of limited practical value, as restricting API access is a more immediate and effective means of enforcement.

A ToS violation could also implicate the Computer Fraud and Abuse Act (CFAA), particularly if an entity unlawfully obtains information from a “protected computer”—which, in this context, would be the teacher model. However, in Van Buren v. United States, 141 S. Ct. 1648, 1661 (2021), the U.S. Supreme Court held that the CFAA does not apply when an individual who is authorized to access a system later misuses the data in violation of the ToS. Following Van Buren, lower courts have continued to issue injunctions in data-scraping cases, such as Southwest Airlines Co. v. Kiwi.com, Inc. (3:21-cv-00098, N.D. Tex.), but only when additional legal elements are met. For example, the teacher-model owner would likely need to show that access to the teacher model was restricted, such as to persons authorized with access credentials, and that the scraper gained unauthorized access to the model.

While parallels exist between data scraping and unauthorized model distillation, one could argue that distillation is more transformative than merely scraping and republishing data. This distinction may further complicate the application of existing data-scraping precedents to unauthorized model distillation.

Patents May Be a Silver Bullet

With the right planning and strategy, a patentee may be able to secure protection not only for the teacher model but also against unauthorized student models derived through distillation. Although an inventor may not contemplate unauthorized distilled models as the main embodiment of the invention, those distilled models can sometimes still be claimed in the patent alongside, or in addition to, the teacher model. There are at least four key reasons why patents may serve as an effective tool against unauthorized model distillation:

  • Patents are exclusionary
  • Patents are flexible in defining an invention’s scope
  • Patents inherently protect practical applications
  • Injunctions are an available remedy

Patents, by their exclusionary nature, grant the patentee the right to prevent others from making, using, or selling a patented product or process. In the U.S. and many other jurisdictions, a patentee is not required to practice the patented invention to enforce their rights. This means that a patent holder can obtain claims covering model distillation without actively engaging in the practice—though many frontier labs do, in fact, perform authorized distillation on their own models. For example, a patentee could secure a claim covering a process that involves:
(a) using a teacher model (that is novel and inventive), 
(b) generating outputs from the teacher model, and
(c) training a student model using those outputs.

Patent law also offers significant flexibility in defining what constitutes an invention and how it can be claimed. The patent system was designed to anticipate unforeseen technological advancements, allowing attorneys to adapt claim strategies to emerging innovations. Various legal frameworks enable creative claiming to protect new inventions effectively. One such strategy is the product-by-process claim, which defines a product based on the method by which it is created rather than its structural characteristics. While product-by-process claims have certain limitations, they could be a useful tool for protecting AI models by claiming the AI model itself as a “product” that is made using the distillation “process.” Such a claim would be patentable because it inherently involves the use of a novel teacher model in the distillation (i.e., in the method of making the product).

Detection is a key consideration in any patent strategy, as a patent is valuable only if its infringement can be detected. Patent claims that cover student models and methods of making them may be difficult to enforce if their novelty—and hence the claim language—depends on features of the teacher model used in the distillation. Integrating watermarking technology into patent claims may solve this detectability problem. If watermarking advances to the point where training data can be definitively traced back to a particular teacher model, a patentee could point to the watermark signatures embedded in a student model as evidence that the teacher model was used in the distillation, demonstrating infringement of a claim directed to the products of that distillation and thereby protecting against unauthorized distillation.
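
To make the watermarking idea concrete, below is a minimal sketch of how a statistical text watermark could be detected, loosely modeled on published “green-list” schemes (e.g., Kirchenbauer et al., 2023). The keyed hash, cutoff, and threshold are assumptions for illustration, not any provider’s actual method.

```python
# Illustrative "green-list" watermark detector: a watermarked teacher model biases
# generation toward tokens that a keyed hash marks as "green"; text with an unusually
# high green-token rate is statistical evidence it came from (or was trained on) that teacher.
import hashlib
import math

SECRET_KEY = "teacher-model-watermark-key"  # hypothetical key held by the model owner
GREEN_FRACTION = 0.5                        # fraction of the vocabulary favored at each step

def is_green(prev_token: str, token: str) -> bool:
    """A token is 'green' if a keyed hash of (previous token, token) falls below the cutoff."""
    digest = hashlib.sha256(f"{SECRET_KEY}|{prev_token}|{token}".encode()).digest()
    return digest[0] / 255.0 < GREEN_FRACTION

def watermark_z_score(tokens: list[str]) -> float:
    """How far the observed green-token rate deviates from chance, in standard deviations."""
    hits = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
    n = len(tokens) - 1
    expected = n * GREEN_FRACTION
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (hits - expected) / std

suspect_output = "the quick brown fox jumps over the lazy dog".split()
print(round(watermark_z_score(suspect_output), 2))  # high values suggest the watermark is present
```

A high z-score across a large sample of a suspect student model’s training data or outputs would be the kind of technical evidence that could support the infringement showing described above.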

Because of their focus on function, patents are particularly well-suited for protecting practical applications of technology, including the deployment of AI models in specific settings. For companies commercializing AI agents or domain-specific models, patenting AI’s practical use cases can serve as an additional safeguard against unauthorized model distillation.

Finally, injunctions are a strong remedy available to patentees. One study indicated that roughly four out of five patent infringement lawsuits between competitors that reached an infringement verdict resulted in a permanent injunction. This high rate underscores the patent system’s potential effectiveness as a legal tool to deter and prevent unauthorized distillation.

What’s Next

The rapid commoditization of large language models (LLMs) may signal an early paradigm shift in this market. In early 2023, the cost of using an LLM was $0.02 per thousand input tokens. By January 2025, DeepSeek was charging $0.14 per million input tokens ($0.00014 per thousand tokens), a price drop of more than 99% in just two years.
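
For reference, the arithmetic behind that comparison, using the figures quoted above:

$$
\frac{\$0.14}{1{,}000{,}000\ \text{tokens}} \times 1{,}000 = \$0.00014\ \text{per thousand tokens},
\qquad
1 - \frac{0.00014}{0.02} \approx 0.993,
$$

i.e., a decline of more than 99% from the early-2023 price.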

Traditionally, software competition has been driven by product differentiation and economies of scale. If DeepSeek’s introductory pricing becomes a market norm, it would raise fundamental questions about how AI companies can maintain a competitive edge. Only time will tell whether traditional software-business premises will hold in the world of AI agents, particularly where distillation may make product differentiation a very different concept.

Given this uncertainty and potential market transformation, companies should thoughtfully consider investing in legally protectable IP to maximize their strategic options in an increasingly competitive and rapidly evolving AI landscape.