AI Can Rewrite Open Source Code—But Can It Rewrite the License, Too?

Software engineers have long relied on reverse engineering to reproduce functionality without copying copyright-protected code directly. Now AI coding tools are raising new questions about how that “clean room” rewrite process plays out legally, ethically, and practically, as highlighted by a recent controversy over the popular open source library chardet.


Quick Overview

Detail | Information
Project | chardet (Python character encoding detector)
Original Author | Mark Pilgrim (2006, LGPL license)
Maintainer | Dan Blanchard (since 2012)
New Version | 7.0.0 (March 2026)
Key Change | MIT-licensed “ground-up rewrite” using Claude Code
Performance Gain | 48x speed improvement
Core Dispute | Is AI-generated code a derivative work that must keep the original license?

The Controversy

What Happened

Dan Blanchard, who took over maintenance of chardet in 2012, released version 7.0.0 last week, describing it as a “ground-up, MIT-licensed rewrite” built with the help of Claude Code. Blanchard had long wanted to get chardet added to the Python standard library but lacked the time to fix issues with “its license, its speed, and its accuracy.” With Claude’s help, he overhauled the library in roughly five days and achieved a 48x performance boost.

Pilgrim’s Objection

A poster using the name Mark Pilgrim surfaced on GitHub to argue that the new version amounts to an illegitimate relicensing of Pilgrim’s original LGPL-licensed code under the more permissive MIT license, which allows use in closed-source projects.

“Their claim that it is a ‘complete rewrite’ is irrelevant, since they had ample exposure to the originally licensed code (i.e., this is not a ‘clean room’ implementation). Adding a fancy code generator into the mix does not somehow grant them any additional rights. I respectfully insist that they revert the project to its original license.”
— Mark Pilgrim


The “AI Clean Room” Defense

Blanchard’s Argument

Blanchard admits having “extensive exposure to the original codebase,” meaning he did not maintain the “strict separation” traditionally required for clean room reverse engineering. But he argues that this tradition was designed for human coders.

In this case, Blanchard says the new AI-generated code is “qualitatively different” and “structurally independent” from the old code. His evidence includes:

  • JPlag similarity statistics: a maximum of 1.29% structural similarity between any v7.0.0 file and any v6.0.0 file
  • Comparison: v5.2.0 to v6.0.0 shows up to 80% similarity in some files
  • “Wipe it clean” commit: the rewrite started in an empty repository with no access to the old source tree
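To make the similarity figures above more concrete, here is a minimal sketch of how a maximum pairwise similarity between two versions of a codebase could be computed. It is not JPlag and not Blanchard's actual procedure: JPlag compares token streams with its own matching algorithm, while this sketch substitutes Python's standard `difflib.SequenceMatcher` over a crude tokenization, so its numbers would not match JPlag's. The function names (`tokenize`, `similarity`, `max_similarity`) and the sample file contents are hypothetical.

```python
import difflib
import re


def tokenize(source: str) -> list[str]:
    """Split source text into crude tokens, ignoring comments and whitespace.

    Assumption: '#' always starts a comment. This is wrong for '#' inside
    string literals, which is acceptable for a rough illustration.
    """
    no_comments = re.sub(r"#.*", "", source)
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", no_comments)


def similarity(old_source: str, new_source: str) -> float:
    """Return a 0.0 to 1.0 similarity ratio between two token streams."""
    matcher = difflib.SequenceMatcher(
        None, tokenize(old_source), tokenize(new_source), autojunk=False
    )
    return matcher.ratio()


def max_similarity(
    old_files: dict[str, str], new_files: dict[str, str]
) -> tuple[float, str, str]:
    """Compare every old file with every new file and keep the closest pair."""
    best = (0.0, "", "")
    for old_name, old_source in old_files.items():
        for new_name, new_source in new_files.items():
            score = similarity(old_source, new_source)
            if score > best[0]:
                best = (score, old_name, new_name)
    return best


# Hypothetical file contents, used only to exercise the functions.
old_version = {"old.py": "def f(x):\n    return x + 1\n"}
new_version = {
    "same.py": "def f(x):  # comment added\n    return x + 1\n",
    "other.py": "class Detector:\n    pass\n",
}
score, old_name, new_name = max_similarity(old_version, new_version)
print(f"{old_name} vs {new_name}: {score:.2%}")
```

A low score from a tool like this shows only that the text and structure differ. It says nothing about whether the author or the model was exposed to the original code, which is the point Pilgrim's objection turns on.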

Blanchard explicitly instructed Claude not to base anything on LGPL/GPL-licensed code.

Complicating Factors

Factor | Issue
Metadata files | Claude relied on some metadata from previous versions
Training data | Claude’s models almost certainly ingested the original chardet code
Human review | Blanchard reviewed every piece of code with intimate knowledge of the original

Free Software Foundation Executive Director Zoë Kooyman told The Register: “There is nothing ‘clean’ about a Large Language Model which has ingested the code it is being asked to reimplement.”


The Bigger Legal Questions

Unsettled Territory

The legal status of AI-generated code remains largely unsettled:

Question | Status
Can an AI be a copyright holder? | Courts say no (an AI cannot be the author of a copyright or the inventor on a patent)
Is AI output derivative of its training data? | Unclear
Does human review create a derivative work? | Unclear
Can AI-assisted code be relicensed? | This case tests it

Ship of Theseus Arguments

Open source developer Armin Ronacher wrote: “If you throw away all code and start from scratch, even if the end result behaves the same, it’s a new ship.”

But others argue that the “tainting” of open source licenses by AI-generated code can get remarkably complex remarkably quickly.


Community Reactions

The Debate

The open source community is deeply divided:

Position | Argument
Pro-rewrite | Structurally different code is a new creation
Pro-Pilgrim | An AI trained on the original code creates a derivative work
Pragmatic | This is the new reality; adapt mental models

Italian coder Salvatore “antirez” Sanfilippo wrote: “Now the process of rewriting is so simple to do, and many people are disturbed by this. There is a more fundamental truth here: the nature of software changed.”

The Fire Alarm

Open source evangelist Bruce Perens told The Register:

“I’m breaking the glass and pulling the fire alarm! The entire economics of software development are dead, gone, over, kaput! We have been there before, for example when the printing press happened and resulted in copyright law, when the scientific method proliferated. I think this one is just as large.”


Practical Implications

Impact | Description
Speed of Rewrites | AI enables rapid relicensing of open source projects
License Proliferation | More permissive licenses may emerge from restrictive ones
Legal Uncertainty | Courts will eventually need to rule
Community Trust | May erode if maintainers can easily change licenses
Economic Shift | Traditional software economics fundamentally altered

What’s Next

Being able to use AI to quickly rewrite and relicense many open source projects, with far less human effort than before, is likely to have huge knock-on effects throughout the community.

Whatever the outcome of the chardet dispute, it’s clear that the nature of software has changed. As Sanfilippo suggests, instead of combating each manifestation of automatic programming, the community may need to build a new mental model and adapt.
