Computer engineers have long relied on reverse engineering to copy functionality without copying copyright-protected code directly. Now, AI coding tools are raising new issues with how that “clean room” rewrite process plays out legally, ethically, and practically, as highlighted by a recent controversy over the popular open source library chardet.
Quick Overview
| Detail | Information |
|---|---|
| Project | chardet (Python character encoding detector) |
| Original Author | Mark Pilgrim (2006, LGPL license) |
| Maintainer | Dan Blanchard (since 2012) |
| New Version | 7.0.0 (March 2026) |
| Key Change | MIT-licensed “ground-up rewrite” using Claude Code |
| Performance Gain | 48x speed improvement |
| Core Dispute | Is AI-generated code a derivative work requiring original license? |
The Controversy
What Happened
Dan Blanchard, who took over maintenance of chardet in 2012, released version 7.0.0 last week, describing it as a “ground-up, MIT-licensed rewrite” built with the help of Claude Code. Blanchard had long wanted to get chardet added to the Python standard library but lacked time to fix issues with “its license, its speed, and its accuracy.” With Claude’s help, he overhauled the library in roughly five days with a 48x performance boost.
Pilgrim’s Objection
A poster using the name Mark Pilgrim surfaced on GitHub to argue that this new version amounts to an illegitimate relicensing of Pilgrim’s original LGPL-licensed code under a more permissive MIT license (which allows use in closed-source projects).
“Their claim that it is a ‘complete rewrite’ is irrelevant, since they had ample exposure to the originally licensed code (i.e., this is not a ‘clean room’ implementation). Adding a fancy code generator into the mix does not somehow grant them any additional rights. I respectfully insist that they revert the project to its original license.”
— Mark Pilgrim
The “AI Clean Room” Defense
Blanchard’s Argument
Blanchard admits having “extensive exposure to the original codebase,” meaning he didn’t have the traditional “strict separation” used for clean room reverse engineering. But he argues that this tradition was set up for human coders.
In this case, Blanchard says the new AI-generated code is “qualitatively different” and “structurally independent” from the old code. The evidence he cites includes:
- JPlag similarity statistics: Maximum 1.29% structural similarity between v7.0.0 and v6.0.0 files
- Comparison: v5.2.0 to v6.0.0 shows up to 80% similarity in some files
- “Wipe it clean” commit: Started in empty repository with no access to old source tree
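To give a feel for what a "structural similarity" percentage measures, here is a minimal sketch in Python. It is not JPlag's algorithm and will not reproduce the figures above; it only imitates the general idea of comparing token streams rather than raw text, so that renaming variables or changing literals does not hide copied structure. The function names `structural_tokens` and `similarity` are hypothetical, and the matching relies on the standard library's `difflib.SequenceMatcher` as a stand-in.

```python
import difflib
import io
import keyword
import tokenize


def structural_tokens(source: str) -> list[str]:
    """Reduce Python source to a token stream that ignores names and literals."""
    # Token types that carry layout or comments rather than program structure.
    skip = {
        tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
        tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
    }
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        if tok.type == tokenize.NAME:
            # Keep keywords (def, return, for ...); collapse identifiers to "ID".
            out.append(tok.string if keyword.iskeyword(tok.string) else "ID")
        elif tok.type == tokenize.NUMBER:
            out.append("NUM")
        elif tok.type == tokenize.STRING:
            out.append("STR")
        else:
            out.append(tok.string)  # operators and punctuation
    return out


def similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two normalized token streams."""
    matcher = difflib.SequenceMatcher(
        None, structural_tokens(a), structural_tokens(b), autojunk=False
    )
    return matcher.ratio()


if __name__ == "__main__":
    old = "def f(x):\n    return x + 1\n"
    renamed = "def g(y):\n    return y + 2\n"
    different = "for i in range(10):\n    print(i)\n"
    print(f"renamed copy:   {similarity(old, renamed):.2%}")
    print(f"unrelated code: {similarity(old, different):.2%}")
```

In this sketch a copy with renamed identifiers and changed constants still scores 100%, while unrelated code scores lower. A low score shows only that the token structure differs; it says nothing about whether the author or the tool had access to the original, which is the point Pilgrim's objection turns on.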
Blanchard explicitly instructed Claude not to base anything on LGPL/GPL-licensed code.
Complicating Factors
| Factor | Issue |
|---|---|
| Metadata files | Claude relied on some metadata from previous versions |
| Training data | Claude’s models almost certainly ingested original chardet code |
| Human review | Blanchard reviewed every piece of code with intimate knowledge of original |
Free Software Foundation Executive Director Zoë Kooyman told The Register: “There is nothing ‘clean’ about a Large Language Model which has ingested the code it is being asked to reimplement.”
The Bigger Legal Questions
Unsettled Territory
The legal status of AI-generated code remains largely unsettled:
| Question | Status |
|---|---|
| Can an AI be a copyright holder? | Courts say no (an AI cannot be an author for copyright or an inventor for patents) |
| Is AI output derivative of training data? | Unclear |
| Does human review create derivative work? | Unclear |
| Can AI-assisted code be relicensed? | This case tests it |
Ship of Theseus Arguments
Open source developer Armin Ronacher wrote: “If you throw away all code and start from scratch, even if the end result behaves the same, it’s a new ship.”
But others argue the “tainting” of open source licenses with AI-generated code can get remarkably complex remarkably quickly.
Community Reactions
The Debate
The open source community is deeply divided:
| Position | Argument |
|---|---|
| Pro-rewrite | Structurally different code is a new creation |
| Pro-Pilgrim | AI trained on original code creates derivative work |
| Pragmatic | This is the new reality; adapt mental models |
Italian coder Salvatore “antirez” Sanfilippo wrote: “Now the process of rewriting is so simple to do, and many people are disturbed by this. There is a more fundamental truth here: the nature of software changed.”
The Fire Alarm
Open source evangelist Bruce Perens told The Register:
“I’m breaking the glass and pulling the fire alarm! The entire economics of software development are dead, gone, over, kaput! We have been there before, for example when the printing press happened and resulted in copyright law, when the scientific method proliferated. I think this one is just as large.”
Practical Implications
| Impact | Description |
|---|---|
| Speed of Rewrites | AI enables rapid relicensing of open source projects |
| License Proliferation | More permissive licenses may emerge from restrictive ones |
| Legal Uncertainty | Courts will eventually need to rule |
| Community Trust | May erode if maintainers can easily change licenses |
| Economic Shift | Traditional software economics fundamentally altered |
What’s Next
The practical impact of being able to use AI to quickly rewrite and relicense many open source projects, without nearly as much human effort, is likely to have huge knock-on effects throughout the community.
Whatever the outcome of the chardet dispute, it’s clear that the nature of software has transformed forever. As Sanfilippo suggests, instead of combating each manifestation of automatic programming, the community may need to build a new mental model and adapt.
