github-copilot-statement's Introduction

I am posting this on GitHub to note that I did not, and do not, consent to my code being included in the commercial "GitHub Copilot" program. The code I have posted on this account is available under the license terms I specified in the repositories or by other public announcement. In some cases that license is Creative Commons Zero or public domain and therefore reuse (including incorporating the code into derivative works such as GitHub Copilot) is intentionally unrestricted, but in other cases there are license rules imposed, such as a requirement to supply credit or relicense as GPL, or a bar on commercial use. I am not aware of and did not agree to any GitHub TOS terms that give Microsoft the right to arbitrarily relicense my code to third parties, especially not for Microsoft's commercial benefit. Therefore, if I discover that a GitHub Copilot user has reused substantive portions of my code emitted from GitHub Copilot and is not following the licenses I set out for that code, I will react in exactly the same way I would for any other license infringement.

In general, I dislike the idea of copyright. I would be happier living in a world without it. However, if we are going to have copyright, I do not want it to be a one-way street where small-scale actors are restricted by it but large corporations are free to ignore its rules while still enjoying its benefits. My understanding of Microsoft's theory for why GitHub Copilot is legal is that they have avoided copyright because they are using "Machine Learning". I do not think this means much of anything. "Machine Learning" means that they pack the training data into a statistical model, and then reconstruct data from the statistical model. In other words, the statistical model is a form of compression, like a zip file, or a database index. The compression is lossy, but so is a JPEG, and you do not remove the copyright from a work by compressing it as a JPEG. Now, I do think it would be pretty great if courts recognized some kind of loophole where compressing as JPEG or otherwise adding a bit of noise to a copyrighted work removes the copyright, since I dislike copyright and would be happy if you could use technical means to circumvent it. But courts do not in general recognize this rule, and if such a rule were created only for GitHub Copilot I think that would be a bad outcome.

The rule you'd need to introduce to allow GitHub Copilot without making every JPEG uncopyrighted would be that recurrent neural networks are somehow "special" among statistical models and their outputs are novel works rather than derivative of their inputs. This is problematic because recurrent neural networks, compared to other types of statistical modeling, tend to require large amounts of processing time and input data and therefore are most accessible to moneyed entities who can afford the hardware, storage space, data gathering and electricity bills to do it seriously. This means that adopting a rule of legal copyright laundering via RNN would introduce a one-way street: a world where large corporations who can afford to do the legally mandated dance can avoid copyrights, but individuals or small-scale actors would struggle to replicate the same results against the copyrights held by corporations. I believe if copyright laundering exists it should be available to all. I also believe that if courts eventually recognize RNN copyright laundering it will be because courts often struggle to properly understand new technology and in this case were bamboozled by the complex vocabulary of the machine learning field (complex but extraneous to the basic concept of building a statistical model and then querying it) and the essentially coincidental choice of the word "neural" when neural networks were named in the 1940s. Recurrent neural network models are not brains or "AI"s, they are databases, and even if they were brains, the mere fact of using a human brain to memorize a sequence of symbols and then write it back out later does not (for better or worse) strip the work of copyright.
