I think copyright issues are getting blown out of proportion.
Ethically it is not very different from me reading books, blogs and other available source code, and then writing my own program based on whatever I learned.
Legally, IANAL, but the generated code would either be novel, common enough, or otherwise easily traceable to the original source. In that third case, the developer can make the call whether to keep it or not.
The analogy is wrong. AI does not look at all similar to a human mind; it is a more complex algorithm, an obfuscated script that, if done wrong, will output the exact input you used for training. This can probably happen very often with original stuff and less often with trivial stuff.
AI is not similar to a human mind in the general case (i.e. general AI), but in the context of reading, learning, and generating code there are similarities. And I would argue that humans are also very susceptible to "output the exact input used for training".
How is it similar in generating code? In my own case I don't create code by combining previously seen code. For example, I can write a Lua/Haskell script right now even though I have no Lua/Haskell code stored in my memory. I can do it because I create a model of the problem, then I create structures of data and operations on that data; only the final step is to look up the syntax and standard libraries to generate the code.
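To illustrate (my own toy example, in Java since that's what the later snippets look like): the model of the problem and the operations come first, and the concrete syntax is only the last step.

    import java.util.List;

    public class TopScores {
        // Step 1: model of the problem -- "given scores, report the passing ones, highest first".
        // Step 2: data and operations -- a list, a filter predicate, a descending sort.
        // Step 3 (and only step 3): look up the concrete syntax and library calls.
        static List<Integer> passingScoresDescending(List<Integer> scores) {
            return scores.stream()
                    .filter(s -> s >= 50)                    // keep passing scores
                    .sorted((a, b) -> Integer.compare(b, a)) // highest first
                    .toList();                               // Java 16+
        }

        public static void main(String[] args) {
            System.out.println(passingScoresDescending(List.of(30, 80, 55, 90))); // [90, 80, 55]
        }
    }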
Can someone prove that it at least manages to understand trivial algorithms, like recognizing that this is a find/sort/reverse/filter/map operation, and that it can, say, map an algorithm from one language to another?
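One experiment that could test this (just a sketch of a prompt, not something I have verified): write out the imperative version of a filter-and-sort, then ask for the equivalent and check whether the completion is really the same algorithm.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class MapFilterTest {
        // The prompt: an imperative filter + sort, fully written out...
        static List<String> shortNamesSorted(List<String> names) {
            List<String> result = new ArrayList<>();
            for (String name : names) {
                if (name.length() <= 4) {  // the "filter" step
                    result.add(name);
                }
            }
            Collections.sort(result);      // the "sort" step
            return result;
        }

        // ...then stop typing here and see whether the completion is a genuine
        // stream-based equivalent (filter + sorted), or merely plausible-looking
        // code that computes something different:
        // static List<String> shortNamesSortedStreams(List<String> names) { ...
    }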
Thinking about it, bad students learning programming do this kind of stuff: they start writing things from memory that look like valid code. One student wrote something like if(int i = 0; i < n; i++). Clearly this student did not understand the lesson, and Copilot will make a similar mistake, just one level higher.
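Spelling out that mistake: the student pattern-matched the three-part for-loop header onto an if statement, producing something that looks like code but does not compile.

    public class LoopVsIf {
        public static void main(String[] args) {
            int n = 3;
            // What the student wrote: for-loop syntax grafted onto `if`.
            // It looks like valid code but is not:
            //     if (int i = 0; i < n; i++) { ... }
            // What the lesson actually called for:
            for (int i = 0; i < n; i++) {
                System.out.println("iteration " + i);
            }
        }
    }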
I definitely agree Copilot isn't understanding the problem space of your code. But I also don't believe it's simply remixing code samples from its training set. It's somewhere in between. I don't know the internals, but it looks to me like it's operating at a few layers of abstraction over literal code and syntax. It finds patterns in the relationships between symbols and references. Not to mention, since it's GPT-3 based, it's cross-referencing the "meaning" in these abstract relationships with the meaning of the plain English text written in comments, too.

These pieces are similar to humans. Just like a human, Copilot doesn't have a ton of exact literal code in its model. It's seen lots of code and has patterns and relationships in its model. That's why, just like a human, it can translate ideas between different coding languages. My guess would be that it can write an algorithm it's seen in C in F#, even if it's never seen that algorithm in F#, but that's hard to prove. Just like a human, though, it might have some literal snippets. It can definitely translate between languages; I've done that a few times. (It can even translate human languages, e.g. English to French! I've done that sometimes for fun.)

I would highly recommend giving the trial version a go, if only to better understand how it works. Whether it "understands" e.g. a map operation... that's hard to prove. Can you think of any experiments? I think it understands the relationship between the English word "map" and the code patterns often associated with that word.
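To make that last point concrete, here is the shape of such an experiment (a sketch in Java; the completion shown is one plausible outcome, not a recorded Copilot output):

    import java.util.List;

    public class CommentPrompt {
        // Prompt: just a signature plus an English comment. The question is
        // whether the word "map" reliably pulls in the map-shaped code pattern.

        // map each name to its length
        static List<Integer> nameLengths(List<String> names) {
            return names.stream()
                    .map(String::length)   // a plausible completion
                    .toList();
        }

        public static void main(String[] args) {
            System.out.println(nameLengths(List.of("Ada", "Grace"))); // [3, 5]
        }
    }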
I think your example is close. Copilot is a lot like an inexperienced developer. It doesn't (usually) make syntax errors, but because it doesn't understand your problem space, and because it doesn't have as many layers of abstraction as a human does, it does sometimes make silly mistakes. I definitely wouldn't trust it to write an entire program on its own! But with a human in the loop doing the more complicated abstract pieces of coding, it handles the simpler, menial pieces pretty well!
If you have access to it, maybe experiment with using snippets from Windows code: find Wine or Windows code on GitHub, copy the start of a function that is pretty unique, and see if it completes it as the original or not.
Another experiment: test whether it just repeats text it has seen and really has no idea about the stuff. I would use misleading variables, like:
    int namesList;        // named like a list, actually an int
    int[] counter;        // named like a scalar, actually an array
    String i = "test";    // named like a loop index, actually a string
add some comments or unrelated stuff here, like "print hello world 12 times",
then start with for( int ... and see if it completes it correctly or does something stupid with three or more errors. If it is a bit smart, it will know which variable is the list, because it associates the [] with a list.
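Putting the whole experiment together, the prompt file might look like this (my assembled version of the above; the declarations are deliberately misleading):

    public class MisleadingPrompt {
        public static void main(String[] args) {
            // Deliberately misleading declarations: each name suggests one type,
            // the actual type is another.
            int namesList = 7;          // named like a list, actually an int
            int[] counter = {1, 2, 3};  // named like a scalar, actually the array
            String i = "test";          // named like a loop index, actually a string

            // Unrelated noise, as suggested above: print hello world 12 times.
            for (int j = 0; j < 12; j++) {
                System.out.println("hello world");
            }

            // Now type `for (int ...` and stop. A completion that iterates over
            // `counter` (the only [] in scope) suggests it keys on types; one that
            // loops over `namesList` or reuses `i` as an index suggests it is only
            // pattern-matching on names. One plausible "smart" completion:
            for (int k = 0; k < counter.length; k++) {
                System.out.println(counter[k] + " " + namesList + " " + i);
            }
        }
    }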