The Good and the Limitations of Github Copilot (hrithwik.me)
110 points by hrithwikbha on July 5, 2021 | 139 comments



> Can help you with Email Validation and API Calls

It generates a nastily complex regular expression that is hopelessly wrong. Visible at https://www.youtube.com/watch?v=9Pw-Roo_duE&t=404, here transcribed:

  /^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$/
For the local part, it requires [\w-\.]+, which excludes many valid characters like everyone’s favourite, +.

For the domain part, it tries to allow IPv4 addresses as well as normal domain labels (not IPv6 addresses, though), but it ends up tangled in a way no human would ever write it, allowing things like [12.34.56.com], [987.654.321.000, example.com] and example.123], while disallowing things like example.studio (the last label only allowing 2–4 letters) and IDN TLDs (which start with xn-- and must allow hyphens and numbers, not just [a-zA-Z]).
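For illustration, here's a quick check of that pattern against a few of the addresses mentioned above (a sketch in Python; the pattern is transcribed from the video, with the hyphen in the local-part class escaped so Python's re module accepts it):

    import re

    # Pattern transcribed from the video; the hyphen in the local part is
    # escaped ([\w\-.]) so Python's re module accepts the character class.
    EMAIL_RE = re.compile(
        r"^([\w\-.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))"
        r"([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$"
    )

    tests = [
        "user+tag@example.com",    # valid address, rejected: '+' is not in the local-part class
        "someone@example.studio",  # valid address, rejected: last label limited to 2-4 letters
        "x@[12.34.56.com]",        # nonsense, accepted
        "x@example.123]",          # nonsense, accepted
    ]
    for addr in tests:
        print(addr, "->", "match" if EMAIL_RE.match(addr) else "no match")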

The author makes no comment on how hideously bad it is, which makes me suspect he didn’t notice, which… yeah, shows the problems of the whole thing.


Hey Chris I am the author.

When I made the video I didn't really look closely at the regex code, since I am really new to regex, but in the conclusion of my video I did mention that most of the code is not efficient.

Your comment was a great learning experience. Thank you.


This is exactly the problem. The regex issue isn’t that it’s not efficient, it’s that it’s wrong. Using this tool to generate code in a problem area you are not qualified to double-check and validate yourself is dangerous.


> Using this tool to generate code in a problem area you are not qualified to double-check and validate yourself is dangerous.

I would like this message to be amplified as much as possible. Never write code you do not understand. I am excited about Copilot, but also wary of the programming culture these tools will bring in. Businesses, especially body-shopping companies, will want to deliver as much as possible using tools in this category and will end up shipping code with disastrous edge cases.


Isn't "Code you don't understand" the definition of AI/ML?


Zing! But well, it depends on the algorithm. Some aren't that complicated to understand, like linear regression. Others, like DNNs, are basically impossible. But with ML you're at least always testing the code you don't understand in the process of training the parameters. That's better than the minimum effort when using Copilot code. And many will just make that minimum effort and release untested code they don't understand.


Well, I think this overestimates people outside the HN echo chamber again. Most senior ML people we see in big corps have no clue what they are doing: they just fiddle with knobs until it works. They would not be able to explain anything: copy code/model, change parameters, train until convergence, test for overfitting. When AutoML started gaining ground I hoped they would be fired (as I do not think they are doing useful work), but nope: they have trouble hiring more of them.


I'd say that's "Code you (should) understand doing things you can't understand (and possibly can't audit)."

The art and practice of programming didn't change much over the last 50 years. 50 years from now, though, it will be utterly unrecognizable.


> Isn't "Code you don't understand" the definition of AI/ML?

We don't need to understand the process to evaluate the output in this case. Bad code is bad code no matter who/what wrote it.


No. You could use copilot to generate code you do understand and double check it before committing. It’s similar to just copying and pasting from stack overflow.


I think there are a scary number of programmers (this is their job and they get hired: often they are already seniors) who cannot explain what they copied or even wrote themselves. I have asked guys with 5 or more years of job experience why they wrote something, and I get 'because it works'. Sometimes I have trouble seeing why it works, and usually it means there are indeed those disastrous edge cases. Copilot will make this worse and has the potential to make it far worse.


There are two problems:

1. generating code in the problem area (email address validation) which is pretty much a classic 'things programmers believe about' domain - https://haacked.com/archive/2007/08/21/i-knew-how-to-validat...

2. generating code in a programming idiom with which you are unfamiliar - which regex, as a DSL, is also a pretty classic example. I don't think most programmers are good at regex; I know I'm definitely in the 'now you have two problems' camp when it comes to regex.

So to summarize: it generated code written in a way the programmer could not follow well enough to tell what it even claimed to be doing, using a technology that many programmers are not especially good at; and it generated code that did not handle the problem domain correctly, a problem domain most programmers don't actually know that well either.

The more I think about this thing, the more disastrous it seems.


Aye - the biggest risk here is that rather than pulling in some standard lib for validating email addresses, an engineer may use the Copilot suggestion and validate it against a few simple test cases.

Curiously, an attacker could probe services for use of the invalid suggestions that Copilot generates...


This seems a bit overblown. If someone’s using GitHub Copilot to write code in a place it could be dangerous without any kind of quality control, then the odd wrong regex flag is the least of their problems.


Very well put. Thank You.

I know Copilot is in alpha and will improve 100x, but you will still need someone qualified to double-check.


ML applications face a frightening problem of diminishing returns on investment. The first prototype often happens in days or weeks, the next iteration months, after that years.

It's more analogous to clearing a foundation for a house by progressively picking up the boulders, then the rocks, then the grains of sand one at a time.


I don't think it's possible for copilot to improve on this problem. It doesn't actually understand the code, it's just statistical models all the way down. There's no way for copilot to judge how good code is, only how frequently it's seen similar code. And frequency is not the same thing as quality.


>validate yourself is dangerous

You are taking yourself too seriously.


Copilot acts like a search engine, you search, you find, then you judge. It was never the case with search engines that you could just copy some code you found without verifying it. Also, it has the same copyright problems as if you used Google to find the code.


Nice theory. Won’t work out in practice, because this produces code that will run, and it’s AI, so it must be good, right?

When you found code on the internet, it was presented in a context that let you make better judgement (e.g. on Stack Overflow this regular expression would have had a score of roughly −∞ and multiple highly-voted comments saying “do not use this, it’s catastrophically bad”), and where you have to put in more effort to plug it in and shuffle things around a bit as well. With Copilot, you get given ready-to-go code without any sanity checking at all.

See even how, a few seconds later in the video, the author does test it out—but not thoroughly enough.


Decently puts my feelings towards this whole thing

Aside from the beaten horse concerns like licensing... I worry about the training we're giving ourselves and future generations

The upfront presentation of 'suggestions' skews the perception; a fair bit of the 'no warranty guaranteed' mindset comes from having to go dig the code up yourself.


I think Copilot should report the matching source URL to allow the user to visit the page and see the context and license. This move would also placate some copyright questions because it would be like searching StackOverflow or Github for inspiration.

The problem of content attribution (exact and fuzzy match) has been studied before under the task of plagiarism detection for student essays. Funny thing is that a plagiarism detection Copilot would also disclose past cases of copyright violation and cause attribution disputes because code sitting unchecked in various repos would suddenly become visible.


> I think Copilot should report the matching source URL

That's the problem. The output of a generative model like Copilot usually can't be traced directly back to a single input.


If you can't trace the source then it's transformative use. If it matches training data then it needs to report the source like a search engine and place all responsibility on the user.

And fuzzy code matching could be easily implemented by using a model similar to CLIP (contrastive) to embed code snippets.
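Roughly what that could look like: embed each snippet, then do a nearest-neighbour lookup over the training corpus (a sketch; embed() here is a hypothetical stand-in for a trained contrastive encoder, the bag-of-tokens hash is just a placeholder):

    import numpy as np

    def embed(snippet: str) -> np.ndarray:
        # Hypothetical stand-in for a trained contrastive (CLIP-style) code
        # encoder; a real system would use the model, not a token hash.
        vec = np.zeros(256)
        for tok in snippet.split():
            vec[hash(tok) % 256] += 1.0
        norm = np.linalg.norm(vec)
        return vec / norm if norm else vec

    def nearest_sources(generated: str, corpus: dict, k: int = 3):
        # Return the k corpus URLs whose snippets are most similar to the output.
        g = embed(generated)
        scores = {url: float(embed(code) @ g) for url, code in corpus.items()}
        return sorted(scores, key=scores.get, reverse=True)[:k]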


> If you can't trace the source then it's transformative use.

That's not how "transformative use" works.


A list sorted by probability would be better than nothing. Would a generative model like Copilot be able to provide that?


Not easily. It's a rather opaque process.


And besides, let's get real, nobody would look at the links - it defeats the purpose of copilot.


> Copilot acts like a search engine, you search, you find, then you judge.

OK, but that’s not how GitHub positions it:

“Your AI pair programmer” “Skip the docs and searching for examples”

They literally say it’s not a search engine!


Does GitHub Copilot tell me which license the code it suggested has? If not it's a huge difference to code search engines.


Copilot is not a search engine. It synthesises code so it is unlicensed.


The jury is still out on the licensing issue. And given it can, at least sometimes, output a verbatim copy-paste, including comments, of well-known GPL code[0], well, let's just say the issue is not clear-cut.

[0]: https://news.ycombinator.com/item?id=27710287


Copilot's API and suggestions could easily be (and maybe were?) implemented as an SBQA-style model: using a search engine to find promising examples/context, followed by a transformer model to synthesize the final output.

Attribution would clearly be required in such a search derived model.


> Also, it has the same copyright problems as if you used Google to find the code.

No, it has the same copyright problems as if you Google and instead of getting links to sites that host code and licenses, you get just the code.


It's even more gray than that.

Regardless of the licence, does the produced code even qualify for copyright protection, or does it fall under fair use?

What licence, if any, is there for unique code generated by Copilot, etc.?

It's a great big ball of who-knows. However, I expect that, noting you're only getting snippets, you would be highly unlikely to get code that doesn't fall under the fair use provisions. That said, IANAL.


So if we classify AI as search.. and then claim fair use.. we can launder dirty viral code.. how could this go wrong?

But really, if thispersondoesnotexist is just a really good per-pixel search against a corpus of human faces where each “page” of result pixels is organized in a grid presented as a new image with its own metadata..

I mean I guess Google really was an AI company all along.


>Copilot acts like a search engine

No, it doesn't. If my understanding of it is correct, it's an autoencoder, then a few more bits of AI. The MNIST dataset is a collection of handwritten digits used in many early machine learning classes. Usually they are used to train a classifier, which returns the correct digit given an image. They can also be used to train an autoencoder, which will take an image in, compress it down to far fewer channels, and put out an image that quite closely matches the original.

Once you have an autoencoder, it is easier to input data and train a neural network to do something with the compressed output. There is no way the autoencoder knows which samples were used to generate the resulting output; it's just optimized for compression.
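For reference, an MNIST-style autoencoder is just this kind of squeeze-and-reconstruct pair (a minimal, untrained sketch in PyTorch):

    import torch
    from torch import nn

    # 784-pixel digit image -> 32-number code -> 784-pixel reconstruction.
    encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
    decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784), nn.Sigmoid())

    x = torch.rand(1, 784)                # stand-in for a flattened digit image
    reconstruction = decoder(encoder(x))
    # Training minimises the difference between x and reconstruction; nothing in
    # the 32-number code records *which* training samples shaped the weights.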

Thus, Copilot isn't search. You could take the entire corpus it was trained on and log all the compressed outputs. You could then take a given output, before the autoencoder expands it back out, and tell which few source code fragments were closest, but there are no guarantees.

TLDR; A far closer analogy: Copilot acts like a Comedian who has stolen a lot of jokes, and can't even remember where they came from.


Just like a real copilot in a car or an airplane shouldn't be trusted? Perhaps they should choose a different name then.


Search engines give you a link where you can (usually) see the code in context, who wrote it, when, license, etc. And often more, like who is using it where, how often it's updated, contact info, test suites, and so on.


> same copyright problems

Is this true? From what I remember reading, the code was uniquely created (but I could be wrong). If that's the case, then does it tell you what license the generated code is under?


"Email address validation cannot be solved adequately with a regex" is something we need to start teaching somewhere. The RFC spec for emails is just way to permissive to make validating email addresses a winning move.

I swear I wind up having a battle over email validation at every company I go to. There is inevitably a business person that says "Well what about this site, they do it" and then I have to dig into whatever that site is actually doing and likely find a valid email address that breaks their validation to prove it.

And probably some junior dev (or senior who swears they did email validation flawlessly somewhere else), and same story: I have to break their regex a bunch with valid emails they don't permit.

And of course then it's an uphill battle convincing them that what I'm using are in fact valid email addresses. Or you get the "Well no one ever actually does weird things in their email addresses so it's fine" or "gmail doesn't let me register that address so you're wrong"

Email is annoying.


I like the HTML spec’s version, used for <input type=email> validation: https://html.spec.whatwg.org/multipage/input.html#valid-e-ma.... It allows all realistic inputs except for IP addresses (of dubious realism) and email address internationalisation (internationalised domain names are supported, but the local part is still currently stuck with ASCII, which is in keeping with its still-quite-limited support, though https://github.com/whatwg/html/issues/4562 progresses in fits and starts).


That's a good approach honestly.

I tend to just check for an @ and call it a day, validate it by emailing it and giving them a link to click if I need them to.

It's really the only way to ensure it's a valid and active address. People just don't want to build it.
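The whole up-front check then fits in a couple of lines (a sketch; send_confirmation_email is a stand-in for whatever mailer and click-to-confirm flow you already have):

    def send_confirmation_email(address: str) -> None:
        # Stand-in for your actual mailer + click-to-confirm token flow.
        print(f"confirmation link sent to {address}")

    def plausible_email(address: str) -> bool:
        # The only structural check worth doing up front: something@something.
        local, at, domain = address.partition("@")
        return bool(at) and bool(local) and bool(domain)

    def register(address: str) -> None:
        if not plausible_email(address):
            raise ValueError("that doesn't look like an email address")
        # The real validation: can mail actually be delivered and confirmed?
        send_confirmation_email(address)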


> The author makes no comment on how hideously bad it is, which makes me suspect he didn’t notice, which… yeah, shows the problems of the whole thing.

The author's comments on that "reverse" function are equally bad - https://youtu.be/9Pw-Roo_duE?t=171

It's described as "efficient" but it calls `len` on an unchanging list in 3 places.


That’s actually roughly len(arr) times, not three, since two of the calls are inside the loop.

But the far bigger red flag there is that it doesn’t just use arr.reverse(), which does the same thing and is typically 8–10× as fast in some simple testing (assuming a list), or arr[::-1], which makes a shallow copy rather than modifying the object in-place.

This matches what I’ve been seeing in code examples: Copilot likes to implement things from scratch rather than using libraries or even standard library functionality.

It’s possible that the word “array” tripped it up here and that it would have done something saner had it been told “list”, but I doubt it. (Python’s built-in array module is very seldom used; if you talk of arrays, you’re probably dealing with something like numpy’s arrays instead. But it’s far more likely that the built-in list type was what was desired here.)

There’s also one other significant point of bad and dangerous style in the code generated: the reverse function mutates its argument and returns it. Outside of fluent APIs (an uncommon pattern in Python, and not in use here), this is generally considered a bad idea in most languages, Python certainly included. It should either mutate its argument and return None, or not mutate its argument and return a new list.
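Roughly reconstructed from the description in this thread (not the exact code from the video), the generated function and the saner alternatives look like:

    def reverse(arr):
        # Copilot-style version as described here: len(arr) re-evaluated every
        # iteration, a float division cast back to int, and the argument both
        # mutated and returned.
        for i in range(int(len(arr) / 2)):
            arr[i], arr[len(arr) - 1 - i] = arr[len(arr) - 1 - i], arr[i]
        return arr

    arr = [1, 2, 3, 4]
    arr.reverse()      # idiomatic in-place reversal, returns None
    copy = arr[::-1]   # shallow reversed copy, original left untouched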


> That’s actually roughly len(arr) times, not three, since two of the calls are inside the loop.

You're right! I noticed it a few minutes ago and changed the wording accordingly. Thanks for pointing out how shockingly bad that algorithm actually is! :D

> (...) the reverse function mutates its argument and returns it.

Yeah, that mutate + return is confusing. It's also worth noting that, as a result of the mutation, the function doesn't work on immutable types like strings and tuples.


It's not shockingly bad to call len(arr) each time through the loop. It would be if we were talking about something like strlen() in C, but in Python it's just one more constant-time† operation each time through the loop, and not a very expensive one at that. Caching it in a local variable would still be better.

______

† Nothing is really constant-time in CPython, but it's pretty close.


Well, to be fair, the `len` operation on lists in Python is a constant time operation. What makes the example particularly bad is using a cast on the result of a floating-point division, rather than just using Python 3 integer division (i.e. the `//` operator). Copilot was clearly just spitting out Python 2 code here.


I think the difference is that redundant re-calculation is a code smell in any language, at all times, whereas the float/int division issue requires knowledge of Python 2/3 syntax quirks.


It's context-specific whether recalculation is a code smell. I've definitely gotten feedback from senior developers to just do len(x) repeatedly instead of "littering" (their words) the code with `xCount=len(x)` assignments (because they knew len was constant-time).


Sure, it makes sense if you find something like `len_x` or `x_count` significantly less readable than `len(x)` and if you don't have to worry about resources. Can't say I've been there though.

I did a quick test of this `reverse` function (which probably shouldn't exist in the first place) and, unsurprisingly, it became ~30% faster when `len(arr)` was only called once.


When I'm that resource constrained, I don't use Python. The changes necessary to make regular Python code performant defeat the goal of making it readable.

There's a reason NumPy's innards aren't Python code.


Both sides of this "readability vs. performance" debate can be argued ad absurdum but that's not my intention. All I know is that I try to conserve resources no matter what level of the stack I'm working on and those generated snippets certainly don't!


Indeed, but they're not intended to. Nothing about Copilot's specs even claims it's generating optimally-performant code. And the Python example indicates why that may not even be a desirable output (most performant and most readable are orthogonal metrics).


> It generates a nastily complex regular expression that is hopelessly wrong.

> [...]

> The author makes no comment on how hideously bad it is[...]

I mean, it's coming up with a solution that's about as good as the average programmer who's going to validate E-Mail with regexes would, so as a crowd-sourced machine learning solution it's not too bad if you think about it.

In other words, having a co-pilot doesn't mean you're guaranteed to get Chuck Yeager.


Here’s the difference, IMO: an average programmer with some experience would seek out a well-tested and widely used library to help them do something like this, not use sausage meat spat out of a cannon to validate email addresses.


I’d hope so, but then again, this is what their ML model spat out, which suggests people have written stuff like this, though hopefully not the wonkiness around the bit after the last dot.

But on the brighter side, the material the user provided to Copilot in this case was pretty much “I want to implement email validation from scratch” rather than “I want to validate an email address”, which is where hopefully people would look more to existing libraries. And they’ll commonly already have such libraries or functions in their code base, e.g. under Django you’d use… uh oh, searching found https://stackoverflow.com/q/3217682/ first which looks frighteningly familiar here in half of the errors it contains; but anyway, you should use https://docs.djangoproject.com/en/3.2/ref/validators/#emailv.... I suspect the Copilot approach as used will be unintentionally biased much more towards boilerplate and implementing things from scratch, rather than using libraries.


>about as good as the average programmer who's going to validate E-Mail with regexes would

IMHO, the average programmer is not even aware of regexes, which is a problem, but here it would lead the programmer to a simpler-to-read solution.

You need to be at a very particular point (good enough to be proficient with regexes, but bad enough to use them for everything and bad enough to not test your regex), and I strongly doubt that's the average.

P.S. The video question was to validate emails, not 'validate using regex'.


Proper email validation: Contains '@' and a message sent to the address is delivered.


This has been discussed at length several times before and the answer to why you use a regex like that to validate emails is because you aren't trying to validate against a standard, but against a subset of email formats. You want a "normal" simple email. No "+" addresses, etc. Because it doesn't matter if you annoy the 0.01% of your users that would be negatively affected by that; it's better to have their simple/canonical emails.

On the right side of the @ I agree completely, e.g. you must allow longer TLDs. The regex is shit. But the same "simplification" thing would apply for IPv4: I'd probably want to have 4 groups of {0-9} even if a valid IPv4 address could be written in a lot more creative ways than that. The normal/simple/canonical way to write the address is a smaller scope than the set of allowed ways.

The regex to parse any valid email and the regex to parse the info I want, (perhaps from the user subset I want!) can be very different.

Edit: don't shoot the messenger - there is just zero chance you want a DB that is more likely to contain user errors and has less valuable emails in it.

"Enter your email" doesn't mean "Enter a string that can be considered valid according to the RFC"!

No one cares whether "foo/baz=frob@example.com" is a valid email address or not. In some cases you want RFC-compliant addresses, in which case it's a perfectly good idea to parse strictly to the spec. But in most cases you want a user identifier you know you can also contact with 100% certainty using some email SaaS. Or one you can cross reference to some other source. That's strictly a different purpose than parsing RFC-compliant addresses. And allowing "foo@bar"@baz.com is just not a good idea.


The point being missed here is that no algorithm can tell you whether a string is a valid email address, because that's not a property of the string in the first place, it's a property of the world.

Having a fairly low entropy Gmail address, I get an intermittent drizzle of messages of the form of 'thank you for signing up to Acme!' whose content is such as to make it clear that an actual Acme customer typo'd their email address, and Acme thought you could validate it by checking the form of the string.

The only way to validate an email address is to send email to that address, asking the person behind it, are you the one who just signed up for Acme. And once you are doing that, there is no point checking the string for anything other than containing an @.


> The point being missed here is that no algorithm can tell you whether a string is a valid email address, because that's not a property of the string in the first place, it's a property of the world

"Valid" can mean at least 3 different things:

a) Conformant to a spec

b) Can actually receive email

c) Looks like a nice, simple "canonical" standard email address.

If you validate to the RFC (a) you still might fail b) and c). (The value of c is debated at length in a separate subthread, but let's just say that there are more or less shady reasons why this is often a business goal.)

Since you'll probably validate b) anyway, the validation of a) or c) is just a convenience, because validating b) isn't instant. So you validate to prevent errors and frustration. The question is merely: do I as a business want to have an address with quotes, spaces and backslashes in it, in my database, just because it's possible according to the specification?

> And once you are doing that, there is no point checking the string for anything other than containing an @.

I think there is a legitimate case for a service to simply think "I'd rather lose the business of 1 customer out of a million than worry about backslashes in email addresses". It's not user friendly, and it's not "correct", but it's one of those "good enough" scenarios.


How is it better to have "simple" emails?


This is usually done to catch what they think has a high chance of being a user error, even if otherwise valid.


And what does this solve? A user mistakenly entering an extra plus sign gets a validation error, while a user mistakenly entering an extra alphanumeric character (far more common) does not.

Maybe they could ban addresses with `q` as well. Most people's names don't contain Q, so it might be a user error.


It depends on the use case, but for some business use cases you, e.g., prefer access to the end user’s default/canonical inbox and not a specific one the user can use to filter, etc. It’s also much less likely to cause downstream problems like distribution issues, rejection by spam filters, and so on. Annoying a tiny fraction of users or losing their business just isn’t a big enough issue to matter.


The only use case you're describing is being able to send spam or sell the user's information to third-parties while preventing the user from identifying that, or from filtering e-mails coming from your service.

Those are not legitimate uses.


Being able to notify users, limit erroneously input emails, or cross reference to existing databases (which invariably have the same kind of validation) are legitimate use cases.

Even if I have no interest in selling emails, it's still a net benefit if leaked data (e.g. after a breach) isn't full of bob.smith+mycompany@... rather than bob.smith@


> Being able to notify users

Using + emails doesn't prevent that from happening.

> limit erroneously input emails

BS excuse. That's what email confirmation, confirmation links, SMTP inbox validation, etc., are for.

> cross reference to existing databases

Not being able to be cross referenced is a feature for the user, not a bug.

> Even if I have no interest in selling emails, it's still a net benefit if leaked data (e.g. after a breach) isn't full of bob.smith+mycompany@... rather than bob.smith@

It's a benefit for the company, not for the user. I'd prefer to know which company leaked my email.


> It's benefit for the company, not for the user.

Yes. Absolutely 100% agree. What I'm arguing is: if you are ready to annoy a tiny fraction of your users, you will get away with a simpler validation, that is better FOR YOU AS A COMPANY, because it has some benefits ranging from shady to just half-shady. This is why companies do this. Not just a small share of them, and not only because developers didn't understand the RFC.

I'm not arguing this is in any way good for end users. I'm saying it can be a good idea despite being horrible towards some users.

You keep arguing from the users' perspective when I'm saying "This is being an asshat to users, but it's worth it." The argument "That's bad for users!" isn't a counterargument to that.


Sure. But my whole point is that all the other reasons are not legitimate: the whole point of blocking + is to cover the company's ass; there are no legitimate potential issues with notification, or even with preventing errors. It's just BS rationalisation.


One problem is the lack of context: unlike Stack Overflow, you don't get additional info from other users who tried the same code.

The only feedback Copilot receives is whether you keep it or not, you can't tell it a few days later that it wasn't a good fit after all (whereas you can comment on a Stackoverflow answer).

In its current form, it amplifies bias whereas code needs accuracy.


That seems like an additional feature. Maybe append some comments to caveat the code snippet


A fun issue I keep hitting with Github Copilot in Python is that it's a coin flip whether it will give me a Python 3-style print statement or a Python 2.7 style print statement.


Given that no one should be _actively_ writing 2.7 code, and that GH is trying to monetize this, I would think they could retrain the main model to exclude 2.7 code, and then allow people who need 2.7 to "pay for supporting 2.7", just like I'm guessing folks stuck on old platforms always do.

You raise a fascinating point about older platforms of other languages, too; Java has a super backward-compat story, but woe be unto the coder who tries to name a variable "enum" nowadays.


This seems to be another side of the problem that it gives code that doesn't actually compile for the target language. They should have a check for that, that should be the baseline.

For me, it sounds like it's closer to dumb copy paste than a smart code generator. AlphaZero wouldn't play a chess move that was against the rules.
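Even a bare syntax check against the target language would be a start (a sketch):

    def is_valid_python3(snippet: str) -> bool:
        # Cheapest possible post-filter: does the suggestion even parse?
        try:
            compile(snippet, "<copilot-suggestion>", "exec")
            return True
        except SyntaxError:
            return False

    print(is_valid_python3('print "hello"'))   # False: Python 2 print statement
    print(is_valid_python3('print("hello")'))  # True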


Most instances of that should be fixable with some post-processing in Copilot. They could maybe even run 2to3 on it?


I think preprocessing: upgrade the input data before training.


Then you have to trust that not only the original code works, but also the preprocessed upgrade. What could go wrong...


Could Copilot integrate linters? I.e. it doesn't suggest code that doesn't pass lint? That'd get rid of this, at least. I love linters so much...


Same lol. Most of the time I get Python 2 print statements.


> Developers with experience can definitely handle this out but what if a newbie directly starts with the help of AI, he will spend more time on stack overflow than writing actual code. (oh wait that's how most of the developers are lmao)

Spontaneous Ask HN: How much truth is there in this "devs be copy-pasting from SO all day" trope?

Personally, I have used SO quite heavily in its early years, circa 2010-2014, including posting my own questions and sometimes posting answers to others' questions. But now, I don't use it actively anymore. Sure, when I search for a concrete question and SO happens to be in the search results, it's sometimes a valuable resource. But it's not the go-to for programming questions that it once was for me.

I'm honestly not sure if that's indicative of my own growth as a developer, or caused by outside factors. I have a vague feeling that developer documentation in general got better in the last decade, at least for the technologies that I'm using... but then again it could be also a sign of personal growth that I'm more comfortable with the upstream documentation. Finding answers for webdev questions on MDN is another sort of game than finding answers for webdev questions on SO, after all. What do you all think?


Based on my own experience: back in the day when I first started programming, in PHP, I copied a lot of code from other commenters on PHP.net (their online manual allows others to post code underneath the main docs). So I expect some new programmers to do the same, but with StackOverflow.

Of course, I stopped doing that once I got hold of the real knowledge. In fact, now when I look back at the code that I copied from others, it's just like watching a horror movie, and I'd rather rewrite the whole thing on my own terms.

I guess the fairer statement would be "some programmers copy that code to 'get started'".


Honestly I just ditched the idea of Copilot the moment it spewed out an entire file of copied code.

Nice work on making a Markov bot with extra steps, GitHub. Please, do take my money...


It never did that. It got the license wrong so it wasn't an exact copy...


Until a court decides what it does and doesn't do I'm staying clear

Here's it outputting Quake code, including handy comments it came up with for each line and even an entire line of commented out code. Maybe it decided it was a good choice to comment it out but still include it I guess

Being word for word from the original is just a weird coincidence too

I truly wanted it to be as good as it was sold to us too, but it isn't

https://twitter.com/mitsuhiko/status/1410886329924194309

HN discussion at https://news.ycombinator.com/item?id=27710287

Additionally here's it somehow requesting and using API keys for use in your code https://twitter.com/passcod/status/1410822834272694275


It didn't copy the entire file for Quake - it got the license wrong (which is even worse!). That's what I said in my original comment!


Ah I see. I think my perception of your point was altered by you having been downvoted by someone making the text grey (can't read text = bad comment). Sorry about that.

I'll try to avoid letting downvotes bias me in future


Yes; that's even worse.


This is the worst take I've seen so far.


Could you explain why or do you just want to feel superior? Both good choices honestly but I'd love to know your actual thoughts

If I'm wrong tell me why and I'll happily reevaluate my opinion, promise :)


Well, my understanding is that your point is that it's not useful and only regurgitates code it already saw. That's super false; it's a useful tool that generates code in a very context-sensitive manner. Not always perfect, and in alpha, but still very useful.

The full copying only results from people actively probing the model to output copies of the code, and they made it work for very famous code that's been copied around github a bunch of times. That doesn't make it a non-useful piece of software.


> that generates code in a very context sensitive manner

That's why I compared it to a Markov chain, aye.

In any case I've come to the conclusion my hater attitude is fueled by disappointment, so maybe it was the worst take from this whole thing. I'll avoid future copilot threads


Markov chains model n-grams... this is really much better.


Yes I was trying to be insulting at the time, which was wrong


Are you aware co-pilot is still in preview?

It's a bit harsh to make sweeping statements along the lines of 'it's just a fancy markov bot' based on a few well-publicised glitches in a technical preview.

I assume you have built something surpassing the scope and ambition of Copilot before, and are not just some armchair tech lead throwing shade.


Fair point on the sweeping statement, I'll rein it in

Thinking about it the main factor is an emotional one. I'm disappointed. Butthurt if you will. It sounded great but it tripped over so far away from the finish line that I've turned against it. I'll excuse myself from any further copilot threads

Of course I can never meet the requirements of your post wanting something more impressive than Copilot. Copilot itself falls far short of that. Nothing I give you will be enough, as there'll be flaws you will attack to make your point. Bit of a time sink, that; let's just assume you're right :)

No, I've never successfully built anything as ambitious as, and definitely nothing surpassing, copilot. Have I tried? Absolutely. Have I failed? So far yes.

Of course by that logic though I still win this conversation if you've not succeeded in making anything more ambitious than my failed projects, is that correct? :P

For the sake of not wanting to come off as blagging (also I want the holes poked in this one tbf) my most ambitious project I've not figured out how to make work yet is a new (afaik) type of business model: cohan.me/profit-share


A week in and I'm already bored of Copilot submissions... It's not perfect. What do you expect?


Not being perfect would be acceptable, but at this point it looks like something with the potential of setting back software quality and security [1], with the added bonus of also breaking copyright/licensing.

[1] I understand that everyone should review code before committing, but even if my team and I do it, there's no way all the proprietary and open-source software I use has teams doing the same. That's why I worry for my security.


There are two types of products: the ones that people complain about and the ones that nobody uses or cares about; based on a famous Bjarne Stroustrup quote.

The fact that people post about it, point out flaws, leave GH out of spite, etc. only goes to show how impactful a system like this can be. It's a big deal, which is why people talk about it.


That it's perfect.


GitHub Copilot is going to end up being the Google Glass of developer tools.


It could've been a fantastic April Fool's release. I like to imagine that was the original intent and a manager went "eh, release it anyway".


Copilot is going to improve incredibly fast with millions of devs providing ReCaptcha style training data around the clock.


So far the most WTF thing I've gotten out of it is:

    import base64
    test = base64.b64decode("""SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t""".encode())
    print(test)
    # b"I'm killing your brain like a poisonous mushroom"
And the most odd thing:

    # The base URL for all API requests
    base_url = 'https://api.gdax.com/'
    # The base URL for all non-API requests (e.g. static content)
    base_url_static = 'https://static.gdax.com/'
Which are URLs that haven't been a thing for 2 years, I think, and I can't find any code on GitHub that still uses them.


A copilot for commit messages will be as useful as this.


Controlling the generation for code quality will be extremely hard.

The only thing I see is that they could filter their dataset so that some bad proxy of code quality is taken into account, something like the number of stars (which is clearly a terrible metric, tell me if you think of something else).

The idea would maybe be to start by training on all of the subset of GitHub it is ethical and legal to train on, and then filter down to higher code quality towards the end.

Controlling for the time at which the code is emitted would be easier. Something like, retrieving similar contexts, and guiding the model to be more similar to the recent code if there is similar recent code that exists. I'm not sure exactly of how this would be done, but I can see it working.


Our daily GitHub Copilot message give us today.


GitHub should release an ML-powered copyright plagiarism app to spot when code has been copied and suggest to the user the license they need to use.


Skimmed through this "review", nothing wowing me.

> Hey Stephen, If you are reading this, please follow me on hashnode .

God, that's obnoxious...


That's what we call a 'joke'


I wonder how much worse or better simply automating a stackexchange search and pasting the results is...


Searching SO and converting the highest rated answer's code that doesn't syntax error to a Python module is about 150 LoC[0].

[0]: https://github.com/drathier/stack-overflow-import



Horrifying and beautiful at the same time!


Someone did a Stackoverflow auto-complete for JavaScript: https://emilschutte.com/stackoverflow-autocomplete/


You know they are hitting something interesting if all of hackernews gets mad haha


What’s with the editorialized title based on a single sentence in the article?


If they are using my code anywhere I expect to be compensated.


The new model pioneered by AirBNB and Uber is to break laws under the guise of "growth hacking" and then change the laws later once you're entrenched.

You won't be compensated. Corporations write the laws.


If you read the license you put on your project (and/or the GitHub Terms of Service), you wouldn't expect that.


Being credited for your work is a form of compensation. Nearly all open source licenses require proper attribution. Github is just violating the open source licenses of nearly all people that posted their code to Github itself, which is a huge breach of trust.


If it’s anything but public domain, there’s probably at least a requirement for credit.


I accept Google's and Facebook's terms, and I still don't expect to be tracked across the internet to pay for the services they give me. Basically: my expectation is that companies aren't behaving as badly as their terms allow them to.


Huge leap of logic to assume what license another user publishes their code under. A charitable reading here of chovybizzass's comment leads me to the understanding that they are publishing their code under a license that requires compensation if used; this code can be published to GitHub as well. Would it be stupid? Probably. But even more stupid would be to produce MIT-licensed code and then complain about not being compensated.


So does querying Google or Stackoverflow. Looking for “inspiration” should guide you, but you should always be skeptical and not let your brain fall out


The danger is that CoPilot becomes "AutoPilot", because as the author points out, you get used to it way too quickly.

Searching the web still implies extra effort on your side and (hopefully) some scepticism towards the code you find (SO comments, etc.)


I'm all for moving forward, but Copilot just seems to be a bad idea, amplified.

What I want is careful, thoughtful, knowledgeable people, who have learned and honed their skills over years in various areas and can come up with creative, maintainable solutions to complex problems. This is not something you autocomplete. If it were, we could autocomplete 80% of all jobs tomorrow.

I don't want code monkeys on steroids. But maybe I'm too far removed from Silicon Valley.


Is === really necessary for Javascript? Type checking seems like overkill?


It’s a well known rule in JavaScript to default to “===” for everything except checking for null or undefined, in which case “== null” is acceptable.

https://eslint.org/docs/rules/eqeqeq#smart

Even better is to just use TypeScript.


Does typescript really protect one from javascript though?

I imagine it is fine. If it was just myself developing I would be fine, but the insufferable junior devs that learned "=== or die" rear their ugly heads.


Checking for 'null or undefined' is an antipattern anyway, so there's no real valid use of '=='.


You should confirm that the inputs are what you expect them to be. There's almost no time when you shouldn't be checking the type.


Depends how much you care about the equality and how strict you want to be. By default == will do type coercion, so you can get weird results where things like [] == 0 are true.

More equality fun in JS can be seen here: https://dorey.github.io/JavaScript-Equality-Table/


In best practice you should always use === unless you know exactly why you wouldn't for a specific case.


    a == !a
There are several values of a where that yields true (off the top of my head, a = '0' is one of them).

Yes, you do need ===.


If someone doesn't understand this then here's why.

You need to learn the Six Falsey Things In JavaScript. Everything else when cast to boolean will be true. These six things are What and Why and When And How And Where and Who. Wait, no, that's a totally different list of six... anyways

false

undefined

null

NaN

0

"" (empty string)

That's it. Since '0' is none of these, when the negate operator casts it to boolean it becomes true, and negating that gives false. When doing a comparison between '0' and false, the standard says https://262.ecma-international.org/5.1/#sec-11.9.3

> If Type(y) is Boolean, return the result of the comparison x == ToNumber(y).

ToNumber(false) is of course 0. So you are running '0' == 0 which is visibly true...


> So you are running '0' == 0 which is visibly true...

Crucially, there's the extra step of converting '0' to a number.


It's not, it's just one of the tools in the toolbox.



