
Either way, you’re sending your company’s biggest asset to another company, aren’t you? I’ll try these tools when they can run locally.



No, or no company would be able to use it. As you type, fragments of code are sent and then discarded after use. You need to trust Microsoft to actually do the discarding, but contractually they must, and you can sue them if they accidentally or deliberately keep your code around or otherwise mismanage it.


They are obligated to give data to the government, and the government took part in spying in Brazil on behalf of Boeing in the past, but I guess they use this capability only for a few strategic companies, and most companies are not that.


> the government took part in spying in Brazil on behalf of Boeing in the past

Do you have more details? Please elaborate.


But that is naive, isn't it? Who has the money and time in their life to actually sue MS? Even if "you" is a business, few will have the resources for that.


Individuals do not (although a class action would be feasible), but large companies that use GitHub and other Microsoft products certainly have both the means to sue Microsoft and the motivation, should their business be impacted.


Exactly


I sort of disagree that code is the biggest asset. Take the Yandex leak. What can you do with it? Outcompete them?


> Take the Yandex leak. What can you do with it?

Obviously, add it to the big training set of the next code model.


I surely hope they use my copyrighted code and make millions out of it. Ideal case for me to sue them for lots of money.


How would you ever know? It will come in chunks of a dozen or less lines at a time and it will be written into your competitor's proprietary codebase (that you don't have access to).


> GitHub Copilot [for business] transmits snippets of your code from your IDE to GitHub to provide Suggestions to you. Code snippets data is only transmitted in real-time to return Suggestions, and is discarded once a Suggestion is returned. Copilot for Business does not retain any Code Snippets Data.

Likely, some employee would whistleblow that they're not complying with their privacy policy, and either government litigation or a class action lawsuit would ensue. That legal process would involve subpoenas and third-party auditors being granted access to GitHub/Microsoft's internal code and communications history, which makes it pretty hard to hide something as big as collecting, storing, and then training from a huge amount of uploaded code snippets they promised not to.

It's not inconceivable that they're noncompliant, but my bet would be that if they are collecting data they explicitly promise not to collect, it's an accidental or malicious action by an individual employee, and they will freak out when they discover it and delete everything as soon as they can. If they intended to collect that data, it would be much easier to write that into the policy than to deal with all the risk.

Notably, this applies to Copilot for Business, which is presumably what you're using if you are at work.


Couldn't it happen more subtly, without having the code lying around for long? The model could be doing online learning (ML term), and only then would they discard the code they are sent. This means your code could appear in other people's completions/suggestions without it having to be stored anywhere. It is basically learned into the model. The code could appear almost or even completely verbatim on someone else's machine, possibly someone working for a competitor. It would not even be obvious that it is your code, because MS could claim that Copilot merely happened to construct the same code from other learned code.

Not sure that this is how the model works, but it is conceivable.
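
To make the online-learning idea concrete, here is a toy sketch. It is purely illustrative and says nothing about how Copilot actually works: it is a two-weight linear model, and the names `online_update` and `train_on_stream` are made up for this example. The point is only that each sample can be thrown away right after the update while its influence stays in the weights.

```python
def online_update(weights, features, target, lr=0.1):
    """One SGD step on squared error for a linear model; returns new weights."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - target
    return [w - lr * error * x for w, x in zip(weights, features)]


def train_on_stream(stream, n_features, lr=0.1):
    """Consume (features, target) pairs one at a time, keeping only the weights."""
    weights = [0.0] * n_features
    for features, target in stream:
        weights = online_update(weights, features, target, lr)
        # The sample is not stored anywhere after this point;
        # only its effect on `weights` remains.
    return weights


# A single "sent and discarded" sample still moves the model.
weights = train_on_stream(iter([([1.0, 2.0], 3.0)]), n_features=2)
```

After the loop there is no copy of the input to find in storage, yet the weights differ from their starting values because of it. Whether a large code model would reproduce an input verbatim after such updates is a separate question that this toy does not answer.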


Right.

If you are building something truly valuable locally, and it is innovative or otherwise disruptive and relies on being a first mover, centrally hosted LLMs are a non-starter.

Most software corps have countless millions of lines of code. You'd spend lifetimes tracing where someone ripped off your "copyrighted" techniques and methods.

The complete lack of security awareness and willingness to compromise privacy for convenience in people deeply saddens me.


> willingness to compromise privacy for convenience

I have to ask: do you carry a cellphone?


This is not really a valid comparison.

The cellphone is not a compromise for convenience. It allows me to make a living, provides internet connectivity, and lets me keep in contact with friends and family. Without it, my freedoms would be drastically diminished.

With software we develop with, we have choices. We can use OSS. We can try to use open hardware. If we are working on sensitive things, we can use an airgapped system with vim.

When you practice these kinds of routines, they are not a burden. Actually, using vim instead of something like vscode increases productivity eventually. It does take a little bit of time.

When we couple our productivity with centrally hosted services, we greatly diminish our freedom to be productive on a wide range of problem areas. I don't say this to brag; it is to maximize all of our freedom.

In my view, most of us SHOULD be working on "sensitive" things. There is so, so much work to be done for the cause of freedom and liberty in software. We need to reserve that capability in ourselves; we cannot let nameless people have inside access to our expression.


A cellphone literally tracks your every move. If that's not a privacy concern then I don't know what is. Maybe a device with a microphone that's constantly on you. Oh no wait, that's also a cellphone.

I was born in the 70's, and I can tell you, you can survive just fine without a cellphone.

All of what you describe can be done on a desktop. But hey, if you want to compromise your privacy for some convenience, that's your choice.


Are you going to carry your desktop into the forest on a hammock and work? How about on a plane to other countries?

Will you carry your desktop in your car while living on the road? In the middle of forests and on top of mountains?

Will you work from a campsite with your desktop while not connected to the internet?

Can you have a meeting via your desktop from a rocky beach and no internet service?

A cellphone can't track what I type on my laptop, and it can't read encrypted comms my laptop makes to remote systems. I can put a cellphone in a distant location and use a portable, open source router with a VPN on the router, with encrypted, private DNS.

Not everyone lives inside a comfortable little box. There are all kinds of ways to do life.


Sure, if you are willing to compromise your privacy, which you clearly are.


In these companies, people are not permitted to carry their cellphones into the workspace.


I was talking about someone stealing my codebase.

As for integrating the code into the LLM: others get the same benefit that you are getting, so I don't really see the issue.

So you can either develop everything on your own, or you can leverage LLMs, helping both yourself and others.



