Hacker News | deanjones's comments

This will fail very quickly. The licence that project owners publish with their code on Github applies to third parties who wish to use the code, but does not apply to Github. Authors who publish their code on Github grant Github a licence under the Github Terms: https://docs.github.com/en/site-policy/github-terms/github-t...

Specifically, sections D.4 to D.7 grant Github the right "to store, archive, parse, and display Your Content, and make incidental copies, as necessary to provide the Service, including improving the Service over time. This license includes the right to do things like copy it to our database and make backups; show it to you and other users; parse it into a search index or otherwise analyze it on our servers; share it with other users; and perform it, in case Your Content is something like music or video."


I don’t see that being “quickly” - they’d have to get a judge to agree that passing your code off without attribution for other people to use as their own work is a normal service improvement. Given that it’s a separate feature with different billing terms, I’m skeptical that it’s anywhere near the given that you’re portraying it as.


"Without attribution" is a condition of the licence that applies to third-parties. It is not a condition of the licence that applies to Github.


It's worth reading the passage in its entirety and how a court would interpret it:

> We need the legal right to do things like host Your Content, publish it, and share it

> This license does not grant GitHub the right to sell Your Content. It also does not grant GitHub the right to otherwise distribute or use Your Content outside of our provision of the Service, except that as part of the right to archive Your Content, GitHub may permit our partners to store and archive Your Content in public repositories in connection with the GitHub Arctic Code Vault and GitHub Archive Program.

If Copilot is straight-up reproducing work, and it is a service that users have to pay to use, then it seems like Copilot is "sell[ing] your content" and thus the license does not apply.

More generally, a court is likely to look at the plain English summary and judge. Copilot is not an integral part of "the service" as developers understood it before Copilot existed.


"as necessary to provide the Service, including improving the Service over time."


You're trying to play desperate semantic games.

"This license does not grant GitHub the right to sell Your Content" is unambiguously clear.


"desperate semantic games" is actually a reasonable description of the legal process :-)

I'm not sure I agree that anything expressed in a legal contract using natural language is "unambiguously clear". MS / Github's expensively-attired lawyers will no doubt forcefully argue that they are not selling YOUR content, but a service based on a model generated from a large collection of content, which they have been granted a licence to "parse it into a search index or otherwise analyze it on our servers". There may even be in-court discussion of generalization, which will be exciting.


This is the standard content display license that everyone uses. Even in your quoted text I don't see any hint that snippets can be shown without attribution or the code license.

It also says they can't sell the code, which CoPilot is doing.

Also, in a very high number of cases it isn't the author who uploads.

Repeating your line of argumentation (which occurs in every CoPilot thread) does not make it true.


It's irrelevant whether it's standard or not. Again, the terms in the code licence (including attribution) do not apply to Github, because that is not the licence under which they are using the code. You grant them a separate licence when you start using their service.

If someone who isn't the author has uploaded code which they do not have a right to copy, they are liable, not Github. This is also clear from the Github Terms: "If you're posting anything you did not create yourself or do not own the rights to, you agree that you are responsible for any Content you post"

It's almost as if these highly paid lawyers know what they're doing.


You grant them a content display license, not a general code license.

> It's almost as if these highly paid lawyers know what they're doing.

Sure, they wrote the content display license long before CoPilot even existed. Any court will see the intent and not interpret these terms as a code re-licensing.


There is no such thing as a "content display licence" or "general code licence". There is copyright (literally, the right to make copies) which broadly lies with the author, who can then grant other parties a licence to copy their content.

I'm afraid I do not believe your legal expertise is so extensive that you are able to accurately predict the judgement of "any court".


> You grant them a separate licence when you start using their service.

And that license explicitly states that it doesn't give them the right to sell your code.


And it explicitly states that it does give them the right to share your code. Copilot isn't selling code; if it were, then GitHub wouldn't let you share the output of Copilot; that would destroy their market. That they allow you to share the output of Copilot with others proves that what they are selling is the service, not the output. The output is, at worst, "shared" code from Github's licensors.


They're selling the service, which is a derivative work of the code.


Which is what the licence is granting them the right to do.


So, it isn't clear to me which of these clauses you are citing grants them the forced right to "Copilot" (which I'm using as a verb to avoid defining what stage of production we are talking about) that wasn't granted by the license of the code, but let's assume for a moment that you are correct: that just means that GitHub as a service makes no sense, right? Like, there are a ton of people using GitHub to develop using code I've published in the past... code which is under various of these example licenses, and which I've never myself (as the copyright holder) published to GitHub (and, in fact, would never as I despise GitHub). There are also a number of very popular projects--such as the Linux kernel--which people not only upload to GitHub but which have official mirrors on GitHub where no party even owns the copyright in order to agree to these terms of service. Meaning, if you are correct, GitHub is often being used illegally and a ton of the source code they are training against wasn't legally provided to them in the first place.


Section D.3: "If you're posting anything you did not create yourself or do not own the rights to, you agree that you are responsible for any Content you post". A lawsuit against Github has no standing for the scenario you suggest, because Github is not at fault.


Ok, so: "that just means that GitHub as a service makes no sense, right?" Like, I feel you simply ignored the core complaint of my comment so you could instead note something about GitHub's potential liability (a thought process I didn't even bring up, though I can see how maybe you decided to bend my final comment into somehow being relevant for that thought). Like, are you simply ceding then that a ridiculous amount of the content on GitHub -- including major projects such as Linux -- is not allowed to be posted to GitHub?


How about codebases that were uploaded to GitHub by someone other than the original copyright owner?

e.g. I can clone the GNU codebase and publish it to GitHub. Clearly I don't own the code and do not have any rights to grant GitHub a license.


Section D.3: "If you're posting anything you did not create yourself or do not own the rights to, you agree that you are responsible for any Content you post". A lawsuit against Github has no standing for the scenario you suggest, because Github is not at fault.


Does that mean if anyone posted my open source project to GitHub, I can file a DMCA takedown request regardless of the license I chose?


Well, I don't think it's a DMCA issue, but it does very much depend on the licence you have chosen. That's what the licence is for, to allow people to use the code that you have copyright of, and to define what they are / are not allowed to do with it.


> Authors who publish their code on Github grant Github a licence under the Github Terms: https://docs.github.com/en/site-policy/github-terms/github-t...

This sounds unenforceable in the general case. How could github know whether someone pushes their own code or not? Is it a license violation to push someone's FOSS code to github because the author didn't sign up with GH?


> Is it a license violation to push someone's FOSS code to github because the author didn't sign up with GH?

It depends on the licence.

It's very much enforceable that companies who provide content publishing platforms will indemnify themselves against people publishing content to which they do not have an appropriate licence.


If that is pretty much verbatim under their terms, then yes the lawsuit is going nowhere.


Also variables.


Leaving aside all of the possible confounders for the difference in severity of Covid-19 infection in men and women, it's not the case that women on average have more body fat than men. So for a "healthy" man and a "healthy" woman, the woman would tend to have more fatty tissue, but more men are overweight. UK figures show that 40% of men are overweight, compared to 31% of women. In the US, 74% of men are overweight, compared to 67% of women.


Those US figures are overweight or obese. The UK figures are likely overweight only. Over 60% of UK adults are overweight or obese [0]

[0] https://www.cancerresearchuk.org/health-professional/cancer-...


The categories of overweight and obese are based on BMI, which is not relative proportion of body fat. On average, of a man and woman who both have a BMI of 30, the woman will have a higher proportion of body fat.


I think both these claims can be true at the same time, can’t they?

To make a toy example just to make the reasoning clear: if men’s BMI was distributed as 50% = 20, 50% = 35, and women’s BMI was distributed as 95% = 20, 5% = 35, the average body fat of all men would be higher than the average body fat of all women, even though the average body fat of women BMI=X would be higher than of men BMI=X for all X.
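A quick sketch of that toy example in Python, in case the arithmetic helps. The BMI splits are the ones given above; the body-fat percentages at each BMI are made-up numbers chosen only so that women are higher than men at every BMI, so treat them as assumptions rather than data.

```python
# BMI distributions from the toy example: share of each group at each BMI.
bmi_mix = {
    "men": {20: 0.50, 35: 0.50},
    "women": {20: 0.95, 35: 0.05},
}

# Hypothetical body-fat percentages (assumed, not measured):
# at every BMI the women's figure is higher than the men's.
body_fat = {
    "men": {20: 15.0, 35: 35.0},
    "women": {20: 22.0, 35: 42.0},
}


def average_body_fat(group):
    """Population-wide average body fat, weighted by the BMI mix."""
    return sum(
        share * body_fat[group][bmi]
        for bmi, share in bmi_mix[group].items()
    )


men_avg = average_body_fat("men")      # 0.50*15 + 0.50*35
women_avg = average_body_fat("women")  # 0.95*22 + 0.05*42

print(f"men: {men_avg:.1f}%  women: {women_avg:.1f}%")
```

With these assumed numbers the men's overall average comes out higher than the women's, even though the women's figure is higher at each individual BMI. The difference comes entirely from how the two groups are spread across BMI values.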


Do you have a link for those % stats? I'm not questioning the accuracy. Only that I want to share it with others (in a told ya so sorta way).


I misread the UK figures, the above are for overweight-but-not-obese. The overweight-and-obese split of men/women is 67%/61% (NHS, 2019): https://digital.nhs.uk/data-and-information/publications/sta...

US figures are here: https://www.niddk.nih.gov/health-information/health-statisti...


also, women tend to be smaller, so even with higher percentages, the total amount of fat can be roughly similar to men.

but it seems the real issue with fat is how much visceral fat there is vs. the total amount anyway. obese folks all tend to have much more visceral fat than relatively fit individuals. visceral fat interferes more with the workings of organs and the signaling mechanisms (hormones, neural pathways, lymph/blood compositions, etc.) that keep a body healthy.


Yes visceral fat appears to be the real risk factor.

https://www.researchsquare.com/article/rs-880193/v1


My information was: 12% body fat is normal for a man and 24% is normal for a woman. Those deposits are needed since women bear children.


Introspection has long ceased to be a valid methodology in linguistics. Lack of a 1:1 mapping between languages clearly does not imply that you "fundamentally lose information". Just because it takes a language two words to express a concept that another language can express using a single word obviously does not mean that the first language is somehow unable to fully express the meaning of the concept.


Interestingly here, "loanword" is a calque from the German "lehnwort", which reminds me of my favourite language fact: "loanword" is a calque, and "calque" is a loanword.


Alternative title: "One language that has different close compounding rules than another language has different compound nouns than that other language".

Auf Deutsch: Verscheidenzusammengesetztesgebotesprachen haben verscheide zusammengesetztes Substantive.


To the non-native speakers: "Verscheidenzusammengesetztesgebotesprachen" looks long and German, but is not a German word. Maybe its parts were auto-corrected.


As that great Germanic comedian Mr Arnold Schwarzenegger might have said: "And here are the jokes ..." :-)


And I am sitting here, thinking you actually wanted to say, "verschiedene Substantivzusammensetzungsrechtschreibverordnungen" lead to "verschieden zusammengesetzte Substantive", while you were just making jokes. Silly me :-)


Yes.


>Alternative title: "One language that has different close compounding rules than another language has different compound nouns than that other language".

And different cultural and historical sensibilities expressed in those compound nouns - which is the main point of the article one would assume, not the mechanics...


From the article: "the unique wisdom of some of these words, that don't have an exact counterpart in the English language". So the claim is precisely that because English doesn't have single-word translations of these German terms, that the German words somehow encode "unique wisdom", which is obviously BS.

In most cases, close examination reveals that it's unfounded BS. "Aufrichtigkeit" (Sincerity/Honesty) has the same 'literal' meaning as the English "upstanding", as in "an upstanding citizen". "Liebenswürdig" is just "lovable", and so on and so forth.


>From the article: "the unique wisdom of some of these words, that don't have an exact counterpart in the English language". So the claim is precisely that because English doesn't have single-word translations of these German terms, that the German words somehow encode "unique wisdom", which is obviously BS.

As for the general claim, not so obviously.

For one, "wisdom" here is not meant as some expert understanding of who we are and what to do in general (the generic sense of wisdom), but as improved insight about a sentiment/situation captured in a specific word, as opposed to a language that can only describe it in many ad-hoc words.

To put it in IT terms, those words avoid an extra pointer interaction to the subject matter. Or having them is like having first class support for something that in English only can be achieved with "design patterns" and extra ceremony (of course there are English words which the German don't have too).

As for specific examples in the article, sure, they might not be the best picks to illustrate the general claim.


When discussing linguistic issues, it's best to use linguistic theory and terminology, rather than misleading analogies to software development. What you are propounding is a version of the Sapir-Whorf hypothesis, that language structure (such as lexical rules which allow free construction of compound nouns) somehow influences or constrains cognitive processes. Among the linguistics community, it is widely accepted that there is very little evidence for this relativistic view, and it has been largely rejected.

The author of the article is obviously not aware of any of this, and doesn't care. He is writing without concern for the truth status of what he is saying, his aim is only for attention. This is the very definition of bullshit according to the philosopher Harry G. Frankfurt in his essay "On Bullshit": "It is just this lack of connection to a concern with the truth - this indifference to how things really are - that I regard as of the essence of bullshit".


>When discussing linguistic issues, it's best to use linguistic theory and terminology, rather than misleading analogies to software development. What you are propounding is a version of the Sapir-Whorf hypothesis, that language structure (such as lexical rules which allow free construction of compound nouns) somehow influences or constrains cognitive processes.

No, I'm saying a much more basic, and widely accepted, undisputed even, thing: if language X has a specific term for a situation, then its speakers have captured that situation and understand it better and can refer to it more readily than people whose language lacks the term (and can only describe the notion with different ad-hoc phrases).

That's essentially what language does, even for native speakers. Captures things (nouns), sentiments, ideas, etc, to communicate them.

If a language lacks a term akin to "kawaii" then its people don't have immediate access to that notion the way Japanese have. They could say "cute" but it doesn't cut it, or they could explain it with one or more phrases. But still the person hearing it and the person expressing it would have mismatches etc. Whereas for a Japanese it's already a preset cultural notion.

I'm not saying that having such a term or having a syntax structured in such a way, makes the Japanese to thing this or that way, or affects how they think.

I'm saying that having a word for a phenomenon means you can refer to the phenomenon directly versus not having one, and people will immediately understand what you mean.


> No, I'm saying a much more basic, and widely accepted, undisputed even, thing: if language X has a specific term for a situation, then its speakers have captured that situation and understand it better and can refer to it more readily than people whose language lacks the term (and can only describe the notion with different ad-hoc phrases).

Firstly, do you have any evidence for the claim that, if a language has a specific term for a concept, speakers of that language understand that concept better than speakers of languages that require more than one term to describe the concept? I am genuinely interested in references to the studies you have read which demonstrate this.

Secondly, you contradict yourself. The following statements are inconsistent:

1. "if language X has a specific term for a situation, then its speakers have captured that situation and understand it better"

2. "I'm not saying that having such a term or having a syntax structured in such a way, makes the Japanese to thing (sic) this or that way, or affects how they think."

Either having a specific term for a concept allows better understanding of it (which obviously "affects how they think") or not.


Well, it's either his writing style, or all the horrible sh*t his company did.


It always confuses me as well (and I have at least some Physics background, having done a year as an undergrad). I think this is because I confuse it with the statistical measure of the entropy of a distribution, where a uniform distribution will maximise entropy.
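For the statistical sense only, here is a minimal sketch of what "a uniform distribution will maximise entropy" means, using Shannon entropy in bits. The two four-outcome distributions are arbitrary examples picked for illustration, and this says nothing about the thermodynamic definition.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    # Outcomes with zero probability contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probs if p > 0)


uniform = [0.25, 0.25, 0.25, 0.25]  # every outcome equally likely
skewed = [0.70, 0.10, 0.10, 0.10]   # one outcome dominates

print(shannon_entropy(uniform))
print(shannon_entropy(skewed))
```

The uniform case gives 2 bits, the maximum possible for four outcomes, and the skewed case gives less.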


You are aware, I assume, that the Winklevoss twins were awarded $65 million for "their" idea?


Yes, it's evidently, demonstrably, the case that ideas have value, because this is enshrined in the concept of intellectual property. Even in a world where IP didn't exist, his statement would be wrong, because there will always be people who can sell an idea, and the value of something is exactly what someone else will pay for it.

