Hacker News | judegomila's comments

We built it from scratch.


JG here. Will look into this.


Jude from Golden here. Yes, we will work very hard to keep to scientific consensus and reach that level of quality or higher. Please bear in mind this is not only crowdsourcing; there is also automation in the collection of the information, which will no doubt have its fair share of issues as well :> There is likely opportunity to auto detect cranky/racist/gaming behavior and we have some plans in that area. Feel free to shoot over ideas if you have further thoughts.


"there is likely opportunity to auto detect cranky/racist/gaming behavior and we have some plans in that area."

Ask Facebook and Twitter how easy automated content moderation is...


True, very hard problem. But the combinatorial phase space of free-form opinion comments is much larger than the phase space of canonical knowledge, so the number of patterns of things that can go wrong for them is much larger. Still, we are keeping our shields up and not discounting this issue.


Jude from Golden here.

We attribute to Wikipedia in general, which is in line with their TOS. Did we miss a place? When needed, we say: "Text adapted from the Wikipedia page "Product Hunt": https://en.wikipedia.org/wiki/Produc...

FYI as well, copyright doesn't apply to the actual facts, only to the text, and we give attribution as per their TOS in those cases.


I do not see attribution on https://golden.com/wiki/Product_Hunt

The vast majority of the text on that page is a verbatim copy from Wikipedia. You have to provide appropriate attribution.


Jude from Golden here. Agreed that WP is one of the most amazing things ever built and interesting to see your various lenses on our mission.

To the cynical self: see the Dropbox launch on HN back in the day. PS I’m in no way claiming we are Dropbox :>

To the angry self: There are various constraints that we want to release ourselves from in working on this problem by starting fresh. We believe the constraint space is too high to not build something new here. There are things that can be reused to build on what has been done already (linking out to WP, WP linking back to us when appropriate, the name space being similar/forked, various policies being built on/forked or rewritten, lessons learnt, content summarization with AI, fact cross checking etc).

To the academic self: we want to cover 10bn+ topics; Google's Knowledge Graph is around 3bn+ entities. We are not attempting to map all lamp posts in San Francisco, which would make a useful data set for a self-driving car company, but we do want to map all businesses, concepts, science topics, people of interest, species, products, services, etc. Instead of notability, we are aiming more at a validation model, i.e. ‘does this entity exist’. There is also a difference between an ‘article’ on WP and an ‘entity’ on Golden in our model. So I believe there is space for positive coexistence between Golden and WP. We will still want discussion around the validation and the ‘what next after 10bn entities are done’ debate.

In terms of the morality part, we wish to be at a more open standard than WP. The trade for the common user: we open up all the pages under CC-BY-SA-4.0, hopefully reach 1000x more entity cardinality, and open source useful queries, in exchange for less work per topic than alternatives (due to the leverage automation gives us over alternatives). So I think we are on strong moral ground here, otherwise I would not work on it. We also have paid helpers to fill in gaps on our side, and we are using part of our revenue to increase content at an ever faster speed. We reviewed the micropayments model and we don’t think it will work (see the Lunyr failure and others on that front).


Thanks for replying! I agree micropayments are incredibly difficult to make work, as is evidenced by the race-to-the-bottom advertising model that seems to be everywhere all the time.

I hear from you that you want to essentially cover 10 billion topics, and essentially validate that they exist, but that says nothing about validating the content of what someone is saying about it, nor organizing it, etc.

I hear lots of AI buzzwords, but essentially I don't see any staff that would leverage all the thinking humanity has done on organizing, validating, and cataloging information. Where are the information scientists? The librarians? The archivists? The journalists? The philosophers? Etc etc etc.

Essentially you are talking about a profoundly /human/ endeavor that requires input (IMHO) from many corners of human knowledge to do in any way that begins to approach wikipedia (or even an encyclopedia, much less a library) in terms of quality and scale.

I hear buzzwords, and see an alarming lack of acknowledgement of how difficult these questions are (or even that they exist).

However, you have the $$ and the people, and I'm just here hiding behind a keyboard criticizing. Clearly you've convinced more people of your ideas than I have of mine, and by all means it's a noble goal, so I wish you the best of luck, and it will be interesting to see what you are and what Golden looks like in a decade or so!


Thanks so much. In terms of the hard questions: after today's launch madness comes down a little, I'll tackle the hardest comments/questions here. In the coming months we will do some technical blog posts to explain how we will tackle the problem space. Many of the problems we have not figured out yet, and we welcome the community to contact us with new ideas. I 100% agree some of the problems are very hard. To give a glimpse into some problems we have solved so far, please test out the AI-assisted editor, the magic table cells in the editor for auto-filling tables, the citation tool (paste an academic paper into the citation UI), the event detection on the timeline UI, and the AI suggestions, to see some of the early results we have on automating the problem. Topic prediction, taxonomic detection, claim validation, structured data extraction, auto field detection/suggestions, crosslinking, spelling/grammar checking, sentiment checking, event detection, tense detection, quality on human edit feedback and ultimately prose writing (see recent OpenAI auto-writing research) [non-exhaustive list]: some we have solved and some not yet, but we will keep working on it. Generally speaking, keen to work on something difficult for the next 10 years...


Hi Jude,

On your page you said "If an extremely niche topic is valuable to just a handful of people and positively contributes to society, it will have a home on Golden."

Who will make that judgement call of what "contributes to society", and who will be paying their salary?

You also said "We believe this advanced query tool is extremely useful for investment funds, large consultancies and large companies, so please get in touch if you want to experience one of the best query tools out there."

That sounds great, but it's a far cry from "human knowledge". There wasn't much about advanced query tools for academics, nonprofits, activists, or government employees.

Sorry to be so cynical but one can only hear so much of "making the world a better place", to quote Silicon Valley.


> To the academic self: we want to cover 10bn+ topics; Google's Knowledge Graph is around 3bn+ entities. We are not attempting to map all lamp posts in San Francisco, which would make a useful data set for a self-driving car company, but we do want to map all businesses, concepts, science topics, people of interest, species, products, services, etc. Instead of notability, we are aiming more at a validation model, i.e. ‘does this entity exist’. There is also a difference between an ‘article’ on WP and an ‘entity’ on Golden in our model. So I believe there is space for positive coexistence between Golden and WP. We will still want discussion around the validation and the ‘what next after 10bn entities are done’ debate.

How does this differ from wikidata.org?


It will have to be profitable to pay back investors, for one difference.


> we wish to be at a more open standard than WP

As far as I can see, the CC-BY-SA license only applies to the text and not to the knowledge graph that users contribute.


Hey judegomila,

You're using my images - hundreds of them, it seems - in breach of the licence. Where do I send my invoice?

I'm sure as you wish to be "on strong moral ground", you won't want to deny me what you owe me.


So, you're aiming for an exit as per Metaweb/Freebase?


JG here. No, just a damn good website for you all.


If you've taken VC, then surely there's an exit in about 5 years?


JG: I hope not :>


Like a lot of others here I really wish you good luck, this can become amazing!

Like a lot of others here, I'm also afraid of what will happen as VCs start to demand profit, now.

For the record, I'm in no way against successful companies being wildly profitable, quite the contrary: I see that as a guard against being forced to do dumb things.

What I am wary of, however, is companies being forced by VCs to do all kinds of crazy stuff, like back when Quora decided to publish everything one looked at and I left there and then, never to return, or when, shortly after WhatsApp joined Facebook, talk started about "integration" and I immediately started moving my account and all groups elsewhere.

What I could hope for[0] - especially with companies that are hoping to crowdsource a lot of data - would be some kind of effective guarantee and/or escrow to prevent short sighted plays by VCs or hostile takeover by Google, Facebook or similar companies, i.e. that companies would "tie themselves to the mast" to escape the siren songs.

[0]: but don't really expect in most cases as it would limit a number of profitable exits. An upside I could see would be that it would be easier to get crowdsourced data, from both companies as well as from individual contributors if one could believe that the data would stay accessible and not be abused.


Don't let your VC see this because they're definitely expecting to see a return on their investment reasonably soon.


All the best.


Jude from Golden here. I'd love to get your feedback on the editor and fix any bugs you come across / comparisons of previous experiences and why you gave up. You can email me at jude [at] golden [dot] com or post it here...


Jude from Golden here. Good question. We are opening up useful queries over time for users, e.g. https://golden.com/y-combinator-w19-companies/ Future paid/business tools will include letting private companies use our AI-assisted editor and store their knowledge privately, so that we can continue to open up more queries to the public. Our north star is to get to a more open knowledge base than what is currently available. Also, here are all the cryptocurrency projects that have whitepapers: https://golden.com/cryptocurrency-whitepapers/


Given that Wikidata is published under a CC0 licence, there’s nothing preventing Golden from using it as a knowledge source. Unless Golden provides a better editor experience, I see no reason for anyone to contribute to Golden rather than Wikidata.


Just FYI, most of the content on Wikipedia is actually dual licensed under CC-BY-SA and GFDL. Both licenses are copyleft licenses; CC0 is more permissive.

https://en.wikipedia.org/wiki/Wikipedia:Reusing_Wikipedia_co...


Yes, but I’m talking about Wikidata (CC0), not Wikipedia (CC-BY-SA).


Whoops, I misread your comment! Sorry about that.


Parts of Wikipedia articles (e.g. some infoboxes [1]) are actually sourced from Wikidata, which is licensed under CC0 [2]

[1] https://en.wikipedia.org/wiki/Template:Infobox_person/Wikida...

[2] https://www.wikidata.org/wiki/Wikidata:Data_access#Basic_imp...


Jude from Golden here. This knowledge base is for all of humanity as well. I think the important part here is that we are putting out the content on CC-BY-SA-4.0.

There is also a risk factor in a donations model for WP, and in not having enough revenue to invest in the tools needed to go 1000x on the topics plus other features that are important to build. If it means anything, I worked on my last co (Heyzap) for 9 years and it's still running today. I can see the worry, but there are many options for creating backups and dumps of the text, so we are in agreement that more knowledge is better, and thanks for the support.


Thank you for your response. I'm greatly encouraged that the license is CC-BY-SA. That's definitely in the public good.

I do believe there is space for non-Wikipedia knowledge repositories. Look at Wikia, for example. People do want to store all their fandom knowledge somewhere. It's just that Wikipedia might not be the place for it.

I also agree that the underinvestment in tools can limit contributions. Wikipedia's new visual editor is the biggest step they've taken in making editing easily accessible. They also have new translation and analysis tools. Not to mention all the specialized wikis under the umbrella - Wiktionary, Wikidata, etc. I do wonder if it's enough though.

Thanks for your work. I hope you can find a balanced business model.


Thanks, glad you like it.


If the knowledge is for all of humanity will you be incorporating as a B-Corp or some other form of social benefit corporation?


Jude from Golden here. Great question. We are actively monitoring all the changes right now, building up the community with a scientific/industrial-focused seed, and building out UI and AI to track the flat-earth-type changes that might come up in future. I think if we can get transparency on their best arguments/evidence and see the best counters, it is going to become clear that the earth is round in that example. Let us get overrun with people that want the best known information on the topics.


This approach may work with natural sciences, but what about political topics? People who have spent the most time studying an ideology X are, in some sense, the best available experts (they remember thousands of details), but are far from impartial. And of course, ideologies may try to call themselves "science", making it seem like people who disagree are simply uneducated.


Who gets to decide who is a scientist and who is not a scientist? Scientists?


CC-BY-SA-4.0 correct, we are correcting the blog post.

