Hacker News

I wonder what is the closest thing to Cyc we have in the open source realm right now. I know that we have some pretty large knowledge bases, like Wikidata, but what about expert system shells or inference engines?


OWL and SPARQL inference engines that use RDF and DSMs - there are LISPy variants like datadog still kicking around, but there are some great, high performance reasoner FOSS projects, like StarDog or Neo4j

https://github.com/orgs/stardog-union/

Looks like "knowledge graph" and "semantic reasoner" are the search terms du jour; I haven't tracked these things since OpenCyc stopped being active.

Humans may not be able to effectively trudge through the creation of trillions of little rules and facts needed for an explicit and coherent expert world model, but LLMs definitely can be used for this.


You can actually do "inference" or "deduction" over large amounts of data using any old-fashioned RDBMS, and get broadly equal or better performance than the newfangled "graph" based systems. Graph databases may be a clear win for very specialized network analysis, but that is rare in practice.
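
For illustration, here is a minimal sketch of that kind of deduction using a recursive common table expression in SQLite through Python's standard `sqlite3` module. The `is_a` table, the `infer_ancestors` helper, and the sample facts are hypothetical names made up for this example, and it says nothing about performance relative to graph stores.

```python
import sqlite3


def infer_ancestors(edges):
    """Return every (descendant, ancestor) pair implied by direct is_a edges.

    `edges` is a list of (child, parent) tuples. The table and column
    names are illustrative assumptions, not taken from any real schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE is_a (child TEXT, parent TEXT)")
    conn.executemany("INSERT INTO is_a VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE ancestor(child, parent) AS (
            -- base case: the stated facts
            SELECT child, parent FROM is_a
            UNION
            -- recursive case: if A is_a B and B is_a C, then A is_a C
            SELECT a.child, i.parent
            FROM ancestor AS a
            JOIN is_a AS i ON a.parent = i.child
        )
        SELECT child, parent FROM ancestor ORDER BY child, parent
        """
    ).fetchall()
    conn.close()
    return rows


facts = [("dog", "mammal"), ("mammal", "animal"), ("animal", "organism")]
derived = infer_ancestors(facts)
for child, parent in derived:
    print(f"{child} is_a {parent}")
```

Using `UNION` rather than `UNION ALL` drops duplicate rows, which is also what lets the recursion terminate if the data contains a cycle.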


Graph databases win in flexibility and ease of writing the queries over graphs, honestly.

Of course the underlying storage can be (and often is) a bunch of specially prepared relational tables.

But the strength of graph databases comes from restating the problem in a different way, with query languages targeting the specific problem space.

Similarly there are tasks where SQL will be plainly better.


> ease of writing the queries over graphs, honestly

The SQL standard now includes syntactic sugar for 'Property Graph Query'. Implementations are still in the works AIUI, but can be expected in the reasonably near future.


Having seen PGQL, I think I'll stay with SPARQL.

And for an efficient implementation, the database underneath still needs to have extended graph support. (In fact, I find it hilarious that Oracle seems to be spearheading it, as they previously canceled their graph support around 2012, enough that I wrote in my 2014 thesis about how it had been deprecated and removed from support.)


Every time I try to write a query for GitHub’s GraphQL API I lose a few hours and go back to REST. Maybe it’s easy if all the edges and inputs are actually implemented in ways you would expect.


GraphQL isn't exactly a proper graph database query language. The name IIRC comes from Facebook's Graph API, and the language wasn't actually designed as a graph database interface.


Thanks for the correction


> high performance reasoner FOSS projects, like StarDog or Neo4j

StarDog is not FOSS. As I understand it, that GitHub repo holds various utilities around their proprietary package; the actual engine code is not open source.


Did you mean Datalog here?


There are some pretty huge ontology DBs in molecular biology, like GO or Reactome.

But they have never truly exploited logic-based inference, except for some small academic efforts.


I think that GO with GO-CAM is definitely going that way. Basic GO is rather simple and can't infer that much (as in, GO by itself has little classification or inference logic built in). Uberon, for anatomy, does use a lot of OWL power and shows that logic-based inference can help a lot.

Reactome is a graph because that is the domain, but technically it does little with that fact (in my disappointed opinion).

Given that GO and Reactome are also relatively small academic efforts in general...


There are a few symbolic logic entailment engines that run atop OWL, the Web Ontology Language, some flavors of which are rough equivalents of Cyc's KBs. The challenge, though, is that the underlying approaches are computationally hard, so nobody really uses them in practice. On top of that, the retrieval language associated with OWL is SPARQL, which also has little traction.
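
To make "entailment" concrete, here is a toy sketch of naive forward chaining over triples with two RDFS-style rules (subclass transitivity and type inheritance). It is far simpler than real OWL reasoning and is not how any particular engine works; the `entail` function and the sample triples are hypothetical, made up for this example.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def entail(triples):
    """Forward-chain two RDFS-style rules until no new triples appear.

    Rule 1: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    Rule 2: (x type B),       (B subClassOf C) => (x type C)
    """
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in closure:
            if p not in (SUBCLASS, TYPE):
                continue
            for (s2, p2, o2) in closure:
                # Chain through any subclass edge that starts where this triple ends.
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, p, o2))
        if not new <= closure:
            closure |= new
            changed = True
    return closure


facts = {
    ("Fido", TYPE, "Dog"),
    ("Dog", SUBCLASS, "Mammal"),
    ("Mammal", SUBCLASS, "Animal"),
}
inferred = entail(facts) - facts
for triple in sorted(inferred):
    print(triple)
```

Each pass compares every triple against every other, so even this toy version is quadratic per round, which hints at why richer rule sets get expensive quickly.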


At a much lower level, I've been having fun hacking away at my Concludia side project over time. It's purely proposition level and will eventually support people being able to create their own arguments and contest others. http://concludia.org/


Nice! I've wanted to build something like this for a long time. It requires good faith argument construction from both parties, but it's useful to make the possibility available when you do find the small segment of people who can do that.


Very cool! I've had this idea for 20 years. I'm glad I didn't get around to making it.


I tried to make something along these lines (https://truebase.treenotation.org/).

My approach, Cyc's, and others are fundamentally flawed for the same reason. There's a low level reason why deep nets work and symbolic engines are very bad.


And what is that reason?


The language before language.


Care to expand on this, provide evidence, or even pointers to what you mean by it?


Pardon the cliffhanger style.

I have begun crafting an explanation, but I'm not sure when it will be ready.

But when you recognize that thinking predates symbolic language, and start thinking about what thinking needs, you get closer to the answer.


Err…OpenCyc?


Yep. And it may be just a subset, but it's pretty much the answer to

"I wonder what is the closest thing to Cyc we have in the open source realm right now?".

See:

https://github.com/therohk/opencyc-kb

https://github.com/bovlb/opencyc

https://github.com/asanchez75/opencyc

Outside of that, you have the entire world of Semantic Web projects, especially things like UMBEL[1], SUMO[2], YAMATO[3], and other "upper ontologies"[4] etc.

[1]: https://en.wikipedia.org/wiki/UMBEL

[2]: https://en.wikipedia.org/wiki/Suggested_Upper_Merged_Ontolog...

[3]: https://ceur-ws.org/Vol-2050/FOUST_paper_4.pdf

[4]: https://en.wikipedia.org/wiki/Upper_ontology


Unfortunately no longer (officially) available; and only a small subset anyway.



