It sounds cliche, but I've found the yEd graph editor crucial for reading a Dostoevsky novel and keeping up with all the characters and their relations. Is yWorks pretty much the only game in town when it comes to graph editing?
If you still have that graph somewhere, we'd love to see it. We're always curious (and sometimes surprised and astonished) what people create with yEd.
As far as competitors go, there are lots of other options, both in end-user applications for graph editing, as well as libraries.
For end-users it seems many stick with the first tool they really like and get used to its features, strengths, and idiosyncrasies (and from my experience there are many weirdnesses among those applications, including our own). Automatic layout may be a killer feature for our offering, though. As far as I know there is not much that can compare here (although for many people simple hierarchic or force-directed approaches may suffice and they might not need every option).
For library users it often comes down to a decision based on required features, cost, custom development effort, and target platform. I think we're well-situated for customers for whom cost is less of an issue, who have competent developers, and who require extensive customization (and support). It's not uncommon for D3 to be the better choice, depending on the requirements.
Thanks for the response! FWIW, I find nothing lacking in the application, but I'm working on an iPad and it seems a native app would be less constrained by browser quality.
There is Gephi, which is very capable, though I've also found it frustrating to work with, particularly for graph editing. Interface glitches, unresponsive elements etc. quickly get in the way.
Development seems to have stalled somewhat, so I'm not optimistic that this has improved.
yEd is my default go-to graph drawing tool. I use it mostly for engineering diagrams. It's kind of incredible what you can make with it, and it's basically cross-platform: I use it on Windows and macOS. The graph layout engine is first class and it has a ton of options.
If you can get into the flow with it and trust the layout engine (instead of foofing around with placing your own things), it works really well. It has basically replaced Visio for 99% of my team's diagramming.
Chart recognition and OCR in general would make UGC sites, Wikipedia in particular, much more powerful. Wikipedia is full of uploaded charts that should be datasets.
In general, chart sharing on the web is bad. If we can't have a <chart> element, maybe chart parsing is the next best thing to preserve some of the original information.
I think Wikipedia has macros that allow you to create a variety of charts with just textual information. I've seen at least timelines and family trees being created that way. The benefit is that it's text-editable, and the information is still accessible. In some cases, though, the markup is very, very unreadable and might actually be tool-generated. It still maintains the benefits in theory.
For things that are trivial in SVG or even HTML, like bar or line charts, there should be little to no excuse not to use markup, agreed.
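To make the "trivial in SVG" point concrete, here's a minimal sketch in Python that turns a small dataset into bar chart markup. The function name `bar_chart_svg` and its parameters are made up for illustration, and it assumes all values are positive. The raw numbers stay in the markup as `<title>` elements, so the data can be read back out of the chart.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(data, bar_width=40, gap=10, height=120):
    """Render {label: value} as a minimal SVG bar chart string.

    Assumes every value is positive; bars are scaled to the largest value.
    """
    top = max(data.values())
    width = len(data) * (bar_width + gap) + gap
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height + 20}">'
    ]
    for i, (label, value) in enumerate(data.items()):
        x = gap + i * (bar_width + gap)
        h = round(height * value / top)
        # The <title> child keeps the original value in the markup.
        parts.append(
            f'<rect x="{x}" y="{height - h}" width="{bar_width}" height="{h}">'
            f'<title>{escape(label)}: {value}</title></rect>'
        )
        # Label below the bar.
        parts.append(f'<text x="{x}" y="{height + 15}">{escape(label)}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(bar_chart_svg({"2019": 3, "2020": 5, "2021": 4}))
```

Anyone who wants the dataset back only has to parse the `<title>` elements instead of running OCR on a bitmap.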
In this case, we've concentrated on parsing graphs, though, as that's our main line of work (you could draw charts with yFiles, but it's not really that useful for it). We're still trying to find the time to clean up our code and publish it somewhere. This has been just a week-long effort by four people, but was definitely a fun learning experience.
yWorks employee, developer and one of the authors of that blog post (and the described code) here. Happy to answer any questions regarding graph drawing or recognition.
Would you guys mind open sourcing your data set, including the cases where you currently fail (presumably graphs with edge crossings -- I couldn't help but notice that all examples are planar graphs)? I think I may have some ideas how to tackle some of the unresolved issues you state at the end of the blog post.
Yeah, we intentionally omitted edge crossings. There's prior research in dealing with them and those approaches work well (and are clever), so we thought it wouldn't be terribly useful for us to re-invent that part. The papers are linked in the blog post near the end where we compare our approach with previous attempts.
We then concentrated on different segmentation strategies (the other approaches started with a sensible binarization of the image, which precludes color-based segmentation), as well as getting visual characteristics such as shape and color right. The algorithms still don't handle cases well where features are much larger than we expect (e.g. photos are very different from screenshots). That would certainly be an area for improvement.
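A toy illustration of why binarizing first gets in the way of color-based segmentation (this is not our actual code, just a sketch with made-up helper names and a tiny hand-written image): two touching nodes, one red and one blue, are both "dark" under a luminance threshold and merge into one blob, while a coarse color quantization keeps them apart.

```python
def luminance(rgb):
    """Approximate perceived brightness of an (r, g, b) pixel."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(image, threshold=128):
    """Label each pixel as foreground or background by brightness only."""
    return [["fg" if luminance(px) < threshold else "bg" for px in row]
            for row in image]


def quantize(image, step=64):
    """Label each pixel by a coarse color bucket instead of brightness."""
    return [[tuple(c // step for c in px) for px in row] for row in image]


def count_regions(labels, background):
    """Count 4-connected regions of equal label, ignoring the background."""
    h, w = len(labels), len(labels[0])
    seen, regions = set(), 0
    for y in range(h):
        for x in range(w):
            if (y, x) in seen or labels[y][x] == background:
                continue
            regions += 1
            seen.add((y, x))
            stack = [(y, x)]
            while stack:  # flood fill over pixels with the same label
                cy, cx = stack.pop()
                for ny, nx in ((cy + 1, cx), (cy - 1, cx),
                               (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and (ny, nx) not in seen
                            and labels[ny][nx] == labels[cy][cx]):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return regions


W, R, B = (255, 255, 255), (220, 0, 0), (0, 0, 255)
image = [
    [W, W, W, W, W, W],
    [W, R, R, B, B, W],
    [W, R, R, B, B, W],
    [W, W, W, W, W, W],
]

if __name__ == "__main__":
    print(count_regions(binarize(image), "bg"))       # one merged blob
    print(count_regions(quantize(image), (3, 3, 3)))  # red and blue kept apart
```

Real screenshots need something more robust than fixed buckets (anti-aliasing alone produces many in-between colors), so treat the quantization step as a stand-in for whatever color clustering you prefer.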
Our data set is ... basically most of what the screenshots in the article show (one of them opens an album with more images). We didn't have time for thorough testing on thousands of different graphs. That's something we'd certainly have to do if we wanted to publish anything ;-)
I thought about using it, but since it returns only straight lines in the image it's probably not very useful for curved edges and for node shapes there are probably also better ways. For edges the skeleton has been working really well. It might still be useful for detecting whether edges are orthogonal and to better detect bends (right now, orthogonal edges always have a little curve at the corners, due to how skeletonization works).
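One way the bend detection could look, sketched under assumptions: the edge is already available as an ordered list of skeleton pixels (the `path` below is hand-written, not real output), and the helper names are invented for this example. The idea is to find long horizontal and vertical runs, treat the short diagonal run between them as corner rounding, and put the bend where the two straight lines would intersect.

```python
from itertools import groupby


def step_kind(a, b):
    """Classify the step between two neighboring pixels (assumed distinct)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    if dy == 0:
        return "H"
    if dx == 0:
        return "V"
    return "D"


def straight_runs(path, min_run=3):
    """Return (kind, first_point, last_point) for each long axis-aligned run."""
    kinds = [step_kind(a, b) for a, b in zip(path, path[1:])]
    runs, i = [], 0
    for kind, group in groupby(kinds):
        n = len(list(group))
        # Short runs and diagonal runs are treated as corner rounding.
        if kind in ("H", "V") and n >= min_run:
            runs.append((kind, path[i], path[i + n]))
        i += n
    return runs


def recover_bends(path, min_run=3):
    """Place a bend where consecutive horizontal and vertical runs intersect."""
    runs = straight_runs(path, min_run)
    bends = []
    for (k1, _, end1), (k2, start2, _) in zip(runs, runs[1:]):
        if k1 == k2:
            continue
        if k1 == "H":
            # Horizontal line y = end1.y meets vertical line x = start2.x.
            bends.append((start2[0], end1[1]))
        else:
            # Vertical line x = end1.x meets horizontal line y = start2.y.
            bends.append((end1[0], start2[1]))
    return bends


# Horizontal run, a rounded corner cutting off (10, 0), then a vertical run.
path = ([(x, 0) for x in range(9)]
        + [(9, 1)]
        + [(10, y) for y in range(2, 7)])

if __name__ == "__main__":
    print(recover_bends(path))
```

If every long run is axis-aligned and only short diagonal stretches sit between them, that's also a decent hint that the edge was orthogonal to begin with.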
The title led me to believe this would be about extracting relational hierarchies (e.g. scene trees, dependency graphs) from arbitrary images. That would have been very impressive, and somewhat unbelievable. The actual article appears to be about extracting graphs from images of literal graphs.