The wired version of this device was in human trials ~10 years ago through Cyberkinetics, a startup spun out of Brown to commercialize the technology. See the NYT article from 2004 about the start of their clinical trials: http://www.nytimes.com/2004/04/13/health/13BRAI.html
This is not new technology. A company called Cyberkinetics commercialized this 12 years ago and made it work in humans. See their coverage in Wired from 2003.
Definitely an impressive benchmark by any standard. However, there are some things to be aware of:
1) They used InfiniBand interconnects. Running on Ethernet is likely to yield less impressive results.
2) Their benchmark does simple primary key lookups. If you start doing joins or transactions that need to hit multiple data nodes, things will slow down. Depending on your workload, this may or may not be an issue.
3) NDB is an in-memory storage engine, so you're limited to the aggregate RAM in your cluster for max storage size.
4) AFAIK, MySQL Cluster doesn't re-balance; you need to determine up front how data is partitioned, and changing that at runtime is hard. I don't know if this has changed in later releases.
For point #3, NDB has supported on-disk non-indexed attributes for a while now (2-3 years?). So you only need to fit the indexes in memory, which is a much smaller dataset, though still a limit.
I'm sure for the benchmark it was all in memory though.
Also, to add to #3: the NDB C++ API doesn't use raw SQL queries either (it uses a lower level of abstraction for accessing the database), so it avoids the overhead of parsing SQL. Most production systems and third-party libraries use SQL queries.
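To illustrate, a primary-key read through the NDB API looks roughly like this (written from memory, so treat the exact calls as approximate; the table t1 and columns id/val are made up, and error checks are omitted):

    #include <NdbApi.hpp>

    // Hypothetical table: t1(id INT PRIMARY KEY, val INT)
    int read_row(Ndb &ndb, int key)
    {
        NdbTransaction *tx = ndb.startTransaction();
        NdbOperation *op = tx->getNdbOperation("t1");
        op->readTuple(NdbOperation::LM_Read);           // simple PK lookup
        op->equal("id", key);                           // bind the primary key
        NdbRecAttr *val = op->getValue("val", nullptr); // where the result lands
        tx->execute(NdbTransaction::Commit);            // no SQL string is ever parsed
        int result = val->int32_value();
        ndb.closeTransaction(tx);
        return result;
    }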
OK, you hooked me with the title. But "FreeBSD + Erlang" was kind of an unsatisfying explanation of how you achieved it. Would love to hear more details! How far we've come since http://www.kegel.com/c10k.html
How does kqueue compare to epoll on Linux? I've written C code using kqueue on OpenBSD and OS X, but have only used epoll via libev (and not at especially high load). I thought the big change came from trading level- for edge-triggered nonblocking IO, but maybe the kqueue implementation is superior for sockets somehow?
The main advantage Erlang has over C/Python/Ruby/etc. is that asynchronous IO is the default throughout all its libraries, and it has a novel technique for handling errors. Its asynchronous design is ultimately about fault tolerance, not raw speed. Also, it can automatically and intelligently handle a lot of asynchronous control flow that node.js makes you manage by hand (which is so 70s!).
You can build event-driven asynchronous systems pretty smoothly in languages with first-class coroutines/continuations (like Lua and Scheme), but most libraries aren't written with that use case in mind. Erlang's pervasive immutability also makes actual parallelism easier.
With that many connections, another big issue is space usage -- keeping buffers, object overhead, etc. low per connection. Some languages fare far, far better than others here.
Yes, I would say kqueue, the interface, is superior to epoll. Kqueue lets you batch modifications to watcher state and retrieve pending events in a single system call; with epoll, you have to make a separate system call for every modification. Kqueue can also watch for things like filesystem changes and process state changes, while epoll is limited to socket/pipe I/O. It's a shame that Linux doesn't support kqueue.
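To make that concrete, here's a rough sketch (illustrative only: fd1/fd2 are assumed to be already-created sockets, error handling is omitted, and the two halves obviously target different kernels):

    // kqueue (BSD/OS X): both registrations and the wait collapse
    // into a single kevent() call
    #include <sys/types.h>
    #include <sys/event.h>
    #include <sys/time.h>

    int wait_kqueue(int kq, int fd1, int fd2)
    {
        struct kevent changes[2], out[64];
        EV_SET(&changes[0], fd1, EVFILT_READ,  EV_ADD, 0, 0, nullptr);
        EV_SET(&changes[1], fd2, EVFILT_WRITE, EV_ADD, 0, 0, nullptr);
        return kevent(kq, changes, 2, out, 64, nullptr);  // one syscall total
    }

    // epoll (Linux): one epoll_ctl() per modification, plus the wait
    #include <sys/epoll.h>

    int wait_epoll(int epfd, int fd1, int fd2)
    {
        struct epoll_event ev, out[64];
        ev.events = EPOLLIN;  ev.data.fd = fd1;
        epoll_ctl(epfd, EPOLL_CTL_ADD, fd1, &ev);  // syscall 1
        ev.events = EPOLLOUT; ev.data.fd = fd2;
        epoll_ctl(epfd, EPOLL_CTL_ADD, fd2, &ev);  // syscall 2
        return epoll_wait(epfd, out, 64, -1);      // syscall 3
    }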
I fully agree that kqueue is awesome, but what specifically is broken on OS X? I've used it extensively on that platform and haven't run into any showstoppers.
Yeah, the lack of support for TTYs can be annoying when writing a terminal application (a workaround for some cases is to use pipes), but it hardly qualifies as a significant problem for writing network applications.
Here's a more interesting bug in OSX: kqueue will sometimes return the wrong number for the listen backlog for a socket under high load.
How does it work? Do users provide a buffer and the kernel fills the buffer with data and notifies the user when ready?
That is more akin to Linux AIO, then? Otherwise, epoll/poll/select just notifies you when data is available, and the actual copy is done by the user. Surprisingly, this can make a huge difference when streaming large amounts of data.
We have argued here before, and I have gotten downvoted into oblivion for pedantically distinguishing between asynchronous IO and non-blocking IO, but it looks like that extra user-space memcpy can make a huge difference.
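A sketch of the distinction, since the two models are easy to conflate (POSIX AIO standing in on the completion side; names are illustrative and error handling is omitted):

    #include <aio.h>
    #include <unistd.h>
    #include <cstring>

    // Non-blocking ("readiness") model: epoll/kqueue tells you the fd is
    // readable; you then issue the read yourself, and the copy into buf
    // happens on your thread.
    ssize_t readiness_read(int fd, char *buf, size_t len)
    {
        return read(fd, buf, len);
    }

    // Asynchronous ("completion") model: you hand the kernel a buffer up
    // front; by the time you're notified, the kernel has already filled buf.
    ssize_t completion_read(int fd, char *buf, size_t len)
    {
        struct aiocb cb;
        std::memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = len;
        aio_read(&cb);                      // returns immediately
        const struct aiocb *list[1] = { &cb };
        aio_suspend(list, 1, nullptr);      // real code would use a notification instead
        return aio_return(&cb);
    }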
I can't find anything about this now; I just spent a good 20 minutes searching for it. I guess keywords like "kqueue", "buffer", "request", "http" are too generic in some sense. :-/
Anyway, the idea was to avoid context switches by waiting/parsing on the kernel side until there was enough data for the client to do something other than make yet another gimme_more_data() call back to the kernel.
It could even apply to mechanisms other than kqueue, so perhaps I'm misremembering that it was kqueue-specific.
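For what it's worth, that description sounds a lot like FreeBSD's accept filters (accf_http and friends), where the kernel holds a new connection until a full HTTP request has been buffered, so accept() never wakes the process for a socket that would only ask for more data. That's a guess at what you're remembering, not a confirmation; a minimal sketch:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <cstring>

    // FreeBSD only; requires the accf_http kernel module to be loaded.
    void enable_httpready(int listen_fd)
    {
        struct accept_filter_arg afa;
        std::memset(&afa, 0, sizeof afa);
        std::strncpy(afa.af_name, "httpready", sizeof afa.af_name);
        // accept() now only returns connections with a complete HTTP
        // request already buffered in the kernel
        setsockopt(listen_fd, SOL_SOCKET, SO_ACCEPTFILTER, &afa, sizeof afa);
    }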
There might well be errors here, but what those errors might be is not stated.
From the cited article on ports of the libev event loop library: "The whole thing is a bug if you ask me - basically any system interface you touch is broken, whether it is locales, poll, kqueue or even the OpenGL drivers." with no particular details on what is broken in Mac OS X.
Issues with porting to AIX, Solaris, and Windows are also discussed in that article, again with reports of errors but no specific details for those platforms.
Without error details, there's also no way to tell whether alternatives, workarounds, or fixes exist, or whether bug reports and reproducers were filed that would let the vendors address the (unspecified) errors.
It's useful to differentiate between "SQL databases do not scale" and "SQL databases do not _cost-effectively_ scale". The second claim is more accurate.
Vertical scaling of a DB is definitely an option for many people and has been used to scale many applications. However, the cost curve for buying bigger and bigger hardware is super-linear: doubling the CPU and memory in a single system more than doubles the hardware cost. This is a problem for businesses whose database costs grow faster than their revenue.
Sharding is also an option for scaling, leveraged to great success by Facebook, Yahoo, and many others. However, as the article points out, sharding prevents the developer from using many of the features that make a relational database a productive development environment. Lots of footguns emerge in a sharded SQL environment, and if you haven't set up your development constraints appropriately, you can slow the pace of development considerably. This again becomes a cost problem, because the incremental cost of adding features grows as you add more machinery like sharding around your database.
SQL is not useless and not hopeless; in a large number of cases, SQL is the right solution. However, the techniques used to scale SQL tend to be options only for organisations with very large budgets. NoSQL solutions tend to be more cost-effective in their scaling approach (scale out vs. scale up) without crippling developer productivity. For these reasons, NoSQL solutions tend to be the better choice for the cost-conscious.