More

pjbringer · on Dec 7, 2014

This exact problem had me worried when I heard about OpenBSD's arc4random, and how they've been promoting a pervasive use of it. I haven't taken the time to look at how they solve the problem of contention, while still maximizing the unpredictability of the random number generator's state by using it.

clarry · on Dec 7, 2014

I don't think threads are all that popular near the OpenBSD group. Most daemons handle multiple connections asynchronously without threads, and if that's not enough, multiple processes may be used to process multiple requests simultaneously.

arc4random() is simply a locking wrapper for the underlying rng.

anon4 · on Dec 7, 2014

Aren't you supposed to use arc4random once to seed a high-quality pseudorandom generator? From what I understand, seeding a high-quality PRNG with good source of randomness is really all you need.

makomk · on Dec 7, 2014

arc4random is their high-quality pseudorandom generator that's seeded from a good source of randomness (namely, the getentropy system call).

thirsteh · on Dec 7, 2014

Yes, but like urandom the general intent (at least when performance is desired) is you use it to seed your own PRNG and avoid making system calls every time you need random data.

pjbringer · on Oct 18, 2013

That would be interesting, however after launching adb shell:

  shell@mako:/ $ grep vfat /proc/filesystems
          vfat
  shell@mako:/ $ grep vfat /proc/mounts
  /dev/block/platform/msm_sdcc.1/by-name/modem /firmware vfat ro,relatime,uid=1000,gid=1000,fmask=0337,dmask=0227,codepage=cp437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro 0 0

So fat is being used in spite of not having an SD card.

pjbringer · on Oct 17, 2013

Just the other day, I noticed that on my kde desktop, wireshark was the only program using gtk3, whereas gtk2 is used by a bunch of cross platform programs. I'm not too sure what that means though.

tenfingers · on Oct 17, 2013

GTK3 has a very small adoption rate for cross-desktop/OS applications. Except for the gnome DE (which I'm not using), on my system the only two applications using GTK3 are wireshark and lightdm. Heck, even GIMP on Debian is still using GTK2.

QT4+ is faster than GTK3, the API is superior in almost every respect, has better cross-platform OS support. It should be telling to every user how some basic widgets like the file-open dialog in GTK3 has actually regressed in every respect compared to GTK2, and many others actually behave worse from the user's point of view (which should be the #1 priority of any widget system, even before the API).

I used to prefer GTK for the footprint (and mind you, I was always aiming at pure X11 development), but not anymore.

pjbringer · on Oct 1, 2013

Putting a dollar amount on anything signals value perception. 12.50$ is a lot worse than a warm welcome, or other free rewards like public acknowledgement, because it says Yahoo really couldn't care less about finding such bugs.

ijk · on Oct 1, 2013

Known as the overjustification effect: http://en.wikipedia.org/wiki/Overjustification_effect

pjbringer · on Sept 29, 2013

I notice the ideas that it is not the number of contributors that matter, but the number of sufficiently skilled ones, and the that popularity impedes change. I can't help draw a parallel with the advice that you should listen to your most valuable customers, and potential customers, and that the rest of your users will expect free stuff, and complain loudly when you pivot.

pjbringer · on Sept 23, 2013

Also, cache invalidation can actually be cache replacement. Otherwise, under very high load, you risk cache stampede.

pjbringer · on Sept 20, 2013

The unsafe language argument of the first paragraph doesn't hold. You can design you API such that handles are copied by value, and opaquely contain pointer. Usually the handle IS the pointer, but it doesn't have to be this way. When you do this, you are able to perform exactly the same kinds of memory operations that a virtual machine would perform.

pjbringer · on Sept 20, 2013

I too wondered why he would ask Apple rather than the DOJ. If he did the latter, the answer would be that they don't know the details of the implementation, and can't say. Apple is the party which knows right now.

So what I would like, is that after Apple comes out with the details, the question gets asked to the government. That would ensure that the government does behave according to the intent of Congress.

pjbringer · on Sept 19, 2013

Stanford has another somewhat related project: http://dune.scs.stanford.edu

In their case, the application runs both as a process in the host and as a guest. It gives the application access to traditional OS APIs, and allows use of processor extensions to directly access virtualized hardware. The benefits of doing this include ability for an application to do low level custom IPC, to use the page tables for garbage collection, to trace the use of system calls much more efficiently, to hook into page faults, and so on. Very cool stuff.

pjbringer · on Sept 17, 2013

In that case, your IO problem is easy, because it's small with regard to CPU time. You can get away with putting the whole data in sql databases, and/or making multiple copies of your data. Then you can use as many workers as you want, with usually simple partitioning logic.

lgieron · on Sept 17, 2013

Originally I did just that, but ultimately decided to move to Hadoop. When combined with Amazon EMR, launching arbitrarily large cluster is just a few clicks. You can then monitor progress, have robust cluser-wide error handling, and your data gets nicely merged into output files in S3 (not so easy with the home-baked solution).

vosper · on Sept 17, 2013

We've had a lot of success with EMR as well - we have an hourly Pig job that produces data for our analytics database. It's not a particularly complex script, but our traffic volume is unpredictable so it's reassuring to know that we can add resources to a slow job and have it finish faster.

The downside of EMR is that it can be fairly expensive once you start needing the beefy machines. We're lucky that we can afford to have our analytics delayed an hour or two and can thus run on Spot instances (except for the Master node). When we move to a streaming architecture I'm not sure EMR will still be competitive, since we won't be able to have those machines go away on us.

Edit: clarity.

mightybyte · on Sept 17, 2013

If you only look at CPU time, then yes, maybe you could do that. But there are many more factors at play.