Awesome project! As a professional audio developer, I was really blown away that this was the author's first project working with audio.
For anyone interested, I'd recommend checking out "The Infinite Jukebox", which has a similar goal, but perhaps a more robust approach: http://infinitejukebox.playlistmachinery.com
If I had to guess at why your approach didn't work well on recorded music, it's probably because most of the time there is more than a single event happening at once, so picking out just the highest FFT bin is not a very robust "fingerprint" of that part of the music. The Infinite Jukebox uses timbral features as the fingerprint, rather than just a single note.
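To make the fragility concrete, here's a minimal sketch (my own illustration, not the author's code; the magnitude spectrum is assumed to be precomputed, so no FFT library is shown) of what reducing a frame to its loudest FFT bin looks like:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Reduce a whole analysis frame to the index of its loudest FFT bin.
// With several simultaneous events (polyphonic music), the loudest bin
// can flip between sources from frame to frame, which is why this makes
// a poor fingerprint.
std::size_t peak_bin(const std::vector<float>& magnitude_spectrum) {
    auto peak = std::max_element(magnitude_spectrum.begin(),
                                 magnitude_spectrum.end());
    return static_cast<std::size_t>(peak - magnitude_spectrum.begin());
}
```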
Thanks for the pointer to The Infinite Jukebox! It's impressive how well it does. Its biggest point of failure seems to be jumping to different parts of a verse or chorus and mixing up bits of lyrics. It makes sense that this would happen, but it would be really interesting to see what could be done to avoid it.
It's a shame you can no longer upload music to the InfiniteJukebox. Apparently it relied on a Spotify API that was shut down October 1st. Best hope going forward is an open source library that can offer the same track analysis that used to be provided by that API.
The API was an Echo Nest API. Spotify acquired Echo Nest and shut down all their APIs in 2016.
I'd love to see the app live on too! Spotify's API has similar functionality[1] to the old Echo Nest API, for now at least. (But I don't know if it returns all the same data.) Or, if you don't want to rely on Spotify, I bet Essentia[2][3] could do the job just as well. Essentia is the open-source brains behind AcousticBrainz[4]. So you could either use Essentia directly, or grab the data from the AcousticBrainz API.
I work using C++17 for high performance applications, and I can relate to a lot of these gripes. I think it's a fair point that C++ is unreasonably complex as a language, and it's been a serious problem in the community for a long time.
One part that really struck me as odd is the focus on non-optimized performance. To me, this is an important consideration, but not nearly as important as optimized performance. Using techniques like ranges can definitely slow down debug performance, but much of the time it _dramatically increases_ optimized performance vs. naive techniques.
How do ranges speed up optimized builds? One of the best techniques for very high performance code is separation of specifying the algorithm and scheduling the computation. What I mean by this is techniques like [eigen](http://eigen.tuxfamily.org/index.php?title=Main_Page) and [halide](http://halide-lang.org) where you can control _what_ gets done and _how_ it gets done separately. Being able to modify execution orders like this is critical for ensuring that you're using your single-core parallelism and cache space in an efficient way. This sort of control is exactly what you get out of range view transformers.
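For a concrete (if simplified) illustration - a minimal sketch of my own using C++20 ranges, not Halide-grade scheduling (the thread is about C++17, where range-v3 works similarly) - view composition lets you state the computation declaratively while the views stay lazy and fuse into a single pass:

```cpp
#include <numeric>
#include <ranges>
#include <vector>

// The pipeline specifies *what* to compute; because views are lazy, the
// filter and transform fuse into one traversal with no intermediate
// buffers. A naive version would materialize a filtered vector and then
// a squared vector - two extra allocations and passes over memory.
int sum_of_even_squares(const std::vector<int>& xs) {
    auto evens_squared =
        xs | std::views::filter([](int x) { return x % 2 == 0; })
           | std::views::transform([](int x) { return x * x; });
    return std::accumulate(evens_squared.begin(), evens_squared.end(), 0);
}
```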
> I work using C++17 for high performance applications
> One part that really struck me as odd is the focus on non-optimized performance
I'm guessing your high performance applications aren't interactive? When your application has to respond to user input in real time, a binary that is 100x slower than real time is completely useless. You can't play a game at 0.3 frames per second.
I would be interested in seeing an example of how Halide-like techniques can be used with C++ ranges. I am skeptical that you could get the kind of performance improvements that Halide can achieve. And of course you won't get the GPU or DSP support that is really useful for that kind of computation.
Don't make my Debug build into a RelWithDebInfo build; that makes it a huge pain to track down subtle bugs/errors in non-performance-critical unit tests.
This is dealt with in the article - debugging optimised code is a pain, even when you know what you're doing. Source-level debugging often doesn't work, variable watches often don't work (and this even though DWARF has a whole pile of features specifically so that this stuff can work...), and debugging at the assembly level is a chore.
It’s not the speed of compilation. It’s the speed the program runs with debug build. So runtime speed.
And for games you need decent runtime speed. If you cannot run your game in a debug build, you have to fall back on good old printf debugging. And yes, if you cannot actually play the game (as in over 10fps), that means you cannot run it in a debug build.
Are you confusing build time with performance of the resulting binary? I'm talking about the latter. Both are important and both are lacking with modern C++ in debug mode.
Edit: I see, I carelessly used the word "build" to mean a compiled binary, which was ambiguous. I've changed it.
Thanks for the clarification. I guess like all trade-offs, it's context dependent. I see the advantages of having a realtime usable non-optimized build for debugging. Since I use modern libraries like Eigen, that option has not been available to me for some time.
With "modern" techniques, the performance ceiling is a bit higher - whether that benefit is worth it depends on a lot of factors.
If you're doing Linear Algebra, you're kind of in a C++ sweet-spot, I think.
In particular, you can always debug a tiny version of whatever problem you're trying to solve, so you don't really care that much about non-optimized performance, and a lot of the time you're willing to eat a long compile time if it means you squeeze out that last couple of percent. At the same time, you care a lot about cache micro-optimizations and talking to GPUs and the like, and you generally want to be banging directly on some piece of memory you got from the OS - all things that non-C++ languages make extraordinarily painful.
Even Fortran, which the haters were trying to push as "just better" than C++ for linear algebra has really disappointed me.
> Using techniques like ranges [...] _dramatically increases_ optimized performance vs. naive techniques.
This claim requires some evidence. In my experience, it's extremely common for novice engineers to accept orders of magnitude of build-time overhead while chasing negligible runtime performance improvements.
I'll save the value judgement for another time, but I'd like to point out an important difference:
As the word "unsafe" implies, these Haskell primitives forgo type safety in addition to type correctness. That means you can get segfaults and other undefined behavior at runtime, whereas such a type error on the JVM will simply produce an exception at runtime.
That's debatable; however, Data.Dynamic is built on top of Data.Typeable, which provides a lower-level runtime type safety facility. I think we can both agree Typeable has lots of interesting uses.
Typeable is interesting in theory and generic traversals are a godsend, but usually I find that when I'm reaching for that particular hammer I should check twice.
The difference is that Haskell does not check types at runtime like Clojure does (which is the point of strong static typing), so if the type is wrong it becomes as unsafe as C.
This was done on a much larger scale (without primes) as a lottery game in Sweden called "Limbo" or "LUPI" (lowest unique positive integer). Several game theorists have analyzed the data with some interesting results:
Calculating the equilibrium strategy for rational actors is difficult because each player doesn't know how many other players there are. In the paper above, game theorists calculate it and show that the distributions seen in the lottery match up fairly well to a rational strategy.
Multi-armed bandit isn't an algorithm, it's a model of how to view the problem. Like it or not, the problem web designers face fits the multi-armed bandit model pretty well. The algorithm called "MAB" in the article is one of many that have been developed for multi-armed bandit problems. Traditionally, the "MAB" of this article is known as "epsilon-greedy".
The point of multi-armed bandit situations is that there is a trade-off to be made between gaining new knowledge and exploiting existing knowledge. This comes up in your charts - the "MAB"s always have better conversion rates because they balance the two modes. The "A/B tests" always gain information more quickly because they ignore exploitation and focus only on exploration.
I should also say that multi-armed bandit algorithms aren't supposed to be run as a temporary "campaign" - they are "set it and forget it". In epsilon-greedy, you never stop exploring, even after the campaign is over. This way, you don't need to achieve "statistical significance", because you're never taking the risk of choosing one path for all time. In traditional A/B testing, there's always the risk of picking the wrong choice.
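For anyone who hasn't seen it, epsilon-greedy fits in a few lines. This is a minimal sketch of my own (names are illustrative, not from any library): with probability epsilon you explore a random arm, otherwise you exploit the arm with the best empirical conversion rate, and exploration never stops.

```cpp
#include <cstddef>
#include <random>
#include <vector>

struct EpsilonGreedy {
    double epsilon;
    std::vector<double> successes;  // conversions per arm
    std::vector<double> trials;     // impressions per arm
    std::mt19937 rng{std::random_device{}()};

    // trials starts at a tiny value to avoid division by zero below.
    explicit EpsilonGreedy(std::size_t arms, double eps = 0.1)
        : epsilon(eps), successes(arms, 0.0), trials(arms, 1e-9) {}

    std::size_t choose() {
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        if (coin(rng) < epsilon) {  // explore: uniformly random arm
            std::uniform_int_distribution<std::size_t> pick(0, trials.size() - 1);
            return pick(rng);
        }
        std::size_t best = 0;  // exploit: highest empirical mean
        for (std::size_t i = 1; i < trials.size(); ++i)
            if (successes[i] / trials[i] > successes[best] / trials[best])
                best = i;
        return best;
    }

    void record(std::size_t arm, bool converted) {
        trials[arm] += 1.0;
        if (converted) successes[arm] += 1.0;
    }
};
```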
You aren't comparing A/B testing to a multi-armed bandit algorithm, because both are multi-armed bandit algorithms. You're in a bandit situation either way. The strategy you were already using for your A/B tests is a different common bandit strategy, called "epsilon-first" on Wikipedia, and there is a bit of literature on how it compares to epsilon-greedy.
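For contrast, here's a sketch of epsilon-first in the same illustrative style (again my own names, not any library's API): explore uniformly at random for a fixed budget of trials, then commit to the empirically best arm forever - which is exactly where the "picking the wrong choice" risk comes from.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

std::size_t epsilon_first_choose(std::size_t trials_so_far,
                                 std::size_t exploration_budget,
                                 const std::vector<double>& mean_rewards,
                                 std::mt19937& rng) {
    if (trials_so_far < exploration_budget) {
        // Exploration phase: pick an arm uniformly at random.
        std::uniform_int_distribution<std::size_t> pick(0, mean_rewards.size() - 1);
        return pick(rng);
    }
    // Exploitation phase: commit to the empirically best arm forever.
    // If the exploration phase crowned the wrong winner, the mistake
    // is locked in for all remaining traffic.
    return static_cast<std::size_t>(
        std::max_element(mean_rewards.begin(), mean_rewards.end()) -
        mean_rewards.begin());
}
```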
This comment just sold me on MAB. You can keep throwing variations on a design at the system without having to make tenuous decisions. I hope all the A/B tools implement this soon.
It seems like most people who use these content optimization tools don't really understand the statistics involved. What are your thoughts on this? How do you educate your users on the merit of your approach vs. A/B testing when the topic is so complex?
Also, despite this being a slightly pro-A/B-testing post, I have to say it's actually made me more interested in trying out Myna's MAB approach.
The same way as every product from GWO and T&T on down: show a pretty graph that ignores the underlying assumption that it's even possible to use statistics to conjure certainty from uncertainty, and trust that users will never know or care about the difference.
/former A/B test software dev who fought with my users trying to stop them from misinterpreting results, and failed.
If it gives you any comfort: when there is a significant underlying difference and the calculations are done right, they will, with high probability, get the right answer even while misunderstanding the statistics.
Acceptance of this fact has avoided a lot of potential ulcers for me.
Just to be clear, we don't have an anti-MAB or pro-A/B-testing stance. The point was that MAB is not "better", as an earlier article ("20 lines of code that beat A/B testing") had claimed. These methodologies clearly serve two different needs.
I should note here that if you use Myna, you will be using a much better multi-armed bandit approach than the epsilon-greedy algorithm that lost in this blog post.
See my longer top-level comment for some of the trade-offs.
I find it interesting that there is no mention of using dependently typed languages or proof engines for this application. Something like Coq, where you write the formal proof of correctness as you write the program, would fit the bill nicely if you really care about safety over ease of implementation.
Whitespace is too slow for big systems. The non-implicit nature of semicolon will really allow for prime-time server stacks. I do think though that whitespace will retain its market share on the front end.
I have mixed feelings about this book. It's how I learned electronics, so I can't knock it too much. However, the explanations, which tend to be intuitive rather than logical, can sometimes be quite hard to follow for someone who is more methodically minded. So, for the nitty-gritty analog material in the bipolar transistor chapter of AoE, I much prefer "Analysis and Design of Analog Integrated Circuits" by Gray and Meyer. It works for discrete circuits too, even though "Integrated" is in the name.
For those interested in an open source design embodying these techniques, I have an open source hardware design here: https://github.com/russellmcc/dco with a blog write-up here: https://www.russellmcc.com/posts/2013-12-01-DCO.html .
Also relevant is an open source hardware design for the later Roland Alpha Juno digital oscillator: https://github.com/russellmcc/alphaosc with a blog write-up here: https://www.russellmcc.com/posts/2019-06-14-Alpha-DCO.html