
Ha didn't expect this to end up here. If anyone is interested, I'm working on a blogpost explaining how we built the app in detail… It uses embedded TensorFlow on device (better availability, better speed, better privacy, $0 cost), with a custom neural net inspired by last month's MobileNets paper, built & trained with Keras. It was loads of fun to build :)
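
For anyone curious what a MobileNets-style net looks like in Keras, here is a rough sketch of the idea (the layer sizes and depth are made up for illustration, not the app's actual architecture):

    # Rough sketch of a MobileNets-style binary classifier in Keras 2
    # (TensorFlow backend). Widths and depth are illustrative only.
    from keras.models import Sequential
    from keras.layers import (Conv2D, SeparableConv2D, BatchNormalization,
                              Activation, GlobalAveragePooling2D, Dense)

    def tiny_mobilenet(input_shape=(128, 128, 3)):
        model = Sequential()
        # Regular convolution for the stem...
        model.add(Conv2D(32, (3, 3), strides=(2, 2), padding='same',
                         input_shape=input_shape))
        model.add(BatchNormalization())
        model.add(Activation('relu'))
        # ...then a stack of depthwise-separable convolutions, the MobileNets
        # trick for cutting CPU and memory cost on phones.
        for filters in (64, 128, 256):
            model.add(SeparableConv2D(filters, (3, 3), strides=(2, 2), padding='same'))
            model.add(BatchNormalization())
            model.add(Activation('relu'))
        model.add(GlobalAveragePooling2D())
        model.add(Dense(1, activation='sigmoid'))  # hotdog / not hotdog
        return model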


Damn! I thought it was using the GCloud ML API, but it felt so fast. The one in the show felt so slow in comparison! Kudos, Jian Yang!


How large is the training set / how did you get a sufficient amount of hot dog pics for a custom model, especially since you did not have a class of naive Stanford CS students at your disposal?


About 150k total images, 3k of which were hotdogs. The results are far from perfect (there are tons of subtle — or hilarious — ways to trick the app), but it was better than using a pre-trained model or doing transfer learning, accuracy-wise (honestly it was even better than using the Cloud APIs). As for the difficulty in preparing the training set, I'll just say I definitely empathize with Dinesh and Jian Yang's feelings in episode 4 :D
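
(A side note for anyone training on a similarly skewed set: with roughly 3k positives against ~147k negatives, one common trick, not necessarily what was done here, is to up-weight the rare class. In Keras that is nearly a one-liner, assuming a model and data generators already exist:)

    # Sketch: compensating for a ~1:49 hotdog/not-hotdog imbalance with class
    # weights. `model`, `train_gen` and `val_gen` are assumed to exist already;
    # the ratio comes from the figures quoted above (3k hotdogs out of ~150k).
    class_weight = {0: 1.0,          # not hotdog
                    1: 147.0 / 3.0}  # hotdog, up-weighted by inverse frequency
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    model.fit_generator(train_gen, steps_per_epoch=1000, epochs=10,
                        validation_data=val_gen, validation_steps=100,
                        class_weight=class_weight)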


> there are tons of subtle — or hilarious — ways to trick the app

ಠ_ಠ


AWESOME!

Can you please elaborate a bit more on the model architecture and what you tried with respect to transfer learning?

Did you use an ImageNet architecture, e.g. VGG, and retrain it from scratch, or a custom architecture? Did you try chopping off the last 1/2/3 layers of a pretrained model and fine-tuning?

Bonus points: 1. How much better were your results trained from scratch vs fine-tuned? 2. How long did it take to train your model and on what hardware?

:)


Hey, so I actually tried VGG, Inception, and SqueezeNet, both out of the box and chopped, as well as trained from scratch (SqueezeNet only for the latter due to resource constraints).
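
For readers who haven't seen the "chopped" approach before, it usually looks roughly like this in Keras (VGG16 from keras.applications as an example; the frozen/unfrozen split and the new head are illustrative, not what was actually used here):

    # Sketch of the "chop off the top and fine-tune" approach with keras.applications.
    from keras.applications import VGG16
    from keras.models import Model
    from keras.layers import GlobalAveragePooling2D, Dense

    base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    for layer in base.layers[:-4]:   # freeze everything except the last few layers
        layer.trainable = False

    x = GlobalAveragePooling2D()(base.output)
    x = Dense(128, activation='relu')(x)
    out = Dense(1, activation='sigmoid')(x)  # hotdog / not hotdog
    model = Model(inputs=base.input, outputs=out)
    model.compile(optimizer='adam', loss='binary_crossentropy')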

We ended up with a custom architecture trained from scratch due to runtime constraints more so than accuracy reasons (the inference runs on phones, so we have to be efficient with CPU + memory), but that model also ended up being the most accurate model we could build in the time we had. (With more time/resources I have no doubt I could have achieved better accuracy with a heavier model!)

Training the final model took about 80 hours on a single Nvidia GTX 980 Ti (the best thing I could hook up to my MacBook Pro at the time). That's for 240 epochs (150k images per epoch) run in 3 learning-rate annealing phases, each phase being a handful of CLR (cyclical learning rate) cycles.
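
For context, a triangular CLR schedule (after Leslie Smith's cyclical learning rate paper) can be sketched as a small Keras callback; the base_lr/max_lr/step_size values below are placeholders, not the settings actually used:

    # Sketch of a triangular cyclical learning rate (CLR) as a Keras callback.
    # Hyperparameters are illustrative only.
    import numpy as np
    from keras import backend as K
    from keras.callbacks import Callback

    class TriangularCLR(Callback):
        def __init__(self, base_lr=1e-4, max_lr=1e-2, step_size=2000):
            super(TriangularCLR, self).__init__()
            self.base_lr, self.max_lr, self.step_size = base_lr, max_lr, step_size
            self.iteration = 0

        def on_batch_end(self, batch, logs=None):
            self.iteration += 1
            cycle = np.floor(1 + self.iteration / (2.0 * self.step_size))
            x = abs(self.iteration / float(self.step_size) - 2 * cycle + 1)
            lr = self.base_lr + (self.max_lr - self.base_lr) * max(0.0, 1.0 - x)
            K.set_value(self.model.optimizer.lr, lr)  # self.model is set by Keras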

I'll answer in more detail in the full blogpost, it's a bit complicated to explain in a comment. I'll have charts & figures for y'all :)


As someone currently writing an app which uses a retrained Inception model, I watched the show pointing and laughing (and then crying) at the same issues and frustrations. The accuracy of the show and especially this episode has been just brilliant.

Thanks for sharing all the tech details too, it's been great to read. I'm even more amazed to see it as a real app, that I didn't expect!


I'm confused about the 80 hours to train. Wasn't this episode shown on HBO less than 48 hrs ago?

Edit: Just read your bio. Now it makes sense!


He's a guest lecturer at Stanford, and he had the students help him.


Great, thanks for your reply, and looking forward to that blogpost!


I instantly noticed the speed and thought, "did they put TensorFlow on the device?!" Props.


How does one get this job? This sounds ridiculously fun.


It was random: I was already working on the show as what Hollywood calls a (technical) “consultant”, advising on storylines, dialogue, background assets, etc. When this idea popped up, someone suggested we build the app for real. We gave it a try and ended up building the entire thing in-house with the crew, as opposed to hiring an external agency to do it for us.


Sounds like you're handling yourself even better than the characters in the show. :)

Edit: I guess it's official: the time of XKCD 1425 has passed. https://xkcd.com/1425/ (September 2014); Randall undershot that by a bit :)


If I understood correctly by reading the comments, this was built for a show? What show?

EDIT: ah, it's in your profile ;)


For future reference, the show is Silicon Valley.


What was your path to becoming a Hollywood technical "consultant"?

Would you write a blog post about that journey?


How long did it take you to build the app? How many devs worked on it and who got to feed the training data set?


You can fetch a dataset of 1,273 labeled hot-dog images and 123,287 non-hot-dog images from the MS COCO training dataset:

search for "hot dog" or click on the hot dog icon:

http://mscoco.org/explore/
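
If you'd rather pull those programmatically than through the web explorer, something like this should work with pycocotools ("hot dog" is a real COCO category; the annotation path below is just an example):

    # Sketch: listing MS COCO training images with/without a hot dog via pycocotools.
    # Assumes the train2014 instance annotations have been downloaded locally.
    from pycocotools.coco import COCO

    coco = COCO('annotations/instances_train2014.json')
    hotdog_cat_ids = coco.getCatIds(catNms=['hot dog'])
    hotdog_img_ids = coco.getImgIds(catIds=hotdog_cat_ids)
    other_img_ids = list(set(coco.getImgIds()) - set(hotdog_img_ids))
    print(len(hotdog_img_ids), 'hot dog images,', len(other_img_ids), 'non-hot-dog images')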


Were the images gathered by Stanford undergrads?


I should say a lot of people made the app possible, including the show's awesome producers, writers, designers, and a lot of kind folks at HBO. To answer your question, I was the only dev on the project, and I've been working on it since last summer on a very part-time basis (some nights and weekends). A lot of the time was spent learning deep learning, to be honest. The last revision of the neural net was designed & trained in less than a month of nights/weekends, but it obviously couldn't have been achieved without the preceding months of work — if I were starting today knowing what I know now, yeah, it'd probably be about a month of work.

The React Native shell around the neural net was just a few weekends' worth of work — mostly finding the right extensions, tuning a few things for rendering/performance, and a whole weekend dealing with the UX around iOS permissions to access the camera & photos (lol, it's seriously so complicated).


Were you able to expense the hot dogs?

In the early '90s, I had the honor of corresponding with "Uncle Frank" Webster [1], the curator of the Hot Dog Hall of Fame [2]! It boasts more than 2,500 frankfurter items, including the Lamborweenie, a dog on wheels that "Uncle Frank" hopes someday to race against the Oscar Mayer Wienermobile [3]. Unfortunately the hot dog museum is currently closed and in (hopefully refrigerated) storage [4]; otherwise the museum, gallery and gift shop would be a great place to train your app.

He asked for permission to publish in his newsletter a gif [5] of a photo I'd taken and put on my web site of the Doggie Diner head [6] on Sloat Boulevard in San Francisco [7]. (This was years before John Law acquired all the Doggie Diner heads he could find for his restoration project [8], so there weren't many photos of them on the internet at the time.)

Of course I gave him permission because he asked so politely, and although at first he seemed a little creepy, I could tell he was authentic and sincere since he signed his correspondence: "With Relish, Uncle Frank." [9] And he even delivered on his promise to send me copies of his newsletter!

He enthusiastically informed me that the highest quality hot dogs he's ever found are from Top Dog in Berkeley [10]. He admitted that he went through their dumpster to find out where they sourced them from because they wouldn't tell him, and he vouches that they are made from the finest possible ingredients. I agree with Uncle Frank that Top Dog has really excellent hot dogs (ask for them cooked butterfly style), and I take him at his word that they come from a reputable source. You can now actually order them online [11] if you would prefer not to go through their dumpster.

[1] http://www.smithsonianmag.com/arts-culture/hot-dogs-are-us-6...

[2] http://thehotdoghalloffame.blogspot.nl/

[3] https://www.questia.com/magazine/1G1-19007874/the-almost-all...

[4] http://www.roadsideamerica.com/tip/20747

[5] http://donhopkins.com/home/catalog/images/DoggieDiner.gif

[6] https://en.wikipedia.org/wiki/Doggie_Diner

[7] http://www.roadsideamerica.com/story/14441

[8] https://www.kickstarter.com/projects/2118888480/doggie-diner...

[9] http://www.downtownmakeover.com/comments.asp?CZID=17

[10] https://www.yelp.com/biz/top-dog-berkeley

[11] http://www.topdoghotdogs.com/mailorder.html


How did you not expect it to end up here?


I honestly thought the app itself would come across as too limited — and I wasn't quite sure how HN felt about the show it's attached to. I was preparing that technical blogpost specifically for HN because I thought that would be a more hacker-centric way of looking at the same thing.


You can tell a lot about SV-types based on how they feel about the show SV...


Had not heard of MobileNets before. Mind sharing which paper you are referring to?


I believe they are referring to this paper, titled "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications":

https://arxiv.org/pdf/1704.04861.pdf


Great work, man.

The effort you put into the show is much appreciated.


The funny thing is that there are already "seefood"s out there. Among others:

https://www.infino.me/hungrybot


Daww! I'm the author, thanks!

I was REALLY depressed after watching these last two episodes. HungryBot was easily the coolest thing I've done in my unfunded nonprofit. But it was too buggy to fly. Kicked it out of the nest too early, but to the best of my knowledge it was the first deep learning food recognition app.

Looking for philanthropists if anyone wants to help me study diabetes with apps like this.


psst this is isaac


I would love to look at the source code for this!


And the training set! Mmmmmm!


yes please - would love to read that blogpost!


I would love to read this.


Definitely! Link please?


> It uses embedded TensorFlow on device (better availability, better speed, better privacy, $0 cost)

It's sad that this isn't more common outside of hot dog detectors.
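
(For anyone wanting to try it: in the TF 1.x era this usually meant freezing the trained Keras model into a single .pb GraphDef that the TensorFlow mobile/C++ runtime can load. A rough sketch, with an illustrative output node name:)

    # Sketch: freezing a trained Keras model into a .pb for on-device TensorFlow
    # (TF 1.x-style workflow). The output node name is illustrative; check
    # model.output.op.name for the real one.
    import tensorflow as tf
    from keras import backend as K
    from tensorflow.python.framework import graph_util

    sess = K.get_session()
    frozen = graph_util.convert_variables_to_constants(
        sess, sess.graph.as_graph_def(), ['dense_1/Sigmoid'])
    with tf.gfile.GFile('hotdog_model.pb', 'wb') as f:
        f.write(frozen.SerializeToString())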


Is there a reason you didn't release on Android? The app was even demoed on a Pixel.



I have an Android version being released as we speak.


It was written in Objective-C?


It's actually written in React Native with a fair bit of C++ (TensorFlow), and some Objective-C++ to glue the two. One cool thing we added on top of React Native was a hack to let us inject new versions of our deep learning model on the fly without going through an App Store review. If you thought injecting JavaScript to change the behavior of your app was cool, you need to try injecting neural nets, it's quite a feeling :D


This looks great, haha! Shame I can't access it from the UK.

It would be really interesting to read more about your thoughts on working with RN and C++ and perhaps how you did some of it. I'm currently doing the same (but with a C++ audio engine rather than image processing stuff) and I think it's an incredibly powerful combination - but I do feel like I'm making up some interop patterns as I go and there might be better ways, so would love to hear how other people use it!

Broadly, I've created a "repository" singleton that stores a reference to both the React Native module instance (which gets set when I call its setup method from JS) and the C++ main class instance (which gets set when it starts up), so they can get a handle on each other (I bet there are better ways to do this, but I'm new to C++/ObjC and couldn't work out a better way to get a reference to the RN module).

I'm then using RCT_EXPORT_METHOD to provide hooks for JS to call into C++ via an ObjC bridge (in an RCT_EXPORT_MODULE class), and using the event emitter to communicate back to JS (so the C++ can get the RN module instance from the singleton and call methods which emit events).

I've not done anything that really pushes the bridge performance to a point where I've seen any noticeable latency/slowdown caused by the interop — have you had any issues here?

Like I say, I'm finding it a really cool way to build apps that need the power of native code but still want the ease of RN for the GUI and some logic, and I actually quite like the separation the communication boundary enforces.


Sounds like you're further ahead than I was with the React Native part! Not Hotdog is very simple so I just wrote a simple Native module around my TensorFlow code and let the chips fall where they may performance-wise. The snap/analyze/display sequence is slow enough that I don't need to worry about fps or anything like that. As much as I enjoyed using RN for this app, I would probably move to native code if I needed to be able to tune performance.


Can you explain to a noob how you wrote the Native module around TensorFlow? My main area of focus is Python, but I feel hindered whenever I think I'm ready to start developing mobile apps. I'm looking into RN, but I'm still not sure how that plays with TF and other Python modules.


It was honestly just maybe 10 lines of code, but I was very confused about it before I got it done. The message passing is a bit counterintuitive at first. I'll try to share example code in my blogpost!


awesome, what's your blog?


See this sub-discussion re: performance (+ serialization/marshalling) in today's React Native discussion:

https://news.ycombinator.com/item?id=14349426


Can you please open source it? It would be great if you did.


So what you're saying is you can update the neural net to do dick or not dick without going through the App Store review?


Nice! I'd never known that C++ could be used in an iOS app; learned something new today, thanks.


It would be relatively easy to port to Android. Please open source it!


What was the hack?


When using React Native (and also storing your network in JS) you can choose to push updates and fixes to the JS part of your code without going through the store again.

Check: http://microsoft.github.io/code-push/


Yup, that's basically it. The hack was just getting TensorFlow to accept/load its neural network definition from the JS bundle (what CodePush distributes for you) rather than from the main Cocoa bundle.


Just as a note, people/developers have had messages from Apple telling them that they need to remove code which allows them to update their app outside of the app update/review process.

See https://news.ycombinator.com/item?id=13817557 for some more detail/discussion.

[edit: Apart from in Apple-approved manners]


https://developer.apple.com/programs/ios/information/iOS_Pro...

> 3.3.2 […] The only exception to the foregoing is scripts and code downloaded and run by Apple’s built-in WebKit framework or JavascriptCore […]


Well, that's not gonna fly with the "code is data, data is code" crowd ...

If the newspaper apps get to load fresh front page images, why should a poor neural net be discriminated against in its quest for fresh coefficients?

(Raw oppression and injustice there, get yer indignation reservoir topped up.)


Really cool stuff! I didn't even realise you'd switched jobs, but that explains why I didn't see you when we popped in at Townsend the other day! Keep up the good work.


I'm surprised Apple allows this, I thought they hated this kind of thing.


I really have no idea what any of that means. No Android then?



