
This is obviously a talent acquisition in more ways than one (the Kaggle team, but also their ability to source machine learning talent). I wonder to what degree it's also a Tensorflow promotion move? It seems like Google is very interested in growing a community around it.

For example: some friends who run a seed-stage biotech deep learning startup were offered a considerable discount by the Google Cloud folks. Their ask? That the company switch to Google Cloud, rewrite some proprietary software in Tensorflow, and heavily publicize both moves.

I wonder if we'll see Kaggle gain a specific bent towards that ecosystem.




Not clear to me why this is a talent acquisition. The Kaggle team (Ben in particular) have some talents in ML, but I'd be surprised if they have anyone there working day to day on ML tasks.

It seems to me more like an old school product-and-media acquisition: Google like the product, and love the audience. This is a good way to get both.


I think parent's focus was on the "sourcing ML talent" part rather than the Kaggle team itself.


It was; should have emphasized it more. The Kaggle team is talent for sourcing ML talent.


Plus Kaggle is a good tool for weeding out talent. Sending an ML candidate a kaggle competition is much better than a traditional code interview.


I don't think Google actually has that problem.

The whole thing is strange.


Kaggle can bring out unknown or underprivileged gems into the spotlight. I remember reading an article about a top performer on Kaggle who was a school teacher somewhere in SE Asia (Singapore?).


1) Why do you believe that the hypothetical Singaporean isn't going to apply to Google? Google has no shortage of applicants. And if the applicant believes that Kaggle could help them, why not simply put the score on the application / resume?

2) If Google is trying to recruit people from Kaggle accounts, why not simply index the accounts?

Neither approach requires purchasing Kaggle at all.


Singapore is a bad example. How about a (hypothetical) guy/gal learning ML from Coursera and living in a remote village in Indonesia? No way to go to college because it's simply too far and he/she has to support their family. The person stumbled upon Kaggle, and started to compete with the best in the world.

Only Kaggle has the full data to be able to make an accurate decision. I don't really think indexing account pages is even remotely enough to find the really talented people among the noise.

I think Google acquired Kaggle for one of the following two reasons: 1) they wanted to expand their talent acquisition reach[1], or 2) they wanted to build a platform like Kaggle aimed at Google Cloud, but figured out that it was just easier to acquire Kaggle itself.

[1]: Google will NEVER be satisfied with its talent pool given their size and rate of expansion. The company is prepared to do a ton -- perhaps even acquiring Kaggle -- to get the best of the best, wherever they are.


If you own all the user data then you know you have access to, and control of, all of it.


But if you index it, you have it.


Not the team themselves, the competitors...


>This is obviously a talent acquisition in more ways than one (the Kaggle team


Last I heard, Kaggle runs atop Azure and is heavily a C# shop. It'll be interesting to see the transition to Google Cloud if that's the case.


I can confirm that Kaggle runs on Azure because I block all Microsoft IPs (to avoid the ninja Windows 10 upgrade) and must disable the blocker in order to go on the site.


As skrebbel said, don't they charge for the upgrade now? That said, Never10[1] was (still is?) a great tool to prevent the Windows 10 auto-upgrade. Also, according to the Never10 page, Microsoft now has an optional update to get rid of the GWX stuff.[2]

[1] https://www.grc.com/never10.htm

[2] https://support.microsoft.com/en-us/kb/3184143


> to avoid the ninja Windows 10 upgrade

What ninja upgrade? You always had to opt in. Yes, they were really pushing the offer annoyingly hard, but I had no problems whatsoever keeping one of my machines on Windows 7.

Anyway, you can stop doing so now, the time for a free upgrade is over.


This is incorrect. There was an opt-out phase where the Windows 10 install started automatically in the middle of work. I've experienced this myself: there's a moment where Windows 7 just shuts down and starts installing Windows 10. I had to wait 30 minutes until I could press "I disagree" on the EULA, and then it would start rolling back the Windows 10 it had just installed.


Entirely off topic, but I thought they now charge for the Windows 10 upgrade and don't force it anymore?


They do charge now but you can get it free if you say you will use an accessibility feature.


Why not upgrade to Windows 10? It's my favorite Windows OS yet, and has me even rethinking whether I want our house to be all OS X...


This thread from a while back covers some of people's objections to Windows 10, outside the usual privacy concerns:

https://news.ycombinator.com/item?id=13555100


So you don't get windows updates?


At this point presumably a system not running Windows 10 is not getting updates anymore. Unless it's an enterprise install, in which case the ninja update is irrelevant.


I get updates on Win 7


This is a great idea. Adding it to my DNS blackhole as we speak.


It's really not a great idea. Either you don't run Windows, and it's not an issue, or you just blocked Windows Update and other important services Microsoft provide that work in tandem to keep your systems safe.


Blocking Windows Update sounds like a feature to me, not an issue.


> Either you don't run Windows, and it's not an issue,

Not a solution for those of us who run Windows boxes for various reasons...

And to clarify, I plan on occasionally letting updates through (I'm already on Windows 10) but this is a great way to prevent data collection / backdoor activation, which I hadn't considered. Seems like the simplest way to add a lot of privacy to Windows.


There's a shedload you can block without interfering with updates.


Yet that's not what the parent and its parent were talking about/implying. It clearly said "blocking all Microsoft IPs".

And considering the Windows 10 upgrade was being pushed through Windows Update I'm not sure how you'd want to prevent that specific update by blocking an IP and not interfere with Windows Update as a whole.


Makes sense. Azure LBs do not support ICMP and all ping packets are dropped. You can't ping any Azure-hosted services. Kaggle.com fits the description.


I'm pretty sure it supports ICMP, as TCP/IP cannot work properly without it. I guess you mean ICMP echo. Also, there are something like four kinds of Azure load balancers, and this is only true for some of them.
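
Since a dropped ping (ICMP echo) says little about whether a host is actually up, a TCP connect is usually the more telling check. A minimal sketch using only Python's standard library; the function name `tcp_reachable` and the default timeout are my own choices for illustration, not anything Azure-specific:

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: unlike ping, this does not depend on ICMP echo
    being allowed through a load balancer or firewall.
    """
    try:
        # create_connection resolves the host and completes the handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Something like `tcp_reachable("kaggle.com", 443)` would then tell you whether the site accepts connections even when ping gets no reply. It needs network access, and it says nothing about who hosts the site.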


I can ping bing.com, but does that mean bing is not hosted on Azure? [Though it redirects to pinging something like a-0001.a-msedge.net]


It may just not use the Azure LB service (e.g. running HAProxy on virtual machines instead).


They are also known to have used F#, and even provided a testimonial to this effect: http://fsharp.org/testimonials/. Can't say if it's still used, though. That's two recent high-profile acquisitions (with Jet.com) for F# shops.

> At Kaggle we initially chose F# for our core data analysis algorithms because of its expressiveness. We’ve been so happy with the choice that we’ve found ourselves moving more and more of our application out of C# and into F#. The F# code is consistently shorter, easier to read, easier to refactor, and, because of the strong typing, contains far fewer bugs.

> As our data analysis tools have developed, we’ve seen domain-specific constructs emerge very naturally; as our codebase gets larger, we become more productive.

> The fact that F# targets the CLR was also critical - even though we have a large existing code base in C#, getting started with F# was an easy decision because we knew we could use new modules right away.


Google Cloud supports Windows, right? What would be the problem? (Honest question)


None whatsoever, unless they're heavily bought into Azure-specific services.

The idea that if you do C# you must be on Azure (or the other way around) has been outdated since Azure started. The first startup I ran tech at hosted C# on Mono in Docker containers on DigitalOcean and had devs on all 3 major OSes.


I'd be surprised if there isn't a decent amount of C# somewhere in the Google ecosystem.


I'd be interested if anyone knows anything about this. Especially given the recent updates for running .NET Core on Linux/Mac, a company like Google could make great use of C# without needing to shell out for Windows licenses.


Relevant 10 year old blog post [1].

[1]: http://steve-yegge.blogspot.com/2007/06/rhino-on-rails.html?...

Don't know how true this still holds, but there was a time at least where it sounds like anything outside of C++, JVM languages and Python was off limits.


A reverse DNS lookup on kaggle.com's IP returns a cloudapp.net hostname, which is a Microsoft Azure domain, so I think this makes sense.
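
For anyone who wants to repeat that check, here is a rough sketch with Python's standard library. The helper names are made up for illustration, and the suffix list contains only the `cloudapp.net` domain mentioned above; Azure uses other hostnames too, so treat a negative result as inconclusive:

```python
import socket

# Assumption: only the suffix named in the comment above is listed here.
AZURE_SUFFIXES = ("cloudapp.net",)


def looks_like_azure(hostname):
    """Return True if a reverse-DNS hostname ends in a listed Azure suffix."""
    host = hostname.rstrip(".").lower()
    return any(host == s or host.endswith("." + s) for s in AZURE_SUFFIXES)


def reverse_dns(domain):
    """Resolve a domain to an IPv4 address, then look up that IP's PTR name."""
    ip = socket.gethostbyname(domain)
    ptr_name, _aliases, _addresses = socket.gethostbyaddr(ip)
    return ip, ptr_name
```

Calling `reverse_dns("kaggle.com")` and passing the returned name to `looks_like_azure` reproduces the check. It needs network access, raises an error if the IP has no PTR record, and the answer today may well differ from what was observed when this was posted.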


That's really interesting to hear. I wouldn't read too much into it, I was mostly just speculating. It's quite likely that they mostly scooped them up for the rolodex that is their user database.

In any case, congrats to the Kaggle team!


I think this may have something to do with Jeremy Howard's time as president there - I remember watching a few of his tutorials a couple of years ago when he was still at Kaggle and he was really into C#.


I wonder if Nest has support contracts for any Java 6/7 they are still using.


Why does Google want to promote TensorFlow? To make people use more of their cloud offerings?


Likely to avoid their mistake with MapReduce, where by around 2011 candidates were coming in to interviews and saying "MapReduce? That's sorta like Hadoop, right?"

There's value in controlling mindshare; keep everything proprietary too long, and people just use open-source clones that may be inferior but can actually be used by the majority of the talent pool.


More specifically, Amazon Elastic MapReduce (EMR) beat Google to market. By years, if I recall correctly.


Does the downvote indicate my memory is faulty?

I believe I was already using EMR when Google's MapReduce service was announced. I'm not referring to their internal tool, but the external service.


EMR beat Google Cloud MapReduce to market, but you're forgetting that before there was such a thing as cloud services, we relied on open-source frameworks and set up our own clusters. EMR is based on an open-source framework called Hadoop, which itself was built on a closed-source Google framework called MapReduce that Google released a paper about. MapReduce came out in 2003, Hadoop in 2006, Amazon EMR in 2009, and Cloud MapReduce in 2015.

...which is sorta my point. People remember the version of the technology that makes it accessible to them, not the first one that comes out. When Google keeps things proprietary forever and only releases academic papers, people quickly forget just how far ahead they were.
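
For readers who only know the name through Hadoop, the programming model being discussed fits in a few lines. This is a single-process toy in Python to show the map, shuffle, and reduce steps; it is not Google's or Hadoop's implementation, and all function names here are invented for the example:

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse one key's values into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

The real frameworks earn their keep by running the map and reduce steps on many machines and handling failures in between, which is the part this sketch leaves out entirely.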


That's all true, but what may matter more to Google was the missed business opportunity of being first to market with a relatively easy distributed computing paradigm.


That's exactly backwards - the MapReduce paper was intentionally released as vaporware to make the rest of the industry spin its gears trying to replicate an imaginary result. And that's why we have Hadoop.


You realize you're arguing with an ex-Googler who has worked on production MapReduces that were first written around 2005 and has read the initial MapReduce commit?


I thought the MR paper described an actual working implementation. It had performance test results, descriptions of issues they encountered and solved, and some sample source code of how MR is used. It seems like a lot of effort was put in for it to be a hoax.


I imagine part of it is that businesses built on TensorFlow play nice with Google Cloud and their TPUs, but mostly I suspect it's just a mindshare thing. If Google becomes the place that all the top data scientists want to work, such that they don't even have to be poached, that's a Very Good Thing for them. It probably doesn't hurt if those data scientists come in already familiar with a tool Google uses internally.

Kind of reminds me of the genius move by Tesla to crowdsource collection of self-driving car information. Experts want to get where they have the data to train their models, and if Tesla propels itself ahead of the pack for number of miles of real-world training data, then that makes them very attractive to talent.


If all machine learning experts use TensorFlow, all the machine learning chips coming out will be highly optimized for TensorFlow. Higher competition among TensorFlow chips = better acquisition prices for Google. They also don't have to go around convincing chip makers to support TensorFlow (like they did, for instance, with the VP8/VP9 codec).


I am curious to see what will happen to TensorFlow. I hope the code will get cleaned up... I also hope they will eventually pay somebody to do it, as the open source approach has clearly produced a heterogeneous nightmare.


The rewrite in TensorFlow is somewhat worrying though, since TensorFlow is open source, meaning that there's no real benefit to google if it's written in TensorFlow (except for recruitment purposes).

It's worrying since it suggests that google might be planning to make it, or at least parts of it proprietary in the future....

For the record, I don't think that google will, but I'm still worried about the possibility....


Doubt it.

They made Angular and they didn't somehow make it proprietary.

The more worrisome stuff is when they close shop on services or completely change a framework.

TensorFlow isn't a service, so we don't need to worry about that. And I doubt they would change TensorFlow as drastically as Angular went from 1 to 2 to 3. If it does happen, the Keras library abstracts it away, iirc.

I think their goal is to get people to use their cloud services, imo. They do the same with their Nexus phones, which ship without an SD card slot to push people to the cloud.

Also, I think it's about controlling a framework instead of being at the whim of some other company. I'm looking at Oracle and Java here.

Facebook has its own NN framework. Google has its own. So they don't have politics to deal with.


Recruitment is an important purpose, though. Having a steady supply of pre-Tensorflow-trained engineers available is presumably why they opened up Tensorflow to begin with. They're not going to benefit more than that anytime soon by closing it off again.


I think their play has been building specialized hardware that executes TensorFlow better than anyone else. "You could use a GPU to do this, but check out our custom ASIC that does it 400x faster for 1/5th the cost..."


NO ONE does it 400X faster...

And the hardware advantage is easily negated. For example, our startup is building something like this.


The 400x was clearly hyperbolic – and sure. But you probably don't have one-click integration with an existing battle-tested IaaS platform.

Vertical integration is powerful, and by open-sourcing Tensorflow Google is achieving useful synergies in sales and recruiting. At their vast scale, even small ROIs (as a percentage) can be massive.


No, but our customers, i.e. AWS, will...


> It's worrying since it suggests that google might be planning to make it, or at least parts of it proprietary in the future....

That would be the Google-only Tensorflow acceleration hardware they have.



