Kite: Thoughts on Security

tptacek · on April 21, 2016

Something people should consider when thinking about tools like Kite:

If you're a contract developer, or a developer working full-time for a consulting firm, you might not have the authority to determine for yourself whether it's contractually allowable to upload code to Kite's servers. But if you're working for a pro shop, you can bet every dollar in your pocket that the contracts your firm has with its clients technically prohibit it.

Kite is really neat, but I'm a little uncomfortable with the idea that I'd have to remember to remind consulting vendors not to let their developers use it when working with my codebase.

kbenson · on April 21, 2016

Really, they need a GitHub Enterprise-like offering. A local Kite instance to run behind the enterprise firewall that allows control of updating (for those that view that as a hard requirement, as some should), that you can point your Kite client to. Kite still gets their rapid iteration and rollout, and enterprise clients don't get left out due to security. Not to mention Kite would probably make a small fortune on licensing.

Kunlun · on April 21, 2016

That would be great. I cannot use Kite for this exact reason.

anonbananon · on April 21, 2016

I used to work as a contract developer for a consulting firm. I kept the project repo in my Dropbox. Everything got uploaded. This was obviously a security issue, but nobody asked, and I didn't tell. Dropbox saved my ass multiple times. The cost of accidentally losing uncommitted changes seems much greater than the probability weighted cost of somebody 1. hacking Dropbox, and 2. giving a shit about the client's project. I would do it again. #thuglife

tptacek · on April 21, 2016

#thingsYouMostlyOnlyHearFromAnonymousAccounts

halostatue · on April 21, 2016

#butThatMostPeopleDoAnyway

metaphor · on April 22, 2016

(?<!DoD)[citation needed]

infinite8s · on April 22, 2016

I don't think that was your trade off to make.

adamsmith · on April 21, 2016

This is great feedback, and definitely something we need to address.

Initially we've been focused on giving users control and transparency. We need to extend this to employers.

One quick idea that we'd love your feedback on: a .kiteignore file that can exclude Kite from responding to files that match a certain pattern (e.g. "secrets.py"). Presumably an employer could put a "[^.]*" in a repo-level .kiteignore file, and anyone working with that repo would have to explicitly delete that file to use Kite with it.

We'd love to hear any other ideas folks might have as well!
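A minimal sketch of how a client might honor such a file, assuming gitignore-style glob patterns matched with Python's `fnmatch`. The file format and the names `parse_kiteignore` and `is_ignored` are hypothetical; nothing here reflects how Kite actually works.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def parse_kiteignore(text):
    """Return the non-empty, non-comment patterns from a hypothetical .kiteignore file."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(relative_path, patterns):
    """True if the file name or the whole repo-relative path matches any pattern."""
    path = PurePosixPath(relative_path)
    return any(
        fnmatch(path.name, pattern) or fnmatch(str(path), pattern)
        for pattern in patterns
    )


# Assumed file contents: one rule for a single file, one catch-all rule.
single_file = parse_kiteignore("# keep credentials local\nsecrets.py\n")
everything = parse_kiteignore("*\n")

print(is_ignored("app/secrets.py", single_file))  # True
print(is_ignored("app/views.py", single_file))    # False
print(is_ignored("app/views.py", everything))     # True
```

With glob matching, a repo-wide opt-out is just `*`. A negated character class is written `[!.]*` in `fnmatch` syntax; `[^.]*` is the regular-expression spelling, so the matching dialect would need to be pinned down before employers rely on it.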

fweespee_ch · on April 21, 2016

The only way to truly solve this problem is on-premise devices that do not push data to Kite, honestly.

I'd need to ignore every file I have.

LeifCarrotson · on April 21, 2016

Why does the device need to be on-premise? Would a Github Enterprise-like setup, where you have a dedicated Kite instance/VPS at their datacenter, work for you?

fweespee_ch · on April 21, 2016

Contractual agreements, NDAs, etc. might make it impossible to disclose work product and/or source code to a 3rd party which would include Kite.

> If you're a contract developer, or a developer working full-time for a consulting firm, you might not have the authority to determine for yourself whether it's contractually allowable to upload code to Kite's servers. But if you're working for a pro shop, you can bet every dollar in your pocket that the contracts your firm has with its clients technically prohibit it.

As tptacek said, you might be prohibited and/or not have the authority to do so.

> Why does the device need to be on-premise? Would a Github Enterprise-like setup, where you have a dedicated Kite instance/VPS at their datacenter, work for you?

We run GitLab on-premise so I'm not familiar enough to answer. However, if by "their" you mean Kite? Yeah, that won't work as it's essentially the same thing. [i.e. disclosing it to a 3rd party outside of my control]

simonw · on April 21, 2016

GitHub Enterprise allows you to run a dedicated instance on a virtual machine in your own datacenter, which lets you keep all of your stuff behind your own firewall.

aaronbasssett · on April 22, 2016

A .kiteignore file would be brilliant. I do a lot of contract work, but also a lot of opensource. My issue with Kite is I would forget to disable it for some projects. If I could have my default project scaffold contain a "*" .kiteignore file it would reduce this fear.

chinathrow · on April 22, 2016

A major piece of feedback was that folks need this thing and they need it on premise. In the recent update, you left out that folks should contact you at that other mail address. How so?

I can't use Kite if it's not on premise.

entitydc · on April 21, 2016

This is absolutely true. Beyond this, there's an additional concern that Kite simply hasn't been around long enough as a company to earn the trust of the people who have the authority to allow it within their teams.

If you're in the position to make a judgment call on whether or not Kite would be allowed in your team, there's a real cost to being wrong about the trustworthiness and security of Kite as a whole.

I'm intrigued by the tech, but it'll likely be quite a while before I use this with any of my teams for anything other than trivial coding in pet projects that are already open source/public repos.

dangoor · on April 21, 2016

I had the top comment on the original post and was critical of Kite's approach. I still think that JetBrains has proven that you can be very effective locally without consuming your entire machine, because the needs of one project do not require making sense of all Python libraries. You only need to make sense of the fairly small subset of libraries that a given project actually uses.

That said, I think this response is terrific. The value proposition was outlined well. Feedback loops are important and cloud services naturally have much tighter loops. Security considerations are no different than they are for GitHub, so those will not be insurmountable for many.

So, neat looking product and very nice response to initial feedback!

Zombieball · on April 21, 2016

Expanding upon this line of thought I am not sure their comment RE: storing data indexed in 32GB of RAM holds any meaning considering it requires a hop over the wire to access. Locally stored indexes on disk would likely be much faster.

I would imagine a smooth update mechanism could allow you to update file indexes without losing too much agility on feature development but come with huge security gains.

MichaelGG · on April 21, 2016

Well they say "machines" so it might be much more than 32GB. (Though on disk it might be strongly compressed.) And on HDDs, a few seeks are actually slower than a network round-trip.

Zombieball · on April 21, 2016

Uncompressed might indeed eat up tons of disk space.

Admittedly I was thinking of an SSD + some caching in RAM when thinking of this timing. But I was always under the impression an HDD seek would be on the order of tens of ms? I'd assume a query to a cloud service would be on the order of 100-200ms.

MichaelGG · on April 23, 2016

The network latency should be somewhere around 40ms assuming you're in the US on a proper connection. If a query takes 10 disk seeks (10ms each) vs 10 RAM accesses + HTTP parsing...
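The comparison can be spelled out as a quick calculation. This is a rough sketch using the figures quoted in this exchange; the RAM access time and HTTP overhead are placeholder assumptions, not measurements.

```python
# Figures quoted in the thread.
SEEK_MS = 10.0            # one HDD seek
SEEKS_PER_QUERY = 10      # seeks for a single query
NETWORK_RTT_MS = 40.0     # US round trip "on a proper connection"

# Assumptions made only for this sketch, not stated in the thread.
RAM_ACCESS_MS = 0.0001    # roughly 100 ns per main-memory access
RAM_ACCESSES_PER_QUERY = 10
HTTP_OVERHEAD_MS = 1.0    # placeholder for request parsing and serialization


def local_hdd_ms():
    """Latency of a query answered from an index on a local spinning disk."""
    return SEEKS_PER_QUERY * SEEK_MS


def remote_ram_ms():
    """Latency of a query answered from RAM on a remote server."""
    return NETWORK_RTT_MS + RAM_ACCESSES_PER_QUERY * RAM_ACCESS_MS + HTTP_OVERHEAD_MS


print(f"local HDD index:  {local_hdd_ms():.1f} ms")
print(f"remote RAM index: {remote_ram_ms():.1f} ms")
```

Under these numbers the remote lookup comes out ahead (about 41 ms against 100 ms). The result flips if the round trip is closer to the 100-200 ms estimated earlier, or if the local index sits on an SSD or in RAM.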

stegosaurus · on April 22, 2016

It's not really about 'security being addressed', we all know that. It's about marketers creating the illusion that security has been addressed for end users that don't understand it anyway. In the case of stuff like Windows, it's about eventually forcibly removing control from the users by pushing updates.

I think it's dishonest to not just simply post that 'if you're afraid of the cloud, you are not our target market, go away'. I guess it's a capitalism thing. Wouldn't look good to investors, or whatever.

I am sad because I have a beefy machine, and I want to use this, but I can't. I'd pay for it, you know? But I don't 'do' SaaS, the reasons are too long to list here.

More concretely, 32GB RAM is trivial, and preselecting my languages is... I've already pre-selected them! It takes months, years to learn a language :P

Kite looks really super cool.

nikolay · on April 21, 2016

You "launched", really? Maybe internally, but I haven't received even an email confirmation that you've got my "signup request". As far as most of us are concerned, you launched a video.

polartx · on April 21, 2016

Second that. I never got so much as a confirmation email when I signed up. Signed up again today. I'd really love to have a tool like this while I attempt to learn programming.

educar · on April 21, 2016

Many people here are complaining about security because of the nature of the 'cloud'. This is what I thought when GitHub came out as well. But look today: everyone has their code (supposedly their IP) on GitHub. Ultimately, Kite's track record on the cloud will trump any security considerations.

welder · on April 21, 2016

The difference here is GitHub doesn't auto-upload Python files without me knowing. If I open my secrets.py file I know it's not getting uploaded to GitHub because I see it's absent from the staged files.

I don't think Kite prompts you before uploading every time you open a file.

franciscop · on April 30, 2016

Maybe add a .kiteignore then?

MichaelGG · on April 21, 2016

Really? GitHub's whole business is selling on-prem versions of GitHub.

educar · on April 21, 2016

Do you know if GitHub has released any public information about this? From what I can tell, they were a profitable, successful company even before they went and took some VC funding, and at that point they didn't have on-prem. In general, most big companies I know use the Atlassian suite instead of GitHub for code.

richard_mcp · on April 21, 2016

This sort of open communication is a great way to start building trust with users (both relating to security and otherwise). I was thrilled to see the devs responding to comments positively in both the HN and reddit threads.

I'm looking forward to seeing Kite expand their security support and I can't wait to try it out on Linux.

zuck9 · on April 21, 2016

Reposting since it wasn't answered earlier:

Even though I trust you, there's no way anyone can guarantee that a hacker won't get into your database and get my proprietary source code.

I'm no security expert but one way I can think of is creating an encryption system which works like this: all my source code will be stored encrypted on your (non-ephemeral) databases. The decryption key will be stored on my computer, and it'll be transferred to the server when I run Kite and destroyed as soon as I quit Kite. The key will be stored in your server only in an ephemeral storage (in-memory database etc.) Do you have something like this in the works?
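A rough sketch of the key lifecycle described above, under loud assumptions: `HypotheticalServer` and its methods are invented for illustration, and `keystream_xor` is a toy stand-in cipher that is NOT secure. A real system would use a vetted authenticated cipher.

```python
import hashlib
import secrets


def keystream_xor(key, nonce, data):
    """Toy stream cipher (SHA-256 in counter mode). NOT secure; stand-in for a real cipher."""
    out = bytearray()
    for block_start in range(0, len(data), 32):
        counter = (block_start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[block_start:block_start + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class HypotheticalServer:
    """Illustrative only: ciphertext is persisted, keys live in memory per session."""

    def __init__(self):
        self.at_rest = {}       # persistent store: (nonce, ciphertext) only
        self.session_keys = {}  # ephemeral, in-memory only

    def start_session(self, user, key):
        self.session_keys[user] = key

    def end_session(self, user):
        self.session_keys.pop(user, None)

    def store(self, user, name, source):
        nonce = secrets.token_bytes(16)
        key = self.session_keys[user]
        self.at_rest[(user, name)] = (nonce, keystream_xor(key, nonce, source))

    def read(self, user, name):
        nonce, ciphertext = self.at_rest[(user, name)]
        key = self.session_keys[user]  # raises KeyError once the session has ended
        return keystream_xor(key, nonce, ciphertext)


client_key = secrets.token_bytes(32)  # generated and kept on the developer's machine
server = HypotheticalServer()
server.start_session("dev", client_key)
server.store("dev", "app.py", b"print('proprietary')")
print(server.read("dev", "app.py"))
server.end_session("dev")  # the server can no longer decrypt what it holds
```

Note what the sketch does and does not protect: data at rest is unreadable between sessions, but for as long as a session is active the key sits in server memory alongside the ciphertext.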

nickles · on April 21, 2016

The approach you described is largely security theater. Supposing an attacker has compromised a machine and is capable of retrieving stored data, it is reasonably likely that the attacker will be capable of either capturing the key as it is transmitted or reading the key while it is stored in memory.

If you start with the assumption that a machine is compromised, then there's not really a way to guarantee secrecy of anything done on the machine. Homomorphic encryption resolves this, but (as far as I'm aware) it is too computationally expensive to be viable at present.

CameronBanga · on April 21, 2016

Some things I didn't see mentioned (I may have skipped over them) which would be awesome:

* Let us delete all of the data we have stored on your servers, whenever we want.
* Let us see all of the data we have stored on your servers, whenever we want.

I really don't care how you're manipulating it, but would like to see (and additionally delete) any information you have stored on me that I'm uncomfortable with.

LeifCarrotson · on April 21, 2016

While those would help with consumer confidence, the reason the EULA probably disclaims all that is because the data is very hard to track down.

It may be on magnetic tape backup in archives. Do they have to dig out each roll of tape each time for each customer who wants to erase a segment of their data?

It may be indexed or used as input to shape a larger or disconnected part of the software. It shows up in logs, and therefore in archived statistics analysis. It may have been copied and modified incrementally by a dozen other users. Should all these developments from user data be destroyed because they are based on old data with a delete request?

Perhaps an opt-in to a feature that doesn't back up or analyze your data for an hour could help prevent "committed the password to version control" accidents. But it would need to be opt-in, because the whole point of Kite is that your code is indexed in real time, right?

adamsmith · on April 21, 2016

We agree; these features are at the top of our list. : )

Alex3917 · on April 21, 2016

Not (directly) related to the security, but I was wondering if you were inspired by any research from academia when creating this. I only ask because I just watched a talk from a Stanford professor from 2012 where he talks about anonymizing and aggregating everyone's code in the cloud to create better documentation, albeit as a one sentence aside at the end of an otherwise unrelated talk.

adamsmith · on April 21, 2016

The initial idea came from thinking about live contextual search and community. This thinking intersected with programming because of my background, and once that happened there was a lot of iteration—informed by research, and mostly conversations with friends—to get where we are today.

mbrock · on April 21, 2016

I'd like to hear a statement like this that explicitly acknowledges something like, "as a private company funded by Silicon Valley investors, we need a clear way to capture more and more value, and that's a big reason why we want to collect your data on our servers and use a client model for our proprietary algorithms."

That's how it works, we all get it!

borski · on April 22, 2016

For the record, we went through this too. Kite is awesome, and I have no doubt it will succeed, but we fought the enterprise virtual appliance train for years. The cloud grew, we got more customers, but there were certain verticals we could never reach: finance, government, etc.

We recently built a virtual appliance. It's growing infinitely faster than our cloud solution ever did. Those numbers speak for themselves.

Certain products just need to have the virtual appliance option. I'm sure Kite will get there one day.

For now, I'm going to use it for personal projects because it's still a badass solution to a problem I have. :)

chinathrow · on April 22, 2016

"Some folks still use Garmin GPS due to privacy concerns, but most of the world uses internet-connected navigation for its many advantages: fresher maps, more coverage, better tuned navigation algorithms, better user experience because iteration is 10x cheaper, etc."

There is one thing missing: people use Google Maps, Waze etc simply because it's free.

andy_ppp · on April 21, 2016

Just out of interest is there a plan to make Kite work on an iPad rather than just a window on my computer - even some kind of screen sharing with (basic) touch support would be excellent.

I will probably never use it because it scares the crap out of me that I'll type my password in the wrong window though :-)

_wldu · on April 21, 2016

This is more about Thoughts on Privacy than security.

nikolay · on April 21, 2016

Unfortunately,

    Convenience > Security

Laziness is both a curse and a blessing.

pbreit · on April 21, 2016

So how do I sign up?

stronglikedan · on April 21, 2016

https://kite.com/

You sign up for an invitation. I am curious to know how long that invite typically takes. I just signed up about an hour ago.

pbreit · on April 21, 2016

I "signed up" last week. Nothing yet.

_ikuh · on April 21, 2016

Same. I have heard no anecdotes of people being invited yet either.

educar · on April 21, 2016

I was looking at this as well. Is there only a video?

nikolay · on April 21, 2016

Yup, they launched a video not a service.

_mhyx · on April 22, 2016

franciscop · on April 21, 2016

> "we believe we will set industry standards that will be adopted across multiple categories of tools such as continuous integration and code review systems"

Excuse me? Why is that? It sounds like either they think they are programming wizards or they believe the CI folks are incompetent, neither of which signals a company I would like to trust.