That works well for small repos or a few repos, but if you want to find all .cc files, across all release branches, in your entire company and check them for some exploit, it is helpful to have a VFS. It also means you can support N SCMs through one API: you just write a new VFS backend.
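A minimal sketch of what such a VFS layer might look like (the class and method names here are hypothetical, not taken from any real product): company-wide tooling is written against one abstract API, and each SCM gets its own adapter.

    # Hypothetical pluggable VFS interface: one adapter per SCM,
    # tooling written only against the abstract API.
    from abc import ABC, abstractmethod
    from typing import Iterator

    class RepoVFS(ABC):
        """Read-only view of one repo/branch, regardless of backing SCM."""

        @abstractmethod
        def walk(self, path: str = "/") -> Iterator[str]:
            """Yield file paths under `path`."""

        @abstractmethod
        def read(self, path: str) -> bytes:
            """Return the file contents at `path`."""

    def find_exploit(vfs: RepoVFS, needle: bytes) -> Iterator[str]:
        # Works unchanged whether `vfs` is backed by Git, Perforce, SVN, ...
        for path in vfs.walk():
            if path.endswith(".cc") and needle in vfs.read(path):
                yield path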


Isn't there already a good way to push computation closer to the data?

GmailFS, pyfilesystem (a userspace filesystem abstraction), and rclone are neat as well.

https://stackoverflow.com/questions/1960799/how-to-use-git-a... explains the `git push` step that git-remote-dropbox enables: https://github.com/anishathalye/git-remote-dropbox


GitHub also has a code search now: https://cs.github.com


Needing to tie into a specific API (like code search) couples you to a specific storage backend (GitHub). If you build your software to operate on a POSIX-y file system, you can support anything that shows up as a file system: a local working tree of files, an NFS share, or now a remote git repository.
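To make that concrete, here is a rough sketch assuming nothing beyond POSIX-style file access via Python's standard library (the mount path is illustrative). The same function works on a local checkout, an NFS mount, or a FUSE-mounted remote repo; the backend is invisible to the code.

    # A grep that only assumes POSIX-style file access.
    import os

    def grep_tree(root: str, needle: bytes) -> None:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    with open(path, "rb") as f:
                        for lineno, line in enumerate(f, start=1):
                            if needle in line:
                                text = line.decode(errors="replace").rstrip()
                                print(f"{path}:{lineno}: {text}")
                except OSError:
                    continue  # skip unreadable entries (broken symlinks, etc.)

    # e.g. grep_tree("/mnt/repo", b"TODO")  -- path is illustrative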


Running the code where the data already is saves network transfer: with data locality, you don't need to download each file before grepping.

The Wikipedia section on locality of reference explains how the cache-miss penalty applies to optimizing e.g. matrix multiplication: https://en.wikipedia.org/wiki/Locality_of_reference#Matrix_m...
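For illustration, here is a toy version of the classic loop-interchange example from that article: the ijk order walks B column-by-column (cache-hostile strides), while ikj streams along rows of B and C. Pure Python won't show the full speedup (interpreter overhead dominates), but the access pattern is the one the cache sees.

    # C = A x B two ways, differing only in memory access pattern.
    def matmul_ijk(A, B, n):
        C = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                s = 0.0
                for k in range(n):
                    s += A[i][k] * B[k][j]  # B accessed column-wise: poor locality
                C[i][j] = s
        return C

    def matmul_ikj(A, B, n):
        C = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for k in range(n):
                a = A[i][k]
                for j in range(n):
                    C[i][j] += a * B[k][j]  # B and C accessed row-wise: good locality
        return C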



