The single most important criteria when replacing GitHub (joeyh.name)
211 points by edward on June 6, 2018 | 131 comments



I’ve always thought of version control systems and issue trackers as separate products. GitHub just happens to implement them both in one place.

As others are alluding to in the comments, I’m highly skeptical of coupling the issue tracker and other non-VCS features with git. A far better solution would be to keep them decoupled, but easily pluggable and extendable.

In fact, this is basically the status quo with any issue tracker that has git integration (the ability to tag issues in commits, etc.). If this were the only problem GitHub solved, it would be competing with Pivotal and the like.

But issue tracking is not the only problem GitHub solves. The reason GitHub grew so large is that it created a community within a single space. It gave devs a place to collaborate on and discover projects, within an environment highly integrated with VCS and issue tracking.

People don’t publish to GitHub for the issue tracker or repo hosting. Those are solved problems and many companies even use separate software for them. People publish to GitHub for the visibility, discoverability, and community.

There is no technical solution that can usurp the advantages of GitHub as a social network, just as creating Mastodon is not enough to kill Twitter.

The technology features of GitHub are easily substitutable. The community is not. Network effects are real, and they are the reason Microsoft felt confident paying $7.5bn for a platform with no major technical differentiation from its competitors but a huge, defensible moat of mindshare rooted in network effects.


I feel like the HN community fails repeatedly to really grok this concept. They focus on the negatives of having a single dominant player for a service (with worries about monopolistic practices and stagnation), but completely ignore WHY these single providers become dominant.

There are HUGE network-effect benefits to having a single dominant provider. Right now, if I am looking for a code library, I pretty much only use the ones I find on GitHub, even if a Google search shows me projects hosted elsewhere.

I want to be able to fork, clone, and contribute back without having to create accounts on other VCS sites. I don't want to learn another interface, or have to remember which site had which project. I want to be able to have a single list of 'starred' projects that I am following. I want to only have to learn one system and check in one place.

No matter what, that is just easier as a user. I know there are costs, but those really have to outweigh the benefits to make multiple providers worthwhile. I have yet to see a federated system that solves these problems, and the fact that no federated system is as popular as the centralized ones leads me to believe it might be an unsolvable problem.


It works both ways. If you're too lazy to learn a service I chose to host my code, you don't deserve my code.

(Sorry to put it harshly, but it feels like we're all getting a bit spoiled. Especially compared to the old days.)

For better or worse, you are the product if you use a centralized, free service. And the only thing keeping those dominant players dominant is the blind loyalty we seem to give freely.

It's all about that convenience! I get that. But morals have their place, and in the coming months the world will show whether it cares at all about centralization.

It's probably important to stay nervous about centralization. If we let our guards down, we could find ourselves on the short end of an upsetting stick. History has shown time and again that when companies have no incentive to compete, they tend not to try. The reason Facebook can be so free with your data is that there is no way to compete with them. With github, at least there's a way.

Honestly, the biggest thing holding back Github competitors is that they won't just make their service look identical to Github. Gitlab looks strange every time I run across a repo. Instead everyone wants to be different, and usually it's not better.


I use (for example) 50 libraries, and want to submit bug reports and pull requests to all of them over the course of 5 years...

and your attitude is "well... if you don't make 50 accounts, you don't deserve it!" Really? That's also why we have package repositories -- npm, NuGet, PECL, Composer. Should I also register on 500 websites just so I can build a website or two?

Also, GitHub is free for open-source projects but paid for teams and enterprises. It's a win-win: they do social good and get paid for private service.

Alternatives and competition are good, but too much competition is not that great either in this case.


This is what I like about collaborating on projects that still use mailing lists. I already have an account that works for every mailing list in existence -- my email account.


> they do social good

Maybe, but maybe I disagree with how they treat their female employees, or maybe I don't like that they financially support some political thing or whatever.

Capitalism doesn't work without real competition. I shouldn't be obliged to do business with this one particular company if I want to develop Free Software. My point isn't that GitHub is evil; it's that each person should be free to decide that individually.


Correct, capitalism isn't inherently evil, and competition is healthy. That's why we have GitHub, Bitbucket, SourceForge, GitLab, Gitea, Phabricator, and probably a few more projects like this. However, imagine if we had thousands and the open source community were, give or take, evenly distributed between them. That would simply be a nightmare: discoverability would be pretty low, contributions would be even lower, and the community wouldn't be thriving as it is now. Path of least resistance and all that...


> imagine if we had thousands

First you spoke of 50, then 500 (web sites), now thousands (yet you only managed to name 6)... This is starting to resemble reductio ad absurdum.

You haven't really made a convincing argument for why we might have excessive variety, such as an unusually low barrier to entry compared with other open source tools.

Moreover, I'm pretty sure your arguments could also be used in favor of federation, rather than centralization.


Why would discoverability be low? Search engines exist.

Why would contributions be low? Presumably we'd have a common way to contribute from your own federated instance.

There are thousands of websites (maybe even more!) and it's possible to discover them and comment/upload/whatever. Would it be better if they all moved to Medium, Wordpress.com and Facebook?


> It works both ways. If you're too lazy to learn a service I chose to host my code, you don't deserve my code.

I have infinite work that needs to be done and a finite amount of time. In my experience, only looking at GitHub maximizes my productivity; I can get almost everything there, and the returns for learning another system do not make up for the time it takes. This isn't about laziness; it is about choosing to put effort where I get the most value for it.

> For better or worse, you are the product if you use a centralized, free service

I pay for github (7 bucks a month for a personal account, and my company pays $100k+ for github enterprise).

I think the thing we should be nervous about is not centralization, but vendor lock-in. As long as we can switch, we are OK. Making sure we use abstractions in our interactions with the GitHub API helps with this.

However, as long as they keep providing the best service, I will keep using them.


I would not say that GH offers the best service. There are lots of features they lack compared to Bitbucket and GitLab. I think it is about the interface, and that we are just too used to it and too lazy to retrain our brains... again.


> It works both ways. If you're too lazy to learn a service I chose to host my code, you don't deserve my code.

Not really; you just get fewer people using, contributing to, and testing your project -- software should stay where people can find it, not scattered around SourceForge, Google Code, etc.


> It works both ways. If you're too lazy to learn a service I chose to host my code, you don't deserve my code.

Linus is that you?

[1] in case you don’t get the reference.

But in all seriousness this is a terrible attitude to have. One of the things I enjoy about being a developer is the community and collaboration and this type of thinking is the antithesis of that.

I’d much rather have centralised tools and a distributed/diverse development community than decentralised tools and isolationist community.

[1] https://github.com/torvalds/linux/pull/17#issuecomment-56546...


> I have yet to see a federated system that solves these problems, and the fact that no federated system is as popular as the centralized systems lead me to believe it might be an unsolvable problem.

A dominant siloed service is precisely what stops a federated network from emerging. This is why I'm very pleased that Microsoft has bought GitHub, because it's apparently jolted a lot of people into realising that GitHub is a single dominant provider.

I think GitHub became dominant the same way Twitter and Gmail did. Techy early adopters ignored the fact that it was proprietary, because it was a nice service among many options; but this growth and the network effect led to an effective monopoly:

1. Neat tech demo, siloed but harmlessly tiny

2. Techy early adopters start to rely on it because it's genuinely useful

3. Network effect brings in the masses

4. Scalability problems: interoperability is still desired, but less urgent than fail whales

5. Investors invest; money pays for scaling up and fixing fail whales

6. Investors get itchy at loss-making, want a return on their investment

Now you have the masses using your service, and any competitors are ghost towns. Investors want money and your most valuable asset is your user base. Interoperability would make it easier to lose your users, so lock-in becomes essential to your business strategy.

If enough people jump into the silo before federation works, federation will never be added. Avoid supporting proprietary networks.


I find that this network effect really makes me want to improve my projects, too. I've worked for hours on README files, making sure I had my licenses in check, and crafting the short repo description to be informative and useful for this reason [0]. If it's on GitHub, I feel like it needs to be presentation ready. Not everyone feels this way, but I do.

If I just need a git repository, I store it locally. Git is a DVCS for a reason -- most single person projects don't need a remote anyway. This makes GitHub the home for my more permanent projects, with local git for everything else.

[0]: https://github.com/Pryaxis/TShock


>If I just need a git repository, I store it locally. Git is a DVCS for a reason -- most single person projects don't need a remote anyway.

I use a remote as an easier-to-maintain-and-test backup than trying to maintain a mirrored local repository on another drive.
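
For anyone wanting the same setup, a mirror push covers every ref in one command. This sketch uses a local bare repo to stand in for the remote host (all paths here are hypothetical; in real use the backup remote would be an ssh or https URL on your hosting provider):

```shell
# Scratch repo plus a bare "backup" repo standing in for a remote host.
work=$(mktemp -d)
git init -q "$work/project"
cd "$work/project"
git -c user.name=me -c user.email=me@example.com \
    commit -q --allow-empty -m "initial commit"
git init -q --bare "$work/backup.git"

# In real use this would be e.g. ssh://backup-host/srv/git/project.git
git remote add backup "$work/backup.git"

# --mirror pushes (and prunes) every branch, tag, and other ref in one go.
git push -q --mirror backup
```

Testing the backup is then just cloning from it and checking that the history is intact.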


> I use a remote as an easier-to-maintain-and-test backup than trying to maintain a mirrored local repository on another drive.

You should really consider getting a good backup solution, like Tarsnap or Backblaze, to follow the "two is one and one is none" principle of backups. That is, if you haven't already.


I already use Backblaze, as it's important to have off-site backups whenever possible. However, needing to wait several weeks to download my data is not always an option. I'm a digital hoarder, and by that I mean I have >30TB of data. My backup solution for most things is 3 local copies and 1 remote copy. Files I don't deem important enough to pay to store 3 times locally aren't backed up locally, but they are still backed up with Backblaze.

Some things are easier to entrust to someone else (e.g. git repos) than to maintain as local mirrors. A Backblaze backup of a git repository also doesn't do me much good if I want to grab it quickly (minutes to download from GitLab instead of hours/days from Backblaze), so it makes more sense to use GitLab as a remote so I can quickly recover it if I lose my local copy.

All of that aside, it's still easier to maintain-and-test a remote repository than to maintain-and-test local repositories.


I intentionally leave some of my projects in a super-raw state as a way to obfuscate their true intended purpose; utilizing a public [but obscured] repo as a quasi-private one. :)

I realize not everyone feels this way, but I do.


Why not just host them somewhere private repos are free of charge? (i.e. GitLab, Bitbucket)


It's like putting something on the blockchain-- I want it there so that I can point to it from the future as "my work" but without attracting too much attention in the meantime.

If the attention comes, no worry. If it doesn't, perfect.


There are a lot more eyes on those repositories than you think. Things like misplaced API tokens are vacuumed up nearly instantly. You'd be much, much better off hosting private stuff on GitLab or BitBucket.


I do have the really secret stuff on BitBucket. :)


Appreciate all the hard work! TShock is great :)


Thanks! It's a community effort, though, and I only deserve a small fraction of the credit. I'll pass along your message to my team.


I haven't found code yet that I would only keep locally in a git repo. Can you give an example of that kind of code?


I don't know about other devs, but I have a decently large directory of Python scripts that are kinda single-use, such as scripts to reformat a JSON file that is screwed up in some oddball way. I used git for version control, and I save them on the off chance they'll be useful someday for cut-and-paste, but there's no need to store something I may never use again remotely. In fact, the only reason they aren't deleted is that the space saved wouldn't be worth it when archive disks are so cheap. If my house burns down or the NAS is stolen, that code is the last thing that will be on my mind.


> Right now, if I am looking for a code library, I pretty much only use the ones I find on github, even if the google search shows me projects hosted on other places.

I don't know what ecosystem you're working with, but I typically find code libraries for my projects using the Python Package Index (PyPI). A lot of packages on PyPI are hosted on GitHub, but by no means all. pip can install packages from sources other than PyPI or GitHub. It's not that difficult.
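
As a concrete example of pip pulling from sources other than PyPI (the repository URL and version tag here are made up for illustration):

```shell
# Standard install from PyPI
pip install requests

# Install straight from a git host, pinned to a tag
pip install "git+https://github.com/example/somelib.git@v1.2.0"

# Install from any URL serving an sdist or wheel
pip install https://example.org/packages/somelib-1.2.0.tar.gz

# Or from a local checkout
pip install ./somelib
```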

Regarding creating accounts, yes it's a speedbump to have to create multiple accounts but using a good password manager with autologin features can take away a lot of that pain.


>I feel like the HN community fails repeatedly to really grok this concept

No. Two weeks ago, if you took the temperature of the crowd, it would have seemed dead set on singing the praises of GitHub centralization, encouraging others to put all their eggs in that basket, and pretending open source doesn't exist unless it's on GitHub.

The popular conversations here have shifted. It probably indicates some shift in what the audience really thinks, but mostly it just shows what's popular to bikeshed this week.


> WHY these single providers become dominant.

In this case, providing a free "let's pretend we're doing open source" play area to millions of people who can't code their way out of a wet paper bag.


> can't code their way out of a wet paper bag

There was no lack of competent, original code on GitHub last time I checked... What are you referring to?


And this is exactly why it's wrong.


Exactly. But an OSS community is very different from Twitter or Instagram. People will migrate to another website if Microsoft screws this up. Even if they don't screw it up, a lot of people are already willing to move because they hold grudges or are against a company like Microsoft owning the space.

If a GitLab or Gitea-backed community surfaces managed by a foundation, GitHub will be vulnerable.

Right now there isn't much of a point in moving to GitLab.com because they could be acquired as well.


> As others are alluding to in the comments, I’m highly skeptical of coupling the issue tracker and other non-VCS features with git.

What do you mean by coupling? Are you talking about what Gitlab Omnibus does by bundling an issue tracker and other web services along with its core git functionality?

If so, you're wrong -- GitLab's approach helps a busy maintainer get their work done without being hemmed into a service like GitHub. The issue tracker's defaults are sane, non-technical users can easily sign up using the other bundled services and communicate through it, and it gets upgraded as a side effect of upgrading GitLab itself.

If I had instead used one of your claimed "far better solutions," I'd have spent hours researching issue trackers and installing/configuring one of them. That's time taken away from developing the software I'm housing in Gitlab.

Or, even worse, I'd have asked on the dev list, "what's a good, solid open source issue tracker?" And we'd have bike-shedded for 10x the time.


> I’ve always thought of version control systems and issue trackers as separate products.

Agreed, but coupling has extreme benefits. No matter how stringent you are on your commit message requirements, you'll never capture the scope of a code change as well as the originating issue.

They may be separate products, but the need for a cross-vendor common interface/implementation persists. Also, the ability to take your ball and play elsewhere is required too. If a git-like approach were taken towards issue management and all of these platforms could implement/use it, nobody would ask to piggy back on git. Then again, you can apply this same discussion to any form of structured, persisted digital content...nobody wants to be locked in. Just so happens that the code part happens to have an unencumbered implementation that others have embraced.


> I’m highly skeptical of coupling the issue tracker and other non-VCS features with git.

But isn't it nice that you can say "issue X has been solved in commit Y"?


You can do that without having the repo and the issue tracker in the same product. For example, at work we use Jira for issue management and GitHub for code, and with a Git Jira plugin, they integrate pretty seamlessly in both directions.
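
For what it's worth, both directions usually ride on commit-message conventions; e.g. GitHub's closing keywords and Jira's smart commits (the issue numbers below are invented, and Jira transition names depend on your workflow configuration):

```shell
# GitHub: closes issue #42 once this commit lands on the default branch
git commit -m "Fix race in job queue

Fixes #42"

# Jira smart commit: adds a comment to PROJ-123 and transitions the issue
git commit -m "PROJ-123 #comment fix the race in the job queue #resolve"
```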


Yes it’s nice, and there are dozens of products that allow you to do that. My point is that this is not the main value-add of GitHub, and discussions of “replacing GitHub” that center around re-implementing these features are missing the point. The value-add of GitHub is its community and years of developer mindshare. There is no simple technical solution to replacing that.


You seem to focus on step 2, i.e. having community and mindshare, while completely ignoring why all those people got there in the first place and why it gained such momentum: step 1.


Step 1 for a category defining product will never be the same as step 1 for a future competitor to that product. What worked for GitHub as step 1 will never work as step 1 for any service that replaces GitHub, because GitHub’s very existence changes the environment that allowed their step 1 to enable step 2.

This is the luxury of first-mover advantage. GitHub only had to implement “good enough” features to attract a community. Any competitor that usurps GitHub will not only need to implement the right features, but also figure out how to move the community from GitHub to the new platform.


It feels like there might be two different uses of "coupling" here.

The linked article argues that GitHub locks in users by not storing everything in git, which is what the post you're replying to is skeptical of. (I am, too.)

Being able to say "issue X has been solved in commit Y" is coupling the issue tracker and source control at a user experience level. The back end solutions are immaterial; they just need to be talking to one another, whether it's through plugins, or through proprietary features like GitHub already has (in which commits can be automatically linked to issues and vice-versa, merging PRs can automatically close issues, etc.).


That’s a matter of preference. A git repo will probably outlive fashionable-issue-tracker-of-the-day, so it might make sense to not embed this sort of information right inside git commits.


This also accurately describes what Microsoft purchased with LinkedIn. It still isn't the best platform in terms of what it covered but there's immense value in the community that was purchased. Because I see the pattern now, I'm curious what other communities will get folded into the MS ecosystem as a result in the future.


Skype was also a similar purchase. The pattern seems to be about dominant companies in the productivity/development space.

I wouldn't be surprised if companies like Atlassian, Slack, Jetbrains, Docker or Hashicorp are next.


> GitHub just happens to implement them both in one place.

Anyone who has version control and bug tracking under one roof in one company has them "in one place". Plenty of organizations have them integrated with some sort of common "dashboard", and multi-way navigation between tickets, commits and reviews.


If you are only interested in the version control part and you are a Mac user, you can give EasyGit (https://easygit.me) a try. It stores your repos privately on iCloud and it doesn't use any 3rd party servers or analytics. In fact its sandbox doesn't allow any outgoing network connections, apart from talking to iCloud.

Disclaimer: This is my app and btw I'm running a WWDC promotion and you can download it for free till Friday: https://itunes.apple.com/us/app/easygit/id1228242832?mt=12


Exactly this. I can discover and contribute to many projects (creating issues, submitting pull requests, with discussion) without having to learn the hundreds of subtly different UXs that each "distributed" system chooses to implement.

And I know most other developers can also contribute to my projects in a familiar way.


Even if I ended up migrating from github, I would still look for a centralized instance.

Why?

The price of one migration every few years is not a big deal compared to self-hosting, which is a human-resource hog for small entities.

I have more important things to do than to deploy, configure, maintain, document, secure, and test my own instance. I really don't want to admin one more server just for that. I don't want to play cat and mouse with spiders, brute forcers, or DDoS events.

And I do want a huge visibility and community around the tool I use. And my users may not want to create yet another account to open a bug ticket.


I feel like everybody is overestimating the cost of self-hosting. I run my own instance of GitLab, and the only "costly" part was setting it up and making backups. Since then I just "apt-get upgrade" and there's a new version, and that's it. Since I set it up, I haven't touched any of the configuration files.

The availability guarantees are much easier to keep for a small team than for thousands, so you don't really have to worry about that either.


How often do you test your backups? How many places (physical) do they exist? How often are they taken?

What’s your server’s uptime? Is it highly available? How often do you patch? Do you test them first? How do you keep up with security updates from other software on the machine(s).

How is your security team in general? Do they monitor for stolen credentials? Do pen-tests? Watch for network scans and attacks? Look for weird traffic patterns?

Do you have redundant network links? Backup generators? Do you test them? Sure your datacenter SAYS they have them, but remember when Intuit went down for 2 days a few years ago? I bet they had told their bosses they had generator backups.

Is someone always on call? What are they experts in? How do you get support?

GitHub provides a TON of stuff above a cheap VPS or EC2 instance. And that’s not including the network effects of having so many projects in the same site.


> What’s your server’s uptime? Is it highly available?

Who cares about uptime, when you use a server alone or with a small team? It can be down for 10 hours and you would not even notice.

You don't understand what you need.

> How often do you patch?

I patch the machine multiple times a week with "apt update && apt upgrade", which takes literally less than a couple of minutes a week.

> Do you test them first?

I don't need to, because I'm using ZFS and make a snapshot before upgrades. So in case of a disaster I can roll back within minutes, which has already happened multiple times, once when the machine didn't even boot.
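
For readers wanting the same safety net, the workflow looks roughly like this (the dataset name is hypothetical, and the commands obviously require a ZFS root):

```shell
# Snapshot the root dataset before touching packages
zfs snapshot rpool/ROOT/default@pre-upgrade

apt update && apt upgrade -y

# If the upgrade goes sideways, roll the whole filesystem back:
#   zfs rollback -r rpool/ROOT/default@pre-upgrade
# Once you're happy the upgrade is good, drop the snapshot:
#   zfs destroy rpool/ROOT/default@pre-upgrade
```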

> How is your security team in general? Do they monitor for stolen credentials? Do pen-tests? Watch for network scans and attacks? Look for weird traffic patterns?

You don't need any of that. If you think you do, you think you are far more important than you really are. Maybe framework authors or super-popular library writers would need to worry about that, but self-hosting gives you an order of magnitude lower chance of being a target.

> Is someone always on call? What are they experts in? How do you get support?

We might not talk about the same thing...

> And that’s not including the network effects of having so many projects in the same site.

This is the only valid point you came up with.


I'm not sure what to call this counterpoint to the more general mantra of "cloud is cheaper and easier than rolling your own!" (with an implied "always" in the middle).

Maybe the ineconomy of scale? Except it's not really about scale but about customization, so maybe the "economy of customization".

> You don't understand what you need.

Unfortunately, I think this statement is more often true than not, and it's much easier to go with what's popular right now than to form that understanding.

Of course, there's some merit to that strategy if one is new, but I can't believe that many people are that new.


Yet github has still had outages from being the target of nation-state attacks. The bigger the service, the worse the enemies.


I have the simplest self-hosted WordPress blog. That's not really what I'd call "critical infra". It still goes down a few times a year.

My non-self-hosted blog? Down maybe a few hours a year, and I don't have to do anything to restore it.


Aren't you forgetting a little something?

Buying the server, reading the docs, setting it up, buying a domain name, pointing it at the server, getting the backup server, setting that up, testing that everything works, training the team on the new UI, documenting the whole thing for when you are not available or it's not your job anymore, blocking a team of 5 devs from working for 2 hours (a man-day of work) because something went wrong just once...

It's not just resources you spend on useless things. It's resources you take __from__ useful things.

Or worse, you are in a company. Now you have to REQUEST a server, ask a sysadmin to set it up, get the security team to approve your software, get the budget people to validate the payment, and earn the blessing of your network authorities.

Then, you change job/client/whatever. What do you do ? Do it again ?


Isn't that why GitLab (and others) publish things like AMIs where config is easy and takes 10 minutes? https://about.gitlab.com/aws/


Aaaah, the infamous "10 minutes" sales pitch.

I think that "10 minutes" pitch must be about ten years old by now.


It doesn't take long to set up, IIRC; the problem when I used GitLab was that it was resource-intensive. It somehow requires a $10/mo instance when it should be able to run as a side process on a $2.50/mo server, IMO. $10/mo was the same pricing as GitHub, so it didn't make sense to use.


Honestly, for a resource-efficient Git service people should use (and I've seen it shared in places lately) Gitea or Gogs. Gitea is a fork of Gogs but has picked up pace and has multiple maintainers who care a lot about it. It uses minimal resources and looks good. A $2.50 server on Vultr with 512MB RAM should cover it.


Which place sells servers for $2.50?


There are a few but I use vultr


ovh


You're ignoring the cost for community projects where the community now needs to invest in learning and using your own self-hosted repo, which may or may not have a good UX. This is a barrier to entry.


I agree with this, but the cost is centralization and single point of failure.


Sure. No free lunch. It's just important to understand the nature and size of the costs, so that you can make the trade off adapted to your context.


I think a properly decentralized system (basically everything GitHub offers, except stored inside or alongside the repo) solves the self-hosting headaches too. If everyone has the issue queue, kanban board, CI results, etc. on their laptops, you don't have to host and administer an app for any of those things; you only need dumb storage for backup and sync, plus services like CI that support them in a centralized way -- all of which you can still pay someone else to do for you.


And accounts.


This seems like a conflation. Your repo issues could be stored in a SQLite database, or a flat list of JSON files, or a git repo, or a giant text file, any of which your host might or might not give you direct access to.

The thing to consider (if portability is your primary concern) isn't the underlying storage format, it's whether or not you have a straightforward way to move it between providers.
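
A flat-file store really is enough for portability. Here's a minimal sketch of "one issue, one JSON file" (the file naming and fields are invented for illustration, not any host's actual export format):

```python
import json
import tempfile
from pathlib import Path

def save_issue(root: Path, issue: dict) -> Path:
    """Write one issue as a standalone JSON file, keyed by its id."""
    path = root / f"issue-{issue['id']}.json"
    path.write_text(json.dumps(issue, indent=2, sort_keys=True))
    return path

def load_issues(root: Path) -> list[dict]:
    """Read every issue file back; no database required."""
    return [json.loads(p.read_text()) for p in sorted(root.glob("issue-*.json"))]

# Round-trip demo in a scratch directory
root = Path(tempfile.mkdtemp())
save_issue(root, {"id": 1, "title": "Crash on empty input", "state": "open"})
save_issue(root, {"id": 2, "title": "Typo in README", "state": "closed"})
issues = load_issues(root)
```

Moving between providers is then just copying a directory; the hard part is the import path on the other side, not the storage format.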


Couldn't agree more. It seems naive to think that source control would be a good system for the kind of operations performed in issue tracking.


Issue tracking is all about managing changes to objects (issues) made over time by multiple people from different machines. Seems like a perfect case for VCS to me.


I thought so too, and developed SIT (https://sit.fyi). An interesting thought I had early on helped me make it even more future-proof: Git is likely not here forever; it'll probably get replaced with something else in due time. So I designed SIT to be both merge-conflict-free and SCM-independent by relying on nothing more than additive sets of files.
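
SIT's real on-disk format isn't shown in the comment, but the general idea of an additive, conflict-free store can be sketched: every change is a new content-addressed file, merging replicas is a plain set union, and current state is a fold over all records (the record names and fields below are hypothetical, not SIT's):

```python
import hashlib
import json

def add_record(store: dict, record: dict) -> str:
    """Append-only: a record is named by the hash of its content, so two
    writers can never produce conflicting edits to the same file."""
    blob = json.dumps(record, sort_keys=True).encode()
    name = hashlib.sha256(blob).hexdigest()
    store[name] = blob
    return name

def merge(a: dict, b: dict) -> dict:
    """Merging two replicas is a set union -- no conflicts possible."""
    return {**a, **b}

def state(store: dict) -> dict:
    """Current state = fold over all records in timestamp order."""
    records = sorted((json.loads(v) for v in store.values()), key=lambda r: r["ts"])
    out = {}
    for r in records:
        out[r["field"]] = r["value"]
    return out

# Two replicas edit independently, then merge cleanly.
alice, bob = {}, {}
add_record(alice, {"ts": 1, "field": "title", "value": "Crash on empty input"})
add_record(bob, {"ts": 2, "field": "status", "value": "closed"})
merged = state(merge(alice, bob))
```

Because records are never modified, any dumb file store (including, but not only, a git repo) can carry them.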


A VCS might be a good storage location for these 'objects', but it certainly doesn't provide the structure to manage them.


> Your repo issues could be stored in a SQLite database, or a flat list of JSON files, or a git repo, or a giant text file

That's exactly what Fossil does, and the creator has urged Git to do the same.


I didn't understand why this mattered either.

One is storage of data that can be downloaded whenever, the other is basically a comment field.


This isn't entirely true. GitHub's wikis and the gist system are both backed by git - because it makes sense for those.

It wouldn't make so much sense for issues, pull requests, etc. If they were backed that way, they'd very likely be rather slow to query (as GitHub doesn't keep the latest data in a "checked out" state).

Not to mention the fact that you can just ask them for a data export and they'll provide it all to you happily in JSON format.

This post doesn't really have much reality backing its criticisms.


Came here to say this. That wikis and gist are backed by git is very useful. Trying to wedge issues etc into a git model would be an engineering disaster; these are clearly problems best solved with a database.

There may be valid criticisms about MS GitHub, but this isn't one of them.


This reminds me Fossil SCM. The bug tracking and wiki are part of the distributed version control system.

https://en.wikipedia.org/wiki/Fossil_(software)


https://en.wikipedia.org/wiki/Trac has that functionality but integrates with popular VCSes instead of rolling its own.


A bit of trivia: The creator of Fossil also created cvstrac, which later turned into trac.

A further bit of trivia: Fossil and Git were developed at the same time. You could argue that Linus should have used Fossil rather than rolling his own. Fossil was built for the workflow of the sqlite developers, who do not believe Git is a good fit, and they make it available to others.


Fossil is great software and it's worth mentioning that it's written by the creator of SQLite. I'm using it to version-control my dotfiles and it works nicely.


I used it for a while and it was great. Sadly too many people refused to use it because "...it's not git!" :(


I had a job once where we used Lotus Notes. I'll take extensibility over integration any day.


But you had the unlimited power of LotusScript(tm)(c)(r) at your fingertips to add all those niche features that the Notes developers don't care about!

You know, crazy things like forwarding calendar invites as calendar invites instead of emails, deleting emails older than X days, etc.

/s


Came here to say exactly this.


Don't forget that GitHub, Microsoft, et al. co-wrote and contribute to libgit2[1], a portable pure-C Git core implementation that aims for cross-platform compatibility.

As the author noted himself, GitHub keeps source code, user pages/wikis, and gists in Git, and other objects like issues, repo relationships, accounts, and project details in their own database[2]. To me, that's a fairly standard way of storing data if I were to develop a GitHub-like system. Even rate limits are a common way of protecting APIs from abuse.

Why does Joey all of a sudden have criticism?

[1]: https://libgit2.github.com/

[2]: http://joeyh.name/blog/entry/a_Github_survey/


Did you read your second link to Joey's post from 6 years ago? After reading that myself I don't see the criticism as "all of a sudden".


"out of the trap we now find ourselves in."

The anti-Microsoft panic about GitHub feels childish and pedantic.

Perhaps if there was some evidence post acquisition of Microsoft trying to do unpalatable things then there might be room to complain.

But to just knee-jerk advocate leaving GitHub because you are a Microsoft hater lacks credibility and comes across as whining and silly.


I don't know how these groups have the time to worry about what-ifs. It's better to just play wait and see. Does MS make GitHub worse? Does Google buy a git provider to compete? Too many questions to take action right now.


Interesting thought exercise, but must disagree - git is VERY FAR from a database replacement (especially at scale).

Maybe some git-like interface on top of a database is more what you want...


Single git repositories do not need to scale to a million issues or comments.

I've used git as a database very productively, at well beyond that scale. https://joeyh.name/blog/entry/databranches/
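The core pattern is simple enough to sketch in a few lines of shell (a toy version of the idea, not the actual databranches tooling; the key names are made up):

```shell
set -eu
tmp=$(mktemp -d); cd "$tmp"
git init -q db && cd db
git config user.email demo@example.com
git config user.name demo
# One file per key; the file's contents are the value.
echo 42 > hits.count
echo open > issue-7.status
git add -A
git commit -q -m "store two keys"
# Reading a value back is just reading the file,
# or asking git for it on any branch:
git show HEAD:hits.count
```

Because each key lives in its own file, clones that touch different keys merge cleanly; only concurrent writes to the same key need a merge strategy.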


That says:

> NoSQL database

which is an important limitation.


Why is it an important limitation in this case?


People keep implying that Microsoft is somehow going to break git and somehow use it to destroy people who want to give away their code.

You can just change your remotes and push to a different server. The protocol was designed to be distributed and prevent a single point of failure.

Also, Microsoft and people who develop on Microsoft tools realise that there are network effects of being able to share code. This was just a side benefit and not altruism. What they were after was Electron, since that's a very popular way of writing desktop applications currently.


> What they were after was Electron, since that's a very popular way of writing desktop applications currently.

Would you mind explaining that? Microsoft doesn't need to purchase Electron for $7.5B of stock in order to write an Electron app. They already write Electron apps, one of which is used by many people who read HN.


> Microsoft doesn't need to purchase Electron for $7.5B of stock in order to write an Electron app. They already write Electron apps, one of which is used by many people who read HN.

But I'm not sure they had control over Electron's future direction, which they clearly seem to have now, at least.

V8 can also be replaced with Chakra in Electron, I think.

In that light, GP might be right.


True. I'm not necessarily against that position. 100% of the purchase is scheduled to happen in stock, so this may not be a direct cost for MS, even for a benefit that shouldn't be valued anywhere near $7.5B in cash. There is nothing wrong with a win-win deal for everyone involved.

If this is a $7.5B purchase of Electron, my gut tells me that it isn't an arm's length transaction. I doubt that would happen in the US for Microsoft today, so I can't help but think there are other reasons than Electron for the purchase.

The announcement from MS didn't really spell out the reasons for the prospective purchase other than to support developers. When I see the filing for the purchase, there may be additional reasons listed in that.


One would think setting up a non-profit organization (a la Wikipedia) to host a Github-like service for open source projects would be a good idea (I'd work on such a project). But without capital to expend on marketing/outreach, I'm not sure such a thing would gain much traction.


Why is a nonprofit better than a commercial vendor? It's still subject to the whims of the leadership. It's especially ironic with Git, which has portability as a core feature, so hosting can be on as many hosts as possible, and it's simple to move the "master" as desired.


Why exactly would a source control system need traction?

It's just a place where you store your git repo.

And as for discoverability: I wouldn't want to give a single party control over that, actually. I'd say let people discover projects through other means, such as HN.


> It's just a place where you store your git repo.

But Github is so much more than that. It's a public forum to discuss and prioritize issues, a social network to start projects with friends, an automatic portfolio of sorts for your resume, and a really well managed, nicely integrated, reliable service.

This is all, I think, a function of its scale and the traction it's gotten.


Still doesn't explain why projects should be all in one place, as opposed to spread over the internet.


I think https://notabug.org/about fits your criteria.


Storing things like issues in git is possible. But it makes it harder to build functionality on top, like upvotes and issue boards. Not impossible for sure, but it slows down development. And it would only help with portability if rate limits are the problem. So far GitHub rate limits for logged-in users are decent; we migrated 200k projects in 48 hours. Even when you store issues in repos, the format would likely differ between tools. If the format differs, it is just as easy to write an importer that uses the API.


But the wikis are in git and you can contribute to them that way if you want to.

The rest have no real business being in git. There are other interfaces for getting your data out if you really need it.

So much of this argument is just “this one format is better than X” without any tradeoffs being mentioned.


While I'm sympathetic to the mindset behind this post (i.e., be wary of git centralization), I think that (for example) issue storage is an orthogonal issue from issue format. With a standardized issue format, it could be stored in a database or git, and as long as the service had an open export function, it would require no lock-in.

Also, as some have pointed out here, storing highly relational data like issues and connected profiles, etc, could be an exercise in frustration if you tried to store all that in git. At the very least, it would be limiting.


Agreed.

This reminds me of people in the healthcare space who advocate for blockchain as some magical way of resolving integration issues.

The absence of a robust, well-adopted and supported format is the greater reason for integration issues.

Changing the storage and distribution format might bring certain benefits, but it doesn’t solve the core problem.


https://sit.fyi/ allows you to keep your issues in the github repo itself.

It's written by https://news.ycombinator.com/user?id=yrashk


I'm perfectly happy with Github. They have a very clear value proposition which they offer at a price point that is only a tiny percentage of the value they create. Great company that has empowered and accelerated software innovation across the world. Not sure what's up with this whole "company X makes money so must be evil" vibe. Most of the people complaining about this have well-paid jobs, so there's a bit of hypocrisy going on here.


Nobody cares that they make money. They care about it not being open source or having easy-to-migrate data (lock-in).


Strange that people only started caring that it's closed source and locked in after MS bought them.


When you couple services together, you decrease the potential user base.

For example, Github issues are underpowered for many people. But at least you can ignore them and use your own service.

Many source control systems tried to offer all-in-one platforms but it’s never worked so far. In fact, we don’t even do PRs the way the authors of git intended.

For integrations, it might be better to settle on a configuration file format and maybe some conventions. Maybe that could be checked into your repo.

But then you need fine-grained control over who can modify that, and encryption to keep passwords secret, and those aren’t standard features of source control either.

Maybe we could come up with conventions that were great for hobbyists and the enterprise. But open source communities and standards committees are really bad at that, compared to just leaving it to the marketplace.


Plus, even if Git did have all of the features people want today, people will continue to think of new features that are not supported by Git and need to be implemented outside of it.


Criterion.


"And this kind of pedantry feels like a tactic."

Astonishing that this was the response to a comment which offered-up le mot juste.


No kidding huh. He's doubling down on "criteria" though. Gets all butthurt when somebody politely tries to point it out. Cites "common usage," which is for commoners, chortle chortle, and doesn't make it right.


The massive rush to migrate from github is childish at best.

To put that in context: I lived through MS's various attempts to kill off competitors (yes, I paid for Netscape).

I've worked with LDAP/kerberos and cursed the weird schema changes that are AD.

But that was > 15 years ago. I'd gladly pay for AD, even more if it's a linux-based infra. (Hmmmmm kerberos)

Github was losing cash and had a toxic culture (abuse of power, side effects of sexism, and then hiring people for being agitators instead of coders). Microsoft has the money, time, and power to improve the product, fire the arseholes, and make sure that github is in some sort of shape in the long run.

Just look at the alternatives: atlassian, or worse still oracle/IBM.

So sure, migrate to gitlab. It's a good product, when it's working. Github has many faults, but reliability isn't one of them.


I don't agree with the author here. Over the years I've read about using VCS to store the underlying data, so it's all shared and you keep the data. The reality is that building a complex source code management platform is not easy. Storing things inside a git repo rather than a database would be a huge performance issue when we're talking about scaling such a system. If you have 1 repo, yeah, it's doable and easy, but if you're talking 100,000 projects, that's another story.

This problem keeps returning to HN again and again. I think this simply wouldn't work, for the reason mentioned and a few others that come to mind.

And I know a bit about building a source code management platform, because I built RhodeCode.


Yet another way to store issues in the git repo:

https://gitlab.com/monnier/bugit/

This one is by Stefan Monnier (former emacs lead developer)


If you are in the EU, you have the right to take the data with you. Thanks, GDPR.

https://ec.europa.eu/info/law/law-topic/data-protection/refo...


The idea of including wiki and issue tracking as part of the repo is a great one to include in my OpenVersionControl spec idea -- https://news.ycombinator.com/item?id=17242015

Of course Fossil has exactly what you're suggesting.


There exist multiple solutions that store issues in git:

* https://github.com/duplys/git-issues

* https://github.com/dspinellis/git-issue


So is the main complaint that GitHub is a closed-source, proprietary product? Or is it that they don't provide you with an "export all of my code, issues, wiki, conversations, social graph, etc to GitLab" button?


The criterion he proposes is about your second point: an easy way to ensure you can export all the data and later import it somewhere else. The second part is as important as the first; a pretty "Export" button is useless if the data can't be used later to replicate your project somewhere else.

The opening paragraph is also very on point about the negative effects of Github on the Git ecosystem.


Git-appraise stores PR data in the git repository: https://github.com/google/git-appraise


If anyone has a good idea for a format/protocol to store issues & pull requests in Git itself, I'm all ears.

There is a reason why every single product uses a separate database for those.


I've worked on this in the SIT project (https://sit.fyi). While the core is generalized, the very first module available for it is issue tracking, and I've been using it for a few months now. It enabled some scenarios previously difficult or impossible to achieve -- for example, merge requests can contain updates to issues as well, allowing one to make non-trivial updates to issues once their patch is merged in (for example, open a dozen new issues [a todo list for improving the new feature], leave comments, etc.)
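The core "additive sets of files" trick is simple enough to sketch in shell (a toy illustration of the principle only, not SIT's actual record format; the names are made up):

```shell
set -eu
tmp=$(mktemp -d); cd "$tmp"
# An issue is a directory; every update is a NEW, uniquely named record file.
# Records are never edited or deleted, so syncing two copies is just a file
# union -- no merge conflicts, whether under git, hg, rsync, or anything else.
issue=issues/3f2a
mkdir -p "$issue"
echo '{"type":"opened","title":"crash on startup","by":"alice"}' \
  > "$issue/0001-alice.json"
echo '{"type":"comment","text":"cannot reproduce","by":"bob"}' \
  > "$issue/0002-bob.json"
# The current state of the issue is computed by replaying records in order.
ls "$issue" | sort
```

Two clones that each append their own records can always be combined by copying both sets of files together, which is what makes the scheme SCM-independent.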


Separate (orphan) branch, which contains one markdown file for each issue/PR. If you figure out how to encode comment metadata and other pesky stuff like PR reviews, this should be both human and machine-readable.
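Concretely, something like this already works with stock git (hypothetical file name and metadata layout; real tooling would need conventions for ids, comments, and reviews):

```shell
set -eu
tmp=$(mktemp -d); cd "$tmp"
git init -q repo && cd repo
git config user.email demo@example.com
git config user.name demo
git commit -q --allow-empty -m "code lives on this branch"
main=$(git symbolic-ref --short HEAD)
# An orphan branch has no shared history with the code,
# but ships inside the same repository.
git checkout -q --orphan issues
cat > 0001-crash-on-startup.md <<'EOF'
status: open
author: alice

Segfault when the config file is missing.
EOF
git add 0001-crash-on-startup.md
git commit -q -m "issue 0001: crash on startup"
git checkout -q "$main"
# Issues are readable from any branch without checking them out:
git show issues:0001-crash-on-startup.md
```

Cloning the repo then carries the full issue history along with the code, which is exactly the portability property the article is after.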


What happened to this:

https://keybase.io/blog/keybase-chat

Encrypted github support!


This site on mobile lol, how did this even happen: https://i.imgur.com/4AbIEuz.png


Looks fine on my Google Pixel (Chrome, Android 8.1). But yeah the site's rendering on your phone seems broken indeed.


For some reason there's a height property on the nav bar, which prevents it from expanding when the browser wraps it. When that's removed, it looks fine.



