
> Still, to this day, I see people doing things with “modern” platforms like Kubernetes and such and I chuckle. Had that in Condor 15 years ago in many cases. :)

I'm reading the docs and it seems to be used mostly for solving long-running math problems like protein folding or SETI@home?

Can it be used for scaling a website too? I think that's k8s "killer" feature heh.



Most systems like Condor have a concept that a job or task is something that "comes up, runs for a while, writes to log files, and then exits on its own, or after a deadline". I've talked to the various batch queue providers and I don't think they really consider "services" (like a webserver, app server, rpc server, or whatever) in their purview.

In fact, that was what struck me the most when I started at Google (a looong time ago): at first I thought of Borg as a batch queue system, but really it's more of a "Service-keeper-upper" that does resource management, and batch jobs are just one type of job (webserving, etc. are examples of "Service jobs") laid on top of the RM system.

Over time I've really come to prefer the Google approach. For example, when a batch job is running, it still listens on a web port and serves up a status page and numerous other introspection pages that are great for debugging.
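
To make that concrete, here's a rough sketch of the idea in plain Python: a batch job that keeps working while a background thread serves a JSON status page. This is not how Borg does it; `BatchJob`, `start_status_server`, `fetch_status` and the `/status` path are all made-up names for illustration, and it only uses the standard library.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class BatchJob:
    """Hypothetical batch job that tracks its own progress."""

    def __init__(self, total_items):
        self.total_items = total_items
        self.done_items = 0
        self._lock = threading.Lock()

    def process_all(self):
        for _ in range(self.total_items):
            # Real work on one item would go here.
            with self._lock:
                self.done_items += 1

    def status(self):
        with self._lock:
            return {"done": self.done_items, "total": self.total_items}


def start_status_server(job, port=0):
    """Serve job.status() as JSON at /status from a daemon thread.

    port=0 asks the OS for any free port (assumption: nothing else
    hands out ports for us, unlike on a real cluster manager).
    """

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/status":
                self.send_error(404)
                return
            body = json.dumps(job.status()).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the job's own log output clean

    server = HTTPServer(("127.0.0.1", port), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_status(server):
    """Read the status page back, bypassing any configured HTTP proxy."""
    url = f"http://127.0.0.1:{server.server_address[1]}/status"
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as resp:
        return json.loads(resp.read())
```

You'd call `start_status_server(job)` before `job.process_all()`, and anyone debugging the job can hit `/status` while it runs.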

TBH I haven't read the Condor, PBS, or LSF manuals in a while, so it's quite possible they handle service jobs and the associated problems like dynamic port management, task discovery, RPC balancing, etc.


But in a world where you're continuously deploying on a cadence that's incredibly quick, how do things differ? I contend the batch and online worlds start to get pretty blurry at this stage. We're not in a world where bragging about uptime on our webserver in years is a thing any more.

I was routinely using Condor in semiconductor R&D and running batches of jobs where each job was running for many days -- that's probably far longer than any single instance of a service exists at Google in this day and age, right?

None of the batch stuff does the networking management though. No port mapping, no service discovery registration, no load balancer integration, etc. That's Kubernetes sugar they lack. But it has never struck me as overly hard to add, especially if you use Condor's Docker runner facilities.
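
For a sense of scale, here's a toy, in-memory sketch of the three missing pieces: per-host port assignment, service registration, and round-robin balancing. It's purely illustrative and not tied to Condor or Kubernetes in any way; `ServiceRegistry` and its methods are hypothetical names, and a real version would need health checks, persistence, and port reuse.

```python
from collections import defaultdict


class ServiceRegistry:
    """Toy registry: assigns ports, tracks instances, balances round-robin."""

    def __init__(self, base_port=31000):
        # Next unused port on each host (assumption: ports are never reused).
        self._next_port = defaultdict(lambda: base_port)
        self._instances = defaultdict(list)  # service -> [(host, port), ...]
        self._cursor = defaultdict(int)      # service -> round-robin counter

    def register(self, service, host):
        """Assign a port on host and record the instance under service."""
        port = self._next_port[host]
        self._next_port[host] += 1
        self._instances[service].append((host, port))
        return port

    def deregister(self, service, host, port):
        """Forget an instance, e.g. after its job exits."""
        self._instances[service].remove((host, port))

    def pick(self, service):
        """Return the next (host, port) for service, round-robin."""
        instances = self._instances.get(service)
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        i = self._cursor[service]
        self._cursor[service] = i + 1
        return instances[i % len(instances)]
```

The hard part in practice is less the bookkeeping and more keeping it correct when jobs die or get evicted, which is where the scheduler integration would come in.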

Edit: I should say that I don't _really_ think you could swap out Kubernetes for Condor. Not easily. But it's always been on my long list of weekend projects to see what running a cluster of online services would be like on Condor. I don't think it'd be awful or all that hard.

The other killer Condor tech is their database technology. The double-evaluation approach of ClassAd is so fantastic for non-homogeneous environments, where loads have needs and computational nodes have needs and everyone can end up mostly happy.
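
A minimal sketch of what "double evaluation" means, in Python rather than actual ClassAd syntax: each side advertises attributes plus a requirements expression, and a match only happens when the job's requirements hold against the machine and the machine's requirements hold against the job. The `Ad` class, the attribute names, and `best_machine` are all invented for illustration, and for simplicity only the job's rank is used to break ties.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Attrs = Dict[str, object]


@dataclass
class Ad:
    """Simplified stand-in for a ClassAd (not the real data model)."""
    attrs: Attrs
    # requirements(my_attrs, other_attrs) -> True if the other side is acceptable
    requirements: Callable[[Attrs, Attrs], bool]
    # rank(my_attrs, other_attrs) -> preference score, higher is better
    rank: Callable[[Attrs, Attrs], float] = lambda my, other: 0.0


def matches(a: Ad, b: Ad) -> bool:
    """Two-sided check: each ad's requirements are evaluated against the other."""
    return a.requirements(a.attrs, b.attrs) and b.requirements(b.attrs, a.attrs)


def best_machine(job: Ad, machines: List[Ad]) -> Optional[Ad]:
    """Among mutually acceptable machines, return the one the job ranks highest."""
    candidates = [m for m in machines if matches(job, m)]
    if not candidates:
        return None
    return max(candidates, key=lambda m: job.rank(job.attrs, m.attrs))


# Example ads (all values hypothetical).
machines = [
    # A desktop whose owner only lets their own jobs run on it.
    Ad({"name": "desk1", "memory_mb": 8192},
       requirements=lambda my, other: other["owner"] == "alice"),
    # A big node that refuses jobs asking for more than 32 GB.
    Ad({"name": "big1", "memory_mb": 65536},
       requirements=lambda my, other: other["memory_mb"] <= 32768),
    # A small node that accepts anything.
    Ad({"name": "tiny1", "memory_mb": 2048},
       requirements=lambda my, other: True),
]


def make_job(owner: str, memory_mb: int) -> Ad:
    """Job that needs at least memory_mb and prefers machines with more memory."""
    return Ad(
        {"owner": owner, "memory_mb": memory_mb},
        requirements=lambda my, other: other["memory_mb"] >= my["memory_mb"],
        rank=lambda my, other: float(other["memory_mb"]),
    )
```

The point is that neither side is just a passive resource: the desktop can say no to a job just as the job can say no to the tiny node.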


Yes, it can scale based on metrics. And metrics can be anything.

What it’s missing is all the discovery and network plumbing to tie running instances together with load balancing and inter-service comms.

Google's old Borg paper mentions Condor as a thing they considered and cribbed features from.

Honestly, serving a website is not as different from batch processing problems as you’d think. There are differences but they’re subtle, not mountainous.


I've used Condor for ~5 years now, mostly for running simulations and processing data. Everything I've done with it has been trivially parallelizable (divide data into chunks based on time, etc.), and in those applications it has been a superb tool that just works.

It should be possible to run a scalable website with it, but then you don't get "infinite" scalability like cloud services offer, since you're limited by the size of the compute pool. It would probably have its pitfalls.

That being said, coming in without knowledge of either, I found it much easier to learn and get started doing things with Condor than with Kubernetes. I had all kinds of issues just getting simple things like LaTeX compilation as part of GitLab CI to work reliably. Clearly the experts know how to make things go, but Condor has a lower barrier to entry. For use cases where Condor CAN work, especially data processing, I always recommend it.



