My first job was sysadmin of a third-tier ISP back in the dialup days. The account management and provisioning system that ran EVERYTHING was probably close to 100k lines of csh. Everything was done via a UI that the shell generated as a sort of curses-style interface.
What was horrible about it was that it controlled everything from who got a website, active domains, what POPs users could dial into, metered billing, you name it. And it did all of this by manipulating flat files of pipe-delimited data on a central server, then rcp’ing those files to the various machines, then rsh’ing to the various machines and kicking off THEIR scripts, which parsed the source files and generated their own files, which called another set of scripts that parsed THOSE files and generated the software config files.
This included doing things like updating init scripts so that new IPs got added to interfaces, and tracking what email server a user was provisioned on, so it had to generate new exim configs with routing rules.
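For anyone who hasn't seen this kind of pipeline, one stage of it (parse the pipe-delimited flat file, regroup the records, emit a generated file per machine) might look roughly like the sketch below. It is in Python rather than csh for readability, and the field names, the sample layout, and the output format are all invented for illustration; they are not the original system's, and the output is not real exim syntax.

```python
from collections import defaultdict

# Hypothetical column layout; the real files' fields are not known.
FIELDS = ["username", "mail_host", "pops", "has_website"]


def parse_accounts(text):
    """Parse pipe-delimited account records, skipping blank and comment lines."""
    accounts = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split("|")
        if len(parts) != len(FIELDS):
            # Fail loudly instead of silently generating a bad config.
            raise ValueError(
                f"line {lineno}: expected {len(FIELDS)} fields, got {len(parts)}"
            )
        accounts.append(dict(zip(FIELDS, parts)))
    return accounts


def mail_routes_by_host(accounts):
    """Group usernames by the mail host they are provisioned on."""
    groups = defaultdict(list)
    for acct in accounts:
        groups[acct["mail_host"]].append(acct["username"])
    return {host: sorted(users) for host, users in groups.items()}


def render_routes(host, users):
    """Render the generated file for one mail host (made-up format)."""
    lines = [f"# generated for {host}; do not edit by hand"]
    lines += [f"{user}: {host}" for user in users]
    return "\n".join(lines) + "\n"
```

In the setup described above, each generated file would then be copied to its machine and fed to the next layer of scripts; the sketch covers only the parse-and-generate step.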
All this to say that it all worked, but I dreaded having to go in to manipulate anything. Adding a server at least had a dedicated procedure so that was fine, but anything else was a nightmare.
Case in point - as part of a gradual plan to remove this nightmare, I swapped out the RADIUS server that they were using for one that could support a database backend, and modified the local config generator script to make a new config for the new software as a stopgap until I could get it into a database.
The config file had a series of fields that just had numbers in them, and after much digging, it seemed like they controlled whether a terminal dial-in user was presented with a menu of options, and which options. I had to reimplement that logic for the new software, made a mistake, and accidentally removed the UUCP option for the 10 customers that were still using UUCP. One of them was on an ISDN line and their mailer decided to continuously redial looking for UUCP, racking up thousands of dollars in carrier rate charges over the weekend that it took anyone to notice something was broken.
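A regression like this is the kind of thing a before/after comparison of the generated configs can flag. Here is a minimal sketch of that idea, assuming both the old and new generator output can be reduced to a mapping of user to the set of services they are offered; that representation and the names used here are hypothetical, not anything from the original system.

```python
def dropped_services(old, new):
    """Return {user: sorted services present in old but missing from new}."""
    dropped = {}
    for user, old_services in old.items():
        # A user missing entirely from the new config loses everything.
        missing = set(old_services) - set(new.get(user, ()))
        if missing:
            dropped[user] = sorted(missing)
    return dropped


# Illustrative data only: one customer loses UUCP in the new config.
old_config = {"acme": {"ppp", "uucp"}, "bob": {"ppp"}}
new_config = {"acme": {"ppp"}, "bob": {"ppp"}}

for user, services in dropped_services(old_config, new_config).items():
    print(f"{user} lost: {', '.join(services)}")
```

Running a check like this before cutting over only helps if someone has already worked out what the numeric fields mean, which was the hard part here.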
I was given an IDE written in Korn shell to maintain. It was not as mission critical as that sounds, but it was the only way to edit, compile, link and deploy the roughly 6000 COBOL programs that made up a very large and expensive financial services platform. It also integrated with the SCM (Unix RCS!), did checkout, checkin, merging, branching and all manner of amazing things.
There were probably 30 devs who used it, all running on an HP-UX server.
It was very powerful, but a total nightmare to look after.
Wow, this is legendary. I’d love to direct a short film or TV series that revolves around an IT/software team using a massive csh codebase like this. I’d love to do some training montage / diagram sequence shots of the system being built by the characters, maybe with some cool Blender / Adobe Premiere split-screen overlays of the high-level architecture as the team references certain aspects of the system.
I'd take something less fictionally dramatic and more along the lines of reality TV (à la home makeovers/Kitchen Nightmares/Bar Rescue): a team of crack engineers untangling the mess and laying out the best practices for future development. The concept is even ripe for booze sponsorship.