I engage with ChatGPT daily now on a number of topics. In general, I've been trying to search on Google and talk to ChatGPT about the same topic; of course, I don't just put three words in, I try to use natural language with ChatGPT. (I had a stint in NLP systems for 5 years around 1997, so that might bias me a bit on how I engage.)
What has impressed me is stuff like "Can you summarize this for me?", "How would you parse the datetime out of this log entry in python3: {raw text}", "How could I make the following mysql query more readable?", etc.
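For the datetime question, the kind of answer it hands back is usually a few lines of stdlib code. A rough sketch of what I mean, with an invented log line and format (the real {raw text} would obviously dictate the regex and the strptime pattern):

from datetime import datetime
import re

# Hypothetical log entry; the actual format depends on whatever you pasted in.
line = "2023-01-15 08:42:17,123 INFO worker[42]: job finished"

# Grab the leading timestamp, then parse it with strptime.
match = re.match(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})", line)
if match:
    ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    ts = ts.replace(microsecond=int(match.group(2)) * 1000)
    print(ts.isoformat())  # 2023-01-15T08:42:17.123000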
At this stage, it's like when Stack Overflow came out. And yes, some SO stuff is crap, but once in a while, you get something that saves you 3 hours of your life. For example, recently, ChatGPT has saved me hours of poking around on topics I wanted to solve quickly without thinking so I could get to the high-value work that would get me closer to my goal.
That said, I am amused at how it can bald-facedly assert things that are mathematically incorrect, and how it gets lost if a thread gets a bit too long.
Something is happening here. I think it'll be a while before these things write code from reading a paragraph from a product manager or a less technical user's "use case".
What's interesting is to watch this pendulum swing back and forth, from expert systems coded by hand to neural nets and now these large language models. If the pendulum keeps swinging, it might land on your head one day if you don't pay attention.
All this said, I enjoy my interactions with ChatGPT more than most SO posts, so I continue to use it, and luckily I have 40 years of coding experience to help me identify where it's a bit off.
I have taken to pasting questions from my mentees into ChatGPT, sharing the result, and suggesting they try ChatGPT to learn python3 in addition to SO and other tools. It seems to help them. I worry it will confuse them with a bald-faced lie, but I'm here to help when it does!
Ed is indeed a good choice for a line editor. We're not talking about line editors, though, but about text editors. Ed is not useful for writing, because it does not have the immediate visual feedback when entering text that even mechanical typewriters have. For that, you need a text editor, and between Vim and Nano, only one behaves like most Unix software developed in the past three decades.
> Ed is not useful for writing, because it does not have the immediate visual feedback when entering text that even mechanical typewriters have.
What do you mean?
$ ed
i
Ed is indeed a good choice for a line editor. We're not talking about line editors, though, but about text editors. Ed is not useful for writing, because it does not have the immediate visual feedback when entering text that even mechanical typewriters have. For that, you need a text editor, and between Vim and Nano, only one behaves like most Unix software developed in the past three decades.
.
.s/not useful/useful/g
p
Ed is indeed a good choice for a line editor. We're not talking about line editors, though, but about text editors. Ed is useful for writing, because it does not have the immediate visual feedback when entering text that even mechanical typewriters have. For that, you need a text editor, and between Vim and Nano, only one behaves like most Unix software developed in the past three decades.
.s/does not have/does have/g
p
Ed is indeed a good choice for a line editor. We're not talking about line editors, though, but about text editors. Ed is useful for writing, because it does have the immediate visual feedback when entering text that even mechanical typewriters have. For that, you need a text editor, and between Vim and Nano, only one behaves like most Unix software developed in the past three decades.
w ed_is_useful.txt
389
q
$ cat ed_is_useful.txt
Ed is indeed a good choice for a line editor. We're not talking about line editors, though, but about text editors. Ed is useful for writing, because it does have the immediate visual feedback when entering text that even mechanical typewriters have. For that, you need a text editor, and between Vim and Nano, only one behaves like most Unix software developed in the past three decades.
$
Typewriters don't have substitution. The immediate visual feedback when entering text in ed is that the keystrokes are echoed to the screen. This occurs regardless of whether you're writing commands or you're in insert mode, such as in this case:
i
Ed is indeed a good choice for a line editor. We're not talking about line editors, though, but about text editors. Ed is not useful for writing, because it does not have the immediate visual feedback when entering text that even mechanical typewriters have. For that, you need a text editor, and between Vim and Nano, only one behaves like most Unix software developed in the past three decades.
Line / prompt input is not the same as text editing. View it from an average user / UX perspective. Having Ed as the default editor is just as surprising as Vim.
Power users and Linux geeks might prefer Vim, rightfully, but neither Vim nor Ed is a good default. Both require prior experience with them as a prerequisite. You want something that as many people as possible can use out of the box as the default.
Don't forget "pets vs cattle", thinking of servers as ephemeral and working towards quickly being able to scale up/down based on demand. So often I see people "lift and shift" from a dedicated server model into the cloud and never convert their pets into cattle. This reduces flexibility later, not to mention makes it harder to respond to patching needs, scaling, and moving to optimize latency or costs.
As an ex-FAANG engineer, this is FAANG advice. Pets are just fine. Most companies aren't FAANG and don't need that class of solution.
An R620 plugged into a switch in a colo, a bash script via cron, or a cloudflare worker are just fine for a lot of use cases. The only time it stops being fine is when you can't afford to do your pet -> cattle migration as you scale up. But I don't think this is a common death for companies.
If you call "cattle" a cloudflare worker or lambda function - fine. But when we are talking about multiple redundant servers with load balancing across them, you really need to justify the cost of that vs the value you squeeze out. Sometimes you're squeezing the juice out of the rind.
Treating servers as disposable is about more than just scale. It helps avoid creating snowflake servers, makes DR more predictable, and makes creating dev environments much easier.
I'd say that until you're FAANG-level or at least hockey-sticking, it's actually totally okay for some of your things to be pets, and that cattleizing literally everything is actually a premature optimization. API boxes which you have 100 of? Absolutely cattleize. That one bespoke server that runs that one service that is totally weird? Just let it be weird for a while. Work on more important things.
I think there are degrees of cattleisation. I have in the past deployed crappy vendor software which needs a unique license key for each running instance. We discussed some sort of license service which would allow keys to be checked out when a container is started, but in the end settled on an auto scaling group with 1 instance for each running container, with the key in an environment variable.
That got us the comfort of knowing that if the host machine died in the night, the task would be rescheduled, without a bunch of extra engineering to scale arbitrarily when we only needed a couple of instances running.
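For anyone who hasn't set one of these up, a rough sketch of the shape of it with boto3. Every name, the AMI, the subnets, and the way the key reaches the container are invented for illustration; the point is just min = max = desired = 1, so a dead host gets replaced rather than scaled.

import base64
import boto3

ec2 = boto3.client("ec2")
asg = boto3.client("autoscaling")

# Hypothetical user data: start the vendor container with its unique key.
user_data = """#!/bin/bash
docker run -d -e LICENSE_KEY=XXXX-YYYY vendor/appliance:latest
"""

# One launch template per license key / container.
ec2.create_launch_template(
    LaunchTemplateName="vendor-appliance-key-1",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",   # placeholder AMI
        "InstanceType": "t3.small",
        "UserData": base64.b64encode(user_data.encode()).decode(),
    },
)

# min = max = desired = 1: not for scaling, just so a dead instance is replaced.
asg.create_auto_scaling_group(
    AutoScalingGroupName="vendor-appliance-key-1",
    LaunchTemplate={"LaunchTemplateName": "vendor-appliance-key-1", "Version": "$Latest"},
    MinSize=1,
    MaxSize=1,
    DesiredCapacity=1,
    VPCZoneIdentifier="subnet-aaaa,subnet-bbbb",  # placeholder subnets
)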
Is this on AWS, where instances get reaped and can't be transparently migrated to a different physical machine like they can on GCP, or is my info out of date? (I just haven't been on AWS for my latest projects.) Because yeah, that would also keep me up at night if random instances will randomly get deleted. My proposal that pets are actually totally okay relies on that not being the case.
My real, pointed question is: what was the opportunity cost of cattleizing that vs any of the other work you could have done in that time? Because given a fresh instance, and assuming I'm allowed to use the rest of my infra, setting that machine back up in the morning should just be a matter of instantiating the right docker container on the host and copying the license file in. Depending on what the vendor software does, I probably still wouldn't ASG that until it had actually happened more often than once a month; there are so many more things to stress about! Even if this were the Splunk license checker, losing some log files would be quite disappointing but not the absolute end of the world; if you're losing API calls to Stripe or the equivalent, that's a different matter!
Also, just knowing what you're running at any given time is great. Changing things manually means you'll forget some of them when you need to do a bigger change. Or when you're setting up server no. 2. We've got great tooling these days, so whether you're going with nixos, docker, puppet, shell scripts for installation, or something else, the cattle approach gives you benefits.
Unfortunately most base OSes are designed around being pets - the FreeBSD manual is all about a world where you get a single machine, give it a cute name, install a web server, install a C compiler and the OS source, build a custom kernel for some reason…
This may make sense if you can only afford a single machine you want to hyperoptimize but it’s not a good workflow for anyone. Not even BSD kernel developers.
All that is scriptable. I ran thousands of FreeBSD machines. When you get a new one, you run the script that sets up users, pulls the source, builds the custom kernel, installs the packages, etc. A little bit of automation goes a long way. (make world is a reasonable, although not particularly comprehensive, burn-in test too, but if you were in a hurry, you could build in one place and distribute)
You shouldn’t have a compiler on your webserver at all, and building a kernel should happen on a different machine than installing it. (And why isn’t the official kernel good enough?)
Beyond the security surface, that’s just asking to brick the machine; if you never make any mistakes, it’s because you never tried anything very interesting with it in the first place.
Furthermore, I claim that “you can write a script to edit /etc” is not how an OS should support configuration. Instead of being possible to keep a copy of the config in a script somewhere else, it should be impossible to do anything but that. So basically /etc and UNIX are bad ‘cause they’re not declarative.
Declarative won't necessarily give you cattle. What you want is a continuous automated build process, and immutable images. That way you have a reliable image that has been tested and you know works, you have documented the build steps, and get an error when the build fails. Over time you can make it more declarative, if being imperative is adding a lot of toil.
The important thing to remember is that improvements should happen gradually in small chunks; you don't have to have a perfect cattle system out of the gate, but you should be working your way there.
> You shouldn’t have a compiler on your webserver at all, and building a kernel should happen on a different machine than installing it.
More likely than not, if they've got execution permission on your webserver, they can send down a compiler to run. Plus or minus a C compiler doesn't open up much (anything?) they couldn't have compiled elsewhere or done in the scripting language or shell code or ... Unless you're running a static website or something. Plus, it's presumptuous that all my servers are webservers. :p Do my other servers (other than build servers) get to have compilers because they're less naughty?
> And why isn’t the official kernel good enough?
Too many drivers, not enough local patches that are worth having but not worth pushing upstream. (Going upstream is a good metaphor, it often takes a lot of effort)
> Beyond the security surface that’s just asking to brick the machine; if you never make any mistakes it’s because you never tried anything very interesting with it in the first place.
That's what the console is for. Serial and IPMI consoles are way more convenient than VGA consoles, but I've certainly screwed up bunches of times and had to use a console to sort things out.
> So basically /etc and UNIX are bad ‘cause they’re not declarative.
So you declare. ;) If it helps, I ran the script from Make, which is declarative (although not really the way we used it), and we mostly ran Erlang, which is claimed to be declarative in Erlang: The Movie, but seems to be useful, so I'm not sure. I've never quite understood how the goal of declarative system management matches with reality; it feels like a very leaky abstraction to me: the part that changes the system is necessary complexity, and hiding it away adds additional complexity. Maybe if you build a read-only root filesystem declaratively and run that? Which you can if you want? While you're saying I shouldn't do something the way I did it, my team managed a bunch of machines and a complex system with very few people, so I think the method works.
But I think about things differently than a lot of other people; I'm comfortable hot loading code to change uninterrupted services, whereas others like to move traffic to new servers (and hope it all moves). You could transfer existing connections to your new servers, and that would be a lot of fun (and probably require a custom kernel), but nobody does that. Instead you either serve old connections with old code, or forcibly terminate old connections. If I'm going to hot load code, an immutable system is directly opposed to operability. Why would I be interested in making my system less operable?
Adding on: farmers don't kill their livestock at the first sign of injury or illness; they won't do heroic interventions like you might do with a pet, but they'll certainly do simple fixes on the live cattle in the hopes of returning them to health and retaining their value.
Just the other day I had to perform some maintenance on a long-running VM hosting some monitoring software. A backup VM is supposedly always running and ready to handle the workload in case of downtime. The switchover seemed to go fine, at first.
Turns out, someone long ago had manually added a cron job to the primary server without adding it to the backup server, and without documenting what it does, what permissions it needs, how it works, or why it's needed. This was only discovered after a manager in a different department complained that he had stopped receiving the daily report in his e-mail inbox.
If whoever deployed the report-generation script had taken an extra hour to document what the script did, or better yet, added it to VCS as part of the provisioning process for the server and re-deployed the server to ensure that the process works as expected, a day's worth of headache could have been averted.
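Even a tiny provisioning step kept in the repo would have caught it. A hypothetical sketch (the paths, file names, and the idea of a cron/ directory in the repo are all invented) of installing the report job from version control instead of editing the live crontab by hand, so the backup server picks it up the same way the primary did:

#!/usr/bin/env python3
"""Provisioning step: install the daily-report cron job from the repo."""
import shutil
import subprocess

# cron/daily-report is tracked in VCS next to the script it schedules.
shutil.copy("cron/daily-report", "/etc/cron.d/daily-report")

# Files in /etc/cron.d must be root-owned and not group/world writable.
subprocess.run(["chown", "root:root", "/etc/cron.d/daily-report"], check=True)
subprocess.run(["chmod", "644", "/etc/cron.d/daily-report"], check=True)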
> But when we are talking about multiple redundant servers with load balancing across them, you really need to justify the cost of that vs the value you squeeze out.
Sometimes you can justify using a thing for the wrong reasons.
I recently attached 1x NLB to each of our Swarm clusters to migrate to automatically managed certificates directly attached to the NLB (Digital Ocean).
$COMPANY has maybe ~3 users accessing each production application at a time. So the NLB itself is utterly pointless.
But Engineering no longer have to fix the certificates each quarter after users see an insecure browser warning and email us about it.
There is a lot to unpack in this statement. First, this sentiment is already covered by the article under:
> Git should be your only source of truth. Discard any local files or changes, what's not pushed into the repository, does not exist.
Second, don't fool yourself into thinking that, just because you wrote a script or configured some provisioning service, you've pulled yourself out of the valley of nondeterministic, unreproducible production environments. Pets can and do live in git. A majority of the systems I've worked on in both FAANG and non-FAANG in my career couldn't be reproduced from the source repo in the case of a catastrophic disaster (i.e. us-east-1 disappears), or after a year of being stable enough to not deploy, even though their "source of truth" lives in git. The world moves out from under your repo, and many of the steps you take for granted when writing your automation aren't understood by the next engineer running the automation. Then one day you have to go upgrade this one repo and realize its IaC is no longer compatible with the infra in the real world, and you burn a day trying to figure out what's wrong with a 1000LoC YAML file.
> Then one day you have to go upgrade this one repo and realize its IaC is no longer compatible with the infra in the real world and you burn a day trying to figure out what's wrong with a 1000LoC YAML file.
While I'm all too familiar with this problem from the NodeJS ecosystem, it's rare to encounter it in Terraform in my experience. The only exception that needs manual intervention in a "cold boot" scenario is AWS Cloudfront using certificates provisioned from AWS ACM - the provider will assume that certificates are already valid when creating the CF distribution and error out if ACM hasn't issued the certificate yet.
> Then one day you have to go upgrade this one repo and realize its IaC is no longer compatible with the infra in the real world
That's yet another reason to avoid pets. When I say pets, I mean relatively big pieces of SW, usually 3rd party, that need non-trivial configuration. Like Jenkins or Windows or some security gizmos.
$PREVIOUS_COMPANY used to have pages and pages and pages of HOWTOs in Confluence re what to click on in the AWS console for the stuff that had to be done manually.
Doubled up as amazing training / onboarding material for folks that maybe hadn't worked intimately with AWS before (including myself).
Some replies are saying this is only for "at-scale/FAANG".
It may only be absolutely necessary there, but it's helpful even for smaller folks.
Over the years, even Debian LTS goes out of support, and new features and software need to be installed. There's moving systems, doing restores, things breaking and wanting to "reset" to a known working state. Any time you can do something simple with docker or even just (short) step-by-step build scripts, that's a huge win.
I have playbooks for deploying a system, but with npm installs, bower installs, secrets to be hand copied from multiple places, etc, it feels more like pets and it's NOT simple to deploy.
There might be an earlier source, but I first ran across the pets-versus-cattle nomenclature in Tom Limoncelli's _The Practice of System and Network Administration_ - which is a really, really good read for anyone going deep into the ops space (like a cloud engineer should be).
I've replaced pets with ASGs that do nothing but sit there unless the EC2 instance fails. It's amazing because you no longer have to baby the pet; it just recovers in a nice clean state, same massive instance.
It should still not be administered as a pet. In fact, it's even more important when you have a single instance of some importance to make it entirely rebuildable and replaceable.
If you manage to run your entire service on a single (heavyweight) server, uptimes will usually be excellent. Reliability for a single server is high. It gets lower for each server you add, unless you're adding redundancy. But that adds complexity, and that hurts reliability too.
My point: a single dedicated server may be a more reliable, simpler and much cheaper solution than the cloud provides.
But you are introducing the complexities of managing a fleet. This may or may not be an improvement in overall complexity, cost, or reliability. This decision should be made per-project, and neither option is a good fit for all projects.
These things are inherent complexities in any situation with backup/DR. You've already got to figure this out. When one machine is degraded and another is trying to recover from db logs, that's a cluster.
> This decision should be made per-project
This is one of the least project-dependent things ever, akin to source control. If you don't have scripted installs you don't know what you're running and you can't reliably test it or upgrade it.
You don't need to run a cluster, or have auto-scaling groups, just because you can.
> But you are introducing the complexities of managing a fleet.
Depends on who manages the fleet. An EC2 instance provisioned by Elastic Beanstalk is quite easy; the problem is that Beanstalk itself can be an utter PITA of edge cases and weird failure modes.
They can fail much less often than you’d think. StackOverflow runs on 10-ish dedicated servers. There’s plenty of other services out there that don’t autoscale and are much more reliable than anyone needs them to be.
A cattle server needs no special maintenance. Yes, if you only have one server then by definition it'll require special maintenance, but that maintenance should be as generic as practicable, so that if you ever need to set up a second one you don't need to do anything special. That's the distinction between pets and cattle, not the tooling: you can configure cattle by hand if you have a repeatable, non-special-cased setup script to follow, and you can use Ansible or Puppet or whatever to set up the unique environment your pet server needs. The important thing is that the setup process, maintenance, etc. are standardized and documented. Automation is a bonus that should be relatively easy if you're doing it right.
You must have been pretty good at billing every single hour you spend on your clients.
I was once told that it's unwise to expect to bill more than 50% of the time I spend (roughly 1,000 hours out of a ~2,000-hour working year), so I've been sticking close to $target_income / 1000 for the last few years. The actual ratio of billable to unbillable hours probably depends a lot on what kind of contracts you get and what type of work you do.