> Barlow found Mitchell did have an existing reputation as a cheat and for suing people who alleged he was a cheat, and found that Mitchell had expressed joy when he believed – incorrectly – on an earlier occasion that Apollo Legend may have died.
I think attributing the problem to “stalkers” understates the issues that publicly searchable surveillance data like this creates. Imagine a website where you can type in anyone’s name and see their last known location and their location history. You would have a system that supports universal spying, for mundane and nefarious reasons alike. It’s not just criminal “stalkers” who will take advantage of it.
This sort of arrangement could potentially work if there were limits on the granularity, frequency, and history of the tracking data.
> Web search is available now in feature preview for all paid Claude users in the United States. Support for users on our free plan and more countries is coming soon.
Are people fine-tuning LLMs on their local machines with a single GPU? What are people using to scale their training to multiple nodes / GPUs? I've been playing around with Hugging Face Estimators in sagemaker.huggingface, but I'm not sure if there are better options for this?
It takes a significant amount of time (a few hours) on a single consumer GPU, even a 4090 / 5090, on a personal machine. I think most people use online services like RunPod, Vast.ai, etc. to rent high-powered H100s and similar GPUs for a few dollars per hour, run the fine-tuning / training there, and use their local GPUs only for inference on the fine-tuned models produced on those cloud-rented instances.
It used to be that way! Interestingly, I find that people in large orgs and general enthusiasts don't mind waiting - memory usage and quality are more important factors!
Google Colab is quite easy to use and has the benefit of not making your local computer feel sluggish while you run the training. The linked Unsloth post provides a notebook that can be launched there, and I've had pretty good luck adapting their other notebooks to different foundation models. As a sibling comment noted, if you're using LoRA instead of a full fine-tune, you can create adapters for fairly large models with the VRAM available in Colab, especially on the paid plans.
If you have a Mac, you can also do pretty well training LoRA adapters using something like LLaMA-Factory and letting it run overnight. It's slower than an NVIDIA GPU, but the larger effective memory (if you have, say, 128GB) gives you more flexibility.
A 'LoRA' is a memory-efficient type of fine-tuning that only tunes a small fraction of the LLM's parameters. And 'quantisation' reduces an LLM to, say, 4 bits per parameter. Together these make it feasible to fine-tune a 7B-parameter model at home.
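For a concrete picture, here's a minimal sketch of that combination (4-bit base model + LoRA adapters) using the Hugging Face transformers / peft / bitsandbytes stack. The model name and hyperparameters below are placeholders I picked for illustration, not a recommendation from this thread:

```python
# Minimal QLoRA-style setup: 4-bit quantised base model, small trainable adapters.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "mistralai/Mistral-7B-v0.1"  # any ~7B causal LM works here

# Load the base model quantised to 4 bits so it fits in consumer VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Attach small trainable LoRA adapters; the 4-bit base weights stay frozen.
lora_config = LoraConfig(
    r=16,                                    # adapter rank (illustrative)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],     # which layers get adapters
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the model
```

From there you can hand the model to a normal training loop or Trainer; only the adapter weights accumulate gradients, which is where the memory savings come from.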
Anything bigger than 7B parameters and you'll want to look at renting GPUs on a platform like RunPod. In the current market, used 4090s are selling on eBay for around $2100, while RunPod will rent you a 4090 for $0.34/hr - you do the math (at that rate, buying only breaks even after roughly 6,000 hours of rental).
It's certainly possible to scale model training to span multiple nodes, but generally scaling through bigger GPUs and more GPUs per machine is easier.
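If you do want multiple GPUs on one box, a minimal sketch with plain PyTorch DDP looks roughly like the following, launched with `torchrun --nproc_per_node=<num_gpus> train.py`. The model and training loop are stand-ins, not anything from the thread:

```python
# Single-node multi-GPU data parallelism with PyTorch DDP.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK / LOCAL_RANK / WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(512, 512).cuda(local_rank)  # stand-in model
    model = DDP(model, device_ids=[local_rank])          # sync grads across GPUs

    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for _ in range(10):                                  # stand-in training loop
        x = torch.randn(32, 512, device=local_rank)
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()   # DDP all-reduces gradients here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

The same script scales to multiple nodes by pointing torchrun at a rendezvous address, but as noted above, one bigger machine is usually the easier path.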
For experimentation and smaller models, a single GPU is the way to go! Tbh, I find most people spend the majority of their time on datasets, training-loss convergence issues, etc.!
But if it's helpful, I was thinking about spinning up a platform for something like that!
I had the same reaction. If you make it to the end, he concludes with:
> The wave function’s pattern can travel across regions of possibility space that are associated with the slits.
Which to me conflicts with his emphatic “no” at the beginning of the article, because this implies you can define some mapping between physical space and probability space. And of course you can, because if you couldn’t, the theory would not be physically predictive.
His point from the beginning is this: the particle described by the wavefunction can't be said to move through both slits at once, because ψ(t, x, y) has a single value for a particular x and y at a particular time. The particle has nonzero probability at (x, y1, t) and at (x, y2, t), of course - but that just means the particle has nonzero probability to pass through either slit.
And as for saying that the wave moves through both slits, that also doesn't make sense, by the very definition of the wave function - it's a wave in probability space, not in space, so it just doesn't move through space.
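For concreteness, the standard textbook formalism makes the "both alternatives contribute" point precise (my notation, not the article's):

```latex
% \psi_1, \psi_2: amplitudes for the "through slit 1 / slit 2" alternatives.
\[
  \psi(x, t) = \frac{1}{\sqrt{2}}\bigl(\psi_1(x, t) + \psi_2(x, t)\bigr)
\]
% The Born rule turns amplitudes into detection probabilities; the cross
% term is the interference pattern, which neither |\psi_1|^2 nor |\psi_2|^2
% contains on its own.
\[
  |\psi|^2 = \tfrac{1}{2}\,|\psi_1|^2 + \tfrac{1}{2}\,|\psi_2|^2
           + \operatorname{Re}\bigl(\psi_1^{*}\,\psi_2\bigr)
\]
```

The cross term is a statement about amplitudes adding, not about a particle occupying two places at once.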
> And as for saying that the wave moves through both slits, that also doesn't make sense, by the very definition of the wave function - it's a wave in probability space, not in space, so it just doesn't move through space.
I don't think that's a valid argument. Imagine a regular water wave, i.e. a function h = h(x, y, t) describing the height of the water at position (x, y) at time t. You could say "this is a wave in height space, not in space, so it just doesn't move through space," and in a certain sense that's true. But obviously there is something that does "move" through "space," to the extent that anything can ever be said to do so.
I’m with you on point 1 (I think this is also obvious from experiment, because you will never measure a particle at both slits).
For point 2, it seems you can define a mapping from physical space to probability space. Saying that the wave doesn’t “move through” space might be technically correct, but it also seems like semantics over the definition of the phrase “move through”?
Of course it is to some extent semantics. But the important point is that the wavefunction is not something like a sound wave, or even something like a classical EM wave. Those are all waves defined over 3-dimensional physical space.
In the original QM model, light is not a wave in the classical electromagnetic theory sense. Light is made up entirely of photons, which are particles just like electrons or billiard balls, and they are described by a wavefunction. That wavefunction gives them various probabilities of being in various states at a certain time, and those probabilities can increase or decrease when more particles come into the mix. The states can represent position, momentum, charge, spin, energy levels, etc.
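A toy numeric check of the interference cross term makes the "probabilities increase or decrease" point concrete. Everything below (geometry, wavenumber, slit separation) is made up for illustration:

```python
# Two spherical-wave amplitudes reaching a screen from two slits.
import numpy as np

x = np.linspace(-5, 5, 1001)   # screen coordinate
k, d = 10.0, 1.0               # wavenumber and slit half-separation (made up)

# Amplitudes for the "through slit 1" and "through slit 2" alternatives,
# with r = distance from each slit to the screen point (screen 10 units away).
psi1 = np.exp(1j * k * np.hypot(x - d, 10.0)) / np.hypot(x - d, 10.0)
psi2 = np.exp(1j * k * np.hypot(x + d, 10.0)) / np.hypot(x + d, 10.0)

both_open = np.abs(psi1 + psi2) ** 2                    # fringes appear
one_at_a_time = np.abs(psi1) ** 2 + np.abs(psi2) ** 2   # no fringes

# The difference is exactly the 2*Re(psi1* psi2) interference term,
# which can be positive or negative at different screen positions.
cross = both_open - one_at_a_time
assert np.allclose(cross, 2 * np.real(np.conj(psi1) * psi2))
```

The probabilities genuinely go up at some screen positions and down at others relative to the "one slit at a time" case, which is the whole puzzle the article is wrestling with.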