I feel like floating point is a completely separate branch of programming that a lot of coders never use. When they do it's often a mistake, like for currency. Yet floating point is very useful for scientific calculations and simulations, which is what computers were all about for the first couple of decades.
I have a suspicion that on a fundamental level, floating point isn't actually good for games or machine learning. They're just used because existing computers are so good at floating point number crunching, especially GPUs.
> When they do it's often a mistake, like for currency.
The most beautiful FP bug I remember was a denial of service in some webservers where, by setting the "Accept-Language: en-gb;q=0.3 en;q=0.8 ..." header to a specific value, you could send the official Java floating-point parser into an infinite loop (and this affected several Java webservers).
So each request sent one CPU core of the webserver into a busy loop, hence crashing the webserver after a few requests at most.
Now it'd be heresy if I were to say that maybe, just maybe, had the standard mandated integer weights from 0 to 100 instead of 0 to 1.0, we could have dodged a whole source of potential issues!?
No, no, I realize this is heresy: let's all keep using numbers that cannot even be represented correctly (except as strings), parse those strings into something approximately representing what's written in the string, then make more approximation errors while doing computation with these numbers, plus propagation errors, epsilon estimation errors, and keep insisting we all could have authored "What Every Computer Scientist Should Know About Floating-Point Arithmetic" (which is 80 pages long and, well, approaching a treatise) ; )
There is nothing wrong with the protocol. The protocol has nothing to do with what representation is used by software that speaks the protocol. Software could just as well parse those values as integers. Just drop the decimal point and pad with zeroes. Or use a decimal type (but there is really no need to).
It’s on implementors to be correct. No one said the Java implementation needed to use float or double. Java isn’t the only web server implementation language.
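For illustration, a minimal C sketch of the "drop the decimal point and pad with zeroes" approach; the helper is hypothetical (not from any server), and it assumes the spec's limit of at most three fractional digits:

#include <ctype.h>

/* Parse an HTTP q-value ("0", "1", "0.3", "0.825") into integer thousandths
   (0..1000). Returns -1 on malformed input. Hypothetical helper. */
static int parse_qvalue_milli(const char *s)
{
    if (*s != '0' && *s != '1')
        return -1;
    int milli = (*s++ - '0') * 1000;
    if (*s == '\0')
        return milli;
    if (*s++ != '.')
        return -1;
    for (int place = 100; place >= 1 && *s != '\0'; place /= 10, s++) {
        if (!isdigit((unsigned char)*s))
            return -1;
        milli += (*s - '0') * place;
    }
    return (*s == '\0' && milli <= 1000) ? milli : -1;
}

With this, "q=0.3" becomes 300 and "q=0.8" becomes 800, and the comparison logic never touches a float.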
Probably https://www.exploringbinary.com/java-hangs-when-converting-2...: “Java — both its runtime and compiler — go into an infinite loop when converting the decimal number 2.2250738585072012e-308 to double-precision binary floating-point. This number is supposed to convert to 0x1p-1022, which is DBL_MIN; instead, Java gets stuck, oscillating between 0x1p-1022 and 0x0.fffffffffffffp-1022, the largest subnormal double-precision floating-point number.”
Don't blame the protocol, blame the language. Nothing prevents the use of fixed point for such numbers. The cause of such issues is the Java ecosystem and the weakness of languages that use exceptions so pervasively that nobody can keep track of them all.
> Don't blame the protocol, blame the language. Nothing prevents the use of fixed point for such numbers.
I absolutely blame the protocol. Fixed point isn't widely available and used, but integers are. Someone below says the protocol spec goes to three digits past the decimal point. Using a constrained integer would have allowed the same range of values, but without the complexity of decimal numbers; that's a protocol mistake.
In addition to the ease of using floating point numbers to incorrectly represent decimals, there's also fun with formatting, comma vs period, number of digits to report, etc.
Meh, it's not like there can't be a bug in whatever other parsing code a webserver uses.
Maybe using a float is overkill in this case, since you're never going to hit more than a hundred languages. But it's at least not setting any limitations.
There are 1001 valid values with 1117 representations. Only in practice, Firefox clamps to 2 decimal places, and it used to clamp to 1 decimal place (https://bugzilla.mozilla.org/show_bug.cgi?id=672448), and who knows what other software does.
And it gets worse: the grandparent suggests there are servers that parse it as floating point. Do they accept q=1e2? If they accept it, are there clients that send it? Do you need to be compatible with those clients?
Based on the OP, this was the official parser for floats. That's a bit more extravagant of a bug.
I also would invite you to find a similar instance where a parsing bug for integers crashes the server. Not throws an exception, but falls over in an uncatchable way. I'm not sure I have ever seen one.
The issue is that the representation of floating point numbers imposes additional cognitive burden that the code must handle. How does the web server header parsing code handle positive/negative infinity? How about NaN?
NaNs/infs are particularly vicious, as they propagate through computations.
This. Floating point numbers, as they are lossy attempts to discretize continuous data, are more complex and usually non-deterministic. They should never be chosen when a non-lossy representation is available for the problem domain.
-- IEEE 754, 64-bit: Interfaces.IEEE_Float_64
-- And here we strip out the non-numeric values:
use Interfaces;
subtype Numeric_Real is IEEE_Float_64 range IEEE_Float_64'Range;
As someone who started out making games on the PlayStation 1, which didn't have floating point hardware, I can say with authority that developing games with only integers sucks. Floating point numbers are far easier and more robust to use.
Isn't that what caused the PS1 graphics "wobble"? I remember you could just stare at a wall, rotate the camera, and the textures would slightly jitter and move. It's really obvious in some of the Metal Gear cutscenes, especially this conversation with Donald Anderson where the camera pans up from the floor (at 15m) or the sweep (15m31s):
This is because of two things: the affine texture mapping was incorrect, and there was no sub-pixel precision. The sub-pixel part is kinda float-related (though you could make use of fixed point too).
> I never had a PS1 but it always bugged me when playing on someone else's.
Conversely, I rather like the fact that it does. And some others who probably have fond memories of the PS1 too have recreated this effect on modern computers using shaders.
Yes, and the PS1 had a vector unit built into the CPU that was used for world transforms, and it was only 24 bits.
Early PS1 games fit the entire world into a 24-bit space, and you can tell that there is stepping if you drive very slowly in a game like Ridge Racer. Later games moved the world around in the coordinate system to manage precision.
Another reason why the graphics look extra "poppy" on the PS1 (and early DirectX games) is that vertices were computed at pixel accuracy. OpenGL, Doom, Quake and more modern hardware have sub-pixel accuracy.
Did you use macros/operator overloading for all that bit shifting or just hardcoded it everywhere?
We made a simple 3D demo for uni using fixed-point math (we were clueless then and weren't using macros, because sometimes you could refactor some operations out that way and we wanted all the speed). That caused lots of bugs.
Can't imagine how painful writing a whole game this way would be.
In signal processing on constrained domains (constrained including the lack of an FPU), the usual alternative to floating point is "fixed point" arithmetic.
Basically you use the platform's native types (e.g. uint32) and decide which of the bits are for the integer part and which are for the fractional part. A uint32 can be interpreted as a 20/12 split, for example. Your register goes from representing "units" to representing "2^-12 increments of a unit". The ALU can do sums/comparisons/subtractions transparently, but multiplications and (god forbid) divisions require shifting to return to the original representation.
The choice of how to split the bits is a tradeoff between range and precision. It can vary from one routine to another. It's a pain to compose code with it.
Short story: native float types are a blessing for programmer productivity and library interoperability.
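To make the shifting concrete, here is a minimal C sketch of the Q20.12 layout described above (the split and the names are illustrative, not from the comment):

#include <stdint.h>

/* Q20.12: 20 integer bits, 12 fractional bits, so 1.0 is stored as 4096. */
typedef int32_t q20_12;
#define Q_FRAC 12
#define Q_ONE  (1 << Q_FRAC)

static inline q20_12 q_from_int(int32_t i)      { return i * Q_ONE; }   /* scale up      */
static inline q20_12 q_add(q20_12 a, q20_12 b)  { return a + b; }       /* plain ALU add */

/* multiply: the raw product is Q40.24, so widen to 64 bits and shift back */
static inline q20_12 q_mul(q20_12 a, q20_12 b)
{
    return (q20_12)(((int64_t)a * b) >> Q_FRAC);
}

/* divide: pre-scale the dividend so the quotient lands back in Q20.12 */
static inline q20_12 q_div(q20_12 a, q20_12 b)
{
    return (q20_12)(((int64_t)a * Q_ONE) / b);
}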
To use only integers, you generally try to express all numbers in units of the smallest quantity. So any money is in integer cents, and any distance is in integer multiples of the spatial resolution. It gets annoying with derived units: time can be stored in integer frames, but then you'll need a rational type of some kind to properly handle velocity. Or you just round to the nearest (spatial resolution) per frame and hope the errors don't look too weird on the other side.
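A rough C sketch of that convention (the type names and the 60 fps assumption are mine, purely illustrative):

#include <stdint.h>

typedef int64_t cents_t;    /* money:    $1.23        -> 123  */
typedef int64_t millim_t;   /* distance: 1.5 m        -> 1500 */
typedef int64_t frames_t;   /* time:     0.5 s @60fps -> 30   */

/* velocity as a rational (smallest units per frame) so that e.g.
   1 mm per 3 frames stays exact instead of rounding to 0 per frame */
typedef struct { int64_t num, den; } rate_t;

static millim_t advance(millim_t pos, rate_t vel, frames_t elapsed)
{
    return pos + (vel.num * elapsed) / vel.den;   /* round once, at the end */
}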
Floats are essential for machine learning, scientific computing, and any other application where the goal is to model mathematical processes involving real numbers. The idea that there is something out there that could do the job better than floats is almost certainly wrong.
I'm fascinated by how common it is for programmers to hate floats. Yes, if you write business software for a living you may not have much use for them. But there's a lot more to computing than business software.
For some reason many programmers love to hate technologies that are old.
Hate floats, love big decimals
Hate OOP, love functional
Hate RDBMS, love nosql
Hate html web pages, love SPA
In my opinion it's a confluence of influences.
1) the same social media attitudes poisoning society generally. "I've got an opinion, and it's worth just as much as yours". News flash - opinions are like arseholes, everyone has one.
2) Inexperience - devs who have only ever built chat apps and don't understand why relationships, hence RDBMSs, are common and useful in line-of-business applications. Devs who have never tried to capture the water level in a tank or the voltage from a solar panel and don't get why floats are useful.
Not disputing your other points, but functional programming is by most measures older than OOP. The FP hype now is definitely more a matter of rediscovering the wisdom of the past, and I think that's a good thing.
Okay, so now, relative to floating point arithmetic, you've sacrificed an enormous amount of dynamic range and have to worry about overflow and underflow much more than before.
What did you gain from this, exactly?
I'm sure there are some special situations where fixed point is better, but for general purpose scientific and technical computing, floating point is obviously the right choice.
Fixed point is obsolete on the vast majority of compute devices today. It is only a reasonable alternative in ultra low power scenarios or on tech stacks that haven't migrated to modern architectures, like some DSPs (which are also really only applicable in ultra low power devices).
On modern CPUs, fixed point is slower and error prone (both in usage and the computations themselves) relative to floating point.
I don't think it's a pity that fixed point support is out the door. It sucks. Floats minus subnormals are the least surprising numerical representation out there, and the most versatile.
Do you really want to worry about whether your numbers are saturating or wrapping around? Keeping track of what your max/min are? Tracking decimal places? Implementing division by hand? Screw all that. Fixed point is awful and is essentially an academic exercise today.
Well, as mentioned elsewhere, for historical reasons GPUs tend to optimize floating point operations really heavily, whereas bigint-style things likely don't benefit, in part because there's no "standard" for them at that low a level. So it seems to me to be a bit of a chicken-and-egg problem: low-level languages don't have them as part of the standard, so graphics cards can't universally optimize for them, and since graphics cards don't support optimizations for them, there isn't anyone clamoring for it.
One example where floating point does not match the problem field is GIS. Coordinates are bound by the earth's circumference. Fixed point would make much more sense, but neither hardware nor software support is there.
I mean most GIS software is pretty aware of precision models. Take GEOS for example, it's got a variety of ways to specify the precision in number of decimal places for a geometry. It still uses doubles underneath, but that's to avoid unneeded code complexity.
Lots of the math in games and ML assumes you’re operating in the reals. For example, the optimization in neural networks. Integers and fixed point won’t work.
Only in the strictly mathematical sense. To a first approximation, floating-point numbers behave like real numbers, and that's good enough for many use-cases (not all, though).
Similarly, fixed-width integers have a totally different algebra than true integers, and yet they're immensely useful.
fixed-width integers have exactly the same semantics as true integers bar overflow, which is a very easy to understand concept.
Floating-point numbers behave like a physicist would do calculations, rounding at every step, but doing everything in binary. Or hexadecimal if you prefer, it's equivalent, but still difficult to get your head around. Then there are additionally +0 and -0 and +∞ and -∞, and several settings for how rounding is done exactly.
"To a first approximation, floating-point numbers behave like real numbers"
This is incorrect. To count as an approximation, there have to be some accuracy bounds. This is impossible to define, as the reals don't have the same cardinality as any floating point number system.
Now, for many interesting and useful real-valued functions, a float-valued function can be defined that is an approximation. But there is no general way to map a function defined on the reals to a function defined on the floats with a known accuracy.
ML optimization algorithms generally do require proper handling of very, very small numbers, e.g. for the calculation of gradients.
There's a bunch of work on discretization of models after you've finished optimizing them - that works, and you can get high-performance inference with low-bit fixed point numbers; but that's after the learning has been done using proper floating point (however, you don't necessarily need high-accuracy floating point, e.g. single precision may work better than double precision simply because it's half as many bytes to copy over).
Referring to ML is so vague that it’s not particularly meaningful. The answer in general is no. There’s been some work showing neural networks can function with low precision floating point, but that’s still pretty different than fixed point.
From Microsoft BASIC to Javascript, many programmers have worked in languages that only have floats. Financial calculations involve exponential/log math much like scientific problems. There are issues with rounding and representation there (e.g. there is no such thing as $0.01 in floating point, only $0.50, $0.25, $0.125 and some sum of 1/2 fractions that comes very close to $0.01 and even reads in and writes out as $0.01 even though it isn't.)
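A quick C illustration of that last point (the exact printed digits depend on your libc's formatting, so treat the comments as indicative):

#include <stdio.h>

int main(void)
{
    /* a penny is not representable in binary floating point, so summing
       0.01 a hundred times does not land exactly on 1.0 */
    double total = 0.0;
    for (int i = 0; i < 100; i++)
        total += 0.01;
    printf("%.17g\n", total);        /* close to, but not exactly, 1 */
    printf("%d\n", total == 1.0);    /* 0: the error does not cancel out */
    return 0;
}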
In some game engines, Fixed Point Arithmetic is used instead of Floating Point, because it's faster, and the approximation is generally good enough for a game.
It might be a little faster for addition and subtraction in synthetic benchmarks -- and with the use of pipelining might end up being slower overall -- but multiplication and division are considerably faster with floating point.
Division, yes; multiplication is faster too, but not hugely. The limiting factor for multiplication is that it's O(N^2) in the multiplied digits; so 32-bit fixed point has a 32x32 multiply while IEEE fp32 multiplication has a 24x24 multiply (counting the implicit leading bit).
That's assuming that you're implementing multiplication in software though, right? Since CPUs have dedicated multipliers, aren't both around a single cycle?
They are not in general (IIRC 64-bit multiply is down to 3 cycles of latency now), but an x86 CPU can usually run micro-instructions out of order enough that the multiplication looks like it finished in one cycle.
I don't think this is true. Integer math is used in some game engines because they need to be deterministic between networked computers (and different CPUs round floating point numbers differently). I have written an engine like that, and I know that StarCraft 2 is all integer for the same reason. No one does it because it's faster or easier. It's a pain.
It depends on which 0 you are talking about. The 1-1=0 or the 1/infinity zero. Somewhere deep in a physics department there is an equation where the distinction either confirms or denies the multiverse.
Actually, the rounding for basic add/multiply are strictly defined and configurable (some algorithms need other rounding rules than the default "round to nearest even").
If you only use those (maybe including division, I'm not sure), and don't rely on hardware support for sqrt, sin/cos/tan, approximate-inverse, etc., you can totally use them deterministically between different architectures.
>And computing integers IS faster than computing floats, at least on a CPU.
In the abstract, yes. In reality, not so much. A lot of what floats are used for in games is vector math (you are often in a 2D/3D world), and vector math in fixed point requires a lot more integer work, since you constantly need to shift things down to avoid overflow. Overflow bugs are a constant problem when doing dot products in fixed-point integer math.
Another common operation in vector math is the square root (to normalize vectors). Modern CPUs do that very fast, and you can also use clever approximations that exploit the way floating point numbers are represented in hardware.
Most of what's done in 32-bit floats in games needs to be done with 64-bit integers to manage these problems if you decide to use integers, and that means your data grows, you get more cache misses, and things slow down.
On top of this, everything you do in a game needs to be drawn by the GPU, so you need to convert everything to floats anyway to show it on screen, and converting between integers and floats is also slow.
I never use the square root in games, I always compare squared distances (since the ordering is the same).
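E.g., a trivial sketch in C:

#include <stdbool.h>

/* sqrt is monotonic on non-negative values, so comparing squared lengths
   gives the same ordering without ever taking a square root */
static bool within_radius(float dx, float dy, float radius)
{
    return dx * dx + dy * dy <= radius * radius;
}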
If you try to do with Fixed Point Arithmetic what was intended to be done with Floating Point, you're the problem.
A 32-bit integer can hold values up to 4 billion. If I had that kind of value in a simple game then yes, I would switch to floating point arithmetic, but when does that use case happen if you're not writing a physics simulation game?
> A 32-bit integer can hold values up to 4 billion. If I had that kind of value in a simple game then yes, I would switch to floating point arithmetic, but when does that use case happen if you're not writing a physics simulation game?
All the time. Let me give you an example from a game I made. During early R&D I used 32-bit numbers and a fixed point of 11 bits. That means that a normalized vector is between 1024 and -1023. You can do 2 dot products between vectors before it breaks (10 + 10 + 10 bits plus one sign bit). That means that you have to do a lot of shifting down. The world can only be 20 bits large, because you need to be able to multiply world coordinates with vectors without getting overflow.
20 bits is very low resolution for a real-time world. You get problems with things not being able to move slowly enough at high frame rates (this was in 2D). Switching to 64-bit was the right move.
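For concreteness, here is what that constant shifting looks like in C. This is only a sketch: the Q.11 layout is taken from the comment above, the 5/6 split of the shift is an arbitrary choice, and it assumes arithmetic right shifts for negative values:

#include <stdint.h>

#define FRAC 11                      /* 2048 == 1.0 in this layout */

/* 32-bit-only dot product of two Q.11 vectors: shift the operands down
   first so the products and the running sum stay inside 32 bits, paying
   for it with lost low-order bits -- exactly the juggling described above */
static int32_t dot3_q11(const int32_t a[3], const int32_t b[3])
{
    int32_t sum = 0;
    for (int i = 0; i < 3; i++)
        sum += (a[i] >> 5) * (b[i] >> 6);   /* 5 + 6 == FRAC */
    return sum;                             /* result is Q.11 again */
}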
Technically, 32-bit fixed-point numbers have 8 bits more precision than 32-bit floating-point numbers, but the real problem is that it's quite cumbersome to fully utilize the precision offered by fixed-point numbers, as you'd have to keep their numerical range in mind all the time.
The biggest problem with fixed point is that reciprocals behave very poorly; i.e. x·(1/x)=1 and 1/(1/x)=x fail very badly. In order for these to work you obviously need the same number of representable values above 1 as below 1, like in floating point.
Using fixed point with 11 bits after the decimal, with numbers as low as 100 (18 bits) we already have errors as large as 100·(1/100)=0.977.
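For anyone who wants to reproduce that 0.977, a small C check using the same 11 fractional bits (a sketch; truncating division is assumed):

#include <stdio.h>
#include <stdint.h>

#define FRAC 11
#define ONE  (1 << FRAC)             /* 2048 represents 1.0 */

static int32_t q_mul(int32_t a, int32_t b) { return (int32_t)(((int64_t)a * b) >> FRAC); }
static int32_t q_recip(int32_t x)          { return (int32_t)(((int64_t)ONE * ONE) / x); }

int main(void)
{
    int32_t hundred = 100 * ONE;                        /* 100.0 */
    int32_t r = q_recip(hundred);                       /* 1/100 truncates to 20/2048 */
    printf("%f\n", q_mul(hundred, r) / (double)ONE);    /* ~0.977, not 1.0 */
    return 0;
}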
Modern CPUs, like Zen2, have vastly more f64 multiplier throughput than 64 bit integer multiplier throughput.
This goes so far that Intel added an AVX-512 instruction to expose this 52/53-bit multiplier for integer operations, to accelerate bigint cryptography calculations like RSA and ECC.
They all follow IEEE 754. But that defines the representation of the bits in a floating point number, not what happens when you operate on them. Different implementations do rounding differently, and there are arguments about how to do this correctly. There can even be multiple ways to represent the same number (this is a reason never to do a == compare on floating point numbers that have been operated on). Not only have there been differences between CPU makers, but also between different CPUs from the same vendor.
A separate complicating factor is that floating point math often doesn't yield the same results with different levels of optimization turned on. The reason to use integer math for games is that you want it to be deterministic, so this can be an issue.
IEEE 754 does also specify operations and their behavior. E.g. section 5.1 (from 2008 edition):
> All conforming implementations of this standard shall provide the operations listed in this clause for all supported arithmetic formats, except as stated below. Each of the computational operations that return a numeric result specified by this standard shall be performed as if it first produced an intermediate result correct to infinite precision and with unbounded range, and then rounded that intermediate result, if necessary, to fit in the destination’s format (see 4 and 7).
Well, the fact is that different hardware rounds it differently, so it's the world we live in. My guess is that actually implementing something in hardware "to infinite precision and with unbounded range" might not be realistic.
// 0.0d / 0 equals 0x7ff8000000000000 (NaN)
// Math.sqrt(-1) equals 0xfff8000000000000 (NaN)
// 0x0p+0d is a funky way to specify 0x0000000000000000 (0)
// -0x0p+0d is a funky way to specify 0x8000000000000000 (-0)
// without 0xHEXp+NUMd my compiler optimizes the "-0" literal to 0
// 0 == -0
0x0p+0d == -0x0p+0d
// hashCodes for 0 and -0 are different
Double.hashCode(0x0p+0d) != Double.hashCode(-0x0p+0d)
// hashCodes for different NaNs collapse to the same value
Double.hashCode(0.0d / 0) == Double.hashCode(Math.sqrt(-1))
I can't find the original complaints about this (it was back in the early 2000s after all), but as I recall it, someone was very surprised that something that compared as equal ended up twice in the hashmap, and that messed up their code badly.
> I recall there being an issue with double.GetHashCode() in the early versions of .Net, where it would return different hash values for 0.0 and -0.0.
I never thought about that (I avoid FP as much as I can) but, oh boy, the can of worms!
.Net (or Java) OOP where an object's hashcode needs to give the same (or not?) value for 0.0 or -0.0: this is the kind of stuff nightmares are made of.
I'm sure this can be turned into some very funny "Java puzzler".
If you compare for equality then those two numbers, 0.0 and -0.0, are considered equal. It seems a lot of people consider that a good reason for them to have the same hash value, which I consider entirely reasonable.
Of course since NaN != x, whatever x is (including infinity and NaN), one could argue it's fine for different NaNs to return different hash codes. But I think most people think of NaN as one thing, and not the 9007199254740990 or so different NaN encodings[1] there are in a double.
Does anyone have a good example of a programming problem where we _need_ signed zero? Or where it makes things significantly simpler? As far as I know, there is no distinction between -0 and +0 in math, so I have never really understood why this is a thing in computers.
When does this need arise? Well, otherwise inverting a value can change its sign, and in particular inverting -∞ twice will give you +∞, and being off by "2∞" is a pretty large error for a lot of computations ;)
You can end up with zeros and infinities pretty easily because you overflow or underflow the range of floating point precision, and generally you want something sensible to happen in typical cases, even if some common arithmetic identities necessarily break down.
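A tiny C demonstration of why the sign of an underflowed zero matters (compile without -ffast-math; the printed forms are what a typical libc produces):

#include <stdio.h>
#include <math.h>

int main(void)
{
    double tiny = -1e-300 * 1e-300;           /* underflows to -0.0 */
    printf("%g\n", tiny);                     /* -0   */
    printf("%g\n", 1.0 / tiny);               /* -inf */
    printf("%g\n", 1.0 / (1.0 / -INFINITY));  /* -inf: the sign survives the round trip */
    return 0;
}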
I would actually like a true signed zero (or rather "epsilon" value), so -0 and +0 as distinct from a "normal" 0 which is truly unsigned, neither positive nor negative. The former two would only arise from underflow, and the reason this is useful is that if you underflow from below zero and invert that, you want to get -∞, and if you underflow from above zero and invert that, you want to get +∞. Inverting a signless zero should give NaN (instead it gives +∞, which is nonsense in basically any case where the domain is not inherently the non-negative reals already and the 0 did not come about by an underflow; in particular 1/0 should be NaN).
If anyone knows why this design was not chosen and what fundamental downsides it has, I'd love to hear it. Obviously representing three zeros is a tad more annoying, but IEEE754 has a lot of stuff that's annoying implementation wise but was added for nicer numerical behavior (e.g. denormals, and of course various "global" rounding modes etc. which probably qualify as a mistake in retrospect).
> If anyone knows why this design was not chosen and what fundamental downsides it has, I'd love to hear it.
No fundamental ones, but a few practical.
32-bit numbers have 2^32 unique values, which is an even number. Your approach makes the range asymmetrical, like what happens with integers.
The range for 8-bit signed integers is [ -128 .. +127 ]. On ARM NEON there are two versions of the integer negate and absolute instructions: some (like vqnegq_s8 or vqabsq_s8) do saturation, i.e. transform -128 into +127, others (vnegq_s8, vabsq_s8) don't change -128. Neither of them is particularly good: the saturated version violates -(-x) == x, the non-saturated version violates abs(x) >= 0. The same applies to the rest of the signed integers (16, 32, 64 bits), on all modern platforms.
With IEEE floats the range is symmetrical and none of that is needed. Moreover, PCs don't have absolute or negate instructions for floats, but they instead have bitwise instructions processing floats, like andps, orps, xorps, andnps; these allow you to flip, clear or set just the sign bit, very fast.
Another useful property of IEEE representation is that for two intervals [ 0 .. FLT_MAX ] and [-FLT_MAX .. -0.0f ] sort order of floats corresponds to [inverted] sort order of 32-bit integers.
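One common way to exploit that property is to build a sortable integer key; a sketch in C (assuming no NaNs in the data):

#include <stdint.h>
#include <string.h>

/* Map a float to a uint32_t whose unsigned order matches the float's numeric
   order: negatives get all bits flipped (un-reversing their order),
   non-negatives just get the sign bit set so they sort above the negatives. */
static uint32_t float_sort_key(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
}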
> Your approach makes the range asymmetrical like it happens with integers.
Not necessarily, since you've got (a lot of different) NaNs anyway. For the sake of argument, you could give up one of them and make it positive zero (since this representation would be unnatural, it would slow stuff down, just as non-finite values do on many CPUs. It wouldn't matter that much, since signed zeros would only arise from underflow).
> Another useful property of IEEE representation is that for two intervals [ 0 .. FLT_MAX ] and [-FLT_MAX .. -0.0f ] sort order of floats corresponds to [inverted] sort order of 32-bit integers.
I'm aware, but as you correctly note this only works in the right direction for unsigned values. And it's just not that important a benefit, I'd much rather have my calculations come out right than being able to sort positive floating point numbers with an integer sort routine.
> you could give up one one of them and make it positive zero
What do you expect to happen when you set the sign bit of that number? A possible answer to that is "negative zero", and now you have 2 separate encodings for negative zeroes: one of them with exponent 0, another one 0xFF, and they behave slightly differently.
If you manually screwed around with the most significant bit of the underlying bit pattern (rather than just writing -x like any normal person) I'd expect to get a NaN -- assuming we keep an explicit sign bit in the representation at all. It would obviously make hardware implementation of negation more complex (and thus slower) but I don't think there is any conceptual problem.
PC hardware doesn’t have hardware implementation of negation.
Possible to do in 2 instructions: xorps to make a zero, then subps to subtract. Combined, they gonna take 4-5 cycles of latency (xorps is 1 cycle, subps is 3 cycles on AMD, 4 cycles on Intel).
If you do that a lot, a single xorps with a magic number -0.0f gonna negate these floats 4-5 times faster. People don’t pay me because I’m a normal person, they do that because I write fast code for them :-)
On a serious note, I’d rather have the current +0.0 and -0.0 IEEE values to be equal and be the exact zero, and make another one with 0xFF exponent encoding inexact zeroes, +0.0f or -0.0f depending on the sign bit.
Or another option, redefine FLT_MIN to be 2.8E-45, and reuse the current FLT_MIN, which is 1.4E-45 / 0x00000001 bit pattern, as inexact zeroes.
I wonder if it would be useful to have a signed integer type with a symmetric range, with the one value that is left over used as NaN. Overflows would set it to NaN, for example.
Then again, once that's on the table it's very tempting to steal two more values for +/- inf. I think it's very useful to have full range unsigned ints but signed ones could have range reduced to make them less error prone.
Posits handle this by using the smallest non-zero number where floating point would go to zero. They also use the largest representable number instead of infinity. These exist for both positive and negative numbers.
At least that's the way I read it.
Having a single (unsigned) infinity (and a single zero!) seems cleaner in some ways (and I dimly seem to recall that some pre-ieee754 floating point hardware worked that way). On the other hand, having e.g. a neutral element for both max and min also seems pretty nice to have, although without infinities, the maximal and minimal floating point value will equally do the trick in most cases.
How do inequalities work for posit infinity? Is posit infinity both larger and smaller than any other posit?
I meant as in infinitesimal, not as in machine epsilon (which lacks the required properties) -- poor wording on my part. If you have numbers of impossibly large magnitude you probably also want corresponding numbers of impossibly small magnitude. You can do this in a nice, algebraically satisfying way with infinitely many such numbers (hyperreals), but I think if it weren't for the signless zero and positive zero conflation, IEEE754's way would be quite a reasonable finite precision approximation to this.
That is not entirely correct. Schmieden and Laugwitz, for example, developed in the 1950s a nonstandard analysis which adjoins an infinitely large element (called Ω) to the natural numbers. The basic idea was that a formula A(Ω) was true if A(n) was true for almost all finite natural n.
While it wasn't immensely useful going forward, it helped to clarify the use of infinity and infinitesimals in earlier work.
Ah right, I forget that extending with a single infinity element is useful with complex numbers and with geometry. It's still not very common with the reals alone as +∞ and -∞ are reasonable to want as separate elements there, but it doesn't play nicely with 1/0 that way.
That's right. It's a symbol. When you see it in an expression, you're probably expected to interpret it in light of a limit of some kind. It's just convenient to write it into an expression rather than use cumbersome limit notation all over the place.
Like many notational shortcuts, it's a hack supported by proof. ;-)
In my view, numbers are numbers, and symbols are symbols. There's an agreement that a symbol represents a number, but there's not a one-to-one relationship between available symbols and available numbers. Normally this isn't a problem, and we treat them interchangeably. And indeed the distinction may only be a philosophical oddity or a matter for mathematicians. But I believe nonetheless that there is a distinction.
Now I was merely an undergrad math major, which means I topped out before learning this stuff in a formal way. But at my primitive level of understanding, I think of a number as something that behaves like a number within a given system. What I learned in my courses was how different kinds of numbers behaved: Whole numbers, reals, complex, vectors and tensors, etc. I remember reading a definition of "tensor" that was to the effect of: A tensor is something that behaves like a tensor, meaning that the important thing is the behavior.
Another post in this page expressed that we should be particularly cautious when dealing with numbers, symbols, and IEEE floats, notably to beware that IEEE floats and real numbers don't always behave the same. That was treated in one of my math courses, "Numerical Analysis." You could also get CS credit for that course, suggesting its practical importance.
I think the consequences of what you are saying makes sense. Would be neat to explore more of the idea. I was starting to find, recently, that it was better to think of numbers as symbols that follow operational rules. This view seems counter to that.
Yes, in a way. What distinguishes irrational numbers from rational numbers is that all rational numbers can be represented by strings drawn from a regular language. For example, all strings generated by the regular language "-?\d+\.\d+?_\d+" (where "_" denotes the repeating decimal expansion as in 1/6 = 0.1_6) correspond to exactly one rational number and all rational numbers correspond to at least one string in this regular language. Irrational numbers (and other types of "numbers" such as +inf and -inf) cannot be represented by any such regular language.
Correct. Given the regular language I specified, each rational has an infinite number of matching strings: 1.0 = 1.00 = 1.000 = 0.9_99 = 1.0_0 etc. The point is that for every rational number you can think of, I can show you at least one string in my regular language to represent that number.
According to the finitists, this is a defining feature of a "number". Since the same can't be done for irrational numbers finitists conclude that irrational "numbers" aren't numbers. You probably agree that all numbers are (or can be represented by) symbols, but that not all symbols are numbers. So how do we distinguish symbols from numbers?
Western hemisphere is negative longitude, eastern hemisphere is positive. If your degrees are 0 (within ~100km? I can't remember exactly), then you still need to know if you're east or west of the meridian.
Computers (obviously) have to approximate the majority of actual mathematical numbers, as they do not have infinite storage.
If you've got two numbers, +0.0000001 and -0.0000001, but you can't represent that precision, can you see how it's less bad to round to +0.0000 and -0.0000 rather than to just 0.0000? It's encoding strictly more information.
Reading your comment gave me a realization that transformed my understanding of floats.
What just clicked for me was that in any system where we use finite precision to store exponents, we can't actually have zero as a normal value... we'll always underflow the exponent before we get to actual mathematical zero.
So +0/-0 is actually a convenient misnomer. They really are +/- epsilon.
The only way to have zero is if it's some specially handled value like NaN. Which IEEE doesn't do and that's entirely understandable.
Makes sense why you can never compare a subtraction of floats to zero (it's beyond just "rounding errors") and the existence of +0/-0 seems quite natural now.
> The only way to have zero is if it's some specially handled value like NaN. Which IEEE doesn't do and that's entirely understandable.
Wait what? Am I missing something? 0 is absolutely part of the IEEE 754 spec thanks to the existence of denormalized floating point numbers. So I would certainly call it a "specially handled value", in a sense. The existence of +/- 0 has more to do with the implementation of the leading sign bit.
Really good point. My approach to this was "if it’s not used in any mathematical algorithms, why do we need it in computers?". But in your example, you retain some information even though you can’t represent the whole truth. Thanks!
You can look these things up for yourself - they're standardised in most languages' implementations in something called IEEE 754. In the cases you've asked about they're false and true. Is this what you want in all cases? No. Is this what you want in some cases? Yes. It's a tradeoff. You can still observe the difference by dividing by zero (which should be another indication that these aren't real numbers as we conventionally understand them.)
> they're standardised in most languages' implementations in something called IEEE 754
There is the IEEE 754, and there is the language's standard. One should always look at the latter because it's often the case that the language doesn't fully conform to IEEE 754.
You shouldn't really use equality on floating point numbers, except in very special circumstances (and I imagine the == behavior for 0 breaks things more often than it helps). But the Wikipedia page on -0 has your case covered:
> According to the IEEE 754 standard, negative zero and positive zero should compare as equal with the usual (numerical) comparison operators, like the == operators of C and Java. In those languages, special programming tricks may be needed to distinguish the two values
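The usual tricks look like this in C (signbit and copysign are standard C99; a small sketch):

#include <math.h>
#include <stdio.h>

int main(void)
{
    double pz = 0.0, nz = -0.0;
    printf("%d\n", pz == nz);            /* 1: == cannot tell them apart     */
    printf("%d\n", signbit(nz) != 0);    /* 1: but the sign bit is visible   */
    printf("%g\n", copysign(1.0, nz));   /* -1                               */
    printf("%g\n", 1.0 / nz);            /* -inf: the classic division trick */
    return 0;
}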
> You shouldn't really use equality on floating point numbers, except in very special circumstances.
This is very common advice, so common that it gets cargo-culted into situations where it is really quite poor.
Information storage, retrieval, and transmission systems should faithfully deliver floating-point values that are good to the last bit. Round-trips through databases, transmission over network protocols, etc should all give values back that are exactly identical to what was put into them.
Oh, sure. But those applications should also not use the floating point equality operators. They deal with opaque data, and should make sure not to corrupt it.
Keep in mind that the ISO standard does require that floating point equality tests return false for some values that have the exact same binary representation (NaNs), and that not everything adheres to it: some environments may give you false for identical values even when the standard says it should be true. Also, != is not the negation of == for floating point. So even using those operators to test a round trip over those applications is iffy.
By "the last bit" do you mean the 32nd bit or the 64th bit? :-)
Many times I've tracked down the place in our stack where a double-precision value from user input accidentally goes through a single-precision variable in some C code somewhere and crashes some Python code later on because the values don't match "to the last bit" in the way that the programmer thought... But that's a bug in the C code - I agree completely that the system SHOULD give the value back that was put into it!
This is actually exactly what I mean. It's probably the most common bug I've come across in this class. I don't expect unit tests to capture all rounding bugs (say, due to serialization and de-serialization through text). But I do expect them to capture gross errors, such as an inadvertent cast to lower precision somewhere in the pipeline.
I've worked with highly experienced and accomplished software engineers that expected interchange through protobuf or sql to be inaccurate due to rounding. No! If you stick a finite number in, you should get the exact same finite number back out again. Direct equality is fine for most cases. The sign bit of zero and NaN should also be returned faithfully and tested using memcmp when required.
IMO, the payload bits of NaN should also be returned faithfully, but too many systems in common practice drop them.
> Round-trips through databases, transmission over network protocols, etc should all give values back that are exactly identical to what was put into them.
Yes that's true... but what's that got to do with using an equality operator?
One should usually not rely on this, unless your language gives guarantees. I don't think even IEEE 754 gives you the guarantee you are implying.
To give you an idea, the C99 standard does not require signed zeros, and if you have them, does not dictate the behavior you are describing. I once worked on a commercial C compiler and we produced a new version that resulted in a change of sign of zero for the exact same computation (and even "worse", would give a different sign on different machines for the same compiler version). We discussed this thoroughly and decided it was OK because our docs made it clear we conform to C99 (which provides no guarantees on signed zero).
No, if x = -0.0 then x+0 = 0.0 and x*0 = -0.0. This violates fundamental tenets of arithmetic, surprises most, and makes many compiler optimizations impossible.
I agree with many of the other replies, but I would also add, perfectly serious and with no sarcasm, do not be fooled; computers do not use "math" numbers. They use what they use; machine-word bounded integers and IEEE floats most commonly, unbounded integers, some variants on rational numbers, and sometimes some more exotic things, but they never use real numbers. They're bounded to the computable numbers.
So whether or not there's a -0 or +0 in some "math" isn't relevant, because computers aren't using "some math" but very specific mathematical constructs that must be understood on their own terms. I haven't seen very many mathematical systems that have a reified "NaN" object that can be explicitly passed around as a valid value to functions. (I know of many systems that have a "bottom" but I would say bottom is usually presented not as an object you can "have" but as a property of some mathematical object. Haskell for instance uses this idea; there are some trivial ways to have or produce something that has the property of being "bottom", like calling "error", but you can't just "have the bottom value".)
Moreover, there are mathematical systems in which such things can appear. There are many exotic and interesting number systems in the mathematical world, which even a bachelor's degree in mathematics may only scratch the surface of, depending on which junior/senior level courses you take. Real numbers are without a doubt the most studied, and they do not contain a -0 (though even then you'll still see it show up sometimes as a special notation in limits), but they are the beginning of number systems, not the end.
I mention all this because it's important; it's a very common misconception that computers use the numbers like you learned in school, and it will lead to nothing but pain. The next misconception is that, ok, sure, they aren't those numbers exactly, but they're so close that I don't have to worry about it. This works as long as you don't push your floats too hard, and a lot of us don't, but also breaks down surprisingly quickly. It's important for programming professionals to understand in general that IEEE floats are their own thing. Whenever I use them I do at least take a couple of seconds to double-check mentally that the sharp pointy bits aren't going to stab me, even for simple things like adding durations of a request to some floating point accumulator for metric purposes.
Floating point math was devised to be used in numerical methods that are applied iteratively, like gradient descent. In that world what you are always doing is "converging" to solutions step by step, and a negative or positive zero can tell you which direction you're converging from.
It’s used when writing SIMD code all the time to quickly mask bits as needed or to check the sign of a floating point number with an SSE intrinsic. _mm_xor_ps(_mm_set1_ps(-0.f), reg) as an example negates all four components of reg.
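For example (SSE intrinsics, as in the parent; the abs variant is the same idea with the mask applied the other way):

#include <immintrin.h>

/* XOR with -0.0f flips only the sign bits: negates all four lanes at once. */
static __m128 negate4(__m128 v)
{
    return _mm_xor_ps(v, _mm_set1_ps(-0.0f));
}

/* ANDN with -0.0f clears the sign bits: absolute value of all four lanes. */
static __m128 abs4(__m128 v)
{
    return _mm_andnot_ps(_mm_set1_ps(-0.0f), v);
}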
Sign/magnitude ADCs almost have signed zero in hardware, by generating a sign bit and zero or more magnitude bits. The typical method to map to voltages doubles the span between adjacent values and skips over zero. So a 2-bit sign/magnitude ADC measures values in the series {-3, -1, 1, 3}.
So strictly speaking this number system doesn't have a representation of zero at all. Tiny measurements either round up or round down to +/- 1.
Sometimes the sign is used to indicate which "direction" the temperature is moving. If it was -10° overnight and it's -0° now, the puddles outside will still be frozen. If it was 10° overnight and it's 0° now, the puddles will still be liquid.
(Edit: no idea whether this applies to the Apple watch, it's just a use case for -0 with regards to temperature.)
Your predictions about the state of the puddle are most likely right, but not for the reasons that you think.
Air temperature is commonly measured at 2m above ground. A measurement of 0° air temperature does not imply 0° ground temperature. The more significant effect is that the ground has a higher thermal capacity than the air, so it changes temperature more slowly throughout the day. If it's been colder before and now it's 0°, the ground is still way below 0° and thus puddles are frozen. If it was warmer and now the air has cooled down to 0°, the ground is going to be a bit warmer still and thus puddles are liquid.
Denoting this difference as +0° and -0° does not seem very useful since the same effect is going to be nearly equally significant at 1° or -2°.
(Sidenote: The thermal capacity of the ground is also the reason why air temperature is not measured anywhere near the ground.)
The thermal capacity of the ground is also why, if you're insulating a basement or slab, you want to insulate the edges and DOWN - but you don't need to insulate in the center of the slab (as eventually the ground reaches the ambient temperature and acts as a huge thermal mass).
You're right. What I meant to say was that it's possible the programmers working on the weather app didn't add explicit rounding, they just passed the value through a formatter that did the rounding implicitly.
I believe that's actually a standard presentation, I've seen it in several weather graphs.
It's basically rounding, yeah: it signifies that it's just slightly below zero, but not enough to round to -1°.
> Does anyone have a good example of a programming problem where we _need_ signed zero? Or where it makes things significantly simpler?
Maybe when you're implementing floating point numbers in a way that's simple and widely applicable?
I'm only half joking, too, though I can't tell you what exactly having a distinct sign bit simplifies.
I can say off the bat that it is probably useful that a very small quantity that might otherwise lose enough precision to round to zero maintains its sign regardless.
A distinction can be made when considering the limit of a sequence. If a sequence converges to zero from positive values (sometimes written -> 0+), then the inverse of the sequence will diverge to +inf, while if the sequence converges to zero from negative values (-> 0-), the inverse will diverge to -inf.
As a sibling comment wrote, rounding numbers to -0 and +0 can provide extra information, though it may not be useful in all contexts.
You can't get a negative zero with a constant declaration, nor can you get NaN, infinity. These are documented language facts.
"""Numeric constants represent exact values of arbitrary precision and do not overflow. Consequently, there are no constants denoting the IEEE-754 negative zero, infinity, and not-a-number values."""
As an aside, historically many of the early binary computers used ones complement. So they had both -0 and +0 integers as well. While ones complement is probably more obvious than twos complement, it was near-universally considered a terrible mistake by the mid-1960s, and everything of note since has been twos complement.
It didn't die quickly though. The UNIVAC is still with us (in emulation) and that is likely the reason why the C standard addresses the question of negative zero integers. (Their handling is implementation specific, of course.)
> Depending on the programming environment and the type of number (e.g. floating point, integer) being divided by zero, it may generate positive or negative infinity by the IEEE 754 floating point standard.
Not if you know that it's positive or negative. The inverse of a negative infinitesimal can't be anything other than a negative infinite number, and the inverse of a positive infinitesimal can't be anything other than a positive infinite number. There's no difficulty with the definitions.
The limit of a function 1/x as x approaches zero from the left or right is well defined as negative or positive infinity respectively. The limit is only undefined when no approach direction is specified (as the results from the left and right do not agree)
Floating point is a real trip. They should probably spend more time going over its intricacies in programming courses, because it's not going anywhere.
E.g. you should hardly ever test floating point numbers for equality. Instead you usually check if they are "close enough" within an expected precision.
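Something along these lines in C (the tolerances are application-specific, not magic constants):

#include <math.h>
#include <stdbool.h>

/* "close enough": absolute tolerance for values near zero,
   relative tolerance everywhere else */
static bool nearly_equal(double a, double b, double rel_tol, double abs_tol)
{
    double diff = fabs(a - b);
    return diff <= abs_tol || diff <= rel_tol * fmax(fabs(a), fabs(b));
}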
Floating point numbers represent the extended real set, where infinity exists. That is completely different from the integer numbers most computers use, on which division by zero is undefined.
Division by zero is still undefined in the extended reals. To define it you'd need something like the projective reals. But it doesn't matter; floating point numbers don't have a zero; they have two zeroes of different signs.
There are still (a few) computers in the world (Sperry Univac legacy support) using one's complement arithmetic, hence having a positive and negative zero in the architecture.
There's also explicitly no infinity or NaN; there is a software throw for all (core) implemented function domain failures.
This has, however, recently come up for Nx (numerical elixir) which had to implement and standardize ways to shim these IEEE concepts back into the VM for interop purposes.
> This has, however, recently come up for Nx (numerical elixir) which had to implement and standardize ways to shim these IEEE concepts back into the VM for interop purposes.
This would have been an excellent place to use Ada. The package "Interfaces" has types exactly for IEEE Floats: `Interfaces.IEEE_Float_64`.
You could then get rid of non-numeric representations (raising `Constraint_Error` instead) via the following subtype-definition:
subtype Float is Interfaces.IEEE_Float_64 range Interfaces.IEEE_Float_64'Range;
(I've thought about writing an Erlang in Ada, hoping to tie together Erlang's actors w/ Ada's Task where possible [ie focus on interop], but haven't found any Erlang language definition that's anywhere near recent.)
I don't understand what you think a result could look like here that does distinguish the sign? What are you expecting? -2.0? That doesn't make any sense. 2.0 is the correct answer under all circumstances as far as I can see.
Signed infinity is nowhere near the 'expected result'. Zero has no sign. NaN would be a better result off the top of my head.
But I see from the standard that dividing by zero returns infinity with its sign being the XOR of the signs of the dividend and divisor. This also makes no sense - since zero has no sign, it's not possible to tell what kind of infinity would result. It's strange that the standard permits both the infinity result from dividing by zero and simultaneously supports signed zero.