User talk:Vincent Lefèvre


IEEE 754 preferred width and double rounding

Someone was adding a bit to IEEE 754-2008 about expression evaluation and a bit of the standard there struck me as worth querying.

My reading of the preferred width recommendations in IEEE 754-2008 is that if you have a statement

 x = y + z

where x, y, z are all double but the block has preferred width extended, then y and z should be added using extended precision and the result then assigned to x, so one would have double rounding. Is that correct, do you think? Thanks Dmcq (talk) 23:33, 6 April 2012 (UTC)

Yes, in this case, one has double rounding. In order to avoid double rounding, one can use the formatOf operations (Section 5.4) instead of the ones affected by preferredWidth. Note that IEEE 754-2008 doesn't define bindings, so that when one writes x=y+z in a language, it is up to the language to specify which kind of operation is used (depending on the context). Vincent Lefèvre (talk) 00:27, 7 April 2012 (UTC)
Thanks very much. Well setting preferred width to none should do that I believe so that's okay. I guess whatever one does something one doesn't think of will happen! Dmcq (talk) 15:09, 7 April 2012 (UTC)
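For readers who want to see the effect discussed above without depending on a particular FPU, here is a minimal sketch in C that simulates it on integers. It is an illustration only: the names round_nearest_even, round_once and round_twice are made up for this example, and "dropping low bits of an integer" merely stands in for rounding a significand to a narrower format (wide_drop playing the role of the extended format, narrow_drop the role of double). Overflow of the rounded value is not handled.

```c
#include <assert.h>
#include <stdint.h>

/* Round v to a multiple of 2^drop, to nearest, ties to even
   (a tie is resolved by the parity of the kept quotient). */
uint64_t round_nearest_even(uint64_t v, unsigned drop)
{
    assert(drop >= 1 && drop < 64);
    uint64_t unit = (uint64_t)1 << drop;
    uint64_t half = unit >> 1;
    uint64_t q = v >> drop;          /* kept part */
    uint64_t rem = v & (unit - 1);   /* discarded part */
    if (rem > half || (rem == half && (q & 1)))
        q++;
    return q << drop;
}

/* Single rounding: go straight to the narrow format. */
uint64_t round_once(uint64_t v, unsigned narrow_drop)
{
    return round_nearest_even(v, narrow_drop);
}

/* Double rounding: first to a wider intermediate format,
   then from there to the narrow format. */
uint64_t round_twice(uint64_t v, unsigned wide_drop, unsigned narrow_drop)
{
    assert(wide_drop < narrow_drop);
    return round_nearest_even(round_nearest_even(v, wide_drop), narrow_drop);
}
```

With v = 23, rounding directly to a multiple of 16 gives 16, while rounding first to a multiple of 4 (giving 24, an exact tie) and then to a multiple of 16 gives 32. That is the same mechanism as adding in extended precision and then storing to a double.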

A barnstar for you!

The Technical Barnstar
I am awarding you this Technical Barnstar for your work on IEEE floating point. Good Job! Guy Macon (talk) 03:21, 13 November 2012 (UTC)

Double-quad, quad-single, etc.

Could you take a look at Quadruple-precision floating-point format#Double-double arithmetic and perhaps expand it a bit? In particular, many embedded processors have 32-bit floating point arithmetic, and there is a lot of interest in combining two, three or four 32-bit numbers to get extended precision. Yet double-single and quad-single don't seem to be covered anywhere on Wikipedia. Thanks! --Guy Macon (talk) 08:51, 15 November 2012 (UTC)

"Rounding" article undo (incorrect formulas)

Why are the formulas incorrect? If you mean rounding half-integers always up (as the MS Excel function EVEN() does) or always down, that is not banker's rounding.

If the fractional part of the number is 0.5 (a half-integer), then the rounded number is the nearest even integer (which may be up or down from the initial number).

0.5 rounds to 0; 1.5 to 2; 2.5 to 2; 3.5 to 4; 4.5 to 4 and so on.

-0.5 rounds to 0; -1.5 to -2; -2.5 to -2; -3.5 to -4; -4.5 to -4 and so on.

On the other hand, each even number receives two half-integers: 0 gets 0.5 and -0.5; 2 gets 1.5 and 2.5; -2 gets -1.5 and -2.5 and so on.

That is the point of banker's rounding: unbiased rounding.

Almost the same arguments apply to the round half to odd formula.

If you are confused by the multiplier (factor) in front of the floor brackets, or by the addend inside the floor brackets, you should understand that floor and ceiling brackets are not ordinary brackets (parentheses): you cannot move anything into or out of them, except an integer addend (subtrahend), as you would with ordinary brackets. Floor and ceiling functions have some unique rules. Anyway, just try my formulas with a few half-integer numbers and tell me where I am mistaken. :-)

P.S. Sorry for my English.

Borman05 (talk) 16:49, 11 April 2014 (UTC)

The incorrectness is for non-half-integers. For Round half to even, y = 1 gives 2 instead of 1. For Round half to odd, y = 0 gives 1 instead of 0.
Vincent Lefèvre (talk) 17:00, 11 April 2014 (UTC)
I understand this, but these are not general formulas for all cases. As it was said earlier in the article: "Rounding a number y to the nearest integer requires some tie-breaking rule for those cases when y is exactly half-way between two integers — that is, when the fraction part of y is exactly 0.5"
These simple formulas are only for half-integers.
Borman05 (talk) 17:32, 11 April 2014 (UTC)
No, there is a tie-breaking rule for special cases, but then, from this tie-breaking rule, one can find formulas that are valid for all numbers (those given for Round half up to Round half towards zero). Note that if you wanted formulas for half-integers only, such formulas could be simpler than those given.
Vincent Lefèvre (talk) 21:44, 11 April 2014 (UTC)
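To make the distinction in this thread concrete, here is a small sketch in C of a round-half-to-even rule that is valid for every input, not only for half-integers. It works on exact rationals num/den so that no floating-point rounding gets in the way. The function names floor_div and round_half_even are hypothetical (they are not the formulas from the article), den is assumed positive, and overflow for very large arguments is ignored.

```c
#include <assert.h>

/* floor(num/den) for den > 0; C's "/" truncates toward zero,
   so negative non-multiples need a correction. */
long long floor_div(long long num, long long den)
{
    assert(den > 0);
    long long q = num / den;
    if (num % den < 0)
        q--;
    return q;
}

/* Round y = num/den to the nearest integer, ties to even. */
long long round_half_even(long long num, long long den)
{
    long long q = floor_div(num, den);   /* floor(y) */
    long long r = num - q * den;         /* 0 <= r < den, fractional part times den */
    /* Above the midpoint: round up. Exactly at the midpoint:
       round up only if floor(y) is odd, so the result is even. */
    if (2 * r > den || (2 * r == den && q % 2 != 0))
        q++;
    return q;
}
```

With this rule, 1/2 gives 0, 3/2 gives 2, 5/2 gives 2, and the non-tie inputs 1/1 and 0/1 give 1 and 0, which are the cases where the reverted formulas went wrong.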

PWD meaning.

Read the PWD talk page

You have removed my edit in that page without understanding what was posted. Please make sure to READ the relevant talk page for a deeper clarification. Only after you have debunked the claims in that page (if indeed they are wrong) may you remove my post. Talk:Pwd#PWD meaning. — Preceding unsigned comment added by JustToHelp (talkcontribs) 05:38, 22 December 2014 (UTC)

Vincent Lefèvre doesn't need your permission to remove your edits, so you might as well stop giving orders as if you are in charge. The Wikipedia pages that best explain what behavior is and is not allowed when two editors disagree about the content of a page are WP:BRD and WP:CONSENSUS. --Guy Macon (talk) 01:42, 13 August 2015 (UTC)

C Data Types

Genepy Quantum (talk) 01:57, 11 November 2015 (UTC)

Why are you reverting my correct changes? You can calculate the range of data types in this way: short int: 2 bytes, i.e. 16 bits. 2^16 = 65536 possibilities. Now let's consider the numbers including zero: we have 65536 numbers from 0 to 65535 (including 0 and 65535). If you split this range according to the sign behaviour of the type you have: 32768 numbers from -32768 to -1 and 32768 numbers from 0 to 32767. So for a 2-byte signed data type the range is [-32768 ; 32767]; for a 4-byte signed data type it is [-2147483648 ; 2147483647], etc.
If you still don't want to understand, compile and run this easy C source code, testing it with, for example, -32768, -32769 and +32768. You can also change the type of 'a' to test it more.

 #include <stdio.h>

 int main(void)
 {
     short int a;

     printf("\nInsert a number: ");
     scanf("%hi", &a);   /* read the number to store in a */
     printf("\nYour number is: %hi \n\n", a);

     return 0;
 }

You're assuming two's complement. But the C standard also allows ones' complement and sign + magnitude, where one loses one value. The minimal ranges are given in the section "Sizes of integer types <limits.h>" of the standard, e.g. −32767 for SHRT_MIN. Please read this section.
Giving C code makes no sense because you are testing just one implementation. Not all implementations behave in the same way.
Vincent Lefèvre (talk) 02:14, 11 November 2015 (UTC)
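As a portable alternative to testing one compiler by hand, the limits can be read from <limits.h>, which reports what the current implementation actually provides. The sketch below is only an illustration; the helper names short_has_extra_negative_value and print_short_range are invented for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* 1 if short has one more negative value than positive ones
   (the usual two's complement situation), 0 otherwise. */
int short_has_extra_negative_value(void)
{
    return SHRT_MIN < -SHRT_MAX;
}

/* Print the range that this particular implementation provides. */
void print_short_range(void)
{
    /* These are the minimum magnitudes the standard requires;
       an implementation may offer more, never less. */
    assert(SHRT_MIN <= -32767 && SHRT_MAX >= 32767);
    printf("short range here: [%d, %d]\n", SHRT_MIN, SHRT_MAX);
}
```

On a typical two's complement machine short_has_extra_negative_value() returns 1, but code that needs to be portable should rely only on the guaranteed interval [−32767, 32767].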

ArbCom elections are now open!

You appear to be eligible to vote in the current Arbitration Committee election. The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to enact binding solutions for disputes between editors, primarily related to serious behavioural issues that the community has been unable to resolve. This includes the ability to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail. If you wish to participate, you are welcome to review the candidates' statements and submit your choices on the voting page. For the Election committee, MediaWiki message delivery (talk) 12:58, 23 November 2015 (UTC)

C data types

Here's the way it is now:

Various rules in the C standard make unsigned char the basic type used for arrays suitable to store arbitrary non-bit-field objects: its lack of padding bits and trap representations, the definition of object representation, and the possibility of aliasing.

There are three clauses:

its lack of padding bits and trap representations
the definition of object representation
the possibility of aliasing.

Number 1 is something that is true of unsigned char arrays. They lack padding bits and trap representations. Number 3 is the opposite. There is no possibility of aliasing in unsigned char arrays. You see how that's changing sides in the middle?

The first should be changed to "the possibility of padding bits and trap representations" exclusive-or the last should be changed to "the impossibility of aliasing". - Richfife (talk) 02:07, 15 December 2015 (UTC)

You're wrong. Aliasing is possible with unsigned char arrays. Vincent Lefèvre (talk) 08:04, 15 December 2015 (UTC)
Interesting.. I thought you were OK with any word-length (or smaller) aligned data type, at least a char/byte.. Am I confusing two issues (I see now "A translator is free to ignore any or all aliasing implications of uses of restrict" in the standard. I was thinking of the restrict keyword, I guess, or at the CPU level, not the C level)? Maybe you are safe in practice, but not on some weird CPUs.. comp.arch (talk) 12:15, 3 May 2016 (UTC)
There are two completely different concepts of aliasing. The first one concerns storage in memory and data types: the notion of effective type (in C11, §6.5 Expressions). The second one concerns whether different pointer variables (in general of the same type) can point to the same object (or a part of the same object) or not: hence the keyword restrict (in C11, § Vincent Lefèvre (talk) 12:48, 3 May 2016 (UTC)
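For the first notion of aliasing (storage and effective type), the following C sketch shows the kind of access that unsigned char permits: reading and writing the bytes of any object through unsigned char lvalues. The function names copy_bytes and roundtrip_double are hypothetical, and the code only illustrates the rule; it is not taken from the article.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n bytes of an object's representation, one unsigned char at a time.
   Accessing any object through unsigned char lvalues is allowed. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}

/* Store a double into an unsigned char array and read it back. */
double roundtrip_double(double x)
{
    unsigned char buf[sizeof x];   /* holds the object representation of x */
    double y;
    copy_bytes(buf, &x, sizeof x);
    copy_bytes(&y, buf, sizeof y);
    return y;
}
```

Because unsigned char has no padding bits and no trap representations, every byte value read this way is valid, which is why the round trip returns the original value.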

Hi! You've just reverted an edit on the page C data types. I would like to discuss type ranges. Really, the standard says that, for example, short should be from -32767 to 32767. But in fact any compiler (clang-3.6, gcc and Microsoft according to MSDN) allows you to set a signed short to -32768 without any warnings ( -Wall, -Wextra, -Wpedantic, even with -std=iso9899:1990 ). So I guess we should change the range to be from −2^(N−1) to +(2^(N−1) − 1) Yanpas (talk) 20:23, 17 January 2016 (UTC)

The exact range depends on the implementation. The standard says at least [−32767,32767], and this is also what is said on the WP page. On your implementation, the range is [−32768,32767], which contains the interval [−32767,32767]. So, everything is OK. Note that with a 16-bit short and either sign-magnitude or ones' complement representation (both allowed by the C standard), the value −32768 is not possible and the range is [−32767,32767]. Such implementations existed in the past, and might still exist nowadays. There also exist implementations where a short has more than 16 bits. Vincent Lefèvre (talk) 23:47, 17 January 2016 (UTC)
"and might still exist nowadays", I hope not and think not.. Julia (programming language) also assumes the 8-bit byte (not 9- (or 6-)bit).. Less portable yes, but not really.. ARM is taking over anyway, and I can't remember any ones complement machine (there might be microcontrollers being phased out?). Ternary computers like the Setun would also screw us all over.. even the C language.. Julia handles signed and unsigned char/byte for C-types (FFI C API), but defaults to sane signed 32- or 64-bit. Hexadecimal floating-point is also not supported (I think though be C code emulation, that may have been wrapped already). My reading of IEEE-754-2008 (WP page): "and might still exist nowadays" says non-binary floating point only is ok..?! Julia has a package[s] for at least decimal64 floating-point format (emulated), binary uses machine registers, is faster. In case a decimal-only floating point would appear, I'm not sure if C would allow (as float and double), Julia might be easier to amend.. comp.arch (talk) 14:59, 26 April 2016 (UTC)
What matters is that the alternate integer representations have not been removed from the current C standard. There may be some good reason... An IEEE 754-2008 system can provide decimal only. In C, FLT_RADIX is still there. But now, decimal floating point tends to be implemented with _Decimal64, etc. (not yet in the C standard). I'm not sure about the pocket calculators, though. Vincent Lefèvre (talk) 15:32, 26 April 2016 (UTC)

Where does C standard tells about CHAR_BITS >= 8? — Preceding unsigned comment added by (talk) 09:47, 26 February 2016 (UTC)

Section "Sizes of integer types <limits.h>". It is said "CHAR_BIT 8", and the first paragraph of this section says: "Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign." So, this means CHAR_BIT ≥ 8. Vincent Lefèvre (talk) 11:24, 26 February 2016 (UTC)
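The guarantee quoted above can be checked from code rather than assumed. Here is a minimal sketch in C; the helper name short_storage_bits is made up for the example, and the result counts storage bits, which may include padding bits on unusual implementations.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits of storage occupied by a short on this implementation
   (value bits, sign bit and any padding bits). */
size_t short_storage_bits(void)
{
    /* CHAR_BIT is at least 8 on every conforming implementation. */
    assert(CHAR_BIT >= 8);
    return sizeof(short) * CHAR_BIT;
}
```

Since SHRT_MAX must be at least 32767, the returned value is at least 16, but it may be larger, as mentioned earlier in this thread.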

Numbers, sign and unums..

I'm ok with your edit. I even kind of like it (how "a bit more" is ambiguous). The sign bit is overhead that does not double.. [Mantissa is/could be larger, usually not double]. Note unums, that do actually have a sign bit in version 1, but he broke away with them in "Unums 2.0" (that is no separate + and - zero). This will make floating-point obsolete.. eventually (see his book, "End of error"). comp.arch (talk) 14:31, 26 April 2016 (UTC)

Not sure what you meant concerning the sign bit. The sign is not taken into account in the precision of the floating-point model. So, for IEEE double precision (binary64), the precision is 53 bits. For double-double arithmetic, the precision is variable, but such a format contains the floating-point numbers with 106-bit precision (that's exactly twice double precision), and even with 107-bit precision if one ignores issues near the overflow threshold. IEEE quadruple precision (binary128) is 113 bits, which is just 7 bits more than twice double precision. Thus, "a bit more" in practice. However, binary256 is excluded as its precision is much more than 106 bits.
Concerning the unums, no, they will not make floating-point obsolete. It is not even clear that they would ever be useful in practice (perhaps except as a storage format). The book "The End of Error" is just like commercials: it does not talk about the drawbacks and problems that occur in practice. FYI, the idea to have variable fields for the exponent and fraction (trailing significand) is not new; it was initially introduced in: R. Morris (1971). "Tapered Floating Point: A New Floating-Point Representation". IEEE Transactions on Computers. 20: 1578–1579. doi:10.1109/T-C.1971.223174.  And it was patented and never implemented. Vincent Lefèvre (talk) 15:21, 26 April 2016 (UTC)
Interesting to know the patented 1971 idea. Haven't looked into if it is similar enough. Anyway, I didn't think through how he intended variable-length to work well, but do not care enough any more to check, as if I recall, Unum 2.0 is not variable length.
I guess what is building (and article talks about) was based on the his former idea. When I looked at it here 2.0 wasn't out (good description). I'm not sure what is changed here
The source code, may help with prototyping (not only for new hardware), but yes, would be slower (and the other alternatives there). The pros of Unums, seem good to me, and while ordinary floating-point will not be replaced in practice (everywhere) for a long time, I do not really see the drawbacks, that means they shouldn't replace somewhere and possibly "everywhere" (with a slower fallback to floating-point available, for legacy software). comp.arch (talk) 22:13, 30 April 2016 (UTC)
You may know this, but a more recent mentions you: [1] [I was aware of MPFR, and these IS and MR, but not the new ones in the patent, or how your name relates to all this): "In a comparison between the (M+N, R) representation and the known IS and MR representations the simulation methodology included the use of the C++ MPFR library (see L. Fousse, G. Hanrot, V. Lefevre, P. Pelissier and P. Zimmerman, “MPFR”, ACM Transactions on Mathematical Software, vol. 33, pg. 13-es, June 2007)" comp.arch (talk) 23:20, 30 April 2016 (UTC)
For the 1971 idea, you can search for "tapered floating point" on Google. The corresponding patent is US3742198. The idea is that for a fixed-size format (e.g. 64 bits), you have one field whose goal is to give the size of the exponent field, the sum of the sizes of the exponent field and the fraction field being a constant. There is the same idea for unum's of fixed size. Now, in practice, FPU's work with a fixed number of digits. For instance, if thanks to the additional field, one may have 51 to 54 bits for the significand (depending on the magnitude of the exponent), then the FPU will be designed for 54 bits, and all the computations could internally be done on 54 bits, whatever the value of the exponent. What unum provides could be seen as some kind of compression (with loss). This has some advantages: results could be slightly more accurate in general, in particular if data needs to go out of the FPU. However, the format is no longer regular, which means that error analysis could be more pessimistic. Moreover, some simple algorithms such as TwoSum and Veltkamp's splitting, thanks to which one can efficiently emulate more precision (see e.g. double-double arithmetic), will no longer work.
Concerning the ubit, it is theoretically a nice idea, but takes one bit, which may be important for a 32-bit or 64-bit format, while most applications would not use it. For instance, with IEEE 754, there's an inexact flag, but almost no-one uses it. Moreover, with the ubit, inexactness information can be lost when variables are used for comparison (the result of a comparison is a boolean, which does not contain a ubit).
For guaranteed results (without error analysis done by the user), interval arithmetic can be used with both floating-point numbers and unum's. Both formats are very similar here, with the same issues due to interval arithmetic: intervals get bigger and bigger.
I didn't know about Intel's patent that mentions MPFR. But FYI, if I understand correctly, this (M+N,R) representation is not new at all. It is a particular case of midrad where the midpoint and the radius do not have the same format (in multiple precision, it doesn't make much sense to have a very accurate radius), and more restrictively, this is a particular case of midrad where the midpoint is a floating-point expansion and the radius is a floating-point number, which has been used at least since 2008 (thus before the patent has been filed, in 2012). See Rigorous High Precision Interval Arithmetic in COSY INFINITY by Alexander Wittig and Martin Berz. Vincent Lefèvre (talk) 00:03, 1 May 2016 (UTC)
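Since TwoSum is mentioned above as one of the building blocks of double-double arithmetic, here is a minimal sketch of it in C. It assumes binary64 doubles, round-to-nearest, finite inputs, no overflow, and a compiler that evaluates double expressions in double precision (no x87 extended intermediates, no fast-math reassociation). The type name dd and the function name two_sum are choices made for this example only.

```c
#include <assert.h>

/* An unevaluated sum hi + lo of two doubles. */
typedef struct { double hi, lo; } dd;

/* TwoSum: returns s = round(a + b) in hi and the rounding error in lo,
   so that hi + lo equals a + b exactly (under the stated assumptions). */
dd two_sum(double a, double b)
{
    double s  = a + b;
    double bv = s - a;                 /* the part of b that made it into s */
    double av = s - bv;                /* the part of a that made it into s */
    double e  = (a - av) + (b - bv);   /* what was lost in the rounding */
    /* s is the rounding of the exact sum s + e, so adding e back
       must not change it. */
    assert(s + e == s);
    dd r = { s, e };
    return r;
}
```

For example, two_sum(1.0, 0x1p-60) keeps 1.0 in hi and the whole of 0x1p-60 in lo. The algorithm relies on the regular spacing of a fixed-precision format, which is the property that a tapered or unum-like format would no longer provide.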
To bug you a bit more, it seems to me you are not aware of Unum 2.0 [differences], please look over[2] (the only, other than Unum 1.0, slides from him I've seen, until [3] "This presentation is still being tweaked."). Slide 29: "100x the speed of sloppy IEEE floats." [that may be before(?) or after the embarrassingly parallelism is enabled by Unums] on slide 29, "all operations take 1 clock! Even x^y", "128 kbytes total for all four basic ops. Another 128 kbytes if we also table x^y." [Note caveat in slide 34: "Create 32-bit and 64-bit unums with new approach; table look-up still practical?", can't see a lookup-table working (I can't see a double-double trick working, but any such similar idea?), but also not sure of the need for 32-bit+.] [This is if I recall in common with Unum 1.0, at least in part, without the lookup idea] Slide 33: "Uncertainty grows linearly in general" vs. "exponentially in general" for floating-point [I may not understand all the slides or why this is]. See unread here on this: [4]
My take on this, lookup seems to work [for few bits, that seems may be enough], and its time has come (back). At least in z/Arch there is a latency, I'm not sure any CPU has 1-cycle anymore. E.g. less used square-root, already benefits from lookup-tables (but also Newton-Raphson Method).
You can assume I do not know much more on Unum 2.0, I was trying to google for a bit more I've read (see [5]
"I’ve just purchased Mathematica 10.4 so that I can explore unum 2.0 more easily.
For example, I want to explore the fruitful marriage between unum 2.0 and Logarithmic Number Systems (LNS).
Also, with unum 2.0, the number special cases that one has to consider, is much lower then unum 1.0.
Because of that, I strongly believe that unum 2.0 will require less code than unum 1.0."[6]
"Incidentally, I've been challenged to a public debate, unums versus floats, with the headline "The Great Debate: The End of Error?" as a 90-minute event during ARITH23 in San Jose CA, July 10-13.
My challenger is... Professor William Kahan. This should be quite an interesting discussion!"[7] Just reading this and other posts right now]. comp.arch (talk) 15:38, 2 May 2016 (UTC)
These new slides bring nothing at all: no formalization of the theory, no proofs, no code-based examples... The "reciprocal closure" just makes things more complex (except at very low precision since everything can be based on table look-up, so that this is quite arbitrary) without solving anything. For instance, think how you compute a sum of any two numbers of the system (that's the main problem of LNS).
Slide 29 is actually: "Low-precision rigorous math is possible at 100x the speed of sloppy IEEE floats." That's low precision, 6-bit precision. So, I can believe that it is 100 times as fast as a 53-bit FPU. But it doesn't scale.
Gustafson convinced (almost) no-one at ARITH-22 (BTW, that was my tweet). I doubt that he can do better at ARITH-23. Vincent Lefèvre (talk) 21:39, 2 May 2016 (UTC)
And if anyone is interested in testing a computation system, I suggest the following sequence by Jean-Michel Muller:
 u[0] = 2
 u[1] = -4
 u[n+1] = 111 - 1130 / u[n] + 3000 / (u[n] * u[n-1])
Vincent Lefèvre (talk) 22:05, 2 May 2016 (UTC)
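For anyone who wants to try the sequence above, here is a minimal sketch in C using plain double arithmetic. The function name muller_u is made up for this example. In exact arithmetic the sequence converges to 6, but in a fixed-precision computation the rounding errors are amplified at each step and the computed values typically end up close to 100 instead, which is what makes it a useful test.

```c
#include <assert.h>

/* Compute u[n] of the sequence
     u[0] = 2, u[1] = -4,
     u[n+1] = 111 - 1130 / u[n] + 3000 / (u[n] * u[n-1])
   in double precision. */
double muller_u(int n)
{
    assert(n >= 0);
    double prev = 2.0;    /* u[0] */
    double cur  = -4.0;   /* u[1] */
    if (n == 0)
        return prev;
    for (int i = 1; i < n; i++) {
        double next = 111.0 - 1130.0 / cur + 3000.0 / (cur * prev);
        prev = cur;
        cur = next;
    }
    return cur;
}
```

Printing muller_u(n) for n from 2 to 40 shows the values first approaching 6 and then drifting away. The same experiment can be repeated with any other arithmetic (intervals, higher precision, unums) to compare behaviours.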
Yes, I quoted less from slide 29 (but did mention the caveat at slide 34, which doesn't seem too worrying now): "Low-precision rigorous math is possible at 100x the speed of sloppy IEEE floats." Note his emphasis. I'm going to quote the new slides[8] (they have good commentary) from now on. Slide 41 (that has some code for you..): "9+ orders of magnitude [..] this is 1.5 times larger than the range for IEEE half-precision floats."
"Besides the advantages of unums listed here, perhaps it deserves mention that the SORNs store decimal numbers whereas the IEEE 16-bit floats are binary and in general make rounding errors converting to and from human-readable form. Also, there are many problems that traditional intervals cannot solve because [something you might want to read] The mathematical rigor claimed for traditional interval arithmetic is actually couched in lots of “gotcha” exceptions that make it even more treacherous to use than floats. Which is why people stick with floats instead of migrating to intervals." Read all of slide/page 43 (and 44–46, where 8-bit unums win 128-bit interval) carefully "Why unums don't have the interval arithmetic problem [..] => bounds grow linearly" [unlike in interval] at least for n-body (that I understand is a difficult problem). Maybe he's "lying" to me, implying unums are good for more than it is, I'm unsure if the linear growth is the general case. Even if it isn't, the error is bounded rigorously [as I assumed with intervals, and he points out flaws with], unlike with floats. Maybe floats are good enough for most/many things, e.g. matrix multiplication, and unums only good/needed where floats are not. Still I'm pretty convinced that the extra bandwidth is the killer there, if it is really true that he can get away with fewer bits. He also has a parallel advantage, that floats disallow [to a full extent].
Your tweet was prior to Unum 2.0. He has e.g. changed his conclusion (in Unum 2.0 and if I recall Unum 1.0): "This is a shortcut to exascale." to "This is path beyond exascale." I acknowledge that the claimed pros, I think, were already in Unum 1.0 (he seems to say that), but maybe not all (I'm still wrapping my head around "SORNs", think I'm almost there). At least most of the drawbacks seem to be gone. You (and Kahan) sure know more about this than me.
I see you were also an Acorn/RISC OS user, judging by the program on your web page. I never used Perl there (I noted: "use RISCOS::Filespec;") or anywhere.. Python has dropped RISC OS (AND Amiga) support. I've wondered how difficult it would be to get Julia to work on RISC OS, not that I need to, just nostalgia reasons.. :) comp.arch (talk) 10:28, 3 May 2016 (UTC)
I do not see anything rigorous with unums, except when used for interval arithmetic. But interval arithmetic can be implemented on any point arithmetic, such as floating point, and in any case, you'll get the usual issues with interval arithmetic. For instance, with the sequence I gave above, you'll end up with the full set of real numbers (−∞,+∞). About "this is 1.5 times larger than the range for IEEE half-precision floats", no-one cares. FYI, the gain in range has a drawback: a loss of precision somewhere in the considered subset of real numbers (Level 2 in IEEE 754's formalization of point arithmetics). So, for any choice of arithmetic, some compromise has to be chosen between range and precision. Concerning decimal numbers, IEEE 754 specifies decimal floating-point arithmetic too. But even when implemented in hardware, this is slower than binary floating-point arithmetic (with equivalent resources). Except for some specific applications, binary floating-point arithmetic is often preferred because it is faster and people don't care about the rounding errors in conversions, since such conversions typically occur only at the beginning and at the end of computations. And between them, there are already a lot of more significant rounding errors. Gustafson is wrong about interval arithmetic. First, there are no exceptions for interval arithmetic. Then, for the growth of the bounds, ask him for a proof. :) For interval arithmetic, you have the FTIA, i.e. something proved, showing that it is rigorous. For unums, you have nothing. And IMHO, Unum 2.0 is worse than Unum 1.0 (it is too complex and doesn't solve the real problems). I suggest that you try the sequence I've given above.
Re RISC OS, I've stopped working with it for many years, but I'm still in touch with the French community. Vincent Lefèvre (talk) 12:38, 3 May 2016 (UTC)
"I do not see anything rigorous with unums, except when used for interval arithmetic." That may be true, that he needs what he calls SORNs (that I do not recall from Unum pre-2.0, there he had ubounds, that's been dropped). I may not understand all the details, but so far, it seems like I (and him) do understand enough, and you haven't taken a good enough look. "For instance, with the sequence I gave above, you'll end up with the full set of real numbers", that may be true for this sequence, I haven't checked, but this problem you describe is exactly what he says he's solving over traditional interval arithmetic. It's like you didn't read slide 42 and the text with it (or do not agree):

We can almost compare apples with apples by comparing traditional interval arithmetic, using 16-bit floats, with SORNs restricted to connected sets so that they can be stored in only 32 bits. They both take 32 bits and they both use techniques to rigorously contain the correct mathematical result. SORNs win on every count.

Besides the advantages of unums listed here, perhaps it deserves mention that the SORNs store decimal numbers whereas the IEEE 16-bit floats are binary and in general make rounding errors converting to and from human-readable form. Also, there are many problems that traditional intervals cannot solve because all their intervals are closed at both endpoints. Sometimes it is crucial to know whether the exact endpoint is included, or just approached. You cannot do proper set operations with traditional intervals. Like, if you ask for the set of strictly positive real numbers, you get [0, ∞] which incorrectly includes zero (not strictly positive) and infinity (not a real number). If you ask for the complement of that set, well, the best you can do is [–∞, 0]. How can it be the complement if both sets contain the same number, zero? The mathematical rigor claimed for traditional interval arithmetic is actually couched in lots of “gotcha” exceptions that make it even more treacherous to use than floats. Which is why people stick with floats instead of migrating to intervals.

"FYI, the gain in range has a drawback: a loss of precision somewhere". He called Unums, universal numbers, because they unified floating-point, interval arithmetic and integers (from memory). One thing that [double] float has, that JavaScript relies on (strangely) is that all integers up to 2^52 if I recall are exact. I'm reading into Unums 2.0, that this is dropped (and other stuff that float has that take up bit-pattern space: "Hardware designers hate gradual underflow because it adds almost thirty percent to the silicon area needed for the floating-point math. Subnormal numbers are such a hotly contested part of the IEEE standard that [..]"), as not important. It would be important for JavaScript yes.. :) but to me (and the whole world outside JS), it seems best (or at least ok) to have countables (integers) separate from measurables (floats or unums). That he has every other number exact and thus some integers and decimal fractions exact, may not be too important. The decimal floating-point spec, while *maybe* useful for engineering (I'm told, by Mike Cowlishaw, engineers are used to decimal numbers), seems at least to me overkill for banking.. that needs exact numbers down to the cent. Unums (at least 2.0, 1.0 has been made also with base-10), are probably not useful, at all, for banking, despite his mentioning "decimal".. and then not universal anymore.. I'm ok with that. You shouldn't read that into "Mathematically superior in every way, as sound as integers". Maybe that is a holdover in his slides from pre-2.0, or he's only talking about say unique zero (in 2.0). "no-one cares", then at least the range is enough. :) I'm just not sure what range is needed, I guess depends on the application. "for the growth of the bounds, ask him for a proof." I did point you to the slides (vs. interval), where he shows that. Are you saying his examples are the exception? "you have the FTIA", what is FTIA? "the real problems" What is the real problem?
It seems to me he solved it, and floats do not.. Maybe the sequence is difficult, I'm just curious, does it have any special place to being important? Vs. say the n-body problem: "I’ve listed some workloads that I think would be good tests. William Kahan has suggested that the best acid test is to simulate the orbit of Halley’s Comet, with just the sun and the gas giant planets in the n-body problem. The uncertainty, he says, becomes severe with traditional interval arithmetic, and it is easy to see how that can happen. Can unums do better? We will find out once we get an implementation that is both fast and high-precision." I guess you (and him) are saying the precision isn't high enough. Getting bigger unums is a problem (his old method did allow for variable and no lookup-tables, might have been better..) by going to bigger lookup-tables. He does say "Create 32-bit and 64-bit with new approach; table look-up still practical?" and "we do not yet know where the SORN and the table-lookup approach become intractable. This is ongoing research, but there are some promising ways to deal with that apparent explosion in storage demand". I wonder if some idea like the double-double trick (was used to good effect in the PlayStation 3 for matrix multiplication, that didn't have doubles, without losing much speed, as not done all the time) is the key here. It seems to me not exactly the same. comp.arch (talk) 15:00, 3 May 2016 (UTC)
SORNs have 32 bits, but far too low precision in practice. The problem with SORN is that it takes an exponential amount of memory compared to floating point or interval arithmetic: if you want to add 1-bit precision, each interval of SORN is split into two, so that the size is doubled (since you need 1 bit per interval). Concerning unums, on simple problems (e.g. math expressions), floating point, interval arithmetic and integers can already be unified with midrad: the midpoint is just a floating-point computation and floating-point numbers contain the integers (in some range, of course). On complex problems, they can't really be unified. The issue with subnormals is that they introduce an irregularity; with unums, and in particular with unums 2.0, the irregularity is much worse, at the point that only table look-up can be used in practice, which is OK for very low precision, but not if one needs at least 6 decimal digits for the computations. So, hardware designers will hate them even more than subnormals. Concerning JavaScript, I agree that integers should have been separated from inexact arithmetic, but that's an issue with JavaScript only. Concerning decimal arithmetic, it is useful for banking due to the rounding rules, which are specified on decimal numbers; if binary floating point is used, you get the well-known "double rounding" problem. Engineers don't need decimal arithmetic for internal computations. FTIA = Fundamental Theorem of Interval Arithmetic. This is the base for interval arithmetic, and which makes it rigorous. The sequence is a bit like chaotic systems: once you get a rounding error, the errors tend to get larger and larger. But AFAIK, this is a bit the same for the n-body problem in the long term. FYI, even double precision is not enough for some problems, for which GNU MPFR has to be used. Vincent Lefèvre (talk) 14:12, 7 May 2016 (UTC)
Slide 44 actually shows that unums/SORNs are not rigorous. Assume that numbers are represented by intervals that contain them (as in the slide). And consider the operation [2,4] − [2,4] like the first iteration of the example of the slide. The implementation (e.g. processor) doesn't know whether these intervals correspond to the same number or to different numbers (e.g. x = 2.1 and y = 3.4, both represented by [2,4], and one does x − y). The implementation only sees the intervals [2,4], not the variables. Mathematically, [2,4] − [2,4] = [−2,2], so that interval arithmetic is correct and SORN arithmetic, which gives (−1,1), is incorrect. Note: at the language level, the language implementation could transform x − x to 0 as this is mathematically true (that's out of the scope of the arithmetic itself, just related to language specification; in ISO C, this is enabled with the contraction of expressions, see FP_CONTRACT pragma). Vincent Lefèvre (talk) 14:30, 3 May 2016 (UTC)
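A minimal sketch of the interval subtraction used in this argument. The `Interval` type and `sub` function are hypothetical names introduced for illustration, and the sketch ignores the outward rounding that a real interval library would apply to the bounds.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    lo: float
    hi: float


def sub(a: Interval, b: Interval) -> Interval:
    # The arithmetic only sees the bounds, not the variables behind them,
    # so it must cover every possible pair (x in a, y in b).
    return Interval(a.lo - b.hi, a.hi - b.lo)


box = Interval(2.0, 4.0)
result = sub(box, box)             # Interval(lo=-2.0, hi=2.0)

# Two different numbers that are both represented by [2, 4]:
x, y = 2.1, 3.4
diff = x - y                       # about -1.3
inside_interval_result = result.lo <= diff <= result.hi
inside_open_unit = -1.0 < diff < 1.0

print(result, diff, inside_interval_result, inside_open_unit)
```

The difference x − y lies in [−2,2] but outside (−1,1), which is why a result of (−1,1) does not enclose all possible values.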
"Slide 44 actually shows that unums/SORNs are not rigorous." Thanks! It looks like you are right.. :-/ Now I (and maybe he) have to rethink if this is a good idea.. how broken, can it be saved, is it better than floats, just not interval arithmetic (can version 2.0 still be reunified with them, I guess so..)? Did he just make a small mistake? To be fair, he did say "x := x - x", not "x := x - y", and then the answer is 0, but as you say assuming x = y, seems not useful.. When you can't assume that, I think you acknowledge that interval arithmetic is "unstable", but then again, it's the only thing it can do, as in the next step, the new x isn't assumed to have any relation with the previous one..
"With SORNS, the interval [2, 4] gets represented as set of unums. With the 8-bit unums defined a few slides back, it would be the set {2, (2, 2.5), 2.5, (2.5, r10), r10, (r10, 4), 4". So far so good, I've been writing down minor typos, questions etc. and that the "}" to close the set is now the least of his/my worries.. r10 must be sqrt(10) there (some trivia on that in other slides [but "r10" isn't what he must have intended to display there.]) How he thinks this "stable" shrinking range is allowed (vs. intervals) is not clear to me, but it seems at least not worse than floats (without intervals) to me. Maybe he just got a little carried away with showing how much better his idea is or there's a mistake somewhere. His pre-2.0 Unums were supposed to be a superset of floats AND intervals.. comp.arch (talk) 16:27, 3 May 2016 (UTC)

Just to let you know, [the debate with Kahan is over, while I can't find it online..] and I added info on Unum 2.0 implementation (or modified called Pnum). I see there is an interview with Gustafson, I missed personally (and slides) [that are however not brand new], not sure if he has anything new to change your mind and his implementation, but Pnum might be different enough (just not looked too closely, if I recall not implementing SCORNs and other changes). comp.arch (talk) 14:58, 13 July 2016 (UTC)

I've just added a link to the video of the debate on the Unum page. Note that Jim's microphone wasn't working, but except this problem, the video is OK. Vincent Lefèvre (talk) 00:51, 20 July 2016 (UTC)

ArbCom Elections 2016: Voting now open![edit]

Hello, Vincent Lefèvre. Voting in the 2016 Arbitration Committee elections is open from Monday, 00:00, 21 November through Sunday, 23:59, 4 December to all unblocked users who have registered an account before Wednesday, 00:00, 28 October 2016 and have made at least 150 mainspace edits before Sunday, 00:00, 1 November 2016.

The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.

If you wish to participate in the 2016 election, please review the candidates' statements and submit your choices on the voting page. MediaWiki message delivery (talk) 22:08, 21 November 2016 (UTC)

WP:ANI discussion[edit]

There is currently a discussion at Wikipedia:Administrators' noticeboard/Incidents regarding an issue with which you may have been involved. Yamla (talk) 11:38, 28 May 2017 (UTC)


Hi Vincent, I see you reverted my edit on the rounding article, the current phrasing is actually wrong, consider:

>>> import math
>>> x = 2**52 + 1
>>> round(x)
>>> math.trunc(x + 0.5)

Both should return 2**52 + 1, but adding 0.5 and truncating does not.

Franciscouzo (talk) 09:40, 1 November 2017 (UTC)
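The claim above can be checked with a short sketch. The helper name `round_by_adding_half` is hypothetical, and the default round-to-nearest-even mode of IEEE 754 double precision is assumed. The second case, the largest double below 0.5, is not from the discussion; it is a well-known additional input for which the same trick fails.

```python
import math


def round_by_adding_half(x: float) -> int:
    """The 'add 0.5, then truncate' trick, for nonnegative x."""
    return math.trunc(x + 0.5)


# Case from the discussion: between 2**52 and 2**53 consecutive doubles
# are 1 apart, so x + 0.5 is a tie and is rounded to the even neighbour.
x = 2**52 + 1
exact_large = round(x)
trick_large = round_by_adding_half(x)

# Additional case (assumption, not from the discussion): the largest
# double below 0.5. Adding 0.5 rounds up to exactly 1.0.
y = 0.5 - 2**-54
exact_small = round(y)
trick_small = round_by_adding_half(y)

print(exact_large, trick_large, exact_small, trick_small)
```

In both cases the addition itself introduces a rounding error before the truncation is applied.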

Hi, In practice, the data generally don't reach such large numbers, or the users could have other issues due to early loss of precision (I mean, double rounding effect). And if you want to consider the full generality, you need to also take into account that the current rounding mode may be any of the available ones, in which case the trick to add some fixed value then truncate will not work anyway. Perhaps these limitations should be noted in the article. Note also that your solution will not round halfway cases away from 0, which may be a problem as this rule for halfway cases is a common one, and possibly required in some contexts. And another important point is that if a round() function is not available (as assumed here), then nextafter() is probably not available either. Vincent Lefèvre (talk) 15:40, 1 November 2017 (UTC)

Please chime in...[edit]

Hello Mr. Lefèvre,

This is algoHolic. I stumbled across your Floating-point arithmetic Wikipedia page three days ago while I was refreshing my memory on some of the finer points of floating-point representation in computers. Thank you so much, sir, for taking the time yesterday to make that section a heck of a lot more understandable to "the common man" than it was three days ago.

I'm not a mathematician. Nor am I an electrical engineer. I proudly represent the everyday laymen and laywomen who read Wikipedia to learn new stuff — just for fun.

Every now and then I like to dust the cobwebs off my rusty high school algebra brain cells. So when I saw that summation of pi formula in your Floating-point numbers section, I thought that trying to solve it would be good mental exercise for me. Except, in the state that section was in 3 days ago, the worked equation there was as confusing as Chinese! And the textual explanation read like Greek to me!

My original confusion led me to Math Stack Exchange to ask for clarification from those whose math skills are fresher than mine. My layperson's understanding of summations, plus what I learned from the answers on that math.stackexchange page compelled me to make the changes I made to that one pi conversion sigma notation and its worked equation.

So, in the Wikipedia spirit of the broadest-possible inclusiveness, I would like to invite you (and any other Floating-point numbers contributors) to give feedback on some of the questions asked in that math.stackexchange page. Being that you "wrote the book" on the subject, Mr. Lefèvre, I'm sure that if you offered your expert's take on the summation questions there, you could clear up a lot of cobwebs of mine and a lot of other math students' and enthusiasts' heads regarding how mathematical notation is actually used outside of academia.

Please consider chiming in with your answers or comments if you ever have any spare time. I'm looking forward to hearing more from you, sir.

Thanks again,

algoHolic (talk) 20:50, 8 November 2017 (UTC)

Thanks for your message, but note that Floating-point arithmetic#Floating-point_numbers is not my page (and most of it wasn't written by me). I just try to contribute / correct it as much as I can (not obvious due to limited time). I've done some corrections and clarification (I hope) on your latest edits about rounding. Please check. I'll try to look at that math.stackexchange page tomorrow.
Vincent Lefèvre (talk) 00:18, 9 November 2017 (UTC)
Thank you, sir, for making that rounding paragraph easier to understand. To give due credit to the original contributor of that rounding paragraph, the edits I made to it were to do with notational style or lexical consistency; not technical correctness. I humbly defer to your good self and/or the original contributor of that paragraph on matters regarding the accuracy of the technical details of the subject.
Which reminds me to ask this admittedly naive question about something in that paragraph. There is a binary point to the right of the rightmost bit in the significand being discussed in that paragraph: 11001001 00001111 11011011. Yet, in the paragraph immediately underneath that number, it says, "The significand is assumed to have a binary point to the right of the leftmost bit". I found that to be one of the most confusing things in that section when I first read it three days ago. What is the function of that binary point shown to the right of the rightmost bit in that rounding paragraph's significand, sir?
Thanks in advance for your patient reply, Mr. Lefèvre.
algoHolic (talk) 07:55, 9 November 2017 (UTC)
No, here, to agree with the formula given after this paragraph, the binary point is on the right of the leftmost bit (I corrected this yesterday): . To make it worse, there exist 3 different conventions for the binary point (the other two are: to the left of the leftmost digit as used in the C language and in GNU MPFR; and to the right of the rightmost bit so that the significand is an integer, as often used in proofs), so that the choice can be different in a different context. Both the first and the third conventions are used in the IEEE 754-2008 standard. Vincent Lefèvre (talk) 09:09, 9 November 2017 (UTC)
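The three conventions can be compared on the 24-bit significand quoted in this thread (π rounded to 24 bits). This is only a sketch: the variable names are illustrative, and the exponents 1, 2 and −22 are derived here from the value of π rather than taken from the discussion.

```python
import math
import struct

# The 24 significand bits discussed in this thread.
bits = 0b110010010000111111011011

# Convention 1: binary point to the right of the leftmost bit,
# significand in [1, 2).
value_1 = (bits / 2**23) * 2**1

# Convention 2: binary point to the left of the leftmost bit,
# significand in [0.5, 1), as returned by frexp in C and Python.
value_2 = (bits / 2**24) * 2**2

# Convention 3: binary point to the right of the rightmost bit,
# so the significand is an integer.
value_3 = bits * 2**-22

# The same number as stored in IEEE 754 single precision.
single_pi = struct.unpack(">f", bytes.fromhex("40490fdb"))[0]

print(value_1, value_2, value_3, single_pi, math.frexp(value_3))
```

All three expressions denote the same number; only the position of the binary point and the matching exponent differ.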

I'm sure that you are right, Mr. Lefèvre, sir. We are probably referring to two different things.
I just did a diff between your latest revision as of 8 November 2017 (the topmost revision on the History page) and contributor Tea2min's revision as of 3 February 2017 (chosen arbitrarily because it is the earliest/bottommost revision on the default page of 50)
The line that I'm referring to is exactly the same in both revisions...
 : <math>11001001\ 00001111\ 1101101\underline{1}.</math>
And this is how that markup is rendered by the browser...
I refreshed my browser and then took a screenshot showing what I'm seeing. Hopefully you can see the attached screenshot.
Screenshot of the Floating-point numbers section of the Floating-point arithmetic Wikipedia page
I hope this helps.
Many thanks,
algoHolic (talk) 18:03, 9 November 2017 (UTC)
Thanks for the information. I hadn't noticed this issue. I've just corrected it to:
 : <math>11001001\ 00001111\ 1101101\underline{1}</math>.
This is actually the period at the end of the sentence, not the binary point (which is not shown). Vincent Lefèvre (talk) 00:28, 10 November 2017 (UTC)
Ahh! I see. That explains a lot. Thanks a million times for clearing that up. It really had me confused! I thought the text was referring to that "period" as the binary point.
Is it common to have a period in math markup that is not typeset in-line within a sentence? Why is the block formula style with the period at the end, preferred over the in-line style?
The period struck me as especially confusing in this instance, as the subject matter discusses binary points being embedded within binary numbers. How is the reader expected to differentiate between a sentence period and the mathematical notation for a binary point? They're the exact same glyph after all. Aren't they?
Many thanks,
algoHolic (talk) 01:49, 10 November 2017 (UTC)
It is the normal rule to have punctuation marks with block style too, even though I don't like that very much (sometimes, it's awkward, e.g. after a sum ∑ ... or a big array). You can see discussions in Periods and commas in mathematical writing on MathOverflow. Note that in the past, the fractional point was written as a centered dot (well, at least some mathematicians did); but now, it is no longer standard and it could be confused with a multiplication. A solution might be to put quotes. For instance, here's π rounded to 0 fractional digits: "3.". Vincent Lefèvre (talk) 10:15, 10 November 2017 (UTC)

ArbCom 2017 election voter message[edit]

Hello, Vincent Lefèvre. Voting in the 2017 Arbitration Committee elections is now open until 23.59 on Sunday, 10 December. All users who registered an account before Saturday, 28 October 2017, made at least 150 mainspace edits before Wednesday, 1 November 2017 and are not currently blocked are eligible to vote. Users with alternate accounts may only vote once.

The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.

If you wish to participate in the 2017 election, please review the candidates and submit your choices on the voting page. MediaWiki message delivery (talk) 18:42, 3 December 2017 (UTC)