Can anyone explain the following results in SQL Server? I'm stumped.
declare @mynum float = 8.31
select ceiling(@mynum*100)
Results in 831
declare @mynum float = 8.21
select ceiling(@mynum*100)
Results in 822
I've tested a whole range of numbers (in SQL Server 2012). Some increase while others stay the same. I'm at a loss understanding why ceiling is treating some of them differently. Changing from a float to a decimal(18,5) seems to fix the problem but I'm wary there may be other repercussions I'm missing from doing so. Any explanations would help.
I think this is called float precision. You can find it in almost all programming languages and in databases too. Data is stored only with limited precision, so what you write as 8.31 is probably not stored as exactly 8.31 but as a nearby value that is a tiny fraction above or below it. When you multiply that value and take the ceiling, the tiny error can push the result up to the next integer.
The SQL Server documentation says:
Approximate-number data types for use with floating point numeric data. Floating point data is approximate; therefore, not all values in the data type range can be represented exactly.
The same problem exists in other database systems. For example, the MySQL documentation says:
Floating-point numbers sometimes cause confusion because they are approximate and not stored as exact values. A floating-point value as written in an SQL statement may not be the same as the value represented internally. Attempts to treat floating-point values as exact in comparisons may lead to problems. They are also subject to platform or implementation dependencies. The FLOAT and DOUBLE data types are subject to these issues. For DECIMAL columns, MySQL performs operations with a precision of 65 decimal digits, which should solve most common inaccuracy problems.
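The effect from the question is not specific to SQL Server. Here is a minimal sketch in Python, assuming that Python's float is the same IEEE 754 double that SQL Server uses for float (float(53) by default). It compares the ceiling computed in binary floating point with the one computed in exact decimal arithmetic:

```python
import math
from decimal import Decimal

# Python floats are IEEE 754 doubles, like SQL Server's float(53).
inputs = ("8.31", "8.21")

for text in inputs:
    as_float = float(text) * 100       # binary floating point
    as_decimal = Decimal(text) * 100   # exact decimal arithmetic
    print(text, repr(as_float), math.ceil(as_float), math.ceil(as_decimal))

float_ceilings = [math.ceil(float(t) * 100) for t in inputs]
decimal_ceilings = [math.ceil(Decimal(t) * 100) for t in inputs]
```

Only the float version jumps to 822 for 8.21, because the product lands a hair above 821.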
Floating-point values are not 100% accurate. As Marcin Nabiałek wrote, the 8.31 you see is probably represented by something else, something like 8.310000000001. See here for some interesting reading about the accuracy problem of floating point.
The solution is not to use floating-point data types unless you really have to. Use the DECIMAL or MONEY data types instead.
If you really have to use a floating-point data type, you can add or subtract a small value (the accuracy threshold, or epsilon) before every floor, ceiling or comparison operation to get the precision you want. If you have a lot of floating-point operations, it might be worth coding your own floating-point comparison functions.
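A rough sketch of the epsilon idea, in Python for illustration. The helper name ceil_with_tolerance and the default epsilon of 1e-9 are my own assumptions, not anything built in; the right epsilon depends on the magnitude of your values.

```python
import math

def ceil_with_tolerance(value, epsilon=1e-9):
    """Ceiling that ignores an excess smaller than epsilon above an integer.

    epsilon is a hypothetical tolerance; pick it well above the expected
    floating-point noise and well below the smallest real difference.
    """
    return math.ceil(value - epsilon)

noisy = 8.21 * 100                      # slightly above 821 as a double
plain = math.ceil(noisy)                # rounds the noise up
tolerant = ceil_with_tolerance(noisy)   # treats the noise as zero
print(plain, tolerant)
```

Values that are genuinely above an integer, such as 821.5, still round up as usual.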
We are trying to implement a reporting system using software that queries our SQL database. Due to a variety of circumstances, we have a need to round data within the SQL queries. Our goal is to avoid floating point errors, unwanted trailing zeros, and complexity of nested functions (if possible).
The incoming data is always type nvarchar(...) and needs to remain in a string format, which is causing problems for us. Here is an example of what I mean (tested using w3schools.com):
SELECT
STR(235.415, 10, 2) AS StringValue1,
STR('235.415', 10, 2) AS StringValue2,
STR(ROUND(235.415, 2),10,2) AS RoundValue1,
STR(ROUND('235.415', 2),10,2) AS RoundValue2,
STR(CAST('235.415' As NUMERIC(8,2)),10,2) As CastValue1
And, the result:
I know that the issue is a conversion to floating point data type when handling strings. I think the last option, i.e. casting to numeric, is the answer to my issue. However, I can't tell if this output is correct because the CAST guarantees there will not be an error, or because I got lucky for this specific instance.
Is there any type of SQL round function (or combination of functions) that takes string input, outputs string data, and doesn't involve floating point arithmetic? -- Thanks in advance!
NUMERIC/DECIMAL and MONEY do not use floating-point arithmetic. They are in fact integers with a fixed decimal point.
Be aware that if you have large sums or do some calculations with these values, your rounding error can get pretty big, pretty fast. So it is advisable to take some moments to think about where you store a value with which precision and when you want to round.
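To illustrate the "string in, string out, no floating point" idea outside SQL, here is a small Python sketch using the standard decimal module. The function name round_decimal_string is hypothetical, and half-up rounding is an assumption about what the report needs.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_decimal_string(text, places):
    """Round a decimal string half-up to `places` digits and return a string.

    The value never passes through a binary float.
    """
    quantum = Decimal(1).scaleb(-places)   # places=2 -> Decimal('0.01')
    return str(Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP))

print(round_decimal_string("235.415", 2))
```

Note that quantize pads to the requested scale, so "235.4" becomes "235.40"; trailing zeros would have to be stripped separately if they are unwanted.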
I am trying to figure out why SQL Server is returning 9.999999999999999e+004 when it's supposed to return 1.000000000000000e+005 from the following sql statement:
select Convert(
varchar(32),
round(cast('123456' as Float), -5),
2
)
Even more interesting is that the following statement correctly returns: 1.0000000e+005
select Convert(varchar(32),
round(cast('123456' as Float), -5),
1
)
Any help would be greatly appreciated.
My best guess is that the internal computation for round() is something to the effect:
(123456 / 100000.0) * 100000.0
The fractional part produced by the division is off by the lowest order bit, as floating point arithmetic is wont to do.
(The above will not reproduce the problem because the computation is between integers and decimals. There are no floating point values.)
Note that you don't need the quotes around '123456' to cause the problem. However, because numbers with a decimal point are interpreted as decimals, rather than floats, it does not happen with convert(varchar(32), 123456.0, 2).
The difference between formats "1" and "2" is interesting. I would put this up to the vagaries of floating point arithmetic as well.
I am guessing that you can figure out pretty easy work-arounds.
And, as I allude to in a comment, this is a bit weird. Floating point representations can exactly represent 123,456 as well as 100,000. The problem must be in an intermediate value.
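A quick Python check of that last point, plus one purely hypothetical way an intermediate value could go wrong. I am not claiming this is what SQL Server's round() actually does; the scale variable is only an illustration of an inexact power of ten.

```python
from decimal import Decimal

# Decimal(float) shows the exact stored value, so these comparisons
# confirm that both numbers are represented without error.
exact_input = Decimal(123456.0) == Decimal("123456")
exact_target = Decimal(100000.0) == Decimal("100000")

# 0.00001 has no finite binary expansion, so this factor is inexact.
# Hypothetical intermediate step, NOT SQL Server's real implementation:
scale = 1e-5
rounded_units = round(123456.0 * scale)   # about 1.23456, rounds to 1
via_reciprocal = rounded_units / scale    # can land a hair off 100000
print(exact_input, exact_target, repr(via_reciprocal))
```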
Floats cannot represent every rational number, because only a finite number of bits is available for the entire number. That said, 10^5 itself is exactly representable in both 32-bit and 64-bit floats, so 9.999...e+004 has to come from an inexact intermediate value rather than from 10^5 being unrepresentable.
It's not a bug, more like an implementation limitation.
for more info: Wikipedia: Floating Point > Representable Numbers
I'm literally just doing a multiplication of two floats. How come these statements produce different results? Should I even be using floats?
500,000.00 * 0.001660 = 830
How come these statements produce different results ?
Because floating-point arithmetic is not exact, and apparently you were not printing the multiplier precisely enough (i.e. with a sufficient number of decimal digits). It wasn't 0.00166 but something that merely rounds to 0.00166.
Should I even be using floats ?
No. For money, use integers and treat them as fixed-point rational numbers. (They still aren't exact, but significantly better and less error-prone.)
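A minimal sketch of the integer approach, written in Python for brevity. It assumes amounts are held in whole cents and the rate is expressed as a ratio of two integers; the function name and the half-up rounding rule are my own choices for illustration, and the rounding shown is only meant for non-negative amounts.

```python
def interest_cents(principal_cents, rate_numerator, rate_denominator):
    """Interest in whole cents, rounded half up, using only integers."""
    product = principal_cents * rate_numerator
    # Adding half the denominator before floor division rounds half up.
    return (2 * product + rate_denominator) // (2 * rate_denominator)

# 500,000.00 at a periodic rate of 0.00166 (166 / 100000)
cents = interest_cents(50_000_000, 166, 100_000)
formatted = f"{cents // 100}.{cents % 100:02d}"
print(formatted)
```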
You didn't show how you initialized periodicInterest, and presumably you think you set it to 0.00166, but in fact the error in your output is large enough that you must not have explicitly initialized it as periodicInterest = 0.00166. It must be closer to 0.00165975, and the difference between 0.00166 and 0.00165975 is definitely large enough not to just be a single floating-point rounding error.
Assuming you are working with monetary quantities, you should use NSDecimalNumber or NSDecimal.
One non-obvious benefit of using NSDecimalNumber is that it works with NSNumberFormatter, so you can let Apple take care of formatting currencies for all sorts of foreign locales.
UPDATE
In response to the comments:
“periodicInterest is clearly not a monetary quantity” and “decimal is no more free of error when dividing by 12 than binary is” - for inexact quantities, I can think of two concerns:
One concern is using sufficient precision to give accurate results. NSDecimalNumber is a floating-point number with 38 digits of precision and an exponent in the range -128…127. This is more than twice the number of decimal digits an IEEE 'double' can store. The exponent range is less than that of a double, but that's unlikely to matter in financial computing. So NSDecimalNumbers can definitely result in smaller error than floats or doubles, even though none of them can store 1/12 exactly.
The other concern is matching the results computed by some other system, like your bank or your broker or the NYSE. In that case, you need to figure out how that other system is storing numbers and computing with them. If the other system is using a decimal format (which is likely in the financial sector), then NSDecimalNumber will probably be useful.
“Wouldn't it be more efficient to use primitive types to do floating point arithmetic, specially thousands in real time.” Arithmetic on primitive types is far faster than arithmetic on NSDecimalNumbers. I haven't measured it, but a factor of 100 would not surprise me.
You have to strike a balance between your requirements. If decimal accuracy is paramount (as it often is in financial programming), you must sacrifice performance for accuracy. If decimal accuracy is not so important, you can consider carefully using a primitive type, but you should be aware of the accuracy you're sacrificing. Even then, the size of a float is so small (usually only 7 significant decimal digits) that you should probably be using double (at least 15, usually 16 significant decimal digits).
If you need to perform millions of arithmetic operations per second with true decimal accuracy, you might be able to do it using doubles, if you are an IEEE 754 expert capable of analyzing your code to figure out where errors are introduced and how to eliminate them. Few people have this level of expertise. (I don't claim to.) You must also understand how your compiler turns your Objective-C code into machine instructions.
Anyway, perhaps you are just writing a casual app to compute a rough estimate of net present value or future value. In that case, using double would probably suffice, but using NSDecimalNumber would probably also be sufficiently fast. Without knowing more about the app you're writing, I can't give you more specific advice.
I have an sql:
SELECT Sum(Field1), Sum(Field2), Sum(Field1)+Sum(Field2)
FROM Table
GROUP BY DateField
HAVING Sum(Field1)+Sum(Field2)<>0;
The problem is that sometimes the sum of Field1 and Field2 is a value like 9.5-10.3, and the result is -0.800000000000001. Could anybody explain why this happens and how to solve it?
Problem is sometimes Sum of field1 and field2 is value like: 9.5-10.3 and the result is -0.800000000000001. Could anybody explain why this happens and how to solve it?
Why this happens
The float and double types store numbers in base 2, not in base 10. Sometimes, a number can be exactly represented in a finite number of bits.
9.5 → 1001.1
And sometimes it can't.
10.3 → 1010.0 1001 1001 1001 1001 1001 1001 1001 1001...
In the latter case, the number will get rounded to the closest value that can be represented as a double:
1010.0100110011001100110011001100110011001100110011010 base 2
= 10.300000000000000710542735760100185871124267578125 base 10
When the subtraction is done in binary, you get:
-0.11001100110011001100110011001100110011001100110100000
= -0.800000000000000710542735760100185871124267578125
Output routines will usually hide most of the "noise" digits.
Python 3.1 rounds it to -0.8000000000000007
SQLite 3.6 rounds it to -0.800000000000001.
printf %g rounds it to -0.8.
Note that, even on systems that display the value as -0.8, it's not the same as the best double approximation of -0.8, which is:
-0.11001100110011001100110011001100110011001100110011010
= -0.8000000000000000444089209850062616169452667236328125
So, in any programming language using double, the expression 9.5 - 10.3 == -0.8 will be false.
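The values above can be checked in Python, whose float is a double. This is just a sketch: Decimal(x) prints the exact value of the stored double, and math.isclose is one possible tolerance-based alternative to ==.

```python
import math
from decimal import Decimal

difference = 9.5 - 10.3
print(repr(difference))    # shortest repr that round-trips
print(Decimal(9.5))        # 9.5 is stored exactly
print(Decimal(10.3))       # 10.3 is not

is_equal = difference == -0.8               # exact comparison
is_close = math.isclose(difference, -0.8)   # comparison with a tolerance
print(is_equal, is_close)
```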
The decimal non-solution
With questions like these, the most common answer is "use decimal arithmetic". This does indeed get better output in this particular example. Using Python's decimal.Decimal class:
>>> Decimal('9.5') - Decimal('10.3')
Decimal('-0.8')
However, you'll still have to deal with
>>> Decimal(1) / 3 * 3
Decimal('0.9999999999999999999999999999')
>>> Decimal(2).sqrt() ** 2
Decimal('1.999999999999999999999999999')
These may be more familiar rounding errors than the ones binary numbers have, but that doesn't make them less important.
In fact, binary fractions are more accurate than decimal fractions with the same number of bits, because of a combination of:
The hidden bit unique to base 2, and
The suboptimal radix economy of decimal.
It's also much faster (on PCs) because it has dedicated hardware.
There is nothing special about base ten. It's just an arbitrary choice based on the number of fingers we have.
It would be just as accurate to say that a newborn baby weighs 0x7.5 lb (in more familiar terms, 7 lb 5 oz) as to say that it weighs 7.3 lb. (Yes, there's a 0.2 oz difference between the two, but it's within tolerance.) In general, decimal provides no advantage in representing physical measurements.
Money is different
Unlike physical quantities which are measured to a certain level of precision, money is counted and thus an exact quantity. The quirk is that it's counted in multiples of 0.01 instead of multiples of 1 like most other discrete quantities.
If your "10.3" really means $10.30, then you should use a decimal number type to represent the value exactly.
(Unless you're working with historical stock prices from the days when they were in 1/16ths of a dollar, in which case binary is adequate anyway ;-) )
Otherwise, it's just a display issue.
You got an answer correct to 15 significant digits. That's correct for all practical purposes. If you just want to hide the "noise", use the SQL ROUND function.
I'm certain it is because the float data type (aka Double or Single in MS Access) is inexact. It is not like decimal, which is a simple integer scaled by a power of 10. If I'm remembering correctly, float values are scaled by a power of 2 instead, which means that most base-10 fractions cannot be stored exactly and don't always convert back to base 10 exactly.
The cure is to change Field1 and Field2 from float/single/double to decimal or currency. If you give examples of the smallest and largest values you need to store, including the smallest and largest fractions needed such as 0.0001 or 0.9999, we can possibly advise you better.
Be aware that versions of Access before 2007 can have problems with ORDER BY on decimal values. Please read the comments on this post for some more perspective on this. In many cases, this would not be an issue for people, but in other cases it might be.
In general, float should be used for values that can end up being extremely small or large (smaller or larger than a decimal can hold). You need to understand that float maintains more accurate scale at the cost of some precision. That is, a decimal will overflow or underflow where a float can just keep on going. But the float only has a limited number of significant digits, whereas a decimal's digits are all significant.
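A small Python sketch of that trade-off. It assumes Python's float (a double, roughly 15 to 16 significant digits) stands in for the database float, and Python's Decimal with its default 28-digit context stands in for a decimal column.

```python
from decimal import Decimal

# Range: a double handles magnitudes far beyond what a fixed decimal holds.
tiny, huge = 1e-300, 1e300
range_ok = tiny > 0 and huge * 10 > huge

# Precision: past about 16 digits, a double silently drops low-order digits.
big = 1e16
float_lost_digit = (big + 1.0) == big            # the +1 vanishes

dec_big = Decimal(10) ** 16
decimal_kept_digit = (dec_big + 1) != dec_big    # the +1 survives
print(range_ok, float_lost_digit, decimal_kept_digit)
```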
If you can't change the column types, then in the meantime you can work around the problem by rounding your final calculation. Don't round until the very last possible moment.
Update
A criticism of my recommendation to use decimal has been leveled, not the point about unexpected ORDER BY results, but that float is overall more accurate with the same number of bits.
No contest to this fact. However, I think it is more common for people to be working with values that are in fact counted or are expected to be expressed in base ten. I see questions over and over in forums about what's wrong with their floating-point data types, and I don't see these same questions about decimal. That means to me that people should start off with decimal, and when they're ready for the leap to how and when to use float they can study up on it and start using it when they're competent.
In the meantime, while it may be a tad frustrating to have people always recommending decimal when you know it's not as accurate, don't let yourself get divorced from the real world where having more familiar rounding errors at the expense of very slightly reduced accuracy is of value.
Let me point out to my detractors that the example
Decimal(2).sqrt() ** 2 yielding 1.999999999999999999999999999
is, in what should be familiar words, "correct to 27 significant digits" which is "correct for all practical purposes."
So if we have two ways of doing what is practically speaking the same thing, and both of them can represent numbers very precisely out to a ludicrous number of significant digits, and both require rounding but one of them has markedly more familiar rounding errors than the other, I can't accept that recommending the more familiar one is in any way bad. What is a beginner to make of a system that can perform a - a and not get 0 as an answer? He's going to get confusion, and be stopped in his work while he tries to fathom it. Then he'll go ask for help on a message board, and get told the pat answer "use decimal". Then he'll be just fine for five more years, until he has grown enough to get curious one day and finally studies and really grasps what float is doing and becomes able to use it properly.
That said, in the final analysis I have to say that slamming me for recommending decimal seems just a little bit off in outer space.
Last, I would like to point out that the following statement is not strictly true, since it overgeneralizes:
The float and double types store numbers in base 2, not in base 10.
To be accurate, most modern systems store floating-point data types with a base of 2. But not all! Some use or have used base 10. For all I know, there are systems which use base 3 which is closer to e and thus has a more optimal radix economy than base 2 representations (as if that really mattered to 99.999% of all computer users). Additionally, saying "float and double types" could be a little misleading, since double IS float, but float isn't double. Float is short for floating-point, but Single and Double are float(ing point) subtypes which connote the total precision available. There are also the Single-Extended and Double-Extended floating point data types.
It is probably an effect of floating point number implementations. Sometimes numbers cannot be exactly represented, and sometimes the result of operations is slightly off what we may expect for the same reason.
The fix would be to use a rounding function on the values to cut off the extraneous digits. Like this (I've simply rounded to 4 digits after the decimal point, but of course you should use whatever precision is appropriate for your data):
SELECT Sum(Field1), Sum(Field2), Round(Sum(Field1)+Sum(Field2), 4)
FROM Table
GROUP BY DateField
HAVING Round(Sum(Field1)+Sum(Field2), 4)<>0;
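To see what the rounding in the HAVING clause changes, here is the same logic sketched in Python over a few made-up rows. The dates and values are invented for illustration; the first group sums to floating-point noise rather than to exactly zero.

```python
from collections import defaultdict

# (DateField, Field1, Field2): hypothetical sample data
rows = [
    ("2010-01-01", 0.1, 0.0),
    ("2010-01-01", 0.2, -0.3),
    ("2010-01-02", 9.5, -10.3),
]

sums = defaultdict(lambda: [0.0, 0.0])
for date, field1, field2 in rows:
    sums[date][0] += field1
    sums[date][1] += field2

# HAVING Sum(Field1)+Sum(Field2) <> 0: the noise counts as non-zero
unrounded = {d: s1 + s2 for d, (s1, s2) in sums.items() if s1 + s2 != 0}

# HAVING Round(Sum(Field1)+Sum(Field2), 4) <> 0: the noise is dropped
rounded = {d: round(s1 + s2, 4) for d, (s1, s2) in sums.items()
           if round(s1 + s2, 4) != 0}
print(unrounded)
print(rounded)
```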
Someone mentioned in this decimal vs double! - Which one should I use and when? post that it's best to use double for physical science computations. Does this apply to measurements such as flash point, viscosity, weight and volume? Can someone explain further?
Decimal = exact. That is, I have £1.23 in my pocket and applying VAT, say, is a known and accepted rounding issue.
When you measure something, you can never be exact. If something is 123 centimeters long then strictly speaking it's somewhere between 122.5 and 123.49999 centimeters long....
You are dealing with 2 different kinds of quantities.
So, use double for this.
The question you link to is about the data types in C#, not SQL, and the answers reflect that.
Decimal is a datatype created for dealing with currency, making sure calculations balance out (for example, no lost cents or fractions of a cent).
Physical scientific computations rarely deal with money values, hence, you should use double whenever you need the accuracy.
Generally you would always use doubles for recording physical measurements unless you have a good reason to do otherwise. There is potentially a minuscule loss of accuracy when using any floating-point number, since binary is unable to perfectly represent certain decimal numbers (0.1 being the most obvious example), but the inaccuracy in the double representation is going to be many orders of magnitude smaller than the error in the measurements you take.
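To put a number on "many orders of magnitude", here is a rough Python sketch. The measurement tolerance of 0.05 is a made-up figure for an instrument that reads to one decimal place; only the representation error of 0.1 is a real property of doubles.

```python
from decimal import Decimal

stored = Decimal(0.1)   # exact value of the double nearest to 0.1
representation_error = abs(stored - Decimal("0.1"))

# Hypothetical instrument: a reading of 0.1 is only good to +/- 0.05.
measurement_error = Decimal("0.05")

ratio = measurement_error / representation_error
print(representation_error, ratio)
```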
Decimal is used where it's very important that numbers are represented exactly so typically only when dealing with money (yes, it would seem we care more about money than science!)
Knowing which database this is for would make things easier...
Oracle only has NUMBER, which if you omit the two optional parameters precision and scale - is a float. Using both parameters is DECIMAL, and only specifying the precision is INTEGER. Can't remember how to get REAL...
MySQL Numeric data type info: http://dev.mysql.com/doc/refman/5.0/en/numeric-types.html
SQL Server Numeric data type info: http://msdn.microsoft.com/en-us/library/aa258271(SQL.80).aspx
I haven't dealt with float and real much, but I've heard they aren't great for mathematical computations. I've used DECIMAL for varying precision, not just for monetary values.
What data type to use depends on the data, and how you intend to use that data.