What is the reason behind precision and scale naming? [closed] - sql

I have a lot of trouble understanding why precision and scale are called what they are in database types.
I see precision and scale differently.
PRECISION IN MY POINT OF VIEW
For me, precision would be how many digits there are to the right of the decimal point. For instance, 1 is less precise than 1.0001.
SCALE IN MY POINT OF VIEW
Scale would be how far up or down a number can go.
For instance 0 - 1000 is a bigger scale than 0 - 10.
Or even 0 - 1.0 is a bigger scale than 0 - 1.
PRECISION AND SCALE IN DATABASE
However, in database terminology these words mean something different: precision is the total number of digits in a number, and scale is the number of digits to the right of the decimal point.
I keep forgetting which is which because I can't make sense of the names.
I hope you can help me understand why they are called this way.

Perhaps it's easier if you consider how the numbers look in scientific notation.
X * 10^Y
where X has a single digit before the decimal point.
Now, how big the number is (its "scale") is fundamentally determined by Y. Are we counting in ones? Millions? Thousandths? That's scale.
Regardless of the absolute scale of the number, the digits in X determine how precise we're being. Can I distinguish 1.1 ones from 1.2 ones? Can I distinguish 1.1 millions from 1.2 millions? Can I distinguish 1.1 thousandths from 1.2 thousandths? All are equivalent - two digits (including the one before the decimal point) of precision.
If I can distinguish 1.01 millions from 1.00 millions, that's more precise than only being able to distinguish 0.1 millions; that's 3 digits of precision.
But 1.01*10^-3 is not more precise than 1.01*10^10; it merely operates at a smaller scale.
Beyond that, I don't know what you want. Ok, you've told us what you'd like the words to mean; but that's not what they mean. This is what they mean.
UPDATE - One other thing I should mention. It may seem that scale and precision are conflated in some way, because if we take a physical example, surely "1 millimeter from the bullseye" is more precise than "1 meter from the bullseye", right?
But remember that precision and scale describe a variable's data type, not a specific measurement. If measuring in meters, we can't express "1 millimeter from the bullseye" with less than 4 digits of precision ("0.001 meters"); but we could describe "1 meter from the bullseye" with 1 digit of precision. So that actually does align with our desire to call "1 mm from the bullseye" somehow more precise.
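To tie this back to the database meaning, here is a minimal T-SQL sketch (the values are chosen purely for illustration). A column or variable declared as decimal(5,2) has precision 5 (five digits in total) and scale 2 (two of those digits to the right of the decimal point), so it can hold values from -999.99 to 999.99:
declare @measurement decimal(5,2) = 123.45;  -- precision 5, scale 2
select @measurement;                         -- 123.45: five digits total, two after the point
-- assigning 1234.5 would overflow (too many digits left of the point),
-- and assigning 1.239 would be rounded to 1.24 (only two digits of scale)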

Related

When to use decimals or doubles

Quick Aside: I'm going to use the word "Float" to refer to both a .Net float and a SQL float with only 7 significant digits. I will use the word "Double" to refer to a .Net double and a SQL float with 15 significant digits. I also realize that this is very similar to some other posts regarding decimals/doubles, but the answers on those posts are really inconsistent, and I really want some recommendations for my specific circumstance...
I am part of a team that is rewriting an old application. The original app used floats (7 digits). This of course caused issues since the app conducted a lot of calculations and rounding errors accumulated very quickly. At some point, many of these floats were changed to decimals. Later, the floats (7) in the database all became doubles (15). After that we had several more errors with calculations involving doubles, and they too were changed to decimals.
Today about 1/3 of all of our floating point numbers in the database are decimals, the rest are doubles. My team wants to "standardize" all of our floating-point numbers in the database (and the new .Net code) to use either exclusively decimals or doubles except in cases where the other MUST be used. The majority of the team is set on using decimals; I'm the only person on my team advocating using doubles instead of decimals. Here's why...
Most of the numbers in the database are still doubles (though much of the application code still uses floats), and it would be a lot more effort to change all of the floats/doubles to decimals
For our app, none of the fields stored are "exact" decimal quantities. None of them are monetary quantities, and most represent some sort of "natural" measurement (e.g. mass, length, volume, etc.), so a double's 16 significant digits are already way more precise than even our initial measurements.
Many tables have measurements stored in two columns: 1 for the value; 1 for the unit of measure. This can lead to a HUGE difference in scale between the values in a single column. For example, one column can store a value in terms of pCi/g or Ci/m3 (1 Ci = 1000000000000 pCi). Since all the values in a single decimal column must have the same scale (that is... an allocated number of digits both before and after the decimal point), I'm concerned that we will have overflow and rounding issues.
My teammates argue that:
Doubles are not as accurate or as precise as decimals, because they cannot exactly represent 1/10 and because they only have 16 significant digits.
Even though we are not tracking money, the app is an inventory system that keeps track of material (mostly gram quantities) and it needs to be "as accurate as possible".
Even after the floats were changed to doubles, we continued to have bad results from calculations that used doubles. Changing these columns (and the application code) to decimals caused these calculations to produce the expected results.
It is my strong belief that the original issues were caused by floats only having 7 significant digits, and that simple arithmetic (e.g. 10001 * 10001) caused the data to quickly use up the few significant digits that they had. I do not believe this had anything to do with how binary floating-point numbers can only approximate decimal values, and I believe that using doubles would have fixed this issue.
I believe that the issue with doubles arose because doubles were used alongside decimals in calculations, with values being converted back and forth between data types. Many of these calculations would round between intermediate steps in the calculation!
I'm trying to convince my team not to make everything under the sun into a decimal. Most values in the database don't have more than 5 or 6 significant digits anyway. Unfortunately, I am out-ranked by other members of my team that see things rather differently.
So, my question then is...
Am I worrying over nothing? Is there any real harm done by using almost exclusively decimals instead of doubles in an application with nearly 200 database tables, hundreds of transactions, and a rewrite schedule of 5 to 6 years?
Is using decimals actually solving an issue that doubles could not? From my research, both decimals and doubles are susceptible to rounding errors involving arbitrary fractions (adding 1/3 for example), and the only way to account for this is to consider any value within a certain tolerance as being "equal" when comparing doubles and/or decimals.
If it is more appropriate to use doubles, what arguments (other than those I have already made) could convince my team not to change everything to decimals?
Use decimal when you need perfect accuracy as a base-10 number (financial data, grades)
Use double or float when you are storing naturally imprecise data (measurements, temperature), want much faster mathematical operations, and can accept a minute amount of imprecision.
Since you seem to be storing only various measurements (which have limited precision anyway), float would be the logical choice (or double if you need more than 7 digits of precision).
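As a minimal sketch of that guideline in table form (the table and column names are hypothetical, T-SQL syntax): exact, counted quantities get decimal, while measured physical quantities get float.
create table Inventory (
    item_id       int           not null,
    unit_price    decimal(19,4) not null,  -- exact, counted base-10 quantity (money)
    measured_mass float         not null,  -- physical measurement; inherently approximate
    mass_unit     varchar(10)   not null   -- e.g. 'g', 'kg', 'pCi/g'
);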
Is using decimals actually solving an issue that doubles could not?
Not really - The data is only going to be as accurate as the measurements used to generate the data. Can you really say that a measured quantity is 123.4567 grams? Does the equipment used to measure it have that level of precision?
To deal with "rounding errors" I would argue that you can't really say whether a measurement of 1234.5 grams is exactly halfway - it could just as easily be 1234.49 grams, which would round down anyways.
What you need to decide is "what level of precision is acceptable" and always round to that precision as a last step. Don't round your data or intermediate calculations.
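For example, a hedged sketch of that advice in SQL (the column names are hypothetical): keep full float precision through the intermediate arithmetic and round once, at the very end, to whatever precision you have decided is acceptable.
-- intermediate math stays in float; ROUND is applied only to the final, reported value
select sample_id,
       round(sum(measured_mass * purity_fraction), 2) as reportable_mass_g
from   Samples
group by sample_id;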
If it is more appropriate to use doubles, what arguments could I make (other than what I have already made) could convince my team to not change everything to decimals?
Other than the time spent switching, the only thing you're really sacrificing is speed. The only way to know how much speed is to try it both ways and measure the difference.
You'd better try your best not to lose precision. Perhaps my own mistake will help you decide.
I did some wrong arithmetic, and it returned something very weird:
given 0.60, it returns 5
int get_index(double value) {
    if (value < 0 || value > 1.00)
        return -1;
    /* 0.60 / 0.10 evaluates to roughly 5.999... because neither 0.6 nor 0.1
       is exactly representable in binary; converting to int truncates it to 5 */
    return value / 0.10;
}
and I fixed it:
int get_index(double value) {
    if (value < 0 || value > 1.00)
        return -1;
    /* scaling both operands first makes each product round to a whole number
       (60000000 / 10000000 for an input of 0.60), so the division is exact here */
    return (value * 100000000) / (0.10 * 100000000);
}

Decimal(19,4) or Decimal(19,2) - which should I use?

This sounds like a silly question, but I've noticed that in a lot of table designs for e-commerce related projects I almost always see decimal(19, 4) being used for currency.
Why the 4 on scale? Why not 2?
Perhaps I'm missing a potential calculation issue down the road?
First off - you are receiving some incorrect advice from other answers. Observe the following (64-bit OS on 64-bit architecture):
declare @op1 decimal(18,2) = 0.01
       ,@op2 decimal(18,2) = 0.01;
select result = @op1 * @op2;
result
---------.---------.---------.---------
0.0001
(1 row(s) affected)
Note the number of underscores underneath the title - 39 in all. (I changed every tenth to a period to aid counting.) That is precisely enough for 38 digits (the maximum allowable, and the default on a 64 bit CPU) plus a decimal point on display. Although both operands were declared as decimal(18,2) the calculation was performed, and reported, in decimal(38,4) datatype. (I am running SQL 2012 on a 64 bit machine - some details may vary based on machine architecture and OS.)
Therefore, it is clear that no precision is being lost. On the contrary, only overflow can occur, not precision loss. This is a direct consequence of all calculations on decimal operands being performed as integer arithmetic. You will occasionally see artifacts of this in intelli-sense when the type of intermediate fields of decimal type are reported as being int instead.
Consider the example above. The two operands are both of type decimal(18,2) and are stored as being integers of value 1, with a scale of 2. When multiplied the product is still 1, but the scale is evaluated by adding the scales, to create a result of integer value 1 and scale 4, which is a value of 0.0001 and of type decimal(18,4), stored as an integer with value 1 and scale 4.
Read that last paragraph again.
Rinse and repeat once more.
In practice, on a 64-bit machine and OS, this is actually stored and carried forward as being of type decimal(38,4) because the calculations are being done on a CPU where the extra bits are free.
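If you want to verify this yourself, here is a small sketch (SQL Server; the result's precision and scale follow the documented combination rules, roughly p1 + p2 + 1 and s1 + s2 for multiplication, capped at 38):
declare @op1 decimal(18,2) = 0.01
       ,@op2 decimal(18,2) = 0.01;
select sql_variant_property(@op1 * @op2, 'BaseType')  as result_type
      ,sql_variant_property(@op1 * @op2, 'Precision') as result_precision
      ,sql_variant_property(@op1 * @op2, 'Scale')     as result_scale;
-- reports a decimal result with scale 4 (2 + 2): no fractional digits are lost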
To return to your question - All major currencies of the world (that I am aware of) only require 2 decimal places, but there are a handful where 4 are required, and there are financial transactions such as currency transactions and bond sales where 4 decimal places are mandated by law. When devising the money datatype Microsoft appears to have opted for the maximum scale that might be required rather than the normal scale required. Given how few transactions, and corporations, actually require precision greater than 19 digits this seems eminently sensible.
If you have:
A high expectation of only dealing with major currencies (which at the current time only require 2 digits of scale); and
No expectation of dealing with transactions that are mandated by law to require 4 digits of scale
then you would be safe to use type decimal with scale 2 (such as decimal(19,2) or decimal(18,2) or decimal(38,2)) instead of money. This will ease some of your conversions and, given the assumptions above, have no cost. A typical case where these assumptions are met is in a GL or Subledger accounting system tracking transactions to the penny. However, a stock- or bond-trading system would not meet these assumptions because 4 digits of scale are mandated by law in those cases.
A way to distinguish the two cases is whether transactions are reported in cents or percents, which only require 2 digits of scale, or in basis points which require 4 digits of scale.
If you are at all unsure as to which case applies to your programming circumstance, consult your Controller or Director of Finance as to the legal and GAAP requirements for your application. (S)he will be able to give you definitive advice.
In SQL, the 19 is the precision (the total number of digits) and the 4 is the scale (the number of digits after the decimal point).
If you only have 2 decimal places and you store the result of a calculation that produces more than 2 decimal places, there's no way to keep those additional digits.
Some currencies operate with more than 2 decimal places.
Use the data type decimal, not money.
Things like gas prices would use the extra "scale" positions. You've seen gas at $1.959 per gallon, right?
When you use decimal, it's up to you how many decimal places to keep, according to your business requirements.
But when you use the money data type in SQL Server, it stores 4 decimal places.
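A quick way to see this in SQL Server (the literal is chosen just for illustration):
select cast(1.23456 as money);          -- 1.2346: money always keeps exactly 4 decimal places
select cast(1.23456 as decimal(10,2));  -- 1.23: with decimal, the scale is whatever you declare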
Although the OP's question is about the scale, let's consider why 19 is a popular precision for decimal on SQL Server.
According to the documentation, this is how much storage a decimal uses:
Precision   Storage bytes
1 - 9       5
10 - 19     9
20 - 28     13
29 - 38     17
So precision 1 uses as much space as precision 9, and precision 10 uses as much as 19.
In a real-world scenario, 9 can easily be too little for money, especially if you opt for a scale of 4, leaving you between -99999.9999 and 99999.9999.
But 19 is plenty for just about any imaginable case, which is why SQL Server's money data type uses it.
One can use 28 or 38 to prevent conversion errors in case some erroneous data is hiding in the database.
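A small sketch to check those storage sizes on SQL Server (DATALENGTH returns the number of bytes used to store a value):
select datalength(cast(1 as decimal( 9,2))) as p9_bytes,   -- 5
       datalength(cast(1 as decimal(19,4))) as p19_bytes,  -- 9
       datalength(cast(1 as decimal(38,4))) as p38_bytes;  -- 17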

Storing and computing with real numbers up to an arbitrary precision in vb.net [duplicate]

Possible Duplicate:
.NET Framework Library for arbitrary digit precision
How can I store a real number, e.g. root 2 or one third, up to an arbitrary precision (the precision I need is infinite precision) in vb.net?
I would like to be able to store real numbers and perform operations on them (i.e. root 2 times root 2) without losing any accuracy - i.e. storing 1/3 would return the value 1/3 if I needed to retrieve this value.
I was thinking of using a fractal encoding but I am unsure as to the best way to do this.
Storage capacity is not an issue, I just need the real numbers to be 100% accurate.
Will that be a single real number, or does it need to be an arbitrary number of (almost) arbitrary figures? (Sorry for posting this as an answer - for some reason I can't add comments now...)

Accuracy of double Objective-C [duplicate]

Possible Duplicate:
Why can't decimal numbers be represented exactly in binary?
When I enter 0.1 as a double value the compiler is adding a tiny value on the end of it that is causing other calculations to go wrong in the program that I am running. My code simply says:
double temp = 0.1;
And I get this in variable viewer:
http://img.skitch.com/20111122-nnrcgi4dtteg8aa3e8926r3fd4.jpg
Does anyone know why this is happening?
Thanks
double is a floating binary point type. In binary, the value of "a half" is 0.1, and the value of "a quarter" is 0.01 etc. There is no way of exactly representing "a tenth" in a finite binary representation, any more than you can exactly represent "a third" in decimal. The compiler is giving you the closest value it can to the value you've actually asked for.
If you want to store decimal values precisely because you care about the decimal digits (e.g. for currency), you should use a decimal-based type such as NSDecimalNumber, or an integer scaled appropriately (e.g. storing 15 for 15 cents instead of 0.15 dollars).
I have articles on binary and decimal floating point in .NET - NSDecimalNumber in Objective-C is slightly different to decimal in C# (see the documentation), but hopefully those articles will give you a bit more insight into what's actually happening.
EDIT: As noted in comments, typically decimal floating point types are significantly slower than binary floating point types, partly because they're often larger and partly because they don't have CPU support. If you have a hard performance requirement and you want to retain digits precisely, the "integer and implied scale" option is usually a good one, though a pain to code against as you need to take it into account every time you read the code :)
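As a minimal sketch of the "integer and implied scale" idea expressed in SQL terms (the table and column names are hypothetical):
create table Price (
    item_id int    not null,
    cents   bigint not null  -- 15 means $0.15; the scale is implied, divide by 100 only for display
);
-- all arithmetic happens on exact integers; formatting as dollars is purely a presentation step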

Why do I see -0.000000000000001 in an Access query?

I have this SQL:
SELECT Sum(Field1), Sum(Field2), Sum(Field1)+Sum(Field2)
FROM Table
GROUP BY DateField
HAVING Sum(Field1)+Sum(Field2)<>0;
The problem is that sometimes the sum of Field1 and Field2 is a value like 9.5 - 10.3, and the result is -0.800000000000001. Could anybody explain why this happens and how to solve it?
Why this happens
The float and double types store numbers in base 2, not in base 10. Sometimes, a number can be exactly represented in a finite number of bits.
9.5 → 1001.1
And sometimes it can't.
10.3 → 1010.0 1001 1001 1001 1001 1001 1001 1001 1001...
In the latter case, the number will get rounded to the closest value that can be represented as a double:
1010.0100110011001100110011001100110011001100110011010 base 2
= 10.300000000000000710542735760100185871124267578125 base 10
When the subtraction is done in binary, you get:
-0.11001100110011001100110011001100110011001100110100000
= -0.800000000000000710542735760100185871124267578125
Output routines will usually hide most of the "noise" digits.
Python 3.1 rounds it to -0.8000000000000007
SQLite 3.6 rounds it to -0.800000000000001.
printf %g rounds it to -0.8.
Note that, even on systems that display the value as -0.8, it's not the same as the best double approximation of -0.8, which is:
- 0.11001100110011001100110011001100110011001100110011010
= -0.8000000000000000444089209850062616169452667236328125
So, in any programming language using double, the expression 9.5 - 10.3 == -0.8 will be false.
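You can see the same thing from SQL (a sketch using SQL Server's float, which is an IEEE double; Access's Double behaves the same way):
select case when cast(9.5 as float) - cast(10.3 as float) = cast(-0.8 as float)
            then 'equal' else 'not equal'
       end as comparison;   -- 'not equal'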
The decimal non-solution
With questions like these, the most common answer is "use decimal arithmetic". This does indeed get better output in this particular example. Using Python's decimal.Decimal class:
>>> Decimal('9.5') - Decimal('10.3')
Decimal('-0.8')
However, you'll still have to deal with
>>> Decimal(1) / 3 * 3
Decimal('0.9999999999999999999999999999')
>>> Decimal(2).sqrt() ** 2
Decimal('1.999999999999999999999999999')
These may be more familiar rounding errors than the ones binary numbers have, but that doesn't make them less important.
In fact, binary fractions are more accurate than decimal fractions with the same number of bits, because of a combination of:
The hidden bit unique to base 2, and
The suboptimal radix economy of decimal.
Binary is also much faster (on PCs) because it has dedicated hardware.
There is nothing special about base ten. It's just an arbitrary choice based on the number of fingers we have.
It would be just as accurate to say that a newborn baby weighs 0x7.5 lb (in more familiar terms, 7 lb 5 oz) as to say that it weighs 7.3 lb. (Yes, there's a 0.2 oz difference between the two, but it's within tolerance.) In general, decimal provides no advantage in representing physical measurements.
Money is different
Unlike physical quantities which are measured to a certain level of precision, money is counted and thus an exact quantity. The quirk is that it's counted in multiples of 0.01 instead of multiples of 1 like most other discrete quantities.
If your "10.3" really means $10.30, then you should use a decimal number type to represent the value exactly.
(Unless you're working with historical stock prices from the days when they were in 1/16ths of a dollar, in which case binary is adequate anyway ;-) )
Otherwise, it's just a display issue.
You got an answer correct to 15 significant digits. That's correct for all practical purposes. If you just want to hide the "noise", use the SQL ROUND function.
I'm certain it is because the float data type (aka Double or Single in MS Access) is inexact. It is not like decimal, which is a simple value scaled by a power of 10. Float values are stored as base-2 fractions (their denominators are powers of 2, not of 10), which means they don't always convert back to base 10 exactly.
The cure is to change Field1 and Field2 from float/single/double to decimal or currency. If you give examples of the smallest and largest values you need to store, including the smallest and largest fractions needed such as 0.0001 or 0.9999, we can possibly advise you better.
Be aware that versions of Access before 2007 can have problems with ORDER BY on decimal values. Please read the comments on this post for some more perspective on this. In many cases, this would not be an issue for people, but in other cases it might be.
In general, float should be used for values that can end up being extremely small or large (smaller or larger than a decimal can hold). You need to understand that float covers a much wider range of scales at the cost of some precision. That is, a decimal will overflow or underflow where a float can just keep on going. But the float only has a limited number of significant digits, whereas a decimal's digits are all significant.
If you can't change the column types, then in the meantime you can work around the problem by rounding your final calculation. Don't round until the very last possible moment.
Update
A criticism has been leveled at my recommendation to use decimal: not the point about unexpected ORDER BY results, but the fact that float is overall more accurate for the same number of bits.
No contest to this fact. However, I think it is more common for people to be working with values that are in fact counted or are expected to be expressed in base ten. I see questions over and over in forums about what's wrong with their floating-point data types, and I don't see these same questions about decimal. That means to me that people should start off with decimal, and when they're ready for the leap to how and when to use float they can study up on it and start using it when they're competent.
In the meantime, while it may be a tad frustrating to have people always recommending decimal when you know it's not as accurate, don't let yourself get divorced from the real world where having more familiar rounding errors at the expense of very slightly reduced accuracy is of value.
Let me point out to my detractors that the example
Decimal(2).sqrt() ** 2 yielding 1.999999999999999999999999999
is, in what should be familiar words, "correct to 27 significant digits" which is "correct for all practical purposes."
So if we have two ways of doing what is practically speaking the same thing, and both of them can represent numbers very precisely out to a ludicrous number of significant digits, and both require rounding but one of them has markedly more familiar rounding errors than the other, I can't accept that recommending the more familiar one is in any way bad. What is a beginner to make of a system that can compute 9.5 - 10.3 and not get -0.8 as an answer? He's going to get confused, and be stopped in his work while he tries to fathom it. Then he'll go ask for help on a message board, and get told the pat answer "use decimal". Then he'll be just fine for five more years, until he has grown enough to get curious one day and finally studies and really grasps what float is doing and becomes able to use it properly.
That said, in the final analysis I have to say that slamming me for recommending decimal seems just a little bit off in outer space.
Last, I would like to point out that the following statement is not strictly true, since it overgeneralizes:
The float and double types store numbers in base 2, not in base 10.
To be accurate, most modern systems store floating-point data types with a base of 2. But not all! Some use or have used base 10. For all I know, there are systems which use base 3 which is closer to e and thus has a more optimal radix economy than base 2 representations (as if that really mattered to 99.999% of all computer users). Additionally, saying "float and double types" could be a little misleading, since double IS float, but float isn't double. Float is short for floating-point, but Single and Double are float(ing point) subtypes which connote the total precision available. There are also the Single-Extended and Double-Extended floating point data types.
It is probably an effect of floating point number implementations. Sometimes numbers cannot be exactly represented, and sometimes the result of operations is slightly off what we may expect for the same reason.
The fix would be to use a rounding function on the values to cut off the extraneous digits. Like this (I've simply rounded to 4 digits after the decimal point, but of course you should use whatever precision is appropriate for your data):
SELECT Sum(Field1), Sum(Field2), Round(Sum(Field1)+Sum(Field2), 4)
FROM Table
GROUP BY DateField
HAVING Round(Sum(Field1)+Sum(Field2), 4)<>0;