How to set the default length of a float column?

I have created a table, and one of the columns in it is a float (named ratio).
I also have two columns of type INT (KILL and DEATH). The ratio column is updated automatically by a trigger: each time KILL or DEATH is updated, the trigger recalculates the ratio.
The size of the float column (ratio) is too big - I mean, its length is too long. How can I define the size of a float column by default?
Thanks in advance,

The ratio between two columns is something you ought to be calculating on the fly, as you query the tables. Using a trigger is overkill, IMHO. You could use a view or a computed column if you are concerned about repeating your logic (such as avoiding division by zero) over and over.
Formatting of a column (how many decimal places) is an application or report issue, not so much a database issue. If at some point you decide that you actually want more precision in one of several displays, you'll have to make database changes rather than just app changes. You also might have a problem if you ever had a ratio of 1 in 1000 or smaller: if you limit yourself to 3 decimal places, your ratio will be calculated as 0, which might cause problems in your logic.
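For example (a sketch only; player_stats is a hypothetical table name, and the columns are bracketed because KILL is a reserved word in T-SQL):
-- computed column: derived on the fly, never stored out of sync
-- (assumes the old stored ratio column has been dropped first)
alter table player_stats
    add ratio as (cast([KILL] as float) / nullif([DEATH], 0));
-- or a view, if you'd rather not touch the table:
create view player_stats_with_ratio
as
select [KILL], [DEATH],
    cast([KILL] as float) / nullif([DEATH], 0) as ratio
from player_stats;
Display formatting (say, three decimal places) can then be applied at query or report time, e.g. cast(ratio as decimal(10,3)).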

Related

Substring for Ints without Having an Integer Expression

I am trying to update values in a database that does not do floats / decimals through the DB itself. The front end "translation" of the decimals is also inconsistent.
So column A may take 1234 and turn it into 12.34, and column B may take it and turn it into 1.234.
When multiplying two columns together, I end up with a long int that needs to be cut down. However, there's no consistency for the integer expression.
Employee A:
Pay rate: $20.67. In db: 20670
Hours worked: 3.95. In db: 395
20670 * 395 = 8164650.
Needs to be 8164. LEFT(rate * hrs, 4) = 8164
Employee B:
Pay rate: $18.07. In db: 18079
Hours worked: 8. In db: 800
18079 * 800 = 14463200.
Needs to be 14464. LEFT(rate * hrs, 5) = 14463
Is there another way I can reduce the int without relying on integer expressions?
I'm going to assume this was somebody's ill-fated attempt at avoiding floating point imprecision and it's just a messed up schema; the database itself is a normal SQL database that supports floating point numbers.
Rather than trying to work with this mess (it's going to suck down a lot of time and generate a lot of totally avoidable bugs), fix it. Create a new schema (possibly a new database) that uses proper decimal columns and translate all your data into it. decimal and numeric are generally exact and do not suffer from floating point error (though the results of doing math on them might). All new code should use this, and as much existing code should be switched over as possible.
Then turn all the existing messed up tables into views with the same name. These views translate the decimal columns back into integers. This gives you backwards compatibility with queries you couldn't get fixed (or don't know about), while maintaining one sane set of tables.
For example...
create table employee_pay2 (
    EmployeeID integer references employee (EmployeeID),
    Rate decimal(10,2),
    Hours decimal(10,2)
);
create view employee_pay_view
as
select
    EmployeeID,
    cast(Rate * 1000 as int) as Rate,
    cast(Hours * 100 as int) as Hours
from employee_pay2;
Once you're sure that view works, drop the old table and rename the view to the old table name (SQL Server recommends you drop and recreate the view rather than renaming it).
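For instance, the cut-over might look like this (untested sketch; it assumes the old table was named employee_pay):
drop table employee_pay;
go
create view employee_pay
as
select
    EmployeeID,
    cast(Rate * 1000 as int) as Rate,
    cast(Hours * 100 as int) as Hours
from employee_pay2;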
Now any old queries will be using the view. You might even be able to use the query log to find stray code that's still doing it the old way.

Handling variable DECIMAL data in SQL

I have a scheduled job that pulls data from our legacy system every month. The data can sometimes swell and shrink, which causes havoc for DECIMAL precision.
I just found that this job failed because DECIMAL(5,3) was too restrictive. I changed it to DECIMAL(6,3) and life is back on track.
Is there any way to evaluate this shifting data so it doesn't break on the DECIMAL()?
Thanks,
-Allen
Is there any way to evaluate this shifting data so it doesn't break on the DECIMAL()
Find the maximum value your data can have and set the column size appropriately.
Decimal columns have two size factors: precision (the total number of digits) and scale (the number of digits after the decimal point). Set your scale to as many decimal places as you need (3 in your case), and set the precision based on the largest possible number you can have.
A DECIMAL(5,3) has three digits past the decimal point and 5 total digits, so it can store numbers up to 99.999. If your data can be 100 or larger, use a bigger precision.
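One approach is to stage the incoming data into a deliberately oversized column first and measure it before settling on a size. A sketch (legacy_import and amount are hypothetical names; the staging column is assumed to be something generous like DECIMAL(18,3)):
-- find the widest value the feed actually contains
select max(abs(amount)) as max_abs_value
from legacy_import;
-- if this stays under 1000, DECIMAL(6,3) is enough; leave headroom for growth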
If your data is scientific in nature (e.g. temperature readings) and you don't care about exact equality, only showing trends, relative value, etc.) then you might use real instead. It takes less space than a DECIMAL(5,3) (4 bytes vs 5), has 7 digits of precision (vs. 5) and a range of -3.4E38 to 3.4E38 (vs -99.999 to 99.999).
DECIMAL is more suited for financial data or other data where exact equality is important (i.e. rounding errors are bad)

precision gains where data move from one table to another in sql server

There are three tables in our SQL Server 2008 database:
transact_orders
transact_shipments
transact_child_orders.
All three have a common column, carrying_cost. The data type is the same in all three tables: float, with NUMERIC_PRECISION 53 and NUMERIC_PRECISION_RADIX 2.
In table 1 - transact_orders this column has value 5.1 for three rows. convert(decimal(20,15), carrying_cost) returns 5.100000..... here.
In table 2 - transact_shipments - three rows fetch carrying_cost from those three rows in transact_orders.
convert(decimal(20,15), carrying_cost) returns 5.100000..... here also.
Table 3 - transact_child_orders is summing up those three carrying costs from transact_shipments. And the value shown there is 15.3 when I run a normal select.
But convert(decimal(20,15), carrying_cost) returns 15.299999999999999 in this table, and that extra-precision value shows up in the UI as well, even though the UI only fetches the value and does no conversion. In the Java code, the variable that fetches the value from the DB is defined as double.
The code in step 3, to sum up the three carrying costs, is simple:
...sum(isnull(transact_shipments.carrying_costs,0)) sum_carrying_costs,...
Any idea why this change occurs in the third step? Any help will be appreciated. Please let me know if any more information is needed.
Rather than post a bunch of comments, I'll write an answer.
Floats are not suitable for precise values where you can't accept rounding errors - for example, finance.
Floats can scale from very small numbers to very high numbers, but they don't do that without losing a degree of accuracy. You can look the details up online; there is a host of good work out there for you to read.
But, simplistically, it's because they're true binary numbers - some decimal numbers just can't be represented as a binary value with 100% accuracy. (Just like 1/3 can't be represented with 100% accuracy in decimal.)
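You can see this for yourself with a quick experiment (the exact digits may vary, but the idea holds):
declare @c float = 5.1;  -- 5.1 has no exact binary representation
select convert(decimal(20,15), @c + @c + @c) as summed;
-- returns something like 15.299999999999999 rather than 15.300000000000000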
I'm not sure what is causing your performance issue with the DECIMAL data type, often it's because there is some implicit conversion going on. (You've got a float somewhere, or decimals with different definitions, etc.)
But regardless of the cause, nothing is faster than integer arithmetic. So, store your values as integers: £1.10 could be stored as 110p. Or, if you know you'll get fractions of a penny for some reason, 1100dp (deci-pennies).
You do then need to consider the biggest value you will ever reach, and whether INT or BIGINT is more appropriate.
Also, when working with integers, be careful of divisions. If you divide £10 between 3 people, where does the last 1p need to go? £3.33 for two people and £3.34 for one person? £0.01 eaten by the bank? But, invariably, it should not get lost to the digital elves.
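For example, with everything in pence (a sketch):
declare @total int = 1000, @people int = 3;  -- £10.00 split three ways
declare @share int = @total / @people;       -- 333p each
declare @leftover int = @total % @people;    -- 1p that must go somewhere
select @share + @leftover as first_share,    -- 334p
    @share as other_shares;                  -- 333p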
And, obviously, when presenting the number to a user, you then need to manipulate it back to £ rather than dp; but you need to do that often anyway, to get £10k or £10M, etc.
Whatever you do, and if you don't want rounding errors due to floating point values, don't use FLOAT.
(There is a lot written online about how to use floats and, more importantly, how not to. It's a big topic; just don't fall into the trap of thinking "it's so accurate, it's amazing, it can do anything" - I can't count the number of times people have screwed up data with that unfortunately common but naive assumption.)

What T-SQL data type would you typically use for weight and length?

I am designing a table that has several fields that will be used to record weight and lengths.
Examples would be:
5 kilograms and 50 grams would be stored as 5.050.
2 metres 25 centimetres would be stored as 2.25.
What T-SQL data type would be best suited for these?
Some calculation against these fields will be required but using a default decimal(18,0) seems overkill.
It really depends on the range of values you intend to support. You should use a decimal value that covers this range.
For example, for the weight it looks like you want three decimal places. Say you want the maximum to be 1000kg; then you need a precision of 7 digits, 3 of them behind the decimal point. This gives you decimal(7,3).
Don't forget to put the units of measure in the column name, e.g. WeightInKilos, LengthInMetres
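For example (a sketch; the table name and ranges are assumptions):
create table Parcel (
    ParcelID int primary key,
    WeightInKilos decimal(7,3),   -- up to 9999.999 kg, gram resolution
    LengthInMetres decimal(7,2)   -- up to 99999.99 m, centimetre resolution
);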
The best datatype depends on the range and the precision of the weights and lengths you'd like to store. For storing people's weight, that would be between 0.00 and 1000.00 kilograms. So you'd need at most 6 digits (precision = 6), with 2 digits behind the dot (scale = 2). That's a decimal:
weight decimal(6,2)
For normal (non-scientific) use, I'd avoid the approximate number formats float and real. They have some surprising gotchas, and it's hard for end users to reproduce the results of a calculation.
You need to take inventory of what things you are measuring. If they are measurements of UI windows then integers of pixels would be just fine. But that would not work for holding the measurement of the mass of a proton. It is the old tale of scale and precision.
The easiest solution might be to standardize them all on one unit of measure. (As harriyott said, you could add that to your column name. I'm not a huge fan of that, since it reduces flexibility if requirements or designs change later, but it is an option.)
If these measurements are wide open and general, such that you need to support very large to very small numbers, the measurement could be split into two columns: one to hold the magnitude and one to hold the unit of measure. One of the biggest downsides to this is comparing values, e.g. if you need to find the heaviest objects. It can be done with a lookup table, but it certainly adds a level of complexity.
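A sketch of that two-column split (all names here are hypothetical):
create table UnitOfMeasure (
    UnitOfMeasureID int primary key,
    Name varchar(20),
    ToBaseFactor decimal(18,9)    -- e.g. grams -> kilograms = 0.001
);
create table Measurement (
    ItemID int,
    Magnitude decimal(18,6),
    UnitOfMeasureID int references UnitOfMeasure (UnitOfMeasureID)
);
-- finding the heaviest item then needs the conversion factor, e.g.
-- order by m.Magnitude * u.ToBaseFactor desc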

Is there any reason for numeric rather than int in T-SQL?

Why would someone use numeric(12, 0) datatype for a simple integer ID column? If you have a reason why this is better than int or bigint I would like to hear it.
We are not doing any math on this column, it is simply an ID used for foreign key linking.
I am compiling a list of programming errors and performance issues about a product, and I want to be sure they didn't do this for some logical reason. If you follow this link:
http://msdn.microsoft.com/en-us/library/ms187746.aspx
... you can see that numeric(12, 0) uses 9 bytes of storage and, being limited to 12 digits, there's a total of 2 trillion numbers if you include negatives. Why would a person use this when they could use a bigint and get 10 million times as many numbers with one byte less storage? Furthermore, since this is being used as a product ID, the 4 billion numbers of a standard int would have been more than enough.
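You can check the storage sizes directly (a quick sketch):
declare @n numeric(12,0) = 123456789012;
declare @b bigint = 123456789012;
select datalength(@n) as numeric_bytes,  -- 9
    datalength(@b) as bigint_bytes;      -- 8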
So before I grab the torches and pitchforks - tell me: what are they going to say in their defense?
And no, I'm not making a huge deal out of nothing; there are hundreds of issues like this in the software, and it's all causing a huge performance problem and using too much space in the database. And we paid over a million bucks for this crap... so I take it kinda seriously.
Perhaps they're used to working with Oracle?
In Oracle, all numeric types, including ints, are normalized to a single standard representation across all platforms.
There are many reasons to use numeric - for example, financial data and other stuff that needs to be accurate to a certain number of decimal places. However, for the example you cited above, a simple int would have done.
Perhaps sloppy programmers who didn't know how to design a database?
Before you take things too seriously, what is the data storage requirement for each row or set of rows for this item?
Your observation is correct, but you probably don't want to present it too strongly if you're reducing storage from 5000 bytes to 4090 bytes, for example.
You don't want to blow your credibility by bringing this up and having them point out that any measurable savings are negligible. ("Of course, many of our lesser-experienced staff also make the same mistake.")
Can you fill in these blanks?
with the data type change, we use
____ bytes of disk space instead of ____
____ ms per query instead of ____
____ network bandwidth instead of ____
____ network latency instead of ____
That's the kind of thing which will give you credibility.
How old is this application that you are looking into?
Prior to SQL Server 2000 there was no bigint. Maybe it's just something that has made it from release to release for many years without being changed, or the database schema was copied from an application that was that old?!?
In your example I can't think of any logical reason why you wouldn't use INT. I know there are probably reasons for other uses of numeric, but not in this instance.
According to: http://doc.ddart.net/mssql/sql70/da-db_1.htm
decimal
Fixed precision and scale numeric data from -10^38 + 1 through 10^38 - 1.
numeric
A synonym for decimal.
int
Integer (whole number) data from -2^31 (-2,147,483,648) through 2^31 - 1 (2,147,483,647).
It is impossible to know whether they had a reason for using decimal, though, since we have no code to look at.
In some databases, using a decimal(10,0) creates a packed field which takes up less space. I know there are many tables around my work that use that. They probably had the same kind of thought here, but you have gone to the documentation and proven that to be incorrect. More than likely, I would say it will boil down to a case of "that's the way we have always done it, because someone one time said it was better".
It is possible they spent a LOT of time in MS Access, saw 'Number' often, and just figured: it's a number, why not use numeric?
Based on your findings, it doesn't sound like they are the optimization experts, and just didn't know. I'm wondering if they used schema generation tools and just relied on them too much.
I wonder how an index on a decimal value (even with scale 0) used as a primary key compares in efficiency to an index on a pure integer value.
Like Mark H. said, other than the indexing factor, this particular scenario likely isn't growing the database THAT much, but if you're looking for ammo, I think you did find some to belittle them with.
In your citation, decimal with precision 1-9 uses 5 bytes; precision 10-19 (which covers your 12,0) uses 9 bytes, as you noted.
Moreover, the INT datatype ranges over a power of 31:
-2^31 (-2,147,483,648) to 2^31 - 1 (2,147,483,647)
While decimal ranges much larger, over a power of 38:
-10^38 + 1 through 10^38 - 1
So the software creator was providing far more range, at the cost of one extra byte per value over bigint.
Now, with the basics out of the way: the software creator actually limited themselves to just 12 digits, e.g. 123,456,789,012 (just an example of placeholders, not a maximum number). If they used INT they could not scale this column past 10 digits (about 2.1 billion). Perhaps there is a business reason to limit this column and associated columns to 12 digits.
An INT is fixed in size, while a DECIMAL is scalable - you choose its precision.
Hope this helps.
PS:
The whole number argument is:
A) Whole numbers are 0..infinity
B) Counting (Natural) numbers are 1..infinity
C) Integers are infinity (negative) .. infinity (positive)
D) I would not cite WikiANYTHING for anything. Come on, use a real source! May as well be http://MyPersonalMathCite.com