Round SQL String Data to correct decimal place, then return string data without floating point errors

Round SQL String Data to correct decimal place, then return string data without floating point errors - sql

We are trying to implement a reporting system using software that queries our SQL database. Due to a variety of circumstances, we have a need to round data within the SQL queries. Our goal is to avoid floating point errors, unwanted trailing zeros, and complexity of nested functions (if possible).
The incoming data is always type nvarchar(...) and needs to remain in a string format, which is causing problems for us. Here is an example of what I mean (tested using w3schools.com):
SELECT
STR(235.415, 10, 2) AS StringValue1,
STR('235.415', 10, 2) AS StringValue2,
STR(ROUND(235.415, 2),10,2) AS RoundValue1,
STR(ROUND('235.415', 2),10,2) AS RoundValue2,
STR(CAST('235.415' As NUMERIC(8,2)),10,2) As CastValue1
And, the result:
I know that the issue is a conversion to floating point data type when handling strings. I think the last option, i.e. casting to numeric, is the answer to my issue. However, I can't tell if this output is correct because the CAST guarantees there will not be an error, or because I got lucky for this specific instance.
Is there any type of SQL round function (or combination of functions) that takes string input, outputs string data, and doesn't involve floating point arithmetic? -- Thanks in advance!

NUMERIC/DECIMAL and MONEY don´t uses floating point arithmetic. The are in fact integers with a fixed comma.
Be aware that if you have large sums or do some calculations with these values, your rounding error can get pretty big, pretty fast. So it is advisable to take some moments to think about where you store a value with which precision and when you want to round.

Related

i cant figure out how to round up decimals in kotlin calculator

I just started coding in android studio and was creating calculator but now I'm stuck on one problem.
after struggling a lot I figured out how to make so u can use one dot but now I came across another problem which is after addition I cant seem to round up the decimals. when I do additions in decimals sometimes it gives me something like 1.9999999998 and I cant seem to round it up. for the reference I used Table Row in xml. if necessary I can show you what I have written so far. Thanks in advance.

You need String.format(".1f", value). 1,99999 -> 1.99. If you need to round to higher value, please use ceil: https://kotlinlang.org/api/latest/jvm/stdlib/kotlin.math/ceil.html

For formatting numbers, you should always be using NumberFormat or similar.
NumberFormat lets you set a RoundingMode which will do what you want.
Or you could be like me and write your own formatter for numbers because the built-in one didn't do what I wanted.

If you care about exact decimal values, then don't use floating-point. Instead, use a type that's intended for storing exact decimal values, such as BigDecimal.
(Floating-point types such as Kotlin's Float and Double can hold numbers across a huge range of magnitude, and store and calculate with them efficiently. But they use binary floating-point, not decimal. So they're great when you care about doing calculations efficiently and flexibly — but not when you need to store exact decimal values. Most of the questions about floating-point on this site seem to be for the latter cases, unfortunately…)
Kotlin has lots of extensions making it almost as easy to handle BigDecimals as the native types. They're a little less efficient, but not by anywhere near enough to be significant in a calculator project. And they do exactly what you want here: storing and manipulating decimal numbers exactly.
And because they're exact, you shouldn't need to do any rounding — and probably won't need to do any formatting either.
(Just make sure you create them directly from strings, not from floats/doubles — which will already have been rounded to the nearest binary floating-point number.)

SQL Query - hexadecimal vs. decimal values

I expected to find the answer to this question fairly quickly, but surprisingly, don't seem to see it anywhere.
I'm guessing that a comparison to a binary constant in an SQL query would be faster than a comparison to a decimal number, as the binary constant is probably a direct lookup while decimal numbers need to be converted, but is the performance difference measurable?
In other words, is the first query better than the second one? If so, how much better?
select *
from Cats
where Cats_Id = 0x0000000000000086
select *
from Cats
where Cats_Id = 134

There absolutely no difference: 0x0000000000000086 is an integer with a decimal value of 134. It's just written in base 16 (hexadecimal).
The two queries are exactly identical and will get exactly the same execution plan.
The one different will be if the column you are comparing to is binary(n) or varbinary(n). There the hex constant is representing a sequence of octets.

Your premise is based on a misunderstanding:
the binary constant is probably a direct lookup while decimal numbers need to be converted
The SQL you enter consists of characters in some text encoding; for simplicity, let's assume ASCII.
In the computer's memory, it's composed of a long string of binary states we normally write as 0 and 1, but could equally write as _ and |.
The binary data for the string 134 in ASCII looks something like __||___|__||__||__||_|__. The binary data for the string 0x0086 looks something like __||_____||||_____||______||______|||_____||_||_
When actually working with the data, e.g. comparing numbers, the computer will use a different representation altogether. The number "one hundred and thirty-four" will look something more like |____||_.
So whichever representation you use, there is a conversion going on.
Nonetheless, one conversion might be more efficient, by some incidental detail of its implementation, but by such a tiny margin that it would be almost impossible to measure amongst the noise of the system you're testing.

The answer is yes. In some cases, the query with the hexadecimal value is substantially better.
I hired a consultant DBA to help us with our system and after reviewing one of our queries, which was running a bit slow, he showed me that changing the value to hexadecimal improved it substantially (by about 95%).
He then showed me that, although I had an index on the field I was searching (binary foreign key), the index wasn't used when executing the query with the decimal value.
If anyone can provide a more detailed answer about the different cases in which these queries are identical or not, performance wise, I would appreciate that.

SQL Server Strange Ceiling() behavior

Can anyone explain the following results in SQL Server? I'm stumped.
declare #mynum float = 8.31
select ceiling( #mynum*100)
Results in 831
declare #mynum float = 8.21
select ceiling( #mynum*100)
Results in 822
I've tested a whole range of numbers (in SQL Server 2012). Some increase while others stay the same. I'm at a loss understanding why ceiling is treating some of them differently. Changing from a float to a decimal(18,5) seems to fix the problem but I'm wary there may be other repercussions I'm missing from doing so. Any explanations would help.

I think this is called float precision. You can find it in almost all programming languages and in Database too. This is because data is stored only with some precision and in fact what you set as 8.31 is probably not 8.31 but for example 8.31631312381813 and when multiply it and ceil it may cause that different value appear.
At SQL server documentation page you can read:
Approximate-number data types for use with floating point numeric data. Floating point data is approximate; therefore, not all values in the data type range can be represented exactly.
In other database systems the same problem exists. For example at mysql website you can read:
Floating-point numbers sometimes cause confusion because they are approximate and not stored as exact values. A floating-point value as written in an SQL statement may not be the same as the value represented internally. Attempts to treat floating-point values as exact in comparisons may lead to problems. They are also subject to platform or implementation dependencies. The FLOAT and DOUBLE data types are subject to these issues. For DECIMAL columns, MySQL performs operations with a precision of 65 decimal digits, which should solve most common inaccuracy problems.

Floating point are not 100% accurate. Like Marcin Nabiałek wrote the 8.31 you see is probably represented by something else, something like 8.310000000001. See here for some interesting reading about the accuracy problem of floating point.
Solution is not to use floating point data types unless you really have to. You should rather use DECIMAL or MONEY data types.
If you really have to use a floating point data type, then you can add or subtract a small value (the accuracy thresold or epsilon) before every floor, ceiling or comparison operations to get the precision you want. If you have a lot of floating point operations then it might be worth it to code your own floating point comparison functions.

precision gains where data move from one table to another in sql server

There are three tables in our sql server 2008
transact_orders
transact_shipments
transact_child_orders.
Three of them have a common column carrying_cost. Data type is same in all the three tables.It is float with NUMERIC_PRECISION 53 and NUMERIC_PRECISION_RADIX 2.
In table 1 - transact_orders this column has value 5.1 for three rows. convert(decimal(20,15), carrying_cost) returns 5.100000..... here.
Table 2 - transact_shipments three rows are fetching carrying_cost from those three rows in transact_orders.
convert(decimal(20,15), carrying_cost) returns 5.100000..... here also.
Table 3 - transact_child_orders is summing up those three carrying costs from transact_shipments. And the value shown there is 15.3 when I run a normal select.
But convert(decimal(20,15), carrying_cost) returns 15.299999999999999 in this stable. And its showing that precision gained value in ui also. Though ui is only fetching the value, not doing any conversion. In the java code the variable which is fetching the value from the db is defined as double.
The code in step 3, to sum up the three carrying_costs is simple ::
...sum(isnull(transact_shipments.carrying_costs,0)) sum_carrying_costs,...
Any idea why this change occurs in the third step ? Any help will be appreciated. Please let me know if any more information is needed.

Rather than post a bunch of comments, I'll write an answer.
Floats are not suitable for precise values where you can't accept rounding errors - For example, finance.
Floats can scale from very small numbers, to very high numbers. But they don't do that without losing a degree of accuracy. You can look the details up on line, there is a host of good work out there for you to read.
But, simplistically, it's because they're true binary numbers - some decimal numbers just can't be represented as a binary value with 100% accuracy. (Just like 1/3 can't be represented with 100% accuracy in decimal.)
I'm not sure what is causing your performance issue with the DECIMAL data type, often it's because there is some implicit conversion going on. (You've got a float somewhere, or decimals with different definitions, etc.)
But regardless of the cause; nothing is faster than integer arithmetic. So, store your values are integers? £1.10 could be stored as 110p. Or, if you know you'll get some fractions of a pence for some reason, 11000dp (deci-pennies).
You do then need to consider the biggest value you will ever reach, and whether INT or BIGINT is more appropriate.
Also, when working with integers, be careful of divisions. If you divide £10 between 3 people, where does the last 1p need to go? £3.33 for two people and £3.34 for one person? £0.01 eaten by the bank? But, invariably, it should not get lost to the digital elves.
And, obviously, when presenting the number to a user, you then need to manipulate it back to £ rather than dp; but you need to do that often anyway, to get £10k or £10M, etc.
Whatever you do, and if you don't want rounding errors due to floating point values, don't use FLOAT.
(There is ALOT written on line about how to use floats, and more importantly, how not to. It's a big topic; just don't fall into the trap of "it's so accurate, it's amazing, it can do anything" - I can't count the number of time people have screwed up data using that unfortunately common but naive assumption.)

How best to represent rational numbers in SQL Server?

I'm working with data that is natively supplied as rational numbers. I have a slick generic C# class which beautifully represents this data in C# and allows conversion to many other forms. Unfortunately, when I turn around and want to store this in SQL, I've got a couple solutions in mind but none of them are very satisfying.
Here is an example. I have the raw value 2/3 which my new Rational<int>(2, 3) easily handles in C#. The options I've thought of for storing this in the database are as follows:
Just as a decimal/floating point, i.e. value = 0.66666667 of various precisions and exactness.
Pros: this allows me to query the data, e.g. find values < 1.
Cons: it has a loss of exactness and it is ugly when I go to display this simple value back in the UI.
Store as two exact integer fields, e.g. numerator = 2, denominator = 3 of various precisions and exactness.
Pros: This allows me to precisely represent the original value and display it in its simplest form later.
Cons: I now have two fields to represent this value and querying is now complicated/less efficient as every query must perform the arithmetic, e.g. find numerator / denominator < 1.
Serialize as string data, i.e. "2/3". I would be able to know the max string length and have a varchar that could hold this.
Pros: I'm back to one field but with an exact representation.
Cons: querying is pretty much busted and pay a serialization cost.
A combination of #1 & #2.
Pros: easily/efficiently query for ranges of values, and have precise values in the UI.
Cons: three fields (!?!) to hold one piece of data, must keep multiple representations in sync which breaks D.R.Y.
A combination of #1 & #3.
Pros: easily/efficiently query for ranges of values, and have precise values in the UI.
Cons: back down to two fields to hold one piece data, must keep multiple representations in sync which breaks D.R.Y., and must pay extra serialization costs.
Does anyone have another out-of-the-box solution which is better than these? Are there other things I'm not considering? Is there a relatively easy way to do this in SQL that I'm just unaware of?

If you're using SQL Server 2005 or 2008, you have the option to define your own CLR data types:
Beginning with SQL Server 2005, you
can use user-defined types (UDTs) to
extend the scalar type system of the
server, enabling storage of CLR
objects in a SQL Server database. UDTs
can contain multiple elements and can
have behaviors, differentiating them
from the traditional alias data types
which consist of a single SQL Server
system data type.
Because UDTs are accessed by the
system as a whole, their use for
complex data types may negatively
impact performance. Complex data is
generally best modeled using
traditional rows and tables. UDTs in
SQL Server are well suited to the
following:
Date, time, currency, and extended numeric types
Geospatial applications
Encoded or encrypted data
If you can live with the limitations, I can't imagine a better way to map data you're already capturing in a custom class.

I would probably go with Option #4, but use a calculated column for the 3rd column to avoid the sync/DRY issue (and also means you actually only store 2 columns, avoiding the "three fields" issue).
In SQL server, calculated column is defined like so:
CREATE TABLE dbo.Whatever(
Numerator INT NOT NULL,
Denominator INT NOT NULL,
Value AS (Numerator / Denominator) PERSISTED
)
(note you may have to do some type conversion and verification that Denominator is not zero, etc).
Also, SQL 2005 added a PERSISTED calculated column that would get rid of the calculation at query time.

How much precision do you need?
The language, C# or otherwise, will round 2/3rds at a given position in the precision. If it's acceptable for whatever you are working on to use decimal values of say scientific notation of 10, then set the precision accordingly in the db.
If the precision is really a concern, then separate the numerator & denominator. This would ensure you always have access to whatever precision you want, and you can use a computed column to represent the value for quick filtering:
numerator INT,
denominator INT,
result AS CASE WHEN denominator > 0 THEN numerator / denominator ELSE NULL END

I have experimented a little bit with using the geometry data type in SQL Server 2008 to store and manipulate rational numbers. Basically, I assume that the numerator goes in the X slot and the denominator goes in the Y slot of a fictitious geometry point.
This was good for my needs, but it might be useless for yours. That will depend on what your priorities are (performance, code readability, etc.). I personally found that T-SQL for geometry data manipulation is hard to write and read.

how much precision are you looking at ? double/float provide decent precision(in my opinion). Am pretty sure scientific/astronomical data need a lot more precision that that. I do know that libraries like matlab and mathematica are good at these. I found that you can use mathematica with your .net program. Here is the link
Edit: adding more links and quotes
"When Mathematica operates on rational numbers, it gives an exact result no matter how many digits are required" from here
Another good read, but you would have to implement it I guess

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas