Convert number to decimal in Pentaho

In Pentaho, I am facing a conversion problem when inserting data that comes from a Select values step. I need to be able to pick up whatever is in that field "as is" and not change it at all.
Example:
Field: 0.13
Inserted field: 0
0.13 is converted to 0, but it should stay 0.13, whereas 110 is converted to 110 correctly. The issue is with decimal values: all decimal values are converted to 0.
Thanks

There are a few things you can check in the Select values step.
Decimal field: this sets which character is the decimal sign for the number. It defaults to your system's decimal sign, so on an English Windows/Unix system the default is the dot, whereas in other regions it might be the comma.
Always check which one, dot or comma, you are receiving for the number before converting.
One quick note as well: steps AFTER a Group By will receive ANY number with the mask #.#, which shows only one digit after the decimal sign. The data is not lost, it is simply shown with a different mask, so be sure to set the mask in the Select values step as well.
The Select Values should look like this for a Number like 0.13 to show as such
EDIT:
Note that in Precision and Format I have used the same number of zeros after the decimal sign. As a mask, this accounts for a maximum of 5 decimal places. Values with more than 5 decimal places are still loaded in full; the extra digits are just not shown.

In "foreign" countries (with e.g. SQLite) these two steps might help:
1st: in the select statement of "Table input", replace the field (e.g. TRANSAMOUNT) with field*1.0 field (e.g. TRANSAMOUNT*1.0 TRANSAMOUNT), so every value is cast implicitly, and
2nd: in the Meta-data tab (as mentioned in the other answer) change the Type to Number and choose the appropriate "Date Locale" (e.g. de_DE), which also affects numbers.

Related

Truncating in Presto when zeroes appears after the decimal point

I am trying to truncate a decimal that has 2 digits after the decimal point in Presto. When the fractional digits are all zeroes, the number should display without the fractional part; when any digit from 1 to 9 appears after the decimal point, the full decimal should display. I have used the following query, but it does not do the job and I still end up with numbers that have zeroes after the decimal point.
select column1,case when right(cast(column1 as varchar),7)='.000000' then truncate(column1) else column1 end from table1;
Casting to varchar pads extra zeroes on the right, which is why I used the extra zeroes after the decimal point in the expression above.
Please let me know what has to be done to truncate the decimal only when its fractional digits are all zeroes.
The thing is that truncate(x) → double returns x rounded to an integer by dropping the digits after the decimal point, but the result is a double, not an integer. Displaying a double without non-significant zeroes is the GUI's job: it either displays all of them or hides them. For example, Presto on Qubole does not display .000000 when only zeroes follow the dot. So the problem is probably in the tool you are using.
For example, this works fine in Presto on Qubole:
with mydata as (
select 123.00000 as figure union all
select 123.0123 )
select case when regexp_like(cast(figure as varchar),'\d+\.0+$') then truncate(figure) else figure end
from mydata
Result:
123.0123
123
But in your GUI it may not work the same way, because the value in the second row is not an integer; it is decimal(8,5). Wrap the expression in the typeof() function and you will see. The GUI decides how to display decimal(8,5).
You said:
Using varchar pads extra zeroes to the right and hence are the extra
zeroes I have used in the above expression after the decimal point
No, the result of your expression is not varchar. The varchar is being implicitly converted to decimal or double; check using typeof().
If you want it to work regardless of the tool you are using, convert to varchar and transform explicitly:
select case when regexp_like(cast(figure as varchar),'\d\.0+$') --all zeroes, change according to your requirements
then regexp_replace(cast(figure as varchar),'\.0+$','') --remove fractional part
else cast(figure as varchar) --we need same type in case
end as result
from mydata
This is guaranteed to work because the result is a varchar and is displayed as is.
The whole expression can be simplified:
--remove .0+ if no 1-9 after dot:
select regexp_replace(cast(figure as varchar),'\.0+$','')
from mydata
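The string-level rule used in that last query can be tried outside Presto. The following is a minimal Python sketch of the same regular expression, assuming the value has already been cast to text; the function name strip_zero_fraction is made up for this illustration and is not a Presto function.

```python
import re

def strip_zero_fraction(text: str) -> str:
    """Remove a fractional part made up only of zeros, e.g. '123.000' -> '123'."""
    # Same pattern as the regexp_replace call: a dot followed only by zeros up to the end.
    return re.sub(r"\.0+$", "", text)

for value in ["123.00000", "123.0123", "123.0100", "123"]:
    print(value, "->", strip_zero_fraction(value))
```

Note that a value such as 123.0100 is left untouched, because a non-zero digit follows the dot; only an all-zero fraction is removed.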

SQL Decimal formatting not working properly in all cases

SQL Server decimal function not working as intended.
To test with sample data, I created a table and inserted values to it.
Then, I tried to run decimal function on these values.
CREATE TABLE TEST_VAL
(
VAL float
)
SELECT * FROM TEST_VAL
Output:
VAL
----------
16704.405
20382.135
2683.135
SELECT CAST(VAL AS DECIMAL(15, 2)) AS NEWVAL
FROM TEST_VAL;
Output:
NEWVAL
-------------
16704.40
20382.13
2683.14
I expected the same rounding for all 3 values, but the third value is rounded up.
This is due to the nature of floating point numbers being inexact and being in binary. But I want to demonstrate how this is working.
The issue is that a decimal such as 0.135 cannot be represented exactly. As a floating-point value, it would typically be stored as something like:
0.134999999234243423
(Note that these numbers as with all representations of values in this answer are made up. They are intended to be representative to make the point.)
The number of 9s is actually larger. And the subsequent digits are just representative. In this representation, we wouldn't see a problem with truncating the value. After all 0.1349999 should round to the same value as 0.13499.
In binary, this looks different:
0.11101000010101 10011 10011 10011 10011 . . .
---------------- --------------
~0.135 "arbitrary" repeating pattern
(Note: The values are made up!)
That is, the "infinite" portion of the binary representation is not a bunch of repeating 1s or repeating 0s; it has a pattern. This is analogous to the inverses of most numbers in base 10. For instance, 1/7 has a repeating component of six digits, 142857. We tend to forget this, because common inverses are either exact (1/5 = 0.2) or have a single repeating digit (1/6 = 0.166666...). 1/7 is the first case that is not so simple -- and almost all decimals are like this. For rational numbers, there is always a repeating sequence regardless of base, and it is never longer than the denominator (the number at the bottom) minus 1.
We can think of this as all representations (regardless of base) always having some number of digits that repeat. For an exact representation, the repeating portion is 0. For others it is rarely one digit. Usually, it is multiple digits. And it is a fun exercise in mathematics to characterize this. But all that is important here is that the repeating portion has both 1s and 0s.
Now, what is happening. A floating point number has three parts:
a magnitude. This is a number of bits that represent the exponent.
an integer portion, which is the number before the decimal point.
an integer portion, which is the number after the decimal point.
(Actually, the last two are really one integer, but I find it much easier to explain this by splitting them into two components.)
Only a fixed number of bits are available for the two integer portions. What does this look like? Once again the representative patterns are something like this:
0.135 0 11101000010101100111001110
1.135 1 11101000010101100111001110
2.135 10 1110100001010110011100111
4.135 100 111010000101011001110011
8.135 1000 11101000010101100111001
16.135 10000 1110100001010110011100
-----------^ part before the decimal
------------------^ part after the decimal
Note: This is leaving off the magnitude portion of the decimal representation.
As you can see, digits get chopped off from the end. But sometimes it is 0 that gets chopped off -- so there is no change in the value being represented. And sometimes it is a 1. And there is a change.
With this, you might be able to see how the values essentially fluctuate, say:
0.135 --> 0.135000000004
1.135 --> 0.135000000004
2.135 --> 0.135000000004
4.135 --> 0.135000000001
8.135 --> 0.135999999997
16.135 --> 0.135999999994
These are then rounded differently, which is what you are seeing.
I put together this little db<>fiddle, so you can see how the rounding changes around powers of two.
Perhaps this could be explained if we extend the precision of the three numbers in the first query:
16704.4050
20382.1349
2683.1351
Rounding each of the above to only two decimal places, which is what a cast to DECIMAL(10,2) would do, would yield:
16704.40
20382.13
2683.14
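The extended digits can be inspected directly. Below is a minimal Python sketch, assuming the float column holds ordinary IEEE 754 doubles (which is what Python's float is); the helper names stored_value and round_half_up are made up for this illustration, and the rounding mode is an assumption about how the cast behaves.

```python
from decimal import Decimal, ROUND_HALF_UP

def stored_value(x: float) -> Decimal:
    """Exact decimal expansion of the binary double that x is stored as."""
    return Decimal(x)

def round_half_up(x: float, places: int = 2) -> Decimal:
    """Round the exact stored value half-up to the given number of decimal places."""
    quantum = Decimal(1).scaleb(-places)  # Decimal('0.01') when places is 2
    return stored_value(x).quantize(quantum, rounding=ROUND_HALF_UP)

for v in (16704.405, 20382.135, 2683.135):
    # Print the literal, the value actually stored, and the two-place rounding.
    print(v, stored_value(v), round_half_up(v))
```

The first two stored values fall slightly below the written literal and the third slightly above it, which is why only the third rounds up.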
Would this be of use:
select CONVERT(DECIMAL(15,2), ROUND(VAL, 2, 1)) AS NEWVAL
from TEST_VAL;
Here is the demo for SQL Server 2012: DEMO
First question: why are they not the same value?
Because their types are different. CAST(VAL AS DECIMAL(4,2)) formats the value like ##.##, not ##.###, so in your case you get a rounded-up value.
Why not use the same type?
CREATE TABLE T
(
[VAL] DECIMAL(8,3)
);
INSERT INTO T ([VAL])
VALUES (16704.405), (20382.135), (2683.135);
SELECT * FROM T
Output:
VAL
-----------
16704.405
20382.135
2683.135
db<>fiddle here
or you can cast AS DECIMAL(8, 3)
SELECT CAST(VAL AS DECIMAL(8,3)) AS NEWVAL
FROM T;

How Can I Get An Exact Character Representation of a Float in SQL Server?

We are doing some validation of data which has been migrated from one SQL Server to another SQL Server. One of the things that we are validating is that some numeric data has been transferred properly. The numeric data is stored as a float datatype in the new system.
We are aware that there are a number of issues with float datatypes, that exact numeric accuracy is not guaranteed, and that one cannot use exact equality comparisons with float data. We don't have control over the database schemas nor data typing and those are separate issues.
What we are trying to do in this specific case is verify that some ratio values were transferred properly. One of the specific data validation rules is that all ratios should be transferred with no more than 4 digits to the right of the decimal point.
So, for example, valid ratios would look like:
.7542
1.5423
Invalid ratios would be:
.12399794301
12.1209377
What we would like to do is count the number of digits to the right of the decimal point and find all cases where the float values have more than four digits to the right of it. We've been using the SUBSTRING, LEN, STR, and a couple of other functions to achieve this, and I am sure it would work if we had numeric fields typed as decimal which we were casting to char.
However, what we have found when attempting to convert a float to a char value is that SQL Server seems to always convert to decimal in between. For example, the field in question shows this value when queried in SQL Server Enterprise Manager:
1.4667
Attempting to convert to a string using the recommended function for SQL Server:
LTRIM(RTRIM(STR(field_name, 22, 17)))
Returns this value:
1.4666999999999999
The value which I would expect if SQL Server were directly converting from float to char (which we could then trim trailing zeroes from):
1.4667000000000000
Is there any way in SQL Server to convert directly from a float to a char without going through what appears to be an intermediate conversion to decimal along the way? We also tried the CAST and CONVERT functions and received similar results to the STR function.
SQL Server Version involved: SQL Server 2012 SP2
Thank you.
Your validation rule seems to be misguided.
An SQL Server FLOAT, or FLOAT(53), is stored internally as a 64-bit floating-point number according to the IEEE 754 standard, with 53 bits of mantissa ("value") plus an exponent. Those 53 binary digits correspond to approximately 15 decimal digits.
Floating-point numbers have limited precision, which does not mean that they are "fuzzy" or inexact in themselves, but that not all numbers can be exactly represented, and instead have to be represented using another number.
For example, there is no exact representation for your 1.4667, and it will instead be stored as a binary floating-point number that (exactly) corresponds to the decimal number 1.466699999999999892708046900224871933460235595703125. Correctly rounded to 16 decimal places, that is 1.4666999999999999, which is precisely what you got.
Since the "exact character representation of the float value that is in SQL Server" is 1.466699999999999892708046900224871933460235595703125, the validation rule of "no more than 4 digits to the right of the decimal point" is clearly flawed, at least if you apply it to the "exact character representation".
What you might be able to do, however, is to round the stored number to fewer decimal places, so that the small error at the end of the decimals is hidden. Converting to a character representation rounded to 15 instead of 16 places (remember those "15 decimal digits" mentioned at the beginning?) will give you 1.466700000000000, and then you can check that all decimals after the first four are zeroes.
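The effect of rounding to 15 rather than 16 places can be reproduced outside SQL Server. This is a minimal Python sketch, assuming the column holds IEEE 754 doubles; fits_in_decimals is a hypothetical helper written for this illustration, not a SQL Server function.

```python
from decimal import Decimal

def fits_in_decimals(x: float, places: int = 4) -> bool:
    """True if x is the double nearest to a decimal with at most `places` fractional digits."""
    # Format to the allowed number of places, parse back, and compare with the original.
    return float(format(x, f".{places}f")) == x

ratio = 1.4667
print(Decimal(ratio))         # exact value of the stored double
print(format(ratio, ".16f"))  # 16 places expose the representation error
print(format(ratio, ".15f"))  # 15 places hide it
print(fits_in_decimals(ratio), fits_in_decimals(12.1209377))
```

The round-trip check avoids inspecting digits at all: a value passes if the 4-place decimal it formats to parses back to the very same double.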
You can try using cast to varchar.
select case when
len(
substring(cast(col as varchar(100))
,charindex('.',cast(col as varchar(100)))+1
,len(cast(col as varchar(100)))
)
) = 4
then 'true' else 'false' end
from tablename
where charindex('.',cast(col as varchar(100))) > 0
For this particular number, don't use STR(), and use a convert or cast to varchar. But, in general, you will always have precision issues when storing in float... it's the nature of the storage of that datatype. The best you can do is normalize to a NUMERIC type and compare with threshold ranges (+/- .0001, for example). See the following for a breakdown of how the different conversions work:
declare @float float = 1.4667
select @float,
convert(numeric(18,4), @float),
convert(nvarchar(20), @float),
convert(nvarchar(20), convert(numeric(18,4), @float)),
str(@float, 22, 17),
str(convert(numeric(18,4), @float)),
convert(nvarchar(20), convert(numeric(18,4), @float))
Instead of casting to a VarChar you might try this: cast to a decimal with 4 fractional digits and check if it's the same value as before.
case when field_name <> convert(numeric(38,4), field_name)
then 1
else 0
end
The issue you have here is that float is an approximate-number data type. In SQL Server, float (that is, float(53)) carries about 15 significant decimal digits; it is real (float(24)) that carries about seven. Either way, the stored value only approaches the decimal you entered. That's why you don't use float for values that require exact precision.
Check this example:
DECLARE @t TABLE (
col FLOAT
)
INSERT INTO @t (col)
VALUES (1.4666999999999999)
,(1.4667)
,(1.12399794301)
,(12.1209377);
SELECT col
, CONVERT(NVARCHAR(MAX),col) AS chr
, CAST(col as VARBINARY) AS bin
, LTRIM(RTRIM(STR(col, 22, 17))) AS rec
FROM @t
As you can see, the floats 1.4666999999999999 and 1.4667 are stored as the same binary value. For your stated needs I think this query would fit:
SELECT col
, RIGHT(CONVERT(NVARCHAR(MAX),col), LEN(CONVERT(NVARCHAR(MAX),col)) - CHARINDEX('.',CONVERT(NVARCHAR(MAX),col))) AS prec
FROM @t

Value of real type incorrectly compares

I have field of REAL type in db. I use PostgreSQL. And the query
SELECT * FROM my_table WHERE my_field = 0.15
does not return rows in which the value of my_field is 0.15.
But for instance the query
SELECT * FROM my_table WHERE my_field > 0.15
works properly.
How can I solve this problem and get the rows with my_field = 0.15 ?
To solve your problem use the data type numeric instead, which is not a floating point type, but an arbitrary precision type.
If you enter the numeric literal 0.15 into a numeric (same word, different meaning) column, the exact amount is stored - unlike with a real or float8 column, where the value is coerced to next possible binary approximation. This may or may not be exact, depending on the number and implementation details. The decimal number 0.15 happens to fall between possible binary representations and is stored with a tiny error.
Note that the result of a calculation can itself be inexact, so still be wary of the = operator in such cases.
It also depends how you test. When comparing, Postgres coerces diverging numeric types to a type that can best hold the result.
Consider this demo:
CREATE TABLE t(num_r real, num_n numeric);
INSERT INTO t VALUES (0.15, 0.15);
SELECT num_r, num_n
, num_r = num_n AS test1 --> FALSE
, num_r = num_n::real AS test2 --> TRUE
, num_r - num_n AS result_nonzero --> float8
, num_r - num_n::real AS result_zero --> real
FROM t;
db<>fiddle here
Old sqlfiddle
Therefore, if you have entered 0.15 as numeric literal into your column of data type real, you can find all such rows with:
SELECT * FROM my_table WHERE my_field = real '0.15'
Use numeric columns if you need to store fractional digits exactly.
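The comparison behaviour described above can be reproduced without a database. The following is a minimal Python sketch, assuming real is an IEEE 754 single-precision value and that the other side of the comparison is held at higher precision; as_real is a made-up helper, not a PostgreSQL function.

```python
import struct

def as_real(x: float) -> float:
    """Round a double to the nearest IEEE 754 single-precision value (a 4-byte real)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

stored = as_real(0.15)          # what a single-precision column would hold
print(stored)                   # the stored value, widened back to a double
print(stored == 0.15)           # compared against a higher-precision 0.15
print(stored == as_real(0.15))  # both sides coerced to single precision first
print(round(stored, 2) == 0.15) # rounding before comparing, as in the round() approach
```

Coercing both sides to the same type, or rounding before comparing, makes the equality succeed, which mirrors the real '0.15' literal and the round() suggestion.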
Your problem originates from IEEE 754.
0.15 is not 0.15, but 0.15000000596046448 (assuming single precision, which is what real uses), as it cannot be exactly represented as a binary floating-point number.
(check this calculator)
Why is this a problem? In this case, most likely because the other side of the comparison uses the exact value 0.15 - through an exact representation, like a numeric type. (Cleared up on suggestion by Eric)
So there are two ways:
use a format that actually stores the numbers in decimal format - as Erwin suggested
(or at least use the same type across the board)
use rounding as Jack suggested - which has to be used carefully (by the way this uses a numeric type too, to exactly represent 0.15...)
Recommended reading:
What Every Computer Scientist Should Know About Floating-Point Arithmetic
(Sorry for the terse answer...)
Well, I can't see your data, but I'm guessing that my_field doesn't exactly equal 0.15. Try:
select * from my_table where round(my_field::numeric,2) = 0.15;
Considering both PPTerka's and Jack's answer.
Approximate numeric data types do not store the exact values specified for many numbers;
Look here for Microsoft's description of real values.
http://technet.microsoft.com/en-us/library/ms187912(v=sql.105).aspx

How do I count decimal places in SQL?

I have a column X which is full of floats with decimals places ranging from 0 (no decimals) to 6 (maximum). I can count on the fact that there are no floats with greater than 6 decimal places. Given that, how do I make a new column such that it tells me how many digits come after the decimal?
I have seen some threads suggesting that I use CAST to convert the float to a string, then parse the string to count the length of the string that comes after the decimal. Is this the best way to go?
You can use something like this:
declare @v sql_variant
set @v=0.1242311
select SQL_VARIANT_PROPERTY(@v, 'Scale') as Scale
This will return 7.
I tried to make the above query work with a float column but couldn't get it working as expected. It only works with a sql_variant column as you can see here: http://sqlfiddle.com/#!6/5c62c/2
So, I proceeded to find another way and building upon this answer, I got this:
SELECT value,
LEN(
CAST(
CAST(
REVERSE(
CONVERT(VARCHAR(50), value, 128)
) AS float
) AS bigint
)
) as Decimals
FROM Numbers
Here's a SQL Fiddle to test this out: http://sqlfiddle.com/#!6/23d4f/29
To account for that little quirk, here's a modified version that will handle the case when the float value has no decimal part:
SELECT value,
Decimals = CASE Charindex('.', value)
WHEN 0 THEN 0
ELSE
Len (
Cast(
Cast(
Reverse(CONVERT(VARCHAR(50), value, 128)) AS FLOAT
) AS BIGINT
)
)
END
FROM numbers
Here's the accompanying SQL Fiddle: http://sqlfiddle.com/#!6/10d54/11
This thread is also using CAST, but I found the answer interesting:
http://www.sqlservercentral.com/Forums/Topic314390-8-1.aspx
DECLARE @Places INT
SELECT TOP 1000000 @Places = FLOOR(LOG10(REVERSE(ABS(SomeNumber)+1)))+1
FROM dbo.BigTest
and in ORACLE:
SELECT FLOOR(LOG(10,REVERSE(CAST(ABS(.56544)+1 as varchar(50))))) + 1 from DUAL
A float is just representing a real number. There is no meaning to the number of decimal places of a real number. In particular the real number 3 can have six decimal places, 3.000000, it's just that all the decimal places are zero.
You may have a display conversion which is not showing the right most zero values in the decimal.
Note also that the reason there is a maximum of 6 decimal places is that the seventh is imprecise, so the display conversion will not commit to a seventh decimal place value.
Also note that floats are stored in binary, and they actually have binary places to the right of a binary point. The decimal display is an approximation of the binary rational in the float storage which is in turn an approximation of a real number.
So the point is, there really is no sense of how many decimal places a float value has. If you do the conversion to a string (say using the CAST) you could count the decimal places. That really would be the best approach for what you are trying to do.
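The string-conversion approach can be sketched in a few lines of Python. This assumes that "number of decimal places" means the fractional digits in the shortest decimal text that reads back as the same float; count_decimal_places is a hypothetical name used only here, and it is meant for finite values.

```python
from decimal import Decimal

def count_decimal_places(x: float) -> int:
    """Count fractional digits in the shortest decimal text that round-trips to x."""
    # repr() gives the shortest round-trip text; normalize() drops trailing zeros.
    d = Decimal(repr(x)).normalize()
    # A negative exponent is the number of digits after the decimal point.
    return max(0, -d.as_tuple().exponent)

for value in (0.1242311, 1.11, 3.0, 10.0, 1e-07):
    print(value, count_decimal_places(value))
```

Going through Decimal rather than splitting the text on the dot also handles values that print in scientific notation, such as 1e-07.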
I answered this before, but I can tell from the comments that it's a little unclear. Over time I found a better way to express this.
Consider pi as
(a) 3.141592653590
This appears to show pi to 11 decimal places. However, it was actually rounded to 12 decimal places (the last digit happens to be 0), as pi to 16 decimal places is
(b) 3.1415926535897932
A computer or database stores values in binary. For a single precision float, pi would be stored as
(c) 3.1415927410125732421875
This is actually rounded up to the closest value that a single precision can store, just as we rounded in (a). The next lowest number a single precision can store is
(d) 3.141592502593994140625
So, when you try to count the number of decimal places, you are trying to find the position after which all remaining decimals are zero. However, since the number may have been rounded in order to store it, the stored value may not be the value you intended.
Numbers also pick up rounding error as mathematical operations are done, including converting from decimal to binary when inputting the number, and converting from binary to decimal when displaying the value.
You cannot reliably find the number of decimal places of a number in a database, because the number was approximated (rounded) to fit a limited amount of storage. The difference between the real value, or even the exact binary value in the database, and what you see is rounded away when the value is displayed in decimal. Digits may always have been lost to rounding, so you cannot know whether a run of zeros is followed by further non-zero digits.
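The single-precision neighbours of pi can be computed rather than looked up. Here is a minimal Python sketch, assuming IEEE 754 single precision; it reinterprets the 4-byte encoding as an integer so that subtracting 1 steps to the next lower representable value. The helper names are made up for this illustration.

```python
import math
import struct
from decimal import Decimal

def to_float32_bits(x: float) -> int:
    """Round x to single precision and return its 32-bit encoding as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def from_float32_bits(bits: int) -> float:
    """Decode a 32-bit single-precision encoding back into a Python float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

bits = to_float32_bits(math.pi)
nearest = from_float32_bits(bits)    # closest single-precision value to pi
below = from_float32_bits(bits - 1)  # next lower representable value
print(Decimal(nearest))
print(Decimal(below))
```

Both printed values have many decimal digits even though neither was "entered" with that many, which is why counting decimal places on the stored value is unreliable.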
A solution for Oracle, but you get the idea. trunc() removes the decimal part in Oracle.
select *
from your_table
where (your_field*1000000 - trunc(your_field*1000000)) <> 0;
The idea of the query: will there be any decimals left after you multiply by 1 000 000?
Another way I found is
SELECT 1.110000 , LEN(PARSENAME(Cast(1.110000 as float),1)) AS Count_AFTER_DECIMAL
I've noticed that Kshitij Manvelikar's answer has a bug. If there are no decimal places, instead of returning 0, it returns the total number of characters in the number.
So improving upon it:
Case When (SomeNumber = Cast(SomeNumber As Integer)) Then 0 Else LEN(PARSENAME(Cast(SomeNumber as float),1)) End
Here's another Oracle example. As I always warn non-Oracle users before they start screaming at me and downvoting etc... the SUBSTRING and INSTRING are ANSI SQL standard functions and can be used in any SQL. The Dual table can be replaced with any other table or created. Here's the link to the SQL Server blog where I copied the dual table code from: http://blog.sqlauthority.com/2010/07/20/sql-server-select-from-dual-dual-equivalent/
CREATE TABLE DUAL
(
DUMMY VARCHAR(1)
)
GO
INSERT INTO DUAL (DUMMY)
VALUES ('X')
GO
The length after dot or decimal place is returned by this query.
The str can be converted to_number(str) if required. You can also get the length of the string before dot-decimal place - change code to LENGTH(SUBSTR(str, 1, dot_pos))-1 and remove +1 in INSTR part:
SELECT str, LENGTH(SUBSTR(str, dot_pos)) str_length_after_dot FROM
(
SELECT '000.000789' as str
, INSTR('000.000789', '.')+1 dot_pos
FROM dual
)
/
SQL>
STR STR_LENGTH_AFTER_DOT
----------------------------------
000.000789 6
You already have answers and examples about casting etc...
This question asks about regular SQL, but I needed a solution for SQLite. SQLite has neither a log10 function nor a reverse string function built in, so most of the answers here don't work. My solution is similar to Art's answer, and as a matter of fact, similar to what phan describes in the question body. It works by converting the floating point value (in SQLite, a "REAL" value) to text, and then counting the characters after the decimal point.
For a column named "Column" from a table named "Table", the following query will produce the count of each row's decimal places:
select
length(
substr(
cast(Column as text),
instr(cast(Column as text), '.')+1
)
) as "Column-precision" from "Table";
The code will cast the column as text, then get the index of a period (.) in the text, and fetch the substring from that point on to the end of the text. Then, it calculates the length of the result.
Remember to limit 100 if you don't want it to run for the entire table!
It's not a perfect solution; for example, it considers "10.0" as having 1 decimal place, even if it's only a 0. However, this is actually what I needed, so it wasn't a concern to me.
Hopefully this is useful to someone :)
This probably doesn't work well for floats, but I used this approach as a quick and dirty way to find the number of significant decimal places in a decimal type in SQL Server. The last parameter of the ROUND function, if not 0, tells it to truncate rather than round.
CASE
WHEN col = round(col, 1, 1) THEN 1
WHEN col = round(col, 2, 1) THEN 2
WHEN col = round(col, 3, 1) THEN 3
...
ELSE null END