Determine MAX Decimal Scale Used on a Column - sql

In MS SQL, I need a approach to determine the largest scale being used by the rows for a certain decimal column.
For example Col1 Decimal(19,8) has a scale of 8, but I need to know if all 8 are actually being used, or if only 5, 6, or 7 are being used.
Sample Data:
123.12345000
321.43210000
5255.12340000
5244.12345000
For the data above, I'd need the query to either return 5, or 123.12345000 or 5244.12345000.
I'm not concerned about performance, I'm sure a full table scan will be in order, I just need to run the query once.

Not pretty, but I think it should do the trick:
-- Find the first non-zero character in the reversed string...
-- And then subtract from the scale of the decimal + 1.
SELECT 9 - PATINDEX('%[1-9]%', REVERSE(Col1))

I like #Michael Fredrickson's answer better and am only posting this as an alternative for specific cases where the actual scale is unknown but is certain to be no more than 18:
SELECT LEN(CAST(CAST(REVERSE(Col1) AS float) AS bigint))
Please note that, although there are two explicit CAST calls here, the query actually performs two more implicit conversions:
As the argument of REVERSE, Col1 is converted to a string.
The bigint is cast as a string before being used as the argument of LEN.

SELECT
MAX(CHAR_LENGTH(
SUBSTRING(column_name::text FROM '\.(\d*?)0*$')
)) AS max_scale
FROM table_name;
*? is the non-greedy version of *, so \d*? catches all digits after the decimal point except trailing zeros.
The pattern contains a pair of parentheses, so the portion of the text that matched the first parenthesized subexpression (that is \d*?) is returned.
References:
https://www.postgresql.org/docs/9.6/static/sql-createcast.html
https://www.postgresql.org/docs/9.6/static/functions-matching.html

Note this will scan the entire table:
SELECT TOP 1 [Col1]
FROM [Table]
ORDER BY LEN(PARSENAME(CAST([Col1] AS VARCHAR(40)), 1)) DESC

Related

Real number comparison for trigram similarity

I am implementing trigram similarity for word matching in column comum1. similarity() returns real. I have converted 0.01 to real and rounded to 2 decimal digits. Though there are rank values greater than 0.01, I get no results on screen. If I remove the WHERE condition, lots of results are available. Kindly guide me how to overcome this issue.
SELECT *,ROUND(similarity(comum1,"Search_word"),2) AS rank
FROM schema.table
WHERE rank >= round(0.01::real,2)
I have also converted both numbers to numeric and compared, but that also didn't work:
SELECT *,ROUND(similarity(comum1,"Search_word")::NUMERIC,2) AS rank
FROM schema.table
WHERE rank >= round(0.01::NUMERIC,2)
LIMIT 50;
The WHERE clause can only reference input column names, coming from the underlying table(s). rank in your example is the column alias for a result - an output column name.
So your statement is illegal and should return with an error message - unless you have another column named rank in schema.table, in which case you shot yourself in the foot. I would think twice before introducing such a naming collision, while I am not completely firm with SQL syntax.
And round() with a second parameter is not defined for real, you would need to cast to numeric like you tried. Another reason your first query is illegal.
Also, the double-quotes around "Search_word" are highly suspicious. If that's supposed to be a string literal, you need single quotes: 'Search_word'.
This should work:
SELECT *, round(similarity(comum1,'Search_word')::numeric,2) AS rank
FROM schema.table
WHERE similarity(comum1, 'Search_word') > 0.01;
But it's still pretty useless as it fails to make use of trigram indexes. Do this instead:
SET pg_trgm.similarity_threshold = 0.01; -- set once
SELECT *
FROM schema.table
WHERE comum1 % 'Search_word';
See:
Finding similar strings with PostgreSQL quickly
That said, a similarity of 0.01 is almost no similarity. Typically, you need a much higher threshold.

Need to divide a date part in SQL Server

I have a column in my table with these values:
PING_TO_ME_20100828_Any87
TO_THESE_D_COLUMN_ENTRY_20200825
TO_THESE_D_20100829_COLUMN_ENTRY
201901_ARE_YOU_TRYING_TO_REACH47
ASK_TO_UOU_201008
I need to separate date values in a separate column.
My output should be:
20100828
20200825
20100829
201901
201008
Any help is very much appreciated.
You will (and already have) likely get comments about this telling you to fix your design. And while that is likely true...I won't try to pick apart why you are doing this, and I'll just give you the answer you came here for.
Your goal is to pick out either an 8 digit string of integers, or a 6 digit string of integers.
Here is one way you could do it:
SELECT x.y
, COALESCE(SUBSTRING(x.y, NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%', x.y), 0), 8)
, SUBSTRING(x.y, NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9]%', x.y), 0), 6))
FROM (
VALUES ('PING_TO_ME_20100828_Any87'),
('TO_THESE_D_COLUMN_ENTRY_20200825'),
('TO_THESE_D_20100829_COLUMN_ENTRY'),
('201901_ARE_YOU_TRYING_TO_REACH47'),
('ASK_TO_UOU_201008')
) x(y)
Explanation:
Since you are looking for both 8 and 6 digit values, you need to check for the longer of the two first. So first I search for the occurrence of a string of 8 integers using:
NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%', x.y), 0)
This returns the first position of a string of 8 integers. The reason I wrap it in a NULLIF() is because if the value is not found, then PATINDEX will return 0.
I use NULLIF() to return NULL in that case, essentially indicating nothing was found. If you pass a NULL value to SUBSTRING() then it also returns NULL.
This is all just a nice way of "failing over" to the 6 character string check.
So there I do the same thing again:
NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9]%', x.y), 0)
Except this time, I only repeat [0-9] six times. And again, I use the NULLIF() trick, so that it returns NULL if no string is found.
Throw that all into SUBSTRING() and COALESCE() and you've got a function that returns the results you're looking for.
Potential downsides
There are a couple down sides to this method.
It is not checking for a valid date, it's simply looking for a string of either 8 integers, or 6 integers. It could be 12345678 and it would still detect and return that.
If there are strings of integers longer than 8 digits, it will grab only the first 8 characters.
If there are multiple occurrences of 6 or 8 character integer strings...it will only return the first one.
There are much more robust ways you could write this, but it all depends on your data and what you need to do.
Other methods
Another way it could be done depending on which version of SQL Server you are using, is using STRING_SPLIT().
SELECT x.y, s.[value]
FROM (
VALUES ('PING_TO_ME_20100828_Any87'),('TO_THESE_D_COLUMN_ENTRY_20200825'),('TO_THESE_D_20100829_COLUMN_ENTRY'),('201901_ARE_YOU_TRYING_TO_REACH47'),('ASK_TO_UOU_201008')
) x(y)
CROSS APPLY (
SELECT [value]
FROM STRING_SPLIT(x.y, '_')
WHERE [value] LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
OR [value] LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]'
) s
This method handles a couple of the downsides mentioned earlier. For example, it will ONLY return integer strings of length 6 or 8. It will also return ALL integer strings of length 6 or 8 and not just the first one.
And there's other ways to identify the strings as well, like using ISNUMERIC(x.[value]) or TRY_CONVERT(int, s.[value]).
It all depends on how you are using this code...if it's runs fast enough, and it's a one off script, then it really doesn't matter. If it's running for millions of records at a time, then yeah you should play around with other methods.

How do I count decimal places in SQL?

I have a column X which is full of floats with decimals places ranging from 0 (no decimals) to 6 (maximum). I can count on the fact that there are no floats with greater than 6 decimal places. Given that, how do I make a new column such that it tells me how many digits come after the decimal?
I have seen some threads suggesting that I use CAST to convert the float to a string, then parse the string to count the length of the string that comes after the decimal. Is this the best way to go?
You can use something like this:
declare #v sql_variant
set #v=0.1242311
select SQL_VARIANT_PROPERTY(#v, 'Scale') as Scale
This will return 7.
I tried to make the above query work with a float column but couldn't get it working as expected. It only works with a sql_variant column as you can see here: http://sqlfiddle.com/#!6/5c62c/2
So, I proceeded to find another way and building upon this answer, I got this:
SELECT value,
LEN(
CAST(
CAST(
REVERSE(
CONVERT(VARCHAR(50), value, 128)
) AS float
) AS bigint
)
) as Decimals
FROM Numbers
Here's a SQL Fiddle to test this out: http://sqlfiddle.com/#!6/23d4f/29
To account for that little quirk, here's a modified version that will handle the case when the float value has no decimal part:
SELECT value,
Decimals = CASE Charindex('.', value)
WHEN 0 THEN 0
ELSE
Len (
Cast(
Cast(
Reverse(CONVERT(VARCHAR(50), value, 128)) AS FLOAT
) AS BIGINT
)
)
END
FROM numbers
Here's the accompanying SQL Fiddle: http://sqlfiddle.com/#!6/10d54/11
This thread is also using CAST, but I found the answer interesting:
http://www.sqlservercentral.com/Forums/Topic314390-8-1.aspx
DECLARE #Places INT
SELECT TOP 1000000 #Places = FLOOR(LOG10(REVERSE(ABS(SomeNumber)+1)))+1
FROM dbo.BigTest
and in ORACLE:
SELECT FLOOR(LOG(10,REVERSE(CAST(ABS(.56544)+1 as varchar(50))))) + 1 from DUAL
A float is just representing a real number. There is no meaning to the number of decimal places of a real number. In particular the real number 3 can have six decimal places, 3.000000, it's just that all the decimal places are zero.
You may have a display conversion which is not showing the right most zero values in the decimal.
Note also that the reason there is a maximum of 6 decimal places is that the seventh is imprecise, so the display conversion will not commit to a seventh decimal place value.
Also note that floats are stored in binary, and they actually have binary places to the right of a binary point. The decimal display is an approximation of the binary rational in the float storage which is in turn an approximation of a real number.
So the point is, there really is no sense of how many decimal places a float value has. If you do the conversion to a string (say using the CAST) you could count the decimal places. That really would be the best approach for what you are trying to do.
I answered this before, but I can tell from the comments that it's a little unclear. Over time I found a better way to express this.
Consider pi as
(a) 3.141592653590
This shows pi as 11 decimal places. However this was rounded to 12 decimal places, as pi, to 14 digits is
(b) 3.1415926535897932
A computer or database stores values in binary. For a single precision float, pi would be stored as
(c) 3.141592739105224609375
This is actually rounded up to the closest value that a single precision can store, just as we rounded in (a). The next lowest number a single precision can store is
(d) 3.141592502593994140625
So, when you are trying to count the number of decimal places, you are trying to find how many decimal places, after which all remaining decimals would be zero. However, since the number may need to be rounded to store it, it does not represent the correct value.
Numbers also introduce rounding error as mathematical operations are done, including converting from decimal to binary when inputting the number, and converting from binary to decimal when displaying the value.
You cannot reliably find the number of decimal places a number in a database has, because it is approximated to round it to store in a limited amount of storage. The difference between the real value, or even the exact binary value in the database will be rounded to represent it in decimal. There could always be more decimal digits which are missing from rounding, so you don't know when the zeros would have no more non-zero digits following it.
Solution for Oracle but you got the idea. trunc() removes decimal part in Oracle.
select *
from your_table
where (your_field*1000000 - trunc(your_field*1000000)) <> 0;
The idea of the query: Will there be any decimals left after you multiply by 1 000 000.
Another way I found is
SELECT 1.110000 , LEN(PARSENAME(Cast(1.110000 as float),1)) AS Count_AFTER_DECIMAL
I've noticed that Kshitij Manvelikar's answer has a bug. If there are no decimal places, instead of returning 0, it returns the total number of characters in the number.
So improving upon it:
Case When (SomeNumber = Cast(SomeNumber As Integer)) Then 0 Else LEN(PARSENAME(Cast(SomeNumber as float),1)) End
Here's another Oracle example. As I always warn non-Oracle users before they start screaming at me and downvoting etc... the SUBSTRING and INSTRING are ANSI SQL standard functions and can be used in any SQL. The Dual table can be replaced with any other table or created. Here's the link to SQL SERVER blog whre i copied dual table code from: http://blog.sqlauthority.com/2010/07/20/sql-server-select-from-dual-dual-equivalent/
CREATE TABLE DUAL
(
DUMMY VARCHAR(1)
)
GO
INSERT INTO DUAL (DUMMY)
VALUES ('X')
GO
The length after dot or decimal place is returned by this query.
The str can be converted to_number(str) if required. You can also get the length of the string before dot-decimal place - change code to LENGTH(SUBSTR(str, 1, dot_pos))-1 and remove +1 in INSTR part:
SELECT str, LENGTH(SUBSTR(str, dot_pos)) str_length_after_dot FROM
(
SELECT '000.000789' as str
, INSTR('000.000789', '.')+1 dot_pos
FROM dual
)
/
SQL>
STR STR_LENGTH_AFTER_DOT
----------------------------------
000.000789 6
You already have answers and examples about casting etc...
This question asks of regular SQL, but I needed a solution for SQLite. SQLite has neither a log10 function, nor a reverse string function builtin, so most of the answers here don't work. My solution is similar to Art's answer, and as a matter of fact, similar to what phan describes in the question body. It works by converting the floating point value (in SQLite, a "REAL" value) to text, and then counting the caracters after a decimal point.
For a column named "Column" from a table named "Table", the following query will produce a the count of each row's decimal places:
select
length(
substr(
cast(Column as text),
instr(cast(Column as text), '.')+1
)
) as "Column-precision" from "Table";
The code will cast the column as text, then get the index of a period (.) in the text, and fetch the substring from that point on to the end of the text. Then, it calculates the length of the result.
Remember to limit 100 if you don't want it to run for the entire table!
It's not a perfect solution; for example, it considers "10.0" as having 1 decimal place, even if it's only a 0. However, this is actually what I needed, so it wasn't a concern to me.
Hopefully this is useful to someone :)
Probably doesn't work well for floats, but I used this approach as a quick and dirty way to find number of significant decimal places in a decimal type in SQL Server. Last parameter of round function if not 0 indicates to truncate rather than round.
CASE
WHEN col = round(col, 1, 1) THEN 1
WHEN col = round(col, 2, 1) THEN 2
WHEN col = round(col, 3, 1) THEN 3
...
ELSE null END

Use max in a VARCHAR to get result > 999999?

What if somebody made a column as VARCHAR2(256 CHAR) and there are only numbers in this column. I would like to get the highest number. The problem is: the number is something > 999999 but a Max to a varchar is always giving me a max number of 999999
I tried to_number(max(numbers), '9999999999999') but i still get 999999 back, at that cant be. Any ideas? Thank you
the best way is to
First Solution
convert the column in numeric
or
Second Solution
convert data in you query in numeric and than get data...
Example
select max(col1) from(
select to_number(numbers) as col1 from table ) d
It has to be this way because if you call MAX() before TO_NUMBER(), it will sort alphabetically, and then 999999 is bigger than 100000000000. Note that applying TO_NUMBER() to a varchar2 column incurs the risk of an INVALID_NUMBER exception, should the column containing any non-numeric characters. This is why the first proposed solution is to be preferred.
In Oracle, the NUMBER type contains base 100 floating point values which have a precision of 38 significant digits, and a max value of 9999...(38 9's) x 10^125. There are two questions at issue - the first is whether a NUMBER can contain a value converted from a 256 character string, and the second is if two such values which are 'close' in numeric terms can be distinguished.
Let's start with taking a 256 character string and trying to convert it to a number. The obvious thing to do is:
SELECT TO_NUMBER('9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999') AS VAL
FROM DUAL;
Executing the above we get:
ORA-01426: numeric overflow
which, having paid attention earlier, we expected. The largest exponent that a NUMBER can handle is 125 - and here we're trying to convert a value with 256 significant digits. NUMBER's can't handle this. If we cut the number of digits down to 125, as follows:
SELECT TO_NUMBER('99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999') AS VAL
FROM DUAL;
It works fine, and our answer it 1E125.
<blink>
WHOA! WAIT!! WHAT??? The answer is 1 x 10^125??? What about all those 9's?!?!?!?
Remember earlier I'd mentioned that an Oracle NUMBER is a floating point value with a maximum precision of 38 and a maximum exponent of 125. From the point of view of TO_NUMBER 125 9's all strung together can't be exactly represented - too many digits (remember, max. precision of 38 (more on this later)). So it does the absolute best it can - it converts the first 38 digits (all of which are 9's) and then says "How should I best round this off to make the result A) representative of the input and B) as close as I can get to what I was given?". In this case it looks at digit 39, sees that it's a 9, and decides to round upward. As all the other digits are also 9's, it continues rounding neatly until it ends up with 1 as the remaining mantissa digit.
* Later, back at the ranch... *
OK, earlier I'd mentioned that NUMBER has a precision of 38 digits. That's not entirely true - it can actually differentiate between values with up to 40 digits of precision, at least sometimes, if the wind is right, and you're going downhill. Here's an example:
SELECT CASE
WHEN to_number('9999999999999999999999999999999999999999') >
to_number('9999999999999999999999999999999999999998')
THEN 'Greater'
ELSE 'Not greater'
END AS VAL
FROM DUAL;
Those two values each have 40 digits (counting is left as an exercise to the extremely bored reader :-). If you execute the above you'll get back 'Greater', showing that the comparison of two 40 digit values succeeded.
Now for some fun. If you add an additional '9' to each string, making for a 41 digit value, and re-execute the statement it'll return 'Not greater'.
<blink>
WAIT! WHAT?? WHOA!!! Those values are obviously different! Even a TotalFool (tm) can see that!!
The problem here is that a 41 digit number exceeds the precision of the NUMBER type, and thus when TO_NUMBER finds it has a value this long it starts discarding digits on the right side. Thus, even though those two really big numbers are clearly different to you and me, they're not different at all once they've been folded, spindled, mutilated, and converted.
So, what are the takeaways here?
1 - To the OP's original question - you'll have to come up with another way to compare your number strings besides using NUMBER because Oracle's NUMBER type can't hold 256 digit values. I suggest that you normalize the strings by making sure ALL the values are 256 digits long, adding zeroes on the left as needed, and then a string comparison should work OK.
2 - Floating point numbers prove the existence of (your favorite deity/deities here) by negation, as they are clearly the work of (your favorite personification of evil here). Whenever you work with them (as we all have to, sooner or later) you should remember that they are the foul byproducts of malignant evil, waiting to lash out at you when you least expect it.
3 - There is NO point three! (And extra credit for those who can identify without resorting to an extra-cranial search engine where this comes from :-)
Share and enjoy.
If you mean that the numbers in the column can be that big (256 digits), you could try something like this:
SELECT numbers
FROM (
SELECT numbers
FROM table_name
ORDER BY LPAD(numbers, 256) DESC
)
WHERE rownum = 1
or like this:
SELECT LTRIM(MAX(LPAD(numbers, 256))) AS numbers
FROM table_name

How to find MAX() value of character column?

We have legacy table where one of the columns part of composite key was manually filled with values:
code
------
'001'
'002'
'099'
etc.
Now, we have feature request in which we must know MAX(code) in order to give user next possible value, in example case form above next value is '100'.
We tried to experiment with this but we still can't find any reasonable explanation how DB2 engine calculates that
MAX('001', '099', '576') is '576'
MAX('099', '99', 'www') is '99' and so on.
Any help or suggestion would be much appreciated!
You already have the answer to getting the maximum numeric value, but to answer the other part with regard to 'www','099','99'.
The AS/400 uses EBCDIC to store values, this is different to ASCII in several ways, the most important for your purposes is that Alpha characters come before numbers, which is the opposite of Ascii.
So on your Max() your 3 strings will be sorted and the highest EBCDIC value used so
'www'
'099'
'99 '
As you can see your '99' string is really '99 ' so it is higher that the one with the leading zero.
Cast it to int before applying max()
For the numeric maximum -- filter out the non-numeric values and cast to a numeric for aggregation:
SELECT MAX(INT(FLD1))
WHERE FLD1 <> ' '
AND TRANSLATE(FLD1, '0123456789', '0123456789') = FLD1
SQL Reference: TRANSLATE
And the reasonable explanation:
SQL Reference: MAX
This max working well in your type definition, when you want do max on integer values then convert values to integer before calling MAX, but i see you mixing max with string 'www' how you imagine this works?
Filter integer only values, cast it to int and call max. This is not good designed solution but looking at your problem i think is enough.
Sharing the solution for postgresql
which worked for me.
Suppose here temporary_id is of type character in database. Then above query will directly convert char type to int type when it gives response.
SELECT MAX(CAST (temporary_id AS Integer)) FROM temporary
WHERE temporary_id IS NOT NULL
As per my requirement I've applied MAX() aggregate function. One can remove that also and it will work the same way.