I have a column in my table with these values:
PING_TO_ME_20100828_Any87
TO_THESE_D_COLUMN_ENTRY_20200825
TO_THESE_D_20100829_COLUMN_ENTRY
201901_ARE_YOU_TRYING_TO_REACH47
ASK_TO_UOU_201008
I need to extract the date values into a separate column.
My output should be:
20100828
20200825
20100829
201901
201008
Any help is very much appreciated.
You will likely get (and already have gotten) comments telling you to fix your design. While that is probably true, I won't pick apart why you are doing this; I'll just give you the answer you came here for.
Your goal is to pick out a run of either 8 digits or 6 digits.
Here is one way you could do it:
SELECT x.y
, COALESCE(SUBSTRING(x.y, NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%', x.y), 0), 8)
, SUBSTRING(x.y, NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9]%', x.y), 0), 6))
FROM (
VALUES ('PING_TO_ME_20100828_Any87'),
('TO_THESE_D_COLUMN_ENTRY_20200825'),
('TO_THESE_D_20100829_COLUMN_ENTRY'),
('201901_ARE_YOU_TRYING_TO_REACH47'),
('ASK_TO_UOU_201008')
) x(y)
Explanation:
Since you are looking for both 8 and 6 digit values, you need to check for the longer of the two first. So first I search for the occurrence of a string of 8 integers using:
NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%', x.y), 0)
This returns the first position of a string of 8 integers. The reason I wrap it in a NULLIF() is because if the value is not found, then PATINDEX will return 0.
I use NULLIF() to return NULL in that case, essentially indicating nothing was found. If you pass a NULL value to SUBSTRING() then it also returns NULL.
This is all just a nice way of "failing over" to the 6 character string check.
So there I do the same thing again:
NULLIF(PATINDEX('%[0-9][0-9][0-9][0-9][0-9][0-9]%', x.y), 0)
Except this time, I only repeat [0-9] six times. And again, I use the NULLIF() trick, so that it returns NULL if no string is found.
Throw that all into SUBSTRING() and COALESCE() and you've got a function that returns the results you're looking for.
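To make the fall-through logic concrete outside the database, here is a minimal Python sketch of the same idea: look for the first run of 8 digits, and only if there is none, look for the first run of 6. The function name `extract_date_token` is made up for illustration. It is not part of the query above, and it shares that query's limitations.

```python
import re


def extract_date_token(value):
    """Return the first run of 8 digits, else the first run of 6 digits, else None."""
    # Try the longer width first, mirroring the COALESCE order in the query.
    for width in (8, 6):
        match = re.search(r"[0-9]{%d}" % width, value)
        if match:
            return match.group(0)
    return None


samples = [
    "PING_TO_ME_20100828_Any87",
    "TO_THESE_D_COLUMN_ENTRY_20200825",
    "TO_THESE_D_20100829_COLUMN_ENTRY",
    "201901_ARE_YOU_TRYING_TO_REACH47",
    "ASK_TO_UOU_201008",
]
results = [extract_date_token(s) for s in samples]
```

Like the SQL version, this returns a slice of a longer digit run if one is present, and it does not check that the digits form a valid date.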
Potential downsides
There are a few downsides to this method:
It is not checking for a valid date, it's simply looking for a string of either 8 integers, or 6 integers. It could be 12345678 and it would still detect and return that.
If there are strings of integers longer than 8 digits, it will grab only the first 8 characters.
If there are multiple occurrences of 6 or 8 character integer strings...it will only return the first one.
There are much more robust ways you could write this, but it all depends on your data and what you need to do.
Other methods
Another way to do it, if you are on SQL Server 2016 or later, is to use STRING_SPLIT():
SELECT x.y, s.[value]
FROM (
VALUES ('PING_TO_ME_20100828_Any87'),('TO_THESE_D_COLUMN_ENTRY_20200825'),('TO_THESE_D_20100829_COLUMN_ENTRY'),('201901_ARE_YOU_TRYING_TO_REACH47'),('ASK_TO_UOU_201008')
) x(y)
CROSS APPLY (
SELECT [value]
FROM STRING_SPLIT(x.y, '_')
WHERE [value] LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
OR [value] LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]'
) s
This method handles a couple of the downsides mentioned earlier. For example, it will ONLY return integer strings of length 6 or 8. It will also return ALL integer strings of length 6 or 8 and not just the first one.
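The split-and-filter approach can be sketched in Python as well. This is only an illustration of the logic, assuming the underscore delimiter from the sample data. The helper name `date_tokens` is hypothetical.

```python
def date_tokens(value, sep="_"):
    """Return every token that consists of exactly 6 or exactly 8 ASCII digits."""
    return [
        token
        for token in value.split(sep)
        if len(token) in (6, 8) and token.isascii() and token.isdigit()
    ]


single = date_tokens("PING_TO_ME_20100828_Any87")
several = date_tokens("201901_X_20200825")
too_long = date_tokens("A_123456789_B")
```

Because each token is checked as a whole, a 9-digit token is rejected rather than truncated, and every qualifying token is returned, not only the first.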
And there are other ways to identify the strings as well, like using ISNUMERIC(s.[value]) or TRY_CONVERT(int, s.[value]).
It all depends on how you are using this code. If it runs fast enough and it's a one-off script, then it really doesn't matter. If it's running against millions of records at a time, then yes, you should experiment with other methods.
Related
I did not expect this to be a problem, but I'm struggling to return the first 3 numbers, including the 0's before them. In the below examples, I show a few things I've tried. I want it to return '001'. It either returns '118' or an error. It seems like every solution wants to convert them to a text, which will drop the 0's.
SELECT lpad(00118458582::text, 3, '0')
returns 118
SELECT lpad(00118458582, 3, '0')
ERROR: function lpad(integer, integer, unknown) does not exist
SELECT left(00118458582::text, 3)
returns 118
SELECT left(00118458582, 3)
ERROR: function left(integer, integer) does not exist
SELECT substring(00118458582::text, 1, 3)
returns 118
Can I get any help please? Thanks!
Your problem starts before you try to get the first 3 digits, namely that you're considering 00118458582 to be a valid INTEGER (or whatever numeric type). I mean, it's not invalid, but what happens when you run SELECT 00118458582::INTEGER? You get 118458582. Because leading zeros in those types are senseless. So you'll never have a situation as in your examples (outside of a hardcoded number with leading zeros in your query window) in your tables, because those zeros wouldn't be stored in your number-based data type fields.
So the only way to get that sort of situation is when they're string-based: SELECT '00118458582'::TEXT returns 00118458582. And at that point you can run your preferred function to get the first 3 characters, e.g. SELECT LEFT('00118458582', 3) which returns 001. But if you're planning on casting that to INTEGER or something, forget about leading zeros.
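The same effect is easy to reproduce outside PostgreSQL. The short Python sketch below (variable names are illustrative only) shows that the zeros are lost at the moment the text becomes a number, not when the first three characters are taken.

```python
raw = "00118458582"        # as text, the value keeps its leading zeros
as_number = int(raw)       # numeric types have no notion of leading zeros

first_three_of_text = raw[:3]
first_three_of_number = str(as_number)[:3]

# Padding back to the original width restores the zeros, but only because
# the original width is known here.
padded_back = str(as_number).zfill(len(raw))
```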
SELECT substring(00118458582::text, 1, 3)
returns 118 because it is a number 118458582 (the leading zeros are automatically dropped), that is converted to text '118458582' and it then takes the first 3 characters.
If you are trying to take the first three digits and then convert them to a number, cast the result of substring(), not its length argument:
select substring('00118458582', 1, 3)::numeric
Note that this returns 1, because the cast drops the leading zeros again.
The second argument of lpad() is the total length of the returned value, not the number of characters to add. So I think you want:
select lpad(00118458582::text, 12, '0'::text)
If you always want exactly 3 zeros before, then just concatenate them:
select '000' || 00118458582::text
I am querying to find things ending in "ST" followed by a number 1 - 999.
SELECT NUMBER WHERE NUMBER LIKE '%ST' -- works correctly to return everything ending in "ST"
SELECT NUMBER WHERE NUMBER LIKE '%[1-999]' -- works correctly to return everything ending in 1 - 999
SELECT NUMBER WHERE NUMBER LIKE '%ST[1-999]' -- doesn't work - returns nothing
Also tried:
SELECT NUMBER WHERE NUMBER LIKE '%ST%[1-999]' -- works, but also returns things like "GRASTNT3" that have extra things between the "ST" and the number
Can anyone help this struggling beginner?
Thanks!
The problem is that [1-999] doesn't mean what you think it does.
SQL Server interprets it as a set of single characters (the range 1-9, then 9, then 9), so the bracket matches exactly one character. If there is more than one digit after the ST, the entry won't be returned.
So far as I can tell, your best bet is:
SELECT NUMBER WHERE
NUMBER LIKE '%ST[1-9][0-9][0-9]' OR
NUMBER LIKE '%ST[1-9][0-9]' OR
NUMBER LIKE '%ST[1-9]'
(assuming that your numbers don't have leading zeros; if they do, replace each [1-9] with [0-9])
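Regular-expression character classes behave the same way as LIKE brackets in this respect, so the pitfall can be illustrated with a small Python sketch. This is an analogy using Python's `re` module, not SQL Server's LIKE engine, and the names `NAIVE`, `FIXED` and `ends_with_st_number` are made up for the example.

```python
import re

# [1-999] is a character class: the range 1-9 plus two redundant 9s.
# It matches exactly ONE character, never a number from 1 to 999.
NAIVE = re.compile(r".*ST[1-999]")

# One digit 1-9 followed by up to two more digits covers 1 through 999
# without leading zeros, like the three OR-ed LIKE patterns.
FIXED = re.compile(r".*ST[1-9][0-9]{0,2}")


def ends_with_st_number(text):
    """True if text ends in 'ST' followed by a number from 1 to 999."""
    return FIXED.fullmatch(text) is not None


naive_two_digits = NAIVE.fullmatch("WEST12")
naive_one_digit = NAIVE.fullmatch("WEST7")
```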
You need to do
SELECT NUMBER WHERE
NUMBER LIKE '%ST[1-9][0-9][0-9]'
OR NUMBER LIKE '%ST[1-9][0-9]'
OR NUMBER LIKE '%ST[1-9]';
The range inside the [] is a set of Char/NChar characters, not an Int.
Better still normalise and type your data, so you have an ST bit and an int column for the number.
If you find you need to define different filters on variable string data, consider Full Text Searching or another Lucene related technology depending on your RDBMS.
What if somebody defined a column as VARCHAR2(256 CHAR) and there are only numbers in this column? I would like to get the highest number. The problem is that the highest number is something > 999999, but MAX on a varchar always gives me a maximum of 999999.
I tried to_number(max(numbers), '9999999999999') but I still get 999999 back, and that can't be right. Any ideas? Thank you
The best way is one of the following.
First Solution
Convert the column to a numeric type
or
Second Solution
Convert the data to numeric in your query and then take the maximum.
Example
select max(col1) from(
select to_number(numbers) as col1 from table ) d
It has to be this way because if you call MAX() before TO_NUMBER(), the values are compared as strings, character by character, and then 999999 is bigger than 100000000000. Note that applying TO_NUMBER() to a varchar2 column incurs the risk of an INVALID_NUMBER exception, should the column contain any non-numeric characters. This is why the first proposed solution is to be preferred.
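The difference between the two orders of operation shows up in any language that compares strings character by character. A minimal Python sketch, with made-up sample values:

```python
numbers = ["999999", "100000000000", "42"]

# Compared as text: '9' sorts after '1', so the shorter string wins.
max_as_text = max(numbers)

# Converted first, then compared: the numerically largest value wins.
max_as_number = max(int(n) for n in numbers)
```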
In Oracle, the NUMBER type contains base 100 floating point values which have a precision of 38 significant digits, and a max value of 9999...(38 9's) x 10^125. There are two questions at issue - the first is whether a NUMBER can contain a value converted from a 256 character string, and the second is if two such values which are 'close' in numeric terms can be distinguished.
Let's start with taking a 256 character string and trying to convert it to a number. The obvious thing to do is:
SELECT TO_NUMBER('9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999') AS VAL
FROM DUAL;
Executing the above we get:
ORA-01426: numeric overflow
which, having paid attention earlier, we expected. The largest exponent that a NUMBER can handle is 125 - and here we're trying to convert a value with 256 significant digits. NUMBERs can't handle this. If we cut the number of digits down to 125, as follows:
SELECT TO_NUMBER('99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999') AS VAL
FROM DUAL;
It works fine, and our answer is 1E125.
<blink>
WHOA! WAIT!! WHAT??? The answer is 1 x 10^125??? What about all those 9's?!?!?!?
Remember earlier I'd mentioned that an Oracle NUMBER is a floating point value with a maximum precision of 38 and a maximum exponent of 125. From the point of view of TO_NUMBER 125 9's all strung together can't be exactly represented - too many digits (remember, max. precision of 38 (more on this later)). So it does the absolute best it can - it converts the first 38 digits (all of which are 9's) and then says "How should I best round this off to make the result A) representative of the input and B) as close as I can get to what I was given?". In this case it looks at digit 39, sees that it's a 9, and decides to round upward. As all the other digits are also 9's, it continues rounding neatly until it ends up with 1 as the remaining mantissa digit.
* Later, back at the ranch... *
OK, earlier I'd mentioned that NUMBER has a precision of 38 digits. That's not entirely true - it can actually differentiate between values with up to 40 digits of precision, at least sometimes, if the wind is right, and you're going downhill. Here's an example:
SELECT CASE
WHEN to_number('9999999999999999999999999999999999999999') >
to_number('9999999999999999999999999999999999999998')
THEN 'Greater'
ELSE 'Not greater'
END AS VAL
FROM DUAL;
Those two values each have 40 digits (counting is left as an exercise to the extremely bored reader :-). If you execute the above you'll get back 'Greater', showing that the comparison of two 40 digit values succeeded.
Now for some fun. If you add an additional '9' to each string, making for a 41 digit value, and re-execute the statement it'll return 'Not greater'.
<blink>
WAIT! WHAT?? WHOA!!! Those values are obviously different! Even a TotalFool (tm) can see that!!
The problem here is that a 41 digit number exceeds the precision of the NUMBER type, and thus when TO_NUMBER finds it has a value this long it starts discarding digits on the right side. Thus, even though those two really big numbers are clearly different to you and me, they're not different at all once they've been folded, spindled, mutilated, and converted.
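Both effects, the rounding to 1E125 and the collapse of two distinct long values into one, can be imitated with Python's `decimal` module set to 38 significant digits. This is only a rough model: it assumes plain base-10 rounding and does not reproduce Oracle's base-100 storage, which is why Oracle can sometimes still tell 40-digit values apart while this model cannot.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

# A context limited to 38 significant digits, as a stand-in for NUMBER.
ctx = Context(prec=38, rounding=ROUND_HALF_UP)

# 125 nines: the 39th digit is a 9, so rounding carries all the way up.
rounded = ctx.create_decimal("9" * 125)

# Two 41-digit values that differ only in the last digit.
a = ctx.create_decimal("9" * 41)
b = ctx.create_decimal("9" * 40 + "8")

# Without the precision limit the two values remain distinct.
exact_a = Decimal("9" * 41)
exact_b = Decimal("9" * 40 + "8")
```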
So, what are the takeaways here?
1 - To the OP's original question - you'll have to come up with another way to compare your number strings besides using NUMBER because Oracle's NUMBER type can't hold 256 digit values. I suggest that you normalize the strings by making sure ALL the values are 256 digits long, adding zeroes on the left as needed, and then a string comparison should work OK.
2 - Floating point numbers prove the existence of (your favorite deity/deities here) by negation, as they are clearly the work of (your favorite personification of evil here). Whenever you work with them (as we all have to, sooner or later) you should remember that they are the foul byproducts of malignant evil, waiting to lash out at you when you least expect it.
3 - There is NO point three! (And extra credit for those who can identify without resorting to an extra-cranial search engine where this comes from :-)
Share and enjoy.
If you mean that the numbers in the column can be that big (256 digits), you could try something like this:
SELECT numbers
FROM (
SELECT numbers
FROM table_name
ORDER BY LPAD(numbers, 256) DESC
)
WHERE rownum = 1
or like this:
SELECT LTRIM(MAX(LPAD(numbers, 256))) AS numbers
FROM table_name
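The padding idea works because strings of equal length that contain only digits sort in numeric order. A short Python sketch of the same trick follows. The function name `max_numeric_string` is hypothetical, and it pads with zeros where the queries above pad with spaces; both sort below or equal to every digit, so the result is the same for digit-only input without leading zeros.

```python
def max_numeric_string(values, width=256):
    """Return the numerically largest digit string without converting to a number."""
    # Left-pad every value to the same width so that text order equals numeric order.
    return max(values, key=lambda s: s.rjust(width, "0"))


big = "1" + "0" * 40
largest = max_numeric_string(["999999", big, "42"])
```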
In MS SQL, I need an approach to determine the largest scale being used by the rows for a certain decimal column.
For example Col1 Decimal(19,8) has a scale of 8, but I need to know if all 8 are actually being used, or if only 5, 6, or 7 are being used.
Sample Data:
123.12345000
321.43210000
5255.12340000
5244.12345000
For the data above, I'd need the query to either return 5, or 123.12345000 or 5244.12345000.
I'm not concerned about performance, I'm sure a full table scan will be in order, I just need to run the query once.
Not pretty, but I think it should do the trick:
-- Find the first non-zero character in the reversed string...
-- And then subtract from the scale of the decimal + 1.
SELECT 9 - PATINDEX('%[1-9]%', REVERSE(Col1))
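For checking the expected result on a handful of values outside the database, the used scale can also be computed with Python's `decimal` module. This is a sketch under the assumption that the values are available as text. `used_scale` is an illustrative name, not something from the answer above.

```python
from decimal import Decimal


def used_scale(text):
    """Number of fractional digits left after trailing zeros are removed."""
    # normalize() strips trailing zeros; a negative exponent then counts
    # the remaining digits to the right of the decimal point.
    exponent = Decimal(text).normalize().as_tuple().exponent
    return max(0, -exponent)


sample = ["123.12345000", "321.43210000", "5255.12340000", "5244.12345000"]
largest_scale = max(used_scale(v) for v in sample)
```

Values with no significant fractional digits, such as 100.00000000, yield 0 here.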
I like @Michael Fredrickson's answer better and am only posting this as an alternative for specific cases where the actual scale is unknown but is certain to be no more than 18:
SELECT LEN(CAST(CAST(REVERSE(Col1) AS float) AS bigint))
Please note that, although there are two explicit CAST calls here, the query actually performs two more implicit conversions:
As the argument of REVERSE, Col1 is converted to a string.
The bigint is cast as a string before being used as the argument of LEN.
SELECT
MAX(CHAR_LENGTH(
SUBSTRING(column_name::text FROM '\.(\d*?)0*$')
)) AS max_scale
FROM table_name;
*? is the non-greedy version of *, so \d*? catches all digits after the decimal point except trailing zeros.
The pattern contains a pair of parentheses, so the portion of the text that matched the first parenthesized subexpression (that is \d*?) is returned.
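The same pattern can be tried in Python's `re` module, which also supports the non-greedy `*?` quantifier. This sketch only demonstrates how the capture group behaves. The helper name `significant_fraction` is made up for the example.

```python
import re

# Literal dot, then as few digits as possible, then any trailing zeros
# up to the end of the string.
PATTERN = re.compile(r"\.(\d*?)0*$")


def significant_fraction(text):
    """Return the fractional digits without trailing zeros, or None if there is no dot."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


sample = ["123.12345000", "321.43210000", "5255.12340000", "5244.12345000"]
max_scale = max(len(significant_fraction(v)) for v in sample)
```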
References:
https://www.postgresql.org/docs/9.6/static/sql-createcast.html
https://www.postgresql.org/docs/9.6/static/functions-matching.html
Note this will scan the entire table:
SELECT TOP 1 [Col1]
FROM [Table]
ORDER BY LEN(PARSENAME(CAST([Col1] AS VARCHAR(40)), 1)) DESC
We have a legacy table where one of the columns, which is part of a composite key, was filled manually with values:
code
------
'001'
'002'
'099'
etc.
Now we have a feature request for which we must know MAX(code) in order to give the user the next possible value. In the example above, the next value is '100'.
We experimented with this, but we still can't find any reasonable explanation of how the DB2 engine calculates that
MAX('001', '099', '576') is '576'
MAX('099', '99', 'www') is '99' and so on.
Any help or suggestion would be much appreciated!
You already have the answer for getting the maximum numeric value, so this answers the other part, regarding 'www', '099' and '99'.
The AS/400 uses EBCDIC to store values. This differs from ASCII in several ways; the most important for your purposes is that alphabetic characters come before digits, which is the opposite of ASCII.
So for your MAX(), the three strings are sorted and the highest EBCDIC value is used. In ascending order:
'www'
'099'
'99 '
As you can see, your '99' string is really '99 ', so it is higher than the one with the leading zero.
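The ordering can be checked with Python's built-in EBCDIC codec. The sketch below assumes code page 037 (`cp037`); the CCSID actually used on a given system may differ, although letters sort before digits in the common EBCDIC code pages.

```python
# '99' is shown blank-padded to three characters, as in the list above.
values = ["099", "99 ", "www"]

# Python compares str values by code point, which matches ASCII here.
ascii_order = sorted(values)

# Compare by the bytes each string would have in EBCDIC (cp037 assumed).
ebcdic_order = sorted(values, key=lambda s: s.encode("cp037"))
ebcdic_max = ebcdic_order[-1]
```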
Cast it to int before applying max()
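For the numeric maximum, filter out the non-numeric values and cast to a numeric type for aggregation. Translating every digit to a blank and comparing the result with a blank keeps only values made up of digits (and padding blanks):
SELECT MAX(INT(FLD1))
WHERE FLD1 <> ' '
AND TRANSLATE(FLD1, '          ', '0123456789') = ' '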
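The whole "filter, cast, take the maximum, add one" step, including the zero padding needed for the next code, could look like this in application code. It is a hedged sketch: the function name `next_code`, the fixed width of 3 and the choice to start at '001' for an empty table are assumptions for illustration, not part of the question.

```python
def next_code(codes, width=3):
    """Return the next zero-padded code after the largest all-digit code."""
    cleaned = (code.strip() for code in codes)
    # Keep only values that consist entirely of ASCII digits.
    numeric = [int(code) for code in cleaned if code.isascii() and code.isdigit()]
    if not numeric:
        return "1".zfill(width)  # assumption: start at 001 when nothing qualifies
    return str(max(numeric) + 1).zfill(width)


from_sample = next_code(["001", "002", "099"])
with_noise = next_code(["099", "99 ", "www"])
```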
SQL Reference: TRANSLATE
And the reasonable explanation:
SQL Reference: MAX
MAX works correctly for the type you have defined. When you want the maximum of integer values, convert the values to integers before calling MAX. But I see you are mixing in a string like 'www'; how do you imagine that should work?
Filter for integer-only values, cast them to int, and call MAX. This is not a well-designed solution, but looking at your problem I think it is enough.
Sharing a solution for PostgreSQL that worked for me.
Suppose temporary_id is of type character in the database. The query below converts the char values to int before aggregating:
SELECT MAX(CAST (temporary_id AS Integer)) FROM temporary
WHERE temporary_id IS NOT NULL
I applied the MAX() aggregate function because my requirement called for it. Without it, the cast works the same way on each row.