How to get highest alphanumeric value in Postgres - sql

I have the following table of inventory stock symbols and want to fetch the highest alphanumeric value which is AR-JS-20. When I say "highest" I mean that the letter order is sorted first and then numbers are factored in, so AR-JS-20 is higher than AL-JS-20.
BTW, I don't want to split anything into parts because it is unknown what symbols vendors will send me in the future.
I simply want an alphanumeric sort, like sorting a computer directory by name, where dashes, underscores, asterisks, etc. come first, then numbers, and letters last, with cascading priority: the first character in the symbol carries the most weight, then the second character, and so on.
NOTE: The question has been edited so some of the answers below no longer apply.
AL-JS-20
AR-JS-20
AR-JS-9
AB-JS-8
AA-JS-1
1A-LM-30
2BA2-1
45HT
So ideally if this table was sorted to my requirements it would look like this
AR-JS-20
AR-JS-9
AB-JS-8
AL-JS-20
AA-JS-1
45HT
2BA2-1
1A-LM-30
However, when I use this query:
select max(symbol) from stock
I get:
AR-JS-9
but what I want to get is: AR-JS-20
I also tried:
select max(symbol::bytea) from stock
But this triggers error:
function max(bytea) does not exist

There is a dedicated tag for this group of problems: natural-sort (I added it now.)
Ideally, you store the string part and the numeric part in separate columns.
While stuck with your unfortunate symbols ...
If your symbols are as regular as the sample suggests, plain left() and split_part() can do the job:
SELECT symbol
FROM stock
ORDER BY left(symbol, 5) DESC NULLS LAST
, split_part(symbol, '-', 3)::int DESC NULLS LAST
LIMIT 1;
Or, if at least the two dashes (three parts) are a given:
...
ORDER BY split_part(symbol, '-', 1) DESC NULLS LAST
, split_part(symbol, '-', 2) DESC NULLS LAST
, split_part(symbol, '-', 3)::int DESC NULLS LAST
LIMIT 1
See:
Split comma separated column data into additional columns
Or, if the format is not as rigid: regular expression functions are more versatile, but also more expensive:
...
ORDER BY substring(symbol, '^\D+') DESC NULLS LAST
, substring(symbol, '\d+$')::int DESC NULLS LAST
LIMIT 1;
^ ... anchor to the start of the string
$ ... anchor to the end of the string
\D ... class shorthand for non-digits
\d ... class shorthand for digits
Taking only (trailing) digits, we can safely cast to integer (assuming numbers < 2^31), and sort accordingly.
Add NULLS LAST if any part can be missing, or the column can be NULL.
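To make the regular-expression ordering above easier to follow, here is a minimal Python sketch of the same idea: the leading non-digits form the first sort key and the trailing digits, cast to an integer, form the second. The helper name sort_key and the way missing parts are ranked lowest (to mimic DESC NULLS LAST) are assumptions made for illustration. Python compares strings by code point, whereas Postgres uses the column's collation, so results can differ for other data.

```python
import re

# Sample symbols taken from the question.
symbols = [
    "AL-JS-20", "AR-JS-20", "AR-JS-9", "AB-JS-8",
    "AA-JS-1", "1A-LM-30", "2BA2-1", "45HT",
]

def sort_key(symbol):
    """Hypothetical helper: (prefix key, number key) for one symbol.

    A missing part gets a False flag so it ranks lowest, which mimics
    DESC NULLS LAST when the maximum is taken.
    """
    prefix = re.search(r"^\D+", symbol)   # leading non-digits
    number = re.search(r"\d+$", symbol)   # trailing digits
    prefix_key = (prefix is not None, prefix.group() if prefix else "")
    number_key = (number is not None, int(number.group()) if number else 0)
    return prefix_key, number_key

# Equivalent of ORDER BY ... DESC NULLS LAST LIMIT 1
top = max(symbols, key=sort_key)
print(top)
```

For the sample data this picks AR-JS-20, because 20 is compared as a number rather than as the text "20".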

Specify a custom ORDER BY that uses the symbol with its digits (and any adjacent dashes) removed as the first sort key and the remaining digits, cast to int, as the second, then take the first row:
select stock_code
from mytable
order by regexp_replace(stock_code, '-?[0-9]+-?', '', 'g'), regexp_replace(stock_code, '[^0-9]', '', 'g')::int
limit 1
See live demo.
This works for numbers at both start and end of code:
regexp_replace(stock_code, '-?[0-9]+-?', '', 'g') "deletes" digits and any adjacent dashes
regexp_replace(stock_code, '[^0-9]', '', 'g') "deletes" all non-digits
The 'g' flag is needed because, without it, Postgres replaces only the first match.

Related

Oracle SQL replace

Unfortunately I can't change the field type.
I would like to REPLACE a , with a . in a Typ=1 field (e.g. 4,37 should end up as 4.37). I've tried CAST(), TO_NUMBER, TO_CHAR and more, but I keep getting ORA-01722 and it's driving me crazy. Why does it have to be a number for replacing?
SELECT REPLACE(fmm, ',', '.') fmm FROM ...
Or do you have a better idea how I could do it without REPLACE?
UPDATE: it seems he has a problem with:
ORDER BY TO_NUMBER(fmm, '99D99')
So it seems he is taking the replaced version, so with . of fmm, but why ????
Try removing the commas with replace(nvl(nr,0),',',''), and then format the result with to_char():
with tab as
(
select '1,234,567' as nr
from dual
)
select to_char(
replace(nvl(nr,0),',','')
,'fm999G999G990','NLS_NUMERIC_CHARACTERS = '',.''')
as "Number"
from tab;
Number
----------
1.234.567
Demo
Passing a string (varchar2) value into the replace function cannot throw an ORA-01722.
it seems he has a problem with:
ORDER BY TO_NUMBER(fmm, '99D99')
If that's complaining when fmm is '4,37' then you could add a replace() call inside the to_number(), but it's simpler/clearer to specify NLS_NUMERIC_CHARACTERS as part of the conversion, so it knows that D is represented by a comma, and doesn't rely on the session settings:
order by to_number(fmm, '99D99', 'NLS_NUMERIC_CHARACTERS=,.')
If your table has a mix of values with period and comma decimal separators then you need to fix the data - this is the main reason you should not be storing numbers as strings in the first place. If you can't fix the data then you can work around it with replace(), but it isn't ideal; you can then use a fixed period as the decimal character:
order by to_number(replace(fmm, ',', '.'), '99.99');
or still specify NLS_NUMERIC_CHARACTERS:
order by to_number(replace(fmm, ',', '.'), '99D99', 'NLS_NUMERIC_CHARACTERS=.,')
Either way, that is 'normalising' all the strings to have only periods, with no commas, and that allows them all to be converted.
db<>fiddle
what I don't understand, if I do some changes in the SELECT to a field, how can it affect the ORDER BY section? fmm should still remain 4,37 and not 4.37 in the ORDER BY section, shouldn't it?
No, because you gave the column expression REPLACE(fmm, ',', '.') the alias fmm, which is the same as the original column name; and the order-by clause is the only place column aliases are allowed, where it masks the original table column. When you do:
ORDER BY TO_NUMBER(fmm, '99D99')
the fmm in that conversion is the value of the column expression aliased as fmm, and not the original table column.
You can still access the table column, but to do so you have to prefix it with the table name or alias, as the column expression from the select list takes precedence (which is implied but not stated clearly in the docs):
expr orders rows based on their value for expr. The expression is based on columns in the select list or columns in the tables, views, or materialized views in the FROM clause.
So you can either explicitly refer to the table column via the table name or, here, an alias:
SELECT REPLACE(t.fmm, ',', '.') fmm
FROM your_table t
ORDER BY TO_NUMBER(t.fmm, '99D99')
though you still shouldn't rely on the session NLS settings really, so can/should still specify the NLS option to match the table column format:
SELECT REPLACE(t.fmm, ',', '.') fmm
FROM your_table t
ORDER BY TO_NUMBER(t.fmm, '99D99', 'NLS_NUMERIC_CHARACTERS=,.')
or use the replaced value and specify the NLS option for that (notice the option itself is different):
SELECT REPLACE(fmm, ',', '.') fmm
FROM your_table
ORDER BY TO_NUMBER(fmm, '99D99', 'NLS_NUMERIC_CHARACTERS=.,')
db<>fiddle
If your table has a mix of period and comma values then you need to use the column-alias version so it is consistent when it tries to convert. If you only have commas then you can use either. (But again, you shouldn't be storing numbers as strings in the first place...)
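As a language-neutral illustration of the normalising step discussed in this answer, the following Python sketch replaces a decimal comma with a period before converting, so that values written either way sort numerically. The sample values and the helper name to_decimal are made up for the example; this is only a sketch of the idea, not of Oracle's NLS handling.

```python
from decimal import Decimal

# Hypothetical sample data: a mix of comma and period decimal separators.
values = ["4,37", "10,5", "2.25"]

def to_decimal(text):
    """Normalise the decimal separator to a period, then convert."""
    return Decimal(text.replace(",", "."))

# Sorting the raw strings compares character by character ...
as_text = sorted(values)
# ... while sorting on the converted value compares the numbers.
as_numbers = sorted(values, key=to_decimal)

print(as_text)
print(as_numbers)
```

The original strings are kept in the output; only the sort key is converted, which mirrors using the conversion in ORDER BY while selecting the column unchanged.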

oracle get the char of the longest charsequence from a "characterlist" in select

I think that would be a good question :)
So, I have a characterlist like '111122333334458888888888'
and I want to get only the char of the longest sequence (it's '8' in that example).
It's a maxsearch of course, but I need to do it in the SELECT statement.
You can try something like this:
select character
from
(
select character, count(1)
from
(
select substr('111122333334458888888888', level, 1) as character
from dual
connect by level <= length('111122333334458888888888')
)
group by character
order by 2 desc
)
where rownum = 1
This uses the inner query to split the starting string into single characters, then counts the occurrences of every character and orders by that count, descending, to get the character with the greatest number of occurrences.
You can rewrite this in different ways, for example with analytic functions; I believe this is one of the most readable.
If more than one character has the maximum number of occurrences, this returns one of them, unpredictably; if you need to choose, for example, the minimum character, you can edit the ORDER BY clause accordingly.
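Note that the query counts total occurrences of each character, which for the sample string happens to match the longest consecutive run. If "longest sequence" is meant strictly as the longest unbroken run, the two can differ. The Python sketch below contrasts both readings; the function names are hypothetical and the second test string is invented to show the difference.

```python
from collections import Counter
from itertools import groupby

def most_frequent_char(text):
    """Character with the most occurrences overall (what the query computes)."""
    return Counter(text).most_common(1)[0][0]

def longest_run_char(text):
    """Character forming the longest unbroken run of repeats."""
    runs = ((char, sum(1 for _ in group)) for char, group in groupby(text))
    return max(runs, key=lambda run: run[1])[0]

sample = "111122333334458888888888"
print(most_frequent_char(sample), longest_run_char(sample))

# An invented string where the two readings disagree:
# 'a' occurs most often, but 'c' forms the longest run.
mixed = "ababacc"
print(most_frequent_char(mixed), longest_run_char(mixed))
```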

Not selecting Max Value

I am using the query
select max(entry_no) from tbl_Invmaster
but it's giving me 9 as the answer; however, the max value is 10.
You probably have the numbers in a VARCHAR column. Ordering in those fields is by alphabetical order. That way 9 is bigger than 10. Explanation from the link:
To determine which of two strings comes first in alphabetical order, their first letters are compared. If they differ, then the string whose first letter comes earlier in the alphabet is the one which comes first in alphabetical order. If the first letters are the same, then the second letters are compared, and so on. If a position is reached where one string has no more letters to compare while the other does, then the first (shorter) string is deemed to come first in alphabetical order.
Your best solution is not to store numbers in VARCHAR columns but instead use the appropriate type, eg INT. That way your query would return the correct result.
If that is not an option for you, you could CAST the column to an integer type. Eg in SQL Server you would write:
select max(CAST(entry_no AS INT)) from tbl_Invmaster
select max( to_number( entry_no )) from tbl_invmaster
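The difference between comparing as text and comparing as numbers can be reproduced outside the database. This is a small Python sketch with made-up values standing in for entry_no; Python compares strings by code point rather than by a database collation, but for plain digits the effect is the same.

```python
# Hypothetical values as they would be stored in a VARCHAR column.
entries = ["1", "2", "9", "10"]

# Compared as text: "9" beats "10" because '9' > '1' on the first character.
text_max = max(entries)

# Compared as numbers: convert each value first, like CAST(entry_no AS INT).
numeric_max = max(entries, key=int)

print(text_max, numeric_max)
```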

SQL MAX function and strings

I have a column nr that contains strings in the format of 12345-12345. The numbers before and after the dash can be of any length. I would like to get the maximum value for nr taking into account only the part after the dash. I tried
SELECT MAX(nr) AS max_nr FROM table WHERE (nr LIKE '12345-%')
However, this works only for values < 10 (i.e. 12345-9 would be returned as max even if 12345-10 exists). I thought of removing the dash and doing a type conversion:
SELECT MAX(REPLACE(nr, '-', '')::int) AS max_nr FROM table WHERE (nr LIKE '12345-%')
However, this of course returns the result without the dash. What would be the best way to get the maximum value while including the dash and the number before the dash in the result?
PostgreSQL 9.1
I'm no expert in Postgres, but you can use regexp_replace (e.g. regexp_replace('foobarbaz', 'b..', 'X')) to extract the string after the dash and then convert it to int. The following query retrieves only one row from your table, an nr matching 12345-%, with rows sorted by the number after the dash in descending order (largest number first).
SELECT nr
FROM table WHERE (nr LIKE '12345-%')
ORDER BY regexp_replace(nr, '^\d+-', '')::integer DESC
LIMIT 1
The regular expression above removes the leading digits and the dash, leaving only the last set of digits. For example, 54352-12345 would become 12345.
Official documentation.
And here is a SQL Fiddle illustrating its use.
Use substring function with position function:
http://www.postgresql.org/docs/8.1/static/functions-string.html
to extract number after dash, and then use this value in MAX function as you have in your code now. You can also try to_number function.
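It will look similar to this (note the + 1, so that the dash itself is not included and the value is not read as a negative number):
MAX(substring(nr from position('-' in nr) + 1)::int)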
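The goal stated in the question, ordering by the number after the dash while still returning the whole value, can be sketched in Python as follows. The sample values and the helper name number_after_dash are assumptions for illustration only.

```python
# Hypothetical sample values in the 12345-12345 format from the question.
values = ["12345-9", "12345-10", "12345-2", "999-50"]

def number_after_dash(nr):
    """Integer value of the part after the first dash."""
    return int(nr.split("-", 1)[1])

# Equivalent of WHERE nr LIKE '12345-%'
matching = [nr for nr in values if nr.startswith("12345-")]

# Rank by the numeric suffix but keep the full string in the result.
max_nr = max(matching, key=number_after_dash)
print(max_nr)
```

Using the number only as the ranking key, instead of as the returned value, is what keeps the dash and the leading part in the result.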

Order By clause not sorting properly

This is one of the most interesting issues that I have come across in quite some time. I have a scalar function defined in SQL Server 2008 whose return type is varchar(max).
This is the query:
Select dbo.GetJurisdictionsSubscribed(u.UserID) as 'Jurisdiction' From Users u ORDER BY Jurisdiction desc
Could anybody explain why AAAA... is the 2nd record in the result set? I am doing a descending sort, so AAA... should appear last. If I change the query to
Jurisdiction asc
AAA goes 2nd last in the list instead of the 1st record.
This is the screenshot of the resultset: http://i48.tinypic.com/23j5vzq.jpg
Am I missing something?
That is the correct sort order. You have spaces. You must read Case Sensitive Collation Order.
Because, as you can see in your screenshot, there is a white space in the other rows before the word 'Wise' (and white space sorts before 'A', so those rows come after AAAA in a descending sort).
You can left-trim these spaces with:
ORDER BY ltrim( Jurisdiction ) desc
Notice the leading white spaces; try:
SELECT ...
FROM ...
ORDER BY LTRIM(Jurisdiction) desc
LTRIM would be fine.
It's a bit hard to tell on the screenshot, but can you check the length of the values because I think some of them have a leading space. If so then they would be sorted correctly.
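The effect of a leading space on the ordering can be shown with a short Python sketch. The row values are invented, since the real ones are only visible in the screenshot. Python orders strings by code point, where a space comes before any letter; SQL Server uses the column's collation instead, so this is an analogy rather than an exact reproduction.

```python
# Invented rows; two of them start with a space, as suspected in the answers.
rows = ["Texas", " Wisconsin", "AAAA", " Wise County"]

# Descending sort on the raw values: rows with a leading space drop below
# "AAAA", because a space sorts before any letter.
raw_desc = sorted(rows, reverse=True)

# Descending sort on the left-trimmed values, like ORDER BY LTRIM(...) DESC.
trimmed_desc = sorted(rows, key=str.lstrip, reverse=True)

print(raw_desc)
print(trimmed_desc)
```

With the raw values, AAAA lands second, matching what the question describes; after trimming it moves to the end of the descending list.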