SQL Using ORDER BY with UNION doesn't sort numbers correctly (e.g. 10 before 8) - sql

I've tried looking for the answer, and read many threads on this site, but still can't find the answer I'm looking for.
I am trying to sort a series of numbers that are real look-ups and also one * which isn't, I can sort fine when I don't need to add the fake * but not after.
I have tried
SELECT DISTINCT MasterTable.ClassName, MasterTable.ClassYear
FROM MasterTable
UNION ALL
SELECT DISTINCT "*" As [ClassName], "1" As [ClassYear]
FROM MasterTable
ORDER BY MasterTable.ClassYear;
And
SELECT DISTINCT MasterTable.ClassName, MasterTable.ClassYear
FROM (
SELECT DISTINCT MasterTable.ClassName, MasterTable.ClassYear FROM MasterTable
UNION
SELECT DISTINCT "*" As [ClassName], "1" As [ClassYear] FROM MasterTable
)
ORDER BY MasterTable.ClassYear;
But both return the ClassYear as 1, 10, 12, 8... rather than 1, 8, 10, 12....
Any help would be much appreciated,
Thanks :)

MasterTable.ClassYear is varchar so it will sort as a string.
You'll have to convert it in the query or fix the column type.
For the 2nd clause, you also need only:
SELECT "*" As [ClassName], "1" As [ClassYear] --No FROM MasterTable
However, you can "cheat" and do this. Here 1 will be int and will force a conversion to int from the 1st clause because
SELECT "*" As [ClassName], 1 As [ClassYear] --force int. And fixed on edit
UNION ALL
SELECT DISTINCT MasterTable.ClassName, MasterTable.ClassYear
FROM MasterTable
ORDER BY ClassYear; --no table ref needed

It's property sorting those values as strings. If you want them in numerical order, try something like Cast(MasterTable.ClassYear AS int), either in the select or in the order by, or both, depending on how you end up structuring your query.
And instead of SELECT ..., "1" As [ClassYear], write SELECT ..., 1 As [ClassYear].

You are returning the year as a string, not a number. That means that it's sorted as text, not numerically.
Either return the year as a number, or convert the value into a number when sorting it.

Related

How to easily remove count=1 on aliased field in SQL?

I have the following data in a table:
GROUP1|FIELD
Z_12TXT|111
Z_2TXT|222
Z_31TBT|333
Z_4TXT|444
Z_52TNT|555
Z_6TNT|666
And I engineer in a field that removes the leading numbers after the '_'
GROUP1|GROUP_ALIAS|FIELD
Z_12TXT|Z_TXT|111
Z_2TXT|Z_TXT|222
Z_31TBT|Z_TBT|333 <- to be removed
Z_4TXT|Z_TXT|444
Z_52TNT|Z_TNT|555
Z_6TNT|Z_TNT|666
How can I easily query the original table for only GROUP's that correspond to GROUP_ALIASES with only one Distinct FIELD in it?
Desired result:
GROUP1|GROUP_ALIAS|FIELD
Z_12TXT|Z_TXT|111
Z_2TXT|Z_TXT|222
Z_4TXT|Z_TXT|444
Z_52TNT|Z_TNT|555
Z_6TNT|Z_TNT|666
This is how I get all the GROUP_ALIAS's I don't want:
SELECT GROUP_ALIAS
FROM
(SELECT
GROUP1,FIELD,
case when instr(GROUP1, '_') = 2
then
substr(GROUP1, 1, 2) ||
ltrim(substr(GROUP1, 3), '0123456789')
else
substr(GROUP1 , 1, 1) ||
ltrim(substr(GROUP1, 2), '0123456789')
end GROUP_ALIAS
FROM MY_TABLE
GROUP BY GROUP_ALIAS
HAVING COUNT(FIELD)=1
Probably I could make the engineered field a second time simply on the original table and check that it isn't in the result from the latter, but want to avoid so much nesting. I don't know how to partition or do anything more sophisticated on my case statement making this engineered field, though.
UPDATE
Thanks for all the great replies below. Something about the SQL used must differ from what I thought because I'm getting info like:
GROUP1|GROUP_ALIAS|FIELD
111,222|,111|111
111,222|,222|222
etc.
Not sure why since the solutions work on my unabstracted data in db-fiddle. If anyone can spot what db it's actually using that would help but I'll also check on my end.
Here is one way, using analytic count. If you are not familiar with the with clause, read up on it - it's a very neat way to make your code readable. The way I declare column names in the with clause works since Oracle 11.2; if your version is older than that, the code needs to be re-written just slightly.
I also computed the "engineered field" in a more compact way. Use whatever you need to.
I used sample_data for the table name; adapt as needed.
with
add_alias (group1, group_alias, field) as (
select group1,
substr(group1, 1, instr(group1, '_')) ||
ltrim(substr(group1, instr(group1, '_') + 1), '0123456789'),
field
from sample_data
)
, add_counts (group1, group_alias, field, ct) as (
select group1, group_alias, field, count(*) over (partition by group_alias)
from add_alias
)
select group1, group_alias, field
from add_counts
where ct > 1
;
With Oracle you can use REGEXP_REPLACE and analytic functions:
select Group1, group_alias, field
from (select group1, REGEXP_REPLACE(group1,'_\d+','_') group_alias, field,
count(*) over (PARTITION BY REGEXP_REPLACE(group1,'_\d+','_')) as count from test) a
where count > 1
db-fiddle

Concatenate & Trim String

Can anyone help me, I have a problem regarding on how can I get the below result of data. refer to below sample data. So the logic for this is first I want delete the letters before the number and if i get that same thing goes on , I will delete the numbers before the letter so I can get my desired result.
Table:
SALV3000640PIX32BLU
SALV3334470A9CARBONGRY
TP3000620PIXL128BLK
Desired Output:
PIX32BLU
A9CARBONGRY
PIXL128BLK
You need to use a combination of the SUBSTRING and PATINDEX Functions
SELECT
SUBSTRING(SUBSTRING(fielda,PATINDEX('%[^a-z]%',fielda),99),PATINDEX('%[^0-9]%',SUBSTRING(fielda,PATINDEX('%[^a-z]%',fielda),99)),99) AS youroutput
FROM yourtable
Input
yourtable
fielda
SALV3000640PIX32BLU
SALV3334470A9CARBONGRY
TP3000620PIXL128BLK
Output
youroutput
PIX32BLU
A9CARBONGRY
PIXL128BLK
SQL Fiddle:http://sqlfiddle.com/#!6/5722b6/29/0
To do this you can use
PATINDEX('%[0-9]%',FieldName)
which will give you the position of the first number, then trim off any letters before this using SUBSTRING or other string functions. (You need to trim away the first letters before continuing with the next step because unlike CHARINDEX there is no starting point parameter in the PATINDEX function).
Then on the remaining string use
PATINDEX('%[a-z]%',FieldName)
to find the position of the first letter in the remaining string. Now trim off the numbers in front using SUBSTRING etc.
You may find this other solution helpful
SQL to find first non-numeric character in a string
Try this it may helps you
;With cte (Data)
AS
(
SELECT 'SALV3000640PIX32BLU' UNION ALL
SELECT 'SALV3334470A9CARBONGRY' UNION ALL
SELECT 'SALV3334470A9CARBONGRY' UNION ALL
SELECT 'SALV3334470B9CARBONGRY' UNION ALL
SELECT 'SALV3334470D9CARBONGRY' UNION ALL
SELECT 'TP3000620PIXL128BLK'
)
SELECT * , CASE WHEN CHARINDEX('PIX',Data)>0 THEN SUBSTRING(Data,CHARINDEX('PIX',Data),LEN(Data))
WHEN CHARINDEX('A9C',Data)>0 THEN SUBSTRING(Data,CHARINDEX('A9C',Data),LEN(Data))
ELSE NULL END AS DesiredResult FROM cte
Result
Data DesiredResult
-------------------------------------
SALV3000640PIX32BLU PIX32BLU
SALV3334470A9CARBONGRY A9CARBONGRY
SALV3334470A9CARBONGRY A9CARBONGRY
SALV3334470B9CARBONGRY NULL
SALV3334470D9CARBONGRY NULL
TP3000620PIXL128BLK PIXL128BLK

Get group maxima from combined strings

I have a table with a column code containing multiple pieces of data like this:
001/2017/TT/000001
001/2017/TT/000002
001/2017/TN/000003
001/2017/TN/000001
001/2017/TN/000002
001/2016/TT/000001
001/2016/TT/000002
001/2016/TT/000001
002/2016/TT/000002
There are 4 items in 001/2016/TT/000001: 001, 2016, TT and 000001.
How can I extract the max for every group formed by the first 3 items? The result I want is this:
001/2017/TT/000003
001/2017/TN/000002
001/2016/TT/000002
002/2016/TT/000002
Edit
The subfield separator is /, and the length of subfields can vary.
I use PostgreSQL 9.3.
Obviously, you should normalize the table and split the combined string into 4 columns with proper data type. The function split_part() is the tool of choice if the separator '/' is constant in your string and the length of can vary.
CREATE TABLE tbl_better AS
SELECT split_part(code, '/', 1)::int AS col_1 -- better names?
, split_part(code, '/', 2)::int AS col_2
, split_part(code, '/', 3) AS col_3 -- text?
, split_part(code, '/', 4)::int AS col_4
FROM tbl_bad
ORDER BY 1,2,3,4 -- optionally cluster data.
Then the task is trivial:
SELECT col_1, col_2, col_3, max(col_4) AS max_nr
FROM tbl_better
GROUP BY 1, 2, 3;
Related:
Split comma separated column data into additional columns
Of course, you can do it on the fly, too. For varying subfield length you could use substring() with a regular expression like this:
SELECT max(substring(code, '([^/]*)$')) AS max_nr
FROM tbl_bad
GROUP BY substring(code, '^(.*)/');
Related (with basic explanation for regexp pattern):
Filter strings with regex before casting to numeric
Or to get only the complete string as result:
SELECT DISTINCT ON (substring(code, '^(.*)/'))
code
FROM tbl_bad
ORDER BY substring(code, '^(.*)/'), code DESC;
About DISTINCT ON:
Select first row in each GROUP BY group?
Be aware that data items cast to a suitable type may behave differently from their string representation. The max of 900001 and 1000001 is 900001 for text and 1000001 for integer ...
Use the LEFT and RIGHT functions.
SELECT MAX(RIGHT(code,6)) AS MAX_CODE
FROM yourtable
GROUP BY LEFT(code,12)
check this out, possible helpfull
select
distinct on (tab[4],tab[2]) tab[4],tab[3],tab[2],tab[1]
from
(
select
string_to_array(exe.x,'/') as tab,
exe.x
from
(
select
unnest
(
array
['001/2017/TT/000001',
'001/2017/TT/000002',
'001/2017/TN/000003',
'001/2017/TN/000001',
'001/2017/TN/000002',
'001/2016/TT/000001',
'001/2016/TT/000002',
'001/2016/TT/000001',
'002/2016/TT/000002']
) as x
) exe
) exe2
order by tab[4] desc,tab[2] desc,tab[3] desc;

Sort varchar datatype with numeric characters

SQL SERVER 2005
SQL Sorting :
Datatype varchar
Should sort by
1.aaaa
5.xx
11.bbbbbb
12
15.
how can i get this sorting order
Wrong
1.aaaa
11.bbbbbb
12
15.
5.xx
On Oracle, this would work.
SELECT
*
FROM
table
ORDER BY
to_number(regexp_substr(COLUMN,'^[0-9]+')),
regexp_substr(column,'\..*');
You could do this by calculating a column based on what's on the left hand side of the period('.').
However this method will be very difficult to make robust enough to use in a production system, unless you can make a lot of assertions about the content of the strings.
Also handling strings without periods could cause some grief
with r as (
select '1.aaaa' as string
union select '5.xx'
union select '11.bbbbbb'
union select '12'
union select '15.' )
select *
from r
order by
CONVERT(int, left(r.string, case when ( CHARINDEX('.', r.string)-1 < 1)
then LEN(r.string)
else CHARINDEX('.', r.string)-1 end )),
r.string
If all the entries have this form, you could split them into two parts and sort be these, for example like this:
ORDER BY
CONVERT(INT, SUBSTRING(fieldname, 1, CHARINDEX('.', fieldname))),
SUBSTRING(fieldname, CHARINDEX('.', fieldname) + 1, LEN(fieldname))
This should do a numeric sort on the part before the . and an alphanumeric sort for the part after the ., but may need some tuning, as I haven't actually tried it.
Another way (and faster) might be to create computed columns that contain the part before the . and after the . and sort by them.
A third way (if you can't create computed columns) could be to create a view over the table that has two additional columns with the respective parts of the field and then do the select on that view.

When is a virtual column using AS rendered in SQL?

I wasn't sure how to search for the answer:
select orderid, REGEXP_REPLACE (
orderid,
'^0+(.)',
'\1'
) as new_order_id
from orders
where length('new_order_id') < 6
This returns nothing. But I know the data is there. If I do:
select orderid, REGEXP_REPLACE (
orderid,
'^0+(.)',
'\1'
) as new_order_id
from orders
order by order_id asc
I get order ids like 1, 2, 3...
So how can I get back the ones that are less than six? Does the where not operation on my returned regexp_replace data after the dataset is returned. Oracle if it matters.
Also, I believe my query is knocking out all leading zeros and replacing it with nothing. Not sure what the \1 means. Yes, I copied it. I thought it was putting nothing there, which is what I want. Just truncate leading zeros.
Thanks.
In your query,
where length('new_order_id') < 6
compares the length of the literal string 'new_order_id', not the value of the field new_order_id.
Try removing the quotes:
where length(new_order_id) < 6
Try this:
select * from
(select orderid
, regexp_replace(orderid,'^0+(.)','\1') new_order_id
from orders)
where length(new_order_id) < 6;
You can avoid using regexp:
select orderid
, ltrim(orderid,'0') new_order_id
from orders
where length(ltrim(orderid,'0'))<6
order by 1;
The length of the string 'new_order_id' is never less than 6. Probably you will have to do the length(regexp_replace(...)) < 6 instead if oracle doesn't support using the output column name without quotes (I have no idea).