Conditional ORDER BY in SQL

I have some sample data that I want to sort. If User is 1, sort those rows by Views in descending order; otherwise, if User is not 1, sort normally.
I have written the SQL below, and I am getting the required result.
My question is: why and how does it work?
with data as (
select 2 as User, 1 as Views UNION ALL
select 1,3 UNION ALL
select 4,1 UNION ALL
select 1,5 UNION ALL
select 1,6 UNION ALL
select 2,6 UNION ALL
select 7,2 UNION ALL
select 8,3 UNION ALL
select 3,9
)
select ARRAY_AGG(struct(User,Views) order by if(User=1,1,0) desc ,Views desc )
from data
I am confused by if(User=1,1,0). If User=1, it returns 1. Is this 1 a column number? If it is a column number, then when User is not equal to 1 the value will be 0, which is not any column.
I looked into this and found that if I write if(User=1,100,0) desc, Views desc, I still get the correct result. That means the numbers in IF() are not column positions; otherwise 100 would produce an error, because there is no 100th column.
Can anyone explain how this works?

I think the example below is the simplest way to show what is happening here.
Consider this slightly modified and simplified example. I removed the aggregation to focus on the ordering only.
with data as (
select 2 as user, 1 as views union all
select 1,3 union all
select 4,1 union all
select 1,5 union all
select 1,6 union all
select 2,6 union all
select 7,2 union all
select 8,3 union all
select 3,9
)
select *, if(user=1,1,0) sort
from data
order by sort desc, views desc
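The output of the above lists the three user = 1 rows first (sort = 1), ordered by views descending, followed by all remaining rows (sort = 0), also ordered by views descending.
I don't think you have any doubts about why the result looks like this: it is straightforward. The if() returns a value computed for each row, and rows are ordered by that value; it is not a column position.
Now, if you use if(user=1,100,0) instead, the sort column holds 100 and 0 rather than 1 and 0.
The output is exactly the same in terms of ordering, because 100 still sorts above 0 in descending order, just as 1 does.
Finally, to streamline the query, users (or at least power users) take a shortcut: instead of introducing a sort column to use in ORDER BY, they move the expression into the ORDER BY itself.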
Hope this is clear now for you!
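To check the point about 1 versus 100 yourself, here is a minimal sketch. It is not the BigQuery query from the question: it assumes SQLite (through Python's built-in sqlite3 module) so it can run anywhere, uses CASE in place of IF(), renames the column to user_id, and adds user_id as a last tie-breaker so rows with equal views come back in a fixed order.

```python
import sqlite3

# Same sample rows as in the question: (user, views).
ROWS = [(2, 1), (1, 3), (4, 1), (1, 5), (1, 6), (2, 6), (7, 2), (8, 3), (3, 9)]


def ordered(flag_value):
    """Sort rows by a computed key: flag_value for user 1, 0 for everyone else."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (user_id INTEGER, views INTEGER)")
    conn.executemany("INSERT INTO data VALUES (?, ?)", ROWS)
    # The CASE expression is evaluated for every row and its result is used
    # as a sort key. It is never interpreted as a column position.
    # user_id at the end is an added tie-breaker (assumption, not in the question).
    query = """
        SELECT user_id, views
        FROM data
        ORDER BY CASE WHEN user_id = 1 THEN ? ELSE 0 END DESC,
                 views DESC,
                 user_id
    """
    result = conn.execute(query, (flag_value,)).fetchall()
    conn.close()
    return result


with_1 = ordered(1)
with_100 = ordered(100)
print(with_1)
print("same order with 1 and 100:", with_1 == with_100)
```

Any value greater than 0 would do in place of 1 or 100, because only the relative order of the key values matters.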

Related

Removing doubling lines

I have written a union query but I need to eliminate the lines that are duplicated (line 2 and 3 in the column 'kods') and leave only distinct values of column 'kods'. How can that be done?
You need to decide which of the id values to keep, using either min or max, and group by the remaining columns. You don't need DISTINCT, and you can use UNION ALL, since GROUP BY will perform the dedupe.
select kods, min(id) id, vards, uzvards from (
select kods, id, vards, uzvards
from dataset
union all
select kods, id, vards, uzvards
from dataset_2
)x
group by kods, vards, uzvards
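A small runnable sketch of the same idea, assuming SQLite via Python's sqlite3 module. The table and column names come from the answer above; the rows are made up for illustration, with kods 'B2' present in both tables under different ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset   (kods TEXT, id INTEGER, vards TEXT, uzvards TEXT);
    CREATE TABLE dataset_2 (kods TEXT, id INTEGER, vards TEXT, uzvards TEXT);
    -- Hypothetical rows: 'B2' appears in both tables with different ids.
    INSERT INTO dataset   VALUES ('A1', 1, 'Anna', 'Ozola'), ('B2', 2, 'Janis', 'Berzins');
    INSERT INTO dataset_2 VALUES ('B2', 7, 'Janis', 'Berzins'), ('C3', 8, 'Liga', 'Kalnina');
""")

# UNION ALL keeps every row; GROUP BY then collapses rows that share
# kods/vards/uzvards, and MIN(id) decides which id survives.
deduped = conn.execute("""
    SELECT kods, MIN(id) AS id, vards, uzvards
    FROM (
        SELECT kods, id, vards, uzvards FROM dataset
        UNION ALL
        SELECT kods, id, vards, uzvards FROM dataset_2
    ) AS x
    GROUP BY kods, vards, uzvards
    ORDER BY kods
""").fetchall()
conn.close()

for row in deduped:
    print(row)
```

Note that two rows with the same kods but different vards or uzvards would still both appear, because they fall into different groups.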

Recursive ORDER BY

I have a USERS table which is a membership matrix like below. Table is unique on ID, and each ID belongs to at least one group, but could belong to all 3.
SELECT 1 AS ID, 0 AS IS_A, 0 AS IS_B, 1 AS IS_C FROM DUAL UNION ALL
SELECT 2,0,1,0 FROM DUAL UNION ALL
SELECT 3,0,1,1 FROM DUAL UNION ALL
SELECT 4,1,1,0 FROM DUAL UNION ALL
SELECT 5,1,1,0 FROM DUAL UNION ALL
SELECT 6,1,1,1 FROM DUAL UNION ALL
SELECT 7,0,1,1 FROM DUAL UNION ALL
SELECT 8,0,0,1 FROM DUAL UNION ALL
SELECT 9,1,0,0 FROM DUAL UNION ALL
SELECT 10,1,0,1 FROM DUAL UNION ALL
SELECT 11,0,0,1 FROM DUAL UNION ALL
SELECT 12,0,1,1 FROM DUAL
The final goal is to SELECT randomly a sample of at least 4 users from A, 3 from B and 5 from C (just an example) but with exactly 10 distinct IDs (otherwise the solution is trivial; just SELECT *).
The focus is less to determine if it's possible at all, but more to attempt a best effort to maximize memberships.
The output is expected to be unique on ID.
I can only think of a procedural way to achieve this:
Take the first ID with MAX(IS_A+IS_B+IS_C)
Check if the quotas are reached
If, for example, we already have 4 users from A, then we'll continue with the next ID with MAX(IS_B+IS_C), completely ignoring any further contributions from IS_A column
If we have already achieved all quotas, revert back to taking MAX(IS_A+IS_B+IS_C) to get "bonus" points
Stop upon reaching the overall maximum of 10
In essence, we prioritize and incrementally take the ID that has the most memberships in groups that have not reached the quota
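To make the procedure concrete, here is a rough Python sketch of the steps above, run on the sample matrix. It is not Oracle SQL and not a solution to the question as asked. It also makes two assumptions that the steps do not spell out: ties are broken first by total memberships and then by lowest ID (so the result is deterministic rather than random), and the quotas are passed in as a tuple in A, B, C order.

```python
# Membership matrix from the question: ID -> (IS_A, IS_B, IS_C).
USERS = {
    1: (0, 0, 1), 2: (0, 1, 0), 3: (0, 1, 1), 4: (1, 1, 0),
    5: (1, 1, 0), 6: (1, 1, 1), 7: (0, 1, 1), 8: (0, 0, 1),
    9: (1, 0, 0), 10: (1, 0, 1), 11: (0, 0, 1), 12: (0, 1, 1),
}


def greedy_sample(users, quotas, limit):
    """Pick IDs one at a time, favouring groups whose quota is not yet met."""
    counts = [0] * len(quotas)
    remaining = dict(users)
    picked = []
    while remaining and len(picked) < limit:
        unmet = [i for i, quota in enumerate(quotas) if counts[i] < quota]
        # Once every quota is met, all groups count again ("bonus" points).
        active = unmet if unmet else list(range(len(quotas)))

        def priority(uid):
            flags = remaining[uid]
            # 1) memberships in active groups, 2) total memberships,
            # 3) lowest ID wins (tie-break rules 2 and 3 are assumptions).
            return (sum(flags[i] for i in active), sum(flags), -uid)

        best = max(remaining, key=priority)
        flags = remaining.pop(best)
        picked.append(best)
        counts = [c + f for c, f in zip(counts, flags)]
    return picked, counts


picked, counts = greedy_sample(USERS, quotas=(4, 3, 5), limit=10)
print("picked IDs:", picked)
print("members per group (A, B, C):", counts)
```

The key point the sketch shows is the one raised in the question: the priority of each row depends on the running counts, so it has to be recomputed after every pick.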
However, I can't figure out how to do this in Oracle SQL since the ORDER BY would depend on not just the current row's values, but also recursively on whether the earlier rows have filled up the respective quotas.
I've tried ROWNUM, ROW_NUMBER(), SUM(IS_A) OVER (ORDER BY ...), RECURSIVE CTE but to no avail. Best I have is
WITH CTE AS (
SELECT ID, IS_A, IS_B, IS_C
, ROW_NUMBER() OVER (ORDER BY IS_A+IS_B+IS_C DESC) AS RN
FROM USERS
)
, CTE2 AS (
SELECT CTE.*
, GREATEST(4 - SUM(IS_A) OVER (ORDER BY RN), 0.001) AS QUOTA_A --clip negatives to 0.001
, GREATEST(3 - SUM(IS_B) OVER (ORDER BY RN), 0.001) AS QUOTA_B --so that when all quotas are exhausted,
, GREATEST(5 - SUM(IS_C) OVER (ORDER BY RN), 0.001) AS QUOTA_C --we still prioritize those that contribute most number of concurrent memberships
FROM CTE
)
SELECT ID FROM CTE2
ORDER BY QUOTA_A*IS_A + QUOTA_B*IS_B + QUOTA_C*IS_C DESC
FETCH NEXT 10 ROWS ONLY
but it does not work because QUOTA_A is computed based on ORDER BY RN instead of recursively.
Thanks in advance!

Insert with select for multiple records

I have an insert statement with which I want to make 2 inserts. I have the following code:
INSERT INTO [dbo].[Licemb]
([Lic_Id],
[LicEmb_EmbTS],
[LicEmb_EmbOffset])
SELECT TOP 1
Lic_ID,
'00:00:00',
-7
FROM dbo.Lics
WHERE Org_ID = 2
ORDER BY NP_ID DESC
UNION ALL
SELECT TOP 1
Lic_ID,
'00:00:00',
-7
FROM dbo.Lics
WHERE Org_ID = 4
ORDER BY NP_ID DESC
However, I keep getting syntax errors, and I can't find a workaround after searching for a while.
Error:
Incorrect syntax near the keyword 'UNION'.
How can I modify this code so that I can use a single statement to make 2 inserts with selects?
Any help would be much appreciated.
You can only have one ORDER BY for the entire union statement.
If you need to order each select, you will need to wrap each one in a subquery and union the subqueries, like so:
INSERT INTO [dbo].[Licemb]
([Lic_Id],
[LicEmb_EmbTS],
[LicEmb_EmbOffset])
select id,daytime,embargo from (
SELECT TOP 1
Lic_ID AS id,
'00:00:00' AS daytime,
-7 AS embargo
FROM [NLASQL].dbo.Lics
WHERE Org_ID = 2
ORDER BY NP_ID DESC) AS org2 -- a derived table needs an alias in SQL Server
UNION ALL
select id,daytime,embargo from (
SELECT TOP 1
Lic_ID AS id,
'00:00:00' AS daytime,
-7 AS embargo
FROM [NLASQL].dbo.Lics
WHERE Org_ID = 4
ORDER BY NP_ID DESC) AS org4 -- a derived table needs an alias in SQL Server
This is not an ideal solution. I would ask why you need to order each set of data, and then approach the problem from that angle.
If you use UNION (ALL), there can only be one ORDER BY, namely after the last unioned query. This ORDER BY is applied to the combined result of all queries in the union.
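The subquery pattern can be tried out with the sketch below. It assumes SQLite through Python's sqlite3 module rather than SQL Server, so TOP 1 becomes LIMIT 1 and the schema prefixes are dropped; the table and column names follow the question, and the rows in Lics are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Lics   (Lic_ID INTEGER, Org_ID INTEGER, NP_ID INTEGER);
    CREATE TABLE Licemb (Lic_Id INTEGER, LicEmb_EmbTS TEXT, LicEmb_EmbOffset INTEGER);
    -- Hypothetical rows: two licences per organisation plus an unrelated one.
    INSERT INTO Lics VALUES (10, 2, 1), (11, 2, 5), (20, 4, 3), (21, 4, 2), (30, 9, 9);
""")

# Each branch sorts and limits inside its own subquery, so the outer
# UNION ALL never sees an ORDER BY in the middle of the statement.
conn.execute("""
    INSERT INTO Licemb (Lic_Id, LicEmb_EmbTS, LicEmb_EmbOffset)
    SELECT id, daytime, embargo FROM (
        SELECT Lic_ID AS id, '00:00:00' AS daytime, -7 AS embargo
        FROM Lics WHERE Org_ID = 2
        ORDER BY NP_ID DESC LIMIT 1
    ) AS org2
    UNION ALL
    SELECT id, daytime, embargo FROM (
        SELECT Lic_ID AS id, '00:00:00' AS daytime, -7 AS embargo
        FROM Lics WHERE Org_ID = 4
        ORDER BY NP_ID DESC LIMIT 1
    ) AS org4
""")

inserted = conn.execute("SELECT * FROM Licemb ORDER BY Lic_Id").fetchall()
conn.close()
print(inserted)
```

One statement inserts two rows: the licence with the highest NP_ID for each of the two organisations.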

I am trying to figure out logic to produce top 200 results for all categories in the IN command. (T-SQL)

This is the code I have so far:
select top 200 employees,phone_no,address,job_code
from employee
where code IN ('BA', 'QA', 'BI')
The result I am looking for is the top 200 results for BA, then the top 200 results for QA, and then the top 200 results for BI, so it should return 600 records in total. The current code returns only 200. I could use UNION, but that gets lengthy. I am looking for an effective solution for this scenario.
While I think that Union is the appropriate way to solve this, you could probably also use Window Functions to get a row number partitioned by code and then restrict that in an outer query:
SELECT employees,
phone_no,
address,
job_code
FROM
(
select employees,
phone_no,
address,
job_code,
-- SQL Server requires ORDER BY here; (SELECT NULL) means "no particular order"
ROW_NUMBER() OVER (PARTITION BY code ORDER BY (SELECT NULL)) as code_rownumber
from employee
where code IN ('BA', 'QA', 'BI')
) subquery
WHERE subquery.code_rownumber <= 200
There's a good chance this will take longer than a SELECT TOP 200... UNION ALL SELECT TOP 200... UNION ALL SELECT TOP 200..., since row_number() has to be computed for every record, and only after that is the result limited to 200 per code.
Also, it's odd that you want the top 200 but don't specify a sort order. In the window function above, if you want to specify how the rows are sorted, you would do so like this:
ROW_NUMBER() OVER (PARTITION BY CODE ORDER BY job_code DESC) as code_rownumber
Here we sort by job_code in descending order when numbering the records within each code partition.
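Here is a scaled-down sketch of the window-function approach that can be run as is. It assumes SQLite 3.25 or later (for window functions) via Python's sqlite3 module, uses a top 2 instead of a top 200, and works on made-up employee rows with a numeric job_code; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (employees TEXT, code TEXT, job_code INTEGER)")
# Hypothetical sample rows; 'HR' is there to show the IN filter at work.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("ann", "BA", 3), ("bob", "BA", 1), ("cy", "BA", 2),
    ("dee", "QA", 5), ("eli", "QA", 4),
    ("fay", "BI", 9), ("gus", "BI", 8), ("hal", "BI", 7),
    ("ivy", "HR", 6),
])

top_n = 2  # stands in for the 200 in the question

# Number the rows inside each code partition, then keep the first top_n.
rows = conn.execute("""
    SELECT employees, code, job_code
    FROM (
        SELECT employees, code, job_code,
               ROW_NUMBER() OVER (PARTITION BY code
                                  ORDER BY job_code DESC) AS code_rownumber
        FROM employee
        WHERE code IN ('BA', 'QA', 'BI')
    ) AS subquery
    WHERE code_rownumber <= ?
    ORDER BY code, job_code DESC
""", (top_n,)).fetchall()
conn.close()

for row in rows:
    print(row)
```

Each code contributes at most top_n rows, so adding another code to the IN list needs no extra branch in the query.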
Try this: break your query into 3 subqueries and union the results.
For example, if I have to select the top 2 records for Category 'A' and 'B',
the query would be:
select top 2 ProductID1,Revenue1 from ProductTotals1 where Category IN ('A')
union
select top 2 ProductID1,Revenue1 from ProductTotals1 where Category IN ('B')
Hope this helps!

group by and union in oracle

I would like to union 2 queries, but I am facing an error in Oracle.
select count(*) as faultCount,
COMP_IDENTIFIER
from CORDYS_NCB_LOG
where AUDIT_CONTEXT='FAULT'
union
select count(*) as responseCount,
COMP_IDENTIFIER
from CORDYS_NCB_LOG
where AUDIT_CONTEXT='RESPONSE'
group by COMP_IDENTIFIER
order by responseCount;
The two queries run perfectly individually, but when using union, it says ORA-00904: "RESPONSECOUNT": invalid identifier
The error you've run into
In Oracle, it's best to always name each column in each UNION subquery the same way. In your case, the following should work:
select count(*) as theCount,
COMP_IDENTIFIER
from CORDYS_NCB_LOG
where AUDIT_CONTEXT='FAULT'
group by COMP_IDENTIFIER -- don't forget this
union
select count(*) as theCount,
COMP_IDENTIFIER
from CORDYS_NCB_LOG
where AUDIT_CONTEXT='RESPONSE'
group by COMP_IDENTIFIER
order by theCount;
See also:
Curious issue with Oracle UNION and ORDER BY
A good workaround is, of course, to use indexed column references as suggested by a_horse_with_no_name
The query you really wanted
From your comments, however, I suspect you wanted to write an entirely different query, namely:
select count(case AUDIT_CONTEXT when 'FAULT' then 1 end) as faultCount,
count(case AUDIT_CONTEXT when 'RESPONSE' then 1 end) as responseCount,
COMP_IDENTIFIER
from CORDYS_NCB_LOG
where AUDIT_CONTEXT in ('FAULT', 'RESPONSE')
group by COMP_IDENTIFIER
order by responseCount;
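The conditional-count query can be tried on a few rows with the sketch below. It assumes SQLite via Python's sqlite3 module instead of Oracle, uses invented log rows and component names, and adds COMP_IDENTIFIER as a second sort key so the output order is fixed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CORDYS_NCB_LOG (COMP_IDENTIFIER TEXT, AUDIT_CONTEXT TEXT)")
# Hypothetical log rows; 'REQUEST' shows that other contexts are filtered out.
conn.executemany("INSERT INTO CORDYS_NCB_LOG VALUES (?, ?)", [
    ("svcA", "FAULT"), ("svcA", "RESPONSE"), ("svcA", "RESPONSE"),
    ("svcB", "FAULT"), ("svcB", "FAULT"), ("svcB", "RESPONSE"),
    ("svcC", "REQUEST"), ("svcC", "RESPONSE"), ("svcC", "RESPONSE"), ("svcC", "RESPONSE"),
])

# COUNT ignores NULLs, and a CASE without ELSE yields NULL when nothing
# matches, so each COUNT only counts rows of one AUDIT_CONTEXT.
counts = conn.execute("""
    SELECT COUNT(CASE AUDIT_CONTEXT WHEN 'FAULT' THEN 1 END)    AS faultCount,
           COUNT(CASE AUDIT_CONTEXT WHEN 'RESPONSE' THEN 1 END) AS responseCount,
           COMP_IDENTIFIER
    FROM CORDYS_NCB_LOG
    WHERE AUDIT_CONTEXT IN ('FAULT', 'RESPONSE')
    GROUP BY COMP_IDENTIFIER
    ORDER BY responseCount, COMP_IDENTIFIER
""").fetchall()
conn.close()

for fault_count, response_count, comp in counts:
    print(comp, fault_count, response_count)
```

This gives one row per component with both counts side by side, which a UNION of two queries cannot do.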
The column names of a union are determined by the first query. So your first column is actually named FAULTCOUNT.
But the easiest way to sort the result of a union is to use the column index:
select ...
union
select ...
order by 1;
You most probably also want to use UNION ALL, which avoids removing duplicates between the two queries and is faster than a plain UNION.
In a UNION or UNION ALL query, the column names are determined by the column names of the first query.
In your query, replace "order by responseCount" with "order by faultCount".