SQL Order of execution when DISTINCT and AVG are present in the same SELECT statement


I'm using Oracle 11g. In what order will this SQL statement be "parsed"?
Assuming there are many duplicate values in col2:
SELECT DISTINCT col1, AVG(col2)
FROM table1
GROUP BY col1
Will it:
1. remove all the duplicate col1-col2 combinations, and then compute the average of col2 on this reduced result set, OR
2. do an aggregate average on col2 first, and then do a distinct on this resultset?

An example should be self-explanatory:
SQL> create table testDistinct (col1, col2) as (
  select 1, 100 from dual union all
  select 1, 10 from dual union all
  select 1, 10 from dual union all
  select 2, 50 from dual union all
  select 3, 1 from dual union all
  select 3, 100 from dual
);
Table created.
SQL> select col1, avg(col2)
  from testDistinct
  group by col1;
COL1 AVG(COL2)
---------- ----------
1 40
2 50
3 50,5
SQL> select DISTINCT col1, avg(col2)
  from testDistinct
  group by col1;
COL1 AVG(COL2)
---------- ----------
1 40
2 50
3 50,5
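The two results are identical: the aggregate is computed first, and DISTINCT is then applied to the already-grouped rows (your option 2), where it has nothing left to remove. Applying the GROUP BY over the result of a DISTINCT instead gives: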
SQL> select col1, avg(col2)
  from (
    select DISTINCT col1, col2
    from testDistinct
  )
  group by col1;
COL1 AVG(COL2)
---------- ----------
1 55
2 50
3 50,5
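If the first interpretation is what you actually want (averaging only the distinct col2 values within each col1 group), the inline view is not strictly needed. A minimal sketch, assuming the testDistinct table created above; the alias avg_distinct_col2 is only illustrative:

```sql
-- DISTINCT inside the aggregate drops duplicate col2 values per group
-- before averaging; no DISTINCT keyword is needed after SELECT.
SELECT col1,
       AVG(DISTINCT col2) AS avg_distinct_col2
FROM   testDistinct
GROUP  BY col1
ORDER  BY col1;
```

For the sample data this should give the same averages as the DISTINCT inline view (55, 50 and 50.5).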

Related

Compare before column in before row with next column in next row

My code is :
with x as
(
select 1 col from dual union all
select 2 col from dual union all
select 8 col from dual union all
select 4 col from dual union all
select 3 col from dual union all
select 2 col from dual
)
select col col1, col col2, col col3, rownum
from x
where col2.ROWNUM > col1.ROWNUM -1
and col2.ROWNUM > col3ROWNUM +1 ;
I want to compare col2.ROWNUM > col1.ROWNUM -1 and col2.ROWNUM > col3ROWNUM + 1 but that doesn't work and I got an error
ORA-01747: invalid user.table.column, table.column, or column specification
01747. 00000 - "invalid user.table.column, table.column, or column specification"
*Cause:
*Action:
Error at Line: 10 Column: 13
Please help me

It looks like you got something wrong.
The result of that CTE is a single-column table whose only column is named col. There are no other columns.
SQL> with x as (
  select 1 col from dual union all  -- in a UNION, all columns are
  select 2 col from dual union all  -- named by the column name(s) from the
  select 8 col from dual union all  -- first SELECT statement
  select 4 col from dual union all
  select 3 col from dual union all
  select 2 col from dual)
  select x.*, rownum
  from x;
COL ROWNUM
---------- ----------
1 1
2 2
8 3
4 4
3 5
2 6
6 rows selected.
SQL>
Therefore, the where clause you wrote doesn't make any sense. Perhaps you should explain what you really have, the rules that should be applied to the source data, and the result you'd like to get.
Based on text you put into the title:
compare before column in before row with next column in next row
maybe you'd be interested in lag and lead analytic functions which then let you compare values in adjacent rows (pay attention to NULL values; I didn't). For example:
SQL> with x as (
  select 1 col from dual union all
  select 2 col from dual union all
  select 8 col from dual union all
  select 4 col from dual union all
  select 3 col from dual union all
  select 2 col from dual
  ),
  temp as
  (select col,
          rownum as rn
   from x
  ),
  temp2 as
  (select
     rn,
     col as this_row,
     lag(col) over (order by rn) as previous_row,
     lead(col) over (order by rn) as next_row
   from temp
  )
  select this_row,
         previous_row,
         next_row,
         --
         case when this_row < previous_row then 'This < previous'
              when this_row < next_row then 'This < next'
              else 'something else'
         end as result
  from temp2
  order by rn;
Result:
THIS_ROW PREVIOUS_ROW NEXT_ROW RESULT
---------- ------------ ---------- ---------------
1 2 This < next
2 1 8 This < next
8 2 4 something else
4 8 3 This < previous
3 4 2 This < previous
2 3 This < previous
6 rows selected.
SQL>
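The first and last rows above have no previous or next row, so lag and lead return NULL there and the comparisons silently fall through to the next CASE branch. A minimal sketch of handling that explicitly, assuming an explicit ordering column rn instead of rownum (both rn and the labels 'no previous row' / 'no next row' are illustrative, not part of the original query):

```sql
with x (rn, col) as (
  select 1, 1 from dual union all
  select 2, 2 from dual union all
  select 3, 8 from dual union all
  select 4, 4 from dual union all
  select 5, 3 from dual union all
  select 6, 2 from dual
)
select col as this_row,
       lag(col)  over (order by rn) as previous_row,
       lead(col) over (order by rn) as next_row,
       case
         -- test for the missing neighbours first, before any comparison
         when lag(col)  over (order by rn) is null then 'no previous row'
         when lead(col) over (order by rn) is null then 'no next row'
         when col < lag(col)  over (order by rn)   then 'This < previous'
         when col < lead(col) over (order by rn)   then 'This < next'
         else 'something else'
       end as result
from x
order by rn;
```

With this variant the first row is labelled 'no previous row' and the last row 'no next row' instead of being compared against NULL.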

Use lead or lag functions. But, please, do not use rownum for such purposes.
ROWNUM simply reflects the order in which rows were returned by the query, and is of little use beyond limiting the number of rows fetched, e.g. where rownum<=1 to be certain you won't get a too_many_rows exception. Still, if a query does select the pseudo-column rownum, give it an alias so that you can use that value later on.
Moreover, what are col2.ROWNUM and col1.ROWNUM supposed to mean? That is not clear. col1 and col2 are columns, and columns have no rownum attribute.
Something that may help in the future for analytic queries:
https://oracle-base.com/articles/misc/lag-lead-analytic-functions
And, if you wish to get a working SQL, please explain clearly what you wish to achieve, for I haven't really understood what that code is intended to do.
A way you may use rownum without getting errors:
with x as (
select 1 col from dual union all
select 2 col from dual union all
select 8 col from dual union all
select 4 col from dual union all
select 3 col from dual union all
select 2 col from dual)
,x2 as (
select col col1 ,col col2, col col3 ,rownum rn
from x
)
select *
from x2
where rn between 2 and 3 -- ROWNUM itself cannot be used in such a condition; its alias rn can
;
Or, to be certain you get only the first row from a table satisfying a given condition:
select x_col1, x_col2 into v_col1, v_col2
from x_table
where ... --- logical conditions
and rownum<=1; --- rownum <= 1 avoids too_many_rows_exception if several rows satisfy the logical conditions given before
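Because ROWNUM is assigned before any ORDER BY of the same query block is applied, limiting rows in a meaningful order needs the sort to happen in an inline view first. A minimal sketch, reusing the sample CTE x from the question:

```sql
with x as (
  select 1 col from dual union all
  select 2 col from dual union all
  select 8 col from dual union all
  select 4 col from dual union all
  select 3 col from dual union all
  select 2 col from dual
)
select col
from  (select col from x order by col desc)  -- sort first ...
where rownum <= 2;                            -- ... then limit to two rows
-- Note: a condition such as "where rownum > 1" never returns rows,
-- because the first candidate row always gets rownum = 1 and is rejected.
```

For the sample data this returns the two largest values, 8 and 4.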

In Oracle, result sets have a non-deterministic order (i.e. they are unordered) unless you use an ORDER BY clause. Therefore, if you have a physical table, you need another column to provide the order (rather than relying on the ROWNUM pseudo-column, which may result in unexpected behaviour):
CREATE TABLE x (order_id, col) AS
SELECT 1, 1 FROM DUAL UNION ALL
SELECT 2, 2 FROM DUAL UNION ALL
SELECT 3, 8 FROM DUAL UNION ALL
SELECT 4, 4 FROM DUAL UNION ALL
SELECT 5, 3 FROM DUAL UNION ALL
SELECT 6, 2 FROM DUAL;
If you want to find the rows that go up in succession, then you can use MATCH_RECOGNIZE for row-by-row pattern matching:
SELECT *
FROM x
MATCH_RECOGNIZE(
ORDER BY order_id
MEASURES
any_row.col AS col1,
FIRST(up.col) AS col2,
LAST(up.col) AS col3,
FIRST(order_id) AS start_order_id
PATTERN ( any_row up{2} )
DEFINE up AS ( col > PREV(col) )
)
or the LEAD analytic function:
SELECT *
FROM (
SELECT col AS col1,
LEAD(col, 1) OVER (ORDER BY order_id) AS col2,
LEAD(col, 2) OVER (ORDER BY order_id) AS col3,
order_id
FROM x
)
WHERE col2 > col1
AND col3 > col2;
Which both output:
COL1 COL2 COL3 START_ORDER_ID
---- ---- ---- --------------
   1    2    8              1

It looks like you want to find the rows where the value of the column is bigger than in both the previous and the next row. If so, you could try this:
WITH
tbl (ID, COL) AS -- Sample data (ID column is just to preserve order of the rows)
(
Select 1, 1 From Dual Union All
Select 2, 2 From Dual Union All
Select 3, 8 From Dual Union All
Select 4, 4 From Dual Union All
Select 5, 3 From Dual Union All
Select 6, 2 From DUAL
)
Select ID, COL, CASE WHEN COL > LAG(COL, 1) OVER(Order By ID) And COL > LEAD(COL, 1) OVER(Order By ID) THEN 'YES' END "BIGGER_THAN_PREV_AND_NEXT"
From tbl
Order By ID
ID COL BIGGER_THAN_PREV_AND_NEXT
---------- ---------- -------------------------
1 1
2 2
3 8 YES
4 4
5 3
6 2
... with a bit different sample data this will find the other row(s) that satisfy the condition ...
WITH
tbl (ID, COL) AS -- Sample data (ID column is just to preserve order of the rows)
(
Select 1, 1 From Dual Union All
Select 2, 2 From Dual Union All
Select 3, 8 From Dual Union All
Select 4, 4 From Dual Union All
Select 5, 5 From Dual Union All -- value of COL changed from 3 to 5
Select 6, 2 From DUAL
)
Select ID, COL, CASE WHEN COL > LAG(COL, 1) OVER(Order By ID) And COL > LEAD(COL, 1) OVER(Order By ID) THEN 'YES' END "BIGGER_THAN_PREV_AND_NEXT"
From tbl
Order By ID
ID COL BIGGER_THAN_PREV_AND_NEXT
---------- ---------- -------------------------
1 1
2 2
3 8 YES
4 4
5 5 YES
6 2
Or without ID, using ROWNUM (as in your question), though this is not advisable:
WITH
tbl (COL) AS -- Sample data (without ID column)
(
Select 1 From Dual Union All
Select 2 From Dual Union All
Select 8 From Dual Union All
Select 4 From Dual Union All
Select 5 From Dual Union All
Select 2 From DUAL
)
Select COL, CASE WHEN COL > LAG(COL, 1) OVER(Order By ROWNUM) And COL > LEAD(COL, 1) OVER(Order By ROWNUM) THEN 'YES' END "BIGGER_THAN_PREV_AND_NEXT"
From tbl
COL BIGGER_THAN_PREV_AND_NEXT
---------- -------------------------
1
2
8 YES
4
5 YES
2
Any Order By clause added to the query could change the ROWNUM values and the result...

Oracle regex count multiple occurrences of a string surrounded by commas

This question is similar to a previous question of mine. I am looking for a way to count a character string in a comma-separated list of values in a column in an Oracle (11g) SQL database. For example, suppose I have the following data:
SELECT ('SL,PK') as col1 FROM dual
UNION ALL
SELECT ('SL,CR,SL') as col1 FROM dual
UNION ALL
SELECT ('PK,SL') as col1 FROM dual
UNION ALL
SELECT ('SL,SL') as col1 FROM dual
UNION ALL
SELECT ('SL') as col1 FROM dual
UNION ALL
SELECT ('PK') as col1 FROM dual
UNION ALL
SELECT ('PI,SL,PK') as col1 FROM dual
UNION ALL
SELECT ('PI,SL,SL,PK') as col1 FROM dual
UNION ALL
SELECT ('PI,SL,SL,SL,PK') as col1 FROM dual
UNION ALL
SELECT ('PI,SL,SL,SL,SL,PK') as col1 FROM dual
UNION ALL
SELECT ('PI,OSL,SL,PK') as col1 FROM dual
UNION ALL
SELECT ('PI,SL,SLR,PK') as col1 FROM dual
COL1
-----
SL,PK
SL,CR,SL
PK,SL
SL,SL
SL
PK
PI,SL,PK
PI,SL,SL,PK
PI,SL,SL,SL,PK
PI,SL,SL,SL,SL,PK
PI,OSL,SL,PK
PI,SL,SLR,PK
I am looking to count all occurrences of the substring 'SL', strictly (i.e. not including 'OSL', 'SLR', etc). The ideal result would look like this:
COL1 COL2
----- -----
SL,PK 1
SL,CR,SL 2
PK,SL 1
SL,SL 2
SL 1
PK 0
PI,SL,PK 1
PI,SL,SL,PK 2
PI,SL,SL,SL,PK 3
PI,SL,SL,SL,SL,PK 4
PI,OSL,SL,PK 1
PI,SL,SLR,PK 1
I can accomplish this using length and regexp_replace:
SELECT
col1,
(length(col1) - NVL(length(regexp_replace(regexp_replace(col1,'(^|,)(SL)($|,)','\1' || '' || '\3',1,0,'imn'),'(^|,)(SL)($|,)','\1' || '' || '\3',1,0,'imn')),0))/length('SL') as col2
FROM (
SELECT ('SL,PK') as col1 FROM dual
UNION ALL
SELECT ('SL,CR,SL') as col1 FROM dual
UNION ALL
SELECT ('PK,SL') as col1 FROM dual
UNION ALL
SELECT ('SL,SL') as col1 FROM dual
UNION ALL
SELECT ('SL') as col1 FROM dual
UNION ALL
SELECT ('PK') as col1 FROM dual
UNION ALL
SELECT ('PI,SL,PK') as col1 FROM dual
UNION ALL
SELECT ('PI,SL,SL,PK') as col1 FROM dual
UNION ALL
SELECT ('PI,SL,SL,SL,PK') as col1 FROM dual
UNION ALL
SELECT ('PI,SL,SL,SL,SL,PK') as col1 FROM dual
UNION ALL
SELECT ('PI,OSL,SL,PK') as col1 FROM dual
UNION ALL
SELECT ('PI,SL,SLR,PK') as col1 FROM dual
)
COL1 COL2
----- -----
SL,PK 1
SL,CR,SL 2
PK,SL 1
SL,SL 2
SL 1
PK 0
PI,SL,PK 1
PI,SL,SL,PK 2
PI,SL,SL,SL,PK 3
PI,SL,SL,SL,SL,PK 4
PI,OSL,SL,PK 1
PI,SL,SLR,PK 1
but was hoping for a more elegant solution, perhaps with regexp_count. I have achieved my goal successfully in other regex implementations that have the word boundary \b construct available (with \bSL\b), but have not found a solution for Oracle's regex.

You can use regexp_count() if you hack the string:
select col1, regexp_count(replace(col1, ',', ',,'), '(^|\W)SL(\W|$)')
This doubles the delimiter so the first match doesn't eat it up -- getting around the underlying issue which is that Oracle regular expressions do not support look-ahead.
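To see the effect of doubling the delimiter, the two variants can be compared side by side. This is a small sketch, not the answer's exact query: it uses a literal comma in the pattern instead of \W, and the names sample_data, plain_count and doubled_count are made up for the illustration.

```sql
WITH sample_data AS (
  SELECT 'PI,SL,SL,PK'  AS col1 FROM dual UNION ALL
  SELECT 'PI,SL,SLR,PK' AS col1 FROM dual
)
SELECT col1,
       -- adjacent SL items share one comma, so the second one is missed
       REGEXP_COUNT(col1, '(^|,)SL(,|$)') AS plain_count,
       -- with doubled commas every item has its own leading and trailing comma
       REGEXP_COUNT(REPLACE(col1, ',', ',,'), '(^|,)SL(,|$)') AS doubled_count
FROM   sample_data;
```

For 'PI,SL,SL,PK' the plain pattern should report 1 and the doubled version 2, while 'PI,SL,SLR,PK' should give 1 in both columns.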

Here's one option:
SQL> with temp as
  (select col1,
          regexp_substr(col1, '[^,]+', 1, column_value) val
   from test cross join
        table(cast(multiset(select level from dual
                            connect by level <= regexp_count(col1, ',') + 1
                           ) as sys.odcinumberlist))
  )
  select col1,
         sum(case when val = 'SL' then 1 else 0 end) col2
  from temp
  group by col1;
COL1 COL2
----------------- ----------
PI,SL,SLR,PK 1
PK,SL 1
PK 0
SL,CR,SL 2
PI,OSL,SL,PK 1
SL,SL 2
PI,SL,SL,PK 2
PI,SL,SL,SL,PK 3
SL,PK 1
SL 1
PI,SL,PK 1
PI,SL,SL,SL,SL,PK 4
12 rows selected.
SQL>
What does it do?
the temp CTE splits each col1 value into rows (the separator is a comma)
the final select simply counts the number of SLs for each col1

You can use an XMLTABLE to split the string and then count:
SELECT col1,
(
SELECT COUNT(*)
FROM XMLTABLE(
('"' || REPLACE( col1, ',', '","' ) || '"')
COLUMNS
value CHAR(2) PATH '.'
)
WHERE value = 'SL'
) AS col2
FROM test_data
So, for your test data:
CREATE TABLE test_data ( col1 ) AS
SELECT 'SL,PK' FROM dual UNION ALL
SELECT 'SL,CR,SL' FROM dual UNION ALL
SELECT 'PK,SL' FROM dual UNION ALL
SELECT 'SL,SL' FROM dual UNION ALL
SELECT 'SL' FROM dual UNION ALL
SELECT 'PK' FROM dual UNION ALL
SELECT 'PI,SL,PK' FROM dual UNION ALL
SELECT 'PI,SL,SL,PK' FROM dual UNION ALL
SELECT 'PI,SL,SL,SL,PK' FROM dual UNION ALL
SELECT 'PI,SL,SL,SL,SL,PK' FROM dual UNION ALL
SELECT 'PI,OSL,SL,PK' FROM dual UNION ALL
SELECT 'PI,SL,SLR,PK' FROM dual
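This outputs the following. Note that the last row reports 2 instead of the expected 1: the output suggests that the CHAR(2) column declaration truncates 'SLR' to 'SL', so a wider type such as VARCHAR2(10) would be needed for a strict count: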
COL1 | COL2
:---------------- | ---:
SL,PK | 1
SL,CR,SL | 2
PK,SL | 1
SL,SL | 2
SL | 1
PK | 0
PI,SL,PK | 1
PI,SL,SL,PK | 2
PI,SL,SL,SL,PK | 3
PI,SL,SL,SL,SL,PK | 4
PI,OSL,SL,PK | 1
PI,SL,SLR,PK | 2

Oracle 12c Analytic Function

Is there a way to obtain the corresponding value X for a minimum value Y in a given dataset, in the same record, using Oracle Analytic functions, and without using a subquery?
For example:
If I have the following dataset "ds1":
Col1 Col2
A 1
B 2
C 3
D 4
E 4
A 10
Normally, in order to find the value "A" in Col1, which corresponds to the minimum value "1" in Col2, I would write the following query:
select ds1.col1
from ds1
, (select min (col2) col2
from ds1) min_ds1
where ds1.col2 = min_ds1.col2
/
Here is the executed code for such a Test Case:
### 1014.010, Start time is: 10/30/2019 11:39:35am
MYUN#MYDB-C1>>create table ds1 (col1 varchar2 (1), col2 number)
/
Table created.
Elapsed: 00:00:00.01
MYUN#MYDB-C1>>insert into ds1 (col1, col2)
select 'A', 1 from dual
union all select 'B', 2 from dual
union all select 'C', 3 from dual
union all select 'D', 4 from dual
union all select 'E', 4 from dual
union all select 'A', 10 from dual
/
6 rows created.
Elapsed: 00:00:00.02
MYUN#MYDB-C1>>commit
/
Commit complete.
Elapsed: 00:00:00.01
MYUN#MYDB-C1>>col col1 format a10
MYUN#MYDB-C1>>select ds1.col1
from ds1
, (select min (col2) col2
from ds1) min_ds1
where ds1.col2 = min_ds1.col2
/
COL1
----------
A
1 row selected.
Elapsed: 00:00:00.01
MYUN#MYDB-C1>>drop table ds1
/
Table dropped.
Elapsed: 00:00:00.03
The time now: 10/30/2019 11:39:36am
My question is:
Is it possible to derive the value "A" using an Analytic Function and without requiring a subquery? I am aware I can use the analytic function "ROW_NUMBER", sort the result in the ORDER BY clause, all in a subquery and then add a WHERE clause on the outer query where I say something like "WHERE RN = 1", where "RN" is the alias for the column in the subquery where the ROW_NUMBER function is used.

Use an aggregation function with KEEP to get the col1 value that corresponds to the minimum col2:
Oracle Setup:
create table ds1 ( col1, col2 ) AS
select 'A', 1 from dual
union all select 'B', 2 from dual
union all select 'C', 3 from dual
union all select 'D', 4 from dual
union all select 'E', 4 from dual
union all select 'F', 10 from dual;
Aggregation Query:
SELECT MIN( col1 ) KEEP ( DENSE_RANK FIRST ORDER BY col2 ) AS col1
FROM ds1
Output:
| COL1 |
| :--- |
| A |
Analytic Query:
If you particularly want an analytic function then:
SELECT col1, col2
FROM (
SELECT ds1.*,
DENSE_RANK() OVER ( ORDER BY col2 ASC ) AS rnk
FROM ds1
)
WHERE rnk = 1
This has a sub-query but there is only a single table-scan.
You can easily integrate it into a huge query:
WITH my_huge_query AS (
<paste your huge query here>
)
SELECT *
FROM (
SELECT m.*,
DENSE_RANK() OVER( ORDER BY col2 ASC ) AS rnk
FROM my_huge_query m
)
WHERE rnk = 1
Output:
COL1 | COL2
:--- | ---:
A | 1
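If the aim is specifically an analytic function with no inline view at all, one further option is FIRST_VALUE combined with DISTINCT. This is a sketch under the assumption that a single value is wanted and that ties on col2 may be broken by col1; it is not taken from the answer above.

```sql
-- Assumes the ds1 table from the setup above.
-- FIRST_VALUE returns, for every row, the col1 of the row that sorts first
-- by col2 (ties broken by col1); DISTINCT then collapses the identical
-- rows into one, because DISTINCT is applied after analytic functions.
SELECT DISTINCT
       FIRST_VALUE(col1) OVER (ORDER BY col2, col1) AS col1
FROM   ds1;
```

For the sample data this returns the single row A. Unlike the DENSE_RANK version, it returns only one value even when several rows share the minimum col2.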

Sql Query for Unique and Duplicates in oracle sql?

I need to display unique records in one set and duplicate records in another set in Oracle. Source data:
COL1 COL2
1 10
1 10
2 20
3 30
3 30
unique in one set    duplicate in one set
col1 col2            col1 col2
2    20              1    10
                     1    10
                     3    30
                     3    30

You can use GROUP BY with a HAVING clause for both cases (your_table is a placeholder for the actual table name; note that Oracle does not accept AS before a table alias):
Unique records:
select *
from your_table t
inner join (
select col1, col2, count(*) as times
from your_table
group by col1, col2
having count(*) = 1) t2 ON t.col1 = t2.col1 and t.col2 = t2.col2
Duplicate records:
select *
from your_table t
inner join (
select col1, col2, count(*) as times
from your_table
group by col1, col2
having count(*) > 1) t2 ON t.col1 = t2.col1 and t.col2 = t2.col2

Would something like this do? See comments within code.
SQL> with
  test (col1, col2) as
  -- sample data
  (select 1, 10 from dual union all
   select 1, 10 from dual union all
   select 2, 20 from dual union all
   select 3, 30 from dual union all
   select 3, 30 from dual
  ),
  uni as
  -- unique values
  (select col1, col2
   from test
   group by col1, col2
   having count(*) = 1
  ),
  dup as
  -- duplicate values
  (select col1, col2
   from test
   group by col1, col2
   having count(*) > 1
  )
  -- the final result
  select u.col1 ucol1,
         u.col2 ucol2,
         d.col1 dcol1,
         d.col2 dcol2
  from uni u full outer join dup d on u.col1 = d.col1;
     UCOL1      UCOL2      DCOL1      DCOL2
---------- ---------- ---------- ----------
                               1         10
                               3         30
         2         20
SQL>

You can identify the duplicate values using window functions, and then filter on the count. To get unique records:
select col1, col2
from (select t.*, count(*) over (partition by col1) as cnt
from t
) t
where cnt = 1;
To get duplicates:
select col1, col2
from (select t.*, count(*) over (partition by col1) as cnt
from t
) t
where cnt > 1;
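The same window count can also label every row in a single pass, which may be convenient when both sets are needed at once. A minimal sketch, assuming the sample data from the question and partitioning on both columns; the CTE t and the column name kind are illustrative:

```sql
WITH t AS (
  SELECT 1 AS col1, 10 AS col2 FROM dual UNION ALL
  SELECT 1, 10 FROM dual UNION ALL
  SELECT 2, 20 FROM dual UNION ALL
  SELECT 3, 30 FROM dual UNION ALL
  SELECT 3, 30 FROM dual
)
SELECT col1,
       col2,
       -- a (col1, col2) combination that occurs once is unique
       CASE WHEN COUNT(*) OVER (PARTITION BY col1, col2) = 1
            THEN 'unique'
            ELSE 'duplicate'
       END AS kind
FROM   t
ORDER  BY col1, col2;
```

Here the row (2, 20) is labelled 'unique' and the four remaining rows 'duplicate'.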

SQL Server Counting

I have the following query:
select col1, sum( col2 ), count( col3 )
from table1
group by col1
order by col1
which returns something like this
col1   col2  col3
dept1  10    2
dept2  20    3
dept3  30    4
Without a stored procedure, is it possible to get a total column below the results generated by the original query?
i.e.
col1   col2  col3
dept1  10    2
dept2  20    3
dept3  30    4
total  60    9

Use ROLLUP:
;with Table1 as (
select 'dept1' as col1, 5 as col2,1 as col3
union all
select 'dept1', 5 as col2, 1 as col3
union all
select 'dept2',10,1
union all
select 'dept2',5,1
union all
select 'dept2',5,1
union all
select 'dept3',10,1
union all
select 'dept3',5,1
union all
select 'dept3',5,1
union all
select 'dept3',10,1
)
select COALESCE(col1,'total'), sum( col2 ), count( col3 )
from table1
group by col1
with rollup
order by COALESCE(col1,'ZZZZZ')
Results:
(No column name) (No column name) (No column name)
dept1 10 2
dept2 20 3
dept3 30 4
total 60 9
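One caveat with COALESCE(col1,'total') is that a genuine NULL in col1 would also be labelled 'total'. A sketch of a variant that uses GROUPING() to detect the rollup row instead, written for SQL Server with the same sample data; the aliases label, sum_col2 and count_col3 are illustrative:

```sql
WITH table1 AS (
    SELECT *
    FROM (VALUES
        ('dept1',  5, 1), ('dept1',  5, 1),
        ('dept2', 10, 1), ('dept2',  5, 1), ('dept2', 5, 1),
        ('dept3', 10, 1), ('dept3',  5, 1), ('dept3', 5, 1), ('dept3', 10, 1)
    ) AS v (col1, col2, col3)
)
SELECT CASE WHEN GROUPING(col1) = 1 THEN 'total' ELSE col1 END AS label,
       SUM(col2)   AS sum_col2,
       COUNT(col3) AS count_col3
FROM   table1
GROUP  BY ROLLUP (col1)
-- GROUPING(col1) is 1 only on the rollup row, so the total sorts last
ORDER  BY GROUPING(col1), col1;
```

This should produce the same four rows as above, and it also names the result columns.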

Have a look at the keyword WITH ROLLUP on your GROUP BY clause

Yes, with a UNION ALL:
select col1, sum(col2), count(col3)
from table1
group by col1
union all
select 'totals', sum(col2), count(1) from table1
order by col1
