Error in SQL. Different values when using count - sql

This is my table
+-----+-----+-----+-------+-------+---------+
| pa | pl | rp | corp | year | rating |
|-----+-----+-----+-------+-------+---------|
| pa1 | pl1 | rp1 | a1 | 2016 | 6 |
| pa1 | pl1 | rp1 | a1 | 2017 | 7 |
| pa1 | pl1 | rp1 | a2 | 2016 | 6.5 |
| pa1 | pl1 | rp1 | a2 | 2017 | 7.5 |
| pa1 | pl1 | rp2 | a1 | 2016 | 2 |
| pa1 | pl1 | rp2 | a1 | 2017 | 1.5 |
| pa1 | pl1 | rp2 | a2 | 2016 | 4 |
| pa1 | pl1 | rp2 | a2 | 2017 | 4.5 |
+-------------------------------------------+
Its name is list.
The query I wrote is to find where the average values have increased with time.
This is my query.
select count(sq.rp)
from
(select l1.pa, l1.pl, l1.rp, l1.corp, l1.rating as r1, l2.rating as r2
from list l1, list l2
where l1.year = '2016' and l2.year = '2017' and l1.pa = l2.pa
and l1.pl = l2.pl and l1.rp = l2.rp and l1.corp = l2.corp) sq
group by pa, pl, rp
having (avg(sq.r2)-avg(sq.r1)) > 0
When I don't use count in the first line, that is, when I replace the first line with
select sq.rp
the output is a single row, 'rp1':
+-----+
| rp |
|-----|
| rp1 |
+-----+
But when I use count, it returns 2.
I can't understand the reason.

Consider the examples below:
with x as
(select 'a' idx, 10 from dual union all
select 'a' idx, 20 from dual)
select idx from x group by idx
In the first query we select idx and group by idx without providing any aggregate function, yet we are still grouping. As per the documentation:
In each group, no two rows have the same value for the grouping column or columns
so it effectively just does a distinct on column idx for you.
Try the query below, with group by and count together, and the behaviour should become clear:
with x as
(select 'a' idx, 10 from dual union all
select 'a' idx, 20 from dual)
select idx,count(idx) from x group by idx

GROUP BY without an aggregate function is the same as using DISTINCT, therefore just one row. GROUP BY in combination with an aggregate function applies that aggregate function to all rows of the group, therefore 2. Execute the inner select without the GROUP BY and you will see that the join returns two rows for rp1 (one per corp).
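To make the difference concrete, here is a minimal sketch that loads the table from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite, the integer year column, and the variable names are assumptions made for illustration. The same query is used twice: once to count rows inside each surviving group, and once wrapped in an outer COUNT(*) to count the groups themselves, which is probably the number the question was after.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list (pa TEXT, pl TEXT, rp TEXT, corp TEXT, year INTEGER, rating REAL)"
)
conn.executemany(
    "INSERT INTO list VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("pa1", "pl1", "rp1", "a1", 2016, 6.0),
        ("pa1", "pl1", "rp1", "a1", 2017, 7.0),
        ("pa1", "pl1", "rp1", "a2", 2016, 6.5),
        ("pa1", "pl1", "rp1", "a2", 2017, 7.5),
        ("pa1", "pl1", "rp2", "a1", 2016, 2.0),
        ("pa1", "pl1", "rp2", "a1", 2017, 1.5),
        ("pa1", "pl1", "rp2", "a2", 2016, 4.0),
        ("pa1", "pl1", "rp2", "a2", 2017, 4.5),
    ],
)

# Same shape as the query in the question: the self join yields one row
# per (rp, corp) pair, so each rp group holds two rows.
grouped = """
    SELECT sq.rp AS rp, COUNT(sq.rp) AS rows_in_group
    FROM (SELECT l1.pa, l1.pl, l1.rp, l1.corp, l1.rating AS r1, l2.rating AS r2
          FROM list l1
          JOIN list l2
            ON l1.pa = l2.pa AND l1.pl = l2.pl
           AND l1.rp = l2.rp AND l1.corp = l2.corp
          WHERE l1.year = 2016 AND l2.year = 2017) sq
    GROUP BY sq.pa, sq.pl, sq.rp
    HAVING AVG(sq.r2) - AVG(sq.r1) > 0
"""

# COUNT inside the grouped query counts the rows of each group.
per_group = conn.execute(grouped).fetchall()

# Wrapping the grouped query counts the groups that passed HAVING.
group_count = conn.execute(f"SELECT COUNT(*) FROM ({grouped}) g").fetchone()[0]

print(per_group)    # rows per surviving group
print(group_count)  # number of surviving groups
```

Only rp1 survives the HAVING clause (rp2's average stays at 3), so the inner count reports the two joined rows of that group while the outer count reports one group.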

It is not clear what result you need, but to me it is more readable to join the table to itself with the AVG already calculated for each year and then just select the columns you're interested in. Try the example below:
The derived table "list" stands in for your table, with columns pa, pl and rp collapsed into a single key column k.
select * --sub2.k, sub2.av
from (
select k,y, AVG(v) av
from (
select 'val1' k, 'a1' corp,'2016' y ,6 v
union select 'val1', 'a2','2016' ,4
union select 'val1', 'a1','2017',7
union select 'val1', 'a2','2017',9
union select 'val2', 'a1','2016',1
union select 'val2', 'a1','2017',3
union select 'val2', 'a2','2016',3
union select 'val2', 'a2','2017',5
) list
where list.y = '2016'
group by k,y
) sub1
join (
select k,y, AVG(v) av
from (
select 'val1' k, 'a1' corp,'2016' y ,6 v
union select 'val1', 'a2','2016' ,4
union select 'val1', 'a1','2017',7
union select 'val1', 'a2','2017',9
union select 'val2', 'a1','2016',1
union select 'val2', 'a1','2017',3
union select 'val2', 'a2','2016',3
union select 'val2', 'a2','2017',5
) list
where list.y = '2017'
group by k,y
) sub2 on sub1.k=sub2.k
and sub2.av-sub1.av >0

Related

Possible to use a column name in a UDF in SQL?

I have a query in which a series of steps is repeated constantly over different columns, for example:
SELECT DISTINCT
MAX (
CASE
WHEN table_2."GRP1_MINIMUM_DATE" <= cohort."ANCHOR_DATE" THEN 1
ELSE 0
END)
OVER (PARTITION BY cohort."USER_ID")
AS "GRP1_MINIMUM_DATE",
MAX (
CASE
WHEN table_2."GRP2_MINIMUM_DATE" <= cohort."ANCHOR_DATE" THEN 1
ELSE 0
END)
OVER (PARTITION BY cohort."USER_ID")
AS "GRP2_MINIMUM_DATE"
FROM INPUT_COHORT cohort
LEFT JOIN INVOLVE_EVER table_2 ON cohort."USER_ID" = table_2."USER_ID"
I was considering writing a function to accomplish this as doing so would save on space in my query. I have been reading a bit about UDF in SQL but don't yet understand if it is possible to pass a column name in as a parameter (i.e. simply switch out "GRP1_MINIMUM_DATE" for "GRP2_MINIMUM_DATE" etc.). What I would like is a query which looks like this
SELECT DISTINCT
FUNCTION(table_2."GRP1_MINIMUM_DATE") AS "GRP1_MINIMUM_DATE",
FUNCTION(table_2."GRP2_MINIMUM_DATE") AS "GRP2_MINIMUM_DATE",
FUNCTION(table_2."GRP3_MINIMUM_DATE") AS "GRP3_MINIMUM_DATE",
FUNCTION(table_2."GRP4_MINIMUM_DATE") AS "GRP4_MINIMUM_DATE"
FROM INPUT_COHORT cohort
LEFT JOIN INVOLVE_EVER table_2 ON cohort."USER_ID" = table_2."USER_ID"
Can anyone tell me if this is possible/point me to some resource that might help me out here?
Thanks!
There is no direct way to do this, as @Tejash already stated, but it looks like your database model is not ideal: it would be better to have a table with USER_ID and GRP_ID as keys and MINIMUM_DATE as a separate field.
Without changing the table structure, you can use an UNPIVOT query to mimic this design:
WITH INVOLVE_EVER(USER_ID, GRP1_MINIMUM_DATE, GRP2_MINIMUM_DATE, GRP3_MINIMUM_DATE, GRP4_MINIMUM_DATE)
AS (SELECT 1, SYSDATE, SYSDATE, SYSDATE, SYSDATE FROM dual UNION ALL
SELECT 2, SYSDATE-1, SYSDATE-2, SYSDATE-3, SYSDATE-4 FROM dual)
SELECT *
FROM INVOLVE_EVER
unpivot ( minimum_date FOR grp_id IN ( GRP1_MINIMUM_DATE AS 1, GRP2_MINIMUM_DATE AS 2, GRP3_MINIMUM_DATE AS 3, GRP4_MINIMUM_DATE AS 4))
Result:
| USER_ID | GRP_ID | MINIMUM_DATE |
|---------|--------|--------------|
| 1 | 1 | 09/09/19 |
| 1 | 2 | 09/09/19 |
| 1 | 3 | 09/09/19 |
| 1 | 4 | 09/09/19 |
| 2 | 1 | 09/08/19 |
| 2 | 2 | 09/07/19 |
| 2 | 3 | 09/06/19 |
| 2 | 4 | 09/05/19 |
With this you can write your query without further code duplication and, if you need to, use PIVOT syntax to get one line per USER_ID.
The final query could then look like this:
WITH INVOLVE_EVER(USER_ID, GRP1_MINIMUM_DATE, GRP2_MINIMUM_DATE, GRP3_MINIMUM_DATE, GRP4_MINIMUM_DATE)
AS (SELECT 1, SYSDATE, SYSDATE, SYSDATE, SYSDATE FROM dual UNION ALL
SELECT 2, SYSDATE-1, SYSDATE-2, SYSDATE-3, SYSDATE-4 FROM dual)
, INPUT_COHORT(USER_ID, ANCHOR_DATE)
AS (SELECT 1, SYSDATE-1 FROM dual UNION ALL
SELECT 2, SYSDATE-2 FROM dual UNION ALL
SELECT 3, SYSDATE-3 FROM dual)
-- Above is sample data; the query starts here:
, unpiv AS (SELECT *
FROM INVOLVE_EVER
unpivot ( minimum_date FOR grp_id IN ( GRP1_MINIMUM_DATE AS 1, GRP2_MINIMUM_DATE AS 2, GRP3_MINIMUM_DATE AS 3, GRP4_MINIMUM_DATE AS 4)))
SELECT qcsj_c000000001000000 user_id, GRP1_MINIMUM_DATE, GRP2_MINIMUM_DATE, GRP3_MINIMUM_DATE, GRP4_MINIMUM_DATE
FROM INPUT_COHORT cohort
LEFT JOIN unpiv table_2
ON cohort.USER_ID = table_2.USER_ID
pivot (MAX(CASE WHEN minimum_date <= cohort."ANCHOR_DATE" THEN 1 ELSE 0 END) AS MINIMUM_DATE
FOR grp_id IN (1 AS GRP1,2 AS GRP2,3 AS GRP3,4 AS GRP4))
Result:
| USER_ID | GRP1_MINIMUM_DATE | GRP2_MINIMUM_DATE | GRP3_MINIMUM_DATE | GRP4_MINIMUM_DATE |
|---------|-------------------|-------------------|-------------------|-------------------|
| 3 | | | | |
| 1 | 0 | 0 | 0 | 0 |
| 2 | 0 | 1 | 1 | 1 |
This way you only have to write your calculation logic once (see line starting with pivot).
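For engines without UNPIVOT and PIVOT, the same idea can be sketched with UNION ALL (to unpivot) and conditional aggregation (to pivot back). The example below is an illustration only: it uses SQLite through Python's sqlite3 module, and the fixed ISO date strings replace SYSDATE so the output is reproducible. None of that comes from the original answer. The comparison against ANCHOR_DATE is still written exactly once, in the flagged CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE involve_ever (
        user_id INTEGER,
        grp1_minimum_date TEXT, grp2_minimum_date TEXT,
        grp3_minimum_date TEXT, grp4_minimum_date TEXT);
    CREATE TABLE input_cohort (user_id INTEGER, anchor_date TEXT);
    -- Assumed sample data: '2019-09-09' plays the role of SYSDATE.
    INSERT INTO involve_ever VALUES
        (1, '2019-09-09', '2019-09-09', '2019-09-09', '2019-09-09'),
        (2, '2019-09-08', '2019-09-07', '2019-09-06', '2019-09-05');
    INSERT INTO input_cohort VALUES
        (1, '2019-09-08'), (2, '2019-09-07'), (3, '2019-09-06');
""")

query = """
    WITH unpiv AS (            -- one row per (user_id, grp_id)
        SELECT user_id, 1 AS grp_id, grp1_minimum_date AS minimum_date FROM involve_ever
        UNION ALL SELECT user_id, 2, grp2_minimum_date FROM involve_ever
        UNION ALL SELECT user_id, 3, grp3_minimum_date FROM involve_ever
        UNION ALL SELECT user_id, 4, grp4_minimum_date FROM involve_ever
    ),
    flagged AS (               -- the calculation logic, written once
        SELECT c.user_id, u.grp_id,
               CASE WHEN u.minimum_date <= c.anchor_date THEN 1 ELSE 0 END AS flag
        FROM input_cohort c
        LEFT JOIN unpiv u ON u.user_id = c.user_id
    )
    SELECT user_id,            -- pivot back to one row per user
           MAX(CASE WHEN grp_id = 1 THEN flag END) AS grp1_minimum_date,
           MAX(CASE WHEN grp_id = 2 THEN flag END) AS grp2_minimum_date,
           MAX(CASE WHEN grp_id = 3 THEN flag END) AS grp3_minimum_date,
           MAX(CASE WHEN grp_id = 4 THEN flag END) AS grp4_minimum_date
    FROM flagged
    GROUP BY user_id
    ORDER BY user_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

User 3 has no row in involve_ever, so all four flags come back as NULL (None in Python), matching the empty cells in the result above.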

How to create a query with all of dependencies in hierarchical organization?

I've been trying hard to create a query to see all dependencies in a hierarchical organization, but the only thing I have achieved is retrieving the parent dependency. I have attached an image to show what I need.
Thanks for any clue you can give me.
This is the code I have tried with the production table.
WITH CTE AS
(SELECT
H1.systemuserid,
H1.pes_aprobadorid,
H1.yomifullname,
H1.internalemailaddress
FROM [dbo].[ext_systemuser] H1
WHERE H1.pes_aprobadorid is null
UNION ALL
SELECT
H2.systemuserid,
H2.pes_aprobadorid,
H2.yomifullname,
H2.internalemailaddress
FROM [dbo].[ext_systemuser] H2
INNER JOIN CTE c ON h2.pes_aprobadorid=c.systemuserid)
SELECT *
FROM CTE
OPTION (MAXRECURSION 1000)
You are almost there with your query. You just have to include all rows as the starting point. Also, the join should be cte.parent_id = ext.user_id and not the other way round. I've written an example query in Postgres, but you should be able to adapt it to your DBMS easily.
with recursive st_units as (
select 0 as id, NULL as pid, 'Director' as nm
union all select 1, 0, 'Department 1'
union all select 2, 0, 'Department 2'
union all select 3, 1, 'Unit 1'
union all select 4, 3, 'Unit 1.1'
),
cte AS
(
SELECT id, pid, cast(nm as text) as path, 1 as lvl
FROM st_units
UNION ALL
SELECT c.id, u.pid, cast(path || '->' || u.nm as text), lvl + 1
FROM st_units as u
INNER JOIN cte as c on c.pid = u.id
)
SELECT id, pid, path, lvl
FROM cte
ORDER BY lvl, id
id | pid | path | lvl
-: | ---: | :--------------------------------------- | --:
0 | null | Director | 1
1 | 0 | Department 1 | 1
2 | 0 | Department 2 | 1
3 | 1 | Unit 1 | 1
4 | 3 | Unit 1.1 | 1
1 | null | Department 1->Director | 2
2 | null | Department 2->Director | 2
3 | 0 | Unit 1->Department 1 | 2
4 | 1 | Unit 1.1->Unit 1 | 2
3 | null | Unit 1->Department 1->Director | 3
4 | 0 | Unit 1.1->Unit 1->Department 1 | 3
4 | null | Unit 1.1->Unit 1->Department 1->Director | 4
I've arrived at this code, which works, but when I run it against a hierarchy table of more than 1800 rows the query never finishes.
With cte AS
(select systemuserid, systemuserid as pes_aprobadorid, internalemailaddress, yomifullname
from #TestTable
union all
SELECT c.systemuserid, u.pes_aprobadorid, u.internalemailaddress, u.yomifullname
FROM #TestTable as u
INNER JOIN cte as c on c.pes_aprobadorid = u.systemuserid
)
select distinct * from cte
where pes_aprobadorid is not null
OPTION (MAXRECURSION 0)
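As a small illustration of the "start from every row, then walk up to the parent" idea, here is a sketch using SQLite through Python's sqlite3 module. It is an assumption-laden example, not the production query: the st_units data is taken from the Postgres sample above, and the ancestors helper is a hypothetical name. It uses UNION rather than UNION ALL in the recursive CTE. In SQLite this discards rows that were already produced, so the recursion also stops if the data happens to contain a cycle. Whether a cycle is what makes the query above endless is not confirmed in this thread, and SQL Server itself requires UNION ALL in recursive CTEs, so there a cycle would have to be handled differently.

```python
import sqlite3


def ancestors(units):
    """Map each unit id to the sorted ids of every unit above it.

    `units` is an iterable of (id, pid) pairs; pid is None for the root.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE st_units (id INTEGER, pid INTEGER)")
    conn.executemany("INSERT INTO st_units VALUES (?, ?)", units)
    rows = conn.execute("""
        WITH RECURSIVE cte(id, ancestor) AS (
            SELECT id, id FROM st_units          -- every row is a starting point
            UNION                                -- UNION drops repeated rows
            SELECT c.id, u.pid                   -- step from ancestor to its parent
            FROM st_units u
            JOIN cte c ON c.ancestor = u.id
            WHERE u.pid IS NOT NULL
        )
        SELECT id, ancestor
        FROM cte
        WHERE id <> ancestor
        ORDER BY id, ancestor
    """).fetchall()
    conn.close()
    result = {}
    for unit_id, ancestor in rows:
        result.setdefault(unit_id, []).append(ancestor)
    return result


tree = [(0, None), (1, 0), (2, 0), (3, 1), (4, 3)]
print(ancestors(tree))
```

Keeping only the (id, ancestor) pair in the recursive part is a deliberate choice: carrying a growing path or level column would make every row unique and defeat the duplicate check.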

Conditionally fallback to different join condition if stricter condition not matched

I have 2 tables j and c.
Both tables have columns ports and sec, and I join them ON j.ports = c.ports and c.sec = j.sec.
For j.ports = 'ABC', if there is no c.sec = j.sec for the same ports, then I want to fall back to JOIN ON LEFT(c.sec, 6) = LEFT(j.sec, 6)
For other j.ports, I only want to join ON j.ports = c.ports and c.sec = j.sec
How can I do that?
Example Data
Table c
+------+------------+------------+
| Port | sec | Other |
+------+------------+------------+
| ABC | abcdefghij | ONE |
| ABC | klmnop | TWO |
| LMN | qwertyuiop | THREE |
| XYZ | asdfghjkl | FOUR |
+------+------------+------------+
Table j
+------+------------+
| Port | sec |
+------+------------+
| ABC | abcdefxxxx |
| ABC | klmnop |
| LMN | qwertyuiop |
| XYZ | zxcvbnm |
+------+------------+
EDITED: Desired Results
+------+------------+------------+
| Port | sec | other |
+------+------------+------------+
| ABC | abcdefghij | ONE | --> matching on sec's 1st 6 characters
| ABC | klmnop | TWO | --> matching on sec
| LMN | qwertyuiop | THREE | --> matching on sec
+------+------------+------------+
This does conditional joining:
select t1.*, t2.*
from j t1 inner join c t2
on t2.ports = t1.ports and
case
when exists (select 1 from c where sec = t1.sec) then t1.sec
else left(t1.sec, 6)
end =
case
when exists (select 1 from c where sec = t1.sec) then t2.sec
else left(t2.sec, 6)
end
I question its efficiency but I think it does what you need.
You can do two outer joins and then apply an isnull-style operation. In Oracle, nvl is the equivalent of SQL Server's isnull.
with c as
(
select 'ABC' port, 'abcdefghij' sec from dual
union all select 'ABC', 'klmnop' from dual
union all select 'LMN', 'qwertyuiop' from dual
union all select 'XYZ', 'asdfghjkl' from dual
),
j as
(
select 'ABC' port, 'abcdefxxxx' sec from dual
union all select 'ABC', 'klmnop' from dual
union all select 'LMN', 'qwertyuiop' from dual
union all select 'XYZ', 'zxcvbnm' from dual
)
select c.port, c.sec, nvl(j_full.sec, j_part.sec) j_sec
from c
left outer join j j_full on j_full.port = c.port and j_full.sec = c.sec
left outer join j j_part on j_part.port = c.port and substr(j_part.sec,1,6) = substr(c.sec,1,6)
order by 1,2
One way would be to just inner join on the less strict predicate then use a ranking function to discard unwanted rows in the event that c.port = 'ABC' and the stricter condition got a match for a particular c.port, c.sec combination.
with cte as
(
select c.port as cPort,
c.sec as cSec,
c.other as other,
j.sec as jSec,
RANK() OVER (PARTITION BY c.port, c.sec ORDER BY CASE WHEN c.port = 'ABC' AND j.sec = c.sec THEN 0 ELSE 1 END) AS rnk
from c inner join j on left(j.sec,6) = left(c.sec,6)
)
SELECT cPort, cSec, other, jSec
FROM cte
WHERE rnk = 1
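Another way to express the fallback, sketched here against the sample data from the question, is a single join whose ON clause accepts either the strict match or, for port 'ABC' only, the 6-character prefix match when no strict match exists for that j row. The example runs on SQLite through Python's sqlite3 module, which is an assumption for illustration: substr replaces LEFT, and the alias c2 is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (port TEXT, sec TEXT, other TEXT);
    CREATE TABLE j (port TEXT, sec TEXT);
    INSERT INTO c VALUES
        ('ABC', 'abcdefghij', 'ONE'),
        ('ABC', 'klmnop',     'TWO'),
        ('LMN', 'qwertyuiop', 'THREE'),
        ('XYZ', 'asdfghjkl',  'FOUR');
    INSERT INTO j VALUES
        ('ABC', 'abcdefxxxx'),
        ('ABC', 'klmnop'),
        ('LMN', 'qwertyuiop'),
        ('XYZ', 'zxcvbnm');
""")

matches = conn.execute("""
    SELECT c.port, c.sec, c.other
    FROM j
    JOIN c
      ON c.port = j.port
     AND (
            c.sec = j.sec                          -- strict match, any port
         OR (    j.port = 'ABC'                    -- fallback for ABC only
             AND substr(c.sec, 1, 6) = substr(j.sec, 1, 6)
             AND NOT EXISTS (SELECT 1              -- and only if no strict match
                             FROM c AS c2
                             WHERE c2.port = j.port
                               AND c2.sec = j.sec))
         )
    ORDER BY c.port, c.sec
""").fetchall()

for row in matches:
    print(row)
```

With the sample data this returns the three rows listed under the desired results. XYZ is left out because it has no strict match and is not eligible for the prefix fallback.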

Oracle Sql: Obtain a Sum of a Group, if Subgroup condition met

I have a dataset for which I am trying to obtain a summed value for each group, if a subgroup within each group meets a certain condition. I am not sure if this is possible, or if I am approaching this problem incorrectly.
My data is structured as following:
+----+-------------+---------+-------+
| ID | Transaction | Product | Value |
+----+-------------+---------+-------+
| 1 | A | 0 | 10 |
| 1 | A | 1 | 15 |
| 1 | A | 2 | 20 |
| 1 | B | 1 | 5 |
| 1 | B | 2 | 10 |
+----+-------------+---------+-------+
In this example I want to obtain the sum of values by the ID column, if a transaction does not contain any products labeled 0. In the above described scenario, all values related to Transaction A would be excluded because Product 0 was purchased. With the outcome being:
+----+-------------+
| ID | Sum of Value|
+----+-------------+
| 1 | 15 |
+----+-------------+
This process would repeat for multiple IDs with each ID only containing the sum of values if the transaction does not contain product 0.
Hmmm . . . one method is to use not exists for the filtering:
select id, sum(value)
from t
where not exists (select 1
from t t2
where t2.id = t.id and t2.transaction = t.transaction and
t2.product = 0
)
group by id;
You do not need a correlated subquery with not exists.
Just use group by.
with s (id, transaction, product, value) as (
select 1, 'A', 0, 10 from dual union all
select 1, 'A', 1, 15 from dual union all
select 1, 'A', 2, 20 from dual union all
select 1, 'B', 1, 5 from dual union all
select 1, 'B', 2, 10 from dual)
select id, sum(sum_value) as sum_value
from
(select id, transaction,
sum(value) as sum_value
from s
group by id, transaction
having count(decode(product, 0, 1)) = 0
)
group by id;
ID SUM_VALUE
---------- ----------
1 15
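To check that the two approaches above agree, here is a small sketch that runs both against the sample data in SQLite through Python's sqlite3 module. SQLite is an assumption for illustration. The column is named txn because TRANSACTION is a keyword in SQLite, and a CASE expression stands in for Oracle's decode.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, txn TEXT, product INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "A", 0, 10),
        (1, "A", 1, 15),
        (1, "A", 2, 20),
        (1, "B", 1, 5),
        (1, "B", 2, 10),
    ],
)

# Approach 1: drop every row whose transaction contains product 0.
not_exists = conn.execute("""
    SELECT id, SUM(value)
    FROM t
    WHERE NOT EXISTS (SELECT 1
                      FROM t t2
                      WHERE t2.id = t.id
                        AND t2.txn = t.txn
                        AND t2.product = 0)
    GROUP BY id
""").fetchall()

# Approach 2: aggregate per transaction, keep the transactions that have
# no product 0, then sum the per-transaction totals per id.
two_level = conn.execute("""
    SELECT id, SUM(sum_value)
    FROM (SELECT id, txn, SUM(value) AS sum_value
          FROM t
          GROUP BY id, txn
          HAVING SUM(CASE WHEN product = 0 THEN 1 ELSE 0 END) = 0) per_txn
    GROUP BY id
""").fetchall()

print(not_exists)
print(two_level)
```

Both queries exclude transaction A entirely and sum only transaction B (5 + 10) for id 1.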

Selecting a record based on a series of criteria

I would like to run a query that will allow me to choose the best record for a particular username based on certain criteria. I have 2 columns (col01, col02) that are the criteria I am looking at.
• If one record (username a in the example below) has both columns as yes, I would like that one to take precedence.
• If one record has col01 as yes, that takes 2nd precedence (username c in the example below).
• If one record has col01 as yes and the other has col02 as yes, then col01 takes precedence (username d in the example below).
• If one record has col02 as yes and the other records have no, then col02 takes 3rd precedence (username g in the example below).
• If both records are the same, then neither should be returned, as these records need to be investigated further (usernames b, e, f).
Below is a sample of the data and the expected output. How can this be done with a SQL query?
+----------+-----+-------+-------+
| username | id | col01 | col02 |
+----------+-----+-------+-------+
| a | 1 | yes | yes |
| a | 2 | yes | no |
| b | 3 | no | no |
| b | 4 | no | no |
| c | 5 | yes | no |
| c | 6 | no | no |
| d | 7 | yes | no |
| d | 8 | no | yes |
| e | 9 | no | yes |
| e | 10 | no | yes |
| f | 11 | yes | yes |
| f | 12 | yes | yes |
| g | 13 | no | no |
| g | 14 | no | yes |
+----------+-----+-------+-------+
output
+----------+-----+-------+------+
| username | id | col01 | col02|
+----------+-----+-------+------+
| a | 1 | yes | yes |
| c | 5 | yes | no |
| d | 7 | yes | no |
| g | 14 | no | yes |
+----------+-----+-------+------+
Edit: I was asked to explain the conditions. Basically the records come from the same area (username). col01 is the most recently updated information we have, while col02 is older. Both columns are important to us, which is why it is better if both are yes; col01, being more recent, is where the more dependable data is. Where all the records are exactly the same, we have to dig a little deeper to understand our data.
Use analytic functions and then you do not need any self-joins:
Query:
SELECT username,
id,
col01,
col02
FROM (
SELECT t.*,
c.col02,
MIN( t.col01 ) OVER ( PARTITION BY username ) AS mincol01,
MAX( t.col01 ) OVER ( PARTITION BY username ) AS maxcol01,
MIN( c.col02 ) OVER ( PARTITION BY username ) AS mincol02,
MAX( c.col02 ) OVER ( PARTITION BY username ) AS maxcol02,
ROW_NUMBER() OVER ( PARTITION BY username
ORDER BY t.col01 DESC, c.col02 DESC ) AS rn
FROM table_name t
INNER JOIN
col02_table c
ON ( t.id = c.id )
)
WHERE ( mincol01 < maxcol01 OR mincol02 < maxcol02 )
AND rn = 1;
Output:
USERNAME ID COL01 COL02
-------- -- ----- -----
a 1 yes yes
c 5 yes no
d 7 yes no
g 14 no yes
with
inputs ( username, id, col01 , col02 ) as (
select 'a', 1, 'yes', 'yes' from dual union all
select 'a', 2, 'yes', 'no' from dual union all
select 'b', 3, 'no' , 'no' from dual union all
select 'b', 4, 'no' , 'no' from dual union all
select 'c', 5, 'yes', 'no' from dual union all
select 'c', 6, 'no' , 'no' from dual union all
select 'd', 7, 'yes', 'no' from dual union all
select 'd', 8, 'no' , 'yes' from dual union all
select 'e', 9, 'no' , 'yes' from dual union all
select 'e', 10, 'no' , 'yes' from dual union all
select 'f', 11, 'yes', 'yes' from dual union all
select 'f', 12, 'yes', 'yes' from dual union all
select 'g', 13, 'no' , 'no' from dual union all
select 'g', 14, 'no' , 'yes' from dual
)
-- Query begins here
select username,
max(id) keep (dense_rank last order by col01, col02) as id,
max(col01) as col01,
max(col02) keep (dense_rank last order by col01) as col02
from inputs
group by username
having min(col01) != max(col01) or min(col02) != max(col02)
;
USERNAME ID COL COL
-------- --- --- ---
a 1 yes yes
c 5 yes no
d 7 yes no
g 14 no yes
Use multiple outer self joins: one for records with both yes, one for records with only col01 = yes, and one for records with only col02 = yes. Then add predicates to select only records where the id is the id of the first record in that set (id of the row with the same name that has both yes, id of the row with the same name that has only col01 = yes, etc.).
To get rid of rows that are dupes, filter out any row where there's another row (with a different id) that has the same values for username, col01, and col02.
Select distinct a.username, a.id,
a.col01, a.col02
From table a
left join table b -- <- this is rows with both cols = yes
on b.username=a.username
and b.col01='yes'
and b.col02='yes'
left join table c1 -- <- this is rows with col1 = yes
on c1.username=a.username
and c1.col01='yes'
and c1.col02='no'
left join table c2 -- <- this is rows with col2 = yes
on c2.username=a.username
and c2.col01='no'
and c2.col02='yes'
Where a.id = coalesce(b.id, c1.Id, c2.Id)
and not exists -- <- This gets rid of f
(select * from table
where username = a.username
and id != a.id
and col01 = a.col01
and col02 = a.col02)
If col02 is in another table, then in each place where you use the table and need col02, you will need to add another join to this other table.
Select distinct a.username, a.id,
a.col01, ot.col02
From (table a join otherTable ot
on ot.id = a.Id)
left join (table b join otherTable ob -- <- this rows with both cols yes
on ob.id= b.id)
on b.username=a.username
and b.col01='yes'
and ob.col02='yes'
left join (table c1 join otherTable oc1 -- <- this rows with col1 yes
on oc1.id= c1.id)
on c1.username=a.username
and c1.col01='yes'
and oc1.col02='no'
left join (table c2 join otherTable oc2 -- <- this rows with col2 yes
on oc2.id= c2.id)
on c2.username=a.username
and c2.col01='no'
and oc2.col02='yes'
Where a.id = coalesce(b.id, c1.Id, c2.Id)
and not exists -- <- This gets rid of f
(select * from table e
join otherTable oe
on oe.id= e.id
where e.username = a.username
and e.id != a.id
and e.col01 = a.col01
and oe.col02 = a.col02)
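The precedence rules can also be folded into a single numeric score per row, which avoids both the self joins and the analytic functions. The sketch below is an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module, the table name records and the score encoding (2 for col01 = yes plus 1 for col02 = yes) are made up, and when several rows tie for the best score it keeps the lowest id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (username TEXT, id INTEGER, col01 TEXT, col02 TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        ("a", 1, "yes", "yes"), ("a", 2, "yes", "no"),
        ("b", 3, "no", "no"),   ("b", 4, "no", "no"),
        ("c", 5, "yes", "no"),  ("c", 6, "no", "no"),
        ("d", 7, "yes", "no"),  ("d", 8, "no", "yes"),
        ("e", 9, "no", "yes"),  ("e", 10, "no", "yes"),
        ("f", 11, "yes", "yes"), ("f", 12, "yes", "yes"),
        ("g", 13, "no", "no"),  ("g", 14, "no", "yes"),
    ],
)

picked = conn.execute("""
    WITH scored AS (
        -- col01 outranks col02: yes/yes = 3, yes/no = 2, no/yes = 1, no/no = 0
        SELECT username, id, col01, col02,
               (col01 = 'yes') * 2 + (col02 = 'yes') AS score
        FROM records
    ),
    best AS (
        -- best score per username; usernames whose rows all score the
        -- same are dropped because they need manual investigation
        SELECT username, MAX(score) AS top
        FROM scored
        GROUP BY username
        HAVING MIN(score) <> MAX(score)
    )
    SELECT s.username, MIN(s.id) AS id, s.col01, s.col02
    FROM scored s
    JOIN best b ON b.username = s.username AND s.score = b.top
    GROUP BY s.username, s.col01, s.col02
    ORDER BY s.username
""").fetchall()

for row in picked:
    print(row)
```

On the sample data this returns ids 1, 5, 7 and 14 for usernames a, c, d and g, and leaves out b, e and f.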