Teradata SQL count on Top n subgroups

I am using TD v15. I have the table below, where each row is a single record. I want to count in the following way:
In the Question column I have 4 'A', 5 'B', 3 'C' and 2 'D'. Select the top 2, which are A and B, and group the remaining questions as 'OtherQ' in the result's Question column.
In the Change column I have 2 'AA', 3 'AB', 2 'AC', 2 'AD', 4 'AE' and 2 'AG'. Select the top 2, which are AE and AB, and group the remaining changes as 'Other' in the result's Change column.
Then, count according...
Question Result Change
A Pass AG
A Pass AE
A Pass AA
A Pass AB
B Pass AC
B Pass AG
B Pass AB
B Pass AE
B Pass AD
B Pass AA
C Pass AB
C Pass AC
C Pass AD
D Pass AE
D Pass AE
A Fail Null
A Fail Null
C Fail Null
E Fail Null
B Fail Null
This is the desired result. It counts the top 2 questions (A and B) plus OtherQ against the top 2 changes (AE and AB) plus Other, and it also counts Pass and Fail for A, B and OtherQ.
The counts sum to 20, which should match the 20 individual rows in the table above.
Question Result Change Count
A Pass AE 1
A Pass AB 1
A Pass Other 2
B Pass AE 1
B Pass AB 1
B Pass Other 4
OtherQ Pass AE 2
OtherQ Pass AB 1
OtherQ Pass Other 2
A Fail Null 2
B Fail Null 1
OtherQ Fail Null 2
Could you please help? It is a very large table, so the code needs to be efficient. Many thanks in advance for your time and help.

I would suggest using aggregations and subqueries. Rank only the non-null change values, otherwise the NULL group (5 rows here) takes one of the top-2 slots:
select coalesce(tq.question, 'OtherQ') as question,
t.result,
(case when t.change is null then null
else coalesce(tch.change, 'Other')
end) as change,
count(*) as cnt
from t left join
(select question, count(*) as cnt,
row_number() over (order by count(*) desc) as seqnum
from t
group by question
) tq
on tq.question = t.question and tq.seqnum <= 2 left join
(select change, count(*) as cnt,
row_number() over (order by count(*) desc) as seqnum
from t
where change is not null
group by change
) tch
on tch.change = t.change and tch.seqnum <= 2
group by coalesce(tq.question, 'OtherQ'),
t.result,
(case when t.change is null then null
else coalesce(tch.change, 'Other')
end);
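If you want to sanity-check this kind of top-n grouping before running it on the big table, here is a minimal sketch that loads the 20 sample rows from the question into an in-memory SQLite database through Python's built-in sqlite3 module. SQLite is only a stand-in for Teradata here, and the table name t, the aliases question_grp and change_grp, and the choice to store "Null" as a real NULL are my assumptions. It carries the Result column through and ranks only non-null Change values.

```python
import sqlite3

# Sample rows from the question; "Null" is stored as a real NULL (assumption).
rows_in = [
    ("A", "Pass", "AG"), ("A", "Pass", "AE"), ("A", "Pass", "AA"), ("A", "Pass", "AB"),
    ("B", "Pass", "AC"), ("B", "Pass", "AG"), ("B", "Pass", "AB"), ("B", "Pass", "AE"),
    ("B", "Pass", "AD"), ("B", "Pass", "AA"),
    ("C", "Pass", "AB"), ("C", "Pass", "AC"), ("C", "Pass", "AD"),
    ("D", "Pass", "AE"), ("D", "Pass", "AE"),
    ("A", "Fail", None), ("A", "Fail", None), ("C", "Fail", None),
    ("E", "Fail", None), ("B", "Fail", None),
]

con = sqlite3.connect(":memory:")  # SQLite stands in for Teradata (assumption)
con.execute("CREATE TABLE t (question TEXT, result TEXT, change TEXT)")
con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows_in)

query = """
SELECT COALESCE(tq.question, 'OtherQ') AS question_grp,
       t.result,
       CASE WHEN t.change IS NULL THEN NULL
            ELSE COALESCE(tch.change, 'Other') END AS change_grp,
       COUNT(*) AS cnt
FROM t
LEFT JOIN (SELECT question, ROW_NUMBER() OVER (ORDER BY cnt DESC) AS seqnum
           FROM (SELECT question, COUNT(*) AS cnt FROM t GROUP BY question)
          ) tq ON tq.question = t.question AND tq.seqnum <= 2
LEFT JOIN (SELECT change, ROW_NUMBER() OVER (ORDER BY cnt DESC) AS seqnum
           FROM (SELECT change, COUNT(*) AS cnt FROM t
                 WHERE change IS NOT NULL GROUP BY change)
          ) tch ON tch.change = t.change AND tch.seqnum <= 2
GROUP BY question_grp, t.result, change_grp
"""

# Map (question group, result, change group) -> count
counts = {(q, r, c): n for q, r, c, n in con.execute(query)}
for key, n in sorted(counts.items(), key=str):
    print(key, n)
```

The sample data has no ties for the top 2. On real data ROW_NUMBER breaks ties arbitrarily, so add a secondary sort key if you need a deterministic pick.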

Related

how to execute query for each row result of another query

I have 2 tables: one stores IDs and the other stores logs for each ID. I would like to get each ID together with the sum of its logs from these 2 tables.
A
-------
ID
1
2
3
4
5
B
-------------
ID_C LOG
1 15
1 30
4 44
2 14
3 88
3 10
2 10
To get the sum for a single ID, the query is
SELECT SUM(LOG) FROM B WHERE ID_C ='2' ;
Notice that ID and ID_C hold the same values; only the column name differs between the tables.
To get all available IDs, the query is
SELECT ID FROM A ;
I would like to get the following result table
result
--------------------
ID SUM
1 45
4 44
2 24
3 98
I tried
SELECT SUM(LOG) FROM B WHERE ID_C in (SELECT ID FROM A ) ;
but it returns a single sum over all IDs

It looks like you just need a join aggregation here:
SELECT a.ID, SUM(b.LOG) AS SUM
FROM A a
INNER JOIN B b
ON b.ID_C = a.ID
GROUP BY a.ID
ORDER BY a.ID;
Note that the inner join will also remove ID values from the A table which have no entries whatsoever in the B table, which seems to be the behavior you want.
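To see the difference the join type makes, here is a small sketch using the sample data from the question in an in-memory SQLite database (Python's built-in sqlite3; SQLite and the alias total_log are my assumptions, not something from the question). The inner join drops ID 5, which has no rows in B, while a left join keeps it with a NULL sum.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE A (ID INTEGER)")
con.execute("CREATE TABLE B (ID_C INTEGER, LOG INTEGER)")
con.executemany("INSERT INTO A VALUES (?)", [(1,), (2,), (3,), (4,), (5,)])
con.executemany(
    "INSERT INTO B VALUES (?, ?)",
    [(1, 15), (1, 30), (4, 44), (2, 14), (3, 88), (3, 10), (2, 10)],
)

# Inner join: only IDs that have at least one log row survive.
inner = con.execute("""
    SELECT a.ID, SUM(b.LOG) AS total_log
    FROM A a INNER JOIN B b ON b.ID_C = a.ID
    GROUP BY a.ID
    ORDER BY a.ID
""").fetchall()

# Left join: every ID from A is kept; IDs without logs get a NULL sum.
left = con.execute("""
    SELECT a.ID, SUM(b.LOG) AS total_log
    FROM A a LEFT JOIN B b ON b.ID_C = a.ID
    GROUP BY a.ID
    ORDER BY a.ID
""").fetchall()

print(inner)
print(left)
```

If you would rather see 0 than NULL for IDs without logs, wrapping the sum in COALESCE(SUM(b.LOG), 0) in the left-join version should do it.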

You should use an inner join and GROUP BY:
SELECT A.ID as ID, SUM(LOG) AS SumLOG
FROM A inner join B ON A.ID = B.ID_C
GROUP BY A.ID
If needed, you can add a WHERE clause to filter by ID.

Recursive CTE with 2 tables not working in Oracle / SQL

There are 2 tables Department and subdepartment which have id in common.
I am trying to recursively fetch all the ids reporting to AB directly and indirectly. BC is reporting to AB, hence 4,5,6 are indirectly reporting to AB, likewise fetching till the last id.
I tried the recursive CTE query below, but I only get the first level of results. It seems the recursive member is not executing.
I am not sure what is wrong with the query. Can someone help me spot the error?
Department
Name id
AB 1
AB 2
AB 3
BC 4
BC 5
BC 6
CD 7
CD 8
EF 9
EF 10
EF 11
Subdepartment
ID Reporting
1
2
3 BC
4
5 CD
6
7
8 EF
9
10
11
Query:
With reportinghierarchy (Name, Id, Reporting, Level) As
(
--Anchor
Select A.name,A.id,reporting,0 from department A, subdepartment B
where A.id=B.id and A.name='AB'
Union All
--Recursive member
Select C.name,C.id,D.reporting, Level+1 from department C, subdepartment D
Inner Join reportinghierarchy R
On (C.Name = R.reporting)
Where C.name != 'AB' and C.Id =D.id
And R.Reporting is not null
)
Select * from reportinghierarchy
Current Output :
Name Id Reporting Level
AB 1 0
AB 2 0
AB 3 BC 0
Expected output :
Name id Reporting Level
AB 1 0
AB 2 0
AB 3 BC 0
BC 4 1
BC 5 CD 1
BC 6 1
CD 7 2
CD 8 EF 2
EF 9 3
EF 10 3
EF 11 3

Hmmm, "horrible data structure" comes to mind. This approach gets one row per "reporting" name to use for the recursive CTE portion. It then joins the level back to the original data.
with ds as (
select d.name, d.id, sd.reporting
from department d join
subdepartment sd
on d.id = sd.id
),
nd as (
select ds.name, ds.reporting
from ds
where ds.reporting is not null
),
cte (name, reporting, lev) as (
select nd.name, nd.reporting, 0 as lev
from nd
where not exists (select 1 from nd nd2 where nd2.reporting = nd.name)
union all
select nd.name, nd.reporting, lev + 1
from cte join
nd
on nd.name = cte.reporting
)
select ds.*, cte.lev
from ds join
cte
on ds.name = cte.name;
Also, learn to use proper, explicit JOIN syntax. It has been the standard syntax for decades.

Your original query was actually VERY, VERY close to working. The reasons it didn't work are:
You used the keyword LEVEL as a column name without quoting it. In Oracle LEVEL has specific meaning, and using it out of context causes the parser no end of headaches. I've changed it to LVL, which works fine.
In the recursive half of the UNION you mixed old-style and new-style joins. This is a huge problem and should never be done. Either use "old-style" implied joins, or use "new-style" explicit joins. To keep as close to your original as possible I used implicit joins, but good coding practice says you should use explicit joins all the time.
The corrected query is:
With reportinghierarchy (Name, Id, Reporting, lvl) As
(
--Anchor
Select A.name,A.id,reporting,0 from department A, subdepartment B
where A.id=B.id and A.name='AB'
Union All
--Recursive member
Select C.name,C.id,D.reporting, lvl+1 from department C, subdepartment D, reportinghierarchy R
Where C.name != 'AB' and C.Id =D.id and C.Name = R.reporting
And R.Reporting is not null
)
Select * from reportinghierarchy;
Given the above, the following results are returned, which appear to match your desired results:
NAME ID REPORTING LVL
AB 1 (null) 0
AB 2 (null) 0
AB 3 BC 0
BC 4 (null) 1
BC 5 CD 1
BC 6 (null) 1
CD 7 (null) 2
CD 8 EF 2
EF 9 (null) 3
EF 10 (null) 3
EF 11 (null) 3
SQLFiddle here
Best of luck.
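For anyone who wants to try the same traversal with explicit joins only, here is a minimal sketch against the sample data, run through Python's built-in sqlite3. SQLite is a stand-in for Oracle here (an assumption), so the WITH RECURSIVE spelling and the aliases d, s and r are mine. The recursive member joins each department name to the reporting value found on the previous level.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # SQLite stands in for Oracle (assumption)
con.execute("CREATE TABLE department (name TEXT, id INTEGER)")
con.execute("CREATE TABLE subdepartment (id INTEGER, reporting TEXT)")
con.executemany("INSERT INTO department VALUES (?, ?)", [
    ("AB", 1), ("AB", 2), ("AB", 3), ("BC", 4), ("BC", 5), ("BC", 6),
    ("CD", 7), ("CD", 8), ("EF", 9), ("EF", 10), ("EF", 11),
])
con.executemany("INSERT INTO subdepartment VALUES (?, ?)", [
    (1, None), (2, None), (3, "BC"), (4, None), (5, "CD"), (6, None),
    (7, None), (8, "EF"), (9, None), (10, None), (11, None),
])

hierarchy = con.execute("""
    WITH RECURSIVE reportinghierarchy (name, id, reporting, lvl) AS (
        -- Anchor: every id that belongs to AB
        SELECT d.name, d.id, s.reporting, 0
        FROM department d
        JOIN subdepartment s ON s.id = d.id
        WHERE d.name = 'AB'
        UNION ALL
        -- Recursive member: departments named in the previous level's reporting
        SELECT d.name, d.id, s.reporting, r.lvl + 1
        FROM reportinghierarchy r
        JOIN department d ON d.name = r.reporting
        JOIN subdepartment s ON s.id = d.id
    )
    SELECT name, id, reporting, lvl
    FROM reportinghierarchy
    ORDER BY lvl, id
""").fetchall()

for row in hierarchy:
    print(row)
```

Rows with a NULL reporting value never satisfy the join condition, so no extra IS NOT NULL filter is needed. Note there is no cycle protection; that is fine for this data but worth adding if the reporting chain can loop.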

CASE statement when using LEFT JOIN

I need some help in fixing a data aberration. I create a view based on two tables with Left Join and the result has some duplicates (as given in the logic section)
Data Setup:
*******************
TEST1
*******************
PRODUCT VALUE1 KEY
1 2 12
1 3 13
1 4 14
1 5 15
*******************
TEST2
*******************
KEY ATTRIBUTE
12 DESC
13 (null)
14 DESC
15 (null)
What I tried so far
SELECT
B.KEY,
B.ATTRIBUTE,
A.PRODUCT,
A.VALUE1
FROM TEST2 B LEFT JOIN TEST1 A ON B.KEY = A.KEY;
What I get with above SQL is
KEY ATTRIBUTE PRODUCT VALUE1
12 DESC 1 2
13 (null) 1 3
14 DESC 1 4
15 (null) 1 5
What I need to get
KEY ATTRIBUTE PRODUCT VALUE1
12 DESC 1 2
13 DESC 1 3
14 DESC 1 4
15 DESC 1 5
Logic:
Since all rows with product id 1 describe the same product, a NULL attribute should be filled with that product's non-NULL attribute. A distinct on PRODUCT and ATTRIBUTE should then always return 1 row per product id. Test1 has more than 100 products and Test2 has the corresponding descriptions.
Note: This is not a normalized design since it is data warehousing. So no complaints on design please
I would like to have a CASE statement in the attribute field.
CASE
WHEN ATTRIBUTE IS NULL THEN {fix goes here}
ELSE ATTRIBUTE
END AS ATTRIBUTE
If anyone needs to see a fiddle, go here

It's not entirely clear, but if each product can have only one attribute, then try MAX() OVER:
SELECT
TEST1.Product,
TEST1.value1,
TEST2.KEY,
MAX(ATTRIBUTE) OVER (PARTITION BY test1.Product) ATTR
FROM TEST2
LEFT JOIN
TEST1 ON TEST2.KEY = TEST1.KEY
SQLFiddle demo
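Here is a quick way to check the window-function idea on the sample rows, sketched with Python's built-in sqlite3. SQLite is only a stand-in for Oracle (an assumption), and I renamed the KEY column to k to stay clear of keyword issues, so that name is hypothetical. MAX ignores NULLs, which is why every row in the product's partition picks up the non-NULL attribute.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # SQLite stands in for Oracle (assumption)
# The KEY column is renamed to k here (hypothetical name).
con.execute("CREATE TABLE test1 (product INTEGER, value1 INTEGER, k INTEGER)")
con.execute("CREATE TABLE test2 (k INTEGER, attribute TEXT)")
con.executemany("INSERT INTO test1 VALUES (?, ?, ?)",
                [(1, 2, 12), (1, 3, 13), (1, 4, 14), (1, 5, 15)])
con.executemany("INSERT INTO test2 VALUES (?, ?)",
                [(12, "DESC"), (13, None), (14, "DESC"), (15, None)])

filled = con.execute("""
    SELECT t2.k,
           MAX(t2.attribute) OVER (PARTITION BY t1.product) AS attribute,
           t1.product,
           t1.value1
    FROM test2 t2
    LEFT JOIN test1 t1 ON t1.k = t2.k
    ORDER BY t2.k
""").fetchall()

for row in filled:
    print(row)
```

This only behaves as intended when a product really has a single distinct attribute. With several different non-NULL attributes per product, MAX would overwrite all of them with the largest one.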

SQL Fiddle:
SELECT B.KEY,
CASE WHEN B.ATTRIBUTE IS NULL THEN
(
SELECT s2.ATTRIBUTE
FROM test2 s2
LEFT JOIN TEST1 s1 ON s1.KEY = s2.KEY
WHERE s1.PRODUCT = A.PRODUCT
AND s2.ATTRIBUTE IS NOT NULL
AND ROWNUM = 1
) ELSE B.ATTRIBUTE END AS ATTRIBUTE,
A.PRODUCT, A.VALUE1
FROM TEST2 B
LEFT JOIN TEST1 A ON A.KEY = B.KEY;

SELECT
NVL(attribute,'DESC')
FROM TEST2 LEFT JOIN TEST1 ON TEST2.KEY = TEST1.KEY;
I just noticed this is Oracle; please try the query above.

Is it possible to left join two tables and have the right table supply each row no more than once?

Given this table structure:
Table A
ID AGE EDUCATION
1 23 3
2 25 6
3 22 5
Table B
ID AGE EDUCATION
1 26 4
2 24 6
3 21 3
I want to find all matches between the two tables where the age is within 2 and the education is within 2. However, I do not want to select any row from TableB more than once. Each row in B should be selected 0 or 1 times and each row in A should be selected one or more times (standard left join).
SELECT *
FROM TableA as A LEFT JOIN TableB as B ON
abs(A.age - B.age) <= 2 AND
abs(A.education - B.education) <= 2
A.ID A.AGE A.EDUCATION B.ID B.AGE B.EDUCATION
1 23 3 3 21 3
2 25 6 1 26 4
2 25 6 2 24 6
3 22 5 2 24 6
3 22 5 3 21 3
As you can see, the last two rows in the output have duplicated B.ID of 2 and 3 when compared to the entire result set. I'd like those rows to return as a single null match with A.ID = 3 since they were both matched to previous A values.
Desired output:
(note that for A.ID = 3, there is no match in B because all rows in B have already been joined to rows in A.)
A.ID A.AGE A.EDUCATION B.ID B.AGE B.EDUCATION
1 23 3 3 21 3
2 25 6 1 26 4
2 25 6 2 24 6
3 22 5 null null null
I can do this with a short program, but I'd like to solve the problem using a SQL query because it is not for me and I will not have the luxury of ever seeing the data or manipulating the environment.
Any ideas? Thanks

As @Joel Coehoorn said earlier, there has to be a mechanism that decides which of the (a,b) pairs sharing the same b are filtered out of the output. SQL is not great at letting you select ONE row when several match, so you need a helper query that filters out the records you don't want. Here the filtering is done by reducing all matching IDs of B to the smallest one (or the largest, it doesn't really matter). Any function that returns one value from a set would do; min() and max() are simply the most convenient. Once the result is reduced to the (a,b) pairs you care about, join against it to pull out the rest of the table data.
select a.id a_id, a.age a_age, a.education a_e,
b.id b_id, b.age b_age, b.education b_e
from a left join
(
SELECT
a.id a_id, min(b.id) b_id from a,b where
abs(A.age - B.age) <= 2 AND
abs(A.education - B.education) <= 2
group by a.id
) g on a.id = g.a_id
left join b on b.id = g.b_id;

To my knowledge something like this is not possible with a simple select statement and joins because you need to know what has already been selected in order to eliminate duplicates.
You can however try something a little more like this:
DECLARE @JoinResults TABLE
(A_ID INT, A_Age INT, A_Education INT, B_ID INT, B_Age INT, B_Education INT)
INSERT INTO @JoinResults (A_ID, A_Age, A_Education)
SELECT ID, AGE, EDUCATION
FROM TableA
DECLARE @i INT
SET @i = 1
--Assume that A_ID is incremental and no values missed
WHILE (@i <= (SELECT Max(A_ID) FROM @JoinResults))
BEGIN
--Fill row @i with a TableB row that is in range and not yet used
UPDATE @JoinResults
SET B_ID = SQ.ID,
B_Age = SQ.AGE,
B_Education = SQ.Education
FROM (
SELECT ID, AGE, EDUCATION
FROM TableB b
WHERE (
abs((SELECT A_Age FROM @JoinResults WHERE A_Id = @i) - AGE) <=2
AND abs((SELECT A_Education FROM @JoinResults WHERE A_Id = @i) - EDUCATION) <=2
) AND (SELECT B_ID FROM @JoinResults WHERE B_ID = b.id) IS NULL
) AS SQ
WHERE A_ID = @i
SET @i = @i + 1
END
SELECT * FROM @JoinResults
NOTE: I do not currently have access to a database, so this is untested, and I am wary of 2 potential issues with it:
I am not sure how the update will react if there are no results.
I am unsure whether the IS NULL check is correct for eliminating the duplicates.
If these issues do arise, let me know and I can help troubleshoot.

In SQL-Server, you can use the CROSS APPLY syntax:
SELECT
a.id, a.age, a.education,
b.id AS b_id, b.age AS b_age, b.education AS b_education
FROM tableB AS b
CROSS APPLY
( SELECT TOP (1) a.*
FROM tableA AS a
WHERE ABS(a.age - b.age) <= 2
AND ABS(a.education - b.education) <= 2
ORDER BY a.id -- your choice here
) AS a ;
Depending on the order you choose in the subquery, different rows from tableA will be selected.
Edit (after your update): the query above will not show rows from A that have no matching rows in B, nor rows that have matches but were not selected.
It could also be done with window functions, but Access does not have them. Here is a query that I think will work in Access:
SELECT
a.id, a.age, a.education,
s.id AS b_id, s.age AS b_age, s.education AS b_education
FROM tableA AS a
LEFT JOIN
( SELECT
b.id, b.age, b.education, MIN(a.id) AS a_id
FROM tableB AS b
JOIN tableA AS a
ON ABS(a.age - b.age) <= 2
AND ABS(a.education - b.education) <= 2
GROUP BY b.id, b.age, b.education
) AS s
ON a.id = s.a_id ;
I'm not sure if Access allows such a subquery but if it doesn't, you can define it as a "Query" and then use it in another.
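To check the MIN-per-B idea against the numbers in the question, here is a minimal sketch run in an in-memory SQLite database through Python's built-in sqlite3. SQLite is a stand-in for Access here, and the inner alias a2 is my own, so treat both as assumptions. Each row of tableB is assigned to the lowest matching tableA id, and tableA is then left-joined to that assignment.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # SQLite stands in for Access (assumption)
con.execute("CREATE TABLE tableA (id INTEGER, age INTEGER, education INTEGER)")
con.execute("CREATE TABLE tableB (id INTEGER, age INTEGER, education INTEGER)")
con.executemany("INSERT INTO tableA VALUES (?, ?, ?)",
                [(1, 23, 3), (2, 25, 6), (3, 22, 5)])
con.executemany("INSERT INTO tableB VALUES (?, ?, ?)",
                [(1, 26, 4), (2, 24, 6), (3, 21, 3)])

matches = con.execute("""
    SELECT a.id, a.age, a.education,
           s.id AS b_id, s.age AS b_age, s.education AS b_education
    FROM tableA AS a
    LEFT JOIN (
        -- One row per B: keep only the smallest matching A id
        SELECT b.id, b.age, b.education, MIN(a2.id) AS a_id
        FROM tableB AS b
        JOIN tableA AS a2
          ON ABS(a2.age - b.age) <= 2
         AND ABS(a2.education - b.education) <= 2
        GROUP BY b.id, b.age, b.education
    ) AS s ON a.id = s.a_id
    ORDER BY a.id, s.id
""").fetchall()

for row in matches:
    print(row)
```

On this data the result matches the desired output in the question, with A.ID = 3 left unmatched. The rule is "lowest A id wins", which is not necessarily the assignment that matches the most A rows overall.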

Use SELECT DISTINCT
SELECT DISTINCT A.id, A.age, A.education, B.age, B.education
FROM TableA as A LEFT JOIN TableB as B ON
abs(A.age - B.age) <= 2 AND
abs(A.education - B.education) <= 2

Transposing rows to columns using self join

I have a table named category with values as below,
CategoryId | Value | Flag
1 25 a
2 26 a
3 27 a
1 28 m
2 23 m
1 36 p
2 33 p
Now I want to transpose the rows present in this table to columns based on the flag, something like
CategoryId | aValue | mValue | PValue
1 25 28 36
2 26 23 33
3 27 null null
I am trying to join based on the category id but I am just getting the matched records (inner join) in my resultset even if I use left outer join in my query.
My query:
SELECT
A.CategoryId,
A.Value AS actual,
B.Value AS projected,
C.Value AS Manual
FROM ((a AS A left JOIN b AS B ON A.categoryid=B.categoryid)
left JOIN c AS C ON A.categoryid=C.categoryid)
WHERE (((A.flag)="a") and ((B.flag)="p") and ((C.flag) ="m"))
I am getting the proper results if I have the data in 3 different tables.
I just want to check what the best way would be to transpose rows to columns when using a self join...
Thanks,
Barani

Try this:
SELECT CategoryId,
MIN(SWITCH(YourTable.Flag = 'a',Value)) AS aValue,
MIN(SWITCH(YourTable.Flag = 'm',Value)) AS mValue,
MIN(SWITCH(YourTable.Flag = 'p',Value)) AS pValue
FROM YourTable
GROUP BY CategoryId
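SWITCH is specific to Access. The same conditional-aggregation idea can be sketched with a standard CASE expression, shown here on the sample rows in an in-memory SQLite database via Python's built-in sqlite3. SQLite is my assumption, and the rows use the reading of the sample data implied by the desired output (28 and 23 for flag m, 36 and 33 for flag p). A single pass with GROUP BY keeps category 3 even though it has no m or p row.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # SQLite stands in for Access (assumption)
con.execute("CREATE TABLE category (CategoryId INTEGER, Value INTEGER, Flag TEXT)")
con.executemany("INSERT INTO category VALUES (?, ?, ?)", [
    (1, 25, "a"), (2, 26, "a"), (3, 27, "a"),
    (1, 28, "m"), (2, 23, "m"),
    (1, 36, "p"), (2, 33, "p"),
])

pivoted = con.execute("""
    SELECT CategoryId,
           MIN(CASE WHEN Flag = 'a' THEN Value END) AS aValue,
           MIN(CASE WHEN Flag = 'm' THEN Value END) AS mValue,
           MIN(CASE WHEN Flag = 'p' THEN Value END) AS pValue
    FROM category
    GROUP BY CategoryId
    ORDER BY CategoryId
""").fetchall()

for row in pivoted:
    print(row)
```

As for the self-join attempt in the question: filtering B.flag and C.flag in the WHERE clause discards the rows where the left join produced NULLs, which is why it behaves like an inner join. Moving those flag tests into the ON clauses would avoid that.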
