Grouping like records - sql

We have a set of data where there are 2 records for each CODE. An example of the data is this:
TICKER CODE SCORE PRICE PCF
---------------------------------------
ABC 23 A 100 20
DEF 23 B 200 30
XXX 52 C 300 40
YYY 52 D 400 50
GHI 86 E 500 60
JKL 86 F 600 70
MNO 27 G 700 80
PQR 27 H 800 90
So, what we need to do is create a query which will return the columns of the like records by CODE into 1 record like this:
CODE,TICKER_1,SCORE_1,PRICE_1,PCF_1,TICKER_2,SCORE_2,PRICE_2,PCF_2
23,ABC,A,100,20,DEF,B,200,30
52,XXX,C,300,40,YYY,D,400,50
86,GHI,E,500,60,JKL,F,600,70
27,MNO,G,700,80,PQR,H,800,90
That is, the two records sharing each CODE value are combined into a single row.

You can use ROW_NUMBER to assign a row number to each entry within a code, then use MAX with a CASE expression to pick out the columns of each entry.
For example:
SELECT
    CODE,
    MAX(CASE WHEN rn=1 THEN TICKER END) AS TICKER_1,
    MAX(CASE WHEN rn=1 THEN SCORE END) AS SCORE_1,
    MAX(CASE WHEN rn=1 THEN PRICE END) AS PRICE_1,
    MAX(CASE WHEN rn=1 THEN PCF END) AS PCF_1,
    MAX(CASE WHEN rn=2 THEN TICKER END) AS TICKER_2,
    MAX(CASE WHEN rn=2 THEN SCORE END) AS SCORE_2,
    MAX(CASE WHEN rn=2 THEN PRICE END) AS PRICE_2,
    MAX(CASE WHEN rn=2 THEN PCF END) AS PCF_2
FROM (
    SELECT
        m.*,
        ROW_NUMBER() OVER (PARTITION BY CODE ORDER BY TICKER) as rn
    FROM mytable m
) m1
GROUP BY
    CODE
Outputs:
CODE TICKER_1 SCORE_1 PRICE_1 PCF_1 TICKER_2 SCORE_2 PRICE_2 PCF_2
-------------------------------------------------------------------
23   ABC      A       100     20    DEF      B       200     30
27   MNO      G       700     80    PQR      H       800     90
52   XXX      C       300     40    YYY      D       400     50
86   GHI      E       500     60    JKL      F       600     70
For debugging purposes, the output of the subquery
SELECT
    m.*,
    ROW_NUMBER() OVER (PARTITION BY CODE ORDER BY TICKER) as rn
FROM mytable m
looks like this:
TICKER CODE SCORE PRICE PCF RN
-------------------------------
ABC    23   A     100   20  1
DEF    23   B     200   30  2
MNO    27   G     700   80  1
PQR    27   H     800   90  2
XXX    52   C     300   40  1
YYY    52   D     400   50  2
GHI    86   E     500   60  1
JKL    86   F     600   70  2
Notable alternatives
Instead of CASE WHEN rn=1 THEN TICKER END, you could also use the DECODE function available in Oracle, as in DECODE(rn,1,TICKER).
You may also use a pivot as shown below (NB. Column names are not as in the expected result)
WITH cte as (
    SELECT
        m.*,
        ROW_NUMBER() OVER (PARTITION BY CODE ORDER BY TICKER) as rn
    FROM mytable m
)
SELECT * FROM cte
PIVOT (
    MAX(TICKER) as "TICKER",
    MAX(SCORE) as "SCORE",
    MAX(PRICE) as "PRICE",
    MAX(PCF) as "PCF"
    FOR rn IN (1,2)
)
Outputs:
CODE 1_TICKER 1_SCORE 1_PRICE 1_PCF 2_TICKER 2_SCORE 2_PRICE 2_PCF
-------------------------------------------------------------------
23   ABC      A       100     20    DEF      B       200     30
27   MNO      G       700     80    PQR      H       800     90
52   XXX      C       300     40    YYY      D       400     50
86   GHI      E       500     60    JKL      F       600     70

Related

SQL - Order Data on a Column without including it in ranking

So I have a scenario where I need to order data on a column without including it in dense_rank(). Here is my sample data set:
This is the table:
create table temp
(
id integer,
prod_name varchar(max),
source_system integer,
source_date date,
col1 integer,
col2 integer);
This is the dataset:
insert into temp
(id,prod_name,source_system,source_date,col1,col2)
values
(1,'ABC',123,'01/01/2021',50,60),
(2,'ABC',123,'01/15/2021',50,60),
(3,'ABC',123,'01/30/2021',40,60),
(4,'ABC',123,'01/30/2021',40,70),
(5,'XYZ',456,'01/10/2021',80,30),
(6,'XYZ',456,'01/12/2021',75,30),
(7,'XYZ',456,'01/20/2021',75,30),
(8,'XYZ',456,'01/20/2021',99,30);
Now, I want to do dense_rank() on the data in such a way that for a combination of "prod_name and source_system", the rank gets incremented only if there is a change in col1 or col2 but the data should still be in ascending order of source_date.
Here is the expected result:
id prod_name source_system source_date col1 col2 Dense_Rank
------------------------------------------------------------
1  ABC       123           01-01-21    50   60   1
2  ABC       123           15-01-21    50   60   1
3  ABC       123           30-01-21    40   60   2
4  ABC       123           30-01-21    40   70   3
5  XYZ       456           10-01-21    80   30   1
6  XYZ       456           12-01-21    75   30   2
7  XYZ       456           20-01-21    75   30   2
8  XYZ       456           20-01-21    99   30   3
As you can see above, the dates are changing but the expectation is that rank should only change if there is any change in either col1 or col2.
If I use this query
select id,prod_name,source_system,source_date,col1,col2,
dense_rank() over(partition by prod_name,source_system order by source_date,col1,col2) as rnk
from temp;
Then the result would come as:
id prod_name source_system source_date col1 col2 rnk
-----------------------------------------------------
1  ABC       123           01-01-21    50   60   1
2  ABC       123           15-01-21    50   60   2
3  ABC       123           30-01-21    40   60   3
4  ABC       123           30-01-21    40   70   4
5  XYZ       456           10-01-21    80   30   1
6  XYZ       456           12-01-21    75   30   2
7  XYZ       456           20-01-21    75   30   3
8  XYZ       456           20-01-21    99   30   4
And, if I exclude source_date from order by in rank function i.e.
select id,prod_name,source_system,source_date,col1,col2,
dense_rank() over(partition by prod_name,source_system order by col1,col2) as rnk
from temp;
Then my result is coming as:
id prod_name source_system source_date col1 col2 rnk
-----------------------------------------------------
3  ABC       123           30-01-21    40   60   1
4  ABC       123           30-01-21    40   70   2
1  ABC       123           01-01-21    50   60   3
2  ABC       123           15-01-21    50   60   3
6  XYZ       456           12-01-21    75   30   1
7  XYZ       456           20-01-21    75   30   1
5  XYZ       456           10-01-21    80   30   2
8  XYZ       456           20-01-21    99   30   3
Both the results are incorrect. How can I get the expected result? Any guidance would be helpful.
WITH cte AS (
    SELECT *,
           LAG(col1) OVER (PARTITION BY prod_name, source_system ORDER BY source_date, id) lag1,
           LAG(col2) OVER (PARTITION BY prod_name, source_system ORDER BY source_date, id) lag2
    FROM temp
)
SELECT *,
       SUM(CASE WHEN (col1, col2) = (lag1, lag2)
                THEN 0
                ELSE 1
           END) OVER (PARTITION BY prod_name, source_system ORDER BY source_date, id) AS `Dense_Rank`
FROM cte
ORDER BY id;
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=ac70104c7c5dfb49c75a8635c25716e6
When comparing multiple columns, I like to look at the previous values of the ordering column, rather than the individual columns. This makes it much simpler to add more and more columns.
The basic idea is to do a cumulative sum of changes for each prod/source system. In Redshift, I would phrase this as:
select t.*,
       sum(case when prev_date = prev_date_2 then 0 else 1 end) over (
           partition by prod_name, source_system
           order by source_date, id
           rows between unbounded preceding and current row
       ) as rnk
from (select t.*,
             lag(source_date) over (partition by prod_name, source_system order by source_date, id) as prev_date,
             lag(source_date) over (partition by prod_name, source_system, col1, col2 order by source_date, id) as prev_date_2
      from temp t
     ) t
order by id;
I think I have the syntax right for Redshift. Here is a db<>fiddle using Postgres.
Note that ties on the date can cause a problem -- regardless of the solution. This uses the id to break the ties. Perhaps id can just be used in general, but your code is using the date, so this uses the date with the id.
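For anyone who wants to try the running-sum idea without a database server, here is a minimal sketch that runs the LAG-based variant against an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) is an assumption made purely for illustration and is not the asker's engine; the table name temp_rows is hypothetical, and the dates are stored as ISO strings so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# temp_rows is a hypothetical stand-in for the asker's temp table.
conn.execute(
    "CREATE TABLE temp_rows (id INTEGER, prod_name TEXT, source_system INTEGER,"
    " source_date TEXT, col1 INTEGER, col2 INTEGER)"
)
conn.executemany(
    "INSERT INTO temp_rows VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "ABC", 123, "2021-01-01", 50, 60),
        (2, "ABC", 123, "2021-01-15", 50, 60),
        (3, "ABC", 123, "2021-01-30", 40, 60),
        (4, "ABC", 123, "2021-01-30", 40, 70),
        (5, "XYZ", 456, "2021-01-10", 80, 30),
        (6, "XYZ", 456, "2021-01-12", 75, 30),
        (7, "XYZ", 456, "2021-01-20", 75, 30),
        (8, "XYZ", 456, "2021-01-20", 99, 30),
    ],
)

query = """
WITH cte AS (
    SELECT *,
           LAG(col1) OVER (PARTITION BY prod_name, source_system
                           ORDER BY source_date, id) AS lag1,
           LAG(col2) OVER (PARTITION BY prod_name, source_system
                           ORDER BY source_date, id) AS lag2
    FROM temp_rows
)
SELECT id,
       -- Add 1 whenever col1/col2 differ from the previous row. The first row
       -- of each partition has NULL lags, so it also falls into ELSE 1.
       SUM(CASE WHEN col1 = lag1 AND col2 = lag2 THEN 0 ELSE 1 END)
           OVER (PARTITION BY prod_name, source_system
                 ORDER BY source_date, id) AS rnk
FROM cte
ORDER BY id
"""
ranks = conn.execute(query).fetchall()
for row in ranks:
    print(row)
```

The id column is included in every ORDER BY so that the two rows sharing a date are processed in a fixed order, which is the tie-breaking point made above.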

PostgreSQL Pivot Table

I need to make a PIVOT table from Source like this table
FactID UserID QTY Product
1 10 100 A
2 10 200 B
3 10 300 C
4 12 50 A
5 12 60 B
6 12 70 C
7 15 500 A
8 15 550 B
9 15 600 C
Need Pivot Like this
UserID A B C
10 100 200 300
12 50 60 70
15 500 550 600
My try
Select UserID,
CASE WHEN product = 'A' then QTY end as A,
CASE WHEN product = 'B' then QTY end as B,
CASE WHEN product = 'C' then QTY end as C
from public.table
And Result
UserID A B C
10 100 100 100
10 200 200 200
10 300 300 300
12 50 50 50
12 60 60 60
12 70 70 70
15 500 500 500
15 550 550 550
15 600 600 600
Where's my mistake? Maybe there's another way to do it?
Very close. You just need aggregation:
Select UserID,
SUM(CASE WHEN product = 'A' then QTY end) as A,
SUM(CASE WHEN product = 'B' then QTY end) as B,
SUM(CASE WHEN product = 'C' then QTY end) as C
from public.table
group by UserId;
In Postgres, though, this would normally use the FILTER clause instead of CASE:
Select UserID,
SUM(qty) FILTER (WHERE product = 'A') as A,
SUM(qty) FILTER (WHERE product = 'B') as B,
SUM(qty) FILTER (WHERE product = 'C') as C
from public.table
group by UserId;
You need an aggregate function, for example:
Select UserID,
Max(CASE WHEN product = 'A' then QTY end) as A,
Max(CASE WHEN product = 'B' then QTY end) as B,
Max(CASE WHEN product = 'C' then QTY end) as C
from public.table
Group by userid
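To see why the aggregation matters, here is a small sketch run against an in-memory SQLite database from Python. SQLite and the table name facts are assumptions for illustration only (the question targets Postgres and calls the table public.table). It runs the CASE expressions once without GROUP BY and once with SUM and GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# facts is a hypothetical name for the source table in the question.
conn.execute(
    "CREATE TABLE facts (FactID INTEGER, UserID INTEGER, QTY INTEGER, Product TEXT)"
)
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        (1, 10, 100, "A"), (2, 10, 200, "B"), (3, 10, 300, "C"),
        (4, 12, 50, "A"), (5, 12, 60, "B"), (6, 12, 70, "C"),
        (7, 15, 500, "A"), (8, 15, 550, "B"), (9, 15, 600, "C"),
    ],
)

# Without aggregation: one output row per source row, each with a single
# non-NULL product column.
unaggregated = conn.execute("""
    SELECT UserID,
           CASE WHEN Product = 'A' THEN QTY END AS A,
           CASE WHEN Product = 'B' THEN QTY END AS B,
           CASE WHEN Product = 'C' THEN QTY END AS C
    FROM facts
    ORDER BY FactID
""").fetchall()

# With aggregation: GROUP BY collapses the rows of each user into one row,
# and SUM ignores the NULLs produced by the CASE expressions.
pivoted = conn.execute("""
    SELECT UserID,
           SUM(CASE WHEN Product = 'A' THEN QTY END) AS A,
           SUM(CASE WHEN Product = 'B' THEN QTY END) AS B,
           SUM(CASE WHEN Product = 'C' THEN QTY END) AS C
    FROM facts
    GROUP BY UserID
    ORDER BY UserID
""").fetchall()

print(len(unaggregated), "rows without GROUP BY")
for row in pivoted:
    print(row)
```

With exactly one row per user and product, SUM and MAX give the same result; they differ only when a user has several rows for the same product.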

Convert sql output to following format

I want to convert SQL output to the following format.
Here is my table.
Id Country Code Totalcount
1 India 20 120
2 India 21 121
3 India 22 122
4 India 23 123
5 India 24 124
6 US 20 220
7 US 21 221
8 Us 22 222
9 UK 23 323
10 UK 24 324
Select Country, 20,21,22,23,24,25
from
(
Select Country,StatusCode,Totalcount from StatusDetails
) as SourceTable
Pivot
(
sum(Totalcount) for StatusCode in (20,21,22,23,24,25)
) as PivotTable
I need output like the one below. Do I need to apply a pivot table?
Country  20   21   22   23   24
India    120  121  122  123  124
US       220  221  222
UK                      323  324
I am a fan of conditional aggregation for this purpose:
select country,
max(case when code = 20 then totalcount end) as cnt_20,
max(case when code = 21 then totalcount end) as cnt_21,
max(case when code = 22 then totalcount end) as cnt_22,
max(case when code = 23 then totalcount end) as cnt_23,
max(case when code = 24 then totalcount end) as cnt_24
from sourcetable
group by country
Yes, you can use PIVOT. Your code would also work once the pivoted column names are quoted with square brackets:
select pt.*
from (select Country, Code, Totalcount
      from sourcetable
     ) as SourceTable
pivot (sum(Totalcount) for Code in ([20],[21],[22],[23],[24],[25])
      ) as pt;

Query for get the specific record

Hi, I have a table with records as follows. Every item has some variants, each with a quantity. I want to fetch only the records of items that have values in at least 3 variants. (Item a has quantities in 3 variants, but b has quantities in only 2, so I need the records of a but not those of b.)
a 80 2
a 85 3
a 90 4
b 85 2
b 90 1
c 80 34
c 85 45
c 90 56
c 95 67
d 80 5
d 85 3
d 90 124
d 95 23
d 100 98
e 95 4
f 80 3
f 85 232
f 90 2
f 95 3
f 100 34
Result should be:
a 80 2
a 85 3
a 90 4
c 80 34
c 85 45
c 90 56
c 95 67
d 80 5
d 85 3
d 90 124
d 95 23
d 100 98
f 80 3
f 85 232
f 90 2
f 95 3
f 100 34
You can try with left join/is not null:
select t1.*
from tbl t1
left join ( select item
            from tbl
            group by item
            having count(item) >= 3) t2 on t1.item = t2.item
where t2.item is not null
or in:
select t1.*
from tbl t1
where t1.item in ( select item
                   from tbl
                   group by item
                   having count(item) >= 3)
or exists:
select t1.*
from tbl t1
where exists ( select *
               from tbl
               where item = t1.item
               group by item
               having count(item) >= 3)
select * from a t1 where (select count(*) from a t2 where t2.x=t1.x)>2;
a is the table name, x is the first column's name.
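As a quick check of the IN variant, here is a sketch using an in-memory SQLite database from Python. SQLite is an assumption for illustration, and the column names variant and qty are hypothetical, since the question does not name the second and third columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# variant and qty are hypothetical column names; only item appears in the answers.
conn.execute("CREATE TABLE tbl (item TEXT, variant INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        ("a", 80, 2), ("a", 85, 3), ("a", 90, 4),
        ("b", 85, 2), ("b", 90, 1),
        ("c", 80, 34), ("c", 85, 45), ("c", 90, 56), ("c", 95, 67),
        ("d", 80, 5), ("d", 85, 3), ("d", 90, 124), ("d", 95, 23), ("d", 100, 98),
        ("e", 95, 4),
        ("f", 80, 3), ("f", 85, 232), ("f", 90, 2), ("f", 95, 3), ("f", 100, 34),
    ],
)

# Keep every row whose item appears in at least 3 rows of the table.
kept = conn.execute("""
    SELECT t1.*
    FROM tbl t1
    WHERE t1.item IN (SELECT item
                      FROM tbl
                      GROUP BY item
                      HAVING COUNT(*) >= 3)
    ORDER BY t1.item, t1.variant
""").fetchall()

print(sorted({row[0] for row in kept}))  # items that survive the filter
print(len(kept), "rows kept")
```

Items b and e have fewer than 3 rows and drop out; all rows of the remaining items are returned unchanged.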

Dynamically pivot to fixed number of columns

This is the structure of my data
Name TransID Amount
Joe 123 56
Joe 124 55
Joe 125 58
Tom 126 31
Tom 127 48
I have a requirement to report from this data in the below format
Name Amount1 Amount2
Joe 56 55
Joe 58
Tom 31 48
Joe has three Amounts in the original data set, but I need a fixed number of columns (two) in the view. Therefore, the third Amount for Joe is inserted as a new record in the view. Is it possible to achieve this with a stored procedure or by creating a view?
Break the problem into smaller steps. These are the steps that I would take:
Use ROW_NUMBER() OVER (PARTITION BY ...):
Name TransID Amount Row_Number
Joe 123 56 1
Joe 124 55 2
Joe 125 58 3
Tom 126 31 1
Tom 127 48 2
Subtract 1.
Name TransID Amount RowNumberStartingWith0
Joe 123 56 0
Joe 124 55 1
Joe 125 58 2
Tom 126 31 0
Tom 127 48 1
Divide it by 2, get the result of the division and the remainder modulo 2:
Name TransID Amount Result Remainder
Joe 123 56 0 0
Joe 124 55 0 1
Joe 125 58 1 0
Tom 126 31 0 0
Tom 127 48 0 1
Drop the TransID column. The remainder is always 0 or 1, so you can pivot on it:
Name Result AmountForRemainder0 AmountForRemainder1
Joe 0 56 55
Joe 1 58
Tom 0 31 48
Now you drop the Result column and rename your columns:
Name Amount1 Amount2
Joe 56 55
Joe 58
Tom 31 48
Profit.
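The steps above can be sketched in one query. This is only an illustration under stated assumptions: it runs on an in-memory SQLite database from Python (SQLite 3.25 or later for ROW_NUMBER), the table name payments is hypothetical, and rows are numbered by TransID because the question does not say which order to use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# payments is a hypothetical name for the table in the question.
conn.execute("CREATE TABLE payments (Name TEXT, TransID INTEGER, Amount INTEGER)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        ("Joe", 123, 56), ("Joe", 124, 55), ("Joe", 125, 58),
        ("Tom", 126, 31), ("Tom", 127, 48),
    ],
)

query = """
WITH numbered AS (
    -- Steps 1 and 2: row number per name, shifted to start at 0.
    SELECT Name, Amount,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY TransID) - 1 AS rn0
    FROM payments
)
SELECT Name,
       -- Steps 3 and 4: the remainder (rn0 % 2) picks the column,
       -- the integer quotient (rn0 / 2) picks the output row.
       MAX(CASE WHEN rn0 % 2 = 0 THEN Amount END) AS Amount1,
       MAX(CASE WHEN rn0 % 2 = 1 THEN Amount END) AS Amount2
FROM numbered
GROUP BY Name, rn0 / 2
ORDER BY Name, rn0 / 2
"""
report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

Here the pivot step is written as conditional aggregation rather than a PIVOT clause, which is a substitution on my part; on engines that have PIVOT, pivoting on the remainder as described above should work the same way. Changing the divisor from 2 to N, with N CASE columns, would give N amounts per row.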
Try this and let me know. Check it with other sample data as well. I am getting the desired output.
DECLARE @t TABLE (
    NAME VARCHAR(50)
    ,TransID INT
    ,Amount INT
)
INSERT INTO @t
VALUES ('Joe',123,56)
    ,('Joe',124,55)
    ,('Joe',125,58)
    ,('Tom',126,31)
    ,('Tom',127,48)
    ,('Tom',128,89)
    ,('Tom',129,90)
    ,('Joe',130,68);
WITH CTE
AS (
    SELECT *
        ,row_number() OVER (
            PARTITION BY NAME ORDER BY amount
        ) rn
    FROM @t
)
,CTE1
AS (
SELECT NAME
,(
SELECT amount
FROM cte
WHERE rn = 1
AND NAME = a.NAME
) [Amount1]
,(
SELECT amount
FROM cte
WHERE rn = 2
AND NAME = a.NAME
) [Amount2]
,rn
FROM cte A
WHERE rn = 1
UNION ALL
SELECT b.NAME
,a.amount
,isnull(c.amount, 0)
,a.rn
FROM CTE1 B
INNER JOIN CTE A ON a.NAME = b.NAME
AND a.rn % 2 <> 0
AND a.rn > 1
AND b.rn <> a.rn
OUTER APPLY (
SELECT Amount
FROM CTE C
WHERE NAME = b.NAME
AND rn % 2 = 0
AND rn > 2
) c
)
SELECT *
FROM cte1