I have 30 columns named p1, p2, p3, ..., p29, p30.
In each row, 6 consecutive columns are non-null and the rest are all null.
I need to write an SQL query (preferably for Redshift) that collects those values into 6 columns, say a1, a2, a3, a4, a5, a6.
E.g., if I have 50 rows of data with 30 mostly-null columns, I want to turn them into 50 rows containing only the 6 non-null values of each row.
There is no simple way to do this. One method is to unpivot and then re-aggregate -- assuming your table has a primary key:
select pk,
       max(case when seqnum = 1 then p end) as a1,
       max(case when seqnum = 2 then p end) as a2,
       max(case when seqnum = 3 then p end) as a3,
       max(case when seqnum = 4 then p end) as a4,
       max(case when seqnum = 5 then p end) as a5,
       max(case when seqnum = 6 then p end) as a6
from (select pk, p, row_number() over (partition by pk order by which) as seqnum
      from ((select pk, 1 as which, p1 as p from t) union all
            (select pk, 2 as which, p2 as p from t) union all
            . . .
           ) t
      where p is not null
     ) t
group by pk
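The unpivot-and-re-aggregate idea can be tried on a smaller case. The sketch below is not Redshift: it assumes Python's built-in sqlite3 module (SQLite 3.25 or later for window functions), a hypothetical table t with only 5 columns, and 2 consecutive non-null values per row, so the output has 2 columns instead of 6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: 5 columns, exactly 2 consecutive non-null values per row.
conn.executescript("""
    CREATE TABLE t (pk INTEGER PRIMARY KEY, p1, p2, p3, p4, p5);
    INSERT INTO t VALUES (1, NULL, 'a', 'b', NULL, NULL),
                         (2, NULL, NULL, NULL, 'c', 'd'),
                         (3, 'e', 'f', NULL, NULL, NULL);
""")

query = """
SELECT pk,
       MAX(CASE WHEN seqnum = 1 THEN p END) AS a1,
       MAX(CASE WHEN seqnum = 2 THEN p END) AS a2
FROM (-- number the surviving values per row, keeping the column order
      SELECT pk, p,
             ROW_NUMBER() OVER (PARTITION BY pk ORDER BY which) AS seqnum
      FROM (-- unpivot: one output row per (pk, column)
            SELECT pk, 1 AS which, p1 AS p FROM t UNION ALL
            SELECT pk, 2, p2 FROM t UNION ALL
            SELECT pk, 3, p3 FROM t UNION ALL
            SELECT pk, 4, p4 FROM t UNION ALL
            SELECT pk, 5, p5 FROM t) AS u
      WHERE p IS NOT NULL) AS s
GROUP BY pk
ORDER BY pk
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each row keeps its primary key and its non-null values in their original left-to-right order.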
I have some information on student names and their roll numbers.
I want to split them into 3 sets of columns. The total number of rows should always be the ceiling of (# of rows / 3).
You can do this with conditional aggregation. However, SQL tables represent unordered sets, so which values end up where is arbitrary:
select max(case when seqnum % 3 = 0 then name end) as name_1,
max(case when seqnum % 3 = 0 then roll end) as roll_1,
max(case when seqnum % 3 = 1 then name end) as name_2,
max(case when seqnum % 3 = 1 then roll end) as roll_2,
max(case when seqnum % 3 = 2 then name end) as name_3,
max(case when seqnum % 3 = 2 then roll end) as roll_3
from (select t.*, row_number() over (order by (select null)) - 1 as seqnum
from t
) t
group by floor(seqnum / 3);
If you have an ordering column, then use it instead of (select null).
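As a rough check of the row count, here is a sketch of the same query run through Python's sqlite3 module (an assumption; the answer's syntax is closer to SQL Server). The table name students and its seven rows are made-up sample data, and roll is used as the ordering column in place of (select null).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: 7 students, so ceil(7 / 3) = 3 output rows are expected.
conn.executescript("""
    CREATE TABLE students (name TEXT, roll INTEGER);
    INSERT INTO students VALUES ('Ann', 1), ('Bob', 2), ('Cid', 3), ('Dee', 4),
                                ('Eve', 5), ('Fay', 6), ('Gus', 7);
""")

query = """
SELECT MAX(CASE WHEN seqnum % 3 = 0 THEN name END) AS name_1,
       MAX(CASE WHEN seqnum % 3 = 0 THEN roll END) AS roll_1,
       MAX(CASE WHEN seqnum % 3 = 1 THEN name END) AS name_2,
       MAX(CASE WHEN seqnum % 3 = 1 THEN roll END) AS roll_2,
       MAX(CASE WHEN seqnum % 3 = 2 THEN name END) AS name_3,
       MAX(CASE WHEN seqnum % 3 = 2 THEN roll END) AS roll_3
FROM (SELECT s.*, ROW_NUMBER() OVER (ORDER BY roll) - 1 AS seqnum
      FROM students AS s) AS s
GROUP BY seqnum / 3      -- integer division in SQLite: 0,0,0,1,1,1,2
ORDER BY seqnum / 3
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The last output row is only partly filled, with NULLs in the unused column sets.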
I have a table with data as below.
I have managed to convert one column to multiple columns, but I need to convert multiple columns.
The expected output is:
Name Q1 G1 Q2 G2 Q3 G3
Antony HSE A Degree C NULL NULL
Bob HSE B Degree B Masters A
Marc HSE D Degree C Masters B
If those Qualifications have fixed values, then you can get that result via conditional aggregation.
SELECT
Name,
MAX(CASE WHEN Qualification = 'HSE' THEN Qualification END) AS Q1,
MAX(CASE WHEN Qualification = 'HSE' THEN Grade END) AS G1,
MAX(CASE WHEN Qualification = 'Degree' THEN Qualification END) AS Q2,
MAX(CASE WHEN Qualification = 'Degree' THEN Grade END) AS G2,
MAX(CASE WHEN Qualification = 'Masters' THEN Qualification END) AS Q3,
MAX(CASE WHEN Qualification = 'Masters' THEN Grade END) AS G3
FROM YourTable
GROUP BY Name
ORDER BY Name
If the qualification names aren't fixed, then you could generate a row_number and use that.
Then you can add as many Qn & Gn columns as a Name can have qualifications.
To find the highest number of qualifications any Name has: select top 1 [Name], count(*) Total from YourTable group by [Name] order by Total desc
SELECT
Name,
MAX(CASE WHEN RN = 1 THEN Qualification END) AS Q1,
MAX(CASE WHEN RN = 1 THEN Grade END) AS G1,
MAX(CASE WHEN RN = 2 THEN Qualification END) AS Q2,
MAX(CASE WHEN RN = 2 THEN Grade END) AS G2,
MAX(CASE WHEN RN = 3 THEN Qualification END) AS Q3,
MAX(CASE WHEN RN = 3 THEN Grade END) AS G3
FROM
(
SELECT Name, Qualification, Grade,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Qualification) AS RN
FROM YourTable
) q
GROUP BY Name
ORDER BY Name
Or do it dynamically:
declare @MaxTotalQualifications int = (select top 1 count(*) from YourTable group by [Name] order by count(*) desc);
declare @cols varchar(max);
WITH DIGITS(n) AS (
SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(n)
)
, NUMBERS(n) AS
(
SELECT ones.n + 10*tens.n + 100*hundreds.n + 1000*thousands.n
FROM DIGITS AS ones
CROSS JOIN DIGITS as tens
CROSS JOIN DIGITS as hundreds
CROSS JOIN DIGITS as thousands
)
select @cols = concat(@cols+CHAR(13)+CHAR(10)+', ', 'MAX(CASE WHEN RN = ', n ,' THEN Qualification END) AS [Q', n ,'], MAX(CASE WHEN RN = ', n ,' THEN Grade END) AS [G', n,']')
from NUMBERS
WHERE n BETWEEN 1 AND @MaxTotalQualifications;
-- select @MaxTotalQualifications as MaxTotalQualifications, @cols as cols;
declare @DynSql nvarchar(max);
set @DynSql = N'SELECT Name, '+ @cols + N'
FROM
(
SELECT Name, Qualification, Grade,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Qualification) AS RN
FROM YourTable
) q
GROUP BY Name
ORDER BY Name';
-- select @DynSql as DynSql;
exec(@DynSql);
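The same build-the-column-list idea can also be sketched outside T-SQL. The example below is an assumption-laden illustration, not the answer's method: it uses Python's sqlite3 module, builds the Qn/Gn column list in Python instead of with a numbers CTE, and loads only the Antony and Bob rows from the expected output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (Name TEXT, Qualification TEXT, Grade TEXT);
    INSERT INTO YourTable VALUES
        ('Antony', 'HSE', 'A'), ('Antony', 'Degree', 'C'),
        ('Bob', 'HSE', 'B'), ('Bob', 'Degree', 'B'), ('Bob', 'Masters', 'A');
""")

# Largest number of qualifications held by any single Name.
max_total = conn.execute(
    "SELECT MAX(cnt) FROM (SELECT COUNT(*) AS cnt FROM YourTable GROUP BY Name)"
).fetchone()[0]

# One Qn/Gn pair per possible row number. Only integers are interpolated,
# so no user-supplied text ends up in the generated SQL.
cols = ",\n       ".join(
    f"MAX(CASE WHEN RN = {n} THEN Qualification END) AS Q{n}, "
    f"MAX(CASE WHEN RN = {n} THEN Grade END) AS G{n}"
    for n in range(1, max_total + 1)
)

dyn_sql = f"""
SELECT Name,
       {cols}
FROM (SELECT Name, Qualification, Grade,
             ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Qualification) AS RN
      FROM YourTable) AS q
GROUP BY Name
ORDER BY Name
"""
rows = conn.execute(dyn_sql).fetchall()
for row in rows:
    print(row)
```

Because the row number is ordered by Qualification, the pairs come out alphabetically (Degree before HSE), which differs from the column order in the expected output; a dedicated ordering column would be needed to control that.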
Have a table:
Nomber Sce SceValue
10 A a1b2c3
20 C d2v3b4
10 B 42b2c3
10 B 5978c3
20 A edr432
I need to create the following listing, where all possible "Sce" and "SceValue" pairs are shown for each individual "Nomber" (9 pairs maximum):
Nomber Sce1 SceValue1 Sce2 SceValue2 ... Sce9 SceValue9
10 A a1b2c3 B 42b2c3 B 5978c3
20 C d2v3b4 A edr432
I would like to achieve this using a View. Is this possible?
You can use conditional aggregation:
select nomber,
max(case when seqnum = 1 then Sce end) as sce_1,
max(case when seqnum = 1 then SceValue end) as SceValue_1,
max(case when seqnum = 2 then Sce end) as sce_2,
max(case when seqnum = 2 then SceValue end) as SceValue_2,
. . .
max(case when seqnum = 9 then Sce end) as sce_9,
max(case when seqnum = 9 then SceValue end) as SceValue_9
from (select t.*,
row_number() over (partition by nomber order by sce) as seqnum
from t
) t
group by nomber;
I have the following table:
custID Cat
1 A
1 B
1 B
1 B
1 C
2 A
2 A
2 C
3 B
3 C
4 A
4 C
4 C
4 C
What I need is the most efficient way to aggregate by CustID in such a manner that I obtain the most frequent category (cat), the second most frequent and the third. The output of the above should be
custID  most freq  2nd most freq  3rd most freq
1       B          A              C
2       A          C              Null
3       B          C              Null
4       C          A              Null
When there is a tie in the count, I do not really care which is first and which is second. For example, for customer 1 the 2nd and 3rd most frequent could be swapped because each of them occurs only once.
Any SQL would be fine, preferably Hive SQL.
Thank you
Group by twice and use row_number() to order the categories by their count. (dense_rank() would give tied categories the same rank, so for customer 1 one of A and C would be lost and the third column would be null.) I'm not 100% sure, but I guess it should work in Hive as well.
select custId,
max(case when t.rn = 1 then cat end) as [most freq],
max(case when t.rn = 2 then cat end) as [2nd most freq],
max(case when t.rn = 3 then cat end) as [3rd most freq]
from
(
select custId, cat, row_number() over (partition by custId order by count(*) desc) rn
from your_table
group by custId, cat
) t
group by custId
Following the comments, here is a slightly modified solution that conforms to Hive SQL. It uses row_number() so that tied categories still get separate ranks:
select custId,
max(case when t.rn = 1 then cat else null end) as most_freq,
max(case when t.rn = 2 then cat else null end) as 2nd_most_freq,
max(case when t.rn = 3 then cat else null end) as 3rd_most_freq
from
(
select custId, cat, row_number() over (partition by custId order by ct desc) rn
from (
select custId, cat, count(*) ct
from your_table
group by custId, cat
) your_table_with_counts
) t
group by custId
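A sketch of the count-then-rank approach on the question's sample data, run through Python's sqlite3 module (an assumption; this is neither Hive nor SQL Server). It ranks with row_number() so that tied categories still occupy separate slots, and it breaks ties by cat, which is an arbitrary choice made here only to keep the output deterministic. The column names second_most_freq and third_most_freq are also chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (custID INTEGER, cat TEXT);
    INSERT INTO your_table VALUES
        (1,'A'),(1,'B'),(1,'B'),(1,'B'),(1,'C'),
        (2,'A'),(2,'A'),(2,'C'),
        (3,'B'),(3,'C'),
        (4,'A'),(4,'C'),(4,'C'),(4,'C');
""")

query = """
SELECT custID,
       MAX(CASE WHEN rn = 1 THEN cat END) AS most_freq,
       MAX(CASE WHEN rn = 2 THEN cat END) AS second_most_freq,
       MAX(CASE WHEN rn = 3 THEN cat END) AS third_most_freq
FROM (-- rank categories per customer by descending count, ties broken by cat
      SELECT custID, cat,
             ROW_NUMBER() OVER (PARTITION BY custID ORDER BY ct DESC, cat) AS rn
      FROM (-- first aggregation: count per (customer, category)
            SELECT custID, cat, COUNT(*) AS ct
            FROM your_table
            GROUP BY custID, cat) AS c) AS t
GROUP BY custID
ORDER BY custID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this tie-break the result matches the expected output in the question, including A before C for customer 1.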
Hive SQL demo
SELECT journal, count(*) as frequency
FROM ${hiveconf:TNHIVE}
WHERE journal IS NOT NULL
GROUP BY journal
ORDER BY frequency DESC
LIMIT 5;
I am trying to write a query that adds and subtracts values from different rows and columns, depending on the text in another column of the same table.
The problem is best explained with the following example.
Suppose I have a table named Outright with four columns and several records as follows
Product Term Bid Offer
------------------------------
A Aug14 P Q
A/B Aug14 R S
B Aug14 X Y
B Sep14 ab xy
B/C Sep14 pq rs
C Sep14 wx yz
When I run the query, it should look for products whose names contain a /; in the above case there are two such products, A/B and B/C. For each of them it should then look up the individual products on either side of the /. For example, for A/B it should look for products A and B with the same term as A/B, perform some operations, and return the data as follows
Product Term Bid Offer
------------------------------
A Aug14 a b
B Aug14 c d
B Sep14 ab cd
C Sep14 abc cde
in the above results
a=R+Y b=S+X
c=Q-S d=P-R
where P,Q,R,S,X,Y are Bid and Offer values from the table Outright
similar calculations are applied for all other data too like for B/C Sep14.. and many other
Example
Table Outright
A Oct14 -175 -75
B Oct14 125 215
A/B Oct14 NULL -150
Result should be
A Oct14 NULL -150+125=-25
B Oct14 -75-(-150)=75 NULL
The above values are calculated using the equations mentioned earlier
May I know a better way to solve it in SQL Server 2012?
OK, let's create some test data:
DECLARE @Outright TABLE
(
Product VARCHAR(10),
Term VARCHAR(10),
Bid VARCHAR(10),
Offer VARCHAR(10)
)
INSERT INTO @Outright
VALUES
('A', 'Aug14','P','Q'),
('A/B','Aug14','R','S'),
('B', 'Aug14','X','Y');
Next, a CTE that tries to follow the logic posted above and matches each single-product line to the multi-product line:
;WITH t AS
(
SELECT
a.*,
d.DRN,
d.Bid dBid,
d.Product dProduct,
d.Offer dOffer,
ROW_NUMBER() OVER (ORDER BY a.Product) RN
FROM @Outright a
OUTER APPLY
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY d.Product) DRN
FROM @Outright d
WHERE d.Product LIKE (a.Product + '/%')
OR d.Product LIKE ('%/' + a.Product)
) d
WHERE d.Product IS NOT NULL
)
Now we try to implement the + - rules as stated above (bids to offers, offers to bids, etc)
SELECT
*,
CASE WHEN RN = 1 THEN FE1_1 + '+' + FE1_2 ELSE FE1_1 + '-' + FE1_2 END Col1,
CASE WHEN RN = 1 THEN FE2_1 + '+' + FE2_2 ELSE FE2_1 + '-' + FE2_2 END Col2
FROM
(
SELECT
MAX(CASE WHEN RN = 1 THEN Product END) Prod1,
MAX(CASE WHEN RN = 1 THEN Term END) Term1,
MAX(CASE WHEN RN = 1 THEN dBid END) FE1_1,
MAX(CASE WHEN RN = 2 THEN Offer END) FE1_2,
MAX(CASE WHEN RN = 2 THEN dOffer END) FE2_1,
MAX(CASE WHEN RN = 2 THEN Bid END) FE2_2,
1 RN
FROM t
UNION ALL
SELECT
MAX(CASE WHEN RN = 2 THEN Product END) Prod2,
MAX(CASE WHEN RN = 2 THEN Term END) Term2,
MAX(CASE WHEN RN = 1 THEN Offer END) FE3_1,
MAX(CASE WHEN RN = 2 THEN dOffer END) FE3_2,
MAX(CASE WHEN RN = 1 THEN Bid END) FE4_1,
MAX(CASE WHEN RN = 2 THEN dBid END) FE4_2,
2 RN
FROM t
) d
Here is the output, with some extra columns to show the data being pulled
Prod1 Term1 FE1_1 FE1_2 FE2_1 FE2_2 RN Col1 Col2
A Aug14 R Y S X 1 R+Y S+X
B Aug14 Q S P R 2 Q-S P-R
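To check the equations against the numeric Oct14 example from the question, here is a sketch that uses a different technique from the CTE above: a plain self-join that matches each X/Y row to its two legs within the same term. It assumes Python's sqlite3 module and numeric Bid and Offer columns; the aliases s, a and b are names chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Outright (Product TEXT, Term TEXT, Bid INTEGER, Offer INTEGER);
    INSERT INTO Outright VALUES ('A',   'Oct14', -175,  -75),
                                ('B',   'Oct14',  125,  215),
                                ('A/B', 'Oct14', NULL, -150);
""")

# s = the 'A/B' row (R, S), a = first leg (P, Q), b = second leg (X, Y).
legs = """
FROM Outright AS s
JOIN Outright AS a ON a.Term = s.Term
JOIN Outright AS b ON b.Term = s.Term
                  AND s.Product = a.Product || '/' || b.Product
"""
query = f"""
SELECT a.Product, a.Term,
       s.Bid + b.Offer AS Bid,      -- a = R + Y
       s.Offer + b.Bid AS Offer     -- b = S + X
{legs}
UNION ALL
SELECT b.Product, b.Term,
       a.Offer - s.Offer,           -- c = Q - S
       a.Bid - s.Bid                -- d = P - R
{legs}
ORDER BY 1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Any term that involves a NULL stays NULL, which reproduces the NULL entries in the expected result for A and B.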