How to transform Table in hive

How to transform Table in hive - hive

I have a table in Hive in below format.
std_id std_name sub_id sub_name marks
1 xxx 123 Eng 70
1 xxx 125 Maths 90
1 xxx 124 Science 80
I want a table in below format as output.
std_id std_name Eng Science Maths
1 xxx 70 80 90
how can i get the output with using Hive Query..

Use MAX ( CASE.. ) with GROUP BY
SELECT std_id
,std_name
,MAX(CASE
WHEN sub_name = 'Eng'
THEN marks
END) AS Eng
,MAX(CASE
WHEN sub_name = 'Science'
THEN marks
END) AS Science
,MAX(CASE
WHEN sub_name = 'Maths'
THEN marks
END) AS Maths
FROM t
GROUP BY std_id
,std_name;

Related

SQL Queries - Count total of rows with Case when statement

I will like to count to get the correct number of rows when the [Location] column match a certain value.
Below is my table:
Student
Marks
Location
Date
Kenn
66
UK
09-01-2022
Kenn
89
UK
09-01-2022
Kenn
77
Canada
09-01-2022
Below SQL queries is what I have tried:
SELECT [Student]
,COUNT(CASE WHEN [Location] = 'UK' THEN [Marks] ELSE 0 END) AS UK
,COUNT(CASE WHEN [Location] = 'Canada' THEN [Marks] ELSE 0 END) AS Canada
FROM table_name
GROUP BY [Student]
But the output is
Student
UK
Canada
Kenn
3
3
What I expected to see is:
Student
UK
Canada
Kenn
2
1
Please advise if anything wrong with my SQL queries?
Thank you!!

Use sum instead
SELECT [Student]
,SUM(CASE WHEN [Location] = 'UK' THEN 1 ELSE 0 END) AS UK
,SUM(CASE WHEN [Device] = 'Canada' THEN 1 ELSE 0 END) AS Canada
FROM table_name
GROUP BY [Student]

Transact-SQL Select Query Giving duplicate values

first off, thank you for taking the time to look at this.
I am trying to collect data from 3 tables and make a reference chart that allows the end user to see the stored data.
Basically I have 3 tables for this example:
USERS:
USER_PK USER_ID USER_NAME
1 10000 Bob
2 10001 Sally
3 10003 Joe
4 10004 Susan
SKILL_TYPE:
SKILL_PK SKILL_NAME
11 Point of Sale
22 Digital Sales
33 Customer Service
44 Specialist Support
SKILL_ASSOCIATION:
SKILL_ASSOC_PK SKILL_PK USER_PK START_DATE STOP_DATE Priority
99 11 1 36526 500000 2
88 11 2 36527 500000 3
77 22 1 36526 500000 3
66 33 3 36528 500000 1
55 44 4 36525 500000 1
444 33 4 36525 500000 4
(I know I've probably broken some rules with cataloging this data I did it in SQL Express, however it is only an example and not representative of the real data I am using)
My Select Query Returns an unwanted result with multiple lines for each USER:
Statement:
SELECT USERS.[USER_NAME], USERS.[USER_ID],
(CASE WHEN ST.SKILL_NAME ='Point of Sale' Then SA.[PRIORITY] END) AS 'POS',
(CASE WHEN ST.SKILL_NAME ='Digital Sales' Then SA.[PRIORITY] END) AS 'DS',
(CASE WHEN ST.SKILL_NAME ='Customer Service' Then SA.[PRIORITY] END) AS 'CS',
(CASE WHEN ST.SKILL_NAME ='Specialist Support' Then SA.[PRIORITY] END) AS 'Spec'
FROM USERS
INNER JOIN [dbo].[SKILL_ASSOCIATION] AS SA ON SA.USER_PK = USERS.USER_PK
INNER JOIN SKILL_TYPE AS ST ON ST.SKILL_PK = SA.SKILL_PK
Result:
USER_NAME USER_ID POS DS CS Spec
Bob 10000 2 NULL NULL NULL
Sally 10001 3 NULL NULL NULL
Bob 10000 NULL 3 NULL NULL
Joe 10003 NULL NULL 1 NULL
Susan 10004 NULL NULL NULL 1
Susan 10004 NULL NULL 4 NULL
I've tried using distinct as well with similar results.
Desired Results:
NAME ID POS DS CS Spec
Bob 1 2 3
Sally 2 3
Joe 1
Susan 4 1
I have very limited Query access with this SQL Server and cannot create/modify or delete from it to accomplish my objective.
Any guidance would be much appreciated!
Thanks,
Steven

Your expected output implies that an aggregation by user along with taking the MAX of each of the CASE expressions should work:
SELECT
u.[USER_NAME],
u.[USER_ID],
MAX(CASE WHEN ST.SKILL_NAME = 'Point of Sale' THEN SA.[PRIORITY] END) AS POS,
MAX(CASE WHEN ST.SKILL_NAME = 'Digital Sales' THEN SA.[PRIORITY] END) AS DS,
MAX(CASE WHEN ST.SKILL_NAME = 'Customer Service' THEN SA.[PRIORITY] END) AS CS,
MAX(CASE WHEN ST.SKILL_NAME = 'Specialist Support' THEN SA.[PRIORITY] END) AS Spec
FROM USERS u
INNER JOIN [dbo].[SKILL_ASSOCIATION] AS SA
ON SA.USER_PK = u.USER_PK
INNER JOIN SKILL_TYPE AS ST
ON ST.SKILL_PK = SA.SKILL_PK
GROUP BY
u.[USER_NAME],
u.[USER_ID];

sum of multiple fields in group by clause and create single row

I have a table like this.
Id Name Test Subject Marks
----------------------------
1 Alex 1 Maths 40
1 Alex 2 Maths 80
1 Alex 1 Sociology 55
1 Alex 2 Sociology 70
1 Alex 3 Sociology 60
2 Mark 1 Maths 30
2 Mark 2 Maths 60
2 Mark 1 Sociology 40
2 Mark 2 Sociology 50
2 Mark 3 Sociology 30
What I need is a group by on Id, Name, Subject and sum(Marks) in a single row, to give a result like:
Id Name Maths Sociology
-----------------------
1 Alex 120 185
2 Mark 90 120
I can get this as:
Id Name Marks
--------------
1 Alex 120
2 Mark 90
or:
Id Name Marks
-------------
1 Alex 185
2 Mark 120
I tried multiple options but I'm getting multiple rows for each ID.
The query below is not working:
select
Id, Name, Sum(Marks) where Subject = 'Maths' as Maths,Sum(Marks ) where Subject = 'Sociology' as Sociology
from
Table
group by
Id, Name;
I get this error:
"Error while compiling statement: FAILED: ParseException line 3:66 missing EOF at 'as' near ''Maths''

You are looking for PIVOT. Try :
create table #tbl(Id int, Name varchar(20), Test int, Subject varchar(20), Marks int)
insert into #tbl values
(1,'Alex',1,'Maths', 40 ),
(1,'Alex',2,'Maths', 80 ),
(1,'Alex',1,'Sociology', 55),
(1,'Alex',2,'Sociology', 70),
(1,'Alex',3,'Sociology', 60),
(2,'Mark',1,'Maths', 30 ),
(2,'Mark',2,'Maths', 60 ),
(2,'Mark',1,'Sociology', 40),
(2,'Mark',2,'Sociology', 50),
(2,'Mark',3,'Sociology', 30)
--select * from #tbl
SELECT ID,Name,Maths,Sociology
FROM(
SELECT ID, Name, Subject, Marks
FROM #tbl
) tbl
PIVOT(
SUM(Marks) FOR Subject IN (Maths, Sociology)
) piv
output:
ID Name Maths Sociology
----------- ---- ------------ -----------
1 Alex 120 185
2 Mark 90 120

You can use a case expression to effectively filter the values going into your sum:
select Id
,Name
,sum(case when Subject = 'Maths' then Marks else 0 end) as Maths
,sum(case when Subject = 'Sociology' then Marks else 0 end) as Sociology
from Table
group by Id
,Name;
If you are looking to do this across a lot of different subject values however, you will need to look at using pivot:
select Id
,[Name]
,[Maths]
,[Sociology]
from (select Id, [Name], [Subject], Marks from #t) as t
pivot(sum(Marks)
for [Subject] in(Maths,Sociology)
) as p;

Another solution using PIVOT operator:
SELECT
ID
,Name
,Maths
,Sociology
FROM(
SELECT
ID
,Name
,Subject
,Marks
FROM your_table
) tbl
PIVOT(
SUM(Marks) FOR Subject IN (Maths, Sociology)
) pvt

You can use below query :
select id,name,maths_marks,sociology_marks
from
(select id,name,test,subject,marks
from table)
pivot(sum(marks) for subject in ('Maths' as Maths_marks,'Sociology' as Sociology_marks);

Denormalizing using query in Oracle

I've a table like this:
STU_NAME SUBJECT MARKS
--------- --------- ------
1 ENGLISH 90
1 TAMIL 80
1 MATHS 70
2 MATHS 70
2 TAMIL 80
2 ENGLISH 95
And the result should be like below:
STU_NAME MATHS_MARK ENGLISH_MARK TAMIL_MARK TOTAL_MARKS
--------- ----------- ------------ ----------- -------------
1 70 90 80 240
2 70 95 80 245
Can we achieve this with a query?

I find that the easiest way is to use conditional aggregation:
select stu_name,
max(case when subject = 'MATHS' then Marks end) as Maths,
max(case when subject = 'ENGLISH' then Marks end) as English,
max(case when subject = 'TAMIL' then Marks end) as Tamil,
sum(Marks) as Total
from t
group by stu_name;

I got the same using PIVOT function.,
SELECT *
FROM (SELECT *
FROM x)
PIVOT (MAX(marks) FOR (SUBJECT) IN ('ENGLISH', 'MATHS', 'TAMIL'));
But, Still I'm facing few issues.,
whether it is possible to get sum(marks) while using PIVOT.
And getting subjects with respective marks without mentioning every subject(when we do know what are the subjects will be) in both(case,pivot).,?

SQL Teradata query

I have a table abc which have many records with columns col1,col2,col3,
dept | name | marks |
science abc 50
science cvv 21
science cvv 22
maths def 60
maths abc 21
maths def 62
maths ddd 90
I need to order by dept and name with ranking as ddd- 1, cvv - 2, abc -3, else 4 then need to find out maximum mark of an individual. Expected result is
dept | name | marks |
science cvv 22
science abc 50
maths ddd 90
maths abc 21
maths def 62
. How may I do it.?

SELECT
dept,
name,
MAX(marks) AS mark
FROM
yourTable
GROUP BY
dept,
name
ORDER BY
CASE WHEN name = 'ddd' THEN 1
name = 'cvv' THEN 2
name = 'abc' THEN 3
ELSE 4 END
Or, preferably, have another table that includes the sorting order.
SELECT
yourTable.dept,
yourTable.name,
MAX(yourTable.marks) AS mark
FROM
yourTable
INNER JOIN
anotherTable
ON yourTable.name = anotherTable.name
GROUP BY
yourTable.dept,
youtTable.name
ORDER BY
anotherTable.sortingOrder

This should work:
SELECT Dept, Name, MAX(marks) AS mark
FROM yourTable
GROUP BY Dept, Name
ORDER BY CASE WHEN Name = 'ddd' THEN 1
WHEN Name = 'cvv' THEN 2
WHEN Name = 'ABC' THEN 3
ELSE 4 END

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to transform Table in hive - hive

I have a table in Hive in below format. std_id std_name sub_id sub_name marks 1 xxx 123 Eng 70 1 xxx 125 Maths 90 1 xxx 124 Science 80 I want a table in below format as output. std_id std_name Eng Science Maths 1 xxx 70 80 90 how can i get the output with using Hive Query..

Use MAX ( CASE.. ) with GROUP BY SELECT std_id ,std_name ,MAX(CASE WHEN sub_name = 'Eng' THEN marks END) AS Eng ,MAX(CASE WHEN sub_name = 'Science' THEN marks END) AS Science ,MAX(CASE WHEN sub_name = 'Maths' THEN marks END) AS Maths FROM t GROUP BY std_id ,std_name;

Related

SQL Queries - Count total of rows with Case when statement

Transact-SQL Select Query Giving duplicate values

sum of multiple fields in group by clause and create single row

Denormalizing using query in Oracle

SQL Teradata query

Categories

Resources