I'm trying to pivot a two-column table but am not getting the results I want.
Here is a sample of the data in the Employees table:
DataPoint Populated
name Ram
email ram#gmail.com
age 23
name Shyam
email shyam23#gmail.com
age 28
name John
email john#gmail.com
age 33
name Bob
email bob32#gmail.com
age 41
Here is what I want:
name email age
Ram ram#gmail.com 23
Shyam shyam23#gmail.com 28
John john#gmail.com 33
Bob bob32#gmail.com 41
Here is my code:
;WITH NeedToPivot AS (
    SELECT *
    FROM Employees
)
SELECT *
FROM NeedToPivot
PIVOT (MAX(Populated) FOR DataPoint IN ("name", "email", "age")) x
Here is what it's returning:
name email age
Shyam shyam23#gmail.com 28
Based on feedback from Sean Lange I added an EmployeeId column to the Employees table. The pivot operator now understands my desired grouping and the query is returning exactly what I want.
Employees table now looks like this:
EmployeeId DataPoint Populated
1 name Ram
1 email ram#gmail.com
1 age 23
2 name Shyam
2 email shyam23#gmail.com
2 age 28
3 name John
3 email john#gmail.com
3 age 33
4 name Bob
4 email bob32#gmail.com
4 age 41
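For readers who want to see why the extra key matters, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no PIVOT operator, so this swaps in conditional aggregation (MAX over CASE), which is a different technique from the PIVOT query above but relies on the same idea: EmployeeId is what tells the database which rows belong together. The function name pivot_employees is just an illustrative choice.

```python
import sqlite3


def pivot_employees():
    """Pivot the key/value Employees table into one row per EmployeeId."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employees (EmployeeId INTEGER, DataPoint TEXT, Populated TEXT)"
    )
    rows = [
        (1, "name", "Ram"), (1, "email", "ram#gmail.com"), (1, "age", "23"),
        (2, "name", "Shyam"), (2, "email", "shyam23#gmail.com"), (2, "age", "28"),
        (3, "name", "John"), (3, "email", "john#gmail.com"), (3, "age", "33"),
        (4, "name", "Bob"), (4, "email", "bob32#gmail.com"), (4, "age", "41"),
    ]
    conn.executemany("INSERT INTO Employees VALUES (?, ?, ?)", rows)

    # Conditional aggregation: each CASE picks out one DataPoint, and
    # GROUP BY EmployeeId keeps every employee on a separate output row.
    query = """
        SELECT MAX(CASE WHEN DataPoint = 'name'  THEN Populated END) AS name,
               MAX(CASE WHEN DataPoint = 'email' THEN Populated END) AS email,
               MAX(CASE WHEN DataPoint = 'age'   THEN Populated END) AS age
        FROM Employees
        GROUP BY EmployeeId
        ORDER BY EmployeeId
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pivot_employees():
        print(row)
```

Without the GROUP BY on EmployeeId, the same query collapses everything into a single row of maximum values, which matches the one-row result from the original two-column table.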
Related
I have a need to concatenate name information from two tables in a specific order. So far, all I have is the SQL to retrieve the data. I need help with the SQL to output a name string in the correct order.
example data:
TBL_TITLE_ORDER
ORDER_ID  ORDER_SEQUENCE  TITLE_ID  FIRSTNAME_ID
20        1               30
20        2                         456
20        3               33
21        1               31
21        2               32
TBL_TITLE
TITLE_ID  TITLE_NAME
30        Mr
31        Mrs
32        Jones
33        Smith
TBL_FIRSTNAME
FIRSTNAME_ID  NAME
456           John
SELECT TBL_TITLE.TITLE_NAME, TBL_FIRSTNAME.NAME
FROM (TBL_TITLE_ORDER LEFT JOIN TBL_TITLE ON TBL_TITLE_ORDER.TITLE_ID = TBL_TITLE.TITLE_ID) LEFT JOIN TBL_FIRSTNAME ON TBL_TITLE_ORDER.FIRSTNAME_ID = TBL_FIRSTNAME.FIRSTNAME_ID
WHERE (TBL_TITLE_ORDER.ORDER_ID) = 20
ORDER BY TBL_TITLE_ORDER.ORDER_SEQUENCE;
The output I need is a complete name string in the proper order sequence. There may or may not be a record in the TBL_FIRSTNAME table.
What I have so far:
TITLE_NAME  NAME
Mr
            John
Smith
Required output:
Mr John Smith
Mrs Jones
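One possible approach, sketched here with Python's built-in sqlite3 module rather than the asker's actual database: take whichever of TITLE_NAME or NAME is present with COALESCE, sort by ORDER_ID and ORDER_SEQUENCE, and join the parts per order. The sketch assumes the blank cells in TBL_TITLE_ORDER are NULL, and the function name build_names is made up for illustration.

```python
import sqlite3
from itertools import groupby


def build_names():
    """Return {ORDER_ID: full name string} with parts in ORDER_SEQUENCE order."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TBL_TITLE_ORDER (
            ORDER_ID INTEGER, ORDER_SEQUENCE INTEGER,
            TITLE_ID INTEGER, FIRSTNAME_ID INTEGER);
        CREATE TABLE TBL_TITLE (TITLE_ID INTEGER, TITLE_NAME TEXT);
        CREATE TABLE TBL_FIRSTNAME (FIRSTNAME_ID INTEGER, NAME TEXT);
        -- Assumption: empty cells in the sample data are NULL.
        INSERT INTO TBL_TITLE_ORDER VALUES
            (20, 1, 30, NULL), (20, 2, NULL, 456), (20, 3, 33, NULL),
            (21, 1, 31, NULL), (21, 2, 32, NULL);
        INSERT INTO TBL_TITLE VALUES (30, 'Mr'), (31, 'Mrs'), (32, 'Jones'), (33, 'Smith');
        INSERT INTO TBL_FIRSTNAME VALUES (456, 'John');
    """)
    # COALESCE takes the title when there is one, otherwise the first name.
    query = """
        SELECT o.ORDER_ID, COALESCE(t.TITLE_NAME, f.NAME) AS part
        FROM TBL_TITLE_ORDER AS o
        LEFT JOIN TBL_TITLE AS t ON o.TITLE_ID = t.TITLE_ID
        LEFT JOIN TBL_FIRSTNAME AS f ON o.FIRSTNAME_ID = f.FIRSTNAME_ID
        ORDER BY o.ORDER_ID, o.ORDER_SEQUENCE
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    # Rows arrive sorted, so groupby sees each ORDER_ID as one contiguous run.
    return {
        order_id: " ".join(part for _, part in group if part is not None)
        for order_id, group in groupby(rows, key=lambda r: r[0])
    }


if __name__ == "__main__":
    for order_id, full_name in build_names().items():
        print(order_id, full_name)
```

The concatenation is done in Python here because the ordering guarantees of string aggregation functions vary between databases; in a specific engine, an ordered string aggregate could replace the Python step.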
This is my table schema; the total_hours column is the result of a SUM function.
Id name client total_hours
1 John company 1 100
1 John company 2 200
2 Jack company 3 350
2 Jack company 2 150
I want to merge the rows that share the same Id into one row, like this:
Id name client_a total_hours_a client_b total_hours_b
1 John company 1 100 company 2 200
2 Jack company 3 350 company 2 150
I tried to use PIVOT, but this function does not seem to exist in DBeaver. Here is my query:
SELECT
    client
    ,name
    ,sum(hours) AS total_hours
FROM projects
GROUP BY client, name;
Thanks in advance if anyone could be of any help.
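A hedged sketch of one way to get that shape without PIVOT: number the rows within each Id using ROW_NUMBER(), then use conditional aggregation to spread row 1 and row 2 into the "_a" and "_b" columns. It runs on Python's built-in sqlite3 module (window functions need SQLite 3.25 or later), not on the asker's actual database. The table name hours_by_client is hypothetical and stands for the already-aggregated result shown above, and ordering by rowid (insertion order) is an assumption made only to reproduce the sample output.

```python
import sqlite3


def pivot_clients():
    """Spread up to two client rows per id into client_a/client_b columns."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table holding the already-summed rows from the question.
    conn.execute(
        "CREATE TABLE hours_by_client "
        "(id INTEGER, name TEXT, client TEXT, total_hours INTEGER)"
    )
    conn.executemany(
        "INSERT INTO hours_by_client VALUES (?, ?, ?, ?)",
        [
            (1, "John", "company 1", 100),
            (1, "John", "company 2", 200),
            (2, "Jack", "company 3", 350),
            (2, "Jack", "company 2", 150),
        ],
    )
    query = """
        WITH numbered AS (
            SELECT id, name, client, total_hours,
                   -- Assumption: insertion order decides which client is "a".
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY rowid) AS rn
            FROM hours_by_client
        )
        SELECT id, name,
               MAX(CASE WHEN rn = 1 THEN client END)      AS client_a,
               MAX(CASE WHEN rn = 1 THEN total_hours END) AS total_hours_a,
               MAX(CASE WHEN rn = 2 THEN client END)      AS client_b,
               MAX(CASE WHEN rn = 2 THEN total_hours END) AS total_hours_b
        FROM numbered
        GROUP BY id, name
        ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pivot_clients():
        print(row)
```

This only handles two clients per Id; a third client would need another pair of CASE expressions, so the number of output columns has to be known in advance.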
I have a table in the following format
ID Property Value
1 name Tim
1 location USA
1 age 30
2 name Jack
2 location UK
2 age 27
And I would like an output in the following format
ID name location age
1 Tim USA 30
2 Jack UK 27
In Python I can do:
table_agg = table.groupby('ID')[['Property','Value']].apply(lambda x: dict(x.values))
p = pd.DataFrame(list(table_agg))
How can I write this query in Hive?
You can use the collect_list and map functions to group the data, then access the array elements by key.
Example:
hive> create table t1(id int, property string, value string) stored as orc;
hive> insert into t1 values (1,"name","Tim"),(1,"location","USA"),(1,"age","30"),(2,"name","Jack"),(2,"location","UK"),(2,"age","27");
hive> select id,
    va[0]["name"] name,
    va[1]["location"] location,
    va[2]["age"] age
from (
    select id, collect_list(map(property, value)) va
    from t1 group by id
) t;
Result:
id name location age
1 Tim USA 30
2 Jack UK 27
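To make the idea concrete, here is a small plain-Python mirror of it (not Hive code): collect the property/value pairs for each id into a map, then look each column up by key. The function name pivot_properties is hypothetical, and the input rows are deliberately shuffled to show that a lookup by key does not depend on the order in which the pairs were collected.

```python
from collections import defaultdict


def pivot_properties(rows):
    """Turn (id, property, value) rows into (id, name, location, age) rows."""
    by_id = defaultdict(dict)
    for row_id, prop, value in rows:
        # One map per id, keyed by property name.
        by_id[row_id][prop] = value
    return [
        (row_id, props.get("name"), props.get("location"), props.get("age"))
        for row_id, props in sorted(by_id.items())
    ]


# Same data as the question, in a shuffled order.
rows = [
    (1, "age", "30"), (2, "name", "Jack"), (1, "name", "Tim"),
    (2, "age", "27"), (1, "location", "USA"), (2, "location", "UK"),
]

if __name__ == "__main__":
    for row in pivot_properties(rows):
        print(row)
```

A missing property simply comes back as None here, which is roughly what a lookup on a missing map key would give you in a SQL engine (NULL).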
The simple SELECT query would return the data as below:
Select ID, User, Country, TimeLogged from Data
ID User Country TimeLogged
1 Samantha SCO 10
1 John UK 5
1 Andrew NZL 15
2 John UK 20
3 Mark UK 10
3 Mark UK 20
3 Steven UK 10
3 Andrew NZL 15
3 Sharon IRL 5
4 Andrew NZL 25
4 Michael AUS 5
5 Jessica USA 30
I would like to return the sum of time logged for each user, grouped by ID, but only for IDs whose rows include both Country = UK and User = Andrew.
So the output for the above example would be:
ID User Country TimeLogged
1 John UK 5
1 Andrew NZL 15
3 Mark UK 30
3 Steven UK 10
3 Andrew NZL 15
First you need to identify which IDs you're going to be returning
SELECT ID FROM MyTable WHERE Country='UK'
INTERSECT
SELECT ID FROM MyTable WHERE [User]='Andrew';
and based on that, you can then filter to aggregate the expected rows.
SELECT ID,
       [User],
       Country,
       SUM(Timelogged) AS Timelogged
FROM MyTable
WHERE (Country='UK' OR [User]='Andrew')
  AND ID IN (SELECT ID FROM MyTable WHERE Country='UK'
             INTERSECT
             SELECT ID FROM MyTable WHERE [User]='Andrew')
GROUP BY ID, [User], Country;
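The two steps above can be checked against the sample data with a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for the asker's database, so the column is quoted as "User" instead of [User], and the table is named Data as in the question. The function name sum_time_logged is an illustrative choice.

```python
import sqlite3


def sum_time_logged():
    """Sum TimeLogged per ID/User for IDs that have both a UK row and an Andrew row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE Data (ID INTEGER, "User" TEXT, Country TEXT, TimeLogged INTEGER)'
    )
    conn.executemany(
        "INSERT INTO Data VALUES (?, ?, ?, ?)",
        [
            (1, "Samantha", "SCO", 10), (1, "John", "UK", 5), (1, "Andrew", "NZL", 15),
            (2, "John", "UK", 20),
            (3, "Mark", "UK", 10), (3, "Mark", "UK", 20), (3, "Steven", "UK", 10),
            (3, "Andrew", "NZL", 15), (3, "Sharon", "IRL", 5),
            (4, "Andrew", "NZL", 25), (4, "Michael", "AUS", 5),
            (5, "Jessica", "USA", 30),
        ],
    )
    query = """
        SELECT ID, "User", Country, SUM(TimeLogged) AS TimeLogged
        FROM Data
        WHERE (Country = 'UK' OR "User" = 'Andrew')
          -- Keep only IDs that appear in both sets.
          AND ID IN (SELECT ID FROM Data WHERE Country = 'UK'
                     INTERSECT
                     SELECT ID FROM Data WHERE "User" = 'Andrew')
        GROUP BY ID, "User", Country
        ORDER BY ID, "User"
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in sum_time_logged():
        print(row)
```

IDs 2 and 4 drop out because each appears in only one of the two sets, and Mark's two UK rows for ID 3 are summed into a single row.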
So, you have described what you need almost perfectly, but not quite. Your result table indicates that you want Country = UK OR User = Andrew, rather than AND.
You need to select and group by, then include a WHERE clause:
SELECT ID, [User], Country, SUM(Timelogged) AS Timelogged FROM mytable
WHERE Country='UK' OR [User]='Andrew'
GROUP BY ID, [User], Country
Assume I have a table with the following data:
Name TransID Cost
---------------------------------------
Susan 1 10
Johnny 2 10
Johnny 3 9
Dave 4 10
I want to find a way to sum the Costs per name (assume the Names are unique) so that I get a table like this:
Name Cost
---------------------------------------
Susan 10
Johnny 19
Dave 10
Any help is appreciated.
This is relatively straightforward: you need to use a GROUP BY clause in your query:
SELECT Name, SUM(Cost) AS Cost
FROM MyTable
GROUP BY Name
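As a quick check, the same GROUP BY can be run against the sample data with Python's built-in sqlite3 module. This is only a sketch: the table name MyTable comes from the answer, the function name total_cost_per_name is made up, and the ORDER BY is added just to make the output order predictable.

```python
import sqlite3


def total_cost_per_name():
    """Return (Name, total Cost) pairs, one per distinct Name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (Name TEXT, TransID INTEGER, Cost INTEGER)")
    conn.executemany(
        "INSERT INTO MyTable VALUES (?, ?, ?)",
        [("Susan", 1, 10), ("Johnny", 2, 10), ("Johnny", 3, 9), ("Dave", 4, 10)],
    )
    # GROUP BY collapses all rows with the same Name; SUM adds up their costs.
    query = """
        SELECT Name, SUM(Cost) AS Cost
        FROM MyTable
        GROUP BY Name
        ORDER BY Name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for name, cost in total_cost_per_name():
        print(name, cost)
```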