Aggregate ID-property-value records in Hive - sql

I have a table in the following format
ID Property Value
1 name Tim
1 location USA
1 age 30
2 name Jack
2 location UK
2 age 27
And I would like an output in the following format
ID name location age
1 Tim USA 30
2 Jack UK 27
In Python (pandas) I can do:
table_agg = table.groupby('ID')[['Property','Value']].apply(lambda x: dict(x.values))
p = pd.DataFrame(list(table_agg))
How can I write this query in Hive?

You can use the collect_list and map functions to gather each ID's property/value pairs into an array of maps, then read each value by array index and map key. Note that the index positions assume the rows are collected in name, location, age order, which collect_list is not documented to guarantee.
Example:
hive> create table t1(id int,property string,value string) stored as orc;
hive> insert into t1 values(1,"name","Tim"),(1,"location","USA"),(1,"age","30"),(2,"name","Jack"),(2,"location","UK"),(2,"age","27");
hive> select id,
va[0]["name"] name,
va[1]["location"] location,
va[2]["age"] age
from (
select id,collect_list(map(property,value)) va
from t1 group by id
)t;
Result:
id name location age
1 Tim USA 30
2 Jack UK 27
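An order-independent alternative to indexing into the collected array is conditional aggregation (MAX over one CASE per property). The sketch below runs such a query against an in-memory SQLite database, used here only as a stand-in so the example can be executed locally. The helper name pivot_properties is an assumption for illustration, and the SELECT uses only standard SQL, which Hive should also accept.

```python
import sqlite3

def pivot_properties(rows):
    """Pivot (id, property, value) rows into (id, name, location, age) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (id INTEGER, property TEXT, value TEXT)")
    conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", rows)
    # Each CASE yields the value only for its own property and NULL otherwise,
    # so MAX picks the single non-NULL value per id regardless of row order.
    query = """
        SELECT id,
               MAX(CASE WHEN property = 'name' THEN value END) AS name,
               MAX(CASE WHEN property = 'location' THEN value END) AS location,
               MAX(CASE WHEN property = 'age' THEN value END) AS age
        FROM t1
        GROUP BY id
        ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Rows deliberately shuffled to show that input order does not matter.
sample = [
    (1, "age", "30"), (2, "location", "UK"), (1, "name", "Tim"),
    (2, "age", "27"), (1, "location", "USA"), (2, "name", "Jack"),
]
print(pivot_properties(sample))
```

One CASE expression is needed per output column, so this fits best when the set of properties is known in advance.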

Related

Not getting desired results using the pivot operator in SQL Server

I'm trying to pivot a two column table but am not getting my desired results.
Here is a sample of the data in the Employees table:
DataPoint Populated
name Ram
email ram#gmail.com
age 23
name Shyam
email shyam23#gmail.com
age 28
name John
email john#gmail.com
age 33
name Bob
email bob32#gmail.com
age 41
Here is what I want:
name email age
Ram ram#gmail.com 23
Shyam shyam23#gmail.com 28
John john#gmail.com 33
Bob bob32#gmail.com 41
Here is my code:
;WITH NeedToPivot AS(
SELECT *
FROM Employees)
SELECT *
FROM NeedToPivot
PIVOT(MAX(Populated) FOR DataPoint IN("name","email","age"))x
Here is what it's returning:
name email age
Shyam shyam23#gmail.com 28
Based on feedback from Sean Lange I added an EmployeeId column to the Employees table. The pivot operator now understands my desired grouping and the query is returning exactly what I want.
Employees table now looks like this:
EmployeeId DataPoint Populated
1 name Ram
1 email ram#gmail.com
1 age 23
2 name Shyam
2 email shyam23#gmail.com
2 age 28
3 name John
3 email john#gmail.com
3 age 33
4 name Bob
4 email bob32#gmail.com
4 age 41
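The reason the extra column helps is that PIVOT groups by every column that is neither aggregated nor named in the FOR clause. With only DataPoint and Populated there is nothing left to group by, so all rows collapse into a single group. A rough plain-Python sketch of that grouping behavior, where the function name pivot_max and its argument layout are assumptions for illustration only:

```python
from collections import defaultdict

def pivot_max(rows, columns):
    """rows: iterable of (group_key, data_point, value) tuples.

    Returns {group_key: tuple of MAX(value) per requested column}.
    """
    groups = defaultdict(dict)
    for key, point, value in rows:
        current = groups[key].get(point)
        # Keep the largest value seen so far for this (group, data point).
        if current is None or value > current:
            groups[key][point] = value
    return {key: tuple(vals.get(c) for c in columns) for key, vals in groups.items()}

columns = ["name", "email", "age"]
data = [
    ("name", "Ram"), ("email", "ram#gmail.com"), ("age", "23"),
    ("name", "Shyam"), ("email", "shyam23#gmail.com"), ("age", "28"),
]

# No grouping column: every row shares the same (empty) key, so one row comes back.
print(pivot_max([(None, p, v) for p, v in data], columns))

# With an EmployeeId-style key: one output row per employee.
keyed = [(1 if i < 3 else 2, p, v) for i, (p, v) in enumerate(data)]
print(pivot_max(keyed, columns))
```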

How to select only details of min value only in SQL?

I can get the minimum percentage of two values, but I need only the NAME and ID in the select.
ID NAME CITY ONE TWO
--------------------------------------------------
2 Morales Los Angeles 40 10
1 John New York 60 20
4 Mary San Diego 10 10
I need to find the min value of one/two and have only this appear as the result:
ID NAME
---------
4 Mary
Select ID, NAME
from MYTABLE
where least(ONE,TWO) = (select min(least(ONE,TWO)) from MYTABLE);
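The query above also returns Morales, because least(ONE,TWO) is 10 for both Morales and Mary. If you don't want Morales, you can break the tie on ONE*TWO like this: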
Select ID, NAME
from MYTABLE
where id =
(select id from
(select id from MYTABLE order by least(ONE,TWO), ONE*TWO)
where rownum <= 1);
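To see the tie and the tie-break in action, here is a small sketch against an in-memory SQLite database. SQLite is only a stand-in here: it has no least(), so the two-argument scalar min() plays that role, and LIMIT 1 replaces the rownum filter. The helper name min_rows and its break_ties flag are assumptions for illustration.

```python
import sqlite3

def min_rows(rows, break_ties=False):
    """Return (ID, NAME) for the row(s) with the smallest of ONE and TWO."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MYTABLE (ID INTEGER, NAME TEXT, CITY TEXT, ONE INTEGER, TWO INTEGER)"
    )
    conn.executemany("INSERT INTO MYTABLE VALUES (?, ?, ?, ?, ?)", rows)
    if break_ties:
        # Order by the smaller value first, then by ONE*TWO, and keep one row.
        query = "SELECT ID, NAME FROM MYTABLE ORDER BY min(ONE, TWO), ONE * TWO LIMIT 1"
    else:
        # Inner min(ONE, TWO) is the scalar form; outer one-argument min is the aggregate.
        query = """
            SELECT ID, NAME FROM MYTABLE
            WHERE min(ONE, TWO) = (SELECT min(min(ONE, TWO)) FROM MYTABLE)
            ORDER BY ID
        """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

table = [
    (2, "Morales", "Los Angeles", 40, 10),
    (1, "John", "New York", 60, 20),
    (4, "Mary", "San Diego", 10, 10),
]
print(min_rows(table))                   # both rows that tie on the smaller value
print(min_rows(table, break_ties=True))  # single row after the ONE*TWO tie-break
```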

In SQLite I am trying to count people by current age when I have a column of birth dates

In SQLite I am trying to count people by current age when I have a column of birth dates,
e.g.
AGE: ------------------ COUNT:
17 ------------------------- 4
18 ------------------------- 7
19 ------------------------- 6
etc......
Many Thanks,
Z
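Try this query. It subtracts the birth year from the current year, then subtracts 1 more when this year's birthday has not happened yet:
select age,count(age) FROM
(select strftime('%Y','now') - strftime('%Y',birth)
- (strftime('%m-%d','now') < strftime('%m-%d',birth)) age
from table1)t
group by age;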
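A sketch for checking an age-by-birth-date count against an in-memory SQLite database. It includes a birthday adjustment so that people who have not yet had their birthday this year are not counted a year older. The helper name count_by_age and the explicit today parameter (used instead of 'now' so results are reproducible) are assumptions for illustration, and birth dates are assumed to be stored as ISO 'YYYY-MM-DD' text.

```python
import sqlite3

def count_by_age(births, today):
    """Return [(age, count), ...] for ISO-formatted birth dates as of `today`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (birth TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?)", [(b,) for b in births])
    # Year difference, minus 1 when the month-day of `today` is still
    # before the month-day of the birth date (the comparison yields 0 or 1).
    query = """
        SELECT age, COUNT(*) FROM (
            SELECT CAST(strftime('%Y', :today) AS INTEGER)
                 - CAST(strftime('%Y', birth) AS INTEGER)
                 - (strftime('%m-%d', :today) < strftime('%m-%d', birth)) AS age
            FROM table1
        ) t
        GROUP BY age
        ORDER BY age
    """
    result = conn.execute(query, {"today": today}).fetchall()
    conn.close()
    return result

print(count_by_age(["2000-01-15", "2000-12-31", "1999-06-30"], "2018-06-15"))
```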

Conditionally return column values from a joined table - Oracle

Hi, I have a table DataTable as follows:
Name Age Address
----------------
Tom 21 XYZ
John 23 X123
Sam 32 Y123
There is another table, MappingTable:
Name Address
-------------
John A12345
Now I want to create a query that returns the following:
Name Age Address
----------------
Tom 21 XYZ
John 23 A12345
Sam 32 Y123
How can I do this? I tried joining the tables, but that would replace the complete column. I cannot use Update either, since I am only returning a view using this query.
Thanks,
Monica
select dt.name,
dt.age,
coalesce(mt.address, dt.address) as address
from DataTable dt
left join MappingTable mt
on mt.Name = dt.Name;
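The left join keeps every DataTable row, and coalesce falls back to the original address wherever MappingTable has no match. A quick way to try this is the sketch below, which uses an in-memory SQLite database as a stand-in for Oracle. The helper name effective_addresses is an assumption for illustration, and the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

def effective_addresses(data_rows, mapping_rows):
    """Return (name, age, address), preferring the mapped address when one exists."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE DataTable (Name TEXT, Age INTEGER, Address TEXT)")
    conn.execute("CREATE TABLE MappingTable (Name TEXT, Address TEXT)")
    conn.executemany("INSERT INTO DataTable VALUES (?, ?, ?)", data_rows)
    conn.executemany("INSERT INTO MappingTable VALUES (?, ?)", mapping_rows)
    # Unmatched rows get NULL for mt.Address, so COALESCE returns dt.Address.
    query = """
        SELECT dt.Name, dt.Age, COALESCE(mt.Address, dt.Address) AS Address
        FROM DataTable dt
        LEFT JOIN MappingTable mt ON mt.Name = dt.Name
        ORDER BY dt.Name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

people = [("Tom", 21, "XYZ"), ("John", 23, "X123"), ("Sam", 32, "Y123")]
overrides = [("John", "A12345")]
print(effective_addresses(people, overrides))
```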

How can I create a temporary sequence column in my sql query result?

I have a table which looks like this.
NAME AGE
james 22
ames 12
messi 32
....
....
I can query this table using Select name, age from emp;
Now I want an extra column before name containing 1, 2, 3, ..., N when the query returns N rows.
SEQUENCE NAME AGE
1 james 22
2 ames 12
3 messi 32
4 ....
....
How can I do this?
If you just want to add a column containing a sequence number at display time (without actually storing that data in a table), you can use the ROWNUM pseudocolumn or the row_number() analytic function.
select row_number() over(order by name) seq
, name
, age
from your_table
SEQ NAME AGE
---------- ----------- ----------
1 ames 12
2 james 22
3 messi 32
The output of the above query is ordered by NAME, but you can order by any column or combination of columns you want.
The second approach uses the ROWNUM pseudocolumn. The result is also ordered by name:
select rownum seq
, name
, age
from ( select name
, age
from your_table
order by name
)
SEQ NAME AGE
---------- ----------- ----------
1 ames 12
2 james 22
3 messi 32
You can try:
Select ROWNUM sequence, name , age from emp;
For each row returned by a query, the ROWNUM pseudocolumn returns a number indicating the order in which Oracle selects the row from a table or set of joined rows. The first row selected has a ROWNUM of 1, the second has 2, and so on.
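For experimenting with the row_number() approach outside Oracle, the sketch below runs a query of that shape against an in-memory SQLite database. It assumes a SQLite build with window-function support (3.25 or later). The helper name numbered_rows is an assumption for illustration, and ROWNUM itself is Oracle-specific, so it is not shown here.

```python
import sqlite3

def numbered_rows(rows):
    """Return (seq, name, age) with seq numbering the rows in name order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO emp VALUES (?, ?)", rows)
    # row_number() assigns 1, 2, 3, ... following the ORDER BY inside OVER();
    # the outer ORDER BY makes the result come back in that same order.
    query = """
        SELECT row_number() OVER (ORDER BY name) AS seq, name, age
        FROM emp
        ORDER BY seq
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

print(numbered_rows([("james", 22), ("ames", 12), ("messi", 32)]))
```

The sequence exists only in the query result, so nothing needs to be stored in or added to the table.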