Easy table transpose in hive - sql

I need to transpose my table.
Currently my table looks like this:
Atr_1|Atr_2
A | 1
A | 2
But I want to get the following result:
Atr_1|Atr_2|Atr_3
A | 1 | 2
How should I transpose my table to achieve this result?

Use case statements with min() or max() aggregation:
select Atr_1,
max(case when Atr_2=1 then 1 end) as Atr_2,
max(case when Atr_2=2 then 2 end) as Atr_3
from table t
group by Atr_1;

If you have only two values, min() and max() do what you want:
select atr_1, min(atr_2) as atr_2, max(atr_2) as atr_3
from t
group by atr_1;

I think you want conditional aggregation over a row number:
SELECT atr_1,
MAX(CASE WHEN SEQ = 1 THEN atr_2 END) as atr_2,
MAX(CASE WHEN SEQ = 2 THEN atr_2 END) as atr_3
FROM (SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY atr_1 ORDER BY atr_2) AS SEQ
FROM table t
) t
GROUP BY atr_1;
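The row-number approach above can be tried end to end with a small script. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Hive (an assumption made only so the example is runnable; it needs SQLite 3.25 or later for window functions). The table name t and the subquery alias numbered are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (atr_1 TEXT, atr_2 INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [("A", 1), ("A", 2)])

# Number the rows inside each atr_1 group, then spread them into columns.
query = """
SELECT atr_1,
       MAX(CASE WHEN seq = 1 THEN atr_2 END) AS atr_2,
       MAX(CASE WHEN seq = 2 THEN atr_2 END) AS atr_3
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY atr_1 ORDER BY atr_2) AS seq
      FROM t) AS numbered
GROUP BY atr_1
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Unlike the hard-coded case expressions, this version does not depend on the actual values in atr_2, only on their order within each group.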

Related

Convert column value to row value

In SQL Server, I am trying to convert table 1 into table 2. From reading other answers on Stack Overflow, I can do this with row_number(). The problem is that I need to do an inner join after the conversion. Because the following script uses the max() aggregate function, it forces the fields from the other tables to be wrapped in an aggregate function as well. Is there an alternative approach to this problem, or a way to handle the aggregate function when joining with another table?
select max(case when key = 'ab' then Value end) as ab,
max(case when key = 'cd' then Value end) as cd
from (select t.*,
row_number() over (partition by key order by Value) as seq
from table t
) t
group by seq;
table 1
table 2
You can try the script below:
SELECT id,
MAX(CASE WHEN name = 'car1' THEN value END) car1,
MAX(CASE WHEN name = 'car2' THEN value END) car2,
MAX(CASE WHEN name = 'car3' THEN value END) car3
FROM your_table
GROUP BY id
You can use the PIVOT operator:
;WITH src as
(
SELECT *
FROM
(
VALUES
(1, 'Car1', 'nissan'),
(1, 'Car2', 'audi'),
(1, 'Car3', 'toyota')
) as t (id, name, value)
)
SELECT *
FROM src
PIVOT
(
max(VALUE) FOR NAME IN ([Car1], [Car2], [Car3])
) as pvt
+----+--------+------+--------+
| id | Car1 | Car2 | Car3 |
+----+--------+------+--------+
| 1 | nissan | audi | toyota |
+----+--------+------+--------+
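On the join part of the question: one common way to keep the aggregate from leaking into the rest of the query is to do the pivot in a CTE (or derived table) and join its result afterwards. The sketch below illustrates that idea with Python's sqlite3 module rather than SQL Server, using conditional aggregation in place of PIVOT because SQLite has no PIVOT operator. The tables cars and owners and the column owner are hypothetical names invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cars (id INTEGER, name TEXT, value TEXT);
INSERT INTO cars VALUES (1, 'Car1', 'nissan'), (1, 'Car2', 'audi'), (1, 'Car3', 'toyota');
-- Hypothetical second table, present only to demonstrate the join.
CREATE TABLE owners (id INTEGER, owner TEXT);
INSERT INTO owners VALUES (1, 'alice');
""")

# The aggregation stays inside the CTE, so the outer query can join
# and select plain columns without wrapping them in MAX().
query = """
WITH pivoted AS (
    SELECT id,
           MAX(CASE WHEN name = 'Car1' THEN value END) AS car1,
           MAX(CASE WHEN name = 'Car2' THEN value END) AS car2,
           MAX(CASE WHEN name = 'Car3' THEN value END) AS car3
    FROM cars
    GROUP BY id
)
SELECT o.owner, p.car1, p.car2, p.car3
FROM pivoted AS p
JOIN owners AS o ON o.id = p.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The same shape should carry over to SQL Server by placing the PIVOT query inside the CTE.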

Select SUM and column with max

I am looking for the best or simplest way to SELECT type, user_with_max_value, SUM(value) GROUP BY type. The table looks like this:
type | user | value
type1 | 1 | 100
type1 | 2 | 200
type2 | 1 | 50
type2 | 2 | 10
And the result should look like this:
type1 | 2 | 300
type2 | 1 | 60
Use window functions:
select type, max(case when seqnum = 1 then user end), sum(value)
from (select t.*,
row_number() over (partition by type order by value desc) as seqnum
from t
) t
group by type;
Some databases have functionality for an aggregation function that returns the first value. One method without a subquery using standard SQL is:
select distinct type,
first_value(user) over (partition by type order by value desc) as user,
sum(value) over (partition by type)
from t;
You can use window functions:
select t.*
from (select t.type, t.user,
row_number() over (partition by type order by value desc) as seq,
sum(value) over (partition by type) as value
from table t
) t
where seq = 1;
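Try the query below. Note that max(user) returns the largest user in each type, not the user that holds the largest value, so it matches the expected result only when the two coincide (for the sample data it returns 2 for type2 instead of 1).
SELECT type, max(user), SUM(value) from table1 GROUP BY type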
Use analytic functions:
create table poo2
(
thetype varchar(5),
theuser int,
thevalue int
);
insert into poo2
select 'type1',1,100 union all
select 'type1',2,200 union all
select 'type2',1,50 union all
select 'type2',2,10;
select thetype,theuser,mysum
from
(
select thetype ,theuser
,row_number() over (partition by thetype order by thevalue desc) r
,sum(thevalue) over (partition by thetype) mysum from poo2
) ilv
where r=1;
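To check the window-function approach against the sample data, here is a small runnable sketch using Python's sqlite3 module (an assumption made for convenience; the question does not name a database, and SQLite 3.25 or later is required). The column names follow the poo2 example; the alias ranked is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE poo2 (thetype TEXT, theuser INTEGER, thevalue INTEGER);
INSERT INTO poo2 VALUES
    ('type1', 1, 100), ('type1', 2, 200),
    ('type2', 1, 50),  ('type2', 2, 10);
""")

# Rank users inside each type by value, and compute the per-type total
# with a windowed SUM so that filtering to the top row keeps the full sum.
query = """
SELECT thetype, theuser, mysum
FROM (SELECT thetype, theuser,
             ROW_NUMBER() OVER (PARTITION BY thetype ORDER BY thevalue DESC) AS r,
             SUM(thevalue) OVER (PARTITION BY thetype) AS mysum
      FROM poo2) AS ranked
WHERE r = 1
ORDER BY thetype
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The total has to be computed before the filter on r; summing after filtering would only count the top row.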

Pivot data in SQL (repeated levels)

I have a question regarding pivoting data in SQL.
Input data:
TABLE NAME temp
id cat value
1 A 22
1 B 33
1 C 44
1 C 55
My ideal output would be:
id A B C
1 22 33 44
1 22 33 55
Can someone provide some hints on this?
Thanks!
select * from
(
select
id,cat,value
from tablename
)
as tablo
pivot
(
sum(value)
for cat in ([A],[B],[C])
) as p
order by id
Use case when, assuming you made a mistake in the second row of your output format:
select id, max(case when cat='A' then value end) as A,
max(case when cat='B' then value end) as B,
max(case when cat='C' then value end) as C from temp
group by id
You need the row_number() function with conditional aggregation:
select id, max(case when cat = 'A' then value end) A,
max(case when cat = 'B' then value end) B,
max(case when cat = 'C' then value end) C
from (select t.*, row_number() over (partition by id, cat order by value) as seq
from table t
) t
group by id, seq;
However, it doesn't produce your exact output (it leaves a null wherever a cat has fewer values than the other cats), but it shows the idea of how to do that.
Use CASE WHEN and MAX aggregation:
select id, max(case when cat='A' then value end) as A,max(case when cat='B' then value end) as B,
max(case when cat='C' then value end) as C from temp
group by id
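None of the answers above reproduces the ideal output from the question, where the single A and B values repeat on every C row. One way to get exactly that shape is a self-join per category, which is a different technique from the pivots shown above. The sketch below uses Python's sqlite3 module as a stand-in database; the table is named temp_data instead of temp, and the aliases ta, tb, tc are illustrative (all assumptions of this example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE temp_data (id INTEGER, cat TEXT, value INTEGER);
INSERT INTO temp_data VALUES (1, 'A', 22), (1, 'B', 33), (1, 'C', 44), (1, 'C', 55);
""")

# Join one copy of the table per category; every combination of
# A, B and C values for the same id becomes one output row.
query = """
SELECT ta.id, ta.value AS A, tb.value AS B, tc.value AS C
FROM temp_data AS ta
JOIN temp_data AS tb ON tb.id = ta.id AND tb.cat = 'B'
JOIN temp_data AS tc ON tc.id = ta.id AND tc.cat = 'C'
WHERE ta.cat = 'A'
ORDER BY ta.id, tc.value
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because the inner joins multiply combinations, an id with two A values and two C values would yield four rows, and an id missing any category would drop out entirely.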

Custom Order for Max()

I want to get the "max" character value for a column using a group by statement, except instead of the default alphabetical order, I want to set up a custom ordering that the max will use.
Table1:
ID | TYPE
-----+-------
1 | A
1 | B
1 | C
2 | A
2 | B
I want to group by ID and get max(type) in the order of C, A, B. Expected result:
ID | MAX_TYPE
-----+-----------
1 | C
2 | A
select
id,
case
max(
case type
when 'C' then 3 when 'A' then 2 when 'B' then 1
end
)
when 3 then 'C' when 2 then 'A' when 1 then 'B'
end as max_type
from T
group by id
Translate to a value that can be ranked by max() and then translate back to the original value.
If you also want to order the result by that value then you could add:
order by
max(
case type
when 'C' then 3 when 'A' then 2 when 'B' then 1
end
) desc
Some platforms require the sorting column to be included in the output. I'm not sure if PostgreSQL is one of those. No objection to Gordon's answer, but you'd have to use another window function to calculate the sort order if you need that too.
Instead of translating back and forth, use window functions:
select t.*
from (select t.*,
row_number() over (partition by id
order by (case when type = 'C' then 1
when type = 'A' then 2
when type = 'B' then 3
end)) as seqnum
from t
) t
where seqnum = 1;
Depending on what the values look like, you can also simplify this using string functions:
select t.*
from (select t.*,
row_number() over (partition by id
order by position(type in 'CAB')) as seqnum
from t
) t
where seqnum = 1;
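The window-function version can be verified against the sample table with a short script. This is a sketch using Python's sqlite3 module instead of PostgreSQL (an assumption for runnability; SQLite 3.25 or later is needed), with the custom order C, A, B written as a CASE expression. The alias ranked is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "B")],
)

# Lower CASE result means higher priority, so row number 1 is the
# "max" type under the custom order C, A, B.
query = """
SELECT id, type AS max_type
FROM (SELECT id, type,
             ROW_NUMBER() OVER (
                 PARTITION BY id
                 ORDER BY CASE type WHEN 'C' THEN 1
                                    WHEN 'A' THEN 2
                                    WHEN 'B' THEN 3 END
             ) AS seqnum
      FROM t) AS ranked
WHERE seqnum = 1
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)
```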

Any other alternative to write this SQL query

I need to select data based on three conditions:
Find the latest date (StorageDate column) in the table for each record
Check whether there is more than one entry for the date (StorageDate column) found in the first step for the same ID (ID column)
Then check whether DuplicateTypeID = 2
So if the table has the following data:
ID |StorageDate | DuplicateTypeID
1 |2014-10-22 | 1
1 |2014-10-22 | 2
1 |2014-10-18 | 1
2 |2014-10-12 | 1
3 |2014-10-11 | 1
4 |2014-09-02 | 1
4 |2014-09-02 | 2
Then I should get the following results:
ID
1
4
I have written the following query, but it is really slow. I was wondering if anyone has a better way to write it.
SELECT DISTINCT(TD.RecordID)
FROM dbo.MyTable TD
JOIN (
SELECT T1.RecordID, T2.MaxDate,COUNT(*) AS RecordCount
FROM MyTable T1 WITH (nolock)
JOIN (
SELECT RecordID, MAX(StorageDate) AS MaxDate
FROM MyTable WITH (nolock)
GROUP BY RecordID)T2
ON T1.RecordID = T2.RecordID AND T1.StorageDate = T2.MaxDate
GROUP BY T1.RecordID, T2.MaxDate
HAVING COUNT(*) > 1
)PT ON TD.RecordID = PT.RecordID AND TD.StorageDate = PT.MaxDate
WHERE TD.DuplicateTypeID = 2
Try this and see how the performance goes:
;WITH
tmp AS
(
SELECT *,
RANK() OVER (PARTITION BY ID ORDER BY StorageDate DESC) AS StorageDateRank,
COUNT(ID) OVER (PARTITION BY ID, StorageDate) AS StorageDateCount
FROM MyTable
)
SELECT DISTINCT ID
FROM tmp
WHERE StorageDateRank = 1 -- latest date for each ID
AND StorageDateCount > 1 -- more than 1 entry for date
AND DuplicateTypeID = 2 -- DuplicateTypeID = 2
You can use the analytic function rank(). Can you try this query?
Select recordId from
(
select *, rank() over ( partition by recordId order by [StorageDate] desc) as rn
from mytable
) T
where rn =1
group by recordId
having count(*) >1
and sum( case when duplicatetypeid =2 then 1 else 0 end) >=1
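The three conditions can be checked against the sample data with a small script. The sketch below runs the RANK and windowed COUNT approach in Python's sqlite3 module rather than SQL Server (an assumption for runnability; SQLite 3.25 or later is needed), storing StorageDate as ISO text so that string order matches date order. It says nothing about performance on the real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, StorageDate TEXT, DuplicateTypeID INTEGER)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1, "2014-10-22", 1), (1, "2014-10-22", 2), (1, "2014-10-18", 1),
        (2, "2014-10-12", 1), (3, "2014-10-11", 1),
        (4, "2014-09-02", 1), (4, "2014-09-02", 2),
    ],
)

query = """
WITH tmp AS (
    SELECT ID, DuplicateTypeID,
           RANK() OVER (PARTITION BY ID ORDER BY StorageDate DESC) AS StorageDateRank,
           COUNT(*) OVER (PARTITION BY ID, StorageDate) AS StorageDateCount
    FROM MyTable
)
SELECT DISTINCT ID
FROM tmp
WHERE StorageDateRank = 1      -- latest date for each ID
  AND StorageDateCount > 1     -- more than one entry on that date
  AND DuplicateTypeID = 2      -- one of those entries has type 2
ORDER BY ID
"""
ids = [row[0] for row in conn.execute(query)]
print(ids)
```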