Selecting the last row in a partition in Hive - sql

I have a table t1:
c1 c2 c3 c4
1 1 1 A
1 1 2 B
1 1 3 C
1 1 4 D
1 1 4 E
1 1 4 F
2 2 1 A
2 2 2 A
2 2 3 A
I want to select the last row of each c1, c2 pair. So (1,1,4,F) and (2,2,3,A) in this case. My idea is to do something like this:
create table t2 as
select *, row_number() over (partition by c1, c2 order by c3) as rank
from t1
create table t3 as
select a.c1, a.c2, a.c3, a.c4
from t2 a
inner join
(select c1, c2, max(rank) as maxrank
from t2
group by c1, c2
) b
on a.c1=b.c1 and a.c2=b.c2
where a.rank=b.maxrank
Would this work? (Having environment issues so can't test myself)

Just use a subquery:
select t1.*
from (select t1.*, row_number() over (partition by c1, c2 order by c3 desc) as rank
from t1
) t1
where rank = 1;
Note the use of desc for the order by.
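One caveat with the sample data: three rows share c3 = 4, so ordering by c3 alone leaves the choice among D, E and F up to the engine. Below is a minimal sketch of the same subquery with c4 added as a tie-breaker. It runs against an in-memory SQLite database through Python rather than Hive (that substitution and the alias rn are my assumptions), and it needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Hive; the query shape is the same.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (c1 INT, c2 INT, c3 INT, c4 TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "A"), (1, 1, 2, "B"), (1, 1, 3, "C"),
     (1, 1, 4, "D"), (1, 1, 4, "E"), (1, 1, 4, "F"),
     (2, 2, 1, "A"), (2, 2, 2, "A"), (2, 2, 3, "A")],
)

# Number rows from the end of each (c1, c2) partition; c4 DESC breaks ties on c3.
last_rows = conn.execute("""
    SELECT c1, c2, c3, c4
    FROM (SELECT t1.*,
                 ROW_NUMBER() OVER (PARTITION BY c1, c2
                                    ORDER BY c3 DESC, c4 DESC) AS rn
          FROM t1) ranked
    WHERE rn = 1
    ORDER BY c1, c2
""").fetchall()

print(last_rows)  # [(1, 1, 4, 'F'), (2, 2, 3, 'A')]
```

If "last" really means insertion order rather than the largest c4, the table would need a column that records that order, since rows in a table have no inherent position.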

Related

hive - how to select top N elements for each match

Please consider a Hive table, TableA, as shown below.
A basic SELECT works fine when we want all the rows that match the condition in the WHERE clause. I want to limit the returned rows to a number, say N, for each value matched by the WHERE clause.
Let me explain with an example:
(1)
Consider this table:
TableA
c1 c2
1 a
1 b
1 c
2 d
2 e
2 f
(2) Consider this query:
SELECT c1, c2
FROM TableA
WHERE c1 in (1,2)
(3) As you can imagine, it would produce this result:
Actual Results:
c1 c2
1 a
1 b
1 c
2 d
2 e
2 f
(4)
Desired Result:
c1 c2
1 a
1 b
2 d
2 e
Question: How do I modify the query in (2) to get the desired output shown in (4)?
You can use row_number function to do this.
select c1,c2
from (SELECT c1, c2, row_number() over(partition by c1 order by c2) as rnum
FROM TableA
--add a where clause as needed
) t
where rnum <= 2
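If c1 takes only two values, you can combine two limited queries (each wrapped in a subquery so that ORDER BY and LIMIT apply per branch rather than to the whole union):
SELECT * FROM (SELECT c1, c2 FROM TableA WHERE c1 = 1 ORDER BY c2 LIMIT 2) a
UNION ALL
SELECT * FROM (SELECT c1, c2 FROM TableA WHERE c1 = 2 ORDER BY c2 LIMIT 2) b
For more than two values, use rank():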
select c1,c2 from
(
select c1,c2,rank() over (partition by c1 order by c2) as rank
from TableA
) t
where rank < 3;
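The two window-function answers differ only when c2 has duplicates within a c1 group: row_number() always keeps exactly N rows per group, while rank() keeps every row tied at the cutoff. Here is a rough sketch of that difference, run on in-memory SQLite via Python instead of Hive (an assumption, needing SQLite 3.25+). The helper top_n and the extra duplicate row are hypothetical, added only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (c1 INT, c2 TEXT)")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (2, "d"), (2, "e"), (2, "f")],
)

def top_n(conn, func, n):
    """Return the first n rows per c1 using the given ranking function."""
    # func is interpolated into the SQL text, so only allow known names.
    if func not in ("ROW_NUMBER", "RANK"):
        raise ValueError("unsupported ranking function")
    sql = f"""
        SELECT c1, c2
        FROM (SELECT c1, c2,
                     {func}() OVER (PARTITION BY c1 ORDER BY c2) AS rnum
              FROM TableA
              WHERE c1 IN (1, 2)) t
        WHERE rnum <= ?
        ORDER BY c1, c2
    """
    return conn.execute(sql, (n,)).fetchall()

no_ties = top_n(conn, "ROW_NUMBER", 2)

# Hypothetical duplicate, not in the question's data, to expose tie handling.
conn.execute("INSERT INTO TableA VALUES (1, 'b')")
with_row_number = top_n(conn, "ROW_NUMBER", 2)  # still 2 rows for c1 = 1
with_rank = top_n(conn, "RANK", 2)              # both 'b' rows share rank 2

print(no_ties)
print(with_row_number)
print(with_rank)
```

So if "exactly N rows per match" is the requirement, row_number() is the safer choice; rank() fits when tied rows should all be returned.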

sql numbering the partition of Numbers

I have a set of numbers like this
ID
===
1
2
3
1
2
1
1
2
3
4
5
...
I want to select a new column that increases each time the next 1 is encountered, like this:
ID number
=== ========
1 1
2 1
3 1
1 2
2 2
1 3
1 4
2 4
3 4
4 4
5 4
Any suggestions?
Assuming that you have a column o which specifies the ordering, you can use a self-join like this:
select d1.o, d1.id, count(*)
from data d1
join data d2 on d1.o >= d2.o and d2.id = 1
group by d1.o, d1.id
You can solve this with CTEs and window functions, as follows:
DECLARE @t TABLE (ID INT);
INSERT INTO @t VALUES (1),(2),(3),(1),(2),(1),(1),(2),(3),(4),(5);
WITH cte AS(
SELECT ID, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) rn
FROM @t
),
cte1 AS(
SELECT ID, rn, ROW_NUMBER() OVER (ORDER BY rn) rn2
FROM cte
WHERE ID = 1
)
SELECT c.ID, MAX(rn2) OVER (ORDER BY c.rn) rn
FROM cte c
LEFT JOIN cte1 c1 ON c1.rn = c.rn
ORDER BY c.rn
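Both answers amount to counting how many 1s have been seen so far. Where window aggregates are available, that can also be written as a running sum, which is a different technique from the two above. A small sketch on in-memory SQLite via Python (3.25+), assuming an explicit ordering column o as in the first answer; the table name data and the alias grp are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# o is the assumed ordering column; without one, row order is not defined.
conn.execute("CREATE TABLE data (o INTEGER PRIMARY KEY, id INT)")
ids = [1, 2, 3, 1, 2, 1, 1, 2, 3, 4, 5]
conn.executemany(
    "INSERT INTO data (o, id) VALUES (?, ?)",
    list(enumerate(ids, start=1)),
)

# Running count of rows with id = 1, in o order, gives the group number.
numbered = conn.execute("""
    SELECT id,
           SUM(CASE WHEN id = 1 THEN 1 ELSE 0 END) OVER (ORDER BY o) AS grp
    FROM data
    ORDER BY o
""").fetchall()

for row in numbered:
    print(row)
```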

SQL: How do I combine tables on a single but non-unique identifier?

I have two tables:
TABLE 1
ID Value ValueFromTable2
1 A NULL
1 B NULL
1 C NULL
1 D NULL
2 E NULL
2 F NULL
TABLE 2
ID Value
1 A1
1 A2
1 A3
2 BOB
2 JIM
I would like to update TABLE 1 with the values of TABLE 2 such that the following rows would result:
TABLE 1
ID Value ValueFromTable2
1 A A1
1 B A2
1 C A3
1 D NULL
2 E BOB
2 F JIM
Order is not terribly important. That is, I'm not concerned that A be paired with A1 or that B be paired with A2. I just need a full set of data from the Value column in Table 2 to be available from Table 1.
Please advise!
You need a key for joining them. The implicit key is the ordering. You can add that in explicitly, using row_number():
select coalesce(t1.id, t2.id) as id,
t1.value, t2.value
from (select t1.*, row_number() over (partition by id order by (select NULL)) as seqnum
from table1 t1
) t1 full outer join
(select t2.*, row_number() over (partition by id order by (select NULL)) as seqnum
from table2 t2
) t2
on t1.id = t2.id and t1.seqnum = t2.seqnum;
By using full outer join, all values will appear, regardless of which is the longer list.
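For the sample data, Table 1 always has at least as many rows per ID as Table 2, so a left join from Table 1 is enough to reproduce the desired result. The sketch below makes that simplification (it is not the full outer join from the answer) and orders by value so the pairing is deterministic; both choices are my assumptions. It runs on in-memory SQLite via Python (3.25+).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INT, value TEXT)")
conn.execute("CREATE TABLE table2 (id INT, value TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (1, "D"), (2, "E"), (2, "F")],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(1, "A1"), (1, "A2"), (1, "A3"), (2, "BOB"), (2, "JIM")],
)

# Number the rows within each id on both sides, then pair equal positions.
paired = conn.execute("""
    SELECT a.id, a.value, b.value
    FROM (SELECT id, value,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY value) AS seqnum
          FROM table1) a
    LEFT JOIN
         (SELECT id, value,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY value) AS seqnum
          FROM table2) b
      ON a.id = b.id AND a.seqnum = b.seqnum
    ORDER BY a.id, a.seqnum
""").fetchall()

for row in paired:
    print(row)
```

If Table 2 could have more rows than Table 1 for some ID, the left join would silently drop the extras, which is the case the full outer join covers.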

How to select entry with greater value in postgresql

I have two or more values like:
c1|c2 |c3 |c4
--+---+---+---
1 | Z | B | 29
2 | Z | B | 19
and I want to have the entry with the larger c4 value:
1 | Z | B | 29
I tried to query the max value from c4, after a group by of c2 and c3, but this doesn't work.
Postgres specific solution:
select distinct on (c2,c3) c1, c2, c3, c4
from the_table
order by c2,c3,c4 desc
ANSI SQL solution:
select c1,c2,c3,c4
from (
select c1,c2,c3,c4,
row_number() over (partition by c2,c3 order by c4 desc) as rn
from the_table
) t
where rn = 1;
You can order results in descending order by c4 and output only one row (see LIMIT clause):
SELECT *
FROM table_name
ORDER BY c4 DESC
LIMIT 1
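Note that the LIMIT 1 version returns a single row for the whole table, while the row_number() version returns one row per (c2, c3) group. With only the two rows from the question the results coincide. The sketch below adds one hypothetical row in a second group to show where they diverge, using in-memory SQLite via Python (3.25+) as a stand-in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (c1 INT, c2 TEXT, c3 TEXT, c4 INT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?, ?)",
    [(1, "Z", "B", 29), (2, "Z", "B", 19),
     (3, "Y", "A", 7)],  # hypothetical extra group, not in the question
)

# One row overall: the global maximum of c4.
global_max = conn.execute("""
    SELECT c1, c2, c3, c4 FROM the_table ORDER BY c4 DESC LIMIT 1
""").fetchall()

# One row per (c2, c3) group: the maximum of c4 within each group.
per_group = conn.execute("""
    SELECT c1, c2, c3, c4
    FROM (SELECT c1, c2, c3, c4,
                 ROW_NUMBER() OVER (PARTITION BY c2, c3
                                    ORDER BY c4 DESC) AS rn
          FROM the_table) t
    WHERE rn = 1
    ORDER BY c2, c3
""").fetchall()

print(global_max)  # [(1, 'Z', 'B', 29)]
print(per_group)   # [(3, 'Y', 'A', 7), (1, 'Z', 'B', 29)]
```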

How to set row number()

I have a table like this -
C1 C2 C3
A 20130101 10
A 20130102 10
A 20130103 20
A 20130104 10
I want to set row no like this -
C1 C2 C3 RowNo
A 20130101 10 1
A 20130102 10 2
A 20130103 20 1
A 20130104 10 1
How can I do this with a query?
Or is looping over the table the only way?
Thanks.
I have updated the answer to use a recursive CTE. It builds a hierarchy starting from the records where C3 takes a new value and reports the level as RowNo.
with t as
(select t.*, row_number () over (order by c2) rn from table1 t)
,temp (c2,c3,rn,lvl) AS
(SELECT c2,c3,rn,1 lvl from t t1
where not exists(
select 1 from t t0
where t1.rn=t0.rn+1
and t1.c3=t0.c3
)
UNION ALL
select t1.c2,t1.c3,t1.rn,lvl + 1 AS lvl FROM t t1
join temp t2 on t1.rn=t2.rn+1 and t1.c3=t2.c3)
SELECT c2, c3, lvl rowno FROM temp order by rn;
http://sqlfiddle.com/#!3/4adbd/1
The ROW_NUMBER() function can number the rows, either across the whole table or within each partition:
SELECT ROW_NUMBER() over(order by [some field]), *
FROM [your table]
SELECT ROW_NUMBER() over(PARTITION BY [C3] order by [C2]), *
FROM [your table]
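Partitioning by C3 alone would number the 20130104 row as 3, because it shares C3 = 10 with the first two rows even though the run was interrupted. One non-recursive way to restart the count at each run is the "difference of row numbers" trick, which is a different technique from the recursive CTE above. A sketch on in-memory SQLite via Python (3.25+); the alias grp is illustrative, and including C1 in the partitions is my assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (c1 TEXT, c2 INT, c3 INT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("A", 20130101, 10), ("A", 20130102, 10),
     ("A", 20130103, 20), ("A", 20130104, 10)],
)

# The difference between the overall position and the position within the
# same c3 value is constant inside each consecutive run, so it labels runs.
numbered = conn.execute("""
    SELECT c1, c2, c3,
           ROW_NUMBER() OVER (PARTITION BY c1, c3, grp ORDER BY c2) AS rowno
    FROM (SELECT c1, c2, c3,
                 ROW_NUMBER() OVER (PARTITION BY c1 ORDER BY c2)
                 - ROW_NUMBER() OVER (PARTITION BY c1, c3 ORDER BY c2) AS grp
          FROM table1) t
    ORDER BY c1, c2
""").fetchall()

for row in numbered:
    print(row)
```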