Row number for first action - impala

I have a table with user ids and action logs, and I would like to get the following result:
+---------+------------+---------+
| user_id | action_id | row_num |
+---------+------------+---------+
| id1 | action 1 | 1 |
| id1 | action 1 | 2 |
| id1 | action 2 | 1 |
| id1 | action 3 | 1 |
| id2 | action 1 | 1 |
| id2 | action 2 | 1 |
| id2 | action 3 | 1 |
| id2 | action 3 | 2 |
| id2 | action 3 | 3 |
+---------+------------+---------+
I am pretty sure I need to use the ROW_NUMBER() function, and I am trying to achieve this with the following code:
select user_id,
action_id,
row_number() over (partition by action_id order by user_id desc) as rn
from table
But it seems like I am missing something. Would you please help me?
I am using Impala SQL syntax.
Thank you in advance.

First, order by action_id instead of user_id in the ORDER BY clause:
select user_id, action_id,
row_number() over (partition by user_id, action_id order by action_id) as rn
from table t;
Second, you haven't included user_id in the PARTITION BY clause. The numbering has to restart for every (user_id, action_id) pair, so both columns belong there.

You were close. Use
row_number() over (partition by user_id,action_id order by action_id) as rn
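A minimal sketch for checking the corrected query against the sample data from the question. It uses Python's built-in sqlite3 module instead of Impala (an assumption made only so the example runs locally; window functions need SQLite 3.25 or later), and the table name action_log is hypothetical.

```python
import sqlite3

# In-memory stand-in for the question's table (hypothetical name: action_log).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE action_log (user_id TEXT, action_id TEXT)")
conn.executemany(
    "INSERT INTO action_log VALUES (?, ?)",
    [
        ("id1", "action 1"), ("id1", "action 1"),
        ("id1", "action 2"), ("id1", "action 3"),
        ("id2", "action 1"), ("id2", "action 2"),
        ("id2", "action 3"), ("id2", "action 3"), ("id2", "action 3"),
    ],
)

# Numbering restarts for every (user_id, action_id) pair.
numbered = conn.execute(
    """
    SELECT user_id, action_id,
           ROW_NUMBER() OVER (PARTITION BY user_id, action_id
                              ORDER BY action_id) AS rn
    FROM action_log
    ORDER BY user_id, action_id, rn
    """
).fetchall()

for row in numbered:
    print(row)
```

Partitioning by action_id alone, as in the question, would number all rows of an action across users in one sequence, which is why id2 would not restart at 1.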

SQL SERVER How to select the latest record in each group? [duplicate]

| ID | TimeStamp | Item |
|----|-----------|------|
| 1 | 0:00:20 | 0 |
| 1 | 0:00:40 | 1 |
| 1 | 0:01:00 | 1 |
| 2 | 0:01:20 | 1 |
| 2 | 0:01:40 | 0 |
| 2 | 0:02:00 | 1 |
| 3 | 0:02:20 | 1 |
| 3 | 0:02:40 | 1 |
| 3 | 0:03:00 | 0 |
I have this and I would like to turn it into
| ID | TimeStamp | Item |
|----|-----------|------|
| 1 | 0:01:00 | 1 |
| 2 | 0:02:00 | 1 |
| 3 | 0:03:00 | 0 |
Please advise, thank you!
A correlated subquery is often the fastest method:
select t.*
from t
where t.timestamp = (select max(t2.timestamp)
from t t2
where t2.id = t.id
);
For this, you want an index on (id, timestamp).
You can also use row_number():
select t.*
from (select t.*,
row_number() over (partition by id order by timestamp desc) as seqnum
from t
) t
where seqnum = 1;
This is typically a wee bit slower because it needs to assign the row number to every row, even those not being returned.
You need to partition by ID, rank the rows within each partition by TimeStamp in descending order using an analytic function in a subquery, and then keep only the rows ranked first (value 1):
SELECT *
FROM
(
SELECT *,
DENSE_RANK() OVER (PARTITION BY ID ORDER BY TimeStamp DESC) AS dr
FROM t
) t
WHERE t.dr = 1
The DENSE_RANK() analytic function is used here so that rows tied for the latest TimeStamp within an ID are all returned.
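To illustrate the difference on ties, here is a rough sketch using Python's sqlite3 module as a stand-in for SQL Server (an assumption for runnability; SQLite 3.25+ is needed for window functions). The sample uses a subset of the question's rows plus one hypothetical tied row, and the helper latest_per_id is invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, timestamp TEXT, item INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "0:00:40", 1),
        (1, "0:01:00", 1),
        (1, "0:01:00", 0),  # hypothetical tie, not in the question's data
        (2, "0:01:40", 0),
        (2, "0:02:00", 1),
    ],
)

def latest_per_id(rank_fn):
    """Return the rows ranked 1 per id using the given window function name."""
    sql = f"""
        SELECT id, timestamp, item
        FROM (SELECT t.*,
                     {rank_fn}() OVER (PARTITION BY id
                                       ORDER BY timestamp DESC) AS rnk
              FROM t) AS ranked
        WHERE rnk = 1
        ORDER BY id, item
    """
    return conn.execute(sql).fetchall()

with_row_number = latest_per_id("ROW_NUMBER")  # exactly one row per id
with_dense_rank = latest_per_id("DENSE_RANK")  # every row tied for latest

print(with_row_number)
print(with_dense_rank)
```

With ROW_NUMBER(), which of the two tied rows for id 1 comes back is not determined by the ORDER BY, so add a tie-breaking column there if exactly one predictable row is needed.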

SQL how to remove duplicate records

How can I clean up a table by removing the duplicate records?
+----------+--------+------------+
| clientID | status | Insertdate |
+----------+--------+------------+
| 1 | new | 20191206 |
| 1 | new | 20191206 |
| 2 | old | 20191206 |
| 2 | old | 20191206 |
| 3 | new | 20191205 |
| 3 | new | 20191205 |
+----------+--------+------------+
I don't have any identity field.
You can use ROW_NUMBER() to number the rows within each group of identical records and then delete every row after the first:
;WITH cte as (
select clientid
, status, Insertdate
, ROW_NUMBER() over (partition by clientid, status, Insertdate order by clientid) RowNumber
from Yourtable
)
delete from cte where RowNumber > 1
If you are running MySQL, this query lists the duplicated rows and how often each occurs (it only finds them, it does not delete them):
SELECT clientID, status, Insertdate, count(*)
FROM table_name
GROUP BY clientID, status, Insertdate
having count(*) > 1
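Both ideas can be tried together in a small sketch. It runs on Python's sqlite3 module, which is an assumption for the sake of a runnable example: SQLite cannot delete through a CTE the way the SQL Server answer does, so the delete below swaps in SQLite's implicit rowid to address the surplus rows. The table name clients is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clients (clientID INTEGER, status TEXT, Insertdate TEXT)"
)
sample = [(1, "new", "20191206"), (2, "old", "20191206"), (3, "new", "20191205")]
conn.executemany("INSERT INTO clients VALUES (?, ?, ?)", sample + sample)

# Step 1: list the duplicated groups (same shape as the GROUP BY/HAVING query).
duplicates = conn.execute(
    """
    SELECT clientID, status, Insertdate, COUNT(*)
    FROM clients
    GROUP BY clientID, status, Insertdate
    HAVING COUNT(*) > 1
    ORDER BY clientID
    """
).fetchall()

# Step 2: number the rows inside each group and delete all but the first.
# rowid is SQLite-specific; it stands in for the missing identity column.
conn.execute(
    """
    DELETE FROM clients
    WHERE rowid IN (
        SELECT rid
        FROM (SELECT rowid AS rid,
                     ROW_NUMBER() OVER (PARTITION BY clientID, status, Insertdate
                                        ORDER BY rowid) AS rn
              FROM clients) AS numbered
        WHERE rn > 1
    )
    """
)
remaining = conn.execute("SELECT * FROM clients ORDER BY clientID").fetchall()

print(duplicates)
print(remaining)
```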

SQL - How to do something like value.Contains?

Can someone help me? I need to exclude some repeated values. The current result is:
There are some rows with null values, and in those cases I show 'No Informado'.
In rows 26 to 32, value1 and value2 are the same, but value3 is different.
I need this result:
id | name | user
0x00E281759429DD4B807F467F8B2319E3 | PC_XBPOX0112 | llopez
0x00F37F5DA2C8854699EFBA30F7102DDD | PC_BSCTY1312 | No Informado
0x00F53DBE60CFF343942E3893ABA809EB | PC_SVCTY6834 | ntapia
0x00FDB75C00B8D84E8A1862A56C71A766 | NB_TSCTY06606 | jogonzalez
0x010029519191B34BB498E7F9FEAE3E21 | PC_BSCTY3229 | kfuentes
0x011506756396BC4588E705BFCFA84847 | PC_BSCTY3134 | csepulveda
0x0120BE537B242C4EB01C4F94E82E64BF | PC_BSCTY1296 | eaviles
0x01322ABEC4F19E41B2139291952838EE | PC_VSCTY6535 | vbravo
0x0133C6B80B50E44A928AF770510856E3 | PC_FSCTY0084 | mcarreno
0x01463ECF32DEBD41943330EC7C1822D4 | PC_BSCTY3220 | fegonzalez
0x01610C718C04264A8349FAEA6676363F | PC-FSCTY0543 | fcastro
Can someone help me?
Thanks in advance!
Another option is the WITH TIES clause in concert with Row_Number()
Example
Select Top 1 With Ties *
From YourTable
Order by Row_Number() over (Partition By ID Order by Date Desc)
Returns
id name date
1 name1 2018-01-01
2 name2 2018-01-01
3 name5 2018-02-01
SELECT Id
, MAX(name) AS Name
, MAX([date]) AS [date]
FROM TableName
GROUP BY Id

Oracle query - merging multiple results in a single row

I have the following query, which returns two rows.
Query:
select /*+ parallel(16) */ * from CONTRACT where CONTRACT_ID ='1234';
Result:
_____________________________________________________________________________________
|CONTRACT_SOURCE | CONTRACT_ID | ROLE | ROLE_ID | STD_CD | INDEX
_____________________________________________________________________________________
|Source | 1234 | role_driver | unique1 | LOAD | 9
|Source | 1234 | role_insured| unique2 | LOAD | 9
_____________________________________________________________________________________
I would like to fetch these rows merged into a single row, in the format below.
_____________________________________________________________________________________________________________________
|CONTRACT_SOURCE | CONTRACT_ID | ROLE | ROLE_ID | ROLE | ROLE_ID | STD_CD | INDEX |
_____________________________________________________________________________________________________________________
|Source | 1234 | role_driver | unique1 | role_insured | unique2 | LOAD | 9 |
_____________________________________________________________________________________________________________________
Can I achieve this through an Oracle query?
You can use row_number and conditional aggregation to get the required multi-column pivot. INDEX is a reserved word in Oracle, so the column has to be written as a quoted identifier:
select contract_source,
       contract_id,
       std_cd,
       "INDEX",
       -- rn = 1 is the first role per contract, rn = 2 the second
       max(case when rn = 1 then role end) as role_1,
       max(case when rn = 1 then role_id end) as role_id_1,
       max(case when rn = 2 then role end) as role_2,
       max(case when rn = 2 then role_id end) as role_id_2
from (
    select c.*,
           row_number() over (
               partition by contract_source, contract_id, std_cd, "INDEX"
               order by role_id
           ) as rn
    from contract c
) t
group by contract_source, contract_id, std_cd, "INDEX"
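The pivot can be checked against the two sample rows with a short sketch. It runs on Python's sqlite3 module instead of Oracle (an assumption for runnability; SQLite 3.25+ is required), drops the parallel hint, and quotes the INDEX column because that name is a keyword in SQLite as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE contract (
        contract_source TEXT, contract_id TEXT,
        role TEXT, role_id TEXT, std_cd TEXT, "INDEX" INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO contract VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("Source", "1234", "role_driver", "unique1", "LOAD", 9),
        ("Source", "1234", "role_insured", "unique2", "LOAD", 9),
    ],
)

# Number the roles per contract, then fold each numbered row into its own
# pair of columns with MAX(CASE ...).
merged = conn.execute(
    """
    SELECT contract_source, contract_id, std_cd, "INDEX",
           MAX(CASE WHEN rn = 1 THEN role END)    AS role_1,
           MAX(CASE WHEN rn = 1 THEN role_id END) AS role_id_1,
           MAX(CASE WHEN rn = 2 THEN role END)    AS role_2,
           MAX(CASE WHEN rn = 2 THEN role_id END) AS role_id_2
    FROM (SELECT c.*,
                 ROW_NUMBER() OVER (
                     PARTITION BY contract_source, contract_id, std_cd, "INDEX"
                     ORDER BY role_id
                 ) AS rn
          FROM contract c) AS t
    GROUP BY contract_source, contract_id, std_cd, "INDEX"
    """
).fetchall()

print(merged)
```

This approach only covers as many roles as there are CASE pairs in the select list; a contract with a third role would need role_3 and role_id_3 columns added by hand.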

Do I need a recursive CTE to update a table that relies on itself?

I need to apologize for the title. I put a lot of thought into it but didn't get too far.
I have a table that looks like this:
+--------------------------------------+--------------------------------------+--------------------------------------+--------------------------------------+--------+
| accountid | pricexxxxxid | accountid | pricelevelid | counts |
+--------------------------------------+--------------------------------------+--------------------------------------+--------------------------------------+--------+
| 36B077D4-E765-4C70-BE18-2ECA871420D3 | 00000000-0000-0000-0000-000000000000 | 36B077D4-E765-4C70-BE18-2ECA871420D3 | F43C47CE-28C6-42E2-8399-92C58ED4BA9D | 1 |
| EBC18CBC-2D2E-44CB-B36A-0ADE9E2BDE9F | 00000000-0000-0000-0000-000000000000 | EBC18CBC-2D2E-44CB-B36A-0ADE9E2BDE9F | 3BEEA9D3-F26B-47E4-88FA-A2AA366980ED | 1 |
| 8DC8D0FC-3138-425A-A922-2F0CAC57E887 | 00000000-0000-0000-0000-000000000000 | 8DC8D0FC-3138-425A-A922-2F0CAC57E887 | F1B8AD5D-B008-4C3F-94A0-AD3F90C777D7 | 1 |
| 8F908A92-1327-4655-BAE4-C890D971A554 | 00000000-0000-0000-0000-000000000000 | 8F908A92-1327-4655-BAE4-C890D971A554 | 2E0EC67E-5F8F-4305-932E-BBF8DF83DBEC | 1 |
| 37221AAC-B885-4002-B7D9-591F8C14D019 | 00000000-0000-0000-0000-000000000000 | 37221AAC-B885-4002-B7D9-591F8C14D019 | F4A2A0CA-FDFF-4C21-AE92-D4583DC18DED | 1 |
| 66F406B4-0D9B-40B8-9A23-119EE74B00B7 | 00000000-0000-0000-0000-000000000000 | 66F406B4-0D9B-40B8-9A23-119EE74B00B7 | 204B8570-CEBA-4C72-9B72-8B9B14AF625E | 2 |
| D0168CE3-479E-439E-967C-4FF0D701291A | 00000000-0000-0000-0000-000000000000 | D0168CE3-479E-439E-967C-4FF0D701291A | 204B8570-CEBA-4C72-9B72-8B9B14AF625E | 2 |
| 57E5F6E5-0A8A-4E54-B793-2F6493DC1EA3 | 00000000-0000-0000-0000-000000000000 | 57E5F6E5-0A8A-4E54-B793-2F6493DC1EA3 | 893F9FD2-43C9-4355-AEFC-08A62BF2B066 | 3 |
+--------------------------------------+--------------------------------------+--------------------------------------+--------------------------------------+--------+
It is sorted by ascending counts.
I would like to update the pricexxxxid values, which are all 00000000-0000-0000-0000-000000000000, with their corresponding pricelevelid.
For example for accountid = 36B077D4-E765-4C70-BE18-2ECA871420D3 I would like the pricexxxxid to be F43C47CE-28C6-42E2-8399-92C58ED4BA9D.
After that is done, I would like all the records FOLLOWING this one where accountid = 36B077D4-E765-4C70-BE18-2ECA871420D3 to be deleted.
In other words, I will end up with a distinct list of accountids, each with its pricexxxxid set to the corresponding value from pricelevelid.
Thank you so much for your guidance.
For your first case:
update table
set pricexxxxid = pricelevelid
If I understand your second case correctly (delete duplicates / select distinct):
-- T-SQL: the derived table's alias is named as the delete target
delete x from
(
select *, rn = row_number() over (partition by accountid order by accountid) from table
) x
where x.rn > 1
--select distinct * from table
edited
select * from
(
select *,rn=row_number()over(partition by accountid order by accountid) from table
)x
where x.rn=1
updated
SELECT accountid, pricelevelid FROM
(
SELECT *,
Row_number() OVER ( partition BY accountid ORDER BY counts, pricelevelid ) AS Recency
FROM table
) x
WHERE x.Recency = 1