Situation:
I have three columns:
id
date
tx_id
The primary id column is tx_id and is unique in the table. Each tx_id is tied to an id and it has a record date. I would like to test whether or not the tx_id is incremental.
Objective:
I need to extract the first tx_id for each id, but I want to avoid using ROW_NUMBER,
i.e.
select id, date, tx_id, row_number() over(partition by id order by date asc) as First_transaction_id from table
and simply use
select id, MIN(tx_id) as First_transaction_id from table group by id
So, since I have more than 50 million ids, how can I make sure that using MIN(tx_id) will yield the earliest transaction for each id?
How can I add a flag column to segment those that don't satisfy the condition?
how can I make sure, since I have more than 50 million ids, that using MIN(tx_id) will yield the earliest transaction for each id?
Simply do the comparison:
You can get the exceptions with logic like this:
select t.*
from (select t.*,
min(tx_id) over (partition by id) as min_tx_id,
rank() over (partition by id order by date) as seqnum
from t
) t
where tx_id = min_tx_id and seqnum > 1;
Note: this uses rank(). It seems possible that there could be two transactions for an id on the same date.
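For the flag column asked about in the question, one possible sketch is below. It runs the same rank() idea through Python's built-in sqlite3 module (assuming a SQLite build with window-function support, 3.25 or later). The sample rows and the column name not_incremental are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer, date text, tx_id integer primary key)")
conn.executemany(
    "insert into t (id, date, tx_id) values (?, ?, ?)",
    [
        (1, "2020-01-01", 10),
        (1, "2020-01-05", 11),
        (2, "2020-02-03", 21),  # smallest tx_id for id 2, but not the earliest date
        (2, "2020-02-01", 22),
    ],
)

# One row per id: the flag is 1 when min(tx_id) is not among the
# earliest-dated rows (rank 1) for that id, and 0 otherwise.
flag_query = """
select id,
       min(tx_id) as min_tx_id,
       case when min(tx_id) = min(case when seqnum = 1 then tx_id end)
            then 0 else 1 end as not_incremental
from (select t.*,
             rank() over (partition by id order by date) as seqnum
      from t
     ) t
group by id
order by id
"""
flags = conn.execute(flag_query).fetchall()
print(flags)
```

Here id 1 gets a flag of 0 and id 2 gets a flag of 1, so only the flagged ids would still need the date-based lookup.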
Use a correlated subquery:
select t.*
from table_name t
where t.date = (select min(t1.date)
from table_name t1
where t1.id = t.id);
I would appreciate help with a case that I'm working on in Oracle (PL/SQL).
Suppose I have one table named TableA: TableA
The rule of TableA sorting is :
Every CASE_ID & CONTRACT with an 'SD' TRIGGER has to be placed on top, regardless of the SCORE.
After all of the CONTRACT & CASE_ID rows with an 'SD' TRIGGER are placed on top, the remaining CASE_ID & CONTRACT rows are sorted by SCORE descending.
I want to assign one unique number per CASE_ID, ascending from 1, so that CONTRACTs with the same CASE_ID have the same number. An example of the solution that I'm trying to obtain is: Example Solution
I have tried using DENSE_RANK with the following query:
select a.*
,dense_rank() over (partition by a.case_id order by rn)
from (
select a.*,rownum as rn from TableA a
)a
But the result is still not what I want: some different CASE_IDs are assigned the same number.
I would appreciate any input on this.
Thank you very much!
You can use dense_rank(). It looks like:
select a.*,
dense_rank() over (order by a.case_id)
from TableA a;
No partition by is needed.
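To see the difference a partition makes, here is a small sketch run through Python's sqlite3 module (window functions need SQLite 3.25 or later). The rows and the column aliases case_number and per_case are invented for illustration, and the ordering here is by case_id only, not the 'SD'/SCORE ordering from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table TableA (case_id text, contract text)")
conn.executemany(
    "insert into TableA values (?, ?)",
    [("C2", "K1"), ("C1", "K2"), ("C2", "K3"), ("C3", "K4")],
)

rows = conn.execute(
    """
    select a.case_id, a.contract,
           -- no partition: one number per distinct case_id
           dense_rank() over (order by a.case_id) as case_number,
           -- with a partition: numbering restarts inside every case_id
           dense_rank() over (partition by a.case_id order by a.contract) as per_case
    from TableA a
    order by a.case_id, a.contract
    """
).fetchall()

for row in rows:
    print(row)
```

Both C2 rows share case_number 2, while per_case restarts at 1 for every case_id, which is why the partitioned version hands the same number to different cases.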
dense_rank() with a partition restarts the ranking inside each partition, so every CASE_ID would get its own sequence starting at 1. To use dense_rank() without a partition, refer to Gordon's answer.
Another way is to compute a row_number() over the distinct case_id values and join the result back to the original table:
SELECT TableA.*, bar.case_number
FROM TableA
JOIN
(SELECT foo.CASE_ID AS case_id, ROW_NUMBER() OVER (ORDER BY foo.CASE_ID) AS case_number
FROM (SELECT DISTINCT CASE_ID FROM TableA) foo) bar
ON TableA.CASE_ID = bar.case_id;
I have the following temporary table
The aim is to flag the data that has more than one record and put 'More than one records' in nom_etp.
In my example below, if Siren appears more than once, I would have
Siren      ETS_RS                                 Voie                      Ville              nom_etp
348177155  POITOU-CHARENTES ENGRAIS P.C.E. (SNC)  BOULEVARD WLADIMIR MORCH  17000 LA ROCHELLE  More than one records
For records that appear only once, I will have the single name of the company (here nom_etp)
Siren      ETS_RS                         Voie                 Ville              nom_etp
344843347  PRESTIGE AUTO ROCHELAIS (SAS)  4 RUE JEAN DEMEOCQ   17000 LA ROCHELLE  NIGER
I tried a few things based on the idea that if I could get a count greater than one, I could flag those rows easily with a CASE:
First: I tried to do a count
WITH cte_ssrep_moraux AS (...)
SELECT SIREN,ETS_RS,Voie,Ville
,Denomination AS nom_etp,COUNT(SIREN)
FROM cte_ssrep_moraux
GROUP BY ETS_RS,Voie,Ville,Denomination,SIREN
It hit a snag: all counts were equal to one and I got the same dataset as in the picture...
Second:
WITH cte_ssrep_moraux AS (...)
SELECT ETS_RS,Voie,Ville
,Denomination AS nom_etp,SIREN,
RANK() OVER (PARTITION BY ETS_RS ORDER BY ETS_RS ASC) AS xx
FROM cte_ssrep_moraux
GROUP BY ETS_RS,Voie,Ville,Denomination,SIREN
It hit the same snag: every value was equal to one and I got the same dataset as in the picture...
I'm a bit confused about what I should do next. I have the feeling it will be an easy one and I'll facepalm.
Many thanks for reading my question.
If this is your criteria:
if Siren appears more than once,
Then the group by clause should only contain Siren:
SELECT SIREN, COUNT(*)
FROM cte_ssrep_moraux
GROUP BY SIREN
HAVING COUNT(*) > 1;
I'm not sure what you want to do after that, but this will return the SIREN values that appear more than once.
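A small sketch of why the original GROUP BY returned only counts of one, run through Python's sqlite3 module. The table is cut down to two columns and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table cte_ssrep_moraux (siren text, denomination text)")
conn.executemany(
    "insert into cte_ssrep_moraux values (?, ?)",
    [("111", "Firm A"), ("111", "Firm B"), ("222", "Firm C")],
)

# Grouping by every selected column makes each distinct row its own group,
# so every count is 1.
by_all_columns = conn.execute(
    """
    select siren, denomination, count(*)
    from cte_ssrep_moraux
    group by siren, denomination
    order by siren, denomination
    """
).fetchall()

# Grouping by siren alone counts the rows per siren.
by_siren = conn.execute(
    """
    select siren, count(*)
    from cte_ssrep_moraux
    group by siren
    having count(*) > 1
    """
).fetchall()

print(by_all_columns)
print(by_siren)
```

Because the denomination differs between the two rows of siren 111, the first query never sees them as one group; only the second one reports the duplicate.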
If there is more than one row and you change every nom_etp to 'more than one record', you end up with identical rows. That's why I prepared a tweaked query. See the following (table simplified for clarity):
CREATE TABLE Duplicates
(
Id int,
Name varchar(20),
Item varchar(20)
)
INSERT Duplicates VALUES
(1,'Name1', 'Item1'),
(2,'Name2', 'Item2'),
(2,'Name2', 'Item3'),
(3,'Name3', 'Item4'),
(3,'Name3', 'Item5'),
(3,'Name3', 'Item6'),
(4,'Name4', 'Item7');
If you need just a query:
WITH Numbered AS
(
SELECT Id, Name, Item,
ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Id) RowNum,
COUNT(*) OVER (PARTITION BY Id) TotalInGroup
FROM Duplicates
)
SELECT Id, Name,
CASE WHEN TotalInGroup>1 THEN 'More records' ELSE Item END Item
FROM Numbered
WHERE RowNum=1; -- keep a single row per Id
If you need to normalize:
WITH Numbered AS
(
SELECT Id, Name, Item,
ROW_NUMBER() OVER (ORDER BY Id) Number,
ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Id) RowNum,
COUNT(*) OVER (PARTITION BY Id ORDER BY ID) TotalInGroup
FROM Duplicates
)
MERGE Numbered AS tgt
USING Numbered AS src
ON src.Number=tgt.Number
WHEN MATCHED AND tgt.RowNum=1 AND tgt.TotalInGroup>1 THEN
UPDATE SET tgt.Item='More'
WHEN MATCHED AND tgt.RowNum>1 THEN
DELETE;
The table will then look like this:
Id Name Item
-- ---- ----
1 Name1 Item1
2 Name2 More
3 Name3 More
4 Name4 Item7
If there are multiple rows with the same Id, the first of them is updated with the constant 'More' and all the others in the group are deleted.
Use a CTE for this purpose:
;WITH CTE AS(
SELECT ETS_RS,Voie,Ville,Denomination AS nom_etp,SIREN,
ROW_NUMBER() OVER (PARTITION BY ETS_RS ORDER BY ETS_RS ASC) AS RN
FROM cte_ssrep_moraux
--GROUP BY ETS_RS,Voie,Ville,Denomination,SIREN
)
SELECT ETS_RS,
Voie,Ville,
CASE WHEN RN > 1 THEN 'More than one records'
ELSE nom_etp
END AS 'nom_etp',
SIREN
FROM CTE
;with cte
as
(
select siren,count(*) as cnt
from
yourtable
group by siren
having count(*)>1
)
update t
set nom_etp='more than one records'
from yourtable t where exists(select 1 from cte c where c.siren=t.siren)
Since you still want all the records, including the unique ones, you can use COUNT as a window function, with a CASE to choose what to display as nom_etp:
select Siren, ETS_RS, Voie, Ville,
(case when count(*) over (partition by Siren) > 1 then 'More than one records' else nom_etp end) as nom_etp
from cte_ssrep_moraux;
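A runnable sketch of the window COUNT plus CASE idea, using Python's sqlite3 module (SQLite 3.25 or later). The rows are invented and the table is cut down to three columns. The DISTINCT is an addition that is not part of the query above; it only collapses the flagged rows when their remaining columns are identical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table cte_ssrep_moraux (siren text, ets_rs text, nom_etp text)")
conn.executemany(
    "insert into cte_ssrep_moraux values (?, ?, ?)",
    [
        ("111", "ETS ONE", "Firm A"),
        ("111", "ETS ONE", "Firm B"),
        ("222", "ETS TWO", "Firm C"),
    ],
)

# count(*) over (partition by siren) attaches the per-siren row count to
# every row without collapsing the rows the way GROUP BY would.
query = """
select distinct siren, ets_rs,
       case when count(*) over (partition by siren) > 1
            then 'More than one records'
            else nom_etp
       end as nom_etp
from cte_ssrep_moraux
order by siren
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Without the DISTINCT, siren 111 would come back twice with the same flagged text.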
Here is what I did:
WITH cte_ssrep_moraux AS (
SELECT SIREN,ETS_RS,Voie,Ville
,Denomination AS nom_etp,ROW_NUMBER()
OVER (PARTITION BY ETS_RS ORDER BY ETS_RS ASC) AS Counting
FROM
(my_initial_cte) AS tb
)
SELECT Siren, ETS_RS, Voie, Ville,nom_etp
FROM cte_ssrep_moraux
WHERE counting = 1
AND Siren NOT IN (SELECT Siren FROM cte_ssrep_moraux WHERE counting > 1)
UNION ALL
SELECT DISTINCT Siren, ETS_RS, Voie, Ville,'More than one records'
FROM cte_ssrep_moraux
WHERE counting > 1
Explanation: After the initial CTE, I tried many of the solutions mentioned above, especially the ones using CASE.
The issue with CASE was that it would produce something like this:
Siren ETS_RS Voie Ville nom_etp
xxxx xyxy xyzet Bordeaux More than one records
xxxx xyxy xyzet Bordeaux More than one records
xxxx xyxy xyzet Bordeaux More than one records
xxxy zzzy ssare Paris Firm ABC
So instead of putting everything under a CASE, I split the query into two parts:
The first part returns the rows with a counting equal to 1 whose Siren never reaches a counting above 1
The second part returns the rest, with a counting above 1, collapsed with a DISTINCT
The two results are combined with a UNION ALL, since both sets have the same number of columns
ID Sum Name
a 10 Joe
a 8 Mary
b 21 Kate
b 110 Casey
b 67 Pierce
What would you recommend as the best way to obtain, for each ID, the name that corresponds to the largest sum (grouping by ID)?
What I tried so far:
select ID, SUM(Sum) s, Name
from Table1
group by ID, Name
Order by SUM(Sum) DESC;
This arranges the records so that the groups with the highest sum come first. Then I have to somehow flag those records and keep only those. Any tips or pointers? Thanks a lot.
In the end I'd like to obtain:
a 10 Joe
b 110 Casey
You want the row_number() function:
select id, [sum], name
from (select t.*,
row_number() over (partition by id order by [sum] desc) as seqnum
from table1 t
) t
where seqnum = 1;
Your question is more confusing than it needs to be because you have a column called sum. You should avoid using SQL reserved words for identifiers.
The row_number() function assigns a sequential number to a group of rows, starting with 1. The group is defined by the partition by clause. In this case, all rows with the same id are in the same group. The ordering of the numbers is determined by the order by clause, so the one with the largest value of sum gets the value of 1.
If you might have duplicate maximum values and you want all of them, use the related function rank() or dense_rank().
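The difference between row_number() and rank() on ties can be checked with a short sketch in Python's sqlite3 module (SQLite 3.25 or later). It uses the rows from the question plus one invented tie row (a, 10, Ann). The column is renamed to total to stay away from the sum keyword, and top_rows is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id text, total integer, name text)")
conn.executemany(
    "insert into table1 values (?, ?, ?)",
    [
        ("a", 10, "Joe"), ("a", 8, "Mary"),
        ("b", 21, "Kate"), ("b", 110, "Casey"), ("b", 67, "Pierce"),
        ("a", 10, "Ann"),  # invented row that ties with Joe
    ],
)

def top_rows(window_function):
    """Return the top row(s) per id using row_number or rank."""
    query = f"""
    select id, total, name
    from (select t.*,
                 {window_function}() over (partition by id order by total desc) as seqnum
          from table1 t
         ) t
    where seqnum = 1
    order by id, name
    """
    return conn.execute(query).fetchall()

with_row_number = top_rows("row_number")  # exactly one row per id
with_rank = top_rows("rank")              # every row tied for the maximum
print(with_row_number)
print(with_rank)
```

With row_number() one of the two tied rows for id a is picked arbitrarily, while rank() returns both Ann and Joe.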
select *
from
(
select *
,rn = row_number() over (partition by Id order by [sum] desc)
from table1
)x
where x.rn=1
I have a view (that is a union of several tables) and I need to filter out duplicates. The table looks like this:
id first last logo email entered
1 joe smith i.jpg e#m.c 2014-01-27
2 jim smith b.jpg e#j.c 2014-01-27
3 bob smith z.jpg b#b.c 2014-01-27
9 joeseph smith q.gif e#m.c 2014-01-20
I want to do something like this, but I can't seem to get a valid syntax for it:
SELECT
email, MAX(entered), first, last -- such that first and last come from the same row as the MAX(entered)
FROM
my_view
GROUP BY
email
Since your names are not the same on the duplicate email rows, you must use the row_number() function instead:
select email, entered, first, last
from (
select *, row_number() over (partition by email order by entered desc) rn
from my_view
) x
where rn = 1
You need a subquery because row_number() is not allowed in the where clause.
You want to use row_number():
SELECT email, entered, first, last
FROM (select v.*, row_number() over (partition by email order by entered desc) as seqnum
from my_view v
) v
WHERE seqnum = 1;
row_number() is a window function that assigns sequential numbers to groups of rows. The groups are defined by the partition by clause. In this case, everything with the same email is in the same group. The first row is given a value 1; the ordering is based on the order by clause.
The outer query selects the first one, which has the largest entered date.
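As a quick check of this pattern, here is a sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module (SQLite 3.25 or later). A plain table stands in for the view, and first and last are double-quoted in case the engine treats them as keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'create table my_view (id integer, "first" text, "last" text, '
    "logo text, email text, entered text)"
)
conn.executemany(
    "insert into my_view values (?, ?, ?, ?, ?, ?)",
    [
        (1, "joe", "smith", "i.jpg", "e#m.c", "2014-01-27"),
        (2, "jim", "smith", "b.jpg", "e#j.c", "2014-01-27"),
        (3, "bob", "smith", "z.jpg", "b#b.c", "2014-01-27"),
        (9, "joeseph", "smith", "q.gif", "e#m.c", "2014-01-20"),
    ],
)

# Number the rows of each email from newest to oldest, then keep number 1.
latest = conn.execute(
    """
    select email, entered, "first", "last"
    from (select v.*,
                 row_number() over (partition by email order by entered desc) as seqnum
          from my_view v
         ) v
    where seqnum = 1
    order by email
    """
).fetchall()
for row in latest:
    print(row)
```

The duplicate email e#m.c keeps only the 2014-01-27 row (joe), and the older joeseph row is dropped.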