How to give the serial number if data is repeating - sql

If my table has these values, I need to generate the seqno column:
ClientId clinetLocation seqno
001 Abc 1
001 BBc 2
001 ccd 3
002 Abc 1
002 BBc 2
003 ccd 1

You are looking for the row_number() function:
select ClientId, clinetLocation,
row_number() over (partition by ClientId order by clinetLocation) as seqnum
from t;
This is a standard function available in most databases.
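As a minimal sketch of what the partitioned row_number() produces, the snippet below runs the same query against the sample rows with Python's built-in sqlite3 module. SQLite (3.25 or later, for window functions) is only a stand-in for whatever database you use, and the in-memory setup is an assumption made for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table t (ClientId text, clinetLocation text)")
conn.executemany(
    "insert into t values (?, ?)",
    [("001", "Abc"), ("001", "BBc"), ("001", "ccd"),
     ("002", "Abc"), ("002", "BBc"), ("003", "ccd")],
)

# The numbering restarts at 1 for every ClientId partition.
rows = conn.execute(
    """
    select ClientId, clinetLocation,
           row_number() over (partition by ClientId
                              order by clinetLocation) as seqnum
    from t
    order by ClientId, seqnum
    """
).fetchall()

for row in rows:
    print(row)
```

The partition by clause decides where the counter restarts, and the order by inside over() decides the numbering within each client.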

Another option is a running count over the grouped rows, restarting for each ClientId:
select count(1) over ( partition by ClientId order by ClientLocation ) as seqno,
ClientId, ClientLocation
from tab
group by ClientId, ClientLocation;
This relies on each ClientId & ClientLocation combination being unique after grouping.

Related

How to use FIND_IN_SET and sum column in my SQL query

Can anyone help me? I have a table result like this:
id_user score type
001 30 play
001 40 play
001 30 redeem
002 20 play
002 30 redeem
I want to sum the score column grouped by id_user for rows of type 'play', and then show a ranking using find_in_set. This is the result I want to display:
id_user total rank
001 70 1
002 20 2
Previously I used the rank() function in MySQL version 10.4, but it does not work in MySQL version 15.1. This is my previous query:
SELECT id_user, SUM(score) AS total,
RANK() OVER (ORDER BY total DESC) AS rank
FROM result
WHERE type='play'
GROUP BY id_user
I have made some changes to your query, and it works now. The column alias total cannot be referenced inside the OVER() clause of RANK(), so SUM(score) has to be used in its ORDER BY instead. And since rank is a reserved word, I used rnk as the alias.
DB-Fiddle:
create table result (id_user varchar(5), score int, type varchar(20));
insert into result values('001',30 ,'play');
insert into result values('001',40 ,'play');
insert into result values('001',30 ,'redeem');
insert into result values('002',20 ,'play');
insert into result values('002',30 ,'redeem');
Query:
select id_user, SUM(score) AS total, RANK() OVER (ORDER BY SUM(score) DESC) AS rnk FROM result where type='play' GROUP BY id_user
Output:
id_user total rnk
001 70 1
002 20 2
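For readers who want to try the corrected query without a MySQL or MariaDB server, here is a small sketch that runs it through Python's built-in sqlite3 module. SQLite 3.25+ is an assumption standing in for the original database, so reserved-word behaviour may differ from your server; the table and data are the ones from the fiddle above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL/MariaDB (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table result (id_user text, score int, type text)")
conn.executemany("insert into result values (?, ?, ?)", [
    ("001", 30, "play"), ("001", 40, "play"), ("001", 30, "redeem"),
    ("002", 20, "play"), ("002", 30, "redeem"),
])

# The window's ORDER BY repeats the aggregate instead of using the alias.
ranked = conn.execute("""
    select id_user, sum(score) as total,
           rank() over (order by sum(score) desc) as rnk
    from result
    where type = 'play'
    group by id_user
    order by rnk
""").fetchall()

print(ranked)
```

The rows are filtered and grouped first, and the window function is evaluated over the grouped rows, which is why the aggregate is available inside over().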
If your MySQL version doesn't support rank(), you can use a subquery to achieve the same result:
Query:
select id_user, SUM(score) AS total,
coalesce((select count(distinct id_user) from result r2
where type='play'
group by id_user
having sum(r2.score)>sum(r.score) ),0)+1 AS rnk
FROM result r where type='play'
GROUP BY id_user
Output:
id_user total rnk
001 70 1
002 20 2

How to select sequential duplicates in SQL Server

I would like to select duplicate entries from a SQL Server table, but only if the id is consecutive.
I have been trying to twist this answer to my needs, but I can't get it to work.
The above answer is for Oracle, but I see that SQL Server also has lead and lag functions.
Also, I think that the answer above puts a * next to duplicates, but I only want to select the duplicates.
select
id, companyName,
case
when companyName in (prev, next)
then '*'
end match,
prev,
next
from
(select
id,
companyName,
lag(companyName, 1) over (order by id) prev,
lead(companyName, 1) over (order by id) next
from
companies)
order by
id;
Example:
So from this data set:
id companyName
-------------------
1 dogs ltd
2 cats ltd
3 pigs ltd
4 pigs ltd
5 cats ltd
6 cats ltd
7 dogs ltd
8 pigs ltd
I want to select:
id companyName
-------------------
3 pigs ltd
4 pigs ltd
5 cats ltd
6 cats ltd
Update
Every now and again I am taken aback by the quantity and quality of answers I get on SO. This is one of those times. I don't have the level of expertise to judge one answer as being better than another, so I've gone for SqlZim as this was the first working answer I saw. But it's great to see the different approaches. Especially when only an hour ago I was wondering "is this even possible?".
You are very close to what you want:
select id, companyName
from (select c.*,
lag(companyName, 1) over (order by id) prev,
lead(companyName, 1) over (order by id) next
from companies c
) a
where CompanyName in (prev, next)
order by id;
This is a gaps-and-islands style problem, but instead of using two row_number() calls, we subtract row_number() from id in the innermost subquery to get a group key (grp). count() over() then gives the number of rows per grp, and finally we return the rows with cnt > 1.
select id, companyname
from (
select
id
, companyName
, grp
, cnt = count(*) over (partition by companyname, grp)
from (
select *
, grp = id - row_number() over (partition by companyname order by id)
from
companies
) islands
) d
where cnt > 1
order by id
rextester demo: http://rextester.com/ACP73683
returns:
+----+-------------+
| id | companyname |
+----+-------------+
| 3 | pigs ltd |
| 4 | pigs ltd |
| 5 | cats ltd |
| 6 | cats ltd |
+----+-------------+
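To see why id - row_number() works as a group key, the following sketch prints grp for every row and then applies the cnt > 1 filter. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption), so the T-SQL "alias = expression" form is written with "as" instead.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table companies (id integer primary key, companyName text)")
conn.executemany("insert into companies values (?, ?)", [
    (1, "dogs ltd"), (2, "cats ltd"), (3, "pigs ltd"), (4, "pigs ltd"),
    (5, "cats ltd"), (6, "cats ltd"), (7, "dogs ltd"), (8, "pigs ltd"),
])

# Innermost subquery: rows with the same name and consecutive ids share a grp.
islands_sql = """
    select id, companyName,
           id - row_number() over (partition by companyName order by id) as grp
    from companies
"""
islands = conn.execute(islands_sql + " order by id").fetchall()
for row in islands:
    print(row)

# Keep only the islands that contain more than one row.
dupes = conn.execute(f"""
    select id, companyName
    from (select id, companyName,
                 count(*) over (partition by companyName, grp) as cnt
          from ({islands_sql}) islands) d
    where cnt > 1
    order by id
""").fetchall()
print(dupes)
```

Two different companies can end up with the same grp value, which is why the count is partitioned by both companyName and grp.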
One more alternative form, using LEAD() and LAG() (SQL Server 2012 and up):
SELECT id, CompanyName
FROM (
SELECT *,
LEAD(CompanyName, 1) OVER(ORDER BY id) as nc,
LAG(CompanyName, 1) OVER(ORDER BY id) AS pc
FROM #t t
) x
WHERE nc = companyName
OR pc = companyName
Here is the test data, so you can check it out yourself.
CREATE TABLE #T (id int not null PRIMARY KEY, companyName varchar(16) not null)
INSERT INTO #t Values
(1, 'dogs ltd'),
(2, 'cats ltd'),
(3, 'pigs ltd'),
(4, 'pigs ltd'),
(5, 'cats ltd'),
(6, 'cats ltd'),
(7, 'dogs ltd'),
(8, 'pigs ltd')
In the WHERE clause you just need to limit to those where the companyName is the same as the prev or the next
select id, companyName
from (
select id, companyName,
lag(companyName, 1) over (order by id) as prev,
lead(companyName, 1) over (order by id) as next
from companies
) q
where companyName in (prev, next)
order by id;
To also make sure that the ids are really consecutive, without gaps, you can do it like this:
select id, companyName
from (
select id, companyName,
lag(concat(id+1,companyName), 1) over (order by id) as prev,
lead(concat(id-1,companyName), 1) over (order by id) as next
from companies
) q
where concat(id,companyName) in (prev, next)
order by id;
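The difference between the two queries only shows up when ids are missing. The sketch below uses hypothetical rows with a gap (id 2 does not exist) and Python's built-in sqlite3 module as a stand-in for SQL Server. Instead of concat(), it compares the neighbour's id and name separately, which is an alternative formulation of the same idea rather than the exact query above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table companies (id integer primary key, companyName text)")
# Hypothetical rows: ids 1 and 3 share a name, but id 2 is missing.
conn.executemany("insert into companies values (?, ?)", [
    (1, "pigs ltd"), (3, "pigs ltd"), (4, "cats ltd"), (5, "cats ltd"),
])

# Previous and next row (by id) for every row.
neighbours = """
    select id, companyName,
           lag(id) over (order by id) as prev_id,
           lag(companyName) over (order by id) as prev_name,
           lead(id) over (order by id) as next_id,
           lead(companyName) over (order by id) as next_name
    from companies
"""

# Name-only comparison: adjacent rows match even across an id gap.
plain = [r[0] for r in conn.execute(f"""
    select id from ({neighbours}) q
    where companyName in (prev_name, next_name)
    order by id""")]

# Gap-aware comparison: the neighbour must also have id - 1 or id + 1.
strict = [r[0] for r in conn.execute(f"""
    select id from ({neighbours}) q
    where (prev_name = companyName and prev_id = id - 1)
       or (next_name = companyName and next_id = id + 1)
    order by id""")]

print(plain, strict)
```

With these rows the name-only version also returns ids 1 and 3, while the gap-aware version keeps only ids 4 and 5.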
You can use Row_Number() to get the repeated company names based on the partition by clause (note that this flags every repeat of a name, whether or not the ids are consecutive):
;with cte as (
SELECT id, companyName,
RowN = Row_Number() over (partition by companyName order by id) from #yourTable
)
Select * from cte where RowN > 1
Can you provide your input and expected output so this query can be verified?

SQL group by data with row separate

I would like to group by Customer & Date and generate count columns for 2 separate values (Flag=Y and Flag=N). Input table looks like this:
Customer Date Flag
------- ------- -----
001 201201 Y
001 201202 Y
001 201203 Y
001 201204 N
001 201205 N
001 201206 Y
001 201207 Y
001 201208 Y
001 201209 N
002 201201 N
002 201202 Y
002 201203 Y
002 201205 N
The output should look like this:
Customer MinDate MaxDate Count_Y
------- ------ ------- -------
001 201201 201203 3
001 201206 201208 3
002 201202 201203 2
How can I write the SQL query? Any kind of help is appreciated! Thanks!
You want to find consecutive values of "Y". This is a "gaps-and-islands" problem, and there are two basic approaches:
Determine the first "Y" in each group and use this information to define a group of consecutive "Y" values.
Use the difference of row_number() values for the calculation.
The first depends on SQL Server 2012+ and you haven't specified the version. So, the second looks like this:
select customer, min(date) as mindate, max(date) as maxdate,
count(*) as numYs
from (select t.*,
row_number() over (partition by customer order by date) as seqnum_cd,
row_number() over (partition by customer, flag order by date) as seqnum_cfd
from t
) t
where flag = 'Y'
group by customer, (seqnum_cd - seqnum_cfd), flag;
It is a little tricky to explain how this works. In my experience, though, if you run the subquery, you will see how the seqnum columns are calculated and "get it" by observing the results.
Note: This assumes that there is at most one record per day. If there are more, you can use dense_rank() instead of row_number() for the same effect.
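As a rough illustration of "running the subquery", the sketch below prints both seqnum columns and their difference for customer 001, then runs the full query. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption), stores the dates as text, and uses s as the derived-table alias.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table t (customer text, date text, flag text)")
conn.executemany("insert into t values (?, ?, ?)", [
    ("001", "201201", "Y"), ("001", "201202", "Y"), ("001", "201203", "Y"),
    ("001", "201204", "N"), ("001", "201205", "N"), ("001", "201206", "Y"),
    ("001", "201207", "Y"), ("001", "201208", "Y"), ("001", "201209", "N"),
    ("002", "201201", "N"), ("002", "201202", "Y"), ("002", "201203", "Y"),
    ("002", "201205", "N"),
])

# The subquery from the answer: two row numbers per row.
subquery = """
    select t.*,
           row_number() over (partition by customer order by date) as seqnum_cd,
           row_number() over (partition by customer, flag order by date) as seqnum_cfd
    from t
"""

# The difference stays constant within each run of identical flags.
diffs_001 = []
for date, flag, cd, cfd in conn.execute(f"""
        select date, flag, seqnum_cd, seqnum_cfd from ({subquery}) s
        where customer = '001' order by date"""):
    print(date, flag, cd, cfd, cd - cfd)
    diffs_001.append(cd - cfd)

# Full query: one output row per run of consecutive 'Y' values.
islands = conn.execute(f"""
    select customer, min(date) as mindate, max(date) as maxdate,
           count(*) as numYs
    from ({subquery}) s
    where flag = 'Y'
    group by customer, (seqnum_cd - seqnum_cfd), flag
    order by customer, mindate
""").fetchall()
print(islands)
```

Within a customer, each run of consecutive 'Y' rows gets its own value of seqnum_cd - seqnum_cfd, so grouping by that difference separates the runs.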
Try the query below, which should give you exactly what you want.
DROP TABLE [GroupCustomer]
GO
CREATE TABLE [dbo].[GroupCustomer](
Customer VARCHAR(50),
[Date] [datetime] NULL,
Flag VARCHAR(1)
)
INSERT INTO [dbo].[GroupCustomer] (Customer ,[Date],Flag)
VALUES ('001','201201','Y'),('001','201202','Y'),
('001','201203','Y'),('001','201204','N'),
('001','201205','N'),('001','201206','Y'),
('001','201207','Y'),('001','201208','Y'),
('001','201209','N'),('002','201201','N'),
('002','201202','Y'),('002','201203','Y'),
('002','201205','N')
GO
;WITH cte_cnt
AS
(
SELECT Customer,Format(MIN([Date]),'yyMMdd') AS MinDate
,Format(MAX([Date]),'yyMMdd') AS MaxDate
, COUNT('A') AS Count_Y
FROM (
SELECT Customer,Flag,[Date],
ROW_NUMBER() OVER(Partition by customer ORDER BY [Date]) AS ROW_NUMBER,
DATEDIFF(D, ROW_NUMBER() OVER(Partition by customer ORDER BY [Date])
, [Date]) AS Diff
FROM [GroupCustomer]
WHERE Flag='Y') AS dt
GROUP BY Customer,Flag, Diff )
SELECT *
FROM cte_cnt c
ORDER BY Customer
GO

Create table with distinct values based on date

I have a table which fills up with lots of transactions monthly, like below.
Name ID Date OtherColumn
_________________________________________________
John Smith 11111 2012-11-29 Somevalue
John Smith 11111 2012-11-30 Somevalue
Adam Gray 22222 2012-12-11 Somevalue
Tim Blue 33333 2012-12-15 Somevalue
John NewName 11111 2013-01-01 Somevalue
Adam Gray 22222 2013-01-02 Somevalue
From this table I want to create a dimension table with the unique names and IDs. The problem is that a person can change his/her name, like "John" in the example above. The IDs are otherwise always unique. In those cases I want to use only the newest name (the one with the latest date).
So that I end up with a table like this:
Name ID
______________________
John NewName 11111
Adam Gray 22222
Tim Blue 33333
How do I go about achieving this?
Can I do it in a single query?
Use a CTE for this. It simplifies ranking and window functions.
;WITH CTE as
(SELECT
RN = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [Date] DESC),
ID,
Name
FROM
YourTable)
SELECT
Name,
ID
FROM
CTE
WHERE
RN = 1
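A quick way to check the CTE against the sample rows is sketched below, using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption). The dates are stored as ISO text, which sorts chronologically, and the T-SQL "RN = ..." alias form is written with "as".

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table YourTable (Name text, ID integer, Date text)")
conn.executemany("insert into YourTable values (?, ?, ?)", [
    ("John Smith", 11111, "2012-11-29"), ("John Smith", 11111, "2012-11-30"),
    ("Adam Gray", 22222, "2012-12-11"), ("Tim Blue", 33333, "2012-12-15"),
    ("John NewName", 11111, "2013-01-01"), ("Adam Gray", 22222, "2013-01-02"),
])

# RN = 1 marks the most recent row for each ID.
latest = conn.execute("""
    with cte as (
        select row_number() over (partition by ID order by Date desc) as RN,
               ID, Name
        from YourTable
    )
    select Name, ID
    from cte
    where RN = 1
    order by ID
""").fetchall()

print(latest)
```

Because row_number() assigns exactly one row per ID the value 1, each ID appears once in the result even if two rows share the latest date.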
I think creating a table is a bad idea, but this is how you get the most recent name.
select name
from yourtable yt join
(select id, max(date) maxdate
from yourtable
group by id ) temp on temp.id = yt.id and yt.date = maxdate
JNK's CTE solution is equivalent to the following.
SELECT
Name,
ID
FROM (
SELECT
RN = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [Date] DESC),
Name,
ID
FROM theTable
) AS ranked
WHERE RN = 1
I am still trying to think of a way to get rid of the partition function without introducing possible duplicates.

How to select first 2 rows using group's

I have:
Table1
ID date amt
-------------------
001 21/01/2012 1200
001 25/02/2012 1400
001 24/03/2012 1500
001 21/04/2012 1000
002 21/03/2012 1200
002 01/01/2012 0500
002 08/09/2012 1000
.....
I want to select the first two rows from each group of ID ordered by date DESC from Table1.
Query looks like this:
SELECT TOP 2 DATE, ID, AMT FROM TABLE1 GROUP BY ID, AMT --(NOT WORKING)
Expected output:
ID date amt
-------------------
001 21/01/2012 1200
001 25/02/2012 1400
002 21/03/2012 1200
002 01/01/2012 0500
.....
You can take advantage of a Common Table Expression and a window function:
WITH recordList
AS
(
SELECT ID, DATE, Amt,
DENSE_RANK() OVER (PARTITION BY ID ORDER BY DATE ASC) rn
FROM tableName
)
SELECT ID, DATE, Amt
FROM recordList
WHERE rn <= 2
Based on your desired result above, you are ordering the dates in ascending order.
You can use either DENSE_RANK() or ROW_NUMBER(). In my answer I've used DENSE_RANK() because I'm thinking of duplicates. Anyway, it's the OP's choice to use ROW_NUMBER() instead of DENSE_RANK().
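The practical difference between the two functions only appears when a group has tied dates. The sketch below adds a hypothetical second row for 001 on the same date and compares both rankings, using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption) and ISO-formatted text dates so that they sort chronologically.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table tableName (ID text, date text, amt integer)")
# The second row is a hypothetical duplicate date added to show the tie.
conn.executemany("insert into tableName values (?, ?, ?)", [
    ("001", "2012-01-21", 1200), ("001", "2012-01-21", 1300),
    ("001", "2012-02-25", 1400), ("001", "2012-03-24", 1500),
])

rows = conn.execute("""
    select ID, date, amt,
           dense_rank() over (partition by ID order by date) as dr,
           row_number() over (partition by ID order by date, amt) as rn
    from tableName
    order by ID, date, amt
""").fetchall()

# dense_rank gives tied dates the same rank, so "<= 2" can return extra rows.
top2_dense = [r[:3] for r in rows if r[3] <= 2]
# row_number always numbers rows 1, 2, 3, ..., so "<= 2" returns two rows.
top2_rownum = [r[:3] for r in rows if r[4] <= 2]

print(top2_dense)
print(top2_rownum)
```

So DENSE_RANK() answers "rows from the first two distinct dates", while ROW_NUMBER() answers "exactly two rows per ID".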