Renumbering rows in SQL Server

I'm fairly new to SQL Server and have the following question: is it possible to renumber the rows in a column?
For example:
id date name
1 2016-01-02 John
2 2016-01-02 Jack
3 2016-01-02 John
4 2016-01-02 John
5 2016-01-03 Jack
6 2016-01-03 Jack
7 2016-01-04 John
8 2016-01-03 Jack
9 2016-01-02 John
10 2016-01-04 Jack
I would like all the "Johns" to start with id 1 and go on (2, 3, 4, etc.), and all the "Jacks" to take the following numbers once the "Johns" are done (5, 6, 7, etc.). Thanks!

I hope this helps:
declare @t table (id int, [date] date, name varchar(20))
insert into @t
( id, [date], name )
values (1,'2016-01-02','John')
,(2,'2016-01-02','Jack')
,(3,'2016-01-02','John')
,(4,'2016-01-02','John')
,(5,'2016-01-03','Jack')
,(6,'2016-01-03','Jack')
,(7,'2016-01-04','John')
,(8,'2016-01-03','Jack')
,(9,'2016-01-02','John')
,(10,'2016-01-04','Jack')
select
-- name desc sorts 'John' before 'Jack', as the question asks
row_number() over(order by name desc, [date]) as ID,
[date],
name
from
@t
order by ID

The id should just be an internal identifier you use for joins etc - I wouldn't change it. But you could query such a numbering using a window function:
SELECT ROW_NUMBER() OVER (ORDER BY CASE name WHEN 'John' THEN 1 ELSE 2 END) AS rn,
date,
name
FROM mytable

Instead of renumbering the id column, you can use the ROW_NUMBER window function to number the rows in a query. Note that PARTITION BY name restarts the numbering at 1 for each name. For example:
SELECT ROW_NUMBER() OVER(PARTITION BY name ORDER BY date) as rowid,date,name
FROM tablename
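If the new numbers really do have to be stored rather than just queried, one option is to update through a CTE. This is only a sketch: it assumes the table is called mytable (the placeholder name used in an answer above), that id is a plain int column rather than an IDENTITY column (which cannot be updated), and that nothing else references the old id values. The [date] and id tie-breakers are my own choice, to make the order deterministic.

```sql
-- Sketch only: mytable is a placeholder name, and id is assumed to be
-- an ordinary int column (not IDENTITY, not referenced by foreign keys).
WITH numbered AS (
    SELECT id,
           ROW_NUMBER() OVER (
               ORDER BY CASE name WHEN 'John' THEN 1 ELSE 2 END,  -- Johns first
                        [date],                                   -- then by date
                        id                                        -- tie-breaker
           ) AS rn
    FROM mytable
)
UPDATE numbered
SET id = rn;
```

Updating the CTE updates the underlying table, so every row gets its new number in a single statement.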

Related

How to apply randomly selected values to distinct dates in SQL Server

I have a table showing available dates for some staff, with two fields, staffid and date, containing information that looks like this:
staffid date
1 2016-01-01
1 2016-01-02
1 2016-01-03
2 2016-01-03
3 2016-01-01
3 2016-01-03
I need to generate a list of DISTINCT available dates from this table, where the staff member assigned to each date is selected randomly. I know how to select rows based on one distinct field (see for example the answer here), but this will always select the rows based on a given order in the table (so, for example, staff 1 for January 1), while I need the selection to be random, so that sometimes staff 1 is selected as the distinct row and sometimes staff 3.
The result needs to be ordered by date.
Try this:
-- test data
create table your_table (staffid int, [date] date);
insert into your_table values
(1, '2016-01-01'),
(1, '2016-01-02'),
(1, '2016-01-03'),
(2, '2016-01-03'),
(3, '2016-01-01'),
(3, '2016-01-03');
-- query
select *
from (
select distinct [date] [distinct_date] from your_table
) as d
outer apply (
select top 1 staffid
from your_table
where d.[distinct_date] = [date]
order by newid() -- random pick among the staff available on that date
) as x
order by d.[distinct_date]
-- result 1
distinct_date staffid
-----------------------
2016-01-01 3
2016-01-02 1
2016-01-03 1
-- result 2
distinct_date staffid
-----------------------
2016-01-01 1
2016-01-02 1
2016-01-03 2
hope it helps :)
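The same idea can also be written with ROW_NUMBER() instead of OUTER APPLY. A minimal sketch against the your_table test data above: number the staff within each date in random order, then keep the first one. The CTE name picked and the alias rn are just names I made up.

```sql
-- Number the rows within each date in random order (NEWID() gives a
-- fresh ordering on every run), then keep one row per date.
WITH picked AS (
    SELECT staffid,
           [date],
           ROW_NUMBER() OVER (PARTITION BY [date] ORDER BY NEWID()) AS rn
    FROM your_table
)
SELECT [date], staffid
FROM picked
WHERE rn = 1
ORDER BY [date];
```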

countif type function in SQL where total count could be retrieved in other column

I have 36 columns in a table, but one of the columns contains the same value multiple times, as below:
ID Name Ref
abcd john doe 123
1234 martina 100
123x brittany 123
ab12 joe 101
and I want results like:
ID Name Ref cnt
abcd john doe 123 2
1234 martina 100 1
123x brittany 123 2
ab12 joe 101 1
As 123 appears twice, I want the cnt column to show 2 for it, and so on.
select ID, Name, Ref, (select count(ID) from [table] where Ref = A.Ref) as cnt
from [table] A
Edit:
As mentioned in comments below, this approach may not be the most efficient in all cases, but should be sufficient on reasonably small tables.
In my testing:
a table of 5,460 records and 976 distinct 'Ref' values returned in less than 1 second.
a table of 600,831 records and 8,335 distinct 'Ref' values returned in 6 seconds.
a table of 845,218 records and 15,147 distinct 'Ref' values returned in 13 seconds.
You should state which SQL product you are using, so that its capabilities are known:
1) If your DB supports window functions:
Select
*,
count(*) over ( partition by ref ) as cnt
from your_table
2) If not:
Select
T.*, G.cnt
from
( select * from your_table ) T inner join
( select ref, count(*) as cnt from your_table group by ref ) G
on T.ref = G.ref
You can use COUNT with OVER as follows:
QUERY
select ID,
Name,
ref,
count(ref) over (partition by ref) cnt
from #t t
SAMPLE DATA
create table #t
(
ID NVARCHAR(400),
Name NVARCHAR(400),
Ref INT
)
insert into #t values
('abcd','john doe', 123),
('1234','martina', 100),
('123x','brittany', 123),
('ab12','joe', 101)

SQL Select the Columnid with a max column group by one column

I think this question has already been answered, but the existing answers didn't solve my problem.
I'd like to select the ID of each name's row with the latest (MAX) date in my table. Grouping by the Name column and taking the latest Date, I must get the ID, Name, and Date.
Here is my table
ID Name Date
---------------------------------------
1 Brent 2012-02-17
2 Ash 2012-08-02
3 Brent 2012-08-15
4 Harold 2012-09-30
5 Margaret 2012-10-10
6 Ash 2012-12-01
7 Harold 2013-02-14
8 Ash 2012-01-01
9 Brent 2013-05-11
Output must be:
ID Name Date
---------------------------------------
5 Margaret 2012-10-10
6 Ash 2012-12-01
7 Harold 2013-02-14
9 Brent 2013-05-11
I tried this statement:
SELECT
[ID], [Name], MAX([Date]) as [Date]
FROM
[SampleTable]
GROUP BY
[Name]
But I get this error:
Column 'ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
You can use a window function such as ROW_NUMBER():
SELECT a.ID, a.Name, a.Date
FROM
(
SELECT ID, Name, Date,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC) rn
FROM TableName
) a
WHERE a.rn = 1
If ID and Name are the same for every group, you can simply group by both columns:
GROUP BY ID, Name
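Another way to get the ID along with the latest date, sketched here against the SampleTable from the question, is to compute MAX([Date]) per name in a derived table and join back to it. The aliases t, m and MaxDate are my own. Unlike ROW_NUMBER(), this returns more than one row for a name if its latest date occurs twice.

```sql
SELECT t.[ID], t.[Name], t.[Date]
FROM [SampleTable] AS t
JOIN (
    -- latest date per name
    SELECT [Name], MAX([Date]) AS MaxDate
    FROM [SampleTable]
    GROUP BY [Name]
) AS m
    ON  m.[Name]  = t.[Name]
    AND m.MaxDate = t.[Date]
ORDER BY t.[Date];
```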

Total number of days for a task before going on to the next one, grouped by person

I am trying to figure out how to show how many days have been worked on a certain task by using the dates in between each “task login” for each person. I think this can be done with one query? I'm open to suggestions and/or ideas.
The Table:
--------+-----------+----------
Person | TaskLogin | Date
--------+-----------+----------
Jane | A | 2013-01-01
Jane | B | 2013-01-03
Jane | A | 2013-01-06
Jane | B | 2013-01-10
Bob | A | 2013-01-01
Bob | A | 2013-01-06
---------------------------------------------------------------------
Row 1: Jane starts task A starting 2013-01-01 and works on it until starting Task B on 2013-01-03 = 2 days worked on Task A
Row 2: Jane starts on task B starting 2013-01-03 and works on it until starting task A on 2013-01-06 = 3 days worked on Task B
Row 3: Jane starts on task A starting 2013-01-06 and works on it until starting task B on 2013-01-10 = 4 days worked on Task A
Row 4: Skip because that is the highest date for Jane (Jane may or may not finish task B 2013-01-10 but we will not count it)
Row 5: Bob starts task A starting on 2013-01-01 and works on it until continuing to work on task A by logging it again on 2013-01-06 = 5 days worked on task A
Row 6: Skip because that is the highest date for Bob
A = 11 days because 2 + 4 + 5
B = 3 days because of Row 2
The output:
------+---------------------
Tasks | Time between Tasks
------+---------------------
A | 11 days
B | 3 days
EDIT:
The solutions of Nicarus and Gordon Linoff (specifically the first, pre-2013 solution, with my edits in the comments) work. Note that (select distinct * from table t) t can be substituted for table in Gordon Linoff's solution to handle the case of someone logging in twice on the same day.
What you are looking for is the lead() function. This is only available in SQL Server 2012 and later. Before that, the easiest way is a correlated subquery:
select TaskLogin, sum(datediff(day, date, nextdate)) as days
from (select t.*,
(select top 1 t2.date
from [table] t2
where t2.person = t.person
and t2.date > t.date -- only later logins for the same person
order by t2.date -- the earliest of those is the next login
) as nextdate
from [table] t
) t
where nextdate is not null
group by TaskLogin;
In SQL Server 2012 or later, it would be:
select TaskLogin, sum(datediff(day, date, nextdate)) as days
from (select t.*, lead(date) over (partition by person order by date) as nextdate
from [table] t
) t
where nextdate is not null
group by TaskLogin;
Maybe not the most elegant way, but it certainly works:
-- Setup table/insert values --
IF OBJECT_ID('TempDB.dbo.#TaskAccounting') IS NOT NULL BEGIN
DROP TABLE #TaskAccounting
END
CREATE TABLE #TaskAccounting
(
Person VARCHAR(4) NOT NULL,
TaskLogin CHAR(1) NOT NULL,
TaskDate DATETIME NOT NULL
)
INSERT INTO #TaskAccounting
VALUES ('Jane','A','2013-01-01')
INSERT INTO #TaskAccounting
VALUES ('Jane','B','2013-01-03')
INSERT INTO #TaskAccounting
VALUES ('Jane','A','2013-01-06')
INSERT INTO #TaskAccounting
VALUES ('Jane','B','2013-01-10')
INSERT INTO #TaskAccounting
VALUES ('Bob','A','2013-01-01')
INSERT INTO #TaskAccounting
VALUES ('Bob','A','2013-01-06');
-- Use a CTE to add sequence and join on it --
WITH Tasks AS (
SELECT
Person,
TaskLogin,
TaskDate,
ROW_NUMBER() OVER(PARTITION BY Person ORDER BY TaskDate) AS Sequence
FROM
#TaskAccounting
)
SELECT
a.TaskLogin AS Tasks,
CAST(SUM(DATEDIFF(DD,a.TaskDate,b.TaskDate)) AS VARCHAR) + ' days' AS TimeBetweenTasks
FROM
Tasks a
JOIN
Tasks b
ON (a.Person = b.Person)
AND (a.Sequence = b.Sequence - 1)
GROUP BY
a.TaskLogin

SQL Server select first instance of ranked data

I have a query that creates a result set like this:
Rank Name
1 Fred
1 John
2 Mary
2 Fred
2 Betty
3 John
4 Betty
4 Frank
I need to then select the lowest rank for each name, e.g.:
Rank Name
1 Fred
1 John
2 Mary
2 Betty
4 Frank
Can this be done within TSQL?
SELECT MIN(Rank) AS Rank, Name
FROM TableName
GROUP BY Name
yes
select name, min(rank)
from nameTable
group by name
As Paul + Kevin have pointed out, simple cases of returning a value from an aggregate can be extracted using MIN / MAX etc (just note that RANK is a reserved word)
In a more general / complicated case, e.g. where you need to find the second / Nth highest rank, you can use PARTITIONs with ROW_NUMBER() to do ranking and then filter by the rank.
SELECT [Rank], [Name]
FROM
(
SELECT [RANK], [Name],
ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY [Rank]) as [RowRank]
FROM [MyTable]
) AS [MyTableReRanked]
WHERE [RowRank] = @N
ORDER BY [Rank];
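For completeness, a self-contained sketch of the query above with N declared as a variable. The table variable @Ranked is a made-up name holding the sample data from the question; with @N = 1 the query picks the lowest rank for each name.

```sql
-- Sample data from the question, held in a table variable
DECLARE @Ranked TABLE ([Rank] int, [Name] varchar(20));
INSERT INTO @Ranked ([Rank], [Name])
VALUES (1, 'Fred'), (1, 'John'), (2, 'Mary'), (2, 'Fred'),
       (2, 'Betty'), (3, 'John'), (4, 'Betty'), (4, 'Frank');

DECLARE @N int = 1;  -- 1 = lowest rank per name, 2 = second lowest, ...

SELECT [Rank], [Name]
FROM (
    SELECT [Rank], [Name],
           ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY [Rank]) AS [RowRank]
    FROM @Ranked
) AS r
WHERE [RowRank] = @N
ORDER BY [Rank];
```

Names that have fewer than @N rows simply drop out of the result.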