Order by decimals for counts of same value - sql

EDIT: I have edited this question to make the query simpler:
ReportTracking:
Userid, ReportId, Duration, CreatedDate
Query:
SELECT t.UserId, COUNT(DISTINCT(t.ReportId)) AS ReportsRead
FROM ReportTracking t
WHERE t.Duration >= 30
AND t.CreatedDate > DATEADD(Day, -30, GETDATE())
GROUP BY t.UserId
Sample Result:
UserId ReportsRead
1 22
2 13
3 2
4 2
5 2
What I need to do is assign a number value to ReportsRead. Because three users tie in the ranking (each has read only 2 reports), I need to order them by who read a report most recently. I need to assign each of them a decimal value based on reading order: the person who read a report last would get .1, and the person who read first would get .3.
I'm not quite sure how to achieve this. The key part is that they must have a decimal value that ranks them, and this decimal should be a few decimal places long, as the record sets are rather long. My idea was to use CreatedDate, convert it to a number, and subtract it from a max. But since there are multiple dates (one for each report), I'm not sure how to grab only the latest one and use it with my report count.

I'm not sure why you need to assign decimals...
Just order by ReportsRead desc, max(createdDate) (this should be most recent read for a user in the select).
Also, DISTINCT isn't a function, it's a keyword, so there is no need for the ().
SELECT t.UserId
, COUNT(DISTINCT t.ReportId) AS ReportsRead
FROM ReportTracking t
WHERE t.Duration >= 30
AND t.CreatedDate > DATEADD(Day, -30, GETDATE())
GROUP BY t.UserId
ORDER BY ReportsRead DESC, max(createdDate)
If you need the numbers and plan on displaying them:
WITH CTE AS (
SELECT t.UserId
, COUNT(DISTINCT t.ReportId) AS ReportsRead
, row_number() over (partition by count(Distinct t.reportID) order by max(t.CreatedDate) Asc) RN
FROM ReportTracking t
WHERE t.Duration >= 30
AND t.CreatedDate > DATEADD(Day, -30, GETDATE())
GROUP BY t.UserId)
SELECT *
FROM CTE
ORDER BY ReportsRead DESC, RN

You can rank the rows within each ReportsRead partition, ordering by MAX(CreatedDate), to obtain the tie-breaking rank. See the SQL Server documentation for the RANK function.
Here is an example: http://sqlfiddle.com/#!18/1eefc/11
You may simplify the query by using a CTE to reuse column aliases, but the concept is:
SELECT t.UserId
, COUNT(DISTINCT( t.ReportId )) AS ReportsRead
, CAST(RANK()
OVER(
partition BY COUNT(DISTINCT( t.ReportId ))
ORDER BY MAX(t.createdDate) DESC) AS DECIMAL) / 10 ranking
FROM ReportTracking t
WHERE t.Duration >= 30
AND t.CreatedDate > DATEADD(Day, -30, GETDATE())
GROUP BY t.UserId
ORDER BY ReportsRead DESC
, ranking;
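To make the tie-breaking logic concrete, here is a minimal Python sketch of what the RANK() / 10 query computes. The sample rows and the name rank_users are made up for illustration; it assumes ties on the report count are broken by the most recent read first, as the question asks.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (user_id, report_id, created_date)
rows = [
    (1, "a", datetime(2024, 1, 5)),
    (1, "b", datetime(2024, 1, 6)),
    (1, "c", datetime(2024, 1, 7)),
    (2, "a", datetime(2024, 1, 2)),
    (2, "b", datetime(2024, 1, 3)),
    (3, "a", datetime(2024, 1, 8)),
    (3, "b", datetime(2024, 1, 9)),
    (4, "a", datetime(2024, 1, 1)),
    (4, "a", datetime(2024, 1, 4)),  # same report read twice, counted once
    (4, "c", datetime(2024, 1, 4)),
]

def rank_users(rows):
    """Return (user_id, reports_read, ranking) ordered like the SQL query."""
    reports = defaultdict(set)   # COUNT(DISTINCT ReportId) per user
    last_read = {}               # MAX(CreatedDate) per user
    for user, report, created in rows:
        reports[user].add(report)
        last_read[user] = max(last_read.get(user, created), created)

    # Highest count first; within a count, most recent reader first.
    ordered = sorted(reports, key=lambda u: (len(reports[u]), last_read[u]),
                     reverse=True)

    result = []
    prev_count = None
    for user in ordered:
        count = len(reports[user])
        if count != prev_count:            # new partition: restart the rank
            position, rank, prev_last = 0, 0, None
            prev_count = count
        position += 1
        if last_read[user] != prev_last:   # RANK(): equal dates share a rank
            rank = position
            prev_last = last_read[user]
        result.append((user, count, rank / 10))
    return result

ranking = rank_users(rows)
```

Note that dividing by 10 only keeps the value below 1 for up to nine tied users; with larger tie groups a bigger divisor would be needed.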

Related

SQL server: Get record with date closest to given date

I have a table dbo.studies with a datetime column studydate.
I want to query the database using the datetime variable @givendate to find the record whose studydate is closest to it.
Using:
SELECT TOP 1 *
FROM studies
WHERE studies.studydate < @givendate
ORDER BY studies.studydate DESC
This returns the closest record that is earlier than @givendate, but I need the record closest to @givendate regardless of whether its studydate is earlier or later.
Any thoughts on how to find it?
One method is:
SELECT TOP 1 s.*
FROM studies s
ORDER BY ABS(DATEDIFF(day, s.studydate, @givendate));
This uses DATEDIFF() to get the closest date. Note that this is using day for the difference. If your "dates" have a time component, you might want a different date part.
Note that this will not take advantage of indexes. A faster method (if you have the indexes) is a bit more complicated:
SELECT TOP (1) s.*
FROM ((SELECT TOP 1 s.*
FROM studies s
WHERE s.studydate <= @givendate
ORDER BY s.studydate DESC
) UNION ALL
(SELECT TOP 1 s.*
FROM studies s
WHERE s.studydate > @givendate
ORDER BY s.studydate ASC
)
) s
ORDER BY ABS(DATEDIFF(day, s.studydate, @givendate));
Although this is more complicated, each subquery can use an index on studydate. The final sort would have only two rows, so it should be really fast.
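The idea behind the two-subquery version can be sketched outside SQL as well. Below is a small Python illustration (not SQL Server code) in which a sorted list stands in for the index on studydate; the function name closest is hypothetical. It looks up only the nearest neighbour on each side of the given date and then compares those two candidates.

```python
from bisect import bisect_right
from datetime import datetime

def closest(sorted_dates, given):
    """Return the element of sorted_dates nearest to given, or None if empty."""
    i = bisect_right(sorted_dates, given)       # one "index seek"
    candidates = []
    if i > 0:
        candidates.append(sorted_dates[i - 1])  # latest date <= given
    if i < len(sorted_dates):
        candidates.append(sorted_dates[i])      # earliest date > given
    if not candidates:
        return None
    # The final "sort" only ever sees at most two rows.
    return min(candidates, key=lambda d: abs(d - given))

study_dates = [datetime(2024, 1, 1), datetime(2024, 1, 10), datetime(2024, 1, 20)]
nearest = closest(study_dates, datetime(2024, 1, 12))
```

When both neighbours are equally far away, this sketch returns the earlier one; the SQL version makes no such guarantee unless a tie-breaker is added to the ORDER BY.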
SELECT TOP 1 *
FROM studies
ORDER BY ABS(DATEDIFF(second, @givendate, studies.studydate))
Use the DATEDIFF function in the ORDER BY; wrapped in ABS() it always returns the nearest row.
SELECT TOP 1 *
FROM studies
ORDER BY ABS(DATEDIFF(dd, studies.studydate, @givendate)) ASC
ORDER BY ABS(DATEDIFF(day, YourDate, GETDATE()))
is also a good way to get one distinct row per group when your table has many rows that differ in only one column:
SELECT * FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ColumnName ORDER BY ABS(DATEDIFF(day, YourDate, GetDate()))) RowNumber
FROM [tableName]
)A
WHERE RowNumber = 1

How to get value by a range of dates?

I have a table like so
And With this code I get the 5 latest values for each domainId
;WITH grp AS
(
SELECT DomainId, [Date],Passed, DatabasePerformance,ServerPerformance,
rn = ROW_NUMBER() OVER
(PARTITION BY DomainId ORDER BY [Date] DESC)
FROM dbo.DomainDetailDataHistory H
)
SELECT g.DomainId, g.[Date],g.Passed, g.ServerPerformance, g.DatabasePerformance
FROM grp g
INNER JOIN #Latest T ON T.DomainId = g.DomainId
WHERE rn < 7 AND t.date != g.[Date]
ORDER BY DomainId, [Date] DESC
What I Want
I would like to know how many tickets were sold for each of these 5 latest rows, with the following condition:
Each of these rows comes with its own date.
For each date I want to check how many were sold in the last 15 minutes and how many were sold in the last 30 minutes.
Example:
I get these 5 rows for each domainId
I want to extend the above with two columns, "soldTicketsLast15" and "soldTicketsLast30"
The date column contains all the dates I need, and for each of these dates I want to go back 15 minutes and 30 minutes and get how many tickets were sold.
Example:
SELECT MAX(SoldTickets) FROM DomainDetailDataHistory
WHERE [Date] >= DATEADD(minute, -15, '2016-04-12 12:10:28.2270000')
SELECT MAX(SoldTickets) FROM DomainDetailDataHistory
WHERE [Date] >= DATEADD(minute, -30, '2016-04-12 12:10:28.2270000')
How can I accomplish this?
I'd use OUTER APPLY or CROSS APPLY.
;WITH grp AS
(
SELECT
DomainId, [Date], Passed, DatabasePerformance, ServerPerformance,
rn = ROW_NUMBER() OVER (PARTITION BY DomainId ORDER BY [Date] DESC)
FROM dbo.DomainDetailDataHistory H
)
SELECT
g.DomainId, g.[Date],g.Passed, g.ServerPerformance, g.DatabasePerformance
,A15.SoldTicketsLast15
,A30.SoldTicketsLast30
FROM
grp g
INNER JOIN #Latest T ON T.DomainId = g.DomainId
OUTER APPLY
(
SELECT MAX(H.SoldTickets) - MIN(H.SoldTickets) AS SoldTicketsLast15
FROM DomainDetailDataHistory AS H
WHERE
H.DomainId = g.DomainId AND
H.[Date] >= DATEADD(minute, -15, g.[Date])
) AS A15
OUTER APPLY
(
SELECT MAX(H.SoldTickets) - MIN(H.SoldTickets) AS SoldTicketsLast30
FROM DomainDetailDataHistory AS H
WHERE
H.DomainId = g.DomainId AND
H.[Date] >= DATEADD(minute, -30, g.[Date])
) AS A30
WHERE
rn < 7
AND T.[date] != g.[Date]
ORDER BY DomainId, [Date] DESC;
To make the correlated APPLY queries efficient there should be an appropriate index, like the following:
CREATE NONCLUSTERED INDEX [IX_DomainId_Date] ON [dbo].[DomainDetailDataHistory]
(
[DomainId] ASC,
[Date] ASC
)
INCLUDE ([SoldTickets])
This index may also help to make the main part of your query (grp) efficient.
If I understood your question correctly, you want the tickets sold in the 15 minutes and the 30 minutes leading up to each of your dates (in the Date column). For a single date, the following should work:
SELECT MAX(SoldTickets) FROM DomainDetailDataHistory
WHERE [Date] BETWEEN DATEADD(minute, -15, '2016-04-12 12:10:28.2270000') AND '2016-04-12 12:10:28.2270000'
The BETWEEN operator retrieves rows between two bounds, inclusive, and the lower bound must come first. No GROUP BY is needed here, because MAX over the whole filtered set returns a single row; add one (for example GROUP BY DomainId) only if you want one value per group.
The SQL above gives you the rows between the date and 15 minutes back. You could do something similar for 30 minutes back.

SQL How to group by but with special conditions

I have a SQL Server 2008 table where I have a list of employees with timestamps.
I have a script that groups the dates by employee.
What I need is to group by employee but exclude the timestamps that fall on the same day and are less than 8 hours apart.
Here is a table that explains better:
I created a SQL Fiddle with the table and sample data.
http://sqlfiddle.com/#!3/3b956/1
Any clue?
What you really want is lag(), which is in SQL Server 2012+. With lag(), you would do:
select t.*
from (select t.*, lag(date) over (partition by EmployeeId order by date) as prev_date
from t
) t
where not (cast(prev_date as date) = cast(date as date) and
date <= dateadd(hour, 8, prev_date)
) or
prev_date is null;
In SQL Server 2008, you can do something similar with outer apply:
select t.*
from t outer apply
(select top 1 prev.*
from t prev
where prev.EmployeeId = t.EmployeeId and
prev.date < t.date and
cast(prev.date as date) = cast(t.date as date)
order by prev.date desc
) prev
where prev.date is null or
t.date > dateadd(hour, 8, prev.date);
You may need an order by to maintain the same ordering.
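For anyone who wants to check the filter rule by hand, here is a rough Python sketch of what the lag() query does. The function name keep_punches and the sample data are invented for illustration. Like lag(), it compares each row with the immediately preceding row for the same employee, whether or not that earlier row was itself kept.

```python
from datetime import datetime, timedelta

def keep_punches(punches):
    """punches: list of (employee_id, timestamp). Returns the rows that survive."""
    kept = []
    prev = {}  # employee_id -> previous timestamp, playing the role of LAG()
    for emp, ts in sorted(punches):
        p = prev.get(emp)
        too_close = (
            p is not None
            and p.date() == ts.date()           # same calendar day
            and ts <= p + timedelta(hours=8)    # within 8 hours of previous row
        )
        if not too_close:
            kept.append((emp, ts))
        prev[emp] = ts
    return kept

sample = [
    (1, datetime(2015, 3, 2, 8, 0)),
    (1, datetime(2015, 3, 2, 12, 0)),   # 4 hours after previous: dropped
    (1, datetime(2015, 3, 2, 17, 0)),   # 5 hours after previous: dropped
    (1, datetime(2015, 3, 3, 7, 0)),    # different day: kept
    (2, datetime(2015, 3, 2, 9, 0)),
    (2, datetime(2015, 3, 2, 18, 0)),   # 9 hours after previous: kept
]
kept_rows = keep_punches(sample)
```

If the comparison should instead be against the last kept row, the prev[emp] = ts assignment would move inside the if block; that is a different rule from the one the SQL expresses.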
This should also work, by excluding rows for which there exists a previous row with a difference of less than 8 hours:
select p1.employeeid, count(*) as [count]
from punch p1
where not exists(select * from punch p2
where p2.employeeid = p1.employeeid and p2.id < p1.id and
dateadd(hour, 8, p2.date) > p1.date)
group by p1.employeeid

SQL Server - Select all top of the hour records

I have a large table with records created every second and want to select only those records that were created at the top of each hour for the last 2 months. So we would get 24 selected records for every day over the last 60 days
The table structure is Dateandtime, Value1, Value2, etc
Many Thanks
You could group on the date part (cast(col1 as date)) and the hour part (datepart(hh, col1)). Then pick the minimum date for each hour, and filter on that:
select *
from YourTable yt
join (
select min(dateandtime) as dt
from YourTable
where datediff(day, dateandtime, getdate()) <= 60
group by
cast(dateandtime as date)
, datepart(hh, dateandtime)
) filter
on filter.dt = yt.dateandtime
Alternatively, you can group on a date format that only includes the date and the hour. For example, convert(varchar(13), getdate(), 120) returns 2013-05-11 18.
...
group by
convert(varchar(13), dateandtime, 120)
) filter
...
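As a quick way to try the "group on a date-and-hour string" idea without a SQL Server instance, here is a small sketch using Python's built-in sqlite3 module. It is SQLite, not T-SQL: substr(dateandtime, 1, 13) plays the role of convert(varchar(13), dateandtime, 120), and the table name readings and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (dateandtime TEXT, value1 INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("2013-05-11 18:00:00", 1),
        ("2013-05-11 18:00:01", 2),
        ("2013-05-11 19:00:03", 3),  # no row exactly at 19:00:00
        ("2013-05-11 19:30:00", 4),
        ("2013-05-12 18:00:00", 5),
    ],
)

# Group on the first 13 characters ('YYYY-MM-DD HH'), take the earliest
# timestamp in each group, then join back to fetch the full row.
query = """
SELECT r.dateandtime, r.value1
FROM readings r
JOIN (
    SELECT MIN(dateandtime) AS dt
    FROM readings
    GROUP BY substr(dateandtime, 1, 13)
) f ON f.dt = r.dateandtime
ORDER BY r.dateandtime
"""
first_per_hour = conn.execute(query).fetchall()
```

Because the key includes the date as well as the hour, 18:00 on two different days ends up in two separate groups, and an hour with no row at exactly :00:00 still contributes its earliest row.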
For clarity's sake, I would probably use a two-step, CTE-based approach (this works in SQL Server 2005 and newer - you didn't clearly specify which version of SQL Server you're using, so I'm just hoping you're not on an ancient version like 2000 anymore):
-- define a "base" CTE that truncates "DateAndTime" to the start of its
-- hour (date included) and makes that value accessible under its own name
;WITH BaseCTE AS
(
SELECT
ID, DateAndTime,
Value1, Value2,
HourPart = DATEADD(HOUR, DATEDIFF(HOUR, 0, DateAndTime), 0)
FROM dbo.YourTable
WHERE DateAndTime >= @SomeThresholdDateHere
),
-- define a second CTE which "partitions" the data by this "HourPart",
-- and number all rows for each partition starting at 1. So each "last"
-- event for each hour is the one with the RN = 1 value
HourlyCTE AS
(
SELECT ID, DateAndTime, Value1, Value2,
RN = ROW_NUMBER() OVER(PARTITION BY HourPart ORDER BY DateAndTime DESC)
FROM BaseCTE
)
SELECT *
FROM HourlyCTE
WHERE RN=1
Also: I wasn't sure what exactly you mean by "top of the hour" - the row that's been created right at the beginning of each hour (e.g. at 04:00:00) - or rather the last row created in that hour's time span? If you mean the first one for each hour - then you'd need to change the ORDER BY DateAndTime DESC to ORDER BY DateAndTime ASC
You can use an option with the EXISTS operator:
SELECT *
FROM dbo.tableName t
WHERE t.DateAndTime >= @YourDateCondition
AND EXISTS (
SELECT 1
FROM dbo.tableName t2
WHERE t2.Dateandtime >= DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime), 0)
AND t2.Dateandtime < DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime)+1, 0)
HAVING MAX(t2.Dateandtime) = t.Dateandtime
)
Or an option with the CROSS APPLY operator:
SELECT *
FROM dbo.test83 t CROSS APPLY (
SELECT 1
FROM dbo.test83 t2
WHERE t2.Dateandtime >= DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime), 0)
AND t2.Dateandtime < DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime)+1, 0)
HAVING MAX(t2.Dateandtime) = t.Dateandtime
) o(IsMatch)
WHERE t.DateAndTime >= @YourDateCondition
To improve performance, use this index:
CREATE INDEX x ON dbo.test83(DateAndTime) INCLUDE(Value1, Value2)
Try:
select * from mytable
where datepart(mi, dateandtime)=0 and
datepart(ss, dateandtime)=0 and
datediff(d, dateandtime, getdate()) <=60
You can use window functions for this:
select dateandtime, val1, val2, . . .
from (select t.*,
row_number() over (partition by cast(dateandtime as date), datepart(hour, dateandtime)
order by dateandtime
) as seqnum
from t
) t
where seqnum = 1
The function row_number() assigns a sequential number to each row within the groups defined by the partition clause, in this case each hour of each day. Within each group, it orders by the dateandtime value, so the row closest to the top of the hour gets a value of 1. The outer query just selects this one record for each group.
You may need an additional filter clause to get records in the last 60 days. Use this in the subquery:
where dateandtime >= getdate() - 60
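This helped me get the top of the hour, i.e. any value that ends in ":00:00". Use CONVERT with style 120 so that the string includes the seconds (a plain CAST of a datetime column to VARCHAR uses the default style, which omits them):
WHERE CONVERT(VARCHAR(19), dateandtime, 120) LIKE '%:00:00'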

SQL to determine distinct periods of sequential days of access?

Jeff recently asked this question and got some great answers.
Jeff's problem revolved around finding the users that have had (n) consecutive days where they have logged into a system. Using a database table structure as follows:
Id UserId CreationDate
------ ------ ------------
750997 12 2009-07-07 18:42:20.723
750998 15 2009-07-07 18:42:20.927
751000 19 2009-07-07 18:42:22.283
Read the original question first for clarity and then...
I was intrigued by the problem of determining how many distinct (n)-day periods for a user.
Could one craft a speedy SQL query that could return a list of users and the number of distinct (n)-day periods they have?
EDIT: as per a comment below If someone has 2 consecutive days, then a gap, then 4 consecutive days, then a gap, then 8 consecutive days. It would be 3 "distinct 4 day periods". The 8 day period should count as two back-to-back 4 day periods.
My answer appears to have not appeared...
I'll try again...
Rob Farley's answer to the original question has the handy benefit of including the number of consecutive days.
with numberedrows as
(
select row_number() over (partition by UserID order by CreationDate) - cast(CreationDate-0.5 as int) as TheOffset, CreationDate, UserID
from tablename
)
select min(CreationDate), max(CreationDate), count(*) as NumConsecutiveDays, UserID
from numberedrows
group by UserID, TheOffset
Using integer division, dividing the number of consecutive days by n gives the number of "distinct (n)-day periods" covered by the whole consecutive period. For n = 4:
- 2 / 4 = 0
- 4 / 4 = 1
- 8 / 4 = 2
- 9 / 4 = 2
- etc, etc
So here is my take on Rob's answer for your needs...
(I really LOVE Rob's answer, go read the explanation, it's inspired thinking!)
with
numberedrows (
UserID,
TheOffset
)
as
(
select
UserID,
row_number() over (partition by UserID order by CreationDate)
- DATEDIFF(DAY, 0, CreationDate) as TheOffset
from
tablename
),
ConsecutiveCounts(
UserID,
ConsecutiveDays
)
as
(
select
UserID,
count(*) as ConsecutiveDays
from
numberedrows
group by
UserID,
TheOffset
)
select
UserID,
SUM(ConsecutiveDays / @period_length) AS distinct_n_day_periods
from
ConsecutiveCounts
group by
UserID
The only real difference is that I take Rob's results and then run it through another GROUP BY...
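For readers who want to see why the offset trick works, here is a short Python sketch of the same logic for a single user. It is an illustration rather than a translation of the query; the names run_lengths and count_periods are made up. Within a run of consecutive days, the row number and the day number both increase by 1 per row, so their difference stays constant and can serve as a group key.

```python
from datetime import date, timedelta
from itertools import groupby

def run_lengths(login_days):
    """Lengths of each run of consecutive days, in chronological order."""
    days = sorted(set(login_days))
    # row number minus day number: constant inside a run, changes at each gap
    keyed = [(i - d.toordinal(), d) for i, d in enumerate(days)]
    return [len(list(group)) for _, group in groupby(keyed, key=lambda p: p[0])]

def count_periods(login_days, period_length):
    """Number of distinct period_length-day periods, using integer division."""
    return sum(n // period_length for n in run_lengths(login_days))

def span(start, n):
    """n consecutive dates beginning at start (helper for the sample data)."""
    return [start + timedelta(days=k) for k in range(n)]

# The example from the question: 2 days, a gap, 4 days, a gap, 8 days.
logins = span(date(2009, 7, 1), 2) + span(date(2009, 7, 5), 4) + span(date(2009, 7, 11), 8)
periods = count_periods(logins, 4)
```

With a period length of 4 the three runs contribute 0, 1 and 2 periods, which matches the "3 distinct 4 day periods" described in the question. The sketch deduplicates dates first; the SQL assumes at most one row per user per day.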
So - I'm going to start with my query from the last question, which listed each run of consecutive days. Then I'm going to group that by userid and NumConsecutiveDays, to count how many runs of days there are for those users.
with numberedrows as
(
select row_number() over (partition by UserID order by CreationDate) - cast(CreationDate-0.5 as int) as TheOffset, CreationDate, UserID
from tablename
)
,
runsOfDays as
(
select min(CreationDate) as FirstDay, max(CreationDate) as LastDay, count(*) as NumConsecutiveDays, UserID
from numberedrows
group by UserID, TheOffset
)
select UserID, NumConsecutiveDays, count(*) as NumOfRuns
from runsOfDays
group by UserID, NumConsecutiveDays
;
And of course, if you want to filter this to only consider runs of a certain length, then put "where NumConsecutiveDays >= @days" in the last query.
Now, if you want to count a run of 16 days as three 5-day runs, then each run will count as NumConsecutiveDays / @runlength of these (integer division rounds down). So now instead of just counting how many there are of each, use SUM instead. You could use the query above and use SUM(NumOfRuns * (NumConsecutiveDays / @runlength)), but if you understand the logic, then the query below is a bit easier.
with numberedrows as
(
select row_number() over (partition by UserID order by CreationDate) - cast(CreationDate-0.5 as int) as TheOffset, CreationDate, UserID
from tablename
)
,
runsOfDays as
(
select min(CreationDate) as FirstDay, max(CreationDate) as LastDay, count(*) as NumConsecutiveDays, UserID
from numberedrows
group by UserID, TheOffset
)
select UserID, sum(NumConsecutiveDays / @runlength) as NumOfRuns
from runsOfDays
where NumConsecutiveDays >= @runlength
group by UserID
;
Hope this helps,
Rob
This works quite nicely with the test data I have.
DECLARE @days int
SET @days = 30
SELECT DISTINCT l.UserId, (datediff(d,l.CreationDate, -- Get last date in contiguous range
(
SELECT min(a.CreationDate ) as CreationDate
FROM UserHistory a
LEFT OUTER JOIN UserHistory b
ON a.CreationDate = dateadd(day, -1, b.CreationDate ) AND
a.UserId = b.UserId
WHERE b.CreationDate IS NULL AND
a.CreationDate >= l.CreationDate AND
a.UserId = l.UserId
) )+1)/@days as cnt
INTO #cnttmp
FROM UserHistory l
LEFT OUTER JOIN UserHistory r
ON r.CreationDate = dateadd(day, -1, l.CreationDate ) AND
r.UserId = l.UserId
WHERE r.CreationDate IS NULL
ORDER BY l.UserId
SELECT UserId, sum(cnt)
FROM #cnttmp
GROUP BY UserId
HAVING sum(cnt) > 0