Select 10 rows with lowest time difference in SQL

I'm using SQL Server 2005.
Hi, I have a Users table with userID and registrationDate. I want to find the shortest period of time between two registrationDates, where the first date belongs to row x and the second to row x+10. I don't mind using a cursor, because I will only run this query once in a while.
To explain again: I need the shortest period of time spanning 10 user registrations, to get an idea of what the upper bound on registrations per unit of time can be.
thanks

Try this query if you are using SQL Server 2005 or newer:
WITH T1 AS (
SELECT
userID,
registrationDate,
ROW_NUMBER() OVER (ORDER BY registrationDate) AS rn
FROM Users
), T3 AS (
SELECT
T2.registrationDate AS interval_start,
T1.registrationDate AS interval_end,
T1.registrationDate - T2.registrationDate AS diff
FROM T1
JOIN T1 T2
ON T1.rn = T2.rn + 10
)
SELECT TOP 1 interval_start, interval_end
FROM T3
ORDER BY diff
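For anyone who wants to experiment with the row-offset self-join without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in (it assumes SQLite 3.25 or newer for window functions). The sample timestamps, the integer-seconds representation, and the GAP of 3 are made up for illustration; the question would use 10, and SQL Server would use TOP 1 rather than LIMIT 1.

```python
import sqlite3

# Hypothetical sample data: registration times as whole seconds.
times = [0, 50, 60, 65, 70, 200, 210, 400]
GAP = 3  # compare row x with row x+3 here; the question asks for x+10

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Users (userID INTEGER PRIMARY KEY, registrationDate INTEGER)"
)
conn.executemany(
    "INSERT INTO Users (registrationDate) VALUES (?)", [(t,) for t in times]
)

# Number the rows by date, then join each row to the row GAP positions later.
query = """
WITH T1 AS (
    SELECT registrationDate,
           ROW_NUMBER() OVER (ORDER BY registrationDate) AS rn
    FROM Users
)
SELECT early.registrationDate AS interval_start,
       late.registrationDate  AS interval_end,
       late.registrationDate - early.registrationDate AS diff
FROM T1 AS late
JOIN T1 AS early ON late.rn = early.rn + ?
ORDER BY diff
LIMIT 1
"""
shortest = conn.execute(query, (GAP,)).fetchone()
print(shortest)  # (interval_start, interval_end, diff)
```

With this sample the tightest window is the one from 50 to 70, a span of 20 seconds.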

Related

How do you call previous row in a where clause?

I am trying to figure out how to get rid of results that occur close together. For example the rows have a create timestamp (source_time). I want to remove results that occur within 10 seconds of each other.
I thought lag() might do it, but I can't use that in the where clause.
select *
from table
where source_time - previous(source_time) >= 10 second
Very rough code, but I am not sure how to call the previous source time. I have translated them to timestamps and used timestamp_diff(source_time, x, second) >= 10 but not sure how to make x the previous value.
Hopefully this is clear.
You can do this with subqueries.
delete table t1
where t1.id in (
select t2.id
from (
select
id,
source_time - lag(source_time) over (order by source_time) as time_diff
from table
) t2
where t2.time_diff < 10 second
)
Keep in mind that this can leave large gaps in your records. For example, if you get a row every 9 seconds for an hour, you'll delete all but the first record in that hour.
You might instead partition source_time into 10-second buckets and delete every row with a row_number > 1.
delete table t1
where t1.id in (
select t2.id
from (
select
id,
source_time,
row_number() over(
partition by source_time - make_interval(second => extract(second from source_time) % 10)
order by source_time asc
) rownum
from table
) t2
where rownum > 1
)
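To see how the two approaches differ, here is a small sketch that runs both as SELECTs (keeping rows rather than deleting them) against SQLite through Python's sqlite3 module. This is only a stand-in for the original database: the table name events, the integer-second timestamps, and the sample rows are assumptions, and integer division by 10 replaces the interval arithmetic used for bucketing above.

```python
import sqlite3

# Hypothetical rows: (id, source_time in whole seconds).
rows = [(1, 0), (2, 8), (3, 12), (4, 25), (5, 29), (6, 31)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, source_time INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

# LAG() cannot appear in WHERE, so compute it in a subquery and filter outside.
lag_query = """
SELECT id FROM (
    SELECT id,
           source_time - LAG(source_time) OVER (ORDER BY source_time) AS time_diff
    FROM events
)
WHERE time_diff IS NULL OR time_diff >= 10
ORDER BY id
"""
kept_by_lag = [r[0] for r in conn.execute(lag_query)]

# Alternative: keep the first row of every fixed 10-second bucket.
bucket_query = """
SELECT id FROM (
    SELECT id,
           ROW_NUMBER() OVER (
               PARTITION BY source_time / 10   -- integer division gives the bucket
               ORDER BY source_time
           ) AS rownum
    FROM events
)
WHERE rownum = 1
ORDER BY id
"""
kept_by_bucket = [r[0] for r in conn.execute(bucket_query)]

print(kept_by_lag, kept_by_bucket)
```

On this sample the lag filter keeps ids 1 and 4, while bucketing keeps 1, 3, 4 and 6. Note that bucketing keeps one row per fixed window, so two kept rows can still be less than 10 seconds apart (25 and 31 here).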

SQL Server - Count Rows based on How Column values have changed across rows

I have a table in SQL Server DB like this:
TBL_EMPLOYEE
Now I need to Pick up the first (min) and last (max) RecordID for Each Employee and do the following:
If the first record is USA and Last is Canada : then flag the employee as "USA to Canada".
If the last record is USA and first is Canada : then flag the employee as "Canada to USA".
My Ultimate goal is to produce the below table - which will show the number of employees that have moved across the two countries.
TBL_MIGRATION
How can I achieve this in SQL Server ?
It sounds like you want the first and last rows for each employee. Then you can track the overall movement:
select first_workfrom, last_workfrom, count(distinct employee)
from (select t.*,
first_value(workfrom) over (partition by employee order by recordid) as first_workfrom,
first_value(workfrom) over (partition by employee order by recordid desc) as last_workfrom
from t
) t
group by first_workfrom, last_workfrom
having first_workfrom <> last_workfrom;
FIRST_VALUE() and LAST_VALUE() are available in SQL Server 2012 and later; if you are running an older version, you can use APPLY instead:
select movement, count(*)
from (select distinct t.employee,
concat(t1.workfrom, ' to ', t11.workfrom) as movement
from table t cross apply
( select top (1) t1.*
from table t1
where t1.employee = t.employee
order by t1.id
) t1 cross apply
( select top (1) t11.*
from table t11
where t11.employee = t.employee
order by t11.id desc
) t11
where t1.workfrom <> t11.workfrom
) t
group by movement;
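As a quick way to check the first/last logic, here is a sketch that runs the window-function variant on SQLite through Python's sqlite3 module (a stand-in for SQL Server, assuming SQLite 3.25+). The sample rows are invented, the column names follow the answer above, and string concatenation uses SQLite's || operator instead of concat(). It counts distinct employees, because the window functions return one row per record rather than one per employee.

```python
import sqlite3

# Hypothetical sample of the employee table: (recordid, employee, workfrom).
rows = [
    (1, "A", "USA"), (2, "A", "Canada"),
    (3, "B", "Canada"), (4, "B", "USA"),
    (5, "C", "USA"), (6, "C", "Canada"),
    (7, "D", "USA"), (8, "D", "USA"),  # D never moved
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (recordid INTEGER PRIMARY KEY, employee TEXT, workfrom TEXT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# First record per employee: ascending order; last record: descending order.
query = """
SELECT first_workfrom || ' to ' || last_workfrom AS movement,
       COUNT(DISTINCT employee) AS employees
FROM (
    SELECT employee,
           FIRST_VALUE(workfrom) OVER (
               PARTITION BY employee ORDER BY recordid
           ) AS first_workfrom,
           FIRST_VALUE(workfrom) OVER (
               PARTITION BY employee ORDER BY recordid DESC
           ) AS last_workfrom
    FROM t
)
WHERE first_workfrom <> last_workfrom
GROUP BY movement
ORDER BY movement
"""
migration = conn.execute(query).fetchall()
print(migration)
```

Here employees A and C count as "USA to Canada", B as "Canada to USA", and D is filtered out.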

Access SQL Query Top 1 percent group by winsorizing

I need to find the 99th and 1st percentiles of a variable at each date. So far I have managed to do so only for the overall period. I would like to "loop" the following query (which does work, and is basic winsorizing) over each date, like a simple GROUP BY, but GROUP BY does not work with TOP PERCENT.
SELECT Date,ID,Value,
IIf(Value>[upper_threshold],[upper_threshold],IIf(Value<[lower_threshold],
[lower_threshold],Value)) AS winsor_Value
FROM MyTable,
(SELECT [lower_threshold], [upper_threshold] FROM (SELECT MAX(Value) AS
lower_threshold FROM (SELECT TOP 1 PERCENT Value FROM MyTable ORDER BY
Value)) AS t1, (SELECT MIN(Value) AS upper_threshold FROM (SELECT TOP 1
PERCENT Value FROM MyTable ORDER BY Value DESC)));
My data looks like
I have 700 000 rows.
Thanks a lot
I am not sure if the following works in MS Access, but it is worth a try. To get the value at the top 99%:
select t.date,
(select min(t2.value)
from (select top 1 percent t2.*
from t as t2
where t2.date = t.date
order by t2.value desc
) as t2
) as percentile_99
from (select distinct date
from t
) as t;
I do not know whether MS Access scoping rules allow a subquery to be correlated more than one level deep. If they do, the same approach should work for the other percentiles as well.
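If the correlated TOP PERCENT subquery turns out not to work, the per-date thresholds can also be expressed with ranking. The sketch below is a different technique from the Access query, not a translation of it: it uses ROW_NUMBER() and COUNT() window functions, which Access SQL does not offer, and runs on SQLite through Python's sqlite3 module purely for illustration. The sample data and the 10 percent cut-off are assumptions chosen so that a tiny table has something to trim, and the rounding at the cut-off may not match what TOP n PERCENT does.

```python
import sqlite3

# Hypothetical data: two dates with twenty values each.
rows = [("2020-01-01", v) for v in range(1, 21)]
rows += [("2020-01-02", v) for v in range(101, 121)]
PCT = 10  # the question uses 1 percent

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Date TEXT, Value INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", rows)

# Rank each value within its date from both ends, then take the boundary
# of the lowest and highest PCT percent of rows as the two thresholds.
query = """
WITH ranked AS (
    SELECT Date, Value,
           ROW_NUMBER() OVER (PARTITION BY Date ORDER BY Value)      AS rn_asc,
           ROW_NUMBER() OVER (PARTITION BY Date ORDER BY Value DESC) AS rn_desc,
           COUNT(*)     OVER (PARTITION BY Date)                     AS n
    FROM MyTable
)
SELECT Date,
       MAX(CASE WHEN rn_asc  * 100 <= n * :pct THEN Value END) AS lower_threshold,
       MIN(CASE WHEN rn_desc * 100 <= n * :pct THEN Value END) AS upper_threshold
FROM ranked
GROUP BY Date
ORDER BY Date
"""
thresholds = conn.execute(query, {"pct": PCT}).fetchall()
print(thresholds)
```

Joining these thresholds back to the table on Date would then allow the same IIf-style clamping as in the question. For dates with too few rows to fill the chosen percentage, both thresholds come back as NULL.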

SELECT field value minus previous field value

I have a select query that gets a CarID, month, mileage and CO2 emission.
Now it gives for each month per car the mileage like this:
month 1: 5000
month 2: 5200
...
What I really need is that it takes the current value minus the previous one. I get data between a certain time frame and I already included a mileage point before that time frame. So it would be possible to get the total miles per month, I just don't know how. What I want is this.
pre timeframe: 5000
month 1: 200
month 2: 150
...
How would I do this?
edit: code, I have not yet tried anything as I have no clue how to start to do this.
resultlist as (
SELECT
CarID
, '01/01/2000' as beginmonth
, MAX(kilometerstand) as Kilometers
, MAX(Co2Emission) as CO2
FROM
totalmileagelist
GROUP BY CarID
UNION
SELECT
CarID
, beginmonth
, MAX(kilometerstand) as Kilometers
, MAX(Co2Emission) as CO2
FROM
resultunionlist
GROUP BY CarID, beginmonth
)
select * from resultlist
order by CarID, beginmonth
Edit2: explanation to the code
In the first part of the result list I grab the latest mileage per car. In the second part, after the union, I grab per month per car the latest mileage.
If you just want to subtract the previous mileage, use the lag() function:
select ml.*,
(kilometerstand - lag(kilometerstand) over (partition by carid order by month)
) as diff
from totalmileagelist ml;
lag() is available in SQL Server 2012+. In earlier versions you can use a correlated subquery or outer apply.
(I missed the version because it is in the title and not on a tag.) In SQL Server 2008:
select ml.*,
(ml.mileage - mlprev.mileage) as diff
from totalmileagelist ml outer apply
(select top 1 ml2.*
from totalmileagelist ml2
where ml2.CarId = ml.CarId and
ml2.month < ml.month
order by ml2.month desc
) mlprev;
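Both variants can be compared side by side with a small sketch on SQLite via Python's sqlite3 module (a stand-in for SQL Server, assuming SQLite 3.25+). The sample rows and the integer month column are assumptions; month 0 plays the role of the reading taken before the time frame, and the fallback uses a correlated scalar subquery with LIMIT 1 where SQL Server would use OUTER APPLY with TOP 1.

```python
import sqlite3

# Hypothetical readings: (CarID, month, kilometerstand).
rows = [(1, 0, 5000), (1, 1, 5200), (1, 2, 5350), (2, 0, 1000), (2, 1, 1400)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE totalmileagelist (CarID INTEGER, month INTEGER, kilometerstand INTEGER)"
)
conn.executemany("INSERT INTO totalmileagelist VALUES (?, ?, ?)", rows)

# Window-function version: current reading minus the previous one per car.
lag_query = """
SELECT CarID, month,
       kilometerstand - LAG(kilometerstand) OVER (
           PARTITION BY CarID ORDER BY month
       ) AS diff
FROM totalmileagelist
ORDER BY CarID, month
"""
with_lag = conn.execute(lag_query).fetchall()

# Fallback without LAG(): look up the latest earlier reading for the same car.
subquery = """
SELECT ml.CarID, ml.month,
       ml.kilometerstand - (
           SELECT prev.kilometerstand
           FROM totalmileagelist AS prev
           WHERE prev.CarID = ml.CarID AND prev.month < ml.month
           ORDER BY prev.month DESC
           LIMIT 1
       ) AS diff
FROM totalmileagelist AS ml
ORDER BY ml.CarID, ml.month
"""
with_subquery = conn.execute(subquery).fetchall()

print(with_lag)
```

The first row of each car has no predecessor, so its diff is NULL (None in Python); wrap the expression in COALESCE if the starting value should be shown there instead.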
Try like this:
SELECT id, yourColumnValue,
COALESCE(
(
SELECT TOP 1 yourColumnValue
FROM table_name t
WHERE t.id> tbl.id
ORDER BY
rowInt
), 0) - yourColumnValue AS diff
FROM table_name tbl
ORDER BY
id
or like this using rank()
select rank() OVER (ORDER BY id) as 'RowId', mileage into temptable
from totalmileagelist
select t1.mileage - t2.mileage from temptable t1, temptable t2
where t1.RowId = t2.RowId + 1
drop table temptable

Select Max(Date) and next highest Max(date) from table

I have a table with approx 100,000 rows per day, each containing the date they were added. However, rows are only added on business days, so no weekends or bank holidays.
I'm trying to find an efficient way to find the Max(BusinessDate) and also the next highest max(BusinessDate). However, when I try the following it takes 20 mins to run (+50Mn rows total).
SELECT
MAX(t1.BusinessDataDate) AS BusinessDataDate
, MAX(t2.BusinessDataDate) AS PreviousDataDate
FROM
cb_account t1
, cb_account t2
WHERE
t2.BusinessDataDate < t1.BusinessDataDate
Just selecting the Max(BusinessDataDate) is instant.
SELECT TOP 2
'cb_account' AS TableName
, BusinessDataDate
FROM cb_account
GROUP BY BusinessDataDate
ORDER BY BusinessDataDate DESC
will give me both the dates, but I really need them in a single row.
SELECT MAX(dates.BusinessDataDate) AS BusinessDataDate,
MIN(dates.BusinessDataDate) AS PreviousDataDate
FROM
(SELECT TOP 2
'cb_account' AS TableName
, BusinessDataDate
FROM cb_account
GROUP BY BusinessDataDate
ORDER BY BusinessDataDate DESC) dates
If I've understood correctly, try this:
WITH T as
(
SELECT MAX(BusinessDataDate) as MaxD
FROM cb_account
)
SELECT MaxD, (SELECT MAX(BusinessDataDate)
FROM cb_account
WHERE BusinessDataDate<T.MaxD)
FROM T
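Both single-row approaches above can be checked against each other on a toy table. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for SQL Server, so LIMIT 2 takes the place of TOP 2; the sample dates and the index name are made up, and the index reflects an assumption that BusinessDataDate is indexed in the real table (which the instant MAX lookup suggests).

```python
import sqlite3

# Hypothetical data: several rows per business day, with a gap on 2024-01-04.
dates = ["2024-01-02"] * 3 + ["2024-01-03"] * 2 + ["2024-01-05"] * 4

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cb_account (BusinessDataDate TEXT)")
conn.execute("CREATE INDEX ix_cb_account_date ON cb_account (BusinessDataDate)")
conn.executemany("INSERT INTO cb_account VALUES (?)", [(d,) for d in dates])

# Variant 1: take the two latest distinct dates, then fold them into one row.
top2_query = """
SELECT MAX(BusinessDataDate) AS BusinessDataDate,
       MIN(BusinessDataDate) AS PreviousDataDate
FROM (
    SELECT DISTINCT BusinessDataDate
    FROM cb_account
    ORDER BY BusinessDataDate DESC
    LIMIT 2
)
"""
via_top2 = conn.execute(top2_query).fetchone()

# Variant 2: find the maximum once, then the largest date strictly below it.
cte_query = """
WITH T AS (
    SELECT MAX(BusinessDataDate) AS MaxD
    FROM cb_account
)
SELECT MaxD,
       (SELECT MAX(BusinessDataDate)
        FROM cb_account
        WHERE BusinessDataDate < T.MaxD) AS PreviousDataDate
FROM T
"""
via_cte = conn.execute(cte_query).fetchone()

print(via_top2, via_cte)
```

Neither variant joins the table to itself, which is what made the original query so slow. If the table holds only one distinct date, the first variant returns that date twice, while the second returns NULL as the previous date.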
Which version of SQL Server are you using? If you are on SQL Server 2012 or later, you can use the LEAD() or LAG() functions:
http://technet.microsoft.com/en-us/library/hh213125.aspx