Pick the last date within 1 year of the min date - sql

I am trying to create a new column that returns the last date within 1 year of the first date.
Example:
I have the following dates.
5/6/2011
8/9/2011
3/5/2012
6/8/2012
So the query should pick 3/5/2012 as the last date in this scenario.

One method uses window functions:
select max(dt)
from (select t.*, min(dt) over () as min_dt
from t
) t
where dt < dateadd(year, 1, min_dt);
I think I prefer a correlated subquery, though:
select max(dt)
from t
where dt < (select dateadd(year, 1, min(dt)) from t);

You can pretty much translate your English spec into an sql for this one:
SELECT max(d)
FROM t
WHERE d < (SELECT DATEADD(year, 1, MIN(d)) FROM t)

Assume your column name is dt and your table name is Tbl
SELECT MAX(dt)
FROM Tbl
WHERE dt < (SELECT MIN(dt) + 365 FROM Tbl)

Related

How to use max(Date) and Distinct in Oracle DB. I would like to get data inserted very last within a day

I am looking for a way to fetch Data.
by
latest date In the same day
by UserId
UserId,Value1,Date
1, 2030,2020–09-07 10:58:58
1, 2020,2020–09-07 05:58:28
1, 2050,2020–09-08 19:58:28
2, 3000,2020–09-07 10:58:18
2, 2001,2020–09-06 10:58:55
3, 2400,2020–09-08 10:28:53
4, 2400,2020–09-07 13:28:53
e.g
where Date >= trunc(TO_DATE(’20200907’,’YYYYMMDD’)) and Date < trunc(TO_DATE(’20200908’,’YYYYMMDD’))
Ideal Result
UserId,Value
1,2050
2,3000
4,2400
select UserId, value
What should I use ?
max(Date) ? Distinct userId ? Group by userId?
If value is the only column you want, then you could use keep:
select userid, max(value1) keep(dense_rank last order by dt) value1
from mytable
where dt >= date '2020-09-07' and dt < date '2020-09-08'
group by userid
order by userid
Notes:
this uses the standard date syntax rather than to_date() to build literal dates
dateis a reserved word in Oracle, hence not a good choice for a column name; I renamed it to dt in the query.
If you want more columns in the resultset, then filtering with window functions is more appropriate:
select t.*
from (
select t.*, row_number() over(partition by userid order by dt desc) rn
from mytable t
where dt >= date '2020-09-07' and dt < date '2020-09-08'
) t
where rn = 1

How to replace the loop in MsSQL?

For example
If I want to check in every day last week
select count(ID) from DB where date < "2019/07/01"
select count(ID) from DB where date < "2019/07/02"
select count(ID) from DB where date < "2019/07/03"
...
select count(ID) from DB where date < "2019/07/08"
like
0701 10
0702 15
0703 23
...
0707 45
How to do this without loop and one query?
You can generate the dates using a recursive CTE (or other method) and then run the query:
with dates as (
select convert(date, '2019-07-01') as dte union all
select dateadd(day, 1, dte)
from dates
where dte < '2019-07-08'
)
select d.dte,
(select count(*) from DB where DB.date < d.dte)
from dates d;
More efficient, though, is a cumulative sum:
select db.*
from (select date, count(*) as cnt, sum(count(*)) over (order by date) as running_cnt
from db
group by date
) d
where d.date >= '2019-07-01' and d.date < '2019-07-09';
Are you just counting the number by day?
Something like
SELECT MONTH(date), DAY(date), COUNT(ID)
FROM DB
GROUP BY MONTH(date), DAY(date);
(assuming date is a DATE or DATETIME)
Do it with window Count. range between current row and current row selects exactly this day rows.
select distinct date, count(1) over (order by Date) - count(1) over (order by Date range between current row and current row)
from DB
where date between '2019-07-01' and '2019-07-08';
I assume date column is exactly DATE.

SQL how to write a query that return missing date ranges?

I am trying to figure out how to write a query that looks at certain records and finds missing date ranges between today and 9999-12-31.
My data looks like below:
ID |start_dt |end_dt |prc_or_disc_1
10412 |2018-07-17 00:00:00.000 |2018-07-20 00:00:00.000 |1050.000000
10413 |2018-07-23 00:00:00.000 |2018-07-26 00:00:00.000 |1040.000000
So for this data I would want my query to return:
2018-07-10 | 2018-07-16
2018-07-21 | 2018-07-22
2018-07-27 | 9999-12-31
I'm not really sure where to start. Is this possible?
You can do that using the lag() function in MS SQL (but that is available starting with 2012?).
with myData as
(
select *,
lag(end_dt,1) over (order by start_dt) as lagEnd
from myTable),
myMax as
(
select Max(end_dt) as maxDate from myTable
)
select dateadd(d,1,lagEnd) as StartDate, dateadd(d, -1, start_dt) as EndDate
from myData
where lagEnd is not null and dateadd(d,1,lagEnd) < start_dt
union all
select dateAdd(d,1,maxDate) as StartDate, cast('99991231' as Datetime) as EndDate
from myMax
where maxDate < '99991231';
If lag() is not available in MS SQL 2008, then you can mimic it with row_number() and joining.
select
CASE WHEN DATEDIFF(day, end_dt, ISNULL(LEAD(start_dt) over (order by ID), '99991231')) > 1 then end_dt +1 END as F1,
CASE WHEN DATEDIFF(day, end_dt, ISNULL(LEAD(start_dt) over (order by ID), '99991231')) > 1 then ISNULL(LEAD(start_dt) over (order by ID) - 1, '99991231') END as F2
from t
Working SQLFiddle example is -> Here
FOR 2008 VERSION
SELECT
X.end_dt + 1 as F1,
ISNULL(Y.start_dt-1, '99991231') as F2
FROM t X
LEFT JOIN (
SELECT
*
, (SELECT MAX(ID) FROM t WHERE ID < A.ID) as ID2
FROM t A) Y ON X.ID = Y.ID2
WHERE DATEDIFF(day, X.end_dt, ISNULL(Y.start_dt, '99991231')) > 1
Working SQLFiddle example is -> Here
This should work in 2008, it assumes that ranges in your table do not overlap. It will also eliminate rows where the end_date of the current row is a day before the start date of the next row.
with dtRanges as (
select start_dt, end_dt, row_number() over (order by start_dt) as rownum
from table1
)
select t2.end_dt + 1, coalesce(start_dt_next -1,'99991231')
FROM
( select dr1.start_dt, dr1.end_dt,dr2.start_dt as start_dt_next
from dtRanges dr1
left join dtRanges dr2 on dr2.rownum = dr1.rownum + 1
) t2
where
t2.end_dt + 1 <> coalesce(start_dt_next,'99991231')
http://sqlfiddle.com/#!18/65238/1
SELECT
*
FROM
(
SELECT
end_dt+1 AS start_dt,
LEAD(start_dt-1, 1, '9999-12-31')
OVER (ORDER BY start_dt)
AS end_dt
FROM
yourTable
)
gaps
WHERE
gaps.end_dt >= gaps.start_dt
I would, however, strongly urge you to use end dates that are "exclusive". That is, the range is everything up to but excluding the end_dt.
That way, a range of one day becomes '2018-07-09', '2018-07-10'.
It's really clear that my range is one day long, if you subtract one from the other you get a day.
Also, if you ever change to needing hour granularity or minute granularity you don't need to change your data. It just works. Always. Reliably. Intuitively.
If you search the web you'll find plenty of documentation on why inclusive-start and exclusive-end is a very good idea from a software perspective. (Then, in the query above, you can remove the wonky +1 and -1.)
This solves your case, but provide some sample data if there will ever be overlaps, fringe cases, etc.
Take one day after your end date and 1 day before the next line's start date.
DECLARE # TABLE (ID int, start_dt DATETIME, end_dt DATETIME, prc VARCHAR(100))
INSERT INTO # (id, start_dt, end_dt, prc)
VALUES
(10410, '2018-07-09 00:00:00.00','2018-07-12 00:00:00.000','1025.000000'),
(10412, '2018-07-17 00:00:00.00','2018-07-20 00:00:00.000','1050.000000'),
(10413, '2018-07-23 00:00:00.00','2018-07-26 00:00:00.000','1040.000000')
SELECT DATEADD(DAY, 1, end_dt)
, DATEADD(DAY, -1, LEAD(start_dt, 1, '9999-12-31') OVER(ORDER BY id) )
FROM #
You may want to take a look at this:
http://sqlfiddle.com/#!18/3a224/1
You just have to edit the begin range to today and the end range to 9999-12-31.

SQL Server - Select all top of the hour records

I have a large table with records created every second and want to select only those records that were created at the top of each hour for the last 2 months. So we would get 24 selected records for every day over the last 60 days
The table structure is Dateandtime, Value1, Value2, etc
Many Thanks
You could group by on the date part (cast(col1 as date)) and the hour part (datepart(hh, col1). Then pick the minimum date for each hour, and filter on that:
select *
from YourTable yt
join (
select min(dateandtime) as dt
from YourTable
where datediff(day, dateandtime, getdate()) <= 60
group by
cast(dateandtime as date)
, datepart(hh, dateandtime)
) filter
on filter.dt = yt.dateandtime
Alternatively, you can group on a date format that only includes the date and the hour. For example, convert(varchar(13), getdate(), 120) returns 2013-05-11 18.
...
group by
convert(varchar(13), getdate(), 120)
) filter
...
For clarity's sake, I would probably use a two-step, CTE-based approach (this works in SQL Server 2005 and newer - you didn't clearly specify which version of SQL Server you're using, so I'm just hoping you're not on an ancient version like 2000 anymore):
-- define a "base" CTE to get the hour component of your "DateAndTime"
-- column and make it accessible under its own name
;WITH BaseCTE AS
(
SELECT
ID, DateAndTime,
Value1, Value2,
HourPart = DATEPART(HOUR, DateAndTime)
FROM dbo.YourTable
WHERE DateAndTime >= #SomeThresholdDateHere
),
-- define a second CTE which "partitions" the data by this "HourPart",
-- and number all rows for each partition starting at 1. So each "last"
-- event for each hour is the one with the RN = 1 value
HourlyCTE AS
(
SELECT ID, DateAndTime, Value1, Value2,
RN = ROW_NUMBER() OVER(PARTITION BY HourPart ORDER BY DateAndTime DESC)
FROM BaseCTE
)
SELECT *
FROM HourlyCTE
WHERE RN=1
Also: I wasn't sure what exactly you mean by "top of the hour" - the row that's been created right at the beginning of each hour (e.g. at 04:00:00) - or rather the last row created in that hour's time span? If you mean the first one for each hour - then you'd need to change the ORDER BY DateAndTime DESC to ORDER BY DateAndTime ASC
You can use option with EXISTS operator
SELECT *
FROM dbo.tableName t
WHERE t.DateAndTime >= #YourDateCondition
AND EXISTS (
SELECT 1
FROM dbo.tableName t2
WHERE t2.Dateandtime >= DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime), 0)
AND t2.Dateandtime < DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime)+1, 0)
HAVING MAX(t2.Dateandtime) = t.Dateandtime
)
OR option with CROSS APPLY operator
SELECT *
FROM dbo.test83 t CROSS APPLY (
SELECT 1
FROM dbo.test83 t2
WHERE t2.Dateandtime >= DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime), 0)
AND t2.Dateandtime < DATEADD(HOUR, DATEDIFF(HOUR, 0, t.Dateandtime)+1, 0)
HAVING MAX(t2.Dateandtime) = t.Dateandtime
) o(IsMatch)
WHERE t.DateAndTime >= #YourDateCondition
For improving performance use this index:
CREATE INDEX x ON dbo.test83(DateAndTime) INCLUDE(Value1, Value2)
Try:
select * from mytable
where datepart(mi, dateandtime)=0 and
datepart(ss, dateandtime)=0 and
datediff(d, dateandtime, getdate()) <=60
You can use window functions for this:
select dateandtime, val1, val2, . . .
from (select t.*,
row_number() over (partition by cast(dateandtime as date), hour(dateandtime)
order by dateandtime
) as seqnum
from t
) t
where seqnum = 1
The function row_number() assigns a sequential number to each group defined by the partition clause -- in this case each hour of each day. Within this group, it orders by the dateandtime value, so the one closest to the top of the hour gets a value of 1. The outer query just selects this one record for each group.
You may need an additional filter clause to get records in the last 60 days. Use this in the subquery:
where dateandtime >= getdate() - 60
This helped me get the top of the hour. Anything that ends in ":00:00".
WHERE (CAST(DATETIME as VARCHAR(19))) LIKE '%:00:00'

SQL Get the gaps in dateranges when a list of ranges is provided

I'm currently looking for a SQL solution for the following problem:
SQLFiddle as guidance:
I have a list of not-nullable startdates and nullable enddates. Based on this list I need the total gap time between a given start and enddate.
Based on the SQLFiddle
If I would only have situation 1 in my database the result should be 2 days.
If I would have situation 2 and 3 in my database the result should be 1 day.
I have been pondering this for a couple of days now... any help would be much appreciated!
Regards,
Kyor
Notes: I'm running SQL 2012 ( should any special new features be required )
The best solution will be to create 'Dates' table and start from there, otherwise solution will be unmaintainable. For each date in specified range you can check whether it is covered by ranges in 'dateranges' table and get a count of dates that are not.
Something like this:
SELECT COUNT(*)
FROM
Dates d
WHERE
d.Date BETWEEN #start AND #end
AND NOT EXISTS
(SELECT *
FROM dateranges r
WHERE d.date BETWEEN r.startdate and ISNULL(r.enddate, d.date)
)
CREATE TABLE Dates (
dt DATETIME NOT NULL PRIMARY KEY);
INSERT INTO Dates VALUES('20081204');
INSERT INTO Dates VALUES('20081205');
INSERT INTO Dates VALUES('20090608');
INSERT INTO Dates VALUES('20090609');
-- missing ranges
SELECT DATEADD(DAY, 1, prev) AS start_gap,
DATEADD(DAY, -1, next) AS end_gap,
DATEDIFF(MONTH, DATEADD(DAY, 1, prev),
DATEADD(DAY, -1, next)) AS month_diff
FROM (
SELECT dt AS prev,
(SELECT MIN(dt)
FROM Dates AS B
WHERE B.dt > A.dt) AS next
FROM Dates AS A) AS T
WHERE DATEDIFF(DAY, prev, next) > 1;
-- existing ranges
SELECT MIN(dt) AS start_range,
MAX(dt) AS end_range
FROM (
SELECT dt,
DATEDIFF(DAY, ROW_NUMBER() OVER(ORDER BY dt), dt) AS grp
FROM Dates) AS D
GROUP BY grp;
DROP TABLE Dates;