T-SQL query to get the last value in a list of months

Below is the data I have
date value
30/03/2014 625949
23/03/2014 624549
16/03/2014 623149
09/03/2014 621549
02/03/2014 619749
23/02/2014 617749
16/02/2014 616149
09/02/2014 614549
02/02/2014 612949
19/01/2014 609749
12/01/2014 608149
06/01/2014 606749
If I want only the last value in each month, as in the output below, how can I get it?
date value
30/03/2014 625949
23/02/2014 617749
19/01/2014 609749

;with Cte as(Select tDate,value, C=ROW_NUMBER()
over(PARTITION by convert(varchar(6), tdate, 112) order by tdate desc)
From #Temp )
Select *
from cte
where C=1

SELECT date,value FROM table1 WHERE date in (
-- group by year as well as month so the same month in different years is not merged
SELECT MAX(date) FROM table1 GROUP BY YEAR(date), MONTH(date)
)

Related

Fill missing gaps in data using a date column

I have a temp table that returns this output
PRICE | DATE
1.491500 | 2019-02-01
1.494000 | 2019-02-04
1.486500 | 2019-02-06
I want to fill in the gaps in the data by duplicating the last known record before each gap, based on the date. Is there a way to update the existing temp table, or create a new temp table, with this desired output dynamically:
PRICE | DATE
1.491500 | 2019-02-01
1.491500 | 2019-02-02
1.491500 | 2019-02-03
1.494000 | 2019-02-04
1.494000 | 2019-02-05
1.486500 | 2019-02-06
I am working on SQL Server 2008 R2.
Because SQL Server versions before 2022 do not support IGNORE NULLS in LAG(), this is a bit tricky. I would go for a recursive CTE of the form:
with cte as (
select price, date, dateadd(day, -1, lead(date) over (order by date)) as last_date
from t
union all
select price, dateadd(day, 1, date), last_date
from cte
where date < last_date
)
select price, date
from cte
order by date;
In SQL Server 2008, which does not have lead(), you can replace it with a correlated subquery:
with cte as (
select price, date,
dateadd(day, -1,
(select min(date)
from t t2
where t2.date > t.date
)) as last_date
from t
union all
select price, dateadd(day, 1, date), last_date
from cte
where date < last_date
)
select price, date
from cte
order by date;
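Assuming there is a dates table (if not, you can easily make one), you can do this by left joining the dates table to the existing table. Then assign groups with a running sum that increases at each date found in the existing table. Each group holds exactly one known price, so the max value per group is the value needed to fill in the missing dates. Note that the running sum (an aggregate with ORDER BY in its OVER clause) requires SQL Server 2012 or later.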
select dt,max(price) over(partition by grp) as price
from (select p.price,d.dt,sum(case when p.dt is null then 0 else 1 end) over(order by d.dt) as grp
from dates d
left join prices p on p.dt = d.dt
) t
Making a dates table with a recursive cte. Persist it as needed.
--Generate dates in 2019
with dates(dt) as (select cast('2019-01-01' as date)
union all
select dateadd(day,1,dt)
from dates
where dt < '2019-12-31'
)
select * from dates
option(maxrecursion 0)
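The running sum in the query above needs SQL Server 2012 or later, so on 2008 R2 the same dates-table idea can be paired with a correlated TOP 1 lookup instead. The following is a minimal sketch under assumptions, not a tested solution for the asker's schema: #prices, dt, @start and @end are hypothetical names, and the sample rows are copied from the question.

```sql
-- Hypothetical temp table holding the question's sample rows.
CREATE TABLE #prices (price decimal(10, 6), dt date);
INSERT INTO #prices (price, dt) VALUES
    (1.491500, '2019-02-01'),
    (1.494000, '2019-02-04'),
    (1.486500, '2019-02-06');

-- Bounds of the calendar to generate (assumed variable names).
DECLARE @start date, @end date;
SELECT @start = MIN(dt), @end = MAX(dt) FROM #prices;

WITH dates (dt) AS (
    SELECT @start
    UNION ALL
    SELECT DATEADD(day, 1, dt) FROM dates WHERE dt < @end
)
SELECT -- most recent known price on or before each calendar date
       (SELECT TOP 1 p.price
        FROM #prices AS p
        WHERE p.dt <= d.dt
        ORDER BY p.dt DESC) AS price,
       d.dt
FROM dates AS d
ORDER BY d.dt
OPTION (MAXRECURSION 0);
```

For the sample rows this should return the six rows from 2019-02-01 to 2019-02-06 shown in the question. The lookup runs once per calendar date, so an index on the date column is likely to matter for larger tables.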

Alternate for LAG function in SQL Server 2005 to calculate growth rate

How could I create a column with the growth rate of my database?
I have a database and created another table (only with the date and count field) where I have the date and total number of records in it. I would also like to insert another column with the percentage of my growth:
SELECT CONVERT(DATE, GETDATE()) AS Date, COUNT(*) AS Count
FROM [Database]
My table looks like this:
Date Count Growth Rate
01/01/18 20.000,00
01/02/18 25.000,00 25,00%
01/03/18 40.000,00 60,00%
I use SQL Server 2005, so I cannot use the LAG function. How could I do it?
You can use a correlated subquery to emulate the LAG function, like so:
WITH cte (Date, Count) AS (
SELECT '2018-01-01', 20000.00 UNION
SELECT '2018-01-02', 25000.00 UNION
SELECT '2018-01-03', 40000.00
)
SELECT *, (
SELECT TOP 1 Count
FROM cte AS x
WHERE Date < t.Date
ORDER BY Date DESC
) AS PoorMansLag -- 100 * (Count / PoorMansLag - 1) gives you the result in OP
FROM cte AS t
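The formula mentioned in the comment can be folded into the query itself. This is a sketch only, assuming one row per date and SQL Server 2005 syntax; the `lagged` CTE and the `PrevCount` and `GrowthRate` names are hypothetical, and the sample rows use monthly dates to mirror the table in the question.

```sql
WITH cte ([Date], [Count]) AS (
    -- Hypothetical sample rows; the 'yyyymmdd' literal form does not depend on language settings.
    SELECT CAST('20180101' AS datetime), CAST(20000.00 AS numeric(10, 2)) UNION ALL
    SELECT CAST('20180201' AS datetime), CAST(25000.00 AS numeric(10, 2)) UNION ALL
    SELECT CAST('20180301' AS datetime), CAST(40000.00 AS numeric(10, 2))
),
lagged AS (
    SELECT t.[Date], t.[Count],
           (SELECT TOP 1 x.[Count]
            FROM cte AS x
            WHERE x.[Date] < t.[Date]
            ORDER BY x.[Date] DESC) AS PrevCount  -- previous row's count, NULL for the first row
    FROM cte AS t
)
SELECT [Date], [Count],
       100 * ([Count] / NULLIF(PrevCount, 0) - 1) AS GrowthRate  -- NULLIF avoids a divide-by-zero error
FROM lagged
ORDER BY [Date];
```

The first row has no predecessor, so its GrowthRate is NULL; the other two rows work out to 25 and 60 percent.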
Another possible approach:
CREATE TABLE #LagTable (
[Date] datetime,
[Count] numeric(10, 2)
)
INSERT INTO #LagTable VALUES ('2018-01-01', 20000.00)
INSERT INTO #LagTable VALUES ('2018-01-02', 25000.00)
INSERT INTO #LagTable VALUES ('2018-01-03', 40000.00);
WITH cte AS (
SELECT [Date], [Count], ROW_NUMBER() OVER (ORDER BY [Date]) AS RN
FROM #LagTable
)
SELECT
t1.[Date],
t1.[Count],
(t1.[Count] - t2.[Count]) / NULLIF(t2.[Count], 0) * 100 AS GrowthRate
FROM cte t1
LEFT JOIN cte t2 ON (t1.RN = t2.RN + 1)
Output:
Date Count GrowthRate
2018-01-01 00:00:00.000 20000.00 NULL
2018-02-01 00:00:00.000 25000.00 25.0000000000000
2018-03-01 00:00:00.000 40000.00 60.0000000000000

Day after max date in data

I am loading data into a table. I don't have any info on how frequent or when the source data is loaded, all I know is I need data from the source to run my script.
Here's the issue: if I run max(date) I get the latest date from the source, but I don't know if the data is still loading. I've run into cases where I've only gotten a percentage of the data. Thus, I need the last business day before the max date.
Is there a way to get the second-latest date in the system? I know I can get max(date) - 1, but that gives me literally the previous calendar day, which is not what I need.
For example, if I run the script on Tuesday, max(date) will be Monday, but since weekends are not in the source system, I need to get Friday instead of Monday.
DATE
---------
2017-04-29
2017-04-25
2017-04-21
2017-04-19
2017-04-18
2017-04-15
2017-04-10
max(date) = 2017-04-29
how do I get 2017-04-25?
Depending on your version of SQL Server, you can use a windowing function like row_number:
select [Date]
from
(
select [Date],
rn = row_number() over(order by [Date] desc)
from #yourtable
) d
where rn = 2
Should you have multiple of the same date, you can perform a distinct first:
;with cte as
(
select distinct [date]
from #yourtable
)
select [date]
from
(
select [date],
rn = row_number() over(order by [date] desc)
from cte
) x
where rn = 2;
You can use row_number and take the second row, as below:
select * from ( select *, Rown= row_number() over (order by date desc) from yourtable ) a
where a.RowN = 2
SQL Server 2012 and later support OFFSET ... FETCH:
select date
from tablename
order by date desc
offset 1 row fetch first 1 row only
OFFSET 1 ROW means skip one row (the 2017-04-29 row).
;With cte([DATE])
AS
(
SELECT '2017-04-29' union all
SELECT '2017-04-25' union all
SELECT '2017-04-21' union all
SELECT '2017-04-19' union all
SELECT '2017-04-18' union all
SELECT '2017-04-15' union all
SELECT '2017-04-10'
)
SELECT [DATE] FROM
(
-- number the rows from the latest date down, starting at 0, so Rno = 1 is the second-latest date
SELECT [DATE], ROW_NUMBER() OVER (ORDER BY [DATE] DESC) - 1 AS Rno FROM cte
) Final
WHERE Final.Rno = 1
Output
DATE
-----
2017-04-25
You can also use FIRST_VALUE (SQL Server 2012 and later) with a dynamic date, something like DATEADD(DD, -1, GETDATE()). The example below has the date hard-coded.
SELECT DISTINCT
FIRST_VALUE([date]) OVER(ORDER BY [date] DESC) AS FirstDate
FROM CTE
WHERE [date] < '2017-04-29' -- the latest date in the sample; 2017-04-25 is the first value below it
Another way
DECLARE @T TABLE ([DATE] DATE)
INSERT INTO @T VALUES
('2017-04-29'),
('2017-04-25'),
('2017-04-21'),
('2017-04-19'),
('2017-04-18'),
('2017-04-15'),
('2017-04-10');
SELECT
MAX([DATE]) AS [DATE]
FROM @T
WHERE DATENAME(DW,[DATE]) NOT IN ('Saturday','Sunday')
Another way of doing it, just for the sake of example:
SELECT MIN(A.date)
FROM
(
SELECT DISTINCT TOP 2 date
FROM YourTable AS C
ORDER BY date DESC
) AS A
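A version-independent alternative is to take the maximum of all dates strictly below the overall maximum, which also copes with duplicate dates. The sketch below assumes a temp table named #yourtable with a [Date] column; both names and the duplicated 2017-04-29 row are hypothetical, and the other rows are copied from the question.

```sql
-- Hypothetical temp table with the question's sample dates, plus a duplicate of the latest one.
CREATE TABLE #yourtable ([Date] date);
INSERT INTO #yourtable ([Date]) VALUES
    ('2017-04-29'), ('2017-04-29'), ('2017-04-25'), ('2017-04-21'),
    ('2017-04-19'), ('2017-04-18'), ('2017-04-15'), ('2017-04-10');

-- Latest date strictly below the overall maximum, i.e. the second-latest distinct date.
SELECT MAX([Date]) AS SecondLatest
FROM #yourtable
WHERE [Date] < (SELECT MAX([Date]) FROM #yourtable);
```

With these rows the query returns 2017-04-25. If the table holds only one distinct date, the result is NULL.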

Filling table with datetime's incremented by one second each

I have a column MyDatabase.MyTable.DateCol with a few thousand rows, which I want to fill with datetime values. I want each value to be greater than the previous one by 1 second. How can I do that?
Sample Table
CREATE Table DateTable
(ID INT IDENTITY(1,1),Name NVARCHAR(300), Data Datetime)
GO
Test Data
INSERT INTO DateTable (Name)
VALUES ('John'),('Mark'),('Phil'),('Simon'),('Sam'),('Pete'),('Josh')
GO
Query
;WITH CTE
AS
(
SELECT *, rn = ROW_NUMBER() OVER (ORDER BY ID ASC) FROM DateTable
)
UPDATE CTE
SET Data = DATEADD(SECOND, CTE.rn, GETDATE())
Result Set
SELECT * FROM DateTable
ID Name Data
1 John 2013-11-06 20:34:59.310
2 Mark 2013-11-06 20:35:00.310
3 Phil 2013-11-06 20:35:01.310
4 Simon 2013-11-06 20:35:02.310
5 Sam 2013-11-06 20:35:03.310
6 Pete 2013-11-06 20:35:04.310
7 Josh 2013-11-06 20:35:05.310
Not quite sure of your ordering criteria, but you can use:
MIN(DateCol) OVER()
To get the first date, and
ROW_NUMBER() OVER(ORDER BY DateCol, ID)
To get the number of seconds to add (your ordering criteria may be different). Then combine the two to update a common table expression:
WITH CTE AS
( SELECT ID,
DateCol,
NewDate = DATEADD(SECOND,
ROW_NUMBER() OVER(ORDER BY DateCol, ID),
MIN(DateCol) OVER())
FROM MyDatabase.MyTable
)
UPDATE CTE
SET DateCol = NewDate;
If you have no dates in your column then you can just enter a start date (GETDATE() below):
WITH CTE AS
( SELECT ID,
DateCol,
NewDate = DATEADD(SECOND,
ROW_NUMBER() OVER(ORDER BY DateCol, ID),
GETDATE())
FROM MyDatabase.MyTable
)
UPDATE CTE
SET DateCol = NewDate;

Find the start and end date (set based) in T-SQL

I have the below.
Name Date
A 2011-01-01 01:00:00.000
A 2011-02-01 02:00:00.000
A 2011-03-01 03:00:00.000
B 2011-04-01 04:00:00.000
A 2011-05-01 07:00:00.000
The desired output is
Name StartDate EndDate
-------------------------------------------------------------------
A 2011-01-01 01:00:00.000 2011-04-01 04:00:00.000
B 2011-04-01 04:00:00.000 2011-05-01 07:00:00.000
A 2011-05-01 07:00:00.000 NULL
How can I achieve this using T-SQL in a set-based approach?
The DDL is as follows:
CREATE TABLE #t (PersonName VARCHAR(32), [Date] DATETIME)
INSERT INTO #t VALUES('A', '2011-01-01 01:00:00')
INSERT INTO #t VALUES('A', '2011-01-02 02:00:00')
INSERT INTO #t VALUES('A', '2011-01-03 03:00:00')
INSERT INTO #t VALUES('B', '2011-01-04 04:00:00')
INSERT INTO #t VALUES('A', '2011-01-05 07:00:00')
Select * from #t
;WITH cte1
AS (SELECT *,
ROW_NUMBER() OVER (ORDER BY Date) -
ROW_NUMBER() OVER (PARTITION BY PersonName
ORDER BY Date) AS G
FROM #t),
cte2
AS (SELECT PersonName,
MIN([Date]) StartDate,
ROW_NUMBER() OVER (ORDER BY MIN([Date])) AS rn
FROM cte1
GROUP BY PersonName,
G)
SELECT a.PersonName,
a.StartDate,
b.StartDate AS EndDate
FROM cte2 a
LEFT JOIN cte2 b
ON a.rn + 1 = b.rn
Because the results of CTEs are generally not materialized, however, you may well find you get better performance if you materialize the intermediate result yourself, as below.
CREATE TABLE #t2 (
rn INT IDENTITY(1, 1) PRIMARY KEY,
PersonName VARCHAR(32),
StartDate DATETIME );
INSERT INTO #t2
SELECT PersonName,
MIN([Date]) StartDate
FROM (SELECT *,
ROW_NUMBER() OVER (ORDER BY Date) -
ROW_NUMBER() OVER (PARTITION BY PersonName
ORDER BY Date) AS G
FROM #t) t
GROUP BY PersonName,
G
ORDER BY StartDate
SELECT a.PersonName,
a.StartDate,
b.StartDate AS EndDate
FROM #t2 a
LEFT JOIN #t2 b
ON a.rn + 1 = b.rn
SELECT
PersonName,
StartDate = MIN(Date),
EndDate
FROM (
SELECT
PersonName,
Date,
EndDate = (
/* get the earliest date after current date
associated with a different person */
SELECT MIN(t1.Date)
FROM #t AS t1
WHERE t1.Date > t.Date
AND t1.PersonName <> t.PersonName
)
FROM #t AS t
) s
GROUP BY PersonName, EndDate
ORDER BY 2
Basically, for every Date we find the nearest later date that is associated with a different PersonName. That gives us EndDate, which also identifies each consecutive group of dates for the same person.
Now we only need to group the data by PersonName & EndDate and get the minimal Date in every group as StartDate. And yes, sort the data by StartDate, of course.
Get a row number so you will know where the previous record is. Then, take a record and the next record after it. When the state changes we have a candidate row.
select
state,
min(start_timestamp),
max(end_timestamp)
from
(
select
first.state,
first.timestamp_ as start_timestamp,
second.timestamp_ as end_timestamp
from
(
select
*, row_number() over (order by timestamp_) as id
from test
) as first
left outer join
(
select
*, row_number() over (order by timestamp_) as id
from test
) as second
on
first.id = second.id - 1
and first.state != second.state
) as agg
group by state
having max(end_timestamp) is not null
union
-- the last row won't have an ending row
-- PostgreSQL: (select state, timestamp_, null from test order by timestamp_ desc limit 1)
-- SQL Server equivalent (untested), using TOP 1 inside a derived table:
select state, timestamp_, null
from (select top 1 state, timestamp_ from test order by timestamp_ desc) as last_row
order by 2
;
Tested with PostgreSQL but should work with SQL Server as well
The other answer with the cte is a good one. Another option would be to iterate over the collection in any case. It's not set based, but it is another way to do it.
You will need to iterate to either A. assign a unique id to each record that corresponds to its transaction, or B. to actually get your output.
TSQL is not ideal for iterating over records, especially if you have a lot, and so I would recommend some other way of doing it, a small .net program or something that is better at iterating.
There's a very quick way to do this using a bit of Gaps and Islands theory:
WITH CTE as (SELECT PersonName, [Date]
, Row_Number() over (ORDER BY [Date])
- Row_Number() over (ORDER BY PersonName, [Date]) as Island
FROM #t)
Select PersonName, Min([Date]), Max([Date])
from CTE
GROUP BY Island, PersonName
ORDER BY Min([Date])
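To see why the difference of the two row numbers identifies each island, it can help to look at the intermediate values. The sketch below is illustrative only: #islands_demo, rn_all and rn_by_person are hypothetical names, and the rows are the five from the question's DDL.

```sql
-- Hypothetical temp table with the rows from the question's DDL.
CREATE TABLE #islands_demo (PersonName varchar(32), [Date] datetime);
INSERT INTO #islands_demo VALUES ('A', '20110101 01:00:00');
INSERT INTO #islands_demo VALUES ('A', '20110102 02:00:00');
INSERT INTO #islands_demo VALUES ('A', '20110103 03:00:00');
INSERT INTO #islands_demo VALUES ('B', '20110104 04:00:00');
INSERT INTO #islands_demo VALUES ('A', '20110105 07:00:00');

SELECT PersonName, [Date],
       ROW_NUMBER() OVER (ORDER BY [Date]) AS rn_all,
       ROW_NUMBER() OVER (ORDER BY PersonName, [Date]) AS rn_by_person,
       ROW_NUMBER() OVER (ORDER BY [Date])
     - ROW_NUMBER() OVER (ORDER BY PersonName, [Date]) AS Island
FROM #islands_demo
ORDER BY [Date];
-- rn_all / rn_by_person / Island, in date order:
--   A: 1/1/0,  A: 2/2/0,  A: 3/3/0,  B: 4/5/-1,  A: 5/4/1
```

Consecutive rows for the same person share an Island value, and the value changes whenever another person's row intervenes, so grouping by PersonName and Island separates the first three A rows from the last one.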