SQL Find Last Entry Closest to a Date - sql

I am trying to filter the last entry in a table closest to a defined date and I am having difficulties. Any input is greatly appreciated. Thanks! I am running Microsoft SQL Server 2008.
Table:
code | account | date | amount
1 | 1234 | 2016-02-28 | 500
2 | 1234 | 2016-03-01 | 650
3 | 1234 | 2016-03-05 | 842
4 | 7890 | 2016-02-28 | 500
5 | 7890 | 2016-03-30 | 550
I want to select only entries with a date closest to March 31 ('2016-03-31'). In this example, the entry closest to 2016-03-31 for account 1234 is entry #3 and the entry closest to 2016-03-31 for account 7890 is entry #5. In other words, for each account I want the last entry dated on or before a given date.
3 | 1234 | 2016-03-05 | 842
5 | 7890 | 2016-03-30 | 550

Most DBMSes (including MS SQL Server) support Analytical Functions:
select *
from
(
select *,
row_number() -- create a ranking
over (partition by account -- for each account
order by date desc) as rn -- based on descending dates
from tab
where date <= '2016-03-31' -- standard SQL writes date '2016-03-31'; SQL Server takes the plain string
) dt
where rn = 1 -- return the row with the "closest" date

Since no DBMS is specified, here's a kind of hacky way to do this in SQL Server. It grabs the record just before and just after the specified date:
select * from (
select top(1) * FROM mytable
where date >= '2016-03-31' order by date asc
) t1
union
select * from (
select top(1) * FROM mytable
where date <= '2016-03-31' order by date desc
) t2

This should do what you want, and it should be easy enough to understand without further explanation:
select t.*
from your_table t
join (
select account, max(date) as date
from your_table
where date <= '2016-03-31'
group by account
) as subquery on t.account = subquery.account and t.date = subquery.date
Edit: for SQL Server it might be better to use an analytical function (like row_number)
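The join-on-MAX(date) answer above can be tried without a SQL Server instance. The following is a minimal sketch that runs the same query shape against the question's sample rows in an in-memory SQLite database through Python's sqlite3 module. The table name tab, the TEXT date column and the helper latest_entries are assumptions made for the illustration; SQLite is only a stand-in, not the asker's DBMS.

```python
import sqlite3

# In-memory SQLite database standing in for the asker's SQL Server table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (code INTEGER, account INTEGER, date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?)",
    [
        (1, 1234, "2016-02-28", 500),
        (2, 1234, "2016-03-01", 650),
        (3, 1234, "2016-03-05", 842),
        (4, 7890, "2016-02-28", 500),
        (5, 7890, "2016-03-30", 550),
    ],
)

# Per account, find the latest date on or before the cutoff,
# then join back to the table to fetch the full row.
LATEST_PER_ACCOUNT = """
SELECT t.code, t.account, t.date, t.amount
FROM tab AS t
JOIN (
    SELECT account, MAX(date) AS date
    FROM tab
    WHERE date <= ?
    GROUP BY account
) AS latest
  ON t.account = latest.account AND t.date = latest.date
ORDER BY t.account
"""


def latest_entries(cutoff):
    """Return one row per account: its last entry dated on or before cutoff."""
    return conn.execute(LATEST_PER_ACCOUNT, (cutoff,)).fetchall()


rows = latest_entries("2016-03-31")
for row in rows:
    print(row)
```

ISO-formatted date strings sort in chronological order, which is why the plain string comparison works here. If two rows of one account share the same latest date, this form returns both, whereas the row_number() form returns exactly one.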

Related

Create all months list from a date column in ORACLE SQL

CREATE TABLE dates(
alldates date);
INSERT INTO dates (alldates) VALUES ('1-May-2017');
INSERT INTO dates (alldates) VALUES ('1-Mar-2018');
I want to generate all months beginning between these two dates. I am very new to Oracle SQL. My solution is below, but it is not working properly.
WITH t1(test) AS (
SELECT MIN(alldates) as test
FROM dates
UNION ALL
SELECT ADD_MONTHS(test,1) as test
FROM t1
WHERE t1.test<= (SELECT MAX(alldates) FROM date)
)
SELECT * FROM t1
The result I want should look like
Test
2017-02-01
2017-03-01
...
2017-12-01
2018-01-01
2018-02-01
2018-03-01
You made a typo and wrote date instead of dates. You also need to make a second change and use ADD_MONTHS in the recursive query's WHERE clause, or you will generate one too many rows.
WITH t1(test) AS (
SELECT MIN(alldates)
FROM dates
UNION ALL
SELECT ADD_MONTHS(test,1)
FROM t1
WHERE ADD_MONTHS(test,1) <= (SELECT MAX(alldates) FROM dates)
)
SELECT * FROM t1
Which outputs:
| TEST |
| :-------- |
| 01-MAY-17 |
| 01-JUN-17 |
| 01-JUL-17 |
| 01-AUG-17 |
| 01-SEP-17 |
| 01-OCT-17 |
| 01-NOV-17 |
| 01-DEC-17 |
| 01-JAN-18 |
| 01-FEB-18 |
| 01-MAR-18 |
However, a more efficient query would be to get the minimum and maximum values in the same query and then iterate using these pre-found bounds:
WITH t1(min_date, max_date) AS (
SELECT MIN(alldates),
MAX(alldates)
FROM dates
UNION ALL
SELECT ADD_MONTHS(min_date,1),
max_date
FROM t1
WHERE ADD_MONTHS(min_date,1) <= max_date
)
SELECT min_date AS month
FROM t1
Update
Oracle 11gR2 has bugs handling recursive date queries; this is fixed in later Oracle versions but if you want to use SQL Fiddle and Oracle 11gR2 then you need to iterate over a numeric value and not a date. Something like this:
Oracle 11g R2 Schema Setup:
CREATE TABLE dates(
alldates date);
INSERT INTO dates (alldates) VALUES ('1-May-2017');
INSERT INTO dates (alldates) VALUES ('1-Mar-2018');
Query 1:
WITH t1(min_date, month, total_months) AS (
SELECT MIN(alldates),
0,
MONTHS_BETWEEN(MAX(alldates),MIN(alldates))
FROM dates
UNION ALL
SELECT min_date,
month+1,
total_months
FROM t1
WHERE month+1<=total_months
)
SELECT ADD_MONTHS(min_date,month) AS month
FROM t1
Results:
| MONTH |
|----------------------|
| 2017-05-01T00:00:00Z |
| 2017-06-01T00:00:00Z |
| 2017-07-01T00:00:00Z |
| 2017-08-01T00:00:00Z |
| 2017-09-01T00:00:00Z |
| 2017-10-01T00:00:00Z |
| 2017-11-01T00:00:00Z |
| 2017-12-01T00:00:00Z |
| 2018-01-01T00:00:00Z |
| 2018-02-01T00:00:00Z |
| 2018-03-01T00:00:00Z |
You seem to want a recursive CTE. That syntax would be:
WITH CTE(min_date, max_date) as (
SELECT MIN(alldates) as min_date, MAX(alldates) as max_date
FROM dates
UNION ALL
SELECT add_months(min_date, 1), max_date
FROM CTE
WHERE min_date < max_date
)
SELECT min_date
FROM CTE;
You just made a typo: date instead of dates:
WITH t1(test) AS (
SELECT MIN(alldates) as test
FROM dates
UNION ALL
SELECT ADD_MONTHS(test,1) as test
FROM t1
WHERE t1.test<= (SELECT MAX(alldates) FROM dateS) -- fixed here
)
SELECT * FROM t1
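For readers without an Oracle instance, the recursive approach with pre-computed bounds can be sketched in SQLite via Python's sqlite3 module. This is an illustration under assumptions, not the Oracle syntax: SQLite needs the RECURSIVE keyword, date(x, '+1 month') stands in for ADD_MONTHS, and the dates are stored as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dates (alldates TEXT)")
conn.executemany(
    "INSERT INTO dates VALUES (?)", [("2017-05-01",), ("2018-03-01",)]
)

# Anchor row: the minimum and maximum dates, fetched once.
# Recursive step: add one month, and stop before stepping past the maximum.
MONTHS = """
WITH RECURSIVE t1(month, max_date) AS (
    SELECT MIN(alldates), MAX(alldates)
    FROM dates
    UNION ALL
    SELECT date(month, '+1 month'), max_date
    FROM t1
    WHERE date(month, '+1 month') <= max_date
)
SELECT month
FROM t1
ORDER BY month
"""

months = [row[0] for row in conn.execute(MONTHS)]
for month in months:
    print(month)
```

Checking the incremented value in the WHERE clause, rather than the current one, is what prevents the extra trailing row described above.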

Flat Big Query Rows

Please help me to resolve this task. I have a Google BigQuery table like this:
| name | startDate | endDate |
| Bob | 2018-01-01 | 2018-01-01 |
| Nick | 2017-12-29 | 2017-12-31 |
and as a result I need to get something like this:
| name | date |
| Bob | 2018-01-01 |
| Nick | 2017-12-29 |
| Nick | 2017-12-30 |
| Nick | 2017-12-31 |
Is it possible? Thank you in advance.
WITH CTE as (
SELECT 'Bob' name, DATE '2018-01-01' startDate, DATE '2018-01-01' endDate
UNION ALL SELECT 'Nick', DATE '2017-12-29' startDate, DATE '2017-12-31' endDate
),
CTE2 AS (
SELECT name, GENERATE_DATE_ARRAY(startDate, endDate, INTERVAL 1 DAY) AS date
FROM CTE
)
SELECT name, date
FROM CTE2,
UNNEST(date) as date
Or just simply
#standardSQL
SELECT name, date
FROM `project.dataset.table`,
UNNEST(GENERATE_DATE_ARRAY(startDate, endDate)) date
You may make use of a calendar table here:
WITH dates AS (
SELECT DATE '2017-12-29' AS date_val UNION ALL
SELECT DATE '2017-12-30' UNION ALL
SELECT DATE '2017-12-31' UNION ALL
SELECT DATE '2018-01-01'
-- and maybe other dates
)
SELECT
t2.name,
t1.date_val
FROM dates t1
INNER JOIN yourTable t2
ON t1.date_val BETWEEN t2.startDate AND t2.endDate
ORDER BY
t2.name,
t1.date_val;
If your version of BigQuery does not support CTE, you may just inline the CTE as a subquery. That is, replace dates with the body of the CTE itself.
In practice, you might want to generate a date series, or possibly maintain a dedicated calendar table in your database. The above just shows what the query itself might look like.
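Outside BigQuery, where GENERATE_DATE_ARRAY is not available, the same row expansion can be sketched with a recursive CTE. The example below uses SQLite through Python's sqlite3 module purely as an illustration; the table name ranges and the TEXT date columns are assumptions, and this is not BigQuery syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ranges (name TEXT, startDate TEXT, endDate TEXT)")
conn.executemany(
    "INSERT INTO ranges VALUES (?, ?, ?)",
    [
        ("Bob", "2018-01-01", "2018-01-01"),
        ("Nick", "2017-12-29", "2017-12-31"),
    ],
)

# Start each range at its startDate and keep adding one day
# until the range's endDate has been emitted.
EXPAND = """
WITH RECURSIVE expanded(name, day, endDate) AS (
    SELECT name, startDate, endDate
    FROM ranges
    UNION ALL
    SELECT name, date(day, '+1 day'), endDate
    FROM expanded
    WHERE day < endDate
)
SELECT name, day
FROM expanded
ORDER BY name, day
"""

flat = conn.execute(EXPAND).fetchall()
for name, day in flat:
    print(name, day)
```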

get the id based on condition in group by

I'm trying to create a SQL query that merges rows with equal dates. The idea is to keep, for each date, the row with the highest number of hours, so that in the end I get the corresponding id for each date. I've tried a simple GROUP BY, but it does not work: I can't just put an aggregate function on the id column, because the id has to be chosen based on the hours condition.
+----+------------+-------+
| id | date       | hours |
+----+------------+-------+
| 1  | 2012-01-01 | 37    |
| 2  | 2012-01-01 | 10    |
| 3  | 2012-01-01 | 5     |
| 4  | 2012-01-02 | 37    |
+----+------------+-------+
Desired result:
+----+------------+-------+
| id | date       | hours |
+----+------------+-------+
| 1  | 2012-01-01 | 37    |
| 4  | 2012-01-02 | 37    |
+----+------------+-------+
If you want exactly one row -- even if there are ties -- then use row_number():
select t.*
from (select t.*, row_number() over (partition by date order by hours desc) as seqnum
from t
) t
where seqnum = 1;
Ironically, both Postgres and Oracle (the original tags) have what I would consider to be better ways of doing this, but they are quite different.
Postgres:
select distinct on (date) t.*
from t
order by date, hours desc;
Oracle:
select date, max(hours) as hours,
max(id) keep (dense_rank first order by hours desc) as id
from t
group by date;
Here's one approach using row_number:
select id, dt, hours
from (
select id, dt, hours, row_number() over (partition by dt order by hours desc) rn
from yourtable
) t
where rn = 1
You can use a correlated subquery:
select t.*
from table t
where id = (select t1.id
from table t1
where t1.date = t.date
order by t1.hours desc
limit 1);
In Oracle, you can use fetch first 1 row only in the subquery instead of the LIMIT clause.
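The correlated-subquery answer can be checked against the question's sample rows. Below is a minimal sketch using SQLite through Python's sqlite3 module, since SQLite accepts the LIMIT form directly. The table name t is an assumption, and the extra tie-breaker on id is an addition for determinism that is not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date TEXT, hours INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2012-01-01", 37),
        (2, "2012-01-01", 10),
        (3, "2012-01-01", 5),
        (4, "2012-01-02", 37),
    ],
)

# For each row, look up the id with the most hours on the same date
# and keep the row only if it is that id.
TOP_PER_DATE = """
SELECT t.id, t.date, t.hours
FROM t
WHERE t.id = (
    SELECT t1.id
    FROM t AS t1
    WHERE t1.date = t.date
    ORDER BY t1.hours DESC, t1.id  -- id breaks ties (assumption)
    LIMIT 1
)
ORDER BY t.date
"""

top_rows = conn.execute(TOP_PER_DATE).fetchall()
for row in top_rows:
    print(row)
```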

Select top 2 rows different from each other

I have this table
| date | sum |
|--------------|-------|
| 2015-02-19 | 10000 |
| 2015-02-19 | 10000 |
| 2015-02-20 | 15000 |
| 2015-02-20 | 15000 |
| 2015-02-21 | 18000 |
| 2015-02-21 | 18000 |
I want to select top 2 rows from the table, but only different ones, meaning my result should return 2015-02-20 and 2015-02-21.
SELECT TOP 2 distinct date
FROM stock
Using this gives me an error:
Incorrect syntax near the keyword 'distinct'.
Help would be highly appreciated.
You can try it like this (SQL Server requires an alias on the derived table):
select top 2 * from
(
select distinct date FROM stock
) as t
Try something like:
SELECT TOP 2 date
FROM stock
GROUP BY date
I think Distinct and Top should switch places in your query:
SELECT DISTINCT TOP 2 date FROM stock ORDER BY date DESC
Try:
select distinct top 2 date from stock
You can use GROUP BY:
SELECT TOP 2 date
FROM stock
GROUP BY date
ORDER BY date DESC
Sample result:
DATE
2015-02-21
2015-02-20
Try this:
WITH cte AS
( SELECT DISTINCT date,
DENSE_RANK() OVER (ORDER BY date DESC) AS rn -- 1 = latest date, 2 = next latest
FROM stock
)
SELECT date
FROM cte
WHERE rn <= 2
ORDER BY rn;
Try this:
SELECT TOP 2 date FROM stock group by date
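The answers above differ mainly in whether they add an ORDER BY; without one, which two dates come back is not guaranteed. As a rough check of the ordered variant, here is a sketch in SQLite via Python's sqlite3 module. SQLite has no TOP keyword, so LIMIT 2 stands in for TOP 2; that substitution and the TEXT date column are assumptions of the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (date TEXT, sum INTEGER)")
conn.executemany(
    "INSERT INTO stock VALUES (?, ?)",
    [
        ("2015-02-19", 10000),
        ("2015-02-19", 10000),
        ("2015-02-20", 15000),
        ("2015-02-20", 15000),
        ("2015-02-21", 18000),
        ("2015-02-21", 18000),
    ],
)

# DISTINCT collapses the duplicate dates first; the ORDER BY then decides
# which two of the remaining dates LIMIT keeps.
LATEST_TWO = """
SELECT DISTINCT date
FROM stock
ORDER BY date DESC
LIMIT 2
"""

latest_two = [row[0] for row in conn.execute(LATEST_TWO)]
print(latest_two)
```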

SQL Calculate Days between two dates in one table

I have a table dbo.Trans which contains an id called bd_id (varchar), a transfer_date (datetime) and an identifier member_id. The primary key is trns_id, which is sequential.
Duplicates of bd_id and member_id exist in the table.
transfer_date |bd_id| member_id | trns_id
2008-01-01 00:00:00 | 432 | 111 | 1
2008-01-03 00:00:00 | 123 | 111 | 2
2008-01-08 00:00:00 | 128 | 111 | 3
2008-02-04 00:00:00 | 123 | 432 | 4
.......
For each member_id, I want to get the number of days between consecutive dates, for each bd_id.
E.g., member 111 used 432 from 2008-01-01 until 2008-01-03, so the return should be 2.
The next would then be 5.
I know the DATEDIFF() function exists but I am not sure how to get the difference when dates are in the same table.
Any help appreciated.
You could try something like this.
select T1.member_id,
datediff(day, T1.transfer_date, T3.transfer_date) as DD
from YourTable as T1
cross apply (select top 1 T2.transfer_date
from YourTable as T2
where T2.transfer_date > T1.transfer_date and
T2.member_id = T1.member_id
order by T2.transfer_date) as T3
You must select the first and second records that you want, then take their dates and get the DATEDIFF of those two dates. In SQL Server the first argument is the date part:
DATEDIFF(day, date1, date2);
Your problem is getting the next member date.
Here is an example using a correlated subquery to get the next date:
select t.*, datediff(day, t.transfer_date, nextdate) as Days_Between
from (select t.*,
(select min(transfer_date)
from trans t2
where t.bd_id = t2.bd_id and
t.member_id = t2.member_id and
t.transfer_date < t2.transfer_date
) as NextDate
from trans t
) t
SQL Server 2012 adds the lag() and lead() functions, which make this a bit easier to express.
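The window-function route mentioned above can be sketched as follows. This is an illustration in SQLite via Python's sqlite3 module, not SQL Server code: it assumes SQLite 3.25 or later for window functions, uses julianday() differences in place of DATEDIFF, and partitions by member_id only, which is one reading of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trans ("
    "transfer_date TEXT, bd_id TEXT, member_id INTEGER, trns_id INTEGER)"
)
conn.executemany(
    "INSERT INTO trans VALUES (?, ?, ?, ?)",
    [
        ("2008-01-01 00:00:00", "432", 111, 1),
        ("2008-01-03 00:00:00", "123", 111, 2),
        ("2008-01-08 00:00:00", "128", 111, 3),
        ("2008-02-04 00:00:00", "123", 432, 4),
    ],
)

# lead() fetches the next transfer_date of the same member;
# the last row of each member has no successor, so its gap is NULL.
DAYS_BETWEEN = """
WITH with_next AS (
    SELECT member_id,
           trns_id,
           transfer_date,
           LEAD(transfer_date) OVER (
               PARTITION BY member_id
               ORDER BY transfer_date
           ) AS next_date
    FROM trans
)
SELECT member_id,
       trns_id,
       CAST(ROUND(julianday(next_date) - julianday(transfer_date)) AS INTEGER)
           AS days_between
FROM with_next
ORDER BY trns_id
"""

gaps = conn.execute(DAYS_BETWEEN).fetchall()
for row in gaps:
    print(row)
```

To measure gaps per member and bd_id instead, add bd_id to the PARTITION BY list.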