SELECT Query between dates, only selecting items between start and end fields - sql

I have two tables that I will be using for tracking purposes, a Date Table and an Item Table. The Date Table tracks the start and end dates of a tracked id. The Item Table holds the number of items pulled on a specific date for an id. The id is the foreign key between these two tables.
What I want is a sum of the items, grouped by id, that only counts an item if the date it was pulled falls between the start_date and end_date of the tracked id.
The Date Table
id start_date end_date
1 2014-01-01 NULL
2 2014-01-01 2014-01-02
3 2014-01-25 NULL
The Item Table
id items date
1 3 2014-01-01
1 5 2014-01-02
1 5 2014-01-26
2 2 2014-01-01
2 3 2014-01-05
2 2 2014-01-26
3 2 2014-01-01
3 3 2014-01-05
3 2 2014-01-26
SQL I have so far, but I'm lost as to what to add to it from here.
SELECT
a.id,
SUM(items)
FROM
ww_test.dbo.items a
INNER JOIN ww_test.dbo.dates b ON
a.id = b.id
WHERE
a.date >= '2014-01-01' AND a.date <= '2014-01-30'
GROUP BY
a.id
ORDER BY
a.id
The output should be:
id items
1 13
2 2
3 2
Instead of:
id items
1 13
2 7
3 7

First of all, I strongly recommend that you stop using NULL in your date ranges to represent "no end date" and instead use a sentinel value such as 9999-12-31. The reason for this is primarily performance and secondarily query simplicity--a benefit to yourself now in writing the queries and to you or others later who have to maintain them. In front-end or middle-tier code, there is little difference to comparing a date range to Null or to 9999-12-31, and in fact you get some of the same benefits of simplified code there as you do in your SQL. I base this recommendation on over 10 years of full-time professional SQL query writing experience.
To fix your query as is, I think this would work:
SELECT
a.id,
ItemsSum = SUM(items)
FROM
ww_test.dbo.items a
INNER JOIN ww_test.dbo.dates b
ON a.id = b.id
AND a.date >= Coalesce(b.start_date, '19000101') -- same value as 0 for datetime, and also valid for the date type
AND a.date <= Coalesce(b.end_date, '99991231')
WHERE
a.date >= '20140101'
AND a.date <= '20140130'
GROUP BY
a.id
ORDER BY
a.id
;
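As a quick check, the fixed query can be run against the sample data from the question. The sketch below uses Python's built-in sqlite3 module with an in-memory database rather than SQL Server, so the unqualified table names (dates, items), the ISO-text date storage, and the date strings used as NULL fallbacks are assumptions made for illustration; the join logic is otherwise the same.

```python
import sqlite3

# In-memory SQLite stand-in for the two tables in the question (an assumption;
# the original runs on SQL Server). Dates are stored as ISO-8601 text, which
# compares correctly as plain strings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dates (id INTEGER, start_date TEXT, end_date TEXT);
    CREATE TABLE items (id INTEGER, items INTEGER, date TEXT);
    INSERT INTO dates VALUES
        (1, '2014-01-01', NULL),
        (2, '2014-01-01', '2014-01-02'),
        (3, '2014-01-25', NULL);
    INSERT INTO items VALUES
        (1, 3, '2014-01-01'), (1, 5, '2014-01-02'), (1, 5, '2014-01-26'),
        (2, 2, '2014-01-01'), (2, 3, '2014-01-05'), (2, 2, '2014-01-26'),
        (3, 2, '2014-01-01'), (3, 3, '2014-01-05'), (3, 2, '2014-01-26');
""")

# The per-id date range is enforced in the JOIN; a NULL start or end date
# falls back to an "open" bound via COALESCE.
query = """
    SELECT a.id, SUM(a.items) AS items_sum
    FROM items AS a
    INNER JOIN dates AS b
        ON  a.id = b.id
        AND a.date >= COALESCE(b.start_date, '0001-01-01')
        AND a.date <= COALESCE(b.end_date,   '9999-12-31')
    WHERE a.date >= '2014-01-01' AND a.date <= '2014-01-30'
    GROUP BY a.id
    ORDER BY a.id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 13), (2, 2), (3, 2)]
```

The result matches the output the question asks for.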
Note that if you followed my recommendation, your query JOIN conditions could look like this:
INNER JOIN ww_test.dbo.dates b
ON a.id = b.id
AND a.date >= b.start_date
AND a.date <= b.end_date
You will find that if your data sets become large, having to put a Coalesce or IsNull in there will hurt performance in a significant way. It doesn't help to use OR clauses, either:
INNER JOIN ww_test.dbo.dates b
ON a.id = b.id
AND (a.date >= b.start_date OR b.start_date IS NULL)
AND (a.date <= b.end_date OR b.end_date IS NULL)
That's going to have the same problems (for example converting what could have been a seek when there's a suitable index, into a scan, which would be very sad).
Last, I also recommend that you change your end dates to be exclusive instead of inclusive. This means that for the end date, instead of entering the date of the beginning of the final day the information is true, you put the date of the first day it is no longer true. There are several reasons for this recommendation:
If your date resolution ever changes to hours, or minutes, or seconds, every piece of code you have ever written dealing with this data will have to change (and it won't if you use exclusive end dates).
If you ever have to compare date ranges to each other (to collapse date ranges together or locate contiguous ranges or even locate non-contiguous ranges), you now have to do all the comparisons on a.end_date + 1 = b.start_date instead of a simple equijoin of a.end_date = b.start_date. This is painful, and easy to make mistakes.
Always thinking of dates as implying a time of day will be extremely salutary to your coding ability in any language. Many mistakes are made, over and over, by people forgetting that dates, even ones in formats that can't denote a time portion (such as the date data type in SQL 2008 and up), still have an implicit time portion, can be converted directly to data types that do have a time portion, and that the time portion will always be 0, or 12 a.m.
The only drawback is that in some cases, you have to do some twiddling about what date you show users (to convert to the inclusive date) and then convert dates they enter into the exclusive date for storing into the database. But this is confined to UI-handling code and is not throughout your database, so it's not that big a drawback.
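To illustrate the point about comparing date ranges to each other, here is a minimal sketch of finding contiguous ranges with exclusive end dates. It uses Python's sqlite3 module, and the `periods` table and its rows are made up for the example; they are not part of the question's schema.

```python
import sqlite3

# Hypothetical table of date ranges with EXCLUSIVE end dates: each end_date
# is the first day the range no longer applies.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE periods (id INTEGER, start_date TEXT, end_date TEXT);
    INSERT INTO periods VALUES
        (1, '2014-01-01', '2014-01-10'),
        (1, '2014-01-10', '2014-02-01'),   -- starts exactly where the first ends
        (1, '2014-03-01', '2014-04-01');   -- gap before this one
""")

# With exclusive ends, "b follows a with no gap" is a plain equijoin;
# no "+ 1 day" arithmetic is needed.
contiguous = conn.execute("""
    SELECT a.id, a.start_date, b.end_date
    FROM periods AS a
    INNER JOIN periods AS b
        ON  a.id = b.id
        AND a.end_date = b.start_date
""").fetchall()
print(contiguous)  # [(1, '2014-01-01', '2014-02-01')]
```

With inclusive end dates the first row would end on 2014-01-09 and the join condition would need date arithmetic on one side.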
The only change to your query would be:
INNER JOIN ww_test.dbo.dates b
ON a.id = b.id
AND a.date >= b.start_date
AND a.date < b.end_date -- no equal sign now
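Putting both recommendations together, the sketch below stores the question's sample ranges with a sentinel instead of NULL and with exclusive end dates, then runs the simplified join. It is a SQLite stand-in driven from Python, so the table names and the text date storage are assumptions; note that id 2's inclusive end of 2014-01-02 becomes the exclusive 2014-01-03.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dates (id INTEGER, start_date TEXT, end_date TEXT);
    CREATE TABLE items (id INTEGER, items INTEGER, date TEXT);
    -- Sentinel 9999-12-31 replaces NULL; end dates are exclusive.
    INSERT INTO dates VALUES
        (1, '2014-01-01', '9999-12-31'),
        (2, '2014-01-01', '2014-01-03'),
        (3, '2014-01-25', '9999-12-31');
    INSERT INTO items VALUES
        (1, 3, '2014-01-01'), (1, 5, '2014-01-02'), (1, 5, '2014-01-26'),
        (2, 2, '2014-01-01'), (2, 3, '2014-01-05'), (2, 2, '2014-01-26'),
        (3, 2, '2014-01-01'), (3, 3, '2014-01-05'), (3, 2, '2014-01-26');
""")

# No COALESCE and no OR: just a half-open range test [start_date, end_date).
totals = conn.execute("""
    SELECT a.id, SUM(a.items) AS items_sum
    FROM items AS a
    INNER JOIN dates AS b
        ON  a.id = b.id
        AND a.date >= b.start_date
        AND a.date <  b.end_date
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()
print(totals)  # [(1, 13), (2, 2), (3, 2)]
```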
One last thing: be aware that the date format 'yyyy-mm-dd' is not culture-safe when it is converted to datetime.
SET LANGUAGE FRENCH;
SELECT Convert(datetime, '2014-01-30'); -- fails with an error
The only invariantly culture-safe formats for datetime in SQL Server are:
yyyymmdd
yyyy-mm-ddThh:mm:ss
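If the date literals are built in client code, the two formats above are easy to produce. The helper below is a small Python sketch; the function name is hypothetical, and in practice passing dates as query parameters avoids string formatting altogether.

```python
from datetime import datetime
from typing import Tuple


def sql_server_date_literals(moment: datetime) -> Tuple[str, str]:
    """Format a datetime in the two language-independent layouts listed above."""
    compact = moment.strftime("%Y%m%d")           # yyyymmdd
    iso_t = moment.strftime("%Y-%m-%dT%H:%M:%S")  # yyyy-mm-ddThh:mm:ss
    return compact, iso_t


compact, iso_t = sql_server_date_literals(datetime(2014, 1, 30, 13, 45, 0))
print(compact)  # 20140130
print(iso_t)    # 2014-01-30T13:45:00
```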

I think what you want is to check that each item's date falls between the start_date and end_date of your Dates table.
Change your query to the following and try it:
SELECT
a.id,
SUM(items)
FROM
ww_test.dbo.items a
INNER JOIN ww_test.dbo.dates b ON a.id = b.id
WHERE
a.date >= ISNULL(b.start_date, GETDATE())
AND a.date <= ISNULL(b.end_date, GETDATE())
GROUP BY a.id
ORDER BY a.id

The problem with the query is the condition part.
Also, since you need to retrieve data based on the conditions defined in the Dates table, you do not have to hard-code the date range.
Assuming that your end date can be either NULL or populated, you can use the following query:
SELECT
a.id,
SUM(items)
FROM
ww_test.dbo.items a
INNER JOIN ww_test.dbo.dates b ON
a.id = b.id
where (b.end_date is not null and a.date between b.start_date and b.end_date)
or (b.end_date is null and a.date >= b.start_date)
GROUP BY
a.id
ORDER BY
a.id

Related

SQL join with dates

I have table A with columns:
customer_id, month, amount
Month is like 2015/12/01, meaning it's the amount paid in December 2015.
Then there is table B with columns:
customer_id, plan_id, start_date, end_date
This is information on when a particular customer started and ended using a particular plan. The current plan will have end_date NULL. One customer could have used many different plans in the past.
I need to add plan_id column to table A by joining these 2 tables but I have no idea how to deal with the dates.
Note that for each customer one month should correspond to one plan only. So even if the start_date for a plan is 2015/11/02, it should only be applied for the next month (2015/12/01).
This is basically a join, but with inequalities:
select a.*, b.*
from a left join
b
on a.customer_id = b.customer_id and
a.month >= b.start_date and
(a.month <= b.end_date or b.end_date is null);
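Here is a small runnable sketch of that inequality join, using Python's sqlite3 module. The table names (`payments` for table A, `plans` for table B) and all rows are invented for illustration; the sample includes a plan that starts on 2015-11-02, so it is first applied to the month 2015-12-01, as the question requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for tables A and B in the question.
    CREATE TABLE payments (customer_id INTEGER, month TEXT, amount INTEGER);
    CREATE TABLE plans (customer_id INTEGER, plan_id INTEGER,
                        start_date TEXT, end_date TEXT);
    INSERT INTO payments VALUES
        (1, '2015-01-01', 10),
        (1, '2015-11-01', 20),
        (1, '2015-12-01', 30);
    INSERT INTO plans VALUES
        (1, 10, '2015-01-15', '2015-11-01'),
        (1, 20, '2015-11-02', NULL);        -- current plan, open-ended
""")

# LEFT JOIN keeps months that fall outside every plan (plan_id is NULL there).
with_plan = conn.execute("""
    SELECT a.month, b.plan_id
    FROM payments AS a
    LEFT JOIN plans AS b
        ON  a.customer_id = b.customer_id
        AND a.month >= b.start_date
        AND (a.month <= b.end_date OR b.end_date IS NULL)
    ORDER BY a.month
""").fetchall()
print(with_plan)
# [('2015-01-01', None), ('2015-11-01', 10), ('2015-12-01', 20)]
```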

How to insert the dates from a table to another table for two different conditions

I want to fetch dates (b.Date) from table b and table c under a condition: when c.Code = 001, b.Date should cover the past month; for any other c.Code, b.Date should be greater than sysdate.
The pseudocode below explains what I mean:
Create Table A
As (select b.Date, c.Code from table b, table c
Where b.Code = c.Code
And b.Date should fetch the dates from the past month when c.Code = 001
and b.Date should fetch the dates greater than sysdate when c.Code is other than 001)
Table A should be populated with the date and code columns, for example:
Code Date
Code001 7/01/19
Code001 8/01/19
...
Code111 7/02/19
Could you please tell me how I can achieve this using Oracle SQL?
Please let me know if you need any other information on the issue or if my explanation is confusing.
Thanks for your support,
Vani.
I am not entirely sure about the question, but based on what you have provided, I would use a union of two queries (Oracle SQL). I understand this may not be an optimized query: depending on how large the tables are, it may consume resources because of the cartesian product. Please see below:
select b.date,c.code from <table> b,<table> c
where b.code=c.code
and c.code='001'
and b.date>= trunc(add_months(sysdate, -1),'MM') and b.date<trunc(sysdate, 'MM')
union
select b.date,c.code from <table> b,<table> c
where b.code=c.code
and c.code<>'001'
and b.date>sysdate;
It should be possible to achieve this by implementing the logic in the WHERE clause.
WHERE
( c.Code = '001' AND b.Date >= TRUNC(sysdate, 'mm') )
OR b.Date >= sysdate
Or maybe (if you are storing dates without time)
WHERE
( c.Code = '001' AND b.Date >= TRUNC(sysdate, 'mm') )
OR b.Date >= TRUNC(sysdate)
NB: date functions vary widely between RDBMSs; the above solution is for Oracle. In MySQL, you would do:
WHERE
( c.Code = '001' AND b.Date >= DATE_FORMAT(CURDATE(), '%Y-%m-01') )
OR b.Date >= NOW() -- or CURDATE()
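For a version that can be run and checked, the sketch below expresses the two conditions in a single WHERE clause on SQLite (via Python's sqlite3), using the previous-calendar-month bounds from the UNION query above. It is not Oracle syntax: the table names `table_b` and `table_c`, the sample rows, and the `:today` parameter (a fixed stand-in for sysdate, so the result is reproducible) are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_b (code TEXT, date TEXT);
    CREATE TABLE table_c (code TEXT);
    INSERT INTO table_c VALUES ('001'), ('111');
    INSERT INTO table_b VALUES
        ('001', '2019-06-30'), ('001', '2019-07-01'),
        ('001', '2019-07-31'), ('001', '2019-08-01'),
        ('111', '2019-07-02'), ('111', '2019-08-15'), ('111', '2019-08-16');
""")

# Code 001: dates within the previous calendar month.
# Any other code: dates strictly after "today".
sql = """
    SELECT b.date, c.code
    FROM table_b AS b
    INNER JOIN table_c AS c ON b.code = c.code
    WHERE (c.code = '001'
           AND b.date >= date(:today, 'start of month', '-1 month')
           AND b.date <  date(:today, 'start of month'))
       OR (c.code <> '001' AND b.date > :today)
    ORDER BY c.code, b.date
"""
selected = conn.execute(sql, {"today": "2019-08-15"}).fetchall()
print(selected)
# [('2019-07-01', '001'), ('2019-07-31', '001'), ('2019-08-16', '111')]
```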

How to select overlapping date ranges in SQL

I have a table with the following columns :
sID, start_date and end_date
Some of the values are as follows:
1 1995-07-28 2003-07-20
1 2003-07-21 2010-05-04
1 2010-05-03 2010-05-03
2 1960-01-01 2011-03-01
2 2011-03-02 2012-03-13
2 2012-03-12 2012-10-21
2 2012-10-22 2012-11-08
3 2003-07-23 2010-05-02
I only want the 2nd and 3rd rows in my result as they are the overlapping date ranges.
I tried this but it would not get rid of the first row. Not sure where I am going wrong?
select a.sID from table a
inner join table b
on a.sID = b.sID
and ((b.start_date between a.start_date and a.end_date)
and (b.end_date between a.start_date and b.end_date ))
order by end_date desc
I am trying to do this in SQL Server.
One way of doing this reasonably efficiently is
WITH T1
AS (SELECT *,
MAX(end_date) OVER (PARTITION BY sID ORDER BY start_date) AS max_end_date_so_far
FROM YourTable),
T2
AS (SELECT *,
range_start = IIF(start_date <= LAG(max_end_date_so_far) OVER (PARTITION BY sID ORDER BY start_date), 0, 1),
next_range_start = IIF(LEAD(start_date) OVER (PARTITION BY sID ORDER BY start_date) <= max_end_date_so_far, 0, 1)
FROM T1)
SELECT SId,
start_date,
end_date
FROM T2
WHERE 0 IN ( range_start, next_range_start );
If you have an index on (sID, start_date) INCLUDE (end_date), this can perform the work with a single ordered scan.
Your logic is not totally correct, although it almost works on your sample data. The specific reason it fails is because between includes the end points, so any given row matches itself. That said, the logic still isn't correct because it doesn't catch this situation:
a-------------a
b----b
Here is correct logic:
select a.*
from table a
where exists (select 1
from table b
where a.sid = b.sid and
a.start_date < b.end_date and
a.end_date > b.start_date and
(a.start_date <> b.start_date or -- filter out the record itself
a.end_date <> b.end_date
)
)
order by a.end_date;
The rule for overlapping time periods (or ranges of any sort) is that period 1 overlaps with period 2 when period 1 starts before period 2 ends and period 1 ends after period 2 starts. Happily, there is no need or use for between for this purpose. (I strongly discourage using between with date/time operands.)
I should note that this version does not consider two time periods to overlap when one ends on the same day another begins. That is easily adjusted by changing the < and > to <= and >=.
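The overlap rule can be checked against the sample rows from the question. The sketch below runs the EXISTS query on SQLite through Python's sqlite3 module; the table name `ranges` is an assumption, and dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ranges (sID INTEGER, start_date TEXT, end_date TEXT);
    INSERT INTO ranges VALUES
        (1, '1995-07-28', '2003-07-20'),
        (1, '2003-07-21', '2010-05-04'),
        (1, '2010-05-03', '2010-05-03'),
        (2, '1960-01-01', '2011-03-01'),
        (2, '2011-03-02', '2012-03-13'),
        (2, '2012-03-12', '2012-10-21'),
        (2, '2012-10-22', '2012-11-08'),
        (3, '2003-07-23', '2010-05-02');
""")

# A row is kept when some OTHER row with the same sID starts before it ends
# and ends after it starts.
overlapping = conn.execute("""
    SELECT a.sID, a.start_date, a.end_date
    FROM ranges AS a
    WHERE EXISTS (
        SELECT 1
        FROM ranges AS b
        WHERE a.sID = b.sID
          AND a.start_date < b.end_date
          AND a.end_date   > b.start_date
          AND (a.start_date <> b.start_date OR a.end_date <> b.end_date)
    )
    ORDER BY a.end_date
""").fetchall()
for row in overlapping:
    print(row)
# (1, '2010-05-03', '2010-05-03')
# (1, '2003-07-21', '2010-05-04')
# (2, '2011-03-02', '2012-03-13')
# (2, '2012-03-12', '2012-10-21')
```

For sID 1 this returns the 2nd and 3rd rows, as requested, and it also reports the overlapping pair for sID 2.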

Fetch max value from a sort-of incomplete dataset

A number of devices return a value. Only upon change, this value gets stored in a table:
Device Value Date
B 5 2017-07-01
C 2 2017-07-01
A 3 2017-07-02
C 1 2017-07-04
A 6 2017-07-04
Values may enter the table at any date (i.e. the date doesn't increment continuously). Several devices may store their value on the same date.
Note that, even though there are usually only a few devices for each date in the table, all devices actually have a value at that date: it's the latest one stored until then. For example, on 2017-07-02 only device A stored a value. The values for B and C on that date are the ones stored on 2017-07-01; these are still valid on -02, they just did not change.
To retrieve the values for all devices on a given date, e.g. 2017-07-04, I'm using this:
select data.device, data.value
from data inner join
(select device, max(date) as date
from data
where date <= '2017-07-04' group by device
) latestdate
on data.device = latestdate.device and data.date = latestdate.date
Device Value
A 6
B 5
C 1
Question: I'd like to read the max value of all devices on all dates in a given range. The result set would be like this:
Date max(value)
2017-07-01 5
2017-07-02 5
2017-07-04 6
.. and I have no clue if that's possible using only SQL. Until now all I got was lost in an exceptional bunch of joins and groupings.
(Database is sqlite3. Generic SQL would be nice, but I'd still be happy to hear about solutions specific to other databases, especially PostgreSQL or MariaDB.)
Extra bonus: Include the missing date -03, to be exact: returning values at given dates, not necessarily the ones appearing in the table.
Date max(value)
2017-07-01 5
2017-07-02 5
2017-07-03 5
2017-07-04 6
I think the most generic way to approach this is using a separate query for each date. There are definitely simpler methods, depending on the database. But getting one that works for SQLite, MariaDB, and Postgres is not going to use any sophisticated functionality:
select '2017-07-01' as date, max(data.value)
from data inner join
(select device, max(date) as date
from data
where date <= '2017-07-01' group by device
) latestdate
on data.device = latestdate.device and data.date = latestdate.date
union all
select '2017-07-02' as date, max(data.value)
from data inner join
(select device, max(date) as date
from data
where date <= '2017-07-02' group by device
) latestdate
on data.device = latestdate.device and data.date = latestdate.date
union all
select '2017-07-03' as date, max(data.value)
from data inner join
(select device, max(date) as date
from data
where date <= '2017-07-03' group by device
) latestdate
on data.device = latestdate.device and data.date = latestdate.date
union all
select '2017-07-04' as date, max(data.value)
from data inner join
(select device, max(date) as date
from data
where date <= '2017-07-04' group by device
) latestdate
on data.device = latestdate.device and data.date = latestdate.date;
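On SQLite specifically, the one-query-per-date pattern can be folded into a single statement by generating the dates with a recursive CTE and joining each generated day to every device's latest row on or before that day. This is a different technique from the UNION ALL above, shown here only as a sketch through Python's sqlite3 module; the CTE name `days` and the `:start` / `:end` parameters are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (device TEXT, value INTEGER, date TEXT);
    INSERT INTO data VALUES
        ('B', 5, '2017-07-01'),
        ('C', 2, '2017-07-01'),
        ('A', 3, '2017-07-02'),
        ('C', 1, '2017-07-04'),
        ('A', 6, '2017-07-04');
""")

sql = """
    WITH RECURSIVE days(day) AS (
        SELECT :start
        UNION ALL
        SELECT date(day, '+1 day') FROM days WHERE day < :end
    )
    SELECT days.day, MAX(d.value) AS max_value
    FROM days
    INNER JOIN data AS d
        -- keep only each device's most recent row on or before this day
        ON d.date = (SELECT MAX(d2.date)
                     FROM data AS d2
                     WHERE d2.device = d.device
                       AND d2.date <= days.day)
    GROUP BY days.day
    ORDER BY days.day
"""
daily_max = conn.execute(sql, {"start": "2017-07-01", "end": "2017-07-04"}).fetchall()
print(daily_max)
# [('2017-07-01', 5), ('2017-07-02', 5), ('2017-07-03', 5), ('2017-07-04', 6)]
```

Because the days are generated rather than read from the table, the missing date 2017-07-03 is included.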
This should solve your problem.
It should work across databases, since the OVER clause is supported by most of them.
You need a table containing all the dates ("ALL_DATE" in the query); alternatively, every database has its own way of generating them without a table.
WITH GROUPED_BY_DATE_DEVICE AS (
SELECT DATE, DEVICE, SUM(VALUE) AS VALUE FROM DEVICE_INFO
GROUP BY DATE, DEVICE
), GROUPED_BY_DATE AS (
SELECT A.DATE, MAX(VALUE) AS VALUE
FROM ALL_DATE A
LEFT JOIN GROUPED_BY_DATE_DEVICE B
ON A.DATE = B.DATE
GROUP BY A.DATE
)
SELECT DATE, MAX(VALUE) OVER (ORDER BY DATE) AS MAX_VALUE
FROM GROUPED_BY_DATE
ORDER BY DATE;

SQL - daily change in a value with business day into consideration

Hi, I am trying to write a query that tracks daily changes in a column that isn't populated on weekends or holidays.
My data looks something like this:
Date Value
11/5/2015 10
11/6/2015 11
11/9/2015 12
11/10/2015 12
11/11/2015 11
I want my query to return the change in value on each date versus the previous business day, something like this:
Date Change in Value since previous business day
11/5/2015 -
11/6/2015 1
11/9/2015 1
11/10/2015 0
11/11/2015 -1
How do I write a query in MS Access that tracks daily changes over business days? Currently I have written the following, which only returns the change over a calendar day rather than a business day, so it won't return anything on Mondays.
SELECT A.Date, A.Value, ( A.Value - B.Value) as [Daily change]
FROM Table as A INNER JOIN Table as B on (A.date = B.date+1)
=============================================================================
Thanks guys, I've tried all 3 suggestions, but unfortunately they didn't work. There's another column called product ID, and perhaps that is why? In other words, on each day, each product ID has its own distinct value. There are 100 product IDs in total, so on each date there are 100 different values, and I would like to track daily changes (on a business-day basis) for each of the 100 product IDs. Could anyone kindly help here?
It's hacky, but why not:
Join on 3 days ago also
use iif to say "if the 1 day ago diff is null then show the 3 days ago diff"
SELECT
A.Date, A.Value,
iif (isNull( A.Value - B.Value), ( A.Value - C.Value), ( A.Value - B.Value) ) as [change since last biz day]
FROM [Table] as A
left JOIN [Table] as B on ( A.Date = B.Date + 1 )
left JOIN [Table] as C on ( A.Date = C.Date + 3 )
Sometimes I just say it many times in English and the SQL follows. You want it where B equals the maximum date that is less than A.
SELECT A.Date,
A.Value,
A.Value - B.Value as [Daily Change]
FROM MyTable as A
INNER JOIN MyTable as B
ON B.date = (SELECT MAX(C.date) FROM MyTable C WHERE C.Date < A.Date)
ORDER BY A.Date
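The "maximum date that is less than A" idea extends to the per-product case mentioned in the question's follow-up by correlating on the product as well. The sketch below runs on SQLite via Python's sqlite3, not MS Access; the table name `vals`, the `product_id` column, the ISO date strings, and the second product's rows are assumptions made for illustration. A LEFT JOIN is used so the first date of each product is kept with an empty change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vals (product_id INTEGER, date TEXT, value INTEGER);
    INSERT INTO vals VALUES
        (1, '2015-11-05', 10), (1, '2015-11-06', 11), (1, '2015-11-09', 12),
        (1, '2015-11-10', 12), (1, '2015-11-11', 11),
        (2, '2015-11-05', 100), (2, '2015-11-06', 90), (2, '2015-11-09', 95);
""")

# For each row, B is the same product's row on the latest earlier date that
# exists in the table, which skips weekends and holidays automatically.
changes = conn.execute("""
    SELECT a.product_id, a.date, a.value - b.value AS daily_change
    FROM vals AS a
    LEFT JOIN vals AS b
        ON  b.product_id = a.product_id
        AND b.date = (SELECT MAX(c.date)
                      FROM vals AS c
                      WHERE c.product_id = a.product_id
                        AND c.date < a.date)
    ORDER BY a.product_id, a.date
""").fetchall()
for row in changes:
    print(row)
# (1, '2015-11-05', None)
# (1, '2015-11-06', 1)
# (1, '2015-11-09', 1)
# (1, '2015-11-10', 0)
# (1, '2015-11-11', -1)
# (2, '2015-11-05', None)
# (2, '2015-11-06', -10)
# (2, '2015-11-09', 5)
```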