Oracle SQL query to get sales by date range - sql

I am looking to write an SQL query that will provide me sales broken into date ranges, but it is a bit above my SQL knowledge.
I have a table of date ranges by customers as follows:
Cust Product startdate enddate
-----------------------------------
A 123 2011-01-01 2011-12-31
A 124 2011-01-01 2011-05-01
A 125 2011-01-01 2011-05-01
B 123 2011-01-01 2011-03-01
B 124 2011-01-01 2011-03-01
C 125 2011-02-02 2011-05-01
and sales stored as follows:
Cust Product date qty
-----------------------------------
A 123 2011-04-08 1
A 124 2011-01-01 12
A 125 2011-05-01 2
B 123 2011-01-04 3
B 124 2011-02-01 5
C 125 2011-03-01 80
The results should look something like:
Cust Product startdate enddate qty
-----------------------------------------
A 124 2011-01-01 2011-02-01 12
B 123 2011-01-01 2011-02-01 3
B 124 2011-02-02 2011-03-01 5
A 123 2011-03-02 2011-05-01 1
C 125 2011-03-02 2011-05-01 80
A 125 2011-05-02 2011-12-31 2
Any advice gratefully received.

I made the example in MySQL because Oracle server was down. But query is the same.
SQL Fiddle Demo
SELECT R.*, S.*
FROM dRanges R
JOIN Sales S
ON S.`date` >= R.`startdate`
AND S.`date` <= R.`enddate`
AND S.`Cust` = R.`Cust`
AND S.`Product` = R.`Product`
But you have to be carefull ranges doesnt overlap, otherwise you can have same Sales value appear on two ranges
EDIT Please explain the logic here

Related

SQL Conditional update column value based on previous row value

I have a table with attendance dates in SQL Workbench/J. I need to define the attendance periods of the people, where any attendance period with gaps less or equal than 90 days are merged into a single attendance period and any gaps larger than that are considered a different attendance period. For example for a single person this is the table I have
id
year
start_date
end_date
prev_att_month
diff
1
2012
2012-08-01
2012-08-31
2012-07-01
31
1
2012
2012-07-01
2012-07-31
2012-04-01
91
1
2012
2012-04-01
2012-04-30
2012-03-01
31
1
2012
2012-03-01
2012-03-31
2012-02-01
29
1
2012
2012-02-01
2012-02-29
2012-01-01
31
1
2012
2012-01-01
2012-01-31
2011-12-01
31
1
2011
2011-12-01
2011-12-31
2011-11-01
30
1
2011
2011-11-01
2011-11-30
2011-10-01
31
1
2011
2011-10-01
2011-10-31
2011-09-01
30
1
2011
2011-09-01
2011-09-30
2011-08-01
31
1
2011
2011-08-01
2011-08-31
2011-07-01
31
1
2011
2011-07-01
2011-07-31
2011-05-01
61
1
2011
2011-05-01
2011-05-31
2011-04-01
30
1
2011
2011-04-01
2011-04-30
2011-03-01
31
1
2011
2011-03-01
2011-03-31
2011-02-01
28
1
2011
2011-02-01
2011-02-28
2010-08-01
184
1
2010
2010-08-01
2010-08-31
2010-07-01
31
1
2010
2010-07-01
2010-07-31
2010-06-01
30
1
2010
2010-06-01
2010-06-30
2010-05-01
31
1
2010
2010-05-01
2010-05-31
2010-04-01
30
1
2010
2010-04-01
2010-04-30
where I defined the previous attendance month column with a lag function and then found the difference between that column and the the start_date in the diff column. This way I can check the gaps between the attendance months
I want this output of the attendance periods with the 90 day rule explained above:
id
start_date
end_date
1
1/04/2010
31/08/2010
1
1/02/2011
30/04/2012
1
1/07/2012
31/08/2012
Does any one have an idea of how to do this?
So far I was just able to define the difference between the attendance months but since this is a large data set I have not been able to find a solution to define the attendance periods without making a row to row analysis.
with [table] as (
select id, year, start_date, end_date,
lag(start_date) over (partition by id order by id, year, start_date) as prev_att_month,
start_date-prev_att_month as diff
from source
)
select *
from [table]
where id = 1
One method would be to use a windowed COUNT to count how many times a value greater than 90 has appeared in the diff column, which provides a unique group number. Then you can just group your data into those groups and get the MIN and MAX values:
WITH Grps AS(
SELECT V.id,
V.year,
V.start_date,
V.end_date,
V.prev_att_month,
V.diff,
COUNT(CASE WHEN diff > 90 THEN 1 END) OVER (PARTITION BY ID ORDER BY V.start_date ASC) AS Grp
FROM (VALUES(1,2012,CONVERT(date,'20120801'),CONVERT(date,'20120831'),CONVERT(date,'2012-07-01'),31),
(1,2012,CONVERT(date,'20120701'),CONVERT(date,'20120731'),CONVERT(date,'2012-04-01'),91),
(1,2012,CONVERT(date,'20120401'),CONVERT(date,'20120430'),CONVERT(date,'2012-03-01'),31),
(1,2012,CONVERT(date,'20120301'),CONVERT(date,'20120331'),CONVERT(date,'2012-02-01'),29),
(1,2012,CONVERT(date,'20120201'),CONVERT(date,'20120229'),CONVERT(date,'2012-01-01'),31),
(1,2012,CONVERT(date,'20120101'),CONVERT(date,'20120131'),CONVERT(date,'2011-12-01'),31),
(1,2011,CONVERT(date,'20111201'),CONVERT(date,'20111231'),CONVERT(date,'2011-11-01'),30),
(1,2011,CONVERT(date,'20111101'),CONVERT(date,'20111130'),CONVERT(date,'2011-10-01'),31),
(1,2011,CONVERT(date,'20111001'),CONVERT(date,'20111031'),CONVERT(date,'2011-09-01'),30),
(1,2011,CONVERT(date,'20110901'),CONVERT(date,'20110930'),CONVERT(date,'2011-08-01'),31),
(1,2011,CONVERT(date,'20110801'),CONVERT(date,'20110831'),CONVERT(date,'2011-07-01'),31),
(1,2011,CONVERT(date,'20110701'),CONVERT(date,'20110731'),CONVERT(date,'2011-05-01'),61),
(1,2011,CONVERT(date,'20110501'),CONVERT(date,'20110531'),CONVERT(date,'2011-04-01'),30),
(1,2011,CONVERT(date,'20110401'),CONVERT(date,'20110430'),CONVERT(date,'2011-03-01'),31),
(1,2011,CONVERT(date,'20110301'),CONVERT(date,'20110331'),CONVERT(date,'2011-02-01'),28),
(1,2011,CONVERT(date,'20110201'),CONVERT(date,'20110228'),CONVERT(date,'2010-08-01'),184),
(1,2010,CONVERT(date,'20100801'),CONVERT(date,'20100831'),CONVERT(date,'2010-07-01'),31),
(1,2010,CONVERT(date,'20100701'),CONVERT(date,'20100731'),CONVERT(date,'2010-06-01'),30),
(1,2010,CONVERT(date,'20100601'),CONVERT(date,'20100630'),CONVERT(date,'2010-05-01'),31),
(1,2010,CONVERT(date,'20100501'),CONVERT(date,'20100531'),CONVERT(date,'2010-04-01'),30),
(1,2010,CONVERT(date,'20100401'),CONVERT(date,'20100430'),NULL,NULL))V(id,year,start_date,end_date,prev_att_month,diff))
SELECT id,
MIN(Start_date) AS Start_date,
MAX(End_Date) AS End_Date
FROM Grps
GROUP BY Id,
Grp
ORDER BY id,
Start_date;

SQL join 3 tables on ID and dates

I have the below test data. There are 3 tables, sales table, sales delivery table and sales delivery months table.
I need to join all the tables together, so that the blue marked rows are connected to the blue marked rows and the red marked rows are connected to the red marked rows.
The join should use the From and To columns that exist in every table, I guess.
Update:
I have tried the following:
SELECT *
FROM Sales co
LEFT JOIN SalesDelivery cd
ON co.SalesID = cd.SalesID
AND cd.From BETWEEN co.From AND co.To
AND cd.To BETWEEN co.From AND co.To
LEFT JOIN SalesDeliveryMonth cdp
ON cd.SalesDeliveryID = cdp.SalesDeliveryID
AND cdp.From BETWEEN cd.From AND cd.To
AND cdp.To BETWEEN cd.From AND cd.To
Sales table:
SalesID Name Revenue From To Current row
100 New CRM 250000.00 1800-01-01 2018-10-03 0
100 New CRM 500000.00 2018-10-03 9999-12-31 1
SalesDelivery table:
SalesID SalesDeliveryID SalesDeliveryName Revenue SalesStart From To Current row
100 AB100 New CRM 250000.00 2018-07-01 1800-01-01 2018-10-03 0
100 AB100 New CRM 500000.00 2018-07-01 2018-10-03 9999-12-31 1
100 ABM100 New CRM - maintenance 0.00 2018-07-01 2018-10-03 9999-12-31 1
SalesDeliveryMonths table:
RevenueMonth Month SalesDeliveryID SalesID From To Current row
833333.3333 2018-07-01 AB100 100 1800-01-01 2018-10-04 0
166666.6667 2018-07-01 AB100 100 2018-10-04 9999-12-31 1
833333.3333 2018-08-01 AB100 100 1800-01-01 2018-10-04 0
166666.6667 2018-08-01 AB100 100 2018-10-04 9999-12-31 1
833333.3333 2018-09-01 AB100 100 1800-01-01 2018-10-04 0
166666.6667 2018-09-01 AB100 100 2018-10-04 9999-12-31 1

SQL Date Range Query - Table Comparison

I have two SQL Server tables containing the following information:
Table t_venues:
venue_id is unique
venue_id | start_date | end_date
1 | 01/01/2014 | 02/01/2014
2 | 05/01/2014 | 05/01/2014
3 | 09/01/2014 | 15/01/2014
4 | 20/01/2014 | 30/01/2014
Table t_venueuser:
venue_id is not unique
venue_id | start_date | end_date
1 | 02/01/2014 | 02/01/2014
2 | 05/01/2014 | 05/01/2014
3 | 09/01/2014 | 10/01/2014
4 | 23/01/2014 | 25/01/2014
From these two tables I need to find the dates that haven't been selected for each range, so the output would look like this:
venue_id | start_date | end_date
1 | 01/01/2014 | 01/01/2014
3 | 11/01/2014 | 15/01/2014
4 | 20/01/2014 | 22/01/2014
4 | 26/01/2014 | 30/01/2014
I can compare the two tables and get the date ranges from t_venues to appear in my query using 'except' but I can't get the query to produce the non-selected dates. Any help would be appreciated.
Calendar Table!
Another perfect candidate for a calendar table. If you can't be bothered to search for one, here's one I made earlier.
Setup Data
DECLARE #t_venues table (
venue_id int
, start_date date
, end_date date
);
INSERT INTO #t_venues (venue_id, start_date, end_date)
VALUES (1, '2014-01-01', '2014-01-02')
, (2, '2014-01-05', '2014-01-05')
, (3, '2014-01-09', '2014-01-15')
, (4, '2014-01-20', '2014-01-30')
;
DECLARE #t_venueuser table (
venue_id int
, start_date date
, end_date date
);
INSERT INTO #t_venueuser (venue_id, start_date, end_date)
VALUES (1, '2014-01-02', '2014-01-02')
, (2, '2014-01-05', '2014-01-05')
, (3, '2014-01-09', '2014-01-10')
, (4, '2014-01-23', '2014-01-25')
;
The Query
SELECT t_venues.venue_id
, calendar.the_date
, CASE WHEN t_venueuser.venue_id IS NULL THEN 1 ELSE 0 END As is_available
FROM dbo.calendar /* see: http://gvee.co.uk/files/sql/dbo.numbers%20&%20dbo.calendar.sql for an example */
INNER
JOIN #t_venues As t_venues
ON t_venues.start_date <= calendar.the_date
AND t_venues.end_date >= calendar.the_date
LEFT
JOIN #t_venueuser As t_venueuser
ON t_venueuser.venue_id = t_venues.venue_id
AND t_venueuser.start_date <= calendar.the_date
AND t_venueuser.end_date >= calendar.the_date
ORDER
BY t_venues.venue_id
, calendar.the_date
;
The Result
venue_id the_date is_available
----------- ----------------------- ------------
1 2014-01-01 00:00:00.000 1
1 2014-01-02 00:00:00.000 0
2 2014-01-05 00:00:00.000 0
3 2014-01-09 00:00:00.000 0
3 2014-01-10 00:00:00.000 0
3 2014-01-11 00:00:00.000 1
3 2014-01-12 00:00:00.000 1
3 2014-01-13 00:00:00.000 1
3 2014-01-14 00:00:00.000 1
3 2014-01-15 00:00:00.000 1
4 2014-01-20 00:00:00.000 1
4 2014-01-21 00:00:00.000 1
4 2014-01-22 00:00:00.000 1
4 2014-01-23 00:00:00.000 0
4 2014-01-24 00:00:00.000 0
4 2014-01-25 00:00:00.000 0
4 2014-01-26 00:00:00.000 1
4 2014-01-27 00:00:00.000 1
4 2014-01-28 00:00:00.000 1
4 2014-01-29 00:00:00.000 1
4 2014-01-30 00:00:00.000 1
(21 row(s) affected)
The Explanation
Our calendar tables contains an entry for every date.
We join our t_venues (as an aside, if you have the choice, lose the t_ prefix!) to return every day between our start_date and end_date. Example output for venue_id=4 for just this join:
venue_id the_date
----------- -----------------------
4 2014-01-20 00:00:00.000
4 2014-01-21 00:00:00.000
4 2014-01-22 00:00:00.000
4 2014-01-23 00:00:00.000
4 2014-01-24 00:00:00.000
4 2014-01-25 00:00:00.000
4 2014-01-26 00:00:00.000
4 2014-01-27 00:00:00.000
4 2014-01-28 00:00:00.000
4 2014-01-29 00:00:00.000
4 2014-01-30 00:00:00.000
(11 row(s) affected)
Now we have one row per day, we [outer] join our t_venueuser table. We join this in much the same manner as before, but with one added twist: we need to join based on the venue_id too!
Running this for venue_id=4 gives this result:
venue_id the_date t_venueuser_venue_id
----------- ----------------------- --------------------
4 2014-01-20 00:00:00.000 NULL
4 2014-01-21 00:00:00.000 NULL
4 2014-01-22 00:00:00.000 NULL
4 2014-01-23 00:00:00.000 4
4 2014-01-24 00:00:00.000 4
4 2014-01-25 00:00:00.000 4
4 2014-01-26 00:00:00.000 NULL
4 2014-01-27 00:00:00.000 NULL
4 2014-01-28 00:00:00.000 NULL
4 2014-01-29 00:00:00.000 NULL
4 2014-01-30 00:00:00.000 NULL
(11 row(s) affected)
See how we have a NULL value for rows where there is no t_venueuser record. Genius, no? ;-)
So in my first query I gave you a quick CASE statement that shows availability (1=available, 0=not available). This is for illustration only, but could be useful to you.
You can then either wrap the query up and then apply an extra filter on this calculated column or simply add a where clause in: WHERE t_venueuser.venue_id IS NULL and that will do the same trick.
This is a complete hack, but it gives the results you require, I've only tested it on the data you provided so there may well be gotchas with larger sets.
In general what you are looking at solving here is a variation of gaps and islands problem ,this is (briefly) a sequence where some items are missing. The missing items are referred as gaps and the existing items are referred as islands. If you would like to understand this issue in general check a few of the articles:
Simple talk article
blogs.MSDN article
SO answers tagged gaps-and-islands
Code:
;with dates as
(
SELECT vdates.venue_id,
vdates.vdate
FROM ( SELECT DATEADD(d,sv.number,v.start_date) vdate
, v.venue_id
FROM t_venues v
INNER JOIN master..spt_values sv
ON sv.type='P'
AND sv.number BETWEEN 0 AND datediff(d, v.start_date, v.end_date)) vdates
LEFT JOIN t_venueuser vu
ON vdates.vdate >= vu.start_date
AND vdates.vdate <= vu.end_date
AND vdates.venue_id = vu.venue_id
WHERE ISNULL(vu.venue_id,-1) = -1
)
SELECT venue_id, ISNULL([1],[2]) StartDate, [2] EndDate
FROM (SELECT venue_id, rDate, ROW_NUMBER() OVER (PARTITION BY venue_id, DateType ORDER BY rDate) AS rType, DateType as dType
FROM( SELECT d1.venue_id
,d1.vdate AS rDate
,'1' AS DateType
FROM dates AS d1
LEFT JOIN dates AS d0
ON DATEADD(d,-1,d1.vdate) = d0.vdate
LEFT JOIN dates AS d2
ON DATEADD(d,1,d1.vdate) = d2.vdate
WHERE CASE ISNULL(d2.vdate, '01 Jan 1753') WHEN '01 Jan 1753' THEN '2' ELSE '1' END = 1
AND ISNULL(d0.vdate, '01 Jan 1753') = '01 Jan 1753'
UNION
SELECT d1.venue_id
,ISNULL(d2.vdate,d1.vdate)
,'2'
FROM dates AS d1
LEFT JOIN dates AS d2
ON DATEADD(d,1,d1.vdate) = d2.vdate
WHERE CASE ISNULL(d2.vdate, '01 Jan 1753') WHEN '01 Jan 1753' THEN '2' ELSE '1' END = 2
) res
) src
PIVOT (MIN (rDate)
FOR dType IN
( [1], [2] )
) AS pvt
Results:
venue_id StartDate EndDate
1 2014-01-01 2014-01-01
3 2014-01-11 2014-01-15
4 2014-01-20 2014-01-22
4 2014-01-26 2014-01-30

select min booking date by each customer

I have 2 tables, CustomerMaster, EntBookings, now for each customer, I have record of each of their booking(s) in EntBookings table.
What I want is, to find out the first date of booking for all customers.
What I tried,
SELECT cm.CustomerCode,
cm.CustomerName,
MIN(eb.BookingDate) OVER(PARTITION BY eb.BookingByCustomer)
FROM EntBookings eb
INNER JOIN CustomerMaster cm
ON cm.CustomerCode = eb.BookingByCustomer
GROUP BY
cm.CustomerCode,
cm.CustomerName,
eb.BookingByCustomer,
eb.BookingDate
It gives the result as
Cust1 ABC 2013-01-24 00:00:00.000
Cust2 ABCD 2013-01-24 00:00:00.000
Cust2 ABCD 2013-01-24 00:00:00.000
Cust3 BCD 2013-01-25 00:00:00.000
Cust4 BCDE 2013-01-24 00:00:00.000
Cust7 DEF 2013-01-02 00:00:00.000
I don't know why for Cust2, its showing 2 rows? How can I get it to return just one row?
If I do this..
SELECT cm.CustomerCode,
cm.CustomerName,
MIN(eb.BookingDate) OVER(PARTITION BY eb.BookingDate)
FROM EntBookings eb
INNER JOIN CustomerMaster cm
ON cm.CustomerCode = eb.BookingByCustomer
GROUP BY
cm.CustomerCode,
cm.CustomerName,
eb.BookingByCustomer,
eb.BookingDate
I get this...
Cust7 DEF 2013-01-02 00:00:00.000
Cust1 ABC 2013-01-24 00:00:00.000
Cust2 ABCD 2013-01-24 00:00:00.000
Cust4 BCDE 2013-01-24 00:00:00.000
Cust3 BCD 2013-01-25 00:00:00.000
Cust2 ABCD 2013-02-13 00:00:00.000
In this case, though I am still getting 2 rows for Cust2, dates are wrong, it should be 2013-01-24 00:00:00.000 (as in first case), but that is returning 2 rows. So, both of these queries are incorrect. Can anyone help me figure out what I am doing wrong?
You are complicating things and you are grouping too much. It seems you only want to group on customer code and name.
SELECT
cm.CustomerCode,
cm.CustomerName,
MIN(eb.BookingDate)
FROM
EntBookings eb
JOIN
CustomerMaster cm ON
cm.CustomerCode = eb.BookingByCustomer
GROUP BY
cm.CustomerCode, cm.CustomerName

SQL Server query join several tables

I have a query that I don't think should be that hard to make, however, I've spent a lot of time on it now and still can't get it the way I want, so I hope someone here can help me.
Basically, I need to create a report that will give a value for each month, for each area. However, not all areas deliver data each month; in that case the view should return NULL for that month and area. So, the view need to look something like this:
Month Area Value
2012-08-01 Area1 2
2012-08-01 Area2 3
2012-09-01 Area1 3
2012-09-01 Area2 NULL
My data table looks something like this
Date Area Value
2012-08-01 Area1 2
2012-08-01 Area2 3
2012-09-01 Area1 3 -- Notice that Area2 is not present for September here
I have a table with all the available areas
Furthermore, I have created a table-valued function that returns all dates from a given date until now.
For example this statement
SELECT * FROM Periods_Months('2012-01-01')
would return 8 records like:
DateValue Year Month YearMonth
2012-01-01 00:00:00.000 2012 1 20121
2012-02-01 00:00:00.000 2012 2 20122
2012-03-01 00:00:00.000 2012 3 20123
2012-04-01 00:00:00.000 2012 4 20124
2012-05-01 00:00:00.000 2012 5 20125
2012-06-01 00:00:00.000 2012 6 20126
2012-07-01 00:00:00.000 2012 7 20127
2012-08-01 00:00:00.000 2012 8 20128
Based on the suggestions, my query now looks like this:
WITH months AS (
SELECT DateValue, YearMonth FROM Periods_Months('2011-01-01')
)
select m.DateValue
,CAST(DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,m.DateValue)+1,0)) AS Date) AS DateReported -- Get last day in month
,ResponseTime AS Value
,g.ExternalId
from GISDB.dbo.GisObjects g
CROSS JOIN months m
LEFT OUTER JOIN
( -- SELECT data from data table, grouped by area and month
SELECT dbo.YearMonth(CloseDate) AS YearMonth
,MAX(CloseDate) AS LastDate
,GisObjectId
,SUM(DATEDIFF(HH,RegDate,CloseDate)) AS ResponseTime -- calculate response time between start and end data (the value we need)
FROM DataTable
WHERE CloseDate IS NOT NULL
AND GisObjectId IS NOT NULL
GROUP BY GisObjectId, dbo.YearMonth(CloseDate) -- group by area and month
) c
ON g.ObjectId = c.GisObjectId AND c.YearMonth = m.YearMonth
WHERE g.CompanyId = 3 AND g.ObjectTypeId = 1 -- reduce the GIS objects that we compare to
ORDER BY m.DateValue, g.ObjectId
But the result is this (Value is always NULL):
DateValue DateReported Value ExternalId
2011-01-01 00:00:00.000 31-01-2011 NULL 9994
2011-01-01 00:00:00.000 31-01-2011 NULL 9993
2011-01-01 00:00:00.000 31-01-2011 NULL 9992
2011-01-01 00:00:00.000 31-01-2011 NULL 9991
2011-01-01 00:00:00.000 31-01-2011 NULL 2339
2011-01-01 00:00:00.000 31-01-2011 NULL 2338
2011-01-01 00:00:00.000 31-01-2011 NULL 2337
2011-01-01 00:00:00.000 31-01-2011 NULL 2336
2011-01-01 00:00:00.000 31-01-2011 NULL 2335
2011-01-01 00:00:00.000 31-01-2011 NULL 2334
2011-01-01 00:00:00.000 31-01-2011 NULL 2327
2011-01-01 00:00:00.000 31-01-2011 NULL 2326
2011-01-01 00:00:00.000 31-01-2011 NULL 2325
2011-01-01 00:00:00.000 31-01-2011 NULL 2324
2011-01-01 00:00:00.000 31-01-2011 NULL 2323
2011-01-01 00:00:00.000 31-01-2011 NULL 2322
etc.
I suppose you have a table with all your areas, which I call area_table.
WITH month_table AS (
SELECT dateValue FROM Periods_Months('2012-01-01')
)
select * from area_table
CROSS JOIN month_table
LEFT OUTER JOIN myValueTable
ON area_table.name = myValueTable.area
AND myValueTable.date = left(convert(varchar(30),month_table.dateValue,120),10)
ORDER BY myValueTable.Month, myValueTable.area
Suppose Areas is your table for all available areas, t - is your data table:
SELECT pm.dateValue,Ar.Area, t.value
FROM Periods_Months('2012-01-01') pm, Areas ar
left join t on (pm.dateValue=t.Date) and (ar.Area=t.Area)
order by pm.DateValue,ar.Area