SQL Server query join several tables - sql

I have a query that I don't think should be that hard to make, however, I've spent a lot of time on it now and still can't get it the way I want, so I hope someone here can help me.
Basically, I need to create a report that will give a value for each month, for each area. However, not all areas deliver data each month; in that case the view should return NULL for that month and area. So, the view needs to look something like this:
Month Area Value
2012-08-01 Area1 2
2012-08-01 Area2 3
2012-09-01 Area1 3
2012-09-01 Area2 NULL
My data table looks something like this
Date Area Value
2012-08-01 Area1 2
2012-08-01 Area2 3
2012-09-01 Area1 3 -- Notice that Area2 is not present for September here
I have a table with all the available areas
Furthermore, I have created a table-valued function that returns all dates from a given date until now.
For example this statement
SELECT * FROM Periods_Months('2012-01-01')
would return 8 records like:
DateValue Year Month YearMonth
2012-01-01 00:00:00.000 2012 1 20121
2012-02-01 00:00:00.000 2012 2 20122
2012-03-01 00:00:00.000 2012 3 20123
2012-04-01 00:00:00.000 2012 4 20124
2012-05-01 00:00:00.000 2012 5 20125
2012-06-01 00:00:00.000 2012 6 20126
2012-07-01 00:00:00.000 2012 7 20127
2012-08-01 00:00:00.000 2012 8 20128
Based on the suggestions, my query now looks like this:
WITH months AS (
SELECT DateValue, YearMonth FROM Periods_Months('2011-01-01')
)
select m.DateValue
,CAST(DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,m.DateValue)+1,0)) AS Date) AS DateReported -- Get last day in month
,ResponseTime AS Value
,g.ExternalId
from GISDB.dbo.GisObjects g
CROSS JOIN months m
LEFT OUTER JOIN
( -- SELECT data from data table, grouped by area and month
SELECT dbo.YearMonth(CloseDate) AS YearMonth
,MAX(CloseDate) AS LastDate
,GisObjectId
,SUM(DATEDIFF(HH,RegDate,CloseDate)) AS ResponseTime -- calculate response time between start and end data (the value we need)
FROM DataTable
WHERE CloseDate IS NOT NULL
AND GisObjectId IS NOT NULL
GROUP BY GisObjectId, dbo.YearMonth(CloseDate) -- group by area and month
) c
ON g.ObjectId = c.GisObjectId AND c.YearMonth = m.YearMonth
WHERE g.CompanyId = 3 AND g.ObjectTypeId = 1 -- reduce the GIS objects that we compare to
ORDER BY m.DateValue, g.ObjectId
But the result is this (Value is always NULL):
DateValue DateReported Value ExternalId
2011-01-01 00:00:00.000 31-01-2011 NULL 9994
2011-01-01 00:00:00.000 31-01-2011 NULL 9993
2011-01-01 00:00:00.000 31-01-2011 NULL 9992
2011-01-01 00:00:00.000 31-01-2011 NULL 9991
2011-01-01 00:00:00.000 31-01-2011 NULL 2339
2011-01-01 00:00:00.000 31-01-2011 NULL 2338
2011-01-01 00:00:00.000 31-01-2011 NULL 2337
2011-01-01 00:00:00.000 31-01-2011 NULL 2336
2011-01-01 00:00:00.000 31-01-2011 NULL 2335
2011-01-01 00:00:00.000 31-01-2011 NULL 2334
2011-01-01 00:00:00.000 31-01-2011 NULL 2327
2011-01-01 00:00:00.000 31-01-2011 NULL 2326
2011-01-01 00:00:00.000 31-01-2011 NULL 2325
2011-01-01 00:00:00.000 31-01-2011 NULL 2324
2011-01-01 00:00:00.000 31-01-2011 NULL 2323
2011-01-01 00:00:00.000 31-01-2011 NULL 2322
etc.

I suppose you have a table with all your areas, which I call area_table.
WITH month_table AS (
SELECT dateValue FROM Periods_Months('2012-01-01')
)
select * from area_table
CROSS JOIN month_table
LEFT OUTER JOIN myValueTable
ON area_table.name = myValueTable.area
AND myValueTable.date = left(convert(varchar(30),month_table.dateValue,120),10)
ORDER BY month_table.dateValue, area_table.name

Suppose Areas is your table for all available areas, t - is your data table:
SELECT pm.dateValue,Ar.Area, t.value
FROM Periods_Months('2012-01-01') pm
CROSS JOIN Areas ar
left join t on (pm.dateValue=t.Date) and (ar.Area=t.Area)
order by pm.DateValue,ar.Area
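To see the pattern from both answers in isolation, here is a minimal sketch that runs the cross join plus left join against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the table names areas, months and data are made up for the illustration (months replaces the Periods_Months function).

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE areas  (area TEXT PRIMARY KEY);
    CREATE TABLE months (month TEXT PRIMARY KEY);
    CREATE TABLE data   (date TEXT, area TEXT, value INTEGER);
    INSERT INTO areas  VALUES ('Area1'), ('Area2');
    INSERT INTO months VALUES ('2012-08-01'), ('2012-09-01');
    INSERT INTO data   VALUES ('2012-08-01', 'Area1', 2),
                              ('2012-08-01', 'Area2', 3),
                              ('2012-09-01', 'Area1', 3);
""")

# CROSS JOIN builds every (month, area) pair; the LEFT JOIN then attaches
# a value where one exists and leaves NULL (None in Python) where it does not.
report = conn.execute("""
    SELECT m.month, a.area, d.value
    FROM months AS m
    CROSS JOIN areas AS a
    LEFT JOIN data AS d
           ON d.date = m.month
          AND d.area = a.area
    ORDER BY m.month, a.area
""").fetchall()

for row in report:
    print(row)
```

The last row comes back as ('2012-09-01', 'Area2', None), which is the NULL row the question asks for. If Value is NULL on every row, as in the updated question, the keys in the ON clause are probably not matching, so it is worth comparing the two YearMonth values side by side.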

Related

How do I join a sparse table and fill rows between in SQL Server

How can I apply weights from a one table to another [Port] where the weight table has sparse dates?
[Port] table
utcDT UsdPnl
-----------------------------------------------
2012-03-09 00:00:00.000 -0.00581815226439161
2012-03-11 00:00:00.000 -0.000535272460588547
2012-03-12 00:00:00.000 -0.00353079778650661
2012-03-13 00:00:00.000 0.00232882689252497
2012-03-14 00:00:00.000 -0.0102592811199384
2012-03-15 00:00:00.000 0.00254451559598693
2012-03-16 00:00:00.000 0.0146718613139845
2012-03-18 00:00:00.000 0.000425144543842752
2012-03-19 00:00:00.000 -0.00388548271428044
2012-03-20 00:00:00.000 -0.00662423680184768
2012-03-21 00:00:00.000 0.00405506208635343
2012-03-22 00:00:00.000 -0.000814822806982203
2012-03-23 00:00:00.000 -0.00289523953346103
2012-03-25 00:00:00.000 0.00204150859774465
2012-03-26 00:00:00.000 -0.00641635182718787
2012-03-27 00:00:00.000 -0.00107168420738448
2012-03-28 00:00:00.000 0.00131000520696153
2012-03-29 00:00:00.000 0.0008223678402638
2012-03-30 00:00:00.000 -0.00255345945390133
2012-04-01 00:00:00.000 -0.00337792814650089
[Weights] table
utcDT Weight
--------------------------------
2012-03-09 00:00:00.000 1
2012-03-20 00:00:00.000 3
2012-03-29 00:00:00.000 7
So, I want to use the weights as if I had a full table like this below. i.e. change to new weight on first day it appears in [Weights] table:
utcDT UsedWeight
----------------------------------
2012-03-09 00:00:00.000 1
2012-03-11 00:00:00.000 1
2012-03-12 00:00:00.000 1
2012-03-13 00:00:00.000 1
2012-03-14 00:00:00.000 1
2012-03-15 00:00:00.000 1
2012-03-16 00:00:00.000 1
2012-03-18 00:00:00.000 1
2012-03-19 00:00:00.000 1
2012-03-20 00:00:00.000 3
2012-03-21 00:00:00.000 3
2012-03-22 00:00:00.000 3
2012-03-23 00:00:00.000 3
2012-03-25 00:00:00.000 3
2012-03-26 00:00:00.000 3
2012-03-27 00:00:00.000 3
2012-03-28 00:00:00.000 3
2012-03-29 00:00:00.000 7
2012-03-30 00:00:00.000 7
2012-04-01 00:00:00.000 7
You can use apply:
select p.*, w.*
from port p outer apply
(select top (1) w.*
from weights w
where w.utcDT <= p.utcDT
order by w.utcDT desc
) w;
outer apply is usually pretty efficient, if you have the right indexes. In this case, the right index is on weights(utcDT desc).
You can use lead() in a subquery to associate the next date a weight changes to each weights record, and then join with port using an inequality condition on the dates:
select p.utcDt, w.weight
from port p
inner join (
select utcDt, weight, lead(utcDt) over(order by utcDt) lead_utcDt from weights
) w
on p.utcDt >= w.utcDt
and (w.lead_utcDt is null or p.utcDt < w.lead_utcDt)
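Both answers amount to "take the latest weight dated on or before each port date". A minimal sketch of that lookup, using Python's built-in sqlite3 module as a stand-in for SQL Server: SQLite has no OUTER APPLY, so a correlated subquery with LIMIT 1 plays the role of the TOP (1) apply. Only a subset of the port dates is loaded and the UsdPnl column is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for SQL Server (assumption)
conn.executescript("""
    CREATE TABLE port    (utcDT TEXT);
    CREATE TABLE weights (utcDT TEXT, Weight INTEGER);
    INSERT INTO port VALUES ('2012-03-09'), ('2012-03-19'), ('2012-03-20'),
                            ('2012-03-28'), ('2012-03-29'), ('2012-04-01');
    INSERT INTO weights VALUES ('2012-03-09', 1), ('2012-03-20', 3), ('2012-03-29', 7);
""")

# For each port row, pick the most recent weight whose date is not in the future.
used = conn.execute("""
    SELECT p.utcDT,
           (SELECT w.Weight
              FROM weights AS w
             WHERE w.utcDT <= p.utcDT
             ORDER BY w.utcDT DESC
             LIMIT 1) AS UsedWeight
    FROM port AS p
    ORDER BY p.utcDT
""").fetchall()

for row in used:
    print(row)
```

The weight switches from 1 to 3 on 2012-03-20 and from 3 to 7 on 2012-03-29, matching the table in the question. A port date earlier than the first weight would get NULL, the same as with outer apply.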

How to use next date column value to calculate delta for current column

I have a temp table
BusinessDate SSQ_CompScore
2011-01-05 00:00:00.000 41
2011-01-06 00:00:00.000 6
2011-01-07 00:00:00.000 1
2011-01-10 00:00:00.000 8
2011-01-11 00:00:00.000 48
2011-01-12 00:00:00.000 50
2011-01-13 00:00:00.000 59
I need to calculate delta for each current date.
I have prepared a solution, but it doesn't work when the dates are not consecutive.
Can you please help?
select t1.businessdate, t1.ssq_compscore, (t2.ssq_compscore - t1.ssq_compscore) as delta
from #temp t1
left join #temp t2 on t1.businessdate = DATEADD(dd,1,t2.businessdate)
where t1.businessdate >='20180814'
Result set should be as
BusinessDate SSQ_CompScore Delta
2011-01-05 00:00:00.000 41 NULL
2011-01-06 00:00:00.000 6 35
2011-01-07 00:00:00.000 1 5
2011-01-10 00:00:00.000 8 7
2011-01-11 00:00:00.000 48 40
2011-01-12 00:00:00.000 50 2
2011-01-13 00:00:00.000 59 9
Not sure if this is the most efficient way but it works as far as I see
SELECT t1.businessdate, t1.SSQ_CompScore,
t1.SSQ_CompScore - (SELECT TOP (1) t2.SSQ_CompScore -- score on the closest earlier date
FROM #temp t2
WHERE t2.businessdate < t1.businessdate
ORDER BY t2.businessdate DESC) AS delta
FROM #temp t1
ORDER BY t1.businessdate ASC
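The previous-row lookup can be checked against the sample data with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server, and the table name scores is made up for the example. ABS() is an addition here: the expected result in the question lists positive deltas only, so the sketch assumes the absolute difference is what is wanted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in (assumption)
conn.execute("CREATE TABLE scores (businessdate TEXT, ssq_compscore INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", [
    ("2011-01-05", 41), ("2011-01-06", 6), ("2011-01-07", 1), ("2011-01-10", 8),
    ("2011-01-11", 48), ("2011-01-12", 50), ("2011-01-13", 59),
])

# The subquery finds the score on the closest earlier date, so gaps such as
# the jump from 2011-01-07 to 2011-01-10 do not break the comparison.
deltas = conn.execute("""
    SELECT t1.businessdate,
           t1.ssq_compscore,
           ABS(t1.ssq_compscore - (SELECT t2.ssq_compscore
                                     FROM scores AS t2
                                    WHERE t2.businessdate < t1.businessdate
                                    ORDER BY t2.businessdate DESC
                                    LIMIT 1)) AS delta
    FROM scores AS t1
    ORDER BY t1.businessdate
""").fetchall()

for row in deltas:
    print(row)
```

The first row has no earlier date, so its delta stays NULL, as in the expected result.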

Oracle SQL query to get sales by date range

I am looking to write an SQL query that will provide me sales broken into date ranges, but it is a bit above my SQL knowledge.
I have a table of date ranges by customers as follows:
Cust Product startdate enddate
-----------------------------------
A 123 2011-01-01 2011-12-31
A 124 2011-01-01 2011-05-01
A 125 2011-01-01 2011-05-01
B 123 2011-01-01 2011-03-01
B 124 2011-01-01 2011-03-01
C 125 2011-02-02 2011-05-01
and sales stored as follows:
Cust Product date qty
-----------------------------------
A 123 2011-04-08 1
A 124 2011-01-01 12
A 125 2011-05-01 2
B 123 2011-01-04 3
B 124 2011-02-01 5
C 125 2011-03-01 80
The results should look something like:
Cust Product startdate enddate qty
-----------------------------------------
A 124 2011-01-01 2011-02-01 12
B 123 2011-01-01 2011-02-01 3
B 124 2011-02-02 2011-03-01 5
A 123 2011-03-02 2011-05-01 1
C 125 2011-03-02 2011-05-01 80
A 125 2011-05-02 2011-12-31 2
Any advice gratefully received.
I made the example in MySQL because Oracle server was down. But query is the same.
SQL Fiddle Demo
SELECT R.*, S.*
FROM dRanges R
JOIN Sales S
ON S.`date` >= R.`startdate`
AND S.`date` <= R.`enddate`
AND S.`Cust` = R.`Cust`
AND S.`Product` = R.`Product`
But you have to be careful that the ranges don't overlap; otherwise the same Sales row can appear in two ranges.
EDIT Please explain the logic here

SQL Server : compare rows, exclude from results when some values are the same

I have the following SQL Server query problem.
If a row's Issue_Date equals the Maturity_Date of another row, and both rows have the same ID and Amount_USD, then neither of these rows should be displayed.
Here is a simplified version of my table:
ID ISSUE_DATE MATURITY_DATE AMOUNT_USD
1 2010-01-01 00:00:00.000 2015-12-01 00:00:00.000 5000
1 2010-01-01 00:00:00.000 2001-09-19 00:00:00.000 700
2 2014-04-09 00:00:00.000 2019-04-09 00:00:00.000 400
1 2015-12-01 00:00:00.000 2016-12-31 00:00:00.000 5000
5 2015-02-24 00:00:00.000 2015-02-24 00:00:00.000 8000
4 2012-11-29 00:00:00.000 2015-11-29 00:00:00.000 10000
3 2015-01-21 00:00:00.000 2018-01-21 00:00:00.000 17500
2 2015-02-02 00:00:00.000 2015-12-05 00:00:00.000 12000
1 2015-01-12 00:00:00.000 2018-01-12 00:00:00.000 18000
2 2015-12-05 00:00:00.000 2016-01-10 00:00:00.000 12000
Result should be:
ID ISSUE_DATE MATURITY_DATE AMOUNT_USD
1 2010-01-01 00:00:00.000 2001-09-19 00:00:00.000 700
2 2014-04-09 00:00:00.000 2019-04-09 00:00:00.000 400
5 2015-02-24 00:00:00.000 2015-02-24 00:00:00.000 8000
4 2012-11-29 00:00:00.000 2015-11-29 00:00:00.000 10000
3 2015-01-21 00:00:00.000 2018-01-21 00:00:00.000 17500
1 2015-01-12 00:00:00.000 2018-01-12 00:00:00.000 18000
I tried with self join, but I do not get right result.
Thanks in advance!
Can you try something like this? 'not exists' is the way of doing it.
select * from table t1 where not exists (select 'x' from table t2 where t1.issue_date = t2.maturity_date and t1.amount_usd=t2.amount_usd and t1.id = t2.id)
I'd think about making subquery of all the dupes and then eliminating them from the first table like so:
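select t1.ID
, t1.ISSUE_DATE
, t1.MATURITY_DATE
, t1.AMOUNT_USD
FROM
t1
LEFT JOIN
(select a.ID
, a.AMOUNT_USD
, a.ISSUE_DATE AS LINK_DATE -- the date shared by the two rows of a pair
FROM
t1 a
INNER JOIN
t1 b
ON a.ID = b.ID
AND a.AMOUNT_USD = b.AMOUNT_USD
AND a.ISSUE_DATE = b.MATURITY_DATE
AND b.ISSUE_DATE <> b.MATURITY_DATE -- a row must not pair with itself
) dupes
on
t1.ID = dupes.ID
AND t1.AMOUNT_USD = dupes.AMOUNT_USD
AND (t1.ISSUE_DATE = dupes.LINK_DATE OR t1.MATURITY_DATE = dupes.LINK_DATE)
WHERE dupes.ID IS NULL;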
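Since both rows of a matching pair have to disappear, the check needs to look in both directions, and a row whose issue and maturity dates are equal (ID 5 in the sample) must not match itself. Here is a rough sketch of that idea against the sample data, using Python's built-in sqlite3 module as a stand-in for SQL Server. The table name loans is made up, and SQLite's implicit rowid is used to tell rows apart; in SQL Server you would need a real key column for that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in (assumption)
conn.execute(
    "CREATE TABLE loans (id INTEGER, issue_date TEXT, maturity_date TEXT, amount_usd INTEGER)"
)
conn.executemany("INSERT INTO loans VALUES (?, ?, ?, ?)", [
    (1, "2010-01-01", "2015-12-01", 5000),
    (1, "2010-01-01", "2001-09-19", 700),
    (2, "2014-04-09", "2019-04-09", 400),
    (1, "2015-12-01", "2016-12-31", 5000),
    (5, "2015-02-24", "2015-02-24", 8000),
    (4, "2012-11-29", "2015-11-29", 10000),
    (3, "2015-01-21", "2018-01-21", 17500),
    (2, "2015-02-02", "2015-12-05", 12000),
    (1, "2015-01-12", "2018-01-12", 18000),
    (2, "2015-12-05", "2016-01-10", 12000),
])

# Keep a row only if no OTHER row with the same id and amount is chained to it,
# either ending on this row's issue date or starting on its maturity date.
kept = conn.execute("""
    SELECT t1.id, t1.issue_date, t1.maturity_date, t1.amount_usd
    FROM loans AS t1
    WHERE NOT EXISTS (
        SELECT 1
        FROM loans AS t2
        WHERE t2.rowid <> t1.rowid
          AND t2.id = t1.id
          AND t2.amount_usd = t1.amount_usd
          AND (t2.maturity_date = t1.issue_date
               OR t2.issue_date = t1.maturity_date)
    )
    ORDER BY t1.rowid
""").fetchall()

for row in kept:
    print(row)
```

On the sample data this drops the two 5000 rows for ID 1 and the two 12000 rows for ID 2, and keeps the six rows listed in the expected result.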

SQL Date Range Query - Table Comparison

I have two SQL Server tables containing the following information:
Table t_venues:
venue_id is unique
venue_id | start_date | end_date
1 | 01/01/2014 | 02/01/2014
2 | 05/01/2014 | 05/01/2014
3 | 09/01/2014 | 15/01/2014
4 | 20/01/2014 | 30/01/2014
Table t_venueuser:
venue_id is not unique
venue_id | start_date | end_date
1 | 02/01/2014 | 02/01/2014
2 | 05/01/2014 | 05/01/2014
3 | 09/01/2014 | 10/01/2014
4 | 23/01/2014 | 25/01/2014
From these two tables I need to find the dates that haven't been selected for each range, so the output would look like this:
venue_id | start_date | end_date
1 | 01/01/2014 | 01/01/2014
3 | 11/01/2014 | 15/01/2014
4 | 20/01/2014 | 22/01/2014
4 | 26/01/2014 | 30/01/2014
I can compare the two tables and get the date ranges from t_venues to appear in my query using 'except' but I can't get the query to produce the non-selected dates. Any help would be appreciated.
Calendar Table!
Another perfect candidate for a calendar table. If you can't be bothered to search for one, here's one I made earlier.
Setup Data
DECLARE @t_venues table (
venue_id int
, start_date date
, end_date date
);
INSERT INTO @t_venues (venue_id, start_date, end_date)
VALUES (1, '2014-01-01', '2014-01-02')
, (2, '2014-01-05', '2014-01-05')
, (3, '2014-01-09', '2014-01-15')
, (4, '2014-01-20', '2014-01-30')
;
DECLARE @t_venueuser table (
venue_id int
, start_date date
, end_date date
);
INSERT INTO @t_venueuser (venue_id, start_date, end_date)
VALUES (1, '2014-01-02', '2014-01-02')
, (2, '2014-01-05', '2014-01-05')
, (3, '2014-01-09', '2014-01-10')
, (4, '2014-01-23', '2014-01-25')
;
The Query
SELECT t_venues.venue_id
, calendar.the_date
, CASE WHEN t_venueuser.venue_id IS NULL THEN 1 ELSE 0 END As is_available
FROM dbo.calendar /* see: http://gvee.co.uk/files/sql/dbo.numbers%20&%20dbo.calendar.sql for an example */
INNER
JOIN @t_venues As t_venues
ON t_venues.start_date <= calendar.the_date
AND t_venues.end_date >= calendar.the_date
LEFT
JOIN @t_venueuser As t_venueuser
ON t_venueuser.venue_id = t_venues.venue_id
AND t_venueuser.start_date <= calendar.the_date
AND t_venueuser.end_date >= calendar.the_date
ORDER
BY t_venues.venue_id
, calendar.the_date
;
The Result
venue_id the_date is_available
----------- ----------------------- ------------
1 2014-01-01 00:00:00.000 1
1 2014-01-02 00:00:00.000 0
2 2014-01-05 00:00:00.000 0
3 2014-01-09 00:00:00.000 0
3 2014-01-10 00:00:00.000 0
3 2014-01-11 00:00:00.000 1
3 2014-01-12 00:00:00.000 1
3 2014-01-13 00:00:00.000 1
3 2014-01-14 00:00:00.000 1
3 2014-01-15 00:00:00.000 1
4 2014-01-20 00:00:00.000 1
4 2014-01-21 00:00:00.000 1
4 2014-01-22 00:00:00.000 1
4 2014-01-23 00:00:00.000 0
4 2014-01-24 00:00:00.000 0
4 2014-01-25 00:00:00.000 0
4 2014-01-26 00:00:00.000 1
4 2014-01-27 00:00:00.000 1
4 2014-01-28 00:00:00.000 1
4 2014-01-29 00:00:00.000 1
4 2014-01-30 00:00:00.000 1
(21 row(s) affected)
The Explanation
Our calendar table contains an entry for every date.
We join our t_venues (as an aside, if you have the choice, lose the t_ prefix!) to return every day between our start_date and end_date. Example output for venue_id=4 for just this join:
venue_id the_date
----------- -----------------------
4 2014-01-20 00:00:00.000
4 2014-01-21 00:00:00.000
4 2014-01-22 00:00:00.000
4 2014-01-23 00:00:00.000
4 2014-01-24 00:00:00.000
4 2014-01-25 00:00:00.000
4 2014-01-26 00:00:00.000
4 2014-01-27 00:00:00.000
4 2014-01-28 00:00:00.000
4 2014-01-29 00:00:00.000
4 2014-01-30 00:00:00.000
(11 row(s) affected)
Now we have one row per day, we [outer] join our t_venueuser table. We join this in much the same manner as before, but with one added twist: we need to join based on the venue_id too!
Running this for venue_id=4 gives this result:
venue_id the_date t_venueuser_venue_id
----------- ----------------------- --------------------
4 2014-01-20 00:00:00.000 NULL
4 2014-01-21 00:00:00.000 NULL
4 2014-01-22 00:00:00.000 NULL
4 2014-01-23 00:00:00.000 4
4 2014-01-24 00:00:00.000 4
4 2014-01-25 00:00:00.000 4
4 2014-01-26 00:00:00.000 NULL
4 2014-01-27 00:00:00.000 NULL
4 2014-01-28 00:00:00.000 NULL
4 2014-01-29 00:00:00.000 NULL
4 2014-01-30 00:00:00.000 NULL
(11 row(s) affected)
See how we have a NULL value for rows where there is no t_venueuser record. Genius, no? ;-)
So in my first query I gave you a quick CASE statement that shows availability (1=available, 0=not available). This is for illustration only, but could be useful to you.
You can then either wrap the query up and then apply an extra filter on this calculated column or simply add a where clause in: WHERE t_venueuser.venue_id IS NULL and that will do the same trick.
This is a complete hack, but it gives the results you require. I've only tested it on the data you provided, so there may well be gotchas with larger sets.
In general, what you are solving here is a variation of the gaps and islands problem: (briefly) a sequence where some items are missing. The missing items are referred to as gaps and the existing items as islands. If you would like to understand this issue in general, check a few of these articles:
Simple talk article
blogs.MSDN article
SO answers tagged gaps-and-islands
Code:
;with dates as
(
SELECT vdates.venue_id,
vdates.vdate
FROM ( SELECT DATEADD(d,sv.number,v.start_date) vdate
, v.venue_id
FROM t_venues v
INNER JOIN master..spt_values sv
ON sv.type='P'
AND sv.number BETWEEN 0 AND datediff(d, v.start_date, v.end_date)) vdates
LEFT JOIN t_venueuser vu
ON vdates.vdate >= vu.start_date
AND vdates.vdate <= vu.end_date
AND vdates.venue_id = vu.venue_id
WHERE ISNULL(vu.venue_id,-1) = -1
)
SELECT venue_id, ISNULL([1],[2]) StartDate, [2] EndDate
FROM (SELECT venue_id, rDate, ROW_NUMBER() OVER (PARTITION BY venue_id, DateType ORDER BY rDate) AS rType, DateType as dType
FROM( SELECT d1.venue_id
,d1.vdate AS rDate
,'1' AS DateType
FROM dates AS d1
LEFT JOIN dates AS d0
ON DATEADD(d,-1,d1.vdate) = d0.vdate
AND d0.venue_id = d1.venue_id
LEFT JOIN dates AS d2
ON DATEADD(d,1,d1.vdate) = d2.vdate
AND d2.venue_id = d1.venue_id
WHERE CASE ISNULL(d2.vdate, '01 Jan 1753') WHEN '01 Jan 1753' THEN '2' ELSE '1' END = 1
AND ISNULL(d0.vdate, '01 Jan 1753') = '01 Jan 1753'
UNION
SELECT d1.venue_id
,ISNULL(d2.vdate,d1.vdate)
,'2'
FROM dates AS d1
LEFT JOIN dates AS d2
ON DATEADD(d,1,d1.vdate) = d2.vdate
AND d2.venue_id = d1.venue_id
WHERE CASE ISNULL(d2.vdate, '01 Jan 1753') WHEN '01 Jan 1753' THEN '2' ELSE '1' END = 2
) res
) src
PIVOT (MIN (rDate)
FOR dType IN
( [1], [2] )
) AS pvt
Results:
venue_id StartDate EndDate
1 2014-01-01 2014-01-01
3 2014-01-11 2014-01-15
4 2014-01-20 2014-01-22
4 2014-01-26 2014-01-30
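If the SQL is hard to follow, the logic behind both answers can be sketched outside the database: expand each venue range into single days, drop the days that are booked, and merge the remaining consecutive days back into ranges. The Python below is only an illustration of that logic on the sample data, not a replacement for either query; the names days and free_ranges are made up.

```python
from datetime import date, timedelta

# venue_id -> (start_date, end_date), from t_venues
venues = {
    1: (date(2014, 1, 1), date(2014, 1, 2)),
    2: (date(2014, 1, 5), date(2014, 1, 5)),
    3: (date(2014, 1, 9), date(2014, 1, 15)),
    4: (date(2014, 1, 20), date(2014, 1, 30)),
}
# (venue_id, start_date, end_date), from t_venueuser
bookings = [
    (1, date(2014, 1, 2), date(2014, 1, 2)),
    (2, date(2014, 1, 5), date(2014, 1, 5)),
    (3, date(2014, 1, 9), date(2014, 1, 10)),
    (4, date(2014, 1, 23), date(2014, 1, 25)),
]

def days(start, end):
    """Yield every date from start to end, inclusive (the calendar-table step)."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)

def free_ranges(venues, bookings):
    """Return (venue_id, start, end) for each run of consecutive unbooked days."""
    booked = {(vid, d) for vid, s, e in bookings for d in days(s, e)}
    result = []
    for vid, (start, end) in sorted(venues.items()):
        run_start = prev = None
        for d in days(start, end):
            if (vid, d) in booked:
                if run_start is not None:       # a booked day closes the current run
                    result.append((vid, run_start, prev))
                    run_start = None
            else:
                if run_start is None:           # first free day opens a new run
                    run_start = d
                prev = d
        if run_start is not None:               # close a run that reaches the venue's end date
            result.append((vid, run_start, prev))
    return result

gaps = free_ranges(venues, bookings)
for vid, start, end in gaps:
    print(vid, start, end)
```

This produces the same four ranges as the Results listing above. Venue 2 does not appear because its only day is booked.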