I'm trying to join data between three slow change dimension type 2. When I query the result, the sort by date between the dimensions are not as expected.
I have the slow change dimensions below:
Table Subsidiaries
id
name
subsidiary
department
start_date_dep
end_date_dep
last_record_flg
1
John Doe
AL
Engineering
2005-10-01
2013-01-01
0
1
John Doe
AL
Sales
2013-01-01
2014-05-01
0
1
John Doe
NY
Sales
2014-05-01
1
38
Ivy Johnson
NY
Sales
2020-06-01
1
Table Functions
id
function
start_date_fun
end_date_fun
last_record_flg
1
operator
2005-10-01
2009-08-01
0
1
leader
2009-08-01
2011-10-01
0
1
manager
2011-10-01
2017-07-01
0
1
director
2017-07-01
1
38
operator
2020-06-01
1
Table Graduations
id
university_graduation
conclusion_date
last_record_flg
1
bachelor
15/12/2005
0
1
master
15/12/2008
1
38
bachelor
15/12/2014
1
The desired result is:
id
name
subsidiary
department
start_date_dep
end_date_dep
last_record_flg
function
start_date_fun
end_date_fun
last_record_flg
university_graduation
conclusion_date
last_record_flg
max_date
seq
start
end
last_record_flg
1
John Doe
AL
Engineering
2005-10-01
2013-01-01
0
operator
2005-10-01
2009-08-01
0
bachelor
2005-12-15
0
2005-12-15
1
2005-10-01
2008-12-15
0
1
John Doe
AL
Engineering
2005-10-01
2013-01-01
0
operator
2005-10-01
2009-08-01
0
master
2008-12-15
1
2008-12-15
1
2008-12-15
2009-08-01
0
1
John Doe
AL
Engineering
2005-10-01
2013-01-01
0
leader
2009-08-01
2011-10-01
0
master
2008-12-15
1
2009-08-01
1
2009-08-01
2011-10-01
0
1
John Doe
AL
Engineering
2005-10-01
2013-01-01
0
manager
2011-10-01
2017-07-01
0
master
2008-12-15
1
2011-10-01
1
2011-10-01
2013-01-01
0
1
John Doe
AL
Sales
2013-01-01
2014-05-01
0
manager
2011-10-01
2017-07-01
0
master
2008-12-15
1
2013-01-01
1
2013-01-01
2014-05-01
0
1
John Doe
NY
Sales
2014-05-01
NULL
1
manager
2011-10-01
2017-07-01
0
master
2008-12-15
1
2014-05-01
1
2014-05-01
2017-07-01
0
1
John Doe
NY
Sales
2014-05-01
NULL
1
director
2017-07-01
NULL
1
master
2008-12-15
1
2017-07-01
1
2017-07-01
NULL
1
38
Ivy Johnson
NY
Sales
2020-06-01
NULL
1
operator
2020-06-01
NULL
1
bachelor
2014-12-15
1
2020-06-01
1
2020-06-01
NULL
1
I tried with CROSS APPLY, but is returning only one line for each id. I'm trying with CASE WHEN but the query output is not exactly equal the desired result. In my return the column 'FUNCTION' and 'START_DATE_FUN' not follow the sequence (sort) presented in the desired result, the same occur for columns 'UNIVERSITY_GRADUATION' and 'CONCLUSION_DATE'.
The query:
select
*
from(
select
tb.*
,row_number() over(partition by tb.id,tb.max_date order by tb.max_date) as seq
,tb.max_date as [start]
,lead( tb.max_date ) over( partition by tb.id order by tb.max_date ) as [end]
,case when lead( tb.max_date ) over( partition by tb.id order by tb.max_date ) is null then 1 else 0 end as last_record_flg
from(
select
sb.id
,sb.[name]
,sb.subsidiary
,sb.department
,sb.start_date_dep
,sb.end_date_dep
,sb.last_record_flg as lr_sb
,fc.[function]
,fc.start_date_fun
,fc.end_date_fun
,fc.last_record_flg as lr_fc
,gd.university_graduation
,gd.end_date_grad
,gd.last_record_flg as lr_gd
,case
when sb.start_date_dep >= fc.start_date_fun and sb.start_date_dep >= gd.end_date_grad then sb.start_date_dep
when fc.start_date_fun >= sb.start_date_dep and fc.start_date_fun >= gd.end_date_grad then fc.start_date_fun
else gd.end_date_grad
end as max_date
from
#Subsidiaries as sb
left outer join #Functions as fc
on sb.id = fc.id
left outer join #Graduations as gd
on sb.id = gd.id
) as tb
) as tb2
where
tb2.seq = 1
Below the DDL:
create table #Subsidiaries (
id int
,[name] varchar(15)
,subsidiary varchar(2)
,department varchar(15)
,start_date_dep date
,end_date_dep date
,last_record_flg bit
)
go
insert into #Subsidiaries values
(1,'John Doe','AL','Engineering','2005-10-01','2013-01-01',0),
(1,'John Doe','AL','Sales','2013-01-01','2014-05-01',0),
(1,'John Doe','NY','Sales','2014-05-01',null,1),
(38,'Ivy Johnson','NY','Sales','2020-06-01',null,1)
go
create table #Functions (
id int
,[function] varchar(15)
,start_date_fun date
,end_date_fun date
,last_record_flg bit
)
go
insert into #Functions values
(1,'operator','2005-10-01','2009-08-01',0),
(1,'leader','2009-08-01','2011-10-01',0),
(1,'manager','2011-10-01','2017-07-01',0),
(1,'director','2017-07-01',null,1),
(38,'operator','2020-06-01',null,1)
go
create table #Graduations (
id int
,university_graduation varchar(15)
,end_date_grad date
,last_record_flg bit
)
go
insert into #Graduations values
(1,'bachelor','2005-12-15',0),
(1,'master','2008-12-15',1),
(38,'bachelor','2014-12-15',1)
go
Case when someone find the same difficult to join two or more SCD type 2, I could find a reference in this link https://sqlsunday.com/2014/11/30/joining-two-scd2-tables/ (SQL Sunday) that help me to build the query and use the range intervals in the join condition to return result as desired.
I have three tables in my database Sales, SalesPeople and Appliances.
Sales
SaleDate EmployeeID AppID Qty
---------- ---------- ----- -----------
2010-01-01 1412 150 1
2010-01-05 3231 110 1
2010-01-03 2920 110 2
2010-01-13 1412 100 1
2010-01-25 1235 150 2
2010-01-22 1235 100 2
2010-01-12 2920 150 3
2010-01-14 3231 100 1
2010-01-15 1235 300 1
2010-01-03 2920 200 2
2010-01-31 2920 310 1
2010-01-05 1412 420 1
2010-01-15 3231 400 2
SalesPeople
EmployeeID EmployeeName CommRate BaseSalary SupervisorID
---------- ------------------------------ ----------- ----------- ------------
1235 Linda Smith 15 1200 1412
1412 Anne Green 12 1800 NULL
2920 Charles Brown 10 1150 1412
3231 Harry Purple 18 1700 1412
Appliances
ID AppType StoreID Cost Price
---- -------------------- ------- ------------- -------------
100 Refrigerator 22 150 250
110 Refrigerator 20 175 300
150 Television 27 225 340
200 Microwave Oven 22 120 180
300 Washer 27 200 325
310 Washer 22 280 400
400 Dryer 20 150 220
420 Dryer 22 240 360
How can I obtain this result? (That displays the profitability of each of the salespeople ordered from the most profitable to the least. Gross is simply the sum of the quantity of items sold multiplied by the price. Commission is calculated from the gross minus the cost of those items (i.e. from
qty*(price-cost)). Net profit is the total profit minus commission.)
Name Gross Commission Net Profit
------------- ----- ---------- ---------
Charles Brown 2380 83.5 751.5
Linda Smith 1505 83.25 471.75
Harry Purple 990 65.7 299.3
Anne Green 950 40.2 294.8
My attempt:
CREATE PROC Profitability AS
SELECT
sp.EmployeeName, (sum(s.Qty) * a.Price) as [Gross],
[Gross] - a.Cost, as [Commision],
SOMETHING as [Net Profit]
FROM
Salespeople sp, Appliances a, Sales s
WHERE
s.AppID = a.ID
AND sp.EmployeeID = s.EmployeeID
GROUP BY
sp.EmployeeName
GO
EXEC Profitability
Simple rule: Never use commas in the FROM clause. Always use explicit JOIN syntax.
In addition to fixing the JOIN syntax, your query needs a few other enhancements for the aggregation functions:
SELECT sp.EmployeeName, sum(s.Qty * a.Price) as Gross,
SUM(s.Qty * (a.Price - a.Cost)) * sp.CommRate / 100.0 as Commission,
SUM(s.Qty * (a.Price - a.Cost)) * (1 - sp.CommRate / 100.0) as NetProfit
FROM Sales s JOIN
Salespeople sp
ON sp.EmployeeID = s.EmployeeID JOIN
Appliances a
ON s.AppID = a.ID
GROUP BY sp.EmployeeName sp.CommRate
ORDER BY NetProfit DESC;
How would I go about extracting the overlapping dates from the following table?
ID Name StartDate EndDate Type
==============================================================
1 John Smith 01/01/2014 31/01/2014 A
2 John Smith 20/01/2014 20/02/2014 B
3 John Smith 01/03/2014 28/03/2014 A
4 John Smith 18/03/2014 24/03/2014 B
5 John Smith 01/07/2014 31/07/2014 A
6 John Smith 15/07/2014 31/07/2014 B
7 John Smith 25/07/2014 25/08/2014 C
Based on the first example for John Smith, the dates 01/01/2014 to 31/01/2014 overlap with 20/01/2014 to 20/02/2014, so I am expecting just overlapping period back which is 20/01/2014 to 31/01/2014.
The final result would be:
ID Name StartDate EndDate
==================================================
8 John Smith 20/01/2014 31/01/2014
9 John Smith 18/03/2014 24/03/2014
10 John Smith 15/07/2014 31/07/2014
11 John Smith 25/07/2014 31/07/2014
HELP REQUIRED 10 August 2014
In addition to the above request, I am looking for help or guidance on how to get the following results which should include the dates that overlap and the dates that don't. The ID column is irrelevant.
ID Name StartDate EndDate Type
==================================================
1 John Smith 01/01/2014 19/01/2014 A
8 John Smith 20/01/2014 31/01/2014 AB
2 John Smith 01/02/2014 20/02/2014 B
3 John Smith 01/03/2014 17/03/2014 A
9 John Smith 18/03/2014 24/03/2014 AB
3 John Smith 25/03/2014 28/03/2014 A
5 John Smith 01/07/2014 14/07/2014 A
10 John Smith 15/07/2014 31/07/2014 AB
11 John Smith 25/07/2014 31/07/2014 ABC
7 John Smith 01/08/2014 25/08/2014 C
Although the following image is not an exact reflection of the above, for illustration purposes, I am interested in seeing the dates that overlap (red) and the dates that don't (sky blue) in the same result set.
http://imgur.com/SeR9sY1
If you want just overlapping periods, you can get this with a self join. Do note that the results might be redundant if more than two periods overlap on certain dates.
select ft.name,
(case when max(ft.startdate) > max(ft2.startdate) then max(ft.startdate)
else max(ft2.startdate)
end) as startdate,
(case when min(ft.enddate) > min(ft2.enddate) then min(ft.enddate)
else min(ft2.enddate)
end) as enddate
from followingtable ft join
followingtable ft2
on ft.name = ft2.name and
ft.id < ft2.id and
ft.startdate <= ft2.enddate and
ft.enddate > ft2.startdate
group by ft.name, ft.id, ft2.id;
This doesn't assign the ids. You can do that with row_number() and an offset.
I am working with a Raiser's Edge database using SQL Server 2005. I have written SQL that will produce a temporary table containing details of direct debit instalments. Below is a small table containing the key variables for the question I'm going to ask, with some fictional data:
Donor_ID Instalment_ID Instalment_Date Amount
1234 1111 01/01/2011 £5.00
1234 1112 01/02/2011 £0.00
1234 1113 01/03/2011 £5.00
1234 1114 01/04/2011 £5.00
1234 1115 01/05/2011 £0.00
1234 1116 01/06/2011 £0.00
2345 2111 01/01/2011 £0.00
2345 2112 01/02/2011 £5.00
2345 2113 01/03/2011 £5.00
2345 2114 01/04/2011 £0.00
2345 2115 01/05/2011 £0.00
2345 2116 01/06/2011 £0.00
As you will see, some of the values in the Amount column are £0.00. This can occur when a donor has insufficient funds in their account, for example.
What I'd like to do is write a SQL query that will create a field containing an incremental count of consecutive £0.00 payments that resets after a non-£0.00 payment or after a change in Donor_ID. I have reproduced the above data below, with the field I'd like to see.
Donor_ID Instalment_ID Instalment_Date Amount New_Field
1234 1111 01/01/2011 £5.00
1234 1112 01/02/2011 £0.00 1
1234 1113 01/03/2011 £5.00
1234 1114 01/04/2011 £5.00
1234 1115 01/05/2011 £0.00 1
1234 1116 01/06/2011 £0.00 2
2345 2111 01/01/2011 £0.00 1
2345 2112 01/02/2011 £5.00
2345 2113 01/03/2011 £5.00
2345 2114 01/04/2011 £0.00 1
2345 2115 01/05/2011 £0.00 2
2345 2116 01/06/2011 £0.00 3
To help clarify what I'm looking for, I think what I'm looking to do would be similar to a winning streak field on a list of a football team's results. For example:
Opponent Score Winning_Streak
Arsenal 1-0 1
Liverpool 0-0
Swansea 3-1 1
Chelsea 2-1 2
Fulham 4-0 3
Stoke 0-0
Man Utd 1-3
Reading 2-1 1
I've considered various options, but have made no progress. Unless I've missed something obvious, I think that a solution more advanced than my current SQL programming level might be required.
If I am thinking about this problem correctly, I believe that you want a row number when the Amount is 0.00 pounds.
Select 0 as As InsufficientCount
, Donor_ID
, Installment_ID
, Amount
From [Table]
Where Amount > 0.00
Union
Select Row_Number() Over (Partition By Donor_ID Order By Installment_ID)
, Donor_ID
, Installment_ID
, Amount
From [Table]
Where Amount = 0.00
This union select should only give you 'ranks' where the Amount equals 0.
Am calling your new field streakAmount
ALTER TABLE instalments ADD streakAmount int NULL;
Then, to update the value:
UPDATE instalments
SET streakAmount =
(SELECT
COUNT(*)
FROM
instalments streak
WHERE
streak.donor_id = instalments.donor_id
AND
streak.instalment_date <= instalments.instalment_date
AND
(streak.instalment_date >
-- find previous instalment date, if any exists
COALESCE(
(
SELECT
MAX(instalment_date)
FROM
instalments prev
WHERE
prev.donor_id = instalments.donor_id
AND
prev.amount > 0
AND
prev.instalment_date < instalments.instalment_date
)
-- otherwise min date
, cast('1753-1-1' AS date))
)
)
WHERE
amount = 0;
http://sqlfiddle.com/#!6/a571f/18
I'm not sure exactly how to describe what I want to do, so I'll use a contrived example
On SQL Server 2005, Say I have a view with rows like this, call it vwGrades:
ID AssnDate AssnTxt Sally Ted Bob
----------- ----------------------- ------------- ----------- ----------- -----------
2999 2007-09-22 00:00:00 Homework #1 20 NULL NULL
2999 2007-09-22 00:00:00 Homework #1 NULL 0 NULL
2999 2007-09-22 00:00:00 Homework #1 NULL NULL 24
2999 2007-09-22 00:00:00 Final Exam 57 NULL NULL
2999 2007-09-22 00:00:00 Final Exam NULL 0 NULL
2999 2007-09-22 00:00:00 Final Exam NULL NULL 35
How can I query it, such that I get this, ridding myself of all the annoying nulls and duplicate rows?
ID AssnDate AssnTxt Sally Ted Bob
----------- ----------------------- ------------- ----------- ----------- -----------
2999 2007-09-22 00:00:00 Homework #1 20 0 24
2999 2007-09-22 00:00:00 Final Exam 57 0 35
Select
ID,
AssnDate,
AssnTxt,
Max(IsNull(Sally,0)) AS Sally,
Max(IsNull(Ted, 0)) As Ted,
Max(IsNull(Bob, 0)) As Bob
From vwGrades
Group By
ID,
AssnDate,
AssnTxt
Looks like a denormalized schema, you should have a FirstName column instead of Sally Ted and Bob. That would make the query much simpler. Can you refactor?