Join table on itself for unique row combinations for calculations - sql

I have a table that I need to use to build a result set from where certain rows from the table are columns in the result set. I started to chain LEFT JOINs together on the table multiple times but I need to eliminate results that are a different combination of another result already in the set:
For example, if I get 1, 21, 25 as result columns, I can't have ANY other combination of those numbers in the results.
My table definition is:
Table tblKPIDetails
Column Month int
Column Year int
Column Division varchar(3)
Column KPI int
Column Value decimal(18,4)
My current query is:
SELECT *
FROM tblKPIDetails J1
LEFT JOIN tblKPIDetails J2 ON J2.Month = J1.Month AND J2.Year = J1.Year AND J2.Division = J1.Division AND NOT(J2.KPI = J1.KPI ) AND (J2.KPI = 1 OR J2.KPI = 21 OR J2.KPI = 25)
LEFT JOIN tblKPIDetails J3 ON J3.Month = J1.Month AND J3.Year = J1.Year AND J3.Division = J1.Division AND NOT(J3.KPI = J1.KPI ) AND (J3.KPI = 1 OR J3.KPI = 21 OR J3.KPI = 25)
WHERE J1.KPI = 1 OR J1.KPI = 21 OR J1.KPI = 25
I know this is wrong, but it's a super-set of what I need. In the results from the query above, I can get J1.KPI, J2.KPI, J3.KPI or J1.KPI, J3.KPI, J2.KPI, or any other combination.
My expected result would be:
Division | Month | Year | KPIA | KPIAValue | KPIB | KPIBValue | KPIC | KPICValue
for each division, month, and year
where KPIA, KPIB, or KPIC = 1, 21, or 25 but only 1 combination of 1,21,25 exists per division|month|year
EDIT
To clarify the expected results a little more, using the above query, I'm getting the following results:
Division | Month | Year | KPIA | KPIAValue | KPIB | KPIBValue | KPIC | KPICValue
--------------------------------------------------------------------------------
000 1 2012 1 1000 21 2000 25 3000
000 1 2012 21 2000 1 1000 25 3000
000 1 2012 25 3000 21 2000 1 1000
111 1 2012 1 555 21 10000 25 5000
I need to make it so my results would only be ANY 1 of the first 3 results and then the last one...for example:
Division | Month | Year | KPIA | KPIAValue | KPIB | KPIBValue | KPIC | KPICValue
--------------------------------------------------------------------------------
000 1 2012 25 3000 21 2000 1 1000
111 1 2012 1 555 21 10000 25 5000

I think you are looking for the PIVOT table operator like so:
SELECT
    Division,
    Month,
    Year,
    [1] AS KPIAValue,
    [21] AS KPIBValue,
    [25] AS KPICValue
FROM
(
    SELECT t1.*
    FROM tblKPIDetails t1
    INNER JOIN
    (
        SELECT Month, Year, Division
        FROM tblKPIDetails
        WHERE KPI IN (1, 21, 25)
        GROUP BY Month, Year, Division
        HAVING COUNT(DISTINCT KPI) = 3
    ) t2 ON t1.Month = t2.Month AND t1.Year = t2.Year
        AND t1.Division = t2.Division
) t
PIVOT
(
    MAX(Value)
    FOR KPI IN ([1], [21], [25])
) p;
SQL Fiddle Demo
This will give you the data in the form:
| DIVISION | MONTH | YEAR | KPIAVALUE | KPIBVALUE | KPICVALUE |
---------------------------------------------------------------
| A | 2 | 2012 | 16 | 16 | 16 |
| B | 10 | 2012 | 16 | 18 | 20 |
Note that this returns only the Year, Month, Division combinations that have all three of the values 1, 21 and 25. That is what this subquery does:
SELECT Month, Year, Division
FROM tblKPIDetails
WHERE KPI IN (1, 21, 25)
GROUP BY Month, Year, Division
HAVING COUNT(DISTINCT KPI) = 3
Update: If you are looking for the combinations that have at least one of 1, 21 or 25, just remove the HAVING COUNT(DISTINCT KPI) = 3. A combination may then also contain KPI values other than these three; the query ignores those and returns only the three columns. It also returns NULL for any of the three values that is missing, like so:
SELECT
    Division,
    Month,
    Year,
    [1] AS KPIAValue,
    [21] AS KPIBValue,
    [25] AS KPICValue
FROM
(
    SELECT t1.*
    FROM tblKPIDetails t1
    INNER JOIN
    (
        SELECT Month, Year, Division
        FROM tblKPIDetails
        WHERE KPI IN (1, 21, 25)
        GROUP BY Month, Year, Division
    ) t2 ON t1.Month = t2.Month AND t1.Year = t2.Year
        AND t1.Division = t2.Division
) t
PIVOT
(
    MAX(Value)
    FOR KPI IN ([1], [21], [25])
) p;
Updated SQL Fiddle Demo
| DIVISION | MONTH | YEAR | KPIAVALUE | KPIBVALUE | KPICVALUE |
---------------------------------------------------------------
| A | 2 | 2012 | 15.5 | 15.5 | 15.5 |
| B | 10 | 2012 | 15.5 | 17.5 | 20.24 |
| C | 12 | 2012 | 15.5 | (null) | 20.24 |

If you don't have a large number of "IDs", you could just transpose the values like this:
select
[Month],
[Year],
Division,
sum(case when KPI = 1 then Value else null end) as KPI1,
sum(case when KPI = 21 then Value else null end) as KPI21,
sum(case when KPI = 25 then Value else null end) as KPI25
from tblKPIDetails
group by
[Month],
[Year],
Division
order by
[Month],
[Year],
Division
Or same thing by using the "OVER" clause.
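The transposition above can be tried on the four result rows from the question's EDIT. This is only a minimal sketch: it assumes an in-memory SQLite database (via Python's sqlite3) as a stand-in for SQL Server, and the raw rows are reconstructed from the KPI/value pairs shown in the question.

```python
import sqlite3

# In-memory stand-in for tblKPIDetails (SQLite here, not SQL Server: an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE tblKPIDetails '
    '("Month" INT, "Year" INT, Division TEXT, KPI INT, Value NUMERIC)'
)
# Rows reconstructed from the KPI/value pairs in the question's EDIT.
conn.executemany(
    "INSERT INTO tblKPIDetails VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2012, "000", 1, 1000),
        (1, 2012, "000", 21, 2000),
        (1, 2012, "000", 25, 3000),
        (1, 2012, "111", 1, 555),
        (1, 2012, "111", 21, 10000),
        (1, 2012, "111", 25, 5000),
    ],
)

# One output row per Division/Month/Year; each CASE picks out one KPI's value.
pivot_sql = """
SELECT Division, "Month", "Year",
       SUM(CASE WHEN KPI = 1  THEN Value END) AS KPI1,
       SUM(CASE WHEN KPI = 21 THEN Value END) AS KPI21,
       SUM(CASE WHEN KPI = 25 THEN Value END) AS KPI25
FROM tblKPIDetails
GROUP BY Division, "Month", "Year"
ORDER BY Division, "Month", "Year"
"""
rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```

Because the grouping key is Division/Month/Year, each combination of 1, 21, 25 can appear only once per group, which is the de-duplication the question asks for.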

I think you want a conditional aggregation. But it is still not clear to me how the results are being defined. This might help you on your way:
SELECT Division, Month, Year,
1, max(case when kpi = 1 then value end) as kpi1value,
21, max(case when kpi = 21 then value end) as kpi21value,
25, max(case when kpi = 25 then value end) as kpi25value
FROM tblKPIDetails J1
GROUP BY Division, Month, Year

Maybe you can try the following:
SELECT DISTINCT
t.Division,
t.Month,
t.Year,
KA.Value AS KPIAValue,
KB.Value AS KPIBValue,
KC.Value AS KPICValue
FROM
tblKPIDetails t
LEFT JOIN tblKPIDetails KA ON t.Division = KA.Division and t.Month = KA.month and t.year = KA.year and KA.KPI = 1
LEFT JOIN tblKPIDetails KB ON t.Division = KB.Division and t.Month = KB.month and t.year = KB.year and KB.KPI = 21
LEFT JOIN tblKPIDetails KC ON t.Division = KC.Division and t.Month = KC.month and t.year = KC.year and KC.KPI = 25
That is one LEFT JOIN for each KPI value you want.

Related

Sum with SQL depending on the value of a column

I have 3 columns : year, price, and day_type.
year day_type price
2016 0 10
2016 1 20
2016 2 5
2017 0 14
2017 1 6
2017 2 3
I want to keep only the lines where day_type = 1 or 2, but add to these lines the value when day_type = 0.
Expected Result :
year day_type price
2016 1 30
2016 2 15
2017 1 20
2017 2 17
How can I do that?
You can use a join:
select t.year, t.day_type, (t.price + coalesce(t0.price, 0)) as price
from t left join
t t0
on t.year = t0.year and t0.day_type = 0
where t.day_type <> 0;
This uses left join in case one of the years does not have a 0 price.
With sum() window function:
select * from (
select year, (2 * day_type) % 3 as day_type,
sum(price) over (partition by year) - price as price
from tablename
) t
where day_type <> 0
order by year, day_type
See the demo.
Results:
year | day_type | price
---: | -------: | ----:
2016 | 1 | 30
2016 | 2 | 15
2017 | 1 | 20
2017 | 2 | 17
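Both answers can be checked against the question's six rows. The sketch below assumes an in-memory SQLite database (through Python's sqlite3) and names the table t. It also shows why the window version works: subtracting a row's own price from the year total leaves the other two prices, and (2 * day_type) % 3 swaps labels 1 and 2 so each sum lands on the right row. That trick relies on every year having exactly one row for each of day types 0, 1 and 2, which holds for this data but is an assumption in general.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (year INT, day_type INT, price INT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(2016, 0, 10), (2016, 1, 20), (2016, 2, 5),
     (2017, 0, 14), (2017, 1, 6), (2017, 2, 3)],
)

# Join version: attach the year's day_type = 0 price to every other row.
join_sql = """
SELECT t.year, t.day_type, t.price + COALESCE(t0.price, 0) AS price
FROM t
LEFT JOIN t AS t0 ON t.year = t0.year AND t0.day_type = 0
WHERE t.day_type <> 0
ORDER BY t.year, t.day_type
"""

# Window version: year total minus own price, with labels 1 and 2 swapped.
window_sql = """
SELECT * FROM (
    SELECT year, (2 * day_type) % 3 AS day_type,
           SUM(price) OVER (PARTITION BY year) - price AS price
    FROM t
) AS s
WHERE day_type <> 0
ORDER BY year, day_type
"""

join_rows = conn.execute(join_sql).fetchall()
window_rows = conn.execute(window_sql).fetchall()
print(join_rows)
print(window_rows)
```

The join version keeps working if a year has no day_type = 0 row or has extra day types; the window version does not, so the join is probably the safer default.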

How to calculate moving average in SQL?

I've a table with 2 columns in SQL
+------+--------+
| WEEK | OUTPUT |
+------+--------+
| 1 | 10 |
| 2 | 20 |
| 3 | 30 |
| 4 | 40 |
| 5 | 50 |
| 6 | 50 |
+------+--------+
How do I sum the output for the current week and the 2 weeks before it (e.g. for week 3, sum the output for weeks 3, 2 and 1)? I've seen many tutorials on moving averages, but they use dates; in my case I want to use an int. Is that possible?
Thanks!
I think you want something like this :
SELECT *,
(SELECT Sum(output)
FROM table1 b
WHERE b.week IN( a.week, a.week - 1, a.week - 2 )) AS SUM
FROM table1 a
Alternatively, the IN clause can be replaced with BETWEEN a.week - 2 AND a.week.
sql fiddle
You can use a self-join. The idea is to put your table beside itself, with a join condition that brings each week together with the rows it should be summed with:
SELECT * FROM [output] o1
INNER JOIN [output] o2 ON o1.Week between o2.Week and o2.Week + 2
this select will produce this output:
o1.Week o1.Output o2.Week o2.Output
--------------------------------------------
1 10 1 10
2 20 1 10
2 20 2 20
3 30 1 10
3 30 2 20
3 30 3 30
4 40 2 20
4 40 3 30
4 40 4 40
and so on. Note that for weeks 1 and 2 there aren't previous weeks available.
Now you should just group the data by o1.Week and get the SUM:
SELECT o1.Week, SUM(o2.Output)
FROM [output] o1
INNER JOIN [output] o2 ON o1.Week between o2.Week and o2.Week + 2
GROUP BY o1.Week
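As a quick check, the grouped self-join can be run on the six rows from the question. This sketch assumes an in-memory SQLite database via Python's sqlite3, and the table name output_by_week is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# output_by_week is a hypothetical name standing in for the question's table.
conn.execute("CREATE TABLE output_by_week (week INT, output INT)")
conn.executemany(
    "INSERT INTO output_by_week VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50), (6, 50)],
)

# Each o1 row is paired with the o2 rows for weeks o1.week - 2 .. o1.week.
self_join_sql = """
SELECT o1.week, SUM(o2.output) AS sum_3_weeks
FROM output_by_week AS o1
INNER JOIN output_by_week AS o2
        ON o1.week BETWEEN o2.week AND o2.week + 2
GROUP BY o1.week
ORDER BY o1.week
"""
rows = conn.execute(self_join_sql).fetchall()
for week, total in rows:
    print(week, total)
```

Weeks 1 and 2 simply sum fewer rows (10 and 30) because there are no earlier weeks to pair them with.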
If the week numbers are continuous (no gaps), you can simply use a window function:
SELECT [Week], [Output],
SUM([Output]) OVER (ORDER BY [Week] ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM dbo.SomeTable
RANGE is more accurate for your calculation, but SQL Server does not yet implement RANGE with a numeric offset. Other database engines may support it:
SELECT [Week], [Output],
SUM([Output]) OVER (ORDER BY [Week] RANGE BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM dbo.SomeTable
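The difference between the two frames only shows up when a week is missing. The sketch below assumes SQLite 3.28 or later (reached through Python's sqlite3), which accepts RANGE with a numeric offset. The table name weekly is hypothetical, and week 4 has been removed from the question's data on purpose to create a gap.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# weekly is a hypothetical table; week 4 is deliberately left out.
conn.execute("CREATE TABLE weekly (week INT, output INT)")
conn.executemany(
    "INSERT INTO weekly VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (5, 50), (6, 50)],
)

# ROWS counts physical rows; RANGE counts by the value of the ORDER BY column.
frames_sql = """
SELECT week,
       SUM(output) OVER (ORDER BY week
                         ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS rows_sum,
       SUM(output) OVER (ORDER BY week
                         RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) AS range_sum
FROM weekly
ORDER BY week
"""
rows = conn.execute(frames_sql).fetchall()
for week, rows_sum, range_sum in rows:
    print(week, rows_sum, range_sum)
```

For week 5 the ROWS frame reaches back to week 2 (20 + 30 + 50 = 100), while the RANGE frame only covers weeks 3 to 5 (30 + 50 = 80), which is the sum the question actually describes.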
Try this:
SELECT t1.week,
(SELECT SUM(t2.output) / 3.0
FROM yourtable t2
WHERE t2.week BETWEEN t1.week - 2 AND t1.week) AS moving_avg
FROM yourtable t1
You have not said which SQL Server version you are using. If it is SQL Server 2012 or above, a simple example is:
declare @table table(wk int, outpt int)
insert into @table values (1,10)
,(2,20)
,(3,30)
,(4,40)
,(5,50)
,(6,60)
select *, SUM(outpt) over(partition by id order by wk rows between unbounded preceding and current row) dd
from (
select *, 1 id
from @table
where wk < 5
) a

Calculating running sum starting x years before

I have a table with entity name, year and activity number, as shown
below. In some years there is no activity.
name | year | act_num
-----+------+---------
aa | 2000 | 2
aa | 2001 | 6
aa | 2002 | 9
aa | 2003 | 15
aa | 2005 | 17
b | 2000 | 3
b | 2002 | 4
b | 2003 | 9
b | 2005 | 12
b | 2006 | 2
To create it in PostgreSQL:
CREATE TABLE entity_year_activity (
name character varying(10),
year integer,
act_num integer
);
INSERT INTO entity_year_activity
VALUES
('aa', 2000, 2),
('aa', 2001, 6),
('aa', 2002, 9),
('aa', 2003, 15),
('aa', 2005, 17),
('b', 2000, 3),
('b', 2002, 4),
('b', 2003, 9),
('b', 2005, 12),
('b', 2006, 2);
I would like to have, for each entity and each year, the total number of
activities over the past x years including the current year, as shown below.
As an example, for x = three years:
name | year | act_num | total_3_years
-----+------+---------+---------------
aa | 2000 | 2 | 2
aa | 2001 | 6 | 8
aa | 2002 | 9 | 17
aa | 2003 | 15 | 30
aa | 2004 | 0 | 24
aa | 2005 | 17 | 32
b | 2000 | 3 | 3
b | 2001 | 0 | 3
b | 2002 | 4 | 7
b | 2003 | 9 | 13
b | 2005 | 12 | 21
b | 2006 | 2 | 14
Here's an approach that uses the sum aggregate as a window function with a row-based window frame; see SUM(...) OVER (PARTITION BY name ORDER BY year ROWS 2 PRECEDING) and window framing.
WITH name_years(gen_name, gen_year) AS (
SELECT gen_name, s
FROM generate_series(
(SELECT min(year) FROM entity_year_activity),
(SELECT max(year) FROM entity_year_activity)
) s CROSS JOIN (SELECT DISTINCT name FROM entity_year_activity) n(gen_name)
),
windowed_history(name, year,act_num,last3_actnum) AS (
SELECT
gen_name, gen_year, coalesce( act_num, 0),
SUM(coalesce(act_num,0)) OVER (PARTITION BY gen_name ORDER BY gen_year ROWS 2 PRECEDING)
FROM name_years
LEFT OUTER JOIN entity_year_activity ON (gen_name = name AND gen_year = year)
)
SELECT name, year, act_num, sum(last3_actnum) as total_3_years
FROM windowed_history
GROUP BY name, year, act_num
HAVING sum(last3_actnum) <> 0
ORDER BY name, year;
See SQLFiddle.
The need to generate entries for years that have no entry themselves complicates this query. I generate a table of all (name, year) pairs, then left outer join entity_year_activity on it before doing the window sum, so all years for all name sets are represented. That's why this is so complicated. Then I filter the aggregated result to exclude entries with zero in the sum.
SQL Fiddle
select
s.name,
d "year",
coalesce(act_num, 0) act_num,
coalesce(act_num, 0)
+ lag(coalesce(act_num, 0), 1, 0) over(partition by s.name order by d)
+ lag(coalesce(act_num, 0), 2, 0) over(partition by s.name order by d)
total_3_years
from
entity_year_activity eya
right join (
generate_series(
(select min("year") from entity_year_activity),
(select max("year") from entity_year_activity)
) d cross join (
select distinct name
from entity_year_activity
) f
) s on s.name = eya.name and s.d = eya."year"
order by s.name, d
SELECT en_key.name, en_key.year, en_key.act_num, SUM(en_sum.act_num) as total_3_years
FROM entity_year_activity en_key
INNER JOIN entity_year_activity en_sum
ON en_key.name = en_sum.name
WHERE en_sum.year BETWEEN en_key.year - 2 AND en_key.year
GROUP BY en_key.name, en_key.year, en_key.act_num
Another try. This one lacks the 0 row years, though:
select t1.name, t1.year, t1.act_num,
(select sum(t2.act_num) from entity_year_activity t2
where t2.year between t1.year - 2 and t1.year
and t2.name = t1.name) total
from entity_year_activity t1;
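To get the zero-activity years back as well, the "generate every (name, year) pair, then window over it" idea from the earlier answers can be sketched outside PostgreSQL too. The version below assumes an in-memory SQLite database via Python's sqlite3. Since generate_series may not be available in a stock SQLite build, a recursive CTE (named years here, a made-up name) stands in for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entity_year_activity (name TEXT, year INT, act_num INT)"
)
conn.executemany(
    "INSERT INTO entity_year_activity VALUES (?, ?, ?)",
    [("aa", 2000, 2), ("aa", 2001, 6), ("aa", 2002, 9), ("aa", 2003, 15),
     ("aa", 2005, 17), ("b", 2000, 3), ("b", 2002, 4), ("b", 2003, 9),
     ("b", 2005, 12), ("b", 2006, 2)],
)

running_sql = """
WITH RECURSIVE years(y) AS (
    -- stand-in for generate_series(min(year), max(year))
    SELECT MIN(year) FROM entity_year_activity
    UNION ALL
    SELECT y + 1 FROM years
    WHERE y < (SELECT MAX(year) FROM entity_year_activity)
),
name_years AS (
    -- every name paired with every year, so gaps become explicit rows
    SELECT n.name, years.y AS year
    FROM (SELECT DISTINCT name FROM entity_year_activity) AS n
    CROSS JOIN years
)
SELECT ny.name, ny.year,
       COALESCE(e.act_num, 0) AS act_num,
       SUM(COALESCE(e.act_num, 0)) OVER (
           PARTITION BY ny.name ORDER BY ny.year
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS total_3_years
FROM name_years AS ny
LEFT JOIN entity_year_activity AS e
       ON e.name = ny.name AND e.year = ny.year
ORDER BY ny.name, ny.year
"""
rows = conn.execute(running_sql).fetchall()
for row in rows:
    print(row)
```

ROWS is safe here only because the gaps have been filled first, so two preceding rows always mean two preceding years. Note that this sketch emits a row for every name and every year from 2000 to 2006, which is slightly more than the question's expected table lists.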

How do you select from a date range as the data source

Short of creating a table with all of the values of a date range, how would I select from a date range as a data source?
What I'm trying to accomplish is to create a running total of all items created within the same week from separate tables, while also showing weeks with 0 new items.
example table:
items
-----------------------------
created_on | name | type
-----------------------------
2012-01-01 | Cards | 1
2012-01-09 | Red Pen | 2
2012-01-31 | Pencil | 2
2012-02-01 | Blue Pen | 2
types
--------------
name | id
--------------
Fun | 1
Writing | 2
sample output:
----------------------------
year | week | fun | writing
----------------------------
2012 | 1 | 1 | 0
2012 | 2 | 0 | 1
2012 | 3 | 0 | 0
2012 | 4 | 0 | 0
2012 | 5 | 0 | 2
You could generate a number series for the week numbers
SELECT
w.week
FROM
(SELECT generate_series(1,52) as week) as w
Example
SELECT
w.year,
w.week,
-- a single join avoids multiplying fun rows by writing rows in the same week
COUNT(CASE WHEN i.type = 1 THEN 1 END) as fun,
COUNT(CASE WHEN i.type = 2 THEN 1 END) as writing
FROM (SELECT 2012 as year, generate_series(1,6) as week) as w
LEFT JOIN items i ON w.week = EXTRACT(WEEK FROM i.created_on)
GROUP BY
w.year,
w.week
ORDER BY
w.year,
w.week
Very close, erikxiv; you got me headed in the right direction. I have multiple tables I need to grab information from, hence the additional select in the select list.
select
date_year.num,
date_week.num,
( select count(*) from items x
where EXTRACT(YEAR FROM x.created_on) = date_year.num
and EXTRACT(WEEK FROM x.created_on) = date_week.num
) as item_count
from
(SELECT generate_series(2011, date_part('year', CURRENT_DATE)::INTEGER) as num) as date_year,
(SELECT generate_series(1,52) as num) as date_week
where
(
date_year.num < EXTRACT (YEAR FROM CURRENT_DATE)
OR
(
date_year.num = EXTRACT (YEAR FROM CURRENT_DATE) AND
date_week.num <= EXTRACT (WEEK FROM CURRENT_DATE)
)
)

Putting stuff into date ranges in SQL Server 2005

I have a table with week ranges (week number, start date, end date) and a table with tutorial dates for writing tutors (tutor ID, tutorial_date, tutorial type (A or B)).
I want to create two queries. The first should show the week ranges (week 1, week 2) across the top and the tutor names down the side, with each cell holding the count of tutorials of type "A" within that week's date range.
The result should look like this:
Counts of Tutorials of Type "A"
Tutor|Week One|Week Two|Week Three|Week Four|Total
Joe | 3 | 5 | 7 | 8 | 23
Sam | 2 | 4 | 3 | 8 | 17
Meaning that Joe completed 3 tutorials in week one, five in week two, 7 in week three, and 8 in week 4.
The second query should show totals for tutorial type "A" and type "B"
Tutor|Week One|Week Two|Week Three|Week Four|Total |
Joe | 3/1 | 5/3 | 7/2 | 8/2 | 23/8 |
Sam | 2/3 | 4/4 | 3/2 | 8/3 | 17/12 |
Here, in Week One, Joe has done 3 tutorials of type A and 1 of type B.
Sample table data for tutorials (week one)
Tutor | Tutorial_ID | Tutorial Date |Type|
------------------------------------------
Joe | 1 | 2011-01-01 | A |
Joe | 2 | 2011-01-02 | A |
Joe | 3 | 2011-01-03 | A |
Joe | 4 | 2011-01-03 | B |
Sam | 5 | 2011-01-01 | A |
Sam | 6 | 2011-01-02 | A |
Sam | 7 | 2011-01-03 | B |
The week table looks like this:
weekNumber |startDate |endDate
1 |2011-01-01|2011-01-15
I'd like to generate this in SQL Server 2005.
There are a few ways to do this.
For query one, where you only need to pivot on type 'A', you can use just a PIVOT:
select *
from
(
select w1.tutor
, w1.type
, wk.weeknumber
from w1
inner join wk
on w1.tutorialdate between wk.startdate and wk.enddate
where w1.type = 'a'
) x
pivot
(
count(type)
for weeknumber in ([1])
)p
See SQL Fiddle with Demo
Or you can use a Count() with a CASE statement.
select w1.tutor
, COUNT(CASE WHEN w1.type = 'A' THEN 1 ELSE null END) [Week One]
from w1
inner join wk
on w1.tutorialdate between wk.startdate and wk.enddate
group by w1.tutor
See SQL Fiddle with Demo
But for the second query, I would just use a Count() with a CASE
select w1.tutor
, Cast(COUNT(CASE WHEN w1.type = 'A' AND wk.weeknumber = 1 THEN 1 ELSE null END) as varchar(10))
+ ' / '
+ Cast(COUNT(CASE WHEN w1.type = 'B' AND wk.weeknumber = 1 THEN 1 ELSE null END) as varchar(10)) [Week One]
, Cast(COUNT(CASE WHEN w1.type = 'A' AND wk.weeknumber = 2 THEN 1 ELSE null END) as varchar(10))
+ ' / '
+ Cast(COUNT(CASE WHEN w1.type = 'B' AND wk.weeknumber = 2 THEN 1 ELSE null END) as varchar(10)) [Week Two]
from w1
inner join wk
on w1.tutorialdate between wk.startdate and wk.enddate
group by w1.tutor
See SQL Fiddle with Demo
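The COUNT-with-CASE version can be tried on the week-one sample rows from the question. This is a sketch under a few assumptions: an in-memory SQLite database via Python's sqlite3 instead of SQL Server 2005, dates stored as ISO text (so BETWEEN compares them correctly as strings), and || for concatenation where SQL Server uses + with a cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# w1 = tutorials, wk = week ranges, following the table names used above.
conn.execute(
    "CREATE TABLE w1 (tutor TEXT, tutorial_id INT, tutorialdate TEXT, type TEXT)"
)
conn.execute("CREATE TABLE wk (weeknumber INT, startdate TEXT, enddate TEXT)")
conn.executemany(
    "INSERT INTO w1 VALUES (?, ?, ?, ?)",
    [("Joe", 1, "2011-01-01", "A"), ("Joe", 2, "2011-01-02", "A"),
     ("Joe", 3, "2011-01-03", "A"), ("Joe", 4, "2011-01-03", "B"),
     ("Sam", 5, "2011-01-01", "A"), ("Sam", 6, "2011-01-02", "A"),
     ("Sam", 7, "2011-01-03", "B")],
)
conn.execute("INSERT INTO wk VALUES (1, '2011-01-01', '2011-01-15')")

# The range join assigns each tutorial to its week; each COUNT(CASE ...)
# then counts one type within one week.
counts_sql = """
SELECT w1.tutor,
       COUNT(CASE WHEN w1.type = 'A' AND wk.weeknumber = 1 THEN 1 END)
       || ' / ' ||
       COUNT(CASE WHEN w1.type = 'B' AND wk.weeknumber = 1 THEN 1 END)
       AS week_one
FROM w1
INNER JOIN wk ON w1.tutorialdate BETWEEN wk.startdate AND wk.enddate
GROUP BY w1.tutor
ORDER BY w1.tutor
"""
rows = conn.execute(counts_sql).fetchall()
for tutor, week_one in rows:
    print(tutor, week_one)
```

Further weeks would need one more pair of COUNT(CASE ...) expressions each, so this form suits a small, fixed number of weeks.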
Edit: as AndriyM pointed out, the second query could also be done with a PIVOT. Here is a solution for the second query:
SELECT *
FROM
(
select distinct w1.tutor
, wk.weeknumber
, left(total, len(total)-1) Totals
FROM w1
inner join wk
on w1.tutorialdate between wk.startdate and wk.enddate
CROSS APPLY
(
SELECT cast(count(w2.type) as varchar(max)) + ' / '
from w1 w2
inner join wk wk2
on w2.tutorialdate between wk2.startdate and wk2.enddate
WHERE w2.tutor = w1.tutor
AND wk2.weeknumber = wk.weeknumber
group by w2.tutor, wk2.weeknumber, w2.type
FOR XML PATH('')
) D ( total )
) x
PIVOT
(
min(totals)
for weeknumber in ([1], [2])
) p
See SQL Fiddle with Demo