Cross-reference nearest date data - SQL

I have three tables: ElecUser, ElecUsage, and ElecEmissionFactor.
ElecUser:
UserID UserName
1 Main Building
2 Staff Quarter
ElecUsage:
UserID Time Amount
1 1/7/2010 23230
1 8/10/2011 34340
1 8/1/2011 34300
1 2/3/2012 43430
1 4/2/2013 43560
1 3/2/2014 44540
2 3/6/2014 44000
ElecEmissionFactor:
Time CO2Emission
1/1/2010 0.5
1/1/2011 0.55
1/1/2012 0.56
1/1/2013 0.57
And intended outcome:
UserName Time CO2
1 2010 11615
1 2011 37752 (34340*0.55 + 34300*0.55)
1 2012 24320.8
1 2013 24829.2
1 2014 25387.8
2 2014 25080
The logic is ElecUsage.Amount * ElecEmissionFactor.CO2Emission.
If several records share the same user and the same year, add them up into one record for that year.
My query is:
SELECT ElecUser.UserName, Year([ElecUsage].[Time]), SUM((ElecEmissionFactor.CO2Emission*ElecUsage.Amount)) As CO2
FROM ElecEmissionFactor, ElecUser INNER JOIN ElecUsage ON ElecUser.UserID = ElecUsage.UserID
WHERE (((Year([ElecUsage].[Time]))>=Year([ElecEmissionFactor].[Time])))
GROUP BY ElecUser.UserName, Year([ElecUsage].[Time])
HAVING Year([ElecUsage].[Time]) = Max(Year(ElecEmissionFactor.Time));
However, this only shows the years that have an emission factor.
The challenge is to map a year without an emission factor to the latest year that has one.
A subquery may be one of the solutions, but I have failed to get one working.
I have been stuck for a while. Hope to see your reply.
Thanks

Try something like this:
-- not tested
select T1.UserID, year(T1.time) as Time, sum(T1.amount*T2.co2emission) as CO2
from ElecUsage T1
left outer join ElecEmissionFactor T2 on (year(T1.time) = year(T2.time))
Group by year(T1.time), T1.UserID
Use a subquery to get the corresponding factor, like this:
select T1.UserID,
year(T1.time) as Time,
sum(T1.amount*
(
select top 1 CO2Emission from ElecEmissionFactor T2
where year(T2.time) <= year(T1.time) order by T2.time desc
)
) as CO2
from ElecUsage T1
Group by year(T1.time), T1.UserID
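One way to check the subquery idea against the sample data is to run it in SQLite from Python. This is only a sketch under some assumptions: SQLite has no TOP or Year(), so LIMIT 1 and strftime('%Y', ...) stand in for them; the dates are stored as ISO strings (only the year matters here); and the factor lookup is moved into a derived table so that SUM runs over a plain column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ElecUsage (UserID INTEGER, Time TEXT, Amount REAL);
CREATE TABLE ElecEmissionFactor (Time TEXT, CO2Emission REAL);
INSERT INTO ElecUsage VALUES
  (1, '2010-07-01', 23230), (1, '2011-10-08', 34340), (1, '2011-01-08', 34300),
  (1, '2012-03-02', 43430), (1, '2013-02-04', 43560), (1, '2014-02-03', 44540),
  (2, '2014-06-03', 44000);
INSERT INTO ElecEmissionFactor VALUES
  ('2010-01-01', 0.5), ('2011-01-01', 0.55),
  ('2012-01-01', 0.56), ('2013-01-01', 0.57);
""")

query = """
SELECT UserID, Yr, SUM(Amount * Factor) AS CO2
FROM (
    -- For each usage row, pick the newest factor whose year is not later
    -- than the usage year (so 2014 falls back to the 2013 factor).
    SELECT u.UserID,
           strftime('%Y', u.Time) AS Yr,
           u.Amount,
           (SELECT f.CO2Emission
            FROM ElecEmissionFactor f
            WHERE strftime('%Y', f.Time) <= strftime('%Y', u.Time)
            ORDER BY f.Time DESC
            LIMIT 1) AS Factor
    FROM ElecUsage u
) AS usage_with_factor
GROUP BY UserID, Yr
ORDER BY UserID, Yr
"""
rows = conn.execute(query).fetchall()
for user_id, year, co2 in rows:
    print(user_id, year, round(co2, 1))
```

Joining ElecUser on UserID would add the UserName column; it is left out here to keep the sketch short.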

Related

SQL: Filter (rows) with maximum value within groups (columns)

Using SQL, I need to filter for the rows that have the maximum version within each month and location.
For example, the table below has versions 1 and 2 for June & NYC, and I want only the row for version 2, with revenue 11. For January & NYC, I want only the row with revenue 15.
Month Location Version Revenue
June NYC 1 10
June NYC 2 11
June LA 3 12
January NYC 1 13
January NYC 2 14
January NYC 3 15
January LA 1 16
January LA 2 17
Result:
Month Location Version Revenue
June NYC 2 11
June LA 3 12
January NYC 3 15
January LA 2 17
Edit: changed the column name to Revenue to remove confusion. I do not need the max value of revenue, only the revenue that goes with the max version for that month and that location.
You can also use joins as an alternative to correlated subqueries, e.g.:
select t1.* from YourTable t1 inner join
(
select t2.month, t2.location, max(t2.version) as mv
from YourTable t2
group by t2.month, t2.location
) q on t1.month = q.month and t1.location = q.location and t1.version = q.mv
Change YourTable to the name of your table.
A typical method is filtering using a correlated subquery:
select t.*
from t
where t.version = (select max(t2.version)
from t t2
where t2.month = t.month and t2.location = t.location
);
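As a quick check of the correlated-subquery form, here is a small sketch that runs it in SQLite from Python on the question's sample rows. The table name revenue_versions is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE revenue_versions (month TEXT, location TEXT, version INTEGER, revenue INTEGER);
INSERT INTO revenue_versions VALUES
  ('June', 'NYC', 1, 10), ('June', 'NYC', 2, 11), ('June', 'LA', 3, 12),
  ('January', 'NYC', 1, 13), ('January', 'NYC', 2, 14), ('January', 'NYC', 3, 15),
  ('January', 'LA', 1, 16), ('January', 'LA', 2, 17);
""")

# Keep a row only if its version equals the highest version
# found for the same month and location.
rows = conn.execute("""
SELECT t.month, t.location, t.version, t.revenue
FROM revenue_versions t
WHERE t.version = (SELECT MAX(t2.version)
                   FROM revenue_versions t2
                   WHERE t2.month = t.month AND t2.location = t.location)
ORDER BY t.revenue
""").fetchall()
for row in rows:
    print(row)
```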
Another alternative that minimizes subqueries is to use the row_number() window function. (You don't mention which database server you're using, but most of them support it.)
SELECT month, location, version, revenue
FROM (SELECT month, location, version, revenue
, row_number() OVER (PARTITION BY month, location ORDER BY version DESC) AS rn
FROM your_table) ranked
WHERE rn = 1;

Calculate difference in rows recorded at different timestamp in SQL

I have a table with data as follows
Person_ID Date Sale
1 2016-05-08 2686
1 2016-05-09 2688
1 2016-05-14 2689
1 2016-05-18 2691
1 2016-05-24 2693
1 2016-05-25 2694
1 2016-05-27 2695
and there are a million such IDs for different people. The sale count is recorded only when it increases; otherwise nothing is recorded. Therefore the data for ID 2 can be different from ID 1.
Person_ID Date Sale
2 2016-05-10 26
2 2016-05-20 29
2 2016-05-18 30
2 2016-05-22 39
2 2016-05-25 40
A sale count of 29 on 5/20 means he sold 3 products on the 20th, having sold 26 up to 5/10 with no sales between those two dates.
Question: I want SQL (or dynamic SQL) to calculate the daily sales of all the agents and produce a report as follows:
ID Sale_511 Sale_512 Sale_513 -------------- Sale_519 Sale_520
2 0 0 0 --------------- 0 3
(29-26)
The question is how to use that data to produce the report. Since I have no rows between 5/10 and 5/20, can I just write a query saying A - B = C?
Can anyone help? Thank you.
P.S. New to SQL, so still learning.
Using SQL Server 2008.
Most SQL dialects support the lag() function (SQL Server only from 2012 onward). You can get what you want as:
select person_id, date,
(sale - lag(sale) over (partition by person_id order by date)) as Daily_Sales
from t;
This produces one row per date for each person. This format is more typical for how SQL would return such results.
In SQL Server 2008, you can do:
select t.person_id, t.date,
(t.sale - t2.sale) as Daily_Sales
from t outer apply
(select top 1 t2.*
from t t2
where t2.person_id = t.person_id and t2.date < t.date
order by t2.date desc
) t2;
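To see the "previous row per person" idea in action without lag() or OUTER APPLY, here is a rough equivalent in SQLite run from Python. It is a sketch, not the answer's exact query: a correlated subquery with ORDER BY ... DESC LIMIT 1 plays the role of TOP 1, the table name sales is made up, and only the first two rows of person 2 are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (person_id INTEGER, date TEXT, sale INTEGER);
INSERT INTO sales VALUES
  (1, '2016-05-08', 2686), (1, '2016-05-09', 2688), (1, '2016-05-14', 2689),
  (1, '2016-05-18', 2691), (1, '2016-05-24', 2693), (1, '2016-05-25', 2694),
  (1, '2016-05-27', 2695),
  (2, '2016-05-10', 26), (2, '2016-05-20', 29);
""")

# For each row, subtract the running count from the same person's most
# recent earlier row. The first row per person has no predecessor, so
# the difference comes back as NULL (None in Python).
rows = conn.execute("""
SELECT s.person_id, s.date,
       s.sale - (SELECT p.sale
                 FROM sales p
                 WHERE p.person_id = s.person_id AND p.date < s.date
                 ORDER BY p.date DESC
                 LIMIT 1) AS daily_sales
FROM sales s
ORDER BY s.person_id, s.date
""").fetchall()
for row in rows:
    print(row)
```

Days with no row at all do not appear in this output; turning it into one column per day would still need a pivot or a calendar table on top.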

How to get latest date group by employee of a column but without another column

I'm working on a query in SQL 2005.
I'm trying to get the latest date on which a number column changed. The trick is that another column (rate) uses the same date column, so I end up fetching the wrong row.
An example will better explain my question.
This is my SQL table EmployeeRates:
----------------------------------
FkEmployee | Date | Rate | Number |
----------------------------------
1 2000 15 1.5
1 2001 16 1.5
1 2002 16 1.6
2 2000 12 1.5
2 2001 14 1.6
2 2002 15 1.6
So if I fetch the latest date, currently I have :
FkEmployee #1 = 2002 (which is correct because it's the latest date for the number column.)
FkEmployee #2 = 2002 (which is not what I want, because that year it was the rate that changed and there is a duplicate number) What I want is 2001.
The code I have right now (2015-08-10 14:15)
SELECT t1.FkEmployee, t1.Date
FROM EmployeeRates t1
INNER JOIN
(
SELECT FkEmployee, MAX(Date) AS MaxDate
FROM EmployeeRates
GROUP BY FkEmployee
)
t2 ON t1.FkEmployee = t2.FkEmployee
AND t1.Date = t2.MaxDate
ORDER BY t1.FkEmployee
Thanks to anybody who can help =)
This should work. First find the MIN date per employee and number, then take the MAX of those per employee. This gives the earliest date for each number, and then the latest of those dates for each employee:
SELECT t1.FkEmployee, t1.Date
FROM EmployeeRates t1
INNER JOIN
(SELECT FkEmployee, MAX(MinDate) AS MaxDate FROM
(SELECT FkEmployee, MIN(Date) AS MinDate
FROM EmployeeRates
GROUP BY FkEmployee, Number) a
GROUP BY FkEmployee
)
t2 ON t1.FkEmployee = t2.FkEmployee
AND t1.Date = t2.MaxDate
ORDER BY t1.FkEmployee
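The MIN-then-MAX step can be tried on the sample table with a short SQLite script in Python. This sketch keeps only the two nested aggregations and skips the join back to the base table; the alias LastNumberChange is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmployeeRates (FkEmployee INTEGER, Date INTEGER, Rate INTEGER, Number REAL);
INSERT INTO EmployeeRates VALUES
  (1, 2000, 15, 1.5), (1, 2001, 16, 1.5), (1, 2002, 16, 1.6),
  (2, 2000, 12, 1.5), (2, 2001, 14, 1.6), (2, 2002, 15, 1.6);
""")

# Inner query: the first date on which each (employee, number) pair appears.
# Outer query: the latest of those first appearances per employee.
rows = conn.execute("""
SELECT FkEmployee, MAX(MinDate) AS LastNumberChange
FROM (SELECT FkEmployee, Number, MIN(Date) AS MinDate
      FROM EmployeeRates
      GROUP BY FkEmployee, Number) AS first_seen
GROUP BY FkEmployee
ORDER BY FkEmployee
""").fetchall()
print(rows)
```

Note the built-in assumption: a number never goes back to a value the employee had earlier, otherwise the MIN date for that value would point at its first appearance rather than the latest change.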

Creating a SQL query to find recent field changes

I am having problems trying to create a SQL query to select the most recent change in hours and the difference from the previously recorded value.
The table is shown below; the database keeps all the historical versions together, one row per revision:
Item ID Title RevisedDate ChangedDate Rev WorkHours
Task 187061 Development 10/9/12 11:14 10/5/12 15:54 1 4
Task 187061 Development 10/9/12 14:29 10/9/12 11:14 2 8
Task 187061 Development 10/10/12 15:07 10/9/12 14:29 3 16
Task 187061 Development 10/11/12 9:59 10/10/12 15:07 4 16
Task 187061 Development 10/12/12 10:51 10/11/12 9:59 5 16
Task 187061 Development 12/6/12 15:25 10/12/12 10:51 6 16
Task 187061 Development 12/11/12 10:27 12/6/12 15:25 7 16
Task 187061 Development 1/1/99 0:00 12/11/12 10:27 8 16
So the task's work hours were most recently updated on 10/10/12 15:07, from 8 hrs to 16 hrs. I am having trouble creating a query that tells me this.
At the end of the day I need a result like this:
Item ID Title RevisedDate ChangedDate Rev WorkHours ChangeHours
Task 187061 Development 10/10/12 15:07 10/9/12 14:29 3 16 8
(P.S. I took one task as an example; the actual table has hundreds of tasks, each with several historical versions.)
As I understand your question, you want the item with the most recent revised date for each ID.
You get that like this:
SELECT *
FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RevisedDate DESC) as ord
FROM YourTable -- placeholder for your table name
) T
WHERE ord = 1
If you want the most recent revision where the hours actually changed, that is harder:
-- First find the ones that changed (YourTable is a placeholder name)
WITH FlagChange AS
(
SELECT T1.ID, T1.Rev, T1.RevisedDate,
CASE WHEN T2.ID IS NULL THEN 0 -- no previous revision
WHEN T1.WorkHours <> T2.WorkHours THEN 1
ELSE 0 END AS Changed
FROM YourTable T1
LEFT JOIN YourTable T2 ON T1.ID = T2.ID AND T2.Rev = T1.Rev - 1
), NumberChange AS -- now use row number
(
SELECT ID, Rev,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY RevisedDate DESC) AS ord
FROM FlagChange
WHERE Changed = 1
), SelectRecent AS -- take the newest ones
(
SELECT ID, Rev
FROM NumberChange
WHERE ord = 1
) -- add in all the data and the ones with one revision
SELECT T1.*
FROM YourTable T1
JOIN SelectRecent SR ON T1.ID = SR.ID AND T1.Rev = SR.Rev
UNION ALL
SELECT *
FROM YourTable
WHERE ID NOT IN (SELECT ID FROM SelectRecent)
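For the sample task, the "latest revision whose hours differ from the previous revision" logic can be checked with a small SQLite script in Python. This is a simplified sketch, not the query above: it orders by Rev instead of RevisedDate (assuming Rev increases by one with each version), loads only the ID, Rev and WorkHours columns, uses the made-up table name TaskHistory, and also computes the ChangeHours column the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TaskHistory (ID INTEGER, Rev INTEGER, WorkHours INTEGER);
INSERT INTO TaskHistory VALUES
  (187061, 1, 4), (187061, 2, 8), (187061, 3, 16), (187061, 4, 16),
  (187061, 5, 16), (187061, 6, 16), (187061, 7, 16), (187061, 8, 16);
""")

rows = conn.execute("""
WITH Changes AS (
    -- Pair every revision with its predecessor and keep only the
    -- pairs where the hours differ.
    SELECT cur.ID, cur.Rev, cur.WorkHours,
           cur.WorkHours - prev.WorkHours AS ChangeHours
    FROM TaskHistory cur
    JOIN TaskHistory prev ON prev.ID = cur.ID AND prev.Rev = cur.Rev - 1
    WHERE cur.WorkHours <> prev.WorkHours
)
-- Of those changes, keep the one with the highest revision per task.
SELECT c.ID, c.Rev, c.WorkHours, c.ChangeHours
FROM Changes c
WHERE c.Rev = (SELECT MAX(c2.Rev) FROM Changes c2 WHERE c2.ID = c.ID)
""").fetchall()
print(rows)
```

Tasks whose hours never changed produce no row here; whether they should appear in the result is a separate decision.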

Select info from table where row has max date

My table looks something like this:
group date cash checks
1 1/1/2013 0 0
2 1/1/2013 0 800
1 1/3/2013 0 700
3 1/1/2013 0 600
1 1/2/2013 0 400
3 1/5/2013 0 200
-- Do not need cash just demonstrating that table has more information in it
I want to get, for each unique group, the row where date is the max and checks is greater than 0. So the return would look something like:
group date checks
2 1/1/2013 800
1 1/3/2013 700
3 1/5/2013 200
attempted code:
SELECT group,MAX(date),checks
FROM table
WHERE checks>0
GROUP BY group
ORDER BY group DESC
The problem with that, though, is that it gives me all the dates and checks rather than just the max-date row.
Using MS SQL Server 2005.
SELECT group,MAX(date) as max_date
FROM table
WHERE checks>0
GROUP BY group
That works to get the max date. Join it back to your data to get the other columns:
Select t.group, a.max_date, t.checks
from table t
inner join
(SELECT group,MAX(date) as max_date
FROM table
WHERE checks>0
GROUP BY group)a
on a.group = t.group and a.max_date = t.date
The inner join acts as the filter that keeps only the max-date record.
FYI, your column names are horrid; don't use reserved words (group, date, table) as identifiers.
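Here is the join-back approach run against the sample rows in SQLite from Python, as a rough check. The names deposits, grp and dt are made up to avoid the reserved words, and the dates are assumed to be month/day/year and stored as ISO strings so that MAX() compares them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE deposits (grp INTEGER, dt TEXT, cash INTEGER, checks INTEGER);
INSERT INTO deposits VALUES
  (1, '2013-01-01', 0, 0), (2, '2013-01-01', 0, 800), (1, '2013-01-03', 0, 700),
  (3, '2013-01-01', 0, 600), (1, '2013-01-02', 0, 400), (3, '2013-01-05', 0, 200);
""")

# The derived table finds the latest date with checks > 0 per group;
# joining it back to the base table pulls in that row's checks value.
rows = conn.execute("""
SELECT t.grp, a.max_dt, t.checks
FROM deposits t
INNER JOIN (SELECT grp, MAX(dt) AS max_dt
            FROM deposits
            WHERE checks > 0
            GROUP BY grp) a
        ON a.grp = t.grp AND a.max_dt = t.dt
ORDER BY t.grp
""").fetchall()
for row in rows:
    print(row)
```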
You can use a window MAX() like this:
SELECT
*,
max_date = MAX(date) OVER (PARTITION BY group)
FROM table
to get max dates per group alongside other data:
group date cash checks max_date
----- -------- ---- ------ --------
1 1/1/2013 0 0 1/3/2013
2 1/1/2013 0 800 1/1/2013
1 1/3/2013 0 700 1/3/2013
3 1/1/2013 0 600 1/5/2013
1 1/2/2013 0 400 1/3/2013
3 1/5/2013 0 200 1/5/2013
Using the above output as a derived table, you can then get only rows where date matches max_date:
SELECT
group,
date,
checks
FROM (
SELECT
*,
max_date = MAX(date) OVER (PARTITION BY group)
FROM table
) AS s
WHERE date = max_date
;
to get the desired result.
Basically, this is similar to @Twelfth's suggestion but avoids a join and may thus be more efficient.
You can try the method at SQL Fiddle.
Using IN can have a performance impact. Joining two subqueries will not have the same performance impact and can be done like this:
SELECT *
FROM (SELECT msisdn
,callid
,Change_color
,play_file_name
,date_played
FROM insert_log
WHERE play_file_name NOT IN('Prompt1','Conclusion_Prompt_1','silent')
ORDER BY callid ASC) t1
JOIN (SELECT MAX(date_played) AS date_played
FROM insert_log GROUP BY callid) t2
ON t1.date_played = t2.date_played
SELECT distinct
group,
max_date = MAX(date) OVER (PARTITION BY group), checks
FROM table
Should work.