Sql Server - Joining subqueries using calculated fields - sql

I am trying to calculate the percentage change in price between days. As the days are not consectutive, I build into the query a calculated field that tells me what relative day it is (day 1, day 2, etc). In order to compare today with yesterday, I offset the calculated day number by 1 in a subquery. what I want to do is to join the inner and outer query on the calculated relative day. The code I came up with is:
SELECT TOP 11
P.Date,
(AVG(P.SettlementPri) - PriceY) / PriceY as PriceChange,
P.Symbol,
(RANK() OVER (ORDER BY P.Date desc)) as dayrank_Today
FROM OTE P
JOIN (SELECT TOP 11
C.Date,
AVG(SettlementPri) as PriceY,
(RANK() OVER (ORDER BY C.Date desc))+1 as dayrank_Yest
FROM OTE C
WHERE C.ComCode = 'C-'
GROUP BY c.Date) C ON dayrank_Today = C.dayrank_Yest
WHERE P.ComCode = 'C-'
GROUP BY P.Symbol, P.Date
If I try and execute the query, I get an erro message indicating dayrank_Today is an invalid column. I have tried renaming it, qualifying it, yell obsenities at it and I get squat. Still an error.

You can't do a select of a calculated column, and then use it in a join. You can use CTEs, which I'm not so familiar with, or you can jsut do table selects like so:
SELECT
P.Date,
(AVG(AvgPrice) - C.PriceY) / C.PriceY as PriceChange,
P.Symbol,
P.dayrank_Today FROM
(SELECT TOP 11
ComCode,
Date,
AVG(SettlementPri) as AvgPrice,
Symbol,
(RANK() OVER (ORDER BY Date desc)) as dayrank_Today
FROM OTE WHERE ComCode = 'C-') P
JOIN (SELECT TOP 11
C.Date,
AVG(SettlementPri) as PriceY,
(RANK() OVER (ORDER BY C.Date desc))+1 as dayrank_Yest
FROM OTE C
WHERE C.ComCode = 'C-'
GROUP BY c.Date) C ON dayrank_Today = C.dayrank_Yest
GROUP BY P.Symbol, P.Date

If possible consider using a CTE as it makes it very easy. Something like this:
With Raw as
(
SELECT TOP 11 C.Date,
Avg(SettlementPri) As PriceY,
Rank() OVER (ORDER BY C.Date desc) as dayrank
FROM OTE C WHERE C.Comcode = 'C-'
Group by C.Date
)
select today.pricey as todayprice ,
yesterday.pricey as yesterdayprice,
(today.pricey - yesterday.pricey)/today.pricey * 100 as percentchange
from Raw today
left outer join Raw yesterday on today.dayrank = yesterday.dayrank + 1
Obviously this doesn;t include the symbol but that can be included pretty easily.
If using 'With' syntax doesn;t suit you can also use calculated fields with Outer Apply http://technet.microsoft.com/en-us/library/ms175156.aspx
Although the CTE will mean that you only need to write your price calculation once which is a lot cleaner
Cheers

I had the same problem and found this thread and found a solution so I thought I'd post it here.
Instead of using the column name as parameter for ON, copy the statement that gave you the colmun name in the first place:
replace:
ON dayrank_Today = C.dayrank_Yest
with:
ON (RANK() OVER (ORDER BY Date desc)) = C.dayrank_Yest
Granted, you're displeasing the Programming Gods by violating DRY, but you could be pragmatic and mention the duplication in the comments, which should appease their wrath to a mild grumbling.

Related

SQL get top 3 values / bottom 3 values with group by and sum

I am working on a restaurant management system. There I have two tables
order_details(orderId,dishId,createdAt)
dishes(id,name,imageUrl)
My customer wants to see a report top 3 selling items / least selling 3 items by the month
For the moment I did something like this
SELECT
*
FROM
(SELECT
SUM(qty) AS qty,
order_details.dishId,
MONTHNAME(order_details.createdAt) AS mon,
dishes.name,
dishes.imageUrl
FROM
rms.order_details
INNER JOIN dishes ON order_details.dishId = dishes.id
GROUP BY order_details.dishId , MONTHNAME(order_details.createdAt)) t
ORDER BY t.qty
This gives me all the dishes sold count order by qty.
I have to manually filter max 3 records and reject the rest. There should be a SQL way of doing this. How do I do this in SQL?
You would use row_number() for this purpose. You don't specify the database you are using, so I am guessing at the appropriate date functions. I also assume that you mean a month within a year, so you need to take the year into account as well:
SELECT ym.*
FROM (SELECT YEAR(od.CreatedAt) as yyyy,
MONTH(od.createdAt) as mm,
SUM(qty) AS qty,
od.dishId, d.name, d.imageUrl,
ROW_NUMBER() OVER (PARTITION BY YEAR(od.CreatedAt), MONTH(od.createdAt) ORDER BY SUM(qty) DESC) as seqnum_desc,
ROW_NUMBER() OVER (PARTITION BY YEAR(od.CreatedAt), MONTH(od.createdAt) ORDER BY SUM(qty) DESC) as seqnum_asc
FROM rms.order_details od INNER JOIN
dishes d
ON od.dishId = d.id
GROUP BY YEAR(od.CreatedAt), MONTH(od.CreatedAt), od.dishId
) ym
WHERE seqnum_asc <= 3 OR
seqnum_desc <= 3;
Using the above info i used i combination of group by, order by and limit
as shown below. I hope this is what you are looking for
SELECT
t.qty,
t.dishId,
t.month,
d.name,
d.mageUrl
from
(
SELECT
od.dishId,
count(od.dishId) AS 'qty',
date_format(od.createdAt,'%Y-%m') as 'month'
FROM
rms.order_details od
group by date_format(od.createdAt,'%Y-%m'),od.dishId
order by qty desc
limit 3) t
join rms.dishes d on (t.dishId = d.id)

calculate time difference of consecutive row dates in SQL

Hello I am trying to calculate the time difference of 2 consecutive rows for Date (either in hours or Days), as attached in the image
Highlighted in Yellow is the result I want which is basically the difference of the date in that row and 1 above.
How can we achieve it in the SQL? Attached is my complex code which has the rest of the fields in it
with cte
as
(
select m.voucher_no, CONVERT(VARCHAR(30),CONVERT(datetime, f.action_Date, 109),100) as action_date,f.col1_Value,f.col3_value,f.col4_value,f.comments,f.distr_user,f.wf_status,f.action_code,f.wf_user_id
from attdetailmap m
LEFT JOIN awftaskfin f ON f.oid = m.oid and f.client ='PC'
where f.action_Date !='' and action_date between '$?datef' and '$?datet'
),
.*select *, ROW_NUMBER() OVER(PARTITION BY action_Date,distr_user,wf_Status,wf_user_id order by action_Date,distr_user,wf_Status,wf_user_id ) as row_no_1 from cte
cte2 as
(
select *, ROW_NUMBER() OVER(PARTITION BY voucher_no,action_Date,distr_user,wf_Status,wf_user_id order by voucher_no ) as row_no_1 from cte
)
select distinct(v.dim_value) as resid,c.voucher_no,CONVERT(datetime, c.action_Date, 109) as action_Date,c.col4_value,c.comments,c.distr_user,v.description,c.wf_status,c.action_code, c.wf_user_id,v1.description as name,r.rel_value as pay_office,r1.rel_value as site
from cte2 c
LEFT OUTER JOIN aagviuserdetail v ON v.user_id = c.distr_user
LEFT OUTER JOIN aagviuserdetail v1 ON v1.user_id = c.wf_user_id
LEFT OUTER JOIN ahsrelvalue r ON r.resource_id = v.dim_Value and r.rel_Attr_id = 'P1' and r.period_to = '209912'
LEFT OUTER JOIN ahsrelvalue r1 ON r1.resource_id = v.dim_Value and r1.rel_Attr_id = 'Z1' and r1.period_to = '209912'
where c.row_no_1 = '1' and r.rel_value like '$?site1' and voucher_no like '$?trans'
order by voucher_no,action_Date
The key idea is lag(). However, date/time functions vary among databases. So, the idea is:
select t.*,
(date - lag(date) over (partition by transaction_no order by date)) as diff
from t;
I should note that this exact syntax might not work in your database -- because - may not even be defined on date/time values. However, lag() is a standard function and should be available.
For instance, in SQL Server, this would look like:
select t.*,
datediff(second, lag(date) over (partition by transaction_no order by date), date) / (24.0 * 60 * 60) as diff_days
from t;

Using a date field for matching SQL Query

I'm having a bit of an issue wrapping my head around the logic of this changing dimension. I would like to associate these two tables below. I need to match the Cost - Period fact table to the cost dimension based on the Id and the effective date.
As you can see - if the month and year field is greater than the effective date of its associated Cost dimension, it should adopt that value. Once a new Effective Date is entered into the dimension, it should use that value for any period greater than said date going forward.
EDIT: I apologize for the lack of detail but the Cost Dimension will actually have a unique Index value and the changing fields to reference for the matching would be Resource, Project, Cost. I tried to match the query you provided with my fields, but I'm getting the incorrect output.
FYI: Naming convention change: EngagementId is Id, Resource is ConsultantId, and Project is ProjectId
I've changed the images below and here is my query
,_cte(HoursWorked, HoursBilled, Month, Year, EngagementId, ConsultantId, ConsultantName, ProjectId, ProjectName, ProjectRetainer, RoleId, Role, Rate, ConsultantRetainer, Salary, amount, EffectiveDate)
as
(
select sum(t.Duration), 0, Month(t.StartDate), Year(t.StartDate), t.EngagementId, c.ConsultantId, c.ConsultantName, c.ProjectId, c.ProjectName, c.ProjectRetainer, c.RoleId, c.Role, c.Rate, c.ConsultantRetainer,
c.Salary, 0, c.EffectiveDate
from timesheet t
left join Engagement c on t.EngagementId = c.EngagementId and Month(c.EffectiveDate) = Month(t.EndDate) and Year(c.EffectiveDate) = Year(t.EndDate)
group by Month(t.StartDate), Year(t.StartDate), t.EngagementId, c.ConsultantName, c.ConsultantId, c.ProjectId, c.ProjectName, c.ProjectRetainer, c.RoleId, c.Role, c.Rate, c.ConsultantRetainer,
c.Salary, c.EffectiveDate
)
select * from _cte where EffectiveDate is not null
union
select _cte.HoursWorked, _cte.HoursBilled, _cte.Month, _cte.Year, _cte.EngagementId, _cte.ConsultantId, _cte.ConsultantName, _cte.ProjectId, _Cte.ProjectName, _cte.ProjectRetainer, _cte.RoleId, _cte.Role, sub.Rate, _cte.ConsultantRetainer,_cte.Salary, _cte.amount, sub.EffectiveDate
from _cte
outer apply (
select top 1 EffectiveDate, Rate
from Engagement e
where e.ConsultantId = _cte.ConsultantId and e.ProjectId = _cte.ProjectId and e.RoleId = _cte.RoleId
and Month(e.EffectiveDate) < _cte.Month and Year(e.EffectiveDate) < _cte.Year
order by EffectiveDate desc
) sub
where _cte.EffectiveDate is null
Example:
I'm struggling with writing the query that goes along with this. At first I attempted to partition by greatest date. However, when I executed the join I got the highest effective date for every single period (even those prior to the effective date).
Is this something that can be accomplished in a query or should I be focusing on incremental updates of the destination table so that any effective date / time period in the past is left alone?
Any tips would be great!
Thanks,
Channing
Try this one:
; with _CTE as(
select p.* , c.EffectiveDate, c.Cost
from period p
left join CostDimension c on p.id = c.id and p.Month = DATEPART(month, c.EffectiveDate) and p.year = DATEPART (year, EffectiveDate)
)
select * from _CTE Where EffectiveDate is not null
Union
select _CTE.id, _CTE.Month, _CTE.Year, sub.EffectiveDate, sub.Cost
from _CTE
outer apply (select top 1 EffectiveDate, Cost
from CostDimension as cd
where cd.Id = _CTE.id and cd.EffectiveDate < DATETIMEFROMPARTS(_CTE.Year, _CTE.Month, 1, 0, 0, 0, 0)
order by EffectiveDate desc
) sub
where _Cte.EffectiveDate is null

Trying to create a SQL query

I am trying to create a query that retrieves only the ten companies with the highest number of pickups over the six-month period, this means pickup occasions, and not the number of items picked up.
I have done this
SELECT *
FROM customer
JOIN (SELECT manifest.pickup_customer_ref reference,
DENSE_RANK() OVER (PARTITION BY manifest.pickup_customer_ref ORDER BY COUNT(manifest.trip_id) DESC) rnk
FROM manifest
INNER JOIN trip ON manifest.trip_id = trip.trip_id
WHERE trip.departure_date > TRUNC(SYSDATE) - interval '6' month
GROUP BY manifest.pickup_customer_ref) cm ON customer.reference = cm.reference
WHERE cm.rnk < 11;
this uses dense_rank to determine the order or customers with the highest number of trips first
Hmm well i don't have Oracle so I can't test it 100%, but I believe your looking for something like the following:
Keep in mind that when you use group by, you have to narrow down to the same fields you group by in the select. Hope this helps at least give you an idea of what to look at.
select TOP 10
c.company_name,
m.pickup_customer_ref,
count(*) as 'count'
from customer c
inner join mainfest m on m.pickup_customer_ref = c.reference
inner join trip t on t.trip_id = m.trip_id
where t.departure_date < DATEADD(month, -6, GETDATE())
group by c.company_name, m.pickup_customer_ref
order by 'count', c.company_name, m.pickup_customer_ref desc

Showing the closest date and not the rest

Thanks for looking.
I am trying to get the smallest time between visits per customer. With the below code I am getting the amount of time between each visit, but it is comparing every date. I just need the smallest margin and then to get rid of/hide all the others.
select a.account_id, a.transaction_id, b.transaction_date , a.transaction_date, round(a.transaction_date - b.transaction_date, 0) as Time_between
from time_between_trans a, time_between_trans b
where a.transaction_date > b.transaction_date
and a.account_ID like '717724'
and a.account_id = b.account_id
for each date in transaction_date find the closest date to it in the
trasactions
They must be of the same account_id and the second date must be later
than the first
You need to add a group by to your query:
select a.account_id, a.transaction_id, min(time_between) as min_time_between
from (select a.account_id, a.transaction_id, b.transaction_date, a.transaction_date,
round(a.transaction_date - b.transaction_date, 0) as Time_between
from time_between_trans a join
time_between_trans b
on a.transaction_date > b.transaction_date and
a.account_ID ='717724' and
a.account_id = b.account_id
) a
group by a.account_id, a.transaction_id
I also fixed your join syntax to use proper join syntax, and changed the "like" to an "=", since you have no wildcards.
There may be other methods to express this, but this is standard SQL. You should specify the database you are using in your question.
If you are using Oracle, then just do the following:
select a.*,
(case when nextdt - transaction_date) < transaction_date - nextdt
then nextdt - transaction_date)
else transaction_date - nextdt
end) as mintime
from (select a.*,
lead(transaction_date, 1) over (partition by account_id order by transaction_date) as nexttd,
lag(transaction_date, 1) over (partition by account_id order by transaction_date) as prevtd
from time_between_trans a
) a
Actually, this isn't quite correct, because you have to take NULLs into account for nexttd and prevtd. But, the idea is simple. The lead and lag functions let you take the prev or text transaction, and then you can do whatever you want to find the minimum -- or get whatever other information you would like from the records.