Trying to create a SQL query - sql

I am trying to create a query that retrieves only the ten companies with the highest number of pickups over the six-month period. This means pickup occasions, not the number of items picked up.
I have done this:

SELECT *
FROM customer
JOIN (SELECT manifest.pickup_customer_ref reference,
DENSE_RANK() OVER (PARTITION BY manifest.pickup_customer_ref ORDER BY COUNT(manifest.trip_id) DESC) rnk
FROM manifest
INNER JOIN trip ON manifest.trip_id = trip.trip_id
WHERE trip.departure_date > TRUNC(SYSDATE) - interval '6' month
GROUP BY manifest.pickup_customer_ref) cm ON customer.reference = cm.reference
WHERE cm.rnk < 11;
This uses DENSE_RANK to order the customers, with the highest number of trips first.

Hmm, well, I don't have Oracle so I can't test it 100%, but I believe you're looking for something like the following:
Keep in mind that when you use GROUP BY, every non-aggregated column in the SELECT list must also appear in the GROUP BY. Hope this at least gives you an idea of what to look at.
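-- SQL Server syntax (TOP, DATEADD, GETDATE); adapt for Oracle
select TOP 10
c.company_name,
m.pickup_customer_ref,
count(*) as pickup_count
from customer c
inner join manifest m on m.pickup_customer_ref = c.reference
inner join trip t on t.trip_id = m.trip_id
-- keep only trips that departed within the last six months
where t.departure_date > DATEADD(month, -6, GETDATE())
group by c.company_name, m.pickup_customer_ref
-- highest counts first, so TOP 10 keeps the busiest companies
order by pickup_count desc, c.company_name, m.pickup_customer_ref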
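For Oracle itself, the same idea might be sketched as below. This is an untested sketch under stated assumptions: the sample rows are invented, company_name is taken from the answer above rather than from the question, and each manifest row is assumed to be one item, so COUNT(DISTINCT trip_id) counts pickup occasions rather than items. The ranking window has no PARTITION BY, because partitioning by the same column the query groups by leaves one row per partition and every rank comes out as 1. ADD_MONTHS is used for the cutoff because subtracting a month interval can raise an error on some end-of-month dates.

```sql
-- Oracle 11gR2+ sketch; all sample data below is hypothetical.
WITH customer (reference, company_name) AS (
  SELECT 1, 'Acme'   FROM dual UNION ALL
  SELECT 2, 'Globex' FROM dual
),
trip (trip_id, departure_date) AS (
  SELECT 10, TRUNC(SYSDATE) - 5   FROM dual UNION ALL
  SELECT 11, TRUNC(SYSDATE) - 40  FROM dual UNION ALL
  SELECT 12, TRUNC(SYSDATE) - 400 FROM dual   -- older than six months
),
manifest (trip_id, pickup_customer_ref) AS (
  SELECT 10, 1 FROM dual UNION ALL   -- two items on the same trip:
  SELECT 10, 1 FROM dual UNION ALL   -- one pickup occasion
  SELECT 11, 1 FROM dual UNION ALL
  SELECT 11, 2 FROM dual UNION ALL
  SELECT 12, 2 FROM dual             -- removed by the date filter
),
ranked AS (
  SELECT m.pickup_customer_ref AS reference,
         COUNT(DISTINCT m.trip_id) AS pickups,
         -- rank across all customers, so no PARTITION BY here
         DENSE_RANK() OVER (ORDER BY COUNT(DISTINCT m.trip_id) DESC) AS rnk
  FROM manifest m
  JOIN trip t ON t.trip_id = m.trip_id
  WHERE t.departure_date > ADD_MONTHS(TRUNC(SYSDATE), -6)
  GROUP BY m.pickup_customer_ref
)
SELECT c.reference, c.company_name, r.pickups, r.rnk
FROM customer c
JOIN ranked r ON r.reference = c.reference
WHERE r.rnk <= 10
ORDER BY r.rnk, c.company_name;
```

With these rows, Acme ranks first with two pickup occasions and Globex second with one. Note that DENSE_RANK can return more than ten rows when companies tie; ROW_NUMBER would cut off at exactly ten.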

Related

SQL get top 3 values / bottom 3 values with group by and sum

I am working on a restaurant management system. There I have two tables
order_details(orderId,dishId,createdAt)
dishes(id,name,imageUrl)
My customer wants to see a report of the top 3 selling items and the 3 least-selling items for each month.
For the moment I did something like this:
SELECT
*
FROM
(SELECT
SUM(qty) AS qty,
order_details.dishId,
MONTHNAME(order_details.createdAt) AS mon,
dishes.name,
dishes.imageUrl
FROM
rms.order_details
INNER JOIN dishes ON order_details.dishId = dishes.id
GROUP BY order_details.dishId , MONTHNAME(order_details.createdAt)) t
ORDER BY t.qty
This gives me the quantity sold for every dish, ordered by qty.
I have to manually pick out the top 3 records and discard the rest. There should be a way of doing this in SQL. How do I do it?
You would use row_number() for this purpose. You don't specify the database you are using, so I am guessing at the appropriate date functions. I also assume that you mean a month within a year, so you need to take the year into account as well:
SELECT ym.*
FROM (SELECT YEAR(od.CreatedAt) as yyyy,
MONTH(od.createdAt) as mm,
SUM(qty) AS qty,
od.dishId, d.name, d.imageUrl,
ROW_NUMBER() OVER (PARTITION BY YEAR(od.CreatedAt), MONTH(od.createdAt) ORDER BY SUM(qty) DESC) as seqnum_desc,
ROW_NUMBER() OVER (PARTITION BY YEAR(od.CreatedAt), MONTH(od.createdAt) ORDER BY SUM(qty) ASC) as seqnum_asc
FROM rms.order_details od INNER JOIN
dishes d
ON od.dishId = d.id
GROUP BY YEAR(od.CreatedAt), MONTH(od.CreatedAt), od.dishId, d.name, d.imageUrl
) ym
WHERE seqnum_asc <= 3 OR
seqnum_desc <= 3;
Using the above info, I used a combination of GROUP BY, ORDER BY and LIMIT,
as shown below. I hope this is what you are looking for:
SELECT
t.qty,
t.dishId,
t.month,
d.name,
d.imageUrl
from
(
SELECT
od.dishId,
count(od.dishId) AS 'qty',
date_format(od.createdAt,'%Y-%m') as 'month'
FROM
rms.order_details od
group by date_format(od.createdAt,'%Y-%m'),od.dishId
order by qty desc
limit 3) t
join rms.dishes d on (t.dishId = d.id)

oracle query for get 20 top agency which have pass issued

I have a query that shows the passes issued by each agency, by date. I want to get the top 20 agencies that issued the most passes. Here is my query:
Nothing in your data identifies an "agency". If I assume you mean "agent", you can get the top 20 by aggregating and then limiting the result. In Oracle 12c+, you can use:
SELECT gp.agent_id, a.agent_name, COUNT(*)
FROM eofficeuat.gatepass gp INNER JOIN
eofficeuat.cnf_agents a
ON gp.agent_id = a.agent_id INNER JOIN
eofficeuat.cardprintlog_user u
ON gp.agent_id = u.agent_id
WHERE gp.issuedatetime BETWEEN DATE '2019-09-28' AND DATE '2019-09-29'
GROUP BY gp.agent_id, a.agent_name
ORDER BY COUNT(*) DESC
FETCH FIRST 20 ROWS ONLY;
In earlier versions, a subquery is needed:
SELECT *
FROM (SELECT gp.agent_id, a.agent_name, COUNT(*)
FROM eofficeuat.gatepass gp INNER JOIN
eofficeuat.cnf_agents a
ON gp.agent_id = a.agent_id INNER JOIN
eofficeuat.cardprintlog_user u
ON gp.agent_id = u.agent_id
WHERE gp.issuedatetime BETWEEN DATE '2019-09-28' AND DATE '2019-09-29'
GROUP BY gp.agent_id, a.agent_name
ORDER BY COUNT(*) DESC
) a
WHERE rownum <= 20;
Obviously, if you do mean "agency" and that is identified by different columns, you would just adjust the SELECT and GROUP BY clauses.
Also, I would advise you never to use BETWEEN on dates in Oracle. There is a time component that might cause issues.
If you intend only times on '2019-09-28', then:
gp.issuedatetime >= DATE '2019-09-28' AND
gp.issuedatetime < DATE '2019-09-29'
If you intend both the 28th and 29th:
gp.issuedatetime >= DATE '2019-09-28' AND
gp.issuedatetime < DATE '2019-09-30'
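To make the time-component issue concrete, here is a small self-contained sketch. The three timestamps are invented sample data, and the inline gatepass rows stand in for the real table. It counts the same rows with BETWEEN and with the half-open range:

```sql
-- Oracle sketch; the gatepass rows below are hypothetical sample data.
WITH gatepass (issuedatetime) AS (
  SELECT TO_DATE('2019-09-28 09:15', 'YYYY-MM-DD HH24:MI') FROM dual UNION ALL
  SELECT TO_DATE('2019-09-29 00:00', 'YYYY-MM-DD HH24:MI') FROM dual UNION ALL
  SELECT TO_DATE('2019-09-29 14:30', 'YYYY-MM-DD HH24:MI') FROM dual
)
SELECT
  -- DATE '2019-09-29' means midnight, so the 14:30 row falls outside BETWEEN
  COUNT(CASE WHEN issuedatetime BETWEEN DATE '2019-09-28' AND DATE '2019-09-29'
             THEN 1 END) AS between_rows,
  -- the half-open range covers all of the 28th and the 29th
  COUNT(CASE WHEN issuedatetime >= DATE '2019-09-28'
              AND issuedatetime <  DATE '2019-09-30'
             THEN 1 END) AS half_open_rows
FROM gatepass;
```

Here between_rows is 2 and half_open_rows is 3: the pass issued at 14:30 on the 29th is silently dropped by BETWEEN.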
Oracle has no LIMIT clause, but you can use the row-limiting clause (FETCH FIRST, available in 12c or later) to get the top 20 records, as follows:
SELECT eofficeuat.gatepass.agent_id, eofficeuat.cnf_agents.agent_name, COUNT(1) as cnt
FROM eofficeuat.gatepass INNER JOIN
eofficeuat.cnf_agents
ON eofficeuat.gatepass.agent_id = eofficeuat.cnf_agents.agent_id INNER JOIN
eofficeuat.cardprintlog_user
ON eofficeuat.gatepass.agent_id = eofficeuat.cardprintlog_user.agent_id
WHERE eofficeuat.gatepass.issuedatetime BETWEEN DATE '2019-09-28' AND DATE '2019-09-29'
GROUP BY eofficeuat.gatepass.agent_id, eofficeuat.cnf_agents.agent_name
ORDER BY cnt DESC
FETCH FIRST 20 ROWS ONLY; -- this will fetch top 20 agents
Cheers!!

Using a date field for matching SQL Query

I'm having a bit of an issue wrapping my head around the logic of this changing dimension. I would like to associate these two tables below. I need to match the Cost - Period fact table to the cost dimension based on the Id and the effective date.
As you can see - if the month and year field is greater than the effective date of its associated Cost dimension, it should adopt that value. Once a new Effective Date is entered into the dimension, it should use that value for any period greater than said date going forward.
EDIT: I apologize for the lack of detail but the Cost Dimension will actually have a unique Index value and the changing fields to reference for the matching would be Resource, Project, Cost. I tried to match the query you provided with my fields, but I'm getting the incorrect output.
FYI: Naming convention change: EngagementId is Id, Resource is ConsultantId, and Project is ProjectId
I've changed the images below and here is my query
,_cte(HoursWorked, HoursBilled, Month, Year, EngagementId, ConsultantId, ConsultantName, ProjectId, ProjectName, ProjectRetainer, RoleId, Role, Rate, ConsultantRetainer, Salary, amount, EffectiveDate)
as
(
select sum(t.Duration), 0, Month(t.StartDate), Year(t.StartDate), t.EngagementId, c.ConsultantId, c.ConsultantName, c.ProjectId, c.ProjectName, c.ProjectRetainer, c.RoleId, c.Role, c.Rate, c.ConsultantRetainer,
c.Salary, 0, c.EffectiveDate
from timesheet t
left join Engagement c on t.EngagementId = c.EngagementId and Month(c.EffectiveDate) = Month(t.EndDate) and Year(c.EffectiveDate) = Year(t.EndDate)
group by Month(t.StartDate), Year(t.StartDate), t.EngagementId, c.ConsultantName, c.ConsultantId, c.ProjectId, c.ProjectName, c.ProjectRetainer, c.RoleId, c.Role, c.Rate, c.ConsultantRetainer,
c.Salary, c.EffectiveDate
)
select * from _cte where EffectiveDate is not null
union
select _cte.HoursWorked, _cte.HoursBilled, _cte.Month, _cte.Year, _cte.EngagementId, _cte.ConsultantId, _cte.ConsultantName, _cte.ProjectId, _Cte.ProjectName, _cte.ProjectRetainer, _cte.RoleId, _cte.Role, sub.Rate, _cte.ConsultantRetainer,_cte.Salary, _cte.amount, sub.EffectiveDate
from _cte
outer apply (
select top 1 EffectiveDate, Rate
from Engagement e
where e.ConsultantId = _cte.ConsultantId and e.ProjectId = _cte.ProjectId and e.RoleId = _cte.RoleId
and Month(e.EffectiveDate) < _cte.Month and Year(e.EffectiveDate) < _cte.Year
order by EffectiveDate desc
) sub
where _cte.EffectiveDate is null
Example:
I'm struggling with writing the query that goes along with this. At first I attempted to partition by greatest date. However, when I executed the join I got the highest effective date for every single period (even those prior to the effective date).
Is this something that can be accomplished in a query or should I be focusing on incremental updates of the destination table so that any effective date / time period in the past is left alone?
Any tips would be great!
Thanks,
Channing
Try this one:
; with _CTE as(
select p.* , c.EffectiveDate, c.Cost
from period p
left join CostDimension c on p.id = c.id and p.Month = DATEPART(month, c.EffectiveDate) and p.year = DATEPART (year, EffectiveDate)
)
select * from _CTE Where EffectiveDate is not null
Union
select _CTE.id, _CTE.Month, _CTE.Year, sub.EffectiveDate, sub.Cost
from _CTE
outer apply (select top 1 EffectiveDate, Cost
from CostDimension as cd
where cd.Id = _CTE.id and cd.EffectiveDate < DATETIMEFROMPARTS(_CTE.Year, _CTE.Month, 1, 0, 0, 0, 0)
order by EffectiveDate desc
) sub
where _Cte.EffectiveDate is null
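If there is at most one effective date per Id per month, both branches of the UNION can probably be folded into a single OUTER APPLY that picks the latest row effective on or before the end of the period. The following is only a sketch under that assumption: the Period and CostDimension rows are invented sample data, and it requires SQL Server 2012+ for DATEFROMPARTS and EOMONTH.

```sql
-- SQL Server 2012+ sketch; table contents are hypothetical.
WITH Period (Id, [Month], [Year]) AS (
    SELECT * FROM (VALUES (1, 1, 2015), (1, 2, 2015),
                          (1, 3, 2015), (1, 4, 2015)) v (Id, [Month], [Year])
),
CostDimension (Id, EffectiveDate, Cost) AS (
    SELECT * FROM (VALUES
        (1, CAST('2015-01-01' AS date), 100),
        (1, CAST('2015-03-01' AS date), 150)) v (Id, EffectiveDate, Cost)
)
SELECT p.Id, p.[Month], p.[Year], sub.EffectiveDate, sub.Cost
FROM Period p
OUTER APPLY (
    -- latest dimension row that took effect by the end of this period
    SELECT TOP 1 cd.EffectiveDate, cd.Cost
    FROM CostDimension cd
    WHERE cd.Id = p.Id
      AND cd.EffectiveDate <= EOMONTH(DATEFROMPARTS(p.[Year], p.[Month], 1))
    ORDER BY cd.EffectiveDate DESC
) sub
ORDER BY p.[Year], p.[Month];
```

With this sample data, January and February pick up the cost of 100 and March and April pick up 150. Periods earlier than the first effective date would come back with NULLs, because OUTER APPLY keeps the outer row.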

Selecting only if at least one row matches condition

I have a select statement and want to return all values only if at least one of them has a date that is at least 60 days before today.
The problem is that I have an outer apply which returns the column I want to compare to, and the values come from different tables (one belongs to cash items, and the other to card items).
Considering I have the following:
OUTER APPLY (
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1 --Cash
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2 --Card
) AS items
I want to return all the rows only when DATEDIFF(DAY, MIN(items.item_date), GETDATE()) >= 60, but I want them all even if only one matches this condition.
What would be the best approach to do this?
EDIT
To make it clearer, I'll explain the use case:
I need to show the items of every loan, but only if the client is more than 60 days past the due date on any of them.
I am also not sure what you expect, but how about this:
WITH items
AS (SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2)
SELECT a.*
FROM items AS a,
(SELECT TOP 1 *
FROM items AS b
WHERE Datediff(day, b.item_date, Getdate()) >= 60) AS c
It's a sort of CROSS JOIN, where table C will have one or zero rows depending on whether the condition is met. It will then join to every row in the other table.
Have you tried something like this?
SELECT a.quantity, a.item_date
FROM
(SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2) a
WHERE DATEDIFF(day, a.item_date, GETDATE()) >= 60
Typically I do this using a CTE to select the key for the records I want to select and then join on that. Below is an attempt at an example:
with LateClients as
(
-- DueDate is an assumed column name; DISTINCT avoids duplicating rows in the join
SELECT DISTINCT LoanId FROM Payment WHERE DATEDIFF(DAY, DueDate, GETDATE()) > 60
)
SELECT p.LoanId,
p.UserId
FROM Payment as p
INNER JOIN LateClients as lc
ON p.LoanId = lc.LoanId
ORDER BY p.LoanId, p.UserId
I know it's a bit different from the code you posted, but this is a simplified example that should explain the concept. Good luck!
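Another way to express "all rows of a group if any row in it qualifies" is a windowed flag. The sketch below is a hypothetical, simplified model: the Item rows and the LoanId, ItemId and DueDate columns are invented for illustration and do not come from the question's functions.

```sql
-- SQL Server sketch; the Item rows and column names are hypothetical.
WITH Item (LoanId, ItemId, DueDate) AS (
    SELECT * FROM (VALUES
        (1, 1, DATEADD(DAY, -90, CAST(GETDATE() AS date))),  -- 90 days late
        (1, 2, DATEADD(DAY, -10, CAST(GETDATE() AS date))),
        (2, 3, DATEADD(DAY, -5,  CAST(GETDATE() AS date)))) v (LoanId, ItemId, DueDate)
),
Flagged AS (
    SELECT i.LoanId, i.ItemId, i.DueDate,
           -- 1 on every row of a loan that has at least one item 60+ days late
           MAX(CASE WHEN DATEDIFF(DAY, i.DueDate, GETDATE()) >= 60 THEN 1 ELSE 0 END)
               OVER (PARTITION BY i.LoanId) AS HasLateItem
    FROM Item i
)
SELECT LoanId, ItemId, DueDate
FROM Flagged
WHERE HasLateItem = 1;
```

With this sample data, both items of loan 1 are returned, including the one that is only 10 days late, and loan 2 is left out entirely.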

Sql Server - Joining subqueries using calculated fields

I am trying to calculate the percentage change in price between days. As the days are not consecutive, I build into the query a calculated field that tells me what relative day it is (day 1, day 2, etc.). In order to compare today with yesterday, I offset the calculated day number by 1 in a subquery. What I want to do is join the inner and outer queries on the calculated relative day. The code I came up with is:
SELECT TOP 11
P.Date,
(AVG(P.SettlementPri) - PriceY) / PriceY as PriceChange,
P.Symbol,
(RANK() OVER (ORDER BY P.Date desc)) as dayrank_Today
FROM OTE P
JOIN (SELECT TOP 11
C.Date,
AVG(SettlementPri) as PriceY,
(RANK() OVER (ORDER BY C.Date desc))+1 as dayrank_Yest
FROM OTE C
WHERE C.ComCode = 'C-'
GROUP BY c.Date) C ON dayrank_Today = C.dayrank_Yest
WHERE P.ComCode = 'C-'
GROUP BY P.Symbol, P.Date
If I try to execute the query, I get an error message indicating that dayrank_Today is an invalid column. I have tried renaming it, qualifying it, and yelling obscenities at it, and I get squat. Still an error.
You can't define a calculated column in the SELECT list and then reference its alias in a join of the same query. You can use CTEs, which I'm not so familiar with, or you can just select from derived tables like so:
SELECT
P.Date,
(AVG(P.AvgPrice) - C.PriceY) / C.PriceY as PriceChange,
P.Symbol,
P.dayrank_Today
FROM
(SELECT TOP 11
ComCode,
Date,
AVG(SettlementPri) as AvgPrice,
Symbol,
(RANK() OVER (ORDER BY Date desc)) as dayrank_Today
FROM OTE
WHERE ComCode = 'C-'
GROUP BY ComCode, Date, Symbol -- required because of AVG()
ORDER BY Date desc) P
JOIN (SELECT TOP 11
C.Date,
AVG(SettlementPri) as PriceY,
-- ranks run newest-first, so the previous day has rank + 1; subtract 1 to line it up with today
(RANK() OVER (ORDER BY C.Date desc))-1 as dayrank_Yest
FROM OTE C
WHERE C.ComCode = 'C-'
GROUP BY C.Date
ORDER BY C.Date desc) C ON P.dayrank_Today = C.dayrank_Yest
GROUP BY P.Symbol, P.Date, P.dayrank_Today, C.PriceY
If possible consider using a CTE as it makes it very easy. Something like this:
With Raw as
(
SELECT TOP 11 C.Date,
Avg(SettlementPri) As PriceY,
Rank() OVER (ORDER BY C.Date desc) as dayrank
FROM OTE C WHERE C.Comcode = 'C-'
Group by C.Date
Order by C.Date desc -- without this, TOP 11 returns arbitrary dates
)
select today.pricey as todayprice ,
yesterday.pricey as yesterdayprice,
(today.pricey - yesterday.pricey)/yesterday.pricey * 100 as percentchange
from Raw today
-- ranks run newest-first, so the previous day has the next rank
left outer join Raw yesterday on yesterday.dayrank = today.dayrank + 1
Obviously this doesn't include the symbol, but that can be added pretty easily.
If the WITH syntax doesn't suit you, you can also use calculated fields with OUTER APPLY: http://technet.microsoft.com/en-us/library/ms175156.aspx
That said, the CTE means you only need to write your price calculation once, which is a lot cleaner.
Cheers
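On SQL Server 2012 or later, a different technique, LAG, can fetch the previous trading day's average directly, which avoids joining on a calculated rank at all. This is only a sketch: the OTE rows are invented sample data, and Symbol is left out for brevity.

```sql
-- SQL Server 2012+ sketch; the OTE rows are hypothetical sample data.
WITH OTE (ComCode, [Date], SettlementPri) AS (
    SELECT * FROM (VALUES
        ('C-', CAST('2014-01-02' AS date), 100.0),
        ('C-', CAST('2014-01-02' AS date), 110.0),
        ('C-', CAST('2014-01-03' AS date), 126.0),
        ('C-', CAST('2014-01-06' AS date), 113.4)) v (ComCode, [Date], SettlementPri)
),
Daily AS (
    -- one average price per trading day
    SELECT [Date], AVG(SettlementPri) AS AvgPrice
    FROM OTE
    WHERE ComCode = 'C-'
    GROUP BY [Date]
)
SELECT [Date],
       AvgPrice,
       -- LAG reads the previous row in date order, so gaps between days do not matter
       (AvgPrice - LAG(AvgPrice) OVER (ORDER BY [Date]))
           / LAG(AvgPrice) OVER (ORDER BY [Date]) AS PriceChange
FROM Daily
ORDER BY [Date];
```

The first day has no predecessor, so its PriceChange is NULL. With this sample data, the following two days come out at roughly 0.20 and -0.10.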
I had the same problem and found this thread and found a solution so I thought I'd post it here.
Instead of using the column name as the parameter for ON, copy the statement that gave you the column name in the first place:
replace:
ON dayrank_Today = C.dayrank_Yest
with:
ON (RANK() OVER (ORDER BY Date desc)) = C.dayrank_Yest
Granted, you're displeasing the Programming Gods by violating DRY, but you could be pragmatic and mention the duplication in the comments, which should appease their wrath to a mild grumbling.