I am in a SQL class and struggling with one of the questions. We are using the AdventureWorksDW2014 database in SQL Server and this is the problem I'm stuck on:
Write a query that will return the employee key, first name, middle name, last name, total sales, and average amount per sale for every employee who has made sales to resellers. All monetary values should be rounded to two decimal places. Names should appear as a single record as "Last, First Middle." Sort the results by total sales (highest first), then by average amount per sale (highest first), then by employee name.
I have no problem selecting the EmployeeKey, nor with using concat to format the name as instructed. After exploring the data, it is clear that the employee information needs to come from the DimEmployee table and the sales figures from the FactResellerSales table, and I can complete the inner join between the tables with no problem. I also know how to use the sum and avg functions to calculate the totals and averages for the employees individually, but that only calculates for one employee at a time and returns a single result. The part that I'm hung up on is creating the columns for the calculated sums and averages for each employee. The result needs a single column that shows the total sales of each employee and a single column that shows the average amount per sale for each employee, along with the other information requested for each employee. So far, I have run
select distinct EmployeeKey
from FactResellerSales
to determine which employee keys are associated with sales, and it shows that there are 17. I attempted to construct the query using a subquery for each employee in the from clause,
(select EmployeeKey, sum (SalesAmount) as TotalSalesByEmp, avg (SalesAmount)
as AvgPerSaleByEmp
from FactResellerSales
where EmployeeKey = 272)
thinking that, even though it would be time consuming to do 17 subqueries, I could ultimately draw the requested data from them into the main query, but I get an error message of "Msg 8120, Level 16, State 1, Line 359
Column 'FactResellerSales.EmployeeKey' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause" when I try to test the subquery. But I can't leave out the EmployeeKey as I need it for the linking field of the inner join. My query so far (including the aliases I will use for the other fields as appropriate in the order by statement) is:
USE AdventureWorksDW2014
select e.EmployeeKey,
concat (e.LastName, ', ' + e.FirstName, ' ' + e.MiddleName) as EmployeeName
from FactResellerSales as s
inner join DimEmployee as e
on s.EmployeeKey = e.EmployeeKey
order by TotalSalesByEmp desc, AvgPerSaleByEmp desc, EmployeeName
I just need to figure out how to add the other two fields.
I've already described what the results I need should look like, but since that is apparently not good enough for some people, I will try to give an example. Apologies if the formatting is weird in the transition (I promise it looks right as I'm typing it).
| EmployeeKey | EmployeeName | TotalSalesByEmp | AvgPerSaleByEmp |
| 282 | Mitchell, Linda C | 10367007.43 | 1458.70 |
| 283 | Carson, Jillian | 10065803.54 | 1286.36 |
| 281 | Blythe, Michael G | 9293903.01 | 1314.74 |
| 272 | Jiang, Stephen Y | 1092123.86 | 1378.94 |
Please help.
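You don't need a subquery per employee. The Msg 8120 error occurs because EmployeeKey appears in the select list next to aggregate functions without a GROUP BY clause. Run the aggregation with GROUP BY on the employee columns, and sum and avg will be calculated separately for each of the 17 employees: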
USE AdventureWorksDW2014
select e.EmployeeKey,
concat(e.LastName, ', ' + e.FirstName, ' ' + e.MiddleName) as EmployeeName,
round(sum(s.SalesAmount), 2) as TotalSalesByEmp,
round(avg(s.SalesAmount), 2) as AvgPerSaleByEmp
from FactResellerSales as s
inner join DimEmployee as e
on s.EmployeeKey = e.EmployeeKey
group by e.EmployeeKey,
e.LastName,
e.FirstName,
e.MiddleName
order by TotalSalesByEmp desc,
AvgPerSaleByEmp desc,
EmployeeName
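To see the per-employee grouping in isolation, here is a minimal sketch that runs the same shape of query against a tiny in-memory SQLite database from Python. The three employees and their sales rows are made-up sample data, not AdventureWorks values, and SQLite's `||` operator stands in for SQL Server's concat, so treat it as an illustration of GROUP BY rather than the exact query to submit.

```python
import sqlite3

# Toy stand-ins for DimEmployee and FactResellerSales (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DimEmployee (
        EmployeeKey INTEGER PRIMARY KEY,
        FirstName TEXT, MiddleName TEXT, LastName TEXT
    );
    CREATE TABLE FactResellerSales (EmployeeKey INTEGER, SalesAmount REAL);

    INSERT INTO DimEmployee VALUES
        (1, 'Ann', 'B', 'Lee'),
        (2, 'Raj', NULL, 'Patel'),
        (3, 'Tom', 'Q', 'Idle');      -- no sales, so the inner join drops him
    INSERT INTO FactResellerSales VALUES
        (1, 100.0), (1, 50.5),
        (2, 300.0), (2, 100.0), (2, 50.25);
""")

# One output row per employee: GROUP BY makes SUM and AVG run per group.
# COALESCE handles a NULL middle name, since || with NULL yields NULL in SQLite.
rows = conn.execute("""
    SELECT e.EmployeeKey,
           e.LastName || ', ' || e.FirstName
               || COALESCE(' ' || e.MiddleName, '') AS EmployeeName,
           ROUND(SUM(s.SalesAmount), 2) AS TotalSalesByEmp,
           ROUND(AVG(s.SalesAmount), 2) AS AvgPerSaleByEmp
    FROM FactResellerSales AS s
    INNER JOIN DimEmployee AS e ON s.EmployeeKey = e.EmployeeKey
    GROUP BY e.EmployeeKey, e.LastName, e.FirstName, e.MiddleName
    ORDER BY TotalSalesByEmp DESC, AvgPerSaleByEmp DESC, EmployeeName
""").fetchall()

for row in rows:
    print(row)
```

In this sample, Raj Patel sorts first because his total is larger, and Tom Idle does not appear at all because he has no rows in the sales table.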
I'm trying to write a query that calculates the average profit per employee for several projects.
I have a table that has employee names, what project they are working on, and how much profit they bring to their specific project each day.
My first query gives 3 fields: the project name, the sum of all the profits the employees bring to the project, and the number of employees in the project.
In my second query I am trying to display 2 fields: the project name and the average profit per employee that each project makes.
SELECT SAYSquery.ProjectName, SUM(SAYSquery.Profit) AS SumOfProfit, Count(SAYSquery.[EmpFirstName]) AS NumberOfEmps
FROM SAYSquery
WHERE profit > 0
GROUP BY SAYSquery.ProjectName;
SELECT SAYSqueryNIPE.[ProjectName], SAYSqueryNIPE.[SumOfProfit]/[NumberOfEmps] AS total
FROM SAYSqueryNIPE
GROUP BY SAYSqueryNIPE.[ProjectName], SAYSqueryNIPE.[SumOfProfit]/[NumberOfEmps];
Unfortunately, my second query is giving me the same average profit for every project and I'm not sure why. Any help would be much appreciated.
EDIT:
Query 1 reads:
**Employee Name | Sell Rate | Renumeration | Profit (Sell-Renumeration) | Project Name**
Query 2 reads:
**PROJECT NAME | SumofProfit | NumberofEmployees**
Project X | $1500 | 3 employees
Query 3 reads:
**PROJECT NAME | TOTAL**
Project X | $500 (Average profit per employee)
The problem is in your first query, where you count the number of employees. As written in the question, Count returns the number of rows, not the number of distinct employees who worked on the project. You need to use Count(distinct ...) instead. I'd also recommend not counting on EmpFirstName: if more than one employee shares a first name, the query won't give correct results. It would be better to use a unique employee identifier instead of the first name.
SELECT SAYSquery.ProjectName, SUM(SAYSquery.Profit) AS SumOfProfit,
Count(distinct SAYSquery.[EmpFirstName]) AS NumberOfEmps
FROM SAYSquery
WHERE profit > 0
GROUP BY SAYSquery.ProjectName
You could wrap the whole thing up into a single query instead of two or three, as described in your question.
SELECT SAYSquery.ProjectName, SUM(SAYSquery.Profit)/Count(distinct SAYSquery.[EmpFirstName]) AS AvgProfitPerEmployee
FROM SAYSquery
WHERE profit > 0
GROUP BY SAYSquery.ProjectName
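The difference between counting rows and counting employees is easy to see on a small sample. The sketch below uses Python's sqlite3 module with a hypothetical DailyProfit table and a hypothetical EmployeeID column (neither is in the question); the Project X figures are chosen to match the $1500 / 3 employees / $500 example from the question's edit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per employee per day of profit.
    CREATE TABLE DailyProfit (EmployeeID INTEGER, ProjectName TEXT, Profit REAL);
    INSERT INTO DailyProfit VALUES
        (1, 'Project X', 300.0),
        (1, 'Project X', 200.0),   -- same employee, second day
        (2, 'Project X', 500.0),
        (3, 'Project X', 500.0),
        (4, 'Project Y', 100.0),
        (4, 'Project Y', 100.0);
""")

rows = conn.execute("""
    SELECT ProjectName,
           SUM(Profit)                 AS SumOfProfit,
           COUNT(EmployeeID)           AS RowCount,       -- counts rows
           COUNT(DISTINCT EmployeeID)  AS NumberOfEmps,   -- counts employees
           SUM(Profit) / COUNT(DISTINCT EmployeeID) AS AvgProfitPerEmployee
    FROM DailyProfit
    WHERE Profit > 0
    GROUP BY ProjectName
    ORDER BY ProjectName
""").fetchall()

for row in rows:
    print(row)
# ('Project X', 1500.0, 4, 3, 500.0)
# ('Project Y', 200.0, 2, 1, 200.0)
```

Dividing by the plain row count would have given 375 for Project X instead of 500, which is the kind of skew the answer describes.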
I have 2 tables joined with political results and I need to have the votes SUM per county, and then the MAX of the vote counts per county, with the Party that relates to the MAX in another column. I'm having trouble getting the Party into the Query results without messing up the SUM and MAX columns.
I can get this table with the following SQL:
| County Name | SumOfVoteCount | MaxOfVoteCount | OfficeID |
| Baker | 7253 | 4008 | S |
SELECT NY_Race.[County Name], Sum(NY_Results.VoteCount) AS SumOfVoteCount, Max(NY_Results.VoteCount) AS MaxOfVoteCount
FROM NY_Race INNER JOIN NY_Results ON NY_Race.RaceCountyID = NY_Results.RaceCountyID
GROUP BY NY_Race.[County Name], NY_Race.OfficeID
HAVING (((NY_Race.OfficeID)="S"));
What I need is for the Party that has that 4008 vote total to be included in the query results, but when I try to add Party to the select list, it shows all of the parties and messes up the SUM of the vote count, and I end up with this:
| County Name | SumOfVoteCount | MaxOfVoteCount1 | Party | OfficeID |
| Baker | 2927 | 2927 | Dem | S |
| Baker | 4008 | 4008 | GOP | S |
| Baker | 101 | 101 | Lib | S |
| Baker | 53 | 53 | Prg | S |
| Baker | 164 | 164 | WF | S |
This is the SQL code I am using that gets the above Table:
SELECT NY_Race.[County Name], Sum(NY_Results.VoteCount) AS SumOfVoteCount, Max(NY_Results.VoteCount) AS MaxOfVoteCount, NY_Results.Party
FROM NY_Race INNER JOIN NY_Results ON NY_Race.RaceCountyID = NY_Results.RaceCountyID
GROUP BY NY_Race.[County Name], NY_Race.OfficeID, NY_Results.Party
HAVING (((NY_Race.OfficeID)="S"));
How can I get this table in the query results?
| County Name | SumOfVoteCount | MaxOfVoteCount | Party | OfficeID |
| Baker | 7253 | 4008 | GOP | S |
I can't help but think I'm missing a WHERE clause somewhere that compares Party to MaxOfVoteCount.
One way to approach these is to have a nested subquery that gets the MAX() for the field of interest. Then, only select the record with that MAX(). Here's the structure:
select COUNTY_NAME, R1.*
, (select sum(votecount) from results R2 where R1.COUNTY_ID=R2.COUNTY_ID and R1.OFFICE_ID=R2.OFFICE_ID)
from RESULTS R1
join RACE on R1.COUNTY_ID=RACE.COUNTY_ID and R1.OFFICE_ID=RACE.OFFICE_ID
where R1.office_id = 'S'
and voteCount =
(select max(votecount) from results R3 where R1.COUNTY_ID=R3.COUNTY_ID and R1.OFFICE_ID=R3.OFFICE_ID)
I created a demo on SQLFiddle.
One issue: what if two parties get exactly the same number of votes? Then this query returns both rows for that county. That's a functional issue you will have to resolve.
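The structure above, including the tie behaviour, can be sketched against the question's own Baker figures. This is a small Python/sqlite3 illustration, assuming simplified versions of NY_Race and NY_Results with a CountyName column (no space in the name); the Union county rows are invented purely to show what a tie looks like.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE NY_Race (RaceCountyID INTEGER, CountyName TEXT, OfficeID TEXT);
    CREATE TABLE NY_Results (RaceCountyID INTEGER, Party TEXT, VoteCount INTEGER);

    INSERT INTO NY_Race VALUES (1, 'Baker', 'S'), (2, 'Union', 'S');
    INSERT INTO NY_Results VALUES
        (1, 'Dem', 2927), (1, 'GOP', 4008), (1, 'Lib', 101),
        (1, 'Prg', 53),   (1, 'WF', 164),
        (2, 'Dem', 500),  (2, 'GOP', 500);   -- invented tie
""")

# Keep only the result row whose VoteCount equals the county's maximum,
# and compute the county total with a separate correlated subquery.
rows = conn.execute("""
    SELECT ra.CountyName,
           (SELECT SUM(r2.VoteCount) FROM NY_Results AS r2
             WHERE r2.RaceCountyID = r1.RaceCountyID) AS SumOfVoteCount,
           r1.VoteCount AS MaxOfVoteCount,
           r1.Party,
           ra.OfficeID
    FROM NY_Results AS r1
    JOIN NY_Race AS ra ON ra.RaceCountyID = r1.RaceCountyID
    WHERE ra.OfficeID = 'S'
      AND r1.VoteCount = (SELECT MAX(r3.VoteCount) FROM NY_Results AS r3
                           WHERE r3.RaceCountyID = r1.RaceCountyID)
    ORDER BY ra.CountyName, r1.Party
""").fetchall()

for row in rows:
    print(row)
# ('Baker', 7253, 4008, 'GOP', 'S')
# ('Union', 1000, 500, 'Dem', 'S')
# ('Union', 1000, 500, 'GOP', 'S')
```

Baker comes back as the single row the question asks for, while the tied county produces one row per tied party.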
I'm trying to solve this query where I need to find the top balance at each base. Balance is in one table and bases are in another table.
This is the existing query I have. It returns all the results, but I need to find a way to limit it to the 1 top result per baseID.
SELECT o.names.name, t.accounts.bidd.baseID, MAX(t.accounts.balance)
FROM order o, table(c.accounts) t
WHERE t.accounts.acctype = 'verified'
GROUP BY o.names.name, t.accounts.bidd.baseID;
accounts is a nested table.
This is the output:
Name accounts.BIDD.baseID MAX(T.accounts.BALANCE)
--------------- ------------------------- ---------------------------
Jerard 010 1251.21
john 012 3122.2
susan 012 3022.2
fin 012 3022.2
dan 010 1751.21
What I want is to calculate the highest balance for each baseID and display only one record for that baseID.
So the output would display only john for baseID 012, because he has the highest balance.
Any pointers in the right direction would be fantastic.
I think the problem is caused by the "Name" column. Since you have three names mapped to one base ID (012), each name forms its own group, so the three records are grouped individually rather than together.
Try leaving the "Name" column out of both the select list and the GROUP BY clause.
SELECT t.accounts.bidd.baseID, MAX(t.accounts.balance)
FROM order o, table(c.accounts) t
WHERE t.accounts.acctype = 'verified'
GROUP BY t.accounts.bidd.baseID;
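Dropping the name gives one row per baseID but loses who holds the top balance. If the name is still wanted, one possible alternative (a different technique from the query above) is to rank rows within each base and keep the first. The sketch below shows the idea in Python with sqlite3, using a hypothetical flat Accounts table in place of the nested table and the sample rows from the question. It assumes an SQLite build with window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened version of the nested accounts data.
    CREATE TABLE Accounts (Name TEXT, BaseID TEXT, Balance REAL, AccType TEXT);
    INSERT INTO Accounts VALUES
        ('Jerard', '010', 1251.21, 'verified'),
        ('john',   '012', 3122.2,  'verified'),
        ('susan',  '012', 3022.2,  'verified'),
        ('fin',    '012', 3022.2,  'verified'),
        ('dan',    '010', 1751.21, 'verified');
""")

# Number the rows inside each base from highest balance down, then keep rank 1.
# The secondary sort on Name is an arbitrary tie-breaker chosen for this sketch.
rows = conn.execute("""
    SELECT BaseID, Name, Balance
    FROM (
        SELECT BaseID, Name, Balance,
               ROW_NUMBER() OVER (
                   PARTITION BY BaseID
                   ORDER BY Balance DESC, Name
               ) AS rn
        FROM Accounts
        WHERE AccType = 'verified'
    ) AS ranked
    WHERE rn = 1
    ORDER BY BaseID
""").fetchall()

for row in rows:
    print(row)
# ('010', 'dan', 1751.21)
# ('012', 'john', 3122.2)
```

ROW_NUMBER always returns exactly one row per base; if tied top balances should all be shown, RANK() in place of ROW_NUMBER() is the usual substitute.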
I am looking at making a simple leader board for a time trial. A member may perform many time trials, but I only want for their fastest result to be displayed. My table columns are as follows:
Members { ID (PK), Forename, Surname }
TimeTrials { ID (PK), MemberID, Date, Time, Distance }
An example dataset would be:
Forename | Surname | Date | Time | Distance
Bill Smith 01-01-11 1.14 100
Dave Jones 04-09-11 2.33 100
Bill Smith 02-03-11 1.1 100
My resulting answer from the example above would be:
Forename | Surname | Date | Time | Distance
Bill Smith 02-03-11 1.1 100
Dave Jones 04-09-11 2.33 100
I have this so far, but Access complains that I am not using Date as part of an aggregate function:
SELECT Members.Forename, Members.Surname, Min(TimeTrials.Time) AS MinOfTime, TimeTrials.Date
FROM Members
INNER JOIN TimeTrials ON Members.ID = TimeTrials.MemberID
GROUP BY Members.Forename, Members.Surname, TimeTrials.Distance
HAVING TimeTrials.Distance = 100
ORDER BY MIN(TimeTrials.Time);
If I remove the Date from the SELECT, the query works (without the date). I have tried using FIRST on TimeTrials.Date, but that returns the first date, which is normally incorrect.
Obviously, putting the Date in the GROUP BY would not return the result set that I am after.
Make this task easier on yourself by starting with a smaller piece of the problem. First get the minimum Time from TimeTrials for each combination of MemberID and Distance.
SELECT
tt.MemberID,
tt.Distance,
Min(tt.Time) AS MinOfTime
FROM TimeTrials AS tt
GROUP BY
tt.MemberID,
tt.Distance;
Assuming that SQL is correct, use it in a subquery which you join back to TimeTrials again.
SELECT tt2.*
FROM
TimeTrials AS tt2
INNER JOIN
(
SELECT
tt.MemberID,
tt.Distance,
Min(tt.Time) AS MinOfTime
FROM TimeTrials AS tt
GROUP BY
tt.MemberID,
tt.Distance
) AS sub
ON
tt2.MemberID = sub.MemberID
AND tt2.Distance = sub.Distance
AND tt2.Time = sub.MinOfTime
WHERE tt2.Distance = 100
ORDER BY tt2.Time;
Finally, you can join that query to Members to get Forename and Surname. Your question shows you already know how to do that, so I'll leave it for you. :-)
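For readers who want to check the whole chain end to end, here is a minimal sketch of that last step, run from Python against an in-memory SQLite database loaded with the question's three sample rows. It is SQLite rather than Access, so the joins are written without the nested parentheses Access usually wants, and [Date] and [Time] are bracketed only as a precaution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Members (ID INTEGER PRIMARY KEY, Forename TEXT, Surname TEXT);
    CREATE TABLE TimeTrials (
        ID INTEGER PRIMARY KEY, MemberID INTEGER,
        [Date] TEXT, [Time] REAL, Distance INTEGER
    );
    INSERT INTO Members VALUES (1, 'Bill', 'Smith'), (2, 'Dave', 'Jones');
    INSERT INTO TimeTrials (MemberID, [Date], [Time], Distance) VALUES
        (1, '01-01-11', 1.14, 100),
        (2, '04-09-11', 2.33, 100),
        (1, '02-03-11', 1.1,  100);
""")

# sub finds each member's fastest time per distance; joining it back to
# TimeTrials recovers the full row (including Date) for that fastest run,
# and the final join adds the member's name.
rows = conn.execute("""
    SELECT m.Forename, m.Surname, tt2.[Date], tt2.[Time], tt2.Distance
    FROM TimeTrials AS tt2
    INNER JOIN (
        SELECT tt.MemberID, tt.Distance, MIN(tt.[Time]) AS MinOfTime
        FROM TimeTrials AS tt
        GROUP BY tt.MemberID, tt.Distance
    ) AS sub
        ON  tt2.MemberID = sub.MemberID
        AND tt2.Distance = sub.Distance
        AND tt2.[Time]   = sub.MinOfTime
    INNER JOIN Members AS m ON m.ID = tt2.MemberID
    WHERE tt2.Distance = 100
    ORDER BY tt2.[Time]
""").fetchall()

for row in rows:
    print(row)
# ('Bill', 'Smith', '02-03-11', 1.1, 100)
# ('Dave', 'Jones', '04-09-11', 2.33, 100)
```

The output matches the leader board the question asks for. A member who records the same fastest time twice would appear twice, so a further tie-break may be needed on real data.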
I have the following "COMPANY_BY_NEWS_REPUTATION" table in my JavaDB database (this is some random data just to represent the structure):
COMPANY | NEWS_HASH | REPUTATION | DATE
-------------------------------------------------------------------
Company A | 14676757 | 0.12345 | 2011-05-19 15:43:28.0
Company B | 454564556 | 0.78956 | 2011-05-24 18:44:28.0
Company C | 454564556 | 0.78956 | 2011-05-24 18:44:28.0
Company A | -7874564 | 0.12345 | 2011-05-19 15:43:28.0
One news_hash may relate to several companies while a company can relate to several news_hashes as well. Reputation and date are bound to the news_hash.
What I need to do is calculate the average reputation of the last 5 news items for every company. To do that, I feel that I need to use 'order by' and 'offset' in a subquery, as shown in the code below.
select COMPANY, avg(REPUTATION) from
(select * from COMPANY_BY_NEWS_REPUTATION order by "DATE" desc
offset 0 rows fetch next 5 row only) as TR group by COMPANY;
However, JavaDB allows neither ORDER BY, nor OFFSET in a subquery. Could anyone suggest a working solution for my problem please?
Which version of JavaDB are you using? According to the chapter TableSubquery in the JavaDB documentation, table subqueries do support order by and fetch next, at least in version 10.6.2.1.
Given that subqueries can be ordered and the size of the result set can be limited, the following (untested) query might do what you want:
select COMPANY, (select avg(REPUTATION)
from (select REPUTATION
from COMPANY_BY_NEWS_REPUTATION
where COMPANY = TR.COMPANY
order by "DATE" desc
fetch first 5 rows only))
from (select distinct COMPANY
from COMPANY_BY_NEWS_REPUTATION) as TR
This query retrieves all distinct company names from COMPANY_BY_NEWS_REPUTATION, then retrieves the average of the last five reputation rows for each company. I have no idea whether it will perform sufficiently; that will likely depend on the size of your data set and what indexes you have in place.
If you have a list of unique company names in another table, you can use that instead of the select distinct ... subquery to retrieve the companies for which to calculate averages.
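Since the query above is untested on JavaDB, here is a sketch of the same shape that can at least be run somewhere: Python's sqlite3 module, with SQLite's LIMIT standing in for `fetch first 5 rows only`. The sample rows are invented, and the LAST_FIVE and AVG_REPUTATION aliases are names added for this sketch. It shows the logic only and says nothing about whether a given JavaDB version accepts the original syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COMPANY_BY_NEWS_REPUTATION (
        COMPANY TEXT, NEWS_HASH INTEGER, REPUTATION REAL, "DATE" TEXT
    );
    -- Invented sample: Company A has six news items, Company B has two.
    INSERT INTO COMPANY_BY_NEWS_REPUTATION VALUES
        ('Company A', 101, 0.0,  '2011-05-01 10:00:00'),  -- oldest, excluded
        ('Company A', 102, 0.5,  '2011-05-02 10:00:00'),
        ('Company A', 103, 0.25, '2011-05-03 10:00:00'),
        ('Company A', 104, 0.75, '2011-05-04 10:00:00'),
        ('Company A', 105, 0.5,  '2011-05-05 10:00:00'),
        ('Company A', 106, 0.5,  '2011-05-06 10:00:00'),
        ('Company B', 105, 0.25, '2011-05-05 10:00:00'),
        ('Company B', 106, 0.5,  '2011-05-06 10:00:00');
""")

# For each distinct company, average only its five most recent rows.
# The innermost subquery is correlated on TR.COMPANY from the outer query.
rows = conn.execute("""
    SELECT TR.COMPANY,
           (SELECT AVG(REPUTATION)
              FROM (SELECT REPUTATION
                      FROM COMPANY_BY_NEWS_REPUTATION
                     WHERE COMPANY = TR.COMPANY
                     ORDER BY "DATE" DESC
                     LIMIT 5) AS LAST_FIVE) AS AVG_REPUTATION
    FROM (SELECT DISTINCT COMPANY
            FROM COMPANY_BY_NEWS_REPUTATION) AS TR
    ORDER BY TR.COMPANY
""").fetchall()

for row in rows:
    print(row)
# ('Company A', 0.5)
# ('Company B', 0.375)
```

Company A's oldest row (reputation 0.0) is left out of its average, which is what distinguishes this from a plain GROUP BY over all rows.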