I'm trying to write a query that calculates the average profit per employee for several projects.
I have a table that has employee names, what project they are working on, and how much profit they bring to their specific project each day.
My first query gives 3 fields - The project name, the sum of all the profits the employees bring to the project, and the number of employees in the project.
My second query I am trying to display 2 fields - the project name and the average profit per employee that each project makes
SELECT SAYSquery.ProjectName, SUM(SAYSquery.Profit) AS SumOfProfit, Count(SAYSquery.[EmpFirstName]) AS NumberOfEmps
FROM SAYSquery
WHERE profit > 0
GROUP BY SAYSquery.ProjectName;
SELECT SAYSqueryNIPE.[ProjectName], SAYSqueryNIPE.[SumOfProfit]/[NumberOfEmps] AS total
FROM SAYSqueryNIPE
GROUP BY SAYSqueryNIPE.[ProjectName], SAYSqueryNIPE.[SumOfProfit]/[NumberOfEmps];
Unfortunately, my second query is giving me the same average profit for every project and I'm not sure why. Any help would be much appreciated.
EDIT:
Query 1 reads:
**Employee Name | Sell Rate | Renumeration | Profit (Sell-Renumeration) | Project Name**
Query 2 reads:
**PROJECT NAME | SumofProfit | NumberofEmployees**
Project X | $1500 | 3 employees
Query 3 reads:
**PROJECT NAME | TOTAL**
Project X | $500 (Average profit per employee)
The problem is in your first query where you count the number of employees. As written in the question, you are returning a count of rows, not employees who worked on the project. You need to use count distinct. I'd also recommend not counting on EmpFirstName. If you have more than one employee with the same first name, the query won't give you correct results. It would be better to use a unique employee identifier instead of their first name.
SELECT SAYSquery.ProjectName, SUM(SAYSquery.Profit) AS SumOfProfit,
Count(distinct SAYSquery.[EmpFirstName]) AS NumberOfEmps
FROM SAYSquery
WHERE profit > 0
GROUP BY SAYSquery.ProjectName
You could wrap the whole thing up into a single query instead of two or three, as described in your question.
SELECT SAYSquery.ProjectName, SUM(SAYSquery.Profit)/Count(distinct SAYSquery.[EmpFirstName]) AS AvgProfitPerEmployee
FROM SAYSquery
WHERE profit > 0
GROUP BY SAYSquery.ProjectName
Related
I am in a SQL class and struggling with one of the questions. We are using the AdventureWorksDW2014 database in SQL Server and this is the problem I'm stuck on:
Write a query that will return the employee key, first name, middle name, last name, total sales, and average amount per sale for every employee who has made sales to resellers. All monetary values should be rounded to two decimal places. Names should appear as a single record as "Last, First Middle." Sort the results by total sales (highest first), then by average amount per sale (highest first), then by employee name.
I have no problem selecting the EmployeeKey, nor with using concat and formatting the name as instructed. After exploring the data, it is clear that the employee information will need to come from the DimEmployee table, and the sales figures will need to come from the FactResellerSales table, and I am able to complete the inner join between the tables with no problem. I also know how to use the sum and avg functions to calculate the totals and averages for the employees individually, but those will only calculate for one employee at a time and only returns a single result. The part that I'm hung up on is creating the columns for the calculated sums and averages for each employee. The result I need to come up with needs to have a single column that shows the total sales of each employee and a single column that shows the average amount per sales for each employee, along with other information requested for each employee. So far, I have run
select distinct EmployeeKey
from FactResellerSales
to determine which employee keys are associated with sales, and it shows that there are 17. I attempted to construct the query using a subquery for each employee in the from statement,
(select EmployeeKey, sum (SalesAmount) as TotalSalesByEmp, avg (SalesAmount)
as AvgPerSaleByEmp
from FactResellerSales
where EmployeeKey = 272)
thinking that, even though it would be time consuming to do 17 subqueries, I could ultimately draw the requested data from them into the main query, but I get an error message of "Msg 8120, Level 16, State 1, Line 359
Column 'FactResellerSales.EmployeeKey' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause" when I try to test the subquery. But I can't leave out the EmployeeKey as I need it for the linking field of the inner join. My query so far (including the aliases I will use for the other fields as appropriate in the order by statement) is:
USE AdventureWorksDW2014
select e.EmployeeKey,
concat (e.LastName, ', ' + e.FirstName, ' ' + e.MiddleName) as EmployeeName
from FactResellerSales as s
inner join DimEmployee as e
on s.EmployeeKey = e.EmployeeKey
order by TotalSalesByEmp desc, AvgPerSaleByEmp desc, EmployeeName
I just need to figure out how to add the other two fields.
I've already described what the results I need should look like, but since that is apparently not good enough for some people, I will try to give an example. Apologies if the formatting is weird in the transition (I promise it looks right as I'm typing it).
| EmployeeKey | EmployeeName | TotalSalesByEmp | AvgPerSaleByEmp |
| 282 | Mitchell, Linda C | 10367007.43 | 1458.70 |
| 283 | Carson, Jillian | 10065803.54 | 1286.36 |
| 281 | Blythe, Michael G | 9293903.01 | 1314.74 |
| 272 | Jiang, Stephen Y | 1092123.86 | 1378.94 |
Please help.
Simply run your aggregation with GROUP BY on employee details which will calculate the total and average reseller sales across all 17 employees:
USE AdventureWorksDW2014
select e.EmployeeKey,
concat(e.LastName, ', ' + e.FirstName, ' ' + e.MiddleName) as EmployeeName,
sum(s.SalesAmount) as TotalSalesByEmp,
avg(s.SalesAmount) as AvgPerSaleByEmp
from FactResellerSales as s
inner join DimEmployee as e
on s.EmployeeKey = e.EmployeeKey
group by e.EmployeeKey,
e.LastName,
e.FirstName,
e.MiddleName
order by TotalSalesByEmp desc,
AvgPerSaleByEmp desc,
EmployeeName
I have a table that contains data about different benefit plans and users enrolled in one or more of those plans. So basically the table contains two columns representing the benefit plan counts and total users enrolled in those plans.
I need to create visualization in Power BI to represent the number of total users enrolled in 1 plan, 2 plans, 3 plans, ...etc.
I wrote the query in sql to get the desired result but not sure how do I do the same in power BI.
Below is my sql query:
SELECT S.PlanCount, COUNT(S.UserName) AS Participants
FROM (
SELECT A.Username, COUNT(*) AS PlanCount
FROM [dbo].[vw_BenefitsCount_Plan_Participants] AS A
GROUP BY A.username
)AS S
GROUP BY S.PlanCount
ORDER BY S.PlanCount
The query result is below image:
So here, PlanCount column represents the total different benefit plans that users are enrolled in. For e.g. the first row means that total of 6008 members are enrolled in only 1 plan, whereas row 2 displays that there are total of 3030 members who are enrolled in total of 2 plans and similarly row 5 means there are only 10 users who are enrolled in total of 6 plans.
I am new to Power BI and trying to understand DAX functions but couldn't find a reasonable example that could help me create my visualization.
I found a something similar here and here but they seem to be more towards single count and group by usage.
Here is a simple example. I have a table of home owners who have homes in multiple cities.
Now in this table, Alex, Dave and Julie have home in 1 city (basically we can say that these 3 people own just 1 home each). Similarly Jim owns a total of 2 homes and Bob and Pam each have 3 homes in total.
Now the output that I need is a table with total number of home owners that own 1 home, 2 homes and so on. So the resulting table in SQL is this.
Where NameCount is basically count of total home owners and Homes is the count of total homes these home owners have.
Please let me know if this helps.
Thanks.
If I understood fine, you have a table like this:
BenefitPlan | User
1 | Max
1 | Joe
2 | Max
3 | Anna
If it's ok, you can simply use a plot bar (for example) where the Axis is BenefitPlan and Value is User. When you drag some column in Value field, it will be grouped automaticaly (like group by in SQL), and by default the groupping method is count.
Hope it helps.
Regards.
You can use DAX to create a summary table from your data table:
https://community.powerbi.com/t5/Desktop/Creating-a-summary-table-out-of-existing-table-assistance/td-p/431485
Once you have counted plans by customer you will then have a field that will enable you to visualize the # of customers with each count.
Mock-up of the code:
PlanSummary = SUMMARIZE('vw_BenefitsCount_Plan_Participants',[Username],COUNT([PLAN_ID])
I have the following "COMPANIES_BY_NEWS_REPUTATION" in my JavaDB database (this is some random data just to represent the structure)
COMPANY | NEWS_HASH | REPUTATION | DATE
-------------------------------------------------------------------
Company A | 14676757 | 0.12345 | 2011-05-19 15:43:28.0
Company B | 454564556 | 0.78956 | 2011-05-24 18:44:28.0
Company C | 454564556 | 0.78956 | 2011-05-24 18:44:28.0
Company A | -7874564 | 0.12345 | 2011-05-19 15:43:28.0
One news_hash may relate to several companies while a company can relate to several news_hashes as well. Reputation and date are bound to the news_hash.
What I need to do is calculate the average reputation of last 5 news for every company. In order to do that I somehow feel that I need to user 'order by' and 'offset' in a subquery as shown in the code below.
select COMPANY, avg(REPUTATION) from
(select * from COMPANY_BY_NEWS_REPUTATION order by "DATE" desc
offset 0 rows fetch next 5 row only) as TR group by COMPANY;
However, JavaDB allows neither ORDER BY, nor OFFSET in a subquery. Could anyone suggest a working solution for my problem please?
Which version of JavaDB are you using? According to the chapter TableSubquery in the JavaDB documentation, table subqueries do support order by and fetch next, at least in version 10.6.2.1.
Given that subqueries can be ordered and the size of the result set can be limited, the following (untested) query might do what you want:
select COMPANY, (select avg(REPUTATION)
from (select REPUTATION
from COMPANY_BY_NEWS_REPUTATION
where COMPANY = TR.COMPANY
order by DATE desc
fetch first 5 rows only))
from (select distinct COMPANY
from COMPANY_BY_NEWS_REPUTATION) as TR
This query retrieves all distinct company names from COMPANY_BY_NEWS_REPUTATION, then retrieves the average of the last five reputation rows for each company. I have no idea whether it will perform sufficiently, that will likely depend on the size of your data set and what indexes you have in place.
If you have a list of unique company names in another table, you can use that instead of the select distinct ... subquery to retrieve the companies for which to calculate averages.
First, I've been using mysql for forever and am now upgrading to postgresql. The sql syntax is much stricter and some behavior different, thus my question.
I've been searching around for how to merge rows in a postgresql query on a table such as
id | name | amount
0 | foo | 12
1 | bar | 10
2 | bar | 13
3 | foo | 20
and get
name | amount
foo | 32
bar | 23
The closest I've found is Merge duplicate records into 1 records with the same table and table fields
sql returning duplicates of 'name':
scope :tallied, lambda { group(:name, :amount).select("charges.name AS name,
SUM(charges.amount) AS amount,
COUNT(*) AS tally").order("name, amount desc") }
What I need is
scope :tallied, lambda { group(:name, :amount).select("DISTINCT ON(charges.name) charges.name AS name,
SUM(charges.amount) AS amount,
COUNT(*) AS tally").order("name, amount desc") }
except, rather than returning the first row of a given name, should return mash of all rows with a given name (amount added)
In mysql, appending .group(:name) (not needing the initial group) to the select would work as expected.
This seems like an everyday sort of task which should be easy. What would be a simple way of doing this? Please point me on the right path.
P.S. I'm trying to learn here (so are others), don't just throw sql in my face, please explain it.
I've no idea what RoR is doing in the background, but I'm guessing that group(:name, :amount) will run a query that groups by name, amount. The one you're looking for is group by name:
select name, sum(amount) as amount, count(*) as tally
from charges
group by name
If you append amount to the group by clause, the query will do just that -- i.e. count(*) would return the number of times each amount appears per name, and the sum() would return that number times that amount.
Yeah, so I'm filling out a requirements document for a new client project and they're asking for growth trends and performance expectations calculated from existing data within our database.
The best source of data for something like this would be our logs table as we pretty much log every single transaction that occurs within our application.
Now, here's the issue, I don't have a whole lot of experience with MySql when it comes to collating cumulative sum and running averages. I've thrown together the following query which kind of makes sense to me, but it just keeps locking up the command console. The thing takes forever to execute and there are only 80k records within the test sample.
So, given the following basic table structure:
id | action | date_created
1 | 'merp' | 2007-06-20 17:17:00
2 | 'foo' | 2007-06-21 09:54:48
3 | 'bar' | 2007-06-21 12:47:30
... thousands of records ...
3545 | 'stab' | 2007-07-05 11:28:36
How would I go about calculating the average number of records created for each given day of the week?
day_of_week | average_records_created
1 | 234
2 | 23
3 | 5
4 | 67
5 | 234
6 | 12
7 | 36
I have the following query which makes me want to murderdeathkill myself by casting my body down an elevator shaft... and onto some bullets:
SELECT
DISTINCT(DAYOFWEEK(DATE(t1.datetime_entry))) AS t1.day_of_week,
AVG((SELECT COUNT(*) FROM VMS_LOGS t2 WHERE DAYOFWEEK(DATE(t2.date_time_entry)) = t1.day_of_week)) AS average_records_created
FROM VMS_LOGS t1
GROUP BY t1.day_of_week;
Halps? Please, don't make me cut myself again. :'(
How far back do you need to go when sampling this information? This solution works as long as it's less than a year.
Because day of week and week number are constant for a record, create a companion table that has the ID, WeekNumber, and DayOfWeek. Whenever you want to run this statistic, just generate the "missing" records from your master table.
Then, your report can be something along the lines of:
select
DayOfWeek
, count(*)/count(distinct(WeekNumber)) as Average
from
MyCompanionTable
group by
DayOfWeek
Of course if the table is too large, then you can instead pre-summarize the data on a daily basis and just use that, and add in "today's" data from your master table when running the report.
I rewrote your query as:
SELECT x.day_of_week,
AVG(x.count) 'average_records_created'
FROM (SELECT DAYOFWEEK(t.datetime_entry) 'day_of_week',
COUNT(*) 'count'
FROM VMS_LOGS t
GROUP BY DAYOFWEEK(t.datetime_entry)) x
GROUP BY x.day_of_week
The reason why your query takes so long is because of your inner select, you are essentialy running 6,400,000,000 queries. With a query like this your best solution may be to develop a timed reporting system, where the user receives an email when the query is done and the report is constructed or the user logs in and checks the report after.
Even with the optimization written by OMG Ponies (bellow) you are still looking at around the same number of queries.
SELECT x.day_of_week,
AVG(x.count) 'average_records_created'
FROM (SELECT DAYOFWEEK(t.datetime_entry) 'day_of_week',
COUNT(*) 'count'
FROM VMS_LOGS t
GROUP BY DAYOFWEEK(t.datetime_entry)) x
GROUP BY x.day_of_week