SQL Average or MAX of one column in query for pivot table - sql

I have a query that shows every time an employee clocks in or out at a customer site. I've already worked out the math for calculating their hours for each visit to a site. Is it possible to just get the average or max value of the invoice amount column for CompanyA?
I'm trying to use this data in a pivot table, but it will SUM the Invoice Amount for that customer showing '690.0000' and then it totals in the "Grand Total". This makes the math wrong for the grand total. I want it to just account for the one value of 345.00. The InvoiceAmount is always the same no matter how many times they clocked in and out there.
Is there something I can do in my query before it's pulled into the pivot table?
ProjectName   InvoiceAmount   HoursPerRow
CompanyA      345.00          19
CompanyA      345.00          2
declare @current_cst datetimeoffset;
set @current_cst = (SELECT getdate() AT TIME ZONE 'UTC' AT TIME ZONE 'Central Standard Time')
SELECT
dbo.WorkOrderEmployeeTimeTracker.EmployeeID,
dbo.Employees.FirstName + ' ' + dbo.Employees.LastName AS EmployeeFullName,
dbo.Projects.ProjectName,
MAX (dbo.WorkOrderProjects.InvoiceAmount) AS InvoiceAmount,
dbo.WorkOrders.WorkOrderID,
AVG (dbo.WorkOrderEmployeeTimeTracker.Wage) AS Wage,
convert(datetime,switchoffset(convert(datetimeoffset, dbo.WorkOrderEmployeeTimeTracker.ClockIn),datename(tzoffset,@current_cst))) AS ClockIn,
convert(datetime,switchoffset(convert(datetimeoffset, dbo.WorkOrderEmployeeTimeTracker.ClockOut),datename(tzoffset,@current_cst))) AS ClockOut,
CONVERT(int,DATEDIFF(mi, ClockIn, ClockOut) / 60) as Hrs_Difference,
CONVERT(int,DATEDIFF(mi, ClockIn, ClockOut) % 60) as Mins_Difference,
DATEDIFF(second, dbo.WorkOrderEmployeeTimeTracker.ClockIn, dbo.WorkOrderEmployeeTimeTracker.ClockOut) / 3600.0 as HoursPerRow
FROM
dbo.WorkOrderEmployeeTimeTracker INNER JOIN
dbo.WorkOrders ON dbo.WorkOrderEmployeeTimeTracker.WorkOrderID = dbo.WorkOrders.WorkOrderID INNER JOIN
dbo.WorkOrderProjects ON dbo.WorkOrders.WorkOrderProjectID = dbo.WorkOrderProjects.WorkOrderProjectID INNER JOIN
dbo.Employees ON dbo.WorkOrderEmployeeTimeTracker.EmployeeID = dbo.Employees.EmployeeID INNER JOIN
dbo.Projects ON dbo.WorkOrderProjects.ProjectID = dbo.Projects.ProjectID
where dbo.WorkOrderProjects.IsPaid = 'True'
group by
dbo.WorkOrderEmployeeTimeTracker.EmployeeID,
dbo.Employees.FirstName,
dbo.Employees.LastName,
dbo.WorkOrderProjects.InvoiceAmount,
dbo.Projects.ProjectName,
dbo.WorkOrders.WorkOrderID,
dbo.WorkOrderEmployeeTimeTracker.Wage,
WorkOrderEmployeeTimeTracker.ClockIn,
dbo.WorkOrderEmployeeTimeTracker.ClockOut
order by 1,2,3
I only want the Invoice Amount once when there is a duplicate. I still need the other information from each row though, as employees have different clock times when they have clocked in and out more than once for the same project.
UPDATE: It would seem that the conversions on ClockIn and ClockOut were throwing everything off. Removing them solved everything. I will have to find another way to convert these times to Central Standard Time.
If anyone has proper ideas on that, please let me know.

There are aggregate functions that will help, such as MAX.
You can write:
SELECT MAX(InvoiceAmount) FROM TableName
There are other aggregate functions, such as AVG, that can be used on a column in the same way.
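As a rough illustration of the aggregate approach, the sketch below loads the two sample rows from the question into an in-memory SQLite database and collapses them to one row per project. The table name `visits` and the alias `TotalHours` are made up for this example, and SQLite stands in for SQL Server here, so treat it as a sketch of the idea rather than the asker's actual query.

```python
import sqlite3

# In-memory database holding the two sample rows from the question.
# "visits" is a hypothetical table name used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (ProjectName TEXT, InvoiceAmount REAL, HoursPerRow REAL);
INSERT INTO visits VALUES ('CompanyA', 345.00, 19), ('CompanyA', 345.00, 2);
""")

# MAX keeps the invoice amount once per project, while SUM still adds up the hours.
rows = conn.execute("""
SELECT ProjectName,
       MAX(InvoiceAmount) AS InvoiceAmount,
       SUM(HoursPerRow)   AS TotalHours
FROM visits
GROUP BY ProjectName
""").fetchall()

print(rows)
```

Note that this collapses the per-visit rows. If the pivot table still needs one row per clock-in, the invoice amount has to be summarized in the pivot itself (for example with Max instead of Sum).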

Related

Joining date with datetime and values per day

I am writing a query so that I can fill my facttable. I have a table which registers the weather average per day, registered at 23:59:00 of each day (date).
I have another table in which climate control data of different rooms are registered, per minute (datetime).
And I also have a time dimension available in another database.
I want to fill my facttable with all the available timeKeys combined with all the data from my climate control table and my weather table.
I'm sorry for my English, it isn't my mother tongue.
So, to find the matching timeKey for the date values I wrote this query:
SELECT t.timeKey AS WeathertimeKey,
weather.date AS date,
weather.temperature,
weather.rainAmountMM,
weather.windDirection,
weather.windSpeed
FROM StarSchema.dbo.timeDim t, weather
WHERE DATEPART(mm, t.DATE) = DATEPART(mm, weather.date)
AND DATEPART(dd, t.DATE) = DATEPART(dd, weather.date)
AND DATEPART(Hour, t.DATE) = '23'
AND DATEPART(MINUTE, t.DATE) = '59'
RESULT: Result
My time dimension has a timeKey for every minute in 2015: timeDimension
The facttable I am trying to fill: facttable
My solution for filling the facttable was creating a view with the corresponding timeKey per day and then joining that view in my main query.
SELECT
t.timeKey as timeKey,
rt1.roomId AS roomKey,
1 AS roomDataKey,
1 AS usageKey,
1 AS knmiKey,
rt1.temperature AS temperature,
rt1.locWindow AS locWindow,
rt1.locDoor AS locDoor,
rh1.turnedOn AS turnedOn,
rh1.temperature AS temperatureHeater,
s.storyTemp AS storyTemp,
s.storyHumidity AS storyHumidity,
vw.temperature AS temperatureOutside,
vw.rainAmountMM AS rainAmountMM,
vw.windSpeed AS windSpeed,
vw.windDirection AS windDirection,
vu.gasM3 AS gasM3,
vu.electricityKWH AS electricityKWH
FROM StarSchema.dbo.timeDim t
INNER JOIN roomTemperature1 rt1 ON rt1.date = t.DATE
INNER JOIN roomHeating1 rh1 ON rt1.date = rh1.date
INNER JOIN story s ON s.date = rt1.date
INNER JOIN vw_timeKeyWeatherDay vw ON t.timeKey = vw.WeathertimeKey
INNER JOIN vw_timeKeyUsageDay vu ON t.timeKey = vu.UsagetimeKey
The result is as follows: result2
So now it only uses the timeKey of 23:59 of everyday.
I want the complete days in there, but how do I do this?
Can someone help me out?
And my apologies for my use of the English language, again.
I did my best :-)
If I understand your question properly, you want to match two date columns which have different levels of precision: one is per minute, the other is per day. What I suggest is a query which keeps only the yyyy.mm.dd part of both. Then when you join you get a matching record every time.
You can do that by adding the number of days that separate each of your dates from date 0 in SQL Server:
DECLARE @DVal DATE
DECLARE @DTVal DATETIME
SET @DVal = '2018-01-18'
SET @DTVal = '2018-01-18 12:02:01.003'
SELECT @DVal
SELECT @DTVal
SELECT DATEDIFF(D,0,@DTVal)
SELECT DATEDIFF(D,0,@DVal)
SELECT DATEADD(D,(DATEDIFF(D,0,@DTVal) ),0)
SELECT DATEADD(D,(DATEDIFF(D,0,@DVal) ),0)
Comments about the code above:
First I declare the variables, one DATE and one DATETIME and I give them slightly different values which are less than a day.
Then I select them so that we can see they are different
2018-01-18
2018-01-18 12:02:01.003
Then I select the difference in days between each date and 0, and we have the same number of days
43116
43116
Then I add this difference to the date 0, and we end up with two datetime values which are identical
2018-01-18 00:00:00.000
2018-01-18 00:00:00.000
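The same day-truncation idea can be sketched outside the database. The Python snippet below mirrors the DATEDIFF/DATEADD steps above, assuming that date 0 corresponds to 1900-01-01 (SQL Server's base date). The helper names `days_since_zero` and `truncate_to_day` are invented for this illustration.

```python
from datetime import date, datetime, timedelta

# Assumption: SQL Server's "date 0" is 1900-01-01.
DAY_ZERO = date(1900, 1, 1)

def days_since_zero(value):
    """Whole days between date 0 and the given date or datetime (like DATEDIFF(D, 0, x))."""
    d = value.date() if isinstance(value, datetime) else value
    return (d - DAY_ZERO).days

def truncate_to_day(value):
    """Add the day count back to date 0 (like DATEADD(D, DATEDIFF(D, 0, x), 0))."""
    midnight_zero = datetime.combine(DAY_ZERO, datetime.min.time())
    return midnight_zero + timedelta(days=days_since_zero(value))

d_val = date(2018, 1, 18)
dt_val = datetime(2018, 1, 18, 12, 2, 1, 3000)

print(days_since_zero(d_val), days_since_zero(dt_val))    # both 43116
print(truncate_to_day(d_val) == truncate_to_day(dt_val))  # True
```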
I hope I have answered your question. Please comment on my answer if I have not. At least this is a starting point. If your goal is to get the complete range of minutes per day, you can create two calculated columns, one based on the current date and the other based on the current date + 1 day, and join against the time table with a BETWEEN condition in the ON clause, etc.
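A minimal sketch of that range join, using an in-memory SQLite database in place of SQL Server. The table and column names follow the question, but the rows are invented and the tables are cut down to two columns each. A half-open range (`>=` the day, `<` the next day) is used instead of BETWEEN so that midnight of the following day is not matched twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE timeDim (timeKey INTEGER PRIMARY KEY, date TEXT);
CREATE TABLE weather (date TEXT, temperature REAL);
-- A few per-minute keys spread over two days (invented sample data).
INSERT INTO timeDim VALUES
  (1, '2015-01-01 00:00:00'),
  (2, '2015-01-01 12:30:00'),
  (3, '2015-01-01 23:59:00'),
  (4, '2015-01-02 00:00:00');
-- One weather row per day, registered at 23:59:00.
INSERT INTO weather VALUES
  ('2015-01-01 23:59:00', 4.5),
  ('2015-01-02 23:59:00', 6.0);
""")

# Every timeKey of a day gets that day's weather row, not only the 23:59 key.
rows = conn.execute("""
SELECT t.timeKey, w.temperature
FROM timeDim t
JOIN weather w
  ON t.date >= date(w.date)
 AND t.date <  date(w.date, '+1 day')
ORDER BY t.timeKey
""").fetchall()

print(rows)
```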

sql select number divided aggregate sum function

I have this schema
and I want to have a query to calculate the cost per consultant per hour per month. In other words, a consultant has a salary per month, I want to divide the amount of the salary between the hours that he/she worked that month.
SELECT
concat_ws(' ', consultants.first_name::text, consultants.last_name::text) as name,
EXTRACT(MONTH FROM tasks.init_time) as task_month,
SUM(tasks.finish_time::timestamp::time - tasks.init_time::timestamp::time) as duration,
EXTRACT(MONTH FROM salaries.payment_date) as salary_month,
salaries.payment
FROM consultants
INNER JOIN tasks ON consultants.id = tasks.consultant_id
INNER JOIN salaries ON consultants.id = salaries.consultant_id
WHERE EXTRACT(MONTH FROM tasks.init_time) = EXTRACT(MONTH FROM salaries.payment_date)
GROUP BY (consultants.id, EXTRACT(MONTH FROM tasks.init_time), EXTRACT(MONTH FROM salaries.payment_date), salaries.payment);
It is not possible to do this in the select
salaries.payment / SUM(tasks.finish_time::timestamp::time - tasks.init_time::timestamp::time)
Is there another way to do it? Is it possible to solve it in one query?
Assumptions made for this answer:
The model is not entirely clear to me, so I am assuming the following:
you are using PostgreSQL
salaries.payment_date is defined as a date column that stores the day when a consultant was paid
tasks.init_time and tasks.finish_time are defined as timestamp, storing the date & time when a consultant started and finished work on a specific task.
Your join on only the month is wrong as far as I can tell. For one, because it would also include months from different years, but more importantly because this would lead to a result where the same row from salaries appeared several times. I think you need to join on the complete date:
FROM consultants c
JOIN tasks t ON c.id = t.consultant_id
JOIN salaries s ON c.id = s.consultant_id
AND t.init_time::date = s.payment_date --<< here
If my assumptions about the data types are correct, the cast to a timestamp and then back to a time is useless and wrong. Useless because you can simply subtract two timestamps, and wrong because you are ignoring the actual date in the timestamp, so (although unlikely) if init_time and finish_time are not on the same day, the result is wrong.
So the calculation of the duration can be simplified to:
t.finish_time - t.init_time
To get the cost per hour per month, you need to convert the interval (which is the result of subtracting one timestamp from another) to a decimal number of hours. You can do this by extracting the seconds from the interval and then dividing by 3600, e.g.
extract(epoch from sum(t.finish_time - t.init_time)) / 3600
If you divide the sum of the payments by that number you get your cost per hour per month:
SELECT concat_ws(' ', c.first_name, c.last_name) as name,
to_char(s.payment_date, 'yyyy-mm') as salary_month,
extract(epoch from sum(t.finish_time - t.init_time)) / 3600 as worked_hours,
sum(s.payment) / (extract(epoch from sum(t.finish_time - t.init_time)) / 3600) as cost_per_hour
FROM consultants c
JOIN tasks t ON c.id = t.consultant_id
JOIN salaries s ON c.id = s.consultant_id AND t.init_time::date = s.payment_date
GROUP BY c.id, to_char(s.payment_date, 'yyyy-mm') --<< no parentheses!
order by name, salary_month;
As you want the report broken down by month you should convert the month into something that contains the year as well. I used to_char() to get a string with only year and month. You also need to remove salaries.payment from the group by clause.
You also don't need the "payment month" and "salary month" because both will always be the same as that is the join condition.
And finally you don't need the cast to ::text for the name columns because they are most certainly defined as varchar or text anyway.
The sample data I made up for this: http://sqlfiddle.com/#!15/ae0c9
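To make the arithmetic concrete, here is a small Python sketch of the same calculation: sum the task durations, convert the total to hours via seconds / 3600, and divide the payment by it. The task times and the payment of 2400 are invented values, not taken from the fiddle.

```python
from datetime import datetime, timedelta

# Invented sample data: (init_time, finish_time) pairs for one consultant in one month.
tasks = [
    (datetime(2018, 1, 15, 9, 0), datetime(2018, 1, 15, 17, 0)),  # 8 hours
    (datetime(2018, 1, 16, 22, 0), datetime(2018, 1, 17, 2, 0)),  # 4 hours, spans midnight
]
payment = 2400.0  # hypothetical monthly salary

# Subtracting full timestamps (not just the time part) handles tasks that cross midnight.
total = sum((finish - start for start, finish in tasks), timedelta())

# Equivalent of extract(epoch from interval) / 3600.
worked_hours = total.total_seconds() / 3600
cost_per_hour = payment / worked_hours

print(worked_hours, cost_per_hour)
```

If only the time-of-day part were subtracted, the second task would come out as a negative duration, which is the problem the answer describes.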
Somewhat unrelated, but:
You should also not put the column list of the group by in parentheses. Putting a column list in parentheses in Postgres creates an anonymous record, which is something completely different than having multiple columns. This is also true for the columns in the select list.
If the goal is to do it all in one query, have you tried using a CTE?
For example:
;WITH cte_pymt
AS
(
-- your existing query goes here
)
SELECT <your required data> FROM cte_pymt

Access: Having trouble with getting average movies per day

I have a database project at my school and I am almost finished. The only thing that I still need is the average number of movies per day. I have a Watchhistory table where you can find the users who have watched a movie. The instruction is to filter out of the Watchhistory the people who have an average of 2 movies per day.
I wrote the following SQL statement. But every time I get errors. Can someone help me?
SQL:
SELECT
customer_mail_address,
COUNT(movie_id) AS AantalBekeken,
COUNT(movie_id) / SUM(GETDATE() -
(SELECT subscription_start FROM Customer)) AS AveragePerDay
FROM
Watchhistory
GROUP BY
customer_mail_address
The error:
Msg 130, Level 15, State 1, Line 1
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
I tried something different, and this query counts the total movies per day. Now I need the average of all of that, and the SQL should only show the customers who average more than 2 movies per day.
SELECT
Count(movie_id) as AantalPerDag,
Customer_mail_address,
Cast(watchhistory.watch_date as Date) as Date
FROM
Watchhistory
GROUP BY
customer_mail_address, Cast(watch_date as Date)
The big problem that I see is that you're trying to use a subquery as if it's a single value. A subquery could potentially return many values, and unless you have only one customer in your system it will do exactly that. You should be JOINing to the Customer table instead. Hopefully the JOIN only returns one customer per row in WatchHistory. If that's not the case then you'll have more work to do there.
SELECT
C.customer_mail_address,
COUNT(movie_id) AS AantalBekeken,
CAST(COUNT(movie_id) AS DECIMAL(10, 4)) / DATEDIFF(dy, C.subscription_start, GETDATE()) AS AveragePerDay
FROM
WatchHistory WH
INNER JOIN Customer C ON C.customer_id = WH.customer_id -- I'm guessing at the join criteria here since no table structures were provided
GROUP BY
C.customer_mail_address,
C.subscription_start
HAVING
COUNT(movie_id) / DATEDIFF(dy, C.subscription_start, GETDATE()) <> 2
I'm guessing that the criteria isn't exactly 2 movies per day, but either less than 2 or more than 2. You'll need to adjust based on that. Also, you'll need to adjust the precision for the average based on what you want.
What the error message is telling you is that you can't apply SUM to an expression that contains a subquery.
Try computing the number of days without a subquery inside SUM (for example, by joining to Customer), and
try using HAVING at the end of your query to select only the users whose count divided by days equals 2.
Maybe this is what you need?
Let's join the two tables, Watchhistory and Customer:
select Watchhistory.customer_mail_address,
COUNT(movie_id) AS AantalBekeken,
COUNT(movie_id) / datediff(Day, Customer.subscription_start, GETDATE()) AS AveragePerDay
from Watchhistory inner join Customer
on Watchhistory.customer_mail_address = Customer.customer_mail_address
GROUP BY
Watchhistory.customer_mail_address, Customer.subscription_start
having COUNT(movie_id) / datediff(Day, Customer.subscription_start, GETDATE()) = 2
Change the last line of code according to what you need (I did not understand whether you want those customers in or out).
I got it guys. Finally :)
SELECT customer_mail_address, SUM(AveragePerDay) / COUNT(customer_mail_address) AS gemiddelde
FROM (SELECT DISTINCT customer_mail_address, COUNT(CAST(watch_date AS date)) AS AveragePerDay
FROM dbo.Watchhistory
GROUP BY customer_mail_address, CAST(watch_date AS date)) AS d
GROUP BY customer_mail_address
HAVING (SUM(AveragePerDay) / COUNT(customer_mail_address)) >= 2
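The two-step shape of this query (count per customer per day, then average those counts per customer) can be tried out with a small in-memory SQLite database. The sketch below uses invented rows and example.com addresses, and it averages over the days on which a customer watched at least one movie, which is what the query above does as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Watchhistory (customer_mail_address TEXT, movie_id INTEGER, watch_date TEXT);
-- Invented sample data: alice watches 3 movies on one day and 1 on the next, bob watches 1.
INSERT INTO Watchhistory VALUES
  ('alice@example.com', 1, '2018-01-01 10:00:00'),
  ('alice@example.com', 2, '2018-01-01 14:00:00'),
  ('alice@example.com', 3, '2018-01-01 20:00:00'),
  ('alice@example.com', 4, '2018-01-02 21:00:00'),
  ('bob@example.com',   1, '2018-01-01 19:00:00');
""")

# Inner query: movies per customer per day. Outer query: average of those daily counts.
rows = conn.execute("""
SELECT customer_mail_address, AVG(per_day) AS gemiddelde
FROM (SELECT customer_mail_address, date(watch_date) AS watch_day, COUNT(*) AS per_day
      FROM Watchhistory
      GROUP BY customer_mail_address, date(watch_date)) AS d
GROUP BY customer_mail_address
HAVING AVG(per_day) >= 2
""").fetchall()

print(rows)
```

Using AVG here also avoids the integer division that SUM(...) / COUNT(...) performs on integer columns in SQL Server.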

Calculate Average after populating a temp table

I have been tasked with figuring out the average length of time that our customers stick with us. (Specifically from the date they become a customer, to when they placed their last order.)
I am not 100% sure that I am doing this properly, but my thought was to gather the date we enter the customer into the database, and then head over to the order table and grab their most recent order date, dump them into a temp table, and then figure out the length of time between those two dates, and then tally an average based on that number.
(I have to do some other wibbly wobbly time stuff as well, but this is the one that's kicking my butt.)
The end goal with this is to be able to say "On Average our customers stick with us for 4 years, and 3 months." (Or whatever the data shows it to be.)
SELECT * INTO #AvgTable
FROM(
SELECT DISTINCT (c.CustNumber) AS [CustomerNumber]
, COALESCE(convert( VARCHAR(10),c.OrgEnrollDate,101),'') AS [StartDate]
, COALESCE(CONVERT(VARCHAR(10),MAX(co.OrderDate),101),'')AS [EndDate]
,DATEDIFF(DD,c.OrgEnrollDate, co.OrderDate) as [LengthOfTime]
FROM dbo.Customer c
JOIN dbo.CustomerOrder co ON c.ID = co.CustomerID
WHERE c.Archived = 0
AND co.Archived =0
AND c.OrgEnrollDate IS NOT NULL
AND co.OrderDate IS NOT NULL
GROUP BY c.CustNumber
, co.OrderDate 2
)
--This is where I start falling apart
Select AVG[LengthofTime]
From #AvgTable
If I understand you correctly, then just try
SELECT AVG(DATEDIFF(dd, StartDate, EndDate)) AvgTime
FROM #AvgTable
My guess is that since you are storing the data in a temp table, the integer result of the datediff is being implicitly converted back to a datetime (which you cannot do an average on).
Don't store the difference in your temp table (don't even have a temp table, but that is a whole different conversation). Just do the differencing in your select.
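A sketch of doing the differencing directly in the select, without a temp table, using an in-memory SQLite database. The table and column names follow the question, the rows are invented, and SQLite's `julianday()` stands in for SQL Server's DATEDIFF, so this shows the shape of the query rather than T-SQL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (ID INTEGER PRIMARY KEY, OrgEnrollDate TEXT);
CREATE TABLE CustomerOrder (CustomerID INTEGER, OrderDate TEXT);
-- Invented sample data.
INSERT INTO Customer VALUES (1, '2010-01-01'), (2, '2015-01-01');
INSERT INTO CustomerOrder VALUES
  (1, '2011-01-01'), (1, '2012-01-01'),  -- last order 730 days after enrolling
  (2, '2015-04-11');                     -- last order 100 days after enrolling
""")

# Inner query: days from enrollment to the most recent order, one row per customer.
# Outer query: the average of those lengths.
(avg_days,) = conn.execute("""
SELECT AVG(days)
FROM (SELECT c.ID,
             julianday(MAX(co.OrderDate)) - julianday(c.OrgEnrollDate) AS days
      FROM Customer c
      JOIN CustomerOrder co ON co.CustomerID = c.ID
      GROUP BY c.ID, c.OrgEnrollDate)
""").fetchone()

print(avg_days)
```

Grouping by the customer only (and not by the order date) is what lets MAX pick a single most recent order per customer.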

Query for the average call length for all users for a day

A person uses their cell phone multiple times per day, and the length of their calls vary. I am tracking the length of the calls in a table:
Calls [callID, memberID, startTime, duration]
I need a query to return the average call length for users per day. Per day means, if a user used the phone 3 times, first time for 5 minutes, second for 10 minutes and the last time for 7 minutes, the calculation is: (5 + 10 + 7) / 3 = ...
Note:
People don't use the phone everyday, so we have to get the latest day's average per person and use this to get the overall average call duration.
we don't want to count anyone twice in the average, so only 1 row per user will go into calculating the average daily call duration.
Some clarifications...
I need an overall per-day average, based on the per-user per-day average, using each user's latest day's numbers (since we are only counting a given user ONCE in the query). This means we will be using averages from different days, since people might not use the phone every day, or even on the same day.
The following query will get you the desired end results.
SELECT AVG(rt.UserDuration) AS AveragePerDay
FROM
(
SELECT
c1.MemberId,
AVG(c1.Duration) AS "UserDuration"
FROM Calls c1
WHERE CONVERT(VARCHAR, c1.StartTime, 102) =
(SELECT CONVERT(VARCHAR, MAX(c2.StartTime), 102)
FROM Calls c2
WHERE c2.MemberId = c1.MemberId)
GROUP By MemberId
) AS rt
This accomplishes it by first creating a derived table with one record for each member, holding the average duration of their calls for their most recent day. Then it simply averages all of those values to get the overall "average call duration". If you want to see a specific user, you can run just the inner SELECT to get the member list.
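The two-level averaging can be checked against a tiny data set. The sketch below uses an in-memory SQLite database with invented calls, and SQLite's `date()` replaces CONVERT(VARCHAR, ..., 102) for cutting the timestamp down to a day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Calls (callID INTEGER PRIMARY KEY, memberID INTEGER, startTime TEXT, duration INTEGER);
-- Invented sample data. Member 1's latest day has calls of 5, 10 and 7 minutes;
-- the 100-minute call on an earlier day must be ignored. Member 2's latest day has 4 and 6.
INSERT INTO Calls (memberID, startTime, duration) VALUES
  (1, '2018-01-01 09:00:00', 100),
  (1, '2018-01-03 09:00:00', 5),
  (1, '2018-01-03 12:00:00', 10),
  (1, '2018-01-03 18:00:00', 7),
  (2, '2018-01-02 08:00:00', 4),
  (2, '2018-01-02 20:00:00', 6);
""")

# Inner query: per-member average over that member's most recent day.
# Outer query: average of those per-member averages, so each member counts once.
(overall,) = conn.execute("""
SELECT AVG(user_avg)
FROM (SELECT c1.memberID, AVG(c1.duration) AS user_avg
      FROM Calls c1
      WHERE date(c1.startTime) =
            (SELECT date(MAX(c2.startTime)) FROM Calls c2 WHERE c2.memberID = c1.memberID)
      GROUP BY c1.memberID)
""").fetchone()

print(overall)
```

With this data the per-member averages are 22/3 and 5, so the overall figure is their mean, 37/6.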
You need to convert the DATETIME to something you can make "per day" groups on. Style 102 produces "yyyy.mm.dd".
SELECT
memberId,
CONVERT(VARCHAR, startTime, 102) Day,
AVG(Duration) AvgDuration
FROM
Calls
WHERE
CONVERT(VARCHAR, startTime, 102) =
(
SELECT
CONVERT(VARCHAR, MAX(startTime), 102)
FROM
Calls i WHERE i.memberId = Calls.memberId
)
GROUP BY
memberId,
CONVERT(VARCHAR, startTime, 102)
Use LEFT(CONVERT(VARCHAR, startTime, 120), 10) to produce "yyyy-mm-dd".
For this kind of query it would be helpful to have a dedicated "day only" column, to avoid the whole conversion business and, as a side effect, make the query more readable.
select avg(duration) from calls group by date(startTime);