Access: Having trouble with getting average movies per day - sql

I have a database project at school and I am almost finished. The only thing I still need is the average number of movies per day. I have a Watchhistory table that records which users have watched which movies. The instruction is to filter out of the Watchhistory the people who have an average of 2 movies per day.
I wrote the following SQL statement, but I get an error every time. Can someone help me?
SQL:
SELECT
customer_mail_address,
COUNT(movie_id) AS AantalBekeken,
COUNT(movie_id) / SUM(GETDATE() -
(SELECT subscription_start FROM Customer)) AS AveragePerDay
FROM
Watchhistory
GROUP BY
customer_mail_address
The error:
Msg 130, Level 15, State 1, Line 1
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
I tried something different, and this query counts the movies per customer per day. Now I need the average over all days, and the SQL should only show the customers who average more than 2 movies per day.
SELECT
Count(movie_id) as AantalPerDag,
Customer_mail_address,
Cast(watchhistory.watch_date as Date) as Date
FROM
Watchhistory
GROUP BY
customer_mail_address, Cast(watch_date as Date)

The big problem that I see is that you're trying to use a subquery as if it's a single value. A subquery could potentially return many values, and unless you have only one customer in your system it will do exactly that. You should be JOINing to the Customer table instead. Hopefully the JOIN only returns one customer per row in WatchHistory. If that's not the case then you'll have more work to do there.
SELECT
C.customer_mail_address,
COUNT(movie_id) AS AantalBekeken,
CAST(COUNT(movie_id) AS DECIMAL(10, 4)) / DATEDIFF(dy, C.subscription_start, GETDATE()) AS AveragePerDay
FROM
WatchHistory WH
INNER JOIN Customer C ON C.customer_id = WH.customer_id -- I'm guessing at the join criteria here since no table structures were provided
GROUP BY
C.customer_mail_address,
C.subscription_start
HAVING
CAST(COUNT(movie_id) AS DECIMAL(10, 4)) / DATEDIFF(dy, C.subscription_start, GETDATE()) <> 2 -- cast here too, otherwise integer division truncates the average
I'm guessing that the criteria isn't exactly 2 movies per day, but either less than 2 or more than 2. You'll need to adjust based on that. Also, you'll need to adjust the precision for the average based on what you want.
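As a minimal sketch of the "more than 2 per day" variant (T-SQL), assuming both tables carry customer_mail_address as in the question's query and that subscription_start is a date column. The join key is an assumption, not something the question confirms. NULLIF guards against dividing by zero for customers who subscribed today.

```sql
-- Assumption: the tables join on customer_mail_address; adjust to the real key.
SELECT
    C.customer_mail_address,
    COUNT(WH.movie_id) AS AantalBekeken,
    -- Cast before dividing so the average is not truncated to an integer.
    CAST(COUNT(WH.movie_id) AS DECIMAL(10, 4))
        / NULLIF(DATEDIFF(day, C.subscription_start, GETDATE()), 0) AS AveragePerDay
FROM Watchhistory AS WH
INNER JOIN Customer AS C
    ON C.customer_mail_address = WH.customer_mail_address
GROUP BY
    C.customer_mail_address,
    C.subscription_start
HAVING
    -- The expression is repeated because HAVING cannot see the SELECT alias.
    CAST(COUNT(WH.movie_id) AS DECIMAL(10, 4))
        / NULLIF(DATEDIFF(day, C.subscription_start, GETDATE()), 0) > 2;
```

When the day count is 0, NULLIF turns the divisor into NULL, the comparison evaluates to unknown, and that customer is simply left out of the result.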

What the error message is telling you is that you can't apply an aggregate such as SUM to an expression that contains a subquery.
Try computing the GETDATE() - subscription_start part without the nested SELECT (for example, by joining to Customer) and use it as your second aggregate value, and
try using HAVING at the end of your query to select only the users that have count/sum = 2.

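Maybe this is what you need?
Let's join the two tables Watchhistory and Customer:
select Watchhistory.customer_mail_address,
COUNT(movie_id) AS AantalBekeken,
COUNT(movie_id) * 1.0 / datediff(Day, Customer.subscription_start, GETDATE()) AS AveragePerDay -- start date first, so the day count is positive; * 1.0 avoids integer division
from Watchhistory inner join Customer
on Watchhistory.customer_mail_address = Customer.customer_mail_address
GROUP BY
Watchhistory.customer_mail_address, Customer.subscription_start
having COUNT(movie_id) * 1.0 / datediff(Day, Customer.subscription_start, GETDATE()) = 2 -- SQL Server does not allow the column alias here
Change the last line according to what you need (I did not understand whether you want these customers in or out).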

I got it guys. Finally :)
SELECT customer_mail_address, SUM(AveragePerDay) / COUNT(customer_mail_address) AS gemiddelde
FROM (SELECT DISTINCT customer_mail_address, COUNT(CAST(watch_date AS date)) AS AveragePerDay
FROM dbo.Watchhistory
GROUP BY customer_mail_address, CAST(watch_date AS date)) AS d
GROUP BY customer_mail_address
HAVING SUM(AveragePerDay) / COUNT(customer_mail_address) >= 2

Related

I'm trying to find the average months between two dates and round the number to one decimal place, in Oracle SQL.

CREATE VIEW AVGMNTHSBETWEEN
AS
SELECT
VENDOR_NAME,
AVG(INVOICE_DUE_DATE, INVOICE_DATE) AS MONTHS_BETWN
FROM
VENDORS
INNER JOIN
INVOICES ON VENDORS.VENDOR_ID = INVOICES.VENDOR_ID
GROUP BY
VENDOR_NAME
HAVING
AVG(ROUND(CONVERT(DECIMAL(5, 4 (INVOICE_DUE_DATE, INVOICE_DATE)) >= 1.5
ORDER BY
MONTHS_BETWN DESC;
I get errors when sorting the result set in descending order by the average months between, and when restricting the results to only those vendors whose “average_months_between” is greater than or equal to 1.5 months.
If you're looking for months between, then include that function. If you just subtract two DATE datatype values, you'll get number of days between them.
Round the result where you select it (not just in the HAVING clause, although you'll probably want to round there as well).
Something like this:
CREATE OR REPLACE VIEW avgmnthsbetween
AS
SELECT
vendor_name,
ROUND(AVG(months_between(invoice_due_date, invoice_date)), 1) AS avg_months_betwn
FROM vendors INNER JOIN invoices
ON vendors.vendor_id = invoices.vendor_id
GROUP BY
vendor_name
HAVING
ROUND(AVG(months_between(invoice_due_date, invoice_date)), 1) >= 1.5
ORDER BY avg_months_betwn DESC;
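To make the days-versus-months point concrete, here is a small Oracle sketch with two illustrative date literals (they are made up for the example, not taken from the question's data). It shows what plain subtraction and MONTHS_BETWEEN each return for the same pair of dates.

```sql
-- Oracle: subtracting two DATE values yields days; MONTHS_BETWEEN yields months.
SELECT
    DATE '2017-03-15' - DATE '2017-01-15'                AS days_between,   -- 59
    MONTHS_BETWEEN(DATE '2017-03-15', DATE '2017-01-15') AS months_betwn    -- 2
FROM dual;
```

Because both dates fall on the same day of the month, MONTHS_BETWEEN returns a whole number here; for other pairs it returns a fractional value, which is why the ROUND(..., 1) in the view matters.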

SQL-How to Sum Data of Clients Over Time?

Goal: SUM/AVG Client Data over multiple dates/transactions.
Detailed Question: How do I properly Group clients ('PlayerID') then SUM the int(MinsPlayed), then AVG (AvgBet)?
Current Issue: my Results are giving individual transactions day by day over the 90 day time period instead of the SUM/AVG over the 90 days.
Current Script/Results: FirstName-Riley is showing each individual daily transaction instead of 1 total SUM/AVG over set time period
Firstly, you don't need to use DISTINCT as you are going to be aggregating the results using GROUP BY, so you can take that out.
The reason you are returning a row for each transaction is that your GROUP BY clause includes the column you are trying to aggregate (e.g. TimePlayed). Typically, you only want to GROUP BY the columns that are not being aggregated, so remove all the columns from the GROUP BY clause that you are aggregating using SUM or AVG (TimePlayed, PlayerSkill etc.).
Here's your current SQL:
SELECT DISTINCT CDS_StatDetail.PlayerID,
StatType,
FirstName,
LastName,
Email,
SUM(TimePlayed)/60 AS MinsPlayed,
SUM(CashIn) AS AvgBet,
SUM(PlayerSkill) AS AvgSkillRating,
SUM(PlayerSpeed) AS Speed,
CustomFlag1
FROM CDS_Player INNER JOIN CDS_StatDetail
ON CDS_Player.Player_ID = CDS_StatDetail.PlayerID
WHERE StatType='PIT' AND CDS_StatDetail.GamingDate >= '1/02/17' and CDS_StatDetail.GamingDate <= '4/02/2017' AND CustomFlag1='N'
GROUP BY CDS_StatDetail.PlayerID, StatType, FirstName, LastName, Email, TimePlayed, CashIn, PlayerSkill, PlayerSpeed, CustomFlag1
ORDER BY CDS_StatDetail.PlayerID
You want something like:
SELECT CDS_StatDetail.PlayerID,
SUM(TimePlayed)/60 AS MinsPlayed,
AVG(CashIn) AS AvgBet,
AVG(PlayerSkill) AS AvgSkillRating,
SUM(PlayerSpeed) AS Speed
FROM CDS_Player INNER JOIN CDS_StatDetail
ON CDS_Player.Player_ID = CDS_StatDetail.PlayerID
WHERE StatType='PIT' AND CDS_StatDetail.GamingDate BETWEEN '2017-01-02' AND '2017-04-02' AND CustomFlag1='N'
GROUP BY CDS_StatDetail.PlayerID
Next time, please copy and paste your text, not just linking to a screenshot.

How to join to inner query and calculate column based on different groupings?

I have a table that contains data about a series of visits to shops.
The raw data for these visits can be found here.
My main table will have 1 row per Country, and will use something along the lines of:
Select Distinct o.Country from OtherTable as o
I need to add a new column to my main table, that uses the following calculation:
"Avg Visits by User" = (Sum of (No. Call IDs / No. unique User IDs)
for each day) / No. unique of days (based on Actual Start) for the
row.
I have formed this additional select statement to get the number of calls and users by day - but I am struggling to join this to my main table:
Select DATEPART(DAY, c.ActualStart) As 'Day',
CAST(CAST(COUNT(c.CallID) AS DECIMAL (5,1))/CAST(COUNT(Distinct c.UserID) AS DECIMAL (5,1)) AS DECIMAL (5,1)) as 'Value' from CallInfo as c
where (c.Status = 3)
Group by DATEPART(DAY, c.ActualStart)
For the country GB, I would expect to see the following output:
Day     Calls  Users  Calls / Users
13-Jun  29     8      3.625
14-Jun  31     7      4.428571429
So, in my main table, the calculation for my new column would be:
8.053571 / 2
Therefore, if I somehow add this to my table I would expect the following output:
Country  Unique Days  Sum of (Calls / Users) for each day  Final Calc
GB       2            8.053571429                          4.026785714
I have tried adding this as a join, but I don't know how to join this to my main table. I could for example join on Call Id - but this would require the addition of a callID column in my inner query, and this would mean that the values are incorrect.
You can use a subquery to make the calculations by day, and then make the calculations by country on top of it. The resulting SQL query could look like this:
-- Make calculation by country, from the subquery
SELECT Country, UniqueDays = count(TheDay), CallsUserPerDay = sum(CallsPerUser),
FinalCalc = sum(CallsPerUser) / cast(count(TheDay) as DECIMAL)
FROM (
-- SUBQUERY: Make calculations by day
SELECT c.Country, CAST(c.ActualStart AS DATE) as TheDay, -- group by calendar day, not by exact timestamp
Calls = COUNT(c.CallID),
Users = COUNT(Distinct c.UserID),
COUNT(c.CallID)
/CAST(COUNT(Distinct c.UserID) AS DECIMAL) as CallsPerUser
FROM CallInfo as c
WHERE (c.Status = 3)
GROUP BY c.Country, CAST(c.ActualStart AS DATE)
) data
GROUP BY Country
Note: I avoid specifying a precision in the DECIMAL casts to avoid rounding in the final result.

SQL to calculate value of Shares at a particular time

I'm looking for a way to calculate what the value of shares is at a given time.
In the example I need to calculate and report on the redemptions of shares in a given month.
There are 3 tables that I need to look at:
Redemptions table that has the Date of the redemption, the number of shares that were redeemed and the type of share.
The share type table which has the share type and links the 1st and 3rd tables.
The Share price table which has the share type, valuation date, value.
So what I need is a report of the value of the redeemed shares, calculated from the number of shares redeemed and broken down by month.
Does that make sense?
Thanks in advance for your help!
Apologies, I think I should elaborate a little further as there might have been some misunderstandings. This isn't to calculate daily changing stocks and shares, it's more for fund management. What this means is that the share price only changes on a monthly basis and it's also normally a month behind.
The effect of this is that what the query needs to do is look at the date of the redemption and work out the month and year. Then it should look at the share price table, and if there's a share price for the given date (this will need to be calculated, as the price is recorded for a single day, i.e. the price was x on day y), multiply the number of units by this value. However, if there isn't a share price for the given date, then use the last price for that particular share type.
Hopefully this might be a little more clear but if there's any other information I can provide to make this easier then please let me know and I'll supply you with the information.
Regards,
Phil
This should do the trick (note: updated to group by ShareType):
SELECT
ST.ShareType,
RedemptionMonth = DateAdd(month, DateDiff(month, 0, R.RedemptionDate), 0),
TotalShareValueRedeemed = Sum(P.SharePrice * R.SharesRedeemed)
FROM
dbo.Redemption R
INNER JOIN dbo.ShareType ST
ON R.ShareTypeID = ST.ShareTypeID
CROSS APPLY (
SELECT TOP 1 P.*
FROM dbo.SharePrice P
WHERE
R.ShareTypeID = P.ShareTypeID
AND R.RedemptionDate >= P.SharePriceDate
ORDER BY P.SharePriceDate DESC
) P
GROUP BY
ShareType,
DateAdd(month, DateDiff(month, 0, R.RedemptionDate), 0)
ORDER BY
ShareType,
RedemptionMonth
;
See it working in a Sql Fiddle.
This can easily be parameterized by simply adding a WHERE clause with conditions on the Redemption table. If you need to show a 0 for share types in months where they had no Redemptions, please let me know and I'll improve my answer--it would help if you would fill out your use case scenario a little bit, and describe exactly what you want to input and what you want to see as output.
Also please note: I'm assuming here that there will always be a price for a share redemption--if a redemption exists that is before any share price for it, that redemption will be excluded.
If you have the valuations for every day, then the calculation is a simple join followed by an aggregation. The resulting query is something like:
select year(redemptiondate), month(redemptiondate),
sum(r.NumShares*sp.Price) as TotalPrice
from Redemptions r left outer join
ShareType st
on r.sharetype = st.sharetype left outer join
SharePrice sp
on st.sharename = sp.sharename and r.redemptiondate = sp.pricedate
group by year(redemptiondate), month(redemptiondate)
order by 1, 2;
If I understand your question, you need a query like
select shares.id, shares.name, sum (redemption.quant * shareprices.price)
from shares
inner join redemption on shares.id = redemption.share
inner join shareprices on shares.id = shareprices.share
where redemption.curdate between :p1 and :p2
group by shares.id, shares.name
order by shares.id
:p1 and :p2 are date parameters
If you just need it for one date range:
SELECT s.ShareType, SUM(ISNULL(sp.SharePrice, 0) * ISNULL(r.NumRedemptions, 0)) [RedemptionPrice]
FROM dbo.Shares s
LEFT JOIN dbo.Redemptions r
ON r.ShareType = s.ShareType
OUTER APPLY (
SELECT TOP 1 SharePrice
FROM dbo.SharePrice p
WHERE p.ShareType = s.ShareType
AND p.ValuationDate <= r.RedemptionDate
ORDER BY p.ValuationDate DESC) sp
WHERE r.RedemptionDate BETWEEN @Date1 AND @Date2
GROUP BY s.ShareType
Where @Date1 and @Date2 are your dates
The ISNULL checks are just there so it actually gives you a value if something is null (it'll be 0). It's completely optional in this case, just a personal preference.
The OUTER APPLY acts like a LEFT JOIN that will filter down the results from SharePrice to make sure you get the most recent ValuationDate from table based on the RedemptionDate, even if it wasn't from the same date range as that date. It could probably be achieved another way, but I feel like this is easily readable.
If you don't feel comfortable with the OUTER APPLY, you could use a subquery in the SELECT part (i.e., ISNULL(r.NumRedemptions, 0) * (/* subquery from dbo.SharePrice here */)).
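A rough sketch of that subquery variant (T-SQL), reusing the table and column names from the query above. One caveat: SQL Server does not allow a subquery directly inside SUM, so this sketch looks up the price per redemption row in a derived table and aggregates outside it. The DECLARE values are illustrative placeholders, and the dbo.Shares join is left out because the date filter on Redemptions discards share types without redemptions anyway.

```sql
-- Hypothetical date range; replace with the real parameters.
DECLARE @Date1 DATE = '2017-01-01';
DECLARE @Date2 DATE = '2017-01-31';

SELECT
    v.ShareType,
    SUM(v.NumRedemptions * v.SharePrice) AS RedemptionPrice
FROM (
    SELECT
        r.ShareType,
        ISNULL(r.NumRedemptions, 0) AS NumRedemptions,
        -- Correlated subquery: most recent price on or before the redemption date.
        ISNULL((
            SELECT TOP 1 p.SharePrice
            FROM dbo.SharePrice AS p
            WHERE p.ShareType = r.ShareType
              AND p.ValuationDate <= r.RedemptionDate
            ORDER BY p.ValuationDate DESC
        ), 0) AS SharePrice
    FROM dbo.Redemptions AS r
    WHERE r.RedemptionDate BETWEEN @Date1 AND @Date2
) AS v
GROUP BY v.ShareType;
```

Redemptions with no earlier price are valued at 0 here because of the ISNULL; drop it if you would rather have them contribute NULL.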

SQL: HAVING clause

See the following SQL statement:
SELECT datediff("d", MAX(invoice.date), Now) As Date_Diff
, MAX(invoice.date) AS max_invoice_date
, customer.number AS customer_number
FROM invoice
INNER JOIN customer
ON invoice.customer_number = customer.number
GROUP BY customer.number
If the following were added:
HAVING datediff("d", MAX(invoice.date), Now) > 365
would this simply exclude rows with Date_Diff <= 365?
What should be the effect of the HAVING clause here?
EDIT: I am not experiencing what the answers here are saying. A copy of the mdb is at http://hotfile.com/dl/40641614/2353dfc/test.mdb.html (no macros or viruses). VISDATA.EXE is being used to execute the queries.
EDIT2: I think the problem might be VISDATA, because I am experiencing different results via DAO.
As already pointed out, yes, that is the effect. For completeness, 'HAVING' is like 'WHERE', but for the already aggregated (grouped) values (such as, MAX in this case, or SUM, or COUNT, or any of the other aggregate functions).
Yes, it would exclude those rows.
Yes, that is what it would do.
WHERE applies to all of the individual rows, so WHERE MAX(...) would match all rows.
HAVING is like WHERE, but within the current group. That means you can do things like HAVING count(*) > 1, which will only show groups with more than one result.
So to answer your question, it would only include rows where the record in the group that has the highest (MAX) date is greater than 365. In this case you are also selecting MAX(date), so yes, it excludes rows with date_diff <= 365.
However, you could select MIN(date) and see the minimum date in all the groups that have a maximum date of greater than 365. In this case it would not exclude "rows" with date_diff <= 365, but rather groups with max(date_diff) <= 365.
Hopefully it's not too confusing...
You may be trying the wrong thing with your MAX. By MAXing the invoice.date column you are effectively looking for the most recent invoice associated with the customer. So effectively the HAVING condition is selecting all those customers who have not had any invoices within the last 365 days.
Is this what you are trying to do? Or are you actually trying to get all customers who have at least one invoice from more than a year ago? If that is the case, then you should put the MAX outside the datediff function.
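A sketch of that second reading in Access SQL: customers with at least one invoice older than 365 days. It takes the MAX of the per-invoice age instead of the age of the MAX date. The alias oldest_invoice_age is made up for the example, and date and number are bracketed because they are reserved words. The block carries no inline comments because Access SQL does not accept them.

```sql
SELECT customer.[number] AS customer_number,
       MAX(DateDiff("d", invoice.[date], Now())) AS oldest_invoice_age
FROM invoice
INNER JOIN customer
    ON invoice.customer_number = customer.[number]
GROUP BY customer.[number]
HAVING MAX(DateDiff("d", invoice.[date], Now())) > 365;
```

A customer with one recent invoice and one old invoice appears in this result but not in the original HAVING query, which only keeps customers whose most recent invoice is older than a year.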
That depends on whether you mean rows in the table or rows in the result. The HAVING clause filters the result after grouping, so it would eliminate customers, not invoices.
If you want to filter out the new invoices rather than the customers with new invoices, you should use where instead so that you filter before grouping:
select
datediff("d",
max(invoice.date), Now) As Date_Diff,
max(invoice.date) as max_invoice_date,
customer.number
from
invoice
inner join customer on invoice.customer_number = customer.number
where
datediff("d", invoice.date, Now) > 365
group by
customer.number
I wouldn't use a GROUP BY query at all. Using standard Jet SQL:
SELECT Customer.Number
FROM [SELECT DISTINCT Invoice.Customer_Number
FROM Invoice
WHERE (((Invoice.[Date])>Date()-365));]. AS Invoices
RIGHT JOIN Customer ON Invoices.Customer_Number = Customer.Number
WHERE (((Invoices.Customer_Number) Is Null));
Using SQL92 compatibility mode:
SELECT Customer.Number
FROM (SELECT DISTINCT Invoice.Customer_Number
FROM Invoice
WHERE (((Invoice.[Date])>Date()-365))) AS Invoices
RIGHT JOIN Customer ON Invoices.Customer_Number = Customer.Number
WHERE (((Invoices.Customer_Number) Is Null));
The key here is to get the set of customer numbers that have had an invoice in the last year, and then do an OUTER JOIN on that result set to return only those customers who are not in that set.