SQL-How to Sum Data of Clients Over Time? - sql

Goal: SUM/AVG Client Data over multiple dates/transactions.
Detailed Question: How do I properly Group clients ('PlayerID') then SUM the int(MinsPlayed), then AVG (AvgBet)?
Current Issue: my Results are giving individual transactions day by day over the 90 day time period instead of the SUM/AVG over the 90 days.
Current Script/Results: FirstName-Riley is showing each individual daily transaction instead of 1 total SUM/AVG over set time period

Firstly, you don't need to use DISTINCT as you are going to be aggregating the results using GROUP BY, so you can take that out.
The reason you are returning a row for each transaction is that your GROUP BY clause includes the column you are trying to aggregate (e.g. TimePlayed). Typically, you only want to GROUP BY the columns that are not being aggregated, so remove all the columns from the GROUP BY clause that you are aggregating using SUM or AVG (TimePlayed, PlayerSkill etc.).

Here's your current SQL:
SELECT DISTINCT CDS_StatDetail.PlayerID,
StatType,
FirstName,
LastName,
Email,
SUM(TimePlayed)/60 AS MinsPlayed,
SUM(CashIn) AS AvgBet,
SUM(PlayerSkill) AS AvgSkillRating,
SUM(PlayerSpeed) AS Speed,
CustomFlag1
FROM CDS_Player INNER JOIN CDS_StatDetail
ON CDS_Player.Player_ID = CDS_StatDetail.PlayerID
WHERE StatType='PIT' AND CDS_StatDetail.GamingDate >= '1/02/17' and CDS_StatDetail.GamingDate <= '4/02/2017' AND CustomFlag1='N'
GROUP BY CDS_StatDetail.PlayerID, StatType, FirstName, LastName, Email, TimePlayed, CashIn, PlayerSkill, PlayerSpeed, CustomFlag1
ORDER BY CDS_StatDetail.PlayerID
You want something like:
SELECT CDS_StatDetail.PlayerID,
SUM(TimePlayed)/60 AS MinsPlayed,
AVG(CashIn) AS AvgBet,
AVG(PlayerSkill) AS AvgSkillRating,
SUM(PlayerSpeed) AS Speed,
FROM CDS_Player INNER JOIN CDS_StatDetail
ON CDS_Player.Player_ID = CDS_StatDetail.PlayerID
WHERE StatType='PIT' AND CDS_StatDetail.GamingDate BETWEEN '2017-01-02' AND '2017-04-02' AND CustomFlag1='N'
GROUP BY CDS_StatDetail.PlayerID
Next time, please copy and paste your text, not just linking to a screenshot.

Related

Grouping and Summing in SQL

Good day
I would like to find out how I would Sort and group the following data by grouping the Account and summing the grandTotal of the Account per Month.
This is my current Select statement:
SELECT
tbl_AccountLedger.ledgerName
,tbl_SalesMaster.date
, tbl_SalesMaster.grandTotal
FROM tbl_SalesMaster
INNER JOIN tbl_AccountLedger ON tbl_SalesMaster.ledgerId =tbl_AccountLedger.ledgerId
Here is wat my select statement is bringing back
Now I need to sort this data by summing the grand total for each month for a legerName
You can use the below query on SQL 2012 or higher version. I will suggest please read grouping in SQL.
SELECT
tbl_AccountLedger.ledgerName
,DATEFROMPARTS(YEAR(tbl_SalesMaster.date),MONTH(tbl_SalesMaster.date),1) AS Month
, SUM(tbl_SalesMaster.grandTotal) AS TotalGrandTotal
FROM tbl_SalesMaster
INNER JOIN tbl_AccountLedger ON tbl_SalesMaster.ledgerId =tbl_AccountLedger.ledgerId
GROUP BY tbl_AccountLedger.ledgerName, DATEFROMPARTS(YEAR(tbl_SalesMaster.date),MONTH(tbl_SalesMaster.date),1)

Returning single records per month

I have a use case function that needs to returns a single row only for every end of month.
I tried using select distinct and it is showing multiple records for the same end of month
SELECT DISTINCT CASE
WHEN eff_interest_balance < 0.01 THEN trial_balance_date
WHEN date_paid < trial_balance_date THEN date_paid
END as A
, period
FROM dbo.Intpayments[enter image description here][1]
WHERE loan_number = 60023
ORDER BY period ASC
Each row should return single date for each month
Distinct is returning unique rows, not grouping them. You are looking to aggregate rows. This means using some combination of aggregate functions and group by.
What your current query is missing is some sort of logic for aggregating the rows that are in the same period. Do you want to compare the sum of these values? The min, the max?
In any case, the basic idea of aggregating and grouping would look like this - I don't think this summing is what you want, but the query shows the basic idea of aggregating and grouping:
SELECT
period
, SUM(eff_interest_balance) AS SumOfBalance
FROM dbo.Intpayments
WHERE loan_number = 60023
GROUP BY period

sql select number divided aggregate sum function

I have this schema
and I want to have a query to calculate the cost per consultant per hour per month. In other words, a consultant has a salary per month, I want to divide the amount of the salary between the hours that he/she worked that month.
SELECT
concat_ws(' ', consultants.first_name::text, consultants.last_name::text) as name,
EXTRACT(MONTH FROM tasks.init_time) as task_month,
SUM(tasks.finish_time::timestamp::time - tasks.init_time::timestamp::time) as duration,
EXTRACT(MONTH FROM salaries.payment_date) as salary_month,
salaries.payment
FROM consultants
INNER JOIN tasks ON consultants.id = tasks.consultant_id
INNER JOIN salaries ON consultants.id = salaries.consultant_id
WHERE EXTRACT(MONTH FROM tasks.init_time) = EXTRACT(MONTH FROM salaries.payment_date)
GROUP BY (consultants.id, EXTRACT(MONTH FROM tasks.init_time), EXTRACT(MONTH FROM salaries.payment_date), salaries.payment);
It is not possible to do this in the select
salaries.payment / SUM(tasks.finish_time::timestamp::time - tasks.init_time::timestamp::time)
Is there another way to do it? Is it possible to solve it in one query?
Assumptions made for this answer:
The model is not entirely clear to me, so I am assuming the following:
you are using PostgreSQL
salaries.date is defined as a date column that stores the day when a consultant was paid
tasks.init_time and task.finish_time are defined as timestamp storing the data & time when a consultant started and finished work on a specific task.
Your join on only the month is wrong as far as I can tell. For one, because it would also include months from different years, but more importantly because this would lead to a result where the same row from salaries appeared several times. I think you need to join on the complete date:
FROM consultants c
JOIN tasks t ON c.id = t.consultant_id
JOIN salaries s ON c.id = s.consultant_id
AND t.init_time::date = s.payment_date --<< here
If my assumptions about the data types are correct, the cast to a timestamp and then back to a time is useless and wrong. Useless because you can simply subtract to timestamps and wrong because you are ignoring the actual date in the timestamp so (although unlikely) if init_time and finish_time are not on the same day, the result is wrong.
So the calculation of the duration can be simplified to:
t.finish_time - t.init_time
To get the cost per hour per month, you need to convert the interval (which is the result when subtracting one timestamp from another) to a decimal indicating the hours, you can do this by extracting the seconds from the interval and then dividing that by 3600, e.g.
extract(epoch from sum(t.finish_time - t.init_time)) / 3600)
If you divide the sum of the payments by that number you get your cost per hour per month:
SELECT concat_ws(' ', c.first_name, c.last_name) as name,
to_char(s.payment_date, 'yyyy-mm') as salary_month,
extract(epoch from sum(t.finish_time - t.init_time)) / 3600 as worked_hours,
sum(s.payment) / (extract(epoch from sum(t.finish_time - t.init_time)) / 3600) as cost_per_hour
FROM consultants c
JOIN tasks t ON c.id = t.consultant_id
JOIN salaries s ON c.id = s.consultant_id AND t.init_time::date = s.payment_date
GROUP BY c.id, to_char(s.payment_date, 'yyyy-mm') --<< no parentheses!
order by name, salary_month;
As you want the report broken down by month you should convert the month into something that contains the year as well. I used to_char() to get a string with only year and month. You also need to remove salaries.payment from the group by clause.
You also don't need the "payment month" and "salary month" because both will always be the same as that is the join condition.
And finally you don't need the cast to ::text for the name columns because they are most certainly defined as varchar or text anyway.
The sample data I made up for this: http://sqlfiddle.com/#!15/ae0c9
Somewhat unrelated, but:
You should also not put the column list of the group by in parentheses. Putting a column list in parentheses in Postgres creates an anonymous record which is something completely different then having multiple columns. This is also true for the columns in the select list.
If at all the target is putting it in one query, then just confirming, have you tried to achieve it using CTEs?
Like
;WITH cte_pymt
AS
(
//Your existing query 1
)
SELECT <your required data> FROM cte_pymt

Access: Having trouble with getting average movies per day

I have a database project at my school and I am almost finished. The only thing that I need is average movies per day. I have a watchhistory where you can find the users who have watch a movie. The instrucition is that you filter the people out of the watchhistory who have an average of 2 movies per day.
I wrote the following SQL statement. But every time I get errors. Can someone help me?
SQL:
SELECT
customer_mail_address,
COUNT(movie_id) AS AantalBekeken,
COUNT(movie_id) / SUM(GETDATE() -
(SELECT subscription_start FROM Customer)) AS AveragePerDay
FROM
Watchhistory
GROUP BY
customer_mail_address
The error:
Msg 130, Level 15, State 1, Line 1
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
I tried something different and this query sums the total movie's per day. Now I need the average of everything and that SQL only shows the cusotmers who are have more than 2 movies per day average.
SELECT
Count(movie_id) as AantalPerDag,
Customer_mail_address,
Cast(watchhistory.watch_date as Date) as Date
FROM
Watchhistory
GROUP BY
customer_mail_address, Cast(watch_date as Date)
The big problem that I see is that you're trying to use a subquery as if it's a single value. A subquery could potentially return many values, and unless you have only one customer in your system it will do exactly that. You should be JOINing to the Customer table instead. Hopefully the JOIN only returns one customer per row in WatchHistory. If that's not the case then you'll have more work to do there.
SELECT
customer_mail_address,
COUNT(movie_id) AS AantalBekeken,
CAST(COUNT(movie_id) AS DECIMAL(10, 4)) / DATEDIFF(dy, C.subscription_start, GETDATE()) AS AveragePerDay
FROM
WatchHistory WH
INNER JOIN Customer C ON C.customer_id = WH.customer_id -- I'm guessing at the join criteria here since no table structures were provided
GROUP BY
C.customer_mail_address,
C.subscription_start
HAVING
COUNT(movie_id) / DATEDIFF(dy, C.subscription_start, GETDATE()) <> 2
I'm guessing that the criteria isn't exactly 2 movies per day, but either less than 2 or more than 2. You'll need to adjust based on that. Also, you'll need to adjust the precision for the average based on what you want.
What the error message is telling you is that you can't use SUM together with COUNT.
try putting SUM(GETDATE()-(SELECT subscription_start FROM Customer)) as your second aggregate variable, and
try using HAVING & FILTER at the end of your query to select only the users that have count/sum = 2
maybe this is what you need?
lets join the two tables Watchhistory and Customers
select customer_mail_address,
COUNT(movie_id) AS AantalBekeken,
COUNT(movie_id) / datediff(Day, GETDATE(),Customer.subscription_start) AS AveragePerDay
from Watchhistory inner join Customer
on Watchhistory.customer_mail_address = Customer.customer_mail_address
GROUP BY
customer_mail_address
having AveragePerDay = 2
change the last line of code according to what you need (I did not understand if you want it in or out)
I got it guys. Finally :)
SELECT customer_mail_address, SUM(AveragePerDay) / COUNT(customer_mail_address) AS gemiddelde
FROM (SELECT DISTINCT customer_mail_address, COUNT(CAST(watch_date AS date)) AS AveragePerDay
FROM dbo.Watchhistory
GROUP BY customer_mail_address, CAST(watch_date AS date)) AS d
GROUP BY customer_mail_address
HAVING (SUM(AveragePerDay) / COUNT(customer_mail_address) >= 2

SQL: Average value per day

I have a database called ‘tweets’. The database 'tweets' includes (amongst others) the rows 'tweet_id', 'created at' (dd/mm/yyyy hh/mm/ss), ‘classified’ and 'processed text'. Within the ‘processed text’ row there are certain strings such as {TICKER|IBM}', to which I will refer as ticker-strings.
My target is to get the average value of ‘classified’ per ticker-string per day. The row ‘classified’ includes the numerical values -1, 0 and 1.
At this moment, I have a working SQL query for the average value of ‘classified’ for one ticker-string per day. See the script below.
SELECT Date( `created_at` ) , AVG( `classified` ) AS Classified
FROM `tweets`
WHERE `processed_text` LIKE '%{TICKER|IBM}%'
GROUP BY Date( `created_at` )
There are however two problems with this script:
It does not include days on which there were zero ‘processed_text’s like {TICKER|IBM}. I would however like it to spit out the value zero in this case.
I have 100+ different ticker-strings and would thus like to have a script which can process multiple strings at the same time. I can also do them manually, one by one, but this would cost me a terrible lot of time.
When I had a similar question for counting the ‘tweet_id’s per ticker-string, somebody else suggested using the following:
SELECT d.date, coalesce(IBM, 0) as IBM, coalesce(GOOG, 0) as GOOG,
coalesce(BAC, 0) AS BAC
FROM dates d LEFT JOIN
(SELECT DATE(created_at) AS date,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|IBM}%' then tweet_id
END) as IBM,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|GOOG}%' then tweet_id
END) as GOOG,
COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|BAC}%' then tweet_id
END) as BAC
FROM tweets
GROUP BY date
) t
ON d.date = t.date;
This script worked perfectly for counting the tweet_ids per ticker-string. As I however stated, I am not looking to find the average classified scores per ticker-string. My question is therefore: Could someone show me how to adjust this script in such a way that I can calculate the average classified scores per ticker-string per day?
SELECT d.date, t.ticker, COALESCE(COUNT(DISTINCT tweet_id), 0) AS tweets
FROM dates d
LEFT JOIN
(SELECT DATE(created_at) AS date,
SUBSTR(processed_text,
LOCATE('{TICKER|', processed_text) + 8,
LOCATE('}', processed_text, LOCATE('{TICKER|', processed_text))
- LOCATE('{TICKER|', processed_text) - 8)) t
ON d.date = t.date
GROUP BY d.date, t.ticker
This will put each ticker on its own row, not a column. If you want them moved to columns, you have to pivot the result. How you do this depends on the DBMS. Some have built-in features for creating pivot tables. Others (e.g. MySQL) do not and you have to write tricky code to do it; if you know all the possible values ahead of time, it's not too hard, but if they can change you have to write dynamic SQL in a stored procedure.
See MySQL pivot table for how to do it in MySQL.