SQL Query for Normalized Sales - sql

I need to write an SQL query that solves this problem:
"Get the top 10 departments overall ranked by total sales normalized by the size
of the store where the sales were recorded."
Normalizing the sales means to divide the number of sales by the size of the store per sale. So in other words I need a query that returns the top 10 departments with the greatest sum of WeeklySales/size for every sale by that department. Eg: (week1sales/size1) + (week2sales/size2) + ...
Here is the data in the database where bolded attributes are keys.
-Holidays (WeekDate, IsHoliday)
-Stores (Store, Type, Size)
-TemporalData (Store, WeekDate, Temperature, FuelPrice, CPI, UnemploymentRate)
-Sales (Store, Dept, WeekDate, WeeklySales)
(WeekDate is the date for the first day of the week. WeeklySales is an integer for the number of sales that week.)
The main issue I'm having with this query is figuring out how to find the sum of all sales for each department. How would I keep track of the total normalized sales for each department in a query and then add them all together? Also, this query will have to run in SQLite3, if that makes any difference.
Edit: Explained normalized in this context.

Your question isn't clear, but for one single week you need something like this:
SELECT *
FROM (
SELECT Sa.Store, Sa.Dept, SUM(WeeklySales) * 1.0 / St.Size as normalized -- * 1.0 avoids integer division in SQLite
FROM Sales Sa
JOIN Stores St
ON Sa.Store = St.Store
WHERE Sa.WeekDate = #date#
GROUP BY Sa.Store, Sa.Dept
) AS T
ORDER BY normalized DESC
LIMIT 10
If you want the cumulative total for all weeks:
SELECT *
FROM (
SELECT Sa.Store, Sa.Dept, SUM(WeeklySales) * 1.0 / St.Size as normalized -- * 1.0 avoids integer division in SQLite
FROM Sales Sa
JOIN Stores St
ON Sa.Store = St.Store
GROUP BY Sa.Store, Sa.Dept
) AS T
ORDER BY normalized DESC
LIMIT 10
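The queries above rank (Store, Dept) pairs. If the goal is one row per department, as the question's wording suggests, one possible sketch is to divide each weekly row by its store's size and sum per Dept. It is wrapped in Python's sqlite3 module only so that it runs end to end; the sample rows are made up, and the table and column names are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stores (Store INTEGER PRIMARY KEY, Type TEXT, Size INTEGER);
    CREATE TABLE Sales (Store INTEGER, Dept INTEGER, WeekDate TEXT, WeeklySales INTEGER,
                        PRIMARY KEY (Store, Dept, WeekDate));
    -- Made-up sample rows, only to exercise the query.
    INSERT INTO Stores VALUES (1, 'A', 100), (2, 'B', 200);
    INSERT INTO Sales VALUES
        (1, 10, '2012-01-06', 50),
        (1, 10, '2012-01-13', 150),
        (2, 10, '2012-01-06', 100),
        (1, 20, '2012-01-06', 300),
        (2, 20, '2012-01-06', 100);
""")

# Normalize every weekly row by its store size, then sum per department.
# Multiplying by 1.0 forces real division (integer / integer truncates in SQLite).
query = """
    SELECT Sa.Dept,
           SUM(Sa.WeeklySales * 1.0 / St.Size) AS normalized
    FROM Sales Sa
    JOIN Stores St ON Sa.Store = St.Store
    GROUP BY Sa.Dept
    ORDER BY normalized DESC
    LIMIT 10
"""
rows = conn.execute(query).fetchall()
for dept, normalized in rows:
    print(dept, normalized)
```

With these sample rows, department 20 ranks above department 10, because the division happens per sale row before the sum.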

Related

Calculating the percentage of different types of customer feedback in each quarter

The problem statement is: I have a table (order_t) which has customer feedback (one column) and quarter number (as another column).
Using a CTE, I need to calculate the percentage of number of customer feedback in each category as well as the total number of customer feedback in each quarter.
After this happens, I need the percentage of different types of customer feedback (like good, bad, ok, very good, very bad) but using CTE.
How can I solve this statement?
I tried to calculate the total customer feedback as:
WITH total_feedback AS
(
SELECT *
COUNT(CUSTOMER_FEEDBACK), QUARTER NUMBER
FROM
table1
GROUP BY
2
)
But I'm unable to calculate the first half portion, i.e. percentage of different types of customer feedback in each quarter using CTE.
How can I do that?
Find the file of the data
What you could do, and I'll keep the example as close to the code you provided as possible, is the following - using 2 CTE's:
WITH total_feedback AS (
SELECT COUNT(CUSTOMER_FEEDBACK) AS total_feedback, QUARTER_NUMBER
FROM table1
GROUP BY 2
),
category_feedback AS (
SELECT COUNT(CUSTOMER_FEEDBACK) AS feedback_count, CUSTOMER_FEEDBACK, QUARTER_NUMBER
FROM table1
GROUP BY 2, 3
)
SELECT
category_feedback.CUSTOMER_FEEDBACK,
category_feedback.QUARTER_NUMBER,
feedback_count * 100.0 / total_feedback.total_feedback AS feedback_percentage
FROM category_feedback
INNER JOIN total_feedback
ON category_feedback.QUARTER_NUMBER = total_feedback.QUARTER_NUMBER
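To check the two-CTE approach on a small data set, here is a sketch that runs it through Python's sqlite3 module. The sample rows are made up, the total column is renamed to total_count for readability, and the percentage multiplies by 100.0 so that the division is not done on integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (CUSTOMER_FEEDBACK TEXT, QUARTER_NUMBER INTEGER);
    -- Made-up sample rows: 4 feedback entries in quarter 1, 2 in quarter 2.
    INSERT INTO table1 VALUES
        ('Good', 1), ('Good', 1), ('Bad', 1), ('Okay', 1),
        ('Good', 2), ('Bad', 2);
""")

query = """
    WITH total_feedback AS (
        -- one row per quarter with the total number of feedback entries
        SELECT QUARTER_NUMBER, COUNT(CUSTOMER_FEEDBACK) AS total_count
        FROM table1
        GROUP BY QUARTER_NUMBER
    ),
    category_feedback AS (
        -- one row per quarter and feedback category
        SELECT QUARTER_NUMBER, CUSTOMER_FEEDBACK, COUNT(*) AS feedback_count
        FROM table1
        GROUP BY QUARTER_NUMBER, CUSTOMER_FEEDBACK
    )
    SELECT c.QUARTER_NUMBER, c.CUSTOMER_FEEDBACK,
           c.feedback_count * 100.0 / t.total_count AS feedback_percentage
    FROM category_feedback c
    JOIN total_feedback t ON c.QUARTER_NUMBER = t.QUARTER_NUMBER
    ORDER BY c.QUARTER_NUMBER, c.CUSTOMER_FEEDBACK
"""
percentages = conn.execute(query).fetchall()
for row in percentages:
    print(row)
```

The percentages within each quarter add up to 100, which is a quick sanity check on the join.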

SQL-How to Sum Data of Clients Over Time?

Goal: SUM/AVG Client Data over multiple dates/transactions.
Detailed Question: How do I properly Group clients ('PlayerID') then SUM the int(MinsPlayed), then AVG (AvgBet)?
Current Issue: my Results are giving individual transactions day by day over the 90 day time period instead of the SUM/AVG over the 90 days.
Current Script/Results: FirstName-Riley is showing each individual daily transaction instead of 1 total SUM/AVG over set time period
Firstly, you don't need to use DISTINCT as you are going to be aggregating the results using GROUP BY, so you can take that out.
The reason you are returning a row for each transaction is that your GROUP BY clause includes the column you are trying to aggregate (e.g. TimePlayed). Typically, you only want to GROUP BY the columns that are not being aggregated, so remove all the columns from the GROUP BY clause that you are aggregating using SUM or AVG (TimePlayed, PlayerSkill etc.).
Here's your current SQL:
SELECT DISTINCT CDS_StatDetail.PlayerID,
StatType,
FirstName,
LastName,
Email,
SUM(TimePlayed)/60 AS MinsPlayed,
SUM(CashIn) AS AvgBet,
SUM(PlayerSkill) AS AvgSkillRating,
SUM(PlayerSpeed) AS Speed,
CustomFlag1
FROM CDS_Player INNER JOIN CDS_StatDetail
ON CDS_Player.Player_ID = CDS_StatDetail.PlayerID
WHERE StatType='PIT' AND CDS_StatDetail.GamingDate >= '1/02/17' and CDS_StatDetail.GamingDate <= '4/02/2017' AND CustomFlag1='N'
GROUP BY CDS_StatDetail.PlayerID, StatType, FirstName, LastName, Email, TimePlayed, CashIn, PlayerSkill, PlayerSpeed, CustomFlag1
ORDER BY CDS_StatDetail.PlayerID
You want something like:
SELECT CDS_StatDetail.PlayerID,
SUM(TimePlayed)/60 AS MinsPlayed,
AVG(CashIn) AS AvgBet,
AVG(PlayerSkill) AS AvgSkillRating,
SUM(PlayerSpeed) AS Speed
FROM CDS_Player INNER JOIN CDS_StatDetail
ON CDS_Player.Player_ID = CDS_StatDetail.PlayerID
WHERE StatType='PIT' AND CDS_StatDetail.GamingDate BETWEEN '2017-01-02' AND '2017-04-02' AND CustomFlag1='N'
GROUP BY CDS_StatDetail.PlayerID
Next time, please copy and paste your text rather than just linking to a screenshot.
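The effect of the GROUP BY list can be reproduced on a tiny table. This is only a sketch: the table is trimmed to four columns, the rows are made up, and the dates are assumed to be stored as ISO 'YYYY-MM-DD' text so that BETWEEN compares them correctly. It runs through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CDS_StatDetail (PlayerID INTEGER, GamingDate TEXT,
                                 TimePlayed INTEGER, CashIn REAL);
    -- Made-up rows: player 1 has two transactions, player 2 has one.
    INSERT INTO CDS_StatDetail VALUES
        (1, '2017-01-05', 600, 10.0),
        (1, '2017-02-10', 1200, 30.0),
        (2, '2017-03-01', 300, 20.0);
""")

# Grouping by the aggregated column too: every distinct TimePlayed
# value becomes its own group, so player 1 still gets two rows.
too_fine = conn.execute("""
    SELECT PlayerID, SUM(TimePlayed) / 60 AS MinsPlayed
    FROM CDS_StatDetail
    GROUP BY PlayerID, TimePlayed
    ORDER BY PlayerID, MinsPlayed
""").fetchall()

# Grouping by PlayerID only: one row per player over the whole period.
per_player = conn.execute("""
    SELECT PlayerID, SUM(TimePlayed) / 60 AS MinsPlayed, AVG(CashIn) AS AvgBet
    FROM CDS_StatDetail
    WHERE GamingDate BETWEEN '2017-01-02' AND '2017-04-02'
    GROUP BY PlayerID
    ORDER BY PlayerID
""").fetchall()

print(too_fine)
print(per_player)
```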

How to join to inner query and calculate column based on different groupings?

I have a table that contains data about a series of visits to shops.
The raw data for these visits can be found here.
My main table will have 1 row per Country, and will use something along the lines of:
Select Distinct o.Country from OtherTable as o
I need to add a new column to my main table that uses the following calculation:
"Avg Visits by User" = (Sum of (No. Call IDs / No. unique User IDs) for each day) / No. of unique days (based on Actual Start) for the row.
I have formed this additional select statement to get the number of calls and users by day - but I am struggling to join this to my main table:
Select DATEPART(DAY, c.ActualStart) As 'Day',
CAST(CAST(COUNT(c.CallID) AS DECIMAL (5,1))/CAST(COUNT(Distinct c.UserID) AS DECIMAL (5,1)) AS DECIMAL (5,1)) as 'Value' from CallInfo as c
where (c.Status = 3)
Group by DATEPART(DAY, c.ActualStart)
For the country GB, I would expect to see the following output:
Day     Calls  Users  Calls / Users
13-Jun  29     8      3.625
14-Jun  31     7      4.428571429
So, in my main table, the calculation for my new column would be:
8.053571 / 2
Therefore, if I somehow add this to my table I would expect the following output:
Country  Unique Days  Sum of (Calls / Users) for each day  Final Calc
GB       2            8.053571429                          4.026785714
I have tried adding this as a join, but I don't know how to join this to my main table. I could for example join on Call Id - but this would require the addition of a callID column in my inner query, and this would mean that the values are incorrect.
You can use a subquery to make the calculations by day, and then make the calculations by country on top of it. The resulting SQL query could look like this:
-- Make calculation by country, from the subquery
SELECT Country, UniqueDays = count(TheDay), CallsUserPerDay = sum(CallsPerUser),
FinalCalc = sum(CallsPerUser) / cast(count(TheDay) as DECIMAL)
FROM (
-- SUBQUERY: Make calculations by day
SELECT c.Country, c.ActualStart as TheDay,
Calls = COUNT(c.CallID),
Users = COUNT(Distinct c.UserID),
COUNT(c.CallID)
/CAST(COUNT(Distinct c.UserID) AS DECIMAL) as CallsPerUser
FROM CallInfo as c
WHERE (c.Status = 3)
GROUP BY c.Country, c.ActualStart
) data
GROUP BY Country
Note: I omit the precision in the DECIMAL casts to avoid rounding the final result.
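To check the two-level aggregation against the GB figures from the question, here is a sketch using SQLite through Python's sqlite3 module. It is not the T-SQL above: the year, the timestamps and the user numbering are made up, and it assumes ActualStart carries a time of day, so the inner query groups on DATE(ActualStart) rather than on the raw value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE CallInfo (CallID INTEGER PRIMARY KEY, UserID INTEGER,
                Country TEXT, ActualStart TEXT, Status INTEGER)""")

# Rebuild the question's counts: 29 calls by 8 users on 13 June,
# 31 calls by 7 users on 14 June (year and time of day are made up).
rows, call_id = [], 0
for day, calls, users in [("2016-06-13", 29, 8), ("2016-06-14", 31, 7)]:
    for n in range(calls):
        call_id += 1
        rows.append((call_id, n % users, "GB", day + " 09:00:00", 3))
conn.executemany("INSERT INTO CallInfo VALUES (?, ?, ?, ?, ?)", rows)

query = """
    SELECT Country,
           COUNT(*) AS UniqueDays,
           SUM(CallsPerUser) AS SumCallsPerUser,
           SUM(CallsPerUser) / COUNT(*) AS FinalCalc
    FROM (
        -- inner level: one row per country and calendar day
        SELECT Country, DATE(ActualStart) AS TheDay,
               COUNT(CallID) * 1.0 / COUNT(DISTINCT UserID) AS CallsPerUser
        FROM CallInfo
        WHERE Status = 3
        GROUP BY Country, DATE(ActualStart)
    ) AS per_day
    GROUP BY Country
"""
result = conn.execute(query).fetchall()
print(result)
```

The outer query sees exactly one row per day, so COUNT(*) there is the number of unique days.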

Two tables with no direct relationship

I have 2 tables with no relation between them. I want to display the data in tabular format by month. Here is a sample output:
There are 2 different tables
1 for income
1 for expense
Problem is that we have no direct relation between these. The only commonality between them is month (date). Does anyone have a suggestion on how to generate such a report?
Here are my union queries:
SELECT TO_DATE(TO_CHAR(PAY_DATE,'MON-YYYY'), 'MON-YYYY') , 'FEE RECEIPT', NVL(SUM(SFP.AMOUNT_PAID),0) AMT_RECIEVED
FROM STU_FEE_PAYMENT SFP, STU_CLASS SC, CLASS C
WHERE SC.CLASS_ID = C.CLASS_ID
AND SFP.STUDENT_NO = SC.STUDENT_NO
AND PAY_DATE BETWEEN '01-JAN-2014' AND '31-DEC-2014'
AND SFP.AMOUNT_PAID >0
GROUP BY TO_CHAR(PAY_DATE,'MON-YYYY')
UNION
SELECT TO_DATE(TO_CHAR(EXP_DATE,'MON-YYYY'), 'MON-YYYY') , ET.DESCRIPTION, SUM(EXP_AMOUNT)
FROM EXP_DETAIL ED, EXP_TYPE ET, EXP_TYPE_DETAIL ETD
WHERE ET.EXP_ID = ETD.EXP_ID
AND ED.EXP_ID = ET.EXP_ID
AND ED.EXP_DETAIL_ID = ETD.EXP_DETAIL_ID
AND EXP_DATE BETWEEN '01-JAN-2014' AND '31-DEC-2014'
GROUP BY TO_CHAR(EXP_DATE,'MON-YYYY'), ET.DESCRIPTION
ORDER BY 1
In order to do this you probably want to make the Income and Expenses into separate sub-queries.
I have taken the two parts of your union query and separated them into sub-queries, one called Income and one called Expenses. Both sub-queries summarise the data per month as before, but now you can JOIN on the month to connect the data from each sub-query. Note: I have used an OUTER JOIN because it still returns a month that has expenses but no income, and vice versa. This will require some manipulation, because you are probably better off returning zeros for a month in which no transactions occur.
In the top level SELECT, replace the use of *, with the correct listing of fields required. I simply used this to show that each field can be reused from the sub-query in the outer query, by referring to the alias as the table name.
SELECT Income.*, Expenses.*
FROM (SELECT TO_DATE(TO_CHAR(PAY_DATE,'MON-YYYY'), 'MON-YYYY') as Month, 'FEE RECEIPT', NVL(SUM(SFP.AMOUNT_PAID),0) AMT_RECIEVED
FROM STU_FEE_PAYMENT SFP, STU_CLASS SC, CLASS C
WHERE SC.CLASS_ID = C.CLASS_ID
AND SFP.STUDENT_NO = SC.STUDENT_NO
AND PAY_DATE BETWEEN '01-JAN-2014' AND '31-DEC-2014'
AND SFP.AMOUNT_PAID >0
GROUP BY TO_CHAR(PAY_DATE,'MON-YYYY')) Income
FULL OUTER JOIN (SELECT TO_DATE(TO_CHAR(EXP_DATE,'MON-YYYY'), 'MON-YYYY') as Month, ET.DESCRIPTION, SUM(EXP_AMOUNT)
FROM EXP_DETAIL ED, EXP_TYPE ET, EXP_TYPE_DETAIL ETD
WHERE ET.EXP_ID = ETD.EXP_ID
AND ED.EXP_ID = ET.EXP_ID
AND ED.EXP_DETAIL_ID = ETD.EXP_DETAIL_ID
AND EXP_DATE BETWEEN '01-JAN-2014' AND '31-DEC-2014'
GROUP BY TO_CHAR(EXP_DATE,'MON-YYYY'), ET.DESCRIPTION) Expenses
ON Income.Month = Expenses.Month
There are still many calculations that you will have to insert, to get your final result, which you will have to work on separately. The resulting query to perform what you expect above will likely be a lot longer than this, I am just trying to show you the structure.
However, the final tricky part for you is going to be the BBF (Balance Brought Forward). SQL is great at joining tables and columns, but each row is handled separately: a plain query does not read a value from the previous row and let you manipulate it in the next row. To do this you need another sub-query to SUM() all the changes from a point in time up until the start of the month. Financial products normally store the balance at points in time, because it is possible that not all transactions are accurately recorded and there needs to be a mechanism to adjust the balance. Following this approach, you need to write your sub-query to summarise all changes since the previous balance.
IMO Financial applications are inherently complex, so the solution is going to take some time to mould into the right one.
Final word: I am not familiar with Oracle Reports, but it may offer something that helps with maintaining the BBF.
sqlite> create table Income(Month text, total_income real);
sqlite> create table Expense(Month text, total_expense real);
sqlite> insert into Income values('Jan 2014', 9000);
sqlite> insert into Income values('Feb 2014', 6000);
sqlite> insert into Expense values('Jan 2014', 9000);
sqlite> insert into Expense values('Feb 2014', 18000);
sqlite> select Income.Month, Income.total_income, Expense.total_expense, Income.total_income - Expense.total_expense as Balance from Income, Expense where Income.Month = Expense.Month;
Jan 2014|9000.0|9000.0|0.0
Feb 2014|6000.0|18000.0|-12000.0
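The balance-brought-forward idea described earlier (a sub-query that sums all changes before the start of each month) can be sketched on the same two summary tables. This is only an illustration under a few assumptions: months are stored as 'YYYY-MM' text so that string comparison follows calendar order, the March row is made up to show a month with expenses but no income, and SQLite is driven through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Income (Month TEXT PRIMARY KEY, total_income REAL);
    CREATE TABLE Expense (Month TEXT PRIMARY KEY, total_expense REAL);
    INSERT INTO Income VALUES ('2014-01', 9000), ('2014-02', 6000);
    -- '2014-03' has no matching income row (made-up row for the example)
    INSERT INTO Expense VALUES ('2014-01', 9000), ('2014-02', 18000), ('2014-03', 500);
""")

query = """
    WITH months AS (
        -- every month that appears in either table
        SELECT Month FROM Income UNION SELECT Month FROM Expense
    ),
    net AS (
        -- missing sides become 0 instead of dropping the month
        SELECT m.Month,
               COALESCE(i.total_income, 0) AS income,
               COALESCE(e.total_expense, 0) AS expense
        FROM months m
        LEFT JOIN Income i ON i.Month = m.Month
        LEFT JOIN Expense e ON e.Month = m.Month
    )
    SELECT n.Month, n.income, n.expense,
           -- balance brought forward: net change of all earlier months
           COALESCE((SELECT SUM(p.income - p.expense)
                     FROM net p
                     WHERE p.Month < n.Month), 0) AS brought_forward
    FROM net n
    ORDER BY n.Month
"""
report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

Unlike the inner join in the session above, the LEFT JOINs from the month list keep March in the report even though it has no income.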

Aggregated data from transactional table for sparklines

I'm working on a Ruby-on-Rails app which contains a list-type report. Two columns within that report are aggregations from a transactional table.
So let's say we have these two tables:
**items**
id
name
group
price
**transactions**
id
item_id
type
date
qty
These two tables are connected with item_id in the transactions table.
Now I want to show a set of rows from the items table in a table, with two calculated columns:
Calculated column 1 (Sparkline data):
Sparkline for transactions for the item with type="actuals" for the last 12 months. The result from the database should be text with the aggregated qty for each month, separated by commas. Example:
15,20,0,12,44,33,6,4,33,23,11,65
Calculated column 2 (6m total sale):
Total qty for the item multiplied by sale for the last 6 months.
So the results would show columns like these:
Item name - Sparkline data - 6m total sale
The result could be many thousands of lines, but would probably be paged.
So the question is: what is the most straightforward way of doing this in Rails models without sacrificing too much performance? Although this is a Ruby-on-Rails question, the answer might be more of an SQL-type solution.
The core sql could be something similar:
select
i.id,
i.name,
y.sparkline,
i.price*s.sumqtd totalsale6m
from
items i left join
(select
x.item_id,
GROUP_CONCAT(x.sumqtd order by datemonth asc SEPARATOR ',') sparkline
from
(select
t.item_id,
date_format(date, '%Y-%m') datemonth,
sum(qty) sumqtd
from
transactions t
where
t.type='actuals' and
t.date>date_sub(now(), interval 1 year)
group by
t.item_id, datemonth
) x
group by
x.item_id
) y on i.id=y.item_id
left join
(select
t.item_id,
sum(qty) sumqtd
from
transactions t
where
t.date>date_sub(now(), interval 6 month)
group by
t.item_id
) s on i.id=s.item_id
group by
i.id, i.name
A few comments:
I wasn't able to test it without real data.
If there are gaps in the sales (no sales in a given month), the list will not contain 12 elements. In that case you need to adjust the x and y subqueries.
If you need the result for only a few given items, you can probably push the item id filter deeper into the subqueries to save time.
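One way to handle the gap problem mentioned above is to aggregate per month in SQL and fill the missing months in application code. The sketch below uses Python with SQLite instead of Rails and MySQL, purely to show the idea; the helper last_months, the fixed end month and the sample rows are all made up for illustration.

```python
import sqlite3

def last_months(end_month, count):
    """Return `count` 'YYYY-MM' keys ending at end_month, oldest first."""
    year, month = map(int, end_month.split("-"))
    keys = []
    for _ in range(count):
        keys.append(f"{year:04d}-{month:02d}")
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return keys[::-1]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (id INTEGER PRIMARY KEY, item_id INTEGER,
                               type TEXT, date TEXT, qty INTEGER);
    -- Made-up rows: item 1 sells in only three of the twelve months.
    INSERT INTO transactions (item_id, type, date, qty) VALUES
        (1, 'actuals', '2023-07-15', 15),
        (1, 'actuals', '2023-08-03', 12),
        (1, 'actuals', '2023-08-20', 8),
        (1, 'actuals', '2024-06-01', 65),
        (1, 'forecast', '2024-06-02', 99);
""")

# A fixed end month keeps the example deterministic; a real report
# would derive it from the current date.
months = last_months("2024-06", 12)

# Sum qty per month in SQL, only for the months in the window.
monthly = dict(conn.execute("""
    SELECT strftime('%Y-%m', date) AS ym, SUM(qty)
    FROM transactions
    WHERE item_id = ? AND type = 'actuals'
      AND strftime('%Y-%m', date) BETWEEN ? AND ?
    GROUP BY ym
""", (1, months[0], months[-1])).fetchall())

# Fill months without sales with 0 so the list always has 12 entries.
sparkline = ",".join(str(monthly.get(m, 0)) for m in months)
print(sparkline)
```

Building the string outside the database also avoids relying on the ordering of a concatenating aggregate, at the cost of one extra pass over at most 12 values per item.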