SQL - Getting multiple counts with criteria - sql

I have three tables that store customers, customer visits to a store, and store reviews:
Customers
ID
BirthDate...etc.
CustomerVisits
Customer_ID
Store_ID
VisitDate
Reviews
Store_ID
Customer_ID
Rating
What I need to get, hopefully in a single SQL statement, is a count of all-time visitors per store, a count of visitors within the last 30 days per store, the average customer age per store, and the average review score per store. I need to be able to do this for several stores at once using an IN clause like where Store_ID IN (1,2,3). I know I could create a temp table and loop through the store IDs, running multiple selects, but I would rather do this in a single select if that is possible.
Thanks in advance!

You could perform each count in a subquery as follows:
SELECT Stores.Store_ID,
review.AvgRating,
cv.VisitsLast30Days,
cv.TotalVisits,
cv.AvgCustomerAge
FROM Stores
LEFT JOIN
( SELECT Store_ID, [AvgRating] = AVG(Rating)
FROM Reviews
GROUP BY Store_ID
) review
ON review.Store_ID = Stores.Store_ID
LEFT JOIN
( SELECT CustomerVisits.Store_ID,
[VisitsLast30Days] = COUNT(CASE WHEN CustomerVisits.VisitDate >= DATEADD(DAY, -30, CURRENT_TIMESTAMP) THEN 1 END),
[TotalVisits] = COUNT(*),
[AvgCustomerAge] = AVG(DATEDIFF(DAY, Customers.BirthDate, CURRENT_TIMESTAMP)) / 365.25
FROM CustomerVisits
INNER JOIN Customers
ON Customers.ID = CustomerVisits.Customer_ID
GROUP BY CustomerVisits.Store_ID
) cv
ON cv.Store_ID = Stores.Store_ID;
I have assumed you have a table called Stores, and used LEFT JOINs on the assumption that not every store has a visit or a review.
I've also used a fairly crude method of calculating the average age of a customer (days since birth divided by 365.25), but given that it is only for an average, and not for working out an accurate age for an individual, I doubt it will adversely affect the results.

Try:
select s.Store_ID,
count(distinct v.Customer_ID) all_time_visitors,
count(distinct case when datediff(d, v.VisitDate, getdate()) <= 30 then v.Customer_ID end) [30day_visitors],
avg(datediff(yy, c.BirthDate, getdate())) avg_customer_age,
max(r.avg_rating) avg_rating
from Stores s
left join CustomerVisits v on s.Store_ID = v.Store_ID
left join Customers c on v.Customer_ID = c.ID
left join (select Store_ID, avg(Rating) avg_rating
from Reviews
group by Store_ID) r on s.Store_ID = r.Store_ID
where s.Store_ID in (1,2,3) /*amend as required*/
group by s.Store_ID
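
Both answers rely on the same trick: a CASE expression inside COUNT returns NULL for rows outside the window, and COUNT ignores NULLs. Here is a minimal sketch of just that part, run against an in-memory SQLite database from Python so it can be tried without a server. The sample rows are made up, and SQLite's date('now', ...) stands in for DATEADD/GETDATE, so treat it as an illustration rather than a drop-in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustomerVisits (Customer_ID INTEGER, Store_ID INTEGER, VisitDate TEXT);
-- Hypothetical sample data: store 1 has one recent visit, store 2 has none.
INSERT INTO CustomerVisits VALUES
  (1, 1, date('now', '-5 days')),
  (2, 1, date('now', '-90 days')),
  (1, 1, date('now', '-400 days')),
  (3, 2, date('now', '-45 days'));
""")

rows = conn.execute("""
SELECT Store_ID,
       COUNT(*) AS TotalVisits,
       -- CASE yields NULL outside the 30-day window, and COUNT skips NULLs
       COUNT(CASE WHEN VisitDate >= date('now', '-30 days') THEN 1 END) AS VisitsLast30Days
FROM CustomerVisits
WHERE Store_ID IN (1, 2)
GROUP BY Store_ID
ORDER BY Store_ID
""").fetchall()

for row in rows:
    print(row)
```

Swapping COUNT(CASE ... THEN 1 END) for COUNT(DISTINCT CASE ... THEN Customer_ID END) would count visitors rather than visits, as the second answer does.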

Related

SQL -percent calculation

Make a report on the sales in 2015 of the products by categories (total value and
quantity sold). Also determine what % of the value of sales
for a given category represent the sales of each of the products in the category.
My query so far:
WITH sales AS
(SELECT t1.category_name
, t2.product_name
, (t3.unit_price*t3.quantity) Total_sales
, EXTRACT (YEAR FROM order_date) Year
FROM categories t1
INNER JOIN
products t2
ON t2.category_id=t1.category_id
INNER JOIN
order_details t3
ON t3.product_id=t2.product_id
INNER JOIN
orders t4
ON t4.order_id=t3.order_id
WHERE EXTRACT (YEAR FROM order_date) = '2015'
GROUP BY t1.category_name
, t2.product_name
, (t3.unit_price*t3.quantity)
, EXTRACT (YEAR FROM order_date)
ORDER BY 1
)
SELECT s.category_name
, s.product_name
, SUM (Total_sales)
FROM sales s
GROUP BY s.category_name
, s.product_name
ORDER BY 1
How to calculate %? Thank you
I think you want window functions, if your database (which you did not specify) supports them:
SELECT
c.category_name,
p.product_name,
SUM(od.unit_price * od.quantity) as total_sales,
1.0 * SUM(od.unit_price * od.quantity)
/ SUM(SUM(od.unit_price * od.quantity)) OVER(PARTITION BY c.category_id)
as category_sales_ratio
FROM categories c
INNER JOIN products p ON p.category_id = c.category_id
INNER JOIN order_details od ON od.product_id = p.product_id
INNER JOIN orders o ON o.order_id = od.order_id
WHERE o.order_date >= '2015-01-01' AND o.order_date < '2016-01-01'
GROUP BY c.category_id, c.category_name, p.product_id, p.product_name
ORDER BY c.category_name, p.product_name
The window sum computes the total sales for the whole category, which you can then divide the sales of the current product by.
Note that I changed your query in several ways:
meaningful table aliases make the query easier to write, read and maintain
filtering dates without transformation is much more efficient than using date functions
there is no need for a subquery
it is always a good idea to put the relevant primary keys in the GROUP BY clause (in case two different products or categories have the same name) - on the other hand, you also had additional unneeded columns in that clause
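
To see the ratio on a few rows, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later). The flat sales table and its values are invented for the example, and the nested SUM(SUM(...)) OVER is split into two steps (per-product totals first, then a window sum over those totals), which should give the same numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, already-joined sales rows: one row per order line.
CREATE TABLE sales (category TEXT, product TEXT, amount REAL);
INSERT INTO sales VALUES
  ('Drinks', 'Tea',    30.0),
  ('Drinks', 'Coffee', 50.0),
  ('Drinks', 'Tea',    20.0),
  ('Snacks', 'Chips',  40.0);
""")

rows = conn.execute("""
WITH product_totals AS (
    SELECT category, product, SUM(amount) AS total_sales
    FROM sales
    GROUP BY category, product
)
SELECT category,
       product,
       total_sales,
       -- window sum = total for the whole category, repeated on each product row
       total_sales / SUM(total_sales) OVER (PARTITION BY category) AS category_sales_ratio
FROM product_totals
ORDER BY category, product
""").fetchall()

for row in rows:
    print(row)
```

Multiply the ratio by 100 if you want a percentage rather than a fraction.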

How to select customers that didn't place an order in the last 7 days

I am trying to find out the customers that didn't place an order in the last seven days. Basically I have 3 tables: customers, orders and help_desk_agents.
I'm trying to figure out the best way to get this information.
The SQL below retrieves the customers' info, the help desk agent 111 and the last order date of each customer:
SELECT DISTINCT customers.customer_id,
customers.customer_name,
agents.help_desk_agent,
Max(orders.order_date)
FROM customers
LEFT JOIN (SELECT DISTINCT customers.customer_id,
orders.order_date
FROM orders
GROUP BY 1,
2) orders2
ON customers.customer_id = orders2.customer_id
LEFT JOIN help_desk_agents
ON customers.help_desk_agent_id =
help_desk_agents.help_desk_agent_id
WHERE customer.help_desk_agent_id = 111
GROUP BY 1,
2,
3
I would like to somehow filter for the customers that didn't place an order in the last seven days.
Try adding this at the end of your query:
having max(orders.order_date) < dateadd(day, -7, getdate())
You can try Datediff(dd, <datecolumn>, getdate()) and use >= 7 as a condition.
The query that you want should look like this:
SELECT c.customer_id, c.customer_name, c.help_desk_agent_id,
o.max_order_date
FROM customers c JOIN
(SELECT o.customer_id, MAX(o.order_date) as max_order_date
FROM orders o
GROUP BY o.customer_id
) o
ON c.customer_id = o.customer_id
WHERE c.help_desk_agent_id = 111 AND
o.max_order_date < dateadd(day, -7, getdate());
Your query has multiple issues:
The alias customers.customer_id is not understood in the subquery.
The select distinct is unnecessary.
LEFT JOIN is unnecessary because presumably customers have at least one order and you require a match to the agent table.
You don't need the agent table, because the information you want is in the customer table.
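
If some customers might have no orders at all, the inner join above drops them. Below is a rough sketch of a variation that keeps them by using LEFT JOIN and treating a missing last order date as "no recent order". It runs on SQLite via Python's sqlite3, with made-up rows and SQLite's date('now', '-7 days') standing in for dateadd(day, -7, getdate()).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, customer_name TEXT, help_desk_agent_id INTEGER);
CREATE TABLE orders (customer_id INTEGER, order_date TEXT);
-- Hypothetical data: Ann ordered recently, Bob did not, Cy never ordered,
-- Dee belongs to a different agent.
INSERT INTO customers VALUES (1, 'Ann', 111), (2, 'Bob', 111), (3, 'Cy', 111), (4, 'Dee', 222);
INSERT INTO orders VALUES
  (1, date('now', '-2 days')),
  (1, date('now', '-30 days')),
  (2, date('now', '-10 days'));
""")

rows = conn.execute("""
SELECT c.customer_id, c.customer_name, o.max_order_date
FROM customers c
LEFT JOIN (SELECT customer_id, MAX(order_date) AS max_order_date
           FROM orders
           GROUP BY customer_id) o
       ON o.customer_id = c.customer_id
WHERE c.help_desk_agent_id = 111
  -- NULL means the customer has never ordered
  AND (o.max_order_date IS NULL OR o.max_order_date < date('now', '-7 days'))
ORDER BY c.customer_id
""").fetchall()

names = [name for _, name, _ in rows]
print(names)
```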

Group by Month, return 0 if no record found

I want to fetch records from database table for last 12 months. Here is what I tried so far.
SELECT COUNT(s.id), date_part('month', s.viewed_at) month_number
FROM statistics_maps_view as s
INNER JOIN maps as m
ON s.maps_id=m.id Where m.users_id = $users_id group by month_number ORDER BY month_number DESC LIMIT 12
I know it'll group the records month-wise, but is there a way to return a count of 0 if there is no record for a particular month?
The group by clause will not create entries where there's no data, as you've seen. What you could do is left join this entire result with another result set that has all the entries you want - e.g., one you dynamically generate with generate_series:
SELECT g.month_number, COALESCE(s.cnt, 0) AS cnt
FROM GENERATE_SERIES(1,12) AS g(month_number)
LEFT JOIN (SELECT COUNT(s.id) AS cnt,
DATE_PART('month', s.viewed_at) AS month_number
FROM statistics_maps_view s
INNER JOIN maps m ON s.maps_id = m.id
WHERE m.users_id = $users_id
GROUP BY month_number) s ON g.month_number = s.month_number
ORDER BY 1 ASC
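
The same "generate every month, then LEFT JOIN the counts" idea can be tried locally with SQLite from Python. This is only a sketch: generate_series is not assumed to be available there, so a recursive CTE produces the numbers 1 to 12 instead, and the views table with its three rows is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for statistics_maps_view.
CREATE TABLE views (id INTEGER PRIMARY KEY, viewed_at TEXT);
INSERT INTO views (viewed_at) VALUES
  ('2023-01-15'), ('2023-01-20'), ('2023-03-02');
""")

rows = conn.execute("""
WITH RECURSIVE months(month_number) AS (
    SELECT 1
    UNION ALL
    SELECT month_number + 1 FROM months WHERE month_number < 12
)
SELECT m.month_number,
       COALESCE(v.cnt, 0) AS cnt   -- months with no rows come back as NULL
FROM months m
LEFT JOIN (SELECT CAST(strftime('%m', viewed_at) AS INTEGER) AS month_number,
                  COUNT(*) AS cnt
           FROM views
           GROUP BY strftime('%m', viewed_at)) v
       ON v.month_number = m.month_number
ORDER BY m.month_number
""").fetchall()

for row in rows:
    print(row)
```

Without the COALESCE the empty months would show NULL rather than 0.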

Query based on Count, Time frame, and location

Ok so I need to write a query that I am probably making much more complicated than it needs to be but I could use some help.
I need to select records of clients that have not been seen for a year or longer, have seen us more than once but can be only once if it is not at certain locations.
So what I have so far is:
WITH CTE AS
(
SELECT
client_id,
location_id,
employee_id,
create_timestamp,
ROW_NUMBER() OVER(PARTITION BY person_id ORDER BY create_timestamp DESC) AS ROW
FROM
client_Appointment
)
SELECT
c.client_id,
COUNT(*)
FROM
CTE AS ce
INNER JOIN person AS c
ON p.person_id= ce.client_id
INNER JOIN employee_mstr AS em
ON em.employee_id = ce.empoyee_id
INNER JOIN location_mstr AS lm
ON lm.location_id = ce.location_id
WHERE
ce.create_timestamp <= CONVERT(VARCHAR(10), DATEADD(Year,-1,GETDATE()), 120)
GROUP BY
p.person_id
HAVING
COUNT(*) > 1
I'm unsure where to go from here. Also this does not get me all the info I need and if I add that information to the select clause I have to use it in group by which means I don't get all the needed records.
Thanks
So you want only clients who have not been seen in a year or more,
then clients that have either one visit NOT at certain locations OR more than one visit. Did I get that right?
Note: Just replace (VALUES(1),(2),(3)) with your table name
WITH CTE_visits
AS
(
SELECT
p.person_id AS client_id,
COUNT(*) AS total_visits,
SUM(
CASE
WHEN ce.location_id IN (SELECT ID FROM (VALUES(1),(2),(3)) AS A(ID)) THEN 0 --so when it is a certain location then do NOT count it
ELSE 1 --if it is not at the certain locations, then count it
END
) AS visits_not_at_certain_locations
FROM
client_Appointment AS ce
INNER JOIN person AS p
ON p.person_id = ce.client_id
INNER JOIN employee_mstr AS em
ON em.employee_id = ce.employee_id
INNER JOIN location_mstr AS lm
ON lm.location_id = ce.location_id
CROSS APPLY(SELECT client_id, MAX(create_timestamp) last_visit FROM client_Appointment WHERE client_id = ce.client_id GROUP BY client_id) CA --find most recent visit for each client_id
WHERE
ce.create_timestamp <= CONVERT(VARCHAR(10), DATEADD(Year,-1,GETDATE()), 120) --remember this only counts visits over a year ago
AND CA.last_visit <= CONVERT(VARCHAR(10), DATEADD(Year,-1,GETDATE()), 120) --only return clients whose last visit is more than a year ago
GROUP BY
p.person_id
)
SELECT *
FROM CTE_visits
WHERE visits_not_at_certain_locations = 1 --seen once NOT at certain locations
OR total_visits > 1 --seen more than once at any location
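
For anyone who wants to check the logic on a few rows, here is a simplified sketch run on SQLite from Python. It is a variation, not the query above: the "last visit over a year ago" test is done with HAVING MAX(...) instead of CROSS APPLY, the excluded locations are hard-coded as 1, 2 and 3, and the table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE client_appointment (client_id INTEGER, location_id INTEGER, create_timestamp TEXT);
INSERT INTO client_appointment VALUES
  (1, 1, date('now', '-3 years')),   -- one old visit, at an excluded location
  (2, 9, date('now', '-2 years')),   -- one old visit, elsewhere
  (3, 1, date('now', '-3 years')),   -- two old visits
  (3, 2, date('now', '-2 years')),
  (4, 9, date('now', '-2 years')),   -- seen again recently
  (4, 9, date('now', '-10 days'));
""")

rows = conn.execute("""
SELECT client_id,
       COUNT(*) AS total_visits,
       SUM(CASE WHEN location_id IN (1, 2, 3) THEN 0 ELSE 1 END) AS visits_elsewhere
FROM client_appointment
GROUP BY client_id
-- keep only clients whose most recent visit is at least a year old ...
HAVING MAX(create_timestamp) <= date('now', '-1 year')
   -- ... and who either came more than once, or once outside locations 1-3
   AND (COUNT(*) > 1
        OR SUM(CASE WHEN location_id IN (1, 2, 3) THEN 0 ELSE 1 END) = 1)
ORDER BY client_id
""").fetchall()

for row in rows:
    print(row)
```

Client 1 is dropped (single visit at an excluded location) and client 4 is dropped (seen within the last year).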

Join two tables but only get most recent associated record

I am having a hard time constructing an SQL query that joins a table to an associated table but returns only the latest (most recent) associated record for each row.
The image below describes my two tables (Inventory and Sales), the Inventory table contains all the item and the Sales table contains all the transaction records. The Inventory.Id is related to Sales.Inventory_Id. And the Wanted result is the output that I am trying to work on to.
My objective is to associate all the sales record with respect to inventory but only get the most recent transaction for each item.
Using a plain join (left, right or inner) doesn't produce the result I am looking for, because I don't know how to add a condition that restricts the join to the most recent data. Is this doable, or should I change my table schema?
Thanks.
You can use APPLY
Select Item,Sales.Price
From Inventory I
Cross Apply(Select top 1 Price
From Sales S
Where I.id = S.Inventory_Id
Order By Date Desc) as Sales
WITH Sales_Latest AS (
SELECT *,
MAX(Date) OVER(PARTITION BY Inventory_Id) Latest_Date
FROM Sales
)
SELECT i.Item, s.Price
FROM Inventory i
INNER JOIN Sales_Latest s ON (i.Id = s.Inventory_Id)
WHERE s.Date = s.Latest_Date
Think carefully about what results you expect if there are two prices in Sales for the same date.
I would just use a correlated subquery:
select Item, Price
from Inventory i
inner join Sales s
on i.id = s.Inventory_Id
and s.Date = (select max(Date) from Sales where Inventory_Id = i.id)
select * from
(
select i.name,
row_number() over (partition by i.id order by s.date desc) as rownum,
s.price,
s.date
from inventory i
left join sales s on i.id = s.inventory_id
) tmp
where rownum = 1
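
A quick way to compare these approaches is to run one on sample data. The sketch below uses the ROW_NUMBER variant on SQLite (3.25 or later) through Python's sqlite3. The tables, column names and rows are guesses at the schema from the question, since the original image is not available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory (id INTEGER PRIMARY KEY, item TEXT);
CREATE TABLE sales (inventory_id INTEGER, price REAL, date TEXT);
-- Hypothetical data: Pen has two sales, Ink has one, Pad has none.
INSERT INTO inventory VALUES (1, 'Pen'), (2, 'Ink'), (3, 'Pad');
INSERT INTO sales VALUES
  (1, 1.50, '2023-01-01'),
  (1, 1.75, '2023-02-01'),
  (2, 4.00, '2023-01-15');
""")

rows = conn.execute("""
SELECT item, price, date
FROM (SELECT i.item, s.price, s.date,
             -- number each item's sales from newest to oldest
             ROW_NUMBER() OVER (PARTITION BY i.id ORDER BY s.date DESC) AS rownum
      FROM inventory i
      LEFT JOIN sales s ON s.inventory_id = i.id) t
WHERE rownum = 1
ORDER BY item
""").fetchall()

for row in rows:
    print(row)
```

Because of the LEFT JOIN, items that have never been sold still appear, with NULL price and date. ROW_NUMBER also returns exactly one row per item even when two sales share the latest date, which the MAX(Date) variants do not.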