SQL Server clause issue - sql

Select Name, contact, and postal code of the customer who has done MAXIMUM transactions in the month of June.
SELECT
Customer.customer_name,
Customer.customer_email,
Customer.customer_postcode
FROM
Customer
INNER JOIN
Sales on Customer.customer_id = Sales.customer_id
WHERE
MAX(Sales.customer_id) IN (SELECT COUNT((sales.customer_id)) AS 'transactions'
FROM sales
GROUP BY (sales.customer_id))
AND MONTH(date_purchased) = 6;
But I get this error:
Msg 147, Level 15, State 1, Line 4
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference

You're taking the MAX of the customer_id, but what you want is the customer_id with the highest number of transactions. Start with your inner query, and get the top customers using ORDER BY..DESC.
SELECT Sales.customer_id, count(Sales.customer_id) as transactions
FROM Sales
GROUP BY Sales.customer_id
ORDER BY transactions DESC;
Now that you have the top customer_id, you should be able to join that result to the Customer table (using this as a CTE or an inner query) to get the name, contact, and postal code.
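A rough sketch of that join, using the aggregate as a CTE. It runs on SQLite through Python's sqlite3 module only so that it is self-contained; the sample rows, the sale_id column, and the year 2020 are made up for illustration, and SQLite's LIMIT 1 stands in for SQL Server's TOP (1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT,
                       customer_email TEXT, customer_postcode TEXT);
-- sale_id is a hypothetical key column, not taken from the question
CREATE TABLE Sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER,
                    date_purchased TEXT);
INSERT INTO Customer VALUES (1, 'Ann', 'ann@example.com', '1000'),
                            (2, 'Bob', 'bob@example.com', '2000');
INSERT INTO Sales (customer_id, date_purchased) VALUES
    (1, '2020-06-03'), (2, '2020-06-10'), (2, '2020-06-21'), (1, '2020-07-01');
""")

# Count June transactions per customer first, then join to get the details.
top_customer = conn.execute("""
WITH june_counts AS (
    SELECT customer_id, COUNT(*) AS transactions
    FROM Sales
    WHERE date_purchased >= '2020-06-01' AND date_purchased < '2020-07-01'
    GROUP BY customer_id
)
SELECT c.customer_name, c.customer_email, c.customer_postcode
FROM june_counts jc
JOIN Customer c ON c.customer_id = jc.customer_id
ORDER BY jc.transactions DESC
LIMIT 1
""").fetchone()
print(top_customer)
```

If two customers tie for the most transactions, this picks one of them arbitrarily; add a tiebreaker to the ORDER BY if that matters.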

Your current query has a number of issues:
Aggregates such as MAX cannot be used in the WHERE clause; they belong in HAVING.
Even if you change it to HAVING, the subquery is wrong because it doesn't filter on June.
A much simpler method is to join the tables, group, sort by COUNT, and take the first row.
The outer June filter should use start and end dates rather than the MONTH function, so that an index on the date column can be used.
You should use proper table aliasing.
SELECT TOP (1)
c.customer_name,
c.customer_email,
c.customer_postcode
FROM
Customer c
INNER JOIN
Sales s ON c.customer_id = s.customer_id
WHERE
s.date_purchased >= '20200601' AND s.date_purchased < '20200701'
-- note the use of half open interval >= AND <
GROUP BY
c.customer_name,
c.customer_email,
c.customer_postcode
ORDER BY COUNT(*) DESC;
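To illustrate why the half-open interval is worth the habit, here is a small sketch. It assumes date_purchased can carry a time of day, and it uses SQLite (via Python's sqlite3) with made-up rows, where dates are compared as ISO-formatted text; SQL Server's date types behave analogously for this comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (customer_id INTEGER, date_purchased TEXT)")
conn.executemany("INSERT INTO Sales VALUES (?, ?)", [
    (1, "2020-05-31 23:59:59"),   # just before June
    (1, "2020-06-01 00:00:00"),   # first instant of June
    (2, "2020-06-30 18:45:00"),   # last day of June, with a time part
    (2, "2020-07-01 00:00:00"),   # first instant of July
])

# Half-open interval: >= start of June, < start of July.
half_open = conn.execute("""
SELECT COUNT(*) FROM Sales
WHERE date_purchased >= '2020-06-01' AND date_purchased < '2020-07-01'
""").fetchone()[0]

# Closed interval ending on the last day: misses rows later on 30 June.
closed = conn.execute("""
SELECT COUNT(*) FROM Sales
WHERE date_purchased BETWEEN '2020-06-01' AND '2020-06-30'
""").fetchone()[0]

print(half_open, closed)
```

The half-open form finds both June rows, while the BETWEEN form silently drops the purchase made on the evening of 30 June.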

Related

Best approach to display all the users who have more than 1 purchases in a month in SQL

I have two tables in an Oracle Database, one of which holds all the purchases made by all the customers over many years (purchase_logs). It has a unique purchase_id that is paired with a customer_id. The other table contains the user info of all the customers. Both have a common key of customer_id.
I want to display the user info of customers who have purchased more than 1 unique item (NOT the item quantity) in any month (i.e. a customer who bought 4 unique items in February 2020 would be valid, as would someone who bought 2 items in June). I was wondering what the correct approach would be and how to correctly execute it.
The two approaches that I can see are
Approach 1
Count the overall number of purchases done by all customers, filter the ones that are greater than 1, and then check whether any of them were done within a single month.
Use this as a subquery in the where clause of the main query for retrieving the customer info for all the customer_id which match this condition.
This is what I've done so far; it retrieves the customer ids of all the customers who have more than 1 purchase in total. But I do not understand how to filter out the purchases that did not occur in a single arbitrary month.
SELECT * FROM customer_details
WHERE customer_id IN (
SELECT cust_id from purchase_logs
group by cust_id
having count(*) >= 2);
Approach 2
Create a temporary table to Count the number of monthly purchases of a specific user_id then find the MAX() of the whole table and check if that MAX value is bigger than 1 or not. Then if it is provide it as true for the main query's where clause for the customer_info.
Approach 2 feels like the more logical option but I cannot seem to understand how to write the proper subquery for it as the command MAX(COUNT(customer_id)) from purchase_logs does not seem to be a valid query.
[Images not included: the DDL diagram; sample data for Purchase_logs, Customer_info, and Item_info; the expected output for this sample data.]
It is certainly possible that there is a simpler approach that I am not seeing right now.
Would appreciate any suggestions and tips on this.
You need this query:
SELECT DISTINCT cust_id
FROM purchase_logs
GROUP BY cust_id, TO_CHAR(purchase_date, 'YYYY-MON')
HAVING COUNT(DISTINCT item_id) > 1;
to get the cust_ids of all the customers who have purchased more than 1 unique item in any month, and you can use it with the IN operator:
SELECT *
FROM customer_details
WHERE customer_id IN (
SELECT DISTINCT cust_id -- here DISTINCT may be removed as it does not make any difference when the result is used with IN
FROM purchase_logs
GROUP BY cust_id, TO_CHAR(purchase_date, 'YYYY-MON')
HAVING COUNT(DISTINCT item_id) > 1
);
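To see what the grouping does, here is a small sketch on made-up data. It runs on SQLite via Python's sqlite3, so strftime('%Y-%m', ...) stands in for Oracle's TO_CHAR(purchase_date, 'YYYY-MON'); the rows and the text date format are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchase_logs (purchase_id INTEGER PRIMARY KEY, cust_id INTEGER,
                            item_id INTEGER, purchase_date TEXT);
INSERT INTO purchase_logs (cust_id, item_id, purchase_date) VALUES
    (1, 10, '2020-02-03'), (1, 11, '2020-02-20'),   -- two distinct items in Feb
    (2, 10, '2020-02-05'), (2, 10, '2020-02-25'),   -- same item twice in Feb
    (3, 10, '2020-02-10'), (3, 11, '2020-03-10');   -- two items, different months
""")

# One group per customer per calendar month; keep groups with > 1 distinct item.
rows = conn.execute("""
SELECT cust_id,
       strftime('%Y-%m', purchase_date) AS month,
       COUNT(DISTINCT item_id) AS distinct_items
FROM purchase_logs
GROUP BY cust_id, strftime('%Y-%m', purchase_date)
HAVING COUNT(DISTINCT item_id) > 1
""").fetchall()
print(rows)
```

Only customer 1 qualifies: customer 2 bought the same item twice, and customer 3's two items fall in different months.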
One approach might be to try
with multiplepurchase as (
select customer_id, trunc(purchasedate, 'MM') as purchase_month, count(*) as order_count
from purchase_logs
group by customer_id, trunc(purchasedate, 'MM')
having count(*) >= 2)
select distinct a.customer_id, b.username, b.usercategory
from multiplepurchase a
left join userinfo b
on a.customer_id = b.customer_id
Expanding on @MT0's answer:
SELECT *
FROM customer_details CD
WHERE EXISTS (
SELECT 1
FROM purchase_logs PL
WHERE CD.customer_id = PL.cust_id
GROUP BY PL.cust_id, TO_CHAR(PL.purchase_date, 'YYYYMM')
HAVING COUNT(DISTINCT PL.item_id) >= 2
);
I want to display the user info of customers who have more than 1 purchases in a single arbitrary month.
Just add a WHERE filter to your sub-query.
So assuming that you wanted the month of July 2021 and you had a purchase_date column (with a DATE or TIMESTAMP data type) in your purchase_logs table then you can use:
SELECT *
FROM customer_details
WHERE customer_id IN (
SELECT cust_id
FROM purchase_logs
WHERE DATE '2021-07-01' <= purchase_date
AND purchase_date < DATE '2021-08-01'
GROUP BY cust_id
HAVING count(*) >= 2
);
If you want the users where they have bought two-or-more items in any single calendar month then:
SELECT *
FROM customer_details c
WHERE EXISTS (
SELECT 1
FROM purchase_logs p
WHERE c.customer_id = p.cust_id
GROUP BY cust_id, TRUNC(purchase_date, 'MM')
HAVING count(*) >= 2
);

using a subquery with a having

So the goal is to get a list of customers that have on average ordered more than the total average of all customers.
Select customerNumber, customerName, orderNumber, SUM(quantityOrdered)as 'total_qty', ROUND(AVG(quantityOrdered),2) as 'avg'
From customers
join orders using(customerNumber)
join orderdetails using (orderNumber)
Group by customerNumber, OrderNumber
Having ROUND(AVG(quantityOrdered),2) > ROUND(AVG(quantityOrdered),2) IN
(SELECT ROUND(AVG(quantityOrdered),2) FROM orderdetails)
ORDER BY customerName;
My code runs but it doesn't filter the results on the avg quantity ordered column to only show results over the total average of 35.22.
Possibly, you mean:
select c.customerNumber, c.customerName,
sum(od.quantityOrdered) as sum_qty,
round(avg(od.quantityOrdered), 2) as avg_qty
from customers c
join orders o using (customerNumber)
join orderdetails od using (orderNumber)
group by c.customerNumber, c.customerName
having avg(od.quantityOrdered) > (select avg(quantityOrdered) from orderdetails);
Rationale:
you discuss computing the average ordered, but what your query does is compare the average order detail quantity per customer; this answer assumes that the latter is what you want
since you want an average per customer, do not put the order number in the group by
no need for in or the like in the having clause: just compare the customer's average against a scalar subquery that computes the overall average
Notes:
don't use single quotes for identifiers (such as column aliases) - they are meant for literal strings
table aliases make the query easier to write and read; prefixing all columns with the alias of the table they belong to makes the query understandable
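Here is a small, hedged sketch of the scalar-subquery comparison on made-up data. It runs on SQLite via Python's sqlite3 rather than the asker's database, the column names follow the question, and explicit ON joins are used in place of USING simply to keep every column qualified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customerNumber INTEGER PRIMARY KEY, customerName TEXT);
CREATE TABLE orders (orderNumber INTEGER PRIMARY KEY, customerNumber INTEGER);
CREATE TABLE orderdetails (orderNumber INTEGER, quantityOrdered INTEGER);
INSERT INTO customers VALUES (1, 'Acme'), (2, 'Bolt');
INSERT INTO orders VALUES (100, 1), (101, 1), (200, 2);
INSERT INTO orderdetails VALUES (100, 50), (101, 40), (200, 10), (200, 20);
""")

# Overall average is (50 + 40 + 10 + 20) / 4 = 30; each customer's own
# average is compared against that single value in HAVING.
above_average = conn.execute("""
SELECT c.customerName, ROUND(AVG(od.quantityOrdered), 2) AS avg_qty
FROM customers c
JOIN orders o ON o.customerNumber = c.customerNumber
JOIN orderdetails od ON od.orderNumber = o.orderNumber
GROUP BY c.customerNumber, c.customerName
HAVING AVG(od.quantityOrdered) > (SELECT AVG(quantityOrdered) FROM orderdetails)
""").fetchall()
print(above_average)
```

Acme averages 45 per order line and is kept; Bolt averages 15 and is filtered out.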

SQL How to select customers with highest transaction amount by state

I am trying to write a SQL query that returns the name and purchase amount of the five customers in each state who have spent the most money.
Table schemas
customers
|_state
|_customer_id
|_customer_name
transactions
|_customer_id
|_transact_amt
Attempts look something like this
SELECT state, Sum(transact_amt) AS HighestSum
FROM (
SELECT name, transactions.transact_amt, SUM(transactions.transact_amt) AS HighestSum
FROM customers
INNER JOIN customers ON transactions.customer_id = customers.customer_id
GROUP BY state
) Q
GROUP BY transact_amt
ORDER BY HighestSum
I'm lost. Thank you.
Expected results are the names of customers with the top 5 highest transactions in each state.
ERROR: table name "customers" specified more than once
SQL state: 42712
First, you need for your JOIN to be correct. Second, you want to use window functions:
SELECT ct.*
FROM (SELECT c.customer_id, c.customer_name, c.state, SUM(t.transact_amt) AS total,
ROW_NUMBER() OVER (PARTITION BY c.state ORDER BY SUM(t.transact_amt) DESC) as seqnum
FROM customers c JOIN
transactions t
ON t.customer_id = c.customer_id
GROUP BY c.customer_id, c.customer_name, c.state
) ct
WHERE seqnum <= 5;
You seem to have several issues with SQL. I would start with understanding aggregation functions. You have a SUM() with the alias HighestSum. It is simply the total per customer.
You can get them using aggregation and then by using the RANK() window function. For example:
select
state,
rk,
customer_name
from (
select
*,
rank() over(partition by state order by total desc) as rk
from (
select
c.customer_id,
c.customer_name,
c.state,
sum(t.transact_amt) as total
from customers c
join transactions t on t.customer_id = c.customer_id
group by c.customer_id
) x
) y
where rk <= 5
order by state, rk
There are two valid answers already. Here's a third:
SELECT *
FROM (
SELECT c.state, c.customer_name, t.*
, row_number() OVER (PARTITION BY c.state ORDER BY t.transact_sum DESC NULLS LAST, customer_id) AS rn
FROM (
SELECT customer_id, sum(transact_amt) AS transact_sum
FROM transactions
GROUP BY customer_id
) t
JOIN customers c USING (customer_id)
) sub
WHERE rn < 6
ORDER BY state, rn;
Major points
When aggregating all or most rows of a big table, it's typically substantially faster to aggregate before the join. Assuming referential integrity (FK constraints), we won't be aggregating rows that would be filtered otherwise. This might change from nice-to-have to a pure necessity when joining to more aggregated tables. Related:
Why does the following join increase the query time significantly?
Two SQL LEFT JOINS produce incorrect result
Add additional ORDER BY item(s) in the window function to define which rows to pick from ties. In my example, it's simply customer_id. If you have no tiebreaker, results are arbitrary in case of a tie, which may be OK. But every other execution might return different results, which typically is a problem. Or you include all ties in the result. Then we are back to rank() instead of row_number(). See:
PostgreSQL equivalent for TOP n WITH TIES: LIMIT "with ties"?
While transact_amt can be NULL (has not been ruled out), any sum may end up being NULL as well. With an unsuspecting ORDER BY t.transact_sum DESC, those customers come out on top, as NULL comes first in descending order. Use DESC NULLS LAST to avoid this pitfall. (Or define the column transact_amt as NOT NULL.)
PostgreSQL sort by datetime asc, null first?
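The difference between the tiebreaker and include-all-ties choices can be seen in a small sketch. It uses SQLite via Python's sqlite3 instead of PostgreSQL (window functions need SQLite 3.25 or newer), made-up rows, and a top-1 cut-off rather than top-5 to keep the data short; the query shape follows the aggregate-before-join form above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT, state TEXT);
CREATE TABLE transactions (customer_id INTEGER, transact_amt NUMERIC);
INSERT INTO customers VALUES (1, 'Ann', 'CA'), (2, 'Bob', 'CA'), (3, 'Cy', 'CA');
-- Ann and Bob both total 100, so they tie for first place in CA
INSERT INTO transactions VALUES (1, 60), (1, 40), (2, 100), (3, 50);
""")

# {fn} is the window function to use, {tiebreak} an optional extra sort key.
QUERY = """
SELECT customer_name
FROM (
    SELECT c.customer_name,
           {fn}() OVER (PARTITION BY c.state
                        ORDER BY t.transact_sum DESC{tiebreak}) AS pos
    FROM (SELECT customer_id, SUM(transact_amt) AS transact_sum
          FROM transactions
          GROUP BY customer_id) t
    JOIN customers c USING (customer_id)
) sub
WHERE pos <= 1
ORDER BY customer_name
"""

# row_number() with a tiebreaker: exactly one row per state, chosen deterministically.
with_row_number = [r[0] for r in conn.execute(
    QUERY.format(fn="row_number", tiebreak=", c.customer_id"))]

# rank() without a tiebreaker: every customer tied for first place is returned.
with_rank = [r[0] for r in conn.execute(QUERY.format(fn="rank", tiebreak=""))]

print(with_row_number, with_rank)
```

With row_number() and customer_id as tiebreaker only Ann is returned; with rank() both Ann and Bob come back.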

Creating variable in SQL and using in WHERE clause

I want to create a variable that counts the number of times each customer ID appears in the CSV, and then I want the output to be all customer IDs that appear 0,1,or 2 times. Here is my code so far:
SELECT Customers.customer_id , COUNT(*) AS counting
FROM Customers
LEFT JOIN Shopping_cart ON Customers.customer_id = Shopping_cart.customer_id
WHERE counting = '0'
OR counting = '1'
OR counting = '2'
GROUP BY Customers.customer_id;
SELECT Customers.customer_id, COUNT(*) AS counting
FROM Customers LEFT JOIN Shopping_cart ON Customers.customer_id = Shopping_cart.customer_id
GROUP BY Customers.customer_id
HAVING COUNT(*) < 3;
The query groups the rows by customer ID, and COUNT(*) returns the number of rows in each group. An aggregate cannot be tested in WHERE, so the condition goes in a HAVING clause after GROUP BY, where you can repeat the COUNT(*) expression. COUNT(*) < 3 keeps only the groups with fewer than three rows.
You're probably thinking of HAVING, not WHERE.
For example:
select JOB, COUNT(JOB) from SCOTT.EMP
group by JOB
HAVING count(JOB) > 1 ;
While a tad odd, you may be specific about the HAVING condition(s):
HAVING count(JOB) = 2 or count(JOB) = 4
Note: the WHERE clause is used for filtering rows and it applies on each and every row, while the HAVING clause is used to filter groups.
You can apply a filter after the aggregation with the HAVING clause.
Please note that count(*) counts all rows, including the NULL-filled row that the LEFT JOIN produces for a customer without any shopping cart, so it returns 1 rather than 0 for such customers. To detect them you have to count the non-NULL values in some column of Shopping_cart instead:
SELECT customer_id,
count(Shopping_cart.some_id) AS counting
FROM Customers
LEFT JOIN Shopping_cart USING (customer_id)
GROUP BY customer_id
HAVING count(Shopping_cart.some_id) BETWEEN 0 and 2;
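A quick way to see the difference is to put both counts side by side. The sketch below uses SQLite via Python's sqlite3 with made-up rows; cart_id is a hypothetical stand-in for the some_id column above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY);
-- cart_id is a hypothetical non-NULL column of Shopping_cart
CREATE TABLE Shopping_cart (cart_id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO Customers VALUES (1), (2), (3);
INSERT INTO Shopping_cart (customer_id) VALUES (2), (3), (3), (3);
""")

# Customer 1 has no cart rows: COUNT(*) still sees the NULL-filled joined row.
counts = conn.execute("""
SELECT c.customer_id, COUNT(*) AS count_star, COUNT(s.cart_id) AS count_col
FROM Customers c
LEFT JOIN Shopping_cart s ON s.customer_id = c.customer_id
GROUP BY c.customer_id
ORDER BY c.customer_id
""").fetchall()

# Customers that appear 0, 1, or 2 times in Shopping_cart.
few = conn.execute("""
SELECT c.customer_id
FROM Customers c
LEFT JOIN Shopping_cart s ON s.customer_id = c.customer_id
GROUP BY c.customer_id
HAVING COUNT(s.cart_id) BETWEEN 0 AND 2
ORDER BY c.customer_id
""").fetchall()

print(counts, few)
```

For customer 1 the two counts differ (1 versus 0), which is exactly the case the question needs to catch.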

MS-Access: HAVING clause not returning any records

I have a Select query to extract Customer Names and Purchase Dates from a table. My goal is to select only those names and dates for customers who have ordered on more than one distinct date. My code is as follows:
SELECT Customer, PurchDate
FROM (SELECT DISTINCT PurchDate, Customer
FROM (SELECT CDate(FORMAT(DateAdd("h",-7,Mid([purchase-date],1,10)+""+Mid([purchase-date],12,8)), "Short Date")) AS PurchDate,
[buyer-name] AS Customer
FROM RawImport
WHERE sku ALIKE "%RE%"))
GROUP BY Customer, PurchDate
HAVING COUNT(PurchDate)>1
ORDER BY PurchDate
This returns no results, even though there are many customers with more than one Purchase Date. The inner two Selects work perfectly and return a set of distinct dates for each customer, so I believe there is some problem in my GROUP/HAVING/ORDER clauses.
Thanks in advance for any help!
In the inner select you are doing
SELECT DISTINCT PurchDate, Customer
and in the outer select
GROUP BY Customer, PurchDate
That means every group contains exactly one row, so all of them have
count(*) = 1
and HAVING COUNT(PurchDate)>1 filters everything out.
I can't give you the exact Access syntax, but you need something like this.
I will use YourTable in place of your inner derived table to make it easier to read.
SELECT DISTINCT Customer, PurchDate
FROM YourTable
WHERE Customer IN (
SELECT Customer
FROM (SELECT DISTINCT Customer, PurchDate
FROM YourTable)
GROUP BY Customer
HAVING COUNT(*) > 1
)
The inner select gives you the customers who ordered on more than one day.
The outer select returns those customers together with all of their purchase dates.
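As a sanity check of the logic (not of the Access syntax), here is the same query shape run on SQLite through Python's sqlite3. YourTable and its rows are made up, and ISO text dates stand in for the converted PurchDate values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Customer TEXT, PurchDate TEXT)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?)", [
    ("Ann", "2021-01-05"), ("Ann", "2021-01-05"),   # two purchases, same day
    ("Bob", "2021-01-05"), ("Bob", "2021-02-09"),   # two distinct days
])

# Inner: one row per (Customer, PurchDate), then customers with > 1 such row.
# Outer: all distinct purchase dates for those customers.
rows = conn.execute("""
SELECT DISTINCT Customer, PurchDate
FROM YourTable
WHERE Customer IN (
    SELECT Customer
    FROM (SELECT DISTINCT Customer, PurchDate FROM YourTable)
    GROUP BY Customer
    HAVING COUNT(*) > 1
)
ORDER BY Customer, PurchDate
""").fetchall()
print(rows)
```

Ann bought twice on the same day, so she is excluded; Bob's two distinct dates are both returned.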
Or you could try something simpler to get the list of customers who bought on more than one day, like this:
SELECT [buyer-name]
FROM RawImport
WHERE sku ALIKE "%RE%"
GROUP BY [buyer-name]
HAVING Format(MAX([purchase-date]), "DD/MM/YYYY") <>
Format(MIN([purchase-date]), "DD/MM/YYYY")