SQL error ORA-00934: group function is not allowed here

When I run the following query, I get ORA-00934: group function is not allowed here
What is the problem?
Select cust_name
from Customers
where
state = 'California' AND
cust_id in(
select cust_id
from Orders
where
count(cust_id) >= 1 AND
book_id in(select book_id from Books where category = 'Computers')
group by cust_id
)

You wrote:
where
count(cust_id) >= 1 AND
You cannot use COUNT, MIN, MAX, AVG or any other aggregate function in a WHERE clause, because at the time the WHERE is evaluated the GROUP BY has not yet been done, so there is nothing to aggregate. The clauses of a query are logically evaluated in the following order:
FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
Subqueries execute in that order before main queries execute in that order. Main queries cannot access anything inside a sub query unless the sub query emits it (your sub queries emit lists of values used by IN)
So, you can't use COUNT in your WHERE, but let's look at what you're trying to do:
where
count(cust_id) >= 1 AND
"Where the count of cust_id is at least one.."
It's highly likely this is redundant. The only way to get a count of 0 is to have no data for that cust_id, but because you're grouping and counting just one table, you never get a 0 count out of it: in order to show up in the result set a row has to be present, which means the count is always at least 1. Other than having NULL in cust_id (COUNT ignores NULLs), there is no way to make this query return 0 for any row:
SELECT cust_id, count(cust_id)
FROM t
GROUP BY cust_id
And if you're looking to eliminate nulls, you'd just say WHERE cust_id IS NOT NULL. If Orders has a NOT NULL constraint on cust_id (is it logical to have an order that has no customer?) then there is no need to specify it.
Further, because you're then using the results in an IN, even if a NULL were selected it would be discarded by the IN anyway: nothing is ever equal to NULL, not even another NULL, so saying
WHERE x IN (1,2,3,NULL)
just gives you rows where x is 1, 2 or 3; you don't get any rows where x is NULL. IN also doesn't care about duplicated values, so this is the same as above:
WHERE x IN (1,1,2,2,2,3,NULL)
All in all, there is no need for the clause you've written, and it can be removed. I suppose the question you're answering is "get the names of all customers from California who have ordered at least one book about computers". The "at least one" is a red herring; there won't be an order row for them if they haven't ordered anything, so you can ignore it:
select cust_name
from Customers
where
state = 'California' AND
cust_id in(
select cust_id
from Orders
where
book_id in(select book_id from Books where category = 'Computers')
)
If however the assignment is "at least two books" then you will need to exclude the single orders. That is done with HAVING which is a where clause that runs after a GROUP BY...
Select cust_name
from Customers
where
state = 'California' AND
cust_id in(
select cust_id
from Orders
where
book_id in(select book_id from Books where category = 'Computers')
group by cust_id
having count(cust_id) > 1
)
Note the use of > rather than >=
Personally, rather than nesting IN I would use JOINs and keep it all on the same level:
SELECT cust_name
FROM
Customers c
INNER JOIN Orders o on c.cust_id = o.cust_id
INNER JOIN Books b on o.book_id = b.book_id
WHERE
c.state = 'California' AND
b.category = 'Computers'
GROUP BY c.cust_id, c.cust_name
HAVING COUNT(*) > 1
If you're going to use this latter form for "at least one book", remove the HAVING but keep the GROUP BY rather than using DISTINCT, as it will prevent different customers with the same name coalescing into one
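The WHERE-versus-HAVING behaviour described in this answer can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Oracle (an assumption; the error text differs from ORA-00934, but the rule is the same), and the sample rows and customer names are invented for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle schema.
# Table and column names follow the question; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT, state TEXT);
    CREATE TABLE Books (book_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE Orders (cust_id INTEGER, book_id INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann', 'California'), (2, 'Bob', 'California'), (3, 'Cy', 'Texas');
    INSERT INTO Books VALUES (10, 'Computers'), (11, 'Computers'), (12, 'Cooking');
    INSERT INTO Orders VALUES (1, 10), (1, 11), (2, 10), (2, 12), (3, 10), (3, 11);
""")

# Aggregate in WHERE: rejected, because WHERE is evaluated before grouping.
try:
    conn.execute("SELECT cust_id FROM Orders WHERE COUNT(cust_id) >= 1 GROUP BY cust_id")
    where_error = None
except sqlite3.OperationalError as exc:
    where_error = str(exc)

# Same kind of filter in HAVING: accepted, because HAVING is evaluated after grouping.
at_least_two = [row[0] for row in conn.execute("""
    SELECT c.cust_name
    FROM Customers c
    INNER JOIN Orders o ON c.cust_id = o.cust_id
    INNER JOIN Books b ON o.book_id = b.book_id
    WHERE c.state = 'California' AND b.category = 'Computers'
    GROUP BY c.cust_id, c.cust_name
    HAVING COUNT(*) > 1
    ORDER BY c.cust_name
""")]

print(where_error)   # an SQLite "misuse of aggregate" style message
print(at_least_two)  # ['Ann']
```

Only Ann is returned: Bob is in California but has a single Computers order, and Cy has two Computers orders but is in Texas.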

There seems to be no need for GROUP BY or COUNT here.
Try this SQL statement:
Select cust_name from Customers
where state = 'California'
AND cust_id in
(select cust_id from Orders
where book_id in
(select book_id from Books where category = 'Computers')
)
You could use DISTINCT in the subquery instead of GROUP BY, but it isn't needed there either, because IN ignores duplicates.

Related

Best approach to display all the users who have more than 1 purchases in a month in SQL

I have two tables in an Oracle Database, one of which holds all the purchases made by all customers over many years (purchase_logs). It has a unique purchase_id that is paired with a customer_id. The other table contains the user info of all the customers. Both have a common key of customer_id.
I want to display the user info of customers who have purchased more than 1 unique item (NOT the item quantity) in any month (i.e. a customer who bought 4 unique items in February 2020 would be valid, as would someone who bought 2 items in June). I was wondering what the correct approach would be, and how to correctly execute it.
The two approaches that I can see are
Approach 1
Count the overall number of purchases made by each customer, filter the ones that are greater than 1, and then check whether any of them were made within the same month.
Use this as a subquery in the WHERE clause of the main query to retrieve the customer info for every customer_id that matches this condition.
This is what I've done so far; it retrieves the customer ids of all the customers who have more than 1 purchase in total. But I do not understand how to filter out the purchases that did not occur within a single arbitrary month.
SELECT * FROM customer_details
WHERE customer_id IN (
SELECT cust_id from purchase_logs
group by cust_id
having count(*) >= 2);
Approach 2
Create a temporary table to Count the number of monthly purchases of a specific user_id then find the MAX() of the whole table and check if that MAX value is bigger than 1 or not. Then if it is provide it as true for the main query's where clause for the customer_info.
Approach 2 feels like the more logical option but I cannot seem to understand how to write the proper subquery for it as the command MAX(COUNT(customer_id)) from purchase_logs does not seem to be a valid query.
[Images: DDL diagram; sample data for Purchase_logs, Customer_info and Item_info; expected output for this sample data]
It is certainly possible that there is a simpler approach that I am not seeing right now.
Would appreciate any suggestions and tips on this.
You need this query:
SELECT DISTINCT cust_id
FROM purchase_logs
GROUP BY cust_id, TO_CHAR(purchase_date, 'YYYY-MON')
HAVING COUNT(DISTINCT item_id) > 1;
to get the cust_ids of all customers who have purchased more than 1 unique item in any month. You can use it with the IN operator:
SELECT *
FROM customer_details
WHERE customer_id IN (
SELECT DISTINCT cust_id -- here DISTINCT may be removed as it does not make any difference when the result is used with IN
FROM purchase_logs
GROUP BY cust_id, TO_CHAR(purchase_date, 'YYYY-MON')
HAVING COUNT(DISTINCT item_id) > 1
);
One approach might be to try
with multiplepurchase as (
select customer_id, trunc(purchasedate, 'MM') as purchase_month, count(*) as order_count
from purchase_logs
group by customer_id, trunc(purchasedate, 'MM')
having count(*) >= 2)
-- DISTINCT because a customer may qualify in more than one month
select distinct a.customer_id, b.username, b.usercategory
from multiplepurchase a
left join userinfo b
on a.customer_id = b.customer_id
Expanding on @MT0's answer:
SELECT *
FROM customer_details CD
WHERE EXISTS (
SELECT 1
FROM purchase_logs PL
WHERE CD.customer_id = PL.cust_id
GROUP BY PL.cust_id, TO_CHAR(PL.purchase_date, 'YYYYMM')
HAVING COUNT(DISTINCT PL.item_id) >= 2 -- at least two different items in the same month
);
I want to display the user info of customers who have more than 1 purchases in a single arbitrary month.
Just add a WHERE filter to your sub-query.
So assuming that you wanted the month of July 2021 and you had a purchase_date column (with a DATE or TIMESTAMP data type) in your purchase_logs table then you can use:
SELECT *
FROM customer_details
WHERE customer_id IN (
SELECT cust_id
FROM purchase_logs
WHERE DATE '2021-07-01' <= purchase_date
AND purchase_date < DATE '2021-08-01'
GROUP BY cust_id
HAVING count(*) >= 2
);
If you want the users where they have bought two-or-more items in any single calendar month then:
SELECT *
FROM customer_details c
WHERE EXISTS (
SELECT 1
FROM purchase_logs p
WHERE c.customer_id = p.cust_id
GROUP BY cust_id, TRUNC(purchase_date, 'MM')
HAVING count(*) >= 2
);
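To see how grouping by customer and month behaves, and how count(*) differs from COUNT(DISTINCT item_id), here is a minimal sketch. It runs on SQLite through Python's sqlite3 module rather than Oracle, so strftime('%Y-%m', ...) stands in for TRUNC(purchase_date, 'MM'); the sample rows are invented, and the helper name customers_matching is hypothetical.

```python
import sqlite3

# In-memory SQLite database; table and column names follow the answers above,
# the sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_details (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase_logs (purchase_id INTEGER PRIMARY KEY, cust_id INTEGER,
                                item_id INTEGER, purchase_date TEXT);
    INSERT INTO customer_details VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO purchase_logs VALUES
        (1, 1, 100, '2020-02-03'), (2, 1, 101, '2020-02-20'),  -- two items, same month
        (3, 2, 100, '2020-01-15'), (4, 2, 101, '2020-03-15'),  -- two items, different months
        (5, 3, 100, '2020-06-01'), (6, 3, 100, '2020-06-09');  -- same item twice in one month
""")

# strftime('%Y-%m', ...) plays the role of Oracle's TRUNC(purchase_date, 'MM').
QUERY = """
    SELECT c.customer_id
    FROM customer_details c
    WHERE EXISTS (
        SELECT 1
        FROM purchase_logs p
        WHERE p.cust_id = c.customer_id
        GROUP BY p.cust_id, strftime('%Y-%m', p.purchase_date)
        HAVING {condition}
    )
    ORDER BY c.customer_id
"""

def customers_matching(condition):
    """Return ids of customers for whom at least one month satisfies the HAVING condition."""
    return [row[0] for row in conn.execute(QUERY.format(condition=condition))]

two_purchases = customers_matching("COUNT(*) >= 2")
two_distinct_items = customers_matching("COUNT(DISTINCT p.item_id) >= 2")

print(two_purchases)       # [1, 3]
print(two_distinct_items)  # [1]
```

Customer 3 bought the same item twice in one month, so they pass the count(*) test but not the distinct-item test; which of the two you want depends on whether "unique items" matters for your report.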

how can I select rows that column does NOT have more than 1 value?

I am very new to SQL and I am wondering how to solve this issue. For example, my table looks as follows:
As you can see in the table, item_id 1 appears in both city_id 1 and 2, and so does item_id 4, but I want to get all the items that appear in only one city_id.
In this example, these would be item_id 2 (appearing only in city_id 2) and item_id 3 (appearing in city_id 1).
Use aggregation on item_id and count distinct values of city_id. The having clause can be used to filter on aggregates.
select item_id from mytable group by item_id having count(distinct city_id) = 1
You can use the following query:
SELECT item_id
FROM table_name
GROUP BY item_id
HAVING COUNT(DISTINCT city_id) = 1
In case you want to see the city_id too, you can use this query:
SELECT item_id, MIN(city_id) AS city_id
FROM example
GROUP BY item_id
HAVING COUNT(DISTINCT city_id) = 1
Since there is only one city_id you can use MIN or MAX to get the id.
You want all the id where they have only one distinct city:
SELECT item_id
FROM table
GROUP BY item_id
HAVING count(distinct city_id) = 1
It works by counting all the different values that city_id has for the same item_id. For those item ids where they repeat a lot, but the city_id is always the same the count of unique values in the city id is 1, and we can look for these using a HAVING clause. "Having" is like a where clause that runs after a GROUP BY operation is completed. It is the conceptual equivalent of this:
SELECT item_id
FROM
(
SELECT item_id, count(distinct city_id) as cdci
FROM table
GROUP BY item_id
) x
WHERE cdci = 1
If you want the city id too you can either get the MAX city (because in this case there is only one city so it's safe to do):
SELECT item_id, MAX(city_id) as city_id
FROM table
GROUP BY item_id
HAVING count(distinct city_id) = 1
or you could join this query back to the item table as a subquery:
SELECT t.*
FROM
(
SELECT item_id
FROM table
GROUP BY item_id
HAVING count(distinct city_id) = 1
) x
INNER JOIN
table t
ON x.item_id = t.item_id
This technique is the more general process for performing a group by that finds some particular set of rows, then bringing in the rest of the data from those rows. You can't always stick every other column you want in a MAX because it will mix row data up, and you can't put the extra columns in your group by because that will subdivide what you're grouping on, giving the wrong results. Doing the group as a subquery and joining it back is a typical way to get all the row data when you have to group it to find which rows are interesting.
In your case this form of query will bring back all the duplicated rows (whereas the group by/max won't). If you don't want the duplicate rows you can make the top line SELECT DISTINCT t.* but don't make a habit of slapping DISTINCT in to get rid of duplicated rows; if your tables don't have duplicates to start with but suddenly after you wrote a JOIN you got duplicated rows, google for what a Cartesian product is in database queries and how to prevent it.
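The difference between the GROUP BY/MIN form and the join-back form can be seen with a small sketch. It uses SQLite via Python's sqlite3 module; the table name item_city is hypothetical (the answer's placeholder "table" is a reserved word), and the rows follow the question's description with one deliberately duplicated row for item 3.

```python
import sqlite3

# In-memory SQLite table; item_city is a made-up name for the question's table.
# Items 1 and 4 appear in two cities; item 3 has a duplicated row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_city (item_id INTEGER, city_id INTEGER);
    INSERT INTO item_city VALUES (1, 1), (1, 2), (2, 2), (3, 1), (3, 1), (4, 1), (4, 2);
""")

# One row per qualifying item; MIN is safe because only one distinct city exists.
grouped = conn.execute("""
    SELECT item_id, MIN(city_id) AS city_id
    FROM item_city
    GROUP BY item_id
    HAVING COUNT(DISTINCT city_id) = 1
    ORDER BY item_id
""").fetchall()

# Grouped subquery joined back to the table: every original row of the qualifying items.
joined_back = conn.execute("""
    SELECT t.item_id, t.city_id
    FROM (
        SELECT item_id
        FROM item_city
        GROUP BY item_id
        HAVING COUNT(DISTINCT city_id) = 1
    ) x
    INNER JOIN item_city t ON x.item_id = t.item_id
    ORDER BY t.item_id
""").fetchall()

print(grouped)      # [(2, 2), (3, 1)]
print(joined_back)  # [(2, 2), (3, 1), (3, 1)]
```

The duplicated row for item 3 survives the join-back but is collapsed by the grouped query, which is the trade-off described above.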
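You just need a GROUP BY on item_id with HAVING:
Select item_id
from table
group by item_id
having count(distinct city_id) = 1
Also, if you want to keep every input row for those items, you can use window functions in a subquery (a window-function alias cannot be referenced in the WHERE of the same query level):
Select item_id, city_id
From (
Select item_id, city_id,
min(city_id) over(partition by item_id) min_city,
max(city_id) over(partition by item_id) max_city
From table
) x
Where min_city = max_city;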

Creating variable in SQL and using in WHERE clause

I want to create a variable that counts the number of times each customer ID appears in the CSV, and then I want the output to be all customer IDs that appear 0,1,or 2 times. Here is my code so far:
SELECT Customers.customer_id , COUNT(*) AS counting
FROM Customers
LEFT JOIN Shopping_cart ON Customers.customer_id = Shopping_cart.customer_id
WHERE counting = '0'
OR counting = '1'
OR counting = '2'
GROUP BY Customers.customer_id;
SELECT Customers.customer_id, COUNT(Shopping_cart.customer_id) AS counting
FROM Customers LEFT JOIN Shopping_cart ON Customers.customer_id = Shopping_cart.customer_id
GROUP BY Customers.customer_id
HAVING COUNT(Shopping_cart.customer_id) < 3;
The query groups the rows by customer ID, and COUNT() gives the number of rows in each group. Because the filter applies to an aggregate, it belongs in a HAVING clause after the GROUP BY, not in WHERE; a column alias such as counting cannot be used in WHERE either. "Fewer than 3" covers 0, 1 and 2. Counting a Shopping_cart column rather than * lets customers with no cart rows come out as 0.
You're probably thinking of HAVING, not WHERE.
For example:
select JOB, COUNT(JOB) from SCOTT.EMP
group by JOB
HAVING count(JOB) > 1 ;
While a tad odd, you may be specific about the HAVING condition(s):
HAVING count(JOB) = 2 or count(JOB) = 4
Note: the WHERE clause is used for filtering rows and it applies on each and every row, while the HAVING clause is used to filter groups.
You can apply a filter after the aggregation with the HAVING clause.
Please note that count(*) counts all rows, including the NULL-filled row that the LEFT JOIN produces for a customer with no shopping cart, so you cannot use it to detect customers without any shopping cart; you have to count the non-NULL values in some column instead:
SELECT customer_id,
count(Shopping_cart.some_id) AS counting
FROM Customers
LEFT JOIN Shopping_cart USING (customer_id)
GROUP BY customer_id
HAVING count(Shopping_cart.some_id) BETWEEN 0 and 2;
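The difference between count(*) and counting a column from the outer-joined table can be checked with a short sketch. It uses SQLite via Python's sqlite3 module; cart_id is a hypothetical column standing in for some_id, and the rows are invented.

```python
import sqlite3

# In-memory SQLite database; cart_id is a made-up stand-in for "some_id".
# Customer 1 has three cart rows, customer 2 has one, customer 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE Shopping_cart (cart_id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO Customers VALUES (1), (2), (3);
    INSERT INTO Shopping_cart VALUES (10, 1), (11, 1), (12, 1), (13, 2);
""")

# COUNT(*) counts the NULL-extended row for customer 3; COUNT(s.cart_id) does not.
counts = conn.execute("""
    SELECT c.customer_id, COUNT(*) AS all_rows, COUNT(s.cart_id) AS cart_rows
    FROM Customers c
    LEFT JOIN Shopping_cart s ON c.customer_id = s.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

# Customers appearing 0, 1 or 2 times in Shopping_cart.
at_most_two = conn.execute("""
    SELECT c.customer_id, COUNT(s.cart_id) AS counting
    FROM Customers c
    LEFT JOIN Shopping_cart s ON c.customer_id = s.customer_id
    GROUP BY c.customer_id
    HAVING COUNT(s.cart_id) BETWEEN 0 AND 2
    ORDER BY c.customer_id
""").fetchall()

print(counts)       # [(1, 3, 3), (2, 1, 1), (3, 1, 0)]
print(at_most_two)  # [(2, 1), (3, 0)]
```

Customer 3 shows 1 under COUNT(*) but 0 under COUNT(s.cart_id), which is why the column count is the one to filter on.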

SQL Query: Find the name of the company that has been assigned the highest number of patents

Using this query I can find the Company Assignee number for company with most patents but I can't seem to print the company name.
SELECT count(*), patent.assignee
FROM Patent
GROUP BY patent.assignee
HAVING count(*) =
(SELECT max(count(*))
FROM Patent
Group by patent.assignee);
COUNT(*) --- ASSIGNEE
9 19715
9 27895
Nesting above query into
SELECT company.compname
FROM company
WHERE ( company.assignee = ( *above query* ) );
would give an error "too many values" since there are two companies with most patents but above query takes only one assignee number in the WHERE clause. How do I solve this problem? I need to print name of BOTH companies with assignee number 19715 and 27895. Thank you.
You have started down the path of using nested queries. All you need to do is remove COUNT(*):
SELECT company.compname
FROM company
WHERE company.assignee IN
(SELECT patent.assignee
FROM Patent
GROUP BY patent.assignee
HAVING count(*) = (SELECT max(count(*))
FROM Patent
GROUP BY patent.assignee
)
);
I wouldn't write the query this way. The use of max(count(*)) is particularly jarring, but it is valid Oracle syntax.
Applying an aggregate function on another aggregate function (like max(count(*))) is illegal in many databases but I believe using the ALL operator instead and a join to get the company name would solve your problem.
Try this:
SELECT COUNT(*), p.assignee, c.compname
FROM Patent p
JOIN Company c ON c.assignee = p.assignee
GROUP BY p.assignee, c.compname
HAVING COUNT(*) >= ALL -- this predicate will return those rows
( -- for which the comparison holds true
SELECT COUNT(*) -- for all instances.
FROM Patent -- it can only be true for the highest count
GROUP BY assignee
);
Assuming you have Oracle, I thought about this a bit differently:
select
c.compname
from
company c
join
(
select
assignee,
dense_rank() over (order by count(1) desc) rnk
from
patent
group by
assignee
) p
on p.assignee = c.assignee
where
p.rnk = 1
;
I like this because it lets you find any rank. For example, if you want the top 3 you would just change p.rnk = 1 to p.rnk <= 3. If you want 10th place, you just change it to p.rnk = 10. Adding the total count and rank into the results would be easy from here too. Overall I think it's more versatile.
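For databases that reject max(count(*)), one portable alternative (not taken from the answers above) is to compute the per-assignee counts in a derived table and take MAX over that. A minimal sketch on SQLite via Python's sqlite3 module follows; the company names and patent rows are invented, and only the two assignee numbers come from the question.

```python
import sqlite3

# In-memory SQLite database; company names and row counts are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (assignee INTEGER PRIMARY KEY, compname TEXT);
    CREATE TABLE Patent (patent_id INTEGER PRIMARY KEY, assignee INTEGER);
    INSERT INTO Company VALUES (19715, 'Acme'), (27895, 'Globex'), (30000, 'Initech');
    INSERT INTO Patent (assignee) VALUES
        (19715), (19715), (19715),
        (27895), (27895), (27895),
        (30000);
""")

# The inner derived table t holds one count per assignee; MAX(cnt) picks the
# highest, and HAVING keeps every assignee that reaches it (ties included).
top_companies = [row[0] for row in conn.execute("""
    SELECT c.compname
    FROM Company c
    WHERE c.assignee IN (
        SELECT assignee
        FROM Patent
        GROUP BY assignee
        HAVING COUNT(*) = (
            SELECT MAX(cnt)
            FROM (SELECT COUNT(*) AS cnt FROM Patent GROUP BY assignee) t
        )
    )
    ORDER BY c.compname
""")]

print(top_companies)  # ['Acme', 'Globex']
```

Both tied companies are returned because the outer filter uses IN rather than =.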

Which customer has placed most orders. SQL query

I'm trying to query my database for my class to find out which customer has placed the most orders. The table I'm searching is a three attribute table that has the customerID, orderID, and the placedDate.
The query I thought would work is:
select cid from placed order by sum(oid);
But I keep getting an error saying cid is "not a single-group group function" the oid is the primary key and is a foreign key that references another table. Is that what the issue is?
If you want to count the number of orders you should do a count instead of a SUM:
SELECT cid,COUNT(*)
FROM placed
GROUP BY cid
ORDER BY COUNT(*) DESC
This will give you the list of customers and their respective number of orders, sorted by the number of orders in descending order.
If you want just the customer with the most orders, you have to limit the result to the first row. How you do that depends on the DBMS you use (e.g. MySQL uses LIMIT 1, SQL Server uses TOP 1).
In Oracle, you can do:
SELECT * FROM (
SELECT cid,COUNT(*)
FROM placed
GROUP BY cid
ORDER BY COUNT(*) DESC
) a
WHERE rownum = 1
In case more than one customer is tied for the maximum number of orders:
select *
from orders o, customer c
where o.cusId = c.cusId
and o.cusId IN (
select cusId from orders
group by cusId
having count(*) = (select count(*) from orders o2 group by o2.cusId order by count(*) desc limit 1)
);
This solution is for MySQL, as I have used LIMIT. It can be changed as per the DBMS.
I also used = in the second query since LIMIT does not work with IN.