Difference between Group by queries with and without Having [closed] - sql

I'm new to SQL queries so I'm trying to understand the difference between a select query with and without having. In the task, I need to report the first login date for each player.
How the table is arranged:
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
The first attempt, in which I wrote having, was unsuccessful.
SELECT player_id, event_date as first_login FROM Activity
GROUP BY player_id
HAVING MIN(event_date)
Failed Test
{"headers":{"Activity":["player_id","device_id","event_date","games_played"]},"rows":{"Activity":[[1,2,"2016-03-01",5],[1,2,"2016-05-02",6],[1,3,"2015-06-25",1],[3,1,"2016-03-02",0],[3,4,"2016-02-03",5]]}}
This option passed all tests:
SELECT player_id, MIN(event_date) as first_login FROM Activity
GROUP BY player_id
Why doesn't HAVING work in the first query, and what is the difference between the two queries?

HAVING and WHERE are both filtering clauses. WHERE specifies a filtering condition that is applied to rows before grouping takes place. HAVING filters the grouped results afterwards.
The example below lists orders from customers in a specific state (the WHERE part) that total more than $70 (the HAVING part). You need HAVING for the second condition because the amount spent has to be calculated per group before it can be compared with 70. I hope this answers your question.
SELECT oi.order_id, SUM(quantity * unit_price) total_spent, CONCAT(c.first_name, ' ', c.last_name) AS fullname, c.state
FROM order_items oi
JOIN orders o USING(order_id)
JOIN customers c USING(customer_id)
WHERE c.state = 'VA'
GROUP BY order_id
HAVING total_spent > 70
ORDER BY order_id;
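To try the WHERE-then-HAVING order yourself, here is a minimal sketch using Python's built-in sqlite3 module. The three tables are simplified, hypothetical versions of the ones in the query above (only the columns the query needs), and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical, simplified tables; the sample data is invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, state TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    CREATE TABLE order_items (order_id INTEGER, quantity INTEGER, unit_price REAL);
    INSERT INTO customers VALUES (1, 'VA'), (2, 'VA'), (3, 'CA');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 3);
    INSERT INTO order_items VALUES
        (10, 2, 50.0),   -- order 10 (VA customer): 100 in total
        (11, 1, 20.0),   -- order 11 (VA customer): 20 + 30 = 50 in total
        (11, 3, 10.0),
        (12, 5, 40.0);   -- order 12 (CA customer): 200 in total
""")

rows = conn.execute("""
    SELECT oi.order_id, SUM(oi.quantity * oi.unit_price) AS total_spent
    FROM order_items oi
    JOIN orders o ON o.order_id = oi.order_id
    JOIN customers c ON c.customer_id = o.customer_id
    WHERE c.state = 'VA'                           -- row filter, applied before grouping
    GROUP BY oi.order_id
    HAVING SUM(oi.quantity * oi.unit_price) > 70   -- group filter, applied after grouping
    ORDER BY oi.order_id
""").fetchall()

print(rows)  # [(10, 100.0)]
```

Order 12 is removed by WHERE (wrong state) before any totals exist, while order 11 is removed by HAVING because its total is only 50.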

The HAVING clause is used to place conditions on the groups created by the GROUP BY clause, similar to how the WHERE clause places conditions on individual rows.
You don't need a HAVING clause to solve this problem (as you noted). You might use a HAVING clause if you needed to report the first login date for each player before or after a specific date.
SELECT player_id, MIN(event_date) as first_login
FROM Activity
GROUP BY player_id
HAVING MIN(event_date) > '2020-12-31'
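As a rough illustration, the sketch below loads the rows from the failed test case into an in-memory SQLite database (via Python's sqlite3 module) and runs both forms. The key point, as far as I can tell, is that HAVING only decides which groups are kept; it never chooses which event_date value is returned, so the aggregate has to appear in the SELECT list. The cutoff date '2016-01-01' is an assumption picked to fit this sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Activity ("
    "player_id INTEGER, device_id INTEGER, event_date TEXT, games_played INTEGER)"
)
# Rows taken from the failed test case in the question.
conn.executemany(
    "INSERT INTO Activity VALUES (?, ?, ?, ?)",
    [
        (1, 2, "2016-03-01", 5),
        (1, 2, "2016-05-02", 6),
        (1, 3, "2015-06-25", 1),
        (3, 1, "2016-03-02", 0),
        (3, 4, "2016-02-03", 5),
    ],
)

# The aggregate in the SELECT list picks the earliest date of each group.
first_logins = conn.execute("""
    SELECT player_id, MIN(event_date) AS first_login
    FROM Activity
    GROUP BY player_id
    ORDER BY player_id
""").fetchall()
print(first_logins)  # [(1, '2015-06-25'), (3, '2016-02-03')]

# HAVING then keeps only the groups whose first login is after the cutoff.
# The cutoff is a made-up value for this sample data.
after_cutoff = conn.execute("""
    SELECT player_id, MIN(event_date) AS first_login
    FROM Activity
    GROUP BY player_id
    HAVING MIN(event_date) > '2016-01-01'
    ORDER BY player_id
""").fetchall()
print(after_cutoff)  # [(3, '2016-02-03')]
```

The dates are stored as ISO-formatted text here, so plain string comparison orders them correctly.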

Related

TSQL request is required [closed]

The task reads:
Choose the top 10 cities for the next store opening
Columns: City | Priority
Priority is defined as the number of buyers in the city
There should be no shop in the city.
My preliminary (incorrect) attempt:
SELECT
    City,
    COUNT(1) as [Priority]
FROM
    Sales.vIndividualCustomer
GROUP BY City
EXCEPT
SELECT
    City,
    COUNT(1) as [Priority]
FROM
    Purchasing.vVendorWithAddresses
GROUP BY City
ORDER BY [Priority] DESC
GO
Result: uniqueness disappears as soon as I start counting the priority. Maybe there is another way?
PS: I am using Microsoft's AdventureWorks2016 database.
You can use a NOT EXISTS subquery:
SELECT
    ic.City,
    COUNT(1) as [Priority]
FROM
    Sales.vIndividualCustomer ic
GROUP BY
    ic.City
HAVING NOT EXISTS (SELECT 1
    FROM
        Purchasing.vVendorWithAddresses va
    WHERE
        va.City = ic.City)
ORDER BY [Priority] DESC;
Note that if the subquery needs to refer to columns that are neither grouped nor aggregated, the NOT EXISTS has to go in the WHERE clause rather than the HAVING clause.
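Here is a small runnable sketch of the same idea, written against SQLite through Python's sqlite3 module rather than SQL Server. The customers and vendors tables are hypothetical stand-ins for the two AdventureWorks views, and LIMIT 10 plays the role that TOP (10) would play in T-SQL. This variant places NOT EXISTS in the WHERE clause, so cities that already have a shop are dropped before anything is counted.

```python
import sqlite3

# Hypothetical stand-ins for Sales.vIndividualCustomer and
# Purchasing.vVendorWithAddresses; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT, city TEXT);
    CREATE TABLE vendors (name TEXT, city TEXT);
    INSERT INTO customers VALUES
        ('c1', 'Seattle'), ('c2', 'Seattle'), ('c3', 'Seattle'),
        ('c4', 'Portland'), ('c5', 'Portland'),
        ('c6', 'Boise');
    INSERT INTO vendors VALUES ('v1', 'Seattle');
""")

top_cities = conn.execute("""
    SELECT c.city, COUNT(*) AS priority
    FROM customers c
    WHERE NOT EXISTS (SELECT 1 FROM vendors v WHERE v.city = c.city)
    GROUP BY c.city
    ORDER BY priority DESC, c.city
    LIMIT 10
""").fetchall()

print(top_cities)  # [('Portland', 2), ('Boise', 1)]
```

Seattle has the most customers but is excluded because a vendor is already located there. Unlike EXCEPT, the comparison is on the city alone, so the counts never get in the way.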

Does join combine the same column with same name and data?

I'm reading this article by Miguel Grinberg, and in 'The Join' part I'm a bit confused by the result.
To sum up the part I'm concerned with: he joined a query and a subquery on the SAME table, on the condition that their customer_id values are the same
Query selected: id, customer_id, order_date
Subquery selected: customer_id, max(order_date) AS last_order_date
When he joined them I was expecting something like:
id | customer_id | order_date | customer_id | last_order_date
--------------------------------------------------------------
But his result was:
id | customer_id | order_date | last_order_date
-----------------------------------------------
Where is the other customer_id selected from the subquery?
I would like to confirm whether my understanding is correct: does a JOIN also combine two COLUMNS if they have the same NAME and VALUE?
The fact that the article uses select * when it should be using select orders.*, last_orders.last_order_date already makes me suspicious of anything else in the article.
Most databases would run the query and return two columns with customer_id -- as you suggest should happen. However, there is then a problem in accessing both those columns in an application. They have the same name. So, the columns might be elided in some way.
All that said, this is a rather poor example, because the query is much better written using window functions:
select o.*, max(order_date) over (partition by customer_id)
from orders o;
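To check what a join actually returns, here is a small sketch using Python's sqlite3 module. The orders table and its rows are hypothetical, modelled on the columns named in the question. It compares the column list of select * with the column list of the explicit form, and the assumption is that SQLite behaves like most databases here and keeps both customer_id columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
    INSERT INTO orders VALUES
        (1, 1, '2020-01-05'),
        (2, 1, '2020-03-10'),
        (3, 2, '2020-02-01');
""")

# The subquery from the question: one row per customer with the latest order date.
last_orders_sql = """
    SELECT customer_id, MAX(order_date) AS last_order_date
    FROM orders
    GROUP BY customer_id
"""

# SELECT * returns every column from both sides of the join.
star = conn.execute(f"""
    SELECT *
    FROM orders
    JOIN ({last_orders_sql}) AS last_orders
        ON orders.customer_id = last_orders.customer_id
""")
star_columns = [col[0] for col in star.description]

# Naming the columns explicitly avoids the duplicate.
explicit = conn.execute(f"""
    SELECT orders.*, last_orders.last_order_date
    FROM orders
    JOIN ({last_orders_sql}) AS last_orders
        ON orders.customer_id = last_orders.customer_id
""")
explicit_columns = [col[0] for col in explicit.description]

print(star_columns)
print(explicit_columns)
```

In the first list customer_id should appear twice, once per side of the join. A JOIN does not merge columns on its own; a tool that displays the result may simply hide or overwrite one of two columns with the same name.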

SQL,Trying to get the sum with a datetime column [closed]

I have 2 columns: one shows the email and the other shows the date the email was created.
What I'm trying to do is something like this:
registered customer 4 Date 01-01-2018
registered customer 2 Date 01-02-2018
registered customer 9 Date 01-03-2018
Any help would be appreciated.
Thanks
If the table structure consists of 2 columns (email, created_at), you should group your data by the created_at field.
Example:
select created_at, count(email) as email_count
from your_table
group by created_at
You appear to want:
select date, count(email) as emailcount
from table t
group by date;
EDIT: Use count() instead of sum(). If you want to count the emails registered per day, put the date in the GROUP BY clause instead of aggregating it.
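A minimal sketch of the per-day count, using Python's sqlite3 module. The registrations table, its column names and the sample rows are assumptions. Since the title mentions a datetime column, the sketch groups on DATE(created_at) so that rows from the same day with different times land in the same group; other databases spell this conversion differently (for example CAST(created_at AS date)).

```python
import sqlite3

# Hypothetical table and sample data: one row per registered email.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registrations (email TEXT, created_at TEXT);
    INSERT INTO registrations VALUES
        ('a@example.com', '2018-01-01 09:15:00'),
        ('b@example.com', '2018-01-01 17:40:00'),
        ('c@example.com', '2018-01-02 08:05:00');
""")

per_day = conn.execute("""
    SELECT DATE(created_at) AS created_on, COUNT(email) AS registered_customers
    FROM registrations
    GROUP BY DATE(created_at)   -- strip the time part so each day forms one group
    ORDER BY created_on
""").fetchall()

print(per_day)  # [('2018-01-01', 2), ('2018-01-02', 1)]
```

Grouping on the raw datetime instead would produce one group per distinct timestamp, which usually means a count of 1 on every row.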
SELECT SITEID, THEDATE, COUNT(EMAIL)
FROM [database].[dbo].[table]
WHERE LOGIN NOT LIKE '' AND SITEID = 'someSiteId' AND
THEDATE >= '2017-01-01'
GROUP BY SITEID, THEDATE
ORDER BY THEDATE ASC;
Depending on how many entries you have, you could also do:
Select Num, date
FROM table
ORDER BY date
Then make Num be an identity column in the table that matches the email with a customer number. Maybe not as ideal depending on your data...but if you are referring to the customer by their number you may potentially want that info in the table.

Sum Expression Aggregate Error

I am rewriting this question because Gordon Linoff told me it might come off as rude if I edited my other one -- so this isn't a duplicate, just a correction.
I am trying to write code that will sum the prices of all the orders that come up when I fill out the query. For example, if I enter the ID range 1-60, I want a sum column to be created that sums up the prices of IDs 1-60.
I thought it would be simple enough to just create a SUM(.....) AS Exp 1, but it tells me that there is a problem with the ID and aggregate function.
I want to be able to see the individual prices, as well as a new column with the sum of all these prices. I plan on adding some more columns of data into the table later on.
My current code looks like this:
SELECT table.ID, table.Price, SUM(table.Price) AS Exp 1
FROM table
WHERE table.ID BETWEEN StartID AND EndID
Thank you for any help
There is a conceptual error in your aggregate query. When it runs, the WHERE clause is evaluated first and excludes all IDs that are not between your user-specified start and end points. After that, the query is missing a GROUP BY clause to say what should be grouped. Remove the table.Price field from the SELECT list; otherwise you would have to group by it as well and would get a separate record for each distinct price, which is not what you want.
SELECT t.ID, SUM(t.Price) AS Price_Summary
FROM table t
WHERE t.ID BETWEEN StartID AND EndID
GROUP BY t.ID
Also, aliases will help improve readability.
EDIT
I think this might be what you are trying to get to, but I'm still unclear.
SELECT t.ID, SUM(t.Price) AS Price_Summary,
(SELECT SUM(t2.Price) FROM table t2 WHERE t2.ID BETWEEN StartID AND EndID) AS Total_Price
FROM table t
WHERE t.ID BETWEEN StartID AND EndID
GROUP BY t.ID
This will give you a result set with all of the IDs, a sum of the price of everything with that ID, and then a total sum. So it would look something like this:
ID | Price_Summary | Total_Price
---+---------------+------------
1  | 10            | 60
2  | 30            | 60
3  | 20            | 60
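The result above can be reproduced with a short sketch in Python's sqlite3 module. The table name price_list is a hypothetical replacement for the placeholder name table (a reserved word), the rows are made up to match the sample output, and StartID and EndID are passed as the named parameters :start_id and :end_id.

```python
import sqlite3

# Hypothetical table standing in for "table"; rows chosen to match the sample output.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE price_list (ID INTEGER, Price INTEGER);
    INSERT INTO price_list VALUES
        (1, 10),
        (2, 10), (2, 20),   -- two rows for ID 2, summed to 30
        (3, 20),
        (4, 99);            -- outside the requested range
""")

query = """
    SELECT t.ID,
           SUM(t.Price) AS Price_Summary,
           (SELECT SUM(t2.Price)
            FROM price_list t2
            WHERE t2.ID BETWEEN :start_id AND :end_id) AS Total_Price
    FROM price_list t
    WHERE t.ID BETWEEN :start_id AND :end_id
    GROUP BY t.ID
    ORDER BY t.ID
"""
rows = conn.execute(query, {"start_id": 1, "end_id": 3}).fetchall()

print(rows)  # [(1, 10, 60), (2, 30, 60), (3, 20, 60)]
```

The scalar subquery is evaluated over the whole ID range rather than per group, which is why every row carries the same Total_Price.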

Need some assistance with 2 SQL queries [closed]

I need some help with my exam, there are 2 questions but I don't know how to join them and which SQL statements I need.
Write a SQL statement which displays the employee id, last name,and salary of any employee whose salary is above the price of the most expensive product.
Write a SQL statement which displays the customer id of customers who purchased the product with product_id 1 but so far never purchased the product with product_id 3.
It would be nice if somebody could explain the solution to me.
With the understanding this is a practice exam, and that you don't have access to the tables/values....
You really should show what you've tried... but since you're new...
Question 1:
Return employeeID, last name and salary of all employees who have a salary greater than the price of the most expensive product (like how they avoided using the word max there...)
So this uses a subquery to get the max price from products and simply compares salary against it.
SELECT Employee_ID, Last_Name, Salary
FROM employees
WHERE salary > (Select max(price) from Products);
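Since you don't have the tables, here is a quick way to try this query: a sketch with Python's sqlite3 module, using hypothetical employees and products tables and invented sample rows.

```python
import sqlite3

# Hypothetical tables and sample data for trying out the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee_id INTEGER, last_name TEXT, salary INTEGER);
    CREATE TABLE products (product_id INTEGER, price INTEGER);
    INSERT INTO employees VALUES (1, 'Smith', 500), (2, 'Jones', 1500);
    INSERT INTO products VALUES (1, 100), (2, 1000);
""")

well_paid = conn.execute("""
    SELECT employee_id, last_name, salary
    FROM employees
    WHERE salary > (SELECT MAX(price) FROM products)  -- scalar subquery: one value
    ORDER BY employee_id
""").fetchall()

print(well_paid)  # [(2, 'Jones', 1500)]
```

The subquery does not reference the outer query, so it is uncorrelated: it yields a single value (1000 here) that each salary is compared against.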
Question 2:
Generate two sets, one for product_ID 1 and one for product_ID 3, and make sure the customers in the first set are not in the second.
This uses a correlated subquery (notice how the PUR alias appears inside the subquery? That's what makes it correlated).
So the first query gets all the customers buying product 1. The 2nd query finds all the customers buying product 3. Since we want only those customers who bought 1 and not 3, we correlate the queries using NOT EXISTS and we have our answer.
Exists - usually fastest, as it can exit the subquery early once a single record for a customer is found.
SELECT Distinct PUR.Customer_ID
FROM purchases PUR
WHERE PRODUCT_ID = 1
and not exists (SELECT 1
                FROM Purchases PUR2
                WHERE Product_ID = 3
                and PUR.Customer_ID = PUR2.Customer_ID)
There are 2-3 ways of doing this. I find the above the most efficient in most cases. However, you could also do it with a self join or with a NOT IN statement.
Self LEFT join (works great if you need data from multiple tables)
We self join on purchases but filter each table instance for a specific product. We use a left join because we want all records with product ID 1, and where a match is found in P2 we want to EXCLUDE those records. So if P2.customer_ID is null, the customer has no product 3 purchases.
SELECT Distinct P1.customer_ID
FROM Purchases P1
LEFT JOIN Purchases P2
on P1.customer_Id = P2.customer_ID
and P2.Product_ID = 3
WHERE P1.Product_ID = 1
and P2.customer_ID is null
NOT IN (usually slowest, but works fine if the subquery returns a VERY small dataset)
SELECT Distinct PUR.Customer_ID
FROM purchases PUR
WHERE PRODUCT_ID = 1
and customer_ID not in (SELECT Customer_Id
FROM Purchases
WHERE Product_ID = 3)
Notice that neither of the 1st two answers uses a join; though a correlated subquery is close.
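To convince yourself that the three approaches agree, here is a sketch using Python's sqlite3 module. The purchases table and its rows are invented; only the customer_id and product_id columns from the question are assumed.

```python
import sqlite3

# Hypothetical purchases table with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (customer_id INTEGER, product_id INTEGER);
    INSERT INTO purchases VALUES
        (1, 1), (1, 3),   -- bought both products: excluded
        (2, 1), (2, 1),   -- bought product 1 twice: DISTINCT keeps one row
        (3, 3),           -- never bought product 1: excluded
        (4, 1), (4, 2);   -- bought product 1 but not 3: included
""")

not_exists = conn.execute("""
    SELECT DISTINCT pur.customer_id
    FROM purchases pur
    WHERE pur.product_id = 1
      AND NOT EXISTS (SELECT 1
                      FROM purchases pur2
                      WHERE pur2.product_id = 3
                        AND pur2.customer_id = pur.customer_id)
    ORDER BY pur.customer_id
""").fetchall()

left_join = conn.execute("""
    SELECT DISTINCT p1.customer_id
    FROM purchases p1
    LEFT JOIN purchases p2
        ON p1.customer_id = p2.customer_id
       AND p2.product_id = 3
    WHERE p1.product_id = 1
      AND p2.customer_id IS NULL   -- no matching product 3 row was found
    ORDER BY p1.customer_id
""").fetchall()

not_in = conn.execute("""
    SELECT DISTINCT customer_id
    FROM purchases
    WHERE product_id = 1
      AND customer_id NOT IN (SELECT customer_id
                              FROM purchases
                              WHERE product_id = 3)
    ORDER BY customer_id
""").fetchall()

print(not_exists, left_join, not_in)  # each is [(2,), (4,)]
```

One caveat with the NOT IN form: if the subquery can return a NULL customer_id, the condition is never true and no rows come back, which is another reason to prefer NOT EXISTS.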