What is "master join" in SQL

CREATE TABLE car (id INT, name1 CHAR(15))
CREATE TABLE sales (sid INT, cid INT, year1 INT)
INSERT INTO car
VALUES (1, 'vento'), (2, 'vento'), (3, 'baleno'), (4, 'swift')
INSERT INTO sales
VALUES (1, 1, 2017), (2, 3, 2018), (3, 3, 2017), (5, 4, 2017)
--verify
--SELECT * FROM car
--SELECT * FROM sales
--1st query
SELECT sid, cid, year1, name1
FROM sales master
INNER JOIN car ON cid = id
--2nd query
SELECT sid, cid, year1, name1
FROM sales
INNER JOIN car ON cid = id
What is the difference between the 1st and the 2nd query?
What is the purpose of "master join" and when should we use it?

Nothing is different, it is functionally the same.
To explain, I've qualified the column names to demonstrate that master is just an alias.
I'd highly recommend qualifying the columns in the rest of the query as well. The condition ...car ON cid = id works for now, but it isn't good form: if the sales and car tables ever shared a column name, you'd get an ambiguity error.
Also, state the join type explicitly (INNER JOIN, LEFT OUTER JOIN, etc.), because that makes your intent clearer to whoever maintains the query later.
SELECT master.sid
,master.cid
,master.year1
,c.name1
FROM sales master
INNER JOIN car c ON master.cid = c.id --(1st)
SELECT s.sid
,s.cid
,s.year1
,c.name1
FROM sales s
INNER JOIN car c ON s.cid = c.id --(2nd)
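As a quick check of the claim that master is only an alias, here is a minimal sketch that runs both qualified queries and compares the results. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in for SQL Server, and the ORDER BY clauses are an addition so the two result sets can be compared row by row.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car (id INT, name1 CHAR(15));
    CREATE TABLE sales (sid INT, cid INT, year1 INT);
    INSERT INTO car VALUES (1, 'vento'), (2, 'vento'), (3, 'baleno'), (4, 'swift');
    INSERT INTO sales VALUES (1, 1, 2017), (2, 3, 2018), (3, 3, 2017), (5, 4, 2017);
""")

# 1st query: "master" is only an alias for the sales table.
with_master = conn.execute("""
    SELECT master.sid, master.cid, master.year1, c.name1
    FROM sales master
    INNER JOIN car c ON master.cid = c.id
    ORDER BY master.sid
""").fetchall()

# 2nd query: the same join with a different alias.
with_s = conn.execute("""
    SELECT s.sid, s.cid, s.year1, c.name1
    FROM sales s
    INNER JOIN car c ON s.cid = c.id
    ORDER BY s.sid
""").fetchall()

print(with_master == with_s)  # True: the alias name does not change the result
```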

There is no concept called "master join" in SQL.
SELECT
sid, cid, year1, name1
FROM sales master
INNER JOIN car
ON
cid = id
The query above uses master as an alias for the sales table and performs an inner join between the sales table and the car table.
You can refer to cid through the alias (master), as below:
SELECT
sid, cid, year1, name1
FROM sales master
INNER JOIN car
ON
master.cid = id

Related

HAVING clause with subquery -- Checking if group has at least one row matching conditions

Suppose I have the following table
DROP TABLE IF EXISTS #toy_example
CREATE TABLE #toy_example
(
Id int,
Pet varchar(10)
);
INSERT INTO #toy_example
VALUES (1, 'dog'),
(1, 'cat'),
(1, 'emu'),
(2, 'cat'),
(2, 'turtle'),
(2, 'lizard'),
(3, 'dog'),
(4, 'elephant'),
(5, 'cat'),
(5, 'emu')
and I want to fetch all Ids that have certain pets (for example either cat or emu, so Ids 1, 2 and 5).
DROP TABLE IF EXISTS #Pets
CREATE TABLE #Pets
(
Animal varchar(10)
);
INSERT INTO #Pets
VALUES ('cat'),
('emu')
SELECT Id
FROM #toy_example
GROUP BY Id
HAVING COUNT(
CASE
WHEN Pet IN (SELECT Animal FROM #Pets)
THEN 1
END
) > 0
The above gives me the error Cannot perform an aggregate function on an expression containing an aggregate or a subquery. I have two questions:
Why is this an error? If I instead hard code the subquery in the HAVING clause, i.e. WHEN Pet IN ('cat','emu') then this works. Is there a reason why SQL server (I've checked with SQL server 2017 and 2008) does not allow this?
What would be a nice way to do this? Note that the above is just a toy example. The real problem has many possible "Pets", which I do not want to hard code. It would be nice if the suggested method could check for multiple other similar conditions too in a single query.
If I followed you correctly, you can just join and aggregate:
select t.id, count(*) nb_of_matches
from #toy_example t
inner join #pets p on p.animal = t.pet
group by t.id
The inner join eliminates records from #toy_example that have no match in #pets. Then, we aggregate by id and count how many records remain in each group.
If you want to retain records that have no match in #pets and display them with a count of 0, then you can left join instead:
select t.id, count(*) nb_of_records, count(p.animal) nb_of_matches
from #toy_example t
left join #pets p on p.animal = t.pet
group by t.id
How about this approach?
SELECT e.Id
FROM #toy_example e JOIN
#pets p
ON e.pet = p.animal
GROUP BY e.Id
HAVING COUNT(DISTINCT e.pet) = (SELECT COUNT(*) FROM #pets);
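The two join-based approaches above answer slightly different questions, which a small runnable sketch makes visible. It assumes SQLite via Python's sqlite3 module in place of SQL Server, with ordinary tables named toy_example and pets instead of temp tables: a plain join plus GROUP BY keeps Ids with at least one listed pet, while the HAVING COUNT(DISTINCT ...) variant keeps only Ids that have every listed pet.

```python
import sqlite3

# SQLite stand-in for the SQL Server temp tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE toy_example (Id INT, Pet VARCHAR(10));
    CREATE TABLE pets (Animal VARCHAR(10));
    INSERT INTO toy_example VALUES
        (1, 'dog'), (1, 'cat'), (1, 'emu'),
        (2, 'cat'), (2, 'turtle'), (2, 'lizard'),
        (3, 'dog'), (4, 'elephant'),
        (5, 'cat'), (5, 'emu');
    INSERT INTO pets VALUES ('cat'), ('emu');
""")

# Ids with AT LEAST ONE pet from the list: the join drops non-matching rows.
any_match = [row[0] for row in conn.execute("""
    SELECT t.Id
    FROM toy_example t
    JOIN pets p ON p.Animal = t.Pet
    GROUP BY t.Id
    ORDER BY t.Id
""")]

# Ids with EVERY pet from the list: compare distinct matches to the list size.
all_match = [row[0] for row in conn.execute("""
    SELECT t.Id
    FROM toy_example t
    JOIN pets p ON p.Animal = t.Pet
    GROUP BY t.Id
    HAVING COUNT(DISTINCT t.Pet) = (SELECT COUNT(*) FROM pets)
    ORDER BY t.Id
""")]

print(any_match)  # [1, 2, 5]
print(all_match)  # [1, 5]
```

For the question as asked (either cat or emu), the first form is the one that returns Ids 1, 2 and 5.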

SQL Query to satisfy 2 conditions

I'm new to SQL and, after designing the database, I'm having trouble with some queries. The query I'm currently struggling with is:
"A list of the customers who have ordered at least one project with a higher than average expected duration."
SELECT Customer.name
FROM Project, Customer
WHERE Project.c_id = Customer.c_id AND Project.exp_duration > AVG(Project.exp_duration)
I tried this query, but I keep getting the following error message: "An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference."
Can someone help me with this? I've thought about using joins, but I can't get that to work either.
Thanks in advance!
Replace the table variables (@Project & @Customer) with your real tables (Project & Customer).
DECLARE @Project TABLE
(
p_id INT,
exp_duration DECIMAL(18,2),
c_id INT
)
DECLARE @Customer TABLE
(
c_id INT,
name VARCHAR(20)
)
INSERT @Project VALUES (1, 10, 1), (2, 5, 1), (3, 20, 1), (4, 10, 2), (5, 15, 2), (6, 20, 1)
INSERT @Customer VALUES (1, 'C1'), (2, 'C2')
-- average duration
-- SELECT AVG(exp_duration) FROM @Project
SELECT DISTINCT C.name
FROM @Customer C INNER JOIN @Project P ON C.c_id = P.c_id
WHERE P.exp_duration > (SELECT AVG(exp_duration) FROM @Project)
The following query lists the customers who have ordered at least one project whose ExpectedDuration is greater than the average ExpectedDuration.
I have used a left outer join, GROUP BY, and the COUNT and AVG aggregate functions. Note that the subquery is correlated on CustomerID, so each project is compared with that customer's own average.
Select
C.CustomerID,
C.Name
From SampleCustomer C
Left Join SampleProject P
On C.CustomerID = P.CustomerID
Where P.ExpectedDuration > (Select Avg(ExpectedDuration) From SampleProject Where CustomerID = C.CustomerID)
Group By C.CustomerID, C.Name
Having Count(P.ProjectID) >= 1
Order By C.CustomerID;
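The key point in the answers above is that the aggregate moves into a scalar subquery, which is allowed in WHERE. A minimal sketch of the overall-average version follows, assuming SQLite via Python's sqlite3 module rather than SQL Server. The third customer (C3) and project 7 are hypothetical additions, not part of the original sample data, included so that at least one customer is filtered out.

```python
import sqlite3

# SQLite stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Project (p_id INT, exp_duration DECIMAL(18,2), c_id INT);
    CREATE TABLE Customer (c_id INT, name VARCHAR(20));
    INSERT INTO Project VALUES
        (1, 10, 1), (2, 5, 1), (3, 20, 1), (4, 10, 2), (5, 15, 2), (6, 20, 1),
        (7, 5, 3);   -- hypothetical extra project, not in the original data
    INSERT INTO Customer VALUES
        (1, 'C1'), (2, 'C2'),
        (3, 'C3');   -- hypothetical extra customer, not in the original data
""")

# The aggregate is legal here because it lives in its own subquery.
average = conn.execute("SELECT AVG(exp_duration) FROM Project").fetchone()[0]

names = [row[0] for row in conn.execute("""
    SELECT DISTINCT c.name
    FROM Customer c
    INNER JOIN Project p ON p.c_id = c.c_id
    WHERE p.exp_duration > (SELECT AVG(exp_duration) FROM Project)
    ORDER BY c.name
""")]

print(round(average, 2))  # 12.14
print(names)              # ['C1', 'C2']
```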

Join on resultant table of another join without using subquery,CTE or temp tables

My question is can we join a table A to resultant table of inner join of table A and B without using subquery, CTE or temp tables ?
I am using SQL Server.
I will explain the situation with an example
There are two tables, GoalScorers and GoalScoredDetails.
GoalScorers
gid  Name
---------
1    A
2    B
3    A
GoalScoredDetails
DetailId  gid  stadium  goals  Cards
------------------------------------
1         1    X        2      1
2         2    Y        5      2
3         3    Y        2      1
The result I am expecting is if I select a stadium 'X' (or 'Y')
I should get name of all who may or may not have scored there, also aggregate total number of goals,total cards.
Null value is acceptable for names if no goals or no cards.
I can get the result I am expecting with the below query
SELECT
gs.name,
SUM(goal) as TotalGoals,
SUM(cards) as TotalCards
FROM
(SELECT
gid, stadium, goal, cards
FROM
GoalScoredDetails
WHERE
stadium = 'Y') AS vtable
RIGHT OUTER JOIN
GoalScorers AS gs ON vtable.gid = gs.gid
GROUP BY
gs.name
My question is can we get the above result without using a subquery or CTE or temp table ?
Basically what we need to do is OUTER JOIN GoalScorers to resultant virtual table of INNER JOIN OF GoalScorers and GoalScoredDetails.
But I am always faced with ambiguous column name error as "gid" column is present in GoalScorers and also in resultant table. Error persists even if I try to use alias for column names.
I have created a SQL Fiddle for this here: http://sqlfiddle.com/#!3/40162/8
SELECT gs.name, SUM(gsd.goal) AS totalGoals, SUM(gsd.cards) AS totalCards
FROM GoalScorers gs
LEFT JOIN GoalScoredDetails gsd ON gsd.gid = gs.gid AND
gsd.Stadium = 'Y'
GROUP BY gs.name;
In other words, you can move the WHERE criteria into the join condition.
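A minimal runnable sketch of that idea, assuming SQLite via Python's sqlite3 module in place of SQL Server. The column is named goals here, following the table listing in the question (the queries in this thread write goal), and totals_for is a hypothetical helper introduced only for illustration.

```python
import sqlite3

# SQLite stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GoalScorers (gid INT, Name VARCHAR(1));
    CREATE TABLE GoalScoredDetails
        (DetailId INT, gid INT, stadium VARCHAR(1), goals INT, Cards INT);
    INSERT INTO GoalScorers VALUES (1, 'A'), (2, 'B'), (3, 'A');
    INSERT INTO GoalScoredDetails VALUES
        (1, 1, 'X', 2, 1), (2, 2, 'Y', 5, 2), (3, 3, 'Y', 2, 1);
""")

def totals_for(stadium):
    """Hypothetical helper: per-name totals for one stadium, keeping all names."""
    return conn.execute("""
        SELECT gs.Name, SUM(gsd.goals), SUM(gsd.Cards)
        FROM GoalScorers gs
        LEFT JOIN GoalScoredDetails gsd
               ON gsd.gid = gs.gid
              AND gsd.stadium = ?      -- filter lives in the join, not in WHERE
        GROUP BY gs.Name
        ORDER BY gs.Name
    """, (stadium,)).fetchall()

print(totals_for('Y'))  # [('A', 2, 1), ('B', 5, 2)]
print(totals_for('X'))  # [('A', 2, 1), ('B', None, None)]
```

Because the stadium filter is part of the ON clause, scorers without a matching row are kept and their sums come back as NULL (None in Python).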
The error Ambiguous column name 'ColumnName' occurs when SQL Server encounters two or more columns with the same name and it hasn't been told which one to use. You can avoid the error by prefixing your column names with either the full table name or an alias, if one is provided. For the examples below, use the following data:
Sample Data
DECLARE @GoalScorers TABLE
(
gid INT,
Name VARCHAR(1)
)
;
DECLARE @GoalScoredDetails TABLE
(
DetailId INT,
gid INT,
stadium VARCHAR(1),
goals INT,
Cards INT
)
;
INSERT INTO @GoalScorers
(
gid,
Name
)
VALUES
(1, 'A'),
(2, 'B'),
(3, 'A')
;
INSERT INTO @GoalScoredDetails
(
DetailId,
gid,
stadium,
goals,
Cards
)
VALUES
(1, 1, 'x', 2, 1),
(2, 2, 'y', 5, 2),
(3, 3, 'y', 2, 1)
;
In this first example we receive the error. Why? Because there is more than one column called gid, SQL Server cannot tell which one to use.
Failed Example
SELECT
gid
FROM
@GoalScoredDetails AS gsd
RIGHT OUTER JOIN @GoalScorers AS gs ON gs.gid = gsd.gid
;
This example works because we explicitly tell SQL Server which gid to return:
Working Example
SELECT
gs.gid
FROM
@GoalScoredDetails AS gsd
RIGHT OUTER JOIN @GoalScorers AS gs ON gs.gid = gsd.gid
;
You can, of course, return both:
Example
SELECT
gs.gid,
gsd.gid
FROM
@GoalScoredDetails AS gsd
RIGHT OUTER JOIN @GoalScorers AS gs ON gs.gid = gsd.gid
;
In multi table queries I would always recommend prefixing every column name with a table/alias name. This makes the query easier to follow, and reduces the likelihood of this sort of error.
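The same failure and fix can be reproduced in a small sketch. It assumes SQLite via Python's sqlite3 module, so the error text differs from SQL Server's, and it uses a LEFT JOIN with the tables swapped instead of a RIGHT OUTER JOIN, since older SQLite versions do not support the latter.

```python
import sqlite3

# SQLite stand-in for SQL Server (assumption); error wording will differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE GoalScorers (gid INT, Name VARCHAR(1));
    CREATE TABLE GoalScoredDetails
        (DetailId INT, gid INT, stadium VARCHAR(1), goals INT, Cards INT);
    INSERT INTO GoalScorers VALUES (1, 'A'), (2, 'B'), (3, 'A');
    INSERT INTO GoalScoredDetails VALUES
        (1, 1, 'x', 2, 1), (2, 2, 'y', 5, 2), (3, 3, 'y', 2, 1);
""")

# Unqualified gid: both tables have that column, so the statement is rejected.
try:
    conn.execute("""
        SELECT gid
        FROM GoalScorers gs
        LEFT JOIN GoalScoredDetails gsd ON gsd.gid = gs.gid
    """)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Qualified gs.gid: the engine knows which column is meant.
gids = [row[0] for row in conn.execute("""
    SELECT gs.gid
    FROM GoalScorers gs
    LEFT JOIN GoalScoredDetails gsd ON gsd.gid = gs.gid
    ORDER BY gs.gid
""")]

print(error)
print(gids)  # [1, 2, 3]
```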

How to ensure outer join with filter still returns all desired rows?

Imagine I have two tables in a DB like so:
products:
product_id  name
----------------
1           Hat
2           Gloves
3           Shoes
sales:
product_id  store_id  sales
---------------------------
1           1         20
2           2         10
Now I want to do a query to list ALL products, and their sales, for store_id = 1. My first crack at it would be to use a left join, and filter to the store_id I want, or a null store_id, in case the product didn't get any sales at store_id = 1, since I want all the products listed:
SELECT name, coalesce(sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id
WHERE store_id = 1 or store_id is null;
Of course, this doesn't work as intended, instead I get:
name   sales
------------
Hat    20
Shoes  0
No Gloves! This is because Gloves did get sales, just not at store_id = 1, so the WHERE clause has filtered them out.
How then can I get a list of ALL products and their sales for a specific store?
Here are some queries to create the test tables:
create temp table test_products as
select 1 as product_id, 'Hat' as name;
insert into test_products values (2, 'Gloves');
insert into test_products values (3, 'Shoes');
create temp table test_sales as
select 1 as product_id, 1 as store_id, 20 as sales;
insert into test_sales values (2, 2, 10);
UPDATE: I should note that I am aware of this solution:
SELECT name, case when store_id = 1 then sales else 0 end as sales
FROM test_products p
LEFT JOIN test_sales s ON p.product_id = s.product_id;
however, it is not ideal... in reality I need to create this query for a BI tool in such a way that the tool can simply add a where clause to the query and get the desired results. Inserting the required store_id into the correct place in this query is not supported by this tool. So I'm looking for other options, if there are any.
Move the WHERE condition into the ON clause of the LEFT JOIN so that no rows go missing.
SELECT p.name, coalesce(s.sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id
AND s.store_id = 1;
Edit for additional request:
I assume you can manipulate the SELECT items? Then this should do the job:
SELECT p.name
,CASE WHEN s.store_id = 1 THEN coalesce(s.sales, 0) ELSE NULL END AS sales
FROM products p
LEFT JOIN sales s USING (product_id)
Also simplified the join syntax in this case.
I'm not near SQL, but give this a shot:
SELECT name, coalesce(sales, 0)
FROM products p
LEFT JOIN sales s ON p.product_id = s.product_id AND store_id = 1
You don't want a WHERE on the whole query, just a condition on your join.
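To see the difference between the two placements side by side, here is a minimal sketch using the question's data. It assumes SQLite via Python's sqlite3 module, and the ORDER BY is an addition to make the output order deterministic.

```python
import sqlite3

# SQLite stand-in for the asker's database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INT, name TEXT);
    CREATE TABLE sales (product_id INT, store_id INT, sales INT);
    INSERT INTO products VALUES (1, 'Hat'), (2, 'Gloves'), (3, 'Shoes');
    INSERT INTO sales VALUES (1, 1, 20), (2, 2, 10);
""")

# Filter in WHERE: applied after the join, so Gloves (sold only at store 2)
# is removed from the result.
where_rows = conn.execute("""
    SELECT p.name, COALESCE(s.sales, 0)
    FROM products p
    LEFT JOIN sales s ON p.product_id = s.product_id
    WHERE s.store_id = 1 OR s.store_id IS NULL
    ORDER BY p.product_id
""").fetchall()

# Filter in ON: applied while matching, so every product survives and
# unmatched products get NULL sales, turned into 0 by COALESCE.
on_rows = conn.execute("""
    SELECT p.name, COALESCE(s.sales, 0)
    FROM products p
    LEFT JOIN sales s ON p.product_id = s.product_id AND s.store_id = 1
    ORDER BY p.product_id
""").fetchall()

print(where_rows)  # [('Hat', 20), ('Shoes', 0)]
print(on_rows)     # [('Hat', 20), ('Gloves', 0), ('Shoes', 0)]
```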

filtering records based on n:m table criteria

I have table product
ID int
name nvarchar()
dummy data: (1,'car'), (2,'bike')
I have table parameters
ID int
name nvarchar()
dummy data: (1,'abs'), (2,'audio'), (3,'eps'), (4,'air conditioning')
and finally I have n:m table product_parameters which holds information about parameters for product.
ID int
id_product int
id_parameter int
dummy data:
(id,product,parameter)
(1, 1, 1),
(2, 1, 2),
(3, 1, 3),
(4, 2, 1)
How do I create a select which:
shows everything if no parameters are defined in the search
shows car and bike when searching for abs, because they both have that parameter
shows only car when searching for abs, eps and audio
Is it possible?
UPDATE
I created only 3 parameters but think of it like unlimited number, whether 10, 20 or 30 or more ... basically is there a way how to build select in such a way that it will query one parameter if needed or 20 parameters if needed.
First:
SELECT * FROM product PR
JOIN product_parameters PP ON PR.ID=PP.id_product
JOIN parameters PA ON PP.id_parameter=PA.ID
Second:
SELECT PR.* FROM product PR
JOIN product_parameters PP ON PR.ID=PP.id_product
JOIN parameters PA ON PP.id_parameter=PA.ID
WHERE PA.name = 'abs'
Third:
SELECT PR.* FROM product PR
WHERE EXISTS (SELECT * FROM product_parameters PP JOIN parameters PA ON PP.id_parameter=PA.ID WHERE PP.id_product=PR.ID AND PA.name='abs')
AND EXISTS (SELECT * FROM product_parameters PP JOIN parameters PA ON PP.id_parameter=PA.ID WHERE PP.id_product=PR.ID AND PA.name='eps')
AND EXISTS (SELECT * FROM product_parameters PP JOIN parameters PA ON PP.id_parameter=PA.ID WHERE PP.id_product=PR.ID AND PA.name='audio')
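For the UPDATE in the question (an arbitrary number of parameters), one possible alternative to repeating EXISTS is to filter with IN and require that the number of distinct matches equals the number of requested parameters. The sketch below is not the answer's method; it assumes SQLite via Python's sqlite3 module, and products_with_all is a hypothetical helper name.

```python
import sqlite3

# SQLite stand-in for the asker's database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (ID INT, name TEXT);
    CREATE TABLE parameters (ID INT, name TEXT);
    CREATE TABLE product_parameters (ID INT, id_product INT, id_parameter INT);
    INSERT INTO product VALUES (1, 'car'), (2, 'bike');
    INSERT INTO parameters VALUES
        (1, 'abs'), (2, 'audio'), (3, 'eps'), (4, 'air conditioning');
    INSERT INTO product_parameters VALUES
        (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 1);
""")

def products_with_all(conn, wanted):
    """Hypothetical helper: names of products having every parameter in wanted."""
    wanted = sorted(set(wanted))          # duplicates would break the count
    if not wanted:                        # no search criteria: show everything
        return [r[0] for r in conn.execute("SELECT name FROM product ORDER BY ID")]
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT pr.name
        FROM product pr
        JOIN product_parameters pp ON pp.id_product = pr.ID
        JOIN parameters pa ON pa.ID = pp.id_parameter
        WHERE pa.name IN ({placeholders})
        GROUP BY pr.ID, pr.name
        HAVING COUNT(DISTINCT pa.name) = ?
        ORDER BY pr.ID
    """
    return [r[0] for r in conn.execute(sql, (*wanted, len(wanted)))]

print(products_with_all(conn, []))                       # ['car', 'bike']
print(products_with_all(conn, ['abs']))                  # ['car', 'bike']
print(products_with_all(conn, ['abs', 'eps', 'audio']))  # ['car']
```

Only the placeholder list is built dynamically; the parameter values themselves are still passed as bound parameters, so the query shape stays the same for 1 or 20 parameters.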