How to self join with NULLs in PostgreSQL

I want to show the names of the customers along with the names of those who referred them. I thought I was on the right track, but the result is wrong. I tried adding ReferredBy IS NOT NULL to the WHERE clause and to the ON clause, with no luck.
I would appreciate any help!
CREATE TABLE Customers(
Id int NOT NULL,
Name varchar(50) NOT NULL,
ReferredBy int REFERENCES Customers(Id),
PRIMARY KEY (Id)
);
INSERT INTO Customers VALUES
(11, 'Peter', 22),
(22, 'Ariel', NULL),
(33, 'Tom', 11);
My approach:
SELECT c.Id, c.Name, c.ReferredBy, n.Name as ReferredBy_name
FROM Customers c
LEFT JOIN Customers n
ON c.Id = n.ReferredBy
Desired Output:

I think the JOIN condition has the tables inverted. The alias n should be the referrer, so its Id must match the customer's ReferredBy:
SELECT c.Id, c.Name, c.ReferredBy, n.Name as ReferredBy_name
FROM Customers c LEFT JOIN
Customers n
ON n.Id = c.ReferredBy ;
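To check the corrected join, here is a minimal sketch that runs it through Python's sqlite3 module. Using an in-memory SQLite database instead of PostgreSQL is an assumption made for portability, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        Id         INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        ReferredBy INTEGER REFERENCES Customers(Id)
    );
    INSERT INTO Customers VALUES
        (11, 'Peter', 22),
        (22, 'Ariel', NULL),
        (33, 'Tom',   11);
""")

# c is the customer, n is the referrer: match n.Id to c.ReferredBy.
# LEFT JOIN keeps customers whose ReferredBy is NULL.
referrals = conn.execute("""
    SELECT c.Id, c.Name, c.ReferredBy, n.Name AS ReferredBy_name
    FROM Customers c
    LEFT JOIN Customers n ON n.Id = c.ReferredBy
    ORDER BY c.Id
""").fetchall()

for row in referrals:
    print(row)
```

Ariel has no referrer, so the LEFT JOIN keeps her row and fills ReferredBy_name with NULL (None in Python).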

Related

How many elements in one column are linked to an element other column?

Consider I have two tables
Courses          Program
---------------------------
course_ID        program_id
course_title     program_name
program_ID
Now, I want to count the number of courses (by course_id) offered by each program (program_id).
select c.program_id ,p.program_name, count(course_id)
from courses c
join Program p on c.Program_id =p.Program_id
group by c.program_id, p.program_name
If I understood you correctly, you're searching for a GROUP BY and a corresponding aggregate.
--Creating sample tables and data
SELECT course_ID, course_title, program_ID
INTO #courses
FROM (
VALUES (0, 'course_0', 0),
(1, 'course_1', 0),
(2, 'course_2', 0),
(3, 'course_3', 0),
(4, 'course_4', 1),
(5, 'course_5', 1),
(NULL, 'course_6', 1)
) AS C (course_ID, course_title, program_ID)
SELECT program_ID, program_title
INTO #programs
FROM (
VALUES (0, 'program_0'),
(1, 'program_1')
) AS P (program_ID, program_title)
and after that execute the query
SELECT P.program_title, COUNT(C.course_ID) AS courses_amount
FROM #courses C
INNER JOIN #programs P ON C.program_ID = P.program_ID
GROUP BY P.program_ID, P.program_title
So you basically GROUP BY the value you want to aggregate on and COUNT the 'course_id'.
COUNT(C.course_ID) only counts actual values and will ignore NULLs.
If you want to count the NULLs as well, just use COUNT(*).
EDIT: Forgot the result...
So it'll look like this:
program_title  courses_amount
program_0      4
program_1      2
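The difference between COUNT(column) and COUNT(*) can be seen side by side in a small sketch. It assumes SQLite via Python's sqlite3 module in place of SQL Server, and ordinary tables named courses and programs in place of the #temp tables. The sample data is the same as above.

```python
import sqlite3

# SQLite in memory replaces SQL Server's #temp tables (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE courses  (course_ID INTEGER, course_title TEXT, program_ID INTEGER);
    CREATE TABLE programs (program_ID INTEGER PRIMARY KEY, program_title TEXT);
    INSERT INTO courses VALUES
        (0, 'course_0', 0), (1, 'course_1', 0), (2, 'course_2', 0),
        (3, 'course_3', 0), (4, 'course_4', 1), (5, 'course_5', 1),
        (NULL, 'course_6', 1);
    INSERT INTO programs VALUES (0, 'program_0'), (1, 'program_1');
""")

# COUNT(C.course_ID) skips rows where course_ID is NULL; COUNT(*) counts every row.
counts = conn.execute("""
    SELECT P.program_title,
           COUNT(C.course_ID) AS non_null_ids,
           COUNT(*)           AS all_rows
    FROM courses C
    INNER JOIN programs P ON C.program_ID = P.program_ID
    GROUP BY P.program_ID, P.program_title
    ORDER BY P.program_ID
""").fetchall()

for row in counts:
    print(row)
```

For program_1 the two counts differ because course_6 has a NULL course_ID.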

How to count number of associated records, when association has composite primary key?

Given a customers table, and an associated orders table, I want to count the number of orders made per customer.
However, the orders table has a composite primary key. Here are the schemas and the test data:
CREATE TABLE customers (
name TEXT NOT NULL PRIMARY KEY
);
INSERT INTO customers VALUES ('Joe');
INSERT INTO customers VALUES ('Jane');
CREATE TABLE orders (
customer_name TEXT NOT NULL REFERENCES customers (name),
section_id INT NOT NULL,
item_id INT NOT NULL,
PRIMARY KEY (customer_name, section_id, item_id)
);
INSERT INTO orders VALUES ('Joe', 1, 100);
INSERT INTO orders VALUES ('Joe', 1, 101);
INSERT INTO orders VALUES ('Joe', 2, 110);
There are two customers: Joe (with 3 orders) and Jane (with no orders).
This query...
SELECT customers.*,
COUNT((orders.section_id, orders.item_id)) AS num_orders
FROM customers
LEFT JOIN orders
ON customers.name = orders.customer_name
GROUP BY customers.name;
...correctly counts that Joe has 3 orders, but incorrectly counts that Jane has 1 order (she actually has 0).
That's because COUNT((orders.section_id, orders.item_id)) produces 1. Even though section_id and item_id are both NULL, the tuple expression (NULL, NULL) is not considered NULL.
How do I properly query the orders count in the face of composite primary keys?
I'm not sure why you want to count on both fields, section_id and item_id...
I'd suggest a simple experiment:
SELECT (NULL, NULL) result
-- produces: (,)
So when you pass this row value to COUNT, it counts 1, because COUNT treats the row value itself as non-NULL.
SELECT COALESCE(NULL, NULL) result
-- produces: NULL
So when you pass this expression to COUNT, it counts 0, because the value is NULL.
In my opinion, it's enough to use COUNT(item_id) or COUNT(section_id): both columns are declared NOT NULL, so they can be NULL only when the LEFT JOIN found no matching order. So...
SELECT c.*,
COUNT(o.section_id) AS num_orders
FROM customers c
LEFT JOIN orders o
ON c.name = o.customer_name
GROUP BY c.name;
Unless... you want to count the distinct sections and items ordered by each customer.
SELECT c.*,
COUNT(DISTINCT o.section_id) AS num_sections,
COUNT(DISTINCT o.item_id) AS num_items
FROM customers c
LEFT JOIN orders o
ON c.name = o.customer_name
GROUP BY c.name;
Count orders.customer_name: that's the column that's guaranteed not to be NULL whenever there was a join partner row, as it's used in the ON clause with =.
SELECT customers.*,
count(orders.customer_name) AS num_orders
FROM customers
LEFT JOIN orders
ON customers.name = orders.customer_name
GROUP BY customers.name;
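A small sketch of this approach, assuming SQLite via Python's sqlite3 module instead of PostgreSQL. SQLite does not accept a row value such as (section_id, item_id) inside COUNT, so the sketch compares COUNT(o.customer_name) with COUNT(*) to show how a customer without orders is handled.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT NOT NULL PRIMARY KEY);
    CREATE TABLE orders (
        customer_name TEXT NOT NULL REFERENCES customers (name),
        section_id    INT  NOT NULL,
        item_id       INT  NOT NULL,
        PRIMARY KEY (customer_name, section_id, item_id)
    );
    INSERT INTO customers VALUES ('Joe'), ('Jane');
    INSERT INTO orders VALUES ('Joe', 1, 100), ('Joe', 1, 101), ('Joe', 2, 110);
""")

# For Jane the LEFT JOIN yields one row whose order columns are all NULL:
# COUNT(o.customer_name) ignores it, COUNT(*) counts it.
order_counts = conn.execute("""
    SELECT c.name,
           COUNT(o.customer_name) AS num_orders,
           COUNT(*)               AS joined_rows
    FROM customers c
    LEFT JOIN orders o ON c.name = o.customer_name
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

for row in order_counts:
    print(row)
```

Jane's num_orders is 0 while joined_rows is 1, which is the same off-by-one the row-value COUNT produced in the question.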

How do I find the person who has taught the most classes

I want to find the employee who has taught the most classes in the position Teacher. In this example I want to print Nick, as he has taught the most classes as a Teacher.
However, I am getting the error:
ERROR: column "e.name" must appear in the GROUP BY clause or be used in an aggregate function Position: 24
CREATE TABLE employees (
id integer primary key,
name text
);
CREATE TABLE positions (
id integer primary key,
name text
);
CREATE TABLE teaches (
id integer primary key,
class text,
employee integer,
position integer,
foreign key (employee) references employees(id),
foreign key (position) references positions(id)
);
INSERT INTO employees (id, name) VALUES
(1, 'Clive'), (2, 'Johnny'), (3, 'Sam'), (4, 'Nick');
INSERT INTO positions (id, name) VALUES
(1, 'Assistant'), (2, 'Teacher'), (3, 'CEO'), (4, 'Manager');
INSERT INTO teaches (id, class, employee, position) VALUES
(1, 'Dancing', 1, 1), (2, 'Gardening', 1, 2),
(3, 'Dancing', 1, 2), (4, 'Baking', 4, 2),
(5, 'Gardening', 4, 2), (6, 'Gardening', 4, 2),
(7, 'Baseball', 4, 1), (8, 'Baseball', 2, 1),
(9, 'Baseball', 4, 2);
The SQL statement I am trying to use:
SELECT count(t.class), e.name
FROM positions p
JOIN teaches t
ON p.id = t.position
JOIN employees e
ON e.id = t.employee
WHERE p.name = 'Teacher'
GROUP BY t.employee;
I've been working on this on a sql fiddle:
http://www.sqlfiddle.com/#!17/a8e19c/3
Your query looks pretty good. You just need to fix the GROUP BY clause, so it is consistent with the columns in the SELECT clause. Then ORDER BY and LIMIT:
SELECT count(*) cnt_classes, e.name
FROM positions p
INNER JOIN teaches t ON p.id = t.position
INNER JOIN employees e ON e.id = t.employee
WHERE p.name = 'Teacher'
GROUP BY e.id --> primary key of "employees"
ORDER BY cnt_classes DESC --> order by descending count of classes
LIMIT 1 --> keep the first row only
In your SELECT you use the aggregate COUNT, which counts all rows in each group (GROUP BY t.employee), but you don't aggregate e.name.
For Nick you basically select 4 rows, one per class, each with two columns: class name and teacher name. Then you ask the server to count the class names in Nick's group (by his employee id), which aggregates the 4 rows into one with the value 4. But you don't do anything about the teacher name, so you are left with an invalid structure: 1 row for the class count column and 4 rows for the teacher name. The same goes for the other teachers, and that's what the server is complaining about. The easiest way to fix it is to add e.name to the GROUP BY, which squeezes those 4 rows with the same value into one.
To get the teacher who teaches the most classes, you then only need to sort the results by class count in descending order and limit the result to 1 row. That gives you the row with the highest class count.
Updated fiddle: http://www.sqlfiddle.com/#!17/a8e19c/7
You're getting the error because every non-aggregated column you select (e.name in this example) must appear in the GROUP BY clause; otherwise the database doesn't know which value to return for that column within each group. To return the person with the most classes, also add ORDER BY and, since this is PostgreSQL, LIMIT 1 (TOP(1) is SQL Server syntax):
SELECT count(*), e.name
FROM teaches t
INNER JOIN positions p ON t.position = p.id
INNER JOIN employees e ON e.id = t.employee
WHERE p.name = 'Teacher'
GROUP BY e.name
ORDER BY count(*) DESC
LIMIT 1;
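The GROUP BY, ORDER BY and LIMIT combination described in these answers can be tried end to end with the sketch below. It assumes SQLite via Python's sqlite3 module as a stand-in for PostgreSQL, and it groups by both e.id and e.name so the query is valid regardless of how strictly the engine checks GROUP BY.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE positions (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE teaches (
        id       INTEGER PRIMARY KEY,
        class    TEXT,
        employee INTEGER REFERENCES employees(id),
        position INTEGER REFERENCES positions(id)
    );
    INSERT INTO employees VALUES (1, 'Clive'), (2, 'Johnny'), (3, 'Sam'), (4, 'Nick');
    INSERT INTO positions VALUES (1, 'Assistant'), (2, 'Teacher'), (3, 'CEO'), (4, 'Manager');
    INSERT INTO teaches VALUES
        (1, 'Dancing', 1, 1),   (2, 'Gardening', 1, 2),
        (3, 'Dancing', 1, 2),   (4, 'Baking', 4, 2),
        (5, 'Gardening', 4, 2), (6, 'Gardening', 4, 2),
        (7, 'Baseball', 4, 1),  (8, 'Baseball', 2, 1),
        (9, 'Baseball', 4, 2);
""")

# Count classes per employee in the Teacher position, then keep the top row.
top_teacher = conn.execute("""
    SELECT e.name, COUNT(*) AS cnt_classes
    FROM positions p
    INNER JOIN teaches t   ON p.id = t.position
    INNER JOIN employees e ON e.id = t.employee
    WHERE p.name = 'Teacher'
    GROUP BY e.id, e.name
    ORDER BY cnt_classes DESC
    LIMIT 1
""").fetchone()

print(top_teacher)
```

Note that LIMIT 1 returns a single row even when several employees tie for the highest count.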

HAVING clause with subquery -- Checking if group has at least one row matching conditions

Suppose I have the following table
DROP TABLE IF EXISTS #toy_example
CREATE TABLE #toy_example
(
Id int,
Pet varchar(10)
);
INSERT INTO #toy_example
VALUES (1, 'dog'),
(1, 'cat'),
(1, 'emu'),
(2, 'cat'),
(2, 'turtle'),
(2, 'lizard'),
(3, 'dog'),
(4, 'elephant'),
(5, 'cat'),
(5, 'emu')
and I want to fetch all Ids that have certain pets (for example either cat or emu, so Ids 1, 2 and 5).
DROP TABLE IF EXISTS #Pets
CREATE TABLE #Pets
(
Animal varchar(10)
);
INSERT INTO #Pets
VALUES ('cat'),
('emu')
SELECT Id
FROM #toy_example
GROUP BY Id
HAVING COUNT(
CASE
WHEN Pet IN (SELECT Animal FROM #Pets)
THEN 1
END
) > 0
The above gives me the error Cannot perform an aggregate function on an expression containing an aggregate or a subquery. I have two questions:
1. Why is this an error? If I instead hard code the subquery in the HAVING clause, i.e. WHEN Pet IN ('cat','emu'), then this works. Is there a reason why SQL Server (I've checked with SQL Server 2017 and 2008) does not allow this?
2. What would be a nice way to do this? Note that the above is just a toy example. The real problem has many possible "Pets", which I do not want to hard code. It would be nice if the suggested method could check for multiple other similar conditions too in a single query.
If I followed you correctly, you can just join and aggregate:
select t.id, count(*) nb_of_matches
from #toy_example t
inner join #pets p on p.animal = t.pet
group by t.id
The inner join eliminates records from #toy_example that have no match in #pets. Then, we aggregate by id and count how many records remain in each group.
If you want to retain records that have no match in #pets and display them with a count of 0, then you can left join instead:
select t.id, count(*) nb_of_records, count(p.animal) nb_of_matches
from #toy_example t
left join #pets p on p.animal = t.pet
group by t.id
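How about this approach? Note that it returns only the Ids that have every animal listed in #Pets (here 1 and 5); for Ids with at least one match, drop the HAVING clause.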
SELECT e.Id
FROM #toy_example e JOIN
#pets p
ON e.pet = p.animal
GROUP BY e.Id
HAVING COUNT(DISTINCT e.pet) = (SELECT COUNT(*) FROM #pets);
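The two readings of the requirement, "has at least one listed pet" and "has every listed pet", can be compared with the sketch below. It assumes SQLite via Python's sqlite3 module in place of SQL Server, and ordinary tables named toy_example and pets in place of the #temp tables.

```python
import sqlite3

# SQLite in memory replaces SQL Server's #temp tables (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE toy_example (Id INTEGER, Pet TEXT);
    CREATE TABLE pets (Animal TEXT);
    INSERT INTO toy_example VALUES
        (1, 'dog'), (1, 'cat'), (1, 'emu'),
        (2, 'cat'), (2, 'turtle'), (2, 'lizard'),
        (3, 'dog'), (4, 'elephant'),
        (5, 'cat'), (5, 'emu');
    INSERT INTO pets VALUES ('cat'), ('emu');
""")

# Ids with at least one pet from the list: the join alone filters them.
any_match = [row[0] for row in conn.execute("""
    SELECT e.Id
    FROM toy_example e
    JOIN pets p ON e.Pet = p.Animal
    GROUP BY e.Id
    ORDER BY e.Id
""")]

# Ids with every pet from the list: compare distinct matches to the list size.
all_match = [row[0] for row in conn.execute("""
    SELECT e.Id
    FROM toy_example e
    JOIN pets p ON e.Pet = p.Animal
    GROUP BY e.Id
    HAVING COUNT(DISTINCT e.Pet) = (SELECT COUNT(*) FROM pets)
    ORDER BY e.Id
""")]

print(any_match, all_match)
```

Because the pet list lives in a table, neither query needs hard-coded values, and the join sidesteps the restriction on subqueries inside an aggregate.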

SQL Query to satisfy 2 conditions

I'm new to SQL and, after designing the database, I'm having trouble with some queries. The query I'm currently struggling with states:
"A list of the customers who have ordered at least one project with a higher than average expected duration."
SELECT Customer.name
FROM Project, Customer
WHERE Project.c_id = Customer.c_id AND Project.exp_duration > AVG(Project.exp_duration)
I tried to implement this code but I keep getting the following error message: "An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference."
Can someone help me with this? I've thought about using joins but I can't get it to work either.
Thanks in advance!
Replace the table variables (@Project & @Customer) with your real tables (Project & Customer).
DECLARE @Project TABLE
(
p_id INT,
exp_duration DECIMAL(18,2),
c_id INT
)
DECLARE @Customer TABLE
(
c_id INT,
name VARCHAR(20)
)
INSERT @Project VALUES (1, 10, 1), (2, 5, 1), (3, 20, 1), (4, 10, 2), (5, 15, 2), (6, 20, 1)
INSERT @Customer VALUES (1, 'C1'), (2, 'C2')
-- average duration
-- SELECT AVG(exp_duration) FROM @Project
-- compare each project against the overall average via a scalar subquery
SELECT DISTINCT C.name
FROM @Customer C INNER JOIN @Project P ON C.c_id = P.c_id
WHERE P.exp_duration > (SELECT AVG(exp_duration) FROM @Project)
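The scalar-subquery pattern can be run as a minimal sketch, assuming SQLite via Python's sqlite3 module in place of SQL Server and ordinary tables named Project and Customer in place of the table variables. The sample rows are the ones from the answer above.

```python
import sqlite3

# SQLite in memory replaces SQL Server table variables (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (c_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Project  (p_id INTEGER PRIMARY KEY, exp_duration REAL, c_id INTEGER);
    INSERT INTO Customer VALUES (1, 'C1'), (2, 'C2');
    INSERT INTO Project VALUES
        (1, 10, 1), (2, 5, 1), (3, 20, 1),
        (4, 10, 2), (5, 15, 2), (6, 20, 1);
""")

# The aggregate cannot sit directly in WHERE, so it moves into a scalar subquery.
overall_avg = conn.execute("SELECT AVG(exp_duration) FROM Project").fetchone()[0]

names = [row[0] for row in conn.execute("""
    SELECT DISTINCT C.name
    FROM Customer C
    INNER JOIN Project P ON C.c_id = P.c_id
    WHERE P.exp_duration > (SELECT AVG(exp_duration) FROM Project)
    ORDER BY C.name
""")]

print(round(overall_avg, 2), names)
```

DISTINCT matters here because customer C1 has two projects above the average and would otherwise be listed twice.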
The following query lists the customers who have ordered at least one project whose ExpectedDuration is greater than the average ExpectedDuration of that customer's own projects (the subquery is correlated on CustomerID, so the average is computed per customer rather than over all projects).
It uses a left outer join, GROUP BY, and the COUNT and AVG aggregate functions.
Select
C.CustomerID,
C.Name
From SampleCustomer C
Left Join SampleProject P
On C.CustomerID = P.CustomerID
Where P.ExpectedDuration > (Select Avg(ExpectedDuration) From SampleProject Where CustomerID = C.CustomerID)
Group By C.CustomerID, C.Name
Having Count(P.ProjectID) >= 1
Order By C.CustomerID;