How to group count and join in sequel? - sql

I've looked through all the documentation and I'm having an issue putting together this query in Sequel.
select a.*, IFNULL(b.cnt, 0) as cnt FROM a LEFT OUTER JOIN (select a_id, count(*) as cnt from b group by a_id) as b ON b.a_id = a.id ORDER BY cnt
Think of table A as products and table B as records indicating that a product from A was purchased.
So far I have:
A.left_outer_join(B.group_and_count(:a_id), a_id: :id).order(:count)
Essentially I just want to group and count table B, join it with A, but since B does not necessarily have any records for A and I'm ordering it by the number in B, I need to default a value.

DB[:a].
left_outer_join(DB[:b].group_and_count(:a_id).as(:b), :a_id=>:id).
order(:cnt).
select_all(:a).
select_more{IFNULL(:b__count, 0).as(:cnt)} # group_and_count aliases its count column as "count"
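To check the shape of the target SQL independently of Sequel, here is a minimal sketch that runs the question's query against an in-memory SQLite database from Python. The tables a and b follow the question; the name column and the sample rows are made up for illustration, and SQLite's IFNULL is assumed to behave like the one in the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);      -- products (name is hypothetical)
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);   -- purchase records
    INSERT INTO a (id, name) VALUES (1, 'tea'), (2, 'coffee'), (3, 'cocoa');
    INSERT INTO b (a_id) VALUES (1), (1), (1), (2);
""")

# Group and count b first, then left join so products without purchases survive.
rows = conn.execute("""
    SELECT a.id, a.name, IFNULL(b.cnt, 0) AS cnt
    FROM a
    LEFT OUTER JOIN (
        SELECT a_id, COUNT(*) AS cnt FROM b GROUP BY a_id
    ) AS b ON b.a_id = a.id
    ORDER BY cnt, a.id
""").fetchall()

for row in rows:
    print(row)
```

The product with no purchases sorts first with a count of 0 rather than NULL, which is the default value the question asks for.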

I can help you in MS SQL syntax.
Let's say your tables are Product and Order.
CREATE TABLE Product (
Id INT NOT NULL,
NAME VARCHAR(100) NOT NULL)
CREATE TABLE [Order] (
Id INT NOT NULL,
ProductId INT)
INSERT INTO Product (Id, Name) VALUES
(1, 'Tea'), (2, 'Coffee'), (3, 'Hot Chocolate')
INSERT INTO [Order] (Id, ProductId) VALUES
(1, 1), (2, 1), (3, 1), (4, 2)
This query will give the number of orders each product has, including ones without any orders.
SELECT p.Id AS ProductId,
p.Name AS ProductName,
COUNT(o.Id) AS Orders
FROM Product p
LEFT OUTER JOIN [Order] o
ON p.Id = o.ProductId
GROUP BY
p.Id,
p.Name
ORDER BY
COUNT(o.Id) DESC

Related

How to count the number of times each value of an attribute from one table, appears in another table? And if there is no appearance return zero

If I have a CUSTOMER table with the attribute customer_id and an ORDER table
with the attributes order_id and customer_id.
How do I find the total number of orders submitted by each customer and if a customer has none, return zero.
I have tried the following:
SELECT c.customer_id, COUNT(*)
FROM Customer c, Orders o
WHERE c.customer_id= o.customer_id
GROUP BY c.customer_id;
With the above, I am able to display the number of orders made by each customer, only if they made an order.
How do I also display count 0 for those customers who did not make any order?
Use an outer join and count the rows in the "outer" table:
SELECT c.customer_id, COUNT(o.customer_id)
FROM Customer c
LEFT JOIN Orders o ON c.customer_id= o.customer_id
GROUP BY c.customer_id;
You can use a LEFT JOIN and pass o.customer_id to COUNT():
SELECT c.customer_id, COUNT(o.customer_id) AS OrderCount
FROM Customer c
LEFT JOIN Orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id;
Demo with sample data. Customer IDs 2 and 4 have no rows in the Orders table, so their count is zero in the output.
DECLARE @Customer TABLE (CustomerId INT);
INSERT INTO @Customer (CustomerId) VALUES (1), (2), (3), (4), (5);
DECLARE @Orders TABLE (CustomerId INT, OrderId INT);
INSERT INTO @Orders (CustomerId, OrderId) VALUES (1, 1), (1, 2), (3, 2), (3, 4), (5, 1);
SELECT c.CustomerId, COUNT(o.CustomerId) AS OrderCount
FROM @Customer c
LEFT JOIN @Orders o ON c.CustomerId = o.CustomerId
GROUP BY c.CustomerId;
Output:
CustomerId OrderCount
----------------------
1 2
2 0
3 2
4 0
5 1
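Why COUNT(o.CustomerId) and not COUNT(*)? A rough sketch using the same sample data in an in-memory SQLite database (SQLite is an assumption here; the demo above is T-SQL) puts the two side by side. COUNT(*) counts the joined row itself, so a customer with no orders still gets 1, while COUNT on a column from the outer-joined table skips the NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerId INTEGER);
    CREATE TABLE Orders (CustomerId INTEGER, OrderId INTEGER);
    INSERT INTO Customer VALUES (1), (2), (3), (4), (5);
    INSERT INTO Orders VALUES (1, 1), (1, 2), (3, 2), (3, 4), (5, 1);
""")

# star_count counts rows produced by the join; order_count counts non-NULL o.CustomerId.
rows = conn.execute("""
    SELECT c.CustomerId,
           COUNT(*)            AS star_count,
           COUNT(o.CustomerId) AS order_count
    FROM Customer c
    LEFT JOIN Orders o ON c.CustomerId = o.CustomerId
    GROUP BY c.CustomerId
    ORDER BY c.CustomerId
""").fetchall()

for row in rows:
    print(row)
```

For customers 2 and 4 the two columns differ: star_count is 1 and order_count is 0.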
First, aggregate the orders by customer ID and compute the count in an inner select. Left join this to the Customer table so that you don't lose customers who have not placed an order. Finally, use a CASE expression to check whether a customer's order count is NULL, meaning they have made no orders, and in that case set the value to zero.
SELECT
c.customer_id,
CASE
WHEN o.num_orders IS NULL THEN 0
ELSE o.num_orders
END
FROM Customer c
LEFT JOIN (
SELECT customer_id, COUNT(*) AS num_orders
FROM Orders
GROUP BY customer_id
) AS o ON c.customer_id= o.customer_id;
Try the IFNULL function:
https://www.w3schools.com/sql/sql_isnull.asp
hope it'll help!

SQL where nested select not null

I have a Customers table with CustomerID and CustomerName.
I then have a Orders table with CustomerID, datetime OrderPlaced and datetime OrderDelivered.
Bearing in mind that not all customers have placed orders, I would like to get a list of CustomerName, OrderPlaced and OrderDelivered but only for customers that have placed orders and whose orders have already been delivered, and only the most recent OrderPlaced per customer.
I started by doing (fully aware that this does not implement the OrderDelivered limitation to it yet, but already not doing what I want):
SELECT CustomerID,
(SELECT TOP 1 OrderDelivered
FROM Orders ORDER BY OrderDelivered DESC) AS OrderDelivered
FROM Customer
WHERE OrderDelivered IS NOT NULL
But MS SQL already rejects this: it says that it doesn't know what OrderDelivered is in the WHERE clause.
How can I accomplish this?
Personally, I would move your subquery into the FROM and use CROSS APPLY. Then you can far more easily reference the column:
SELECT C.CustomerID,
O.OrderDelivered
FROM Customer C
CROSS APPLY (SELECT TOP 1 oa.OrderDelivered
FROM Orders oa
WHERE oa.CustomerID = C.CustomerID --Guess column name for orders
AND oa.OrderDelivered IS NOT NULL
ORDER BY oa.OrderDelivered DESC) O;
Because this is a CROSS APPLY, customers with no matching row are already filtered out, so there is no need for the outer WHERE.
If you want the most recent delivered order, then one method uses apply:
select c.*, o.OrderPlaced, o.OrderDelivered
from customer c cross apply
(select top (1) o.*
from orders o
where o.CustomerID = c.CustomerID and
o.OrderDelivered is not null
order by o.OrderPlaced desc
) o;
You can achieve this by using the OVER clause (https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql).
DECLARE @customers TABLE (CustomerId INT, CustomerName NVARCHAR(20))
DECLARE @orders TABLE (CustomerId INT, OrderPlaced DATETIME, OrderDelivered DATETIME)
INSERT INTO @customers VALUES
(1, 'a'),
(2, 'b')
INSERT INTO @orders VALUES
(1, '2019-01-01', null),
(2, '2019-01-03', '2019-02-01'),
(2, '2019-01-05', null)
SELECT
c.CustomerName,
-- Latest OrderPlaced
FIRST_VALUE(o.OrderPlaced)
OVER(PARTITION BY c.CustomerId ORDER BY o.OrderPlaced DESC) AS OrderPlaced,
-- The matching OrderDelivered
FIRST_VALUE(o.OrderDelivered)
OVER(PARTITION BY c.CustomerId ORDER BY o.OrderPlaced DESC) AS OrderDelivered
FROM @customers c
INNER JOIN @orders o ON o.CustomerId = c.CustomerId
WHERE o.OrderDelivered IS NOT NULL
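For comparison, here is a small sketch of the same idea without APPLY or window functions, run against an in-memory SQLite database from Python. It uses a correlated MAX subquery, which is a different technique from the answers above, and it assumes the intended result is the most recently placed order among those already delivered. Dates are stored as ISO-8601 text, so string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (CustomerId INTEGER, CustomerName TEXT);
    CREATE TABLE orders (CustomerId INTEGER, OrderPlaced TEXT, OrderDelivered TEXT);
    INSERT INTO customers VALUES (1, 'a'), (2, 'b');
    INSERT INTO orders VALUES
        (1, '2019-01-01', NULL),
        (2, '2019-01-03', '2019-02-01'),
        (2, '2019-01-05', NULL);
""")

# Keep only delivered orders, then keep the latest OrderPlaced per customer.
latest_delivered = conn.execute("""
    SELECT c.CustomerName, o.OrderPlaced, o.OrderDelivered
    FROM customers c
    JOIN orders o ON o.CustomerId = c.CustomerId
    WHERE o.OrderDelivered IS NOT NULL
      AND o.OrderPlaced = (
          SELECT MAX(o2.OrderPlaced)
          FROM orders o2
          WHERE o2.CustomerId = c.CustomerId
            AND o2.OrderDelivered IS NOT NULL
      )
    ORDER BY c.CustomerName
""").fetchall()

for row in latest_delivered:
    print(row)
```

Customer 'a' drops out because none of their orders has been delivered, and customer 'b' is represented by the delivered order rather than the later undelivered one.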

Accessing derived tables from outer query

In the following problem
Filtering based on Joining Multiple Tables in SQL
I managed to determine that the poster's problem was happening because he was accessing derived tables from the outer query.
What I don't understand is why this happened.
So if you run the following
create table salesperson (
id int, name varchar(40)
)
create table customer (
id int, name varchar(40)
)
create table orders (
number int, cust_id int, salesperson_id int
)
insert into salesperson values (1, 'abe'); insert into salesperson values (2, 'bob');
insert into salesperson values (5, 'chris'); insert into salesperson values (7, 'dan');
insert into salesperson values (8, 'ken'); insert into salesperson values (11, 'joe');
insert into customer values (4, 'Samsonic'); insert into customer values (6, 'panasung');
insert into customer values (7, 'samony'); insert into customer values (9, 'orange');
insert into orders values (10, 4, 2); insert into orders values (20, 4, 8);
insert into orders values (30, 9, 1); insert into orders values (40, 7, 2);
insert into orders values (50, 6, 7); insert into orders values (60, 6, 7);
insert into orders values (70, 9, 7);
SELECT *
FROM salesperson s
INNER JOIN orders o ON s.id = o.salesperson_id
INNER JOIN customer c ON o.cust_id = c.id
WHERE s.name NOT IN (
select s.name where c.name='Samsonic'
)
SELECT *
FROM salesperson s
INNER JOIN orders o ON s.id = o.salesperson_id
INNER JOIN customer c ON o.cust_id = c.id
WHERE s.name NOT IN (
SELECT s.name
FROM salesperson s
INNER JOIN orders o ON s.id = o.salesperson_id
INNER JOIN customer c ON o.cust_id = c.id
WHERE c.name = 'Samsonic'
)
The first select statement accesses the derived tables in the outer query, while the other creates its own joins and derives its own tables.
Why does the first select contain bob while the other one does not?
In your first query you are only removing the rows whose customer name is Samsonic. Since Bob also has a record for samony, that row appears in the output.
In the second query, the subquery returns every salesperson who has a Samsonic customer, which is both Bob and Ken. The NOT IN then removes all records for Bob and Ken, so neither of Bob's rows is returned.
The difference is that in your first query you are only removing orders which involve Samsonic, because the exclusion only looks at data in the current row. By the sounds of it, you want to remove any salesperson who has ever sold to Samsonic. You can see the difference in the results of the following query:
SELECT *, s.name, c.name
, case when s.name NOT IN (
select s.name where c.name='Samsonic'
) then 1 else 0 end /* Order not Samsonic */
, case when not exists (
select 1
from Orders O1
inner join Customer C1 on o1.cust_id = c1.id
where C1.Name = 'Samsonic' and o1.salesperson_id = O.salesperson_id
) then 1 else 0 end /* Salesperson never sold a Samsonic */
FROM salesperson s
INNER JOIN orders o ON s.id = o.salesperson_id
INNER JOIN customer c ON o.cust_id = c.id
Your first query has a select with no from clause. So the where is equivalent to:
WHERE s.name NOT IN (CASE WHEN c.name = 'Samsonic' THEN s.name END)
Or more simply:
WHERE c.name <> 'Samsonic'
Bob has an order that is not with 'Samsonic', so Bob is in the result set. In other words, the logic is looking at each row individually.
The second version looks at all names that have made an order to 'Samsonic'. Bob is one of those names, so the exclusion applies to all orders made by Bob.
If you want to exclude all salespersons who have ever made an order to 'Samsonic', then I would recommend using window functions instead of complicated logic:
SELECT *
FROM (SELECT s.id as salesperson_id, s.name as salesperson_name, c.id as customer_id, c.name as customer_name, o.number,
SUM(CASE WHEN c.name = 'Samsonic' THEN 1 ELSE 0 END) OVER (PARTITION BY s.id) as num_samsonic
FROM salesperson s INNER JOIN
orders o
ON s.id = o.salesperson_id INNER JOIN
customer c
ON o.cust_id = c.id
) soc
WHERE num_samsonic = 0
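To see the two behaviours side by side, here is a rough sketch that loads the question's data into an in-memory SQLite database from Python. It uses the simplified row-level filter (c.name <> 'Samsonic') for the first query and a NOT EXISTS for the salesperson-level exclusion; the variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salesperson (id INTEGER, name TEXT);
    CREATE TABLE customer (id INTEGER, name TEXT);
    CREATE TABLE orders (number INTEGER, cust_id INTEGER, salesperson_id INTEGER);
    INSERT INTO salesperson VALUES
        (1, 'abe'), (2, 'bob'), (5, 'chris'), (7, 'dan'), (8, 'ken'), (11, 'joe');
    INSERT INTO customer VALUES
        (4, 'Samsonic'), (6, 'panasung'), (7, 'samony'), (9, 'orange');
    INSERT INTO orders VALUES
        (10, 4, 2), (20, 4, 8), (30, 9, 1), (40, 7, 2), (50, 6, 7), (60, 6, 7), (70, 9, 7);
""")

# Row-level: drops only the individual order rows that involve Samsonic.
per_row = [r[0] for r in conn.execute("""
    SELECT DISTINCT s.name
    FROM salesperson s
    JOIN orders o ON s.id = o.salesperson_id
    JOIN customer c ON o.cust_id = c.id
    WHERE c.name <> 'Samsonic'
    ORDER BY s.name
""")]

# Salesperson-level: drops every row of anyone who has ever sold to Samsonic.
per_salesperson = [r[0] for r in conn.execute("""
    SELECT DISTINCT s.name
    FROM salesperson s
    JOIN orders o ON s.id = o.salesperson_id
    JOIN customer c ON o.cust_id = c.id
    WHERE NOT EXISTS (
        SELECT 1
        FROM orders o1
        JOIN customer c1 ON o1.cust_id = c1.id
        WHERE c1.name = 'Samsonic' AND o1.salesperson_id = s.id
    )
    ORDER BY s.name
""")]

print(per_row)
print(per_salesperson)
```

Bob appears in the first list because of his samony order, and disappears from the second because one of his orders was for Samsonic.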

join within recursive with adjacency

I have something like this:
CREATE TABLE categories (
id varchar(250) PRIMARY KEY,
name varchar(250) NOT NULL,
parentid varchar(250)
);
CREATE TABLE products (
id varchar(250) PRIMARY KEY,
name varchar(250) NOT NULL,
price double precision,
category varchar(250) NOT NULL
);
INSERT INTO categories VALUES ('1', 'Rack', '');
INSERT INTO categories VALUES ('2', 'Women', '1');
INSERT INTO categories VALUES ('3', 'Shorts', '2');
INSERT INTO products VALUES ('1', 'Jean', 2.99, '3');
INSERT INTO products VALUES ('2', 'Inflatable Boat', 5.99, '1');
Now, if I wanted to see the total price of products for each category, I could do something like this:
SELECT
categories.name,
SUM(products.price) AS CATPRICE
FROM
categories,
products
WHERE products.category = categories.id
GROUP BY categories.name
;
Which produces output:
name | catprice
--------+----------
Rack | 5.99
Shorts | 2.99
(2 rows)
But notice that "Shorts" is a descendant of "Rack". I want a query that will produce output like this:
name | catprice
--------+----------
Rack | 8.98
(1 row)
So that all product prices are added together under the root category. There are multiple root categories in the category table; only one has been shown for simplicity.
This is what I have thus far:
-- "nodes_cte" is the virtual table that is being created as the recursion continues
-- The contents of the ()s are the columns that are being built
WITH RECURSIVE nodes_cte(name, id, parentid, depth, path) AS (
-- Base case?
SELECT tn.name, tn.id, tn.parentid, 1::INT AS depth, tn.id::TEXT AS path FROM categories AS tn, products AS tn2
LEFT OUTER JOIN categories ON tn2.CATEGORY = categories.ID
WHERE tn.parentid IS NULL
UNION ALL
-- nth case
SELECT c.name, c.id, c.parentid, p.depth + 1 AS depth, (p.path || '->' || c.id::TEXT) FROM nodes_cte AS p, categories AS c, products AS c2
LEFT OUTER JOIN categories ON c2.CATEGORY = categories.ID
WHERE c.parentid = p.id
)
SELECT * FROM nodes_cte AS n ORDER BY n.id ASC;
I have no clue what I've done wrong. The above query returns zero results.
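Your recursive query is off by a little. For one thing, the base case filters on parentid IS NULL, but the sample data stores an empty string as the root's parentid, so the base case matches nothing. Give this a try: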
EDIT -- To make this work with the SUM, use this:
WITH RECURSIVE nodes_cte(name, id, id2, parentid, price) AS (
-- Base case?
SELECT c.name,
c.id,
c.id id2,
c.parentid,
p.price
FROM categories c
LEFT JOIN products p on c.id = p.category
WHERE c.parentid = ''
UNION ALL
-- nth case
SELECT n.name,
n.id,
c.id id2,
c.parentid,
p.price
FROM nodes_cte n
JOIN categories c on n.id2 = c.parentid
LEFT JOIN products p on c.id = p.category
)
SELECT id, name, SUM(price) FROM nodes_cte GROUP BY id, name
And here is the Fiddle: http://sqlfiddle.com/#!1/7ac6d/19
Good luck.
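In case the fiddle link is unavailable, here is a minimal sketch of the same roll-up run against an in-memory SQLite database from Python. SQLite is an assumption here (the question uses PostgreSQL syntax), and the CTE columns are renamed to root_name, root_id and current_id for readability. Each recursion step carries the root's identity forward while walking down to the next level of children.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id TEXT PRIMARY KEY, name TEXT NOT NULL, parentid TEXT);
    CREATE TABLE products (id TEXT PRIMARY KEY, name TEXT NOT NULL,
                           price REAL, category TEXT NOT NULL);
    INSERT INTO categories VALUES ('1', 'Rack', ''), ('2', 'Women', '1'), ('3', 'Shorts', '2');
    INSERT INTO products VALUES ('1', 'Jean', 2.99, '3'), ('2', 'Inflatable Boat', 5.99, '1');
""")

totals = conn.execute("""
    WITH RECURSIVE nodes_cte(root_name, root_id, current_id, price) AS (
        -- Base case: root categories (empty parentid) with their own products
        SELECT c.name, c.id, c.id, p.price
        FROM categories c
        LEFT JOIN products p ON p.category = c.id
        WHERE c.parentid = ''
        UNION ALL
        -- Recursive case: children of the current node, still labelled with the root
        SELECT n.root_name, n.root_id, c.id, p.price
        FROM nodes_cte n
        JOIN categories c ON c.parentid = n.current_id
        LEFT JOIN products p ON p.category = c.id
    )
    SELECT root_name, SUM(price) FROM nodes_cte GROUP BY root_id, root_name
""").fetchall()

for name, total in totals:
    print(name, round(total, 2))
```

The LEFT JOIN to products matters: "Women" has no products of its own, and an inner join would drop it and stop the recursion before it reaches "Shorts".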

SQL - identifying rows for a value in one table, where all joined rows only has a specific value

In SQL Server, I have a result set from a joined many-to-many relationship.
Consider Products linked to Orders via a link table:
Table - Products
ID
ProductName
Table - Orders
ID
OrderCountry
LinkTable OrderLines (columns not shown)
I'd like to be able to filter these results to show only the results where for an entity from one table, all the values in the other table only have a given value in a particular column. In terms of my example, for each product, I want to return only the joined rows when all the orders they're linked to are for country 'uk'
So if my linked result set is
productid, product, orderid, ordercountry
1, Chocolate, 1, uk
2, Banana, 2, uk
2, Banana, 3, usa
3, Strawberry, 4, usa
I want to filter so that only those products that have only been ordered in the UK are shown (i.e. Chocolate). I'm sure this should be straightforward, but it's Friday afternoon and the SQL part of my brain has given up for the day...
You could do something like this, where first you get all products only sold in one country, then you proceed to get all orders for those products
with distinctProducts as
(
select LinkTable.ProductID
from Orders
inner join LinkTable on LinkTable.OrderID = Orders.ID
group by LinkTable.ProductID
having count(distinct Orders.OrderCountry) = 1
)
select pr.ID as ProductID
,pr.ProductName
,o.ID as OrderID
,o.OrderCountry
from Products pr
inner join LinkTable lt on lt.ProductID = pr.ID
inner join Orders o on o.ID = lt.OrderID
inner join distinctProducts dp on dp.ProductID = pr.ID
where o.OrderCountry = 'UK'
In the hope that some of this may be generally reusable:
;with startingRS (productid, product, orderid, ordercountry) as (
select 1, 'Chocolate', 1, 'uk' union all
select 2, 'Banana', 2, 'uk' union all
select 2, 'Banana', 3, 'usa' union all
select 3, 'Strawberry', 4, 'usa'
), countryRankings as (
select productid,product,orderid,ordercountry,
RANK() over (PARTITION by productid ORDER by ordercountry) as FirstCountry,
RANK() over (PARTITION by productid ORDER by ordercountry desc) as LastCountry
from
startingRS
), singleCountry as (
select productid,product,orderid,ordercountry
from countryRankings
where FirstCountry = 1 and LastCountry = 1
)
select * from singleCountry where ordercountry='uk'
In the startingRS, you put whatever query you currently have to generate the intermediate results you've shown. The countryRankings CTE adds two new columns, that ranks the countries within each productid.
The singleCountry CTE reduces the result set back down to those results where country ranks as both the first and last country within the productid (i.e. there's only a single country for this productid). Finally, we query for those results which are just from the uk.
If you want, for example, all productid rows with a single country of origin, you just skip this last where clause (and you'd get 3,strawberry,4,usa in your results also)
So if you've got a current query that looks like:
select p.productid,p.product,o.orderid,o.ordercountry
from product p inner join order o on p.productid = o.productid --(or however these joins work for your tables)
Then you'd rewrite the first CTE as:
;with startingRS (productid, product, orderid, ordercountry) as (
select p.productid,p.product,o.orderid,o.ordercountry
from product p inner join order o on p.productid = o.productid
), /* rest of query */
Hmm. Based on Philip's earlier approach, try adding something like this to exclude rows where there's been the same product ordered in another country:
SELECT pr.Id, pr.ProductName, od.Id, od.OrderCountry
from Products pr
inner join LinkTable lt
on lt.ProductId = pr.ID
inner join Orders od
on od.ID = lt.OrderId
where
od.OrderCountry = 'UK'
AND NOT EXISTS
(
SELECT
*
FROM
Products MatchingProducts
inner join LinkTable lt
on lt.ProductId = MatchingProducts.ID
inner join Orders OrdersFromOtherCountries
on OrdersFromOtherCountries.ID = lt.OrderId
WHERE
MatchingProducts.ID = Pr.ID AND
OrdersFromOtherCountries.OrderCountry != od.OrderCountry
)
;WITH mytable (productid,ordercountry)
AS
(SELECT productid, ordercountry
FROM Orders od INNER JOIN LinkTable lt ON od.orderid = lt.OrderId)
SELECT * FROM mytable
INNER JOIN dbo.Products pr ON pr.productid = mytable.productid
WHERE pr.productid NOT IN (SELECT productid FROM mytable
GROUP BY productid
HAVING COUNT(DISTINCT ordercountry) > 1)
AND ordercountry = 'uk'
SELECT pr.Id, pr.ProductName, od.Id, od.OrderCountry
from Products pr
inner join LinkTable lt
on lt.ProductId = pr.ID
inner join Orders od
on od.ID = lt.OrderId
where od.OrderCountry = 'UK'
This probably isn't the most efficient way to do this, but ...
SELECT p.ProductName
FROM Product p
WHERE p.ProductId IN
(
SELECT DISTINCT ol.ProductId
FROM OrderLines ol
INNER JOIN [Order] o
ON ol.OrderId = o.OrderId
WHERE o.OrderCountry = 'uk'
)
AND p.ProductId NOT IN
(
SELECT DISTINCT ol.ProductId
FROM OrderLines ol
INNER JOIN [Order] o
ON ol.OrderId = o.OrderId
WHERE o.OrderCountry != 'uk'
)
Test data:
create table product
(
ProductId int,
ProductName nvarchar(50)
)
go
create table [order]
(
OrderId int,
OrderCountry nvarchar(50)
)
go
create table OrderLines
(
OrderId int,
ProductId int
)
go
insert into Product VALUES (1, 'Chocolate')
insert into Product VALUES (2, 'Banana')
insert into Product VALUES (3, 'Strawberry')
insert into [order] values (1, 'uk')
insert into [order] values (2, 'uk')
insert into [order] values (3, 'usa')
insert into [order] values (4, 'usa')
insert into [orderlines] values (1, 1)
insert into [orderlines] values (2, 2)
insert into [orderlines] values (3, 2)
insert into [orderlines] values (4, 3)
insert into [orderlines] values (3, 2)
insert into [orderlines] values (3, 3)
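As a cross-check on the test data above, here is a hedged sketch of a more compact formulation, run against an in-memory SQLite database from Python. It is a different technique from the answers in this thread: group by product and require that the number of non-UK orders is zero. The table and column names follow the test data; SQLite and the variable name uk_only are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (ProductId INTEGER, ProductName TEXT);
    CREATE TABLE "order" (OrderId INTEGER, OrderCountry TEXT);
    CREATE TABLE OrderLines (OrderId INTEGER, ProductId INTEGER);
    INSERT INTO product VALUES (1, 'Chocolate'), (2, 'Banana'), (3, 'Strawberry');
    INSERT INTO "order" VALUES (1, 'uk'), (2, 'uk'), (3, 'usa'), (4, 'usa');
    INSERT INTO OrderLines VALUES (1, 1), (2, 2), (3, 2), (4, 3), (3, 2), (3, 3);
""")

# A product qualifies when none of its joined orders is from outside the UK.
# The inner joins also drop products that were never ordered at all.
uk_only = conn.execute("""
    SELECT p.ProductName
    FROM product p
    JOIN OrderLines ol ON ol.ProductId = p.ProductId
    JOIN "order" o ON o.OrderId = ol.OrderId
    GROUP BY p.ProductId, p.ProductName
    HAVING SUM(CASE WHEN o.OrderCountry <> 'uk' THEN 1 ELSE 0 END) = 0
    ORDER BY p.ProductName
""").fetchall()

print(uk_only)
```

This returns one row per product rather than the joined order rows; to get the order rows back, join the result to the original query on ProductId.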