Select rows which have a field in common with another row - sql

I have two tables: products and postings. A product is a consumer product (e.g. iPhone X), and a posting is a listing for a product on an online marketplace (e.g. an eBay posting). A single product has zero or more associated postings.
Is there any way to select only postings that have a "sibling", i.e. all postings whose product column equals the product column of at least one other posting? My attempt:
SELECT * FROM postings a
INNER JOIN products b ON a.product = b.id
WHERE COUNT(b) > 0

An inner join alone only guarantees that each posting has a matching product. To find siblings, you can count the postings per product with a window function and keep only those whose count is greater than 1:
WITH counted AS
(
  SELECT a.*, COUNT(*) OVER (PARTITION BY b.id) AS cnt
  FROM postings a
  INNER JOIN products b ON a.product = b.id
)
SELECT * FROM counted WHERE cnt > 1

If you just want postings, then I would suggest:
select p.*
from postings p
where exists (select 1
from postings p2
where p2.product = p.product and
p2.posting_id <> p.posting_id
);
Or, use window functions like this:
select p.*
from (select p.*,
count(*) over (partition by p.product) as cnt
from postings p
) p
where cnt > 1;
Note that these do not require the products table, because the product information is available in postings.
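To make the EXISTS approach concrete, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The sample rows are made up for illustration, and the posting_id column name is taken from the query above rather than from the question, so treat both as assumptions.

```python
import sqlite3

# Hypothetical sample data: product 20 has a single posting, the others have siblings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE postings (posting_id INTEGER PRIMARY KEY, product INTEGER);
    INSERT INTO postings (posting_id, product) VALUES
        (1, 10), (2, 10), (3, 20), (4, 30), (5, 30), (6, 30);
""")

# Keep a posting only if another posting with the same product exists.
sibling_ids = [row[0] for row in conn.execute("""
    SELECT p.posting_id
    FROM postings p
    WHERE EXISTS (SELECT 1
                  FROM postings p2
                  WHERE p2.product = p.product
                    AND p2.posting_id <> p.posting_id)
    ORDER BY p.posting_id
""")]

print(sibling_ids)  # posting 3 is the only one without a sibling
```

An index on postings(product) would probably help the correlated lookup on a large table, though that depends on the engine and the data.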

Related

All combinations of counts

I have a table with columns product_id and status, for example:
product_id  status
1           ok
2           bad
1           ok
3           bad
2           bad
1           ok
I'd like to show the count of every possible combination of product_id and status:
product_id  status  count
1           ok      3
1           bad     0
2           ok      0
2           bad     2
3           ok      0
3           bad     1
The solution I've found is that I could use a Cartesian join and then union it with regular counts and aggregate the results (works fine):
SELECT product_id, status, SUM(cnt) FROM (
---all combinations, no count
SELECT DISTINCT t1.product_id, t2.status, 0 AS cnt
FROM
details t1,
details t2
UNION
---counts of existing combinations
SELECT DISTINCT product_id, status, COUNT(status) AS cnt
FROM details
GROUP BY product_id, status) AS T1
GROUP BY product_id, status
Now I am wondering, is there a better way to do it?
I am learning SQL with PostgreSQL and Access SQL. The comments are added for clarity (they are left out of the Access code).
Use CROSS JOIN to build all combinations and top up with a LEFT JOIN:
SELECT p.product_id, s.status, COUNT(t.any_not_null_column)
FROM (SELECT DISTINCT product_id FROM t) AS p
CROSS JOIN (SELECT DISTINCT status FROM t) AS s
LEFT JOIN t ON p.product_id = t.product_id AND s.status = t.status
GROUP BY p.product_id, s.status
The following is a Postgres solution (a database I strongly recommend over MS Access). The idea is to generate all the rows and then use left join and group by to get the counts you want:
select p.product_id, s.status, count(d.product_id)
from (select distinct product_id from details) p cross join
(values ('ok'), ('bad')) s left join
details d
on d.product_id = p.product_id and d.status = s.status
group by p.product_id, s.status;
Note: You might have other tables that have the list of products and/or statuses that you want.
An equivalent version in MS Access (which would also work in Postgres) might look like:
select p.product_id, s.status, count(d.product_id)
from ((select distinct product_id from details) p,
(select distinct status from details) s
) left join
details d
on d.product_id = p.product_id and
d.status = s.status
group by p.product_id, s.status;
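As a quick check of the CROSS JOIN plus LEFT JOIN idea, the sketch below loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module. SQLite is only a stand-in here for PostgreSQL or Access, and the ORDER BY is added just to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE details (product_id INTEGER, status TEXT);
    INSERT INTO details VALUES
        (1, 'ok'), (2, 'bad'), (1, 'ok'), (3, 'bad'), (2, 'bad'), (1, 'ok');
""")

# Build every (product_id, status) pair, then count the matching detail rows.
# COUNT(d.product_id) ignores the NULLs produced by unmatched pairs, giving 0.
combo_counts = conn.execute("""
    SELECT p.product_id, s.status, COUNT(d.product_id) AS cnt
    FROM (SELECT DISTINCT product_id FROM details) AS p
    CROSS JOIN (SELECT DISTINCT status FROM details) AS s
    LEFT JOIN details AS d
           ON d.product_id = p.product_id AND d.status = s.status
    GROUP BY p.product_id, s.status
    ORDER BY p.product_id, s.status DESC
""").fetchall()

for row in combo_counts:
    print(row)
```

Counting a column from the left-joined table, rather than COUNT(*), is what makes the missing combinations show up as 0 instead of 1.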

list all table elements and count() even if count is 0

Suppose the following tables:
table book(
id,
title,
deleted
)
table invoice(
id,
book_id,
settled
)
I need a list of all the books and the number of settled invoices for each book.
I tried this:
select book.id, title, count(invoice.id)
from book LEFT OUTER JOIN invoice ON book.id=invoice.book_id
where deleted=0
and settled=1
group by book.id
This works only if a book has at least one settled invoice or if it doesn't have any invoice at all. However, it fails when a book has unsettled invoices and no settled invoice.
Any idea how to query it?
The following will list all books, but only join and count settled invoices.
SELECT
b.id, b.title, COUNT(i.id) AS settled
FROM
book b
LEFT JOIN invoice i
ON b.id = i.book_id
AND i.settled = 1
WHERE
b.deleted = 0
GROUP BY
b.id
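The condition settled = 1 in your WHERE clause effectively turns your LEFT JOIN into an INNER JOIN, because rows with no matching invoice have a NULL settled value and are filtered out. You can instead move the test into a CASE expression inside COUNT: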
SELECT b.id,
b.title,
COUNT(CASE WHEN i.settled = 1 THEN 1 END)
FROM book b
LEFT JOIN invoice i
ON b.id = i.book_id
WHERE b.deleted=0
GROUP BY b.id;
Or use the LEFT JOIN with the invoice table already filtered:
SELECT b.id,
b.title,
COUNT(i.id)
FROM book b
LEFT JOIN ( SELECT *
FROM invoice
WHERE settled = 1) i
ON b.id = i.book_id
WHERE b.deleted=0
GROUP BY b.id;
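The difference between filtering in WHERE and filtering in the ON clause can be seen in a small sketch using Python's sqlite3 module. The book titles and invoice rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, deleted INTEGER);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, book_id INTEGER, settled INTEGER);
    -- Book 2 has only an unsettled invoice, book 3 has none, book 4 is deleted.
    INSERT INTO book VALUES (1, 'A', 0), (2, 'B', 0), (3, 'C', 0), (4, 'D', 1);
    INSERT INTO invoice VALUES (1, 1, 1), (2, 1, 1), (3, 1, 0), (4, 2, 0);
""")

# Filter in WHERE: rows without a settled invoice are discarded after the join.
where_version = conn.execute("""
    SELECT b.id, b.title, COUNT(i.id)
    FROM book b
    LEFT JOIN invoice i ON b.id = i.book_id
    WHERE b.deleted = 0 AND i.settled = 1
    GROUP BY b.id, b.title
    ORDER BY b.id
""").fetchall()

# Filter in ON: every non-deleted book survives, unmatched ones count as 0.
on_version = conn.execute("""
    SELECT b.id, b.title, COUNT(i.id)
    FROM book b
    LEFT JOIN invoice i ON b.id = i.book_id AND i.settled = 1
    WHERE b.deleted = 0
    GROUP BY b.id, b.title
    ORDER BY b.id
""").fetchall()

print(where_version)
print(on_version)
```

With this data the WHERE version keeps only book 1, while the ON version lists books 1 to 3 with their settled counts.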

Fetch most recent records as part of Joins

I am joining two tables, customer and profile, on the column cust_id. The profile table can have more than one entry per customer. When joining, I want to select only the most recent entry by start_ts, so the result has one row per customer: the customer row plus the most recent profile row. Is there a way to do this in Oracle SQL?
I would use window functions:
select . . .
from customer c join
(select p.*,
row_number() over (partition by cust_id order by start_ts desc) as seqnum
from profile p
) p
on c.cust_id = p.cust_id and p.seqnum = 1;
You can use a left join if you like to get customers that don't have profiles as well.
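Here is a rough sketch of the left-join variant, run on SQLite through Python's sqlite3 module instead of Oracle. It assumes an SQLite build with window function support (3.25 or later). The name and tier columns and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE profile (cust_id INTEGER, start_ts TEXT, tier TEXT);
    -- Customer 3 has no profile rows at all.
    INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO profile VALUES
        (1, '2020-01-01', 'basic'), (1, '2021-06-01', 'pro'),
        (2, '2019-03-15', 'basic');
""")

# Number each customer's profiles from newest to oldest and join only row 1.
# The seqnum test stays in the ON clause so customers without profiles are kept.
latest = conn.execute("""
    SELECT c.cust_id, c.name, p.start_ts, p.tier
    FROM customer c
    LEFT JOIN (SELECT pr.*,
                      ROW_NUMBER() OVER (PARTITION BY pr.cust_id
                                         ORDER BY pr.start_ts DESC) AS seqnum
               FROM profile pr) p
           ON p.cust_id = c.cust_id AND p.seqnum = 1
    ORDER BY c.cust_id
""").fetchall()

for row in latest:
    print(row)
```

Moving p.seqnum = 1 into a WHERE clause would drop customer 3 again, which is the same effect as using an inner join.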
One way (which works in all DB engines) is to join the tables you want to select data from, and then join against the maximum start_ts per customer to filter the profile rows:
select c.*, p.*
from customer c
join profile p on c.cust_id = p.cust_id
join
(
select cust_id, max(start_ts) as maxts
from profile
group by cust_id
) p2 on p.cust_id = p2.cust_id and p.start_ts = p2.maxts
Here is another way (if there exists no newer entry then it's the newest):
select
c.*,
p.*
from
customer c inner join
profile p on p.cust_id = c.cust_id and not exists(
select *
from profile
where cust_id = c.cust_id and start_ts > p.start_ts
)

SQL strategy to fetch maximum

Suppose I have these three tables:
I want to get, for every product, its product_id and the client that bought it the most times (the biggest client of the product).
I solved it like this:
SELECT
product_id AS product,
(SELECT TOP 1 client_id FROM Bill_Item, Bill
WHERE Bill_Item.product_id = p.product_id
and Bill_Item.bill_id = Bill.bill_id
GROUP BY
client_id
ORDER BY
COUNT(*) DESC
) AS client
FROM Product p
Do you know a better way?
The inner query gives you the ranking. The outer query gives you the client that purchased the most of each product:
SELECT *
FROM
(
SELECT i.product_id, b.client_id,
r = row_number() over (partition by i.product_id
order by count(*) desc)
FROM Bill b
INNER JOIN Bill_Item i ON b.bill_id = i.bill_id
GROUP BY i.product_id, b.client_id
) d
WHERE r = 1
I was going to submit pretty much the same thing as @Squirrel, only with a common table expression (CTE) rather than a derived table, so I won't duplicate that. There are still some learning points concerning your query. First, implicit joins such as FROM Bill_Item, Bill can easily have unintended consequences (one of many questions on this: Queries that implicit SQL joins can't do?). Next, for the calculated column you can use OUTER APPLY or CROSS APPLY, which is a very useful technique.
So you could rewrite your method as follows:
SELECT *
FROM
Product p
OUTER APPLY (SELECT TOP 1 b.client_id
FROM
Bill_Item bi
INNER JOIN Bill b
ON bi.bill_id = b.bill_id
WHERE
bi.product_id = p.product_id
GROUP BY
b.client_id
ORDER BY
COUNT(*) DESC) c
And to show how Squirrel's answer can still include products that have never been sold: all you need to do is start from Product and LEFT JOIN to the other tables:
;WITH cte AS (
SELECT
p.product_id
,b.client_id
,ROW_NUMBER() OVER (PARTITION BY p.product_id ORDER BY COUNT(*) DESC) as RowNumber
FROM
Product p
LEFT JOIN Bill_Item bi
ON p.product_id = bi.product_id
LEFT JOIN Bill b
ON bi.bill_id = b.bill_id
GROUP BY
p.product_id
,b.client_id
)
SELECT *
FROM
cte
WHERE
RowNumber = 1
Techniques used in some of these that are useful.
CTE
APPLY (Outer & Cross)
Window Functions
Squirrel's answer doesn't return products that have never been sold. If you want to include those, then your approach is ok, although I would write the query as:
SELECT product_id as product,
(SELECT TOP 1 b.client_id
FROM Bill_Item bi JOIN
Bill b
ON bi.bill_id = b.bill_id
WHERE bi.product_id = p.product_id
GROUP BY b.client_id
ORDER BY COUNT(*) DESC
) as client
FROM Product p;
You can also express this using APPLY, but a correlated subquery is also fine.
Note the correct use of the explicit JOIN syntax.
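The correlated-subquery version can be tried out with the sketch below, which uses SQLite through Python's sqlite3 module. SQLite has no TOP, so LIMIT 1 stands in for TOP 1. The three table definitions are guesses based on the column names used in the queries, since the original table diagram is not available, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed minimal schema; the real tables likely have more columns.
    CREATE TABLE Product (product_id INTEGER PRIMARY KEY);
    CREATE TABLE Bill (bill_id INTEGER PRIMARY KEY, client_id INTEGER);
    CREATE TABLE Bill_Item (bill_id INTEGER, product_id INTEGER);
    INSERT INTO Product VALUES (1), (2), (3);
    INSERT INTO Bill VALUES (100, 7), (101, 7), (102, 8), (103, 8);
    -- Product 1: client 7 twice, client 8 once. Product 3 was never sold.
    INSERT INTO Bill_Item VALUES (100, 1), (101, 1), (102, 1), (103, 2);
""")

# For each product, pick the client with the most bill items for it.
top_clients = conn.execute("""
    SELECT p.product_id,
           (SELECT b.client_id
            FROM Bill_Item bi
            JOIN Bill b ON bi.bill_id = b.bill_id
            WHERE bi.product_id = p.product_id
            GROUP BY b.client_id
            ORDER BY COUNT(*) DESC
            LIMIT 1) AS client
    FROM Product p
    ORDER BY p.product_id
""").fetchall()

print(top_clients)
```

Because the outer query starts from Product, the unsold product still appears, with a NULL client. If two clients tie for first place, which one is returned is not determined by this query.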

Using Top to get only the first row, but from multiple records in a subquery

I have two tables, one with customers and one with payments
I need the most recent payment from each customer matching specific criteria. (I created a query for that, specific_criteria.)
select top 1 *
from payments
where id in (select id from specific_criteria);
Obviously, that only returns one row.
I can't actually write VB code in this database to do it.
Looking for some SQL code that will get the most recent payment per customer.
SELECT c.*,
       p.*
FROM Customers c
INNER JOIN
Payments p ON p.Customer_ID = c.ID
WHERE p.payment_date = (SELECT MAX(p1.payment_date)
                        FROM Payments p1
                        WHERE p1.Customer_ID = c.ID)
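Would something like this work? A subquery in the select list can return only one column, so it fetches just the latest payment date:
select c.*, (
select top 1 p.payment_date
from payments p
where p.customer_id = c.id
order by p.payment_date desc ) as last_payment_date
from customers c
where customer_criteria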
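One way to get the full most recent payment row per customer, sketched here on SQLite through Python's sqlite3 module rather than Access, is to compare each payment's date with the maximum date for the same customer. The name and amount columns and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, customer_id INTEGER,
                           payment_date TEXT, amount REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO payments VALUES
        (1, 1, '2023-01-05', 10.0), (2, 1, '2023-03-09', 25.0),
        (3, 2, '2022-12-31', 40.0);
""")

# Keep a payment only when its date is the latest one for that customer.
latest_payments = conn.execute("""
    SELECT c.name, p.payment_date, p.amount
    FROM customers c
    JOIN payments p ON p.customer_id = c.id
    WHERE p.payment_date = (SELECT MAX(p1.payment_date)
                            FROM payments p1
                            WHERE p1.customer_id = c.id)
    ORDER BY c.id
""").fetchall()

print(latest_payments)
```

If a customer has two payments on the same latest date, both rows come back, so a tie-breaker (for example the highest payment id) may be needed. The customer criteria from the question would go in as an extra condition on c.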