Mixing INs and Cross Table queries in Postgres

I'm new to Postgres. I have a query that involves 4 tables. My tables look like the following:
User Account Transaction Action
---- ------- ----------- ------
ID ID ID ID
Name UserID AccountID AccountID
Description Description
For each user, I'm trying to figure out: How many accounts they have, and how many total transactions and actions have been taken across all accounts. In other words, I'm trying to generate a query whose results will look like the following:
User Accounts Transactions Actions
---- -------- ------------ -------
Bill 2 27 7
Jack 1 7 0
Joe 0 0 0
How do I write a query like this? Currently, I'm trying the following:
SELECT
u.Name,
(SELECT COUNT(ID) FROM Account a WHERE a.UserID = u.ID) as Accounts
FROM
User u
Now, I'm stuck though.

Untested, but I would go for something like this:
select
u.Name,
count(distinct a.ID) as Accounts,
count(distinct t.ID) as Transactions,
count(distinct ac.ID) as Actions
from User u
left join Account a on u.ID = a.UserID
left join Transaction t on t.AccountID = a.ID
left join Action ac on ac.AccountId = a.Id
group by u.Name

As probably intended:
SELECT u.name, u.id
,COALESCE(x.accounts, 0) AS accounts
,COALESCE(x.transactions, 0) AS transactions
,COALESCE(x.actions, 0) AS actions
FROM users u
LEFT JOIN (
SELECT a.userid AS id
,count(*) AS accounts
,sum(t.ct) AS transactions
,sum(c.ct) AS actions
FROM account a
LEFT JOIN (SELECT accountid AS id, count(*) AS ct FROM transaction GROUP BY 1) t USING (id)
LEFT JOIN (SELECT accountid AS id, count(*) AS ct FROM action GROUP BY 1) c USING (id)
GROUP BY 1
) x USING (id);
Group first, join later. That's fastest and cleanest by far if you want the whole table.
SQL Fiddle (building on the one provided by @Raphaël, prettified).
Aside: I tripped over your naming convention in my first version.
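To see why grouping first matters, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Postgres. The table names txn and act and the sample rows are made up for illustration (they replace Transaction and Action to avoid keyword clashes), and explicit ON conditions are used instead of USING. Joining both child tables directly multiplies the rows per account, so plain counts come out inflated, while the pre-aggregated subqueries keep one row per account.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE account (id INTEGER PRIMARY KEY, userid INTEGER);
CREATE TABLE txn     (id INTEGER PRIMARY KEY, accountid INTEGER);  -- hypothetical name
CREATE TABLE act     (id INTEGER PRIMARY KEY, accountid INTEGER);  -- hypothetical name
INSERT INTO users   VALUES (1, 'Bill'), (2, 'Jack'), (3, 'Joe');
INSERT INTO account VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO txn     VALUES (1, 1), (2, 1), (3, 2), (4, 3);
INSERT INTO act     VALUES (1, 1), (2, 1);
""")

# Join first, count later: every transaction row is paired with every
# action row of the same account, so non-DISTINCT counts are inflated.
naive = conn.execute("""
SELECT u.name, count(a.id), count(t.id), count(c.id)
FROM users u
LEFT JOIN account a ON a.userid = u.id
LEFT JOIN txn t ON t.accountid = a.id
LEFT JOIN act c ON c.accountid = a.id
GROUP BY u.id, u.name
ORDER BY u.id
""").fetchall()

# Group first, join later: each subquery yields one row per account.
grouped = conn.execute("""
SELECT u.name,
       COALESCE(x.accounts, 0),
       COALESCE(x.transactions, 0),
       COALESCE(x.actions, 0)
FROM users u
LEFT JOIN (
    SELECT a.userid AS id,
           count(*)  AS accounts,
           sum(t.ct) AS transactions,
           sum(c.ct) AS actions
    FROM account a
    LEFT JOIN (SELECT accountid, count(*) AS ct FROM txn GROUP BY accountid) t
           ON t.accountid = a.id
    LEFT JOIN (SELECT accountid, count(*) AS ct FROM act GROUP BY accountid) c
           ON c.accountid = a.id
    GROUP BY a.userid
) x ON x.id = u.id
ORDER BY u.id
""").fetchall()

print(naive)
print(grouped)
```

With this data, Bill really has 2 accounts, 3 transactions and 2 actions; only the second query reports that without needing DISTINCT.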

Related

Query records in one table that exists in either of two columns in another table

I have two tables. One with user info, one with payment info. I would like to find out users that are either the sender or the receiver of a payment.
Example data:
user
id  other columns
--  -------------
1
2
3
payments:
sender  receiver  other columns
------  --------  -------------
1       4
1       3
5       3
4       5
ideal output
id
--
1
3
what I tried:
SELECT id
FROM user u
where exists
(
SELECT 1
FROM payments p
where u.id = p.sender or u.id = p.receiver
)
BigQuery gave the error:
LEFT SEMI JOIN cannot be used without a condition that is an equality of fields from both sides of the join
which is quite confusing to me
That's because the condition WHERE u.id = p.sender OR u.id = p.receiver turns the LEFT SEMI JOIN into a non-equi join.
You can split the WHERE condition into two EXISTS clauses:
SELECT id
FROM user u
WHERE EXISTS (SELECT 1 FROM payments p WHERE u.id = p.sender)
OR EXISTS (SELECT 1 FROM payments p WHERE u.id = p.receiver)
;
This approach can perform very poorly on real data, though.
In that case, the query below is another option:
SELECT id FROM user u WHERE EXISTS (SELECT 1 FROM payments p WHERE u.id = p.sender)
-- UNION DISTINCT so that a user who is both a sender and a receiver is listed only once
UNION DISTINCT
SELECT id FROM user u WHERE EXISTS (SELECT 1 FROM payments p WHERE u.id = p.receiver)
;
I think the query below is the most efficient solution:
select distinct id
from payments, unnest([sender, receiver]) id
join user
using(id)
Applied to the sample data in your question, the output is ids 1 and 3.
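The variants above can be compared on the question's sample data. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than on BigQuery, the table is called users here (a hypothetical rename of user), and because SQLite has no UNNEST, the last form is emulated by stacking the sender and receiver columns with UNION ALL. The extra payment inserted at the end is made up to show a user who is both sender and receiver.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY);
CREATE TABLE payments (sender INTEGER, receiver INTEGER);
INSERT INTO users    VALUES (1), (2), (3);
INSERT INTO payments VALUES (1, 4), (1, 3), (5, 3), (4, 5);
""")

def ids(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

SENDERS = "SELECT id FROM users u WHERE EXISTS (SELECT 1 FROM payments p WHERE u.id = p.sender)"
RECEIVERS = "SELECT id FROM users u WHERE EXISTS (SELECT 1 FROM payments p WHERE u.id = p.receiver)"

# Two EXISTS clauses combined with OR.
two_exists = ids("""
SELECT id FROM users u
WHERE EXISTS (SELECT 1 FROM payments p WHERE u.id = p.sender)
   OR EXISTS (SELECT 1 FROM payments p WHERE u.id = p.receiver)
""")

# One query per role, merged with a de-duplicating UNION.
union_form = ids(SENDERS + " UNION " + RECEIVERS)

# Emulation of the UNNEST idea: unpivot both columns, then join.
unpivot = ids("""
SELECT DISTINCT u.id
FROM (SELECT sender AS id FROM payments
      UNION ALL
      SELECT receiver FROM payments) x
JOIN users u ON u.id = x.id
""")

# Hypothetical extra row: users 1 and 3 now appear in both roles.
conn.execute("INSERT INTO payments VALUES (3, 1)")
union_all_after = ids(SENDERS + " UNION ALL " + RECEIVERS)
union_after = ids(SENDERS + " UNION " + RECEIVERS)

print(two_exists, union_form, unpivot, union_all_after, union_after)
```

On the original rows all three forms agree. Once a user appears in both roles, UNION ALL lists that user twice, so a de-duplicating union (or an outer DISTINCT) is needed to match the EXISTS version.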

Get active and total bookings (from 1 table) for every user in users table

I have 2 tables:
users:
username (pk)
bookings:
username (fk)
status (A = Active, C = Cancelled , D = DONE)
I want to show user details along with their counts of active and total bookings (where total bookings means all the entries in the "bookings" table for a particular user).
Table to show:
username, active bookings (count), total bookings (count)
Currently I'm unable to make an efficient query for this.
My DB is postgresql.
Please assist.
Thank you
As you are using PostgreSQL, you can take advantage of the FILTER clause. You also have to use a LEFT JOIN, because you want the details for every user in the users table. So write your query like this:
select
t1.username,
count(*) filter (where t2.status='A') as "Active_Bookings",
count(t2.*) as "Total_Bookings"
from users t1 left join bookings t2 on t1.username=t2.username
group by 1
Edit as per comment:
The FILTER clause is supported by PostgreSQL and SQLite. For other databases, COUNT with a CASE expression does the same job. The query below should work in almost every other database:
select
t1.username,
count(case when t2.status='A' then 1 end) as "Active_Bookings",
count(t2.username) as "Total_Bookings"
from users t1 left join bookings t2 on t1.username=t2.username
group by t1.username
You can also use sum(case when t2.status='A' then 1 else 0 end) as "Active_Bookings".
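As a rough check that the FILTER and CASE forms agree, and that the LEFT JOIN is what keeps users without bookings, here is a small sketch on SQLite via Python's sqlite3 module. It assumes SQLite 3.30 or newer (needed for FILTER on aggregates); the usernames and booking rows are invented for illustration, and count(b.username) is used for the total so that unmatched users count as 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (username TEXT PRIMARY KEY);
CREATE TABLE bookings (username TEXT REFERENCES users(username), status TEXT);
INSERT INTO users    VALUES ('alice'), ('bob'), ('carol');
INSERT INTO bookings VALUES ('alice', 'A'), ('alice', 'C'), ('alice', 'A'), ('bob', 'D');
""")

# PostgreSQL/SQLite style: aggregate FILTER clause.
with_filter = conn.execute("""
SELECT u.username,
       count(*) FILTER (WHERE b.status = 'A') AS active_bookings,
       count(b.username)                      AS total_bookings
FROM users u
LEFT JOIN bookings b ON b.username = u.username
GROUP BY u.username
ORDER BY u.username
""").fetchall()

# Portable style: COUNT ignores the NULL produced when the CASE does not match.
with_case = conn.execute("""
SELECT u.username,
       count(CASE WHEN b.status = 'A' THEN 1 END) AS active_bookings,
       count(b.username)                          AS total_bookings
FROM users u
LEFT JOIN bookings b ON b.username = u.username
GROUP BY u.username
ORDER BY u.username
""").fetchall()

# With an inner join, users that have no bookings disappear from the result.
inner_join = conn.execute("""
SELECT u.username,
       count(CASE WHEN b.status = 'A' THEN 1 END),
       count(b.username)
FROM users u
JOIN bookings b ON b.username = u.username
GROUP BY u.username
ORDER BY u.username
""").fetchall()

print(with_filter)
print(inner_join)
```

carol has no bookings, so she shows up with two zeros in the LEFT JOIN results and is missing entirely from the inner-join result.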
You can try the query below:
select u.username,count(b.username) as total_booking,
count(case when b.status='A' then 1 end) as active_bookings
from users u left join bookings b on u.username=b.username
group by u.username
I'm not sure I understood your question, but based on the input you gave, try this; it should work:
select x.username,active_bookings,total_bookings from (
(select username, count(status) as active_bookings from bookings where status='A' group by username)x join (select username,count(status) as total_bookings from bookings group by username)y on x.username=y.username);

join 2 foreign key using subquery

Please help me solve this. I want to join two tables where two different foreign keys in one table reference the same column of the other. The two tables are:
users table
transactions table
I want to return the top 5 transactions by amount, from highest to lowest, showing the transaction id, investor id, investor name, borrower id, borrower name, and amount.
The following query runs properly but does not include the investor name:
select top 5 t.id,
investor_id,
borrower_id,
username as BorrowerName,
amount
from transactions t join users u on t.borrower_id = u.id
order by t.amount desc;
If I add a subquery for the investor name instead, I get an error:
select top 5 t.id,
investor_id,
(select username from users join transactions on users.id =
transactions.investor_id) investorName,
borrower_id,
username BorrowerName,
amount
from transactions t join users u on t.borrower_id = u.id
order by t.amount desc;
select top 5 t.id,
investor_id, ui.username as InvestorName,
borrower_id, ub.username as BorrowerName,
amount
from transactions t
join users ub on t.borrower_id = ub.id
join users ui on t.investor_id = ui.id
order by t.amount desc;
The subquery must be scalar, i.e. return a single value, but yours currently returns a result set.
select top 5 t.id,
investor_id,
(-- Correlated Scalar Subquery, returns a single value
select username
from users
WHERE users.id = t.investor_id) investorName,
borrower_id,
username BorrowerName,
amount
from transactions t join users u on t.borrower_id = u.id
order by t.amount desc;
Isn't this what you want? Two joins on the users table:
SELECT TOP 5
investor_id,
investors.username InvestorName,
borrower_id,
borrowers.username BorrowerName,
amount
FROM
transactions
INNER JOIN users investors ON (transactions.investor_id = investors.id)
INNER JOIN users borrowers ON (transactions.borrower_id = borrowers.id)
ORDER BY
amount desc;
I would recommend against using subqueries in this case, since the database may end up evaluating each subquery once per output row instead of planning a single join.
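The two-alias join can be tried out with a small sketch. It runs on SQLite through Python's sqlite3 module, so LIMIT 5 replaces SQL Server's TOP 5; the usernames and amounts are invented, and the column layout (id, username; id, investor_id, borrower_id, amount) is inferred from the queries above rather than from the original snapshots.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
CREATE TABLE transactions (
    id          INTEGER PRIMARY KEY,
    investor_id INTEGER REFERENCES users(id),
    borrower_id INTEGER REFERENCES users(id),
    amount      INTEGER
);
INSERT INTO users VALUES (1, 'ann'), (2, 'ben'), (3, 'cy');
INSERT INTO transactions VALUES (10, 1, 2, 500), (11, 3, 1, 900), (12, 2, 3, 100);
""")

# Join users twice, once per foreign key, under two different aliases.
two_joins = conn.execute("""
SELECT t.id,
       t.investor_id, ui.username AS investor_name,
       t.borrower_id, ub.username AS borrower_name,
       t.amount
FROM transactions t
JOIN users ui ON ui.id = t.investor_id
JOIN users ub ON ub.id = t.borrower_id
ORDER BY t.amount DESC
LIMIT 5
""").fetchall()

# Same result with a correlated scalar subquery for the investor name.
scalar_subquery = conn.execute("""
SELECT t.id,
       t.investor_id,
       (SELECT username FROM users WHERE users.id = t.investor_id) AS investor_name,
       t.borrower_id, ub.username AS borrower_name,
       t.amount
FROM transactions t
JOIN users ub ON ub.id = t.borrower_id
ORDER BY t.amount DESC
LIMIT 5
""").fetchall()

for row in two_joins:
    print(row)
```

Note that the subquery is correlated through the outer alias t; once a table is aliased, the alias is the name to reference.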

Getting Counts Per User Across Tables in POSTGRESQL

I'm new to PostgreSQL. I have a database that has three tables in it: Users, Orders, Comments. Those three tables look like this
Orders Comments
------ --------
ID ID
UserID UserID
Description Details
CreatedOn CreatedOn
I'm trying to get a list of all of my users and how many orders each user has made and how many comments each user has made. In other words, the result of the query should look like this:
UserID Orders Comments
------ ------ --------
1 5 7
2 2 9
3 0 0
...
Currently, I'm trying the following:
SELECT
UserID,
(SELECT COUNT(ID) FROM Orders WHERE UserID=ID) AS Orders,
(SELECT COUNT(ID) FROM Comments WHERE UserID=ID) AS Comments
FROM
Orders o,
Comments c
WHERE
o.UserID = c.UserID
Is this the right way to do this type of query? Or can someone provide a better approach from a performance standpoint?
select
id, name,
coalesce(orders, 0) as orders,
coalesce(comments, 0) as comments
from
users u
left join
(
select userid as id, count(*) as orders
from orders
group by userid
) o using (id)
left join
(
select userid as id, count(*) as comments
from comments
group by userid
) c using (id)
order by name
The usual way to do this is by using outer joins to the two other tables and then group by the id (and name)
select u.id,
u.name,
count(distinct o.id) as num_orders,
count(distinct c.id) as num_comments
from users u
left join orders o on o.userId = u.id
left join comments c on c.userId = u.id
group by u.id, u.name
order by u.name;
That might very well be faster than your approach. But Postgres' query optimizer is quite smart and I have seen situations where both solutions are essentially equal in performance.
You will need to test that on your data and also have a look at the execution plans in order to find out which one is more efficient.
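Both answers can be checked against each other on a toy data set. The sketch below uses Python's sqlite3 module in place of Postgres, with made-up users, orders and comments. It also shows why the DISTINCT in the second answer is not optional: joining orders and comments to the same user pairs every order with every comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders   (id INTEGER PRIMARY KEY, userid INTEGER);
CREATE TABLE comments (id INTEGER PRIMARY KEY, userid INTEGER);
INSERT INTO users    VALUES (1, 'ann'), (2, 'ben'), (3, 'cy');
INSERT INTO orders   VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO comments VALUES (1, 1), (2, 1), (3, 1);
""")

# Aggregate each child table on its own, then join one row per user.
pre_aggregated = conn.execute("""
SELECT u.id, COALESCE(o.orders, 0), COALESCE(c.comments, 0)
FROM users u
LEFT JOIN (SELECT userid, count(*) AS orders   FROM orders   GROUP BY userid) o
       ON o.userid = u.id
LEFT JOIN (SELECT userid, count(*) AS comments FROM comments GROUP BY userid) c
       ON c.userid = u.id
ORDER BY u.id
""").fetchall()

JOINED = """
SELECT u.id, {orders}, {comments}
FROM users u
LEFT JOIN orders   o ON o.userid = u.id
LEFT JOIN comments c ON c.userid = u.id
GROUP BY u.id
ORDER BY u.id
"""

# Join first, then count distinct ids to undo the row multiplication.
count_distinct = conn.execute(
    JOINED.format(orders="count(DISTINCT o.id)", comments="count(DISTINCT c.id)")
).fetchall()

# Without DISTINCT, user 1 (2 orders x 3 comments) is counted 6 times.
count_plain = conn.execute(
    JOINED.format(orders="count(o.id)", comments="count(c.id)")
).fetchall()

print(pre_aggregated)
print(count_plain)
```

Which of the two correct forms is faster still depends on the data and the plan, as noted above.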

Getting highest results in a JOIN

I've got three tables; Auctions, Auction Bids and Users. The table structure looks something like this:
Auctions:
id title
-- -----
1 Auction 1
2 Auction 2
Auction Bids:
id user_id auction_id bid_amt
-- ------- ---------- -------
1 1 1 200.00
2 2 1 202.00
3 1 2 100.00
Users is just a standard table, with id and user name.
My aim is to join these tables so I can get the highest values of these bids, as well as get the usernames related to those bids; so I have a result set like so:
auction_id auction_title auctionbid_amt user_username
---------- ------------- -------------- -------------
1 Auction 1 202.00 Bidder2
2 Auction 2 100.00 Bidder1
So far my query is as follows:
SELECT a.id, a.title, ab.bid_amt, u.display_name FROM auction a
LEFT JOIN auctionbid ab ON a.id = ab.auction_id
LEFT JOIN users u ON u.id = ab.user_id
GROUP BY a.id
This gets the single rows I am after, but it seems to display the lowest bid_amt, not the highest.
You can use the MAX function and a sub-select to get the maximum bid for each auction. If you join this sub-select with your other tables and set the WHERE clause as follows, you should get what you are looking for.
SELECT a.id, a.title, ab.bid_amt, u.display_name
FROM auction AS a
INNER JOIN (SELECT auction_id, MAX(bid_amt) AS maxAmount FROM auctionbid GROUP BY auction_id) AS maxBids ON maxBids.auction_id = a.id
INNER JOIN auctionbid AS ab ON a.id = ab.auction_id
INNER JOIN users AS u ON u.id = ab.user_id
WHERE ab.auction_id = maxBids.auction_id AND ab.bid_amt = maxBids.maxAmount
Hope that helps.
This is a typical within-group aggregate problem. You can solve it using a so-called left self-exclusion join.
Try the following:
SELECT a.id, a.title, ab.bid_amt, u.display_name
FROM auction a
INNER JOIN auctionbid ab ON ab.auction_id = a.id
LEFT JOIN auctionbid b1 ON ab.auction_id = b1.auction_id
AND ab.bid_amt < b1.bid_amt
LEFT JOIN users u ON u.id = ab.user_id
WHERE b1.auction_id IS NULL
It joins each bid to every higher bid on the same auction; a bid for which no such match exists (b1 is NULL) is the highest one.
Another solution would be to use multiple queries (of course) or a temporary aggregate table.
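The MAX sub-select and the self-exclusion join can both be run against the question's sample rows. This is a sketch on SQLite via Python's sqlite3 module; the table and column names (auction, auctionbid, bid_amt, users.display_name) follow the question's query, and the two display names are taken from its expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE auction    (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE users      (id INTEGER PRIMARY KEY, display_name TEXT);
CREATE TABLE auctionbid (id INTEGER PRIMARY KEY, user_id INTEGER,
                         auction_id INTEGER, bid_amt REAL);
INSERT INTO auction    VALUES (1, 'Auction 1'), (2, 'Auction 2');
INSERT INTO users      VALUES (1, 'Bidder1'), (2, 'Bidder2');
INSERT INTO auctionbid VALUES (1, 1, 1, 200.00), (2, 2, 1, 202.00), (3, 1, 2, 100.00);
""")

# Variant 1: compute the maximum per auction, then join back to find the bid row.
max_subselect = conn.execute("""
SELECT a.id, a.title, ab.bid_amt, u.display_name
FROM auction a
JOIN (SELECT auction_id, MAX(bid_amt) AS max_amt
      FROM auctionbid GROUP BY auction_id) m ON m.auction_id = a.id
JOIN auctionbid ab ON ab.auction_id = a.id AND ab.bid_amt = m.max_amt
JOIN users u ON u.id = ab.user_id
ORDER BY a.id
""").fetchall()

# Variant 2: self-exclusion join, keeping bids that have no higher sibling.
self_exclusion = conn.execute("""
SELECT a.id, a.title, ab.bid_amt, u.display_name
FROM auction a
JOIN auctionbid ab ON ab.auction_id = a.id
LEFT JOIN auctionbid b1 ON b1.auction_id = ab.auction_id
                       AND ab.bid_amt < b1.bid_amt
LEFT JOIN users u ON u.id = ab.user_id
WHERE b1.auction_id IS NULL
ORDER BY a.id
""").fetchall()

for row in max_subselect:
    print(row)
```

Both variants return every top bid when there is a tie, so an auction with two equal highest bids would appear twice.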
Try this:
SELECT a.id, a.title, ab.bid_amt, u.display_name FROM auction a
LEFT JOIN auctionbid ab ON a.id = ab.auction_id
LEFT JOIN users u ON u.id = ab.user_id
GROUP BY a.id
ORDER BY ab.bid_amt DESC
If that doesn't work, try using a subselect on auctionbid containing something like
SELECT auction_id, MAX(bid_amt) FROM auctionbid GROUP BY auction_id
Try adding the following clause; I'm not sure about performance.
WHERE NOT EXISTS
(SELECT * FROM auctionbid abhigher
WHERE abhigher.auction_id = ab.auction_id
AND abhigher.bid_amt > ab.bid_amt)
This excludes any bid that has a higher bid for the same auction.
The only problem is that if two bids are equal, both will be listed. One way to get rid of the duplicate, although it is a relatively arbitrary choice of winner, is to use the bid id:
WHERE NOT EXISTS
(SELECT * FROM auctionbid abhigher
WHERE abhigher.auction_id = ab.auction_id
AND (abhigher.bid_amt > ab.bid_amt
OR (abhigher.bid_amt = ab.bid_amt AND abhigher.id > ab.id)))
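Here is one way to write the id tie-break so that strictly higher bids are still excluded, sketched on SQLite via Python's sqlite3 module. The fourth bid is a made-up row that ties with bid 3 on auction 2; the rule that the higher bid id wins is the arbitrary choice mentioned above, and the column name bid_amt follows the question's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE auction    (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE users      (id INTEGER PRIMARY KEY, display_name TEXT);
CREATE TABLE auctionbid (id INTEGER PRIMARY KEY, user_id INTEGER,
                         auction_id INTEGER, bid_amt REAL);
INSERT INTO auction    VALUES (1, 'Auction 1'), (2, 'Auction 2');
INSERT INTO users      VALUES (1, 'Bidder1'), (2, 'Bidder2');
INSERT INTO auctionbid VALUES (1, 1, 1, 200.00), (2, 2, 1, 202.00), (3, 1, 2, 100.00);
INSERT INTO auctionbid VALUES (4, 2, 2, 100.00);  -- hypothetical tie on auction 2
""")

BASE = """
SELECT a.id, ab.id, ab.bid_amt, u.display_name
FROM auction a
JOIN auctionbid ab ON ab.auction_id = a.id
JOIN users u ON u.id = ab.user_id
WHERE NOT EXISTS (
    SELECT 1 FROM auctionbid h
    WHERE h.auction_id = ab.auction_id
      AND {beats}
)
ORDER BY a.id, ab.id
"""

# Only strictly higher bids exclude a row: both tied bids survive.
with_ties = conn.execute(BASE.format(beats="h.bid_amt > ab.bid_amt")).fetchall()

# A bid is beaten by a higher amount, or by an equal amount with a higher id.
tie_broken = conn.execute(BASE.format(
    beats="(h.bid_amt > ab.bid_amt OR (h.bid_amt = ab.bid_amt AND h.id > ab.id))"
)).fetchall()

print(with_ties)
print(tie_broken)
```

With the tie-break in place, each auction yields exactly one row, because no two bids share both amount and id.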
Here is an old-school approach you can try; there is no need for a left join or anything else. The rest depends on your exact requirements.
select A.id,A.title,max(AB.bid_amt),name
from Auction A,AuctionBids AB,Users U
where U.ID=AB.USER_ID AND A.ID=AB.AUCTION_ID
group by A.ID,A.title,name