SQL SELECT statement with a many-to-many extra table - sql

I have a table that links users. Consider the following:
**Table contracts:**
contract_id int,
contract_number varchar,
user_id int
**Table users:**
user_id int
**Table user_links**
user_id int,
linked_user_id int
The user_links table may have no rows for a particular user_id when that user has no linked users, so a select against it can return either a row or NULL.
The approach with
left join user_links ul on ul.user_id = contracts.user_id OR ul.linked_user_id = contracts.user_id doesn't seem to work if there is no row in the user_links table.
Given only an int user_id, how can I get rows from the contracts table for both user_id AND linked_user_id?
For example, if the user_id 1 has a linked_user_id 2, I need the rows from contracts for both users; however, if the user doesn't have a row in user_links table, I still need to get their contracts.

Assuming your input user_id is the variable @user_id, the query below returns all contracts of that user and of any linked users.
SELECT * from contracts c
where c.user_id = @user_id
OR c.user_id IN ( SELECT linked_user_id from user_links ul
WHERE ul.user_id = @user_id)
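To check this query against sample data, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the helper name contracts_for are assumptions made for illustration, and the user id variable is passed as the named parameter :user_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE contracts (
        contract_id INTEGER PRIMARY KEY,
        contract_number TEXT,
        user_id INTEGER
    );
    CREATE TABLE user_links (user_id INTEGER, linked_user_id INTEGER);

    -- Hypothetical sample data: user 1 is linked to user 2, user 3 has no links.
    INSERT INTO users VALUES (1), (2), (3);
    INSERT INTO contracts VALUES (10, 'C-10', 1), (20, 'C-20', 2), (30, 'C-30', 3);
    INSERT INTO user_links VALUES (1, 2);
""")


def contracts_for(user_id):
    """Return contract numbers for a user and for any users linked from them."""
    rows = conn.execute(
        """
        SELECT c.contract_number
        FROM contracts c
        WHERE c.user_id = :user_id
           OR c.user_id IN (SELECT ul.linked_user_id
                            FROM user_links ul
                            WHERE ul.user_id = :user_id)
        ORDER BY c.contract_id
        """,
        {"user_id": user_id},
    ).fetchall()
    return [number for (number,) in rows]


print(contracts_for(1))  # user with a row in user_links
print(contracts_for(3))  # user without any row in user_links
```

Note that the subquery follows links in one direction only: asking for user 2 returns just user 2's contracts. If links should work both ways, a second IN subquery selecting user_id where linked_user_id matches would be needed.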

Related

Insert many foreign-key related records in one query

I am trying to come up with a single query that helps me atomically normalise a table (populate a new table with initial values from another table, and simultaneously add the foreign key reference to it)
Obviously populating one table from another is a standard INSERT INTO ... SELECT ..., but I also want to update the foreign key reference in the 'source' table to reference the new record in the 'new' table.
Let's say I am migrating a schema from:
CREATE TABLE companies (
id INTEGER PRIMARY KEY,
address_line_1 TEXT,
address_line_2 TEXT,
address_line_3 TEXT
)
to:
CREATE TABLE addresses (
id INTEGER PRIMARY KEY,
line_1 TEXT,
line_2 TEXT,
line_3 TEXT
);
CREATE TABLE companies (
id INTEGER PRIMARY KEY,
address_id INTEGER REFERENCES addresses(id)
)
... I thought perhaps a CTE might help of the form
WITH new_addresses AS (
INSERT INTO addresses (line_1, line_2, line_3)
SELECT address_line_1, address_line_2, address_line_3
FROM companies
RETURNING id, companies.id AS company_id -- DOESN'T WORK
)
UPDATE companies
SET address_id = new_addresses.id
FROM new_addresses
WHERE new_addresses.company_id = companies.id
It seems, however, that RETURNING can only return data from the inserted record, so this will not work.
I assume at this point that the answer will be to either use PLSQL or incorporate domain knowledge of the data to do this in a multi step process. My current solution is pretty much:
-- FIRST QUERY
-- ensure address record IDs will be in the same sequential order as their relating companies
INSERT INTO addresses (line_1, line_2, line_3)
SELECT address_line_1, address_line_2, address_line_3
FROM companies
ORDER BY id;
-- SECOND QUERY
-- join, making the assumption that the table IDs are in the same order
WITH address_ids AS (
SELECT id AS address_id, ROW_NUMBER() OVER(ORDER BY id) AS idx
FROM addresses
), company_ids AS (
SELECT id AS company_id, ROW_NUMBER() OVER(ORDER BY id) AS idx
FROM companies
), company_address_ids AS (
SELECT company_id, address_id
FROM address_ids
JOIN company_ids USING (idx)
)
UPDATE companies
SET address_id = company_address_ids.address_id
FROM company_address_ids
WHERE id = company_address_ids.company_id
This is obviously problematic in that it relies on the addresses table containing exactly as many records as the company table, but such a query would be a one-off when the table is first created.
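The two-step approach above can be tried out on sample data. The following is a rough sketch using Python's sqlite3 module rather than the asker's database; the transitional schema (companies keeping its old address columns alongside the new address_id) and the sample rows are assumptions, and the UPDATE uses a correlated subquery instead of UPDATE ... FROM so it runs on older SQLite builds (window functions still need SQLite 3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Transitional schema (an assumption): companies keeps its old address
    -- columns and already has the new address_id column.
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        line_1 TEXT, line_2 TEXT, line_3 TEXT
    );
    CREATE TABLE companies (
        id INTEGER PRIMARY KEY,
        address_line_1 TEXT, address_line_2 TEXT, address_line_3 TEXT,
        address_id INTEGER REFERENCES addresses(id)
    );
    -- Hypothetical rows with non-contiguous ids.
    INSERT INTO companies (id, address_line_1, address_line_2, address_line_3)
    VALUES (5, '1 High St', 'Leeds', 'UK'),
           (9, '2 Low Rd', 'York', 'UK'),
           (42, '3 Mid Ln', 'Bath', 'UK');
""")

with conn:  # commit both statements together, or roll back on an exception
    # Step 1: copy the addresses in company id order.
    conn.execute("""
        INSERT INTO addresses (line_1, line_2, line_3)
        SELECT address_line_1, address_line_2, address_line_3
        FROM companies
        ORDER BY id
    """)
    # Step 2: pair the n-th address with the n-th company by row number.
    conn.execute("""
        WITH address_ids AS (
            SELECT id AS address_id, ROW_NUMBER() OVER (ORDER BY id) AS idx
            FROM addresses
        ), company_ids AS (
            SELECT id AS company_id, ROW_NUMBER() OVER (ORDER BY id) AS idx
            FROM companies
        )
        UPDATE companies
        SET address_id = (
            SELECT a.address_id
            FROM address_ids a
            JOIN company_ids c USING (idx)
            WHERE c.company_id = companies.id
        )
    """)

# Count companies whose linked address is missing or does not match.
mismatches = conn.execute("""
    SELECT COUNT(*)
    FROM companies c
    LEFT JOIN addresses a ON a.id = c.address_id
    WHERE a.id IS NULL OR a.line_1 <> c.address_line_1
""").fetchone()[0]
print(mismatches)
```

The pairing is only valid while addresses holds exactly the rows inserted in step 1, which is the same caveat the question already raises.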

How to get last edited post of every user in PostgreSQL?

I have user data in two tables like
1. USERID | USERPOSTID
2. USERPOSTID | USERPOST | LAST_EDIT_TIME
How do I get the last edited post and its time for every user? Assume that every user has 5 posts, and each one is edited at least once.
Will I have to write a loop iterating over every user, find the USERPOST with MAX(LAST_EDIT_TIME) and then collect the values? I tried GROUP BY, but I can't put USERPOSTID or USERPOST in an aggregate function. TIA.
Seems like something like this should work:
create table users(
id serial primary key,
username varchar(50)
);
create table posts(
id serial primary key,
userid integer references users(id),
post_text text,
update_date timestamp default current_timestamp
);
insert into users(username)values('Kalpit');
insert into posts(userid,post_text)values(1,'first test');
insert into posts(userid,post_text)values(1,'second test');
select *
from users u
join posts p on p.userid = u.id
where p.update_date =
( select max( update_date )
from posts
where userid = u.id )
fiddle: http://sqlfiddle.com/#!15/4b240/4/0
You can use a window function here:
select
USERS.USERID
, most_recent.USERPOSTID
from
USERS
left join (
select
USERID
, USERPOSTID
, row_number() over (
partition by USERID
order by LAST_EDIT_TIME desc) row_num
from
USERPOST -- assumed to carry USERID, USERPOSTID and LAST_EDIT_TIME
) most_recent
on most_recent.USERID = USERS.USERID
and row_num = 1
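The row_number() idea can also be applied to the two-table layout from the question. Here is a small sketch run through Python's sqlite3 module (window functions need SQLite 3.25 or later); the table names user_posts and posts and the sample rows are assumptions, since the question does not name its tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical names for the two tables described in the question.
    CREATE TABLE user_posts (userid INTEGER, userpostid INTEGER);
    CREATE TABLE posts (
        userpostid INTEGER PRIMARY KEY,
        userpost TEXT,
        last_edit_time TEXT
    );
    INSERT INTO user_posts VALUES (1, 101), (1, 102), (2, 201), (2, 202);
    INSERT INTO posts VALUES
        (101, 'a1', '2014-01-01 10:00:00'),
        (102, 'a2', '2014-01-03 09:00:00'),
        (201, 'b1', '2014-02-05 12:00:00'),
        (202, 'b2', '2014-02-01 08:00:00');
""")

# Rank each user's posts by edit time, newest first, and keep rank 1.
latest = conn.execute("""
    SELECT userid, userpostid, userpost, last_edit_time
    FROM (
        SELECT up.userid, p.userpostid, p.userpost, p.last_edit_time,
               ROW_NUMBER() OVER (PARTITION BY up.userid
                                  ORDER BY p.last_edit_time DESC) AS row_num
        FROM user_posts up
        JOIN posts p ON p.userpostid = up.userpostid
    ) AS ranked
    WHERE row_num = 1
    ORDER BY userid
""").fetchall()
print(latest)
```

Unlike the MAX(update_date) comparison, row_number() returns exactly one row per user even when two posts share the same edit time; which of the tied posts wins is then arbitrary unless a tie-breaker is added to the ORDER BY.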

Select rows where the last row of associated table has a specific value

I have two tables:
User (id, name)
UserEvent (id, user_id, name, date)
How can I get all the users where the last (ordered by date) UserEvent.name has a value of 'played'?
I wrote an example on SQLFiddle with some specific data: http://sqlfiddle.com/#!9/b76e24 - For this scenario I would just get 'Mery' from table User, because even though 'John' has associated events, the name of his last one is not 'played'.
This is probably fastest:
SELECT u.*
FROM usr u -- avoiding "User" as table name
JOIN LATERAL (
SELECT name
FROM userevent
WHERE user_id = u.id
ORDER BY date DESC NULLS LAST
LIMIT 1
) ue ON ue.name = 'played';
LATERAL requires Postgres 9.3+:
What is the difference between LATERAL and a subquery in PostgreSQL?
Or you could use DISTINCT ON (faster for few rows per user):
SELECT u.*
FROM usr u -- avoiding "User" as table name
JOIN (
SELECT DISTINCT ON (user_id)
user_id, name
FROM userevent
ORDER BY user_id, date DESC NULLS LAST
) ue ON ue.user_id = u.id
AND ue.name = 'played';
Details for DISTINCT ON:
Select first row in each GROUP BY group?
SQL Fiddle with valid test case.
If date is defined NOT NULL, you don't need NULLS LAST. (Nor in the index below.)
PostgreSQL sort by datetime asc, null first?
Key to read performance for both but especially the first query is a matching multicolumn index:
CREATE INDEX userevent_foo_idx ON userevent (user_id, date DESC NULLS LAST, name);
Optimize GROUP BY query to retrieve latest record per user
Aside: Never use reserved words as identifiers.
Return the max date from user event grouping by user id. Take that result set and join it back to user event by the user id and max date and filter for just the played records.
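The description above can be sketched as follows, run through Python's sqlite3 module. The table names usr and userevent follow the earlier answer; the sample rows (including a user with no events) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE usr (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE userevent (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        name TEXT,
        date TEXT
    );
    -- Hypothetical data: Mery's last event is 'played', John's is not.
    INSERT INTO usr VALUES (1, 'John'), (2, 'Mery'), (3, 'Nobody');
    INSERT INTO userevent (user_id, name, date) VALUES
        (1, 'played',  '2016-01-01'),
        (1, 'stopped', '2016-01-02'),
        (2, 'stopped', '2016-01-01'),
        (2, 'played',  '2016-01-03');
""")

# 1. max date per user, 2. join back to the event on that date,
# 3. keep only users whose latest event is 'played'.
names = [row[0] for row in conn.execute("""
    SELECT u.name
    FROM usr u
    JOIN (SELECT user_id, MAX(date) AS last_date
          FROM userevent
          GROUP BY user_id) latest ON latest.user_id = u.id
    JOIN userevent ue ON ue.user_id = latest.user_id
                     AND ue.date = latest.last_date
    WHERE ue.name = 'played'
    ORDER BY u.name
""")]
print(names)
```

If a user can have two events with the same latest date, the join back matches both, so a user whose tied events include 'played' would be returned.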
Here it is:
First I get the max id for each user (this assumes ids increase with date) and join it back to the row with that id
to test whether its name is 'played'; if so, I get the user's name.
SELECT
ids.*,
u.name,
ue.*
FROM (
SELECT max(id) AS id from UserEvent
GROUP by user_id
) as ids
LEFT JOIN UserEvent ue ON ue.id = ids.id
LEFT JOIN User u ON u.id = ue.user_id
WHERE ue.name = 'played';

SQL: Selecting all from a table, where related table, doesn't have a specified field value

I have a database, with a table for venues and bookings. The relationship is 1:M.
I want to select all venues that don't have a booking on a specific date. The date field is called booking_date and is present on the bookings table. I'm not the strongest in SQL; I found NOT EXISTS, but it seems to give me either no venues or all the venues, depending on whether the date is present in just one of the booking_date fields.
To sum up, I need a query that selects all venues that don't have a booking with booking_date = ?.
Venues table:
id, int
name, string
other unimportant fields
Bookings table:
id, int
customer_id, int
venue_id, int
booking_date, date
So a booking belongs to a venue. I want all venues that don't have a booking where the booking_date field equals a specified date.
For instance, if I have 5 venues, one of which has a booking on 2014-06-09, and I supply that date, I want the other 4 venues, which don't have a booking on that date.
If anyone is interested, the use for this is to show the venues that are available on a date the user specifies.
NOT EXISTS sounds like exactly what you need:
SELECT *
FROM Venues V
WHERE NOT EXISTS(SELECT 1 FROM bookings
WHERE booking_date = 'SomeDate'
AND venue_id = V.id)
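To see the NOT EXISTS query behave as described in the question (5 venues, one booked on the given date), here is a small sketch using Python's sqlite3 module. The column names follow the question's tables; the sample rows and the helper name available_venues are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE venues (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bookings (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        venue_id INTEGER,
        booking_date TEXT
    );
    -- Hypothetical data: venue C is booked on 2014-06-09.
    INSERT INTO venues VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'E');
    INSERT INTO bookings (customer_id, venue_id, booking_date) VALUES
        (7, 3, '2014-06-09'),
        (8, 3, '2014-06-10'),
        (9, 1, '2014-06-10');
""")


def available_venues(day):
    """Return names of venues with no booking on the given ISO date."""
    rows = conn.execute(
        """
        SELECT v.name
        FROM venues v
        WHERE NOT EXISTS (SELECT 1
                          FROM bookings b
                          WHERE b.booking_date = ?
                            AND b.venue_id = v.id)  -- ties the subquery to this venue
        ORDER BY v.id
        """,
        (day,),
    ).fetchall()
    return [name for (name,) in rows]


print(available_venues("2014-06-09"))
```

The condition b.venue_id = v.id is what makes the check per venue. Without it the subquery is the same for every row, so the outer query returns either all venues or none, which matches the symptom described in the question.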
I would take care of this in the WHERE (making some assumptions on your tables):
DECLARE @DateCheck date = '2014-05-09';
SELECT
*
FROM
Venues
WHERE
VenueId NOT IN
(
SELECT
VenueId
FROM
Bookings
WHERE
BookingDate = @DateCheck
);
Check the below query
select v.* from
tbl_venues v
left join tbl_bookings b on v.VenueID=b.VenueID and b.booking_date ='2014-05-02'
where b.bookingID is null
Here bookingID is the primary key column of the bookings table
and venueID is the primary key column of the venues table.
You haven't included exact table structure, but it sounds like this would work.
DECLARE @SomeDate DATE='05/10/2014'
SELECT v.*
FROM Venues v
LEFT JOIN Bookings b on b.VenuesID=v.VenuesID AND CONVERT(DATE,b.Booking_Date)=@SomeDate
WHERE b.Booking_Date IS NULL

Find Specific Rows

I'm trying to build a rather specific query to find a set of user_ids based on topics they have registered to.
Unfortunately it's not possible to refactor the tables so I have to go with what I've got.
Single table with user_id and registration_id
I need to find all user_ids that have a registration_id of (4 OR 5) AND NOT 1
Each row is a single user_id/registration_id combination.
My SQL skills aren't the best, so I'm really scratching my brain. Any help would be greatly appreciated.
SELECT *
FROM (
SELECT DISTINCT user_id
FROM registrations
) ro
WHERE user_id IN
(
SELECT user_id
FROM registrations ri
WHERE ri.registration_id IN (4, 5)
)
AND user_id NOT IN
(
SELECT user_id
FROM registrations ri
WHERE ri.registration_id = 1
)
Most probably, (user_id, registration_id) is the PRIMARY KEY in your table. If it's not, create a composite index on (user_id, registration_id) for this to work fast.
Possibly not the best way to do it (my SQL skills aren't the best either), but it should do the job:
SELECT DISTINCT user_id
FROM registrations AS t
WHERE registration_id IN (4, 5)
AND NOT EXISTS (SELECT 1
FROM registrations
WHERE user_id = t.user_id
AND registration_id = 1);
Another way with eliminating duplicates of user_id:
SELECT user_id
FROM registrations
WHERE registration_id IN (4, 5)
except
SELECT user_id
FROM registrations
WHERE registration_id =1
With a single pass over the table (this assumes registration_id values are integers no smaller than 1):
select user_id from registrations
where registration_id <=5
group by user_id
having MIN(registration_id)>1 and MAX(registration_id)>= 4
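To compare the approaches above on the same data, here is a small sketch using Python's sqlite3 module. The table name registrations follows the answers; the sample rows and the QUERIES dictionary are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registrations (
        user_id INTEGER,
        registration_id INTEGER,
        PRIMARY KEY (user_id, registration_id)
    );
    INSERT INTO registrations VALUES
        (1, 4), (1, 5),   -- qualifies: has 4 and 5, no 1
        (2, 1), (2, 4),   -- excluded: also has 1
        (3, 5), (3, 9),   -- qualifies: has 5, no 1
        (4, 2), (4, 3),   -- excluded: neither 4 nor 5
        (5, 1);           -- excluded
""")

QUERIES = {
    "except": """
        SELECT user_id FROM registrations WHERE registration_id IN (4, 5)
        EXCEPT
        SELECT user_id FROM registrations WHERE registration_id = 1
    """,
    "not_exists": """
        SELECT DISTINCT user_id
        FROM registrations AS t
        WHERE registration_id IN (4, 5)
          AND NOT EXISTS (SELECT 1
                          FROM registrations
                          WHERE user_id = t.user_id
                            AND registration_id = 1)
    """,
    "min_max": """
        SELECT user_id
        FROM registrations
        WHERE registration_id <= 5
        GROUP BY user_id
        HAVING MIN(registration_id) > 1 AND MAX(registration_id) >= 4
    """,
}

# Run each variant and collect the matching user ids in sorted order.
results = {
    name: sorted(row[0] for row in conn.execute(sql))
    for name, sql in QUERIES.items()
}
print(results)
```

On this data all three variants agree. The MIN/MAX version depends on the numeric layout of the ids (it would wrongly drop a user holding registration 0 and 4), so the EXCEPT and NOT EXISTS forms are the safer choice if the id range is not known.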