SQL: many-to-many relationship, IN condition

I have a table called transactions with a many-to-many relationship to items through the items_transactions table.
I want to do something like this:
SELECT "transactions".*
FROM "transactions"
INNER JOIN "items_transactions"
ON "items_transactions".transaction_id = "transactions".id
INNER JOIN "items"
ON "items".id = "items_transactions".item_id
WHERE (items.id IN (<list of items>))
But this returns every transaction that has one or more of the items in the list associated with it. I only want the transactions that are associated with all of those items.
Any help would be appreciated.

You have to expand out your query for all of the items in the list:
SELECT "transactions".*
FROM "transactions"
WHERE EXISTS (SELECT 1 FROM "items_transactions"
INNER JOIN "items" ON "items".id = "items_transactions".item_id
WHERE "items_transactions".transaction_id = "transactions".id
AND "items".id = <first item in list>)
AND EXISTS (SELECT 1 FROM "items_transactions"
INNER JOIN "items" ON "items".id = "items_transactions".item_id
WHERE "items_transactions".transaction_id = "transactions".id
AND "items".id = <second item in list>)
...
You might also be able to massage it out using IN and COUNT(DISTINCT ...); I'm not sure which would be faster. Something like (completely untested):
SELECT "transactions".*
FROM "transactions"
INNER JOIN (SELECT "items_transactions".transaction_id
FROM "items_transactions"
INNER JOIN "items" ON "items".id = "items_transactions".item_id
WHERE "items".id IN (<list of items>)
GROUP BY "items_transactions".transaction_id
HAVING COUNT(DISTINCT "items".id) = <count of items in list>) matches ON "transactions".id = matches.transaction_id
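The COUNT(DISTINCT ...) variant can be tried on a small data set. The sketch below is not from the original answer: it uses Python's built-in sqlite3 module with an in-memory database, and the sample rows and the helper name transactions_with_all_items are made up for illustration. The join to items is dropped here on the assumption that items_transactions.item_id already carries the item id.

```python
import sqlite3


def transactions_with_all_items(conn, item_ids):
    """Return ids of transactions linked to every item in item_ids (hypothetical helper)."""
    wanted = sorted(set(item_ids))  # de-duplicate so the expected count is right
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT t.id
        FROM transactions AS t
        INNER JOIN (
            SELECT transaction_id
            FROM items_transactions
            WHERE item_id IN ({placeholders})
            GROUP BY transaction_id
            HAVING COUNT(DISTINCT item_id) = ?
        ) AS matches ON matches.transaction_id = t.id
        ORDER BY t.id
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (id INTEGER PRIMARY KEY);
    CREATE TABLE items (id INTEGER PRIMARY KEY);
    CREATE TABLE items_transactions (transaction_id INTEGER, item_id INTEGER);
    INSERT INTO transactions VALUES (1), (2), (3);
    INSERT INTO items VALUES (1), (2), (3), (4);
    -- transaction 1 has items 1,2,3; transaction 2 has 1,3; transaction 3 has 4
    INSERT INTO items_transactions VALUES (1,1), (1,2), (1,3), (2,1), (2,3), (3,4);
""")

print(transactions_with_all_items(conn, [1, 2, 3]))  # only transaction 1 has all three
print(transactions_with_all_items(conn, [1, 3]))     # transactions 1 and 2 both qualify
```

Binding the list through placeholders keeps the item count and the IN list in sync, which is the part that is easy to get wrong when the SQL is assembled by hand.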

I think this does what you want.
I would put the list of items you need into a table (a temp one will be fine) and join on to that. Then count the number of distinct matching items per transaction and compare that count to the number of items in the list.
I've provided the sample DDL & Data that I used.
Create table #trans
(
transId int identity(1,1),
trans varchar(10)
)
Create Table #itemTrans
(
transId int,
itemId int
)
Create table #items
(
itemId int identity(1,1),
item varchar(10)
)
Create table #itemsToSelect
(
itemId int
)
Insert Into #trans
Values ('Trans 1')
Insert Into #trans
Values ('Trans 2')
Insert Into #trans
Values ('Trans 3')
Insert Into #Items
Values ('Item 1')
Insert Into #Items
Values ('Item 2')
Insert Into #Items
Values ('Item 3')
Insert Into #Items
Values ('Item 4')
Insert Into #itemTrans
Values (1, 1)
Insert Into #itemTrans
Values (1, 2)
Insert Into #itemTrans
Values (1, 3)
Insert Into #itemTrans
Values (2, 1)
Insert Into #itemTrans
Values (2, 3)
Insert Into #itemTrans
Values (3, 4)
Insert Into #itemsToSelect
Values (1)
Insert Into #itemsToSelect
Values (2)
Insert Into #itemsToSelect
Values (3)
Select t.transId
From #items i
Join #itemTrans it on i.itemId = it.itemId
Join #trans t on it.transId = t.transId
Join #itemsToSelect its on it.ItemId = its.ItemId
Where it.TransId is not null
Group by t.transId
Having count(distinct(it.itemId)) = (Select count(distinct(itemId)) from #itemsToSelect)

SELECT transactions.*
FROM transactions
WHERE (SELECT count(*)
FROM items_transactions
WHERE items_transactions.transaction_id = transactions.id
AND items_transactions.item_id IN (<list of items>)
) = <number of items>
Although this will probably do a scan of transactions, nesting the correlated subquery for each one... not particularly efficient, so maybe:
SELECT transactions.*
FROM transactions
WHERE EXISTS (SELECT 1 FROM items_transactions
WHERE items_transactions.transaction_id = transactions.id
AND items_transactions.item_id IN (<list of items>)
)
AND
(SELECT count(*)
FROM items_transactions
WHERE items_transactions.transaction_id = transactions.id
AND items_transactions.item_id IN (<list of items>)
) = <number of items>
or something similar to persuade the DB to find transactions related to at least one of the items first, and then check each transaction is linked against all the items later.
As someone's noted, you can also simply generate join clauses for each item instead, which might well be better if the number of items isn't large.
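One caveat with the count(*) comparison is that it assumes each (transaction_id, item_id) pair appears at most once in items_transactions. The sketch below is an illustration only, using Python's built-in sqlite3 module and made-up rows; it shows how a duplicated link row can satisfy count(*) while COUNT(DISTINCT item_id) still rejects it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (id INTEGER PRIMARY KEY);
    CREATE TABLE items_transactions (transaction_id INTEGER, item_id INTEGER);
    INSERT INTO transactions VALUES (1), (2);
    -- transaction 1 has items 10 and 20; transaction 2 has item 10 recorded twice
    INSERT INTO items_transactions VALUES (1,10), (1,20), (2,10), (2,10);
""")

# Same correlated subquery, with the aggregate swapped in via str.format.
query = """
    SELECT t.id
    FROM transactions AS t
    WHERE (SELECT {agg}
           FROM items_transactions AS it
           WHERE it.transaction_id = t.id
             AND it.item_id IN (10, 20)) = 2
    ORDER BY t.id
"""
with_star = [r[0] for r in conn.execute(query.format(agg="COUNT(*)"))]
with_distinct = [r[0] for r in conn.execute(query.format(agg="COUNT(DISTINCT it.item_id)"))]

print(with_star)      # transaction 2 slips through because of the duplicate row
print(with_distinct)  # only transaction 1 really has both items
```

If the link table has a unique constraint on the pair, the two forms should agree and count(*) is fine.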

I haven't executed this, but that should get you the result you want:
SELECT t.* FROM items i
INNER JOIN items_transactions it ON i.id = it.item_id
INNER JOIN transactions t ON it.transaction_id = t.id
WHERE i.id IN (1,2,3)

The final bit of the query looks wrong:
WHERE (items.id IN (<list of items>))
The IN condition behaves like a big OR rather than an AND, so the optimizer expands it as:
WHERE (items.id = 123 OR items.id = 456 OR items.id = 789)
EDIT
I reckon you need to perform a correlated subquery on the items table.
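The OR behaviour of IN is easy to confirm on toy data. The following sketch uses Python's built-in sqlite3 module and invented rows (it is not part of the original answer). It also shows why simply swapping OR for AND cannot work: a single row never has two different item ids at once, so the "all items" test has to look across rows with a subquery or a GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items_transactions (transaction_id INTEGER, item_id INTEGER);
    INSERT INTO items_transactions VALUES (1,123), (1,456), (2,123), (3,789);
""")

base = "SELECT transaction_id, item_id FROM items_transactions WHERE {cond} ORDER BY 1, 2"
in_rows = conn.execute(base.format(cond="item_id IN (123, 456)")).fetchall()
or_rows = conn.execute(base.format(cond="item_id = 123 OR item_id = 456")).fetchall()
and_rows = conn.execute(base.format(cond="item_id = 123 AND item_id = 456")).fetchall()

print(in_rows == or_rows)  # IN and the expanded OR return the same rows
print(and_rows)            # the per-row AND can never match anything
```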

Related

UPDATE Table based on the same table with relations to other tables

I have two tables:
Product (Id, RefKey, ParentId)
example data: (1, 'SX1234', NULL), (2, 'SX4321', NULL)
and
ProductSTAGE (Id, RefKeyCode, ParentCode)
example data: (1, 'SX1234', 'SX4321')
where Product.RefKey = ProductSTAGE.RefKeyCode
How can I update the Product table based on these relations to get this result:
Product (Id, RefKey, ParentId)
result data: (1, 'SX1234', 2)
I used
WITH CTE
AS (
SELECT P.ParentId FROM Product AS P
)
UPDATE CTE SET ParentId = P2.Id
FROM Product AS P2
INNER JOIN ProductSTAGE AS PS ON PS.RefKeyCode = P2.RefKey
WHERE PS.ParentCode IS NOT NULL
but with this, my Product.ParentId always ends up equal to Product.Id

I found my CTE problem, but a CTE is not necessary for this. This query works and should be enough for me:
UPDATE P
SET P.ParentId = P2.Id
FROM Product AS P
INNER JOIN ProductSTAGE AS PS ON P.RefKey = PS.RefKeyCode
INNER JOIN Product AS P2 ON P2.RefKey = PS.ParentCode
WHERE PS.ParentCode IS NOT NULL
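For anyone who wants to check the logic outside SQL Server, here is a minimal sketch using Python's built-in sqlite3 module. It is an adaptation, not the query above: the UPDATE ... FROM join is rewritten as a correlated subquery (an assumption made for portability across SQLite versions), and the sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Id INTEGER PRIMARY KEY, RefKey TEXT, ParentId INTEGER);
    CREATE TABLE ProductSTAGE (Id INTEGER PRIMARY KEY, RefKeyCode TEXT, ParentCode TEXT);
    INSERT INTO Product VALUES (1, 'SX1234', NULL), (2, 'SX4321', NULL);
    INSERT INTO ProductSTAGE VALUES (1, 'SX1234', 'SX4321');

    -- For each product, look up its staged parent code and resolve it to a Product.Id.
    UPDATE Product
    SET ParentId = (
        SELECT P2.Id
        FROM ProductSTAGE AS PS
        INNER JOIN Product AS P2 ON P2.RefKey = PS.ParentCode
        WHERE PS.RefKeyCode = Product.RefKey
    )
    -- Only touch products that actually have a resolvable parent.
    WHERE EXISTS (
        SELECT 1
        FROM ProductSTAGE AS PS
        INNER JOIN Product AS P2 ON P2.RefKey = PS.ParentCode
        WHERE PS.RefKeyCode = Product.RefKey
    );
""")

rows = conn.execute("SELECT Id, RefKey, ParentId FROM Product ORDER BY Id").fetchall()
print(rows)
```

The key point is the same as in the working query: the stage row is matched to the product being updated through RefKeyCode, and to the parent through ParentCode, so two references to Product are needed.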

How do I pull all results from a table based on another table's many-to-one relationship?

Setup
Sorry for the poorly phrased question. I'm not sure how to phrase it better. Feel free to try your hand at it if you are more versed in sql phrasing.
I have 3 related tables.
Person => person_id, name, etc
Cases => case_id, person_id, incedent_date, etc
Files => file_id, case_id, file_path, etc
Problem
For a given case_id I want to pull all file_id's for the same person.
Requirements:
1 query.
without duplicates.
without using UNIQUE/DISTINCT flag.
without changing table structure.
e.g. Bob has 2 cases, auto and house.
He has 10 files on each case.
I have the case_id for auto.
I want the files for both auto and house (20 files).
My attempt
This returns all files for all cases.
SELECT
f.file_id AS id
FROM files f
LEFT JOIN Cases c1 ON f.case_id = c1.case_id
LEFT JOIN Cases c2 ON f.case_id = c2.case_id
WHERE (f.case_id = 3566 OR c1.person_id = c2.person_id)
AND f.active = 1
ORDER BY f.upload_date ASC
This returns files for only given case:
SELECT
f.file_id AS id
FROM files f
LEFT JOIN Cases c1 ON f.case_id = c1.case_id
LEFT JOIN Cases c2 ON f.case_id = c2.case_id
WHERE (f.case_id = 3566 OR (c1.case_id = 3566 AND c1.person_id = c2.person_id))
AND f.active = 1
ORDER BY f.upload_date ASC
This returns duplicate values and seems to pull only the given case:
SELECT
f.file_id AS id
FROM files f
LEFT JOIN Cases c1 ON f.case_id = c1.case_id
LEFT JOIN Cases c2 ON c1.person_id = c2.person_id
WHERE f.case_id = 3566
AND f.active = 1
ORDER BY f.upload_date ASC

I hope this is what you want.
Create table #Person (person_id int, name varchar(10))
Insert into #Person values (1,'Ajay')
Insert into #Person values (2,'Vijay')
Create table #Cases (case_id int, person_id int)
Insert into #Cases values (1,1)
Insert into #Cases values (2,1)
Insert into #Cases values (3,1)
Insert into #Cases values (4,2)
Create table #Files (file_id int, case_id int)
Insert into #Files values (1,1)
Insert into #Files values (2,1)
Insert into #Files values (3,1)
Insert into #Files values (4,2)
Insert into #Files values (5,4)
SELECT
f.file_id AS id
FROM #files f
LEFT JOIN #Cases c1 ON f.case_id = c1.case_id
LEFT JOIN #Cases c2 ON c1.person_id = c2.person_id and c2.case_id = 2
where c2.case_id is not null
--OR
SELECT *,
f.file_id AS id
FROM #files f
LEFT JOIN #Cases c1 ON f.case_id = c1.case_id
INNER JOIN #Cases c2 ON c1.person_id = c2.person_id and c2.case_id = 2
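Another way to express the same idea is to resolve the person once in a scalar subquery and then fetch every file whose case belongs to that person. The sketch below is an alternative formulation, not the answer's query. It uses Python's built-in sqlite3 module with the sample rows from this answer, the helper name files_for_same_person is made up, and it assumes case_id is unique in Cases, which is what keeps the result free of duplicates without DISTINCT.

```python
import sqlite3


def files_for_same_person(conn, case_id):
    """Return file ids for all cases of the person who owns case_id (hypothetical helper)."""
    sql = """
        SELECT f.file_id
        FROM Files AS f
        INNER JOIN Cases AS c ON c.case_id = f.case_id
        WHERE c.person_id = (SELECT person_id FROM Cases WHERE case_id = ?)
        ORDER BY f.file_id
    """
    return [row[0] for row in conn.execute(sql, (case_id,))]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (person_id INTEGER, name TEXT);
    CREATE TABLE Cases (case_id INTEGER PRIMARY KEY, person_id INTEGER);
    CREATE TABLE Files (file_id INTEGER PRIMARY KEY, case_id INTEGER);
    INSERT INTO Person VALUES (1, 'Ajay'), (2, 'Vijay');
    INSERT INTO Cases VALUES (1,1), (2,1), (3,1), (4,2);
    INSERT INTO Files VALUES (1,1), (2,1), (3,1), (4,2), (5,4);
""")

print(files_for_same_person(conn, 2))  # files from every case of person 1
print(files_for_same_person(conn, 4))  # person 2 has a single case with one file
```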

Dynamically update table with column from another table

I have a table customer like this:
CREATE TABLE tbl_customer (
id INTEGER,
name VARCHAR(16),
voucher VARCHAR(16)
);
and a voucher table like this:
CREATE TABLE tbl_voucher (
id INTEGER,
code VARCHAR(16)
);
Now imagine that the customer table always has rows with id and name filled in, however the voucher needs to be inserted periodically from the tbl_voucher table.
Important: every voucher may only be assigned to one specific customer (i.e. must be unique)
I wrote a query like this:
UPDATE tbl_customer
SET voucher = (
SELECT code
FROM tbl_voucher
WHERE code NOT IN (
SELECT voucher
FROM tbl_customer
WHERE voucher IS NOT NULL
)
LIMIT 1
)
WHERE voucher IS NULL;
However this is not working as expected, since the part that looks for an unused voucher is executed once and said voucher is then applied to every customer.
Any ideas on how I can solve this without using programming structures such as loops?
Also, some example data so you can imagine what I would like to happen:
INSERT INTO tbl_customer VALUES (1, 'Sara', 'ABC');
INSERT INTO tbl_customer VALUES (2, 'Simon', 'DEF');
INSERT INTO tbl_customer VALUES (3, 'Andy', NULL);
INSERT INTO tbl_customer VALUES (4, 'Alice', NULL);
INSERT INTO tbl_voucher VALUES (1, 'ABC');
INSERT INTO tbl_voucher VALUES (2, 'LOL');
INSERT INTO tbl_voucher VALUES (3, 'ZZZ');
INSERT INTO tbl_voucher VALUES (4, 'BBB');
INSERT INTO tbl_voucher VALUES (5, 'CCC');
After the wanted query is executed, I'd expect Andy to have the voucher LOL and Alice should get ZZZ

I am going to guess this is MySQL. The answer is that this is a pain. The following assigns the values in a select:
select c.*, v.code
from (select c.*, (@rnc := @rnc + 1) as rn
from tbl_customer c cross join
(select @rnc := 0) params
where c.voucher is null
) c join
(select v.*, (@rnv := @rnv + 1) as rn
from tbl_voucher v cross join
(select @rnv := 0) params
where not exists (select 1 from tbl_customer c where c.voucher = v.code)
) v
on c.rn = v.rn;
You can now use this for the update:
update tbl_customer c join
(select c.id, v.code as voucher
from (select c.*, (@rnc := @rnc + 1) as rn
from tbl_customer c cross join
(select @rnc := 0) params
where c.voucher is null
) c join
(select v.*, (@rnv := @rnv + 1) as rn
from tbl_voucher v cross join
(select @rnv := 0) params
where not exists (select 1 from tbl_customer c where c.voucher = v.code)
) v
on c.rn = v.rn
) cv
on c.id = cv.id
set c.voucher = cv.voucher;
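The same pairing idea (number the customers without a voucher, number the unused vouchers, join on the row number) can be sketched with window functions instead of user variables. The example below is an assumption-laden illustration, not the MySQL answer itself: it runs on SQLite through Python's built-in sqlite3 module, needs an SQLite build with ROW_NUMBER() support (3.25 or later), assumes distinct customer ids, and stages the pairs in a temporary table named pairing, which is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_customer (id INTEGER, name VARCHAR(16), voucher VARCHAR(16));
    CREATE TABLE tbl_voucher (id INTEGER, code VARCHAR(16));
    INSERT INTO tbl_customer VALUES
        (1, 'Sara', 'ABC'), (2, 'Simon', 'DEF'), (3, 'Andy', NULL), (4, 'Alice', NULL);
    INSERT INTO tbl_voucher VALUES
        (1, 'ABC'), (2, 'LOL'), (3, 'ZZZ'), (4, 'BBB'), (5, 'CCC');

    -- Pair the n-th customer lacking a voucher with the n-th unused voucher.
    CREATE TEMP TABLE pairing AS
    SELECT c.id AS customer_id, v.code AS code
    FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn
          FROM tbl_customer
          WHERE voucher IS NULL) AS c
    JOIN (SELECT tv.code, ROW_NUMBER() OVER (ORDER BY tv.id) AS rn
          FROM tbl_voucher AS tv
          WHERE NOT EXISTS (SELECT 1 FROM tbl_customer AS tc
                            WHERE tc.voucher = tv.code)) AS v
      ON c.rn = v.rn;

    -- Apply the staged pairs; customers without a pair are left untouched.
    UPDATE tbl_customer
    SET voucher = (SELECT code FROM pairing WHERE customer_id = tbl_customer.id)
    WHERE id IN (SELECT customer_id FROM pairing);
""")

assigned = dict(conn.execute("SELECT name, voucher FROM tbl_customer"))
print(assigned)
```

Staging the pairs first means the UPDATE never reads voucher values it is in the middle of changing, which is the trap the original single-statement attempt fell into.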

How to UPDATE pivoted table in SQL SERVER

I have a flat table that I have to join with my main table using the EAN attribute, and then update gid (the id of my main table).
id  attrib  value         gid
1   weight  10            NULL
1   ean     123123123112  NULL
1   color   blue          NULL
2   weight  5             NULL
2   ean     331231313123  NULL
I tried pivoting the ean rows into a column and then joining both tables on ean, and up to this point everything works great.
--update SideTable
--set gid = ab_id
select gid, ab_id
from SideTable
pivot (max (value) for attrib in ([EAN],[MPN])) as b
join MainTable as c
on c.ab_ean = b.EAN
where b.EAN !='' AND c.ab_archive = '0'
When I select both id columns the result is okay, but when I uncomment the first lines and delete the select, the whole table is set to the first gid from my main table.
It has to set my main id on all attribute rows that share an ID with the ean row matched in my main table.
I am sorry for my terrible English, but I hope someone can help me with that.

The reason your update does not work is that you don't have any link between your source and target for the update. Although you reference SideTable in the FROM clause, this reference is effectively destroyed by the PIVOT function, leaving no link back to the instance of SideTable that you are updating. Since there is no link, all rows are updated with the same value, which will be the last value encountered in the FROM.
This can be demonstrated by running the following:
DECLARE @S TABLE (ID INT, Attrib VARCHAR(50), Value VARCHAR(50), gid INT);
INSERT @S
VALUES
(1, 'weight', '10', NULL), (1, 'ean', '123123123112', NULL), (1, 'color', 'blue', NULL),
(2, 'weight', '5', NULL), (2, 'ean', '331231313123', NULL);
SELECT s.*
FROM @S AS s
PIVOT (MAX(Value) FOR attrib IN ([EAN],[MPN])) AS pvt;
You clearly have a table aliased s in the FROM clause; however, because you have used PIVOT you cannot use SELECT s.*, and you get the following error:
The column prefix 's' does not match with a table name or alias name used in the query.
You haven't provided sample data for your main table, but I am about 95% certain your PIVOT is not needed. I think you can get your update using just normal JOINs:
UPDATE s
SET gid = ab_id
FROM SideTable AS s
INNER JOIN SideTable AS ean
ON ean.ID = s.ID
AND ean.attrib = 'ean'
INNER JOIN MainTable AS m
ON m.ab_EAN = ean.Value
WHERE m.ab_archive = '0'
AND m.ab_EAN != '';
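The join-based idea can be checked on the sample rows from the question. The sketch below is only an illustration under several assumptions: it runs on SQLite via Python's built-in sqlite3 module, so the SQL Server UPDATE ... FROM is rewritten as a correlated subquery; the MainTable rows (ab_id 100 and 200) are invented because the question did not include any; and the second main row is marked archived to show the filter taking effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SideTable (id INTEGER, attrib TEXT, value TEXT, gid INTEGER);
    CREATE TABLE MainTable (ab_id INTEGER, ab_ean TEXT, ab_archive TEXT);
    INSERT INTO SideTable VALUES
        (1, 'weight', '10', NULL), (1, 'ean', '123123123112', NULL), (1, 'color', 'blue', NULL),
        (2, 'weight', '5', NULL), (2, 'ean', '331231313123', NULL);
    -- Hypothetical main-table rows; the second one is archived.
    INSERT INTO MainTable VALUES (100, '123123123112', '0'), (200, '331231313123', '1');

    -- For every attribute row, find the ean row with the same id and
    -- look that EAN up in the main table.
    UPDATE SideTable
    SET gid = (
        SELECT m.ab_id
        FROM SideTable AS ean
        INNER JOIN MainTable AS m ON m.ab_ean = ean.value
        WHERE ean.id = SideTable.id
          AND ean.attrib = 'ean'
          AND m.ab_archive = '0'
    );
""")

rows = conn.execute(
    "SELECT id, attrib, gid FROM SideTable ORDER BY id, attrib"
).fetchall()
print(rows)
```

The self-join on id is what supplies the link between the row being updated and its ean row, which is exactly the link the PIVOT version loses.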

As per comment to the question, you need to use update + select statement.
A standard version looks like:
UPDATE
T
SET
T.col1 = OT.col1,
T.col2 = OT.col2
FROM
Some_Table T
INNER JOIN
Other_Table OT
ON
T.id = OT.id
WHERE
T.col3 = 'cool'
As to your needs:
update a
set a.gid = p.ab_id
from SideTable As a
Inner join (
select gid, ab_id
from SideTable
pivot (max (value) for attrib in ([EAN],[MPN])) as b
join MainTable as c
on c.ab_ean = b.EAN
where b.EAN !='' AND c.ab_archive = '0') p ON a.ean = p.EAN

Try breaking it down a bit more, like this:
update SideTable
set SideTable.gid = p.ab_id
FROM
(
select gid, ab_id
from SideTable
pivot (max (value) for attrib in ([EAN],[MPN])) as b
join MainTable as c
on c.ab_ean = b.EAN
where b.EAN !='' AND c.ab_archive = '0'
) p
WHERE p.EAN = SideTable.EAN

Need assistance with SQL query

I have 4 tables that I'm trying to create a query from:
Table 1 (iuieEmployee) -> position number
Table 2 (jbEmployeeH1BInfo) -> position number, LCA number, start date
Table 3 (jbEmployeeLCA) -> LCA number
Table 4 (jbInternational) -> Main demographic table
I have a query that works fine when there's only 1 record in each table, but tables 2 and 3 can have multiple records. I want it to find the record with the most recent start date, verify that there is a matching LCA number in the 3rd table and a matching position number in the first table, and show me any records where this isn't the case. How can I accomplish this? I currently have:
SELECT DISTINCT jbInternational.idnumber, jbInternational.lastname, jbInternational.firstname, jbInternational.midname,
jbInternational.campus, jbInternational.universityid, jbInternational.sevisid, jbInternational.citizenship,
jbInternational.immigrationstatus, jbEmployeeH1BInfo.lcaNumber AS lcaNumber1, jbEmployeeLCA.lcaNumber AS lcaNumber2
FROM (select jbEmployeeH1BInfo.idnumber, MAX(jbEmployeeH1BInfo.approvalStartDate) AS MaxDateStamp FROM [internationalservices].[dbo].jbEmployeeH1BInfo GROUP BY idnumber ) my
INNER JOIN [internationalservices].[dbo].jbEmployeeH1BInfo WITH (nolock) ON my.idnumber=jbEmployeeH1BInfo.idnumber AND my.MaxDateStamp=jbEmployeeH1BInfo.approvalStartDate
INNER JOIN [internationalservices].[dbo].jbInternational WITH (nolock) ON jbInternational.idnumber=jbEmployeeH1BInfo.idnumber
inner join [internationalservices].[dbo].jbEmployeeLCA ON jbInternational.idnumber = jbEmployeeLCA.idnumber
WHERE jbInternational.idnumber not in(
SELECT DISTINCT jbInternational.idnumber
FROM (select distinct jbEmployeeH1BInfo.idnumber, MAX(jbEmployeeH1BInfo.approvalStartDate) AS MaxDateStamp
FROM [internationalservices].[dbo].jbEmployeeH1BInfo GROUP BY idnumber ) my
INNER JOIN [internationalservices].[dbo].jbEmployeeH1BInfo WITH (nolock) ON my.idnumber=jbEmployeeH1BInfo.idnumber AND my.MaxDateStamp=jbEmployeeH1BInfo.approvalStartDate
INNER JOIN [internationalservices].[dbo].jbInternational WITH (nolock) ON jbInternational.idnumber=jbEmployeeH1BInfo.idnumber
inner join [internationalservices].[dbo].jbEmployeeLCA ON jbInternational.idnumber = jbEmployeeLCA.idnumber
AND jbEmployeeH1BInfo.lcaNumber = jbEmployeeLCA.lcaNumber)
Table Schema:
create table iuieEmployee(idnumber int, POS_NBR varchar(8));
insert into iuieEmployee values(123456, '470V13');
insert into iuieEmployee values(123457, '98X000');
insert into iuieEmployee values(123458, '98X000');
insert into iuieEmployee values(123455, '98X000');
create table jbEmployeeH1BInfo (idnumber int, approvalStartDate smalldatetime, lcaNumber varchar(20), positionNumber varchar(200));
insert into jbEmployeeH1BInfo values (123456, '07/01/2012', '1-200-3000', '98X000');
insert into jbEmployeeH1BInfo values (123456, '07/30/2013', '1-200-4000', '470V13');
insert into jbEmployeeH1BInfo values (123457, '07/01/2012', '1-200-5000', '98X000');
insert into jbEmployeeH1BInfo values (123458, '07/01/2012', '1-200-6000', '98X000');
insert into jbEmployeeH1BInfo values (123455, '07/30/2014', '1-200-7000', '98X000');
insert into jbEmployeeH1BInfo values (123455, '07/01/2012', '1-200-8000', '470V13');
create table jbEmployeeLCA (idnumber int, lcaNumber varchar(20));
insert into jbEmployeeLCA values (123456, '1-200-3000');
insert into jbEmployeeLCA values (123456, '1-200-4111');
insert into jbEmployeeLCA values (123457, '1-200-5000');
insert into jbEmployeeLCA values (123458, '1-200-6000');
insert into jbEmployeeLCA values (123455, '1-200-7000');
insert into jbEmployeeLCA values (123455, '1-200-8000');
create table jbInternational(idnumber int);
insert into jbInternational values(123456);
insert into jbInternational values(123457);
insert into jbInternational values(123458);
insert into jbInternational values(123455);
Should only return 1 line:
123456, 07/30/2013, '1-200-4000'
but is instead returning two lines:
123456, 07/30/2013, '1-200-4000' (not matching 1-200-4111)
123456, 07/30/2013, '1-200-4000' (not matching 1-200-3000)
It shouldn't return the second row because the position number with the -3000 lca number doesn't have the most current date.

Your explanation is hard to understand. I guess if you could explain it well, then you could probably write the query yourself. Here's what I think you meant:
Employee contains the main records.
You want to find all idnumbers such that
idnumber is in International
the H1BInfo record with the most recent approvalStartDate does not have an LCA number matching the LCA record
The first thing to do is to simplify that H1BInfo table. We are only looking for the rows with the most recent approvalStartDate. We can do that by partitioning by idnumber and ordering by approvalStartDate:
with rankedH1BInfo as (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY jbEmployeeH1BInfo.idnumber
ORDER BY jbEmployeeH1BInfo.approvalStartDate desc) as r
FROM [internationalservices].[dbo].jbEmployeeH1BInfo
)
Let's only get the first row of each partition:
, MostRecentH1BInfo as (
SELECT * FROM rankedH1BInfo
WHERE r = 1
)
Now we can do the join to find all the good ones:
, goodIDs as (
SELECT i.idnumber
FROM [internationalservices].[dbo].jbInternational i WITH (NOLOCK)
JOIN [internationalservices].[dbo].jbEmployeeLCA l WITH (NOLOCK) on l.idnumber = i.idnumber
JOIN MostRecentH1BInfo h WITH (NOLOCK) on h.idnumber = i.idnumber
JOIN iuieEmployee e WITH (NOLOCK) on e.POS_NBR = h.positionNumber
WHERE h.lcaNumber = l.lcaNumber
)
To put it all together and get the ones where this is false:
with rankedH1BInfo as (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY jbEmployeeH1BInfo.idnumber
ORDER BY jbEmployeeH1BInfo.approvalStartDate desc) as r
FROM [internationalservices].[dbo].jbEmployeeH1BInfo
), MostRecentH1BInfo as (
SELECT * FROM rankedH1BInfo
WHERE r = 1
), goodIDs as (
SELECT i.idnumber
FROM [internationalservices].[dbo].jbInternational i WITH (NOLOCK)
JOIN [internationalservices].[dbo].jbEmployeeLCA l WITH (NOLOCK) on l.idnumber = i.idnumber
JOIN MostRecentH1BInfo h WITH (NOLOCK) on h.idnumber = i.idnumber
JOIN iuieEmployee e WITH (NOLOCK) on e.POS_NBR = h.positionNumber
WHERE h.lcaNumber = l.lcaNumber
)
SELECT DISTINCT jbInternational.idnumber, jbInternational.lastname, jbInternational.firstname, jbInternational.midname,
jbInternational.campus, jbInternational.universityid, jbInternational.sevisid, jbInternational.citizenship,
jbInternational.immigrationstatus, jbEmployeeH1BInfo.lcaNumber AS lcaNumber1, jbEmployeeLCA.lcaNumber AS lcaNumber2
FROM (select jbEmployeeH1BInfo.idnumber, MAX(jbEmployeeH1BInfo.approvalStartDate) AS MaxDateStamp FROM [internationalservices].[dbo].jbEmployeeH1BInfo GROUP BY idnumber ) my
INNER JOIN [internationalservices].[dbo].jbEmployeeH1BInfo WITH (nolock) ON my.idnumber=jbEmployeeH1BInfo.idnumber AND my.MaxDateStamp=jbEmployeeH1BInfo.approvalStartDate
INNER JOIN [internationalservices].[dbo].jbInternational WITH (nolock) ON jbInternational.idnumber=jbEmployeeH1BInfo.idnumber
inner join [internationalservices].[dbo].jbEmployeeLCA ON jbInternational.idnumber = jbEmployeeLCA.idnumber
WHERE jbInternational.idnumber not in (select idnumber from goodIDs)
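To see the "latest row per idnumber, then check for a match" idea in isolation, here is a compact sketch on the question's sample data. It is an illustration under stated assumptions rather than the query above: it runs on SQLite through Python's built-in sqlite3 module (ROW_NUMBER() needs SQLite 3.25 or later), the dates are rewritten as ISO text so they sort chronologically, the position check is also tied to the same idnumber, and the output is reduced to one row per person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE iuieEmployee (idnumber INTEGER, POS_NBR TEXT);
    CREATE TABLE jbEmployeeH1BInfo (idnumber INTEGER, approvalStartDate TEXT,
                                    lcaNumber TEXT, positionNumber TEXT);
    CREATE TABLE jbEmployeeLCA (idnumber INTEGER, lcaNumber TEXT);
    CREATE TABLE jbInternational (idnumber INTEGER);

    INSERT INTO iuieEmployee VALUES
        (123456, '470V13'), (123457, '98X000'), (123458, '98X000'), (123455, '98X000');
    INSERT INTO jbEmployeeH1BInfo VALUES
        (123456, '2012-07-01', '1-200-3000', '98X000'),
        (123456, '2013-07-30', '1-200-4000', '470V13'),
        (123457, '2012-07-01', '1-200-5000', '98X000'),
        (123458, '2012-07-01', '1-200-6000', '98X000'),
        (123455, '2014-07-30', '1-200-7000', '98X000'),
        (123455, '2012-07-01', '1-200-8000', '470V13');
    INSERT INTO jbEmployeeLCA VALUES
        (123456, '1-200-3000'), (123456, '1-200-4111'), (123457, '1-200-5000'),
        (123458, '1-200-6000'), (123455, '1-200-7000'), (123455, '1-200-8000');
    INSERT INTO jbInternational VALUES (123456), (123457), (123458), (123455);
""")

MISMATCH_SQL = """
    WITH ranked AS (
        SELECT idnumber, approvalStartDate, lcaNumber, positionNumber,
               ROW_NUMBER() OVER (PARTITION BY idnumber
                                  ORDER BY approvalStartDate DESC) AS r
        FROM jbEmployeeH1BInfo
    ), latest AS (
        SELECT * FROM ranked WHERE r = 1      -- most recent record per person
    )
    SELECT h.idnumber, h.approvalStartDate, h.lcaNumber
    FROM latest AS h
    JOIN jbInternational AS i ON i.idnumber = h.idnumber
    WHERE NOT EXISTS (SELECT 1 FROM jbEmployeeLCA AS l
                      WHERE l.idnumber = h.idnumber
                        AND l.lcaNumber = h.lcaNumber)
       OR NOT EXISTS (SELECT 1 FROM iuieEmployee AS e
                      WHERE e.idnumber = h.idnumber
                        AND e.POS_NBR = h.positionNumber)
    ORDER BY h.idnumber
"""
mismatches = conn.execute(MISMATCH_SQL).fetchall()
print(mismatches)
```

Because the mismatch test is written with NOT EXISTS instead of a join to jbEmployeeLCA, each person appears at most once, which avoids the doubled row described in the question.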
