Return Only Primary Keys That Share The Same Foreign Keys [duplicate] - sql

This question already has answers here:
Select group of rows that match all items in a list
(3 answers)
Closed 7 years ago.
We're struggling with a lightweight way to reduce results in a larger population of search results by excluding rows that don't match ALL of the foreign keys in a second table.
For example, we have a mapping table that matches documents to tags where each document can have an unlimited number of tag relationships:
DocID | TagID
12 | 1
12 | 2
34 | 1
53 | 1
53 | 4
66 | 1
66 | 2
67 | 3
We're building a table in our SPROC based on a list of tagIDs passed in by the user, so we have a second table that looks like this:
TagID
1
2
What we'd like to do is return only the rows from the first table that contain a match on every value in the second table, effectively an "and" query. So if the user passes in tag values 1 & 2 we want to return DocID 12 and 66. Right now our joins return essentially an "or" result, so values 1 & 2 would return DocIDs 12, 34, and 66.
We're currently stuck with MS SQL 2008R2.

You can do this with group by and having and a twist:
select docid
from firsttable t1 join
secondtable t2
on t1.tagid = t2.tagid
group by docid
having count(*) = (select count(*) from secondtable);
You may need count(distinct) if either table could have duplicates.
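As a minimal sketch of the group by / having approach with the count(distinct) safeguard, the question's sample data can be loaded into an in-memory SQLite database from Python. The table names doc_tags and wanted_tags are placeholders invented for this example; the same query shape should carry over to SQL Server 2008 R2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- doc_tags and wanted_tags are hypothetical names for the two tables
    CREATE TABLE doc_tags (DocID INTEGER, TagID INTEGER);
    INSERT INTO doc_tags VALUES
        (12, 1), (12, 2), (34, 1), (53, 1), (53, 4), (66, 1), (66, 2), (67, 3);
    CREATE TABLE wanted_tags (TagID INTEGER);
    INSERT INTO wanted_tags VALUES (1), (2);
""")

# Keep a document only if it matches as many distinct tags as were requested.
rows = conn.execute("""
    SELECT t1.DocID
    FROM doc_tags AS t1
    JOIN wanted_tags AS t2 ON t1.TagID = t2.TagID
    GROUP BY t1.DocID
    HAVING COUNT(DISTINCT t1.TagID) =
           (SELECT COUNT(DISTINCT TagID) FROM wanted_tags)
    ORDER BY t1.DocID
""").fetchall()

matching_docs = [doc_id for (doc_id,) in rows]
print(matching_docs)  # [12, 66]
```

DocID 53 is dropped because only one of its tags (1) is in the requested list, even though it has two tags in total.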

Related

Group Rows Together in Join [duplicate]

This question already has answers here:
How do I create a comma-separated list using a SQL query?
(11 answers)
Closed 18 days ago.
I have a table that contains a bunch of keys with different IDs. Some keys have more than 1 ID.
ID Table:
key id
a 1
a 11
a 12
b 2
c 3
c 33
2nd table:
key count location
a 17 123 Test Rd
b 10 12 Smith St
c 18 999 Fire Rd
Desired result:
key count location id
a 17 123 Test Rd 1, 11, 12
b 10 12 Smith St 2
c 18 999 Fire Rd 3, 33
I am trying to join this table with another table so that the ID is carried over, but the issue I am running into is that the join returns more rows than desired, because it creates a row for each unique ID that a key has. Is there a way to do the JOIN so that it groups the ids for each key together (perhaps comma-delimited) in one single row?
You can use STRING_AGG:
SELECT t.key, s.count, s.location, STRING_AGG(t.id, ',') as id
FROM YourTable t
JOIN YourTable2 s
ON(t.key = s.key)
GROUP BY t.key, s.count, s.location
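For a quick local check of the idea, here is a sketch that runs the same kind of grouped join in SQLite from Python. It swaps in SQLite's GROUP_CONCAT as a stand-in for STRING_AGG, and the table and column names (ids, places, grp_key, cnt) are renamed placeholders chosen for this example, not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- ids / places and their column names are hypothetical stand-ins
    CREATE TABLE ids (grp_key TEXT, id INTEGER);
    INSERT INTO ids VALUES
        ('a', 1), ('a', 11), ('a', 12), ('b', 2), ('c', 3), ('c', 33);
    CREATE TABLE places (grp_key TEXT, cnt INTEGER, location TEXT);
    INSERT INTO places VALUES
        ('a', 17, '123 Test Rd'), ('b', 10, '12 Smith St'), ('c', 18, '999 Fire Rd');
""")

# One output row per key; GROUP_CONCAT plays the role of STRING_AGG here.
rows = conn.execute("""
    SELECT s.grp_key, s.cnt, s.location, GROUP_CONCAT(t.id, ', ') AS id_list
    FROM places AS s
    JOIN ids AS t ON t.grp_key = s.grp_key
    GROUP BY s.grp_key, s.cnt, s.location
    ORDER BY s.grp_key
""").fetchall()

# The order of ids inside the concatenated string is not guaranteed,
# so parse and sort before comparing.
ids_by_key = {
    key: sorted(int(part) for part in id_list.split(", "))
    for key, _cnt, _location, id_list in rows
}
print(ids_by_key)
```

If the order of the ids within each list matters, SQL Server's STRING_AGG accepts a WITHIN GROUP (ORDER BY ...) clause.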

How to find a field with some specific value in another field in sql

I have 2 columns MessageID and FlowStatusID.
I want to find the MessageIDs whose FlowStatusID values form one specific sequence.
For example I want to find MessageID's where the FlowStatusID contains the sequence of these numbers: 105,81,21
MessageID FlowStatusID
-------------------------
1 11
1 105
2 105
2 81
2 21
3 81
4 105
4 81
4 21
5 21
5 105
The result must be 2, 4
You don't mention the database type but I've found some other tickets which explain how to concatenate values from multiple records.
PostgreSQL: Concatenate multiple result rows of one column into one, group by another column
Oracle : SQL Query to concatenate column values from multiple rows in Oracle
You can use this concatenated field in a condition with an equality match:
where myconcatresult = '21,81,105'
I don't know how this will perform :)
Try like this
select MessageID from t group by MessageID having count(*) = 3;
Or
select MessageID from (
select t.MessageID from t where FlowStatusID=21
union all
select t.MessageID from t where FlowStatusID=81
union all
select t.MessageID from t where FlowStatusID=105 )
as tt group by MessageID having count(*) =3
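The difference between the two queries above only shows up when a message has three rows with unrelated statuses. The sketch below loads the question's data into SQLite from Python and adds one hypothetical message (ID 6, with statuses 11, 12, 13) to show it; the table name flow is also an assumption. Note that neither variant checks the order of the statuses, since the table has no column that records it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flow (MessageID INTEGER, FlowStatusID INTEGER);
    INSERT INTO flow VALUES
        (1, 11), (1, 105),
        (2, 105), (2, 81), (2, 21),
        (3, 81),
        (4, 105), (4, 81), (4, 21),
        (5, 21), (5, 105),
        -- hypothetical extra message, not in the question's data
        (6, 11), (6, 12), (6, 13);
""")

# Counting rows alone also accepts message 6.
naive = [m for (m,) in conn.execute("""
    SELECT MessageID FROM flow
    GROUP BY MessageID
    HAVING COUNT(*) = 3
    ORDER BY MessageID
""")]

# Filtering to the wanted statuses first keeps only true matches.
filtered = [m for (m,) in conn.execute("""
    SELECT MessageID FROM flow
    WHERE FlowStatusID IN (105, 81, 21)
    GROUP BY MessageID
    HAVING COUNT(DISTINCT FlowStatusID) = 3
    ORDER BY MessageID
""")]

print(naive)     # [2, 4, 6]
print(filtered)  # [2, 4]
```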

In SQL, find duplicates in one column with unique values for another column

So I have a table of aliases linked to record ids. I need to find duplicate aliases with unique record ids. To explain better:
ID Alias Record ID
1 000123 4
2 000123 4
3 000234 4
4 000123 6
5 000345 6
6 000345 7
The result of a query on this table should be something to the effect of
000123 4 6
000345 6 7
Indicating that both record 4 and 6 have an alias of 000123 and both record 6 and 7 have an alias of 000345.
I was looking into using GROUP BY but if I group by alias then I can't select record id and if I group by both alias and record id it will only return the first two rows in this example where both columns are duplicates. The only solution I've found, and it's a terrible one that crashed my server, is to do two different selects for all the data and then join them
ON [T_1].[ALIAS] = [T_2].[ALIAS] AND NOT [T_1].[RECORD_ID] = [T_2].[RECORD_ID]
Are there any solutions out there that would work better? As in, not crash my server when run on a few hundred thousand records?
It looks as if you have two requirements:
Identify all aliases that have more than one record id, and
List the record ids for these aliases horizontally.
The first is a lot easier to do than the second. Here's some SQL that ought to get you where you want with the first:
WITH A -- Get a list of unique combinations of Alias and [Record ID]
AS (
SELECT Distinct
Alias
, [Record ID]
FROM T1
)
, B -- Get a list of all those Alias values that have more than one [Record ID] associated
AS (
SELECT Alias
FROM A
GROUP BY
Alias
HAVING COUNT(*) > 1
)
SELECT A.Alias
, A.[Record ID]
FROM A
JOIN B
ON A.Alias = B.Alias
Now, as for the second. If you're satisfied with the data in this form:
Alias Record ID
000123 4
000123 6
000345 6
000345 7
... you can stop there. Otherwise, things get tricky.
The PIVOT command will not necessarily help you, because it's trying to solve a different problem than the one you have.
I am assuming that you can't necessarily predict how many duplicate Record ID values you have per Alias, and thus don't know how many columns you'll need.
If you have only two, then displaying each of them in a column becomes a relatively trivial exercise. If you have more, I'd urge you to consider whether the destination for these records (a report? A web page? Excel?) might be able to do a better job of displaying them horizontally than SQL Server can do in returning them arranged horizontally.
Perhaps what you want is just the min() and max() of RecordId:
select Alias, min(RecordID), max(RecordId)
from yourTable t
group by Alias
having min(RecordId) <> max(RecordId)
You can also count the number of distinct values, using count(distinct):
select Alias, count(distinct RecordId) as NumRecordIds, min(RecordID), max(RecordId)
from yourTable t
group by Alias
having count(DISTINCT RecordID) > 1;
This will give all repeated values:
select Alias, count(RecordId) as NumRecordIds
from yourTable t
group by Alias
having count(RecordId) <> count(distinct RecordId);
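To see what these aggregate queries return on the question's sample rows, here is a small sketch using SQLite from Python. The table name aliases and the column name RecordID (without the space) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- aliases is a hypothetical table name for the question's data
    CREATE TABLE aliases (ID INTEGER, Alias TEXT, RecordID INTEGER);
    INSERT INTO aliases VALUES
        (1, '000123', 4), (2, '000123', 4), (3, '000234', 4),
        (4, '000123', 6), (5, '000345', 6), (6, '000345', 7);
""")

# Aliases that are attached to more than one distinct record id.
shared = conn.execute("""
    SELECT Alias, COUNT(DISTINCT RecordID), MIN(RecordID), MAX(RecordID)
    FROM aliases
    GROUP BY Alias
    HAVING COUNT(DISTINCT RecordID) > 1
    ORDER BY Alias
""").fetchall()

# Aliases where the same (Alias, RecordID) pair occurs more than once.
repeated = conn.execute("""
    SELECT Alias
    FROM aliases
    GROUP BY Alias
    HAVING COUNT(RecordID) <> COUNT(DISTINCT RecordID)
    ORDER BY Alias
""").fetchall()

print(shared)    # [('000123', 2, 4, 6), ('000345', 2, 6, 7)]
print(repeated)  # [('000123',)]
```

The min/max pair only shows every record id when an alias has at most two of them; with three or more, the middle values are not displayed.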
I agree with Ann L's answer but would like to show how you can use window functions with CTEs, as you may prefer the readability.
(Re: how to pivot horizontally, I again agree with Ann)
create temporary table things (
id serial primary key,
alias varchar,
record_id int
);
insert into things (alias, record_id) values
('000123', 4),
('000123', 4),
('000234', 4),
('000123', 6),
('000345', 6),
('000345', 7);
with
things_with_distinct_aliases_and_record_ids as (
select distinct on (alias, record_id)
id,
alias,
record_id
from things
),
things_with_unique_record_id_counts_per_alias as (
select *,
COUNT(*) OVER(PARTITION BY alias) as unique_record_ids_count
from things_with_distinct_aliases_and_record_ids
)
select * from things_with_unique_record_id_counts_per_alias
where unique_record_ids_count > 1
The first CTE gets all the unique alias/record id combinations. E.g.
id | alias | record_id
----+--------+-----------
1 | 000123 | 4
4 | 000123 | 6
3 | 000234 | 4
5 | 000345 | 6
6 | 000345 | 7
The second CTE simply creates a new column for the above and adds the count of record ids for each alias. This allows you to filter only those aliases which have more than one record id associated with them.
id | alias | record_id | unique_record_ids_count
----+--------+-----------+-------------------------
1 | 000123 | 4 | 2
4 | 000123 | 6 | 2
3 | 000234 | 4 | 1
5 | 000345 | 6 | 2
6 | 000345 | 7 | 2
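The same two-step idea can be tried outside PostgreSQL. The sketch below uses SQLite from Python and assumes a SQLite build with window function support (3.25 or later). Because DISTINCT ON is PostgreSQL-specific, it uses a plain SELECT DISTINCT over alias and record_id, which means the id column is not carried along.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (id INTEGER PRIMARY KEY, alias TEXT, record_id INTEGER);
    INSERT INTO things (alias, record_id) VALUES
        ('000123', 4), ('000123', 4), ('000234', 4),
        ('000123', 6), ('000345', 6), ('000345', 7);
""")

rows = conn.execute("""
    WITH distinct_pairs AS (
        -- step 1: unique alias / record id combinations
        SELECT DISTINCT alias, record_id FROM things
    ),
    counted AS (
        -- step 2: attach the number of distinct record ids per alias
        SELECT alias, record_id,
               COUNT(*) OVER (PARTITION BY alias) AS unique_record_ids_count
        FROM distinct_pairs
    )
    SELECT alias, record_id, unique_record_ids_count
    FROM counted
    WHERE unique_record_ids_count > 1
    ORDER BY alias, record_id
""").fetchall()

for row in rows:
    print(row)
```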
SELECT A.CitationId,B.CitationId, A.CitationName, A.LoaderID, A.PrimaryReferenceLoaderID,B.SecondaryReference1LoaderID, A.SecondaryReference1LoaderID, A.SecondaryReference2LoaderID,
A.SecondaryReference3LoaderID, A.SecondaryReference4LoaderID, A.CreatedOn, A.LastUpdatedOn
FROM CitationMaster A, CitationMaster B
WHERE A.PrimaryReferenceLoaderID= B.SecondaryReference1LoaderID and Isnull(A.PrimaryReferenceLoaderID,'') != '' and Isnull(B.SecondaryReference1LoaderID,'') !=''

Writing a query for figuring out non-existent parent IDs in SQL

I have a table (Table1) wherein some columns reference another table (Table2) by id. it looks something like this:
Table 1
ID Column 1 Column 2 Column 3
3 15 16 0
4 19 0 0
5 21 22 23
6 0 0 0
7 25 26 0
8 27 0 0
Table 2
ID String
15 data
16 data
19 data
21 data
etc.
I am trying to write a query that returns results like this:
Table2ID Table1ID
15 3
16 3
19 4
21 5
22 5
23 5
25 7
etc.
There is no reference to any sort of parent in Table 2, so I'm trying to figure out the best way to query this. I'd be much obliged for any help, as I have less than two weeks' worth of SQL experience.
You can use union (or union all) to combine the results of multiple queries. There are some rules you must follow in order to do this. First, each query must return the same number of columns. Next, each column does not strictly need to return the same data type, but you can get unexpected errors if the types differ, so you are MUCH better off if each query returns the same data type for each column.
Just to be clear, when you have multiple columns, each query should return the same data type for the first column, and each query should return the same data type for the second column, but the data types for the first column do not need to match the data type of the second column.
Select Column1 As Table2ID, ID As Table1ID From Table1
Union All
Select Column2, ID From Table1
Union All
Select Column3, ID From Table1
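Run as-is, the UNION ALL query also returns the rows where a column holds 0. Assuming 0 means "no reference" (the question's desired output suggests this but does not state it), a sketch of the same query with that filter added, executed against the sample data in SQLite from Python, might look like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER, Column1 INTEGER, Column2 INTEGER, Column3 INTEGER);
    INSERT INTO Table1 VALUES
        (3, 15, 16, 0), (4, 19, 0, 0), (5, 21, 22, 23),
        (6, 0, 0, 0), (7, 25, 26, 0), (8, 27, 0, 0);
""")

# Stack the three reference columns into one, then drop the 0 placeholders
# (treating 0 as "no reference" is an assumption).
pairs = conn.execute("""
    SELECT Table2ID, Table1ID
    FROM (
        SELECT Column1 AS Table2ID, ID AS Table1ID FROM Table1
        UNION ALL
        SELECT Column2, ID FROM Table1
        UNION ALL
        SELECT Column3, ID FROM Table1
    ) AS refs
    WHERE Table2ID <> 0
    ORDER BY Table2ID
""").fetchall()

for table2_id, table1_id in pairs:
    print(table2_id, table1_id)
```

UNION ALL is used rather than UNION because each (reference, parent) pair is wanted once per column it appears in, and it avoids the extra duplicate-removal step.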
SELECT t2.ID AS Table2ID, t1.ID AS Table1ID
FROM Table2 t2
JOIN Table1 t1
ON t2.ID = t1.Column1 OR t2.ID = t1.Column2 OR t2.ID = t1.Column3

SQL query to find rows that aren't present in other tables

Here's what I'm trying to accomplish:
I've got two tables, call them first and second. They each have an ID column. They might have other columns but those aren't important. I have a third table, call it third. It contains two columns, ID and OTHERID. OTHERID references entries that may or may not exist in tables first and second.
I want to query third and look for rows who don't have an OTHERID column value that is found in either tables first or second. The goal is to delete those rows from table third.
Example:
first table:
ID
1
2
3
second table:
ID
6
7
8
third table
ID | OTHERID
21 1
22 2
23 3
24 4
25 5
26 6
27 7
28 8
In this case, I'd want to retrieve the IDs from third who don't have a matching ID in either table first or table second. I'd expect to get back the following IDs:
24
25
What I've tried:
I've done something like this to get back the entries in third that aren't in first:
select t.* from third t where not exists (select * from first f where t.otherid = f.id);
and this will get me back the following rows:
ID | OTHERID
24 4
25 5
26 6
27 7
28 8
Similarly, I can get the ones that aren't in second:
select t.* from third t where not exists (select * from second s where t.otherid = s.id);
and I'll get:
ID | OTHERID
21 1
22 2
23 3
24 4
25 5
What I can't wrap my brain around this morning is how to combine the two queries to get the intersection of the two result sets, so that just the rows with IDs 24 and 25 are returned. Those would be the two rows I could remove, since they are orphans.
How would you solve this? I think I'm on the right track but I'm just spinning at this point making no progress.
Maybe this :
SELECT third.*
FROM third
LEFT JOIN first ON third.otherID = first.id
LEFT JOIN second ON third.otherID = second.id
WHERE first.id IS NULL AND second.id IS NULL
Just use
select t.*
from third t
where
not exists (select * from first f where t.otherid = f.id)
and not exists (select * from second s where t.otherid = s.id)
SELECT t.ID
FROM third t
WHERE t.OTHERID NOT IN (
SELECT ID
FROM first
UNION
SELECT ID
FROM second
)
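The NOT EXISTS and NOT IN forms agree on the sample data, but they can diverge if either lookup table ever contains a NULL ID. The sketch below demonstrates this with SQLite from Python; the table names first_t, second_t and third_t are renamed placeholders for the question's first, second and third, and the NULL row is a hypothetical addition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- first_t / second_t / third_t stand in for first / second / third
    CREATE TABLE first_t (ID INTEGER);
    INSERT INTO first_t VALUES (1), (2), (3);
    CREATE TABLE second_t (ID INTEGER);
    INSERT INTO second_t VALUES (6), (7), (8);
    CREATE TABLE third_t (ID INTEGER, OTHERID INTEGER);
    INSERT INTO third_t VALUES
        (21, 1), (22, 2), (23, 3), (24, 4), (25, 5), (26, 6), (27, 7), (28, 8);
""")

NOT_EXISTS_SQL = """
    SELECT t.ID FROM third_t AS t
    WHERE NOT EXISTS (SELECT 1 FROM first_t AS f WHERE f.ID = t.OTHERID)
      AND NOT EXISTS (SELECT 1 FROM second_t AS s WHERE s.ID = t.OTHERID)
    ORDER BY t.ID
"""
NOT_IN_SQL = """
    SELECT t.ID FROM third_t AS t
    WHERE t.OTHERID NOT IN (SELECT ID FROM first_t UNION SELECT ID FROM second_t)
    ORDER BY t.ID
"""

def ids(sql):
    """Run a query and return its single column as a list."""
    return [row[0] for row in conn.execute(sql)]

orphans_exists = ids(NOT_EXISTS_SQL)
orphans_in = ids(NOT_IN_SQL)

# Hypothetical: a NULL sneaks into one of the lookup tables.
conn.execute("INSERT INTO first_t VALUES (NULL)")
orphans_exists_with_null = ids(NOT_EXISTS_SQL)
orphans_in_with_null = ids(NOT_IN_SQL)

print(orphans_exists, orphans_in)                      # [24, 25] [24, 25]
print(orphans_exists_with_null, orphans_in_with_null)  # [24, 25] []
```

With a NULL in the list, `x NOT IN (...)` evaluates to unknown rather than true for every non-matching row, so the NOT IN query returns nothing. If the ID columns are declared NOT NULL this cannot happen, and the three answers are interchangeable.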