SQL Query - Ensure a row exists for each value in () - sql

Currently struggling with finding a way to validate 2 tables (efficiently lots of rows for Table A)
I have two tables
Table A
ID
A
B
C
Table matched
ID Number
A 1
A 2
A 9
B 1
B 9
C 2
I am trying to write a SQL Server query that basically checks to make sure for every value in Table A there exists a row for a variable set of values ( 1, 2,9)
The example above is incorrect because t should have for every record in A a corresponding record in Table matched for each value (1,2,9). The end goal is:
Table matched
ID Number
A 1
A 2
A 9
B 1
B 2
B 9
C 1
C 2
C 9
I know its confusing, but in general for every X in ( some set ) there should be a corresponding record in Table matched. I have obviously simplified things.
Please let me know if you all need clarification.

Use:
SELECT a.id
FROM TABLE_A a
JOIN TABLE_B b ON b.id = a.id
WHERE b.number IN (1, 2, 9)
GROUP BY a.id
HAVING COUNT(DISTINCT b.number) = 3
The DISTINCT in the COUNT ensures that duplicates (IE: A having two records in TABLE_B with the value "2") from being falsely considered a correct record. It can be omitted if the number column either has a unique or primary key constraint on it.
The HAVING COUNT(...) must equal the number of values provided in the IN clause.

Create a temp table of values you want. You can do this dynamically if the values 1, 2 and 9 are in some table you can query from.
Then, SELECT FROM tempTable WHERE NOT IN (SELECT * FROM TableMatched)

I had this situation one time. My solution was as follows.
In addition to TableA and TableMatched, there was a table that defined the rows that should exist in TableMatched for each row in TableA. Let’s call it TableMatchedDomain.
The application then accessed TableMatched through a view that controlled the returned rows, like this:
create view TableMatchedView
select a.ID,
d.Number,
m.OtherValues
from TableA a
join TableMatchedDomain d
left join TableMatched m on m.ID = a.ID and m.Number = d.Number
This way, the rows returned were always correct. If there were missing rows from TableMatched, then the Numbers were still returned but with OtherValues as null. If there were extra values in TableMatched, then they were not returned at all, as though they didn't exist. By changing the rows in TableMatchedDomain, this behavior could be controlled very easily. If a value were removed TableMatchedDomain, then it would disappear from the view. If it were added back again in the future, then the corresponding OtherValues would appear again as they were before.
The reason I designed it this way was that I felt that establishing an invarient on the row configuration in TableMatched was too brittle and, even worse, introduced redundancy. So I removed the restriction from groups of rows (in TableMatched) and instead made the entire contents of another table (TableMatchedDomain) define the correct form of the data.

Related

Get the "most" optimal row in a JOIN

Problem
I have a situation in which I have two tables in which I would like the entries from table 2 (lets call it table_2) to be matched up with the entries in table 1 (table_1) such that there are no duplicates rows of table_2 used in the match up.
Discussion
Specifically, in this case there are datetime stamps in each table (field is utcdatetime). For each row in table_1, I want to find the row in table_2 in which has the closed utcdatetime to the table 1 utcdatetime such that the table2.utcdatetime is older than the table_1 utcdatetime and within 30 minutes of the table 1 utcdatetime. Here is the catch, I do not want any repeats. If a row in table 2 gets gobbled up in a match on an earlier row in table 1, then I do not want it considered for a match later.
This has currently been implemented in a Python routine, but it is slow to iterate over all of the rows in table 1 as it is large. I thought I was there with a single SQL statement, but I found that my current SQL results in duplicate table 2 rows in the output data.
I would recommend using a nested select to get whatever results you're looking for.
For instance:
select *
from person p
where p.name_first = 'SCCJS'
and not exists (select 'x' from person p2 where p2.person_id != p.person_id
and p.name_first = 'SCCJS' and p.name_last = 'SC')

For MS Access SQL, want to use EXCEPT in Access for three columns in table1 to 1 column in table 2

I have 3 columns in table A. I am trying to design a query that will call out all the values (in the three columns) that do not apepar in the 1 column I have in table B. If it helps to make it more clear, table B is a list of currencies in ISO codes and table A is three columns of currencies being used, I am identifying all those values that are NOT using ISO codes to denote their currency.
Currently, I can't seem to get them all to match to the one column, so I made 2 more columns in table B so I can match them individually. My constraints are, I cannot change table A and I must do this in one query. What I got so far is below.
SELECT m.Currency1, i.ISO_Code, m.Currency2 , i.ISO_Code1, m.Currency3, i.ISO_Code2
FROM A AS m
LEFT JOIN B AS i
ON m.Currency=i.ISO_Code
AND m.Currency2=i.ISO_Code1
AND m.Currency3=i.ISO_Code2
WHERE i.ISO_Code is NULL
OR i.ISO_Code1 is NULL
OR i.ISO_Code2 is NULL;
I wouldn't bother making multiple columns in 'B'. I played with this in SQLFiddle and got it to work.
Something like this:
SELECT
m.Currency1, i.ISO_Code,
m.Currency2, j.ISO_Code AS ISO_Code1,
m.Currency3, k.ISO_Code AS ISO_Code2
FROM A AS m
LEFT JOIN B as i
ON m.Currency1 = i.ISO_Code
LEFT JOIN B as j
ON m.Currency2 = j.ISO_Code
LEFT JOIN B as k
ON m.Currency3 = k.ISO_Code
WHERE
i.ISO_Code IS NULL OR
j.ISO_Code IS NULL OR
k.ISO_Code IS NULL

SQL Cartesian product joining table to itself and inserting into existing table

I am working in phpMyadmin using SQL.
I want to take the primary key (EntryID) from TableA and create a cartesian product (if I am using the term correctly) in TableB (empty table already created) for all entries which share the same value for FieldB in TableA, except where TableA.EntryID equals TableA.EntryID
So, for example, if the values in TableA were:
TableA.EntryID TableA.FieldB
1 23
2 23
3 23
4 25
5 25
6 25
The result in TableB would be:
Primary key EntryID1 EntryID2 FieldD (Default or manually entered)
1 1 2 Default value
2 1 3 Default value
3 2 1 Default value
4 2 3 Default value
5 3 1 Default value
6 3 2 Default value
7 4 5 Default value
8 4 6 Default value
9 5 4 Default value
10 5 6 Default value
11 6 4 Default value
12 6 5 Default value
I am used to working in Access and this is the first query I have attempted in SQL.
I started trying to work out the query and got this far. I know it's not right yet, as I’m still trying to get used to the syntax and pieced this together from various articles I found online. In particular, I wasn’t sure where the INSERT INTO text went (to create what would be an Append Query in Access).
SELECT EntryID
FROM TableA.EntryID
TableA.EntryID
WHERE TableA.FieldB=TableA.FieldB
TableA.EntryID<>TableA.EntryID
INSERT INTO TableB.EntryID1
TableB.EntryID2
After I've got that query right, I need to do a TRIGGER query (I think), so if an entry changes it's value in TableA.FieldB (changing it’s membership of that grouping to another grouping), the cartesian product will be re-run on THAT entry, unless TableB.FieldD = valueA or valueB (manually entered values).
I have been using the Designer Tab. Does there have to be a relationship link between TableA and TableB. If so, would it be two links from the EntryID Primary Key in TableA, one to each EntryID in TableB? I assume this would not work because they are numbered EntryID1 and EntryID2 and the name needs to be the same to set up a relationship?
If you can offer any suggestions, I would be very grateful.
Research:
http://www.fluffycat.com/SQL/Cartesian-Joins/
Cartesian Join example two
Q: You said you can have a Cartesian join by joining a table to itself. Show that!
Select *
From Film_Table T1,
Film_Table T2;
No, you don't want a cartesian product where you join tables without any condition. What you are looking for is a simple self join:
insert into TableB(EntryID1, EntryID2)
select x.EntryID, y.EntryID
from TableA x
join TableA y on x.FieldB = y.FieldB and x.EntryID <> y.EntryID;
EDIT: As you want this table to be up-to-date all the time, consider a view instead of a table (only, then you could not have a manually maintained FieldD).
This should give you the 'cartesian' (its not actually a cartesian, as #ThorstenKettner mentions below - its a self join) you're after, and insert it into TableB:
INSERT INTO TableB (EntryId1, EntryId2, fieldD)
SELECT a.EntryID, b.EntryID, 'Default Value'
FROM TableA a, TableA b
WHERE a.FieldB=b.FieldB
AND a.EntryID<>b.EntryID
You don't need any relationship between the two tables for the Trigger, although I would suggest you have a foreign key relationship setup anyway, so that you never get an entry in TableB.EntryID1 or TableB.EntryID2 that doesn't have a corresponding entry in TableA.EntryID..
For the Triggers, you'd do something like this for the insert (you don't need to check TableB in this case because you know that your new TableA.EntryId doesn't exist there yet:
CREATE TRIGGER ins_tableA AFTER INSERT ON TableA
FOR EACH ROW
BEGIN
INSERT INTO TableB (EntryId1, EntryId2, fieldD)
SELECT a.EntryID, b.EntryID, 'Default Value'
FROM TableA a, TableA b
WHERE a.FieldB=b.FieldB
AND a.EntryID=NEW.EntryID
AND a.EntryID<>b.EntryID;
END;
And for the update, you could delete all the corresponding rows from TableB first, and then re-run the insert. Something like this:
CREATE TRIGGER upd_tableA AFTER UPDATE ON TableA
FOR EACH ROW
BEGIN
DELETE FROM TableB b
WHERE b.EntryId1 = NEW.EntryId
OR b.EntryId2 = NEW.EntryId;
INSERT INTO TableB (EntryId1, EntryId2, fieldD)
SELECT a.EntryID, b.EntryID, 'Default Value'
FROM TableA a, TableA b
WHERE a.FieldB=b.FieldB
AND a.EntryID=NEW.EntryID
AND a.EntryID<>b.EntryID;
END;
None of this is tested I'm afraid, but hopefully it'll put you on the right track..

Query to find duplicate values for two fields

Sorry for the Title, But didn't know how to explain.
I have a table that have 2 fields A and B.
I want find all rows in the table that have duplicate A (more than one record) but at the same time A will consider as a duplicate only if B is different in both rows.
Example:
FIELD A Field B
10 10
10 10 // This is not duplicate
10 10
10 5 // this is a duplicate
How to to this in a single query
Let's break this down into how you would go about constructing such a query. You don't make it clear whether you're looking for all values of A or all rows but let's assume all values of A initially.
The first step therefore is to create a list of all values of A. This can be done two ways, DISTINCT or GROUP BY. I'm going to use GROUP BY because of what else you want to do:
select a
from your_table
group by a
This returns a single column that is unique on A. Now, how can you change this to give you the unique values? The most obvious thing to use is the HAVING clause, which allows you to restrict on aggregated values. For instance the following will give you all values of A which only appear once in the table
select a
from your_table
group by a
having count(*) = 1
That is the count of all values of A inside the group is 1. You don't want this of course, you want to do this with the column B. You need there to exist more than one value of B in order for the situation you want to identify to be possible (if there's only one value of B then it's impossible). This gets us to
select a
from your_table
group by a
having count(b) > 1
This still isn't enough as you want two different values of B. The above just counts the number of records with the column B. Inside an aggregate function you use the DISTINCT keyword to determine unique values; bringing us to:
select a
from your_table
group by a
having count(distinct b) > 1
To transcribe this into English this means select all unique values of A from YOUR_TABLE that have more than one values of B in the group.
You can use this method, or something similar, to build up your own queries as you create them. Determine what you want to achieve and slowly build up to it.
select FIELD from your_table group by FIELD having count(b) > 1
take in consideration that this will return count of all duplicate
example
if you have values
1
1
2
1
it will return 3 for value 1 not 2

SQL INNER JOIN vs. WHERE ID IN(...) not the same results

I was surprised by the outcome of these two queries. I was expecting same from both. I have two tables that share a common field but there is not a relationship set up. The table (A) has a field EventID varchar(10) and table (B) has a field XXNumber varchar(15).
Values from table B column XXNumber are referenced in table A column EventID. Even though XXNumber can hold 15 chars, none of the 179K rows of data is longer than 10 chars.
So the requirement was:
"To avoid Duplicate table B and table A entries, if the XXNumber is contained in a table A >“Event ID” number, then it should not be counted."
To see how many common records I have I ran this query first - call it query alpha"
SELECT dbo.TableB.XXNumber FROM dbo.TableB WHERE dbo.TableB.XXNumber in
( select distinct dbo.TableA.EventId FROM dbo.TableA )
The result was 5322 rows.
The following query - call it query delta which looks like this:
SELECT DISTINCT dbo.TableB.XXNumber, dbo.TableB.EventId
FROM dbo.TableB INNER JOIN dbo.TableA ON dbo.TableB.XXNumber= dbo.TableB.EventId
haas returned 4308 rows.
Shouldn't the resulting number of rows be the same?
The WHERE ID IN () version will select all rows that match each distinct value in the list (regardless of whether you code DISTINCT indide the inner select or not - that's irrelevant). If a given value appears in the parent table more than once, you'll get multipke rows selected from the parent table for that single value found in the child table.
The INNER JOIN version will select each row from the parent table once for every successful join, so if there are 3 rows in the child table with the value, and 2 in the parent, then there will be 6 rows rows in the result for that value.
To make them "the same", add 'DISTINCT' to your main select.
To explain what you're seeing, we'd need to know more about your actual data.