Query three non-bisecting sets of data - sql

I am retrieving three different sets of data (or what should be "unique" rows). In total, I expect 3 different unique sets of rows because I have to complete different operations on each set of data. I am, however, retrieving more rows than there are in total in the table, meaning that I must be retrieving duplicate rows somewhere. Here is an example of my three sets of queries:
SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON t2.ID = t1.ID
AND t2.NAME = t1.NAME
AND t2.ADDRESS <> t1.ADDRESS
SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON t2.ID = t1.ID
AND t2.NAME <> t1.NAME
AND t2.ADDRESS <> t1.ADDRESS
SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON t2.ID <> t1.ID
AND t2.NAME = t1.NAME
AND t2.ADDRESS <> t1.ADDRESS
As you can see, I am selecting (in query order):
Set of data where the id AND name match
Set of data where the id matches but the name does NOT
Set of data where the id does not match but the name DOES
Adding up the results of all three queries gives MORE rows than exist in T1, which I don't think is logically possible. If it is possible, I must be duplicating rows somewhere, and that prevents me from executing a different command against each set (a row would have more than one command executed on it).
Can someone find where I'm going wrong here?

Consider if Name is not unique. If you have the following data:
Table 1
ID Name Address
0 Jim Smith 1111 A St
1 Jim Smith 2222 B St
Table 2
ID Name Address
0 Jim Smith 2222 A St
1 Jim Smith 3333 C St
Then Query 1 gives you:
0 Jim Smith 1111 A St
1 Jim Smith 2222 B St
Because rows 1 & 2 in Table 1 match rows 1 & 2, respectively in Table 2.
Query 2 gives you nothing.
Query 3 gives you
0 Jim Smith 1111 A St
1 Jim Smith 2222 B St
Because row 1 in Table 1 matches row 2 in Table 2 and row 2 in Table 1 matches row 1 in Table 2. Thus you get 4 rows out of Table 1 when there are only 2 rows in it.
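The Jim Smith example can be reproduced with a short script. This is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database, and the lowercase table/column names and the count_rows helper are assumptions made for illustration. The three queries themselves are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT, address TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT, address TEXT);
    INSERT INTO table1 VALUES (0, 'Jim Smith', '1111 A St'), (1, 'Jim Smith', '2222 B St');
    INSERT INTO table2 VALUES (0, 'Jim Smith', '2222 A St'), (1, 'Jim Smith', '3333 C St');
""")

# The three queries differ only in the comparison used for ID and NAME.
QUERY = """
    SELECT DISTINCT t1.*
    FROM table1 t1
    INNER JOIN table2 t2
       ON t2.id {id_op} t1.id
      AND t2.name {name_op} t1.name
      AND t2.address <> t1.address
"""

def count_rows(id_op, name_op):
    """Return how many table1 rows one variant of the query yields."""
    sql = QUERY.format(id_op=id_op, name_op=name_op)
    return len(conn.execute(sql).fetchall())

# Query 1, query 2, query 3, in the order used in the question.
counts = [count_rows("=", "="), count_rows("=", "<>"), count_rows("<>", "=")]
table1_size = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
print(counts, sum(counts), table1_size)
```

With this data the three counts add up to more than the number of rows in table1, even though each query on its own uses DISTINCT.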

Are you sure that NAME and ID are unique in both tables?
If not, you could have a situation, for example, where table 1 has this:
NAME: Fred
ID: 1
and table2 has this:
NAME: Fred
ID: 1
NAME: Fred
ID: 2
In this case, the record in table1 will be returned by two of your queries: ID and NAME both match, and NAME matches but ID doesn't.
You might be able to narrow down the problem by intersecting each combination of two queries to find out what the duplicates are, e.g.:
SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON t2.ID = t1.ID
AND t2.NAME = t1.NAME
AND t2.ADDRESS <> t1.ADDRESS
INTERSECT
SELECT DISTINCT t1.*
FROM table1 t1
INNER JOIN table2 t2
ON t2.ID = t1.ID
AND t2.NAME <> t1.NAME
AND t2.ADDRESS <> t1.ADDRESS

Assuming that T2.ID has a unique constraint, it is still quite logically possible for this scenario to occur. Suppose that for every record in T1 there are two corresponding records in T2:
Same name, same id, different address
Same name, different id, different address
Then the same record for T1 can come up in the first and third query for example.
It is also possible to simultaneously get the same row in the second and third query.
If T2.ID is not guaranteed to be unique, then you could get the same row from T1 in all three queries.

I think the last query could be the one fetching the extra rows,
i.e. it relies on NAME matching in both tables (and not on ID).

To find the offending data (and help find your logic hole) I would recommend:
(caution: pseudo-code)
Limit the results of each query to just SELECT id FROM ....
UNION ALL the result sets (a plain UNION would remove the duplicates you are looking for)
COUNT(id)
GROUP BY id
HAVING COUNT(id) > 1
This will show the records that match more than one sub-query.
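The steps above could be sketched roughly as follows. This is an illustration using Python's sqlite3 module, not the asker's real schema: the two-row sample data, the lowercase names and the "combined" alias are assumptions. Note the UNION ALL, which keeps an id once per query that returned it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT, address TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT, address TEXT);
    INSERT INTO table1 VALUES (0, 'Jim Smith', '1111 A St'), (1, 'Jim Smith', '2222 B St');
    INSERT INTO table2 VALUES (0, 'Jim Smith', '2222 A St'), (1, 'Jim Smith', '3333 C St');
""")

# One arm per original query, reduced to selecting only the id.
ARM = """
    SELECT DISTINCT t1.id
    FROM table1 t1
    INNER JOIN table2 t2
       ON t2.id {id_op} t1.id
      AND t2.name {name_op} t1.name
      AND t2.address <> t1.address
"""
arms = [ARM.format(id_op=i, name_op=n)
        for i, n in (("=", "="), ("=", "<>"), ("<>", "="))]

# Stack the three result sets, then keep ids returned by more than one query.
sql = """
    SELECT id, COUNT(id) AS hits
    FROM ({union}) AS combined
    GROUP BY id
    HAVING COUNT(id) > 1
    ORDER BY id
""".format(union=" UNION ALL ".join(arms))

overlaps = conn.execute(sql).fetchall()
print(overlaps)
```

Each returned pair is an id together with the number of queries that matched it, which points directly at the rows that would receive more than one command.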

Related

SQL comparing two sets of complex data

Sorry, probably the title is not the best one but I hope you will understand what problem I have.
I need to compare and analyse two sets of data, and I'm using MS-Access for that. My data is organized in two tables. The following is not the real data I'm working with, but it will serve as an example:
TABLE 1
ID Name
1 Zoie
2 Rohan
2 Simon
3 Jerome
4 Jakob
4 Mathew
4 Cora
6 Keely
7 Aiyana
7 Jake
8 Reid
9 Emerson
TABLE 2
ID Name
1 Michael
2 Rohan
2 Simon
3 Jill
4 Jakob
4 Cora
5 Charlie
7 John
8 Reid
9 Yadiel
9 Emerson
9 Paris
So, I need to select only those IDs which fully correspond in both tables (all names under the ID are the same in both). Those are: 2 and 8
I would also like a separate select statement that returns IDs 2 and 8 plus the IDs whose table 1 names all appear in table 2 (all names from table 1, plus possibly some extra names in table 2 under the same ID). That would be: 2, 8, 9
I would also like a separate select statement that returns IDs 2 and 8 plus the IDs whose table 2 names all appear in table 1 (all names from table 2, plus possibly some extra names in table 1 under the same ID). That would be: 2, 4, 8
I would also like a separate select statement that combines the last two.
The result would be: 2, 4, 8, 9
I would appreciate any suggestions.
Thanks in advance!
Best regards,
Mario
Q#1:
select id
from table1
group by id
having count(*) =
(
select count(*)
from table2
group by table2.id
having table2.id = table1.id
)
and count(*) =
(
select count(*)
from table1 table1_1
inner join table2 on table1_1.id = table2.id and table1_1.name = table2.name
group by table1_1.id
having table1_1.id = table1.id
)
Explanation of this query:
It is grouping table1 by ID
For each group (for each ID), it is counting the number of rows in table1 that have this ID.
For each group, it is counting the number of rows in table2 that have this ID.
For each group, it is counting the number of rows where the name appears in both tables for this ID (it does that by inner joining table1 and table2 on the ID and Name which means only rows where both ID and Name match in both tables will be counted, for each ID).
It then returns the IDs (from table1) for which all of the above counts are equal. This is what restricts the result to IDs where all names are in both tables (no more, no less).
Q#2 - In this case you don't care that table2 has the same number of names per ID. So remove the first sub-query (that counts matching rows in table2).
select id
from table1
group by id
having count(*) =
(
select count(*)
from table1 table1_1
inner join table2 on table1_1.id = table2.id and table1_1.name = table2.name
group by table1_1.id
having table1_1.id = table1.id
)
Although the above is easy enough to understand, since it follows the same logic as Q#1, the following is more straightforward and probably more efficient. The difference only matters if the first form runs too slowly for your data (which is subjective and context dependent).
select table1.id
from table1
left join table2 on table1.id = table2.id and table1.name = table2.name
group by table1.id
having count(table1.id) = count(table2.id)
Here, the two tables are LEFT (outer) joined which means all records from table1 are gathered and records in table2 that match by ID and Name are also included alongside. Then, we group them by ID and we compare the count of each group in table1 with those that had matching names in table2.
Q#3 - This case is the same as Q#2 except table1 and table2 are swapped.
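To check the LEFT JOIN form against the sample data, here is a small sketch using Python's sqlite3 module (the original question is about MS-Access, so treat this purely as a way to test the logic). The TEMPLATE string and the fully_covered_ids helper are names made up for this example. Swapping the two table names gives Q#3, and, assuming no name is repeated under the same ID within one table, the IDs common to both results are the Q#1 IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [
    (1, "Zoie"), (2, "Rohan"), (2, "Simon"), (3, "Jerome"),
    (4, "Jakob"), (4, "Mathew"), (4, "Cora"), (6, "Keely"),
    (7, "Aiyana"), (7, "Jake"), (8, "Reid"), (9, "Emerson"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [
    (1, "Michael"), (2, "Rohan"), (2, "Simon"), (3, "Jill"),
    (4, "Jakob"), (4, "Cora"), (5, "Charlie"), (7, "John"),
    (8, "Reid"), (9, "Yadiel"), (9, "Emerson"), (9, "Paris"),
])

# IDs of the left table whose every (id, name) row has a match in the right table.
TEMPLATE = """
    SELECT a.id
    FROM {left} a
    LEFT JOIN {right} b ON a.id = b.id AND a.name = b.name
    GROUP BY a.id
    HAVING COUNT(a.id) = COUNT(b.id)
    ORDER BY a.id
"""

def fully_covered_ids(left, right):
    sql = TEMPLATE.format(left=left, right=right)
    return [row[0] for row in conn.execute(sql)]

q2 = fully_covered_ids("table1", "table2")  # all table1 names also in table2
q3 = fully_covered_ids("table2", "table1")  # all table2 names also in table1
q1 = sorted(set(q2) & set(q3))              # covered in both directions
print(q2, q3, q1)
```

COUNT(b.id) skips the NULLs produced by unmatched rows, which is why comparing the two counts detects a missing name.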
Q#4 - In this case you only care about IDs that have at least one name that appears in both tables. So join the tables and return the distinct IDs:
select distinct table1.id
from table1
inner join table2 on table1.id = table2.id and table1.name = table2.name
Here is a SQLFiddle to play with containing the four queries: http://www.sqlfiddle.com/#!18/3fc71/22

SQL: checking if row contains contents of another cell

I'm having difficulty with a SQL query.
I have two tables:
Table 1
ID/First Name
1 Ben
2 Barry
3 Birl
Table 2
ID/Full name
1 Ben Rurth
2 Barry Bird
3 Burney Saf
I want to run a check between the two tables where if the contents of the First Name in Table 1 is not in the Full name in table 2 the result will be returned, e.g. returning id 3, Birl, in the above example.
I have been trying queries like:
SELECT First_Name
from Table_1
WHERE NOT EXIST (SELECT Full_name from Table_2)
with no luck so far.
You can use the LIKE operator combined with string concatenation.
SELECT t1.First_Name,t2.Full_Name
FROM Table1 t1
JOIN Table2 t2 ON t1.ID = t2.ID
WHERE t2.Full_Name NOT LIKE '%' || t1.First_Name || '%'
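Or, if your database uses CONCAT rather than || for concatenation (in MySQL, for example, || is a logical OR by default):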
SELECT t1.First_Name,t2.Full_Name
FROM Table1 t1
JOIN Table2 t2 ON t1.ID = t2.ID
WHERE t2.Full_Name NOT LIKE CONCAT('%', t1.First_Name, '%')
This assumes that both tables share the ID column.
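As a quick check of the || variant, the sample rows can be loaded into an in-memory SQLite database from Python. This is a sketch only: the table names Table1/Table2 and the column names First_Name/Full_Name follow the answer above, and the ID column is added to the select list just to make the output easier to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INTEGER, First_Name TEXT)")
conn.execute("CREATE TABLE Table2 (ID INTEGER, Full_Name TEXT)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?)",
                 [(1, "Ben"), (2, "Barry"), (3, "Birl")])
conn.executemany("INSERT INTO Table2 VALUES (?, ?)",
                 [(1, "Ben Rurth"), (2, "Barry Bird"), (3, "Burney Saf")])

# Rows whose full name does not contain the first name stored under the same ID.
mismatches = conn.execute("""
    SELECT t1.ID, t1.First_Name, t2.Full_Name
    FROM Table1 t1
    JOIN Table2 t2 ON t1.ID = t2.ID
    WHERE t2.Full_Name NOT LIKE '%' || t1.First_Name || '%'
""").fetchall()
print(mismatches)
```

Only the Birl row is returned, which is the result asked for in the question.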

After Table Joining, need specific row value based on another Row with same ID

I have 2 tables as follows:
Table 1:
ID FName
1 Basics
2 Machine1
3 master
4 Machine2
15 Machine3
16 Machine16
Table 2:
ParentID Name InitialValue
1 Active 1
2 MachineName Entrylevel
2 Active 1
3 Active 1
4 MachineName Midlevellevel
4 Active 1
15 MachineName Endlevel
15 Active 1
16 MachineName Miscellenious
16 Active 0
Here, the ID of Table 1 is referenced as ParentID in Table 2. I want the InitialValue of the MachineName rows in Table 2, provided that the InitialValue of the Active row for the same ParentID is 1.
Result should be like
ID InitialValue
2 Entrylevel
4 Midlevellevel
15 Endlevel
You could join the second table twice, once for MachineName, and once for Active:
SELECT t.ID, machine.InitialValue
FROM table1 t
INNER JOIN table2 machine
ON t.ID = machine.ParentId
AND machine.Name = 'MachineName'
INNER JOIN table2 active
ON t.ID = active.ParentId
AND active.Name = 'Active'
AND active.InitialValue = 1;
About Joins
The JOIN syntax allows you to link records to the previous table in your FROM list, most of the time via a foreign key - primary key relationship. In the distant past we used to do that with a WHERE condition, but that syntax is outdated.
In the above query, that relationship of primary key - foreign key is expressed with t.ID = machine.ParentId in the first case. Note the alias that was defined for table2, so we can refer to it with machine.
Some extra condition(s) are added to the join condition, such as machine.Name = 'MachineName'. Those could just as well have been placed in a WHERE clause, but I like it this way.
Then the same table is joined again, this time with another alias. This time it filters the "Active" 1 records. Note that if the ID in table1 does not have a matching record with those conditions, that parent record will be excluded from the results.
So now we have the table1 records with a matching "MachineName" record and are sure there is an "Active" 1 record for it as well. This is what needs to be output.
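The double join can be tried out on the sample data with a short script. This sketch uses Python's sqlite3 module; as an assumption for the example, InitialValue is stored as text (it holds both names and 0/1 flags), so the flag is compared with the string '1'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (ID INTEGER, FName TEXT)")
conn.execute("CREATE TABLE table2 (ParentID INTEGER, Name TEXT, InitialValue TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [
    (1, "Basics"), (2, "Machine1"), (3, "master"),
    (4, "Machine2"), (15, "Machine3"), (16, "Machine16"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)", [
    (1, "Active", "1"),
    (2, "MachineName", "Entrylevel"), (2, "Active", "1"),
    (3, "Active", "1"),
    (4, "MachineName", "Midlevellevel"), (4, "Active", "1"),
    (15, "MachineName", "Endlevel"), (15, "Active", "1"),
    (16, "MachineName", "Miscellenious"), (16, "Active", "0"),
])

# table2 is joined twice: once for the MachineName row, once for the Active row.
rows = conn.execute("""
    SELECT t.ID, machine.InitialValue
    FROM table1 t
    INNER JOIN table2 machine
       ON t.ID = machine.ParentID
      AND machine.Name = 'MachineName'
    INNER JOIN table2 active
       ON t.ID = active.ParentID
      AND active.Name = 'Active'
      AND active.InitialValue = '1'
    ORDER BY t.ID
""").fetchall()
print(rows)
```

IDs 1 and 3 drop out because they have no MachineName row, and ID 16 drops out because its Active value is 0.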
Not sure if this is standard SQL but it should work using MySQL.
select T1.ID, T2.InitialValue
from Table1 T1 inner join Table2 T2 on T1.ID = T2.ParentId
where
T2.Name <> 'Active'
and exists (
select * from Table2 T3 where T3.ParentId = T1.ID and T3.Name = 'Active' and T3.InitialValue = 1
)
SELECT t1.ID, t2.InitialValue
FROM table1 t1 JOIN table2 t2 ON t1.ID = t2.ParentID
WHERE t2.Name = 'MachineName'
AND t1.ID = ANY (SELECT t22.ParentID
FROM table2 t22
WHERE t22.Name = 'Active' AND t22.InitialValue = 1)
I think this should work.
(I slightly changed the condition in the WHERE clause: t2.ParentID became t1.ID.)

SQL check adjacent rows for sequence

I have a table with an id column (unique, primary), a name (not unique--in fact, most likely repeated), and a flag column which has values 0, 1, or 2. Let's say I reorder the table using the command
SELECT id, name, flag FROM tablename ORDER BY name, id
I want to produce using SQL a list of names where, when the rows in the reordering are read downward, there are two adjacent rows with the same name and flags of value 0 and 1 (in that order). Additionally, in that list, I want to have the ids of the two rows where this happened. If it happened more than once, there should be multiple rows.
So, for instance, in the following
id name flag
4 Bob 0
5 Bob 2
6 Bob 1
1 Cathy 0
7 Cathy 1
3 David 0
2 Elvis 2
8 Elvis 0
9 Elvis 1
I would want to select
name id1 id2
Cathy 1 7
Elvis 8 9
How do I do this?
I'm using MySQL.
EDIT: Note that the IDs for those adjacent rows might not be consecutive; they're only consecutive if we order by name. See, for example, Cathy.
Thanks!
Try (note that this only works when the ids of the two adjacent rows are consecutive):
select t1.name, t1.id as id1,t2.id as id2
from tablename t1, tablename t2
where t1.flag = 0 and t2.id = t1.id+1 and t2.flag = 1 and t1.name = t2.name
Try:
select t1.name, t1.id as id1,t2.id as id2
from tablename t1
join tablename t2
on t2.name = t1.name and t1.flag = 0 and t2.flag = 1 and t2.id =
(select min(t3.id) from tablename t3
where t1.name = t3.name and t1.id < t3.id)
EDIT: amended join to t2 to include lookup on subquery
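Here is a sketch of the self-join with a "next id for the same name" subquery, run against the sample rows from the question. It uses Python's sqlite3 module rather than MySQL, only so the logic can be checked quickly. The flags are pinned to 0 for the first row and 1 for the second, which is the pair the question asks for; the table name tablename is the placeholder used in the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, name TEXT, flag INTEGER)")
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?)", [
    (4, "Bob", 0), (5, "Bob", 2), (6, "Bob", 1),
    (1, "Cathy", 0), (7, "Cathy", 1),
    (3, "David", 0),
    (2, "Elvis", 2), (8, "Elvis", 0), (9, "Elvis", 1),
])

# t2 must be the row that directly follows t1 when ordering by name, id:
# same name, and the smallest id greater than t1.id.
pairs = conn.execute("""
    SELECT t1.name, t1.id AS id1, t2.id AS id2
    FROM tablename t1
    JOIN tablename t2
      ON t2.name = t1.name
     AND t1.flag = 0
     AND t2.flag = 1
     AND t2.id = (SELECT MIN(t3.id)
                  FROM tablename t3
                  WHERE t3.name = t1.name AND t3.id > t1.id)
    ORDER BY t1.name, t1.id
""").fetchall()
print(pairs)
```

Bob is not reported because the row that follows id 4 has flag 2, so the 0 and the 1 are not adjacent.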

Advanced SQL SELECT

I have two basic tables. Let's call them tbl1 and tbl2.
tbl1
1 - apple
2 - orange
3 - banana
4 - pineapple
tbl2
1 - table
2 - chair
3 - sofa
Then there is tbl3 that has foreign keys that link it to both tables above.
The form for tbl3 has two select fields: one that queries tbl1 and another that queries tbl2. So far, so good. I combine items from tbl1 and tbl2 in tbl3.
Then I have the following situation: when users fill in the tbl3 form, I want select2 (corresponding to data from tbl2) to reload after the user has picked an item from tbl1. What for?
Let's say that the first time the user fills in the tbl3 form, he picks "apple" from tbl1 and "sofa" from tbl2. When the user selected "apple", the second dropdown list reloaded with all 3 items as options.
Now, the second time the user fills in a tbl3 form, if the user selects "apple" again, he will have only 2 options, because apple AND sofa have been picked before. The options are now only "table" and "chair". If he picks "apple" and "table" now, the remaining option for the first item "apple" will be "chair". And so on...
I cannot think of an SQL query that grabs these elements from tbl2. I must use the selected element from tbl1 and somehow grab the REMAINING items, those that do not correspond to a match. Is this SQL query possible? I believe so, but cannot think of a way out.
If the links are not on a per user basis:
SELECT t.id,
t.value
FROM TABLE2 t
LEFT JOIN TABLE3 t3 ON t3.id = t.id
WHERE t3.id IS NULL
If the links are on a per user basis:
SQL Server:
SELECT t.id,
t.value
FROM TABLE2 t
LEFT JOIN TABLE3 t3 ON t3.id = t.id AND t3.userid = @userId
WHERE t3.id IS NULL
Oracle:
SELECT t.id,
t.value
FROM TABLE2 t
LEFT JOIN TABLE3 t3 ON t3.id = t.id AND t3.userid = :userId
WHERE t3.id IS NULL
I am guessing that table3 has columns:
UserID, Tbl1_ID, Tbl2_ID
I am guessing that you want, for user X, the values from tbl2, based on the selection made from tbl1, that are NOT in table 3.
SELECT ID, Value
FROM tbl2
WHERE tbl2.ID NOT IN (
SELECT Tbl2_ID
FROM Table3
WHERE (UserID = <this user>) AND (Tbl1_ID = <selected tbl1 ID>)
)
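To see the NOT IN approach behave like the dropdown described in the question, here is a sketch using Python's sqlite3 module. Everything about table3 is a guess, as in the answer above: the column names user_id, tbl1_id and tbl2_id, the user id 42 and the remaining_options helper are all hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl2 (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("CREATE TABLE table3 (user_id INTEGER, tbl1_id INTEGER, tbl2_id INTEGER)")
conn.executemany("INSERT INTO tbl2 VALUES (?, ?)",
                 [(1, "table"), (2, "chair"), (3, "sofa")])

# Hypothetical user 42 has already combined apple (tbl1 id 1) with sofa (tbl2 id 3).
conn.execute("INSERT INTO table3 VALUES (42, 1, 3)")

REMAINING = """
    SELECT id, value
    FROM tbl2
    WHERE id NOT IN (SELECT tbl2_id
                     FROM table3
                     WHERE user_id = ? AND tbl1_id = ?)
    ORDER BY id
"""

def remaining_options(user_id, tbl1_id):
    """tbl2 items this user has not yet combined with the given tbl1 item."""
    return conn.execute(REMAINING, (user_id, tbl1_id)).fetchall()

after_first_pick = remaining_options(42, 1)   # apple: sofa is gone
conn.execute("INSERT INTO table3 VALUES (42, 1, 1)")  # now apple + table
after_second_pick = remaining_options(42, 1)  # apple: only chair is left
for_orange = remaining_options(42, 2)         # orange: nothing picked yet
print(after_first_pick, after_second_pick, for_orange)
```

One thing to watch with NOT IN: if the subquery can return a NULL tbl2_id, the outer query returns no rows at all, so the column should be NOT NULL or the NULLs filtered out.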