Comparing two Access tables identical in structure but not data - sql

I have two tables in Access database, Table1 and Table2 with exactly the same structure but Table1 has more data. I want to figure out which data am I missing from Table2. The primary key for each table is composed of text fields:
CenterName
BuildingName
FloorNo
RoomNo
Each center can have many buildings and two different centers can have a building with the same name. Also room numbers and floor numbers can be the same across different buildings and different centers.
I have tried
SELECT t1.CenterName, t1.BuildingName, t1.FloorNo, t1.RoomNo, t2.CenterName
FROM Table1 as t1 LEFT JOIN Table2 as t2 ON t1.CenterName=t2.CenterName
WHERE t2.CenterName Is Null;
But the above does not return any data, meaning all the Centers are in both tables. But it does not tell me anything about the rest of the fields that might be missing from Table2.
Can anyone please help to re-write my query so it works as intended?
I am used to SQL Server database so building queries in Access is a bit time consuming for me. Before I transfer all the data into SQL Server for analysis I wanted to see if I can get any help here.

Join on all four of the fields which make up the primary key.
SELECT
t1.CenterName,
t1.BuildingName,
t1.FloorNo,
t1.RoomNo,
t2.CenterName
FROM
Table1 AS t1
LEFT JOIN Table2 AS t2
ON
t1.CenterName = t2.CenterName
AND t1.BuildingName = t2.BuildingName
AND t1.FloorNo = t2.FloorNo
AND t1.RoomNo = t2.RoomNo
WHERE t2.CenterName Is Null;

Related

SQL join on multiple columns or on single calculated column

I'm migrating the backend a budget database from Access to SQL Server and I ran into an issue.
I have 2 tables (let's call them t1 and t2) that share many fields in common: Fund, Department, Object, Subcode, TrackingCode, Reserve, and FYEnd.
If I want to join the tables to find records where all 7 fields match, I can create an inner join using each field:
SELECT *
FROM t1
INNER JOIN t2
ON t1.Fund = t2.Fund
AND t1.Department = t2.Department
AND t1.Object = t2.Object
AND t1.Subcode = t2.Subcode
AND t1.TrackingCode = t2.TrackingCode
AND t1.Reserve = t2.Reserve
AND t1.FYEnd = t2.FYEnd;
This works, but it runs very slowly. When the backend was in Access, I was able to solve the problem by adding a calculated column to both tables. It basically, just concatenated the fields using "-" as a delimiter. The revised query is as follows:
SELECT *
FROM t1 INNER JOIN t2
ON CalculatedColumn = CalculatedColumn
This looks cleaner and runs much faster. The problem is when I moved t1 and t2 to SQL Server, the same query gives me an error message:
I'm new to SQL Server. Can anyone explain what's going on here? Is there a setting I need to change for the calculated column?
Posting this as an answer from my comment.
Usually, this is an issue with mismatched Data types between the two columns referenced. Check and make sure the data types of the two fields (CompositeID) are the same.
You have to calculate the columns before joining them as the ON clause can only access columns for the table.
It is no good to have two identical tables anyway so you should rethink your design completely.
SELECT t1a.*,t2a.*
FROM (SELECT CalculatedColumn, * FROM t1) t1a INNER JOIN (SELECT CalculatedColumn, * FROM t2 ) t2a
ON t1a.CalculatedColumn = t2a.CalculatedColumn

How does JOIN work exactly in SQL

I know that joins work by combining two or more tables by their attributes, so if you have two tables that both have three columns and both have column INDEX, if you use table1 JOIN table2 you will get a new table with 5 columns, but what if you do not have a column that is shared by both table1 and table2? Can you still use JOIN or do you have to use TIMES?
Join is not a method for combining tables. It is a method to select records (and selected fields) from 2 or more tables where every table in the query must carry a field that can be matched to a field in another table in the query. The matched fields need not have the same name, but must carry the same type of data. Lacking this would be like trying to create meaning from joining a list of license plates of cars in NYC, with height data from lumberjacks in Washington state -- not meaningful.
Ex:)
Select h.name, h.home_address, h.home_phone, w.work_address,
w.department
from home h, work w
where h.employee_id = w.emp_id
As long as both columns: employee_id and emp_id carry the same information this query will work
In Microsoft Access, to get five rows from a three column table joined to a two column table, you'd use:
SELECT Table1.*, Table2.* FROM Table1 INNER JOIN Table2 ON Table1.Field1 = Table2.Field1;
You can query whatever you want, and join whatever you want, though.
If your one table is a list of people, and your other is a list of cars, and you want to see what people have names that are also models of cars, you can do:
SELECT Table1.Name, Table1.Age, Table2.Make, Table2.Year
FROM Table1 INNER JOIN Table2 ON Table1.Name = Table2.Model;
Only when Name is the same as Model will it show a record.
This is the same idea for joining tables in any relational DBMS I've used.
You are right you can join two tables even if they do not have shared column.
Join uses primary to prevent mistakes on inserting or deleting when user trying to insert record that does not has a parent one or some thing like this.
join methods has many types you can view them here:
http://dev.mysql.com/doc/refman/5.7/en/join.html
LEFT JOIN: select all records from first table, then selecting all records from second table that fulfilling the condition after ON clause.
you can't join the tables if they do not share a common column. If you can find a 3rd table that has common columns with table1 and table2 you can get them to join that way. so join table2 and tabl3 on a common column and than join table3 back to table1 on a common column.

Query data depends on data from other table

I'm trying to get data from a table which is only related through two different tables. For example, I have information "dataA" and need to get information "dataD".
How do I write a query to display dataA and dataD as they relate? I don't want to display all instances of dataD, just the ones that related to dataB and then to dataA. I'm sorry if this doesn't make enough sense.
You'll use JOINs to do this. Specifically, in this case, a LEFT OUTER JOIN is your best bet. Something like:
SELECT
Table1.dataA,
Table3.dataD
FROM
Table1
LEFT OUTER JOIN Table2 ON Table1.dataB = Table2.dataB
LEFT OUTER JOIN Table3 ON Table2.dataC = Table3.dataC
Working on the assumption that the "data" entries you put under each table name represent columns in the table, this is fairly simple:
select dataA
, dataD
from Table1 t1
join Table2 t2
on t1.dataB = t2.dataB
join Table3 t3
on T2.dataC = t3.dataC
If you need all the dataA regardless of whether there's a matching dataD (or other intervening column) you would change where I wrote 'join' to 'left join'.
And your question makes plenty of sense. :) I was just a little unclear from the diagram what you had in mind.

sql query to Combine different fields from 2 different tables with no relations except a common field - SQL Server Compact 3.5 SP2

My first question here. This has been a really helpful platform so far. I am some what a newbie in sql. But I have a freelance project in hand which I should release this month.(reporting application with no database writes)
To the point now: I have been provided with data (excel sheets with rows spanning up to 135000). Requirement is to implement a standalone application. I decided to use sql server compact 3.5 sp2 and C#. Due to time pressure(I thought it made sense too), I created tables based on each xls module, with fields of each tables matching the names of the headers in the xls, so that it can be easily imported via CSV import using SDF viewer or sql server compact toolbox added in visual studio. (so no further table normalizations done due to this reason).
I have a UI design for a typical form1 in which inputs from controls in it are to be checked in an sql query spanning 2 or 3 tables. (eg: I have groupbox1 with checkboxes (names matching field1,field2.. of table1) and groupbox2 with checkboxes matching field3, field4 of table2). also date controls based on which a common 'DateTimeField' is checked in each of the tables.
There are no foreign keys defined on tables for linking(did not arise the need to, since the data are different for each). The only commmon field is a 'DateTimeField'(same name) which exists in each table. (basically readings on a datetime stamp from locations. field1, field 2 etc are locations. For a particular datetime there may or may not be readings from table 1 or table2)
How will I accomplish an sql select query(using Union/joins/nested selects - if sql compact 3.5 supports it) to return fields from the 2 tables based on datetime(where clause). For a given date time there can be even empty values for fields in table 2. I have done a lot of research on this and tried as well. but not yet a good solution probably also due to my bad experience. apologies!
I would really appreciate any of your help! Can provide a sample of the data how it looks if you need it. Thanks in advance.
Edit:
Sample Data (simple as that)
Table 1
t1Id xDateTime loc1 loc2 loc3
(could not format the tabular schmema here. sorry. but this is self explanatory)
... and so on up to 135000 records existing imported from xls
Table 2
t2Id xDateTime loc4 loc5 loc6
.. and so on up to 100000 records imported from xls. merging table 1 and table 2 will result in a huge amount of blank rows/values for a date time.. hence leaving it as it is.
But a UI multiselect(loc1,loc2,loc4,loc5 from both t1 and t2) event from winform needs to combine the result from both tables based on a datetime.
... and so on
I managed to write it which comes very close. I say very close cause i have test in detail with different combination of inputs.. Thanks to No'am for the hint. Will mark as answer if everything goes well.
SELECT T1.xDateTime, T1.loc2, T2.loc4 FROM Table1 T1
INNER JOIN Table2 T2 ON T1.xDateTime = T2.xDateTime
WHERE (T1.xDateTime BETWEEN 'somevalue1' AND 'somevalue2')
UNION
SELECT T2.xDateTime, T1.loc2, T2.loc4 FROM Table1 T1
RIGHT JOIN Table2 T2 ON T1.xDateTime = T2.xDateTime
WHERE (T1.xDateTime BETWEEN 'somevalue1' AND 'somevalue2')
UNION
SELECT T1.xDateTime, T1.loc2, T2.loc4 FROM Table1 T1
LEFT JOIN Table2 T2 ON T1.xDateTime = T2.xDateTime
WHERE (T1.xDateTime BETWEEN 'somevalue1' AND 'somevalue2')
If 't1DateTime' and 't2DateTime' are the common fields, then apparently you need a query such as
SELECT table1.t1DateTime, table1.tiID, table1.loc2, table2.t2id, table2.loc4
FROM table1
INNER JOIN table2 ON table2.t2DateTime = table1.t1DateTime
This will give you values from rows which match in both tables, according to DateTime. If there is also supposed to be a match with the locations then you will have to add the desired condition to the 'ON' statement.
Based on your comment:
For a given date time there can be even empty values for fields in table 2
my understanding would be that you are not interested in orphaned records in table 2 (based on date) so in that case a LEFT JOIN would do it:
SELECT table1.t1DateTime, table1.tiID, table1.loc2, table2.t2id, table2.loc4
FROM table1
LEFT JOIN table2 ON table2.t2DateTime = table1.t1DateTime
However if there are also entries in table2 with no matching dates in table1 that you need to return you could try this:
SELECT table1.t1DateTime, table1.tiID, table1.loc2, ISNULL(table2.t2id, 0), ISNULL(table2.loc4, 0.0)
FROM table1
LEFT JOIN table2 ON table2.t2DateTime = table1.t1DateTime
WHERE (T1.t1DateTime BETWEEN 'somevalue1' AND 'somevalue2')
UNION ALL
SELECT table2.t2DateTime, '0', '0.0', table2.t2id, table2.loc4
FROM table2
LEFT OUTER JOIN table1 on table1.t1DateTime=table2.t2DateTime
WHERE table1.t1Datetime IS NULL AND T2.t2DateTime BETWEEN 'somevalue1' AND 'somevalue2'
Thanks a lot to #kbbucks.
Works with this so far.
SELECT T1.MonitorDateTime, T1.loc2, T.loc4
FROM Table1 T1
LEFT JOIN Table2 T2 ON T2.MonitorDateTime = T1.MonitorDateTime
WHERE T1.MonitorDateTime BETWEEN '04/05/2011 15:10:00' AND '04/05/2011 16:00:00'
UNION ALL
SELECT T2.MonitorDateTime, '', T2.loc4
FROM Table2 T2
LEFT OUTER JOIN Table1 T1 ON T1.MonitorDateTime = T2.MonitorDateTime
WHERE T1.MonitorDateTime IS NULL AND T2.MonitorDateTime BETWEEN '04/05/2011 15:10:00' AND '04/05/2011 16:00:00'

What's the best way to join on the same table twice?

This is a little complicated, but I have 2 tables. Let's say the structure is something like this:
*Table1*
ID
PhoneNumber1
PhoneNumber2
*Table2*
PhoneNumber
SomeOtherField
The tables can be joined based on Table1.PhoneNumber1 -> Table2.PhoneNumber, or Table1.PhoneNumber2 -> Table2.PhoneNumber.
Now, I want to get a resultset that contains PhoneNumber1, SomeOtherField that corresponds to PhoneNumber1, PhoneNumber2, and SomeOtherField that corresponds to PhoneNumber2.
I thought of 2 ways to do this - either by joining on the table twice, or by joining once with an OR in the ON clause.
Method 1:
SELECT t1.PhoneNumber1, t1.PhoneNumber2,
t2.SomeOtherFieldForPhone1, t3.someOtherFieldForPhone2
FROM Table1 t1
INNER JOIN Table2 t2
ON t2.PhoneNumber = t1.PhoneNumber1
INNER JOIN Table2 t3
ON t3.PhoneNumber = t1.PhoneNumber2
This seems to work.
Method 2:
To somehow have a query that looks a bit like this -
SELECT ...
FROM Table1
INNER JOIN Table2
ON Table1.PhoneNumber1 = Table2.PhoneNumber OR
Table1.PhoneNumber2 = Table2.PhoneNumber
I haven't gotten this to work yet and I'm not sure if there's a way to do it.
What's the best way to accomplish this? Neither way seems simple or intuitive... Is there a more straightforward way to do this? How is this requirement generally implemented?
First, I would try and refactor these tables to get away from using phone numbers as natural keys. I am not a fan of natural keys and this is a great example why. Natural keys, especially things like phone numbers, can change and frequently so. Updating your database when that change happens will be a HUGE, error-prone headache. *
Method 1 as you describe it is your best bet though. It looks a bit terse due to the naming scheme and the short aliases but... aliasing is your friend when it comes to joining the same table multiple times or using subqueries etc.
I would just clean things up a bit:
SELECT t.PhoneNumber1, t.PhoneNumber2,
t1.SomeOtherFieldForPhone1, t2.someOtherFieldForPhone2
FROM Table1 t
JOIN Table2 t1 ON t1.PhoneNumber = t.PhoneNumber1
JOIN Table2 t2 ON t2.PhoneNumber = t.PhoneNumber2
What i did:
No need to specify INNER - it's implied by the fact that you don't specify LEFT or RIGHT
Don't n-suffix your primary lookup table
N-Suffix the table aliases that you will use multiple times to make it obvious
*One way DBAs avoid the headaches of updating natural keys is to not specify primary keys and foreign key constraints which further compounds the issues with poor db design. I've actually seen this more often than not.
The first is good unless either Phone1 or (more likely) phone2 can be null. In that case you want to use a Left join instead of an inner join.
It is usually a bad sign when you have a table with two phone number fields. Usually this means your database design is flawed.
You could use UNION to combine two joins:
SELECT Table1.PhoneNumber1 as PhoneNumber, Table2.SomeOtherField as OtherField
FROM Table1
JOIN Table2
ON Table1.PhoneNumber1 = Table2.PhoneNumber
UNION
SELECT Table1.PhoneNumber2 as PhoneNumber, Table2.SomeOtherField as OtherField
FROM Table1
JOIN Table2
ON Table1.PhoneNumber2 = Table2.PhoneNumber
The first method is the proper approach and will do what you need. However, with the inner joins, you will only select rows from Table1 if both phone numbers exist in Table2. You may want to do a LEFT JOIN so that all rows from Table1 are selected. If the phone numbers don't match, then the SomeOtherFields would be null. If you want to make sure you have at least one matching phone number you could then do WHERE t2.PhoneNumber IS NOT NULL OR t3.PhoneNumber IS NOT NULL
The second method could have a problem: what happens if Table2 has both PhoneNumber1 and PhoneNumber2? Which row will be selected? Depending on your data, foreign keys, etc. this may or may not be a problem.
My problem was to display the record even if no or only one phone number exists (full address book). Therefore I used a LEFT JOIN which takes all records from the left, even if no corresponding exists on the right. For me this works in Microsoft Access SQL (they require the parenthesis!)
SELECT t.PhoneNumber1, t.PhoneNumber2, t.PhoneNumber3
t1.SomeOtherFieldForPhone1, t2.someOtherFieldForPhone2, t3.someOtherFieldForPhone3
FROM
(
(
Table1 AS t LEFT JOIN Table2 AS t3 ON t.PhoneNumber3 = t3.PhoneNumber
)
LEFT JOIN Table2 AS t2 ON t.PhoneNumber2 = t2.PhoneNumber
)
LEFT JOIN Table2 AS t1 ON t.PhoneNumber1 = t1.PhoneNumber;
SELECT
T1.ID
T1.PhoneNumber1,
T1.PhoneNumber2
T2A.SomeOtherField AS "SomeOtherField of PhoneNumber1",
T2B.SomeOtherField AS "SomeOtherField of PhoneNumber2"
FROM
Table1 T1
LEFT JOIN Table2 T2A ON T1.PhoneNumber1 = T2A.PhoneNumber
LEFT JOIN Table2 T2B ON T1.PhoneNumber2 = T2B.PhoneNumber
WHERE
T1.ID = 'FOO';
LEFT JOIN or JOIN also return same result. Tested success with PostgreSQL 13.1.1 .