SQL Server: joining 2 tables where each row has a unique PKEY value

I am attempting to join 2 tables that have the same number of columns, where each column has the same name in both tables. Tb1 is an MS Access table that I have imported into SQL Server. Tb2 is a table that is updated from tb1 quarterly and is used to generate reports.
I have gone into design view and ensured that all column datatypes are the same and have the same names. Likewise, every row in each table is assigned a unique integer value in a column named PKEY.
What I would like to do is add all new entries present in tb1 (the MS Access table) to the existing tb2. I believe this can be done by writing a query that selects every row in tb1 whose PKEY is not found in tb2 (that is, rows that exist only in the Access table) and then appending those rows to tb2.
I'm not really sure where to start when writing this query. I tried something like:
SELECT *
FROM tb1
WHERE PKEY <> Tb2.PKEY
Any help would be greatly appreciated. Thanks!

I would recommend not exists:
select tb1.*
from tb1
where not exists (select 1 from tb2 where tb2.pkey = tb1.pkey);
You can put an insert before this to insert the rows into the second table.
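To make the combined statement concrete, here is a minimal sketch that runs the INSERT plus NOT EXISTS pattern against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for SQL Server only so the example is self-contained. The name column and the sample rows are assumptions; only tb1, tb2, and PKEY come from the question.

```python
import sqlite3

# Hypothetical miniature versions of tb1 and tb2; only PKEY comes from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb1 (PKEY INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tb2 (PKEY INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO tb1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO tb2 VALUES (1, 'a'), (2, 'b');
""")

# Append only the tb1 rows whose PKEY has no match in tb2.
conn.execute("""
    INSERT INTO tb2 (PKEY, name)
    SELECT tb1.PKEY, tb1.name
    FROM tb1
    WHERE NOT EXISTS (SELECT 1 FROM tb2 WHERE tb2.PKEY = tb1.PKEY)
""")

tb2_keys = [row[0] for row in conn.execute("SELECT PKEY FROM tb2 ORDER BY PKEY")]
print(tb2_keys)
```

Listing the columns explicitly in the INSERT is a design choice here: it avoids relying on both tables having the same column order.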

INSERT INTO tb2
SELECT * FROM tb1
WHERE tb1.PKEY NOT IN (SELECT PKEY FROM tb2)
The script above inserts into tb2 the rows returned by the outer select query.
That select query only returns rows whose PKEY is not listed by the subquery.
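One caveat worth knowing when choosing between NOT IN and NOT EXISTS: if the subquery can return a NULL, NOT IN returns no rows at all, while NOT EXISTS still behaves as expected. The sketch below shows this with SQLite through Python's sqlite3 module. The src and dst tables are hypothetical and deliberately allow NULLs; a key column that is always populated, as PKEY is described in the question, should not hit this.

```python
import sqlite3

# Hypothetical tables with a nullable id column, used only to show the NULL pitfall.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER);
    CREATE TABLE dst (id INTEGER);
    INSERT INTO src VALUES (1), (2), (3);
    INSERT INTO dst VALUES (1), (NULL);
""")

# With a NULL in the subquery, "id NOT IN (...)" is never true.
not_in_rows = conn.execute(
    "SELECT id FROM src WHERE id NOT IN (SELECT id FROM dst) ORDER BY id"
).fetchall()

# NOT EXISTS compares row by row, so the NULL in dst simply never matches.
not_exists_rows = conn.execute(
    "SELECT id FROM src "
    "WHERE NOT EXISTS (SELECT 1 FROM dst WHERE dst.id = src.id) "
    "ORDER BY id"
).fetchall()

print(not_in_rows)
print(not_exists_rows)
```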

Related

SQL Query for two tables which only reviews non null values from one table to export matching values

I have created an application that fills one SQL table with over 2 million records.
I have also created another WinForms application that allows users to enter search criteria which then creates another SQL table on demand.
Table A is the table that has over 2 million records
Table B is the user created table.
Both tables have the same number of fields.
In short, I want to return only the records that match in both tables. The catch is that I want to ignore null values in the user-created table. For example:
The user enters criteria in fields 1 and 2. I want to match any record that meets those criteria even though the other fields are blank.
What I'm running into right now is that the records don't match, because the user-created table has a lot of null values whereas Table A does not.
I've tried a few different things.
SELECT * FROM TableA
UNION
SELECT * FROM TableB
EXCEPT
SELECT * FROM TableA
INTERSECT
SELECT * FROM TableB;
select T1.* from T1 cross join T2
where (T1.column1 = T2.column1 or T2.column1 is null)
and (T1.column2 = T2.column2 or T2.column2 is null) ...
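The cross join with "or ... is null" conditions treats every NULL in the criteria table as a wildcard. Here is a small sketch of that idea, run on SQLite through Python's sqlite3 module. The two-column layout and the sample values are assumptions made for illustration; a real table would repeat the same condition for each remaining column.

```python
import sqlite3

# Hypothetical data: T1 is the large table, T2 holds one row of user criteria.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (column1 TEXT, column2 TEXT);
    CREATE TABLE T2 (column1 TEXT, column2 TEXT);
    INSERT INTO T1 VALUES ('red', 'small'), ('red', 'large'), ('blue', 'small');
    INSERT INTO T2 VALUES ('red', NULL);  -- column2 left blank by the user
""")

# A NULL criterion matches anything; a non-NULL criterion must match exactly.
matches = conn.execute("""
    SELECT T1.column1, T1.column2
    FROM T1 CROSS JOIN T2
    WHERE (T1.column1 = T2.column1 OR T2.column1 IS NULL)
      AND (T1.column2 = T2.column2 OR T2.column2 IS NULL)
    ORDER BY T1.column2
""").fetchall()

print(matches)
```

Note that a criteria row made up entirely of NULLs would match every row of T1, which may or may not be what the application wants.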

Create new table by merging two existing tables based on matching field

I am attempting to create a new table using columns from two existing tables and it's not behaving the way I expected.
Table A has 91255063 records and table B has 2372294 records. Both tables have a common field named link_id. Link_id is not unique in either table and will not always exist in table B.
The end result I am looking for is a new table with 91255063 records: essentially all of Table A, plus any additional data from Table B for the records with matching link_ids. I had thought an outer join would accomplish this as follows:
use database1
SELECT a.*
,b.[AdditionalData1]
,b.[AdditionalData2]
,b.[AdditionalData3]
into dbo.COMBINEDTABLE
FROM Table1 a
left outer join Table2 b
ON a.LINK_ID = b.LINK_ID
This seems to work when looking at the resulting data; however, the newly created table COMBINEDTABLE has 98011015 rows. Am I not using the correct join method here?
Most likely you have duplicate LINK_IDs on the right, so for quite a few rows from Table1 there are multiple rows from Table2. You could try using DISTINCT in your SELECT, or specify that you want only the record with the smallest or highest identifier column value (if you have one).
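The row inflation can be reproduced on a tiny scale. The sketch below uses SQLite through Python's sqlite3 module; the payload column and the sample rows are assumptions, and MIN is used only as one arbitrary way of keeping a single Table2 row per LINK_ID. It first lists the duplicated keys, then compares the row count of the plain left join with a join against a de-duplicated subquery.

```python
import sqlite3

# Hypothetical miniature tables; LINK_ID 1 appears twice in Table2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (LINK_ID INTEGER, payload TEXT);
    CREATE TABLE Table2 (LINK_ID INTEGER, AdditionalData1 TEXT);
    INSERT INTO Table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO Table2 VALUES (1, 'x'), (1, 'y'), (2, 'z');
""")

# Diagnostic: which LINK_IDs occur more than once on the right side?
dupes = conn.execute("""
    SELECT LINK_ID, COUNT(*) FROM Table2
    GROUP BY LINK_ID HAVING COUNT(*) > 1
""").fetchall()

# Plain left join: each Table1 row is repeated once per matching Table2 row.
plain_count = conn.execute("""
    SELECT COUNT(*) FROM Table1 a
    LEFT OUTER JOIN Table2 b ON a.LINK_ID = b.LINK_ID
""").fetchone()[0]

# Collapse Table2 to one row per LINK_ID first, then join.
deduped_count = conn.execute("""
    SELECT COUNT(*) FROM Table1 a
    LEFT OUTER JOIN (
        SELECT LINK_ID, MIN(AdditionalData1) AS AdditionalData1
        FROM Table2
        GROUP BY LINK_ID
    ) b ON a.LINK_ID = b.LINK_ID
""").fetchone()[0]

print(dupes, plain_count, deduped_count)
```

Because the right side has at most one row per LINK_ID after grouping, the joined result keeps exactly the row count of Table1, even if LINK_ID repeats in Table1 itself.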

SQL Query CREATE TABLE on multiple conditions

I am trying to deduplicate a large table where values are present but broken into several rows.
For example:
Table 1: Client_Code,Account#, First and last names, address.
Table 2: Client_Code,Account#, First and last names, address, TAX_ID.
Now what I want to do may seem pretty obvious at this point.
I want my results to pull from Table 1 into a new table and the query to be "Select From Table 1 where client code and account# from table 1 match client code and account# from table 2." Table 2 has all values populated; Table 1 has everything except TAX_ID.
The code I tried looked like this:
CREATE TABLE Dedupe_1 AS SELECT * FROM `TABLE 1`
WHERE `TABLE 1`.`Client_Code`=`TABLE 2`.`Client_Code`
AND
WHERE `TABLE 1`.`account#`=`TABLE 2`.`account#`
ORDER BY `TABLE 2`.`account#`
I keep getting a syntax error. I am very new to this programming language so I apologize if this question is hard to understand.
I was just under the impression that I could call to a field from another table by simply using the 'WHERE' statement.
I think you want to use an exists clause:
CREATE TABLE Dedupe_1 AS
SELECT *
FROM `TABLE 1` t1
WHERE EXISTS (SELECT 1
              FROM `TABLE 2` t2
              WHERE t2.Client_Code = t1.Client_Code AND t2.`account#` = t1.`account#`
             );
You may want to use a JOIN to connect the two tables, joining on a column that both tables have in common. The general syntax looks like this:
SELECT table1.column1, table2.column2 -- plus any other columns you need
FROM table1
INNER JOIN table2
ON table1.common_column = table2.common_column;
You may learn more about joins here.
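The EXISTS and INNER JOIN suggestions are not interchangeable when the second table contains repeated keys. The following sketch, run on SQLite through Python's sqlite3 module, shows the difference. The name column and the sample rows are assumptions, and the identifiers are double-quoted because that is SQLite's standard quoting style, where MySQL would use backticks.

```python
import sqlite3

# Hypothetical data: the C1/A1 pair appears twice in "TABLE 2".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "TABLE 1" (Client_Code TEXT, "account#" TEXT, name TEXT);
    CREATE TABLE "TABLE 2" (Client_Code TEXT, "account#" TEXT, name TEXT, TAX_ID TEXT);
    INSERT INTO "TABLE 1" VALUES ('C1', 'A1', 'Ann'), ('C2', 'A2', 'Bob');
    INSERT INTO "TABLE 2" VALUES ('C1', 'A1', 'Ann', 'T-1'), ('C1', 'A1', 'Ann', 'T-1');
""")

# EXISTS only asks whether a match exists, so each "TABLE 1" row appears at most once.
exists_rows = conn.execute("""
    SELECT t1.Client_Code, t1."account#"
    FROM "TABLE 1" t1
    WHERE EXISTS (
        SELECT 1 FROM "TABLE 2" t2
        WHERE t2.Client_Code = t1.Client_Code
          AND t2."account#" = t1."account#"
    )
""").fetchall()

# INNER JOIN returns one row per matching pair, so duplicates on the right multiply rows.
join_rows = conn.execute("""
    SELECT t1.Client_Code, t1."account#"
    FROM "TABLE 1" t1
    INNER JOIN "TABLE 2" t2
        ON t2.Client_Code = t1.Client_Code
       AND t2."account#" = t1."account#"
""").fetchall()

print(exists_rows)
print(join_rows)
```

For a deduplication task, the EXISTS form is usually the safer starting point, since it cannot introduce new duplicates.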

How to copy lookup_id from Table1 to Table2 with INSERT INTO SELECT

I am a student learning SQL Server and using the management studio to normalize a db which started as a single table.
I now have Table1 with 80,000 rows containing ID, CategoryDescription, etc., with many repeated CategoryDescriptions.
Table2 has a list of all of the CategoryDescriptions and a DescriptionID column which was created using SELECT DISTINCT. It has about 100 rows.
I want to copy the DescriptionID values from Table2 into Table1 so that I can delete the large CategoryDescription column and replace it with a link to the lookup table.
The following generates the expected data (a single column of 80,000 ids):
SELECT TEST.dbo.LU_ConNames.Con_ID
FROM TEST.dbo.LU_ConNames
JOIN TEST.dbo.MainTable
ON TEST.dbo.MainTable.CONCESSION = TEST.dbo.LU_ConNames.Con_Name
However, when I add the INSERT INTO...
INSERT INTO TEST.dbo.MainTable
SELECT TEST.dbo.LU_ConNames.Con_ID
FROM TEST.dbo.LU_ConNames
JOIN TEST.dbo.MainTable
ON TEST.dbo.MainTable.CONCESSION = TEST.dbo.LU_ConNames.Con_Name
I get "Column name or number of supplied values does not match table definition." To clarify, there is no column in MainTable called Con_ID. I thought that perhaps that was the problem, but when I added one (and verified the same data type) I get the same error.
You should not be inserting new records, since what you want is to update existing ones.
You can do:
UPDATE T
SET T.Con_ID = C.Con_ID
FROM TEST.dbo.MainTable T
INNER JOIN TEST.dbo.LU_ConNames C
ON T.CONCESSION = C.Con_Name
Some reading on that topic on MSDN: UPDATE syntax
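To see why an UPDATE fits better than an INSERT here, the sketch below fills the new Con_ID column in place and checks that the row count is unchanged. It runs on SQLite through Python's sqlite3 module and uses a correlated subquery instead of the T-SQL UPDATE ... FROM form, purely so the example runs on any SQLite version. The ID column and the sample rows are assumptions.

```python
import sqlite3

# Hypothetical miniature versions of the lookup table and the main table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LU_ConNames (Con_ID INTEGER PRIMARY KEY, Con_Name TEXT);
    CREATE TABLE MainTable (ID INTEGER PRIMARY KEY, CONCESSION TEXT, Con_ID INTEGER);
    INSERT INTO LU_ConNames VALUES (10, 'North'), (20, 'South');
    INSERT INTO MainTable (ID, CONCESSION) VALUES (1, 'North'), (2, 'South'), (3, 'North');
""")

# Fill Con_ID on the existing rows by looking up each CONCESSION name.
conn.execute("""
    UPDATE MainTable
    SET Con_ID = (
        SELECT C.Con_ID
        FROM LU_ConNames C
        WHERE C.Con_Name = MainTable.CONCESSION
    )
""")

rows = conn.execute("SELECT ID, Con_ID FROM MainTable ORDER BY ID").fetchall()
row_count = conn.execute("SELECT COUNT(*) FROM MainTable").fetchone()[0]

print(rows, row_count)
```

An INSERT would instead have appended new rows that contain only a Con_ID, which is not what the normalization step needs.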

How to auto increment a value in one table when inserted a row in another table

I currently have two tables:
Table 1 has a unique ID and a count.
Table 2 has some data columns and one column where the value of the unique ID of Table 1 is inside.
When I insert a row of data in Table 2, the count for the row with the referenced unique id in Table 1 should be incremented.
Hope I made myself clear. I am very new to PostgreSQL and SQL in general, so I would appreciate any help on how to do that. =)
You could achieve that with triggers.
Be sure to cover all kinds of write access appropriately if you do: INSERT, UPDATE, DELETE.
Also be aware that TRUNCATE on Table 2 or manual edits in Table 1 could break data integrity.
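As a rough illustration of the trigger approach, here is a sketch using SQLite triggers through Python's sqlite3 module. This is not PostgreSQL syntax: PostgreSQL needs a separate trigger function that the trigger then executes, so treat this only as a demonstration of the idea. The column names ct, tbl2_id, and data are assumptions, and only INSERT and DELETE are covered; an UPDATE that changes tbl1_id would need a third trigger.

```python
import sqlite3

# Hypothetical schema: tbl1 holds the counter, tbl2 references it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (tbl1_id INTEGER PRIMARY KEY, ct INTEGER NOT NULL DEFAULT 0);
    CREATE TABLE tbl2 (
        tbl2_id INTEGER PRIMARY KEY,
        tbl1_id INTEGER REFERENCES tbl1,
        data TEXT
    );

    -- Increment the counter for every row inserted into tbl2.
    CREATE TRIGGER tbl2_ins AFTER INSERT ON tbl2
    BEGIN
        UPDATE tbl1 SET ct = ct + 1 WHERE tbl1_id = NEW.tbl1_id;
    END;

    -- Decrement the counter for every row deleted from tbl2.
    CREATE TRIGGER tbl2_del AFTER DELETE ON tbl2
    BEGIN
        UPDATE tbl1 SET ct = ct - 1 WHERE tbl1_id = OLD.tbl1_id;
    END;

    INSERT INTO tbl1 (tbl1_id) VALUES (1), (2);
    INSERT INTO tbl2 (tbl1_id, data) VALUES (1, 'a'), (1, 'b'), (2, 'c');
    DELETE FROM tbl2 WHERE data = 'b';
""")

counts = conn.execute("SELECT tbl1_id, ct FROM tbl1 ORDER BY tbl1_id").fetchall()
print(counts)
```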
I suggest you consider a VIEW instead to return aggregated results that are automatically up to date. Like:
CREATE VIEW tbl1_plus_ct AS
SELECT t1.*, t2.ct
FROM tbl1 t1
LEFT JOIN (
SELECT tbl1_id, count(*) AS ct
FROM tbl2
GROUP BY 1
) t2 USING (tbl1_id)
If you use a LEFT JOIN, all rows of tbl1 are included, even if there is no reference in tbl2. With a regular JOIN, those rows would be omitted from the VIEW.
When reading all or most of the table, it is fastest to aggregate tbl2 first in a subquery and then join to tbl1, as demonstrated above.
Instead of creating a view, you could also just use the query directly. If you only fetch a single row, or only a few, this alternative form would perform better:
SELECT t1.*, count(t2.tbl1_id) AS ct
FROM tbl1 t1
LEFT JOIN tbl2 t2 USING (tbl1_id)
WHERE t1.tbl1_id = 123 -- for example
GROUP BY t1.tbl1_id -- being the primary key of tbl1!
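One difference between the two forms is easy to miss: for a tbl1 row with no references in tbl2, the view returns NULL for ct, while the count(t2.tbl1_id) query returns 0. The sketch below checks this on SQLite through Python's sqlite3 module rather than PostgreSQL; the label and data columns and the sample rows are assumptions.

```python
import sqlite3

# Hypothetical data: tbl1 row 2 has no referencing rows in tbl2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (tbl1_id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE tbl2 (tbl1_id INTEGER REFERENCES tbl1, data TEXT);
    INSERT INTO tbl1 VALUES (1, 'one'), (2, 'two');
    INSERT INTO tbl2 VALUES (1, 'a'), (1, 'b');

    CREATE VIEW tbl1_plus_ct AS
    SELECT t1.*, t2.ct
    FROM tbl1 t1
    LEFT JOIN (
        SELECT tbl1_id, count(*) AS ct
        FROM tbl2
        GROUP BY 1
    ) t2 USING (tbl1_id);
""")

# Aggregate-first view: the unreferenced row gets NULL (None in Python).
view_rows = conn.execute(
    "SELECT tbl1_id, ct FROM tbl1_plus_ct ORDER BY tbl1_id"
).fetchall()

# Join-then-count form for a single row: count() of no matches is 0.
single_row = conn.execute("""
    SELECT t1.tbl1_id, count(t2.tbl1_id) AS ct
    FROM tbl1 t1
    LEFT JOIN tbl2 t2 USING (tbl1_id)
    WHERE t1.tbl1_id = 2
    GROUP BY t1.tbl1_id
""").fetchone()

print(view_rows)
print(single_row)
```

If a 0 is preferred in the view as well, wrapping the column in COALESCE(t2.ct, 0) is one common way to get it.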