Compare two tables in SQL Server - sql

I have two tables in SQL Server, and both tables have the same columns. I need to find the differences between these tables.
The tables (all fields are in nvarchar format):
Given the tables' columns, I need to write an SQL query that identifies these conditions:
table1.uf = table2.uf,
table1.municipio = table2.municipio,
table1.marca_modelo = table2.marca_modelo,
table1.ano_fabricacao = table2.ano_fabricacao
table1.qtd_veiculos != table2.qtd_veiculos
and
`Unique lines in table1`
I've already tried the query below, but it doesn't work.
Select *
from Table1 J left join
Table2 M on J.uf = M.uf
and J.municipio = M.municipio
and J.marca_modelo = M.marca_modelo
and J.ano_fabricacao = M.ano_fabricacao
and J.qtd_veiculos != M.qtd_veiculos
Can you help me with this?
I expected to receive the following result:
The fields in red show the differences between the tables, and the last line is a row in Table1 that has no matching row in Table2.
I apologize for the lack of information; this is my first topic here in the forum. Thanks for the help!

Here is a script I use to compare two tables. Perhaps you can modify this for yourself by throwing the results into a temp table for further analysis.
You could change the '*'s to list only those columns you want to compare.
-- This script compares any two tables.
-- Enter the two table names to compare in the first two lines.
DECLARE @table1 NVARCHAR(80) = 'my_first_table_to_compare'
DECLARE @table2 NVARCHAR(80) = 'my_second_table_to_compare'
DECLARE @sql NVARCHAR(2000)
SET @sql = '
SELECT ''' + @table1 + ''' AS table_name,* FROM
(
SELECT * FROM ' + @table1 + '
EXCEPT
SELECT * FROM ' + @table2 + '
) x
UNION
SELECT ''' + @table2 + ''' AS table_name,* FROM
(
SELECT * FROM ' + @table2 + '
EXCEPT
SELECT * FROM ' + @table1 + '
) y
'
EXEC sp_executesql @stmt = @sql

I don't fully understand your question, so this is a [wild] guess. You want to find:
Rows present on both tables but they differ on qtd_veiculos.
Rows on table1 that are not present in table2.
Rows on table2 that are not present in table1.
If this is the question, the query should be:
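select j.*, m.*
from table1 j
full outer join table2 m on j.uf = m.uf
and j.municipio = m.municipio
and j.marca_modelo = m.marca_modelo
and j.ano_fabricacao = m.ano_fabricacao
-- matched rows whose quantity differs
where j.qtd_veiculos <> m.qtd_veiculos
-- rows present only in table2 (assumes uf is never NULL in the data)
or j.uf is null
-- rows present only in table1
or m.uf is null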
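If only the two cases from the question are needed (matched rows whose qtd_veiculos differs, plus rows that exist only in table1), a LEFT JOIN with the filter in the WHERE clause should be enough. This is a minimal sketch, assuming the four key columns identify a row and that uf is never NULL in table2; the alias qtd_veiculos_table2 is only illustrative.

```sql
-- Rows of table1 that either have no partner in table2
-- or have a partner with a different qtd_veiculos.
SELECT j.*,
       m.qtd_veiculos AS qtd_veiculos_table2  -- NULL when table1 row is unmatched
FROM table1 AS j
LEFT JOIN table2 AS m
       ON  j.uf             = m.uf
       AND j.municipio      = m.municipio
       AND j.marca_modelo   = m.marca_modelo
       AND j.ano_fabricacao = m.ano_fabricacao
WHERE m.uf IS NULL                        -- unique line in table1
   OR j.qtd_veiculos <> m.qtd_veiculos;   -- same key, different quantity
```

The original attempt put the qtd_veiculos comparison inside the ON clause, so a LEFT JOIN returned every table1 row; moving the comparison to WHERE is what filters the result.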

Related

INNER JOIN with ON All columns except one Column

I have two tables (Table1 and Table2). Both tables have exactly the same schema, and both may contain duplicate sets of records apart from the IDs, since ID is auto-generated.
I would like to get the common set of records, with the ID taken from Table1. So I query using an inner join, and it works as I expected.
SELECT Table1.ID, Table1.Param1, Table1.Param2, Table1.Param3
INTO #Common
FROM Table1
INNER JOIN Table2 ON Table1.Param1 = Table2.Param1
AND Table1.Param2 = Table2.Param2
AND Table1.Param3 = Table2.Param3
However, in actual usage, the total number of parameters in both tables will be around 100, so the number of comparisons inside the ON clause will grow to about 100.
How can I do an inner join that excludes one column instead of comparing every column in the ON clause?
Removing the ID column from both tables and using INTERSECT is also not possible, since I still want to extract Table1's ID for other purposes.
I can get the rows common to the two tables by removing ID and comparing them.
However, that still does not meet my requirement, since I need Table1's ID for the common data.
SELECT * INTO #TemporaryTable1 FROM Table1
ALTER TABLE #TemporaryTable1 DROP COLUMN ID
SELECT * INTO #TemporaryTable2 FROM Table2
ALTER TABLE #TemporaryTable2 DROP COLUMN ID
SELECT * INTO #Common FROM (SELECT * FROM #TemporaryTable1 INTERSECT SELECT * FROM #TemporaryTable2) data
SELECT * FROM #Common
If I understood your problem correctly, you could generate the query dynamically using the following code:
DECLARE @SQL nvarchar(max) = N'',
@ON nvarchar(max) = N'',
@TBL1 sysname = N'data',
@TBL2 sysname = N'data1',
@EXCLUDEDCOLUMNS nvarchar(100) = N'ID,col1'
-- column selection (each entry is prefixed with ', ')
SELECT @SQL += N', ' + QUOTENAME(@TBL1) + N'.' + QUOTENAME(COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @TBL1
-- ON clause: one equality per column that is not in the exclusion list (each prefixed with ' AND ')
SELECT @ON += N' AND ' + QUOTENAME(@TBL1) + N'.' + QUOTENAME(COLUMN_NAME)
+ N' = ' + QUOTENAME(@TBL2) + N'.' + QUOTENAME(COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @TBL1
AND CHARINDEX(N',' + COLUMN_NAME + N',', N',' + @EXCLUDEDCOLUMNS + N',') = 0
-- assemble the statement, stripping the leading ', ' and ' AND '
SET @SQL = N'SELECT ' + STUFF(@SQL, 1, 2, N'') + N'
FROM ' + QUOTENAME(@TBL1) + N' INNER JOIN
' + QUOTENAME(@TBL2) + N'
ON ' + STUFF(@ON, 1, 5, N'')
--SELECT @SQL
EXEC sp_executesql @SQL
Before you execute, make sure @SQL is generated properly (uncomment the SELECT @SQL line to inspect it). Choose the columns to exclude from the ON clause using the @EXCLUDEDCOLUMNS parameter.

query not retrieving values returned from split string

I generate a comma-separated string and add single quotes around each number.
Here is how I do it:
DECLARE @IDs NVARCHAR(max)
SELECT @IDs = COALESCE(@IDs +',', '') + ''''
+ Cast([mynos] AS NVARCHAR(255)) + ''''
FROM mytable
WHERE id = 22
If I print the variable @IDs, I get the output below:
'78888','3333','1222'
When I use the same variable in this query, the query doesn't return any rows:
SELECT *
FROM table1
WHERE ids IN ( @IDs )
How can I fix this?
It doesn't work because your query is effectively doing this:
SELECT *
FROM table1
WHERE Ids IN ('''78888'',''3333'',''1222''');
Which would also be equivalent to:
SELECT *
FROM table1
WHERE Ids = '''78888'',''3333'',''1222''';
If you want to do the query as you have done, you'll need to split your delimited data out again. As you're using SQL Server 2012, you can't make use of STRING_SPLIT, so you'll need to use a different splitter, such as Jeff Moden's DelimitedSplit8K. Build @IDs without the embedded single quotes (otherwise each split item still carries them), and then you can do:
SELECT *
FROM table1
WHERE IDs IN (SELECT Item
FROM dbo.DelimitedSplit8K (@IDs,','));
However, why are you not simply doing...
SELECT *
FROM table1 T
WHERE EXISTS (SELECT 1
FROM myTable mT
WHERE mT.Id = 22
AND mT.myNos = T.Ids);
You can use a dynamic query; @IDs is a single string variable, not a multi-value argument.
DECLARE @IDs nVARCHAR(MAX)
SELECT @IDs = COALESCE(@IDs +',' ,'') + '''' + CAST([myNos] AS nVARCHAR(255)) + ''''
FROM myTable WHERE Id = 22
DECLARE @query nVARCHAR(MAX)
SET @query = N'Select * from table1 where Ids in (' + @IDs + N')'
EXECUTE sp_executesql @query
I tried the query below and it worked:
select * from table1 where id in (select mynos from mytable where id = 22)
Thanks to @Larnu for giving me the idea.
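For readers on SQL Server 2016 or later (database compatibility level 130 or higher), the built-in STRING_SPLIT mentioned earlier in this thread can take the place of a custom splitter. This is a minimal sketch, assuming the list is built without embedded quotes; the literal assigned to @IDs is only sample data.

```sql
-- Assumption: the list holds bare values, e.g. 78888,3333,1222 (no quotes around each item)
DECLARE @IDs NVARCHAR(MAX) = N'78888,3333,1222';

-- STRING_SPLIT returns one row per item in a column named value
SELECT t.*
FROM table1 AS t
WHERE t.ids IN (SELECT s.value
                FROM STRING_SPLIT(@IDs, N',') AS s);
```

When the values already live in another table, the subquery form the asker settled on avoids building a string at all.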

Select from tables, where the table names are stored in another table

I'm trying to write a query in which I can select data from a series of tables. I want to be able to pull those table names FROM ANOTHER TABLE; I don't want to just write
select * from tableA union select * from tableB etc.
A further restriction that's complicating the issue is that my query MUST start with select.
I've tried to use OPENQUERY within the select statement but the server I'm trying to access is 'not configured for DATA ACCESS.'
You can do something like this:
DECLARE @SQL AS VARCHAR(MAX);
-- Prefix every statement except the first with UNION ALL,
-- so the result does not depend on which row is processed last.
SELECT @SQL = COALESCE(@SQL + ' UNION ALL ', '') +
'SELECT * FROM ' + QUOTENAME(TableName)
FROM TableNames;
EXEC(@SQL);
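On SQL Server 2017 or later, STRING_AGG can build the same statement without relying on variable concatenation inside a SELECT. This is a rough sketch, assuming the same TableNames table with a TableName column as in the answer above, and that every listed table has a compatible column list.

```sql
DECLARE @SQL NVARCHAR(MAX);

-- One 'SELECT * FROM [name]' per row, glued together with UNION ALL.
-- The CAST keeps the aggregate from being limited to 8000 bytes.
SELECT @SQL = STRING_AGG(
                  CAST(N'SELECT * FROM ' + QUOTENAME(TableName) AS NVARCHAR(MAX)),
                  N' UNION ALL ')
FROM TableNames;

EXEC sys.sp_executesql @SQL;
```

QUOTENAME is used so that an unusual table name stored in TableNames cannot break the generated statement.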

Call SQL Functions after PIVOT

I have a stored procedure that's taking a very long time because I have 2 function calls that are being called before a PIVOT, which means it's calling the functions 5 times for each record rather than once for each record. How can I rewrite my query so that the 2 function calls right at the end of the query are run after the PIVOT rather than before?
Here's the query
CREATE TABLE #Temp
(
ServiceRecordID INT,
LocationStd VARCHAR(1000),
AreaServedStd VARCHAR(1000),
RegionalLimited BIT,
Region VARCHAR(255),
Visible BIT
)
DECLARE @RegionCount INT
SELECT @RegionCount = COUNT(RegionID) FROM Regions WHERE SiteID = @SiteID AND RegionID % 100 != 0
INSERT INTO #Temp
SELECT TOP (@RegionCount * 100) SR.ServiceRecordID, SR.LocationStd, SR.AreaServedStd, SR.RegionalLimited, R.Region,
CASE WHEN (ISNULL(R_SR.RegionID,0) = 0 AND ISNULL(R_SR_Serv.RegionID,0) = 0) THEN 0 ELSE 1 END AS Visible
FROM ServiceRecord SR
INNER JOIN Sites S ON SR.SiteID = S.SiteID
INNER JOIN Regions R ON R.SiteID = S.SiteID
LEFT OUTER JOIN lkup_Region_ServiceRecord R_SR ON R_SR.RegionID = R.RegionID AND R_SR.ServiceRecordID = SR.ServiceRecordID
LEFT OUTER JOIN lkup_Region_ServiceRecord_Serv R_SR_Serv ON R_SR_Serv.RegionID = R.RegionID AND R_SR_Serv.ServiceRecordID = SR.ServiceRecordID AND SR.RegionalLimited = 0
WHERE SR.SiteID = @SiteID
AND R.RegionID % 100 != 0
ORDER BY SR.ServiceRecordID
DECLARE @RegionList varchar(2000),@SQL varchar(max)
SELECT @RegionList = STUFF((SELECT DISTINCT ',[' + Region + ']' FROM #Temp ORDER BY ',[' + Region + ']' FOR XML PATH('')),1,1,'')
SET @SQL='SELECT * FROM
(SELECT ServiceRecordID,
dbo.fn_ServiceRecordGetServiceName(ServiceRecordID,'''') AS ServiceName,
LocationStd,
AreaServedStd,
RegionalLimited,
Region As Region,
dbo.fn_GetOtherRegionalSitesForServiceRecord(ServiceRecordID) AS OtherSites,
CAST(Visible AS INT) AS Visible FROM #Temp) B PIVOT(MAX(Visible) FOR Region IN (' + @RegionList + ')) A'
EXEC(@SQL)
Move the function calls after the PIVOT:
SET @SQL='
SELECT
A.*,
N.ServiceName,
S.OtherSites
FROM
(
SELECT
ServiceRecordID,
LocationStd,
AreaServedStd,
RegionalLimited,
Region,
CAST(Visible AS INT) AS Visible
FROM #Temp
) B
PIVOT(MAX(Visible) FOR Region IN (' + @RegionList + ')) A
OUTER APPLY (
SELECT dbo.fn_ServiceRecordGetServiceName(A.ServiceRecordID,'''')
) N (ServiceName)
OUTER APPLY (
SELECT dbo.fn_GetOtherRegionalSitesForServiceRecord(A.ServiceRecordID)
) S (OtherSites);
';
Or just put them in the outer SELECT:
SET @SQL='
SELECT
A.*,
ServiceName = dbo.fn_ServiceRecordGetServiceName(A.ServiceRecordID,''''),
OtherSites = dbo.fn_GetOtherRegionalSitesForServiceRecord(A.ServiceRecordID)
FROM
(
SELECT
ServiceRecordID,
LocationStd,
AreaServedStd,
RegionalLimited,
Region,
CAST(Visible AS INT) AS Visible
FROM #Temp
) B
PIVOT(MAX(Visible) FOR Region IN (' + @RegionList + ')) A
';
If you can convert those functions to inline table-valued functions consisting of a single SELECT statement, you may get a huge performance improvement as well.
CREATE FUNCTION dbo.fn_ServiceRecordGetServiceName2(
@ServiceRecordID int
)
RETURNS TABLE
AS
RETURN ( -- single select statement
SELECT ServiceName = Blah
FROM dbo.Gorp
WHERE Gunk = 'Ralph'
);
Then
OUTER APPLY dbo.fn_ServiceRecordGetServiceName2(A.ServiceRecordID) N
And N.ServiceName will return the value(s).
Also, it is not correct to tack on square brackets to convert data values to valid sysnames. You should use the QuoteName function. This will ensure your system doesn't break no matter WHAT crazy value is entered 13 years from now (think 'Taiwan [North]'):
STUFF((SELECT DISTINCT ',' + QuoteName(Region) FROM #Temp ...
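To see why QUOTENAME is safer than hand-added brackets, here is a quick check using the example value from the paragraph above (the column alias quoted_name is only illustrative):

```sql
-- QUOTENAME wraps the value in square brackets and doubles any
-- closing bracket inside it, so the result is still a valid identifier.
SELECT QUOTENAME(N'Taiwan [North]') AS quoted_name;
-- returns: [Taiwan [North]]]
```

Concatenating '[' + Region + ']' by hand would produce [Taiwan [North]], which ends the identifier early and breaks the PIVOT column list.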
Note:
Since you said that this is for display in a web page, you don't even need to do the pivoting on the server. Instead, return 2 rowsets to the client, one with the Site data and one with the column data for the Regions. You would need an additional pass through every row in the Region rowset to find out all the regions needed, but this can be done very quickly. Finally, adjust your program code to step through the Region rows as needed for each matching Site, and create your output.
One reason this is worth the investment is that if your application grows in size, you can always throw another web server at the problem, but it's a lot harder to throw another database at it. A new web server will cost less than continually beefing up your SQL Server.
P.S. Even dynamic SQL is easier to deal with when you format it well. :)

Generic code to determine how many rows from a table are in a different table with matching structure?

How can I create a generic function in C# (LINQ-to-SQL) or SQL that takes two tables with matching structure and counts how many rows in TableA are also in TableB?
TableA Structure
Value1, Value2
TableA Data
1,1
TableB Structure
Value1, Value2
TableB Data
1,1,
1,2
To get count of matching rows between TableA and TableB:
SELECT COUNT(*) FROM TableA
INNER JOIN TableB ON
TableA.Value1 = TableB.Value1 AND
TableA.Value2 = TableB.Value2
The result in this example
1
So the query above works great, but I don't want to have to write a version of it for every pair of tables I want to do this for since the INNER JOIN is on every field. I feel like there should be a more generic way to do this instead having to manually type out all of the join conditions. Thoughts?
Edit: Actually, I think I'll need a C#/LINQ answer since this will be done across servers. Again, this is annoying because I have to write the code to compare each field manually.
var tableARows = dbA.TableA.ToList();
var tableBRows = dbB.TableB.ToList();
var match = 0;
foreach (var tableARow in tableARows) {
// count the row if any TableB row has the same values in every field
if (tableBRows.Any(b => b.Value1 == tableARow.Value1 && b.Value2 == tableARow.Value2)) {
match++;
}
}
return match;
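Using SQL Server, this will work. Note that EXCEPT counts the rows of product that have no identical row in product1; use INTERSECT instead to count the rows the two tables share.
var sql = @"select count(0) from (
select * from product except select * from product1
) as aa";
var dc = dataContext;
// ExecuteStoreQuery returns a result set, so take its single row
var match = dc.ExecuteStoreQuery<int>(sql).First();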
You could generate the join using syscolumns.
declare @tablenameA varchar(50) = 'tableA'
declare @tablenameB varchar(50) = 'tableB'
declare @sql nvarchar(4000)
select @sql =''
select @sql = @sql + ' and ' + quotename(@tablenameA )+ '.'
+ c.name +' = ' + quotename(@tablenameB )+ '.' + c.name
from syscolumns c
inner join sysobjects o on c.id = o.id
where o.name = @tablenameA
select @sql = 'select count(*) from ' + @tablenameA + ' inner join '+@tablenameB+' on '
+ substring (@sql, 5, len(@sql))
exec sp_executesql @sql
You query the ANSI INFORMATION_SCHEMA views, thus:
select *
from INFORMATION_SCHEMA.COLUMNS col
where col.TABLE_SCHEMA = <your-schema-name-here>
and col.TABLE_NAME = <your-table-name-here>
order by col.ORDINAL_POSITION
against each of the tables involved. The result set will provide everything needed for you to construct the required query on the fly.
One fairly simple answer:
select ( select count(*) from
(select * from TableA UNION ALL select * from TableB) a ) -
( select count(*) from
(select * from TableA UNION select * from TableB) d ) duplicates
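Another way to avoid typing out the join conditions is INTERSECT. This is a minimal sketch, assuming both tables expose the same columns in the same order; the aliases matching_rows and m are only illustrative.

```sql
-- INTERSECT compares whole rows, so no per-column join conditions are needed.
-- It returns DISTINCT rows and treats NULLs in the same column as equal.
SELECT COUNT(*) AS matching_rows
FROM (
    SELECT * FROM TableA
    INTERSECT
    SELECT * FROM TableB
) AS m;
```

Because the result is distinct, a row that appears several times in TableA is counted once, which differs from the INNER JOIN count when the tables contain duplicates.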