Compare 2 different tables columns from 2 different databases - sql

I have a requirement to compare different tables' columns from 2 different databases, in order to add columns to the master tables based on the requirement.
For example:
Assume I have created this table in the master database:
create table test(id int,name varchar(10))
Assume I have created this table in the test database:
create table testings(id int,name varchar(20), sal int)
Now I have to compare the columns of the two tables.
I don't want to use red-gate tools.
Can anyone help me?

Is it just red-gate tools you don’t want to use, or basically any third-party tool? Why not? Even if you don’t have the budget to buy one, you can still use it in trial mode to get the job done.
We’ve been using Apex Diff tool but there are many more out there.
With so many tools available you can probably run all one by one in trial mode for months…
Knowing system tables and how to do this natively is great but it’s just too time consuming...

You can use the EXCEPT or INTERSECT set operators for this. Like so:
SELECT id, name FROM master.dbo.test
EXCEPT -- or INTERSECT
SELECT id, name FROM test.dbo.testings
This will give you:
EXCEPT: returns any distinct values from the left query that are not also found on the right query.
INTERSECT: returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operand.
In your case, since you want to select from two different databases, you have to use fully qualified table names. They have to be in the form database.schema.object_name.
Update: If you want to compare the two tables' column names, not the data itself, you have to work with the metadata tables and compare the column names the same way with EXCEPT.
For instance, suppose you have two databases.
The test database contains the table:
create table test(id int, name varchar(10), dep varchar(50));
and the anotherdatabase database contains the table:
create table testings(id int,name varchar(20), sal int);
And you want to compare the two tables' columns and get the columns that don't exist in the other table; in our example you need to get sal and dep.
Then you can do this:
SELECT ColumnName
FROM
(
SELECT c.name "ColumnName"
FROM test.sys.tables t
INNER JOIN test.sys.all_columns c
ON t.object_id = c.object_id
INNER JOIN test.sys.types ty
ON c.system_type_id = ty.system_type_id
WHERE t.name = 'test'
EXCEPT
SELECT c.name
FROM anotherdatabase.sys.tables t
INNER JOIN anotherdatabase.sys.all_columns c
ON t.object_id = c.object_id
INNER JOIN anotherdatabase.sys.types ty
ON c.system_type_id = ty.system_type_id
WHERE t.name = 'testings'
) t1
UNION ALL
SELECT ColumnName
FROM
(
SELECT c.name ColumnName
FROM anotherdatabase.sys.tables t
INNER JOIN anotherdatabase.sys.all_columns c
ON t.object_id = c.object_id
INNER JOIN anotherdatabase.sys.types ty
ON c.system_type_id = ty.system_type_id
WHERE t.name = 'testings'
EXCEPT
SELECT c.name
FROM test.sys.tables t
INNER JOIN test.sys.all_columns c
ON t.object_id = c.object_id
INNER JOIN test.sys.types ty
ON c.system_type_id = ty.system_type_id
WHERE t.name = 'test'
) t2;
This should give you dep and sal.
Note that I joined the tables:
databasename.sys.tables, and
databasename.sys.all_columns
with the table:
databasename.sys.types
so that each column's data type is available. As written, though, the query selects only c.name, so columns are compared by name alone: two columns with the same name but different data types are still treated as the same. To treat them as different, add ty.name to the SELECT lists on both sides of each EXCEPT.
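A minimal sketch of a comparison that also takes the data type and length into account, assuming both databases are on the same instance and reusing the test and anotherdatabase names from the example above. It joins sys.types on user_type_id rather than system_type_id, because several types can share one system_type_id (sysname and nvarchar, for example). If the two databases use different collations, a COLLATE clause on the name columns may be needed.

```sql
-- Sketch only: database and table names are taken from the example above.
WITH left_cols AS (
    SELECT c.name AS ColumnName, ty.name AS TypeName, c.max_length AS MaxLength
    FROM test.sys.tables AS t
    INNER JOIN test.sys.columns AS c ON c.object_id = t.object_id
    INNER JOIN test.sys.types AS ty ON ty.user_type_id = c.user_type_id
    WHERE t.name = 'test'
),
right_cols AS (
    SELECT c.name AS ColumnName, ty.name AS TypeName, c.max_length AS MaxLength
    FROM anotherdatabase.sys.tables AS t
    INNER JOIN anotherdatabase.sys.columns AS c ON c.object_id = t.object_id
    INNER JOIN anotherdatabase.sys.types AS ty ON ty.user_type_id = c.user_type_id
    WHERE t.name = 'testings'
)
-- Columns (name, type, length) present only in test
SELECT 'test' AS OnlyIn, ColumnName, TypeName, MaxLength
FROM (SELECT * FROM left_cols EXCEPT SELECT * FROM right_cols) AS l
UNION ALL
-- Columns (name, type, length) present only in testings
SELECT 'testings', ColumnName, TypeName, MaxLength
FROM (SELECT * FROM right_cols EXCEPT SELECT * FROM left_cols) AS r;
```

With the two example tables, name would be reported on both sides as well, because varchar(10) and varchar(20) differ in max_length.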

To compare columns, use the INFORMATION_SCHEMA.COLUMNS view in SQL Server.
This is an example:
select column_name from INFORMATION_SCHEMA.COLUMNS where TABLE_NAME='your_table_name1'
except
select column_name from INFORMATION_SCHEMA.COLUMNS where TABLE_NAME='your_table_name2'
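Because the tables in the question live in two different databases, each INFORMATION_SCHEMA.COLUMNS reference has to be prefixed with its database name. The following is a sketch under that assumption; database1, database2 and the dbo schema are placeholders, not names from the question. It uses a FULL OUTER JOIN so that missing columns and type mismatches show up side by side.

```sql
-- Sketch only: database1, database2 and dbo are placeholder names.
SELECT COALESCE(a.COLUMN_NAME, b.COLUMN_NAME) AS column_name,
       a.DATA_TYPE                AS type_in_db1,
       a.CHARACTER_MAXIMUM_LENGTH AS length_in_db1,
       b.DATA_TYPE                AS type_in_db2,
       b.CHARACTER_MAXIMUM_LENGTH AS length_in_db2
FROM (SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
      FROM database1.INFORMATION_SCHEMA.COLUMNS
      WHERE TABLE_SCHEMA = 'dbo' AND TABLE_NAME = 'your_table_name1') AS a
FULL OUTER JOIN
     (SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
      FROM database2.INFORMATION_SCHEMA.COLUMNS
      WHERE TABLE_SCHEMA = 'dbo' AND TABLE_NAME = 'your_table_name2') AS b
  ON a.COLUMN_NAME = b.COLUMN_NAME
-- Keep only rows where the column is missing on one side or defined differently
WHERE a.COLUMN_NAME IS NULL
   OR b.COLUMN_NAME IS NULL
   OR a.DATA_TYPE <> b.DATA_TYPE
   OR ISNULL(a.CHARACTER_MAXIMUM_LENGTH, 0) <> ISNULL(b.CHARACTER_MAXIMUM_LENGTH, 0);
```

A NULL on either side of the result means the column does not exist in that table.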

This is a GPL Java program I wrote for comparing data in any two tables, with a common key and common columns, across any two heterogeneous databases using JDBC: https://sourceforge.net/projects/metaqa/
It intelligently forgives (numeric, string and date) data type differences by reducing them to a common format. The output is a sparse tab delimited file with .xls extension for use in a spreadsheet.

Related

Use field values inside a Where statement

I'm currently developing some Quality Checks for a database and we have a table that lists out which columns are Required fields. My question is, would it be possible to use this table to generate a where...is null statement? Example below
select * from Required_Fields_Table
inner join Transaction_Table
on key fields
where (value inside field) is null
Thanks!
edit: this is using Microsoft SQL Server
More Details:
We have a Transactions table, and whether a field in that table is required is different based on the type of user (new, active, pending, etc). We have a table that maps these requirements out (a record for each field/status combination). I was hoping to use that table to run a check to make sure we weren't missing required information.
I'm not sure if I understand your question, but hopefully this will help... you can use this query to get a list of tables and all fields that are nullable
select o.name TableName, c.name ColumnName
from sys.objects o
inner join sys.columns c on o.object_id = c.object_id
where c.is_nullable = 1 -- 1 for nullable, 0 for not nullable
and o.type = 'u' -- user table
and o.name = '{insert table name here if you wish to refine your search}'
From there you can build up a query for each table with the help of a cursor.
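One possible shape of the cursor approach, driven by the requirements table from the question. The column names Field_Name, User_Status and Transaction_ID are hypothetical, since the question does not show the actual schema; the sketch assumes one row per field/status combination.

```sql
-- Sketch only: Field_Name, User_Status and Transaction_ID are assumed column names.
DECLARE @field sysname, @status varchar(20), @sql nvarchar(max);

DECLARE required_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT Field_Name, User_Status
    FROM dbo.Required_Fields_Table;

OPEN required_cursor;
FETCH NEXT FROM required_cursor INTO @field, @status;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- The column name has to be spliced into the statement text;
    -- QUOTENAME guards against odd or malicious names.
    SET @sql = N'SELECT Transaction_ID, @f AS Missing_Field, @s AS User_Status '
             + N'FROM dbo.Transaction_Table '
             + N'WHERE User_Status = @s AND ' + QUOTENAME(@field) + N' IS NULL;';

    EXEC sys.sp_executesql @sql,
         N'@f sysname, @s varchar(20)',
         @f = @field, @s = @status;

    FETCH NEXT FROM required_cursor INTO @field, @status;
END

CLOSE required_cursor;
DEALLOCATE required_cursor;
```

Each iteration returns the transactions that are missing one required field for one status; the result sets could instead be inserted into a temporary table to get a single report.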

Get column differences between 2 databases (SQL Server)

I've got 2 databases almost identical to one another. However, it seems that for some tables in the new database, they are missing columns that are in the old database.
What would be the best way to see the differences between columns in tables between 2 databases? Specifically, I need to see what columns AREN'T in the new database that ARE in the old one.
I've tried looking this up but most things I found were either not what I needed or looking at "records".
You can query the columns from your db using the sys tables and compare the result sets. This script assumes your old db has all the columns you want.
;WITH old_db_columns AS (
SELECT c.object_id, c.column_id, c.name AS column_name, t.name AS table_name
FROM old_db.sys.tables t
INNER JOIN old_db.sys.columns c
ON t.object_id = c.object_id
)
, new_db_columns AS (
SELECT c.object_id, c.column_id, c.name AS column_name, t.name AS table_name
FROM new_db.sys.tables t
INNER JOIN new_db.sys.columns c
ON t.object_id = c.object_id
)
SELECT *
FROM old_db_columns o
WHERE NOT EXISTS (
SELECT 1
FROM new_db_columns n
WHERE n.table_name = o.table_name
AND n.column_name= o.column_name)
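The script above matches tables by name only. If the same table name can exist in more than one schema, a variant that includes the schema may be safer. This is a sketch assuming the same old_db and new_db names and that both databases are on one instance:

```sql
-- Columns that exist in old_db but not in new_db, matched on schema, table and column name
SELECT s.name AS schema_name, t.name AS table_name, c.name AS column_name
FROM old_db.sys.tables AS t
INNER JOIN old_db.sys.schemas AS s ON s.schema_id = t.schema_id
INNER JOIN old_db.sys.columns AS c ON c.object_id = t.object_id
EXCEPT
SELECT s.name, t.name, c.name
FROM new_db.sys.tables AS t
INNER JOIN new_db.sys.schemas AS s ON s.schema_id = t.schema_id
INNER JOIN new_db.sys.columns AS c ON c.object_id = t.object_id
ORDER BY schema_name, table_name, column_name;
```

Swapping old_db and new_db gives the columns that exist only in the new database.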
You may use SQL Compare and SQL Data Compare, tools by Red Gate, to compare and sync databases schema and data.
You can generate the create statements of the tables and compare them using any diff tool.
Check out that video: VS Comparision
Visual Studio has built-in functionality for data compares and schema compares, and it can generate a script of the differences if you need to rectify the variances.

Find related columns among hundreds of tables for future relational identification

I am using SQL Server 2016 to pull information out of our ERP system that is stored in a DB2 database. This has thousands of tables with no keys inside of them. When pulling tables from the system, I want to be able to identify matching column names in tables so I can start creating relationships and keys when building dimensions.
Is there a way to create a query that will search my database for column names and list every table that uses that column name? I have been using OPENQUERY and INFORMATION_SCHEMA.TABLES to determine the tables I want to pull over but now I want to start determining relationships between those tables.
Any help would be much appreciated!
You can look in the old yet gold system tables.
A few examples
find all tables with a column named like ID
select so.name, sc.name
from sys.sysobjects so
join sys.syscolumns sc on sc.id = so.id
where so.xtype = N'U'
and sc.name like 'ID%'
Find the FKs from a table
select so2.name
from sys.sysobjects so
join sys.sysforeignkeys fk on so.id = fk.rkeyid
join sys.sysobjects so2 on fk.fkeyid = so2.id
where so.name = 'MyTable'
Check MSDN documentation for further reference and if you want any specific combination just post a new question.
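sys.sysobjects, sys.syscolumns and sys.sysforeignkeys are compatibility views kept for older code; the catalog views sys.tables and sys.columns expose the same information. As a sketch, assuming the tables have already been pulled into the SQL Server database, the following lists every column name that occurs in more than one table together with the tables that use it:

```sql
SELECT column_name, schema_name, table_name, table_count
FROM (
    SELECT c.name AS column_name,
           SCHEMA_NAME(t.schema_id) AS schema_name,
           t.name AS table_name,
           -- Number of tables that contain a column with this name
           COUNT(*) OVER (PARTITION BY c.name) AS table_count
    FROM sys.tables AS t
    INNER JOIN sys.columns AS c ON c.object_id = t.object_id
) AS x
WHERE table_count > 1
ORDER BY table_count DESC, column_name, schema_name, table_name;
```

Column names shared by many tables are the most likely candidates for keys and dimensions.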
I had to do something similar once, and ended up using something similar to this:
SELECT
T.name
,C1.name
,C2.Name
FROM sys.Tables T
INNER JOIN sys.Columns C1
ON C1.object_id = T.object_id
CROSS APPLY
(
SELECT OBJECT_NAME(CX.object_id) + '.' + CX.Name AS Name
FROM sys.Tables TX
INNER JOIN sys.Columns CX
ON CX.object_id = TX.object_id
AND TX.is_ms_shipped = 0
WHERE CX.object_id <> T.object_id
AND CX.name = C1.name
AND CX.user_type_id = C1.user_type_id
) C2
;
Of course, the problem with any query that we can post here is that it will be extremely generalized, because we aren't familiar with your schema. It's entirely possible, for example, that you will have tables like these:
T_Customers        T_Shipments
ID | Name          ID | Customer_ID
1  | George        1  | 1
2  | Jane          2  | 1
3  | John          3  | 3
In a case such as that, T_Shipments.Customer_ID should be linked to T_Customers.ID, but won't be in this query, because the name is different.
To search for cases like that, I modified the query later to do a second comparison with concatenations and pattern searches. Not the speediest, but certainly the most thorough - we found all sorts of things we didn't know before. Unfortunately, I can't even begin to guess what your tables/attributes might look like without a lot of further details.
Edit:
Please note that the CROSS APPLY includes a reference to user_type_id, because I wasn't interested at the time in finding columns that had the same name but were a different data type. That might not be the case for you, so you can remove that reference if it isn't relevant.
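The modified query mentioned above is not shown, so the following is only a guess at what such a pattern search could look like, not the author's actual query. It assumes a naming convention where a referencing column is called Something_ID and the referenced table contains Something in its name and has a column called ID, as in the T_Shipments / T_Customers example.

```sql
-- Sketch only: assumes the <Name>_ID / ID naming convention from the example tables.
SELECT fk_t.name AS referencing_table,
       fk_c.name AS referencing_column,
       pk_t.name AS candidate_table,
       pk_c.name AS candidate_column
FROM sys.tables AS fk_t
INNER JOIN sys.columns AS fk_c ON fk_c.object_id = fk_t.object_id
INNER JOIN sys.tables AS pk_t ON pk_t.object_id <> fk_t.object_id
INNER JOIN sys.columns AS pk_c ON pk_c.object_id = pk_t.object_id
                              AND pk_c.name = N'ID'
WHERE fk_c.name LIKE N'%[_]ID'
  -- Strip the _ID suffix (Customer_ID -> Customer) and look for it in the table name
  AND CHARINDEX(REPLACE(fk_c.name, N'_ID', N''), pk_t.name) > 0;
```

For the example tables this would pair T_Shipments.Customer_ID with T_Customers.ID. Whether the match is case-sensitive depends on the database collation, and the results are candidates to review, not confirmed relationships.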

Searching for a table based on values contained in the table

I have a table with a reason_id foreign key. I need to map that back to its primary table. I have searched our databases for matching/similar column names but I wasn't able to find the table.
I have a list of values that would be associated with the reason_id. I'd like to search for tables that contain the list I have. Any help would be appreciated.
Here's the query I was running to search for columns:
select
t.name as Table_Name,
SCHEMA_NAME(schema_id) as schema_name,
c.name as Column_Name
from
sys.tables as t
inner join
sys.columns c
on
t.OBJECT_ID = c.OBJECT_ID
where
c.name like '%reason%'
There is no easy way to find the related data in other tables.
I'd try with tools such as ApexSQL Search or SQL Search. Both are free and you won’t go wrong with any of these.
If you want to do it with SQL only, then identify all columns in all tables that have the same data type. To do so, use the sys.columns, sys.types and sys.tables views. Once you have found all the columns, write a query against each table until you find the right one.
I’d go with something like this
select COUNT(*)
from tableX
where tableX.matchedColumn in
(
-- take 100 or more random rows from the original table
-- if result gives you a number that is near to the number of values listed here then you are most probably on the right track
)
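Writing the probe for each table by hand gets tedious, so the same idea can be generated with dynamic SQL. This is a sketch assuming reason_id is an integer type; the values in the IN list are placeholders for the known reason_id values.

```sql
-- Sketch only: the IN list (1, 2, 3, 5, 8) stands in for the real reason_id values.
DECLARE @schema sysname, @table sysname, @column sysname, @sql nvarchar(max);

DECLARE candidate_columns CURSOR LOCAL FAST_FORWARD FOR
    SELECT SCHEMA_NAME(t.schema_id), t.name, c.name
    FROM sys.tables AS t
    INNER JOIN sys.columns AS c ON c.object_id = t.object_id
    INNER JOIN sys.types AS ty ON ty.user_type_id = c.user_type_id
    WHERE ty.name IN (N'tinyint', N'smallint', N'int', N'bigint');

OPEN candidate_columns;
FETCH NEXT FROM candidate_columns INTO @schema, @table, @column;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Count how many of the known values occur in this column
    SET @sql = N'SELECT @t AS table_name, @c AS column_name, '
             + N'COUNT(DISTINCT ' + QUOTENAME(@column) + N') AS matched_values '
             + N'FROM ' + QUOTENAME(@schema) + N'.' + QUOTENAME(@table)
             + N' WHERE ' + QUOTENAME(@column) + N' IN (1, 2, 3, 5, 8);';

    EXEC sys.sp_executesql @sql, N'@t sysname, @c sysname', @t = @table, @c = @column;

    FETCH NEXT FROM candidate_columns INTO @schema, @table, @column;
END

CLOSE candidate_columns;
DEALLOCATE candidate_columns;
```

A column whose matched_values is close to the number of values in the list is a likely candidate for the primary table. On a large database this scans every integer column, so it may be worth narrowing the cursor query first.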

Copy Column Into Existing Table using SQL

I want to create a column in databaseA with the same name, type, length and precision as another column in databaseB.
I need something like this:
INSERT INTO dbo.databaseA(col_name)
SELECT col_name
FROM dbo.databaseB;
GO
But the *col_name* does not yet exist in databaseA. I want to create it with the same type as *col_name* in databaseB.
I've also looked at:
ALTER TABLE table_name ADD col_name data_type
INSERT INTO dbo.databaseA(col_name)
SELECT col_name
FROM dbo.databaseB;
GO
But I don't know the data_type of the column I need to copy.
Edit:
** I'm using SQL Server 2008 R2 **
You can retrieve the datatype of the column from the database by querying the system views, similar to the following:
SELECT c.*, s.name
FROM sys.columns c
INNER JOIN sys.objects o
ON c.object_id = o.object_id
INNER JOIN sys.types s
ON c.user_type_id = s.user_type_id
WHERE o.name = 'B'
AND c.name = 'ColumnName'
You'll need to connect to DatabaseB, replace o.name = 'B' with your table's name and replace c.Name = 'ColumnName' with your column's name.
Once you have the datatype you would need to construct a DDL statement to add the column to the table in database A, something along the lines of the following:
ALTER TABLE dbo.MyTableName ADD MyColumnName DATA_TYPE_HERE
Once the table has been updated you can construct and execute your insert statement:
INSERT INTO DatabaseA.dbo.MyTableName (column list)
SELECT (column list)
FROM DatabaseB.dbo.MyTableName
Note that the above assumes both databases are located on the same SQL Server instance and that both tables reside in the dbo schema.
If this is something you plan on using in the future you should add some defensive programming steps to make sure you only add the column if the column doesn't already exist in the table.
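A sketch of how these steps could be combined with such a check, using the placeholder names DatabaseA, DatabaseB, MyTableName and MyColumnName from above. It handles only the common character, binary and decimal types; other parameterised types (datetime2, time and so on) and alias types would need extra cases.

```sql
-- Sketch only: all object names are placeholders.
DECLARE @type_sql nvarchar(200), @sql nvarchar(max);

-- Build the data type text (for example varchar(20) or decimal(10,2)) from DatabaseB's metadata
SELECT @type_sql = ty.name
    + CASE
        WHEN ty.name IN (N'varchar', N'char', N'varbinary', N'binary')
            THEN N'(' + CASE WHEN c.max_length = -1 THEN N'max'
                             ELSE CAST(c.max_length AS nvarchar(10)) END + N')'
        WHEN ty.name IN (N'nvarchar', N'nchar')  -- max_length is in bytes, two per character
            THEN N'(' + CASE WHEN c.max_length = -1 THEN N'max'
                             ELSE CAST(c.max_length / 2 AS nvarchar(10)) END + N')'
        WHEN ty.name IN (N'decimal', N'numeric')
            THEN N'(' + CAST(c.precision AS nvarchar(10)) + N','
                      + CAST(c.scale AS nvarchar(10)) + N')'
        ELSE N''
      END
FROM DatabaseB.sys.columns AS c
INNER JOIN DatabaseB.sys.tables AS t ON t.object_id = c.object_id
INNER JOIN DatabaseB.sys.types AS ty ON ty.user_type_id = c.user_type_id
WHERE t.name = N'MyTableName'
  AND c.name = N'MyColumnName';

-- Add the column only if the source column was found and the target does not have it yet
IF @type_sql IS NOT NULL
   AND NOT EXISTS (SELECT 1
                   FROM DatabaseA.sys.columns AS c
                   INNER JOIN DatabaseA.sys.tables AS t ON t.object_id = c.object_id
                   WHERE t.name = N'MyTableName'
                     AND c.name = N'MyColumnName')
BEGIN
    SET @sql = N'ALTER TABLE DatabaseA.dbo.MyTableName ADD MyColumnName '
             + @type_sql + N' NULL;';
    EXEC sys.sp_executesql @sql;
END
```

The column is added as nullable so that existing rows in the target table remain valid; the data can then be copied across in a separate statement.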