count between two tables and finding difference

I currently have the following code which works. It's comparing two tables that are exactly the same but in two separate databases to ensure they have the same record count.
I was wondering if anyone saw a better way of achieving the below?
Declare @count1 int
Declare @count2 int
select @count1 = count(*) from database1.dbo.table1
select @count2 = count(*) from database2.dbo.table1
if @count1 <> @count2
begin
-- dbo.count_log is a placeholder name for the log table
insert into dbo.count_log (message) values ('counts don''t match')
end

There really isn't a much better way. You can just do it without variables:
if (select count(*) from database1.dbo.table1) <> (select count(*) from database2.dbo.table1)
begin
-- dbo.count_log is a placeholder name for the log table
insert into dbo.count_log (message) values ('counts don''t match')
end

If you want to know where the differences are, you can use this to find the records that are missing from database2:
SELECT D1.*
FROM database1.dbo.table1 D1
LEFT JOIN database2.dbo.table1 D2
ON D1.id = D2.id
WHERE D2.id IS NULL
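The count comparison and the anti-join can be tried end to end in a small sandbox. The following is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server; the table names db1_table1 and db2_table1 are hypothetical stand-ins for the same table in the two databases, and the sample rows are made up.

```python
import sqlite3

# Two in-memory tables stand in for database1.dbo.table1 and database2.dbo.table1
# (hypothetical names and sample rows, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE db1_table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE db2_table1 (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO db1_table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO db2_table1 VALUES (1, 'a'), (3, 'c');
""")


def row_count(table):
    # The table name comes from the fixed names above, never from user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


counts_match = row_count("db1_table1") == row_count("db2_table1")

# Anti-join: rows in the first table that have no partner in the second.
missing_in_db2 = [row[0] for row in conn.execute("""
    SELECT d1.id
    FROM db1_table1 AS d1
    LEFT JOIN db2_table1 AS d2 ON d1.id = d2.id
    WHERE d2.id IS NULL
    ORDER BY d1.id
""")]

print(counts_match, missing_in_db2)
```

Note that equal counts do not prove the tables hold the same rows; the anti-join is what actually identifies which ids are missing.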

Related

SQL: Delete Rows from Dynamic list of tables where ID is null

I'm a SQL novice, and usually figure things out via Google and SO, but I can't wrap my head around the SQL required for this.
My question is similar to Delete sql rows where IDs do not have a match from another table, but in my case I have a middle table that I have to query, so here's the scenario:
We have this INSTANCES table that basically lists all the occurrences of files sent to the database, but have to join with CROSS_REF so our reporting application knows which table to query for the report, and we just have orphaned INSTANCES rows I want to clean out. Each DETAIL table contains different fields from the other ones.
I want to delete all single records from INSTANCES if there are no records for that Instance ID in any DETAIL table. The DETAIL table got regularly cleaned of old files, but the Instance record wasn't cleaned up, so we have a lot of INSTANCE records that don't have any associated DETAIL data. The thing is, I have to select the Table Name from CROSS_REF to know which DETAIL_X table to look up the Instance ID.
In the below example then, since DETAIL_1 doesn't have a record with Instance ID = 1001, I want to delete the 1001 record from INSTANCES.
INSTANCES
Instance ID | Detail ID
1000        | 123
1001        | 123
1002        | 234

CROSS_REF
Detail ID | Table Name
123       | DETAIL_1
124       | DETAIL_2
125       | DETAIL_3

DETAIL_1
Instance ID
1000
1000
2999

Storing table names or column names in a database is almost always a sign of bad database design. You may want to change this and thus get rid of the problem.
However, if you know the possible table names, the task is not too difficult.
delete from instances i
where not exists
(
select null
from cross_ref cr
left join detail_1 d1 on d1.instance_id = i.instance_id and cr.table_name = 'DETAIL_1'
left join detail_2 d2 on d2.instance_id = i.instance_id and cr.table_name = 'DETAIL_2'
left join detail_3 d3 on d3.instance_id = i.instance_id and cr.table_name = 'DETAIL_3'
where cr.detail_id = i.detail_id
and
(
d1.instance_id is not null or
d2.instance_id is not null or
d3.instance_id is not null
)
);
(You can replace is not null by = i.instance_id, if you find that more readable. In that case you could even remove these criteria from the ON clauses.)
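To see the NOT EXISTS delete behave as described, here is a minimal sketch run against SQLite through Python's sqlite3 module. It is not the asker's real schema: only two detail tables are created, the rows are made up, and the outer table is referenced by name (instances) instead of an alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (instance_id INTEGER, detail_id INTEGER);
    CREATE TABLE cross_ref (detail_id INTEGER, table_name TEXT);
    CREATE TABLE detail_1 (instance_id INTEGER);
    CREATE TABLE detail_2 (instance_id INTEGER);
    -- Made-up sample rows: 1001 has no detail record, 1000 and 1002 do.
    INSERT INTO instances VALUES (1000, 123), (1001, 123), (1002, 124);
    INSERT INTO cross_ref VALUES (123, 'DETAIL_1'), (124, 'DETAIL_2');
    INSERT INTO detail_1 VALUES (1000), (1000), (2999);
    INSERT INTO detail_2 VALUES (1002);
""")

# Delete every instance whose CROSS_REF entry points at a detail table
# that holds no row for that instance id.
conn.execute("""
    DELETE FROM instances
    WHERE NOT EXISTS (
        SELECT 1
        FROM cross_ref AS cr
        LEFT JOIN detail_1 AS d1
               ON d1.instance_id = instances.instance_id
              AND cr.table_name = 'DETAIL_1'
        LEFT JOIN detail_2 AS d2
               ON d2.instance_id = instances.instance_id
              AND cr.table_name = 'DETAIL_2'
        WHERE cr.detail_id = instances.detail_id
          AND (d1.instance_id IS NOT NULL OR d2.instance_id IS NOT NULL)
    )
""")

remaining = [row[0] for row in conn.execute(
    "SELECT instance_id FROM instances ORDER BY instance_id")]
print(remaining)
```

An instance whose Detail ID has no CROSS_REF row at all is also deleted by this statement, because the subquery then returns nothing.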

Many thanks to @DougCoats. Here's what I ended up with (@Doug, if you want to update your answer, I'll mark yours correct).
DECLARE @Count INT, @Sql VARCHAR(MAX), @Max INT;
SET @Count = (SELECT MIN(DetailID) FROM CROSS_REF)
SET @Max = (SELECT MAX(DetailID) FROM CROSS_REF)
WHILE @Count <= @Max
BEGIN
-- skip DetailID values that have no row in CROSS_REF
IF (select count(*) from CROSS_REF where DetailID = @Count) <> 0
BEGIN
SET @Sql ='DELETE i
FROM Instances i
WHERE NOT EXISTS
(
SELECT InstanceID
FROM '+(SELECT TableName FROM Cross_Ref WHERE DetailID=@Count)+' d
WHERE d.InstanceId=i.InstanceID
AND i.detailID ='+ cast(@Count as varchar) +'
)
AND i.detailID ='+ cast(@Count as varchar)
EXEC(@Sql);
END
-- increment outside the IF so that a gap in DetailID cannot stall the loop
SET @Count=@Count+1
END

This answer was written assuming the DetailID values in the CROSS_REF table are sequential. The existence check inside the loop is there to skip any gaps, and the counter is incremented outside that check so a gap cannot stall the loop.
This should give you an idea. It could probably also be written as a more set-based approach, but my answer is meant to demonstrate the use of dynamic SQL. Be careful when using dynamic SQL, though.
DECLARE @Count INT, @Sql VARCHAR(MAX), @Max INT;
SET @Count = (SELECT MIN(DetailID) FROM CROSS_REF)
SET @Max = (SELECT MAX(DetailID) FROM CROSS_REF)
WHILE @Count <= @Max
BEGIN
-- skip DetailID values that have no row in CROSS_REF
IF (select count(*) from CROSS_REF where DetailID = @Count) <> 0
BEGIN
SET @Sql ='DELETE i
FROM Instances i
WHERE NOT EXISTS
(
SELECT InstanceID
FROM '+(SELECT TableName FROM Cross_Ref WHERE DetailID=@Count)+' d
WHERE d.InstanceId=i.InstanceID
AND i.detailID ='+ cast(@Count as varchar) +'
)
AND i.detailID ='+ cast(@Count as varchar)
EXEC(@Sql);
END
-- increment outside the IF so that a gap in DetailID cannot stall the loop
SET @Count=@Count+1
END

How to write if exist statement that would run a different select depending if it exists or not

I am trying to convert a SQL IF EXISTS statement into a format that SSRS accepts, so I can run a report on CRM.
CRM rejects the report on upload if the query contains IF EXISTS, and I'm having trouble figuring out what I can use in its place.
IF EXISTS(select * from dbo.FC where dbo.FC.ContactID in (select dbo.AV.so_contactid from dbo.AV))
begin
select [STATEMENT 1]
from dbo.AV CRMAF_so_AV join
dbo.FC c
on CRMAF_so_AV.so_contactid = c.ContactID;
end
else
begin
select [STATEMENT 2]
from dbo.AV CRMAF_so_AV join
dbo.FA c
on CRMAF_so_AV.so_contactid = c.AccountID;
end;
I want to be able to either run the select [STATEMENT 1] if the condition is true else I want to run select [STATEMENT 2]

I have managed to get this to work by doing a LEFT JOIN instead of a JOIN.
select [STATEMENT 1 + 2 all columns needed]
from dbo.AV CRMAF_so_AV
left join dbo.FC c on CRMAF_so_AV.so_contactid = c.ContactID
left join dbo.FA a on CRMAF_so_AV.so_contactid = a.AccountID;
This now runs whether it's an account or a contact.

Try this -
You have to put your entire statements in @statement1 and @statement2.
declare @statement1 as varchar(max);
declare @statement2 as varchar(max);
SET @statement1 = 'SELECT 1'
SET @statement2 = 'SELECT 2'
IF EXISTS(select * from dbo.FC where dbo.FC.ContactID in (select dbo.AV.so_contactid from dbo.AV))
BEGIN
EXEC (@statement1)
END
ELSE
BEGIN
EXEC (@statement2)
END

Instead of using IF EXISTS, could you get a count of the records that meet the criteria, then run one query if the count is 1 or greater and a different one if it equals 0?
Let me know if I am missing something about what you are trying to achieve.
Sorry, I can't post this as a comment: my account is new, so my reputation is too low.

I think you need something like this:
WITH PreSelection AS (
SELECT
AV.ID AS AVID,
(SELECT TOP(1) c.ContactID FROM dbo.FC c WHERE c.ContactID = AV.so_contactid) AS ContactID,
(SELECT TOP(1) a.AccountID FROM dbo.FA a WHERE a.AccountID = AV.so_contactid) AS AccountID
FROM dbo.AV
)
SELECT
p.AVID,
ISNULL(
CASE WHEN p.ContactID IS NULL
-- qualify with p. so the subquery compares against PreSelection, not against itself
THEN (SELECT TOP(1) AccountName FROM dbo.FA WHERE FA.AccountID = p.AccountID)
ELSE (SELECT TOP(1) LTRIM(RTRIM(ISNULL(FirstName, '') + ' ' + ISNULL(LastName, ''))) FROM dbo.FC WHERE FC.ContactID = p.ContactID)
END, '') AS ContactName
FROM PreSelection p

A few things to note:
When SSRS evaluates a query, it expects the results to always have the same structure in terms of column names and types.
So you CANNOT do something like this:
IF @x=@y
BEGIN
SELECT Name, Age FROM employees
END
ELSE
BEGIN
SELECT DeptID, DeptName, DeptEmpCount FROM departments
END
... as the two branches return different column names, types and column counts.
What you CAN do is this:
DECLARE @t TABLE(resultType int, colA varchar(128), colB int, colC varchar(128), colD int)
IF @x=@y
BEGIN
INSERT INTO @t(resultType, colA, colB)
SELECT 1 as resultType, Name, Age FROM employees
END
ELSE
BEGIN
INSERT INTO @t(resultType, colB, colC, colD)
SELECT 2 AS resultType, DeptID, DeptName, DeptEmpCount FROM departments
END
SELECT * FROM @t
All we are doing is creating a table that can hold every variation of the data, and putting the results into whichever columns can accommodate each data type.
This always returns the same data structure, so SSRS will be happy. You then need to decide which columns to display based on what gets returned, which is why I added resultType to the results: you can test it from within the report.
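The idea of one fixed result shape for both branches can be checked outside SSRS. The sketch below uses Python's sqlite3 module, with a temporary table standing in for the table variable; the helper fetch_report and the sample rows are hypothetical. It shows that the column list is identical whichever branch runs, and only resultType and the populated columns differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (Name TEXT, Age INTEGER);
    CREATE TABLE departments (DeptID INTEGER, DeptName TEXT, DeptEmpCount INTEGER);
    INSERT INTO employees VALUES ('Ann', 41);
    INSERT INTO departments VALUES (7, 'Sales', 12);
""")


def fetch_report(use_employees):
    """Return (column_names, rows); the column list is the same on both branches."""
    # A temp table plays the role of the table variable in the T-SQL version.
    conn.execute("DROP TABLE IF EXISTS temp.t")
    conn.execute(
        "CREATE TEMP TABLE t "
        "(resultType INTEGER, colA TEXT, colB INTEGER, colC TEXT, colD INTEGER)")
    if use_employees:
        conn.execute(
            "INSERT INTO t (resultType, colA, colB) "
            "SELECT 1, Name, Age FROM employees")
    else:
        conn.execute(
            "INSERT INTO t (resultType, colB, colC, colD) "
            "SELECT 2, DeptID, DeptName, DeptEmpCount FROM departments")
    cur = conn.execute("SELECT * FROM t")
    return [d[0] for d in cur.description], cur.fetchall()


emp_cols, emp_rows = fetch_report(True)
dept_cols, dept_rows = fetch_report(False)
print(emp_cols == dept_cols, emp_rows, dept_rows)
```

Columns that a branch does not fill come back as NULL (None in Python), which is what the report would test alongside resultType.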

SQL Anywhere Error -824: Illegal reference to correlation name tableName

When I run this script on Sybase IQ:
declare @YEAR int=2017
declare @MON int=6
declare @DAY int=7
update MainTable
set MainTable.Amount=(X.Number+Y.Number),
MainTable.Total=(X.Total+Y.Total)
from (select 'Number'= count(*), 'Total'=case when SUM(T1_Total) is null then 0 else SUM(T1_Total) end
from Table1
where T1_Account_NO=MainTable.Account_NO
and T1_SENTRY_YEAR=@YEAR and T1_SENTRY_MON=@MON and T1_SENTRY_DAY=@DAY) X,
(select 'Number'= count(*), 'Total'=case when SUM(T2_TOTAL) is null then 0 else SUM(T2_TOTAL) end
from Table2 where T2_Account_NO = MainTable.Account_NO
and T2_YEAR=@YEAR and T2_MON=@MON and T2_DAY=@DAY )Y
where MainTable.YEAR=@YEAR
and MainTable.MON = @MON
and MainTable.DAY=@DAY
I get this error: "SQL Anywhere Error -824: Illegal reference to correlation name MainTable"
How can I get around this problem?

Have you tried adding MainTable to the from clause, eg:
update Maintable
set ...
from MainTable,
(select ... )X,
(select ... )Y
where ...
NOTE: I work with Sybase ASE, which does not allow 'external' correlation names to be referenced within derived tables, so I'm wondering if SQLAnywhere has a similar limitation ... ?
What happens if you pull the MainTable joins out to the top-most level of the query, eg:
declare @YEAR int=2017
declare @MON int=6
declare @DAY int=7
update MainTable
set MainTable.Amount=(X.Number+Y.Number),
MainTable.Total=(X.Total+Y.Total)
from (select T1_account_NO, 'Number'= count(*), 'Total'=case when SUM(T1_Total) is null then 0 else SUM(T1_Total) end
from Table1
where T1_SENTRY_YEAR=@YEAR and T1_SENTRY_MON=@MON and T1_SENTRY_DAY=@DAY
group by T1_Account_NO) X,
(select T2_Account_NO, 'Number'= count(*), 'Total'=case when SUM(T2_TOTAL) is null then 0 else SUM(T2_TOTAL) end
from Table2 where T2_YEAR=@YEAR and T2_MON=@MON and T2_DAY=@DAY
group by T2_Account_NO)Y
where MainTable.YEAR=@YEAR
and MainTable.MON = @MON
and MainTable.DAY=@DAY
and MainTable.Account_NO = X.T1_Account_NO
and MainTable.Account_NO = Y.T2_Account_NO
One potential performance-related downside would be if the derived tables now generate a large set of records that won't be joined with MainTable (unless the SQLAnywhere query engine is able to flatten the query in some way ... ???).
If this is an issue of not allowing 'external' correlation names in derived tables, another (obvious ?) solution would be to create a couple #temp tables from the results of joining MainTable with Table1/Table2, then perform the update of MainTable as a join with the #temp tables. [Possibly indexing the #temp tables if the data volumes are large enough to justify, performance-wise, the indexes.]

Did you try adding MainTable to the FROM clause?

I got around the problem like this:
declare @YEAR int=2017
declare @MON int=6
declare @DAY int=7
update MainTable
set MainTable.Amount= (X.Number),
MainTable.Total = (X.Total)
from (select T1_Account_NO,'Number'= count(*), 'Total'=case when SUM(T1_Total) is null then 0 else SUM(T1_Total) end
from Table1
where T1_SENTRY_YEAR=@YEAR and T1_SENTRY_MON=@MON and T1_SENTRY_DAY=@DAY
group by T1_Account_NO) X
where X.T1_Account_NO=MainTable.Account_NO
and MainTable.YEAR=@YEAR
and MainTable.MON = @MON
and MainTable.DAY=@DAY
update MainTable
set MainTable.Amount= coalesce(MainTable.Amount,0)+(Y.Number),
MainTable.Total = coalesce(MainTable.Total,0)+(Y.Total)
from (select T2_Account_NO,'Number'= count(*), 'Total'=case when SUM(T2_TOTAL) is null then 0 else SUM(T2_TOTAL) end
from Table2
where T2_YEAR=@YEAR and T2_MON=@MON and T2_DAY=@DAY
group by T2_Account_NO) Y
where MainTable.YEAR=@YEAR
and MainTable.MON = @MON
and MainTable.DAY=@DAY
and Y.T2_Account_NO = MainTable.Account_NO
I have separated the update script into two parts.

Update fails because Subquery returned more than 1 value

I get the following error when I try to update my table, although there isn't any subquery in my statement:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
MY QUERY :
UPDATE t1
SET t1.modified = 2
FROM TransActions AS t1
INNER JOIN Ruser R
ON t1.USERID = r.USERID
WHERE r.dep_code = 54 and r.dep_year =2014
and YEAR(t1.checktime) =2016 and MONTH(t1.checktime) =1 and t1.modified = 0
The data selected like this :
USERID empNum
3090 25
3090 25
2074 464
As requested in the comments, here is my update trigger:
after update
as
declare @userid int , @date date
if (select userid from inserted)<>(select userid from deleted )
raiserror ('YOU ARE NOT ALLOWED TO PERFORME THIS ACTION',10 , 1)
ELSE
begin
set nocount on;
set @userid = (select userid from inserted)
set @date = (select convert(date , checktime) from inserted)
exec calc_atten @date , @userid
end

Triggers are executed per statement, not per row; that's the source of your error.
Your trigger assumes that the inserted and deleted tables will only ever have one row, but that is simply wrong.
The number of rows in the inserted / deleted tables is the number of rows affected by the DML statement (update/insert/delete).
I don't know what the procedure calc_atten does, but you need to find a way to execute its logic at the set level and not on scalar variables as it does now.
Your condition at the beginning of the trigger should be changed to fit a multi-row update.
One way to do it is this (I could probably write it shorter and better if I knew the table's structure):
IF EXISTS (
SELECT 1
FROM deleted d
INNER JOIN inserted i
ON d.[unique row identifier] = i.[unique row identifier]
WHERE i.userId <> d.UserId
)
*[unique row identifier] stands for any column or column combination that is unique per row in that table. If the unique row identifier contains the UserId column then this will not work properly.
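The multi-row check can be exercised without a real trigger. In the sketch below (Python's sqlite3 module), two ordinary tables named deleted and inserted merely simulate the pseudo-tables SQL Server exposes inside a trigger, and transaction_id is a hypothetical unique row identifier. The same EXISTS query is run once when no userid changed and once after one row's userid is altered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-ins for the trigger's pseudo-tables after a three-row UPDATE.
    CREATE TABLE deleted  (transaction_id INTEGER PRIMARY KEY, userid INTEGER);
    CREATE TABLE inserted (transaction_id INTEGER PRIMARY KEY, userid INTEGER);
    INSERT INTO deleted  VALUES (1, 3090), (2, 3090), (3, 2074);
    INSERT INTO inserted VALUES (1, 3090), (2, 3090), (3, 2074);
""")

# Set-based check: is there any row whose userid differs before and after?
CHECK = """
    SELECT EXISTS (
        SELECT 1
        FROM deleted AS d
        INNER JOIN inserted AS i ON d.transaction_id = i.transaction_id
        WHERE i.userid <> d.userid
    )
"""


def userid_changed():
    return bool(conn.execute(CHECK).fetchone()[0])


unchanged_result = userid_changed()

# Simulate an UPDATE that tried to change the userid of one of the rows.
conn.execute("UPDATE inserted SET userid = 9999 WHERE transaction_id = 2")
changed_result = userid_changed()

print(unchanged_result, changed_result)
```

Because the comparison is a join over all affected rows, it gives one answer per statement regardless of how many rows the UPDATE touched, which is what a statement-level trigger needs.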

Your query is OK. The problem is the trigger. inserted and deleted are tables (well, really views, but that is irrelevant), so they can contain multiple rows.
Assuming that TransActions has a primary key (called transactionid here), you can check the update by doing
declare @userid int , @date date ;
-- an inserted row with no deleted row having the same key and the same userid means the userid was changed
if (exists (select 1
from inserted i
where not exists (select 1
from deleted d
where d.transactionid = i.transactionid and
d.userid = i.userid
)
)
)
begin
raiserror ('Changing user ids is not permitted', 10 , 1);
end;
else begin
set nocount on;
declare icursor cursor for select userid, checktime from inserted;
open icursor;
fetch next from icursor into @userid, @date;
-- keep looping while the previous fetch succeeded
while @@FETCH_STATUS = 0
begin
exec calc_atten @date, @userid
fetch next from icursor into @userid, @date;
end;
close icursor; deallocate icursor;
end;
Cursors are not my favorite SQL construct. But, if you need to loop through a table and call a stored procedure, then they are appropriate. If you can rewrite the code to be set-based, then you can get rid of the cursor.

Try using distinct like this:
UPDATE t1
SET t1.modified = 2
FROM TransActions AS t1
INNER JOIN (select distinct r.userid from Ruser r
where r.dep_code = 54 and r.dep_year = 2014 ) R
ON t1.USERID = R.USERID
WHERE YEAR(t1.checktime) =2016 and MONTH(t1.checktime) =1 and t1.modified = 0
BTW, I don't see any subquery here, so it's weird that this is the error you get. I have a feeling the error isn't caused by this part of the code.

You can use distinct to return unique userids:
UPDATE TransActions
SET modified = 2
WHERE YEAR(checktime) = 2016
AND MONTH(checktime) = 1
AND modified = 0
AND userid IN ( SELECT DISTINCT userid FROM Ruser r WHERE r.dep_code = 54 and r.dep_year =2014 );

MS sql server looping through huge table

I have a table with 9 million records. I need to loop through each row and insert into multiple tables in each iteration.
My example query is
--this is the table with 9 million records
create table tablename
(
ROWID INT IDENTITY(1, 1) primary key ,
LeadID int,
Title varchar(20),
FirstName varchar(50),
MiddleName varchar(20),
Surname varchar(50)
)
declare @counter int
declare @leadid int
Declare @totalcounter int
set @counter = 1
Select @totalcounter = count(id) from tablename
while(@counter < @totalcounter)
begin
select @leadid = leadid from tablename
where ROWID = @counter
--perform some insert into multiple tables
--in each iteration i need to do this as well
select * from [sometable]
inner join tablename where leadid = @leadid
set @counter = @counter + 1
end
The problem here is this is taking too long especially the join on each iteration.
Can someone please help me to optimize this.

Yes, your join is taking a long time because no join condition is specified between your two tables, so you are creating a Cartesian product. That is definitely going to take a while.
If you want to optimize this, specify what you want to join those tables on.
If it is still slow, have a look at appropriate indexes.

It looks like you are trying to find all the rows in sometable that have the same leadid as the rows in tablename. If so, a simple join should work:
select t2.*
from tablename t1 inner join sometable t2
on t1.leadid=t2.leadid
As long as you have an index on leadid, you shouldn't have any problems.
What are you really trying to do?
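To illustrate replacing the row-by-row loop with one set-based statement, here is a minimal sketch using Python's sqlite3 module. The target table lead_notes, the note column and the sample rows are hypothetical; the point is only that a single INSERT ... SELECT with an explicit join condition does the work of every loop iteration at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablename (row_no INTEGER PRIMARY KEY, leadid INTEGER, firstname TEXT);
    CREATE TABLE sometable (leadid INTEGER, note TEXT);
    -- Hypothetical target table for the per-lead inserts.
    CREATE TABLE lead_notes (leadid INTEGER, firstname TEXT, note TEXT);
    -- Index on the join column, as the answers suggest.
    CREATE INDEX ix_sometable_leadid ON sometable (leadid);
    INSERT INTO tablename (leadid, firstname) VALUES (10, 'Ann'), (20, 'Bob'), (30, 'Cy');
    INSERT INTO sometable VALUES (10, 'called'), (10, 'emailed'), (30, 'visited');
""")

# One statement replaces the whole loop: the join condition pairs each lead
# with its matching rows, and the result is inserted in a single pass.
conn.execute("""
    INSERT INTO lead_notes (leadid, firstname, note)
    SELECT t1.leadid, t1.firstname, t2.note
    FROM tablename AS t1
    INNER JOIN sometable AS t2 ON t1.leadid = t2.leadid
""")

copied = conn.execute(
    "SELECT leadid, firstname, note FROM lead_notes ORDER BY leadid, note"
).fetchall()
print(copied)
```

Lead 20 has no matching row in sometable, so the inner join simply produces nothing for it, with no per-row lookup needed.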
