SQL Server Query Performance Issue: Need Replacement of NOT EXISTS

Could someone help optimize the performance of the general SQL query below:
select fn.column1
from dbo.function1(input1) as fn
where (not exists (select 1 from table1 t1
where fn.column1 = t1.column1)
and not exists (select 1 from table2 t2
where fn.column1 = t2.column1))
For the above query, consider the approximate row count given below.
select fn.column1 from dbo.function1(input1) as fn -- returns 64000 records in 2 seconds.
select column1 from dbo.table1 -- returns 3000 records in 1 second.
select column1 from dbo.table2 -- returns 2000 records in 1 second.
So, if I run each select statement on its own, it pulls and displays the records in 1 or 2 seconds. But if I run the full query, it takes more than a minute to display 64000 - (3000 + 2000) = 59000 records.
I tried using EXCEPT like this:
select fn.column1
from dbo.function1(input1) as fn
except
(select column1 from dbo.table1 union select column1 from dbo.table2)
Nothing improves the performance. It still takes a minute to display 59000 records. The same is true for the "NOT IN" variant.
Also I noticed that if we use UNION, instead of EXCEPT in the above query, it returns 59K records in 2 seconds.
UPDATED:
The function (a bit complex) contains the below pseudocode
select column1, column2,...column6
from dbo.table1
inner join dbo.table2
inner join ....
inner join dbo.table6
inner join dbo.innerfunction1
where <Condition 1>
UNION ALL
select column1, column2,...column6
from dbo.table1
inner join dbo.table2
inner join ...
inner join dbo.table4
inner join dbo.innerfunction2
where (condition 2)
Assume that the two inner functions each contain a single-table select statement.
My question is: if I select the column from the function alone, it displays 64K records in 1 second. But if the whole query is executed, it takes more than a minute.
[Please note: this query needs to be used in a function]
Could anyone help me improve this?
Kindly let me know if you need more details or clarifications.
Regards,
Viswa V.

It's a bit hard to optimize without data to play with. A fiddle would be good. Nonetheless, here is an approach that may work.
Create a temp table, index it then do the EXCEPT as follows:
SELECT
fn.column1
INTO
#temp
FROM
dbo.function1(input1) AS fn
CREATE NONCLUSTERED INDEX [temp_index] ON #temp
(
column1 ASC
)
SELECT
column1
FROM
#temp AS t
EXCEPT
(
SELECT
column1
FROM
dbo.table1
UNION
SELECT
column1
FROM
dbo.table2
)
I would be interested in the result.
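For anyone who wants to try the pattern without a SQL Server instance, here is a minimal sketch of the same materialize, index, EXCEPT steps, run against in-memory SQLite from Python. The data is invented, `fn_result` is a hypothetical stand-in for the output of dbo.function1, and SQLite's CREATE TEMP TABLE ... AS takes the place of T-SQL's SELECT ... INTO #temp. It only shows the shape of the query; it says nothing about timings on the real server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical stand-ins: fn_result plays the role of dbo.function1(input1).
cur.execute("CREATE TABLE fn_result (column1 INTEGER)")
cur.execute("CREATE TABLE table1 (column1 INTEGER)")
cur.execute("CREATE TABLE table2 (column1 INTEGER)")
cur.executemany("INSERT INTO fn_result VALUES (?)", [(i,) for i in range(1, 11)])
cur.executemany("INSERT INTO table1 VALUES (?)", [(2,), (4,)])
cur.executemany("INSERT INTO table2 VALUES (?)", [(4,), (9,)])

# Step 1: materialize the function output once.
cur.execute("CREATE TEMP TABLE temp_fn AS SELECT column1 FROM fn_result")
# Step 2: index the materialized rows.
cur.execute("CREATE INDEX temp_index ON temp_fn (column1)")
# Step 3: remove every value that appears in table1 or table2.
remaining = [row[0] for row in cur.execute(
    """
    SELECT column1 FROM temp_fn
    EXCEPT
    SELECT column1 FROM (
        SELECT column1 FROM table1
        UNION
        SELECT column1 FROM table2
    )
    ORDER BY column1
    """
)]
print(remaining)
conn.close()
```

With this sample data, the values 2, 4 and 9 are removed and the other seven survive.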

Related

SQL - combine two functions to one table

So I have the first code which I use:
with idlist as (
Select uniqueid as masterid
From table1
Union
Select uniqueid
From table2
)
Select i.masterid,
t1.*,
t2.*
From idlist as i
Left join table1 as t1 on t1.uniqueid = i.masterid
Left join table2 as t2 on t2.uniqueid = i.masterid
Purpose of the code above: to take 2 or more tables which have the same id column and union to one row.
Second code:
select [Id], [price], [description]
from [table1]
where name_1 % 10 = 8 -- enter name_1 manually
union all
select [Id], [price], [description]
from [table2]
where name_2 % 10 = 8
Purpose of the code above: check a specific column, if it ends with '8' then list its Id, price, description
What I want:
Basically combine those two codes. I need the second code to run within the result of the first code.
I thought of creating a new table for the result of the first query above, but as far as I know I need to create all the columns before inserting data into it, and the result of the first query will be a combination of several tables with a lot of different columns.
Nonetheless, I still want the combined result of the two queries to end up in a new table.
So if I had to put everything into words and steps:
Select id from tables> union under the same Id to one row with all their row content > create table with the result > run the second code above on the new table
Thanks
You seem to want a bunch of columns from both tables, so I'm wondering if this accomplishes what you want:
select *
from (select t1.*
from [table1] t1
where name_1 % 10 = 8
) t1 full join
(select t2.*
from [table2] t2
where name_2 % 10 = 8
) t2
on t1.id = t2.id;
Note that this is using full join -- which is what your first query is doing (well, assuming that id is not NULL).
Also, it seems very strange to be using an arithmetic calculation (%) on a column called "name".
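On the "new table" part of the question: most engines can create a table directly from a query, so the columns do not have to be declared first. Below is a small sketch using in-memory SQLite from Python, where the statement is CREATE TABLE ... AS SELECT (on SQL Server the usual counterpart is SELECT ... INTO). The sample rows, the `price` and `description` placement, and the table name `combined` are all made up for illustration. The full join is emulated with the UNION plus LEFT JOIN shape from the question, because older SQLite versions lack FULL JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Invented sample data; column names follow the question where possible.
cur.execute("CREATE TABLE table1 (id INTEGER, name_1 INTEGER, price INTEGER)")
cur.execute("CREATE TABLE table2 (id INTEGER, name_2 INTEGER, description TEXT)")
cur.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                [(1, 18, 95), (2, 21, 30), (3, 28, 72)])
cur.executemany("INSERT INTO table2 VALUES (?, ?, ?)",
                [(1, 108, "kept"), (2, 8, "kept too"), (3, 5, "filtered out")])

# Filter each table first, then combine matching ids into one row.
# CREATE TABLE ... AS derives the new table's columns from the SELECT list.
cur.execute(
    """
    CREATE TABLE combined AS
    WITH f1 AS (SELECT * FROM table1 WHERE name_1 % 10 = 8),
         f2 AS (SELECT * FROM table2 WHERE name_2 % 10 = 8),
         idlist AS (SELECT id AS masterid FROM f1
                    UNION
                    SELECT id FROM f2)
    SELECT i.masterid, f1.name_1, f1.price, f2.name_2, f2.description
    FROM idlist AS i
    LEFT JOIN f1 ON f1.id = i.masterid
    LEFT JOIN f2 ON f2.id = i.masterid
    """
)
rows = cur.execute("SELECT * FROM combined ORDER BY masterid").fetchall()
for row in rows:
    print(row)
conn.close()
```

Listing the columns explicitly, rather than selecting t1.* and t2.*, avoids duplicate column names in the new table.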

SQL SELECT compare values from two tables (without UNION ALL)

I have table T1:
ID IMPACT
1 3
I have table T2
PRIORITY URGENCY
1 2
I need to do the SELECT from T1 table.
I would like to get all the rows from T1 where IMPACT is greater than PRIORITY from T2.
I am working in some IBM application where it is only possible to start with SQL statement after the WHERE clause from the first table T1.
So query (unfortunately) must always start with "SELECT * FROM T1 WHERE..."
This cannot be changed (please have that in mind).
This means that I cannot use some JOIN or UNION ALL statement after the "FROM T1" part because I can start to write SQL query only after the WHERE clause.
SELECT * FROM T1
WHERE
IMPACT> SELECT PRIORITY FROM T2 WHERE URGENCY=2
But I am getting an error for this statement.
Is it possible to write an SQL query starting with:
SELECT * FROM T1
WHERE
You want a subquery, so all you need are parentheses:
SELECT *
FROM T1
WHERE IMPACT > (SELECT T2.PRIORITY FROM T2 WHERE T2.URGENCY = 2)
This assumes that the subquery returns one row (or zero rows, in which case nothing is returned). If the subquery can return more than one row, you should ask another question and be very explicit about what you want done.
One reasonable interpretation (for more than one row) is:
SELECT *
FROM T1
WHERE IMPACT > (SELECT MAX(T2.PRIORITY) FROM T2 WHERE T2.URGENCY = 2)
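To see the parenthesized subquery in action, here is a small sketch on in-memory SQLite driven from Python. The first row of each table comes from the question; the extra rows are invented so that the subquery matches more than one row and MAX has something to do. SQLite is only a stand-in here, since the IBM application's actual database is not known.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE T1 (ID INTEGER, IMPACT INTEGER)")
cur.execute("CREATE TABLE T2 (PRIORITY INTEGER, URGENCY INTEGER)")
# (1, 3) and (1, 2) are from the question; the other rows are made up.
cur.executemany("INSERT INTO T1 VALUES (?, ?)", [(1, 3), (2, 1)])
cur.executemany("INSERT INTO T2 VALUES (?, ?)", [(1, 2), (2, 2), (5, 1)])

# The statement still starts with SELECT * FROM T1 WHERE, as required.
matches = cur.execute(
    """
    SELECT * FROM T1
    WHERE IMPACT > (SELECT MAX(T2.PRIORITY) FROM T2 WHERE T2.URGENCY = 2)
    """
).fetchall()
print(matches)
conn.close()
```

Only rows of T2 with URGENCY = 2 feed the MAX, so the highest priority considered is 2 and only the T1 row with IMPACT 3 qualifies.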
I would use exists:
select t1.*
from t1
where exists (select 1 from t2 where t1.IMPACT > t2.PRIORITY);

Do volatile tables truncate results by default?

I have two queries that are supposed to return equivalent results. However, the second query gives only partial results (less than 10% of the total).
First query gives more than 4 million rows
SELECT id, amount
FROM table1 t1 LEFT OUTER JOIN table2 t2 ON t1.id = t2.id;
The second gives only 18 thousand records:
CREATE VOLATILE TABLE vt AS
(
SELECT id, amount
FROM table1 t1 LEFT OUTER JOIN table2 t2 ON t1.id = t2.id
)
WITH DATA
NO PRIMARY INDEX
ON COMMIT PRESERVE ROWS;
SELECT *
FROM vt ;
Why does the second query return fewer records?
When you do a SHOW TABLE vt; you'll notice that it's created as a SET table, which doesn't store duplicate rows. There are only 18 thousand distinct (id,amount) combinations.
Either add DISTINCT to your first Select or use CREATE MULTISET VOLATILE TABLE.
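A quick way to check this explanation is to compare the total row count of the join with its count of distinct rows. The sketch below does that on in-memory SQLite from Python with invented data. SQLite has no SET tables, so the SELECT DISTINCT here only imitates what a Teradata SET table would keep.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE table1 (id INTEGER)")
cur.execute("CREATE TABLE table2 (id INTEGER, amount INTEGER)")
# Invented rows with repeated ids, so the join produces duplicate rows.
cur.executemany("INSERT INTO table1 VALUES (?)", [(1,), (1,), (2,)])
cur.executemany("INSERT INTO table2 VALUES (?, ?)", [(1, 10), (1, 10), (2, 20)])

join_sql = """
    SELECT t1.id AS id, t2.amount AS amount
    FROM table1 t1 LEFT OUTER JOIN table2 t2 ON t1.id = t2.id
"""
# Rows a MULTISET table would hold: every row the join returns.
all_rows = cur.execute(f"SELECT COUNT(*) FROM ({join_sql})").fetchone()[0]
# Rows a SET table would hold: one per distinct (id, amount) pair.
distinct_rows = cur.execute(
    f"SELECT COUNT(*) FROM (SELECT DISTINCT id, amount FROM ({join_sql}))"
).fetchone()[0]
print(all_rows, distinct_rows)
conn.close()
```

If the two counts differ on the real tables in the same way (4 million against 18 thousand), the SET table default accounts for the gap.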

SQL I want to duplicate record on insert

Without using a while or for loop, is there a way to insert a record two or more times in a single insert?
Thanks
INSERT INTO TABLE2 ((VALUE,VALUE)
SELECT VALUE,VALUE FROM TABLE1 )) * 2
You would need to CROSS JOIN onto a table with 2 rows. The following would work in SQL Server.
INSERT INTO TABLE2 (VALUE,VALUE)
SELECT VALUE,VALUE
FROM TABLE1, (SELECT 1 UNION ALL SELECT 2) T(C)
If you have an auxiliary numbers table you could also do
SELECT VALUE,VALUE
FROM TABLE1 JOIN Numbers ON N <=2
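Here is the cross join idea as a runnable sketch, using in-memory SQLite from Python rather than SQL Server. The column names `a` and `b`, the alias `copies`, and the sample rows are placeholders; the two-row derived table is the part that doubles each source row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE table1 (a INTEGER, b TEXT)")
cur.execute("CREATE TABLE table2 (a INTEGER, b TEXT)")
cur.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "x"), (2, "y")])

# Every row of table1 pairs with both rows of the derived table,
# so each source row is inserted twice.
cur.execute(
    """
    INSERT INTO table2 (a, b)
    SELECT t1.a, t1.b
    FROM table1 AS t1
    CROSS JOIN (SELECT 1 AS c UNION ALL SELECT 2) AS copies
    """
)
conn.commit()
inserted = cur.execute("SELECT a, b FROM table2 ORDER BY a").fetchall()
print(inserted)
conn.close()
```

Adding more SELECT branches to the derived table (or joining a numbers table with a higher limit) yields more copies per row.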
--first create a dummy table with 2 records
INSERT INTO TABLE2 (VALUE,VALUE)
SELECT VALUE,VALUE FROM TABLE1, dummytable
This is not an elegant way, but could work easily.
If you have a table with a high enough number of records, you can do the cross join with a TOP clause:
INSERT INTO TABLE2
SELECT VALUE,VALUE FROM TABLE1
CROSS JOIN (SELECT TOP 2 1 AS C FROM TABLE_DUMMY) AS DUMMY
This works for MS SQL Server; to make it work in another DBMS, replace TOP with the equivalent keyword for that DBMS.

An issue possibly related to Cursor/Join

Here is my situation:
Table one contains a set of data that uses an id as a unique identifier. This table has a one-to-many relationship with about 6 other tables.
Given Table 1 with Id of 001:
Table 2 might have 3 rows with foreign key: 001
Table 3 might have 12 rows with foreign key: 001
Table 4 might have 0 rows with foreign key: 001
Table 5 might have 28 rows with foreign key: 001
I need to write a report that lists all of the rows from Table 1 for a specified time frame followed by all of the data contained in the handful of tables that reference it.
My current approach in pseudo code would look like this:
select * from table 1
foreach(result) {
print result;
select * from table 2 where id = result.id;
foreach(result2) {
print result2;
}
select * from table 3 where id = result.id
foreach(result3) {
print result3;
}
//continued for each table
}
This means that a single report can run in the neighborhood of 1000 queries. I know this is excessive; however, my sql-fu is a little weak and I could use some help.
LEFT OUTER JOIN Tables2-N on Table1
SELECT Table1.*, Table2.*, Table3.*, Table4.*, Table5.*
FROM Table1
LEFT OUTER JOIN Table2 ON Table1.ID = Table2.ID
LEFT OUTER JOIN Table3 ON Table1.ID = Table3.ID
LEFT OUTER JOIN Table4 ON Table1.ID = Table4.ID
LEFT OUTER JOIN Table5 ON Table1.ID = Table5.ID
WHERE (CRITERIA)
Join doesn't do it for me. I hate having to de-tangle the data on the client side. All those nulls from left-joining.
Here's a set-based solution that doesn't use Joins.
INSERT INTO #LocalCollection (theKey)
SELECT id
FROM Table1
WHERE ...
SELECT * FROM Table1 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table2 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table3 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table4 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table5 WHERE id in (SELECT theKey FROM #LocalCollection)
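A rough sketch of this key-table approach, written for in-memory SQLite driven from Python instead of SQL Server temp tables. The `created` column, the `detail` columns, the dates, and the sample rows are all invented; only the pattern (collect the keys once, then run one query per table) comes from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Table1 (id INTEGER, created TEXT)")
cur.execute("CREATE TABLE Table2 (id INTEGER, detail TEXT)")
cur.execute("CREATE TABLE Table3 (id INTEGER, detail TEXT)")
cur.executemany("INSERT INTO Table1 VALUES (?, ?)",
                [(1, "2024-01-05"), (2, "2024-02-10"), (3, "2024-01-20")])
cur.executemany("INSERT INTO Table2 VALUES (?, ?)", [(1, "a"), (1, "b"), (2, "c")])
cur.executemany("INSERT INTO Table3 VALUES (?, ?)", [(3, "d"), (2, "e")])

# Collect the keys for the requested time frame once.
cur.execute("CREATE TEMP TABLE LocalCollection (theKey INTEGER PRIMARY KEY)")
cur.execute(
    "INSERT INTO LocalCollection (theKey) "
    "SELECT id FROM Table1 WHERE created BETWEEN ? AND ?",
    ("2024-01-01", "2024-01-31"),
)

# One query per table instead of one query per parent row.
results = {}
for table in ("Table1", "Table2", "Table3"):
    # Table names come from this fixed tuple, never from user input.
    results[table] = cur.execute(
        f"SELECT * FROM {table} "
        "WHERE id IN (SELECT theKey FROM LocalCollection) ORDER BY id"
    ).fetchall()

for table, rows in results.items():
    print(table, rows)
conn.close()
```

The query count is one plus the number of tables, regardless of how many parent rows fall in the time frame.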
Ah! Procedural! My SQL would look like this, if you needed to order the results from the other tables after the results from the first table.
Insert Into #rows Select id from Table1 where date between '12/30' and '12/31'
Select * from Table1 t join #rows r on t.id = r.id
Select * from Table2 t join #rows r on t.id = r.id
--etc
If you wanted to group the results by the initial ID, use a Left Outer Join, as mentioned previously.
You may be best off using a reporting tool like Crystal or Jasper, or even XSL-FO if you are feeling bold. They have features built in to handle specifically this. This is not something that would work well in raw SQL.
If the format of all of the rows (the headers as well as all of the details) is the same, it would also be pretty easy to do it as a stored procedure.
What I would do: Do it as a join, so you will have the header data on every row, then use a reporting tool to do the grouping.
SELECT * FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.resultid -- this could be a left join if the table is not guaranteed to have entries for t1.id
INNER JOIN table3 t3 ON t1.id = t3.resultid -- etc
OR if the data is all in the same format you could do.
SELECT cola,colb FROM table1 WHERE id = @id
UNION ALL
SELECT cola,colb FROM table2 WHERE resultid = @id
UNION ALL
SELECT cola,colb FROM table3 WHERE resultid = @id
It really depends on the format you require the data in for output to the report.
If you can give a sample of how you would like the output I could probably help more.
Join all of the tables together.
select * from table_1 left join table_2 using(id) left join table_3 using(id);
Then, you'll want to roll up the columns in code to format your report as you see fit.
What I would do is open up cursors on the following queries:
SELECT * from table1 order by id
SELECT * from table1 r, table2 t where t.table1_id = r.id order by r.id
SELECT * from table1 r, table3 t where t.table1_id = r.id order by r.id
And then I would walk those cursors in parallel, printing your results. You can do this because all appear in the same order. (Note that I would suggest that while the primary ID for table1 might be named id, it won't have that name in the other tables.)
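One way the parallel walk could look in client code, sketched in Python over in-memory SQLite with a single child table. The `title` and `note` columns and the sample rows are made up. The detail query uses an explicit JOIN and a rowid tiebreaker, both of which are my additions; the technique itself is the lookahead over two cursors sorted by the same key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE table2 (table1_id INTEGER, note TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)",
                 [(1, "first"), (2, "second"), (3, "third")])
conn.executemany("INSERT INTO table2 VALUES (?, ?)",
                 [(1, "a"), (1, "b"), (3, "c")])

# Two independent cursors, both ordered by the parent id.
headers = conn.execute("SELECT id, title FROM table1 ORDER BY id")
details = conn.execute(
    "SELECT t.table1_id, t.note "
    "FROM table1 r JOIN table2 t ON t.table1_id = r.id "
    "ORDER BY r.id, t.rowid"
)

report = []
pending = details.fetchone()  # lookahead row from the detail cursor
for header_id, title in headers:
    report.append(f"{header_id} {title}")
    # Consume every detail row that belongs to the current header.
    while pending is not None and pending[0] == header_id:
        report.append(f"  {pending[1]}")
        pending = details.fetchone()

print("\n".join(report))
conn.close()
```

Each additional child table would get its own cursor and its own lookahead variable, advanced inside the same outer loop.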
Do all the tables have the same format? If not, then you need a report that can display the n different types of rows. If you are only interested in the same columns from each table, then it is easier.
Most databases have some form of dynamic SQL. In that case you can do the following:
create temporary table from
select * from table1 where rows within time frame
x integer
sql varchar(something)
x = 1
while x <= numresults {
sql = 'SELECT * from table' + CAST(x as varchar) + ' where id in (select id from temporary table)'
execute sql
x = x + 1
}
But I mean basically here you are running one query on your main table to get the rows that you need, then running one query for each sub table to get rows that match your main table.
If the report requires the same 2 or 3 columns for each table you could change the select * from tablex to be an insert into and get a single result set at the end...