How to prevent SQL Server from running the same subquery multiple times - sql

I have a query that follows the following structure:
SELECT *
FROM
... <generated code> ...
(SELECT <fields>,
CASE(SELECT TOP 1 ID FROM [Configuration] WHERE IsDefault=1 ORDER BY ID)
WHEN 1 THEN t.FirstName
WHEN 2 THEN t.LastName END As Identifier
FROM <table> t) AS tmp
... <generated code> ...
WHERE <generated filters>
In the query execution plan I see that a Clustered Index Scan on the Configuration table is being executed the same number of times as there are numbers in <table>, however, I know that the result of those scans is always going to be the same, when I replace the
SELECT TOP 1 ID
FROM [Configuration]
WHERE IsDefault = 1
ORDER BY ID
part for the current value of the configuration, this query runs fast.
I'm looking for a way to tell SQL Server that this subquery always has the same result so that it runs fast, the obvious way I see is to declare a temporary variable with the value of that query and use the variable in the main query, the problem is that the beginning and end of the query is generated by the application code and I don't have manual control over that.
The ideal solution for me would be to create a deterministic function that runs that query, and have SQL Server know that since the function is deterministic, and it doesn't depend on the current row, it only needs to run it once, but for some reason it just didn't work and it still ran a bunch of times.
How should I go about optimizing this? Am I misunderstanding deterministic functions? Did I just do it wrong with the function? Is there another way?

As mentioned in the comments, using a CROSS JOIN works, the final query is:
SELECT *
FROM
... <generated code> ...
(SELECT <fields>,
CASE config.ID
WHEN 1 THEN t.FirstName
WHEN 2 THEN t.LastName END As Identifier
FROM <table> t
CROSS JOIN (SELECT TOP 1 ID FROM [Configuration] WHERE IsDefault=1 ORDER BY ID) config) AS tmp
... <generated code> ...
WHERE <generated filters>
As mentioned in the question using a temporary variable would fix the execution plan but is not an option for me because the top part of the query is generated by code from a different component
DECLARE #config INT
SET #config = (SELECT TOP 1 ID FROM [Configuration] WHERE IsDefault=1 ORDER BY ID)
SELECT *
FROM
... <generated code> ...
(SELECT <fields>,
CASE #config
WHEN 1 THEN t.FirstName
WHEN 2 THEN t.LastName END As Identifier
FROM <table> t) AS tmp
... <generated code> ...
WHERE <generated filters>

Related

Ensuring two columns only contain valid results from same subquery

I have the following table:
id symbol_01 symbol_02
1 abc xyz
2 kjh okd
3 que qid
I need a query that ensures symbol_01 and symbol_02 are both contained in a list of valid symbols. In other words I would needs something like this:
select *
from mytable
where symbol_01 in (
select valid_symbols
from somewhere)
and symbol_02 in (
select valid_symbols
from somewhere)
The above example would work correctly, but the subquery used to determine the list of valid symbols is identical both times and is quite large. It would be very innefficient to run it twice like in the example.
Is there a way to do this without duplicating two identical sub queries?
Another approach:
select *
from mytable t1
where 2 = (select count(distinct symbol)
from valid_symbols vs
where vs.symbol in (t1.symbol_01, t1.symbol_02));
This assumes that the valid symbols are stored in a table valid_symbols that has a column named symbol. The query would also benefit from an index on valid_symbols.symbol
You could try use a CTE like;
WITH ValidSymbols AS (
SELECT DISTINCT valid_symbol
FROM somewhere
)
SELECT mt.*
FROM MyTable mt
INNER JOIN ValidSymbols v1
ON mt.symbol_01 = v1.valid_symbol
INNER JOIN ValidSymbols v2
ON mt.symbol_02 = v2.valid_symbol
From a performance perspective, your query is the right way to do this. I would write it as:
select *
from mytable t
where exists (select 1
from valid_symbols vs
where t.symbol_01 = vs.valid_symbol
) and
exists (select 1
from valid_symbols vs
where t.symbol_02 = vs.valid_symbol
) ;
The important component is that you need an index on valid_symbols(valid_symbol). With this index, the lookup should be pretty fast. Appropriate indexes can even work if valid_symbols is a view, although the effect depends on the complexity of the view.
You seem to have a situation where you have two foreign key relationships. If you explicitly declare these relationships, then the database will enforce that the columns in your table match the valid symbols.

SQL Server Empty Result

I have a valid SQL select which returns an empty result, up and until a specific transaction has taken place in the environment.
Is there something available in SQL itself, that will allow me to return a 0 as opposed to an empty dataset? Similar to isNULL('', 0) functionality. Obviously I tried that and it didn't work.
PS. Sadly I don't have access to the database, or the environment, I have an agent installed that is executing these queries so I'm limited to solving this problem with just SQL.
FYI: Take any select and run it where the "condition" is not fulfilled (where LockCookie='777777777' for example.) If that condition is never met, the result is empty. But at some point the query will succeed based on a set of operations/tasks that happen. But I would like to return 0, up until that event has occurred.
You can store your result in a temp table and check ##rowcount.
select ID
into #T
from YourTable
where SomeColumn = #SomeValue
if ##rowcount = 0
select 0 as ID
else
select ID
from #T
drop table #T
If you want this as one query with no temp table you can wrap your query in an outer apply against a dummy table with only one row.
select isnull(T.ID, D.ID) as ID
from (values(0)) as D(ID)
outer apply
(
select ID
from YourTable
where SomeColumn = #SomeValue
) as T
alternet way is from code, you can check count of DataSet.
DsData.Tables[0].Rows.count > 0
make sure that your query matches your conditions

optimising/simplifying cursor sql

i've got the below code, and it operates just fine, only it takes a couple of seconds to calculate the answer - i was wondering whether there is a quicker/neater way of writing this code - and if so, what am i doing wrong?
thanks
select case when
(select LSCCert from matterdatadef where ptmatter=$Matter$) is not null then
(
(select case when
(SELECT top 1 dbo.matterdatadef.ptmatter
From dbo.workinprogress, dbo.MatterDataDef
where ptclient=(
select top 1 dbo.workinprogress.ptclient
from dbo.workinprogress
where dbo.workinprogress.ptmatter = $matter$)
and dbo.matterdatadef.LSCCert=(
select top 1 dbo.matterdatadef.LSCCert
from dbo.matterdatadef
where dbo.matterdatadef.ptmatter = $matter$)
)=ptMatter then (
SELECT isnull((DateAdd(mm, 6, (
select top 1 Date
from OfficeClientLedger
where (pttrans=3)
and ptmatter=$matter$
order by date desc))),
(DateAdd(mm, 3, (
SELECT DateAdd
FROM LAMatter
WHERE ptMatter = $Matter$)))
)
)
end
from lamatter
where ptmatter=$matter$)
)
end
It looks like this your sql was generated from a reporting tool. The problem is you are executing the SELECT top 1 dbo.matterdatadef.ptmatter... query for every row of table lamatter. Further slowing execution, within that query you are recalculating comparison values for both ptclient and LSCCert - values that aren't going to change during execution.
Better to use proper joins and craft the query to execute each part only once by avoiding correlated subqueries (queries that reference values in joined tables and must be executed for every row of that table). Calculated values are OK, as long as they are calculated only once - ie from within the final where clause.
Here is a trivial example to demonstrate a correlated subquery:
Bad sql:
select a, b from table1
where a = (select c from table2 where d = b)
Here the sub-select is run for every row, which will be slow, especially without an index on table2(d)
Better sql:
select a, b from table1, table2
where a = c and d = a
Here the database will scan each table at most once, which will be fast

SQL: Get a random entry iff condition is false

Using Firebird:
I want to select a random entry in the table if the first SQL query returns 0 rows. Is there anyway to combine these two queries?
SELECT * FROM table WHERE cond=1;
SELECT FIRST 1 * FROM table ORDER BY rand();
Im using ExecuteNativeQuery on the java-side which takes basic SQL statements. Sadly, If-Else statements don't work. And if i could make a single query to the database instead of two, that would make my code appear faster.
try this: Not sure but think it will work...
Select FIRST 1 t1.*
FROM table t1
left Join Table t2
On t2.pk = t1.pk
And t2.cond=1
ORDER BY Case When t2.Cond = 1
Then 0 Else rand() End
if(exists(select 1 from table where cond=1))
SELECT * FROM table WHERE cond=1;
else
SELECT FIRST 1 * FROM table ORDER BY rand();
something like this, though I forgot whether the then keyword is needed in if statements in FirebirdSQL databases.

Best way to write a named SQL query that returns if row exists?

So I have this SQL query,
<named-query name="NQ::job_exists">
<query>
select 0 from dual where exists (select * from job_queue);
</query>
</named-query>
Which I plan to use like this:
Query q = em.createNamedQuery("NQ::job_exists");
List<Integer> results = q.getResultList();
boolean exists = !results.isEmpty();
return exists;
I am not very strong in SQL/JPA however, and was wondering whether there is a better way of doing it (or ways to improve it). Should I for example, write (select jq.id from job_queue jq) instead of using a star??
EDIT:This call is very performance critical in our app.
EDIT:Did some performance testing, and while the differences were almost negligible, I finally decided to go with:
select distinct null
from dual
where exists (
select null from job_queue
);
IF you are using EXISTS Oracle I recommend using null:
select null
from dual where exists (select null from job_queue);
The following will be the one with lower cost on Oracle:
select null
from job_queue
where rownum = 1;
Update: To include the case when there are no rows on table you can run the following query:
select count(*)
from (select null
from job_queue
where rownum = 1);
With this query you have a optimum plan and only two possible results: 1 if there are rows and 0 if there are no rows.
If you do an "exists" then it will stop looking as soon as it finds a match. This can stop it from doing a full table scan. Same with TOP 1 if you don't have an ORDER BY. If you do a TOP 1 ID and ID is in an index it might use the index and not even go to the table at all. Stopping the full table scan is where the biggest saving in performance is.
Another small savings is that if you do "SELECT 1" or "SELECT COUNT(1)" instead of "SELECT * " or "SELECT COUNT(*)" it saves the work of getting the table structure.
So I would go with:
SELECT TOP 1 1 AS Found
FROM job_queue
Then:
return !results.isEmpty();
This is the least amount of work that I can think of.
For Oracle it would be:
SELECT 1
FROM job_queue
WHERE rownum<2;
Or:
Set Rowcount 1
SELECT 1
FROM job_queue
Why not just do:
select count(*) as JobCount from job_queue
If JobCount = 0, then there's your answer!