Conditional WHERE clause results in terrible performance in Firebird

Firebird does not seem to know how to execute a conditional WHERE efficiently, or at least that is what I think.
The first query returns values after 15 ms.
SELECT DISTINCT
A.MANID,
A.DISNO,
A.DISID
FROM
TABLEB B
INNER JOIN TABLEA A ON (A.ITEM_ID = B.ITEM_ID)
WHERE
(
(POSITION('%' IN :ISEARCH) = 0 AND B.CATID = :ISEARCH)
)
This second query takes more than 40 seconds, and the only difference is the OR condition.
SELECT DISTINCT
A.MANID,
A.DISNO,
A.DISID
FROM
TABLEB B
INNER JOIN TABLEA A ON (A.ITEM_ID = B.ITEM_ID)
WHERE
(
(POSITION('%' IN :ISEARCH) = 0 AND B.CATID = :ISEARCH) OR
POSITION('%' IN :ISEARCH) <> 0
)
How can I tell Firebird how to behave in this type of situation?

This is a bit far-fetched and I'm not familiar with Firebird, but for this particular case I'd suggest rewriting
(
(POSITION('%' IN :ISEARCH) = 0 AND B.CATID = :ISEARCH) OR
POSITION('%' IN :ISEARCH) <> 0
)
as
(
(POSITION('%' IN :ISEARCH) <> 0 OR B.CATID = :ISEARCH)
)
which might make more sense to the query optimizer?
It's still an OR, and a lot of RDBMSs don't optimize ORs well, but it's worth a try.
Worst case, you could split the statement into two separate queries that you UNION ALL together again, where one handles POSITION('%' IN :ISEARCH) <> 0 and the other handles B.CATID = :ISEARCH. The trouble with that approach might be duplicate entries, which would need filtering out again (in other words, a new can of worms).
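As a rough way to check that these rewrites return the same rows, here is a minimal sketch using Python's built-in sqlite3 module rather than Firebird. The sample rows are made up, and SQLite's instr(:s, '%') stands in for POSITION('%' IN :s), so it says nothing about how Firebird's optimizer will behave. In the split version the first branch keeps the POSITION(...) = 0 test, which makes the two branches mutually exclusive, so the UNION ALL should not introduce duplicates across branches.

```python
import sqlite3

BASE = """
    SELECT DISTINCT a.manid, a.disno, a.disid
    FROM tableb b
    INNER JOIN tablea a ON a.item_id = b.item_id
    WHERE {where}
"""

# instr(:s, '%') is SQLite's stand-in for Firebird's POSITION('%' IN :s).
ORIGINAL = BASE.format(
    where="(instr(:s, '%') = 0 AND b.catid = :s) OR instr(:s, '%') <> 0")
SIMPLIFIED = BASE.format(where="instr(:s, '%') <> 0 OR b.catid = :s")
# The two branches cannot both be true, so UNION ALL adds no cross-branch duplicates.
SPLIT = (BASE.format(where="instr(:s, '%') = 0 AND b.catid = :s")
         + " UNION ALL "
         + BASE.format(where="instr(:s, '%') <> 0"))


def run_variants(isearch):
    """Return the sorted rows of each query variant for one search value."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data, only for illustration.
    conn.executescript("""
        CREATE TABLE tablea (item_id INTEGER, manid INTEGER, disno INTEGER, disid INTEGER);
        CREATE TABLE tableb (item_id INTEGER, catid TEXT);
        INSERT INTO tablea VALUES (1, 10, 100, 1000), (2, 20, 200, 2000), (3, 30, 300, 3000);
        INSERT INTO tableb VALUES (1, 'X'), (2, 'Y'), (3, 'X');
    """)
    results = {}
    for name, sql in (("original", ORIGINAL),
                      ("simplified", SIMPLIFIED),
                      ("split", SPLIT)):
        results[name] = sorted(conn.execute(sql, {"s": isearch}).fetchall())
    conn.close()
    return results


exact = run_variants("X")      # no wildcard: filter on CATID
wildcard = run_variants("%X")  # wildcard present: every row qualifies
```

If the three variants agree on your real data as well, comparing their plans in Firebird would show whether any of them lets the optimizer use an index on CATID.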

Related

SELECT NOT IN with multiple columns in subquery

Regarding the statement below, sltrxid can exist as both ardoccrid and ardocdbid. I want to know how to include both in the NOT IN subquery.
SELECT *
FROM glsltransaction A
INNER JOIN cocustomer B ON A.acctid = B.customerid
WHERE sltrxstate = 4
AND araccttype = 1
AND sltrxid NOT IN(
SELECT ardoccrid,ardocdbid
FROM arapplyitem)
I would recommend not exists:
SELECT *
FROM glsltransaction t
INNER JOIN cocustomer c ON c.customerid = t.acctid
WHERE
??.sltrxstate = 4
AND ??.araccttype = 1
AND NOT EXISTS (
SELECT 1
FROM arapplyitem a
WHERE ??.sltrxid IN (a.ardoccrid, a.ardocdbid)
)
Note that I changed the table aliases to things that are more meaningful. I would strongly recommend prefixing the column names with the table they belong to, so the query is unambiguous - in absence of any indication, I represented this as ?? in the query.
IN sometimes optimizes poorly. There are situations where two subqueries are more efficient:
SELECT *
FROM glsltransaction t
INNER JOIN cocustomer c ON c.customerid = t.acctid
WHERE
??.sltrxstate = 4
AND ??.araccttype = 1
AND NOT EXISTS (
SELECT 1
FROM arapplyitem a
WHERE ??.sltrxid = a.ardoccrid
)
AND NOT EXISTS (
SELECT 1
FROM arapplyitem a
WHERE ??.sltrxid = a.ardocdbid
)
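To see how the two NOT EXISTS forms behave, here is a small sketch using Python's sqlite3 module. The sample rows are invented, and placing sltrxstate on glsltransaction and araccttype on cocustomer is purely an assumption, since the question does not say which table owns them. It also runs a NOT IN over both columns combined, to show how a NULL in either column can make NOT IN return nothing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data; column ownership is assumed.
conn.executescript("""
    CREATE TABLE glsltransaction (sltrxid INTEGER, acctid INTEGER, sltrxstate INTEGER);
    CREATE TABLE cocustomer (customerid INTEGER, araccttype INTEGER);
    CREATE TABLE arapplyitem (ardoccrid INTEGER, ardocdbid INTEGER);
    INSERT INTO glsltransaction VALUES (1, 1, 4), (2, 1, 4), (3, 1, 4), (4, 1, 9);
    INSERT INTO cocustomer VALUES (1, 1);
    INSERT INTO arapplyitem VALUES (1, NULL), (NULL, 2);
""")

HEAD = """
    SELECT t.sltrxid
    FROM glsltransaction t
    INNER JOIN cocustomer c ON c.customerid = t.acctid
    WHERE t.sltrxstate = 4
      AND c.araccttype = 1
      AND
"""

# One NOT EXISTS that checks both columns with IN.
one_subquery = HEAD + """
    NOT EXISTS (SELECT 1 FROM arapplyitem a
                WHERE t.sltrxid IN (a.ardoccrid, a.ardocdbid))
"""

# Two NOT EXISTS subqueries, one per column.
two_subqueries = HEAD + """
    NOT EXISTS (SELECT 1 FROM arapplyitem a WHERE t.sltrxid = a.ardoccrid)
    AND NOT EXISTS (SELECT 1 FROM arapplyitem a WHERE t.sltrxid = a.ardocdbid)
"""

# NOT IN over both columns: the NULLs make the test unknown for unmatched ids.
not_in = HEAD + """
    t.sltrxid NOT IN (SELECT ardoccrid FROM arapplyitem
                      UNION ALL
                      SELECT ardocdbid FROM arapplyitem)
"""


def ids(sql):
    """Run a query and return the sltrxid values in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))


rows_one = ids(one_subquery)
rows_two = ids(two_subqueries)
rows_not_in = ids(not_in)
```

With this data both NOT EXISTS forms keep only transaction 3, while the NOT IN version returns no rows at all because of the NULLs.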

Join SQL Server Showing Duplicate Row

I want to ask something about a join query. I have a query like this:
SELECT b.compilecodingid,
a.subjobfamily,
b.position,
b.nocoding,
( CASE
WHEN (SELECT Count(0)
FROM trlspbia
WHERE learningsystemid = a.learningsystemid
AND compilecodingid = b.compilecodingid
AND moduleid = '2018081616230361362303614'
AND learningroadmap = 'Basic') > 0 THEN 1
ELSE 0
END ) AS CountPickPBIA
FROM trlsplanning a,
trcompilecodingheader b
WHERE a.learningsystemid = b.learningsystemid
AND a.position = b.position
AND a.learningsystemid = '2018081513283162000000001'
order by CountPickPBIA desc
It returns duplicate rows. I know this is because the Position column in table TrLsPlanning has more than one matching row.
Can anyone help me find a solution?
Thank you.
The simplest solution is probably select distinct:
SELECT DISTINCT cch.compilecodingid, p.subjobfamily, cch.position, cch.nocoding,
(CASE WHEN EXISTS (SELECT 1
FROM trlspbia s
WHERE s.learningsystemid = p.learningsystemid AND
s.compilecodingid = cch.compilecodingid AND
s.moduleid = '2018081616230361362303614' AND
s.learningroadmap = 'Basic'
)
THEN 1
ELSE 0
END) AS CountPickPBIA
FROM trlsplanning p JOIN
trcompilecodingheader cch
ON p.learningsystemid = cch.learningsystemid AND
p.position = cch.position
WHERE p.learningsystemid = '2018081513283162000000001'
ORDER BY CountPickPBIA DESC;
SELECT DISTINCT incurs its own overhead. But without more information about the structure and contents of the table, this is the simplest solution.
Note other changes in the query:
Table aliases are abbreviations for table names, rather than being arbitrary letters.
The JOIN syntax is fixed, to use modern, proper, and standard JOIN/ON.
All columns are qualified with the table alias, particularly those in the correlated subqueries.
The subquery uses EXISTS rather than COUNT(*). This is both more efficient and it probably better expresses the logic you want.
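A small sketch of the effect, run against SQLite through Python's sqlite3 module instead of SQL Server. The table contents and the short ids ('LS1', 'M1') are invented for illustration; the only point is that two planning rows with the same learningsystemid and position double the joined row, and that DISTINCT collapses them again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: two planning rows share the same system and position.
conn.executescript("""
    CREATE TABLE trlsplanning (learningsystemid TEXT, position TEXT, subjobfamily TEXT);
    CREATE TABLE trcompilecodingheader (compilecodingid INTEGER, learningsystemid TEXT,
                                        position TEXT, nocoding TEXT);
    CREATE TABLE trlspbia (learningsystemid TEXT, compilecodingid INTEGER,
                           moduleid TEXT, learningroadmap TEXT);
    INSERT INTO trlsplanning VALUES ('LS1', 'Clerk', 'Sales'), ('LS1', 'Clerk', 'Sales');
    INSERT INTO trcompilecodingheader VALUES (1, 'LS1', 'Clerk', 'N-01');
    INSERT INTO trlspbia VALUES ('LS1', 1, 'M1', 'Basic');
""")

QUERY = """
    SELECT {distinct} cch.compilecodingid, p.subjobfamily, cch.position, cch.nocoding,
           (CASE WHEN EXISTS (SELECT 1
                              FROM trlspbia s
                              WHERE s.learningsystemid = p.learningsystemid
                                AND s.compilecodingid = cch.compilecodingid
                                AND s.moduleid = 'M1'
                                AND s.learningroadmap = 'Basic')
                 THEN 1 ELSE 0 END) AS CountPickPBIA
    FROM trlsplanning p
    JOIN trcompilecodingheader cch
      ON p.learningsystemid = cch.learningsystemid
     AND p.position = cch.position
    WHERE p.learningsystemid = 'LS1'
    ORDER BY CountPickPBIA DESC
"""

# Without DISTINCT the header row appears once per matching planning row.
plain_rows = conn.execute(QUERY.format(distinct="")).fetchall()
# With DISTINCT the identical rows collapse into one.
distinct_rows = conn.execute(QUERY.format(distinct="DISTINCT")).fetchall()
```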

How to check on which column to create Index to optimize performance

The query below takes too much time, and I have to optimize its performance. There is no index on any of the tables.
I am now thinking of creating an index to improve performance, but I am not sure which of the filtered columns to index.
My idea is to group by each filtered column and count the number of distinct values, then decide from that which column to index, but I am not sure about this approach.
Select * from ORDER_MART FOL where FOL.PARENT_PROD_SRCID
IN
(
select e.PARENT_PROD_SRCID
from SRC_GRP a
JOIN MAR_GRP b ON a.h_lpgrp_id = b.h_lpgrp_id
JOIN DATA_GRP e ON e.parent_prod_srcid = b.H_LOCPR_ID
WHERE a.CHILD_LOCPR_ID != 0
AND dt_id BETWEEN 20170101 AND 20170731
AND valid_order = 1
AND a.PROD_TP_CODE like 'C%'
)
AND FOL.PROD_SRCID = 0 and IS_CAPS = 1;
Below is my query execution plan:
Select *
from ORDER_MART FOL
INNER JOIN (
select distinct e.PARENT_PROD_SRCID
from SRC_GRP a
JOIN MAR_GRP b ON a.h_lpgrp_id = b.h_lpgrp_id
JOIN DATA_GRP e ON e.parent_prod_srcid = b.H_LOCPR_ID
WHERE a.CHILD_LOCPR_ID != 0 -- remove the lines from INT_CDW_DV.S_LOCAL_PROD_GRP_MAIN with child prod srcid equal to 0
AND dt_id BETWEEN 20170101 AND 20170731
AND valid_order = 1 --and is_caps=1
AND a.PROD_TP_CODE like 'C%'
) sub ON sub.PARENT_PROD_SRCID=FOL.PARENT_PROD_SRCID
where FOL.PROD_SRCID = 0 and IS_CAPS = 1;
What if you use a JOIN instead of IN, as in the query above, and add DISTINCT to reduce the number of rows in the subquery?
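The DISTINCT matters for correctness as well as for row count: a plain join against a subquery that returns the same key several times repeats the outer row, which IN never does. A minimal sketch with Python's sqlite3 module, using cut-down tables (only the key column, plus a hypothetical order_id) and invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, heavily simplified versions of ORDER_MART and DATA_GRP.
conn.executescript("""
    CREATE TABLE order_mart (order_id INTEGER, parent_prod_srcid INTEGER);
    CREATE TABLE data_grp (parent_prod_srcid INTEGER);
    INSERT INTO order_mart VALUES (1, 7), (2, 8);
    INSERT INTO data_grp VALUES (7), (7), (7);
""")


def order_ids(sql):
    """Run a query and return the order_id values in sorted order."""
    return sorted(row[0] for row in conn.execute(sql))


# IN returns each matching order once, however often the key repeats.
with_in = order_ids("""
    SELECT fol.order_id FROM order_mart fol
    WHERE fol.parent_prod_srcid IN (SELECT e.parent_prod_srcid FROM data_grp e)
""")

# A join on the raw subquery repeats the order once per matching key row.
join_plain = order_ids("""
    SELECT fol.order_id FROM order_mart fol
    INNER JOIN (SELECT e.parent_prod_srcid FROM data_grp e) sub
            ON sub.parent_prod_srcid = fol.parent_prod_srcid
""")

# DISTINCT in the derived table restores the same result as IN.
join_distinct = order_ids("""
    SELECT fol.order_id FROM order_mart fol
    INNER JOIN (SELECT DISTINCT e.parent_prod_srcid FROM data_grp e) sub
            ON sub.parent_prod_srcid = fol.parent_prod_srcid
""")
```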

Slowness in update query using inner join

I am using the query below to update one column based on the specified conditions. It uses an inner join, but it takes more than 15 seconds to run even when there are no records to update (0 records).
UPDATE CONFIGURATION_LIST
SET DUPLICATE_SERIAL_NUM = 0
FROM CONFIGURATION_LIST
INNER JOIN (SELECT DISTINCT APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, COUNT(*) AS NB
FROM CONFIGURATION_LIST
WHERE
PLANT = '0067'
AND APPLIED_SERIAL_NUMBER IS NOT NULL
AND APPLIED_SERIAL_NUMBER !=''
AND DUPLICATE_SERIAL_NUM = 1
GROUP BY
APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER
HAVING
COUNT(*) = 1) T2 ON T2.APPLIED_SERIAL_NUMBER = CONFIGURATION_LIST.APPLIED_SERIAL_NUMBER
AND T2.APPLIED_MAT_CODE = CONFIGURATION_LIST.APPLIED_MAT_CODE
WHERE
CONFIGURATION_LIST.PLANT = '0067'
AND DUPLICATE_SERIAL_NUM = 1
There is an index on APPLIED_SERIAL_NUMBER and APPLIED_MAT_CODE, and fragmentation is fine.
Could you please help me with the performance of the above query?
First, you don't need the DISTINCT when using GROUP BY. SQL Server probably ignores it, but it is a bad idea anyway:
UPDATE CONFIGURATION_LIST
SET DUPLICATE_SERIAL_NUM = 0
FROM CONFIGURATION_LIST INNER JOIN
(SELECT APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, COUNT(*) AS NB
FROM CONFIGURATION_LIST cl
WHERE cl.PLANT = '0067' AND
cl.APPLIED_SERIAL_NUMBER IS NOT NULL AND
cl.APPLIED_SERIAL_NUMBER <> '' AND
cl.DUPLICATE_SERIAL_NUM = 1
GROUP BY cl.APPLIED_MAT_CODE, cl.APPLIED_SERIAL_NUMBER
HAVING COUNT(*) = 1
) T2
ON T2.APPLIED_SERIAL_NUMBER = CONFIGURATION_LIST.APPLIED_SERIAL_NUMBER AND
T2.APPLIED_MAT_CODE = CONFIGURATION_LIST.APPLIED_MAT_CODE
WHERE CONFIGURATION_LIST.PLANT = '0067' AND
DUPLICATE_SERIAL_NUM = 1;
For this query, you want the following index: CONFIGURATION_LIST(PLANT, DUPLICATE_SERIAL_NUM, APPLIED_SERIAL_NUMBER, APPLIED_MAT_CODE).
The HAVING COUNT(*) = 1 suggests that you might really want NOT EXISTS (which would normally be faster). But you don't really explain what the query is supposed to be doing, you only say that this code is slow.
It looks like you're checking whether other rows in the same table have the same values and, if there are none, setting the duplicate column to zero. If your table has a unique key (identity field or composite key), you could do something like this:
UPDATE C
SET C.DUPLICATE_SERIAL_NUM = 0
FROM
CONFIGURATION_LIST C
where
not exists (
select
1
FROM
CONFIGURATION_LIST C2
where
C2.APPLIED_SERIAL_NUMBER = C.APPLIED_SERIAL_NUMBER and
C2.APPLIED_MAT_CODE = C.APPLIED_MAT_CODE and
C2.UNIQUE_KEY_HERE != C.UNIQUE_KEY_HERE
) and
C.PLANT = '0067' and
C.DUPLICATE_SERIAL_NUM = 1
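Here is a rough sketch of that NOT EXISTS update, run against SQLite through Python's sqlite3 module rather than SQL Server. The id column is a hypothetical stand-in for UNIQUE_KEY_HERE, the rows are invented, and the statement uses the plain UPDATE ... WHERE form instead of SQL Server's UPDATE ... FROM syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data; id plays the role of UNIQUE_KEY_HERE.
conn.executescript("""
    CREATE TABLE configuration_list (
        id INTEGER PRIMARY KEY,
        plant TEXT,
        applied_mat_code TEXT,
        applied_serial_number TEXT,
        duplicate_serial_num INTEGER
    );
    INSERT INTO configuration_list VALUES
        (1, '0067', 'M1', 'S1', 1),   -- real duplicate of row 2
        (2, '0067', 'M1', 'S1', 1),   -- real duplicate of row 1
        (3, '0067', 'M2', 'S2', 1),   -- flagged, but no other row matches
        (4, '0099', 'M3', 'S3', 1);   -- other plant, must stay untouched
""")

# Clear the flag only where no other row has the same material code and serial number.
cursor = conn.execute("""
    UPDATE configuration_list
    SET duplicate_serial_num = 0
    WHERE plant = '0067'
      AND duplicate_serial_num = 1
      AND NOT EXISTS (
          SELECT 1
          FROM configuration_list c2
          WHERE c2.applied_serial_number = configuration_list.applied_serial_number
            AND c2.applied_mat_code = configuration_list.applied_mat_code
            AND c2.id <> configuration_list.id
      )
""")
updated = cursor.rowcount

flags = dict(conn.execute(
    "SELECT id, duplicate_serial_num FROM configuration_list ORDER BY id"))
```

Note that this compares against every row in the table, whereas the original query only counted rows within plant '0067' that were already flagged, so the two are not guaranteed to update the same rows.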
I would try with a select first:
select APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER, count(*) as n
from CONFIGURATION_LIST cl
where
cl.PLANT='0067' and
cl.APPLIED_SERIAL_NUMBER IS NOT NULL and
cl.APPLIED_SERIAL_NUMBER <> ''
group by APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER;
How many rows do you get with this and how long does it take?
If you remove your DUPLICATE_SERIAL_NUM column from your table it might be very simple. The DUPLICATE_SERIAL_NUM suggests that you are searching for duplicates. As you count your rows you could introduce a simple table that contains the counts:
create table CLCOUNT ( N int unsigned, C int /* or what APPLIED_MAT_CODE is */, S int /* or what APPLIED_SERIAL_NUMBER is */, PLANT char(20) /* or what PLANT is */, unique index (C,S,PLANT), index(PLANT,N));
insert into CLCOUNT select count(*), cl.APPLIED_MAT_CODE, cl.APPLIED_SERIAL_NUMBER, cl.PLANT
from CONFIGURATION_LIST cl
where
cl.PLANT='0067' and
cl.APPLIED_SERIAL_NUMBER IS NOT NULL and
cl.APPLIED_SERIAL_NUMBER <> ''
group by APPLIED_MAT_CODE, APPLIED_SERIAL_NUMBER;
How long does this take?
Now you can simply select * from CLCOUNT where PLANT='0067' and N=1;
This is all far from perfect, but you should be able to analyze your queries (EXPLAIN SELECT ...) and find out why they take so long.

SQL Server: Logical equivalent of ALL query

I have a following query (simplified):
SELECT
Id
FROM
dbo.Entity
WHERE
1 = ALL (
SELECT
CASE
WHEN {Condition} THEN 1
ELSE 0
END
FROM
dbo.Related
INNER JOIN dbo.Entity AS TargetEntity ON
TargetEntity.Id = Related.TargetId
WHERE
Related.SourceId = Entity.Id
)
where {Condition} is a complex dynamic condition on TargetEntity.
In simple terms, this query should return entities for which all related entities match the required condition.
Unfortunately, that does not work quite well, since by SQL standard 1 = ALL evaluates to TRUE when ALL is applied to an empty set. I know I can add AND EXISTS, but that will require me to repeat the whole subquery, which, I am certain, will cause problems for performance.
How should I rewrite the query to achieve the result I need (SQL Server 2008)?
Thanks in advance.
Note: practically speaking, the whole query is highly dynamic, so the perfect solution would be to rewrite only 1 = ALL ( ... ), since changing top-level select can cause problems when additional conditions are added to top-level where.
Couldn't you use a min to achieve this?
EG:
SELECT
Id
FROM
dbo.Entity
WHERE
1 = (
SELECT
MIN(CASE
WHEN {Condition} THEN 1
ELSE 0
END)
FROM
dbo.Related
INNER JOIN dbo.Entity AS TargetEntity ON
TargetEntity.Id = Related.TargetId
WHERE
Related.SourceId = Entity.Id
)
MIN should return NULL if there are no related rows, 1 if every row evaluates to 1, and 0 if any row evaluates to 0, so comparing the result to 1 should only be true when all related rows match.
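A quick way to try the MIN idea is the sketch below, using Python's sqlite3 module. The data is invented, and {Condition} is replaced by a hypothetical te.ok = 1 test on an assumed ok column. SQLite has no ALL quantifier, so the original behaviour is represented by the equivalent NOT EXISTS form, which also accepts entities with no related rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: "ok = 1" plays the role of {Condition}.
conn.executescript("""
    CREATE TABLE entity (id INTEGER PRIMARY KEY, ok INTEGER);
    CREATE TABLE related (sourceid INTEGER, targetid INTEGER);
    INSERT INTO entity VALUES (1, 1), (2, 1), (3, 0), (4, 1), (5, 1);
    -- Entity 1: all targets ok. Entity 2: one target not ok. 3, 4, 5: no targets.
    INSERT INTO related VALUES (1, 2), (1, 4), (2, 3), (2, 4);
""")

# MIN over an empty set is NULL, and 1 = NULL is not true, so empty sets drop out.
min_ids = [row[0] for row in conn.execute("""
    SELECT e.id
    FROM entity e
    WHERE 1 = (
        SELECT MIN(CASE WHEN te.ok = 1 THEN 1 ELSE 0 END)
        FROM related r
        INNER JOIN entity te ON te.id = r.targetid
        WHERE r.sourceid = e.id
    )
    ORDER BY e.id
""")]

# Same logic as 1 = ALL (...): true for empty sets as well.
all_like_ids = [row[0] for row in conn.execute("""
    SELECT e.id
    FROM entity e
    WHERE NOT EXISTS (
        SELECT 1
        FROM related r
        INNER JOIN entity te ON te.id = r.targetid
        WHERE r.sourceid = e.id
          AND CASE WHEN te.ok = 1 THEN 1 ELSE 0 END = 0
    )
    ORDER BY e.id
""")]
```

With this data the MIN version returns only entity 1, while the ALL-style version also returns entities 3, 4 and 5, which have no related rows.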
It can be translated to pick Entities where no related entities with unmatched condition exist.
This can be accomplished by:
SELECT
Id
FROM
dbo.Entity
WHERE
NOT EXISTS (
-- as soon as one element does not match the condition, skip this entity
SELECT TOP 1 1
FROM
dbo.Related
INNER JOIN dbo.Entity AS TargetEntity ON
TargetEntity.Id = Related.TargetId
WHERE
Related.SourceId = Entity.Id AND
CASE
WHEN {Condition} THEN 1
ELSE 0
END = 0
)
EDIT: depending on condition, you can write something like:
WHERE Related.SourceId = Entity.Id AND NOT {Condition} if it doesn't change too much the complexity of the query.
Instead of using all, change your query to compare the result of the subquery directly:
select Id
from dbo.Entity
where 1 = (
select
case
when ... then 1
else 0
end
from ...
where ...
)
Probably this will work: WHERE NOT 0 = ANY(...)
If I read the query correctly, it can be simplified to something like:
SELECT e.Id
FROM dbo.Entity e
INNER JOIN dbo.Related r ON r.SourceId = e.Id
INNER JOIN dbo.Entity te ON te.Id = r.TargetId
WHERE <extra where stuff>
GROUP BY e.Id
HAVING SUM(CASE WHEN {Condition} THEN 1 ELSE 0 END) = COUNT(*)
This says the Condition must be true for all rows. It filters the "empty" set case away with the INNER JOINs.
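The GROUP BY / HAVING version can be tried the same way. This is only a sketch with Python's sqlite3 module: the rows are invented, {Condition} is replaced by a hypothetical te.ok = 1 test, and the extra WHERE conditions are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: "ok = 1" plays the role of {Condition}.
conn.executescript("""
    CREATE TABLE entity (id INTEGER PRIMARY KEY, ok INTEGER);
    CREATE TABLE related (sourceid INTEGER, targetid INTEGER);
    INSERT INTO entity VALUES (1, 1), (2, 1), (3, 0), (4, 1), (5, 1);
    -- Entity 1: all targets ok. Entity 2: one target not ok. 3, 4, 5: no targets.
    INSERT INTO related VALUES (1, 2), (1, 4), (2, 3), (2, 4);
""")

# The inner joins drop entities without related rows; HAVING keeps a group
# only when the number of matching rows equals the number of rows in it.
having_ids = [row[0] for row in conn.execute("""
    SELECT e.id
    FROM entity e
    INNER JOIN related r ON r.sourceid = e.id
    INNER JOIN entity te ON te.id = r.targetid
    GROUP BY e.id
    HAVING SUM(CASE WHEN te.ok = 1 THEN 1 ELSE 0 END) = COUNT(*)
    ORDER BY e.id
""")]
```

One trade-off is that this changes the top-level select, which the question wanted to avoid because the surrounding query is built dynamically.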