the below select statement takes a long in running - sql

This select statement takes a long time running, after my investigation I found that the problem un subquery, stored procedure, please I appreciate your help.
SELECT DISTINCT
COKE_CHQ_NUMBER,
COKE_PAY_SUPPLIER
FROM
apps.Q_COKE_AP_CHECKS_SIGN_STATUS_V
WHERE
plan_id = 40192
AND COKE_SIGNATURE__A = 'YES'
AND COKE_SIGNATURE__B = 'YES'
AND COKE_AUDIT = 'YES'
AND COKE_CHQ_NUMBER NOT IN (SELECT DISTINCT COKE_CHQ_NUMBER_DELIVER
FROM apps.Q_COKE_AP_CHECKS_DELIVERY_ST_V
WHERE UPPER(COKE_CHQ_NUMBER_DELIVER_STATUS) <> 'DELIVERED')
AND COKE_CHQ_NUMBER NOT IN (SELECT COKE_CHQ_NUMBER_DELIVER
FROM apps.Q_COKE_AP_CHECKS_DELIVERY_ST_V)

Well there are a few issues with your SELECT statement that you should address:
First let's look at this condition:
COKE_CHQ_NUMBER NOT IN (SELECT DISTINCT COKE_CHQ_NUMBER_DELIVER
FROM apps.Q_COKE_AP_CHECKS_DELIVERY_ST_V
WHERE UPPER(COKE_CHQ_NUMBER_DELIVER_STATUS) <> 'DELIVERED')
First you select DISTINCT cheque numbers with a not delivered status then you say you don't want this. Rather than saying I don't want non delivered it is much more readable to say I want delivered ones. However this is not really an issue but rather it would make your SELECT easier to read and understand.
Second let's look at your second cheque condition:
COKE_CHQ_NUMBER NOT IN (SELECT COKE_CHQ_NUMBER_DELIVER
FROM apps.Q_COKE_AP_CHECKS_DELIVERY_ST_V)
Here you want to exclude all cheques that have an entry in Q_COKE_AP_CHECKS_DELIVERY_ST_V. This makes your first DISTINCT condition redundant as whatever cheques numbers will bring back would be rejected by this second condition of yours. I do't know if Oracle SQL engine is clever enough to work out this redundancy but this could cause your slowness as SELECT distinct can take longer to run
In addition to this if you don't have them already I would recommend adding the following indexes:
CREATE INDEX index_1 ON q_coke_ap_checks_sign_status_v(coke_chq_number, coke_pay_supplier);
CREATE INDEX index_2 ON q_coke_ap_checks_sign_status_v(plan_id, coke_signature__a, coke_signature__b, coke_audit);
CREATE INDEX index_3 ON q_coke_ap_checks_delivery_st_v(coke_chq_number_deliver);
I called the index_1,2,3 for easy to read obviously not a good naming convention.
With this in place your select should be optimized to retrieve you your data in an acceptable performance. But of course it all depends on the size and the distribution of your data which is hard to control without performing specific data analysis.

looking to you code .. seems you have redundant where condition the second NOT IN implies the firts so you could avoid
you could also transform you NOT IN clause in a MINUS clause .. join the same query with INNER join of you not in subquery
and last be careful you have proper composite index on table
Q_COKE_AP_CHECKS_SIGN_STATUS_V
cols (plan_id,COKE_SIGNATURE__A , COKE_SIGNATURE__B, COKE_AUDIT, COKE_CHQ_NUMBER, COKE_PAY_SUPPLIER)
SELECT DISTINCT
COKE_CHQ_NUMBER,
COKE_PAY_SUPPLIER
FROM
apps.Q_COKE_AP_CHECKS_SIGN_STATUS_V
WHERE
plan_id = 40192
AND COKE_SIGNATURE__A = 'YES'
AND COKE_SIGNATURE__B = 'YES'
AND COKE_AUDIT = 'YES'
MINUS
SELECT DISTINCT
COKE_CHQ_NUMBER,
COKE_PAY_SUPPLIER
FROM apps.Q_COKE_AP_CHECKS_SIGN_STATUS_V
INNER JOIN (
SELECT COKE_CHQ_NUMBER_DELIVER
FROM apps.Q_COKE_AP_CHECKS_DELIVERY_ST_V
) T ON T.COKE_CHQ_NUMBER_DELIVER = apps.Q_COKE_AP_CHECKS_SIGN_STATUS_V
WHERE
plan_id = 40192
AND COKE_SIGNATURE__A = 'YES'
AND COKE_SIGNATURE__B = 'YES'
AND COKE_AUDIT = 'YES'

Related

Select only the row with the max value, but the column with this info is a SUM()

I have the following query:
SELECT DISTINCT
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI,
SUM(VLRNOTA) AS AMOUNT
FROM TGFCAB CAB, TGFPAR PAR, TSIBAI BAI
WHERE CAB.CODPARC = PAR.CODPARC
AND PAR.CODBAI = BAI.CODBAI
AND CAB.TIPMOV = 'V'
AND STATUSNOTA = 'L'
AND PAR.CODCID = 5358
GROUP BY
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI
Which the result is this. Company names and neighborhood hid for obvious reasons
The query at the moment, for those who don't understand Latin languages, is giving me clients, company name, company neighborhood, and the total value of movements.
in the WHERE clause it is only filtering sales movements of companies from an established city.
But if you notice in the Select statement, the column that is retuning the value that aggregates the total amount of value of sales is a SUM().
My goal is to return only the company that have the maximum value of this column, if its a tie, display both of em.
This is where i'm struggling, cause i can't seem to find a simple solution. I tried to use
WHERE AMOUNT = MAX(AMOUNT)
But as expected it didn't work
You tagged the question with the whole bunch of different databases; do you really use all of them?
Because, "PL/SQL" reads as "Oracle". If that's so, here's one option.
with temp as
-- this is your current query
(select columns,
sum(vrlnota) as amount
from ...
where ...
)
-- query that returns what you asked for
select *
from temp t
where t.amount = (select max(a.amount)
from temp a
);
You should be able to achieve the same without the need for a subquery using window over() function,
WITH T AS (
SELECT
CAB.CODPARC,
PAR.RAZAOSOCIAL,
BAI.NOMEBAI,
SUM(VLRNOTA) AS AMOUNT,
MAX(VLRNOTA) over() AS MAMOUNT
FROM TGFCAB CAB
JOIN TGFPAR PAR ON PAR.CODPARC = CAB.CODPARC
JOIN TSIBAI BAI ON BAI.CODBAI = PAR.CODBAI
WHERE CAB.TIPMOV = 'V'
AND STATUSNOTA = 'L'
AND PAR.CODCID = 5358
GROUP BY CAB.CODPARC, PAR.RAZAOSOCIAL, BAI.NOMEBAI
)
SELECT CODPARC, RAZAOSOCIAL, NOMEBAI, AMOUNT
FROM T
WHERE AMOUNT=MAMOUNT
Note it's usually (always) beneficial to join tables using clear explicit join syntax. This should be fine cross-platform between Oracle & SQL Server.

SQL sub query logic

I am trying to calculate values in a column called Peak, but I need to apply different calculations dependant on the 'ChargeCode'.
Below is kind of what I am trying to do, but it results in 3 columns called Peak - Which I know is what I asked for :)
Can anyone help with the correct syntax, so that I end up with one column called Peak?
Use Test
Select Chargecode,
(SELECT 1 Where Chargecode='1') AS [Peak],
(SELECT 1 Where Chargecode='1242') AS [Peak],
Peak*2 AS [Peak],
CallType
from Daisy_March2014
Thanks
You want a case statement. I think this is what you are looking for:
Select Chargecode,
(case when chargecode = '1'
when chargecode = '1242' then 2
else 2 * Peak
end) as Peak,
CallType
from Daisy_March2014;
Thanks Gordon, I have marked you response as Answered. Here is the final working code:
(case when chargecode in ('1') then 1 when chargecode in ('1264') then 2 else Peak*2 end) as Peak,
Since it depends on your charge code, I'm going to make a wild assumption that this might be an ongoing thing where new charge codes / rules could be added. Why not store this as metadata either in the charge code table or in a new table? You could generate the initial data with this:
SELECT ChargeCode,
Multiplier
INTO ChargeMeta
FROM (
Select 1 AS ChargeCode,
1 AS Multiplier
UNION ALL
SELECT 1242 AS ChargeCode,
1 AS Multiplier
UNION ALL
SELECT ChargeCode,
2 AS Multiplier
FROM Daisy_March2014
WHERE ChargeCode NOT IN (1,1242)
) SQ
Then just join to your original data.
SELECT a.ChargeCode,
a.Peak*b.Multiplier AS Peak
FROM Daisy_March2014 a
JOIN ChargeMeta b
ON a.ChargeCode = b.ChargeCode
If you do not want to maintain all charge code multipliers, you could maintain your non-standard ones, and store the standard one in the SQL. This would be about the same as a case statement, but it may still add benefit to store the overrides in a table. At the very least, it makes it easier to re-use elsewhere. No need to check all the queries that deal with Peak values and make them consistent, if ChargeCode 42 needs to have a new multiplier set.
If you want to store the default in the table, you could use two joins instead of one, storing the default charge code under a value that will never be used. (-1?)
SELECT a.ChargeCode,
a.Peak*COALESCE(b.Multiplier,c.Multiplier) AS Peak
FROM Daisy_March2014 a
LEFT JOIN ChargeMeta b ON a.ChargeCode = b.ChargeCode
LEFT JOIN ChargeMeta c ON c.ChargeCode = -1

Query taking too long - Optimization

I am having an issue with the following query returning results a bit too slow and I suspect I am missing something basic. My initial guess is the 'CASE' statement is taking too long to process its result on the underlying data. But it could be something in the derived tables as well.
The question is, how can I speed this up? Are there any glaring errors in the way I am pulling the data? Am I running into a sorting or looping issues somewhere? The query runs for about 40 seconds, which seems quite long. C# is my primary expertise, SQL is a work in progress.
Note I am not asking "write my code" or "fix my code". Just for a pointer in the right direction, I can't seem to figure out where the slow down occurs. Each derived table runs very quickly (less than a second) by themselves, the joins seem correct and the result set is returning exactly what I need. It's just too slow and I'm sure there are better SQL scripter's out there ;) Any tips would be greatly appreciated!
SELECT
hdr.taker
, hdr.order_no
, hdr.po_no as display_po
, cust.customer_name
, hdr.customer_id
, 'INCORRECT-LARGE ORDER' + CASE
WHEN (ext_price_calc >= 600.01 and ext_price_calc <= 800) and fee_price.unit_price <> round(ext_price_calc * -.01,2)
THEN '-1%: $' + cast(cast(ext_price_calc * -.01 as decimal(18,2)) as varchar(255))
WHEN ext_price_calc >= 800.01 and ext_price_calc <= 1000 and fee_price.unit_price <> round(ext_price_calc * -.02,2)
THEN '-2%: $' + cast(cast(ext_price_calc * -.02 as decimal(18,2)) as varchar(255))
WHEN ext_price_calc > 1000 and fee_price.unit_price <> round(ext_price_calc * -.03,2)
THEN '-3%: $' + cast(cast(ext_price_calc * -.03 as decimal(18,2)) as varchar(255))
ELSE
'OK'
END AS Status
FROM
(myDb_view_oe_hdr hdr
LEFT OUTER JOIN myDb_view_customer cust
ON hdr.customer_id = cust.customer_id)
LEFT OUTER JOIN wpd_view_sales_territory_by_customer territory
ON cust.customer_id = territory.customer_id
LEFT OUTER JOIN
(select
order_no,
SUM(ext_price_calc) as ext_price_calc
from
(select
hdr.order_no,
line.item_id,
(line.qty_ordered - isnull(qty_canceled,0)) * unit_price as ext_price_calc
from myDb_view_oe_hdr hdr
left outer join myDb_view_oe_line line
on hdr.order_no = line.order_no
where
line.delete_flag = 'N'
AND line.cancel_flag = 'N'
AND hdr.projected_order = 'N'
AND hdr.delete_flag = 'N'
AND hdr.cancel_flag = 'N'
AND line.item_id not in ('LARGE-ORDER-1%','LARGE-ORDER-2%', 'LARGE-ORDER-3%', 'FUEL','NET-FUEL', 'CONVENIENCE-FEE')) as line
group by order_no) as order_total
on hdr.order_no = order_total.order_no
LEFT OUTER JOIN
(select
order_no,
count(order_no) as convenience_count
from oe_line with (nolock)
left outer join inv_mast inv with (nolock)
on oe_line.inv_mast_uid = inv.inv_mast_uid
where inv.item_id in ('LARGE-ORDER-1%','LARGE-ORDER-2%', 'LARGE-ORDER-3%')
and oe_line.delete_flag <> 'Y'
group by order_no) as fee_count
on hdr.order_no = fee_count.order_no
INNER JOIN
(select
order_no,
unit_price
from oe_line line with (nolock)
where line.inv_mast_uid in (select inv_mast_uid from inv_mast with (nolock) where item_id in ('LARGE-ORDER-1%','LARGE-ORDER-2%', 'LARGE-ORDER-3%'))) as fee_price
ON fee_count.order_no = fee_price.order_no
WHERE
hdr.projected_order = 'N'
AND hdr.cancel_flag = 'N'
AND hdr.delete_flag = 'N'
AND hdr.completed = 'N'
AND territory.territory_id = ‘CUSTOMERTERRITORY’
AND ext_price_calc > 600.00
AND hdr.carrier_id <> '100004'
AND fee_count.convenience_count is not null
AND CASE
WHEN (ext_price_calc >= 600.01 and ext_price_calc <= 800) and fee_price.unit_price <> round(ext_price_calc * -.01,2)
THEN '-1%: $' + cast(cast(ext_price_calc * -.01 as decimal(18,2)) as varchar(255))
WHEN ext_price_calc >= 800.01 and ext_price_calc <= 1000 and fee_price.unit_price <> round(ext_price_calc * -.02,2)
THEN '-2%: $' + cast(cast(ext_price_calc * -.02 as decimal(18,2)) as varchar(255))
WHEN ext_price_calc > 1000 and fee_price.unit_price <> round(ext_price_calc * -.03,2)
THEN '-3%: $' + cast(cast(ext_price_calc * -.03 as decimal(18,2)) as varchar(255))
ELSE
'OK' END <> 'OK'
Just as a clue to the right direction for optimization:
When you do an OUTER JOIN to a query with calculated columns, you are guaranteeing not only a full table scan, but that those calculations must be performed against every row in the joined table. It appears that you can actually do your join to oe_line without the column calculations (i.e. by filtering ext_price_calc to a specific range).
You don't need to do most of the subqueries that are in your query--the master query can be recrafted to use regular table join syntax. Joins to subqueries containing subqueries presents a challenge to the SQL optimizer that it may not be able to meet. But by using regular joins, the optimizer has a much better chance at identifying more efficient query strategies.
You don't tag which SQL engine you're using. Every database has proprietary extensions that may allow for speedier or more efficient queries. It would be easier to provide useful feedback if you indicated whether you were using MySQL, SQL Server, Oracle, etc.
Regardless of the database you're using, reviewing the query plan is always a good place to start. This will tell you where most of the I/O and time in your query is being spent.
Just on general principle, make sure your statistics are up-to-date.
It's may not be solvable by any of us without the real stuff to test with.
IF that's the case and nobody else posts the answer, I can still help. Here is how to trouble shoot it.
(1) take joins and pieces out one by one.
(2) this will cause errors. Remove or fake the references to get rid of them.
(3) see how that works.
(4) Put items back before you try taking something else out
(5) keep track...
(6) also be aware where a removal of something might drastically reduce the result set.
You might find you're missing an index or some other smoking gun.
I was having the same problem and I was able to solve it by indexing one of the tables and setting a primary key.
I strongly suspect that the problem lies in the number of joins you're doing. A lot of databases do joins basically by systemically checking all possible combinations of the various tables as being valid - so if you're joinging table A and B on column C, and A looks like:
Name:C
Fred:1
Alice:2
Betty:3
While B looks like:
C:Pet
1:Alligator
2:Lion
3:T-Rex
When you do the join, it checks all 9 possibilities:
Fred:1:1:Alligator
Fred:1:2:Lion
Fred:1:3:T-Rex
Alice:2:1:Alligator
Alice:2:2:Lion
Alice:2:3:T-Rex
Betty:3:1:Alligator
Betty:3:2:Lion
Betty:3:3:T-Rex
And goes through and deletes the non-matching ones:
Fred:1:1:Alligator
Alice:2:2:Lion
Betty:3:3:T-Rex
... which means with three entries in each table, it creates nine temporary records, sorts through them all, and deletes six of them ... all before it actually sorts through the results for what you're after (so if you are looking for Betty's Pet, you only want one row on that final result).
... and you're doing how many joins and sub-queries?

multiple count(distinct)

I get an error unless I remove one of the count(distinct ...). Can someone tell me why and how to fix it?
I'm in vfp. iif([condition],[if true],[else]) is equivalent to case when
SELECT * FROM dpgift where !nocalc AND rectype = "G" AND sol = "EM112" INTO CURSOR cGift
SELECT
list_code,
count(distinct iif(language != 'F' AND renew = '0' AND type = 'IN',donor,0)) as d_Count_E_New_Indiv,
count(distinct iif(language = 'F' AND renew = '0' AND type = 'IN',donor,0)) as d_Count_F_New_Indiv /*it works if i remove this*/
FROM cGift gift
LEFT JOIN
(select didnumb, language, type from dp) d
on cast(gift.donor as i) = cast(d.didnumb as i)
GROUP BY list_code
ORDER by list_code
edit:
apparently, you can't use multiple distinct commands on the same level. Any way around this?
VFP does NOT support two "DISTINCT" clauses in the same query... PERIOD... I've even tested on a simple table of my own, DIRECTLY from within VFP such as
select count( distinct Col1 ) as Cnt1, count( distinct col2 ) as Cnt2 from MyTable
causes a crash. I don't know why you are trying to do DISTINCT as you are just testing a condition... I more accurately appears you just want a COUNT of entries per each category of criteria instead of actually DISTINCT
Because you are not "alias.field" referencing your columns in your query, I don't know which column is the basis of what. However, to help handle your DISTINCT, and it appears you are running from WITHIN a VFP app as you are using the "INTO CURSOR" clause (which would not be associated with any OleDB .net development), I would pre-query and group those criteria, something like...
select list_code,
donor,
max( iif( language != 'F' and renew = '0' and type = 'IN', 1, 0 )) as EQualified,
max( iif( language = 'F' and renew = '0' and type = 'IN', 1, 0 )) as FQualified
from
list_code
group by
list_code,
donor
into
cursor cGroupedByDonor
so the above will ONLY get a count of 1 per donor per list code, no matter how many records that qualify. In addition, if one record as an "F" and another does NOT, then you'll have a value of 1 in EACH of the columns... Then you can do something like...
select
list_code,
sum( EQualified ) as DistEQualified,
sum( FQualified ) as DistFQualified
from
cGroupedByDonor
group by
list_code
into
cursor cDistinctByListCode
then run from that...
You can try using either another derived table or two to do the calculations you need, or using projections (queries in the field list). Without seeing the schema, it's hard to know which one will work for you.

MySQL to PostgreSQL: GROUP BY issues

So I decided to try out PostgreSQL instead of MySQL but I am having some slight conversion problems. This was a query of mine that samples data from four tables and spit them out all in on result.
I am at a loss of how to convey this in PostgreSQL and specifically in Django but I am leaving that for another quesiton so bonus points if you can Django-fy it but no worries if you just pure SQL it.
SELECT links.id, links.created, links.url, links.title, user.username, category.title, SUM(votes.karma_delta) AS karma, SUM(IF(votes.user_id = 1, votes.karma_delta, 0)) AS user_vote
FROM links
LEFT OUTER JOIN `users` `user` ON (`links`.`user_id`=`user`.`id`)
LEFT OUTER JOIN `categories` `category` ON (`links`.`category_id`=`category`.`id`)
LEFT OUTER JOIN `votes` `votes` ON (`votes`.`link_id`=`links`.`id`)
WHERE (links.id = votes.link_id)
GROUP BY votes.link_id
ORDER BY (SUM(votes.karma_delta) - 1) / POW((TIMESTAMPDIFF(HOUR, links.created, NOW()) + 2), 1.5) DESC
LIMIT 20
The IF in the select was where my first troubles began. Seems it's an IF true/false THEN stuff ELSE other stuff END IF yet I can't get the syntax right. I tried to use Navicat's SQL builder but it constantly wanted me to place everything I had selected into the GROUP BY and that I think it all kinds of wrong.
What I am looking for in summary is to make this MySQL query work in PostreSQL. Thank you.
Current Progress
Just want to thank everybody for their help. This is what I have so far:
SELECT links_link.id, links_link.created, links_link.url, links_link.title, links_category.title, SUM(links_vote.karma_delta) AS karma, SUM(CASE WHEN links_vote.user_id = 1 THEN links_vote.karma_delta ELSE 0 END) AS user_vote
FROM links_link
LEFT OUTER JOIN auth_user ON (links_link.user_id = auth_user.id)
LEFT OUTER JOIN links_category ON (links_link.category_id = links_category.id)
LEFT OUTER JOIN links_vote ON (links_vote.link_id = links_link.id)
WHERE (links_link.id = links_vote.link_id)
GROUP BY links_link.id, links_link.created, links_link.url, links_link.title, links_category.title
ORDER BY links_link.created DESC
LIMIT 20
I had to make some table name changes and I am still working on my ORDER BY so till then we're just gonna cop out. Thanks again!
Have a look at this link GROUP BY
When GROUP BY is present, it is not
valid for the SELECT list expressions
to refer to ungrouped columns except
within aggregate functions, since
there would be more than one possible
value to return for an ungrouped
column.
You need to include all the select columns in the group by that are not part of the aggregate functions.
A few things:
Drop the backticks
Use a CASE statement instead of IF() CASE WHEN votes.use_id = 1 THEN votes.karma_delta ELSE 0 END
Change your timestampdiff to DATE_TRUNC('hour', now()) - DATE_TRUNC('hour', links.created) (you will need to then count the number of hours in the resulting interval. It would be much easier to compare timestamps)
Fix your GROUP BY and ORDER BY
Try to replace the IF with a case;
SUM(CASE WHEN votes.user_id = 1 THEN votes.karma_delta ELSE 0 END)
You also have to explicitly name every column or calculated column you use in the GROUP BY clause.