Need to optimize a nested select statement - sql

I've got the following SQL:
SELECT customfieldvalue.ISSUE
FROM customfieldvalue
WHERE customfieldvalue.STRINGVALUE
IN (SELECT customfieldvalue.STRINGVALUE
FROM customfieldvalue
WHERE customfieldvalue.CUSTOMFIELD = "10670"
GROUP BY customfieldvalue.STRINGVALUE
HAVING COUNT(*) > 1);
The inner nested select returns 3265 rows in 1.5secs on MySQL 5.0.77 when run on its own.
The customfieldvalue table contains 2286831 rows.
I want to return all values of the ISSUE column where the STRINGVALUE column value is not exclusive to that row and the CUSTOMFIELD column contains "10670".
When I try to run the query above, MySQL seems to be stuck. I've let it run for up to a minute, but I'm pretty sure the problem is my query.

Try something along these lines:
SELECT cfv1.ISSUE,
COUNT(cfv2.STRINGVALUE) as indicator
FROM customfieldvalue cfv1
INNER JOIN customfieldvalue cfv2
ON cfv1.STRINGVALUE = cfv2.STRINGVALUE AND cfv2.CUSTOMFIELD = "10670"
GROUP BY cfv1.ISSUE
HAVING indicator > 1
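This probably won't work as a straight copy and paste, as I haven't verified it, but in MySQL JOINs are often much faster than subqueries, sometimes by orders of magnitude. That is especially true on older versions such as 5.0, where an IN (subquery) is typically re-evaluated for every outer row.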
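One join-based variant that keeps the meaning of the original query is to join against the grouped subquery as a derived table, so the list of duplicated STRINGVALUEs is computed only once. The sketch below is not the query above. It runs under Python's sqlite3 module on a few made-up rows purely so it can be tried anywhere, and it assumes the CUSTOMFIELD filter should also apply to the outer rows, as the question's description suggests. Timings on MySQL 5.0 will of course differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customfieldvalue (ISSUE INTEGER, CUSTOMFIELD TEXT, STRINGVALUE TEXT);
    -- Made-up sample rows: 'abc' is shared by issues 1 and 2 for field 10670.
    INSERT INTO customfieldvalue VALUES
        (1, '10670', 'abc'),
        (2, '10670', 'abc'),
        (3, '10670', 'xyz'),
        (4, '99999', 'abc');
""")

# Shape of the original query: IN (subquery).
in_sql = """
    SELECT ISSUE
    FROM customfieldvalue
    WHERE CUSTOMFIELD = '10670'
      AND STRINGVALUE IN (SELECT STRINGVALUE
                          FROM customfieldvalue
                          WHERE CUSTOMFIELD = '10670'
                          GROUP BY STRINGVALUE
                          HAVING COUNT(*) > 1)
    ORDER BY ISSUE
"""

# Join shape: the duplicated values become a derived table named dup.
join_sql = """
    SELECT cfv.ISSUE
    FROM customfieldvalue cfv
    INNER JOIN (SELECT STRINGVALUE
                FROM customfieldvalue
                WHERE CUSTOMFIELD = '10670'
                GROUP BY STRINGVALUE
                HAVING COUNT(*) > 1) dup
            ON dup.STRINGVALUE = cfv.STRINGVALUE
    WHERE cfv.CUSTOMFIELD = '10670'
    ORDER BY cfv.ISSUE
"""

in_issues = [row[0] for row in conn.execute(in_sql)]
join_issues = [row[0] for row in conn.execute(join_sql)]
print(in_issues, join_issues)
```

Because the derived table has exactly one row per duplicated STRINGVALUE, the join cannot multiply the outer rows, so both forms return the same issues.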

Related

SQL SRVR 2016: Trouble joining to a nested select statement

I'm working in a query window in SSMS.
Using 3 tables:
WORK_ORDER wo
An order to fabricate a part
OPERATION op
An operation in the fabrication of the part (laser, grinding, plating, etc.)
PART pt
A unique record defining the part
My objective is to report on the status of an operation (say #3) (#total parts ordered, #completed parts), but additionally to include the number of parts that have completed the previous operation (#2) in the sequence and are ready for the process. My solution was to use the LAG function, which works perfectly when the nested select statement below is run independently, but I get an avg of 4X duplication in my results, and my Completed_QTY_PREV_OP column is not displayed. I am aware that's because it's not in the parent select statement, but I wanted to correct the join first. I'm guessing the two problems are related.
Footnote: The WHERE contains a filter that you can ignore. The parent select statement works perfectly without the joined subquery.
Here's my sql:
SELECT op.RESOURCE_ID, pt.USER_5 AS PRODUCT, wo.PART_ID, wo.TYPE, wo.BASE_ID,
wo.LOT_ID, wo.SPLIT_ID, wo.SUB_ID, op.SEQUENCE_NO, pt.DESCRIPTION,
wo.DESIRED_QTY, op.FULFILLED_QTY AS QTY_COMP, op.SERVICE_ID, op.DISPATCHED_QTY, wo.STATUS
FROM dbo.WORK_ORDER wo INNER JOIN
dbo.OPERATION op ON wo.TYPE = op.WORKORDER_TYPE
AND wo.BASE_ID = op.WORKORDER_BASE_ID
AND wo.LOT_ID = op.WORKORDER_LOT_ID
AND wo.SPLIT_ID = op.WORKORDER_SPLIT_ID
AND wo.SUB_ID = op.WORKORDER_SUB_ID INNER JOIN
dbo.PART pt ON wo.PART_ID = pt.ID
LEFT OUTER JOIN
--The nested select statement works by itself in a query window,
--but the JOIN throws an error.
(SELECT
pr.WORKORDER_TYPE, pr.WORKORDER_BASE_ID, pr.WORKORDER_LOT_ID,
pr.WORKORDER_SPLIT_ID, pr.WORKORDER_SUB_ID, pr.SEQUENCE_NO,
LAG (COMPLETED_QTY, 1) OVER (ORDER BY pr.WORKORDER_TYPE, pr.WORKORDER_BASE_ID,
pr.WORKORDER_LOT_ID, pr.WORKORDER_SPLIT_ID, pr.WORKORDER_SUB_ID, pr.SEQUENCE_NO) AS COMP_QTY_PREV_OP
FROM dbo.OPERATION AS pr) AS prev
--End of nested select
ON
op.WORKORDER_TYPE = prev.WORKORDER_TYPE AND
op.WORKORDER_BASE_ID = prev.WORKORDER_BASE_ID AND
op.WORKORDER_LOT_ID = prev.WORKORDER_LOT_ID AND
op.WORKORDER_SPLIT_ID = prev.WORKORDER_SPLIT_ID AND
op.WORKORDER_SUB_ID = prev.WORKORDER_SUB_ID
WHERE (NOT (op.SERVICE_ID IS NULL)) AND (wo.STATUS = N'R')
You haven't given enough information for a definitive answer, so instead I will give you an approach to debugging this.
You are getting unexpected rows as a result of a JOIN. This means that your JOIN condition is not matching the two sides of the JOIN on a one-to-one basis. There are multiple rows in the table being JOINed that meet the JOIN conditions.
To find these rows, temporarily change your SELECT list to SELECT *. Do this both in the outer SELECT, and in the derived table. Look through the columns being returned by the JOINed table, and find the values that you didn't expect to be returned.
Since the JOIN that causes the issue is the last one, they will be all the way to the right of the result of a SELECT *.
Then add more conditions to the JOIN to eliminate the unwanted rows from the results.
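As an illustration of the kind of missing condition to look for: the ON clause in the question matches on the five work order columns but not on SEQUENCE_NO, so every operation row would pair with every row of the derived table for the same work order. The sketch below reproduces that effect with Python's sqlite3 module (SQLite 3.25+ is needed for LAG). The table is a hypothetical, simplified stand-in with a single wo_id key column, and the PARTITION BY is my assumption, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for dbo.OPERATION: one key column instead of five.
    CREATE TABLE operation (wo_id TEXT, sequence_no INTEGER, completed_qty INTEGER);
    INSERT INTO operation VALUES
        ('WO1', 10, 50), ('WO1', 20, 40), ('WO1', 30, 25);
""")

# Derived table: completed quantity of the previous operation in the same work order.
prev = """
    SELECT wo_id, sequence_no,
           LAG(completed_qty, 1) OVER (PARTITION BY wo_id ORDER BY sequence_no)
               AS comp_qty_prev_op
    FROM operation
"""

# Join on the work order only: each operation matches all 3 rows of prev.
loose = conn.execute(f"""
    SELECT op.sequence_no, prev.comp_qty_prev_op
    FROM operation op
    LEFT JOIN ({prev}) AS prev ON op.wo_id = prev.wo_id
""").fetchall()

# Join on work order AND sequence number: one-to-one again.
tight = conn.execute(f"""
    SELECT op.sequence_no, prev.comp_qty_prev_op
    FROM operation op
    LEFT JOIN ({prev}) AS prev
           ON op.wo_id = prev.wo_id AND op.sequence_no = prev.sequence_no
    ORDER BY op.sequence_no
""").fetchall()

print(len(loose))  # 9 rows for 3 operations
print(tight)       # [(10, None), (20, 50), (30, 40)]
```

With an average of about four operations per work order, a join like the loose one would give roughly the 4X duplication described in the question.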
I simplified the whole query by first creating a temp table filled by the previously nested SELECT, and then joining to it from the parent SELECT.
Works perfectly now. Thanks for looking.
PS: I apologize for the confusion about an error message. I noticed after I posted that I had an old comment in the code regarding an error. The error had been resolved before posting, but I neglected to remove the comment.

Issue with joins in a SQL query

SELECT
c.ConfigurationID AS RealflowID, c.companyname,
c.companyphone, c.ContactEmail, COUNT(k.caseid)
FROM
dbo.Configuration c
INNER JOIN
dbo.cases k ON k.SiteID = c.ConfigurationId
WHERE
EXISTS (SELECT * FROM dbo.RepairEstimates
WHERE caseid = k.caseid)
AND c.AccountStatus = 'Active'
AND c.domainid = 46
GROUP BY
c.configurationid,c.companyname, c.companyphone, c.ContactEmail
I have this query. I am using the Configuration table to get the SiteID of the cases in the cases table. If a case exists in the RepairEstimates table, I pull the listed company details and get a count of how many cases are in the RepairEstimates table for that SiteID.
I hope that description is clear enough.
But the issue is that the count does not match the data being pulled. Is there something I could do differently? A different join? Remove the EXISTS and add another join? I am not sure; I have tried many different things.
Realized I was using the wrong table. The query was correct.

Inner query with same source table as Outer Query

I went through some PL/SQL code and found a query that I don't really understand. I'm hoping to get some technical advice here.
The query is shown below:
SELECT a.ROWID
FROM TableA a
WHERE a.object_name IN ('HEADERS','LINES','DELIVERIES')
AND a.change_type IN ('A','C')
AND a.ROWID NOT IN (SELECT MAX (b.ROWID)
FROM TableA b
WHERE b.object_name = a.object_name
AND b.change_type = a.change_type
AND b.pk1 = a.pk1
AND b.object_identifier = a.object_identifier
);
From what I know, the inner query should run first (correct me if I am wrong), and then its result is used by the outer query.
For the above query, how does the inner query run when it needs data from the outer query (data from alias TableA a)?
I hope to get some guidance on this, as I am very new to PL/SQL development.
Thanks!
It is not PL/SQL, just a plain SQL statement.
The purpose seems to be:
retrieve all the rows which are not the "last version" (the biggest ROWID for each combination of object_name, change_type, pk1 and object_identifier).
The "NOT IN" part retrieves the max ROWID for that combination, and the outer query then retrieves all the rows whose ROWID is not that max.
In terms of execution, you can look at the explain plan to see what Oracle is actually going to do.
The inner query does not run first. Conceptually, you can think of it running like this:
Run the outer query,
For each row in the outer query, run the inner query using that row's values for the a.* columns
If that row's ROWID is not in the inner query's result (here, it is not the MAX ROWID of its group), output the outer query row to the result set
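The conceptual steps above can be spelled out as an explicit loop and compared with the single correlated statement. This is only a sketch: it uses Python's sqlite3 module and four made-up rows, and SQLite's integer rowid stands in for Oracle's ROWID, which is a different kind of value. A real optimizer is free to pick a very different plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (object_name TEXT, change_type TEXT,
                         pk1 INTEGER, object_identifier TEXT);
    -- Made-up rows; rowids are assigned 1..4 in insertion order.
    INSERT INTO TableA VALUES
        ('HEADERS', 'A', 1, 'X'),
        ('HEADERS', 'A', 1, 'X'),
        ('LINES',   'C', 2, 'Y'),
        ('HEADERS', 'A', 1, 'X');
""")

# The correlated query from the question, run as one statement.
single = [row[0] for row in conn.execute("""
    SELECT a.ROWID
    FROM TableA a
    WHERE a.object_name IN ('HEADERS','LINES','DELIVERIES')
      AND a.change_type IN ('A','C')
      AND a.ROWID NOT IN (SELECT MAX(b.ROWID)
                          FROM TableA b
                          WHERE b.object_name = a.object_name
                            AND b.change_type = a.change_type
                            AND b.pk1 = a.pk1
                            AND b.object_identifier = a.object_identifier)
    ORDER BY a.ROWID
""")]

# The same logic as an explicit loop: outer query first, inner query per row.
outer_rows = conn.execute("""
    SELECT ROWID, object_name, change_type, pk1, object_identifier
    FROM TableA
    WHERE object_name IN ('HEADERS','LINES','DELIVERIES')
      AND change_type IN ('A','C')
    ORDER BY ROWID
""").fetchall()

looped = []
for rowid, name, ctype, pk1, ident in outer_rows:
    (max_rowid,) = conn.execute("""
        SELECT MAX(ROWID) FROM TableA
        WHERE object_name = ? AND change_type = ?
          AND pk1 = ? AND object_identifier = ?
    """, (name, ctype, pk1, ident)).fetchone()
    if rowid != max_rowid:  # keep only rows that are not the latest of their group
        looped.append(rowid)

print(single, looped)
```

Rows 1 and 2 are older versions of the HEADERS group, row 4 is its latest version and row 3 is the only LINES row, so both approaches return rows 1 and 2.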

Getting way more results than expected in SQL left join query

My code is such:
SELECT COUNT(*)
FROM earned_dollars a
LEFT JOIN product_reference b ON a.product_code = b.product_code
WHERE a.activity_year = '2015'
I'm trying to match two tables based on their product codes. I would expect the same number of results back from this as total records in table a (with a year of 2015). But for some reason I'm getting close to 3 million.
Table a has about 40,000,000 records and table b has 2000. When I run this statement without the join I get 2,500,000 results, so I would expect this even with the left join, but somehow I'm getting 300,000,000. Any ideas? I even referred to the diagram in this post.
It means either your left join is using only part of the foreign key, which causes row multiplication, or there are simply duplicate rows in the joined table.
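A quick way to check the second possibility is to group the joined table by the join column and look for codes that occur more than once. The sketch below shows both the inflated count and the check, using Python's sqlite3 module and made-up rows. Only the table and column names come from the question; the description column is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE earned_dollars (product_code TEXT, activity_year TEXT);
    CREATE TABLE product_reference (product_code TEXT, description TEXT);
    INSERT INTO earned_dollars VALUES
        ('P1', '2015'), ('P1', '2015'), ('P2', '2015'), ('P3', '2014');
    -- 'P1' appears twice in the reference table: this is the duplicate.
    INSERT INTO product_reference VALUES
        ('P1', 'old name'), ('P1', 'new name'), ('P2', 'widget');
""")

# Count without the join.
plain = conn.execute(
    "SELECT COUNT(*) FROM earned_dollars WHERE activity_year = '2015'"
).fetchone()[0]

# Count with the LEFT JOIN: every 'P1' row now matches two reference rows.
joined = conn.execute("""
    SELECT COUNT(*)
    FROM earned_dollars a
    LEFT JOIN product_reference b ON a.product_code = b.product_code
    WHERE a.activity_year = '2015'
""").fetchone()[0]

# Codes that occur more than once in the joined table.
dups = conn.execute("""
    SELECT product_code, COUNT(*)
    FROM product_reference
    GROUP BY product_code
    HAVING COUNT(*) > 1
""").fetchall()

print(plain, joined, dups)  # 3 5 [('P1', 2)]
```

If the duplicate check comes back empty, the other explanation (joining on only part of the key) is the more likely one.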
Use COUNT(DISTINCT a.product_code)
What is the question you are trying to answer with the T-SQL?
Instead of select count(*), try select a.product_code, b.product_code. That will show you which records match and which don't.
You should also add a where b.product_code is not null. That should exclude the records that don't match.
Is b the parent table and a the child table? Try a right join instead.
Or use the table's unique identifier, i.e.
SELECT COUNT(a.earned_dollars_id)
Not sure what your data model looks like or how it is structured, but I'm guessing you only care about earned_dollars?
SELECT COUNT(*)
FROM earned_dollars a
WHERE a.activity_year = '2015'
AND EXISTS (SELECT 1 FROM product_reference b WHERE a.product_code = b.product_code)

SQL- make all rows show a column value if one of the rows has it

I have an SQL statement for a PICK sheet that returns the header/detail records for an order.
One of the fields in the SQL is basically a field to say if there are dangerous goods. If a single product on the order has a code against it, then the report should display that its hazardous.
The problem I am having is that in the SQL results, because I am putting the code on the report in the header section (and not the detail section), it is looking for the code only on the first row.
Is there a way through SQL to basically say "if one of these rows has this code, make all of these rows have this code"? I'm guessing a subselect would work here... the problem is, is that I am using a legacy system built on FoxPro and FoxPro SQL is terrible!
EDIT: just checked and I am running VFP8; subqueries in the SELECT statement were added in VFP9 :(
SELECT Header.HeaderId, Header.HeaderDescription,
Detail.DetailId, Detail.DetailDescription, Detail.Dangerous,
Danger.DangerousItems
FROM Header
INNER JOIN Detail ON Header.HeaderId = Detail.HeaderId
LEFT OUTER JOIN
(SELECT HeaderId, COUNT(*) AS DangerousItems FROM Detail WHERE Dangerous = 1 GROUP BY HeaderId) Danger ON Header.HeaderId = Danger.HeaderId
If Danger.DangerousItems > 0 then something is dangerous. If it is Null then nothing is dangerous.
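To see the effect of that LEFT OUTER JOIN on a small data set, here is a sketch that runs the same query shape through Python's sqlite3 module. The rows are made up, and SQLite is only a stand-in here: VFP8 itself cannot run a derived table like this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Header (HeaderId INTEGER, HeaderDescription TEXT);
    CREATE TABLE Detail (DetailId INTEGER, HeaderId INTEGER,
                         DetailDescription TEXT, Dangerous INTEGER);
    -- Made-up data: order 1 has one dangerous line, order 2 has none.
    INSERT INTO Header VALUES (1, 'Order 1'), (2, 'Order 2');
    INSERT INTO Detail VALUES
        (10, 1, 'Paint', 0), (11, 1, 'Acid', 1), (20, 2, 'Brush', 0);
""")

rows = conn.execute("""
    SELECT Header.HeaderId, Detail.DetailId, Danger.DangerousItems
    FROM Header
    INNER JOIN Detail ON Header.HeaderId = Detail.HeaderId
    LEFT OUTER JOIN (SELECT HeaderId, COUNT(*) AS DangerousItems
                     FROM Detail
                     WHERE Dangerous = 1
                     GROUP BY HeaderId) Danger
                 ON Header.HeaderId = Danger.HeaderId
    ORDER BY Detail.DetailId
""").fetchall()

print(rows)  # [(1, 10, 1), (1, 11, 1), (2, 20, None)]
```

Every row of order 1 carries the count, including the first one, so a report header that reads only the first row still sees the flag.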
If you can't do nested queries, then you should be able to create a view-like object (called a query in VFP8) for the nested select:
SELECT HeaderId, COUNT(*) AS DangerousItems FROM Detail WHERE Dangerous = 1 GROUP BY HeaderId
and then can you left join on that?
In VFP 8 and earlier, your best bet is to use three queries in a row (the trailing semicolons are VFP's line-continuation character):
* 1. All header/detail rows
SELECT Header.HeaderId, Header.HeaderDescription, ;
Detail.DetailId, Detail.DetailDescription, Detail.Dangerous ;
FROM Header ;
INNER JOIN Detail ON Header.HeaderId = Detail.HeaderId ;
INTO CURSOR csrDetail
* 2. Number of dangerous items per header
SELECT HeaderId, COUNT(*) AS DangerousItems ;
FROM Detail ;
WHERE Dangerous ;
GROUP BY HeaderId ;
INTO CURSOR csrDanger
* 3. Attach the count to every detail row of the same header
SELECT csrDetail.*, csrDanger.DangerousItems ;
FROM csrDetail ;
LEFT JOIN csrDanger ON csrDetail.HeaderId = csrDanger.HeaderId ;
INTO CURSOR csrResult