SQL multiple SELECT too slow (7 min)

SQL multiple SELECT too slow (7 min) - sql

This source is good but too slow.
Function:
Selecting all rows if SC and %%5 and 2013.07.11 < date < 2013.07.18
and
some older lines represent lines
Method:
Finding X count rows.
one by one to see whether there is consistency 28 days
select efi_name, efi_id, count(*) as dupes, id, mlap_date
from address m
where
mlap_date > "2013.07.11"
and mlap_date < "2013.07.18"
and mlap_type = "SC"
and calendar_id not like "%%5"
and concat(efi_id,irsz,ucase(city), ucase(address)) in (
select concat(k.efi_id,k.irsz,ucase(k.city), ucase(k.address)) as dupe
from address k
where k.mlap_date > adddate(m.`mlap_date`,-28)
and k.mlap_date < m.mlap_date
and k.mlap_type = "SC"
and k.calendar_id not like "%%5"
and k.status = 'Befejezett'
group by concat(k.efi_id,k.irsz,ucase(k.city), ucase(k.address))
having (count(*) > 1)
)
group by concat(efi_id,irsz,ucase(city), ucase(address))
Thanks for helping!

NOT LIKE plus wildcard-prefixed terms are index-usage killers.
You could also try replacing the IN + inline table with an inner join: does the optimizer run the NOT LIKE query twice (see your explain plan)?
It looks like you might be using MySql, in which case you could build a hash column based on
efi_id
irsz
ucase(city)
ucase(address))
and compare that column directly. This is a way of implementing a hash join in MySql.

I don't think you need a subquery to do this. You should be able to do it just with the outer group by and conditional aggregations.
select efi_name, efi_id,
sum(case when mlap_date > "2013.07.11" and mlap_date < "2013.07.18" then 1 else 0 end) as dupes,
id, mlap_date
from address m
where mlap_type = 'SC' and calendar_id not like '%%5'
group by efi_id,irsz, ucase(city), ucase(address)
having sum(case when m.status = 'Befejezett' and
m.mlap_date <= '2013.07.11' and
k.mlap_date > adddate(date('2013.07.11'), -28)
then 1
else 0
end) > 1
This produces a slightly different result from your query. Instead of looking at the 28 days before each record, it looks at all records in the week period and then at the four weeks before that period. Despite this subtle difference, it is still identifying dupes in the four-week period before the one-week period.

Related

SQL Query to Find if (Count of date > X) = 0 For Group of ID

I apologize if the title is not be correct as I'm not sure what I need to ask for, since I don't know how to build the query.
I have the following query built to return a list of chemicals and other related fields.
SELECT DISTINCT
RDB.Chemical_Record.[Chemical_ID],
RDB.Chemical_Record.[Expires_Date],
RDB.Assay_Group.[Assay_Group_Name] AS [Assay Group],
RDB.Chemical.[Chemical_Name],
RDB.Chemical.[Product_Number],
RDB.Chemical_Record.[Lot_Number],
RDB.Storage_Location.[Location_Name]
FROM RDB.Chemical_Record
LEFT JOIN RDB.Chemical ON Chemical_Record.[Chemical_ID] = Chemical.[ID_Chemical]
LEFT JOIN RDB.Storage_Location ON Storage_Location.[ID_Storage_Location] = Chemical_Record.[Storage_Location_ID]
LEFT JOIN RDB.Chemical_To_AGroup ON Chemical_To_AGroup.[Chemical_ID] = Chemical_Record.[Chemical_ID]
LEFT JOIN RDB.Assay_Group ON Assay_Group.[ID_Assay_Group] = Chemical_To_AGroup.[Assay_Group_ID]
WHERE RDB.Chemical_Record.[Expires_Date] >= DATEADD(day,-60, GETDATE())
ORDER BY RDB.Chemical_Record.[Chemical_ID], RDB.Chemical_Record.[Expires_Date], RDB.Assay_Group.[Assay_Group_Name]
I am using this query in a VB.Net application where it exports the results to an Excel worksheet and then performs additional actions to delete the rows I don't need. The process to query is quick, but working with Excel from .Net is painful and slow.
Instead I'd like to build the query to return the exact results I want, which I think is possible, I just can't figure out how. I have tried using a combination of Count, Group and Having, but since I've never worked with those I can't get them to work for me.
Example:
SELECT
COUNT(RDB.Chemical_Record.[Chemical_ID]) Count_ID,
RDB.Chemical_Record.[Chemical_ID],
RDB.Chemical_Record.[Expires_Date]
FROM RDB.Chemical_Record
WHERE RDB.Chemical_Record.[Expires_Date] > DATEADD(day,30,GETDATE())
GROUP BY RDB.Chemical_Record.[Chemical_ID], RDB.Chemical_Record.[Expires_Date]
ORDER BY RDB.Chemical_Record.[Chemical_ID]
As you can see from this example, it doesn't return the count of ID's where Expiration Date > DATEADD(day,30,GETDATE()) nor does it return the ID's that I actually wanted.
What I need to return is all chemicals (ID) that DO NOT have an expiration date > Today + 30 for that specific ID. The screenshot below shows an example of the data that gets pulled. The yellow highlighted rows are the only two in that set that should get returned as there are no other chemicals of those two ID's with an expiration date > Today + 30. All the other ID's should not show up since they DO have ID's of COUNT(Expiration Date > Today + 30) > 0.
If someone could help me build the query using the appropriate Aggregate functions, it would be MUCH appreciated.

What I need to return is all chemicals (ID) that DO NOT have an expiration date > Today + 30 for that specific ID.
For this question, you can use a HAVING clause. No WHERE is needed:
SELECT COUNT(*) as Count_ID, cr.[Chemical_ID]
FROM RDB.Chemical_Record cr
GROUP BY cr.[Chemical_ID]
HAVING MAX(cr.Expires_Date) <= DATEADD(day, 30, GETDATE())
ORDER BY cr.[Chemical_ID]

Using the HAVING MAX solved my problem and I was then able to work out exactly what I needed. I had to do some more research to figure out how to bring all my columns back, but that wasn't as difficult.
Here is my final solution:
WITH CHEM AS (
SELECT RDB.Chemical_Record.[Chemical_ID]
FROM RDB.Chemical_Record
GROUP BY RDB.Chemical_Record.[Chemical_ID]
HAVING MAX(RDB.Chemical_Record.Expires_Date) <= DATEADD(day, 60, GETDATE())
)
SELECT DISTINCT
RDB.Chemical_Record.[Chemical_ID],
RDB.Chemical_Record.[Expires_Date],
RDB.Assay_Group.[Assay_Group_Name] AS [Assay Group],
RDB.Chemical.[Chemical_Name],
RDB.Chemical.[Product_Number],
RDB.Chemical_Record.[Lot_Number],
RDB.Storage_Location.[Location_Name]
FROM RDB.Chemical_Record
INNER JOIN CHEM ON CHEM.Chemical_ID = RDB.Chemical_Record.Chemical_ID
LEFT JOIN RDB.Chemical ON Chemical_Record.[Chemical_ID] = Chemical.[ID_Chemical]
LEFT JOIN RDB.Storage_Location ON Storage_Location.[ID_Storage_Location] = Chemical_Record.[Storage_Location_ID]
LEFT JOIN RDB.Chemical_To_AGroup ON Chemical_To_AGroup.[Chemical_ID] = Chemical_Record.[Chemical_ID]
LEFT JOIN RDB.Assay_Group ON Assay_Group.[ID_Assay_Group] = Chemical_To_AGroup.[Assay_Group_ID]
WHERE Expires_Date >= DATEADD(day, -60, GETDATE())
ORDER BY RDB.Chemical_Record.[Chemical_ID], RDB.Chemical_Record.Expires_Date
And a screenshot showing the resulting search:

Is there a way I can Query Missing numbers in a table?

I work for a Logistics Company and we have to have a 7 digit Pro Number on each piece of freight that is in a pre-determined order. So we know there is gaps in the numbers, but is there any way I can Query the system and find out what ones are missing?
So show me all the numbers from 1000000 to 2000000 that do not exist in column name trace_number.
So as you can see below the sequence goes 1024397, 1024398, then 1051152 so I know there is a substantial gap of 26k pro numbers, but is there anyway to just query the gaps?
Select t.trace_number,
integer(trace_number) as number,
ISNUMERIC(trace_number) as check
from trace as t
left join tlorder as tl on t.detail_number = tl.detail_line_id
where left(t.trace_number,1) in ('0','1','2','3','4','5','6','7','8','9')
and date(pick_up_by) >= current_date - 1 years
and length(t.trace_number) = 7
and t.trace_type = '2'
and site_id in ('SITE5','SITE9','SITE10')
and ISNUMERIC(trace_number) = 'True'
order by 2
fetch first 10000 rows only

I'm not sure what your query has to do with the question, but you can identify gaps using lag()/lead(). The idea is:
select (trace_number + 1) as start_gap,
(next_tn - 1) as end_gap
from (select t.*,
lead(trace_number) order by (trace_number) as next_tn
from t
) t
where next_tn <> trace_number + 1;
This does not find them within a range. It just finds all gaps.

try Something like this (adapt the where condition, put into clause "on") :
with Range (nb) as (
values 1000000
union all
select nb+1 from Range
where nb<=2000000
)
select *
from range f1 left outer join trace f2
on f2.trace_number=f1.nb
and f2.trace_number between 1000000 and 2000000
where f2.trace_number is null

SQL Query - combine 2 rows into 1 row

I have the following query below (view) in SQL Server. The query produces a result set that is needed to populate a grid. However, a new requirement has come up where the users would like to see data on one row in our app. The tblTasks table can produce 1 or 2 rows. The issue becomes when they're is two rows that have the same job_number but different fldProjectContextId (1 or 31). I need to get the MechApprovalOut and ElecApprovalOut columns on one row instead of two.
I've tried restructuring the query using CTE and over partition and haven't been able to get the necessary results I need.
SELECT TOP (100) PERCENT
CAST(dbo.Job_Control.job_number AS int) AS Job_Number,
dbo.tblTasks.fldSalesOrder, dbo.tblTaskCategories.fldTaskCategoryName,
dbo.Job_Control.Dwg_Sent, dbo.Job_Control.Approval_done,
dbo.Job_Control.fldElecDwgSent, dbo.Job_Control.fldElecApprovalDone,
CASE WHEN DATEDIFF(day, dbo.Job_Control.Dwg_Sent, GETDATE()) > 14
AND dbo.Job_Control.Approval_done IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 1
THEN 1 ELSE 0
END AS MechApprovalOut,
CASE WHEN DATEDIFF(day, dbo.Job_Control.fldElecDwgSent, GETDATE()) > 14
AND dbo.Job_Control.fldElecApprovalDone IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 31
THEN 1 ELSE 0
END AS ElecApprovalOut,
dbo.tblProjectContext.fldProjectContextName,
dbo.tblProjectContext.fldProjectContextId, dbo.Job_Control.Drawing_Info,
dbo.Job_Control.fldElectricalAppDwg
FROM dbo.tblTaskCategories
INNER JOIN dbo.tblTasks
ON dbo.tblTaskCategories.fldTaskCategoryId = dbo.tblTasks.fldTaskCategoryId
INNER JOIN dbo.Job_Control
ON dbo.tblTasks.fldSalesOrder = dbo.Job_Control.job_number
INNER JOIN dbo.tblProjectContext
ON dbo.tblTaskCategories.fldProjectContextId = dbo.tblProjectContext.fldProjectContextId
WHERE (dbo.tblTaskCategories.fldTaskCategoryName = N'Approval'
OR dbo.tblTaskCategories.fldTaskCategoryName = N'Re-Approval')
AND (CASE WHEN DATEDIFF(day, dbo.Job_Control.Dwg_Sent, GETDATE()) > 14
AND dbo.Job_Control.Approval_done IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 1
THEN 1 ELSE 0
END = 1)
OR (dbo.tblTaskCategories.fldTaskCategoryName = N'Approval'
OR dbo.tblTaskCategories.fldTaskCategoryName = N'Re-Approval')
AND (CASE WHEN DATEDIFF(day, dbo.Job_Control.fldElecDwgSent, GETDATE()) > 14
AND dbo.Job_Control.fldElecApprovalDone IS NULL
AND dbo.tblProjectContext.fldProjectContextID = 31
THEN 1 ELSE 0
END = 1)
ORDER BY dbo.Job_Control.job_number, dbo.tblTaskCategories.fldProjectContextId
The above query gives me the following result set:
I've created a work around via code (which I don't like but it works for now) where i've used code to populate a "temp" table the way i need it to display the data, that is, one record if duplicate job numbers to get the MechApprovalOut and ElecApprovalOut columns on one row (see first record in following screen shot).
Example:
With the desired result set and one row per job_number, this is how the form looks with the data and how I am using the result set.
Any help restructuring my query to combine duplicate rows with the same job number where MechApprovalOut and ElecApproval out columns are on one row is greatly appreciated! I'd much prefer to use a view on SQL then code in the app to populate a temp table.
Thanks,
Jimmy

What I would do is LEFT JOIN the main table to itself at the beginning of the query, matching on Job Number and Sales Order, such that the left side of the join is only looking at Approval task categories and the right side of the join is only looking at Re-Approval task categories. Then I would make extensive use of the COALESCE() function to select data from the correct side of the join for use later on and in the select clause. This may also be the piece you were missing to make a CTE work.
There is probably also a solution that uses a ranking/windowing function (maybe not RANK itself, but something that category) along with the PARTITION BY clause. However, as those are fairly new to Sql Server I haven't used them enough personally to be comfortable writing an example solution for you without direct access to the data to play with, and it would still take me a little more time to get right than I can devote to this right now. Maybe this paragraph will motivate someone else to do that work.

Performance issue in retrieving multiple output of the same column with different values from a single table

is there anyway to get the following results from query without joining the same table three times (or) without reading the same "wordlocation" table three times (or more if there are more words)? If there are three or more words, it takes about over a minute for the results to be returned.
Currently "wordlocation" table has three rows being ("bookid","wordid","location") and it currently has 917802 rows.
What I am trying to do is
retrieve the "bookid" that contains all the words specified in the query by "wordid".
sum word count of all words (from the query) from each book
minimum values of each word location, e.g. (min(w0.location), min (w1.location)
I have tried commenting out count(w0.wordid) and min(location) calculations to see whether they are affecting the performance but this is not the case. Joining the same table multiple time was the case.
(this is the same code as the above image)
select
w0.bookid,
count(w0.wordid) as wcount,
abs(min(w0.location) + min(w1.location) + min(w2.location)) as wordlocation,
(abs(min(w0.location) - min(w1.location)) + abs(min(w1.location) - min(w2.location))) as distance
from
wordlocation as w0
inner join
wordlocation as w1 on w0.bookid = w1.bookid
join
wordlocation as w2 on w1.bookid = w2.bookid
where
w0.wordid =3
and
w1.wordid =52
and
w2.wordid =42
group by w0.bookid
order by wcount desc;
This is the result that I am looking for, and which I got from running the above query, but it takes too long if I specify more than 3 words, e.g. (w0 = 3, w1 = 52 , w2 = 42, w3 = 71)

Try this query
SELECT bookid,
ABS(L3+L52+L42) as wordlocation,
ABS(L3-L52)+ABS(L52-L42) as distance
FROM
(SELECT bookid, wordid, CASE WHEN wordid=3 THEN min(location) ELSE 0 END L3,
CASE WHEN wordid=52 THEN min(location) ELSE 0 END L52,
CASE WHEN wordid=42 THEN min(location) ELSE 0 END L42
FROM wordlocation WL
WHERE wordid in (3,52,42)
GROUP BY bookid, wordid) T
GROUP BY bookid
You may also need to create index on wordid

SQL Server Update via Select Statement

I have the following sql statement and I want to update a field on the rows returned from the select statement. Is this possible with my select? The things I have tried are not giving me the desired results:
SELECT
Flows_Flows.FlowID,
Flows_Flows.Active,
Flows_Flows.BeatID,
Flows_Flows.FlowTitle,
Flows_Flows.FlowFileName,
Flows_Flows.FlowFilePath,
Flows_Users.UserName,
Flows_Users.DisplayName,
Flows_Users.ImageName,
Flows_Flows.Created,
SUM(CASE WHEN [Like] = 1 THEN 1 ELSE 0 END) AS Likes,
SUM(CASE WHEN [Dislike] = 1 THEN 1 ELSE 0 END) AS Dislikes
FROM Flows_Flows
INNER JOIN Flows_Users ON Flows_Users.UserID = Flows_Flows.UserID
LEFT JOIN Flows_Flows_Likes_Dislikes ON
Flows_Flows.FlowID=Flows_Flows_Likes_Dislikes.FlowID
WHERE Flows_Flows.Active = '1' AND Flows_Flows.Created < DATEADD(day, -60, GETDATE())
Group By Flows_Flows.FlowID, Flows_Flows.Active, Flows_Flows.BeatID,
Flows_Flows.FlowTitle, Flows_Flows.FlowFileName, Flows_Flows.FlowFilePath,
Flows_Users.UserName, Flows_Users.DisplayName, Flows_Users.ImageName,
Flows_Flows.Created
Having SUM(CASE WHEN [Like] = 1 THEN 1 ELSE 0 END) = '0' AND SUM(CASE WHEN [Dislike] = 1
THEN 1 ELSE 0 END) >= '0'
This select statement returns exactly what I need but I want to change the Active field from 1 to 0.

yes - the general structure might be like this: (note you don't declare your primary key)
UPDATE mytable
set myCol = 1
where myPrimaryKey in (
select myPrimaryKey from mytable where interesting bits happen here )

Because you haven't made your question more clear in what result you want to achieve, I'll provide an answer with my own assumptions.
Assumption
You have a select statement that gives you stuffs, and it works as desired. What you want it to do is to make it return results and update those selected rows on the fly - basically like saying "find X, tell me about X and make it Y".
Anwser
If my assumption is correct, unfortunately I don't think there is any way you can do that. A select does not alter the table, it can only fetch information. Similarly, an update does not provide more detail than the number of rows updated.
But don't give up yet, depending on the result you want to achieve, you have alternatives.
Alternatives
If you just want to update the rows that you have selected, you can
simply write an UPDATE statement to do that, and #Randy has provided
a good example of how it will be written.
If you want to reduce calls to server, meaning you want to make just
one call to the server and get result, as well as to update the
rows, you can write store procedures to do that.
Store procedures are like functions you wrote in programming languages. It essentially defines a set of sql operations and gives them a name. Each time you call that store procedure, the set of operations gets executed with supplied inputs, if any.
So if you want to learn more about store procedures you can take a look at:
http://www.mysqltutorial.org/introduction-to-sql-stored-procedures.aspx

If I understand correctly you are looking for a syntax to be able to select the value of Active to be 0 if it is 1. The syntax for something like that is
SELECT
Active= CASE WHEN Active=1 THEN 0 ELSE Active END
FROM
<Tables>
WHERE
<JOIN Conditions>

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL multiple SELECT too slow (7 min) - sql

Related

SQL Query to Find if (Count of date > X) = 0 For Group of ID

Is there a way I can Query Missing numbers in a table?

SQL Query - combine 2 rows into 1 row

Performance issue in retrieving multiple output of the same column with different values from a single table

SQL Server Update via Select Statement

Categories

Resources