I have a query against a table that contains about 2 million rows, accessed through a linked server.
Select * from OPENQUERY(LinkedServerName,
'SELECT
PV.col1
,PV.col2
,PV.col3
,VTR.col1
,CTR.col1
,PSR.col1
FROM
LinkedDbName.dbo.tbl1 PV
INNER JOIN LinkedDbName.dbo.tbl2 VTR
ON PV.col_id = VTR.col_id
INNER JOIN LinkedDbName.dbo.tbl3 CTR
ON PV.col_id = CTR.col_id
INNER JOIN LinkedDbName.dbo.tbl4 PSR
ON PV.col_id = PSR.col_id
WHERE
PV.col_id = ''80C53C9B-6272-11DA-BB34-000E0C7F3ED2''')
That query returns 365 rows and executes almost instantly.
However, when I wrap that query in a view, it takes at least 20 seconds to run and sometimes as long as 40 seconds.
Here's my CREATE VIEW script:
CREATE VIEW [dbo].[myview]
AS
Select * from OPENQUERY(LinkedServerName,
'SELECT
PV.col1
,PV.col2
,PV.col3
,VTR.col1
,CTR.col1
,PSR.col1
FROM
LinkedDbName.dbo.tbl1 PV
INNER JOIN LinkedDbName.dbo.tbl2 VTR
ON PV.col_id = VTR.col_id
INNER JOIN LinkedDbName.dbo.tbl3 CTR
ON PV.col_id = CTR.col_id
INNER JOIN LinkedDbName.dbo.tbl4 PSR
ON PV.col_id = PSR.col_id')
Then I query it with:
Select * from myview where col_id = '80C53C9B-6272-11DA-BB34-000E0C7F3ED2'
Any ideas? Thanks!
Your queries are quite different. In the first, the WHERE clause is part of the SQL statement passed to OPENQUERY(). This has two important effects:
- The amount of data returned is much smaller, only the rows that match the condition.
- The remote server can optimize the query using the WHERE clause.
In the view, the entire remote result set is pulled across the linked server first, and the filter is applied locally afterwards.
If you need to share the table, I might suggest that you make a copy on the local server -- either using replication or scheduling a job to copy it over.
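If the filter value can't be hard-coded, another option is a parameterized pass-through query. This is a hedged sketch, not the original poster's setup: EXEC ... AT requires RPC OUT to be enabled on the linked server, and the column list below just mirrors the query above.
DECLARE @id char(36) = '80C53C9B-6272-11DA-BB34-000E0C7F3ED2';
-- The ? placeholder is bound to @id and the whole statement runs on
-- the remote server, so only the matching rows come back over the wire.
EXEC ('SELECT PV.col1, PV.col2, PV.col3, VTR.col1, CTR.col1, PSR.col1
       FROM LinkedDbName.dbo.tbl1 PV
       INNER JOIN LinkedDbName.dbo.tbl2 VTR ON PV.col_id = VTR.col_id
       INNER JOIN LinkedDbName.dbo.tbl3 CTR ON PV.col_id = CTR.col_id
       INNER JOIN LinkedDbName.dbo.tbl4 PSR ON PV.col_id = PSR.col_id
       WHERE PV.col_id = ?', @id) AT LinkedServerName;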
Related
I'm trying to replace an entity.school view with the entity.current_facts_with_frg view.
There are a few differences between the two views, one of which is that entity.school only includes a portion of the data from entity.current_facts_with_frg, so to get the same data I'd have to run:
select * from entity.current_facts_with_frg where type = 'School'
The above query takes about 30 seconds to run, and select * from entity.school takes less than a second. I am okay with that time difference; however, when I try to replace:
select * from entity.relationship r
join entity.school_network sn on r.related_id = sn.id
join entity.school s on r.main_id = s.id
with
select * from entity.relationship r
join entity.school_network sn on r.related_id = sn.id
join (select * from entity.current_facts_with_frg where type = 'School') s on r.main_id = s.id
there ends up being a huge difference in query speed: the first query takes about 4 minutes, and the second over 30 minutes.
I'm confused, as I would have presumed it should only add the extra 30 seconds from the initial time discrepancy.
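The usual culprit is that the optimizer inlines the view's definition into the outer query and produces a very different plan than the standalone run. One common workaround is to materialize the filtered rows first, so the outer joins see a plain table. A sketch, assuming SQL Server-style temp tables (other engines use CREATE TEMPORARY TABLE):
-- Pay the view's 30-second cost once, into a temp table.
SELECT *
INTO #school
FROM entity.current_facts_with_frg
WHERE type = 'School';

-- Then join against the materialized rows.
SELECT *
FROM entity.relationship r
JOIN entity.school_network sn ON r.related_id = sn.id
JOIN #school s ON r.main_id = s.id;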
I need my query to run weekly, but it is taking very long (about one hour per execution). The sheer volume of information makes it slow to run, but I wondered how I could optimize it. I'm an SQL novice. This is my query:
SELECT PRIME_MINISTER.PRIME_MINISTER_ID,
PRIME_MINISTER.PRIME_MINISTER_NAME,
CITY.CITY_ID,
CITY.CITY_POPULATION,
CITY.CITY_FOUNDATION_DATE,
STATE.STATE_ID,
STATE.STATE_NAME,
CITY_ACCOUNTANT.CITY_ACCOUNTANT_ID,
CITY_ACCOUNTANT.CITY_ACCOUNTANT_SOCIAL,
CITY_ACCOUNTANT.CITY_ACCOUNTANT_NAME,
CITY_COUNCIL.CITY_COUNCIL_ID,
CITY_COUNCIL.CITY_COUNCIL_FREQUENCY,
CITY_DEBT.CITY_DEBT_ID,
CITY_DEBT.CITY_DEBT_NATURE,
CITY_DEBT.CITY_DEBT_AMOUNT,
HEAD_OF_STATE.HEAD_OF_STATE_ID,
HEAD_OF_STATE.HEAD_OF_STATE_SOCIAL,
DEPUTY_HEAD_OF_STATE.DEPUTY_HEAD_OF_STATE_ID,
DEPUTY_HEAD_OF_STATE.DEPUTY_HEAD_OF_STATE_SOCIAL,
DEPUTY_HEAD_OF_STATE.DEPUTY_HEAD_OF_STATE_NAME
FROM CITY
LEFT JOIN CITY_COUNCIL ON CITY.CITY_ID = CITY_COUNCIL.CITY_ID
LEFT JOIN CITY_DEBT ON CITY_COUNCIL.CITY_COUNCIL_ID = CITY_DEBT.CITY_COUNCIL_ID
OR CITY.CITY_ID = CITY_DEBT.CITY_ID
INNER JOIN CITY_ACCOUNTANT ON CITY_ACCOUNTANT.CITY_ACCOUNTANT_ID = CITY.CITY_ACCOUNTANT_ID
INNER JOIN STATE ON STATE.STATE_ID = CITY.STATE_ID
INNER JOIN HEAD_OF_STATE ON HEAD_OF_STATE.HEAD_OF_STATE_ID = STATE.HEAD_OF_STATE_ID
INNER JOIN DEPUTY_HEAD_OF_STATE ON DEPUTY_HEAD_OF_STATE.DEPUTY_HEAD_OF_STATE_ID = HEAD_OF_STATE.DEPUTY_HEAD_OF_STATE_ID
INNER JOIN PRIME_MINISTER ON STATE.STATE_ID = PRIME_MINISTER.STATE_ID
WHERE CITY.CITY_STATUS = 2
AND CITY.PRIME_MINISTER_STATUS = 2
AND CITY.JURISDICTION = '70'
AND CITY.CITY_ACCOUNTANT_NATURE = 'S'
ORDER BY DEPUTY_HEAD_OF_STATE.DEPUTY_HEAD_OF_STATE_ID,
HEAD_OF_STATE.HEAD_OF_STATE_ID,
STATE.STATE_ID,
CITY.CITY_ID,
CITY_COUNCIL.CITY_COUNCIL_ID,
CITY_DEBT.CITY_DEBT_ID,
CITY_ACCOUNTANT.CITY_ACCOUNTANT_ID;
I select all of this data in the reader step of a Spring Batch job in order to write it to a file.
This is the database model:
[database model diagram]
The database is not mine, so I can't modify the database model, but I can create indexes if needed.
There are between 1,000 and 7,000 rows selected per execution. All the columns are needed.
CITY_ACCOUTANT in the SQL vs. CITY_ACCOUNTANT in the picture: is this the right query, or a typo?
Is the Spring Batch process taking an hour, or just the query?
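Assuming the column names in the query are right, a hedged starting point for indexing (verify with the execution plan) would be one composite index covering the CITY filter columns, plus separate indexes for each branch of the OR join on CITY_DEBT, since an OR over two different columns generally can't use a single index seek:
-- Covers the WHERE clause on CITY.
CREATE INDEX IX_CITY_FILTERS
ON CITY (JURISDICTION, CITY_STATUS, PRIME_MINISTER_STATUS, CITY_ACCOUNTANT_NATURE);

-- One index per branch of the OR in the CITY_DEBT join.
CREATE INDEX IX_CITY_DEBT_COUNCIL ON CITY_DEBT (CITY_COUNCIL_ID);
CREATE INDEX IX_CITY_DEBT_CITY ON CITY_DEBT (CITY_ID);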
So I have a query that is taking several seconds to run.
SELECT it.invoiceID, SUM(xgtpp.total + ws.expense) AS invoice_total
FROM Invoices_Timesheets it (NOLOCK)
INNER JOIN Timesheets_WorkSegments tws (NOLOCK)
INNER JOIN WorkSegments ws (NOLOCK) ON (tws.worksegmentID = ws.ID)
CROSS APPLY (
SELECT gtpp.worksegmentID, SUM(gtpp.pay_per_shift) AS total
FROM dbo.fnGetTotalPerProject(ws.projectID) gtpp
WHERE (gtpp.worksegmentID = tws.worksegmentID)
GROUP BY gtpp.worksegmentID
) xgtpp
ON (it.timesheetID = tws.timesheetID)
WHERE it.invoiceID = 37
GROUP BY it.invoiceID
The tables used are:
[Invoices]
ID,companyID,userID,projectID,insertDate,submitDate,viewDate,tax
[Invoices_Timesheets]
ID,timesheetID,invoiceID
[WorkSegments]
ID,companyID,userID,projectID,insertDate,startTime,endTime,break,poa,deleteDate,expense
[Timesheets_WorkSegments]
ID,timesheetID,worksegmentID
The UDF dbo.fnGetTotalPerProject() accepts only one parameter, projectID.
When I replace ws.projectID inside the UDF call with a static value, the performance is incredible, but as soon as I pass ws.projectID the performance slows down badly.
This query is a sub-query of a larger one, but it is definitely the bottleneck.
I ended up rewriting dbo.fnGetTotalPerProject to work per timesheet rather than per project. My next goal is to optimize it per work segment.
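The fast-with-a-constant, slow-with-a-correlated-column pattern is typical of a multi-statement table-valued function, which the optimizer treats as an opaque box and re-executes for every outer row. If that applies here, rewriting it as an inline TVF lets the optimizer fold the function body into the surrounding plan. A minimal sketch of the shape only; the pay calculation below is hypothetical, since the original function's logic isn't shown:
-- Hypothetical inline rewrite: RETURNS TABLE with a single SELECT,
-- so the optimizer can expand it like a parameterized view.
CREATE FUNCTION dbo.fnGetTotalPerProject_Inline (@projectID int)
RETURNS TABLE
AS
RETURN
(
    SELECT ws.ID AS worksegmentID,
           DATEDIFF(MINUTE, ws.startTime, ws.endTime) - ws.[break] AS pay_per_shift
    FROM dbo.WorkSegments ws
    WHERE ws.projectID = @projectID
      AND ws.deleteDate IS NULL
);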
I have a SQL query with many left joins:
SELECT COUNT(DISTINCT po.o_id)
FROM T_PROPOSAL_INFO po
LEFT JOIN T_PLAN_TYPE tp ON tp.plan_type_id = po.Plan_Type_Fk
LEFT JOIN T_PRODUCT_TYPE pt ON pt.PRODUCT_TYPE_ID = po.cust_product_type_fk
LEFT JOIN T_PROPOSAL_TYPE prt ON prt.PROPTYPE_ID = po.proposal_type_fk
LEFT JOIN T_BUSINESS_SOURCE bs ON bs.BUSINESS_SOURCE_ID = po.CONT_AGT_BRK_CHANNEL_FK
LEFT JOIN T_USER ur ON ur.Id = po.user_id_fk
LEFT JOIN T_ROLES ro ON ur.roleid_fk = ro.Role_Id
LEFT JOIN T_UNDERWRITING_DECISION und ON und.O_Id = po.decision_id_fk
LEFT JOIN T_STATUS st ON st.STATUS_ID = po.piv_uw_status_fk
LEFT OUTER JOIN T_MEMBER_INFO mi ON mi.proposal_info_fk = po.O_ID
WHERE 1 = 1
AND po.CUST_APP_NO LIKE '%100010233976%'
AND 1 = 1
AND po.IS_STP <> 1
AND po.PIV_UW_STATUS_FK != 10
The performance is not good, and I would like to optimize the query.
Any suggestions, please?
Try this one -
SELECT COUNT(DISTINCT po.o_id)
FROM T_PROPOSAL_INFO po
WHERE PO.CUST_APP_NO LIKE '%100010233976%'
AND PO.IS_STP <> 1
AND po.PIV_UW_STATUS_FK != 10
First, check your indexes. Are they old? Did they get fragmented? Do they need rebuilding?
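For example (assuming SQL Server; other engines have their own equivalents), fragmentation can be inspected and repaired like this:
-- Check fragmentation for every index on the main table.
SELECT i.name, ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.T_PROPOSAL_INFO'),
                                    NULL, NULL, 'LIMITED') ips
JOIN sys.indexes i
  ON i.object_id = ips.object_id AND i.index_id = ips.index_id;

-- Rebuild whatever is badly fragmented (a common threshold is 30%).
ALTER INDEX ALL ON dbo.T_PROPOSAL_INFO REBUILD;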
Then, check your "execution plan" (how to get it varies depending on the SQL engine): are all joins properly understood? Are some of them 'out of order'? Do some of them transfer too much data?
Then, check your plan and indexes together: are all important columns covered? Are there any outstandingly long table scans or joins? Do the column orders in the indexes match the query?
Then, revise your query:
- can you extract some parts that would normally generate a small rowset quickly?
- can you add new columns to indexes so join/filter expressions are covered?
- or reorder index columns so they match the query better?
And, supporting the solution from #Devart:
Can you eliminate some tables along the way? Does the WHERE clause touch the other tables at all? Does the data in the other tables change the count significantly? If neither the SELECT nor the WHERE touches the other joined columns, and if the exact COUNT value is not that important (i.e., you only care whether the T_PROPOSAL_INFO rows exist), then you might remove all the joins completely, as Devart suggested. LEFT JOINs never reduce the number of rows; they can only copy/expand/multiply them.
Looking for suggestions on speeding up this query. Access 2007.
The inner query takes a few minutes; the full query takes very long (40 to 80 minutes). The result is as expected. Everything is indexed.
SELECT qtdetails.F5, qtdetails.F16, ExpectedResult.DLID, ExpectedResult.NumRows
FROM qtdetails
INNER JOIN (INVDL
INNER JOIN ExpectedResult
ON INVDL.DLID = ExpectedResult.DLID)
ON (qtdetails.F1 = INVDL.RegionCode)
AND (qtdetails.RoundTotal = ExpectedResult.RoundTotal)
WHERE
(qtdetails.F5 IN (SELECT qtdetails.F5
FROM (ExpectedResult
INNER JOIN INVDL
ON ExpectedResult.DLID = INVDL.DLID)
INNER JOIN qtdetails
ON (INVDL.RegionCode = qtdetails.F1)
AND (ExpectedResult.RoundTotal = qtdetails.RoundTotal)
GROUP BY qtdetails.F5
HAVING (((COUNT(ExpectedResult.DLID)) < 2));
)
);
INVDL - 80,000 records
ExpectedResult - Ten Million records
qtDetails - 12,000 records
The inner query will return around 5,000 to 8,000 records.
Tried saving the results of the inner query in a table and then using Select F5 from qTempTable instead, but it is still taking a lot of time.
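If the temp-table route is revisited, replacing IN with a join sometimes helps, since Access can be slow at evaluating IN (subquery). A hedged sketch, assuming qTempTable holds the inner query's distinct F5 values:
SELECT qtdetails.F5, qtdetails.F16, ExpectedResult.DLID, ExpectedResult.NumRows
FROM ((qtdetails
INNER JOIN qTempTable ON qtdetails.F5 = qTempTable.F5)
INNER JOIN INVDL ON qtdetails.F1 = INVDL.RegionCode)
INNER JOIN ExpectedResult
ON (INVDL.DLID = ExpectedResult.DLID)
AND (qtdetails.RoundTotal = ExpectedResult.RoundTotal);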
Any help would be very highly appreciated.
Data types:
qtdetails.F5 = Number
qtdetails.F16 = Text
ExpectedResult.NumRows = Number
INVDL.DLID = Number
ExpectedResult.DLID = Number
INVDL.RegionCode = Text
qtdetails.F1 = Text
Rebuild the indexes on all the tables involved in the query, then run the query again and check the time; it should decrease the execution time. I will update you soon with a tuned query if I can.
Keep querying!