Optimize Query (remove subquery) - sql

Can you help me optimize this query? I need to remove the subquery because the performance is awful.
select LICENSE,
(select top 1 SERVICE_KEY
from SERVICES
where SERVICES.LICENSE = VEHICLE.LICENSE
order by DATE desc, HOUR desc)
from VEHICLE
The problem is that I can have two SERVICES on the same DATE and HOUR, so I haven't been able to code an equivalent SQL avoiding the subquery.
The query runs on a Legacy database where I can't modify its metadata, and it doesn't have any index at all. That's the reason to look for a solution that can avoid a correlated query.
Thank you.

You can express your query using ROW_NUMBER() without the need for a correlated subquery. Try the following query and see how the performance is:
SELECT t.LICENSE, t.SERVICE_KEY
FROM
(
SELECT t1.LICENSE, t2.SERVICE_KEY,
ROW_NUMBER() OVER (PARTITION BY t1.LICENSE
ORDER BY t2.DATE DESC, t2.HOUR DESC) rn
FROM VEHICLE t1
INNER JOIN SERVICES t2
ON t1.LICENSE = t2.LICENSE
) t
WHERE t.rn = 1
The performance of this query would depend, among other things, on having indices on the join columns of your two tables.
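One difference from the original query is worth noting: the INNER JOIN drops vehicles that have no services, while the correlated subquery returns them with a NULL key. A minimal sketch that keeps those rows is shown below. It assumes SERVICE_KEY can serve as a tie-breaker when two services share the same DATE and HOUR, which the question does not confirm.

```sql
-- Sketch only: number the services per LICENSE first, then LEFT JOIN so that
-- vehicles without any service still appear (with a NULL SERVICE_KEY).
SELECT v.LICENSE, s.SERVICE_KEY
FROM VEHICLE v
LEFT JOIN
(
    SELECT LICENSE,
           SERVICE_KEY,
           -- SERVICE_KEY DESC is an assumed tie-breaker for identical DATE/HOUR
           ROW_NUMBER() OVER (PARTITION BY LICENSE
                              ORDER BY DATE DESC, HOUR DESC, SERVICE_KEY DESC) AS rn
    FROM SERVICES
) s
    ON s.LICENSE = v.LICENSE
   AND s.rn = 1
```

Without a tie-breaker, which of two simultaneous services gets rn = 1 is arbitrary, just as it is with TOP 1 in the original query.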

Related

Avoid correlated subqueries error in BigQuery

I have a simple query to obtain the currency rate in use at the time a transaction was created:
SELECT t.orderid, t.date,
(SELECT rate FROM sources.currency_rates r WHERE currencyid=1 AND
r.date>=t.date ORDER BY date LIMIT 1) rate
FROM sources.transactions t
This triggers an error:
Error: Correlated subqueries that reference other tables are not
supported unless they can be de-correlated, such as by transforming
them into an efficient JOIN.
I've tried with several types of joins and named subqueries, but none seem to work. What is the best way to accomplish this? Seems like a very common scenario that should be quite straightforward to implement in BQ's Standard SQL.
Below is for BigQuery Standard SQL
#standardSQL
SELECT
t.orderid AS orderid,
t.date AS date,
ARRAY_AGG(r.rate ORDER BY r.date LIMIT 1)[SAFE_OFFSET(0)] AS rate
FROM `sources.transactions` AS t
JOIN `sources.currency_rates` AS r
ON currencyid = 1
AND r.date >= t.date
GROUP BY orderid, date
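An alternative that avoids the aggregation is to rank the joined rows and keep the first one per transaction. This is only a sketch: it assumes BigQuery's QUALIFY clause is available and that orderid and date together identify a transaction, as the GROUP BY above implies.

```sql
#standardSQL
-- Sketch only: keep, for each transaction, the joined rate row with the
-- earliest r.date on or after the transaction date.
SELECT
  t.orderid AS orderid,
  t.date AS date,
  r.rate AS rate
FROM `sources.transactions` AS t
JOIN `sources.currency_rates` AS r
  ON r.currencyid = 1
  AND r.date >= t.date
WHERE TRUE
QUALIFY ROW_NUMBER() OVER (PARTITION BY t.orderid, t.date ORDER BY r.date) = 1
```

Like the ARRAY_AGG version, this uses an inner join, so transactions with no later rate are dropped rather than returned with a NULL rate.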
I've noticed similar behavior with other correlated subqueries. They are useful, but can't always be automatically modeled to JOINs by BigQuery.
Similar case which works:
#standardSQL
SELECT name, (
SELECT AVG(temp)
FROM `bigquery-public-data.noaa_gsod.gsod2017` b
WHERE a.usaf=b.stn
) temp
FROM `bigquery-public-data.noaa_gsod.stations` a
LIMIT 10
Doesn't work:
#standardSQL
SELECT name, (
SELECT temp
FROM `bigquery-public-data.noaa_gsod.gsod2017` b
WHERE a.usaf=b.stn
ORDER BY da
LIMIT 1
) temp
FROM `bigquery-public-data.noaa_gsod.stations` a
LIMIT 10
Fix:
#standardSQL
SELECT name, ARRAY_AGG(temp ORDER BY da LIMIT 1)[SAFE_OFFSET(0)] temp
FROM `bigquery-public-data.noaa_gsod.stations` a
JOIN `bigquery-public-data.noaa_gsod.gsod2017` b
ON a.usaf=b.stn
GROUP BY 1
LIMIT 10
(give me a public dataset, and I'll write a query that works with your data)

Order by DESC reverse result

I'm retrieving some data in SQL, order by DESC. I then want to reverse the result. I was doing this by pushing the data into an array and then using array_reverse, but I am finding it's quite taxing on CPU time and would like to simply use the correct SQL query.
I've looked at this thread SQL Server reverse order after using desc, but I cannot seem to make it work with my query.
SELECT live.message,
live.sender,
live.sdate,
users.online
FROM live, users
WHERE users.username = live.sender
ORDER BY live.id DESC
LIMIT 15
You can place your query into a subquery and then reverse the order:
SELECT t.message,
t.sender,
t.sdate,
t.online
FROM
(
SELECT live.id,
live.message,
live.sender,
live.sdate,
users.online
FROM live
INNER JOIN users
ON users.username = live.sender
ORDER BY live.id DESC
LIMIT 15
) t
ORDER BY t.id ASC
You'll notice that I replaced your implicit JOIN with an explicit INNER JOIN. It is generally considered undesirable to use commas in the FROM clause (q.v. the ANSI-92 standard) because it makes the query harder to read.
You could wrap your query with another query and order by with asc. Since you want to order by live.id, you must include it in the inner query so the outer one can sort by it:
SELECT message, sender, sdate, online
FROM (SELECT live.message, live.sender, live.sdate, users.online, live.id
FROM live, users
WHERE users.username = live.sender
ORDER BY live.id DESC
LIMIT 15) t
ORDER BY id ASC

Order by not working in Oracle subquery

I'm trying to return 7 events from a table, starting from today's date, and have them in date order:
SELECT ID
FROM table
where ID in (select ID from table
where DATEFIELD >= trunc(sysdate)
order by DATEFIELD ASC)
and rownum <= 7
If I remove the 'order by' it returns the IDs just fine and the query works, but it's not in the right order. Would appreciate any help with this since I can't seem to figure out what I'm doing wrong!
(edit) for clarification, I was using this before, and the order returned was really out:
select ID
from TABLE
where DATEFIELD >= trunc(sysdate)
and rownum <= 7
order by DATEFIELD
Thanks
The values for the ROWNUM "function" are applied before the ORDER BY is processed. That is why it doesn't work the way you used it (see the manual for a similar explanation).
When limiting a query using ROWNUM and an ORDER BY is involved, the ordering must be done in an inner select and the limit must be applied in the outer select:
select *
from (
select *
from table
where datefield >= trunc(sysdate)
order by datefield ASC
)
where rownum <= 7
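If the database is Oracle 12c or later (an assumption, since the question does not state the version), the row-limiting clause applies the limit after the sort, so no inner select is needed. The table name `events` below is a hypothetical stand-in for the real table.

```sql
-- Sketch only, Oracle 12c+: FETCH FIRST is evaluated after ORDER BY,
-- so this returns the 7 earliest events from today onwards.
SELECT id
FROM events
WHERE datefield >= TRUNC(SYSDATE)
ORDER BY datefield ASC
FETCH FIRST 7 ROWS ONLY
```

On older versions, the nested ROWNUM query above remains the way to do it.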
You cannot use ORDER BY in a where id in (select id from ...) kind of subquery. It wouldn't make sense anyway: the condition only checks whether id is in the subquery's result. If it affects the order of the output, that is only incidental. With different data, the query execution plan might be different and the output order would be different as well. Use an explicit ORDER BY at the end of the main query.
It is a well-known 'feature' of Oracle that rownum doesn't play nicely with ORDER BY. See http://www.adp-gmbh.ch/ora/sql/examples/first_rows.html for more information. In your case you should use something like:
SELECT ID
FROM (select ID, row_number() over (order by DATEFIELD ) r
from table
where DATEFIELD >= trunc(sysdate))
WHERE r <= 7
See also:
http://www.orafaq.com/faq/how_does_one_select_the_top_n_rows_from_a_table
http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html
http://asktom.oracle.com/pls/asktom/f?p=100:11:507524690399301::::P11_QUESTION_ID:127412348064
See also other similar questions on SO, eg.:
Oracle SELECT TOP 10 records
Oracle/SQL - Select specified range of sequential records
Your outer query can't "see" the ORDER BY in the inner query. In this case the ordering in the inner query doesn't make sense anyway, because the inner query is only used to create a subset of data for the WHERE clause of the outer one, so the order of this subset doesn't matter.
Maybe if you explain better what you want to do, we can help you.
ORDER BY clause in subqueries:
The ORDER BY clause is not allowed inside a subquery, with the exception of inline views. If you attempt to include an ORDER BY clause, you receive an error message.
An inline view is a query in the FROM clause.
SELECT t.*
FROM (SELECT id, name FROM student) t

Avoiding Correlated Subquery in Oracle

In Oracle 9.2.0.8, I need to return a record set where a particular field (LAB_SEQ) is at a maximum (it is a sequential VARCHAR array '0001', '0002', etc.) for each of another field (WO_NUM). To select the maximum, I am attempting to order in descending order and select the first row. Everything I can find on StackOverflow suggests that the only way to do this is with a correlated subquery. Then I use this maximum in the WHERE clause of the outer query to get the row I want for each WO_NUM:
SELECT lt.WO_NUM, lt.EMP_NUM, lt.LAB_END_DATE, lt.LAB_END_TIME
FROM LAB_TIM lt WHERE lt.LAB_SEQ = (
SELECT LAB_SEQ FROM (
SELECT lab.LAB_SEQ FROM LAB_TIM lab WHERE lab.CCN='1' AND MAS_LOC='1'
AND lt.WO_NUM = lab.WO_NUM ORDER BY ROWNUM DESC
) WHERE ROWNUM=1
)
However, this returns an invalid identifier error for lt.WO_NUM. Research suggests that Oracle 8 only allows correlated subqueries one level deep, and suggests rewriting to avoid the subquery - something which discussion of selecting maximums suggests can't be done. Any help getting this statement to execute would be greatly appreciated.
Your correlated subquery would need to be something like
SELECT lt.WO_NUM, lt.EMP_NUM, lt.LAB_END_DATE, lt.LAB_END_TIME
FROM LAB_TIM lt WHERE lt.LAB_SEQ = (
SELECT max(lab.LAB_SEQ)
FROM LAB_TIM lab
WHERE lab.CCN='1' AND MAS_LOC='1'
AND lt.WO_NUM = lab.WO_NUM
)
Since you are on Oracle 9.2, it will probably be more efficient to use an analytic function. I'm not sure what the predicates lab.CCN='1' AND MAS_LOC='1' are doing in your current query, so I'm not quite sure how to translate them into the analytic function approach. Is the combination of LAB_SEQ and WO_NUM not unique in LAB_TIM? Do you need to add in the predicates on CCN and MAS_LOC in order to get a single unique row for every WO_NUM? Or are you using those predicates to decrease the number of rows in your output? The basic approach will be something like
SELECT *
FROM (SELECT lt.WO_NUM,
lt.EMP_NUM,
lt.LAB_END_DATE,
lt.LAB_END_TIME,
rank() over (partition by wo_num
order by lab_seq desc) rnk
FROM LAB_TIM lt)
WHERE rnk = 1
but it's not clear to me whether CCN and MAS_LOC need to be added to the ORDER BY clause in the analytic function or whether they need to be added to the WHERE clause.
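As an illustration of one of those two readings, here is a sketch that treats CCN and MAS_LOC purely as filters and puts them in the WHERE clause. That interpretation is an assumption; the question does not say which one is intended.

```sql
-- Sketch only: filter first, then keep the row with the highest LAB_SEQ
-- per WO_NUM. ROW_NUMBER guarantees a single row per work order even if
-- LAB_SEQ is duplicated, whereas RANK would return all tied rows.
SELECT WO_NUM, EMP_NUM, LAB_END_DATE, LAB_END_TIME
FROM (SELECT lt.WO_NUM,
             lt.EMP_NUM,
             lt.LAB_END_DATE,
             lt.LAB_END_TIME,
             ROW_NUMBER() OVER (PARTITION BY lt.WO_NUM
                                ORDER BY lt.LAB_SEQ DESC) rn
      FROM LAB_TIM lt
      WHERE lt.CCN = '1' AND lt.MAS_LOC = '1')
WHERE rn = 1
```

This is not strictly equivalent to the correlated version above, which applies the CCN and MAS_LOC predicates only when computing the maximum and not to the outer rows.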
This is one case where a correlated subquery is better, particularly if you have indexes on the table. However, it should be possible to rewrite correlated subqueries as joins.
I think the following is equivalent, without the correlated subquery:
SELECT lt.WO_NUM, lt.EMP_NUM, lt.LAB_END_DATE, lt.LAB_END_TIME
FROM (select lt.*, rownum as r
from LAB_TIM lt
) lt join
(select wo_num, max(r) as maxrownum
from (select LAB_SEQ, wo_num, rownum as r
from LAB_TIM lt
where lt.CCN = '1' AND lt.MAS_LOC = '1'
)
group by wo_num
) ltsum
on lt.wo_num = ltsum.wo_num and
lt.r = ltsum.maxrownum
I'm a little unsure about how Oracle works with rownums in things like ORDER BY.
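A join that does not depend on rownum at all can be sketched by aggregating LAB_SEQ directly. It assumes, as the question states, that LAB_SEQ values such as '0001' and '0002' are zero-padded, so that their string order matches their numeric order.

```sql
-- Sketch only: compute the maximum LAB_SEQ per WO_NUM once, then join back
-- to pick up the remaining columns of the matching row.
SELECT lt.WO_NUM, lt.EMP_NUM, lt.LAB_END_DATE, lt.LAB_END_TIME
FROM LAB_TIM lt
JOIN (SELECT WO_NUM, MAX(LAB_SEQ) AS MAX_SEQ
      FROM LAB_TIM
      WHERE CCN = '1' AND MAS_LOC = '1'
      GROUP BY WO_NUM) m
  ON lt.WO_NUM = m.WO_NUM
 AND lt.LAB_SEQ = m.MAX_SEQ
```

This mirrors the MAX-based correlated subquery shown earlier, with the per-work-order maximum computed in a derived table instead of once per outer row.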

Complex SQL pagination Query

I am doing pagination for my data using the solution to this question.
I now need to use this solution for a more complex query, i.e. the SELECT inside the brackets has joins and aggregate functions.
This is that solution I'm using as a reference:
;WITH Results_CTE AS
(
SELECT
Col1, Col2, ...,
ROW_NUMBER() OVER (ORDER BY SortCol1, SortCol2, ...) AS RowNum
FROM Table
WHERE <whatever>
)
SELECT *
FROM Results_CTE
WHERE RowNum >= @Offset
AND RowNum < @Offset + @Limit
The query that I need to incorporate into the above solution:
SELECT users.indicator, COUNT(*) as 'queries' FROM queries
INNER JOIN calls ON queries.call_id = calls.id
INNER JOIN users ON calls.user_id = users.id
WHERE queries.isresolved=0 AND users.indicator='ind1'
GROUP BY users.indicator ORDER BY queries DESC
How can I achieve this? So far I've made it work by removing the ORDER BY queries DESC part and putting that in the line ROW_NUMBER() OVER (ORDER BY ...) AS RowNum, but when I do this it doesn't allow me to order by that column ("Invalid column name 'queries'.").
What do I need to do to get it to order by this column?
edit: using SQL Server 2008
Try ORDER BY COUNT(*) DESC . It works on MySQL ... not sure about SQL Server 2008
I think queries is your alias for the COUNT(*) column, so use it like this:
SELECT users.indicator, COUNT(*) as 'queries' FROM queries
INNER JOIN calls ON queries.call_id = calls.id
INNER JOIN users ON calls.user_id = users.id
WHERE queries.isresolved=0 AND users.indicator='ind1'
GROUP BY users.indicator ORDER BY COUNT(*) DESC
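Putting this together with the pagination CTE from the question might look like the sketch below. The variable values are placeholders chosen for illustration, and users.indicator is added as an assumed tie-breaker so that rows with equal counts are numbered consistently.

```sql
-- Sketch only, SQL Server 2008: the alias 'queries' is not visible inside
-- ROW_NUMBER(), so the aggregate itself is repeated in the OVER clause.
DECLARE @Offset INT = 1;   -- first row of the page (RowNum starts at 1)
DECLARE @Limit  INT = 10;  -- page size

;WITH Results_CTE AS
(
    SELECT users.indicator,
           COUNT(*) AS queries,
           ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC, users.indicator) AS RowNum
    FROM queries
    INNER JOIN calls ON queries.call_id = calls.id
    INNER JOIN users ON calls.user_id = users.id
    WHERE queries.isresolved = 0 AND users.indicator = 'ind1'
    GROUP BY users.indicator
)
SELECT indicator, queries
FROM Results_CTE
WHERE RowNum >= @Offset
  AND RowNum < @Offset + @Limit
ORDER BY RowNum;
```

The final ORDER BY RowNum matters because the ordering inside ROW_NUMBER() only determines the numbering, not the order in which the outer query returns rows.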