selecting minimum value depending on other value - sql

Is there any better way for below sql query? Don't want to drop and create temporary table just would like to do it in 1 query.
I am trying to select minimum value for price depending if its order sell where obviously price is higher then in buy and it just shows 0 results when I try it.
DROP TABLE `#temporary_table`;
CREATE TABLE `#temporary_table` (ID int(11),region int(11),nazwa varchar(100),price float,isBuyOrder int,volumeRemain int,locationID int,locationName varchar(100),systemID int,security_status decimal(1,1));
INSERT INTO `#temporary_table` SELECT * FROM hauling WHERE isBuyOrder=0 ORDER BY ID;
SELECT * FROM `#temporary_table`;
SELECT * FROM `#temporary_table` WHERE (ID,price) IN (select ID, MIN(price) from `#temporary_table` group by ID) order by `ID`
UPDATE: when I try nvogel answer and checked profiling thats what I get:
Any chance to optimize this or different working way with 700k rows database?

Try this:
SELECT *
FROM hauling AS h
WHERE isBuyOrder = 0
AND price =
(SELECT MIN(t.price)
FROM hauling AS t
WHERE t.isBuyOrder = 0
AND t.ID = h.ID);

You don't need a temporary table at all. You can basically use your current logic:
SELECT h.*
FROM hauling h
WHERE h.isBuyOrder = 0 AND
(h.id, h.price) IN (SELECT h2.id, MIN(h2.price)
FROM hauling h2
WHERE h2.isBuyOrder = 0
)
ORDER BY h.id
There are many other ways to write similar logic. However, there is no need to rewrite the logic; you just need to include the comparison on isBuyOrder in the subquery.
Note that not all databases support IN with tuples. If your database does not provide such support, then you would need to rewrite the logic.

Related

UPDATE … FROM syntax with sub-query

I have a table, items, with a priority column, which is just an integer. In trying to do some bulk operations, I'm trying to reset the priority to be a sequential number.
I've been able to use ROW_NUMBER() to successfully generate a table that has the new priority values I want. Now, I just need to get the values from that SELECT query into the matching records in the actual items table.
I've tried something like this:
UPDATE
"items"
SET
"priority" = tempTable.newPriority
FROM
(
SELECT
ROW_NUMBER() OVER (
ORDER BY
/* pile of sort conditions here */
) AS "newPriority"
FROM
"items"
) AS tempTable
WHERE
"items"."id" = "tempTable"."id"
;
I keep getting a syntax error "near FROM".
How can I correct the syntax here?
SQLite is not as flexible as other rdbms, it does not support even joins in an update statement.
What you can do instead is something like this:
update items
set priority = 1 + (
select count(*)
from items i
where i.id < items.id
)
With this the condition is derived only by the ids.
So the column priority will be filled with sequential numbers 1, 2, 3, ....
If you can apply that pile of sort conditions in this manner, you will make the update work.
Edit.
Something like this maybe can do what you need, although I'm not sure about its efficiency:
UPDATE items
SET priority = (
SELECT newPriority FROM (
SELECT id, ROW_NUMBER() OVER (ORDER BY /* pile of sort conditions here */) AS newPriority
FROM items
) AS tempTable
WHERE tempTable.id = items.id
)
It turns out that the root answer to this specific question is that SQLite doesn't support UPDATE ... FROM. Therefore, some alternative methods are needed.
https://www.sqlite.org/lang_update.html

Is there something I can change to make my view run faster?

I just created a view but it is really slow, since my actual table has something around 800k rows.
Is there something I can change in the actual sql code to make it run faster?
Here is how it looks now:
Select B.*
FROM
(Select A.*, (select count(B.KEY_ID)/77
FROM book_new B
where B.KEY_ID = A.KEY_ID) as COUNT_KEY
FROM
(select *
from book_new
where region = 'US'
and (actual_release_date is null or
actual_release_date >= To_Date( '01/07/16','dd/mm/yy'))
) A
) B
WHERE B.COUNT_KEY = 1
OR (B.COUNT_KEY > 1 AND B.NEW_OLD <> 'Old')
The most obvious things to do are add indexes:
Add an index on book_new(key_id)
Add an index on book_new(region, actual_release_date)
These are probably sufficient. It is possible that rewriting the query would help, but this is a good beginning. If you want to rewrite the query, it would help if you described the logic you are trying to implement.
There are many ways to solve this issue based on your needs
You can create an indexed view
You can create an index in the base tables which are used in this view.
You can use the required columns in the SELECT statement instead of using SELECT * FROM,
If the table contains many columns but you require only few columns, you can create a NON CLUSTERED INDEX with INCLUDE COLUMNS option which will reduce the LOGICAL READS.
For starters, replace the scalar subquery for COUNT_KEY with a windowed COUNT(*).
SELECT * FROM
(
select book_new.*, COUNT(*) OVER ( PARTITION BY book_new.key_id)/77 COUNT_KEY
from book_new
where region = 'US'
and (actual_release_date is null or
actual_release_date >= To_Date( '01/07/16','dd/mm/yy'))
)
WHERE count_key = 1 OR ( count_key > 1 AND new_old <> 'Old' )
This way, you only go through the BOOK_NEW table one time.
BTW, I agree with other comments that this query makes little sense.

Ensuring two columns only contain valid results from same subquery

I have the following table:
id symbol_01 symbol_02
1 abc xyz
2 kjh okd
3 que qid
I need a query that ensures symbol_01 and symbol_02 are both contained in a list of valid symbols. In other words I would needs something like this:
select *
from mytable
where symbol_01 in (
select valid_symbols
from somewhere)
and symbol_02 in (
select valid_symbols
from somewhere)
The above example would work correctly, but the subquery used to determine the list of valid symbols is identical both times and is quite large. It would be very innefficient to run it twice like in the example.
Is there a way to do this without duplicating two identical sub queries?
Another approach:
select *
from mytable t1
where 2 = (select count(distinct symbol)
from valid_symbols vs
where vs.symbol in (t1.symbol_01, t1.symbol_02));
This assumes that the valid symbols are stored in a table valid_symbols that has a column named symbol. The query would also benefit from an index on valid_symbols.symbol
You could try use a CTE like;
WITH ValidSymbols AS (
SELECT DISTINCT valid_symbol
FROM somewhere
)
SELECT mt.*
FROM MyTable mt
INNER JOIN ValidSymbols v1
ON mt.symbol_01 = v1.valid_symbol
INNER JOIN ValidSymbols v2
ON mt.symbol_02 = v2.valid_symbol
From a performance perspective, your query is the right way to do this. I would write it as:
select *
from mytable t
where exists (select 1
from valid_symbols vs
where t.symbol_01 = vs.valid_symbol
) and
exists (select 1
from valid_symbols vs
where t.symbol_02 = vs.valid_symbol
) ;
The important component is that you need an index on valid_symbols(valid_symbol). With this index, the lookup should be pretty fast. Appropriate indexes can even work if valid_symbols is a view, although the effect depends on the complexity of the view.
You seem to have a situation where you have two foreign key relationships. If you explicitly declare these relationships, then the database will enforce that the columns in your table match the valid symbols.

optimising/simplifying cursor sql

i've got the below code, and it operates just fine, only it takes a couple of seconds to calculate the answer - i was wondering whether there is a quicker/neater way of writing this code - and if so, what am i doing wrong?
thanks
select case when
(select LSCCert from matterdatadef where ptmatter=$Matter$) is not null then
(
(select case when
(SELECT top 1 dbo.matterdatadef.ptmatter
From dbo.workinprogress, dbo.MatterDataDef
where ptclient=(
select top 1 dbo.workinprogress.ptclient
from dbo.workinprogress
where dbo.workinprogress.ptmatter = $matter$)
and dbo.matterdatadef.LSCCert=(
select top 1 dbo.matterdatadef.LSCCert
from dbo.matterdatadef
where dbo.matterdatadef.ptmatter = $matter$)
)=ptMatter then (
SELECT isnull((DateAdd(mm, 6, (
select top 1 Date
from OfficeClientLedger
where (pttrans=3)
and ptmatter=$matter$
order by date desc))),
(DateAdd(mm, 3, (
SELECT DateAdd
FROM LAMatter
WHERE ptMatter = $Matter$)))
)
)
end
from lamatter
where ptmatter=$matter$)
)
end
It looks like this your sql was generated from a reporting tool. The problem is you are executing the SELECT top 1 dbo.matterdatadef.ptmatter... query for every row of table lamatter. Further slowing execution, within that query you are recalculating comparison values for both ptclient and LSCCert - values that aren't going to change during execution.
Better to use proper joins and craft the query to execute each part only once by avoiding correlated subqueries (queries that reference values in joined tables and must be executed for every row of that table). Calculated values are OK, as long as they are calculated only once - ie from within the final where clause.
Here is a trivial example to demonstrate a correlated subquery:
Bad sql:
select a, b from table1
where a = (select c from table2 where d = b)
Here the sub-select is run for every row, which will be slow, especially without an index on table2(d)
Better sql:
select a, b from table1, table2
where a = c and d = a
Here the database will scan each table at most once, which will be fast

Best way to write a named SQL query that returns if row exists?

So I have this SQL query,
<named-query name="NQ::job_exists">
<query>
select 0 from dual where exists (select * from job_queue);
</query>
</named-query>
Which I plan to use like this:
Query q = em.createNamedQuery("NQ::job_exists");
List<Integer> results = q.getResultList();
boolean exists = !results.isEmpty();
return exists;
I am not very strong in SQL/JPA however, and was wondering whether there is a better way of doing it (or ways to improve it). Should I for example, write (select jq.id from job_queue jq) instead of using a star??
EDIT:This call is very performance critical in our app.
EDIT:Did some performance testing, and while the differences were almost negligible, I finally decided to go with:
select distinct null
from dual
where exists (
select null from job_queue
);
IF you are using EXISTS Oracle I recommend using null:
select null
from dual where exists (select null from job_queue);
The following will be the one with lower cost on Oracle:
select null
from job_queue
where rownum = 1;
Update: To include the case when there are no rows on table you can run the following query:
select count(*)
from (select null
from job_queue
where rownum = 1);
With this query you have a optimum plan and only two possible results: 1 if there are rows and 0 if there are no rows.
If you do an "exists" then it will stop looking as soon as it finds a match. This can stop it from doing a full table scan. Same with TOP 1 if you don't have an ORDER BY. If you do a TOP 1 ID and ID is in an index it might use the index and not even go to the table at all. Stopping the full table scan is where the biggest saving in performance is.
Another small savings is that if you do "SELECT 1" or "SELECT COUNT(1)" instead of "SELECT * " or "SELECT COUNT(*)" it saves the work of getting the table structure.
So I would go with:
SELECT TOP 1 1 AS Found
FROM job_queue
Then:
return !results.isEmpty();
This is the least amount of work that I can think of.
For Oracle it would be:
SELECT 1
FROM job_queue
WHERE rownum<2;
Or:
Set Rowcount 1
SELECT 1
FROM job_queue
Why not just do:
select count(*) as JobCount from job_queue
If JobCount = 0, then there's your answer!