Difference between using 'in' and statements combined by 'or' - sql-server-2012

What is the difference between these two queries?
select * from ggk where gid in (100,200,300,400,500);
select * from ggk where gid=100 or gid=200 or gid=300 or gid=400 or gid=500;
I know that the first one is easier to write and that it is eventually converted to the second query. But doesn't that decrease performance when there are more parameters?
Thanks in advance
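A minimal sketch of the logical equivalence of the two forms, run through Python's built-in sqlite3 module rather than SQL Server 2012. The label column and the sample rows are assumptions made for illustration. It only checks that both predicates select the same rows; it says nothing about how SQL Server plans either query or how they perform.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the ggk table; only gid comes from the question.
conn.execute("CREATE TABLE ggk (gid INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO ggk VALUES (?, ?)",
    [(100, "a"), (150, "b"), (300, "c"), (500, "d"), (None, "e")],
)

rows_in = conn.execute(
    "SELECT * FROM ggk WHERE gid IN (100, 200, 300, 400, 500) ORDER BY gid"
).fetchall()
rows_or = conn.execute(
    "SELECT * FROM ggk WHERE gid = 100 OR gid = 200 OR gid = 300 "
    "OR gid = 400 OR gid = 500 ORDER BY gid"
).fetchall()

# Both predicates keep the same rows; the NULL gid matches neither form.
print(rows_in == rows_or)
print(rows_in)
```

To compare performance on the real server, the execution plans of both statements would have to be inspected there.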

Related

SQLite alias (AS) not working in the same query

I'm stuck on an (apparently) extremely trivial task that I can't get to work, and I really see no option other than to ask for advice.
I used to deal with PHP/MySQL more than 10 years ago and I might be quite rusty now that I'm dealing with an SQLite DB using Qt5.
Basically I'm selecting some records while wanting to make some math operations on the fetched columns. I recall (and re-read some documentation and examples) that the keyword "AS" is going to conveniently rename (alias) a value.
So for example I have this query, where "X" is an integer number that I render into this big Qt string before executing it with a QSqlQuery. This query lets me select all the electronic components used in a Project and calculate how many of them to order (rounding to the nearest multiple of 5) and the total price per component.
SELECT Inventory.id, UsedItems.pid, UsedItems.RefDes, Inventory.name, Inventory.category,
Inventory.type, Inventory.package, Inventory.value, Inventory.manufacturer,
Inventory.price, UsedItems.qty_used as used_qty,
UsedItems.qty_used*X AS To_Order,
ROUND((UsedItems.qty_used*X/5)+0.5)*5*CAST((X > 0) AS INT) AS Nearest5,
Inventory.price*Nearest5 AS TotPrice
FROM Inventory
LEFT JOIN UsedItems ON Inventory.id=UsedItems.cid
WHERE UsedItems.pid='1'
ORDER BY RefDes, value ASC
So, for example, I aliased UsedItems.qty_used as used_qty. At first I tried to use it in the next field, multiplying it by X, writing "used_qty*X AS To_Order" ... Query failed. Well, no worries, I just put the original table.field name back and it worked.
Going further, I have a complex calculation and I want to use its result in the next field, but the same issue popped up: if I alias "ROUND(...)" AS Nearest5, and then try to use this value by multiplying it in the next field, the query fails.
Please note: the query WORKS, but ONLY if I don't use aliases in the following fields, namely if I don't use the alias Nearest5 in the TotPrice field. I just want to avoid re-writing the whole ROUND(...) thing for the TotPrice field.
What am I missing/doing wrong? Either SQLite does not support aliases within the same query, or I am using the wrong syntax and I am just too stuck/confused to see the mistake (which I'm sure has to be really stupid).

Column aliases defined in a SELECT cannot be used:
For other expressions in the same SELECT.
For filtering in the WHERE.
For conditions in the FROM clause.
Many databases also restrict their use in GROUP BY and HAVING.
All databases support them in ORDER BY.
This is how SQL works. The issue is two things:
The logical order of processing clauses in the query (i.e. how they are compiled). This affects the scoping of identifiers.
The order of processing expressions in the SELECT. This is indeterminate. There is no requirement for the ordering of parameters.
For a simple example, what should x refer to in this example?
select x as a, y as x
from t
where x = 2;
By not allowing aliases to be referenced in these places, SQL engines do not have to make a choice. The value is always t.x.
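The two points above can be checked with a small sketch using Python's built-in sqlite3 module. The table t and its two rows are made up for illustration, and the behaviour shown is SQLite's; other engines may report different errors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table matching the "select x as a, y as x" example.
conn.execute("CREATE TABLE t (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(2, 5), (5, 2)])

# Referencing the alias `a` from another expression in the same SELECT fails.
try:
    conn.execute("SELECT x AS a, a + 1 AS b FROM t").fetchall()
    alias_error = None
except sqlite3.OperationalError as exc:
    alias_error = str(exc)

# In WHERE, `x` resolves to the table column t.x, not to the alias `y AS x`.
rows = conn.execute("SELECT x AS a, y AS x FROM t WHERE x = 2").fetchall()

print(alias_error)
print(rows)
```

Only the row whose t.x is 2 comes back, even though the output column named x holds that row's y value.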

You can try with nested queries.
A SELECT query can be nested in another SELECT query within the FROM clause;
multiple queries can be nested, for example using the following pattern:
SELECT *,[your last Expression] AS LastExp From (SELECT *,[your Middle Expression] AS MidExp FROM (SELECT *,[your first Expression] AS FirstExp FROM yourTables));
Respect the nesting order: expressions defined in the innermost SELECT can be used by every enclosing query,
but intermediate expressions can only be used by the queries that enclose them.
For your case, your query may be:
SELECT *, PRC*Nearest5 AS TotPrice
FROM (SELECT *, ROUND((used_qty*X/5)+0.5)*5*CAST((X > 0) AS INT) AS Nearest5
FROM (SELECT Inventory.id, UsedItems.pid, UsedItems.RefDes, Inventory.name, Inventory.category,
Inventory.type, Inventory.package, Inventory.value, Inventory.manufacturer,
Inventory.price AS PRC, UsedItems.qty_used AS used_qty, UsedItems.qty_used*X AS To_Order
FROM Inventory
LEFT JOIN UsedItems ON Inventory.id=UsedItems.cid
WHERE UsedItems.pid='1'))
ORDER BY RefDes, value ASC
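A reduced sketch of the nesting idea, run with Python's built-in sqlite3 module. The two tables are cut down to a few columns, and the sample rows, the value of X, and the nearest5_expr helper string are all assumptions for illustration. It compares the nested form, where the outer query reuses the Nearest5 alias, with the form that repeats the whole expression.

```python
import sqlite3

X = 3  # hypothetical multiplier, standing in for the "X" of the question

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Inventory (id INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE UsedItems (cid INTEGER, pid TEXT, qty_used INTEGER);
INSERT INTO Inventory VALUES (1, 'resistor', 2.0), (2, 'capacitor', 0.5);
INSERT INTO UsedItems VALUES (1, '1', 4), (2, '1', 7);
""")

# The rounding expression from the question, with X passed as parameter :x.
nearest5_expr = (
    "ROUND((UsedItems.qty_used * :x / 5) + 0.5) * 5 * CAST((:x > 0) AS INT)"
)

# Nested form: the inner query defines Nearest5, the outer query reuses it.
nested = conn.execute(f"""
SELECT id, Nearest5, price * Nearest5 AS TotPrice
FROM (
    SELECT Inventory.id, Inventory.price, {nearest5_expr} AS Nearest5
    FROM Inventory
    LEFT JOIN UsedItems ON Inventory.id = UsedItems.cid
    WHERE UsedItems.pid = '1'
)
ORDER BY id
""", {"x": X}).fetchall()

# Flat form: the expression has to be written out twice.
repeated = conn.execute(f"""
SELECT Inventory.id, {nearest5_expr} AS Nearest5,
       Inventory.price * ({nearest5_expr}) AS TotPrice
FROM Inventory
LEFT JOIN UsedItems ON Inventory.id = UsedItems.cid
WHERE UsedItems.pid = '1'
ORDER BY Inventory.id
""", {"x": X}).fetchall()

print(nested == repeated)
print(nested)
```

Both forms return the same rows; the nested one just avoids writing the ROUND(...) expression a second time.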

SQL - HAVING (execution vs structure)

I'm a beginner, studying on my own... please help me clarify something about a query. I am working with a soccer database and trying to answer this question: list all seasons with an average goals-per-match rate of over 1, in matches that didn't end in a draw.
The right query for it is:
select season,round((sum(home_team_goal+away_team_goal) *1.0) /count(id),3) as ratio
from match
where home_team_goal != away_team_goal
group by season
having ratio > 1
I don't understand 2 things about this query:
Why do I multiply by 1.0? Why is it necessary?
I know that the execution in SQL is by this order:
from
where
group
having
select
So how does this query include: having ratio>1 if the "ratio" is only defined in the "select" which is executed AFTER the HAVING?
Am I confused?
Thanks in advance for the help!

The multiplication acts as a typecast from INT to FLOAT: by default the sum of ints is an int, and dividing two ints loses the decimal places.
HAVING. You can consider HAVING as WHERE but applied to the query results. Imagine the query is executed first without HAVING and then the HAVING condition is applied to result rows leaving only suitable ones.
In your case, you first select the grouped data and calculate the aggregated results, and then discard the aggregated rows you don't need.

The *1.0 is there for its ".0" part: it tells the system to treat the expression as a decimal, so it does not perform an integer division that would cut off the decimal part (e.g. 1 instead of 1.33).
About the second part: SELECT being at the end just means that the last thing to be done is showing the data. However, assigning an alias to a calculated field is done, you could say, with first priority. Still, I am a bit doubtful; I am almost certain field aliases cannot be used in the WHERE/GROUP BY/HAVING in, say, SQL Server.

There is no order of execution of a SQL query. SQL is a descriptive language not a procedural language. A SQL query describes the result set that the query is producing. The SQL engine can execute it however it likes. In fact, most SQL engines compile the query into a directed acyclic graph, which looks nothing like the original query.
What you are referring to might be better phrased as the "order of interpretation". This is more simply described by simple rules. Column aliases can be used in the ORDER BY clause in any database. They cannot be used in the FROM, WHERE, or GROUP BY clauses. Some databases -- such as SQLite -- allow them to be referenced in the HAVING clause.
As for the * 1.0, it is because some databases -- such as SQLite -- do integer arithmetic. However, the logic that you want is probably more simply expressed as:
round(avg(home_team_goal + away_team_goal * 1.0), 3)
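Both points from this thread can be tried out in SQLite through Python's built-in sqlite3 module. This is only a sketch: the table is named matches here, the five sample rows are invented, and the {factor} placeholder is a helper for running the query with and without the * 1.0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical cut-down version of the soccer table.
conn.execute(
    "CREATE TABLE matches (id INTEGER PRIMARY KEY, season TEXT, "
    "home_team_goal INTEGER, away_team_goal INTEGER)"
)
conn.executemany(
    "INSERT INTO matches (season, home_team_goal, away_team_goal) "
    "VALUES (?, ?, ?)",
    [
        ("2015", 1, 0), ("2015", 2, 0), ("2015", 1, 1),  # last one is a draw
        ("2016", 1, 0), ("2016", 0, 1),
    ],
)

query = """
SELECT season,
       ROUND((SUM(home_team_goal + away_team_goal) {factor}) / COUNT(id), 3)
           AS ratio
FROM matches
WHERE home_team_goal != away_team_goal
GROUP BY season
HAVING ratio > 1
"""

# With * 1.0 the division is done in floating point: 3 goals / 2 matches = 1.5.
with_float = conn.execute(query.format(factor="* 1.0")).fetchall()
# Without it SQLite divides integers: 3 / 2 = 1, so no season passes ratio > 1.
with_int = conn.execute(query.format(factor="")).fetchall()

print(with_float)
print(with_int)
```

SQLite accepts the ratio alias in HAVING here; as noted above, other databases may reject it.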

SQL Server AND AND OR AND AND

Stupid question but is there a difference between
Select * from TableA
where
System=1 and Acct=2 and FiscalNo=4
or
System=2 and FiscalNo=4 and SubAcct=1521
and
Select * from TableA
where
(System=1 and Acct=2 and FiscalNo=4)
or
(System=2 and FiscalNo=4 and SubAcct=1521)
Notice that the only difference is the brackets: the first query does not have them.

It doesn't matter, because of the order of operations within the SQL statement: AND is evaluated before OR either way.

AND binds with higher priority than OR. Therefore, there is no difference.
Use style 2 for clarity, though. That obviates the need for this question.
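A small check of the precedence rule, run in SQLite through Python's built-in sqlite3 module instead of SQL Server. The four sample rows and the third, regrouped query are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableA "
    "(System INTEGER, Acct INTEGER, FiscalNo INTEGER, SubAcct INTEGER)"
)
# Hypothetical rows: the first two each satisfy one branch of the OR.
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?, ?)",
    [(1, 2, 4, 0), (2, 9, 4, 1521), (1, 2, 5, 1521), (2, 2, 4, 0)],
)

no_brackets = conn.execute(
    "SELECT * FROM TableA WHERE System=1 AND Acct=2 AND FiscalNo=4 "
    "OR System=2 AND FiscalNo=4 AND SubAcct=1521 ORDER BY System"
).fetchall()
with_brackets = conn.execute(
    "SELECT * FROM TableA WHERE (System=1 AND Acct=2 AND FiscalNo=4) "
    "OR (System=2 AND FiscalNo=4 AND SubAcct=1521) ORDER BY System"
).fetchall()

# Brackets only change the result when they group against the precedence.
regrouped = conn.execute(
    "SELECT * FROM TableA WHERE System=1 AND Acct=2 AND "
    "(FiscalNo=4 OR System=2) AND FiscalNo=4 AND SubAcct=1521"
).fetchall()

print(no_brackets == with_brackets)
print(regrouped)
```

The first two queries return the same two rows, while the regrouped variant returns none.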

Query evaluation with "INTERSECT" function

I've a query:
QUERY1{statements...}
INTERSECT
QUERY2{statements...}
I need to evaluate these 2 queries against the given database data. My question is:
do I have to evaluate each query separately and then combine the 2 results, i.e. cost(Query1) + cost(Query2) = total query cost? Or is there another way to solve this?

Yes, you have to add the 2 queries' cost.
In Toad Oracle you can evaluate the global intersect query:

Any idea why contains(...) queries are so slow in SQL Server 2005

I've got a simple select query which executes in under 1 second normally, but when I add in a contains(column, 'text') into the where clause, suddenly it's running for 20 seconds up to a minute. The table it's selecting from has around 208k rows.
Any ideas what would cause this query to run so slow with just the addition of the contains clause?

Substring matching is a computationally expensive operation. Is the field indexed? If this is a major feature implementation, consider a search-caching table so you can simply lookup where the words exist.

Depending on the search keyword and the median length of characters in the column it is logical that it would take a long time.
Consider searching for 'cookie' in a column with median length 100 characters in a dataset of 200k rows.
Best case scenario with early outs, you would do 100 * 200k = 20m comparisons
Worst case scenario near missing on every compare, you would do (5 * 100) * 200k = 100m comparisons
Generally I would:
reorder your query to filter out as much as possible in advance prior to string matching
limit number of the results if you don't need all of them at once (TOP x)
reduce the number of characters in your search term
reduce the number of search terms by filtering out terms that are likely to match a lot, or not at all (if applicable)
cache query results if possible (however cache invalidation can get pretty tricky if you want to do it right)
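The back-of-envelope numbers for 'cookie' above can be reproduced with a few lines of Python. The factor of 5 in the worst case is read here as the length of the term minus one, which is an assumption; the answer does not say where it comes from.

```python
rows = 200_000     # rows in the dataset
median_len = 100   # median column length in characters
term = "cookie"

# Best case: one comparison per character position before an early out.
best_case = median_len * rows
# Worst case: near misses, assumed to cost len(term) - 1 = 5 comparisons each.
worst_case = (len(term) - 1) * median_len * rows

print(best_case)
print(worst_case)
```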

Try this:
SELECT *
FROM table
WHERE CONTAINS((column1, column2, column3), '"*keyword*"')
Instead of this:
SELECT *
FROM table
WHERE CONTAINS(column1, '"*keyword*"')
OR CONTAINS(column2, '"*keyword*"')
OR CONTAINS(column3, '"*keyword*"')
The first one is a lot faster.

CONTAINS does a lot of extra work. There are a few things to note here:
NVarChar is always faster, so do CONTAINS(column, N'text')
If all you want to do is see if the word is in there, compare the performance to column LIKE '%' + text + '%'.
Compare the query plans before and after: did it switch to a table scan? If so, post more details so we can figure out why.
As a last resort, you can break up the text's individual words into a separate table so they can be indexed.
