Maximum values in WHERE IN clause of MySQL - where-clause

Does anyone know how many values I can give in a WHERE IN clause? I have 25000 values in a WHERE IN clause and MySQL is unable to execute the query. Any thoughts? Awaiting your thoughts.

Although this is old, it still shows up in search results so is worth answering.
There is no hard-coded maximum in MySQL for the number of values you can put in an IN clause, or in any other part of a query.
However, there is a value called max_allowed_packet which determines the largest query you can run on the MySQL server process. It isn't to do with the number of elements in the query, but the total length of the query. So
SELECT * FROM mytable WHERE mycol IN (1,2,3);
is less likely to hit the limit than
SELECT * FROM mytable WHERE mycol IN ('This string','That string','Tother string');
The value of max_allowed_packet is configurable from server to server. But almost certainly, if you find yourself hitting the limit because you're writing SQL statements of epic length (rather than dealing with binary data which is a legitimate reason to hit it), then you need to re-think your SQL.
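For reference, you can check the current limit and, if genuinely needed, raise it; the 64 MB value below is only an example, not a recommendation:
SHOW VARIABLES LIKE 'max_allowed_packet';
SET GLOBAL max_allowed_packet = 67108864; -- 64 MB; takes effect for new connections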

I think that if this restriction is a problem then you're doing something wrong.
Perhaps you could store the data from your where clause in a table and then join with it. This would probably be more efficient.
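A minimal sketch of that approach (the wanted_values table and its column are made up; mytable/mycol are from the earlier example):
CREATE TEMPORARY TABLE wanted_values (val INT NOT NULL PRIMARY KEY);
INSERT INTO wanted_values (val) VALUES (1), (2), (3); -- bulk-load your 25000 values here instead
SELECT t.*
FROM mytable t
JOIN wanted_values w ON w.val = t.mycol;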

I think it has something to do with execution time.
I think you are doing something like this (correct me if I am wrong):
SELECT * FROM table WHERE V1='A1' AND V2='A2' AND V3='A3' AND ... Vn='An'
There is always a more efficient way to do your SELECT in your database. When working with a database it is important to keep in mind that seconds matter.
If you can share what your query looks like, we can help you write a more efficient SELECT statement.
I wish you success.

Related

Querying time higher with 'Where' than without it

I have what I think is a strange issue. Normally, I would think that a query should take less time if I add a restriction (so that fewer rows are processed). But I don't know why, this is not the case. Maybe I'm doing something wrong, but I don't get an error; the query just seems to run 'till infinity'.
This is the query
SELECT
A.ENTITYID AS ORG_ID,
A.ID_VALUE AS LEI,
A.MODIFIED_BY,
A.AUDITDATETIME AS LAST_DATE_MOD
FROM (
SELECT
CASE WHEN IFE.NEWVALUE IS NOT NULL
then EXTRACTVALUE(xmltype(IFE.NEWVALUE), '/DocumentElement/ORG_IDENTIFIERS/ID_TYPE')
ELSE NULL
end as ID_TYPE,
case when IFE.NEWVALUE is not null
then EXTRACTVALUE(xmltype(IFE.NEWVALUE), '/DocumentElement/ORG_IDENTIFIERS/ID_VALUE')
ELSE NULL
END AS ID_VALUE,
(select u.username from admin.users u where u.userid = ife.analystuserid) as Modified_by,
ife.*
FROM ife.audittrail ife
WHERE
--IFE.AUDITDATETIME >= '01-JUN-2016' AND
attributeid = 499
AND ROWNUM <= 10000
AND (CASE WHEN IFE.NEWVALUE IS NOT NULL then EXTRACTVALUE(xmltype(IFE.NEWVALUE), '/DocumentElement/ORG_IDENTIFIERS/ID_TYPE') ELSE NULL end) = '38') A
--WHERE A.AUDITDATETIME >= '01-JUN-2016';
So I tried with each of the two commented clauses (one at a time, of course).
With both of them the same thing happens: the query runs for so long that I have to abort it.
Do you know why this could be happening? How could I apply the restriction, maybe in a different way?
The values of the AUDITDATETIME field look like '06-MAY-2017', for example. In that format.
Thank you very much in advance
I think you may misunderstand how databases work.
Firstly, read up on EXPLAIN - you can find out exactly what is taking time, and why, by learning to read the EXPLAIN statement.
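Since your query uses xmltype and ROWNUM it looks like Oracle; there the rough equivalent is EXPLAIN PLAN. A minimal sketch, using a simplified predicate:
EXPLAIN PLAN FOR
SELECT * FROM ife.audittrail WHERE attributeid = 499;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);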
Secondly - the performance characteristics of any given query are determined by a whole range of things, but usually the biggest effort goes not in processing rows, but finding them.
Without an index, the database has to look at every row in the database and compare it to your where clause. It's the equivalent of searching in the phone book for a phone number, rather than a name (the phone book is indexed on "last name").
You can improve this by creating indexes - for instance, on columns "AUDITDATETIME" and "attributeid".
Unlike the phone book, a database server can support multiple indexes - and if those indexes match your where clause, your query will be (much) faster.
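For instance (the index names here are made up; adjust to your schema):
CREATE INDEX ix_audittrail_attr ON ife.audittrail (attributeid);
CREATE INDEX ix_audittrail_auditdate ON ife.audittrail (auditdatetime);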
Finally, using an XML string extraction for a comparison in the where clause is likely to be extremely slow unless you've got an index on that XML data.
This is the equivalent of searching the phone book and translating the street address from one language to another - not only do you have to inspect every address, you have to execute an expensive translation step for each item.
You probably need index(es)... We can all make guesses about what indexes you already have and need to add, but most DBMSs have built-in query optimizers.
If you are using MS SQL Server you can execute the query with the query plan included; that will tell you what index you need to add to optimize this particular query. It will even let you copy/paste the command to create it.

In Oracle SQL, what is the maximum number of AND clauses in a query?

More of a curiosity question. I'm studying SQL and I want to know the maximum number of AND clauses:
WHERE condition1
AND condition2
AND condition3
AND condition4
...
AND condition?
...
AND condition_n;
i.e. what is the biggest possible n? It would seem that, since these could be trivial comparisons, the limit is high.
How far can one go before reaching the limit?
Practically, there is no limit.
Most tools will have some limit on the length of the SQL statement that they can deal with. If you want to get really deep into the weeds, though, you could use the dbms_sql package which accepts a collection of varchar2(4000) strings that comprise a single SQL statement. That would get you up to 2^32 * 4000 bytes. If we assume that every condition is at least 10 bytes, that puts a reasonable upper limit of 400 * 2^32, which is roughly 1.7 trillion conditions. If you're getting anywhere close to that, you're doing something really wrong. Most tools will have limits that kick in well before that.
Of course, if you did create the largest possible SQL statement using dbms_sql, that SQL statement would require ~16 trillion bytes. A single SQL statement that required 16 TB of storage would probably create other issues...
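For what it's worth, a minimal sketch of feeding a statement to dbms_sql as a collection of pieces (the conditions and piece boundaries are trivial and arbitrary, just to show the mechanism):
DECLARE
  l_cursor INTEGER;
  l_sql    DBMS_SQL.VARCHAR2A;  -- the statement, split into pieces
  l_rows   INTEGER;
BEGIN
  l_sql(1) := 'select * from dual where 1=1 ';
  l_sql(2) := 'and 1=1 ';
  l_sql(3) := 'and 1=1';
  l_cursor := DBMS_SQL.OPEN_CURSOR;
  -- pieces 1..3 are concatenated and parsed as one statement
  DBMS_SQL.PARSE(l_cursor, l_sql, 1, 3, FALSE, DBMS_SQL.NATIVE);
  l_rows := DBMS_SQL.EXECUTE(l_cursor);
  DBMS_SQL.CLOSE_CURSOR(l_cursor);
END;
/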
I put together a simple test case:
select * from dual
where 1=1
and 1=1
...
Using SQL*Plus, I was able to run with 100,000 conditions (admittedly, very simple ones) without an issue. I'd find any use case that came even close to approaching that number to be highly suspect...
I'm not sure about that, because I don't think it is defined even in the Oracle documentation, but it would be determined by a few factors.
The length of the query. In a few Oracle community/forum posts I have read that in Oracle 9i the maximum length of a SQL statement is 64K, but in later versions that limit is not specified; rather, it is said to depend on disk space, memory availability, etc.
Again, in a few Oracle forums I have read that Oracle supports 1000 elements in an IN-list (IN (a1,a2,...,a1000)). This gets converted to 1000 OR conditions, like a1 OR a2 OR ... OR a1000. With that, my understanding is that if it supports 1000 OR conditions, it will be able to cope with the same number of AND conditions as well.
But ultimately, I don't think there is any documented limit/upper bound.
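To illustrate the IN-list point (exceeding the 1000-element cap raises ORA-01795): an IN-list is just shorthand for a chain of ORs, e.g.
-- these two predicates are equivalent; Oracle expands the IN-list internally
WHERE col IN (a1, a2, a3)
WHERE col = a1 OR col = a2 OR col = a3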

Pagination with SQLite using LIMIT

I'm writing my own SQLiteBrowser and I have one final problem, which is apparently discussed quite often on the web, but doesn't seem to have a good general solution.
So currently I store the SQL which the user entered. Whenever I need to fetch rows, I execute the SQL by adding "LIMIT n, m" at the end of it.
For normal SQLs, which is what I mostly use, this seems good enough. However, if I want to use LIMIT myself in the query, this will obviously give an error, because the resulting SQL can look like this:
select * from table limit 30 limit 1,100
which is obviously wrong. Is there some better way to do this?
My idea was that I could scan the SQL, check whether a LIMIT clause is already used, and then ignore it. Of course it's not as simple as that, because if I have an SQL like this:
select * from a where a.b = ( select x from z limit 1)
it obviously should still apply my limit in such a case, so I could scan the string from the end and look whether there is a LIMIT somewhere. My question now is how feasible this is. As I don't know how the SQL parser works, I'm not sure if LIMIT has to be at the end of the SQL or if other clauses can come after it.
I tested it with ORDER BY and GROUP BY and I get SQL errors if LIMIT is not at the end, so my assumption seems to be true.
I have now found a much better solution, which is quite simple and doesn't require me to parse the SQL.
The user can enter an arbitrary SQL. The result is loaded into a table. Since we don't want to load the whole result at once, as it can contain millions of records, only N records are retrieved. When the user scrolls to the bottom of the table, the next N items are fetched and loaded into the table.
The solution is to wrap the SQL in an outer SQL with my page-size limits.
select * from (arbitrary UserSQL) limit PageSize offset CurrentOffset
I tested it with the SQLs I regularly use, and this seems to work quite nicely and is also fast enough for my purposes.
However, I don't know whether SQLite has a mechanism to fetch the new rows faster, or if the SQL has to be rerun every time. In that case it might not be a good solution for really complex queries with a long response time.
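Concretely, a sketch of the wrapping (the inner query and paging values are made up): the user's SQL goes inside the parentheses untouched, including any LIMIT of its own, and the outer LIMIT/OFFSET does the paging.
-- fetch rows 101-150 of whatever the user's query produces
SELECT * FROM (
  SELECT * FROM orders ORDER BY created_at DESC LIMIT 1000
) LIMIT 50 OFFSET 100;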

For an Oracle NUMBER datatype, LIKE operator vs BETWEEN..AND operator

Assume mytable is an Oracle table and it has a field called id. The datatype of id is NUMBER(8). Compare the following queries:
select * from mytable where id like '715%'
and
select * from mytable where id between 71500000 and 71599999
I would think the second is more efficient, since I think "number comparison" would require fewer assembly-language instructions than "string comparison". I need a confirmation or correction. Please confirm or correct this, and add any further comments related to either operator.
UPDATE: I forgot to mention 1 important piece of info. id in this case must be an 8-digit number.
If you only want values between 71500000 and 71599999 then yes, the second one is much more efficient. The first one would also return values like 7150-7159, 71500-71599, and so on. You would either need to sift through unnecessary results or write another couple of lines of code to filter them out. The second option is definitely more efficient for what you seem to want to do.
It seems like the execution plan on the second query is more efficient.
The first query is doing a full table scan of the id's, whereas the second query is not.
I don't like the idea of using LIKE with a numeric column.
Also, it may not give the results you are looking for.
If you have a value of 715000000, it will show up in the query result, even though it is larger than 71599999.
Also, I do not like between on principle.
If a thing is between two other things, it should not include those two other things. But this is just a personal annoyance.
I prefer to use >= and <=. This avoids confusion when I read the query. In addition, sometimes I have to change the query to something like >= a AND < c. If I had started by using the BETWEEN operator, I would have to rewrite it when I don't want to be inclusive.
Harv
In addition to the other points raised, using LIKE in the manner you suggest would cause Oracle to not use any indexes on the ID column due to the implicit conversion of the data from number to character, resulting in a full table scan when using LIKE versus an index range scan when using BETWEEN. Assuming, of course, you have an index on ID. Even if you don't, however, Oracle will have to do the type conversion on each value it scans in the LIKE case, which it won't have to do in the other.
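Roughly speaking, the LIKE version is treated as if you had written the following, which is why a plain index on id cannot be range-scanned:
SELECT * FROM mytable WHERE TO_CHAR(id) LIKE '715%'; -- a function applied to the column defeats a normal index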
You can use a math function; otherwise you have to use the TO_CHAR function to be able to use LIKE, but that will cause performance problems.
select * from mytable where floor(id /100000) = 715
or
select * from mytable where floor(id / 100000) = TO_NUMBER('715') -- this is parametric
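Note that if you really did filter on that expression regularly, it would need a matching function-based index to avoid a full scan; a sketch, with a made-up index name:
CREATE INDEX ix_mytable_id_bucket ON mytable (FLOOR(id / 100000));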

Performance of SQL functions vs. code functions

We're currently investigating the load against our SQL server and looking at ways to alleviate it. During my post-secondary education, I was always told that, from a performance standpoint, it was cheaper to make SQL Server do the work. But is this true?
Here's an example:
SELECT ord_no FROM oelinhst_sql
This returns 783119 records in 14 seconds. The field is a char(8), but all of our order numbers are six digits long, so each has two leading blank characters. We typically trim this field, so I ran the following test:
SELECT LTRIM(ord_no) FROM oelinhst_sql
This returned the 783119 records in 13 seconds. I also tried one more test:
SELECT LTRIM(RTRIM(ord_no)) FROM oelinhst_sql
There is nothing to trim on the right; I was just trying to see if there was any overhead in the mere act of calling the function, but it still returned in 13 seconds.
My manager was talking about moving things like string trimming out of the SQL and into the source code, but the test results suggest otherwise. My manager also says he heard somewhere that using SQL functions meant that indexes would not be used. Is there any truth to this either?
Only optimize code that you have proven to be the slowest part of your system. Your data so far indicates that SQL string manipulation functions are not affecting performance at all. Take this data to your manager.
If you use a function or type cast in the WHERE clause it can often prevent the SQL server from using indexes. This does not apply to transforming returned columns with functions.
It's typically user-defined functions (UDFs) that get a bad rap with regard to SQL performance, and they might be the source of the advice you're getting.
The reason for this is that you can build some pretty hairy functions that cause massive overhead, with an exponential effect.
As you've found with RTRIM and LTRIM, this isn't a blanket reason to stop using all functions on the SQL side.
It somewhat depends on what all is encompassed by "things like string trimming", but, for string trimming at least, I'd definitely let the database do that (there will be less network traffic as well). As for the indexes, they will still be used if your where clause is just using the column itself (as opposed to a function of the column). Use of the indexes won't be affected whatsoever by using functions on the actual columns you're retrieving (just on how you're selecting the rows).
You may want to have a look at this for performance improvement suggestions: http://net.tutsplus.com/tutorials/other/top-20-mysql-best-practices/
As I said in my comment, reduce the data read per query and you will get a speed increase.
You said:
our order numbers are six-digits long so each has two blank characters leading
Makes me think you are storing numbers in a string; if so, why are you not using a numeric data type? The smallest numeric type which will take 6 digits is an INT (I'm assuming SQL Server), and that already saves you 4 bytes per order number; over the number of rows you mention, that's quite a lot less data to read off disk and send over the network.
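As a sketch (assuming SQL Server, and that every ord_no really does hold only a numeric value and is not referenced by constraints or indexes that would block the change), the conversion could be as simple as:
ALTER TABLE oelinhst_sql ALTER COLUMN ord_no INT; -- existing char values such as '  123456' convert implicitly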
Fully optimise your database before looking to deal with the data outside of it; it's what a database server is designed to do, serve data.
As you found, it often pays to measure, but what I think your manager may have been referring to is something like this.
This is typically much faster
SELECT SomeFields FROM oelinhst_sql
WHERE
datetimeField > '1/1/2011'
and
datetimeField < '2/1/2011'
than this
SELECT SomeFields FROM oelinhst_sql
WHERE
Month(datetimeField) = 1
and
year(datetimeField) = 2011
even though the rows that are returned are the same
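For the faster form to really pay off, the column used in the WHERE clause should be indexed; a sketch, with a made-up index name and SQL Server syntax assumed:
CREATE INDEX IX_oelinhst_sql_datetimeField ON oelinhst_sql (datetimeField);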