Conditionally LIMIT in BigQuery - google-bigquery

I have read that in Postgres setting LIMIT NULL will effectively not limit the results of the SELECT. However, in BigQuery, when I set LIMIT NULL based on a condition I get "Syntax error: Unexpected keyword NULL".
I'd like to figure out a way to limit or not based on a condition (it could be an argument passed into a procedure, a parameter passed in by a query job, or anything else I can write a CASE or IF statement for). The mechanism for setting the condition shouldn't matter; what I'm looking for is whether there is a LIMIT value that BigQuery accepts as valid syntax but that does not actually limit the results.

The LIMIT clause works differently in BigQuery. It specifies the maximum number of rows returned in the result, and the count in LIMIT count must be a constant INT64.
To stay within the limit on cached result size, you can:
Use filters to limit the result set.
Use a LIMIT clause to reduce the result set, especially if you are using an ORDER BY clause.
You can see this example:
SELECT title
FROM `my-project.mydataset.mytable`
ORDER BY title DESC
LIMIT 100
This will only return 100 rows.
It is a best practice to use LIMIT when you are sorting a very large number of values. You can see this document with examples.
If you want to return all rows from a table, you need to omit the LIMIT clause.
SELECT title
FROM `my-project.mydataset.mytable`
ORDER BY title DESC
This example will return all the rows from the table. Omitting LIMIT is not recommended if your tables are very large, as the query will consume a lot of resources.
One way to optimize resource usage is to use clustered tables. Clustering will save costs and query time. You can see this document with a detailed explanation of how it works.
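For instance, a minimal sketch of creating a clustered copy of a table (the table and column names are only illustrative):
CREATE TABLE `my-project.mydataset.mytable_clustered`
CLUSTER BY title
AS SELECT * FROM `my-project.mydataset.mytable`;
Queries that filter on title will then scan fewer blocks, which lowers cost and latency.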

You can write a stored procedure that dynamically builds a query string based on input parameters. Once the SQL string is ready, you can run it with EXECUTE IMMEDIATE. This way you control whether, and with what value, a LIMIT clause is added to your query.
https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting#execute_immediate
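For example, here is a minimal sketch (the procedure, dataset, and table names are just placeholders) that appends LIMIT only when a row cap is passed in:
CREATE OR REPLACE PROCEDURE mydataset.get_titles(row_limit INT64)
BEGIN
  DECLARE query STRING DEFAULT
    'SELECT title FROM `my-project.mydataset.mytable` ORDER BY title DESC';
  -- Append LIMIT only when a cap was requested; NULL means "no limit".
  IF row_limit IS NOT NULL THEN
    SET query = CONCAT(query, ' LIMIT ', CAST(row_limit AS STRING));
  END IF;
  EXECUTE IMMEDIATE query;
END;

-- CALL mydataset.get_titles(100);   -- returns at most 100 rows
-- CALL mydataset.get_titles(NULL);  -- returns all rows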
Hope this answers your query.

Related

Oracle error code ORA-00913 - IN CLAUSE limitation with more than 65000 values (Used OR condition for every 1k values)

My application team is trying to fetch 85,000 values from a table using a SELECT query that is being built on the fly by their program.
SELECT * FROM TEST_TABLE
WHERE (
ID IN (00001,00002, ..., 01000)
OR ID IN (01001,01002, ..., 02000)
...
OR ID IN (84001,84002, ..., 85000)
);
But I am getting the error "ORA-00913: too many values".
If I reduce the IN clauses to only 65,000 values in total, I do not get this error. Is there any limit on the number of values for IN clauses combined with OR conditions?
The issue isn't about in lists; it is about a limit on the number of or-delimited compound conditions. I believe the limit applies not to or specifically, but to any compound conditions using any combination of or, and and not, with or without parentheses. And, importantly, this doesn't seem to be documented anywhere, nor acknowledged by anyone at Oracle.
As you clearly know already, there is a limit of 1000 items in an in list - and you have worked around that.
The parser expands an in condition as a compound, or-delimited condition. The limit that applies to you is the one I mentioned already.
The limit is 65,535 "atomic" conditions (put together with or, and, not). It is not difficult to write examples that confirm this.
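For example, a rough sketch (assuming Oracle 11g or later, where EXECUTE IMMEDIATE accepts a CLOB) that builds a WHERE clause from OR-delimited atomic conditions; vary the loop count around 65,534 to see where it breaks:
DECLARE
  l_sql CLOB := 'SELECT COUNT(*) FROM dual WHERE 1 = 0';
  l_cnt NUMBER;
BEGIN
  -- one condition is already present; 65,534 more gives 65,535 in total
  FOR i IN 1 .. 65534 LOOP
    l_sql := l_sql || ' OR 1 = 0';
  END LOOP;
  EXECUTE IMMEDIATE l_sql INTO l_cnt;  -- per the limit above, this runs; one more OR fails
  DBMS_OUTPUT.PUT_LINE('count = ' || l_cnt);
END;
/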
The better question is why (and, of course, how to work around it).
My suspicion: To evaluate such compound conditions, the compiled code must use a stack, which is very likely implemented as an array. The array is indexed by unsigned 16-bit integers (why so small, only Oracle can tell). So the stack size can be no more than 2^16 = 65,536; and actually only one less, because Oracle thinks that array indexes start at 1, not at 0 - so they lose one index value (0).
Workaround: create a temporary table to store your 85,000 values. Note that the idea of using tuples (artificial as it is) allows you to overcome the 1000 values limit for a single in list, but it does not work around the limit of 65,535 "atomic" conditions in an or-delimited compound condition; this limit applies in the most general case, regardless of where the conditions come from originally (in lists or anything else).
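A rough sketch of that workaround (the table and column names are only illustrative):
CREATE GLOBAL TEMPORARY TABLE test_ids (id NUMBER) ON COMMIT PRESERVE ROWS;

-- load the 85,000 values from the application (e.g. batched inserts), then:
SELECT t.*
FROM   test_table t
WHERE  t.id IN (SELECT id FROM test_ids);
The IN condition now uses the subquery form, which is not subject to the 1000-item expression-list limit or the 65,535-condition limit.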
More information on AskTom - you may want to start at the bottom (my comments, which are the last ones in the threads):
https://asktom.oracle.com/pls/apex/f?p=100:11:10737011707014::::P11_QUESTION_ID:9530196000346534356#9545388800346146842
https://asktom.oracle.com/pls/apex/f?p=100:11:10737011707014::::P11_QUESTION_ID:778625947169#9545394700346458835

The number of partitions [100000] relevant to the query is too large

I execute the following script in DolphinDB:
select count(*) from pt
It throws an exception: The number of partitions [100000] relevant to the query is too large. Please add more specific filtering conditions on partition columns in WHERE clause, or consider changing the value of the configuration parameter maxPartitionNumPerQuery.
How do I change the configuration parameter maxPartitionNumPerQuery?
The parameter maxPartitionNumPerQuery specifies the maximum number of partitions a single query can involve. It is designed to prevent users from submitting an overly large query by accident. The default value of this parameter is 65536. Please configure the new value in cluster.cfg.
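For example, add a line like the following to cluster.cfg (the value here is only illustrative; choose one that covers the partitions your queries need). Configuration files are read at node startup, so a restart is typically needed for the change to take effect:
maxPartitionNumPerQuery=200000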

How many items can be placed in an array in WHERE IN?

I have a query
SELECT [whatever] FROM [somewhere] WHERE [someValue] IN [value1, value2, ..., valueN]
What is the maximum size for N (from valueN above) in an Oracle 10g database? Could it be as high as 10k or 50k?
If you're using the 'expression list' version of the IN condition, which appears to be the case from your question though you're missing the brackets around the list of values, then you're limited by the expression list itself:
A comma-delimited list of expressions can contain no more than 1000 expressions. A comma-delimited list of sets of expressions can contain any number of sets, but each set can contain no more than 1000 expressions.
If you're using the subquery version then there is no limit, other than possibly system resources.
Oracle has a fixed limit of 1000 elements for an IN clause as documented in the manual:
http://docs.oracle.com/cd/E11882_01/server.112/e26088/conditions013.htm#i1050801
You can specify up to 1000 expressions in expression_list.
This thread suggests that the limit is 1000. However, I would suggest you don't even go there: place your values in a table instead and turn your query into a subselect. It is much neater, more flexible, and performs better.
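A minimal sketch of that approach (the table and column names are only illustrative):
SELECT s.*
FROM   somewhere s
JOIN   value_list v ON v.val = s.somevalue;
This assumes the values in value_list are unique; otherwise use an IN subquery instead of the join to avoid duplicate rows. Either way the values come from a table rather than an expression list, so the 1000-expression limit does not apply.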
That depends on the number of rows you have for that particular column; in some cases the table may hold millions of records.

SQL to get a listing with paging

I have a user list table on my listing page. The data needs to be paged, so how can I make SQL page the data for me (i.e. pull the data in sets of 10 records from the table)?
Informix has clauses analogous to, but different from, LIMIT and OFFSET:
SELECT SKIP n LIMIT m ...
You can use FIRST in place of LIMIT.
See the IDS 11.70 InfoCenter, or similar locations for earlier versions of IDS.
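For instance, a sketch of pulling the third page of 10 rows (the table and column names are only illustrative; the ORDER BY keeps the paging stable across requests):
SELECT SKIP 20 FIRST 10 id, username
FROM   user_list
ORDER  BY username;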
Now that you pointed out that you are using Informix, the LIMIT clause will not work. Are you able to instead place your selection into an array and call for the desired data from the array?

MySQL return result count along with all results or check if query limited

I have a query that returns a large number of results, so I'm limiting the number returned. But I want to find out whether there were more results than the number I limited them to, either by getting all the results back with a COUNT(*) or by some way of determining whether the results were limited, all in the same query as the one that returns the results.
If you don't care about how many more rows there are, you can also just add 1 to the limit.
Say you want to display 100 rows per page, so you limit by 101. If at any time you receive 101 rows, you know that there is at least one more page.
Obviously, you have to discard the extra row each time, which adds some extra complexity to the application code.
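A quick sketch for a 100-row page (the table and column names are only illustrative):
SELECT id, name
FROM users
ORDER BY id
LIMIT 101;  -- 101 rows back means there is at least one more page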
Use the FOUND_ROWS function after the query that uses the LIMIT:
SELECT FOUND_ROWS();
From the documentation:
A SELECT statement may include a LIMIT clause to restrict the number of rows the server returns to the client. In some cases, it is desirable to know how many rows the statement would have returned without the LIMIT, but without running the statement again. To obtain this row count, include a SQL_CALC_FOUND_ROWS option in the SELECT statement, and then invoke FOUND_ROWS() afterward:
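For example (the table and column names are only illustrative; note that SQL_CALC_FOUND_ROWS and FOUND_ROWS() are deprecated as of MySQL 8.0.17):
SELECT SQL_CALC_FOUND_ROWS id, name
FROM users
WHERE active = 1
LIMIT 10;

SELECT FOUND_ROWS();  -- rows the first query would have returned without LIMIT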