Querying Oracle with a pick list

I have an Oracle database to which I have read-only access (with no permission to create temporary tables). I have a pick list (in Excel) of 28,000 IDs corresponding to 28,000 rows in a table which has millions of records. How do I write a query to return those 28,000 rows?
I tried creating a table in Access and performing a join through ODBC, but Access freezes/takes an incredibly long time. Would I have to create a query with 28,000 items in an IN statement?
Is there anything in PL/SQL that would make it easier?
Thank you for your time and help.
-JC

What makes your 28,000 rows special?
Is there another field in the records you can use to restrict your query in a WHERE clause (or at least narrow down the millions of rows a bit)? Perhaps the IDs you're interested in fall within a certain range?

The maximum number of expressions in an IN (...) list is 1000 in Oracle 10g.
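A well-known workaround: the ORA-01795 limit applies to lists of single expressions, but a list of expression sets may contain any number of sets. Pairing each ID with a dummy constant therefore slips past the limit. A sketch (some_table and id are placeholder names):
-- Each ID becomes a (constant, id) pair; lists of sets have no 1000-expression cap.
SELECT *
FROM some_table
WHERE (1, id) IN ((1, 314), (1, 159), (1, 265) /* ... thousands more pairs ... */);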

Try creating an index on the table you created in Access.
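If you stay with Access, indexing the ID column of your local pick-list table usually makes the ODBC join tractable. A sketch in Access (Jet) SQL, with hypothetical table and column names:
CREATE INDEX idx_picklist_id ON picklist (ID);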

That's a painful condition to be in. One workaround is to create a view that contains all of the ids, then join to it.
The example below is Oracle.
WITH ids AS
(
    SELECT 314 ID FROM DUAL UNION ALL
    SELECT 159 ID FROM DUAL UNION ALL
    SELECT 265 ID FROM DUAL
)
SELECT VALUE1, VALUE2
FROM SOME_TABLE, ids
WHERE SOME_TABLE.ID = ids.ID
This basically embeds all 28,000 IDs in the WITH clause, allowing you to do the join without actually creating a table.
Ugly, but it should work.

The best way to do it is described here: How to put more than 1000 values into an Oracle IN clause

Related

Snowflake sql table name wildcard

What is a good way to "select" from multiple tables at once when the list of tables is not known in advance in Snowflake SQL?
Something that simulates
Select * from mytable*
which would fetch the same results as
Select * from mytable_1
union
Select * from mytable_2
...
I tried doing this in multiple steps:
show tables like 'mytable%';
set mytablevar =
(select listagg("name", ' union ') table_
from table(result_scan(last_query_id())))
The idea was to use the variable mytablevar to store the union of all tables in a subsequent query, but the variable size exceeded the 256-character limit, as the list of tables is quite large.
Even if you did not hit the 256-character limit, it would not help you query all of these tables. How would you use that session variable?
If you have multiple tables which have the same structure and hold similar data that you need to query together, why are the data not in one big table? You can use Snowflake's clustering feature to distribute data across your micro-partitions based on a specific column.
https://docs.snowflake.com/en/user-guide/tables-clustering-micropartitions.html
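For illustration, a hedged sketch of that consolidation, assuming the tables share one structure (mytable_all and source_id are invented names):
-- Merge the per-source tables into one, tagging each row with its origin.
CREATE OR REPLACE TABLE mytable_all AS
SELECT 1 AS source_id, t.* FROM mytable_1 t
UNION ALL
SELECT 2 AS source_id, t.* FROM mytable_2 t;
-- Cluster micro-partitions on the column you usually filter by.
ALTER TABLE mytable_all CLUSTER BY (source_id);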
Anyway, you may create a stored procedure which will create/replace a view.
https://docs.snowflake.com/en/sql-reference/stored-procedures-usage.html#dynamically-creating-a-sql-statement
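A minimal sketch of such a procedure in Snowflake Scripting (the linked page shows the JavaScript flavor; the procedure name matches the call below, everything else is an assumption):
CREATE OR REPLACE PROCEDURE update_my_view(view_name STRING, table_pattern STRING)
RETURNS STRING
LANGUAGE SQL
AS
$$
DECLARE
  union_sql STRING;
BEGIN
  -- Build 'SELECT * FROM T1 UNION ALL SELECT * FROM T2 ...' from the catalog.
  -- Note: information_schema stores names in upper case, so pass e.g. 'MYTABLE%'.
  SELECT LISTAGG('SELECT * FROM ' || table_name, ' UNION ALL ')
    INTO :union_sql
    FROM information_schema.tables
   WHERE table_name LIKE :table_pattern;

  LET ddl STRING := 'CREATE OR REPLACE VIEW ' || view_name || ' AS ' || union_sql;
  EXECUTE IMMEDIATE ddl;
  RETURN 'view ' || view_name || ' rebuilt';
END;
$$;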
And then you can query that view:
CALL UPDATE_MY_VIEW( 'myview_name', 'table%' );
SELECT * FROM myview_name;

How to put more than 1 million IDs using UNION ALL [duplicate]

I have comma-delimited IDs that I want to use in a NOT IN clause.
I'm using Oracle 11g.
select * from table where ID NOT IN (1,2,3,4,...,1001,1002,...)
results in
ORA-01795: maximum number of expressions in a list is 1000
I don't want to use a temp table. I am considering doing this:
select * from table1 where ID NOT IN (1,2,3,4,…,1000) AND
ID NOT IN (1001,1002,…,2000)
Is there any other better workaround to this issue?
You said you don't want to, but: use a temporary table. That's the correct solution here.
Query parsing is expensive in Oracle, and that's what you'll get when you put thousands of identifiers into a giant blob of SQL. Also, there are ill-defined limits on query length that you're going to hit. Doing an anti-JOIN against a table, on the other hand... Oracle is good at that. Bulk loading data into a table, Oracle is good at that too. Use a temp table.
Limiting IN to a thousand entries is a sanity check. The fact that you're hitting it means you're trying to do something insane.
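A hedged sketch of that approach in Oracle (table and column names are invented; the DDL is one-time setup, not something you run per query):
-- One-time setup: a global temporary table, private to each session.
CREATE GLOBAL TEMPORARY TABLE excluded_ids (
  id NUMBER PRIMARY KEY
) ON COMMIT DELETE ROWS;

-- Per run: bulk-insert the IDs (e.g. a JDBC batch), then anti-join.
SELECT t.*
FROM my_table t
WHERE NOT EXISTS (SELECT 1 FROM excluded_ids e WHERE e.id = t.id);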
Stepping back from the question: can you combine the SQL that produces those thousands of IDs with this SQL? That would be a better way to simplify your queries.
It's insane, but you can probably try selecting from a subquery:
SELECT * FROM
    (SELECT * FROM table WHERE ID NOT IN (1,2,3,4,...,1000))
WHERE ID NOT IN (1001,1002,...,2000)
Make as many levels as you need.
Use MINUS, the opposite of UNION:
SELECT * FROM TABLE
MINUS
SELECT T.* FROM TABLE T, TABLE2 T2 WHERE T.ID = T2.ID
This returns the rows of TABLE whose ID is not in TABLE2.

my SELECT statement is very slow

My table contains 97 columns, about 80% of them VARCHAR(200) and the rest BIGINT and CHAR.
select count(*) from mytable
returns 4500 rows.
Testing with 100 rows,
select * from mytable
takes about 2.3 minutes. I think it will be far too slow if it returns all records.
Please recommend a way to speed up my query.
My DB server is DB2 9.5.
The first thing you should do to improve performance is stop using SELECT *. Rewrite the query to fetch only the columns you need.
The second thing to do is add a WHERE clause. Do you really need to return all rows of the table?
The third thing is indexes: does your table have any, and can you use indexed columns in your WHERE clause?
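To make those three points concrete, a sketch with invented column and index names:
-- One-time DDL: index the column you filter on.
CREATE INDEX idx_mytable_status ON mytable (status);

-- Fetch only the columns you need, for only the rows you need.
SELECT id, name, created_date
FROM mytable
WHERE status = 'ACTIVE';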
What is the reason not to use select *?
http://weblogs.asp.net/jgalloway/archive/2007/07/18/the-real-reason-select-queries-are-bad-index-coverage.aspx
You need to isolate what is slow, the query or displaying the query. I am assuming it is a combination of both.
When you have 97 columns, each being a VARCHAR(200), SQL cannot store all of the info in the same row of a page. The worst-case scenario is 97 * 202 = 19,594 bytes (202 rather than 200 because the length of a VARCHAR is stored with each column, I believe as two bytes), which is much larger than a row on a page can handle. TL;DR: normalize your table and break the columns up into logical units on other tables.
Also, doing a SELECT * FROM X will result in a full table scan, which is much slower than querying on an indexed column.
If you must do a SELECT * FROM X and you have to have 97 columns, please change the column data types to smaller types (INT, TINYINT, CHAR(10)); this will optimize the storage and should speed up the query.

SQL argument limit in Oracle

It appears that there is a limit of 1000 arguments in an Oracle SQL IN list. I ran into this when generating queries such as...
select * from orders where user_id IN(large list of ids over 1000)
My workaround is to create a temporary table and insert the user IDs into it first, instead of issuing a query via JDBC that has a giant list of parameters in the IN clause.
Does anybody know of an easier workaround? Since we are using Hibernate I wonder if it automatically is able to do a similar workaround transparently.
An alternative approach would be to pass an array to the database and use a TABLE() function in the IN clause. This will probably perform better than a temporary table. It will certainly be more efficient than running multiple queries. But you will need to monitor PGA memory usage if you have a large number of sessions doing this stuff. Also, I'm not sure how easy it will be to wire this into Hibernate.
Note: TABLE() functions operate in the SQL engine, so they need us to declare a SQL type.
create or replace type tags_nt as table of varchar2(10);
/
The following sample populates an array with a couple of thousand random tags. It then uses the array in the IN clause of a query.
declare
    search_tags tags_nt;
    n pls_integer;
begin
    select name
    bulk collect into search_tags
    from ( select name
           from temp_tags
           order by dbms_random.value )
    where rownum <= 2000;

    select count(*)
    into n
    from big_table
    where name in ( select * from table (search_tags) );

    dbms_output.put_line('tags match '||n||' rows!');
end;
/
As long as the temporary table is a global temporary table (i.e. only visible to the session), this is the recommended way of doing things (and I'd go that route for anything more than a dozen arguments, let alone a thousand).
I'd wonder where/how you are building that list of 1000 arguments. If this is a semi-permanent grouping (e.g. all employees based in a particular location) then that grouping should live in the database and the join should be done there. Databases are designed and built to do joins really quickly, much more quickly than pulling a bunch of IDs back to the mid-tier and then sending them back to the database.
select * from orders
where user_id in
(select user_id from users where location = :loc)
You can add additional predicates to split the list into chunks of 1000:
select * from orders where user_id IN (<first batch of 1000>)
OR user_id IN (<second batch of 1000>)
OR user_id IN ...
The comments regarding "if these IDs are in your database, use joins/correlation instead" hold true. However, if your list of IDs comes from elsewhere, like a SOLR result, you can get around the temp-table requirement by issuing multiple queries, each with no more than 1000 IDs present, and then merging the results in memory. If you place the initial list of IDs in a unique collection like a hashset, you can pop off 1000 IDs at a time.

Is there efficient SQL to query a portion of a large table

The typical way of selecting data is:
select * from my_table
But what if the table contains 10 million records and you only want records 300,010 to 300,020?
Is there a way to create a SQL statement on Microsoft SQL that only gets 10 records at once?
E.g.
select * from my_table from records 300,010 to 300,020
This would be way more efficient than retrieving 10 million records across the network, storing them in the IIS server and then counting to the records you want.
SELECT * FROM my_table is just the tip of the iceberg. Assuming you're talking a table with an identity field for the primary key, you can just say:
SELECT * FROM my_table WHERE ID >= 300010 AND ID <= 300020
You should also know that selecting * is considered poor practice in many circles; they want you to specify the exact column list.
Try looking at info about pagination. Here's a short summary of it for SQL Server.
Absolutely. On MySQL and PostgreSQL (the two databases I've used), the syntax would be
SELECT [columns] FROM table LIMIT 10 OFFSET 300010;
On MS SQL, it's something like SELECT TOP 10 ...; I don't know the syntax for offsetting the record list.
Note that you never want to use SELECT *; it's a maintenance nightmare if anything ever changes. This query, though, is going to be incredibly slow since your database will have to scan through and throw away the first 300,010 records to get to the 10 you want. It'll also be unpredictable, since you haven't told the database which order you want the records in.
This is the core of SQL: tell it which 10 records you want, identified by a key in a specific range, and the database will do its best to grab and return those records with minimal work. Look up any tutorial on SQL for more information on how it works.
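For reference, SQL Server 2012 and later do support offsetting, via OFFSET ... FETCH, which requires an ORDER BY (column names here are placeholders):
SELECT id, col1, col2
FROM my_table
ORDER BY id
OFFSET 300009 ROWS FETCH NEXT 10 ROWS ONLY;  -- skips 300,009 rows, returns the next 10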
When working with large tables, it is often a good idea to make use of the partitioning techniques available in SQL Server.
The rules of your partition function typically dictate that only a range of data can reside within a given partition. You could split your partitions by date range or by ID, for example.
In order to select from a particular partition you would use a query similar to the following.
SELECT <Column Name1>, ...
FROM <Table Name>
WHERE $PARTITION.<Partition Function Name>(<Column Name>) = <Partition Number>
Take a look at the following white paper for more detailed information on partitioning in SQL Server 2005.
http://msdn.microsoft.com/en-us/library/ms345146.aspx
I hope this helps; please feel free to pose further questions.
Cheers, John
I use wrapper queries around the core query and then isolate just the ROW numbers I wish to take from it. This lets the SQL server do all the heavy lifting inside the CORE query and pass back only the small slice of the table I requested. All you need to do is pass the [start_row_variable] and the [end_row_variable] into the SQL query.
NOTE: The ORDER clause is specified OUTSIDE the core query, in [sql_order_clause].
w1 and w2 are aliases for the derived (wrapper) tables; they are not actual temporary tables.
SELECT w1.*
FROM (
    SELECT w2.*,
           ROW_NUMBER() OVER ([sql_order_clause]) AS ROW
    FROM (
        <!--- CORE QUERY START --->
        SELECT [columns]
        FROM [table_name]
        WHERE [sql_string]
        <!--- CORE QUERY END --->
    ) AS w2
) AS w1
WHERE ROW BETWEEN [start_row_variable] AND [end_row_variable]
This method has hugely optimized my database systems. It works very well.
IMPORTANT: Be sure to always explicitly specify only the exact columns you wish to retrieve in the core query, as fetching unnecessary data in these CORE queries can cost you serious overhead.
Use TOP to select only a limited amount of rows, like:
SELECT TOP 10 * FROM my_table WHERE ID >= 300010
Add an ORDER BY if you want the results in a particular order.
To be efficient, there has to be an index on the ID column.