Behavior of IgniteCache.loadCache - ignite

I am using IgniteCache.loadCache(null, keyClassName, sqlArray) to load RDBMS data into the cache by running the SQL queries specified in sqlArray.
It looks like loadCache runs the queries in sqlArray on an internal thread pool (each query is run as a separate task).
My question is:
Does IgniteCache control the parallelism internally? I have the following scenario:
My datasource's max connection is set to 200.
The sqlArray's length is about 1000 since I have a large table:
select * from person where id >=0 and id <=20000
...
select * from person where id >=10000000 and id <=10020000
If all 1000 queries run at the same time, the connection pool will run out of connections, which will lead to an error.

IgniteCache.loadCache relies entirely on the configured CacheStore implementation. It looks like CacheAbstractJdbcStore supports parallelism internally.
By default, the pool size equals the number of available processors, but you are free to change it with the CacheAbstractJdbcStore.setMaximumPoolSize(int) method.
Since each query runs as a task on that pool, at most that many queries should hold a connection at the same time. So you will run out of connections only if you have more than 200 processors available.
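To illustrate why a bounded pool keeps 1000 queries from needing 1000 connections, here is a minimal Python sketch of the general idea. It does not use Ignite; the helpers build_range_queries and run_with_bounded_pool are hypothetical, the pool size of 8 is an arbitrary stand-in for the processor count, and the sleep stands in for borrowing a connection and running the query. The ranges are generated half-open (id < upper bound), which is my own choice to avoid loading boundary rows twice.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def build_range_queries(table, step, count):
    """Build `count` range queries, each covering `step` ids (hypothetical helper)."""
    return [
        f"select * from {table} where id >= {i * step} and id < {(i + 1) * step}"
        for i in range(count)
    ]


def run_with_bounded_pool(queries, pool_size):
    """Run one simulated task per query and return the peak number running at once."""
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def task(sql):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.001)  # stand-in for borrowing a connection and running `sql`
        with lock:
            state["active"] -= 1

    # Tasks beyond pool_size wait in the executor's queue until a worker is free.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        list(pool.map(task, queries))
    return state["peak"]


queries = build_range_queries("person", 20000, 1000)
peak = run_with_bounded_pool(queries, pool_size=8)
print(f"{len(queries)} queries, at most {peak} running at once")
```

The peak never exceeds the pool size, so a connection pool only needs to be at least as large as the thread pool, not as large as the query list.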

Related

Extended Events connection_id vs client_connection_id

Hello guys, I want to find a way to identify an executed query in Extended Events in Microsoft SQL Server (so I can filter the Extended Events output down to just that query).
If I query the system views in SQL Server like this:
SELECT session_id, connection_id
FROM sys.dm_exec_requests
WHERE session_id = @@SPID
I get the connection_id of the current query executing which is unique until SQL Server restarts.
But Extended Events have a different value called 'sqlserver.client_connection_id' which is not the same identifier as 'connection_id' from the table 'sys.dm_exec_requests'.
Do you know where I can find the 'sqlserver.client_connection_id' in the system tables? Or is there another way to uniquely identify an executed query?
The client_connection_id in Extended Events (according to SSMS)
Provides the optional identifier provided at connection time by a client
and is the SqlConnection.ClientConnectionId, which is intended to support troubleshooting client connectivity issues.
You can locate the connection ID in the extended events log to see if
the failure was on the server if the extended event for logging
connection ID is enabled. You can also locate the connection ID in the
connection ring buffer (Connectivity troubleshooting in SQL Server
2008 with the Connectivity Ring Buffer) for certain connection errors.
If the connection ID is not in the connection ring buffer, you can
assume a network error.
So this ID correlates the client side and the server side of the connection attempt. For successful connections, a row will be created in sys.dm_exec_connections and in sys.dm_exec_sessions, with different IDs.
I'm trying to create an Extended Events session that captures error_reported for all queries, and then filter the results in the .xel file using an identifier that tells me which query each error came from.
You can capture the query text in the error_reported event, e.g.:
CREATE EVENT SESSION [errors] ON SERVER
ADD EVENT sqlserver.error_reported(
ACTION
(
sqlserver.client_app_name,
sqlserver.session_id,
sqlserver.sql_text
)
WHERE ([severity]>=(11)))
By default, Extended Events tracks all of the connections and activity on the instance. The filters in your session definition limit that.
The sqlserver.client_connection_id is recorded for every query - so if you DID know the client connection ID, you could identify those results.
I'm not clear on what you are trying to filter for with the Extended Event. Are you looking to see where a specific query was executed from, or to track all the queries on a specific connection?
The other places you can look to get the same connection info are:
SELECT * FROM sys.dm_exec_connections
SELECT * FROM sys.dm_exec_sessions
SELECT * FROM sys.dm_exec_requests
Looking at these might help you make the connection between them.

Simple queries take very long

When I execute a query for the first time in DBeaver it can take up to 10-15 seconds to display the result. In SQLDeveloper those queries only take a fraction of that time.
For example:
Simple "select column1 from table1" statement
DBeaver: 2006ms,
SQLDeveloper: 306ms
Example 2 (the other way around, so there's no server-side caching):
Simple "select column1 from table2" statement
SQLDeveloper: 252ms,
DBeaver: 1933ms
DBeaver's status box says:
1. Fetch resultset
2. Discover attribute column1
3. Find attribute column1
4. Late bind attribute column1
Steps 2, 3 and 4 use most of the query execution time.
I'm using oracle 11g, SQLDeveloper 4.1.1.19 and DBeaver 3.5.8.
See http://dbeaver.jkiss.org/forum/viewtopic.php?f=2&t=1870
What could be the cause?
DBeaver looks up some metadata related to objects in your query.
On an Oracle DB, it queries catalog tables such as
SYS.ALL_ALL_TABLES / SYS.ALL_OBJECTS - only once after connection, for the first query you execute
SYS.ALL_TAB_COLS / SYS.ALL_INDEXES / SYS.ALL_CONSTRAINTS / ... - I believe each time you query a table not used before.
Version 3.6.10 introduced an option to enable or disable a hint used in those queries. Disabling the hint made a huge difference for me. The option is in the Oracle Properties tab of the connection edit dialog. Have a look at issue 360 on DBeaver's GitHub for more info.
The best way to get insight is to perform a database trace.
Run the query a few times first to eliminate the caching effect.
Then repeat the following steps in both IDEs:
1. Activate the trace:
ALTER SESSION SET tracefile_identifier = test_IDE_xxxx;
alter session set events '10046 trace name context forever, level 12'; /* binds + waits */
Replace xxxx with a string that identifies the test. You will see this string as part of the trace file name.
Use level 12 to see the wait events and bind variables.
2. Run the query.
3. Close the connection.
This is important so that nothing else gets traced.
Examine the two trace files to see:
which statements were executed
how many rows were fetched
how much time elapsed in the DB
The client (IDE) is responsible for the rest of the time.
This should give you enough evidence to say whether one IDE behaves differently from the other, or whether the DB statements they issue are simply different.

SAS and Oracle error: ORA-04031

I am using an ORACLE db with SAS/connect.
I recently implemented a change in my libname statement (a week ago) in which I added the following (don't know if related to issue):
insertbuff=10000 updatebuff=10000 readbuff=10000
Starting yesterday, I have been having an ORACLE issue when, after doing a
proc sql;
drop table oralib.mytable;
quit;
data oralib.mytable;
set work.mytable;
run;
I get the following error:
ERROR: ERROR: ERROR: ORACLE execute error: ORA-04031: unable to
allocate 4160 bytes of shared memory ("shared pool","unknown
object","sga heap(1,0)","modification "). With the occurrence of the above ERROR, the error limit of 1 set by the
ERRLIMIT= option has been reached. ROLLBACK has been issued(Any Rows processed after the last COMMIT are lost).
Total rows processed: 1001
Rows failed : 1
It seems to happen randomly, on any table of any size. Sometimes it goes through; most of the time it doesn't. Is there a shared pool release I should do from SAS?
Thanks for your help!
The shared pool is a memory structure on Oracle which keeps the following stuff:
data dictionary cache
SQL query and PL/SQL function result caches
storage for recently executed code in its parsed form
It is possible to flush the shared pool, but this is not a good idea and I would not recommend it. What you have to do is size the shared pool of the database properly. Note that the shared pool is shared by the entire Oracle instance; it is not allocated per user. So, if there are other users of the database, they might contribute to the problem. I doubt that any particular query is the cause; my guess is that the shared pool is undersized.
In case you have some DBA privileges granted for your user, you can check the current shared pool size by running the following query:
SELECT * FROM v$sgainfo;
You can increase the size of the shared pool with the following statement:
ALTER SYSTEM SET SHARED_POOL_SIZE = 200M;
Nevertheless, the best solution is to turn to the DBA managing the database (if there is one).
I'm not a SAS guy, so, I'll answer your question from the POV of an Oracle DBA, which is what I am.
ORA-04031 means you're running out of space in the shared pool. With a product like SAS, I'm sure they have a recommended minimum size for the shared pool. So, you should check the SAS installation documentation and confirm whether your database has a large enough shared pool size set. (Use show parameter shared_pool_size to see what size it's set to in your database.)
Second, I'm not familiar with the changes you made, and I'm not sure if that would have an effect on the shared pool utilization. Perhaps check the SAS documentation on that one.
Third, it could be an Oracle bug. You should check My Oracle Support for your version of Oracle, and do a search on ORA-04031, with those specific arguments you are seeing in your error message. If it's a known bug, there may be a patch already available.
If it's none of the above, you may need to open an SR with Oracle.
Hope that helps.

Are Variables Being Passed In This SQL Statement?

I'm developing an application that pulls information from a Firebird SQL database (accessed via the network) that sits behind an existing application.
To get an idea of what the most commonly used tables are in the application, I've run Wireshark while using the application to capture the SQL statements that are transmitted to the database server when the program is running.
I have no problem viewing what tables are being accessed via the application, however some of the query values passed over the network are not being displayed in the captured SQL packets. Instead these values are replaced with what I assume is a variable of some sort.
Here's a sample query:
select * from supp\x0d\x0aWHERE SUPP.ID=? /* BIND_0 */ \x0d\x0a
(I am assuming \x0d\x0a is used to denote a newline in the SQL query)
Has anyone any idea how I may be able to view the values associated with BIND_0 or /* BIND_0 */?
Any help is much appreciated.
P.S. The version of Firebird I am using is 1.5 - I understand there are syntactical differences in the SQL used in this version and more recent versions.
That /* BIND_0 */ is simply a comment (probably generated by the tool that generated the query), the placeholder is the question mark before that. In Firebird statements are - usually - first prepared by sending the query text (with or without placeholders) to the server with operation op_prepare_statement = 68 (0x44). The server then returns a description of the bind variables and the datatypes of the result set.
When the query is actually executed, the client will send all bind variables together with the execute request (usually in operation op_execute = 63 (0x3F)) in a structure called the XSQLDA.
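As a rough aid for finding those two kinds of packets in a capture, the Python sketch below labels a payload by its leading operation code. It is only a sketch under stated assumptions: the payload has already been extracted from the TCP stream and starts at a request boundary, and the operation code is assumed to be a 4-byte big-endian integer at the start of the request. The function name operation_name is hypothetical, only the two codes quoted above are mapped, and nothing here decodes the XSQLDA contents.

```python
import struct

# Operation codes quoted in the answer above; all other codes are left unmapped.
OP_NAMES = {
    68: "op_prepare_statement",  # 0x44: query text with placeholders
    63: "op_execute",            # 0x3F: carries the bind variable values
}


def operation_name(payload: bytes) -> str:
    """Return the name of the operation at the start of a captured payload."""
    if len(payload) < 4:
        raise ValueError("payload too short to contain an operation code")
    # Assumption: the code is a 4-byte big-endian integer at offset 0.
    (opcode,) = struct.unpack(">i", payload[:4])
    return OP_NAMES.get(opcode, f"unknown ({opcode})")


print(operation_name(bytes.fromhex("00000044")))
```

Packets labelled op_execute that follow a prepare for the statement you care about are the ones worth inspecting for the bound values.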

How do I display the query time when a query completes in Vertica?

When using vsql, I would like to see how long a query took to run once it completes. For example, when I run:
select count(distinct key) from schema.table;
I would like to see an output like:
5678
(1 row)
total query time: 55 seconds.
If this is not possible, is there another way to measure query time?
In vsql type:
\timing
and then hit Enter. You'll like what you'll see :-)
Repeating that will turn it off.
Regarding the other part of your question:
is there another way to measure query time?
Vertica can log a history of all queries executed on the cluster which is another source of query time. Before 6.0 the relevant system table was QUERY_REPO, starting with 6.0 it is QUERY_REQUESTS.
Assuming you're on 6.0 or higher, QUERY_REQUESTS.REQUEST_DURATION_MS will give you the query duration in milliseconds.
Example of how you might use QUERY_REQUESTS:
select *
from query_requests
where request_type = 'QUERY'
and user_name = 'dbadmin'
and start_timestamp >= CURRENT_DATE
and request ilike 'select%from%schema.table%'
order by start_timestamp;
The QUERY_PROFILES.QUERY_DURATION_US and RESOURCE_ACQUISITIONS.DURATION_MS columns may also be of interest to you. Here are the short descriptions of those tables in case you're not already familiar:
RESOURCE_ACQUISITIONS - Retains information about resources (memory, open file handles, threads) acquired by each running request for each resource pool in the system.
QUERY_PROFILES - Provides information about queries that have run.
I'm not sure how to enable that in vsql or if that's possible. But you could get that information from a script.
Here's a sketch in Perl (which I used to use):
print time, "\n";   # epoch seconds before the query
system("vsql -c 'select * from table'");
print time, "\n";   # epoch seconds after the query
Or put the time into a variable and do some subtraction.
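The "variable and subtraction" variant might look like this in Python. It is a minimal sketch: timed_run is a hypothetical helper, and the vsql command line in the comment is only an example invocation that assumes vsql is on the PATH and can connect without further options.

```python
import subprocess
import sys
import time


def timed_run(command):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(command)
    elapsed = time.perf_counter() - start
    return completed.returncode, elapsed


# Example invocation (assumes vsql is installed and configured):
#   timed_run(["vsql", "-c", "select count(distinct key) from schema.table;"])
# A harmless command is timed here so the sketch runs anywhere.
code, elapsed = timed_run([sys.executable, "-c", "pass"])
print(f"exit code {code}, total time: {elapsed:.3f} seconds")
```

Note that this measures wall-clock time for the whole vsql process, including startup and connection, not just the time the query spent in Vertica.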
The other option is to use some tool like Toad to connect to Vertica instead of using vsql.