why would a SQLCLR proc run slower than the same code client side - sql-server-2005

I am writing a stored procedure that when completed will be used to scan staging tables for bogus data on a column by column basis.
Step one in the exercise was just to scan the table --- which is what the code below does. The issue is that this code runs in 5:45 seconds --- however the same code run as a console app (changing the connectionstring of course) runs in about 44 seconds.
using (SqlConnection sqlConnection = new SqlConnection("context connection=true"))
{
sqlConnection.Open();
string sqlText = string.Format("select * from {0}", source_table.Value);
int count = 0;
using (SqlCommand sqlCommand = new SqlCommand(sqlText, sqlConnection))
{
SqlDataReader reader = sqlCommand.ExecuteReader();
while (reader.Read())
count++;
SqlDataRecord record = new SqlDataRecord(new SqlMetaData("rowcount", SqlDbType.Int));
SqlContext.Pipe.SendResultsStart(record);
record.SetInt32(0, count);
SqlContext.Pipe.SendResultsRow(record);
SqlContext.Pipe.SendResultsEnd();
}
}
However the same code (different connection string of course) runs in a console app in about 44 seconds (which is closer to what I was expecting on the client side)
What am I missing on the SP side, that would cause it to run so slow.
Please note: I fully understand that if I wanted a count of rows, I should use the count(*) aggregation --- that's not the purpose of this exercise.

The type of code you are writing is highly susceptible to SQL Injection. Rather than processing the reader like you are, you could just use the RecordsAffected Property to find the number of rows in the reader.
EDIT:
After doing some research, the difference you are seeing is a by design difference between the context connection and a regular connection. Peter Debetta blogged about this and writes:
"The context connection is written such that it only fetches a row at a time, so for each of the 20 million some odd rows, the code was asking for each row individually. Using a non-context connection, however, it requests 8K worth of rows at a time."
http://sqlblog.com/blogs/peter_debetta/archive/2006/07/21/context-connection-is-slow.aspx

Well it would seem the answer is in the connection string after all.
context connection=true
versus
server=(local); database=foo; integrated security=true
For some bizzare reason, using the "external" connection the SP runs almost as fast as a console app (still not as fast mind you! -- 55 seconds)
Of course now the assembly has to be deployed as External rather than Safe --- and that introduces more frustration.

Related

EFUtilities with EFProfiler running

I was wondering if there was a way to get EFUtilities running at the same time EFProfiler is running.
I appreciate the profiler would not show the bulk insert due to it being done outside the confines of DBContext. At the moment, I cannot run batch jobs as the profiler has the connection wrapped. It Runs fine when not enabled
The exception I am getting is thus:
A first chance exception of type 'System.InvalidOperationException'
occurred in EntityFramework.Utilities.dll
Additional information: No provider supporting the InsertAll operation
for this datasource was found
The inner exception is null.
This is because EFUtilities automatically finds the correct provider. But when the connection is wrapped this is no longer possible.
InsertAll looks like this.
public void InsertAll<TEntity>(IEnumerable<TEntity> items, DbConnection connection = null, int? batchSize = null)
To use the SqlProvider (which is actually the only provider out of the box) you can create a new SqlConnection() and pass that to insert all.
So basically you would need to do this:
using (var db = new YourContext())
using (var con = new SqlConnection(YourConnectionString))
{
EFBatchOperation.For(db, db.PartialTestClass1).InsertAll(partials, con);
}
Now, maybe you are doing more and want both parts to run under the same transaction. In that case you can wrap that code block in a TransactionScope.

Data is not properly stored to hsqldb when using pooled data source by dbcp

I'm using hsqldb to create cached tables and indexed tables.
The data being stored has pretty high frequency so I need to use a connection pool.
Also because there is a lot of data I do not call checkpoint on every commit, but rather expect the data to be flushed after 50,000 rows are inserted.
So the thing is that I can see the .data file is growing but when I connect with hsqldb client I don't see the tables and the data.
So I had 2 simple tests, one inserted single row and one inserted 60,000 rows to new table. In both cases I couldn't see the result in any hsqldb client.
(Note that I use shutdown=true)
So when I add checkpoint after each commit, it solve the problem.
Also if specify in the connection string to use log, it solves the problem (I don't want the log in production though). Also not using pooled connection solved the problem and last is using pooled data source and explicitly close it before shutdown.
So I guess that some connections in the connection pool are not being closed, preventing from the db to somehow commit the changes and make them available for the client. But then, why couldn't I see the result even with 60,000 rows?
I also would expect the pool to be closed automatically...
What am I doing wrong? What is happening behind the scene?
The code to get the data source looks like this:
Class.forName("org.hsqldb.jdbcDriver");
String url = "jdbc:hsqldb:" + m_dbRoot + dbName + "/db" + ";hsqldb.log_data=false;shutdown=true;hsqldb.nio_data_file=false";
ConnectionFactory connectionFactory = new DriverManagerConnectionFactory(url, user, password);
GenericObjectPool connectionPool = new GenericObjectPool();
KeyedObjectPoolFactory stmtPool = new GenericKeyedObjectPoolFactory(null);
new PoolableConnectionFactory(connectionFactory, connectionPool, stmtPool, null, false, true);
DataSource ds = new PoolingDataSource(connectionPool);
And I'm using this Pooled data source to create table:
Connection c = m_dataSource.getConnection();
Statement st = c.createStatement();
String script = String.format("CREATE CACHED TABLE IF NOT EXISTS %s (id %s NOT NULL, entity %s NOT NULL, PRIMARY KEY (id));", m_tableName, m_idGenerator.getIdType(), TABLE_ENTITY_TYPE);
st.execute(script);
c.close;
st.close();
And insert rows:
Connection c = m_dataSource.getConnection();
c.setAutoCommit(false);
Statement stmt = c.prepareStatement(m_sqlInsert);
stmt.setObject(1, id);
stmt.setBinaryStream(2, Serializer.Helper.serialize(m_serializer, entity));
stmt.executeUpdate();
stmt.close();
stmt = null;
c.commit();
c.close();
stmt.close();
so the above seems to add data but it cannot be seen.
When I explicitly called
connectionPool.close();
Then and only then I could see the result.
I also tried to use JDBCDataSource and it worked as well.
So what is going on? And what is the right way to do this?
Your method of accessing the database from outside your application process is simply wrong.
Only one java process is supposed to connect to the file: database.
In order to achieve your aim, launch an HSQLDB server within your application, using exactly the same JDBC URL. Then connect to this server from the external client.
See the Guide:
http://www.hsqldb.org/doc/2.0/guide/listeners-chapt.html#lsc_app_start
Update: The OP commented that the external client was used after the application had stopped. Because you have turned the log off with hsqldb.log_data=false, nothing is persisted permanently. You need to perform an explicit CHECKPOINT or SHUTDOWN when your application completes its work. You cannot rely on shutdown=true at all, even without connection pooling.
See the Guide:
http://www.hsqldb.org/doc/2.0/guide/deployment-chapt.html#dec_bulk_operations

jdbc statement connection

QUESTION: can YOU use multiple statements and recordset, which operate simultaneously, using the same connection in a non MULTI THREAD?
I only found this question which interests me, but the answer is not consistent.
JDBC Statement/PreparedStatement per connection
The answer explain the relationship between recordset and statement, which is known to me.
Given that, you can not have multiple recordsets for statement
The answer says that you can have multiple recordsets for connection. But they are not mentioned any other sources.
I'm asking if it's possible to loop over the first recordset, then using the same connection (used to generate first recordset) to open another recordset use it looping in iteration. And where is the documentation that define this behavior?
The situation that interests me is like this, the statement perform tasks simultaneously ins
Connection con = Factory.getDBConn (user, pss, endpoint, etc);
Statement stmt = con.createStatement ();
ResultSet rs = stmt.executeQuery ("SELECT TEXT FROM dba");
while (rs.next ()) {
rs.getInt (....
rs.getInt (....
rs.getInt (....
rs.getInt (....
Statement stmt2 con.createStatement = ();
ResultSet rs2 = stmt2.executeQuery ("iSelect ......");
while (rs2.next ()) {
....
rs2.close ();
stm2.close ();
Statement stmt3 con.createStatement = ();
ResultSet rs3 = stmt3.executeQuery ("Insert Into table xxx ......");
....
rs3.close ();
stm3.close ();
}
To clarify a bit more: with the execution of update in stmt3, you could obtain an error like this:
java.sql.SQLException: There is an open result set on the current connection, which must be closed prior to executing a query.
So you can't mix SQL in the same connection.
If I understand correctly, you need to work with two (or more) resultsets simmultaneously within a single method.
It is possible, and it works well. But you have to remember a few things:
Everything you do on each recordset is handled by a single Connection, unless you declare new connections for each Statement (and ResultSet)
If you need to do a multithreaded process, I suggest you to create a Connection for each thread (or use a connection pool); if you use a single connection in a multithreaded process, your program will hang or crash, since every SQL statement goes through a single connection, and every new statement has to wait until the previous one has finnished.
Besides that, your question needs some clarification. What do you really need to do?
A ResultSet object is automatically closed when the Statement object that generated it is closed, re-executed, or used to retrieve the next result from a sequence of multiple results.
http://docs.oracle.com/javase/6/docs/api/java/sql/ResultSet.html
SQL Server is a database that does support multiple record sets. So you can exceute a couple of queries in a single stored procedure for example
SELECT * FROM employees
SELECT * FROM products
SELECT * FROM depts
You can then move between each record set. At least I know you can do this in .Net for example
using (var conn = new SqlConnection("connstring"))
using (var command = new SqlCommand("SPName", conn))
{
conn.Open();
command.CommandType = CommandType.StoredProcedure;
var (reader = command.ExecuteReader())
{
while(reader.Read())
{
//Process all records from first result set
}
reader.Next();
while(reader.Read())
{
//Process all records from 2nd result set
}
reader.Next();
while(reader.Read())
{
//Process all records from 3rd result set
}
}
}
I am assuming that java would support a similar mechanism

SELECT through oledbcommand in vb.net not picking up recent changes

I'm using the following code to work out the next unique Order Number in an access database. ServerDB is a "System.Data.OleDb.OleDbConnection"
Dim command As New OleDb.OleDbCommand("", serverDB)
command.CommandText = "SELECT max (ORDERNO) FROM WORKORDR"
iOrder = command.ExecuteScalar()
NewOrderNo = (iOrder + 1)
If I subsequently create a WORKORDR (using a different DB connection), the code will not pick up the new "next order number."
e.g.
iFoo = NewOrderNo
CreateNewWorkOrderWithNumber(iFoo)
iFoo2 = NewOrderNo
will return the same value to both iFoo and iFoo2.
If I Close and then reopen serverDB, as part of the "NewOrderNo" function, then it works. iFoo and iFoo2 will be correct.
Is there any way to force a "System.Data.OleDb.OleDbConnection" to refresh the database in this situation without closing and reopening the connection.
e.g. Is there anything equivalent to serverdb.refresh or serverdb.FlushCache
How I create the order.
I wondered if this could be caused by not updating my transactions after creating the order. I'm using an XSD for the order creation, and the code I use to create the record is ...
Sub CreateNewWorkOrderWithNumber(ByVal iNewOrder As Integer)
Dim OrderDS As New CNC
Dim OrderAdapter As New CNCTableAdapters.WORKORDRTableAdapter
Dim NewWorkOrder As CNC.WORKORDRRow = OrderDS.WORKORDR.NewWORKORDRRow
NewWorkOrder.ORDERNO = iNewOrder
NewWorkOrder.name = "lots of fields filled in here."
OrderDS.WORKORDR.AddWORKORDRRow(NewWorkOrder)
OrderAdapter.Update(NewWorkOrder)
OrderDS.AcceptChanges()
End Sub
From MSDN
Microsoft Jet has a read-cache that is
updated every PageTimeout milliseconds
(default is 5000ms = 5 seconds). It
also has a lazy-write mechanism that
operates on a separate thread to main
processing and thus writes changes to
disk asynchronously. These two
mechanisms help boost performance, but
in certain situations that require
high concurrency, they may create
problems.
If you possibly can, just use one connection.
Back in VB6 you could force the connection to refresh itself using ADO. I don't know whether it's possible with VB.NET. My Google-fu seems to be weak today.
You can change the PageTimeout value in the registry but that will affect all programs on the computer that use the Jet engine (i.e. programmatic use of Access databases)
I always throw away a Connection Object after I used it. Due to Connection Pooling getting a new Connection is cheap.

Transaction timeout expired while using Linq2Sql DataContext.SubmitChanges()

please help me resolve this problem:
There is an ambient MSMQ transaction. I'm trying to use new transaction for logging, but get next error while attempt to submit changes - "Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding." Here is code:
public static void SaveTransaction(InfoToLog info)
{
using (TransactionScope scope =
new TransactionScope(TransactionScopeOption.RequiresNew))
{
using (TransactionLogDataContext transactionDC =
new TransactionLogDataContext())
{
transactionDC.MyInfo.InsertOnSubmit(info);
transactionDC.SubmitChanges();
}
scope.Complete();
}
}
Please help me.
Thx.
You could consider increasing the timeout or eliminating it all together.
Something like:
using(TransactionLogDataContext transactionDC = new TransactionLogDataContext())
{
transactionDC.CommandTimeout = 0; // No timeout.
}
Be careful
You said:
thank you. but this solution makes new question - if transaction scope was changed why submit operation becomes so time consuming? Database and application are on the same machine
That is because you are creating new DataContext right there:
TransactionLogDataContext transactionDC = new TransactionLogDataContext())
With new data context ADO.NET opens up new connection (even if connection strings are the same, unless you do some clever connection pooling).
Within transaction context when you try to work with more than 1 connection instances (which you just did)
ADO.NET automatically promotes transaction to a distributed transaction and will try to enlist it into MSDTC. Enlisting very first transaction per connection into MSDTC will take time (for me it takes 30+ seconds), consecutive transactions will be fast, however (in my case 60ms). Take a look at this http://support.microsoft.com/Default.aspx?id=922430
What you can do is reuse transaction and connection string (if possible) when you create new DataContext.
TransactionLogDataContext tempDataContext =
new TransactionLogDataContext(ExistingDataContext.Transaction.Connection);
tempDataContext.Transaction = ExistingDataContext.Transaction;
Where ExistingDataContext is the one which started ambient transaction.
Or attemp to speed up your MS DTC.
Also do use SQL Profiler suggested by billb and look for SessionId between different commands (save and savelog in your case). If SessionId changes, you are in fact using 2 different connections and in that case will have to reuse transaction (if you don't want it to be promoted to MS DTC).