bulk insert 3M records to SQLExpress - sql-server-express

While trying to bulk insert 3M records from a CSV file into a SQL Express database, the procedure throws a timeout exception; the timeout was set to 30s. I tried setting the Connect Timeout to 1800, but the procedure threw the same exception again.
Does anyone know whether the exception is thrown because there are too many records or because the timeout was not set correctly?
Below are the connection string, the query statement and a sample row from the file:
connectionString = "Data Source=.\SQLEXPRESS;AttachDbFilename=simulatorDB.mdf;Integrated Security=True;Connect Timeout=1800;User Instance=True"
query = "BULK INSERT real_data FROM '" + path + "' WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')"
AFAB19476C2CEEEE101FFA45FD207BA8B6185B29,539EE0643AFC3A3BE3D20DC6BE7D5376DC536D34,9800,58,29,24,34,2
I would be very thankful if anyone suggested a fix for the described issue.
Thank you!

It's not the connection timeout you need to set - what you need to increase is the Command Timeout.
As for how long importing 3M records will take, it depends on the table you are importing into - e.g. whether it's a new table or an existing table with data/indexes already on it, and whether the table is actively being used by other processes.
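For illustration, here is a minimal sketch (VB.NET, System.Data.SqlClient) of issuing the BULK INSERT through a SqlCommand with a longer CommandTimeout; the connection string and statement are the ones from the question, while the CSV path and the 1800-second value are only examples.
Imports System.Data.SqlClient

Module BulkInsertTimeoutSketch
    Sub Main()
        ' Connect Timeout only governs how long opening the connection may take;
        ' CommandTimeout is what limits how long the BULK INSERT itself may run.
        Dim connectionString As String = _
            "Data Source=.\SQLEXPRESS;AttachDbFilename=simulatorDB.mdf;" & _
            "Integrated Security=True;User Instance=True"
        Dim path As String = "C:\data\real_data.csv"   ' hypothetical CSV location
        Dim query As String = _
            "BULK INSERT real_data FROM '" & path & "' " & _
            "WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')"

        Using con As New SqlConnection(connectionString)
            Using cmd As New SqlCommand(query, con)
                cmd.CommandTimeout = 1800   ' seconds; 0 would mean wait indefinitely
                con.Open()
                cmd.ExecuteNonQuery()
            End Using
        End Using
    End Sub
End Module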

Related

SSRS - Ignore if Database Doesn't Exist

I have a report created in SSRS which has a number of data sources.
On very rare occasions, one of those data sources may have been detached. This happened recently, as the SSAS Database was temporarily detached on the Dev server to free up memory for a large job.
When that happened, the entire report refused to run, throwing out the error "Query Execution failed for dataset 'DatasetName' (rsErrorExecutingCommand)
Either the user, 'UserName', does not have access to the 'DBName' database, or the database does not exist.
Is there any way to amend either the dataset, or perhaps the query in the data source, so that if the query fails (because the DB is down / detached) it still runs everything else, but perhaps shows an error on the report?
[EDIT]
With assistance from Bushell - this is what I ended up using:
1) On the SQL Server (which should always be up) - I created a Linked Server to the SSAS instance
2) Changed my Datasource in SSRS to point to the SQL Server instead of the SSAS instance
3) Used this query (see below) to check whether the SSAS linked server was up - I haven't been able to test with SSAS down, but it does work while it's up!
(if anyone reading this is using the same method, you'd just have to replace my 'Select Distinct ... etc.' with your own query)
BEGIN TRY
    EXEC sp_testlinkedserver N'SSAS_LinkedServer';
    EXEC sp_executesql N'SELECT * FROM OPENQUERY(SSAS_LinkedServer,
        ''SELECT
            DISTINCT
            [CATALOG_NAME] as [Database],
            [CUBE_NAME],
            DIMENSION_CAPTION AS [Dimension],
            DIMENSION_CARDINALITY AS [Count]
        FROM $system.MDSchema_Dimensions
        ORDER BY DIMENSION_CARDINALITY DESC;'');';
END TRY
BEGIN CATCH
    SELECT
        '' as [Database],
        '' as [CUBE_NAME],
        '' AS [Dimension],
        '' AS [Count]
END CATCH
Thanks to Bushell for pointing me in the right direction.
I would move your datasets into callable Stored Procedures, and then use TRY/CATCH blocks to determine whether the selects run without errors. And in the instance when there is an error, just return the column headers and no rows.
BEGIN TRY
    SELECT * FROM dbo.DetachedDB
END TRY
BEGIN CATCH
    SELECT '' as [Column1], '' as [Column2]; etc....
END CATCH;
Then in your SSRS report, if the row count of a dataset is zero, toggle the visibility of the tables and show an error message which can be set to display instead.
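For the visibility toggle, one possible pair of Hidden expressions (assuming the dataset is named 'DatasetName'; CountRows is the built-in SSRS aggregate, and which table/textbox they go on is up to you) would be along these lines:
' Hidden expression on the data table:
=IIf(CountRows("DatasetName") = 0, True, False)
' Hidden expression on the "no data" message textbox, shown instead:
=IIf(CountRows("DatasetName") = 0, False, True)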

Running a Stored Proc via Excel Connection - works in SSMS but not via the VBA macro

I've successfully created many connections from an Excel file to a database, so the issue is directly related to this particular scenario only. The server is SQL Server 2012, Excel is 2013.
I have the following SP:
IF OBJECT_ID('usp_LockUnlockCustomer', 'P') IS NOT NULL
    DROP PROCEDURE usp_LockUnlockCustomer
GO
CREATE PROCEDURE usp_LockUnlockCustomer
    @Lock AS CHAR(10), @Id AS nvarchar(50), @LockedBy AS nvarchar(50) = null
AS BEGIN
    SET NOCOUNT ON;
    IF @Lock = 'Lock'
        INSERT INTO IO_Call_DB..IdLock (Id, LockedBy, LockedDtTm)
        VALUES (@Id, @LockedBy, GETDATE())
    ;
    IF @Lock = 'Unlock'
        DELETE FROM IO_Call_DB..IdLock
        WHERE Id = @Id
    ;
END;
--EXEC usp_LockUnlockCustomer 'Lock','123456789', 'Test User'
The above SP is called via some VBA as follows:
With ActiveWorkbook.Connections("usp_LockUnlockCustomer").OLEDBConnection
.CommandText = "EXECUTE dbo.usp_LockUnlockCustomer '" & bLock & "','" & Id & "','" & LockedBy & "'"
End With
I have tested the string; it is formatted correctly and contains all the required data.
The connection between Excel and SQL Server is created via "Data > From Other Sources > From SQL Server"; it's a fairly standard process which has worked for all other SPs and general queries.
I think the issue may be that this connection is not returning data to Excel (I only set up the connection rather than specifying that Excel should return data to a cell).
Has anyone experienced this issue before?
EDIT1: I have resolved the issue but it's not a particularly great outcome. Some help would be appreciated.
To resolve the issue, you have to include a "SELECT * FROM" step at the end of the stored procedure and also tell Excel to output the data to a range within the workbook. This allows the .Refresh portion of the VBA to do whatever it does and submit the SP to SQL.
Essentially, you're being forced to create a data table - but I don't want any data, I just want to submit a command.
So, how do you submit a command and not have Excel require that you 1) explicitly state where the data should be put and 2) include a SELECT statement within the stored procedure when I don't require any data to be returned?
My fix was to "SELECT TOP 0" from the table; at least that way the data table output to Excel won't grow.
In my experience, if you generate the database connection in VBA (there are multiple previous questions about that) rather than relying on an existing workbook connection, your stored procedure will execute regardless of what it returns.
The problem I have is that by merely creating the connection without specifying a cell to return data to, nothing happens.
In addition, if I specify a cell to return data to, nothing happens unless I use my 'fix' which is to create an empty table at the end of the SP.
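As a minimal sketch of the "create the connection in code" suggestion above (VBA with late-bound ADO; the connection string is an assumption and must be adjusted to your server and database):
Sub LockUnlockCustomer(bLock As String, Id As String, LockedBy As String)
    Const adExecuteNoRecords As Long = 128   ' tell ADO no recordset is expected

    Dim conn As Object
    Set conn = CreateObject("ADODB.Connection")
    conn.Open "Provider=SQLOLEDB;Data Source=MyServer;" & _
              "Initial Catalog=IO_Call_DB;Integrated Security=SSPI;"

    ' Same EXECUTE string the workbook connection was given, but run directly,
    ' so no output range and no trailing SELECT are needed in the procedure.
    conn.Execute "EXEC dbo.usp_LockUnlockCustomer '" & bLock & "','" & _
                 Id & "','" & LockedBy & "'", , adExecuteNoRecords

    conn.Close
    Set conn = Nothing
End Sub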

Bulk Updates Concept for SQL and VB.net winform

I would appreciate some ideas and discussion, and a feasibility assessment, regarding bulk updates on a large table in SQL Server 2008+.
Currently I have a table with 10,000 rows and 160 columns. This table is updated very frequently, with 1 to 100+ column changes per row depending on the process. Using the 'standard newbie' DataAdapter update is very slow and unsuitable.
The quest is to find a faster way. I have tried fine-tuning the DataAdapter.Update batch size; regardless, the heavier updates take 10-15 seconds. Meanwhile, SqlBulkCopy imports the whole table in (ballpark) 1-3 seconds. When the update procedure takes place 30-50 times in a process, the 10-15 seconds add up!
Being internet self-taught, I have gaps in my experience; however, there are two possibilities I can think of that may accomplish the update faster:
Dump the table content from the database and repopulate the table using SqlBulkCopy.
Use a stored procedure with a table passed to it and a MERGE SQL statement.
The main issue is data safety; although this is a local single-user application, there needs to be a way to handle errors and roll back. From my understanding, the dump-and-replace would be simpler, but perhaps more prone to data loss? The stored procedure would be far more extensive to set up, as the update statement would have to have all the update columns typed individually and maintained for changes. Unless there is an 'UPDATE *' statement :).
To keep this short I want to stay at a concept level only, but I will appreciate any different ideas, links or advice.
EDIT further info:
The table has only one index, the ID column. It's a simple process of storing incoming (and changing) data in a simple DataTable, and an update can be anywhere between 1 row and 1,000 rows. The program stores the information to the database very often, and an update can touch some or nearly all of the columns. Building a stored procedure for each update would be impossible, as I don't know which data will be updated; you could say that all of the columns may be updated (except the ID column and a few 'hard' data columns), depending on what the update input is. So there is no fine-tuning the update to specific columns unless I list nearly all of them each time, in which case one stored procedure would do it.
I think the issue is the number of 'calls' made to the database with the current DataAdapter method.
EDIT:
3. What about a staging table that I bulk copy the data to and then have a stored procedure do the update? Wouldn't that cut down the SQL traffic? I think that is the problem with the DataAdapter update.
Edit: Posted an attempt of concept 1 in an answer to this thread.
Thank you
Dropping the table and reloading the entire thing with a bulk copy is not the correct way.
I suggest creating a stored procedure for each process that updates the table. The procedure should take as input only the columns that need to be updated for that specific process and then run a standard SQL update command to update those columns on the specified row. If possible, try to have indexes on the column(s) that you use to find the record(s) that need to be updated.
Alternatively, depending on which version of the .NET Framework you're using, you might try Entity Framework if you don't want to maintain a whole list of stored procedures.
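To make the per-process idea concrete, here is a hedged sketch of calling such a procedure from VB.NET; the procedure name (usp_UpdateRaceResult), the Odds/Status columns and the already-open SqlConnection are hypothetical placeholders, not the poster's actual schema:
' Hypothetical per-process update: only the columns this process changes are passed in.
Sub UpdateRaceResult(con As SqlClient.SqlConnection, id As Integer, odds As Decimal, status As String)
    Using cmd As New SqlClient.SqlCommand("dbo.usp_UpdateRaceResult", con)
        cmd.CommandType = CommandType.StoredProcedure
        cmd.Parameters.AddWithValue("@ID", id)          ' indexed key used to locate the row
        cmd.Parameters.AddWithValue("@Odds", odds)      ' columns updated by this process
        cmd.Parameters.AddWithValue("@Status", status)
        cmd.ExecuteNonQuery()
    End Using
End Sub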
I have coded the following mockup to dump all rows from the table, bulk copy the in-memory table into a SQL staging table, and then move the data back into the original table, thereby updating the data in that table.
Time taken: 1.1 to 1.3 seconds - certainly a very attractive time compared to the 10-15s the update takes. I have placed the truncate code for the staging table on top so that there is always one copy of the information in the database, although the original table will not have the updated info until the process has completed.
What are the pitfalls associated with this approach? What can I do about them? I must state that the table is unlikely to ever get beyond 10,000 rows, so the process will work.
Try
    ESTP = "Start Bulk DBselection Update"
    Dim oMainQueryT = "Truncate Table DBSelectionsSTAGE"
    Using con As New SqlClient.SqlConnection(RacingConStr)
        Using cmd As New SqlClient.SqlCommand(oMainQueryT, con)
            con.Open()
            cmd.ExecuteNonQuery()
            con.Close()
        End Using
    End Using

    ESTP = "Step 1 Bulk DBselection Update"
    Using bulkCopy As SqlBulkCopy = New SqlBulkCopy(RacingConStr)
        bulkCopy.DestinationTableName = "DBSelectionsSTAGE"
        bulkCopy.WriteToServer(DBSelectionsDS.Tables("DBSelectionsDetails"))
        bulkCopy.Close()
    End Using

    ESTP = "Step 2 Bulk DBselection Update"
    oMainQueryT = "Truncate Table DBSelections"
    Using con As New SqlClient.SqlConnection(RacingConStr)
        Using cmd As New SqlClient.SqlCommand(oMainQueryT, con)
            con.Open()
            cmd.ExecuteNonQuery()
            con.Close()
        End Using
    End Using

    ESTP = "Step 3 Bulk DBselection Update"
    oMainQueryT = "Insert INTO DBSelections Select * FROM DBSelectionsSTAGE"
    Using con As New SqlClient.SqlConnection(RacingConStr)
        Using cmd As New SqlClient.SqlCommand(oMainQueryT, con)
            con.Open()
            cmd.ExecuteNonQuery()
            con.Close()
        End Using
    End Using

    Data_Base.TextBox25.Text = "Deleting data - DONE "
    Data_Base.TextBox25.Refresh()
Catch ex As Exception
    ErrMess = "ERROR - occured at " & ESTP & " " & ex.ToString
    Call WriteError()
    Call ViewError()
End Try
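One way to address the rollback concern raised above is to run the whole swap in a single transaction, so a failure at any step leaves the original table untouched. This is only a sketch built on the same names used in the code above (RacingConStr, DBSelectionsSTAGE, DBSelections, DBSelectionsDS); error handling is left to the surrounding Try/Catch:
Using con As New SqlClient.SqlConnection(RacingConStr)
    con.Open()
    Using tran As SqlClient.SqlTransaction = con.BeginTransaction()
        ' Clear and reload the staging table inside the transaction.
        Using cmd As New SqlClient.SqlCommand("Truncate Table DBSelectionsSTAGE", con, tran)
            cmd.ExecuteNonQuery()
        End Using
        Using bulkCopy As New SqlBulkCopy(con, SqlBulkCopyOptions.Default, tran)
            bulkCopy.DestinationTableName = "DBSelectionsSTAGE"
            bulkCopy.WriteToServer(DBSelectionsDS.Tables("DBSelectionsDetails"))
        End Using
        ' Swap the live table contents; if anything fails, everything rolls back.
        Using cmd As New SqlClient.SqlCommand("Truncate Table DBSelections; Insert INTO DBSelections Select * FROM DBSelectionsSTAGE", con, tran)
            cmd.ExecuteNonQuery()
        End Using
        tran.Commit()   ' only now do the changes become visible
    End Using
End Using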

Using a variable to create a variable

I have an execute SQL task that finds the path of the latest backup of a database and populates a variable with it (User::BackupFilePath)
I want to pass that into another task that will generate a restore database script and populate another variable to be used to restore the database.
Select (
'ALTER DATABASE [Database] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
RESTORE DATABASE [Database]
FROM DISK = ''' + **BackupFilePath** + ''' WITH FILE = 1, NOUNLOAD, REPLACE, STATS = 5
ALTER DATABASE [Database] SET MULTI_USER
GO'
) as RestoreScript
The second part, which would generate the string, is however returning this error message:
[Execute SQL Task] Error: Executing the query "Select 'ALTER DATABASE [xxxx..." failed with the following error: "An error occurred while extracting the result into a variable of type (DBTYPE_I4)". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.
I'm using Visual Studio 2008 Professional Edition
So far, it looks like you're having a simple problem: the variable you're assigning your command string to needs to be a string datatype.
Your error message mentions DBTYPE_I4, which is a long integer:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms711251%28v=vs.85%29.aspx
Whereas what you'd save a command string into would be a string type such as DBTYPE_STR or DBTYPE_WSTR (see the link above), which in SSIS would commonly be called DT_STR (for ASCII strings) or DT_WSTR (for Unicode strings) - see the link below:
http://technet.microsoft.com/en-us/library/ms141036.aspx
Hope that helps...
One possible cause is the 'ResultSet' property of your second SQL Task (the restore).
Make sure it's set to 'None'.

Does MS Access suppress primary key violations on Inserts?

I am in the process of re-writing an MS Access database to SQL Server and have found a strange issue in Access that I am hoping someone can help with.
I have a table, let's call it 'Main', with a primary key on Account that is indexed and doesn't allow duplicates. Seems simple enough, but my issue occurs when data is being inserted.
My INSERT query is (the number of fields has been limited for brevity):
INSERT INTO Main (Account, SentDate, Amount)
SELECT C.Account, C.SentDate, C.Amount
FROM
(CALLS C LEFT JOIN Bals B ON C.Account = B.ACCT_ID)
LEFT JOIN AggAnt A ON C.Account = A.Account
The issue is this: if I run the SELECT portion of my query I get 2365 records, but when I run the INSERT I get 2364 records. So I did some checking and found that one Account is duplicated; the difference between the records is the SentDate and the Amount. But Access is inserting only one of the records and not throwing any kind of error message. There is nothing in the query that says to select the most recent date, etc.
Sample Data:
Account SentDate Amount
12345678 8/1/2011 123.00
23456789 8/1/2011 45678.00
34567890 8/1/2011 7850.00
45678912 8/1/2011 635.00
45678912 5/1/2011 982.00
56789123 8/1/2011 2639.00
In the sample I have one account that is duplicated, 45678912. When I run my INSERT I get no errors, and I get the record from 8/1/2011.
Why is Access not throwing an error when this violates the PK on the table? Is there some quirk in Access to select one record and just skip the other?
I am totally stumped by this issue so any help would be great.
How are you running the query? If you're using DoCmd.RunSQL, switch to using the .Execute method of a DAO database object, and use dbFailOnError.
Dim db As DAO.Database
Dim strInsert As String
strInsert = "your insert statement"
Set db = CurrentDb
db.Execute strInsert, dbFailOnError
Set db = Nothing
Edit: If Main is an ODBC link to a SQL Server table, I would examine the Errors Collection (DAO) after db.Execute strInsert, dbFailOnError
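For reference, a hedged sketch of what examining that collection could look like (DAO only; the Sub name is just for illustration, and the insert statement is whatever you built above):
Sub RunInsertWithErrorDetails(strInsert As String)
    On Error GoTo InsertFailed
    Dim db As DAO.Database
    Set db = CurrentDb
    db.Execute strInsert, dbFailOnError
    Set db = Nothing
    Exit Sub
InsertFailed:
    ' Each item in DBEngine.Errors describes one error returned for the failed Execute.
    Dim errItem As DAO.Error
    For Each errItem In DBEngine.Errors
        Debug.Print errItem.Number & " - " & errItem.Description
    Next errItem
    Set db = Nothing
End Sub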
After HansUp pointed me in the direction of checking for SetWarnings = False, I found it buried in my code, which is why there was no warning message about the records not being inserted due to primary key violations.
A word of caution would be to make sure you want these messages suppressed.
Is there some quirk in Access to [update] one record and just skip the other?
Yes, you can control this behaviour at the engine level (also at the recordset level if using OLE DB).
For OLE DB (e.g. ADO) the setting is Jet OLEDB:Global Partial Bulk Ops:
determines the behavior of the Jet database engine when SQL DML bulk operations fail. When set to allow partial completion of bulk operations, inconsistent changes can occur because operations on some records could succeed and others could fail. When set to allow no partial completion of bulk operations, all changes are rolled back if a single error occurs. The Jet OLEDB:Global Partial Bulk Ops property setting can be overridden on a per-Recordset basis by setting the Jet OLEDB:Partial Bulk Ops property in the Properties collection of a Recordset object.
Note the default is to allow no partial completion of bulk operations.
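As a small, hedged illustration (VBA + ADO; the Jet provider string and the .mdb path are assumptions), the engine-level property quoted above can be inspected through the connection's Properties collection:
Dim cn As Object
Set cn = CreateObject("ADODB.Connection")
' Point the Data Source at the Access database in question (path is hypothetical).
cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\data\MyDatabase.mdb"
' Engine-level setting described above; it can also be overridden per Recordset
' via the "Jet OLEDB:Partial Bulk Ops" property.
Debug.Print cn.Properties("Jet OLEDB:Global Partial Bulk Ops").Value
cn.Close
Set cn = Nothing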