I have a local data frame of more than 4000 rows and around 10 columns. I am currently using the dbWriteTable function to write the table to SQL Server from R, but it is dead slow (it takes more than 30 minutes).
Is there an alternative approach that would be faster?
Consider exporting the data frame to CSV and running SQL Server's BULK INSERT (the FORMAT = 'CSV' option requires SQL Server 2017 or later):
BULK INSERT myNewTable
FROM 'C:\Path\To\File.csv'
WITH
(
FORMAT = 'CSV',
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
TABLOCK
)
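The file has to match the options in the statement above: comma-separated fields, and a header row that FIRSTROW = 2 skips. Below is a minimal sketch in Python of writing such a file (the question uses R, where write.csv plays the same role). The function name write_bulk_csv and the sample rows are hypothetical. The CRLF line endings are an assumption: as far as I know, BULK INSERT treats ROWTERMINATOR = '\n' as a carriage return plus line feed, and an LF-only file is usually loaded with ROWTERMINATOR = '0x0a' instead.

```python
import csv
import io


def write_bulk_csv(out, header, rows):
    """Write a header line and data rows as comma-separated text with CRLF endings."""
    # Fields containing commas or quotes are quoted automatically (QUOTE_MINIMAL).
    writer = csv.writer(out, lineterminator="\r\n", quoting=csv.QUOTE_MINIMAL)
    writer.writerow(header)
    writer.writerows(rows)


# Hypothetical sample data; with a real file, use open(path, "w", newline="")
# so Python does not translate the line endings a second time.
buf = io.StringIO()
write_bulk_csv(buf, ["id", "name"], [(1, "Smith, J."), (2, "Lee")])
csv_text = buf.getvalue()
print(csv_text)
```

The path in the BULK INSERT statement is resolved on the machine running SQL Server, so the file has to be saved somewhere that machine can read.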
Alternatively, save the CSV in Excel format (.xlsx), or write directly from R to Excel format, and run a distributed query as a make-table action:
-- Adjust path and sheet name
SELECT *
INTO myNewTable
FROM OPENDATASOURCE('Microsoft.ACE.OLEDB.12.0',
'Data Source=C:\Path\To\File.xlsx;Extended Properties=Excel 12.0')...SheetName$
-- Alternative to the query above, using OPENROWSET
SELECT *
INTO myNewTable
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Data Source=C:\Path\To\File.xlsx;Extended Properties=Excel 12.0', SheetName$)
Notes
The user calling the action must be granted permission for bulk operations, which is a server-level right, not a database-level one. Consequently, you may not be able to run the command through R and may have to run it in the SSMS console instead.
Ad Hoc Distributed Queries must be enabled on the server (it is a server configuration option) to connect to remote data sources that use OLE DB.
Distributed queries assume that SQL Server and MS Office have the same bit architecture: both 64-bit installs, with the Access engine installed. If not, free downloads of the Access engine (2007/2010/2013/2016) are available. If SQL Server is 32-bit, use the older Microsoft.JET.OLEDB.4.0 provider and save the Excel file in the older .xls format, with Extended Properties set to Excel 8.0.
Related
In SQL Server 2016 they introduced parallel inserts into existing tables. By not having certain features on the target table SQL Server can insert the data in parallel streams.
Using the syntax of
INSERT [tableName] WITH (TABLOCK)
SELECT .....
The data will be inserted in parallel. I have seen great improvements using this. What normally would take about 10 minutes to insert 120 million rows takes only about 30 seconds with this new feature.
How can I use this new setting in SSIS? I am using Visual Studio 2015 Enterprise and SQL Server 2016.
I know I can use an "Execute SQL Task" and put something like this in, but what I'm wondering is how to use this in the Data Flow. Is there a specific Connection Manager and setting in the Destination Adapter?
In SQL Server 2016, two conditions must be met for insert operations to run in parallel. The first is that the compatibility level of the database must be set to 130. So before you run your SSIS package, check your database compatibility level:
SELECT name, compatibility_level FROM sys.databases
The second condition is using the TABLOCK hint. In an SSIS package you can choose the TABLOCK hint in the OLE DB Destination.
No, you cannot use parallel inserts in a Data Flow Task.
According to Microsoft's description of parallel inserts in SQL Server 2016, they can be used only when executing an INSERT ... SELECT ... statement, with some limitations. A Data Flow prepares the data in the memory of the SSIS server, and the OLE DB or ODBC destination then loads it with INSERT or INSERT BULK statements, which are not subject to parallel operations.
I'm running a MATLAB script in which I update a table over a SQL connection. Everything is OK if I read or update the table on SQL Server through simple MATLAB commands; the data appears perfectly. But I'm having trouble when I use the BULK INSERT command. No data is updated! However, it works from the SQL console of SQL Server Management Studio.
My code (MATLAB sample):
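% Connect to the data source
conn = database('Dados_SQL','sa','SQL#Edison');
A = {100000.00,'KGreen','06/22/2011','Challengers'};
A = A(ones(10000,1),:);
% Write all rows to a tab-delimited text file
fid = fopen('c:\temp\tmp.txt','wt');
for i = 1:size(A,1)
fprintf(fid,'%10.2f \t %s \t %s \t %s \n',A{i,1}, ...
A{i,2},A{i,3},A{i,4});
end
% Close the file so all rows are flushed to disk before the bulk insert
fclose(fid);
% Run the bulk insert once, after the file is complete
e = exec(conn,['bulk insert BULKTEST from '...
'''c:\temp\tmp.txt'' with (fieldterminator = ''\t'', '...
'rowterminator = ''\n'')']);
close(e);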
Thanks in advance!
Edison.
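MSSQL has problems reading from a drive that is not local to the database. BULK INSERT resolves the file path on the machine where SQL Server is running, so c:\temp\tmp.txt means the server's C drive, not the C drive of the machine running MATLAB. If your C drive is not the database server's drive, that might be the problem. You have two options: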
Generate a file that you save onto the database server from matlab. Perhaps you have a shared filesystem so this is easy. Or use FTP. Or even Dropbox on each server. This depends on your setup and security requirements.
Generate a large text set of all SQL insert commands you want to run, and send them to the server in a single query. This takes away the need for multiple connections to the server which is slow, but it is not as efficient on the server side as a bulk insert.
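The second option can be sketched as follows, in Python rather than MATLAB. This is a minimal illustration, not a definitive implementation: the helper names sql_literal and build_insert_batches and the column names are hypothetical (the question does not show the columns of BULKTEST). Rows are grouped into batches of 1000 because SQL Server accepts at most 1000 rows in a single VALUES list. Values are inlined as literals for illustration only; with untrusted input, a parameterized query is the safer route.

```python
def sql_literal(value):
    """Render a Python value as a T-SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return repr(value)
    # Strings: double any embedded single quotes and use an N'...' literal.
    return "N'" + str(value).replace("'", "''") + "'"


def build_insert_batches(table, columns, rows, batch_size=1000):
    """Yield one multi-row INSERT statement per batch of rows."""
    col_list = ", ".join(f"[{c}]" for c in columns)
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        values = ",\n".join(
            "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in chunk
        )
        yield f"INSERT INTO {table} ({col_list}) VALUES\n{values};"


# Hypothetical column names; the sample row is the one from the question.
columns = ["salary", "player", "signed", "team"]
rows = [(100000.00, "KGreen", "06/22/2011", "Challengers")] * 2500
batches = list(build_insert_batches("BULKTEST", columns, rows))
print(len(batches))
```

The statements can then be joined into one script and sent to the server in a single call, which avoids one round trip per row.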
I have a huge list of contact information in an Excel sheet that I would like to turn into a table in the database so that I can maintain it there.
I tried the flat file import in the SQL Server 2008 R2 import/export wizard, but it did not work. Can anyone tell me how to import the spreadsheet into a table in the database?
Thanks
There is a Microsoft Knowledge Base article that lays out all the ways this is possible.
http://support.microsoft.com/kb/321686
I think using OPENROWSET or OPENDATASOURCE will be the easiest way, without the wizard. (see Distributed Queries)
SELECT * INTO XLImport4 FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\test\xltest.xls', [Customers$])
See OPENROWSET documentation, with examples lower down the page.
http://msdn.microsoft.com/en-us/library/ms190312.aspx
Manually
Right-click the database name, go to Tasks, and select Import Data. As the source, select the Excel file that you created before and choose its path. On the next page, select SQL Server as the destination.
I need a solution to select a table from Access into a temp table in SQL Server. I looked at BULK INSERT, but from what I understand the source must be a data file, so that will not work. Also, I don't want to use the import/export wizard; this has to be done through code, as I just need a temp table to perform certain queries on. The query needs to do something like...
SELECT * FROM [Access DB] INTO #TempTable (in SQL)
Anyone got any ideas?
SELECT * INTO #TempTable
FROM [Server_Name].[Database].[Schema].[Table]
You will need to add the Access data source as a linked server in SQL Server. Go to Object Explorer --> SQL Server --> Server Objects --> Linked Servers, right-click, and follow the instructions for adding a linked server. Once you have added the Access database as a linked server, you can query it using the command above.
Or you can use OPENROWSET to query the data:
SELECT * INTO #TempTable
FROM OPENROWSET(
'Microsoft.Jet.OLEDB.4.0',
'C:\Program Files\Path_to_Access_Database_File\Database_Name.mdb';
'admin';'',Table_Name
)
Using a linked server is your best solution. Listed below is a TechNet article on setting them up. You may need to install a driver; I have included the link for the Office 2007 drivers. Here is a screenshot of my config for an Access 12.0 connection.
http://www.anony.ws/i/2013/11/21/UPm4G.jpg
http://technet.microsoft.com/en-us/library/ff772782.aspx#SSMSProcedure
http://www.microsoft.com/en-us/download/details.aspx?id=23734
I need to import data from Excel into my database. I need to insert the data from my Excel sheet into an existing table in my database.
I tried to import the data with the help of the SQL wizard. First I imported it into a temp table, then I used an insert query to move the data into my destination table. But it does not seem to have worked correctly.
So please suggest a good way of importing the data. It would also be helpful if you could suggest a good SQL script for the import.
For Excel 2007 files (*.xlsx):
INSERT INTO MyTable
SELECT * FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
'Excel 12.0;Database=D:\test.xlsx', [Customer$])
For Excel 97-2003 files (*.xls):
INSERT INTO MyTable
SELECT * FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=D:\test.xls', [Customer$])
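The only differences between the two statements are the provider name and the Excel version string, and both follow from the file extension. A minimal sketch in Python that builds the statement from the path makes the pairing explicit. The function name excel_import_sql is hypothetical, and the path, sheet, and table names are assumed to be trusted values, since nothing is escaped.

```python
import os

# Provider and Excel version string for each file extension, as in the statements above.
PROVIDERS = {
    ".xlsx": ("Microsoft.ACE.OLEDB.12.0", "Excel 12.0"),
    ".xls": ("Microsoft.Jet.OLEDB.4.0", "Excel 8.0"),
}


def excel_import_sql(path, sheet, table):
    """Build an INSERT ... SELECT ... OPENROWSET statement for an Excel file."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in PROVIDERS:
        raise ValueError(f"unsupported Excel extension: {ext!r}")
    provider, version = PROVIDERS[ext]
    return (
        f"INSERT INTO {table}\n"
        f"SELECT * FROM OPENROWSET('{provider}',\n"
        f"'{version};Database={path}', [{sheet}$])"
    )


sql_xlsx = excel_import_sql(r"D:\test.xlsx", "Customer", "MyTable")
sql_xls = excel_import_sql(r"D:\test.xls", "Customer", "MyTable")
print(sql_xlsx)
print(sql_xls)
```

Pointing the Jet 4.0 provider at an .xlsx file, or using the wrong version string, is a common source of errors, so keeping the pairing in one lookup table avoids mixing them up.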
I tried using the previous answer on an .xlsx file (version 14.0.6112.2500 64-bit Microsoft Excel file)
SELECT *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\xls_to_sql\xltest.xlsx', [Sheet1$])
I then saved the spreadsheet as an .xls (97-2003 version) and tried again.
SELECT *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\xls_to_sql\xltest.xls', [Sheet1$])
Both times I got the same error message:
Msg 7308, Level 16, State 1, Line 1
OLE DB provider 'Microsoft.Jet.OLEDB.4.0' cannot be used for distributed queries because the provider is configured to run in single-threaded apartment mode.
SQL Server information:
Microsoft SQL Server Management Studio 10.50.1617.0
Microsoft Analysis Services Client Tools 10.50.1617.0
Microsoft Data Access Components (MDAC) 6.1.7601.17514
Microsoft MSXML 2.6 3.0 6.0
Microsoft Internet Explorer 9.0.8112.16421
Microsoft .NET Framework 2.0.50727.5448
Operating System 6.1.7601
I don't have the Microsoft.Jet.OLEDB.4.0 or at least I don't know how to get it. I also don't know how to run everything in 32-Bit mode if that is the cause of the problem. I would appreciate help running in 32-Bit mode and also downloading and installing Microsoft.Jet.OLEDB.4.0 if I don't have it installed for some reason.
I tried the linked server method I saw posted for SQL Server 2005 but there is no Microsoft.Jet.OLEDB.4.0 option that was mentioned in the tutorial. See http://support.microsoft.com/kb/321686.
You did not mention anything about your existing table or the keys in the table and the Excel file, so just to give you a push, the following command selects all data from the 'Customers' sheet of the xltest.xls file:
SELECT * FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\test\xltest.xls', [Customers$])
From this point, how you import or merge into the existing data depends on your database structure.
First save the spreadsheet in .xls (97-2003 version) format and import it into a temporary table in SQL Server. After importing the data, make sure that the temp table's field lengths are similar to those of the original table.
Then use the statement below to update the original table:
insert into original (field1, field2)
select field1, field2 from temp
There are three ways I usually do this.
Use VBA inside the Worksheet. This involves some development work which imho is too much effort, if you are doing this only one time. This is nice if you want to use this worksheet multiple times.
Use a combination of macros inside the worksheet to concatenate insert queries which I then paste into SQL Management Studio, or some similar SQL client, and run the inserts.
Use the bulk copy command-line tool to copy a CSV file, which I would convert from the worksheet, like this:
bcp [dbname].[dbo].[myTableName] in data1.csv -T -SmyServerName -c -t^| > log1.txt
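If the bcp call is scripted rather than typed, it can be built as an argument list. The sketch below, in Python, assumes bcp is on the PATH; the helper names bcp_import_args and run_bcp are hypothetical. The flags are the ones from the command above: -T for a trusted connection, -S for the server, -c for character mode, and -t for the field terminator. The caret in -t^| is only the cmd.exe escape for the pipe character, so it is not needed when no shell is involved.

```python
import subprocess


def bcp_import_args(table, data_file, server, field_terminator="|"):
    """Build the bcp argument list for importing a delimited text file."""
    return [
        "bcp", table, "in", data_file,
        "-T",                      # trusted (Windows) authentication
        "-S" + server,             # server name
        "-c",                      # character mode
        "-t" + field_terminator,   # field terminator, no shell escaping needed
    ]


def run_bcp(args, log_path):
    """Run bcp and redirect its output to a log file (not called in this sketch)."""
    with open(log_path, "w") as log:
        return subprocess.run(args, stdout=log, check=False).returncode


args = bcp_import_args("[dbname].[dbo].[myTableName]", "data1.csv", "myServerName")
print(args)
# To execute it: run_bcp(args, "log1.txt")
```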
If you need to import an .xlsx file into 64-bit SQL Server, try installing the 64-bit Microsoft Access Database Engine.
See http://www.microsoft.com/en-us/download/details.aspx?id=13255
For example, to import data from c:\data.xlsx, which has a sheet called MyData, then you could use:
SELECT *
FROM OPENROWSET ( 'Microsoft.ACE.OLEDB.12.0'
, 'Excel 12.0;database=c:\data.xlsx;IMEX=1'
, 'SELECT * FROM [MyData$]')