Save big dataset from R into a temp table in a SQL Server database

I have a dataframe (predict_prc) with 60k rows and 2 variables (chrt_id and prc), and I need to save this dataframe into an MS SQL Server database.
I chose the following approach: create a temp table, insert the new values, and execute a stored procedure.
I tried the code below:
sql = paste("
CREATE TABLE #t (chrt_id INT PRIMARY KEY,prc FLOAT)
INSERT INTO #t
VALUES",
paste0(sprintf("(%.2i, ", predict_prc$chrt_id), sprintf("%.2f)", predict_prc$prc), collapse = ", ")
,"EXEC DM.LoadChrtPrc
")
But that is too many values to insert in a single statement.
Then I tried the following code:
sql_create = paste("
IF (SELECT object_id('#t')) IS NOT NULL
BEGIN
DROP TABLE #t
END
CREATE TABLE #t (chrt_id FLOAT PRIMARY KEY, prc FLOAT)
")
sql_exec = paste("
EXEC DM.LoadChrtPrc
")
channel <- odbcConnect('db.w')
create <- sqlQuery(channel, sql_create)
save <- sqlSave(channel, predict_prc, tablename = '#t', fast=TRUE, append=F, rownames=FALSE)
output <- sqlQuery(channel, sql_exec)
odbcClose(channel)
But I got an error:
> save <- sqlSave(channel, predict_prc, tablename = '#t', fast=TRUE, append=F, rownames=FALSE)
Error in sqlSave(channel, predict_prc, tablename = "#t", fast = TRUE, :
42S01 2714 [Microsoft][ODBC SQL Server Driver][SQL Server]There is already an object named '#t' in the database.
[RODBC] ERROR: Could not SQLExecDirect 'CREATE TABLE "#t" ("chrt_id" float, "prc" float)'
If I execute the save step without the create step, I get this error:
> save <- sqlSave(channel, predict_prc, tablename = '#t1', fast=TRUE, append=F, rownames=FALSE)
Error in sqlColumns(channel, tablename) :
‘#t’: table not found on channel
Can anybody help me with this issue?

SQL Server doesn't allow more than 1000 rows in a single INSERT ... VALUES statement.
You can insert all the values by splitting them into chunks of 1000 rows.
For every chunk of 1000 rows, build a new SQL query and run it, as in the sketch below.
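For example, a minimal sketch of the chunked approach, assuming an open RODBC channel and the predict_prc data frame from the question (the #t table and the DM.LoadChrtPrc procedure are taken from the question; the chunking loop itself is illustrative):
library(RODBC)

channel <- odbcConnect("db.w")

# Create the temp table once; it lives for the duration of this connection
sqlQuery(channel, "CREATE TABLE #t (chrt_id INT PRIMARY KEY, prc FLOAT)")

# Split the data frame into chunks of at most 1000 rows each
chunk_size <- 1000
chunks <- split(predict_prc, ceiling(seq_len(nrow(predict_prc)) / chunk_size))

# Build and run one INSERT ... VALUES statement per chunk
for (chunk in chunks) {
  values <- paste0(sprintf("(%d, %.2f)", as.integer(chunk$chrt_id), chunk$prc), collapse = ", ")
  sqlQuery(channel, paste("INSERT INTO #t (chrt_id, prc) VALUES", values))
}

# The stored procedure can read #t because it runs on the same session
sqlQuery(channel, "EXEC DM.LoadChrtPrc")
odbcClose(channel)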

To solve the problem, I had to create several temporary tables and insert 1000 records per table.
So before you create the temporary tables, count the number of records you are going to put into them, divide by 1000, and create as many temp tables as you need.
This is a one-off solution; if you want to automate the process, use something else.
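A rough sketch of this chunk-per-table variant, using the same 1000-row split as in the sketch above; the numbered table names (#t_1, #t_2, ...) are illustrative and the RODBC channel is assumed to be open:
chunk_size <- 1000
chunks <- split(predict_prc, ceiling(seq_len(nrow(predict_prc)) / chunk_size))

# One temp table per chunk of at most 1000 rows
for (i in seq_along(chunks)) {
  tbl <- paste0("#t_", i)
  sqlQuery(channel, sprintf("CREATE TABLE %s (chrt_id INT PRIMARY KEY, prc FLOAT)", tbl))
  values <- paste0(sprintf("(%d, %.2f)", as.integer(chunks[[i]]$chrt_id), chunks[[i]]$prc), collapse = ", ")
  sqlQuery(channel, sprintf("INSERT INTO %s (chrt_id, prc) VALUES %s", tbl, values))
}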

Related

Create temporary data or table in sql and use as a database

I tried to create a temporary SQL table with the same fields as in the original database.
What I tried:
dirname = os.path.dirname(__file__)
database = os.path.join(dirname, "database/database.db")
conn = sqlite3.connect(database)
cursor = conn.cursor()
stm = cursor.execute("CREATE TABLE new_table SELECT * FROM database;")
#and also tried with some selected fields like following
stm = cursor.execute("CREATE TABLE new_table(pressure int,parameter varchar(20),day int,month int,latitude int,longitude int,surface int);")
tables = cursor.execute("INSERT into new_table SELECT pressure,parameter, day,month,latitude,longitude,surface FROM database;")
And the error is,
sqlite3.OperationalError: near "SELECT": syntax error
How can I create a new table with selected fields (or all fields) from the original database table?
Hope someone can help me.
In your first CREATE TABLE you need to define it with an AS:
CREATE TABLE new_table AS SELECT * FROM database;
This should clear up your error.

R in SQL Server: Output data frame into a table

This probably has a simple answer but I cannot figure it out, as I'm still getting the hang of working with R in SQL Server. I have a piece of code that reads in data from a SQL Server table, executes in R and returns a data frame.
execute sp_execute_external_script
@language=N'R',
@script=N'inp_dat=InputDataSet
inp_dat$NewCol=max(inp_dat$col1,inp_dat$col2)
new_dat=inp_dat
OutputDataSet=new_dat',
@input_data_1=N'select * from IM_COMP_TEST_SQL2016.dbo.temp_table';
I want to insert new_dat into a SQL Server table (select * into new_table from new_dat). How do I go about this?
As shown in this tutorial, you can use INSERT INTO ... EXEC into a previously created table whose columns align with the script's data frame output:
INSERT INTO Table1
execute sp_execute_external_script
@language=N'R',
@script=N'inp_dat <- InputDataSet
inp_dat$NewCol <- max(inp_dat$col1,inp_dat$col2)
new_dat <- inp_dat',
@input_data_1=N'SELECT * FROM IM_COMP_TEST_SQL2016.dbo.temp_table',
@output_data_1_name=N'new_dat';
However, to use a make-table query (SELECT ... INTO) you may need OPENQUERY() or OPENROWSET() with an ad-hoc distributed query, as described in this SO post, to return the output of the stored procedure:
Stored Procedure
CREATE PROCEDURE dbo.R_DataFrame
AS
BEGIN
execute sp_execute_external_script
@language=N'R',
@script=N'inp_dat <- InputDataSet
inp_dat$NewCol <- max(inp_dat$col1,inp_dat$col2)
new_dat <- inp_dat',
@input_data_1=N'SELECT * FROM IM_COMP_TEST_SQL2016.dbo.temp_table',
@output_data_1_name=N'new_dat'
-- ADD ALL COLUMN TYPES;
WITH RESULT SETS (([col1] varchar(20), [col2] float, [col3] int ...));
END
GO
Action Query
SELECT * INTO Table1
FROM OPENROWSET('SQLNCLI', 'Server=(local);Trusted_Connection=yes;',
'EXEC dbo.R_DataFrame')
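Note that ad-hoc OPENROWSET()/OPENDATASOURCE() queries like the one above require the server option 'Ad Hoc Distributed Queries' to be enabled. A minimal sketch of enabling it from R over RODBC, assuming sufficient server permissions (the connection string is illustrative):
library(RODBC)

ch <- odbcDriverConnect("driver={SQL Server};server=(local);trusted_connection=true")
# 'Ad Hoc Distributed Queries' is an advanced option, so show advanced options first
sqlQuery(ch, "EXEC sp_configure 'show advanced options', 1; RECONFIGURE;")
sqlQuery(ch, "EXEC sp_configure 'Ad Hoc Distributed Queries', 1; RECONFIGURE;")
odbcClose(ch)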

R- create temporary table in sql server from R data frame

I know I can create a temporary table in SQL from R with, for example:
require(RODBC)
X<- odbcDriverConnect('driver={SQL Server};
server=s001000;database=X1;trusted_connection=true')
sqlQuery(X, "create table #temptable (test int)" )
sqlQuery(X, "insert into #temptable(test) values(201508)")
doesItWork <- sqlQuery(X, "select * from #temptable")
But I would like to create a temporary table in SQL Server from an R object (I have a table with the results of previous R calculations and I need to query it against another table in SQL). I don't want to export it as a txt file and upload it to SQL Server; there has to be a way to do it from R. I tried:
tabla<-data.frame(per=c(201508,201510))
sqlQuery(X, "Select * into ##temporal from tabla")
But I got an error message:
"42S02 208 [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid
object name 'tabla'."
"[RODBC] ERROR: Could not SQLExecDirect
'Select * into ##temporal from tabla '"
I also know I can create a table with sqlSave:
sqlSave(X, tabla, rownames=FALSE,safer=FALSE)
But I want to create a temporary table.
How can I create a temporary table in SQL from an R object?
Unfortunately, I don't recall sqlSave(connection, new_data, table_name, append = TRUE) ever working correctly for inserting data into existing tables (i.e. not creating new tables), so you may have to use the less efficient approach of generating the INSERT statements yourself. For example,
con <- odbcConnect(...)
query <- "
SET NOCOUNT ON;
IF ( OBJECT_ID('tempdb..##tmp_table') IS NOT NULL )
DROP TABLE ##tmp_table;
CREATE TABLE ##tmp_table
(
[ID] INT IDENTITY(1, 1)
,[Value] DECIMAL(9, 2)
);
SET NOCOUNT OFF;
SELECT 1;
"
sqlQuery(con, gsub("\\s|\\t", " ", query))
df <- data.frame(Value = round(rnorm(5), 2))
update_query <- paste0(
"SET NOCOUNT ON; INSERT INTO ##tmp_table ([Value]) VALUES ",
paste0(sprintf("(%.2f)", df$Value), collapse = ", "),
" SET NOCOUNT OFF; SELECT * FROM ##tmp_table;"
)
sqlQuery(con, update_query)
# ID Value
# 1 1 0.79
# 2 2 -2.23
# 3 3 0.13
# 4 4 0.07
# 5 5 0.50
#sqlQuery(con, "DROP TABLE ##tmp_table;")
#odbcClose(con)
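As a follow-up, once ##tmp_table is populated you can join it against a permanent table on the same connection, which is what the question ultimately needs; the dbo.SomeTable and SomeKey names below are placeholders:
result <- sqlQuery(con, "
SELECT t.[ID], t.[Value], s.*
FROM ##tmp_table AS t
JOIN dbo.SomeTable AS s
  ON s.SomeKey = t.[ID];
")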

Insert Values from Table Variable into already EXISTING Temp Table

I'm successfully inserting values from a table variable into a new (not yet existing) temp table. I have no issues when inserting a small number of rows (e.g. 10,000), but when inserting a lot of rows (e.g. 30,000) into the table variable it throws an error ("Server ran out of memory and external resources").
To work around the issue, I split my (60,000) table variable rows into small batches (e.g. 10,000 rows each), thinking I could insert the new data into the already existing temp table, but I'm getting this error message:
There is already an object named '##TempTable' in the database.
My code is:
USE MyDataBase;
Go
Declare @TableVariable TABLE
(
[ID] bigint PRIMARY KEY,
[BLD_ID] int NOT NULL
-- 25 more columns
)
Insert Into @TableVariable VALUES
(1,25),
(2,30)
-- 61,000 more rows
Select * Into #TempTable From @TableVariable;
Select Count(*) From #TempTable;
The problem is that SELECT ... INTO wants to create the destination table, so on the second run you get the error.
First you have to create #TempTable:
/* this creates the temp table, copying the @TableVariable structure */
Select *
Into #TempTable
From @TableVariable
where 1=0;
Now you can loop through your batches and call this insert as many times as you want:
Insert Into #TempTable
Select * From @TableVariable;
Pay attention that #TempTable is different from ##TempTable (# = local temp table, ## = global temp table) and remember to drop it when you have finished.
Also note that a table variable is declared with the @ prefix (e.g. @TableVariable); the # and ## prefixes are reserved for temp tables.
I hope this helps.

Running SQL query through RStudio via RODBC: How do I deal with Hash Tables?

I've got a very basic SQL query that I'd like to be able to view in R.
The trouble is, I need to be able to reference a #table:
select
RAND(1) as random
into #test
select * from #test
Is this possible, or will I need to create permanent tables, or find some other workaround?
I currently do this via a RODBC script which allows me to choose which SQL file to run:
require(RODBC)
sql.filename <- choose.files('T:\\*.*')
sqlconn <- odbcDriverConnect("driver={SQL Server};Server=SERVER_NAME;Trusted_Connection=True;")
file.content <- readLines(sql.filename)
output <- sqlQuery(sqlconn, paste(file.content[file.content!='--'],collapse=' '))
closeAllConnections()
Do you have any advice on how I can utilise #tables in my SQL scripts in R?
Thanks in advance!
When you use temp tables, SQL Server outputs a message with the number of rows affected, and R doesn't know what to do with this message. If you begin your SQL query with SET NOCOUNT ON, SQL Server will not output the count messages.
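For example, a minimal sketch of the single-batch approach with SET NOCOUNT ON, assuming the sqlconn connection from the question:
output <- sqlQuery(sqlconn, "
SET NOCOUNT ON;
IF OBJECT_ID('tempdb..#test') IS NOT NULL DROP TABLE #test;
SELECT RAND(1) AS random INTO #test;
SELECT * FROM #test;
")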
I use #tables by separating my query into two parts; it returns character(0) if I do it like this:
sqlQuery(test_conn, paste("
drop table #test;
select
RAND(1) as random
into #test
select * from #test
"))
So instead I would use:
sqlQuery(test_conn, paste("
drop table #test;
select
RAND(1) as random
into #test
"))
sqlQuery(test_conn,"select * from #test")
It seems to work fine if you send one query to make the #table, and a second to retrieve the contents. I also added drop table #test; to my query; this makes sure there is not already a #test. If you try to write to a #table name that already exists, you will get an error.