How to use RODBC to save dataframe to table with primary key generated at database

I would like to enter a data frame into an existing table in a database using an R script, and I want the table in the database to have a sequential primary key. My problem is that RODBC doesn't seem to allow the primary key constraint.
Here's the SQL for creating the table I want:
CREATE TABLE [dbo].[results] (
[ID] INT IDENTITY (1, 1) NOT NULL,
[FirstName] VARCHAR (255) NULL,
[LastName] VARCHAR (255) NULL,
[Birthday] DATETIME NULL,
[CreateDate] DATETIME NULL,
CONSTRAINT [PK_dbo.results] PRIMARY KEY CLUSTERED ([ID] ASC)
);
And a test with some R code:
ConnectionString1="Driver=ODBC Driver 11 for SQL Server;Server=myserver; Database=TestDb; trusted_connection=yes"
ConnectionString2="Driver=ODBC Driver 11 for SQL Server;Server=notmyserver; Database=TestDb; trusted_connection=yes"
db1=odbcDriverConnect(ConnectionString1)
query="SELECT a.[firstname] as FirstName
, a.[lastname] as LastName
, Cast(a.[dob] as datetime) as Birthday
, cast(a.createDate as datetime) as CreateDate
FROM [dbo].[People] a"
results=NULL
results=sqlQuery(db1,query,stringsAsFactors=FALSE)
close(db1)
db2=odbcDriverConnect(ConnectionString2)
sqlSave(db2,
results,
append = TRUE,
varTypes=c(Birthday="datetime", CreateDate="datetime"),
colnames = FALSE,
rownames = FALSE,fast=FALSE)
close(db2)
The first part of the R code just loads some test data into a data frame. It works fine and is not part of my question; I'm including it only so you can see the format of the test data. When I run the sqlSave function I get an error message:
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent
However, if I remove the primary key from the database, everything works fine with this table:
CREATE TABLE [dbo].[results] (
[FirstName] VARCHAR (255) NULL,
[LastName] VARCHAR (255) NULL,
[Birthday] DATETIME NULL,
[CreateDate] DATETIME NULL
);
Clearly the primary key is the issue. Normally with entity framework or whatever (as I understand it), the primary key is created at the database when you enter data.
I'd like a way to append data to a table with a primary key using only an R script. Is that possible? There could already be data in the table I'm adding to, so I don't really see a way to create keys in R before trying to append to the table.

The problem is at line 361 of http://github.com/cran/RODBC/blob/master/R/sql.R: the data.frame and the DB table must have exactly the same number of columns; otherwise you get this error, with the following stack trace:
Error in dimnames(x) <- dn :
length of 'dimnames' [2] not equal to array extent
3. `colnames<-`(`*tmp*`, value = c("ID", "FirstName", "LastName",
"Birthday", "CreateDate")) at sql.R#361
2. sqlwrite(channel, tablename, dat, verbose = verbose, fast = fast,
test = test, nastring = nastring) at sql.R#211
1. sqlSave(db2, results, append = TRUE, varTypes = c(Birthday = "datetime",
CreateDate = "datetime"), colnames = FALSE, rownames = FALSE,
fast = FALSE, verbose = TRUE)
If you add the ID column to your data.frame, you can no longer use the auto-increment ID column, so that is not a solution (or a workaround).
A "simple" workaround to the "same columns" limitation of RODBC::sqlSave is:
1. Use sqlSave to save the new rows into a table with a different name.
2. Send an insert into ... select from ... statement via RODBC::sqlQuery to append the new rows to your original table, which includes the auto-increment ID column.
3. Delete the table holding the new rows again (drop table ...).
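The three steps above can be sketched in a self-contained way. This is only an illustration of the staging-table pattern, written in Python with the built-in sqlite3 module as a stand-in for SQL Server and RODBC. The staging table name results_staging and the sample rows are made up for the example, and AUTOINCREMENT plays the role of IDENTITY (1, 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (
        ID        INTEGER PRIMARY KEY AUTOINCREMENT,  -- stands in for IDENTITY (1, 1)
        FirstName TEXT,
        LastName  TEXT
    );
    INSERT INTO results (FirstName, LastName) VALUES ('Existing', 'Row');
""")

# Hypothetical new rows; in the question these come from the data frame.
new_rows = [("Ada", "Lovelace"), ("Alan", "Turing")]

# Step 1: write the new rows to a staging table that has no ID column.
conn.execute("CREATE TABLE results_staging (FirstName TEXT, LastName TEXT)")
conn.executemany("INSERT INTO results_staging VALUES (?, ?)", new_rows)

# Step 2: append from staging, listing only the non-key columns so the
# database generates the ID values itself.
conn.execute("""
    INSERT INTO results (FirstName, LastName)
    SELECT FirstName, LastName FROM results_staging
""")

# Step 3: drop the staging table again.
conn.execute("DROP TABLE results_staging")
conn.commit()

rows = conn.execute(
    "SELECT ID, FirstName, LastName FROM results ORDER BY ID"
).fetchall()
print(rows)
```

With RODBC the same idea would use sqlSave for step 1 and sqlQuery for steps 2 and 3. The key point is that the INSERT names only the non-key columns.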
A better option would be to use the newer odbc package, which also offers better performance through bulk-like inserts instead of sending single insert statements as RODBC does:
https://github.com/r-dbi/odbc
Look for the function dbWriteTable (which is an implementation of the interface DBI::dbWriteTable).

Related

How to find the columns that need to be indexed?

I'm starting to learn SQL and relational databases. Below is the table that I have, and it has around 10 million records. My composite key is (reltype, from_product_id, to_product_id).
What strategy should I follow when selecting the columns that need to be indexed? I have also documented the operations that will be performed on the table. Please help me determine which columns or combinations of columns need to be indexed.
Table DDL is shown below.
Table name: prod_rel.
Database schema name : public
CREATE TABLE public.prod_rel (
reltype varchar NULL,
assocsequence float4 NULL,
action varchar NULL,
from_product_id varchar NOT NULL,
to_product_id varchar NOT NULL,
status varchar NULL,
starttime varchar NULL,
endtime varchar null,
primary key (reltype, from_product_id, to_product_id)
);
Operations performed on table:
select distinct(reltype )
from public.prod_rel;
update public.prod_rel
set status = ? , starttime = ?
where from_product_id = ?;
update public.prod_rel
set status = ? , endtime = ?
where from_product_id = ?;
select *
from public.prod_rel
where from_product_id in (select distinct (from_product_id)
from public.prod_rel
where status = ?
and action in ('A', 'E', 'C', 'P')
and reltype = ?
fetch first 1000 rows only);
Note: I'm not performing any JOIN operations. Also please ignore the uppercase for table or column names. I'm just getting started.
Ideal would be two indexes:
CREATE INDEX ON prod_rel (from_product_id);
CREATE INDEX ON prod_rel (status, reltype)
WHERE action IN ('A', 'E', 'C', 'P');
Your primary key (which is also implemented using an index) cannot support queries 2 and 3 because from_product_id is not its leading column. If you redefine the primary key as (from_product_id, to_product_id, reltype), you don't need the first index I suggested.
Why does order matter? Imagine you are looking for a book in a library where the books are ordered by “last name, first name”. You can use this ordering to find all books by “Dickens” quickly, but not all books by any “Charles”.
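The effect of column order can be checked with a small experiment. The sketch below uses Python's built-in sqlite3 module rather than PostgreSQL, so the planner output differs from what EXPLAIN would print there, but the leading-column principle is the same. The table is a cut-down version of prod_rel, and the helper name plan is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE prod_rel (
        reltype         TEXT NOT NULL,
        from_product_id TEXT NOT NULL,
        to_product_id   TEXT NOT NULL,
        status          TEXT,
        PRIMARY KEY (reltype, from_product_id, to_product_id)
    )
""")

def plan(sql):
    """Return the planner's description lines for a one-parameter query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("x",)).fetchall()
    return [row[3] for row in rows]  # the fourth column holds the description

# Filtering on the leading key column can use the primary key index.
leading = plan("SELECT * FROM prod_rel WHERE reltype = ?")

# Filtering only on the second key column falls back to scanning the table.
not_leading = plan("SELECT * FROM prod_rel WHERE from_product_id = ?")

print(leading)
print(not_leading)
```

The first plan should report an index search and the second a full scan; the exact wording depends on the SQLite version.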
But let me also comment on your queries.
The first one will perform badly if there are lots of different reltype values; try raising work_mem in that case. It is always a sequential scan of the whole table, and no index can help.
I have changed the order of the primary key columns as shown below, as per @a_horse_with_no_name's suggestion, and created only one index on the (from_product_id, reltype, status, action) columns.
CREATE TABLE public.prod_rel (
reltype varchar NULL,
assocsequence float4 NULL,
action varchar NULL,
from_product_id varchar NOT NULL,
to_product_id varchar NOT NULL,
status varchar NULL,
starttime varchar NULL,
endtime varchar null,
primary key (reltype, from_product_id, to_product_id)
);
Also, I have gone through the site suggested by @a_horse_with_no_name. It was amazing. I learned a lot of new things about indexing.
https://use-the-index-luke.com/

Change Column Type to "ORDSYS"."ORDIMAGE"

I'm trying to change the column type from BLOB to ORDSYS.ORDImage with the following code:
alter table "POSTS"
modify ("IMAGE" "ORDSYS"."ORDIMAGE");
But it produces the following error:
ORA-22859: invalid modification of columns
The table and column names are definitely right.
A possible solution would be creating a new table via CREATE TABLE AS SELECT statement, then drop the source table and rename the new one.
According to Oracle Technology Network you can create an ORDImage from a BLOB with
select ordsys.ordimage(ordsys.ordsource(IMAGE, null, null, null, null, 1),
null, null, null, null, null, null, null) from POSTS
(not tested)
The solution I found was to drop the column and create a new one.

SQL server Invalid Column name Invalid object name

I'm having a problem with a table I created. I am trying to run a query, but a red line appears under my code ('excursionID' and 'excursions') with the messages "Invalid column name 'excursionID'" and "Invalid object name 'dbo.excursions'", even though I have already created the table!
Here is the query
SELECT
excursionID
FROM [dbo].[excursions]
Here is the query I used to create the table
USE [zachtravelagency]
CREATE TABLE excursions (
[excursionID] INTEGER NOT NULL IDENTITY (1,1) PRIMARY KEY,
[companyName] NVARCHAR (30) NOT NULL,
[location] NVARCHAR (30) NOT NULL,
[description] NVARCHAR (30) NOT NULL,
[date] DATE NOT NULL,
[totalCost] DECIMAL NOT NULL,
I've tried dropping the table and creating it again.
For some reason all my other tables work, it's just this table that doesn't identify itself. I'm very new to SQL so thank you for your patience!
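You created the table in the database [zachtravelagency], but your query does not run against that database. A query window in SSMS typically uses master by default. Either switch to the database first (USE [zachtravelagency]) or qualify the table with the database and schema names:
SELECT
excursionID
FROM [zachtravelagency].[dbo].[excursions]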

Get value of PRIMARY KEY during SELECT in ORACLE

For a specific task I need to store the identity of a row in a table so I can access it later. Most of these tables do NOT have a numeric ID, and the primary key sometimes consists of multiple fields (VARCHAR and INT combined).
Background info:
The participating tables have a trigger that stores delete, update and insert events in a general 'sync' table (Oracle v11). Every 15 minutes a script is then launched to update the corresponding tables in a remote database (SQL Server 2012).
One solution I came up with was to use multiple columns in this 'sync' table, 3 INT columns and 3 VARCHAR columns. A table with 2 VARCHAR columns would then use 2 VARCHAR columns in this 'sync' table.
A better/nicer solution would be to 'select' the value of the primary key and store this in this table.
Example:
CREATE TABLE [dbo].[Workers](
[company] [nvarchar](50) NOT NULL,
[number] [int] NOT NULL,
[name] [nvarchar](50) NOT NULL,
CONSTRAINT [PK_Workers] PRIMARY KEY CLUSTERED ( [company] ASC, [number] ASC )
)
-- Fails:
SELECT [PK_Workers], [name] FROM [dbo].[Workers]
UPDATE [dbo].[Workers] SET [name]='new name' WHERE [PK_Workers]=@PKWorkers
-- Bad (?) but works:
SELECT ([company] + CAST([number] AS NVARCHAR)) PK, [name] FROM [dbo].[Workers];
UPDATE [dbo].[Workers] SET [name]='newname' WHERE ([company] + CAST([number] AS NVARCHAR))=@PK
The [PK_Workers] fails in these queries. Is there another way to get this value without manually combining and casting the index?
Or is there some other way to do this that I don't know?
For each table, create a function that returns the concatenated primary key. Create a function-based index on this function as well. Then use the function in your SELECT and WHERE clauses.
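One detail to watch when concatenating key columns: without a separator, different keys can produce the same string. The sketch below shows the collision and one possible encoding, written in Python as it might appear in the sync script. The helper names encode_key and decode_key are hypothetical, and a JSON array is just one way to keep the column boundaries and types.

```python
import json

def encode_key(*parts):
    """Serialize composite key values into one string, keeping boundaries and types."""
    return json.dumps(list(parts), separators=(",", ":"))

def decode_key(encoded):
    """Recover the original key values from the encoded string."""
    return tuple(json.loads(encoded))

# Plain concatenation cannot tell ("A1", 1) apart from ("A", 11).
collision = "A1" + str(1) == "A" + str(11)
print(collision)              # True

key_a = encode_key("A1", 1)
key_b = encode_key("A", 11)
print(key_a)                  # ["A1",1]
print(key_b)                  # ["A",11]
print(decode_key(key_a))      # ('A1', 1)
```

The same concern applies to a database-side function: a delimiter that cannot occur in the key values avoids the ambiguity.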

sqlite3 Error Executing SQL From File

I am trying to create tables in an SQLite database with sqlite3.
The command $ sqlite3 mydb < mytables.sql produces the following error: Incomplete SQL: ??C.
mytables.sql is:
CREATE TABLE SizeCulture (
SizeCultureID INTEGER PRIMARY KEY ASC,
SizeID INTEGER NULL,
CultureID TEXT NULL,
Name TEXT NULL,
Description TEXT NULL,
Abbreviation TEXT NULL,
);
CREATE TABLE Size(
SizeID INTEGER PRIMARY KEY ASC ,
Creation TEXT NOT NULL,
Modification TEXT NOT NULL,
Deleted INTEGER NOT NULL,
);
/****** Object: Table [Ordering].[BarCode] Script Date: 11/09/2011 14:58:19 ******/
CREATE TABLE BarCode(
BarCodeID INTEGER PRIMARY KEY ASC NOT NULL,
BarCodeValue TEXT NOT NULL,
);
This was modified from a script generated by SQL Server, where some tables need to be replicated on an Android device.
The above is just a set of repeating create table statements. From what I understand, SQLite follows standard SQL (like MySQL or postgres).
Though I can't test it at the moment, I think it's the trailing commas that are confusing it (for example, the comma at the end of Abbreviation TEXT NULL,). Try removing all those trailing commas.
Edit: To be clear, I'm talking about all of these commas:
Abbreviation TEXT NULL,
...
Deleted INTEGER NOT NULL,
...
BarCodeValue TEXT NOT NULL,
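The trailing-comma theory is easy to check. The sketch below uses Python's built-in sqlite3 module and the BarCode table from the question, running the statement once with and once without the trailing comma. The error text you get this way may differ from the message the sqlite3 command-line shell printed in the question.

```python
import sqlite3

with_trailing_comma = """
CREATE TABLE BarCode(
    BarCodeID INTEGER PRIMARY KEY ASC NOT NULL,
    BarCodeValue TEXT NOT NULL,
);
"""

without_trailing_comma = """
CREATE TABLE BarCode(
    BarCodeID INTEGER PRIMARY KEY ASC NOT NULL,
    BarCodeValue TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")

# The version with the trailing comma is rejected by the parser.
try:
    conn.execute(with_trailing_comma)
    trailing_comma_error = None
except sqlite3.OperationalError as exc:
    trailing_comma_error = str(exc)
print(trailing_comma_error)

# The same statement without the trailing comma is accepted.
conn.execute(without_trailing_comma)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]
print(tables)
```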
I had the same problem, but for a different reason (I'm commenting because Google led me here). It turns out you can also encounter this error if your file has an unexpected encoding (such as UCS-2 instead of UTF-8).
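A quick way to rule out the encoding problem is to look for a byte-order mark at the start of the file and re-encode the script as UTF-8. The following is a rough Python sketch under that assumption. The function name to_utf8 is made up, and it only recognizes files that begin with a UTF-16 or UTF-8 byte-order mark (UCS-2 files written by Windows tools usually do).

```python
import codecs

def to_utf8(raw: bytes) -> bytes:
    """Return the script as BOM-free UTF-8 if it starts with a known byte-order mark."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The utf-16 codec reads the BOM to pick the byte order and strips it.
        return raw.decode("utf-16").encode("utf-8")
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw  # no BOM found: leave the bytes untouched

# Simulate a script that was saved as UTF-16 with a BOM.
sample = "CREATE TABLE Size(SizeID INTEGER PRIMARY KEY);".encode("utf-16")
converted = to_utf8(sample)
print(converted.decode("utf-8"))  # CREATE TABLE Size(SizeID INTEGER PRIMARY KEY);
```

To use it on a real file, read the file in binary mode, pass the bytes through to_utf8, and write the result to a new file before feeding it to sqlite3.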