I'm trying to catalog the structure of an MSSQL 2008 R2 database using R/RODBC. I have set up a DSN, connected via R, and used the sqlTables() command, but this only returns the 'system databases' info.
library(RODBC)
conn1 <- odbcConnect('my_dsn')
sqlTables(conn1)
However, if I do this:
library(RODBC)
conn1 <- odbcConnect('my_dsn')
sqlQuery(conn1, 'USE my_db_1')
sqlTables(conn1)
I get the tables associated with the my_db_1 database. Is there a way to see all of the databases and tables without manually typing in a separate USE statement for each?
There may or may not be a more idiomatic way to do this directly in SQL, but we can piece together a data set of all tables from all databases (a bit more programmatically than repeated USE xyz; statements) by getting a list of databases from master..sysdatabases and passing these as the catalog argument to sqlTables, e.g.
library(RODBC)
##
tcon <- RODBC::odbcConnect(
  dsn = "my_dsn",
  uid = "my_uid",
  pwd = "my_pwd"
)
## one row per database on the server
db_list <- RODBC::sqlQuery(
  channel = tcon,
  query = "SELECT name FROM master..sysdatabases")
## list the tables of a single database via the catalog argument
RODBC::sqlTables(
  channel = tcon,
  catalog = db_list[14, 1]
)
(I can't show any of the output for confidentiality reasons, but it produces the correct results.) Of course, in your case you probably want to do something like
all_metadata <- lapply(db_list$name, function(DB) {
  RODBC::sqlTables(
    channel = tcon,
    catalog = DB
  )
})
# or some more efficient variant such as data.table::rbindlist (see the sketch below)
meta_df <- do.call("rbind", all_metadata)
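As that comment hints, a minimal alternative using data.table (assuming the package is installed) would be:
# fill = TRUE guards against databases whose sqlTables() output
# happens to have differing columns
meta_df <- data.table::rbindlist(all_metadata, fill = TRUE)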
I have set up two different connections in R:
connection_1 <- dbConnect(odbc::odbc(), driver = my_driver, database = "database_1", uid = "my_id", pwd = "my_pwd", server = "server_1", port = "my_port")
connection_2 <- dbConnect(odbc::odbc(), driver = my_driver, database = "database_2", uid = "my_id", pwd = "my_pwd", server = "server_2", port = "my_port")
I have a table stored in "connection_1" (table_1) and another table stored in "connection_2" (table_2). I would like to join these two tables together and save the resulting table on "connection_1":
dbGetQuery(connection_1, "create table my_table as select * from connection_1.table_1 a inner join connection_2.table_2 B on A.Key_1 = B.Key_2")
But I am not sure if this is possible in R.
Does anyone know if the code I have written can be changed to do this?
Or will establishing "connection_2" automatically cancel "connection_1"?
Thank you!
Aside: If I were using SAS, I could have solved the above problem like this:
/* connection 1 */
%let NZServer = 'server_1';
%let NZSchema = 'my_schema_1';
%let NZDatawork = 'database_1';
%let SAS_LIB = 'LIB_1';
LIBNAME ....
/* connection 2 */
%let NZServe = 'server_2';
%let NZSchem = 'my_schema_2';
%let NZDatawor = 'database_2';
%let SAS_LI = 'LIB_2';
/* last letter removed from each macro variable name to keep the two sets distinct */
LIBNAME ...;
/* run the earlier join */
proc sql outobs = 100;
create table LIB_1.a as select * from LIB_1.table_1 a inner join LIB_2.table_2 B on A.Key_1 = B.Key_2;
quit;
There is no straightforward way to join across two servers. You could create a temp table on one of the servers (the one that has more data) and populate it with the data from the other table/server. That way you move the least amount of data (as opposed to extracting from both tables) and can use Netezza's colocated join to speed up your query.
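To be clear, opening connection_2 does not cancel connection_1; both can stay open in the same R session. A minimal sketch of the temp-table approach with DBI (table and key names taken from the question; temp_table_2 is a made-up staging name, and the temporary argument to dbWriteTable is backend-dependent):
library(DBI)
# pull the smaller table across from server 2
table_2 <- dbGetQuery(connection_2, "SELECT * FROM table_2")
# stage it on server 1; if your backend does not support temporary
# tables via dbWriteTable, write a regular table and drop it afterwards
dbWriteTable(connection_1, "temp_table_2", table_2, temporary = TRUE)
# the join now runs entirely on server 1
dbExecute(connection_1, "
  CREATE TABLE my_table AS
  SELECT *
  FROM table_1 A
  INNER JOIN temp_table_2 B ON A.Key_1 = B.Key_2")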
New to R Shiny and SQL.
I have made some reactive dashboards before, but none yet using a SQL database connection.
Here is my toy example:
The database is the MySQL world database.
I want to join various tables and show some columns from each, but I want to be able to filter by Language found in the CountryLanguage table.
My WHERE statement doesn't work.
Current code:
library(shiny)
library(DBI)

ui <- fluidPage(
  numericInput("nrows", "Enter the number of rows to display:", 5),
  selectizeInput("inputlang", label = "Language", choices = NULL, selected = NULL, options = list(placeholder = "Please type a language")),
  tableOutput("tbl")
)
server <- function(input, output, session) {
  output$tbl <- renderTable({
    conn <- dbConnect(
      drv = RMySQL::MySQL(),
      dbname = "shinydemo",
      host = "shiny-demo.csa7qlmguqrf.us-east-1.rds.amazonaws.com",
      username = "guest",
      password = "guest")
    on.exit(dbDisconnect(conn), add = TRUE)
    dbGetQuery(conn, paste0(
      "SELECT City.Name, City.Population, Country.Name, Country.Continent, CountryLanguage.Language, CountryLanguage.Percentage
      FROM City
      INNER JOIN Country ON City.CountryCode = Country.Code
      INNER JOIN CountryLanguage ON Country.Code = CountryLanguage.CountryCode
      WHERE CountryLanguage.Language = reactive({get(input$Selectize)})
      LIMIT ", input$nrows, ";"))
  })
}

shinyApp(ui, server)
I did not expect that code to work, but tried anyway. I suspect I can't pass an R command from within a dbGetQuery because it is expecting SQL syntax only. Is that correct?
So... what is the best way to set something like this up? I imagine I could make the joined selected stuff into a dataframe like
df <- dbGetQuery( SELECT & JOIN )
dffilter <- df %>% filter( ... )
But is that going to make things super slow if the dataset is still quite large?
What would be the best practice here?
reactive(...) inside a string is never evaluated; it is just text. Further, DBI does not run glue on the query, so {get(...)} will do nothing either.
Also, you define the input as input$inputlang, but in your query you reference input$Selectize; I think that's a mistake.
You may want to consider parameterized queries instead of constructing query strings manually. Beyond the security concerns about malicious SQL injection (e.g., XKCD's Exploits of a Mom, aka "Little Bobby Tables"), parameterization also guards against malformed strings and Unicode-vs-ANSI mistakes, even when a single data analyst runs the query. Both DBI (with odbc) and RODBC support parameterized queries, either natively or via add-ons.
While this does not work for the LIMIT portion, it is useful for most other parts of a query. For the LIMIT part, req(is.numeric(input$nrows)) should be a reasonable check against inadvertent injection problems.
Try this:
output$tbl <- renderTable({
  req(is.numeric(input$nrows), input$inputlang)
  conn <- dbConnect(
    drv = RMySQL::MySQL(),
    dbname = "shinydemo",
    host = "shiny-demo.csa7qlmguqrf.us-east-1.rds.amazonaws.com",
    username = "guest",
    password = "guest")
  on.exit(dbDisconnect(conn), add = TRUE)
  # '?' is the parameter placeholder; its value is bound via 'params'
  dbGetQuery(conn, paste("
    SELECT City.Name, City.Population, Country.Name, Country.Continent, CountryLanguage.Language, CountryLanguage.Percentage
    FROM City
    INNER JOIN Country ON City.CountryCode = Country.Code
    INNER JOIN CountryLanguage ON Country.Code = CountryLanguage.CountryCode
    WHERE CountryLanguage.Language = ?
    LIMIT ", input$nrows),
    params = list(input$inputlang))
})
I'm trying to solve an issue with the incorrect display of national characters (Polish) in the results of a query to an MS SQL database.
The script is pretty standard.
First, the definition of the connection object:
library(DBI)
db.conn <- DBI::dbConnect(odbc::odbc(),
  Driver = "SQL Server Native Client 11.0",
  Server = "10.0.0.100",
  Port = 1433,
  Database = "DB",
  UID = "user",
  PWD = rstudioapi::askForPassword("Database password"),
  encoding = "latin1"
)
then the SQL statement:
db_sql = "
select
*
from test
where active = 'ACTIVE'
order by name_id"
Then execution of the SQL:
db_query <- dbSendQuery(db.conn, db_sql)
db_data <- dbFetch(db_query)
or
db_data <- dbGetQuery(db.conn, db_sql)
It does not matter whether I use "latin1", "windows-1250", or "utf-8" for the encoding parameter in the connection object definition; the results are always the same: strings containing U+009C or similar characters.
It also does not matter which code page I select in the RStudio global options.
Problem solved.
First, it is necessary to change the locale to Polish:
Sys.setlocale(category = "LC_ALL", locale = "Polish")
Then set the proper encoding in the connection:
DBI::dbConnect(odbc::odbc(),
  ...
  encoding = "windows-1250"
)
and voilà, it works.
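Putting both steps together, with the connection parameters from the question:
# switch the session locale first, then connect with a matching encoding
Sys.setlocale(category = "LC_ALL", locale = "Polish")
db.conn <- DBI::dbConnect(odbc::odbc(),
  Driver = "SQL Server Native Client 11.0",
  Server = "10.0.0.100",
  Port = 1433,
  Database = "DB",
  UID = "user",
  PWD = rstudioapi::askForPassword("Database password"),
  encoding = "windows-1250"  # must match the database's code page
)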
How should I create a table in SQL from the structure of an R data frame, without writing complex code? Is there a function in R to accomplish this?
If you are using MySQL, you can use dbWriteTable from the RMySQL package:
library(RMySQL)
con <- dbConnect(MySQL(),
  user = "USER_NAME",
  host = "localhost",
  password = "PASS",
  dbname = "NAME_DATA_BASE")
dbWriteTable(conn = con, name = 'test', value = iris)
In this example I put the iris data frame into a table named test.
In SQLite:
library(RSQLite)
# set the working directory to where your SQLite database resides
setwd("C:/sqlite/Data")
# connect
sqlite <- dbDriver("SQLite")
my_conn <- dbConnect(sqlite, "my_db.db")
# write the data frame out to the database
dbWriteTable(my_conn, "new_table_in_db", my_dataframe, row.names = FALSE)
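It is also good practice to close the connection when you are done; an open handle keeps the database file locked (see the next question). For example:
dbListTables(my_conn)   # confirm the new table is there
dbDisconnect(my_conn)   # release the lock on the file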
In R, I am using the following function, which performs three or four database operations. But an error message is displayed:
Error in sqliteExecStatement(conn, statement, ...) :
RS-DBI driver: (RS_SQLite_exec: could not execute1: database is locked)
What modification do I need to make to my code? My code is as follows:
library('RSQLite')
test <- function(portfolio, date, frame) {
  lite <- dbDriver("SQLite", max.con = 25)
  db <- dbConnect(lite, dbname = "portfolioInfo1.db")
  sql <- paste("SELECT * from ", portfolio, " where portDate='", date, "' ", sep = "")
  res <- dbSendQuery(db, sql)
  data <- fetch(res)
  frame1 <- data.frame(portDate = date, frame)
  lite <- dbDriver("SQLite", max.con = 25)
  db <- dbConnect(lite, dbname = "portfolioInfo1.db")
  sql <- paste("delete from ", portfolio, " where portDate='", date, "' ", sep = "")
  res <- dbSendQuery(db, sql)
  lite <- dbDriver("SQLite", max.con = 25)
  db <- dbConnect(lite, dbname = "portfolioInfo1.db")
  dbWriteTable(db, portfolio, frame1, append = TRUE, row.names = FALSE)
}
tick <- c("AAPL", "TH", "YHOO")
quant <- c("121", "1313", "131313131")
frame <- data.frame(ticker = tick, quantities = quant)
# print(frame)
test("RUSEG", "2006-02-28", frame)
It seems that you connect several times to the same database without disconnecting. The database probably locks when a connection is made, to prevent anyone else from editing a database that is already being edited.
Either disconnect after each connect, or simply connect once, perform all the queries, and then finally disconnect.
More precisely, multiple processes can read an SQLite database file simultaneously, but only one process at a time can write; see the SQLite documentation on file locking.
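For illustration, a sketch of the question's function rewritten to use a single connection that is always closed on exit (same queries as before; dbExecute and dbGetQuery are the modern DBI equivalents of dbSendQuery/fetch):
library(RSQLite)
test <- function(portfolio, date, frame) {
  # one connection for all three operations, closed even on error
  db <- dbConnect(SQLite(), dbname = "portfolioInfo1.db")
  on.exit(dbDisconnect(db), add = TRUE)
  data <- dbGetQuery(db, paste0("SELECT * FROM ", portfolio,
                                " WHERE portDate='", date, "'"))
  frame1 <- data.frame(portDate = date, frame)
  dbExecute(db, paste0("DELETE FROM ", portfolio,
                       " WHERE portDate='", date, "'"))
  dbWriteTable(db, portfolio, frame1, append = TRUE, row.names = FALSE)
}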
In my case I was using DB Browser to add the table. I didn't save the changes, which is why trying to connect in RStudio (Shiny) did not work.