I have set up two different connections in R:
connection_1 <- dbConnect(odbc::odbc(), driver = my_driver, database = "database_1", uid = "my_id", pwd = "my_pwd", server = "server_1", port = "my_port")
connection_2 <- dbConnect(odbc::odbc(), driver = my_driver, database = "database_2", uid = "my_id", pwd = "my_pwd", server = "server_2", port = "my_port")
I have a table stored in "connection_1" (table_1), and another table stored in "connection_2" (table_2). I would like to join these two tables together and save the resulting table on "connection_1":
dbGetQuery(connection_1, "create table my_table as select * from connection_1.table_1 a inner join connection_2.table_2 B on A.Key_1 = B.Key_2")
But I am not sure if this is possible in R.
Does anyone know if the code I have written can be changed to do this?
Or will establishing "connection_2" automatically cancel "connection_1"?
Thank you!
Aside: If I was using SAS, I could have solved the above problem like this:
/* connection 1 */
%let NZServer = 'server_1';
%let NZSchema = 'my_schema_1';
%let NZDatawork = 'database_1';
%let SAS_LIB = 'LIB_1';
LIBNAME ....
/* connection 2 */
%let NZServe = 'server_2';
%let NZSchem = 'my_schema_2';
%let NZDatawor = 'database_2';
%let SAS_LI = 'LIB_2';
/* last letter removed from each macro variable name to make it different */
LIBNAME ...;
/* run earlier join: */
proc sql outobs = 100;
create table LIB_1.a as select * from LIB_1.table_1 a inner join LIB_2.table_2 B on A.Key_1 = B.Key_2;
quit;
There is no straightforward way to join across two servers. You could create a temp table on one of the servers (the one that holds more data) and populate it with data from the other table/server. That way you move the least amount of data (as opposed to extracting from both tables) and can use Netezza's co-located join to speed up your query.
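A minimal sketch of that staging approach from R. It assumes two in-memory SQLite databases as stand-ins for the two servers (the odbc connections from the question would take their place), made-up columns x and y, and a hypothetical staging table name table_2_stage. Whether temporary = TRUE is honoured depends on the driver. Note that the two connection objects are independent, so opening connection_2 does not cancel connection_1.

```r
library(DBI)

# Stand-ins for the two servers: two independent in-memory SQLite databases.
connection_1 <- dbConnect(RSQLite::SQLite(), ":memory:")
connection_2 <- dbConnect(RSQLite::SQLite(), ":memory:")

# Made-up sample data so the sketch runs on its own.
dbWriteTable(connection_1, "table_1",
             data.frame(Key_1 = 1:3, x = c("a", "b", "c"), stringsAsFactors = FALSE))
dbWriteTable(connection_2, "table_2",
             data.frame(Key_2 = 2:4, y = c(20, 30, 40)))

# 1. Pull the smaller table out of the second server into R.
table_2_local <- dbGetQuery(connection_2, "SELECT * FROM table_2")

# 2. Stage it on the first server as a temporary table (hypothetical name).
dbWriteTable(connection_1, "table_2_stage", table_2_local, temporary = TRUE)

# 3. Join entirely on the first server and persist the result there.
dbExecute(connection_1, "
  CREATE TABLE my_table AS
  SELECT a.Key_1, a.x, b.y
  FROM table_1 a
  INNER JOIN table_2_stage b ON a.Key_1 = b.Key_2
  ORDER BY a.Key_1")

result <- dbReadTable(connection_1, "my_table")

dbDisconnect(connection_1)
dbDisconnect(connection_2)
```

Only table_2 travels through R here; table_1 never leaves its server, which is the point of staging the smaller table next to the larger one.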
Related
I'm trying to catalog the structure of a MSSQL 2008 R2 database using R/RODBC. I have set up a DSN, connected via R and used the sqlTables() command but this is only getting the 'system databases' info.
library(RODBC)
conn1 <- odbcConnect('my_dsn')
sqlTables(conn1)
However if I do this:
library(RODBC)
conn1 <- odbcConnect('my_dsn')
sqlQuery(conn1, 'USE my_db_1')
sqlTables(conn1)
I get the tables associated with the my_db_1 database. Is there a way to see all of the databases and tables without manually typing in a separate USE statement for each?
There may or may not be a more idiomatic way to do this directly in SQL, but we can piece together a data set of all tables from all databases (a bit more programmatically than repeated USE xyz; statements) by getting a list of databases from master..sysdatabases and passing these as the catalog argument to sqlTables, e.g.
library(RODBC)
library(DBI)
##
tcon <- RODBC::odbcConnect(
dsn = "my_dsn",
uid = "my_uid",
pwd = "my_pwd"
)
##
db_list <- RODBC::sqlQuery(
channel = tcon,
query = "SELECT name FROM master..sysdatabases")
##
RODBC::sqlTables(
channel = tcon,
catalog = db_list[14, 1]
)
(I can't show any of the output for confidentiality reasons, but it produces the correct results.) Of course, in your case you probably want to do something like
all_metadata <- lapply(db_list$name, function(DB) {
RODBC::sqlTables(
channel = tcon,
catalog = DB
)
})
# or some more efficient variant of data.table::rbindlist...
meta_df <- do.call("rbind", all_metadata)
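One thing to watch with the lapply approach: as I understand the RODBC documentation, sqlTables() with its default errors = FALSE returns -1 rather than a data frame when a catalog cannot be read, and rbind-ing that in would corrupt the result. A minimal base-R sketch of filtering those out, using a hypothetical stand-in function fake_sql_tables() in place of the real RODBC::sqlTables(channel = tcon, catalog = DB) call and made-up database names:

```r
# Hypothetical stand-in for RODBC::sqlTables(channel = tcon, catalog = DB):
# returns a data frame on success and -1 on failure.
fake_sql_tables <- function(catalog) {
  if (catalog == "locked_db") return(-1L)
  data.frame(TABLE_CAT = catalog,
             TABLE_NAME = c("t1", "t2"),
             stringsAsFactors = FALSE)
}

# Made-up database names; in practice this would be db_list$name.
db_names <- c("master", "locked_db", "my_db_1")

all_metadata <- lapply(db_names, fake_sql_tables)
names(all_metadata) <- db_names

# Keep only the databases that actually returned a data frame.
ok <- vapply(all_metadata, is.data.frame, logical(1))
meta_df <- do.call("rbind", all_metadata[ok])
rownames(meta_df) <- NULL

# Databases that could not be read, for reporting.
skipped <- db_names[!ok]
```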
New to R Shiny and SQL.
I have made some reactive dashboards, but none yet using a SQL database connection.
Here is my toy:
The database is the MySQL world database.
I want to join various tables and show some columns from each, but I want to be able to filter by Language found in the CountryLanguage table.
My WHERE statement doesn't work.
Current code:
ui <- fluidPage(
numericInput("nrows", "Enter the number of rows to display:", 5),
selectizeInput("inputlang", label = "Language", choices = NULL, selected = NULL, options = list(placeholder = "Please type a language")),
tableOutput("tbl")
)
server <- function(input, output, session) {
output$tbl <- renderTable({
conn <- dbConnect(
drv = RMySQL::MySQL(),
dbname = "shinydemo",
host = "shiny-demo.csa7qlmguqrf.us-east-1.rds.amazonaws.com",
username = "guest",
password = "guest")
on.exit(dbDisconnect(conn), add = TRUE)
dbGetQuery(conn, paste0(
"SELECT City.Name, City.Population, Country.Name, Country.Continent, CountryLanguage.Language, CountryLanguage.Percentage
FROM City
INNER JOIN Country on City.CountryCode = Country.Code
INNER JOIN CountryLanguage on Country.Code = CountryLanguage.CountryCode
WHERE CountryLanguage.Language = reactive({get(input$Selectize)})
LIMIT ", input$nrows, ";"))
})
}
shinyApp(ui, server)
I did not expect that code to work, but tried anyway. I suspect I can't pass an R command from within a dbGetQuery because it is expecting SQL syntax only. Is that correct?
So... what is the best way to set something like this up? I imagine I could make the joined selected stuff into a dataframe like
df <-dbGetQuery ( SELECT & JOIN)
dffilter <- df %>% filter ()
But is that going to make things super slow if the dataset is still quite large?
What would be the best practice here?
A reactive(...) inside a string is not evaluated; it's just a string. Further, DBI does not run glue on the query, so {get(...)} does nothing.
You define the input as input$inputlang, but in your reactive you reference input$Selectize; I think that's a mistake.
You may want to consider parameterized queries instead of constructing query strings manually. While there are security concerns about malicious SQL injection (e.g., XKCD's Exploits of a Mom aka "Little Bobby Tables"), malformed strings and Unicode-vs-ANSI mistakes are also a concern, even if it's a single data analyst running the query. Both DBI (with odbc) and RODBC support parameterized queries, either natively or via add-ons.
While this does not work for the LIMIT portion, it is useful for most other portions of a query. For the LIMIT part, req(is.numeric(input$nrows)) should be a reasonable check to guard against inadvertent injection problems.
Try this:
output$tbl <- renderTable({
req(is.numeric(input$nrows), input$inputlang)
conn <- dbConnect(
drv = RMySQL::MySQL(),
dbname = "shinydemo",
host = "shiny-demo.csa7qlmguqrf.us-east-1.rds.amazonaws.com",
username = "guest",
password = "guest")
on.exit(dbDisconnect(conn), add = TRUE)
dbGetQuery(conn, paste("
SELECT City.Name, City.Population, Country.Name, Country.Continent, CountryLanguage.Language, CountryLanguage.Percentage
FROM City
INNER JOIN Country on City.CountryCode = Country.Code
INNER JOIN CountryLanguage on Country.Code = CountryLanguage.CountryCode
WHERE CountryLanguage.Language = ?
LIMIT ", input$nrows),
params = list(input$inputlang))
})
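To see the effect of binding outside Shiny, here is a minimal sketch that assumes an in-memory SQLite database via RSQLite as a stand-in for the MySQL server, with made-up rows; find_by_language() is a hypothetical helper. Placeholder support and syntax depend on the DBI backend, so check that your driver accepts ? with params before relying on it.

```r
library(DBI)

# Stand-in database with a few made-up rows.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "CountryLanguage", data.frame(
  CountryCode = c("NLD", "BEL", "FRA"),
  Language    = c("Dutch", "Dutch", "French"),
  stringsAsFactors = FALSE
))

# Hypothetical helper: the language is bound as a parameter,
# the row limit is validated and then pasted in.
find_by_language <- function(con, language, nrows) {
  nrows <- as.integer(nrows)
  stopifnot(length(nrows) == 1, !is.na(nrows), nrows >= 0)
  dbGetQuery(
    con,
    paste("SELECT CountryCode, Language FROM CountryLanguage",
          "WHERE Language = ? ORDER BY CountryCode LIMIT", nrows),
    params = list(language)
  )
}

dutch <- find_by_language(con, "Dutch", 5)

# A hostile-looking value is compared as a plain string, not run as SQL.
hostile <- find_by_language(con, "Dutch' OR '1'='1", 5)

dbDisconnect(con)
```

The second call matches no rows because the whole string, quotes included, is treated as the language name.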
How should I create a table in SQL from the structure of an R data frame without writing complex code? Is there a function in R to accomplish this?
If you are using MySQL, you can use dbWriteTable from the RMySQL package:
library(RMySQL)
con <- dbConnect(MySQL(),
username = "USER_NAME",
host = "localhost",
password = "PASS",
dbname = "NAME_DATA_BASE")
dbWriteTable(conn = con, name = 'test', value = iris)
This example writes the iris data frame to a table named test.
In sqlite:
library(RSQLite)
#set working directory to where your sqlite database resides
setwd("C:/sqlite/Data")
#connect
sqlite<-dbDriver("SQLite")
my_conn<-dbConnect(sqlite,"my_db.db")
#write out the dataframe in the database
dbWriteTable(my_conn, "new_table_in_db", my_dataframe, row.names = FALSE)
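If you want only the structure first and the rows later, recent DBI versions also provide dbCreateTable() and dbAppendTable(). A minimal sketch, assuming an in-memory SQLite database and a made-up data frame (the table name new_table_in_db is reused from above; the columns are hypothetical):

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Made-up data frame whose structure defines the table.
my_dataframe <- data.frame(
  id    = 1:3,
  name  = c("a", "b", "c"),
  score = c(1.5, 2.5, 3.5),
  stringsAsFactors = FALSE
)

# Create an empty table whose columns mirror the data frame.
dbCreateTable(con, "new_table_in_db", my_dataframe)
fields  <- dbListFields(con, "new_table_in_db")
n_empty <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM new_table_in_db")$n

# Fill it in a separate step.
dbAppendTable(con, "new_table_in_db", my_dataframe)
n_after <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM new_table_in_db")$n

dbDisconnect(con)
```

dbWriteTable() does both steps at once; splitting them is mainly useful when the table must exist before any data arrives.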
I am trying to run the SQL statement below (SQL Server), but I get the error
"FROM clause in UPDATE and DELETE statements cannot contain subquery sources or joins."
update fp
set fp.totalcapacity = hc.totalcapacity,
fp.sellablecapacity = hc.sellablecapacity
from [fact].[FinalPosition] fp
join fact.[HotelCapacity] hc
on fp.hotelkey = hc.hotelkey
and fp.staydate = hc.staydate
where fp.staydate = '2016-06-18'
I can't seem to understand why I am getting this error. Any idea?
I think the syntax you want is:
update fp
set totalcapacity = hc.totalcapacity,
sellablecapacity = hc.sellablecapacity
from fp join
fact.[HotelCapacity] hc
on fp.hotelkey = hc.hotelkey and fp.staydate = hc.staydate
where fp.staydate = '2016-06-18';
If you want fp to refer to an actual table, include that table in the from clause and make fp its alias.
I am trying, unsuccessfully so far, to update records in a Microsoft Access 2013 table (called tbl_Data) with data from an AS400 table (LIBRARY.TABLE).
As you can see in my Access 2013 pass-through query below, I am trying to join the access table with the AS400 table using the Prefix & Number fields, and from there, update the access table with Name & Address information from the AS400 table.
Here is my latest attempt:
UPDATE
tbl_Data
SET
tbl_Data.FirstName = a.NINMFR,
tbl_Data.MiddleName = a.NINMMD,
tbl_Data.LastName = a.NINAML,
tbl_Data.BuildingNumber = a.NIBLNR,
tbl_Data.StreetName = a.NISTNM,
tbl_Data.AptSuite = a."NIAPT#",
tbl_Data.Address2 = a.NIADR2,
tbl_Data.City = a.NICITY,
tbl_Data.State = a.NISTAT,
tbl_Data.ZipCode = a.NIZIPC
INNER JOIN
LIBRARY.TABLE a
ON
tbl_Data.Prefix = a.NIPRFX,
tbl_Data.Number = a.NIPLNR;
When I run this query, I get an error that says:
ODBC--call failed.
[IBM][System i Access ODBC Driver][DB2 for i5/OS]SQL0199 - Keyword INNER not expected. Valid tokens: USE SKIP WAIT WITH WHERE. (#-199)
I would really appreciate any assistance, as I'm out of ideas.
Thanks!
That is Microsoft-specific syntax for an update; it does not work on DB2. Try this:
UPDATE
tbl_Data
SET
(tbl_Data.FirstName,
tbl_Data.MiddleName,
tbl_Data.LastName,
tbl_Data.BuildingNumber,
tbl_Data.StreetName,
tbl_Data.AptSuite,
tbl_Data.Address2,
tbl_Data.City,
tbl_Data.State,
tbl_Data.ZipCode)
=
(SELECT
a.NINMFR,
a.NINMMD,
a.NINAML,
a.NIBLNR,
a.NISTNM,
a."NIAPT#",
a.NIADR2,
a.NICITY,
a.NISTAT,
a.NIZIPC
FROM
library.table a
WHERE
tbl_Data.Prefix = a.NIPRFX
AND tbl_Data.Number = a.NIPLNR)
WHERE
EXISTS (
SELECT *
FROM
library.table a
WHERE
tbl_Data.Prefix = a.NIPRFX
AND tbl_Data.Number = a.NIPLNR);
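The same correlated-subquery pattern can be tried out locally from R before pointing it at DB2. This is only a sketch under assumptions: it uses an in-memory SQLite database (which happens to accept the same row-value SET syntax), a hypothetical table src standing in for LIBRARY.TABLE, two of the ten columns, and made-up rows. It says nothing about how the Access and AS400 tables are actually reachable from one query.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Target table with made-up rows; the (A, 2) row has no match in the source.
dbWriteTable(con, "tbl_Data", data.frame(
  Prefix    = c("A", "A", "B"),
  Number    = c(1L, 2L, 1L),
  FirstName = c("old1", "old2", "old3"),
  City      = c("x", "y", "z"),
  stringsAsFactors = FALSE
))

# Hypothetical stand-in for LIBRARY.TABLE.
dbWriteTable(con, "src", data.frame(
  NIPRFX = c("A", "B"),
  NIPLNR = c(1L, 1L),
  NINMFR = c("Ann", "Bob"),
  NICITY = c("Oslo", "Rome"),
  stringsAsFactors = FALSE
))

# Correlated subquery supplies the new values; WHERE EXISTS limits the
# update to rows that have a match, so unmatched rows are left alone.
rows_changed <- dbExecute(con, "
  UPDATE tbl_Data
  SET (FirstName, City) =
      (SELECT a.NINMFR, a.NICITY
       FROM src a
       WHERE tbl_Data.Prefix = a.NIPRFX AND tbl_Data.Number = a.NIPLNR)
  WHERE EXISTS (
      SELECT 1
      FROM src a
      WHERE tbl_Data.Prefix = a.NIPRFX AND tbl_Data.Number = a.NIPLNR)")

updated <- dbGetQuery(con, "SELECT * FROM tbl_Data ORDER BY Prefix, Number")

dbDisconnect(con)
```

Without the WHERE EXISTS clause, the unmatched row would have its columns set to NULL, which is why the answer repeats the join condition twice.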