How to filter double quotes when creating an SQL file with read.csv.sql in R

After much searching and experimenting, I simply cannot get this. I have a csv whose fields are all within quotation marks and I cannot import them. My basic script is the following. It works fine if there are no quotes. I would expect it to print out just the first three lines. I am thinking it could be the filter argument that I need to add?
library(sqldf)
sqldf("attach testingdb as new")
read.csv.sql("df.csv", sql = "create table main as select * from file", dbname = "testingdb")
sqldf("select * from main limit 3", dbname = "testingdb")
This is what the first few lines of the csv look like:
"{6DA08449}","89000","2002-05-10 00:00","MK42Q","S","N","F","38",""
"{6DB08449}","100000","2002-05-10 00:00","M429HQ","S","N","F","38",""

Related

Rmarkdown - Use table name as variable in dynamic sql chunk?

I need to execute an SQL-engine chunk in my Rmarkdown, where the table which is queried has a dynamic name, defined by R code.
I know that linking variables to the current R-environment is doable by using ?, but this works only for strings and numerics, not for "objects".
Of course I could just run the SQL query with DBI::dbGetQuery() but this would imply building all my request (which is very long) as a string which is not comfortable (I have many chunks to run).
Basically what I would need is :
```{r}
mytable <- "name_of_table_on_sql_server"
```
then
```{sql}
select * from ?mytable
```
This fails because the created query is select * from "name_of_table_on_sql_server" where SQL would need select * from name_of_table_on_sql_server (without quotes).
Using glue to define mytable as mytable <- glue("name_of_table_on_sql_server") does not work either.
Any ideas?
A slight variant on what you posted works for me (I don't have SQL Server so I tested with sqlite):
```{r}
library(glue)
mytable <- glue_sql("name_of_table_on_sql_server")
```
then
```{sql}
select * from ?mytable;
```
My only real changes were to use the function glue_sql and add a semicolon (;) to the end of the SQL chunk.

Putting output from sql query into another query using R environment

I am wondering what approach I should take to do what the title describes. I am using an ODBC connection, and the first SQL query returns about 40-50 rows in a single column. I want to use this output as the values to search for in a second query.
How should I treat this? As an array or as separate variables? I still do not know R well, so I just need to know where to look.
Regards
------more explanation below----
I have list of 40-50 numbers of 10 digits each, organized in a column.
I am trying to do this:
list <- c(my_input)
sql_in <- paste0(list, collapse="")
and characters are organized like this after this operations:
'c(1234567890, , 1234567890, 1234567890)'
Almost everything looks fine and fits into my query, apart from the extra c character at the beginning and the missing apostrophes. I tried the gsub function, but it did not work the way I wanted.
You can likely do this in one SQL call using a subquery. Notice in the call below that the result of
SELECT n_gear
FROM Gear
WHERE n_gear IN (3,4)
is passed to the WHERE clause of the primary query. This is perfectly valid and will allow your query to execute entirely in SQL without having to do any intermediate steps in R.
(I use sqldf for simplicity of illustration, but this should work through just about any ODBC connection)
library(sqldf)
Gear <- data.frame(n_gear = 1:5)
sqldf(
"SELECT mpg, qsec, gear, wt
FROM mtcars
WHERE gear IN (SELECT n_gear
FROM Gear
WHERE n_gear IN (3,4))"
)
Try something like this:
list<-c("try","this") #The output from your first query
sql_in<-paste0(list, collapse="','")
The Output
paste0("select * from table where table.var in ",paste("('",sql_in,"')",sep=''))
[1] "select * from table where table.var in ('try','this')"
If an element of the vector has a space as its first or last character, you can use this code:
list<-c(" first element is a space","try","this","last element is a space ") #The output from your first query
Find space at first or last character
first_space<-substr(list, start = 1, stop = 1)==" "
last_space<-substr(list, start = nchar(list), stop = nchar(list))==" "
Remove spaces
list[first_space]<-substr(list[first_space], start = 2, stop = nchar(list[first_space]))
list[last_space]<-substr(list[last_space], start = 1, stop = nchar(list[last_space])-1)
sql_in<-paste0(list, collapse="','")
Your output
paste0("select * from table where table.var in ",paste("('",sql_in,"')",sep=''))
"select * from table where table.var in ('first element is a space','try','this','last element is a space')"
I think you are expecting something like the code below:
data <- dbGetQuery(con, "select column from yourfirsttable")
list <- paste(data$column, collapse="','")
result <- dbGetQuery(con, statement = sprintf("select * from yourresulttable where inv in ('%s')",list))
It's not entirely clear exactly what you're wanting to achieve here. For example, one use case just means you can do it all with a join. But I have cases where I don't know the values for the test without doing some computation. Then I do a separate query having created a query string thus:
> id <- 1:5
> paste0("SELECT * FROM table WHERE ID IN (", paste0(id, collapse = ","), ")")
[1] "SELECT * FROM table WHERE ID IN (1,2,3,4,5)"

IPython SQL Magic - Generate Query String Programmatically

I'm generating SQL programmatically so that, based on certain parameters, the query that needs to be executed could be different (i.e., tables used, unions, etc). How can I insert a string like this: "select * from table", into a %%sql block? I know that using :variable inserts variable into the %%sql block, but it does so as a string, rather than sql code.
The answer was staring me in the face:
query="""
select
*
from
sometable
"""
%sql $query
If you want to templatize your queries, you can use string.Template:
from string import Template
template = Template("""
SELECT *
FROM my_data
LIMIT $limit
""")
limit_one = template.substitute(limit=1)
limit_two = template.substitute(limit=2)
%sql $limit_one
Source: JupySQL documentation.
Important: If you use this approach, ensure you trust/sanitize the input!
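One way to act on that warning is to validate every value before it reaches the template. The following is only a minimal sketch: `ALLOWED_TABLES`, `QUERY`, and `build_query` are hypothetical names introduced here, and the allow-list contents are assumptions, not something the original answer or the JupySQL documentation prescribes.

```python
from string import Template

# Hypothetical allow-list: the only table names this notebook may query.
ALLOWED_TABLES = {"my_data", "sometable"}

QUERY = Template("SELECT * FROM $table LIMIT $limit")


def build_query(table, limit):
    """Return the SQL text, rejecting anything not explicitly allowed."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table!r}")
    limit = int(limit)  # raises ValueError for input such as "1; DROP TABLE x"
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return QUERY.substitute(table=table, limit=limit)


query = build_query("my_data", 1)
print(query)  # SELECT * FROM my_data LIMIT 1
```

In a notebook the resulting string could then be run with `%sql $query`, as in the answers above. Values (as opposed to table names or keywords) are usually better passed with the `:variable` syntax mentioned in the question, so that they are not interpreted as SQL code.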

PYTHON - Using double quotes in SQL constant

I have a SQL query entered into a constant. One of the fields that I need to put in my where clause is USER which is a key word. To run the query I put the keyword into double quotes.
I have tried all of the suggestions from here yet none seem to be working.
Here is what I have for my constant:
SELECT_USER_SECURITY = "SELECT * FROM USER_SECURITY_TRANSLATED WHERE \"USER\" = '{user}' and COMPANY = " \
"'company_number' and TYPE NOT IN (1, 4)"
I am not sure how to get this query to work from my constant.
I also tried wrapping the whole query in """. I am getting a key error on the USER.
SELECT_USER_SECURITY = """SELECT * FROM USER_SECURITY_TRANSLATED WHERE "USER" = '{user}' and
COMPANY = 'company_number' and TYPE NOT IN (1, 4)"""
Below is the error I am getting:
nose.proxy.KeyError: 'user'
So the triple quoted solution was the best one. The problem I was running into was I had not included the "user" key in my dictionary of params which formatted the query.
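To illustrate the resolution, here is a small sketch of how `str.format` behaves with the triple-quoted constant. It assumes that `company_number` was also meant to be a placeholder (the posted constant has it without braces), and the parameter values are made up for the example.

```python
# Triple quotes let the SQL keep its own double quotes around "USER" unescaped.
# Assumption: company_number is intended as a placeholder, so it gets braces here.
SELECT_USER_SECURITY = """SELECT * FROM USER_SECURITY_TRANSLATED
WHERE "USER" = '{user}' AND COMPANY = '{company_number}' AND TYPE NOT IN (1, 4)"""

# Hypothetical parameter values.
params = {"user": "jdoe", "company_number": "001"}
query = SELECT_USER_SECURITY.format(**params)
print(query)

# Leaving "user" out of the params reproduces the reported error.
try:
    SELECT_USER_SECURITY.format(company_number="001")
except KeyError as exc:
    missing = exc.args[0]
    print("missing key:", missing)  # missing key: user
```

The double quotes themselves are not what triggers the `KeyError`. It comes from a `{user}` placeholder that has no matching key in the arguments passed to `format`.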

Limit number of characters imported from SQL in R

I am using the sqlQuery function in R to query the DB from R. I am using the following lines
for (i in 1:length(Counter)){
if (Counter[i] %in% str_sub(dir(),1,29) == FALSE){
DT <- data.table(sqlQuery(con, paste0("select a.* from edp_data.sme_loan a
where a.edcode IN (", print(paste0("\'",EDCode,"\'"), quote=FALSE),
") and a.poolcutoffdate in (",print(paste0("\'",str_sub(PoolCutoffDate,1,4),"-",str_sub(PoolCutoffDate,5,6),"-",
str_sub(PoolCutoffDate,7,8),"\'"), quote=FALSE),")")))}}
Thus I am importing subsets of the DB by EDCode and PoolCutoffDate. This works perfectly, however there is one variable in edp_data.sme in one particular EDCode which produces an undesired result.
If I take the unique values of this "as3" variable for a particular EDCode I get:
unique(DT$as3)
[1] 30003000000000019876240886000 30003000000000028672000424000
In reality there should be more unique IDs in this DB. The problem is that the as3 string is much longer than what is imported.
nchar(unique(DT$as3))
[1] 29 29
How can I import more characters from this string? I do not want to specify each variable in select a.* ideally, but only make sure that it imports the full string of as3.
Any help is appreciated!