Converting an SQL Statement into R Code Without SQLDF

I'm a new-ish programmer in R and I'm having a bit of an issue with some SQL code.
What I want to do is to convert this operation to base R code. I know it's quite complicated and I tried using merge but I didn't really manage to get anywhere.
Censored <- sqldf("SELECT Censored1.ModelYearID, Censored1.InServiceDate, Censored1.Censored, Censored1.VIN
FROM Censored1 LEFT JOIN Claims ON Censored1.VIN = Claims.VIN
GROUP BY Censored1.ModelYearID, Censored1.InServiceDate, Censored1.Censored, Censored1.VIN, Claims.VIN
HAVING (((Claims.VIN) Is Null))")
The reason I want to do this is because I have ~1600 different Claims tables in a data frame list (df_listl) which are named like this:
LabourOperation.ModelYearID e.g. Q123456.1997, Q234567.1998
and I need to run this query for every one of these tables, putting each of the comparable censored tables in the same kind of list.
If anyone could help me with this, that would be great. It's a bit complicated and I'm really struggling as I've only just learnt that you can put data frames in lists!
I was thinking that lapply might be a good way to go but I'm not very good with functions yet.
Thank you in advance :D
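For reference, the query is an anti-join: it keeps the rows of Censored1 whose VIN has no match in Claims, and the GROUP BY over every selected column only removes duplicate rows. In base R the same idea is usually written as a filter with `%in%` plus `unique()`, wrapped in a function and applied with `lapply` over the list. The logic is sketched below in Python rather than R, purely as an illustration. The sample rows are hypothetical, and the sketch assumes a single Censored1 table shared by all Claims tables, which the question does not confirm.

```python
def censored_without_claims(censored_rows, claims_rows):
    """Anti-join: keep censored rows whose VIN never appears in claims, without duplicates."""
    claimed_vins = {row["VIN"] for row in claims_rows}
    seen = set()
    result = []
    for row in censored_rows:
        # Same columns as the SELECT / GROUP BY list in the original query.
        key = (row["ModelYearID"], row["InServiceDate"], row["Censored"], row["VIN"])
        if row["VIN"] not in claimed_vins and key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Hypothetical sample data standing in for Censored1 and the list of Claims tables.
censored1 = [
    {"ModelYearID": 1997, "InServiceDate": "1997-03-01", "Censored": 1, "VIN": "A"},
    {"ModelYearID": 1997, "InServiceDate": "1997-04-01", "Censored": 1, "VIN": "B"},
    {"ModelYearID": 1997, "InServiceDate": "1997-04-01", "Censored": 1, "VIN": "B"},  # duplicate row
    {"ModelYearID": 1998, "InServiceDate": "1998-01-15", "Censored": 1, "VIN": "C"},
]
claims_by_name = {
    "Q123456.1997": [{"VIN": "A"}],
    "Q234567.1998": [{"VIN": "C"}, {"VIN": "C"}],
}

# One result table per Claims table, stored under the same name.
censored_by_name = {
    name: censored_without_claims(censored1, claims)
    for name, claims in claims_by_name.items()
}

for name, rows in censored_by_name.items():
    print(name, [row["VIN"] for row in rows])
```

The dictionary comprehension plays the role that `lapply` over a named list would play in R: one function, applied to every Claims table, with the names preserved.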

Related

JOIN with a dataset

I am very new to Superset and SQL in general, so please excuse my poor language as well.
General question: How do I use an existing Superset dataset in a SQL query?
Case: I am trying to create a map based on German postal codes. Therefore I need to join that table with a translation table that maps German postal codes to JSON coordinates. The translation table is in a different database from the postal codes. I keep trying to JOIN the two together, but it does not work. I assume you can only work with data from one single database at a time. Is it possible to create datasets with the needed data and reuse these datasets in a SQL query? I tried this, but I don't know how to access them. When using data from a database I would write:
Select * from database.table
To access a Superset dataset in my query, I tried:
Select * from dataset (using the name shown in the Superset dataset list)
which does not work at all.
I am desperately trying to solve this problem, but I am just not able to.
Thanks for your help in advance.
In Superset's SQL Lab, you can run pretty much any valid SQL query that your database accepts. The query is sent more or less as-is to your database, and the results are displayed in the results panel. So you can run JOIN queries in SQL Lab, for example.
If you want to visualize data from the results of a SQL query, hit the "Explore" button after running the query. You'll then be asked to publish the query you wrote and ran as a Virtual Dataset. Finally, you'll be taken to Explore, the no-code chart builder, to visualize your data.
I wrote a bit more about the semantic layer in Superset here, if you'd like to learn more: https://preset.io/blog/understanding-superset-semantic-layer/

Learning ExecuteSQL in FMP12, a few questions

I have started a new job where I am required to use FileMaker (and gradually transition systems to other databases). I have been a DB Admin of an MS SQL Server database for ~2 years, and I am very well versed in PL/SQL and T-SQL. I am trying to carry my SQL knowledge over to FMP using the ExecuteSQL functionality, and I'm kinda running into a lot of small pains :)
I have 2 tables: Movies and Genres. The relevant columns are:
Movies(MovieId, MovieName, GenreId, Rating)
Genres(GenreId, GenreName)
I'm trying to find the movie with the highest rating in each genre. The SQL query for this would be:
SELECT M.MovieName
FROM Movies M INNER JOIN Genres G ON M.GenreId=G.GenreId
WHERE M.Rating=
(
SELECT MAX(Rating) FROM Movies WHERE GenreId = M.GenreId
)
I translated this as best as I could to an ExecuteSQL query:
ExecuteSQL ("
SELECT M::MovieName FROM Movies M INNER JOIN Genres G ON M::GenreId=G::GenreId
WHERE M::Rating =
(SELECT MAX(M2::Rating) FROM Movies M2 WHERE M2::GenreId = M::GenreId)
"; "" ; "")
I set the field type to Text and also ensured values are not stored. But all I see are '?' marks.
What am I doing incorrectly here? I'm sorry if it's something really stupid, but I'm new to FMP and any suggestions would be appreciated.
Thank you!
--
Ram
UPDATE: Solution and the thought process it took to get there:
Thanks to everyone who helped me solve the problem. You guys made me realize that the traditional SQL thought process does not carry over exactly to FMP. When I probed around, I realized that to make the best use of SQL knowledge in FMP, I should consider each column independently and not think of the entire result set when I write a query. For my current functionality, this means the JOIN is no longer necessary. The JOIN was there to bring in the GenreName, which is a different column that FMP maps automatically. I just needed to remove the JOIN, and it works perfectly.
TL;DR: The thought process context should be the current column, not the entire expected result set.
Once again, thank you #MissJack, #Chuck (how did you even get that username?), #pft221 and #michael.hor257k
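To illustrate why dropping the JOIN does not change the result, here is a minimal sketch that runs both forms of the query. It uses SQLite through Python's sqlite3 module as a stand-in, not FileMaker's ExecuteSQL, so it only demonstrates the SQL logic. The sample rows are hypothetical, and the two queries agree only under the assumption that every movie's GenreId exists in Genres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Genres (GenreId INTEGER PRIMARY KEY, GenreName TEXT);
CREATE TABLE Movies (MovieId INTEGER PRIMARY KEY, MovieName TEXT, GenreId INTEGER, Rating REAL);
-- Hypothetical sample data
INSERT INTO Genres VALUES (1, 'Drama'), (2, 'Comedy');
INSERT INTO Movies VALUES
  (1, 'Alpha', 1, 8.5), (2, 'Beta', 1, 7.0),
  (3, 'Gamma', 2, 6.5), (4, 'Delta', 2, 9.0);
""")

# Original form: JOIN to Genres even though no Genres column is selected.
with_join = """
SELECT M.MovieName
FROM Movies AS M INNER JOIN Genres AS G ON M.GenreId = G.GenreId
WHERE M.Rating = (SELECT MAX(M2.Rating) FROM Movies AS M2 WHERE M2.GenreId = M.GenreId)
ORDER BY M.MovieName
"""

# Simplified form: the correlated subquery alone picks the top movie per genre.
without_join = """
SELECT M.MovieName
FROM Movies AS M
WHERE M.Rating = (SELECT MAX(M2.Rating) FROM Movies AS M2 WHERE M2.GenreId = M.GenreId)
ORDER BY M.MovieName
"""

joined = [row[0] for row in conn.execute(with_join)]
unjoined = [row[0] for row in conn.execute(without_join)]
print(joined, unjoined)
```

Both lists contain the top-rated movie of each genre. Ties within a genre would return every tied movie in either form.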
I've found that FileMaker is very particular in its formatting of queries using the ExecuteSQL function. In many cases, standard SQL syntax will work fine, but in some cases you have to make some slight (but important) tweaks.
I can see two things here that might be causing the problem...
ExecuteSQL ("
SELECT M::MovieName FROM Movies M INNER JOIN Genres G ON
M::GenreId=G::GenreId
WHERE M::Rating =
(SELECT MAX(M2::Rating) FROM Movies M2 WHERE M2::GenreId = M::GenreId)
"; "" ; "")
1. You can't use the standard FMP table::field format inside the query. Within the quotes inside the ExecuteSQL function, you should follow the SQL format of table.column. So M::MovieName should be M.MovieName.
2. I don't see an AS anywhere in your code. In order to create an alias, you must state it explicitly. For example, in your FROM, it should be Movies AS M.
I think if you fix those two things, it should probably work. However, I've had some trouble with JOINs myself, as my primary experience is with FMP, and I'm only just now becoming more familiar with SQL syntax.
Because it's incredibly hard to debug SQL in FMP, the best advice I can give you here is to start small. Begin with a very basic query, and once you're sure that's working, gradually add more complicated elements one at a time until you encounter the dreaded ?.
There's a number of great posts on FileMaker Hacks all about ExecuteSQL:
Since you're already familiar with SQL, I'd start with this one: The Missing FM 12 ExecuteSQL Reference. There's a link to a PDF of the entire article if you scroll down to the bottom of the post.
I was going to recommend a few more specific articles (like the series on Robust Coding, or Dynamic Parameters), but since I'm new here and I can't include more than 2 links, just go to FileMaker Hacks and search for "ExecuteSQL". You'll find a number of useful posts.
NB If you're using FMP Advanced, the Data Viewer is a great tool for testing SQL. But beware: complex queries on large databases can sometimes send it into fits and freeze the program.
The first thing to keep in mind when working with FileMaker and ExecuteSQL() is the difference between tables and table occurrences. This is a concept that's somewhat unique to FileMaker. Succinctly, tables store the data, but table occurrences define the context of that data. Table occurrences are what you're seeing in FileMaker's relationship graph, and the ExecuteSQL() function needs to reference the table occurrences in its query.
I agree with MissJack regarding the need to start small in building the SQL statement and use the Data Viewer in FileMaker Pro Advanced, but there's one more recommendation I can offer, which is to use SeedCode's SQL Explorer. It does require the adding of table occurrences and fields to duplicate the naming in your existing solution, but this is pretty easy to do and the file they offer includes a wizard for building the SQL query.

Script to compare two tables in database, from user input

I am very new to VBA and SQL and am trying to learn. I have an MS Access project that requires a VBA script that prompts the user to input two table names and numerous field names, and then creates a SQL query using those names.
The specific SQL query I'm trying to use is below.
SELECT
A.user_index, A.input1, B.input1, A.input2, B.input2, A.input3, B.input3, B.input4,
A.input4, A.input5, B.input5
FROM
table1 AS A
LEFT JOIN
table2 AS B ON A.user_index = B.user_index
WHERE
((A.input1) <> [B].[input1]) OR
((A.input2) <> [B].[input2]) OR
((A.input4) <> [B].[input4]);
The overall purpose of this is to have a script that will be able to list fields for comparison that is applicable with any database. I know this is probably a relatively easy solution. However, I have no idea where to start.
My first instinct is to say "What have you tried so far?", but as you said, you don't know where to start.
It sounds like you need to first prompt the user for several field and table names, then build a query based on those values. I recommend first outlining exactly what you want your script to do. Maybe something like:
1. Declare variables to hold the values.
2. Prompt the user for each of the values and store them in the variables.
2a. After the user enters a value, make sure it is valid. If not, do something accordingly.
3. Declare a variable to hold your SQL query.
4. Construct the query.
5. Run the query.
This is obviously just an example. Break down each step into "baby steps" as much as possible.
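The validation and query-construction steps above can be sketched as follows. This is Python rather than VBA and only illustrates the shape of the logic. The function names `check_identifier` and `build_comparison_query` are hypothetical, the values are passed in directly instead of being prompted for, and the rule for what counts as a valid name (letters, digits, underscores) is an assumption you would adapt to your own naming.

```python
import re

# Assumed rule: names consist of letters, digits and underscores only.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_identifier(name):
    """Reject anything that does not look like a plain table or field name."""
    if not IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid table or field name: {name!r}")
    return name


def build_comparison_query(table_a, table_b, key, fields):
    """Build a query listing rows where any compared field differs between two tables."""
    if not fields:
        raise ValueError("at least one field to compare is required")
    for name in (table_a, table_b, key, *fields):
        check_identifier(name)

    # Select the key once, then each compared field from both tables side by side.
    select_cols = [f"A.[{key}]"]
    for field in fields:
        select_cols += [f"A.[{field}]", f"B.[{field}]"]

    conditions = " OR ".join(f"A.[{field}] <> B.[{field}]" for field in fields)
    return (
        f"SELECT {', '.join(select_cols)} "
        f"FROM [{table_a}] AS A LEFT JOIN [{table_b}] AS B ON A.[{key}] = B.[{key}] "
        f"WHERE {conditions};"
    )


query = build_comparison_query("table1", "table2", "user_index", ["input1", "input2"])
print(query)
```

Validating the names before they are placed into the SQL text matters because table and field names cannot be passed as query parameters, so the check is the only thing standing between user input and the query string.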
It's a good idea to ask yourself how unique these baby steps are to your particular situation (hint: they almost certainly are not unique). If they aren't, then they have probably been solved tens of thousands of times already, so you have a very good chance of googling your questions.
If you still can't find an answer to how to do a particular step, feel free to ask here. Just remember to include your code even if it is broken :)

Is avoiding SQL statements in programs a good idea?

I recently came across a program that keeps its SQL statements in a database table, with a code for each statement, rather than having the SQL statements in the program itself.
So, rather than having code like this:
string query = "SELECT id, name from [Users]";
cmd.ExecuteQuery(query);
They use code like this: (simplified)
string firstQuery = "SELECT queryText from [Queries] where queryCode = 'SELECT_ALL_USERS'";
string userQuery = cmd.ExecuteQuery(firstQuery);//pretend this directly returns the result of the first query
cmd.ExecuteQuery(userQuery);
The logic behind this as far as I've heard is that it makes the program easier to maintain as the developer is free to change the "user sql" without having to actually change the program.
However, this struck me as maybe a little counterproductive. Would this kind of code be considered a good idea?
EDIT: I'm not looking for suggestions like "use an ORM". Assume that sql queries are the only option.
In my opinion, this approach is ridiculous. There is value (maintainability, modularity) in separating as much SQL from the middle tier as possible, but to accomplish this end, I would recommend using stored procedures.
No, I really don't think it's a good idea to proceed further with this design.
As a test or learning activity it is a different matter, but going forward with such an implementation is definitely not advisable.
Pros:
1. We get complete modularity. The real business schema can change at any time, and we do not need to modify the running application to get results from a different schema (provided the result format doesn't change).
Cons:
1. With this implementation we fire two SQL statements at the database every time we want to execute one. I/O calls, including DB calls, always carry a performance hit, and this implementation doubles the number of round trips, which is definitely not advisable.
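The two-round-trip cost can be made concrete with a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in for whatever database the program really uses. The helper names `run` and `run_stored` and the sample rows are hypothetical, and the counter simply counts calls made through `run`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Queries (queryCode TEXT PRIMARY KEY, queryText TEXT);
-- Hypothetical sample data
INSERT INTO Users VALUES (1, 'ann'), (2, 'bob');
INSERT INTO Queries VALUES ('SELECT_ALL_USERS', 'SELECT id, name FROM Users ORDER BY id');
""")

round_trips = 0


def run(sql, params=()):
    """Execute one statement and count it as one round trip to the database."""
    global round_trips
    round_trips += 1
    return conn.execute(sql, params).fetchall()


def run_stored(query_code):
    """Look up the SQL text by its code, then execute that text."""
    rows = run("SELECT queryText FROM Queries WHERE queryCode = ?", (query_code,))
    if not rows:
        raise KeyError(query_code)
    return run(rows[0][0])


users = run_stored("SELECT_ALL_USERS")
print(users, round_trips)
```

Fetching one logical result costs two statements here. Caching the looked-up query text in the application would remove the extra call, at the price of no longer picking up changes to the Queries table immediately.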

SQL - Return Columns with a specific value

Before I ask my question, I will lay out what I'm trying to do.
I have a table like the one below:
Columns - PID, Choice1, Choice2,......Choice10
Rows - 1,X, O, X, O.........
I've been searching on the net for quite some time and need a little push in the right direction, if what I'm trying to do is possible. While getting the code would help me with the small project I'm doing, it wouldn't really help me learn more about SQL.
Is it possible to search the table and return only the columns that have a value of X where PID = some value?
My gut instinct says no, and I might have to restructure my database to accomplish what I'm doing. As I said, a pointer in the right direction where I can read up on what I'm trying to do would be great; getting the code for it really doesn't help me learn it for future reference.
It does sound like you should restructure your database, but if your database supports them (SQL Server and Oracle do), you can use PIVOT and UNPIVOT to transpose and restructure the output table. Columns are normally fixed, with a variable number of rows depending on the WHERE clause. UNPIVOT turns the Choice columns into rows, which you can then filter for the value X.
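The unpivot idea can be sketched as follows. The example uses SQLite through Python's sqlite3 module, which has no UNPIVOT operator, so it emulates one with UNION ALL. The table name `Picks`, the three Choice columns and the sample rows are hypothetical stand-ins for the real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Picks (PID INTEGER PRIMARY KEY, Choice1 TEXT, Choice2 TEXT, Choice3 TEXT);
-- Hypothetical sample data
INSERT INTO Picks VALUES (1, 'X', 'O', 'X'), (2, 'O', 'O', 'X');
""")

# Column names come from a fixed list in the code, never from user input.
choice_columns = ["Choice1", "Choice2", "Choice3"]

# Emulated UNPIVOT: one SELECT per column, stacked into (PID, ChoiceName, ChoiceValue) rows.
unpivoted = " UNION ALL ".join(
    f"SELECT PID, '{col}' AS ChoiceName, {col} AS ChoiceValue FROM Picks"
    for col in choice_columns
)

sql = (
    f"SELECT ChoiceName FROM ({unpivoted}) AS u "
    "WHERE PID = ? AND ChoiceValue = 'X' ORDER BY ChoiceName"
)
columns_with_x = [row[0] for row in conn.execute(sql, (1,))]
print(columns_with_x)
```

Once the choices are rows rather than columns, "which columns hold X" becomes an ordinary WHERE filter. That is also what a restructured table with one row per (PID, choice) would give you directly.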