Methodology to check if a SQL query is correct for a context - sql

I write dozens of queries per day, and I admit I sometimes miss the context of the user's request.
I would like to know if you have any tips for checking, or double-checking, that a SQL query really does what the context asks.
For example, I have this context:
retrieve the first names of 10 male students from the school "great_school" who are 21 years old.
For this context I would write a pseudo query like this:
SELECT st.firstname
FROM studient st
JOIN school_studient sc_st ON sc_st.studient_id = st.id
JOIN school sc ON sc.id = sc_st.school_id
AND sc.name = "great_school"
WHERE st.age = 21
AND st.sexe = "male"
LIMIT 10
How can I be sure that this query really does what the context asked?
This isn't about using EXPLAIN to check that the query is valid; it is about checking that the query has all the necessary conditions.
Are there any tools that can read a pseudo query and describe what it does in plain language?
I was thinking about a paper checklist with 2 columns, "fields to select" and "criteria", ticking off each item once it is in the query.
But aren't there more advanced tools than a piece of paper?

Your question is asking for a recommendation for a software product, which is off topic on SO, but you might have more luck here.
However, I would focus more on a process than on a tool. I find it really helpful to work with sample data sets, and ask the end user to mark up what should and should not be included. The challenges in interpretation are usually much more around "what's in/out", "how do we aggregate (what's the group-by)" and "how do we sort" (your example grabs 10 random students).
If you can build a sample data set, and ask your users to say "I want these records to be included, those excluded", and "I want you to aggregate this column for every change in that column", you get a much higher quality specification. Once you find problems in the specification, you can adjust the sample data set to avoid that problem in future...
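The sample-data process above can be sketched roughly as follows. This is a minimal illustration in Python with the standard-library sqlite3 module, not a finished tool: the table names, the sample rows, and the check_query helper are all hypothetical, and the "expected" flag stands in for the end user's in/out markup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE school (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student (id INTEGER PRIMARY KEY, firstname TEXT, age INTEGER, sex TEXT);
    CREATE TABLE school_student (school_id INTEGER, student_id INTEGER);
""")
conn.executemany("INSERT INTO school VALUES (?, ?)",
                 [(1, "great_school"), (2, "other_school")])

# Hypothetical sample set: (id, firstname, age, sex, school_id, expected).
# 'expected' is the end user's markup: should this row be in the result?
sample = [
    (1, "Adam", 21, "male", 1, True),
    (2, "Bea", 21, "female", 1, False),  # wrong sex
    (3, "Carl", 22, "male", 1, False),   # wrong age
    (4, "Dan", 21, "male", 2, False),    # wrong school
    (5, "Eli", 21, "male", 1, True),
]
for sid, name, age, sex, school_id, _ in sample:
    conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", (sid, name, age, sex))
    conn.execute("INSERT INTO school_student VALUES (?, ?)", (school_id, sid))

QUERY = """
    SELECT st.firstname
    FROM student st
    JOIN school_student ss ON ss.student_id = st.id
    JOIN school sc ON sc.id = ss.school_id
    WHERE sc.name = 'great_school' AND st.age = 21 AND st.sex = 'male'
    LIMIT 10
"""

def check_query(connection, query, rows):
    """Compare the query result with the rows the user marked as expected."""
    expected = {name for _, name, _, _, _, keep in rows if keep}
    actual = {r[0] for r in connection.execute(query)}
    return {"missing": expected - actual, "unexpected": actual - expected}

result = check_query(conn, QUERY, sample)
print(result)
```

Each excluded row fails exactly one criterion, so dropping a condition from the query shows up as an "unexpected" name. That is the paper checklist turned into a repeatable check.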


What would be the best way to store checkboxes in SQL

Hello Community from a Newbie;
I haven't found anything like this here with the search function, so I hope it's not a stupid question. (And sorry for my bad English.)
I'm planning a checklist with checkboxes in it (over 30, I think). I'm now thinking about how to store the status (checked or not checked) in my SQL database and read it back out.
Plan A:
A column in the database for each checkbox. This sounds like the easy way, but I think there has to be a better way in terms of performance.
Plan B: a binary code in a single column. For example, 00100 means that only the 3rd checkbox is checked. This makes more sense to me, but I have the feeling it would be hard to code, especially when reading the checkbox list back out of SQL, like
if (00100 == 00010) checkbox2.checked;
if (00100 == 00100) checkbox3.checked;
I hope you can help a C# newbie and maybe give me some other ideas for solutions.
Thank you All
Greetings from Austria
So, there is a lot to unpack here. Basically, you are asking how to structure a database table.
IMHO Plan B is not a road you would like to go down. It will be quite confusing to anyone trying to make sense of it in the future. With respect to the table, it depends on your checklist.
Plan A is too static. Will that list change (grow or shrink)? When you talk about storing checks, you are storing responses. This raises the question: how will you be storing the actual questions?
I would probably create two separate tables, one for questions, and the other for responses.
This may seem a little overboard at first from your perspective, but it allows you to change your checklist over time.
Modeling-wise my question table might look like this:
table: questions
columns: content, order
table: responses
columns: question_id, user_id, checked:boolean
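As a rough sketch of that two-table model, here it is in Python with the standard-library sqlite3 module (an assumption made so the example is runnable; the original poster uses C# with an unspecified SQL database). The id column, the sample questions, the user id 42, and the checklist_for helper are all hypothetical additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (
        id INTEGER PRIMARY KEY,
        content TEXT NOT NULL,
        "order" INTEGER NOT NULL          -- quoted because ORDER is a reserved word
    );
    CREATE TABLE responses (
        question_id INTEGER NOT NULL REFERENCES questions(id),
        user_id INTEGER NOT NULL,
        checked INTEGER NOT NULL DEFAULT 0,  -- boolean stored as 0/1
        PRIMARY KEY (question_id, user_id)
    );
""")
conn.executemany(
    'INSERT INTO questions (id, content, "order") VALUES (?, ?, ?)',
    [(1, "Lights off", 1), (2, "Door locked", 2), (3, "Alarm set", 3)],
)
# Hypothetical user 42 has ticked only the third checkbox.
conn.execute("INSERT INTO responses VALUES (3, 42, 1)")

def checklist_for(connection, user_id):
    """Return (question, checked) pairs in display order for one user."""
    rows = connection.execute("""
        SELECT q.content, COALESCE(r.checked, 0)
        FROM questions q
        LEFT JOIN responses r ON r.question_id = q.id AND r.user_id = ?
        ORDER BY q."order"
    """, (user_id,))
    return [(content, bool(checked)) for content, checked in rows]

state = checklist_for(conn, 42)
print(state)
```

Adding a checkbox is then an INSERT into questions rather than a schema change, and a missing response row simply reads as unchecked.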

Learning ExecuteSQL in FMP12, a few questions

I have started a new job where I am required to use FileMaker (and gradually transition systems to other databases). I have been the DB admin of an MS SQL Server database for ~2 years, and I am very well versed in PL/SQL and T-SQL. I am trying to carry my SQL knowledge over to FMP using the ExecuteSQL functionality, and I'm running into a lot of small pains :)
I have 2 tables: Movies and Genres. The relevant columns are:
Movies(MovieId, MovieName, GenreId, Rating)
Genres(GenreId, GenreName)
I'm trying to find the movie with the highest rating in each genre. The SQL query for this would be:
SELECT M.MovieName
FROM Movies M INNER JOIN Genres G ON M.GenreId=G.GenreId
WHERE M.Rating=
(
SELECT MAX(Rating) FROM Movies WHERE GenreId = M.GenreId
)
I translated this as best as I could to an ExecuteSQL query:
ExecuteSQL ("
SELECT M::MovieName FROM Movies M INNER JOIN Genres G ON M::GenreId=G::GenreId
WHERE M::Rating =
(SELECT MAX(M2::Rating) FROM Movies M2 WHERE M2::GenreId = M::GenreId)
"; "" ; "")
I set the field type to Text and also ensured values are not stored. But all I see are '?' marks.
What am I doing incorrectly here? I'm sorry if it's something really stupid, but I'm new to FMP and any suggestions would be appreciated.
Thank you!
--
Ram
UPDATE: Solution and the thought process it took to get there:
Thanks to everyone who helped me solve the problem. You made me realize that the traditional SQL thought process does not carry over exactly to FMP. When I probed around, I realized that to make the best use of SQL knowledge in FMP, I should consider each column independently and not think of the entire result set when I write a query. For my current functionality, this means the JOIN is no longer necessary. The JOIN was there to bring in the GenreName, which is a different column that FMP maps automatically. I just needed to remove the JOIN, and it works perfectly.
TL;DR: The thought process context should be the current column, not the entire expected result set.
Once again, thank you @MissJack, @Chuck (how did you even get that username?), @pft221 and @michael.hor257k
I've found that FileMaker is very particular in its formatting of queries using the ExecuteSQL function. In many cases, standard SQL syntax will work fine, but in some cases you have to make some slight (but important) tweaks.
I can see two things here that might be causing the problem...
ExecuteSQL ("
SELECT M::MovieName FROM Movies M INNER JOIN Genres G ON
M::GenreId=G::GenreId
WHERE M::Rating =
(SELECT MAX(M2::Rating) FROM Movies M2 WHERE M2::GenreId = M::GenreId)
"; "" ; "")
You can't use the standard FMP table::field format inside the query.
Within the quotes inside the ExecuteSQL function, you should follow the SQL format of table.column. So M::MovieName should be M.MovieName.
I don't see an AS anywhere in your code.
In order to create an alias, you must state it explicitly. For example, in your FROM, it should be Movies AS M.
I think if you fix those two things, it should probably work. However, I've had some trouble with JOINs myself, as my primary experience is with FMP, and I'm only just now becoming more familiar with SQL syntax.
Because it's incredibly hard to debug SQL in FMP, the best advice I can give you here is to start small. Begin with a very basic query, and once you're sure that's working, gradually add more complicated elements one at a time until you encounter the dreaded ?.
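To illustrate the "start small" approach, here is a sketch that builds the query up in three steps. It runs against SQLite through Python's sqlite3 module, which is an assumption made purely so the steps can be executed; it says nothing about how FileMaker's ExecuteSQL parses the same text. The sample rows are made up, and the final step uses table.column with explicit AS aliases and no JOIN, as described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Genres (GenreId INTEGER PRIMARY KEY, GenreName TEXT);
    CREATE TABLE Movies (MovieId INTEGER PRIMARY KEY, MovieName TEXT,
                         GenreId INTEGER, Rating REAL);
    INSERT INTO Genres VALUES (1, 'Drama'), (2, 'Comedy');
    INSERT INTO Movies VALUES
        (1, 'Movie A', 1, 8.5),
        (2, 'Movie B', 1, 7.0),
        (3, 'Movie C', 2, 6.5),
        (4, 'Movie D', 2, 9.0);
""")

# Step 1: the simplest possible query, to confirm table and column names.
step1 = conn.execute("SELECT MovieName FROM Movies ORDER BY MovieId").fetchall()

# Step 2: the aggregate on its own, for one hard-coded genre.
step2 = conn.execute("SELECT MAX(Rating) FROM Movies WHERE GenreId = 1").fetchone()

# Step 3: the correlated subquery, with explicit AS aliases.
step3 = conn.execute("""
    SELECT M.MovieName
    FROM Movies AS M
    WHERE M.Rating = (SELECT MAX(M2.Rating)
                      FROM Movies AS M2
                      WHERE M2.GenreId = M.GenreId)
    ORDER BY M.MovieName
""").fetchall()

print(step1, step2, step3)
```

If a step fails, the problem is confined to whatever was added since the last working step.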
There are a number of great posts on FileMaker Hacks all about ExecuteSQL.
Since you're already familiar with SQL, I'd start with this one: The Missing FM 12 ExecuteSQL Reference. There's a link to a PDF of the entire article if you scroll down to the bottom of the post.
I was going to recommend a few more specific articles (like the series on Robust Coding, or Dynamic Parameters), but since I'm new here and I can't include more than 2 links, just go to FileMaker Hacks and search for "ExecuteSQL". You'll find a number of useful posts.
NB If you're using FMP Advanced, the Data Viewer is a great tool for testing SQL. But beware: complex queries on large databases can sometimes send it into fits and freeze the program.
The first thing to keep in mind when working with FileMaker and ExecuteSQL() is the difference between tables and table occurrences. This is a concept that's somewhat unique to FileMaker. Succinctly, tables store the data, but table occurrences define the context of that data. Table occurrences are what you're seeing in FileMaker's relationship graph, and the ExecuteSQL() function needs to reference the table occurrences in its query.
I agree with MissJack regarding the need to start small in building the SQL statement and use the Data Viewer in FileMaker Pro Advanced, but there's one more recommendation I can offer, which is to use SeedCode's SQL Explorer. It does require the adding of table occurrences and fields to duplicate the naming in your existing solution, but this is pretty easy to do and the file they offer includes a wizard for building the SQL query.

Script to compare two tables in database, from user input

I am very new to VBA and SQL and am trying to learn. I have an MS Access project that requires a VBA script that prompts the user to input two table names and numerous field names and creates a SQL query using those names.
The specific SQL query I'm trying to use is below.
SELECT
A.user_index, A.input1, B.input1, A.input2, B.input2, A.input3, B.input3, B.input4,
A.input4, A.input5, B.input5
FROM
table1 AS A
LEFT JOIN
table2 AS B ON A.user_index = B.user_index
WHERE
(((A.input1) <> [B].[input1])) OR
(((A.input2) <> [B].[input2])) OR
(((A.input4) <> [B].[input4]));
The overall purpose of this is to have a script that can list fields for comparison and that can be applied to any database. I know this is probably a relatively easy problem. However, I have no idea where to start.
My first instinct is to say "What have you tried so far?", but as you said, you don't know where to start.
It sounds like you need to first prompt the user for several field and table names, then build a query based on those values. I recommend first outlining exactly what you want your script to do. Maybe something like:
1. Declare variables to hold the values.
2. Prompt the user for each of the values and store them in the variables.
2a. After the user enters a value, make sure it is valid. If not, do something accordingly.
3. Declare a variable to hold your SQL query.
4. Construct the query.
5. Run the query.
This is obviously just an example. Break down each step into "baby steps" as much as possible.
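To show how the validate, construct, and run steps fit together, here is a rough sketch. It is written in Python with sqlite3 rather than VBA and Access, which is an assumption made so the example is self-contained; the function names, the identifier rule, and the sample tables are hypothetical. The user prompts are replaced by plain arguments.

```python
import re
import sqlite3

# Assumed rule for a "valid" name: letters, digits and underscores only.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def validate(name):
    """Step 2a: reject anything that is not a plain identifier."""
    if not IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return name

def build_comparison_query(table_a, table_b, key, fields):
    """Steps 3-4: construct the comparison query from validated names."""
    table_a, table_b, key = validate(table_a), validate(table_b), validate(key)
    fields = [validate(f) for f in fields]
    if not fields:
        raise ValueError("at least one field is required")
    columns = ", ".join(f"A.{f}, B.{f}" for f in fields)
    conditions = " OR ".join(f"A.{f} <> B.{f}" for f in fields)
    return (f"SELECT A.{key}, {columns} "
            f"FROM {table_a} AS A LEFT JOIN {table_b} AS B ON A.{key} = B.{key} "
            f"WHERE {conditions}")

# Step 5: run the query against two small made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (user_index INTEGER, input1 TEXT, input2 TEXT);
    CREATE TABLE table2 (user_index INTEGER, input1 TEXT, input2 TEXT);
    INSERT INTO table1 VALUES (1, 'a', 'x'), (2, 'b', 'y');
    INSERT INTO table2 VALUES (1, 'a', 'x'), (2, 'b', 'z');
""")
sql = build_comparison_query("table1", "table2", "user_index", ["input1", "input2"])
differences = conn.execute(sql).fetchall()
print(differences)
```

Table and column names cannot be passed as query parameters, which is why the sketch validates them before splicing them into the SQL text.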
It's a good idea to ask yourself how unique these baby steps are to your particular situation (hint: they almost certainly are not unique). If they aren't, then they have probably been solved tens of thousands of times already, so you have a very good chance of googling your questions.
If you still can't find an answer to how to do a particular step, feel free to ask here. Just remember to include your code even if it is broken :)

Beginner SQL section: avoiding repeated expression

I'm entirely new at SQL, but let's say that on the StackExchange Data Explorer, I just want to list the top 15 users by reputation, and I wrote something like this:
SELECT TOP 15
DisplayName, Id, Reputation, Reputation/1000 As RepInK
FROM
Users
WHERE
RepInK > 10
ORDER BY Reputation DESC
Currently this gives an Error: Invalid column name 'RepInK', which makes sense, I think, because RepInK is not a column in Users. I can easily fix this by saying WHERE Reputation/1000 > 10, essentially repeating the formula.
So the questions are:
Can I actually use the RepInK "column" in the WHERE clause?
Do I perhaps need to create a virtual table/view with this column, and then do a SELECT/WHERE query on it?
Can I name an expression, e.g. Reputation/1000, so I only have to repeat the names in a few places instead of the formula?
What do you call this? A substitution macro? A function? A stored procedure?
Is there an SQL quicksheet, glossary of terms, language specification, anything I can use to quickly pick up the syntax and semantics of the language?
I understand that there are different "flavors"?
Can I actually use the RepInK "column" in the WHERE clause?
No, but you can rest assured that your database will evaluate (Reputation / 1000) once, even if you use it both in the SELECT fields and within the WHERE clause.
Do I perhaps need to create a virtual table/view with this column, and then do a SELECT/WHERE query on it?
Yes, a view is one option to simplify complex queries.
Can I name an expression, e.g. Reputation/1000, so I only have to repeat the names in a few places instead of the formula?
You could create a user defined function which you can call something like convertToK, which would receive the rep value as an argument and returns that argument divided by 1000. However it is often not practical for a trivial case like the one in your example.
Is there an SQL quicksheet, glossary of terms, language specification, anything I can use to quickly pick up the syntax and semantics of the language?
I suggest practice. You may want to start following the mysql tag on Stack Overflow, where many beginner questions are asked every day. Download MySQL, and when you think there's a question within your reach, try to go for the solution. I think this will help you pick up speed, as well as awareness of the language's features. There's no need to post the answer at first, because there are some pretty fast guns on the topic over here, but with some practice I'm sure you'll be able to bring home some points :)
I understand that there are different "flavors"?
The flavors are actually extensions to ANSI SQL. Database vendors usually augment the SQL language with extensions such as Transact-SQL and PL/SQL.
You could simply re-write the WHERE clause
where reputation > 10000
This won't always be convenient. As an alternative, you can use an inline view:
SELECT
a.DisplayName, a.Id, a.Reputation, a.RepInK
FROM
(
SELECT TOP 15
DisplayName, Id, Reputation, Reputation/1000 As RepInK
FROM
Users
ORDER BY Reputation DESC
) a
WHERE
a.RepInK > 10
Regarding something like named expressions: while there are several possible alternatives, the query optimizer is likely to do best if you just write out the formula Reputation / 1000 long-hand. If you really need to run a whole group of queries using the same evaluated value, your best bet is to create a view with the field defined, but you wouldn't want to do that for a one-off query.
As an alternative, (and in cases where performance is not much of an issue), you could try something like:
SELECT TOP 15
DisplayName, Id, Reputation, RepInk
FROM (
SELECT DisplayName, Id, Reputation, Reputation / 1000 as RepInk
FROM Users
) AS t
WHERE t.RepInk > 10
ORDER BY Reputation DESC
though I don't believe that's supported by all SQL dialects and, again, the optimizer is likely to do a much worse job with this kind of thing (since it will run the SELECT against the full Users table and then filter that result). Still, for some situations this sort of query is appropriate (there's a name for this... I'm drawing a blank at the moment).
Personally, when I started out with SQL, I found the W3 schools reference to be my constant stopping-off point. It fits my style for being something I can glance at to find a quick answer and move on. Eventually, however, to really take advantage of the database it is necessary to delve into the vendor's documentation.
Although SQL is "standardized", unfortunately (though, to some extent, fortunately), each database vendor implements their own version with their own extensions, which can lead to quite different syntax being the most appropriate (for a discussion of the incompatibilities of various databases on one issue, see the SQLite documentation on NULL handling). In particular, standard functions, e.g., for handling DATEs and TIMEs, tend to differ per vendor, and there are other, more drastic differences (particularly in not supporting subselects or properly handling JOINs). If you care for some of the details, this document provides both the standard forms and deviations for several major databases.
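For anyone who wants to try the subquery form locally, here is a small sketch using Python's sqlite3 module. Note the assumptions: SQLite is not what the Data Explorer runs, it uses LIMIT where SQL Server uses TOP, and the Users rows are made up. The point is only that the expression is written once, inside the subquery, and then filtered by name outside it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (Id INTEGER PRIMARY KEY, DisplayName TEXT, Reputation INTEGER);
    INSERT INTO Users VALUES
        (1, 'alice', 25000),
        (2, 'bob',   12000),
        (3, 'carol', 10500),
        (4, 'dave',    900);
""")

rows = conn.execute("""
    SELECT DisplayName, Id, Reputation, RepInK
    FROM (
        -- The formula appears only here; integer division truncates.
        SELECT DisplayName, Id, Reputation, Reputation / 1000 AS RepInK
        FROM Users
    ) AS t
    WHERE t.RepInK > 10
    ORDER BY Reputation DESC
    LIMIT 15          -- SQLite's counterpart of SQL Server's TOP 15
""").fetchall()

print(rows)
```

With integer columns, 10500 / 1000 truncates to 10, so that user is filtered out by RepInK > 10.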
You CAN refer to RepInK in the Order By clause, but in the Where clause you must repeat the expression. But, as others have said, it will only be executed once.
There are good answers for the technical problem already, so I'll only address some of the rest of your questions.
If you're just working with the DataExplorer, you'll want to familiarize yourself with SQL Server syntax since that's what it's running. The best place to find that, of course, is MSDN's reference.
Yes, there are different variations in SQL syntax. For example, the TOP clause in the query you gave is SQL Server specific; in MySQL you'd use the LIMIT clause instead (and these keywords don't necessarily appear in the same spot in the query!).

MySQL: Limit output according to associated ID

So here's my situation. I have a books table and an authors table. An author can have many books... In my authors page view, the (logged-in) user can click an author in a table row and be directed to a page displaying the author's books (collected using this URI format: viewauthorbooks.php?author_id=23), very straightforward... However, in my query, I need to display the books for that author only, and not all books stored in the books table (as I currently have!). As I am a complete novice, I used the simplest query:
SELECT * FROM books
This returns the books for me, but returns every single value (book) in the database, and not ones associated with the selected author. And when I click a different author the same books are displayed for them...I think everyone gets what I'm trying to achieve, I just don't know how to perform the query. I'm guessing that I need to start using more advanced query clauses like INNER JOIN etc. Anyone care to help me out :)
Enter the WHERE clause:
The WHERE clause is used to extract only those records that fulfill specific criteria. In your case, all you need to do is:
SELECT * FROM books WHERE author_id = '23';
You will obviously need to replace the '23' with the value passed in the URL query string, so that each page lists the books of the relevant author.
Since it is never too early to start reading about best practices, note that for public websites it is really dangerous to include any un-sanitized input into an SQL query. You may want to read further on this topic from the following Stack Overflow posts:
XKCD sql injection - please explain (with pictures!)
What is SQL injection?
Is SQL injection a risk today?
SQL Injection Topics on Stack Overflow
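As a minimal illustration of keeping un-sanitized input out of the SQL text, here is a sketch using a parameterized query. It uses Python's sqlite3 module as a stand-in (the original page is PHP with MySQL, where prepared statements serve the same purpose); the books rows and the books_by_author helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    INSERT INTO books VALUES
        (1, 'First Book', 23),
        (2, 'Second Book', 23),
        (3, 'Other Book', 7);
""")

def books_by_author(connection, raw_author_id):
    """List titles for one author; raw_author_id comes from the query string."""
    author_id = int(raw_author_id)  # raises ValueError for non-numeric input
    cursor = connection.execute(
        # The ? placeholder keeps the value separate from the SQL text.
        "SELECT title FROM books WHERE author_id = ? ORDER BY id",
        (author_id,),
    )
    return [title for (title,) in cursor]

titles = books_by_author(conn, "23")
print(titles)
```

Because the value is passed separately from the statement, input such as "23 OR 1=1" is never interpreted as SQL; in this sketch it is rejected by the int() conversion before the query runs.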
The basic syntax is
SELECT * FROM table WHERE column=value;
But you really need to study more SQL; I'd suggest going through the SQLZoo tutorial: http://sqlzoo.net/