MySQL: Limit output according to associated ID - sql

So here's my situation. I have a books table and authors table. An author can have many books... In my authors page view, the user (logged in) can click an author in a tabled row and be directed to a page displaying the author's books (collected like this URI format: viewauthorbooks.php?author_id=23), very straight forward... However, in my query, I need to display the books for the author only, and not all books stored in the books table (as i currently have!) As I am a complete novice, I used the most simple query of:
SELECT * FROM books
This returns the books for me, but returns every single value (book) in the database, and not ones associated with the selected author. And when I click a different author the same books are displayed for them...I think everyone gets what I'm trying to achieve, I just don't know how to perform the query. I'm guessing that I need to start using more advanced query clauses like INNER JOIN etc. Anyone care to help me out :)

Enters the WHERE clause:
The WHERE clause is used to extract only those records that fulfill a specific criteria. In your case, all you need to do is:
SELECT * FROM tasks_tb WHERE author_id = '23';
You will obviously need to change the '23' with the value passed in the URL querystring, so that each page lists the books of each relevant author.
Since it is never too early to start reading about best practices, note that for public websites it is really dangerous to include any un-sanitized input into an SQL query. You may want to read further on this topic from the following Stack Overflow posts:
XKCD sql injection - please explain (with pictures!)
What is SQL injection?
Is SQL injection a risk today?
SQL Injection Topics on Stack Overflow

The basic syntax is
SELECT * FROM table WHERE column=value;
But you really need to study more SQL, I'd suggest going through sqlzoo tutorial. http://sqlzoo.net/

Related

Regexp search SQL query fields

I have a repository of SQL queries and I want to understand which queries use certain tables or fields.
Let's say I want to understand what queries use the email field, how can I write it?
Example SQL query:
select
users.email as email_user
,users.email as email_user_too
,email as email_user_too_2
email as email_user_too_3,
back_email as wrong_email -- wrong field
from users
So to state the problem more accurately, you are sorting through a list of SQL queries [as text], and you now need to find the queries that use certain fields using SQL & RegEx (Regular Expressions) in PostgreSQL. (please tag the question so that StackOverflow indexes your question correctly, more importantly, readers have more context about the question)
PostgreSQL has Regular Expression support OOTB (Out Of The Box). So we skip exploring other ways to do this. (If you are reading this as Microsoft SQL Server person, then I strongly suggest you to have a read of this brilliant article on Microsoft's website on defining a Table-Valued UDF (User Defined Function))
The simplest way I could think of to approach your problem, is to throw away what we don't want out of the query text first, and then filter out what's left.
This way, after throwing away the stuff you don't need, you will be left with a set of "tokens" that you can easily filter, and I'm putting token in quotes since we are not really parsing the SQL language, but if we did that would be the first step: to extract tokens.. (:
Take this query for example:
With Queries (
Id
, QueryText
) As (
values (1, 'select
users.email as email_user
,users.email as email_user_too
,email as email_user_too_2,
email as email_user_too_3,
back_email as wrong_email -- wrong field
from users')
)
Select QueryText
, found
From (
Select Id
, QueryText
, regexp_split_to_table (QueryText, '(--[\s\w]+|select|from|as|where|[ \s\n,])') As found
From Queries
) As Result
Where found != ''
And found = 'back_email'
I have sourced the concept of a "query repository" with a WITH statement for ease of doing the pseudo-code.
I have also selected few words/characters to split QueryText with. Like select, where etc. We don't need these in our 'found' set.
And in the end, as you can see above, I simply used found as what's left and filtered it with the field name you are looking for. (Assuming that you know the field you are looking for)
You could improve upon the RegEx I did, or change the method as you wish to make it better. But I think the general concept addresses what you need to achieve. One problem I can see with my solution right off the bat is the fact that you can search for anything really, not just names of the selected fields - which begs the question, why use RegEx, and not Like statements? But again, as I mentioned, you can improve upon the RegEx and address specific requirements you may have. Using Like might limit you in that direction. (In other words, only you know what's good for you. I can't say that from here.)
You can play with the query online here: db-fiddle query and use https://regex101.com/ for testing your RegEx.
Disclaimer I'm not a PostgreSQL developer. There must be other, perhaps better ways of doing this. (:

Methodology to check if a SQL query is correct for a context

I am doing dozen of queries per day, and I admit sometimes I miss the context of the user's demand.
I would like to know if you have any tips to check / double check, if a SQL query is really doing what the context is asking.
For example, I have this context :
retrieve the firstname of 10 male students from the school "great_school" having 21 years old.
For this context I would write a pseudo query like this :
SELECT st.firstname
FROM studient st
JOIN school_studient sc_st ON sc_st.studient_id = st.id
JOIN school sc ON sc.id = sc_st.shool_id
AND sc.name = "great_school"
WHERE st.age = 21
AND st.sexe = "male"
LIMIT 10
How to be sure that this query is really doing what the context asked ?
It isn't about using EXPLAIN to check if the query is valid, it is about checking that the query has all the conditions needful.
Is there any tools who is able to read pseudo query and tell what it does in human language ?
I was thinking about a paper checklist with 2 columns : "fields to select" and "criterias", and then I tick everytime one item is in the query.
But isn't there more advance tools than a piece of paper ?
Your question is asking for a recommendation for a software product, which is off topic on SO, but you might have more luck here.
However, I would focus more on a process than on a tool. I find it really helpful to work with sample data sets, and ask the end user to mark up what should and should not be included. The challenges in interpretation are usually much more around "what's in/out", "how do we aggregate (what's the group-by)" and "how do we sort" (your example grabs 10 random students).
If you can build a sample data set, and ask your users to say "I want these records to be included, those excluded", and "I want you to aggregate this column for every change in that column", you get a much higher quality specification. Once you find problems in the specification, you can adjust the sample data set to avoid that problem in future...

Learning ExecuteSQL in FMP12, a few questions

I have joined a new job where I am required to use FileMaker (and gradually transition systems to other databases). I have been a DB Admin of a MS SQL Server database for ~2 years, and I am very well versed in PL/SQL and T-SQL. I am trying to pan my SQL knowledge to FMP using the ExecuteSQL functionaloty, and I'm kinda running into a lot of small pains :)
I have 2 tables: Movies and Genres. The relevant columns are:
Movies(MovieId, MovieName, GenreId, Rating)
Genres(GenreId, GenreName)
I'm trying to find the movie with the highest rating in each genre. The SQL query for this would be:
SELECT M.MovieName
FROM Movies M INNER JOIN Genres G ON M.GenreId=G.GenreId
WHERE M.Rating=
(
SELECT MAX(Rating) FROM Movies WHERE GenreId = M.GenreId
)
I translated this as best as I could to an ExecuteSQL query:
ExecuteSQL ("
SELECT M::MovieName FROM Movies M INNER JOIN Genres G ON M::GenreId=G::GenreId
WHERE M::Rating =
(SELECT MAX(M2::Rating) FROM Movies M2 WHERE M2::GenreId = M::GenreId)
"; "" ; "")
I set the field type to Text and also ensured values are not stored. But all I see are '?' marks.
What am I doing incorrectly here? I'm sorry if it's something really stupid, but I'm new to FMP and any suggestions would be appreciated.
Thank you!
--
Ram
UPDATE: Solution and the thought process it took to get there:
Thanks to everyone that helped me solve the problem. You guys made me realize that traditional SQL thought process does not exactly pan to FMP, and when I probed around, what I realized is that to best use SQL knowledge in FMP, I should be considering each column independently and not think of the entire result set when I write a query. This would mean that for my current functionality, the JOIN is no longer necessary. The JOIN was to bring in the GenreName, which is a different column that FMP automatically maps. I just needed to remove the JOIN, and it works perfectly.
TL;DR: The thought process context should be the current column, not the entire expected result set.
Once again, thank you #MissJack, #Chuck (how did you even get that username?), #pft221 and #michael.hor257k
I've found that FileMaker is very particular in its formatting of queries using the ExecuteSQL function. In many cases, standard SQL syntax will work fine, but in some cases you have to make some slight (but important) tweaks.
I can see two things here that might be causing the problem...
ExecuteSQL ("
SELECT M::MovieName FROM Movies M INNER JOIN Genres G ON
M::GenreId=G::GenreId
WHERE M::Rating =
(SELECT MAX(M2::Rating) FROM Movies M2 WHERE M2::GenreId = M::GenreId)
"; "" ; "")
You can't use the standard FMP table::field format inside the query.
Within the quotes inside the ExecuteSQL function, you should follow the SQL format of table.column. So M::MovieName should be M.MovieName.
I don't see an AS anywhere in your code.
In order to create an alias, you must state it explicitly. For example, in your FROM, it should be Movies AS M.
I think if you fix those two things, it should probably work. However, I've had some trouble with JOINs myself, as my primary experience is with FMP, and I'm only just now becoming more familiar with SQL syntax.
Because it's incredibly hard to debug SQL in FMP, the best advice I can give you here is to start small. Begin with a very basic query, and once you're sure that's working, gradually add more complicated elements one at a time until you encounter the dreaded ?.
There's a number of great posts on FileMaker Hacks all about ExecuteSQL:
Since you're already familiar with SQL, I'd start with this one: The Missing FM 12 ExecuteSQL Reference. There's a link to a PDF of the entire article if you scroll down to the bottom of the post.
I was going to recommend a few more specific articles (like the series on Robust Coding, or Dynamic Parameters), but since I'm new here and I can't include more than 2 links, just go to FileMaker Hacks and search for "ExecuteSQL". You'll find a number of useful posts.
NB If you're using FMP Advanced, the Data Viewer is a great tool for testing SQL. But beware: complex queries on large databases can sometimes send it into fits and freeze the program.
The first thing to keep in mind when working with FileMaker and ExecuteSQL() is the difference between tables and table occurrences. This is a concept that's somewhat unique to FileMaker. Succinctly, tables store the data, but table occurrences define the context of that data. Table occurrences are what you're seeing in FileMaker's relationship graph, and the ExecuteSQL() function needs to reference the table occurrences in its query.
I agree with MissJack regarding the need to start small in building the SQL statement and use the Data Viewer in FileMaker Pro Advanced, but there's one more recommendation I can offer, which is to use SeedCode's SQL Explorer. It does require the adding of table occurrences and fields to duplicate the naming in your existing solution, but this is pretty easy to do and the file they offer includes a wizard for building the SQL query.

Where to get resources and demonstration on second order SQL injection?

I've been trawling around in the internet for a demo on second order SQLi but I still haven't found one yet. Many sites don't really give a thorough explanation on how it works.
I need to present a short demonstration and I've been practicing using Mutillidae. Can anybody lead me in the right direction?
A Google search for 'second order sql injection' comes up with a number of more or less relevant explanations of what Second Order SQL Injection is, with differing degrees of detail (as you say).
The basic idea is that the database stores some text from the user that is later incorporated into an SQL statement — but the text is insufficiently sanitized before reuse.
Think of an application which allows a user to create user-defined queries against a database. A simple example might be a bug tracking system. Some of the user-defined query attributes might be simple conditions such as 'bug status is "closed"'. This might be coded by looking at the stored query definition:
CREATE TABLE UserDefinedQuery
(
...user info...,
bug_status VARCHAR(20),
...other info...
);
SELECT ..., bug_status, ...
INTO ..., hv_bug_status, ...
FROM UserDefinedQuery
WHERE bug_status IS NOT NULL
AND ...other criteria...
where hv_bug_status is a host variable (PHP, C, whatever language you're using) holding the bug status criterion.
If this value is = 'closed', then the resulting SQL might contain:
SELECT *
FROM Bugs
WHERE status = 'closed'
AND ...other criteria...
Now suppose that when the user defined their query, they wrote instead:
= 'open' or 1=1
This means that the generated query now looks like:
SELECT *
FROM Bugs
WHERE status = 'open' or 1=1
AND ...other criteria...
The presence of the OR changes the meaning of the query dramatically and will show all sorts of other records that were not the ones that the user was intended to see. This is a bug in the bug querying application. If this modification means that CustomerX can see bugs reported by other customers CustomerY and CustomerZ that they are not supposed to see, then CustomerX has managed to create a second order SQL injection attack. (If the injection simply means that they get to see more records than they should, including ones that aren't relevant to them, then they've simply created a buggy query.)
Clearly, in a VARCHAR(20) field, your options for injecting lethal SQL are limited simply because SQL is a verbose language. But 'little Bobby Tables' could strike if the criteria are stored in a longer field.
='';DELETE Bugs;--
(Using a non-standard contraction for the DELETE statement; that squeaks in at 18 characters.)
How can you avoid this? Don't allow the user to write raw SQL fragments that you include in the generated SQL. Treat the value in UserDefinedQuery.Bug_Status as a space/comma separated list of string values, and build the query accordingly:
SELECT *
FROM Bugs
WHERE status IN ('=', '''open''', 'or', '1=1')
AND ...other criteria...
The query may not be useful, but it doesn't get its structure altered by the data in the UserDefinedQuery table.

Beginner SQL section: avoiding repeated expression

I'm entirely new at SQL, but let's say that on the StackExchange Data Explorer, I just want to list the top 15 users by reputation, and I wrote something like this:
SELECT TOP 15
DisplayName, Id, Reputation, Reputation/1000 As RepInK
FROM
Users
WHERE
RepInK > 10
ORDER BY Reputation DESC
Currently this gives an Error: Invalid column name 'RepInK', which makes sense, I think, because RepInK is not a column in Users. I can easily fix this by saying WHERE Reputation/1000 > 10, essentially repeating the formula.
So the questions are:
Can I actually use the RepInK "column" in the WHERE clause?
Do I perhaps need to create a virtual table/view with this column, and then do a SELECT/WHERE query on it?
Can I name an expression, e.g. Reputation/1000, so I only have to repeat the names in a few places instead of the formula?
What do you call this? A substitution macro? A function? A stored procedure?
Is there an SQL quicksheet, glossary of terms, language specification, anything I can use to quickly pick up the syntax and semantics of the language?
I understand that there are different "flavors"?
Can I actually use the RepInK "column" in the WHERE clause?
No, but you can rest assured that your database will evaluate (Reputation / 1000) once, even if you use it both in the SELECT fields and within the WHERE clause.
Do I perhaps need to create a virtual table/view with this column, and then do a SELECT/WHERE query on it?
Yes, a view is one option to simplify complex queries.
Can I name an expression, e.g. Reputation/1000, so I only have to repeat the names in a few places instead of the formula?
You could create a user defined function which you can call something like convertToK, which would receive the rep value as an argument and returns that argument divided by 1000. However it is often not practical for a trivial case like the one in your example.
Is there an SQL quicksheet, glossary of terms, language specification, anything I can use to quickly pick up the syntax and semantics of the language?
I suggest practice. You may want to start following the mysql tag on Stack Overflow, where many beginner questions are asked every day. Download MySQL, and when you think there's a question within your reach, try to go for the solution. I think this will help you pick up speed, as well as awareness of the languages features. There's no need to post the answer at first, because there are some pretty fast guns on the topic over here, but with some practice I'm sure you'll be able to bring home some points :)
I understand that there are different "flavors"?
The flavors are actually extensions to ANSI SQL. Database vendors usually augment the SQL language with extensions such as Transact-SQL and PL/SQL.
You could simply re-write the WHERE clause
where reputation > 10000
This won't always be convenient. As an alternativly, you can use an inline view:
SELECT
a.DisplayName, a.Id, a.Reputation, a.RepInK
FROM
(
SELECT TOP 15
DisplayName, Id, Reputation, Reputation/1000 As RepInK
FROM
Users
ORDER BY Reputation DESC
) a
WHERE
a.RepInK > 10
Regarding something like named expressions, while there are several possible alternatives, the query optimizer is going to do best just writing out the formula Reputation / 1000 long-hand. If you really need to run a whole group of queries using the same evaluated value, your best bet is to create view with the field defined, but you wouldn't want to do that for a one-off query.
As an alternative, (and in cases where performance is not much of an issue), you could try something like:
SELECT TOP 15
DisplayName, Id, Reputation, RepInk
FROM (
SELECT DisplayName, Id, Reputation, Reputation / 1000 as RepInk
FROM Users
) AS table
WHERE table.RepInk > 10
ORDER BY Reputation DESC
though I don't believe that's supported by all SQL dialects and, again, the optimizer is likely to do a much worse job which this kind of thing (since it will run the SELECT against the full Users table and then filter that result). Still, for some situations this sort of query is appropriate (there's a name for this... I'm drawing a blank at the moment).
Personally, when I started out with SQL, I found the W3 schools reference to be my constant stopping-off point. It fits my style for being something I can glance at to find a quick answer and move on. Eventually, however, to really take advantage of the database it is necessary to delve into the vendors documentation.
Although SQL is "standarized", unfortunately (though, to some extent, fortunately), each database vendor implements their own version with their own extensions, which can lead to quite different syntax being the most appropriate (for a discussion of the incompatibilities of various databases on one issue see the SQLite documentation on NULL handling. In particular, standard functions, e.g., for handling DATEs and TIMEs tend to differ per vendor, and there are other, more drastic differences (particularly in not support subselects or properly handling JOINs). If you care for some of the details, this document provides both the standard forms and deviations for several major databases.
You CAN refer to RepInK in the Order By clause, but in the Where clause you must repeat the expression. But, as others have said, it will only be executed once.
There are good answers for the technical problem already, so I'll only address some of the rest of your questions.
If you're just working with the DataExplorer, you'll want to familiarize yourself with SQL Server syntax since that's what it's running. The best place to find that, of course, is MSDN's reference.
Yes, there are different variations in SQL syntax. For example, the TOP clause in the query you gave is SQL Server specific; in MySQL you'd use the LIMIT clause instead (and these keywords don't necessarily appear in the same spot in the query!).