Understanding why UNION is used in this SQL injection example

I'm trying to understand more about SQL injection, so I found this lesson from Red Tiger Labs.
According to the solution, the cat=1 part of the URL is vulnerable to SQL injection.
I can understand that you can append ORDER BY X# and keep incrementing X to establish the number of columns, which is 4.
However according to the solution, the next step is to do:
cat=1 union select 1,2,3,4 from level1_users #
The table name is provided, so that's ok. But I'm really having trouble understanding the purpose of the UNION. My guess is the underlying code does something like:
SELECT * FROM level1_users where cat=1
Presumably it would expect only 0 or 1 results. Then it prints out some number of columns onto the screen. According to the example, it prints out:
This hackit is cool :)
My cats are sweet.
Miau
3
4
The first three lines were printed out without the extra SQL injection. So what's going on, and what's the significance?
I would not expect the union to do anything, I assume the numbers refer to columns?

So, I've managed to figure out what's going on here.
cat=1 union select 1,2,3,4 from level1_users #
The select part selects the numbers 1, 2, 3, 4 as columns. You could actually use anything here, like select 'cats', 'fish', 'bread', 42, and sometimes you have to, because the union select must return the same number of columns as the original query, with compatible types. Here integers work, hence selecting numbers.
I actually thought it might be selecting columns by their index, because often in SQL you can do ORDER BY 1, for example, to order by the first column. However, that's not the case.
What tripped me up was that this particular SQL injection website dumps the entire contents of the result set to the screen, and I wasn't expecting that. If you think about it, though, it is looking for a category id, so it's not unreasonable to expect it to list everything in that category.
Performing a union first shows that extra rows will be printed to the screen and, because we've numbered the columns, which columns are printed: columns 3 and 4.
From there it's possible to simply select username and password into those columns (you have to guess the column names in this instance because, although you can normally union onto the database's own data about its tables, that has been disabled for this exercise).
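To make the mechanics concrete, here is a minimal sketch using Python's built-in sqlite3 module. The texts table, its columns, and the sample user row are all made up for illustration (the real lab's schema is not known), and SQLite uses -- for comments where the MySQL example above uses #.

```python
import sqlite3

# Hypothetical schema: the real lab's tables and columns are not known.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE texts (id INTEGER, cat INTEGER, title TEXT, body TEXT);
    INSERT INTO texts VALUES (1, 1, 'This hackit is cool :)', 'My cats are sweet.');
    CREATE TABLE level1_users (id INTEGER, username TEXT, password TEXT);
    INSERT INTO level1_users VALUES (1, 'admin', 'secret');
""")

def show_category(cat):
    # Vulnerable on purpose: the parameter is pasted into the SQL text.
    sql = "SELECT id, cat, title, body FROM texts WHERE cat=" + cat
    return conn.execute(sql).fetchall()

def show_category_safe(cat):
    # Bound parameter: the input is treated as a value, never as SQL.
    sql = "SELECT id, cat, title, body FROM texts WHERE cat=?"
    return conn.execute(sql, (cat,)).fetchall()

# Normal request: one row from the category.
normal = show_category("1")

# Probe: the literals 1,2,3,4 come back as an extra row, which reveals
# which result columns the page actually prints.
probe = show_category("1 UNION SELECT 1,2,3,4 FROM level1_users --")

# Swap the printed positions (3 and 4) for real column names.
leak = show_category("1 UNION SELECT 1,2,username,password FROM level1_users --")

# The same input through a bound parameter matches nothing.
safe = show_category_safe("1 UNION SELECT 1,2,3,4 FROM level1_users --")
```

The last call is the usual fix: with a bound parameter the whole string is compared against cat as a single value, so the UNION never becomes part of the query.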

Related

What is "Select -1", and how is it different from "Select 1"?

I have the following query that is part of a common table expression. I don't understand the function of the "Select -1" statement. It is obviously different than the "Select 1" that is used in "EXISTS" statements. Any ideas?
select days_old,
count(express_cd),
count(*),
case
when round(count(express_cd)*100.0/count(*),2) < 1 then '0'
else ''
end ||
cast(decimal(round(count(express_cd)*100.0/count(*),2),5,2) as varchar(7)) ||
'%'
from foo.bar
group by days_old
union all
select -1, -- Selecting the -1 here
count(express_cd),
count(*),
case
when round(count(express_cd)*100.0/count(*),2) < 1 then '0'
else ''
end ||
cast(decimal(round(count(express_cd)*100.0/count(*),2),5,2) as varchar(7)) ||
'%'
from foo.bar
where days_old between 1 and 7
It's just selecting the number "minus one" for each row returned, just like "select 1" will select the number "one" for each row returned.
There is nothing special about the "select 1" syntax used in EXISTS statements, by the way; it's just selecting an arbitrary value, because EXISTS requires a record to be returned and a record needs data; the number 1 is sufficient.
Why you would do this, I have no idea.
When you have a union statement, each part of the union must contain the same number of columns. From what I read when I look at this, the first statement gives you one line for each days_old value along with some stats for that value. The second part of the union gives you a summary of all the records that are a week old or less. Since the days_old column is not relevant there, they put in a fake value as a placeholder in order to do the union. Of course, this is just a guess based on reading thousands of queries through the years. To be sure, I would need to actually run the code.
Since you say this is a CTE, to really understand why this is happening, you may need to look at the data it generates and how that data is used in the next query that uses the CTE. That might answer your question.
What you have asked is basically about a business rule unique to your company. The true answer should lie in any requirements documents for the original creation of the code. You should go look for them and read them. We can make guesses based on our own experience but only people in your company can answer the why question here.
If you can't find the documentation, then you need to talk (Yes directly talk, preferably in person) to the Stakeholders who use the data and find out what their needs were. Only do this after running the code and analyzing the results to better understand the meaning of the data returned.
Based on your query, all the records with days_old between 1 and 7 are counted together in a single row labelled -1; that is all select -1 does. There is nothing special here, and inside EXISTS there is no difference between select -1 and select 1: EXISTS only checks whether any row is returned, so the selected value does not matter.
Back to your query: I noticed that the two halves joined by union all select the same four columns. I am guessing your task is to get the per-day statistics and then combine them with one extra row that covers everything with days_old between 1 and 7.
It is just a grouping logic there.
Your query returns aggregated data (counts and rounded percentages) grouped by the days_old column, plus one more group for data where days_old is between 1 and 7.
So -1 is just an additional group there; it cannot be 1, because days_old=1 is another valid group.
The result will look like this:
row1: days_old=1 count(*)=2 ...
row2: days_old=3 count(*)=5 ...
row3: days_old=9 count(*)=6 ...
row4: days_old=-1 count(*)=7
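A small runnable sketch of the same shape, using Python's sqlite3 with made-up data (the bar table contents here are assumptions, and the percentage column is left out to keep it short):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bar (days_old INTEGER, express_cd TEXT);
    -- Made-up sample rows; NULL means no express code.
    INSERT INTO bar VALUES (1, 'A'), (1, NULL), (3, 'B'), (3, 'C'), (9, NULL);
""")

rows = conn.execute("""
    SELECT days_old, COUNT(express_cd), COUNT(*)
    FROM bar
    GROUP BY days_old
    UNION ALL
    SELECT -1, COUNT(express_cd), COUNT(*)   -- -1 labels the summary row
    FROM bar
    WHERE days_old BETWEEN 1 AND 7
""").fetchall()

for row in sorted(rows):
    print(row)
```

The row with days_old = -1 carries the totals for days 1 through 7, and because -1 can never be a real days_old value it cannot be confused with one of the per-day groups.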

sql query to show a range and account for missing numbers

I have a SQL query
SELECT
Group_Id, MIN(Rec_Number) as RecStart, MAX(Rec_Number) AS RecEnd
FROM
Rec
WHERE
Group_Id != ''
GROUP BY
Group_Id
ORDER BY
Group_Id
This produces the following kind of results.
92-2274 9222740001 9222740004
92-2275 9222750001 9222750026
etc...
However if record 3 is missing (in the first row for instance) the query obviously doesn't account for it. What I am trying to do is the following
92-2274 9222740001 9222740002
92-2274 9222740004 9222740018
92-2275 9222750001 9222750016
92-2275 9222750018 9222750026
etc...
So essentially each time the script sees a record missing inside the group it starts a new line whilst staying inside the group before iterating on the next group. The group_Id is of course the first 6 digits of the rec_Number
I would also like to do this as well
92-2274 0001 0002
92-2274 0003 0004
Or even trim it and remove the leading 0's as well, if possible. I know about using Right(Rec_Number, 4); however, as this is a float, the automatic conversion to string seems to be messing something up, as I get +009 in many columns, so I assume I need to cast first or something. This particular step I could do in Excel after the fact, I guess, but I'm sure SQL could do it if the guy writing the query was a DBA and not a bumbling server admin (that's me!)
So is there a way of doing that in SQL? I must warn you that the standard CTE approach and functions such as ROW_NUMBER don't work, as this is SQL Server 2000 - yes, it is that old!
Hence me struggling to find posts on Stack Overflow that apply. Many of them start with the WITH keyword which means I can't use any of those to start with!
I think I need an IF ELSE kind of block, but I am not sure what method I can use to get the query to create a new row each time it hits a missing consecutive number in the group range.
The final output will show me the ranges of records in each group whilst highlighting the missing ones via a new line each time.
For the second part, this should work :
RIGHT ( CAST ( MIN (Rec_Number) as Decimal(10)), 4)
It will only keep the last 4 characters of your number.
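For the first part (starting a new row at each gap), one possible approach that avoids WITH and ROW_NUMBER is a self-join with NOT EXISTS: a range starts at a record with no predecessor and ends at the nearest record with no successor. The sketch below runs it in SQLite through Python with made-up integer sample data; it has not been tried on SQL Server 2000 or against a float column, so treat it as a starting point only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Rec (Group_Id TEXT, Rec_Number INTEGER);
    -- Made-up sample: record ...0003 is missing from group 92-2274.
    INSERT INTO Rec VALUES
        ('92-2274', 9222740001), ('92-2274', 9222740002),
        ('92-2274', 9222740004), ('92-2274', 9222740005),
        ('92-2275', 9222750001);
""")

ranges = conn.execute("""
    SELECT s.Group_Id, s.Rec_Number AS RecStart, MIN(e.Rec_Number) AS RecEnd
    FROM Rec s
    JOIN Rec e
      ON e.Group_Id = s.Group_Id AND e.Rec_Number >= s.Rec_Number
    WHERE s.Group_Id != ''
      -- s starts a range: nothing sits directly before it
      AND NOT EXISTS (SELECT 1 FROM Rec p
                      WHERE p.Group_Id = s.Group_Id
                        AND p.Rec_Number = s.Rec_Number - 1)
      -- e ends a range: nothing sits directly after it
      AND NOT EXISTS (SELECT 1 FROM Rec n
                      WHERE n.Group_Id = e.Group_Id
                        AND n.Rec_Number = e.Rec_Number + 1)
    GROUP BY s.Group_Id, s.Rec_Number
    ORDER BY s.Group_Id, s.Rec_Number
""").fetchall()

for row in ranges:
    print(row)
```

With the sample data this yields two rows for 92-2274 (0001 to 0002 and 0004 to 0005) and one row for 92-2275.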

Custom SQL sort by

Use:
The user searches for a partial postcode such as 'RG20' which should then be displayed in a specific order. The query uses the MATCH AGAINST method in boolean mode where an example of the postcode in the database would be 'RG20 7TT' so it is able to find it.
At the same time it also matches against a list of other postcodes which are in its radius (which is a separate query).
I can't seem to find a way to order by a partial match, e.g.:
ORDER BY FIELD(postcode, 'RG20', 'RG14', 'RG18','RG17','RG28','OX12','OX11')
DESC, city DESC
Because it's not specifically looking for RG20 7TT, I don't think it can make a partial match.
I have tried SUBSTR (postcode, -4) and looked into left and right, but I haven't had any success using 'by field' and could not find another route...
Sorry this is a bit long winded, but I'm in a bit of a bind.
A UK postcode splits into 2 parts, the last section always being 3 characters and within my database there is a space between the two if that helps at all.
Although there is a DESC after the postcodes, I do need them to display in THAT particular order (RG20, RG14 then RG18 etc..) I'm unsure if specifying descending will remove the ordering or not
Order By Case
When postcode Like 'RG20%' Then 1
When postcode Like 'RG14%' Then 2
When postcode Like 'RG18%' Then 3
When postcode Like 'RG17%' Then 4
When postcode Like 'RG28%' Then 5
When postcode Like 'OX12%' Then 6
When postcode Like 'OX11%' Then 7
Else 99
End Asc
, City Desc
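A quick way to check how this CASE ordering behaves is to run it on a few rows. The sketch below uses Python's sqlite3 with a made-up places table and made-up postcode/city pairs, and a shortened CASE list:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE places (postcode TEXT, city TEXT);
    -- Made-up rows, only for illustrating the ordering.
    INSERT INTO places VALUES
        ('OX11 0AA', 'Didcot'), ('RG14 1AA', 'Newbury'),
        ('RG20 7TT', 'Newbury'), ('SW1A 1AA', 'London'),
        ('RG20 4AB', 'Kingsclere');
""")

ordered = conn.execute("""
    SELECT postcode, city
    FROM places
    ORDER BY CASE
                 WHEN postcode LIKE 'RG20%' THEN 1
                 WHEN postcode LIKE 'RG14%' THEN 2
                 WHEN postcode LIKE 'OX11%' THEN 7
                 ELSE 99          -- unlisted prefixes go last
             END ASC,
             city DESC
""").fetchall()

for row in ordered:
    print(row)
```

The ASC applies only to the CASE value, so the listed prefixes keep the order you wrote them in; city DESC only breaks ties within one prefix.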
You're on the right track, trimming the field down to its first four characters:
ORDER BY FIELD(LEFT(postcode, 4), 'RG20', 'RG14', ...),
-- or SUBSTRING(postcode FROM 1 FOR 4)
-- or SUBSTR(postcode, 1, 4)
Here you don't want DESC.
(If your result set contains postcodes whose prefixes do not appear in your FIELD() ordering list, you'll have a bit more work to do, since those records will otherwise appear before any explicitly ordered records you specify. Before 'RG20' in the example above.)
If you want a completely custom sorting scheme, then I only see one way to do it...
Create a table to hold the values upon which to sort, and include a "sequence" or "sort_order" field. You can then join to this table and sort by the sequence field.
One note on the sequence field. It makes sense to create it as an int as... well, sequences are often ints :)
If there is any possibility of changing the sort order, you may want to consider making it alphanumeric... It is a lot easier to insert "5A" between "5" and "6" than it is to insert a number into a sequence of integers.
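As a rough sketch of the lookup-table idea, again in SQLite via Python: the postcode_order table, its column names, and the sample rows are assumptions for illustration. The join key is the part of the postcode before the space, and prefixes that are not in the table sort last.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE places (postcode TEXT);
    INSERT INTO places VALUES ('RG14 1AA'), ('B1 1AA'), ('OX12 9ZZ'), ('RG20 7TT');

    -- Hypothetical sort table: one row per prefix with its position.
    CREATE TABLE postcode_order (prefix TEXT PRIMARY KEY, sort_order INTEGER);
    INSERT INTO postcode_order VALUES ('RG20', 1), ('RG14', 2), ('OX12', 6);
""")

ordered = [row[0] for row in conn.execute("""
    SELECT p.postcode
    FROM places p
    LEFT JOIN postcode_order o
           ON o.prefix = substr(p.postcode, 1, instr(p.postcode, ' ') - 1)
    ORDER BY COALESCE(o.sort_order, 9999),   -- unknown prefixes last
             p.postcode
""")]

print(ordered)
```

Changing the order then means updating rows in postcode_order rather than editing the query.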
Another method I use is utilising the charindex function:
order by charindex(substr(postcode,1,4),'RG20RG14RG18...',1)
I think that's the syntax anyway, I'm just doing this in SAS at the moment so I've had to adapt from memory!
But essentially the sooner you hit your desired part of the string, the higher the rank.
If you're trying to rank on a large variety of postcodes then a case statement gets pretty hefty.

Parameters in Microsoft Access

I'm really confused with how parameters work in Microsoft Access. I know that parameters are supposed to be used to allow a user to type in values when the query is run - instead of having to modify the query for each instance.
So, let's use the following example.
SELECT countyTable.countyName, Sqr((69.1*(46.47-avgLatitude))^2+(69.1*(-90.17-avgLongitude)*Cos(avgLatitude/57.3))^2) as Distance
FROM countyTable
WHERE ((([avgLatitude]-5)<46.47) AND (([avgLatitude]+5)>46.47) AND (([avgLongitude]-5)<-90.17) AND (([avgLongitude]+5)>-90.17))
ORDER BY Sqr((69.1*(46.47-avgLatitude))^2+(69.1*(-90.17-avgLongitude)*Cos(avgLatitude/57.3))^2), countyTable.countyName
1) I am SELECTing a column that contains the SQR function. I also have that column named as 'Distance'. However, when I try to ORDER BY on said column - and refer to it as 'Distance' - it asks for a value instead of sorting on that column. The only way I can get the query to ORDER BY is to duplicate the expression from the SELECT line. This seems unnecessary.
2) Right now, I have some values hard-coded in. I couldn't care less about the values '57.3' and '69.1'. However, for '46.47' I would like to replace with 'x2' and -90.17 with 'y2'. How I've been trying to write this with parameters, Access asks for values for each instance of 'x2' and 'y2'. This doesn't help me at all, so I have them hardcoded in.
Any help at all? Thanks!
1) I am SELECTing a column that contains the SQR function. I also have that column named as 'Distance'. However, when I try to ORDER BY on said column - and refer to it as 'Distance' - it asks for a value instead of sorting on that column. The only way I can get the query to ORDER BY is to duplicate the expression from the SELECT line. This seems unnecessary.
Yes Access does a poor job. Every real DBMS now supports ordering by the column alias created in the SELECT clause. To do this in Access, you can either do what you are doing (repeat the expression) or subquery it, e.g.
select a,b,c
from (
select a, b, a+b as C
from sometable
) AS SUBQUERIED
order by c
2) How I've been trying to write this with parameters, Access asks for values for each instance of 'x2' and 'y2'.
You're doing it wrong. Access should prompt only once. If you have a query like this
select a, b, a+b as C
from sometable
where a > [x] and b > [x]
It will see both [x]'s as being the same - and only one prompt for both. Just make sure they are spelt exactly the same.
If you wanted something like this simplified example:
SELECT
countyTable.countyName,
Sqr((69.1*(46.47-avgLatitude))^2+(69.1*(-90.17-avgLongitude)*Cos(avgLatitude/57.3))^2) as Distance
FROM countyTable
ORDER BY Distance;
For the ORDER BY you can reference that complex Distance expression by its ordinal position in the field list.
SELECT
countyTable.countyName,
Sqr((69.1*(46.47-avgLatitude))^2+(69.1*(-90.17-avgLongitude)*Cos(avgLatitude/57.3))^2) as Distance
FROM countyTable
ORDER BY 2;
That method is supported at least since Jet 4 (Access 2000), and also by the newer ACE database engine.
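The same three ideas (ordering by position, ordering through a subquery, and reusing one parameter name) can be tried outside Access. The sketch below uses SQLite through Python as a stand-in, with a made-up sometable; SQLite writes a named parameter as :x where Access uses [x], so this only shows the general behaviour, not Access itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sometable (a INTEGER, b INTEGER);
    INSERT INTO sometable VALUES (1, 5), (2, 1), (0, 2);
""")

# Order by the second column of the select list (the computed one).
by_position = conn.execute(
    "SELECT a, a + b AS c FROM sometable ORDER BY 2"
).fetchall()

# Same ordering via a subquery, where c is an ordinary column name.
by_subquery = conn.execute("""
    SELECT a, c
    FROM (SELECT a, a + b AS c FROM sometable) AS subqueried
    ORDER BY c
""").fetchall()

# One named parameter used twice is supplied only once.
filtered = conn.execute(
    "SELECT a, b FROM sometable WHERE a > :x AND b > :x", {"x": 0}
).fetchall()

print(by_position, by_subquery, sorted(filtered))
```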

MS SQL 2000 - How to efficiently walk through a set of previous records and process them in groups. Large table

I'd like to ask about one thing. I have a table in a DB. It has 2 columns and looks like this:
Name   bilance
Jane   +3
Jane   -5
Jane   0
Jane   -8
Jane   -2
Paul   -1
Paul   2
Paul   9
Paul   1
...
I have to walk through this table, and if I find a record with a different "name" (than on the previous row), I process all rows with the previous "name". (When I reach the first Paul row, I process all Jane rows.)
The processing goes like this:
Now I work only with the Jane records and walk through them one by one. On each record I stop and compare it with all previous Jane rows, one by one.
The task is to summarize the "bilance" column (within the scope of the current person) where the values have different signs.
Summary:
I loop through this table on 3 levels (nested loops):
1st level = search for changes of "name" column
2nd level = if change was found, get all rows with previous "name" and walk through them
3rd level = on each row stop and walk through all previous rows with current "name"
Can this be solved only using CURSOR and FETCHING, or is there some smoother solution?
My real table has 30,000 rows and 1,500 people, and if I do the logic in PHP, it takes several minutes and then times out. So I would like to rewrite it in MS SQL 2000 (no other DB is allowed). Are cursors a fast solution, or is it better to use something else?
Thank you for your opinions.
UPDATE:
There are lots of questions about my "summarization". The problem is a little more difficult than I explained; I simplified it just to describe my algorithm.
Each row of my table contains many more columns. The most important is month. That's why there are several rows for each person: each is for a different month.
"Bilances" are the "overtime hours" and "arrear hours" of workers. I need to sum up + and - bilances to neutralize them using values from previous months. I want to have as many zeroes as possible. The whole table must stay as it is; only the bilances must be changed, to zero wherever possible.
Example:
Row (Jane -5) will be summarized with row (Jane +3). Instead of 3 I will get 0 and instead of -5 I will get -2. Because I used this -5 to reduce +3.
Next row (Jane 0) won't be affected
Next row (Jane -8) can not be used, because all previous bilances are negative
etc.
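To pin down the rule before moving it into SQL, here is a minimal Python sketch of my reading of the description above: each row is compared with the earlier rows of the same person, and values with opposite signs cancel each other as far as possible. The function names are made up, and the rows are assumed to be already in month order.

```python
def neutralize(balances):
    """Offset each value against earlier values of the opposite sign."""
    result = list(balances)
    for i in range(len(result)):
        for j in range(i):
            if result[i] == 0:
                break  # nothing left to offset on this row
            if result[i] * result[j] < 0:  # opposite signs only
                # The smaller magnitude is the amount that cancels out.
                amount = min(abs(result[i]), abs(result[j]))
                result[i] += amount if result[i] < 0 else -amount
                result[j] += amount if result[j] < 0 else -amount
    return result


def neutralize_by_name(rows):
    """Group (name, bilance) rows by name, keeping row order per person."""
    by_name = {}
    for name, value in rows:
        by_name.setdefault(name, []).append(value)
    return {name: neutralize(values) for name, values in by_name.items()}


rows = [("Jane", 3), ("Jane", -5), ("Jane", 0), ("Jane", -8), ("Jane", -2),
        ("Paul", -1), ("Paul", 2), ("Paul", 9), ("Paul", 1)]
adjusted = neutralize_by_name(rows)
print(adjusted)
```

For Jane this reproduces the example: +3 becomes 0, -5 becomes -2, and the remaining rows are untouched. Grouping by name first removes the outermost of the three loops, so only the per-person comparison is quadratic.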
You can sum all the values per name using a single SQL statement:
select
name,
sum(bilance) as bilance_sum
from
my_table
group by
name
order by
name
On the face of it, it sounds like this should do what you want:
select Name, sum(bilance)
from table
group by Name
order by Name
If not, you might need to elaborate on how the Names are sorted and what you mean by "summarize".
I'm not sure what you mean by this line... "The task is to summarize "bilance" column (in the scope of actual person) if they have different signs".
But, it may be possible to use a group by query to get a lot of what you need.
select name, case when bilance < 0 then 'negative' when bilance >= 0 then 'positive' end as sign, count(*)
from table
group by name, case when bilance < 0 then 'negative' when bilance >= 0 then 'positive' end
That might not be perfect syntax for the case statement, but it should get you really close.