SQL Query Limit for DB2 AS/400 Version 4

I know the version is way too old (yes, version 4!), but I have no choice.
How can I limit a query to, for example, 100 rows on DB2 for AS/400?
FETCH FIRST n ROWS ONLY
and
ROW_NUMBER()
don't work.
Any ideas or workaround?
Here is a sample SQL query (does not work):
SELECT POLNOP FROM ZICACPTF.POLHDR FETCH FIRST 10 ROWS ONLY
It says
[SQL0199] Keyword FETCH not expected. Valid tokens: FOR WITH ORDER UNION OPTIMIZE.

There is no DBMS support for this operation. Check the Version 4 DB2 UDB for AS/400 SQL Reference: there are no LIMIT, TOP, or FIRST reserved words.
You can try to limit rows via the WHERE clause, e.g. where a sequence column is between 100 and 200, but that is an unrealistic scenario.
The first workaround is a cursor:
DECLARE ITERROWS INTEGER;
...
SET ITERROWS = 0;
WHILE (SUBSTR(SQLSTATE,1,2) = '00' AND ITERROWS < 100) DO
...
SET ITERROWS = ITERROWS + 1;
END WHILE;
The second one is to limit the rows in your middleware language.
I hope someone posts a clever workaround, but in my opinion there isn't one.
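The middleware workaround can be sketched in Python. This is only an illustration under assumptions: an in-memory SQLite table stands in for ZICACPTF.POLHDR, and first_rows is a hypothetical helper, not part of any DB2 driver. With a real AS/400 connection the same pattern should apply to any cursor that can be iterated row by row.

```python
import sqlite3
from itertools import islice


def first_rows(cursor, limit):
    """Return at most `limit` rows from an already-executed cursor."""
    # islice stops pulling from the cursor after `limit` rows,
    # so this code never reads the rest of the result set.
    return list(islice(cursor, limit))


# Assumption: SQLite stands in for the AS/400 database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE POLHDR (POLNOP INTEGER)")
conn.executemany("INSERT INTO POLHDR VALUES (?)", [(n,) for n in range(1, 501)])

cur = conn.execute("SELECT POLNOP FROM POLHDR ORDER BY POLNOP")
rows = first_rows(cur, 100)
print(len(rows))  # 100
```

The query itself stays unchanged; only the client stops reading after 100 rows.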

Solution only for > V4R4
Using FETCH FIRST [n] ROWS ONLY:
SELECT LASTNAME, FIRSTNAME, EMPNO, SALARY
FROM EMP
ORDER BY SALARY DESC
FETCH FIRST 10 ROWS ONLY;
Reference: publib.boulder.ibm.com
The difference I can see between your query and this example is that this one uses an ORDER BY clause. Can you add an ORDER BY? It should do the trick. See: https://stackoverflow.com/a/16858430/1581725
To get ranges, or just the first 10 rows, you can use ROW_NUMBER() (available since V5R4):
SELECT
*
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY {{table field}}) AS ROWNUM, {{yourtable}}.* FROM {{yourtable}}
) AS {{yourcursor}}
WHERE
{{yourcursor}}.ROWNUM>0 AND
{{yourcursor}}.ROWNUM<=10
Reference: blog.zanclus.com

Related

HQL/Hive Missing EOF in LIMIT query

I'm new to HQL and was wondering about the reason for the error below.
I need to select a whole table of ~9 million records, so I am trying to fetch it chunk by chunk.
Everything worked fine when I used:
SELECT * FROM tableABC ORDER BY tableABC.ID LIMIT 10; -- the first 10 rows of the table
However, when I tried to get them with:
SELECT * FROM tableABC ORDER BY tableABC.ID LIMIT 0,10; -- 10 rows starting at offset 0
I kept getting the error of "FAILED: ParseException line 1:111 missing EOF at ',' near '0')". I tried using LIMIT with OFFSET, and it still showed the same error about EOF.
May I know what would be the problem?
LIMIT with two arguments should work in Hive 2.0.0 or higher. Could you check your Hive version using select version() to confirm the root cause?
If your Hive version is lower than 2, you can use the SQL below to get the data you want. It uses row_number() to generate sequential numbers and then filters on them. This may be a little slower than LIMIT x,y, but the difference shouldn't be large.
select id,col_1, col_2...
from (
select id, col_1, col_2, ... , row_number() OVER (ORDER by id) as rownum from tableABC
) rs
where rownum between 1 and 50
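To see the row_number() filter working on successive chunks, here is a small Python sketch. It runs against an in-memory SQLite database rather than Hive (an assumption made purely so the example is runnable; it needs SQLite 3.25 or later for window functions), and fetch_chunk is a hypothetical helper name. Note that row_number() starts at 1.

```python
import sqlite3

# Same shape as the query above, with the range bounds as parameters.
CHUNK_SQL = """
SELECT id, col_1
FROM (
    SELECT id, col_1, ROW_NUMBER() OVER (ORDER BY id) AS rownum
    FROM tableABC
) rs
WHERE rownum BETWEEN ? AND ?
ORDER BY rownum
"""


def fetch_chunk(conn, first, last):
    """Rows whose 1-based position (ordered by id) lies in [first, last]."""
    return conn.execute(CHUNK_SQL, (first, last)).fetchall()


# Assumption: 25 sample rows with ids 10, 20, ..., 250.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableABC (id INTEGER PRIMARY KEY, col_1 TEXT)")
conn.executemany(
    "INSERT INTO tableABC VALUES (?, ?)",
    [(i * 10, f"row{i}") for i in range(1, 26)],
)

chunk = fetch_chunk(conn, 11, 20)  # the second chunk of 10 rows
print(chunk[0], chunk[-1])
```

Because the positions come from ROW_NUMBER() rather than from the id values, gaps in the ids do not affect the chunk sizes.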

Sql get the latest row in a table by date - double select vs order by

I have a query that uses a double select (with SELECT MAX) to fetch the row with the latest 'calculation_time' among multiple rows that can share the same 'patient_set_id'. If several rows have the same 'patient_set_id', only the one with the latest 'calculation_time' should be retrieved. calculation_time is a date.
So far I've tried the query below, but I'm not sure whether there is a better way, maybe using ORDER BY. I'm very new to SQL and need to know which approach would be the fastest and most appropriate.
SELECT median from diagnostic_risk_stats WHERE
calculation_time=(SELECT MAX(calculation_time) FROM diagnostic_risk_stats WHERE
patient_set_id = UNHEX(REPLACE('5a9dbfca-74d6-471a-af27-31beb4b53bb2', "-","")));
You can use not exists as follows:
SELECT median
from diagnostic_risk_stats t
WHERE not exists
(Select 1 from diagnostic_risk_stats tt
Where t.patient_set_id = tt.patient_set_id
And tt.calculation_time > t.calculation_time)
And t.patient_set_id = UNHEX(REPLACE('5a9dbfca-74d6-471a-af27-31beb4b53bb2', "-",""));
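As a rough way to compare the approaches, the Python sketch below runs three variants against an in-memory SQLite table: the MAX subquery, NOT EXISTS, and the ORDER BY idea from the question. Everything here is an assumption for illustration: SQLite instead of MySQL, plain text ids instead of UNHEX(...) values, and made-up sample rows. The MAX variant also filters patient_set_id in the outer query, because otherwise a row from another patient set with the same timestamp would match too. It only shows that the results agree, not which is faster on your data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE diagnostic_risk_stats ("
    "patient_set_id TEXT, calculation_time TEXT, median REAL)"
)
conn.executemany(
    "INSERT INTO diagnostic_risk_stats VALUES (?, ?, ?)",
    [
        ("A", "2021-01-01", 1.0),
        ("A", "2021-03-01", 3.0),  # latest row for set "A"
        ("B", "2021-06-01", 9.0),  # later, but belongs to another set
    ],
)

MAX_SUBQUERY = """
SELECT median FROM diagnostic_risk_stats
WHERE patient_set_id = :pid
  AND calculation_time = (SELECT MAX(calculation_time)
                          FROM diagnostic_risk_stats
                          WHERE patient_set_id = :pid)
"""
NOT_EXISTS = """
SELECT median FROM diagnostic_risk_stats t
WHERE t.patient_set_id = :pid
  AND NOT EXISTS (SELECT 1 FROM diagnostic_risk_stats tt
                  WHERE tt.patient_set_id = t.patient_set_id
                    AND tt.calculation_time > t.calculation_time)
"""
ORDER_BY_LIMIT = """
SELECT median FROM diagnostic_risk_stats
WHERE patient_set_id = :pid
ORDER BY calculation_time DESC
LIMIT 1
"""

results = [
    conn.execute(sql, {"pid": "A"}).fetchall()
    for sql in (MAX_SUBQUERY, NOT_EXISTS, ORDER_BY_LIMIT)
]
print(results)  # [[(3.0,)], [(3.0,)], [(3.0,)]]
```

One difference to keep in mind: if two rows tie for the latest calculation_time, the first two variants return both, while ORDER BY ... LIMIT 1 returns only one of them.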

How can I get a specific chunk of results?

Is it possible to retrieve a specific range of results? I know how to do TOP x but the result I will retrieve is WAY too big and will time out. I was hoping to be able to pick say the first 10,000 results then the next 10,000 and so on. Is this possible?
WITH Q AS (
SELECT ROW_NUMBER() OVER (ORDER BY ...some column) AS N, ...other columns
FROM ...some table
) SELECT * FROM Q WHERE N BETWEEN 1 AND 10000;
Read more about ROW_NUMBER() here: http://msdn.microsoft.com/en-us/library/ms186734.aspx
Practically all SQL DB implementations have a way of specifying the starting row to return, as well as the number of rows.
For example, in both mysql and postgres it looks like:
SELECT ...
ORDER BY something -- not required, but highly recommended
LIMIT 100 -- only get 100 rows
OFFSET 500; -- skip the first 500 rows (start at row 501)
Note that normally you would include an ORDER BY to make sure your chunks are consistent.
MS SQL Server (being a "pretend" DB) didn't support OFFSET directly before SQL Server 2012, but it can be coded using ROW_NUMBER() - see this SO post for more detail.
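A chunked read built on LIMIT/OFFSET could look like the following Python sketch. The table name items, its columns, and the helper iter_chunks are all hypothetical, and SQLite is used only because it accepts the same LIMIT ... OFFSET syntax as MySQL and Postgres and needs no server.

```python
import sqlite3


def iter_chunks(conn, chunk_size):
    """Yield successive lists of rows, at most chunk_size rows each."""
    offset = 0
    while True:
        rows = conn.execute(
            # ORDER BY keeps the chunks consistent between queries.
            "SELECT id, payload FROM items ORDER BY id LIMIT ? OFFSET ?",
            (chunk_size, offset),
        ).fetchall()
        if not rows:
            break  # past the end of the table
        yield rows
        offset += chunk_size


# Assumption: 25 sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)", [(i, f"p{i}") for i in range(1, 26)]
)

sizes = [len(chunk) for chunk in iter_chunks(conn, 10)]
print(sizes)  # [10, 10, 5]
```

Each chunk is a separate query, so a timeout affects only the chunk in flight rather than the whole result.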

Sql query to get a non-contiguous subset of results

I'm writing a web application that should show very large results on a search query.
Say some queries will return 10.000 items.
I'd like to show those to users paginated; no problem so far: each page will be the result of a query with an appropriate LIMIT statement.
But I'd like to show clues about results in each page of the paginated query: some data from the first item and some from the last.
This means that, for example, with a result of 10.000 items and a page size of 50 items, if the user asks for the first page I will need:
the first 50 items (the page requested by the user)
items 51 and 100 (the first and last of the second page)
items 101 and 150
etc
For efficiency reasons I want to avoid one query per row.
[edit] I would also prefer not to download 10.000 results if I only need 50 + 10000/50*2 = 450
The question is: is there a single query I can issue to the RDBMS (mysql, by the way, but I'd prefer a cross-db solution) that will return only the data I need?
I can't use server side cursor, because not all dbs support it and I want my app to be database-agnostic.
Just for fun, here is the MSSQL version of it.
declare @pageSize as int; set @pageSize = 10;
declare @pageIndex as int; set @pageIndex = 0; /* first page */
WITH x AS
(
select
ROW_NUMBER() OVER (ORDER BY (created) ASC) AS RowNumber,
*
from table
)
SELECT * FROM x
WHERE
((RowNumber <= (@pageIndex+1)*@pageSize) AND (RowNumber >= @pageIndex*@pageSize+1))
OR
RowNumber % @pageSize = 1 /* first row of each page */
OR
RowNumber % @pageSize = 0 /* last row of each page */
Note that an ORDER BY is provided in the OVER clause.
Also note that if you have a gazillion rows, your result set will have millions. You need to cap the number of result rows for practical reasons.
I have no idea how this could be solved in generic SQL. (My bet: no way. Even simple paging cannot be solved without DB-specific operators.)
UPDATE: I completely misread the initial question. You can do this using UNION and the LIMIT clause in MySQL, although it might be what you meant by "one query per row". The syntax would be like:
(select FOO from BAZ limit 50)
union
(select FOO from BAZ limit 50, 1)
union
(select FOO from BAZ limit 99, 1)
union
(select FOO from BAZ limit 100, 1)
union
(select FOO from BAZ limit 149, 1)
and so on and so forth. Since you're using UNION, you'll only need one roundtrip to the database. I'm not sure how MySQL will treat the various SELECT statements, though. It should be able to recognize that they are essentially the same query and use a cached query plan, but I don't work with MySQL enough to know if that's a reasonable expectation for its optimizer.
Obviously, to build this query in a general fashion, you'll first need to run a count query so you can calculate what your offsets will be.
This is definitely not a tractable problem for standard SQL, since the paging logic requires nonstandard features.
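The offset calculation that follows the count query can be sketched in Python. Both function names are hypothetical, and the sketch only computes which rows are needed and the matching MySQL-style (offset, count) pairs; building and running the actual UNION is left out.

```python
def rows_needed(total, page_size, page_index):
    """1-based row numbers: one full page plus first/last of every other page."""
    wanted = set()
    page_count = -(-total // page_size)  # ceiling division
    for p in range(page_count):
        first = p * page_size + 1
        last = min((p + 1) * page_size, total)
        if p == page_index:
            wanted.update(range(first, last + 1))  # the requested page
        else:
            wanted.update((first, last))  # only the two "clue" rows
    return sorted(wanted)


def limit_clauses(row_numbers):
    """Collapse sorted row numbers into (offset, count) pairs for LIMIT."""
    pairs = []
    for n in row_numbers:
        if pairs and pairs[-1][0] + pairs[-1][1] == n - 1:
            offset, count = pairs[-1]
            pairs[-1] = (offset, count + 1)  # extend the current run
        else:
            pairs.append((n - 1, 1))  # start a new run at row n
    return pairs


rows = rows_needed(10000, 50, 0)
pairs = limit_clauses(rows)
print(len(rows), pairs[:3])  # 448 [(0, 51), (99, 2), (149, 2)]
```

For the example in the question this gives 448 rows, since the requested page already contains its own first and last items. Adjacent rows such as 100 and 101 collapse into a single `limit 99, 2`, which roughly halves the number of SELECTs in the UNION.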

LIMIT in FoxPro

I am attempting to pull a lot of data from a FoxPro database, work with it, and insert it into a MySQL db. It is too much to do all at once, so I want to do it in batches of, say, 10 000 records. What is the equivalent of LIMIT 5, 10 in FoxPro SQL? I would like a select statement like
select name, address from people limit 5, 10;
i.e. only get 10 results back, starting at the 5th. I have looked around online, and the only thing mentioned is TOP, which is obviously not of much use.
Take a look at the RecNo() function.
FoxPro does not have direct support for a LIMIT clause. It does have "TOP nn" but that only provides the "top-most records" within a given percentage, and even that has a limitation of 32k records returned (maximum).
You might be better off dumping the data as a CSV, or if that isn't practical (due to size issues), writing a small FoxPro script that auto-generates a series of BEGIN-INSERT(x10000)-COMMIT statements that dump to a series of text files. Of course, you would need a FoxPro development environment for this, so this may not apply to your situation...
Visual FoxPro does not support LIMIT directly.
I used the following query to get over the limitation:
SELECT TOP 100 * from PEOPLE WHERE RECNO() > 1000 ORDER BY ID;
where 100 is the limit and 1000 is the offset.
It is easy to emulate a LIMIT clause using the TOP clause. If you want to extract record _start through record _finish from a file named _test, you can do:
[VFP]
** assuming _start <= _finish, if not you get a top clause error
*
_finish = MIN(RECCOUNT('_test'),_finish)
*
SELECT * FROM (SELECT TOP (_finish - _start + 1) * FROM (SELECT TOP _finish *, RECNO() AS _tempo FROM _test ORDER BY _tempo) xx ORDER BY _tempo DESC) yy ORDER BY _tempo
**
[/VFP]
I had to convert a Foxpro database to Mysql a few years ago. What I did to solve this was add an auto-incrementing id column to the Foxpro table and use that as the row reference.
So then you could do something like.
select name, address from people where id >= 5 and id <= 10;
The Foxpro sql documentation does not show anything similar to limit.
Here, adapt this to your tables. Took me like 2 mins, i do this waaaay too often.
N1 - group by whatever, and make sure you got a max(id), you can use recno() to make one, sorted correctly
N2 - Joins N1 where the ID = Max Id of N1, display the field you want from N2
Then if you want to join to other tables, put that all in brackets and give it an alias and include it in a join.
Select N1.reference, N1.OrderNoteCount, N2.notes_desc LastNote
FROM
(select reference, count(reference) OrderNoteCount, Max(notes_key) MaxNoteId
from custnote
where reference != ''
Group by reference
) N1
JOIN
(
select reference, count(reference) OrderNoteCount, notes_key, notes_desc
from custnote
where reference != ''
Group by reference, notes_key, notes_desc
) N2 ON N1.MaxNoteId = N2.notes_key
To expand on Eyvind's answer, I would create a program that uses the RecNo() function to pull records within a given range, say 10,000 records.
You could then programmatically cycle through the large table in chunks of 10,000 records at a time and perform your data load into your MySQL database.
By using the RecNo() function you can be certain not to insert rows more than once, and you can restart at a known point in the data load process. That by itself can be very handy in the event you need to stop and restart the load process.
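The restartable batching idea can be sketched as a small Python generator. The function name and the resume_after parameter are hypothetical; each yielded range would be plugged into whatever RecNo()-based filter your FoxPro query uses, and the last range that loaded successfully is the point to resume from.

```python
def recno_batches(total_records, batch_size, resume_after=0):
    """Yield (first_recno, last_recno) ranges, skipping records already loaded."""
    first = resume_after + 1
    while first <= total_records:
        last = min(first + batch_size - 1, total_records)
        yield first, last
        first = last + 1


# A fresh load of 25,000 records in batches of 10,000.
batches = list(recno_batches(25000, 10000))
print(batches)  # [(1, 10000), (10001, 20000), (20001, 25000)]

# Restart after the first batch was already loaded successfully.
resumed = list(recno_batches(25000, 10000, resume_after=10000))
print(resumed)  # [(10001, 20000), (20001, 25000)]
```

Because the ranges never overlap, no record number is loaded twice, which is the property the answer relies on.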
Depending on the number of returned rows, and if you are using the .NET Framework, you can offset/limit the resulting DataTable in the following way:
dataTable = dataTable.AsEnumerable().Skip(offset).Take(limit).CopyToDataTable();
Remember to add a reference to the System.Data.DataSetExtensions assembly.