Retrieve multiple rows from an ODBC source with a UNION query - sql

I am retrieving multiple rows into a listview control from an ODBC source. For simple SELECTs it seems to work well with a statement attribute of SQL_SCROLLABLE. How do I do this with a UNION query (with two selects)?
The most likely server will be MS SQL Server (probably 2005). The code is C for the Win32 API.
This code sets (what I think is) a server side cursor which feeds data into the ODBC driver that roughly corresponds with the positional fetches of SQLFetchScroll, which is turn feeds the cache for the listview. (Sometimes using SQL_FETCH_FIRST or SQL_FETCH_LAST as well as):
SQLSetStmtAttr(hstmt1Fetch,
SQL_ATTR_CURSOR_SCROLLABLE,
(SQLPOINTER)SQL_SCROLLABLE,
SQL_IS_INTEGER);
SQLSetStmtAttr(hstmt1Fetch,
SQL_ATTR_CURSOR_SENSITIVITY,
(SQLPOINTER)SQL_INSENSITIVE,
SQL_IS_INTEGER);
...
retcode = SQLGetStmtAttr(hstmt1Fetch,
SQL_ATTR_ROW_NUMBER,
&CurrentRowNumber,
SQL_IS_UINTEGER,
NULL);
...
retcode = SQLFetchScroll(hstmt1Fetch, SQL_FETCH_ABSOLUTE, Position);
(The above is is a fragment from working code for a single SELECT).
Is this the best way to do it? Given that I need to retrieve the last row to get the number of rows and populate the end buffer is there a better way of doing it? (Can I use forward only scrolling?)
Assuming yes to the above, how do I achieve the same result with a UNION query?
LATE EDIT: The problem with the union query being that effectively it forces forward only scrolling which breaks SQLFetchScroll(hstmt1Fetch, SQL_FETCH_ABSOLUTE, Position). The answer is I suspect: "you can't". And it really means redesigning the DB to included either a view or a single table to replace the UNION. But I'll leave the question open in case I have missed something.

can you not define a view on the db server that does the union query for you, so from the client code it just looks like a single select?
if you can't, can you just issue the union operation as part of your select, e.g.
select some_fields from table1
union
select same_fields from table2
and treat the result as a single result set?

Have you tried using union to make a derived table ?
select * from
(select field1, field from table1
union all
slect field1, filed2 from table2) a

If the issue is just needing to get the last row to get the number of rows and caching the last few rows (I assume if there are a million items in the select that you're not populating a drop-list with all of them) then you may be able to take advantage of the ROW_NUMBER() function of SQL Server 2005
You could:
select count(*)
from (select blah UNION select blah)
to get the number of rows.
Then:
select ROW_NUMBER() as rownum,blah
from (select blah UNION select blah)
where rownum between minrow and maxrow
to just fetch the rows that you need to display/cache
But seriously folks, if you're selecting items from a million-row table, you might want to consider a different mechanism
Good luck!

Related

How to put more than 1million ID's using union All [duplicate]

I have comma delimited id's that I want to use in NOT IN clause..
I'm using oracle 11g.
select * from table where ID NOT IN (1,2,3,4,...,1001,1002,...)
results in
ORA-01795: maximum number of expressions in a list is 1000
I don't want to use temp table. am trying considering doing this
select * from table1 where ID NOT IN (1,2,3,4,…,1000) AND
ID NOT IN (1001,1002,…,2000)
Is there any other better workaround to this issue?
You said you don't want to, but: use a temporary table. That's the correct solution here.
Query parsing is expensive in Oracle, and that's what you'll get when you put thousands of identifiers into a giant blob of SQL. Also, there are ill-defined limits on query length that you're going to hit. Doing an anti-JOIN against a table, on the other hand... Oracle is good at that. Bulk loading data into a table, Oracle is good at that too. Use a temp table.
Limiting IN to a thousand entries is a sanity check. The fact that you're hitting it means you're trying to do something insane.
Jump out of the question, can you combine the SQL to get more than 1000 IDs with this SQL. That's the better way to simplify your SQLs.
It's insane.
But you can probably try to select from select:
SELECT * FROM
(SELECT * FROM table WHERE ID NOT IN (1,2,3,4,...,1000))
WHERE ID NOT IN (1001,1002,…,2000)
Make as many levels as you need.
Use MINUS, the opposite to `UNION
SELECT * FROM TABLE
MINUS
SELECT T.* FROM TABLE T,TABLE2 T2 WHERE T.ID = T2.ID
This represents registers on table T which id not in table2 t2

Is there a way to include a query that is non updateable in an UPDATE query? [duplicate]

This question already has an answer here:
Access SQL Update One Table In Join Based on Value in Same Table
(1 answer)
Closed 10 years ago.
For the following query:
UPDATE tempSpring_ASN AS t
SET t.RECORD_TYPE = (
SELECT TOP 1 RECORD_TYPE
FROM (
SELECT "A" AS RECORD_TYPE
FROM TABLE5
UNION ALL
SELECT "B" AS RECORD_TYPE
FROM TABLE5
)
);
I'm getting, "Operation must use an updateable query." I don't understand. I'm not trying to update a union query. I'm just trying to update an otherwise updatable recordset with the output (single value) of a union query.
(The solution provided at Access SQL Update One Table In Join Based on Value in Same Table (which is also provided below) does not work for this situation, contrary to what is indicated on the top of this page.)
This question is a reference to a previous question, data and code examples posted here:
Access SQL Update One Table In Join Based on Value in Same Table
Hi AYS,
In Access, an Update query needs to be run on a table.
As a UNION query is a combination of multiple sets of records, the result set is no longer a table, and cannot be the object of an Update query as the records in the result set are no longer uniquely identified with any one particular table (even if they theoretically could be). Access is hard-coded to treat every UNION query as read-only, which makes sense when there are multiple underlying tables. There are a number of other conditions (such as a sub-query in the SELECT statement) that also trigger this condition.
Think if it this way: if you were not using TOP 1 and your UNION query returned multiple results, how would JET know which result to apply to the unique record in your table? As such, JET treats all such cases the same.
Unfortunately, this is the case even when all of the data is being derived from the same table. In this case, it is likely that the JET optimizer is simply not smart enough to realize that this is the case and re-phrase the query in a manner that does not use UNION.
In this case, you can still get what you want by re-stating your query in such a way that everything references your base table. For example, you can use the following as a SELECT query to get the PO_NUM value of the previous SHP_CUSTOM_5 record:
SELECT
t1.SHP_CUSTOM_5
, t1.PO_NUM
, t1.SHP_CUSTOM_5 -1 AS PREV_RECORD
, (SELECT
t2.PO_NUM
FROM
tempSpring_ASN As t2
WHERE
t2.SHP_CUSTOM_5 = (t1.SHP_CUSTOM_5 -1)
) AS PREV_PO
FROM
tempSpring_ASN AS t1
;
You can then phrase this as an Update query as follows in order to perform the "LIN" updates:
UPDATE
tempSpring_ASN AS t1
SET
t1.RECORD_TYPE = "LIN"
WHERE
t1.PO_NUM=
(
SELECT
t2.PO_NUM
FROM
tempSpring_ASN As t2
WHERE
t2.SHP_CUSTOM_5 = (t1.SHP_CUSTOM_5 -1)
)
;
This code was successful in the tests I ran with dummy data.
Regarding your "HDR" updates, your are really performing two separate updates.
1) If the PO_NUM matches the previous record's PO_NUM, set RECORD_TYPE to "LIN"
2) If it is the first record, set RECORD_TYPE to "HDR"
It is not clear to me why there would be a benefit to performing these actions within one query. I would recommend performing the HDR update using the "TOP 1" by SHP_CUSTOM_5 method you used in your original SELECT query example, as this will be a relatively simple UPDATE query. It is possible to use IIF() within an Update query, but I do not know what additional benefit you would gain from the additional time and complexity that would be required (it would most likely only be much less readable).
Best of luck!

SELECT COUNT(*) ;

I have a database, database1, with two tables (Table 1, Table2) in it.
There are 3 rows in Table1 and 2 rows in Table2. Now if I execute the following SQL query SELECT COUNT(*); on database1, then the output is "1".
Does anyone has the idea, what this "1" signifies?
The definition of the two tables is as below.
CREATE TABLE Table1
(
ID INT PRIMARY KEY,
NAME NVARCHAR(20)
)
CREATE TABLE Table2
(
ID INT PRIMARY KEY,
NAME NVARCHAR(20)
)
Normally all selects are of the form SELECT [columns, scalar computations on columns, grouped computations on columns, or scalar computations] FROM [table or joins of tables, etc]
Because this allows plain scalar computations we can do something like SELECT 1 + 1 FROM SomeTable and it will return a recordset with the value 2 for every row in the table SomeTable.
Now, if we didn't care about any table, but just wanted to do our scalar computed we might want to do something like SELECT 1 + 1. This isn't allowed by the standard, but it is useful and most databases allow it (Oracle doesn't unless it's changed recently, at least it used to not).
Hence such bare SELECTs are treated as if they had a from clause which specified a table with one row and no column (impossible of course, but it does the trick). Hence SELECT 1 + 1 becomes SELECT 1 + 1 FROM ImaginaryTableWithOneRow which returns a single row with a single column with the value 2.
Mostly we don't think about this, we just get used to the fact that bare SELECTs give results and don't even think about the fact that there must be some one-row thing selected to return one row.
In doing SELECT COUNT(*) you did the equivalent of SELECT COUNT(*) FROM ImaginaryTableWithOneRow which of course returns 1.
Along similar lines the following also returns a result.
SELECT 'test'
WHERE EXISTS (SELECT *)
The explanation for that behavior (from this Connect item) also applies to your question.
In ANSI SQL, a SELECT statement without FROM clause is not permitted -
you need to specify a table source. So the statement "SELECT 'test'
WHERE EXISTS(SELECT *)" should give syntax error. This is the correct
behavior.
With respect to the SQL Server implementation, the FROM
clause is optional and it has always worked this way. So you can do
"SELECT 1" or "SELECT #v" and so on without requiring a table. In
other database systems, there is a dummy table called "DUAL" with one
row that is used to do such SELECT statements like "SELECT 1 FROM
dual;" or "SELECT #v FROM dual;". Now, coming to the EXISTS clause -
the project list doesn't matter in terms of the syntax or result of
the query and SELECT * is valid in a sub-query. Couple this with the
fact that we allow SELECT without FROM, you get the behavior that you
see. We could fix it but there is not much value in doing it and it
might break existing application code.
It's because you have executed select count(*) without specifying a table.
The count function returns the number of rows in the specified dataset. If you don't specify a table to select from, a single select will only ever return a single row - therefore count(*) will return 1. (In some versions of SQL, such as Oracle, you have to specify a table or similar database object; Oracle includes a dummy table (called DUAL) which can be selected from when no specific table is required.)
you wouldn't normally execute a select count(*) without specifying a table to query against. Your database server is probably giving you a count of "1" based on default system table it is querying.
Try using
select count(*) from Table1
Without a table name it makes no sense.
without table name it always return 1 whether it any database....
Since this is tagged SQL server, the MSDN states.
COUNT always returns an int data type value.
Also,
COUNT(*) returns the number of items in a group. This includes NULL
values and duplicates.
Thus, since you didn't provide a table to do a COUNT from, the default (assumption) is that it returns a 1.
COUNT function returns the number of rows as result. If you don't specify any table, it returns 1 by default. ie., COUNT(*), COUNT(1), COUNT(2), ... will return 1 always.
Select *
without a from clause is "Select ALL from the Universe" since you have filtered out nothing.
In your case, you are asking "How many universe?"
This is exactly how I would teach it. I would write on the board on the first day,
Select * and ask what it means. Answer: Give me the world.
And from there I would teach how to filter the universe down to something meaningful.
I must admit, I never thought of Select Count(*), which would make it more interesting but still brings back a true answer. We have only one world.
Without consulting Steven Hawking, SQL will have to contend with only 1.
The results of the query is correct.

Is there efficient SQL to query a portion of a large table

The typical way of selecting data is:
select * from my_table
But what if the table contains 10 million records and you only want records 300,010 to 300,020
Is there a way to create a SQL statement on Microsoft SQL that only gets 10 records at once?
E.g.
select * from my_table from records 300,010 to 300,020
This would be way more efficient than retrieving 10 million records across the network, storing them in the IIS server and then counting to the records you want.
SELECT * FROM my_table is just the tip of the iceberg. Assuming you're talking a table with an identity field for the primary key, you can just say:
SELECT * FROM my_table WHERE ID >= 300010 AND ID <= 300020
You should also know that selecting * is considered poor practice in many circles. They want you specify the exact column list.
Try looking at info about pagination. Here's a short summary of it for SQL Server.
Absolutely. On MySQL and PostgreSQL (the two databases I've used), the syntax would be
SELECT [columns] FROM table LIMIT 10 OFFSET 300010;
On MS SQL, it's something like SELECT TOP 10 ...; I don't know the syntax for offsetting the record list.
Note that you never want to use SELECT *; it's a maintenance nightmare if anything ever changes. This query, though, is going to be incredibly slow since your database will have to scan through and throw away the first 300,010 records to get to the 10 you want. It'll also be unpredictable, since you haven't told the database which order you want the records in.
This is the core of SQL: tell it which 10 records you want, identified by a key in a specific range, and the database will do its best to grab and return those records with minimal work. Look up any tutorial on SQL for more information on how it works.
When working with large tables, it is often a good idea to make use of Partitioning techniques available in SQL Server.
The rules of your partitition function typically dictate that only a range of data can reside within a given partition. You could split your partitions by date range or ID for example.
In order to select from a particular partition you would use a query similar to the following.
SELECT <Column Name1>…/*
FROM <Table Name>
WHERE $PARTITION.<Partition Function Name>(<Column Name>) = <Partition Number>
Take a look at the following white paper for more detailed infromation on partitioning in SQL Server 2005.
http://msdn.microsoft.com/en-us/library/ms345146.aspx
I hope this helps however please feel free to pose further questions.
Cheers, John
I use wrapper queries to select the core query and then just isolate the ROW numbers that i wish to take from the query - this allows the SQL server to do all the heavy lifting inside the CORE query and just pass out the small amount of the table that i have requested. All you need to do is pass the [start_row_variable] and the [end_row_variable] into the SQL query.
NOTE: The order clause is specified OUTSIDE the core query [sql_order_clause]
w1 and w2 are TEMPORARY table created by the SQL server as the wrapper tables.
SELECT
w1.*
FROM(
SELECT w2.*,
ROW_NUMBER() OVER ([sql_order_clause]) AS ROW
FROM (
<!--- CORE QUERY START --->
SELECT [columns]
FROM [table_name]
WHERE [sql_string]
<!--- CORE QUERY END --->
) AS w2
) AS w1
WHERE ROW BETWEEN [start_row_variable] AND [end_row_variable]
This method has hugely optimized my database systems. It works very well.
IMPORTANT: Be sure to always explicitly specify only the exact columns you wish to retrieve in the core query as fetching unnecessary data in these CORE queries can cost you serious overhead
Use TOP to select only a limited amont of rows like:
SELECT TOP 10 * FROM my_table WHERE ID >= 300010
Add an ORDER BY if you want the results in a particular order.
To be efficient there has to be an index on the ID column.

SQL Server UNION - What is the default ORDER BY Behaviour

If I have a few UNION Statements as a contrived example:
SELECT * FROM xxx WHERE z = 1
UNION
SELECT * FROM xxx WHERE z = 2
UNION
SELECT * FROM xxx WHERE z = 3
What is the default order by behaviour?
The test data I'm seeing essentially does not return the data in the order that is specified above. I.e. the data is ordered, but I wanted to know what are the rules of precedence on this.
Another thing is that in this case xxx is a View. The view joins 3 different tables together to return the results I want.
There is no default order.
Without an Order By clause the order returned is undefined. That means SQL Server can bring them back in any order it likes.
EDIT:
Based on what I have seen, without an Order By, the order that the results come back in depends on the query plan. So if there is an index that it is using, the result may come back in that order but again there is no guarantee.
In regards to adding an ORDER BY clause:
This is probably elementary to most here but I thought I add this.
Sometimes you don't want the results mixed, so you want the first query's results then the second and so on. To do that I just add a dummy first column and order by that. Because of possible issues with forgetting to alias a column in unions, I usually use ordinals in the order by clause, not column names.
For example:
SELECT 1, * FROM xxx WHERE z = 'abc'
UNION ALL
SELECT 2, * FROM xxx WHERE z = 'def'
UNION ALL
SELECT 3, * FROM xxx WHERE z = 'ghi'
ORDER BY 1
The dummy ordinal column is also useful for times when I'm going to run two queries and I know only one is going to return any results. Then I can just check the ordinal of the returned results. This saves me from having to do multiple database calls and most empty resultset checking.
Just found the actual answer.
Because UNION removes duplicates it does a DISTINCT SORT. This is done before all the UNION statements are concatenated (check out the execution plan).
To stop a sort, do a UNION ALL and this will also not remove duplicates.
If you care what order the records are returned, you MUST use an order by.
If you leave it out, it may appear organized (based on the indexes chosen by the query plan), but the results you see today may NOT be the results you expect, and it could even change when the same query is run tomorrow.
Edit: Some good, specific examples: (all examples are MS SQL server)
Dave Pinal's blog describes how two very similar queries can show a different apparent order, because different indexes are used:
SELECT ContactID FROM Person.Contact
SELECT * FROM Person.Contact
Conor Cunningham shows how the apparent order can change when the table gets larger (if the query optimizer decides to use a parallel execution plan).
Hugo Kornelis proves that the apparent order is not always based on primary key. Here is his follow-up post with explanation.
A UNION can be deceptive with respect to result set ordering because a database will sometimes use a sort method to provide the DISTINCT that is implicit in UNION , which makes it look like the rows are deliberately ordered -- this doesn't apply to UNION ALL for which there is no implicit distinct, of course.
However there are algorithms for the implicit distinct, such as Oracle's hash method in 10g+, for which no ordering will be applied.
As DJ says, always use an ORDER BY
It's very common to come across poorly written code that assumes table data is returned in insert order, and 95% of the time the coder gets away with it and is never aware that this is a problem as on many common databases (MSSQL, Oracle, MySQL). It is of course a complete fallacy and should always be corrected when it's come across, and always, without exception, use an Order By clause yourself.