Can someone explain the difference between these 2 simple queries ...
SET ROWCOUNT = 10
select * from t_Account order by account_date_last_maintenance
and this one
select * from t_Account order by account_date_last_maintenance
SET ROWCOUNT = 10
When executed, both return only 10 rows, but the rows are different. There are millions of rows in the table, if that matters. Also, the first query consistently takes 20% longer.
Thanks everyone
When you execute SET ROWCOUNT 10, you are telling SQL to stop processing a query after 10 rows have been returned. Your first batch is the correct form, except that its first line should read SET ROWCOUNT 10 (no equals sign).
The second batch, as written, returns all of the rows in sorted order the first time it runs and only then sets the row count to 10, so any subsequent execution on the same connection returns just the first 10 rows.
ROWCOUNT must be set back to 0 (SET ROWCOUNT 0) to return to "normal", uncapped execution.
As to why the rows differed: the data is not necessarily processed in the same order on every run, and with a dataset this size you may sometimes get matching results, but it is not a sure thing. If you want consistent results and only need the first 10 rows, I would recommend using TOP together with your ORDER BY.
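One likely reason a capped, ordered query returns different rows is ties: if many rows share the same account_date_last_maintenance, the order among them is unspecified unless the ORDER BY ends in a unique column. Here is a minimal sketch of that, using Python's sqlite3 with LIMIT standing in for TOP / SET ROWCOUNT. The account_id column and the sample data are assumptions, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_Account ("
    "account_id INTEGER PRIMARY KEY, "  # hypothetical unique key
    "account_date_last_maintenance TEXT)"
)
# 100 rows that all share the same maintenance date (all ties).
conn.executemany(
    "INSERT INTO t_Account VALUES (?, ?)",
    [(i, "2020-01-01") for i in range(1, 101)],
)

# Ambiguous: any 10 of the 100 tied rows is a valid answer here.
ambiguous = conn.execute(
    "SELECT account_id FROM t_Account "
    "ORDER BY account_date_last_maintenance LIMIT 10"
).fetchall()

# Deterministic: the unique tiebreaker pins down which 10 rows come back.
stable = [
    row[0]
    for row in conn.execute(
        "SELECT account_id FROM t_Account "
        "ORDER BY account_date_last_maintenance, account_id LIMIT 10"
    )
]
print(stable)
```

The same idea should carry over to the original query: add a unique column (such as the primary key) as the last ORDER BY term.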
I have a 450GB database... with millions of records.
Here is an example query:
SELECT TOP 1 * FROM c WHERE c.type='water';
To speed up our queries, I thought about just taking the first one but we have noticed that the query still takes quite a while, despite the very first record in the database matching our constraints.
So, my question is, how does the SELECT TOP 1 really work? Does it:
A) Select ALL records and then return just the first (top) one where type='water'
B) Return the first record which is encountered where type='water'
Try this line, noting the offset limit:
SELECT * FROM c WHERE c.type='water' OFFSET 0 LIMIT 1
For more information about the offset limit:
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-offset-limit
Assuming you aren't sorting your results (which your query isn't), TOP 1 will return the first result as soon as it finds one. This should then end the query.
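To make the difference between options A and B concrete, here is a rough model in plain Python. It is not how Cosmos DB is implemented, and the rows and helper names are made up for illustration; it only shows why an unsorted "first match" can stop early while a sorted one has to look at every row.

```python
def first_match(rows, predicate):
    """Option B: return (row, rows_examined), stopping at the first match."""
    examined = 0
    for row in rows:
        examined += 1
        if predicate(row):
            return row, examined
    return None, examined


def first_match_sorted(rows, predicate, key):
    """With a sort, every row must be examined before the top one is known."""
    examined = 0
    matches = []
    for row in rows:
        examined += 1
        if predicate(row):
            matches.append(row)
    matches.sort(key=key)
    return (matches[0] if matches else None), examined


# Hypothetical data: every third record has type 'water'.
rows = [{"id": i, "type": "water" if i % 3 == 0 else "land"} for i in range(1, 1001)]


def is_water(row):
    return row["type"] == "water"


row, examined = first_match(rows, is_water)
sorted_row, sorted_examined = first_match_sorted(
    rows, is_water, key=lambda r: -r["id"]
)
print(row["id"], examined)
print(sorted_row["id"], sorted_examined)
```

If the real query is still slow even though the first record matches, the time is probably going somewhere other than the scan itself, but that is a guess from the outside.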
The Cosmos DB Explorer doesn't work with the TOP command; it's an existing issue. It works fine in an SDK call.
Check some TOP command usage below:
https://learn.microsoft.com/en-us/azure/cosmos-db/sql-query-subquery
I've just had to start paging in SQL Server 2012 and I'm trying to get the total row count before paging is applied, but the problem I have is that my view has a few too many function calls in it that massively slow it down.
I've looked at this post (Get total row count while paging) and ended up with a query that takes 39 seconds to run, without the full data set in the DB:
SELECT *
, COUNT(TaskId) OVER()
FROM TaskVersionView
WHERE (.. ~10 predicates here .. )
ORDER BY StartDate
OFFSET 0 ROWS
FETCH NEXT 50 ROWS ONLY
Without the COUNT it takes <1 second.
I would have expected SQL to optimize it so that it only counts the TaskIds instead of calling the functions but that doesn't seem to be the case, because:
SELECT COUNT(TaskId)
FROM TaskVersionView
Takes <1 sec.
"I would have expected SQL to optimize it so that it only counts the TaskIds instead of calling the functions"
If the predicates are always 'true', then this 'optimization' would return the correct value. But SQL Server cannot, even in theory, guess that the functions will always return true. If you know that they do (as your expectation seems to imply), then obviously you should remove them from the WHERE clause...
If the predicates sometimes return 'false', then obviously they cannot be optimized away, as the returned values would be incorrect.
Something's gotta give.
PS. Paging with total counts is a bad idea, as it forces a full scan on every visit. Paging where the total count is returned on every row is a horribly bad idea (modeling-wise, perf-wise, sanity-wise).
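Here is a small sketch of that point using Python's sqlite3, with a Python function standing in for the slow functions in the view. Everything in it (the Task table, expensive_predicate, the row counts) is made up for illustration. It orders by the primary key so the page query can stop early, which the original ORDER BY StartDate may or may not allow, and it uses a separate COUNT query rather than COUNT() OVER().

```python
import sqlite3

calls = {"n": 0}


def expensive_predicate(task_id):
    """Stand-in for the view's slow function calls; counts its invocations."""
    calls["n"] += 1
    return 1 if task_id % 2 == 0 else 0


conn = sqlite3.connect(":memory:")
conn.create_function("expensive_predicate", 1, expensive_predicate)
conn.execute("CREATE TABLE Task (TaskId INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Task VALUES (?)", [(i,) for i in range(1, 1001)])

# One page: the scan can stop as soon as 50 matching rows have been found.
calls["n"] = 0
page = conn.execute(
    "SELECT TaskId FROM Task WHERE expensive_predicate(TaskId) "
    "ORDER BY TaskId LIMIT 50 OFFSET 0"
).fetchall()
page_calls = calls["n"]

# Total count: the predicate has to be evaluated for every row in the table.
calls["n"] = 0
total = conn.execute(
    "SELECT COUNT(*) FROM Task WHERE expensive_predicate(TaskId)"
).fetchone()[0]
count_calls = calls["n"]

print(len(page), page_calls, total, count_calls)
```

The count cannot skip the predicate, because the predicate decides which rows are counted.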
I'll try to be as specific as I can here, so here's the query, using MS Access.
SELECT MsThread.ID,
MsThread.ThreadName,
COUNT(MsThread.ThreadName) AS TotalPost
FROM MsThread
LEFT OUTER JOIN MsPosts
ON MsThread.ThreadName = MsPosts.ThreadName
GROUP BY MsThread.ID, MsThread.ThreadName, MsThread.ThreadCategory
When I run the query in MS Access, it returns this:
It shows that I have 4 rows (the number of threads), and the numbers 5, 2, 1, 1 are the post counts for the respective threads. I've been trying to get the result set to report 4 instead of 9 so I can loop over it without an invalid cursor state error.
rs.last();
int row = rs.getRow();
That returns 8, so I am guessing it's returning how many rows were processed. So how do I get it to return 4, similar to the COUNT function?
Thanks a lot!!
rs, I'm assuming, is the resultant recordset that your SQL returns? If so, you can do:
rs.MoveFirst
rs.MoveLast
X = rs.RecordCount
X should equal 4 in this case, since you have 4 records in your recordset.
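To check the numbers, here is a sketch that runs the same grouped query against made-up data in Python's sqlite3 (not Access, and the thread names and category are invented). The result set has one row per thread, while the sum of the TotalPost column is the number of posts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MsThread (ID INTEGER PRIMARY KEY, ThreadName TEXT, ThreadCategory TEXT);
    CREATE TABLE MsPosts (ID INTEGER PRIMARY KEY, ThreadName TEXT);
    """
)

# Hypothetical threads with 5, 2, 1 and 1 posts, as in the question.
threads = {"alpha": 5, "beta": 2, "gamma": 1, "delta": 1}
for name, posts in threads.items():
    conn.execute(
        "INSERT INTO MsThread (ThreadName, ThreadCategory) VALUES (?, 'general')",
        (name,),
    )
    conn.executemany(
        "INSERT INTO MsPosts (ThreadName) VALUES (?)", [(name,)] * posts
    )

rows = conn.execute(
    """
    SELECT MsThread.ID, MsThread.ThreadName, COUNT(MsThread.ThreadName) AS TotalPost
    FROM MsThread
    LEFT OUTER JOIN MsPosts ON MsThread.ThreadName = MsPosts.ThreadName
    GROUP BY MsThread.ID, MsThread.ThreadName, MsThread.ThreadCategory
    """
).fetchall()

row_count = len(rows)                 # rows in the result set (threads)
post_total = sum(r[2] for r in rows)  # sum of the TotalPost column (posts)
print(row_count, post_total)
```

So the row count of the result set and the values inside the TotalPost column are two different things; the loop should run over the former.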
I would also avoid "row" as a variable name, since it can clash with a reserved word in some environments (it is not a reserved word in Java itself). Make your life easier and call it "RowX" or something else that's clearly not reserved.
To display my results from PDO, I always use following PHP code for example:
$STH = $DBH->prepare("SELECT logo_id, guess_count, guessed, count(id) AS counter FROM guess WHERE user_id=:id");
$STH->bindParam(":id",$loginuser['id']);
$STH->execute();
while($row = $STH->fetch()){
print_r($row);
}
Now the issue is that I only get one result. I used to use $STH->rowCount() to check the number of rows returned, but this method isn't really advised for SELECT statements because some databases don't report it correctly. So I used count(id) AS counter, but now I only get one result every time, even though the value of $row['counter'] is larger than one.
What is the correct way to count the number of results in one query?
If you want to check the number of rows that are returned by a query, there are a couple of options.
You could do a ->fetchAll() to get an array of all rows; checking the length of the array is trivial. This isn't advisable for large result sets (i.e. a lot of rows returned by the query), but you could add a LIMIT clause to your query to avoid returning more than a certain number of rows. If all you are checking is whether you get more than one row back, you only need to retrieve two rows.
Another option is to run a separate query to get the count, e.g.
SELECT COUNT(1) AS counter FROM guess WHERE user_id=:id
But, that approach requires another round trip to the database.
And the old standby SQL_CALC_FOUND_ROWS (MySQL) is another option, though that too can have problematic performance with large sets.
You could also just add a loop counter in your existing code:
$i = 0;
while($row = $STH->fetch()){
$i++;
print_r($row);
}
print "fetched row count: ".$i;
If what you need is an exact count of the number of rows that satisfy a particular predicate PRIOR to running a query to return the rows, then the separate COUNT(1) query is likely the most suitable approach. Yes, it's extra code in your app; I recommend you preface the code with a comment that indicates the purpose of the code... to get an exact count of rows that satisfy a set of predicates, prior to running a query that will retrieve the rows.
If I had to process the rows anyway, and adding LIMIT 0,100 to the query was acceptable, I would go for the ->fetchAll(), get the count from the length of the array, and process the rows from the array.
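For comparison, here is a sketch of the three counting approaches side by side. It uses Python's sqlite3 rather than PDO, and the table contents are invented, so treat it as an outline of the idea rather than drop-in code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE guess (id INTEGER PRIMARY KEY, user_id INTEGER, logo_id INTEGER)"
)
# Hypothetical data: three guesses for user 1, one for user 2.
conn.executemany(
    "INSERT INTO guess (user_id, logo_id) VALUES (?, ?)",
    [(1, 10), (1, 11), (1, 12), (2, 10)],
)
user_id = 1

# Option 1: fetch all rows, then take the length of the list.
rows = conn.execute(
    "SELECT logo_id FROM guess WHERE user_id = ?", (user_id,)
).fetchall()
count_from_fetchall = len(rows)

# Option 2: a separate COUNT query (one extra round trip, no rows transferred).
count_from_query = conn.execute(
    "SELECT COUNT(1) FROM guess WHERE user_id = ?", (user_id,)
).fetchone()[0]

# Option 3: count while iterating; the total is only known after the loop.
count_from_loop = 0
for _row in conn.execute("SELECT logo_id FROM guess WHERE user_id = ?", (user_id,)):
    count_from_loop += 1

print(count_from_fetchall, count_from_query, count_from_loop)
```

Only option 2 gives the count before any rows are retrieved, which is the case described above.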
You have to use GROUP BY. Your query should look like
SELECT logo_id, guess_count, guessed, COUNT(id) AS counter
FROM guess
WHERE user_id=:id
GROUP BY logo_id, guess_count, guessed
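A sketch of why the original query returned a single row, again using Python's sqlite3 with invented data. Like MySQL in its permissive mode, SQLite accepts plain columns next to an aggregate without a GROUP BY and collapses the result to one row; other databases may reject that query outright.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE guess (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "logo_id INTEGER, guess_count INTEGER, guessed INTEGER)"
)
# Hypothetical data: user 1 has guessed at three different logos.
conn.executemany(
    "INSERT INTO guess (user_id, logo_id, guess_count, guessed) VALUES (?, ?, ?, ?)",
    [(1, 10, 2, 1), (1, 11, 1, 0), (1, 12, 4, 1)],
)

# Aggregate without GROUP BY: the whole result collapses into one row.
without_group = conn.execute(
    "SELECT logo_id, guess_count, guessed, COUNT(id) AS counter "
    "FROM guess WHERE user_id = ?",
    (1,),
).fetchall()

# With GROUP BY: one row per (logo_id, guess_count, guessed) group.
with_group = conn.execute(
    "SELECT logo_id, guess_count, guessed, COUNT(id) AS counter "
    "FROM guess WHERE user_id = ? "
    "GROUP BY logo_id, guess_count, guessed",
    (1,),
).fetchall()

print(len(without_group), without_group[0][3])
print(len(with_group), [r[3] for r in with_group])
```

Note that with the GROUP BY, counter becomes the number of rows in each group, not the total number of rows returned.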
What is the difference between "TOP" and "SAMPLE" in Teradata SQL? Are they the same?
From TOP vs SAMPLE:
TOP 10 means "first 10 rows in sorted order". If you don't have an ORDER BY, then by extension it will be interpreted as asking for "ANY 10 rows" in any order. The optimizer is free to select the cheapest plan it can find and stop processing as soon as it has found enough rows to return. If this query is the only thing running on your system, TOP may appear to always give you exactly the same answer, but that behavior is NOT guaranteed.
SAMPLE, as you have observed, does extra processing to try to randomize the result set yet maintain the same approximate distribution. At a very simple level, for example, it could pick a random point at which to start scanning the table and a number of rows to skip between rows that are returned.
‘Sample’ command:
Sel * from tablename
sample 100
That will give you a sample of 100 different records from the table.
The SAMPLE command will give DIFFERENT results each time you run it.
TOP command:
sel top 100 * from tablename;
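This will give 100 rows of the table; they are "the first 100" in a meaningful sense only if you add an ORDER BY.
The TOP command will often give you THE SAME results each time you run it, but without an ORDER BY that is not guaranteed.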
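A toy model of the difference in plain Python; this is not Teradata's actual algorithm, and the "table" is just a list of numbers. TOP is modelled as taking whatever rows the scan yields first, and SAMPLE as drawing distinct rows at random.

```python
import random
from itertools import islice

# Hypothetical table: 1000 rows, identified by the numbers 1..1000.
table = list(range(1, 1001))


def top(rows, n):
    """Model of TOP n without ORDER BY: the first n rows the scan yields."""
    return list(islice(rows, n))


def sample(rows, n, rng):
    """Model of SAMPLE n: n distinct rows chosen at random."""
    return rng.sample(rows, n)


first = top(table, 100)
again = top(table, 100)
picked = sample(table, 100, random.Random())

print(first == again)    # same rows here, only because a list has a fixed scan order
print(len(set(picked)))  # 100 distinct rows, usually different on each run
```

In this model TOP repeats itself only because a Python list is always scanned in the same order; a real table scan makes no such promise.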