Using the SQL COUNT function or executing the search query directly: which is more efficient?

Let's say I have a very big database. If I execute the search query directly and then count the returned rows, would that be faster? Or should I run COUNT on the search query first and then execute a paged query like this:
SELECT *
FROM TABLE
WHERE bla='blabla'
OFFSET 0 FETCH NEXT 20 ROWS ONLY
I searched for it but I couldn't find any solution.

Do the count in the database! It will be much faster.
First, a count(*) only returns one row and one value. That is much, much less data -- and much faster -- than returning all the rows.
Second, a count(*) does not reference any columns in the select, so the query can be better optimized. It might be possible to get the count without ever looking at the data pages.
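As a rough illustration of the difference, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the real database. The table name `items` and the sample data are made up for the example; only the `bla = 'blabla'` filter comes from the question.

```python
import sqlite3

# Hypothetical in-memory table standing in for the asker's big table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, bla TEXT)")
conn.executemany(
    "INSERT INTO items (bla) VALUES (?)",
    [("blabla" if i % 4 == 0 else "other",) for i in range(1000)],
)

# Option A: the database counts; exactly one row with one value comes back.
(db_count,) = conn.execute(
    "SELECT COUNT(*) FROM items WHERE bla = ?", ("blabla",)
).fetchone()

# Option B: every matching row is shipped to the client just to be counted.
client_count = len(
    conn.execute("SELECT * FROM items WHERE bla = ?", ("blabla",)).fetchall()
)

print(db_count, client_count)
```

Both options give the same number, but option B transfers every matching row, which is the cost the answer warns about.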

It looks like you are doing paging. You need the total count so you can display it and calculate the total number of pages for the user, yes?
Then Gordon's answer is the one to use.

Related

SQL query: How to select the first 100000-200000 rows in a huge table

I know the normal way should be:
SELECT *
FROM mytable
ORDER BY date
OFFSET 100000 ROWS
FETCH NEXT 100000 ROWS ONLY
However, when mytable has 80 million rows, the ORDER BY takes a long time to run. To me, the order doesn't matter; I just want to download the data 100,000 rows at a time. Is there a good way to achieve this?
The order by only takes a long time because you use a column without an index on it. Use an indexed column like an id column in your order by.
Or add an index on date.
If the order doesn't matter, just don't use it. But the better approach is to follow juergen's advice: always try to order on indexed columns.
I'm not sure whether fetching 100k rows at once means the system loads all 100k rows into memory.
Either way, when you process the result, the loop iterates over the 100k rows and then ends.
Could you explain why the order does not matter to you?
Without an ORDER BY, you can get repeated rows across different fetches.
If you use an ORDER BY, you ensure that rows are not repeated.
If the query is slow, create an index.
In this kind of select, I'm not sure if pruning would help, as you have no where clause.
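The advice above (order on an indexed column so that chunks do not overlap) can be sketched as follows. This is only a sketch under assumptions: it uses sqlite3 as a stand-in, assumes `mytable` has a positive integer primary key named `id`, and replaces OFFSET with a "continue after the last id seen" filter, which is a variation on what the answers describe rather than something they spell out. The helper name `iter_chunks` is hypothetical.

```python
import sqlite3

def iter_chunks(conn, chunk_size):
    """Yield lists of rows, walking the table in primary-key order."""
    last_id = 0  # assumption: ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT id, date FROM mytable WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        # Remember where this chunk ended so the next one starts after it.
        last_id = rows[-1][0]

# Small hypothetical data set; the real table would have millions of rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO mytable (date) VALUES (?)",
    [(f"2020-01-{(i % 28) + 1:02d}",) for i in range(25)],
)

chunks = list(iter_chunks(conn, chunk_size=10))
print([len(chunk) for chunk in chunks])
```

Because each chunk is defined by the primary key, no row can appear in two chunks, and the database can seek straight to the start of each chunk through the index instead of sorting the whole table.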

Different result size between SELECT * and SELECT COUNT(*) on Oracle

I am seeing strange behavior on an Oracle database. We do a huge insert of around 3.1 million records. Everything is fine so far.
Shortly after the insert has finished (around 1 to 10 minutes later) I execute two statements.
SELECT COUNT(*) FROM TABLE
SELECT * FROM TABLE
The result of the first statement is fine: it gives me the exact number of rows that were inserted.
The result of the second statement is the problem. Depending on the time, the number of rows returned is, for example, around 500K lower than the result of the first statement. The difference between the two results decreases with time.
So I have to wait 15 to 30 minutes before both statements return the same number of rows.
I already talked to the Oracle DBA about this issue, but he has no idea how this could happen.
Any ideas, questions or suggestions?
Update
When I select only an indexed column I get the correct row count.
When I instead select a non-indexed column I again get the wrong row count.
That doesn't sound like a bug to me. If I understood you correctly, it just takes time for Oracle to fetch the entire table. After all, 3 million rows is not a small amount.
The count, by contrast, returns a single record with the total number of rows.
If, after some waiting, the number of records output equals the number that the count query returns, then everything is fine.
Have you already checked these things?
1- Count a single column instead of * to compare both results.
2- Verify both queries by adding a WHERE clause and then gradually selecting more rows by removing conditions, so that you can find the point where the two start returning different values.
I think you should check the execution plan to identify missing indexes and improve performance.
Add the missing indexes and check the result.
Why missing indexes are important:
To count rows, the Oracle engine does not need to go through a paging operation. But to fetch all the details from a table, it does need to go through paging.
And the paging process depends on the indexes created on the table to fetch the data effectively and fast.
So to decrease the time for your second statement, you should find the missing indexes and create them.
How to Find Missing Indexes:
You can start with DBA_HIST_ACTIVE_SESS_HISTORY, and look at all statements that contain that type of hint.
From there, you can pull the index name coming from that hint, and then do a lookup on dba_indexes to see if the index exists, is valid, etc.

Google BigQuery LIMIT clause returning too many rows

In BigQuery I am running a query on tables exported from GA.
I cannot seem to get BigQuery to limit the results. Here is a sample query, quite basic.
SELECT * FROM [1111111.ga_sessions_20140318] LIMIT 20000
The result set comes back, but with 7 million+ rows! I have tried this several different ways, i.e. output to a table, just return the result set, use cached results, don't use cached results, etc.
No matter which table I try to query, it always returns the entire table.
This is basically the same as the sample query BigQuery gives when clicking the query table button, except that I changed the limit value from 1000 to 20000.
Anyone have any insight?
As noted by the comment on the original question:
"Is it possible that the number of rows shown at the bottom of the
result set returned in big query is my 20000 main object records plus
all the nested records?"
The answer is yes: BigQuery will apply the limit to the number of rows in the response, but if there are nested records involved, those will be flattened in the output.
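To see why the displayed row count can exceed the LIMIT, here is a toy model in plain Python. It is not BigQuery's implementation, only an illustration of "limit the top-level records, then flatten the nested ones". The record layout (a `visitId` with a repeated `hits` list holding a `page` value) and the helper name `limit_then_flatten` are simplified assumptions.

```python
# Toy data: three top-level session records, each with nested "hits".
sessions = [
    {"visitId": 1, "hits": [{"page": "/a"}, {"page": "/b"}, {"page": "/c"}]},
    {"visitId": 2, "hits": [{"page": "/a"}, {"page": "/d"}]},
    {"visitId": 3, "hits": [{"page": "/e"}]},
]

def limit_then_flatten(records, limit):
    """Apply the limit to top-level records, then emit one row per nested hit."""
    limited = records[:limit]
    return [
        {"visitId": rec["visitId"], "page": hit["page"]}
        for rec in limited
        for hit in rec["hits"]
    ]

flat = limit_then_flatten(sessions, limit=2)
print(len(flat))
```

With a limit of 2 the output has 5 rows, because the two selected sessions contain 3 and 2 hits respectively. The same effect at scale would turn 20000 sessions into millions of displayed rows.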

Get total number of rows while using limit clause

I am querying my table to achieve pagination but I do not know the total number of rows in the table.
select name from table where id = 1 limit 0, 10
Is there a way to find out the total number of rows that would have been returned if I had not used the LIMIT clause, without running a separate query for the total count?
SQLite computes results on the fly when they are actually needed.
The only way to get the total count is to run the actual query (or better, SELECT COUNT(*)) without the LIMIT.
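It depends on which back-end technology you are using. In PHP, mysql_num_rows() returns the number of rows in a result set without you having to iterate over it, but it counts the rows of the query as executed, so a LIMIT in the query still applies.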
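A minimal sketch of the two-query approach for SQLite, using Python's sqlite3 module. The table name `people`, the sample rows, and the helper `fetch_page` are hypothetical; the `id = 1` filter and the page size of 10 mirror the question.

```python
import math
import sqlite3

# Hypothetical data: 35 rows match id = 1, one row does not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, f"name{i}") for i in range(35)] + [(2, "other")],
)

page_size = 10

# Query 1: the same WHERE clause, but COUNT(*) and no LIMIT.
(total,) = conn.execute(
    "SELECT COUNT(*) FROM people WHERE id = ?", (1,)
).fetchone()
total_pages = math.ceil(total / page_size)

def fetch_page(page_number):
    """Return one page of names; page_number is 0-based."""
    return conn.execute(
        "SELECT name FROM people WHERE id = ? ORDER BY rowid LIMIT ? OFFSET ?",
        (1, page_size, page_number * page_size),
    ).fetchall()

last_page = fetch_page(total_pages - 1)
print(total, total_pages, len(last_page))
```

The ORDER BY is added so that pages are stable between calls; the query in the question has none, so its page contents are not guaranteed to be consistent.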

How do I avoid multiple queries for pagination count and data retrieval?

I have a scenario where I get a count and then pass the count as a variable to a similar query to get the paginated records. So basically I run the full query once to get the total count, which internally builds the full table, and then use that count to display the same table 10 rows per page. What options do I have to avoid running multiple queries like this?
Something like this, in pseudocode:
select count {big table}
select big table where records are between count and count+10
Is there a sensible way to get the COUNT variable in the same query?
I am wondering how Google would handle a search: would it first find all the records, or just fetch the records without tracking the number of pages? Page numbers can't be computed in advance because they depend on the variable sent by the user.
Edit: I have a similar question here https://dba.stackexchange.com/questions/161586/including-count-of-a-result-in-the-main-query
Regarding Google, they are likely to generate only the requested amount of results (like 10) and to estimate the count. The estimated count is very imprecise.
You can't have SQL Server count all results and return only a subset of them. There are three strategies to deal with this:
execute a counting and a data query
execute an unlimited data query and discard all but ten results on the client
execute an unlimited data query into a temp-table whose primary key is the row number. You can then count instantly (get the last row) and select any subset by rownumber with a single seek
Counting the data can be significantly cheaper because SQL Server can use different indexes or discard joins.
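The third strategy (materialize the result once, keyed by row number) can be sketched as follows. This is only an approximation of the idea: it uses sqlite3 in place of SQL Server, and the table `big_table`, the temp table `result`, the column `rn`, and the filter are all hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO big_table (payload) VALUES (?)",
    [(f"row-{i}",) for i in range(47)],
)

# Run the unlimited query once into a temp table whose primary key (rn)
# is the row number; SQLite assigns 1, 2, 3, ... in insertion order.
conn.execute(
    "CREATE TEMP TABLE result (rn INTEGER PRIMARY KEY, id INTEGER, payload TEXT)"
)
conn.execute(
    "INSERT INTO result (id, payload) "
    "SELECT id, payload FROM big_table WHERE id % 2 = 1 ORDER BY id"
)

# The total count is the last row number, found through the primary key.
(total,) = conn.execute("SELECT COALESCE(MAX(rn), 0) FROM result").fetchone()

# Any page is a range lookup on the same key.
page = conn.execute(
    "SELECT rn, id, payload FROM result WHERE rn BETWEEN ? AND ? ORDER BY rn",
    (11, 20),
).fetchall()
print(total, len(page))
```

The expensive query runs only once; the count and every later page come from cheap lookups on the temp table, at the price of storing the full result set.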