I have a 4 GB sqlite3 database with 400k rows. The id column is a consecutive number starting at 1. If I run the following query:
select * from game where id = 1
After printing the first match, the query keeps going until it reaches row 400k, so it takes a few seconds to finish.
How do I make the query stop at the first match?
Or how do I go directly to a specific row, since id and row number are the same?
Just add a LIMIT 1 to your query:
SELECT * FROM game WHERE id = 1 LIMIT 1;
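LIMIT 1 lets the engine stop after the first match, but how much work it does to reach that match depends on whether id is indexed. Here is a minimal sketch in Python's sqlite3 module that compares the query plans. The schema, the moves column and the game_noindex table are made up for illustration. In SQLite a column declared INTEGER PRIMARY KEY is an alias for the rowid, so a lookup on it is a direct search rather than a scan.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical schema: id is the INTEGER PRIMARY KEY (an alias for the rowid).
conn.execute("CREATE TABLE game (id INTEGER PRIMARY KEY, moves TEXT)")
# Same data, but id is an ordinary column without any index.
conn.execute("CREATE TABLE game_noindex (id INTEGER, moves TEXT)")
data = [(i, "moves %d" % i) for i in range(1, 1001)]
conn.executemany("INSERT INTO game VALUES (?, ?)", data)
conn.executemany("INSERT INTO game_noindex VALUES (?, ?)", data)

keyed_plan = query_plan(conn, "SELECT * FROM game WHERE id = ? LIMIT 1", (1,))
plain_plan = query_plan(conn, "SELECT * FROM game_noindex WHERE id = ? LIMIT 1", (1,))
first = conn.execute("SELECT * FROM game WHERE id = ? LIMIT 1", (1,)).fetchone()

print(keyed_plan)  # a SEARCH step that uses the primary key
print(plain_plan)  # a SCAN step: rows are read in order until one matches
print(first)
```

If your id column turns out to be a plain column, CREATE INDEX on it should give you the same kind of direct lookup.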
select *
from customers
where column1 = 'test'
limit 5;
I just need 5 records. Will the execution engine stop running after it finds 5 records that match the condition?
I am working on a table with millions of records, and a simple select statement with limit is taking ~20 minutes.
Can I improve the performance of this query?
Make sure that you have an index on column1. If not, the engine has to scan records starting with the first one until it finds 5 matching records, which in the worst case means reading the whole table. If you know that more than this single column will match your desired rows and also exclude other rows, you could create a compound index consisting of more than one column. You could also consider partitioning your table.
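To see the effect of the index, here is a rough sketch using Python's sqlite3 module as a stand-in for your engine. The customers schema, the column2 filter and the index names are assumptions for illustration, and the plan text of other databases will look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: column2 is an assumed second filter column.
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, column1 TEXT, column2 TEXT, note TEXT)"
)
conn.executemany(
    "INSERT INTO customers (column1, column2, note) VALUES (?, ?, ?)",
    [("test" if i % 50 == 0 else "other", "c%d" % (i % 7), "row %d" % i)
     for i in range(5000)],
)

def plan(sql):
    """Join the 'detail' column of EXPLAIN QUERY PLAN into one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM customers WHERE column1 = 'test' LIMIT 5"
before = plan(query)  # no index yet: the table is scanned row by row
conn.execute("CREATE INDEX idx_customers_column1 ON customers (column1)")
after = plan(query)   # the engine can now jump to the matching entries

# Compound index for queries that filter on both columns.
conn.execute("CREATE INDEX idx_customers_c1_c2 ON customers (column1, column2)")
compound = plan(
    "SELECT * FROM customers WHERE column1 = 'test' AND column2 = 'c1' LIMIT 5"
)
matches = conn.execute(query).fetchall()

print(before)
print(after)
print(compound)
```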
I am using select top 5 from(select * from tbl2) tbl. Will it get all records from tbl2 and then pick only 5 of them, or will it only fetch 5 records into memory? Suppose we have 1000 records in tbl2.
For future reference, SQL actually has a huge amount of amazing documentation/resources online. For instance: this Google search.
As to your question, it will pull the top five results matching your criteria, so it depends. It'll go through the rows and find the first five that match. If those are the last rows it looks at, it will still have to do the comparisons and filtering on all rows; the only difference is that it will have to send fewer rows to your computer.
For example, let's say we have a table my_table with two columns: Customer_ID (which is unique) and First_Purchase_Date (which is a date value between 2015-01-01 and 2017-07-26). If we simply do SELECT TOP 5 * FROM my_table then it will go through and pull the first five rows it finds, without looking at the rest of the rows. On the other hand, if we do SELECT TOP 5 * FROM my_table WHERE First_Purchase_Date = '2017-05-17' then it will have to go through the rows until it has found five with a First_Purchase_Date of 2017-05-17. If First_Purchase_Date is indexed, this should not be very expensive, as it'll more or less know where to look. If it's not, then it depends on how your SQL engine has decided to structure your table, and whether it has created any useful statistics. Worst case, it has no statistics and the desired rows are the last five in the table, in which case it will have to complete the comparison on every row.
By the way, using TOP without an ORDER BY is a somewhat poor idea, as the rows returned will not necessarily stay consistent over time. It may be a good idea to throw in an ORDER BY clause, to ensure you get the same records every time.
The SELECT TOP clause is used to specify the number of records to return.
It limits the rows returned in a query result set to a specified number of rows or percentage of rows. When TOP is used in conjunction with the ORDER BY clause, the result set is limited to the first N number of ordered rows; otherwise, it returns the first N number of rows in an undefined order.
SELECT TOP number|percent column_name(s)
FROM table_name
WHERE condition;
I need to get 10 random rows from a table each time, but rows must never repeat when I run the query again.
Once I have gone through all the rows, it should start over. For example, if the table has 20 rows, the first time I get 10 random rows, the second time I need the remaining 10 rows, and on my third query I again need 10 random rows.
This is my current query for getting 10 random rows:
SELECT TOP 10 *
FROM tablename
ORDER BY NEWID()
But MSDN suggest this query
SELECT TOP 10 * FROM Table1
WHERE (ABS(CAST(
(BINARY_CHECKSUM(*) *
RAND()) as int)) % 100) < 10
for good performance. But this query does not return a constant number of rows. Could you please suggest something?
Since the required outcome of your second query depends on the (random) outcome of the first query, the querying cannot be stateless. You'll need to store the state (info about the previous query/queries) somewhere, somehow.
The simplest solution would probably be storing the already-retrieved rows or their IDs in a temporary table and then querying ... where id not in (select id from temp_table) in the second query.
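As a rough illustration of the temp-table idea, here is a sketch in Python with sqlite3. SQLite's ORDER BY RANDOM() LIMIT stands in for TOP ... ORDER BY NEWID(), and the seen table, the payload column and the helper names are hypothetical. It also starts over once every row has been handed out.

```python
import sqlite3

def make_demo_db(n_rows=20):
    """Build an in-memory table plus a temp table that records handed-out ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO tablename (payload) VALUES (?)",
        [("row %d" % i,) for i in range(n_rows)],
    )
    conn.execute("CREATE TEMP TABLE seen (id INTEGER PRIMARY KEY)")
    return conn

def next_random_batch(conn, size=10):
    """Return up to `size` random rows not handed out yet; start over when exhausted."""
    sql = (
        "SELECT id, payload FROM tablename "
        "WHERE id NOT IN (SELECT id FROM seen) "
        "ORDER BY RANDOM() LIMIT ?"
    )
    batch = conn.execute(sql, (size,)).fetchall()
    if not batch:
        # Every row has been returned once: forget the history and reshuffle.
        conn.execute("DELETE FROM seen")
        batch = conn.execute(sql, (size,)).fetchall()
    conn.executemany("INSERT INTO seen (id) VALUES (?)", [(row[0],) for row in batch])
    return batch

demo = make_demo_db()
for _ in range(3):
    print([row[0] for row in next_random_batch(demo)])
```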
As Jiri Tousek said, each query that you run has to know what previous queries returned.
Instead of inserting the IDs of previously returned rows in a table and then checking that new result is not in that table yet, I'd simply add a column to the table with the random number that would define a new random order of rows.
You populate this column with random numbers once.
This will remember the random order of rows and make it stable, so all you need to remember between your queries is how many random rows you have requested so far. Then just fetch as many rows as needed starting from where you stopped in the previous query.
Add a column RandomNumber binary(8) to the table. You can choose a different size. 8 bytes should be enough.
Populate it with random numbers. Once.
UPDATE tablename
SET RandomNumber = CRYPT_GEN_RANDOM(8)
Create an index on RandomNumber column. Unique index. If it turns out that there are repeated random numbers (which is unlikely for 20,000 rows and random numbers 8 bytes long), then re-generate random numbers (run the UPDATE statement once again) until all of them are unique.
Request first 10 random rows:
SELECT TOP(10) *
FROM tablename
ORDER BY RandomNumber
As you process/use these 10 random rows remember the last used random number. The best way to do it depends on how you process these 10 random rows.
DECLARE @VarLastRandomNumber binary(8);
SET @VarLastRandomNumber = ...
-- the random number from the last row returned by the previous query
Request next 10 random rows:
SELECT TOP(10) *
FROM tablename
WHERE RandomNumber > @VarLastRandomNumber
ORDER BY RandomNumber
Process them and remember the last used random number.
Repeat. As a bonus you can request different number of random rows on each iteration (it doesn't have to be 10 each time).
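The steps above can be sketched end to end in Python with sqlite3, as an analogue rather than the T-SQL itself. Here os.urandom(8) plays the role of CRYPT_GEN_RANDOM(8) and LIMIT the role of TOP, while the random_number column, the payload column and the helper names are assumptions for illustration.

```python
import os
import sqlite3

def make_shuffled_table(n_rows=20):
    """Create a table whose random_number column fixes a random row order once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tablename (id INTEGER PRIMARY KEY, payload TEXT, random_number BLOB)"
    )
    conn.executemany(
        "INSERT INTO tablename (payload, random_number) VALUES (?, ?)",
        [("row %d" % i, os.urandom(8)) for i in range(n_rows)],
    )
    # The unique index fails to build in the unlikely case of a repeated value.
    conn.execute("CREATE UNIQUE INDEX idx_random_number ON tablename (random_number)")
    return conn

def next_batch(conn, last_random_number=None, size=10):
    """Fetch the next `size` rows after `last_random_number` in the stored order."""
    if last_random_number is None:
        rows = conn.execute(
            "SELECT id, payload, random_number FROM tablename "
            "ORDER BY random_number LIMIT ?",
            (size,),
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT id, payload, random_number FROM tablename "
            "WHERE random_number > ? ORDER BY random_number LIMIT ?",
            (last_random_number, size),
        ).fetchall()
    # Remember the random number of the last row for the following call.
    new_last = rows[-1][2] if rows else last_random_number
    return rows, new_last

demo = make_shuffled_table()
first, marker = next_batch(demo)
second, marker = next_batch(demo, marker)
print([r[0] for r in first], [r[0] for r in second])
```

An empty result tells you that all rows have been used. At that point you would re-generate the random numbers and start again without a marker.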
What I would do is add two new fields, SELECTED (int) and TimesSelected (int), then:
UPDATE tablename SET SELECTED = 0;
WITH CTE AS (SELECT TOP 10 *
FROM tablename
ORDER BY TimesSelected ASC, NEWID())
UPDATE CTE SET SELECTED = 1, TimesSelected = TimesSelected + 1;
SELECT * from tablename WHERE SELECTED = 1;
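If you use that each time, a record that has just been selected moves to the back of the queue (its TimesSelected is now higher), and the next batch is drawn at random from the records that have been selected the fewest times.
You might want to put an index on SELECTED and do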
UPDATE tablename SET SELECTED = 0 WHERE SELECTED = 1; -- for performance
The most elegant solution, provided you do the consecutive queries within a certain amount of time, would be to use a cursor:
DECLARE rnd_cursor CURSOR FOR
SELECT col1, col2, ...
FROM tablename
ORDER BY NEWID();
OPEN rnd_cursor;
FETCH NEXT FROM rnd_cursor; -- Repeat ten times
Keep the cursor open and just keep fetching rows as you need them. Close the cursor when you're done:
CLOSE rnd_cursor;
DEALLOCATE rnd_cursor;
As for the second part of your question, once you fetched the last row, open a new cursor:
IF @@FETCH_STATUS <> 0
BEGIN
CLOSE rnd_cursor;
OPEN rnd_cursor;
END;
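The same keep-the-cursor-open idea can be sketched from client code. This is a Python/sqlite3 analogue, not T-SQL: ORDER BY RANDOM() replaces ORDER BY NEWID(), and the RandomRowFeed class and the payload column are hypothetical names.

```python
import sqlite3

class RandomRowFeed:
    """Hand out rows in random order, reshuffling once every row was returned."""

    def __init__(self, conn):
        self.conn = conn
        self.cursor = self._open()

    def _open(self):
        # Each new cursor gets its own random ordering of the table.
        return self.conn.execute("SELECT id, payload FROM tablename ORDER BY RANDOM()")

    def fetch(self, size=10):
        rows = self.cursor.fetchmany(size)
        if not rows:
            # The previous cursor is exhausted: open a fresh one and continue.
            self.cursor = self._open()
            rows = self.cursor.fetchmany(size)
        return rows

def make_demo_db(n_rows=20):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO tablename (payload) VALUES (?)",
        [("row %d" % i,) for i in range(n_rows)],
    )
    return conn

feed = RandomRowFeed(make_demo_db())
for _ in range(3):
    print([row[0] for row in feed.fetch()])
```

Note that if the batch size does not divide the row count, the last batch before a reshuffle is simply shorter.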
I have a table with a huge number of records in SQL. (I have a procedure which displays the top 100.) How do I iterate through all the records and display the first 100, then the next 100, and so on in a loop?
Process in batches of 100. An external query or process can execute this procedure in a loop.
In C#, you could use the SqlDataReader class. With this, you can read through all records, row by row.
You can pass two parameters, offset and limit, to the stored procedure. Then keep limit at 100 and add 100 to offset every time you execute it.
Starting value of offset is zero.
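A minimal sketch of that loop, written in Python with sqlite3 as a stand-in for the stored procedure. The records table and the function names are made up, and in SQL Server you would express the same thing with ORDER BY ... OFFSET ... FETCH NEXT.

```python
import sqlite3

def make_records_db(n_rows):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO records (payload) VALUES (?)",
        [("row %d" % i,) for i in range(n_rows)],
    )
    return conn

def fetch_page(conn, offset, limit=100):
    """Plays the role of the stored procedure: one page of rows in a fixed order."""
    return conn.execute(
        "SELECT id, payload FROM records ORDER BY id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()

def iterate_in_batches(conn, limit=100):
    """Call fetch_page repeatedly, advancing offset by limit until no rows remain."""
    offset = 0
    while True:
        page = fetch_page(conn, offset, limit)
        if not page:
            break
        yield page
        offset += limit

for batch in iterate_in_batches(make_records_db(250)):
    print(len(batch), batch[0][0], batch[-1][0])
```

The ORDER BY matters here: without a fixed order, consecutive pages are not guaranteed to be disjoint.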
I want to get records 0 - 100 the first time and 101 - 200 the next time. I'm thinking of using a cursor over all the records, then selecting the top 100 where the id does not exist in a temp table and inserting those unique ids into the temp table.
What is the fastest way, performance-wise, to check whether an integer column contains a specific value?
I have a table with 10 million rows in PostgreSQL 8.4. I need to do at least 10000 checks per second.
Currently I run the query SELECT id FROM table WHERE id = my_value and then check whether the DataReader has rows. But it is quite slow. Is there any way to speed this up without loading the whole column into memory?
You can select COUNT instead:
SELECT COUNT(*) FROM table WHERE id = my_value
It will return just one integer value - number of rows matching your select condition.
You need two things.
As Marcin pointed out, you want to use COUNT(*) if all you need is to know how many. You also need an index on that column. The index will have the answer pretty much right at hand. Without the index, PostgreSQL would still have to go through the entire table to count that one number.
CREATE INDEX id_idx ON table (id ASC NULLS LAST);
Something of the sort should get you there. Whether it is enough to run the query 10,000/sec. will depend on your hardware...
If you use where id = X then all rows matching X will be returned. Suppose 1000 rows match X; then 1000 rows will be returned.
Now, if you only want to check whether the value is present at least once, then after you have matched the first row there is no need to process the other 999. Even if you count the rows, you are still going through all of them.
What I would do in this case is this:
SELECT 1 FROM table
WHERE id = my_value
LIMIT 1
Note that I'm not even returning the id itself. So if you get one record then the value is there.
Of course, in order to improve this query, make sure you have an index on the id column.
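Wrapped in client code, the check might look like the sketch below. It uses Python's sqlite3 module rather than PostgreSQL, and the items table, the index name and the value_exists helper are assumptions for illustration. The query text is the same SELECT 1 ... LIMIT 1.

```python
import sqlite3

def make_items_db(values):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER)")
    conn.executemany("INSERT INTO items (id) VALUES (?)", [(v,) for v in values])
    # Index on the checked column so the lookup does not scan the table.
    conn.execute("CREATE INDEX items_id_idx ON items (id)")
    return conn

def value_exists(conn, my_value):
    """True if at least one row has id = my_value; stops after the first match."""
    row = conn.execute(
        "SELECT 1 FROM items WHERE id = ? LIMIT 1", (my_value,)
    ).fetchone()
    return row is not None

demo = make_items_db([1, 2, 2, 2, 5])
print(value_exists(demo, 2))
print(value_exists(demo, 4))
```

Using a bound parameter instead of building the SQL string also lets the driver reuse the prepared statement, which helps when you run the check many times per second.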