I have a table T1 with seven columns named c1, c2, c3, ..., c7.
All columns are of type varchar, with lengths ranging from 10 to 500, but most are of size 500.
The table has approximately 24k rows. When I execute
SELECT * FROM T1;
it takes more than 1 min to fetch all rows. But when I run the same query with an additional ORDER BY clause
SELECT * FROM T1 ORDER BY C1;
it takes less than 25 secs to load the complete data.
Note: there are no indexes on the table.
The tool I am using to query is Embarcadero DBArtisan, and the database is IBM DB2 9.7+.
Also, the server and client are several thousand kilometres apart.
I am a bit confused about why it takes less time with the ORDER BY clause, and why the time difference is so huge.
I am using SQL Server 2012. I have one table containing 1.7 million records. I am selecting all records using (select * from table_name), but it takes 1 hour 2 minutes to fetch them.
What should I do to fetch the records quickly?
All you can do is limit your result set, using
TOP (100) or a WHERE clause for the data you actually want, instead of
SELECT * FROM table
This will get you only the relevant, limited data in less time.
Another thing you can do is select only the columns that give you the results you need, as sketched below.
This will significantly reduce the fetching time.
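For instance, a minimal sketch (the table name table_name is from the question; the column names and the filter are assumptions):

-- Fetch only the first 100 rows (SQL Server syntax):
SELECT TOP (100) * FROM table_name;

-- Or fetch only the columns and rows you actually need
-- (col1, col2 and the date filter are hypothetical):
SELECT col1, col2
FROM table_name
WHERE created_date >= '2014-01-01';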
I have a query as follows:
SELECT Brand,Id,Model FROM product
which takes time on the order of seconds, as the product table has more than 1 million records.
But the following query executes in no time (less than even one second):
select count(*) as numberOfRows from (SELECT Brand,Id,Model FROM product) result
Why is that?
When you execute a query, the time taken will vary depending on the number of columns and rows and their datatypes.
In a table with 10 columns, performance differs between selecting all columns (*) for all records and selecting just one or two columns for all records,
because the amount of data loaded is smaller in the second case, so it executes faster.
Likewise, with COUNT(*) the result is just a single cell, whereas your first SELECT returns millions of rows for those 3 columns, so the amount of data is far higher.
That's why you get the COUNT(*) result faster. Some people suggest COUNT(1) instead of COUNT(*), but on most engines the two compile to the same plan, so don't expect a performance difference.
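A quick sketch of the comparison, using the product table from the question:

-- A single cell travels over the wire; COUNT(1) is equivalent on most engines:
select count(*) as numberOfRows from product;

-- Millions of rows for 3 columns travel over the wire:
SELECT Brand, Id, Model FROM product;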
I'd like to know if it's possible to run two queries at the same time in SQL Server 2014.
Here is an example of why I want to know.
Select * into #temp1 from table1
Select * into #temp2 from table2
Let's say the first line takes 15 seconds to run and the second one takes 10 seconds. If I run this query normally, it takes approximately 25 seconds.
If I could run the 2 lines at the same time, this query should take just 15 seconds.
Is it possible?
Thanks
These two queries select * from two different tables (table1 and table2) and insert the data into two different temp tables, #temp1 and #temp2.
You can certainly run these two queries at the same time; the duration of each select ... into has little impact on the other. Each statement just needs its own connection, as sketched below.
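A minimal sketch: open two sessions (e.g., two query windows) and run one statement in each. One caveat: local temp tables (#temp1) are visible only to the session that created them, so if the two result sets must be combined afterwards, global temp tables (##) or permanent staging tables are needed instead:

-- Session 1 (first query window):
SELECT * INTO ##temp1 FROM table1;  -- global temp table, visible across sessions

-- Session 2 (second query window), started at the same time:
SELECT * INTO ##temp2 FROM table2;

-- Afterwards, any session can read both ##temp1 and ##temp2.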
I have a very simple query on a table with 60 million rows :
select id, max(version) from mytable group by id
It returns 6 million records and takes more than one hour to run. I just need to run it once because I am transferring the records to another new table that I keep updated.
I tried a few things that didn't work for me but that are often suggested here on Stack Overflow:
inner query with select top 1 / order by desc: not supported in Sybase ASE
left outer join on a.version < b.version, keeping rows where b.version is null (sketched below): I interrupted the query after more than one hour, and barely a hundred thousand records had been found
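For reference, that self-join variant looks roughly like this:

select a.id, a.version
from mytable a
left outer join mytable b
  on b.id = a.id
 and a.version < b.version
where b.version is null  -- keep a row only if no higher version exists for its id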
I understand that Sybase has to do a full scan.
Why could the full scan be so slow?
Is the slowness due to the Sybase ASE instance itself or specific to the query?
What are my options to reduce the running time of the query?
I am not intimately familiar with Sybase optimization. However, your query is really slow. Here are two ideas.
First, add an index on mytable(id, version desc). At a minimum, this is a covering index for the query, meaning that all the columns the query uses are in the index. Sybase is probably smart enough to satisfy the group by from the index alone.
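A sketch of the index creation (the index name is an assumption; Sybase ASE accepts asc/desc per index column):

create nonclustered index ix_mytable_id_version
    on mytable (id, version desc)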
Another option uses the same index, but with a correlated subquery:
select t.id
from mytable t
where t.version = (select max(t2.version)
                   from mytable t2
                   where t2.id = t.id
                  );
This would be a full table scan (a little expensive, but not an hour's worth) plus an index lookup for each row (pretty cheap). The advantage of this approach is that you can select all the columns you want. The disadvantage is that if two rows share the maximum version for an id, you will get both in the result set.
Edit: Here, Nicolas, is a more precise answer. I have no particular experience with Sybase, but I have gained experience working with tons of data on a fairly small SQL Server machine. From that experience, I learned that when you work with a large amount of data and your server doesn't have enough memory to handle it, you run into bottlenecks (I guess it takes time to write the temporary results to disk). I think that's your case (60 million rows), but once again, I don't know Sybase, and it depends on many factors, such as how many columns mytable has, how much RAM your server has, etc.
Here are the results of a small experiment I just ran:
I ran the following two queries on SQL Server and PostgreSQL.
Query 1:
SELECT id, max(version)
FROM mytable
GROUP BY id
Query 2:
SELECT id, version
FROM
(
SELECT id, version, ROW_NUMBER() OVER (PARTITION BY id ORDER BY version DESC) as RN
FROM mytable
) q
WHERE q.rn = 1
On PostgreSQL, mytable has 2,878,441 rows.
Query #1 takes 31.458 sec and returns 1,200,146 rows.
Query #2 takes 41.787 sec and returns 1,200,146 rows.
On SQL Server, mytable has 1,600,010 rows.
Query #1 takes 6 sec and returns 537,232 rows.
Query #2 takes 10 sec and returns 537,232 rows.
So far, your query is always faster. So I tried with bigger tables.
On PostgreSQL, mytable now has 5,875,134 rows.
Query #1 takes 100.915 sec and returns 2,796,800 rows.
Query #2 takes 98.805 sec and returns 2,796,800 rows.
On SQL Server, mytable now has 11,712,606 rows.
Query #1 takes 28 min 28 sec and returns 6,262,778 rows.
Query #2 takes 2 min 39 sec and returns 6,262,778 rows.
Now we can make an assumption. In the first part of the experiment, both servers have enough memory to handle the data, so GROUP BY is faster. The second part of the experiment suggests that too much data kills GROUP BY performance; using ROW_NUMBER() seems to avoid that bottleneck.
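One way to check the spill hypothesis on the PostgreSQL side (a sketch; EXPLAIN with the ANALYZE and BUFFERS options is available on 9.3):

EXPLAIN (ANALYZE, BUFFERS)
SELECT id, max(version)
FROM mytable
GROUP BY id;
-- Look for a "HashAggregate" node (in memory) versus a Sort node
-- reporting "Sort Method: external merge  Disk: ..." feeding a GroupAggregate.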
Caveats: I don't have a bigger table on PostgreSQL, nor do I have a Sybase server at hand.
For this experiment I was using PostgreSQL 9.3.5 on x86_64 and SQL Server 2012 (11.0.2100.60, X64).
Maybe this experiment will help you, Nicolas.
So in the end, the nonclustered index on (id, version desc) did the trick without any change to the query. Index creation also takes one hour, but then the query responds in a few seconds. I guess that's still better than maintaining another table that could cause data integrity issues.
The max() function does not help the optimizer use the index.
Perhaps you should create a function-based index on max(version):
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc32300.1550/html/sqlug/CHDDHJIB.htm
My table has only 2 columns.
Select * from table              -- execution time 10 seconds
Select name, mobile from table   -- execution time 5 seconds
Select * and selecting explicit columns make no difference when you are scanning the whole table.
The differences below can crop up when you select unnecessary columns (see the sketch after this list):
1. A scan of the whole table rather than a narrower index, because the row-lookup cost exceeds the optimizer's threshold
2. More data sent over the network
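A minimal sketch of the first point (SQL Server; the contacts table and index name are assumptions): with a narrow index covering (name, mobile), the two-column query can be answered from the index alone, while select * must read the full-width table:

create nonclustered index ix_contacts_name_mobile on dbo.contacts (name, mobile);

select name, mobile from dbo.contacts;  -- satisfied by the narrow covering index
select * from dbo.contacts;             -- must scan the clustered index or heap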
In your case the difference could be due to locking, blocking, memory pressure... a whole lot of reasons, but not due to expanding all columns vs *.
See Conor Cunningham's post on the same topic.
Did you run select *, then select name, mobile right after?
If so, then it's warm caching: your data is already in memory, so the second query runs faster.
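One way to rule caching in or out (a sketch; SQL Server only, requires sysadmin, and only safe on a non-production box) is to flush the buffer pool between the two runs so both start cold:

checkpoint;              -- write dirty pages to disk first
dbcc dropcleanbuffers;   -- empty the buffer pool; the next read comes from disk
select name, mobile from table;  -- now timed against a cold cache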