I have a problem with Apache Derby. I imported the data from www.geonames.org and want to get the DISTINCT names.
The query SELECT name FROM GEONAME returns the results instantly.
The query SELECT DISTINCT name FROM GEONAME takes almost 20 minutes to complete.
How can I speed this up? There is already an index on the NAME column (CREATE INDEX GEONAME_NAME_index ON GEONAME(NAME)).
Related
I have the following query structure
CREATE TABLE <Table Name> AS
(
SELECT .... FROM ...
)
When I run the SELECT statement on its own, it compiles and returns the results within seconds. However, when I run it as part of the CREATE TABLE statement it takes hours, to the point where I believe it has hung and will never complete.
What is the reason for this, and what could a workaround be?
Oracle Database 12c (12.1.0.2.0)
If you ran that SELECT in some GUI, note that most (if not all) of them return only the first few hundred rows, not the whole result set. For example: if your query really returns 20 million rows, the GUI displays the first 50 (or 500, depending on the tool you use) rows, which is confusing - just as it confused you.
If you used the current query as an inline view, e.g.
select count(*)
from
(select ... from ...) --> this is your current query
it would "force" Oracle to fetch all rows, so you'd see how long it actually takes.
Apart from that, see if the SELECT itself can be optimized, e.g.
see whether the columns used in the WHERE clause are indexed
collect statistics for all involved tables (those used in the FROM clause)
remove the ORDER BY clause (if there is any; it is irrelevant in a CTAS operation)
check the explain plan
Performance tuning covers far more than what I've suggested; those are just a few things you might want to look at (a sketch of some of them follows).
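For illustration, a minimal sketch of the index, statistics, and explain-plan checks; the table, column, and bind-variable names here are placeholders, not taken from the original query:
-- index a column used in the WHERE clause
CREATE INDEX my_table_col_idx ON my_table (my_column);
-- collect statistics for a table used in the FROM clause
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'MY_TABLE');
-- check the explain plan of the SELECT
EXPLAIN PLAN FOR SELECT * FROM my_table WHERE my_column = :val;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);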
Have you tried a direct-load insert, by first creating the table using CTAS with WHERE 1 = 2 and then doing the insert? That will at least tell us whether something is wrong in the data (corrupt data) or whether it is a performance issue.
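A minimal sketch of that approach, reusing the placeholders from the question; the APPEND hint requests a direct-path (direct load) insert:
-- create an empty copy: WHERE 1 = 2 copies the structure but no rows
CREATE TABLE <Table Name> AS
SELECT .... FROM ... WHERE 1 = 2;
-- then load it with a direct-path insert
INSERT /*+ APPEND */ INTO <Table Name>
SELECT .... FROM ...;
COMMIT;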
I had the same problem before, since the new data was very large (7 million rows), and it took me 3 hours to execute the code.
My best suggestion is to create a view instead of a new table, since it takes less space.
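A minimal sketch, reusing the placeholders from the question; a view stores only its defining query, not the rows:
CREATE VIEW <View Name> AS
SELECT .... FROM ...;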
So, the answer to this one:
CREATE TABLE <Table Name> AS
(
SELECT foo
FROM baa
LEFT JOIN
( SELECT foo FROM baa WHERE DATES BETWEEN SYSDATE AND SYSDATE - 100 ) sub ON ...
WHERE DATES_1 BETWEEN SYSDATE - 10 AND SYSDATE - 100
)
The problem was that the BETWEEN clauses did not cover the same time period, so the subquery was looking at more data than the main query (I guess this was causing a full scan over the tables?).
The query below has matching BETWEEN time periods, and it returned the results in less than 3 minutes.
CREATE TABLE <Table Name> AS
(
SELECT foo FROM baa
LEFT JOIN ( SELECT foo FROM baa WHERE DATES BETWEEN SYSDATE - 10 AND SYSDATE - 100 ) sub ON ...
WHERE DATES_1 BETWEEN SYSDATE - 10 AND SYSDATE - 100
)
I have a view which looks like this:
CREATE VIEW My_View AS
SELECT * FROM My_Table
UNION
SELECT * FROM My_External_Table
What I have found is that performance is very slow when ordering the data, which I need to do for pagination. For example, the following query takes almost 2 minutes despite returning only 20 rows:
SELECT * FROM My_View
ORDER BY My_Column
OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY
In contrast, the following (useless) query takes less than 2 seconds:
SELECT * FROM My_View
ORDER BY GETDATE()
OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY
I cannot add indexes to the view as it is not SCHEMABOUND, and I cannot make it SCHEMABOUND as it references an external table.
Is there any way I can improve the performance of the query or otherwise get the desired result? All the databases involved are Azure SQL.
If all rows are unique across My_Table and My_External_Table, using UNION ALL instead of UNION would help you improve the performance, since UNION ALL skips the sort that UNION performs to remove duplicates.
Adding an index to the underlying table would also help your query run faster.
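For example, a minimal sketch of the view rewritten with UNION ALL, assuming duplicates cannot occur across the two tables:
CREATE VIEW My_View AS
SELECT * FROM My_Table
UNION ALL
SELECT * FROM My_External_Table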
You can't really get around the ORDER BY, so I don't think there is anything you can do with the query as written.
I'm a bit surprised the ORDER BY GETDATE() works, because ordering by a constant does not usually work. I imagine it is equivalent to ORDER BY (SELECT NULL), so no actual ordering takes place, which is why it is fast.
My recommendation? You probably need to replicate the external table on the local system and have a process to create a new local table. That sounds complicated, but you may be able to do it using a materialized view. Whether this works with the "external" table depends on what you mean by "external".
Note that you will also want an index on My_Column to avoid the sort.
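A minimal sketch of that index, assuming the external rows have been replicated into a hypothetical local table My_Local_Table:
-- My_Local_Table is a placeholder for the replicated copy
CREATE INDEX IX_My_Local_Table_My_Column ON My_Local_Table (My_Column);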
I am running a query which selects data on the basis of joins between 6-7 tables. When I execute the query it takes 3-4 seconds to complete, but when I put a WHERE clause on the fetched data it takes more than a minute to execute. My query fetches a large amount of data so I can't write it here, but the situation I faced is explained below:
Select Category,x,y,z
from
(
---Sample Query
) as a
it only takes 3-4 seconds to execute. But
Select Category,x,y,z
from
(
---Sample Query
) as a
where category Like 'Spart%'
takes more than 2-3 minutes to execute.
Why does it take more time when I use the WHERE clause?
It's impossible to say exactly what the issue is without seeing the full query. It is likely that the optimiser is pushing the WHERE into the "Sample query" in a way that is not performant. It could possibly be resolved by updating statistics on the tables, but an easier option would be to insert the whole result into a temporary table and filter from there.
SELECT Category, x, y, z
INTO #temp
FROM
(
---Sample Query
) AS a

SELECT * FROM #temp WHERE category LIKE 'Spart%'
This will force the optimiser to tackle it in the logical order of pulling your data together before applying the WHERE to the end result. You might also like to index the temp table's category field, as shown below.
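A minimal sketch of that index on the temp table (the index name is arbitrary):
CREATE NONCLUSTERED INDEX IX_temp_Category ON #temp (Category);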
If you're using MS SQL, check the actual execution plan in Management Studio - it may already suggest an index to create.
In any case, you should add the column "Category" to the index used by the query.
If you don't have an index on that table, create one composed of the column "Category" and all the other columns used in joins or the WHERE clause.
Bear in mind that with a LIKE 'text%' clause you could end up with an index scan and not an index seek.
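A minimal sketch of such a covering index; the table and index names are placeholders, and x, y, z stand in for the other columns the query selects:
-- Category is the key column; x, y, z are included so the index covers the query
CREATE NONCLUSTERED INDEX IX_SampleTable_Category
ON SampleTable (Category)
INCLUDE (x, y, z);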
I need advice on how to get the fastest results when querying a big table.
I am using SQL Server 2012, and my situation is like this:
I have 5 tables containing transaction records; each table has 35 million records.
All tables have 14 columns; the columns I need to search are GroupName, CustomerName, and NoRegistration. And I have a view that unions all 5 of these tables.
The GroupName, CustomerName, and NoRegistration values are not unique in each table.
My application has a function to search on these columns.
The query is like this:
Search by Group Name:
SELECT DISTINCT(GroupName) FROM TransactionRecords_view WHERE GroupName LIKE ''+@GroupName+'%'
Search by Name:
SELECT DISTINCT(CustomerName) AS 'CustomerName' FROM TransactionRecords_view WHERE CustomerName LIKE ''+@Name+'%'
Search by NoRegistration:
SELECT DISTINCT(NoRegistration) FROM TransactionRecords_view WHERE LOWER(NoRegistration) LIKE LOWER(@NoRegistration)+'%'
My question is: how can I achieve the fastest execution time for searching?
As things stand, every search takes 3 to 5 minutes.
My idea is to make new tables containing the distinct GroupName, CustomerName, and NoRegistration values from all 5 tables.
Would that make execution faster, or is there a better idea?
Thank you
EDIT:
This is the query for the view "TransactionRecords_view":
CREATE VIEW TransactionRecords_view
AS
SELECT * FROM TransactionRecords_1507
UNION ALL
SELECT * FROM TransactionRecords_1506
UNION ALL
SELECT * FROM TransactionRecords_1505
UNION ALL
SELECT * FROM TransactionRecords_1504
UNION ALL
SELECT * FROM TransactionRecords_1503
You must show the SQL of TransactionRecords_view. Do you have indexes? What is the collation of the NoRegistration column? Paste the actual execution plan for each query.
OK, so you don't need to make those new tables. If you create non-clustered indexes on these fields, that will (in effect) do what you're after. An index only stores data for the columns you indicate, not the whole table. Be aware, however, that indexes are excellent for SELECT statements but will negatively affect any write statements (INSERT, UPDATE etc.).
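A minimal sketch for one of the five tables; the index names are made up, and the same statements would be repeated for TransactionRecords_1506 through TransactionRecords_1503:
-- one non-clustered index per search column
CREATE NONCLUSTERED INDEX IX_TR1507_GroupName ON TransactionRecords_1507 (GroupName);
CREATE NONCLUSTERED INDEX IX_TR1507_CustomerName ON TransactionRecords_1507 (CustomerName);
CREATE NONCLUSTERED INDEX IX_TR1507_NoRegistration ON TransactionRecords_1507 (NoRegistration);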
Next, you want to run the queries with the actual execution plan switched on. This will show you how the optimizer has decided to run each query behind the scenes. Are there any particular issues; are any of the steps taking up a lot of the overall operator cost? There are plenty of great instructional videos about execution plans on YouTube; check them out if you haven't looked at execution plans before.
Did you check for missing indexes in the actual execution plan?
Moreover, since you are filtering varchar columns with LIKE, Full-Text Search may be useful for you (a sketch follows the link):
https://msdn.microsoft.com/en-us/library/ms142571(v=sql.120).aspx
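A minimal sketch of what that could look like, assuming a unique key index named PK_TransactionRecords_1507 already exists (the catalog and index names here are made up):
CREATE FULLTEXT CATALOG TransactionRecordsCatalog;
CREATE FULLTEXT INDEX ON TransactionRecords_1507 (CustomerName)
    KEY INDEX PK_TransactionRecords_1507
    ON TransactionRecordsCatalog;
-- prefix search via full-text instead of LIKE
SELECT DISTINCT CustomerName FROM TransactionRecords_1507 WHERE CONTAINS(CustomerName, '"Jo*"');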
I have an SQL query as simple as:
select * from recent_cases where user_id=1000000 and case_id=10095;
It takes up to 0.4 seconds to execute in Oracle, and when I do 20 requests in a row, it takes more than 10 seconds.
The table 'recent_cases' has 4 columns: ID, USER_ID, CASE_ID and VISITED_DATE. Currently there are only 38 records in this table.
Also, there are 3 indexes on this table: on ID column, on USER_ID column, and on (USER_ID, CASE_ID) columns pair.
Any ideas?
One theory: the table has a very large data segment with the high water mark near the end, but the statistics are not prompting the optimiser to use an index. Therefore you're getting a slow full table scan. You could ALTER TABLE ... MOVE and rebuild the indexes to fix such a problem, or COALESCE it.
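A minimal sketch of that fix; the index names are not given in the question, so look them up first:
ALTER TABLE recent_cases MOVE;
-- moving the table invalidates its indexes, so find and rebuild each one
SELECT index_name FROM user_indexes WHERE table_name = 'RECENT_CASES';
ALTER INDEX your_index_name REBUILD; -- repeat for each index returned above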
Oracle databases have a command called ANALYZE TABLE, which gathers statistics for the optimiser. Fresh statistics can speed up SELECT statements a lot, even if there are just a few rows in the table.
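A minimal sketch; DBMS_STATS is the currently recommended interface, and ANALYZE is the older form:
EXEC DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'RECENT_CASES');
-- or, on older versions:
ANALYZE TABLE recent_cases COMPUTE STATISTICS;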
Here are some links which might help you:
http://www.dba-oracle.com/t_oracle_analyze_table.htm
http://docs.oracle.com/cd/B28359_01/server.111/b28310/general002.htm