I have a table with the format below. I also know the most commonly used SQL query on it, so my question is how to create an index on the table so that this query performs as well as possible. By the way, my database is Sybase ASE 12.5.
Table t:
bu, name, date, score_a, score_b
SQL:
SELECT bu, name, max(score_a), max(score_b)
FROM
t
WHERE date > '20110101' AND date < '20110901'
GROUP BY bu, name
Thanks for any suggestions.
Basically, you need to add indexes on the fields used by the WHERE and GROUP BY clauses, so I'd go with date, bu and name. How to create an index:
CREATE INDEX index_name ON table_name (column_name);
In your case:
CREATE INDEX idate ON t (date);
The index on date suggested by Matino will make sure Sybase only hits rows contributing to the result.
As all fields from each row are used in the query, other indexes won't help.
The only way to speed up the query further would be to include all columns in the date index, so the query can be answered from the index alone. But that would normally be overkill!
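A minimal sketch of that "all columns in the date index" variant, assuming the table and column names from the question. The index name idate_cover is made up for illustration. Leading with date lets the range predicate use the index, and the trailing columns mean the query should not need to touch the data pages at all:

```sql
-- Hypothetical covering index: date first for the WHERE range,
-- then every other column the query reads.
CREATE INDEX idate_cover ON t (date, bu, name, score_a, score_b)
```

The trade-off is that the index is about as large as the table itself and has to be maintained on every insert and update, which is why it is usually overkill.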
When I run this query in Oracle it takes 0.04054 s, while the same query in PostgreSQL takes 49.8 min. How can I change the query to improve performance in PostgreSQL?
SELECT
"ID","IMAGE","TITLE","SERVICE_DESC"
,"STATUS", "ACTION","REMOVAL_TEXT","SERVICE_PROVIDER"
, "SERVICE_PROVIDER_NAME"
FROM (
SELECT DISTINCT "ID","IMAGE"
,"TITLE", "SERVICE_DESC"
, COALESCE("STATUS",'N') as "STATUS"
,"ACTION","REMOVAL_TEXT","CREATED_DT"
,"SERVICE_PROVIDER", "SERVICE_PROVIDER_NAME"
FROM MZP_ADP.ALL_SERVICE_DETAILS
WHERE "ZIP_CODE"='55005' AND "MAKE_LIVE" = 'Y'
AND "LOCATION_ID" = '2407605'
AND "END_DATE" > CURRENT_TIMESTAMP(0)::TIMESTAMP WITHOUT TIME ZONE
AND "IS_ACTIVE" = 'Y' order by "CREATED_DT" desc
) alias;
There can be a lot of problems (row counts, hardware, missing indexes).
First, what is the row count of the table?
Have you inserted a lot of rows recently?
(If so, REINDEX TABLE table_name and VACUUM ANALYZE table_name can help.)
Check the indexes on these columns:
LOCATION_ID
ZIP_CODE
CREATED_DT
END_DATE
Why is the select wrapped in a subselect?
Please eliminate it.
Can you eliminate the DISTINCT with an additional WHERE clause?
Please share the plans and row counts, then we can give more details.
EXPLAIN ANALYZE SELECT..
You can try:
Create this index
create index ALL_SERVICE_DETAILS_CMP_INDEX on MZP_ADP.ALL_SERVICE_DETAILS ("ZIP_CODE", "MAKE_LIVE", "LOCATION_ID", "END_DATE", "IS_ACTIVE");
Remove parent select
Remove distinct (if there is at least one unique column in the select)
Apply a few things to boost performance:
VACUUM FULL the tables (it also rebuilds indexes). If in any doubt, rebuild the indexes explicitly as well
VACUUM (FULL, ANALYZE) table_name;
REINDEX TABLE table_name;
Increase work_mem and maintenance_work_mem as per your memory and server configuration
Use GROUP BY instead of DISTINCT (DISTINCT can be slower)
Remove the ORDER BY inside the subquery. If it is needed, use it outside
Create a composite index on the columns ZIP_CODE, LOCATION_ID, END_DATE and use proper ordering in the WHERE clause (As MAKE_LIVE and IS_ACTIVE are flag type so need to add first in index)
Run EXPLAIN ANALYZE on the query to check the execution time and whether the proper index is used
Pseudocode:
SELECT columns
FROM (SELECT columns
FROM table
WHERE searching columns as per index creation
GROUP BY WITHOUT aggregated COLUMNS) t
ORDER BY columns -- if needed
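Applied to the query from the question, the suggestions above could look roughly like the sketch below. It assumes that "ID" is unique, so the DISTINCT and the wrapping subselect can simply be dropped; if that does not hold, the result may contain duplicates the original query would have removed. The index name is made up for illustration.

```sql
-- Hypothetical index name. Equality columns first, the range column last.
CREATE INDEX all_service_details_zip_loc_end_idx
    ON MZP_ADP.ALL_SERVICE_DETAILS ("ZIP_CODE", "LOCATION_ID", "END_DATE");

-- Single-level query; EXPLAIN ANALYZE shows the timing and whether the index is used.
EXPLAIN ANALYZE
SELECT "ID", "IMAGE", "TITLE", "SERVICE_DESC",
       COALESCE("STATUS", 'N') AS "STATUS",
       "ACTION", "REMOVAL_TEXT", "SERVICE_PROVIDER", "SERVICE_PROVIDER_NAME"
FROM MZP_ADP.ALL_SERVICE_DETAILS
WHERE "ZIP_CODE" = '55005'
  AND "LOCATION_ID" = '2407605'
  AND "MAKE_LIVE" = 'Y'
  AND "IS_ACTIVE" = 'Y'
  AND "END_DATE" > CURRENT_TIMESTAMP(0)::TIMESTAMP WITHOUT TIME ZONE
ORDER BY "CREATED_DT" DESC;
```

Note that EXPLAIN ANALYZE actually executes the statement, so on a 49.8 min query it is worth creating the index first.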
I have a table with many rows (up to 1,000,000) and 300 columns.
Question: I would like to know the fastest way to get the max date of the column 'LastModifiedDate' in the table.
In my table I have the columns Id, LastModifiedDate and other columns...
Thanks in advance for your help.
If newly inserted records always carry the latest LastModifiedDate, and the LastModifiedDate column does not change after being written to the table,
I think you can create a clustered index on the LastModifiedDate column. In a clustered table, the SQL Server clustered index stores the data rows sorted by the clustered index key values. You can refer to the documentation "Designing effective SQL Server clustered indexes".
Put an index on the column and use:
select max(LastModifiedDate)
from t;
If you want the complete row with the largest date:
select top (1) t.*
from t
order by LastModifiedDate desc;
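"Put an index on the column" could be sketched as follows, assuming the table is called t as in the queries above. The index name is made up for illustration. With such an index in place, SQL Server should be able to answer MAX() from one end of the index instead of scanning all 300 columns of every row:

```sql
-- Hypothetical index name; t and LastModifiedDate are taken from the question.
CREATE NONCLUSTERED INDEX IX_t_LastModifiedDate
    ON t (LastModifiedDate);

SELECT MAX(LastModifiedDate) AS MaxLastModifiedDate
FROM t;
```

The TOP (1) ... ORDER BY variant can use the same index to locate the row, but it still needs a lookup into the table to fetch the remaining columns.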
I have never created an index before, but I'm thinking it may help here. I have a SAS dataset of approx. 7 million records. It is a listing of employee entries along with their respective timestamps. I am identifying whether there are any subsequent entries by the same user on the same day and then noting the timestamp.
The data set (Entries) has 3 columns: Storage_ID, User_ID and EventTimestamp.
I'm thinking maybe an index on Storage_ID and User_ID would help speed things along.
If it would help, how and where would I go about creating the index?
PROC SQL;
CREATE TABLE sub_ENTRIES AS
SELECT A.*,
(SELECT
MIN(B.EVENTTIMESTAMP)
FROM
ENTRIES B
WHERE
A.STORAGE_ID=B.STORAGE_ID
AND A.USER_ID=B.USER_ID
AND DATEPART(A.EVENTTIMESTAMP)=DATEPART(B.EVENTTIMESTAMP)
AND B.EVENTTIMESTAMP > A.EVENTTIMESTAMP
) AS NEXT_ACCESS FORMAT=DATETIME27.6
FROM
ENTRIES A
;
You can create a composite index (two or more columns) using SQL.
For example:
Proc SQL;
create index STORAGE_USER on ENTRIES (storage_id, user_id);
quit;
The general syntax for an index key of n columns is:
create index <index-name>
on <table-name>
( <column-name-1>,
<column-name-2>,
…
<column-name-<n>>
)
The index is most effective when the query's selection or join criteria involve all the columns of the composite key. Use OPTIONS MSGLEVEL=I to have SAS log index usage.
I have a table in SQL Server 2008 with a lot of data:
|ID|Name|Column_1|Column_2|
|..|....|........|........|
It has more than 18,000 records. I need to get the row with the lowest value of Column_1, which is a date but could be any data type (the data is unsorted), so I use this statement:
SELECT TOP 1 ID, Name from table ORDER BY Column_1 ASC
But this is very, very slow, and I think I shouldn't need to sort the whole table. My question is how to get the same data without using TOP 1 and ORDER BY.
I cannot see why 18,000 rows of information would cause too much of a slow down, but that is obviously without seeing what the data is you are storing.
If you are regularly going to be using the Column_1 field, then I would suggest you place a non-clustered index on it... that will speed up your query.
You can do it by "designing" your table via Sql Server Management Studio, or directly via TSQL...
CREATE INDEX IX_myTable_Column_1 ON myTable (Column_1 ASC)
More information on MSDN about creating indexes here
Update thanks to comments by #GarethD who helped me with this, as I wasn't actually aware of it.
As an extension of the above TSQL statement, it will speed up your queries if the index also includes the other columns the query uses:
CREATE INDEX IX_myTable_Column_1 ON myTable (Column_1 ASC) INCLUDE (ID, Name)
As GarethD points out, using this SQLFiddle as proof, the execution plan is much quicker as it avoids a "RID" (or Row Identifier) lookup.
More information on MSDN about creating indexes with include columns here
Thank you #GarethD
Would this work faster? When I read this question, this was the code that came to mind:
Select top 1 ID, Name
from table
where Column_1 = (Select min(Column_1) from table)
I am a beginner with indexes. I want to create an index for this SQL expression, which takes too much time to execute, so I would like to know on exactly which columns I should create the index.
I am using DB2, but never mind, I think the question is very general.
My SQL expression is:
select * from incident where (relatedtoglobal=1)
and globalticketid in (select ticketid from INCIDENT where status='RESOLVED')
and statusdate <='2012-10-09 12:12:12'
Should I create one index on these 5 columns, or what?
Thanks
Your query:
select *
from incident
where relatedtoglobal = 1
and globalticketid in ( select ticketid
from INCIDENT
where status='RESOLVED'
)
and statusdate <='2012-10-09 12:12:12' ;
And the subquery inside:
select ticketid
from INCIDENT
where status='RESOLVED'
An index on (status, ticketid) will certainly help efficiency of the subquery evaluation and thus of the query.
For the outer query, besides the previous index, you'll need one more index. One on (relatedtoglobal, globalticketid) may be sufficient.
I'm not sure whether more complex indexing would or could be used by the DB2 engine, such as:
one on (relatedtoglobal, globalticketid) INCLUDE (statusdate), or
two indices, one on (relatedtoglobal, globalticketid) and one on (relatedtoglobal, statusdate).
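As a rough sketch, the two basic indexes suggested above could be created like this. The index names are made up for illustration. The INCLUDE variant is left out on purpose: as far as I know, DB2 only accepts INCLUDE on unique indexes, so check the CREATE INDEX documentation before relying on it.

```sql
-- Hypothetical index names; table and columns are taken from the question.

-- Supports the subquery: WHERE status = 'RESOLVED', returning ticketid.
CREATE INDEX ix_incident_status_ticket
    ON incident (status, ticketid);

-- Supports the outer query: relatedtoglobal = 1 and the globalticketid IN (...) match.
CREATE INDEX ix_incident_global_ticket
    ON incident (relatedtoglobal, globalticketid);
```

After creating them, refresh the table statistics and compare the access plan before and after to confirm the optimizer actually picks them up.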
The DB2 documentation is not an easy read but has many details. Start with CREATE INDEX statement and Implementing Indexes.