Create an index for lowercase a to z characters - SQL

I have a database column that only stores lowercase a to z characters and the space character. How can we create an index with more specific expressions?
We need this specific index to fix an 'ORDER BY' clause performance issue.
The performance problem is that an 'ORDER BY' on a column of a table with a large number of rows is slow. If the ORDER BY column is a date or an integer it is fast, but not for a varchar column.
We want to make the query faster by adding a specific index to the varchar column, or by making another decision.

First, I suggest taking a look at this page: http://www.postgresql.org/docs/9.4/static/indexes-expressional.html
Second, is the ORDER BY on the field itself, or on an expression? I mean, if your ORDER BY is:
ORDER BY col1
You just need an index like:
CREATE INDEX idx_table_col1 ON yourtable (col1);
If your ORDER BY is:
ORDER BY lower(col1)
Then:
CREATE INDEX idx_lower_col1_table ON yourtable (lower(col1));
Anyway, to improve your question, I suggest that you:
Show your query
Show the execution plan retrieved with EXPLAIN ANALYZE
Show your table and indexes
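For example, a quick way to check whether the planner actually uses such an index is to compare the plan before and after creating it. This is just a sketch; the table and column names are placeholders, not from the original question:
CREATE INDEX idx_yourtable_col1 ON yourtable (col1);
ANALYZE yourtable;
-- If the index can satisfy the ORDER BY, the plan should show an Index Scan
-- (or Index Only Scan) instead of a Seq Scan followed by a Sort step.
EXPLAIN ANALYZE SELECT col1 FROM yourtable ORDER BY col1 LIMIT 100;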

Related

Fastest PostgreSQL query for field length?

If I need to find the maximum length of a few fields stored as numeric (i.e. variable length number) in postgresql so my team can build a fixed width file layout and the length isn't in the metadata, is there a faster way to get that info than either
select field
from table
where field is not null
order by field desc
limit 1;
or
select max(field)
from table;
?
The tables these fields are in have tens of millions of rows, so these queries are taking quite a while. I'm a decent PostgreSQL user, but optimizing for efficiency has never been my strong suit - I don't usually work with such large datasets. Any help is appreciated, even if this is a dumb question!
Your queries look fine. The WHERE clause is not needed in the first query; it can be written as:
select myfield from mytable order by myfield desc nulls last limit 1;
Then, for performance, consider the following index:
create index myidx on mytable(myfield desc nulls last);
Actually Postgres should be able to read the index backwards, so this should be just as good:
create index myidx on mytable(myfield);
With any of these indexes in place, the database should be able to execute the whole query by looking at the index only, which should be very efficient.
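As a quick sanity check, here is one way to confirm the index-only plan, using the same hypothetical names as above (note that index-only scans also depend on the visibility map being reasonably up to date):
create index myidx on mytable (myfield);
vacuum analyze mytable;  -- refresh statistics and the visibility map
-- the plan should mention an Index Only Scan (Backward) using myidx
-- rather than a sequential scan over the whole table
explain (analyze, buffers) select max(myfield) from mytable;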

Does an index help SQL SELECT sorting performance?

I have a SQL statement like this:
SELECT * FROM "table1" WHERE "id" In('1', '2', '3') ORDER BY "createdAt"
I think the benefit of indexing the 'createdAt' column is quite minimal, since it does the select first and then sorts just 3 rows. Am I correct? Or is it better to add an index?
There are two possible indexing strategies for the query you show:
Index the IN condition:
CREATE INDEX ON table1 (id);
That is a good idea if the condition is selective, that is, if few table rows match the condition.
Index the ORDER BY clause:
CREATE INDEX ON table1 ("createdAt");
Then the database can scan the index to get the result rows in ORDER BY order without an explicit sort.
This will only be beneficial if the IN condition is not selective, that is, most table rows meet the condition.
Still, depending on the row size and other parameters, PostgreSQL may choose to use a sequential scan and an explicit sort unless you limit the number of result rows with a LIMIT clause.
Unfortunately it is not possible to have an index support both the IN condition and the ORDER BY – that would only be possible if the WHERE condition were a plain equality comparison.
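To illustrate that last point, here is a sketch of the equality case; the combined index below is an assumption, not something from the original post:
CREATE INDEX ON table1 (id, "createdAt");
-- With a single equality instead of IN, one index can serve both the filter
-- and the ordering: the matching rows are read in "createdAt" order directly.
SELECT * FROM table1 WHERE id = '1' ORDER BY "createdAt";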

Select data with a dynamic WHERE clause on a non-indexed column

I have a table with 30 columns and millions of entries.
I want to execute a stored procedure on this table to search data.
The search criteria are passed in a parameter to this SP.
If I search data with a dynamic WHERE clause on a non-indexed column, it takes a lot of time.
Below is an example :
Select counterparty_name from counterparty where counterparty_name = 'test'
In this example the counterparty is in row number 5,000,000.
As explained, I can't create an index on this table.
I would like to know if this processing time is normal.
I would also like to know if there is any recommendation that could improve the execution time.
Best regards.
If you do not have an index on the column, then it will have to scan the clustered index (or maybe a smaller index which happens to include that column) in order to look for the data. As such it is going to take a long time.
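For reference, the kind of index the answer is talking about would look like the statement below. This is only a sketch, since the question states that adding an index is not an option:
-- A plain index on the search column lets the engine seek directly to the
-- matching rows instead of scanning the whole clustered index.
CREATE INDEX idx_counterparty_name ON counterparty (counterparty_name);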

Two questions on PostgreSQL performance

1) What is the best way to implement paging in PostgreSQL?
Assume we need to implement paging. The simplest query is select * from MY_TABLE order by date_field DESC limit 10 offset 20. As far as I understand, we have 2 problems here: if the dates have duplicate values, every run of this query may return different results, and the larger the offset value, the longer the query runs. We have to provide an additional column, date_field_index:
--date_field--date_field_index--
12-01-2012 1
12-01-2012 2
14-01-2012 1
16-01-2012 1
--------------------------------
Now we can write something like
create index MY_INDEX on MY_TABLE (date_field, date_field_index);
select * from MY_TABLE where date_field <= last_page_date and not (date_field_index >= last_page_date_index and date_field = last_page_date) order by date_field DESC, date_field_index DESC limit 20;
..thus using the WHERE clause and the corresponding index instead of OFFSET. OK, now the questions:
1) Is this the best way to improve the initial query?
2) How can we populate that date_field_index field? Do we have to provide some trigger for this?
3) We should not use the row_number() function in Postgres because it does not use indexes and is thus very slow. Is that correct?
2) Why does column order in a concatenated index not affect query performance?
My measurements show that when searching using a concatenated index (an index consisting of 2 or more columns) there is no difference whether we place the most selective column first or at the end. Why? If we place the most selective column first, we run through a shorter range of the matching rows, which should have an impact on performance. Am I right?
Use the primary key to break ties instead of the date_field_index column. Otherwise, explain why that is not an option.
order by date_field DESC, "primary_key_column(s)" DESC
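A sketch of that approach, assuming an integer primary key column named id (the column name and the parameter placeholders are assumptions, not part of the original question):
create index MY_KEYSET_INDEX on MY_TABLE (date_field, id);
-- "Keyset" paging: remember the (date_field, id) pair of the last row of the
-- previous page and continue from there instead of using OFFSET.
select * from MY_TABLE
where (date_field, id) < (:last_page_date, :last_page_id)
order by date_field DESC, id DESC limit 10;
-- PostgreSQL can read the two-column index backwards for this ORDER BY,
-- so it does not have to scan and skip the rows of all earlier pages.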
The combined index with the most unique column first is the best performer, but it will not be used if:
the distinct values are more than a few percent of the table
there aren't enough rows to make it worthwhile
the range of dates is not small enough
What is the output of explain my_query?

Adding fields to optimize MySQL queries

I have a MySQL table with 3 fields:
Location
Variable
Value
I frequently use the following query:
SELECT *
FROM Table
WHERE Location = '$Location'
AND Variable = '$Variable'
ORDER BY Location, Variable
I have over a million rows in my table and queries are somewhat slow. Would it increase query speed if I added a field VariableLocation, which is the Variable and the Location combined? I would be able to change the query to:
SELECT *
FROM Table
WHERE VariableLocation = '$Location$Variable'
ORDER BY VariableLocation
I would add a covering index for the columns location and variable:
ALTER TABLE tablename
ADD INDEX (variable, location);
...though if the variable & location pairs are unique, they should be the primary key.
Combining the columns will likely cause more grief than it's worth. For example, if you need to pull out records by location or variable only, you'd have to substring the values in a subquery.
Try adding an index which covers the two fields. You should then still get a performance boost, but you also keep your data understandable, because combining the two columns would make it look as if they belong together when you are really only doing it for performance.
I would advise against combining the fields. Instead, create an index that covers both fields in the same order as your ORDER BY clause:
ALTER TABLE tablename ADD INDEX (location, variable);
Combined indices and keys are only used in queries that involve all fields of the index, or a subset of these fields read from left to right. In other words: if you use location in a WHERE condition, this index would be used, but ordering by variable alone would not use the index.
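To make the left-to-right rule concrete, here is a small sketch using a hypothetical table name t (not from the question):
ALTER TABLE t ADD INDEX idx_loc_var (location, variable);
-- Can use the index: filters on the leftmost column.
SELECT * FROM t WHERE location = 'x';
-- Can use the index: filters on the leftmost column and orders by the second.
SELECT * FROM t WHERE location = 'x' ORDER BY variable;
-- Normally cannot use this index for the filter: variable is not a leftmost prefix.
SELECT * FROM t WHERE variable = 'y';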
When trying to optimize queries, the EXPLAIN command is quite helpful: see EXPLAIN in the MySQL docs.
Correction update, courtesy of #paxdiablo:
A column in the table will make no difference. All you need is an index over both columns and the MySQL engine will use that. Adding a column in the table is actually worse than that since it breaks 3NF and wastes space. See http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html which states: SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2; If a multiple-column index exists on col1 and col2, the appropriate rows can be fetched directly.