Improving WHERE IN query in PostgreSQL - sql

Is there any idea for improving my PostgreSQL query, which looks like this:
Select * from employees where employe_id in (?)
Is there any replacement for the query, or something that can drastically improve it, like partitioning, indexing, etc.?
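A minimal sketch of the usual first steps, assuming the employees table and employe_id column from the question (the index name and the VALUES list are hypothetical): index the filtered column, and for long lists consider = ANY with an array parameter or a join against a VALUES list, which the planner often handles better than a very long IN list.

-- Sketch: index the filtered column first (index name is hypothetical).
CREATE INDEX IF NOT EXISTS idx_employees_employe_id ON employees (employe_id);

-- Equivalent to IN (...), but bound as a single array parameter:
SELECT * FROM employees WHERE employe_id = ANY ($1);

-- For very long lists, joining against VALUES often plans better:
SELECT e.*
FROM employees e
JOIN (VALUES (101), (102), (103)) AS ids(id) ON e.employe_id = ids.id;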

Related

Optimize SELECT MAX(timestamp) query

I would like to run this query about once every 5 minutes so I can run an incremental query to MERGE into another table.
SELECT MAX(timestamp) FROM dataset.myTable
-- timestamp is of type TIMESTAMP
My concern is that this will do a full scan of myTable on a regular basis.
What are the best practices for optimizing this query? Will partitioning help, even though the SELECT MAX doesn't filter on a date? Or does the columnar nature of BigQuery alone make this optimal?
Thank you.
What you can do is, instead of querying your table directly, query the INFORMATION_SCHEMA.PARTITIONS view within your dataset (see the BigQuery documentation).
You can for instance go for:
SELECT LAST_MODIFIED_TIME
FROM `project.dataset.INFORMATION_SCHEMA.PARTITIONS`
WHERE TABLE_NAME = "myTable"
The PARTITIONS view holds metadata at the rate of one record for each of your partitions. It is therefore much smaller than your table, and it's an easy way to cut your query costs (it is also much faster to query).
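If all you need is the single most recent value, a hedged refinement of the same idea (same view, same table name from the question) is to aggregate over the partition records:

-- Sketch: latest modification time across all partitions of myTable.
SELECT MAX(LAST_MODIFIED_TIME) AS last_modified
FROM `project.dataset.INFORMATION_SCHEMA.PARTITIONS`
WHERE TABLE_NAME = "myTable";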

Is there a performance or memory benefit: Select A, B, C into temp from query VS Select into temp from (Select A,B,C from table) as TTable

In SQL Server (2008), I have two queries like this:
SELECT * INTO #tempTable FROM sampleTable
OR
SELECT * INTO #tempTable FROM (Select Query) AS someTableName
Is there a performance or memory-space benefit to either of these queries, or are both equally good?
It is known that each of them individually is better than
INSERT INTO Temp
<Query>
But how do they compare to each other?
Update:
The two queries are:
SELECT A, B, C INTO #tempTable FROM TestTable
OR
SELECT * INTO #tempTable FROM (SELECT A, B, C FROM TestTable) AS someTableName
All of those result in the same query plan.
SQL Server has a query optimizer, and optimizing away redundant columns is about the easiest and most basic optimization there is.
The best way to answer such questions for yourself is to look at the query plans and compare them. It is generally quite pointless to memorize the performance behavior of specific queries. It is a better approach to understand how queries are optimized and executed in general.
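For instance, one hedged way to compare the plans in SQL Server without executing anything (names are from the updated question; the temp-table names are adjusted so the two statements don't collide):

-- Sketch: print the estimated plan text for both variants.
SET SHOWPLAN_TEXT ON;
GO
SELECT A, B, C INTO #tempTable1 FROM TestTable;
GO
SELECT * INTO #tempTable2 FROM (SELECT A, B, C FROM TestTable) AS someTableName;
GO
SET SHOWPLAN_TEXT OFF;
GO

SET SHOWPLAN_TEXT must be the only statement in its batch, hence the GO separators; in Management Studio you can instead just toggle "Display Estimated Execution Plan".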
Same performance.
If you are working with very large result sets (10,000+ rows, say), the #tempTable itself slows the query down. If you are using this for paging or something similar, there is GROUP BY and, in newer versions, the OFFSET ... FETCH NEXT ... ROWS ONLY syntax.
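As a hedged illustration of that last point (OFFSET/FETCH requires SQL Server 2012 or later; the table, columns, and page size are placeholders):

-- Sketch: paging without a temp table (SQL Server 2012+).
SELECT A, B, C
FROM TestTable
ORDER BY A                  -- ORDER BY is required for OFFSET/FETCH
OFFSET 40 ROWS              -- skip the first four pages of ten rows
FETCH NEXT 10 ROWS ONLY;    -- return the fifth page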

Fastest execution time for querying on Big size table

I need advice on how to get the fastest results when querying a big table.
I am using SQL Server 2012, my condition is like this:
I have 5 tables containing transaction records; each table has 35 million records.
All tables have 14 columns; the columns I need to search are GroupName, CustomerName, and NoRegistration. I also have a view that unions all 5 of these tables.
The GroupName, CustomerName, and NoRegistration values are not unique within each table.
My application has a function to search these columns.
The query is like this:
Search by Group Name:
SELECT DISTINCT(GroupName) FROM TransactionRecords_view WHERE GroupName LIKE ''+@GroupName+'%'
Search by Name:
SELECT DISTINCT(CustomerName) AS 'CustomerName' FROM TransactionRecords_view WHERE CustomerName LIKE ''+@Name+'%'
Search by NoRegistration:
SELECT DISTINCT(NoRegistration) FROM TransactionRecords_view WHERE LOWER(NoRegistration) LIKE LOWER(@NoRegistration)+'%'
My question is: how can I achieve the fastest execution time for searching?
Under my current conditions, every search takes 3 to 5 minutes.
My idea is to make a new table containing the distinct GroupName, CustomerName, and NoRegistration values from all 5 tables.
Would my idea make execution faster, or is there a better approach?
Thank you
EDIT:
This is the query for the view "TransactionRecords_view":
CREATE VIEW TransactionRecords_view
AS
SELECT * FROM TransactionRecords_1507
UNION ALL
SELECT * FROM TransactionRecords_1506
UNION ALL
SELECT * FROM TransactionRecords_1505
UNION ALL
SELECT * FROM TransactionRecords_1504
UNION ALL
SELECT * FROM TransactionRecords_1503
You must show the SQL of TransactionRecords_view. Do you have indexes? What is the collation of the NoRegistration column? Paste the actual execution plan for each query.
OK, so you don't need to make those new tables. If you create non-clustered indexes on these fields, that will (in effect) do what you're after; see the sketch below. An index only stores data for the columns you indicate, not the whole table. Be aware, however, that while indexes are excellent aids to SELECT statements, they negatively affect write statements (INSERT, UPDATE, etc.).
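A hedged sketch of what that could look like (column and table names are from the question; the index names are hypothetical, and each of the 5 underlying tables needs its own set):

-- Sketch: one non-clustered index per searched column, per table.
CREATE NONCLUSTERED INDEX IX_TR1507_GroupName
    ON TransactionRecords_1507 (GroupName);
CREATE NONCLUSTERED INDEX IX_TR1507_CustomerName
    ON TransactionRecords_1507 (CustomerName);
CREATE NONCLUSTERED INDEX IX_TR1507_NoRegistration
    ON TransactionRecords_1507 (NoRegistration);
-- Repeat for TransactionRecords_1506 through TransactionRecords_1503.

Note that a trailing-wildcard LIKE 'abc%' can seek on such an index, but the LOWER(NoRegistration) wrapper in the third query cannot; with a case-insensitive collation the LOWER call can simply be dropped.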
Next, run the queries with the actual execution plan switched on. This will show you how the optimizer has decided to run each query behind the scenes. Are there any particular issues; are any of the steps taking up a large share of the overall operator cost? There are plenty of great instructional videos about execution plans on YouTube; check them out if you haven't looked at execution plans before.
Did you check for missing indexes with the actual execution plan?
Moreover, since you are filtering on varchar columns, Full-Text Search may be useful for you:
https://msdn.microsoft.com/en-us/library/ms142571(v=sql.120).aspx
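A minimal sketch of that route on one of the underlying tables, assuming a unique key index already exists to anchor the full-text index (the catalog name and key-index name here are hypothetical):

-- Sketch: full-text index plus a prefix search (SQL Server).
CREATE FULLTEXT CATALOG TransactionRecordsCatalog;
CREATE FULLTEXT INDEX ON TransactionRecords_1507 (CustomerName)
    KEY INDEX PK_TransactionRecords_1507   -- hypothetical unique key index
    ON TransactionRecordsCatalog;

-- Prefix search through the full-text engine instead of LIKE:
SELECT DISTINCT CustomerName
FROM TransactionRecords_1507
WHERE CONTAINS(CustomerName, '"john*"');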

SQL Match Against Slow Query

I have a table of 2+ million products (rows) with 44 fields (columns).
I am attempting to query this table on the 'NAME' field (varchar 160), which I have a fulltext index on.
Here is the query, which currently takes 71.34 seconds to execute with a three-word $keyword, 62.47 seconds with a two-word $keyword, and 0.017 seconds with a single-word $keyword.
SELECT ID,
MATCH(NAME) AGAINST ('$keyword') as Relevance,
MANUFACTURER,
ADVERTISERCATEGORY,
THIRDPARTYCATEGORY,
DESCRIPTION,
AID,
SALEPRICE,
RETAILPRICE,
PRICE,
SKU,
BUYURL,
IMAGEURL,
NAME,
PROGRAMNAME
FROM products
WHERE MATCH(NAME) AGAINST ('$keyword' IN BOOLEAN MODE)
GROUP BY NAME
HAVING Relevance > 6
ORDER BY Relevance DESC LIMIT 24
How can I optimize this query to perform better on 2+ word $keyword queries?
This may not be quite the answer you were looking for, however:
With a table that large, fulltext searches are never going to be speedy. I would suggest looking into a fulltext engine like Sphinx.
As an added benefit, you can also do the relevance matching in Sphinx, as it returns results ranked in order of relevance. Once Sphinx returns the matching IDs, you can then just use an IN statement in your query's WHERE clause to select the other data you need.
I would also suggest looking into Sphinx's Extended Query syntax, as this will let you match multiple words across things like proximity and word order.
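A hedged sketch of that two-step pattern on the MySQL side (the ID list stands in for whatever Sphinx actually returns; the columns are from the question):

-- Sketch: after Sphinx returns matching IDs (e.g. 17, 42, 318),
-- fetch the remaining columns from MySQL:
SELECT ID, NAME, MANUFACTURER, SALEPRICE, BUYURL
FROM products
WHERE ID IN (17, 42, 318)
ORDER BY FIELD(ID, 17, 42, 318)   -- preserve Sphinx's relevance order
LIMIT 24;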
Is table partitioning an option with MySQL? With SQL Server, it proves useful in situations like this.
Or perhaps change the MATCH function to a standard WHERE ... LIKE clause. I don't know much about MySQL, so I fear I'm of lesser help. Sorry!

Difference between two sql count and subquery count statements

Is there a big performance difference between those two SQL count statements when performing large counts (large here means 100k+ records)?
first:
SELECT count(*) FROM table1 WHERE <some very complex conditions>
second:
SELECT count(*) FROM (SELECT * FROM table1 WHERE <some very complex conditions>) subquery_alias
I know that the first approach is right, but I want to know whether these statements will perform similarly.
The query optimizer will most likely transform your second query into the first one. There should be no measurable performance difference between those two queries.
The answer depends on the database being used. For MS SQL, the Query Optimizer will optimize the query and both will have similar performance. But for other database systems it depends on the intelligence of the Query Optimizer.
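A hedged way to check this on your own system (the table and the condition placeholder are from the question; EXPLAIN is the MySQL/PostgreSQL spelling, while SQL Server uses the execution-plan tooling mentioned above):

-- Sketch: compare the plans the optimizer actually produces.
EXPLAIN SELECT count(*) FROM table1 WHERE <some very complex conditions>;
EXPLAIN SELECT count(*) FROM (
    SELECT * FROM table1 WHERE <some very complex conditions>
) subquery_alias;
-- If the two plans are identical, the rewrite has no performance cost.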