SQL Server query performance tuning with GROUP BY and JOIN clauses

We have been seeing performance problems in a job, and I have tracked them down to the following query:
select name from Student a, Student_Temp b
where a.id = b.id and
a.name in (select name from Student
group by name having count(*) = #sno)
group by a.name having count(*) = #sno
OPTION (MERGE JOIN, LOOP JOIN)
This query is called many times in a loop, which slows the job down.
The Student table has 8 million records, and Student_Temp receives 5-20 records on each iteration.
The Student table has a composite primary key on (id, name),
and sno is the number of records in Student_Temp.
My questions are:
1) Why does this query perform poorly?
2) Is there a more efficient way to write it?
Thanks in advance!

It's repeating the same logic unnecessarily. You really just want:
Of the Student(s) who also exist in Student_temp
what names exist #sno times?
Try this:
SELECT
name
FROM
Student a JOIN
Student_Temp b ON a.id = b.id
GROUP BY
name
HAVING
count(*) = #sno

Your query returns the following result: all names that occur #sno times in the table Student and exactly once in Student_temp.
You can rewrite the query like this:
SELECT a.name
FROM Student a
INNER JOIN Student_temp b
ON a.id = b.id
GROUP BY a.name
HAVING COUNT(*) = #sno
You should omit the query hint unless you are absolutely sure that the query optimizer screws up.
EDIT: There is of course a difference between the queries: if, for instance, #sno=2, then a name that shows up once in Student but twice in Student_temp would be included in my query but not in the original. It depends on what you really want to achieve whether that needs to be addressed or not.
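The difference described in the EDIT can be checked on a tiny data set. The following is a minimal sketch, not a SQL Server benchmark: it assumes SQLite (through Python's sqlite3 module) as a stand-in engine, passes the row count of Student_Temp as a named parameter in place of #sno, and adds an ORDER BY only to make the output deterministic. The helper name compare is hypothetical.

```python
import sqlite3

# The query from the question (hints removed, columns qualified).
ORIGINAL = """
    SELECT a.name
    FROM Student a, Student_Temp b
    WHERE a.id = b.id
      AND a.name IN (SELECT name FROM Student
                     GROUP BY name HAVING COUNT(*) = :sno)
    GROUP BY a.name HAVING COUNT(*) = :sno
    ORDER BY a.name
"""

# The simplified query suggested in the answers.
SIMPLIFIED = """
    SELECT a.name
    FROM Student a
    INNER JOIN Student_Temp b ON a.id = b.id
    GROUP BY a.name HAVING COUNT(*) = :sno
    ORDER BY a.name
"""


def compare(students, temp_ids):
    """Run both queries on the given rows and return (original, simplified)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE Student (id INTEGER, name TEXT, PRIMARY KEY (id, name))"
    )
    # No key on Student_Temp, so the same id may be loaded more than once.
    con.execute("CREATE TABLE Student_Temp (id INTEGER)")
    con.executemany("INSERT INTO Student VALUES (?, ?)", students)
    con.executemany(
        "INSERT INTO Student_Temp VALUES (?)", [(i,) for i in temp_ids]
    )
    params = {"sno": len(temp_ids)}  # sno = number of rows in Student_Temp
    original = [row[0] for row in con.execute(ORIGINAL, params)]
    simplified = [row[0] for row in con.execute(SIMPLIFIED, params)]
    con.close()
    return original, simplified


if __name__ == "__main__":
    # Same result when every matching name occurs sno times in Student.
    print(compare([(1, "Ann"), (2, "Ann"), (3, "Bob")], [1, 2]))
    # Different result when an id is duplicated in Student_Temp.
    print(compare([(3, "Bob")], [3, 3]))
```

If Student_Temp can never contain the same id twice, the two queries should agree for the cases above, and the simplified form avoids the extra aggregation over the 8 million row table.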

Here you go
select name
from Student a
inner join Student_Temp b
on a.id = b.id
group by a.name
HAVING COUNT(*) = #sno

Related

Oracle - NVL(col1,col2) Order By slowness

The SELECT clause contains the column NVL(b.name, a.name), and I use this column in the ORDER BY, which has made the Oracle query slow.
I tried creating an index on the NAME column, but it did not help.
SELECT
*
FROM
(
SELECT
nvl(b.name,a.name) AS b_a_name, -- Order by is using this column and hence the slowness. Index is present on NAME column but of no use
b.name b_name,
a.name a_name
FROM
employee a
LEFT JOIN employee b ON a.parent_id = b.child_id
)
ORDER BY b_a_name --- this Order By is taking time
;
I would like to know how to tune the ORDER BY clause, or how to rewrite the query to get the same output with better performance.

You can write the query without the subquery, though I am not sure whether it will improve performance:
SELECT nvl(b.name,a.name) AS b_a_name,
b.name a_name,
a.name b_name
FROM
employee a
LEFT JOIN employee b ON a.parent_id = b.child_id
order by b_a_name

How about removing NVL from ORDER BY?
SELECT NVL(b.name, a.name) AS b_a_name,
b.name a_name,
a.name b_name
FROM employee a
LEFT JOIN employee b ON a.parent_id = b.child_id
ORDER BY b.name, a.name;
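Note that this does not produce the same order as sorting by NVL(b.name, a.name): rows where b.name is NULL are placed after all the others (Oracle sorts NULLs last in ascending order by default) instead of being interleaved by a.name. Apart from that, ORDER BY always costs something; an unordered set is generally retrieved faster.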
By the way, why did you do that with column aliases? To confuse the enemy? Well, you confused me.
b.name a_name --> shouldn't that be b_name
a.name b_name --> a_name
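Whether sorting by b.name, a.name gives the same order as sorting by the NVL expression can be checked on a tiny data set. This is a sketch under stated assumptions: SQLite (via Python's sqlite3) stands in for Oracle, COALESCE stands in for NVL, and the extra sort key "b.name IS NULL" imitates Oracle's default of placing NULLs last in ascending order. The three sample rows and the function name name_orders are made up for illustration.

```python
import sqlite3


def name_orders():
    """Return the b_a_name column under both ORDER BY variants."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE employee (child_id INTEGER, parent_id INTEGER, name TEXT)"
    )
    con.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, None, "Zed"), (2, 1, "Amy"), (3, None, "Bea")],
    )
    base = """
        SELECT COALESCE(b.name, a.name) AS b_a_name
        FROM employee a
        LEFT JOIN employee b ON a.parent_id = b.child_id
    """
    # Variant 1: sort by the computed expression, as in the question.
    by_expression = [r[0] for r in con.execute(base + " ORDER BY b_a_name")]
    # Variant 2: sort by the two columns, with NULLs in b.name placed last.
    by_columns = [
        r[0]
        for r in con.execute(base + " ORDER BY b.name IS NULL, b.name, a.name")
    ]
    con.close()
    return by_expression, by_columns


if __name__ == "__main__":
    by_expression, by_columns = name_orders()
    print(by_expression)
    print(by_columns)
```

On this data the two variants return the rows in different orders, so the rewrite is only a valid optimization if the changed ordering is acceptable.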

Here the time was spent on I/O; the ORDER BY clause (CPU time) caused no degradation. I confirmed this by writing all the query's data into a table and then applying the ORDER BY: all the time was consumed by writing the data to the table. Response time is 0.2 seconds but I/O time is 6 seconds, which can't be reduced, although a parallel hint may help.

How to ignore rows with specific ids in an SQL query (PHP)

I have a simple shop written in PHP, and I need to hide some products on the shop's manage page. How can I exclude them in the SQL query?
Here is my query:
$query = "SELECT a.*,
a.user as puser,
a.id as pid,
b.date as date,
b.price as price,
b.job_id as job_id,
b.masterkey as masterkey
FROM table_shop a
INNER JOIN table_shop_s b ON a.id = b.buyid
WHERE b.payok = 1
ORDER BY buyid";
I need to exclude the rows with product_id = "3","4" in table table_shop_s from this query.

WHERE b.payok = 1 AND b.product_id != 3 AND b.product_id != 4

Simply use NOT IN to ignore specific product ids, combined with the existing condition through AND. Use the following:
$query = "SELECT a.*,
a.user as puser,
a.id as pid,
b.date as date,
b.price as price,
b.job_id as job_id,
b.masterkey as masterkey
FROM table_shop a
INNER JOIN table_shop_s b ON a.id = b.buyid
WHERE b.payok = 1
AND b.product_id NOT IN (3,4)
ORDER BY buyid";

Another answer has noted that you could use "productid NOT IN (3,4)", which would work, but that is a short-term fix. Extend the thinking a bit: it's 2 products now, but what if you want to hide more in the future? Do you change all your queries and risk missing one?
My suggestion would be to update your product table. Add a column such as ExcludeFlag set to 1 or 0: 1 = yes, exclude; 0 = OK, leave it alone. Then join your shop detail table to products and exclude rows where this flag is set. Also, you only need "AS" when you are changing a column's result name. Additionally, by selecting a.*, you are already getting ALL columns from the table aliased "a"; do you really need the extra instances "a.user as puser, a.id as pid"?
Something like:
SELECT
a.*,
b.date,
b.price,
b.job_id,
b.masterkey
FROM
table_shop a
INNER JOIN table_shop_s b
ON a.id = b.buyid
AND b.payok = 1
INNER JOIN YourProductTable ypt
on b.ProductID = ypt.ProductID
AND ypt.ExcludeFlag = 0
ORDER BY
a.id
Notice the extra join, which specifically includes only the rows where the flag is NOT set.
Also, it is good practice to give tables aliases that reflect their purpose rather than just "a" and "b", much like my example of the long table name YourProductTable aliased as ypt.
I also changed the ORDER BY to "a.id", since that is the primary table in your query and, because a.id = b.buyid, it is the same key order anyhow and is probably indexed on your "a" table too. I would assume the table_shop_s table already has an index on (buyid), but once you have a lot of records it might benefit from an index on (buyid, payok) to better match both parts of your join criteria.
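The flag-based approach can be tried end to end with a small in-memory database. This is a minimal sketch, assuming SQLite via Python's sqlite3 rather than the asker's real database; the products table, its exclude_flag column, the sample rows, and the function name visible_purchases are all hypothetical stand-ins for YourProductTable and ExcludeFlag.

```python
import sqlite3


def visible_purchases():
    """Return paid purchases whose product is not flagged for exclusion."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE table_shop (id INTEGER PRIMARY KEY, user TEXT);
        CREATE TABLE table_shop_s (
            buyid INTEGER, product_id INTEGER, price REAL, payok INTEGER
        );
        -- Hypothetical product table carrying the exclusion flag.
        CREATE TABLE products (
            product_id INTEGER PRIMARY KEY,
            exclude_flag INTEGER NOT NULL DEFAULT 0
        );
        INSERT INTO table_shop VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
        INSERT INTO products VALUES (3, 1), (4, 1), (5, 0);
        INSERT INTO table_shop_s VALUES
            (1, 3, 10.0, 1),   -- paid, but product 3 is flagged
            (2, 5, 20.0, 1),   -- paid and visible
            (3, 5, 30.0, 0);   -- visible product, but not paid
        """
    )
    rows = con.execute(
        """
        SELECT shop.id, shop.user, sale.product_id, sale.price
        FROM table_shop shop
        INNER JOIN table_shop_s sale
            ON shop.id = sale.buyid AND sale.payok = 1
        INNER JOIN products prod
            ON sale.product_id = prod.product_id AND prod.exclude_flag = 0
        ORDER BY shop.id
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    print(visible_purchases())
```

Hiding another product later then means updating one row in the product table instead of editing every query that lists ids.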

Simple SQL query on small tables takes too long to execute

I have a query that takes much too long to execute. It is simple and the tables are small. The simplified query (but still slow) is:
SELECT D.ID, C.Name, T.Name AS TownName
FROM Documents D, Companies C, Towns T
WHERE C.ID = D.Company AND T.ID = C.Town
ORDER BY C.Name
Primary keys and foreign keys between tables are properly set. Also, column Companies.Name is indexed.
I tried using JOINs, restarting SQL Server, rebuilding indices etc. but it still needs about 40 seconds to execute on my computer with SSD. Number of records in tables Documents and Companies is only 18K (currently, they are 1:1) and only about 20 records in table Towns.
On the other side, the following query returns completely the same records, but it takes practically no time to execute:
SELECT D.ID, C.Name, (SELECT Name FROM Towns WHERE ID = C.Town) AS TownName
FROM Documents D, Companies C
WHERE C.ID = D.Company
ORDER BY C.Name
In my opinion, the first query should be even faster, but I am obviously wrong. Does anybody have a clue what's happening here? It seems that indices are ignored when sorting by column in a table which is a master of one and detail of another one.

I can't explain why your subquery version is running faster, but I would try something else to see if I could eliminate the subquery.
I usually order the tables from smallest to largest when I'm not using WHERE conditions, so my query would look like:
Select t.Name TownName,
c.Name,
d.Id
From Towns t
Join Companies c ON t.Id = c.Town
Join Documents d ON c.Id = d.Company
Order By c.Name
Then I'd make sure that Companies has an index on Town, and that Documents has an index on Company. 18K records might take a little while to display in the output window, but the query itself should be pretty quick.

What happens when you use JOIN statements?
SELECT D.ID, C.Name, T.Name AS TownName
FROM Documents D
inner join Companies C on C.ID = D.Company
inner join Towns T on T.ID = C.Town
ORDER BY C.Name
Also, try it with and without the ORDER BY.

sql server - how to modify values in a query statement?

I have a statement like this:
select lastname,firstname,email,floorid
from employee
where locationid=1
and (statusid=1 or statusid=3)
order by floorid,lastname,firstname,email
The problem is the column floorid. The result of this query is showing the id of the floors.
There is this table called floor (has like 30 rows), which has columns id and floornumber. The floorid (in above statement) values match the id of the table floor.
I want the above query to switch the floorid values into the associated values of the floornumber column in the floor table.
Can anyone show me how to do this please?
I am using Microsoft sql server 2008 r2.
I am new to sql and I need a clear and understandable method if possible.

select lastname,
firstname,
email,
floor.floornumber
from employee
inner join floor on floor.id = employee.floorid
where locationid = 1
and (statusid = 1 or statusid = 3)
order by floorid, lastname, firstname, email

You have to do a simple join where you check, if the floorid matches the id of your floor table. Then you use the floornumber of the table floor.
select a.lastname,a.firstname,a.email,b.floornumber
from employee a
join floor b on a.floorid = b.id
where a.locationid=1 and (a.statusid=1 or a.statusid=3)
order by a.floorid,a.lastname,a.firstname,a.email

You need to use a join.
This will join the two tables on a certain field.
This way you can SELECT columns from more than one table at a time.
When you join two tables you have to specify on which column you want to join them.
In your example, you'd have to do this:
from employee join floor on employee.floorid = floor.id
Since you are new to SQL, there are a few things you should know. In the other answers to this question, people use aliases instead of repeating the table name.
from employee a join floor b
means that from now on the table employee will be known as a and the table floor as b. This is really useful when you have a lot of joins to do.
Now let's say both tables have a column called name. In your SELECT you have to say which table you want to pick the column name from. If you only write this
SELECT name from Employee a join floor b on a.floorid = b.id
the database engine can't tell which table's name column you mean and reports an error. You have to qualify it like this:
SELECT Employee.name from Employee join floor on Employee.floorid = floor.id
or, if you prefer, with aliases (once a table has an alias, you must use the alias to qualify its columns):
SELECT a.name from Employee a join floor b on a.floorid = b.id
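The ambiguity can be reproduced in a few lines. This is a minimal sketch that assumes SQLite (via Python's sqlite3) instead of SQL Server 2008 R2, so the exact error text will differ; the name column on floor, the sample rows, and the function name demo are hypothetical and exist only to trigger the problem described above.

```python
import sqlite3


def demo():
    """Return (error message for the unqualified query, rows of the qualified one)."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE floor (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, floorid INTEGER);
        INSERT INTO floor VALUES (10, 'Ground');
        INSERT INTO employee VALUES (1, 'Ann', 10);
        """
    )
    try:
        # Both tables have a "name" column, so the engine cannot pick one.
        con.execute("SELECT name FROM employee a JOIN floor b ON a.floorid = b.id")
        error = None
    except sqlite3.OperationalError as exc:
        error = str(exc)
    # Qualifying each column with its alias removes the ambiguity.
    qualified = con.execute(
        "SELECT a.name, b.name FROM employee a JOIN floor b ON a.floorid = b.id"
    ).fetchall()
    con.close()
    return error, qualified


if __name__ == "__main__":
    print(demo())
```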
Finally, there are many types of joins:
Inner join (what you are using, because simply typing JOIN means an inner join)
Left outer join
Right outer join
Self join
...
You should refer to an article about joins to learn how to use them correctly.
Hope this helps.

comparison query taking ages

My query is quite simple:
select a.ID, a.adres, a.place, a.postalcode
from COMPANIES a, COMPANIES b
where a.Postcode = b.Postcode
and a.Adres = b.Adres
and (
select COUNT(COMPANYID)
from USERS
where COMPANYID=a.ID
)>(
select COUNT(COMPANYID)
from USERS
where COMPANYID=b.ID
)
Database: sql server 2008 r2
What I'm trying to do:
The COMPANIES table contains duplicate entries. I want to know which ones are connected to the largest number of users, so that I only have to change the foreign keys of those with the fewest. (I already know the ids of the duplicates.)
Right now it's taking a lot of time to complete. I was wondering if it could be done faster.

Try this version. It should be only a little faster, since the COUNT is quite slow. I've added a.ID <> b.ID to rule out rows matched with themselves early.
select a.ID, a.adres, a.place, a.postalcode
from COMPANIES a INNER JOIN COMPANIES b
ON
a.ID <> b.ID
and a.Postcode = b.Postcode
and a.Adres = b.Adres
and (
select COUNT(COMPANYID)
from USERS
where COMPANYID=a.ID
)>(
select COUNT(COMPANYID)
from USERS
where COMPANYID=b.ID
)
FROM ... INNER JOIN ... ON ... is the preferred SQL construct for joining tables. The optimizer normally treats both forms the same way, so any speed difference from the syntax alone should be small.

One approach would be to pre-calculate the COMPANYID count before doing the join, since otherwise you'll be calculating it repeatedly in the main query, i.e. something like:
select COMPANYID as ID, COUNT(COMPANYID) as IDCount
into #CompanyCount
from USERS
group by COMPANYID
Then your main query:
select a.ID, a.adres, a.place, a.postalcode
from COMPANIES a
inner join #CompanyCount aCount on aCount.ID = a.ID
inner join COMPANIES b on b.Postcode = a.Postcode and b.Adres = a.Adres
inner join #CompanyCount bCount on bCount.ID = b.ID and aCount.IDCount > bCount.IDCount
If you want all instances of a even when there is no corresponding b, then you'd need left outer joins to b and bCount.
However, you need to look at the query plan to see which indexes are being used. You probably want indexes on the IDs and on the Postcode and Adres fields as a minimum, since you're joining on them.
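The pre-aggregation idea can be tried on a toy data set. This is a sketch under stated assumptions, not a SQL Server test: it uses SQLite via Python's sqlite3, a TEMP table created with CREATE TABLE ... AS in place of the #CompanyCount temp table, and made-up rows. The function name companies_with_more_users is hypothetical.

```python
import sqlite3


def companies_with_more_users():
    """Return ids of duplicate companies that have more users than their twin."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE COMPANIES (ID INTEGER PRIMARY KEY, Adres TEXT, Postcode TEXT);
        CREATE TABLE USERS (COMPANYID INTEGER);
        INSERT INTO COMPANIES VALUES
            (1, 'Main St 1', '1000'),
            (2, 'Main St 1', '1000'),   -- duplicate of company 1
            (3, 'Side Rd 9', '2000');
        INSERT INTO USERS VALUES (1), (1), (1), (2), (3), (3);
        -- Count users once per company instead of once per compared pair.
        CREATE TEMP TABLE CompanyCount AS
            SELECT COMPANYID AS ID, COUNT(*) AS IDCount
            FROM USERS
            GROUP BY COMPANYID;
        """
    )
    rows = con.execute(
        """
        SELECT a.ID
        FROM COMPANIES a
        INNER JOIN CompanyCount aCount ON aCount.ID = a.ID
        INNER JOIN COMPANIES b
            ON b.Postcode = a.Postcode AND b.Adres = a.Adres
        INNER JOIN CompanyCount bCount
            ON bCount.ID = b.ID AND aCount.IDCount > bCount.IDCount
        ORDER BY a.ID
        """
    ).fetchall()
    con.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(companies_with_more_users())
```

Because of the inner joins, a company with no users at all has no row in the count table and drops out of the comparison, which is the case the note about left outer joins addresses.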

Build an index on postcode and adres.
The database probably executes the subselects for every row. (Just guessing here; verify it in the execution plan.) If this is the case, you can rewrite the query to join with inline views (note: this is how it would look in Oracle; I hope it works in SQL Server as well):
select distinct a.ID, a.adres, a.place, a.postalcode
from
COMPANIES a,
COMPANIES b,
(
select COUNT(COMPANYID) cnt, companyid
from USERS
group by companyid) cntA,
(
select COUNT(COMPANYID) cnt, companyid
from USERS
group by companyid) cntb
where a.Postcode = b.Postcode
and a.Adres = b.Adres
and a.ID<>b.ID
and cnta.companyid = a.ID
and cntb.companyid = b.ID
and cnta.cnt>cntb.cnt
