SQL query tuning/optimization

For large amounts of data, it is taking a lot of time to execute.
Please help tune this query.
select *
from
(select cs.sch, cs.cls, cs.std, d.date, d.count
from
(select c.sch, c.cls, s.std
from
(select distinct sch, cls from Data) c --List of school/classes
cross join
(select distinct std from Data) s --list of std
) cs --every possible combination of school/classes and std
left outer join
Data D on D.sch = cs.sch and D.cls = cs.cls and D.std = cs.std --try and join to the original data
group by
c.sch, c.cls, s.std, d.date, d.count)
order by
cs.sch, cs.cls,
case
when (cs.std= 'Ax')
then 1
when (cs.std= 'Bo')
then 2
when (cs.std= 'Ct')
then 3
else null
end
Thanks in advance
Magickk

First, the query is generating a lot of rows (presumably) and so it is going to take time.
From what I can tell, the outer aggregation is not necessary. At the very least, you have no aggregation functions, which is suspicious.
select c.sch, c.cls, s.std, d.date, d.count
from (Select distinct sch, cls from Data
) c cross join -- list of school/classes
(select distinct std from Data
) s left join -- list of std
Data d
on d.sch = c.sch and d.cls = c.cls and d.std = s.std
order by c.sch, c.cls,
(case s.std when 'Ax' then 1 when 'Bo' then 2 when 'Ct' then 3 end)
There is nothing you can do about the outer order by. For the select distinct subqueries, you can create indexes on data(sch, cls, std) (the third column is for the join) and data(std).
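A minimal sketch of the rewritten query, run through Python's sqlite3 module so it can be tried without a server. The sample rows and index names are made up for illustration, and whether the indexes actually help depends on your database and data volume.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Data (sch TEXT, cls TEXT, std TEXT, "date" TEXT, "count" INTEGER);
    INSERT INTO Data VALUES
        ('S1', 'C1', 'Bo', '2020-01-01', 5),
        ('S1', 'C1', 'Ax', '2020-01-02', 3),
        ('S2', 'C1', 'Ct', '2020-01-03', 7);
    -- Indexes suggested in the answer (names are hypothetical).
    CREATE INDEX idx_data_sch_cls_std ON Data (sch, cls, std);
    CREATE INDEX idx_data_std ON Data (std);
""")

# Every school/class paired with every std, then matched back to the data.
rows = conn.execute("""
    SELECT c.sch, c.cls, s.std, d."date", d."count"
    FROM (SELECT DISTINCT sch, cls FROM Data) c
    CROSS JOIN (SELECT DISTINCT std FROM Data) s
    LEFT JOIN Data d
           ON d.sch = c.sch AND d.cls = c.cls AND d.std = s.std
    ORDER BY c.sch, c.cls,
             CASE s.std WHEN 'Ax' THEN 1 WHEN 'Bo' THEN 2 WHEN 'Ct' THEN 3 END
""").fetchall()

for row in rows:
    print(row)
```

Combinations with no matching row come back with NULL (None in Python) for date and count, which is the point of the left join.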

DISTINCT can slow down performance on big tables. GROUP BY is a possible replacement for DISTINCT (and in some scenarios it is faster):
select cs.sch, cs.cls, cs.std, d.date, d.count
from
(select c.sch, c.cls, s.std
from
(select sch, cls from Data
group by sch, cls) c
cross join
(select std from Data
group by std) s) cs --every possible combination of school/classes and std
left outer join
Data d on d.sch = cs.sch and d.cls = cs.cls and d.std = cs.std --try and join to the original data
order by
cs.sch, cs.cls,
case
when (cs.std= 'Ax')
then 1
when (cs.std= 'Bo')
then 2
when (cs.std= 'Ct')
then 3
else null
end

Related

How to use alias name with partition by in sql

I'm fetching records of players categorized by golf handicap. Players with a handicap between 0 and 5 fall in the 0-5 range, players between 6 and 11 fall in the 6-11 range, and so on. I'm trying to fetch the top 3 players from each range so that I can set up flights for each round.
I have used a PARTITION BY clause to separate the records and ROW_NUMBER to get the top 3 players from each range. To define the ranges, I have used a CASE expression with multiple branches. How do I use range as an alias with PARTITION BY, or is there another way to generate the correct result? Below is my query.
select * from (
select uu.Id, firstname, lastname, userhandicap,
case when userhandicap>=0 and userhandicap<=5 then '0-5'
when userhandicap>=6 and userhandicap<=11 then '6-11'
when UserHandicap>=12 and UserHandicap<=18 then '12-18'
when UserHandicap>=19 and UserHandicap<=26 then '19-26'
else '27 and above' end as range, RN = ROW_Number() over (PARTITION BY
range order by cast(userhandicap as int))
from dbo.[User] uu inner join dbo.[EventRegisteredUsers] eru
on uu.Id = eru.UserId
where eru.UserId not in (Select fp.UserId from dbo.[FlightPlayer] fp
inner join dbo.[Flight] f
on fp.FlightId = f.Id
where f.Rounds = '1'
and f.Starthole = '0a9b926e-0baa-4369-8cf8-8fc84ca80d65' and f.EventId =
'7de10ad6-098d-419f-9c2d-2e62803ad1f7')
and eru.EventId = '7de10ad6-098d-419f-9c2d-2e62803ad1f7') uu
WHERE
uu.RN <= 3
You can use apply to define the range value within the subquery. This is the simplest method for defining the range:
select *
from (select uu.Id, firstname, lastname, userhandicap,
row_number() over (partition by v.range order by cast(userhandicap as int)) as seqnum
from dbo.[User] uu inner join
dbo.[EventRegisteredUsers] eru
on uu.Id = eru.UserId cross apply
(values (case when userhandicap <= 5 then '0-5'
when userhandicap <= 11 then '6-11'
when UserHandicap <= 18 then '12-18'
when UserHandicap <= 26 then '19-26'
else '27 and above'
end)
) v(range)
where not exists (select 1
from dbo.[FlightPlayer] fp join
dbo.[Flight] f
on fp.FlightId = f.Id
where eru.UserId = fp.UserId and
f.Rounds = '1' and
f.Starthole = '0a9b926e-0baa-4369-8cf8-8fc84ca80d65' and
f.EventId = '7de10ad6-098d-419f-9c2d-2e62803ad1f7'
) and
eru.EventId = '7de10ad6-098d-419f-9c2d-2e62803ad1f7'
) uu
where uu.seqnum <= 3;
Note other changes to the query:
Don't use not in with a subquery. If the subquery returns a NULL value, then all values are filtered out. That is not (usually) the expected behavior.
The case expression is overly complicated. Use the fact that case is guaranteed to evaluate the conditions in order.
You should qualify all column names in a query that references more than one table. However, it is unclear which tables these columns come from.
Presumably handicap is not ever negative, based on your original logic (and the rules of golf), so I am comfortable removing that condition.
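The NOT IN pitfall mentioned in the notes can be reproduced with a small sketch using Python's sqlite3 module. The registered and assigned tables are hypothetical stand-ins, not the tables from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registered (user_id INTEGER);
    CREATE TABLE assigned (user_id INTEGER);  -- nullable on purpose
    INSERT INTO registered VALUES (1), (2), (3);
    INSERT INTO assigned VALUES (1), (NULL);
""")

# NOT IN: the NULL in the subquery makes the predicate unknown for every
# non-matching row, so nothing passes the filter.
not_in = conn.execute("""
    SELECT r.user_id FROM registered r
    WHERE r.user_id NOT IN (SELECT a.user_id FROM assigned a)
    ORDER BY r.user_id
""").fetchall()

# NOT EXISTS: only checks whether a matching row exists, so NULLs are harmless.
not_exists = conn.execute("""
    SELECT r.user_id FROM registered r
    WHERE NOT EXISTS (SELECT 1 FROM assigned a WHERE a.user_id = r.user_id)
    ORDER BY r.user_id
""").fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
```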
Use a CTE:
with cte as
(
select uu.Id, firstname, lastname, userhandicap,
case when userhandicap>=0 and userhandicap<=5 then '0-5'
when userhandicap>=6 and userhandicap<=11 then '6-11'
when UserHandicap>=12 and UserHandicap<=18 then '12-18'
when UserHandicap>=19 and UserHandicap<=26 then '19-26'
else '27 and above' end as range
from dbo.[User] uu inner join dbo.[EventRegisteredUsers] eru
on uu.Id = eru.UserId
where eru.UserId not in (Select fp.UserId from dbo.[FlightPlayer] fp
inner join dbo.[Flight] f
on fp.FlightId = f.Id
where f.Rounds = '1'
and f.Starthole = '0a9b926e-0baa-4369-8cf8-8fc84ca80d65' and f.EventId =
'7de10ad6-098d-419f-9c2d-2e62803ad1f7')
and eru.EventId = '7de10ad6-098d-419f-9c2d-2e62803ad1f7'
), t2 as
(
select *,row_number() over(partition by range order by cast(userhandicap as int)) rn from cte
) select * from t2 where rn<=3
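Both answers follow the same pattern: compute the range first, then rank within it. Here is a small sketch of that pattern using Python's sqlite3 module, assuming an SQLite build with window functions (3.25 or later). The players table and its rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT, handicap INTEGER);
    INSERT INTO players (name, handicap) VALUES
        ('Ann', 1), ('Bob', 2), ('Cy', 3), ('Dee', 5),
        ('Eve', 7), ('Flo', 30);
""")

rows = conn.execute("""
    WITH bucketed AS (
        -- Step 1: give the CASE expression a name.
        SELECT name, handicap,
               CASE WHEN handicap <= 5  THEN '0-5'
                    WHEN handicap <= 11 THEN '6-11'
                    WHEN handicap <= 18 THEN '12-18'
                    WHEN handicap <= 26 THEN '19-26'
                    ELSE '27 and above'
               END AS handicap_range
        FROM players
    ),
    ranked AS (
        -- Step 2: the alias is now an ordinary column and can be partitioned on.
        SELECT name, handicap, handicap_range,
               ROW_NUMBER() OVER (PARTITION BY handicap_range
                                  ORDER BY handicap) AS rn
        FROM bucketed
    )
    SELECT handicap_range, name, handicap
    FROM ranked
    WHERE rn <= 3
    ORDER BY handicap, name
""").fetchall()

for row in rows:
    print(row)
```

Dee is the fourth player in the 0-5 range, so she is dropped by the rn <= 3 filter.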

Sql code for distinct fields

I was wondering if anyone can help me with this query.
I have two tables that I join together (DDS2ENVR.QBO and KCA0001.ORTS).
The QBO table has fields named NIIN and RIC. The KCA0001.ORTS table has fields named SERVICE and OWN_RIC.
I join the tables on QBO.RIC and ORTS.OWN_RIC. My dilemma is that the same NIIN can appear in multiple rows with different values for RIC.
Example:
NIIN RIC
123455 A
122222 B
123456 C
122222 A
I want a distinct count of NIINs per service, where the services do not overlap. For example, a NIIN should be counted for A only if the same NIIN is not also found under B, C, D, etc.
SELECT D.SERVICE, COUNT(C.NIIN)
FROM DDS2ENVR.QBO C
JOIN KCA0001.ORTS D ON D.OWN_RIC = C.RIC
WHERE C.SITE_ID = ('HEAA')
GROUP BY D.SERVICE
HAVING COUNT(DISTINCT C.NIIN) > 1
Please ask questions if this does not make any sense.
Using NOT EXISTS:
SELECT D.SERVICE, COUNT(C.NIIN)
FROM DDS2ENVR.QBO C
JOIN KCA0001.ORTS D ON D.OWN_RIC = C.RIC
WHERE C.SITE_ID = ('HEAA')
and NOT EXISTS (Select 1 from DDS2ENVR.QBO C1 where C1.NIIN = C.NIIN and C1.RIC <> C.RIC)
GROUP BY D.SERVICE
HAVING COUNT(DISTINCT C.NIIN) > 1
Also, if the table DDS2ENVR.QBO doesn't contain duplicates and your DBMS supports CTEs:
With cte as
(Select NIIN from DDS2ENVR.QBO group by NIIN having count(*) = 1)
SELECT D.SERVICE, COUNT(C.NIIN)
FROM DDS2ENVR.QBO C
JOIN KCA0001.ORTS D ON D.OWN_RIC = C.RIC
WHERE C.SITE_ID = ('HEAA')
and C.NIIN in (Select * from cte)
GROUP BY D.SERVICE
HAVING COUNT(DISTINCT C.NIIN) > 1
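As a rough check of the NOT EXISTS approach, here is a sketch using Python's sqlite3 module with the sample NIIN/RIC rows from the question. The table names are simplified, the service names are hypothetical, and the HAVING clause is left out so that the small sample returns rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE qbo (niin TEXT, ric TEXT);
    CREATE TABLE orts (own_ric TEXT, service TEXT);
    INSERT INTO qbo VALUES
        ('123455', 'A'), ('122222', 'B'), ('123456', 'C'), ('122222', 'A');
    -- Hypothetical service names, one per RIC.
    INSERT INTO orts VALUES ('A', 'Army'), ('B', 'Navy'), ('C', 'AirForce');
""")

# Count only NIINs that never appear under a different RIC.
rows = conn.execute("""
    SELECT d.service, COUNT(DISTINCT c.niin) AS niin_count
    FROM qbo c
    JOIN orts d ON d.own_ric = c.ric
    WHERE NOT EXISTS (SELECT 1
                      FROM qbo c1
                      WHERE c1.niin = c.niin AND c1.ric <> c.ric)
    GROUP BY d.service
    ORDER BY d.service
""").fetchall()

print(rows)  # [('AirForce', 1), ('Army', 1)]
```

NIIN 122222 appears under both A and B, so it is excluded everywhere and Navy has no rows left to count.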

how to convert an empty record to zero

I am working on a Point-of-sale system, but I have been stranded lately. I have two tables from which I want to pick data, but some of the values I retrieve are NULLs, whereas I need them to be zeros. This happens when there is a row in one table ([dbo.ProdDetails]) without a corresponding row in the other table [dbo.Pro_Sales].
I am using this query:
SELECT a.ItemCODE,a.OpenStock,c.UnitsSold
FROM
(SELECT x.ItemCODE,x.OpenStock FROM dbo.ProdDetails x) a
LEFT OUTER JOIN
(SELECT x.ProductID,ISNULL(CONVERT(varchar(50),x.Quantity),'') UnitsSold
FROM dbo.Pro_Sales x
GROUP BY x.ProductID,x.Quantity
) c
ON a.ItemCODE=c.ProductID
WHERE a.ItemCODE ='0005'
The result I am getting is:
itemCode OpenStock UnitsSold
0005 6 NULL
Use ISNULL(columnName, alt.Value) as columnName:
SELECT a.ItemCODE,a.OpenStock,ISNULL(c.UnitsSold,0) as UnitsSold
FROM
(SELECT x.ItemCODE,x.OpenStock FROM dbo.ProdDetails x) a
LEFT OUTER JOIN
(SELECT x.ProductID,ISNULL(CONVERT(varchar(50),x.Quantity),'') UnitsSold
FROM dbo.Pro_Sales x
GROUP BY x.ProductID,x.Quantity
) c
ON a.ItemCODE=c.ProductID
WHERE a.ItemCODE ='0005'
This is what the COALESCE() function is for:
SELECT a.ItemCODE, a.OpenStock, COALESCE(c.UnitsSold, 0) AS UnitsSold
FROM
(SELECT x.ItemCODE,x.OpenStock FROM dbo.ProdDetails x) a
LEFT OUTER JOIN
(SELECT x.ProductID,ISNULL(CONVERT(varchar(50),x.Quantity),'') UnitsSold
FROM dbo.Pro_Sales x
GROUP BY x.ProductID,x.Quantity
) c
ON a.ItemCODE=c.ProductID
WHERE a.ItemCODE ='0005'
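A pared-down sketch of the same idea using Python's sqlite3 module. It assumes units sold should be summed per product, which goes beyond the original query, and the sample rows are invented. COALESCE is applied outside the join, because that is where the NULL for a missing sales row appears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProdDetails (ItemCODE TEXT, OpenStock INTEGER);
    CREATE TABLE Pro_Sales (ProductID TEXT, Quantity INTEGER);
    INSERT INTO ProdDetails VALUES ('0005', 6), ('0006', 10);
    INSERT INTO Pro_Sales VALUES ('0006', 2), ('0006', 3);  -- no sales for 0005
""")

rows = conn.execute("""
    SELECT a.ItemCODE, a.OpenStock, COALESCE(c.UnitsSold, 0) AS UnitsSold
    FROM ProdDetails a
    LEFT JOIN (SELECT ProductID, SUM(Quantity) AS UnitsSold
               FROM Pro_Sales
               GROUP BY ProductID) c
           ON a.ItemCODE = c.ProductID
    ORDER BY a.ItemCODE
""").fetchall()

print(rows)  # [('0005', 6, 0), ('0006', 10, 5)]
```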

Replace no result

I have a query like this:
SELECT TV.Descrizione as TipoVers,
sum(ImportoVersamento) as ImpTot,
count(*) as N,
month(DataAllibramento) as Mese
FROM PROC_Versamento V
left outer join dbo.PROC_TipoVersamento TV
on V.IDTipoVersamento = TV.IDTipoVersamento
inner join dbo.PROC_PraticaRiscossione PR
on V.IDPraticaRiscossioneAssociata = PR.IDPratica
inner join dbo.DA_Avviso A
on PR.IDDatiAvviso = A.IDAvviso
where DataAllibramento between '2012-09-08' and '2012-09-17' and A.IDFornitura = 4
group by V.IDTipoVersamento,month(DataAllibramento),TV.Descrizione
order by V.IDTipoVersamento,month(DataAllibramento)
This query must always return something. If no result is produced a
0 0 0 0
row must be returned. How can I do this? Using ISNULL on every selected field isn't useful.
Use a derived table with one row and do an outer apply to your other table / query.
Here is a sample with a table variable @T in place of your real table.
declare @T table
(
ID int,
Grp int
)
select isnull(Q.MaxID, 0) as MaxID,
isnull(Q.C, 0) as C
from (select 1) as T(X)
outer apply (
-- Your query goes here
select max(ID) as MaxID,
count(*) as C
from @T
group by Grp
) as Q
order by Q.C -- order by goes to the outer query
That will make sure you always have at least one row in the output.
Something like this using your query.
select isnull(Q.TipoVers, '0') as TipoVers,
isnull(Q.ImpTot, 0) as ImpTot,
isnull(Q.N, 0) as N,
isnull(Q.Mese, 0) as Mese
from (select 1) as T(X)
outer apply (
SELECT TV.Descrizione as TipoVers,
sum(ImportoVersamento) as ImpTot,
count(*) as N,
month(DataAllibramento) as Mese,
V.IDTipoVersamento
FROM PROC_Versamento V
left outer join dbo.PROC_TipoVersamento TV
on V.IDTipoVersamento = TV.IDTipoVersamento
inner join dbo.PROC_PraticaRiscossione PR
on V.IDPraticaRiscossioneAssociata = PR.IDPratica
inner join dbo.DA_Avviso A
on PR.IDDatiAvviso = A.IDAvviso
where DataAllibramento between '2012-09-08' and '2012-09-17' and A.IDFornitura = 4
group by V.IDTipoVersamento,month(DataAllibramento),TV.Descrizione
) as Q
order by Q.IDTipoVersamento, Q.Mese
Use COALESCE. It returns the first non-null value. E.g.
SELECT COALESCE(TV.Desc, 0)...
Will return 0 if TV.DESC is NULL.
You can try:
with dat as (select TV.[Desc] as TipyDesc, sum(Import) as ToImp, count(*) as N, month(Date) as Mounth
from /*DATA SOURCE HERE*/ as TV
group by [Desc], month(Date))
select [TipyDesc], ToImp, N, Mounth from dat
union all
select '0', 0, 0, 0 where (select count (*) from dat)=0
That should do what you want...
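The fallback-row idea can be sketched with Python's sqlite3 module. The payments table, its columns, and the summary helper are hypothetical, and NOT EXISTS is used here in place of the count(*) comparison, which should be equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (kind TEXT, amount INTEGER, paid_on TEXT);
    INSERT INTO payments VALUES
        ('fee', 10, '2012-09-10'),
        ('fee', 5,  '2012-09-12'),
        ('tax', 7,  '2012-09-15');
""")

def summary(start, end):
    """Totals per kind in [start, end], or a single all-zero row if empty."""
    return conn.execute("""
        WITH dat AS (
            SELECT kind, SUM(amount) AS total, COUNT(*) AS n
            FROM payments
            WHERE paid_on BETWEEN ? AND ?
            GROUP BY kind
        )
        SELECT kind, total, n FROM dat
        UNION ALL
        -- The fallback row is emitted only when dat has no rows.
        SELECT '0', 0, 0 WHERE NOT EXISTS (SELECT 1 FROM dat)
        ORDER BY 1
    """, (start, end)).fetchall()

print(summary('2012-09-08', '2012-09-17'))
print(summary('2013-01-01', '2013-01-31'))
```

The first call returns the real totals, and the second returns only the placeholder row.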
If it's ok to include the "0 0 0 0" row in a result set that has data, you can use a union:
SELECT TV.Desc as TipyDesc,
sum(Import) as TotImp,
count(*) as N,
month(Date) as Mounth
...
UNION
SELECT
0,0,0,0
Depending on the database, you may need a FROM clause for the second SELECT. In Oracle, this would be "FROM DUAL". For MySQL, no FROM is necessary.

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.
SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.
Something like:
join (
select "EsnId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
This will select the latest row for each "EsnId" from "EsnsSalesOrderItems", ordered by the "createdAt" column from your example. You can use any column that allows you to define an order on the rows that suits you.
But remember that the concept of the "last row" is only valid if you specify an order for the rows. A table as such is not ordered, nor is the result of a query unless you specify an ORDER BY.
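A small sketch of the ROW_NUMBER approach using Python's sqlite3 module, assuming an SQLite build with window functions (3.25 or later). The esn_items table is a simplified, hypothetical stand-in for "EsnsSalesOrderItems", seeded with the dates from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE esn_items (esn_id INTEGER, item_id INTEGER, created_at TEXT);
    INSERT INTO esn_items VALUES
        (6, 100, '2012-06-19'),
        (6, 101, '2012-07-19'),
        (7, 102, '2012-05-01');
""")

# Number the rows of each esn_id from newest to oldest, then keep the newest.
latest = conn.execute("""
    SELECT esn_id, item_id, created_at
    FROM (
        SELECT esn_id, item_id, created_at,
               ROW_NUMBER() OVER (PARTITION BY esn_id
                                  ORDER BY created_at DESC) AS rn
        FROM esn_items
    ) t
    WHERE rn = 1
    ORDER BY esn_id
""").fetchall()

print(latest)  # [(6, 101, '2012-07-19'), (7, 102, '2012-05-01')]
```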
Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL-server, you would use cross-apply (or rather OUTER APPLY since we need a left join), which will invoke a table-valued function on each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(@in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(@in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE
Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...
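A sketch of the subquery-in-ON idea using Python's sqlite3 module, with an ORDER BY added inside the subquery so that "one row" means the newest row. The esns and esn_items tables are hypothetical simplifications of the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE esns (id INTEGER PRIMARY KEY);
    CREATE TABLE esn_items (id INTEGER PRIMARY KEY, esn_id INTEGER, created_at TEXT);
    INSERT INTO esns VALUES (6), (7), (8);
    INSERT INTO esn_items VALUES
        (1, 6, '2012-06-19'),
        (2, 6, '2012-07-19'),
        (3, 7, '2012-05-01');
""")

# The correlated subquery picks the id of the newest item for each esn.
rows = conn.execute("""
    SELECT e.id, i.id, i.created_at
    FROM esns e
    JOIN esn_items i ON i.id = (
        SELECT i2.id
        FROM esn_items i2
        WHERE i2.esn_id = e.id
        ORDER BY i2.created_at DESC
        LIMIT 1
    )
    ORDER BY e.id
""").fetchall()

print(rows)  # [(6, 2, '2012-07-19'), (7, 3, '2012-05-01')]
```

Because this is an inner join, esn 8 (which has no items) drops out. A LEFT JOIN would keep it with NULLs.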