How to improve the performance of a query written with XMLAGG(XMLELEMENT())?

I am working with the following query to get the required result, but it takes a long time to run in the live environment. The query is as follows:
SELECT a.int_key,main_number,b.local_number,dia.diag_number,a.clm_key
FROM clamp a
INNER JOIN polins b
ON a.ins_id=b.ins_id
LEFT OUTER JOIN
(SELECT int_key, RTRIM(XMLAGG(XMLELEMENT(E,MD.VAL,',').EXTRACT('//text()')
ORDER BY int_key).GetClobVal(),',') diag_number
FROM dia D
INNER JOIN ms_dia MD
ON MD.dia_key=D.dia_code
GROUP BY int_key
) dia ON a.int_key =dia.int_key
WHERE a.int_key NOT IN
(SELECT int_key FROM table1
);
I think the subquery in the FROM clause, RTRIM(XMLAGG(XMLELEMENT(E,MD.VAL,',').EXTRACT('//text()') ORDER BY int_key).GetClobVal(),','), is what takes a long time to run. I already tried LISTAGG, but it raised the following error: ORA-01489: result of string concatenation is too long. Can anyone rewrite the above query so that it runs more efficiently?

Related

How to convert and optimize this SQL query in SqlKata

I have this SQL query and I need to convert it to SqlKata:
SELECT VVIAGGIO AS VIAGGIO, ISNULL(AEAN,'') AS EAN
FROM
SALDI V INNER JOIN ARTICOLI A ON VARTI=AARTI
INNER JOIN ORDI OT ON VSTAB=OTSTAB AND VMAGA=OTMAGA AND VAGG=OTRAGG
WHERE VORDI ='21'
AND VHOST ='68'
ORDER BY VPROG, AARTI
I don't know how to structure it because it uses ISNULL() and INNER JOIN.
I have already checked the SqlKata Select documentation.
Any suggestions on how to optimize this query and convert it to SqlKata?
Here is what you need:
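// Column names are taken from the original SQL; the table qualifiers assume
// that each column's leading letters (V, A, OT) identify its table.
var query = new Query("SALDI as V")
    .Join("ARTICOLI as A", "A.AARTI", "V.VARTI")
    .Join("ORDI as OT", j => j.On("OT.OTSTAB", "V.VSTAB")
        .On("OT.OTMAGA", "V.VMAGA")
        .On("OT.OTRAGG", "V.VAGG")
    )
    .Where("V.VORDI", "21")
    .Where("V.VHOST", "68")
    .Select("V.VVIAGGIO as VIAGGIO")
    .SelectRaw("ISNULL(A.AEAN,'') as EAN")
    .OrderBy("V.VPROG", "A.AARTI");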

How to execute a subquery in Oracle? My query gets a parentheses error

I'm trying to execute a query from LabVIEW so that I can read information stored in an Oracle database, but when I try to execute a query containing parentheses it doesn't work and gives me this error:
ADO Error: 0x80004005 Exception occured in Microsoft OLE DB Provider for ODBC Drivers: [Oracle][ODBC][Ora]ORA-00907: parêntese direito não encontrado (in English: missing right parenthesis)
Here is the SQL query I'm trying to execute:
SELECT
F.CODIGOFAIXAMODELO,
F.CODIGOMODELO,
F.INICIOESCALA,
F.FUNDOESCALA,
F.FAIXA,
F.DESCFAIXA,
F.ORDEM,
P.CODIGOPROCEDIMENTO
FROM FAIXAS F INNER JOIN PROCEDS P ON F.CODIGOFAIXAMODELO=(
SELECT
CODIGOFAIXAMODELO
FROM PROCEDS
WHERE
PROCEDS.CODIGOFAIXAMODELO=F.CODIGOFAIXAMODELO
LIMIT 1
)
WHERE
F.CODIGOMODELO='%CODIGOMODELO%'
ORDER BY F.ORDEM ASC;
LabVIEW replaces %CODIGOMODELO% with a value.
When I try the following query, it works:
SELECT
F.CODIGOFAIXAMODELO,
F.CODIGOMODELO,
F.INICIOESCALA,
F.FUNDOESCALA,
F.FAIXA,
F.DESCFAIXA,
F.ORDEM,
P.CODIGOPROCEDIMENTO
FROM FAIXAS F INNER JOIN PROCEDS P ON F.CODIGOFAIXAMODELO=P.CODIGOFAIXAMODELO
WHERE
F.CODIGOMODELO='%CODIGOMODELO%'
ORDER BY F.ORDEM ASC;
The problem with the second solution is that it returns many P.CODIGOPROCEDIMENTO values, and I want only one even when there are several.
There is no LIMIT clause in Oracle.
You need to use ROWNUM = 1 or, on Oracle 12c and later, OFFSET 0 ROWS FETCH NEXT 1 ROWS ONLY.
Also, as @APC stated, you shouldn't be joining your table on a subquery.
I would write it this way. Avoiding a subquery evaluated inside the join expression may be both more efficient and more readable:
SELECT
F.CODIGOFAIXAMODELO,
F.CODIGOMODELO,
F.INICIOESCALA,
F.FUNDOESCALA,
F.FAIXA,
F.DESCFAIXA,
F.ORDEM,
P.CODIGOPROCEDIMENTO
FROM FAIXAS F
INNER JOIN (
SELECT
P1.CODIGOFAIXAMODELO,
MAX(P1.CODIGOPROCEDIMENTO) AS CODIGOPROCEDIMENTO
FROM PROCEDS P1
GROUP BY P1.CODIGOFAIXAMODELO
) P ON P.CODIGOFAIXAMODELO = F.CODIGOFAIXAMODELO
WHERE F.CODIGOMODELO = '%CODIGOMODELO%'
ORDER BY F.ORDEM ASC;
MAX() is an aggregate function that returns a single value for each group specified in the GROUP BY clause. Therefore, the subquery yields one row per CODIGOFAIXAMODELO, and joining on that column ensures that at most one PROCEDS row is matched to each row of the main query.
The results really depend on the key structures, datatypes and how many rows are available in PROCEDS. There are of course other, more complex methods to achieve the same result, such as using Analytic Functions.
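To make the difference between the plain join and the MAX()/GROUP BY version concrete, here is a minimal sketch in Python using the standard-library sqlite3 module rather than Oracle. The rows are made up for illustration, and only the columns needed for the join are included.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FAIXAS  (CODIGOFAIXAMODELO INTEGER, ORDEM INTEGER);
    CREATE TABLE PROCEDS (CODIGOFAIXAMODELO INTEGER, CODIGOPROCEDIMENTO INTEGER);
    INSERT INTO FAIXAS  VALUES (1, 1), (2, 2);
    INSERT INTO PROCEDS VALUES (1, 10), (1, 11), (1, 12), (2, 20);
""")

# Plain join: one output row per matching PROCEDS row.
plain_join = conn.execute("""
    SELECT F.CODIGOFAIXAMODELO, P.CODIGOPROCEDIMENTO
    FROM FAIXAS F
    INNER JOIN PROCEDS P ON P.CODIGOFAIXAMODELO = F.CODIGOFAIXAMODELO
    ORDER BY F.ORDEM, P.CODIGOPROCEDIMENTO
""").fetchall()

# Derived table: PROCEDS is collapsed to one row per CODIGOFAIXAMODELO first.
one_per_group = conn.execute("""
    SELECT F.CODIGOFAIXAMODELO, P.CODIGOPROCEDIMENTO
    FROM FAIXAS F
    INNER JOIN (
        SELECT P1.CODIGOFAIXAMODELO,
               MAX(P1.CODIGOPROCEDIMENTO) AS CODIGOPROCEDIMENTO
        FROM PROCEDS P1
        GROUP BY P1.CODIGOFAIXAMODELO
    ) P ON P.CODIGOFAIXAMODELO = F.CODIGOFAIXAMODELO
    ORDER BY F.ORDEM
""").fetchall()

print(plain_join)     # four rows, three of them for CODIGOFAIXAMODELO 1
print(one_per_group)  # two rows, one per CODIGOFAIXAMODELO
```

Note that MAX() picks the highest CODIGOPROCEDIMENTO in each group, which is only appropriate if any single value per group is acceptable.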
I think you can write it this way:
SELECT
F.CODIGOFAIXAMODELO,
F.CODIGOMODELO,
F.INICIOESCALA,
F.FUNDOESCALA,
F.FAIXA,
F.DESCFAIXA,
F.ORDEM,
P.CODIGOPROCEDIMENTO
FROM FAIXAS F INNER JOIN PROCEDS P ON F.CODIGOFAIXAMODELO=P.CODIGOFAIXAMODELO and ROWNUM = 1
WHERE
F.CODIGOMODELO='%CODIGOMODELO%'
ORDER BY F.ORDEM ASC;

Query SQL Microsoft Access bug?

I've been working on an Access database with SQL. I was trying to perform the following query:
SELECT Produtos.produto,
[aux].[total]/[Produtos].[existencias] AS [peso consumos nas existencias]
FROM (SELECT Produtos.produto, SUM(Consumos.quantidade) AS total
FROM Consumos, Produtos, Fornecedores
WHERE Consumos.codproduto=Produtos.produto
AND Produtos.codfornecedor=9
GROUP BY Produtos.produto
ORDER BY Produtos.produto) AS aux
INNER JOIN Produtos
ON aux.produto = Produtos.produto
WHERE (((aux.produto)=[Produtos].[produto]));
A closer look at the results showed me that the column [peso consumos nas existencias] was multiplied by 10. While trying to fix this, I noticed that I was not using the table Fornecedores although I was listing it after the FROM keyword, so I removed it:
SELECT Produtos.produto,
[aux].[total]/[Produtos].[existencias] AS [peso consumos nas existencias]
FROM (SELECT Produtos.produto, SUM(Consumos.quantidade) AS total
FROM Consumos, Produtos
WHERE Consumos.codproduto=Produtos.produto
AND Produtos.codfornecedor=9
GROUP BY Produtos.produto
ORDER BY Produtos.produto) AS aux
INNER JOIN Produtos
ON aux.produto = Produtos.produto
WHERE (((aux.produto)=[Produtos].[produto]));
After running this, the results were right. Was this supposed to happen? If so, why?
Thanks!
Your Fornecedores table probably has 10 records.
FROM Consumos, Produtos, Fornecedores
WHERE Consumos.codproduto=Produtos.produto
was doing a Cartesian product of the Consumos-Produtos join with those 10 records, so the SUM() used each number 10 times.
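The effect is easy to reproduce. Below is a small sketch in Python with the standard-library sqlite3 module instead of Access; the quantities and the 10 Fornecedores rows are invented to match the guess above.

```python
import sqlite3

# In-memory SQLite database standing in for Access; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Produtos (produto INTEGER, codfornecedor INTEGER);
    CREATE TABLE Consumos (codproduto INTEGER, quantidade INTEGER);
    CREATE TABLE Fornecedores (codfornecedor INTEGER);
    INSERT INTO Produtos VALUES (1, 9);
    INSERT INTO Consumos VALUES (1, 5), (1, 7);
""")
# Ten suppliers, none of which is referenced in the WHERE clause below.
conn.executemany("INSERT INTO Fornecedores VALUES (?)",
                 [(n,) for n in range(1, 11)])

# Fornecedores is listed in FROM without a join condition, so every
# Consumos-Produtos row is repeated once per supplier.
with_extra_table = conn.execute("""
    SELECT SUM(Consumos.quantidade)
    FROM Consumos, Produtos, Fornecedores
    WHERE Consumos.codproduto = Produtos.produto
      AND Produtos.codfornecedor = 9
""").fetchone()[0]

without_extra_table = conn.execute("""
    SELECT SUM(Consumos.quantidade)
    FROM Consumos, Produtos
    WHERE Consumos.codproduto = Produtos.produto
      AND Produtos.codfornecedor = 9
""").fetchone()[0]

print(with_extra_table, without_extra_table)  # the first is 10 times the second
```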
Note 1:
It is considered better style to use the explicit INNER JOIN syntax:
FROM Consumos INNER JOIN Produtos
ON Consumos.codproduto=Produtos.produto
WHERE Produtos.codfornecedor=9
instead of FROM Consumos, Produtos
Note 2:
If you think you have found a bug in the Access (or any database) query engine, chances are almost 100% that the bug is in your query. ;-)

Timeout running SQL query

I'm trying to use the aggregation features of the Django ORM to run a query on an MSSQL 2008 R2 database, but I keep getting a timeout error. The query (generated by Django) that fails is below. I've tried running it directly in SQL Server Management Studio and it works, but takes 3.5 minutes.
It does look like it's aggregating over a bunch of fields that it doesn't need to, but I wouldn't have thought that would cause it to take that long. The database isn't that big either: auth_user has 9 records, tickets_ticket has 1210, and tickets_ticket_watchers has 1876. Is there something I'm missing?
SELECT
[auth_user].[id],
[auth_user].[password],
[auth_user].[last_login],
[auth_user].[is_superuser],
[auth_user].[username],
[auth_user].[first_name],
[auth_user].[last_name],
[auth_user].[email],
[auth_user].[is_staff],
[auth_user].[is_active],
[auth_user].[date_joined],
COUNT([tickets_ticket].[id]) AS [tickets_captured__count],
COUNT(T3.[id]) AS [assigned_tickets__count],
COUNT([tickets_ticket_watchers].[ticket_id]) AS [tickets_watched__count]
FROM
[auth_user]
LEFT OUTER JOIN [tickets_ticket] ON ([auth_user].[id] = [tickets_ticket].[capturer_id])
LEFT OUTER JOIN [tickets_ticket] T3 ON ([auth_user].[id] = T3.[responsible_id])
LEFT OUTER JOIN [tickets_ticket_watchers] ON ([auth_user].[id] = [tickets_ticket_watchers].[user_id])
GROUP BY
[auth_user].[id],
[auth_user].[password],
[auth_user].[last_login],
[auth_user].[is_superuser],
[auth_user].[username],
[auth_user].[first_name],
[auth_user].[last_name],
[auth_user].[email],
[auth_user].[is_staff],
[auth_user].[is_active],
[auth_user].[date_joined]
HAVING
(COUNT([tickets_ticket].[id]) > 0 OR COUNT(T3.[id]) > 0 )
EDIT:
Here are the relevant indexes (excluding those not used in the query):
auth_user.id (PK)
auth_user.username (Unique)
tickets_ticket.id (PK)
tickets_ticket.capturer_id
tickets_ticket.responsible_id
tickets_ticket_watchers.id (PK)
tickets_ticket_watchers.user_id
tickets_ticket_watchers.ticket_id
EDIT 2:
After a bit of experimentation, I've found that the following query is the smallest that results in the slow execution:
SELECT
COUNT([tickets_ticket].[id]) AS [tickets_captured__count],
COUNT(T3.[id]) AS [assigned_tickets__count],
COUNT([tickets_ticket_watchers].[ticket_id]) AS [tickets_watched__count]
FROM
[auth_user]
LEFT OUTER JOIN [tickets_ticket] ON ([auth_user].[id] = [tickets_ticket].[capturer_id])
LEFT OUTER JOIN [tickets_ticket] T3 ON ([auth_user].[id] = T3.[responsible_id])
LEFT OUTER JOIN [tickets_ticket_watchers] ON ([auth_user].[id] = [tickets_ticket_watchers].[user_id])
GROUP BY
[auth_user].[id]
The weird thing is that if I comment out any two lines in the above, it runs in less than 1s, and it doesn't seem to matter which lines I remove (although obviously I can't remove a join without also removing the relevant SELECT line).
EDIT 3:
The Python code which generated this is:
User.objects.annotate(
Count('tickets_captured'),
Count('assigned_tickets'),
Count('tickets_watched')
)
A look at the execution plan shows that SQL Server is first doing a cross join on all the tables, resulting in about 280 million rows and 6 GB of data. I assume that this is where the problem lies, but why is it happening?
SQL Server is doing exactly what it was asked to do. Unfortunately, Django is not generating the right query for what you want. It looks like you need to count distinct values instead of just counting; see "Django annotate() multiple times causes wrong answers".
As for why the query works that way: the query says to join the four tables together. So if a user has 2 captured tickets, 3 assigned tickets, and 4 watched tickets, the join returns 2*3*4 = 24 rows, one for each combination of tickets, and each plain COUNT sees all 24 of them. The distinct part removes the duplicates.
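Here is a small sketch of that 2*3*4 case, written in Python with the standard-library sqlite3 module rather than SQL Server. The schema is cut down to the columns used in the joins and the rows are invented.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER);
    CREATE TABLE tickets_ticket (id INTEGER, capturer_id INTEGER, responsible_id INTEGER);
    CREATE TABLE tickets_ticket_watchers (id INTEGER, user_id INTEGER, ticket_id INTEGER);
    INSERT INTO auth_user VALUES (1);
    -- user 1 captured tickets 1-2 and is responsible for tickets 3-5
    INSERT INTO tickets_ticket VALUES
        (1, 1, NULL), (2, 1, NULL),
        (3, NULL, 1), (4, NULL, 1), (5, NULL, 1);
    -- user 1 watches four tickets
    INSERT INTO tickets_ticket_watchers VALUES (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4);
""")

JOINS = """
    FROM auth_user
    LEFT OUTER JOIN tickets_ticket T1 ON auth_user.id = T1.capturer_id
    LEFT OUTER JOIN tickets_ticket T3 ON auth_user.id = T3.responsible_id
    LEFT OUTER JOIN tickets_ticket_watchers W ON auth_user.id = W.user_id
    GROUP BY auth_user.id
"""

# Plain COUNT counts every row of the multiplied join.
plain = conn.execute(
    "SELECT COUNT(T1.id), COUNT(T3.id), COUNT(W.ticket_id)" + JOINS
).fetchone()

# COUNT(DISTINCT ...) collapses the repeated values again.
distinct = conn.execute(
    "SELECT COUNT(DISTINCT T1.id), COUNT(DISTINCT T3.id), COUNT(DISTINCT W.ticket_id)" + JOINS
).fetchone()

print(plain)     # every count equals the size of the multiplied join
print(distinct)  # the per-relation counts
```

In Django, the corresponding option is the distinct argument of Count, for example Count('tickets_captured', distinct=True). This fixes the numbers, but the database still has to build the multiplied join, so it may not fix the timeout on its own.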
What about this?
SELECT auth_user.*,
C1.tickets_captured__count,
C2.assigned_tickets__count,
C3.tickets_watched__count
FROM
auth_user
LEFT JOIN
( SELECT capturer_id, COUNT(*) AS tickets_captured__count
FROM tickets_ticket GROUP BY capturer_id ) AS C1 ON auth_user.id = C1.capturer_id
LEFT JOIN
( SELECT responsible_id, COUNT(*) AS assigned_tickets__count
FROM tickets_ticket GROUP BY responsible_id ) AS C2 ON auth_user.id = C2.responsible_id
LEFT JOIN
( SELECT user_id, COUNT(*) AS tickets_watched__count
FROM tickets_ticket_watchers GROUP BY user_id ) AS C3 ON auth_user.id = C3.user_id
WHERE C1.tickets_captured__count > 0 OR C2.assigned_tickets__count > 0
--WHERE C1.tickets_captured__count is not null OR C2.assigned_tickets__count is not null -- also works (I think with better performance)
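
A quick way to check the pre-aggregated version is to run it against a tiny data set. The sketch below uses Python's standard-library sqlite3 module instead of SQL Server, with a cut-down schema and invented rows: user 1 has 2 captured, 3 assigned, and 4 watched tickets, and user 2 has none.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server; rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER);
    CREATE TABLE tickets_ticket (id INTEGER, capturer_id INTEGER, responsible_id INTEGER);
    CREATE TABLE tickets_ticket_watchers (id INTEGER, user_id INTEGER, ticket_id INTEGER);
    INSERT INTO auth_user VALUES (1), (2);
    INSERT INTO tickets_ticket VALUES
        (1, 1, NULL), (2, 1, NULL),
        (3, NULL, 1), (4, NULL, 1), (5, NULL, 1);
    INSERT INTO tickets_ticket_watchers VALUES (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 4);
""")

# Each relation is counted in its own derived table, so no join ever
# multiplies rows from one relation by rows from another.
rows = conn.execute("""
    SELECT auth_user.id,
           C1.tickets_captured__count,
           C2.assigned_tickets__count,
           C3.tickets_watched__count
    FROM auth_user
    LEFT JOIN (SELECT capturer_id, COUNT(*) AS tickets_captured__count
               FROM tickets_ticket GROUP BY capturer_id) AS C1
           ON auth_user.id = C1.capturer_id
    LEFT JOIN (SELECT responsible_id, COUNT(*) AS assigned_tickets__count
               FROM tickets_ticket GROUP BY responsible_id) AS C2
           ON auth_user.id = C2.responsible_id
    LEFT JOIN (SELECT user_id, COUNT(*) AS tickets_watched__count
               FROM tickets_ticket_watchers GROUP BY user_id) AS C3
           ON auth_user.id = C3.user_id
    WHERE C1.tickets_captured__count > 0 OR C2.assigned_tickets__count > 0
""").fetchall()

print(rows)  # only user 1 remains, with its three separate counts
```

User 2 is dropped because both of its counts are NULL after the LEFT JOINs, so the WHERE condition is not true for that row.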

What's wrong with this nested query?

I am trying to write a query to return the id of the latest version of a market index stored in a database.
SELECT miv.market_index_id market_index_id from ref_market_index_version miv
INNER JOIN ref_market_index mi ON miv.market_index_id = mi.id
WHERE mi.short_name='dow30'
AND miv.version_num = (SELECT MAX(m1.version_num) FROM ref_market_index_version m1 INNER JOIN ref_market_index m2 ON m1.market_index_id = m2.id )
The above SQL statement can be (roughly) translated into the form:
SELECT some columns FROM SOME CRITERIA MATCHED TABLES
WHERE mi.short_name='some name'
AND miv.version_num = SOME NUMBER
What I don't understand is that when I supply an actual number (instead of a subquery), the SQL statement works. When I test the subquery used to determine the latest version number on its own, that also works. However, when I use the result of the subquery in the outer query, it returns 0 rows. What am I doing wrong here?
Incidentally, I also tried an IN CLAUSE instead of the strict equality match i.e.
... AND miv.version_num IN (SUB QUERY)
That also resulted in 0 rows, although as before, when running the parent query with a hard coded version number, I get 1 row returned (as expected).
BTW I am using PostgreSQL, but I would prefer the solution to be database-agnostic.
The problem is probably that the subquery returns MAX(version_num) across all indexes, and 'dow30' has no version with that number.
Try restricting the subquery to the same index:
SELECT miv.market_index_id market_index_id
from ref_market_index_version miv INNER JOIN
ref_market_index mi
ON miv.market_index_id = mi.id
WHERE mi.short_name='dow30' AND
miv.version_num = (SELECT MAX(m1.version_num)
FROM ref_market_index_version m1 INNER JOIN
ref_market_index m2
ON m1.market_index_id = m2.id
where m2.short_name = 'dow30'
)
I added the where clause in the subquery.
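To see why the unfiltered subquery can return 0 rows, here is a minimal sketch in Python with the standard-library sqlite3 module rather than PostgreSQL. It assumes short_name lives on ref_market_index, as in the question's outer query; the second index 'sp500' and all version numbers are invented for illustration.

```python
import sqlite3

# In-memory SQLite database; 'sp500' and the version numbers are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ref_market_index (id INTEGER PRIMARY KEY, short_name TEXT);
    CREATE TABLE ref_market_index_version (market_index_id INTEGER, version_num INTEGER);
    INSERT INTO ref_market_index VALUES (1, 'dow30'), (2, 'sp500');
    -- dow30 has versions 1-2, sp500 has versions 1-3
    INSERT INTO ref_market_index_version VALUES (1, 1), (1, 2), (2, 1), (2, 2), (2, 3);
""")

QUERY = """
    SELECT miv.market_index_id
    FROM ref_market_index_version miv
    INNER JOIN ref_market_index mi ON miv.market_index_id = mi.id
    WHERE mi.short_name = 'dow30'
      AND miv.version_num = (
          SELECT MAX(m1.version_num)
          FROM ref_market_index_version m1
          INNER JOIN ref_market_index m2 ON m1.market_index_id = m2.id
          {subquery_filter}
      )
"""

# Without a filter the subquery returns the highest version of ANY index,
# which here belongs to sp500, so no dow30 row matches.
unfiltered = conn.execute(QUERY.format(subquery_filter="")).fetchall()

# With the filter the subquery returns the highest dow30 version.
filtered = conn.execute(
    QUERY.format(subquery_filter="WHERE m2.short_name = 'dow30'")
).fetchall()

print(unfiltered)  # empty
print(filtered)    # one row for dow30
```

Both forms are plain standard SQL, so the same behavior should be expected in PostgreSQL.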