SQL UNIONs that use the WITH statement - sql

I have to run this query on 20 databases to produce one unified report. I've done this before with a UNION. Is there any way to reuse the WITH statement for each sub-query? I am receiving an error saying that the previous statement must be terminated with a semicolon. Could there be a better way of doing this?
WITH
DUPS (DocNum)
AS (
SELECT DocNum
FROM Akron.dbo.PWZ3
INNER JOIN Akron.dbo.OPWZ T5 ON T5.IdNumber = PWZ3.IdEntry
WHERE T5.PmntDate = '3/10/2011'
GROUP BY DocNum
HAVING COUNT(1) > 1;
)
SELECT PWZ3.IdEntry, PWZ3.DocNum, 'ARK' + PWZ3.CardCode, QUOTENAME(CardName,'"'),
Convert(Decimal(10,2),PayAmount), Convert(Decimal(10,2),InvPayAmnt),
CONVERT(VARCHAR(10), T5.PmntDate,101), NumAtCard, PymMeth, ObjType
FROM Akron.dbo.PWZ3 PWZ3
INNER JOIN DELAWARE.dbo.OPWZ T5 ON T5.IdNumber = PWZ3.IdEntry
LEFT JOIN Dups ON DUPS.DocNum = PWZ3.DocNum
WHERE T5.PmntDate = '3/10/2011'
AND T5.Canceled = 'N'
AND Checked = 'Y'
AND Dups.DocNum is null

The WITH statement indicates that the query expressed within the AS clause is a Common Table Expression (CTE).
If you're asking if you can write more than one query that uses a CTE, then the answer is no. The closest approximation would be creating an inline table-valued function.
(The simple definition of an ordinary table-valued function versus an inline table-valued function is that the inline version is defined only as a query, much like a CTE; you can't do any procedural operations within the function, including declaring/assigning variables).

The following works for me:
WITH stuff AS (
... some select
)
SELECT some_col
FROM some_table
JOIN stuff ON ...
UNION ALL
SELECT other_col
FROM other_table
JOIN stuff ON ...
Tested on PostgreSQL and Oracle

Related

How to convert inline SQL queries to JOINS in SQL SERVER to reduce load time

I need help in optimizing this SQL query.
In the main SELECT statement there are three columns which is dependent on the outer query result. This is why my query is taking a long time to return data. I have tried making left joins but this is not working properly.
Can anyone help me to resolve this issue?
SELECT
DISTINCT ou.OrganizationUserID AS StudentID,
ou.FirstName,
ou.LastName,
(
SELECT
STRING_AGG(
(ug.UG_Name),
','
)
FROM
Groups ug
INNER JOIN ApplicantUserGroup augm ON augm.AUGM_UserGroupID = ug.UG_ID
WHERE
augm.AUGM_OrganizationUserID = ou.OrganizationUserID
AND ug.UG_IsDeleted = 0
AND augm.AUGM_IsDeleted = 0
) AS UserGroups,
order1.OrderNumber AS OrderId -- UAT-2455
,
(
SELECT
STRING_AGG(
(CActe.CustomAttribute),
','
)
FROM
CustomAttributeCte CActe
WHERE
CActe.HierarchyNodeID = dpm.DPM_ID
AND CActe.OrganizationUserID = ps.OrganizationUserID
) AS CustomAttributes -- UAT-2455
,
(
SELECT
STRING_AGG(
(CActe.CustomAttributeID),
','
)
FROM
CustomAttributeCte CActe
WHERE
CActe.HierarchyNodeID = dpm.DPM_ID
AND CActe.OrganizationUserID = ps.OrganizationUserID
) AS CustomAttributeID
FROM
ApplicantData acd WITH (NOLOCK)
INNER JOIN ClientPackage ps WITH (NOLOCK) ON acd.ClientSubscriptionID = ps.ClientSubscriptionID
INNER JOIN [ClientOrder] order1 WITH (NOLOCK) ON order1.OrderID = ps.OrderID
AND order1.IsDeleted = 0
INNER JOIN OUser ou WITH (NOLOCK) ON ou.OrganizationUserID = ps.OrganizationUserID
It looks like this query can be simplified, and the dependent subqueries in your SELECT clause removed, Consider your second and third dependent subqueries. You can refactor them into one nondependent subquery with a LEFT JOIN. Using nondependent subqueries is more efficient because the query planner can run them just once, rather than once for each row.
You want two STRING_AGG() results from the same table. This subquery gives those two outputs for every possible combination of HierarchyNodeID and OrganizationUserID values. STRING_AGG() is an aggregate function like SUM() and so works nicely with GROUP BY.
SELECT HierarchyNodeID, OrganizationUserID,
STRING_AGG((CActe.CustomAttribute), ',') CustomAttributes -- UAT-2455,
STRING_AGG((CActe.CustomAttributeID), ',') CustomAttributeIDs -- UAT-2455
FROM CustomAttributeCte CActe
GROUP BY HierarchyNodeID, OrganizationUserID
You can run this subquery itself to convince yourself it works.
Now, we can LEFT JOIN that into your query. Like this. (For readability I took out the NOLOCKs and used JOIN: it means the same thing as INNER JOIN.)
SELECT DISTINCT
ou.OrganizationUserID AS StudentID,
ou.FirstName,
ou.LastName,
'tempvalue' AS UserGroups, -- shortened for testing
order1.OrderNumber AS OrderId, -- UAT-2455
uat2455.CustomAttributes, -- UAT-2455
uat2455.CustomAttributeIDs -- UAT-2455
FROM ApplicantData acd
JOIN ClientPackage ps
ON acd.ClientSubscriptionID = ps.ClientSubscriptionID
JOIN ClientOrder order1
ON order1.OrderID = ps.OrderID
AND order1.IsDeleted = 0
JOIN OUser ou
ON ou.OrganizationUserID = ps.OrganizationUserID
LEFT JOIN (
SELECT HierarchyNodeID, OrganizationUserID,
STRING_AGG((CActe.CustomAttribute), ',') CustomAttributes -- UAT-2455,
STRING_AGG((CActe.CustomAttributeID), ',') CustomAttributeIDs -- UAT-2455
FROM CustomAttributeCte CActe
GROUP BY HierarchyNodeID, OrganizationUserID
) uat2455
ON uat2455.HierarchyNodeID = dpm.DPM_ID
AND uat2455.OrganizationUserId = ps.OrganizationUserID
See how we collapsed your second and third dependent subqueries to just one, then used it as a virtual table with LEFT JOIN? We transformed the WHERE clauses from the dependent subqueries into an ON clause.
You can test this: run it with TOP(50) and eyeball the results.
When you're happy, the next step is to transform your first dependent subquery the same way.
Pro tip Don't use WITH (NOLOCK), ever, unless a database administration expert tells you to after looking at your specific query. If your query's purpose is a historical report and you don't care whether the most recent transactions in your database are represented exactly right, you can precede your query with this statement. It also allows the query to run while avoiding locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Pro tip Be obsessive about formatting your queries for readability. You, your colleagues, and yourself a year from now must be able to read and reason about queries like this.

aggregate functions are not allowed in WHERE

I am using this query to find the unique records by latest date using postgresql. The error I am having is "aggregate functions are not allowed in WHERE". How to fix error “aggregate functions are not allowed in WHERE” Following this link I have tried to use inner select function. But this did not work. Please help me to edit the query. I am using PgAdmin III as client.
SELECT Distinct t1.pa_serial_
,t1.homeownerm_name
,t1.districtvdc
,t1.date as firstrancheinspection_date
,t1.status
,t1.name_of_data_collector
,t1.fulcrum_id
,first_tranche_inspection_v2_reporting_questionnaire.date_reporting
From first_tranche_inspection_v2 t1
LEFT JOIN first_tranche_inspection_v2_reporting_questionnaire ON (t1.fulcrum_id = first_tranche_inspection_v2_reporting_questionnaire.fulcrum_parent_id)
where first_tranche_inspection_v2_reporting_questionnaire.date_reporting = (
select Max(first_tranche_inspection_v2_reporting_questionnaire.date_reporting)
from first_tranche_inspection_v2
where first_tranche_inspection_v2.pa_serial_ = t1.pa_serial_
);
You want to join the latest reporting questionaire per inspection. In PostgreSQL you can use DISTINCT ON for this:
select fti.*, rq.*
from first_tranche_inspection_v2 fti
left join
(
select distinct on (fulcrum_parent_id) *
from first_tranche_inspection_v2_reporting_questionnaire
order by fulcrum_parent_id, date_reporting desc
) rq on rq.fulcrum_parent_id = fti.fulcrum_id;
Or use standard SQL's ROW_NUMBER:
select fti.*, rq.*
from first_tranche_inspection_v2 fti
left join
(
select
ftirq.*,
row_number() over (partition by fulcrum_parent_id order by date_reporting desc) as rn
from first_tranche_inspection_v2_reporting_questionnaire ftirq
) rq on rq.fulcrum_parent_id = fti.fulcrum_id and rq.rn = 1;
What you were trying to do should look like this:
select fti.*, rq.*
from first_tranche_inspection_v2 fti
left join first_tranche_inspection_v2_reporting_questionnaire rq
on rq.fulcrum_parent_id = fti.fulcrum_id
and (rq.fulcrum_parent_id, rq.date_reporting) in
(
select fulcrum_parent_id, max(date_reporting)
from first_tranche_inspection_v2_reporting_questionnaire
group by fulcrum_parent_id
);
This works, too, and only has the disadvantage that you read the table first_tranche_inspection_v2_reporting_questionnaire twice.
DISTINCT often ends up being implemented with a GROUP BY query in many RDBMS. What I think is happening in your current query is that there is already an implicit aggregation involving the columns in your SELECT. Hence, the correlated subquery involving MAX() actually is an aggregation because of the DISTINCT.
One quick workaround might be to perform the original query without DISTINCT, then subquery the result set to retain only distinct records:
WITH cte AS (
SELECT t1.pa_serial_,
t1.homeownerm_name,
t1.districtvdc,
t1.date as firstrancheinspection_date,
t1.status,
t1.name_of_data_collector,
t1.fulcrum_id,
t2.date_reporting
FROM first_tranche_inspection_v2 t1
LEFT JOIN first_tranche_inspection_v2_reporting_questionnaire t2
ON t1.fulcrum_id = t2.fulcrum_parent_id
WHERE t2.date_reporting = (SELECT MAX(t.date_reporting)
FROM first_tranche_inspection_v2 t
WHERE t.pa_serial_ = t1.pa_serial_)
);
SELECT DISTINCT t.pa_serial_,
t.homeownerm_name,
t.districtvdc,
t.firstrancheinspection_date,
t.status,
t.name_of_data_collector,
t.fulcrum_id,
t.date_reporting
FROM cte t
Note that I went ahead and added an alias to the second table in your join, which leaves the query much easier to read.

Oracle ORA:00904: Subquery in LEFT JOIN

I'm aware of the subquery limitations of Oracle's ANSI SQL setup. You can't use an identifier in a subquery that is declared more than one level deep.
I'm attempting the following query, which as far as I can see is only one level deep, but I'm getting this error. Does this not work for table joins? (I've truncated the procedure somewhat, but the problem should be clear. Additionally, if it means anything, I'm using FIRST_VALUE analytic functions in my select values. We're on 10g.)
The error:
Error(111,79): PL/SQL: ORA-00904: "VT"."MAIL_TO_ADDRESS_NUMBER": invalid identifier
The proc:
PROCEDURE MYPROCEDURE (
p_TransactionId IN NUMBER,
p_Cursor_Out OUT SYS_REFCURSOR
)
AS
BEGIN
OPEN p_Cursor_Out FOR
SELECT
...
FROM vehicle_transaction vt
INNER JOIN registration_transaction reg ON vt.transaction_id = reg.transaction_id
/* The problem is here */
LEFT OUTER JOIN (
SELECT
laddt2.address
FROM lien_address_transaction laddt2
WHERE vt.mail_to_address_number IS NOT NULL AND laddt2.address_number = vt.mail_to_address_number
) laddt
ON (laddt2.address_number = vt.mail_to_address_number)
WHERE vt.transaction_id = p_TransactionId;
END MYPROCEDURE;
You are trying to do a lateral join. You cannot use an external table alias in the from clause. In general, the work-around is to use aggregation:
SELECT
...
FROM vehicle_transaction vt INNER JOIN
registration_transaction reg
ON vt.transaction_id = reg.transaction_id LEFT OUTER JOIN
(SELECT laddt2.address_number, MIN(laddt2.address) as address
FROM lien_address_transaction laddt2
WHERE vt.mail_to_address_number IS NOT NULL AND
GROUP BY laddt2.address_number
) laddt
ON laddt.address_number = vt.mail_to_address_number
WHERE vt.transaction_id = p_TransactionId;

JOIN Nested DISTINCT

I am writing SQL statements with MS Access and I am wondering how I can nest a SELECT DISTINCT statement in my JOIN?
At the moment I am writing 2 queries to get an output (2 steps):
1st query is a simple DISTINCT statement.
2nd query is a JOIN on the query created in 1.
How can I nest a DISTINCT statement in the join in order to perform the action in a single step?
SELECT DISTINCT tblFinalIssuerNames_ReverseRepos.ISIN, tblFinalIssuerNames_ReverseRepos.IssuerCode
FROM tblFinalIssuerNames_ReverseRepos;
SELECT QSel_CollateralReposII.ISIN, QSel_CollateralReposII.MarketValueUSD, QSelDistinctISINs.IssuerCode
FROM QSel_CollateralReposII INNER JOIN QSelDistinctISINs ON QSel_CollateralReposII.ISIN = QSelDistinctISINs.ISIN;
I was thinking about something like the below but the syntax is wrong…
SELECT QSel_CollateralReposII.ISIN, QSel_CollateralReposII.MarketValueUSD, tblFinalIssuerNames_ReverseRepos.IssuerCode
FROM QSel_CollateralReposII INNER JOIN tblFinalIssuerNames_ReverseRepos ON SELECT DISTINCT QSel_CollateralReposII.ISIN = tblFinalIssuerNames_ReverseRepos.ISIN;
I can't test what is allowed and what is not in Access but you can replace QSelDistinctISINs in your query with the query that defines it, by just putting it inside parenthesis and giving it an alias - d below. This is valid SQL syntax:
SELECT
r.ISIN,
r.MarketValueUSD,
d.IssuerCode
FROM
QSel_CollateralReposII r
INNER JOIN
( SELECT DISTINCT
ISIN,
IssuerCode
FROM
tblFinalIssuerNames_ReverseRepos
) d
ON r.ISIN = d.ISIN ;

Passing result of requery as parameter to UDF

I need to execute a UDF within a query statement and its parameter depends on the current row in the larger query. I need to get a scalar from another table and pass that to the UDF however I get syntax errors if I try to use a query within the parameters of a UDF.
Example:
SELECT M.Col1
FROM MyTable M
WHERE M.RemoteID = UDFLookupRemoteID(SELECT W.Name
FROM WidgetNames W
WHERE W.Col2 = M.RemoteID)
The select within the UDF cannot be done elsewhere since it depends on the outer query.
What's the correct syntax for this?
I think this will give you what you need.
SELECT m.col1
FROM mytable m
INNER JOIN widgetnames w
ON w.col2 = m.remoteid
WHERE m.remoteid = Udflookupremoteid(w.name)
Here's an example I tested with the AdventureWorks database
SELECT pr.*
FROM production.productreview pr
INNER JOIN production.product p
ON p.productid = pr.productid
WHERE pr.rating < dbo.Ufngetstock(p.productid)