Could use advice in making my query smarter


Hi guys, I have the following query, but the UNIONs make it quite heavy. Could anyone help me improve it?
There are 3 scenarios:
1. pack_no = pack of an item (inside packitem)
2. item = item inside pack (inside packitem)
3. item = doesn't have pack (inside item_master)
SELECT DISTINCT item, loc FROM
(SELECT e.pack_no item, g.store loc
FROM dc_store_ranging a
JOIN store g
ON g.store_name_secondary = CAST(a.loc AS VARCHAR2(150 BYTE)) AND
g.store_close_date >= SYSDATE
LEFT JOIN dc_pim_export_vert b
ON a.dpac = b.dpac AND b.artikel_type_LMS NOT IN ('S','V')
LEFT JOIN dc_ccn190_sid_vtb c ON a.dpac = c.dpac
JOIN item_master d
ON (b.item = d.item OR c.item = d.item) AND d.status = 'A'
LEFT JOIN packitem e
ON (b.item = e.pack_no or c.item = e.pack_no) AND d.item = e.pack_no
WHERE d.item NOT IN
(SELECT f.item
FROM item_attributes f
WHERE f.sh_store_order_unit = 'N' AND f.sh_trade_unit = 'Y')
UNION
SELECT e.item, g.store loc
FROM dc_store_ranging a
JOIN store g
ON g.store_name_secondary = CAST(a.loc AS VARCHAR2(150 BYTE)) AND
g.store_close_date >= SYSDATE
LEFT JOIN dc_pim_export_vert b
ON a.dpac = b.dpac AND b.artikel_type_LMS NOT IN ('S','V')
LEFT JOIN dc_ccn190_sid_vtb c ON a.dpac = c.dpac
JOIN item_master d
ON (b.item = d.item OR c.item = d.item) AND d.status = 'A'
LEFT JOIN packitem e
ON (b.item = e.pack_no or c.item = e.pack_no)
WHERE e.item NOT IN
(SELECT f.item
FROM item_attributes f
WHERE f.sh_store_order_unit = 'N' AND f.sh_trade_unit = 'Y')
UNION
SELECT d.item, g.store loc
FROM dc_store_ranging a
JOIN store g
ON g.store_name_secondary = CAST(a.loc AS VARCHAR2(150 BYTE)) AND
g.store_close_date >= SYSDATE
LEFT JOIN dc_pim_export_vert b
ON a.dpac = b.dpac AND b.artikel_type_LMS NOT IN ('S','V')
LEFT JOIN dc_ccn190_sid_vtb c ON a.dpac = c.dpac
JOIN item_master d
ON (b.item = d.item OR c.item = d.item) AND d.status = 'A'
WHERE d.item NOT IN
(SELECT f.item
FROM item_attributes f
WHERE f.sh_store_order_unit = 'N' and f.sh_trade_unit = 'Y')
);

The simplest way to improve the performance of the query would be to change the UNIONs to UNION ALLs; that way, duplicates are eliminated only once, by the outer SELECT DISTINCT.
However, it should be possible to simplify this query to:
WITH CTE AS
(SELECT d.item d_item, e.item e_item, e.pack_no e_pack_no, g.store loc
FROM dc_store_ranging a
JOIN store g
ON g.store_name_secondary = CAST(a.loc AS VARCHAR2(150 BYTE)) AND
g.store_close_date >= SYSDATE
LEFT JOIN dc_pim_export_vert b
ON a.dpac = b.dpac AND b.artikel_type_LMS NOT IN ('S','V')
LEFT JOIN dc_ccn190_sid_vtb c ON a.dpac = c.dpac
JOIN item_master d
ON (b.item = d.item OR c.item = d.item) AND d.status = 'A'
LEFT JOIN packitem e
ON (b.item = e.pack_no or c.item = e.pack_no)
)
SELECT DISTINCT item, loc FROM
(--SELECT e_pack_no item, loc FROM CTE WHERE d_item = e_pack_no UNION ALL -- this select is a subset of the third select
SELECT e_item item, loc FROM CTE UNION ALL
SELECT d_item item, loc FROM CTE) uc
WHERE uc.item NOT IN
(SELECT f.item
FROM item_attributes f
WHERE f.sh_store_order_unit = 'N' and f.sh_trade_unit = 'Y')
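The UNION versus UNION ALL point can be checked on a toy data set. The following is a minimal sketch using Python's sqlite3 module rather than Oracle; the tables branch_a and branch_b are hypothetical stand-ins for the branches of the original query.

```python
import sqlite3

# Hypothetical tables standing in for two branches of the original query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branch_a (item TEXT, loc INTEGER);
    CREATE TABLE branch_b (item TEXT, loc INTEGER);
    INSERT INTO branch_a VALUES ('P1', 10), ('P1', 10), ('I1', 10);
    INSERT INTO branch_b VALUES ('I1', 10), ('I2', 20);
""")

# UNION removes duplicates while combining the branches.
with_union = conn.execute("""
    SELECT item, loc FROM branch_a
    UNION
    SELECT item, loc FROM branch_b
    ORDER BY item, loc
""").fetchall()

# UNION ALL keeps every row; the outer DISTINCT deduplicates once at the end.
with_union_all = conn.execute("""
    SELECT DISTINCT item, loc FROM (
        SELECT item, loc FROM branch_a
        UNION ALL
        SELECT item, loc FROM branch_b
    )
    ORDER BY item, loc
""").fetchall()

print(with_union)
print(with_union_all)
```

Both statements return the same rows; whether the UNION ALL form is actually faster on the real tables is something only the execution plan can confirm.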

Related

Counting all records from inner-join

I have 4 tables that each have an aantal (count) column. The values are either normal numbers or negative numbers (for example, -20), and I need to add up all of those records, but I don't know how to do that.
Sorry, I am a noob in SQL.
The problem I am facing is that the records from the different tables that have an aantal column are not all included in the total.
Example:
CSSDKMagento_10_Plankvoorraad returns 10
CSSDKMagento_20_GeenAllocatieWelFiat returns -3 and -2
CSSDKMagento_30_AllocatieVoorraad returns 5
CSSDKMagento_50_AllocatieBestellingBinnen returns -1 and -1
That means the value I expect for Voorraad is 8.
I tried COUNT(*), but that is not the solution. What is the best way to do this?
This is my code:
SELECT
i.ItemCode,
g.warehouse,
SUM(g.aantal) AS Voorraad,
MAX(CASE
WHEN g.transtype = 'N' THEN g.sysmodified
ELSE NULL
END) AS LastDate
FROM dbo.CSSDKMagento_10_Plankvoorraad AS g
INNER JOIN dbo.Items AS i
ON (g.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_20_GeenAllocatieWelFiat AS a
ON (a.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_30_AllocatieVoorraad AS v
ON (v.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_50_AllocatieBestellingBinnen AS b
ON (b.artcode = i.ItemCode)
WHERE
i.itemcode = 'TEST'
GROUP BY i.itemcode,
g.warehouse;

Try this
SELECT
i.ItemCode,
g.warehouse,
SUM(g.aantal)+SUM(a.aantal)+SUM(v.aantal)+SUM(b.aantal) AS Voorraad,
MAX(CASE
WHEN g.transtype = 'N' THEN g.sysmodified
ELSE NULL
END) AS LastDate
FROM dbo.CSSDKMagento_10_Plankvoorraad AS g
INNER JOIN dbo.Items AS i
ON (g.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_20_GeenAllocatieWelFiat AS a
ON (a.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_30_AllocatieVoorraad AS v
ON (v.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_50_AllocatieBestellingBinnen AS b
ON (b.artcode = i.ItemCode)
WHERE
i.itemcode = 'TEST'
GROUP BY i.itemcode,
g.warehouse;

Edited:
SELECT SUM(Voorraad) FROM (
SELECT
i.ItemCode,
g.warehouse,
g.aantal AS Voorraad,
MAX(CASE
WHEN g.transtype = 'N' THEN g.sysmodified
ELSE NULL
END) AS LastDate
FROM dbo.CSSDKMagento_10_Plankvoorraad AS g
INNER JOIN dbo.Items AS i
ON (g.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_20_GeenAllocatieWelFiat AS a
ON (a.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_30_AllocatieVoorraad AS v
ON (v.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_50_AllocatieBestellingBinnen AS b
ON (b.artcode = i.ItemCode)
WHERE
i.itemcode = 'TEST'
GROUP BY i.itemcode,
g.warehouse
)src

It looks like the following would work:
WITH itemCodesScope AS
(
SELECT 'TEST' as target_ItemCode
),
aggregated_CSSDKMagento_10_Plankvoorraad AS
(
SELECT g.artcode,
g.warehouse,
COUNT(g.aantal) as count_aantal,
SUM(g.aantal) as sum_aantal,
MAX(CASE
WHEN g.transtype = 'N' THEN g.sysmodified
ELSE NULL
END) AS LastDate
FROM dbo.CSSDKMagento_10_Plankvoorraad AS g
JOIN itemCodesScope ON itemCodesScope.target_itemCode = g.artcode
GROUP BY
g.artcode,
g.warehouse
),
aggregated_CSSDKMagento_20_GeenAllocatieWelFiat AS
(
SELECT a.artcode ,
COUNT(a.aantal) as count_aantal,
SUM(a.aantal) as sum_aantal
FROM dbo.CSSDKMagento_20_GeenAllocatieWelFiat AS a
JOIN itemCodesScope ON itemCodesScope.target_itemCode = a.artcode
GROUP BY
a.artcode
),
aggregated_CSSDKMagento_30_AllocatieVoorraad AS
(
SELECT v.artcode ,
COUNT(v.aantal) as count_aantal,
SUM(v.aantal) as sum_aantal
FROM dbo.CSSDKMagento_30_AllocatieVoorraad AS v
JOIN itemCodesScope ON itemCodesScope.target_itemCode = v.artcode
GROUP BY
v.artcode
),
aggregated_CSSDKMagento_50_AllocatieBestellingBinnen AS
(
SELECT b.artcode ,
COUNT(b.aantal) as count_aantal,
SUM(b.aantal) as sum_aantal
FROM dbo.CSSDKMagento_50_AllocatieBestellingBinnen AS b
JOIN itemCodesScope ON itemCodesScope.target_itemCode = b.artcode
GROUP BY
b.artcode
)
SELECT
g.artcode as ItemCode,
g.warehouse,
g.sum_aantal AS Voorraad,
g.LastDate AS LastDate,
g.sum_aantal + ISNULL(a.sum_aantal, 0) + ISNULL(v.sum_aantal, 0) + ISNULL(b.sum_aantal, 0) as sum_aantal,
g.count_aantal + ISNULL(a.count_aantal, 0) + ISNULL(v.count_aantal, 0) + ISNULL(b.count_aantal, 0) as count_aantal
FROM aggregated_CSSDKMagento_10_Plankvoorraad AS g
INNER JOIN dbo.Items AS i
ON (g.artcode = i.ItemCode)
INNER JOIN itemCodesScope
ON itemCodesScope.target_itemCode = i.ItemCode
LEFT JOIN aggregated_CSSDKMagento_20_GeenAllocatieWelFiat AS a
ON (a.artcode = i.ItemCode)
LEFT JOIN aggregated_CSSDKMagento_30_AllocatieVoorraad AS v
ON (v.artcode = i.ItemCode)
LEFT JOIN aggregated_CSSDKMagento_50_AllocatieBestellingBinnen AS b
ON (b.artcode = i.ItemCode)
Explanation
A join returns one row for every combination of rows that satisfies its ON condition, which is most probably what led to the unexpected results in the initial query. Here there are 4 'amount' tables connected via joins with conditions like 'ON (b.artcode = i.ItemCode)', so if any of those tables contains several records per ItemCode, the output will contain several records per ItemCode.
Let's say there are 9 records in the a-table for a single i.ItemCode, so there is a one-to-many relation, and 1 record in the b-table for that ItemCode. The output of the join will have 9 a.aantal records, but also the single b.aantal record repeated 9 times. Because of the aggregation (GROUP BY), SUM(b.aantal) in the joined query will be 9 times larger than SUM(b.aantal) in a standalone query on the b-table only.
This multiplication of rows is easier to see if you run the initial query without aggregation:
SELECT *
FROM dbo.CSSDKMagento_10_Plankvoorraad AS g
INNER JOIN dbo.Items AS i
ON (g.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_20_GeenAllocatieWelFiat AS a
ON (a.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_30_AllocatieVoorraad AS v
ON (v.artcode = i.ItemCode)
INNER JOIN dbo.CSSDKMagento_50_AllocatieBestellingBinnen AS b
ON (b.artcode = i.ItemCode)
WHERE
i.itemcode = 'TEST'
The fix was to do the GROUP BY aggregation before making the joins. The most convenient way to do this, imho, is with CTEs. With CTEs I created 4 temporary result sets with aggregates per ItemCode (per ItemCode and warehouse for the first one), so the allocation result sets are one-to-one per ItemCode. Joins on one-to-one relations then produce just a single output row per ItemCode and warehouse.
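The row multiplication described above can be reproduced with two tiny tables. This is a minimal sketch in Python's sqlite3, not SQL Server; the table names stock and alloc are hypothetical, and the values are taken from the example in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (artcode TEXT, aantal INTEGER);  -- one row per item
    CREATE TABLE alloc (artcode TEXT, aantal INTEGER);  -- several rows per item
    INSERT INTO stock VALUES ('TEST', 10);
    INSERT INTO alloc VALUES ('TEST', -3), ('TEST', -2);
""")

# Joining first repeats the stock row once per alloc row,
# so SUM(s.aantal) counts the value 10 twice.
naive = conn.execute("""
    SELECT SUM(s.aantal), SUM(a.aantal)
    FROM stock s
    JOIN alloc a ON a.artcode = s.artcode
""").fetchone()

# Aggregating each table first keeps the join one-to-one.
fixed = conn.execute("""
    WITH s AS (SELECT artcode, SUM(aantal) AS total FROM stock GROUP BY artcode),
         a AS (SELECT artcode, SUM(aantal) AS total FROM alloc GROUP BY artcode)
    SELECT s.total + COALESCE(a.total, 0)
    FROM s
    LEFT JOIN a ON a.artcode = s.artcode
""").fetchone()

print(naive)  # stock total is doubled by the join
print(fixed)  # 10 + (-3) + (-2)
```

The LEFT JOIN plus COALESCE mirrors the ISNULL(..., 0) handling in the answer, so an item with no allocation rows still gets a total.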

SQL JOIN between A, B and C mixing full and left join

I'd like to know if there is a better way to create my left join in the below example:
SELECT TOP 10 COALESCE(A.COD_PRODUCT, B.COD_PRODUCT),
COALESCE(A.COD_FAMILY, B.COD_FAMILY),
COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT),
A.MASS,
B.VOLUME,
C.PRICE
FROM FIRSTTABLE A FULL JOIN SECONDTABLE B ON B.COD_PRODUCT = A.COD_PRODUCT
AND B.COD_FAMILY = A.COD_FAMILY
AND B.DATE_EXTRACT = A.DATE_EXTRACT
LEFT JOIN THIRDTABLE C ON C.COD_PRODUCT = COALESCE(A.COD_PRODUCT,B.COD_PRODUCT)
AND C.COD_FAMILY = COALESCE(A.COD_FAMILY, B.COD_FAMILY)
AND C.DATE_EXTRACT = COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT)
That kind of join takes a long time, and I suspect it is highly expensive and could be improved.
EDIT: I'd like to improve this SELECT ... FROM ... JOIN statement in a view.

You could split the query into two: gather all data matching FIRSTTABLE. And then union it with all data matching SECONDTABLE that is not in FIRSTTABLE.
That should allow SQL Server to use the indexes on these tables better.
SELECT A.COD_PRODUCT,
A.COD_FAMILY,
A.DATE_EXTRACT,
A.MASS,
B.VOLUME,
C.PRICE
FROM FIRSTTABLE A
LEFT OUTER JOIN SECONDTABLE B
ON B.COD_PRODUCT = A.COD_PRODUCT
AND B.COD_FAMILY = A.COD_FAMILY
AND B.DATE_EXTRACT = A.DATE_EXTRACT
LEFT OUTER JOIN THIRDTABLE C
ON C.COD_PRODUCT = A.COD_PRODUCT
AND C.COD_FAMILY = A.COD_FAMILY
AND C.DATE_EXTRACT = A.DATE_EXTRACT
UNION ALL
SELECT B.COD_PRODUCT,
B.COD_FAMILY,
B.DATE_EXTRACT,
NULL AS MASS,
B.VOLUME,
C.PRICE
FROM SECONDTABLE B
LEFT OUTER JOIN THIRDTABLE C
ON C.COD_PRODUCT = B.COD_PRODUCT
AND C.COD_FAMILY = B.COD_FAMILY
AND C.DATE_EXTRACT = B.DATE_EXTRACT
WHERE NOT EXISTS (SELECT 1
FROM FIRSTTABLE A
WHERE A.COD_PRODUCT = B.COD_PRODUCT
AND A.COD_FAMILY = B.COD_FAMILY
AND A.DATE_EXTRACT = B.DATE_EXTRACT)
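The split above (LEFT JOIN for everything in the first table, then UNION ALL with an anti-join for rows found only in the second) can be tried on a toy example. This is a minimal sketch in Python's sqlite3, assuming a single hypothetical key column cod instead of the three-column key, and hypothetical table names first_t and second_t.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_t  (cod TEXT, mass REAL);
    CREATE TABLE second_t (cod TEXT, volume REAL);
    INSERT INTO first_t  VALUES ('A', 1.0), ('B', 2.0);
    INSERT INTO second_t VALUES ('B', 20.0), ('C', 30.0);
""")

rows = conn.execute("""
    -- Part 1: every row of first_t, with its match in second_t if any.
    SELECT a.cod, a.mass, b.volume
    FROM first_t a
    LEFT JOIN second_t b ON b.cod = a.cod
    UNION ALL
    -- Part 2: rows of second_t that have no partner in first_t.
    SELECT b.cod, NULL, b.volume
    FROM second_t b
    WHERE NOT EXISTS (SELECT 1 FROM first_t a WHERE a.cod = b.cod)
    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

The result has one row per key from either table, which is what the FULL JOIN with COALESCE produced, but every join condition is now a plain equality on a single table's columns.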

Compare the execution plan with this replacement join:
LEFT JOIN THIRDTABLE C
ON (C.COD_PRODUCT = A.COD_PRODUCT or C.COD_PRODUCT = B.COD_PRODUCT)
AND (C.COD_FAMILY = A.COD_FAMILY or C.COD_FAMILY = B.COD_FAMILY)
AND (C.DATE_EXTRACT =A.DATE_EXTRACT or C.DATE_EXTRACT = B.DATE_EXTRACT)

You might be able to achieve the same result with a UNION or UNION ALL:
WITH cte AS (
SELECT
A.COD_PRODUCT,
A.COD_FAMILY,
A.DATE_EXTRACT,
A.MASS,
NULL as VOLUME
FROM
FIRSTTABLE A
UNION
SELECT
B.COD_PRODUCT,
B.COD_FAMILY,
B.DATE_EXTRACT,
NULL,
B.VOLUME
FROM
SECONDTABLE B
)
SELECT
*
FROM
cte AB
LEFT JOIN
THIRDTABLE C ON C.COD_PRODUCT = AB.COD_PRODUCT
AND C.COD_FAMILY = AB.COD_FAMILY
AND C.DATE_EXTRACT = AB.DATE_EXTRACT
If you can have both Mass and Volume for the same product, family, and extract-date combination, you can use an aggregate to merge the rows from A and B together:
WITH cte AS (
SELECT
COD_PRODUCT,
COD_FAMILY,
DATE_EXTRACT,
MAX(MASS) MASS,
MAX(VOLUME) VOLUME
FROM (
SELECT
A.COD_PRODUCT,
A.COD_FAMILY,
A.DATE_EXTRACT,
A.MASS,
NULL as VOLUME
FROM
FIRSTTABLE A
UNION ALL
SELECT
B.COD_PRODUCT,
B.COD_FAMILY,
B.DATE_EXTRACT,
NULL,
B.VOLUME
FROM
SECONDTABLE B
) T
GROUP BY
COD_PRODUCT,
COD_FAMILY,
DATE_EXTRACT
)
SELECT
*
FROM
cte AB
LEFT JOIN
THIRDTABLE C ON C.COD_PRODUCT = AB.COD_PRODUCT
AND C.COD_FAMILY = AB.COD_FAMILY
AND C.DATE_EXTRACT = AB.DATE_EXTRACT

You can try this:
--- isolate the full join data from a and b into a temp table
SELECT
COD_PRODUCT= COALESCE(A.COD_PRODUCT, B.COD_PRODUCT),
COD_FAMILY= COALESCE(A.COD_FAMILY, B.COD_FAMILY),
DATE_EXTRACT= COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT),
MASS= A.MASS,
VOLUME= B.VOLUME
INTO #TEMP
FROM FIRSTTABLE A
FULL JOIN SECONDTABLE B
ON B.COD_PRODUCT = A.COD_PRODUCT
AND B.COD_FAMILY = A.COD_FAMILY
AND B.DATE_EXTRACT = A.DATE_EXTRACT
-- add index clustered onto table (covering index)
CREATE CLUSTERED INDEX ix_tempCIndex ON #Temp ([COD_PRODUCT],[COD_FAMILY],[DATE_EXTRACT],[MASS],[VOLUME]);
-- left join C to this temp table
SELECT TOP 10
T.*, C.PRICE
FROM #TEMP T
LEFT JOIN THIRDTABLE C
ON C.COD_PRODUCT = T.COD_PRODUCT
AND C.COD_FAMILY = T.COD_FAMILY
AND C.DATE_EXTRACT = T.DATE_EXTRACT
-- drop temp table
DROP TABLE #TEMP

A COALESCE or an OR inside a join condition is likely to slow down the query a lot. Depending on the size of the tables, a better answer might be to join to table C twice: once on table A and again on table B, and then COALESCE in your SELECT clause.
SELECT TOP 10 COALESCE(A.COD_PRODUCT, B.COD_PRODUCT,c.COD_PRODUCT,ca.COD_PRODUCT),
COALESCE(A.COD_FAMILY, B.COD_FAMILY,c.COD_FAMILY,ca.COD_FAMILY),
COALESCE(A.DATE_EXTRACT, B.DATE_EXTRACT,c.DATE_EXTRACT,ca.DATE_EXTRACT),
A.MASS,
B.VOLUME,
COALESCE(C.PRICE, Ca.PRICE)
FROM FIRSTTABLE A
FULL JOIN SECONDTABLE B
ON B.COD_PRODUCT = A.COD_PRODUCT
AND B.COD_FAMILY = A.COD_FAMILY
AND B.DATE_EXTRACT = A.DATE_EXTRACT
LEFT JOIN THIRDTABLE C
ON C.COD_PRODUCT = A.COD_PRODUCT
AND C.COD_FAMILY = A.COD_FAMILY
AND C.DATE_EXTRACT = A.DATE_EXTRACT
LEFT JOIN THIRDTABLE Ca
ON Ca.COD_PRODUCT = b.COD_PRODUCT
AND Ca.COD_FAMILY = b.COD_FAMILY
AND Ca.DATE_EXTRACT = b.DATE_EXTRACT

How to union aggregated queries?

I have a query
SELECT
CntApp = COUNT(app.ApplicationID)
,r.RegionName
,d.DistrictName
FROM dim.Application app
JOIN dim.Geography g ON (app.ApplicationID = g.GeographyID)
AND (app.CountryId = g.CountryId)
JOIN dim.Region r ON r.RegionID = g.RegionID
JOIN dim.District d ON d.DistrictId = g.DistrictID
JOIN dim.ZIPcode z ON g.ZIPcodeID = z.ZIPcodeID
GROUP BY
r.RegionName
,d.DistrictName
and
SELECT
CntCon = COUNT(c.ContractID)
,r.RegionName
,d.DistrictName
FROM dim.Contract c
JOIN dim.Geography g ON (c.ContractID = g.GeographyID)
AND (c.CountryId = g.CountryId)
JOIN dim.Region r ON r.RegionID = g.RegionID
JOIN dim.District d ON d.DistrictId = g.DistrictID
JOIN dim.ZIPcode z ON g.ZIPcodeID = z.ZIPcodeID
GROUP BY
r.RegionName
,d.DistrictName
which I want to merge into one table, so the group by still works.
The result I want to get:
CntApp | CntCon | RegionName | DistrictName
31 24 Pardubicky Pardubice
21 16 Pardubicky Chrudim
...
I've tried UNION ALL but got something like this instead:
CntApp | CntCon | RegionName | DistrictName
NULL 24 Pardubicky Pardubice
21 NULL Pardubicky Pardubice
26 NULL Pardubicky Chrudim
...

You need to join the 2 subqueries. This way you will get the columns of both queries side by side, as you expect.
This should work:
SELECT iq1.CntApp, iq2.CntCon, iq1.RegionName, iq1.DistrictName
FROM
(
SELECT
CntApp = COUNT(app.ApplicationID)
,r.RegionName
,d.DistrictName
FROM dim.Application app
JOIN dim.Geography g ON (app.ApplicationID = g.GeographyID)
AND (app.CountryId = g.CountryId)
JOIN dim.Region r ON r.RegionID = g.RegionID
JOIN dim.District d ON d.DistrictId = g.DistrictID
JOIN dim.ZIPcode z ON g.ZIPcodeID = z.ZIPcodeID
GROUP BY
r.RegionName
,d.DistrictName
) iq1
inner join
(
SELECT
CntCon = COUNT(c.ContractID)
,r.RegionName
,d.DistrictName
FROM dim.Contract c
JOIN dim.Geography g ON (c.ContractID = g.GeographyID)
AND (c.CountryId = g.CountryId)
JOIN dim.Region r ON r.RegionID = g.RegionID
JOIN dim.District d ON d.DistrictId = g.DistrictID
JOIN dim.ZIPcode z ON g.ZIPcodeID = z.ZIPcodeID
GROUP BY
r.RegionName
,d.DistrictName
) iq2
on
iq1.RegionName = iq2.RegionName
and
iq1.DistrictName = iq2.DistrictName

UNION ALL combines results column by column, by position. You need to introduce placeholder columns and aggregate again (or join, as in the other solution):
SELECT SUM(CntApp) CntApp, SUM(CntCon) CntCon, RegionName, DistrictName FROM (
SELECT
CntApp = COUNT(app.ApplicationID)
,CntCon = 0
,r.RegionName
,d.DistrictName
FROM dim.Application app
JOIN dim.Geography g ON (app.ApplicationID = g.GeographyID)
AND (app.CountryId = g.CountryId)
JOIN dim.Region r ON r.RegionID = g.RegionID
JOIN dim.District d ON d.DistrictId = g.DistrictID
JOIN dim.ZIPcode z ON g.ZIPcodeID = z.ZIPcodeID
GROUP BY
r.RegionName
,d.DistrictName
UNION ALL
SELECT
CntApp = 0
,CntCon = COUNT(c.ContractID)
,r.RegionName
,d.DistrictName
FROM dim.Contract c
JOIN dim.Geography g ON (c.ContractID = g.GeographyID)
AND (c.CountryId = g.CountryId)
JOIN dim.Region r ON r.RegionID = g.RegionID
JOIN dim.District d ON d.DistrictId = g.DistrictID
JOIN dim.ZIPcode z ON g.ZIPcodeID = z.ZIPcodeID
GROUP BY
r.RegionName
,d.DistrictName
) d
GROUP BY RegionName, DistrictName
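The placeholder-column pattern above can be seen on a small example. This is a minimal sketch in Python's sqlite3, not SQL Server; the tables app and con with a single district column are hypothetical simplifications of the joined dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app (district TEXT);
    CREATE TABLE con (district TEXT);
    INSERT INTO app VALUES ('Pardubice'), ('Pardubice'), ('Chrudim');
    INSERT INTO con VALUES ('Pardubice'), ('Chrudim'), ('Chrudim'), ('Chrudim');
""")

rows = conn.execute("""
    SELECT SUM(cnt_app) AS cnt_app, SUM(cnt_con) AS cnt_con, district
    FROM (
        -- Each branch fills the other branch's counter with 0,
        -- so both branches have the same column layout.
        SELECT COUNT(*) AS cnt_app, 0 AS cnt_con, district FROM app GROUP BY district
        UNION ALL
        SELECT 0, COUNT(*), district FROM con GROUP BY district
    )
    GROUP BY district
    ORDER BY district
""").fetchall()

for row in rows:
    print(row)
```

The outer SUM collapses the two rows per district into one, and a district that appears in only one table still shows up with a count of 0 for the other.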

You need a FULL JOIN
SELECT coalesce(app.RegionName, c.RegionName) AS RegionName,
coalesce(app.DistrictName, c.DistrictName) AS DistrictName,
coalesce(app.CntApp,0) AS CntApp,
coalesce(c.CntCon,0) AS CntCon
FROM
(SELECT
CntApp = COUNT(app.ApplicationID)
,r.RegionName
,d.DistrictName
FROM dim.Application app
JOIN dim.Geography g ON (app.ApplicationID = g.GeographyID)
AND (app.CountryId = g.CountryId)
JOIN dim.Region r ON r.RegionID = g.RegionID
JOIN dim.District d ON d.DistrictId = g.DistrictID
JOIN dim.ZIPcode z ON g.ZIPcodeID = z.ZIPcodeID
GROUP BY
r.RegionName
,d.DistrictName
) app
FULL JOIN
(
SELECT
CntCon = COUNT(c.ContractID)
,r.RegionName
,d.DistrictName
FROM dim.Contract c
JOIN dim.Geography g ON (c.ContractID = g.GeographyID)
AND (c.CountryId = g.CountryId)
JOIN dim.Region r ON r.RegionID = g.RegionID
JOIN dim.District d ON d.DistrictId = g.DistrictID
JOIN dim.ZIPcode z ON g.ZIPcodeID = z.ZIPcodeID
GROUP BY
r.RegionName
,d.DistrictName
) c ON app.RegionName = c.RegionName AND app.DistrictName = c.DistrictName

You can probably ditch the union, and it will be safer because your results won't be affected by stray Cartesian joins that might occur if bad data works its way into the g/r/d/z tables:
SELECT
CntApp,
CntCon,
r.RegionName,
d.DistrictName
FROM
dim.Geography g
INNER JOIN dim.Region r ON r.RegionID = g.RegionID
INNER JOIN dim.District d ON d.DistrictId = g.DistrictID
INNER JOIN dim.ZIPcode z ON g.ZIPcodeID = z.ZIPcodeID
LEFT JOIN (SELECT ApplicationID, CountryID, COUNT(*) CntApp FROM dim.Application GROUP BY ApplicationID, CountryID) app
ON (app.ApplicationID = g.GeographyID) AND (app.CountryId = g.CountryId)
LEFT JOIN (SELECT ContractID, CountryId, COUNT(*) as CntCon FROM dim.Contract GROUP BY ContractID, CountryId) c
ON (c.ContractID = g.GeographyID) AND (c.CountryId = g.CountryId)
Here's a bit of an education point for you, though:
If you have two blocks of data (from a table, a query, whatever) and you want to connect them together vertically (more rows), then you use UNION.
If you want to connect them together horizontally (more columns), you use JOIN.
If we have:
a,b,c
a,b,c
a,b,c
And
a,y,z
a,y,z
a,y,z
This is what you get with UNION:
a,b,c
a,b,c
a,b,c
a,y,z
a,y,z
a,y,z
And this is what you get with JOIN:
a,b,c,y,z
a,b,c,y,z
a,b,c,y,z
Remember this; it will serve you well.

Subquery in from clause, Invalid Identifier in Where clause

select * from iiasa_inventory.inv_device d
join iiasa_inventory.inv_type ty on d.type_id = ty.id
join iiasa_inventory.inv_category c on ty.category_id = c.id
join iiasa_inventory.inv_device_2_barcode b on b.device_id = d.id
join iiasa_inventory.inv_barcodes bc on b.barcode_id = bc.id
join iiasa_inventory.inv_status s on d.status = s.id
join iiasa_inventory.inv_brand br on ty.brand_id = br.id
left join iiasa_inventory.inv_supplier su on su.id = d.supplier_id
left join iiasa_inventory.inv_supplier sup on sup.id = d.maintenance_with
left join (select distinct device_id from
iiasa_inventory.inv_device_2_persons_cc) dp
on dp.device_id = d.id
where dp.active = 1
I am trying to select my data, but the WHERE clause says that "dp.active" is an invalid identifier. This is probably because the table dp is a subquery. I have tried giving it an alias and some other things I found while browsing Stack Overflow, but I can't seem to find a solution. Any idea?
This is Oracle PL/SQL.

Put the check for active = 1 in the subquery as shown below.
select * from iiasa_inventory.inv_device d
join iiasa_inventory.inv_type ty on d.type_id = ty.id
join iiasa_inventory.inv_category c on ty.category_id = c.id
join iiasa_inventory.inv_device_2_barcode b on b.device_id = d.id
join iiasa_inventory.inv_barcodes bc on b.barcode_id = bc.id
join iiasa_inventory.inv_status s on d.status = s.id
join iiasa_inventory.inv_brand br on ty.brand_id = br.id
left join iiasa_inventory.inv_supplier su on su.id = d.supplier_id
left join iiasa_inventory.inv_supplier sup on sup.id = d.maintenance_with
left join (select distinct device_id from iiasa_inventory.inv_device_2_persons_cc where active = 1) dp on dp.device_id = d.id

That is because you are not selecting the active column in dp.
select * from iiasa_inventory.inv_device d
join iiasa_inventory.inv_type ty on d.type_id = ty.id
join iiasa_inventory.inv_category c on ty.category_id = c.id
join iiasa_inventory.inv_device_2_barcode b on b.device_id = d.id
join iiasa_inventory.inv_barcodes bc on b.barcode_id = bc.id
join iiasa_inventory.inv_status s on d.status = s.id
join iiasa_inventory.inv_brand br on ty.brand_id = br.id
left join iiasa_inventory.inv_supplier su on su.id = d.supplier_id
left join iiasa_inventory.inv_supplier sup on sup.id = d.maintenance_with
left join (select distinct device_id,active from iiasa_inventory.inv_device_2_persons_cc) dp on dp.device_id = d.id
where dp.active = 1
Or you can just filter in the subquery itself, like:
left join (select distinct device_id
from iiasa_inventory.inv_device_2_persons_cc
where active=1) dp on dp.device_id = d.id
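The scoping rule behind the error can be reproduced outside Oracle. This is a minimal sketch in Python's sqlite3, where the equivalent failure is reported as a missing column rather than an invalid identifier; the tables device and device_person are hypothetical, cut-down versions of the real ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE device (id INTEGER);
    CREATE TABLE device_person (device_id INTEGER, active INTEGER);
    INSERT INTO device VALUES (1), (2);
    INSERT INTO device_person VALUES (1, 1), (2, 0);
""")

# The derived table dp only exposes device_id, so dp.active cannot be resolved.
try:
    conn.execute("""
        SELECT d.id FROM device d
        LEFT JOIN (SELECT DISTINCT device_id FROM device_person) dp
          ON dp.device_id = d.id
        WHERE dp.active = 1
    """)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Filtering inside the subquery works, and the LEFT JOIN keeps device 2.
rows = conn.execute("""
    SELECT d.id, dp.device_id FROM device d
    LEFT JOIN (SELECT DISTINCT device_id FROM device_person WHERE active = 1) dp
      ON dp.device_id = d.id
    ORDER BY d.id
""").fetchall()

print(error)
print(rows)
```

Note that the two fixes are not equivalent: filtering inside the subquery keeps devices without an active person (with NULL in the dp columns), while projecting active and filtering with WHERE dp.active = 1 removes them, which effectively turns the LEFT JOIN into an inner join.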

How to convert correlated sub query containing duplicate table to non-correlated one?

I have to convert a correlated sub-query to a non-correlated sub-query because of performance issues, like this:
The correlated sub-query (very slow) returns 4000 rows:
SELECT a.personid,a.name,b.conid,d.condat,e.connam
FROM main_empr a INNER JOIN coninr b
ON a.personid = b.personid AND a.calc_year = b.calc_year
INNER JOIN mainconinr c
ON b.conid = c.conid
INNER JOIN coninr d
ON a.personid = d.personid AND a.calc_year = d.calc_year
INNER JOIN mainconinr e
ON d.conid = e.conid
WHERE c.active_flag = 1 and c.endreward_flag = 1
AND d.condat = (SELECT MIN(bb.condat) FROM coninr bb WHERE bb.personid = b.personid AND bb.calc_year = b.calc_year AND ((bb.conid > 0 AND bb.conid < 4 ) OR (bb.conid IN(16,6) )) )
AND b.condat = (SELECT MAX(bb.condat) FROM coninr bb WHERE bb.personid = b.personid AND bb.calc_year = b.calc_year AND ((bb.conid > 0 AND bb.conid < 4 ) OR (bb.conid IN(16,6) )) )
AND ( 0 = ( SELECT COUNT(*) FROM servmain x WHERE x.personid = a.personid AND x.calc_year = a.calc_year )
OR b.condat > ( SELECT MAX(x.serv_date) FROM servmain x WHERE x.personid = a.personid AND x.calc_year = a.calc_year ) )
AND a.calc_year = 2018
The non-correlated query returns about 12300 rows!
SELECT a.personid,a.name,b.conid,d.condat,e.connam
FROM main_empr a INNER JOIN
coninr b
ON a.personid = b.personid AND a.calc_year = b.calc_year
INNER JOIN mainconinr c
ON b.conid = c.conid
INNER JOIN coninr d
ON a.personid = d.personid AND a.calc_year = d.calc_year
INNER JOIN mainconinr e ON d.conid = e.conid
INNER JOIN
(SELECT MAX(bb.condat) AS condat ,bb.personid,bb.calc_year ,bb.conid
FROM coninr bb
GROUP BY bb.personid,bb.calc_year,bb.conid
)Max_cont
ON Max_cont.personid = b.personid AND Max_cont.calc_year = b.calc_year AND Max_cont.condat = b.condat AND ((Max_cont.conid > 0 AND Max_cont.conid < 4 ) OR (Max_cont.conid IN(16,6) ))
INNER JOIN
(SELECT MIN(dd.condat) AS condat ,dd.personid,dd.calc_year,dd.conid
FROM coninr dd GROUP BY dd.personid,dd.calc_year,dd.conid
)Min_cont
ON Min_cont.personid = d.personid AND Min_cont.calc_year = d.calc_year AND Min_cont.condat = d.condat AND ((Min_cont.conid > 0 AND Min_cont.conid < 4 ) OR (Min_cont.conid IN(16,6) ))
WHERE c.active_flag = 1 and c.endreward_flag = 1
AND ( 0 = ( SELECT COUNT(*) FROM servmain x WHERE x.personid = a.personid AND x.calc_year = a.calc_year )
OR b.condat > ( SELECT MAX(x.serv_date) FROM servmain x WHERE x.personid = a.personid AND x.calc_year = a.calc_year ) )
AND a.calc_year = 2018
The problem is:
I use the coninr table twice to get the last and the first contract date in the same row.
It works fine in the first query, but that query is very slow because of the correlated sub-queries. In the second query I get more than one row for the same person: one for the first contract date and another for the last one!
How can I fix this problem?

This looks reasonable, but I've no way to know how it'll perform:
SELECT a.personid,a.name,b.conid,d.condat,e.connam
FROM main_empr a INNER JOIN coninr b
ON a.personid = b.personid AND a.calc_year = b.calc_year
INNER JOIN mainconinr c
ON b.conid = c.conid
INNER JOIN coninr d
ON a.personid = d.personid AND a.calc_year = d.calc_year
INNER JOIN mainconinr e
ON d.conid = e.conid
inner join
(
SELECT bb.personid, bb.calc_year, bb.conid, MIN(bb.condat) MinDate, MAX(bb.condat) MaxDate
FROM coninr bb
where (bb.conid > 0 and bb.conid < 4) or (bb.conid in (6, 16))
group by bb.personid, bb.calc_year, bb.conid
) zz on d.condat = zz.MinDate and b.condat = zz.MaxDate and b.personid = zz.personid and b.calc_year = zz.calc_year
left outer join
(
select s.personid, s.calc_year, max(s.serv_date) MaxServDate
from servmain s
group by s.personid, s.calc_year
) s on a.personid = s.personid and a.calc_year = s.calc_year
WHERE c.active_flag = 1 and c.endreward_flag = 1
and (s.MaxServDate is null or b.condat > s.MaxServDate)
AND a.calc_year = 2018
You don't need two queries for table coninr, you can get min and max in the same query with the group by. Also, for ServMain, doing a left outer join and putting in the where that either it's null (equivalent to count(*) = 0) or is less than b.condat takes care of that.
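The two ideas in this answer (MIN and MAX from one grouped subquery, and a LEFT JOIN with an IS NULL test in place of COUNT(*) = 0) can be tried on toy data. This is a minimal sketch in Python's sqlite3 with cut-down, hypothetical versions of coninr and servmain. As an assumption of this sketch, the subquery groups per person only, not per conid as in the answer, so that it yields one row per person.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE coninr   (personid INTEGER, conid INTEGER, condat TEXT);
    CREATE TABLE servmain (personid INTEGER, serv_date TEXT);
    INSERT INTO coninr VALUES
        (1, 1, '2018-01-01'), (1, 2, '2018-06-01'),
        (2, 1, '2018-02-01'), (2, 3, '2018-03-01');
    INSERT INTO servmain VALUES (2, '2018-12-31');
""")

rows = conn.execute("""
    SELECT z.personid, z.min_date, z.max_date
    FROM (
        -- First and last contract date in a single pass over coninr.
        SELECT personid, MIN(condat) AS min_date, MAX(condat) AS max_date
        FROM coninr
        GROUP BY personid
    ) z
    LEFT JOIN (
        SELECT personid, MAX(serv_date) AS max_serv
        FROM servmain
        GROUP BY personid
    ) s ON s.personid = z.personid
    -- NULL max_serv means "no servmain rows", replacing the COUNT(*) = 0 test.
    WHERE s.max_serv IS NULL OR z.max_date > s.max_serv
    ORDER BY z.personid
""").fetchall()

print(rows)
```

Person 1 has no servmain rows and is kept; person 2 is dropped because the last contract date is not later than the last service date. Dates are stored as ISO-8601 text here, so string comparison matches date order.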
