Sql right outer join ignore all null rows - sql

Old SQL query
SELECT
ISNULL(SUM(ColValue1), 0.00) AS Rejection
FROM
Tabel1 a, Table2 b, Table3 c
WHERE
b.col1 =* a.col2
AND c.col1 = a.col3
AND b.colx = 'xxxxxxx'
AND YEAR(TDate) = 2017
AND MONTH(TDate) = 11
GROUP BY
c.columnz
ORDER BY
c.columnZ
This returns 15 rows based on c.columnz:
Rejection
-----------
0.02897429
0.02215681
0.00000000
0.00000000
0.00000000
0.58119017
0.24542928
1.17601530
1.41633147
0.00000000
0.00000000
0.51131100
0.00000000
1.10613613
0.09033161
After I converted the query to SQL Server 2008:
SELECT
ISNULL(SUM(ColValue1), 0.00) AS Rejection
FROM
Table2 b
RIGHT OUTER JOIN
Tabel1 a ON b.col1 = a.col2
,Table3 c
WHERE
c.col1 = a.col3
AND b.colx = 'xxxxxxx'
AND YEAR(TDate) = 2017
AND MONTH(TDate) = 11
GROUP BY
c.columnz
ORDER BY
c.columnZ
The query only returns 9 rows (ignored all null rows)
Rejection
----------
0.02897429
0.02215681
0.58119017
0.24542928
1.17601530
1.41633147
0.51131100
1.10613613
0.09033161
Please help me fix the new query and get it to return all 15 rows.
All three tables and columns
SELECT
[REJ_CODE], [REJ_GROUPING], [TYPE]
FROM
[QAApr2006].[dbo].[Reject_Group];
SELECT
[REJ_RKEY], [REJECT_CODE],
[REJECT_DESCRIPTION], [REJECT_ABBRV]
FROM
[QAApr2006].[dbo].[Reject_Code];
SELECT
[PR_GRP_CODE], [PROD_CODE],
[CUSTOMER_PART_DESC], [TDATE],
[REJ_CODE], [REJ_PARTS], [REJ_M2],
[P_TYPE], [WO_Number],
[REJ_CODE_ABBRV], [REJECT_DESCRIPTION]
FROM
[QAApr2006].[dbo].[QA_Rej_Det1];

try 2
A comment below says "c. REJ_GROUPING has 15 rows and b.[Tdate]", so I upended the query to start from Reject_Group, as follows:
SELECT
ISNULL(SUM(REJ_M2), 0.00) AS rejection
FROM [QAApr2006].[dbo].Reject_Group c
LEFT JOIN [QAApr2006].[dbo].Reject_Code a ON c.REJ_CODE = a.REJECT_CODE
LEFT JOIN [QAApr2006].[dbo].QA_Rej_Det1 b ON a.REJ_RKEY = b.REJ_CODE
AND b.CUSTOMER_PART_DESC = '0115761002'
AND b.[Tdate] >= '20171101'
AND b.[Tdate] < '20171201'
GROUP BY
c.REJ_GROUPING
ORDER BY
c.REJ_GROUPING
It worked (although the confirming comment was added elsewhere).
original:
The alias b is used in the WHERE clause, so it is most likely the table to start from in the FROM clause; the outer joins then become left joins.
SELECT
ISNULL(SUM(REJ_M2), 0.00) AS rejection
FROM [QAApr2006].[dbo].QA_Rej_Det1 b
LEFT JOIN [QAApr2006].[dbo].Reject_Code a ON b.REJ_CODE = a.REJ_RKEY
LEFT JOIN [QAApr2006].[dbo].Reject_Group c ON a.REJECT_CODE = c.REJ_CODE
WHERE b.CUSTOMER_PART_DESC = '0115761002'
AND TDate >= '20171101'
AND TDate < '20171201'
GROUP BY
REJ_GROUPING
ORDER BY
Rej_Grouping
There is simply no good reason to use YEAR() and MONTH() to achieve an accurate date range filter; just nominate the 1st day of the month you want and the 1st day of the next month.
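A minimal sketch of that half-open range, run through Python's sqlite3 module so it can be tried anywhere. SQLite only stands in for SQL Server here; the `rej` table, its rows and the use of `strftime` in place of YEAR()/MONTH() are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite database; table name and rows are made up for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rej (tdate TEXT, rej_m2 REAL)")
conn.executemany(
    "INSERT INTO rej VALUES (?, ?)",
    [
        ("2017-10-31 23:59:59", 1.0),  # just before the month
        ("2017-11-01 00:00:00", 2.0),  # first instant of the month
        ("2017-11-30 18:30:00", 4.0),  # last day, with a time part
        ("2017-12-01 00:00:00", 8.0),  # first instant of the next month
    ],
)

# Half-open range: >= 1st of the month, < 1st of the next month.
half_open = conn.execute(
    "SELECT SUM(rej_m2) FROM rej "
    "WHERE tdate >= '2017-11-01' AND tdate < '2017-12-01'"
).fetchone()[0]

# Function-based filter for comparison (strftime plays the YEAR()/MONTH() role).
by_function = conn.execute(
    "SELECT SUM(rej_m2) FROM rej WHERE strftime('%Y-%m', tdate) = '2017-11'"
).fetchone()[0]

print(half_open, by_function)
```

Both filters pick the same two November rows. The range version leaves the column untouched by any function, which is generally what allows an index on the date column to be used.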
The other thing this query seriously suffers from is that REJ_M2, REJ_GROUPING and TDate have no table aliases, so I really have no idea if the left joins are effective or not. Reference EVERY column with its table alias (or the table name if aliases aren't supplied).
----
As a personal note I really detest aliases that indicate sequence in a query (a,b,c...) because if we need to re-sequence all of a sudden those aliases are really painful. I prefer "first letter" or "first-letter-of-each-word" as the aliases method, e.g.
SELECT
ISNULL(SUM(q.REJ_M2), 0.00) AS rejection
FROM [QAApr2006].[dbo].QA_Rej_Det1 q
LEFT JOIN [QAApr2006].[dbo].Reject_Code rc ON q.REJ_CODE = rc.REJ_RKEY
LEFT JOIN [QAApr2006].[dbo].Reject_Group rg ON rc.REJECT_CODE = rg.REJ_CODE
WHERE q.CUSTOMER_PART_DESC = '0115761002'
AND q.TDate >= '20171101'
AND q.TDate < '20171201'
GROUP BY
rg.REJ_GROUPING
ORDER BY
rg.REJ_GROUPING
This variant may not work: the table aliases on the columns are inferred from the column lists in the question, and I haven't tested it.

Related

How to force postgres to return 0 even if there are no rows matching query, using coalesce, group by and join

I've been trying hopelessly to get the following SQL statement to return the query results and default to 0 if there are no rows matching the query.
This is the intended result:
vol | year
-------+------
0 | 2018
Instead I get:
vol | year
-----+------
(0 rows)
Here is the sql statement:
select coalesce(vol,0) as vol, year
from (select sum(vol) as vol, year
from schema.fact_data
join schema.period_data
on schema.fact_data.period_tag = schema.period_data.tag
join schema.product_data
on schema.fact_data.product_tag =
schema.product_data.tag
join schema.market_data
on schema.fact_data.market_tag = schema.market_data.tag
where "retailer"='MadeUpRetailer'
and "product_tag"='FakeProductTag'
and "year"='2018' group by year
) as DerivedTable;
I know the query works because it returns data when there is data. Just doesn't default to 0 as intended...
Any help in finding why this is the case would be much appreciated!
Using your subquery DerivedTable, you could write:
SELECT coalesce(DerivedTable.vol, 0) AS vol,
y.year
FROM (VALUES ('2018'::text)) AS y(year)
LEFT JOIN (SELECT ...) AS DerivedTable
ON DerivedTable.year = y.year;
Remove the GROUP BY (and the outer query):
select 2018 as year, coalesce(sum(vol), 0) as vol
from schema.fact_data f join
schema.period_data p
on f.period_tag = p.tag join
schema.product_data pr
on f.product_tag = pr.tag join
schema.market_data m
on f.market_tag = m.tag
where "retailer" = 'MadeUpRetailer' and
"product_tag" = 'FakeProductTag' and
"year" = '2018';
An aggregation query with no GROUP BY always returns exactly one row, so this should do what you want.
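To see the difference, here is a small sketch using Python's sqlite3 module in place of Postgres. The single-table `fact_data` layout and its one row are assumptions for the demo; the real schema spreads these columns over several tables.

```python
import sqlite3

# In-memory SQLite database with no rows for 2018 (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_data (year TEXT, vol INTEGER)")
conn.execute("INSERT INTO fact_data VALUES ('2017', 5)")

# With GROUP BY: no matching rows means no groups, so no output rows at all.
grouped = conn.execute(
    "SELECT f.year, COALESCE(SUM(f.vol), 0) "
    "FROM fact_data f WHERE f.year = '2018' GROUP BY f.year"
).fetchall()

# Without GROUP BY: the aggregate still yields exactly one row,
# and COALESCE turns the NULL sum into 0.
ungrouped = conn.execute(
    "SELECT '2018' AS yr, COALESCE(SUM(f.vol), 0) "
    "FROM fact_data f WHERE f.year = '2018'"
).fetchall()

print(grouped)    # []
print(ungrouped)  # [('2018', 0)]
```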
EDIT:
The query would look something like this:
select v.yyyy as year, coalesce(sum(vol), 0) as vol
from (values (2018), (2019)) v(yyyy) left join
schema.fact_data f
on f.year = v.yyyy left join -- this is just an example. I have no idea where year is coming from
schema.period_data p
on f.period_tag = p.tag left join
schema.product_data pr
on f.product_tag = pr.tag left join
schema.market_data m
on f.market_tag = m.tag
group by v.yyyy
However, you have to move the where conditions to the appropriate on clauses. I have no idea where the columns are coming from.
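A rough sketch of the same idea with the filter moved into the ON clause, run with Python's sqlite3 module instead of Postgres. The flattened `fact_data` table with `year` and `retailer` columns is an assumption, and the list of years is written as a CTE because SQLite does not accept the `v(yyyy)` alias form on a VALUES subquery.

```python
import sqlite3

# In-memory SQLite database; schema and rows are made up for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_data (year INTEGER, retailer TEXT, vol INTEGER)")
conn.executemany(
    "INSERT INTO fact_data VALUES (?, ?, ?)",
    [
        (2019, "MadeUpRetailer", 7),
        (2019, "MadeUpRetailer", 3),
        (2018, "OtherRetailer", 4),  # wrong retailer, must not be counted
    ],
)

# The list of years drives the query; the retailer filter lives in ON,
# so a year with no matching facts still produces a row.
rows = conn.execute(
    """
    WITH v(yyyy) AS (VALUES (2018), (2019))
    SELECT v.yyyy, COALESCE(SUM(f.vol), 0)
    FROM v
    LEFT JOIN fact_data f
      ON f.year = v.yyyy
     AND f.retailer = 'MadeUpRetailer'
    GROUP BY v.yyyy
    ORDER BY v.yyyy
    """
).fetchall()

print(rows)  # [(2018, 0), (2019, 10)]
```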
From the code you posted it is not clear in which table you have the year column.
You can use UNION to fetch just 1 row in case there are no rows in that table for the year 2018 like this:
select sum(vol) as vol, year
from schema.fact_data inner join schema.period_data
on schema.fact_data.period_tag = schema.period_data.tag
inner join schema.product_data
on schema.fact_data.product_tag = schema.product_data.tag
inner join schema.market_data
on schema.fact_data.market_tag = schema.market_data.tag
where
"retailer"='MadeUpRetailer' and
"product_tag"='FakeProductTag' and
"year"='2018'
group by "year"
union
select 0 as vol, '2018' as year
where not exists (
select 1 from tablename where "year" = '2018'
)
If there are rows for the year 2018, the 2nd query fetches nothing.

Subquery returned more than 1 value. The subquery that contains SUM(dbo.SalarySettingsBreakup.Amount) AS AmountSSB

My subquery returns more than one value and raises an error.
(SELECT dbo.employee.id,
dbo.employee.employeecode,
dbo.employee.firstname,
dbo.employee.departmentid,
dbo.salarysettings.monthlyoffered,
dbo.salarysettings.id AS SalarySettingsID,
(SELECT Sum(amount) AS AmountVP
FROM voucherprocesses
WHERE vouchertypeid = 2
AND employee = dbo.employee.id
AND voucherdate BETWEEN '9/1/2017 12:00:00 AM' AND
'9/30/2017 12:00:00 AM'
GROUP BY employee) AS SalaryAdvance,
(SELECT Sum(dbo.salarysettingsbreakup.amount) AS AmountSSB
FROM dbo.employee
LEFT JOIN dbo.salarysettings
ON dbo.employee.id = dbo.salarysettings.employee
LEFT JOIN dbo.salarysettingsbreakup
ON dbo.salarysettings.id =
dbo.salarysettingsbreakup.salarysetting
WHERE dbo.salarysettingsbreakup.paymenttype = 2
AND dbo.salarysettingsbreakup.isactive = 1
GROUP BY dbo.employee.id) AS TotalDeduction,
(SELECT CASE
WHEN employee.joiningdate BETWEEN
'9/1/2017 12:00:00 AM' AND '9/30/2017 12:00:00 AM' THEN(
( salarysettings.monthlyoffered / 30 ) * ( 30 -
( Datepart(dd, joiningdate) - 1 ) ) )
ELSE 0
END) AS PayToBank
FROM dbo.employee
LEFT JOIN dbo.salarysettings
ON dbo.employee.id = dbo.salarysettings.employee
WHERE dbo.salarysettings.isactive = 1)
Hopefully this will work; try this:
(SELECT e.id,
e.employeecode,
e.firstname,
e.departmentid,
dbo.salarysettings.monthlyoffered,
dbo.salarysettings.id AS SalarySettingsID,
(SELECT Sum(amount) AS AmountVP
FROM voucherprocesses
WHERE vouchertypeid = 2
AND voucherprocesses.employee = e.id
AND voucherdate BETWEEN '9/1/2017 12:00:00 AM' AND
'9/30/2017 12:00:00 AM'
) AS SalaryAdvance,
(SELECT Sum(dbo.salarysettingsbreakup.amount) AS AmountSSB
FROM dbo.employee e2
LEFT JOIN dbo.salarysettings
ON e2.id = dbo.salarysettings.employee
LEFT JOIN dbo.salarysettingsbreakup
ON dbo.salarysettings.id =
dbo.salarysettingsbreakup.salarysetting
AND dbo.salarysettingsbreakup.paymenttype = 2
AND dbo.salarysettingsbreakup.isactive = 1
WHERE e2.id = e.id
) AS TotalDeduction,
(SELECT CASE
WHEN e.joiningdate BETWEEN
'9/1/2017 12:00:00 AM' AND '9/30/2017 12:00:00 AM' THEN(
( salarysettings.monthlyoffered / 30 ) * ( 30 -
( Datepart(dd, joiningdate) - 1 ) ) )
ELSE 0
END) AS PayToBank
FROM dbo.employee e
LEFT JOIN dbo.salarysettings
ON e.id = dbo.salarysettings.employee
WHERE dbo.salarysettings.isactive = 1)
You have much to learn. You need to understand how subqueries work as well as outer joins. The following is wrong due to 2 issues.
(SELECT Sum(dbo.salarysettingsbreakup.amount) AS AmountSSB
FROM dbo.employee
LEFT JOIN dbo.salarysettings
ON dbo.employee.id = dbo.salarysettings.employee
LEFT JOIN dbo.salarysettingsbreakup
ON dbo.salarysettings.id =
dbo.salarysettingsbreakup.salarysetting
WHERE dbo.salarysettingsbreakup.paymenttype = 2
AND dbo.salarysettingsbreakup.isactive = 1
GROUP BY dbo.employee.id) AS TotalDeduction,
First is that you did not properly correlate the subquery. As Rahmat posted (but did not explain), you need to associate the employee ID from the outer query with the subquery. Because you did not correlate the subquery, it produces multiple rows for each row in the outer query - producing your error.
In addition, your lack of understanding about the correlation causes you to add complexity and a logical mistake (which gets covered up when correlated correctly). There is no need to include the employee table in your subquery. Since you correlate it to the employee table in the main query, it is redundant. In addition, you don't need to group by anything in the subquery since it is intended to generate a single scalar value per row in the outer query. And lastly, there is no purpose to outer joining in the subquery. Either you have matching rows in salarysettingsbreakup or you don't. An inner and outer join will achieve the same result - NULL if no matches. I also question whether you need to sum at all given the table and column names involved. You should search for explanations about how outer joins work and what happens when you reference columns from the unpreserved table (e.g. salarysettingsbreakup) in the where clause.
So a better subquery is:
(SELECT Sum(bkp.amount)
FROM dbo.salarysettings as sset
INNER JOIN dbo.salarysettingsbreakup as bkp
ON sset.id = bkp.salarysetting
AND bkp.paymenttype = 2
AND bkp.isactive = 1
WHERE sset.employee = dbo.employee.id) as TotalDeduction,
Note the inclusion of some best practices. Give a readable alias to your tables and use it with all of the columns referenced. I also despise the practice of using a table name as a column name - that adds to the confusion of reading your queries IMO.
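For anyone who wants to see the correlated form in action, here is a small sketch using Python's sqlite3 module rather than SQL Server. The three tables are cut down to the columns the subquery touches and the rows are made up. Note that SQLite does not raise the "more than 1 value" error for an uncorrelated subquery, so only the corrected form is shown.

```python
import sqlite3

# In-memory SQLite database; trimmed-down, made-up versions of the tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (id INTEGER, firstname TEXT);
    CREATE TABLE salarysettings (id INTEGER, employee INTEGER);
    CREATE TABLE salarysettingsbreakup (
        salarysetting INTEGER, paymenttype INTEGER,
        isactive INTEGER, amount REAL);
    INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO salarysettings VALUES (10, 1), (20, 2);
    INSERT INTO salarysettingsbreakup VALUES
        (10, 2, 1, 100.0),
        (10, 2, 1, 50.0),
        (10, 1, 1, 999.0);  -- paymenttype 1, excluded by the filter
    """
)

# The subquery is tied to the outer row through sset.employee = e.id,
# so it yields one scalar per employee (NULL when nothing matches).
rows = conn.execute(
    """
    SELECT e.id, e.firstname,
           (SELECT SUM(bkp.amount)
              FROM salarysettings AS sset
              INNER JOIN salarysettingsbreakup AS bkp
                 ON sset.id = bkp.salarysetting
                AND bkp.paymenttype = 2
                AND bkp.isactive = 1
             WHERE sset.employee = e.id) AS TotalDeduction
    FROM employee AS e
    ORDER BY e.id
    """
).fetchall()

print(rows)  # [(1, 'Ann', 150.0), (2, 'Bob', None)]
```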

SQL statement merge two rows into one

In the results of my SQL statement (SQL Server 2016) I would like to combine two rows that have the same value in two columns ("study_id" and "study_start") into one row, keeping the row with the highest value in a third column ("Id"). If any column (e.g. "App_id" or "Date_arrival") in the row with the highest "Id" is NULL, it should take the value from the row with the lowest "Id".
I get the result below:
Id      study_id  study_start  Code  Expl       Desc       Startmonth  App_id  Date_arrival  Efter_op  Date_begin
167262  878899    954          4.1   udd.ord    Afbrudt    feb         86666   21-06-2012    N         17-08-2012
180537  878899    954          1     Afsluttet  Afsluttet  feb         NULL    NULL          NULL      NULL
And I would like to get this result:
Id      study_id  study_start  Code  Expl       Desc       Startmonth  App_id  Date_arrival  Efter_op  Date_begin
180537  878899    954          1     Afsluttet  Afsluttet  feb         86666   21-06-2012    N         17-08-2012
My statement looks like this:
SELECT dbo.PopulationStam_V.ELEV_ID AS id,
dbo.PopulationStam_V.PERS_ID AS study_id,
dbo.STUDIESTARTER.STUDST_ID AS study_start,
dbo.Optagelse_Studiestatus.AFGANGSARSAG AS Code,
dbo.Optagelse_Studiestatus.KORT_BETEGNELSE AS Expl,
ISNULL((CAST(dbo.Optagelse_Studiestatus.Studiestatus AS varchar(20))), 'Indskrevet') AS 'Desc',
dbo.STUDIESTARTER.OPTAG_START_MANED AS Startmonth,
dbo.ANSOGNINGER.ANSOG_ID as App_id,
dbo.ANSOGNINGER.ANKOMSTDATO AS Date_arrival,
dbo.ANSOGNINGER.EFTEROPTAG AS Efter_op,
dbo.ANSOGNINGER.STATUSDATO AS Date_begin
FROM dbo.INSTITUTIONER
INNER JOIN dbo.PopulationStam_V
ON dbo.INSTITUTIONER.INST_ID = dbo.PopulationStam_V.SEMI_ID
LEFT JOIN dbo.ANSOGNINGER
ON dbo.PopulationStam_V.ELEV_ID = dbo.ANSOGNINGER.ELEV_ID
INNER JOIN dbo.STUDIESTARTER
ON dbo.PopulationStam_V.STUDST_ID_OPRINDELIG = dbo.STUDIESTARTER.STUDST_ID
INNER JOIN dbo.UDD_NAVNE_T
ON dbo.PopulationStam_V.UDDA_ID = dbo.UDD_NAVNE_T.UDD_ID
INNER JOIN dbo.UDDANNELSER
ON dbo.UDD_NAVNE_T.UDD_ID = dbo.UDDANNELSER.UDDA_ID
LEFT OUTER JOIN dbo.PERSONER
ON dbo.PopulationStam_V.PERS_ID = dbo.PERSONER.PERS_ID
LEFT OUTER JOIN dbo.POSTNR
ON dbo.PERSONER.PONR_ID = dbo.POSTNR.PONR_ID
LEFT OUTER JOIN dbo.KønAlleElevID_V
ON dbo.PopulationStam_V.ELEV_ID = dbo.KønAlleElevID_V.ELEV_ID
LEFT OUTER JOIN dbo.Optagelse_Studiestatus
ON dbo.PopulationStam_V.AFAR_ID = dbo.Optagelse_Studiestatus.AFAR_ID
LEFT OUTER JOIN dbo.frafaldsmodel_adgangsgrundlag
ON dbo.frafaldsmodel_adgangsgrundlag.ELEV_ID = dbo.PopulationStam_V.ELEV_ID
LEFT OUTER JOIN dbo.Optagelse_prioriteterUFM
ON dbo.Optagelse_prioriteterUFM.cpr = dbo.PopulationStam_V.CPR_NR
AND dbo.Optagelse_prioriteterUFM.Aar = dbo.frafaldsmodel_adgangsgrundlag.optagelsesaar
LEFT OUTER JOIN dbo.frafaldsmodel_stoettetabel_uddannelser AS fsu
ON fsu.id_uddannelse = dbo.UDDANNELSER.UDDA_ID
AND fsu.id_inst = dbo.INSTITUTIONER.INST_ID
AND fsu.uddannelse_aar = dbo.frafaldsmodel_adgangsgrundlag.optagelsesaar
WHERE dbo.STUDIESTARTER.STUDIESTARTSDATO > '2012-03-01 00:00:00.000'
AND (dbo.Optagelse_Studiestatus.AFGANGSARSAG IS NULL
OR dbo.Optagelse_Studiestatus.AFGANGSARSAG NOT LIKE '2.7.4')
AND (dbo.PopulationStam_V.INDSKRIVNINGSFORM = '1100'
OR dbo.PopulationStam_V.INDSKRIVNINGSFORM = '1700')
GROUP BY dbo.PopulationStam_V.ELEV_ID,
dbo.PopulationStam_V.PERS_ID,
dbo.STUDIESTARTER.STUDST_ID,
dbo.Optagelse_Studiestatus.AFGANGSARSAG,
dbo.Optagelse_Studiestatus.KORT_BETEGNELSE,
dbo.STUDIESTARTER.OPTAG_START_MANED,
Studiestatus,
dbo.ANSOGNINGER.ANSOG_ID,
dbo.ANSOGNINGER.ANKOMSTDATO,
dbo.ANSOGNINGER.EFTEROPTAG,
dbo.ANSOGNINGER.STATUSDATO
I really hope somebody out there can help.
There are many ways to do this; this one will work:
WITH subSource AS (
/* Your query here */
)
SELECT
s1.id,
/* all other columns work like this:
COALESCE(s1.column, s2.column)
for example: */
COALESCE(s1.App_id, s2.App_id) AS App_id
FROM subSource s1
INNER JOIN subSource s2
ON s1.study_id =s2.study_id
and s1.study_start = s2.study_start
AND s1.id > s2.id
/* I imagine some other clauses might be needed but maybe not */
The rest is copy paste
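Here is a quick sketch of that self-join on the two sample rows from the question, run through Python's sqlite3 module instead of SQL Server. The `sub_source` table is a stand-in for the CTE and holds only a few of the columns. As written, the inner join only returns ids that have a lower-id partner, so rows without a pair would need separate handling.

```python
import sqlite3

# In-memory SQLite database holding the two sample rows (subset of columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sub_source (id INTEGER, study_id INTEGER, "
    "study_start INTEGER, code TEXT, app_id INTEGER, date_arrival TEXT)"
)
conn.executemany(
    "INSERT INTO sub_source VALUES (?, ?, ?, ?, ?, ?)",
    [
        (167262, 878899, 954, "4.1", 86666, "21-06-2012"),
        (180537, 878899, 954, "1", None, None),
    ],
)

# s1 is the row with the higher id; s2 supplies values where s1 has NULL.
merged = conn.execute(
    """
    SELECT s1.id, s1.study_id, s1.study_start, s1.code,
           COALESCE(s1.app_id, s2.app_id) AS app_id,
           COALESCE(s1.date_arrival, s2.date_arrival) AS date_arrival
    FROM sub_source s1
    INNER JOIN sub_source s2
       ON s1.study_id = s2.study_id
      AND s1.study_start = s2.study_start
      AND s1.id > s2.id
    """
).fetchall()

print(merged)  # [(180537, 878899, 954, '1', 86666, '21-06-2012')]
```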

Display rows that have a zero count

I am trying to display rows even if they return a count of zero, but with no luck.
I tried using left join.
select
a.Month,
count(b.InsuranceFromJob) [Number of Participants without Insurance]
from
hsAdmin.ReportPeriodLkup a
left join hsAdmin.ClientReport b on
b.ReportPeriod = a.ReportPeriodId
where
b.insurancefromjob = 2 and
a.reportperiodid between (#lastReportId - 11) and #lastReportId
group by
a.Month
Because ClientReport is filtered in the WHERE clause, only rows that exist in ClientReport will be in the result set.
Move the check to the join and you will get the desired result:
select
a.Month,
count(b.InsuranceFromJob) [Number of Participants without Insurance]
from
hsAdmin.ReportPeriodLkup a
left join hsAdmin.ClientReport b on
b.ReportPeriod = a.ReportPeriodId
and b.insurancefromjob = 2
where
a.reportperiodid between (#lastReportId - 11) and #lastReportId
group by
a.Month
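The effect is easy to reproduce on toy data. This sketch uses Python's sqlite3 module in place of SQL Server; the `period` and `report` tables and their rows are made-up stand-ins for ReportPeriodLkup and ClientReport.

```python
import sqlite3

# In-memory SQLite database; simplified, made-up versions of the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE period (id INTEGER, month TEXT);
    CREATE TABLE report (period_id INTEGER, insurance INTEGER);
    INSERT INTO period VALUES (1, 'Jan'), (2, 'Feb'), (3, 'Mar');
    INSERT INTO report VALUES (1, 2), (1, 2), (2, 1);
    """
)

# Filter in WHERE: rows where b.insurance is NULL (no match) are thrown away,
# which turns the left join into an inner join in effect.
in_where = conn.execute(
    """
    SELECT a.month, COUNT(b.insurance)
    FROM period a
    LEFT JOIN report b ON b.period_id = a.id
    WHERE b.insurance = 2
    GROUP BY a.id, a.month
    ORDER BY a.id
    """
).fetchall()

# Filter in ON: every period survives, and COUNT skips the NULLs.
in_on = conn.execute(
    """
    SELECT a.month, COUNT(b.insurance)
    FROM period a
    LEFT JOIN report b ON b.period_id = a.id AND b.insurance = 2
    GROUP BY a.id, a.month
    ORDER BY a.id
    """
).fetchall()

print(in_where)  # [('Jan', 2)]
print(in_on)     # [('Jan', 2), ('Feb', 0), ('Mar', 0)]
```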

Is there any effect on Result due to column ordering in Group by clause in SQL server

If I have 20 columns and want to get a result based on a GROUP BY clause, is there any effect on the result if I change the order of the columns in the SQL query?
My example is as follows:
Select
R.ClientId
,R.FirmName
,R.StrategyID
,R.SecurityType
,SUM(R.QtySent)
,SUM(R.ExecutedQty) AS ExecutedQty
,SUM(R.CrossedExecutedQty) AS CrossedExecutedQty
FROM ClientDetail m inner join ClientMaster c on
m.clordid = c.masterorderId
and m.msg_id = 43
and c.msg_id in (10,11,12,40)--Msg_Id 40 for manual trade
inner join #ResultsDaily R on c.clordid = R.clordid
GROUP BY R.TethysClientId
,R.FirmName
,R.StrategyID
,R.SecurityType
--Query 1
SELECT R.ClientId --A
,R.FirmName --B
,R.StrategyID --C
,R.SecurityType --D
,SUM(R.QtySent)
,SUM(R.ExecutedQty) AS ExecutedQty
,SUM(R.CrossedExecutedQty) AS CrossedExecutedQty
FROM ClientDetail m
JOIN ClientMaster c
ON m.clordid = c.masterorderId
AND m.msg_id = 43
AND c.msg_id in (10,11,12,40)
JOIN #ResultsDaily R on c.clordid = R.clordid
GROUP BY R.TethysClientId --1
,R.FirmName --2
,R.StrategyID --3
,R.SecurityType --4
In the above query, 1, 2, 3 and 4 can be in any order. Likewise, A, B, C and D can be in any order.
No column should be left out, that's all.
Pardon me if I misinterpreted the question.
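A small check of that claim, using Python's sqlite3 module instead of SQL Server. The table `r` and its rows are made up. The results are sorted before comparing because, without an ORDER BY, the order of the returned rows is not guaranteed and may differ between the two queries.

```python
import sqlite3

# In-memory SQLite database; made-up stand-in for #ResultsDaily.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE r (client TEXT, firm TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO r VALUES (?, ?, ?)",
    [("c1", "f1", 10), ("c1", "f1", 5), ("c1", "f2", 1), ("c2", "f1", 7)],
)

# Same SELECT list, two different GROUP BY column orders.
query = "SELECT client, firm, SUM(qty) FROM r GROUP BY {}"
client_first = sorted(conn.execute(query.format("client, firm")).fetchall())
firm_first = sorted(conn.execute(query.format("firm, client")).fetchall())

print(client_first == firm_first)  # True
print(client_first)  # [('c1', 'f1', 15), ('c1', 'f2', 1), ('c2', 'f1', 7)]
```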