Get rid of duplicate lines in SQL result set


I have two tables that I query to get a result set:
select
a.id,
a.test,
a.score,
b.name,
b.person,
b.grade
from table_test a, table_pers b
where a.test=b.test
The problem is that table b has multiple entries that are identical except for "passed". If one person has both a yes and a no row, I need only the yes row and nothing else; otherwise I need the no row, and only one of them if there are several.
Any idea on how that could work?
Thanks in advance.
OK, since the CASE doesn't seem to get along with the GROUP BY, here is a more detailed view of the select:
select
t.id,
t.tests test,
t.lang,
m.title_TEXT Titel,
m.Sched Schedual,
m.prof profs,
m.date_out Date,
m.sub subject,
m.chan Changes,
case
when m.cha2 = ''
then m.cha1
else m.cha2
end as last_change,
case
when m.datac2 = ''
then m.datac1
else m.datac2
end as Change_date,
t.posp,
t.A1,
t.B1,
t.Failed,
t.analy,
t.vect,
t.cover,
t.typ,
t.circ,
t.deadline
from table_test t, table_pers m where m.test=t.test
The values I look at in t.passed are '1A' and '1B'.
If there is a 1A, I need that row; if there is only 1B, I need one of those rows.
The complete select has 39 fields, but the missing ones are plain column selects, with no CASE expressions or anything like that.

You can take advantage of the fact that 'Yes' > 'No', so you could just use:
select a.id,
a.test,
a.score,
b.name,
b.person,
b.grade,
MAX(Passed) AS Passed
from table_test a
INNER JOIN table_pers b
ON a.test = b.test
GROUP BY a.id, a.test, a.score, b.name, b.person, b.grade;
N.B. I have switched your ANSI 89 JOINs to the new ANSI 92 JOIN syntax. This article covers some good reasons for doing so, however it is subjective and the choice is ultimately yours, the result is the same either way.
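As a quick check of the MAX(Passed) idea, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database from Python. The sample rows, the column types, and the placement of the passed column in table_pers are assumptions made for illustration only; they are not taken from the question.

```python
import sqlite3

# Hypothetical sample data: "math" has both a No and a Yes row, "art" has two No rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_test (id INTEGER, test TEXT, score INTEGER);
    CREATE TABLE table_pers (test TEXT, name TEXT, person TEXT, grade TEXT, passed TEXT);
    INSERT INTO table_test VALUES (1, 'math', 90), (2, 'art', 40);
    INSERT INTO table_pers VALUES
        ('math', 'Ann', 'p1', 'A', 'No'),
        ('math', 'Ann', 'p1', 'A', 'Yes'),
        ('art',  'Bob', 'p2', 'F', 'No'),
        ('art',  'Bob', 'p2', 'F', 'No');
""")

# 'Yes' sorts after 'No', so MAX keeps 'Yes' whenever one exists,
# and the GROUP BY collapses the otherwise identical rows into one.
rows = conn.execute("""
    SELECT a.id, a.test, a.score, b.name, b.person, b.grade,
           MAX(b.passed) AS passed
    FROM table_test a
    INNER JOIN table_pers b ON a.test = b.test
    GROUP BY a.id, a.test, a.score, b.name, b.person, b.grade
    ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Each id comes back exactly once: the mixed group reports Yes and the all-No group reports No.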
An alternative, and possibly more robust, solution (if other values are allowed for passed) would be:
select a.id,
a.test,
a.score,
b.name,
b.person,
b.grade,
CASE WHEN COUNT(CASE WHEN Passed = 'Yes' THEN 1 END) > 0 THEN 'Yes' ELSE 'No' END AS Passed
from table_test a
INNER JOIN table_pers b
ON a.test = b.test
GROUP BY a.id, a.test, a.score, b.name, b.person, b.grade;
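To see why the conditional COUNT can be more robust than MAX, here is a small sketch in Python with SQLite. The attempts table and the extra 'Pending' value are hypothetical, chosen only to show a case where string ordering gives the wrong answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per attempt, with a third status value.
    CREATE TABLE attempts (person TEXT, passed TEXT);
    INSERT INTO attempts VALUES
        ('ann', 'No'), ('ann', 'Yes'),
        ('bob', 'No'), ('bob', 'Pending');
""")

# by_max relies on string ordering; by_count only asks "was there any Yes?".
rows = conn.execute("""
    SELECT person,
           MAX(passed) AS by_max,
           CASE WHEN COUNT(CASE WHEN passed = 'Yes' THEN 1 END) > 0
                THEN 'Yes' ELSE 'No' END AS by_count
    FROM attempts
    GROUP BY person
    ORDER BY person
""").fetchall()

for row in rows:
    print(row)
```

For bob, MAX returns 'Pending' because it sorts after 'No', while the COUNT version still reports 'No'.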
EDIT
There is no reason I know of that CASE won't work in a group by (unless you are using the alias in the group by and not the full case statement), however you could also achieve this using the ROW_NUMBER() function:
SELECT *
FROM ( SELECT t.id,
t.tests test,
t.lang,
m.title_TEXT Titel,
m.Sched Schedual,
m.prof profs,
m.date_out Date,
m.sub subject,
m.chan Changes,
case when m.cha2 = '' then m.cha1 else m.cha2 end as last_change,
case when m.datac2 = '' then m.datac1 else m.datac2 end as Change_date,
t.posp,
t.A1,
t.B1,
t.Failed,
t.analy,
t.vect,
t.cover,
t.typ,
t.circ,
t.deadline,
ROW_NUMBER() OVER(PARTITION BY t.ID ORDER BY t.Passed) AS rn
from table_test t
INNER JOIN table_pers m
ON m.test = t.test
) t
WHERE rn = 1
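The ROW_NUMBER() pattern can be tried out on a cut-down table. This is only a sketch: the results table and its rows are made up, and it assumes the SQLite library behind Python's sqlite3 module is version 3.25 or newer, since older versions have no window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: id 1 has both 1B and 1A, id 2 has only 1B rows.
    CREATE TABLE results (id INTEGER, passed TEXT, note TEXT);
    INSERT INTO results VALUES
        (1, '1B', 'first'), (1, '1A', 'second'),
        (2, '1B', 'x'),     (2, '1B', 'y');
""")

# Number the rows within each id, with '1A' sorting before '1B',
# then keep only the first row of every partition.
rows = conn.execute("""
    SELECT id, passed
    FROM (
        SELECT id, passed,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY passed) AS rn
        FROM results
    ) AS ranked
    WHERE rn = 1
    ORDER BY id
""").fetchall()

for row in rows:
    print(row)
```

Which of two tied 1B rows survives is not defined unless the ORDER BY inside OVER() is extended with a tie-breaker column.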

select a.id,
a.test,
a.score,
b.name,
b.person,
b.grade,
Passed
from table_test a
INNER JOIN table_pers b
ON a.test = b.test
-- QUALIFY is a clause of its own (Teradata and some other dialects); it does not go inside WHERE
qualify row_number() over (partition by a.id order by passed desc) = 1;

Related

How to return Yes or No if nested query has result or not in SQL Server?

I have a stored procedure with a nested query that checks whether "category" from the main table matches a "category" in a sub table.
So there can either be one match or none.
How can I return Yes if there is a match and the sub query returns something and No if there is no match and the sub query returns nothing ?
I tried the following which works in general but only if there is a match as otherwise this returns nothing.
My SQL (shortened):
SELECT A.categoryID,
A.category,
A.[description],
(
SELECT 'Yes' AS subscribed
FROM MOC_Categories_Subscribers D
WHERE D.category = A.category
FOR XML PATH(''), ELEMENTS, TYPE
)
FROM MOC_Categories A

If the subquery doesn't return any rows, its result is NULL, so you need to check for that. In SQL Server you can do this with the ISNULL or COALESCE functions (which one you use may depend on the version you're running):
SELECT A.categoryID,
A.category,
A.[description],
COALESCE((SELECT TOP 1 'Yes'
FROM MOC_Categories_Subscribers D
WHERE D.category = A.category), 'No') AS Result
FROM MOC_Categories A
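The same COALESCE-around-a-scalar-subquery idea, sketched in Python with SQLite so it can be run anywhere. The sample rows and the subscriber column are assumptions, and LIMIT 1 stands in for SQL Server's TOP 1, which SQLite does not support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: only 'news' has a subscriber.
    CREATE TABLE MOC_Categories (categoryID INTEGER, category TEXT);
    CREATE TABLE MOC_Categories_Subscribers (category TEXT, subscriber TEXT);
    INSERT INTO MOC_Categories VALUES (1, 'news'), (2, 'sports');
    INSERT INTO MOC_Categories_Subscribers VALUES ('news', 'ann');
""")

# The scalar subquery yields 'Yes' or NULL; COALESCE turns the NULL into 'No'.
rows = conn.execute("""
    SELECT A.categoryID,
           A.category,
           COALESCE((SELECT 'Yes'
                     FROM MOC_Categories_Subscribers D
                     WHERE D.category = A.category
                     LIMIT 1), 'No') AS Result
    FROM MOC_Categories A
    ORDER BY A.categoryID
""").fetchall()

for row in rows:
    print(row)
```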

SELECT A.categoryID,
A.category,
A.[description],
(
SELECT
case
when count(subscribed) > 0 then 'Yes'
else 'No'
end
FROM MOC_Categories_Subscribers D
WHERE D.category = A.category
)
FROM MOC_Categories A

You can use an outer join, which returns null values if there is no match. Combine with a case to convert to a yes/no value:
SELECT A.categoryID,
A.category,
A.[description],
subscribed = CASE
WHEN D.category IS NOT NULL THEN 'Yes'
ELSE 'No'
END
FROM MOC_Categories A
LEFT OUTER JOIN MOC_Categories_Subscribers D
ON D.category = A.category

Create custom field in SELECT if other field is null

This is a seemingly simple thing to do but I can't find any reference to it. I want to add a customized field to my select statement if the value of another field is null. In the below I want to create a field named 'IMPACT' that shows a value of 'Y' if the LOCATION_ACCOUNT_ID field in the subquery is null. How do I do this?
SELECT FIRST_NAME,LAST_NAME,ULTIMATE_PARENT_NAME, IMPACT = IF LOCATION_ACCOUNT_ID IS NULL THEN 'Y' ELSE ''
FROM (SELECT DISTINCT A.FIRST_NAME,
A.LAST_NAME,
B.LOCATION_ACCOUNT_ID,
A.ULTIMATE_PARENT_NAME
FROM ACTIVE_ACCOUNTS A,
QL_ASSETS B
WHERE A.ACCOUNT_ID = B.LOCATION_ACCOUNT_ID(+)

Use CASE instead of IF:
SELECT
FIRST_NAME,
LAST_NAME,
ULTIMATE_PARENT_NAME,
CASE WHEN LOCATION_ACCOUNT_ID IS NULL THEN 'Y' ELSE '' END AS IMPACT
FROM (
SELECT DISTINCT
A.FIRST_NAME,
A.LAST_NAME,
B.LOCATION_ACCOUNT_ID,
A.ULTIMATE_PARENT_NAME
FROM ACTIVE_ACCOUNTS A,
QL_ASSETS B
WHERE A.ACCOUNT_ID = B.LOCATION_ACCOUNT_ID(+)
)
You should also use LEFT JOIN syntax instead of the old (+) syntax (but that's more of a style choice in this case - it does not change the result):
SELECT
FIRST_NAME,
LAST_NAME,
ULTIMATE_PARENT_NAME,
CASE WHEN LOCATION_ACCOUNT_ID IS NULL THEN 'Y' ELSE '' END AS IMPACT
FROM (
SELECT DISTINCT
A.FIRST_NAME,
A.LAST_NAME,
B.LOCATION_ACCOUNT_ID,
A.ULTIMATE_PARENT_NAME
FROM ACTIVE_ACCOUNTS A
LEFT JOIN QL_ASSETS B
ON A.ACCOUNT_ID = B.LOCATION_ACCOUNT_ID
)
In fact, since you aren't using any of the columns from B in your result (only checking for existence), you can just use EXISTS. Because IMPACT should be 'Y' when no matching row is found, the test is NOT EXISTS:
SELECT
FIRST_NAME,
LAST_NAME,
ULTIMATE_PARENT_NAME,
CASE WHEN NOT EXISTS(SELECT NULL
FROM QL_ASSETS
WHERE LOCATION_ACCOUNT_ID = A.ACCOUNT_ID)
THEN 'Y'
ELSE ''
END AS IMPACT
FROM ACTIVE_ACCOUNTS A
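A small runnable sketch of the existence check, using Python and SQLite rather than Oracle. The two accounts and the single asset row are invented for the example. Since the question wants 'Y' when LOCATION_ACCOUNT_ID would be null, that is, when the account has no asset, the sketch tests NOT EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: account 1 has an asset, account 2 has none.
    CREATE TABLE ACTIVE_ACCOUNTS (ACCOUNT_ID INTEGER, FIRST_NAME TEXT);
    CREATE TABLE QL_ASSETS (LOCATION_ACCOUNT_ID INTEGER);
    INSERT INTO ACTIVE_ACCOUNTS VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO QL_ASSETS VALUES (1);
""")

# IMPACT is 'Y' only for accounts with no matching asset row.
rows = conn.execute("""
    SELECT A.FIRST_NAME,
           CASE WHEN NOT EXISTS (SELECT NULL
                                 FROM QL_ASSETS B
                                 WHERE B.LOCATION_ACCOUNT_ID = A.ACCOUNT_ID)
                THEN 'Y'
                ELSE ''
           END AS IMPACT
    FROM ACTIVE_ACCOUNTS A
    ORDER BY A.ACCOUNT_ID
""").fetchall()

for row in rows:
    print(row)
```

Unlike the outer-join version, this returns one row per account however many assets match, so no DISTINCT is needed.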

Use a case statement:
SELECT FIRST_NAME,
LAST_NAME,
ULTIMATE_PARENT_NAME,
CASE WHEN Location_Account_ID IS NULL THEN 'Y' ELSE '' END AS IMPACT
FROM (
SELECT DISTINCT A.FIRST_NAME,
A.LAST_NAME,
B.LOCATION_ACCOUNT_ID,
A.ULTIMATE_PARENT_NAME
FROM ACTIVE_ACCOUNTS A,
QL_ASSETS B
WHERE A.ACCOUNT_ID = B.LOCATION_ACCOUNT_ID(+)
) a
P.S. I also added an alias for your derived table so you won't get an error for that.

I'm not sure I fully understood what you are asking with this statement:
(that shows a value of 'Y' if the LOCATION_ACCOUNT_ID field in the
subquery is null)
but I'd suggest using an NVL expression in your select list:
NVL(B.LOCATION_ACCOUNT_ID,'Y') IMPACT
Note that NVL returns LOCATION_ACCOUNT_ID itself when it is not null, so matched rows show the ID rather than an empty string.

Showing specific output data based on duplicate rows and null values [postgresql]

I'm using the following SQL (with a union to two similar queries):
SELECT
distinct a.source,
a.p_id,
a.name,
b.prod_count,
b.prod_amt,
'Def' as prod_type
FROM
dwh.attribution_product_count a
LEFT OUTER JOIN
(
SELECT
distinct source,
p_id,
name,
sum(acct_count) as prod_count,
sum(acct_amt) as prod_amt
FROM
dwh.prod_count
WHERE
month = 3 AND
default_banner_flag = 0 AND
loan_flag = 3
GROUP BY
source,
name,
p_id ) as b
ON
a.p_id = b.p_id
UNION
SELECT
distinct a.source,
a.p_id,
a.name,
b.prod_count,
b.prod_amt,
'Other' as prod_type
FROM
dwh.attribution_product_count a
LEFT OUTER JOIN
(
SELECT
distinct source,
p_id,
name,
sum(acct_count) as prod_count,
sum(acct_amt) as prod_amt
FROM
dwh.prod_count
WHERE
month = 3 AND
default_banner_flag = 1 AND
loan_flag = 3
GROUP BY
source,
name,
p_id
ORDER BY
name ) as b
ON
a.p_id = b.p_id
The output I'm getting looks like this:
Essentially since FakeName #2 has one row showing actual numbers (not null), I ONLY want FakeName #2 to show up. This means I also want the null row for FakeName #2. But, since FakeName #1 and #3 have 2 null rows, I don't need them to show. What type of SQL command (or edit to my query) can accomplish this?

Firstly, if I read your query correctly, you can eliminate the need for a UNION by using CASE and IN. You also have a couple of bogus DISTINCTs in there (since you're using GROUP BY anyway). That gives:
SELECT DISTINCT
a.source,
a.p_id,
a.name,
b.prod_count,
b.prod_amt,
Case When default_banner_flag = 0 Then 'Def' Else 'Other' End as prod_type
FROM
dwh.attribution_product_count a
LEFT OUTER JOIN
(
SELECT
source,
p_id,
name,
default_banner_flag,
sum(acct_count) as prod_count,
sum(acct_amt) as prod_amt
FROM
dwh.prod_count
WHERE
month = 3 AND
default_banner_flag in (0, 1) AND
loan_flag = 3
GROUP BY
source,
name,
p_id,
default_banner_flag
) as b
ON
a.p_id = b.p_id
However, what you actually want is information about those p_ids which have at least one row in dwh.prod_count, so I think you can change your whole query around to use that as the sub-select:
SELECT
a.source,
a.p_id,
a.name,
sum(acct_count) as prod_count,
sum(acct_amt) as prod_amt,
Case When default_banner_flag = 0 Then 'Def' Else 'Other' End as prod_type
FROM
dwh.attribution_product_count a
LEFT OUTER JOIN
dwh.prod_count b
On a.p_id = b.p_id
INNER JOIN
(
SELECT DISTINCT
p_id
FROM
dwh.prod_count
WHERE
month = 3 AND
default_banner_flag in (0, 1) AND
loan_flag = 3
) as c
ON a.p_id = c.p_id
WHERE
month = 3 AND
default_banner_flag in (0, 1) AND
loan_flag = 3
GROUP BY
a.source,
a.p_id,
a.name,
default_banner_flag
(You could also rewrite this as a WHERE p_id IN ( sub-select ) or with a little fiddling WHERE EXISTS ( ... ), but this seemed the easiest version to demonstrate.)
Note that I haven't actually tested any of these queries, but I think they're logically sound.

Remove repeated values from the table

I have a PL/SQL select query like this:
select
a.sgm,
b.numbr
from tbl1 a, tbl2 b
where b.itemId = a.itemId
and b.orgId = a.orgId
and a.srvCode = 'F'
and a.nbrCode <> 1
and rownum <= 7
Right now it returns:
sgm-|-numbr
-----------
abc-|-123
abc-|-678
abc-|-78
abc-|-099
bcd-|-153
bcd-|-123
bcd-|-123
I need it to return:
sgm-|-numbr
-----------
abc-|-123
bcd-|-153
That is, I need to remove the repeated values in the first column, so that sgm doesn't repeat.

Since you are using Oracle, then try this simplified version using a CTE:
WITH CTE as (
SELECT sgm, numbr,
rownum rn
FROM YourTable
)
SELECT CTE.sgm, CTE.numbr
FROM CTE
JOIN (
SELECT sgm, MIN(rn) minrn
FROM CTE
GROUP BY sgm) t ON CTE.sgm = t.sgm AND CTE.rn = t.minrn
http://sqlfiddle.com/#!4/8d6fb/10
You can replace your query in the CTE above.
Good luck.

SELECT a.sgm, MAX(b.numbr)
FROM tbl1 a INNER JOIN tbl2 b
ON a.itemID = b.itemId
AND a.orgId = b.orgId
WHERE a.srvCode= 'F'
AND a.nbrCode <> 1
AND rownum <= 7
GROUP BY a.sgm
Apply an aggregate function of your choice, such as MAX(), to b.numbr and group by a.sgm; this should do what you need.
Advice: write your joins explicitly; compare your query with mine.
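Here is a minimal sketch of the grouping approach, run from Python against SQLite. It uses a single hypothetical table named joined in place of the tbl1/tbl2 join, loads the sample values from the question, and assumes numbr is numeric (so 099 is stored as 99).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the joined result shown in the question.
    CREATE TABLE joined (sgm TEXT, numbr INTEGER);
    INSERT INTO joined VALUES
        ('abc', 123), ('abc', 678), ('abc', 78), ('abc', 99),
        ('bcd', 153), ('bcd', 123), ('bcd', 123);
""")

# One row per sgm; the aggregate decides which numbr represents the group.
rows = conn.execute("""
    SELECT sgm, MIN(numbr) AS lowest, MAX(numbr) AS highest
    FROM joined
    GROUP BY sgm
    ORDER BY sgm
""").fetchall()

for row in rows:
    print(row)
```

Neither MIN nor MAX reproduces the "first row" shown in the desired output (abc with 123); picking a first row needs an explicit ordering column, which the sample data does not have.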

select a.sgm,MAX(b.numbr)
from tbl1 a, tbl2 b
where b.itemId = a.itemId
AND b.orgId= a.orgId
and a.srvCode= 'F'
and a.nbrCode <> 1
and rownum<=7
group by sgm
The value of sgm won't repeat, and the maximum numbr is selected for each sgm; similarly, you can select the minimum value using the MIN function.

Use GROUP BY, with an aggregate on every column that is not grouped:
select
a.sgm,
min(b.numbr) as numbr
from tbl1 a, tbl2 b
where b.itemId = a.itemId
and b.orgId = a.orgId
and a.srvCode = 'F'
and a.nbrCode <> 1
and rownum <= 7
group by a.sgm

Select a.* from tbl a , tbl b WHERE a.userid > b.userid and
a.sgm = b.sgm;
Check this fiddle http://sqlfiddle.com/#!2/40b8f/2

Produce result table from multiple tables

SQL Server 2008 R2
I have 3 tables containing data for 3 different types of events
Type1, Type2, Type3, each with two columns:
DatePoint ValuePoint
I want to produce a result table that looks like this:
DatePoint TotalType1 TotalType2 TotalType3
I started with this:
SELECT [DatePoint]
,SUM(ValuePoint) as TotalType1
FROM [dbo].[Type1]
GROUP BY [DatePoint]
ORDER BY [DatePoint]
SELECT [DatePoint]
,SUM(ValuePoint) as TotalType2
FROM [dbo].[Type2]
GROUP BY [DatePoint]
ORDER BY [DatePoint]
SELECT [DatePoint]
,SUM(ValuePoint) as TotalType3
FROM [dbo].[Type3]
GROUP BY [DatePoint]
ORDER BY [DatePoint]
So I have three results, but I need to produce one (Date TotalType1 TotalType2 TotalType3). What do I need to do next to achieve my goal?
UPDATE
I forgot to mention that a DatePoint that exists in one type may or may not exist in another.

Here's my take. I assume that you don't have the same datetime values in every table (certainly, the stuff I get to work with is never so consistent). There should be an easier way to do this, but once you're past two outer joins things can get pretty tricky.
SELECT
dp.DatePoint
,isnull(t1.TotalType1, 0) TotalType1
,isnull(t2.TotalType2, 0) TotalType2
,isnull(t3.TotalType3, 0) TotalType3
from (-- Without "ALL", UNION will filter out duplicates
select DatePoint
from Type1
union select DatePoint
from Type2
union select DatePoint
from Type3) dp
left outer join (select DatePoint, sum(ValuePoint) TotalType1
from Type1
group by DatePoint) t1
on t1.DatePoint = dp.DatePoint
left outer join (select DatePoint, sum(ValuePoint) TotalType2
from Type2
group by DatePoint) t2
on t2.DatePoint = dp.DatePoint
left outer join (select DatePoint, sum(ValuePoint) TotalType3
from Type3
group by DatePoint) t3
on t3.DatePoint = dp.DatePoint
order by dp.DatePoint
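The approach above (a deduplicated list of dates, left-joined to one pre-aggregated subquery per table) can be sketched in Python with SQLite. The dates and values are invented, DatePoint is stored as text for simplicity, and COALESCE replaces SQL Server's ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: the three tables only partly share dates.
    CREATE TABLE Type1 (DatePoint TEXT, ValuePoint INTEGER);
    CREATE TABLE Type2 (DatePoint TEXT, ValuePoint INTEGER);
    CREATE TABLE Type3 (DatePoint TEXT, ValuePoint INTEGER);
    INSERT INTO Type1 VALUES ('2024-01-01', 1), ('2024-01-01', 2), ('2024-01-02', 5);
    INSERT INTO Type2 VALUES ('2024-01-02', 10);
    INSERT INTO Type3 VALUES ('2024-01-03', 7);
""")

# dp is the list of all dates; each tN is already summed per date,
# so every join matches at most one row per date.
rows = conn.execute("""
    SELECT dp.DatePoint,
           COALESCE(t1.TotalType1, 0) AS TotalType1,
           COALESCE(t2.TotalType2, 0) AS TotalType2,
           COALESCE(t3.TotalType3, 0) AS TotalType3
    FROM (SELECT DatePoint FROM Type1
          UNION SELECT DatePoint FROM Type2
          UNION SELECT DatePoint FROM Type3) dp
    LEFT OUTER JOIN (SELECT DatePoint, SUM(ValuePoint) AS TotalType1
                     FROM Type1 GROUP BY DatePoint) t1
        ON t1.DatePoint = dp.DatePoint
    LEFT OUTER JOIN (SELECT DatePoint, SUM(ValuePoint) AS TotalType2
                     FROM Type2 GROUP BY DatePoint) t2
        ON t2.DatePoint = dp.DatePoint
    LEFT OUTER JOIN (SELECT DatePoint, SUM(ValuePoint) AS TotalType3
                     FROM Type3 GROUP BY DatePoint) t3
        ON t3.DatePoint = dp.DatePoint
    ORDER BY dp.DatePoint
""").fetchall()

for row in rows:
    print(row)
```

Dates missing from a table show up with a total of 0 rather than NULL.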

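A DISTINCT might help in places, but the general idea should be the following (note that if a date can have several rows in more than one table, these joins multiply the rows and inflate the sums):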
SELECT
t.[DatePoint],
SUM(t1.ValuePoint) as TotalType1,
SUM(t2.ValuePoint) as TotalType2,
SUM(t3.ValuePoint) as TotalType3
FROM
(
SELECT [DatePoint] FROM [dbo].[Type1]
UNION
SELECT [DatePoint] FROM [dbo].[Type2]
UNION
SELECT [DatePoint] FROM [dbo].[Type3]
) as t
LEFT JOIN
[dbo].[Type1] t1
ON
t1.[DatePoint] = t.[DatePoint]
LEFT JOIN
[dbo].[Type2] t2
ON
t2.[DatePoint] = t.[DatePoint]
LEFT JOIN
[dbo].[Type3] t3
ON
t3.[DatePoint] = t.[DatePoint]
GROUP BY
t.[DatePoint]
ORDER BY
t.[DatePoint]
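One thing to watch with joining the raw tables before summing: when a date has several rows in more than one table, the joins pair every row with every other row and the sums come out too large. A small sketch in Python with SQLite, using two hypothetical tables with two rows each on the same date, shows the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: two rows per table on the same date.
    CREATE TABLE Type1 (DatePoint TEXT, ValuePoint INTEGER);
    CREATE TABLE Type2 (DatePoint TEXT, ValuePoint INTEGER);
    INSERT INTO Type1 VALUES ('2024-01-01', 1), ('2024-01-01', 2);
    INSERT INTO Type2 VALUES ('2024-01-01', 10), ('2024-01-01', 20);
""")

# Joining the raw rows yields 2 x 2 = 4 rows for the date, so each sum doubles.
joined = conn.execute("""
    SELECT t.DatePoint, SUM(t1.ValuePoint), SUM(t2.ValuePoint)
    FROM (SELECT DatePoint FROM Type1 UNION SELECT DatePoint FROM Type2) AS t
    LEFT JOIN Type1 t1 ON t1.DatePoint = t.DatePoint
    LEFT JOIN Type2 t2 ON t2.DatePoint = t.DatePoint
    GROUP BY t.DatePoint
""").fetchall()

# Summing each table on its own gives the real totals for comparison.
true_total1 = conn.execute("SELECT SUM(ValuePoint) FROM Type1").fetchone()[0]
true_total2 = conn.execute("SELECT SUM(ValuePoint) FROM Type2").fetchone()[0]

print(joined, true_total1, true_total2)
```

Aggregating each table per date before joining, or stacking the tables with UNION ALL, avoids the inflation.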

To avoid all of the JOINs:
SELECT
SQ.DatePoint,
SUM(CASE WHEN SQ.type = 1 THEN SQ.ValuePoint ELSE 0 END) AS TotalType1,
SUM(CASE WHEN SQ.type = 2 THEN SQ.ValuePoint ELSE 0 END) AS TotalType2,
SUM(CASE WHEN SQ.type = 3 THEN SQ.ValuePoint ELSE 0 END) AS TotalType3
FROM (
SELECT
1 AS type,
DatePoint,
ValuePoint
FROM
dbo.Type1
UNION ALL
SELECT
2 AS type,
DatePoint,
ValuePoint
FROM
dbo.Type2
UNION ALL
SELECT
3 AS type,
DatePoint,
ValuePoint
FROM
dbo.Type3
) AS SQ
GROUP BY
DatePoint
ORDER BY
DatePoint
From the little information provided though, it seems like there are some flaws in the database design, which is probably part of the reason that querying the data is so difficult.
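For comparison, here is the UNION ALL plus conditional SUM version as a runnable sketch in Python with SQLite. The sample rows are made up and DatePoint is stored as text for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: the three tables only partly share dates.
    CREATE TABLE Type1 (DatePoint TEXT, ValuePoint INTEGER);
    CREATE TABLE Type2 (DatePoint TEXT, ValuePoint INTEGER);
    CREATE TABLE Type3 (DatePoint TEXT, ValuePoint INTEGER);
    INSERT INTO Type1 VALUES ('2024-01-01', 1), ('2024-01-01', 2), ('2024-01-02', 5);
    INSERT INTO Type2 VALUES ('2024-01-02', 10);
    INSERT INTO Type3 VALUES ('2024-01-03', 7);
""")

# Stack all rows with a type tag, then pivot the tag into columns with CASE.
rows = conn.execute("""
    SELECT SQ.DatePoint,
           SUM(CASE WHEN SQ.type = 1 THEN SQ.ValuePoint ELSE 0 END) AS TotalType1,
           SUM(CASE WHEN SQ.type = 2 THEN SQ.ValuePoint ELSE 0 END) AS TotalType2,
           SUM(CASE WHEN SQ.type = 3 THEN SQ.ValuePoint ELSE 0 END) AS TotalType3
    FROM (
        SELECT 1 AS type, DatePoint, ValuePoint FROM Type1
        UNION ALL
        SELECT 2 AS type, DatePoint, ValuePoint FROM Type2
        UNION ALL
        SELECT 3 AS type, DatePoint, ValuePoint FROM Type3
    ) AS SQ
    GROUP BY SQ.DatePoint
    ORDER BY SQ.DatePoint
""").fetchall()

for row in rows:
    print(row)
```

UNION ALL matters here: a plain UNION would silently drop identical (date, value) rows within a type and undercount.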
