How to avoid duplicate data in the subquery

I have two tables as below.
Product table:
+-----+------------+-----+-------+--------+
| id | activityId | age | queue | status |
+-----+------------+-----+-------+--------+
| 100 | 2 | 0 | start | 2 |
| 101 | 3 | 0 | in | 5 |
+-----+------------+-----+-------+--------+
Department table:
+-----+------------+-------+----------+
| id | activityId | queue | exittime |
+-----+------------+-------+----------+
| 100 | 1 | new | null |
| 100 | 2 | start | null |
| 100 | 2 | start | null |
| 101 | 1 | new | null |
| 101 | 1 | new | null |
| 101 | 3 | in | null |
| 101 | 3 | in | null |
+-----+------------+-------+----------+
I am trying to update the age column of the Product table with the query below, but it returns the error ORA-01427: single-row subquery returns more than one row.
update Product pd set pd.age = (select (case when dp.exittime!= null then
(sysdate - dp.exittime)
else ( case when pd.queue = dp.queue
then (select (sysdate - dp1.entrytime) from department dp1 where pd.id = dp1.id
) else 2 END) END)
from department dp
where dp.id > 1
AND pd.id = dp.id
AND pd.status in('1','7','2','5')
AND pd.queue= dp.queue
AND pd.activityId = dp.activityId )
where exists
(select 1 from department dp
where dp.id > 1
AND pd.id = dp.id
AND pd.status in('1','7','2','5')
AND pd.queue= dp.queue
AND pd.activityId = dp.activityId );
The subquery returns multiple rows because of the repeated activityId values in the Department table. How can I make the subquery return a single value?

This query will identify the scenarios under which you get multiple rows.
select
dp.id,
dp.queue,
dp.activityId,
COUNT(*)
from
department dp
inner join
product pd
ON pd.id = dp.id
AND pd.queue= dp.queue
AND pd.activityId = dp.activityId
where
dp.id > 1
AND pd.status in('1','7','2','5')
GROUP BY
dp.id,
dp.queue,
dp.activityId
HAVING
COUNT(*) > 1
For those cases you need to determine one of the following...
How to fix the data to return only one row
How to fix the query to return only one row
How to pick just one row from the multiple rows returned
As we can't see your data, we can't fix any of that for you.
After investigating, however, you may be able to return with a more specific question.
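As one hedged illustration of the third option (picking one row), an aggregate such as MAX in the scalar subquery collapses exact duplicates to a single value. The sketch below uses Python's sqlite3 as a stand-in for Oracle, with simplified copies of the two tables. The `waited` column and its values are made up for the example (they stand in for `sysdate - exittime`), so treat this as a shape to adapt, not a drop-in fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER, activityId INTEGER, age INTEGER, queue TEXT, status TEXT);
CREATE TABLE department (id INTEGER, activityId INTEGER, queue TEXT, waited INTEGER);
INSERT INTO product VALUES (100, 2, 0, 'start', '2'), (101, 3, 0, 'in', '5');
-- 'waited' is a hypothetical column standing in for (sysdate - exittime).
INSERT INTO department VALUES
  (100, 1, 'new', 9), (100, 2, 'start', 4), (100, 2, 'start', 4),
  (101, 1, 'new', 9), (101, 3, 'in', 7), (101, 3, 'in', 7);
""")

# Step 1: the diagnostic query, listing the keys that match more than one row.
duplicates = conn.execute("""
    SELECT dp.id, dp.queue, dp.activityId, COUNT(*)
    FROM department dp
    JOIN product pd
      ON pd.id = dp.id AND pd.queue = dp.queue AND pd.activityId = dp.activityId
    GROUP BY dp.id, dp.queue, dp.activityId
    HAVING COUNT(*) > 1
    ORDER BY dp.id
""").fetchall()

# Step 2: MAX() makes the scalar subquery yield exactly one value per product.
conn.execute("""
    UPDATE product
    SET age = (SELECT MAX(dp.waited)
               FROM department dp
               WHERE dp.id = product.id
                 AND dp.queue = product.queue
                 AND dp.activityId = product.activityId)
    WHERE EXISTS (SELECT 1
                  FROM department dp
                  WHERE dp.id = product.id
                    AND dp.queue = product.queue
                    AND dp.activityId = product.activityId)
""")
ages = conn.execute("SELECT id, age FROM product ORDER BY id").fetchall()
print(duplicates)
print(ages)
```

MAX is only a safe choice when the duplicate rows carry the same value or when the largest value is the one you actually want; otherwise the data itself probably needs fixing first.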

Related

SQL Joins for Multiple Fields with Null Values

I have a table of maintenance requirements and associated monthly frequency it is to be performed
maint
+----------+------+
| maint_id | freq |
+----------+------+
| 1 | 6 |
| 2 | 12 |
| 3 | 24 |
| 4 | 3 |
+----------+------+
I also have a table of equipment with data on its manufacturer, model, device type and building.
equip
+----------+--------+--------+--------+---------+
| equip_id | mfg_id | mod_id | dev_id | bldg_id |
+----------+--------+--------+--------+---------+
| 1 | 1 | 1 | 3 | 1 |
| 2 | 1 | 2 | 3 | 1 |
| 3 | 2 | 3 | 1 | 2 |
| 4 | 2 | 3 | 1 | 3 |
+----------+--------+--------+--------+---------+
I am trying to match each maintenance requirement with its associated equipment. Each requirement applies to a specific manufacturer, model, device, facility or any combination of these in its scope of application.
I have created a table to manage these relationships like this:
maint_equip
+----------------+----------+--------+--------+--------+---------+
| maint_equip_id | maint_id | mfg_id | mod_id | dev_id | bldg_id |
+----------------+----------+--------+--------+--------+---------+
| 1 | 1 | NULL | NULL | 1 | NULL |
| 2 | 2 | 2 | NULL | NULL | 2 |
| 3 | 3 | NULL | NULL | NULL | 1 |
| 4 | 3 | NULL | NULL | NULL | 3 |
| 5 | 4 | 1 | NULL | 3 | 1 |
+----------------+----------+--------+--------+--------+---------+
As per the table above, requirement 1 would only apply to any equipment having device type "1."
Requirement 2 would apply to all equipment having both manufacturer "2" AND building "2."
Requirement 3 would apply to all equipment having building "1" OR building "3"
Requirement 4 would apply to equipment having all of mfg_id "1" AND dev_id "3" AND building "1."
I am trying to write a query to give me a list of all equipment ids and all the associated frequency requirements based on the relationships defined in maint_equip. The problem I'm running into is handling the multiple joins. I have already tried:
SELECT equip.equip_id, maint.freq
FROM equip INNER JOIN
maint_equip ON equip.mfg_id = maint_equip.mfg_id
OR equip.mod_id = maint_equip.mod_id
OR equip.dev_id = maint_equip.dev_id
OR equip.bldg_id = maint_equip.bldg_id INNER JOIN
maint ON maint_equip.maint_id = maint.maint_id
but separating multiple joins using OR means that it is not accounting for the AND contingencies of each row. For example, maint_id 2 should only apply to equip_id 3 but ids 3 and 4 are both returned. If AND is used, then no rows are returned because none have a value for all columns.
Is it possible to join the tables in such a way to accomplish this or is there another way to structure the data?

If I get this right, when an equipment related ID in maint_equip is null, that should count as a match. Only if it isn't null, it must match the respective ID in equip. That is, you want to check if an ID in maint_equip is null or equal to its counterpart from equip.
SELECT e.equip_id,
m.freq
FROM equip e
INNER JOIN maint_equip me
ON (me.mfg_id IS NULL
OR me.mfg_id = e.mfg_id)
AND (me.mod_id IS NULL
OR me.mod_id = e.mod_id)
AND (me.dev_id IS NULL
OR me.dev_id = e.dev_id)
AND (me.bldg_id IS NULL
OR me.bldg_id = e.bldg_id)
INNER JOIN maint m
ON m.maint_id = me.maint_id;
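To check the "NULL means wildcard" join against the sample data from the question, here is a small sketch using Python's sqlite3. The in-memory database and the extra `maint_id` output column are choices made for this example; the join condition itself is the one from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE maint (maint_id INTEGER, freq INTEGER);
CREATE TABLE equip (equip_id INTEGER, mfg_id INTEGER, mod_id INTEGER,
                    dev_id INTEGER, bldg_id INTEGER);
CREATE TABLE maint_equip (maint_equip_id INTEGER, maint_id INTEGER, mfg_id INTEGER,
                          mod_id INTEGER, dev_id INTEGER, bldg_id INTEGER);
INSERT INTO maint VALUES (1, 6), (2, 12), (3, 24), (4, 3);
INSERT INTO equip VALUES (1, 1, 1, 3, 1), (2, 1, 2, 3, 1), (3, 2, 3, 1, 2), (4, 2, 3, 1, 3);
INSERT INTO maint_equip VALUES
  (1, 1, NULL, NULL, 1, NULL),
  (2, 2, 2, NULL, NULL, 2),
  (3, 3, NULL, NULL, NULL, 1),
  (4, 3, NULL, NULL, NULL, 3),
  (5, 4, 1, NULL, 3, 1);
""")

# A NULL in maint_equip matches anything; a non-NULL value must equal equip's value.
rows = conn.execute("""
    SELECT e.equip_id, m.maint_id, m.freq
    FROM equip e
    JOIN maint_equip me
      ON (me.mfg_id  IS NULL OR me.mfg_id  = e.mfg_id)
     AND (me.mod_id  IS NULL OR me.mod_id  = e.mod_id)
     AND (me.dev_id  IS NULL OR me.dev_id  = e.dev_id)
     AND (me.bldg_id IS NULL OR me.bldg_id = e.bldg_id)
    JOIN maint m ON m.maint_id = me.maint_id
    ORDER BY e.equip_id, m.maint_id
""").fetchall()
for row in rows:
    print(row)
```

With this data, requirement 2 is matched only by equip_id 3, which is the behaviour the question asks for.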

Try this:
( equip.mfg_id = maint_equip.mfg_id OR maint_equip.mfg_id is null )
AND( equip.mod_id = maint_equip.mod_id OR maint_equip.mod_id is null )
AND( equip.dev_id = maint_equip.dev_id OR maint_equip.dev_id is null )
AND( equip.bldg_id = maint_equip.bldg_id OR maint_equip.bldg_id is null )

Note that mod_id is always NULL in your maint_equip sample, so it is left out of the join below. With that caveat, the query covers all the combinations that appear in your sample data.
SELECT maint_equip.maint_id, equip.equip_id, maint.freq
FROM equip INNER JOIN
maint_equip ON (
(equip.mfg_id = maint_equip.mfg_id AND
equip.dev_id = maint_equip.dev_id AND
equip.bldg_id = maint_equip.bldg_id
) OR
(equip.mfg_id = maint_equip.mfg_id AND
maint_equip.dev_id is NULL AND
equip.bldg_id = maint_equip.bldg_id
) OR
(maint_equip.mfg_id is NULL AND
equip.dev_id = maint_equip.dev_id AND
maint_equip.bldg_id is NULL
) OR
(maint_equip.mfg_id is NULL AND
maint_equip.dev_id is NULL AND
equip.bldg_id = maint_equip.bldg_id
) )
INNER JOIN
maint ON maint_equip.maint_id = maint.maint_id
;

It seems to me that what you're actually looking for is the maintenance schedule that has the highest number of matches. You can get that by using a SUM with a series of CASE expressions to get the count of matching columns.
Then you have to account for ties where there are multiple maint_id values that match an equal number of times. For the example below, I opted to use maintenance frequency as the tie breaker, favoring more frequent maintenance over less frequent maintenance.
Rextester link with data set up: https://rextester.com/VISR88105
The ROW_NUMBER in the ORDER BY clause sorts the results by number of column matches (the nutty SUM/CASE combo) in descending order to get the most matches first, and then by maintenance frequency in ascending order to favor more frequent maintenance. Easy to reverse that with a DESC if you like. Then the TOP (1) WITH TIES limits the result set to all of the rows where ROW_NUMBER evaluates to 1.
The code:
SELECT TOP (1) WITH TIES
e.equip_id,
m.maint_id,
m.freq
FROM
#maint as m
JOIN
#maint_equip as me
ON
m.maint_id = me.maint_id
JOIN
#equip as e
ON
e.mfg_id = COALESCE(me.mfg_id, e.mfg_id)
AND
e.mod_id = COALESCE(me.mod_id, e.mod_id)
AND
e.dev_id = COALESCE(me.dev_id, e.dev_id)
AND
e.bldg_id = COALESCE(me.bldg_id, e.bldg_id)
GROUP BY
e.equip_id,
m.maint_id,
m.freq
ORDER BY
ROW_NUMBER() OVER (PARTITION BY e.equip_id ORDER BY (
SUM(
(CASE WHEN e.mfg_id = me.mfg_id THEN 1 ELSE 0 END) +
(CASE WHEN e.mod_id = me.mod_id THEN 1 ELSE 0 END) +
(CASE WHEN e.dev_id = me.dev_id THEN 1 ELSE 0 END) +
(CASE WHEN e.bldg_id = me.bldg_id THEN 1 ELSE 0 END)) ) DESC, m.freq )
Results:
+----------+----------+------+
| equip_id | maint_id | freq |
+----------+----------+------+
| 1 | 4 | 3 |
| 2 | 4 | 3 |
| 3 | 2 | 12 |
| 4 | 1 | 6 |
+----------+----------+------+

SQL Group rows for every ID using left outer join

I have a table with almost a million claim records for 6 different conditions such as Diabetes, Hypertension, Heart Failure, etc. Every member has a number of claims, each with a condition such as Diabetes or Hypertension. My goal is to get one row per member showing the number of claims for each condition.
Existing table
+--------------+---------------+------+------------+
| Conditions | ConditionCode | ID | Member_Key |
+--------------+---------------+------+------------+
| DM | 3001 | 1212 | A1528 |
| HTN | 5001 | 1213 | A1528 |
| COPD | 6001 | 1214 | A1528 |
| DM | 3001 | 1215 | A1528 |
| CAD | 8001 | 1823 | B4354 |
| HTN | 5001 | 3458 | B4354 |
+--------------+---------------+------+------------+
Desired Result
+------------+------+-----+----+----+-----+-----+
| Member_Key | COPD | CAD | DM | HF | CHF | HTN |
+------------+------+-----+----+----+-----+-----+
| A1528 | 1 | | 2 | | | 1 |
| B4354 | | 1 | | | | 1 |
+------------+------+-----+----+----+-----+-----+
Query
select distinct tr.Member_Key,C.COPD,D.CAD,DM.DM,HF.HF,CHF.CHF,HTN.HTN
FROM myTable tr
--COPD
left outer join (select Member_Key,'X' as COPD
FROM myTable
where Condition=6001) C
on C.Member_Key=tr.Member_Key
--CAD
left outer join ( ....
For now I'm just using 'X', but I'm trying to get the number of claims in place of X for each condition. I don't think a left outer join is efficient when searching 1 million rows and doing a DISTINCT. Is there another approach to solving this?

You don't need that many subqueries. This is easy with GROUP BY and CASE expressions:
SELECT Member_Key,
SUM(CASE WHEN Condition=6001 THEN 1 ELSE 0 END) AS COPD,
SUM(CASE WHEN Condition=3001 THEN 1 ELSE 0 END) AS DM,
SUM(CASE WHEN Condition=5001 THEN 1 ELSE 0 END) AS HTN,
SUM(CASE WHEN Condition=8001 THEN 1 ELSE 0 END) AS CAD
FROM myTable
GROUP BY Member_Key
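A minimal runnable sketch of this conditional aggregation, using Python's sqlite3 and the six sample rows from the question. It assumes the code column is named ConditionCode, as in the sample table (the queries above call it Condition), and it leaves out HF and CHF because their codes are not given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myTable (Conditions TEXT, ConditionCode INTEGER, ID INTEGER, Member_Key TEXT);
INSERT INTO myTable VALUES
  ('DM',   3001, 1212, 'A1528'), ('HTN', 5001, 1213, 'A1528'),
  ('COPD', 6001, 1214, 'A1528'), ('DM',  3001, 1215, 'A1528'),
  ('CAD',  8001, 1823, 'B4354'), ('HTN', 5001, 3458, 'B4354');
""")

# One pass over the table: each SUM counts only the rows with its own code.
counts = conn.execute("""
    SELECT Member_Key,
           SUM(CASE WHEN ConditionCode = 6001 THEN 1 ELSE 0 END) AS COPD,
           SUM(CASE WHEN ConditionCode = 3001 THEN 1 ELSE 0 END) AS DM,
           SUM(CASE WHEN ConditionCode = 5001 THEN 1 ELSE 0 END) AS HTN,
           SUM(CASE WHEN ConditionCode = 8001 THEN 1 ELSE 0 END) AS CAD
    FROM myTable
    GROUP BY Member_Key
    ORDER BY Member_Key
""").fetchall()
for row in counts:
    print(row)
```

Unlike the desired result in the question, conditions with no claims show up as 0 rather than blank; wrapping each SUM in NULLIF(..., 0) would be one way to get NULLs instead.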

This is an ideal situation for CASE statements:
SELECT tr.Member_Key,
SUM(CASE WHEN Condition=6001 THEN 1 ELSE 0 END) as COPD,
SUM(CASE WHEN Condition=6002 THEN 1 ELSE 0 END) as OtherIssue,
SUM(CASE etc.)
FROM myTable tr
GROUP BY tr.Member_Key

This should be done with a PIVOT, like:
SELECT *
FROM
(SELECT conditions, member_key
FROM t) src
PIVOT
(COUNT (conditions)
for conditions in ([COPD], [CAD], [DM], [HF], [CHF], [HTN])) pvt

Finding nth row using sql

select top 20 *
from dbo.DUTs D
inner join dbo.Statuses S on d.StatusID = s.StatusID
where s.Description = 'Active'
The SQL query above returns the top 20 rows. How can I get the nth row from its result? I looked at previous posts on finding the nth row, but it was not clear how to apply them here.
Thanks.

The row order is arbitrary, so I would add an ORDER BY expression. Then, you can do something like this:
SELECT TOP 1 * FROM (SELECT TOP 20 * FROM ... ORDER BY d.StatusID) AS d ORDER BY d.StatusID DESC
to get the 20th row.
You can also use OFFSET like:
SELECT * FROM ... ORDER BY d.StatusID OFFSET 19 ROWS FETCH NEXT 1 ROWS ONLY
And a third option:
SELECT * FROM (SELECT *, rownum = ROW_NUMBER() OVER (ORDER BY d.StatusID) FROM ...) AS a WHERE rownum = 20
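The OFFSET idea can be tried out with Python's sqlite3, as a rough sketch. SQLite has no TOP or OFFSET ... FETCH, so the example uses its LIMIT 1 OFFSET n-1 spelling instead. The DutID column, the generated rows and the helper name nth_active_dut are all hypothetical, since the question does not show the table layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Statuses (StatusID INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE DUTs (DutID INTEGER PRIMARY KEY, StatusID INTEGER);
INSERT INTO Statuses VALUES (1, 'Active'), (2, 'Inactive');
""")
# Hypothetical data: 50 DUTs, odd ids are Active and even ids are Inactive.
conn.executemany(
    "INSERT INTO DUTs VALUES (?, ?)",
    [(i, 1 if i % 2 else 2) for i in range(1, 51)],
)

def nth_active_dut(conn, n):
    """Return the nth active DUT (1-based) in DutID order, or None if there is none."""
    return conn.execute("""
        SELECT d.DutID, s.Description
        FROM DUTs d
        JOIN Statuses s ON d.StatusID = s.StatusID
        WHERE s.Description = 'Active'
        ORDER BY d.DutID          -- without ORDER BY, "nth" is not well defined
        LIMIT 1 OFFSET ?
    """, (n - 1,)).fetchone()

print(nth_active_dut(conn, 20))
print(nth_active_dut(conn, 26))
```

Skipping n-1 rows and taking one is the same idea as OFFSET 19 ROWS FETCH NEXT 1 ROWS ONLY; only the syntax differs between engines.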

I tend to use CTEs with the ROW_NUMBER() function to get my lists numbered in order. As @zambonee said, you'll need an ORDER BY clause either way, or SQL can return the rows in a different order every time. It doesn't usually, but without ordering it yourself, you're not guaranteed to get the same thing twice. Here I'm assuming there's a [DateCreated] field (DATETIME NOT NULL DEFAULT GETDATE()), which is usually a good idea so you know when that record was entered. This says "give me everything in that table and add a row number with the most recent record as #1":
; WITH AllDUTs
AS (
SELECT *
, DateCreatedRank = ROW_NUMBER() OVER(ORDER BY [DateCreated] DESC)
FROM dbo.DUTs D
INNER JOIN dbo.Statuses S ON D.StatusID = S.StatusID
WHERE S.Description = 'Active'
)
SELECT *
FROM AllDUTs
WHERE AllDUTs.DateCreatedRank = 20;

SELECT * FROM (SELECT * FROM EMP ORDER BY ROWID DESC) WHERE ROWNUM<11

It's another sample:
SELECT * ,CASE WHEN COUNT(0)OVER() =ROW_NUMBER()OVER(ORDER BY number) THEN 1 ELSE 0 END IsNth
FROM (
select top 10 *
from master.dbo.spt_values AS d
where d.type='P'
) AS t
+------+--------+------+-----+------+--------+-------+
| name | number | type | low | high | status | IsNth |
+------+--------+------+-----+------+--------+-------+
| NULL | 0 | P | 1 | 1 | 0 | 0 |
| NULL | 1 | P | 1 | 2 | 0 | 0 |
| NULL | 2 | P | 1 | 4 | 0 | 0 |
| NULL | 3 | P | 1 | 8 | 0 | 0 |
| NULL | 4 | P | 1 | 16 | 0 | 0 |
| NULL | 5 | P | 1 | 32 | 0 | 0 |
| NULL | 6 | P | 1 | 64 | 0 | 0 |
| NULL | 7 | P | 1 | 128 | 0 | 0 |
| NULL | 8 | P | 2 | 1 | 0 | 0 |
| NULL | 9 | P | 2 | 2 | 0 | 1 |
+------+--------+------+-----+------+--------+-------+

Count within the result set of a subquery

I have the following relations in my database:
Invoice InvoiceMeal
--------------------- ---------------------------
| InvoiceId | Total | | Id | InvoiceId | MealId |
--------------------- ---------------------------
| 1 | 22.32 | | 1 | 1 | 3 |
--------------------- ---------------------------
| 2 | 12.18 | | 2 | 1 | 2 |
--------------------- ---------------------------
| 3 | 27.76 | | 3 | 2 | 2 |
--------------------- ---------------------------
Meal Type
----------------------------------- -------------------
| Id | Name | TypeId | | Id | Name |
----------------------------------- -------------------
| 1 | Hamburger | 1 | | 1 | Meat |
----------------------------------- -------------------
| 2 | Soja Beans | 2 | | 2 | Vegetarian |
----------------------------------- -------------------
| 3 | Chicken | 2 |
-----------------------------------
What I want to query from the database is InvoiceId and Total of all Invoices which consist of at least two Meals where at least one of the Meals is of Type Vegetarian. I have the following SQL query and it works:
SELECT
i."Id", i."Total"
FROM
public."Invoice" i
WHERE
(SELECT COUNT(*)
FROM public."InvoiceMeal" im
WHERE im."InvoiceId" = i."Id" AND
(SELECT COUNT(*)
FROM public."Meal" m, public."Type" t
WHERE im."MealId" = m."Id" AND
m."TypeId" = t."Id" AND
t."Name" = 'Vegetarian') > 0
) >= 2;
My problem with this query is that I cannot easily modify the condition that there must be at least one vegetarian Meal. I want to be able, for example, to change it to at least two vegetarian meals. How can I achieve this with my query?

I would approach this by joining the tables together and using aggregation. The having clause can handle the conditions:
select i.InvoiceId, i.Total
from InvoiceMeal im join
Invoice i
on i.InvoiceId = im.InvoiceId join
Meal m
on im.MealId = m.Id join
Type t
on m.TypeId = t.Id
group by i.InvoiceId, i.Total
having count(distinct im.MealId) >= 2 and
sum(case when t.Name = 'Vegetarian' then 1 else 0 end) > 0;
I also see no reason to put double quotes around column names. That just makes the query harder to write and read.
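Since both conditions live in the HAVING clause, the thresholds can simply become parameters. Below is a minimal sketch with Python's sqlite3, using the column names from the tables in the question. The helper name find_invoices and the two InvoiceMeal rows for invoice 3 are additions made for illustration; they are not in the original data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Invoice (InvoiceId INTEGER, Total REAL);
CREATE TABLE InvoiceMeal (Id INTEGER, InvoiceId INTEGER, MealId INTEGER);
CREATE TABLE Meal (Id INTEGER, Name TEXT, TypeId INTEGER);
CREATE TABLE Type (Id INTEGER, Name TEXT);
INSERT INTO Invoice VALUES (1, 22.32), (2, 12.18), (3, 27.76);
INSERT INTO InvoiceMeal VALUES (1, 1, 3), (2, 1, 2), (3, 2, 2);
-- Hypothetical extra rows: invoice 3 gets one meat and one vegetarian meal.
INSERT INTO InvoiceMeal VALUES (4, 3, 1), (5, 3, 2);
INSERT INTO Meal VALUES (1, 'Hamburger', 1), (2, 'Soja Beans', 2), (3, 'Chicken', 2);
INSERT INTO Type VALUES (1, 'Meat'), (2, 'Vegetarian');
""")

def find_invoices(conn, min_meals, min_vegetarian):
    """Invoices with at least min_meals distinct meals, min_vegetarian of them vegetarian."""
    return conn.execute("""
        SELECT i.InvoiceId, i.Total
        FROM Invoice i
        JOIN InvoiceMeal im ON im.InvoiceId = i.InvoiceId
        JOIN Meal m ON m.Id = im.MealId
        JOIN Type t ON t.Id = m.TypeId
        GROUP BY i.InvoiceId, i.Total
        HAVING COUNT(DISTINCT im.MealId) >= ?
           AND SUM(CASE WHEN t.Name = 'Vegetarian' THEN 1 ELSE 0 END) >= ?
        ORDER BY i.InvoiceId
    """, (min_meals, min_vegetarian)).fetchall()

print(find_invoices(conn, 2, 1))
print(find_invoices(conn, 2, 2))
```

Changing "at least one vegetarian meal" to "at least two" is then just a different second argument.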

Is it possible to select multiple conditional counts across three tables in a single SQL query?

My SQL-fu is too weak for this, and I'm not even sure it's possible in a single SQL call.
Given I have the following tables:
PARTNER
+----+--------+
| id | name |
+----+--------+
| 1 | bloggs |
| 2 | jones |
PARTNER MANAGER
+----+--------------+------+
| id | partner_id | name |
+----+--------------+------+
| 1 | 1 | fred |
| 2 | 2 | dave |
COMPANY
+----+--------------------+--------+----------+
| id | partner_manager_id | name | active |
+----+--------------------+--------+----------+
| 1 | 1 | comp1 | true |
| 2 | 1 | comp2 | false |
| 3 | 2 | comp3 | true |
| 4 | 2 | comp4 | true |
| 5 | 2 | comp5 | true |
| 6 | 2 | comp6 | true |
I'd like to output the following in a single SQL call:
+--------------+--------------------+----------------------+
| partner_name | n_active_companies | n_inactive_companies |
+--------------+--------------------+----------------------+
| bloggs | 1 | 1 |
| jones | 4 | 0 |
I can join the three tables using two LEFT JOINs but how I can aggregate the counts (with or without the WHERE clause) is eluding me.
Am I barking up the wrong tree, so to speak?

This gets you most of the way there:
SELECT
partner_manager_id,
SUM(CASE WHEN active THEN 1 ELSE 0 END) AS n_active_companies,
SUM(CASE WHEN active THEN 0 ELSE 1 END) AS n_inactive_companies
FROM COMPANY
GROUP BY partner_manager_id
The rest of your question is basically asking how to join this result to the remaining tables. As you point out, to do this use JOINs.
SELECT
PARTNER.name,
T1.n_active_companies,
T1.n_inactive_companies
FROM
PARTNER
LEFT JOIN PARTNER_MANAGER ON partner_id = PARTNER.id
LEFT JOIN
(
SELECT
partner_manager_id,
SUM(CASE WHEN active THEN 1 ELSE 0 END) AS n_active_companies,
SUM(CASE WHEN active THEN 0 ELSE 1 END) AS n_inactive_companies
FROM COMPANY
GROUP BY partner_manager_id
) T1
ON T1.partner_manager_id = PARTNER_MANAGER.id
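Here is a small sketch that runs the joined query against the sample data, using Python's sqlite3. SQLite has no boolean type, so `active` is stored as 1/0 here; that storage choice and the ORDER BY are assumptions of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PARTNER (id INTEGER, name TEXT);
CREATE TABLE PARTNER_MANAGER (id INTEGER, partner_id INTEGER, name TEXT);
CREATE TABLE COMPANY (id INTEGER, partner_manager_id INTEGER, name TEXT, active INTEGER);
INSERT INTO PARTNER VALUES (1, 'bloggs'), (2, 'jones');
INSERT INTO PARTNER_MANAGER VALUES (1, 1, 'fred'), (2, 2, 'dave');
-- active: 1 = true, 0 = false
INSERT INTO COMPANY VALUES
  (1, 1, 'comp1', 1), (2, 1, 'comp2', 0), (3, 2, 'comp3', 1),
  (4, 2, 'comp4', 1), (5, 2, 'comp5', 1), (6, 2, 'comp6', 1);
""")

# Aggregate companies per manager first, then attach the result to each partner.
result = conn.execute("""
    SELECT p.name, t1.n_active_companies, t1.n_inactive_companies
    FROM PARTNER p
    LEFT JOIN PARTNER_MANAGER pm ON pm.partner_id = p.id
    LEFT JOIN (
        SELECT partner_manager_id,
               SUM(CASE WHEN active THEN 1 ELSE 0 END) AS n_active_companies,
               SUM(CASE WHEN active THEN 0 ELSE 1 END) AS n_inactive_companies
        FROM COMPANY
        GROUP BY partner_manager_id
    ) t1 ON t1.partner_manager_id = pm.id
    ORDER BY p.name
""").fetchall()
for row in result:
    print(row)
```

Note that this shape returns one row per partner manager, so a partner with several managers would appear several times unless the outer query is grouped by partner as well.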

select p.name "Partner Name"
, c1.cnt "n_active_companies"
, c2.cnt "n_inactive_companies"
from partner p
, (select partner_manager_id id, count(partner_manager_id) cnt from company where active = 'true' group by partner_manager_id) c1
, (select partner_manager_id id, count(partner_manager_id) cnt from company where active = 'false' group by partner_manager_id) c2
where c1.id = p.id
and c2.id = p.id

select p.name as 'partner_name',
sum(case when active then 1 else 0 end) as 'n_active_companies',
sum(case when active then 0 else 1 end) as 'n_inactive_companies'
from COMPANY c
join PARTNER_MANAGER pm on c.partner_manager_id = pm.id
join PARTNER p on pm.partner_id = p.id
group by p.name

We Keep Coding
