SQL Joins for Multiple Fields with Null Values


I have a table of maintenance requirements and the associated frequency (in months) at which each is to be performed.
maint
+----------+------+
| maint_id | freq |
+----------+------+
| 1 | 6 |
| 2 | 12 |
| 3 | 24 |
| 4 | 3 |
+----------+------+
I also have a table of equipment with data on its manufacturer, model, device type and building.
equip
+----------+--------+--------+--------+---------+
| equip_id | mfg_id | mod_id | dev_id | bldg_id |
+----------+--------+--------+--------+---------+
| 1 | 1 | 1 | 3 | 1 |
| 2 | 1 | 2 | 3 | 1 |
| 3 | 2 | 3 | 1 | 2 |
| 4 | 2 | 3 | 1 | 3 |
+----------+--------+--------+--------+---------+
I am trying to match each maintenance requirement with its associated equipment. Each requirement applies to a specific manufacturer, model, device, facility or any combination of these in its scope of application.
I have created a table to manage these relationships like this:
maint_equip
+----------------+----------+--------+--------+--------+---------+
| maint_equip_id | maint_id | mfg_id | mod_id | dev_id | bldg_id |
+----------------+----------+--------+--------+--------+---------+
| 1 | 1 | NULL | NULL | 1 | NULL |
| 2 | 2 | 2 | NULL | NULL | 2 |
| 3 | 3 | NULL | NULL | NULL | 1 |
| 4 | 3 | NULL | NULL | NULL | 3 |
| 5 | 4 | 1 | NULL | 3 | 1 |
+----------------+----------+--------+--------+--------+---------+
As per the table above, requirement 1 would only apply to any equipment having device type "1."
Requirement 2 would apply to all equipment having both manufacturer "2" AND building "2."
Requirement 3 would apply to all equipment having building "1" OR building "3"
Requirement 4 would apply to equipment having all of mfg_id "1" AND dev_id "3" AND building "1."
I am trying to write a query to give me a list of all equipment ids and all the associated frequency requirements based on the relationships defined in maint_equip. The problem I'm running into is handling the multiple joins. I have already tried:
SELECT equip.equip_id, maint.freq
FROM equip INNER JOIN
maint_equip ON equip.mfg_id = maint_equip.mfg_id
OR equip.mod_id = maint_equip.mod_id
OR equip.dev_id = maint_equip.dev_id
OR equip.bldg_id = maint_equip.bldg_id INNER JOIN
maint ON maint_equip.maint_id = maint.maint_id
but combining the join conditions with OR means the AND contingencies within each row are not accounted for. For example, maint_id 2 should only apply to equip_id 3, but ids 3 and 4 are both returned. If AND is used instead, no rows are returned, because no maint_equip row has a value in all columns.
Is it possible to join the tables in such a way to accomplish this or is there another way to structure the data?

If I get this right, when an equipment related ID in maint_equip is null, that should count as a match. Only if it isn't null, it must match the respective ID in equip. That is, you want to check if an ID in maint_equip is null or equal to its counterpart from equip.
SELECT e.equip_id,
m.freq
FROM equip e
INNER JOIN maint_equip me
ON (me.mfg_id IS NULL
OR me.mfg_id = e.mfg_id)
AND (me.mod_id IS NULL
OR me.mod_id = e.mod_id)
AND (me.dev_id IS NULL
OR me.dev_id = e.dev_id)
AND (me.bldg_id IS NULL
OR me.bldg_id = e.bldg_id)
INNER JOIN maint m
ON m.maint_id = me.maint_id;
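To see the "NULL means any" join at work on the sample data from the question, here is a minimal sketch that loads the three tables into an in-memory SQLite database through Python's sqlite3 module. SQLite is only an assumption made so the example is runnable; the question does not say which DBMS is in use. The maint_id column is added to the output purely for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE maint (maint_id INTEGER PRIMARY KEY, freq INTEGER);
CREATE TABLE equip (equip_id INTEGER PRIMARY KEY, mfg_id INTEGER,
                    mod_id INTEGER, dev_id INTEGER, bldg_id INTEGER);
CREATE TABLE maint_equip (maint_equip_id INTEGER PRIMARY KEY, maint_id INTEGER,
                          mfg_id INTEGER, mod_id INTEGER,
                          dev_id INTEGER, bldg_id INTEGER);
INSERT INTO maint VALUES (1, 6), (2, 12), (3, 24), (4, 3);
INSERT INTO equip VALUES (1, 1, 1, 3, 1), (2, 1, 2, 3, 1),
                         (3, 2, 3, 1, 2), (4, 2, 3, 1, 3);
INSERT INTO maint_equip VALUES
  (1, 1, NULL, NULL, 1,    NULL),
  (2, 2, 2,    NULL, NULL, 2),
  (3, 3, NULL, NULL, NULL, 1),
  (4, 3, NULL, NULL, NULL, 3),
  (5, 4, 1,    NULL, 3,    1);
""")

# A NULL in maint_equip acts as a wildcard; a non-NULL value must match equip.
rows = conn.execute("""
SELECT e.equip_id, m.maint_id, m.freq
FROM equip e
INNER JOIN maint_equip me
   ON (me.mfg_id  IS NULL OR me.mfg_id  = e.mfg_id)
  AND (me.mod_id  IS NULL OR me.mod_id  = e.mod_id)
  AND (me.dev_id  IS NULL OR me.dev_id  = e.dev_id)
  AND (me.bldg_id IS NULL OR me.bldg_id = e.bldg_id)
INNER JOIN maint m ON m.maint_id = me.maint_id
ORDER BY e.equip_id, m.maint_id
""").fetchall()

for row in rows:
    print(row)  # (equip_id, maint_id, freq)
```

With this data, requirement 2 is paired with equip_id 3 only, which is the behaviour the question asked for.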

Try this:
( equip.mfg_id = maint_equip.mfg_id OR maint_equip.mfg_id is null )
AND( equip.mod_id = maint_equip.mod_id OR maint_equip.mod_id is null )
AND( equip.dev_id = maint_equip.dev_id OR maint_equip.dev_id is null )
AND( equip.bldg_id = maint_equip.bldg_id OR maint_equip.bldg_id is null )

Note that mod_id is always NULL in your sample data, so the query below leaves it out. Apart from that, it covers every NULL/non-NULL combination that appears in your maint_equip rows.
SELECT maint_equip.maint_id, equip.equip_id, maint.freq
FROM equip INNER JOIN
maint_equip ON (
(equip.mfg_id = maint_equip.mfg_id AND
equip.dev_id = maint_equip.dev_id AND
equip.bldg_id = maint_equip.bldg_id
) OR
(equip.mfg_id = maint_equip.mfg_id AND
maint_equip.dev_id is NULL AND
equip.bldg_id = maint_equip.bldg_id
) OR
(maint_equip.mfg_id is NULL AND
equip.dev_id = maint_equip.dev_id AND
maint_equip.bldg_id is NULL
) OR
(maint_equip.mfg_id is NULL AND
maint_equip.dev_id is NULL AND
equip.bldg_id = maint_equip.bldg_id
) )
INNER JOIN
maint ON maint_equip.maint_id = maint.maint_id
;

It seems to me that what you're actually looking for is the maintenance schedule that has the highest number of matches. You can get that by using a SUM with a series of CASE expressions to get the count of matching columns.
Then you have to account for ties where there are multiple maint_id values that match an equal number of times. For the example below, I opted to use maintenance frequency as the tie breaker, favoring more frequent maintenance over less frequent maintenance.
Rextester link with data set up: https://rextester.com/VISR88105
The ROW_NUMBER in the ORDER BY clause sorts the results by number of column matches (the nutty SUM/CASE combo) in descending order to get the most matches first, and then by maintenance frequency in ascending order to favor more frequent maintenance. Easy to reverse that with a DESC if you like. Then the TOP (1) WITH TIES limits the result set to all of the rows where ROW_NUMBER evaluates to 1.
The code:
SELECT TOP (1) WITH TIES
e.equip_id,
m.maint_id,
m.freq
FROM
#maint as m
JOIN
#maint_equip as me
ON
m.maint_id = me.maint_id
JOIN
#equip as e
ON
e.mfg_id = COALESCE(me.mfg_id, e.mfg_id)
AND
e.mod_id = COALESCE(me.mod_id, e.mod_id)
AND
e.dev_id = COALESCE(me.dev_id, e.dev_id)
AND
e.bldg_id = COALESCE(me.bldg_id, e.bldg_id)
GROUP BY
e.equip_id,
m.maint_id,
m.freq
ORDER BY
ROW_NUMBER() OVER (PARTITION BY e.equip_id ORDER BY (
SUM(
(CASE WHEN e.mfg_id = me.mfg_id THEN 1 ELSE 0 END) +
(CASE WHEN e.mod_id = me.mod_id THEN 1 ELSE 0 END) +
(CASE WHEN e.dev_id = me.dev_id THEN 1 ELSE 0 END) +
(CASE WHEN e.bldg_id = me.bldg_id THEN 1 ELSE 0 END)) ) DESC, m.freq )
Results:
+----------+----------+------+
| equip_id | maint_id | freq |
+----------+----------+------+
| 1 | 4 | 3 |
| 2 | 4 | 3 |
| 3 | 2 | 12 |
| 4 | 1 | 6 |
+----------+----------+------+
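For engines without TOP (1) WITH TIES, the same "best match per equipment" idea can be sketched by computing ROW_NUMBER in a subquery and filtering on it. The version below is an assumption-laden illustration: it runs on an in-memory SQLite database (window functions need SQLite 3.25 or later) via Python's sqlite3 module, uses plain tables instead of the #temp tables above, and the alias names ranked and rn are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE maint (maint_id INTEGER PRIMARY KEY, freq INTEGER);
CREATE TABLE equip (equip_id INTEGER PRIMARY KEY, mfg_id INTEGER,
                    mod_id INTEGER, dev_id INTEGER, bldg_id INTEGER);
CREATE TABLE maint_equip (maint_equip_id INTEGER PRIMARY KEY, maint_id INTEGER,
                          mfg_id INTEGER, mod_id INTEGER,
                          dev_id INTEGER, bldg_id INTEGER);
INSERT INTO maint VALUES (1, 6), (2, 12), (3, 24), (4, 3);
INSERT INTO equip VALUES (1, 1, 1, 3, 1), (2, 1, 2, 3, 1),
                         (3, 2, 3, 1, 2), (4, 2, 3, 1, 3);
INSERT INTO maint_equip VALUES
  (1, 1, NULL, NULL, 1,    NULL),
  (2, 2, 2,    NULL, NULL, 2),
  (3, 3, NULL, NULL, NULL, 1),
  (4, 3, NULL, NULL, NULL, 3),
  (5, 4, 1,    NULL, 3,    1);
""")

# Rank each candidate requirement per equipment: most matching columns first,
# then the smaller freq value (more frequent maintenance) as the tie breaker.
best = conn.execute("""
SELECT equip_id, maint_id, freq
FROM (
    SELECT e.equip_id, m.maint_id, m.freq,
           ROW_NUMBER() OVER (
               PARTITION BY e.equip_id
               ORDER BY (CASE WHEN e.mfg_id  = me.mfg_id  THEN 1 ELSE 0 END)
                      + (CASE WHEN e.mod_id  = me.mod_id  THEN 1 ELSE 0 END)
                      + (CASE WHEN e.dev_id  = me.dev_id  THEN 1 ELSE 0 END)
                      + (CASE WHEN e.bldg_id = me.bldg_id THEN 1 ELSE 0 END) DESC,
                        m.freq
           ) AS rn
    FROM maint m
    JOIN maint_equip me ON m.maint_id = me.maint_id
    JOIN equip e
      ON e.mfg_id  = COALESCE(me.mfg_id,  e.mfg_id)
     AND e.mod_id  = COALESCE(me.mod_id,  e.mod_id)
     AND e.dev_id  = COALESCE(me.dev_id,  e.dev_id)
     AND e.bldg_id = COALESCE(me.bldg_id, e.bldg_id)
) ranked
WHERE rn = 1
ORDER BY equip_id
""").fetchall()

print(best)
```

Note that the COALESCE form of the join only treats NULLs in maint_equip as wildcards; it would drop equipment rows whose own ID columns are NULL, which does not occur in the sample data.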

Related

How to avoid duplicate data in the subquery

I have two tables as below.
Product table:
+-----+------------+-----+-------+--------+
| id | activityId | age | queue | status |
+-----+------------+-----+-------+--------+
| 100 | 2 | 0 | start | 2 |
| 101 | 3 | 0 | in | 5 |
+-----+------------+-----+-------+--------+
Department table:
+-----+------------+-------+----------+
| id | activityId | queue | exittime |
+-----+------------+-------+----------+
| 100 | 1 | new | null |
| 100 | 2 | start | null |
| 100 | 2 | start | null |
| 101 | 1 | new | null |
| 101 | 1 | new | null |
| 101 | 3 | in | null |
| 101 | 3 | in | null |
+-----+------------+-------+----------+
I am trying to update the age column of the Product table with the query below, but it returns the error ORA-01427: single-row subquery returns more than one row.
update Product pd set pd.age = (select (case when dp.exittime!= null then
(sysdate - dp.exittime)
else ( case when pd.queue = dp.queue
then (select (sysdate - dp1.entrytime) from department dp1 where pd.id = dp1.id
) else 2 END) END)
from department dp
where dp.id > 1
AND pd.id = dp.id
AND pd.status in('1','7','2','5')
AND pd.queue= dp.queue
AND pd.activityId = dp.activityId )
where exists
(select 1 from department dp
where dp.id > 1
AND pd.id = dp.id
AND pd.status in('1','7','2','5')
AND pd.queue= dp.queue
AND pd.activityId = dp.activityId );
The subquery returns multiple rows because of the duplicate activityId values in the department table. How can I avoid the subquery returning multiple rows?

This query will identify the scenarios under which you get multiple rows.
select
dp.id,
dp.queue,
dp.activityId,
COUNT(*)
from
department dp
inner join
product pd
ON pd.id = dp.id
AND pd.queue= dp.queue
AND pd.activityId = dp.activityId
where
dp.id > 1
AND pd.status in('1','7','2','5')
GROUP BY
dp.id,
dp.queue,
dp.activityId
HAVING
COUNT(*) > 1
For those cases you need to determine one of the following...
How to fix the data to return only one row
How to fix the query to return only one row
How to pick just one row from the multiple rows returned
As we can't see your data, we can't fix any of that for you.
After investigating, however, you may be able to return with a more specific question.
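As an illustration of what the diagnostic query reports, the sketch below runs it against the two sample tables from the question. It assumes an in-memory SQLite database driven from Python's sqlite3 module rather than Oracle, and it stores status as an integer, so the IN list uses numbers instead of the quoted strings in the original.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER, activityId INTEGER, age INTEGER,
                      queue TEXT, status INTEGER);
CREATE TABLE department (id INTEGER, activityId INTEGER,
                         queue TEXT, exittime TEXT);
INSERT INTO product VALUES (100, 2, 0, 'start', 2), (101, 3, 0, 'in', 5);
INSERT INTO department VALUES
  (100, 1, 'new',   NULL),
  (100, 2, 'start', NULL),
  (100, 2, 'start', NULL),
  (101, 1, 'new',   NULL),
  (101, 1, 'new',   NULL),
  (101, 3, 'in',    NULL),
  (101, 3, 'in',    NULL);
""")

# Every group listed here would make the correlated subquery in the UPDATE
# return more than one row for the matching product.
duplicates = conn.execute("""
SELECT dp.id, dp.queue, dp.activityId, COUNT(*) AS matches
FROM department dp
INNER JOIN product pd
   ON pd.id = dp.id
  AND pd.queue = dp.queue
  AND pd.activityId = dp.activityId
WHERE dp.id > 1
  AND pd.status IN (1, 7, 2, 5)
GROUP BY dp.id, dp.queue, dp.activityId
HAVING COUNT(*) > 1
ORDER BY dp.id
""").fetchall()

print(duplicates)
```

Both products match two identical department rows, which is why the single-row subquery fails for each of them.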

Efficient ROW_NUMBER increment when column matches value

I'm trying to find an efficient way to derive the column Expected below from only Id and State. What I want is for the number Expected to increase each time State is 0 (ordered by Id).
+----+-------+----------+
| Id | State | Expected |
+----+-------+----------+
| 1 | 0 | 1 |
| 2 | 1 | 1 |
| 3 | 0 | 2 |
| 4 | 1 | 2 |
| 5 | 4 | 2 |
| 6 | 2 | 2 |
| 7 | 3 | 2 |
| 8 | 0 | 3 |
| 9 | 5 | 3 |
| 10 | 3 | 3 |
| 11 | 1 | 3 |
+----+-------+----------+
I have managed to accomplish this with the following SQL, but the execution time is very poor when the data set is large:
WITH Groups AS
(
SELECT Id, ROW_NUMBER() OVER (ORDER BY Id) AS GroupId FROM tblState WHERE State=0
)
SELECT S.Id, S.[State], S.Expected, G.GroupId FROM tblState S
OUTER APPLY (SELECT TOP 1 GroupId FROM Groups WHERE Groups.Id <= S.Id ORDER BY Id DESC) G
Is there a simpler and more efficient way to produce this result? (In SQL Server 2012 or later)

Just use a cumulative sum:
select s.*,
sum(case when state = 0 then 1 else 0 end) over (order by id) as expected
from tblState s;
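The cumulative sum can be checked against the Expected column from the question. This is a minimal sketch that assumes an in-memory SQLite database (3.25 or later, for window functions) accessed through Python's sqlite3 module instead of SQL Server; the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblState (Id INTEGER PRIMARY KEY, State INTEGER)")

# State values for Id 1..11, copied from the table in the question.
states = [0, 1, 0, 1, 4, 2, 3, 0, 5, 3, 1]
conn.executemany(
    "INSERT INTO tblState (Id, State) VALUES (?, ?)",
    list(enumerate(states, start=1)),
)

# Running count of rows with State = 0, in Id order.
rows = conn.execute("""
SELECT Id, State,
       SUM(CASE WHEN State = 0 THEN 1 ELSE 0 END) OVER (ORDER BY Id) AS Expected
FROM tblState
ORDER BY Id
""").fetchall()

expected = [row[2] for row in rows]
print(expected)
```

Because the window has an ORDER BY and no explicit frame, the sum covers all rows up to and including the current one, so a row with State = 0 already counts toward its own group number.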

Another method uses a correlated subquery:
select t.*,
(select count(*)
from tblState t1
where t1.id <= t.id and t1.state = 0
) as expected
from tblState t;

Finding nth row using sql

select top 20 *
from dbo.DUTs D
inner join dbo.Statuses S on d.StatusID = s.StatusID
where s.Description = 'Active'
The SQL query above returns the top 20 rows. How can I get the nth row from its result? I looked at previous posts on finding the nth row, but it was not clear how to apply them to my case.
Thanks.

The row order is arbitrary, so I would add an ORDER BY expression. Then, you can do something like this:
SELECT TOP 1 * FROM (SELECT TOP 20 * FROM ... ORDER BY d.StatusID) AS d ORDER BY d.StatusID DESC
to get the 20th row.
You can also use OFFSET like:
SELECT * FROM ... ORDER BY d.StatusID OFFSET 19 ROWS FETCH NEXT 1 ROWS ONLY
And a third option:
SELECT * FROM (SELECT *, rownum = ROW_NUMBER() OVER (ORDER BY d.StatusID) FROM ...) AS a WHERE rownum = 20
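The ROW_NUMBER and OFFSET approaches can be compared side by side in a small runnable sketch. Several things here are assumptions made for illustration: the engine is an in-memory SQLite database used through Python's sqlite3 module, so OFFSET is written in SQLite's LIMIT ... OFFSET form rather than SQL Server's OFFSET ... FETCH; the DutID column and the 'Retired' status are hypothetical; and the rows are ordered by DutID so that the ordering is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Statuses (StatusID INTEGER PRIMARY KEY, Description TEXT);
CREATE TABLE DUTs (DutID INTEGER PRIMARY KEY, StatusID INTEGER);
INSERT INTO Statuses VALUES (1, 'Active'), (2, 'Retired');
""")
# 50 hypothetical devices: odd DutIDs are Active, even ones are Retired.
conn.executemany(
    "INSERT INTO DUTs (DutID, StatusID) VALUES (?, ?)",
    [(i, 1 if i % 2 else 2) for i in range(1, 51)],
)

n = 20  # which row of the ordered result to fetch

# Option 1: number the rows, then filter on the row number.
via_row_number = conn.execute("""
SELECT DutID FROM (
    SELECT d.DutID, ROW_NUMBER() OVER (ORDER BY d.DutID) AS rn
    FROM DUTs d
    INNER JOIN Statuses s ON d.StatusID = s.StatusID
    WHERE s.Description = 'Active'
) numbered
WHERE rn = ?
""", (n,)).fetchone()

# Option 2: skip n - 1 rows and take the next one.
via_offset = conn.execute("""
SELECT d.DutID
FROM DUTs d
INNER JOIN Statuses s ON d.StatusID = s.StatusID
WHERE s.Description = 'Active'
ORDER BY d.DutID
LIMIT 1 OFFSET ?
""", (n - 1,)).fetchone()

print(via_row_number, via_offset)
```

Both queries agree only because the ORDER BY column is unique; ordering by a column with ties, such as StatusID, leaves the nth row undefined.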

I tend to use CTEs with the ROW_NUMBER() function to get my lists numbered in order. As @zambonee said, you'll need an ORDER BY clause either way or SQL Server can return them in a different order every time. It doesn't usually, but without ordering it yourself, you're not guaranteed to get the same thing twice. Here I'm assuming there's a [DateCreated] field (DATETIME NOT NULL DEFAULT GETDATE()), which is usually a good idea so you know when that record was entered. This says "give me everything in that table and add a row number with the most recent record as #1":
; WITH AllDUTs
AS (
SELECT D.* -- only D's columns: a CTE cannot expose StatusID twice
, DateCreatedRank = ROW_NUMBER() OVER(ORDER BY [DateCreated] DESC)
FROM dbo.DUTs D
INNER JOIN dbo.Statuses S ON D.StatusID = S.StatusID
WHERE S.Description = 'Active'
)
SELECT *
FROM AllDUTs
WHERE AllDUTs.DateCreatedRank = 20;

SELECT * FROM (SELECT * FROM EMP ORDER BY ROWID DESC) WHERE ROWNUM<11

It's another sample:
SELECT * ,CASE WHEN COUNT(0)OVER() =ROW_NUMBER()OVER(ORDER BY number) THEN 1 ELSE 0 END IsNth
FROM (
select top 10 *
from master.dbo.spt_values AS d
where d.type='P'
) AS t
+------+--------+------+-----+------+--------+-------+
| name | number | type | low | high | status | IsNth |
+------+--------+------+-----+------+--------+-------+
| NULL | 0 | P | 1 | 1 | 0 | 0 |
| NULL | 1 | P | 1 | 2 | 0 | 0 |
| NULL | 2 | P | 1 | 4 | 0 | 0 |
| NULL | 3 | P | 1 | 8 | 0 | 0 |
| NULL | 4 | P | 1 | 16 | 0 | 0 |
| NULL | 5 | P | 1 | 32 | 0 | 0 |
| NULL | 6 | P | 1 | 64 | 0 | 0 |
| NULL | 7 | P | 1 | 128 | 0 | 0 |
| NULL | 8 | P | 2 | 1 | 0 | 0 |
| NULL | 9 | P | 2 | 2 | 0 | 1 |
+------+--------+------+-----+------+--------+-------+

get the value from the previous row if row is NULL

I have this pivoted table
+---------+----------+----------+-----+----------+
| Date | Product1 | Product2 | ... | ProductN |
+---------+----------+----------+-----+----------+
| 7/1/15 | 5 | 2 | ... | 7 |
| 8/1/15 | 7 | 1 | ... | 9 |
| 9/1/15 | NULL | 7 | ... | NULL |
| 10/1/15 | 8 | NULL | ... | NULL |
| 11/1/15 | NULL | NULL | ... | NULL |
+---------+----------+----------+-----+----------+
I want to fill each NULL cell with the nearest non-NULL value above it, so the output should look something like this.
+---------+----------+----------+-----+----------+
| Date | Product1 | Product2 | ... | ProductN |
+---------+----------+----------+-----+----------+
| 7/1/15 | 5 | 2 | ... | 7 |
| 8/1/15 | 7 | 1 | ... | 9 |
| 9/1/15 | 7 | 7 | ... | 9 |
| 10/1/15 | 8 | 7 | ... | 9 |
| 11/1/15 | 8 | 7 | ... | 9 |
+---------+----------+----------+-----+----------+
I've found this article that might help me, but it only manipulates one column. How do I apply this to all my columns, or how else can I achieve this result, given that my columns are dynamic?
Any help would be much appreciated. Thanks!

The ANSI standard has the IGNORE NULLS option on LAG(). This is exactly what you want. Alas, SQL Server has not (yet?) implemented this feature.
So, you can do this in several ways. One is using multiple outer applys. Another uses correlated subqueries:
select p.date,
(case when p.product1 is not null then p.product1
else (select top 1 p2.product1 from pivoted p2 where p2.date < p.date and p2.product1 is not null order by p2.date desc)
end) as product1,
(case when p.product2 is not null then p.product2
else (select top 1 p2.product2 from pivoted p2 where p2.date < p.date and p2.product2 is not null order by p2.date desc)
end) as product2,
. . .
from pivoted p ;
I would recommend an index on date for this query.
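The correlated-subquery approach can be tried out on a two-product version of the pivoted table. This is a sketch under several assumptions: it uses an in-memory SQLite database through Python's sqlite3 module, so TOP 1 becomes ORDER BY ... LIMIT 1; the CASE expression is shortened to COALESCE; and the dates are stored as ISO strings (reading 7/1/15 as 1 July 2015) so that text comparison sorts them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pivoted (date TEXT PRIMARY KEY, product1 INTEGER, product2 INTEGER)"
)
conn.executemany(
    "INSERT INTO pivoted VALUES (?, ?, ?)",
    [
        ("2015-07-01", 5, 2),
        ("2015-08-01", 7, 1),
        ("2015-09-01", None, 7),
        ("2015-10-01", 8, None),
        ("2015-11-01", None, None),
    ],
)

# For each NULL cell, look up the most recent earlier row whose value in the
# same column is not NULL.
filled = conn.execute("""
SELECT p.date,
       COALESCE(p.product1,
                (SELECT p2.product1 FROM pivoted p2
                 WHERE p2.date < p.date AND p2.product1 IS NOT NULL
                 ORDER BY p2.date DESC LIMIT 1)) AS product1,
       COALESCE(p.product2,
                (SELECT p2.product2 FROM pivoted p2
                 WHERE p2.date < p.date AND p2.product2 IS NOT NULL
                 ORDER BY p2.date DESC LIMIT 1)) AS product2
FROM pivoted p
ORDER BY p.date
""").fetchall()

for row in filled:
    print(row)
```

The IS NOT NULL filter inside each subquery matters: without it, a run of consecutive NULLs would pick up the NULL from the row directly above instead of the last real value.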

I would like to suggest you a solution. If you have a table which consists of merely two columns my solution will work perfectly.
+---------+----------+
| Date | Product |
+---------+----------+
| 7/1/15 | 5 |
| 8/1/15 | 7 |
| 9/1/15 | NULL |
| 10/1/15 | 8 |
| 11/1/15 | NULL |
+---------+----------+
select x.[Date],
case
when x.[Product] is null
then min(c.[Product])
else
x.[Product]
end as Product
from
(
-- this subquery evaluates a minimum distance to the rows where Product column contains a value
select [Date],
[Product],
min(case when delta >= 0 then delta else null end) delta_min,
max(case when delta < 0 then delta else null end) delta_max
from
(
-- this subquery maps Product table to itself and evaluates the difference between the dates
select p.[Date],
p.[Product],
DATEDIFF(dd, p.[Date], pnn.[Date]) delta
from #products p
cross join (select * from #products where [Product] is not null) pnn
) x
group by [Date], [Product]
) x
left join #products c on x.[Date] =
case
when abs(delta_min) < abs(delta_max) then DATEADD(dd, -delta_min, c.[Date])
else DATEADD(dd, -delta_max, c.[Date])
end
group by x.[Date], x.[Product]
order by x.[Date]
In this query I joined the table to its own rows that contain values, using a CROSS JOIN. Then I calculated the differences between dates in order to pick the closest ones, and filled the empty cells with those values.
Result:
+---------+----------+
| Date | Product |
+---------+----------+
| 7/1/15 | 5 |
| 8/1/15 | 7 |
| 9/1/15 | 7 |
| 10/1/15 | 8 |
| 11/1/15 | 8 |
+---------+----------+
Actually, the suggested query doesn't choose the previous value. Instead of this, it selects the closest value. In other words, my code can be used for a number of different purposes.

First, you need to add an identity column to a temporary or permanent table; the gaps can then be filled with the following method.
--- Solution ----
Create Table #Test (ID Int Identity (1,1),[Date] Date , Product_1 INT )
Insert Into #Test ([Date], Product_1)
Values
('7/1/15',5)
,('8/1/15',7)
,('9/1/15',Null)
,('10/1/15',8)
,('11/1/15',Null)
Select ID , DATE ,
IIF ( Product_1 is null ,
(Select Product_1 from #TEST
Where ID = (Select Top 1 a.ID From #TEST a where a.Product_1 is not null and a.ID<b.ID
Order By a.ID desc)
),Product_1) Product_1
from #Test b
-- Solution End ---

SQL query handling three tables

I want to compile the data for reporting purpose, using php communicating with postgres,
I have three tables
product_status:
status_code | value
------------|------
1 | a
2 | b
product_child:
code| ID | status_code
----|----|------------
X | 1 | 1
X | 2 | 1
X | 3 | 2
Y | 1 | 2
Y | 2 | 2
Z | 1 | 1
product_master:
code|description (and some other columns not relevant)
X | la
Y | alb
Z | lab
In the end I want basically a table like this which i'll display
| total child | status a | status b
bla | 3 | 2 | 1
alb | 2 | 0 | 2
lab | 1 | 1 | 0
I have tried
SELECT s.value, count(s.value)
FROM product_child p, product_status s
WHERE
p.product_status = s.status_code
and p.product_code = get_product_code('Sofa (1-seater) J805')
group by
s.value
It gives me the grouping for a particular code, but I want this flipped (one column per status) and shown next to each distinct product code.

do you mean like this?
select pm.description,
count(pc.code) total,
count(case when ps.value = 'a' then ps.value else null end) statusA,
count(case when ps.value = 'b' then ps.value else null end) statusB
from product_master pm join product_child pc on pm.code = pc.code
join product_status ps on pc.status_code = ps.status_code
group by pm.description
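The conditional-aggregation pattern can be verified against the sample rows from the question. The sketch below is an illustration only: it assumes an in-memory SQLite database via Python's sqlite3 module rather than PostgreSQL, joins product_status through product_child.status_code (the table that holds that column in the question), and groups by code as well as description so the rows come back in code order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_status (status_code INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE product_child (code TEXT, id INTEGER, status_code INTEGER);
CREATE TABLE product_master (code TEXT PRIMARY KEY, description TEXT);
INSERT INTO product_status VALUES (1, 'a'), (2, 'b');
INSERT INTO product_child VALUES
  ('X', 1, 1), ('X', 2, 1), ('X', 3, 2),
  ('Y', 1, 2), ('Y', 2, 2),
  ('Z', 1, 1);
INSERT INTO product_master VALUES ('X', 'la'), ('Y', 'alb'), ('Z', 'lab');
""")

# COUNT ignores NULLs, so each CASE counts only the children with that status.
report = conn.execute("""
SELECT pm.description,
       COUNT(pc.code) AS total,
       COUNT(CASE WHEN ps.value = 'a' THEN 1 END) AS status_a,
       COUNT(CASE WHEN ps.value = 'b' THEN 1 END) AS status_b
FROM product_master pm
JOIN product_child pc ON pm.code = pc.code
JOIN product_status ps ON pc.status_code = ps.status_code
GROUP BY pm.code, pm.description
ORDER BY pm.code
""").fetchall()

for row in report:
    print(row)  # (description, total, status_a, status_b)
```

A product with no children at all would be missing from this report because of the inner joins; LEFT JOINs would be needed to list it with zero counts.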
