Update table according to matches found in second table - sql

Having a heavy brain freeze here, all help is welcome.
Background info: working on a search function that should return the number of occurrences of one or multiple (user provided) keywords based on matches in another table.
The first table will actually be a temporary table inside a table function but for the sake of simplicity, let's make it a regular table.
CREATE TABLE search_query (
keyword VARCHAR(30),
occurrences INT
)
That "table" will contain the (sanitized) keywords the user provides. For example:
keyword occurrences
HOW 0
TO 0
CODE 0
THIS 0
"occurrences" is 0 by default.
My second table contains a record for each significant keyword a product can have.
CREATE TABLE product_keywords (
productId INT,
keyword VARCHAR(30)
)
Sample data:
productId keyword
12 HOW
12 NOT
12 CODE
13 RANDOM
13 THIS
13 CODE
What I'm trying to do is build a query that will UPDATE the "occurrences" field in search_query based on keyword matches with product_keywords.
In this example, after the query runs, search_query should contain
HOW 1
TO 0
CODE 2
THIS 1
My efforts so far have stalled because, for the life of me, I can't figure out how to build the JOIN necessary for the update. I'm about here:
UPDATE destination_table
SET destination_table.occurrences = JN.matches
FROM search_query AS destination_table
INNER JOIN [where I freeze] AS JN
WHERE ....
Any help appreciated! Thanks, Chris

You need a subquery to return the matches and join it to the table:
UPDATE s
SET s.occurrences = g.counter
FROM search_query s INNER JOIN (
SELECT s.keyword, COUNT(*) counter
FROM search_query s INNER JOIN product_keywords p
ON p.keyword = s.keyword
GROUP BY s.keyword
) g ON g.keyword = s.keyword
See the demo.
Or for readability you could use a CTE:
WITH cte AS (
SELECT s.keyword, COUNT(*) counter
FROM search_query s INNER JOIN product_keywords p
ON p.keyword = s.keyword
GROUP BY s.keyword
)
UPDATE s
SET s.occurrences = c.counter
FROM search_query s INNER JOIN cte c
ON c.keyword = s.keyword
See the demo.
Results:
> keyword | occurrences
> :------ | ----------:
> HOW | 1
> TO | 0
> CODE | 2
> THIS | 1

A correlated subquery with UPDATE might be the simplest and fastest approach:
update search_query
set occurrences = (select count(*)
from product_keywords pk
where pk.keyword = search_query.keyword
);
For performance, you want an index on product_keywords(keyword).
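As a rough check of the correlated-subquery update, here is a minimal sketch that runs it against the question's sample data. It assumes an in-memory SQLite database (via Python's sqlite3 module) as a stand-in for SQL Server, and the index name is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE search_query (keyword VARCHAR(30), occurrences INT);
    CREATE TABLE product_keywords (productId INT, keyword VARCHAR(30));
    -- Hypothetical index name; the indexed column is what matters.
    CREATE INDEX ix_product_keywords_keyword ON product_keywords (keyword);
    INSERT INTO search_query VALUES ('HOW', 0), ('TO', 0), ('CODE', 0), ('THIS', 0);
    INSERT INTO product_keywords VALUES
        (12, 'HOW'), (12, 'NOT'), (12, 'CODE'),
        (13, 'RANDOM'), (13, 'THIS'), (13, 'CODE');
""")

# Correlated subquery: for each row being updated, count the matching
# rows in product_keywords.
conn.execute("""
    UPDATE search_query
    SET occurrences = (SELECT COUNT(*)
                       FROM product_keywords pk
                       WHERE pk.keyword = search_query.keyword)
""")

# Collect the result as {keyword: occurrences}.
occurrences = dict(conn.execute("SELECT keyword, occurrences FROM search_query"))
print(occurrences)
```

One difference from the join-based answers: the correlated subquery writes 0 for keywords without a match, whereas an INNER JOIN update leaves those rows untouched, so it relies on the default already being 0.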

Related

Don't select rows where column A is duplicated AND any row of column B is a specific value

I'm working on generating a report that merges multiple tables. The report must only show projects that did not have any document marked 'Not Received'. These document markings are stored in a table that lists each document on an individual line, so when it is merged into my other table it creates multiple rows for the same project. For example, the following table:
Project Number | ChecklistValue
565 | Received
565 | Not Received
465 | Received
465 | Not Applicable
As you can see really only two projects are listed on this table but the desired output is:
Project Number | Other Info
465 | etc
I do not need the checklist value on the actual report, so I can use GROUP BY to combine all the good rows. The issue is that this would still include project 565, even if I add something like WHERE ChecklistValue <> 'Not Received'. Project 565 needs to be hidden from the report entirely, because at least one of its rows contains 'Not Received'.
So that's my actual question: how do I exclude all rows for any project number that has a row containing 'Not Received'?
I'm adding the entire query, with generalized names, below:
SELECT
Project Number
,Name
,Contractor
,ABS(DATEDIFF(day,(ActualDate),(EstDate))) AS DelayPeriod
,S.NoteDate
,S.FinalAppDate
,Status
,S.ONE
,S.TWO
,S.THREE
,S.FOUR
,CH.ChecklistValue
FROM [DB1] A
INNER JOIN [DB2] C ON A.Contractor = C.Contractor
INNER JOIN [DB3] S ON A.AppID = S.AppID
INNER JOIN [DB4] LS ON S.StatusID = LS.StatusID
LEFT OUTER JOIN [DB5] CH ON A.AppID = CH.AppID AND CH.OtherID = 1
WHERE C.TypeID = 4 AND A.YEAR = 2022, AND S.THING = 1 AND
(CH.CheckListValue IS NULL OR A.AppID NOT IN (SELECT * FROM [DB5] WHERE
CheckListValue = 'Not Reveived'))
GROUP BY Project Number,Name,Contractor,ABS(DATEDIFF(day,(ActualDate),(EstDate))) AS DelayPeriod,S.NoteDate,S.FinalAppDate,Status,S.ONE,S.TWO,S.THREE,S.FOUR
The last portion of the WHERE clause was added from a suggestion, but I'm clearly not implementing it correctly, as it errors.
You can use NOT IN like this:
create table test(
num int,
description varchar(20)
);
insert into test(num,description)
values(565,'Received'),
(565,'Not Received'),
(465,'Received'),
(465,'Not Applicable');
select *
from test
where num not in
(
select num -- Only select one column here
from test
where description = 'Not Received'
);
Results:
+-----+---------------+
| num | description |
+-----+---------------+
| 465 | Received |
| 465 | Not Applicable|
+-----+---------------+
db<>fiddle: this was run on SQL Server, but it works on other DBMSs as well.
So in your query you should have (as I understand it):
OR A.AppID NOT IN
(
SELECT AppID -- Not select *
FROM [DB5]
WHERE CheckListValue = 'Not Received' -- spelled 'Not Reveived' in the question
)
Another way to do it is with a CTE, although it looks complicated at first glance:
with x as(
select num
from test
where description = 'Not Received'
)
select t.num, t.description
from test t
left join x
on t.num = x.num
where x.num is null
I first create a CTE on the num column for the rows where the description is 'Not Received'. Then I select everything from the test table and left join it to the CTE, keeping only the rows whose num is not in the CTE by filtering on x.num IS NULL. This returns only the rows for 465.
Which one is better? I don't know: sometimes the join is faster and sometimes NOT IN is. You can find more in this post.
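To compare the two approaches side by side, the following sketch runs both the NOT IN query and the LEFT JOIN ... IS NULL query against the same test table. It assumes an in-memory SQLite database (via Python's sqlite3 module) rather than SQL Server, and the ORDER BY clauses are added only to make the comparison deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (num INT, description VARCHAR(20));
    INSERT INTO test VALUES
        (565, 'Received'), (565, 'Not Received'),
        (465, 'Received'), (465, 'Not Applicable');
""")

# Approach 1: exclude every num that has at least one 'Not Received' row.
not_in_rows = conn.execute("""
    SELECT num, description
    FROM test
    WHERE num NOT IN (SELECT num FROM test WHERE description = 'Not Received')
    ORDER BY description
""").fetchall()

# Approach 2: anti-join. Left join to the offending nums and keep only
# the rows that found no partner.
anti_join_rows = conn.execute("""
    WITH x AS (SELECT num FROM test WHERE description = 'Not Received')
    SELECT t.num, t.description
    FROM test t
    LEFT JOIN x ON t.num = x.num
    WHERE x.num IS NULL
    ORDER BY t.description
""").fetchall()

print(not_in_rows)
print(anti_join_rows)
```

On this data both queries return the two rows for 465 and nothing for 565.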

Join Tables to return 1 or 0 based on multiple conditions

I am working on a project management website and have been asked for a new feature in a review meeting section.
A meeting is held to determine whether to proceed to the next phase, and I need to maintain a list of who attended each phase review meeting. I need to write an SQL query to return all people, with an additional column that states they have already been added before.
There are two tables involved to get my desired result, with the relevant columns listed below:
Name: PersonList
ID | Name | Division
Name: reviewParticipants
ProjectID | PersonID | GateID
The query I am looking for is something that returns all people in PersonList, with an additional "hasAttended" bit that is TRUE if reviewParticipants.ProjectID = 5 AND reviewParticipants.CurrentPhase = 'G0' ELSE FALSE.
PersonName | PersonID | hasAttended
Mr Smith | 1 | 1
Mr Jones | 2 | 0
I am not sure how to structure such a query with multiple conditions in a (left?) join, that would return as a different column name and data type, so I would appreciate if anybody can point me in the right direction?
With the result of this query I am going to add a series of checkboxes, and use this additional bit to mark it checked, or not, for page refreshes.
You can use LEFT JOIN as well:
SELECT DISTINCT p.*
,CASE WHEN rp.personid IS NOT NULL THEN 1 ELSE 0 END AS hasAttended
FROM personlist p
LEFT JOIN reviewParticipants rp ON rp.personid = p.id
AND rp.projectid = 5
AND rp.currentphase = 'G0'
I agree with Gordon Linoff: I would prefer an int or tinyint over a bit value.
You can use exists to see if there is a matching row.
select p.*,
(case when exists (select 1
from reviewParticipants rp
where rp.personid = p.id and
rp.projectid = 5 and
rp.currentphase = 'G0'
)
then 1 else 0 end) as hasAttended
from personlist p;
I see no reason to prefer a bit over an integer, but you can return a bit if you really prefer.
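A minimal sketch of the EXISTS-based flag follows. It assumes an in-memory SQLite database (via Python's sqlite3 module) instead of SQL Server, uses the CurrentPhase column name from the question's condition (the table listing calls it GateID), and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PersonList (ID INT, Name VARCHAR(50), Division VARCHAR(50));
    -- CurrentPhase is assumed from the question's condition.
    CREATE TABLE reviewParticipants (ProjectID INT, PersonID INT, CurrentPhase VARCHAR(10));
    -- Hypothetical sample rows.
    INSERT INTO PersonList VALUES (1, 'Mr Smith', 'A'), (2, 'Mr Jones', 'B');
    INSERT INTO reviewParticipants VALUES
        (5, 1, 'G0'),   -- Smith attended the G0 review of project 5
        (5, 2, 'G1'),   -- Jones attended a different phase
        (7, 2, 'G0');   -- ... and G0 of a different project
""")

def attendance(project_id, phase):
    """Return (Name, ID, hasAttended) for every person, attended or not."""
    return conn.execute("""
        SELECT p.Name, p.ID,
               CASE WHEN EXISTS (SELECT 1
                                 FROM reviewParticipants rp
                                 WHERE rp.PersonID = p.ID
                                   AND rp.ProjectID = ?
                                   AND rp.CurrentPhase = ?)
                    THEN 1 ELSE 0 END AS hasAttended
        FROM PersonList p
        ORDER BY p.ID
    """, (project_id, phase)).fetchall()

rows = attendance(5, "G0")
print(rows)
```

Because EXISTS only tests for a match, each person appears exactly once even if reviewParticipants holds several matching rows, so no DISTINCT is needed.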
This returns only the people who attended, rather than everyone with a flag:
select a.* from PersonList a where
a.ID in (select b.PersonID from reviewParticipants b
where b.ProjectID = 5 and exists (
select 1 from reviewParticipants c where c.CurrentPhase = 'G0' and
c.ProjectID = b.ProjectID and c.PersonID = b.PersonID
)
)

Returning a number when result set is null

Each lot object contains a corresponding list of work orders. These work orders have tasks assigned to them, which are structured by the task set on the lot's parent (the phase). I am trying to get back the LOT_ID and a count of TASK_ID where the TASK_ID exists for the WHERE condition.
The problem is if the TASK_ID is not found, the result set is null and the LOT_ID is not returned at all.
I have uploaded a single row for LOT, PHASE, and WORK_ORDER to the following SQLFiddle. I would have added more data but there is a fun limiter .. err I mean character limiter to the editor.
SQLFiddle
SELECT W.[LOT_ID], COUNT(*) AS NUMBER_TASKS_FOUND
FROM [PHASE] P
JOIN [LOT] L ON L.[PHASE_ID] = P.[PHASE_ID]
JOIN [WORK_ORDER] W ON W.[LOT_ID] = L.[LOT_ID]
WHERE P.[TASK_SET_ID] = 1 AND W.[TASK_ID] = 41
GROUP BY W.[LOT_ID]
The query returns the expected result when the task id is found (46) but no result when the task id is not found (say 41). I'd expect in that case to see something like:
+--------+--------------------+
| LOT_ID | NUMBER_TASKS_FOUND |
+--------+--------------------+
| 500 | 0 |
| 506 | 0 |
+--------+--------------------+
I have a feeling this needs to be wrapped in a sub-query and then joined but I am uncertain what the syntax would be here.
My true objective is to be able to pass a list of TASK_ID and get back any LOT_ID that doesn't match, but for now I am just doing a query per task until I can figure that out.
You want to see all lots with their counts for the task. So either outer join the tasks or cross apply their count or use a subquery in the select clause.
select l.lot_id, count(wo.work_order_id) as number_tasks_found
from lot l
left join work_order wo on wo.lot_id = l.lot_id and wo.task_id = 41
where l.phase_id in (select p.phase_id from phase p where p.task_set_id = 1)
group by l.lot_id
order by l.lot_id;
or
select l.lot_id, w.number_tasks_found
from lot l
cross apply
(
select count(*) as number_tasks_found
from work_order wo
where wo.lot_id = l.lot_id
and wo.task_id = 41
) w
where l.phase_id in (select p.phase_id from phase p where p.task_set_id = 1)
order by l.lot_id;
or
select l.lot_id,
(
select count(*)
from work_order wo
where wo.lot_id = l.lot_id
and wo.task_id = 41
) as number_tasks_found
from lot l
where l.phase_id in (select p.phase_id from phase p where p.task_set_id = 1)
order by l.lot_id;
Another option would be to outer join the count and use COALESCE to turn null into zero in your result.
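That last option could look roughly like the sketch below: aggregate the work orders per lot in a derived table, outer join it to the lots, and let COALESCE turn the missing count into 0. It assumes an in-memory SQLite database (via Python's sqlite3 module) in place of SQL Server, and the sample rows are hypothetical since the fiddle data is not shown here.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phase (phase_id INT, task_set_id INT);
    CREATE TABLE lot (lot_id INT, phase_id INT);
    CREATE TABLE work_order (work_order_id INT, lot_id INT, task_id INT);
    -- Hypothetical sample rows.
    INSERT INTO phase VALUES (1, 1);
    INSERT INTO lot VALUES (500, 1), (506, 1);
    INSERT INTO work_order VALUES (1, 500, 46), (2, 500, 46), (3, 506, 46);
""")

def count_tasks(task_id):
    """Return (lot_id, number_tasks_found) for every lot in task set 1."""
    return conn.execute("""
        SELECT l.lot_id, COALESCE(w.number_tasks_found, 0) AS number_tasks_found
        FROM lot l
        LEFT JOIN (SELECT lot_id, COUNT(*) AS number_tasks_found
                   FROM work_order
                   WHERE task_id = ?
                   GROUP BY lot_id) w ON w.lot_id = l.lot_id
        WHERE l.phase_id IN (SELECT p.phase_id FROM phase p WHERE p.task_set_id = 1)
        ORDER BY l.lot_id
    """, (task_id,)).fetchall()

found = count_tasks(46)      # task present on both lots
not_found = count_tasks(41)  # task present on no lot
print(found, not_found)
```

The lots drive the result, so a lot without any matching work order still appears, with a count of 0.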

Is there a simpler way to write this query? [MS SQL Server]

I'm wondering if there is a simpler way to accomplish my goal than what I've come up with.
I am returning a specific attribute that applies to an object. The objects go through multiple iterations and the attributes might change slightly from iteration to iteration. The iteration will only be added to the table if the attribute changes. So the most recent iteration might not be in the table.
Each attribute is uniquely identified by a combination of the Attribute ID (AttribId) and Generation ID (GenId).
Object_Table
ObjectId | AttribId | GenId
32 | 2 | 3
33 | 3 | 1
Attribute_Table
AttribId | GenId | AttribDesc
1 | 1 | Text
2 | 1 | Some Text
2 | 2 | Some Different Text
3 | 1 | Other Text
When I query on a specific object I would like it to return an exact match if possible. For example, Object ID 33 would return "Other Text".
But if there is no exact match, I would like for the most recent generation (largest Gen ID) to be returned. For example, Object ID 32 would return "Some Different Text". Since there is no Attribute ID 2 from Gen 3, it uses the description from the most recent iteration of the Attribute which is Gen ID 2.
This is what I've come up with to accomplish that goal:
SELECT attr.AttribDesc
FROM Attribute_Table AS attr
JOIN Object_Table AS obj
ON attr.AttribId = obj.AttribId
WHERE attr.GenId = (SELECT MIN(GenId)
FROM(SELECT CASE obj2.GenId
WHEN attr2.GenId THEN attr2.GenId
ELSE(SELECT MAX(attr3.GenId)
FROM Attribute_Table AS attr3
JOIN Object_Table AS obj3
ON obj3.AttribId = attr3.AttribId
WHERE obj3.AttribId = 2
)
END AS GenId
FROM Attribute_Table AS attr2
JOIN Object_Table AS obj2
ON attr2.AttribId = obj2.AttribId
WHERE obj2.AttribId = 2
) AS ListOfGens
)
Is there a simpler way to accomplish this? I feel that there should be, but I'm relatively new to SQL and can't think of anything else.
Thanks!
The following query will return the matching value, if found, otherwise use a correlated subquery to return the value with the highest GenId and matching AttribId:
SELECT obj.ObjectId,
CASE WHEN attr1.AttribDesc IS NOT NULL THEN attr1.AttribDesc ELSE attr2.AttribDesc END AS AttribDesc
FROM Object_Table AS obj
LEFT JOIN Attribute_Table AS attr1
ON attr1.AttribId = obj.AttribId AND attr1.GenId = obj.GenId
LEFT JOIN Attribute_Table AS attr2
ON attr2.AttribId = obj.AttribId AND attr2.GenId = (
SELECT max(GenId)
FROM Attribute_Table AS attr3
WHERE attr3.AttribId = obj.AttribId)
In the case where there is no matching record at all with the given AttribId, it will return NULL. If you want to get no record at all in this case, make the second JOIN an INNER JOIN rather than a LEFT JOIN.
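The exact-match-else-latest logic can be tried out on the question's sample data with the sketch below. It assumes an in-memory SQLite database (via Python's sqlite3 module) instead of SQL Server, and it uses COALESCE in place of the CASE expression, which is equivalent here.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Object_Table (ObjectId INT, AttribId INT, GenId INT);
    CREATE TABLE Attribute_Table (AttribId INT, GenId INT, AttribDesc VARCHAR(50));
    INSERT INTO Object_Table VALUES (32, 2, 3), (33, 3, 1);
    INSERT INTO Attribute_Table VALUES
        (1, 1, 'Text'), (2, 1, 'Some Text'),
        (2, 2, 'Some Different Text'), (3, 1, 'Other Text');
""")

# "exact" is the row with the same AttribId and GenId (if any);
# "latest" is the row with the highest GenId for that AttribId.
descriptions = conn.execute("""
    SELECT o.ObjectId,
           COALESCE(exact.AttribDesc, latest.AttribDesc) AS AttribDesc
    FROM Object_Table o
    LEFT JOIN Attribute_Table exact
           ON exact.AttribId = o.AttribId AND exact.GenId = o.GenId
    LEFT JOIN Attribute_Table latest
           ON latest.AttribId = o.AttribId
          AND latest.GenId = (SELECT MAX(m.GenId)
                              FROM Attribute_Table m
                              WHERE m.AttribId = o.AttribId)
    ORDER BY o.ObjectId
""").fetchall()
print(descriptions)
```

Object 33 has an exact match, so it keeps "Other Text"; object 32 has no Gen 3 row, so it falls back to the Gen 2 description.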
Try this...
In case the logic doesn't find a match for the Object_Table GenId, it maps it to the highest available GenId in the ON clause of the JOIN. Note that if an exact match exists and a higher generation is also present, both rows satisfy the condition.
SELECT AttribDesc
FROM Object_Table A
INNER JOIN Attribute_Table B
ON A.AttribId = B.AttribId
AND (
CASE
WHEN A.GenId <> B.GenId
THEN (
SELECT MAX(C.GenId)
FROM Attribute_Table C
WHERE A.AttribId = C.AttribId
)
ELSE A.GenId
END
) -- Selecting the right GenId in the join clause should do the job
= B.GenId
This should work:
with x as (
select *, row_number() over (partition by AttribId order by GenId desc) as rn
from Attribute_Table
)
select isnull(a.attribdesc, x.attribdesc)
from Object_Table o
left join Attribute_Table a
on o.AttribId = a.AttribId and o.GenId = a.GenId
left join x on o.AttribId = x.AttribId and rn = 1

SQL Update Skipping duplicates

Table 1 looks like the following.
ID SIZE TYPE SERIAL
1 4 W-meter1 123456
2 5 W-meter2 123456
3 4 W-meter 585858
4 4 W-Meter 398574
As you can see, items 1 and 2 both have the same serial number. I have an inner join update statement that updates the UniqueID on these devices by linking their serial number to the list.
What I would like to do is modify the items with duplicate serial numbers by hand and run a scripted update on the ones that are unique. I'm presuming I have to use DISTINCT here somewhere, but I'm not sure.
This is my update statement as is. Pretty simple and straightforward.
update UM00400
Set um00400.umEquipmentID = tb2.MIUNo
from UM00400 tb1
inner join AA_Meters tb2 on
tb1.umSerialNumber = tb2.Old_Serial_Num
where tb1.umSerialNumber <> tb2.New_Serial_Num
;WITH CTE
AS
(
-- number the rows within each serial number
SELECT * , rn = ROW_NUMBER() OVER (PARTITION BY umSerialNumber ORDER BY umSerialNumber)
FROM UM00400
)
UPDATE CTE
SET CTE.umEquipmentID = tb2.MIUNo
FROM CTE
inner join AA_Meters tb2
on CTE.umSerialNumber = tb2.Old_Serial_Num
where CTE.umSerialNumber <> tb2.New_Serial_Num
AND CTE.rn = 1
This will update the first of multiple records that share the same serial number.
If I understand your question correctly, the query below will help you out:
;WITH CTE AS
(
-- get the serial numbers that are not duplicated
SELECT umSerialNumber,COUNT(umSerialNumber) as CountOfSerialNumber
FROM UM00400
GROUP BY umSerialNumber
HAVING COUNT(umSerialNumber) = 1
)
UPDATE A SET A.umEquipmentID = C.MIUNo
FROM UM00400 A
INNER JOIN CTE B ON A.umSerialNumber = B.umSerialNumber
INNER JOIN AA_Meters C ON A.umSerialNumber = C.Old_Serial_Num
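The idea of updating only the non-duplicated serial numbers (GROUP BY ... HAVING COUNT(*) = 1) can be checked with the sketch below. It assumes an in-memory SQLite database (via Python's sqlite3 module) in place of SQL Server, so the T-SQL UPDATE ... FROM join is rewritten with subqueries; the id column, the New_Serial_Num values, and the MIUNo values are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UM00400 (id INTEGER PRIMARY KEY, umSerialNumber TEXT, umEquipmentID TEXT);
    CREATE TABLE AA_Meters (Old_Serial_Num TEXT, New_Serial_Num TEXT, MIUNo TEXT);
    -- Serial 123456 is duplicated; the other two are unique.
    INSERT INTO UM00400 (id, umSerialNumber) VALUES
        (1, '123456'), (2, '123456'), (3, '585858'), (4, '398574');
    -- Hypothetical meter data.
    INSERT INTO AA_Meters VALUES
        ('123456', 'N-1', 'MIU-A'), ('585858', 'N-2', 'MIU-B'), ('398574', 'N-3', 'MIU-C');
""")

# Update only rows whose serial number occurs exactly once and that have
# a matching meter; duplicated serials are left for manual handling.
conn.execute("""
    UPDATE UM00400
    SET umEquipmentID = (SELECT m.MIUNo
                         FROM AA_Meters m
                         WHERE m.Old_Serial_Num = UM00400.umSerialNumber)
    WHERE umSerialNumber IN (SELECT umSerialNumber
                             FROM UM00400
                             GROUP BY umSerialNumber
                             HAVING COUNT(*) = 1)
      AND EXISTS (SELECT 1
                  FROM AA_Meters m
                  WHERE m.Old_Serial_Num = UM00400.umSerialNumber)
""")

# Collect the result as {id: umEquipmentID}.
equipment = dict(conn.execute("SELECT id, umEquipmentID FROM UM00400"))
print(equipment)
```

Rows 1 and 2 share a serial number and stay untouched, while rows 3 and 4 receive their MIUNo.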