Oracle SQL join against extracted values

I am looking to reconcile data from two different tables. To do so I need a concatenation on one side and a substring extraction on the other to build columns that I can match against. The following separate queries are the select statements for each table; both produce values of the form sitenn.zonenn (e.g. site12.zone20) as nodename.
SELECT distinct(REGEXP_SUBSTR(B.NODE_NAME,'*site*.*')) as nodename
FROM OPC_ACT_MESSAGES A,OPC_NODE_NAMES B
WHERE A.MESSAGE_GROUP = 'Ebts_Status_Alarms'
AND A.SEVERITY <> 2
AND A.NODE_ID = B.NODE_ID;
SELECT 'site'||site_id||'.zone'||zone_id as nodename
FROM aw_active_alarms
GROUP BY site_id,zone_id;
I need to write a query that selects all nodenames from one table that do not exist in the other.

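Use a LEFT JOIN and keep only the rows with no match. It is often as fast as, or faster than, MINUS, NOT IN or NOT EXISTS, but that depends on the execution plan, so compare them on your own data.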
SELECT a.nodename
FROM (SELECT DISTINCT( regexp_substr(B.node_name, '*site*.*') ) AS nodename
FROM opc_act_messages A,
opc_node_names B
WHERE A.message_group = 'Ebts_Status_Alarms'
AND A.severity <> 2
AND A.node_id = B.node_id
) a
LEFT JOIN
(SELECT 'site'
|| site_id
|| '.zone'
|| zone_id AS nodename
FROM aw_active_alarms
GROUP BY site_id,
zone_id
) b
ON a.nodename = b.nodename
WHERE b.nodename IS NULL
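
For comparing execution plans, the same anti-join can also be written with NOT EXISTS. This is only a sketch: the table and column names are the ones from the question, the pattern is kept exactly as posted, and the CTE names opc and aw are made up for illustration.

```sql
-- Sketch: anti-join written with NOT EXISTS instead of LEFT JOIN ... IS NULL.
-- CTE names (opc, aw) are illustrative; everything else comes from the question.
WITH opc AS (
  SELECT DISTINCT REGEXP_SUBSTR(b.node_name, '*site*.*') AS nodename
  FROM   opc_act_messages a
  JOIN   opc_node_names b ON a.node_id = b.node_id
  WHERE  a.message_group = 'Ebts_Status_Alarms'
  AND    a.severity <> 2
),
aw AS (
  SELECT DISTINCT 'site' || site_id || '.zone' || zone_id AS nodename
  FROM   aw_active_alarms
)
SELECT opc.nodename
FROM   opc
WHERE  NOT EXISTS (SELECT 1 FROM aw WHERE aw.nodename = opc.nodename);
```

Running both forms with EXPLAIN PLAN is probably the quickest way to see whether the optimizer treats them differently on your data.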

One simple way: use MINUS, which returns the distinct rows of the first query that do not appear in the second.
SELECT distinct(REGEXP_SUBSTR(B.NODE_NAME,'*site*.*')) as nodename
FROM OPC_ACT_MESSAGES A,OPC_NODE_NAMES B
WHERE A.MESSAGE_GROUP = 'Ebts_Status_Alarms'
AND A.SEVERITY <> 2
AND A.NODE_ID = B.NODE_ID
MINUS
SELECT 'site'||site_id||'.zone'||zone_id as nodename
FROM aw_active_alarms
GROUP BY site_id,zone_id;
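
A side note on the pattern used in these queries: '*site*.*' reads like a shell wildcard rather than a regular expression. If the node names really embed a sitenn.zonenn fragment, a stricter pattern may express the intent better. The sketch below assumes a made-up node name, since the real NODE_NAME format is not shown in the question.

```sql
-- Hypothetical node name; the real NODE_NAME format is an assumption.
-- [0-9]+ matches one or more digits and \. matches a literal dot.
SELECT REGEXP_SUBSTR('ebts-site12.zone20.example.net',
                     'site[0-9]+\.zone[0-9]+') AS nodename
FROM   dual;
```

For this input the expression is meant to pick out site12.zone20, which is the form the second table builds by concatenation.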

Would this work?
WITH t1
AS (SELECT DISTINCT
(REGEXP_SUBSTR (B.NODE_NAME, '*site*.*')) AS nodename
FROM OPC_ACT_MESSAGES A, OPC_NODE_NAMES B
WHERE A.MESSAGE_GROUP = 'Ebts_Status_Alarms'
AND A.SEVERITY <> 2
AND A.NODE_ID = B.NODE_ID),
t2
AS ( SELECT 'site' || site_id || '.zone' || zone_id AS nodename
FROM aw_active_alarms
GROUP BY site_id, zone_id)
SELECT *
FROM t1
WHERE t1.nodename NOT IN (SELECT nodename FROM t2)
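
One thing to watch with NOT IN is how it treats NULL. The three statements below are a self-contained illustration against DUAL with made-up literals. They are not part of the original query, but they show why a NULL nodename (for example when REGEXP_SUBSTR finds no match) would silently drop out of the result.

```sql
-- One row: the value is absent from a list that contains no NULLs.
SELECT 'kept' AS result FROM dual
WHERE  'site1.zone1' NOT IN ('site2.zone2');

-- No rows: a NULL in the list makes the comparison unknown.
SELECT 'kept' AS result FROM dual
WHERE  'site1.zone1' NOT IN ('site2.zone2', NULL);

-- No rows: a NULL on the left-hand side is never "not in" anything.
SELECT 'kept' AS result FROM dual
WHERE  CAST(NULL AS VARCHAR2(20)) NOT IN ('site2.zone2');
```

NOT EXISTS does not have this behaviour, so it may be the safer choice if either side can produce NULL.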

Related

Update more rows than I expected

I'm trying to update part of my table. If I run a select statement, I find 17 occurrences, but when I run the update, it updates 997 rows. I only want to update the 17 occurrences. This is my code:
update proc_try k set detail = (
select jobs from
(
with
a ( nameHost ) as (
select b.nameHost
from definition a ,schema.nodes b
where b.nameHost = a.idNode or b.nodeid=a.idNode
and nodetype not like 'R'
group by b.nameHost
having sum(1 + lengthb(nameJob)) - 1 > 4000
)
select nameHost, 'TOOLONG' as jobs
from a
UNION ALL
select p.nameHost, listagg(p.nameJob,',') within group (order by p.nameJob) as jobs
from
(
select distinct b.nameJob, a.nameHost
from definition b
right join schema.nodes a
on b.idNode in (a.nodeid,a.nameHost) and
b.application not like '#NOTINCLUDE'
where a.nameHost not in (select * from a) and nodetype not like 'R'
--b.application not like '#NOTINCLUDE'
) p
group by p.nameHost) random
where k.nameHost=random.nameHost);
Could you help me please?
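The update has no WHERE clause of its own, so it touches every row of proc_try (presumably the 997) and sets detail to NULL wherever the subquery finds no matching nameHost. You can generally convert a complex update like this into a merge, which only changes the rows that match: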
merge into proc_try k
using
( select namehost, jobs
from ( with a(namehost) as
( select b.namehost
from definition a
join schema.nodes b
on b.namehost = a.idnode
or (b.nodeid = a.idnode and nodetype <> 'R')
group by b.namehost
having sum(1 + lengthb(namejob)) - 1 > 4000 )
select namehost
, 'TOOLONG' as jobs
from a
union all
select p.namehost
, listagg(p.namejob, ',') within group(order by p.namejob) as jobs
from ( select distinct
b.namejob, a.namehost
from schema.nodes a
left join definition b
on b.idnode in (a.nodeid, a.namehost)
and b.application not like '#NOTINCLUDE'
where a.namehost not in (select * from a)
and nodetype not like 'R'
) p
group by p.namehost
) random
) new_jobs
on (k.namehost = new_jobs.namehost)
when matched then update set k.detail = new_jobs.jobs;
This is untested as I don't have your tables or sample data.
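
If you would rather keep an UPDATE, the usual fix is to repeat the correlation in a WHERE EXISTS so that unmatched rows are left alone. The sketch below uses two small hypothetical tables (proc_try_demo and new_jobs_demo) in place of the real ones, just to show the shape.

```sql
-- Hypothetical demo tables standing in for proc_try and the computed job lists.
CREATE TABLE proc_try_demo (namehost VARCHAR2(30), detail VARCHAR2(100));
CREATE TABLE new_jobs_demo (namehost VARCHAR2(30), jobs   VARCHAR2(100));

INSERT INTO proc_try_demo VALUES ('hostA', 'old');
INSERT INTO proc_try_demo VALUES ('hostB', 'old');
INSERT INTO new_jobs_demo VALUES ('hostA', 'JOB1,JOB2');

-- Only hostA is updated. Without the WHERE EXISTS, hostB would also be
-- updated and its detail would become NULL.
UPDATE proc_try_demo k
SET    k.detail = (SELECT n.jobs
                   FROM   new_jobs_demo n
                   WHERE  n.namehost = k.namehost)
WHERE  EXISTS (SELECT 1
               FROM   new_jobs_demo n
               WHERE  n.namehost = k.namehost);
```

With a subquery as large as the one in the question, the merge is likely easier to read because the subquery is written only once.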
Edit: Looks like we can simplify it a bit, to this:
merge into proc_try k
using
( with overlength (namehost) as
( select n.namehost
from definition d
join schema.nodes n
on n.namehost = d.idnode
or (n.nodeid = d.idnode and n.nodetype <> 'R')
group by n.namehost
having sum(1 + lengthb(d.namejob)) - 1 > 4000 )
select o.namehost, 'TOOLONG' as jobs
from overlength o
union all
select sd.namehost
, listagg(sd.namejob, ',') within group(order by sd.namejob) as jobs
from ( select distinct d.namejob, n.namehost
from schema.nodes n
left join definition d
on d.idnode in (n.nodeid, n.namehost)
and d.application not like '#NOTINCLUDE'
where n.namehost not in (select o.namehost from overlength o)
and n.nodetype not like 'R'
) sd
group by sd.namehost
) new_jobs
on (new_jobs.namehost = k.namehost)
when matched then update set k.detail = new_jobs.jobs;
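As for
sum(1 + lengthb(namejob)) - 1
it appears to predict the byte length of the comma-separated list: one extra byte per job for the separator, minus one because the last job has none. That total is what gets compared with 4000 (presumably the VARCHAR2 limit), so it is not equivalent to
sum(lengthb(namejob))
which ignores the separators.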
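
To check what the length expression computes, here is a small self-contained query against DUAL. The job names are made up; the point is only to compare the expression with the actual byte length of the LISTAGG result.

```sql
-- Made-up job names of 5, 6 and 7 bytes.
WITH jobs (namejob) AS (
  SELECT 'JOB_A'   FROM dual UNION ALL
  SELECT 'JOB_BB'  FROM dual UNION ALL
  SELECT 'JOB_CCC' FROM dual
)
SELECT SUM(1 + LENGTHB(namejob)) - 1                                  AS predicted_bytes,
       LENGTHB(LISTAGG(namejob, ',') WITHIN GROUP (ORDER BY namejob)) AS actual_bytes,
       SUM(LENGTHB(namejob))                                          AS without_separators
FROM   jobs;
```

Here predicted_bytes and actual_bytes are both 20 (5 + 6 + 7 plus two commas), while without_separators is 18.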

SQL sub query not working with group by clause

I am using SQL Server 2012. Can anyone tell me where I am going wrong?
SELECT
avg ( tbl.FirstBillComplete )
FROM
( select l.MONTH, a.OverallScore, (a.FirstBillComplete), ( a.EmailComplete)
from tbl_T1 a join calls.dbo.c1_LP l on a.QID = l.QID
union
select l.MONTH, a.OverallScore, (a.FirstBillComplete), ( a.EmailComplete)
from tbl_2 a join calls.dbo.C3_LP l on a.QID = l.QID
union ALL
select l.MONTH, a.OverallScore, (a.FirstBillComplete), ( a.EmailComplete)
from tbl_3 a join c2 l on a.QID = l.QID
) As tbl
GROUP BY tbl.MONTH
The error I get is :
No column was specified for column 7 of 'tbl'.
No column was specified for column 8 of 'tbl'
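You need to give columns 7 and 8 of tbl a name. Every column of a derived table must be named, so add an alias to any expression or literal in its select list (the query as posted shows only four columns, so the unnamed ones are presumably in the part that was left out). For example:
'' AS MyColumn7,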
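
A minimal sketch of the rule, using made-up literal values rather than the tables from the question. AdjustedScore is an illustrative name; removing its alias should reproduce the "No column was specified" error.

```sql
-- Hypothetical values; AdjustedScore is not a column from the question.
SELECT tbl.[MONTH],
       AVG(tbl.FirstBillComplete) AS AvgFirstBillComplete
FROM (
    SELECT 1 AS [MONTH], 80 AS FirstBillComplete, 80 + 5 AS AdjustedScore
    UNION ALL
    SELECT 1, 90, 90 + 5      -- column names come from the first SELECT of the union
) AS tbl
GROUP BY tbl.[MONTH];
```

Only the first SELECT of a UNION needs the aliases, since it determines the column names of the whole derived table.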

Is there a way to make this query more efficient performance wise?

This query takes a long time to run on an MS SQL Server 2008 database with 70 GB of data.
If I run the two WHERE conditions separately it takes a lot less time.
EDIT - I need to change the 'select *' to 'delete' afterwards, so please keep that in mind when answering. Thanks :)
select *
From computers
Where Name in
(
select T2.Name
from
(
select Name
from computers
group by Name
having COUNT(*) > 1
) T3
join computers T2 on T3.Name = T2.Name
left join policyassociations PA on T2.PK = PA.EntityId
where (T2.EncryptionStatus = 0 or T2.EncryptionStatus is NULL) and
(PA.EntityType <> 1 or PA.EntityType is NULL)
)
OR
ClientId in
(
select substring(ClientID,11,100)
from computers
)
Swapping IN for EXISTS will help.
Also, as per Gordon's answer: UNION can out-perform OR.
SELECT computers.*
FROM computers
LEFT
JOIN policyassociations
ON policyassociations.entityid = computers.pk
WHERE (
computers.encryptionstatus = 0
OR computers.encryptionstatus IS NULL
)
AND (
policyassociations.entitytype <> 1
OR policyassociations.entitytype IS NULL
)
AND EXISTS (
SELECT name
FROM (
SELECT name
FROM computers
GROUP
BY name
HAVING Count(*) > 1
) As duplicate_computers
WHERE name = computers.name
)
UNION
SELECT *
FROM computers As c
WHERE EXISTS (
SELECT SubString(clientid, 11, 100)
FROM computers
WHERE SubString(clientid, 11, 100) = c.clientid
)
You've now updated your question asking to make this a delete.
Well the good news is that instead of the "OR" you just make two DELETE statements:
DELETE computers
FROM computers
LEFT
JOIN policyassociations
ON policyassociations.entityid = computers.pk
WHERE (
computers.encryptionstatus = 0
OR computers.encryptionstatus IS NULL
)
AND (
policyassociations.entitytype <> 1
OR policyassociations.entitytype IS NULL
)
AND EXISTS (
SELECT name
FROM (
SELECT name
FROM computers
GROUP
BY name
HAVING Count(*) > 1
) As duplicate_computers
WHERE name = computers.name
)
;
DELETE c
FROM computers As c
WHERE EXISTS (
SELECT SubString(clientid, 11, 100)
FROM computers
WHERE SubString(clientid, 11, 100) = c.clientid
)
;
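
Before running deletes like these on a large table, it may be worth doing a dry run inside a transaction and comparing the row count with what the SELECT version returned. The sketch below uses a hypothetical temp table (#computers_demo) with made-up values so that it runs on its own.

```sql
-- Hypothetical scratch table so the sketch is self-contained.
CREATE TABLE #computers_demo (pk INT, clientid VARCHAR(200));
INSERT INTO #computers_demo VALUES (1, 'PREFIX0001abc'), (2, 'abc'), (3, 'xyz');

BEGIN TRANSACTION;

-- Row 2 is deleted: 'abc' equals characters 11 onwards of row 1's clientid.
DELETE c
FROM   #computers_demo AS c
WHERE  EXISTS (SELECT 1
               FROM   #computers_demo AS other
               WHERE  SUBSTRING(other.clientid, 11, 100) = c.clientid);

SELECT @@ROWCOUNT AS rows_deleted;   -- compare with the count from the SELECT version

ROLLBACK TRANSACTION;                -- switch to COMMIT once the count looks right
DROP TABLE #computers_demo;
```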
Some things I would look at:
1. Are indexes in place on the columns used for joining and filtering?
2. 'IN' with a subquery can slow the query down; try replacing it with joins.
3. Consider counting a specific column (I guess 'Name' in this case) instead of count(*), though note that this skips NULLs and may not change performance.
4. Select only the data you need by listing particular columns.
Hope this helps!
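
As a rough sketch of the first point, these are the kinds of indexes that might help the query in the question. The index names and the ClientIdSuffix column are hypothetical, and the computed column assumes ClientID is a non-MAX character type; check the existing indexes and the actual execution plan before adding anything.

```sql
-- Hypothetical index names; the real schema and existing indexes are not known.
CREATE INDEX IX_computers_Name
    ON computers (Name);

CREATE INDEX IX_policyassociations_EntityId
    ON policyassociations (EntityId);

-- Optional: make the SUBSTRING lookup indexable via a persisted computed column.
-- Assumes ClientID is a non-MAX character type.
ALTER TABLE computers
    ADD ClientIdSuffix AS SUBSTRING(ClientID, 11, 100) PERSISTED;

CREATE INDEX IX_computers_ClientIdSuffix
    ON computers (ClientIdSuffix);
```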
OR can sometimes be poorly optimized. In this case, you can just split the query into two subqueries, and combine them using union:
select *
From computers
Where Name in
(
select T2.Name
from
(
select Name
from computers
group by Name
having COUNT(*) > 1
) T3
join computers T2 on T3.Name = T2.Name
left join policyassociations PA on T2.PK = PA.EntityId
where (T2.EncryptionStatus = 0 or T2.EncryptionStatus is NULL) and
(PA.EntityType <> 1 or PA.EntityType is NULL)
)
UNION
select *
From computers
WHERE ClientId in
(
select substring(ClientID,11,100)
from computers
);
You might also be able to improve performance by replacing the subqueries with explicit joins. However, this seems like the shortest route to better performance.
EDIT:
I think the version with joins is:
select c.*
From computers c left outer join
(select c.Name
from (select c.*, count(*) over (partition by Name) as cnt
from computers c
) c left join
policyassociations PA
on c.PK = PA.EntityId and PA.EntityType <> 1
where (c.EncryptionStatus = 0 or c.EncryptionStatus is NULL) and
c.cnt > 1
) cpa
on c.Name = cpa.Name left outer join
(select substring(ClientID, 11, 100) as name
from computers
) csub
on c.ClientId = csub.name
Where cpa.Name is not null or csub.Name is not null;

Find records with exact matches on a many to many relationship

I have three tables that look like these:
PROD
Prod_ID|Desc
------------
P1|Foo1
P2|Foo2
P3|Foo3
P4|Foo4
...
RAM
Ram_ID|Desc
------------
R1|Bar1
R2|Bar2
R3|Bar3
R4|Bar4
...
PROD_RAM
Prod_ID|Ram_ID
------------
P1|R1
P2|R2
P3|R1
P3|R2
P3|R3
P4|R3
P5|R1
P5|R2
...
Between PROD and RAM there's a Many-To-Many relationship described by the PROD_RAM table.
Given a Ram_ID set like (R1,R3), I would like to find all the PROD rows that have exactly ONE or ALL of the RAM of the given set.
Given (R1,R3), the query should return for example P1, P4 and P5; P3 should not be returned because it has R1 and R3 but also R2.
What's the fastest query to get all the PROD rows that have exactly ONE or ALL of the Ram_ID values of a given RAM set?
EDIT:
The PROD_RAM table could contain relationships larger than 1->3, so "hardcoded" checks for count = 1 OR count = 2 are not a viable solution.
Another solution you could try for speed would be like this
;WITH CANDIDATES AS (
SELECT pr1.Prod_ID
, pr2.Ram_ID
FROM PROD_RAM pr1
INNER JOIN PROD_RAM pr2 ON pr2.Prod_ID = pr1.Prod_ID
WHERE pr1.Ram_ID IN ('R1', 'R3')
)
SELECT *
FROM CANDIDATES
WHERE CANDIDATES.Prod_ID NOT IN (
SELECT Prod_ID
FROM CANDIDATES
WHERE Ram_ID NOT IN ('R1', 'R3')
)
or if you don't like repeating the set conditions
;WITH SUBSET (Ram_ID) AS (
SELECT 'R1'
UNION ALL SELECT 'R3'
)
, CANDIDATES AS (
SELECT pr1.Prod_ID
, pr2.Ram_ID
FROM PROD_RAM pr1
INNER JOIN PROD_RAM pr2 ON pr2.Prod_ID = pr1.Prod_ID
INNER JOIN SUBSET s ON s.Ram_ID = pr1.Ram_ID
)
, EXCLUDES AS (
SELECT Prod_ID
FROM CANDIDATES
LEFT OUTER JOIN SUBSET s ON s.Ram_ID = CANDIDATES.Ram_ID
WHERE s.Ram_ID IS NULL
)
SELECT *
FROM CANDIDATES
LEFT OUTER JOIN EXCLUDES ON EXCLUDES.Prod_ID = CANDIDATES.Prod_ID
WHERE EXCLUDES.Prod_ID IS NULL
One way to find the products that have exactly one RAM (it does not filter by the given set yet) would be something like the following:
SELECT PROD.Prod_ID FROM PROD WHERE
(SELECT COUNT(*) FROM PROD_RAM WHERE PROD_RAM.Prod_ID = PROD.Prod_ID) > 0 AND
(SELECT COUNT(*) FROM PROD_RAM WHERE PROD_RAM.Prod_ID = PROD.Prod_ID AND PROD_RAM.Ram_ID <>
ISNULL((SELECT TOP 1 Ram_ID FROM PROD_RAM WHERE PROD_RAM.Prod_ID = PROD.Prod_ID),0)) = 0
SELECT Prod_ID
FROM
  ( SELECT Prod_ID
         , COUNT(*) AS cntAll
         , COUNT( CASE WHEN Ram_ID IN ('R1','R3')
                       THEN 1
                       ELSE NULL
                  END
                ) AS cntGood
    FROM PROD_RAM
    GROUP BY Prod_ID
  ) AS grp
WHERE cntAll = cntGood
  AND ( cntGood = 1
     OR cntGood = 2 --- number of items in list ('R1','R3')
      )
Not at all sure if it's the fastest way. You'll have to try different ways to write this query (using JOINs and NOT EXISTS ) and test for speed.
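
The count-based approach can be generalised so that nothing is hard-coded to the size of the list. The sketch below is one way to do that, assuming SQL Server syntax, that (Prod_ID, Ram_ID) pairs are unique, and that the wanted set holds no duplicates. prod_ram_sample and wanted are illustrative names, and the rows are a subset of the sample data from the question.

```sql
WITH prod_ram_sample (Prod_ID, Ram_ID) AS (
    SELECT 'P1', 'R1' UNION ALL
    SELECT 'P2', 'R2' UNION ALL
    SELECT 'P3', 'R1' UNION ALL
    SELECT 'P3', 'R2' UNION ALL
    SELECT 'P3', 'R3' UNION ALL
    SELECT 'P4', 'R3'
),
wanted (Ram_ID) AS (
    SELECT 'R1' UNION ALL SELECT 'R3'
)
SELECT pr.Prod_ID
FROM   prod_ram_sample AS pr
LEFT JOIN wanted AS w ON w.Ram_ID = pr.Ram_ID
GROUP BY pr.Prod_ID
-- every RAM of the product must be in the wanted set ...
HAVING COUNT(*) = COUNT(w.Ram_ID)
-- ... and the product must have exactly one of them, or all of them
   AND (COUNT(w.Ram_ID) = 1
        OR COUNT(w.Ram_ID) = (SELECT COUNT(*) FROM wanted));
```

For these rows the query returns P1 and P4. In a real query the wanted set would more likely come from a table-valued parameter or a temp table than from a literal CTE.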

MySQL/SQL - When are the results of a sub-query available?

Suppose I have this query
SELECT * FROM (
SELECT * FROM table_a
WHERE id > 10 )
AS a_results LEFT JOIN
(SELECT * FROM table_b
WHERE id IN
(SELECT id FROM a_results)) AS b_results
ON (a_results.id = b_results.id)
I would get the error "a_results is not a table". Is there anywhere I could re-use the results of the subquery?
Edit: It has been noted that this query doesn't make sense...it doesn't, yes. This is just to illustrate the question which I am asking; the 'real' query actually looks something like this:
SELECT SQL_CALC_FOUND_ROWS * FROM
( SELECT wp_pod_tbl_hotel . *
FROM wp_pod_tbl_hotel, wp_pod_rel, wp_pod
WHERE wp_pod_rel.field_id =12
AND wp_pod_rel.tbl_row_id =1
AND wp_pod.tbl_row_id = wp_pod_tbl_hotel.id
AND wp_pod_rel.pod_id = wp_pod.id
) as
found_hotel LEFT JOIN (
SELECT COUNT(*) as review_count, avg( (
location_rating + staff_performance_rating + condition_rating + room_comfort_rating + food_rating + value_rating
) /6 ) AS average_score, hotelid
FROM (
SELECT r. * , wp_pod_rel.tbl_row_id AS hotelid
FROM wp_pod_tbl_review r, wp_pod_rel, wp_pod
WHERE wp_pod_rel.field_id =11
AND wp_pod_rel.pod_id = wp_pod.id
AND r.id = wp_pod.tbl_row_id
AND wp_pod_rel.tbl_row_id
IN (
SELECT wp_pod_tbl_hotel .id
FROM wp_pod_tbl_hotel, wp_pod_rel, wp_pod
WHERE wp_pod_rel.field_id =12
AND wp_pod_rel.tbl_row_id =1
AND wp_pod.tbl_row_id = wp_pod_tbl_hotel.id
AND wp_pod_rel.pod_id = wp_pod.id
)
) AS hotel_reviews
GROUP BY hotel_reviews.hotelid
ORDER BY average_score DESC
) AS sorted_hotel ON (id = sorted_hotel.hotelid)
As you can see, the sub-query which makes up the found_hotel table is repeated further down as another sub-query, so I was hoping to re-use the results.
You cannot use a sub-query like this: the alias of a derived table is not visible inside another derived table in the same FROM clause.
I'm not sure I understand your query, but wouldn't this be sufficient?
SELECT * FROM table_a a
LEFT JOIN table_b b ON ( b.id = a.id )
WHERE a.id > 10
It would return all rows from table_a where id > 10 and LEFT JOIN rows from table_b where id matches.
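
If the subquery result really does need to be referenced more than once, MySQL 8.0 and later support common table expressions, which name a subquery once and let later parts of the statement refer to it. The sketch below assumes such a version; the two CREATE TABLE statements are minimal stand-ins with made-up rows so that it runs on its own.

```sql
-- Minimal stand-in tables (MySQL 8.0 or later); the real tables have more columns.
CREATE TABLE table_a (id INT PRIMARY KEY);
CREATE TABLE table_b (id INT PRIMARY KEY);
INSERT INTO table_a VALUES (5), (11), (12);
INSERT INTO table_b VALUES (11), (20);

WITH a_results AS (
    SELECT * FROM table_a WHERE id > 10
),
b_results AS (
    -- a_results can be referenced here because it is a CTE, not a derived table
    SELECT * FROM table_b
    WHERE id IN (SELECT id FROM a_results)
)
SELECT a_results.id AS a_id, b_results.id AS b_id
FROM   a_results
LEFT JOIN b_results ON a_results.id = b_results.id;
```

On older versions, the simpler join shown above is probably the better route, since it avoids the repetition altogether.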