Find matching set of records - sql

I have the following tables on SQL Server 2008 R2:
MessageTable
ContrlNo| LineNo | Msg
1 | 1 | Tiger1 Text
1 | 2 | Tiger1 Text
1 | 3 | Tiger1 Text
1 | 4 | Tiger1 Text
2 | 1 | Tiger1 Text1
2 | 2 | Tiger1 Text2
2 | 3 | Tiger1 Text3
2 | 4 | Tiger1 Text4
3 | 1 | Horse 1
3 | 2 | Horse 2
3 | 3 | Horse 3
3 | 4 | Horse 4
RuleTable
RuleNo| MsgLineNo | RuleStartingPos | RuleMsg
1 | 1 | 1 | Tiger1 Text
2 | 1 | 1 | Tiger1 Text
2 | 3 | 1 | Tiger1 Text3
For each set of ContrlNo records in MessageTable, I would like to apply the rules from RuleTable and list the RuleNo if any match.
As you can see in RuleTable, rule 2 overlaps rule 1. The requirement is to get the best-matching RuleNo for each control number. The expected result is:
ContrlNo | RuleNo
1 | 1
2 | 2
3 | NULL

I believe the following will retrieve the listed results: it will show a control number with the associated rule that has the most matching lines (if, as in your first case, two rules have an equal number of matches, it will check the MatchPercent).
SELECT MT.ContrlNo, r.RuleNo, r.MatchPercent
FROM
MessageTable MT
LEFT JOIN
(
SELECT
ContrlNo,
RuleNo,
MatchedRules / AvailableRules AS MatchPercent,
ROW_NUMBER() OVER (PARTITION BY ContrlNo ORDER BY MatchedRules DESC, MatchedRules / AvailableRules DESC) AS rn
FROM
(
SELECT
ContrlNo,
R.RuleNo,
COUNT(*) as MatchedRules,
(SELECT COUNT(*) FROM RuleTable WHERE RuleTable.RuleNo = R.RuleNo) + 0.0 AS AvailableRules
FROM
MessageTable M
INNER JOIN
RuleTable R ON
M.[LineNo] = R.MsgLineNo AND
SUBSTRING(M.Msg,R.RuleStartingPos,LEN(R.RuleMsg)) LIKE '%' + R.RuleMsg + '%'
GROUP BY M.ContrlNo, R.RuleNo
) q
) r ON
MT.ContrlNo = r.ContrlNo AND
r.rn = 1
GROUP BY MT.ContrlNo, r.RuleNo, r.MatchPercent
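To make the ranking logic easier to follow outside the database, here is a minimal Python sketch of the same idea: count the matching rule lines per (ContrlNo, RuleNo), then pick the rule with the most matches and break ties by the fraction of the rule's lines that matched. The function name best_rules and the in-memory tuples are assumptions for illustration, and the comparison is an exact substring match rather than the LIKE used in the query.

```python
from collections import defaultdict

# (ContrlNo, LineNo, Msg), copied from the question's MessageTable
messages = [
    (1, 1, "Tiger1 Text"), (1, 2, "Tiger1 Text"),
    (1, 3, "Tiger1 Text"), (1, 4, "Tiger1 Text"),
    (2, 1, "Tiger1 Text1"), (2, 2, "Tiger1 Text2"),
    (2, 3, "Tiger1 Text3"), (2, 4, "Tiger1 Text4"),
    (3, 1, "Horse 1"), (3, 2, "Horse 2"),
    (3, 3, "Horse 3"), (3, 4, "Horse 4"),
]
# (RuleNo, MsgLineNo, RuleStartingPos, RuleMsg), copied from RuleTable
rules = [
    (1, 1, 1, "Tiger1 Text"),
    (2, 1, 1, "Tiger1 Text"),
    (2, 3, 1, "Tiger1 Text3"),
]

def best_rules(messages, rules):
    """Return {ContrlNo: best RuleNo or None} (hypothetical helper)."""
    available = defaultdict(int)          # lines defined per rule
    for rule_no, _, _, _ in rules:
        available[rule_no] += 1

    matched = defaultdict(int)            # (ContrlNo, RuleNo) -> matched lines
    for ctrl, line_no, msg in messages:
        for rule_no, rule_line, start, rule_msg in rules:
            begin = start - 1             # SQL positions are 1-based
            if line_no == rule_line and msg[begin:begin + len(rule_msg)] == rule_msg:
                matched[(ctrl, rule_no)] += 1

    result = {ctrl: None for ctrl, _, _ in messages}
    best_key = {}
    for (ctrl, rule_no), count in matched.items():
        # most matched lines first, then highest match fraction
        key = (count, count / available[rule_no])
        if ctrl not in best_key or key > best_key[ctrl]:
            best_key[ctrl] = key
            result[ctrl] = rule_no
    return result

print(best_rules(messages, rules))  # {1: 1, 2: 2, 3: None}
```

Control number 1 matches one line of each rule, so the match fraction (1/1 for rule 1 versus 1/2 for rule 2) decides in favour of rule 1.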

The first subquery gets the total number of matched rules and the second gets the total number of rules. When these two counts match, we show the RuleNo; otherwise the LEFT JOIN yields NULL.
Select A.ContrlNo, ISNULL(T.RuleNo,0) as RuleNo FROM
( select ContrlNo, COUNT(R.RuleNo) as MatchedRules
FROM Messages M
LEFT JOIN Rules R
on M.[LineNo] = R.MsgLineNo
and SUBSTRING(M.Msg,R.RuleStartingPos,LEN(R.RuleMsg)) = R.RuleMsg
AND M.ContrlNo = R.RuleNo
GROUP BY M.ContrlNo) A
LEFT JOIN (
select COUNT(MsgLineNo) as TotalRules, RuleNo
from Rules R1
group by RuleNo) T
ON A.MatchedRules = T.TotalRules

Related

Get right table data on LEFT JOIN

I have a problem with my SQL join query.
I need to get the rows of the left table that are either not referenced in the right table for me (user 1), or referenced in the right table with status equal to 0 and user equal to 1.
I also need the status field of the right table.
Here are my two tables:
Table left
ID | title
1 | Title 1
2 | Title 2
3 | Title 3
4 | Title 4
5 | Title 5
6 | Title 6
Table right
ID | status | user | left_id
1 | 0 | 1 | 1
2 | 0 | 50 | 1
3 | 1 | 1 | 2
4 | 0 | 50 | 2
5 | 0 | 1 | 3
6 | 1 | 50 | 3
7 | 0 | 50 | 4
8 | 1 | 50 | 5
My goal is to get this result:
left.ID | left.title | right.status | right.user
1 | Title 1 | 0 | 1
3 | Title 3 | 0 | 1
4 | Title 4 | NULL | NULL
5 | Title 5 | NULL | NULL
6 | Title 6 | NULL | NULL
Here is my query so far:
SELECT l.id, l.title, r.user, r.status
FROM left as l
LEFT JOIN right as r ON l.id = r.left_id
WHERE r.left_id IS NULL or (r.user = 1 AND r.status = 0)
With this query I get the rows with ID (left table) 1 / 3 / 6, but I also need IDs 4 / 5.
Those rows aren't displayed because another user (50) has a reference to them, but I (user 1) don't.
If someone can help me add rows 4 / 5 to my result, I would be grateful.
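A small improvement to the query should be sufficient: move the r.user = 1 filter into the join condition, so that rows referenced only by other users still come back with NULLs: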
SELECT l.id, l.title, r.user, r.status
FROM left as l
LEFT JOIN right as r ON l.id = r.left_id and r.user = 1
WHERE r.left_id IS NULL or r.status = 0
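To see the difference between filtering in WHERE and filtering in ON, the sketch below loads the sample data into an in-memory SQLite database and runs both variants. The table names left_tbl and right_tbl are assumptions (LEFT and RIGHT are keywords in SQLite), and SQLite merely stands in for whatever engine you are using.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE left_tbl  (id INTEGER, title TEXT);
CREATE TABLE right_tbl (id INTEGER, status INTEGER, user INTEGER, left_id INTEGER);
INSERT INTO left_tbl VALUES
  (1,'Title 1'),(2,'Title 2'),(3,'Title 3'),
  (4,'Title 4'),(5,'Title 5'),(6,'Title 6');
INSERT INTO right_tbl VALUES
  (1,0,1,1),(2,0,50,1),(3,1,1,2),(4,0,50,2),
  (5,0,1,3),(6,1,50,3),(7,0,50,4),(8,1,50,5);
""")

# Original attempt: the user filter lives in WHERE, so left rows that are
# referenced only by other users (IDs 4 and 5) are filtered out.
filter_in_where = conn.execute("""
    SELECT l.id, l.title, r.status, r.user
    FROM left_tbl AS l
    LEFT JOIN right_tbl AS r ON l.id = r.left_id
    WHERE r.left_id IS NULL OR (r.user = 1 AND r.status = 0)
    ORDER BY l.id
""").fetchall()

# Improved query: the user filter is part of the join condition, so rows of
# other users never join and the left row survives with NULLs.
filter_in_on = conn.execute("""
    SELECT l.id, l.title, r.status, r.user
    FROM left_tbl AS l
    LEFT JOIN right_tbl AS r ON l.id = r.left_id AND r.user = 1
    WHERE r.left_id IS NULL OR r.status = 0
    ORDER BY l.id
""").fetchall()

print([row[0] for row in filter_in_where])  # [1, 3, 6]
print([row[0] for row in filter_in_on])     # [1, 3, 4, 5, 6]
```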
(select t1.id, t1.title, null as status, null as user
-- rows of tableLeft that user 1 does not reference: status and user stay NULL
from tableLeft t1
where t1.id not in
(select tt2.left_id from tableRight tt2 where tt2.user = 1)
)
union
(select t1.id,t1.title,t2.status,t2.user
from tableLeft t1
left join tableRight t2 on t1.id=t2.left_id
where t2.status=0 and t2.user=1
)

Find duplicate combinations

I need a query to find duplicate combinations in these tables:
AttributeValue:
id | name
------------------
1 | green
2 | blue
3 | red
4 | 100x200
5 | 150x200
Product:
id | name
----------------
1 | Produkt A
ProductAttribute:
id | id_product | price
--------------------------
1 | 1 | 100
2 | 1 | 200
3 | 1 | 100
4 | 1 | 200
5 | 1 | 100
6 | 1 | 200
7 | 1 | 100 -- duplicate combination
8 | 1 | 100 -- duplicate combination
ProductAttributeCombinations:
id_product_attribute | id_attribute
-------------------------------------
1 | 1
1 | 4
2 | 1
2 | 5
3 | 2
3 | 4
4 | 2
4 | 5
5 | 3
5 | 4
6 | 3
6 | 5
7 | 1
7 | 4
8 | 1
8 | 5
I need SQL that creates result like:
id_product | duplicate_attributes
----------------------------------
1 | {7,8}
If I understand correctly, 7 is a duplicate of 1 and 8 is a duplicate of 2. As phrased, your question is a bit confusing, because 7 and 8 are not related to each other and the only table of interest is ProductAttributeCombinations.
If this is the case, then one method is to use string aggregation:
with combos as (
select id_product_attribute,
string_agg(id_attribute::text, ',' order by id_attribute) as combo
from ProductAttributeCombinations pac
group by id_product_attribute
)
select *
from combos c
where exists (select 1
from combos c2
where c2.id_product_attribute < c.id_product_attribute and -- an earlier row has the same combo, so c is the duplicate
c2.combo = c.combo
);
Your question leaves some room for interpretation. Here is my educated guess:
For each product, return an array of all instances with the same set of attributes as any other instance of the same product with smaller ID.
WITH combo AS (
SELECT id_product, id, array_agg(id_attribute) AS attributes
FROM (
SELECT pa.id_product, pa.id, pac.id_attribute
FROM ProductAttribute pa
JOIN ProductAttributeCombinations pac ON pac.id_product_attribute = pa.id
ORDER BY pa.id_product, pa.id, pac.id_attribute
) sub
GROUP BY 1, 2
)
SELECT id_product, array_agg(id) AS duplicate_attributes
FROM combo c
WHERE EXISTS (
SELECT 1
FROM combo
WHERE id_product = c.id_product
AND attributes = c.attributes
AND id < c.id
)
GROUP BY 1;
Sorting can be inlined into the aggregate function so we don't need a subquery for the sort (like @Gordon already provided). This is shorter, but also typically slower:
WITH combo AS (
SELECT pa.id_product, pa.id
, array_agg(pac.id_attribute ORDER BY pac.id_attribute) AS attributes
FROM ProductAttribute pa
JOIN ProductAttributeCombinations pac ON pac.id_product_attribute = pa.id
GROUP BY 1, 2
)
SELECT ...
This only returns products with duplicate instances.
Your table names are rather misleading / contradict the rest of your question. Your sample data is not very clear either, only featuring a single product. I assume there are many in your table.
It's also unclear whether you are using double-quoted table names preserving CaMeL-case spelling. I assume: no.
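For readers who want to check the "same set of attributes as an earlier instance of the same product" interpretation without a database, here is a small Python sketch. The dictionaries and the function name duplicate_attributes are assumptions for illustration; the data is copied from the question.

```python
from collections import defaultdict

# ProductAttribute: id -> id_product (all rows belong to product 1)
product_of = {pa_id: 1 for pa_id in range(1, 9)}

# ProductAttributeCombinations: (id_product_attribute, id_attribute)
combinations = [
    (1, 1), (1, 4), (2, 1), (2, 5), (3, 2), (3, 4), (4, 2), (4, 5),
    (5, 3), (5, 4), (6, 3), (6, 5), (7, 1), (7, 4), (8, 1), (8, 5),
]

def duplicate_attributes(product_of, combinations):
    """Return {id_product: [ids whose attribute set repeats an earlier id]}."""
    attrs = defaultdict(set)
    for pa_id, attr_id in combinations:
        attrs[pa_id].add(attr_id)

    seen = set()                      # (id_product, attribute set) seen so far
    duplicates = defaultdict(list)
    for pa_id in sorted(attrs):       # ascending id, so the smaller id wins
        key = (product_of[pa_id], frozenset(attrs[pa_id]))
        if key in seen:
            duplicates[product_of[pa_id]].append(pa_id)
        else:
            seen.add(key)
    return dict(duplicates)

print(duplicate_attributes(product_of, combinations))  # {1: [7, 8]}
```

Using a frozenset plays the role of the sorted aggregate in the queries: two instances compare equal only when they carry exactly the same attributes, regardless of row order.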

Why isn't this returning unique combinations of these attributes?

When using the following query:
with neededSkills(SkillCode) as (
select distinct SkillCode
from job natural join hasprofile natural join requires_skill
where job_code = '1'
minus
select skillcode
from person natural join hasskill
where id = '1'
)
select distinct
taughtin.c_code as c,
count(taughtin.skillcode) as s,
ti.c_code as cc,
count(ti.skillcode) as ss
from taughtin, taughtin ti
where taughtin.c_code <> ti.c_code
and taughtin.skillcode <> ti.skillcode
and taughtin.skillcode in (select skillcode from neededskills)
and ti.skillcode in (select skillcode from neededskills)
group by (taughtin.c_code, ti.c_code)
order by (taughtin.c_code);
It returns:
C | S | CC | SS
----|----|----|----
1 | 1 | 2 | 1
1 | 1 | 3 | 1
1 | 1 | 5 | 1
2 | 1 | 1 | 1
3 | 1 | 1 | 1
5 | 1 | 1 | 1
I would expect it to return only lines where the combination of C and CC was not already used. Do I misunderstand how group by works? How would I achieve this result?
I am trying to have it return:
C | S | CC | SS
----|----|----|----
1 | 1 | 2 | 1
1 | 1 | 3 | 1
1 | 1 | 5 | 1
I use Oracle SQLPlus.
You're grouping on the combination of taughtin.c_code and ti.c_code, which are separate columns in the context of the query (even though they are the same column in the schema). A pair of 1, 2 is not the same as a pair of 2, 1; the values may be the same but the sources are not.
If you want to get the combinations one way but not the other, then the simplest thing is to always make one value larger than the other; instead of:
where taughtin.c_code <> ti.c_code
use:
where ti.c_code > taughtin.c_code
Though it would be better to use ANSI joins for the main query too, and I'm not a fan of natural joins. You also don't need either distinct; the first may eliminate duplicates, but they don't logically matter if you're only using the temporary result set for in().
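The effect of swapping <> for > in a self-join can be seen with a tiny Python sketch. The list of course codes is a made-up stand-in for the c_code values, not the asker's real data.

```python
# Hypothetical course codes standing in for taughtin.c_code
codes = [1, 2, 3, 5]

# Self-join with "a <> b": every pair shows up in both directions
both_directions = [(a, b) for a in codes for b in codes if a != b]

# Self-join with "b > a": each unordered pair shows up exactly once
one_direction = [(a, b) for a in codes for b in codes if b > a]

print(len(both_directions))  # 12
print(one_direction)         # [(1, 2), (1, 3), (1, 5), (2, 3), (2, 5), (3, 5)]
```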

Check if ALL conditions specified are satisfied in SQL query

Let's assume I have the following derived table:
Comment | Condition_Lower_Score | Condition_Higher_Score | Question_Score
=========================================================================
text1 | 1 | 3 | 2
text1 | 3 | 5 | 4
text2 | 5 | 6 | 1
text2 | 3 | 6 | 4
My table has a comment which is in a one-to-many relationship with conditions. Each condition can specify multiple questions (in this table, the question score relates to different questions). I need to create a query that only selects the comment if all its conditions are satisfied.
The derived table is created from the following tables:
Comment:
Comment_ID | Comment_Text
===========================
1 | text1
2 | text2
Condition:
Condition_ID | Condition_Lower_Score | Condition_Higher_Score | Comment_ID | Question_ID
=========================================================================================
10 | 1 | 3 | 1 | 100
11 | 3 | 5 | 1 | 101
12 | 5 | 6 | 2 | 102
13 | 3 | 6 | 2 | 103
Question:
Question_ID | Question_Score
============================
100 | 2
101 | 4
102 | 1
103 | 4
So in this scenario, I would like only 'text1' to be selected from the derived table, and not 'text2', because not all of its conditions are satisfied.
How do I create a query that only selects if all the conditions are met?
WITH TestsCTE AS
(
SELECT M.Comment_Text AS Comment,
C.Condition_Lower_Score,
C.Condition_Higher_Score,
Q.Question_Score,
CASE
WHEN Q.Question_Score BETWEEN C.Condition_Lower_Score AND C.Condition_Higher_Score
THEN 1
ELSE 0
END AS Pass
FROM [Condition] C
JOIN Comment M
ON C.Comment_ID = M.Comment_ID
JOIN Question Q
ON C.Question_ID = Q.Question_ID
)
SELECT COMMENT
FROM TestsCTE
GROUP BY COMMENT
HAVING MIN(Pass) = 1
select comment from (
<query for your derived table here>
) t1 group by comment
having count(
case
when question_score not between condition_lower_score and condition_higher_score
then 1 end
) = 0
or
select c.comment_text from comment c
join condition co on co.comment_id = c.comment_id
join question q on q.question_id = co.question_id
group by c.comment_text
having count(
case
when question_score not between condition_lower_score and condition_higher_score
then 1 end
) = 0
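Both approaches boil down to "group by comment and require that no condition fails". A quick way to try this is an in-memory SQLite database driven from Python, as sketched below. The plural table names (comments, conditions, questions) are assumptions chosen to avoid keyword clashes, and the HAVING MIN(...) = 1 form mirrors the CTE answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments   (comment_id INTEGER, comment_text TEXT);
CREATE TABLE conditions (condition_id INTEGER, condition_lower_score INTEGER,
                         condition_higher_score INTEGER, comment_id INTEGER,
                         question_id INTEGER);
CREATE TABLE questions  (question_id INTEGER, question_score INTEGER);
INSERT INTO comments   VALUES (1,'text1'),(2,'text2');
INSERT INTO conditions VALUES (10,1,3,1,100),(11,3,5,1,101),
                              (12,5,6,2,102),(13,3,6,2,103);
INSERT INTO questions  VALUES (100,2),(101,4),(102,1),(103,4);
""")

# A comment qualifies only if its worst condition still passes (MIN = 1).
satisfied = conn.execute("""
    SELECT m.comment_text
    FROM conditions AS c
    JOIN comments  AS m ON m.comment_id  = c.comment_id
    JOIN questions AS q ON q.question_id = c.question_id
    GROUP BY m.comment_text
    HAVING MIN(CASE WHEN q.question_score
                         BETWEEN c.condition_lower_score
                             AND c.condition_higher_score
                    THEN 1 ELSE 0 END) = 1
""").fetchall()

print(satisfied)  # [('text1',)]
```

'text2' drops out because question 102 has a score of 1, which lies outside the 5 to 6 range of condition 12.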

distinct rows with group by

I have one table:
id_object | version | document
------------------------------
1 | 1 | 1
1 | 2 | 2
2 | 1 | 3
2 | 2 | 1
2 | 3 | 2
1 | 1 | 3
I want to show only one row per object, with the max version and its document. I have tried the following:
Select Distinct
id_object ,
Max(version),
document
From
prods
Group By
id_object, document
and I get this result
1 | 1 | 1
1 | 2 | 2
2 | 1 | 3
2 | 2 | 1
2 | 3 | 2
1 | 1 | 3
As you can see, I'm getting the entire table. My question is, why?
Since you group by id_object and document, you won't get your desired result. That is because document is different for each version.
select x.id_object,
x.maxversion as version,
p.document
from
(
Select id_object, Max(version) as maxversion
From prods
Group By id_object
) x
inner join prods p on p.id_object = x.id_object
and p.version = x.maxversion
You first have to select each id_object with its max(version). That can then be joined with the actual data to get the correct document.
You have to do that because you can't select columns that are not in your group by clause unless you use an aggregate function on them (like max(), for instance).
(MySQL can select non-aggregated columns, but please avoid that, since the outcome is not always clear or even predictable.)
select prods.id_object, version, document
from prods inner join
(select id_object, max(version) as ver
from prods
group by id_object) tmp on prods.id_object = tmp.id_object and prods.version = tmp.ver
Query:
SELECT p.id_object,
p.version,
p.document
FROM prods p
WHERE p.version = (SELECT Max(version)
FROM prods
WHERE id_object = p.id_object)
Result:
| ID_OBJECT | VERSION | DOCUMENT |
----------------------------------
| 1 | 2 | 2 |
| 2 | 3 | 2 |
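To reproduce both the problem and the fix, the sketch below loads the question's rows into an in-memory SQLite database from Python. SQLite is only a stand-in for your actual engine; the queries follow the ones above, without the redundant DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prods (id_object INTEGER, version INTEGER, document INTEGER)")
conn.executemany(
    "INSERT INTO prods VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 2), (2, 1, 3), (2, 2, 1), (2, 3, 2), (1, 1, 3)],
)

# Grouping by (id_object, document): every pair is unique in this data,
# so each row forms its own group and the whole table comes back.
grouped = conn.execute("""
    SELECT id_object, MAX(version), document
    FROM prods
    GROUP BY id_object, document
""").fetchall()

# Find the max version per object first, then join back for the document.
latest = conn.execute("""
    SELECT p.id_object, p.version, p.document
    FROM prods AS p
    JOIN (SELECT id_object, MAX(version) AS max_version
          FROM prods
          GROUP BY id_object) AS x
      ON x.id_object = p.id_object AND x.max_version = p.version
    ORDER BY p.id_object
""").fetchall()

print(len(grouped))  # 6
print(latest)        # [(1, 2, 2), (2, 3, 2)]
```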