Need only one match from a Left Join - sql

Hello thanks for taking the time to read this.
I have a table TBL_INCIDENT containing columns TXT_INC_ID and TXT_SERVICE, and I am using a LEFT JOIN to join it to table TBL_ASMS, which contains TXT_APPLICATION_ID. The issue is that the join can produce multiple matches and I only want the first one. I saw an example using LIMIT 1, but I am unsure how it is supposed to be used syntactically. I have also seen some solutions for duplicates using row_number() over (partition by ...), but I could only find delete statements, not a select.
This is my current state:
SELECT COUNT(A.TXT_INC_ID)
FROM(
SELECT A.TXT_INC_ID, B.APPLICATION_ID
FROM TBL_INCIDENT A
LEFT JOIN TBL_ASMS B ON A.TXT_SERVICE LIKE ('%' || B.APPLICATION_ID || '%') LIMIT 1
)
TXT_INC_ID is the primary key in the table it comes from, and I only want one match per record to be returned by the left join. I am using left because I need every record in table A returned, but only once.
Thanks

Maybe use a Max?
SELECT COUNT(T.TXT_INC_ID)
FROM (
SELECT A.TXT_INC_ID, MAX(B.APPLICATION_ID) AS APPLICATION_ID
FROM TBL_INCIDENT A
LEFT JOIN TBL_ASMS B ON A.TXT_SERVICE LIKE ('%' || B.APPLICATION_ID || '%')
GROUP BY A.TXT_INC_ID
) T -- the outer query can only see the derived table's alias, not A or B

Your result is logically the same as:
SELECT COUNT(A.TXT_INC_ID)
FROM TBL_INCIDENT A
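If you need the matched value itself and not just the count, the row_number() approach mentioned in the question also works in a plain SELECT. This is only a sketch: ordering by B.APPLICATION_ID is an assumption about what "first" means, and the names RANKED and RN are made up for illustration.

```sql
-- Keep one row per incident: number the matches within each TXT_INC_ID
-- and keep only the first one.
SELECT TXT_INC_ID, APPLICATION_ID
FROM (
    SELECT A.TXT_INC_ID,
           B.APPLICATION_ID,
           ROW_NUMBER() OVER (
               PARTITION BY A.TXT_INC_ID
               ORDER BY B.APPLICATION_ID   -- assumption: "first" = lowest ID
           ) AS RN
    FROM TBL_INCIDENT A
    LEFT JOIN TBL_ASMS B
      ON A.TXT_SERVICE LIKE ('%' || B.APPLICATION_ID || '%')
) RANKED
WHERE RN = 1;
```

Incidents with no match still come back once, because the LEFT JOIN produces a single row with a NULL APPLICATION_ID and that row gets RN = 1.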

SQL Join with partial string match

I have a table 'TableA' like:
ID  Group  Type
1   AB     SomeValue
2   BC     SomeValue
Another table 'TableB' like:
Product  Subgroup       Type
A        XX-AB-XX-text  SomeValue
B        XX-BC-XY-text  SomeValue
I am using INNER JOIN between two tables like:
SELECT DISTINCT ID
FROM TableA TA
INNER JOIN TableB TB
ON TA.Type=TB.Type
I want to add another condition for join, which looks for value of 'Group' in 'Subgroup' and only joins if the 'Group' Value matches after 'XX-' and before '-XX'.
In other words, join only if Group 'AB' shows up at the correct place in Subgroup column.
How can I achieve this? I am using MSSQL
Try this:
SELECT DISTINCT TA.ID
FROM TableA TA
INNER JOIN TableB TB
ON TA.Type=TB.Type AND TB.SubGroup LIKE '__-' + TA.[Group] + '%'
You can use LIKE with CONCAT to build the expression, see this fiddle:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=d7bb4781488e53c31abce0f38d0aaef4
SELECT *
FROM TableA TA
JOIN TableB TB on (
TA.types = TB.types AND
TB.subgroup LIKE concat('XX-', TA.groups, '-XX%')
);
I am dumping all the returned data so you can see it, but you would want to modify that to return only DISTINCT TA.id.
It is possible to build the expression using + instead of CONCAT:
'XX-' + TA.groups + '-XX%'
Also, I would very strongly warn you against using column names that are reserved words. Both GROUP and TYPE are in use by the SQL engine, so to use them you would have to escape them every time, and they might also cause confusion to anybody reading or maintaining your code.
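If you do keep the original column names, SQL Server lets you escape identifiers with square brackets. A minimal sketch, assuming the tables from the question with their Group, Type and Subgroup columns left unchanged:

```sql
-- Square brackets escape identifiers in SQL Server; GROUP is a reserved
-- keyword and must be escaped, Type is bracketed here for consistency.
SELECT DISTINCT TA.ID
FROM TableA TA
INNER JOIN TableB TB
    ON TA.[Type] = TB.[Type]
   AND TB.Subgroup LIKE 'XX-' + TA.[Group] + '-XX%';
```

Renaming the columns, as done in the fiddle, avoids having to remember the brackets in every query.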

How to search if a row is a substring of another row of the same column in Oracle

I have a table that contains millions of rows for names of customers as a column. I want to find if a part of a name exists within another row in the same column.
E.g. If a row has value 'Roger Federer' and there are other rows with values, 'Roger' and 'Federer', I want the corresponding primary keys of all the three rows.
You can leverage the use of REGEXP_LIKE
SELECT *
FROM customers
WHERE REGEXP_LIKE (cust_name, 'roger|federer','i')
Another option would be the use of OR
SELECT *
FROM customers
WHERE LOWER(cust_name) LIKE LOWER('%roger%')
OR LOWER(cust_name) LIKE LOWER('%federer%')
Edit
With a self-join, the search string is dynamic: each name is compared against the names in the other rows instead of against hard-coded values. If proper indexes are in place, the performance impact should be limited.
SELECT DISTINCT
c1.*
FROM
customers c1
JOIN
customers c2
ON ( LOWER(c1.cust_name) LIKE LOWER(c2.cust_name || '%')
AND c1.cust_id != c2.cust_id)
Edit 2
Perhaps something like the below
SELECT DISTINCT
c1.cust_id,
c1.cust_name,
CASE
WHEN
LOWER(c1.cust_name) LIKE LOWER(c2.cust_name || '%')
THEN
'Matched'
ELSE
'Unmatched'
END
ident
FROM
customers c1
JOIN
customers c2
ON ( LOWER(c1.cust_name) LIKE LOWER(c2.cust_name || '%')
AND c1.cust_id != c2.cust_id)
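The joins above only match on a prefix and only return the longer name. If the goal is the primary keys of all rows involved (the full name and its parts), a symmetric check with EXISTS might be closer to what was asked. This is a sketch assuming the customers(cust_id, cust_name) columns used above; it has not been tuned for millions of rows.

```sql
-- Return every customer whose name contains, or is contained in,
-- the name of some other customer (case-insensitive).
SELECT c1.cust_id
FROM customers c1
WHERE EXISTS (
    SELECT 1
    FROM customers c2
    WHERE c2.cust_id <> c1.cust_id
      AND (   LOWER(c1.cust_name) LIKE '%' || LOWER(c2.cust_name) || '%'
           OR LOWER(c2.cust_name) LIKE '%' || LOWER(c1.cust_name) || '%')
);
```

For 'Roger Federer', 'Roger' and 'Federer' this returns all three keys. Note that any % or _ characters inside a name would be treated as wildcards by LIKE.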
If you want to build row-based logic, UNION may suit well. Also, in string comparisons it is better to normalize both sides with the UPPER or LOWER function so that matching is case-insensitive:
select id from customers where lower(name) like '%roger%' union all
select id from customers where lower(name) like '%federer%';
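There is no need to search for the complete name (e.g. Roger Federer) separately, since it is already matched by both patterns. Note that UNION ALL returns such a row twice; use UNION if each id should appear only once.
Edit:
An alternative method may be the following: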
select distinct id
from (select lower(regexp_substr('&str', '[^[:space:]-]+', 1, 1)) frst,
lower(regexp_substr('&str', '[^[:space:]-]+', 1, 2)) lst,
lower('&str') nm
from customers) c1
cross join customers c2
where c1.frst like '%' || lower(c2.name) || '%'
or c1.lst like '%' || lower(c2.name) || '%'
or c1.nm like '%' || lower(c2.name) || '%';
This adds a search string ('&str') to make the query more dynamic.
(When prompted, enter Roger Federer for the str substitution variable.)
I think you can join the same table twice (a self join) to get the output with the query below:
select a.*, b.*
from tab1 a
, tab1 b
where ( a.fname like b.fname||'%' or a.lname like b.lname||'%')
and a.id <> b.id

SQL NOT LIKE Wildcard Condition on Inner Join

I have a table called hr_grades that contains employee pay grades such as:-
ID  hr_grade
1   PR07
2   AC04
I run two stored procedures. One that returns employees whose grades are in this table, and one that does not. These stored procedures carry out a number of different tasks to prepare the data for loading into the end system.
I want the query to carry out a wildcard search on the rows in the grades table. So for example for employees whose grades are in the table
SELECT DISTINCT
Employee_ID
FROM
#tmp_vo_hr_acp_staff v
INNER JOIN
hr_grades g ON v.hr_grade LIKE g.HR_Grade + '%' -- wildcard search
The reason for the wildcard is that the hr_grades can be like AC04(1) , AC04(2) etc.
This works fine. However I am struggling to get the reverse of this working.
SELECT DISTINCT
Employee_ID
FROM
#tmp_vo_hr_acp_staff v
INNER JOIN
hr_grades g ON v.hr_grade NOT LIKE g.HR_Grade + '%'
Any ideas how I could get this wildcard search to work with a NOT LIKE condition?
Change it to
ON NOT (v.hr_grade LIKE g.HR_Grade + '%' )
EDIT:
Removed the ON inside the brackets.
As almost always in SQL, there are several ways to do it:
-- 1. using an outer join and keeping the rows where the outer table doesn't match
SELECT DISTINCT Employee_ID
FROM #tmp_vo_hr_acp_staff v
LEFT JOIN hr_grades g ON (v.hr_grade LIKE g.HR_Grade + '%') -- wildcard search
WHERE g.id IS NULL -- use any non nullable column from hr_grades here
-- 2. using EXISTS for set difference
SELECT DISTINCT Employee_ID
FROM #tmp_vo_hr_acp_staff v
WHERE NOT EXISTS (
SELECT 'x' FROM hr_grades g WHERE v.hr_grade LIKE g.HR_Grade + '%'
)
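-- 3. using the less popular ANY operator for set difference
--    (engine-dependent: SQL Server accepts ANY only with comparison operators such as = or <>, not with LIKE)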
SELECT DISTINCT Employee_ID
FROM #tmp_vo_hr_acp_staff v
WHERE NOT v.hr_grade LIKE ANY (
SELECT g.HR_Grade + '%' FROM hr_grades g
)
Personally, I don't like using joins for filtering, so I would probably use option 2. If hr_grades is much smaller than #tmp_vo_hr_acp_staff, though, you can benefit from using option 3 (as the set can be determined beforehand and then cached with a single read operation).
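To see why the plain NOT LIKE join does not work, here is a small self-contained sketch for SQL Server. The temp tables #grades and #staff and their rows are hypothetical sample data, not the real tables from the question.

```sql
-- Hypothetical sample data
CREATE TABLE #grades (id INT NOT NULL, hr_grade VARCHAR(20) NOT NULL);
CREATE TABLE #staff  (Employee_ID INT NOT NULL, hr_grade VARCHAR(20) NOT NULL);

INSERT INTO #grades VALUES (1, 'PR07'), (2, 'AC04');
INSERT INTO #staff  VALUES (10, 'AC04(1)'), (11, 'PR07'), (12, 'ZZ99');

-- Inner join on NOT LIKE: every employee fails to match at least one
-- grade row, so every employee is returned (10, 11 and 12).
SELECT DISTINCT v.Employee_ID
FROM #staff v
INNER JOIN #grades g ON v.hr_grade NOT LIKE g.hr_grade + '%';

-- NOT EXISTS: only employees that match no grade row at all (12).
SELECT v.Employee_ID
FROM #staff v
WHERE NOT EXISTS (
    SELECT 1 FROM #grades g WHERE v.hr_grade LIKE g.hr_grade + '%'
);

DROP TABLE #staff;
DROP TABLE #grades;
```

The join condition is evaluated per pair of rows, so "not like this one grade" is true for almost everyone. The anti-join forms ask "not like any grade", which is what the second stored procedure needs.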

Joining a derived table based on specific data from the outside query

I am trying to join one record in a table to another using a derived table and am having a bit of trouble figuring out the correct query to do so. What I want to do is have a JOIN of a derived table to a query where the derived table uses where statements depending on data from the outer query that is being joined to. So here is the current code that I am working on:
SELECT a.viewerid, a.id, v.id AS entry, a.jobid, v.sourceid, v.cost, a.applicant
FROM a_views a,
JOIN (
SELECT TOP 1 id, sourceid, cost FROM a_views vt
WHERE vt.viewerid = a.viewerid
AND vt.viewed_at <= a.viewed_at
AND vt.referrer NOT LIKE '%' + vt.hostName + '%'
ORDER BY viewed_at DESC
) v
The derived table is a query of the same table that the outer query uses, and viewerid is a FK to itself across the table where id is a unique auto-incrementing PK. I need to get the latest record in the a_views table where the viewer id's match, the datestamp (viewed_at) is less than the outer datestamp and the referrer column doesn't contain the hostName column.
Sounds like you need APPLY:
SELECT a.viewerid, a.id, v.id AS entry, a.jobid, v.sourceid, v.cost, a.applicant
FROM a_views a
CROSS APPLY (
SELECT TOP 1 id, sourceid, cost FROM a_views vt
WHERE vt.viewerid = a.viewerid
AND vt.viewed_at <= a.viewed_at
AND vt.referrer NOT LIKE '%' + vt.hostName + '%'
ORDER BY viewed_at DESC
) v
Since your query has JOIN I've gone for CROSS APPLY, but you may need OUTER APPLY depending on your exact requirements.
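For comparison, this is a sketch of the OUTER APPLY variant. It is the same query with only the APPLY type changed: rows of a_views that have no qualifying earlier record are kept, with NULL in entry, sourceid and cost, instead of being dropped.

```sql
SELECT a.viewerid, a.id, v.id AS entry, a.jobid, v.sourceid, v.cost, a.applicant
FROM a_views a
OUTER APPLY (
    -- latest qualifying record for the same viewer, if any
    SELECT TOP 1 id, sourceid, cost
    FROM a_views vt
    WHERE vt.viewerid = a.viewerid
      AND vt.viewed_at <= a.viewed_at
      AND vt.referrer NOT LIKE '%' + vt.hostName + '%'
    ORDER BY viewed_at DESC
) v;
```

CROSS APPLY behaves like an inner join against the per-row subquery, OUTER APPLY like a left join.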

Too many results from query

I'm trying to both understand the following query,
SELECT s.LAST_NAME||', '||s.FIRST_NAME||' '||COALESCE(s.MIDDLE_NAME,' ') AS FULL_NAME,
s.LAST_NAME,
s.FIRST_NAME,
s.MIDDLE_NAME,
s.STUDENT_ID,
ssm.SCHOOL_ID,
ssm.SCHOOL_ID AS LIST_SCHOOL_ID,
ssm.GRADE_ID ,
sg1.BENCHMARK_ID,
sg1.GRADE_TITLE,
sg1.COMMENT AS COMMENT_TITLE,
ssm.STUDENT_ID,
sg1.MARKING_PERIOD_ID,
sg1.LONGER_COURSE_COMMENTS,
sp.SORT_ORDER,
sched.COURSE_PERIOD_ID
FROM STUDENTS s,
STUDENT_ENROLLMENT ssm ,
SCHEDULE sched
LEFT OUTER JOIN STUDENT_REPORT_CARD_BENCHMARKS sg1 ON (
sg1.STUDENT_ID=sched.STUDENT_ID
AND sched.COURSE_PERIOD_ID=sg1.COURSE_PERIOD_ID
AND sg1.MARKING_PERIOD_ID IN ('0','442','445','450')
AND sg1.SYEAR=sched.SYEAR)
LEFT OUTER JOIN COURSE_PERIODS rc_cp ON (
rc_cp.COURSE_PERIOD_ID=sg1.COURSE_PERIOD_ID
AND rc_cp.DOES_GRADES='Y')
LEFT OUTER JOIN SCHOOL_PERIODS sp ON (sp.PERIOD_ID=rc_cp.PERIOD_ID)
WHERE ssm.STUDENT_ID=s.STUDENT_ID
AND ssm.SCHOOL_ID='1'
AND ssm.SYEAR='2010'
AND ('22-APR-11' BETWEEN ssm.START_DATE AND ssm.END_DATE OR (ssm.END_DATE IS NULL))
AND (LOWER(s.LAST_NAME) LIKE 'la''porsha%' OR LOWER(s.FIRST_NAME) LIKE 'la''porsha%' )
AND sched.STUDENT_ID=ssm.STUDENT_ID AND sched.MARKING_PERIOD_ID IN ('0','444','446','447','445','448','450','443','449')
AND ('22-APR-11' BETWEEN sched.START_DATE AND sched.END_DATE OR (sched.END_DATE IS NULL AND '22-APR-11'>=sched.START_DATE))
ORDER BY s.LAST_NAME,s.FIRST_NAME
and modify it to return the correct results - to only return one distinct person. When any particular person is searched for, multiple results are returned because there are unique values returned from schedule.course_period_id. As there are several left outer joins on the course_period_id field but across different tables, I'm confused as to where to modify the query.
My attempt to help people answer by formatting your query and getting rid of the mixed syntax. Not really an answer but too long for a comment:
SELECT s.LAST_NAME || ', ' || s.FIRST_NAME || ' ' || COALESCE(s.MIDDLE_NAME,' ')
AS FULL_NAME,
s.LAST_NAME, s.FIRST_NAME, s.MIDDLE_NAME, s.STUDENT_ID,
ssm.SCHOOL_ID, ssm.SCHOOL_ID AS LIST_SCHOOL_ID, ssm.GRADE_ID ,
sg1.BENCHMARK_ID, sg1.GRADE_TITLE, sg1.COMMENT AS COMMENT_TITLE,
ssm.STUDENT_ID, sg1.MARKING_PERIOD_ID, sg1.LONGER_COURSE_COMMENTS,
sp.SORT_ORDER, sched.COURSE_PERIOD_ID
FROM STUDENTS s
INNER JOIN STUDENT_ENROLLMENT ssm
ON ssm.STUDENT_ID=s.STUDENT_ID -- moved from WHERE to here
INNER JOIN SCHEDULE sched
ON sched.STUDENT_ID=ssm.STUDENT_ID -- moved from WHERE to here
LEFT OUTER JOIN STUDENT_REPORT_CARD_BENCHMARKS sg1
ON ( sg1.STUDENT_ID=sched.STUDENT_ID
AND sched.COURSE_PERIOD_ID=sg1.COURSE_PERIOD_ID
AND sg1.MARKING_PERIOD_ID IN ('0','442','445','450')
AND sg1.SYEAR=sched.SYEAR)
LEFT OUTER JOIN COURSE_PERIODS rc_cp
ON ( rc_cp.COURSE_PERIOD_ID=sg1.COURSE_PERIOD_ID
AND rc_cp.DOES_GRADES='Y')
LEFT OUTER JOIN SCHOOL_PERIODS sp
ON (sp.PERIOD_ID=rc_cp.PERIOD_ID)
WHERE ssm.SCHOOL_ID='1'
AND ssm.SYEAR='2010'
AND ('22-APR-11' BETWEEN ssm.START_DATE AND ssm.END_DATE
OR (ssm.END_DATE IS NULL))
AND ( LOWER(s.LAST_NAME) LIKE 'la''porsha%'
OR LOWER(s.FIRST_NAME) LIKE 'la''porsha%' )
AND sched.MARKING_PERIOD_ID
IN ('0','444','446','447','445','448','450','443','449')
AND ( '22-APR-11' BETWEEN sched.START_DATE AND sched.END_DATE
OR ( sched.END_DATE IS NULL
AND '22-APR-11' >= sched.START_DATE))
ORDER BY s.LAST_NAME, s.FIRST_NAME
Hope it helps.
Well, of course you get multiple records if the joined child tables have multiple records for the same person. That is expected and correct behavior.
If you only want one record per person, then you must modify the query to tell it which of the multiple child records to choose. But why wouldn't you want to see all the scheduled courses for the person, instead of only one?
If you must, you could use GROUP BY and then put an aggregate (like MIN or MAX) on the fields that are causing the multiple records. However, you would still need to decide whether you want the first period's records or the last period's records: out of six records for the person, which one do you want to see?
Look up the GROUP BY clause.
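As a rough illustration of the GROUP BY idea, here is a cut-down sketch of the query that returns one row per student. Picking MIN(sched.COURSE_PERIOD_ID) is purely an assumption for the example, and the alias FIRST_COURSE_PERIOD_ID is made up; which record to keep is a decision only you can make.

```sql
-- One row per student: group on the student columns and collapse the
-- column that varies per schedule row with an aggregate.
SELECT s.STUDENT_ID,
       s.LAST_NAME,
       s.FIRST_NAME,
       MIN(sched.COURSE_PERIOD_ID) AS FIRST_COURSE_PERIOD_ID  -- arbitrary choice
FROM STUDENTS s
INNER JOIN STUDENT_ENROLLMENT ssm
    ON ssm.STUDENT_ID = s.STUDENT_ID
INNER JOIN SCHEDULE sched
    ON sched.STUDENT_ID = ssm.STUDENT_ID
WHERE ssm.SCHOOL_ID = '1'
  AND ssm.SYEAR = '2010'
GROUP BY s.STUDENT_ID, s.LAST_NAME, s.FIRST_NAME
ORDER BY s.LAST_NAME, s.FIRST_NAME;
```

Every selected column that is not wrapped in an aggregate has to appear in the GROUP BY list, which is why the benchmark and period columns from the full query would each need their own MIN or MAX.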