How do I limit the number of rows returned by this LEFT JOIN to one? - sql

I think I've seen solutions to this, but they are all very complicated queries. I'm on Oracle 11g, for reference.
What I have is a simple one-to-many join, which works great, but I don't need the many. I just want each row of the left table (the one) to join to any single row that meets the join criteria, not to many rows.
I need this because the query feeds a rollup that COUNTs, so with a normal left join I get 5 rows where I should only be getting 1.
Example data is as follows:
TABLE 1:
-------------
TICKET_ID  ASSIGNMENT
5          team1
6          team2
TABLE 2:
-------------
MANAGER_NAME  ASSIGNMENT_GROUP  USER
joe           team1             sally
joe           team1             stephen
joe           team1             louis
harry         team2             ted
harry         team2             thelma
What I need to do is join these two tables on ASSIGNMENT = ASSIGNMENT_GROUP but have only 1 row returned per ticket.
When I do a left join I get three rows returned for ticket 5, because that is the nature of the left join.

Oracle supports ROW_NUMBER() with PARTITION BY, so you can create a subquery and keep only the rows where the row number equals 1.
SELECT *
FROM table1 t1
LEFT JOIN
  (SELECT *
   FROM (SELECT t2.*,   -- Oracle needs the alias: a bare * cannot be followed by more columns
                ROW_NUMBER()
                  OVER (PARTITION BY assignment_group ORDER BY assignment_group) AS seq
         FROM table2 t2) a
   WHERE seq = 1) v
ON t1.assignment = v.assignment_group

You could do something like this.
SELECT t1.ticket_id,
t1.assignment,
t2.manager_name,
t2.user
FROM table1 t1
LEFT OUTER JOIN (SELECT manager_name,
assignment_group,
user,
row_number() over (partition by assignment_group
--order by <<something>>
) rnk
FROM table2) t2
ON ( t1.assignment = t2.assignment_group
AND t2.rnk = 1 )
This partitions the data in table2 by assignment_group and then arbitrarily ranks them to pull one arbitrary row per assignment_group. If you care which row is returned (or if you want to make the row returned deterministic) you could add an ORDER BY clause to the analytic function.
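To see the effect on the sample data from the question, here is a minimal sketch of the ROW_NUMBER approach. It runs against an in-memory SQLite database through Python's sqlite3 module rather than Oracle, which is an assumption made only so the example is self-contained (it needs SQLite 3.25 or later for window functions). The USER column is renamed to user_name here, and the ORDER BY user_name is my own choice to make the picked row deterministic.

```python
import sqlite3

# In-memory SQLite stands in for Oracle (assumption for this sketch only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ticket_id INTEGER, assignment TEXT);
    CREATE TABLE table2 (manager_name TEXT, assignment_group TEXT, user_name TEXT);
    INSERT INTO table1 VALUES (5, 'team1'), (6, 'team2');
    INSERT INTO table2 VALUES
        ('joe', 'team1', 'sally'), ('joe', 'team1', 'stephen'), ('joe', 'team1', 'louis'),
        ('harry', 'team2', 'ted'), ('harry', 'team2', 'thelma');
""")

# Number the table2 rows inside each assignment_group, then join only row 1.
one_per_group = conn.execute("""
    SELECT t1.ticket_id, t1.assignment, t2.manager_name, t2.user_name
    FROM table1 t1
    LEFT JOIN (SELECT manager_name, assignment_group, user_name,
                      ROW_NUMBER() OVER (PARTITION BY assignment_group
                                         ORDER BY user_name) AS rnk
               FROM table2) t2
      ON t1.assignment = t2.assignment_group AND t2.rnk = 1
    ORDER BY t1.ticket_id
""").fetchall()

for row in one_per_group:
    print(row)
```

Each ticket comes back exactly once, so a COUNT over the result is no longer inflated by the number of users in the group.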

I think what you need is to use GROUP BY on the ASSIGNMENT_GROUP field.
http://www.w3schools.com/sql/sql_groupby.asp

In MySQL you could just GROUP BY ASSIGNMENT and be done. Oracle is stricter and refuses to pick (in an undefined way) which of the three rows' values to return. That means every returned column needs to be part of the GROUP BY or be wrapped in an aggregate function (COUNT, MIN, MAX...).
You can of course decide that you don't care and just apply some aggregate function to the returned columns.
select TICKET_ID, ASSIGNMENT, MAX(MANAGER_NAME), MAX(USER)
from T1
left join T2 on T1.ASSIGNMENT=T2.ASSIGNMENT_GROUP
group by TICKET_ID, ASSIGNMENT
If you do that I would seriously doubt that you need the JOIN in the first place.
MySQL could also help with GROUP_CONCAT if you want a string concatenation of the group values for a column (humans often like that). Oracle has no GROUP_CONCAT; LISTAGG covers this from 11g Release 2 onward, and before that it is considerably more complex.
Using a subquery as already suggested is an option, look here for an example. It also allows you to sort the subquery before selecting the top row.
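As a rough illustration of the aggregate approach, this sketch collapses the joined rows with GROUP BY and MAX on the question's sample data. It uses SQLite through Python's sqlite3 module in place of Oracle (an assumption, purely to keep it runnable), and user_name is a stand-in for the USER column.

```python
import sqlite3

# In-memory SQLite stands in for Oracle (assumption for this sketch only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (ticket_id INTEGER, assignment TEXT);
    CREATE TABLE t2 (manager_name TEXT, assignment_group TEXT, user_name TEXT);
    INSERT INTO t1 VALUES (5, 'team1'), (6, 'team2');
    INSERT INTO t2 VALUES
        ('joe', 'team1', 'sally'), ('joe', 'team1', 'stephen'), ('joe', 'team1', 'louis'),
        ('harry', 'team2', 'ted'), ('harry', 'team2', 'thelma');
""")

# Every selected column is either in the GROUP BY or wrapped in an aggregate.
collapsed = conn.execute("""
    SELECT t1.ticket_id, t1.assignment, MAX(t2.manager_name), MAX(t2.user_name)
    FROM t1
    LEFT JOIN t2 ON t1.assignment = t2.assignment_group
    GROUP BY t1.ticket_id, t1.assignment
    ORDER BY t1.ticket_id
""").fetchall()

for row in collapsed:
    print(row)
```

Note that each MAX is computed independently, so with less uniform data the values in one output row could come from different table2 rows.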

In Oracle, if you want 1 result, you can use the ROWNUM pseudocolumn to get the first N rows of a query, e.g.:
SELECT *
FROM TABLEX
WHERE
ROWNUM = 1 --gets the first value of the result
The problem with this query on its own is that Oracle does not guarantee the order of rows without an ORDER BY. So you must order your data before applying ROWNUM:
SELECT *
FROM
(SELECT * FROM TABLEX ORDER BY COL1)
WHERE
ROWNUM = 1
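For your case you need 1 matching row per ticket. An inline view in the FROM clause cannot reference T1 in Oracle 11g, and ROWNUM = 1 there would keep a single row overall, so put the ROWNUM filter in a correlated scalar subquery instead:
SELECT T1.*,
       (SELECT T2.MANAGER_NAME
        FROM TABLE2 T2
        WHERE T2.ASSIGNMENT_GROUP = T1.ASSIGNMENT
        AND ROWNUM = 1) AS MANAGER_NAME  -- any one manager for this ticket's group
FROM TABLE1 T1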

You can use a subquery that selects only the top 1 row (Oracle 11g has no TOP keyword, so use ROWNUM or ROW_NUMBER() there).

Related

SQL Find most rows that match between two tables

I am using SQL Server 2012. I have two tables like the following:
Table1 and Table2 both have many groups, indicated by the group column. The name of a group may match in both tables, but it may not. What is important is finding the group in Table2 that has the most members matching the members of a group in Table1.
I first tried doing this with a VLOOKUP, but the problem is that VLOOKUP pulls the first entry in the Group column that has a match, not the group with the most matches. Below, VLOOKUP would pull BBB, but the correct result is CCC.
Ties may occur. There might be more than one group in Table2 that matches Table1 with the same number of members, so the best thing may be to count the number of matches. But there are thousands of groups, so it's not ideal to sort and sift through a column of counts. I need something like a CASE statement where, if there is a MAX(Match), Table1 shows the group name with MAX(Match) in a derived column BestMatch. It would be most ideal if the column could display all the groups in Table2 that have MAX(Match), which may be one or more. Perhaps it could be comma separated.
If not, the column could just say "tie" and I could look the tie up myself. If that is the best option, it would be ideal for the word "tie" to repeat beside every member that matches, so I know which groups to look at, which accounts matched, and how many matched.
We really could do with some expected output to help clarify the question.
If I understand you correctly however, this query will get you close to the results you require:
;with cte as
( SELECT t1a.[group] AS Group1
, t2a.[Group] AS Group2
, RANK() OVER(PARTITION BY t1a.[group]
ORDER BY COUNT(t2a.[Group]) DESC) AS MatchRank
FROM Table1 t1a
JOIN Table2 t2a
ON t1a.member = t2a.member
GROUP BY t1a.[group], t2a.[GRoup])
SELECT *
FROM cte
WHERE MatchRank=1
The query doesn't identify ties, but it will display any tied results...
If you are a newbie to common table expressions(the ;with statement) there is a useful description here.
select *
from Table1 t1
outer apply
(
select top 1 t2.[Group]
from Table2 t2
where t2.Member = t1.Member
group by t2.[Group]
order by count(*) desc
) m
It may not be the most elegant solution but I think it could do the work:
select *
from
(select t1.[group] as t1group, t1.member, t2.[group] as t2group
from Table1 t1 inner join Table2 t2 on t1.member = t2.member)a
where member = (select max(t1.member)
from Table1 t1 inner join Table2 t2 on t1.member = t2.member)
In case of 2 rows from Table2 matching the maximum members in Table1, both results would be displayed
PS: an example of your desired results would have been helpful
Count member matches per group pair and rank them so the group pairs with the highest match count get rank #1. Once you found these, you can select the related records from table1 and table2.
select t1.grp, t1.member, t2.grp
from t1
join
(
select
t1.grp as grp1,
t2.grp as grp2,
rank() over (order by count(*) desc) as rnk
from t1
join t2 on t2.member = t1.member
group by t1.grp, t2.grp
) grps on grps.rnk = 1 and grps.grp1 = t1.grp
left join t2 on t2.grp = grps.grp2 and t2.member = t1.member
order by t1.grp, t1.member, t2.grp;
This gives you ties in separate rows, e.g. for AAA having four different members (123,456,789,555) with two matches both in CCC and DDD:
grp1 member grp2
AAA 123 CCC
AAA 123 DDD
AAA 456 CCC
AAA 789
AAA 555 DDD
If you want one row per grp1 and member with all matching grp2 in a string then you need some clumsy STUFF trick in SQL Server as far as I am aware. Look up "GROUP_CONCAT in SQL Server" to find the technique needed.
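Here is a small sketch of the count-and-rank idea with a tie, using data shaped like the AAA example above. It runs on SQLite via Python's sqlite3 module instead of SQL Server (an assumption made for runnability, requiring SQLite 3.25 or later), and the table and column names t1, t2, grp and member are illustrative. The counts are computed in one CTE and ranked in a second one.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption for this sketch only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (grp TEXT, member INTEGER);
    CREATE TABLE t2 (grp TEXT, member INTEGER);
    INSERT INTO t1 VALUES ('AAA', 123), ('AAA', 456), ('AAA', 789), ('AAA', 555);
    INSERT INTO t2 VALUES ('BBB', 123), ('CCC', 123), ('CCC', 456),
                          ('DDD', 123), ('DDD', 555);
""")

# Count shared members per (t1 group, t2 group) pair, then rank the pairs
# within each t1 group. RANK gives tied pairs the same rank, so ties survive.
best_matches = conn.execute("""
    WITH pair_counts AS (
        SELECT t1.grp AS grp1, t2.grp AS grp2, COUNT(*) AS matches
        FROM t1
        JOIN t2 ON t2.member = t1.member
        GROUP BY t1.grp, t2.grp
    ),
    ranked AS (
        SELECT grp1, grp2, matches,
               RANK() OVER (PARTITION BY grp1 ORDER BY matches DESC) AS rnk
        FROM pair_counts
    )
    SELECT grp1, grp2, matches
    FROM ranked
    WHERE rnk = 1
    ORDER BY grp1, grp2
""").fetchall()

for row in best_matches:
    print(row)
```

BBB matches only one member and drops out, while CCC and DDD both match two and are both reported.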

Different way of writing this SQL query with partition

Hi I have the below query in Teradata. I have a row number partition and from that I want rows with rn=1. Teradata doesn't let me use the row number as a filter in the same query. I know that I can put the below into a subquery with a where rn=1 and it gives me what I need. But the below snippet needs to go into a larger query and I want to simplify it if possible.
Is there a different way of doing this so I get a table with 2 columns - one row per customer with the corresponding fc_id for the latest eff_to_dt?
select cust_grp_id, fc_id, row_number() over (partition by cust_grp_id order by eff_to_dt desc) as rn
from table1
Have you considered using the QUALIFY clause in your query?
SELECT cust_grp_id
, fc_id
FROM table1
QUALIFY ROW_NUMBER()
OVER (PARTITION BY cust_grp_id
ORDER BY eff_to_dt desc)
= 1;
Calculate MAX eff_to_dt for each cust_grp_id and then join result to main table.
SELECT T1.cust_grp_id,
T1.fc_id,
T1.eff_to_dt
FROM Table1 AS T1
JOIN
(SELECT cust_grp_id,
MAX(eff_to_dt) AS max_eff_to_dt
FROM Table1
GROUP BY cust_grp_id) AS T2 ON T2.cust_grp_id = T1.cust_grp_id
AND T2.max_eff_to_dt = T1.eff_to_dt
You can use a pair of JOINs to accomplish the same thing:
INNER JOIN My_Table T1 ON <some criteria>
LEFT OUTER JOIN My_Table T2 ON <some criteria> AND T2.eff_to_dt > T1.eff_to_dt
WHERE
T2.my_id IS NULL
You'll need to sort out the specific criteria for your larger query, but this is effectively JOINing all of the rows (T1), but then excluding any where a later row exists. In the WHERE clause you eliminate these by checking for a NULL value in a column that is NOT NULL (in this case I just assumed some ID value). The only way that would happen is if the LEFT OUTER JOIN on T2 failed to find a match - i.e. no rows later than the one that you want exist.
Also, whether or not the JOIN to T1 is LEFT OUTER or INNER is up to your specific requirements.
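The pair-of-JOINs idea can be sketched on a tiny table like this. It uses SQLite through Python's sqlite3 module rather than Teradata (an assumption, just to make it runnable), the sample rows are invented, and the dates are stored as ISO strings so that string comparison matches date order.

```python
import sqlite3

# In-memory SQLite stands in for Teradata (assumption for this sketch only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (cust_grp_id INTEGER NOT NULL, fc_id TEXT, eff_to_dt TEXT);
    INSERT INTO table1 VALUES
        (1, 'A', '2020-01-01'), (1, 'B', '2021-06-30'),
        (2, 'C', '2019-12-31'), (2, 'D', '2019-03-01');
""")

# Join each row to any later row of the same customer group. A row with no
# later partner gets NULLs from the outer join, and that is the latest row.
latest = conn.execute("""
    SELECT cur.cust_grp_id, cur.fc_id
    FROM table1 cur
    LEFT OUTER JOIN table1 later
      ON later.cust_grp_id = cur.cust_grp_id
     AND later.eff_to_dt > cur.eff_to_dt
    WHERE later.cust_grp_id IS NULL
    ORDER BY cur.cust_grp_id
""").fetchall()

for row in latest:
    print(row)
```

If two rows of one group share the same latest eff_to_dt, both come back, so this form does not break ties the way ROW_NUMBER does.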

How to compare tables and find duplicates and also find columns with different value

I have the following tables in Oracle 10g:
Table1
Name Status
a closed
b live
c live
Table2
Name Status
a final
b live
c live
Neither table has a primary key, and I am trying to write a query which will return the identical rows without looping over both tables and comparing rows/columns. If the status column is different then the row in Table2 takes precedence.
So in the above example my query should return this:
Name Status
a final
b live
c live
Since you mentioned that neither table has a primary key, I'm assuming that a row may exist in Table1, in Table2, or in both. The query below uses common table expressions and a window function to get that result.
WITH unionTable
AS
(
SELECT Name, Status, 1 AS ordr FROM Table1
UNION
SELECT Name, Status, 2 AS ordr FROM Table2
),
ranks
AS
(
SELECT Name, Status,
ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY ordr DESC) rn
FROM unionTable
)
SELECT Name, Status
FROM ranks
WHERE rn = 1
SQLFiddle Demo
Something like this?
SELECT table1.Name, table2.Status
FROM table1
INNER JOIN table2 ON table1.Name = table2.Name
By always returning table2.Status you've covered both the case when they're the same and when they're different (essentially it doesn't matter what the value of table1.Status is).
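The two answers above behave differently when a name exists in only one table. This sketch runs the UNION plus ROW_NUMBER version on the question's data with one extra row, d, that exists only in Table1. The extra row is my own addition for illustration, and SQLite via Python's sqlite3 module stands in for Oracle (an assumption, requiring SQLite 3.25 or later).

```python
import sqlite3

# In-memory SQLite stands in for Oracle (assumption for this sketch only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT, status TEXT);
    CREATE TABLE table2 (name TEXT, status TEXT);
    -- 'd' is a hypothetical row that only exists in table1.
    INSERT INTO table1 VALUES ('a', 'closed'), ('b', 'live'), ('c', 'live'), ('d', 'draft');
    INSERT INTO table2 VALUES ('a', 'final'), ('b', 'live'), ('c', 'live');
""")

# Tag each source table, then keep the highest-tagged row per name so that
# table2 wins whenever both tables contain the name.
merged = conn.execute("""
    WITH union_table AS (
        SELECT name, status, 1 AS ordr FROM table1
        UNION ALL
        SELECT name, status, 2 AS ordr FROM table2
    ),
    ranks AS (
        SELECT name, status,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY ordr DESC) AS rn
        FROM union_table
    )
    SELECT name, status
    FROM ranks
    WHERE rn = 1
    ORDER BY name
""").fetchall()

for row in merged:
    print(row)
```

The INNER JOIN version would drop d entirely, which is fine only if every name is known to be present in both tables.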

How to write an SQL query to combine two already-ordered tables horizontally with different columns?

I have two tables which I would like to place side-by-side exactly as they are. For example,
tableOne tableTwo
columnOne | columnTwo | columnThree columnI | columnII | columnIII
The data in the two tables do not need to be related whatsoever -- the tables have the same row count -- and the data is already sorted in the two tables. Basically, I would like to do a full outer join on the two tables without an on operator.
How can I do this in a SQL query?
Well, you do want an ON operator - you just seem to want it to work automatically, which won't happen.
If you're saying Row 1 of tableOne maps to Row 1 of tableTwo, then you need to add a row column to each table and then join on it.
If you don't specify a join condition, you'll do a cross join that joins every row from tableOne to every row in tableTwo, which obviously isn't what you're looking for.
So do something like this:
select * from
(select *, row_number() over (order by 1) as RN from tableOne) a
inner join (select *, row_number() over (order by 1) as RN from tableTwo) b
on a.RN = b.RN
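A runnable sketch of the row-number join, with one caveat made explicit: a table has no inherent row order, so the numbering here is driven by a real column. The sort_key column and the sample values are hypothetical, and SQLite via Python's sqlite3 module is used only as a convenient engine (an assumption, requiring SQLite 3.25 or later).

```python
import sqlite3

# In-memory SQLite is used as a convenient engine (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_one (sort_key INTEGER, column_one TEXT);
    CREATE TABLE table_two (sort_key INTEGER, column_i INTEGER);
    -- Inserted out of order on purpose: insertion order carries no meaning.
    INSERT INTO table_one VALUES (3, 'c'), (1, 'a'), (2, 'b');
    INSERT INTO table_two VALUES (20, 200), (10, 100), (30, 300);
""")

# Number the rows of each table by its own sort column, then pair row N
# of one table with row N of the other.
side_by_side = conn.execute("""
    SELECT a.column_one, b.column_i
    FROM (SELECT column_one,
                 ROW_NUMBER() OVER (ORDER BY sort_key) AS rn
          FROM table_one) a
    INNER JOIN (SELECT column_i,
                       ROW_NUMBER() OVER (ORDER BY sort_key) AS rn
                FROM table_two) b
      ON a.rn = b.rn
    ORDER BY a.rn
""").fetchall()

for row in side_by_side:
    print(row)
```

Ordering by a constant leaves the pairing up to the engine, so an explicit sort column is the safer choice whenever one exists.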

SQL query to limit number of rows having distinct values

Is there a way in SQL to use a query that is equivalent to the following:
select * from table1, table2 where some_join_condition
and some_other_condition and count(distinct(table1.id)) < some_number;
Let us say table1 is an employee table. Then a join will cause data about a single employee to be spread across multiple rows. I want to limit the number of distinct employees returned to some number. A condition on row number or something similar will not be sufficient in this case.
So what is the best way to get the output intended by the above query?
select *
from (select * from employee where rownum < some_number and some_id_filter), table2
where some_join_condition and some_other_condition;
This should work on most DBs
SELECT *
FROM table1 t1
INNER JOIN table2 t2
ON some_join_condition
AND some_other_condition
INNER JOIN (
SELECT id
FROM table1
GROUP BY id
HAVING
count(id) > someNumber
) lim on t1.id = lim.id
Some DBs have special syntax to make this a little bit easier.
I may not have a full understanding of what you're trying to accomplish. But let's say you're trying to get down to 1 row per employee, each join is causing multiple rows per employee, and grouping by employee name and other fields is still not unique enough to get a single row. Then you can try ranking within a partition per employee and selecting the rank you prefer for each employee.
See example : http://msdn.microsoft.com/en-us/library/ms176102.aspx
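One way to read the question is "cap the number of distinct employees first, then join", which is what the rownum answer above does. Here is a sketch of that reading. The employee and project_assignment tables and their rows are hypothetical, and SQLite's LIMIT (via Python's sqlite3 module) stands in for Oracle's rownum, which is an assumption made to keep the example runnable.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER, name TEXT);
    CREATE TABLE project_assignment (employee_id INTEGER, project TEXT);
    INSERT INTO employee VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO project_assignment VALUES (1, 'x'), (1, 'y'), (2, 'z'), (3, 'w');
""")

max_employees = 2  # cap on distinct employees, not on result rows

# Limit the employee side in a subquery, then join. The join may still
# produce several rows per employee, but only for the selected employees.
limited = conn.execute("""
    SELECT e.id, e.name, p.project
    FROM (SELECT id, name FROM employee ORDER BY id LIMIT ?) e
    INNER JOIN project_assignment p ON p.employee_id = e.id
    ORDER BY e.id, p.project
""", (max_employees,)).fetchall()

for row in limited:
    print(row)
```

Three rows come back, but they cover only two distinct employees. Any filter that decides which employees qualify has to go inside the subquery, otherwise the cap is applied before the filter.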