How do I just get the first matching row? - sql

I have a fairly complex SQL query, part of which needs to look up a company_ID value found in the first table to obtain the company_Name from the second table. The second table may hold several variants of the company name, but that is OK: I just need the first match.
So, tableA looks something like this (approx 2 dozen columns and many rows)
company_ID (CHAR(12))
161012348876
561254435253
103929478273
141567643542
tableB looks something like this
company_ID (Integer) Company_name
161012348876 Watson & Jones Ltd
161012348876 Watson and Jones
561254435253 Fictional Co. plc
103929478273 Made Up Corp.
161012348876 Watson Jones Ltd
141567643542 Thingymajig Gmbh.
This query will return multiple rows for 161012348876. What're good ways just to get one row returned for each matching company_id (i.e. 4 rows instead of 6)?
SELECT *, t2.company_name
FROM tableA t1
JOIN tableB t2 ON t1.company_id = cast(t2.company_id as CHAR(12))
I am using Teradata SQL.
Any help much appreciated.

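SELECT t1.company_id, MIN(t2.company_name) AS company_name
FROM tableA t1
JOIN tableB t2 ON t1.company_id = cast(t2.company_id as CHAR(12))
GROUP BY t1.company_id
This returns 1 row for each unique t1.company_id. Teradata requires every selected column to be either listed in the GROUP BY or wrapped in an aggregate, so SELECT * cannot be combined with GROUP BY t1.company_id; MIN picks one name per id.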

The following query gets one name for each company id. The derived table groups tableB by company_id and takes MAX(company_name), which yields exactly one name per id; the result is then joined to tableA.
SELECT t1.company_id, t3.aName AS company_name
FROM tableA t1
JOIN (SELECT t2.company_id, MAX(t2.company_name) AS aName
FROM tableB t2 GROUP BY t2.company_id) AS t3
ON t1.company_id = cast(t3.company_id as CHAR(12))

Instead of user2989408's MAX subquery you can also use QUALIFY:
SELECT company_id, company_name
FROM tableB
QUALIFY ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY company_name) = 1
-- if you don't care about MIN/MAX or want a more random result, use this QUALIFY line instead:
-- QUALIFY COUNT(*) OVER (PARTITION BY company_id ROWS UNBOUNDED PRECEDING) = 1
But assuming that company_id is the PI (primary index) of tableB, the MAX version will probably perform better.
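To get back to the four rows the question asks for, the single-row query can be used as a derived table and joined to tableA. A minimal sketch in Teradata SQL, assuming tableA and tableB exactly as described in the question and reusing the question's CAST for the join; the alias t3 is only an illustrative name:

```sql
-- One name per company_id is picked inside the derived table,
-- so the join can no longer multiply rows from tableA.
SELECT t1.company_id,
       t3.company_name
FROM tableA t1
JOIN (SELECT company_id, company_name
      FROM tableB
      QUALIFY ROW_NUMBER() OVER (PARTITION BY company_id
                                 ORDER BY company_name) = 1
     ) AS t3
  ON t1.company_id = CAST(t3.company_id AS CHAR(12));
```

Which name variant wins depends on how company_name sorts, so swap the ORDER BY for whatever column best defines "first" in your data.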


Compare the results of a ROW COUNT

I have two databases on the same server and I need to compare the records in each one, because one of the databases is not importing all the information.
I was trying to do a row count, but it's not working.
Currently I export batches of approximately 100,000 rows and look through them in Excel.
Let's say I want a query that counts the rows for each ID in TABLE A and then compares that count with the TABLE B count for the same ID. Since the IDs are the same, the counts should be the same, and I want the query to return every ID where the counts do not match.
--this table will contain the count of occurrences of each ID in TableA
--(table variables are declared with @; the # prefix is for temp tables made with CREATE TABLE)
declare @TableA_Results table(
ID bigint,
Total bigint
);
insert into @TableA_Results
select ID,count(*) from database1.dbo.TableA --dbo schema assumed
group by ID;
--this table will contain the count of occurrences of each ID in TableB
declare @TableB_Results table(
ID bigint,
Total bigint
);
insert into @TableB_Results
select ID,count(*) from database2.dbo.TableB --dbo schema assumed
group by ID;
--this table will contain the IDs that don't have the same count in both tables
declare @Discordances table(
ID bigint,
TotalA bigint,
TotalB bigint
);
insert into @Discordances
select TA.ID,TA.Total,TB.Total
from @TableA_Results TA
inner join @TableB_Results TB on TA.ID=TB.ID and TA.Total!=TB.Total;
--the final output
select * from @Discordances;
The question is vague, but maybe this SQL Code might help nudge you in the right direction.
It grabs the IDs and Counts of each ID from database one, the IDs and counts of IDs from database two, and compares them, listing out all the rows where the counts are DIFFERENT.
WITH DB1Counts AS (
SELECT ID, COUNT(ID) AS CountOfIDs
FROM DatabaseOne.dbo.TableOne
GROUP BY ID
), DB2Counts AS (
SELECT ID, COUNT(ID) AS CountOfIDs
FROM DatabaseTwo.dbo.TableTwo
GROUP BY ID
)
SELECT a.ID, a.CountOfIDs AS DBOneCount, b.CountOfIDs AS DBTwoCount
FROM DB1Counts a
INNER JOIN DB2Counts b ON a.ID = b.ID
WHERE a.CountOfIDs <> b.CountOfIDs
This SQL refers to each table using the "Database.Schema.Table" notation, so replace "DatabaseOne" and "DatabaseTwo" with the names of your two databases, and replace TableOne and TableTwo with the names of your tables (I'm assuming they're the same). It sets up two selects, one for each database, each grouping by ID to get the count per ID. It then joins the two selects on ID and returns all rows where the counts are different.
You could full outer join two aggregate queries and pull out ids that are either missing in one table, or for which the record count is different:
select coalesce(ta.id, tb.id) as id, ta.cnt, tb.cnt
from
(select id, count(*) cnt from tableA group by id) ta
full outer join (select id, count(*) cnt from tableB group by id) tb
on ta.id = tb.id
where
coalesce(ta.cnt, -1) <> coalesce(tb.cnt, -1)
You seem to want aggregation and a full join:
select coalesce(a.id, b.id) as id, a.cnt, b.cnt
from (select id, count(*) as cnt
from a
group by id
) a full join
(select id, count(*) as cnt
from b
group by id
) b
on a.id = b.id
where coalesce(a.cnt, 0) <> coalesce(b.cnt, 0);
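To see what the full join adds over an inner join, here is a small self-contained sketch in SQL Server syntax. The ids are made up for illustration, and inline VALUES lists stand in for the two real tables:

```sql
-- Hypothetical sample data: id 5 has the same count on both sides,
-- ids 1 and 2 differ in count, id 3 exists only in a, id 4 only in b.
WITH a (id) AS (
    SELECT id FROM (VALUES (1), (1), (2), (3), (5)) AS v (id)
),
b (id) AS (
    SELECT id FROM (VALUES (1), (2), (2), (4), (5)) AS v (id)
)
SELECT COALESCE(ca.id, cb.id) AS mismatched_id,
       COALESCE(ca.cnt, 0)    AS cnt_a,
       COALESCE(cb.cnt, 0)    AS cnt_b
FROM (SELECT id, COUNT(*) AS cnt FROM a GROUP BY id) AS ca
FULL JOIN (SELECT id, COUNT(*) AS cnt FROM b GROUP BY id) AS cb
    ON ca.id = cb.id
WHERE COALESCE(ca.cnt, 0) <> COALESCE(cb.cnt, 0)
ORDER BY mismatched_id;
```

With this data the query reports ids 1, 2, 3 and 4 and leaves out id 5. An inner join version would report only ids 1 and 2, silently skipping the ids that are missing from one side.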

Select value in a SQL query based on count() being non-zero

I am trying to write a SQL query that would "calculate" the status column of a row based on a JOIN with the statuses table.
My record is basically: id | name | statusId, where statusId is a foreign key to the statuses table. That table has: id | statusName
I collect count() for each DISTINCT statusId. Now I need to return the id of one status based on the following idea: if count(status0) > 0, I need to return status0; otherwise I need to check status1, then status2, etc.
Could I write a SQL query that returns the status for each row using JOIN, WHERE, HAVING etc., without if/else logic?
If you need the statuses that are being used, then use exists:
select s.*
from statuses s
where exists (select 1 from records r where r.statusid = s.id);
Actually, if you just want the ids, you can use:
select distinct r.statusid
from records r;
Rephrasing the question: do you just want the statuses that are present in the bigger table? How about this?
SELECT s.*
FROM statuses s
WHERE s.id IN (SELECT DISTINCT statusid FROM records);
Start with this; it should get you on the right track...
select a.*, b.*, c.cnt from table1 a
join table2 b on a.statusid=b.id
join (
select statusname, count(*) cnt
from table1 a join table2 b on a.statusid=b.id
group by statusname)c
on b.statusname=c.statusname
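If what is wanted is a single status, namely the first one in priority order that is used by at least one record, an aggregate avoids any if/else logic. A minimal sketch, assuming the records and statuses tables from the question and assuming that a lower status id means higher priority (status0 before status1, and so on):

```sql
-- MIN over the referenced status ids picks the highest-priority status
-- that has at least one record, i.e. the first one whose count is > 0.
SELECT s.id, s.statusName
FROM statuses s
WHERE s.id = (SELECT MIN(r.statusId) FROM records r);
```

If priority does not follow the id order, the same idea needs an explicit priority column to take the minimum over.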

SQL Find most rows that match between two tables

I am using SQL Server 2012 and I have two tables, each with a group column and a member column.
Table1 and Table2 both have many groups, indicated by the group column. The name of a group may match in both tables, but it may not. What is important is finding the group in Table2 that has the most members matching the members of a group in Table1.
I first tried doing this with a vlookup, but the problem is that vlookup pulls the first entry in the Group column that has a match, not the group with the most matches. In my example data a vlookup would pull BBB, but the correct result is CCC.
Ties may occur: more than one group in Table2 may match a Table1 group with the same number of members. Counting the matches may therefore be the best approach, but there are thousands of groups, so sorting and sifting through a column of counts is not ideal. I need something like a CASE expression so that Table1 shows, in a derived column BestMatch, the name of the Table2 group with MAX(match). Ideally the column would display all the Table2 groups that share MAX(match), which may be one or more, perhaps comma separated.
If that is not possible, the column could just say "tie" so that I know to look for it. In that case the word "tie" should repeat beside every matching member, so I can see which groups match on which accounts and how many matched.
We really could do with some expected output to help clarify the question.
If I understand you correctly however, this query will get you close to the results you require:
;with cte as
( SELECT t1a.[group] AS Group1
, t2a.[Group] AS Group2
, RANK() OVER(PARTITION BY t1a.[group]
ORDER BY COUNT(t2a.[Group]) DESC) AS MatchRank
FROM Table1 t1a
JOIN Table2 t2a
ON t1a.member = t2a.member
GROUP BY t1a.[group], t2a.[GRoup])
SELECT *
FROM cte
WHERE MatchRank=1
The query doesn't identify ties, but it will display any tied results...
If you are a newbie to common table expressions(the ;with statement) there is a useful description here.
select *
from Table1 t1
outer apply
(
select top 1 t2.[Group]
from Table2 t2
where t2.Member = t1.Member
group by t2.[Group]
order by count(*) desc
) m
It may not be the most elegant solution but I think it could do the work:
select *
from
(select t1.[group] as t1group, t1.member, t2.[group] as t2group
from Table1 t1 inner join Table2 t2 on t1.member = t2.member)a
where member = (select max(t1.member)
from Table1 t1 inner join Table2 t2 on t1.member = t2.member)
In case of 2 rows from Table2 matching the maximum members in Table1, both results would be displayed
PS: an example of your desired results would have been helpful
Count member matches per group pair and rank them so the group pairs with the highest match count get rank #1. Once you found these, you can select the related records from table1 and table2.
select t1.grp, t1.member, t2.grp
from t1
join
(
select
t1.grp as grp1,
t2.grp as grp2,
rank() over (partition by t1.grp order by count(*) desc) as rnk
from t1
join t2 on t2.member = t1.member
group by t1.grp, t2.grp
) grps on grps.rnk = 1 and grps.grp1 = t1.grp
left join t2 on t2.grp = grps.grp2 and t2.member = t1.member
order by t1.grp, t1.member, t2.grp;
This gives you ties in separate rows, e.g. for AAA having four different members (123,456,789,555) with two matches both in CCC and DDD:
grp1 member grp2
AAA 123 CCC
AAA 123 DDD
AAA 456 CCC
AAA 789
AAA 555 DDD
If you want one row per grp1 and member with all matching grp2 in a string then you need some clumsy STUFF trick in SQL Server as far as I am aware. Look up "GROUP_CONCAT in SQL Server" to find the technique needed.
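The STUFF technique mentioned above can be sketched as follows for SQL Server 2012. The CTE named best and its rows are hypothetical stand-ins for the rank = 1 group pairs produced by one of the earlier queries:

```sql
-- Hypothetical input: one row per (Table1 group, best-matching Table2 group).
WITH best (grp1, grp2) AS (
    SELECT grp1, grp2
    FROM (VALUES ('AAA', 'CCC'), ('AAA', 'DDD'), ('BBB', 'EEE')) AS v (grp1, grp2)
)
SELECT b.grp1,
       -- FOR XML PATH('') glues the rows into ',CCC,DDD';
       -- STUFF then removes the leading comma.
       STUFF((SELECT ',' + b2.grp2
              FROM best AS b2
              WHERE b2.grp1 = b.grp1
              ORDER BY b2.grp2
              FOR XML PATH('')), 1, 1, '') AS BestMatch
FROM best AS b
GROUP BY b.grp1;
```

For the sample rows this gives CCC,DDD for AAA and EEE for BBB. Note that FOR XML PATH escapes characters such as & and <, so group names containing them need extra handling.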

How to compare tables and find duplicates and also find columns with different value

I have the following tables in Oracle 10g:
Table1
Name Status
a closed
b live
c live
Table2
Name Status
a final
b live
c live
There are no primary keys on either table, and I am trying to write a query that returns the matching rows without looping over both tables and comparing rows/columns. If the status column is different, then the row in Table2 takes precedence.
So in the above example my query should return this:
Name Status
a final
b live
c live
Since you have mentioned that there is no primary key on either table, I'm assuming that a row may exist in Table1, in Table2, or in both. The query below uses common table expressions and a window function to get that result.
WITH unionTable
AS
(
SELECT Name, Status, 1 AS ordr FROM Table1
UNION
SELECT Name, Status, 2 AS ordr FROM Table2
),
ranks
AS
(
SELECT Name, Status,
ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY ordr DESC) rn
FROM unionTable
)
SELECT Name, Status
FROM ranks
WHERE rn = 1
Something like this?
SELECT table1.Name, table2.Status
FROM table1
INNER JOIN table2 ON table1.Name = table2.Name
By always returning table2.Status you've covered both the case when they're the same and when they're different (essentially it doesn't matter what the value of table1.Status is).
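The inner join drops any Name that appears in only one of the two tables. If such rows should be kept, a full outer join with COALESCE is one option. A minimal sketch for Oracle, assuming Table1 and Table2 as shown in the question and assuming Name is unique within each table:

```sql
-- Prefer Table2's values; fall back to Table1 when Table2 has no row
-- for that Name (or when Table2's Status is NULL).
SELECT COALESCE(t2.Name, t1.Name)     AS Name,
       COALESCE(t2.Status, t1.Status) AS Status
FROM Table1 t1
FULL OUTER JOIN Table2 t2
  ON t1.Name = t2.Name;
```

If Name can repeat within a table, the join multiplies rows, and the ROW_NUMBER approach shown earlier is the safer choice.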

How do I limit the number of rows returned by this LEFT JOIN to one?

So I think I've seen a solution to this however they are all very complicated queries. I'm in oracle 11g for reference.
What I have is a simple one to many join which works great however I don't need the many. I just want the left table (the one) to just join any 1 row which meets the join criteria...not many rows.
I need to do this because the query is in a rollup which COUNTS so if I do the normal left join I get 5 rows where I only should be getting 1.
So example data is as follows:
TABLE 1:
-------------
TICKET_ID ASSIGNMENT
5 team1
6 team2
TABLE 2:
-------------
MANAGER_NAME ASSIGNMENT_GROUP USER
joe team1 sally
joe team1 stephen
joe team1 louis
harry team2 ted
harry team2 thelma
What I need to do is join these two tables on ASSIGNMENT = ASSIGNMENT_GROUP but have only 1 row returned.
When I do a left join I get three rows returned, because that is the nature of the left join.
Oracle supports ROW_NUMBER() OVER (PARTITION BY ...), so you can create a subquery that keeps only the rows where the row number equals 1.
SELECT *
FROM table1 t1
LEFT JOIN
(SELECT *
FROM (SELECT t2.*,
ROW_NUMBER()
OVER(PARTITION BY assignment_group ORDER BY assignment_group) AS Seq
FROM table2 t2) a
WHERE Seq = 1) v
ON t1.assignment = v.assignment_group
You could do something like this.
SELECT t1.ticket_id,
t1.assignment,
t2.manager_name,
t2.user
FROM table1 t1
LEFT OUTER JOIN (SELECT manager_name,
assignment_group,
user,
row_number() over (partition by assignment_group
order by assignment_group -- arbitrary pick; use a real sort key if it matters
) rnk
FROM table2) t2
ON ( t1.assignment = t2.assignment_group
AND t2.rnk = 1 )
This partitions the data in table2 by assignment_group and numbers the rows within each group, so that one row per assignment_group can be pulled. Oracle requires an ORDER BY in the ROW_NUMBER window; ordering by the partition column itself leaves the choice of row arbitrary. If you care which row is returned (or if you want the row returned to be deterministic), order by a column that distinguishes the rows instead.
I think what you need is to use GROUP BY on the ASSIGNMENT_GROUP field.
http://www.w3schools.com/sql/sql_groupby.asp
In MySQL you could just GROUP BY ASSIGNMENT and be done. Oracle is stricter and refuses to pick (in an undefined way) which of the three rows' values to return. That means every returned column must either be part of the GROUP BY or be wrapped in an aggregate function (COUNT, MIN, MAX...).
You can of course choose not to care and just apply some aggregate function to the returned columns.
select TICKET_ID, ASSIGNMENT, MAX(MANAGER_NAME), MAX(USER)
from T1
left join T2 on T1.ASSIGNMENT=T2.ASSIGNMENT_GROUP
group by TICKET_ID, ASSIGNMENT
If you do that I would seriously doubt that you need the JOIN in the first place.
MySQL could also help with GROUP_CONCAT if you want a string concatenation of the group's values for a column (humans often like that). Oracle has no GROUP_CONCAT; LISTAGG fills that role from 11g Release 2 onward, and earlier versions need considerably more work.
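On Oracle 11g Release 2 or later, the built-in LISTAGG aggregate can produce that concatenated string while still returning one row per ticket. A minimal sketch, assuming table1 and table2 from the question; since USER is a reserved word in Oracle, the sketch assumes the column is really named user_name, which is a hypothetical name:

```sql
-- Requires Oracle 11g Release 2 or later (LISTAGG does not exist before that).
-- user_name is a stand-in for the USER column of the example data.
SELECT t1.ticket_id,
       t1.assignment,
       MAX(t2.manager_name) AS manager_name,
       LISTAGG(t2.user_name, ', ')
           WITHIN GROUP (ORDER BY t2.user_name) AS users
FROM table1 t1
LEFT JOIN table2 t2
  ON t1.assignment = t2.assignment_group
GROUP BY t1.ticket_id, t1.assignment;
```

Because the query groups by the table1 columns, a COUNT over its result sees each ticket once, whatever the number of users in the group.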
Using a subquery as already suggested is an option, look here for an example. It also allows you to sort the subquery before selecting the top row.
In Oracle, if you want 1 result, you can use the ROWNUM pseudocolumn to get the first N rows of a query, e.g.:
SELECT *
FROM TABLEX
WHERE
ROWNUM = 1 --gets the first value of the result
The problem with this query on its own is that Oracle does not guarantee any row order without an ORDER BY. So you must order your data in a subquery before applying ROWNUM:
SELECT *
FROM
(SELECT * FROM TABLEX ORDER BY COL1)
WHERE
ROWNUM = 1
For your case, it looks like you only need 1 matching row per ticket. An inline view in the FROM clause cannot reference T1, so put the ROWNUM = 1 lookup in a correlated scalar subquery instead:
SELECT T1.*,
(SELECT T2.MANAGER_NAME
FROM TABLE2 T2
WHERE T2.ASSIGNMENT_GROUP = T1.ASSIGNMENT
AND ROWNUM = 1) AS MANAGER_NAME
FROM TABLE1 T1
You can also use a subquery that returns only the top row (SELECT TOP 1 in SQL Server; in Oracle 11g, filter on ROWNUM = 1).