How to select only one row that satisfies one of multiple ordered conditions - sql

I have customers that have multiple addresses. Each customer/address combination has its own line in the database (Oracle) table.
I am trying to achieve a query in which, if the customer has a 'Main Address', I display only the Main Address,
otherwise if he has a Shipment Address, I display only the Shipment Address,
otherwise if he has a 'Secondary Address', I display the Secondary Address,
otherwise I display nothing.
This order is important, and the problem is that the entries in the database are in no specific order, meaning that the same customer might be found to have a Shipment Address first, and a Main Address later on. Therefore I don't simply need the first row that satisfies one of the conditions...
I tried this, but it returns a result for each line, e.g. multiple results for one person:
CASE WHEN ADR = 'MAIN' THEN 'MAIN'
WHEN ADR = 'SHIPMENT' THEN 'SHIPMENT'
WHEN ADR = 'SECONDARY' THEN 'SECONDARY'
ELSE null
END AS Adressart
To clarify, the input looks as follows:
CUSTOMER_NR ADDRESS_TYPE
1 SHIPMENT
1 MAIN
2 SHIPMENT
3 SECONDARY
3 SHIPMENT
4 SECONDARY
The results would look like this:
CUSTOMER_NR ADDRESS_TYPE
1 MAIN
2 SHIPMENT
3 SHIPMENT
4 SECONDARY

I think you could use multiple joins to the address table and, with COALESCE, return only the first address found in priority order. This assumes each customer has at most one address of each type; otherwise the joins multiply rows.
The query would look something like:
SELECT c.customer_nr, COALESCE(a1.address, a2.address, a3.address) AS address
FROM customer c
LEFT JOIN address a1 ON a1.customer_nr = c.customer_nr AND a1.address_type = 'MAIN'
LEFT JOIN address a2 ON a2.customer_nr = c.customer_nr AND a2.address_type = 'SHIPMENT'
LEFT JOIN address a3 ON a3.customer_nr = c.customer_nr AND a3.address_type = 'SECONDARY'

The following query would get you the desired result:
SELECT
CUSTOMER_NR,
ADDRESS_TYPE
FROM
(
select
customer_nr,
address_type,
row_number () over (partition by customer_nr order by
case address_type
when 'MAIN' then 1
when 'SHIPMENT' then 2
when 'SECONDARY' THEN 3
else 4
end) rn
from addresses
)
WHERE rn = 1;
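First, each customer's addresses are ranked by priority using a CASE expression in the ORDER BY of ROW_NUMBER(). Then only the rows with rn = 1 (the address type of highest priority for that customer) are selected. Because of the ELSE 4 branch, a customer who has only other address types still returns a row; add WHERE address_type IN ('MAIN', 'SHIPMENT', 'SECONDARY') to the inner query to return nothing for such customers.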
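To try the ranking approach on the sample data from the question, here is a minimal sketch run through Python's sqlite3 module. It assumes an SQLite build with window-function support (3.25 or later); the table name addresses follows the answer, and the Oracle query is only lightly adapted (SQLite accepts an alias on the derived table).

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addresses (customer_nr INTEGER, address_type TEXT)")
conn.executemany(
    "INSERT INTO addresses VALUES (?, ?)",
    [(1, "SHIPMENT"), (1, "MAIN"), (2, "SHIPMENT"),
     (3, "SECONDARY"), (3, "SHIPMENT"), (4, "SECONDARY")],
)

# Rank each customer's rows by address priority, then keep the best one.
query = """
SELECT customer_nr, address_type
FROM (
    SELECT customer_nr,
           address_type,
           ROW_NUMBER() OVER (
               PARTITION BY customer_nr
               ORDER BY CASE address_type
                            WHEN 'MAIN' THEN 1
                            WHEN 'SHIPMENT' THEN 2
                            WHEN 'SECONDARY' THEN 3
                            ELSE 4
                        END
           ) AS rn
    FROM addresses
) AS ranked
WHERE rn = 1
ORDER BY customer_nr
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The rows come back one per customer, matching the desired output listed in the question.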

Related

Showing only max value of case statement in group by statement in SQL Server Management Studio

With a query I'd like to check, for all the loans in our database, whether the main credit taker has an e-mail address filled out or not.
The problem I am encountering is that the Address table stores various types of addresses (also physical/fax addresses etc.). I can't simply filter on e-mail addresses as I would still like to see all loans (also those without e-mail), with the result whether or not they have an e-mail.
My current query still shows multiple rows for loans that have both an e-mail address and another form of address.
select l.ApplicationNumber, p1.Name AS Credittaker, a1.EAddress AS 'Mail credit taker',
SUM(case
when c1.ContactChannelTypeID = 2 AND a1.ElectronicAddressTypeID = 1 then 1
else 0 end) as MailID
from Line l
JOIN Role r1 on (l.ObjID = r1.ObjID_Businessobject)
JOIN Party p1 on (p1.ObjID = r1.ObjID_Party)
JOIN ContactChannel c1 on (p1.ObjID = c1.ObjID_Party)
JOIN Address a1 on (a1.ObjID = c1.ObjID_Address)
WHERE r1.RDNameID = '17' AND r1.Enddate IS NULL AND c1.EndDate IS NULL
GROUP BY l.ApplicationNumber, p1.Name, a1.EAddress
order by l.ApplicationNumber
What I want is for the query to show 1 row per application number, if there is an e-mail (i.e. a row with MailID 1), to show only this value for the application number. If no e-mail is found (i.e. only row with MailID 0), to show this value.
How would I do this?
Example of results vs desired results:
11650 and 11651 only have non e-mail addresses, so their value of MailID = 0 is correct. 11652 and 11653 both have an e-mail address and other types of addresses, and should therefore only display MailID = 1. Adding MAX(MailID) to the GROUP BY clause doesn't work because it doesn't recognize the column name. If I put the case statement in a subquery and refer to it afterwards, I get the error that an aggregate function can't contain another aggregate function.
I think this will work:
select top (1) with ties . . .
. . .
order by row_number() over (partition by l.ApplicationNumber order by MailId desc);
I'm not 100% sure you can use a column alias like this in the ORDER BY. If not, then:
with t as (
<your query here with no order by>
)
select t.*
from (select t.*,
row_number() over (partition by applicationNumber order by mailid desc) as seqnum
from t
) t
where seqnum = 1;
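If only a yes/no flag per application is needed, a different technique from the ROW_NUMBER() approach may be simpler: group by the application number alone and take MAX over the CASE expression, so the e-mail column no longer splits the groups. A minimal sketch, run through Python's sqlite3 module; the flattened loan_address table and its columns are hypothetical stand-ins for the joined tables in the question, and address_type_id = 1 is assumed to mean "e-mail".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened stand-in for the Line/Role/Party/ContactChannel/Address joins.
conn.execute(
    "CREATE TABLE loan_address ("
    "application_number INTEGER, address_type_id INTEGER, e_address TEXT)"
)
conn.executemany(
    "INSERT INTO loan_address VALUES (?, ?, ?)",
    [
        (11650, 2, None),
        (11651, 3, None),
        (11652, 1, "a@example.com"),
        (11652, 2, None),
        (11653, 2, None),
        (11653, 1, "b@example.com"),
    ],
)

# Grouping only by the application number yields one row per loan;
# MAX(CASE ...) is 1 as soon as any of the loan's rows is an e-mail address.
query = """
SELECT application_number,
       MAX(CASE WHEN address_type_id = 1 THEN 1 ELSE 0 END) AS mail_id,
       MAX(CASE WHEN address_type_id = 1 THEN e_address END) AS mail
FROM loan_address
GROUP BY application_number
ORDER BY application_number
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Loans without an e-mail address still appear, with mail_id 0 and a NULL address.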

sql join with condition

I've a table EMPLOYEE which has columns like these
EmpId FName LName
I have another table ADDRESS which has columns like these
EmpId AddressType Address Phone Email
The AddressType column has 2 possible types, Residential and Official, and an Emp can have both types of address. I need a query which will join these 2 tables using EmpId. It also needs to fetch one address whose phone is not null. If both addresses have a phone, then fetch any one; if neither has a phone, still fetch any one. Please help.
The trick is to first decide which Address is best for each Employee, based on your Phone rule. Once the preferred Address has been found, indicated by PhonePreference = 1, you can JOIN that Address to the Employee.
WITH AddressCTE AS (
SELECT *
, ROW_NUMBER() OVER (
PARTITION BY EmpId
ORDER BY CASE WHEN Phone IS NOT NULL THEN 1 ELSE 2 END, Phone
) PhonePreference
FROM Address
)
SELECT *
FROM Employee E
JOIN AddressCTE A ON E.EmpId = A.EmpId AND A.PhonePreference = 1
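As a quick check of the phone-preference idea, here is a small sketch using Python's sqlite3 module (assuming an SQLite build with window functions, 3.25 or later). The sample employees and addresses are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmpId INTEGER, FName TEXT, LName TEXT);
CREATE TABLE Address (EmpId INTEGER, AddressType TEXT, Address TEXT,
                      Phone TEXT, Email TEXT);
-- Made-up sample data: employee 1 has one phone, employee 2 has none.
INSERT INTO Employee VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO Address VALUES
  (1, 'Residential', '1 Home St', NULL, NULL),
  (1, 'Official', '9 Work Ave', '555-0100', NULL),
  (2, 'Residential', '2 Home St', NULL, NULL),
  (2, 'Official', '8 Work Ave', NULL, NULL);
""")

# Addresses with a phone sort first within each employee; row 1 is kept.
query = """
WITH AddressCTE AS (
    SELECT a.*,
           ROW_NUMBER() OVER (
               PARTITION BY EmpId
               ORDER BY CASE WHEN Phone IS NOT NULL THEN 1 ELSE 2 END, Phone
           ) AS PhonePreference
    FROM Address a
)
SELECT e.EmpId, a.AddressType, a.Phone
FROM Employee e
JOIN AddressCTE a ON e.EmpId = a.EmpId AND a.PhonePreference = 1
ORDER BY e.EmpId
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Employee 1 gets the address that has a phone; employee 2 still gets exactly one address, but which of the two is not determined by the ORDER BY.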

SQL Count the numbers in a returned query?

I have a query to retrieve the email addresses and the names (sometimes more than 1) associated with the email address, where the account status is not closed, in descending order:
SELECT sbg.contact_email, COUNT(DISTINCT sbg.contact_name) Num_Contact_Names
FROM SummaryBillGroup sbg
INNER JOIN Account a
ON sbg.Customer_number = a.Customer_number
WHERE a.account_status_code <> 'c'
GROUP BY sbg.contact_email
ORDER BY Num_Contact_Names DESC
This returns a list of the email addresses and the number of names associated with each email address. What I would like to do now is use that query to count up all of the returned numbers, so that I have a list of the 3's, the 2's, the 1's, etc.
You could use that same query as a derived table for another one. Like this:
select num_contact_names, count(*)
from (SELECT sbg.contact_email, COUNT(DISTINCT sbg.contact_name) Num_Contact_Names
FROM SummaryBillGroup sbg
INNER JOIN Account a
ON sbg.Customer_number = a.Customer_number
WHERE a.account_status_code <> 'c'
GROUP BY sbg.contact_email) as t
group by t.num_contact_names
order by t.num_contact_names
The first row gives the number of e-mail addresses with 1 contact name, the second row the number with 2, and so on. Cheers.
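A small sketch of the derived-table idea, run through Python's sqlite3 module. The table and column names come from the question; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SummaryBillGroup (Customer_number INTEGER,
                               contact_email TEXT, contact_name TEXT);
CREATE TABLE Account (Customer_number INTEGER, account_status_code TEXT);
-- Invented sample data: customer 3 is closed and must be ignored.
INSERT INTO Account VALUES (1, 'o'), (2, 'o'), (3, 'c');
INSERT INTO SummaryBillGroup VALUES
  (1, 'x@example.com', 'Ann'),
  (1, 'x@example.com', 'Bob'),
  (1, 'y@example.com', 'Ann'),
  (2, 'z@example.com', 'Cy'),
  (2, 'x@example.com', 'Ann'),
  (3, 'w@example.com', 'Dee');
""")

# Inner query: distinct names per e-mail. Outer query: how many e-mails
# share each name count.
query = """
SELECT t.Num_Contact_Names, COUNT(*) AS num_emails
FROM (
    SELECT sbg.contact_email,
           COUNT(DISTINCT sbg.contact_name) AS Num_Contact_Names
    FROM SummaryBillGroup sbg
    INNER JOIN Account a ON sbg.Customer_number = a.Customer_number
    WHERE a.account_status_code <> 'c'
    GROUP BY sbg.contact_email
) AS t
GROUP BY t.Num_Contact_Names
ORDER BY t.Num_Contact_Names
"""
rows = conn.execute(query).fetchall()
print(rows)
```

With this data two e-mail addresses have one contact name and one address has two, so the result has one row per distinct name count.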

Is it Possible to Use IF/Else in SQL?

Is it possible to use if/else in SQL? If I have a table called supplier with columns: sid -> primary key, sname and city.
Then I wish to:
select sid from supplier where city="taipei" if not empty.
Or select sid from supplier where city="tainan"
Yes, you can. I don't know about other DBMSs, but I have used this in Microsoft SQL Server, in a stored procedure, like this:
IF EXISTS
(SELECT [sid] FROM [supplier] WHERE [city] = 'taipei')
select sid from supplier where city = 'taipei' -- your true-condition query
ELSE
select sid from supplier where city = 'tainan'
In MySQL this also turns out to be possible, though only inside a stored program (procedure, function or trigger):
IF EXISTS (SELECT * FROM tbl_name WHERE category_code = 'some-category-code') THEN
UPDATE tbl_name SET active = '0' WHERE category_code = 'some-category-code';
END IF;
It was unclear what you wanted to do (I leave my previous hypotheses below).
You want to associate a priority to your suppliers, so that the one for Taipei is selected, but if it is unavailable, then Tainan gets selected instead.
In this specific case you can just use:
SELECT sid FROM supplier WHERE city = (
SELECT MAX(city) FROM supplier WHERE city IN ('Taipei', 'Tainan')
);
The inner sub-SELECT will retrieve Taipei or, if unavailable, Tainan.
This uses the fact that Taipei is lexicographically greater than Tainan, but if you wanted a more flexible solution, MAX would not work. In that case you would change the subselect to sort cities in order of desirability (missing cities are of course undesirable) and then fetch the one most desirable:
SELECT sid FROM supplier WHERE city = (
SELECT city FROM supplier ORDER BY CASE
WHEN city = 'Taipei' THEN 1
WHEN city = 'Tainan' THEN 2
WHEN city = 'New York' THEN 3
ELSE 4
END
LIMIT 1
);
The subselect now will retrieve first Taipei, but missing Taipei it will get to Tainan and so on.
Note that if you want only one SID, you can do it much more simply:
SELECT sid FROM supplier ORDER BY CASE
WHEN city = 'Taipei' THEN 1
WHEN city = 'Tainan' THEN 2
WHEN city = 'New York' THEN 3
ELSE 4
END
LIMIT 1
This will retrieve all suppliers, but the one from Taipei, if available, will come out first; and the LIMIT 1 will truncate the response to that first row.
The solutions below do not apply
This will get sid from supplier where city is Taipei or Tainan (which of course means that city is not empty!):
SELECT sid FROM supplier WHERE city IN ('Taipei', 'Tainan');
This will get sid from supplier as above, provided sid is not empty:
SELECT sid FROM supplier WHERE city IN ('Taipei', 'Tainan') AND sid IS NOT NULL;
This will get sid from supplier as above, and replace sid if it is empty.
SELECT CASE WHEN sid IS NULL then 'Empty' ELSE sid END AS sid
FROM supplier WHERE city IN ('Taipei', 'Tainan');
Maybe you should provide two or three sample rows with the expected results.
Edit: sorry, I see now that sid is a primary key, which means it should never be empty. This means that cases 2 and 3 can never apply.
Then perhaps you mean that sname is not empty?:
SELECT sid FROM supplier WHERE city IN ('Taipei', 'Tainan')
AND sname IS NOT NULL AND sname != '';
The following selects the suppliers in Taipei if there are any; otherwise it selects those in Tainan. If neither exists, nothing is returned.
select sid
from supplier
where city = 'Taipei'
union all
select sid
from supplier
where city = 'Tainan'
and not exists (select 1 from supplier where city = 'Taipei')
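To see the fallback behaviour of the UNION ALL / NOT EXISTS query, here is a minimal sketch using Python's sqlite3 module. The supplier rows are invented, and the trailing ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE supplier (sid INTEGER PRIMARY KEY, sname TEXT, city TEXT)"
)
# Invented sample data.
conn.executemany(
    "INSERT INTO supplier VALUES (?, ?, ?)",
    [(1, "A", "Tainan"), (2, "B", "Taipei"), (3, "C", "Taipei"), (4, "D", "Tokyo")],
)

# The second branch only contributes rows when no Taipei supplier exists.
QUERY = """
SELECT sid FROM supplier WHERE city = 'Taipei'
UNION ALL
SELECT sid FROM supplier
WHERE city = 'Tainan'
  AND NOT EXISTS (SELECT 1 FROM supplier WHERE city = 'Taipei')
ORDER BY sid
"""

def preferred_sids():
    """Run the fallback query and return the matching sids as a list."""
    return [row[0] for row in conn.execute(QUERY)]

with_taipei = preferred_sids()
conn.execute("DELETE FROM supplier WHERE city = 'Taipei'")
without_taipei = preferred_sids()
conn.execute("DELETE FROM supplier WHERE city = 'Tainan'")
with_neither = preferred_sids()
print(with_taipei, without_taipei, with_neither)
```

While Taipei suppliers exist only they are returned; once they are gone the Tainan supplier appears, and with neither city present the result is empty.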

How can I choose the closest match in SQL Server 2005?

In SQL Server 2005, I have a table of input coming in of successful sales, and a variety of tables with information on known customers, and their details. For each row of sales, I need to match 0 or 1 known customers.
We have the following information coming in from the sales table:
ServiceId,
Address,
ZipCode,
EmailAddress,
HomePhone,
FirstName,
LastName
The customers information includes all of this, as well as a 'LastTransaction' date.
Any of these fields can map back to 0 or more customers. We count a match as being any time that a ServiceId, Address+ZipCode, EmailAddress, or HomePhone in the sales table exactly matches a customer.
The problem is that we have information on many customers, sometimes multiple in the same household. This means that we might have John Doe, Jane Doe, Jim Doe, and Bob Doe in the same house. They would all match on Address+ZipCode and HomePhone, and possibly more than one of them would match on ServiceId as well.
I need some way to elegantly keep track of, in a transaction, the 'best' match of a customer. If one matches 6 fields, and the others only match 5, that customer should be kept as a match to that record. In the case of multiple matching 5, and none matching more, the most recent LastTransaction date should be kept.
Any ideas would be quite appreciated.
Update: To be a little more clear, I am looking for a good way to verify the number of exact matches in the row of data, and choose which rows to associate based on that information. If the last name is 'Doe', it must exactly match the customer last name, to count as a matching parameter, rather than be a very close match.
For SQL Server 2005 and up, try:
;WITH SalesScore AS (
SELECT
s.PK_ID as S_PK
,c.PK_ID AS c_PK
,CASE
WHEN c.PK_ID IS NULL THEN 0
ELSE CASE WHEN s.ServiceId=c.ServiceId THEN 1 ELSE 0 END
+CASE WHEN (s.Address=c.Address AND s.ZipCode=c.ZipCode) THEN 1 ELSE 0 END
+CASE WHEN s.EmailAddress=c.EmailAddress THEN 1 ELSE 0 END
+CASE WHEN s.HomePhone=c.HomePhone THEN 1 ELSE 0 END
END AS Score
FROM Sales s
LEFT OUTER JOIN Customers c ON s.ServiceId=c.ServiceId
OR (s.Address=c.Address AND s.ZipCode=c.ZipCode)
OR s.EmailAddress=c.EmailAddress
OR s.HomePhone=c.HomePhone
)
SELECT
s.*,c.*
FROM (SELECT
S_PK,MAX(Score) AS Score
FROM SalesScore
GROUP BY S_PK
) dt
INNER JOIN Sales s ON dt.s_PK=s.PK_ID
INNER JOIN SalesScore ss ON dt.S_PK=ss.S_PK AND dt.Score=ss.Score
LEFT OUTER JOIN Customers c ON ss.c_PK=c.PK_ID
EDIT
I hate to write so much actual code when no schema was given, because I can't actually run this and be sure it works. However, to answer the question of how to handle ties using the last transaction date, here is a newer version of the above code:
;WITH SalesScore AS (
SELECT
s.PK_ID as S_PK
,c.PK_ID AS c_PK
,CASE
WHEN c.PK_ID IS NULL THEN 0
ELSE CASE WHEN s.ServiceId=c.ServiceId THEN 1 ELSE 0 END
+CASE WHEN (s.Address=c.Address AND s.ZipCode=c.ZipCode) THEN 1 ELSE 0 END
+CASE WHEN s.EmailAddress=c.EmailAddress THEN 1 ELSE 0 END
+CASE WHEN s.HomePhone=c.HomePhone THEN 1 ELSE 0 END
END AS Score
FROM Sales s
LEFT OUTER JOIN Customers c ON s.ServiceId=c.ServiceId
OR (s.Address=c.Address AND s.ZipCode=c.ZipCode)
OR s.EmailAddress=c.EmailAddress
OR s.HomePhone=c.HomePhone
)
SELECT
*
FROM (SELECT
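-- s.* and c.* cannot both be selected here: a derived table may not contain duplicate column names such as PK_ID
s.PK_ID AS S_PK,c.PK_ID AS C_PK,c.LastTransaction,row_number() over(partition by s.PK_ID order by c.LastTransaction DESC) AS RankValue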
FROM (SELECT
S_PK,MAX(Score) AS Score
FROM SalesScore
GROUP BY S_PK
) dt
INNER JOIN Sales s ON dt.s_PK=s.PK_ID
INNER JOIN SalesScore ss ON dt.S_PK=ss.S_PK AND dt.Score=ss.Score
LEFT OUTER JOIN Customers c ON ss.c_PK=c.PK_ID
) dt2
WHERE dt2.RankValue=1
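The score-then-tie-break idea can also be folded into a single ROW_NUMBER() ordering (score descending, then LastTransaction descending), which is a variant of the query above rather than the same statement. A minimal sketch, run through Python's sqlite3 module (SQLite 3.25 or later assumed); the PK_ID columns follow the answer, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (PK_ID INTEGER, ServiceId TEXT, Address TEXT, ZipCode TEXT,
                    EmailAddress TEXT, HomePhone TEXT);
CREATE TABLE Customers (PK_ID INTEGER, ServiceId TEXT, Address TEXT, ZipCode TEXT,
                        EmailAddress TEXT, HomePhone TEXT, LastTransaction TEXT);
-- Invented data: customers 10 and 12 both match sale 1 on four criteria,
-- customer 11 on two; sale 2 matches nobody.
INSERT INTO Sales VALUES
  (1, 'S1', '1 Elm St', '12345', 'doe@example.com', '555-0101'),
  (2, 'S7', '5 Oak Rd', '99999', 'nobody@example.com', '555-0199');
INSERT INTO Customers VALUES
  (10, 'S1', '1 Elm St', '12345', 'doe@example.com', '555-0101', '2009-01-01'),
  (11, 'S9', '1 Elm St', '12345', 'jane@example.com', '555-0101', '2009-06-01'),
  (12, 'S1', '1 Elm St', '12345', 'doe@example.com', '555-0101', '2009-03-01');
""")

query = """
WITH Scored AS (
    SELECT s.PK_ID AS S_PK, c.PK_ID AS C_PK, c.LastTransaction,
           CASE WHEN s.ServiceId = c.ServiceId THEN 1 ELSE 0 END
         + CASE WHEN s.Address = c.Address AND s.ZipCode = c.ZipCode THEN 1 ELSE 0 END
         + CASE WHEN s.EmailAddress = c.EmailAddress THEN 1 ELSE 0 END
         + CASE WHEN s.HomePhone = c.HomePhone THEN 1 ELSE 0 END AS Score
    FROM Sales s
    LEFT JOIN Customers c
           ON s.ServiceId = c.ServiceId
           OR (s.Address = c.Address AND s.ZipCode = c.ZipCode)
           OR s.EmailAddress = c.EmailAddress
           OR s.HomePhone = c.HomePhone
),
Ranked AS (
    -- Best score first; ties go to the most recent LastTransaction.
    SELECT S_PK, C_PK, Score,
           ROW_NUMBER() OVER (
               PARTITION BY S_PK
               ORDER BY Score DESC, LastTransaction DESC
           ) AS rn
    FROM Scored
)
SELECT S_PK, C_PK, Score FROM Ranked WHERE rn = 1 ORDER BY S_PK
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Sale 1 is paired with customer 12 (same score as customer 10, but a later transaction date), and sale 2 is kept with no customer and a score of 0.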
Here's a fairly ugly way to do this, using SQL Server code. Assumptions:
- Column CustomerId exists in the Customer table, to uniquely identify customers.
- Only exact matches are supported (as implied by the question).
SELECT top 1 CustomerId, LastTransaction, count(*) HowMany
from (select Customerid, LastTransaction
from Sales sa
inner join Customers cu
on cu.ServiceId = sa.ServiceId
union all select Customerid, LastTransaction
from Sales sa
inner join Customers cu
on cu.EmailAddress = sa.EmailAddress
union all select Customerid, LastTransaction
from Sales sa
inner join Customers cu
on cu.Address = sa.Address
and cu.ZipCode = sa.ZipCode
union all [etcetera -- repeat for each possible link]
) xx
group by CustomerId, LastTransaction
order by count(*) desc, LastTransaction desc
I dislike using "top 1", but it is quicker to write. (The alternative is to use ranking functions, and that would require either another subquery level or implementing it as a CTE.) Of course, if your tables are large this would fly like a cow unless you had indexes on all your columns.
Frankly I would be wary of doing this at all as you do not have a unique identifier in your data.
John Smith lives with his son John Smith and they both use the same email address and home phone. These are two people but you would match them as one. We run into this all the time with our data and have no solution for automated matching because of it. We identify possible dups and actually physically call and find out if they are dups.
I would probably create a stored function for that (in Oracle) and order by the highest match:
SELECT * FROM (
SELECT c.*, MATCH_CUSTOMER( c.Id, par1, par2, par3 ) matches FROM Customer c
) WHERE matches > 0 ORDER BY matches DESC
The function match_customer returns the number of matches based on the input parameters. I guess it is probably slow, as this query will always scan the complete customer table.
For close matches you can also look at a number of string similarity algorithms.
For example, in Oracle there is the UTL_MATCH.JARO_WINKLER_SIMILARITY function:
http://www.psoug.org/reference/utl_match.html
There is also the Levenshtein distance algorithm.
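For reference, a minimal sketch of Levenshtein distance in plain Python (the standard dynamic-programming formulation, not Oracle's implementation). The function name is illustrative only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]

print(levenshtein("kitten", "sitting"))  # prints 3
```

A distance of 0 means an exact match, so a small threshold on the distance is one way to treat near-identical names or addresses as candidates for manual review.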