Remapping/Concatenating in SQL

I'm trying to reorder/group a set of results using SQL. I have a few fields (renamed for this example to something a bit less specific), and each logical group of records has one field that remains constant: the address field. There are also fields which are present for each address; these are the same for every address.
Source records:
id  forename  surname    address
1   John      These      Address1
2   Lucy      Values     Address1
3   Jenny     Are        Address1
4   John      All        Address2
5   Lucy      Totally    Address2
6   Jenny     Different  Address2
7   Steve     And        Address2
8   Richard   Blah       Address2

Desired result:
address   John   Lucy     Jenny      Steve   Richard
Address1  These  Values   Are        (null)  (null)
Address2  All    Totally  Different  And     Blah
For example, John, Lucy, Jenny, Steve and Richard are the only possible names at each address. I know this because it's stored in another location.
Can I select values from the actual records (the first table above) and return them as a result set like the second one? I'm using MySQL, if that makes a difference.

Assuming that the column headings "john", "lucy" etc are fixed, you can group by the address field and use if() functions combined with aggregate operators to get your results:
select max(if(forename='john',surname,null)) as john,
max(if(forename='lucy',surname,null)) as lucy,
max(if(forename='jenny',surname,null)) as jenny,
max(if(forename='steve',surname,null)) as steve,
max(if(forename='richard',surname,null)) as richard,
address
from tablename
group by address;
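It is a bit brittle though: the list of names is hard-coded, so the query has to be edited whenever a name is added or changed.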
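To see the grouping trick in action, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. SQLite has no if() function, so the portable CASE expression is used in its place; the table name tenants is an assumption made for the example, and the rows are the ones from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tenants (id INTEGER, forename TEXT, surname TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO tenants VALUES (?, ?, ?, ?)",
    [
        (1, "John", "These", "Address1"),
        (2, "Lucy", "Values", "Address1"),
        (3, "Jenny", "Are", "Address1"),
        (4, "John", "All", "Address2"),
        (5, "Lucy", "Totally", "Address2"),
        (6, "Jenny", "Different", "Address2"),
        (7, "Steve", "And", "Address2"),
        (8, "Richard", "Blah", "Address2"),
    ],
)

# One MAX(CASE ...) per expected forename. CASE yields NULL when the forename
# does not match, and MAX ignores NULLs, so each column keeps the one surname
# for that forename (or NULL if the address has no such person).
PIVOT_SQL = """
SELECT address,
       MAX(CASE WHEN forename = 'John'    THEN surname END) AS john,
       MAX(CASE WHEN forename = 'Lucy'    THEN surname END) AS lucy,
       MAX(CASE WHEN forename = 'Jenny'   THEN surname END) AS jenny,
       MAX(CASE WHEN forename = 'Steve'   THEN surname END) AS steve,
       MAX(CASE WHEN forename = 'Richard' THEN surname END) AS richard
FROM tenants
GROUP BY address
ORDER BY address
"""

pivot_rows = conn.execute(PIVOT_SQL).fetchall()
for row in pivot_rows:
    print(row)
```

The CASE form should also work in MySQL, since it is standard SQL. Note that SQLite compares strings case-sensitively by default, which is why the literals here are capitalised.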
There is also the group_concat function that can be used (within limits) to do something similar, but it will be ordered row-wise rather than column-wise as you appear to require.
For example (with a space between forename and surname so the names stay readable):
select address, group_concat( concat( forename, ' ', surname ) ) tenants
from tablename
group by address;
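For comparison, a rough sketch of the row-wise result, again using SQLite from Python as a stand-in for MySQL. SQLite's group_concat takes the separator as a second argument and uses || for string concatenation; the table name tenants and the two-column layout are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tenants (forename TEXT, surname TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO tenants VALUES (?, ?, ?)",
    [
        ("John", "These", "Address1"),
        ("Lucy", "Values", "Address1"),
        ("Jenny", "Are", "Address1"),
        ("John", "All", "Address2"),
        ("Steve", "And", "Address2"),
    ],
)

# One row per address, with all tenants collapsed into a single string.
# The order of names inside the string is not guaranteed.
tenants_by_address = dict(
    conn.execute(
        "SELECT address, group_concat(forename || ' ' || surname, ', ') "
        "FROM tenants GROUP BY address"
    )
)
for address, tenants in sorted(tenants_by_address.items()):
    print(address, "->", tenants)
```

Every address collapses to one string, so the caller has to split it again to get at individual names, which is the main limitation compared with the column-wise query.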

I'm not certain, but I think what you're trying to do is GROUP BY.
SELECT Address,Name FROM Table GROUP BY Name
If you want to select more columns, make sure they're included in the GROUP BY clause. Once you group, you can also use aggregate functions such as MAX() or COUNT().

I am not sure about the question, but from what I understand you can do:
SELECT concat(column1,column2,column3) as main_column, address from table;

Related

SQL join table on partial string

I have two tables in a Postgres database:
Table A:
**Middle_name**
John
Joe
Fred
Jim Bob
Paul-John
Table B:
**Full_name**
Fred, Joe, Bobda
Jason, Fred, Anderson
Tom, John, Jefferson
Jackson, Jim Bob, Sager
Michael, Paul-John, Jensen
Sometimes the middle name is hyphenated or contains a space, but it never contains a comma. If it is hyphenated or consists of two names, the entries will still be the same in both Table A and Table B.
I want to join the tables on Middle_name and Full_name. The difficult part is that the join has to check only the values between the commas in Full_name. Otherwise it might match the first name accidentally.
I've been using the query below but I just realized that there is nothing stopping it from matching the middle name to a first name accidentally.
SELECT Full_name, Middle_name
FROM B
JOIN A
ON POSITION(Middle_name IN Full_name)>0
I'm wondering how I can refactor this query to match only the middle name (assuming they all appear in the same format).
Use split_part(). Note that split_part('Fred, Joe, Bobda', ',', 2) returns ' Joe' with a leading space, so trim the result before comparing it to the middle name:
SELECT Full_name, Middle_name
FROM B
JOIN A
ON trim(split_part(B.Full_name, ',', 2)) = A.Middle_name
If there is always exactly one space after the comma, and everybody has a middle name like your sample data suggests, the space can just be part of the delimiter in split_part():
SELECT full_name, middle_name
FROM A
JOIN B ON split_part(B.full_name, ', ', 2) = A.middle_name;
Related:
Split comma separated column data into additional columns
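The difference between substring matching and matching only the second comma-separated field can be checked with a small sketch. It uses SQLite from Python, which has no split_part, so a Python function with the same name is registered as a rough stand-in for the PostgreSQL built-in (1-based field index, empty string when the field is missing). The stand-in and the use of instr() in place of POSITION are assumptions of the example, not PostgreSQL code.

```python
import sqlite3


def split_part(text, delimiter, n):
    """Rough stand-in for PostgreSQL's split_part (1-based, '' if out of range)."""
    parts = text.split(delimiter)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


conn = sqlite3.connect(":memory:")
conn.create_function("split_part", 3, split_part)
conn.execute("CREATE TABLE A (middle_name TEXT)")
conn.execute("CREATE TABLE B (full_name TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?)",
    [("John",), ("Joe",), ("Fred",), ("Jim Bob",), ("Paul-John",)],
)
conn.executemany(
    "INSERT INTO B VALUES (?)",
    [
        ("Fred, Joe, Bobda",),
        ("Jason, Fred, Anderson",),
        ("Tom, John, Jefferson",),
        ("Jackson, Jim Bob, Sager",),
        ("Michael, Paul-John, Jensen",),
    ],
)

# Substring join (the approach from the question): also matches first names
# and fragments of longer middle names.
substring_matches = conn.execute(
    "SELECT full_name, middle_name FROM B JOIN A "
    "ON instr(full_name, middle_name) > 0"
).fetchall()

# Field join: compares only the second ', '-separated field.
field_matches = conn.execute(
    "SELECT full_name, middle_name FROM B JOIN A "
    "ON split_part(B.full_name, ', ', 2) = A.middle_name"
).fetchall()

print(len(substring_matches), "substring matches")
print(len(field_matches), "field matches")
```

With the sample data the substring join pairs 'Fred' with "Fred, Joe, Bobda" (a first name) and 'John' with "Michael, Paul-John, Jensen" (part of a longer middle name), while the field join returns exactly one match per row of B.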

Selecting from table where a name appears twice

I want to select from a table where a name appears twice.
For example I have a table like this,
ID Name
---- ------
1 Jane John
2 Kevin Smith
3 Jane John
What I want is for the output to show the rows where Jane John appears twice, so it should look something like this:
ID Name
---- ------
1 Jane John
3 Jane John
I tried looking around on Stack Overflow but couldn't find an exact and easy answer.
I'm using Oracle SQL Developer.
You ask for a record that appears twice. If a row appears three times it won't show unless you modify the having clause as commented.
SELECT id, name
FROM TableN
WHERE name IN (
    SELECT name
    FROM TableN
    GROUP BY name
    HAVING COUNT(name) = 2 -- Use >1 instead of =2 for more than one record
)
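A quick way to check this kind of query is to run it on the sample rows. The sketch below does that with SQLite from Python as a stand-in for Oracle, using the > 1 variant of the HAVING clause; the table layout is taken from the question, and the lower-case table name is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablen (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO tablen VALUES (?, ?)",
    [(1, "Jane John"), (2, "Kevin Smith"), (3, "Jane John")],
)

# The subquery lists every name that occurs more than once;
# the outer query returns all rows carrying one of those names.
DUPLICATES_SQL = """
SELECT id, name
FROM tablen
WHERE name IN (
    SELECT name
    FROM tablen
    GROUP BY name
    HAVING COUNT(*) > 1
)
ORDER BY id
"""

duplicate_rows = conn.execute(DUPLICATES_SQL).fetchall()
for row in duplicate_rows:
    print(row)
```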
EDIT
I'll add a new solution in regard to your last comment.
As you can only ask for one field in IN(), I'll join the two fields with a special character or string, making sure it does not belong to the valid values of any field.
Look at this: http://sqlfiddle.com/#!6/2af55/3
SELECT id
,NAME
,name2
FROM tablen
WHERE concat(NAME,'=',name2) IN (
SELECT concat(NAME,'=',name2)
FROM TableN n
GROUP BY concat(NAME,'=',name2)
HAVING count(concat(NAME,'=',name2)) = 2
)
Note: I wrote this with SQL Server in mind. Oracle's CONCAT function accepts only two arguments, so in Oracle you would chain the values with the || operator instead (NAME || '=' || name2).

Query that returns a single column containing data from multiple columns (ORACLE)

I'm trying to write an Oracle SQL query that returns a single column containing values from multiple columns.
I have a table named CLIENT
clientid firstname Lastname
1 Steve Smith
2 James Hill
I want to return a single column "ALL" like so:
ALL
1
2
Steve
James
Smith
Hill
Is there a simple way to write this query?
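This involves UNION ALL. Since this is Oracle, quote the alias with double quotes (ALL is a reserved word) and, assuming ClientID is numeric, convert it with TO_CHAR so that every branch has the same data type:
SELECT TO_CHAR(ClientID) AS "ALL"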
FROM Client
UNION ALL
SELECT FirstName
FROM Client
UNION ALL
SELECT LastName
FROM Client
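One thing the query leaves open is row order: without an ORDER BY, nothing guarantees that the ids come out before the first names. The sketch below shows one way to pin the order down, using SQLite from Python as a stand-in for Oracle. The helper columns col_order and val are names invented for the example, and CAST(... AS TEXT) plays the role that TO_CHAR would play in Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE client (clientid INTEGER, firstname TEXT, lastname TEXT)")
conn.executemany(
    "INSERT INTO client VALUES (?, ?, ?)",
    [(1, "Steve", "Smith"), (2, "James", "Hill")],
)

# Each branch tags its rows with the position of the source column, so the
# outer ORDER BY can list all ids first, then first names, then last names.
UNPIVOT_SQL = """
SELECT val
FROM (
    SELECT 1 AS col_order, clientid, CAST(clientid AS TEXT) AS val FROM client
    UNION ALL
    SELECT 2, clientid, firstname FROM client
    UNION ALL
    SELECT 3, clientid, lastname FROM client
)
ORDER BY col_order, clientid
"""

all_values = [row[0] for row in conn.execute(UNPIVOT_SQL)]
print(all_values)
```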

Aggregate value by any of two columns

Suppose I have a Customers Table:
Customers
-----------------------------------------------
Id INTEGER
SSN NCHAR(11)
FullName NVARCHAR(100)
LastPurchaseDate DATETIME
There are many stores around the city, and the customer can be registered in any of them, each one giving him a different Id. Wherever he buys, the corresponding Id gets its LastPurchaseDate updated.
Now I need to get the Id corresponding to the 'latest' LastPurchaseDate by person. Problem is, due to X different reasons, there can be typos on either the SSN or the FullName. Let's say I have the next data:
Id SSN FullName LastPurchaseDate
----------- ----------- ------------- -----------------
200123 123-45-6789 John Doe 10-09-2015
201978 456-78-9012 Mary Jane 15-08-2015
380789 789-01-2345 Pete Zahut 01-08-2015
389236 123-45-6789 Jhon Doe 23-07-2015
215875 456-87-9012 Mary Jane 30-08-2015
974186 123456789 John Doe 28-04-2015
123758 789-01-2345 Pete Zaut 18-08-2015
A customer is considered to be the same person if it has either the same SSN or the same FullName. So in this sample, customers 200123, 389236 and 974186 are the same person. Therefore, the resulting Ids should be
200123
215875
123758
How can I achieve this?
Edit
So, the match has to be on either SSN or FullName, but it has to be exact; if both fields are different, even if it's by one character, it will be considered a different person. I hope the data will eventually be cleansed, but it will take time as there is a lot of info to trace and correct.
A first data-cleaning pass could be:
(select REPLACE(SSN, '-', '') as SSN,
Min(Id) as Id, Max(FullName) as FullName,
max(LastPurchaseDate) as LastPurchaseDate
from Customers group by REPLACE(SSN, '-', ''))
That merges SSNs that differ only in their hyphens. It also assumes that the lowest Id is the real Id, and takes the max of the name to avoid nulls.
You can refine this further by assuming that the longer name is the better one, using a length function.
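The matching rule from the question (same SSN or same FullName, applied transitively) can also be sketched outside SQL. The Python example below is not the cleaning query above but a different technique, a small union-find over the sample rows, shown only to make the grouping logic concrete. The variable and function names are invented for the example, and the dates are assumed to be in day-month-year format.

```python
from datetime import datetime

# Sample rows from the question: (Id, SSN, FullName, LastPurchaseDate).
customers = [
    (200123, "123-45-6789", "John Doe", "10-09-2015"),
    (201978, "456-78-9012", "Mary Jane", "15-08-2015"),
    (380789, "789-01-2345", "Pete Zahut", "01-08-2015"),
    (389236, "123-45-6789", "Jhon Doe", "23-07-2015"),
    (215875, "456-87-9012", "Mary Jane", "30-08-2015"),
    (974186, "123456789", "John Doe", "28-04-2015"),
    (123758, "789-01-2345", "Pete Zaut", "18-08-2015"),
]

# Union-find: every Id starts as its own group.
parent = {cid: cid for cid, _, _, _ in customers}


def find(cid):
    """Return the representative Id of the group that cid belongs to."""
    while parent[cid] != cid:
        parent[cid] = parent[parent[cid]]  # path halving keeps chains short
        cid = parent[cid]
    return cid


def union(a, b):
    """Merge the groups containing a and b."""
    parent[find(a)] = find(b)


# Link each record to the first record seen with the same SSN or the same name.
first_seen = {}
for cid, ssn, name, _ in customers:
    for key in (("ssn", ssn), ("name", name)):
        if key in first_seen:
            union(cid, first_seen[key])
        else:
            first_seen[key] = cid

# Within each group, keep the Id with the most recent purchase date.
latest = {}
for cid, _, _, date_text in customers:
    purchased = datetime.strptime(date_text, "%d-%m-%Y")
    root = find(cid)
    if root not in latest or purchased > latest[root][0]:
        latest[root] = (purchased, cid)

latest_ids = [cid for _, cid in latest.values()]
print(latest_ids)
```

On the sample data this yields the three Ids listed in the question. Doing the same transitive grouping purely in SQL would likely need a recursive query or an iterative update, which is beyond this sketch.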

Distinct ordering not in place

I have a temporary table making a list of names which are ordered by a different column, e.g.
#table:
John, 1
Mary, 3
Mary, 5
Mary, 7
John, 8
Kyle, 9
Brad, 10
When I call a simple select * from #table, that's what I get, but when I call select distinct name from #table I get this:
Kyle
John
Mary
Brad
Why is it not using in-place ordering? Is this a SQL quirk I don't know about? I would expect (and want) it to be:
John
Mary
Kyle
Brad
EDIT: Additional Question: Since I 'Ordered By' on the original table, is there a functional reason why it wouldn't persist?
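Without an ORDER BY clause, SQL does not guarantee any row order, so the order in which the rows were inserted does not persist. Also, when using SELECT DISTINCT you can't order by a column that's not being selected. The easiest way to do what you want is: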
SELECT name FROM #table GROUP BY name ORDER BY min(id);
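As a quick check of the GROUP BY / MIN() approach, the sketch below runs it on the sample rows using SQLite from Python. SQLite stands in for SQL Server here, so an ordinary table called names replaces the #table temp table; both the table name and the column name id are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (name TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [
        ("John", 1),
        ("Mary", 3),
        ("Mary", 5),
        ("Mary", 7),
        ("John", 8),
        ("Kyle", 9),
        ("Brad", 10),
    ],
)

# GROUP BY collapses duplicate names; MIN(id) gives each name the position
# of its first occurrence, which the ORDER BY then sorts on.
ordered_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM names GROUP BY name ORDER BY MIN(id)"
    )
]
print(ordered_names)
```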