I am looking to return all names with more than one space in a single field.
For example, 'John Paul Smith'. I am using SQL Server Management Studio 2005.
For example, I have a patients table with forename and surname columns.
I want to return all forenames that hold a value such as 'John Paul Smith' in one field.
The query given seems to work on the surname field but not on the forename. I know for certain that the forename column has this type of data, but the query returns no results.
Oracle:
SELECT MyField
from MyTable
where REGEXP_INSTR (MyField, ' ', 1, 2, 0, 'i') > 0
SQL server:
SELECT MyField
from MyTable
where CHARINDEX(' ', MyField, charindex(' ',MyField)+1) > 0
MySQL
select MyField
from MyTable
where length(SUBSTRING_INDEX(MyField, ' ', 2)) < length(MyField)
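The SQL Server variant works by starting the second CHARINDEX search one character past the first space. As a rough illustration outside the database, the same nesting can be sketched in Python with str.find. The function name has_two_spaces is made up for this example, and note that find returns -1 (not 0) when nothing is found.

```python
def has_two_spaces(value: str) -> bool:
    # Position of the first space, or -1 if there is none.
    first = value.find(" ")
    # Search again starting just after the first hit. When there was no
    # first space, first + 1 == 0 and the search starts at the beginning,
    # which again finds nothing.
    second = value.find(" ", first + 1)
    return second != -1


checks = {name: has_two_spaces(name) for name in ("John", "John Smith", "John Paul Smith")}
```

Only 'John Paul Smith' should come back as True here, which mirrors the `> 0` test in the SQL Server query.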
Here are two solutions that in my opinion are easier to read/understand than JohnHC's.
It can't get any simpler. Use wildcards to search for (at least) two spaces.
SELECT * FROM your_table WHERE your_column LIKE '% % %';
Check the length after replacing the spaces
SELECT * FROM your_table WHERE LEN(your_column) - LEN(REPLACE(your_column, ' ', '')) >= 2;
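Both queries can be checked quickly against a throwaway table. The sketch below uses Python's built-in sqlite3 module purely as a test bench, so it is an assumption that the behaviour carries over to your server: SQLite spells the length function LENGTH rather than LEN, and the patients table and its sample rows are invented for the example.

```python
import sqlite3

# In-memory database with a few made-up forenames.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (forename TEXT)")
conn.executemany(
    "INSERT INTO patients VALUES (?)",
    [("John",), ("John Paul",), ("John Paul Smith",), ("Mary Ann Lee Jones",)],
)

# Approach 1: wildcard pattern that needs at least two spaces.
like_rows = [
    row[0]
    for row in conn.execute(
        "SELECT forename FROM patients WHERE forename LIKE '% % %' ORDER BY forename"
    )
]

# Approach 2: count spaces by comparing lengths before and after removing them.
count_rows = [
    row[0]
    for row in conn.execute(
        "SELECT forename FROM patients "
        "WHERE LENGTH(forename) - LENGTH(REPLACE(forename, ' ', '')) >= 2 "
        "ORDER BY forename"
    )
]
```

One thing to keep in mind on SQL Server: LEN ignores trailing spaces, so a value that ends in a space is counted differently there than in this SQLite sketch.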
Related
I want to return name of customers and order them, but their titles must be excluded in ordering.
SELECT name
FROM customers
WHERE name LIKE ...
ORDER BY name
By 'their titles' I mean things such as Dr., Sn., Lady, Sir, Mr., and Mrs.
Possible solution to your problem.
In Oracle:
regexp_replace(user_name, '^(MISS|MS\.|MS|MRS\.|MRS|MR\.|MR)\s*', '') as user_name
Also you can use REPLACE () function like:
REPLACE (user_name, 'MISS', '') as user_name
If you have a column structure like (mr | mrs | other) / space / username you can try this:
with users(user_name) as
(select 'mr username 1' from dual union all
select 'miss username 2' from dual union all
select 'other username 3' from dual )
select substr(user_name,instr(user_name,' ')+1) real_username from users
Output:
REAL_USERNAME
----------------
username 1
username 2
username 3
In MSSQL:
DECLARE @str VARCHAR(500) = 'Mr Sam'
SELECT Title,
       first_name,
       Substring(NAME, CASE
                         WHEN Charindex(' ', NAME) = 0 THEN 1
                         ELSE Charindex(' ', NAME)
                       END, Len(NAME)) last_name
FROM (SELECT CASE
               WHEN LEFT(@str, Charindex(' ', @str)) IN ( 'Mr', 'Mrs', 'Miss' ) THEN LEFT(@str, Charindex(' ', @str))
               ELSE ''
             END AS Title,
             CASE
               WHEN LEFT(@str, Charindex(' ', @str)) IN ( 'Mr', 'Mrs', 'Miss' ) THEN LEFT(Stuff(@str, 1, Charindex(' ', @str), ''), Charindex(' ', Stuff(@str, 1, Charindex(' ', @str), '')))
               ELSE LEFT(@str, Charindex(' ', @str))
             END AS first_name,
             CASE
               WHEN LEFT(@str, Charindex(' ', @str)) IN ( 'Mr', 'Mrs', 'Miss' ) THEN Stuff(@str, 1, Charindex(' ', @str), '')
               ELSE @str
             END NAME) a
Technically, using a % before the name will allow any prefix, which would get you the desired result (e.g. WHERE name LIKE '%' + @name). However, this is not recommended on larger data sets, as you will see significant performance issues with this approach.
You need to be more specific. First of all: Which SQL "flavor" are you using? Postgres? Oracle? MySql? This is very important because each engine has different functions. Every time you ask an SQL question in SO, be sure to include at the very least a tag mentioning which DBMS you're using.
Now, what do you mean by "I shouldn't return them". Do you mean you should not return records which have a prefix, or do you mean you need to return the records but without the prefix? (So if you have Dr. Henry Gutierrez, do you exclude him from the result? Or have it output as Henry Gutierrez?)
This is also a good place for another tip: Always write an expected output in your questions. "If X is Y then I expect the output to be this:"
If you need to exclude them entirely, you can use a REGEXP match (Once again, I cannot list a specific function because I have no idea which type of SQL you're using) Something like WHERE REGEXP(UPPER(COL)) NOT ('^(MS|MR)\s+.*$')
If it's the 2nd case, that's going to be much harder because you'd need to get a substring which excludes the prefix, but the prefixes all have different sizes so you can't just write a "one size fits all"
In general, it's a bad normalization practice to have prefixes in your SQL database. You should have a column called PREFIX and another column for the name itself.
EDIT: Based on your answer. This is the closest you can get to achieving what you want.
SELECT NAME
FROM (SELECT NAME,
             CASE WHEN NAME LIKE 'MR %' THEN SUBSTRING(NAME, 4)
                  WHEN NAME LIKE 'MRS %' THEN SUBSTRING(NAME, 5)
                  ELSE NAME END AS NAME2
      FROM YOUR_TABLE ORDER BY NAME2) AS SUBQ
I am 99% sure this will not return the results ordered, though, because the ORDER BY inside the subquery might be ignored by the outer query. You can also just run what's in the subquery as the main query instead, but this will output 2 columns.
***My reply is in pseudocode. It is not exactly MySql syntax, you will need to check MySql documentation for the actual substring function implementation it has and for the CASE syntax.
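To make the "order by the name without its title" idea concrete, here is a minimal sketch in Python rather than SQL. It assumes titles appear only at the start of the value, optionally followed by a dot, and the title list, the sample names, and the helper name sort_key are all invented for the illustration.

```python
import re

# Hypothetical title list; longer alternatives (mrs) come before shorter ones (mr).
TITLE_RE = re.compile(r"^(dr|sn|lady|sir|mrs|mr|ms|miss)\.?\s+", re.IGNORECASE)


def sort_key(name: str) -> str:
    # Drop a leading title (if any) and compare case-insensitively.
    return TITLE_RE.sub("", name).lower()


names = ["Mr. Zed Young", "Dr. Henry Gutierrez", "Alice Brown", "Mrs. Carol White"]

# The original values are returned unchanged; only the ordering ignores the title.
ordered = sorted(names, key=sort_key)
```

This corresponds to putting the CASE expression directly in the ORDER BY clause, so that the title is still shown but does not affect the ordering.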
I have a column email with multiple delimiters like space ,/ , .
email
/john#thundergroup.com.mi/chris#cup.com.ey
r.Info#bc.com / rudi.an#yy.com
Dal#pema.com/Al#ama.com
/randi#mv.com
zul#sd.com/sat#sd.com/ faze#sd.com
My query:
select email,
CASE WHEN CHARINDEX(' ', email) > 0 THEN SUBSTRING(email, 0, CHARINDEX(' ', email)) ELSE
email END as Emailnew
FROM table
my output:
/john#thundergroup.com.mi/chris#cup.com.ey
r.Info#bc.com
Dal#pema.com/Al#ama.com
/randi#mv.com
zul#sd.com/sat#sd.com/ faze#sd.com
Please suggest changes so that in a single query I'm able to extract email
To always get the first email, you can try the logic below:
SELECT
CASE
WHEN CHARINDEX('/',email,2) = 0 THEN REPLACE(email,'/','')
ELSE REPLACE(SUBSTRING(email,0,CHARINDEX('/',email,2)),'/','')
END
FROM your_table
The output will be:
john#thundergroup.com.mi
r.Info#bc.com
Dal#pema.com
randi#mv.com
zul#sd.com
On modern SQL Servers try something like:
-- Setup...
create table dbo.Foo (
FooID int not null identity primary key,
Email nvarchar(100)
);
insert dbo.Foo (Email) values
('/john#thundergroup.com.mi/chris#cup.com.ey'),
('r.Info#bc.com / rudi.an#yy.com'),
('Dal#pema.com/Al#ama.com'),
('/randi#mv.com'),
('zul#sd.com/sat#sd.com/ faze#sd.com');
go
-- Demo...
select FooID, [Email]=value
from dbo.Foo
outer apply (
select top 1 value
from string_split(translate(Email, ' /', ';;'), ';')
where nullif(value, '') is not null
) Splitzville;
Which yields:
FooID Email
1 john#thundergroup.com.mi
2 r.Info#bc.com
3 Dal#pema.com
4 randi#mv.com
5 zul#sd.com
Requirements:
SQL Server 2016 and later for string_split().
SQL Server 2017 and later for translate().
If you want the first email only, use patindex():
select email,
left(email, patindex('%[^a-zA-Z0-9#.]%', email + ' ') - 1) as Emailnew
from table;
The characters in the pattern (a-zA-Z0-9#.) are the valid email characters. You may have additional ones that you care about.
Unfortunately, I notice that some of your lists start with delimiter characters. In my opinion, the above works correctly by returning an empty value. That said, your desired results are to get the second value in that case.
So, you have to start the search at the first valid email character:
select t.email,
left(v.email1, patindex('%[^-_a-zA-Z0-9#.]%', v.email1 + ' ') - 1) as Emailnew
from t cross apply
(values (stuff(t.email, 1, patindex('%[-_a-zA-Z0-9#.]%', t.email) - 1, ''))) v(email1);
Here is a db<>fiddle.
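The two-step idea (skip leading delimiters, then take the run of valid characters) can be sketched with a single regular expression in Python. This is only an illustration of the logic, using the same character class as the patindex() pattern and the sample rows from the question; the function name first_email is made up.

```python
import re

# Same character class as the patindex() pattern: letters, digits, '-', '_', '#', '.'.
VALID_RUN = re.compile(r"[-_a-zA-Z0-9#.]+")


def first_email(raw: str) -> str:
    # search() skips any leading delimiters and returns the first run of valid characters.
    match = VALID_RUN.search(raw)
    return match.group(0) if match else ""


rows = [
    "/john#thundergroup.com.mi/chris#cup.com.ey",
    "r.Info#bc.com / rudi.an#yy.com",
    "Dal#pema.com/Al#ama.com",
    "/randi#mv.com",
    "zul#sd.com/sat#sd.com/ faze#sd.com",
]
firsts = [first_email(row) for row in rows]
```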
I'm running a series of SQL queries to find data that needs cleaning up. One of them I want to do is look for:
2 or more uppercase letters in a row
starting with a lowercase letter
space then a lowercase letter
For example my name should be "John Doe". I would want it to find "JOhn Doe" or "JOHN DOE" or "John doe", but I would not want it to find "John Doe" since that is formatted correctly.
I am using SQL Server 2008.
The key is to use a case-sensitive collation, i.e. Latin1_General_BIN*. You can then use a query with a LIKE expression like the following (SQL Fiddle demo):
select *
from foo
where name like '%[A-Z][A-Z]%' collate Latin1_General_BIN --two uppercase in a row
or name like '% [a-z]%' collate Latin1_General_BIN --space then lowercase
*As per How do I perform a case-sensitive search using LIKE?, apparently there is a "bug" in the Latin1_General_CS_AS collation where ranges like [A-Z] fail to be case sensitive. The solution is to use Latin1_General_BIN.
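For comparison, the three rules from the question can be written as one case-sensitive regular expression. The sketch below is in Python, whose re module is case-sensitive by default, which plays the role of the binary collation. The function name needs_cleanup is invented, and the "starts with a lowercase letter" rule is included here even though the LIKE query above does not cover it.

```python
import re

# Two uppercase letters in a row, OR a lowercase first letter,
# OR a space followed by a lowercase letter.
BAD_CASING = re.compile(r"[A-Z]{2}|^[a-z]| [a-z]")


def needs_cleanup(name: str) -> bool:
    return BAD_CASING.search(name) is not None


samples = ["John Doe", "JOhn Doe", "JOHN DOE", "John doe", "john Doe"]
flagged = [name for name in samples if needs_cleanup(name)]
```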
First, I think you should make a function that returns a proper name (sounds like you need one anyway). See here under the heading "Proper Casing a Persons Name". Then find the ones that don't match.
SELECT Id, Name, dbo.ProperCase(Name)
FROM MyTable
WHERE Name <> dbo.ProperCase(Name) collate Latin1_General_BIN
This will help you clean up the data and tweak the function to what you need.
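You can use a regular expression. I'm not a SQL Server whiz, but you want to use RegexMatch (a user-defined CLR function that has to be installed first, since SQL Server 2008 has no built-in regex function). Something like this: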
select columnName
from tableName
where dbo.RegexMatch( columnName,
N'[A-Z]\W[A-Z]' ) = 1
If your goal is to update your column to capitalize the first character of each word (in your case firstName and lastName) , you can use the following query.
Create a sample table with data
Declare @t table (Id int IDENTITY(1,1),Name varchar(50))
insert into @t (name)values ('john doe'),('lohn foe'),('tohnytty noe'),('gohnsdf fgedsfsdf')
Update query
UPDATE @t
SET name = UPPER(LEFT(SUBSTRING(Name, 1, CHARINDEX(' ', Name) - 1), 1)) + RIGHT(SUBSTRING(Name, 1, CHARINDEX(' ', Name) - 1), LEN(SUBSTRING(Name, 1, CHARINDEX(' ', Name) - 1)) - 1) +
' ' +
UPPER(LEFT(SUBSTRING(Name, CHARINDEX(' ', Name) + 1, 8000), 1)) + RIGHT(SUBSTRING(Name, CHARINDEX(' ', Name) + 1, 8000), LEN(SUBSTRING(Name, CHARINDEX(' ', Name) + 1, 8000)) - 1)
FROM @t
Output
SELECT * FROM @t
Id Name
1 John Doe
2 Lohn Foe
3 Tohnytty Noe
4 Gohnsdf Fgedsfsdf
I use this way:
;WITH yourTable AS(
SELECT 'John Doe' As name
UNION ALL SELECT 'JOhn Doe'
UNION ALL SELECT 'JOHN DOE'
UNION ALL SELECT 'John doe'
UNION ALL SELECT 'John DoE'
UNION ALL SELECT 'john Doe'
UNION ALL SELECT 'jOhn dOe'
UNION ALL SELECT 'jOHN dOE'
UNION ALL SELECT 'john doe'
)
SELECT name
FROM (
SELECT name,
LOWER(PARSENAME(REPLACE(name, ' ', '.'), 1)) part2,
LOWER(PARSENAME(REPLACE(name, ' ', '.'), 2)) part1
FROM yourTable) t
WHERE name COLLATE Latin1_General_BIN = UPPER(LEFT(part1,1)) + RIGHT(part1, LEN(part1) -1) +
' ' + UPPER(LEFT(part2,1)) + RIGHT(part2, LEN(part2) -1)
Note:
This only works for two-part names; for names with more parts it would need to be improved.
I'm running SQL Server 2008. The title says it, but in my select statement I have this:
COALESCE( ca.AttributeList.value('(/AttributeList/IRName)[1]','varchar(max)')
,ca2.AttributeList.value('(/AttributeList/IRName)[1]','varchar(max)'))
AS IR_Name
and it returns lastName, FirstName
This becomes a problem when exporting to a csv as it creates two separate columns and what not.
I somehow have to figure out how to have the string be firstName lastName, no commas, and putting the firstName first.
SELECT PARSENAME(REPLACE('John , Doe', ',', '.'), 1) + ' ' + PARSENAME(REPLACE('John , Doe', ',', '.'), 2)
This will switch 'John , Doe' to 'Doe John'. Now you just need to replace the 'John , Doe' literal with the
COALESCE(ca.AttributeList.value('(/AttributeList/IRName)[1]','varchar(max)'),ca2.AttributeList.value('(/AttributeList/IRName)[1]','varchar(max)')) AS IR_Name
Another way to do this is by using string functions:
select (case when IR_Name like '%,%'
then (ltrim(rtrim(substring(IR_NAME, charindex(',', IR_NAME)+1, 1000))) + ' ' +
ltrim(rtrim(left(IR_Name, charindex(',', IR_NAME) - 1)))
)
else IR_NAME
end) t
from (select COALESCE( ca.AttributeList.value('(/AttributeList/IRName)[1]','varchar(max)')
,ca2.AttributeList.value('(/AttributeList/IRName)[1]','varchar(max)'))
AS IR_Name
. . .
) t
The trimming functions are to remove extra spaces.
The parsename solution is clever. But, it looks strange since parsename is designed for multipart naming conventions, with up to four parts separated by periods.
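The string-function version boils down to "split at the first comma, trim both halves, and swap them". A minimal Python sketch of that logic follows. The function name first_last is hypothetical, and strip() removes all whitespace, whereas ltrim/rtrim remove spaces only.

```python
def first_last(ir_name: str) -> str:
    # Split at the first comma only, like charindex(',', IR_NAME).
    last, comma, first = ir_name.partition(",")
    if not comma:
        # No comma present: return the value unchanged (the ELSE branch).
        return ir_name
    return f"{first.strip()} {last.strip()}"


swapped = first_last("Doe, John")
unchanged = first_last("Madonna")
```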
I replace all blanks with # using this
SELECT *, REPLACE(NAME,' ','#') AS NAME2
which results in miss#test#blogs############## (a different number of #s depending on the length of the name).
I then delete all # signs after the name using this
select *, substring(Name2,0,charindex('##',Name2)) as name3
which then gives my desired results of, for example MISS#test#blogs
However, some weren't giving this result; they are null. This is because, annoyingly, some rows in the sheet I have read in don't have the spaces after the name.
Is there a CASE statement I can use so it only deletes # signs after the name if they are there in the first place?
Thanks
The function rtrim can be used to remove trailing spaces. For example:
select replace(rtrim('miss test blogs '),' ','#')
-->
'miss#test#blogs'
Example at SQL Fiddle.
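A quick way to convince yourself that trimming first gives the same result with and without trailing padding is to run the expression on both kinds of input. The sketch below uses Python's built-in sqlite3 module as a stand-in engine, which is an assumption: SQLite also has RTRIM and REPLACE, but it is not SQL Server. The helper name hash_spaces is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def hash_spaces(name: str) -> str:
    # Trim trailing spaces first, then turn the remaining spaces into '#'.
    row = conn.execute("SELECT REPLACE(RTRIM(?), ' ', '#')", (name,)).fetchone()
    return row[0]


padded = hash_spaces("miss test blogs      ")
unpadded = hash_spaces("miss test blogs")
```

Both calls should give the same value, so no CASE expression is needed to distinguish padded rows from unpadded ones.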
try this:
Declare @t table (name varchar(100),title varchar(100),forename varchar(100))
insert into @t
values('a b c','dasdh dsalkdk asdhl','asd dfg sd')
SELECT REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(name)),' ',' '+CHAR(7)),CHAR(7)+' ','') ,CHAR(7),'') AS Name,
REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(title)),' ',' '+CHAR(7)),CHAR(7)+' ','') ,CHAR(7),'') AS title,
REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(forename)),' ',' '+CHAR(7)),CHAR(7)+' ','') ,CHAR(7),'') AS forename
FROM @t WHERE
(CHARINDEX(' ',NAME) > 0 or CHARINDEX(' ',title) > 0 or CHARINDEX(' ',forename) > 0)
SQL Fiddle Demo
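The nested REPLACE calls are easier to follow one step at a time. Below is a sketch of the same three replacements in Python, written exactly as the query appears above; it assumes the value never contains CHAR(7) itself, and the function name collapse_spaces is invented. Python's str.replace works left to right on non-overlapping matches, which is also how the SQL REPLACE is used here.

```python
MARKER = "\x07"  # CHAR(7), a control character assumed not to occur in the data


def collapse_spaces(value: str) -> str:
    trimmed = value.strip(" ")                  # LTRIM(RTRIM(...))
    marked = trimmed.replace(" ", " " + MARKER)  # every space becomes space + marker
    merged = marked.replace(MARKER + " ", "")    # marker + space pairs vanish, merging runs
    return merged.replace(MARKER, "")            # drop the one marker left per run


collapsed = collapse_spaces("  a   b  c ")
```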
select name2, left(name2,len(name2)+1-patindex('%[^#]%',reverse(name2)+'.'))
from (
SELECT *, REPLACE(NAME,' ','#') AS NAME2
from t
) x;
Check this SQL Fiddle
For posterity, sample table:
create table t (name varchar(100));
insert t select 'name#name#ne###'
union all select '#name#name'
union all select 'name name hi '
union all select 'joe public'
union all select ''
union all select 'joe'
union all select 'joe '
union all select null
union all select ' leading spaces'
union all select ' leading trailing ';
I don't quite understand the question, but if the problem is that there are no spaces after some names, can't you do this first:
SELECT *, REPLACE(NAME+' ',' ','#') AS NAME2
i.e., add a space to all names right off the bat?
I had this same problem some days ago.
Actually, there's a quick way to strip the spaces from both the beginning and the end of a string. In SQL Server, you can use RTRIM and LTRIM for this. The first removes spaces from the right side and the second removes them from the left. But if your data may also contain more than one space in the middle of the string, I suggest you take a look at this post on SQL Server Central: http://www.sqlservercentral.com/articles/T-SQL/68378/
There the script's author explains, in detail, a good solution for this situation.