sort by second string in database field

I have the SQL statement below, which is meant to sort an address field (AddressLine1) by street name rather than by house number. It seems to run fine, but I want the street names to appear alphabetically, and the ASC at the end of the ORDER BY doesn't help.
For example, the AddressLine1 field might contain
"5 Elm Close". A normal sort orders by the number; the statement below should sort by the second string, "Elm".
(Using SQL Server)
SELECT tblcontact.ContactID, tblcontact.Forename, tblcontact.Surname,
tbladdress.AddressLine1, tbladdress.AddressLine2
FROM tblcontact
INNER JOIN tbladdress
ON tblcontact.AddressID = tbladdress.AddressID
LEFT JOIN tblDonate
ON tblcontact.ContactID = tblDonate.ContactID
WHERE (tbladdress.CollectionArea = 'Queens Park')
GROUP BY tblcontact.ContactID, tblcontact.Forename, tblcontact.Surname,
tbladdress.AddressLine1, tbladdress.AddressLine2
ORDER BY REVERSE(LEFT(REVERSE(tbladdress.AddressLine1),
charindex(' ', REVERSE(tbladdress.AddressLine1)+' ')-1)) asc
Gordon's statement sorts as below
1 Kings Road
10 Olivier Way
11 Albert Street
11 Kings Road
11 Princes Road
120 High Street

Try this. I based it on Gordon's code but removed the LEFT(AddressLine1, 1) portion, because a single-character string can never match the pattern "n + space + %".
This works on my SQL-Server 2012 environment:
WITH tbladdress AS
(
SELECT AddressLine1 FROM (VALUES ('1 Kings Road'),('10 Olivier Way'), ('11 Albert Street')) AS V(AddressLine1)
)
SELECT
AddressLine1
FROM tbladdress
order by (case when tbladdress.AddressLine1 like '[0-9]% %'
then substrING(tbladdress.AddressLine1, charindex(' ', tbladdress.AddressLine1) + 1, len(tbladdress.AddressLine1))
else tbladdress.AddressLine1
end)
This is edited to be more similar to Gordon's code (position of closing parentheses, substr instead of substring):
order by (case when tbladdress.AddressLine1 like '[0-9]% %'
then substr(tbladdress.AddressLine1, charindex(' ', tbladdress.AddressLine1) + 1), len(tbladdress.AddressLine1)
else tbladdress.AddressLine1
end)

If you assume that the street name is the first or second value in a space separated string, you could try:
order by (case when left(tbladdress.AddressLine1, 1) like '[0-9]% %'
then substr(tbladdress.AddressLine1, charindex(' ', tbladdress.AddressLine1) + 1), len(tbladdress.AddressLine1) )
else tbladdress.AddressLine1
end)

I don't think you need to use REVERSE() at all. That seems like a trap.
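ORDER BY
-- First key: the street name (everything after the first space) when the first token is numeric.
CASE
WHEN ISNUMERIC(LEFT(tbladdress.AddressLine1,CHARINDEX(' ',tbladdress.AddressLine1 + ' ') - 1)) = 1
THEN RIGHT(tbladdress.AddressLine1,LEN(tbladdress.AddressLine1) - CHARINDEX(' ',tbladdress.AddressLine1))
ELSE tbladdress.AddressLine1
END,
-- Second key: the house number as an integer, so 2 sorts before 10 on the same street.
-- The appended ' ' keeps LEFT() from receiving a negative length when there is no space.
CASE
WHEN ISNUMERIC(LEFT(tbladdress.AddressLine1,CHARINDEX(' ',tbladdress.AddressLine1 + ' ') - 1)) = 1
THEN CAST(LEFT(tbladdress.AddressLine1,CHARINDEX(' ',tbladdress.AddressLine1 + ' ') - 1) AS INT)
ELSE NULL
END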
Also, you have a GROUP BY with no aggregate function. While that's not wrong, per se, it is weird. Just use DISTINCT if you're getting duplicate records.

This is the bit of code that works in SQL Server:
order by (case when tbladdress.AddressLine1 like '[0-9]% %'
then substring(tbladdress.AddressLine1, charindex(' ', tbladdress.AddressLine1) + 1, len(tbladdress.AddressLine1))
else tbladdress.AddressLine1
end)
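
For readers who want to check the ordering outside the database, here is a minimal Python sketch of the same idea. It is not T-SQL; street_sort_key is a hypothetical helper, and using the house number as a tie-breaker is an assumption borrowed from the ISNUMERIC answer rather than part of the ORDER BY above.

```python
import re


def street_sort_key(address):
    """Hypothetical helper: sort by street name first, then by house number."""
    # Mirrors LIKE '[0-9]% %': the value starts with a digit and has a space after it.
    match = re.match(r"(\d\S*) (.*)", address)
    if match is None:
        # No leading number: sort on the whole value, as the ELSE branch does.
        return (address.lower(), 0)
    prefix, street = match.groups()
    # Keep only the leading digits so a prefix such as "12a" still yields 12.
    number = int(re.match(r"\d+", prefix).group())
    return (street.lower(), number)


addresses = [
    "1 Kings Road",
    "10 Olivier Way",
    "11 Albert Street",
    "11 Kings Road",
    "11 Princes Road",
    "120 High Street",
]
sorted_addresses = sorted(addresses, key=street_sort_key)
for line in sorted_addresses:
    print(line)
```

With the sample addresses from the question, this lists Albert Street first and Princes Road last, with the two Kings Road entries ordered by house number.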

Related

Sql Server Splitting column value into email and name

I am new to sql. I need help with separating 2 values from a column value.
Example column value:
Sam Taylor <Sam.Taylor@gmail.com>
I need 2 columns from that column.
Name        Email
Sam Taylor  Sam.Taylor@gmail.com
TIA
https://www.db-fiddle.com/f/beu4tdDo4WFAwKXtt9KL8A/0

DECLARE @yourField nvarchar(100) = 'Sam Taylor <Sam.Taylor@gmail.com>'
SELECT
SUBSTRING(@yourField, 0, CHARINDEX('<', @yourField)) AS Name,
SUBSTRING(@yourField, CHARINDEX('<', @yourField) + 1, CHARINDEX('>', @yourField) - CHARINDEX('<', @yourField) - 1) AS Email

One way to do this is shown below, assuming the name itself never contains < or >.
select datastring,
name=max(case when row=1 then value else null end),
email=max(case when row=2 then value else null end)
from
(
select
datastring,
value=REPLACE(value,'>',''),
row=row_number() over (partition by datastring order by datastring)
from yourtable
cross apply STRING_SPLIT(datastring,'<')
)t
group by datastring

You can use string_split(), but like this:
select t.*, v.*
from t cross apply
(select max(case when s.value not like '%@%>' then trim(s.value) end) as name,
max(case when s.value like '%@%>' then replace(s.value, '>', '') end) as email
from string_split(t.full_email, '<') s
) v;
In older versions, you can use:
select ltrim(rtrim(left(full_email, charindex('<', full_email) - 1))) as name,
replace(stuff(full_email, 1, charindex('<', full_email), ''), '>', '') as email
from t;
Here is a db<>fiddle.
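
The split that these answers perform can be sketched in a few lines of Python, which may help when checking edge cases before writing the SQL. This is only an illustration: split_name_email is a hypothetical helper, and returning None when there is no "<" is an assumption, not something the answers specify.

```python
def split_name_email(value):
    """Hypothetical helper: split 'Name <email>' into (name, email)."""
    # Everything before the first '<' is the name; the rest is the address.
    name, separator, rest = value.partition("<")
    if not separator:
        # No '<' at all: treat the whole value as a name (an assumption).
        return value.strip(), None
    return name.strip(), rest.replace(">", "").strip()


name, email = split_name_email("Sam Taylor <Sam.Taylor@gmail.com>")
print(name)
print(email)
```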

Select query that will look at digits to left and right of dash and perform a select top based on both values

Here is an example of what the values of the column in my table will look like
18-0267, 19-0001, 19-0002, 19-SHOP
So what I need to do is first split the digits to the left of the '-' and see if those digits are in fact from the current year such as 19 = 2019 or 18 = 2018.
After this I need to get the characters to the right of the '-' and check whether they match '%[0-9]%'. If they do, I would like to select the TOP 1 in descending order, but that top value has to take the current year (the left-side digits) into account.
I thought that I had it from the query below, but that was until I realized I was not checking the digits to the left of the '-' to make sure the top value is from the current year
So from the numbers in the example above I would like to return the 19-0002 value but really I just want to return 0002 and right now the query is returning the value of 18-0267 and I am getting the 0267.
Any help is appreciated thank you
SELECT TOP 1
RIGHT(Name, CHARINDEX('-', REVERSE(Name)) - 1) AS 'Name'
FROM Job
WHERE RIGHT(Name, CHARINDEX('-', REVERSE(Name)) - 1) LIKE '%[0-9]%'
ORDER BY Name DESC

The problem is what name refers to. Try this:
SELECT TOP 1 RIGHT(Name, CHARINDEX('-', REVERSE(Name)) - 1) AS New_Name
FROM Job j
WHERE RIGHT(Name, CHARINDEX('-', REVERSE(Name)) - 1) LIKE '%[0-9]%'
ORDER BY j.Name DESC;
Your alias called Name was being confused with the column called Name.
If you want to ensure that the first two characters are for the current year, then you need to include that in the WHERE clause:
SELECT TOP 1 RIGHT(Name, CHARINDEX('-', REVERSE(Name)) - 1) AS New_Name
FROM Job j
WHERE RIGHT(Name, CHARINDEX('-', REVERSE(Name)) - 1) LIKE '%[0-9]%' AND
DATENAME(YEAR, GETDATE()) LIKE '__' + LEFT(NAME, 2)
ORDER BY j.Name DESC;

Here's a version that limits based on the leading two characters being the current two digit year:
select TOP 1 RIGHT(job.Name, CHARINDEX('-', REVERSE(job.Name)) - 1) name
from job
where job.name like left(CONVERT(VARCHAR(6), GETDATE(), 12),2) + '-[0-9]%'
order by 1 desc;
Relying on sorting by job.name in descending order will not work if you have future years in the job table.
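
The filtering logic can be sanity-checked with a small Python sketch. latest_job_suffix is a hypothetical helper, and it is stricter than the LIKE patterns above: it assumes the part after the dash must be all digits, whereas '-[0-9]%' only checks the first character.

```python
import re
from datetime import date


def latest_job_suffix(names, today=None):
    """Hypothetical helper: highest numeric suffix among this year's job names."""
    today = today or date.today()
    # Two-digit year plus the dash, e.g. "19-" for 2019.
    prefix = f"{today.year % 100:02d}-"
    suffixes = [
        name[len(prefix):]
        for name in names
        if name.startswith(prefix) and re.fullmatch(r"\d+", name[len(prefix):])
    ]
    # Compare numerically but return the original zero-padded text.
    return max(suffixes, key=int, default=None)


jobs = ["18-0267", "19-0001", "19-0002", "19-SHOP"]
print(latest_job_suffix(jobs, today=date(2019, 6, 1)))
```

Filtering on the year prefix first, rather than relying on the sort order, is what keeps a row from another year from winning the TOP 1.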

Count the number of not null columns using a case statement

I need some help with my query...I am trying to get a count of names in each house, all the col#'s are names.
Query:
SELECT House#,
COUNT(CASE WHEN col#1 IS NOT NULL THEN 1 ELSE 0 END) +
COUNT(CASE WHEN col#2 IS NOT NULL THEN 1 ELSE 0 END) +
COUNT(CASE WHEN col#3 IS NOT NULL THEN 1 ELSE 0 END) as count
FROM myDB
WHERE House# in (house#1,house#2,house#3)
GROUP BY House#
Desired results:
house 1 - the count is 3 /
house 2 - the count is 2 /
house 3 - the count is 1
...with my current query the results for count would be just 3's

In this case, it seems that counting names is the same as counting the commas (,) plus one:
SELECT House_Name,
LEN(Names) - LEN(REPLACE(Names,',','')) + 1 as Names
FROM dbo.YourTable;
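
The arithmetic behind the comma trick is easy to verify in Python. count_names is a hypothetical helper; returning 0 for an empty or missing value is an assumption, since the SQL expression as written would report 1 for an empty string.

```python
def count_names(names):
    """Hypothetical helper: number of comma-separated names in a string."""
    if names is None or not names.strip():
        # Assumption: an empty list of names counts as zero.
        return 0
    # Same idea as LEN(Names) - LEN(REPLACE(Names, ',', '')) + 1.
    return names.count(",") + 1


for house, names in [("house 1", "peter, paul, mary"),
                     ("house 2", "sarah, sally"),
                     ("house 3", "joe")]:
    print(house, count_names(names))
```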

Another option since Lamak stole my thunder, would be to split it and normalize your data, and then aggregate. This uses a common split function but you could use anything, including STRING_SPLIT for SQL Server 2016+ or your own...
declare @table table (house varchar(16), names varchar(256))
insert into @table
values
('house 1','peter, paul, mary'),
('house 2','sarah, sally'),
('house 3','joe')
select
t.house
,NumberOfNames = count(s.Item)
from
@table t
cross apply dbo.DelimitedSplit8K(names,',') s
group by
t.house

Notice how the answers you are getting are quite complex for what they're doing? That's because relational databases are not designed to store data that way.
On the other hand, if you change your data structure to something like this:
house name
1 peter
1 paul
1 mary
2 sarah
2 sally
3 joe
The query now is:
select house, count(name)
from housenames
group by house
So my recommendation is to do that: use a design that's more suitable for SQL Server to work with, and your queries become simpler and more efficient.
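
To see the normalized design in action without a SQL Server instance, here is a small sketch using Python's built-in sqlite3 module. SQLite stands in for SQL Server purely for illustration; the housenames table and its rows come from the example above.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE housenames (house INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO housenames (house, name) VALUES (?, ?)",
    [(1, "peter"), (1, "paul"), (1, "mary"),
     (2, "sarah"), (2, "sally"),
     (3, "joe")],
)

# One row per name, so counting is a plain GROUP BY.
counts = conn.execute(
    "SELECT house, COUNT(name) FROM housenames GROUP BY house ORDER BY house"
).fetchall()
conn.close()

for house, total in counts:
    print(house, total)
```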

One dirty trick is to replace commas with empty strings and compare the lengths:
SELECT house +
' has ' +
CAST((LEN(names) - LEN(REPLACE(names, ',', '')) + 1) AS VARCHAR) +
' names'
FROM mytable

You can parse using xml and find count as below:
Select *, a.xm.value('count(/x)','int') from (
Select *, xm = CAST('<x>' + REPLACE((SELECT REPLACE(names,', ','$$$SSText$$$') AS [*] FOR XML PATH('')),'$$$SSText$$$','</x><x>')+ '</x>' AS XML) from #housedata
) a

select House, 'has '+cast((LEN(Names)-LEN(REPLACE(Names, ',', ''))+1) as varchar)+' names'
from TempTable

Oracle SubStr for Description field

Hello I need help with the following Scenario.
There is a table with Company_Cd and Company_Name. All I need is the first 2 words of the company name if it has 3 or more words, and the first word if it has 2 words.
Example:
Company_Cd  Company_Name
123         ABC SOLUTIONS INC
345         XYZ GLOBAL TECH SOLUTIONS
899         NOWHERE COMPANY INC LTD
654         QSW SOLUTIONS
Desired Output:
Company_Cd  Company_Name
123         ABC SOLUTIONS
345         XYZ GLOBAL
899         NOWHERE COMPANY
654         QSW

You can use the instr function to find the 1st and 2nd occurrence of a space and then use substr accordingly:
SELECT c.company_name,
(
CASE
WHEN instr(c.company_name,' ',1,2) >0 THEN SUBSTR(c.company_name, 1, instr(c.company_name,' ',1,2))
WHEN instr(c.company_name,' ',1,2) =0 AND instr(c.company_name,' ',1,1) >0 THEN SUBSTR(c.company_name, 1, instr(c.company_name,' ',1,1))
ELSE c.company_name
END)
FROM customer c

The query below applies the same rule using MySQL functions (IF, SUBSTRING_INDEX and LIMIT are MySQL syntax, not Oracle):
SELECT Company_Cd,
IF((length(Company_Name) - length(replace(Company_Name, ' ', '')) + 1) >= 3,
SUBSTRING_INDEX(Company_Name, ' ', 2),
IF((length(Company_Name) - length(replace(Company_Name, ' ', '')) + 1) >= 2,
SUBSTRING_INDEX(Company_Name, ' ', 1),
Company_Name)) AS result
FROM company
LIMIT 20;

You can also use Regular Expression:
SELECT Company_Cd,
regexp_replace(company_name,'(((\w+)\s){'||CASE WHEN regexp_count(trim(company_name),' ') IN (0,1) THEN 1
ELSE 2
END||'}).*','\1' )
FROM customer;

select company_cd,
trim(substr(company_name, 1, instr(company_name || '  ', ' ', 1, 2) - 1))
from company_tbl;
This solution begins by adding two spaces at the end of company_name. It then finds the position of the second space in this extended string, keeps everything before that space, and trims the space left at the end (only needed if the company name is a single word; if all company names were guaranteed to have at least two words, the solution would be even simpler).
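
The rule stated in the question (two words for names of three or more words, otherwise one word) can be written out as a short Python sketch to compare against the desired output. short_company_name is a hypothetical helper, and splitting on any run of whitespace is an assumption.

```python
def short_company_name(name):
    """Hypothetical helper: first two words if there are 3+, otherwise the first word."""
    words = name.split()
    keep = 2 if len(words) >= 3 else 1
    return " ".join(words[:keep])


for company in ["ABC SOLUTIONS INC", "XYZ GLOBAL TECH SOLUTIONS",
                "NOWHERE COMPANY INC LTD", "QSW SOLUTIONS"]:
    print(short_company_name(company))
```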

T-SQL: Parsing names to ignore spaces and middle initials

I have a poorly maintained database that includes employee information. Human Resources requested a report that lists instances where the employee name associated with an insurance coverage does not match the name on the insurance policy.
There are inconsistencies in the formatting of the names in both tables. It's always last name then first name, but you might see any of the following in either table for a fictional employee named Steven J. Smith:
Smith, Steven
Smith,Steven
Smith, Steven J.
Smith,Steven J.
I need to run a query looking for instances where EMPLOYEE.EMP_NAME <> INSURANCE.SUBSCRIBER_NAME while allowing for the differences in name formatting shown above (i.e. recognizing that "Smith,Steven J." and "Smith, Steven" are (probably) the same person and ignoring them).
SELECT
EMPLOYEE.EMP_NO
, EMPLOYEE.EMP_NAME
, INSURANCE.SUBSCRIBER_NAME
, INSURANCE.PAYOR_NAME
FROM EMPLOYEE
INNER JOIN INSURANCE ON EMPLOYEE.EMP_NO = INSURANCE.EMP_NO
WHERE EMPLOYEE.EMP_NAME <> INSURANCE.SUBSCRIBER_NAME
I know I want to do a substring to ignore the middle initial, but how do I account for ignoring whether or not there is a space after the comma?

Why not just remove all commas and spaces with REPLACE?
WHERE REPLACE(REPLACE(EMPLOYEE.EMP_NAME,' ',''),',','') <> REPLACE(REPLACE(INSURANCE.SUBSCRIBER_NAME,' ',''),',','')

You could simply replace out the comma
WHERE replace (EMPLOYEE.EMP_NAME,',','') <> replace (INSURANCE.SUBSCRIBER_NAME,',','')
To find most mismatches...
;with cE as
(select
EMP_NO,
REPLACE(REPLACE(REPLACE(EMP_NAME,',',''),' ',''),'.','') as namekey
from EMPLOYEE),
ci as
(select
EMP_NO,
REPLACE(REPLACE(REPLACE(SUBSCRIBER_NAME,',',''),' ',''),'.','') as namekey
from INSURANCE)
select *
from ce
inner join ci on ce.EMP_NO = ci.EMP_NO
where
not
(
(LEN(ce.namekey)< LEN(ci.namekey) and ci.namekey like ce.namekey+'%')
or
(LEN(ce.namekey)>= LEN(ci.namekey) and ce.namekey like ci.namekey+'%')
)

You can remove the space after the comma and then remove the initials:
declare @Temp table (Name nvarchar(128))
insert into @Temp
select 'Smith, Steven' union all
select 'Smith,Steven' union all
select 'Smith, Steven J.' union all
select 'Smith,Steven J.'
select
case
when N1.Name like '% %' then left(N1.Name, charindex(' ', N1.Name))
else N1.Name
end as Name_New,
T.Name
from @Temp as T
outer apply (select replace(T.Name, ', ', ',') as Name) as N1

Thanks, your answers helped a lot. I ended up cutting the name into [lastname][firstname] with no spaces and cutting off the middle initial if it was there. Here's what eventually worked in eliminating the vast majority of the same-name matches:
((CASE
WHEN CHARINDEX(' ',REPLACE(REPLACE(EMPLOYEE.EMP_NAME,', ',''),',','')) = 0
THEN UPPER(REPLACE(REPLACE(EMPLOYEE.EMP_NAME,', ',''),',',''))
ELSE UPPER(LEFT(REPLACE(REPLACE(EMPLOYEE.EMP_NAME,', ',''),',',''),CHARINDEX(' ',REPLACE(REPLACE(EMPLOYEE.EMP_NAME,', ',''),',',''))))
END) <>
(CASE
WHEN CHARINDEX(' ',REPLACE(REPLACE(INSURANCE.SUBSCRIBER_NAME
,', ',''),',','')) = 0
THEN UPPER(REPLACE(REPLACE(INSURANCE.SUBSCRIBER_NAME
,', ',''),',',''))
ELSE UPPER(LEFT(REPLACE(REPLACE(INSURANCE.SUBSCRIBER_NAME
,', ',''),',',''),CHARINDEX(' ',REPLACE(REPLACE(INSURANCE.SUBSCRIBER_NAME
,', ',''),',',''))))
END))
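
The normalization described above, [lastname][firstname] in upper case with the middle initial dropped, can be sketched in Python to test sample names before running the report. name_key is a hypothetical helper; it splits on the comma first, so it will behave differently from the SQL for last names that contain a space.

```python
def name_key(full_name):
    """Hypothetical helper: 'Last, First M.' -> 'LASTFIRST' for loose comparison."""
    last, _, rest = full_name.partition(",")
    # The first token after the comma is the first name; anything after it
    # (such as a middle initial) is ignored.
    parts = rest.split()
    first = parts[0] if parts else ""
    return (last.strip() + first).upper()


variants = ["Smith, Steven", "Smith,Steven", "Smith, Steven J.", "Smith,Steven J."]
keys = {name_key(v) for v in variants}
print(keys)
```

All four formatting variants from the question collapse to a single key, so only genuinely different names would be reported as mismatches.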
