SQL - Join tables with modified data

So, I have two tables in SQL Server 2008 R2:
Table A:
patient_id first_name last_name external_id
000001 John Smith 4753-23314.0
000002 Mike Davis 4753-12548.0
Table B:
guarantor_id visit_date first_name last_name
23314 01/01/2013 John Smith
12548 02/02/2013 Mike Davis
Notice that the guarantor_id from Table B matches the middle section of the external_id from Table A. Would someone please help me strip the 4753- from the front and the .0 from the back of the external_id so I can join these tables?
Any help/examples is greatly appreciated.

Assuming the prefix and suffix are always the same length, just do this:
SUBSTRING(external_id, 6, 5)
See the documentation for SUBSTRING for the argument details.
If the prefix and suffix lengths vary, use CHARINDEX to locate the dash and the dot. Note that CHARINDEX takes the string to find first and the string to search second:
SUBSTRING(external_id, CHARINDEX('-', external_id) + 1, CHARINDEX('.', external_id) - CHARINDEX('-', external_id) - 1)
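To show how such an expression fits into the join the question asks for, here is a minimal sketch using table variables loaded with the sample rows. The names @table_a and @table_b and all column types are assumptions. In particular, guarantor_id is assumed to be a character column; if yours is numeric, you would need to CAST one side of the comparison.

```sql
-- Sketch only: table variables stand in for the real tables (names and types are assumptions).
DECLARE @table_a TABLE (patient_id CHAR(6), first_name VARCHAR(50), last_name VARCHAR(50), external_id VARCHAR(20));
DECLARE @table_b TABLE (guarantor_id VARCHAR(10), visit_date DATE, first_name VARCHAR(50), last_name VARCHAR(50));

INSERT INTO @table_a VALUES
    ('000001', 'John', 'Smith', '4753-23314.0'),
    ('000002', 'Mike', 'Davis', '4753-12548.0');

INSERT INTO @table_b VALUES
    ('23314', '20130101', 'John', 'Smith'),
    ('12548', '20130202', 'Mike', 'Davis');

-- Join on the text between the first '-' and the first '.' of external_id.
SELECT a.patient_id, a.first_name, a.last_name, b.guarantor_id, b.visit_date
FROM @table_a AS a
INNER JOIN @table_b AS b
    ON b.guarantor_id = SUBSTRING(a.external_id,
                                  CHARINDEX('-', a.external_id) + 1,
                                  CHARINDEX('.', a.external_id) - CHARINDEX('-', a.external_id) - 1);
```

With the sample data, each patient row pairs with the visit row whose guarantor_id matches the middle section of its external_id.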

Try this one
SELECT *
FROM TABLE_A inner join TABLE_B on TABLE_A.external_id like '%'+TABLE_B.guarantor_id+'%'

This also works. :)
select LEFT(right(external_id, 7), 5)
from table_a

As @woz said, you can use SUBSTRING. If the length is not fixed, use the CHARINDEX function to find the positions of the dash and the dot, which makes the expression more flexible.
On another note, joining on a function result will usually hurt performance, because the function has to be evaluated for every row and an index on external_id cannot be used for the join. I suggest updating the field with the function result, or creating a new column STRIPPED_GUARANTOR_ID that holds the stripped value, and then joining on that column.
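One way to get such a column without maintaining it by hand is a persisted computed column, sketched below. The temporary table #table_a, its column types, and the index name are assumptions for illustration; GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Sketch only: #table_a is a temporary stand-in for Table A (name and types are assumptions).
CREATE TABLE #table_a (
    patient_id  CHAR(6)     NOT NULL,
    external_id VARCHAR(20) NOT NULL
);

INSERT INTO #table_a VALUES ('000001', '4753-23314.0'), ('000002', '4753-12548.0');

-- Persisted computed column: the stripped value is stored once instead of
-- being recalculated for every row at join time.
-- The CASE guard yields NULL when the dash or dot is missing, so SUBSTRING
-- never receives a negative length.
ALTER TABLE #table_a
ADD stripped_guarantor_id AS
    CASE WHEN CHARINDEX('-', external_id) > 0
          AND CHARINDEX('.', external_id) > CHARINDEX('-', external_id)
         THEN SUBSTRING(external_id,
                        CHARINDEX('-', external_id) + 1,
                        CHARINDEX('.', external_id) - CHARINDEX('-', external_id) - 1)
    END PERSISTED;
GO

-- An index on the new column lets a join seek on it.
CREATE INDEX ix_table_a_stripped ON #table_a (stripped_guarantor_id);

SELECT patient_id, external_id, stripped_guarantor_id FROM #table_a;

DROP TABLE #table_a;
```

The trade-off is a little extra storage and write cost in exchange for a join on a plain, indexable column.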

Use SUBSTRING and CHARINDEX, so long as you are looking for the value between the first '-' and the first '.' characters:
SUBSTRING (
external_id,
CHARINDEX('-', external_id) + 1,
CHARINDEX('.', external_id) - CHARINDEX('-', external_id) - 1
)

Related

SQL WHERE column values into capital letters

Let's say I have the following entries in my database:
Id Name
12 John Doe
13 Mary anne
13 little joe
14 John doe
In my program I have a string variable that is always capitalized, for example:
myCapString = "JOHN DOE"
Is there a way to retrieve the rows in the table by using a WHERE on the name column with the values capitalized and then matching myCapString?
In this case the query would return two entries, one with id=12, and one with id=14
A solution is NOT to change the actual values in the table.
A general solution in Postgres would be to capitalize the Name column and then do a comparison against an all-caps string literal, e.g.
SELECT *
FROM yourTable
WHERE UPPER(Name) = 'JOHN DOE';
If you need to implement this in Knex, you will need to figure out how to uppercase a column. This might require using a raw query.
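If the table is large, wrapping the column in UPPER() can prevent a plain index on the column from being used. A minimal PostgreSQL sketch with an expression index follows; the table name people_sketch and the index name are hypothetical stand-ins.

```sql
-- Sketch only (PostgreSQL): people_sketch is a hypothetical stand-in for the real table.
CREATE TABLE people_sketch (id INT, name TEXT);

INSERT INTO people_sketch (id, name) VALUES
    (12, 'John Doe'),
    (13, 'Mary anne'),
    (13, 'little joe'),
    (14, 'John doe');

-- Expression index so the comparison on UPPER(name) can use an index.
CREATE INDEX people_sketch_upper_name_idx ON people_sketch (UPPER(name));

-- Returns the rows with id 12 and 14; the stored values are left unchanged.
SELECT id, name
FROM people_sketch
WHERE UPPER(name) = 'JOHN DOE';
```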

Sorting SQL query after search result

I have a table with a FULL_NAME column. I run a search on it and get results, and I need to sort the resulting rows by the position of the search term within FULL_NAME.
Example Of data
Column (FULL_NAME)
ID FULL_NAME
1 zaid said Alabri
2 said sleem salim AlAhmedi
3 salim zaid Ahmed AlZaid
4 Ahmed said zaid AlSalimi
Search example:
SELECT FULL_NAME
FROM Table
WHERE (FULL_NAME LIKE N'%Ahmed%')
I need result to be like this
ID FULL_NAME
1 Ahmed said zaid AlSalimi
2 said Ahmed salim AlAhmedi
3 salim zaid Ahmed AlZaid
4 said sleem salim AlAhmedi
This is the result I'm looking for: the rows sorted by the position of the search term.
Since you have tagged SQL Server as your database please try this:
select *
from table
where full_name like N'%Ahmed%'
order by charindex('Ahmed',full_name,1)
Something like this could work for you:
select *
from table
where full_name like N'%Ahmed%'
order by charindex(N'Ahmed', full_name)
The CHARINDEX function returns the 1-based position at which the search term first occurs in the string, or 0 if it is not found.
If you are using SQL Server, you can order by the result of CHARINDEX, as follows:
SELECT EnglishName
FROM Employees
where EnglishName like N'%Ahmed%'
order by charindex(N'Ahmed',EnglishName,1)
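The answers above can be tried against the sample rows from the question with a sketch like this one. The table variable @names and the variable @term are assumptions for illustration, and the search term is used unescaped, so LIKE wildcards inside it would be treated as wildcards.

```sql
-- Sketch only: @names stands in for the real table (name and types are assumptions).
DECLARE @names TABLE (id INT, full_name NVARCHAR(100));
INSERT INTO @names VALUES
    (1, N'zaid said Alabri'),
    (2, N'said sleem salim AlAhmedi'),
    (3, N'salim zaid Ahmed AlZaid'),
    (4, N'Ahmed said zaid AlSalimi');

DECLARE @term NVARCHAR(50) = N'Ahmed';

-- Rows whose match starts earliest come first; id breaks ties.
SELECT id, full_name, CHARINDEX(@term, full_name) AS match_pos
FROM @names
WHERE full_name LIKE N'%' + @term + N'%'
ORDER BY CHARINDEX(@term, full_name), id;
-- Returns ids 4, 3, 2 with match_pos 1, 12, 20; id 1 contains no match and is filtered out.
```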

Using a Table-Valued Function to Turn a Single Row into Many Within Select

Simple enough question I think.
I have a dataset, quite large, with a bit of free-text name data. I need to link this to our employee table.
There's a whole set of different ways people have entered the 'owner' into this field (John Smith, J.Smith, John Smith (JSMITH), Company:John Smith/Client: John Smith, etc.)
Most of these are fine, but the problem I have is with the ones where multiple names have been entered. For example; "John Smith / Joe Bloggs".
I have a pre-created Table-Valued function which takes in a string and a delimiter, then returns a table with the results of the split.
dbo.Split('John Smith / Joe Bloggs', '/')
id val
1 John Smith
2 Joe Bloggs
The issue I have is that I need these results to come back for each row within an existing dataset. So for example, my query selecting the Owner, RefNumber and OSProjectCode from my 'ProjectsActions' table containing the following data:
RefNumber OSProjectCode Owner
1 1234 Bill Baggins
2 1234 John Smith / Joe Bloggs
would come out looking like this:
RefNumber OSProjectCode Owner
1 1234 Bill Baggins
2 1234 John Smith
2 1234 Joe Bloggs
What I've tried so far is to join on the results of the function, but unsurprisingly it won't let me pass the column from ProjectsActions into the function like that.
SELECT a.val AS [Owner], pa.[RefNumber], pa.[OSProjectCode]
FROM dbo.ProjectsActions pa
INNER JOIN dbo.Split(pa.[Owner], '/') a
Msg 4104, Level 16, State 1, Line 1
The multi-part identifier "pa.Owner" could not be bound.
The only way I can think of doing this, which seems a little too bulky and messy, is the below:
;with base as(
SELECT
pa.RefNumber
, pa.OSProjectCode
, (SELECT val FROM dbo.Eval(pa.Owner) WHERE id = 1) AS [First]
, (SELECT val FROM dbo.Eval(pa.Owner) WHERE id = 2) AS [Second]
FROM ProjectsActions pa
)
SELECT
a.RefNumber
, a.OSProjectCode
, a.First AS [Owner]
FROM base a WHERE a.First IS NOT NULL
UNION ALL
SELECT
b.RefNumber
, b.OSProjectCode
, b.Second AS [Owner]
FROM base b WHERE b.Second IS NOT NULL
Surely there's a better way? Something more similar to my first attempt - joining to the results within each row?
Any feedback or ideas would be much appreciated.
Cheers,
Scott.
EDIT:
FYI if anyone comes across this with a similar issue but is missing the 'split' part: I use a function found elsewhere on stackoverflow. https://stackoverflow.com/a/14600765/1700309
You need to use an APPLY as your join.
SELECT
a.val AS [Owner],
pa.[RefNumber],
pa.[OSProjectCode]
FROM dbo.ProjectsActions pa
CROSS APPLY dbo.Split(pa.[Owner], '/') a
The CROSS APPLY acts like an INNER JOIN that passes the row-level value into your table-valued function. If your split function returns no rows when it can't split the value (NULL, empty, etc.), use OUTER APPLY instead so that the row isn't dropped from your result set; the function's columns then come back as NULL. You can also add a COALESCE to fall back to the original [Owner].
SELECT
COALESCE(a.val, pa.[Owner]) AS [Owner],
pa.[RefNumber],
pa.[OSProjectCode]
FROM dbo.ProjectsActions pa
OUTER APPLY dbo.Split(pa.[Owner], '/') a
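For anyone who wants to try the APPLY approach end to end, here is a self-contained sketch. dbo.SplitSketch is a hypothetical stand-in for the poster's dbo.Split (the real function may behave differently), and the table variable and column types are assumptions. Splitting 'John Smith / Joe Bloggs' on '/' leaves spaces around each piece, so the sketch also trims them.

```sql
-- Sketch only: dbo.SplitSketch is a hypothetical stand-in for the poster's dbo.Split.
CREATE FUNCTION dbo.SplitSketch (@s NVARCHAR(4000), @delim NCHAR(1))
RETURNS TABLE
AS
RETURN
(
    -- Walk the string delimiter by delimiter; end_pos = 0 marks the last piece.
    WITH pieces AS (
        SELECT 1 AS id, 1 AS start_pos, CHARINDEX(@delim, @s) AS end_pos
        UNION ALL
        SELECT id + 1, end_pos + 1, CHARINDEX(@delim, @s, end_pos + 1)
        FROM pieces
        WHERE end_pos > 0
    )
    SELECT id,
           SUBSTRING(@s, start_pos,
                     CASE WHEN end_pos > 0 THEN end_pos - start_pos ELSE 4000 END) AS val
    FROM pieces
);
GO

-- Sample rows from the question in a table variable (types are assumptions).
DECLARE @ProjectsActions TABLE (RefNumber INT, OSProjectCode INT, [Owner] NVARCHAR(200));
INSERT INTO @ProjectsActions VALUES
    (1, 1234, N'Bill Baggins'),
    (2, 1234, N'John Smith / Joe Bloggs');

-- One output row per split piece, with surrounding spaces trimmed.
SELECT LTRIM(RTRIM(a.val)) AS [Owner], pa.RefNumber, pa.OSProjectCode
FROM @ProjectsActions AS pa
CROSS APPLY dbo.SplitSketch(pa.[Owner], N'/') AS a
ORDER BY pa.RefNumber, a.id;
GO

DROP FUNCTION dbo.SplitSketch;
```

The recursive CTE is limited by the default recursion depth of 100, which is fine for a handful of names per field but not for very long lists.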

query, where field contains 2 t's

Is there a way to write a WHERE clause that matches only values containing exactly 2 t's, regardless of where they are located?
For example:
Matthew -- would work
Thanatos -- would work
Thanatos T -- would not work
Tom -- would not work
I've been Googling but can't find anything specific about this.
Any help is appreciated.
You could try
SELECT *
FROM Table
WHERE Field LIKE '%t%t%' AND Field NOT LIKE '%t%t%t%'
I'm curious which would be faster, this or Goat CO's answer.
You could use LEN() and REPLACE():
SELECT *
FROM Table
WHERE LEN(REPLACE(field,'t','tt')) - LEN(Field) = 2
Demo: SQL Fiddle
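Both answers can be compared on the names from the question with a sketch like the following. The table variable @t is an assumption, and the column is given an explicit case-insensitive collation so that 'T' and 't' are counted alike; under a case-sensitive collation 'Thanatos' would contain only one 't'.

```sql
-- Sketch only: @t stands in for the real table; the collation is an assumption.
DECLARE @t TABLE (name VARCHAR(50) COLLATE Latin1_General_CI_AS);
INSERT INTO @t VALUES ('Matthew'), ('Thanatos'), ('Thanatos T'), ('Tom');

-- Approach 1: at least two t's, but not three or more.
SELECT name
FROM @t
WHERE name LIKE '%t%t%' AND name NOT LIKE '%t%t%t%';

-- Approach 2: doubling every t lengthens the string by the number of t's.
SELECT name
FROM @t
WHERE LEN(REPLACE(name, 't', 'tt')) - LEN(name) = 2;

-- Both queries return Matthew and Thanatos.
```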

SQL: Select distinct based on regular expression

Basically, I'm dealing with a horribly set up table that I'd love to rebuild, but am not sure I can at this point.
So, the table is of addresses, and it has a ton of similar entries for the same address. But there are sometimes slight variations in the address (i.e., a room # is tacked on IN THE SAME COLUMN, ugh).
Like this:
id | place_name | place_street
1 | Place Name One | 1001 Mercury Blvd
2 | Place Name Two | 2388 Jupiter Street
3 | Place Name One | 1001 Mercury Blvd, Suite A
4 | Place Name, One | 1001 Mercury Boulevard
5 | Place Nam Two | 2388 Jupiter Street, Rm 101
What I would like to do in SQL (this is MSSQL), if possible, is run a query like:
SELECT DISTINCT place_name, place_street where [the first 4 letters of the place_name are the same] && [the first 4 characters of the place_street are the same].
to, I guess at this point, get:
Plac | 1001
Plac | 2388
Basically, then I can figure out what are the main addresses I have to break out into another table to normalize this, because the rest are just slight derivations.
I hope that makes sense.
I've done some research and I see people using regular expressions in SQL, but a lot of them seem to be using C scripts or something. Do I have to write regex functions and save them into the SQL Server before executing any regular expressions?
Any direction on whether I can just write them in SQL or if I have another step to go through would be great.
Or on how to approach this problem.
Thanks in advance!
Use the SQL function LEFT:
SELECT DISTINCT LEFT(place_name, 4)
I don't think you need regular expressions to get the results you describe. You just want to trim the columns and group by the results, which will effectively give you distinct values.
SELECT left(place_name, 4), left(place_street, 4), count(*)
FROM AddressTable
GROUP BY left(place_name, 4), left(place_street, 4)
The count(*) column isn't necessary, but it gives you some idea of which values might have the most (possibly) duplicate address rows in common.
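Run against the five sample rows from the question, the grouping looks like this. The table variable @AddressTable and the column types are assumptions for illustration.

```sql
-- Sketch only: @AddressTable stands in for the real table (name and types are assumptions).
DECLARE @AddressTable TABLE (id INT, place_name VARCHAR(100), place_street VARCHAR(100));
INSERT INTO @AddressTable VALUES
    (1, 'Place Name One',  '1001 Mercury Blvd'),
    (2, 'Place Name Two',  '2388 Jupiter Street'),
    (3, 'Place Name One',  '1001 Mercury Blvd, Suite A'),
    (4, 'Place Name, One', '1001 Mercury Boulevard'),
    (5, 'Place Nam Two',   '2388 Jupiter Street, Rm 101');

-- Group on the 4-character prefixes and count the rows in each group.
SELECT LEFT(place_name, 4)   AS name_prefix,
       LEFT(place_street, 4) AS street_prefix,
       COUNT(*)              AS row_count
FROM @AddressTable
GROUP BY LEFT(place_name, 4), LEFT(place_street, 4)
ORDER BY row_count DESC;
-- Plac | 1001 | 3
-- Plac | 2388 | 2
```

Note that every name in the sample starts with 'Plac', so here it is the street prefix that actually separates the two groups.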
I would recommend you look into Fuzzy Search Operations in SQL Server. You can match the results much better than what you are trying to do. Just google sql server fuzzy search.
Assuming at least SQL Server 2005 for the CTE:
;with cteCommonAddresses as (
select left(place_name, 4) as LeftName, left(place_street,4) as LeftStreet
from Address
group by left(place_name, 4), left(place_street,4)
having count(*) > 1
)
select a.id, a.place_name, a.place_street
from cteCommonAddresses c
inner join Address a
on c.LeftName = left(a.place_name,4)
and c.LeftStreet = left(a.place_street,4)
order by a.place_name, a.place_street, a.id