Parsing a string and comparing values to existing column - sql

I have the below table with the string marked "Remark" that needs to be parsed. The highlighted fares need to be compared from the columns TotalBookedFare and Remark. The only issue is that the value I need to compare under the Remark column is in the middle of a string. I've tried to parse the string but I cannot figure it out. I am using SQL Server 2008. As you can see the first row is not a match while the other three are matching.
Ideally I would like to convert the one string "Remark" to the 5 columns listed below so I can compare the TotalBookedFare to the "New" column.

I think this should work:
select substring(
'xyz/57.77usd/zyx', --base string (a literal example standing in for Remark)
charindex('/', 'xyz/57.77usd/zyx') + 1,
--starting position: one to the right of the first / character (4 + 1 = 5)
charindex('u', 'xyz/57.77usd/zyx', charindex('/', 'xyz/57.77usd/zyx')) - charindex('/', 'xyz/57.77usd/zyx') - 1
--length: the position of the first u character,
--searching from the position of the first / character (10),
--minus the position of the first / character (4),
--minus an additional 1, which gives the length of the string to extract (5)
)
The string I put in there is just a concrete example. If you replace every occurrence of it with Remark and add a FROM clause for your table, it should extract the substring for each row. You could even adapt it with some copy/pasting to get each of the columns you were looking for.
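As a rough sketch of how the extraction could feed the comparison the question asks for: the table variable, its sample rows and the output column names below are hypothetical, and the Remark format is assumed to match the example string above (fare between the first / and the following u). Only Remark and TotalBookedFare come from the question.

```sql
-- Hypothetical sample data; the real Remark format is not shown in the question.
DECLARE @Bookings TABLE (TotalBookedFare decimal(10,2), Remark varchar(100));
INSERT INTO @Bookings VALUES (60.00, 'xyz/57.77usd/zyx');
INSERT INTO @Bookings VALUES (57.77, 'xyz/57.77usd/zyx');

SELECT b.TotalBookedFare,
       f.RemarkFare,
       CASE WHEN b.TotalBookedFare = CAST(f.RemarkFare AS decimal(10,2))
            THEN 'match' ELSE 'no match' END AS FareCheck
FROM @Bookings b
-- position of the first / in Remark (0 if absent)
CROSS APPLY (SELECT CHARINDEX('/', b.Remark) AS SlashPos) s
-- position of the first u at or after that /
CROSS APPLY (SELECT CHARINDEX('u', b.Remark, s.SlashPos) AS UPos) u
-- only call SUBSTRING when both markers were found, so the length is never negative
CROSS APPLY (SELECT CASE WHEN s.SlashPos > 0 AND u.UPos > s.SlashPos
                         THEN SUBSTRING(b.Remark, s.SlashPos + 1, u.UPos - s.SlashPos - 1)
                    END AS RemarkFare) f;
```

The CASE guard is there because SUBSTRING raises an error for a negative length, which is what the unguarded arithmetic produces when a Remark has no / or no u. The CAST will still fail if the extracted text is not numeric, so real data may need an extra check.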

Related

How to search for separated values in columns from a merged values column

I have a database where the data I need to work with is stored in two different columns. I also need to import an Excel file, and the data in this Excel file is all together, separated only by a dash. So I need to figure out either how to write a query (maybe with an alias) that combines the two columns, or how to split the Excel column on the dash and then query with the split data.
The code I was trying was the following:
SELECT
CAST(dbo_predios.codigo_manzana_predio as nvarchar(55)) + '-' +
CAST(dbo_predios.codigo_lote_predio as nvarchar(55)) as ROL_AVALUO
FROM dbo_predios
WHERE ROL_AVALUO like '%9132-2%'
That is one way I tried, but I don't really know how to split on a given symbol. The data in the Excel file comes in exactly the format I wrote in the "like" portion of the code.
I believe this is what you are after from the sounds of it:
SELECT
[locateDashInString] = CHARINDEX('-', e.FieldHere, 0) --just showing you where it finds the dash
,[SubstringBeforeItemLocated] =
SUBSTRING(
e.FieldHere --string to search from
,0 --starting index; starting at 0 rather than 1 makes the length below stop just before the dash
,CHARINDEX('-', e.FieldHere, 0) --position of the dash, used as the length
)
,[SubstringAfterItemLocated] =
SUBSTRING(
e.FieldHere --string to search from
,CHARINDEX('-', e.FieldHere, 0) + 1 --starting index for substring
,LEN(e.FieldHere) --length; any value at least as long as the remainder returns the rest of the string
)
FROM ExcelImportedDataTable e
The locateDashInString column is just to show you where it finds the '-' symbol; you don't actually need it. The other two columns are the split of the value, so '9132-2' is split into two values in two columns.
Just note that this will only work if the data always has the format val1-val2. As long as the format is the same it should be fine.
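A minimal sketch of how the two split values could then be matched against the two database columns. The table variables and sample rows are hypothetical stand-ins, and the int column types are an assumption, since the question does not state them.

```sql
-- Hypothetical stand-ins for dbo_predios and the imported Excel table.
DECLARE @predios TABLE (codigo_manzana_predio int, codigo_lote_predio int);
DECLARE @excel TABLE (FieldHere nvarchar(55));
INSERT INTO @predios VALUES (9132, 2);
INSERT INTO @predios VALUES (9132, 3);
INSERT INTO @excel VALUES (N'9132-2');

SELECT p.codigo_manzana_predio,
       p.codigo_lote_predio,
       e.FieldHere AS ROL_AVALUO
FROM @excel e
JOIN @predios p
  -- part before the dash, same expression as in the answer above
  ON CAST(p.codigo_manzana_predio AS nvarchar(55))
     = SUBSTRING(e.FieldHere, 0, CHARINDEX('-', e.FieldHere))
  -- part after the dash
 AND CAST(p.codigo_lote_predio AS nvarchar(55))
     = SUBSTRING(e.FieldHere, CHARINDEX('-', e.FieldHere) + 1, LEN(e.FieldHere));
```

Putting the expressions in the join condition also sidesteps the problem in the original attempt, where the WHERE clause refers to the ROL_AVALUO alias: a column alias defined in the SELECT list is not visible in the WHERE clause of the same query.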

SQL / REGEX pattern matching

I want to use regex in SQL to query some data and return values. Of the rows below, the only valid values to return would be "GB" and "LDN", or alternatively "GB-LDN":
G-GB-LDN-TT-TEST
G-GB-LDNN-TT-TEST
G-GBS-LDN-TT-TEST
As written, the first set (GB) needs to have exactly 2 characters, and the second (LDN) needs to have exactly 3 characters. The two sets/groups are separated by a - symbol. I need to extract the data but at the same time ensure it fits that pattern. I took a look at regex, and it seems similar to substring, but I can't see how to do it.
If I understand correctly, you could still use the substring() function to extract the string parts separated by -.
select left(parsename(a.string, 3), 2) +'-'+ left(parsename(a.string, 2) ,3) from
(
select replace(substring(data, 1, len(data)-charindex('-', reverse(data))), '-', '.') [string] from <table>
) a
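As above, left() also lets you fix the length of each extracted part. Note that a longer segment is truncated to that length rather than rejected, which is why all three rows return the same value.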
Result :
GB-LDN
GB-LDN
GB-LDN
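If the rows that do not fit the pattern should be excluded rather than truncated, one possible approach is a LIKE filter with character classes. This is only a sketch under assumptions: the table variable is hypothetical, the leading segment is assumed to be a single character, and the groups are assumed to contain only letters.

```sql
-- Hypothetical table holding the three sample rows from the question.
DECLARE @t TABLE (data varchar(50));
INSERT INTO @t VALUES ('G-GB-LDN-TT-TEST');
INSERT INTO @t VALUES ('G-GB-LDNN-TT-TEST');
INSERT INTO @t VALUES ('G-GBS-LDN-TT-TEST');

-- Pattern: one character, dash, exactly 2 letters, dash, exactly 3 letters, dash, anything.
-- With a fixed-width prefix, the "GB-LDN" part always sits at positions 3 to 8.
SELECT SUBSTRING(data, 3, 6) AS code
FROM @t
WHERE data LIKE '_-[A-Z][A-Z]-[A-Z][A-Z][A-Z]-%';
```

With the sample rows, only the first row passes the filter, because the other two have a letter where the pattern requires a dash. How [A-Z] treats lower-case letters depends on the column's collation.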

Regex to split values in PostgreSQL

I have a list of values coming from a PGSQL database that looks something like this:
198
199
1S
2
20
997
998
999
C1
C10
A
I'm looking to parse this field a bit into individual components, which I assume would take two regexp_replace function uses in my SQL. Essentially, any non-numeric character that appears before numeric ones needs to be returned for one column, and the other column would show all non-numeric characters appearing AFTER numeric ones.
The above list would then be split into this layout as the result from PG:
I have created a function that strips out the non-numeric characters (the last column) and casts it as an Integer, but I can't figure out the regex to return the string values prior to the number, or those found after the number.
All I could come up with so far, with my next to non-existent regex knowledge, was this: regexp_replace(fieldname, '[^A-Z]+', '', 'g'), which just strips out anything that is not A-Z, but I can't get it to work with strings before numeric values, or after them.
For extracting the characters before the digits:
regexp_replace(fieldname, '\d.*$', '')
For extracting the characters after the digits:
regexp_replace(fieldname, '^([^\d]*\d*)', '')
Note that:
if there are no digits, the first will return the original value and the second an empty string. This way the concatenation of the parts is still equal to the original value in this case.
the concatenation of the three parts will not return the original if there are non-numerical characters surrounded by digits: those will be lost.
This also works for any non-alphanumeric characters like #, [, ! ...etc.
Final SQL
select
fieldname as original,
regexp_replace(fieldname, '\d.*$', '') as before_s,
regexp_replace(fieldname, '^([^\d]*\d*)', '') as after_s,
cast(nullif(regexp_replace(fieldname, '[^\d]', '', 'g'), '') as integer) as number
from mytable;
See fiddle.
This answer relies on information you delivered, which is
Essentially, any non-numeric character that appears before numeric
ones needs to be returned for one column, and the other column would
show all non-numeric characters appearing AFTER numeric ones.
Everything non-numeric before a numeric value goes into column 1
Everything non-numeric after a numeric value goes into column 2
So the assumption is that every value contains a numeric part.
select
val,
regexp_matches(val,'([a-zA-Z]*)\d+') AS before_numeric,
regexp_matches(val,'\d+([a-zA-Z]*)') AS after_numeric
from
val;
Attached SQLFiddle for a preview.
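One thing to be aware of with the query above is that regexp_matches is a set-returning function that yields text arrays, so rows without a match produce no output row. A possible alternative, sketched here with a hypothetical inline sample in place of the real table, is substring(... from pattern), which returns plain text and NULL when nothing matches.

```sql
-- Hypothetical sample values taken from the list in the question.
WITH vals(val) AS (
    VALUES ('198'), ('1S'), ('C10'), ('A')
)
SELECT val,
       -- non-digits from the start of the string up to the first digit
       substring(val from '^([^\d]*)\d') AS before_numeric,
       -- non-digits from the last digit to the end of the string
       substring(val from '\d([^\d]*)$') AS after_numeric,
       -- all digits, or NULL when the value has none
       nullif(regexp_replace(val, '[^\d]', '', 'g'), '')::integer AS number
FROM vals;
```

When the pattern contains parentheses, substring returns the text matched by the first parenthesized group, so a value such as 'A' stays in the result with NULLs instead of disappearing.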

how to get regexp_substr for a string

In my table for the rows containing values like
sample>test Y10,
Sample> y21
I want to get a substring like y10, y21 from all rows. How can I get it? I tried regexp_substr and instr but was not able to find a solution.
I am assuming that the string in your column is divided by a single space.
This gives you the last part of the string after splitting on the space ' ':
substr(your_string, instr(your_string, ' ', -1) + 1)
Or you can achieve this using regexp_substr:
regexp_substr(your_string, '[^[:space:]]+$')
Assuming that yxx is always preceded by a space, it should be as easy as doing this:
TRIM(REGEXP_SUBSTR(mycolumn, ' y\d+', 1, 1, 'i'))
The above regular expression will grab y (note that it is case-insensitive, so it will grab Y as well) followed by an indefinite number (one or more) of digits. If you want to grab just two digits, replace \d+ with \d{2}.
Also, please note that it will get the first occurrence only. Getting multiple occurrences is a bit more complicated, but it can still be done.
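For the multiple-occurrence case, one commonly used Oracle approach is to generate one row per match with CONNECT BY LEVEL and pass LEVEL as the occurrence argument. The sketch below assumes a single hypothetical input row and Oracle 11g or later for REGEXP_COUNT. Running it against a multi-row table needs extra CONNECT BY conditions to keep the rows apart.

```sql
-- Hypothetical single-row input containing two y-values.
WITH t AS (
    SELECT 'sample>test Y10 y21' AS mycolumn FROM dual
)
SELECT TRIM(REGEXP_SUBSTR(mycolumn, ' y\d+', 1, LEVEL, 'i')) AS token
FROM t
-- one output row per match of the pattern
CONNECT BY LEVEL <= REGEXP_COUNT(mycolumn, ' y\d+', 1, 'i');
```

Each generated row asks REGEXP_SUBSTR for the LEVEL-th occurrence, so the sample string is intended to yield Y10 and y21 as separate rows.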

Split string and replace

I have a varchar column with URLs, with data that looks like this:
http://google.mews.......http://www.somesite.com
I want to get rid of the first http and keep the second one so the row above would result in:
http://www.somesite.com
I've tried using split() but can't get it to work
Thanks
If you are trying to do this using T-SQL, you can try something along the lines of:
-- assume @v is the variable holding the URL
SELECT SUBSTRING(@v, PATINDEX('%_http://%', @v) + 1, LEN(@v))
PATINDEX finds the first http:// that has at least one character before it (hence the '_' in the pattern). It returns the position of that preceding character, hence the + 1 offset to land on the start of http://.
If the first URL always starts right from the beginning of the string, you can use SUBSTRING() & CHARINDEX():
SELECT SUBSTRING(column, CHARINDEX('http://', column, 2), LEN(column))
FROM table
CHARINDEX simply searches a string for a substring and returns the substring's starting position within the string. Its third argument is optional and, if set, specifies the position at which the search starts. In this case it is 2, so the search skips the first http://.
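Both variants can be tried side by side on the sample value from the question. The variable name and the output column names here are only illustrative.

```sql
-- Sample value copied from the question.
DECLARE @v varchar(200);
SET @v = 'http://google.mews.......http://www.somesite.com';

SELECT SUBSTRING(@v, PATINDEX('%_http://%', @v) + 1, LEN(@v)) AS via_patindex,
       SUBSTRING(@v, CHARINDEX('http://', @v, 2), LEN(@v))   AS via_charindex;
```

For this value both columns return http://www.somesite.com. The two behave differently when a row holds only one URL: PATINDEX then returns 0, so the first expression starts at position 1 and returns the whole string, whereas CHARINDEX returns 0 and SUBSTRING with a start of 0 drops the last character.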