Split with and get Results from T-SQL - sql

I have some T-SQL code that pulls information from a SQL Server table. I need to parse a column and display according to following result set. I'm bew to SQL, it would be great if you don't mark as duplicate and show me how to do it.
Can you please help?
SELECT Account, CTSFirm, AccountName, BOCodeGMI
FROM StagingEDFACRRBO
BOCodeGMI column contains:
e=01:c=KW:m=10000
c=C-:e=01:m=10000
c=S-:e=01:m=10000
c=06:e=01:m=10
c=07:e=01:m=100
c=W-:e=01:M=10000
Logic to split BOCodeGMI and display two separate columns BOCodeGMI_1 & BOCodeGMI_2:
If string contains e= then display BOCodeGMI_1 as its corresponding value (ex: 01), if string doesn't contain e=, then display BOCodeGMI_1 as NULL
If string contains c= then display BOCodeGMI_2 as its corresponding value (ex: C-), if string doesn't contain c= then display BOCodeGMI_2 as NULL
Finally this is how it suppose to show -
BOCodeGMI BOCodeGMI_1 BOCodeGMI_2
-----------------------------------------------------
e=01:c=KW:m=10000 01 KW
c=C-:e=01:m=10000 01 C-
c=S-:e=01:m=10000 01 S-

Try the next query via using CASE, CHARINDEX & SUBSTRING
SELECT
BOCodeGMI,
CASE
WHEN CHARINDEX('e=', BOCodeGMI) > 0
THEN SUBSTRING (BOCodeGMI, CHARINDEX('e=', BOCodeGMI) + 2 , 2)
END as BOCodeGMI_1,
CASE
WHEN CHARINDEX('c=', BOCodeGMI) > 0
THEN SUBSTRING (BOCodeGMI, CHARINDEX('c=', BOCodeGMI) + 2, 2)
END as BOCodeGMI_2
FROM
tableName
CASE to go through conditions and return a value.
CHARINDEX to search for a substring in a string, and returns the position.
SUBSTRING to extract some characters from a string.

Related

How to retrieve the required string in SQL having a variable length parameter

Here is my problem statement:
I have single column table having the data like as :
ROW-1>> 7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX
ROW-2>> 0311-1130101-XXXX-000000-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-X-XXXXXXXXX-WIPXXX
Here i want to separate these values from '-' and load into a new table. There are 11 segments in this string separated by '-', therefore, 11 columns. The problem is:
A. The length of these values are changing, however, i have to keep it as the length of these values in the standard format or the length which it has
e.g 7302- (should have four values, if the value less then that then keep that value eg. 73 then it should populate 73.
Therefore, i have to separate as well as mentation the integrity. The code which i am writing is :
select
SUBSTR(PROFILE_ID,1,(case when length(instr(PROFILE_ID,'-')<>4) THEN (instr(PROFILE_ID,'-') else SUBSTR(PROFILE_ID,1,4) end)
)AS [RQUIRED_COLUMN_NAME]
from [TABLE_NAME];
getting right parenthesis error
Please help.
I used the regex_substr SQL function to solve the above issue. Here below is an example:
select regex_substr('7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX ROW-2>> 0311-1130101-XXXX-000000-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-X-XXXXXXXXX-WIPXXX',[^-]+,1,1);
Output is: 7302 --which is the 1st segment of the string
Similarly, the send string segment which is separated by "-" in the string can be obtained by just replacing the 1 with 2 in the above query at the end.
Example : select regex_substr('7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX ROW-2>> 0311-1130101-XXXX-000000-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-X-XXXXXXXXX-WIPXXX',[^-]+,1,2);
output: 2210177000 which is the 2nd segment of the string

Find a character index in string in spark sql

I am SQL person and new to Spark SQL
I need to find the position of character index '-' is in the string if there is then i need to put the fix length of the character otherwise length zero
string name = 'john-smith'
if '-' is in character position 4 then 10 otherwise length 0
I have done in SQL Server but now need to do in Spark SQL.
select
case
when charindex('-', name) = 4 then 10
else 0
end
I tried in Spark SQL but failed to get results.
select find_in_set('-',name)
Please help. Thanks
You can use instr function as shown next. insrt checks if the second str argument is part of the first one, if so it returns its index starting from 1.
//first create a temporary view if you don't have one already
df.createOrReplaceTempView("temp_table")
//then use instr to check if the name contains the - char
spark.sql("select if(instr(name, '-') = 4, 10, 0) from temp_table")
The arguments for the if statement are:
instr(name, '-') = 4 condition to check
10 result for valid condition
0 result for false condition

SQL Server: How to select rows which contain value comprising of only one digit

I am trying to write a SQL query that only returns rows where a specific column (let's say 'amount' column) contains numbers comprising of only one digit, e.g. only '1's (1111111...) or only '2's (2222222...), etc.
In addition, 'amount' column contains numbers with decimal points as well and these kind of values should also be returned, e.g. 1111.11, 2222.22, etc
If you want to make the query generic that you don't have to specify each possible digit you could change the where to the following:
WHERE LEN(REPLACE(REPLACE(amount,LEFT(amount,1),''),'.','') = 0
This will always use the first digit as comparison for the rest of the string
If you are using SQL Server, then you can try this script:
SELECT *
FROM (
SELECT CAST(amount AS VARCHAR(30)) AS amount
FROM TableName
)t
WHERE LEN(REPLACE(REPLACE(amount,'1',''),'.','') = 0 OR
LEN(REPLACE(REPLACE(amount,'2',''),'.','') = 0
I tried like this in place of 1111111 replace with column name:
Select replace(Str(1111111, 12, 2),0,left(11111,1))

SQL: Replacing dates contained within a text string

I am using SQL Server Management Studio 2012. I work with medical records and need to de-identify reports. The reports are structured in a table with columns Report_Date, Report_Subject, Report_Text, etc... The string I need to update is in report_text and there are ~700,000 records.
So if I have:
"patient had an EKG on 04/09/2012"
I need to replace that with:
"patient had an EKG on [DEIDENTIFIED]"
I tried
UPDATE table
SET Report_Text = REPLACE(Report_Text, '____/___/____', '[DEIDENTIFED]')
because I need to replace anything in there that looks like a date, and it runs but doesn't actually replace anything, because apparently I can't use the _ wildcard in this command.
Any recommendations on this? Advance thanks!
You can use PATINDEX to find the location of Date and then use SUBSTRING and REPLACE to replace the dates.
Since there may be multiple dates in the Text you have to run a while loop to replace all the dates.
Below sql will work for all dates in the form of MM/DD/YYYY
WHILE EXISTS( SELECT 1 FROM dbo.MyTable WHERE PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',Report_Text) > 0 )
BEGIN
UPDATE t
SET Report_Text = REPLACE(Report_Text, DateToBeReplaced, '[DEIDENTIFIED]')
FROM ( SELECT * ,
SUBSTRING(Report_Text,PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',Report_Text), 10) AS DateToBeReplaced
FROM dbo.MyTable AS a
WHERE PATINDEX('%[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]%',Report_Text) > 0
) AS t
END
I have tested the above sql on a dummy table with few rows.I don't know how it will scale for your data but recommend you to give it a try.
To keep it simple, assume that a number represents an identifying element in the string so look for the position of the first number in the string and the position of the last number in the string. Not sure if this will apply to your entire set of records but here is the code ...
I created two test strings ... the one you supplied and one with the date at the beginning of the string.
Declare #tstString varchar(100)
Set #tstString = 'patient had an EKG on 04/09/2012'
Set #tstString = '04/09/2012 EKG for patient'
Select #tstString
-- Calculate 1st Occurrence of a Number
,PATINDEX('%[0-9]%',#tstString)
-- Calculate last Occurrence of a Number
,LEN(#tstString) - PATINDEX('%[0-9]%',REVERSE(#tstString))
,CASE
-- No numbers in the string, return the string
WHEN PATINDEX('%[0-9]%',#tstString) = 0 THEN #tstString
-- Number is the first character to find the last position and remove front
WHEN PATINDEX('%[0-9]%',#tstString) = 1 THEN
CONCAT('[DEIDENTIFIED]',SUBSTRING(#tstString, LEN(#tstString)-PATINDEX('%[0-9]%',REVERSE(#tstString))+2,LEN(#tstString)))
-- Just select string up to the first number
ELSE CONCAT(SUBSTRING(#tstString,1,PATINDEX('%[0-9]%',#tstString)-1),'[DEIDENTIFIED]')
END AS 'newString'
As you can see, this is messy in SQL.
I would rather achieve this with a parser service and move the data with SSIS and call the service.

DB2 SQL Anything left of a /

I've been working on this for days and can't seem to work it out. Basically I need return digits from a field before there is a forward slash. e.g. if the field was 1234/TEXT I want to return 1234. I can't just use left fieldname 4 as the digits vary in left e.g. 12345/TEXT, so it needs to be anything left of the forward slash. Now in the World of MS Access, it is something like this - and it works
Left(TABLE!FIELD,InStr(1,TABLE!FIELD,"/")-1)
However, how do I convert this to be used in an IBM\DB2 system? The DB2 SQL seems somewhat different to 'normal' SQL.
Thanks!
Rather than INSTR, maybe LOCATE
LOCATE(char, string)
char is the search term
string is the string being searched
You can achieve this by combining LOCATE with SUBSTR;
Locate information
Substring information
Cheat sheet (for this example);
SUBSTRING('FIELD','START POSITION', 'LENGTH')
LOCATE('SEARCH STRING', 'SOURCE STRING')
SUBSTRING lets you retrieve specific characters from a string, i.e.;
AFIELD = 'Hello'
SUBSTRING(AFIELD,4,2)
Result = 'lo' (position 4 and 5 of Hello)
LOCATE returns the position of the first character of the search string it finds as a number, i.e.;
AFIELD = 'Hello'
LOCATE('ello', AFIELD)
Result = 2 (it starts at position 2)
So you can combine these to do what you want, example;
XTABLE has 1 column called ACOL with the following values in it;
123467/ABCD
1321/ABDD
1123467/ABCD
To just retrieve the numbers;
SELECT SUBSTRING(ACOL,1, LOCATE('/',ACOL)-1)
FROM XRDK/XTABLE
Result;
123467
1321
1123467
What are we doing?
SUBSTRING(
ACOL,
1,
LOCATE('/',ACOL)-1
)
SUBSTRING(
Field ACOL,
Starting at position 1,
Length; using locate set this to where I find a '/' and subtract 1 from the
resulting postion (without the -1 you'd have the / on the end)
)
Try this
SELECT SUBSTRING(CAST (ROUND(COLUMN,2) AS DECIMAL(6,2)), 0, locate('/',CAST (ROUND(COLUMN,2) AS DECIMAL(6,2))))
FROM TABLE