Get multiple values with brackets from rows in SQL Server - sql

I have rows containing data like this in column called ERROR_CODE:
00111[2003] Maschine0; 000222[2003] Maschinen2
I need to filter out only values in the brackets like this in one row:
2003;2003
I have one solution, but it only gets the first element, and I need all of them, like 2003,2003:
SUBSTRING(ERROR_CODE,CHARINDEX('[',ERROR_CODE)+1 ,CHARINDEX(']',ERROR_CODE)-CHARINDEX('[',ERROR_CODE)-1)
Could you please help me to find a solution?

This is based on several assumptions:
Each error is semicolon (;) delimited
An error always contains one value in brackets ([])
You are using a fully supported version of SQL Server.
One method to achieve this would be to split your string on the delimiter (;). Then you can find the positions of the left bracket ([) and the right bracket (]) in each piece and use SUBSTRING to get the content between them. Finally, you can string aggregate to get 1 row (per value of your column) again:
SELECT STRING_AGG(SUBSTRING(SS.[value],CI.LB +1, CI.RB - CI.LB - 1),',')
FROM (VALUES('00111[2003] Maschine0; 000222[2003] Maschinen2'))V(YourColumn)
CROSS APPLY STRING_SPLIT(V.YourColumn,';') SS
CROSS APPLY (VALUES(CHARINDEX('[',SS.[value]),CHARINDEX(']',SS.[value])))CI(LB,RB)
GROUP BY V.YourColumn;
For point 3, if you are not using a fully supported version of SQL Server, you will need to use a user-defined (set-based or CLR) string splitter and FOR XML PATH for splitting and aggregating your strings, respectively. If either 1 or 2 is not true, you have a far more fundamental problem with your design than you let on; fix your design.
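To make the split / locate / substring / aggregate steps easier to follow, here is a minimal Python sketch of the same logic. It is only an illustration of the idea under the assumptions listed above (semicolon-delimited errors, one bracketed value each); the function name bracket_values is hypothetical and not part of any SQL Server API.

```python
def bracket_values(error_code: str, sep: str = ";") -> str:
    """Collect the text inside the first [...] of each delimited error."""
    found = []
    for part in error_code.split(sep):       # plays the role of STRING_SPLIT
        lb = part.find("[")                  # plays the role of CHARINDEX('[', ...)
        rb = part.find("]", lb + 1)          # closing bracket after the opening one
        if lb != -1 and rb != -1:
            found.append(part[lb + 1:rb])    # plays the role of SUBSTRING
    return ",".join(found)                   # plays the role of STRING_AGG


print(bracket_values("00111[2003] Maschine0; 000222[2003] Maschinen2"))
```

Pieces without a bracket pair are skipped here, which is a choice of this sketch rather than something the SQL above does.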

Related

How do I remove a character from strings of different lengths with sql? Intersystems cache sql

I have a column of strings that have an '&' at the beginning and end of each one, which I need to remove for a Crystal report I'm creating. I'm writing the SQL code outside of Crystal, using Intersystems Cache SQL. Below is an example:
&This& This
&is& is
&What& what
&it& I
&looks& need
&like& it
&now& to
look
like
Any suggestions would be greatly appreciated!!!
Assuming the ampersands are always the leading and trailing characters, here's at least a start. Use a combination of SUBSTR (or SUBSTRING, if using stream data) and LENGTH, like so (column and table are placeholders for your own names):
SELECT SUBSTR(column, 2, LENGTH(column) - 2) FROM table
This should return a substring that starts counting at the 2nd character [of the original string, given by the first sub-expression/argument to SUBSTR], counting up for the total number of characters [of the original string] less 2 (i.e. less the two ampersands).
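The arithmetic is easy to check outside the database. The following Python sketch mimics a 1-based SUBSTR(value, 2, LENGTH(value) - 2); the helper name strip_wrapping is hypothetical, and it assumes, as above, that the first and last characters are always the ampersands.

```python
def strip_wrapping(value: str) -> str:
    """Mimic SUBSTR(value, 2, LENGTH(value) - 2) with 1-based positions."""
    start = 2                      # 1-based start position, as in SUBSTR
    count = len(value) - 2         # total length less the two ampersands
    begin = start - 1              # convert to Python's 0-based indexing
    return value[begin:begin + count]


for sample in ["&This&", "&is&", "&What&"]:
    print(sample, "->", strip_wrapping(sample))
```

Note that this removes the first and last character whatever they are; it does not check that they are actually '&'.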
If you need to include trailing blanks and/or the string termination character, you may need to use a different variation of the LENGTH function. See these resources for details on the functions and their variants:
https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=RSQL_substr
https://cedocs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=RSQL_length
Here's a Crystal formula that does the same:
ExtractString({YourData},"&","&")

Hive Regular expression - only portion of string needed

Hi, I am trying to extract a portion of the data from one column in my Hive table, but the text I need does not start at a fixed position.
select value4,regexp_extract(value4,'*****',0) from hive_table;
column value is shown below
grade:data:home made;Cat;dinnerbox_grade_Enroll
list:date:may;animal;dinnerbox_list_value
cgrade:made_data;dinnerbox_cgrade_notEnroll
I want the data from dinnerbox to the end of the string.
Can anyone help with this?
It is a pretty simple regular expression
.*dinnerbox(.*?)$
Using a non-greedy wildcard, but forcing it to the end of the line makes sure that you always get the dinnerbox at the end.
You want capture group 1
To get rid of the _ you can use
.*dinnerbox_(.*?)$
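As a quick sanity check, the second pattern can be tried against the three sample values from the question. The sketch below uses Python's re module rather than Hive, so it only demonstrates the pattern itself; Hive uses Java regular expressions, and while this particular pattern uses nothing exotic, treat the equivalence as an assumption. The function name extract_after_dinnerbox is hypothetical.

```python
import re

# Same pattern as above; capture group 1 holds everything after "dinnerbox_".
PATTERN = re.compile(r".*dinnerbox_(.*?)$")


def extract_after_dinnerbox(value: str):
    """Return capture group 1, or None when the value has no 'dinnerbox_'."""
    match = PATTERN.match(value)
    return match.group(1) if match else None


rows = [
    "grade:data:home made;Cat;dinnerbox_grade_Enroll",
    "list:date:may;animal;dinnerbox_list_value",
    "cgrade:made_data;dinnerbox_cgrade_notEnroll",
]
for row in rows:
    print(extract_after_dinnerbox(row))
```

In Hive, the group number would go in the third argument of regexp_extract (1 instead of the 0 used in the question).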

How to select values around .(dot) using sql

I am running below query in Teradata :
sel requesttext from dbc.tables
where tablename='old_employee_table'
Result:
alter table DB_NAME.employee_table,no fallback ;
I want to get below result using SQL:
DB_NAME.employee_table
Requesttext can be:
create set table DB_NAME.employee_table;
The database name and table name can occur anywhere in the result. Since a dot (.) joins them, I want to split around the dot.
Basically, I need SQL that returns the values on either side of the dot.
I want the database name and table name in the result.
I'm not a Teradata person, but this should work for both strings given so far, as long as teradata's regexp_substr() supports positive look-behind and positive look-ahead assertions (I might have the Teradata syntax wrong, so a little tweaking may be needed):
SELECT REGEXP_SUBSTR(requesttext, '(?<= )(\w+\.\w+)(?=[,$]?)', 1, 1)
FROM dbc.tables
WHERE tablename='old_employee_table'
See the regex101 example. Hopefully it translates to Teradata easily.
The regex looks for and returns the words either side of and including the period, when preceded by a space, and followed by an optional comma or the end of the line.
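For readers without a Teradata system at hand, the pattern can be exercised in Python, whose re module also supports look-behind and look-ahead. This is only a sketch of the pattern's behaviour on the two strings from the question; it says nothing about whether Teradata's REGEXP_SUBSTR accepts the same syntax. The name qualified_name is hypothetical.

```python
import re

# Same pattern as in the query above.
QUALIFIED_NAME = re.compile(r"(?<= )(\w+\.\w+)(?=[,$]?)")


def qualified_name(request_text: str):
    """Return the first space-preceded word.word pair, or None if absent."""
    match = QUALIFIED_NAME.search(request_text)
    return match.group(1) if match else None


print(qualified_name("alter table DB_NAME.employee_table,no fallback ;"))
print(qualified_name("create set table DB_NAME.employee_table;"))
```

One observation from trying it: because the look-ahead is optional (the trailing ?), it never rejects a candidate, so the match is effectively driven by the leading space and the word.word shape alone.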
You could do this with either regexp_substr() or strtok().
As Jamie Zawinski said:
Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems.
So I would go with the strtok() method. Also I'm lazy and regular expressions are hard.
Function strtok() takes three arguments:
The string being split
The delimiter to split the string
The number of the token to grab.
To get at the <database>.<table> from that string that is returned in your query, we can split by a space, grab the third token, then split that by a comma and grab the first token.
That would look like:
SELECT strtok(strtok(requestText,' ',3),',',1)
FROM dbc.tables
WHERE tablename='old_employee_table'
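To see what the nested calls do, here is a small Python emulation of the tokenizing idea. It is a sketch, not Teradata's implementation: it assumes that every character of the delimiter argument is a separator and that empty tokens are skipped, and the Python function strtok below is a hypothetical stand-in.

```python
def strtok(text: str, delimiters: str, n: int):
    """Return the n-th (1-based) non-empty token, or None if there are fewer."""
    tokens = []
    current = ""
    for ch in text:
        if ch in delimiters:
            if current:                 # close the token in progress
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:                         # last token has no trailing delimiter
        tokens.append(current)
    return tokens[n - 1] if 1 <= n <= len(tokens) else None


alter_text = "alter table DB_NAME.employee_table,no fallback ;"
create_text = "create set table DB_NAME.employee_table;"

print(strtok(strtok(alter_text, " ", 3), ",", 1))
print(strtok(create_text, " ", 3))
```

The second call shows the limit of a fixed token number: for the "create set table ..." form the third token is the word "table", and the qualified name is the fourth token (with the trailing semicolon still attached), so the position would have to depend on the statement type.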

SSIS Transform -- Split one column into multiple columns

I'm trying to find out how to split a column I have in a table and split it into three columns after the result is exported to a CSV file.
For example, I have a field called fullpatientname. It is listed in the following text format:
Smith, John C
The expectation is to have it in three separate columns:
Smith
John
C
I'm quite sure I have to split this in a derived column, but I'm not sure how to proceed with that
You are going to need to use a derived column for this process.
The SUBSTRING and FINDSTRING functions will be key to pull this off.
To get the first segment you would use something like this:
(DT_STR,25,1252) SUBSTRING([fullpatientname], 1, FINDSTRING([fullpatientname],",",1)-1)
The above should display a substring starting with the beginning of the [fullpatientname] to the position prior to the comma (,).
The next segment would be from the position after the comma to the final space separator, and the final would be everything from the position following the final space separator to the end.
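The three position-based segments can be sketched in Python to check the boundaries before writing the SSIS expressions. This assumes the value always looks like "Last, First M" (one comma, and a final space before the middle initial); name_segments is a hypothetical helper, and str.find / str.rfind stand in for locating the comma and the final space.

```python
def name_segments(full_name: str):
    """Split 'Last, First M' by position: comma first, final space last."""
    comma = full_name.find(",")            # position of the comma (0-based)
    last_space = full_name.rfind(" ")      # position of the final space
    last = full_name[:comma]               # start of string up to the comma
    first = full_name[comma + 1:last_space].strip()  # after comma, before final space
    middle = full_name[last_space + 1:]    # after the final space to the end
    return last, first, middle


print(name_segments("Smith, John C"))
```

Values that lack the comma or the middle initial would produce wrong segments with this sketch, so real data needs a check for those cases.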
It sounds like your business rule is
The "last name" is all of the characters up to the first comma
The "first name" will be all of the characters after the first comma and a space
The "middle name" will be what (and is it always present)?
the last character in the string (you will only ever have an initial letter)
All of the characters after the second space
This logic will fail in lots of fun ways, so be prepared for it. Also remember that once you combine information together, you cannot, with 100% accuracy, restore it to its component parts. Capture first, middle, and last/surname and store them separately.
Approach A
A derived column component. Actually, a few of them added to your data flow will cover this. The first Derived Column will be tasked with finding the positions of the name breaks. This could all be done in a single component, but debugging becomes a challenge, and because you would need to repeat the same expression multiple times in a single row (times three columns), it quickly becomes a maintenance nightmare.
The second Derived Column will then use the positions defined in the first to call the LEFT and SUBSTRING functions to access points in the column
Approach B
I never reach for a script component first and the same should hold true for you. However, this is a mighty fine case for a script. The base .NET string library has a Split function that will break a string into pieces based on whatever delimiter you supply. The default is whitespace. The first call to Split will use the ',' as the argument. The zeroth ordinal string will be the last name. The first ordinal string will contain the first and middle name pieces. Call the string.Split method again, this time using the default value: the last element is the middle name and the remaining elements are the first name. Or vice versa: the zeroth element is the first name and everything else is the middle name.
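The two-step split is sketched below in Python rather than in a .NET script component, purely to show the rule; str.partition and str.split stand in for the two Split calls. The function name split_patient_name is hypothetical, and the sketch adopts the first of the two rules above (last element is the middle name) as an assumption.

```python
def split_patient_name(full_name: str):
    """Split 'Last, First Middle' into (last, first, middle)."""
    # First split: everything before the first comma is the last name.
    last, _, rest = full_name.partition(",")
    # Second split: default split is on whitespace and drops empty pieces.
    parts = rest.split()
    if not parts:
        return last.strip(), "", ""
    if len(parts) == 1:                     # no middle name present
        return last.strip(), parts[0], ""
    # Last element is the middle name; the remaining elements are the first name.
    return last.strip(), " ".join(parts[:-1]), parts[-1]


print(split_patient_name("Smith, John C"))
print(split_patient_name("Smith, John"))
```

Handling the one-token case explicitly is a choice made here so that a missing middle name does not swallow the first name.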
I've had to deal with cleaning names before and so I've seen different rules based on how they want to standardize the name.
Try something like this, if your names are always in the same format (LastName-comma-space-FirstName-space-MI):
declare @FullName varchar(25) = 'Smith, John C'
select
substring(@FullName, 1, charindex(',', @FullName)-1 ) as LastName,
substring(@FullName, charindex(',',@FullName) + 2, charindex(' ',@FullName,charindex(',',@FullName)+2) - (charindex(',',@FullName) + 2) ) as FirstName,
substring(@FullName, len(@FullName), 1) as MiddleInitial
I am using SQL Server 2016 with SSIS in Visual Studio 2015. If you are using FINDSTRING, you need to make sure the argument order is correct. I first tried FINDSTRING(",",[fullpatientname],1), but it wouldn't work. I looked up the documentation and found the order to be incorrect; FINDSTRING([fullpatientname],",",1) fixed the problem for me. I am not sure if this is due to differences in versions.

How to format many values in a database?

I have a database (in SQLite) in which some entries (or possibly all) are strings whose first character is a space.
The database may be small enough for me to export it as a CSV file and do a regular-expression search-and-replace which will delete the leading space. Is there an SQL statement which can achieve the same result?
(The database has over 60 columns---listing each one might get tedious.)
You can strip the unneeded spaces right in select query:
SELECT TRIM(field)
or do it once on all rows
UPDATE table SET field = TRIM(field)
Take a look at the trim family of functions, e.g. ltrim.
ltrim(X), ltrim(X,Y)
The ltrim(X,Y) function returns a string formed by removing any and all characters that appear in Y from the left side of X. If the Y argument is omitted, ltrim(X) removes spaces from the left side of X.
More: http://www.sqlite.org/lang_corefunc.html
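Since the question mentions more than 60 columns, the UPDATE can be generated instead of typed. The sketch below uses Python's standard sqlite3 module and SQLite's PRAGMA table_info to list the columns, then applies ltrim to each one. The helper name ltrim_text_columns and the demo table people are hypothetical; the typeof check is an added precaution so that numeric values are not turned into text.

```python
import sqlite3


def _quote(identifier: str) -> str:
    # Identifiers cannot be bound as parameters, so quote them for SQLite.
    return '"' + identifier.replace('"', '""') + '"'


def ltrim_text_columns(conn: sqlite3.Connection, table: str) -> None:
    """Apply ltrim() to every text value in every column of the table."""
    info = conn.execute(f"PRAGMA table_info({_quote(table)})").fetchall()
    columns = [_quote(row[1]) for row in info]      # row[1] is the column name
    assignments = ", ".join(
        f"{c} = CASE WHEN typeof({c}) = 'text' THEN ltrim({c}) ELSE {c} END"
        for c in columns
    )
    conn.execute(f"UPDATE {_quote(table)} SET {assignments}")
    conn.commit()


# Small in-memory demonstration with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, city TEXT, age INTEGER)")
conn.execute("INSERT INTO people VALUES (' Ada', ' London', 36)")
ltrim_text_columns(conn, "people")
print(conn.execute("SELECT * FROM people").fetchall())
```

It would be prudent to run this on a copy of the database first, since the UPDATE rewrites every row.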