A SQL Query to select certain strings in a folder path


I have a table with a column that contains the path to SSIS packages located on a drive. The entire folder path is populated in the column. I need a SQL query to extract one section of the string from the folder path.
An example of a record in Column_1:
/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E
All I am interested in extracting is the "SSIS_Packages_Source_to_Target_Data_Snowflake". Everything I have tried so far throws errors. The latest code I tried is:
SELECT SUBSTRING(Column_1, LEFT(CHARINDEX('dtsx', Column_1)), LEN(Column_1) - CHARINDEX('dtsx', Column_1)).
I would really appreciate some help with this.
Thanks!

Given that you know the extension and it's unlikely to appear elsewhere in the string, find it and truncate the string just before it. Do that in a CROSS APPLY so the truncated value can be used multiple times.
Then find the last backslash (the first one in the REVERSEd string) and use SUBSTRING from the character after it to the end.
SELECT
    SUBSTRING(Y.[Value], LEN(Y.[Value]) - PATINDEX('%\%', REVERSE(Y.[Value])) + 2, LEN(Y.[Value]))
FROM (
    VALUES ('/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E')
) AS X ([Value])
CROSS APPLY (
    VALUES (SUBSTRING(X.[Value], 1, PATINDEX('%.dtsx%', X.[Value]) - 1))
) AS Y ([Value]);
Returns:
SSIS_Packages_Source_to_Target_Data_Snowflake

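Another possible way is this, though I'm not sure about its performance. Note that it returns the file name with the .dtsx extension still attached, and STRING_SPLIT requires SQL Server 2016 or later.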
SELECT vt.[value]
FROM (
VALUES ('/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E')
) AS X ([Value])
OUTER APPLY (
SELECT * FROM STRING_SPLIT(x.Value,'\')
) vt
WHERE vt.[value] LIKE '%.dtsx'

Thank you @Dale K for your response and the solutions provided. I was able to adapt them to my query and obtain the result. Below is how I modified the query in my environment to fetch only the new string column1 after applying the string manipulations from your solution:
SELECT SUBSTRING(Y.column1, LEN(Y.column1) - PATINDEX('%\%', REVERSE(Y.column1)) +2, LEN(Y.column1))
FROM (SELECT column1 FROM CTE1) AS X ([column2])
CROSS APPLY (SELECT SUBSTRING(X.column2, 1, PATINDEX('%.dtsx%', X.column2)-1) FROM CTE1) AS Y ([column1])
I am actually querying a CTE (CTE1) to get my desired result. The issue is that CTE1 has other columns that I need to include in the final select, alongside the string-manipulated result from Column1. Currently, I get errors when I try to include those other columns from CTE1 together with the result of the query above.
Example of final query:
Select Jobname, job_step, job_date, job_duration, Column1 (this will be the result of the string manipulation) FROM CTE1;
So, what I'm currently doing that is not working is as follows:
SELECT C1.Jobname,C1.job_step,C1.job_date,C1.job_duration,Column1 =(SELECT SUBSTRING(Y.column1, LEN(Y.column1) - PATINDEX('%\%', REVERSE(Y.column1)) +2, LEN(Y.column1)) FROM (SELECT column1 FROM CTE1) AS X ([column2]) CROSS APPLY (SELECT SUBSTRING(X.column2, 1, PATINDEX('%.dtsx%', X.column2)-1) FROM CTE1) AS Y ([column1])) FROM CTE1 C1
Please, how can I obtain the final results with all the above columns present in the resultsets?
Thank you.
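One possible way to keep the other columns, as a minimal sketch: apply the string logic directly to the rows of CTE1 instead of wrapping it in a separate subquery. The scalar subquery in the attempt above is not correlated to C1 and reads CTE1 again inside the APPLY, which is likely where the errors come from. The sample row, the column types, and the output name PackageName below are assumptions for illustration only; the NULLIF guard is an addition that returns NULL rather than raising an error when '.dtsx' is not found.

```sql
-- Hypothetical stand-in for the real CTE1; only the column names come from the question.
WITH CTE1 AS (
    SELECT v.Jobname, v.job_step, v.job_date, v.job_duration, v.Column1
    FROM (VALUES
        ('Load_Snowflake', 1, CAST('2024-01-01' AS date), 42,
         '/FILE "\"G:\Enterprise_Data\Packages\SSIS_Packages_Source_to_Target_Data_Snowflake.dtsx\""/CHECKPOINTING OFF /REPORTING E')
    ) AS v (Jobname, job_step, job_date, job_duration, Column1)
)
SELECT
    C1.Jobname,
    C1.job_step,
    C1.job_date,
    C1.job_duration,
    -- Everything after the last backslash of the truncated path
    SUBSTRING(Y.Truncated,
              LEN(Y.Truncated) - CHARINDEX('\', REVERSE(Y.Truncated)) + 2,
              LEN(Y.Truncated)) AS PackageName
FROM CTE1 AS C1
-- Truncate just before '.dtsx'; evaluated once per row of C1, so the other columns stay aligned
CROSS APPLY (
    VALUES (SUBSTRING(C1.Column1, 1, NULLIF(PATINDEX('%.dtsx%', C1.Column1), 0) - 1))
) AS Y (Truncated);
-- PackageName for the sample row: SSIS_Packages_Source_to_Target_Data_Snowflake
```

Because the APPLY uses VALUES over the current row rather than a second SELECT from CTE1, it cannot multiply rows.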

Related

How to get string from value

I have a Customer_value column.
The column contains values like:
DAL123245,HC.533675,ABC.01232423
HC.3425364,ABC.045367544,DAL4346456
HC.35344,ABC.03543645754,ABC.023534454,DAL.4356433
ABC.043534553,HC.3453643,ABC.05746343
What I am trying to do is get the number after the first "ABC.0" string.
For example, this is what I would like to get:
1232423
5367544
3543645754
43534553
This is what I tried:
Substring(customer_value,charindex('ABC.', customer_value) + 5, len(customer_value)) as dataneeded
The issue is that rows 1 and 2 return the right data, but rows 3 and 4 contain more than one ABC value, so I get everything after the first ABC, including the values that follow it.
How can I get only the number after the first "ABC.0"?
Thank you so much

Another option is to use a bit of JSON to split the list while preserving the sequence, in concert with a CROSS APPLY.
Note: Use OUTER APPLY to see NULL values
Example
Select NewVal = replace(Value,'ABC.0','')
From YourTable A
Cross Apply (
Select Top 1 *
From OpenJSON( '["'+replace(string_escape(customer_value,'json'),',','","')+'"]' )
Where Value like 'ABC.0%'
Order by [key]
) B
Results
NewVal
1232423
45367544
3543645754
43534553

On the assumption that you are using SQL Server (given your use of charindex()/substring()/len()), you can use apply to calculate the starting position, then find the next comma using the optional start-position parameter of charindex, and finally take the substring between the two positions.
select Substring(customer_value, p1.v, Abs(p2.v-p1.v)) as dataneeded
from t
cross apply(values(charindex('ABC.', customer_value)+5))p1(v)
cross apply(values(charindex(',', customer_value,p1.v)))p2(v)
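A variation on the same idea, as a sketch rather than a definitive fix: appending a comma to the searched string means the last element always has a terminating delimiter, so the length never depends on Abs(). The inline VALUES table stands in for the real table, and the LIKE filter is an added assumption that rows without 'ABC.0' should be skipped.

```sql
SELECT
    t.customer_value,
    SUBSTRING(t.customer_value, p1.v, p2.v - p1.v) AS dataneeded
FROM (VALUES
    ('DAL123245,HC.533675,ABC.01232423'),
    ('HC.3425364,ABC.045367544,DAL4346456'),
    ('HC.35344,ABC.03543645754,ABC.023534454,DAL.4356433'),
    ('ABC.043534553,HC.3453643,ABC.05746343')
) AS t (customer_value)
-- Position of the first character after the first 'ABC.0'
CROSS APPLY (VALUES (CHARINDEX('ABC.0', t.customer_value) + LEN('ABC.0'))) AS p1 (v)
-- Next comma at or after that position; the appended ',' covers the last element
CROSS APPLY (VALUES (CHARINDEX(',', t.customer_value + ',', p1.v))) AS p2 (v)
WHERE t.customer_value LIKE '%ABC.0%';
-- dataneeded: 1232423, 45367544, 3543645754, 43534553
```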

Regex that matches strings with specific text not between text in BigQuery

I have the following strings:
step_1->step_2->step_3
step_1->step_3
step_1->step_2->step_1->step_3
step_1->step_2->step_1->step_2->step_3
What I would like to do is capture the strings in which there is no step_2 between step_1 and step_3.
The results should be like this:
string result
step_1->step_2->step_3 false
step_1->step_3 true
step_1->step_2->step_1->step_3 true
step_1->step_2->step_1->step_2->step_3 false
I have tried to use the negative lookahead but I found out that BigQuery doesn't support it. Any ideas?

You are essentially looking for when the pattern does not exist. The following regex would support that embedded in a case statement. This would not support a scenario where you have both conditions in a single string, however that was not a scenario you listed in your sample data.
Try the following:
with sample_data as (
select 'step_1->step_2->step_3' as string union all
select 'step_1->step_3' union all
select 'step_1->step_2->step_1->step_3' union all
select 'step_1->step_2->step_1->step_2->step_3' union all
select 'step_1->step_2->step_1->step_2->step_2->step_3' union all
select 'step_1->step_2->step_1->step_2->step_2'
)
select
string,
-- CASE WHEN regexp_extract(string, r'step_1->(\w+)->step_3') IS NULL THEN TRUE
CASE WHEN regexp_extract(string, r'1(->step_2)+->step_3') IS NULL THEN TRUE
ELSE FALSE END as result
from sample_data
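This results in:
string result
step_1->step_2->step_3 false
step_1->step_3 true
step_1->step_2->step_1->step_3 true
step_1->step_2->step_1->step_2->step_3 false
step_1->step_2->step_1->step_2->step_2->step_3 false
step_1->step_2->step_1->step_2->step_2 true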

Consider also the option below:
select string,
not regexp_contains(string, r'step_1->(step_2->)+step_3\b') as result
from your_table

I believe @Daniel_Zagales's answer is the one you were expecting. However, here is a broader solution that may be useful in your use case. It relies on arrays:
WITH sample AS (
  SELECT 'step_1->step_2->step_3' AS path
  UNION ALL SELECT 'step_1->step_3'
  UNION ALL SELECT 'step_1->step_2->step_1->step_3'
  UNION ALL SELECT 'step_1->step_2->step_1->step_2->step_3'
),
temp AS (
  SELECT
    path,
    SPLIT(REGEXP_REPLACE(path, 'step_', ''), '->') AS sequences
  FROM
    sample
)
SELECT
  path,
  position,
  flattened AS current_step,
  -- order by the alias given to the offset, so the window follows the array order
  LAG(flattened) OVER (PARTITION BY path ORDER BY position) AS previous_step,
  LEAD(flattened) OVER (PARTITION BY path ORDER BY position) AS following_step
FROM
  temp,
  temp.sequences AS flattened WITH OFFSET AS position
This query returns one row per step of each path, with the previous and the following step alongside it.
The concept is to build an array of the step numbers (splitting on '->' and erasing 'step_') and to keep the OFFSET, which is crucial because UNNESTing an array does not guarantee that its order is preserved.
The resulting table contains, for each path and each step of that path, the previous and following step. It is then easy to test, for instance, whether successive steps differ by 1.
For example: SELECT * FROM <previous> WHERE ABS(current_step - previous_step) != 1. The step values must first be CAST to an integer type, because SPLIT returns strings.
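Applied to the original question, the array approach can be sketched as follows, assuming BigQuery Standard SQL. The aliases step, pos, steps and result are illustrative names, not part of the answer above. A path qualifies when some step 3 is immediately preceded by a step 1.

```sql
WITH sample AS (
  SELECT 'step_1->step_2->step_3' AS path
  UNION ALL SELECT 'step_1->step_3'
  UNION ALL SELECT 'step_1->step_2->step_1->step_3'
  UNION ALL SELECT 'step_1->step_2->step_1->step_2->step_3'
),
steps AS (
  SELECT
    path,
    pos,
    CAST(step AS INT64) AS current_step,
    -- previous step in array order; NULL for the first element of each path
    LAG(CAST(step AS INT64)) OVER (PARTITION BY path ORDER BY pos) AS previous_step
  FROM
    sample,
    UNNEST(SPLIT(REGEXP_REPLACE(path, 'step_', ''), '->')) AS step WITH OFFSET AS pos
)
SELECT
  path,
  -- TRUE when at least one step_3 directly follows a step_1
  LOGICAL_OR(previous_step = 1 AND current_step = 3) AS result
FROM steps
GROUP BY path
-- step_1->step_2->step_3                  false
-- step_1->step_3                          true
-- step_1->step_2->step_1->step_3          true
-- step_1->step_2->step_1->step_2->step_3  false
```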

Replace a recurring word and the character before it

I am using SQL Server trying to replace each recurring "[BACKSPACE]" in a string and the character that came before the word [BACKSPACE] to mimic what a backspace would do.
Here is my current string:
"This is a string that I would like to d[BACKSPACE]correct and see if I could make it %[BACKSPACE] cleaner by removing the word and $[BACKSPACE] character before the backspace."
Here is what I want it to say:
"This is a string that I would like to correct and see if I could make it cleaner by removing the word and character before the backspace."
Let me make this clearer. In the above example string, the $ and % signs were just used as examples of characters that would need to be removed since they are before the [BACKSPACE] word that I want to replace.
Here is another before example:
The dog likq[BACKSPACE]es it's owner
I want to edit it to read:
The dog likes it's owner
One last before example is:
I am frequesn[BACKSPACE][BACKSPACE]nlt[BACKSPACE][BACKSPACE]tly surprised
I want to edit it to read:
I am frequently surprised

Without a CLR function that provides Regex replacement the only way you'll be able to do this is with iteration in T-SQL. Note, however, that the below solution does not give you the results you ask for, but does the logic you ask. You state that you want to remove the string and the character before, but in 2 of your scenarios that isn't true. For the last 2 strings you remove ' %[BACKSPACE]' and ' $[BACKSPACE]' respectively (notice the leading whitespace).
This leading whitespace is left in this solution. I am not entertaining fixing that, as the real solution is don't use T-SQL for this, use something that supports Regex.
I also assume this string is coming from a column in a table, and said table has multiple rows (with a distinct value for the string on each).
Anyway, the solution:
WITH rCTE AS(
    -- Anchor: remove the first [BACKSPACE] together with the character before it
    SELECT V.YourColumn,
           STUFF(V.YourColumn,CHARINDEX('[BACKSPACE]',V.YourColumn)-1,LEN('[BACKSPACE]')+1,'') AS ReplacedColumn,
           1 AS Iteration
    FROM (VALUES('"This is a string that I would like to d[BACKSPACE]correct and see if I could make it %[BACKSPACE] cleaner by removing the word and $[BACKSPACE] character before the backspace."'))V(YourColumn)
    UNION ALL
    -- Recursion: repeat while any [BACKSPACE] is left in the string
    SELECT r.YourColumn,
           STUFF(r.ReplacedColumn,CHARINDEX('[BACKSPACE]',r.ReplacedColumn)-1,LEN('[BACKSPACE]')+1,''),
           r.Iteration + 1
    FROM rCTE r
    WHERE CHARINDEX('[BACKSPACE]',r.ReplacedColumn) > 0)
-- Keep only the last iteration for each original string
SELECT TOP (1) WITH TIES
       r.YourColumn,
       r.ReplacedColumn
FROM rCTE r
ORDER BY ROW_NUMBER() OVER (PARTITION BY r.YourColumn ORDER BY r.Iteration DESC);
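For a single value held in a variable, the same iteration can also be sketched as a plain WHILE loop. This is only an illustration of the logic, not a replacement for the set-based query above; the variable names @s, @token and @pos are assumptions. Unlike the STUFF call in the recursive CTE, it also handles a [BACKSPACE] at the very start of the string, where there is no preceding character to remove.

```sql
DECLARE @s varchar(max) = 'I am frequesn[BACKSPACE][BACKSPACE]nlt[BACKSPACE][BACKSPACE]tly surprised';
DECLARE @token varchar(20) = '[BACKSPACE]';
DECLARE @pos int = CHARINDEX(@token, @s);

WHILE @pos > 0
BEGIN
    -- Remove the token plus the character before it, if there is one
    SET @s = CASE
                 WHEN @pos > 1 THEN STUFF(@s, @pos - 1, LEN(@token) + 1, '')
                 ELSE STUFF(@s, 1, LEN(@token), '')
             END;
    -- Look for the next token in the shortened string
    SET @pos = CHARINDEX(@token, @s);
END;

SELECT @s AS Cleaned;
-- Cleaned: I am frequently surprised
```

Processing the leftmost token first is what makes consecutive [BACKSPACE] tokens behave like repeated key presses: each pass deletes one real character before moving on.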

I've had a crack at getting this to work using the traditional tally-table method without any recursion.
I think I have something that works. The recursive CTE version is definitely a cleaner solution and probably performs better, so I'm offering this only as an alternative, non-recursive way.
/* tally table for use below */
select top 1000 N=Identity(int, 1, 1)
into dbo.Digits
from master.dbo.syscolumns a cross join master.dbo.syscolumns;

/* assumed inputs: the original post does not show these declarations */
declare @string varchar(8000) = 'The dog likq[BACKSPACE]es it''s owner',
        @delimiter varchar(20) = '[BACKSPACE]';

/* split on the first character of the delimiter, then strip the rest of it from each part */
with w as (
    select seq = Row_Number() over (order by t.N),
           part = Replace(Substring(@string, t.N, CharIndex(Left(@delimiter,1), @string + @delimiter, t.N) - t.N),Stuff(@delimiter,1,1,''),'')
    from Digits t
    where t.N <= DataLength(@string)+1 and Substring(Left(@delimiter,1) + @string, t.N, 1) = Left(@delimiter,1)
),
p as (
    select seq,Iif(Iif(Lead(part) over(order by seq)='' and lag(part) over(order by seq)='',1,0 )=1 ,'', Iif( seq<Max(seq) over() and part !='',Left(part,Len(part)-1),part)) part
    from w
)
select result=(
    select ''+ part
    from p
    where part!=''
    order by seq
    for xml path('')
)

Here's a simple RegEx pattern that should work:
/.\[BACKSPACE\]/g
EDIT
I have no way to test this right now on my Chromebook, but it seems like it should work for T-SQL in a LIKE clause:
LIKE '_\[BACKSPACE]' ESCAPE '\'

get sub string in between mix symbols

I want to extract a substring; my output should look like gmail, outlook, Skype.
My string values are:
'abc#gmail.com'
'cde.nitish#yahoo.com'
'xyz.vijay#sarvang.com.com'
Something like this. As you can see, the values have variable length and mix the symbols '.' and '#'.
The string values are stored in a table named tbl_Data, in a column named Mail_ID.
I am using SQL Server 2012.
I used CHARINDEX to get the substring:
select SUBSTRING(Mail_ID, CHARINDEX('#',MAil_ID)+1, (CHARINDEX('.',MAil_ID) - (CHARINDEX('#', Mail_ID)+1)))
from tbl_data
And I want my output to look like:
'gmail'
'yahoo'
'sarvang'
Please help me; I am a newbie in SQL Server.

This is my solution. I first get the position of the '#', and then get the position of the '.' in the string prior to it (the '#'). Then I can use those results to get the appropriate substring:
SELECT V.YourString,
SUBSTRING(V.YourString,D.I,A.I - D.I) AS StringPart
FROM (VALUES('abc#gmail.com'),
('cde.nitish#yahoo.com'),
('xyz.vijay#sarvang.com.com'))V(YourString)
CROSS APPLY(VALUES(CHARINDEX('#',V.YourString)))A(I) --Get position of # to not repeat logic
CROSS APPLY(VALUES(CHARINDEX('.',LEFT(V.YourString,A.I))+1))D(I) --Get position of . to not repeat logic
Note for value of 'abc.def.steve#... it would return 'def.steve'; however, we don't have such an example so I don't know what the correct return value would be.

I'm posting this as a new answer, as the OP moved the goalposts after my original answer. My initial answer was based on their original question, not their "new" one, and it seems silly to remove an answer that was correct at the time:
SELECT V.YourString,
SUBSTRING(V.YourString,A.I, D.I - A.I) AS StringPart
FROM (VALUES('abc#gmail.com'),
('cde.nitish#yahoo.com'),
('xyz.vijay#sarvang.com.com'))V(YourString)
CROSS APPLY(VALUES(CHARINDEX('#',V.YourString)+1))A(I)
CROSS APPLY(VALUES(CHARINDEX('.',V.YourString,A.I)))D(I);

This answers the original version of the question.
This may be simplest with a case expression to detect if there is a period before the '#':
select (case when email like '%.%#%'
then stuff(left(email, charindex('#', email) - 1), 1, charindex('.', email), '')
else left(email, charindex('#', email) - 1)
end)
from (values ('abc#gmail.com'), ('cde.nitish#yahoo.com'), ('xyz.vijay#sarvang.com.com')) v(email)

I created a temp table with your data and wrote the query below, which worked:
CREATE TABLE #T
(
DATA NVARCHAR(50)
)
INSERT INTO #T
VALUES('abc#gmail.com'),
('cde.nitish#yahoo.com'),
('xyz.vijay#sarvang.com.com')
SELECT *,LEFT(RIGHT(DATA,LEN(DATA)-CHARINDEX('#',DATA,1)),CHARINDEX('.',RIGHT(DATA,LEN(DATA)-CHARINDEX('#',DATA,1)),1)-1)
FROM #t
And this is the output of my T-SQL:
abc#gmail.com gmail
cde.nitish#yahoo.com yahoo
xyz.vijay#sarvang.com.com sarvang

SQL Query Remove Part of Path/Null

I am new to the whole SQL query business and need some help with two issues. My goal is to display the "EnvironmentName" values for the rows that have the word "Database" in the "NodeName" column. I did this with:
FROM [Backbone_ASPIDER].[dbo].[vw_CFGsvr_Con]
WHERE NodeName = 'Database'
ORDER BY EnvironmentName asc
WHERE NodePath
Results of Query:
I am able to get my query results but would like to remove the rows with NULL. I have tried to use "IS NOT NULL", but SQL Server Management Studio labels this as "incorrect syntax."
What I have tried:
FROM [Backbone_ASPIDER].[dbo].[vw_CFGsvr_Con]
WHERE NodeName = 'Database'
ORDER BY EnvironmentName asc IS NOT NULL
WHERE NodePath
Thank you in advance!

Your query is pretty close.
1: When using IS NOT NULL you have to name the specific column that must not be null, and the condition belongs in the WHERE clause, before ORDER BY.
So modify your query to:
FROM [Backbone_ASPIDER].[dbo].[vw_CFGsvr_Con]
WHERE NodeName = 'Database' AND EnvironmentName IS NOT NULL
ORDER BY EnvironmentName asc
2: Check out this article about trimming parts of strings from query results
http://basitaalishan.com/2014/02/23/removing-part-of-string-before-and-after-specific-character-using-transact-sql-string-functions/

The WHERE clause must come first, followed by the ORDER BY clause,
like this:
Select * FROM [Backbone_ASPIDER].[dbo].[vw_CFGsvr_Con]
WHERE [Backbone_ASPIDER].[dbo].[vw_CFGsvr_Con].[NodeName] = 'Database' AND [Backbone_ASPIDER].[dbo].[vw_CFGsvr_Con].[EnvironmentName] IS NOT NULL
ORDER BY [Backbone_ASPIDER].[dbo].[vw_CFGsvr_Con].[EnvironmentName] asc

EDIT: I just noticed you removed this from your OP, so feel free to disregard if you took care of that.
I don't think anyone has addressed the substring problem yet. There are several ways to get at this, depending on how complex the strings you have to slice up are, but here's how I'd do it:
-- Populating some fake data, representative of what you've got
if object_id('tempdb.dbo.#t') is not null drop table #t
create table #t
(
nPath varchar(1000)
)
insert into #t
select '/Database/Mappings/Silver/Birthday' union all
select '/Database/Connections/Blue/Happy'
-- First, get the character index of the first '/' after as many characters the word '/database/' takes up.
-- You could have hard coded this value too. Add 1 to it so that it moves PAST the slash.
;with a as
(
select
ixs = charindex('/', nPath, len('/Database/') + 1),
-- Get everything to the right of what you just determined with all the charindex() stuff
ss = right(nPath, len(nPath) - charindex('/', nPath, len('/Database/') + 1)),
nPath
from #t
)
-- Now just take the left of the now-cleaned-up string, from the start to the first slash
select
ixs,
ss,
color = left(ss, charindex('/', ss) -1),
nPath
from a
