I got the following entry in my database:
\\folder.abc\es\Folder-A\\2020-08-03\namefile.csv
So basically, I want everything after the last \ and before .
the namefile in that example
Thanks in advance.
If you are using an older version of SQL Server that doesn't support string_split, the reverse function comes in handy.
The steps are: reverse the string, find the position of the first "." and the first "\" in the reversed string, then use substring to slice out the text between those two positions. Finally, reverse the result again to get the proper value.
Here is an example
with data
as(select '\\folder.abc\es\Folder-A\\2020-08-03\namefile.csv' as col
)
select reverse(substring(reverse(col)
,charindex('.',reverse(col))+1
,charindex('\',reverse(col))
-
charindex('.',reverse(col))-1
)
) as file_name
from data
+-----------+
| file_name |
+-----------+
| namefile |
+-----------+
dbfiddle link
https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=8c0fc11f5ec813671228c362f5375126
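To make the reverse/charindex/substring steps easier to follow, here is a minimal sketch of the same logic in Python. The function name file_stem is made up for illustration, and the explicit error is an assumption about how you would want to handle paths where the last "." comes before the last "\" (the SQL version would compute a negative length in that case).

```python
def file_stem(path: str) -> str:
    """Return the text between the last backslash and the last dot."""
    rev = path[::-1]            # same idea as REVERSE(col)
    dot = rev.find(".")         # first '.' in the reversed string = last '.' in the original
    slash = rev.find("\\")      # first '\' in the reversed string = last '\' in the original
    if dot == -1 or slash == -1 or slash < dot:
        # Hypothetical guard: no extension after the last backslash
        raise ValueError("no '.' found after the last backslash")
    # Slice between the two positions, then reverse back
    return rev[dot + 1:slash][::-1]


print(file_stem(r"\\folder.abc\es\Folder-A\\2020-08-03\namefile.csv"))
```

Note that a name with several dots, such as archive.tar.gz, keeps everything before the last dot (archive.tar), which is also what the SQL above does.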
You can use:
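select t.*,
-- appending '.' keeps the length non-negative when a component has no dot
left(s.value, charindex('.', s.value + '.') - 1)
from t cross apply
string_split(t.entry, '\') s
where s.value <> '' and t.entry like concat('%', s.value);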
This splits the string into different components and matches on the one at the end of the string. If components can repeat, the above can return duplicates. That is easily addressed by moving more logic into the apply:
select t.*, s.val
from t cross apply
(select top (1) left(s.value, charindex('.', s.value + '.') - 1) as val
from string_split(t.entry, '\') s
where s.value <> '' and t.entry like concat('%', s.value)
) s;
You can just use String functions (REVERSE,CHARINDEX,SUBSTRING).
SELECT
REVERSE(
SUBSTRING(REVERSE('\\folder.abc\es\Folder-A\\2020-08-03\namefile.csv'),
CHARINDEX('.',REVERSE('\\folder.abc\es\Folder-A\\2020-08-03\namefile.csv'))+1,
CHARINDEX('\',REVERSE('\\folder.abc\es\Folder-A\\2020-08-03\namefile.csv'))-
CHARINDEX('.',REVERSE('\\folder.abc\es\Folder-A\\2020-08-03\namefile.csv'))-1))
OR
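SELECT
REVERSE
(
SUBSTRING( --get filename
reverse(path), --reversed, so the last \ and the last . are found first
CHARINDEX('.',reverse(path))+1,
CHARINDEX('\',reverse(path))- CHARINDEX('.',reverse(path))-1)
)
FROM YourTable --placeholder name for the table that holds the path column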
Related
I need to update a string to amend any aliases - which can be 'H1.', 'H2.', 'H3.'... etc - to all be 'S.' and am struggling to work out the logic.
For example I have this:
'H1.HUB_CUST_ID, H2.HUB_SALE_ID, H3.HUB_LOC_ID'
But I want this:
'S.HUB_CUST_ID, S.HUB_SALE_ID, S.HUB_LOC_ID'
If you could use wildcards in REPLACE, I'd do something like this: REPLACE(@string, 'H%.H', 'S.H').
Theoretically, there is no limit to how many H# aliases there could be. In practice there will almost definitely be fewer than 10.
Is there a better way than a nested replace of H1 - H10 separately, which both looks messy in a script and carries a small risk if more tables are joined in future?
SQL Server doesn't support pattern replacement. You are better off using a different language that does support pattern/regex replacement, or implementing a CLR function.
That said, however, considering you said that the value would always be below 10 you could brute force it, but it's not "pretty".
SELECT REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(YourString,'H1.','S.'),'H2.','S.'),'H3.','S.'),'H4.','S.'),'H5.','S.'),'H6.','S.'),'H7.','S.'),'H8.','S.'),'H9.','S.')
FROM YourTable ...
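For the "different language" route mentioned above, a rough sketch in Python could look like this. The function name normalise_aliases and the exact pattern are assumptions: the pattern treats an alias as the letter H followed by one or more digits and a dot, at a word boundary, so it is not limited to H1 through H9.

```python
import re

# Assumed alias shape: 'H', one or more digits, then a dot, starting at a word boundary
ALIAS = re.compile(r"\bH\d+\.")


def normalise_aliases(text: str, new_alias: str = "S") -> str:
    """Replace every H<number>. prefix with '<new_alias>.'."""
    # A callable replacement avoids any backslash interpretation in new_alias
    return ALIAS.sub(lambda match: new_alias + ".", text)


print(normalise_aliases("H1.HUB_CUST_ID, H2.HUB_SALE_ID, H3.HUB_LOC_ID"))
```

The word boundary means something like AH1.COL is left alone, and column names such as HUB_CUST_ID are untouched because the H is not followed by a digit.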
You can convert your string to XML and then convert it into simple table:
DECLARE @txt nvarchar(max) = N'H1.HUB_CUST_ID, H2.HUB_SALE_ID, H3.HUB_LOC_ID',
@x xml
SELECT @x = '<a al="' + REPLACE(REPLACE(@txt,', ','</a><a al="'),'.','">')+ '</a>'
SELECT t.c.value('@al', 'nvarchar(max)') as alias_name,
t.c.value('.','nvarchar(max)') as col_name
FROM @x.nodes('/a') t(c)
Output:
alias_name col_name
H1 HUB_CUST_ID
H2 HUB_SALE_ID
H3 HUB_LOC_ID
You can put the results into a temp table, amend them using LIKE with a basic pattern, and then build the new string.
If you don't care about the result order, you can unaggregate and reaggregate:
select t.*, v.new_val
from t cross apply
(select string_agg(concat('S.', stuff(s.value, 1, charindex('.', s.value), '')), ', ') as new_val
from string_split(t.col, ',') s
) v;
Note: This assumes that all values start with the prefix you want to replace -- as your sample data suggests. A case expression can be used if there are exceptions.
You can actually get the original ordering -- assuming no duplicates -- using charindex():
select t.*, v.new_val
from t cross apply
(select string_agg(concat('S.', stuff(s.value, 1, charindex('.', s.value), '')), ', ')
within group (order by charindex(s.value, t.col)
) as new_val
from string_split(t.col, ',') s
) v;
I'm querying the github sample_files dataset in bigquery and I want to get the path excluding the filename.
So if I have /path/to/file.txt
I want it to return /path/to
In python I could do something like
"/".join(str.split(a, "/")[0:-1])
but I'm not sure how to do that in bigquery/sql
Any ideas? Thanks!
One method is regexp_replace():
regexp_replace('/path/to/file.txt', '/[^/]+$', '')
I would use REGEXP_EXTRACT as in below example
REGEXP_EXTRACT(full_path, r'(.+)/[^/]*$')
Split and rejoin part of a string in BigQuery
If for some reason you need, or are more comfortable with, the same split-and-rejoin approach using SPLIT as in your question, you can use the approach below (provided along with sample data for testing and experimenting).
#standardSQL
WITH `project.dataset.table` AS (
SELECT '/path/to/file.txt' full_path UNION ALL
SELECT '/path/to/'
)
SELECT full_path,
(
SELECT STRING_AGG(part, '/')
FROM UNNEST(SPLIT(full_path, '/')) part WITH OFFSET
WHERE OFFSET < ARRAY_LENGTH(SPLIT(full_path, '/')) - 1
) path
FROM `project.dataset.table`
with output
Row full_path path
1 /path/to/file.txt /path/to
2 /path/to/ /path/to
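The three approaches in this thread do not behave identically on edge cases, which is easy to check outside BigQuery. Below is a small Python sketch that mirrors each one; the function names are made up, and Python's re module is only standing in for BigQuery's regex engine, so treat it as an approximation rather than a guarantee of what BigQuery returns.

```python
import re


def dir_by_replace(path: str) -> str:
    # Mirrors regexp_replace(path, '/[^/]+$', '')
    return re.sub(r"/[^/]+$", "", path)


def dir_by_extract(path: str):
    # Mirrors REGEXP_EXTRACT(path, r'(.+)/[^/]*$'); None when nothing matches
    match = re.search(r"(.+)/[^/]*$", path)
    return match.group(1) if match else None


def dir_by_split(path: str) -> str:
    # Mirrors the split-and-rejoin idea: drop the last '/'-separated part
    return "/".join(path.split("/")[:-1])


for p in ("/path/to/file.txt", "/path/to/"):
    print(p, dir_by_replace(p), dir_by_extract(p), dir_by_split(p))
```

All three agree on /path/to/file.txt. For a path with a trailing slash such as /path/to/, the replace version leaves the string unchanged, while the extract and split versions strip the trailing slash.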
I’ve been spinning around a bit on how to accomplish this in SQL DW. I need to extract the text between two periods in a returned value. So my value returned for Result is:
I’m trying to extract the values between period 1 and 2, so the red portion above:
The values will be a wide variety of lengths.
I’ve got this code:
substring(Result,charindex('.',Result)+1,3) as ResultMid
that results in this:
My problem is I’m not sure how to get to a variable length to return so that I can pull the full value between the two periods. Would someone happen to know how I can accomplish this?
Thx,
Joe
We can build on your current attempt:
substring(
result,
charindex('.', result) + 1,
charindex('.', result, charindex('.', result) + 1) - charindex('.', result) - 1
)
Rationale: you already have the first two arguments to substring() right. The third argument defines the number of characters to capture. For this, we compute the position of the next dot (.) with the expression charindex('.', result, charindex('.', result) + 1). Then we subtract the position of the first dot from that value, and subtract 1 more, which gives us the number of characters between the two dots.
Demo on DB Fiddle:
result | result_mid
:----------------------- | :---------
sam.pdc.sys.paas.l.com | pdc
sm.ridl.sys.paas.m.com | ridl
s.sandbox.sys.paas.g.com | sandbox
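If it helps to see the position arithmetic outside SQL, here is a minimal Python sketch of the same idea. The function name is made up, and returning None when there are fewer than two dots is an assumption about the desired behaviour; in the SQL expression that case would likely produce a negative length and an error, so you may want a guard there too.

```python
def between_first_two_dots(value: str):
    """Return the text between the first and second '.', or None if there is no such pair."""
    first = value.find(".")                 # like charindex('.', result), but 0-based
    if first == -1:
        return None
    second = value.find(".", first + 1)     # search again, starting just after the first dot
    if second == -1:
        return None
    return value[first + 1:second]          # characters strictly between the two dots


for sample in ("sam.pdc.sys.paas.l.com", "sm.ridl.sys.paas.m.com", "s.sandbox.sys.paas.g.com"):
    print(sample, between_first_two_dots(sample))
```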
If you are dealing with up to 128 characters per delimited part of the string, try parsename as below. Otherwise, GMB has a pretty solid solution up there.
select *, parsename(left(result,charindex('.',result,charindex('.',result)+1)-1),1) as mid
from your_table;
Another method that you can easily modify to extract 3rd, 4th...(hopefully not too remote) part of the string using cross apply.
select result, mid
from your_table t1
cross apply (select charindex('.',result) as i1) t2
cross apply (select charindex('.',result,(i1 + 1)) as i2) t3
cross apply (select substring(result,(i1+1),(i2-i1-1)) as mid) t4;
DEMO
I have a column of data that looks like this:
58,0:102,56.00
52,0:58,68
58,110
57,440.00
52,0:58,0:106,6105.95
I need to extract the character before the last delimiter (',').
Using the data above, I want to get:
102
58
58
57
106
Might be done with a regular expression in substring(). If you want:
the longest string of only digits before the last comma:
substring(data, '(\d+)\,[^,]*$')
Or you may want:
the string before the last comma (',') that's delimited at the start either by a colon (':') or the start of the string.
Could be another regexp:
substring(data, '([^:]*)\,[^,]*$')
Or this:
reverse(split_part(split_part(reverse(data), ',', 2), ':', 1))
More verbose, but typically much faster than an (expensive) regular expression.
db<>fiddle here
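The reverse/split_part trick can look cryptic, so here is a rough Python sketch of the same steps, checked against the sample rows from the question. The helper split_part is a hand-written stand-in for the Postgres function (1-based, empty string when the field does not exist), and before_last_comma is a made-up name.

```python
def split_part(text: str, delimiter: str, n: int) -> str:
    """Stand-in for Postgres split_part: 1-based field, '' if out of range."""
    parts = text.split(delimiter)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


def before_last_comma(data: str) -> str:
    rev = data[::-1]                          # reverse(data)
    second_field = split_part(rev, ",", 2)    # field just before the last comma, still reversed
    token = split_part(second_field, ":", 1)  # cut at the nearest colon, if there is one
    return token[::-1]                        # reverse back


rows = ["58,0:102,56.00", "52,0:58,68", "58,110", "57,440.00", "52,0:58,0:106,6105.95"]
print([before_last_comma(r) for r in rows])
```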
Can't promise this is the best way to do it, but it is a way to do it:
with splits as (
select string_to_array(bar, ',') as bar_array
from foo
),
second_to_last as (
select
bar_array[cardinality(bar_array)-1] as field
from splits
)
select
field,
case
when field like '%:%' then split_part (field, ':', 2)
else field
end as last_item
from second_to_last
I went a little overkill on the CTEs, but that was to expose the logic a little better.
With a CTE that removes everything after the last comma and then splits the rest into an array:
with cte as (
select
regexp_split_to_array(
replace(left(col, length(col) - position(',' in reverse(col))), ':', ','),
','
) arr
from tablename
)
select arr[array_upper(arr, 1)] from cte
See the demo.
Results:
| result |
| ------ |
| 102 |
| 58 |
| 58 |
| 57 |
| 106 |
The following treats the source string as an "array of arrays". It seems each data element can be defined as S(x,y) and the overall string as S1:S2:...Sn.
The task then becomes to extract x from Sn.
with as_array as
( select string_to_array(S[n], ',') Sn
from (select string_to_array(col,':') S
, length(regexp_replace(col, '[^:]','','g'))+1 n
from tablename
) t
)
select Sn[array_length(Sn,1)-1] from as_array
The above generalises S(x,y) to S(a,b,...,x,y); the task is still to extract x from Sn. If every original substring S has the form S(x,y), the last select reduces to select Sn[1].
I have following data in my table
id nml
-- -----------------
1 Temora sepanil
2 Human Mixtard
3 stlliot vergratob
I need to get the result by extracting first word in column nml and get its last 3 characters with reverse order
That means output should be like
nml reverse
----------------- -------
Temora sepanil aro
Human Mixtard nam
stlliot vergratob toi
You can use PostgreSQL's string functions to achieve the desired output.
In this case I am using the split_part, right and reverse functions:
select reverse(right(split_part('Temora sepanil',' ',1),3))
output:
aro
so you can write your query in following format
select nml
,reverse(right(split_part(nml,' ',1),3)) "Reverse"
from tbl
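As a quick sanity check of the split_part / right / reverse chain, the same three steps can be sketched in Python and run against the sample rows. The function name reversed_tail is made up for illustration.

```python
def reversed_tail(nml: str) -> str:
    """First word of nml, last three characters, reversed."""
    first_word = nml.split(" ")[0]   # like split_part(nml, ' ', 1)
    tail = first_word[-3:]           # like right(..., 3); shorter words are kept whole
    return tail[::-1]                # like reverse(...)


for name in ("Temora sepanil", "Human Mixtard", "stlliot vergratob"):
    print(name, reversed_tail(name))
```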
Split nml using regexp_split_to_array(string text, pattern text [, flags text ]); refer to the Postgres docs for more info.
Use reverse(str) (refer to the Postgres docs) to reverse the first word from the previous split.
Use substr(string, from [, count]) (refer to the Postgres docs) to select the first three letters of the reversed text.
Query
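SELECT
nml,
-- Postgres arrays are 1-based, and a function result must be parenthesised before subscripting
substr(reverse((regexp_split_to_array(nml, E'\\s+'))[1]), 1, 3) as reverse
FROM
MyTable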
You can use the SUBSTRING, CHARINDEX, RIGHT and REVERSE functions.
Here's the syntax:
REVERSE(RIGHT(SUBSTRING(nml , 1, CHARINDEX(' ', nml) - 1),3))
sample:
SELECT REVERSE(RIGHT(SUBSTRING(nml , 1, CHARINDEX(' ', nml) - 1),3)) AS 'Reverse'
FROM TableNameHere