Removing text and adding to front of string - sql

How would I go about removing a string from a field in a SQL query (using SQL Server 2005) and adding it to the front of the string?
For instance, my field contains: 22378MA
I want to search for the characters 'MA' in this case.
I would like the query to move this string to the front so that it returns a result like this:
MA22378
My field name is sku.
I'm not sure I explained myself properly. I don't want to change the field, only what the query returns in a view. The field value changes, so I can't hardcode the sku, and the sku length is variable. The suffix 'MA' may change for certain queries, so I need to be able to use it in a CASE statement.

select SKU as OldSKU, case
-- compare the trailing characters directly; CHARINDEX only finds the first occurrence
when RIGHT(SKU, 2) = 'MA'
then 'MA' + SUBSTRING(SKU, 1, LEN(SKU) - 2)
when RIGHT(SKU, 1) = 'B'
then 'B' + SUBSTRING(SKU, 1, LEN(SKU) - 1)
when RIGHT(SKU, 3) = 'XYZ'
then 'XYZ' + SUBSTRING(SKU, 1, LEN(SKU) - 3)
else SKU
end as NewSKU
from (
select '22378MA' as SKU
union all
select '22378B'
union all
select '22378XYZ'
union all
select '22378TT'
) a
Output:
OldSKU NewSKU
-------- -----------
22378MA MA22378
22378B B22378
22378XYZ XYZ22378
22378TT 22378TT
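
To sanity-check the expected values outside SQL Server, the CASE logic above can be mirrored in a short Python sketch. The function name and the default suffix list are hypothetical, chosen only to match the sample data; this is not part of the query itself.

```python
def move_suffix_to_front(sku, suffixes=("MA", "B", "XYZ")):
    """Return sku with the first matching suffix moved to the front.

    The suffix list is an assumption taken from the sample data;
    values without a listed suffix are returned unchanged.
    """
    for suffix in suffixes:
        if sku.endswith(suffix):
            # Drop the suffix from the end and prepend it.
            return suffix + sku[: len(sku) - len(suffix)]
    return sku


for sku in ["22378MA", "22378B", "22378XYZ", "22378TT"]:
    print(sku, move_suffix_to_front(sku))
```

As in the CASE expression, the first matching branch wins, so the order of the suffixes matters if one suffix ends with another.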

Alternatively, if your suffix is always text that follows the numerics, you could use:
;with data as
(
SELECT '22378MA' as sku UNION ALL
SELECT '22444378B' as sku UNION ALL
SELECT '12345GHJ' as sku UNION ALL
SELECT '78456M' as sku
)
SELECT
sku
,RIGHT(sku,LEN(sku) - PATINDEX('%[A-Za-z]%',sku) + 1) + '' + LEFT(sku,PATINDEX('%[A-Za-z]%',sku) - 1) as sku2
from data
This puts the text (however long it is) before the numbers in the string.
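
The split-at-the-first-letter idea can be sketched in Python for quick checking. The function name is hypothetical, and the guard for values with no letters is an assumption added here (the PATINDEX query does not handle that case).

```python
import re


def letters_first(sku):
    """Move everything from the first letter onward to the front of sku."""
    match = re.search(r"[A-Za-z]", sku)
    if match is None:
        # Assumption: values without any letter are returned unchanged.
        return sku
    first_letter = match.start()
    return sku[first_letter:] + sku[:first_letter]


for sku in ["22378MA", "22444378B", "12345GHJ", "78456M"]:
    print(sku, letters_first(sku))
```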

Try this to see if you get the results you want
select sku,substring(sku,6,7)+substring(sku,1,5)
from table
If it works, apply it as an update:
update table set sku = substring(sku,6,7)+substring(sku,1,5)

Related

How to find and extract a substring in BigQuery

I have a string column in a BigQuery table, for example:
name
WW_for_all_feed
EU_param_1_for_all_feed
AU_for_all_full_settings_18+
WW_for_us_param_5_for_us_feed
WW_for_us_param_5_feed
WW_for_all_25+
and I also have a list of variables, for example:
param_1_for_all
param_5_for_us
param_5
full_settings
If a string in the "name" column contains one of these substrings, I need to extract it:
name                           param
------------------------------ ---------------
WW_for_all_feed                None
EU_param_1_for_all_feed        param_1_for_all
AU_for_all_full_settings_18+   full_settings
WW_for_us_param_5_for_us_feed  param_5_for_us
WW_for_us_param_5_feed         param_5
WW_for_all_25+                 None
I want to try regexp and replace, but I don't know which pattern to use to find the substring.
Use the query below:
select name, param
from your_table
left join params
on regexp_contains(name, param)
Applied to the sample data in your question:
with your_table as (
select 'WW_for_all_feed' name union all
select 'EU_param_1_for_all_feed' union all
select 'AU_for_all_full_settings_18+' union all
select 'WW_for_us_param_5_for_us_feed' union all
select 'WW_for_all_25+'
), params as (
select 'param_1_for_all' param union all
select 'param_5_for_us' union all
select 'full_settings'
)
output is
But I have another issue (see the updated question): what if one of the params is a substring of another?
Then use the query below:
select name, string_agg(param order by length(param) desc limit 1) param
from your_table
left join params
on regexp_contains(name, param)
group by name
Applied to your updated sample data, the output matches the expected result, with NULL where no param matches.
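
The longest-match rule can be checked outside BigQuery with a small Python sketch. The function name is hypothetical, and it uses plain substring tests rather than regexp_contains, which is equivalent only as long as the params contain no regex metacharacters.

```python
def longest_param(name, params):
    """Return the longest param contained in name, or None if none match."""
    matches = [p for p in params if p in name]
    # Mirrors "order by length(param) desc limit 1" in the query.
    return max(matches, key=len) if matches else None


params = ["param_1_for_all", "param_5_for_us", "param_5", "full_settings"]
for name in ["WW_for_us_param_5_for_us_feed", "WW_for_us_param_5_feed", "WW_for_all_25+"]:
    print(name, longest_param(name, params))
```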

Find and replace pattern inside BigQuery string

Here is my BigQuery table. I am trying to find out the URLs that were displayed but not viewed.
create table dataset.url_visits(ID INT64 ,displayed_url string , viewed_url string);
select * from dataset.url_visits;
ID Displayed_URL Viewed_URL
1 url11,url12 url12
2 url9,url12,url13 url9
3 url1,url2,url3 NULL
In this example, I want to display
ID Displayed_URL Viewed_URL unviewed_URL
1 url11,url12 url12 url11
2 url9,url12,url13 url9 url12,url13
3 url1,url2,url3 NULL url1,url2,url3
Split each string into an array and unnest it. Use a CASE expression to check whether each displayed item is among the viewed items, then aggregate the results into a string or an array.
Select ID, string_agg(viewing ) as viewed,
string_agg(not_viewing ) as not_viewed,
array_agg(viewing ignore nulls) as viewed_array
from (
Select ID ,
case when display in unnest(split(Viewed_URL)) then display else null end as viewing,
case when display in unnest(split(Viewed_URL)) then null else display end as not_viewing,
from (
Select 1 as ID, "url11,url12" as Displayed_URL, "url12" as Viewed_URL UNION ALL
Select 2, "url9,url12,url13", "url9" UNION ALL
Select 3, "url1,url2,url3", NULL UNION ALL
Select 4, "url9,url12,url13", "url9,url12"
),unnest(split(Displayed_URL)) as display
)
group by 1
Consider the approach below:
select *, (
select string_agg(url)
from unnest(split(Displayed_URL)) url
where url not in unnest(split(ifnull(Viewed_URL, '')))
) unviewed_URL
from `project.dataset.table`
Applied to the sample data in your question, the output matches the expected unviewed_URL column.
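
For checking expected values by hand, the split-and-filter logic can be sketched in Python. The function name is hypothetical; the sketch assumes comma-separated lists without surrounding spaces and treats a missing Viewed_URL as an empty list.

```python
def unviewed_urls(displayed, viewed):
    """Return the displayed URLs that are not in viewed, comma-separated.

    Returns None when every displayed URL was viewed, which mirrors
    string_agg returning NULL over zero rows.
    """
    viewed_set = set(viewed.split(",")) if viewed else set()
    remaining = [url for url in displayed.split(",") if url not in viewed_set]
    return ",".join(remaining) if remaining else None


print(unviewed_urls("url11,url12", "url12"))
print(unviewed_urls("url9,url12,url13", "url9"))
print(unviewed_urls("url1,url2,url3", None))
```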

Split a Column with Delimited Values and Compare Each Value

I have a column that contains multiple values in a delimited (comma-separated) format -
id | code
------------
1 11,19,21
2 55,87,33
3 3,11
4 11
I want to be able to compare to each value inside the 'code' column as below -
SELECT id FROM myTbl WHERE code = '11'
This should return -
1
3
4
I've tried the solution below but it does not work for all cases -
SELECT id FROM myTbl WHERE POSITION('11' IN code) <> 0
This works for a two-digit number like '11' because POSITION returns a nonzero value when it finds a match. But it fails when searching for, say, '3', because the rows with id 2 and 3 are both returned.
Here is a link that describes the POSITION function in Redshift.
Any other approach that will solve this problem?
You can count the matches of the value as a whole list element:
SELECT id FROM myTbl WHERE regexp_count(code, '(^|,)11(,|$)') > 0
I think we can use regexp_substr() as follows.
select tb.id from myTbl tb where '11' in (
select regexp_substr( (select code from myTbl where id=tb.id),'[^,]+', 1, LEVEL) from dual
connect by regexp_substr((select code from myTbl where id=tb.id) , '[^,]+', 1, LEVEL) is not null);
just try this.
Use the split_part() function (this version checks only the first three values; add more positions if a row can hold more):
SELECT distinct id
FROM myTbl
WHERE '11' in ( split_part( code||',' , ',', 1 ),
split_part( code||',' , ',', 2 ),
split_part( code||',' , ',', 3 ) )
This is a very, very bad data model. You should be storing this information in a junction/association table, with one row per value.
But, if you have no choice, you can use like:
SELECT id
FROM myTbl
WHERE ',' || code || ',' LIKE '%,11,%';
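
The delimiter-wrapping trick can be tried locally. The sketch below runs the same LIKE predicate against an in-memory SQLite database, used here only as a stand-in for Redshift (both accept the || concatenation operator); the helper name ids_with_code is hypothetical.

```python
import sqlite3

# In-memory stand-in for the table in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTbl (id INTEGER, code TEXT)")
conn.executemany(
    "INSERT INTO myTbl VALUES (?, ?)",
    [(1, "11,19,21"), (2, "55,87,33"), (3, "3,11"), (4, "11")],
)


def ids_with_code(value):
    """Return ids whose comma-separated code list contains value exactly."""
    # Wrapping both sides in commas makes every element look like ",x,".
    pattern = "%," + value + ",%"
    rows = conn.execute(
        "SELECT id FROM myTbl WHERE ',' || code || ',' LIKE ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in rows]


print(ids_with_code("11"))
print(ids_with_code("3"))
```

Searching for '3' no longer matches the row containing 33, because the pattern requires a comma on both sides of the value.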

How to use regex to split using last occurrence of forward slash in BigQuery

I have sample data as
with temp_table as
(
select "/category/sub-category/title-of-the-page" as pagename
union all
select "premier-league/splash"
union all
select "portal"
union all
select "news/1970/01/01/new-billion"
union all
select "/premier-league/transfers/"
union all
select "/premier-league/tfflive"
)
, clean_pagename as
(
select * ,
if (regexp_contains(pagename, "^/+" ) , regexp_extract(pagename, "^/+(.*)/?$") , pagename) as clean_page
from temp_table
)
, dated_content as
(
select *, if (
regexp_contains(clean_page , "/[0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]/") ,
regexp_replace(clean_page , "[0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]", "dated-content" ),
clean_page
) as new_pagename
from clean_pagename
)
,category_and_titles as
(
select *, split(new_pagename, "/")[offset(0)] as page_category,
coalesce(REGEXP_EXTRACT(new_pagename, r'/([^/]+)?$') , "no-title") as title,
regexp_replace(new_pagename, r'[^/]+$', "") as path
from dated_content
)
select pagename,
page_category ,
path,
title
from category_and_titles
Here is what I am doing: I remove the leading / from the string and replace dates with "dated-content" using a regex. Next, I would like to extract three things:
category - first section of the string before first /
path - that component of string from 0 until last / has been encountered
title - everything after last / in the string.
There are instances where / is not present at all (record #3). In this case I want all the 3 parts to be equal to original string.
For example - for string as /premier-league/transfers/, I would like my output to be -
category = "premier-league" , path = "premier-league/transfers/" , title = ""
My current code gives me results as
Whereas, I need -
Without much refactoring, and leaving all your original logic intact, just make the changes below to the category_and_titles CTE:
...
, category_and_titles AS (
SELECT *,
SPLIT(new_pagename, "/")[OFFSET(0)] AS page_category,
IF(REGEXP_CONTAINS(new_pagename, r'/'), REGEXP_REPLACE(new_pagename, r'[^/]+$', ""), new_pagename) AS path,
IF(REGEXP_CONTAINS(new_pagename, r'/'), COALESCE(REGEXP_EXTRACT(new_pagename, r'/([^/]+)?$'), "no-title"), new_pagename) AS title
FROM dated_content
)
...
With this minor change, the result will be as expected.
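
The three-way split described in the question can also be sketched in Python to check expected values. The function name is hypothetical, and the input is assumed to be already cleaned (leading slash removed, dates replaced). Unlike the CTE above, this sketch returns an empty title when the string ends with a slash, as in the question's example, rather than "no-title".

```python
def split_page(page):
    """Return (category, path, title) for an already-cleaned page name."""
    if "/" not in page:
        # No slash at all: all three parts equal the original string.
        return page, page, page
    head, sep, title = page.rpartition("/")  # split on the last slash
    category = page.split("/", 1)[0]         # text before the first slash
    return category, head + sep, title


print(split_page("premier-league/transfers/"))
print(split_page("portal"))
```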

Select column from table and add ',' except last row

In this
SELECT field + ',' FROM table
I get something like this
1,
2,
3,
But I need to get
1,
2,
3
Last one should have no comma.
You should look at this function:
LIST()
This question may also be a duplicate; check whether an answer to the following question fits your needs: How to concatenate text from multiple rows into a single text string in SQL Server
It appears Firebird allows you to limit rows with the rows keyword.
Assuming it can also be used in an inline view, you could run the following:
select case when x.field is not null
then t.field
else t.field + ','
end as field_alias
from tbl t
left join
(
select field
from tbl
order by field desc
rows 1 to 1
) x
on t.field = x.field
order by 1
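
The rule this query implements (a comma on every row except the one that sorts last) can be sketched in Python to check the expected output. The function name is hypothetical, and values are compared as strings here, which is an assumption about the column type.

```python
def with_trailing_commas(values):
    """Append ',' to every value except the one that sorts last."""
    ordered = sorted(str(v) for v in values)
    # Every element but the last gets a comma; the last is kept as is.
    return [v + "," for v in ordered[:-1]] + ordered[-1:]


for line in with_trailing_commas([1, 2, 3]):
    print(line)
```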
If you are using MySQL, this returns a comma-separated list (in one row) of all values in your_column in your_table:
SELECT GROUP_CONCAT(your_column) FROM your_table
It defaults to using a comma but you can specify more options like SEPARATOR, DISTINCT and ORDER BY.
If there is a unique field, you can try this to get the last row without a comma:
SELECT
CASE when isnull(B.field,'')='' THEN A.field+',' ELSE A.field END
FROM [table] A
left join
(
SELECT TOP 1 field FROM [table] ORDER BY unique_field DESC
)B ON A.field=B.field
ORDER BY A.unique_field
As mentioned above, MySQL returns comma-separated values by default. In the past I've changed the separator to a space:
GROUP_CONCAT(table_column SEPARATOR " ")