Help with T-sql special sorting rules - sql

I have a field like:
SELECT * FROM
(
SELECT 'A9t' AS sortField UNION ALL
SELECT 'A10t' UNION ALL
SELECT 'A11t' UNION ALL
SELECT 'AB9F' UNION ALL
SELECT 'AB10t' UNION ALL
SELECT 'AB11t'
) t ORDER BY sortField
and the result is:
sortField
---------
A10t
A11t
A9t
AB10t
AB11t
AB9F
Actually I need is to combine the string and number sorting rules:
sortField
---------
A9t
A10t
A11t
AB9F
AB10t
AB11t

SELECT *
FROM (
SELECT 'A9t' AS sortField UNION ALL
SELECT 'A10t' UNION ALL
SELECT 'A11t' UNION ALL
SELECT 'AB9F' UNION ALL
SELECT 'AB10t' UNION ALL
SELECT 'AB11t'
)
t
ORDER BY LEFT(sortField,PATINDEX('%[0-9]%',sortField)-1) ,
CAST(substring(sortField,PATINDEX('%[0-9]%',sortField),1 + PATINDEX('%[0-9][A-Z]%',sortField) -PATINDEX('%[0-9]%',sortField) ) AS INT),
substring(sortField,PATINDEX('%[0-9][A-Z]%',sortField) + 1,LEN(sortField))

If the first character is always a letter, try:
SELECT * FROM
(
SELECT 'A9t' AS sortField UNION ALL
SELECT 'A10t' UNION ALL
SELECT 'A11t'
) t ORDER BY substring(sortField,2,len(sortField)-1) desc

I would say that you have combined the alpha and numeric sort. But what I think you're asking is that you want to sort letters in ascending order and numbers in descending order, and that might be hard to do in a nice looking way. The previous answers will not working for your problem, the problem is that Martin Smith's solution doesn't take strings with two letters as prefix and Parkyprg doesn't sort numbers before letters as you ask for.
What you need to do is to use a custom order, see example here: http://www.emadibrahim.com/2007/05/25/custom-sort-order-in-a-sql-statement/, but that is a tedious way to do it.
EDIT: Martins Smith's solution is updated and works just fine!

Related

BQ function to transform caron character from a string to character without caron

I am trying to get rid of a caron characters (č,ć,š,ž) and transform them to c,c,s,z :
with my_table as (
select 'abcčd sštuv' caron, 'abccd sstuv' no_caron union all
select 'uvzž', 'uvzz' union all
select 'ijkl cČd', 'ijkl cCd' union ALL
select 'Ćdef', 'Cdef'
)
SELECT *
FROM my_table
I was trying to get rid of them with
SELECT *,
regexp_replace(caron, r'š', 's') as no_caron
FROM my_table
but I think this is inefficient. I know there is an option to write your on function as described
here, but I have no idea how to use it in my case.
Thanks in advance!
Use below
SELECT *, regexp_replace(normalize(caron, NFD), r"\pM", '') output
FROM my_table
if applied to sample data in your question - output is

Extract number or string after string in BigQuery

I have several 1.000 URLs and want to extract some values from the URL parameters.
Here some examples from the DB:
["www.xxx.com?uci=6666&rci=fefw"]
["www.xxx.com?uci=61
["www.xxx.com?rci=62&uci=5536"]
["www.xxx.com?uci=6666&utm_source=XXX"]
["www.xxx.com?pccst=TEST%20sTESTg"]
["www.xxx.com?pccst=TEST2%20s&uci=1"]
["www.xxx.com?uci=1pccst=TEST42rt24&rci=2"]
How can I extract the value of the parameter UCI. It is always a digit number (don’t know the exact length).
I tried it with REGEXP_EXTRACT. But I didn't succeed:
REGEXP_EXTRACT(URL, '(uci)\=[0-9]+') AS UCI_extract
And I also want to extract the value of the parameter pccst. It can be every character and I don`t know the exact length. But it always ends with “ or ? or &
I tried it also with REGEXP_EXTRACT but didn't succeed:
REGEXP_EXTRACT(URL, r'pccst\=(.*)(\"|\&|\?)') AS pccst_extract
I am really not the REGEX expert.
So would be great if someone could help me.
Thanks a lot in advance,
Peter
You can adapt this solution
#standardSQL
# Extract query parameters from a URL as ARRAY in BigQuery; standard-sql; 2018-04-08
# #see http://www.pascallandau.com/bigquery-snippets/extract-url-parameters-array/
WITH examples AS (
SELECT 1 AS id, 'www.xxx.com?uci=6666&rci=fefw' AS query
UNION ALL SELECT 2, 'www.xxx.com?uci=1pccst%20TEST42rt24&rci=2'
UNION ALL SELECT 3, 'www.xxx.com?pccst=TEST2%20s&uci=1'
)
SELECT
id,
query,
REGEXP_EXTRACT_ALL(query,r'(?:\?|&)((?:[^=]+)=(?:[^&]*))') as params,
REGEXP_EXTRACT_ALL(query,r'(?:\?|&)(?:([^=]+)=(?:[^&]*))') as keys,
REGEXP_EXTRACT_ALL(query,r'(?:\?|&)(?:(?:[^=]+)=([^&]*))') as values
FROM examples
Below example for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table` AS (
SELECT "www.xxx.com?uci=6666&rci=fefw" url UNION ALL
SELECT "www.xxx.com?uci=61" UNION ALL
SELECT "www.xxx.com?rci=62&uci=5536" UNION ALL
SELECT "www.xxx.com?uci=6666&utm_source=XXX" UNION ALL
SELECT "www.xxx.com?pccst=TEST%20sTESTg" UNION ALL
SELECT "www.xxx.com?pccst=TEST2%20s&uci=1" UNION ALL
SELECT "www.xxx.com?uci=1&pccst=TEST42rt24&rci=2"
)
SELECT
url,
REGEXP_EXTRACT(url, r'[?&]uci=(.*?)(?:$|&)') uci,
REGEXP_EXTRACT(url, r'[?&]pccst=(.*?)(?:$|&)') pccst
FROM `project.dataset.table`
result is
Row url uci pccst
1 www.xxx.com?pccst=TEST%20sTESTg null TEST%20sTESTg
2 www.xxx.com?pccst=TEST2%20s&uci=1 1 TEST2%20s
3 www.xxx.com?uci=1&pccst=TEST42rt24&rci=2 1 TEST42rt24
4 www.xxx.com?uci=61 61 null
5 www.xxx.com?rci=62&uci=5536 5536 null
6 www.xxx.com?uci=6666&rci=fefw 6666 null
7 www.xxx.com?uci=6666&utm_source=XXX 6666 null
Also, below option to parse out all key-value pairs so, then you can dynamically select needed
#standardSQL
WITH `project.dataset.table` AS (
SELECT "www.xxx.com?uci=6666&rci=fefw" url UNION ALL
SELECT "www.xxx.com?uci=61" UNION ALL
SELECT "www.xxx.com?rci=62&uci=5536" UNION ALL
SELECT "www.xxx.com?uci=6666&utm_source=XXX" UNION ALL
SELECT "www.xxx.com?pccst=TEST%20sTESTg" UNION ALL
SELECT "www.xxx.com?pccst=TEST2%20s&uci=1" UNION ALL
SELECT "www.xxx.com?uci=1pccst=TEST42rt24&rci=2"
)
SELECT url,
ARRAY(
SELECT AS STRUCT
SPLIT(kv, '=')[SAFE_OFFSET(0)] key,
SPLIT(kv, '=')[SAFE_OFFSET(1)] value
FROM UNNEST(SPLIT(SUBSTR(url, LENGTH(NET.HOST(url)) + 2), '&')) kv
) key_value_pair
FROM `project.dataset.table`

How to cut everything after a specific character, but in case string doesn't contain it do nothing?

Let's say i have following data:
fjflka, kdjf
ssssllkjf fkdsjl
skfjjsld, kjl
jdkfjlj, ksd
lkjlkj hjk
I want to cut out everything after ',' but in case the string doesn't contain this character, it wont do anything, if i use substr and cut everything after ',' the string which doesn't contain this character shows as null. How do i achieve this? Im using oracle 11g.
This should work. Simply use regexp_substr
with t_view as (
select 'fjflka, kdjf' as text from dual union
select 'ssssllkjf fkdsjl' from dual union
select 'skfjjsld, kjl' from dual union
select 'jdkfjlj, ksd' from dual union
select 'lkjlkj hjk' from dual
)
select text,regexp_substr(text,'[^,]+',1,1) from t_view;
Assuming your table :
SQL> desc mytable
s varchar2(100)
you may use:
select decode(instr(s,','),0,s,substr(s,1,instr(s,',')-1)) from mytable;
demo
Well the below query works as per your requirement.
with mytable as
(select 'aaasfasf wqwe' s from dual
union all
select 'aaasfasf, wqwe' s from dual)
select s,substr(s||',',1,instr(s||',',',')-1) from mytable;

SQL: Select strings which have equal words

Suppose I have a table of strings, like this:
VAL
-----------------
Content of values
Values identity
Triple combo
my combo
sub-zero combo
I want to find strings which have equal words. The result set should be like
VAL MATCHING_VAL
------------------ ------------------
Content of values Values identity
Triple combo My combo
Triple combo sub-zero combo
or at least something like this.
Can you help?
One method is to use a hack for regular expressions:
select t1.val, t2.val
from t t1 join
t t2
on regexp_like(t1.val, replace(t2.val, ' ', '|');
You might want the case to be identical as well:
on regexp_like(lower(t1.val), replace(lower(t2.val), ' ', '|');
You could use a combination of SUBSTRING and LIKE.
use charIndex(" ") to split the words up in the substring if thats what you want to do.
Using some of the [oracle internal similiarity] found in UTL_Match (https://docs.oracle.com/database/121/ARPLS/u_match.htm#ARPLS71219) matching...
This logic is more for matching names or descriptions that are 'Similar' and where phonetic spellings or typo's may cause the records not to match.
By adjusting the .5 below you can see how the %'s get you closer and closer to perfect matches.
with cte as (
select 'Content of values' val from dual union all
select 'Values identity' val from dual union all
select 'triple combo' from dual union all
select 'my combo'from dual union all
select 'sub-zero combo'from dual)
select a.*, b.*, utl_match.edit_distance_similarity(a.val, b.val) c, UTL_MATCH.JARO_WINKLER(a.val,b.val) JW
from cte a
cross join cte b
where UTL_MATCH.JARO_WINKLER(a.val,b.val) > .5
order by utl_match.edit_distance_similarity(a.val, b.val) desc
and screenshot of query/output.
Or we could use an inner join and > if we only want one way compairisons...
select a.*, b.*, utl_match.edit_distance_similarity(a.val, b.val) c, UTL_MATCH.JARO_WINKLER(a.val,b.val) JW
from cte a
inner join cte b
on A.Val > B.Val
where utl_match.jaro_winkler(a.val,b.val) > .5
order by utl_match.edit_distance_similarity(a.val, b.val) desc
this returns the 3 desired records.
But this does not explicitly check each any word matches. which was your base requirement. I just wanted you to be aware of alternatives.

Oracle : how to order hexadecimal field

Silly question but I can't find a reasonable answer.
I need to order a field containing hexadecimal values like :
select str from
(
select '2212A' str from dual union all
select '2212B' from dual union all
select '22129' from dual union all
select '22127' from dual union all
select '22125' from dual union all
select '22126' from dual
) t
order by str asc;
This request give :
STR
------------
2212A
2212B
22125
22126
22127
22129
I would like
STR
------------
22125
22126
22127
22129
2212A
2212B
How can I do that ?
Are these HEX numbers? Will the max letter be F? Then convert hex to decimal:
select str
from t
order by to_number(str,'XXXXXXXXXXXX');
EDIT: Stupid me. The title says it's hex numbers :P So this solution should work for you.
You have to clarify further what you want to achieve, but in general, you can sort your table on that column and the first row is the one with smallest value:
SELECT * FROM mytable ORDER BY mycolumn
If you want one record only:
SELECT * FROM mytable ORDER BY mycolumn WHERE rownum = 1
You can try something like a try_convert() in case the conversion fails.
SELECT * FROM table_name ORDER BY TRY_CONVERT(VARBINARY, column_name) ASC;
Hope this helps!