Selecting substrings from different points in strings depending on another column entry SQL - sql

I have 2 columns that look a little like this:
Column A
Column B
Column C
ABC
{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}
1.0
DEF
{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}
24.0
I need a select statement to create column C - the numerical digits in column B that correspond to the letters in Column A. I have got as far as finding the starting point of the numbers I want to take out. But as they have different character lengths I can't count a length, I want to extract the characters from the calculated starting point( below) up to the next comma.
STRPOS(Column B, Column A) +5 Gives me the correct character for the starting point of a SUBSTRING query, from here I am lost. Any help much appreciated.
NB, I am using google Big Query, it doesn't recognise CHARINDEX.

You can use a regular expression as well.
WITH sample_table AS (
SELECT 'ABC' ColumnA, '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}' ColumnB UNION ALL
SELECT 'DEF', '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}' UNION ALL
SELECT 'XYZ', '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}'
)
SELECT *,
REGEXP_EXTRACT(ColumnB, FORMAT('"%s":([0-9.]+)', ColumnA)) ColumnC
FROM sample_table;
Query results
[Updated]
Regarding #Bihag Kashikar's suggestion: sinceColumnB is an invalid json, it will not be properly parsed within js udf like below. If it's a valid json, js udf with json key can be an alternative of a regular expression. I think.
CREATE TEMP FUNCTION custom_json_extract(json STRING, key STRING)
RETURNS STRING
LANGUAGE js AS """
try {
obj = JSON.parse(json);
}
catch {
return null;
}
return obj[key];
""";
SELECT custom_json_extract('{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}', 'ABC') invalid_json,
custom_json_extract('{"ABC":1.0,"DEF":24.0,"XYZ":10.50}', 'ABC') valid_json;
Query results

take a look at this post too, this shows using js udf and with split options
Error when trying to have a variable pathsname: JSONPath must be a string literal or query parameter

Related

Parsing within a field using SQL

We are receiving data in one column where further parsing is needed. In this example the separator is ~.
Goal is to grab the pass or fail value from its respective pair.
SL
Data
1
"PARAM-0040,PASS~PARAM-0045,PASS~PARAM-0070,PASS"
2
"PARAM-0040,FAIL~PARAM-0045,FAIL~PARAM-0070,PASS"
Required outcome:
SL
PARAM-0040
PARAM-0045
PARAM-0070
1
PASS
PASS
PASS
2
FAIL
FAIL
PASS
This will be a part of a bigger SQL query where we are selecting many other columns, and these three columns are to be picked up from the source as well and passed in the query as selected columns.
E.g.
Select Column1, Column2, [ Parse code ] as PARAM-0040, [ Parse code ] as PARAM-0045, [ Parse code ] as PARAM-0070, Column6 .....
Thanks
You can do that with a regular expression. But regexps are non-standard.
This is how it is done in postgresql: REGEXP_MATCHES()
https://www.postgresqltutorial.com/postgresql-regexp_matches/
In postgresql regexp_matches returns zero or more values. So then it has to be broken down (thus the {})
A simpler way, also in postgresql is to use substring.
substring('foobar' from 'o(.)b')
Like:
select substring('PARAM-0040,PASS~PARAM-0045,PASS~PARAM-0070,PASS' from 'PARAM-0040,([^~]+)~');
substring
-----------
PASS
(1 row)
You may use the str_to_map function to split your data and subsequently extract each param's value. This example will first split each param/value pair by ~ before splitting the parameter and value by ,.
Reproducible example with your sample data:
WITH my_table AS (
SELECT 1 as SL, "PARAM-0040,PASS~PARAM-0045,PASS~PARAM-0070,PASS" as DATA
UNION ALL
SELECT 2 as SL, "PARAM-0040,FAIL~PARAM-0045,FAIL~PARAM-0070,PASS" as DATA
),
param_mapped_data AS (
SELECT SL, str_to_map(DATA,"~",",") param_map FROM my_table
)
SELECT
SL,
param_map['PARAM-0040'] AS PARAM0040,
param_map['PARAM-0045'] AS PARAM0045,
param_map['PARAM-0070'] AS PARAM0070
FROM
param_mapped_data
Actual code assuming your table is named my_table
WITH param_mapped_data AS (
SELECT SL, str_to_map(DATA,"~",",") param_map FROM my_table
)
SELECT
SL,
param_map['PARAM-0040'] AS PARAM0040,
param_map['PARAM-0045'] AS PARAM0045,
param_map['PARAM-0070'] AS PARAM0070
FROM
param_mapped_data
Outputs:
sl
param0040
param0045
param0070
1
PASS
PASS
PASS
2
FAIL
FAIL
PASS

Translate function not returning relevant string in amazon redshift

I am trying to use a simple Translate function to replace "-" in a 23 digit string. The example of one such string is "1049477-1623095-2412303" The expected outcome of my query should be 104947716230952412303
The list of all "1049477-1623095-2412303" is present in a single column "table1". The name of the column is "data"
My query is
Select TRANSLATE(t.data, '-', '')
from table1 as t
However, it is returning 104947716230952000000 as the output.
At first, I thought it is an overflow error since the resulting integer is 20 digit so I also tried to use following
SELECT CAST(TRANSLATE(t.data,'-','') AS VARCHAR)
from table1 as t
but this is not working as well.
Please suggest a way so that I could have my desirable output
This is too long for a comment.
This code:
select translate('1049477-1623095-2412303', '-', '')
is going to return:
'104947716230952412303'
The return value is a string, not a number.
There is no way that it can return '104947716230952000000'. I could only imagine that happening if somehow the value is being converted to a numeric or bigint type.
Try regexp_replace()
Taking your own example, execute:
select regexp_replace('[string / column_name]','-');
It can be achieve RPAD try below code.
SELECT RPAD(TRANSLATE(CAST(t.data as VARCHAR),'-','') ,20,'00000000000000000000')

SQL Server: How to select rows which contain value comprising of only one digit

I am trying to write a SQL query that only returns rows where a specific column (let's say 'amount' column) contains numbers comprising of only one digit, e.g. only '1's (1111111...) or only '2's (2222222...), etc.
In addition, 'amount' column contains numbers with decimal points as well and these kind of values should also be returned, e.g. 1111.11, 2222.22, etc
If you want to make the query generic that you don't have to specify each possible digit you could change the where to the following:
WHERE LEN(REPLACE(REPLACE(amount,LEFT(amount,1),''),'.','') = 0
This will always use the first digit as comparison for the rest of the string
If you are using SQL Server, then you can try this script:
SELECT *
FROM (
SELECT CAST(amount AS VARCHAR(30)) AS amount
FROM TableName
)t
WHERE LEN(REPLACE(REPLACE(amount,'1',''),'.','') = 0 OR
LEN(REPLACE(REPLACE(amount,'2',''),'.','') = 0
I tried like this in place of 1111111 replace with column name:
Select replace(Str(1111111, 12, 2),0,left(11111,1))

DB2 SQL Anything left of a /

I've been working on this for days and can't seem to work it out. Basically I need return digits from a field before there is a forward slash. e.g. if the field was 1234/TEXT I want to return 1234. I can't just use left fieldname 4 as the digits vary in left e.g. 12345/TEXT, so it needs to be anything left of the forward slash. Now in the World of MS Access, it is something like this - and it works
Left(TABLE!FIELD,InStr(1,TABLE!FIELD,"/")-1)
However, how do I convert this to be used in an IBM\DB2 system? The DB2 SQL seems somewhat different to 'normal' SQL.
Thanks!
Rather than INSTR, maybe LOCATE
LOCATE(char, string)
char is the search term
string is the string being searched
You can achieve this by combining LOCATE with SUBSTR;
Locate information
Substring information
Cheat sheet (for this example);
SUBSTRING('FIELD','START POSITION', 'LENGTH')
LOCATE('SEARCH STRING', 'SOURCE STRING')
SUBSTRING lets you retrieve specific characters from a string, i.e.;
AFIELD = 'Hello'
SUBSTRING(AFIELD,4,2)
Result = 'lo' (position 4 and 5 of Hello)
LOCATE returns the position of the first character of the search string it finds as a number, i.e.;
AFIELD = 'Hello'
LOCATE('ello', AFIELD)
Result = 2 (it starts at position 2)
So you can combine these to do what you want, example;
XTABLE has 1 column called ACOL with the following values in it;
123467/ABCD
1321/ABDD
1123467/ABCD
To just retrieve the numbers;
SELECT SUBSTRING(ACOL,1, LOCATE('/',ACOL)-1)
FROM XRDK/XTABLE
Result;
123467
1321
1123467
What are we doing?
SUBSTRING(
ACOL,
1,
LOCATE('/',ACOL)-1
)
SUBSTRING(
Field ACOL,
Starting at position 1,
Length; using locate set this to where I find a '/' and subtract 1 from the
resulting postion (without the -1 you'd have the / on the end)
)
Try this
SELECT SUBSTRING(CAST (ROUND(COLUMN,2) AS DECIMAL(6,2)), 0, locate('/',CAST (ROUND(COLUMN,2) AS DECIMAL(6,2))))
FROM TABLE

SQL, joining the same table

I have a table with the columns code and document. The column code may have alphanumeric values (only letters) and/or numeric values (only digits). In the model, every numeric code has an alphanumeric equivalent code. The records below represent an example of this situation (in the form (document,code);(document,code):
(12345678900,ABC);(12345678900,999)
But, a alphanumeric code may not always have a equivalent numeric code, so the example below represents a situation where we have 3 different records
(12345678900,ABC);(12345678900,999);(00987654321,XYZ);(11111111111,DEF)
With this in mind, what I want to do is the following: what I'll use to search records is always alphanumeric codes, and when I have an equivalent numeric, I want as result the numeric one, but when the alpha doesn't have the numeric equivalent, I want the alphanumeric code.
For instance, if I execute the following selects, I would get the results below:
SELECT code FROM table WHERE code = 'ABC' -> Result: 999
SELECT code FROM table WHERE code = 'DEF' -> Result: DEF
SELECT code FROM table WHERE code = 'XXX' -> Result: (blank)
I appreciate if anybody could help me please.
Regards,
AMR
Try this
SELECT
f1.doc
, COALESCE(f2.code, f1.code) code
FROM foo f1
LEFT JOIN(
SELECT doc, code
FROM foo
WHERE code <> #code
) f2 ON f2.doc = f1.doc
WHERE f1.code = #code
demo