I've been working on this for days and can't seem to work it out. Basically, I need to return the digits in a field that come before a forward slash, e.g. if the field was 1234/TEXT I want to return 1234. I can't just use Left(fieldname, 4) because the number of digits varies, e.g. 12345/TEXT, so it needs to be everything to the left of the forward slash. In MS Access it is something like this, and it works:
Left(TABLE!FIELD,InStr(1,TABLE!FIELD,"/")-1)
However, how do I convert this for use on an IBM DB2 system? DB2 SQL seems somewhat different from 'normal' SQL.
Thanks!
Rather than INSTR, maybe LOCATE
LOCATE(char, string)
char is the search term
string is the string being searched
You can achieve this by combining LOCATE with SUBSTR;
Locate information
Substring information
Cheat sheet (for this example);
SUBSTRING('FIELD','START POSITION', 'LENGTH')
LOCATE('SEARCH STRING', 'SOURCE STRING')
SUBSTRING lets you retrieve specific characters from a string, i.e.;
AFIELD = 'Hello'
SUBSTRING(AFIELD,4,2)
Result = 'lo' (position 4 and 5 of Hello)
LOCATE returns the position of the first character of the search string it finds as a number, i.e.;
AFIELD = 'Hello'
LOCATE('ello', AFIELD)
Result = 2 (it starts at position 2)
So you can combine these to do what you want, example;
XTABLE has 1 column called ACOL with the following values in it;
123467/ABCD
1321/ABDD
1123467/ABCD
To just retrieve the numbers;
SELECT SUBSTRING(ACOL,1, LOCATE('/',ACOL)-1)
FROM XRDK/XTABLE
Result;
123467
1321
1123467
What are we doing?
SUBSTRING(
ACOL,
1,
LOCATE('/',ACOL)-1
)
SUBSTRING(
Field ACOL,
Starting at position 1,
Length; using LOCATE, set this to where I find a '/' and subtract 1 from the
resulting position (without the -1 you'd have the / on the end)
)
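If it helps to see the 1-based arithmetic outside the database, here is a minimal Python sketch of the same idea. The helpers locate and substring are hypothetical stand-ins that only mimic the two SQL functions for simple in-range arguments; they are not DB2 APIs. It also covers a value with no '/': LOCATE returns 0 there, so the computed length would be -1, which is not a valid length. Returning the value unchanged in that case is just one possible choice and is an assumption of this sketch.

```python
def locate(search: str, source: str) -> int:
    """Mimic SQL LOCATE: 1-based position of search in source, 0 if absent."""
    return source.find(search) + 1


def substring(source: str, start: int, length: int) -> str:
    """Mimic SQL SUBSTRING for a 1-based start and a non-negative length."""
    if start < 1 or length < 0:
        raise ValueError("start must be >= 1 and length must be >= 0")
    return source[start - 1:start - 1 + length]


def digits_before_slash(value: str) -> str:
    """Return everything left of the first '/', or the whole value if none."""
    pos = locate("/", value)
    if pos == 0:
        # Assumption: with no slash we return the value unchanged.
        return value
    return substring(value, 1, pos - 1)


for row in ["123467/ABCD", "1321/ABDD", "1123467/ABCD"]:
    print(digits_before_slash(row))
```

In SQL the same guard could be expressed with a CASE on the LOCATE result, if rows without a slash are possible in your data.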
Try this
SELECT SUBSTRING(CAST (ROUND(COLUMN,2) AS DECIMAL(6,2)), 0, locate('/',CAST (ROUND(COLUMN,2) AS DECIMAL(6,2))))
FROM TABLE
I have 2 columns that look a little like this:
Column A | Column B                            | Column C
ABC      | {"ABC":1.0,"DEF":24.0,"XYZ":10.50,} | 1.0
DEF      | {"ABC":1.0,"DEF":24.0,"XYZ":10.50,} | 24.0
I need a select statement to create Column C: the number in Column B that corresponds to the letters in Column A. I have got as far as finding the starting point of the number I want to take out, but as the numbers have different lengths I can't use a fixed length. I want to extract the characters from the calculated starting point (below) up to the next comma.
STRPOS(Column B, Column A) + 5 gives me the correct starting position for a SUBSTRING call; from here I am lost. Any help much appreciated.
NB: I am using Google BigQuery, which doesn't recognise CHARINDEX.
You can use a regular expression as well.
WITH sample_table AS (
SELECT 'ABC' ColumnA, '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}' ColumnB UNION ALL
SELECT 'DEF', '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}' UNION ALL
SELECT 'XYZ', '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}'
)
SELECT *,
REGEXP_EXTRACT(ColumnB, FORMAT('"%s":([0-9.]+)', ColumnA)) ColumnC
FROM sample_table;
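To try the pattern outside BigQuery, here is a rough Python equivalent of the REGEXP_EXTRACT call above. The function name extract_value is made up for this sketch, and the re.escape call is an addition that the SQL version does not have; it only matters if a key could contain regex metacharacters.

```python
import re
from typing import Optional


def extract_value(column_a: str, column_b: str) -> Optional[str]:
    """Return the number that follows "<column_a>": in column_b, or None."""
    # re.escape guards against keys that contain regex metacharacters.
    pattern = '"%s":([0-9.]+)' % re.escape(column_a)
    match = re.search(pattern, column_b)
    return match.group(1) if match else None


column_b = '{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}'
for key in ["ABC", "DEF", "XYZ"]:
    print(key, extract_value(key, column_b))
```

Like the SQL version, this returns the matched text as a string, so a cast is still needed if you want a numeric Column C.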
[Updated]
Regarding @Bihag Kashikar's suggestion: since ColumnB is invalid JSON (note the trailing comma), it will not be parsed properly inside a JS UDF like the one below. If it were valid JSON, a JS UDF that looks up the key could be an alternative to a regular expression, I think.
CREATE TEMP FUNCTION custom_json_extract(json STRING, key STRING)
RETURNS STRING
LANGUAGE js AS """
try {
obj = JSON.parse(json);
}
catch {
return null;
}
return obj[key];
""";
SELECT custom_json_extract('{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}', 'ABC') invalid_json,
custom_json_extract('{"ABC":1.0,"DEF":24.0,"XYZ":10.50}', 'ABC') valid_json;
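The same behaviour is easy to reproduce locally. This is a small Python sketch, not BigQuery code: it mirrors the UDF above using the standard json module, which also rejects the trailing comma. The isinstance check is an extra guard added for the sketch.

```python
import json
from typing import Any, Optional


def custom_json_extract(text: str, key: str) -> Optional[Any]:
    """Parse text as JSON and return the value for key, or None on failure."""
    try:
        obj = json.loads(text)
    except json.JSONDecodeError:
        # Invalid JSON (for example a trailing comma) ends up here.
        return None
    if not isinstance(obj, dict):
        return None
    return obj.get(key)


print(custom_json_extract('{"ABC":1.0,"DEF":24.0,"XYZ":10.50,}', "ABC"))
print(custom_json_extract('{"ABC":1.0,"DEF":24.0,"XYZ":10.50}', "ABC"))
```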
Take a look at this post too; it shows a JS UDF approach and one using split:
Error when trying to have a variable pathsname: JSONPath must be a string literal or query parameter
In one of the database tables, I have a nvarchar type field that contains a series of special strings combined with some special characters. For example:
'HGHGSD_JHJSD_HGSDHGJD_GFSDGFSHDGF_GFSD'
or
'SJDGh-SUDYSUI-jhsdhsj-YTsagh-ytetyyuwte-sagd'
or
'hwerweyri~sdjhfkjhsdkjfhds~jsdfhjsdhf~mdnfsd,mfn'
Based on a formula, a substring that follows the special character must be returned. This substring may come after the first, second or third occurrence of the special character (-, _ or ~). I used the CHARINDEX and SUBSTRING functions in SQL Server, but only the first part of the string is ever returned. For example:
select SUBSTRING ('hwerweyri~sdjhfkjhsdkjfhds~jsdfhjsdhf~mdnfsd,mfn', 0, CHARINDEX('~', 'hwerweyri~sdjhfkjhsdkjfhds~jsdfhjsdhf~mdnfsd,mfn', 0))
returned value: hwerweyri
If there is a solution for this purpose or you have a piece of code that can work in solving this problem, please advise.
It is important to mention that the location of the special character must be entered by ourselves in the function, for example, after the third repetition or the second repetition or the tenth repetition. The method or code should be such that the location can be entered dynamically and the function does not need to be defined statically.
For Example:
'HGHGSD_JHJSD_HGSDHGJD_GFSDGFSHDGF_GFSD' ==> 3rd substring ==> 'GFSDGFSHDGF'
'HGHGSD_JHJSD_HGSDHGJD_GFSDGFSHDGF_GFSD' ==> second substring ==> 'HGSDHGJD'
'HGHGSD_JHJSD_HGSDHGJD_GFSDGFSHDGF_GFSD' ==> 1st substring ==> 'JHJSD'
And The formula will be sent to the function through a programmed form and the generated numbers will be numbers between 1 and 15. These numbers are actually the production efficiency of a product whose form is designed in C# programming language. These numbers sent to the function are variable and each time these numbers may be sent to the function and applied to the desired character string. The output should look something like the one above. I don't know if I managed to get my point across or if I managed to make my request correctly or not.
Try the following function:
CREATE FUNCTION [dbo].[SplitWithCte]
(
    @String NVARCHAR(4000),
    @Delimiter NCHAR(1),
    @PlaceOfDelimiter int
)
RETURNS Table
AS
RETURN
(
    -- Ends = start of the current piece, Endsp = position of the delimiter that closes it
    WITH SplitedStrings(Ends,Endsp)
    AS (
        SELECT 0 AS Ends, CHARINDEX(@Delimiter,@String) AS Endsp
        UNION ALL
        SELECT Endsp+1, CHARINDEX(@Delimiter,@String,Endsp+1)
        FROM SplitedStrings
        WHERE Endsp > 0
    )
    SELECT f.DataStr
    FROM (
        SELECT 'RowId' = ROW_NUMBER() OVER (ORDER BY (SELECT 1)),
               'DataStr' = SUBSTRING(@String,Ends,COALESCE(NULLIF(Endsp,0),LEN(@String)+1)-Ends)
        FROM SplitedStrings
    ) f WHERE f.RowId = @PlaceOfDelimiter + 1
)
How to use:
select * from [dbo].[SplitWithCte](N'HGHGSD_JHJSD_HGSDHGJD_GFSDGFSHDGF_GFSD', N'_', 3)
or
select DataStr from [dbo].[SplitWithCte](N'HGHGSD_JHJSD_HGSDHGJD_GFSDGFSHDGF_GFSD', N'_', 3)
Result: GFSDGFSHDGF
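For checking expected results before wiring the function into the C# form, here is a minimal Python sketch of the same rule: skip past the requested number of delimiters and return the piece that follows. The name nth_substring and the choice to return None when there are too few delimiters are assumptions of this sketch, not part of the SQL function above.

```python
from typing import Optional


def nth_substring(text: str, delimiter: str, place: int) -> Optional[str]:
    """Return the piece that follows the place-th delimiter, or None."""
    if place < 0 or not delimiter:
        return None
    start = 0
    # Skip past `place` delimiters, much like the recursive CTE advancing Endsp.
    for _ in range(place):
        found = text.find(delimiter, start)
        if found == -1:
            return None  # fewer delimiters than requested
        start = found + len(delimiter)
    end = text.find(delimiter, start)
    return text[start:] if end == -1 else text[start:end]


sample = "HGHGSD_JHJSD_HGSDHGJD_GFSDGFSHDGF_GFSD"
for place in (1, 2, 3):
    print(place, nth_substring(sample, "_", place))
```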
I have a column (RCV1.ECCValue) in a table which 99% of the time has a constant string format- example being:
T0-11.86-273
The part between the two hyphens is a percentage. I'm using the SQL below to obtain this figure; it works fine and returns 11.86 for the example above, as long as the data in the table is in that format.
'Percentage' = round(SUBSTRING(RCV1.ECCValue,CHARINDEX('-',RCV1.ECCValue)+1, CHARINDEX('-',RCV1.ECCValue,CHARINDEX('-',RCV1.ECCValue)+1) -CHARINDEX('-',RCV1.ECCValue)-1),2) ,
However...this table is updated from an external source and very occasionally the separators differ, for example:
T0-11.86_273
when this occurs I get the error:
Invalid length parameter passed to the LEFT or SUBSTRING function.
I'm very new to SQL and have got myself out of many challenges but this one has got me stuck. Any help would be mostly appreciated. Is there a better way to extract this percentage value?
Replace '_' with '-' in the string passed to the CHARINDEX call that computes the substring length:
'Percentage' = round(SUBSTRING(RCV1.ECCValue,CHARINDEX('-',RCV1.ECCValue)+1, CHARINDEX('-',replace(RCV1.ECCValue,'_','-'),CHARINDEX('-',RCV1.ECCValue)+1) -CHARINDEX('-',RCV1.ECCValue)-1),2) ,
If you can guarantee the structure of these strings, you can try parsename
select round(parsename(translate(replace('T0-11.86_273','.',''),'-_','..'),2), 2)/100
Breakdown of steps
Replace . character in the percentage value with empty string using replace.
Replace - or _, whichever is present, with . using translate.
Parse the second element using parsename.
Round it to 2 digits, which will also automatically cast it to the desired numeric type.
Divide by 100 to restore the number as a percentage.
Documentation & Gotchas
Use NULLIF to null out such values
round(
SUBSTRING(
RCV1.ECCValue,
NULLIF(CHARINDEX('-', RCV1.ECCValue), 0) + 1,
NULLIF(CHARINDEX('-',
RCV1.ECCValue,
NULLIF(CHARINDEX('-', RCV1.ECCValue), 0) + 1
), 0)
- NULLIF(CHARINDEX('-', RCV1.ECCValue), 0) - 1
),
2)
I strongly recommend that you place the repeated values in CROSS APPLY (VALUES to avoid having to repeat yourself. And do use whitespace, it's free.
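To make the failure mode concrete: for 'T0-11.86_273' the first hyphen is at position 3 and the second CHARINDEX returns 0, so the original length expression works out to 0 - 3 - 1 = -4, which is what triggers the error. The Python sketch below mimics the NULLIF approach, with None playing the role of NULL. The helpers charindex and nullif are hypothetical stand-ins for the T-SQL functions, and the sketch assumes the text between the hyphens is numeric.

```python
from typing import Optional


def charindex(search: str, source: str, start: int = 1) -> int:
    """Mimic T-SQL CHARINDEX: 1-based position at or after start, 0 if absent."""
    return source.find(search, max(start, 1) - 1) + 1


def nullif(value: int, unwanted: int) -> Optional[int]:
    """Mimic NULLIF: None when value equals unwanted, else value."""
    return None if value == unwanted else value


def percentage(ecc_value: str) -> Optional[float]:
    """Return the number between the first two hyphens, or None."""
    first = nullif(charindex("-", ecc_value), 0)
    if first is None:
        return None
    second = nullif(charindex("-", ecc_value, first + 1), 0)
    if second is None:
        return None  # SQL would propagate NULL through the arithmetic
    # Characters strictly between the two hyphens (1-based first+1 .. second-1).
    middle = ecc_value[first:second - 1]
    return round(float(middle), 2)


print(percentage("T0-11.86-273"))
print(percentage("T0-11.86_273"))
```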
I am SQL person and new to Spark SQL
I need to find the position of the character '-' in a string. If it is at a given position I need to return a fixed length, otherwise a length of zero.
string name = 'john-smith'
If '-' is at character position 4 then return 10, otherwise return 0.
I have done in SQL Server but now need to do in Spark SQL.
select
case
when charindex('-', name) = 4 then 10
else 0
end
I tried in Spark SQL but failed to get results.
select find_in_set('-',name)
Please help. Thanks
You can use the instr function as shown next. instr checks whether the second string argument occurs in the first one; if so, it returns the index of the first occurrence, counting from 1.
//first create a temporary view if you don't have one already
df.createOrReplaceTempView("temp_table")
//then use instr to check if the name contains the - char
spark.sql("select if(instr(name, '-') = 4, 10, 0) from temp_table")
The arguments for the if statement are:
instr(name, '-') = 4 condition to check
10 result for true condition
0 result for false condition
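One thing worth checking by hand is the 1-based counting: in 'john-smith' the hyphen is the fifth character, so a test against position 4 yields 0 for that name. The small Python sketch below mimics instr to illustrate this; it is not Spark code, and the function names are made up for the example.

```python
def instr(text: str, substr: str) -> int:
    """Mimic Spark SQL instr: 1-based index of first occurrence, 0 if absent."""
    return text.find(substr) + 1


def fixed_length(name: str, expected_pos: int = 4) -> int:
    """Return 10 when '-' first occurs at expected_pos, else 0."""
    return 10 if instr(name, "-") == expected_pos else 0


print(instr("john-smith", "-"))
print(fixed_length("john-smith"))
print(fixed_length("joh-smith"))
```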
I have this query
SELECT text
FROM book
WHERE lyrics IS NULL
AND MOD(TO_NUMBER(SUBSTR(text,18,16)),5) = 1
Sometimes the string is something like $OK$OK$OK$OK$OK$OK$OK, and sometimes something like #P,351811040302663;E,101;D,07112018134733,07012018144712;G,4908611,50930248,207,990;M,79379;S,0;IO,3,0,0
I would like to know whether it is possible to prevent ORA-01722: invalid number, because in some cases the characters at that position are not a number.
I run this query inside a procedure and process all the rows in a cursor; if a single row is not a number, I can't process any row.
You could use VALIDATE_CONVERSION if it's Oracle 12c Release 2 (12.2),
WITH book(text) AS
(SELECT '#P,351811040302663;E,101;D,07112018134733,07012018144712;G,4908611,50930248,207,990;M,79379;S,0;IO,3,0,0'
FROM DUAL
UNION ALL SELECT '$OK$OK$OK$OK$OK$OK$OK'
FROM DUAL
UNION ALL SELECT '12I45678912B456781234567812345671'
FROM DUAL)
SELECT *
FROM book
WHERE CASE
WHEN VALIDATE_CONVERSION(SUBSTR(text,18,16) AS NUMBER) = 1
THEN MOD(TO_NUMBER(SUBSTR(text,18,16)),5)
ELSE 0
END = 1 ;
Output
TEXT
12I45678912B456781234567812345671
Assuming the condition should be true if and only if the 16-character substring starting at position 18 is made up of 16 digits, and the number is equal to 1 modulo 5, then you could write it like this:
...
where .....
and case when translate(substr(text, 18, 16), 'z0123456789', 'z') is null
and substr(text, 33, 1) in ('1', '6')
then 1 end
= 1
This will check that the substring is made up of all-digits: the translate() function will replace every occurrence of z in the string with itself, and every occurrence of 0, 1, ..., 9 with nothing (it will simply remove them). The odd-looking z is needed due to Oracle's odd implementation of NULL and empty strings (you can use any other character instead of z, but you need some character so no argument to translate() is NULL). Then - the substring is made up of all-digits if and only if the result of this translation is null (an empty string). And you still check to see if the last character is 1 or 6.
Note that I didn't use any regular expressions; this is important if you have a large amount of data, since standard string functions like translate() are much faster than regular expression functions. Also, everything is based on character data type - no math functions like mod(). (Same as in Thorsten's answer, which was only missing the first part of what I suggested here - checking to see that the entire substring is made up of digits.)
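As a quick sanity check of the "last character is 1 or 6" shortcut, here is a Python sketch that compares it with the mod() version on the sample strings from this thread. The function names are invented for the example, and the slice text[17:33] stands in for SUBSTR(text, 18, 16).

```python
def is_all_digits(chunk: str) -> bool:
    """True when chunk is non-empty and consists only of the digits 0-9."""
    return chunk != "" and all(c in "0123456789" for c in chunk)


def matches_by_last_digit(text: str) -> bool:
    """String-only test: 16 digits starting at position 18, ending in 1 or 6."""
    chunk = text[17:33]  # SUBSTR(text, 18, 16) in 1-based terms
    return len(chunk) == 16 and is_all_digits(chunk) and chunk[-1] in "16"


def matches_by_mod(text: str) -> bool:
    """Numeric test: the same chunk, converted and taken modulo 5."""
    chunk = text[17:33]
    return len(chunk) == 16 and is_all_digits(chunk) and int(chunk) % 5 == 1


samples = [
    "#P,351811040302663;E,101;D,07112018134733,07012018144712;"
    "G,4908611,50930248,207,990;M,79379;S,0;IO,3,0,0",
    "$OK$OK$OK$OK$OK$OK$OK",
    "12I45678912B456781234567812345671",
]
for s in samples:
    print(matches_by_last_digit(s), matches_by_mod(s))
```

Both tests agree because a non-negative integer is congruent to 1 modulo 5 exactly when its last decimal digit is 1 or 6.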
SELECT text
FROM book
WHERE lyrics IS NULL
AND case when regexp_like(SUBSTR(text,18,16),'^[0-9]+$') then MOD(TO_NUMBER(SUBSTR(text,18,16)),5)
else null
end = 1;