I'm using an Oracle query with regexp_substr to extract JSON fields from a JSON string. I would like just the number (1177) from "pickupLocation": "1177", but the query I'm using doesn't work:
select to_char(regexp_substr(a.input_msg, '(\w*)("pickupLocation":")(\w*)(")',1,1))
from td_interface_phxsale_log a
Output from my query: "pickupLocation":"1177"
My JSON data:
{
"shiptoAddr": "",
"shippingCostAmt": "",
"pickupLocation": "1177"
}
Why use regular expressions? That makes absolutely no sense. You pay a lot of money for your Oracle license; use the JSON tools included with it.
with
inputs (my_json_input) as (
select to_clob('{
"shiptoAddr": "",
"shippingCostAmt": "",
"pickupLocation": "1177"
}')
from dual
)
-- End of simulated inputs (for testing only;
-- remove the WITH clause and use your actual table and column names below)
--
select json_value(my_json_input, '$.pickupLocation') as pickuplocation
from inputs
;
PICKUPLOCATION
---------------
1177
You don't need to_char() here; even when the JSON string is CLOB, json_value() returns varchar2 (unless explicitly requested otherwise with the returning clause).
If in fact the "pickup location" data type is number (as apparently it should be), you can stick returning number at the end of the json_value() call (before the closing parenthesis).
In general, you should not attempt to parse JSON content using regular expressions. Assuming you really needed to go down this path, you could try:
SELECT regexp_substr(input_msg, '"pickupLocation":\s*"([^"]+)"', 1, 1, NULL, 1)
FROM td_interface_phxsale_log;
Note: Since JSON is already text, there is no need to convert what you extract using TO_CHAR.
I have a string column whose values usually look approximately like this:
https://mapy.cz/zakladni?x=16.3360208&y=49.6718038&z=8&source=firm&id=13123554
https://mapy.cz/turisticka?x=15.9380354&y=50.1990211&z=11&source=base&id=2197
https://mapy.cz/turisticka?x=12.8611357&y=49.8051338&z=16&source=base&id=1703157
I would like to group the data by source, which is part of the string: the four letters after "source=" (in the first row above: firm), and then simply count them. Is there a way to achieve this directly in SQL? I am using Hadoop.
The data is a set of strings like the ones above. My expected result is a summary table with two columns: 1) Each source type (there are about 20 possible values and their lengths differ, so I cannot use a simple substring). Ideally I am looking for a solution that says: for the grouping, use the four letters that come after "source=". 2) The count of their occurrences across all the strings.
There is just one source type in each string.
You can use regexp_extract() to pull out the source value; group by this expression and count(*) to get the totals:
select substr(regexp_extract(url, 'source[^&]+'), 8)
In MSSQL you can use charindex to get the position of the marker string and then extract the value that follows it:
;with cte as (
SELECT SUBSTRING('https://mapy.cz/zakladni?x=16.3360208&y=49.6718038&z=8&source=firm&id=13123554',
charindex('&source=','https://mapy.cz/zakladni?x=16.3360208&y=49.6718038&z=8&source=firm&id=13123554')
+8,4) AS ExtractString )
select ExtractString,count(ExtractString) as count from cte group by ExtractString;
HiveQL has an equivalent function, LOCATE, that can be used in place of charindex.
Hi, everyone.
I need your help with this problem.
I need to create a bunch of serial numbers in one of my tables, and for that I want to use a stored procedure. I want to pass FirstSN and LastSN as parameters to the SP and have it insert N records into my table. A serial number consists of a prefix and an incremental part.
For example, I send SN0001 as FirstSN and SN0100 as LastSN and it should insert the following:
SN0001
SN0002
SN0003
...
SN0099
SN0100
How can I do that without using loops?
P.S. I am using Oracle 11.2.0.
select 'SN' || lpad(lvl, length('100')+1, '0') from (select level lvl from dual connect by level <= 100);
I am currently working with a large data set that was pre-populated in BigQuery. I have a column of order IDs with the following format: o377412876, o380940924, etc. They are stored as strings. I need to do the following and am running into problems:
1) Strip off the first character using the BigQuery query language
2) Convert the remaining (or treat the remaining values), as an integer.
I will then run a join against the values. Now, I would be abundantly happier doing this operation in Python, R, or another language. That said, the challenge I have been given, based on client needs, is to write all the scripts in BigQuery's query language.
SELECT 10 * INTEGER(REGEXP_REPLACE(x, '^.', ''))
FROM
(SELECT 'o1234' AS x)
12340
You can use the SUBSTR function with SAFE_CAST, which returns NULL instead of raising an error when a value cannot be converted. INTEGER() is a legacy SQL function and does not work in BigQuery standard SQL.
SELECT SAFE_CAST(SUBSTR(x, 2) AS INT64)
FROM (SELECT 'o1234' AS x)
Output: 1234
I have a table say, ITEM, in MySQL that stores data as follows:
ID FEATURES
--------------------
1 AB,CD,EF,XY
2 PQ,AC,A3,B3
3 AB,CDE
4 AB1,BC3
--------------------
As input, I will get a CSV string, something like "AB,PQ". I want to get the records that contain AB or PQ. I realized that we have to write a MySQL function to achieve this. So, if we had this magical function MATCH_ANY defined in MySQL, I would then simply execute SQL as follows:
select * from ITEM where MATCH_ANY(FEATURES, "AB,PQ") = 0
The above query would return the records 1, 2 and 3.
But I'm running into all sorts of problems while implementing this function as I realized that MySQL doesn't support arrays and there's no simple way to split strings based on a delimiter.
Remodeling the table is the last option for me, as it involves a lot of issues.
I might also want to execute queries containing multiple MATCH_ANY functions such as:
select * from ITEM where MATCH_ANY(FEATURES, "AB,PQ") = 0 and MATCH_ANY(FEATURES, "CDE") = 0
In the above case, we would get an intersection of records (1, 2, 3) and (3) which would be just 3.
Any help is deeply appreciated.
Thanks
First of all, the database should of course not contain comma separated values, but you are hopefully aware of this already. If the table was normalised, you could easily get the items using a query like:
select distinct i.Itemid
from Item i
inner join ItemFeature f on f.ItemId = i.ItemId
where f.Feature in ('AB', 'PQ')
You can match the strings in the comma separated values, but it's not very efficient:
select Id
from Item
where
instr(concat(',', Features, ','), ',AB,') <> 0 or
instr(concat(',', Features, ','), ',PQ,') <> 0
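For all you REGEXP lovers out there, I thought I would add this as a solution. Note the parentheses around the alternation: without them the word-boundary markers apply to only one side each, and a value such as AB1 would match. On MySQL 8.0.4 and later, which use the ICU regex library, [[:<:]] and [[:>:]] are not supported; use \\b inside the string literal instead.
SELECT * FROM ITEM WHERE FEATURES REGEXP '[[:<:]](AB|PQ)[[:>:]]';
and for case sensitivity:
SELECT * FROM ITEM WHERE FEATURES REGEXP BINARY '[[:<:]](AB|PQ)[[:>:]]';
For the second query:
SELECT * FROM ITEM WHERE FEATURES REGEXP '[[:<:]](AB|PQ)[[:>:]]' AND FEATURES REGEXP '[[:<:]]CDE[[:>:]]';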
Cheers!
select *
from ITEM
where CONCAT(',',FEATURES,',') LIKE '%,AB,%'
or CONCAT(',',FEATURES,',') LIKE '%,PQ,%'
or create a custom function to do your MATCH_ANY
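Alternatively, consider using RLIKE. In MySQL, + is numeric addition rather than string concatenation, so wrap the column with CONCAT():
select *
from ITEM
where CONCAT(',', FEATURES, ',') RLIKE ',AB,|,PQ,';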
Just a thought:
Does it have to be done in SQL? This is the kind of thing you might normally expect to write in PHP or Python or whatever language you're using to interface with the database.
This approach means you can build your query string using whatever complex logic you need and then just submit a vanilla SQL query, rather than trying to build a procedure in SQL.
Ben
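A minimal sketch of the "build the query in application code" idea, for illustration only. It assumes Python and uses an in-memory SQLite database as a stand-in for MySQL, so string concatenation is written as || and placeholders as ?; with MySQL you would use CONCAT() and your driver's placeholder style. The helper names build_match_any and find_items are hypothetical. The matching itself is the delimiter-wrapping technique from the answers above.

```python
import sqlite3


def build_match_any(column, csv_values):
    """Return (sql_fragment, params) that is true when the comma-separated
    `column` contains ANY of the values listed in `csv_values`.

    `column` must be a trusted identifier (it is interpolated into the SQL);
    the values themselves are passed as bound parameters.
    """
    values = [v.strip() for v in csv_values.split(",") if v.strip()]
    if not values:
        return "0", []  # nothing to match: always false
    # Wrap both the column and each value in commas so 'AB' cannot match 'AB1'.
    clause = " OR ".join(f"instr(',' || {column} || ',', ?) > 0" for _ in values)
    return f"({clause})", [f",{v}," for v in values]


def find_items(conn, *csv_filters):
    """Return IDs of rows that satisfy every filter (filters are ANDed)."""
    parts, params = [], []
    for csv_filter in csv_filters:
        fragment, fragment_params = build_match_any("FEATURES", csv_filter)
        parts.append(fragment)
        params.extend(fragment_params)
    where = " AND ".join(parts) if parts else "1"
    sql = f"SELECT ID FROM ITEM WHERE {where} ORDER BY ID"
    return [row[0] for row in conn.execute(sql, params)]


# Sample data copied from the question (SQLite stand-in for the MySQL table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ITEM (ID INTEGER PRIMARY KEY, FEATURES TEXT)")
conn.executemany(
    "INSERT INTO ITEM VALUES (?, ?)",
    [(1, "AB,CD,EF,XY"), (2, "PQ,AC,A3,B3"), (3, "AB,CDE"), (4, "AB1,BC3")],
)

print(find_items(conn, "AB,PQ"))         # rows containing AB or PQ
print(find_items(conn, "AB,PQ", "CDE"))  # ...that also contain CDE
```

Passing the search values as bound parameters keeps the generated SQL safe even when the CSV input comes from a user; only the column name is interpolated, so it should never come from untrusted input.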