Split SQL Column into Multiple Rows by Regex Match - sql

I'm in the middle of converting an NTEXT column into multiple records. I'm looking to split the original column by new line or JSON object. It's a unique scenario, to be sure, but outside of a SQL environment this regex correctly matches everything I need from the original column:
({(.*)(.*\r\n)*?})|(.+\r\n).
If I have a record with a column that has the value:
Foo bar baz
hello world
{
foo: 'bar',
bar: 'foo'
}
{
foo: 'foo',
bar: 'bar'
}
I want to break it into multiple records:
| ID | Text         |
----------------------
| 1  | Foo bar baz  |
| 2  | hello world  |
| 3  | {            |
|    |   foo: 'bar' |
|    |   bar: 'foo' |
|    | }            |
| 4  | {            |
|    |   foo: 'foo' |
|    |   bar: 'bar' |
|    | }            |
Any easy way to accomplish this? It's a SQL Express server.

With the help of a split/parse function
Declare @String varchar(max)='Foo bar baz
hello world
{
foo: ''bar'',
bar: ''foo''
}
{
foo: ''foo'',
bar: ''bar''
}'
Select ID=Row_Number() over (Order By (Select NULL))
,Text = B.RetVal
From (Select RetSeq,RetVal = IIF(CharIndex('}',RetVal)>0,'{'+RetVal,RetVal) from [dbo].[udf-Str-Parse](@String,'{')) A
Cross Apply (
Select * from [dbo].[udf-Str-Parse](A.RetVal,IIF(CharIndex('{',A.RetVal)>0,char(1),char(10)))
) B
Where B.RetVal is Not Null
Returns
ID Text
1 Foo bar baz
2 hello world
3 {
foo: 'bar',
bar: 'foo'
}
4 {
foo: 'foo',
bar: 'bar'
}
The UDF if needed
CREATE FUNCTION [dbo].[udf-Str-Parse] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
Select RetSeq = Row_Number() over (Order By (Select null))
,RetVal = LTrim(RTrim(B.i.value('(./text())[1]', 'varchar(max)')))
From (Select x = Cast('<x>'+ Replace(@String,@Delimiter,'</x><x>')+'</x>' as xml).query('.')) as A
Cross Apply x.nodes('x') AS B(i)
);
--Select * from [dbo].[udf-Str-Parse]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse]('John Cappelletti was here',' ')
--Performance On a 5,000 random sample -8K 77.8ms, -1M 79ms (+1.16), -- 91.66ms (+13.8)
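As an aside: if the Express instance happens to be SQL Server 2016 or newer, the built-in STRING_SPLIT could stand in for the XML-based parser for the plain newline split (the '{' regrouping above is still needed). This is only a sketch under that version assumption; STRING_SPLIT does not guarantee output order before the ENABLE_ORDINAL argument added in SQL Server 2022, so the UDF above remains the safer choice when sequence matters.
-- Sketch only: requires SQL Server 2016+; output order is not guaranteed pre-2022
Select RetVal = LTrim(RTrim(value))
From STRING_SPLIT(@String, char(10))
Where value <> '';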

Related

Update column with the same value apart from an object removed in column in Sqlite

I want to remove an object from a JSON column in SQLite and I can't make it work. The JSON column contains a nested object and has the following type:
{
  a: number;
  pair: {
    field1: string;
    field2: string;
  }[]
}
I want to update the column "ArrayColumn" with the same values but remove the object that has field1 equal to "0" and field2 equal to "1". Every row contains the "pair" array, but not all the "pair" arrays in ArrayColumn contain this value ({"field1":"0", "field2":"1"}).
I have the following structure:
Id| ArrayColumn
--------------------------------------------------------------------------------------------
1 | { "a":1, "pair":[{"field1":"0", "field2":"1"},{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
2 | { "a":5, "pair":[{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
3 | { "a":8, "pair":[{"field1":"G", "field2":"G"},{"field1":"0", "field2":"1"},{"field1":"A", "field2":"A"}] }
4 | { "a":1, "pair":[{"field1":"F", "field2":"T"},{"field1":"C", "field2":"D"},{"field1":"0", "field2":"1"}] }
5 | { "a":1, "pair":[{"field1":"A", "field2":"B"}] }
After updating the rows, the values would be:
Id| ArrayColumn
--------------------------------------------------------------------------------------------
1 | { "a":1, "pair":[{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
2 | { "a":5, "pair":[{"field1":"C", "field2":"D"},{"field1":"E", "field2":"F"}] }
3 | { "a":8, "pair":[{"field1":"G", "field2":"G"},{"field1":"A", "field2":"A"}] }
4 | { "a":1, "pair":[{"field1":"F", "field2":"T"},{"field1":"C", "field2":"D"}] }
5 | { "a":1, "pair":[{"field1":"A", "field2":"B"}] }
I tried with JSON_TREE but can't make it work.
I was thinking that the first step would be to select all the rows that contain that value; I retrieved them using these two ways:
With the LIKE operator, searching for the stringified form:
select Id, json_extract(json(ArrayColumn), '$.pair') as pair from Table where ArrayColumn like '%{"field1":"0","field2":"1"}%'
Using json_tree
select Id, value from Table, json_tree(Table.ArrayColumn, '$.pair' ) where json_extract(value, '$.field1' ) = '0' AND json_extract(value, '$.field2' ) = '1'
I tried using json_remove with this small example but no luck:
SELECT json_remove('[{"field1":"1","field2":"0"},{"field1":"A","field2":"B"}]', '${"field1":"1","field2":"0"}' )
Thank you
For this sample data the simplest way to do this is to treat the json column as a string and use string functions to remove the value that you want:
UPDATE tablename
SET ArrayColumn = REPLACE(REPLACE(REPLACE(ArrayColumn, ']', ',]'), '{"field1":"0", "field2":"1"},', ''), ',]', ']')
WHERE ArrayColumn LIKE '%{"field1":"0", "field2":"1"}%';
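If you would rather stay in JSON instead of doing string surgery, here is a hedged sketch using the JSON1 functions (json_each, json_group_array, json_set) that rebuilds the "pair" array without the unwanted object. It assumes a SQLite build with the JSON1 extension enabled; the json() wrappers keep the aggregated elements and the rebuilt array treated as JSON rather than quoted strings, and matched rows are re-serialized, so their whitespace may be normalized:
UPDATE tablename
SET ArrayColumn = json_set(
        ArrayColumn, '$.pair',
        json((SELECT json_group_array(json(value))
              FROM json_each(ArrayColumn, '$.pair')
              WHERE json_extract(value, '$.field1') <> '0'
                 OR json_extract(value, '$.field2') <> '1')))
WHERE ArrayColumn LIKE '%{"field1":"0", "field2":"1"}%';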

How can I extract a json column into new columns automatically in Snowflake SQL?

This is an example taken from another thread, but essentially I would like to achieve this:
Sample data
ID  Name  Value
1   TV1   {"URL": "www.url.com", "Icon": "some_icon"}
2   TV2   {"URL": "www.url.com", "Icon": "some_icon", "Facebook": "Facebook_URL"}
3   TV3   {"URL": "www.url.com", "Icon": "some_icon", "Twitter": "Twitter_URL"}
..........
Expected output
ID  Name  URL          Icon       Facebook      Twitter
1   TV1   www.url.com  some_icon  NULL          NULL
2   TV2   www.url.com  some_icon  Facebook_URL  NULL
3   TV3   www.url.com  some_icon  NULL          Twitter_URL
I'm totally new to Snowflake, so I'm scratching my head over how to do this easily (and hopefully automatically, in case some rows have more elements in the JSON than others, which would be tedious to map manually). Some lines might have sub-categories too.
I found the parse_json function for Snowflake, but it's only giving me the same json column in a new column, still in json format.
TIA!
You can create a view over your table with the following SELECT:
SELECT ID,
Name,
Value:URL::varchar as URL,
Value:Icon::varchar as Icon,
Value:Facebook::varchar as Facebook,
Value:Twitter::varchar as Twitter
FROM tablename;
Additional attributes will be ignored unless you add them to the view. There is no way to "automatically" include them into the view, but you could create a stored procedure that dynamically generates the view based on all the attributes that are in the full variant content of a table.
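As a rough sketch of the discovery step such a procedure needs, LATERAL FLATTEN can list every top-level key that actually occurs in the variant column (the table and column names below are the ones from the sample data):
SELECT DISTINCT f.key AS attribute
FROM tablename,
     LATERAL FLATTEN(input => Value) f;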
You can create a SP to automatically build the CREATE VIEW for you based on the JSON data in the VARIANT.
I have some simple example below:
-- prepare the table and data
create or replace table test (
col1 int, col2 string,
data1 variant, data2 variant
);
insert into test select 1,2, parse_json(
'{"URL": "test", "Icon": "test1", "Facebook": "http://www.facebook.com"}'
), parse_json(
'{"k1": "test", "k2": "test1", "k3": "http://www.facebook.com"}'
);
insert into test select 3,4,parse_json(
'{"URL": "test", "Icon": "test1", "Twitter": "http://www.twitter.com"}'
), parse_json(
'{"k4": "v4", "k3": "http://www.ericlin.me"}'
);
-- create the SP, we need to know which table and
-- column has the variant data
create or replace procedure create_view(
  table_name varchar
)
returns string
language javascript
as
$$
  var final_columns = [];

  // first, find out the columns
  var query = `SHOW COLUMNS IN TABLE ${TABLE_NAME}`;
  var stmt = snowflake.createStatement({sqlText: query});
  var result = stmt.execute();
  var variant_columns = [];
  while (result.next()) {
    var col_name = result.getColumnValue(3);
    var data_type = JSON.parse(result.getColumnValue(4));
    // just use it if it is not a VARIANT type
    // if it is variant type, we need to remember this column
    // and then run query against it later
    if (data_type["type"] != "VARIANT") {
      final_columns.push(col_name);
    } else {
      variant_columns.push(col_name);
    }
  }

  var columns = {};
  query = `SELECT ` + variant_columns.join(', ') + ` FROM ${TABLE_NAME}`;
  stmt = snowflake.createStatement({sqlText: query});
  result = stmt.execute();
  while (result.next()) {
    for (i = 1; i <= variant_columns.length; i++) {
      var sub_result = result.getColumnValue(i);
      if (!sub_result) {
        continue;
      }
      var keys = Object.keys(sub_result);
      for (j = 0; j < keys.length; j++) {
        columns[variant_columns[i-1] + ":" + keys[j]] = keys[j];
      }
    }
  }

  for (path in columns) {
    final_columns.push(path + "::STRING AS " + columns[path]);
  }

  var create_view_sql = "CREATE OR REPLACE VIEW " +
    TABLE_NAME + "_VIEW\n" +
    "AS SELECT " + "\n" +
    " " + final_columns.join(",\n ") + "\n" +
    "FROM " + TABLE_NAME + ";";

  snowflake.execute({sqlText: create_view_sql});

  return create_view_sql + "\n\nVIEW created successfully.";
$$;
Executing the SP returns the string below:
call create_view('TEST');
+---------------------------------------+
| CREATE_VIEW |
|---------------------------------------|
| CREATE OR REPLACE VIEW TEST_VIEW |
| AS SELECT |
| COL1, |
| COL2, |
| DATA1:Facebook::STRING AS Facebook, |
| DATA1:Icon::STRING AS Icon, |
| DATA1:URL::STRING AS URL, |
| DATA2:k1::STRING AS k1, |
| DATA2:k2::STRING AS k2, |
| DATA2:k3::STRING AS k3, |
| DATA1:Twitter::STRING AS Twitter, |
| DATA2:k4::STRING AS k4 |
| FROM TEST; |
| |
| VIEW created successfully. |
+---------------------------------------+
Then query the VIEW:
SELECT * FROM TEST_VIEW;
+------+------+-------------------------+-------+------+------+-------+-------------------------+------------------------+------+
| COL1 | COL2 | FACEBOOK | ICON | URL | K1 | K2 | K3 | TWITTER | K4 |
|------+------+-------------------------+-------+------+------+-------+-------------------------+------------------------+------|
| 1 | 2 | http://www.facebook.com | test1 | test | test | test1 | http://www.facebook.com | NULL | NULL |
| 3 | 4 | NULL | test1 | test | NULL | NULL | http://www.ericlin.me | http://www.twitter.com | v4 |
+------+------+-------------------------+-------+------+------+-------+-------------------------+------------------------+------+
Query the source table:
SELECT * FROM TEST;
+------+------+------------------------------------------+-----------------------------------+
| COL1 | COL2 | DATA1 | DATA2 |
|------+------+------------------------------------------+-----------------------------------|
| 1 | 2 | { | { |
| | | "Facebook": "http://www.facebook.com", | "k1": "test", |
| | | "Icon": "test1", | "k2": "test1", |
| | | "URL": "test" | "k3": "http://www.facebook.com" |
| | | } | } |
| 3 | 4 | { | { |
| | | "Icon": "test1", | "k3": "http://www.ericlin.me", |
| | | "Twitter": "http://www.twitter.com", | "k4": "v4" |
| | | "URL": "test" | } |
| | | } | |
+------+------+------------------------------------------+-----------------------------------+
You can refine this SP to detect nested data and have them added to the columns list as well.

node.js - how to insert values from a json array into a sql database

I have an incoming JSON structure like:
{
  "type":1,
  "location":[
    {"lattitude":"0", "longitude":"0"},
    {"lattitude":"0", "longitude":"0"},
    {"lattitude":"0", "longitude":"0"}]
}
I need to insert this into a database like
|------|-----------|-----------|
| type | lattitude | longitude |
|------|-----------|-----------|
| 1 | 0 | 0 |
|------|-----------|-----------|
| 1 | 0 | 0 |
|------|-----------|-----------|
| 1 | 0 | 0 |
|------|-----------|-----------|
How do I parse the JSON and build an SQL query?
If you want a Postgres solution, you may do:
--INSERT INTO yourtab( type, lattitude, longitude)
select jsoncol->>'type', j->>'lattitude',
       j->>'longitude'
from
  ( values (:yourjsonstr :: jsonb ) ) as t(jsoncol) cross join
  lateral jsonb_array_elements(jsoncol->'location')
  as j;
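For illustration, with the sample payload from the question inlined where the :yourjsonstr parameter would go, the lateral join expands one output row per element of the location array:
select jsoncol->>'type' as type, j->>'lattitude' as lattitude, j->>'longitude' as longitude
from ( values ('{"type":1,"location":[{"lattitude":"0","longitude":"0"},{"lattitude":"0","longitude":"0"},{"lattitude":"0","longitude":"0"}]}'::jsonb) ) as t(jsoncol)
cross join lateral jsonb_array_elements(jsoncol->'location') as j;
-- type | lattitude | longitude
-- 1    | 0         | 0
-- 1    | 0         | 0
-- 1    | 0         | 0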
You can use json-sql
Example:
var jsonSql = require('json-sql')();

var sql = jsonSql.build({
type: 'insert',
table: 'users',
values: {
name: 'John',
lastname: 'Snow',
age: 24,
gender: 'male'
}
});
sql.query
// insert into users (name, lastname, age, gender) values ($p1, $p2, 24, $p3);
sql.values
// { p1: 'John', p2: 'Snow', p3: 'male' }
See the documentation: https://www.npmjs.com/package/json-sql

Linq-like group by for sql

It seems like SQL GROUP BY is mostly about aggregate functions (COUNT, MAX, MIN, SUM, AVG).
select count(Id), Country
from Customer
where Country <> 'CountryX'
group by Country
But do we have a LINQ-like query where we want to return all results grouped by a certain column? E.g., given this data:
id | title   | category | email
------------------------------------------
1  | tname-1 | cat1     | test@example.com
2  | tname-2 | cat1     | test1@example.com
3  | tname-3 | cat2     | TEst@example.com
in LINQ I would do a group-by:
var groupedBy = list.GroupBy(item => item.Email);
or even throw in some comparison
var groupedBy = list.GroupBy(item => item.Email, StringComparer.OrdinalIgnoreCase);
and a result will be something like:
key               | items
----------------------------------------------------------------------------------------------
test@example.com  | [{Id: 1, Title: "tname-1", category: "cat1", email: "test@example.com"}, {Id: 3, Title: "tname-3", category: "cat2", email: "TEst@example.com"}]
test1@example.com | [{Id: 2, Title: "tname-2", category: "cat1", email: "test1@example.com"}]
but with SQL I would definitely want to return only a subset of the columns, say id, title and email.
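SQL has no direct counterpart that returns nested item lists per key, but one hedged sketch for SQL Server (2016+ assumed; the items table and column names below are illustrative stand-ins for the sample data) packs each group's rows into a JSON array with a correlated FOR JSON subquery:
SELECT t.email AS [key],
       (SELECT i.id, i.title, i.category, i.email
        FROM items AS i
        WHERE i.email = t.email
        FOR JSON PATH) AS items
FROM items AS t
GROUP BY t.email;
-- whether the grouping is case-insensitive (like StringComparer.OrdinalIgnoreCase) depends on the column collation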

Select query question

I have a table in a MySQL database with the fields
"name", "title", "cd_id", "tracks"
The entries look like this:
Schubert     | Symphonie Nr 7  | 27 | 2
Brahms       | Symphonie Nr 1  | 11 | 4
Brahms       | Symphonie Nr 2  | 27 | 4
Shostakovich | Jazz Suite Nr 1 | 19 | 3
To get the tracks per cd (cd_id) I have written this script:
#!/usr/bin/env perl
use warnings; use strict;
use DBI;

my $dbh = DBI->connect(
    "DBI:mysql:database=my_db;",
    'user', 'passwd', { RaiseError => 1 } );

my $select_query = "SELECT cd_id, tracks FROM my_table";
my $sth = $dbh->prepare( $select_query );
$sth->execute;

my %hash;
while ( my $row = $sth->fetchrow_hashref ) {
    $hash{"$row->{cd_id}"} += $row->{tracks};
}

print "$_ : $hash{$_}\n" for sort { $a <=> $b } keys %hash;
Is it possible to get these results directly with an appropriate select query?
"SELECT cd_id, SUM(tracks) as tracks FROM my_table GROUP BY cd_id"
edit: see here for further information and other things you can do with GROUP BY: http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
Use an aggregate function
SELECT `cd_id`, SUM(`tracks`) AS s
FROM `your_table`
GROUP BY `cd_id`
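For the four sample rows above, either form of the query (and the Perl script) produces, order aside:
cd_id | tracks
------+-------
   11 |      4
   19 |      3
   27 |      6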