SQLite- Normalizing a concatenated field and joining with it?

I have some data stored as comma-separated values in a field. I want to turn those values into a temporary table and use it to join to another table.
CREATE TABLE STRATEGY (STRATEGY_ID INTEGER PRIMARY KEY, APPLIED_SET_IDS VARCHAR);
CREATE TABLE ACTION_SET (APPLIED_ACTION_SET_ID INTEGER PRIMARY KEY, VALUE VARCHAR);
+-----------+---------------+
|STRATEGY_ID|APPLIED_SET_IDS|
+-----------+---------------+
|          1|1,3,6,7        |
|          2|1,2,4          |
+-----------+---------------+
+---------------------+-----+
|APPLIED_ACTION_SET_ID|VALUE|
+---------------------+-----+
|                    1|X    |
|                    2|Y    |
|                    3|Z    |
|                    4|H    |
|                    5|I    |
|                    6|J    |
|                    7|K    |
|                    8|L    |
+---------------------+-----+
I know I have to use some form of recursion as shown here, but every attempt I've made has left my head spinning. My temporary table also needs to preserve the original concatenated order of the APPLIED_SET_ID values, like this:
+-----------+-----+--------------+
|STRATEGY_ID|ORDER|APPLIED_SET_ID|
+-----------+-----+--------------+
|          1|    1|             1|
|          1|    2|             3|
|          1|    3|             6|
|          1|    4|             7|
|          2|    1|             1|
|          2|    2|             2|
|          2|    3|             4|
+-----------+-----+--------------+
Ultimately, I will join this table to the second existing table and use GROUP_CONCAT to replace the IDs with the corresponding values in the same order.
+-----------+------------------+
|STRATEGY_ID|APPLIED_SET_VALUES|
+-----------+------------------+
|          1|X,Z,J,K           |
|          2|X,Y,H             |
+-----------+------------------+
So regular expressions are out thanks to the order (otherwise I could have turned the commas to pipes and joined on a REGEXP statement). How can I achieve this? I know this is not normalized but I need to work with this current structure. Thank you for any help in advance.

It is possible to call PHP functions from SQLite, which lets simple queries 'normalize' the table.
I have converted my 'normalize comma-delimited strings holding keys' approach to SQLite. See (joining on ';' separated values in a column) for a more complete explanation.
I started out looking for ways of converting the functions to run in SQLite. After some searching I came across this: Working with PHP UDFs in SQLite.
Calling PHP functions from SQLite sounded interesting, and like fun!
It works, but you cannot use PDO. You have to use the SQLite functions directly. See Vendor Specific Database Extensions: SQLite3.
Updated with your data (see previous edits for working code for the other question).
The code:
<?php // Q34231542 -- count_in_set, value_in_set
/*
* See this question for a rather more complete explanation of what this is doing...
*
* https://stackoverflow.com/questions/33782728/can-i-resolve-this-with-pure-mysql-joining-on-separated-values-in-a-column/
*/
define('SQLITE_DB', __DIR__ .'/Q34231542.sqlite');
/**
 * @var SQLite3
 */
$db = new SQLite3(SQLITE_DB);
/*
* Define the functions for use by SQLite.
*/
$db->createFunction('count_in_set', 'count_in_set', 2);
$db->createFunction('value_in_set', 'value_in_set', 3);
$sql = "
SELECT STRATEGY.STRATEGY_ID     AS 'applied_strategy_id',
       STRATEGY.APPLIED_SET_IDS AS 'applied_strategy_list',
       isequence.id             AS 'which_strategy',
       COUNT_IN_SET(STRATEGY.APPLIED_SET_IDS, ',') AS 'StrategyCount',
       VALUE_IN_SET(STRATEGY.APPLIED_SET_IDS, ',', isequence.id) AS 'TheStrategy',
       ACTION_SET.VALUE         AS 'Action_Set_Value'
FROM STRATEGY
JOIN integerseries AS isequence
  ON isequence.id <= COUNT_IN_SET(STRATEGY.APPLIED_SET_IDS, ',') /* normalize */
JOIN ACTION_SET
  ON ACTION_SET.APPLIED_ACTION_SET_ID = VALUE_IN_SET(STRATEGY.APPLIED_SET_IDS, ',', isequence.id)
ORDER BY
    STRATEGY.STRATEGY_ID, isequence.id; /* keep the order of the original list */
";
/*
* Run the query
*/
$stmt = $db->prepare($sql);
$result = $stmt->execute();
/*
* Get the results
*/
$rows = array();
while ($row = $result->fetchArray(SQLITE3_ASSOC)) { // fetch all the rows for now...
$rows[] = $row;
}
/*
* output...
*/
// \Kint::dump($rows);
echo '<pre>';
var_dump($rows);
echo '</pre>';
exit;
/* -------------------------------------------------------------------------
* The PHP functions called from SQLite
*/
/**
 * Count the number of delimited items in a string
 *
 * @param string $delimitedValues
 * @param string $delim
 * @return integer
 */
function count_in_set($delimitedValues, $delim)
{
    return substr_count(trim($delimitedValues, $delim), $delim) + 1;
}
/**
 * Treat the delimited values as ONE BASED array.
 *
 * @param string $delimitedValues
 * @param string $delim
 * @param integer $which
 * @return string
 */
function value_in_set($delimitedValues, $delim, $which)
{
    $items = explode($delim, $delimitedValues);
    return $items[$which - 1];
}
The output:
    applied_strategy_id  applied_strategy_list  which_strategy  StrategyCount  TheStrategy  Action_Set_Value
#1  1                    "1,3,6,7"              1               4              "1"          "X"
#2  1                    "1,3,6,7"              2               4              "3"          "Z"
#3  1                    "1,3,6,7"              3               4              "6"          "J"
#4  1                    "1,3,6,7"              4               4              "7"          "K"
#5  2                    "1,2,4"                1               3              "1"          "X"
#6  2                    "1,2,4"                2               3              "2"          "Y"
#7  2                    "1,2,4"                3               3              "4"          "H"
The data:
CREATE TABLE [integerseries] (
[id] INTEGER NOT NULL PRIMARY KEY);
INSERT INTO "integerseries" VALUES(1);
INSERT INTO "integerseries" VALUES(2);
INSERT INTO "integerseries" VALUES(3);
INSERT INTO "integerseries" VALUES(4);
INSERT INTO "integerseries" VALUES(5);
INSERT INTO "integerseries" VALUES(6);
INSERT INTO "integerseries" VALUES(7);
INSERT INTO "integerseries" VALUES(8);
INSERT INTO "integerseries" VALUES(9);
INSERT INTO "integerseries" VALUES(10);
CREATE TABLE STRATEGY (STRATEGY_ID INTEGER PRIMARY KEY, APPLIED_SET_IDS VARCHAR);
INSERT INTO "STRATEGY" VALUES(1,'1,3,6,7');
INSERT INTO "STRATEGY" VALUES(2,'1,2,4');
CREATE TABLE ACTION_SET (APPLIED_ACTION_SET_ID INTEGER PRIMARY KEY, VALUE VARCHAR);
INSERT INTO "ACTION_SET" VALUES(1,'X');
INSERT INTO "ACTION_SET" VALUES(2,'Y');
INSERT INTO "ACTION_SET" VALUES(3,'Z');
INSERT INTO "ACTION_SET" VALUES(4,'H');
INSERT INTO "ACTION_SET" VALUES(5,'I');
INSERT INTO "ACTION_SET" VALUES(6,'J');
INSERT INTO "ACTION_SET" VALUES(7,'K');
INSERT INTO "ACTION_SET" VALUES(8,'L');
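The same user-defined-function idea is not tied to PHP. As a rough sketch only, here is how it might look with Python's standard sqlite3 module, whose create_function plays the role of createFunction above. The two helpers mirror count_in_set and value_in_set; the CAST in the join and the ordering by position are my own choices, not part of the original code.

```python
import sqlite3


def count_in_set(delimited_values, delim):
    """Count the delimited items in a string."""
    return delimited_values.strip(delim).count(delim) + 1


def value_in_set(delimited_values, delim, which):
    """Treat the delimited values as a ONE BASED array."""
    return delimited_values.split(delim)[which - 1]


conn = sqlite3.connect(":memory:")
# Register the Python functions so SQL can call them by name.
conn.create_function("count_in_set", 2, count_in_set)
conn.create_function("value_in_set", 3, value_in_set)

conn.executescript("""
CREATE TABLE integerseries (id INTEGER NOT NULL PRIMARY KEY);
CREATE TABLE STRATEGY (STRATEGY_ID INTEGER PRIMARY KEY, APPLIED_SET_IDS VARCHAR);
CREATE TABLE ACTION_SET (APPLIED_ACTION_SET_ID INTEGER PRIMARY KEY, VALUE VARCHAR);
INSERT INTO STRATEGY VALUES (1, '1,3,6,7');
INSERT INTO STRATEGY VALUES (2, '1,2,4');
""")
conn.executemany("INSERT INTO integerseries VALUES (?)",
                 [(n,) for n in range(1, 11)])
conn.executemany("INSERT INTO ACTION_SET VALUES (?, ?)",
                 list(enumerate("XYZHIJKL", start=1)))

rows = conn.execute("""
SELECT s.STRATEGY_ID, i.id, a.VALUE
FROM STRATEGY AS s
JOIN integerseries AS i
  ON i.id <= count_in_set(s.APPLIED_SET_IDS, ',')
JOIN ACTION_SET AS a
  ON a.APPLIED_ACTION_SET_ID =
     CAST(value_in_set(s.APPLIED_SET_IDS, ',', i.id) AS INTEGER)
ORDER BY s.STRATEGY_ID, i.id
""").fetchall()

for row in rows:
    print(row)
```

Ordering by the series id rather than by the action-set id is what keeps the values in the order they appear in the original list.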

My colleague developed a very clever solution, assuming the separator is a pipe | and not a comma ,.
He used REGEXP and the INSTR() function to get a numerical position, and that value drove the sorting.
SELECT STRATEGY_ID,
       APPLIED_SET_IDS,
       GROUP_CONCAT(VALUE,'|') as DESCRIPTION
FROM (
    SELECT STRATEGY_ID,
           APPLIED_SET_IDS,
           CASE
               WHEN APPLIED_ACTION_SET_ID = APPLIED_SET_IDS THEN 1
               WHEN instr(APPLIED_SET_IDS, APPLIED_ACTION_SET_ID || '|') = 1 Then 1
               WHEN instr(APPLIED_SET_IDS, '|' || APPLIED_ACTION_SET_ID || '|') > 0 Then instr(APPLIED_SET_IDS, '|' || APPLIED_ACTION_SET_ID || '|')
               ELSE 999999
           END AS APPLIED_ORDER,
           VALUE
    FROM STRATEGY
    INNER JOIN ACTION_SET
        ON ACTION_SET.APPLIED_ACTION_SET_ID REGEXP '^(' || STRATEGY.APPLIED_SET_IDS || ')$'
    ORDER BY APPLIED_ORDER
) DESCRIPTIONS
GROUP BY 1,2
This gave me the exact output I was looking for.
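For completeness, the recursion the question mentions can be done in plain SQLite with a recursive common table expression, without user-defined functions or REGEXP. The following is only a sketch under my own assumptions: the CTE name split and its columns ord, item and rest are made up for illustration, and it is run through Python's sqlite3 module purely so it is easy to try. The final concatenation is done in Python because SQLite documents the element order of GROUP_CONCAT as arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STRATEGY (STRATEGY_ID INTEGER PRIMARY KEY, APPLIED_SET_IDS VARCHAR);
CREATE TABLE ACTION_SET (APPLIED_ACTION_SET_ID INTEGER PRIMARY KEY, VALUE VARCHAR);
INSERT INTO STRATEGY VALUES (1, '1,3,6,7');
INSERT INTO STRATEGY VALUES (2, '1,2,4');
""")
conn.executemany("INSERT INTO ACTION_SET VALUES (?, ?)",
                 list(enumerate("XYZHIJKL", start=1)))

SPLIT_SQL = """
WITH RECURSIVE split(strategy_id, ord, item, rest) AS (
    -- Seed row: nothing consumed yet; a trailing comma simplifies the loop.
    SELECT STRATEGY_ID, 0, NULL, APPLIED_SET_IDS || ','
    FROM STRATEGY
    UNION ALL
    -- Each step peels the text before the first comma off the remainder.
    SELECT strategy_id,
           ord + 1,
           substr(rest, 1, instr(rest, ',') - 1),
           substr(rest, instr(rest, ',') + 1)
    FROM split
    WHERE rest <> ''
)
SELECT s.strategy_id, s.ord, CAST(s.item AS INTEGER), a.VALUE
FROM split AS s
JOIN ACTION_SET AS a
  ON a.APPLIED_ACTION_SET_ID = CAST(s.item AS INTEGER)
WHERE s.ord > 0
ORDER BY s.strategy_id, s.ord
"""
rows = conn.execute(SPLIT_SQL).fetchall()

# Rebuild the value list per strategy in the original order.
values_by_strategy = {}
for strategy_id, _ord, _set_id, value in rows:
    values_by_strategy.setdefault(strategy_id, []).append(value)
result = {sid: ",".join(vals) for sid, vals in values_by_strategy.items()}
print(result)
```

The ord column is the position inside the original list, so it corresponds to the ORDER column the question asks for.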

Related

Data transfer: JSON → SQL

I would like to ask you for some help.
In our production I need to transfer many kinds of data, and I've decided to use JSON for that.
I build the JSON string with all the data on the app side, and now I need to store it in my DB using a stored procedure.
Everything works, but in one case I've run into a small issue.
Request
I need to know how many defective pieces should be sent to which process (which process, what defect, and how many). 😉
Sorry for the confusing description; an example is below.
Here is the part of the JSON which I need to transform:
{"production":{"repairs":{"2":{"1":3},"4":{"3":5},"7":{"2":2,"4":4}}}}
Here is the required output:
RowID|ProcessID|ErrorID|Amount|
-----+---------+-------+------+
    1|        2|      1|     3|
    2|        4|      3|     5|
    3|        7|      2|     2|
    4|        7|      4|     4|
Test:
Here is an SQL script which does exactly what I want, but I can't use it because it doesn't work in stored procedures...
DECLARE @json NVARCHAR(MAX) = '{"production":{"repairs":{"2":{"1":3},"4":{"3":5},"7":{"2":2,"4":4}}}}'
DECLARE @helper INT = 0
DECLARE @counter INT = 0
DECLARE @RepairData TABLE (RowID INT NOT NULL IDENTITY, ProcessID INT, ErrorID INT, Amount INT)
SELECT ROW_NUMBER() OVER(ORDER BY CAST("key" AS INT) ASC) AS 'Row', CAST("key" AS INT) AS 'ProcessID'
INTO #RepairProcesses
FROM OPENJSON(@json, '$.production.repairs')
WHILE @counter < (SELECT COUNT("key") FROM OPENJSON(@json, '$.production.repairs'))
BEGIN
    SET @counter = @counter + 1
    SET @helper = (SELECT ProcessID FROM #RepairProcesses WHERE Row = @counter)
    INSERT INTO @RepairData (ProcessID, ErrorID, Amount)
    SELECT @helper AS 'ProcessID', CAST("key" AS INT) AS 'ErrorID', CAST("value" AS INT) AS 'Amount'
    FROM OPENJSON(@json, '$.production.repairs."'+CAST(@helper AS NVARCHAR(3))+'"')
END
DROP TABLE #RepairProcesses
SELECT * FROM @RepairData
Output:
RowID|ProcessID|ErrorID|Amount|
-----+---------+-------+------+
1| 2| 1| 3|
2| 4| 3| 5|
3| 7| 2| 2|
4| 7| 4| 4|
Summary:
I can't use this because the WHILE loop iterates over ProcessID, and inside a stored procedure the line that concatenates the path string for OPENJSON fails with a syntax error (even though it works fine in plain "query mode"... 🤨).
Error message (only within stored procedures):
SQL Error [102] [S0001]: Incorrect syntax near '+'.
The error occurs even when I don't use the concatenation and use the whole path string as the parameter...
It seems that OPENJSON in stored procedures only accepts a literal value in ' ' for the path string...
My request:
Does anybody know how to solve this in a better way (maybe even without the WHILE loop)?
Thanks a lot. 😉

I think you need this:
DECLARE @json NVARCHAR(MAX) = '{"production":{"repairs":{"2":{"1":3},"4":{"3":5},"7":{"2":2,"4":4}}}}'
SELECT
    RowId = ROW_NUMBER() OVER (ORDER BY CONVERT(int, j1.[key]), CONVERT(int, j2.[key])),
    ProcessID = j1.[key],
    ErrorID = j2.[key],
    Amount = j2.[value]
FROM OPENJSON(@json, '$.production.repairs') j1
CROSS APPLY OPENJSON(j1.value) j2
Result:
RowId|ProcessID|ErrorID|Amount|
-----+---------+-------+------+
    1|        2|      1|     3|
    2|        4|      3|     5|
    3|        7|      2|     2|
    4|        7|      4|     4|
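To see what the CROSS APPLY is doing, it may help to look at the same flattening outside the database. This is only an illustrative Python sketch of the logic (outer keys are processes, inner keys are errors, inner values are amounts), not SQL Server code; the variable names are mine.

```python
import json

doc = json.loads(
    '{"production":{"repairs":{"2":{"1":3},"4":{"3":5},"7":{"2":2,"4":4}}}}'
)

# One (ProcessID, ErrorID, Amount) tuple per inner key/value pair,
# sorted numerically like the ORDER BY inside ROW_NUMBER().
pairs = sorted(
    (int(process_id), int(error_id), amount)
    for process_id, errors in doc["production"]["repairs"].items()
    for error_id, amount in errors.items()
)

# Number the rows starting at 1, as RowId does.
rows = [(row_id, *pair) for row_id, pair in enumerate(pairs, start=1)]
for row in rows:
    print(row)
```

The nested loop corresponds to the outer OPENJSON call over '$.production.repairs' followed by the inner OPENJSON call over each value.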

custom aggregate function in postgres return NULL value

In order to create a more complex custom aggregate function, I first followed this amazing tutorial.
Here is the data I use:
create table entries(
id serial primary key,
amount float8 not null
);
select setseed(0);
insert into entries(amount)
select (2000 * random()) - 1000
from generate_series(1, 1000000);
So I have this table :
id | amount | running_total
---------+-----------------------+--------------------
1 | -462.016298435628 | -462.016298435628
2 | 162.440904416144 | -299.575394019485
3 | -820.292402990162 | -1119.86779700965
4 | -866.230697371066 | -1986.09849438071
5 | -495.30001822859 | -2481.3985126093
6 | 772.393747232854 | -1709.00476537645
7 | -323.866365477443 | -2032.87113085389
8 | -856.917716562748 | -2889.78884741664
9 | 285.323366522789 | -2604.46548089385
10 | -867.916810326278 | -3472.38229122013
-- snip --
And I would like the max of the running_total column.
(I know I can do it without a new aggregate function, but it's for the demonstration.)
So I've made this aggregate function:
create or replace function grt_sfunc(agg_state point, el float8)
returns point
immutable
language plpgsql
as $$
declare
    greatest_sum float8;
    current_sum float8;
begin
    current_sum := agg_state[0] + el;
    greatest_sum := 40;
    /*if agg_state[1] < current_sum then
        greatest_sum := current_sum;
    else
        greatest_sum := agg_state[1];
    end if;*/
    return point(current_sum, greatest_sum);
    /*return point(3.14159, 0);*/
end;
$$;
create or replace function grt_finalfunc(agg_state point)
returns float8
immutable
strict
language plpgsql
as $$
begin
    return agg_state[0];
end;
$$;
create or replace aggregate greatest_running_total (float8)
(
    sfunc = grt_sfunc,
    stype = point,
    finalfunc = grt_finalfunc
);
Normally it should work, but in the end it gives me a NULL result:
select greatest_running_total(amount order by id asc)
from entries;
id | running_total
---------+---------------
1 | [NULL]
I tried changing the type of the data and checking the first two functions separately; they work well. Could someone help me find a solution, please? :)
Thank you very much!

You need to set a non-NULL initcond for the aggregate. Presumably that would be (0,0), or maybe negative very large numbers for each? Or manually check for the agg_state being NULL.
Also, it seems like your grt_finalfunc should be returning subscript [1], not [0].

So, the solution was to add an initial condition. Indeed, without an initial condition, the first state value is NULL :D (thank you @jjanes and @The Impaler)
So I corrected my aggregate function:
create or replace aggregate greatest_running_total (float8)
(
sfunc = grt_sfunc,
stype = point,
finalfunc = grt_finalfunc,
initcond = '(0,0)'
);
And indeed, my second mistake was in grt_finalfunc: it returned agg_state[0] (the running sum) instead of agg_state[1] (the greatest sum).
Thank you very much!!
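To make the role of the initial condition concrete, here is a small Python sketch of how an aggregate folds a state function over the rows. It is an analogy, not PostgreSQL code: the state is a (current_sum, greatest_sum) pair like the point above, the comparison from the commented-out branch is enabled, the final function reads the second component, and the sample amounts are made up.

```python
from functools import reduce


def grt_sfunc(agg_state, el):
    """State transition: update the running sum and the greatest sum seen."""
    current_sum = agg_state[0] + el
    greatest_sum = max(agg_state[1], current_sum)
    return (current_sum, greatest_sum)


def grt_finalfunc(agg_state):
    """Final function: report the greatest running total (second component)."""
    return agg_state[1]


amounts = [5.0, -2.0, 4.0, -10.0]  # made-up sample data

# The third argument plays the role of initcond = '(0,0)'.
final_state = reduce(grt_sfunc, amounts, (0.0, 0.0))
result = grt_finalfunc(final_state)
print(final_state, result)
```

Note that starting the greatest sum at 0 means the result never drops below 0, even if every running total is negative, which is why the answer above also suggests a very large negative starting value.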

How to get SQLite column types from view columns

When defining a view like this:
CREATE TABLE x (a VARCHAR(10));
CREATE VIEW v AS SELECT a, a || ' ' AS b FROM x;
I now want to discover the column types of the view's columns using:
PRAGMA table_info('v');
Unfortunately, this results in
cid |name |type |notnull |dflt_value |pk |
----|-----|------------|--------|-----------|---|
0 |a |VARCHAR(10) |0 | |0 |
1 |b | |0 | |0 |
The column type is absent for any kind of column expression. Is there a different way to define the view and / or query the table info in order to get a column type or is that just how SQLite's type affinity works?

I tried this
CREATE VIEW v2 AS SELECT a, CAST((a || ' ') AS VARCHAR(11)) AS b FROM x;
without success either.
Isn't the answer documented in the docs you reference here:
https://www.sqlite.org/datatype3.html ?
3.2 Affinity Of Expressions: "... Otherwise, an expression has no affinity."
3.3 Column Affinity For Views And Subqueries: ... "expressions always have no affinity".
In fact, your question is probably about what's in the table/view info and not about type affinity (nor type constraints). It seems to me that this is purely an implementation decision in SQLite3: the view info is not set for anything other than predefined attributes in CREATE VIEW.

A workaround when using a client like JDBC (e.g. via Xerial) is to fetch one row from the view and check the ResultSetMetaData on it:
try (Connection c = DriverManager.getConnection("jdbc:sqlite::memory:");
     Statement s = c.createStatement()) {
    s.execute("CREATE TABLE x (a VARCHAR(10));");
    s.execute("INSERT INTO x VALUES ('a')");
    s.execute("CREATE VIEW v AS SELECT a, a || ' ' AS b FROM x;");
    try (ResultSet rs = s.executeQuery("SELECT * FROM v LIMIT 1")) {
        ResultSetMetaData meta = rs.getMetaData();
        for (int i = 0; i < meta.getColumnCount(); i++) {
            String name = meta.getColumnName(i + 1);
            String type = meta.getColumnTypeName(i + 1);
            int precision = meta.getPrecision(i + 1);
            System.out.println(name + " "
                + type
                + (precision > 0 ? " (" + precision + ")" : ""));
        }
    }
}
This yields:
a VARCHAR (10)
b TEXT
However, this doesn't work when the view doesn't produce any rows, in which case the column type is again unknown:
a VARCHAR (10)
b NULL
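A similar row-based check can be done without JDBC by asking SQLite for the storage class of the values with its built-in typeof() function. The sketch below uses Python's sqlite3 module as the client; it reports the storage class of one actual row rather than a declared column type, and it has the same limitation when the view returns no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE x (a VARCHAR(10));
INSERT INTO x VALUES ('a');
CREATE VIEW v AS SELECT a, a || ' ' AS b FROM x;
""")

# typeof() returns the storage class of each value in the fetched row.
types_with_row = conn.execute(
    "SELECT typeof(a), typeof(b) FROM v LIMIT 1"
).fetchone()
print(types_with_row)

# With no rows in the view there is nothing to inspect.
conn.execute("DELETE FROM x")
types_without_row = conn.execute(
    "SELECT typeof(a), typeof(b) FROM v LIMIT 1"
).fetchone()
print(types_without_row)
```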

Removing varchar(s) after a semicolon

So I have a lot of data like this:
pix11co;10.115.0.1
devapp087co;10.115.0.100
old_main-mgr;10.115.0.101
radius03co;10.115.0.110
And I want to delete the stuff after the ; so it just becomes
pix11co
devapp087co
old_main-mgr
radius03co
Since they're all different I can live with the semi-colon staying there.
I have the following query and it runs successfully but doesn't delete anything.
UPDATE dns$ SET [Name;] = REPLACE ([Name;], '%_;%__________%', '%_;');
What wildcards can I use to specify the characters after the ; ?

Can you use CHARINDEX? E.g.:
SELECT LEFT('pix11co;10.115.0.1', CHARINDEX(';', 'pix11co;10.115.0.1') - 1)

You can use SUBSTRING() and CHARINDEX() functions:
CREATE TABLE MyStrings (
STR VARCHAR(MAX)
);
INSERT INTO MyStrings VALUES
('pix11co;10.115.0.1'),
('devapp087co;10.115.0.100'),
('old_main-mgr;10.115.0.101'),
('radius03co;10.115.0.110');
SELECT STR, SUBSTRING(STR, 1, CHARINDEX(';', STR) -1 ) AS Result
FROM MyStrings;
Results:
+---------------------------+--------------+
| STR | Result |
+---------------------------+--------------+
| pix11co;10.115.0.1 | pix11co |
| devapp087co;10.115.0.100 | devapp087co |
| old_main-mgr;10.115.0.101 | old_main-mgr |
| radius03co;10.115.0.110 | radius03co |
+---------------------------+--------------+
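The original UPDATE changes nothing because REPLACE matches its search string literally; the % and _ wildcards only have meaning in LIKE patterns. The same position-based idea as the SELECT above can be written as an UPDATE. The sketch below runs against SQLite from Python only because that is easy to execute; instr and substr stand in for CHARINDEX and SUBSTRING, and the table name dns and column name are placeholders rather than the asker's real names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dns (name TEXT)")
conn.executemany(
    "INSERT INTO dns VALUES (?)",
    [
        ("pix11co;10.115.0.1",),
        ("devapp087co;10.115.0.100",),
        ("old_main-mgr;10.115.0.101",),
        ("radius03co;10.115.0.110",),
        ("no_separator",),  # extra row to show the guard in the WHERE clause
    ],
)

# Keep only the text before the first semicolon; skip rows without one,
# otherwise the computed length would be -1 for them.
conn.execute("""
UPDATE dns
SET name = substr(name, 1, instr(name, ';') - 1)
WHERE instr(name, ';') > 0
""")

names = [row[0] for row in conn.execute("SELECT name FROM dns ORDER BY rowid")]
print(names)
```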

PostgreSQL - Combine SELECT and RETURN VALUE of a Function

In my database I have a table "Datapoint" with the two columns "Id" (integer) and "Description" (character varying).
I then have a table "Logging" with the three columns "Id" (integer), "Dt" (timestamp without time zone) and "Value" (double precision).
I also have the following function:
CREATE OR REPLACE FUNCTION count_estimate(query text)
RETURNS integer AS
$BODY$
DECLARE
    rec record;
    ROWS INTEGER;
BEGIN
    FOR rec IN EXECUTE 'EXPLAIN ' || query LOOP
        ROWS := SUBSTRING(rec."QUERY PLAN" FROM ' rows=([[:digit:]]+)');
        EXIT WHEN ROWS IS NOT NULL;
    END LOOP;
    RETURN ROWS;
END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
This function returns the estimated count of entries that are found by a SELECT-Query, e.g. SELECT count_estimate('SELECT * FROM "Logging" WHERE "Id" = 3') would return 2.
I would now like to combine a SELECT-query on the table "Datapoint" with the return value of my function, so that my result looks like this:
ID | Description | EstimatedCount
1 | Datapoint 1 | 3
2 | Datapoint 2 | 4
3 | Datapoint 3 | 2
4 | Datapoint 4 | 1
My SELECT-query should look something like this:
SELECT
"Datapoint"."Id",
"Datapoint"."Description",
(SELECT count_estimate ('SELECT * FROM "Logging" WHERE "Logging"."Id" = "Datapoint"."Id"')) AS "EstimatedCount"
FROM
"Datapoint"
So my problem is to write a functioning SELECT-query for my purposes.

What about:
SELECT
"Datapoint"."Id",
"Datapoint"."Description",
count_estimate ('SELECT * FROM "Logging" WHERE "Logging"."Id" = "Datapoint"."Id"') AS "EstimatedCount"
FROM
"Datapoint"

You almost got it right, except that you need to supply the value of "Datapoint"."Id":
SELECT
"Datapoint"."Id",
"Datapoint"."Description",
count_estimate(
'SELECT * FROM "Logging" WHERE "Logging"."Id" = ' || "Datapoint"."Id"
) AS "EstimatedCount"
FROM "Datapoint";
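The reason the concatenation matters is that count_estimate receives plain text: a reference to "Datapoint"."Id" inside the quoted string is never resolved against the outer query, so each row has to build its own query string. The Python sketch below illustrates both halves of the mechanism, the per-row query text and the extraction of the row estimate from an EXPLAIN line. The sample plan lines are made up for illustration and are not real output from this database.

```python
import re


def estimate_from_plan(plan_lines):
    """Return the first ' rows=N' figure found in the plan text, if any."""
    for line in plan_lines:
        match = re.search(r" rows=(\d+)", line)
        if match:
            return int(match.group(1))
    return None


# What the concatenation produces for each "Datapoint"."Id" value.
datapoint_ids = [1, 2, 3, 4]
queries = [
    'SELECT * FROM "Logging" WHERE "Logging"."Id" = ' + str(datapoint_id)
    for datapoint_id in datapoint_ids
]
print(queries[2])

# Hypothetical EXPLAIN output for the third query.
sample_plan = [
    'Seq Scan on "Logging"  (cost=0.00..25.88 rows=2 width=20)',
    '  Filter: ("Id" = 3)',
]
estimate = estimate_from_plan(sample_plan)
print(estimate)
```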
