Do SQL::Statement's REGEX and TRIM work with DBD::CSV?

The functions "REGEX()" and "TRIM()" in this script don't work as I would expect.
The REGEX function always returns true, and the TRIM function returns the "trim_char" instead of the trimmed string. (When I write the TRIM function with FROM instead of the "," I get an error message.)
#!/usr/bin/perl
use warnings;
use strict;
use 5.010;
use DBI;
my $dbh = DBI->connect( "DBI:CSV:", undef, undef, { RaiseError => 1, AutoCommit => 1 } );
my $table = 'artikel';
my $array_ref = [ [ 'a_nr', 'a_name', 'a_preis' ],
                  [ 12, 'Oberhemd',  39.80, ],
                  [ 22, 'Mantel',   360.00, ],
                  [ 11, 'Oberhemd',  44.20, ],
                  [ 13, 'Hose',     119.50, ],
];
$dbh->do( "CREATE TEMP TABLE $table AS IMPORT(?)", {}, $array_ref );
say "";
# purpose : test if a string matches a perl regular expression
# arguments : a string and a regex to match the string against
# returns : boolean value of the regex match
# example : ... WHERE REGEX(col3,'/^fun/i') ... matches rows
# in which col3 starts with "fun", ignoring case
my $sth = $dbh->prepare( "SELECT a_name FROM $table WHERE REGEX( a_name, '/^O/')" );
$sth->execute();
$sth->dump_results();
say "\n";
# TRIM ( [ [LEADING|TRAILING|BOTH] ['trim_char'] FROM ] string )
$sth = $dbh->prepare( "SELECT a_name, TRIM( TRAILING 'd', a_name ) AS new_name FROM $table" );
$sth->execute();
$sth->dump_results();
say "";
$dbh->disconnect();
Does somebody have a piece of advice?
Edit:
DBD::SQLite : 1.25
DBD::ExampleP : 12.010007
DBD::Sponge : 12.010002
DBD::CSV : 0.26
DBD::Gofer : 0.011565
DBD::DBM : 0.03
DBD::Proxy : 0.2004
DBI : 1.609
DBD::File : 0.37
SQL::Statement : 1.23

Answer: Neat issue. Short answers from my testing with SQL::Statement-1.23 and DBD::CSV under 5.10.0 with your script:
REGEX() appears to work, but returns a number, not a boolean, which needs to be handled a bit specially:
Fix:
SELECT a_name FROM $table WHERE REGEX( a_name, '/^O/') = 1
TRIM() does not take a comma (as in your example); beyond that, it seems unusably broken to me.
Any use of TRIM( FROM ), in my testing, greatly confused the parser about table names, and any other interesting use seemed to parse out, as you discovered, as a string literal.
Workaround:
SELECT a_name, REPLACE(a_name, 's/d\$//') AS new_name FROM $table
N.B.: you'll need to backslash that dollar sign in the s///, as I have, to keep it from being interpolated by your double-quoted SQL string...
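For reference, here is how both fixes look dropped into the script from the question (a minimal sketch based on the testing above, using the REGEX() and REPLACE() functions as SQL::Statement parses them):

my $sth = $dbh->prepare( "SELECT a_name FROM $table WHERE REGEX( a_name, '/^O/') = 1" );
$sth->execute();
$sth->dump_results();

# TRIM( TRAILING 'd' ... ) emulated with REPLACE() and a Perl substitution;
# the \$ keeps the double-quoted SQL string from interpolating the $:
$sth = $dbh->prepare( "SELECT a_name, REPLACE( a_name, 's/d\$//' ) AS new_name FROM $table" );
$sth->execute();
$sth->dump_results();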
Appeal: Please file bugs with test cases for this module. SQL::Statement may not be ready for prime time as an SQL engine, but we can help get it there!

You should boil your code down to the minimal example necessary to exhibit the problem, and then compare the results you get to what happens when you type those commands into the DB's command line interface (e.g. try comparing a simple "SELECT TRIM(...)" command).
Also, what DB and version are you using?

Are you sure the underlying SQL engine (DBI::SQL::Nano I guess) has implemented those functions? It may be best to select the data and process it using Perl.
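If you do end up processing in Perl, here's a minimal sketch (reusing $dbh and $table from the question) that emulates the TRIM( TRAILING 'd' ... ) on the Perl side:

my $rows = $dbh->selectall_arrayref( "SELECT a_name FROM $table" );
for my $row ( @$rows ) {
    ( my $new_name = $row->[0] ) =~ s/d+\z//;    # strip trailing 'd' characters
    print "$row->[0] -> $new_name\n";
}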

Related

Filtering with Like operator on integer column

I'm using mikro-orm for db-related operations. My db entity has a number field:
@Property({ defaultRaw: 'srNumber', type: 'number' })
srNumber!: number;
and the corresponding db column (PostgreSQL) is:
srNumber (int8)
The query input for the where param in mikro-orm EntityRepository's findAndCount(where, option) is:
repository.findAndCount({"srNumber":{"$like":"%1000%"}}, options)
It translates to:
select * from table1 where srNumber like '%1000%'
The problem here is that since the srNumber column is not a string, there is a type mismatch and the query fails. Casting it, like CAST(srNumber AS TEXT) like '%1000%', should work in the db.
Is there any way to somehow specify the field casting here?
You can use custom SQL fragments in the query. To get around the strictly typed FilterQuery, you can use expr, which is just an identity function (it returns its parameter), so it has an effect only for TS checks.
Something like this should work:
import { expr } from '@mikro-orm/core';

const res = await repo.findAndCount({
  [expr('cast(srNumber as text)')]: { $like: '%1000%' },
}, options);
https://mikro-orm.io/docs/entity-manager/#using-custom-sql-fragments

Oracle SQL JSON_QUERY ignore key field

I have a json with several keys being a number instead of a fixed string. Is there any way I could bypass them in order to access the nested values?
{
  "55568509":{
    "registers":{
      "001":{
        "isPlausible":false,
        "deviceNumber":"55501223",
        "register":"001",
        "readingValue":"5295",
        "readingDate":"2021-02-25T00:00:00.000Z"
      }
    }
  }
}
My expected output here would be 5295, but since the key "55568509" can vary from json to json, JSON_QUERY(data, '$."55568509".registers."001".readingValue') would not be an option. I'm not able to use a regexp here because this is only a part of the original json, which contains more than this.
UPDATE: full json with multiple occurrences:
This is what my whole json looks like. I would like all the readingValue entries in brackets; in the example below, my expected output would be [32641, 00964].
WITH test_table ( data ) AS (
  SELECT
    '{
      "session":{
        "sessionStartDate":"2021-02-26T12:03:34+0000",
        "interactionDate":"2021-02-26T12:04:19+0000",
        "sapGuid":"369F01DFXXXXXXXXXX8553F40CE282B3",
        "agentId":"USER001",
        "channel":"XXX",
        "bpNumber":"5551231234",
        "contractAccountNumber":"55512312345",
        "contactDirection":"",
        "contactMethod":"Z08",
        "interactionId":"5550848784",
        "isResponsibleForPayingBill":"Yes"
      },
      "payload":{
        "agentId":"USER001",
        "contractAccountNumber":"55512312345",
        "error":{
          "55549271":{
            "registers":{
              "001":{
                "isPlausible":false,
                "deviceNumber":"55501223",
                "register":"001",
                "readingValue":"32641",
                "readingDate":"2021-02-26T00:00:00.000Z"
              }
            },
            "errors":[
              {
                "contractNumber":"55501231",
                "language":"EN",
                "errorCode":"62",
                "errorText":"Error Text1",
                "isHardError":false
              },
              {
                "contractNumber":"55501232",
                "language":"EN",
                "errorCode":"62",
                "errorText":"Error Text2",
                "isHardError":false
              }
            ],
            "bpNumber":"5557273667"
          },
          "55583693":{
            "registers":{
              "001":{
                "isPlausible":false,
                "deviceNumber":"555121212",
                "register":"001",
                "readingValue":"00964",
                "readingDate":"2021-02-26T00:00:00.000Z"
              }
            },
            "errors":[],
            "bpNumber":"555123123"
          }
        }
      }
    }'
  FROM
    dual
)
SELECT
  JSON_QUERY(data, '$.payload.error.*.registers.*[*].readingValue') AS reading_value
FROM
  test_table;
UPDATE 2:
Solved, this would do the trick, upvoting the first comment.
JSON_QUERY(data, '$.payload.error.*.registers.*.readingValue' WITH WRAPPER) AS read_value
As I explained in the comment to your question, if you are getting that result from the JSON you posted, you are not using JSON_QUERY(); you must be using JSON_VALUE(). Either that, or there's something else you didn't share with us.
In any case, let's say you are using JSON_VALUE() with the arguments you showed. You are asking how you can modify the path so that the top-level attribute name is not hard-coded. That is trivial: use an asterisk (*) instead of the hard-coded name. (This would work the same with JSON_QUERY() - it's about JSON paths, not the specific function that uses them.)
with test_table (data) as (
  select '{
    "59668509":{
      "registers":{
        "001":{
          "isPlausible":false,
          "deviceNumber":"40157471",
          "register":"001",
          "readingValue":"5295",
          "readingDate":"2021-02-25T00:00:00.000Z"
        }
      }
    }
  }' from dual
)
select json_value (data, '$.*."registers"."001"."readingValue"'
       returning number) as reading_value
from test_table
;
READING_VALUE
-------------
5295
As an aside that is not related to your question in any way: In your JSON you have an object with a single attribute named "registers", whose value is another object with a single attribute "001", and in turn, this object has an attribute named "register" with value "001". Does that make sense to you? It doesn't to me.

MongoDB like statement with multiple fields

With SQL we can do the following :
select * from x where concat(x.y ," ",x.z) like "%find m%"
when x.y = "find" and x.z = "me".
How do I do the same thing with MongoDB, when I use a JSON structure similar to this:
{
  data: [
    { id: 1, value: "find" },
    { id: 2, value: "me" }
  ]
}
The comparison to SQL here is not valid, since no relational database has the concept of embedded arrays that MongoDB has, as shown in your example. You can only "concat" between "fields in a row" of a table. Basically not the same thing.
You can do this with the JavaScript evaluation of $where, which is not optimal, but it's a start. And you can add some extra "smarts" to the match as well with caution:
db.collection.find({
  "$or": [
    { "data.value": /^f/ },
    { "data.value": /^m/ }
  ],
  "$where": function() {
    var items = [];
    this.data.forEach(function(item) {
      items.push(item.value);
    });
    var myString = items.join(" ");
    if ( myString.match(/find m/) != null )
      return 1;
  }
})
So there you go. We optimized this a bit by taking the first characters from your "test string" in each word and comparing those tokens to each element of the array in the document.
The next part "concatenates" the array elements into a string and then does a "regex" comparison (same as "like") on the concatenated result to see if it matches. Where it does, the document is considered a match and is returned.
Not optimal, but these are the options available to MongoDB on a structure like this. Perhaps the structure should be different. But you don't specify why you want this, so we can't advise a better solution for what you want to achieve.

What SQLite column name can be/cannot be?

Is there any rule for an SQLite column name?
Can it have characters like '/'?
Can it be UTF-8?
Can it have characters like '/'?
All examples are from SQLite 3.5.9 running on Linux.
If you surround the column name in double quotes, you can:
> CREATE TABLE test_forward ( /test_column INTEGER );
SQL error: near "/": syntax error
> CREATE TABLE test_forward ("/test_column" INTEGER );
> INSERT INTO test_forward("/test_column") VALUES (1);
> SELECT test_forward."/test_column" from test_forward;
1
That said, you probably shouldn't do this.
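The same experiment works from Perl, for what it's worth - a minimal sketch using DBD::SQLite (the quoting behaves exactly as in the shell session above):

#!/usr/bin/perl
use warnings;
use strict;
use DBI;

my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", undef, undef, { RaiseError => 1 } );
$dbh->do( q{CREATE TABLE test_forward ( "/test_column" INTEGER )} );
$dbh->do( q{INSERT INTO test_forward ( "/test_column" ) VALUES ( 1 )} );
my ($val) = $dbh->selectrow_array( q{SELECT "/test_column" FROM test_forward} );
print "$val\n";    # prints 1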
The following answer is based on the SQLite source code, mostly relying on the file parse.y (input for the lemon parser).
TL;DR:
The allowed series of characters for column and table names in CREATE TABLE statements are:
- '-escaped strings of any kind (even keywords)
- identifiers, which means:
  - `-escaped and "-escaped strings of any kind (even keywords)
  - a series of 8-bit ASCII characters with the high-order bit set, or of 7-bit ASCII characters marked with 1 in the sqlite3IsIdChar table (see tokenize.c below), that doesn't form a keyword
  - the keyword INDEXED, because it's non-standard
  - the keyword JOIN, for a reason that is unknown to me.
The allowed series of characters for result columns in a SELECT statement are:
- either a string or an identifier as described above
- all of the above if used as a column alias written after AS
Now to the exploration process itself.
Let's look at the syntax for CREATE TABLE columns:
// The name of a column or table can be any of the following:
//
%type nm {Token}
nm(A) ::= id(X). {A = X;}
nm(A) ::= STRING(X). {A = X;}
nm(A) ::= JOIN_KW(X). {A = X;}
Digging deeper, we find out that
// An IDENTIFIER can be a generic identifier, or one of several
// keywords. Any non-standard keyword can also be an identifier.
//
%type id {Token}
id(A) ::= ID(X). {A = X;}
id(A) ::= INDEXED(X). {A = X;}
"Generic identifier" sounds unfamiliar. A quick look into tokenize.c however brings forth the definition
/*
** The sqlite3KeywordCode function looks up an identifier to determine if
** it is a keyword. If it is a keyword, the token code of that keyword is
** returned. If the input is not a keyword, TK_ID is returned.
*/
/*
** If X is a character that can be used in an identifier then
** IdChar(X) will be true. Otherwise it is false.
**
** For ASCII, any character with the high-order bit set is
** allowed in an identifier. For 7-bit characters,
** sqlite3IsIdChar[X] must be 1.
**
** Ticket #1066. the SQL standard does not allow '$' in the
** middle of identfiers. But many SQL implementations do.
** SQLite will allow '$' in identifiers for compatibility.
** But the feature is undocumented.
*/
For a full map of identifier characters, please consult tokenize.c.
It is still unclear what names are allowed for a result column (i.e. the column name or alias assigned in the SELECT statement). parse.y is again helpful here:
// An option "AS <id>" phrase that can follow one of the expressions that
// define the result set, or one of the tables in the FROM clause.
//
%type as {Token}
as(X) ::= AS nm(Y). {X = Y;}
as(X) ::= ids(Y). {X = Y;}
as(X) ::= . {X.n = 0;}
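You can watch these alias rules at work from Perl, too - a minimal sketch with DBD::SQLite (reusing the $dbh from the sketch above; a double-quoted keyword is accepted as an alias, a bare keyword is not):

my $sth = $dbh->prepare( q{SELECT 42 AS "select"} );
$sth->execute;
print $sth->{NAME}[0], "\n";    # prints: select

# Without the quotes the parser rejects the keyword:
# $dbh->prepare( q{SELECT 42 AS select} );    # syntax error near "select"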
Except for placing "illegal" identifier names between double quotes ("identifier#1"), putting [ before and ] after works as well ([identifier#2]).
Example:
sqlite> create table a0.tt ([id#1] integer primary key, [id#2] text) without rowid;
sqlite> insert into tt values (1,'test for [x] id''s');
sqlite> select * from tt
...> ;
id#1|id#2
1|test for [x] id's
Valid field names are subject to the same rules as valid table names. I checked this with SQLite Administrator:
Only alphanumeric characters and the underscore are allowed.
The field name must begin with an alpha character or an underscore.
Stick to these and no escaping is needed, and it may avoid future problems.
This isn't a full answer but may help - these are the keywords from https://www.sqlite.org/lang_keywords.html converted to an array.
["ABORT", "ACTION", "ADD", "AFTER", "ALL", "ALTER", "ALWAYS", "ANALYZE", "AND", "AS",
"ASC", "ATTACH", "AUTOINCREMENT", "BEFORE", "BEGIN", "BETWEEN", "BY", "CASCADE", "CASE", "CAST",
"CHECK", "COLLATE", "COLUMN", "COMMIT", "CONFLICT", "CONSTRAINT", "CREATE", "CROSS", "CURRENT", "CURRENT_DATE",
"CURRENT_TIME", "CURRENT_TIMESTAMP", "DATABASE", "DEFAULT", "DEFERRABLE", "DEFERRED", "DELETE", "DESC", "DETACH", "DISTINCT",
"DO", "DROP", "EACH", "ELSE", "END", "ESCAPE", "EXCEPT", "EXCLUDE", "EXCLUSIVE", "EXISTS",
"EXPLAIN", "FAIL", "FILTER", "FIRST", "FOLLOWING", "FOR", "FOREIGN", "FROM", "FULL", "GENERATED",
"GLOB", "GROUP", "GROUPS", "HAVING", "IF", "IGNORE", "IMMEDIATE", "IN", "INDEX", "INDEXED",
"INITIALLY", "INNER", "INSERT", "INSTEAD", "INTERSECT", "INTO", "IS", "ISNULL", "JOIN", "KEY",
"LAST", "LEFT", "LIKE", "LIMIT", "MATCH", "MATERIALIZED", "NATURAL", "NO", "NOT", "NOTHING",
"NOTNULL", "NULL", "NULLS", "OF", "OFFSET", "ON", "OR", "ORDER", "OTHERS", "OUTER",
"OVER", "PARTITION", "PLAN", "PRAGMA", "PRECEDING", "PRIMARY", "QUERY", "RAISE", "RANGE", "RECURSIVE",
"REFERENCES", "REGEXP", "REINDEX", "RELEASE", "RENAME", "REPLACE", "RESTRICT", "RETURNING", "RIGHT", "ROLLBACK",
"ROW", "ROWS", "SAVEPOINT", "SELECT", "SET", "TABLE", "TEMP", "TEMPORARY", "THEN", "TIES",
"TO", "TRANSACTION", "TRIGGER", "UNBOUNDED", "UNION", "UNIQUE", "UPDATE", "USING", "VACUUM", "VALUES",
"VIEW", "VIRTUAL", "WHEN", "WHERE", "WINDOW", "WITH", "WITHOUT"]

How do I find a literal % with the LIKE-operator with DBD::CSV?

How do I find a literal % with the LIKE-operator?
#!/usr/bin/perl
use warnings;
use strict;
use DBI;
my $table = 'formula';
my $dbh = DBI->connect ( "DBI:CSV:", undef, undef, { RaiseError => 1 } );
my $AoA = [ [ qw( id formula ) ],
            [ 1, 'a + b' ],
            [ 2, 'c - d' ],
            [ 3, 'e * f' ],
            [ 4, 'g / h' ],
            [ 5, 'i % j' ], ];
$dbh->do( qq{ CREATE TEMP TABLE $table AS IMPORT ( ? ) }, {}, $AoA );
my $sth = $dbh->prepare ( qq{ SELECT * FROM $table WHERE formula LIKE '%[%]%' } );
$sth->execute;
$sth->dump_results;
# Output:
# 3, 'e * f'
# 1 rows
Looks like you can't do this with the current version of DBD::CSV.
You are using the DBD::CSV module to access your data. It uses the SQL::Statement module to handle expressions. I've searched its source code and found that the following code handles the LIKE condition:
## from SQL::Statement::Operation::Regexp::right method
unless ( defined( $self->{PATTERNS}->{$right} ) )
{
    $self->{PATTERNS}->{$right} = $right;
    ## looks like it doesn't check for any escape symbols
    $self->{PATTERNS}->{$right} =~ s/%/.*/g;
    $self->{PATTERNS}->{$right} = $self->regexp( $self->{PATTERNS}->{$right} );
}
Look at the $self->{PATTERNS}->{$right} =~ s/%/.*/g; line. It converts the LIKE pattern to a regexp, and it doesn't do any checking for escape symbols: all % symbols are blindly translated to the .* pattern. That's why I think escaping isn't implemented yet.
Well, maybe someone will find time to fix this issue.
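Until then, two workaround sketches (the first assumes SQL::Statement's REGEX() behaves as in the REGEX question above, i.e. it returns a number that you compare to 1; the second simply filters on the Perl side):

# Workaround 1: match the literal % with REGEX() instead of LIKE
my $sth = $dbh->prepare( qq{ SELECT * FROM $table WHERE REGEX( formula, '/%/' ) = 1 } );
$sth->execute;
$sth->dump_results;

# Workaround 2: fetch everything and filter in Perl
my $rows = $dbh->selectall_arrayref( qq{ SELECT * FROM $table } );
print "@$_\n" for grep { $_->[1] =~ /%/ } @$rows;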