Can I use a calculated query string in a spatial index query? - cypher

I have a set of nodes labeled 'Foo' with associated geographic information, and for each node in that set I want to find nodes from a second set that are geographically nearby. The nodes in the second set have been added to a spatial index named 'tree'. I have tried to construct a query along the lines of
MATCH (n:Foo)
WITH n, 'withinDistance:[' + n.lat + ',' + n.lon + ',10.0]' as q
START m = node:tree(q)
RETURN n, m LIMIT 2
but I get the error
Invalid input ')': expected an identifier character, whitespace or '='
This error is associated with the last character in line 3.
Is it possible to use a constructed query string? If so, what am I missing?

Michael Hunger provided the answer: this is a limitation of Cypher. Unfortunate, but there it is. The index query in a START clause must be a string literal or a parameter (for example, one passed in via REST); it cannot be a string computed earlier in the same query, so what I was trying to do is not possible.
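Since the index query has to arrive as a literal or a parameter, one workaround is to build the string on the client and send one parameterized query per Foo node. The Python sketch below only constructs the query text and the parameter map; the helper name build_spatial_query, the parameter name q, and the {q} legacy placeholder style are assumptions made for illustration, and no driver call is shown.

```python
from typing import Dict, Tuple

# Legacy START clause with the index query supplied as a parameter.
# The parameter name "q" and the {q} placeholder style are assumptions.
CYPHER = "START m = node:tree({q}) RETURN m LIMIT 2"


def build_spatial_query(lat: float, lon: float, distance: float = 10.0) -> Tuple[str, Dict[str, str]]:
    """Return the Cypher text and the parameter map for one Foo node."""
    # Same string shape as the one concatenated in the question.
    q = "withinDistance:[{},{},{}]".format(lat, lon, distance)
    return CYPHER, {"q": q}


query, params = build_spatial_query(51.5, -0.12)
print(params["q"])
```

The client would first fetch lat and lon for each Foo node, then run the returned query once per node with its own parameter map.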

Related

How to retrieve the required string in SQL having a variable length parameter

Here is my problem statement:
I have a single-column table with data like this:
ROW-1>> 7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX
ROW-2>> 0311-1130101-XXXX-000000-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-X-XXXXXXXXX-WIPXXX
I want to split these values on '-' and load them into a new table. There are 11 segments in each string, separated by '-', and therefore 11 columns. The problem is:
A. The lengths of the segments vary; I have to keep each segment either at its standard length or at whatever length it actually has,
e.g. 7302- should have four characters; if the value is shorter than that, e.g. 73, then 73 should be populated.
Therefore I have to split the values while maintaining their integrity. The code I am writing is:
select
SUBSTR(PROFILE_ID,1,(case when length(instr(PROFILE_ID,'-')<>4) THEN (instr(PROFILE_ID,'-') else SUBSTR(PROFILE_ID,1,4) end)
)AS [RQUIRED_COLUMN_NAME]
from [TABLE_NAME];
but I am getting a right parenthesis error.
Please help.
I used the regexp_substr SQL function to solve the above issue. Here is an example:
select regexp_substr('7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX', '[^-]+', 1, 1);
Output: 7302, which is the 1st segment of the string.
Similarly, the second '-'-separated segment can be obtained by replacing the final 1 with 2 in the above query.
Example: select regexp_substr('7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX', '[^-]+', 1, 2);
Output: 2210177000, which is the 2nd segment of the string.
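To make the behaviour of the '[^-]+' pattern concrete, here is a small Python sketch that applies the same pattern outside the database. The function name split_segments is made up for this illustration; it is not part of the SQL answer.

```python
import re
from typing import List


def split_segments(profile_id: str) -> List[str]:
    """Return every run of non-hyphen characters, in order."""
    return re.findall(r"[^-]+", profile_id)


row = "7302-2210177000-XXXX-XXXXXX-XXX-XXXXXXXXXX-XXXXXX-XXXXXX-U-XXXXXXXXX-XXXXXX"
segments = split_segments(row)
print(segments[0], segments[1], len(segments))
```

Each segment keeps whatever length it has in the input, so a shorter first segment such as 73 comes back as 73.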

Unordered pattern/rule matching(?) with 'OR' capability on a PostgreSQL array field

Abstract
I'm writing code for a data analysis tool that interfaces with a PostgreSQL database and constructs an SQL query to filter to a set of rows based on user input. In broad terms, each row is a record containing a set of input data and an associated output/result. The utility I'm developing allows users to see different views of this data by applying filters to the input and output values.
There's a field in this table which contains an array of integers which represent the 'classes' of a set of entities, which is part of the 'input'. These classes have the most direct impact upon the output, so the particular assortment of values in this field is of particular importance to users of the system. There are twenty unique 'class' values, and the array typically has no more than six elements. There can, in certain circumstances, be two such arrays in a single record, and they may be queried either separately, or combined together into a single set of up to 12 values.
My system provides a freeform input where users can write filter criteria specifically to filter results based on the contents of this field. It allows the user to specify a list of class designations they wish to include in the filter clause, as well as any they wish to explicitly exclude. The grammar of this freeform input is based upon a preexisting community-defined syntax used outside this system to represent the data in question, and adapted here for the purpose of filtering.
Multiple entities in a given record may have the same 'class', so the same values can appear multiple times in the array, and the user can specify a constraint on the number of instances of each class value. The length of this array can also vary, but the user may only be interested in specific items, so the user may specify wildcards and place constraints upon the length of the array.
The arrays are unsorted, as the particular order (most notably, the value in the first position) can occasionally be of importance.
Examples
The data as stored in the database column is an array of integers, but for demonstration purposes, I will use textual class designations in the following example. Users input these textual designations in their queries, which are then translated by the system to numeric IDs.
Example field data: [A, B, B, E, B, D]
Example user inputs which would successfully match the above:
A B B B D E // Explicitly written, filters to rows matching this exact list of items. Order doesn't matter unless the user also selects an option to match the first entry explicitly.
6* // Array wildcard with length constraint; filters to any rows with an array length of 6.
2-3B * // Filters to any rows containing between two and three (inclusive) instances of B, and zero or more other non-B items (unconstrained array wildcard *).
A 2B 3XX // Filters to any rows containing at least one A, two B, and exactly three other items (class wildcard XX) of any class (which may also be A and/or B)
All of this currently works. My current method is to determine the potential upper/lower bounds of the instance counts (or lack thereof) of all specified classes, as well as that of the array length itself, and construct a query that checks those instance counts and array lengths and returns rows which successfully meet those criteria.
The problem...
All of the current syntax works great at the moment. It is purely of "AND" fashion, however -- and the #1 requested feature for this system is the introduction of an "OR" syntax, which is commonly used within the community to denote when certain sets of classes are considered interchangeable.
For example:
A B|C would match both [A,B] and [A,C].
3(B|C) would match [B,B,B], [C,C,C], [B,C,B], etc.
These kinds of queries are often more complex, with things like 2(A|B) 2(B|C|D) 2E not being uncommon. This potential for increasing complexity is where my brain starts to break down when trying to find a solution.
I believe that my current solution of tracking expected instance counts for each value is not inherently compatible with this (unless I'm simply overcomplicating things or overlooking something), but I have been at a loss for how better to approach it, made worse by the fact that I don't know what this type of issue is even called. I believe it would be considered a form of unordered pattern/rule matching, but that's quite a broad umbrella and my searches thus far have been fruitless.
I'm not really looking to be spoonfed a solution, but if there's anyone who recognizes the sort of problem I'm dealing with and has an idea of what topics I could research to figure it out on my own (particularly in the context of SQL queries), it would be immensely helpful.
Database notes
The data pool that a typical query is performed upon is a 30-day period with a subset of data spanning, on average, about 300,000 rows. This window can be increased, and it's not especially uncommon for users to perform long-term queries spanning many millions of rows. Performance is pretty important.
The SQL database in question is a replica of an external partner's database. It is replicated periodically via a binary copy operation, and thus the original format of the tables is largely maintained. Additional fields may be added to optimize access to certain types of data, but this must be done in a separate step during the replication process, and I'd prefer to avoid that if possible.
The problem as stated is very similar to regular-expression matching, even though the unordered nature of the queries makes plain regular expressions not fully suitable. It can be solved by defining an AGGREGATE function which relies on regular expressions.
Considering that:
Your arrays of integers can be converted to text starting with '{', ending with '}' and with ',' as the separator.
Your queries can be converted to text made up of a set of elements separated by spaces. Each element is a regular expression of any kind; in particular, an element may be a simple numeric string which represents an integer, or it may look like '(A|B|C)', where A, B and C are numeric strings, in order to implement the 'OR' operator between these integers, etc.
Your queries may be either ordered or non-ordered: ordered means that the array of integers is evaluated according to the order of the elements in the query; non-ordered means that the array of integers is evaluated against every element of the query without any consideration of order between these elements.
Your queries may be strict or non-strict: strict means that the array of integers exactly matches the set of elements in the query, i.e. no additional integer exists in the array which doesn't match the query elements; non-strict means that the array of integers may include some integers which do not match any element of the query.
The ordered and strict parameters of the query are independent of each other, i.e. users may need ordered and non-strict queries, or non-ordered and strict queries, etc.
The function check_function as defined below should cover most of your use cases, including the 'OR' syntax:
CREATE OR REPLACE FUNCTION deduct
( str1 text
, str2 text
, reg text
) RETURNS text LANGUAGE sql IMMUTABLE AS
$$
  SELECT CASE
           WHEN res = COALESCE(str1,str2)
           THEN NULL
           ELSE res
         END
    FROM regexp_replace( COALESCE(str1,str2)
                       , reg
                       , ','
                       ) AS res
$$ ;
DROP AGGREGATE IF EXISTS deduct_agg(text, text);
CREATE AGGREGATE deduct_agg
( str text
, reg text
)
( sfunc = deduct
, stype = text
) ;
-- this function returns true when init_string matches the reg_string expression according to the parameters ordered_match and strict_match
CREATE OR REPLACE FUNCTION check_function
( init_string text -- string to be checked against the reg_string; an array of integers must be converted to text before being passed to the function
, reg_string text -- set of elements separated by a space and individually used for checking the init_string iteratively
, ordered_match boolean -- true = the order of the elements in reg_string must be respected in init_string, false = every element in reg_string is individually checked in init_string without any matching order in init_string
, strict_match boolean -- true = the init_string must exactly match the reg_string, false = the init_string must match all the elements of the reg_string but may contain extra substrings which don't match
) RETURNS boolean LANGUAGE plpgsql IMMUTABLE AS
$$
DECLARE res boolean ;
BEGIN
  CASE
    WHEN ordered_match AND strict_match
    THEN SELECT deduct_agg(init_string, '(,|{)' || r.reg || '(,|})$' ORDER BY r.id DESC) IS NOT DISTINCT FROM ','
           INTO res
           FROM regexp_split_to_table(reg_string,' ') WITH ORDINALITY AS r(reg,id) ;
    WHEN NOT ordered_match AND strict_match
    THEN SELECT deduct_agg(init_string, '(,|{)' || r.reg || '(,|})') IS NOT DISTINCT FROM ','
           INTO res
           FROM regexp_split_to_table(reg_string,' ') AS r(reg) ;
    WHEN ordered_match AND NOT strict_match
    THEN SELECT deduct_agg(init_string, '(,|{)' || r.reg || '(,|})') IS DISTINCT FROM NULL
           INTO res
           FROM regexp_replace(reg_string,' ', '.*','g') AS r(reg) ;
    ELSE SELECT deduct_agg(init_string, '(,|{)' || r.reg || '(,|})') IS DISTINCT FROM NULL
           INTO res
           FROM regexp_split_to_table(reg_string,' ') AS r(reg) ;
  END CASE ;
  RETURN res ;
END ;
$$ ;
The following use cases should be supported:
"A B B B D E // Explicitly written, filters to rows matching this exact list of items. Order doesn't matter" ==> implemented as SELECT check_function(your_array_of_integers :: text, 'A B B B D E', false, true)
"6* // Array wildcard with length constraint; filters to any rows with an array length of 6." ==> implemented as SELECT check_function(your_array_of_integers :: text,'([0-9]+,){5}([0-9]+)',true,true). This use case can be generalized by replacing "6*" by "n*" and '{5}' by '{' || n-1 || '}' in the reg_string, where n is any integer > 1
"A 3B" with any order and strict ==> implemented as SELECT check_function(your_array_of_integers :: text, 'A B B B', false, true)
"A (B|C)" with no order and not strict ==> implemented as SELECT check_function(your_array_of_integers :: text, 'A (B|C)', false, false)
"3(B|C)" with no order and strict ==> implemented as SELECT check_function(your_array_of_integers :: text, '(B|C) (B|C) (B|C)', false, true)
"2(A|B) 2(B|C|D) 2E" with no order and not strict ==> implemented as SELECT check_function(your_array_of_integers :: text, '(A|B) (A|B) (B|C|D) (B|C|D) E E', false, false)
etc
The use cases which are not yet implemented:
"2-3B": some additional homework could make it happen; I don't see any blocking point. One idea would be to call check_function twice: SELECT check_function (..., 'B B', ..., ...) AND NOT check_function (..., 'B B B B', ..., ...)
"2-3B *" and "A 2B 3XX", because the wildcards * and XX are not clear to me in those cases.
PS: I'm a basic user of regular expressions and don't use all of the capabilities presented in the manual. Advice from an experienced regular-expression user could bring a lot of value in your context.
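As a reference for what the non-ordered modes are meant to decide, here is a minimal Python sketch that treats each query element as a set of allowed classes and looks for an assignment of distinct array items to elements. It is not a translation of the SQL above: it uses augmenting-path bipartite matching, a technique swapped in for illustration so that the answer does not depend on which alternative is tried first. The function name matches and its parameters are made up for this example.

```python
from typing import List, Sequence, Set


def matches(array: Sequence[str], slots: List[Set[str]], strict: bool) -> bool:
    """True if every slot can be given its own array item of an allowed class.

    strict=True additionally requires that no array item is left over.
    """
    if strict and len(array) != len(slots):
        return False
    owner = [-1] * len(array)  # owner[i] = slot currently using array[i], or -1

    def assign(slot: int, seen: Set[int]) -> bool:
        # Try each compatible item; if it is taken, try to move its owner elsewhere.
        for i, value in enumerate(array):
            if value in slots[slot] and i not in seen:
                seen.add(i)
                if owner[i] == -1 or assign(owner[i], seen):
                    owner[i] = slot
                    return True
        return False

    return all(assign(s, set()) for s in range(len(slots)))


# "3(B|C)" expanded to three slots, strict and unordered.
print(matches(["B", "C", "B"], [{"B", "C"}] * 3, strict=True))
```

A query such as 2(A|B) 2(B|C|D) 2E would expand to six slots in the same way; count ranges and wildcards are left out of this sketch.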

error: bind message supplies 1 parameters, but prepared statement "" requires 0

I have a table 'article' with a column 'content'. I want to query PostgreSQL to search for a string contained in the variable 'temp'. This query works fine:
pool.query("select * from article where upper(content) like upper('%some_value%')");
But when I use the placeholder $1 and [temp] in place of some_value, I get the above error:
pool.query("select * from article where upper(content) LIKE upper('%$1%')",[temp] );
Note: here $1 is a placeholder and should be replaced by the value in [temp], but I guess '%$1%' is treated as a plain string. Without the quotes ' ', the LIKE operator doesn't work. I have also tried the query
pool.query("select * from article where upper(content) LIKE upper(concat('%',$1,'%'))",[temp] );
to ensure $1 is not treated as part of a string literal, but it gives the error
error: could not determine data type of parameter $1
pool.query(
"select * from article where upper(content) LIKE upper('%' || $1 || '%')",
[temp]
).then( res => {console.log(res)}, err => {console.error(err)})
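This works for me. I looked at the Postgres documentation to try to understand what concat was doing to the parameter notation, and I can't say that I fully understand the difference between using the || operator and the concat function here. A likely explanation is that concat accepts arguments of any type, so Postgres cannot infer a type for $1, whereas || next to a string literal lets it resolve $1 as text.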
The easiest way I found to do this is like the following:
// You can remove [0] from character[0] if you want the complete value of character.
database.query(`
SELECT * FROM users
WHERE LOWER(users.name) LIKE LOWER($1)
ORDER BY users.id ASC`,
["%" + character[0] + "%"]
);
// [`%${character}%`] is a template-literal alternative to the last line in the function call.
There are several things going on here, so let me break it down line by line.
SELECT * FROM users
This selects all the columns of the users table.
WHERE LOWER(users.name) LIKE LOWER($1)
This filters the results from the first line down to the rows where the lowercased name column of the users table is like the lowercased parameter $1.
ORDER BY users.id ASC
This is optional, but I like to include it because I want the data returned to me to be in ascending order (that is from 0 to infinity, or starting low and going high) based on the users.id or the id column of the users table. A popular alternative for client-side data presentation is users.created_at DESC which shows the latest user (or more than likely an article/post/comment) by its creation date in reverse order so you get the newest content at the top of the array to loop through and display on the client-side.
["%" + character + "%"]
This part is the second argument of the .query method call on the database object (or client, if you kept that name; you can name it what you want, and database reads more sensibly to me than "client", but that is just my personal opinion, and "client" may well be the more technically correct term).
The second argument needs to be an array of values. These take the place of the parameter placeholders in the query string; $1 and ? are examples of such placeholders, which are filled in from the second argument's array of values. In this case, I used JavaScript's built-in string concatenation to build an "includes"-style pattern, or in plain English, "find me rows whose column contains this value", where the lowercased name is the column and character is the parameter value. I am pulling the value of character from req.params (the URL, so http://localhost:3000/users/startsWith/t), so with % on both ends of the parameter, the query returns all the values that contain the letter t, since t is the first (and only) character here in the URL.
I know this is a VERY late response, but I wanted to respond with a more thorough answer in case anyone else needed it broken down further.
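The same idea, binding the whole pattern (wildcards included) as a single parameter, can be tried out in a self-contained Python sketch. It uses the standard sqlite3 module as a stand-in for PostgreSQL, so the placeholder is ? rather than $1; the table contents and the function name search_users are made up for this illustration.

```python
import sqlite3
from typing import List


def search_users(conn: sqlite3.Connection, term: str) -> List[str]:
    """Return names containing term, with the wildcards added to the bound value."""
    pattern = "%" + term + "%"  # built in application code, not inside the SQL text
    rows = conn.execute(
        "SELECT name FROM users WHERE LOWER(name) LIKE LOWER(?) ORDER BY id ASC",
        (pattern,),
    ).fetchall()
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Tom",), ("Anna",), ("Matt",)])
print(search_users(conn, "t"))
```

Because the placeholder sits outside any quotes, the driver sees exactly one parameter, which is what the original '%$1%' version failed to provide.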
In my case, my variable was written as $1 instead of ?1.
I was customizing my query with @Query.

from string to map object in Hive

My input is a string that can contain any characters from A to Z (no duplicates, so maximum 26 characters it may have).
For example:-
set Input='ATK';
The characters within the string can appear in any order.
Now I want to create a map object from this with fixed keys from A to Z. The value for a key is 1 if its corresponding character appears in the input string, so for this example (ATK) the keys A, T and K should have the value 1.
What is the best way to do this?
The code should look like:
set Input='ATK';
select <some logic>;
It should return a map object (Map<string,int>) with 26 key-value pairs in it. What is the best way to do this without creating any user-defined functions in Hive? I know the function str_to_map easily comes to mind, but it only works if key-value pairs exist in the source string, and it will only consider the key-value pairs specified in the input.
Maybe not efficient but works:
select str_to_map(concat_ws('&', collect_list(concat_ws(":", a.dict,
         case when b.character is null then '0' else '1' end))), '&', ':')
from
(
  select explode(split("A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z",',')) as dict
) a
left join
(
  select explode(split(${hiveconf:Input},'')) as character
) b
on a.dict = b.character
The result:
{"A":"1","B":"0","C":"0","D":"0","E":"0","F":"0","G":"0","H":"0","I":"0","J":"0","K":"1","L":"0","M":"0","N":"0","O":"0","P":"0","Q":"0","R":"0","S":"0","T":"1","U":"0","V":"0","W":"0","X":"0","Y":"0","Z":"0"}
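For comparison, the mapping the query produces can be written in a few lines of Python. This is only a sketch of the intended result, not Hive code; the function name letter_flags is made up, and the values are integers here, whereas the Hive result above holds strings.

```python
from string import ascii_uppercase
from typing import Dict


def letter_flags(text: str) -> Dict[str, int]:
    """Map every letter A to Z to 1 if it occurs in text, else 0."""
    present = set(text)
    return {letter: int(letter in present) for letter in ascii_uppercase}


flags = letter_flags("ATK")
print(flags["A"], flags["B"], flags["K"], flags["T"])
```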

IBM DB2 for i SQL (iSeries) - Removing a character from end of a field using update

I have a product table called PDPRODP. For certain styles within this table I used a concat statement to add a full stop to their description (PRDESC), and I now wish to remove this full stop.
The descriptions vary in length and the field's maximum size is 30 characters. I need to physically remove the full stop with an update, rather than trimming it off in a select statement.
I tried:
UPDATE PDPRODP SET PRDESC = PRDESC-1 where PRSTYLE = 1234
But I got this error:
Character in CAST argument not valid.
After some googling, I also tried this:
UPDATE PDPRODP SET PRDESC=LEFT(PRDESC, LEN(PRDESC)-1)
WHERE PRCOMP = 1 AND PRSTYL = 31285
But got this error:
LEN in *LIBL type *N not found.
Use LENGTH instead of LEN:
UPDATE PDPRODP SET PRDESC=LEFT(PRDESC, LENGTH(PRDESC)-1)
WHERE PRCOMP = 1 AND PRSTYL = 31285
The REPLACE() function can search for all occurrences of some string, and substitute another in its place. You might search for your full-stop, and replace it with a zero-length string ''. This would be handy in cases where your search string may not always be at the end.
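The difference between the two suggestions (dropping the last character versus replacing every full stop) is easy to see on a sample value. The Python sketch below only illustrates that difference; the function names and the sample description are made up, and the first function also checks that the last character really is a full stop, which the UPDATE above does not do.

```python
def drop_trailing_full_stop(desc: str) -> str:
    """Remove one full stop, and only if it is the last character."""
    return desc[:-1] if desc.endswith(".") else desc


def drop_all_full_stops(desc: str) -> str:
    """Remove every full stop, wherever it occurs (REPLACE-style)."""
    return desc.replace(".", "")


sample = "No. 5 jacket."
print(drop_trailing_full_stop(sample))
print(drop_all_full_stops(sample))
```

If descriptions can legitimately contain a full stop elsewhere, the REPLACE approach would remove those too.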