How to set outgoing nodes in Cypher? - variables

I have a simple query: given a node A, count the number of nodes that node A has outgoing links, then set that number as a property of A. However I just can't do it.
Attempt 1
MATCH (n)-->(m)
SET n.out=count(m)
RETURN n.name,n.out
This yields the error:
Invalid use of aggregating function count(...) in this context (line 2, column 11 (offset: 28))
"set n.out=count(m)"
^
Attempt 2
MATCH (n)-->(m)
WITH count(m) AS o
SET n.out=o
RETURN n.name,n.out
This yields the error:
Variable `n` not defined (line 3, column 5 (offset: 43))
"SET n.out=o"
^
In both times the errors are in the SET clause. But reading the documentation for SET I cannot identify why these happen.
I cannot count the links because for one pair of n, m there maybe several link types.

You can achieve the desired outcome with the following query:
MATCH (n)
WITH size((n)-->()) as out, n
SET n.out = out
RETURN n.name, n.out

Related

Passing Optional List argument from Django to filter with in Raw SQL

When using primitive types such as Integer, I can without any problems do a query like this:
with connection.cursor() as cursor:
cursor.execute(sql='''SELECT count(*) FROM account
WHERE %(pk)s ISNULL OR id %(pk)s''', params={'pk': 1})
Which would either return row with id = 1 or it would return all rows if pk parameter was equal to None.
However, when trying to use similar approach to pass a list/tuple of IDs, I always produce a SQL syntax error when passing empty/None tuple, e.g. trying:
with connection.cursor() as cursor:
cursor.execute(sql='''SELECT count(*) FROM account
WHERE %(ids)s ISNULL OR id IN %(ids)s''', params={'ids': (1,2,3)})
works, but passing () produces SQL syntax error:
psycopg2.ProgrammingError: syntax error at or near ")"
LINE 1: SELECT count(*) FROM account WHERE () ISNULL OR id IN ()
Or if I pass None I get:
django.db.utils.ProgrammingError: syntax error at or near "NULL"
LINE 1: ...LECT count(*) FROM account WHERE NULL ISNULL OR id IN NULL
I tried putting the argument in SQL in () - (%(ids)s) - but that always breaks one or the other condition. I also tried playing around with pg_typeof or casting the argument, but with no results.
Notes:
the actual SQL is much more complex, this one here is a simplification for illustrative purposes
as a last resort - I could alter the SQL in Python based on the argument, but I really wanted to avoid that.)
At first I had an idea of using just 1 argument, but replacing it with a dummy value [-1] and then using it like
cursor.execute(sql='''SELECT ... WHERE -1 = any(%(ids)s) OR id = ANY(%(ids)s)''', params={'ids': ids if ids else [-1]})
but this did a Full table scan for non empty lists, which was unfortunate, so a no go.
Then I thought I could do a little preprocessing in python and send 2 arguments instead of just the single list- the actual list and an empty list boolean indicator. That is
cursor.execute(sql='''SELECT ... WHERE %(empty_ids)s = TRUE OR id = ANY(%(ids)s)''', params={'empty_ids': not ids, 'ids': ids})
Not the most elegant solution, but it performs quite well (Index scan for non empty list, Full table scan for empty list - but that returns the whole table anyway, so it's ok)
And finally I came up with the simplest solution and quite elegant:
cursor.execute(sql='''SELECT ... WHERE '{}' = %(ids)s OR id = ANY(%(ids)s)''', params={'ids': ids})
This one also performs Index scan for non empty lists, so it's quite fast.
From the psycopg2 docs:
Note You can use a Python list as the argument of the IN operator using the PostgreSQL ANY operator.
ids = [10, 20, 30]
cur.execute("SELECT * FROM data WHERE id = ANY(%s);", (ids,))
Furthermore ANY can also work with empty lists, whereas IN () is a SQL syntax error.

MDX divide a set by a measure

I have a set lets just say:
set [A] as {
([Measures].[X],[somedimension].[A])
[Measures].[Y],[somedimension].[A])
[Measures].[Z],[somedimension].[A])
}
What I need to do is I have to divide this set with a specific value say: [Measures].[P]
Is it possible to do something like this in MDX? If yes then how. Because if I use a normal divide operation it fives an error saying "The Divide function expects a string or numeric expression for the 1 argument. A tuple set expression was used"
SET usually is just a list of items from a dimension. Use FILTER with needed condition to get items which will satisfy it.
WITH
SET [A] AS {Your Set Members}
SET [A WITH P Over 100] AS FILTER([A], [Measures].[P] > 100)
SET [All Others] AS [A] - [A WITH P Over 100] -- Just for example
SELECT { [P] } ON COLUMNS,
{[A WITH P Over 100]} ON ROWS
FROM [Your Cube]
WHERE ([P] < 1000)

SQL: run only one of two stataments if an internal condition is met

Don't know if someone here knows Tasker Android app, but I think everyone could globally understand what I'm looking to accomplish, because I will basically talk about "raw" SQL code, as it's written on most common languages.
First, this is what I want, roughly:
IF (SELECT * FROM ("january") WHERE ("day") = (19)) MATCHES [%records(#) = 1] END
ELSE
SELECT * FROM ("january") WHERE ("day") = (19) ORDER BY ("timea") DESC END
What I want to say above is: If in the first part of the code (IF ... END) the number of the resulting records, matching the number 19 on 'day' column, is just one, end execution here; but if more than one record is found, jump to the next part, after ELSE.
And if you are a Tasker user, you will understand the next (my current) setup:
A1: SQL Query [ Mode:Raw File:Tasker/Resources/Calendar Express/calendar_db Table:january Columns:day Query:SELECT * FROM ("january") WHERE ("day") = (19) Selection Parameters: Order By: Output Column Divider: Variable Array:%records Use Root:Off ]
A2: SQL Query [ Mode:Raw File:Tasker/Resources/Calendar Express/calendar_db Table:january Columns:day Query:SELECT * FROM ("january") WHERE ("day") = (19) ORDER BY ("timea") DESC Selection Parameters: Order By: Output Column Divider: Variable Array:%records Use Root:Off ] If [ %records(#) > 1 ]
An:...
So, as you can see, A1 will run always, without exceptions, getting the result in the variable array '%records()' (% is how Tasker identifies vars, as $ in other langs; and the use of parenthesis rather than brackets). Then, if the number of entries inside the array is just one, A2 will be jumped (if %records(#) > 1), and following actions are executed.
But, if after running A1 the %records() array contains 3, A2 action will be executed overwritting the content of %records() array, previoulsy set. But this time will contain same number of records (3), but reordered.
Is possible to do so, in just one code line? Thanks ;)
As 'sticky bit' replied on a comment before, I can just still using the second action, as it won't affect the output if it's only a single record. Solved!

How can I use arrayExists function when the array contains a null value?

I have a nullable array column in my table: Array(Nullable(UInt16)). I want to be able to query this column using arrayExists (or arrayAll) to check if it contains a value above a certain threshold but I'm getting an exception when the array contains a null value:
Exception: Expression for function arrayExists must return UInt8, found Nullable(UInt8)
My query is below where distance is the array column:
SELECT * from TracabEvents_ArrayTest
where arrayExists(x -> x > 9, distance);
I've tried updating the comparison in the lambda to "(isNotNull(x) and x > 9)" but I'm still getting the error. Is there any way of handling nulls in these expressions or are they not supported yet?
Add a condition to filter rows with empty list using notEmpty and assumeNotNull for x in arrayExists.
SELECT * FROM TracabEvents_ArrayTest WHERE notEmpty(distance) AND arrayExists(x -> assumeNotNull(x) > 9, distance)

Using SQLDF to select specific values from a column

SQLDF newbie here.
I have a data frame which has about 15,000 rows and 1 column.
The data looks like:
cars
autocar
carsinfo
whatisthat
donnadrive
car
telephone
...
I wanted to use the package sqldf to loop through the column and
pick all values which contain "car" anywhere in their value.
However, the following code generates an error.
> sqldf("SELECT Keyword FROM dat WHERE Keyword="car")
Error: unexpected symbol in "sqldf("SELECT Keyword FROM dat WHERE Keyword="car"
There is no unexpected symbol, so I'm not sure whats wrong.
so first, I want to know all the values which contain 'car'.
then I want to know only those values which contain just 'car' by itself.
Can anyone help.
EDIT:
allright, there was an unexpected symbol, but it only gives me just car and not every
row which contains 'car'.
> sqldf("SELECT Keyword FROM dat WHERE Keyword='car'")
Keyword
1 car
Using = will only return exact matches.
You should probably use the like operator combined with the wildcards % or _. The % wildcard will match multiple characters, while _ matches a single character.
Something like the following will find all instances of car, e.g. "cars", "motorcar", etc:
sqldf("SELECT Keyword FROM dat WHERE Keyword like '%car%'")
And the following will match "car" or "cars":
sqldf("SELECT Keyword FROM dat WHERE Keyword like 'car_'")
This has nothing to do with sqldf; your SQL statement is the problem. You need:
dat <- data.frame(Keyword=c("cars","autocar","carsinfo",
"whatisthat","donnadrive","car","telephone"))
sqldf("SELECT Keyword FROM dat WHERE Keyword like '%car%'")
# Keyword
# 1 cars
# 2 autocar
# 3 carsinfo
# 4 car
You can also use regular expressions to do this sort of filtering. grepl returns a logical vector (TRUE / FALSE) stating whether or not there was a match or not. You can get very sophisticated to match specific items, but a basic query will work in this case:
#Using #Joshua's dat data.frame
subset(dat, grepl("car", Keyword, ignore.case = TRUE))
Keyword
1 cars
2 autocar
3 carsinfo
6 car
Very similar to the solution provided by #Chase. Because we do not use subset we do not need a logical vector and can use both grep or grepl:
df <- data.frame(keyword = c("cars", "autocar", "carsinfo", "whatisthat", "donnadrive", "car", "telephone"))
df[grep("car", df$keyword), , drop = FALSE] # or
df[grepl("car", df$keyword), , drop = FALSE]
keyword
1 cars
2 autocar
3 carsinfo
6 car
I took the idea from Selecting rows where a column has a string like 'hsa..' (partial string match)