I would like to know if it's possible to search all vertices by one specific field value, without naming each vertex explicitly 🤔
If you do not specify the label it is possible to query all nodes via property.
Say I have two labels Actors(properties: ActorId and Name) and Movies(properties: tconst and primaryTitle) in a database called IMDB and I want to search for either movies or actors named Kevin Bacon.
I can query across both node labels. However, if the property names differ between labels, this makes little sense, and the query will not use the indices.
> GRAPH.QUERY IMDB "MATCH (a{Name: 'Kevin Bacon'}) RETURN a limit 1"
1) 1) 1) "a.ActorId"
2) "a.Name"
3) "a.tconst"
4) "a.primaryTitle"
2) 1) "nm0000102"
2) "Kevin Bacon"
3) "NULL"
4) "NULL"
I am new to Neo4j and wondering if Cypher has functions like GROUP BY in SQL.
Here is my code:
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
RETURN m.title AS movie, p.name AS actor
Here is my result from above query:
movie actor
"The Matrix" "Emil Eifrem"
"The Matrix" "Carrie-Anne Moss"
"The Matrix" "Keanu Reeves"
"The Matrix Reloaded" "Hugo Weaving"
"The Matrix Reloaded" "Laurence Fishburne"
"The Matrix Revolutions" "Hugo Weaving"
"The Matrix Revolutions" "Laurence Fishburne"
Here is the result I want to have:
movie actor num_of_actors
"The Matrix" "Emil Eifrem" 3
"The Matrix" "Carrie-Anne Moss" 3
"The Matrix" "Keanu Reeves" 3
"The Matrix Reloaded" "Hugo Weaving" 2
"The Matrix Reloaded" "Laurence Fishburne" 2
"The Matrix Revolutions" "Hugo Weaving" 2
"The Matrix Revolutions" "Laurence Fishburne" 2
Basically, I would like to have the number of actors who played in each movie, together with the original results.
Thanks in advance
You'll want to review the aggregation functions, which you can use within a WITH clause to do grouping.
For example, if you wanted to group the actor names with each movie, you could do this:
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WITH m, collect(p.name) as actors
RETURN m.title AS movie, actors
That said, there are some shortcuts we can do here since you're asking about the total number of actors per movie (see our knowledge base article on using degree counts from a node instead of doing expansions).
If you wanted to keep a separate row per actor, but also have the number of actors, since we know :ACTED_IN relationships will never go to the same actor more than once, we can get the degree of :ACTED_IN relationships incoming to each :Movie node to get our count. For best performance, get the degree before you expand out to actors:
MATCH (m:Movie)
WITH m, m.title as title, size((m)<-[:ACTED_IN]-()) as num_of_actors
MATCH (p:Person)-[:ACTED_IN]->(m)
RETURN title, p.name as actor, num_of_actors
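To make the intended output concrete, here is a minimal Python sketch of the same idea outside the database: count the actors per movie first, then attach that count to every (movie, actor) row. The rows are copied from the question; the variable names are only illustrative and are not part of any Neo4j API.

```python
from collections import Counter

# (movie, actor) rows as returned by the original MATCH query in the question
rows = [
    ("The Matrix", "Emil Eifrem"),
    ("The Matrix", "Carrie-Anne Moss"),
    ("The Matrix", "Keanu Reeves"),
    ("The Matrix Reloaded", "Hugo Weaving"),
    ("The Matrix Reloaded", "Laurence Fishburne"),
    ("The Matrix Revolutions", "Hugo Weaving"),
    ("The Matrix Revolutions", "Laurence Fishburne"),
]

# First pass: number of actors per movie (the "group by" step)
actors_per_movie = Counter(movie for movie, _ in rows)

# Second pass: keep one row per actor and attach the per-movie count
result = [(movie, actor, actors_per_movie[movie]) for movie, actor in rows]

for row in result:
    print(row)
```

As in the Cypher version, the count is computed once per movie and then reused for each actor row.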
I have a table with book titles and I want to select books that have title matching a regexp and to order results by the position of the regexp match in title.
It's easy for single-word searches. E.g.
TABLE book
id title
1 The Sun
2 The Dead Sun
3 Sun Kissed
I'm going to put .* between words in client's search term before sending query to DB, so I'd write SQL with prepared regexps here.
SELECT book.id, book.title FROM book
WHERE book.title ~* '.*sun.*'
ORDER BY COALESCE(NULLIF(position('sun' in lower(book.title)), 0), 999999) ASC;
RESULT
id title
3 Sun Kissed
1 The Sun
2 The Dead Sun
But if the search term has more than one word, I want to match titles that contain all of the words, in order, with anything between them, and sort by the position of the match as before. For that I need a function that returns the position of a regexp match, and I didn't find an appropriate one in the official PostgreSQL docs.
TABLE book
id title
4 Deep Space Endeavor
5 Star Trek: Deep Space Nine: The Never Ending Sacrifice
6 Deep Black: Space Espionage and National Security
SELECT book.id, book.title FROM book
WHERE book.title ~* '.*deep.*space.*'
ORDER BY ???REGEXP_POSITION_FUNCTION???('.*deep.*space.*' in book.title);
DESIRED RESULT
id title
4 Deep Space Endeavor
6 Deep Black: Space Espionage and National Security
5 Star Trek: Deep Space Nine: The Never Ending Sacrifice
I didn't find any function similar to ???REGEXP_POSITION_FUNCTION???, do you have any ideas?
One way (of many) to do this: Remove the rest of the string beginning at the match and measure the length of the truncated string:
SELECT id, title
FROM book
WHERE title ILIKE '%deep%space%'
ORDER BY length(regexp_replace(title, 'deep.*space.*', '','i'));
Using ILIKE in the WHERE clause, since that is typically faster (and does the same here).
Also note the fourth parameter to the regexp_replace() function ('i'), to make it case insensitive.
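The trick may be easier to see outside SQL. The following is a small Python sketch (not PostgreSQL code) that applies the same idea to the three titles from the question: cut everything from the match onward and use the length of what remains as the sort key. The helper name match_position is made up for this illustration.

```python
import re

# Titles from the question, keyed by id
titles = {
    4: "Deep Space Endeavor",
    5: "Star Trek: Deep Space Nine: The Never Ending Sacrifice",
    6: "Deep Black: Space Espionage and National Security",
}

# Same pattern as in the regexp_replace() call, case-insensitive
pattern = re.compile(r"deep.*space.*", re.IGNORECASE)


def match_position(title):
    """Length of the title after removing everything from the match onward."""
    return len(pattern.sub("", title, count=1))


# Keep only matching titles, then order by where the match starts
ranked = sorted(
    (id_ for id_, title in titles.items() if pattern.search(title)),
    key=lambda id_: match_position(titles[id_]),
)
print(ranked)
```

Titles 4 and 6 both match at the very start, so they tie and keep their input order; title 5 follows because its match starts after "Star Trek: ".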
Alternatives
As per request in the comment.
At the same time demonstrating how to sort matches first (and NULLS LAST).
SELECT id, title
,substring(title FROM '(?i)(^.*)deep.*space.*') AS sub1
,length(substring(title FROM '(?i)(^.*)deep.*space.*')) AS pos1
,substring(title FROM '(?i)^.*(?=deep.*space.*)') AS sub2
,length(substring(title FROM '(?i)^.*(?=deep.*space.*)')) AS pos2
,substring(title FROM '(?i)^.*(deep.*space.*)') AS sub3
,position((substring(title FROM '(?i)^.*(deep.*space.*)')) IN title) AS p3
,regexp_replace(title, 'deep.*space.*', '','i') AS reg4
,length(regexp_replace(title, 'deep.*space.*', '','i')) AS pos4
FROM book
ORDER BY title ILIKE '%deep%space%' DESC NULLS LAST
,length(regexp_replace(title, 'deep.*space.*', '','i'));
You can find documentation for all of the above in the manual here and here.
-> SQLfiddle demonstrating all.
Another way to do this would be to first get the literal match for the pattern, then find the position of the literal match:
strpos(input, (regexp_match(input, pattern, 'i'))[1]);
Or in this case:
SELECT id, title
FROM book
ORDER BY strpos(book.title, (regexp_match(book.title, 'deep.*space', 'i'))[1]);
However, there are a few caveats:
This is not very efficient, as it scans the input string twice.
This ignores lookaround (lookbehind, lookahead) constraints, since the literal match can also appear earlier in the string, before the actual pattern match.
e.g: for the input 'aba' and pattern '(?<=b)a', strpos will return 1 (for the 1st 'a') although the actual position should be 3 (for the 2nd 'a').
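The lookaround caveat can be reproduced in a few lines of Python, used here only as a stand-in for the PostgreSQL functions: re.search plays the role of regexp_match and str.find the role of strpos. Positions are converted to 1-based to match strpos.

```python
import re

text = "aba"
pattern = r"(?<=b)a"  # an 'a' that is preceded by a 'b'

match = re.search(pattern, text)

# Where the pattern really matched (1-based, like strpos)
true_pos = match.start() + 1

# Where the matched literal first occurs in the input (1-based)
naive_pos = text.find(match.group(0)) + 1

print(true_pos, naive_pos)
```

The two values differ because the literal "a" also occurs at the start of the string, where the lookbehind would not have allowed a match.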
BTW, you should probably use a lazy (non-greedy) quantifier and narrow your character class as much as you can instead of .* to improve performance (e.g. 'deep[\w\s]*?space')
We have tbl_Articles:
id title tags
=================================
1 article1 science;
2 article2 art;
3 article3 sports;art;
I am looking for a query to return records from tbl_Articles which have most common words with a specific tags string (Ex: politics;art;):
EX: Select (something from tbl_articles) where Tags has common in "politics;art;"
Result:
tbl_Articles
id title tags
=================================
2 article2 art;
3 article3 sports;art;
Are you looking for this?
select a.*
from tbl_Articles a
where ';'+tags+';' like '%;politics;%' or
';'+tags+';' like '%;art;%'
Notice that I put the separator at the beginning and end of the tags, so that a search for "art" does not also match a tag like "smart".
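Here is a small, self-contained sketch of the separator trick using Python's built-in sqlite3 module. It assumes that an article should be returned when it shares at least one tag with the search string, as in the expected result in the question. SQLite concatenates strings with || rather than +, and the "smart;" row is a made-up addition to show that it is not matched by "art".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Articles (id INTEGER, title TEXT, tags TEXT)")
conn.executemany(
    "INSERT INTO tbl_Articles VALUES (?, ?, ?)",
    [
        (1, "article1", "science;"),
        (2, "article2", "art;"),
        (3, "article3", "sports;art;"),
        (4, "article4", "smart;"),  # hypothetical row, not in the question
    ],
)

# Split the search string "politics;art;" into individual tags
search_tags = "politics;art;".strip(";").split(";")

# One separator-wrapped LIKE per tag, OR-ed together; values are bound as parameters
where = " OR ".join("(';' || tags || ';') LIKE ?" for _ in search_tags)
params = [f"%;{tag};%" for tag in search_tags]

matching_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT id FROM tbl_Articles WHERE {where} ORDER BY id", params
    )
]
print(matching_ids)
```

Only the number of placeholders is interpolated into the SQL text; the tag values themselves are passed as bound parameters.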
You can accomplish this task using the LIKE predicate, but please note that it only works on character patterns.
I think you are looking for a full-text query. FREETEXT matches not only the exact wording, but also words with a similar meaning.
For more details, look into the following predicates and functions:
FREETEXT
FREETEXTTABLE
CONTAINS
CONTAINSTABLE
The performance benefit of using full-text search can be best realized when querying against a large amount of unstructured text data. A LIKE query (for example, '%microsoft%') against millions of rows of text data can take minutes to return, whereas a full-text query (for 'microsoft') can take only seconds or less against the same data, depending on the number of rows that are returned.
I'm writing a search feature for a database of NFL players.
The user enters a search string like "Jason Campbell" or "Campbell" or "Jason".
I'm having trouble getting the appropriate results.
Which Analyzer should I use when indexing? Which Query when querying? Should I distinguish between first name and last name or just index the full name string?
I'd like the following behavior:
Query: "Jason Campbell" -> Result: exact match for 1 player, Jason Campbell
Query: "Campbell" -> Result: all players with Campbell in their name
Query: "Jason" -> Result: all players with Jason in their name
Query: "Cambel" [misspelled] -> Result: all players with Campbell in their name
StandardAnalyzer should work fine for all of the above queries. Your first query should be enclosed in double quotes for an exact match; your last query requires a fuzzy query. For example, you could search for Cambel~0.5 and get Campbell as a match (the numeric value after the tilde controls the fuzziness).
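To illustrate what "fuzziness" means here, the following Python sketch computes the classic Levenshtein edit distance and uses it to match a misspelled term against name tokens. This is only an illustration of the concept and not how Lucene implements fuzzy queries; the sample names and the fuzzy_search helper are assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


# Sample data for the sketch (assumed, not from a real index)
names = ["Jason Campbell", "Jason Taylor", "Calais Campbell"]


def fuzzy_search(term, max_edits=2):
    """Return names in which any token is within max_edits of the term."""
    term = term.lower()
    return [
        name for name in names
        if any(edit_distance(term, token) <= max_edits
               for token in name.lower().split())
    ]


print(fuzzy_search("Cambel"))
```

"Cambel" is two insertions away from "Campbell", so both Campbells are returned with max_edits=2, while "Jason Taylor" is not.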
BTW I would suggest using Solr which provides features for spell-check and auto-suggest so you wouldn't have to reinvent the wheel. This is similar to Google's "did you mean..."