How can I compare string values between different classes in cypher query - cypher

I am stuck trying to create relations between two different node classes via a cypher query. I have tried the following:
UNWIND [range(0,100)] AS cols
MATCH (k:Keyword {id:cols}), (d:Document)
WHERE d.key1 = k.name OR d.key2 = k.name OR d.key3 = k.name OR d.key4 = k.name
CREATE (d)-[r:CONTAINS_KEYWORD]->(k);
Is there a better way to compare node properties? Why doesn't this work? The following example works just fine:
MATCH (k:Keyword {name: "algorithm"}), (d:Document)
WHERE d.key1 = "algorithm" OR d.key2 = "algorithm" OR d.key3 = "algorithm" OR d.key4 = "algorithm"
CREATE (d)-[r:CONTAINS_KEYWORD]->(k);
Of course I don't want to do this for my list of several 100 keywords manually.
Thanks in advance for any hints or ideas.

Related

Cypher - Add multiple connections

I have 2 nodes:
Students and Subjects.
I want to be able to add multiple student names to multiple subjects at the same time using cypher query.
So far I have done it by iterating through the lists of student and subject names and executing the query for each, but is there a way to do the same in the query itself?
This is the query I use for adding 1 student to 1 subject:
MATCH
(s:Student)-[:STUDENT_BELONGS_TO]->(c:Classroom),
(u:Subjects)-[:SUBJECTS_TAUGHT_IN]->(c:Classroom)
WHERE
s.id = ${"$"}studentId
AND c.id = ${"$"}classroomId
AND u.name = ${"$"}subjectNames
AND NOT (s)-[:IN_SUBJECT]->(u)
CREATE (s)-[:IN_SUBJECT]->(u)
So I want to be able to receive multiple subjectNames and studentIds at once to create these connections. Any guidance for multi relationships in cypher ?
I think what you are looking for is UNWIND. If you have an array as parameter to your query:
studentList :
[
{ studentId: "sid1", classroomId: "cid1", subjectNames: ['s1','s2'] },
{ studentId: "sid2", classroomId: "cid2", subjectNames: ['s1','s3'] },
...
]
You can UNWIND that parameter in the beginning of your query:
UNWIND $studentList as student
MATCH
(s:Student)-[:STUDENT_BELONGS_TO]->(c:Classroom),
(u:Subjects)-[:SUBJECTS_TAUGHT_IN]->(c:Classroom)
WHERE
s.id = student.studentId
AND c.id = student.classroomId
AND u.name IN student.subjectNames
AND NOT (s)-[:IN_SUBJECT]->(u)
CREATE (s)-[:IN_SUBJECT]->(u)
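To make the row expansion concrete, here is a rough Python sketch of what UNWIND plus the membership test on subjectNames does. It is an illustration only, not Cypher: the subjects_by_classroom dictionary and the expand function are hypothetical stand-ins for the graph data and the query, and the sample values are made up.

```python
# Hypothetical parameter payload, mirroring the $studentList shape above.
student_list = [
    {"studentId": "sid1", "classroomId": "cid1", "subjectNames": ["s1", "s2"]},
    {"studentId": "sid2", "classroomId": "cid2", "subjectNames": ["s1", "s3"]},
]

# Hypothetical stand-in for the graph: subjects taught in each classroom.
subjects_by_classroom = {
    "cid1": ["s1", "s2", "s3"],
    "cid2": ["s1", "s2"],
}


def expand(students, subjects_by_classroom):
    """Return the (studentId, subjectName) pairs the query would link."""
    pairs = []
    for student in students:  # one row per list element, like UNWIND
        taught = subjects_by_classroom.get(student["classroomId"], [])
        for name in taught:  # subjects matched through the classroom
            if name in student["subjectNames"]:  # membership test on the list
                pairs.append((student["studentId"], name))
    return pairs


pairs = expand(student_list, subjects_by_classroom)
print(pairs)
```

Each element of the list becomes its own row, so a single query execution covers every student and every requested subject.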
You probably need to use UNWIND.
I haven't tested the code, but something like this might work:
MATCH
(s:Student)-[:STUDENT_BELONGS_TO]->(c:Classroom),
(u:Subjects)-[:SUBJECTS_TAUGHT_IN]->(c:Classroom)
WITH
s AS student, COLLECT(u) AS subjects
UNWIND subjects AS subject
CREATE (student)-[:IN_SUBJECT]->(subject)

Selecting related model: Left join, prefetch_related or select_related?

Considering I have the following relationships:
class House(Model):
name = ...
class User(Model):
"""The standard auth model"""
pass
class Alert(Model):
user = ForeignKey(User)
house = ForeignKey(House)
somevalue = IntegerField()
Meta:
unique_together = (('user', 'property'),)
In one query, I would like to get the list of houses, and whether the current user has any alert for any of them.
In SQL I would do it like this:
SELECT *
FROM house h
LEFT JOIN alert a
ON h.id = a.house_id
WHERE a.user_id = ?
OR a.user_id IS NULL
And I've found that I could use prefetch_related to achieve something like this:
p = Prefetch('alert_set', queryset=Alert.objects.filter(user=self.request.user), to_attr='user_alert')
houses = House.objects.order_by('name').prefetch_related(p)
The above example works, but houses.user_alert is a list, not an Alert object. I only have one alert per user per house, so what is the best way for me to get this information?
select_related didn't seem to work. Oh, and surely I know I can manage this in multiple queries, but I'd really want to have it done in one, and the 'Django way'.
Thanks in advance!
The solution is clearer if you start with the multiple query approach, and then try to optimise it. To get the user_alerts for every house, you could do the following:
houses = House.objects.order_by('name')
for house in houses:
user_alerts = house.alert_set.filter(user=self.request.user)
The user_alerts queryset will cause an extra query for every house in the queryset. You can avoid this with prefetch_related.
alerts_queryset = Alert.objects.filter(user=self.request.user)
houses = House.objects.order_by('name').prefetch_related(
Prefetch('alert_set', queryset=alerts_queryset, to_attr='user_alerts'),
)
for house in houses:
user_alerts = house.user_alerts
This will take two queries, one for houses and one for the alerts. I don't think you require select related here to fetch the user, since you already have access to the user with self.request.user. If you want you could add select_related to the alerts_queryset:
alerts_queryset = Alert.objects.filter(user=self.request.user).select_related('user')
In your case, user_alerts will be an empty list or a list with one item, because of your unique_together constraint. If you can't handle the list, you could loop through the queryset once, and set house.user_alert:
for house in houses:
house.user_alert = house.user_alerts[0] if house.user_alerts else None
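For comparison with the SQL in the question, here is a minimal sketch using Python's sqlite3 module of a LEFT JOIN whose user filter sits in the ON clause. Table and column names follow the question's SQL; the sample rows and the houses_with_alert helper are made up for illustration. With the filter in the WHERE clause instead, a house whose only alerts belong to other users (house B below) would be dropped from the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE house (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE alert (id INTEGER PRIMARY KEY, user_id INTEGER,
                        house_id INTEGER, somevalue INTEGER);
    INSERT INTO house VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- user 1 has an alert on house A; only user 2 has one on house B
    INSERT INTO alert VALUES (1, 1, 1, 10), (2, 2, 2, 20);
""")


def houses_with_alert(user_id):
    # The user filter lives in the ON clause, so houses without a matching
    # alert still come back, with NULL (None) in the alert column.
    return conn.execute(
        """
        SELECT h.name, a.somevalue
        FROM house h
        LEFT JOIN alert a ON a.house_id = h.id AND a.user_id = ?
        ORDER BY h.name
        """,
        (user_id,),
    ).fetchall()


rows = houses_with_alert(1)
print(rows)
```

Every house appears exactly once, which matches the "one alert or None per house" shape produced by the loop above.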

Django - Making a SQL query on a many to many relationship with PostgreSQL Inner Join

I am looking for a particular raw SQL query using Inner Join.
I have those models:
class EzMap(models.Model):
layers = models.ManyToManyField(Shapefile, verbose_name='Layers to display', null=True, blank=True)
class Shapefile(models.Model):
filename = models.CharField(max_length=255)
class Feature(models.Model):
shapefile = models.ForeignKey(Shapefile)
I would like to make a SQL Query valid with PostgreSQL that would be like this one:
select id from "table_feature" where shapefile_ezmap_id = 1 ;
but I don't know how to use the INNER JOIN to filter features whose shapefile is related to a particular ezmap object.
Something like this:
try:
id = Feature.objects.get(shapefile__ezmap__id=1).id
except Feature.DoesNotExist:
id = 0 # or some other action when no result is found
You will need to use filter (instead of get) if you want to deal with multiple Feature results.

Filtering model with HABTM relationship

I have 2 models - Restaurant and Feature. They are connected via has_and_belongs_to_many relationship. The gist of it is that you have restaurants with many features like delivery, pizza, sandwiches, salad bar, vegetarian option,… So now when the user wants to filter the restaurants and lets say he checks pizza and delivery, I want to display all the restaurants that have both features; pizza, delivery and maybe some more, but it HAS TO HAVE pizza AND delivery.
If I do a simple .where('features IN (?)', params[:features]) I (of course) get the restaurants that have either one - so pizza or delivery or both - which is not at all what I want.
My SQL/Rails knowledge is kinda limited since I'm new to this but I asked a friend and now I have this huuuge SQL that gets the job done:
Restaurant.find_by_sql(['SELECT restaurant_id FROM (
SELECT features_restaurants.*, ROW_NUMBER() OVER(PARTITION BY restaurants.id ORDER BY features.id) AS rn FROM restaurants
JOIN features_restaurants ON restaurants.id = features_restaurants.restaurant_id
JOIN features ON features_restaurants.feature_id = features.id
WHERE features.id in (?)
) t
WHERE rn = ?', params[:features], params[:features].count])
So my question is: is there a better - more Rails even - way of doing this? How would you do it?
Oh BTW I'm using Rails 4 on Heroku so it's a Postgres DB.
This is an example of a set-within-sets query. I advocate solving these with group by and having, because this provides a general framework.
Here is how this works in your case:
select fr.restaurant_id
from features_restaurants fr join
features f
on fr.feature_id = f.feature_id
group by fr.restaurant_id
having sum(case when f.feature_name = 'pizza' then 1 else 0 end) > 0 and
sum(case when f.feature_name = 'delivery' then 1 else 0 end) > 0
Each condition in the having clause is counting for the presence of one of the features -- "pizza" and "delivery". If both features are present, then you get the restaurant_id.
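A small way to try the group by / having approach is with Python's sqlite3 module. This is only a sketch: the column names (feature_id, feature_name) follow the query above, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE features (feature_id INTEGER PRIMARY KEY, feature_name TEXT);
    CREATE TABLE features_restaurants (restaurant_id INTEGER, feature_id INTEGER);
    INSERT INTO features VALUES (1, 'pizza'), (2, 'delivery'), (3, 'salad bar');
    -- restaurant 10 has everything, 20 only pizza, 30 delivery and salad bar
    INSERT INTO features_restaurants VALUES
        (10, 1), (10, 2), (10, 3),
        (20, 1),
        (30, 2), (30, 3);
""")

# One group per restaurant; each SUM counts the rows carrying one feature,
# and the restaurant is kept only if every required feature was seen.
matching = [row[0] for row in conn.execute("""
    SELECT fr.restaurant_id
    FROM features_restaurants fr
    JOIN features f ON fr.feature_id = f.feature_id
    GROUP BY fr.restaurant_id
    HAVING SUM(CASE WHEN f.feature_name = 'pizza' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN f.feature_name = 'delivery' THEN 1 ELSE 0 END) > 0
""")]
print(matching)
```

Only the restaurant that has both features survives the having clause; extra features on that restaurant do not matter.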
How much data is in your features table? Is it just a table of ids and names?
If so, and you're willing to do a little denormalization, you can do this much more easily by encoding the features as a text array on restaurant.
With this scheme your queries boil down to
select * from restaurants where restaurants.features @> ARRAY['pizza', 'delivery']
If you want to maintain your features table because it contains useful data, you can store the array of feature ids on the restaurant and do a query like this:
select * from restaurants where restaurants.feature_ids @> ARRAY[5, 17]
If you don't know the ids up front, and want it all in one query, you should be able to do something along these lines:
select * from restaurants where restaurants.feature_ids @> ARRAY(
select id from features where name in ('pizza', 'delivery')
)
That last query might need some more consideration...
Anyways, I've actually got a pretty detailed article written up about Tagging in Postgres and ActiveRecord if you want some more details.
This is not a "copy and paste" solution, but if you follow these steps you will have a fast working query:
index the feature_name column (I'm assuming that the feature_id column is indexed on both tables)
place each feature_name param in its own exists():
select r.id as restaurant_id
from
restaurants r
where
exists(select true from features_restaurants fr join features f on fr.feature_id = f.feature_id where fr.restaurant_id = r.id and f.feature_name = 'pizza')
and
exists(select true from features_restaurants fr join features f on fr.feature_id = f.feature_id where fr.restaurant_id = r.id and f.feature_name = 'delivery')
Maybe you're looking at it backwards?
Maybe try merging the restaurants returned by each feature.
Simplified:
pizza_restaurants = Feature.find_by_name('pizza').restaurants
delivery_restaurants = Feature.find_by_name('delivery').restaurants
pizza_delivery_restaurants = pizza_restaurants & delivery_restaurants
Obviously, this is a single instance solution. But it illustrates the idea.
UPDATE
Here's a dynamic method to pull in all filters without writing SQL (i.e. the "Railsy" way)
def get_restaurants_by_feature_names(features)
# accepts an array of feature names
restaurants = Restaurant.all
features.each do |f|
feature_restaurants = Feature.find_by_name(f).restaurants
restaurants = feature_restaurants & restaurants
end
return restaurants
end
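The intersection idea does not depend on Rails. As a rough sketch in Python rather than Ruby, with a hypothetical restaurants_by_feature lookup standing in for the Feature model and made-up ids, it could look like this:

```python
# Hypothetical lookup: feature name -> ids of restaurants offering it.
restaurants_by_feature = {
    "pizza": {10, 20},
    "delivery": {10, 30},
    "salad bar": {10, 30},
}


def restaurants_with_all(feature_names, lookup):
    """Return ids of restaurants that offer every feature in feature_names."""
    # Unknown feature names are treated as "no restaurant offers this".
    sets = [set(lookup.get(name, ())) for name in feature_names]
    if not sets:
        return set()  # no filters given: this sketch returns nothing
    return set.intersection(*sets)


both = restaurants_with_all(["pizza", "delivery"], restaurants_by_feature)
print(both)
```

Note that this approach loads one result set per feature before intersecting, so it trades query count for simplicity.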
Since it's an AND condition (OR conditions get dicey with AREL), I reread your stated problem, ignoring the SQL. I think this is what you want.
# in Restaurant
has_many :features
# in Feature
has_many :restaurants
# this is a contrived example. you may be doing something like
# where(name: 'pizza'). I'm just making this condition up. You
# could also make this more DRY by just passing in the name if
# that's what you're doing.
def self.pizza
where(pizza: true)
end
def self.delivery
where(delivery: true)
end
# query
Restaurant.features.pizza.delivery
Basically you call the association with ".features" and then you use the self methods defined on features. Hopefully I didn't misunderstand the original problem.
Cheers!
Restaurant
.joins(:features)
.where(features: {name: ['pizza','delivery']})
.group(:id)
.having('count(features.name) = ?', 2)
This seems to work for me. I tried it with SQLite though.

Left join query in Django with multiple joins against same table

Not sure how to accomplish this in Django.
Models:
class LadderPlayer(models.Model):
player = models.ForeignKey(User, unique=True)
position = models.IntegerField(unique=True)
class Match(models.Model):
date = models.DateTimeField()
challenger = models.ForeignKey(LadderPlayer)
challengee = models.ForeignKey(LadderPlayer)
Would like to query to get all info about a player in one shot, including any challenges they have issued or challenges against them. This SQL works:
select lp.position,
lp.player_id,
sc1.challengee_id challenging,
sc2.challenger_id challenged_by
from ladderplayer lp left join challenge sc1 on lp.player_id = sc1.challenger_id
left join challenge sc2 on lp.player_id = sc2.challengee_id
Which returns something like this, if player 3 has challenged player 2:
position player_id challenging challenged_by
---------- ---------- ----------- -------------
1 1
2 2 3
3 3 2
No idea how to do in Django ORM....any way to do this?
Actually you should probably change your models a bit, since there's a many-to-many relation from LadderPlayer to itself using Match as an intermediate table. Check out django's documentation on this topic. Then you should be able to make the queries you want using django's orm! Also have a look at symmetrical/asymmetrical many-to-many relationships!
Well, I did more digging and it looks like in Django 1.2 this is doable via the "raw()" method on the Query Manager thing. So this is the code using my query above:
ladder_players = LadderPlayer.objects.raw("""select lp.id, lp.position,lp.player_id,
sc1.challengee_id challenging,
sc2.challenger_id challenged_by
from ladderplayer lp left join challenge sc1 on lp.player_id = sc1.challenger_id
left join challenge sc2 on lp.player_id = sc2.challengee_id order by position""")
And in the template, you can refer to the "calculated" join fields:
{% for p in ladder_players %}
{{p.challenging}} {{p.challenged_by}}
...
etc.
Seems to work as I needed....
@lazerscience is absolutely correct. You should tweak your models, since you are setting up a de facto many-to-many relationship; doing so will allow you to leverage more features of the admin interface & so forth.
Additionally, regardless, there is no need to go to raw(), since this can be done entirely via normal usage of the Django ORM.
Something like:
class LadderPlayer(models.Model):
player = models.ForeignKey(User, unique=True)
position = models.IntegerField(unique=True)
challenges = models.ManyToManyField("self", symmetrical=False, through='Match')
class Match(models.Model):
date = models.DateTimeField()
challenger = models.ForeignKey(LadderPlayer)
challengee = models.ForeignKey(LadderPlayer)
should be all you need to change in the models. You then should be able to do a query like
player_of_interest = LadderPlayer.objects.filter(pk=some_id)
matches_of_interest = \
Match.objects.filter(Q(challenger__pk=some_id)|Q(challengee__pk=some_id))
to get all the information of interest about the player in question. Note that you'll need to have from django.db.models import Q to use that.
If you want exactly the same info you're presenting with your example query, I believe it'd be easiest to split the queries into separate ones for getting the challenger & challengee lists -- for example, something like:
challengers = LadderPlayer.objects.filter(challenges__challengee__pk=poi_id)
challenged_by = LadderPlayer.objects.filter(challenges__challenger__pk=poi_id)
will get the two relevant query sets for the player of interest (w/ a primary key of poi_id).
If there's some particular reason you don't want the de facto many-to-many relationship to become a de jure one, you can change those to something along the lines of
challenger = LadderPlayer.objects.filter(match__challengee__pk=poi_id)
challenged_by = LadderPlayer.objects.filter(match__challenger__pk=poi_id)
So the suggestion for the model change is merely to help leverage existing tools, and to make explicit a relationship which you are currently having occur implicitly.
Based on how you want to use it, you might want to do something like
pl_tuple = ()
for p in LadderPlayer.objects.all():
challengers = LadderPlayer.objects.filter(challenges__challengee__pk=p.id)
challenged_by = LadderPlayer.objects.filter(challenges__challenger__pk=p.id)
pl_tuple += ((p.id, p.position, challengers, challenged_by),)
context_dict['ladder_players'] = pl_tuple
in your view to prepare the data for your template.
Regardless, you should probably be doing your query through the Django ORM instead of using raw() in this case.