I am trying to count the number of nodes that have a certain property value.
I currently have:
MATCH (a:address)
WHERE a.countries = "Austria"
RETURN count(a)
to count the nodes for a single value.
I would like to repeat this for a large list of property values such as "Austria", "Denmark", "Finland" without having to change the value manually each time.
How would I go about this?
Let's create a small example dataset:
CREATE (:address {name: "Vienna", countries: "Austria"})
CREATE (:address {name: "Salzburg", countries: "Austria"})
CREATE (:address {name: "Munich", countries: "Germany"})
To count the address nodes for each country, use the following query:
MATCH (a:address)
RETURN a.countries, count(a)
As Cypher is a declarative query language, you do not have to iterate on the results - the query engine does that for you. For a detailed explanation, check the documentation on aggregation.
This returns:
╒═════════════╤══════════╕
│"a.countries"│"count(a)"│
╞═════════════╪══════════╡
│"Austria"    │"2"       │
├─────────────┼──────────┤
│"Germany"    │"1"       │
└─────────────┴──────────┘
Note: I am not familiar with your data model, but I think using the singular country would be a better choice for the attribute name.
I want to be able to do the following simple SQL query using Sequelize:
SELECT * FROM one
JOIN (SELECT COUNT(*) AS count, two_id FROM two GROUP BY two_id) AS table_two
ON one.two_id = table_two.two_id
I can't seem to find anything about a raw include or a raw model.
For performance reasons, I don't want a subselect in the main query (which I know Sequelize already handles well), i.e.:
SELECT * FROM one, (SELECT COUNT(*) AS count FROM two WHERE one.two_id = two.two_id) AS count
Regarding the following Sequelize code (models One and Two exist):
models.One.findAll({
  include: [
    {
      model: models.Two,
      // what to add here in order to get the example SQL
    },
  ],
})
Seems like I found a somewhat hacky workaround:
You can use fn inside selections to use any SQL word (like JOIN), resulting in something like this for my use case:
models.One.findAll({
attributes: [
fn('JOIN', literal('SELECT COUNT(*) AS count FROM two WHERE one.two_id = two.two_id')),
],
});
Note that you can do this only on the last attribute (otherwise the JOIN ends up misplaced).
I'm working on a shopping website. The user selects multiple filters and sends the request to the backend, which is written in Node.js and uses Postgres as the DB.
So I want to search the required data in a single query.
I have a json object containing all the filters that user selected. I want to use them in postgres query and return to user the obtained results.
I have a postgres Table that contains a few products.
name          Category  Price
------------------------------
LOTR          Books     50
Harry Potter  Books     30
Iphone13      Mobile    1000
SJ8           Cameras   200
I want to filter the table using n number of filters in a single query.
I have to make it work for multiple filters such as the ones mentioned below. So I don't have to write multiple queries for different filters.
{ category: 'Books', price: '50' }
{ category: 'Books' }
{category : ['Books', 'Mobiles']}
I can query the table using
SELECT * FROM products WHERE category='Books' AND price=50
SELECT * FROM products WHERE category='Books'
SELECT * FROM products WHERE category='Books' OR category='Mobiles'
respectively.
But I want to write my query in such a way that it populates the keys and values dynamically, so that I don't have to write a separate query for every filter.
I have obtained the key and value pairs from request.query and saved them:
const params = req.query;
const keys: string = Object.keys(params).join(",")
const values: string[] = Object.values(params)
const indices = Object.keys(params).map((obj, i) => {
  return "$" + (i + 1)
})
But I'm unable to pass them in the query in a correct manner.
Does anybody have a suggestion for me? I'd highly appreciate any help.
Thank you in advance.
This is not the way you filter data from a SQL database table.
You need to use the NodeJS pg driver to connect to the database, then write a SQL query. I recommend prepared statements.
A query would look like:
SELECT * FROM my_table WHERE price < ...
Based on your question, at least, it is unclear to me why you would want to do these manipulations in JavaScript, or what you really want to accomplish.
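If the goal is a single query whose WHERE clause is assembled from whichever filters arrive, one possible shape is sketched below in Python (the question uses Node.js, but the idea carries over to any driver that accepts numbered `$n` placeholders). The function name `build_filter_query` and the `ALLOWED_COLUMNS` whitelist are hypothetical names introduced for illustration. Column names are checked against the whitelist because they cannot be passed as query parameters, while values are always sent as parameters. A list value is matched with PostgreSQL's `= ANY($n)`.

```python
from typing import Any

# Hypothetical whitelist: column names are interpolated into the SQL text,
# so they must never be taken from user input without this check.
ALLOWED_COLUMNS = {"name", "category", "price"}


def build_filter_query(filters: dict[str, Any], table: str = "products") -> tuple[str, list[Any]]:
    """Return (sql, params) for the given filters, using $1, $2, ... placeholders."""
    clauses: list[str] = []
    params: list[Any] = []
    for column, value in filters.items():
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unsupported filter column: {column!r}")
        is_list = isinstance(value, (list, tuple))
        params.append(list(value) if is_list else value)
        placeholder = f"${len(params)}"
        if is_list:
            # A list of values becomes an array parameter matched with ANY.
            clauses.append(f"{column} = ANY({placeholder})")
        else:
            clauses.append(f"{column} = {placeholder}")
    # `table` is assumed to be a trusted constant, not user input.
    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = build_filter_query({"category": ["Books", "Mobiles"], "price": 50})
print(sql)
print(params)
```

The returned pair is meant to be handed to the driver as the query text and its parameter list, so the values never become part of the SQL string.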
I am using the StackOverflow annual survey data to do a sample data analysis project. The data can be found in the below link:
Annual Data Survey
I want to filter on 2 columns using a single command. The data types of the 2 columns are as follows:
Country 88751 non-null object
ConvertedComp 55823 non-null float64
I want to select a list of countries and then see if their ConvertedComp is greater than 10000.
To make a list of the required countries, I am using the following filter:
countries = ['United States', 'India','United Kingdom','Germany']
filt = (df['Country'].isin(countries) )
I am using the following filter on ConvertedComp:
filt1 = (df['ConvertedComp']>1000 )
I want to use both these conditions to make a single filter in a single cell. I am using the & operator as follows:
filter1 = (df['Country'].isin(countries) & df['ConvertedComp']>1000)
According to my understanding, when I apply the above filter to the dataframe, I should not see any countries except the ones in the above list. However, when I apply the filter, the dataframe gives me 0 results.
Can anyone please explain how to correctly use the & operator while using filters?
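A likely cause is operator precedence: in Python, `&` binds more tightly than `>`, so the combined expression above is parsed as `(df['Country'].isin(countries) & df['ConvertedComp']) > 1000` rather than as two separate conditions. Wrapping each comparison in its own parentheses avoids this. The sketch below uses a small made-up DataFrame as a stand-in for the survey data, so the values are illustrative only.

```python
import pandas as pd

# Toy stand-in for the survey data; the rows are invented for illustration.
df = pd.DataFrame({
    "Country": ["United States", "India", "France", "Germany"],
    "ConvertedComp": [120000.0, 500.0, 60000.0, None],
})

countries = ["United States", "India", "United Kingdom", "Germany"]

# Each condition gets its own parentheses, so `&` combines two boolean Series.
filt = (df["Country"].isin(countries)) & (df["ConvertedComp"] > 1000)

result = df.loc[filt]
print(result)
```

Rows with a missing ConvertedComp compare as False and are dropped by the filter.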
I'm writing a REST api that works with SQL and am constantly finding myself in similar situations to this one, where I need to return lists of objects with nested lists inside each object by querying over table joins.
Let's say I have a many-to-many relationship between Users and Groups. I have a User table and a Group table and a junction table UserGroup between them. Now I want to write a REST endpoint that returns a list of users, and for each user the groups that they are enrolled in. I want to return a json with a format like this:
[
  {
    "username": "test_user1",
    <other attributes ...>
    "groups": [
      {
        "group_id": 2,
        <other attributes ...>
      },
      {
        "group_id": 3,
        <other attributes ...>
      }
    ]
  },
  {
    "username": "test_user2",
    <other attributes ...>
    "groups": [
      {
        "group_id": 1,
        <other attributes ...>
      },
      {
        "group_id": 2,
        <other attributes ...>
      }
    ]
  },
  etc ...
There are two or three ways to query SQL for this that I can think of:
Issue a variable number of SQL queries: Query for a list of Users, then loop over each user to query over the junction linkage to populate the groups list for each user. The number of SQL queries linearly increases with the number of users returned.
example (using python flask_sqlalchemy / flask_restx):
users = db.session.query(User).filter( ... ).all()
result = []
for u in users:
    # one extra query per user
    groups = db.session.query(Group).join(UserGroup, UserGroup.group_id == Group.id) \
        .filter(UserGroup.user_id == u.id).all()
    result.append({**u.__dict__, 'groups': groups})
retobj = api.marshal(result, my_model)
# Total number of queries: 1 + number of users in result
Issue a constant number of SQL queries: This can be done by issuing one monolithic SQL query that performs all the joins, with potentially lots of redundant data in the User columns, or, often preferably, a few separate SQL queries. For example, query for a list of Users, then query the Group table joined on UserGroup, then group the groups by user in server code.
example code:
from collections import defaultdict
# .all() materializes the users once, so the list can be reused without re-querying
users = db.session.query(User).filter( ... ).all()
uids = [u.id for u in users]
groups = db.session.query(UserGroup.user_id, Group).join(Group, UserGroup.group_id == Group.id) \
    .filter(UserGroup.user_id.in_(uids))
aggregate = defaultdict(list)
for g in groups:
    # each row is a (user_id, Group) pair
    aggregate[g.user_id].append(g[1].__dict__)
retobj = api.marshal([{**u.__dict__, 'groups': aggregate[u.id]} for u in users], my_model)
# Total number of queries: 2
The third approach, with limited usefulness, is to use string_agg or something similar to force SQL to concatenate a grouping into one string column, then unpack the string into a list server-side. For example, if all I wanted was the group number, I could use string_agg and group_by to get back "1,2" in one query to the User table. But this is only useful if you don't need complex objects.
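The constant-number-of-queries approach can also be sketched without an ORM. The example below is a minimal illustration using Python's built-in sqlite3 module with an in-memory database. The table names (`app_user`, `app_group`, `user_group`) and the sample rows are assumptions made for the example, not the actual schema from the question.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE app_group (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_group (user_id INTEGER, group_id INTEGER);
    INSERT INTO app_user VALUES (1, 'test_user1'), (2, 'test_user2');
    INSERT INTO app_group VALUES (1, 'g1'), (2, 'g2'), (3, 'g3');
    INSERT INTO user_group VALUES (1, 2), (1, 3), (2, 1), (2, 2);
""")

# Query 1: the users themselves.
users = conn.execute("SELECT id, username FROM app_user ORDER BY id").fetchall()
user_ids = [uid for uid, _ in users]

# Query 2: every (user_id, group) pair for those users, fetched in one round trip.
placeholders = ",".join("?" for _ in user_ids)
rows = conn.execute(
    "SELECT ug.user_id, g.id, g.name "
    "FROM user_group AS ug JOIN app_group AS g ON g.id = ug.group_id "
    f"WHERE ug.user_id IN ({placeholders}) "
    "ORDER BY ug.user_id, g.id",
    user_ids,
).fetchall()

# Bucket the groups by user in application code.
groups_by_user = defaultdict(list)
for user_id, group_id, name in rows:
    groups_by_user[user_id].append({"group_id": group_id, "name": name})

payload = [
    {"username": username, "groups": groups_by_user[uid]}
    for uid, username in users
]
print(payload)
```

The number of queries stays at two regardless of how many users are returned; the cost moves to a single pass over the joined rows in application code.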
I'm attracted to the second approach because I feel like it's more efficient and scalable because the number of SQL queries (which I have assumed is the main bottleneck for no particularly good reason) is constant, but it takes some more work on the server's side to filter all the groups into each user. But I thought part of the point of using SQL is to take advantage of its efficient sorting/filtering so you don't have to do it yourself.
So my question is, am I right in thinking that it's a good idea to make the number of SQL queries constant at the expense of more server-side processing and dev time? Is it a waste of time to try to whittle down the number of unnecessary SQL queries? Will I regret it if I don't, when the API is tested at scale? Is there a better way to solve this problem that I'm not aware of?
Using the joinedload option, you can load all the data with just one query:
q = (
    session.query(User)
    .options(db.joinedload(User.groups))
    .order_by(User.id)
)
users = q.all()
for user in users:
    print(user.name)
    for ug in user.groups:
        print(" ", ug.name)
When you run the query above, all the groups will already have been loaded from the database, using a query similar to the one below:
SELECT "user".id,
"user".name,
group_1.id,
group_1.name
FROM "user"
LEFT OUTER JOIN (user_group AS user_group_1
JOIN "group" AS group_1 ON group_1.id = user_group_1.group_id)
ON "user".id = user_group_1.user_id
Now you only need to serialize the result with a proper schema.
My model looks like so: Each Bottle has an attribute name, and a relationship to Brand.
In one of my views, I want to show a user all distinct bottles, and their counts.
A distinct bottle is a bottle that has the same name attribute, and the same Brand relationship.
So this table:
Should display 2 lines instead of 3, with the proper quantities (1 for Eitan, 2 for Almon).
The following line in my views.py:
object = Bottle.objects.filter(brand__business__owner_id=user.id).all().values('name').annotate(Count('brand'))
Produces this when I print object:
<QuerySet [{'name': 'Almon', 'brand__count': 2}, {'name': 'Eitan', 'brand__count': 1}]>
Which seems to be the right direction, but it has two problems:
I lose all other fields (vintage, capacity) except name and brand__count. I can of course explicitly add them to values, but that seems a) unpythonic and b) like it will group by these fields as well!
My pug template complains: Need 2 values to unpack in for loop; got 1 (this is because I'm iterating through them as a list, and using its index for numbering)
Any help is appreciated!
object = Bottle.objects.filter(brand__business__owner_id=user.id).all().values('name','vintage','capacity').annotate(Count('brand'))
The queryset only returns the fields you list in values(), so if you list only name, that is all you get back. Alternatively, call values() without naming any fields:
object = Bottle.objects.filter(brand__business__owner_id=user.id).all().values().annotate(Count('brand'))
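Both of these give you the other fields of the Bottle table: the first returns the three listed fields, and the second, with an empty values(), returns all of them. Keep in mind that annotate() then groups by every field that values() returns.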