I have a MongoDB collection, and I want to count the documents for each employee id. In SQL this could be done like this:
select count(*), empid
from my_table
group by empid
Any idea how to achieve this in MongoDB?
db.table_name.aggregate([
  { "$group": { _id: "$empid", count: { $sum: 1 } } }
])
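You can sum ones, one for each document. Note the "$" prefix on "$empid": without it, every document is grouped under the literal string "empid".
db.my_table.aggregate([
  { $group: { _id: "$empid", qty: { $sum: 1 } } }
]);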
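To make it clearer what the $group stage with $sum: 1 computes, here is a minimal Python sketch that mirrors it on in-memory documents. It does not talk to MongoDB; the sample documents and the helper name count_by_empid are made up for illustration.

```python
from collections import Counter


def count_by_empid(docs):
    """Mimic {$group: {_id: "$empid", count: {$sum: 1}}} on a list of dicts."""
    # Each document contributes 1 to the bucket of its empid value.
    counts = Counter(doc.get("empid") for doc in docs)
    return [{"_id": empid, "count": n} for empid, n in counts.items()]


# Hypothetical sample documents, not taken from the question.
docs = [{"empid": 1}, {"empid": 2}, {"empid": 1}]
result = count_by_empid(docs)
print(result)
```

Each output entry has the same shape as a document returned by the aggregation: the group key in _id and the number of documents in count.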
I run this simple query in Logic App using the "Execute a SQL query (V2)" connector to find out if a number exists in my table.
select count(*) from users where user_number='724-555-5555';
If the number exists, I get this JSON, but somehow I can't parse it.
[
{
"": 1
}
]
Any idea how to simply retrieve 0 or 1?
Thanks
David
You need to add an explicit column name:
SELECT
count(*) AS cnt
FROM
users
WHERE
user_number = '724-555-5555';
That will give you this result:
[ { "cnt": 1 } ]
...which gives the count a named property (cnt) that is easy to reference when parsing.
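As a rough illustration of reading the value once the column has a name, here is a small Python sketch. It only shows generic JSON handling, not the Logic App connector itself; the helper name extract_count is an assumption made for this example.

```python
import json


def extract_count(payload, key="cnt"):
    """Return the count from a result set shaped like [ { "<key>": n } ]."""
    rows = json.loads(payload)
    # An empty result set is treated as a count of 0.
    return rows[0][key] if rows else 0


print(extract_count('[ { "cnt": 1 } ]'))
# A general-purpose parser also accepts the unnamed column,
# but the key is then the empty string, which is awkward to reference.
print(extract_count('[ { "": 1 } ]', key=""))
```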
I have 50 users with the record format
Id,
Name,
Group_Id
and the groups
1,
2,
3
The users are to be inserted into a pairs table with the format
Id,
Pair_1,
Pair_2
Note
Users belong to different groups.
Users from group 2 cannot pair with each other, users from group 3 cannot pair with each other either, and duplicate pairs must be avoided.
How do I go about this in SQL? I am a novice.
Here is some sample data in JavaScript:
[
{
Id:1,
Name:"James",
Group_Id:3
},
{
Id:2,
Name:"Daniel",
Group_Id:3
},
{
Id:3,
Name:"Jonathan",
Group_Id:2
},
{
Id:4,
Name:"Esther",
Group_Id:1
},
{
Id:5,
Name:"Leo",
Group_Id:1
}
]
Pair_1 & Pair_2 are two paired users to be added to a pairs table based on the condition explained earlier.
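One possible reading of these rules can be sketched with a self-join. The example below runs the SQL through Python's built-in sqlite3 module so it is self-contained. It assumes that every allowed combination should be stored once (rather than each user appearing in only one pair), and the table name users is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (Id INTEGER PRIMARY KEY, Name TEXT, Group_Id INTEGER);
    CREATE TABLE pairs (Id INTEGER PRIMARY KEY AUTOINCREMENT,
                        Pair_1 INTEGER, Pair_2 INTEGER);
    INSERT INTO users VALUES
        (1, 'James', 3), (2, 'Daniel', 3), (3, 'Jonathan', 2),
        (4, 'Esther', 1), (5, 'Leo', 1);
""")

# a.Id < b.Id keeps each pair once and never pairs a user with themselves.
# The WHERE clause drops pairs where both users are in group 2 or both in group 3.
conn.execute("""
    INSERT INTO pairs (Pair_1, Pair_2)
    SELECT a.Id, b.Id
    FROM users a
    JOIN users b ON a.Id < b.Id
    WHERE NOT (a.Group_Id = b.Group_Id AND a.Group_Id IN (2, 3))
    ORDER BY a.Id, b.Id
""")

pairs = conn.execute(
    "SELECT Pair_1, Pair_2 FROM pairs ORDER BY Pair_1, Pair_2"
).fetchall()
print(pairs)
```

With the sample data, James and Daniel (both group 3) are never paired, while Esther and Leo (both group 1) are. If each user may appear in only one pair, a different approach is needed.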
I am new to Couchbase and kind of stuck with the following problem.
This query works just fine in the Couchbase Query Editor:
SELECT
p.countryCode,
SUM(c.total) AS total
FROM bucket p
USE KEYS (
SELECT RAW "p::" || ca.token
FROM bucket ca USE INDEX (idx_cr)
WHERE ca._class = 'backend.db.p.ContactsDo'
AND ca.total IS NOT MISSING
AND ca.date IS NOT MISSING
AND ca.token IS NOT MISSING
AND ca.id = 288
ORDER BY ca.total DESC, ca.date ASC
LIMIT 20 OFFSET 0
)
LEFT OUTER JOIN bucket finished_contacts
ON KEYS ["finishedContacts::" || p.token]
GROUP BY p.countryCode ORDER BY total DESC
I get this:
[
{
"countryCode": "en",
"total": 145
},
{
"countryCode": "at",
"total": 133
},
{
"countryCode": "de",
"total": 53
},
{
"countryCode": "fr",
"total": 6
}
]
Now, using this query in a Spring Boot application, I end up with this error:
Unable to retrieve enough metadata for N1QL to entity mapping, have you selected _ID and _CAS?
adding metadata,
SELECT
meta(p).id AS _ID,
meta(p).cas AS _CAS,
p.countryCode,
SUM(c.total) AS total
FROM bucket p
trying to map it to the following object:
data class CountryIntermediateRankDo(
    @Id
    @Field
    val id: String,
    @Field
    @NotNull
    val countryCode: String,
    @Field
    @NotNull
    val total: Long
)
results in:
Unable to execute query due to the following n1ql errors:
{"msg":"Expression must be a group key or aggregate: (meta(p).id)","code":4210}
Using Map as return value results in:
org.springframework.data.couchbase.core.CouchbaseQueryExecutionException: Query returning a primitive type are expected to return exactly 1 result, got 0
Clearly I missed something important about how to write proper Couchbase queries. I am stuck between needing the metadata and getting this key/aggregate error, which relates to the GROUP BY clause. I'd be very thankful for any help.
When you have a GROUP BY query, everything in the SELECT clause must be either an expression used for grouping or a group aggregate. You need to add the new fields to the GROUP BY clause, something like this:
SELECT
_ID,
_CAS,
p.countryCode,
SUM(p.c.total) AS total
FROM testBucket p
USE KEYS ["foo", "bar"]
LEFT OUTER JOIN testBucket finished_contacts
ON KEYS ["finishedContacts::" || p.token]
GROUP BY p.countryCode, meta(p).id AS _ID, meta(p).cas AS _CAS
ORDER BY total DESC
(I had to make some changes to your query to work with it effectively. You'll need to retrofit the advice to your specific case.)
If you need more detailed advice, let me suggest the N1QL forum https://forums.couchbase.com/c/n1ql . StackOverflow is great for one-and-done questions, but the forum is better for extended interactions.
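One consequence worth checking before you retrofit this: once a per-document value such as meta(p).id is part of the group key, each group holds a single document, so the totals are no longer summed per country. The Python sketch below illustrates that effect on made-up data. It is not Couchbase code, and the field names and the helper sum_totals are assumptions for illustration.

```python
from collections import defaultdict


def sum_totals(docs, key_fields):
    """Sum 'total' per group, where the group key is built from key_fields."""
    groups = defaultdict(int)
    for doc in docs:
        key = tuple(doc[field] for field in key_fields)
        groups[key] += doc["total"]
    return dict(groups)


# Hypothetical documents; "id" stands in for meta(p).id.
docs = [
    {"id": "p::a", "countryCode": "en", "total": 100},
    {"id": "p::b", "countryCode": "en", "total": 45},
    {"id": "p::c", "countryCode": "de", "total": 53},
]

by_country = sum_totals(docs, ["countryCode"])
by_country_and_id = sum_totals(docs, ["countryCode", "id"])
print(by_country)
print(by_country_and_id)
```

Grouping by country alone yields one row per country, while adding the id yields one row per document. If you need per-country totals, the result may fit better in a type that does not require _ID and _CAS.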
I'm trying to find rows with duplicate fields in an array of structs within a Google BigQuery table, using the new Standard SQL. The data in the table (simplified) where each row looks a bit like this:
{
"Session": "abc123",
"Information": [
{
"Identifier": "e8d971a4-ef33-4ea1-8627-f1213e4c67dc"
},
{
"Identifier": "1c62813f-7ec4-4968-b18b-d1eb8f4d9d26"
},
{
"Identifier": "e8d971a4-ef33-4ea1-8627-f1213e4c67dc"
}
]
}
My end goal is to display the rows that have Information entities with duplicate Identifier values present. However, most of the queries I attempt get an error message of the following form:
Cannot access field Identifier on a value with type ARRAY<STRUCT<Identifier STRING>>
Is there a way to work with the data inside of a STRUCT within an ARRAY?
Here's my first attempt at a query:
SELECT
Session,
Information
FROM
`events.myevents`
WHERE
COUNT(DISTINCT Information.Identifier) != ARRAY_LENGTH(Information.Identifier)
LIMIT
1000
And another using a subquery:
SELECT
Session,
Information
FROM (
SELECT
Session,
Information,
COUNT(DISTINCT Information.Identifier) AS info_count_distinct,
ARRAY_LENGTH(Information) AS info_count
FROM
`events.myevents`
WHERE
COUNT(DISTINCT Information.Identifier) != ARRAY_LENGTH(Information.Identifier)
LIMIT
1000)
WHERE
info_count != info_count_distinct
Try the query below:
SELECT Session, Identifier, COUNT(1) AS dups
FROM `events.myevents`, UNNEST(Information)
GROUP BY Session, Identifier
HAVING dups > 1
ORDER BY Session
This should give you what you expect, plus the number of duplicates. The output looks like this (example):
Session   Identifier                             dups
abc123    e8d971a4-ef33-4ea1-8627-f1213e4c67dc   2
abc345    1c62813f-7ec4-4968-b18b-d1eb8f4d9d26   3
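To show the logic of this approach outside BigQuery, here is a small Python sketch that flattens the Information array (the role UNNEST plays) and counts each Session/Identifier combination. The function name find_duplicate_identifiers is made up for this example, and the input is the sample row from the question.

```python
from collections import Counter


def find_duplicate_identifiers(rows):
    """Return (Session, Identifier, dups) for identifiers seen more than once."""
    # Flatten: one (Session, Identifier) entry per struct in the array.
    counts = Counter(
        (row["Session"], info["Identifier"])
        for row in rows
        for info in row["Information"]
    )
    # Equivalent of HAVING dups > 1, sorted by session.
    return sorted((s, i, n) for (s, i), n in counts.items() if n > 1)


rows = [
    {
        "Session": "abc123",
        "Information": [
            {"Identifier": "e8d971a4-ef33-4ea1-8627-f1213e4c67dc"},
            {"Identifier": "1c62813f-7ec4-4968-b18b-d1eb8f4d9d26"},
            {"Identifier": "e8d971a4-ef33-4ea1-8627-f1213e4c67dc"},
        ],
    }
]

duplicates = find_duplicate_identifiers(rows)
print(duplicates)
```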
I'm using group_by in a DBIx::Class resultset search. The result returned for each group is always the row in the group with the lowest id (i.e. the oldest row in the group). I'm looking for a way to get the row with the highest id (i.e. the newest row in the group) instead.
The problem is fundamentally the same as this:
Retrieving the last record in each group
...except that I'm using DBIx::Class not raw SQL.
To put the question in context:
I have a table of music reviews
review
------
id
artist_id
album_id
pub_date
...other_columns...
There can be multiple reviews for any given artist_id/album_id.
I want the most recent reviews, in descending date order, with no more than one review per artist_id/album_id.
I tried to do this using:
$schema->resultset('Review')->search(
undef,
{
group_by => [ qw/ artist_id album_id / ],
order_by => { -desc => 'pub_date' },
}
);
This nearly works, but returns the oldest review in each group instead of the newest.
How can I get the newest?
For this to work you are relying on broken database behaviour. You should not be able to select columns from a table when you use group by unless they use an aggregate function (min, max etc.) or are specified in the group by clause.
In MySQL, even the manual admits this is wrong - though it supports it.
What I think you need to do is get the latest dates of the reviews, with max(pub_date):
my $dates = $schema->resultset('Review')->search({},
{
select => ['artist_id', 'album_id', {max => 'pub_date'}],
as => [ qw(artist_id album_id recent_pub_date) ],
group_by => [ qw(artist_id album_id) ],
}
);
Then loop through to get the review:
while (my $review_date = $dates->next) {
my $review = $schema->resultset('Review')->search({
artist_id => $review_date->artist_id,
album_id => $review_date->album_id,
pub_date => $review_date->get_column('recent_pub_date'),
})->first;
}
Yes, it's more queries, but it makes sense: if two reviews have the same date, how should the database know which one to return in the select statement?
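For comparison, the same idea can be expressed as a single SQL statement by joining the table to its grouped MAX(pub_date) subquery. The sketch below runs it through Python's sqlite3 module purely so it is self-contained. It is not DBIx::Class code, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE review (id INTEGER PRIMARY KEY, artist_id INTEGER,
                         album_id INTEGER, pub_date TEXT);
    INSERT INTO review VALUES
        (1, 1, 1, '2010-01-01'),
        (2, 1, 1, '2010-03-01'),
        (3, 2, 5, '2010-02-01');
""")

# The subquery finds the latest pub_date per artist/album;
# the join picks the full review rows matching those dates.
latest = conn.execute("""
    SELECT r.id, r.artist_id, r.album_id, r.pub_date
    FROM review r
    JOIN (SELECT artist_id, album_id, MAX(pub_date) AS recent_pub_date
          FROM review
          GROUP BY artist_id, album_id) m
      ON r.artist_id = m.artist_id
     AND r.album_id = m.album_id
     AND r.pub_date = m.recent_pub_date
    ORDER BY r.pub_date DESC
""").fetchall()
print(latest)
```

As with the loop above, two reviews of the same album published on the same date would both match, so a tie-breaker (for example the highest id) is still needed if exactly one row per group is required.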