updating a value in an array in mongodb from java - mongodb-query

I have a couple of documents in MongoDB like the following:
{
"_id" : ObjectId("54901212f315dce7077204af"),
"Date" : ISODate("2014-10-20T04:00:00.000Z"),
"Type" : "Twitter",
"Entities" : [
{
"ID" : 4,
"Name" : "test1",
"Sentiment" : {
"Value" : 20,
"Neutral" : 1
}
},
{
"ID" : 5,
"Name" : "test5",
"Sentiment" : {
"Value" : 10,
"Neutral" : 1
}
}
]
}
Now I want to update the document that has Entities.ID = 4 by replacing that entity's Sentiment.Value with (Sentiment.Value + 4) / 2. In the example above, the value after the update would be (20 + 4) / 2 = 12.
I wrote the following code, but I am stuck at the if statement, as you can see:
DBCollection collectionG;
collectionG = db.getCollection("GraphDataCollection");
int entityID = 4;
String entityName = "test";
BasicDBObject queryingObject = new BasicDBObject();
queryingObject.put("Entities.ID", entityID);
DBCursor cursor = collectionG.find(queryingObject);
if (cursor.hasNext())
{
BasicDBObject existingDocument = new BasicDBObject("Entities.ID", entityID);
//not sure how to update the sentiment.value for entityid=4
}
At first I thought I should unwind the Entities array to get the sentiment value, but if I do that, how can I wind it up again and update the document in its current format with the new sentiment value?
I also found this link:
MongoDB - Update objects in a document's array (nested updating)
but I could not understand it since it is not written as a Java query.
Can anyone explain how I can do this in Java?

You need to do this in two steps:
1. Get the _id of every record that contains an Entity with ID 4. During the find, project only the entity sub-document that matched the query, so that we can read just its Sentiment.Value. Use the positional operator ($) for this purpose.
2. Instead of hitting the database once per matched record, use the Bulk API to queue up the updates and execute them all at the end.
Create the Bulk operation Writer:
BulkWriteOperation bulk = col.initializeUnorderedBulkOperation();
Find all the records that contain the value 4 in their Entities.ID field. A plain query on this condition returns the whole document, but we need only two things: the document's _id, so that we can update the same document later, and the Entities element whose ID is 4. The document may contain any number of other Entities elements, but they do not matter here. To get only the element that matched the query, we project with the positional operator $.
DBObject find = new BasicDBObject("Entities.ID",4);
DBObject project = new BasicDBObject("Entities.$",1);
DBCursor cursor = col.find(find, project);
For example, the code above would return the document below (our example assumes a single input document). Note that it contains only the one Entities element that matched the query, with its current Sentiment.Value of 20.
{
"_id" : ObjectId("54901212f315dce7077204af"),
"Entities" : [
{
"ID" : 4,
"Name" : "test1",
"Sentiment" : {
"Value" : 20,
"Neutral" : 1
}
}
]
}
Iterate over each record and queue its update:
while (cursor.hasNext()) {
    BasicDBObject doc = (BasicDBObject) cursor.next();
    // The projection left exactly one element in Entities: the one that matched.
    BasicDBObject entity = (BasicDBObject) ((BasicDBList) doc.get("Entities")).get(0);
    int curVal = ((BasicDBObject) entity.get("Sentiment")).getInt("Value");
    int updatedValue = (curVal + 4) / 2; // integer division
    // Matching on Entities.ID again lets the positional operator $ locate the element.
    DBObject query = new BasicDBObject("_id", doc.get("_id"))
            .append("Entities.ID", 4);
    DBObject update = new BasicDBObject("$set",
            new BasicDBObject("Entities.$.Sentiment.Value", updatedValue));
    bulk.find(query).update(update);
}
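Finally, execute the queued updates:
bulk.execute();
You need a find() followed by an update(), not a single update, because at the time of writing MongoDB does not let an update reference a document's own field, compute a new value from it, and write that value back in one update query. (MongoDB 4.2 and later accept an aggregation pipeline as the update, which can reference existing fields.)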
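The read, compute, and write-back steps can be tried without a database. The sketch below is only an in-memory stand-in, assuming the sample document from the question: the class name SentimentUpdateSketch and all of its methods are hypothetical, and no MongoDB driver call is involved. It shows the integer arithmetic and that only the matched array element changes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for one document; this is not MongoDB driver code.
final class SentimentUpdateSketch {

    // The formula from the question, using integer division.
    static int averaged(int current, int delta) {
        return (current + delta) / 2;
    }

    // Returns the Sentiment sub-document of the first entity with the given ID, or null.
    @SuppressWarnings("unchecked")
    static Map<String, Object> findSentiment(Map<String, Object> doc, int entityId) {
        for (Map<String, Object> entity : (List<Map<String, Object>>) doc.get("Entities")) {
            if (Integer.valueOf(entityId).equals(entity.get("ID"))) {
                return (Map<String, Object>) entity.get("Sentiment");
            }
        }
        return null;
    }

    // Mirrors what a $set on "Entities.$.Sentiment.Value" addresses:
    // only the matched element is modified. Returns false if nothing matched.
    static boolean updateEntity(Map<String, Object> doc, int entityId, int delta) {
        Map<String, Object> sentiment = findSentiment(doc, entityId);
        if (sentiment == null) {
            return false;
        }
        sentiment.put("Value", averaged((Integer) sentiment.get("Value"), delta));
        return true;
    }

    // Builds the sample document from the question (only the fields used here).
    static Map<String, Object> sampleDocument() {
        List<Map<String, Object>> entities = new ArrayList<>();
        entities.add(entity(4, "test1", 20));
        entities.add(entity(5, "test5", 10));
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("Type", "Twitter");
        doc.put("Entities", entities);
        return doc;
    }

    private static Map<String, Object> entity(int id, String name, int value) {
        Map<String, Object> sentiment = new LinkedHashMap<>();
        sentiment.put("Value", value);
        sentiment.put("Neutral", 1);
        Map<String, Object> e = new LinkedHashMap<>();
        e.put("ID", id);
        e.put("Name", name);
        e.put("Sentiment", sentiment);
        return e;
    }
}
```

Calling updateEntity(sampleDocument(), 4, 4) changes the value for ID 4 from 20 to 12 and leaves the entity with ID 5 untouched.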

Related

Do I need Indexing required in case of less data returned by DynamoDb

Say the DynamoDB table has data in this format:
{
"id":"<id>",
"field-1":"<field-1-value>",
"field-2":"<field-2-value>",
"field-3":"<field-3-value>",
"field-4":"<field-4-value>",
"metadata":{
"subfield-1":"<subfield-1-value>",
"subfield-2":"<subfield-2-value>"
}
}
I have a partition key on the id attribute and a sort key on field-1. Now suppose I need to search by subfield-1 value within a single id. Can that be done easily in DynamoDB without creating an index? There would be at most 70 rows per id, so it looks like a small set of data.
Please let me know your views.
Yes, this can be done without an index. Query by the partition key and use a FilterExpression on metadata.subfield-1 to narrow the results. The filter is applied after the items for that id have been read, so the query still consumes read capacity for all of them, which is negligible for at most 70 items per id.
Example:
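var params = {
    TableName : 'yourTableName',
    KeyConditionExpression : 'id = :idval',
    // Each segment of a nested path needs its own placeholder; a single name
    // containing a dot is read as one top-level attribute "metadata.subfield-1".
    FilterExpression : '#metadata.#subfield1 = :subField1Val',
    ExpressionAttributeNames : {
        '#metadata' : 'metadata',
        '#subfield1' : 'subfield-1'
    },
    ExpressionAttributeValues : {
        ':idval' : '7',
        ':subField1Val' : 'somevalue'
    }
};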
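To see why a filter without an index is acceptable for a small partition, the two phases can be modeled in plain Java. This is a rough sketch under stated assumptions: FilterAfterQuerySketch and its methods are hypothetical names, the table is just a list of maps, and no AWS SDK call is made. The key condition picks the items for one id, and the filter then runs only over those items.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model of "query by key, then filter"; this is not AWS SDK code.
final class FilterAfterQuerySketch {

    // Key condition: selects every item stored under one partition key.
    static List<Map<String, Object>> query(List<Map<String, Object>> table, String id) {
        List<Map<String, Object>> read = new ArrayList<>();
        for (Map<String, Object> item : table) {
            if (id.equals(item.get("id"))) {
                read.add(item);
            }
        }
        return read;
    }

    // Filter: runs over the items already read and keeps those whose
    // metadata.subfield-1 equals the wanted value.
    static List<Map<String, Object>> filter(List<Map<String, Object>> read, String wanted) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> item : read) {
            Object metadata = item.get("metadata");
            if (metadata instanceof Map && wanted.equals(((Map<?, ?>) metadata).get("subfield-1"))) {
                kept.add(item);
            }
        }
        return kept;
    }

    // Hypothetical helper that builds one item with only the attributes used here.
    static Map<String, Object> item(String id, String field1, String subfield1) {
        Map<String, Object> metadata = new HashMap<>();
        metadata.put("subfield-1", subfield1);
        Map<String, Object> item = new HashMap<>();
        item.put("id", id);
        item.put("field-1", field1);
        item.put("metadata", metadata);
        return item;
    }
}
```

The number of items returned by query() is what the read cost depends on; filter() only reduces what is handed back to the caller.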

query for Time Stamp in mongo [duplicate]

I have a problem when querying mongoDB with nested objects notation:
db.messages.find( { headers : { From: "reservations@marriott.com" } } ).count()
0
db.messages.find( { 'headers.From': "reservations@marriott.com" } ).count()
5
I can't see what I am doing wrong. I am expecting nested object notation to return the same result as the dot notation query. Where am I wrong?
db.messages.find( { headers : { From: "reservations@marriott.com" } } )
This queries for documents where headers equals { From: ... }, i.e. contains no other fields.
db.messages.find( { 'headers.From': "reservations@marriott.com" } )
This only looks at the headers.From field, not affected by other fields contained in, or missing from, headers.
Dot-notation docs
Since there is a lot of confusion about querying a MongoDB collection with sub-documents, I thought it worth explaining the above answers with examples.
First, I inserted just two objects into the messages collection:
> db.messages.find().pretty()
{
"_id" : ObjectId("5cce8e417d2e7b3fe9c93c32"),
"headers" : {
"From" : "reservations@marriott.com"
}
}
{
"_id" : ObjectId("5cce8eb97d2e7b3fe9c93c33"),
"headers" : {
"From" : "reservations@marriott.com",
"To" : "kprasad.iitd@gmail.com"
}
}
>
So what is the result of the query db.messages.find({headers: {From: "reservations@marriott.com"} }).count()?
It should be one, because this queries for documents where headers equals exactly the object {From: "reservations@marriott.com"}, i.e. contains no other fields. In other words, we have to specify the entire sub-document as the value of the field.
As the answer from @Edmondo1984 puts it:
Equality matches within sub-documents select documents if the subdocument matches exactly the specified sub-document, including the field order.
Given that statement, what should the result of the query below be?
> db.messages.find({headers: {To: "kprasad.iitd@gmail.com", From: "reservations@marriott.com"} }).count()
0
And what if we change the order of From and To so that it is the same as in the sub-document of the second document?
> db.messages.find({headers: {From: "reservations@marriott.com", To: "kprasad.iitd@gmail.com"} }).count()
1
So it matches exactly the specified sub-document, including the field order.
The dot operator is, I think, clear to everyone. Let's see the result of the query below:
> db.messages.find( { 'headers.From': "reservations@marriott.com" } ).count()
2
I hope these explanations and examples make find queries on sub-documents clearer.
The two query mechanisms work in different ways, as described in the docs in the section Subdocuments:
When the field holds an embedded document (i.e, subdocument), you can either specify the entire subdocument as the value of a field, or “reach into” the subdocument using dot notation, to specify values for individual fields in the subdocument:
Equality matches within subdocuments select documents if the subdocument matches exactly the specified subdocument, including the field order.
In the following example, the query matches all documents where the value of the field producer is a subdocument that contains only the field company with the value 'ABC123' and the field address with the value '123 Street', in the exact order:
db.inventory.find( {
producer: {
company: 'ABC123',
address: '123 Street'
}
});
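The difference between the two matching rules can be modeled in plain Java. The sketch below is an illustration only, not how the server is implemented: SubdocumentMatchSketch and its methods are hypothetical names, and sub-documents are represented as insertion-ordered maps. Because Map.equals ignores order, the exact match compares the entry lists instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// In-memory model of the two matching rules; this is not MongoDB driver code.
final class SubdocumentMatchSketch {

    // Whole-subdocument equality: same fields, same values, same order.
    // Map.equals ignores order, so the ordered entry lists are compared instead.
    static boolean matchesExactly(Map<String, String> stored, Map<String, String> query) {
        return new ArrayList<>(stored.entrySet()).equals(new ArrayList<>(query.entrySet()));
    }

    // Dot notation: only the named field is inspected; other fields are ignored.
    static boolean matchesField(Map<String, String> stored, String field, String value) {
        return value.equals(stored.get(field));
    }

    // Builds an insertion-ordered map from alternating keys and values.
    static Map<String, String> ordered(String... keysAndValues) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keysAndValues.length; i += 2) {
            m.put(keysAndValues[i], keysAndValues[i + 1]);
        }
        return m;
    }
}
```

With a stored sub-document ordered("From", "a", "To", "b"), matchesExactly succeeds only for a query with the same two fields in the same order, while matchesField(stored, "From", "a") succeeds regardless of what else the sub-document holds.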

RavenDB Get document count after BulkInsertOperations

I am using RavenDB to bulk load some documents. Is there a way to get the count of documents loaded into the database?
For insert operations I am doing:
BulkInsertOperation _bulk = docStore.BulkInsert(null,
new BulkInsertOptions{ CheckForUpdates = true});
foreach(MyDocument myDoc in docCollection)
_bulk.Store(myDoc);
_bulk.Dispose();
And right after that I call the following:
session.Query<MyDocument>().Count();
but I always get a number which is less than the count I see in raven studio.
By default, the query you are running is limited to a sane number of results, as part of RavenDB's promise to be safe by default and not stream back millions of records.
In order to get the number of documents of a specific type in your database, you need a special map-reduce index whose job it is to track the counts for each document type. Because this type of index deals directly with document metadata, it's easier to define it in Raven Studio than to create it in code.
The source for that index is in this question but I'll copy it here:
// Index Name: Raven/DocumentCollections
// Map Query
from doc in docs
let Name = doc["@metadata"]["Raven-Entity-Name"]
where Name != null
select new { Name , Count = 1}
// Reduce Query
from result in results
group result by result.Name into g
select new { Name = g.Key, Count = g.Sum(x=>x.Count) }
Then to access it in your code you would need a class that mimics the structure of the anonymous type created by both the Map and Reduce queries:
public class Collection
{
public string Name { get; set; }
public int Count { get; set; }
}
Then, as Ayende notes in the answer to the previously linked question, you can get results from the index like this:
session.Query<Collection>("Raven/DocumentCollections")
.Where(x => x.Name == "MyDocument")
.FirstOrDefault();
Keep in mind, however, that indexes are updated asynchronously so after bulk-inserting a bunch of documents, the index may be stale. You can force it to wait by adding .Customize(x => x.WaitForNonStaleResults()) right after the .Query(...).
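What the map and reduce queries of the Raven/DocumentCollections index compute can be mimicked in memory. The sketch below is only an analogy in plain Java, assuming each document is represented by its Raven-Entity-Name (or null when it has none); CollectionCountSketch is a hypothetical name and nothing here calls RavenDB.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory analogy of the map-reduce index; this is not RavenDB client code.
final class CollectionCountSketch {

    // Map step: emit (name, 1) for every document that has an entity name.
    // Reduce step: group by name and sum the counts.
    static Map<String, Integer> countByEntityName(List<String> entityNames) {
        return entityNames.stream()
                .filter(name -> name != null)               // where Name != null
                .collect(Collectors.groupingBy(
                        name -> name,                        // group result by result.Name
                        TreeMap::new,
                        Collectors.summingInt(name -> 1)));  // Count = g.Sum(x => x.Count)
    }
}
```

Looking up "MyDocument" in the returned map corresponds to the .Where(x => x.Name == "MyDocument") query against the index.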
Raven Studio actually gets this data from the index Raven/DocumentsByEntityName which exists for every database, by sidestepping normal queries and getting metadata on the index. You can emulate that like this:
QueryResult result = docStore.DatabaseCommands.Query("Raven/DocumentsByEntityName",
new Raven.Abstractions.Data.IndexQuery
{
Query = "Tag:MyDocument",
PageSize = 0
},
includes: null,
metadataOnly: true);
var totalDocsOfType = result.TotalResults;
That QueryResult contains a lot of useful data:
{
Results: [ ],
Includes: [ ],
IsStale: false,
IndexTimestamp: "2013-11-08T15:51:25.6463491Z",
TotalResults: 3,
SkippedResults: 0,
IndexName: "Raven/DocumentsByEntityName",
IndexEtag: "01000000-0000-0040-0000-00000000000B",
ResultEtag: "BA222B85-627A-FABE-DC7C-3CBC968124DE",
Highlightings: { },
NonAuthoritativeInformation: false,
LastQueryTime: "2014-02-06T18:12:56.1990451Z",
DurationMilliseconds: 1
}
A lot of that is the same data you get on any query if you request statistics, like this:
RavenQueryStatistics stats;
Session.Query<Course>()
.Statistics(out stats)
// Rest of query

facets with ravendb

I am trying to work with the facet feature in RavenDB but I am getting strange results.
I have documents like:
{
"SearchableModel": "42LC2RR ",
"ModelName": "42LC2RR",
"ModelID": 490578,
"Name": "LG 42 Television 42LC2RR",
"Desctription": "fffff",
"Image": "1/4/9/8/18278941c",
"MinPrice": 9400.0,
"MaxPrice": 9400.0,
"StoreAmounts": 1,
"AuctionAmounts": 0,
"Popolarity": 3,
"ViewScore": 0.0,
"ReviewAmount": 2,
"ReviewScore": 45,
"Sog": "E-TV",
"SogID": 1,
"IsModel": true,
"Manufacrurer": "LG",
"ParamsList": [
"1994267",
"46570",
"4134",
"4132",
"4118",
"46566",
"4110",
"180676",
"239517",
"750771",
"2658507",
"2658498",
"46627",
"4136",
"169941",
"169846",
"145620",
"169940",
"141416",
"3190767",
"3190768",
"144720",
"2300706",
"4093",
"4009",
"1418470",
"179766",
"190025",
"170557",
"170189",
"43768",
"4138",
"67976",
"239516",
"3190771",
"141195"
],
}
where each entry in ParamsList represents a property of the product; our application caches what each param represents.
When searching for a specific product, I would like to count all the returned attributes so that I can show the number of items next to each one after the search.
After searching for lg in the televisions category I want to get:
Param: 4134, which represents LCD, and the amount: 65.
Unfortunately I am getting strange results: only some params are counted and some are not.
On some searches where I do get results back, I don't get any amounts back.
I am using the latest stable version of RavenDB.
index :
from doc in docs
from param in doc.ParamsList
select new {Name=doc.Name,Description=doc.Description,SearchNotVisible = doc.SearchNotVisible,SogID=doc.SogID,Param =param}
facet :
DocumentStore documentStore = new DocumentStore { ConnectionStringName = "Server" };
documentStore.Initialize();
using (IDocumentSession session = documentStore.OpenSession())
{
List<Facet> _facets = new List<Facet>
{
new Facet {Name = "Param"}
};
session.Store(new FacetSetup { Id = "facets/Params", Facets = _facets });
session.SaveChanges();
}
usage example :
IDictionary<string, IEnumerable<FacetValue>> facets = session.Advanced.DatabaseCommands.GetFacets("FullIndexParams", new IndexQuery { Query = "Name:lg" }, "facets/Params");
I tried many variations without success.
Does anyone have an idea what I am doing wrong?
Thanks
Use this index, it should resolve your problem:
from doc in docs
select new {Name=doc.Name,Description=doc.Description,SearchNotVisible = doc.SearchNotVisible,SogID=doc.SogID,Param = doc.ParamsList}
Which analyzer did you set for the "Name" field? I see you search by Name "lg". By default, RavenDB uses a keyword analyzer (LowerCaseKeywordAnalyzer), which indexes the whole value as a single term, so you must search by the exact name. You should set another analyzer for the Name or Description field (StandardAnalyzer, for example).
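As a cross-check for the numbers a facet on Param should return, the expected counts can be computed by hand from the products that match the search. The sketch below is plain Java, not RavenDB code: FacetCountSketch is a hypothetical name, each product is represented only by its ParamsList, and it assumes a product is counted at most once per parameter.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hand computation of expected facet counts; this is not RavenDB client code.
final class FacetCountSketch {

    // For each parameter ID, counts how many of the matching products carry it.
    static Map<String, Integer> countParams(List<List<String>> matchingProductsParams) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> params : matchingProductsParams) {
            // De-duplicate so a product contributes at most 1 per parameter (assumption).
            for (String param : new LinkedHashSet<>(params)) {
                counts.merge(param, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Comparing this map with the FacetValue results for the same query shows which parameters the index is failing to count.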

Two "id" fields in one MongoDB collection with Rails 3?

I've got a Rails 3.0.9 project using the latest version of MongoDB and Mongoid 2.2.
I imported a CSV with an "id" field into a MongoDB collection named College, resulting in a collection like so:
{ "_id" : ObjectID("abc123"), "id" : ######, ... }
Observations:
The show action results in a URL utilizing the ObjectID
Displaying 'college.id' in index.html.erb displays the ObjectID
Questions:
How do I use the original "id" field as the parameter?
Is "id" reserved by MongoDB, meaning I need to rename the "id" field in the College collection (perhaps to "code")? If so, how?
Thanks!
Update
Answer:
db.colleges.update( { "name" : { $exists : true } } , { $rename : { "id" : "code" } }, false, true )
I used "name" since that was a field I could check for the existence.
_id is a reserved and required property in MongoDB. I think Mongoid is mapping id to _id, since that makes sense. There might be a way to access the id property through Mongoid, but I think you are much better off renaming the id field to something else to avoid confusion in the future.
{ $rename : { old_field_name : new_field_name } }
will rename the field name in a document (mongo 1.7.2+).
so
db.college.update({ "_id" : { $exists : true }}, { $rename : { 'id' : 'code' } }, false, true);
should update every record in that collection and rename the id field to code.
(Obviously, test this before running it on any important data.)