pymongo upgrade to 3.0.3 causes this statement to fail

This statement on pymongo version 2.7.2 works just fine:
allFieldsArray = list(fields.find({"persist": "True"}, fields={'name': 1, 'list_name': 1, '_id': 0}))
When I upgraded to 3.0.3, I get this:
TypeError: __init__() got an unexpected keyword argument 'fields'

You need to pass projection instead of fields.
allFieldsArray = list(fields.find(
    {"persist": "True"},
    projection={'name': 1, 'list_name': 1, '_id': 0}))
See the docs.
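For what it's worth, PyMongo's find() also accepts the projection as the second positional argument, so this equivalent spelling works too:
allFieldsArray = list(fields.find(
    {"persist": "True"},
    {'name': 1, 'list_name': 1, '_id': 0}))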

Proper way to convert Data type of a field in MongoDB

Possible duplicate of: How to change the type of a field?
I am new to MongoDB and I am facing a problem while converting the data type of a field value to another data type.
Below is an example of my document
[
    {
        "Name of Restaurant": "Briyani Center",
        "Address": " 336 & 338, Main Road",
        "Location": "XYZQWE",
        "PriceFor2": "500.0",
        "Dining Rating": "4.3",
        "Dining Rating Count": "1500"
    },
    {
        "Name of Restaurant": "Veggie Conner",
        "Address": " New 14, Old 11/3Q, Railway Station Road",
        "Location": "ABCDEF",
        "PriceFor2": "1000.0",
        "Dining Rating": "4.4"
    }
]
Like the above, I have 12k documents. Notice that the data type of PriceFor2 is a string; I would like to convert it to an integer data type.
I have referred to many amazing answers under the linked question, but when I try to run the query I get a ".save() is not a function" error. Please advise what the problem is.
Below is the code I used:
db.chennaiData.find().forEach(function(x) {
    x.priceFor2 = new NumberInt(x.priceFor2);
    db.chennaiData.save(x);
});
This is the error I am getting..
TypeError: db.chennaiData.save is not a function
From MongoDB's save documentation:
Starting in MongoDB 4.2, the db.collection.save() method is deprecated. Use db.collection.insertOne() or db.collection.replaceOne() instead.
You are likely running MongoDB 4.2+, so the save function is no longer available. Consider migrating to insertOne and replaceOne as suggested.
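For example, a minimal mongosh sketch of the replaceOne route (a sketch, not tested against your data; note the sample documents use the capitalized field name PriceFor2, and parseFloat is used because values like "500.0" are not plain integers):
db.chennaiData.find().forEach(function (x) {
    x.PriceFor2 = parseFloat(x.PriceFor2);        // "500.0" -> 500.0
    db.chennaiData.replaceOne({ _id: x._id }, x); // replace the whole document by _id
});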
For your specific scenario, it is actually preferable to do this with a single update, as mentioned in another SO answer. That way there is only one db call, while your approach fetches all documents in the collection to the application level and performs n db calls to save them back.
db.collection.update({},
    [
        { $set: { PriceFor2: { $toDouble: "$PriceFor2" } } }
    ],
    { multi: true })
Mongo Playground
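If you are driving this from Python instead of the shell, a minimal PyMongo sketch of the same pipeline update would look like this (the connection and database name are assumptions; update_many with an aggregation pipeline requires MongoDB 4.2+):
from pymongo import MongoClient

client = MongoClient()                 # assumed local connection
coll = client["mydb"]["chennaiData"]   # assumed database name

# one server-side update over every document, no client-side fetch
coll.update_many(
    {},
    [{"$set": {"PriceFor2": {"$toDouble": "$PriceFor2"}}}])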

JavaScript toString() method is not working for JSON in karate v 1.1.0

In karate version 0.9.6 the following code was working fine
* def response = { "firstName": "John", "lastName" : "doe", "age" : 26, "address" : { "streetAddress": "applewood", "city" : "Nara", "postalCode" : "630-0192" } }
* match response.toString() contains 'applewood'
But in karate version 1.1.0, the assertion fails with:
match failed: CONTAINS
$ | actual does not contain expected
(STRING:STRING)
'[object Object]'
'applewood'
classpath:...some classpath
I even printed response.toString() and it prints [object Object].
Is there any change in JS function support in karate feature files in v1.1.0?
You can try this change:
* match karate.toString(response) contains 'applewood'
A couple more points:
I consider what you are doing bad practice. It is better to narrow down the match to a single field (see the example after these points).
Also be aware of "type conversion": https://github.com/karatelabs/karate#type-conversion
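For instance, a narrowed-down match against the sample response above could look like this (my sketch, not part of the original answer):
* match response.address.streetAddress == 'applewood'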

PyMongo Aggregation "AttributeError: 'dict' object has no attribute '_txn_read_preference'"

I'm sure there is an error in my code since I'm a newbie to PyMongo, but I'll give it a go. There are 167k+ documents in MongoDB, shaped as follows:
{'overall': 5.0,
 'reviewText': {'ago': 1,
                'buy': 2,
                'daughter': 1,
                'holiday': 1,
                'love': 2,
                'niece': 1,
                'one': 2,
                'still': 1,
                'today': 1,
                'use': 1,
                'year': 1},
 'reviewerName': 'dcrm'}
I would like to get a tally of terms used within that reviewText field for all 5.0 ratings. I have run the following code and I get the error that follows. Any insight?
#1 Find the top 20 most common words found in 1-star reviews.
aggr = [{"$unwind": "$reviewText"},
{"$group": { "_id": "$reviewText", "word_freq": {"$sum":1}}},
{"$sort": {"word_freq": -1}},
{"$limit": 20},
{"$project": {"overall":"$overall", "word_freq":1}}]
disk_use = { 'allowDiskUse': True }
findings = list(collection.aggregate(aggr, disk_use))
for item in findings:
p(item)
As you can see, I came across the allowDiskUse option since I seemed to exceed the 100MB threshold. But the error that I get is:
AttributeError: 'dict' object has no attribute '_txn_read_preference'
You are quite close: allowDiskUse is a named parameter, not a dictionary, so the statement should look like this:
findings = list(collection.aggregate(aggr, allowDiskUse=True))
or
findings = list(collection.aggregate(aggr, **disk_use))
ManishSingh's response is the best, but if you don't see exactly what he means and why you are getting this error, I can clarify what you have and why it is not correct:
The problem is that you are passing "allowDiskUse" inside a dict, like this:
findings = list(collection.aggregate(aggr, {"allowDiskUse": True})) # wrong
but the correct form is this:
findings = list(collection.aggregate(aggr, allowDiskUse=True)) # correct
The error message itself is a clue: in recent PyMongo versions the second positional parameter of aggregate() is session, so the dict was being treated as a ClientSession, and PyMongo failed when it tried to read _txn_read_preference from it.
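Putting it together, a minimal self-contained sketch (the connection, database, and collection names are assumptions):
from pymongo import MongoClient

collection = MongoClient()["mydb"]["reviews"]  # assumed names

aggr = [{"$unwind": "$reviewText"},
        {"$group": {"_id": "$reviewText", "word_freq": {"$sum": 1}}},
        {"$sort": {"word_freq": -1}},
        {"$limit": 20}]

# allowDiskUse goes in as a keyword argument, not a positional dict
findings = list(collection.aggregate(aggr, allowDiskUse=True))
for item in findings:
    print(item)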

AWS boto3 page_iterator.search can't compare datetime.datetime to str

I am trying to capture delta files (files created after the last processing run) sitting on S3. To do that I use a boto3 filtered iterator, querying the LastModified value rather than returning the full list of files and filtering on the client side.
According to http://jmespath.org/?, the query below is valid and filters the following JSON response:
filtered_iterator = page_iterator.search(
    "Contents[?LastModified>='datetime.datetime(2016, 12, 27, 8, 5, 37, tzinfo=tzutc())'].Key")
for key_data in filtered_iterator:
    print(key_data)
However, it fails with:
RuntimeError: xxxxxxx has failed: can't compare datetime.datetime to str
Sample paginator response:
{
    "Contents": [{
        "LastModified": "datetime.datetime(2016, 12, 28, 8, 5, 31, tzinfo=tzutc())",
        "ETag": "1022dad2540da33c35aba123476a4622",
        "StorageClass": "STANDARD",
        "Key": "blah1/blah11/abc.json",
        "Owner": {
            "DisplayName": "App-AWS",
            "ID": "bfc77ae78cf43fd1b19f24f99998cb86d6fd8220dbfce0ce6a98776253646656"
        },
        "Size": 623
    }, {
        "LastModified": "datetime.datetime(2016, 12, 28, 8, 5, 37, tzinfo=tzutc())",
        "ETag": "1022dad2540da33c35abacd376a44444",
        "StorageClass": "STANDARD",
        "Key": "blah2/blah22/xyz.json",
        "Owner": {
            "DisplayName": "App-AWS",
            "ID": "bfc77ae78cf43fd1b19f24f99998cb86d6fd8220dbfce0ce6a81234e632c5a8c"
        },
        "Size": 702
    }]
}
Boto3's JMESPath implementation does not support filtering on dates (it will mark them as the incompatible types "unicode" and "datetime" in your example). But because of the way Amazon serializes dates, you can perform a lexicographical comparison on them using JMESPath's to_string() function.
Something like this:
"Contents[?to_string(LastModified)>='\"2015-01-01 01:01:01+00:00\"']"
Keep in mind that this is a lexicographical comparison, not a date comparison, but it works most of the time.
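A runnable sketch of that approach, assuming a hypothetical bucket named 'mytestbucket':
import boto3

s3 = boto3.client("s3")
paginator = s3.get_paginator("list_objects")
page_iterator = paginator.paginate(Bucket="mytestbucket")

# lexicographical comparison on the stringified timestamp
filtered_iterator = page_iterator.search(
    "Contents[?to_string(LastModified)>='\"2016-12-27 08:05:37+00:00\"'].Key")
for key_data in filtered_iterator:
    print(key_data)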
After spending a few minutes with the boto3 paginator documentation, I realized it is actually a syntax problem, which I had overlooked by reading the value as a string.
The quote that embraces the comparison value on the right is a backquote/backtick, symbol [ ` ]. You cannot use a single quote [ ' ] for the comparison values/objects.
After inspecting the JMESPath examples, I noticed they use backquotes for comparison values, so the boto3 paginator implementation does indeed comply with the JMESPath standard.
Here is the code I ran without error, using the backquote:
import boto3

s3 = boto3.client("s3")
s3_paginator = s3.get_paginator('list_objects')
s3_iterator = s3_paginator.paginate(Bucket='mytestbucket')
filtered_iterator = s3_iterator.search(
    "Contents[?LastModified >= `datetime.datetime(2016, 12, 27, 8, 5, 37, tzinfo=tzutc())`].Key"
)
for key_data in filtered_iterator:
    print(key_data)

Is it possible to turn an array returned by the Mongo GeoNear command (using Ruby/Rails) into a Plucky object?

As a total newbie I have been trying to get the geoNear command working in my Rails application, and it appears to be working fine. The major annoyance for me is that it returns a hash of strings rather than keys I can call on to pull out data.
Having dug around, I understand that MongoMapper uses Plucky to turn the query result into a friendly object which can be handled easily, but I haven't been able to find out how to transform the result of my geoNear query into a Plucky object.
My questions are:
(a) Is it possible to turn this into a plucky object and how do i do that?
(b) If it is not possible how can I most simply and systematically extract each record and each field?
Here is the query in my controller:
@mult = 3963 * (3.14159265 / 180) # Scale to miles on earth
@results = @db.command({'geoNear' => "places", 'near' => @search.coordinates, 'distanceMultiplier' => @mult, 'spherical' => true})
Here is the object I'm getting back (with document content removed for simplicity):
{"ns"=>"myapp-development.places", "near"=>"1001110101110101100100110001100010100010000010111010", "results"=>[{"dis"=>0.04356444023196527, "obj"=>{"_id"=>BSON::ObjectId('4ee6a7d210a81f05fe000001'),...}}], "stats"=>{"time"=>0, "btreelocs"=>0, "nscanned"=>1, "objectsLoaded"=>1, "avgDistance"=>0.04356444023196527, "maxDistance"=>0.0006301239824196907}, "ok"=>1.0}
Help is much appreciated!!
OK, so let's say you store the results in a variable called places_near:
places_near = t.command( {'geoNear' => "places", 'near'=> [50,50] , 'distanceMultiplier' => 1, 'spherical' => true})
This command returns a hash that has a key, results, which maps to a list of results for the query. The returned document looks like this:
{
    "ns": "test.places",
    "near": "1100110000001111110000001111110000001111110000001111",
    "results": [
        {
            "dis": 69.29646421910687,
            "obj": {
                "_id": ObjectId("4b8bd6b93b83c574d8760280"),
                "y": [1, 1],
                "category": "Coffee"
            }
        },
        {
            "dis": 69.29646421910687,
            "obj": {
                "_id": ObjectId("4b8bd6b03b83c574d876027f"),
                "y": [1, 1]
            }
        }
    ],
    "stats": {
        "time": 0,
        "btreelocs": 1,
        "nscanned": 2,
        "objectsLoaded": 2,
        "avgDistance": 69.29646421910687
    },
    "ok": 1
}
To iterate over the results, just iterate as you would over any list in Ruby:
places_near['results'].each do |result|
  # do stuff with the result hash
end
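For example, a small sketch of pulling individual fields out of each result, using the field names from the sample response above:
places_near['results'].each do |result|
  distance = result['dis']  # computed distance to the point
  doc = result['obj']       # the matched document, a plain hash
  puts "#{doc['_id']}: #{distance}"
end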