Pymongo: Best way to remove $oid in Response - pymongo

I started using Pymongo recently, and now I want to find the best way to remove $oid from the response.
When I use find_one:
result = db.nodes.find_one({ "name": "Archer" })
And get the response:
json.loads(dumps(result))
The result would be:
{
"_id": {
"$oid": "5e7511c45cb29ef48b8cfcff"
},
"about": "A jazz pianist falls for an aspiring actress in Los Angeles."
}
My expected:
{
"_id": "5e7511c45cb29ef48b8cfcff",
"about": "A jazz pianist falls for an aspiring actress in Los Angeles."
}
As you can see, we can use:
resp = json.loads(dumps(result))
resp['_id'] = resp['_id']['$oid']
But I don't think this is the best way. I hope you have a better solution.

You can take advantage of aggregation:
result = db.nodes.aggregate([{'$match': {"name": "Archer"}},
                             {'$addFields': {"Id": {'$toString': '$_id'}}},
                             {'$project': {'_id': 0}}])
data = json.dumps(list(result))
Here, with $addFields I add a new field Id that holds the string value of _id (the $toString operator needs MongoDB 4.0 or later). Then I make a projection that removes the _id field from the result. Finally, since aggregate returns a cursor, I turn it into a list.
It may not work exactly as you hope, but the general idea is there.

First of all, there's no $oid in the response. What you are seeing is the Python driver representing the _id field as an ObjectId instance, and then the dumps() method rendering that ObjectId in a string format. The $oid bit is just there to tell you the field is an ObjectId, should you need that later.
The next part of the answer depends on what exactly you are trying to achieve. Almost certainly you can achieve it using the result object without converting it to JSON.
If you just want to get rid of it altogether, you can do:
result = db.nodes.find_one({ "name": "Archer" }, {'_id': 0})
print(result)
which gives:
{"name": "Archer"}

import re
from bson.json_util import dumps

# Matches {"$oid": "<id>"} and captures the quoted id.
OID_PATTERN = re.compile(r'\{\s*"\$oid":\s*("[a-z0-9]+")\s*\}')

def remove_oid(string):
    # Replace every {"$oid": "..."} wrapper with the bare quoted id.
    return OID_PATTERN.sub(r'\1', string)

string = dumps(mongo_query_result)
string = remove_oid(string)
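As an alternative to editing the JSON string with a regex, the same unwrapping can be done on the parsed structure. The following is only a sketch: unwrap_oid is a hypothetical helper name, and the sample string stands in for what json.loads(dumps(result)) returns in the question.

```python
import json


def unwrap_oid(value):
    """Recursively replace {"$oid": "<id>"} wrappers with the bare id string."""
    if isinstance(value, dict):
        # A dict whose only key is "$oid" is the extended-JSON form of an ObjectId.
        if set(value) == {"$oid"}:
            return value["$oid"]
        return {key: unwrap_oid(item) for key, item in value.items()}
    if isinstance(value, list):
        return [unwrap_oid(item) for item in value]
    return value


# Stand-in for json.loads(dumps(result)) from the question.
parsed = json.loads(
    '{"_id": {"$oid": "5e7511c45cb29ef48b8cfcff"}, '
    '"about": "A jazz pianist falls for an aspiring actress in Los Angeles."}'
)
cleaned = unwrap_oid(parsed)
print(cleaned["_id"])
```

Because it walks nested dicts and lists, this also handles ObjectId references stored deeper in the document, not only the top-level _id.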

I am using a custom handler. I managed to remove $oid and replace it with just the id string:
import datetime
import json
import bson

# Custom handler
def my_handler(x):
    if isinstance(x, datetime.datetime):
        return x.isoformat()
    elif isinstance(x, bson.objectid.ObjectId):
        return str(x)
    else:
        raise TypeError(x)

# Parsing
def parse_json(data):
    return json.loads(json.dumps(data, default=my_handler))

result = db.nodes.aggregate([{'$match': {"name": "Archer"}},
                             {'$addFields': {"_id": '$_id'}},
                             {'$project': {'_id': 0}}])
# aggregate returns a cursor, so turn it into a list before serializing
data = parse_json(list(result))

In the second argument of find_one, you can define which fields to exclude, in the following way:
site_information = mongo.db.sites.find_one({'username': username}, {'_id': False})
This statement excludes the '_id' field from the returned document.

Related

Proper way to convert Data type of a field in MongoDB

Possible duplicate of How to change the type of a field?
I am new to MongoDB and I am having a problem converting the data type of a field value to another data type.
Below is an example of my document
[
{
"Name of Restaurant": "Briyani Center",
"Address": " 336 & 338, Main Road",
"Location": "XYZQWE",
"PriceFor2": "500.0",
"Dining Rating": "4.3",
"Dining Rating Count": "1500",
},
{
"Name of Restaurant": "Veggie Conner",
"Address": " New 14, Old 11/3Q, Railway Station Road",
"Location": "ABCDEF",
"PriceFor2": "1000.0",
"Dining Rating": "4.4",
}]
I have 12k documents like the ones above. Notice that the data type of PriceFor2 is a string. I would like to convert it to an integer.
I have gone through the many great answers in the question linked above, but when I try to run the query, I get a ".save() is not a function" error. Please advise what the problem is.
Below is the code I used
db.chennaiData.find().forEach( function(x){ x.priceFor2= new NumberInt(x.priceFor2);
db.chennaiData.save(x);
db.chennaiData.save(x);});
This is the error I am getting..
TypeError: db.chennaiData.save is not a function
From MongoDB's save documentation:
Starting in MongoDB 4.2, the db.collection.save() method is deprecated. Use db.collection.insertOne() or db.collection.replaceOne() instead.
You are likely running MongoDB 4.2 or later, so the save function is no longer available. Consider migrating to insertOne and replaceOne as suggested.
For your specific scenario, a single update is actually preferable, as mentioned in another SO answer. It makes only one db call, whereas your approach fetches all documents in the collection to the application level and then performs n db calls to save them back.
db.collection.update({},
[
  {
    $set: {
      PriceFor2: {
        $toDouble: "$PriceFor2"
      }
    }
  }
],
{
  multi: true
})
Mongo Playground
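For readers working from Python, the same single-call update can be sketched with PyMongo. This assumes `collection` is a pymongo Collection connected to MongoDB 4.2 or later and a PyMongo version that accepts an aggregation pipeline as the update argument; convert_prices and PRICE_TO_DOUBLE are hypothetical names chosen for illustration.

```python
# Pipeline-style update with a single stage that overwrites PriceFor2
# with its value converted to a double.
PRICE_TO_DOUBLE = [{"$set": {"PriceFor2": {"$toDouble": "$PriceFor2"}}}]


def convert_prices(collection):
    """Convert PriceFor2 on every document, server-side, in one call.

    `collection` is assumed to be a pymongo Collection; the empty filter {}
    matches all documents, like the shell example with multi: true.
    """
    result = collection.update_many({}, PRICE_TO_DOUBLE)
    return result.modified_count
```

A call might look like convert_prices(db.chennaiData), where db is an existing pymongo database handle. Note that $toDouble is used rather than an integer conversion because the stored strings look like "500.0".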

Lua script access object path inside an array

I'm trying to access the object with property optionId = 'a386ead3-08ca-486e-aeb1-23add87292e7' to set its weight.
my object is like following:
"weight": {
  "options": [
    {
      "optionId": "a386ead3-08ca-486e-aeb1-23add87292e7",
      "weight": 10
    },
    {
      "optionId": "a386ead3-08ca-486e-aeb1-23add87292e7",
      "weight": 20
    }
  ],
  "value": 100
}
and I'm using the following function to get its path, but with no luck:
local GetFieldOptionWeightPath = function (optionId)
  return "$.weight.options[\"optionId\"==\""..optionId.."\"]";
end
You need to compare each optionId like you would programmatically. To do that, you can use a filter script expression:
$.weight.options[?(@.optionId=="a386ead3-08ca-486e-aeb1-23add87292e7")]
Here, ?() is the filter and @ points to each of the elements being filtered.
Note that since it's a filter, it may yield multiple results. In fact, in your example it will yield both entries, as they have the same optionId.
In the end, your Lua function generating the path can look like:
local GetFieldOptionWeightPath = function (optionId)
  return ("$.weight.options[?(@.optionId==%q)]"):format(optionId)
end
This answer assumes JSONPath support which was implemented in RedisJSON v2 (late 2021).
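To illustrate what such a filter expression selects, here is a small Python sketch. It builds the same kind of path string (standard JSONPath uses @ for the current element) and mimics the filter with a list comprehension on the sample data. option_weight_path and matching_options are hypothetical helper names, and no Redis call is made.

```python
import json


def option_weight_path(option_id):
    """Build a JSONPath filter that selects options by optionId."""
    # json.dumps adds the surrounding double quotes and escapes embedded ones.
    return "$.weight.options[?(@.optionId==%s)]" % json.dumps(option_id)


def matching_options(document, option_id):
    """Plain-Python equivalent of the filter: keep every option whose id matches."""
    return [opt for opt in document["weight"]["options"] if opt["optionId"] == option_id]


document = {
    "weight": {
        "options": [
            {"optionId": "a386ead3-08ca-486e-aeb1-23add87292e7", "weight": 10},
            {"optionId": "a386ead3-08ca-486e-aeb1-23add87292e7", "weight": 20},
        ],
        "value": 100,
    }
}

path = option_weight_path("a386ead3-08ca-486e-aeb1-23add87292e7")
matches = matching_options(document, "a386ead3-08ca-486e-aeb1-23add87292e7")
print(path)
print(len(matches))
```

Both entries match here because the sample data repeats the same optionId, which is why a filter can return more than one element.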

Django rest framework: Is there a way to clean data before validating it with a serializer?

I've got an API endpoint POST /data.
The received data is formatted in a certain way which is different from the way I store it in the db.
I'll use geometry type from postgis as an example.
class MyPostgisModel(models.Model):
    ...
    position = models.PointField(null=True)
    my_charfield = models.CharField(max_length=10)
    ...
    errors = JSONField()  # Used to save the cleaning and validation errors

class MyPostgisSerializer(serializers.ModelSerializer):
    class Meta:
        model = MyPostgisModel
        fields = [
            ...
            "position",
            ...
            "my_charfield",
            "errors",
        ]

    def to_internal_value(self, data):
        ...
        # Here the data is coming in the field geometry but in the db, it's called
        # position. Moreover I need to apply the `GEOSGeometry(json.dumps(...))`
        # method as well.
        data["position"] = GEOSGeometry(json.dumps(data["geometry"]))
        return data
The problem is that there is not just one field like position, but many. I would like (maybe wrongly) to follow the validate_*field_name* scheme, but for cleaning (clean_*field_name*).
There is another problem. In this scheme, I would still like to save the rest of the data in the database even if some fields have raised a ValidationError (e.g. a CharField that is too long), as long as they are not part of the primary key or a unique_together constraint. The related errors would be saved into a JSONField like this:
{
    "cleaning_errors": {
        ...
        "position": 'Invalid format: {
            "type": "NotAValidType", # Should be "Point"
            "coordinates": [
                4.22,
                50.67
            ]
        }'
        ...
    },
    "validating_errors": {
        ...
        "my_charfield": "data was too long: 'this data is way too long for 10 characters'",
        ...
    }
}
For the first problem, I thought of doing something like this:
class BaseSerializerCleanerMixin:
    """Abstract Mixin that clean fields."""

    def __init__(self, *args, **kwargs):
        """Initialize the cleaner strategy."""
        # This is the error_dict to be filled by the `clean_*field_name*`
        self.cleaning_error_dict = {}
        super().__init__(*args, **kwargs)

    def clean_fields(self, data):
        """Clean the fields listed in self.fields_to_clean before validating them."""
        cleaned_data = {}
        for field_name in getattr(self.Meta, "fields", []):
            cleaned_field = (
                getattr(self, "clean_" + field_name)(data)
                if hasattr(self, "clean_" + field_name)
                else data.get(field_name)
            )
            if cleaned_field is not None:
                cleaned_data[field_name] = cleaned_field
        return cleaned_data

    def to_internal_value(self, data):
        """Reformat data to put it in the database."""
        cleaned_data = self.clean_fields(data)
        return super().to_internal_value(cleaned_data)
I'm not sure that's a good idea, and maybe there is an easier way to deal with such things.
For the second problem (catching the validation errors while is_valid() still returns True as long as no primary key is wrongly formatted), I'm not sure how to proceed.
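The clean_*field_name* dispatch combined with per-field error collection can be sketched without Django at all. The class below is only an illustration under assumptions: FieldCleaner, its fields list, and the use of ValueError as the failure signal are hypothetical stand-ins, not part of Django REST framework.

```python
class FieldCleaner:
    """Framework-free sketch of a clean_<field_name> dispatch that records errors."""

    # Hypothetical stand-in for Meta.fields on a serializer.
    fields = ["position", "my_charfield"]

    def __init__(self):
        self.cleaning_errors = {}

    def clean_position(self, data):
        # The payload carries "geometry"; the stored field is called "position".
        geometry = data.get("geometry") or {}
        if geometry.get("type") != "Point":
            raise ValueError("Invalid format: %r" % (geometry,))
        return tuple(geometry["coordinates"])

    def clean_fields(self, data):
        cleaned = {}
        for name in self.fields:
            cleaner = getattr(self, "clean_" + name, None)
            try:
                value = cleaner(data) if cleaner else data.get(name)
            except ValueError as exc:
                # Record the failure and keep going so other fields still get through.
                self.cleaning_errors[name] = str(exc)
                continue
            if value is not None:
                cleaned[name] = value
        return cleaned


cleaner = FieldCleaner()
cleaned = cleaner.clean_fields({
    "geometry": {"type": "NotAValidType", "coordinates": [4.22, 50.67]},
    "my_charfield": "ok",
})
print(cleaned)
print(sorted(cleaner.cleaning_errors))
```

The point of the design is that a failing cleaner does not abort the loop, so the remaining fields are still returned and the error dictionary can be stored alongside them.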

Filtering dstore collection against an array field

I'm trying to filter a dstore collection by a field that has an array of values. My json data looks like the following (simplified):
[{
    user_id: 1,
    user_name: "John Doe",
    teams: [{team_id: 100, team_name: 'Red Sox'}, {team_id: 101, team_name: 'Buccaneers'}]
},
{
    user_id: 2,
    user_name: "Fred Smith",
    teams: [{team_id: 100, team_name: 'Buccaneers'}, {team_id: 102, team_name: 'Rays'}]
}]
I can do a simple filter against the username field and it works perfectly.
this.dstoreFilter = new this.dstore.Filter();
var results = this.dgrid.set('collection', this.dstore.filter(
this.dstoreFilter.match('user_name',new RegExp(searchTerm, 'i'))
));
How, though, do I construct a filter to show me only those players who play for the Red Sox, for example? I've tried using the filter.contains() method, but I can't find any adequate documentation on how it works. Looking at the dstore code, I see that the filter.contains() method has the following signature: (value, required, object, key), but that's not helping me much.
Any guidance would be much appreciated. Thanks in advance!
You can find documentation on Filtering here.
In your case, .contains() will not work because it is intended to work on values of array type. What you want to filter here is array of objects. Here is a quote from the doc link:
contains: Filters for objects where the specified property's value is an array and the array contains any value that equals the provided value or satisfies the provided expression.
In my opinion, the best way here is to override the filter method where you want to filter by team name. Here is some sample code:
this.grid.set('collection', this.dstore.filter(lang.hitch(this, function (item) {
    var displayUser = false;
    for(var i=0; i < item.teams.length; i++){
        var team = item.teams[i];
        if(team.team_name == 'Red Sox'){
            displayUser = true;
            break;
        }
    }
    return displayUser;
})));
this.grid.refresh();
The function is called for each user in the store: if it returns false, the user is hidden, and if it returns true, the user is displayed. This is by far the easiest way I know of to apply complex filtering on dstore.
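The predicate logic itself is independent of dstore. As a language-neutral illustration, here is the same "does any nested team match" test written in Python on the sample data from the question; plays_for is a hypothetical helper name.

```python
users = [
    {
        "user_id": 1,
        "user_name": "John Doe",
        "teams": [
            {"team_id": 100, "team_name": "Red Sox"},
            {"team_id": 101, "team_name": "Buccaneers"},
        ],
    },
    {
        "user_id": 2,
        "user_name": "Fred Smith",
        "teams": [
            {"team_id": 100, "team_name": "Buccaneers"},
            {"team_id": 102, "team_name": "Rays"},
        ],
    },
]


def plays_for(user, team_name):
    """Return True if any of the user's teams has the given name."""
    return any(team["team_name"] == team_name for team in user["teams"])


red_sox_players = [user["user_name"] for user in users if plays_for(user, "Red Sox")]
print(red_sox_players)
```

The any() call plays the role of the loop with the early break: it stops at the first matching team.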

Facing challenges while using relative path and mapping test data from a json file to a request

I am facing a few issues while using relative paths and mapping test data from a JSON file. I have a JSON POST request and a test data file in JSON format.
This is the test data I am using.
{
  "name": "Test Data",
  "description": "Information's mainly related with Users",
  "testData": [
    {
      "Scenario1": {
        "givenName": "Joseph",
        "familyName": "George",
        "addressType": "Current",
        "lineOne": "BNRA-222, Kowdiar lane",
        "cityName": "Trivandrum",
        "countryID": "India",
        "postcode": "695006"
      }
    },
    {
      "Scenario2": {
        "givenName": "Sreenath",
        "familyName": "Bhasi",
        "addressType": "Current",
        "lineOne": "HSE-123, Karyavatom",
        "cityName": "Trivandrum",
        "countryID": "India",
        "postcode": "695552"
      }
    }
  ]
}
This is the feature file
Feature: Test using the Data from a JSON file

  Background:
    * def baseJsonRequest = read('../requests/jsonrequest.json')
    * def baseData = read('../data/sampledata.json')
    * def endPointURL = endPointURI + path

  Scenario: A sample scenario to test the data parametrization
    Given url endPointURL
    And request baseJsonRequest
    * set baseJsonRequest.autoRequest.applicants.applicant.specifiedPerson.givenName = baseData.testData[*].Scenario1.givenName
    * set baseJsonRequest.autoRequest.applicants.applicant.specifiedPerson.familyName = baseData.testData[*].Scenario1.familyName
    * set baseJsonRequest.autoRequest.applicants.applicant.specifiedPerson.residenceAddress.addressType = baseData.testData[*].Scenario1.addressType
    * set baseJsonRequest.autoRequest.applicants.applicant.specifiedPerson.residenceAddress.lineOne = baseData.testData[*].Scenario1.lineOne
    * set baseJsonRequest.autoRequest.applicants.applicant.specifiedPerson.residenceAddress.cityName = baseData.testData[*].Scenario1.cityName
    * set baseJsonRequest.autoRequest.applicants.applicant.specifiedPerson.residenceAddress.countryID = baseData.testData[*].Scenario1.countryID
    * set baseJsonRequest.autoRequest.applicants.applicant.specifiedPerson.residenceAddress.postcode = baseData.testData[*].Scenario1.postcode
My questions are:
I am not able to use a relative path on both sides, because the relative path returns a JSON array. For example, I cannot use $..Scenario1.givenName, which makes me write longer paths.
Including this mapping in every scenario will be difficult in practice. How can we implement a parameterized solution for that, and what would be a better way? Can I read the data in one feature file and pass the information to another feature? If that's possible, then I need to parameterize it. How do I do that?
Or do I need to use a Java class to read the JSON file?
Yes, the moment you have a wildcard in JsonPath, it returns an array. Anyway, 2 points that should help here straight away:
you can move repeating nested paths into a table-set
you can refer to a nested chunk of JSON by assigning to a variable
So this should be the way to go:
* def first = get[0] baseData.testData[*].Scenario1
* set baseJsonRequest.autoRequest.applicants.applicant.specifiedPerson
  | path                         | value             |
  | familyName                   | first.familyName  |
  | residenceAddress.addressType | first.addressType |
I would try to avoid wildcards as far as possible, for example:
* def first = $baseData.testData[0].Scenario1
Hope this helps !
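The difference between a wildcard path and an indexed path can be illustrated outside Karate with plain Python on a trimmed copy of the test data. This is an analogy only, not Karate's JsonPath engine: the list comprehension stands in for testData[*].Scenario1 and the direct lookup for testData[0].Scenario1.

```python
base_data = {
    "testData": [
        {"Scenario1": {"givenName": "Joseph", "familyName": "George"}},
        {"Scenario2": {"givenName": "Sreenath", "familyName": "Bhasi"}},
    ]
}

# Wildcard style: visit every entry, so the result is always a list,
# even when only one entry has a "Scenario1" key.
wildcard = [entry["Scenario1"] for entry in base_data["testData"] if "Scenario1" in entry]

# Taking element 0 of that list corresponds to the get[0] step.
first = wildcard[0]

# Indexed style: address the entry directly and get the object itself.
direct = base_data["testData"][0]["Scenario1"]

print(type(wildcard).__name__)
print(first == direct)
```

Once the nested object is held in a single variable, each field can be read from it with a short name instead of repeating the long path.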