Select * Except particular properties in Cosmos DB with SQL API - sql

Consider the following, I have a document that looks something like this:
{
"id": 2,
"properties": {
"desired": {
"Property1": 10,
"Property2": 1,
"Property3": 1,
"$metadata": {
...
},
"$version": 53
}
}
}
I want to get everything from the document EXCEPT $metadata and $version. The obvious solution would be:
SELECT c["Property1"], c["Property2"] .... FROM c where c["id"] = "2"
However, my document may expand dynamically, which makes the above suboptimal. I therefore figured it would be better to exclude just $metadata and $version. I looked at different "interesting" solutions here on Stack Overflow, one of which suggests creating a temporary table.
Unfortunately, the query needs to be very efficient, because I want to reduce the number of RUs used. Also, I really want to avoid handling the exclusion in code.
Therefore, how do I exclude particular "columns" from my document without writing an excessively long query or creating temporary tables?

Cosmos DB does not support "Project Away". You will need to specify properties to project or use * and return all of them.
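Since the exclusion cannot be expressed in the query, the remaining option is to strip the unwanted properties after SELECT * returns. The following is only a minimal sketch of that client-side step, assuming the document has already been parsed into a Python dict; the helper name without_keys is hypothetical, and the empty "$metadata" object is a placeholder for the content elided in the question. It does not change what the query itself returns from the server.

```python
from typing import Any, Collection


def without_keys(value: Any, excluded: Collection[str]) -> Any:
    """Return a copy of value with every excluded key removed at any depth."""
    if isinstance(value, dict):
        return {
            key: without_keys(item, excluded)
            for key, item in value.items()
            if key not in excluded
        }
    if isinstance(value, list):
        return [without_keys(item, excluded) for item in value]
    return value


# Document shaped like the one in the question ("$metadata" content is a placeholder).
doc = {
    "id": 2,
    "properties": {
        "desired": {
            "Property1": 10,
            "Property2": 1,
            "Property3": 1,
            "$metadata": {},
            "$version": 53,
        }
    },
}

cleaned = without_keys(doc, {"$metadata", "$version"})
```

Because the helper walks the whole structure, newly added properties pass through untouched and only the named keys are dropped.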

Related

How to achieve generic Audit.NET json data processing?

I am using the Audit.NET library to log Entity Framework actions into a database (currently everything goes into one AuditEventLogs table), where the JsonData column stores the data in the following JSON format:
{
"EventType":"MyDbContext:test_database",
"StartDate":"2021-06-24T12:11:59.4578873Z",
"EndDate":"2021-06-24T12:11:59.4862278Z",
"Duration":28,
"EntityFrameworkEvent":{
"Database":"test_database",
"Entries":[
{
"Table":"Offices",
"Name":"Office",
"Action":"Update",
"PrimaryKey":{
"Id":"40b5egc7-46ca-429b-86cb-3b0781d360c8"
},
"Changes":[
{
"ColumnName":"Address",
"OriginalValue":"test_address",
"NewValue":"test_address"
},
{
"ColumnName":"Contact",
"OriginalValue":"test_contact",
"NewValue":"test_contact"
},
{
"ColumnName":"Email",
"OriginalValue":"test_email",
"NewValue":"test_email2"
},
{
"ColumnName":"Name",
"OriginalValue":"test_name",
"NewValue":"test_name"
},
{
"ColumnName":"OfficeSector",
"OriginalValue":1,
"NewValue":1
},
{
"ColumnName":"PhoneNumber",
"OriginalValue":"test_phoneNumber",
"NewValue":"test_phoneNumber"
}
],
"ColumnValues":{
"Id":"40b5egc7-46ca-429b-86cb-3b0781d360c8",
"Address":"test_address",
"Contact":"test_contact",
"Email":"test_email2",
"Name":"test_name",
"OfficeSector":1,
"PhoneNumber":"test_phoneNumber"
},
"Valid":true
}
],
"Result":1,
"Success":true
}
}
My team and I have one main goal:
To create a search page where administrators can tell
who changed
what did they change
when did the change happen
They can give a time period, to reduce the number of audit records, and the interesting part comes here:
There should be an input text field which should let them search in the values of the "ColumnValues" section.
The problems I encountered:
Even if I map the JSON structure into relational rows, I am unable to search in every column while keeping the solution generic.
If I don't map it, I could search the JSON string with the SQL Server LIKE operator, but with a few hundred thousand records the query takes an eternity to finish, so that is probably not the way.
Staying generic is important, so that we don't need to modify the audit search page every time we create or modify an entity.
I only know MSSQL, but is it possible that storing the audit logs in a document-oriented database like Cosmos DB (or anything else; it was just an example) would solve my problem? Or can I achieve the desired behaviour with a relational database like MSSQL?
It looks like you're asking for an opinion; in that case I would strongly recommend a document-oriented DB.
Cosmos DB could be a great option since it supports SQL queries.
There is an extension to log to CosmosDB from Audit.NET: Audit.AzureCosmos
A sample query:
SELECT c.EventType, e.Table, e.Action, ch.ColumnName, ch.OriginalValue, ch.NewValue
FROM c
JOIN e IN c.EntityFrameworkEvent.Entries
JOIN ch IN e.Changes
WHERE ch.ColumnName = "Address" AND ch.OriginalValue = "test_address"
Here is a nice post with a lot of examples of complex SQL queries on Cosmos DB
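Whichever store is used, the generic part of the search (matching a term against any value in "ColumnValues" without knowing the entity) can be sketched as follows. This is only an in-memory illustration of the matching logic, assuming the audit events have already been loaded and parsed; the function names are hypothetical, and it is not a substitute for an indexed query over hundreds of thousands of records.

```python
from typing import Any, Dict, Iterator, List


def iter_entries(event: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat row per entity entry in an Audit.NET event."""
    ef_event = event.get("EntityFrameworkEvent", {})
    for entry in ef_event.get("Entries", []):
        yield {
            "when": event.get("StartDate"),
            "table": entry.get("Table"),
            "action": entry.get("Action"),
            "column_values": entry.get("ColumnValues", {}),
        }


def search_column_values(events: List[Dict[str, Any]], term: str) -> List[Dict[str, Any]]:
    """Return rows where any ColumnValues value contains term (case-insensitive)."""
    needle = term.lower()
    hits = []
    for event in events:
        for row in iter_entries(event):
            values = row["column_values"].values()
            if any(needle in str(value).lower() for value in values):
                hits.append(row)
    return hits


# Trimmed version of the event shown in the question.
sample_event = {
    "StartDate": "2021-06-24T12:11:59.4578873Z",
    "EntityFrameworkEvent": {
        "Entries": [
            {
                "Table": "Offices",
                "Action": "Update",
                "ColumnValues": {
                    "Address": "test_address",
                    "Email": "test_email2",
                    "OfficeSector": 1,
                },
            }
        ]
    },
}

matches = search_column_values([sample_event], "test_email2")
```

The function never names a column, so adding or changing entities does not require touching the search code.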

Convert global issue ID to project issue ID

When I query the API api/issues/ for issues with fields="id", I get back an array of issues similar to this:
[
{ "id": "2-120" }
]
This works for further calls because 2-120 can be used in calls to /api/issues/{id}. However, I also need to display those IDs to users, who are more comfortable with project-based IDs like EX-10. (Also, the whole browser user interface is structured around those project issue IDs.)
What I tried:
Had a look at the Issue JSON Schema docs, which do not seem to contain an additional ID
Tried to find out if they can be converted manually, which does not seem to be the case.
So, how can I convert global issue IDs, like 2-120, to project issue IDs, like EX-10?
After looking at the schema again, I realized I had simply overlooked idReadable. So, a request to api/issues/PA-102?fields=id,idReadable will give you both types of IDs.
{ "id": "2-120", "idReadable": "PA-20" }
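As a rough sketch of how this could be used on the client, the snippet below builds the request URL with both fields and turns a parsed response into a lookup from global ID to project ID. The base URL and both function names are hypothetical, and requesting fields=id,idReadable on the list endpoint (rather than on a single issue as above) is an assumption.

```python
from typing import Dict, List
from urllib.parse import urlencode


def issues_url(base_url: str) -> str:
    """Build the issue list URL asking for both ID fields."""
    # safe="," keeps the comma readable instead of encoding it as %2C.
    query = urlencode({"fields": "id,idReadable"}, safe=",")
    return f"{base_url.rstrip('/')}/api/issues?{query}"


def readable_id_lookup(issues: List[Dict[str, str]]) -> Dict[str, str]:
    """Map each global issue ID to its project-based ID."""
    return {issue["id"]: issue["idReadable"] for issue in issues}


# Response body as shown in the answer above, already parsed from JSON.
response = [{"id": "2-120", "idReadable": "PA-20"}]
lookup = readable_id_lookup(response)
url = issues_url("https://tracker.example.com/")
```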

SQL Database for Magic Cardgame

For school I am creating a deckbuilder website based on Magic: The Gathering. It's the project that decides whether I get my degree or not. Through the Deckbrew website I have been able to get data like the following:
[
{
"name": "About Face",
"id": "about-face",
"url": "https://api.deckbrew.com/mtg/cards/about-face",
"store_url": "http://store.tcgplayer.com/magic/urzas-legacy/about-face",
"types": [
"instant"
],
"colors": [
"red"
],
"cmc": 1,
"cost": "{R}",
"text": "Switch target creature's power and toughness until end of turn.",
"formats": {
"commander": "legal",
"legacy": "legal",
"vintage": "legal"
},
"editions": [
{
"set": "Urza's Legacy",
"rarity": "common",
"artist": "Melissa A. Benson",
"multiverse_id": 12414,
"flavor": "The overconfident are the most vulnerable.",
"number": "73",
"layout": "normal",
"price": {
"low": 0,
"average": 0,
"high": 0
},
"url": "https://api.deckbrew.com/mtg/cards?multiverseid=12414",
"image_url": "http://mtgimage.com/multiverseid/12414.jpg",
"set_url": "https://api.deckbrew.com/mtg/sets/ULG",
"store_url": "http://store.tcgplayer.com/magic/urzas-legacy/about-face"
}
]
}
]
It's obviously in JSON format. I have found a way to turn this into objects, and the structure of the project is 4-layer MVC with Entity Framework and C#, which is working (kinda)... The problem is the database. I have been working on it for 2 months now and I am not getting any further. I have not seen much on how to create databases, and that's where it goes wrong: I don't get how to build the database. The creation itself would work if I figured out how to include certain things...
1) Formats: if the card is legal in a format, Formats is filled with: "legacy": "legal", "commander":"legal", ... so only the legal formats are included.
2) Types and colors are just plain arrays of words, but since I'm very bad with databases I don't even know how to figure this one out.
3) Editions is something completely different. It's an array of the object Edition which I believe has to have a table of its own. The problem here is that I thought I needed to use a foreign key but since it's an array of Editions I don't really know how to start doing that either.
4) and then there's Price: It always has 3 values: low, average and high which can be 0 if there's no price known.
So here you have it. To me this database is very complex or maybe I am making it too complex. Is there anybody who can help me to get this database organized so I can get on with my project, because I'm so lost at the moment that I feel I am not going to get this ready by the end of next month and that would be awful.
1: No, you should include all.
2: Table with colors, standard m:n binding table in between mapping the card table with the color table. Not knowing how to make an m:n relationship makes me think you skipped all classes... this is fundamental and basic.
3: Seems like "cardedition" is the main table actually, and everything before is a master type table. Not sure- I don't really do magic at all, so I lack what is called domain knowledge. Are cards changed so multiple editions exist? Why is that an array in json?
4: magic values, 0,1,2,3. What is the question?
To me this database is very complex
I suggest you start from scratch (making things easier) and just have maybe 10 or so tables. Go step by step. Follow what you learned, go to 3rd or 4th normal form and go relational.
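To make the advice above concrete, here is one possible relational layout: a card table, a color table with an m:n binding table between them, and an edition table that points back to its card and carries the three price values as plain columns. It is only a sketch under assumptions: it uses SQLite through Python for illustration rather than the asker's Entity Framework setup, all table and column names are made up, and types and formats are left out because they would follow the same m:n pattern as colors.

```python
import sqlite3

SCHEMA = """
CREATE TABLE cards (
    id         TEXT PRIMARY KEY,      -- e.g. "about-face"
    name       TEXT NOT NULL,
    cmc        INTEGER,
    cost       TEXT,
    rules_text TEXT
);
CREATE TABLE colors (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- m:n binding table: one row per (card, color) pair.
CREATE TABLE card_colors (
    card_id  TEXT    NOT NULL REFERENCES cards(id),
    color_id INTEGER NOT NULL REFERENCES colors(id),
    PRIMARY KEY (card_id, color_id)
);
-- 1:n: each edition row carries a foreign key to its card.
CREATE TABLE editions (
    id            INTEGER PRIMARY KEY,
    card_id       TEXT NOT NULL REFERENCES cards(id),
    set_name      TEXT NOT NULL,
    rarity        TEXT,
    multiverse_id INTEGER,
    price_low     INTEGER NOT NULL DEFAULT 0,
    price_average INTEGER NOT NULL DEFAULT 0,
    price_high    INTEGER NOT NULL DEFAULT 0
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(SCHEMA)

conn.execute(
    "INSERT INTO cards (id, name, cmc, cost) VALUES (?, ?, ?, ?)",
    ("about-face", "About Face", 1, "{R}"),
)
conn.execute("INSERT INTO colors (id, name) VALUES (?, ?)", (1, "red"))
conn.execute("INSERT INTO card_colors (card_id, color_id) VALUES (?, ?)", ("about-face", 1))
conn.execute(
    "INSERT INTO editions (card_id, set_name, rarity, multiverse_id) VALUES (?, ?, ?, ?)",
    ("about-face", "Urza's Legacy", "common", 12414),
)

rows = conn.execute(
    """
    SELECT c.name, col.name, e.set_name, e.price_low
    FROM cards c
    JOIN card_colors cc ON cc.card_id = c.id
    JOIN colors col ON col.id = cc.color_id
    JOIN editions e ON e.card_id = c.id
    """
).fetchall()
```

An array of plain words (colors, types) becomes a lookup table plus a binding table, while an array of objects (editions) becomes a child table whose rows each reference the parent card.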

Without JOINs, what is the right way to handle data in document databases?

I understand that JOINs are either not possible or frowned upon in document databases. I'm coming from a relational database background and trying to understand how to handle such scenarios.
Let's say I have an Employees collection where I store all employee related information. The following is a typical employee document:
{
"id": 1234,
"firstName": "John",
"lastName": "Smith",
"gender": "Male",
"dateOfBirth": "3/21/1967",
"emailAddresses":[
{ "email": "johnsmith@mydomain.com", "isPrimary": "true" },
{ "email": "jsmith@someotherdomain.com", "isPrimary": "false" }
]
}
Let's also say I have a separate Projects collection where I store project data that looks something like this:
{
"id": 444,
"projectName": "My Construction Project",
"projectType": "Construction",
"projectTeam":[
{ "_id": 2345, "position": "Engineer" },
{ "_id": 1234, "position": "Project Manager" }
]
}
If I want to return a list of all my projects along with project teams, how do I handle making sure that I return all the pertinent information about individuals in the team i.e. full names, email addresses, etc?
Is it two separate queries? One for projects and the other for people whose ID's appear in the projects collection?
If so, how do I then insert the data about people i.e. full names, email addresses? Do I then do a foreach loop in my app to update the data?
If I'm relying on my application to handle populating all the pertinent data, is this not a performance hit that would offset the performance benefits of document databases such as MongoDB?
Thanks for your help.
"...how do I handle making sure that I return all the pertinent information about individuals in the team i.e. full names, email addresses, etc? Is it two separate queries?"
It is either 2 separate queries OR you denormalize into the Project document. In our applications we do the 2nd query and keep the data as normalized as possible in the documents.
It is actually NOT common to see the "_id" key anywhere but on the top-level document. Further, for collections that you are going to have millions of documents in, you save storage by keeping the keys "terse". Consider "name" rather than "projectName", "type" rather than "projectType", "pos" rather than "position". It seems trivial but it adds up. You'll also want to put an index on "team.empId" so the query "how many projects has Joe Average worked on" runs well.
{
"_id": 444,
"name": "My Construction Project",
"type": "Construction",
"team":[
{ "empId": 2345, "pos": "Engineer" },
{ "empId": 1234, "pos": "Project Manager" }
]
}
Another thing to get used to is that you don't have to write the whole document every time you want to update an individual field or, say, add a new member to the team. You can do targeted updates that uniquely identify the document but only update an individual field or array element.
db.projects.update(
{ _id : 444 },
{ $addToSet : { "team" : { "empId": 666, "pos": "Minion" } } }
);
The 2 queries to get one thing done hurts at first, but you'll get past it.
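The two-query approach can be sketched like this: collect every empId from the projects, fetch those employees in a single second query, and merge in the application. This is a minimal in-memory sketch, not driver code. The find_employees callable is a hypothetical stand-in for the second query (for example a find with an $in filter on _id), and employees are assumed to be keyed by "_id".

```python
from typing import Any, Callable, Dict, Iterable, List, Set

Doc = Dict[str, Any]


def attach_team_details(
    projects: List[Doc],
    find_employees: Callable[[Set[int]], Iterable[Doc]],
) -> List[Doc]:
    """Return copies of projects whose team members carry employee names."""
    # Gather all referenced employee IDs so the second query runs only once.
    emp_ids = {member["empId"] for project in projects for member in project["team"]}
    employees_by_id = {emp["_id"]: emp for emp in find_employees(emp_ids)}

    enriched = []
    for project in projects:
        team = []
        for member in project["team"]:
            emp = employees_by_id.get(member["empId"], {})
            team.append(
                {
                    **member,
                    "firstName": emp.get("firstName"),
                    "lastName": emp.get("lastName"),
                }
            )
        enriched.append({**project, "team": team})
    return enriched


# In-memory stand-ins for the two collections.
EMPLOYEES = [{"_id": 1234, "firstName": "John", "lastName": "Smith"}]
PROJECTS = [
    {
        "_id": 444,
        "name": "My Construction Project",
        "type": "Construction",
        "team": [
            {"empId": 2345, "pos": "Engineer"},
            {"empId": 1234, "pos": "Project Manager"},
        ],
    }
]


def find_employees_in_memory(ids: Set[int]) -> List[Doc]:
    """Stand-in for the second query: employees whose _id is in ids."""
    return [emp for emp in EMPLOYEES if emp["_id"] in ids]


result = attach_team_details(PROJECTS, find_employees_in_memory)
```

The point of batching the IDs is that the cost stays at two round trips no matter how many projects or team members are returned, instead of one lookup per member in a foreach loop.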
MongoDB is a document storage database.
It supports high availability and scalability.
To return a list of all your projects along with the project team details,
you will, as far as I understand, have to run 2 queries.
Since MongoDB does not have FK constraints, we need to maintain the relationships at the program level.
Instead of FK constraints:
1) if the amount of data is small, we can embed it as a subdocument.
2) rather than designing the database in a normalized way, in MongoDB we design according to the access pattern, i.e. the way we are most likely to query the data. (Updates become slower, but performance at the user end mainly depends on read activity, which will be better than with an RDBMS.)
The following link provides a certificate course on MongoDB, free of cost.
Mongo DB University
They also have a forum, which is pretty good.

Creating Mandatory User Filters with multiple element IDs

Mandatory User Filters
I am working on a tool to allow customers to apply Mandatory User Filters. When attributes like "Year" or "Age" are loaded, each can have hundreds of elements, each with its own ID. The POST request to create a filter (documented here: https://developer.gooddata.com/article/lets-get-started-with-mandatory-user-filters) looks like this:
{
"userFilter": {
"content": {
"expression": "[/gdc/md/{project-id}/obj/{object-id}]=[/gdc/md/{project-id}/obj/{object-id}/elements?id={element-id}]"
},
"meta": {
"category": "userFilter",
"title": "My User Filter Name"
}
}
}
The "expression" property shows how a single ID can be set. What I want is to have multiple IDs associated with the object-id set in the POST. For example, if a user wanted to add a filter on all of the elements in "Year" (there are 150) in the demo project, it seems odd to make 150 POST requests.
Is there a better way?
UPDATE
Tomas thank you for your help.
I am now having trouble assigning multiple userFilters to a user. I can easily apply a single filter to a user with the method outlined in the documentation. However, this overwrites the userFilters field. What is the syntax for assigning several?
Here is my demo POST data:
{ "userFilters":
{ "items": [
{ "user": "/gdc/account/profile/decd0b2e3077cf9c47f8cfbc32f6460e",
"userFilters":["/gdc/md/a1nc4jfa14wey1bnfs1vh9dljaf8ejuq/obj/808728","/gdc/md/a1nc4jfa14wey1bnfs1vh9dljaf8ejuq/obj/808729","/gdc/md/a1nc4jfa14wey1bnfs1vh9dljaf8ejuq/obj/808728"]
}
]
}
}
This receives a BAD REQUEST.
I'm not sure what you mean by "have multiple ids associated with the object-id" exactly, but I'll try to tell you all I know about it. :-)
If you indeed made multiple POST requests, created multiple userFilters and set them all for one user, the user wouldn't see anything at all. That's because the system combines separate userFilters using logical AND, and a Year cannot be 2013 and 2014 at the same time. So for the rest of my answer, I'll assume that you want OR instead.
There are several ways to do this. As you may have guessed by now, you can use AND/OR explicitly, using an expression like this:
[/…/obj/{object-id}]=[/…/obj/{object-id}/elements?id={element-id}] OR [/…/obj/{object-id}]=[/…/obj/{object-id}/elements?id={element-id}]
This can often be further simplified to:
[/…/obj/{object-id}] IN ( [/…/obj/{object-id}/elements?id={element-id}], [/…/obj/{object-id}/elements?id={element-id}], … )
If the attribute is a date (year, month, …) attribute, you could, in theory, also specify ranges using BETWEEN instead of listing all elements:
[/…/obj/{object-id}] BETWEEN [/…/obj/{object-id}/elements?id={element-id}] AND [/…/obj/{object-id}/elements?id={element-id}]
It seems, though, that this only works in metrics MAQL and is not allowed in the implementation of user filters. I have no idea why.
Also, for your own attribute like Age, you can't do that since user-defined numeric attributes aren't supported. You could, in theory, add a fact that holds the numeric value, and construct a BETWEEN filter based on that fact. It seems that this is not allowed in the implementation of user filters either. :-(
Hope this helps.
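For the IN form described above, the expression string can be generated instead of typed by hand, so 150 elements still mean a single POST. The following is a rough sketch only: the function names are hypothetical, the URI layout and payload shape are copied from the question, and it is an assumption that the spacing inside the parentheses does not matter to the server.

```python
from typing import Any, Dict, Iterable


def attribute_uri(project_id: str, object_id: int) -> str:
    """URI of the attribute object, following the pattern in the question."""
    return f"/gdc/md/{project_id}/obj/{object_id}"


def in_expression(project_id: str, object_id: int, element_ids: Iterable[int]) -> str:
    """Build '[attr] IN ( [element], [element], ... )' for the given element IDs."""
    ids = list(element_ids)
    if not ids:
        raise ValueError("at least one element ID is required")
    attr = attribute_uri(project_id, object_id)
    elements = ", ".join(f"[{attr}/elements?id={element_id}]" for element_id in ids)
    return f"[{attr}] IN ( {elements} )"


def user_filter_payload(title: str, expression: str) -> Dict[str, Any]:
    """Wrap an expression in the POST body shape shown in the question."""
    return {
        "userFilter": {
            "content": {"expression": expression},
            "meta": {"category": "userFilter", "title": title},
        }
    }


# Placeholder project, object, and element IDs for illustration only.
expression = in_expression("my-project", 10, [1, 2])
payload = user_filter_payload("My User Filter Name", expression)
```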