Using Aggregate to rename a field in MongoDB - sql

I am working on a query that renames the field 'expertise' to "Skill". expertise is an array that can hold more than one value, so I want to slice it down to one element.
Example document: { "_id" : "E08", "name" : "Damien Collins", "expertise" : [ "Python", "Java" ] }
I want to show the name, show expertise as "Skill", and show just one piece of expertise.
Current query:
db.employees.aggregate([{expertise:{$exists:true}},{$project:{_id:1,"Skill":{expertise{$slice:1}},name:1}}])
It was working before I added the rename to "Skill".

Use $arrayElemAt to return 1 element of the array:
{$project:{_id:1,"Skill":{$arrayElemAt:["$expertise",0]},name:1}}
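As a minimal sketch in Python, the full pipeline might look as follows. It assumes the filter from the question belongs in a $match stage (a bare {expertise: {$exists: true}} is not a valid pipeline stage). first_skill is a hypothetical pure-Python helper that only mimics what the $project stage does to one document with a non-empty array, so the result shape can be seen without a server.

```python
# Aggregation pipeline as it could be passed to a driver's aggregate() call;
# nothing here connects to a database.
pipeline = [
    {"$match": {"expertise": {"$exists": True}}},
    {"$project": {
        "_id": 1,
        "name": 1,
        "Skill": {"$arrayElemAt": ["$expertise", 0]},
    }},
]


def first_skill(doc):
    """Hypothetical stand-in for the $project stage above, for one document."""
    out = {"_id": doc["_id"], "name": doc["name"]}
    expertise = doc.get("expertise")
    if isinstance(expertise, list) and expertise:
        out["Skill"] = expertise[0]  # same element $arrayElemAt picks at index 0
    return out


sample = {"_id": "E08", "name": "Damien Collins", "expertise": ["Python", "Java"]}
print(first_skill(sample))
```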

Related

Need Pentaho JSON without array

I want to output JSON data as a plain object rather than an array. I made the changes described in the Pentaho documentation, but the output is always an array, even for a single set of values. I am using PDI 9.1 and tested with the .ktr from the link below:
https://wiki.pentaho.com/download/attachments/25043814/json_output.ktr?version=1&modificationDate=1389259055000&api=v2
The statement below is from https://wiki.pentaho.com/display/EAI/JSON+output:
Another special case is when 'Nr. rows in a block' = 1.
If used with empty json block name output will looks like:
{
"name" : "item",
"value" : 25
}
My output looks like this:
{ "": [ {"name":"item","value":25} ] }
I have resolved this myself. I added another JSON Input step and defined the path as below
$.wellDesign[0] to get the array element as a string object
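To illustrate what that path does, here is a small Python sketch of the same unwrapping outside PDI. It uses the sample output from the question, where the block name (and therefore the key) is empty; in the real data the key was wellDesign. unwrap_single is a hypothetical helper name.

```python
import json

# Output copied from the question: a one-element array under an empty key.
raw = '{ "": [ {"name":"item","value":25} ] }'


def unwrap_single(raw_json, key):
    """Return the first element of the array stored under `key`."""
    return json.loads(raw_json)[key][0]


item = unwrap_single(raw, "")
print(json.dumps(item))
```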

Snowflake Searching string in semi structured data

I have a table with many columns and rows. One column that I am trying to query in Snowflake holds semi-structured data. For example, when I query
select response
from table
limit 5
This is what is returned
[body={\n "id": "xxxxx",\n "object": "charge",\n "amount": 500,\n "amount_refunded": 0,\n "application": null,\n "application_fee": null,\n "application_fee_amount": null,\n "balance_transaction": null,\n "billing_details": {\n "address": {\n "city": null,\n "zip": "xxxxx",]
I want to select only the zip in this data. When I run code:
select response:zip
from table
limit 5
I get an error.
SQL compilation error: error line 1 at position 21 Invalid argument types for function 'GET': (VARCHAR(16777216), VARCHAR(11))
Is there a reason why this is happening? I am new to snowflake so trying to parse out this data but stuck. Thanks!
Snowflake has very good documentation on the subject
For your specific case, have you tried dot notation? It is the appropriate way to access nested JSON. So:
select response:body.billing_details.address.zip
from table
Remember that you have the 'body' element. You need to access it first with a colon because it is a level 1 element. In your sample, zip sits under billing_details.address inside body, and every element below level 1 is reached with dot notation.
I think there are multiple issues here.
First, I think your response column is not a VARIANT column; the error message reports the argument as VARCHAR. Please run the query below and confirm:
SHOW COLUMNS IN TABLE table;
Even if the column were VARIANT, the data is not stored as valid JSON. You will need to strip out the JSON part and store that in the VARIANT column.
Please do the first part and share the result, and I will then suggest next steps. I wanted to post this as a comment, but comments do not allow this many sentences.
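The "strip the JSON part" step can be sketched in Python to show the idea. The sample value below is a hypothetical, completed version of the truncated row from the question (the zip value is made up), and extract_body is an illustrative helper, not anything Snowflake provides. Inside Snowflake, the analogous step would be to cut the JSON text out of the string and pass it to PARSE_JSON.

```python
import json

# Hypothetical completed sample; the real rows are truncated in the question
# and may carry different text around the JSON.
response = """[body={
 "id": "xxxxx",
 "object": "charge",
 "amount": 500,
 "billing_details": {
  "address": {
   "city": null,
   "zip": "12345"
  }
 }
}]"""


def extract_body(text):
    """Parse the first JSON object found in `text`, ignoring what surrounds it."""
    start = text.index("{")
    # raw_decode parses one JSON value starting at `start` and tolerates
    # trailing characters such as the closing "]".
    obj, _end = json.JSONDecoder().raw_decode(text, start)
    return obj


body = extract_body(response)
print(body["billing_details"]["address"]["zip"])
```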

How to match following queries in Azure Search

I have the default analyzer set for my index and its fields in Azure Search.
I have the following values for a field named name:
Demo 001
Demo Site 001
001 Demo Site
I am trying to get matching values for these. My sample query is:
$count=true&queryType=full&searchFields=name&searchMode=any&$select=name,id&$skip=0&$top=10&search=name:/"Demo(.*)/
This returns all the results.
What change should I make to the query, or to the analyzer, so that searching for "Demo S" returns only Demo Site 001?
How can I modify the query to search for "001", or for "001" followed by a space?
Finally, is there any way to tell the search that I need only the values that start with 001?
Is it possible to satisfy all three conditions with a single setup?
There are two possible ways to achieve this.
A. Custom Analyzer with a CharMap filter
1. In the index phase, you can use a custom analyzer with a character filter that maps whitespace to an underscore or to an empty string.
E.g. if you map whitespace to an empty string, your data will be stored as:
Demo Site 001 ---> DemoSite001
001 Demo Site ---> 001DemoSite
"charFilters":[
{
"name":"map_dash",
"@odata.type":"#Microsoft.Azure.Search.MappingCharFilter",
"mappings":[" =>"]
}
]
2. In the query phase:
Step 1. Parse the query and substitute whitespace with the same replacement that was used in the index phase.
So the search query "Demo S" translates to ---> "DemoS"
Step 2. Do a wildcard search for the new query string
search = DemoS*
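The two query-phase steps could be sketched on the client side like this. to_prefix_query is a hypothetical helper, it assumes whitespace was mapped to an empty string at index time, and it does not escape Lucene special characters that may appear in user input.

```python
def to_prefix_query(user_input):
    """Drop whitespace (mirroring the index-time mapping) and append '*'."""
    collapsed = "".join(user_input.split())
    return collapsed + "*"


print(to_prefix_query("Demo S"))
```

Note that with this approach a trailing space in the input (for example "001 ") is lost, so "001 " and "001" produce the same wildcard query.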
B. Custom Analyzer with an EdgeNGram Token Filter
Use a custom analyzer with an EdgeNGram token filter to index your documents.
E.g.:
"tokenFilters": [
{
"name": "edgeNGramFilter",
"@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
"minGram": 2,
"maxGram": 20
}
],
"analyzers": [
{
"name": "prefixAnalyzer",
"@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
"tokenizer": "keyword",
"tokenFilters": [ "lowercase", "edgeNGramFilter" ]
}
]
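To see why this configuration supports prefix matching, the following Python sketch emulates it outside Azure Search: the keyword tokenizer keeps the whole value as one token, lowercase normalizes it, and the edge n-gram filter emits every leading substring between minGram and maxGram characters. The search function is a simplification that assumes the query is only lowercased and compared as a whole term, not n-grammed itself.

```python
def edge_ngrams(value, min_gram=2, max_gram=20):
    """Emulate keyword tokenizer + lowercase + edge n-gram on one field value."""
    text = value.lower()
    upper = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, upper + 1)]


names = ["Demo 001", "Demo Site 001", "001 Demo Site"]
index = {name: set(edge_ngrams(name)) for name in names}


def search(query):
    """Return the names whose indexed grams contain the lowercased query."""
    term = query.lower()
    return [name for name in names if term in index[name]]


print(search("Demo S"))
print(search("001 "))
```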
With either of these approaches:
"Demo S" will return only Demo Site 001
"001 " will only return 001 Demo Site
More details :
How Search works
Custom Analyzers

Using reserved word field name in DocumentDB

I inherited a database loaded into DocumentDB, where a field name happens to be "Value".
Example of my structure is:
{
...
"Alternates": [
{
"Type": "ID",
"Value" : "NOCALL"
}
]
}
When I query (using DocumentDB's SQL) for all documents where Alternates.Value = "NOCALL", I get a syntax error near "Value". If I query for Type = "ID", it works fine.
It seems that the word Value, which has a special meaning in DocumentDB, is causing the issue.
Putting punctuation (e.g. single or double quotes) around "Value" does not seem to help.
Any suggestion on how to resolve this will be much appreciated!
Thank you in advance!
You are correct. Value is a reserved keyword.
To escape it, use the [""] syntax.
So in your case of
"Alternates": [
{
"Type": "ID",
"Value" : "NOCALL"
}
]
the query becomes:
SELECT c
FROM c
JOIN alt IN c.Alternates
WHERE alt["Value"] = 'NOCALL'
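When the query text is assembled in application code, the same bracket escape can be produced by a small helper. This is a sketch only: bracket is a hypothetical function, and the @value parameter with its name/value dictionary is an assumption about how the caller passes parameters (it follows the convention used for parameterized Cosmos DB SQL queries).

```python
def bracket(alias, prop):
    """Reference a property with bracket notation, e.g. alt["Value"]."""
    return f'{alias}["{prop}"]'


query = (
    "SELECT c FROM c "
    "JOIN alt IN c.Alternates "
    f"WHERE {bracket('alt', 'Value')} = @value"
)
# Parameter list kept separate from the query text (assumed calling convention).
parameters = [{"name": "@value", "value": "NOCALL"}]

print(query)
```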
In my case, the structure looks something like this: { "name": "ABC", "Value": 123 }.
I could escape the reserved keyword using [""] (as answered by others) together with the <source_name>, i.e.
SELECT c["Value"] FROM c -- 123
Ref.: Querying in Azure Cosmos DB

Is getting the General ID same as getting FormattedID in rally?

I am trying to get the ID under "General" from a feature item in rally. This is my query:
body = { "find" => {"_ProjectHierarchy" => projectID, "_TypeHierarchy" => "PortfolioItem/Feature"
},
"fields" => ["FormattedID","Name","State","Release","_ItemHierarchy","_TypeHierarchy","Tags"],
"hydrate" => ["_ItemHierarchy","_TypeHierarchy","Tags"],
"fetch"=>true
}
I am not able to get any value for FormattedID. I tried using "_UnformattedID", but it returns an entirely different value from the FormattedID. Any help would be appreciated.
LBAPI does not have a FormattedID field. You are correct to use _UnformattedID: it is the FormattedID without the prefix. For example, this query:
https://rally1.rallydev.com/analytics/v2.0/service/rally/workspace/1111/artifact/snapshot/query.js?find={"_ProjectHierarchy":2222,"_TypeHierarchy":"PortfolioItem/Feature","State":"Developing",_ValidFrom: {$gte: "2013-06-01TZ",$lt: "2013-09-01TZ"}},sort:{_ValidFrom:-1}}&fields=["_UnformattedID","Name","State"]&hydrate=["State"]&compress=true&pagesize:200
returns _UnformattedID values that correspond to the FormattedID, as this screenshot shows:
I noticed you are using both fields and fetch. Per LBAPI's documentation, it uses fields rather than fetch. If you want to get all fields, use fields=true.
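Putting both points together, the request body from the question could be adjusted roughly as in this Python sketch: fetch is dropped, _UnformattedID replaces FormattedID in fields, and the prefix is added back on the client. build_snapshot_query and formatted_id are hypothetical helpers, and the "F" prefix for Features is an assumption about the workspace's ID prefix.

```python
import json


def build_snapshot_query(project_id):
    """Build an LBAPI-style query body that uses 'fields' and no 'fetch'."""
    return {
        "find": {
            "_ProjectHierarchy": project_id,
            "_TypeHierarchy": "PortfolioItem/Feature",
        },
        "fields": ["_UnformattedID", "Name", "State", "Release",
                   "_ItemHierarchy", "_TypeHierarchy", "Tags"],
        "hydrate": ["_ItemHierarchy", "_TypeHierarchy", "Tags"],
    }


def formatted_id(snapshot, prefix="F"):
    """Rebuild a FormattedID-like string; the prefix is an assumed default."""
    return f"{prefix}{snapshot['_UnformattedID']}"


body = build_snapshot_query(2222)
print(json.dumps(body, indent=2))
print(formatted_id({"_UnformattedID": 4}))
```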
As for the missing custom fields, make sure that the custom field value was set within the dates of the query.
Compare these almost identical queries: the first query does not return a custom field, the second query does.
Query #1:
https://rally1.rallydev.com/analytics/v2.0/service/rally/workspace/1111/artifact/snapshot/query.js?find={"_ProjectHierarchy":2222,"_TypeHierarchy":"PortfolioItem/Feature","State":"Developing",_ValidFrom: {$gte: "2013-06-01TZ",$lt: "2013-09-01TZ"}}}&fields=["_UnformattedID","Name","State","c_PiCustomField"]&hydrate=["State","c_PiCustomField"]
Query #2:
https://rally1.rallydev.com/analytics/v2.0/service/rally/workspace/11111/artifact/snapshot/query.js?find={"_ProjectHierarchy":2222,"_TypeHierarchy":"PortfolioItem/Feature","State":"Developing",__At: "current"}&fields=["_UnformattedID","Name","State","c_PiCustomField"]&hydrate=["State","c_PiCustomField"]
The first query uses time period: _ValidFrom: {$gte: "2013-06-01TZ",$lt: "2013-09-01TZ"}
The second query uses __At: "current"
Let's say I have just created a new custom field on PortfolioItem. It is not possible to create a custom field on PortfolioItem/Feature, so the field is created on PI, but both queries still use "_TypeHierarchy":"PortfolioItem/Feature".
After creating this custom field, called PiCustomField, I set its value for a specific Feature, F4.
The first query does not return a single snapshot that includes that field, because the field did not exist in the time period we look back on. We can't change the past.
The second query returns this field for F4. It does not return it for other Features because none of them has this field set.
Here is the screenshot: