Datastore API: filter by field then sort ascending (GQL)

I'm playing around with the Google Datastore API (runQuery method), and I am trying to run this GQL query string:
'Select * from Transaction Where User = "[Me]" ORDER BY Start[date] ASC'
Sending that JSON Object gives me the following error:
400
{
"error": {
"code": 400,
"message": "no matching index found. recommended index is:\n- kind: Transaction\n properties:\n - name: User\n - name: Start\n",
"status": "FAILED_PRECONDITION"
}
}
Alternatively, if I run this request:
{
"gqlQuery":
{
"allowLiterals": true,
"queryString": "SELECT * FROM Transaction WHERE User = \"[Me]\""
}
}
I get a 200 Response
200
{
"batch": {
"entityResultType": "FULL",
"entityResults": [
{
"entity": {
"key": {
"partitionId": {
"projectId": "project-id-5200999099906492774"
},
"path": [
{
"kind": "Transaction",
"id": "4693202737039992"
}
]
},....
or if I run this one to just order all the results:
'Select * from Transaction ORDER BY Start[date] ASC'
I get a 200 Response as well:
200
{
"batch": {
"entityResultType": "FULL",
"entityResults": [
{
"entity": {
"key": {
"partitionId": {
"projectId": "project-id-5200707080506492774"
},
"path": [
{
"kind": "Transaction",
"id": "5641081148407808"
}
]
},...
So how can I do both operations in one line?
UPDATE:
As recommended below, I have used the Google Cloud Platform console to update the indexes manually. You create a valid YAML file in Notepad and then use the upload tool (the three-vertical-dots button on the right side of the cloud console command-line tool) to place it on the server, then point to it from the command line. Here are my results so far:
nathaniel@project-id-5200707080555492774:~$ gcloud datastore create-indexes /home/nathaniel/index.yaml
Configurations to update:
descriptor: [/home/nathaniel/index.yaml]
type: [datastore indexes]
target project: [project-id-5200707044406492774]
Do you want to continue (Y/n)? y
nathaniel@project-id-5200707080506492774:~$
This is the Yaml file I used:
indexes:
- kind: Transaction
  properties:
  - name: User
    direction: asc
  - name: Period
    direction: asc
  - name: Status
    direction: asc
  - name: auditStatus
    direction: asc
  - name: role
    direction: asc
  - name: Start
    direction: desc
  - name: End
    direction: asc
Still unable to complete the query, but it may take time for the index to populate. I'll check back through the day and update my results. As of 1:35 PM EST, the indexes still don't seem to have updated.

As the error message says, in order to run a query with both a WHERE filter and an ORDER BY, you need a composite index on the User and Start properties of the Transaction kind. You can learn more about indexes at https://cloud.google.com/datastore/docs/concepts/indexes.
You can create indexes using the gcloud command-line tool. Refer to the documentation at https://cloud.google.com/sdk/gcloud/reference/datastore/create-indexes
Once your index is created/active, which may take a while depending on the amount of data you have, you should be able to run the first query.
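Once the index is serving, the same query can also be checked outside the raw REST call. Below is a minimal Python sketch (an illustration, not part of the original post) using the google-cloud-datastore client; it assumes application default credentials and the Transaction kind with the User and Start properties from the question:

from google.cloud import datastore

# Assumes application default credentials and the project taken from the environment.
client = datastore.Client()

# Equivalent of: SELECT * FROM Transaction WHERE User = "[Me]" ORDER BY Start ASC
# This only succeeds once the composite index on (User, Start) is active.
query = client.query(kind="Transaction")
query.add_filter("User", "=", "[Me]")
query.order = ["Start"]  # ascending; use "-Start" for descending

for entity in query.fetch():
    print(entity.key.id_or_name, dict(entity))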

Related

Apache Superset with MongoDB (NoSQL database)

I am using MongoDB. My task is to build dashboard charts for the data, so I am using Apache Superset. I connected MongoDB to Apache Drill, as it won't connect directly with Superset, and then connected Apache Drill to Apache Superset. My collection is nested. How can I process this nested data to use it for dashboard charts? My data looks as below:
{
"_id": {
"$oid": "6229d3cfdbfc81a8777e4821"
},
"jobs": [
{
"job_ID": {
"$oid": "62289ded8079821eb24760e0"
},
"New": false,
"Expired": false
},
{
"job_ID": {
"$oid": "6228a252fb4554dd5c48202a"
},
"New": true,
"Expired": true
},
{
"job_ID": {
"$oid": "622af1c391b290d34701af9f"
},
"New": true,
"Expired": false
}
],
"email": "mani2090996#ail.com"
}
I am querying it in Apache Drill as follows:
SELECT flat.fill FROM (SELECT FLATTEN(t.jobs) AS fill FROM mongo.recruitingdb.flatten.`Vendorjobs` t) flat WHERE flat.fill.New = flase;
And I am getting a parsing error:
org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: Encountered "." at line 1, column 123.
Superset doesn't really handle nested data very well. Drill does, however, so you'll have to craft queries to produce columns that can be visualized.
Take a look here: https://drill.apache.org/docs/json-data-model/
and here: https://drill.apache.org/docs/querying-complex-data-introduction/.
UPDATE:
Try the query below. The FROM clause may not be exactly right, but you should get the idea from this.
Note that you can access maps in Drill in two ways:
tablename.mapname.field OR
mapname['field']
You can do this for any level of nesting.
SELECT mongoTable.jobs.job_ID.`$oid` AS job_ID,
mongoTable.jobs.`New` AS new,
mongoTable.jobs.`Expired` AS expired
FROM
(
SELECT flatten(jobs) AS jobs
FROM mongo.recruitingdb.flatten.`Vendorjobs` AS t1
WHERE t1.jobs.New = false
) AS mongoTable

How to convert JSON to CSV and send it to BigQuery or a Google Cloud bucket

I'm new to NiFi and I want to convert a large amount of JSON data to CSV format.
This is what I am doing at the moment but it is not the expected result.
These are the steps:
Processors to create the access_token and send the request body using InvokeHTTP (this part works fine, so I won't name the processors since it gives the expected result), getting the response body in JSON.
Example of json response:
[
{
"results":[
{
"customer":{
"resourceName":"customers/123456789",
"id":"11111111"
},
"campaign":{
"resourceName":"customers/123456789/campaigns/222456422222",
"name":"asdaasdasdad",
"id":"456456546546"
},
"adGroup":{
"resourceName":"customers/456456456456/adGroups/456456456456",
"id":"456456456546",
"name":"asdasdasdasda"
},
"metrics":{
"clicks":"11",
"costMicros":"43068982",
"impressions":"2079"
},
"segments":{
"device":"DESKTOP",
"date":"2021-11-22"
},
"incomeRangeView":{
"resourceName":"customers/456456456/incomeRangeViews/456456546~456456456"
}
},
{
"customer":{
"resourceName":"customers/123456789",
"id":"11111111"
},
"campaign":{
"resourceName":"customers/123456789/campaigns/222456422222",
"name":"asdasdasdasd",
"id":"456456546546"
},
"adGroup":{
"resourceName":"customers/456456456456/adGroups/456456456456",
"id":"456456456546",
"name":"asdasdasdas"
},
"metrics":{
"clicks":"11",
"costMicros":"43068982",
"impressions":"2079"
},
"segments":{
"device":"DESKTOP",
"date":"2021-11-22"
},
"incomeRangeView":{
"resourceName":"customers/456456456/incomeRangeViews/456456546~456456456"
}
},
....etc....
]
}
]
Now I am using:
===>SplitJson ($[].results[])==>JoltTransformJSON with this spec:
[{
"operation": "shift",
"spec": {
"customer": {
"id": "customer_id"
},
"campaign": {
"id": "campaign_id",
"name": "campaign_name"
},
"adGroup": {
"id": "ad_group_id",
"name": "ad_group_name"
},
"metrics": {
"clicks": "clicks",
"costMicros": "cost",
"impressions": "impressions"
},
"segments": {
"device": "device",
"date": "date"
},
"incomeRangeView": {
"resourceName": "keywords_id"
}
}
}]
==>> MergeContent (here is the problem which I don't know how to fix)
Merge Strategy: Defragment
Merge Format: Binary Concatenation
Attribute Strategy: Keep Only Common Attributes
Maximum number of Bins: 5 (I tried 10, same result)
Delimiter Strategy: Text
Header: [
Footer: ]
Demarcator: ,
What is the result I get?
I get a JSON file that has only parts of the JSON data.
Example: I have 50k customer_ids in one JSON file, so I would like to send this data into a BigQuery table and have all the ids under the same field "customer_id".
MergeContent uses the split JSON files and combines them, but I still get 10k customer_ids per file, i.e. I end up with 5 files and not 1 file with 50k customer_ids.
After the MergeContent I use ==>>ConvertRecord with these settings:
Record Reader JsonTreeReader (Schema Access Strategy: InferSchema)
Record Writer CsvRecordWriter (
Schema Write Strategy: Do Not Write Schema
Schema Access Strategy: Inherit Record Schema
CSV Format: Microsoft Excel
Include Header Line: true
Character Set UTF-8
)
==>>UpdateAttribute (custom prop: filename: ${filename}.csv) ==>> PutGCSObject (this puts the data into the Google Cloud bucket; this step works fine - I am able to put files there)
With this approach I am UNABLE to send data to BigQuery. (After MergeContent I tried using PutBigQueryBatch and used this command in the bq shell to get the schema I need:
bq show --format=prettyjson some_data_set.some_table_in_that_data_set | jq '.schema.fields'
I filled in all the fields as needed. For Load file type I tried NEWLINE_DELIMITED_JSON, or CSV when I converted the data to CSV. I am not getting errors, but no data is uploaded into the table.)
What am I doing wrong? I basically want to map the data in such a way that each field's data ends up under the same field name.
The trick you are missing is using Records.
Instead of using X>SplitJson>JoltTransformJson>Merge>Convert>X, try just X>JoltTransformRecord>X with a JSON Reader and a CSV Writer. This skips a lot of inefficiency.
If you really need to split (and you should avoid splitting and merging unless totally necessary), you can use MergeRecord instead - again with a JSON Reader and CSV Writer. This would make your flow X>Split>Jolt>MergeRecord>X.
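If it helps to sanity-check the target shape outside NiFi, here is a small Python sketch (purely illustrative, not part of the flow; the input file name is assumed) that produces the same flat columns as the Jolt spec above:

import csv
import json

# Field mapping mirrors the Jolt shift spec from the question.
def flatten(result):
    return {
        "customer_id": result["customer"]["id"],
        "campaign_id": result["campaign"]["id"],
        "campaign_name": result["campaign"]["name"],
        "ad_group_id": result["adGroup"]["id"],
        "ad_group_name": result["adGroup"]["name"],
        "clicks": result["metrics"]["clicks"],
        "cost": result["metrics"]["costMicros"],
        "impressions": result["metrics"]["impressions"],
        "device": result["segments"]["device"],
        "date": result["segments"]["date"],
        "keywords_id": result["incomeRangeView"]["resourceName"],
    }

# "response.json" is a placeholder for the saved API response shown in the question.
with open("response.json") as f:
    pages = json.load(f)

rows = [flatten(r) for page in pages for r in page["results"]]

with open("output.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)

Every row ends up under the same column headers (customer_id, campaign_id, ...), which is the flat shape a single BigQuery table expects.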

Filter parameters for POST verify and place order requests for Performance Storage

I am trying to do a BPM and SoftLayer integration using a Java REST client. From my initial analysis (as well as help from Stack Overflow), I found:
Step 1) We need to get the getItemPrices list to have all the IDs for the next request.
https://username:api_key#api.softlayer.com/rest/v3/SoftLayer_Product_Package/2/getItemPrices?objectMask=mask[id,item[keyName,description],pricingLocationGroup[locations[id, name, longName]]]
and then do the verify and place order POST calls using the respective APIs.
I am stuck on Step 1), as the filtering here seems to be a bit tricky. I am getting a JSON response of over 20,000 lines.
I want to show similar data (just like the SL Performance Storage UI) on my custom BPM UI: one drop-down to select the type of storage, a 2nd to show the location, a 3rd to show the size, and a 4th for IOPS, where the user can select the items and place the request.
Here I found that SL does something similar to this to populate the drop-downs:
https://control.softlayer.com/sales/productpackage/getregions?_dc=1456386930027&categoryCode=performance_storage_iscsi&packageId=222&page=1&start=0&limit=25
Can't we have an implementation where we use control.softlayer.com, just like SL does, instead of api.softlayer.com? In that case we could use similar logic to display the data on the UI.
Thanks
Anupam
Here, using the API, are the steps for performance storage. For endurance storage the steps are similar; you just need to review the value for categoryCode and modify it if needed.
You can get the locations using this method:
http://sldn.softlayer.com/reference/services/SoftLayer_Product_Package/getRegions
You just need to know the package of the storage, e.g.
GET https://api.softlayer.com/rest/v3.1/SoftLayer_Product_Package/222/getRegions
Then you can get the storage sizes; for that you can use the SoftLayer_Product_Package::getItems or SoftLayer_Product_Package::getItemPrices methods with a filter, e.g.
GET https://api.softlayer.com/rest/v3.1/SoftLayer_Product_Package/222/getItemPrices?objectFilter={"itemPrices": {"categories": {"categoryCode": {"operation": "performance_storage_space"}},"locationGroupId": { "operation": "is null"}}}
Note: we are filtering the data to get the prices whose category code is "performance_storage_space", and we want the standard prices (locationGroupId = null).
Then you can get the IOPS; you can use the same approach as above, but there is a dependency between the IOPS and the storage space, e.g.
GET https://api.softlayer.com/rest/v3.1/SoftLayer_Product_Package/222/getItemPrices?objectFilter={"itemPrices": { "attributes": { "value": { "operation": 20 } }, "categories": { "categoryCode": { "operation": "performance_storage_iops" } }, "locationGroupId": { "operation": "is null" } } }
Note: in the example we assume the selected storage space was "20". The prices for IOPS have a record called attributes; this record tells us the valid storage spaces for the IOPS. Then we have other filters to get only the IOPS prices (categoryCode = performance_storage_iops), and we want only the standard prices (locationGroupId = null).
For selecting the storage type I do not think there is a method; the only way I see is to call the SoftLayer_Product_Package::getAllObjects method and filter the data to get the packages for endurance, performance and portable storage, as sketched below.
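As a rough illustration of that last point, here is a minimal Python sketch using the SoftLayer client to list packages; the object filter shown (matching package names that contain "Storage") is an assumption, so adjust it to the package names or types in your account:

import SoftLayer

client = SoftLayer.Client()
packageService = client['SoftLayer_Product_Package']

# Assumed filter: packages whose name contains "Storage".
objectFilter = {"name": {"operation": "*=Storage"}}
objectMask = "mask[id,name,keyName,type[keyName]]"

packages = packageService.getAllObjects(filter=objectFilter, mask=objectMask)
for package in packages:
    print(package['id'], package['name'], package.get('type', {}).get('keyName'))

From that list you can pick the package IDs for endurance, performance and portable storage and reuse them in the calls above.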
Just in case, here is an example using SoftLayer's Python client to place the order:
"""
Order a block storage (performance ISCSI).
Important manual pages:
http://sldn.softlayer.com/reference/services/SoftLayer_Product_Order
http://sldn.softlayer.com/reference/services/SoftLayer_Product_Order/verifyOrder
http://sldn.softlayer.com/reference/services/SoftLayer_Product_Order/placeOrder
http://sldn.softlayer.com/reference/services/SoftLayer_Product_Package
http://sldn.softlayer.com/reference/services/SoftLayer_Product_Package/getItems
http://sldn.softlayer.com/reference/services/SoftLayer_Location
http://sldn.softlayer.com/reference/services/SoftLayer_Location/getDatacenters
http://sldn.softlayer.com/reference/services/SoftLayer_Network_Storage_Iscsi_OS_Type
http://sldn.softlayer.com/reference/services/SoftLayer_Network_Storage_Iscsi_OS_Type/getAllObjects
http://sldn.softlayer.com/reference/datatypes/SoftLayer_Location
http://sldn.softlayer.com/reference/datatypes/SoftLayer_Container_Product_Order_Network_Storage_Enterprise
http://sldn.softlayer.com/reference/datatypes/SoftLayer_Product_Item_Price
http://sldn.softlayer.com/blog/cmporter/Location-based-Pricing-and-You
http://sldn.softlayer.com/blog/bpotter/Going-Further-SoftLayer-API-Python-Client-Part-3
http://sldn.softlayer.com/article/Object-Filters
http://sldn.softlayer.com/article/Python
http://sldn.softlayer.com/article/Object-Masks
License: http://sldn.softlayer.com/article/License
Author: SoftLayer Technologies, Inc. <sldn@softlayer.com>
"""
import SoftLayer
import json
# Values "AMS01", "AMS03", "CHE01", "DAL05", "DAL06" "FRA02", "HKG02", "LON02", etc.
location = "AMS01"
# Values "20", "40", "80", "100", etc.
storageSize = "40"
# Values between "100" and "6000" by intervals of 100.
iops = "100"
# Values "Hyper-V", "Linux", "VMWare", "Windows 2008+", "Windows GPT", "Windows 2003", "Xen"
os = "Linux"
PACKAGE_ID = 222
client = SoftLayer.Client()
productOrderService = client['SoftLayer_Product_Order']
packageService = client['SoftLayer_Product_Package']
locationService = client['SoftLayer_Location']
osService = client['SoftLayer_Network_Storage_Iscsi_OS_Type']
objectFilterDatacenter = {"name": {"operation": location.lower()}}
objectFilterStorageNfs = {"items": {"categories": {"categoryCode": {"operation": "performance_storage_iscsi"}}}}
objectFilterOsType = {"name": {"operation": os}}
try:
    # Getting the datacenter.
    datacenter = locationService.getDatacenters(filter=objectFilterDatacenter)
    # Getting the performance storage NFS prices.
    itemsStorageNfs = packageService.getItems(id=PACKAGE_ID, filter=objectFilterStorageNfs)
    # Getting the storage space prices
    objectFilter = {
        "itemPrices": {
            "item": {
                "capacity": {
                    "operation": storageSize
                }
            },
            "categories": {
                "categoryCode": {
                    "operation": "performance_storage_space"
                }
            },
            "locationGroupId": {
                "operation": "is null"
            }
        }
    }
    pricesStorageSpace = packageService.getItemPrices(id=PACKAGE_ID, filter=objectFilter)
    # If the prices list is empty that means that the storage space value is invalid.
    if len(pricesStorageSpace) == 0:
        raise ValueError('The storage space value: ' + storageSize + ' GB, is not valid.')
    # Getting the IOPS prices
    objectFilter = {
        "itemPrices": {
            "item": {
                "capacity": {
                    "operation": iops
                }
            },
            "attributes": {
                "value": {
                    "operation": storageSize
                }
            },
            "categories": {
                "categoryCode": {
                    "operation": "performance_storage_iops"
                }
            },
            "locationGroupId": {
                "operation": "is null"
            }
        }
    }
    pricesIops = packageService.getItemPrices(id=PACKAGE_ID, filter=objectFilter)
    # If the prices list is empty that means that the IOPS value is invalid for the configured storage space.
    if len(pricesIops) == 0:
        raise ValueError('The IOPS value: ' + iops + ', is not valid for the storage space: ' + storageSize + ' GB.')
    # Getting the OS.
    os = osService.getAllObjects(filter=objectFilterOsType)
    # Building the order template.
    orderData = {
        "complexType": "SoftLayer_Container_Product_Order_Network_PerformanceStorage_Iscsi",
        "packageId": PACKAGE_ID,
        "location": datacenter[0]['id'],
        "quantity": 1,
        "prices": [
            {
                "id": itemsStorageNfs[0]['prices'][0]['id']
            },
            {
                "id": pricesStorageSpace[0]['id']
            },
            {
                "id": pricesIops[0]['id']
            }
        ],
        "osFormatType": os[0]
    }
    # verifyOrder() will check your order for errors. Replace this with a call to
    # placeOrder() when you're ready to order. Both calls return a receipt object
    # that you can use for your records.
    response = productOrderService.verifyOrder(orderData)
    print(json.dumps(response, sort_keys=True, indent=2, separators=(',', ': ')))
except SoftLayer.SoftLayerAPIError as e:
    print("Unable to place the order. faultCode=%s, faultString=%s" % (e.faultCode, e.faultString))

Square Connect API batch processing

I need assistance with batch processing, especially in adding tax codes to items.
I'm experimenting with the Square batch processing feature, and my sample case is to create 2 items and add the tax code to them. In all, 4 requests - 2 to create the items, 2 to 'put' the tax code. I have tried the following orders:
1. create the two items; add the taxes
2. create one item; add tax code to that item; create second item, add code to the second item.
In both instances, the result is the same - the taxes are applied to only one item. For the second item, the response I get is:
{
"status_code":404,
"body":{
"type":"not_found",
"message":"NotFound"
},
"request_id":4
}
To help with the investigation, here's the sample JSON that I use in the cURL request:
{
"requests":[
{
"method":"POST",
"relative_path":"\/v1\/me\/items",
"access_token":"XXX-YYY",
"body":
{
"id":126,
"name":"TestItem",
"description":"TestItemDescription",
"category_id":"DF1F51FB-11D6-4232-B138-2ECE3D89D206",
"variations":[
{
"name":"var1",
"pricing_type":"FIXED_PRICING",
"price_money":
{
"currency_code":"CAD",
"amount":400
},
"sku":"123444:QWEFASDERRG"
}
]},
"request_id":1
},
{
"method":"PUT",
"relative_path":"\/v1\/me\/items\/126\/fees\/7F2D50D8-43C1-4518-8B8D-881CBA06C7AB",
"access_token":"XXX-YYY",
"request_id":2
},
{
"method":"POST",
"relative_path":"\/v1\/me\/items",
"access_token":"XXX-YYY",
"body":
{
"id":127,
"name":"TestItem1",
"description":"TestItemDescription1",
"category_id":"DF1F51FB-11D6-4232-B138-2ECE3D89D206",
"variations":[
{
"name":"var1",
"pricing_type":"FIXED_PRICING",
"price_money":
{
"currency_code":"CAD",
"amount":400
},
"sku":"123444:QWEFASDERRG1"
}
]
},
"request_id":3
},
{
"method":"PUT",
"relative_path":"\/v1\/me\/items\/127\/fees\/7F2D50D8-43C1-4518-8B8D-881CBA06C7AB",
"access_token":"XXX-YYY",
"request_id":4
}
]
}
Below is the full response that I receive, indicating successful creation of the two items and only one successful tax push.
[
{
"status_code":200,
"body":
{
"visibility":"PUBLIC",
"available_online":false,
"available_for_pickup":false,
"id":"126",
"description":"TestItemDescription",
"name":"TestItem",
"category_id":"DF1F51FB-11D6-4232-B138-2ECE3D89D206",
"category":
{
"id":"DF1F51FB-11D6-4232-B138-2ECE3D89D206",
"name":"Writing Instruments"
},
"variations":[
{
"pricing_type":"FIXED_PRICING",
"track_inventory":false,
"inventory_alert_type":"NONE",
"id":"4c70909b-90bd-4742-b772-e4fabe636557",
"name":"var1",
"price_money":
{
"currency_code":"CAD",
"amount":400
},
"sku":"123444:QWEFASDERRG",
"ordinal":1,
"item_id":"126"
}
],
"modifier_lists":[],
"fees":[],
"images":[]
},
"request_id":1
},
{
"status_code":200,
"body":{},
"request_id":2
},
{
"status_code":200,
"body":
{
"visibility":"PUBLIC",
"available_online":false,
"available_for_pickup":false,
"id":"127",
"description":"TestItemDescription1",
"name":"TestItem1",
"category_id":"DF1F51FB-11D6-4232-B138-2ECE3D89D206",
"category":
{
"id":"DF1F51FB-11D6-4232-B138-2ECE3D89D206",
"name":"Writing Instruments"
},
"variations":[
{
"pricing_type":"FIXED_PRICING",
"track_inventory":false,
"inventory_alert_type":"NONE",
"id":"6de8932f-603e-4cd9-99ad-67f6c7777ffd",
"name":"var1",
"price_money":
{
"currency_code":"CAD",
"amount":400
},
"sku":"123444:QWEFASDERRG1",
"ordinal":1,
"item_id":"127"
}
],
"modifier_lists":[],
"fees":[],
"images":[]
},
"request_id":3
},
{
"status_code":404,
"body":
{
"type":"not_found",
"message":"NotFound"
},
"request_id":4
}
]
I have checked by retrieving the list of items, and both items, with their item IDs, are present in the inventory. So the questions I have are: why is the tax applied to one item and not the other, and how do I resolve it?
From the Square docs:
Note the following when using the Submit Batch endpoint:
- You cannot include more than 30 requests in a single batch.
- Recursive requests to the Submit Batch endpoint are not allowed (i.e., none of the requests included in a batch can itself be a request to this endpoint).
- There is no guarantee of the order in which batched requests are performed.
(emphasis mine).
If you want to use the batch API, you will have to create parent entities like items first, then in a separate batch request apply any child entities like fees, discounts, etc. Alternatively, you can just make separate requests. There may not be much benefit from using the batch API in this case.
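To illustrate the two-phase approach, here is a rough Python sketch using the requests library; the /v1/batch endpoint path, the placeholder access token, and the abbreviated item bodies are taken from (or assumed from) the question, so treat this as a sketch rather than a drop-in solution:

import requests

ACCESS_TOKEN = "XXX-YYY"  # placeholder token from the question
BASE = "https://connect.squareup.com"
HEADERS = {"Authorization": "Bearer " + ACCESS_TOKEN,
           "Content-Type": "application/json"}

def submit_batch(batch_requests):
    # Assumed Submit Batch endpoint; each element follows the structure shown in the question.
    resp = requests.post(BASE + "/v1/batch",
                         json={"requests": batch_requests},
                         headers=HEADERS)
    resp.raise_for_status()
    return resp.json()

# Phase 1: create the parent entities (the two items); bodies abbreviated.
create_items = [
    {"method": "POST", "relative_path": "/v1/me/items", "access_token": ACCESS_TOKEN,
     "body": {"id": 126, "name": "TestItem"}, "request_id": 1},
    {"method": "POST", "relative_path": "/v1/me/items", "access_token": ACCESS_TOKEN,
     "body": {"id": 127, "name": "TestItem1"}, "request_id": 2},
]
print(submit_batch(create_items))

# Phase 2: only after the items exist, attach the child entities (the fee).
apply_fees = [
    {"method": "PUT",
     "relative_path": "/v1/me/items/126/fees/7F2D50D8-43C1-4518-8B8D-881CBA06C7AB",
     "access_token": ACCESS_TOKEN, "request_id": 3},
    {"method": "PUT",
     "relative_path": "/v1/me/items/127/fees/7F2D50D8-43C1-4518-8B8D-881CBA06C7AB",
     "access_token": ACCESS_TOKEN, "request_id": 4},
]
print(submit_batch(apply_fees))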

Keen IO: I can't delete a specific event using an extraction query filter

Using this extraction query (shown URL-decoded for readability):
https://api.keen.io/3.0/projects/xxx/queries/extraction?api_key=xxxx&event_collection=dispatched-orders&filters=[{"property_name":"features.tradeId","operator":"eq","property_value":8581}]&timezone=28800
it returns:
{
result: [
{
mobile: "13185716746",
keen : {
timestamp: "2015-02-10T07:10:07.816Z",
created_at: "2015-02-10T07:10:08.725Z",
id: "54d9aed03bc6964a7d311f9e"
},
data : {
itemId: 2130,
num: 1
},
features: {
communityId: 2000,
dispatcherId: 39,
tradeId: 8581
}
}
]
}
But if I use the same filters in my delete query URL (shown URL-decoded for readability):
https://api.keen.io/3.0/projects/xxxxx/events/dispatched-orders?api_key=xxxxxx&filters=[{"property_name":"features.tradeId","operator":"eq","property_value":8581}]&timezone=28800
it returns:
{
properties: {
data.num: "num",
keen.created_at: "datetime",
mobile: "string",
keen.id: "string",
features.communityId: "num",
features.dispatcherId: "num",
keen.timestamp: "datetime",
features.tradeId: "num",
data.itemId: "num"
}
}
Please help me...
It looks like you are issuing a GET request for the delete command. If you perform a GET on a collection, you get back the schema that Keen has inferred for that collection.
You'll want to issue the above as a DELETE request. Here's the cURL command to do that:
curl -X DELETE 'https://api.keen.io/3.0/projects/xxxxx/events/dispatched-orders?api_key=xxxxxx&filters=[{"property_name":"features.tradeId","operator":"eq","property_value":8581}]&timezone=28800'
Note that you'll probably need to URL encode that JSON as you mentioned in your above post!
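If it is easier, the same DELETE can be issued from Python with the requests library, which URL-encodes the filters JSON for you; a minimal sketch using the placeholder project ID and key from the question:

import json
import requests

# Placeholders from the question; substitute your real project ID and API key.
url = "https://api.keen.io/3.0/projects/xxxxx/events/dispatched-orders"
params = {
    "api_key": "xxxxxx",
    "filters": json.dumps([{"property_name": "features.tradeId",
                            "operator": "eq",
                            "property_value": 8581}]),
    "timezone": 28800,
}

# requests URL-encodes the query parameters, including the filters JSON.
response = requests.delete(url, params=params)
print(response.status_code, response.text)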