While reading in the knowledge center, the following is mentioned:
The TTL properties are not applied to data that already exists in the
Analytics Platform. You must set the TTL properties before you add
data.
So how can I remove existing logs before setting those properties?
You must use the Elasticsearch delete APIs to remove existing documents from Worklight Analytics.
Before using any of the Elasticsearch delete APIs, it is advised to back up your data first, as misuse of the APIs or an undesired query can result in permanent data loss.
Below is an example of how to delete client logs in a specified date range, assuming your instance of Elasticsearch is running on http://localhost:9500. This specific example deletes all client logs between October 1st and October 15th 2014.
curl -XDELETE 'http://localhost:9500/worklight/client_logs/_query' -d '
{
  "query": {
    "range": {
      "timestamp": {
        "gt": 1412121600000,
        "lt": 1413331200000
      }
    }
  }
}
'
You can delete any type of document using the path http://localhost:9500/worklight/{document_type}. The types of documents are app_activities, network_activities, notification_activities, client_logs, and server_logs.
When deleting documents, you can filter on two properties: "timestamp" or "daystamp", both represented in epoch time in milliseconds. Note that "daystamp" is simply the first timestamp for the given day (i.e. 12:00AM); see the sketch after the list below for an example that filters on it. The range query also accepts the following parameters:
gte - greater than or equal to
gt - greater than
lte - less than or equal to
lt - less than
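For example, here is a minimal Python sketch (an illustration only, not product documentation) that uses the same local Elasticsearch instance and the same delete-by-query endpoint shown above to remove all server_logs older than October 1st 2014 by filtering on "daystamp":

import requests

# Hypothetical example: delete all server_logs whose daystamp is before
# October 1st 2014 (1412121600000 = 2014-10-01 00:00 UTC in epoch milliseconds).
query = {"query": {"range": {"daystamp": {"lt": 1412121600000}}}}
resp = requests.delete("http://localhost:9500/worklight/server_logs/_query", json=query)
print(resp.status_code, resp.text)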
For more information, refer to the Elasticsearch delete and query APIs:
Delete by Query API
Queries
Range Query
I have a message coming from a topic through MQTT.
I need to change the names of the columns in the message.
The original message:
{
"timestamp": 1645722065088,
"Heart Rate Measurement": 24550,
"Energy Expended": 1900,
"RR-Interval": 1
}
I need to take just the timestamp and Heart Rate Measurement inside a rule:
SELECT "Heart Rate Measurement" as heartrate, timestamp as date FROM
'pulsewave/heart_rate'
The timestamp is easy to get, but the "Heart Rate Measurement" is not.
I ended up getting the following:
{
"heartrate": "Heart Rate Measurement",
"date": 1645722065088
}
Any tips on how to get the value inside "Heart Rate Measurement"? When I write it without the quotes, it isn't accepted.
The rule works for the timestamp attribute but not for Heart Rate Measurement, as the AWS IoT SQL syntax doesn't support spaces in attribute names.
From https://docs.aws.amazon.com/iot/latest/developerguide/iot-sql-reference.html
Attribute names with spaces in them can't be used as field names in the SQL statement. While the incoming payload can have attribute names with spaces in them, such names can't be used in the SQL statement. They will, however, be passed through to the outgoing payload if you use a wildcard (*) field name specification.
An alternate approach is to implement a Lambda function that projects your JSON payload to an equivalent without the spaces.
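For illustration, here is a minimal Python sketch of such a handler (the renamed keys simply mirror the rule above; this is one assumed way to project the payload, not a prescribed API):

def lambda_handler(event, context):
    # 'event' is assumed to be the incoming MQTT payload forwarded by the IoT rule.
    # Project it to space-free attribute names so a rule or downstream consumer
    # can reference them directly.
    return {
        "heartrate": event.get("Heart Rate Measurement"),
        "date": event.get("timestamp"),
    }

You could then invoke it from the rule, for example via the aws_lambda() external function, or via a Lambda rule action that republishes the transformed payload.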
How can I programmatically list the available Google BigQuery locations? I need a result similar to what is in the table on this page: https://cloud.google.com/bigquery/docs/locations.
As @shollyman has mentioned:
The BigQuery API does not expose the equivalent of a list locations call at this time.
So, you should consider filing a feature request on the issue tracker.
In the meantime, I wanted to add Option 3 to the two already proposed by @Tamir.
This is a somewhat naïve option with its pros and cons, but depending on your specific use case it can be useful and easily adapted to your application.
Step 1 - load the page (https://cloud.google.com/bigquery/docs/locations) HTML
Step 2 - parse and extract the needed info
Obviously, this is super simple to implement in any client of your choice.
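If you are not using Magnus, the same two steps can be sketched in Python (the regular expressions are assumptions about the current page markup and may need adjusting if the layout changes):

import re
import requests

# Step 1 - load the locations page HTML
html = requests.get("https://cloud.google.com/bigquery/docs/locations").text

# Step 2 - very rough extraction: pull table rows and strip the tags from each cell
for row in re.findall(r"<tr>(.*?)</tr>", html, re.S):
    cells = [re.sub(r"<[^>]+>", "", c).strip() for c in re.findall(r"<td>(.*?)</td>", row, re.S)]
    if cells:
        print(cells)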
As I am a huge BigQuery fan, I went through a "proof of concept" using the BigQuery tool Magnus.
I've created a workflow with just two tasks:
API Task - to load the page's HTML into the variable var_payload
and
BigQuery Task - to parse and extract the wanted info out of the HTML
The "whole" workflow really is just those two tasks.
The query I used in the BigQuery Task is:
CREATE TEMP FUNCTION decode(x STRING) RETURNS STRING
LANGUAGE js
OPTIONS (library="gs://my_bucket/he.js")
AS """
return he.decode(x);
""";
WITH t AS (
SELECT html,
REGEXP_EXTRACT_ALL(
REGEXP_REPLACE(html,
r'\n|<strong>|</strong>|<code>|</code>', ''),
r'<table>(.*?)</table>'
)[OFFSET(0)] x
FROM (SELECT '''<var_payload>''' AS html)
)
SELECT pos,
line[SAFE_OFFSET(0)] Area,
line[SAFE_OFFSET(1)] Region_Name,
decode(line[SAFE_OFFSET(2)]) Region_Description
FROM (
SELECT
pos, REGEXP_EXTRACT_ALL(line, '<td>(.*?)</td>') line
FROM t,
UNNEST(REGEXP_EXTRACT_ALL(x, r'<tr>(.*?)</tr>')) line
WITH OFFSET pos
WHERE pos > 0
)
As you can see, I used the he library. From its README:
he (for “HTML entities”) is a robust HTML entity encoder/decoder written in JavaScript. It supports all standardized named character references as per HTML, handles ambiguous ampersands and other edge cases just like a browser would ...
After the workflow is executed and those two steps are done, the result is in project.dataset.location_extraction, and we can query this table to make sure we've got what is expected.
Note: obviously, the parsing and extraction of the locations info is quite simplified and can surely be improved to be more resilient to changes in the source page layout.
Unfortunately, there is no API which provides the list of supported BigQuery locations.
I see two options which might be good for you:
Option 1
You can manually manage a list and expose it to your client via an API or any other means your application supports (you will need to follow BigQuery product updates to keep this list current).
Option 2
If your use case is to provide a list of the locations you are using to store your own data, you can call datasets.list to get the location of each of your datasets and display/use it in your app:
{
"kind": "bigquery#dataset",
"id": "id1",
"datasetReference": {
"datasetId": "datasetId",
"projectId": "projectId"
},
"location": "US"
}
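A minimal sketch of Option 2 using the Python google-cloud-bigquery client (assumes the library is installed and credentials are configured; the client method wraps the datasets.list API):

from google.cloud import bigquery

client = bigquery.Client()
locations = set()
for item in client.list_datasets():
    # Fetch the full dataset metadata to read its location reliably.
    dataset = client.get_dataset(item.reference)
    locations.add(dataset.location)
print(sorted(locations))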
I am using the Domino Data Service to access documents based on certain search criteria. One of my documents is:
{
"#href":"/rrdb.nsf/api/data/documents/unid/2FC3551DC5266A5088257E35001D5D2C",
"#unid":"2FC3551DC5266A5088257E35001D5D2C",
"#noteid":"922",
"#created":"2015-04-28T05:20:43Z",
"#modified":"2015-04-28T05:20:47Z",
"#authors":
["CN=domain/O=test",""
],
"#form":"Reservation",
"ApptUNID":"B0E582BBA2A39B5988257E35001D5D29",
"From":"CN=ram/O=test",
"AltFrom":"CN=ram/O=test",
"Chair":"CN=ram/O=test",
"AltChair":"CN=ram/O=test",
"Principal":"CN=ram/O=cisco",
"SequenceNum":1,
"ORGState":"5",
"ResourceType":"1",
"ResourceName":"Sedna/B17",
"Room":"Sedna/B17#test",
"Capacity":1,
"_ViewIcon":133,
"AppointmentType":"3",
"StartTimeZone":"Z=-3005$DO=0$ZN=India",
"EndTimeZone":"Z=-3005$DO=0$ZN=India",
"Topic":"2 hour meeting with sendna conference room",
"SendTo":"CN=Sedna/O=B17",
"PostedDate":"2015-04-28T05:20:43Z",
"Encrypt":"0",
"Categories":"",
"RouteServers":"CN=B16-PF-QA-055/O=test",
"RouteTimes":
["2015-04-28T05:20:43Z","2015-04-28T05:20:44Z"
],
"DeliveredDate":"2015-04-28T05:20:44Z",
"StartDate":"2015-04-28T05:15:00Z",
"StartTime":"2015-04-28T05:15:00Z",
"StartDateTime":"2015-04-28T05:15:00Z",
"EndDate":"2015-04-28T07:15:00Z",
"EndTime":"2015-04-28T07:15:00Z",
"EndDateTime":"2015-04-28T07:15:00Z",
"UpdateSeq":1,
"Author":"CN=ram/O=test",
"ResourceOwner":"",
"ReservedFor":"CN=ram/O=cisco",
"ReservedBy":"CN=ram/O=cisco",
"RQStatus":"A",
"Purpose":"2 hour meeting with sendna conference room",
"NoticeType":"A",
"Step":3,
"Site":"B17",
"ReserveDate":"2015-04-28T05:15:00Z"
}
I am using http://{host}/rrdb.nsf/api/data/collections/name/$Calendar?search=([SendTo] CONTAINS "CN=Sedna") to fetch this document, but it is not returning the record. If I use CONTAINS "Sedna" instead, then it works.
The internal representation of SendTo seems to be abbreviated ([Abbreviate]) and not canonical ([Canonicalize]). Thus, looking for CN=... doesn't return any result, since CN= and O= are not part of the indexed data.
Replace the search by:
[SendTo]="Sedna/B17"
Or optionally "Sedna/" if you only want to test that the exact name is Sedna.
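With the original URL, the corrected request would then look something like this (the {host} placeholder and any URL-encoding of the search value are left to your environment):
http://{host}/rrdb.nsf/api/data/collections/name/$Calendar?search=([SendTo]="Sedna/B17")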
I am using AWS Data Pipeline to save a text file to my S3 bucket from RDS. I would like the file name to have the date and the hour in it, like:
myfile-YYYYMMDD-HH.txt
myfile-20140813-12.txt
I have specified my S3DataNode FilePath as:
s3://mybucketname/out/myfile-#{format(myDateTime,'YYYY-MM-dd-HH')}.txt
When I try to save my pipeline I get the following error:
ERROR: Unable to resolve myDateTime for object:DataNodeId_xOQxz
According to the AWS Data Pipeline documentation for date and time functions, this is the proper syntax for using the format function.
When I save the pipeline using a "hard-coded" date and time, I don't get this error and my file is in my S3 bucket and folder as expected.
My thinking is that I need to define "myDateTime" somewhere or use a NOW() equivalent.
Can somebody tell me how to set "myDateTime" to the current time (e.g. NOW) or give a workaround so I can format the current time to be used in my FilePath?
I am not aware of an exact equivalent of NOW() in Data Pipeline. I tried using makeDate with no arguments (just for fun) to see if that worked... it did not.
The closest are runtime variables scheduledStartTime, actualStartTime, reportProgressTime.
http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-s3datanode.html
The following, for example, should work:
s3://mybucketname/out/myfile-#{format(#scheduledStartTime,'YYYY-MM-dd-HH')}.txt
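With that expression, a run scheduled for 2014-08-13 12:00 should produce a key like myfile-2014-08-13-12.txt; if you want the exact myfile-20140813-12.txt form from the question, a pattern along the lines of 'YYYYMMdd-HH' should give you that (assuming the format function accepts the same style of patterns shown in the docs).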
Just for fun, here is some more info on Parameters.
At the end of your pipeline JSON (click List Pipelines, select one, click Edit Pipeline, then click Export), you need to add a Parameters and/or Values object.
I use a myStartDate for backfill processes, which you can manipulate once it is passed in for ad hoc runs. You can give this a static default, but you can't set it to a dynamic value, so it is limited for regularly scheduled tasks. For realtime/scheduled dates, you need to use #scheduledStartTime, etc., as suggested. Here is a sample of setting up some Parameters and/or Values. Both show up under Parameters in the UI. These values can be used throughout your pipeline activities (shell, hive, etc.) with the #{myVariableToUse} notation.
"parameters": [
  {
    "helpText": "Put help text here",
    "watermark": "This shows if no default or value set",
    "description": "Label/Desc",
    "id": "myVariableToUse",
    "type": "string"
  }
]
And for Values:
"values": {
  "myS3OutLocation": "s3://some-bucket/path",
  "myThreshold": "30000"
}
You cannot add these directly in the UI (yet) but once they are there you can change and save the values.
I started using the sails.js framework a few months ago because I need its RESTful API.
In the first version, a simple "http://domain.com:1337/mymodel" returned all records of the connected MySQL database; however, after an update to v0.10.xx it returns only the first 30 results.
I searched the sails.js changelog, documentation, and various examples around the web and tried several ideas, but I can't figure out how to force sails.js to return all results again.
Does anybody have a solution for this?
Use sails.config.blueprints.defaultLimit for general record limits. This also serves as the default limit for populated associations. There's technically no way at the moment to specify "no limit" for blueprints, but you can set the limit to the max number value as long as you don't have more than 9 quadrillion records :)
config/blueprints.js
defaultLimit: Number.MAX_VALUE // Set to highest possible value
Use populate_limit in your route config options to set the populate limit on a per-route basis.
config/routes.js
"GET /user": {blueprint: 'find', populate_limit: 10}
Use populate_[alias]_limit in your route config options to set the populate limit for a particular association on a per-route basis (e.g. populate_pets_limit: 10)
config/routes.js
"GET /user": {blueprint: 'find', limit: 20, populate_limit: 10, populate_pets_limit: 5}
I'll make sure this all gets added to the docs!
defaultLimit: -1 brings back all rows
If you need to change only the populate limit, you can use populate_limit in sails.config.blueprints:
// defaultLimit: 30
populate_limit: 999 // default value for populate limit