Azure HBASE REST - simple get failing for colfam:col - api

This should be a very simple one (been searching for a solution all day - read a thousand and a half posts).
I put a test row in my HBASE table in hbase shell:
put 'iEngine','testrow','SVA:SourceName','Journal of Fun'
I can get the value for a column family using the REST API in DHC Chrome:
https://ienginemaster.azurehdinsight.net/hbaserest/iEngine/testrow/SVA
I can't seem to get it for the specific cell:
https://ienginemaster.azurehdinsight.net/hbaserest/iEngine/testrow/SVA:SourceName
I get back a 400 error.
When successfully asking for just the family, I get back:
{
"Row": [{
"key": "dGVzdHJvdw==",
"Cell": [{
"column": "U1ZBOlNvdXJjZU5hbWU=",
"timestamp": 1440602453975,
"$": "Sm91cm5hbCBvZiBGdW4="
}]
}]
}
I tried replacing the encoded value for SVA:SourceName, and a thousand other things. I'm assuming I'm missing something simple.
Also, the following works:
hbase(main):012:0> get 'iEngine', 'testrow', 'SVA:SourceName'
COLUMN CELL
SVA:SourceName timestamp=1440602453975, value=Journal of Fun
1 row(s) in 0.0120 seconds
hbase(main):013:0>

I opened a case with Microsoft support. I received confirmation that it is a bug (IIS and the colon separator not working). They are working on a fix - they are slightly delayed as they decide on the "best" way to fix it.
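For anyone comparing the encoded values with what was stored: the JSON returned by the HBase REST endpoint base64-encodes the row key, the column name, and the cell value. Below is a minimal Python sketch that decodes the response shown in the question. The helper names `b64` and `decode_cells` are made up for illustration, and the payload is copied from the question.

```python
import base64
import json

# Response body copied from the question (column-family request).
RESPONSE = """
{"Row": [{"key": "dGVzdHJvdw==",
          "Cell": [{"column": "U1ZBOlNvdXJjZU5hbWU=",
                    "timestamp": 1440602453975,
                    "$": "Sm91cm5hbCBvZiBGdW4="}]}]}
"""


def b64(text):
    """Decode a base64 string into UTF-8 text."""
    return base64.b64decode(text).decode("utf-8")


def decode_cells(body):
    """Return (row key, column, value, timestamp) tuples (hypothetical helper)."""
    cells = []
    for row in json.loads(body)["Row"]:
        for cell in row["Cell"]:
            cells.append(
                (b64(row["key"]), b64(cell["column"]), b64(cell["$"]), cell["timestamp"])
            )
    return cells


if __name__ == "__main__":
    for cell in decode_cells(RESPONSE):
        print(cell)
```

This only confirms that the stored cell is SVA:SourceName with the expected value; it does not work around the 400 error on the cell URL.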

Writing the correct SQL statement in AWS IoT rule

I am working on AWS IoT to develop my custom solution based on a set of sensors, and I have a problem regarding how to write the SQL statement related to the kind of data I receive from a Zigbee sensor.
An example of what I receive from my sensor is reported here:
{
"type": "reportAttribute",
"from": "WIFI",
"deviceCode": "aws_device_code",
"to": "CLOUD",
"mac": "30:ae:7b:e2:e1:e6",
"time": 1668506014,
"data": {...}
}
What I would like to do is to select messages that have the from field equal to GREENPOWER, something along the lines of SELECT * FROM 'test' WHERE from = 'GREENPOWER', but from is also a keyword in SQL hence my problem. I am no expert whatsoever in SQL, so I am not sure how this can be done. I am also looking for a way to modify the received data, but solving this problem on AWS would be much easier.
Thank you very much for your help!
There are quite a lot of SQL functions that exist in AWS IoT Rule. You can find them here: https://docs.aws.amazon.com/iot/latest/developerguide/iot-sql-functions.html
In your case, something like this should work:
SELECT * FROM 'test' WHERE get(*, "from") = "GREENPOWER"

Querying an IoT Hub Object with number key

I am trying to retrieve information from a twin device of a third party IoT Hub instance.
The data I'm trying to access has the following format:
{
"properties": {
"reported": {
"softwareLoad": {
"0": {
"systemSWVer": "1.2.0",
"picSWVer": "0.0.42",
"bootStatus": "inactive",
"partitionId": 0
},
"1": {
"systemSWVer": "1.2.0",
"picSWVer": "0.0.42",
"bootStatus": "active",
"partitionId": 1
},
"beamTableCRC": "0x5454"
}
}
}
}
I was trying to reach the variable systemSWVer to use it in a WHERE clause, but each time I try to access the information under 0, IoT Hub returns a Bad Request error.
I tried with this query
SELECT properties.reported.softwareLoad FROM devices WHERE properties.reported.softwareLoad.0.systemSWVer in ["1.2.0", "2.1.0"]
Is there a way to use it as a where clause in my query?
Note: I don't have access to the resource to change the format of the information.
I tried to recreate this case and got the same result: there is always an error when the expression contains a number. I also found a workaround: instead of using 0 or 1, you can write them as escaped Unicode characters. It's important to use the ASCII code points, though: 0 becomes \u0030 and 1 becomes \u0031.
Try this query:
SELECT properties.reported.softwareLoad FROM devices
WHERE properties.reported.softwareLoad.\u0030.systemSWVer in ['1.2.0', '2.1.0']
Please note: in any case, you need to use single quotes for the string values, not the double quotes used in the original query.
Edit: in my excitement, I forgot to test if just escaping the integer would also work. It does.
SELECT properties.reported.softwareLoad FROM devices
WHERE properties.reported.softwareLoad.\0.systemSWVer in ['1.2.0', '2.1.0']
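If the query is built in code, the Unicode-escape workaround above can be applied automatically. Below is a minimal Python sketch that only builds the query string. The helper names `escape_segment` and `build_query` are hypothetical, and escaping multi-digit keys digit by digit is an untested assumption. Values containing single quotes are not handled.

```python
def escape_segment(segment):
    """Escape a purely numeric path segment as Unicode code points; leave others unchanged."""
    if segment.isascii() and segment.isdigit():
        # "0" -> \u0030, "1" -> \u0031 (each digit escaped separately; assumption for multi-digit keys)
        return "".join("\\u{:04x}".format(ord(ch)) for ch in segment)
    return segment


def build_query(path, values):
    """Build the twin query from the answer for a dotted property path and a list of string values."""
    escaped_path = ".".join(escape_segment(part) for part in path.split("."))
    quoted_values = ", ".join("'{}'".format(value) for value in values)
    return (
        "SELECT properties.reported.softwareLoad FROM devices "
        "WHERE {} in [{}]".format(escaped_path, quoted_values)
    )


if __name__ == "__main__":
    print(build_query("properties.reported.softwareLoad.0.systemSWVer", ["1.2.0", "2.1.0"]))
```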

Left join not working properly in Azure Stream Analytics

I'm trying to create a simple left join between two inputs (event hubs). The source of the inputs is an app function that processes a RabbitMQ queue and sends the messages to an event hub.
In my eventhub1 I have this data:
[{
"user": "user_aa_1"
}, {
"user": "user_aa_2"
}, {
"user": "user_aa_3"
}, {
"user": "user_cc_1"
}]
In my eventhub2 I have this data:
[{
"user": "user_bb_1"
}, {
"user": "user_bb_2"
}, {
"user": "user_bb_3"
}, {
"user": "user_cc_1"
}]
I use this SQL to create my left join:
select hub1.[user] h1,hub2.[user] h2
into thirdTestDataset
from hub1
left join hub2
on hub2.[user] = hub1.[user]
and datediff(hour,hub1,hub2) between 0 and 5
The test result looks OK...
The problem is when I try it with the job running: I get this result in the Power BI dataset...
Any idea why my left join isn't working like in any other SQL query?
I tested your SQL query and it works well for me too. So when you can't get the expected output after executing the ASA job, I suggest following the troubleshooting solutions in this document.
Based on your output, it seems that HUB2 becomes the left table. You could use the diagnostic logs in ASA to find the actual output of the job execution.
I tested the end-to-end using blob storage for input 1 and 2 and your sample and a PowerBI dataset as output and observed the expected result.
I think there are a few things that can go wrong with your query:
First, your join has a 5-hour window: basically, that means it looks at EH1 and EH2 for matches during that large window, so live results will be different from the sample input, for which you have only 1 row. Can you validate that you had no match during this 5-hour window?
Additionally, by default PBI streaming datasets are "hybrid datasets", so they accumulate results without a good way to know when a result was emitted, since there is no timestamp in your output schema. So you may also be seeing previous data here. I'd suggest a few things:
In Power BI, change the option of your dataset: disable "Historic data analysis" to remove caching of data.
Add a timestamp column to identify when the data was generated (the first line of your query will become: select System.timestamp() as time, hub1.[user] h1,hub2.[user] h2 ).
Let me know if it works for you.
Thanks,
JS (Azure Stream Analytics)
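To see why the 5-hour window can make live output differ from a one-row sample, here is a rough Python simulation of the join condition. It is not how Stream Analytics runs the query. The function name `left_join_window` and the timestamps are invented, and the time difference is measured as elapsed time rather than with the exact semantics of datediff(hour, ...).

```python
from datetime import datetime, timedelta


def left_join_window(left, right, max_hours=5):
    """Left-join (user, timestamp) events on user, keeping right events that
    arrive between 0 and max_hours hours after the left event."""
    rows = []
    for user, t_left in left:
        matches = [
            (user, user_right)
            for user_right, t_right in right
            if user_right == user
            and timedelta(0) <= t_right - t_left <= timedelta(hours=max_hours)
        ]
        # A left row without any match is still emitted, with None on the right side.
        rows.extend(matches or [(user, None)])
    return rows


# Invented timestamps: user_cc_1 is sent to hub2 three times during the day.
HUB1 = [
    ("user_aa_1", datetime(2024, 1, 1, 10, 0)),
    ("user_cc_1", datetime(2024, 1, 1, 10, 0)),
]
HUB2 = [
    ("user_cc_1", datetime(2024, 1, 1, 9, 0)),   # before the hub1 event: no match
    ("user_cc_1", datetime(2024, 1, 1, 11, 0)),  # inside the window: match
    ("user_cc_1", datetime(2024, 1, 1, 14, 0)),  # inside the window: match
]

if __name__ == "__main__":
    for row in left_join_window(HUB1, HUB2):
        print(row)
```

With repeated events inside the window, the same hub1 row is matched more than once, which is one way a running job can show more rows than the sample test.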

JSON-LD schema won't validate in SDTT

I've spent hours on this without solution. I'm having a terrible time identifying and correcting an error when validating in Google SDTT. After dozens of revisions, I continue to get "Missing ',' or ']' in array declaration" error. I'd appreciate if someone will take a look, make the needed corrections or show me what I'm overlooking. Here's the code snippet >> https://drive.google.com/drive/folders/1HNJgZrGa7_F6-7FuGCbL2Y0vFPGsX7MQ
Your top sameAs is a bit mixed up. I suspect you wanted to quote every URL and drop the last comma. e.g.
"sameAs" : [ "https://plus.google.com/100804793716209856515", "https://plus.google.com/115455274861158767219", "https://www.facebook.com/pg/TallentRoofingInc/about/", "https://www.yelp.com/biz/tallent-roofing-mckinney-2", "https://www.yelp.com/biz/tallent-roofing-melissa", "https://www.yelp.com/biz/tallent-roofing-el-paso-2", "https://www.yelp.com/biz/tallent-roofing-alpine", "https://www.yelp.com/biz/tallent-roofing-artesia" ]
Your graph is an array, but due to some bad closing of }s it directly includes properties. Remove the } on the line before "aggregateRating" and add one to the line after "reviewCount".
,
"aggregateRating" :
{
"@type": "AggregateRating",
"ratingValue" : "4.9",
"ratingCount": "57",
"reviewCount": "53"
}},
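A strict JSON parser run locally can help narrow down this kind of syntax error before pasting into SDTT. Below is a minimal Python sketch using the standard `json` module. The helper name `find_json_error` and the example.com URLs are placeholders, not taken from the original snippet.

```python
import json


def find_json_error(text):
    """Return None if text parses as JSON, otherwise a (line, column, message) tuple."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.lineno, err.colno, err.msg)
    return None


# Placeholder data: an unquoted URL and a trailing comma, similar to the sameAs problem.
BROKEN = '{"sameAs": ["https://example.com/a", https://example.com/b, ]}'
FIXED = '{"sameAs": ["https://example.com/a", "https://example.com/b"]}'

if __name__ == "__main__":
    print(find_json_error(BROKEN))  # position of the first problem the parser hits
    print(find_json_error(FIXED))
```

The parser stops at the first problem it finds, so the check may need to be repeated after each fix.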

PostgreSQL: Create Index in JSON Array

I am pretty new to PostgreSQL and not too familiar with SQL yet, but I'm trying to learn.
In my database I want to store huge JSON files (~2 million lines, 40 MB) and later query them as fast as possible. Right now it is too slow, so I figured indexing should do the trick.
The problem is I do not know how to index the file, since it is a bit tricky. I have been working on it the whole day now and am starting to get desperate.
My DB is called "replays" and the JSON column is "replay_files".
So my files look like this:
"replay": [
{
"data": {
"posX": 182,
"posY": 176,
"hero_name": "CDOTA_Unit_Hero_EarthSpirit"
},
"tick": 2252,
"type": "entity"
},
{
"data": {
"posX": 123,
"posY": 186,
"hero_name": "CDOTA_Unit_Hero_Puck"
},
"tick": 2252,
"type": "entity"
}, ...alot more lines... ]}
I tried to get all the entries with, say, hero_name: Puck.
So I tried this:
SELECT * FROM replays r, json_array_elements(r.replay_file#>'{replay}') obj WHERE obj->'data'->>'hero_name' = 'CDOTA_Unit_Hero_Puck';
This works, but only for smaller files.
So I want to create an index like this:
CREATE INDEX hero_name_index ON
replays ((json_array_elements(r.replay_file#>'{replay}')->'data'->'hero_name);
But it doesn't work. I have no idea how to reach that deep into the file and index this data.
I hope you understand my problem, since my English isn't the best, and that you can help me out here. I just don't know what else to try.
Kind regards and thanks a lot in advance
Peter