Am I able to view a list of devices by partition in iothub? - azure-iot-hub

I have 2 nodes of a cluster receiving messages from iothub. I split their responsibility by partition. Node 1 reads from partitions 1,3,5,7,9 and the other 2,4,6,8, and 0. Recently, my partition 8 stops responding until I stop my code and restart it. It seems like a device is sending a message that locks up the partition. What I want to do is list all devices in my partition 8. Is that possible? Is there a cloud shell command to get those devices in a list?

Not sure this will help you, but you can see the partition on the incoming messages. For example you could use Azure Stream Analytics to see the partitions using this query:
Select GetMetadataPropertyValue(IoTHub, '[IoTHub].[ConnectionDeviceId]') as DeviceId, partitionId
from IoTHub
Also, if you run locally in VisualStudio it will tell you which device is sending malformed JSON. eg.
[Warning] 10/21/2021 9:12:54 AM : User Warning Source 'IoTHub' had 1 occurrences of kind 'InputDeserializerError.InvalidData' between processing times '2021-10-21T15:12:50.5076449Z' and '2021-10-21T15:12:50.5712076Z'. Could not deserialize the input event(s) from resource 'Partition: [1], Offset: [455266583232], SequenceNumber: [634800], DeviceId: [DeviceName]' as Json. Some possible reasons: 1) Malformed events 2) Input source configured with incorrect serialization format
Also check your "Activity Log" blade in the ASA job. It may have more details for you.

Related

Datastream Troubleshoot: "An unknown error occurred. Please try again. If the error persists, contact Google support"

We are trying to replicate data from AlloyDB to Bigquery using Datastream.
We Get "An unknown error occurred. Please try again. If the error persists, contact Google support."
In the Datastream console --> objects list, we see all source tables with Object Status "Failed" and Backfill status "Completed".
In Bigquery we see only a subset of the tables (not all the "Completed" objects were synced).
In the Logs Explorer I can see this error on BQ:
I also see this error: error: {
code: 11
message: "Unsupported primary key column either does not exist or is a pseudocolumn at [1:401]"
}
The column referred in the error is of type enum.
The desired situation is having all the AlloyDB tables replicated into Bigquery.
The error message is not very informative...
What does it mean?
What would be the best way to go about troubleshooting this?
We're actively working on making these error messages be more informative, and improvements are continuously being rolled out as we identify more edge cases. Assuming you followed all the steps in the documentation, then you may need to open a ticket with support for further investigation. If a support ticket isn't an option, you can still report the issue using the public issue tracker
I just had this same issue but connecting to a PostgreSQL in AWS RDS:
Beginning with Postgres 10, passwords are encrypted using SCRAM-SHA-256 in PostgreSQL. Google DataStream still expects MD5 password encryption, or it will generate an "unknown error" in the logs and fail the backfills.
You'll need to update your postgresql.conf (or RDS Cluster Parameter Group if you're using AWS like me):
password_encryption = 'MD5'
Restart the database and make sure the parameter has changed with:
SHOW password_encryption;
Reset the password of your users:
ALTER USER "{username}" with password '{password}';
More info from the PostgreSQL docs: https://www.postgresql.org/docs/current/auth-password.html

Spark - Failed to load collect frame - "RetryingBlockFetcher - Exception while beginning fetch"

We have a Scala Spark application, that reads something like 70K records from the DB to a data frame, each record has 2 fields.
After reading the data from the DB, we make minor mapping and load this as a broadcast for later usage.
Now, in local environment, there is an exception, timeout from the RetryingBlockFetcher while running the following code:
dataframe.select("id", "mapping_id")
.rdd.map(row => row.getString(0) -> row.getLong(1))
.collectAsMap().toMap
The exception is:
2022-06-06 10:08:13.077 task-result-getter-2 ERROR
org.apache.spark.network.shuffle.RetryingBlockFetcher Exception while
beginning fetch of 1 outstanding blocks
java.io.IOException: Failed to connect to /1.1.1.1:62788
at
org.apache.spark.network.client.
TransportClientFactory.createClient(Transpor .tClientFactory.java:253)
at
org.apache.spark.network.client.
TransportClientFactory.createClient(TransportClientFactory.java:195)
at
org.apache.spark.network.netty.
NettyBlockTransferService$$anon$2.
createAndStart(NettyBlockTransferService.scala:122)
In the local environment, I simply create the spark session with local "spark.master"
When I limit the max of records to 20K, it works well.
Can you please help? maybe I need to configure something in my local environment in order that the original code will work properly?
Update:
I tried to change a lot of Spark-related configurations in my local environment, both memory, a number of executors, timeout-related settings, and more, but nothing helped! I just got the timeout after more time...
I realized that the data frame that I'm reading from the DB has 1 partition of 62K records, while trying to repartition with 2 or more partitions the process worked correctly and I managed to map and collect as needed.
Any idea why this solves the issue? Is there a configuration in the spark that can solve this instead of repartition?
Thanks!

FusionPBX: Ring Group with external phone numbers?

I want to define a Ring Group that, when called, rings one extension and one external number (mobile phone). What is the best way to achieve that?
Right now only the extension is called. So just entering an external number in the Destination field does not work, the logs say
[NOTICE] switch_cpp.cpp:1376 [ring groups][call forward all] user_exists id <mobileno> <domainname>
and later
[DEBUG] switch_ivr_originate.c:3865 Originate Resulted in Error Cause: 27 [DESTINATION_OUT_OF_ORDER]
It will check all calls using the dialplan to see if the destination is a local number for an external number it should say user_exists false every time.
This [DESTINATION_OUT_OF_ORDER] indicates that it may not have found a matching outbound route that matches the number of digits of the external phone number. Or it may mean that your carrier rejected the call maybe didn't like the caller ID that was sent. Easiest thing to try is attempt it with an outbound rout to different carrier.
In case you weren't aware FusionPBX 4.4 was release on 5 April 2018. Instructions to upgrade are post on docs.fusionpbx.com. Search term upgrade (version upgrade).

File: 0: Unexpected from Google BigQuery load job

I've a compressed json file (900MB, newline delimited) and load into a new table via bq command and get the load failure:
e.g.
bq load --project_id=XXX --source_format=NEWLINE_DELIMITED_JSON --ignore_unknown_values mtdataset.mytable gs://xxx/data.gz schema.json
Waiting on bqjob_r3ec270ec14181ca7_000001461d860737_1 ... (1049s) Current status: DONE
BigQuery error in load operation: Error processing job 'XXX:bqjob_r3ec270ec14181ca7_000001461d860737_1': Too many errors encountered. Limit is: 0.
Failure details:
- File: 0: Unexpected. Please try again.
Why the error?
I tried again with the --max_bad_records, still not useful error message
bq load --project_id=XXX --source_format=NEWLINE_DELIMITED_JSON --ignore_unknown_values --max_bad_records 2 XXX.test23 gs://XXX/20140521/file1.gz schema.json
Waiting on bqjob_r518616022f1db99d_000001461f023f58_1 ... (319s) Current status: DONE
BigQuery error in load operation: Error processing job 'XXX:bqjob_r518616022f1db99d_000001461f023f58_1': Unexpected. Please try again.
And also cannot find any useful message in the console.
To BigQuery team, can you have a look using the job ID?
As far I know there are two error sections on a job. There is one error result, and that's what you see now. And there is a second, which should be a stream of errors. This second is important as you could have errors in it, but the actual job might succeed.
Also you can set the --max_bad_records=3 on the BQ tool. Check here for more params https://developers.google.com/bigquery/bq-command-line-tool
You probably have an error that is for each line, so you should try a sample set from this big file first.
Also there is an open feature request to improve the error message, you can star (vote) this ticket https://code.google.com/p/google-bigquery-tools/issues/detail?id=13
This answer will be picked up by the BQ team, so for them I am sharing that: We need an endpoint where we can query based on a jobid, the state, or the stream of errors. It would help a lot to get a full list of errors, it would help debugging the BQ jobs. This could be easy to implement.
I looked up this job in the BigQuery logs, and unfortunately, there isn't any more information than "failed to read" somewhere after about 930 MB have been read.
I've filed a bug that we're dropping important error information in one code path and submitted a fix. However, this fix won't be live until next week, and all that will do is give us more diagnostic information.
Since this is repeatable, it isn't likely a transient error reading from GCS. That means one of two problems: we have trouble decoding the .gz file, or there is something wrong with that particular GCS object.
For the first issue, you could try decompressing the file and re-uploading it as uncompressed. While it may sound like a pain to send gigabytes of data over the network, the good news is that the import will be faster since it can be done in parallel (we can't import a compressed file in parallel since it can only be read sequentially).
For the second issue (which is somewhat less likely) you could try downloading the file yourself to make sure you don't get errors, or try re-uploading the same file and seeing if that works.

Restart my delta loading after delete the infopackage in PSA by mistake

here i have got one issue.can some one please help me to resolve this.
i was trying to extract some data to DS 0FI_AP_6...
then in InfoPackage Monitor I can see like..
-->Requests (messages): Everything OK
-->Extraction (messages): Everything OK
-->Transfer (IDocs and TRFC): Missing messages or warnings
-->Info IDoc 2 : sent, not arrived ; IDoc ready for dispatch (ALE service)
Data Package 1 : 23752 Records arrived in BW
Data Package 2 : 15216 Records arrived in BW
Request IDoc : Application document posted
Info IDoc 1 : Application document posted
Info IDoc 3 : Application document posted
Info IDoc 4 : Application document posted
-->Processing (data packet): Everything OK
Data Package 1 ( 38672 Records ) : Everything OK
in Status Menu I am having message like...
Missing data packages for PSA Table
Diagnosis
Data packets are missing from PSA Table . BI processing does not
return any errors. The data transport from the source system to BI was
probably incorrect.
Procedure
Check the tRFC overview in the source system.
You access this log using the wizard or following the menu path
"Environment -> Transact. RFC -> Source System".
Error handling:
If the tRFC is incorrect, resolve the errors listed there.
Check that the source system is connected properly to BI. In
particular, check the remote user authorizations in BI.
Please suggest me how to resolve this issue...
thanks in advance for your help and quick reply is much appreciated.
But what the worst thing is I deleted the infopackage in PSA by mistake.
In the normal case, if I repeat the process again, the delta load would be OK, but now the delta load remains error.
so gurus,
1. how can I restart my delta loading correctly?
2. I want to modify the timestamp in the delta table, but how to do it ?
Go to T-Code RSA7 in the source system. This will tell you the date/timestamp that the delta is set to. If the date was changed to a range that no longer works then you will need to re-initialize the datasource in the BW system side. However, the Delta date may still be fine becauase it may have never been changed when you tried to first do your load because of the connection issues.
You can create a new infopackage and set the update to Initialize Datasource with Data Transfer. This will essentially run a full load from the datasource and then reset the delta pointer date/timestamp to when you ran it. This way you will capture all the data that you needed and anything that was already in the PSA should be overwritten.
Also note that you should delete or set the request status to red on the previous request that may contain bad data in the PSA.
From the original error it seems like you are having an RFC connection issue between the datasource and BW. Contact your BASIS support and have them check the connection to make sure it is good. To ensure that your datasource is extracting properly you can run t-code RSA3 on it in the source system. This will ensure that the extraction of data is working properly.