I'm setting up a RabbitMQ (v3.8.0) cluster with high availability.
To enable message persistence, I set the durable parameter of my exchanges and queues to true.
{
  "exchanges": [
    {
      "name": "my_direct_exchange",
      "vhost": "my_vhost",
      "type": "direct",
      "durable": true,
      "auto_delete": false,
      "internal": false,
      "arguments": {}
    }
  ],
  "queues": [
    {
      "name": "my_queue_direct",
      "vhost": "my_vhost",
      "durable": true,
      "auto_delete": false,
      "arguments": {}
    }
  ]
}
Then, it seems there are two choices:
Either sending messages with delivery_mode=2
Or setting lazy mode on the queues (via a policy)
"policies": [
{
"vhost": "my_vhost",
"name": "my_policy",
"pattern": "",
"apply-to": "all",
"definition": {
"ha-mode": "all",
"ha-sync-mode": "automatic",
"queue-mode": "lazy"
}
}
]
Both of these choices store messages on disk.
What is the difference between them?
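For illustration, this is roughly what option 1 (delivery_mode=2) looks like with the Python pika client; a minimal sketch, where my_key is a placeholder routing key:

import pika

connection = pika.BlockingConnection(
    pika.ConnectionParameters(host="localhost", virtual_host="my_vhost")
)
channel = connection.channel()

# delivery_mode=2 marks this individual message as persistent, so the broker
# writes it to disk once it lands in a durable queue.
channel.basic_publish(
    exchange="my_direct_exchange",
    routing_key="my_key",
    body=b"some payload",
    properties=pika.BasicProperties(delivery_mode=2),
)

connection.close()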
To enable message persistence, I set the durable parameter of my exchanges and queues to true.
To clarify, the durable parameter for exchanges and queues does not affect individual message persistence. The durable parameter ensures that those exchanges and queues survive broker restarts. True, if you have a non-durable queue with persistent messages and restart the broker, that queue and those messages will be lost, so the durable parameter is important.
You should use the persistent flag even with lazy queues. Why? Because you should also be using publisher confirms, and a message will only be confirmed once it has been written to disk when persistent is set.
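A minimal sketch of what that looks like with the Python pika client (exchange name reused from the question, my_key is a placeholder routing key; an illustration, not the only way to do it):

import pika

connection = pika.BlockingConnection(
    pika.ConnectionParameters(host="localhost", virtual_host="my_vhost")
)
channel = connection.channel()

# Put the channel into publisher-confirm mode: basic_publish now blocks until
# the broker acknowledges the message, and raises if it is returned unroutable.
channel.confirm_delivery()

try:
    channel.basic_publish(
        exchange="my_direct_exchange",
        routing_key="my_key",
        body=b"some payload",
        properties=pika.BasicProperties(delivery_mode=2),  # persistent message
        mandatory=True,
    )
    print("message confirmed by the broker")
except pika.exceptions.UnroutableError:
    print("message was returned as unroutable")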
NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.
Related
As per the property files below, the expected behavior is to route events from "table.include.list": "dbo.idm_assets,dbo.idm_datapoints" to the queue debezium_assets, and from "table.include.list": "dbo.idm_workorder,dbo.idm_activity" to the queue debezium_events.
{
  "name": "sql-server-kafkacdc-idm-connection",
  "config": {
    "snapshot.mode": "initial_schema_only",
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "localhost",
    "database.port": "1433",
    "database.user": "cdc_idm_owner",
    "database.password": "cdcIdm#2022",
    "database.dbname": "kafkacdc",
    "database.server.name": "kafkacdc",
    "tasks.max": "1",
    "decimal.handling.mode": "string",
    "tombstones.on.delete": false,
    "table.include.list": "dbo.idm_assets,dbo.idm_datapoints",
    "transforms": "Reroute",
    "transforms.Reroute.type": "io.debezium.transforms.ByLogicalTableRouter",
    "transforms.Reroute.topic.regex": "(.*)",
    "transforms.Reroute.topic.replacement": "debezium_assets",
    "database.history": "io.debezium.relational.history.MemoryDatabaseHistory"
  }
}
{
  "name": "sql-server-kafkacdc-idm-all-001",
  "config": {
    "snapshot.mode": "initial_schema_only",
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "localhost",
    "database.port": "1433",
    "database.user": "cdc_idm_owner",
    "database.password": "cdcIdm#2022",
    "database.dbname": "kafkacdc",
    "database.server.name": "kafkacdc",
    "tasks.max": "1",
    "decimal.handling.mode": "string",
    "tombstones.on.delete": false,
    "table.include.list": "dbo.idm_workorder,dbo.idm_activity",
    "transforms": "Reroute",
    "transforms.Reroute.type": "io.debezium.transforms.ByLogicalTableRouter",
    "transforms.Reroute.topic.regex": "(.*)",
    "transforms.Reroute.topic.replacement": "debezium_events",
    "database.history": "io.debezium.relational.history.MemoryDatabaseHistory"
  }
}
But currently, events are being sent to both queues, even when we trigger an event from a table that belongs to the other property file, which isn't expected.
Observation: we deleted one of the connectors (Assets) and tried to reproduce the issue. We triggered an event from an Assets table and could still see it in the Events queue, which is not expected.
Please find the current and expected behaviors below:
Current Behavior
Expected Behavior
Currently it seems that the Web activity is broken.
When using a simple pipeline:
{
  "name": "pipeline1",
  "properties": {
    "activities": [
      {
        "name": "Webactivity",
        "type": "WebActivity",
        "dependsOn": [],
        "policy": {
          "timeout": "7.00:00:00",
          "retry": 0,
          "retryIntervalInSeconds": 30,
          "secureOutput": false,
          "secureInput": false
        },
        "userProperties": [],
        "typeProperties": {
          "url": "https://www.microsoft.com/",
          "connectVia": {
            "referenceName": "AutoResolveIntegrationRuntime",
            "type": "IntegrationRuntimeReference"
          },
          "method": "GET",
          "body": ""
        }
      }
    ],
    "annotations": []
  }
}
When debugging, it never finishes: it stays "In progress" for several minutes.
I tried a Web hook activity and it works.
Is there something else I could try?
A quick note on the "never finishes" issue: one of my pet peeves with Data Factory is that the default timeout for all activities is 7 DAYS. While I've had a few activities that needed to run for 7 hours, a WEEK is a ridiculous default timeout value. One of the first things I do in any production scenario is address the timeout values of all the activities.
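For example, the activity's policy block could be tightened to something like this (ten minutes is just an illustrative value, pick whatever fits your workload):

"policy": {
  "timeout": "0.00:10:00",
  "retry": 0,
  "retryIntervalInSeconds": 30,
  "secureOutput": false,
  "secureInput": false
}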
As to the Web activity question: I set up a quick example in my test bed and it returned just fine.
Looking at the generated code, the only real difference I see is the absence of the "connectVia" property that was in your example:
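In other words, the typeProperties of my working activity were essentially your JSON without the connectVia reference, roughly:

"typeProperties": {
  "url": "https://www.microsoft.com/",
  "method": "GET",
  "body": ""
}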
OK, I've found it.
The default AutoResolveIntegrationRuntime only had the managed virtual network option, which I couldn't change, so I created a new integration runtime with the public network setting.
This is a little bit strange, as I started today with a brand new Azure Data Factory.
I wonder why I cannot change the default integration runtime to disable the virtual network.
I am trying to create an auto-remediation process that will stop/delete any VPC, CloudFormation stack, Lambda function, internet gateway, or EC2 instance created outside of the eu-central-1 region. My first step is to configure a CloudWatch Events rule to detect any of the previously mentioned events.
{
  "source": [
    "aws.cloudtrail"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventSource": [
      "ec2.amazonaws.com",
      "cloudformation.amazonaws.com",
      "lambda.amazonaws.com"
    ],
    "eventName": [
      "CreateStack",
      "CreateVpc",
      "CreateFunction20150331",
      "CreateInternetGateway",
      "RunInstances"
    ],
    "awsRegion": [
      "us-east-1",
      "us-east-2",
      "us-west-1",
      "us-west-2",
      "ap-northeast-1",
      "ap-northeast-2",
      "ap-south-1",
      "ap-southeast-1",
      "ap-southeast-2",
      "ca-central-1",
      "eu-west-1",
      "eu-west-2",
      "eu-west-3",
      "sa-east-1"
    ]
  }
}
For now, the event should only trigger an SNS topic that sends me an email, but in the future there will be a Lambda function to do the remediation.
Unfortunately, when I create an internet gateway in another region (let's say eu-west-1), no notification occurs. The event does not appear when I try to set an alarm on it either, although it does show up in CloudWatch Events.
Any idea what could be wrong with my event config?
OK, I figured it out. The source of the event changes even if the notification comes from CloudTrail. The "source" parameter should therefore be:
"source": [
"aws.cloudtrail",
"aws.ec2",
"aws.cloudformation",
"aws.lambda"
]
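For completeness, merging that corrected "source" list back into the original pattern gives something like:

{
  "source": [
    "aws.cloudtrail",
    "aws.ec2",
    "aws.cloudformation",
    "aws.lambda"
  ],
  "detail-type": [
    "AWS API Call via CloudTrail"
  ],
  "detail": {
    "eventSource": [
      "ec2.amazonaws.com",
      "cloudformation.amazonaws.com",
      "lambda.amazonaws.com"
    ],
    "eventName": [
      "CreateStack",
      "CreateVpc",
      "CreateFunction20150331",
      "CreateInternetGateway",
      "RunInstances"
    ],
    "awsRegion": [
      "us-east-1",
      "us-east-2",
      "us-west-1",
      "us-west-2",
      "ap-northeast-1",
      "ap-northeast-2",
      "ap-south-1",
      "ap-southeast-1",
      "ap-southeast-2",
      "ca-central-1",
      "eu-west-1",
      "eu-west-2",
      "eu-west-3",
      "sa-east-1"
    ]
  }
}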
We're using Kafka Connect [distributed, Confluent 4.0].
It works very well, except that there always remain uncommitted messages in the topic the connector listens to. The behavior is probably related to the S3 connector configuration "flush.size": "20000"; the lag in the topic is always below the flush size.
Our data comes in batches. I don't want to wait until the next batch arrives, nor reduce flush.size and create tons of files.
Is there a way to set a timeout after which the S3 connector will flush the data even if it hasn't reached 20,000 events?
thanks!
"config": {
"connector.class": "io.confluent.connect.s3.S3SinkConnector",
"topics": "event",
"tasks.max": "3",
"topics.dir": "connect",
"s3.region": "some_region",
"s3.bucket.name": "some_bucket",
"s3.part.size": "5242880",
"flush.size": "20000",
"storage.class": "io.confluent.connect.s3.storage.S3Storage",
"format.class": "io.confluent.connect.s3.format.avro.AvroFormat",
"schema.generator.class": "io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator",
"schema.compatibility": "FULL",
"partitioner.class": "io.confluent.connect.storage.partitioner.TimeBasedPartitioner",
"path.format": "'\''day_ts'\''=YYYYMMdd/'\''hour_ts'\''=H",
"partition.duration.ms": "3600000",
"locale": "en_US",
"timezone": "UTC",
"timestamp.extractor": "RecordField",
"timestamp.field": "time"
}
}
To flush outstanding records periodically on low-volume topics with the S3 connector, you may use the configuration property:
rotate.schedule.interval.ms
(Complete list of configs here)
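For example, added alongside flush.size in your connector's "config" block it might look like this (60000 ms, i.e. one minute, is just an illustrative value; scheduled rotation works together with the timezone setting, which you already have as UTC):

"flush.size": "20000",
"rotate.schedule.interval.ms": "60000"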
Keep in mind that by using the property above you might see duplicate messages in the event of reprocessing or recovery from errors, regardless of which partitioner you are using.
In RabbitMQ, is there an easy/simple way to migrate custom exchanges with their bindings and queues from one vhost to another (vhost dev to vhost stage)?
thanks
Since version 3.6.1 it is possible to export/import the configuration at the virtual host level.
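For example, through the management HTTP API the vhost-scoped definitions (exchanges, queues, bindings, policies) can be fetched from the source vhost and posted to the target vhost; something like the following, assuming the management plugin on the default port and vhosts named dev and stage:
http://localhost:15672/api/definitions/dev (GET the definitions as JSON)
http://localhost:15672/api/definitions/stage (POST the same JSON to import it)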
It is not possible to export one specific exchange.
But you can retrieve an exchange's bindings (where it is the source) through the HTTP API,
using:
http://localhost:15672/api/exchanges/vhost/name/bindings/source
for example:
http://localhost:15672/api/exchanges/%2f/my_company/bindings/source
you will get a JSON response like:
[
  {
    "source": "my_company",
    "vhost": "/",
    "destination": "amq.gen-yZGNV22TwLcP3K-X69Yjyw",
    "destination_type": "queue",
    "routing_key": "#",
    "arguments": {},
    "properties_key": "%23"
  },
  {
    "source": "my_company",
    "vhost": "/",
    "destination": "my.queue",
    "destination_type": "queue",
    "routing_key": "#",
    "arguments": {},
    "properties_key": "%23"
  }
]