Shovel from Exchange - how is the data persisted? (RabbitMQ)

There are two environments: one (A) is always online, the other (B) might be online 2-3 days a week; whenever possible they must be synced.
I already have a working shovel (on A) to sync the data from A to B, but I'd like to know the difference between the various options I have for the source and destination of the shovel.
These are all the possible source/destination combinations:
| Source | Destination |
|----------|-------------|
| Queue | Queue |
| Queue | Exchange |
| Exchange | Queue |
| Exchange | Exchange |
The question is pretty simple: when Env B is offline, how is the data on A persisted?
If the source is a Queue, the data will just stay there indefinitely; once B is online again the messages will be consumed/moved. (In the meantime I expect a flood of network-related errors.)
But what if the source is an Exchange?
I expect the data to be persisted... is that true? If so, how?
I looked at the docs and searched around but didn't find any answer.
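For reference, this is roughly how the dynamic shovel is defined for the Queue -> Queue case (a sketch only: the shovel name, URIs, and queue names below are placeholders, not taken from my setup). Swapping src-queue for src-exchange (plus src-exchange-key), or dest-queue for dest-exchange, gives the other combinations from the table:
# Sketch only - the shovel name, URIs, and queue names are placeholders.
rabbitmqctl set_parameter shovel my-a-to-b-sync '{
  "src-protocol": "amqp091",
  "src-uri": "amqp://localhost",
  "src-queue": "outbound.to-b",
  "dest-protocol": "amqp091",
  "dest-uri": "amqp://env-b.example.com",
  "dest-queue": "inbound.from-a",
  "ack-mode": "on-confirm"
}'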

Related

Azure Sentinel Kusto query table with data from another query

I'm trying to find a way to use Azure Sentinel to pull all DNS results for a domain, based upon a Security Alert.
Under the SecurityAlert table, the domain name for an event is provided as part of a JSON column; here is the query for extracting that data.
SecurityAlert
| where parse_json(ExtendedProperties).AnalyticDescription == "Usage of digital currency mining pool"
| extend DomainName_ = tostring(parse_json(ExtendedProperties).DomainName);
What I would like to do is take that query and then query the DnsEvents table to find all queries that match the domain name in the Name column. An example of the query is:
DnsEvents
| where Name contains "xmr-au1.nanopool.org"
How can I perform the second query but use the data from the first query to filter?
You could try something like this:
let domain_names =
SecurityAlert
| where ExtendedProperties has 'Usage of digital currency mining pool' // this line is optional, but may improve performance
| extend props = parse_json(ExtendedProperties)
| where props.AnalyticDescription == "Usage of digital currency mining pool"
| project DomainName_ = tostring(props.DomainName)
;
DnsEvents
| where Name has_any (domain_names)

SSIS ForEach ADO Enumerator - Performance Issues

This is a best practice / alternative approach question about using an ADO Enumerator ForEach loop.
My data is financial accounts, coming from a source system into a data warehouse.
The current structure of the data is a list of financial transactions, e.g.:
+-----------------------+----------+----------+------------+------+
| AccountGUID           | Increase | Decrease | Date       | Tags |
+-----------------------+----------+----------+------------+------+
| 00000-0000-0000-00000 | 0        | 100.00   | 01-01-2018 | Val1 |
| 00000-0000-0000-00000 | 200.00   | 0        | 03-01-2018 | Val3 |
| 00000-0000-0000-00000 | 400.00   | 0        | 06-01-2018 | Val1 |
| 00000-0000-0000-00000 | 0        | 170.00   | 08-01-2018 | Val1 |
| 00000-0000-0000-00002 | 200.00   | 0        | 04-01-2018 | Val1 |
| 00000-0000-0000-00002 | 0        | 100.00   | 09-01-2018 | Val1 |
+-----------------------+----------+----------+------------+------+
My SSIS package currently has two ForEach loops:
All Time Balances
End Of Month Balances
All Time Balances
Passes an AccountGUID into the loop and selects all transactions for that account. It then orders them by date, earliest first, and assigns each a sequence number.
Once the sequence numbers are assigned, it calculates the running balances based on the Increase and Decrease columns, using the Tags column to work out which balance it is dealing with (a set-based sketch of this calculation follows the workflow below).
It finishes by flagging the latest record as Current.
All Time Balances - Workflow
->Get all Account IDs in Staging table
|-> Write all Account GUIDs to object variable
|--> ADO Enumerator ForEach - Loop Account GUID List - Write GUID to variable
|---> (Data Flow) Select all transactions for Account GUID
|----> (Data Flow) Order all transactions by date and assign Sequence number
|-----> (Data Flow) Run each row through a script component transformation to calculate running totals for each record
|------> (Data Flow) Insert balance data into staging table
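For clarity, the running-balance calculation that the loop performs row by row is, in set-based terms, roughly a windowed sum per account and tag. The sketch below assumes a hypothetical dbo.StagingTransactions table with the columns shown in the sample data; it is illustrative only, not the package's actual SQL:
-- Sketch only: table and column names are assumed from the sample data above.
SELECT
    AccountGUID,
    [Date],
    Tags,
    -- Sequence number per account, ordered by transaction date
    ROW_NUMBER() OVER (PARTITION BY AccountGUID ORDER BY [Date]) AS SequenceNumber,
    -- Running balance per account and tag
    SUM(Increase - Decrease) OVER (PARTITION BY AccountGUID, Tags
                                   ORDER BY [Date]
                                   ROWS UNBOUNDED PRECEDING) AS RunningBalance
FROM dbo.StagingTransactions;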
End Of Month Balances
The second package, End Of Month, does something very similar, with the exception of a second loop. The select finds the earliest transactional record and the latest transactional record. Using those two dates, it works out all the months between them and loops over each of those months.
Inside the date loop it does pretty much the same thing: it works out the balances based on tags and stamps the end-of-month record for each account.
The Issue/Question
All of this currently works fine, but the performance is horrible.
In one database with approximately 8,000 accounts and 500,000 transactions, this process takes upwards of a day to run. That being one of our smaller clients, I tremble at the idea of running it against our heavier databases.
Is there a better approach to doing this, using SQL cursors or some other neat way I have not seen?
OK, so I have managed to take my package execution from around 3 days down to about 11 minutes all up.
I ran a profiler and standard Windows stats while running the loops and found a few interesting things.
Firstly, there was almost no utilization of HDD, CPU, RAM or network during the execution of the packages. It told me what I kind of already knew: it was not running as quickly as it could.
What I did notice was that between each execution of the loop there was a 1 to 2 ms delay before the next iteration started executing.
Eventually I found that every time a new iteration of the loop began, SSIS created a new connection to the SQL database; it appears that this is SSIS's default behavior. Whenever you create a Source or Destination, you are adding a connection delay to your project.
The Fix:
Now this was an odd fix: you need to go into your connection manager, and (the odd bit) it must be via the on-screen window, not the right-hand project manager window.
If you select the connection that is referenced in the loop, then in the Properties window on the right side (in my layout, anyway) you will see an option called "RetainSameConnection", which by default is set to False.
By setting this to True, I eliminated the 2 ms delay.
Considerations:
In doing this I created a heap of other issues, which really just highlighted areas of my package that I had not thought out well.
One thing that appeared to be impacted by this change was stored procedures that used temp tables; these seemed to break instantly. I assume that is because of how SQL Server handles temp tables: when the connection is closed and reopened, you can be pretty certain that the temp table is gone, whereas with the same connection retained, colliding with leftover temp tables becomes an issue again.
I removed all temp tables and replaced them with CTEs, which appears to have fixed this issue.
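As an illustration only (these are hypothetical names, not my actual procedures), a temp-table step rewritten as a CTE looks roughly like this:
-- Sketch only: hypothetical table and object names.
-- Before (with a retained connection, a leftover #AccountBalance from a
-- previous loop iteration can collide with this SELECT ... INTO):
--   SELECT AccountGUID, SUM(Increase - Decrease) AS Balance
--   INTO #AccountBalance
--   FROM dbo.StagingTransactions
--   GROUP BY AccountGUID;
--
--   SELECT * FROM #AccountBalance WHERE Balance < 0;

-- After: the CTE only exists for the duration of this single statement.
WITH AccountBalance AS (
    SELECT AccountGUID, SUM(Increase - Decrease) AS Balance
    FROM dbo.StagingTransactions
    GROUP BY AccountGUID
)
SELECT *
FROM AccountBalance
WHERE Balance < 0;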
The second major issue I found was with tasks that ran in parallel and both used the same connection manager. From this I received an error that SQL Server was still trying to run the previous statement, which bombed out my package.
To get around this, I created duplicate connection managers (three in total for the same database).
Once I had my connections set up, I went into each of my parallel Sources and Destinations and assigned them their own connection manager. This appears to have resolved the last error I received.
Conclusion:
There may be more unforeseen issues in doing this, but for now my packages are lightning quick, and this exercise highlighted some faults in my design.

PostgreSQL - How to prevent duplicate inserts in a competitive slot taking contest

I don't know how to phrase my question right, but to provide further details about the problem I am trying to solve, let me describe my application. Suppose I am trying to implement a queue reservation application, and I maintain the number of slots in a table roughly like this:
 id | appointment | slots_available | slots_total
----+-------------+-----------------+------------
  1 | apt 1       |              30 |          30
  2 | apt 2       |               1 |           5
 .. | ..          |              .. |          ..
So, in a competitive scenario, assuming that everything works on the application side of things, something like this can happen in the application:
user 1 -> reserves apt 2 -> [validate if slot exists] -> update slot_available to 0 -> reserve (insert a record)
user 2 -> reserves apt 2 -> validate if slot exists -> [update slot_available to 0] -> reserve (insert a record)
What if users 1 and 2 happen to find a slot available for apt 2 at the same time in the user interface? (Of course I would validate first that a slot exists, but they would see the same value in the UI if neither of them has clicked yet.) Then the two submit a reservation at the same time.
Now what if user 1 validates that a slot is available even though user 2 has already taken it, because the update operation is not yet done? Then there will be two inserts.
In any case, how do I ensure at the database level that only one of them gets the reservation? I'm sure this is a common scenario, but I have no idea yet how to implement something like this. A suggestion to remodel would also be acceptable, as long as it solves the scenario.
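For illustration, one common pattern for this kind of guarantee at the database level is to make the availability check and the decrement a single atomic UPDATE, and only insert the reservation if that UPDATE actually affected a row. This is only a sketch; the appointments/reservations table and column names beyond those shown above are assumptions, not my actual schema:
-- Sketch only: table and column names other than those shown above are assumed.
BEGIN;

UPDATE appointments
SET    slots_available = slots_available - 1
WHERE  id = 2
AND    slots_available > 0;

-- The application checks the affected-row count here: if it is 0, the slot
-- was already taken, so roll back instead of inserting the reservation.

INSERT INTO reservations (appointment_id, user_id)
VALUES (2, 123);

COMMIT;
Under PostgreSQL's default READ COMMITTED isolation, concurrent UPDATEs on the same row are serialized by the row lock, so when only one slot is left, only one of the two transactions sees slots_available > 0 and gets to insert.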

How can I find all dashboards in Splunk, with usage information?

I need to locate data that has become stale in our Splunk instance, so that I can remove it.
I need a way to find all the dashboards and sort them by usage. From the audit logs I've been able to find all the actively used dashboards, but as my goal is to remove data, what I most need is the dashboards not in use.
Any ideas?
You can get a list of all dashboards using | rest /services/data/ui/views | search isDashboard=1. Try combining that with your search for active dashboards to get those that are not active.
| rest /services/data/ui/views | search isDashboard=1 NOT [<your audit search> | fields id | format]

Is it important to have automated acceptance tests to check whether a field saves to a database?

I'm using SpecFlow for the automated acceptance testing framework and NHibernate for persistence. Many of the UI pages for an intranet application that I'm working on are basic data entry pages. Obviously adding a field to one of these pages is considered a "feature", but I can't think of any scenarios for this feature other than:
Given that I enter data X for field Y on Record 1
And I click Save
When I edit Record 1
Then I should see data X for field Y
How common and necessary is it to automate tests like this? Additionally, I'm using NHibernate, so it's not like I'm hand-rolling my own data persistence layer. Once I add a property to my mapping file, there is a high chance that it won't get deleted by mistake. Considering this, isn't a "one-time" manual test enough? I'm eager to hear your suggestions and experience in this matter.
I usually have scenarios like "successful creation of ..." that test the success case (you fill in all required fields, all input is valid, you confirm, and finally it is really saved).
I don't think that you can easily define a separate scenario for one single field, because usually the scenario of successful creation requires several other criteria to be met "at the same time" (e.g. all required fields must be filled).
For example:
Scenario: Successful creation of a customer
Given I am on the customer creation page
When I enter the following customer details
| Name | Address |
| Cust | My addr |
And I save the customer details
Then I have a new customer saved with the following details
| Name | Address |
| Cust | My addr |
Later I can add additional fields to this scenario (e.g. the billing address):
Scenario: Successful creation of a customer
Given I am on the customer creation page
When I enter the following customer details
| Name | Address | Billing address |
| Cust | My addr | Bill me here |
And I save the customer details
Then I have a new customer saved with the following details
| Name | Address | Billing address |
| Cust | My addr | Bill me here |
Of course there can be more scenarios related to the new field (e.g. validations, etc.) that you have to define or extend.
I think if you take this approach you can avoid having a lot of "trivial" scenarios. And I would argue that this is the success case of the "create customer" feature, which deserves at least one test.