How to create a process in Dell Boomi that will get data from one database and then send the data to a SaaS REST API

I would like to know how to create a process in Dell Boomi that meets the following criteria:
Read data directly from a production database table, then send the data to a SaaS application (public internet) using a REST API.
Another process will read data from the SaaS (REST API) and then write it to another database table.
Please see the attached link for what I have done so far; I really don't know how to proceed. Hope you can help me out. Thank you. Boomi DB connector

You are actually making a good start. For the first process (DB > SaaS) you need to:
Ensure you have access to the DB - if your Atom is local, this shouldn't be much of an issue, but if it is on the Boomi Cloud, then you need to enable access to this DB from the internet (not something I would recommend).
Check what you need to read and define the Boomi Operation - from the image you linked I can see that you are doing that, but without knowing what data you need and how it is structured, it is impossible to say whether you have defined it all correctly.
Transform the data to the output system's format - once you get the data from the DB, use the Map shape to map it to the Profile of the SaaS you are sending your data to.
Send the data to the SaaS - you can use the HTTP Client connector to send the data in JSON or XML (or any other format you like) to the SaaS REST API (a rough sketch of the equivalent logic is shown below).
For the other process (SaaS > DB) the steps are practically the same, but in reverse order.
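Boomi itself is configured visually rather than in code, but if it helps to make the shapes concrete, here is a rough Python sketch of what the DB > SaaS process does end to end. The table, column names, and endpoint URL are made-up examples, not your actual schema or SaaS API:

```python
# Rough sketch of the DB -> SaaS flow a Boomi process implements.
# Assumptions: a hypothetical "orders" table and a hypothetical SaaS endpoint.
import json
import sqlite3              # stand-in for your production database driver
import urllib.request

def read_from_db():
    """DB connector/operation: read the rows to be exported."""
    conn = sqlite3.connect("production.db")
    cur = conn.execute("SELECT id, name, amount FROM orders WHERE exported = 0")
    rows = [{"id": r[0], "name": r[1], "amount": r[2]} for r in cur.fetchall()]
    conn.close()
    return rows

def map_to_saas_profile(row):
    """Map shape: transform a DB record into the SaaS JSON profile."""
    return {"externalId": row["id"], "customerName": row["name"], "total": row["amount"]}

def send_to_saas(payload):
    """HTTP Client shape: POST the mapped JSON to the SaaS REST API."""
    req = urllib.request.Request(
        "https://example-saas.invalid/api/records",   # hypothetical endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    for row in read_from_db():
        send_to_saas(map_to_saas_profile(row))
```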

Related

Azure Sentinel referencing large sets of data

I've been trying to find the most effective (elegant) solution to achieve what I'm trying to do. I'd like to hear from the community, thank you.
Situation:
Need to geo-enrich IP Address records on Sentinel. Example: Successful SigninLogs, since MSFT enrichment sometimes generates "Unknown" results in the IP enrichment maps.
An external reference file (subnet, country_code, country_name) is available publicly; however, the size and number of records are rather large (~12 MB, 200K+ records).
Issue:
I tried using a storage account blob to host the "reference table", but apparently hit the limit on maximum blob size in the Storage Account.
It looks like Workbooks can read a maximum of 30,000 records from external sources using the 'externaldata' command. Hence, only partial reference data can be read and referred to.
Options considered:
Ingest the reference table into the log analytics workspace, do a join/lookup to this custom reference table for enrichment
Export the IP addresses from the SigninLogs table to blob storage, enrich the IP addresses using Logic Apps, and then put the result back into a 'reference' blob storage; then read that 'reference' blob storage using the 'externaldata' syntax.
Limitation Observed:
I came to the realization that Sentinel can't perform API calls for enrichment from external data (correct me if I'm wrong). I've done similar things with Splunk, where we could enrich the data on the fly by making multiple API calls to an outside database.
Ingest the Data - As you've mentioned, ingest the data and join the tables. You would need to ingest this regularly, though, to ensure you can look up the data within the desired time range (e.g. if you have an Analytics Rule, it only looks up data for a 14-day period).
Use a Playbook - If you want the Geo-IP lookup post-incident, you can perform this with a Logic App.
Use Jupyter Notebooks - These have the flexibility to perform API calls against external locations and join the results to the data hosted in Sentinel (a rough sketch follows after this list). An example notebook is the IP Explorer notebook; see "Use Jupyter notebooks to hunt for security threats".
Threat Intelligence - Microsoft enriches all imported threat intelligence indicators with GeoLocation and WhoIs data, which is displayed together with other indicator details.
Since March 2022, you can upload large CSV files into a Sentinel Watchlist. This way, you can upload a complete GeoIP database and perform ipv4_lookups. This blog post explains how to do this: https://cryptsus.com/blog/enrich-geolocation-sentinel-siem.html
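As a rough illustration of the notebook option above, the Python sketch below enriches IP addresses by calling an external geo-IP API and caching the results before joining them back onto rows exported from a SigninLogs query. The endpoint URL and its response fields are placeholders, not a specific service:

```python
# Hypothetical geo-IP enrichment, as you might run it in a Jupyter notebook.
import json
import urllib.request

GEOIP_URL = "https://geoip.example.invalid/lookup/{ip}"   # placeholder service

def geo_lookup(ip):
    """Call the external geo-IP API for a single address."""
    with urllib.request.urlopen(GEOIP_URL.format(ip=ip)) as resp:
        data = json.loads(resp.read())
    return {"country_code": data.get("country_code"), "country_name": data.get("country_name")}

def enrich(signin_rows):
    """Add geo columns to rows exported from a SigninLogs query."""
    cache = {}
    for row in signin_rows:
        ip = row["IPAddress"]
        if ip not in cache:            # avoid repeated calls for the same address
            cache[ip] = geo_lookup(ip)
        row.update(cache[ip])
    return signin_rows

# Example rows as they might come out of a KQL query via the Log Analytics SDK
rows = [{"UserPrincipalName": "user@contoso.com", "IPAddress": "203.0.113.10"}]
print(enrich(rows))
```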

What is the most performant way to submit a POST to an API

A little background on what I need to accomplish. I am a developer of a cloud-based SaaS application. In one instance, my clients use my application to log the receipt of goods as they come across a conveyor line. Directly before the PC where they are logged into my app, there is another Windows PC that is collecting the moisture and weight of the item from instruments. I (personally, not my app) have full access to this PC and its database. I know how I am going to grab the latest record from the DB via a stored procedure/SQLCMD.
On the receiving end, I have an API endpoint that needs to receive the ID, Weight, Moisture, and ClientID. This all needs to happen in less than ~2 seconds, since they are waiting to add this record to my software's database.
What is the most performant way for me to stand up a process that triggers retrieving the record from the DB and then calls the API? I also want to update the record, flagging success on a 200 response. My thought was to script all of this in a batch file, use cURL to make the API call, and then call the batch file from a scheduled task in Windows. But I feel like there may be a better way with fewer moving parts.
P.S. I am not looking for code solutions per se, just direction or tools that will help; also, I am using the AWS stack to host my application.
The most performant way is to use AWS Amplify; it's a ready-made AWS framework and development environment that can connect your existing DB to a REST API easily.
You can check their documentation on how to build it:
https://docs.amplify.aws/lib/restapi/getting-started/q/platform/js
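For what it's worth, if you do keep a small script on the collection PC, the poll-and-POST logic described in the question is only a handful of lines. A rough Python sketch follows; the table, column names, endpoint, and connection string are assumptions, and it could be triggered by a scheduled task or run as a simple loop:

```python
# Rough sketch: read the newest instrument reading, POST it, flag it on success.
import pyodbc      # assumes an ODBC driver for your DB is installed
import requests    # assumes the requests package is available

CONN_STR = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=Scale;Trusted_Connection=yes"
API_URL = "https://api.example.invalid/receipts"   # hypothetical endpoint

def push_latest_reading(client_id):
    with pyodbc.connect(CONN_STR) as conn:
        cur = conn.cursor()
        # Grab the newest unsent record (stand-in for your stored procedure).
        row = cur.execute(
            "SELECT TOP 1 Id, Weight, Moisture FROM Readings WHERE Sent = 0 ORDER BY Id DESC"
        ).fetchone()
        if row is None:
            return
        resp = requests.post(
            API_URL,
            json={"ID": row.Id, "Weight": row.Weight, "Moisture": row.Moisture, "ClientID": client_id},
            timeout=2,   # the whole round trip needs to stay under ~2 seconds
        )
        if resp.status_code == 200:
            cur.execute("UPDATE Readings SET Sent = 1 WHERE Id = ?", row.Id)
            conn.commit()

if __name__ == "__main__":
    push_latest_reading(client_id=42)
```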

Check file encoding with an Azure Data Factory activity

I'd like to be able to check the encoding of an input file in the flow of my pipeline. Any idea how to do that with one of the activities provided by Azure Data Factory?
Thanks for the tips.
It's actually not supported by any of the activities out of the box at this time, but you can do it using other services with connectors available in ADF, such as an Azure Function. You will, however, need to develop the algorithm to detect the encoding and an Azure Function to run it (a rough sketch is shown below); of course, other services like Azure Batch, Notebooks, etc. could be used.
That said, it could be really useful to add this information to the Get Metadata activity (I just posted the idea to https://feedback.azure.com/forums/270578-data-factory/suggestions/37452187-add-encoding-into-the-get-a-file-s-metadata-activi).
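As a rough sketch of the Azure Function route (Python worker, HTTP trigger), the function below reports a probable encoding for the bytes it receives; the use of the chardet package is an assumption, and simple BOM checks may already be enough for your files:

```python
# Hypothetical HTTP-triggered Azure Function that reports a file's probable encoding.
# ADF would call it (e.g. via an Azure Function or Web activity) and branch on the result.
import json
import azure.functions as func
import chardet   # third-party detector; an assumption, not something ADF provides

def main(req: func.HttpRequest) -> func.HttpResponse:
    body = req.get_body()

    # Cheap, reliable checks first: byte-order marks.
    if body.startswith(b"\xef\xbb\xbf"):
        encoding, confidence = "utf-8-sig", 1.0
    elif body.startswith(b"\xff\xfe") or body.startswith(b"\xfe\xff"):
        encoding, confidence = "utf-16", 1.0
    else:
        # Fall back to statistical detection for BOM-less files.
        guess = chardet.detect(body)
        encoding, confidence = guess["encoding"], guess["confidence"]

    return func.HttpResponse(
        json.dumps({"encoding": encoding, "confidence": confidence}),
        mimetype="application/json",
    )
```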

How to use Apache NiFi to query a REST API?

For a project I need to develop an ETL process (extract, transform, load) that reads data from a (legacy) tool that exposes its data via a REST API. This data needs to be stored in Amazon S3.
I would really like to try this with Apache NiFi, but I honestly have no clue yet how I can connect with the REST API, and where/how I can implement some business logic to 'talk the right protocol' with the source system. For example, I would like to keep track of what data has been written so far so it can resume loading where it left off.
So far I have been reading the NiFi documentation and I'm getting a better insight into what the tool provides/entails. However, it's not clear to me how I could implement the task within the NiFi architecture.
Hopefully someone can give me some guidance?
Thanks,
Paul
The InvokeHTTP processor can be used to query a REST API.
Here is a simple flow that
Queries the REST API at https://api.exchangeratesapi.io/latest every 10 minutes
Sets the output-file name (exchangerates_<ID>.json)
Stores the query response in the output file on the local filesystem (under /tmp/data-out)
I exported the flow as a NiFi template and stored it in a gist. The template can be imported into a NiFi instance and run as is.
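The template has to be imported into NiFi to be useful, but as a rough illustration of what the flow does, here is an equivalent Python sketch; the output directory and file naming simply mirror the description above, and the <ID> part is generated here as a UUID:

```python
# Equivalent of the flow: poll the exchange-rates API every 10 minutes
# and write each response to /tmp/data-out/exchangerates_<ID>.json.
import time
import uuid
import pathlib
import urllib.request

API_URL = "https://api.exchangeratesapi.io/latest"
OUT_DIR = pathlib.Path("/tmp/data-out")

def fetch_once():
    # InvokeHTTP equivalent: GET the REST API.
    with urllib.request.urlopen(API_URL) as resp:
        payload = resp.read()
    # Equivalent of the filename + store-to-local-filesystem steps.
    OUT_DIR.mkdir(parents=True, exist_ok=True)
    out_file = OUT_DIR / f"exchangerates_{uuid.uuid4()}.json"
    out_file.write_bytes(payload)

if __name__ == "__main__":
    while True:
        fetch_once()
        time.sleep(600)   # run every 10 minutes
```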

Bulk user account creation from CSV data import/ingestion

Hi all brilliant minds,
I am currently working on a fairly complex problem and I would love to get some brainstorming going. I have a C# .NET web application running in Windows Azure, using SQL Azure as the primary datastore.
Every time a new user creates an account, all they need to provide is a name, email and password. Upon account creation, we store the core membership data in the SQL database, and all the secondary operations (e.g. sending emails, establishing social relationships, creating profile assets, etc.) get pushed onto an Azure Queue and are picked up/processed later.
Now I have a couple of CSV files that contain hundreds of new users (names & emails) that need to be created in the system. I am thinking of automating this by breaking it into two parts:
Part 1: Write a service that ingests the CSV files, parses out the names & emails, and saves this data in storage A
This service should be flexible enough to take files with different formats
This service does not actually create the user accounts, so this is decoupled from the business logic layer of our application
The choice of storage does not have to be SQL; it could also be a non-relational datastore (e.g. Azure Tables)
This service could be a third-party solution outside of our application platform - so it is open to all suggestions
Part 2: Write a process that periodically goes through storage A and creates the user accounts from there
This is in the "business logic layer" of our application
Whenever an account is successfully created, mark that specific record in storage A as processed
This needs to be retry-able in case of failures in user account creation
I'm wondering if anyone has experience with importing bulk "users" from files, and whether what I am suggesting sounds like a decent solution.
Note that Part 1 could be a third-party solution outside of our application platform, so there's no restriction on the language/platform it runs on. We are thinking about either using BULK INSERT or Microsoft SQL Server Integration Services 2008 (SSIS) to ingest and load the data from CSV into the SQL datastore. If anyone has worked with these and can provide some pointers, that would be greatly appreciated too. Thanks so much in advance!
If I understand this correctly, you already have a process that picks up messages from a queue and runs its core logic to create the user, assets, etc. So it sounds like you only need to automate parsing the CSV files and dumping the contents into queue messages? That sounds like a trivial task.
You can also kick off processing of the CSV file via a queue message (to a different queue). The message would contain the location of the CSV file, and the Worker Role running in Azure would pick it up (it could be the same worker role as the one that processes new users if the usual load is not high).
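For illustration, here is a rough sketch of the CSV-to-queue part, shown with the Python Azure Storage queue SDK; the connection string, queue name, and CSV column names are placeholders, and your existing worker role would consume the messages exactly as it does today:

```python
# Parse a CSV of new users and drop one message per user onto an Azure queue.
import csv
import json
from azure.storage.queue import QueueClient

CONNECTION_STRING = "<your storage account connection string>"
QUEUE_NAME = "new-user-imports"   # hypothetical queue

def enqueue_users_from_csv(csv_path):
    queue = QueueClient.from_connection_string(CONNECTION_STRING, QUEUE_NAME)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):          # expects e.g. Name,Email header columns
            message = {"name": row["Name"], "email": row["Email"]}
            queue.send_message(json.dumps(message))

if __name__ == "__main__":
    enqueue_users_from_csv("new_users.csv")
```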
Since you're utilizing queues, the process is retryable.
HTH