Pushing logs to Log Analytics from Databricks - azure-log-analytics

I have logs collected in a Databricks cluster, but I need to push them to Log Analytics in Azure to have a common log collection.
I have not tried anything yet, but would like to know what the approach is.

I want to know how to push Databricks logs to Azure Log Analytics.

Related

How to get the queries made to AWS/RDS in Grafana

I have a Grafana instance running in a Kubernetes cluster. I have set up a CloudWatch data source with the corresponding credentials, and I can retrieve some metrics.
My specific need is to know whether I can retrieve the queries made to the DB (or a SQL digest), like RDS shows under Top SQL in the AWS Console (https://i.stack.imgur.com/H6sO4.png), or something similar, so I can check query performance.
Thank you so much in advance.
You can do this with the following steps:
First, enable query logging for Amazon RDS (e.g. PostgreSQL and MySQL).
Then, publish the logs to Amazon CloudWatch Logs.
Finally, on the Grafana side, add the AWS CloudWatch data source integration.
This way you will be able to see your queries in Grafana, as in the CloudWatch Logs integration example.
If you want, you can analyse/filter your RDS logs using CloudWatch Logs Insights in Grafana.
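The Logs Insights query you would run from the Grafana CloudWatch data source can also be tested directly with boto3. A minimal sketch, assuming a PostgreSQL instance with log_min_duration_statement enabled; the region and the log group name /aws/rds/instance/mydb/postgresql are placeholders, adjust them to your setup:

```python
import time
import boto3

# Region and log group are assumptions; adjust to your RDS instance.
logs = boto3.client("logs", region_name="us-east-1")
LOG_GROUP = "/aws/rds/instance/mydb/postgresql"

# Pull recent log lines that contain statement durations
# (emitted when log_min_duration_statement is set on PostgreSQL).
QUERY = """
fields @timestamp, @message
| filter @message like /duration:/
| sort @timestamp desc
| limit 20
"""

resp = logs.start_query(
    logGroupName=LOG_GROUP,
    startTime=int(time.time()) - 3600,  # last hour
    endTime=int(time.time()),
    queryString=QUERY,
)

# Logs Insights queries are asynchronous: poll until the query finishes.
while True:
    result = logs.get_query_results(queryId=resp["queryId"])
    if result["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)

for row in result.get("results", []):
    print({field["field"]: field["value"] for field in row})
```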

Delta Table transactional guarantees when loading using Autoloader from AWS S3 to Azure Datalake

I am trying to use Autoloader where AWS S3 is the source and the Delta lake is in Azure Data Lake. When I try to read files, it gives me the following error:
Writing to Delta table on AWS from non-AWS is unsafe in terms of providing transactional guarantees. If you can guarantee that no one else will be concurrently modifying the same Delta table, you may turn this check off by setting the SparkConf: "spark.databricks.delta.logStore.crossCloud.fatal" to false when launching your cluster.
I tried setting this at the cluster level and it works fine. My question is: is there any way we can ensure transactional guarantees while loading data from AWS S3 to Azure Data Lake (the Data Lake is the backing storage for our Delta Lake)? We don't want to set "spark.databricks.delta.logStore.crossCloud.fatal" at the cluster level. Will there be any issue if we do, and will it be a good solution for a production ETL pipeline?
This warning appears when Databricks detects that you're doing multi-cloud work.
But this warning is for the case when you're writing into AWS S3 using Delta, because S3 doesn't have an atomic write operation (like put-if-absent), so it requires some kind of coordinator process that is available only on AWS.
But in your case you can ignore this message, because you're just reading from AWS S3 and writing into a Delta table that is on Azure Data Lake.
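For illustration, a minimal Autoloader sketch of that pattern (reading from S3, writing Delta to Azure Data Lake). The paths, file format, and checkpoint location are placeholders, and `spark` is the Databricks notebook's SparkSession:

```python
# Placeholders: adjust bucket, storage account, container, and file format.
source_path = "s3://my-source-bucket/events/"
target_path = "abfss://lake@mystorageacct.dfs.core.windows.net/delta/events"
checkpoint = "abfss://lake@mystorageacct.dfs.core.windows.net/_checkpoints/events"

# Autoloader (cloudFiles) incrementally picks up new files from S3.
stream = (
    spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load(source_path)
)

# The Delta transaction log lives on Azure Data Lake, so S3 is only a source here.
# Per the error message above, the cross-cloud check itself can be disabled with the
# SparkConf spark.databricks.delta.logStore.crossCloud.fatal=false at cluster launch.
(
    stream.writeStream
        .format("delta")
        .option("checkpointLocation", checkpoint)
        .start(target_path)
)
```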

Looking for a solution to ingest Pega cloud service logs to Splunk using Splunk addons for AWS

I am looking for a solution to ingest Pega Cloud service logs into Splunk. I came across two approaches: push and pull. With the push option, Splunk has a blueprint Lambda which can be used to push events to the Splunk HTTP Event Collector (HEC). I am not finding any clear solution for the pull approach. Can someone summarize for which scenarios pull will work and for which scenarios push will work?
Splunk and Pega Cloud services are in different VPCs; how can we have secure data transfer between them?
A fairly simple solution would be to set up your own client application to pull all relevant notifications. (https://community.pega.com/knowledgebase/articles/pega-predictive-diagnostic-cloud/subscribing-pega-predictive-diagnostic-cloud-notifications-using-rest-api)
You can run your application from your own secure server and have it write the info into Splunk. (https://dev.splunk.com/enterprise/docs/devtools/python/sdk-python/howtousesplunkpython/howtogetdatapython/#To-add-data-directly-to-an-index)
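A rough sketch of that pull-and-write pattern in Python, using the Splunk SDK for the write. The PDC notification endpoint, credentials, index name, and response shape below are placeholders, not the actual Pega API; see the linked Pega article for the real subscription details:

```python
import requests
import splunklib.client as splunk_client  # pip install splunk-sdk

# Hypothetical PDC REST endpoint and credentials (see the Pega article above).
PDC_URL = "https://pdc.example.com/prweb/api/v1/notifications"
PDC_AUTH = ("pdc_user", "pdc_password")

# Connect to Splunk's management port from your own secure server.
service = splunk_client.connect(
    host="splunk.example.com", port=8089,
    username="admin", password="changeme",
)
index = service.indexes["pega_pdc"]  # the index must already exist

# Pull the notifications, then write each one into Splunk.
resp = requests.get(PDC_URL, auth=PDC_AUTH, timeout=30)
resp.raise_for_status()
for event in resp.json().get("notifications", []):
    index.submit(str(event), sourcetype="pega:pdc", source="pdc-rest-pull")
```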

Kinesis data analytics sql application is not writing logs into cloudwatch

I created a Kinesis Data Analytics application (using SQL) and attached a CloudWatch logging option.
When I run the application, I receive the results based on my requirements.
Problem: my Kinesis Data Analytics application is not writing logs into CloudWatch.
Note: I used the CloudWatchFullAccess policy. The configured CloudWatch log group and stream name are also correct.
Please let me know how I can receive the logs.

Monitoring Amazon S3 logs with Splunk?

We have a large extended network of users that we track using badges. The total traffic is in the neighborhood of 60 million impressions a month. We are currently considering switching from a fairly slow, database-based logging solution (custom-built on PHP, and messy) to a simple log-based alternative that relies on Amazon S3 logs and Splunk.
After using Splunk for some other analysis tasks, I really like it. But it's not clear how to set up a source like S3 with the system. It seems that remote sources require the Universal Forwarder to be installed, which is not an option for S3.
Any ideas on this?
Very late answer, but I was looking for the same thing and found a Splunk app that does what you want: http://apps.splunk.com/app/1137/. I have not yet tried it, though.
I would suggest logging JSON-preprocessed data to a DocumentDB database, for example using Azure Queues or similar service bus messaging technologies that fit your scenario, in combination with Azure DocumentDB.
So I would keep your database-based approach but modify it to use a schemaless, easy-to-scale document-based DB.
I use http://www.insight4storage.com/ from the AWS Marketplace to track my AWS S3 storage usage totals by prefix, bucket, or storage class over time; it also shows me the previous versions' storage by prefix and per bucket. It has a setting to save the S3 data as Splunk-format logs that might work for your use case, in addition to its UI and web service API.
You can use the Splunk Add-on for AWS.
This is what I understand:
1. Create a Splunk instance. Use the hosted version, or use the on-premise AMI of Splunk to create an EC2 instance where Splunk is running.
2. Install the Splunk Add-on for AWS application on that EC2 instance.
3. Based on the input log type (e.g. CloudTrail logs, Config logs, generic logs, etc.), configure the Add-on and supply the AWS account ID or IAM role and other parameters.
4. The Add-on will automatically poll the AWS S3 source and fetch the latest logs after a specified amount of time (defaults to 30 seconds).
For a generic use case (like ours), you can try configuring the Generic S3 input for Splunk.
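Conceptually, the Generic S3 input is doing something like the following on each polling interval. This is only a simplified boto3 sketch to show the idea; the bucket, prefix, and checkpoint handling are placeholders, and the real Add-on also manages state, parsing, and indexing into Splunk:

```python
import boto3

s3 = boto3.client("s3")

BUCKET = "my-s3-log-bucket"   # placeholder bucket holding the S3 logs
PREFIX = "logs/"              # placeholder prefix
last_seen_key = ""            # the real Add-on keeps a persistent checkpoint

# List objects newer than the checkpoint and read their contents;
# the Add-on repeats this on its polling interval (30 seconds by default).
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX, StartAfter=last_seen_key):
    for obj in page.get("Contents", []):
        body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
        # The Add-on would index these lines in Splunk; here we just print them.
        print(body.decode("utf-8", errors="replace"))
        last_seen_key = obj["Key"]
```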