Multi-faceted time series visualisation in CloudWatch Logs Insights

I'm trying to create a multi-faceted time series graph in CloudWatch Logs Insights.
I can create a multi-faceted query which is not a time series, and I can create an unfaceted time series query, but I can't seem to do both.
For example:
I can do a query that looks like this:
fields @timestamp, someField1, @message
| stats count(*) by someField1, someField2
This will give me a table of results broken down by both someField1 and someField2.
I can also do:
fields @timestamp, someField1, @message
| stats count(*) by bin(1h)
This will give me a time series graph.
However, I can't work out how to combine the two, so that I get a time series graph with multiple lines on it.
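Intuitively, I'd expect the combination to look something like the query below, but I can't find a form that the console will chart with one line per someField1 value:
fields @timestamp, someField1, @message
| stats count(*) by bin(1h), someField1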
Is this simply unavailable in AWS CloudWatch Logs, or is there a way to do it that I haven't found?

Related

Splunk event increasing logic with each SPL query

I am getting data into Splunk from Snowflake using Splunk DB Connect. This is just simple orders data. In Splunk Search & Reporting I am running the following query on my table to get a visualization.
source="big_data_table_inner_join" "UNITS_SOLD" | top COUNTRY
What I am seeing is that each time I run the query, the number of events in Splunk increases quite heavily. For example, after running it the first time there were 342000 events, and when I ran the same query again there were 67445 events. Any idea why this is happening?

Is it possible to retrieve full query history and correlate its cost in Google BigQuery?

I am querying multiple tables and I am able to see the cost of each query for my personal use. When I view the Query History I only see the queries I ran on my account.
So my question is: is it possible to somehow see the queries which have been run by others (as well as the cost of each query) in a project from the query history?
You can use the JOBS_BY_PROJECT view in INFORMATION_SCHEMA:
SELECT query, total_bytes_processed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE project_id = 'your_project_id' AND user_email = 'my@email.com'
According to the documentation, there is not a direct method of getting costs by job and user. However, there is a way of doing it.
For a detailed billing analysis, I would advise you to export the logs to BigQuery with a custom filter and from there analyse the billing for each user and query job.
So, you can create an export using the Logs Viewer or the API. While creating your sink use the following custom filter:
resource.type="bigquery_resource"
logName="projects/<your_project>/logs/cloudaudit.googleapis.com%2Fdata_access"
protoPayload.methodName="jobservice.jobcompleted"
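For example, creating such a sink from the command line might look something like this (the sink name below is a placeholder, and the destination dataset must already exist in BigQuery):
gcloud logging sinks create bq_audit_sink \
    bigquery.googleapis.com/projects/<your_project>/datasets/<mydataset> \
    --log-filter='resource.type="bigquery_resource" AND logName="projects/<your_project>/logs/cloudaudit.googleapis.com%2Fdata_access" AND protoPayload.methodName="jobservice.jobcompleted"'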
The above filter will retrieve completed query jobs, while the data access logs provide a comprehensive audit of every query run in BigQuery along with the total bytes scanned. I would like to point out that you have to make sure that data_access logs are enabled.
From the log entries you will get the fields:
protoPayload.authenticationInfo.principalEmail
protoPayload.serviceData.jobCompletedEvent.job.jobName.jobId
protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.query
protoPayload.serviceData.jobCompletedEvent.job.jobStatistics.totalBilledBytes
In BigQuery, you can use a query as follows:
SELECT
  protopayload_auditlog.authenticationInfo.principalEmail AS email,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes AS total_billed_bytes,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.query.query AS query,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobName.jobId AS job_id
FROM
  `<myproject>.<mydataset>.cloudaudit_googleapis_com_data_access`
WHERE
  protopayload_auditlog.methodName = 'jobservice.jobcompleted';
Afterwards, to estimate the price of each query, you can use totalBilledBytes together with the Pricing summary to add a new column with a price estimate for each query. You then have a final table with the user's email, the query text, total bytes billed, job id and an estimated price.
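As a sketch of that last step (assuming on-demand pricing; the 5 USD per TiB figure below is only an example rate and should be replaced with the current value from the Pricing page), the cost column can be computed directly in the query:
SELECT
  protopayload_auditlog.authenticationInfo.principalEmail AS email,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobName.jobId AS job_id,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes AS total_billed_bytes,
  -- example on-demand rate of 5 USD per TiB; substitute the current rate
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes / POW(2, 40) * 5 AS estimated_cost_usd
FROM
  `<myproject>.<mydataset>.cloudaudit_googleapis_com_data_access`
WHERE
  protopayload_auditlog.methodName = 'jobservice.jobcompleted';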

Azure Stream Analytics: Select data with the last timestamp only

I'm working on a way to stream the status of some jobs that are running on an HPC resource (sort of like trying to create a dashboard to look at real-time flight status). I generate and push data every 60 seconds. Unfortunately, this way I end up with a lot of repeated data, as the status of each 'job' changes unpredictably. I need a way to only keep the latest data. I'm not an SQL pro and do this work in my free time, so any help will be appreciated!
Here is my query:
SELECT
Job, Ref, Location, Queue, Description, Status, ElapTime, cast (Time as datetime) as Time
INTO
output_source
FROM
input_source
Here is what my output looks like when I test the query:
[Image: Query Test Result]
As you can see in the image, there are two sets of data with two different timestamps. I would like the query to return all the columns associated with only the last timestamp. How do I do this? Any ideas? Apologies if this is a repeated question; I have not found an answer that has helped me solve this problem.
Thanks for all your help!
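One pattern that might help (an untested sketch; it assumes you can group on Job and that a 60-second tumbling window matches your push interval) is to keep only the most recent record per job in each window with TopOne():
WITH latest AS (
    SELECT TopOne() OVER (ORDER BY CAST(Time AS datetime) DESC) AS lastEvent
    FROM input_source
    GROUP BY Job, TumblingWindow(second, 60)
)
SELECT
    lastEvent.Job AS Job,
    lastEvent.Ref AS Ref,
    lastEvent.Location AS Location,
    lastEvent.Queue AS Queue,
    lastEvent.Description AS Description,
    lastEvent.Status AS Status,
    lastEvent.ElapTime AS ElapTime,
    CAST(lastEvent.Time AS datetime) AS Time
INTO output_source
FROM latest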

How to accumulate counts from different searches into one (pie) chart?

I have 5 different searches in Splunk, each of which returns a count of how many results match that search query.
I've had a look at this thread here:
https://answers.splunk.com/answers/757081/pie-chart-with-count-from-different-search-criteri.html
but it's not quite working for me; I'm not 100% sure it's what I want.
My search queries all look something like this:
index=A variable="foo" message="Created*" | stats count
index=A variable="foo" message="Deleted*" | stats count
Ideally, I want to assign each query to a keyword - such as created, deleted, etc. - and then do a pie chart based on the counts.
The following should be sufficient.
index=A variable="foo" message="Created*" OR message="Deleted*" OR message="<repeat this for any other message types you want>" | stats count by message
If you can provide some more examples of the events you are trying to chart, there may be alternate approaches that can work for you.
This version will extract the key part of the message (Created, Deleted, etc.) into a field called mtype, and you can then perform stats on that field.
index=A variable="foo" message="Created*" OR message="Deleted*" OR message="<repeat this for any other message types you want>" | rex field=message "(?<mtype>Created|Deleted|...)" | stats count by mtype
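If the message text always starts with the keyword, another option (a sketch along the same lines, with the keyword values as assumptions) is to map each message onto a keyword with eval and case, then chart on that field:
index=A variable="foo" (message="Created*" OR message="Deleted*")
| eval mtype=case(like(message, "Created%"), "created", like(message, "Deleted%"), "deleted")
| stats count by mtype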

Regarding Splunk query

I have some audio files in an S3 bucket. I want to write a Splunk query to find the top 500 audio files, categorize them by date, and store the results in a folder. Also, I want to do this every night. What should my query be?
You would do something like the following.
index=audiofiles earliest=-1d@d | stats count by title
The earliest=-1d@d means you only want to look at the last day's worth of logs. The stats command counts the frequency of each song, by title. I am assuming that there is a title field in your data.
You would probably choose to schedule a search like this to run every night. Should you wish to save the results, you could write the data into a summary index with the collect command, or write it out to a CSV with the outputcsv command.
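As a sketch of how those pieces might fit together (the index, field and summary index names here are assumptions based on the question), the scheduled nightly search could look something like this:
index=audiofiles earliest=-1d@d latest=@d
| stats count by title
| sort - count
| head 500
| collect index=audio_summary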