I'm using the Splunk stats command:
stats max(_time) AS latestTimePerServer BY ServerName, ServerStatus
The result is a list of server name / server status combinations each with a timestamp.
I'd like to access individual timestamps of this list by indexing by ServerName or ServerStatus.
Is that possible?
Every day I receive a new table (for example, tablename_20220811) in BigQuery, and I want to concatenate this new table's data into main_table; the schemas in the dataset are the same.
I tried using wildcards, but I don't know how to pull the daily loaded table.
You can use BigQuery scheduled queries with an interval (cron) in the schedule parameter:
Example with the bq command-line tool:
bq query \
  --use_legacy_sql=false \
  --destination_table=mydataset.desttable \
  --display_name='My Scheduled Query' \
  --schedule='every 24 hours' \
  --append_table=true \
  'SELECT
     1
   FROM
     `mydataset.tablename_*`
   WHERE _TABLE_SUFFIX = FORMAT_DATE("%Y%m%d", CURRENT_DATE())'
To target the expected table, I used a wildcard and a filter based on the table suffix. The table suffix should equal the current date as a STRING in the format yyyymmdd.
The schedule runs the query every day.
You can also configure it directly with the Google Cloud console.
It sounds like you have the right naming format for BigQuery to treat your tables as a single 'date-sharded table'.
You need to ensure that the daily tables:
- have the same schema
- are in the same dataset
- have the same name apart from the _yyyymmdd suffix
You will know if this worked because only one table will appear (with an icon showing multiple tables, rather than the usual icon).
With this in hand, you can write queries like
SELECT
fieldA,
fieldB
FROM
`some_dataset.tablename_*`
WHERE
_table_suffix BETWEEN '20220101' AND '20221201'
This gives you some idea of what's possible:
- select from the full date-sharded table using backticks (essential!) and the wildcard syntax
- filter using the special _table_suffix meta-field (a sketch combining both appears below)
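Tying this back to the original goal of appending each day's shard into main_table, a minimal ad hoc sketch could look like the following bq command. This is only a sketch, not part of the answer above; the dataset and table names follow the question, and it assumes the newest shard's suffix is today's date.
bq query \
  --use_legacy_sql=false \
  --destination_table=mydataset.main_table \
  --append_table=true \
  'SELECT *
   FROM `mydataset.tablename_*`
   WHERE _TABLE_SUFFIX = FORMAT_DATE("%Y%m%d", CURRENT_DATE())'
Run once per day (for example via the scheduled query shown earlier), this appends only the newest shard, because the suffix filter matches just the current date.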
I want to create a CloudWatch metric filter so that I count the number of log entries containing the error line
Connection State changed to LOST
I have a CloudWatch log group called "nifi-app.log" with 3 log streams (one for each EC2 instance, named 'i-xxxxxxxxxxx', 'i-yyyyyyyyyy', etc.).
Ideally I would want to extract a metric nifi_connection_state_lost_count with a dimension InstanceId where the value is the log stream name.
From what I gather from the documentation, it is possible to extract dimensions from the log file contents themselves, but I do not see any way to refer to the log stream name, for example.
The log entries look like this:
2022-03-15 09:44:47,811 INFO [Curator-ConnectionStateManager-0] o.a.n.c.l.e.CuratorLeaderElectionManager org.apache.nifi.controller.leader.election.CuratorLeaderElectionManager$ElectionListener#3fe60bf7 Connection State changed to LOST
I know that I can extract fields from those log entries with [date,level,xxx,yy,zz], but what I need is not in the log entry itself; it is part of the log entry metadata (the log stream name).
The log files are NiFi log files and do NOT have the instance name, hostname, or anything like that printed in each log line, and I would rather not change the log format, as it would require a restart of the NiFi cluster and I'm not even sure how to change it.
So, is it possible to get the log stream name as dimension for a CW metric filter in some other way?
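For reference, a minimal sketch of the plain count (without the InstanceId dimension) via the AWS CLI could look like this; the filter name and metric namespace here are made up, not taken from the question:
# Count every occurrence of the error line across all streams in the log group.
aws logs put-metric-filter \
  --log-group-name "nifi-app.log" \
  --filter-name "nifi-connection-state-lost" \
  --filter-pattern '"Connection State changed to LOST"' \
  --metric-transformations \
    metricName=nifi_connection_state_lost_count,metricNamespace=NiFi,metricValue=1
The open question is whether the InstanceId dimension can come from the log stream name rather than from the message text.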
cat compare_data_gse_query.sql | bq query --use_legacy_sql=False
(1) Cost optimization in GCP: many of the SQL queries need changes.
(2) How can I compare CSV data changes (today vs. yesterday) in GCS (Google Cloud Storage)?
(3) SELECT * is to be avoided. Is there any utility in the bq CLI that could handle this?
I have made an SQL statement like the following example:
SELECT ip
FROM ip_table
LIMIT 500
Then I saved the result to Google Cloud Storage as a CSV file. Now I find that I want more data about the IPs I queried previously. Can I read the IPs that I saved from the previous query and use them in a new query like this:
SELECT more_info
FROM ip_table
WHERE ip = ip_from_my_csv_file
Here, ip_from_my_csv_file should iterate over the IPs I have in my CSV file.
Can you help me achieve this?
You can create an external table (for example, named my_csv_file) on top of your CSV file (see Using External Data Sources) and then use it in your query:
SELECT more_info
FROM `project.dataset.ip_table`
WHERE ip in (SELECT DISTINCT ip FROM `project.dataset.my_csv_file`)
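If the CSV sits in a Cloud Storage bucket, the external table itself can be created with the bq CLI. A minimal sketch; the bucket path and the single-column ip:STRING schema are assumptions, not details from the question:
# Expose the previously exported CSV as an external table the query above can read.
bq mk \
  --external_table_definition=ip:STRING@CSV=gs://my-bucket/previous_ips.csv \
  project:dataset.my_csv_file
After that, the query above can filter against project.dataset.my_csv_file as if it were a normal table, while the data stays in the bucket.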
I have a set of CloudWatch logs in JSON format that contain a username field. How can I write a CloudWatch metric query that counts the number of unique users per month?
Now you can count unique field values using the count_distinct instruction inside CloudWatch Insights queries.
Example:
fields userId, @timestamp
| stats count_distinct(userId)
More info on CloudWatch Insights: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html
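To get the per-month number, one option is to run that query over a one-month window from the command line. A rough sketch; the log group name is a placeholder and the dates use GNU date syntax:
# Start the Insights query over roughly the last month and capture its ID.
QUERY_ID=$(aws logs start-query \
  --log-group-name "my-app-logs" \
  --start-time "$(date -d '1 month ago' +%s)" \
  --end-time "$(date +%s)" \
  --query-string 'fields userId, @timestamp | stats count_distinct(userId)' \
  --query 'queryId' --output text)
# Fetch the results; the count is available once the query status is Complete.
aws logs get-query-results --query-id "$QUERY_ID"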
You can now do this, using CloudWatch Insights.
API: https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_StartQuery.html
I am working on a similar problem and my query for this API looks something like:
fields @timestamp, @message
| filter @message like /User ID/
| parse @message "User ID: *" as @userId
| stats count(*) by @userId
That gets the user IDs. Right now it returns a list of them with a count for each one. Getting a total count of unique users can be done either after getting the response or by adjusting the query further (for example, with count_distinct as in the answer above).
You can easily play with queries using the CloudWatch Insights page in the AWS Console.
I think you can achieve that with the following query:
Log statement being parsed: "Trying to login user: abc ....."
fields @timestamp, @message
| filter @message like /Trying to login user/
| parse @message "Trying to login user: * and " as user
| sort @timestamp desc
| stats count(*) as loginCount by user | sort loginCount desc
This will print a table like this:
#  user   loginCount
1  user1  10
2  user2  15
...
I don't think you can.
Amazon CloudWatch Logs can scan log files for a specific string (eg "Out of memory"). When it encounters this string, it will increment a metric. You can then create an alarm for "When the number of 'Out of memory' errors exceeds 10 over a 15-minute period".
However, you are seeking to count unique users, which does not translate well into this method.
You could instead use Amazon Athena, which can run SQL queries against data stored in Amazon S3. For examples, see:
Analyzing Data in S3 using Amazon Athena
Using Athena to Query S3 Server Access Logs
Amazon Athena – Interactive SQL Queries for Data in Amazon S3