Splunk HEC - Disable multiline event splitting due to timestamp - splunk

I have a multi-line event that contains timestamps on different lines, as shown in the example below:
[2022-02-08 08:30:23:776] [INFO] [com.example.monitoring.ServiceMonitor] Status report for services
Service 1 - Available
Service 2 - Unavailable since 2022-02-08T07:00:00 UTC
Service 3 - Available
When the log is sent to the HEC, the lines are split into multiple events during the parsing phase of the Splunk data pipeline. Because of the timestamp on line 3, Splunk creates two different events.
When searching in Splunk, I see the two events shown below, even though they are supposed to be part of a single event.
Event 1
[2022-02-08 08:30:23:776] [INFO] [com.example.monitoring.ServiceMonitor] Status report for services
Service 1 - Available
Event 2
Service 2 - Unavailable since 2022-02-08T07:00:00 UTC
Service 3 - Available
I can solve the issue by setting DATETIME_CONFIG to NONE in props.conf, but that creates another issue: Splunk will stop recognizing timestamps in the event altogether.
Is it possible to achieve the same result without disabling that property?

The trick is to set TIME_PREFIX correctly:
https://ibb.co/PCG5TqY
This makes Splunk look for timestamps only in lines starting with a "[".
Here is the entry for props.conf:
[changeme]
disabled = false
pulldown_type = true
category = Custom
DATETIME_CONFIG =
NO_BINARY_CHECK = true
TIME_PREFIX = ^\[
TIME_FORMAT = %Y-%m-%d %H:%M:%S:%3N
SHOULD_LINEMERGE = true
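As a rough illustration (this is plain regex matching, not Splunk's actual timestamp extractor), TIME_PREFIX = ^\[ means only lines beginning with "[" offer a timestamp at the position Splunk inspects, so the embedded ISO timestamp in the Service 2 line no longer causes a split:

```python
import re

event_lines = [
    "[2022-02-08 08:30:23:776] [INFO] [com.example.monitoring.ServiceMonitor] Status report for services",
    "Service 1 - Available",
    "Service 2 - Unavailable since 2022-02-08T07:00:00 UTC",
    "Service 3 - Available",
]

# Splunk reads a timestamp only right after the TIME_PREFIX match (^\[ here)
time_prefix = re.compile(r"^\[")
matches = [bool(time_prefix.match(line)) for line in event_lines]
print(matches)  # only the header line qualifies as a timestamp location
```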


GtkTreeView stops updating unless I change the focus of the window

I have a GtkTreeView object that uses a GtkListStore model that is constantly being updated as follows:
Get new transaction
Feed data into numpy array
Convert numbers to formatted strings, store in pandas dataframe
Add updated token info to GtkListStore via GtkListStore.set(titer, liststore_cols, liststore_data), where liststore_data is the updated info, liststore_cols is the name of the columns (both are lists).
Here's the function that updates the ListStore:
# update ListStore
titer = ls_full.get_iter(row)
liststore_data = [df.at[row, col] for col in my_vars['ls_full'][3:]]
# check for NaN values (NaN != NaN), substitute a " " placeholder if necessary
for i in range(3, len(liststore_data)):
    if liststore_data[i] != liststore_data[i]:
        liststore_data[i] = " "
liststore_cols = [my_vars['ls_full'].index(col) + 1
                  for col in my_vars['ls_full'][3:]]
ls_full.set(titer, liststore_cols, liststore_data)
Class that gets the messages from the websocket:
class MyWebsocketClient(cbpro.WebsocketClient):
    # class extensions to WebsocketClient
    def on_open(self):
        # sets up ticker symbol, subscriptions for socket feed
        self.url = "wss://ws-feed.pro.coinbase.com/"
        self.channels = ['ticker']
        self.products = list(cbp_symbols.keys())

    def on_message(self, msg):
        # gets latest message from socket, sends it off to be processed
        if "best_ask" in msg and "time" in msg:
            # checks to see if token price has changed before updating
            update_needed = parse_data(msg)
            if update_needed:
                update_ListStore(msg)
        else:
            print(f'Bad message: {msg}')
When the program first starts, the updates are consistent. Each time a new transaction comes in, the screen reflects it, updating the proper token. However, after a random amount of time - I've seen anywhere from 5 minutes to over an hour - the screen stops updating unless I change the focus of the window (either activating or deactivating it). Even that does not last long (only long enough to update the screen once). No other errors are being reported, and memory usage is not spiking (constant at 140 MB).
How can I troubleshoot this? I'm not even sure where to begin. The data back-ends seem to be OK (the data is never corrupted and never lags behind).
Since you've said in the comments that it is running in a separate thread, I'd suggest wrapping your "update ListStore" function with GLib.idle_add, passing the message along as an argument:
from gi.repository import GLib
GLib.idle_add(update_ListStore, msg)
I've had similar issues in the past and this fixed things. Sometimes updating a ListStore from another thread is fine; sometimes it will randomly spew errors.
Basically, only one thread should update the GUI at a time. By wrapping the call in GLib.idle_add() you make sure your background thread does not interfere with the main thread updating the GUI.
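GLib.idle_add works by queueing your callable to be run on the thread that owns the GTK main loop. Conceptually it's the same as handing updates to the main thread through a queue, as in this plain-Python sketch (no GTK required; all names are illustrative):

```python
import queue
import threading

updates = queue.Queue()

def worker():
    # background thread: never touches the GUI, only queues work
    for msg in ({"price": 1}, {"price": 2}):
        updates.put(msg)

applied = []

def drain():
    # "main loop" thread: the only place GUI state is modified
    while not updates.empty():
        applied.append(updates.get())

t = threading.Thread(target=worker)
t.start()
t.join()
drain()
print(applied)  # both updates applied by the main thread, in order
```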

AWS Cloudwatch alarm set to NonBreaching (or notBreaching) is not triggering, based on a log filter

With the following Metric and Alarm combination
Metric
Comes from a Cloudwatch log filter (when a match is found on the log)
Metric value: "1"
Default value: None
Unit: Count
Alarm
Statistic: Sum
Period: 1 minute
Treat missing data as: notBreaching
Threshold: [Metric] > 0 for 1 datapoints within 1 minute
The alarm goes to:
State changed to OK at 2018/12/17.
Reason: Threshold Crossed: no datapoints were received for 1 period and 1 missing datapoint was treated as [NonBreaching].
After that, the alarm never triggers, even when I force the metric above 0.
Why is the alarm stuck in OK, and how can it be triggered again?
Solution
Remove the "Unit" property from the Alarm configuration in the stack template.
The source of the problem was the "Unit" property: having it set to "Count" made the alarm get stuck.
You can verify that the stack produces the same result as a manually created alarm by comparing both with the describe-alarms API.
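For reference, here is a sketch of what the fixed alarm resource might look like in a CloudFormation template, with the Unit line removed (the resource, metric, and namespace names are placeholders, not taken from the original stack):

```yaml
MyLogAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    MetricName: MyLogMatchMetric   # placeholder: the filter's metric name
    Namespace: MyApp/Logs          # placeholder: the filter's namespace
    Statistic: Sum
    Period: 60
    EvaluationPeriods: 1
    Threshold: 0
    ComparisonOperator: GreaterThanThreshold
    TreatMissingData: notBreaching
    # Unit: Count  <- removed; this property caused the alarm to get stuck
```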

Flink: How to process the rest of a finite stream in combination with countWindowAll()

//assume following logic
val source = arrayOf(1,2,3,4,5,6,7,8,9,10,11,12) // total 12 elements
val env = StreamExecutionEnvironment.createLocalEnvironment(1);
val input = env.fromCollection(source)
.countWindowAll(5)
.aggregate(...) // pack them to List<Int> for bulk upload to DB
.addSink(...) // sends bulk
When I execute it, only the first 10 elements are processed; the remaining 2 are thrown away - Flink shuts down without processing them.
The only workaround I see - since I fully control the source data - is to push some well-known IGNORABLE_VALUES into the source collection to fill up the last window and then ignore them in the sink... but I suspect there is a more idiomatic way to do this in Flink.
You have a finite stream of 12 elements and a window that triggers for every 5 elements. So the first window gets 5 elements and triggers, the next 5 arrive and it triggers again, but then the last 2 arrive and the job knows that no more are coming. Since there aren't 5 elements in the window, the trigger doesn't fire, and nothing is done with them.
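The usual way around this is to do the batching somewhere you can flush the remainder when the input ends (for example, in a custom sink's close()). The core "flush the tail at end of input" logic, sketched in plain Python rather than the Flink API:

```python
def batches(elements, size):
    """Yield full batches of `size`; flush the partial tail at end of input."""
    buf = []
    for e in elements:
        buf.append(e)
        if len(buf) == size:
            yield buf
            buf = []
    if buf:          # end of stream reached: emit the remainder
        yield buf

# 12 elements, window size 5: two full batches plus the 2-element tail
print(list(batches(range(1, 13), 5)))
```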

summarize multiple values sent to Graphite at the same time

I'm trying to display the sum of several values sent to Graphite (carbon-cache) for the same timestamp.
The sent values look like this:
test.nb 10 1421751600
test.nb 11 1421751600
test.nb 12 1421751600
test.nb 13 1421751600
and I would like Graphite to display the value "46" for timestamp 1421751600.
Only the last value, "13", is displayed in Graphite.
Here are the configuration files:
storage-aggregation.conf
[test_sum]
pattern = ^test\.*
xFilesFactor = 0.1
aggregationMethod = sum
storage-schemas.conf
[TEST]
pattern = ^test\.
retentions = 10s:30d
Is there a way to do this with Graphite/Carbon?
Thanks.
The storage-aggregation.conf file defines how data is aggregated into lower-precision retentions, and since you only have one retention precision defined (10s for 30 days), it is not needed here.
To do this with the Graphite daemons, you will have to use carbon-aggregator.py, which runs in front of carbon-cache.py to buffer metrics over time. Check the [aggregator] section in the config file. By default, carbon-aggregator listens on port 2023, so you will have to send data points to that port rather than to the carbon-cache plaintext port (2003 by default).
You will also have to specify the aggregation rule in aggregation-rules.conf, which allows you to add several metrics together as they come in. You can find a detailed explanation here.
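As a sketch, an aggregation-rules.conf entry for this case might look like the following (the output metric name test.nb_sum and the 10-second interval are illustrative assumptions, following the "output (frequency) = method input" rule format):

```
test.nb_sum (10) = sum test.nb
```

This tells the aggregator to buffer all incoming test.nb datapoints for 10 seconds and publish their sum as test.nb_sum.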

How to get priority of current job?

In beanstalkd
telnet localhost 11300
USING foo
put 0 100 120 5
hello
INSERTED 1
How can I find out the priority of this job when I reserve it? And can I release it with a new priority equal to the current priority + 100?
Beanstalkd doesn't return the priority with the job data - but you could easily add it as metadata in your own message body. For example, with JSON as a message wrapper:
{"priority": 100, "timestamp": 1302642381, "job": "download http://example.com/"}
The next message that will be reserved will be the next available entry from the selected tubes, according to priority and time - subject to any delay that you had requested when you originally sent the message to the queue.
Addition: You can get the priority of a beanstalkd job (as well as a number of other pieces of information, such as how many times it has previously been reserved), but it requires an additional call - to the stats-job command. Called with the job ID, it returns about a dozen different pieces of information. See the protocol document and your library's docs.
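Packing the priority into the body can be sketched like this in Python (the beanstalk client calls are shown only as comments; the values mirror the example above):

```python
import json

# pack the priority into the job body itself, since reserve doesn't return it
body = json.dumps({"priority": 100,
                   "timestamp": 1302642381,
                   "job": "download http://example.com/"})
# ...put the job with pri=100 via your beanstalk client...

# after reserving the job, read the priority back out of the body
job = json.loads(body)
new_priority = job["priority"] + 100  # release at current priority + 100
# ...release the job with pri=new_priority via your beanstalk client...
print(new_priority)
```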
