Splunk Error Log Dashboard - splunk

I've to create a dashboard in splunk which will show error reporting within the log file:
[2011-09-12 14:13:00:605 GMT][com.abc.rest.Security][http-8080-Processor15] ERROR Unable to decrypt token [abc.com=3502639832.36895.0000; path=/] due to error: Input length must be multiple of 16 when decrypting with padded cipher
[2011-09-12 14:13:00:608 GMT][com.abc.filters.AuthenticationFilter][http-8080-Processor15] DEBUG ValidAuthToken: false
[2011-09-13 16:43:40:134 GMT][com.abc.PerfManager][http-8080-Processor13] ERROR Operation Failed: GET_ACCOUNT_ORDER [Status Code: 0150 Message: ACCESS_DENIED]
[2011-09-13 16:43:40:137 GMT][com.abc.rest.ResolvePackage][http-8080-Processor13] WARN MCE error occurred [StatusCode: 0150].
The above errors are occurring at different times & I want to count those all & show pie chart of all these errors with their count. Basically, these errors could be anything which starts with ERROR.
I should also get the Top10 warnings in the logs with their count.
I couldn't find a good way to implement it in Splunk. Can any one help me out on how to implement it in splunk?
Thanks!

... | rex "ERROR (?<error_type>[^\[]+)" | stats count by error_type
something like that should work. check our http://splunk-base.splunk.com/answers/ if it doesn't work.

Related

How do I solve "-307" error on ZKTeco SDK?

Hello everyone and thanks for reading this problem.
I have a solution in C# using the zkemkeeper dll to get the records from some access control devices. When I "ping" them, there isn't any problem, but when I try to connect to them (Using my solution or the standalone demo to get attendance) I get the "-307" error with the "Unable to connect message". That's not very clear and I would really aprecciate if someone can explain what this error is (please!!!!). I would really like to understand these errors myself, so, where can I find all the definition of these errors?
In short:
1.- What is the problem regarding the "-307" error?
2.- Is there any place where all these errors are documented?
Thanks in advance!!
enter image description here
maybe you should check your device, you can ping them
Attention
The dwErrorCode parameter specifies the error code. The values are described as follows:
During connection, the following error codes may be returned:
0 Connected successfully
-1 Failed to invoke the interface
-2 Failed to initialize
-3 Failed to initialize parameters
-5 Data mode read error
-6 Wrong password
-7 Reply error
-8 Receive timeout
-307 Connection timeout
In invoking other interfaces, the following error codes may be returned:
-201 Device is busy
-199 New Mode
-103 device send back error of face version error
-102 face template version error, like 8.0 face template send to 7.0 device
-101 malloc memory failed
-100 Not supported or the data does not exist
-10 The length of transmitted data is incorrect
-5 Data already exists
-4 Insufficient space
-3 Wrong size
-2 File read/write error
-1 The SDK is not initialized and needs to be reconnected
0 Data not found or duplicate data
1 Correct operation
4 Parameter error
101 Buffer allocation error
102 repeat invoking
Underlying error codes:
-12001 Socket creation timeout (connection timeout)
-12002 Insufficient memory
-12003 Wrong Socket version
-12004 Not TCP protocol
-12005 Waiting timeout
-12006 Data transmission timeout
-12007 Data reading timeout
-12008 Failed to read Socket
-13009 Waiting event error
-13010 Exceeded retry attempts
-13011 Wrong reply ID
-13012 Checksum error
-13013 Waiting event timeout
-13014 DIRTY_DATA
-13015 Buffer size too small
-13016 Wrong data length
-13017 Invalid data read1
-13018 Invalid data read2
-13019 Invalid data read3
-13020 Data loss
-13021 Memory initialization error
-15001 Invoking return value of status key issued by SetShortkey interface repeatedly
-15002 Invoking return value of description issued by SetShortkey interface repeatedly
-15003 The two level menu is not opened in the device, and the data need not be issued
getdevicedata and setdevicedata invocation error codes
-15100 Error occurs in obtaining table structure
-15101 The condition field does not exist in the table structure
-15102 Inconsistency in the total number of fields
-15103 Inconsistency in sorting fields
-15104 Memory allocation error
-15105 Data parsing error
-15106 Data overflow as the transmitted data exceeds 4M
-15108 Invalid options
-15113 Data parsing error: table ID not found
-15114 A data exception is returned as the number of fields is smaller than or equal to 0
-15115 A data exception is returned as the total number of table fields is inconsistent with the
total number of fields of the data
Firmware error codes:
2000 Return OK to execute
-2001 Return Fail to execute command
-2002 Return Data
-2003 Regstered event occorred
-2004 Return REPEAT Command
-2005 Return UNAUTH Command
0xffff Return Unknown Command
-4999 Device parameter read error
-4998 Device parameter write error
-4997 The length of the data sent by the software to the device is incorrect
228
229
-4996 A parameter error exists in the data sent by the software to the device
-4995 Failed to add data to the database
-4994 Failed to update the database
-4993 Failed to read data from the database
-4992 Failed to delete data in the database
-4991 Data not found in the database
-4990 The data amount in the database reaches the limit
-4989 Failed to allocate memory to a session
-4988 Insufficient space in the memory allocated to a session
-4987 The memory allocated to a session overflows
-4986 File does not exist
-4985 File read failure
-4984 File write failure
-4983 Failed to calculate the hash value
-4982 Failed to allocate memory
Note
This interface is applicable to the new architecture firmware.

Datastream Troubleshoot: "An unknown error occurred. Please try again. If the error persists, contact Google support"

We are trying to replicate data from AlloyDB to Bigquery using Datastream.
We Get "An unknown error occurred. Please try again. If the error persists, contact Google support."
In the Datastream console --> objects list, we see all source tables with Object Status "Failed" and Backfill status "Completed".
In Bigquery we see only a subset of the tables (not all the "Completed" objects were synced).
In the Logs Explorer I can see this error on BQ:
I also see this error: error: {
code: 11
message: "Unsupported primary key column either does not exist or is a pseudocolumn at [1:401]"
}
The column referred in the error is of type enum.
The desired situation is having all the AlloyDB tables replicated into Bigquery.
The error message is not very informative...
What does it mean?
What would be the best way to go about troubleshooting this?
We're actively working on making these error messages be more informative, and improvements are continuously being rolled out as we identify more edge cases. Assuming you followed all the steps in the documentation, then you may need to open a ticket with support for further investigation. If a support ticket isn't an option, you can still report the issue using the public issue tracker
I just had this same issue but connecting to a PostgreSQL in AWS RDS:
Beginning with Postgres 10, passwords are encrypted using SCRAM-SHA-256 in PostgreSQL. Google DataStream still expects MD5 password encryption, or it will generate an "unknown error" in the logs and fail the backfills.
You'll need to update your postgresql.conf (or RDS Cluster Parameter Group if you're using AWS like me):
password_encryption = 'MD5'
Restart the database and make sure the parameter has changed with:
SHOW password_encryption;
Reset the password of your users:
ALTER USER "{username}" with password '{password}';
More info from the PostgreSQL docs: https://www.postgresql.org/docs/current/auth-password.html

Can you combine multivalue fields to form a consolidated Splunk alert?

I have a Splunk search which returns several logs of the same exception, one for each ID number (from a batch process). I have no problem extracting the field from the log with reg-ex and can build a single alert for each ID number easily.
Slack Message: "Reference number $result.extractedField$ has failed processing."
Since the error happens in batches, sending out an alert for every reference ID that failed would clutter up my Slack channel very quickly. Is it possible to collect all of the extracted fields and set the alert to send only one message? Like this...
Slack Message: "Reference numbers $result.listOfExtractedFields$ have failed to process."
To have a consolidated alert you need consolidated search results. Do that like this:
index=the_index_youre_searching "the class where the error occurs" "the exception you're looking for"
| stats values(*) as * by referenceID
Be sure to select the "Once" Trigger Condition in the alert setup.

Splunk rex query does not return desired result

I am looking to search for error type in my spunk. A typical error log looks like this:
ERROR 2016/03/16 22:13:55 Program exited with error Calling service: Post http://hostname/v1.21/resource/create?name=/60b80cf9-ebc4-11e5-a9cb-3c4a92db9491-2: read unix #->/var/run/program.sock: use of closed network connection (Client.Timeout exceeded while awaiting headers)
Note that common part is "Program exited with error". I am looking to capture the part that follows this common part of the error message. I tried with a couple of rex expressions. Both returned different results. Importantly, neither captured the error type I have shown above. I am giving the one that worked better here.
* | rex "Program exited with error\s+(?<reason>.+)" | top reason
An example of the log it matched-
Unable to get program status, Get http://192.168.0.2:2774/program/v1/status: net/http: timeout awaiting response headers
However, it did not match log of the form-
initial ZK connection failed, stat /var/program/f47aae5c-ea42-11e5-8975-fc15b40f4cc4/srcheck/started: no such file or directory
Calling service: Post http://hostname/v1.21/resource/create?name=/60b80cf9-ebc4-11e5-a9cb-3c4a92db9491-2: read unix #->/var/run/program.sock: use of closed network connection (Client.Timeout exceeded while awaiting headers)
Could someone help me understand what's wrong with my rex expression and what the right one would be so I get all possible error types?
This recipe:
"ERROR.*Program exited with error.*:.*:.*:\s+(?<reason>.+)"
will yield:
use of closed network connection (Client.Timeout exceeded while awaiting headers)
I don't have enough sample data to know if this will hold up or not. For example, I'm counting on exactly 3 colons, to get me to the interesting part. Also I don't know if you care about other things like the hostname, the fact that it's a Post, etc.. But based on your sample of 1, this answer should do the trick.

How to make Xlib print error message, but not exit?

"The action of the default handlers is to print an explanatory message and exit." (link)
Example of such message:
X Error of failed request: BadWindow (invalid Window parameter)
Major opcode of failed request: 12 (X_ConfigureWindow)
Resource id in failed request: 0xc0007a
Serial number of failed request: 140
Current serial number in output stream: 141
If I set (XSetErrorHandler) my own "ignore everything" error handler, the "explanatory messages" disappear.
How to make Xlib ignore errors, but still print error messages?
If you actually want those error messages you pretty much have two options:
Pull _XPrintDefaultError out of XlibInt.c along with some private headers (with all the caveats of using library-private definitions).
Redefine exit() not to actually exit when _XDefautError calls it.
Neither is especially pretty and both may break and reduce your portability, but they do work.
You have to format your own message. The contents of the message is the contents of the XErrorEvent struct:
http://tronche.com/gui/x/xlib/event-handling/protocol-errors/default-handlers.html