Splunk graph data grouped by "release" and "time"

I need to create a graph with the date on the x axis and "successfully_processed" and "failed_to_process" on the y axis, grouped by "release".
This is my example:
| makeresults
| eval raw="100, 2, typeA, 2022-05-25T19:53:51.000-07:00|110, 3, typeA, 2022-05-26T19:53:51.000-08:00|150, 1, typeB, 2022-05-25T19:53:51.000-08:00"
| makemv raw delim="::"
| mvexpand raw
| fields - _time
| streamstats count AS _serial
| makemv raw delim="|"
| mvexpand raw
| rex field=raw "^(?<success>[^,]+),(?<fail>[^,]+),(?<release>[^,]+),(?<_time>[^,]+)$"
| fields - raw
| stats values(success) as Successfully_processed values(fail) as Failed_to_process by release
When I group them by release I can't figure out how to get the date as well. I need "successfully_processed" and "failed_to_process" from every log to be shown per day, grouped by "release".
Can anyone help please? Thank you

Try the chart command.
| makeresults
| eval raw="100, 2, typeA, 2022-05-25T19:53:51.000-07:00|110, 3, typeA, 2022-05-26T19:53:51.000-08:00|150, 1, typeB, 2022-05-25T19:53:51.000-08:00"
| makemv raw delim="::"
| mvexpand raw
| streamstats count AS _serial
| makemv raw delim="|"
| mvexpand raw
| rex field=raw "^(?<success>[^,]+),(?<failure>[^,]+),(?<release>[^,]+),(?<_time>[^,]+)$"
| fields - raw
| chart values(success) as success, values(failure) as failure over _time by release
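The question asks for one data point per day, while chart plots one column per distinct _time value. If you need daily buckets, bin _time before charting; a small addition to the end of the search above, assuming day granularity is what you want:
| bin _time span=1d
| chart values(success) as success, values(failure) as failure over _time by release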


PostgreSQL ORDER BY out of order

I have a database where I need to retrieve the data in the same order it was populated into the table. The table name is bible. When I type table bible; in psql, it prints the data in the order it was populated, but when I try to retrieve it with a query, some rows are always out of order, as in the example below:
table bible
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 1
day | 1
book | Genesis
chapter | 1
verse | 1
text | In the beginning God created the heavens and the earth.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=Genesis1.1&key=dc5e2d416f46150bf6ceb21d884b644f
-[ RECORD 2 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 2
day | 1
book | John
chapter | 1
verse | 1
text | In the beginning was the Word, and the Word was with God, and the Word was God.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=John1.1&key=dc5e2d416f46150bf6ceb21d884b644f
-[ RECORD 3 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 3
day | 1
book | John
chapter | 1
verse | 2
text | The same was in the beginning with God.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=John1.2&key=dc5e2d416f46150bf6ceb21d884b644f
Everything is in order, but when I query the same thing using, for example, select * from bible where day='1', select * from bible where day='1' order by day, or select * from bible where day='1' order by day, id;, I always get some rows out of order, either in the selected day (here 1) or any other day.
I have been using Django to interface with the Postgres database, but since I found this problem I tried querying with plain SQL as well; I still get rows out of order, although they all have unique ids, which I verified with select count(distinct id), count(id) from bible;
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 1
day | 1
book | Genesis
chapter | 1
verse | 1
text | In the beginning God created the heavens and the earth.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=Genesis1.1&key=dc5e2d416f46150bf6ceb21d884b644f
-[ RECORD 2 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 10
day | 1
book | Colossians
chapter | 1
verse | 18
text | And he is the head of the body, the church: who is the beginning, the firstborn from the dead; that in all things he might have the preeminence.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=Colossians1.18&key=dc5e2d416f46150bf6ceb21d884b644f
-[ RECORD 3 ]-----------------------------------------------------------------------------------------------------------------------------------------
id | 11
day | 1
book | Genesis
chapter | 1
verse | 2
text | And the earth was waste and void; and darkness was upon the face of the deep: and the Spirit of God moved upon the face of the waters.
link | https://api.biblia.com/v1/bible/content/asv.txt.txt?passage=Genesis1.2&key=dc5e2d416f46150bf6ceb21d884b644f
As you can see above, the ids come back out of order: 1, 10, 11.
My table:
Table "public.bible"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
---------+------+-----------+----------+---------+----------+--------------+-------------
id | text | | | | extended | |
day | text | | | | extended | |
book | text | | | | extended | |
chapter | text | | | | extended | |
verse | text | | | | extended | |
text | text | | | | extended | |
link | text | | | | extended | |
Access method: heap
The id field is of type text because I used pandas' to_sql() method to populate the bible table. I tried dropping the id column and adding it back as a PK with ALTER TABLE bible ADD COLUMN id SERIAL PRIMARY KEY; but I still get the data returned out of order.
Is there any way I can retrieve the data ordered by id, without having some of the rows totally out of order? Thank you in advance!
Thou shalt cast thy id to integer to order it as a number.
SELECT * FROM bible ORDER BY cast(id AS integer);
While #jordanvrtanoski is correct, the way to do this in Django is:
>>> Bible.objects.extra(select={'id': 'CAST(id AS INTEGER)'}).order_by('id').values('id')
<QuerySet [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 10}, {'id': 20}]>
Side note: If you want to filter on day as an example, you can do this:
>>> Bible.objects.extra(select={
'id': 'CAST(id AS INTEGER)',
'day': 'CAST(day AS INTEGER)'}
).order_by('id').values('id', 'day').filter(day=2)
<QuerySet [{'id': 2, 'day': 2}, {'id': 10, 'day': 2}, {'id': 11, 'day': 2}, {'id': 20, 'day': 2}]>
Otherwise you get this issue: (notice 1 is followed by 10 and not 2)
>>> Bible.objects.order_by('id').values('id')
<QuerySet [{'id': '1'}, {'id': '10'}, {'id': '2'}, {'id': '20'}, {'id': '3'}]>
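On newer Django versions the same cast can also be written without extra(), using Cast from django.db.models.functions; a short sketch, reusing the Bible model from the question:
from django.db.models import IntegerField
from django.db.models.functions import Cast

# Annotate a cast copy of id and order by it, avoiding raw SQL in extra()
Bible.objects.annotate(id_int=Cast('id', IntegerField())).order_by('id_int').values('id')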
I HIGHLY suggest you DO NOT do any of this, and instead set up your tables correctly (use the correct column types and don't have everything as text), or your query performance is going to suck... BIG TIME.
Building on both answers from #jordanvrtanoski and #Javier Buzzi, and some searching online, the issue is that the ids are of type TEXT (VARCHAR behaves the same way), so you need to cast the id column to type INTEGER, as follows:
ALTER TABLE bible ALTER COLUMN id TYPE integer USING (id::integer);
Now here is my table
Table "public.bible"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
---------+---------+-----------+----------+-----------------------------------------+----------+--------------+-------------
id | integer | | | nextval('bible_id_seq'::regclass) | plain | |
day | text | | | | extended | |
book | text | | | | extended | |
chapter | text | | | | extended | |
verse | text | | | | extended | |
text | text | | | | extended | |
link | text | | | | extended | |
Indexes:
"lesson_unique_id" UNIQUE CONSTRAINT, btree (id)
Referenced by:
TABLE "notes_note" CONSTRAINT "notes_note_verse_id_5586a4bf_fk" FOREIGN KEY (verse_id) REFERENCES days_lesson(id) DEFERRABLE INITIALLY DEFERRED
Access method: heap
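If the remaining numeric-looking columns (day, chapter, verse) should also sort numerically, the same USING cast applies; a sketch, assuming those columns contain only integer text:
ALTER TABLE bible
    ALTER COLUMN day TYPE integer USING (day::integer),
    ALTER COLUMN chapter TYPE integer USING (chapter::integer),
    ALTER COLUMN verse TYPE integer USING (verse::integer);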
Hope this helps other people, and thank you everyone!

Create a static table in Splunk

I have a tiny table of data that I want to display as a reference on a dashboard - something like:
date | val1 | val2
9/16/2020 | 10 | 12
9/17/2020 | 11 | 14
9/18/2020 | 12 | 13
that I want to display as a line chart.
I found this very convoluted way to construct it:
| makeresults
| eval testDay="9/16/2020"
| eval testVal1=10
| eval testVal2=12
| append
[| makeresults
| eval testDay="9/17/2020"
| eval testVal1=11
| eval testVal2=14
]
| append
[| makeresults
| eval testDay="9/18/2020"
| eval testVal1=12
| eval testVal2=13
]
| chart first(testVal1), first(testVal2) over testDay
Is there a simpler way? Perhaps something more like my little tabular syntax in the table at the beginning of the post? Or at least more like:
val1 = [10,11,12]
There is a simpler way. It's what I use to produce test data when answering questions about Splunk.
| makeresults
| eval _raw="date val1 val2
9/16/2020 10 12
9/17/2020 11 14
9/18/2020 12 13"
| multikv forceheader=1
| chart values(val1) as val1, values(val2) as val2 by date
For multikv to work properly, it's important that the header and the data lines line up vertically.
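On Splunk 9.0 or newer, makeresults can also take inline CSV directly via its format and data options, which sidesteps the column-alignment requirement; a sketch, assuming that option is available in your deployment:
| makeresults format=csv data="date,val1,val2
9/16/2020,10,12
9/17/2020,11,14
9/18/2020,12,13"
| chart values(val1) as val1, values(val2) as val2 by date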

How to get part of the String before last delimiter in AWS Athena

Suppose I have the following table in AWS Athena
+----------------+
| Thread |
+----------------+
| poll-23 |
| poll-34 |
| pool-thread-24 |
| spartan.error |
+----------------+
I need to extract the part of the string in this column before the last delimiter (here '-' is the delimiter).
Basically, I need a query which gives me this output:
+----------------+
| Thread |
+----------------+
| poll |
| poll |
| pool-thread |
| spartan.error |
+----------------+
Also, I need a GROUP BY query which can generate this:
+---------------+-------+
| Thread | Count |
+---------------+-------+
| poll | 2 |
| pool-thread | 1 |
| spartan.error | 1 |
+---------------+-------+
I tried various forms of MySQL queries using the LEFT(), RIGHT(), LOCATE(), and SUBSTRING_INDEX() functions, but it seems that Athena does not support all of these functions.
You could use regexp_replace() to remove the part of the string that follows the last '-':
select regexp_replace(thread, '-[^-]*$', ''), count(*)
from mytable
group by regexp_replace(thread, '-[^-]*$', '')
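A minor variation, if you prefer not to repeat the regexp_replace() expression: Presto, the engine behind Athena, accepts ordinal references in GROUP BY, so the following should be equivalent (mytable as above):
select regexp_replace(thread, '-[^-]*$', '') as thread, count(*) as cnt
from mytable
group by 1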

Splunk query to list out by stats from json kind of log

I have a Splunk log where the log will be in JSON format or as raw data. I need to write a Splunk query using the stats command.
index=* application_name=abc type=imp | stats count by status
I tried the 'stats count by status' command, but nothing worked. I also tried 'stats count by message_text:data' and 'stats count by message_text:data:status'.
The log looks like this:
{"application_name":"abc","type":"imp"},"box":"dev","message_text":"{\"data\":{\"error\":"invalid",\"status\":"200"}}
I need to get the count by status and by type.
#Sateesh M, can you please try this?
YOUR SEARCH
| rex field=_raw "\"message_text\":\"(?<data>.*)$"
| rex mode=sed field=data "s/\\\\\"/\"/g"
| eval _raw=data
| kv
| stats count by "data.status"
My Sample Search:
| makeresults
| eval _raw="{\"application_name\":\"abc\",\"type\":\"imp\"},\"box\":\"dev\",\"message_text\":\"{\\\"data\\\":{\\\"error\\\":\"invalid\",\\\"status\\\":\"200\"}}"
| rex field=_raw "\"message_text\":\"(?<data>.*)$"
| rex mode=sed field=data "s/\\\\\"/\"/g"
| eval _raw=data
| kv
| stats count by "data.status"
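The question also asks for the count by type. Since eval _raw=data replaces the original event, one approach (a sketch, using the field names from the sample log above) is to extract type before overwriting _raw:
YOUR SEARCH
| rex field=_raw "\"type\":\"(?<type>[^\"]+)\""
| rex field=_raw "\"message_text\":\"(?<data>.*)$"
| rex mode=sed field=data "s/\\\\\"/\"/g"
| eval _raw=data
| kv
| stats count by "data.status", type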

Adding a field to differentiate parts of tables

I have several gigabytes of ArduCopter binary flight logs. Each log is a series of messages.
MessageType1: param1, param2, param3
MessageType2: param3, param4, param5, param6
...
The logs are self-describing in the sense that the first time a message type appears in the log, it gives the names of its params.
MessageType1: timestamp, a, b
MessageType1: value 1, value 2, value 3
MessageType2: timestamp, c, d, e
MessageType1: value 4, value 5, value 6
MessageType1: value 7, value 8, value 9
MessageType2: value 10, value 11, value 12, value 13
I have written a Python script that takes the logs apart and creates a table for each message type in an SQLite database, where the message type is the table name and the parameter names are the column names.
Table MessageType1
| Flight Index | Timestamp | a | b |
|--------------|-----------|-------|---------|
| ... | | | |
| "Flight 1" | 111 | 14725 | 10656.0 |
| "Flight 1" | 112 | 57643 | 10674.0 |
| "Flight 1" | 113 | 57157 | 13674.0 |
| ... | | | |
| "Flight 2" | 111 | 56434 | 16543.7 |
| "Flight 2" | 112 | 56434 | 16543.7 |
Table MessageType2
| Flight Index | Timestamp | c | d | e |
|--------------|-----------|-------|---------|--------|
| ... | | | | |
| "Flight 1" | 111 | 14725 | 10656.0 | 462642 |
| "Flight 1" | 112 | 57643 | 10674.0 | 426428 |
| "Flight 1" | 113 | 57157 | 13674.0 | 642035 |
| ... | | | | |
| "Flight 2" | 111 | 56434 | 16543.7 | 365454 |
| "Flight 2" | 112 | 56434 | 16543.7 | 754632 |
| ... | | | | |
For a single log this database is good enough, but I would like to add several logs, meaning messages of the same type from several logs go into a single table.
In this case I added a "Flight Index" column, which is what I would like to have, but:
Each log processed should have a unique identifier.
The identifier should be minimal in size, as I'm dealing with tables that can have millions of rows.
I'm thinking of adding the flight index as an integer and just incrementing it when processing logs; if the database already exists, I would take the last row of a table and use its index + 1. Is this optimal, or is there an SQL-native way of doing it?
Am I doing something wrong in general? I'm not experienced with SQL.
EDIT: added a second table to show that messages don't have the same number of parameters, and added example messages.
You can achieve this with two tables
Table 1
Flights
Flight name, Flight number, date, device, etc. (any other data points that make sense)
"Flight 1", 1, 1/1/2018,...
"Flight 2", 2, 1/2/2018,...
Table 2
Flight_log
Flight_number, timestamp, parameter1, parameter2,
1,111,14725,10656.0
1,112,57643,10674.0
1,113,57157,13674.0
...
2,111,56434,16543.7
2,112,56434,16543.7
Before you load the Flight_log table you should have an entry in the Flights table; you can do a "lookup" to get the Flight_number from the Flights table.
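A sketch of that design in SQLite, the database the question uses; the table and column names here are illustrative, not the asker's actual schema. An INTEGER PRIMARY KEY aliases SQLite's built-in rowid, so the id costs essentially nothing extra to store, and SQLite assigns the next value itself, which answers the "SQL-native way" question:
CREATE TABLE flights (
    flight_id INTEGER PRIMARY KEY,  -- aliases rowid; SQLite picks the next id automatically
    name      TEXT,
    date      TEXT
);

CREATE TABLE message_type1 (
    flight_id INTEGER NOT NULL REFERENCES flights(flight_id),
    timestamp INTEGER,
    a         INTEGER,
    b         REAL
);

-- Insert the flight first, then reuse its generated id for the log rows:
INSERT INTO flights (name, date) VALUES ('Flight 1', '1/1/2018');
INSERT INTO message_type1 (flight_id, timestamp, a, b)
VALUES (last_insert_rowid(), 111, 14725, 10656.0);
From Python's sqlite3, cursor.lastrowid gives the same generated id after the INSERT into flights.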
After reading about data normalization I ended up with the following database (schema diagram not reproduced here). This minimizes the number of tables: I could have created 35 tables (one for each message type) with the right parameters as columns, but that would make the database more fragile in the case where the parameters in a message are changed.