Create a static table in Splunk

I have a tiny table of data that I want to display as a reference on a dashboard - something like:
date      | val1 | val2
9/16/2020 | 10   | 12
9/17/2020 | 11   | 14
9/18/2020 | 12   | 13
that I want to display as a line chart.
I found this very convoluted way to construct it:
| makeresults
| eval testDay="9/16/2020"
| eval testVal1=10
| eval testVal2=12
| append
[| makeresults
| eval testDay="9/17/2020"
| eval testVal1=11
| eval testVal2=14
]
| append
[| makeresults
| eval testDay="9/18/2020"
| eval testVal1=12
| eval testVal2=13
]
| chart first(testVal1), first(testVal2) over testDay
Is there a simpler way? Perhaps something more like my little tabular syntax in the table at the beginning of the post? Or at least something more like:
val1 = [10,11,12]

There is a simpler way. It's what I use to produce test data when answering questions about Splunk.
| makeresults
| eval _raw="date val1 val2
9/16/2020 10 12
9/17/2020 11 14
9/18/2020 12 13"
| multikv forceheader=1
| chart values(val1) as val1, values(val2) as val2 by date
For multikv to work properly, it's important that the header and the data columns line up vertically.
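If you're on Splunk 9.0 or later, makeresults can also parse inline CSV directly via its format and data arguments, which avoids the alignment requirement. A sketch (check the makeresults documentation for your version):
| makeresults format=csv data="date,val1,val2
9/16/2020,10,12
9/17/2020,11,14
9/18/2020,12,13"
| chart values(val1) as val1, values(val2) as val2 by date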

Related

SELECT 1 ID and all belonging elements

I am trying to create a JSON select query which gives me back the result in the following way:
one row contains one main_message_id and its belonging messages (like the second table below). The JSON format is not a requirement; if it works with other methods, that is fine.
I store the data like this:
+-----------------+---------+----------------+
| main_message_id | message | sub_message_id |
+-----------------+---------+----------------+
| 1 | test 1 | 1 |
| 1 | test 2 | 2 |
| 1 | test 3 | 3 |
| 2 | test 4 | 4 |
| 2 | test 5 | 5 |
| 3 | test 6 | 6 |
+-----------------+---------+----------------+
I would like to create a query which gives me back the data like this:
+-----------------+--------------------------+
| main_message_id | message                  |
+-----------------+--------------------------+
| 1               | {test 1}{test 2}{test 3} |
| 2               | {test 4}{test 5}         |
| 3               | {test 6}                 |
+-----------------+--------------------------+
You can use json_agg() for that:
select main_message_id, json_agg(message) as messages
from the_table
group by main_message_id;
Note that {test 1}{test 2}{test 3} is not valid JSON; the above will return a valid JSON array, e.g. ["test 1", "test 2", "test 3"].
If you just want a comma-separated list, use string_agg():
select main_message_id, string_agg(message, ', ') as messages
from the_table
group by main_message_id;
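If the order of the aggregated messages matters, both aggregate functions accept an ORDER BY inside the call; a sketch, assuming you want them ordered by sub_message_id:
select main_message_id, json_agg(message order by sub_message_id) as messages
from the_table
group by main_message_id;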

How to get part of the String before last delimiter in AWS Athena

Suppose I have the following table in AWS Athena
+----------------+
| Thread |
+----------------+
| poll-23 |
| poll-34 |
| pool-thread-24 |
| spartan.error |
+----------------+
I need to extract the part of the string in the column before the last delimiter (here '-' is the delimiter).
Basically I need a query which can give me this output:
+----------------+
| Thread |
+----------------+
| poll |
| poll |
| pool-thread |
| spartan.error |
+----------------+
I also need a GROUP BY query which can generate this:
+---------------+-------+
| Thread | Count |
+---------------+-------+
| poll | 2 |
| pool-thread | 1 |
| spartan.error | 1 |
+---------------+-------+
I tried various forms of MySQL queries using the LEFT(), RIGHT(), LOCATE(), and SUBSTRING_INDEX() functions, but it seems that Athena does not support all of these functions.
You could use regexp_replace() to remove the part of the string that follows the last '-':
select regexp_replace(thread, '-[^-]*$', ''), count(*)
from mytable
group by regexp_replace(thread, '-[^-]*$', '')
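If you'd rather not repeat the regexp_replace() expression, you can compute it once in a derived table and group by the alias; a sketch using the same mytable:
select thread_prefix, count(*) as cnt
from (select regexp_replace(thread, '-[^-]*$', '') as thread_prefix
from mytable) t
group by thread_prefix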

In Hive, what is the difference between explode() and lateral view explode()?

Assume there is a table employee:
+-----------+------------------+
| col_name | data_type |
+-----------+------------------+
| id | string |
| perf | map<string,int> |
+-----------+------------------+
and the data inside this table:
+-----+------------------------------------+
| id  | perf                               |
+-----+------------------------------------+
| 1   | {"job":80,"person":70,"team":60}   |
| 2   | {"job":60,"team":80}               |
| 3   | {"job":90,"person":100,"team":70}  |
+-----+------------------------------------+
I tried the following two queries, but they both return the same result:
1. select explode(perf) from employee;
2. select key,value from employee lateral view explode(perf) as key,value;
The result:
+---------+--------+
| key     | value  |
+---------+--------+
| job     | 80     |
| team    | 60     |
| person  | 70     |
| job     | 60     |
| team    | 80     |
| job     | 90     |
| team    | 70     |
| person  | 100    |
+---------+--------+
So, what is the difference between them? I did not find suitable examples. Any help is appreciated.
For your particular case both queries are OK. But you can't use multiple explode() functions without a lateral view, so the query below will fail:
select explode(array(1,2)), explode(array(3, 4))
You'll need to write something like:
select
a_exp.a,
b_exp.b
from (select array(1, 2) as a, array(3, 4) as b) t
lateral view explode(t.a) a_exp as a
lateral view explode(t.b) b_exp as b
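One more practical difference: a bare explode() in the SELECT list cannot be combined with other columns, while a lateral view can, because it joins the exploded rows back to their source row. A sketch against the employee table above:
-- works: id stays available alongside the exploded map
select id, perf_key, perf_value
from employee
lateral view explode(perf) p as perf_key, perf_value;
-- fails: a UDTF can't be mixed with other select expressions
-- select id, explode(perf) from employee;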

Adding a field to differentiate parts of tables

I have several gigabytes of ArduCopter binary flight logs. Each log is a series of messages.
MessageType1: param1, param2, param3
MessageType2: param3, param4, param5, param6
...
The logs are self-describing in the sense that the first time a message type appears in the log, it gives the names of its params.
MessageType1: timestamp, a, b
MessageType1: value 1, value 2, value 3
MessageType2: timestamp, c, d, e
MessageType1: value 4, value 5, value 6
MessageType1: value 7, value 8, value 9
MessageType2: value 10, value 11, value 12, value 13
I have written a python script that takes the logs apart and creates tables for each message type in a sqlite database where the message type is the table name and the parameter name is the column name.
Table MessageType1
| Flight Index | Timestamp | a | b |
|--------------|-----------|-------|---------|
| ... | | | |
| "Flight 1" | 111 | 14725 | 10656.0 |
| "Flight 1" | 112 | 57643 | 10674.0 |
| "Flight 1" | 113 | 57157 | 13674.0 |
| ... | | | |
| "Flight 2" | 111 | 56434 | 16543.7 |
| "Flight 2" | 112 | 56434 | 16543.7 |
Table MessageType2
| Flight Index | Timestamp | c | d | e |
|--------------|-----------|-------|---------|--------|
| ... | | | | |
| "Flight 1" | 111 | 14725 | 10656.0 | 462642 |
| "Flight 1" | 112 | 57643 | 10674.0 | 426428 |
| "Flight 1" | 113 | 57157 | 13674.0 | 642035 |
| ... | | | | |
| "Flight 2" | 111 | 56434 | 16543.7 | 365454 |
| "Flight 2" | 112 | 56434 | 16543.7 | 754632 |
| ... | | | | |
For a single log this database is good enough, but I would like to add several logs, meaning messages of the same type from several logs go into a single table.
In this case I added a "Flight Index" column, which is what I would like to have, but:
Each log processed should have a unique identifier.
The identifier should be minimal in size, as I'm dealing with tables that possibly have millions of rows.
I'm thinking of adding the flight index as an integer and just incrementing the number when processing logs; if the database already exists, taking the last row of a table and using its index + 1. Is this optimal, or is there a SQL-native way of doing it?
Am I doing something wrong in general, as I'm not experienced with SQL?
EDIT: added a second table and example messages to show that messages don't have the same number of parameters.
You can achieve this with two tables:
Table 1
Flights
Flight name, Flight number, date, device, etc. (any other data points that make sense)
"Flight 1", 1, 1/1/2018,...
"Flight 2", 2, 1/2/2018,...
Table 2
Flight_log
Flight_number, timestamp, parameter1, parameter2,
1,111,14725,10656.0
1,112,57643,10674.0
1,113,57157,13674.0
...
2,111,56434,16543.7
2,112,56434,16543.7
Before you load the Flight_log table you should have an entry in the Flights table; you can do a "lookup" to get the Flight_number from the Flights table.
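A SQL-native way to get that unique, compact identifier is to let SQLite assign it: declare the key as INTEGER PRIMARY KEY and it becomes an alias for the auto-assigned rowid, so no "last index + 1" logic is needed. A minimal sketch (table and column names are illustrative, not from your schema):
CREATE TABLE IF NOT EXISTS flights (
    flight_id   INTEGER PRIMARY KEY,  -- auto-assigned, stays compact
    flight_name TEXT,
    flight_date TEXT
);
CREATE TABLE IF NOT EXISTS message_type1 (
    flight_id INTEGER REFERENCES flights(flight_id),
    timestamp INTEGER,
    a REAL,
    b REAL
);
INSERT INTO flights (flight_name, flight_date) VALUES ('Flight 3', '2018-01-03');
-- capture the new id once (last_insert_rowid() in SQL, cursor.lastrowid
-- in Python's sqlite3) and reuse it for every message row of this log
INSERT INTO message_type1 (flight_id, timestamp, a, b)
VALUES (last_insert_rowid(), 111, 14725, 10656.0);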
After reading about data normalization I ended up with the following database.
This minimizes the number of tables. I could have made 35 tables (one for each message type) with the right parameters for each column, but that would make the database more fragile in the case where the parameters in a message are changed.
EDIT: replaced the image as the data modeler got fixed.

postgres: Multiply column of table A with rows of table B

Fellow SOers,
currently I am stuck on the following problem.
Say we have table "data" and table "factor".
"data":
---------------------
| col1 | col2 |
----------------------
| foo | 2 |
| bar | 3 |
----------------------
and table "factor" (the amount of rows is variable)
---------------------
| name | val |
---------------------
| f1 | 7 |
| f2 | 8 |
| f3 | 9 |
| ... | ... |
---------------------
and the following result should look like this:
---------------------------------
| col1 | f1 | f2 | f3 | ...|
---------------------------------
| foo | 14 | 16 | 18 | ...|
| bar | 21 | 24 | 27 | ...|
---------------------------------
So basically I want column "col2" multiplied by all the values of "val" from table "factor", AND the contents of column "name" should act as the table headers / column names for the result.
We are using Postgres 9.3 (an upgrade to a higher version may be possible). An extended search turned up multiple possible solutions: using crosstab (though even with crosstab I was not able to figure this one out), or using a WITH CTE (preferred, but also no luck). It could probably also be done with the correct use of array() and unnest().
Hence, any help on how to achieve this is appreciated (the less code, the better).
Thanks in advance!
This package seems to do what you want:
https://github.com/hnsl/colpivot
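The core of the computation is just a cross join of the two tables; the hard part is turning the variable number of factor rows into columns, which static SQL cannot do (that dynamic pivot is what colpivot generates for you). A sketch of the long-format intermediate result, using the sample tables above:
select d.col1, f.name, d.col2 * f.val as product
from data d
cross join factor f
order by d.col1, f.name;
colpivot (or tablefunc's crosstab, if the factor names were fixed) can then pivot "name" into one column per factor.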