How to extract data using multiple delimited values in splunk - splunk

I have the below string in logs with multiple delimiters (: = and #). I am expecting all the values in tabular format like
tenant |countryCode |deviceType |platformID|paymentMethod1|paymentMethod2|userAgent
XYZ | US | IOS |13 |p1 |p2 |Mozilla /20.0.553 Mozilla/5.0
logs string
TrackingLogs tenant=XYZ, countryCode=US, deviceType:IOS, platformID:13,currency=USD, paymentMethods:P1 # P1 # P2 # P2 # P4 # , userAgent:Mozilla /20.0.553 Mozilla/5.0
I tried for ':' but no result
search string| rex field=_raw "deviceType\:\s+?(?<deviceType>\S+)" |table deviceType
for = I used below query it worked but don't know how to combine it with : and #
search trackinglog | rex field=tenant "(?<tenant>[^\.]*)\.[a-zA-Z]"| table _raw tenant, countryCode , currency , paymentMethods

The problem with the first query is not the separator, but the regex itself. It expects a space where none exists. This variation works:
| rex field=_raw "deviceType:\s*?(?<deviceType>\S+)" |table deviceType
For better results, however, try the extract command.
| extract pairdelim="," kvdelim=":="
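To see what splitting on "," for pairs and on ":" or "=" for key/value does to the sample line, here is a minimal Python sketch. It only mimics the idea outside Splunk and is not how extract is implemented; the helper name parse_pairs is hypothetical, and the extra split of paymentMethods on "#" is an assumption about how the two payment method columns would be derived.

```python
import re

# Sample event copied from the question.
line = ("TrackingLogs tenant=XYZ, countryCode=US, deviceType:IOS, platformID:13,"
        "currency=USD, paymentMethods:P1 # P1 # P2 # P2 # P4 # , "
        "userAgent:Mozilla /20.0.553 Mozilla/5.0")

def parse_pairs(raw):
    """Split on ',' for pairs, then on the first ':' or '=' for key and value."""
    body = raw.split(" ", 1)[1]  # drop the leading "TrackingLogs" label
    fields = {}
    for pair in body.split(","):
        parts = re.split(r"[:=]", pair.strip(), maxsplit=1)
        if len(parts) == 2:
            fields[parts[0].strip()] = parts[1].strip()
    return fields

fields = parse_pairs(line)

# paymentMethods is itself '#'-delimited; keep only the non-empty entries.
methods = [m.strip() for m in fields["paymentMethods"].split("#") if m.strip()]

print(fields["tenant"], fields["deviceType"], fields["platformID"])  # XYZ IOS 13
print(methods)  # ['P1', 'P1', 'P2', 'P2', 'P4']
```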

Related

CloudWatch Logs Insights display a field from the JSON in the log message

This is my log entry from AWS API Gateway:
(8d036972-0445) Method request body before transformations: {"TransactionAmount":225.00,"OrderID":"1545623982","PayInfo":{"Method":"ec","TransactionAmount":225.00},"CFeeProcess":0}
I want to write a CloudWatch Logs Insights query which can display AWS request id, present in the first parenthesis and the order id present in the json.
I'm able to get the AWS request id by parsing the message. How can I get the OrderID json field?
Any help is greatly appreciated.
| parse @message "(*) Method request body before transformations: *" as awsReqId,JsonBody
#| filter OrderID = "1545623982" This did not work
| display awsReqId,OrderID
| limit 20
You can do it with two parse steps, like this:
fields @message
| parse @message "(*) Method request body before transformations: *" as awsReqId, JsonBody
| parse JsonBody "\"OrderID\":\"*\"" as OrderID
| filter OrderID = "1545623982"
| display awsReqId,OrderID
| limit 20
Edit:
Actually, the way you're doing it should also work. I think it fails because you have two space characters between the closing parenthesis and the word Method here: (*) Method. Try removing one space.
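The two-step idea (first split off the request id and the JSON body, then pull OrderID out of the body) can be checked outside CloudWatch with a short Python sketch. This is only an illustration of the logic on the sample message; the variable names are made up, and it uses a JSON parser for the second step rather than a second glob pattern.

```python
import json
import re

# Sample log entry copied from the question.
message = ('(8d036972-0445) Method request body before transformations: '
           '{"TransactionAmount":225.00,"OrderID":"1545623982",'
           '"PayInfo":{"Method":"ec","TransactionAmount":225.00},"CFeeProcess":0}')

# Step 1: capture the id inside the first parentheses and the trailing JSON body.
match = re.match(
    r"\((?P<req>[^)]*)\) Method request body before transformations: (?P<body>.*)",
    message,
)
aws_req_id = match.group("req")

# Step 2: read OrderID from the JSON body.
order_id = json.loads(match.group("body"))["OrderID"]

print(aws_req_id, order_id)  # 8d036972-0445 1545623982
```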

How would I remove specific words or text from KQL query?

I have the following query which provides me with all the data I need exported but I would like text '' removed from my final query. How would I achieve this?
| where type == "microsoft.security/assessments"
| project id = tostring(id),
Vulnerabilities = properties.metadata.description,
Severity = properties.metadata.severity,
Remediations = properties.metadata.remediationDescription
| parse kind=regex id with '/virtualMachines/' Name '/providers/'
| where isnotempty(Name)
| project Name, Severity, Vulnerabilities, Remediations
You could use replace_string() (https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/replace-string-function) to replace any substring with an empty string

how to extract value from splunk and generate line graph

My log messages
.o.s.c.PaymentMethodInstrumentController : Exiting ServiceController.getMyServiceDetails() : elapsedTime(ms):34, xrfRequestId:c3b5878d-8795-49cb-b6a7-51ab02789f46, xCorrelationId:786d68ea-ze46-42b9-966f-124f2eb444f6, xForwardedFor:10.242.79.96
.o.s.c.PaymentMethodInstrumentController : Exiting ServiceController.getMyServiceDetails() : elapsedTime(ms):39, xrfRequestId:c3b2c08d-6c6d-49cb-b6a7-51a89897446, xCorrelationId:78676yt64-ze46-42b9-966f-124f2eb444f6, xForwardedFor:10.242.79.96
I am looking to extract elapsedTime(ms):34 and generate the line graph of these values.
Assuming you already have _time, something like this:
<your search>
| rex "elapsedTime\(ms\):(?<elapsedTime>\d+),"
| table _time elapsedTime
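One detail worth checking in the pattern is that the parentheses in "elapsedTime(ms)" are literal characters, so they have to be escaped; unescaped, "(ms)" is read as a capture group and the literal "(" in the log never matches. A small Python sketch on the two sample lines shows the difference. Note that Python spells named groups (?P<name>...), whereas rex uses (?<name>...).

```python
import re

# The two sample log lines from the question.
lines = [
    ".o.s.c.PaymentMethodInstrumentController : Exiting ServiceController.getMyServiceDetails() : "
    "elapsedTime(ms):34, xrfRequestId:c3b5878d-8795-49cb-b6a7-51ab02789f46, "
    "xCorrelationId:786d68ea-ze46-42b9-966f-124f2eb444f6, xForwardedFor:10.242.79.96",
    ".o.s.c.PaymentMethodInstrumentController : Exiting ServiceController.getMyServiceDetails() : "
    "elapsedTime(ms):39, xrfRequestId:c3b2c08d-6c6d-49cb-b6a7-51a89897446, "
    "xCorrelationId:78676yt64-ze46-42b9-966f-124f2eb444f6, xForwardedFor:10.242.79.96",
]

# Escaped parentheses match the literal "(ms)" in the log.
escaped = re.compile(r"elapsedTime\(ms\):(?P<elapsedTime>\d+),")
elapsed = [int(escaped.search(line).group("elapsedTime")) for line in lines]

# Unescaped, "(ms)" is a group that expects "elapsedTimems:", which is not in the log.
unescaped = re.search(r"elapsedTime(ms):(\d+),", lines[0])

print(elapsed)    # [34, 39]
print(unescaped)  # None
```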

Splunk extract a value from string which begins with a particular value

Could you help me extract file name in table format.
Here the below field just before file name is always constant. "Put File /test/abc/test/test/test to /test/test/test/test/test/test/test/test/test/test destFolderPath: /test/test/test/test/test/test/test/abc/def/hij"
This is an event from splunk
2021-04-08T01:03:40.155069+00:00 somedata||someotherdata||..|||Put File /test/abc/test/test/test to /test/test/test/test/test/test/test/test/test/test destFolderPath: /test/test/test/test/test/test/test/abc/def/hij/CHARGEBACK_20210407_060334_customer.csv
Result should be in table format (font/format doesn't matter):
File Name
CHARGEBACK_20210407_060334_customer.csv
Assuming the original event/field ends with the file name, you should use this regular expression:
(?<file_name>[^\/]+)$
This will extract the text between the last "/" and the end of the event/field ("$").
You can test it here: https://regex101.com/r/J6bU3m/1
Now you can use Splunk's rex command to extract fields at search-time:
| makeresults
| eval _raw="2021-04-08T01:03:40.155069+00:00 somedata||someotherdata||..|||Put File /test/abc/test/test/test to /test/test/test/test/test/test/test/test/test/test destFolderPath: /test/test/test/test/test/test/test/abc/def/hij/CHARGEBACK_20210407_060334_customer.csv"
| fields - _time
| rex field=_raw "(?<file_name>[^\/]+)$"
Alternatively, you could also use this regular expression since you mentioned that the file path is always the same:
| rex field=_raw "abc\/def\/hij\/(?<file_name>.+)"
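Both patterns can be sanity-checked against the sample event in plain Python. This is just a sketch to compare the two approaches; the variable names are illustrative, and Python writes the named group as (?P<file_name>...) instead of (?<file_name>...).

```python
import re

# Sample event copied from the question.
event = ("2021-04-08T01:03:40.155069+00:00 somedata||someotherdata||..|||"
         "Put File /test/abc/test/test/test to "
         "/test/test/test/test/test/test/test/test/test/test "
         "destFolderPath: /test/test/test/test/test/test/test/abc/def/hij/"
         "CHARGEBACK_20210407_060334_customer.csv")

# Approach 1: everything after the last "/" up to the end of the event.
last_segment = re.search(r"(?P<file_name>[^/]+)$", event).group("file_name")

# Approach 2: everything after the constant folder prefix.
after_prefix = re.search(r"abc/def/hij/(?P<file_name>.+)", event).group("file_name")

print(last_segment)  # CHARGEBACK_20210407_060334_customer.csv
print(after_prefix)  # CHARGEBACK_20210407_060334_customer.csv
```

The first pattern depends on the file name being the last thing in the event; the second depends on the folder prefix staying constant.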

creating external table from compressed (gz format) files without selecting all fields

I have gz files in a folder. I need only 3 columns from these files, but each line has over 100 of them. At the moment I create a view this way.
drop table MAK_CHARGE_RCR;
create external table MAK_CHARGE_RCR
(LINE string)
STORED as SEQUENCEFILE
LOCATION '/apps/hive/warehouse/mydb.db/file_rcr';
drop view VW_MAK_CHARGE_RCR;
create view VW_MAK_CHARGE_RCR as
Select LINE[57] as CREATE_DATE, LINE[64] as SUBS_KEY, LINE[63] as RC_TERM_NAME
from
(Select split(LINE, '\\|') as LINE
from MAK_CHARGE_RCR) a;
The view has the fields I need. Now I have to do the same, but without CTAS and I am not sure how to go about it. What can I do?
I was told the table must look like this
create external table MAK_CHARGE_RCR
(CREATE_DATE string, SUBS_KEY string, RC_TERM_NAME etc)
I could split the line like this
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\\|'
but I'll need to list every column. I have another group of files with over 1000 columns. All of them I'll need to list. This just seems a bit excessive, so I wondered if it is possible to do
create external table arstel.MAK_CHARGE_RCR
(split(LINE, '\\|')[57] string,
split(LINE, '\\|')[64] string
etc)
This obviously doesn't work, but maybe there are workarounds?
RegexSerDe
For educational purposes
P.s.
I intend to create an enhanced version of the CSV SerDe that accepts an additional parameter with the positions of the requested columns.
Demo
bash
echo {a..c}{1..100} | xargs -n 100 | tr ' ' '|' | \
hdfs dfs -put - /user/hive/warehouse/mytable/data.txt
hive
create external table mytable
(
col58 string
,col64 string
,col65 string
)
row format serde 'org.apache.hadoop.hive.serde2.RegexSerDe'
with serdeproperties ("input.regex" = "^(?:([^|]*)\\|){58}(?:([^|]*)\\|){6}([^|]*)\\|.*$")
stored as textfile
location '/user/hive/warehouse/mytable'
;
select * from mytable
;
+---------------+---------------+---------------+
| mytable.col58 | mytable.col64 | mytable.col65 |
+---------------+---------------+---------------+
| a58 | a64 | a65 |
| b58 | b64 | b65 |
| c58 | c64 | c65 |
+---------------+---------------+---------------+
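The regex works because a capture group inside a repeated non-capturing group keeps only its last match, so group 1 ends up holding field 58, group 2 field 64, and group 3 field 65. A quick Python sketch reproduces the demo data and the result; it assumes Python's re module treats repeated groups the same way as the Java regex engine behind RegexSerDe, and the pattern is written with a single backslash because there is no HiveQL string escaping here.

```python
import re

# Same pattern as input.regex, without the extra HiveQL string escaping.
pattern = re.compile(r"^(?:([^|]*)\|){58}(?:([^|]*)\|){6}([^|]*)\|.*$")

# Rebuild the demo rows: a1|a2|...|a100, b1|...|b100, c1|...|c100.
rows = ["|".join(f"{letter}{i}" for i in range(1, 101)) for letter in "abc"]

# Each repeated group retains its last iteration only.
extracted = [pattern.match(row).groups() for row in rows]

for cols in extracted:
    print(cols)
# ('a58', 'a64', 'a65')
# ('b58', 'b64', 'b65')
# ('c58', 'c64', 'c65')
```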