Splunk: Removing all text after a specific string in a column - splunk

I have a field where all values have the following format:
Knowledge:xyz,id:2907129
The id number always changes, however, all I want is the value of xyz.
I used the following to remove "Knowledge:":
eval url=replace (url, "Open_KnowledgeZone:", "")
For the id portion, using ",id*" did not work within the eval replace function.

You'll want to use a regex. Something like:
rex field=url "(?<=Knowledge:)(?<AnyFieldName>.*)(?=,)"
Where <AnyFieldName> is the name you want the result field to be. This will select all characters after "Knowledge:" and before the ",".
Here is the regex in action outside of Splunk:
https://regex101.com/r/ofW0a1/1
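For readers who want to try the same pattern locally, here is a minimal Python sketch. It is not Splunk: Python's `re` module spells named groups `(?P<name>...)` where `rex` uses `(?<name>...)`, and the helper name `extract_value` and the group name `value` are made up for this illustration.

```python
import re

# Same idea as the rex pattern above: capture everything after
# "Knowledge:" and before the last comma on the line.
# Python needs (?P<name>...) for named groups; Splunk uses (?<name>...).
PATTERN = re.compile(r"(?<=Knowledge:)(?P<value>.*)(?=,)")


def extract_value(text):
    """Return the text between 'Knowledge:' and the comma, or None."""
    match = PATTERN.search(text)
    return match.group("value") if match else None


print(extract_value("Knowledge:xyz,id:2907129"))  # xyz
```

Because `.*` is greedy, a value that itself contains commas would be captured up to the last comma, which is fine for the format shown in the question.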

Related

SQL: Extract from messy JSON nested field with backslashes

I have a table that has some rows with normal JSON and some with escaped values in the JSON field (backslashes)
| id | obj |
|----|-----|
| 1 | {"is_from_shopping_bag":true,"products":[{"price":{"amount":"18.00","currency":"USD","offset":100,"amount_with_offset":"1800"},"product_id":"1234","quantity":1}],"source":"cart"} |
| 2 | {"is_from_shopping_bag":"","products":"[{\ "product_id\ ":\ "2345\ ",\ "price\ ":{\ "currency\ ":\ "USD\ ",\ "amount\ ":\ "140.00\ ",\ "offset\ ":100},\ "quantity\ ":1}]"} |
(Note: I needed to include a space after the backslashes in the above table so that they would show up in the github generated markdown table -- my actual table does not include those spaces between the backslash and the quote character)
I am doing a sql query in Hive to get the 'currency' field.
Currently I can run
SELECT
id,
JSON_EXTRACT(obj, '$.products[0].price.currency')
FROM my_table
Which will give me the correct output for the first row, but gives me a NULL in the second row
| id | obj |
|----|-----|
| 1 | "USD" |
| 2 | NULL |
What is the best way to get currency field from the second row? Is there a way to clean up the field and remove the backslashes before trying to JSON_EXTRACT the relevant data?
I could use REPLACE to swap the '\ ' for '', but is that the most efficient method?
Replace \" with " using regexp_replace like this:
regexp_replace(obj,'\\\\"','"')
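To see what the second row looks like to a JSON parser, the following Python sketch may help. It is an illustration outside Hive, under the assumption that the real data has no space after each backslash (as the question states): in row 2, `products` is a JSON string that itself contains JSON, so there is no array for `$.products[0]` to index until that string is decoded. The function name `currency_of_first_product` and the constants `ROW1` and `ROW2` are made up for this example.

```python
import json

# Row 1: products is a real JSON array.
ROW1 = ('{"is_from_shopping_bag":true,"products":[{"price":{"amount":"18.00",'
        '"currency":"USD","offset":100,"amount_with_offset":"1800"},'
        '"product_id":"1234","quantity":1}],"source":"cart"}')

# Row 2: products is a string holding escaped JSON (no spaces after the
# backslashes, per the note in the question).
ROW2 = (r'{"is_from_shopping_bag":"","products":"[{\"product_id\":\"2345\",'
        r'\"price\":{\"currency\":\"USD\",\"amount\":\"140.00\",\"offset\":100},'
        r'\"quantity\":1}]"}')


def currency_of_first_product(obj):
    """Return products[0].price.currency, decoding products twice if needed."""
    doc = json.loads(obj)
    products = doc["products"]
    if isinstance(products, str):
        # The escaped variant: decode the embedded JSON document.
        products = json.loads(products)
    return products[0]["price"]["currency"]


print(currency_of_first_product(ROW1))  # USD
print(currency_of_first_product(ROW2))  # USD
```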

How to filter String in where clause

I would like to extract part of a string using a WHERE clause in SAP HANA. For example, these are three values of the name column:
123._SYS_BIC.meag.app.qthor.cidwh_eingangsschicht.backend.dblayer.l2.checks/MasterData_Holdings.
153._SYS_BIC.meag.app.qthor.centralAdministration.backend.dblayer.l2.checks/AuditAndSecurities.
meag.app.qthor.centralAdministration.backend.dblayer.l2.checks/GeneralLedger
After filtering the name column with the WHERE clause, the output should show only the last portion of each string. In other words, whatever the value is, remove everything from the beginning up to and including the '/'. The output would look like this:
"MasterData_Holdings"
"AuditAndSecurities"
"GeneralLedger"
You can try using REPLACE_REGEXPR.
I'm not familiar with HANA myself, but the function is pretty straightforward and it should be:
select REPLACE_REGEXPR('.+/(.+)' IN fieldName WITH '\1' OCCURRENCE ALL) as field
...
where
... -- your filter
Be aware that this regex '.+/(.+)' will eat everything until the last / so for instance if you have ....checks/MasterData_Holdings/Something it will return only Something
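The greedy behaviour described above can be checked with a small Python sketch. This uses Python's `re.sub` rather than HANA's `REPLACE_REGEXPR`, so it only illustrates the pattern; `last_segment` is a made-up helper name.

```python
import re


def last_segment(name):
    """Replace the whole value with whatever follows the last '/'."""
    # .+ is greedy, so it swallows everything up to the LAST slash.
    return re.sub(r".+/(.+)", r"\1", name)


print(last_segment(
    "meag.app.qthor.centralAdministration.backend.dblayer.l2.checks/GeneralLedger"
))  # GeneralLedger
print(last_segment("checks/MasterData_Holdings/Something"))  # Something
```

A value without any '/' does not match the pattern and is returned unchanged.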

Postgres SQL regexp_replace replace all number

I need some help with the following. I have a text field in SQL that records a list of times separated by '|'. For example:
'14613|15474|3832|148|5236|5348|1055|524' Each value is a time in milliseconds. The field can have any length; for example, '3215|2654' and '4565' (only one value) are both perfectly valid. I need to take this field and replace every number with the value -1000.
So '14613|15474|3832|148|5236|5348|1055|524' will be '-1000|-1000|-1000|-1000|-1000|-1000|-1000|-1000'
Or '3215|2654' => '-1000|-1000', or '4565' => '-1000'.
I tried regexp_replace(times_field,'[[:digit:]]','-1000','g'), but it replaces each digit rather than each complete number, so in this example:
'3215|2654', which should become '-1000|-1000', I get:
'-1000-1000-1000-1000|-1000-1000-1000-1000'. I tried other combinations and more regexp options, but I'm stuck.
I would appreciate your help, thanks!
We can try using REGEXP_REPLACE here:
UPDATE yourTable
SET times_field = REGEXP_REPLACE(times_field, '\y[0-9]+\y', '-1000', 'g');
If instead you don't really want to alter your data but rather just view your data this way, then use a select:
SELECT
times_field,
REGEXP_REPLACE(times_field, '\y[0-9]+\y', '-1000', 'g') AS times_field_replace
FROM yourTable;
Note that in either case we pass g as the fourth parameter to REGEXP_REPLACE to do a global replacement of all pipe-separated numbers.
[[:digit:]] - matches a digit [0-9]
+ Quantifier - matches between one and unlimited times, as many times as possible
So your regexp should look like:
regexp_replace(times_field,'[[:digit:]]+','-1000','g')
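The difference between matching one digit and matching a run of digits can be reproduced with a short Python sketch. It uses Python's `re.sub`, which replaces every match by default (roughly what the 'g' flag asks for in PostgreSQL); the function names are made up for this illustration.

```python
import re


def mask_each_digit(times):
    """The original attempt: every single digit becomes -1000."""
    return re.sub(r"[0-9]", "-1000", times)


def mask_each_number(times):
    """The fix: + makes one match cover a whole run of digits."""
    return re.sub(r"[0-9]+", "-1000", times)


print(mask_each_digit("3215|2654"))   # -1000-1000-1000-1000|-1000-1000-1000-1000
print(mask_each_number("3215|2654"))  # -1000|-1000
print(mask_each_number("4565"))       # -1000
```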

How to only extract match strings from a multi-value field and display in new column in SPLUNK Query

I am trying to extract matching strings from a multivalue field and display them in another column. I have tried various options: splitting the field by delimiter, then mvexpand, then using where/search to pull that data. I was trying to find out whether there is an easier way to do this in a Splunk query without all this hassle.
Example: Let's say I have the multivalue field column1 below, with data separated by commas:
column1 = abc1,test1,test2,abctest1,mail,send,mail2,sendtest2,new,code,results
I was splitting this column with |eval column2=split(column1,",") and using regex/where/search to look for data matching *test* in this column. I was able to pull the results, but column1 still shows all the values abc1,test1,test2,abctest1,mail,send,mail2,sendtest2,new,code,results. What I want is either to trim column1 so it shows only the words that match test, or to show those entries in a new column2 containing only test1,test2,abctest1,sendtest2, as they are the only ones matching *test*.
I would appreciate your help, thanks.
Found the answer after posting this question: it's just a matter of using the existing mvfilter function to pull the matching results.
column2=mvfilter(match(column1,"test"))
| eval column2=split(column1,",") | search column2="*test*"
doesn't work, as the split creates a multivalue field: a single event containing a single field containing many values. The search for *test* will still find that event, even though it contains abc1, etc., as there is at least one value that matches *test*.
What you can use is the mvfilter function to narrow down the multivalue field to the values you are after.
| eval column2=split(column1,",") | eval column2=mvfilter(match(column2,".*test.*"))
Alternatively to this approach, you can use a regular expression to extract what you need.
| rex field=column1 max_match=0 "(?<column2>[^,]*test[^,]*)"
Regardless, at the end, you would need to use mvjoin to join your multiple values into a single string
| eval column2=mvjoin(column2, ",")
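As a rough analogy outside Splunk, the split / mvfilter / mvjoin pipeline corresponds to splitting a string, keeping the matching items, and joining them again. The Python sketch below only mirrors that logic; `filter_values` is a hypothetical helper, not a Splunk function.

```python
import re


def filter_values(column1, pattern, sep=","):
    """Split on sep, keep values matching pattern, join them back."""
    values = column1.split(sep)                               # like split()
    matching = [v for v in values if re.search(pattern, v)]   # like mvfilter(match())
    return sep.join(matching)                                 # like mvjoin()


column1 = "abc1,test1,test2,abctest1,mail,send,mail2,sendtest2,new,code,results"
print(filter_values(column1, "test"))  # test1,test2,abctest1,sendtest2
```

The rex alternative corresponds to `re.findall(r"[^,]*test[^,]*", column1)`, which returns the same four values for this input.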

PostgreSQL - find matching line in char/string column?

How can I find matching line in char/string type column?
For example, let's say I have a column called text and some row has the content:
12345\nabcdf\nXKJKJ
(where \n are real new lines)
Now I want to find related row if any of lines match. For example, I have value 12345,
then it should find match. But if I have value 123, It would not.
I tried using like but it finds in both cases, when I have matching value (like 12345) and partially matching value (like 123).
For example something like this, but to have boundary for checking whole line:
SELECT id
FROM my_table
WHERE text like [SOME_VALUE]
Update
Maybe it's not yet clear what I'm asking. Basically, I want something equivalent to what you can do with a regular expression,
like this: https://regexr.com/5akj1
Here the regular expression /^123$/m would not match my string; it would only match with the pattern /^12345$/m (the value is dynamic, so the pattern changes depending on the value I get).
You may use regexp_replace and then check that the replaced string is not equal to the original column value:
select count(*)
from dummy
where regexp_replace(mytext, '(?m)^1234$', '') <> mytext;
You have a demo here.
Bear in mind that I have used the (?m) modifier, which makes ^ and $ match begin and end of line instead of begin and end of string.
You should be able to use ~ for matching:
where mytext ~ '(\n|^)1234(\n|$)'
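The whole-line matching idea can be tried outside the database with a small Python sketch. Here `re.MULTILINE` plays the role of the `(?m)` modifier, and `re.escape` guards against regex metacharacters in the dynamic value; `has_exact_line` is a made-up helper name, and escaping rules in PostgreSQL would need to be handled separately.

```python
import re


def has_exact_line(text, value):
    """True if some line of text is exactly equal to value."""
    # ^ and $ anchor at line boundaries because of re.MULTILINE.
    pattern = "^" + re.escape(value) + "$"
    return re.search(pattern, text, flags=re.MULTILINE) is not None


text = "12345\nabcdf\nXKJKJ"
print(has_exact_line(text, "12345"))  # True
print(has_exact_line(text, "123"))    # False
```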