I want to extract the string from the string and use it under a field - splunk

I want to extract a string from a string...and use it under a field named source.
I tried writing it like this, but it's no good.
index = cba_nemis Status: J source = *AAP_ENC_UX_B.* | eval plan=upper(substr(source,57,2)) | regex source = "AAP_ENC_UX_B.\w+\d+rp" | stats count by plan,source
for example..
source=/p4products/nemis2/filehandlerU/encpr1/log/AAP_ENC_UX_B.az_in_aza_277U_ rp-20190722-054802.log
source=/p4products/nemis2/filehandlerU/encpr2/log/AAP_ENC_UX_B.oh_in_ohf_ed_ph_ld-20190723-034121.log
I want to extract the string
AAP_ENC_UX_B.az_in_aza_277U_ rp from 1st
and
AAP_ENC_UX_B.oh_in_ohf_ed_ph_ld from 2nd.
and put it under the column source along with the counts..
I want results like...
source counts
AAP_ENC_UX_B.az_in_aza_277U_ rp 1
AAP_ENC_UX_B.oh_in_ohf_ed_ph_ld 1

You can use the rex command, which extracts a new field from an existing field by applying a regular expression.
...search...
| rex field=source ".+\/(?<source_v2>[\.\w\s]+)-.+"
| stats count by plan, source_v2
Be careful, though: I called the new field source_v2. What you were asking for would overwrite the existing source field, which may not be what you intend. If it is, just change source_v2 to source in my code.
The stats command then groups by this new source_v2 field. Try it and see if this is what you need; you can tweak it easily to get your expected results.
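To check the pattern outside Splunk, here is a minimal Python sketch that applies the same regular expression to the two sample paths from the question and counts the results. Python spells named groups as (?P<name>...) rather than (?<name>...). The helper name extract_source and the use of collections.Counter in place of stats count are assumptions made for illustration.

```python
import re
from collections import Counter

# Same pattern as the rex call above, using Python's named-group syntax.
SOURCE_RE = re.compile(r".+/(?P<source_v2>[.\w\s]+)-.+")

def extract_source(path):
    """Return the file-name part before its first '-', or None if no match."""
    match = SOURCE_RE.match(path)
    return match.group("source_v2") if match else None

# Sample source values copied from the question.
paths = [
    "/p4products/nemis2/filehandlerU/encpr1/log/AAP_ENC_UX_B.az_in_aza_277U_ rp-20190722-054802.log",
    "/p4products/nemis2/filehandlerU/encpr2/log/AAP_ENC_UX_B.oh_in_ohf_ed_ph_ld-20190723-034121.log",
]

# Counter stands in for "stats count by source_v2".
counts = Counter(extract_source(p) for p in paths)
for name, count in counts.items():
    print(name, count)
```

Because the character class excludes '-', the capture stops at the first hyphen after the last '/', which is why the timestamp suffix is dropped.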

Related

Splunk field extractor unable to extract all values

I want to extract 4 values out of one field, called msg, from a Splunk query; and the msg is in the form of:
msg: "Service call successful k1=v1 k2=v2 k3=v3 k4=v4 k5=v5 something else can be ignored"
keys are always static but values are not, for instance, v2 could be XXX or XXYYZZ; similarly possible values for v3 just have unpredictable length.
I ran a query to get some sample results, hoping to use the Field Extractor to generate a regex, but the generated regex can't get all the values out. I guess that's because the values don't all have the same length?
Do I need to change my logging format by separating each key=value pair with a comma? Or am I not using the field extractor correctly?
[Update1]: A few sample data:
msg:Service call successful k1=XXX k2=BBBB k3=Something I made up k4=YYYNNN k5=do not need to retrieve this value
msg:Service call successful k1=SSSSSS k2=AAA k3=This could contain space and comma, like this one k4=YYYNNM k5=can be ignored
I could change the logging format if it makes easier to query and extract fields. Will adding a separator like dot or pipe help?
Normally Splunk will pull key-value pairs out automatically.
However, when it doesn't, go try your regular expression(s) on regex101. The field extractor is often a good[ish] start, but rarely creates efficient (or complete) regular expressions.
An inline version of this would be as follows (presuming the "value" half of the key-value pair is contiguous characters):
| rex field=_raw "k1=(?<k1>\S+)\s+k2=(?<k2>\S+)\s+k3=(?<k3>\S+)\s+k4=(?<k4>\S+)\s+k5=(?<k5>\S+)"
Normally I prefer to do sequential rex calls, in case something's out of order or missing, but if your data's consistent, this will work.
Once you have it the way you want it, update your props.conf and transforms.conf as appropriate for the sourcetype.
EDIT for updated sample data / comment response:
...
| rex field=_raw "k3=(?<k3>.+)\s+k4="
| rex field=_raw "k4=(?<k4>.+)\s+k5="
...
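The sequential approach can be tried out in Python before touching props.conf. This is a minimal sketch, assuming the keys always appear in the order k1 to k5 as in the sample data; the helper name extract_between is made up for illustration. Each value is captured up to the whitespace in front of the next key, so values containing spaces and commas survive.

```python
import re

def extract_between(raw, key, next_key):
    """Capture everything after 'key=' up to the whitespace before 'next_key='."""
    pattern = rf"{re.escape(key)}=(.+)\s+{re.escape(next_key)}="
    match = re.search(pattern, raw)
    return match.group(1) if match else None

# Second sample event from the question.
sample = (
    "msg:Service call successful k1=SSSSSS k2=AAA "
    "k3=This could contain space and comma, like this one "
    "k4=YYYNNM k5=can be ignored"
)

keys = ["k1", "k2", "k3", "k4", "k5"]
# One independent search per key, mirroring one rex call per field.
fields = {key: extract_between(sample, key, nxt) for key, nxt in zip(keys, keys[1:])}
print(fields)
```

Running one search per key means a missing field yields None for that key only, instead of making a single combined pattern fail for the whole event.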

How to create a correct filter string with OR and AND operators for django?

My app has a frontend on vue.js and backend on django rest framework. I need to do a filter string on vue which should do something like this:
((status=closed) | (status=canceled)) & (priority=middle)
but got an error as a response
["Invalid querystring operator. Matched: ') & '."]
After encoding my string looks like this:
?filters=((status%3D%D0%97%D0%B0%D0%BA%D1%80%D1%8B%D1%82)%20%7C%20(status%3D%D0%9E%D1%82%D0%BA%D0%BB%D0%BE%D0%BD%D0%B5%D0%BD))%20%26%20(priority%3D%D0%A1%D1%80%D0%B5%D0%B4%D0%BD%D0%B8%D0%B9)
which corresponds to
?filters=((status=closed)|(status=canceled))&(priority=middle)
What should a correct filter string for Django look like?
I have no problem if the statement includes only | or only &. For example, a filter string like this one works perfectly:
?filters=(status%3D%D0%97%D0%B0%D0%BA%D1%80%D1%8B%D1%82)%20%7C%20(status%3D%D0%9E%D1%82%D0%BA%D0%BB%D0%BE%D0%BD%D0%B5%D0%BD)
a.k.a. ?filters=(status=closed)|(status=canceled). But if I add an & after it, plus additional brackets to specify the order in which the conditions are evaluated, it fails with an error.
I also tried to reduce usage of brackets and had string like this (as experiment):
?filters=(status%3D%D0%97%D0%B0%D0%BA%D1%80%D1%8B%D1%82%20%7C%20status%3D%D0%9E%D1%82%D0%BA%D0%BB%D0%BE%D0%BD%D0%B5%D0%BD)
a.k.a. ?filters=(status=closed | status=canceled). This one doesn't work: I get neither an error nor the data.
I need mixed results in my case: both statuses (closed and canceled) and priority=middle, but my string format isn't correct. Please explain which format would be OK.
That doesn't look like a very uri friendly syntax you're trying to use there.
Try doing this instead:
?status[]=closed&status[]=cancelled&priority=middle
Then use request.GET.getlist('status[]') to get back the list and use the values for logical OR queryset filtering:
qs = qs.filter(status__in=request.GET.getlist('status[]', []))
and then add any additional filtering which works as logical AND.
If you're using axios, it should automatically serialize a JS array passed as the status URL param into this format.
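The OR-within-status, AND-with-priority logic can be sketched with the standard library alone. This is an illustration under assumptions, not Django code: parse_qs stands in for request.GET.getlist, and the tickets list is hypothetical data standing in for a queryset.

```python
from urllib.parse import parse_qs, urlencode

# Build the query string suggested above; repeated keys carry the OR values.
query = urlencode([
    ("status[]", "closed"),
    ("status[]", "cancelled"),
    ("priority", "middle"),
])
params = parse_qs(query)

# Hypothetical rows standing in for a queryset.
tickets = [
    {"id": 1, "status": "closed", "priority": "middle"},
    {"id": 2, "status": "cancelled", "priority": "high"},
    {"id": 3, "status": "open", "priority": "middle"},
    {"id": 4, "status": "cancelled", "priority": "middle"},
]

statuses = params.get("status[]", [])   # plays the role of getlist('status[]')
priority = params["priority"][0]

# Membership in the list is the OR; the second condition is the AND.
matching = [t for t in tickets if t["status"] in statuses and t["priority"] == priority]
print([t["id"] for t in matching])
```

Keeping the boolean structure in repeated parameter names avoids putting | and & inside a single parameter value, where & would otherwise be read as a parameter separator unless it is percent-encoded.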

REGEX_EXTRACT error in PIG

I have a CSV file with 3 columns: tweetid , tweet, and Userid. However within the tweet column there are comma separated values.
An example row of data:
396124437168537600,"I really wish I didn't give up everything I did for you, I'm so mad at my self for even letting it get as far as it did.",savava143
I want to extract all 3 fields individually, but REGEX_EXTRACT is giving me an error with this code:
a = LOAD tweets USING PigStorage(',') AS (f1,f2,f3);
b = FILTER a BY REGEX_EXTRACT(f1,'(.*)\\"(.*)',1);
The error is:
error: Filter's condition must evaluate to boolean.
In the use case shared, reading the data using PigStorage(',') splits the tweet at its embedded comma, so with a three-field schema the last field value (savava143) is lost.
A = LOAD '/Users/muralirao/learning/pig/a.csv' USING PigStorage(',') AS (f1,f2,f3);
DUMP A;
Output : A : Observe that the last field value is missing.
(396124437168537600,"I really wish I didn't give up everything I did for you, I'm so mad at my self for even letting it get as far as it did.")
For the use case shared, to extract all the values from CSV file with field values having ',' we can use either CSVExcelStorage or CSVLoader.
Approach 1 : Using CSVExcelStorage
Ref : http://pig.apache.org/docs/r0.12.0/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html
Input : a.csv
396124437168537600,"I really wish I didn't give up everything I did for you, I'm so mad at my self for even letting it get as far as it did.",savava143
Pig Script :
REGISTER piggybank.jar;
A = LOAD 'a.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage() AS (f1,f2,f3);
DUMP A;
Output : A
(396124437168537600,I really wish I didn't give up everything I did for you, I'm so mad at my self for even letting it get as far as it did.,savava143)
Approach 2 : Using CSVLoader
Ref : http://pig.apache.org/docs/r0.9.1/api/org/apache/pig/piggybank/storage/CSVLoader.html
Below script makes use of CSVLoader(), DUMP A will result in the same output seen earlier.
A = LOAD 'a.csv' USING org.apache.pig.piggybank.storage.CSVLoader() AS (f1,f2,f3);
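The difference between a plain comma split and a CSV-aware loader can be reproduced in a few lines of Python. This is only a sketch of the parsing behaviour, using Python's csv module as a stand-in for the piggybank loaders; it does not run Pig.

```python
import csv
import io

# The sample row from a.csv.
row = (
    '396124437168537600,"I really wish I didn\'t give up everything I did for you, '
    'I\'m so mad at my self for even letting it get as far as it did.",savava143'
)

# Naive split, comparable to PigStorage(','): the tweet is cut at its comma.
naive = row.split(",")

# CSV-aware parse: quoted commas stay inside the field.
parsed = next(csv.reader(io.StringIO(row)))

print(len(naive))
print(len(parsed))
print(parsed[2])
```

The naive split yields one piece too many, which is why a three-column schema ends up dropping the user id.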
The error is that you do not want to FILTER based on a regex but to GENERATE new fields based on a regex. To filter, Pig needs to know whether each line should be kept, hence the boolean requirement.
Therefore, you have to use :
b = FOREACH a GENERATE REGEX_EXTRACT(FIELD, REGEX, INDEX_OF_GROUP_TO_RETURN);
However, as @Murali Rao said, your values are not just comma separated but CSV (think about how you will handle a comma inside a tweet: it is not a field separator, just content).

How to use Substring in SSIS

I want to export data from SharePoint list to SQL using SSIS.
In the SharePoint list, I have a multi-select column, so I am getting the value below in my column:
1;#control 1;#3;#control 3
I want to use SUBSTRING in a Derived Column so that I get the result below:
1,3
I want only ID from the given column.
I have tried below code
SUBSTRING(ColumnName,1,FINDSTRING(ColumnName,";#",1) - 1)
But it only gives me answer as
1
Can anyone please help me out?
Because there is an unknown number of controls selected in your SharePoint Multi-Select, a Derived Column transformation is not going to work for you. You'll have to use a script.
One way to parse your string is with regular expressions. You'll have to add an output to the script transformation and assign your parsed string to that output.
// Requires: using System.Linq; using System.Text.RegularExpressions;
Regex controlExpression = new Regex(@"control ([0-9]+)");
MatchCollection controlMatches = controlExpression.Matches(--YOUR INPUT HERE--);
// Join the captured numbers with commas, e.g. "1,3".
String output = string.Join(",",
    controlMatches.Cast<Match>().Select(n => n.Groups[1].ToString()).ToArray());
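As an alternative to matching on the display text, the IDs can also be taken by position. The following Python sketch assumes the value always alternates ID;#text;#ID;#text, as in the sample from the question; the function name lookup_ids is made up for illustration, and the same split-and-pick logic would need to be ported into the script transformation.

```python
def lookup_ids(value):
    """Return the comma-joined IDs from an 'ID;#text;#ID;#text' string."""
    parts = value.split(";#")
    # IDs sit at even positions, display text at odd positions.
    return ",".join(parts[0::2])

print(lookup_ids("1;#control 1;#3;#control 3"))
```

Picking by position does not depend on the display text containing the word "control" or repeating the ID.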

Splunk: Extracting multiple fields with the same name

I am using Splunk to index logs with multiple fields with the same name. All fields have the same meaning:
2012-02-22 13:10:00,ip=127.0.0.1,to=email1@example.com,to=email2@example.com
In the automatic extraction for this event, I only get "email1@example.com" extracted for the "to" field. How can I make sure all the values are extracted?
Thanks!
I think adding this to the end of the search may do it:
| extract pairdelim="," kvdelim="=" mv_add=t | table to
(the 'table' is just for demonstration).
So, I think, in 'transforms.conf' (from http://docs.splunk.com/Documentation/Splunk/latest/admin/transformsconf) put:
[my-to-extraction]
DELIMS = ",", "="
MV_ADD = true
and reference it in 'props.conf':
[eventtype::my_custom_eventtype]
REPORT-to = my-to-extraction
where 'eventtype::my_custom_eventtype' could be anything that works as a 'props.conf' specification (<spec> in http://docs.splunk.com/Documentation/Splunk/latest/admin/propsconf).
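What a multivalue extraction with these delimiters produces can be illustrated in Python. This is a rough sketch of the idea only, not Splunk's implementation; the function name extract_pairs is hypothetical, and its pairdelim and kvdelim parameters are simply named after the options of the extract command above.

```python
from collections import defaultdict

def extract_pairs(event, pairdelim=",", kvdelim="="):
    """Split an event into key/value pairs, keeping every value of a repeated key."""
    fields = defaultdict(list)
    for chunk in event.split(pairdelim):
        key, sep, value = chunk.partition(kvdelim)
        if sep:  # skip chunks without a key/value delimiter, e.g. the timestamp
            fields[key.strip()].append(value.strip())
    return dict(fields)

event = "2012-02-22 13:10:00,ip=127.0.0.1,to=email1@example.com,to=email2@example.com"
fields = extract_pairs(event)
print(fields["to"])
```

Appending to a list per key is the behaviour that the multivalue setting asks for: a second "to" adds a value instead of being dropped.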