How to only extract match strings from a multi-value field and display in new column in SPLUNK Query - splunk

I am trying to extract matched strings from a multivalue field and display them in another column. I have tried various options: splitting the field by delimiter, then mvexpand, then where/search to pull the data. Is there an easier way to do this in a Splunk query without all this hassle?
Example: let's say I have the multi-value field column1 below, with data separated by a comma delimiter.
column1 = abc1,test1,test2,abctest1,mail,send,mail2,sendtest2,new,code,results
I was splitting this column with |eval column2=split(column1,",") and using regex/where/search to look for values matching *test*. That returns results, but column1 still shows all the values abc1,test1,test2,abctest1,mail,send,mail2,sendtest2,new,code,results. What I want is either to trim column1 so it shows only the words that match test, or to show those entries in a new column2 containing only test1,test2,abctest1,sendtest2, since only those match *test*.
I would appreciate your help, thanks.

Found the answer after posting this question: it's just a matter of using the existing mvfilter function to pull the matching results.
column2=mvfilter(match(column1,"test"))

| eval column2=split(column1,",") | search column2="*test*"
doesn't work, because split creates a multi-value field: a single event with a single field containing many values. The search for *test* still finds that event, even though it contains abc1 and so on, because at least one value in the field matches *test*.
What you can use is the mvfilter eval function to narrow the multi-value field down to the values you are after.
| eval column2=split(column1,",") | eval column2=mvfilter(match(column2,".*test.*"))
Alternatively to this approach, you can use a regular expression to extract what you need.
| rex field=column1 max_match=0 "(?<column2>[^,]*test[^,]*)"
Regardless, at the end, you would need to use mvjoin to join your multiple values into a single string
| eval column2=mvjoin(column2, ",")
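To sanity-check the expected output outside Splunk, the split / mvfilter / mvjoin pipeline can be mirrored in a few lines of Python. This is only a sketch of the same logic, not Splunk code; the variable names are chosen to line up with the fields in the question.

```python
import re

# Sample value from the question
column1 = "abc1,test1,test2,abctest1,mail,send,mail2,sendtest2,new,code,results"

# Equivalent of split(column1, ","): one string becomes a list of values
values = column1.split(",")

# Equivalent of mvfilter(match(column2, "test")): keep values where the regex matches
matches = [v for v in values if re.search("test", v)]

# Equivalent of mvjoin(column2, ","): back to a single comma-separated string
column2 = ",".join(matches)

print(column2)  # test1,test2,abctest1,sendtest2
```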

Related

How to filter String in where clause

I would like to extract part of a string using a where clause in SAP HANA. For example, these are 3 strings in the name column.
123._SYS_BIC.meag.app.qthor.cidwh_eingangsschicht.backend.dblayer.l2.checks/MasterData_Holdings.
153._SYS_BIC.meag.app.qthor.centralAdministration.backend.dblayer.l2.checks/AuditAndSecurities.
meag.app.qthor.centralAdministration.backend.dblayer.l2.checks/GeneralLedger
After filtering with the where clause, the name column should show only the last portion of the string, as below. In other words, whatever we have, just remove everything from the beginning up to the '/'.
"MasterData_Holdings"
"AuditAndSecurities"
"GeneralLedger"
You can try using REPLACE_REGEXPR.
I'm not familiar with HANA myself, but the function is pretty straightforward and it should be:
select REPLACE_REGEXPR('.+/(.+)' IN fieldName WITH '\1' OCCURRENCE ALL) as field
...
where
... -- your filter
Be aware that the regex '.+/(.+)' is greedy and will eat everything up to the last /, so for instance if you have ....checks/MasterData_Holdings/Something it will return only Something.
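The greedy behaviour is easy to confirm with the same pattern in Python's re module. This is a sketch for checking the regex only, not HANA code; last_segment is a hypothetical helper name, and the second input is a made-up string with two slashes.

```python
import re

# Same pattern as in the REPLACE_REGEXPR call: greedy ".+/" then capture the rest
PATTERN = r".+/(.+)"

def last_segment(name: str) -> str:
    """Replace the whole match with capture group 1 (the text after the last '/')."""
    return re.sub(PATTERN, r"\1", name)

one_slash = "meag.app.qthor.centralAdministration.backend.dblayer.l2.checks/GeneralLedger"
two_slashes = "l2.checks/MasterData_Holdings/Something"  # hypothetical input

print(last_segment(one_slash))    # GeneralLedger
print(last_segment(two_slashes))  # Something
print(last_segment("no_slash"))   # no_slash (no match, string is unchanged)
```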

Splunk: Removing all text after a specific string in a column

I have a field where all values have the following format:
Knowledge:xyz,id:2907129
The id number always changes, however, all I want is the value of xyz.
I used the following to remove "Knowledge:"
eval url=replace (url, "Open_KnowledgeZone:", "")
For the id portion, using ",id*" did not work within the eval replace function.
You'll want to use a regex. Something like:
rex field=url "(?<=Knowledge:)(?<AnyFieldName>.*)(?=,)"
Where <AnyFieldName> is the name you want the result field to be. This will select all characters after "Knowledge:" and before the ",".
Here is the regex in action outside of Splunk:
https://regex101.com/r/ofW0a1/1
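The same pattern can also be tried locally in Python. Note one syntax difference that is specific to Python and not to Splunk: Python's re module spells named groups (?P<name>...), so the group is rewritten below. The second input is a hypothetical value with an extra comma, included to show that the greedy .* runs up to the last comma.

```python
import re

# Lookbehind for "Knowledge:", capture greedily, stop where a comma follows
PATTERN = re.compile(r"(?<=Knowledge:)(?P<AnyFieldName>.*)(?=,)")

def extract(url: str):
    """Return the captured value, or None when the pattern does not match."""
    m = PATTERN.search(url)
    return m.group("AnyFieldName") if m else None

print(extract("Knowledge:xyz,id:2907129"))    # xyz
print(extract("Knowledge:a,b,id:2907129"))    # a,b (greedy: up to the last comma)
```

If values can contain more than one comma and only the first segment is wanted, a character class such as [^,]* in place of .* would stop at the first comma instead.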

Get a Count of a Field Including Similar Entries MS Access

Hey all, I'm trying to pick out any duplicates in an Access database. I want the database to be usable by people who don't know Access, so I am trying to set up queries that can be run without any understanding of the program.
My database is set up so that there are occasionally special characters attached to the entries in the Name field. I am interested in checking for duplicate entries based on the fields field1 and name. How can I include the counts for entries with special characters together with their non-special-character counterparts? Is this possible in a single step, or do I need to add a step where I clean the data first?
Currently my code (shown below) only returns counts for entries not including special characters.
SELECT
table.[field1],
table.[Name],
Count(table.[Name]) AS [CountOfName]
FROM
table
GROUP BY
table.[field1],
table.[Name]
HAVING
(((table.[Name]) Like "*") AND ((Count(table.[Name]))>1));
I have tried adding a leading space to the Like statement (Like " *"), but that returns zero results.
P.S. I have also tried the Replace statement to replace the special characters, but that did not work.
field1     name
1234567    brian
1234567    brian
4567890    ted
4567890    ted‡
Results
field1     name     countofname
1234567    brian    2
GROUP BY works by placing rows into groups where values are the same. So, when you run your query on your data and it groups by field1 and name, you are saying "Put these records into groups where they share a common field1 and name value". If you want 4567890, ted and 4567890, ted† to show in the same group, and thus have a count of 2, both the field1 and name have to be the same.
If you only have one or two possible special characters on the end of the names, you could potentially use Replace() or Left() (Access has no Substring() function) to remove the special characters from the end of the names, but remember you must also GROUP BY the new expression you create; you can't GROUP BY the original name field or you won't get your desired count. You could also create another column that contains a sanitized name, one without any special character on the end.
I don't have Access installed, but something like this should do it:
SELECT
table.[field1],
Replace(table.[Name], "†", "") AS Name,
Count(table.[Name]) AS [CountOfName]
FROM
table
GROUP BY
table.[field1],
Replace(table.[Name], "†", "")
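The effect of grouping on the sanitized expression rather than the raw name can be illustrated in Python with the sample rows from the question. This is a sketch of the grouping logic only, not Access SQL. The sample data shows ‡ while the query above strips †, so the sketch assumes both characters may occur and strips either; sanitize is a hypothetical helper name.

```python
from collections import Counter

# Sample rows from the question: (field1, name)
rows = [
    ("1234567", "brian"),
    ("1234567", "brian"),
    ("4567890", "ted"),
    ("4567890", "ted\u2021"),  # "ted" followed by a double dagger
]

def sanitize(name: str) -> str:
    """Strip trailing dagger / double-dagger characters (assumed special chars)."""
    return name.rstrip("\u2020\u2021")

# Group on the sanitized key, like GROUP BY field1, Replace(Name, ...)
counts = Counter((field1, sanitize(name)) for field1, name in rows)

# Keep only groups that occur more than once, like HAVING Count(...) > 1
duplicates = {key: n for key, n in counts.items() if n > 1}
print(duplicates)
```

Grouping on the raw name instead would put "ted" and the dagger variant in separate groups of one, which is why only brian shows up in the original results.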

Transform data in Google bigquery - extract text, split it into multiple columns and pivoting the data

I have some weblog data in BigQuery which I need to transform to make it easier to use and query. The data looks like:
I want to extract and transform the data within the curly brackets after Results{…..} (colored blue). The data is of the form ‘(\d+((PQ)|(KL))+\d+)’ and there can be 1-20+ entries in the result array. I am only interested in the first 16 entries.
I have been able to extract the data within the curly brackets into a new column, using SUBSTR and REGEXP_EXTRACT. But I'm unable to SPLIT it into columns (sometimes there is only 1 result, so the delimiter "," is missing). I'm new to regex; maybe I can use something like ‘(\d+((PQ)|(KL))+\d+){1}’ etc. to split the data into multiple columns and then pivot it.
Ideal output in my case would be to transform it into something like:
In the above solution, each row in original table is repeated from 1-16 times depending on the number of items in the Results array.
I’m not completely sure if it’s possible to do this in BigQuery. I’ll be grateful if anyone can help me out a little here.
If this is not possible, then I can have 16 rows for every event with NULL values in Event_details for cases where there are less than 16 entries in result array.
In case both of these are not possible, the last solution would be to have it transformed into something like:
The reason I want to transform the data is that in most of the cases I would need to find which result array items are appearing and in what order.
Check this out: Split string into multiple columns with bigquery.
In their case it's delimited by spaces; replace the \s with ','.
Something like:
SELECT
Regexp_extract(StringToParse,r'^*{(?:[^,]*,){0}(\d+(?:(?:PQ)|(?:KL))+\d+)\s?') as Word0,
Regexp_extract(StringToParse,r'^*{(?:[^,]*,){1}(\d+(?:(?:PQ)|(?:KL))+\d+)\s?') as Word1,
Regexp_extract(StringToParse,r'^*{(?:[^,]*,){2}(\d+(?:(?:PQ)|(?:KL))+\d+)\s?') as Word2,
Regexp_extract(StringToParse,r'^*{(?:[^,]*,){3}(\d+(?:(?:PQ)|(?:KL))+\d+)\s?') as Word3,
FROM
(SELECT 'bla{1234PQ5,6789KL0,1234PQ5,6789KL0,123' as StringToParse)
Use SPLIT()
SELECT Event_ID, Event_UserID, Event_SessionID, Keyword,
SPLIT(REGEXP_EXTRACT(Event_details,"Results\{(.*)\}"),",") as Event_details_item
FROM mydata.mytable
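The extract-then-split idea can be tried outside BigQuery with a short Python sketch. The input strings below are hypothetical, since the original sample data is not shown in the question; result_items and its limit parameter are illustrative names. The sketch also covers the single-entry case, where no comma is present.

```python
import re

def result_items(event_details: str, limit: int = 16) -> list:
    """Pull the text inside Results{...} and split it on commas.

    Returns at most `limit` items, or an empty list when there is no match.
    """
    m = re.search(r"Results\{(.*)\}", event_details)
    if not m:
        return []
    items = [item.strip() for item in m.group(1).split(",")]
    return [item for item in items if item][:limit]

# Hypothetical log fragments
print(result_items("q=abc Results{1234PQ5,6789KL0,42KL7} end"))
print(result_items("q=abc Results{1234PQ5} end"))  # single entry, no comma
print(result_items("q=abc no results here"))
```

Splitting a string that has no delimiter simply yields a one-element list, so the missing comma is not a problem for the split step itself.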

Is it possible to get the matching string from an SQL query?

If I have a query to return all matching entries in a DB that have "news" in the searchable column (i.e. SELECT * FROM table WHERE column LIKE '%news%'), and one particular row has an entry starting with "In recent World news, Somalia was invaded by ...", can I return a specific "chunk" of an SQL entry? Kind of like a teaser, if you will.
select substring(column,
CHARINDEX ('news',lower(column))-10,
20)
FROM table
WHERE column LIKE '%news%'
Basically, substring the column starting 10 characters before where the word 'news' is and continuing for 20.
Edit: You'll need to make sure that 'news' isn't in the first 10 characters and adjust the start position accordingly.
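The start-position adjustment mentioned in the edit can be sketched in Python as follows. This illustrates the clamping logic only, not the SQL itself; teaser and its parameters are hypothetical names, with the 10 and 20 taken from the query above.

```python
def teaser(text: str, term: str = "news", before: int = 10, length: int = 20):
    """Return `length` characters starting `before` characters ahead of `term`.

    The start is clamped to 0 so a match near the beginning does not
    produce a negative offset. Returns None when the term is absent.
    """
    pos = text.lower().find(term.lower())
    if pos == -1:
        return None
    start = max(0, pos - before)
    return text[start:start + length]

print(teaser("In recent World news, Somalia was invaded by ..."))
# ent World news, Soma
print(teaser("news at ten tonight"))
# news at ten tonight
```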
You can use the SUBSTRING function in the SELECT part. Something like:
SELECT SUBSTRING(column, 1,20) FROM table WHERE column LIKE '%news%'
This will return the first 20 characters of the column named column.
I had the same problem, I ended up loading the whole field into C#, then re-searched the text for the search string, then selected x characters either side.
This will work fine for LIKE, but not for full-text queries that use FORMSOF(INFLECTIONAL, ...), because those may match "women" when you search for "woman".
If you are using MSSQL you can use all kinds of VB-like substring functions as part of your query.