Expand targeted Splunk search to include all hosts, build report

I am trying to run a search against all hosts but I am having difficulty figuring out the right approach. A simplified version of what I am looking for is:
index=os sourcetype=df host=system323 mount=/var | streamstats range(storage_used) as storage_growth window=2
But ultimately I want it to search all mount points on all hosts and then send that to a chart or a report.
I tried a few different approaches, but none of them gave the expected results. I felt like I was on the right path with sub-searches, since they seemed like the equivalent of a for loop, but that did not work either:
index=os sourcetype=df [search index=os sourcetype=df [search index=os sourcetype=df earliest=-1d@d latest=now() | stats values(host) AS host] earliest=-1d@d latest=now() | stats values(mount) AS mount] | streamstats range(storage_used) as storage_growth window=2
How can I take my first search and build a report that will include all hosts and mount points?

Much simpler than sub-searches. Just use a by clause in your streamstats:
index=os sourcetype=df
| eval mountpoint=host+":"+mount
| streamstats range(storage_used) as storage_growth by mountpoint window=2
| table _time,mountpoint,storage_growth
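If you then want to turn that into a chart for a report, you could feed the same search into timechart; a minimal sketch, where the 1h span and the max() aggregation are assumptions you should adjust to your data:
index=os sourcetype=df
| eval mountpoint=host+":"+mount
| streamstats range(storage_used) as storage_growth by mountpoint window=2
| timechart span=1h max(storage_growth) by mountpoint
Saving that search as a report or dashboard panel then covers all hosts and mount points at once.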

Related

Kusto convert Bytes to MeBytes by division

Trying to graph Bandwidth consumed using Azure Log Analytics
Perf
| where TimeGenerated > ago(1d)
| where CounterName contains "Network Send"
| summarize sum(CounterValue) by bin(TimeGenerated, 1m), _ResourceId
| render timechart
This generates a reasonable chart except the y axis runs from 0 - 15,000,000,000. I tried
Perf
| where TimeGenerated > ago(1d)
| where CounterName contains "Network Send"
| extend MeB_bandwidth_out = todouble(CounterValue)/1,048,576
| summarize sum(MeB_bandwidth_out) by bin(TimeGenerated, 1m), _ResourceId
| render timechart
but I get exactly the same chart. I've tried without the todouble(), or doing it after the division, but nothing changes. Any hint why this is not working?
A bit hard to say without seeing a sample of the data, but here are a couple of ideas:
Try removing the commas from 1,048,576
If this doesn't work, remove the last line from both queries, run them, and compare the results to see why the data doesn't make sense
P.S. Regardless, there's a good chance that you can replace contains with has to significantly improve the performance (note that has looks for full words, while contains doesn't - so they are not the same, be careful).
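For reference, here is the asker's second query with just that one change applied (commas removed from the numeric literal), plus a comment noting the optional contains/has swap:
Perf
| where TimeGenerated > ago(1d)
| where CounterName contains "Network Send"   // consider 'has' per the note above, if it still matches your counters
| extend MeB_bandwidth_out = todouble(CounterValue) / 1048576
| summarize sum(MeB_bandwidth_out) by bin(TimeGenerated, 1m), _ResourceId
| render timechart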

Not able to read nested json array in SPLUNK

I am using "spath" to read json structure from a log file.
{"failure_reason":null,"gen_flag":"GENERATED","gen_date":"2020-02-15","siteid":"ABC","_action":"Change","order":"123"}
I am able to parse the above json.
However, the "spath" function is not able to read the nested array inside this json:
{"failure_reason":"[{"module":"Status Report","reason":"Status Report is not available","statusCode":"503"}]","gen_flag":"GENERATED_PARTIAL","gen_date":"2020-02-15","siteid":"ABC","_action":"Change","wonum":"321"}.
please help!
Your event is not valid JSON. A JSON array should not be surrounded by "s.
Copy your event into any of the following JSON validators, and confirm that it is incorrect.
https://jsonformatter.curiousconcept.com/
https://jsonlint.com/
https://jsonformatter.org/
Now, try with the corrected event.
{"failure_reason":[{"module":"Status Report","reason":"Status Report is not available","statusCode":"503"}],"gen_flag":"GENERATED_PARTIAL","gen_date":"2020-02-15","siteid":"ABC","_action":"Change","wonum":"321"}
You can see that spath works correctly with the modified JSON with the following search.
| makeresults
| eval raw="{\"failure_reason\":[{\"module\":\"Status Report\",\"reason\":\"Status Report is not available\",\"statusCode\":\"503\"}],\"gen_flag\":\"GENERATED_PARTIAL\",\"gen_date\":\"2020-02-15\",\"siteid\":\"ABC\",\"_action\":\"Change\",\"wonum\":\"321\"}"
| spath input=raw
If you need a way to pre-process your event to remove the "s around the array, you may be able to use the following, which strips the extra "s. This is really dependent on the structure of the event and may not work 100% of the time, but it should be enough to get you started. Ideally, fix the format of the event at the source.
| makeresults | eval raw="{\"failure_reason\":\"[{\"module\":\"Status Report\",\"reason\":\"Status Report is not available\",\"statusCode\":\"503\"}]\",\"gen_flag\":\"GENERATED_PARTIAL\",\"gen_date\":\"2020-02-15\",\"siteid\":\"ABC\",\"_action\":\"Change\",\"wonum\":\"321\"}"
| rex mode=sed field=raw "s/\"\[/[/" | rex mode=sed field=raw "s/\]\"/]/"
| spath input=raw
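If you control the props.conf for this data, a similar cleanup can also be done at index time with SEDCMD; a sketch, assuming a sourcetype name of your_json_sourcetype (the stanza name and class names are placeholders):
# props.conf - strip the quotes wrapping the failure_reason array before indexing
[your_json_sourcetype]
SEDCMD-fix_array_open = s/"\[/[/g
SEDCMD-fix_array_close = s/\]"/]/g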

Chrome Selenium IDE 3.4.5 extension "execute script" command not storing as variable

I am trying to create a random number generator:
Command | Target | Value
store | tom | tester
store | dominic | envr
execute script | Math.floor(Math.random()*11111); | number
type | id=XXX | ${tester}.${dominic}.${number}
Expected result:
tom.dominic.0 <-- random number
Instead I get this:
tom.dominic.${number}
I looked through all the resources and it seems the recent Selenium update/version has changed the approach, and I cannot find a solution.
I realize this question is 2 years old, but it's a fairly common one, so I'll answer it here.
If you want to assign the result of a script run by the "execute script" command in Selenium IDE to a Selenium variable, you have to return the value from the JavaScript. So instead of
execute script | Math.floor(Math.random()*11111); | number
you need
execute script | return Math.floor(Math.random()*11111); | number
Also, in your final assignment that puts the 3 pieces together, you needed ${envr} instead of ${dominic}.
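Putting both fixes together, the steps from the question become:
store | tom | tester
store | dominic | envr
execute script | return Math.floor(Math.random()*11111); | number
type | id=XXX | ${tester}.${envr}.${number}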

Splunk - Disabling alerts during maintenance window

I have a simple csv file loaded in splunk called StandardMaintenance.csv which looks like this...
UnderMaintenance
NO
We currently get bombarded with alerts during our maintenance window. At the start of maintenance, I want to be able to change this to YES to stop the alerts (I have an easy way to do this). I am looking for something standard to add to all alert queries that checks this csv for status (a lookup, as I understand it), so that the query returns nothing if UnderMaintenance = YES and therefore does not generate a match.
It is basically a binary, ON or OFF. I would appreciate any help you could provide.
NOTE: You cannot disable an alert by executing a Splunk search query, because the REST API requires a POST action.
Step 1: Maintain a csv file of all your saved searches and their owners by using the query below. You can schedule the query as per your convenience. For example, the below search creates maintenance.csv and replaces all of its contents whenever it runs.
| rest /servicesNS/-/search/saved/searches | table title eai:acl.owner | outputlookup maintenance.csv
This file will be created in $SPLUNK_HOME/etc/apps/<app name>/lookups.
Step 2: Write a script that reads the data from the maintenance.csv file and executes the below command to disable each search; a rough sketch of such a script is shown after the command. (Run this before the maintenance window.)
curl -X POST -k -u admin:pass https://<splunk server>:8089/servicesNS/<owner>/search/saved/searches/<search title>/disable
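A minimal sketch of such a script, assuming the host, credentials, and csv location shown (saved-search titles containing spaces or special characters would additionally need URL-encoding):
#!/bin/bash
# Disable every saved search listed in maintenance.csv (columns: title, owner).
# Host, credentials, and file path are assumptions - adjust for your environment.
SPLUNK_HOST="https://your-splunk-server:8089"
CSV="/opt/splunk/etc/apps/search/lookups/maintenance.csv"
tail -n +2 "$CSV" | while IFS=, read -r title owner; do
    curl -s -X POST -k -u admin:pass \
        "$SPLUNK_HOST/servicesNS/$owner/search/saved/searches/$title/disable"
done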
Step 3: Do the same thing to enable all searches; just change the command to the one below. (Run this after the maintenance window.)
curl -X POST -k -u admin:pass https://<splunk server>:8089/servicesNS/<owner>/search/saved/searches/<search title>/enable
EDIT 1:
Create the StandardMaintenance.csv file under $SPLUNK_HOME/etc/apps/search/lookups.
The StandardMaintenance.csv file contains:
UnderMaintenance
"No"
Use the below search query to get results from the existing saved searches only if UnderMaintenance = No:
| rest /servicesNS/-/search/saved/searches
| eval UnderMaintenance = "No"
| table title eai:acl.owner UnderMaintenance
| join UnderMaintenance
[| inputlookup StandardMaintenance.csv ]
| table title eai:acl.owner
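The same pattern can be appended to any alert search so that it only returns results while the lookup says No; a sketch (replace the first line with your actual alert search, and note that the value match is case-sensitive):
<your alert search>
| eval UnderMaintenance = "No"
| join UnderMaintenance
[| inputlookup StandardMaintenance.csv ]
When the lookup is flipped to YES, the join finds no matching row, the search returns nothing, and the alert does not fire.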
Hope this helps!
Before each query, create a variable (say it's called foo) that you set to true if maintenance is NO and that you leave unset otherwise, as below:
... | eval foo=case(maintenance=="NO","true")
Then you put the below at the end of your query:
| eval foo=$foo$
This will make your query execute only if maintenance is NO.

Can GraphDB load 10 million statements with OWL reasoning?

I am struggling to load most of the Drug Ontology OWL files and most of the ChEBI OWL files into a GraphDB Free v8.3 repository with Optimized OWL Horst reasoning on.
Is this possible? Should I do something other than "be patient"?
Details:
I'm using the loadrdf offline bulk loader to populate an AWS r4.16xlarge instance with 488.0 GiB of RAM and 64 vCPUs.
Over the weekend, I played around with different pool buffer sizes and found that most of these files individually load fastest with a pool buffer of 2,000 or 20,000 statements instead of the suggested 200,000. I also added -Xmx470g to the loadrdf script. Most of the OWL files would load individually in less than one hour.
Around 10 pm EDT last night, I started loading all of the files listed below simultaneously. Now, 11 hours later, there are still millions of statements to go. The load rate is currently around 70 statements/second. It appears that only 30% of my RAM is being used, but the CPU load is consistently around 60.
Are there websites that document other people doing something of this scale?
Should I be using a different reasoning configuration? I chose this configuration because it was the fastest-loading OWL configuration, based on my experiments over the weekend. I think I will need to look for relationships that go beyond rdfs:subClassOf.
Files I'm trying to load:
+-------------+------------+---------------------+
| bytes | statements | file |
+-------------+------------+---------------------+
| 471,265,716 | 4,268,532 | chebi.owl |
| 61,529 | 451 | chebi-disjoints.owl |
| 82,449 | 1,076 | chebi-proteins.owl |
| 10,237,338 | 135,369 | dron-chebi.owl |
| 2,374 | 16 | dron-full.owl |
| 170,896 | 2,257 | dron-hand.owl |
| 140,434,070 | 1,986,609 | dron-ingredient.owl |
| 2,391 | 16 | dron-lite.owl |
| 234,853,064 | 2,495,144 | dron-ndc.owl |
| 4,970 | 28 | dron-pro.owl |
| 37,198,480 | 301,031 | dron-rxnorm.owl |
| 137,507 | 1,228 | dron-upper.owl |
+-------------+------------+---------------------+
@MarkMiller, you can take a look at the Preload tool, which is part of the GraphDB 8.4.0 release. It's specially designed to handle large amounts of data at a constant speed. Note that it works without inference, so you'll need to load your data, then change the ruleset and re-infer the statements.
http://graphdb.ontotext.com/documentation/free/loading-data-using-preload.html
Just typing out @Konstantin Petrov's correct suggestion with tidier formatting. All of these queries should be run in the repository of interest... at some point in working this out, I misled myself into thinking that I should be connected to the SYSTEM repo when running these queries.
All of these queries also require the following prefix definition
prefix sys: <http://www.ontotext.com/owlim/system#>
This doesn't directly address the timing/performance of loading large datasets into an OWL reasoning repository, but it does show how to switch to a higher level of reasoning after loading lots of triples into a no-inference ("empty" ruleset) repository.
You could start by querying for the current reasoning level/ruleset, and then run this same select statement after each insert.
SELECT ?state ?ruleset {
?state sys:listRulesets ?ruleset
}
Add a predefined ruleset
INSERT DATA {
_:b sys:addRuleset "rdfsplus-optimized"
}
Make the new ruleset the default
INSERT DATA {
_:b sys:defaultRuleset "rdfsplus-optimized"
}
Re-infer... could take a long time!
INSERT DATA {
[] <http://www.ontotext.com/owlim/system#reinfer> []
}