Kusto convert Bytes to MeBytes by division - azure-log-analytics

Trying to graph Bandwidth consumed using Azure Log Analytics
Perf
| where TimeGenerated > ago(1d)
| where CounterName contains "Network Send"
| summarize sum(CounterValue) by bin(TimeGenerated, 1m), _ResourceId
| render timechart
This generates a reasonable chart, except the y-axis runs from 0 to 15,000,000,000. I tried
Perf
| where TimeGenerated > ago(1d)
| where CounterName contains "Network Send"
| extend MeB_bandwidth_out = todouble(CounterValue)/1,048,576
| summarize sum(MeB_bandwidth_out) by bin(TimeGenerated, 1m), _ResourceId
| render timechart
but I get exactly the same chart. I've tried without the todouble(), and doing the conversion after the division instead, but nothing changes. Any hint as to why this is not working?

A bit hard to say without seeing a sample of the data, but here are a couple of ideas:
Try removing the commas from 1,048,576 (see the corrected query below)
If that doesn't work, remove the last line from both queries, run them, and compare the results to see why the data doesn't make sense
P.S. Regardless, there's a good chance you can replace contains with has to significantly improve performance (note that has looks for full words, while contains doesn't - so they are not the same, be careful).
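Putting both suggestions together, a hedged version of the second query (assuming the commas were the only real problem, and keeping the has suggestion) would be:
Perf
| where TimeGenerated > ago(1d)
| where CounterName has "Network Send"
| extend MeB_bandwidth_out = todouble(CounterValue) / 1048576
| summarize sum(MeB_bandwidth_out) by bin(TimeGenerated, 1m), _ResourceId
| render timechart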

Chrome Selenium IDE 3.4.5 extension "execute script" command not storing as variable

I am trying to create a random number generator:
Command | Tgt | Val |
store | tom | tester
store | dominic | envr
execute script | Math.floor(Math.random()*11111); | number
type | id=XXX | ${tester}.${dominic}.${number}
Expected result:
tom.dominic.0 <-- random number
Instead I get this:
tom.dominic.${number}
I looked through all the resources, and it seems the recent Selenium update/version has changed the approach; I cannot find a solution.
I realize this question is 2 years old, but it's a fairly common one, so I'll answer it and see if there are other answers that address it.
If you want to assign the result of a script run by the "execute script" in Selenium IDE to a Selenium variable, you have to return the value from JavaScript. So instead of
execute script | Math.floor(Math.random()*11111); | number
you need
execute script | return Math.floor(Math.random()*11111); | number
Also, in your final assignment that puts the 3 pieces together, you need ${envr} instead of ${dominic}, since envr is the variable name and dominic is merely its value.
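Putting both fixes together, the table from the question would become:
store | tom | tester
store | dominic | envr
execute script | return Math.floor(Math.random()*11111); | number
type | id=XXX | ${tester}.${envr}.${number}
which should type something like tom.dominic.4821, with the last piece being the random number.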

Can GraphDB load 10 million statements with OWL reasoning?

I am struggling to load most of the Drug Ontology OWL files and most of the ChEBI OWL files into GraphDB free v8.3 repository with Optimized OWL Horst reasoning on.
Is this possible? Should I do something other than "be patient"?
Details:
I'm using the loadrdf offline bulk loader to populate an AWS r4.16xlarge instance with 488.0 GiB of RAM and 64 vCPUs.
Over the weekend, I played around with different pool buffer sizes and found that most of these files individually load fastest with a pool buffer of 2,000 or 20,000 statements instead of the suggested 200,000. I also added -Xmx470g to the loadrdf script. Most of the OWL files would load individually in less than one hour.
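For reference, each individual load was invoked roughly like this (the repository config path is a placeholder, and the flags are from memory, so treat this as a sketch rather than my exact command):
# hypothetical loadrdf invocation; -c points at the repository config .ttl
# and -m parallel selects the parallel bulk-load mode
./loadrdf -c repo-config.ttl -m parallel chebi.owl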
Around 10 pm EDT last night, I started to load all of the files listed below simultaneously. Now it's 11 hours later, and there are still millions of statements to go. The load rate is around 70/second now. It appears that only 30% of my RAM is being used, but the CPU load is consistently around 60.
Are there websites that document other people doing something of this scale?
Should I be using a different reasoning configuration? I chose this configuration because it was the fastest-loading OWL configuration in my weekend experiments. I think I will need to look for relationships that go beyond rdfs:subClassOf.
Files I'm trying to load:
+-------------+------------+---------------------+
|       bytes | statements | file                |
+-------------+------------+---------------------+
| 471,265,716 |  4,268,532 | chebi.owl           |
|      61,529 |        451 | chebi-disjoints.owl |
|      82,449 |      1,076 | chebi-proteins.owl  |
|  10,237,338 |    135,369 | dron-chebi.owl      |
|       2,374 |         16 | dron-full.owl       |
|     170,896 |      2,257 | dron-hand.owl       |
| 140,434,070 |  1,986,609 | dron-ingredient.owl |
|       2,391 |         16 | dron-lite.owl       |
| 234,853,064 |  2,495,144 | dron-ndc.owl        |
|       4,970 |         28 | dron-pro.owl        |
|  37,198,480 |    301,031 | dron-rxnorm.owl     |
|     137,507 |      1,228 | dron-upper.owl      |
+-------------+------------+---------------------+
@MarkMiller, you can take a look at the Preload tool, which is part of the GraphDB 8.4.0 release. It's specially designed to handle large amounts of data at a constant speed. Note that it works without inference, so you'll need to load your data and then change the ruleset and re-infer the statements.
http://graphdb.ontotext.com/documentation/free/loading-data-using-preload.html
Just typing out @Konstantin Petrov's correct suggestion with tidier formatting. All of these queries should be run in the repository of interest... at some point in working this out, I misled myself into thinking that I should be connected to the SYSTEM repo when running these queries.
All of these queries also require the following prefix definition:
prefix sys: <http://www.ontotext.com/owlim/system#>
This doesn't directly address the timing/performance of loading large datasets into an OWL reasoning repository, but it does show how to switch to a higher level of reasoning after loading lots of triples into a no-inference ("empty" ruleset) repository.
You could start by querying for the current reasoning level/ruleset, and then run this same SELECT statement after each INSERT:
SELECT ?state ?ruleset {
    ?state sys:listRulesets ?ruleset
}
Add a predefined ruleset:
INSERT DATA {
    _:b sys:addRuleset "rdfsplus-optimized"
}
Make the new ruleset the default:
INSERT DATA {
    _:b sys:defaultRuleset "rdfsplus-optimized"
}
Re-infer... this could take a long time!
INSERT DATA {
    [] <http://www.ontotext.com/owlim/system#reinfer> []
}
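If you'd rather script these steps than paste them into the Workbench, here's a hedged sketch using the standard RDF4J REST endpoint that GraphDB exposes (the default port 7200 and the repository name "myrepo" are assumptions):
# POST a SPARQL update to the repository's statements endpoint
curl -X POST http://localhost:7200/repositories/myrepo/statements \
  -H 'Content-Type: application/sparql-update' \
  --data 'prefix sys: <http://www.ontotext.com/owlim/system#>
          INSERT DATA { _:b sys:defaultRuleset "rdfsplus-optimized" }'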

Expand targeted Splunk search to include all hosts, build report

I am trying to run a search against all hosts but I am having difficulty figuring out the right approach. A simplified version of what I am looking for is:
index=os sourcetype=df host=system323 mount=/var | streamstats range(storage_used) as storage_growth window=2
But ultimately I want it to search all mount points on all hosts and then send that to a chart or a report.
I tried a few different approaches, but none of them gave me the expected results. I felt like I was on the right path with sub-searches, since they seemed like the equivalent of a for loop, but that attempt did not work either:
index=os sourcetype=df [search index=os sourcetype=df [search index=os sourcetype=df earliest=-1d@d latest=now() | stats values(host) AS host] earliest=-1d@d latest=now() | stats values(mount) AS mount] | streamstats range(storage_used) as storage_growth window=2
How can I take my first search and build a report that will include all hosts and mount points?
Much simpler than sub-searches. Just use a by clause in your streamstats:
index=os sourcetype=df
| eval mountpoint=host+":"+mount
| streamstats range(storage_used) as storage_growth by mountpoint window=2
| table _time,mountpoint,storage_growth
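To feed that into a chart for the report, here's one hedged variation (the 1h span is an assumption - adjust to taste):
index=os sourcetype=df
| eval mountpoint=host+":"+mount
| streamstats range(storage_used) as storage_growth by mountpoint window=2
| timechart span=1h max(storage_growth) by mountpoint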

What is the easiest way to find the biggest objects in Redis?

I have a 20GB+ rdb dump in production.
I suspect there's a specific set of keys bloating it.
I'd like to have a way to always spot the 100 biggest objects, either through static dump analysis or by asking the server itself, which by the way has over 7M objects.
Dump analysis tools like rdbtools are not helpful in this (I think) really common use case!
I was thinking to write a script and iterate the whole keyset with "redis-cli debug object", but I have the feeling there must be some tool I'm missing.
An option was added to redis-cli: redis-cli --bigkeys
Sample output based on https://gist.github.com/michael-grunder/9257326
$ ./redis-cli --bigkeys
# Press ctrl+c when you have had enough of it... :)
# You can use -i 0.1 to sleep 0.1 sec every 100 sampled keys
# in order to reduce server load (usually not needed).
Biggest string so far: day:uv:483:1201737600, size: 2
Biggest string so far: day:pv:2013:1315267200, size: 3
Biggest string so far: day:pv:3:1290297600, size: 5
Biggest zset so far: day:topref:2734:1289433600, size: 3
Biggest zset so far: day:topkw:2236:1318723200, size: 7
Biggest zset so far: day:topref:651:1320364800, size: 20
Biggest string so far: uid:3467:auth, size: 32
Biggest set so far: uid:3029:allowed, size: 1
Biggest list so far: last:175, size: 51
-------- summary -------
Sampled 329 keys in the keyspace!
Total key length in bytes is 15172 (avg len 46.12)
Biggest list found 'day:uv:483:1201737600' has 5235597 items
Biggest set found 'day:uvx:555:1201737600' has 47 members
Biggest hash found 'day:uvy:131:1201737600' has 2888 fields
Biggest zset found 'day:uvz:777:1201737600' has 1000 members
0 strings with 0 bytes (00.00% of keys, avg size 0.00)
19 lists with 5236744 items (05.78% of keys, avg size 275618.11)
50 sets with 112 members (15.20% of keys, avg size 2.24)
250 hashs with 6915 fields (75.99% of keys, avg size 27.66)
10 zsets with 1294 members (03.04% of keys, avg size 129.40)
redis-rdb-tools does have a memory report that does exactly what you need. It generates a CSV file with the memory used by every key. You can then sort it and find the top N keys.
There is also an experimental memory profiler that started to do what you need. It's not yet complete, and so isn't documented. But you can try it - https://github.com/sripathikrishnan/redis-rdb-tools/tree/master/rdbtools/cli. And of course, I'd encourage you to contribute as well!
Disclaimer: I am the author of this tool.
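For example, a hedged sketch of generating the report and pulling out the 100 biggest keys (the dump path is a placeholder; the size in bytes is the fourth field of the CSV):
rdb -c memory /path/to/dump.rdb > memory.csv
sort -t, -k4 -n -r memory.csv | head -n 100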
I am pretty new to bash scripting. I came up with this:
for line in $(redis-cli keys '*' | awk '{print $1}'); do echo "$(redis-cli DEBUG OBJECT "$line" | awk '{print $5}' | sed 's/serializedlength://g') $line"; done | sort -h
This script:
Lists all the keys with redis-cli keys "*"
Gets the size of each with redis-cli DEBUG OBJECT (the serializedlength field)
Sorts the output by size, which is prepended to each key name
This may be very slow, because bash loops through every single Redis key. Since you have 7M keys, you may want to cache the output of keys to a file, as sketched below.
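A hedged variant along those lines (the temp path is arbitrary):
redis-cli keys '*' > /tmp/keys.txt
while read -r key; do
  echo "$(redis-cli DEBUG OBJECT "$key" | awk '{print $5}' | sed 's/serializedlength://g') $key"
done < /tmp/keys.txt | sort -h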
If you have keys that follow the pattern "A:B" or "A:B:*", I wrote a tool that analyzes existing content as well as monitoring things such as hit rate, number of gets/sets, network traffic, lifetime, etc. The output is similar to the one below.
https://github.com/alexdicianu/redis_toolkit
$ ./redis-toolkit report -type memory -name NAME
+----------------------------------------+----------+-----------+----------+
| KEY                                    | NR KEYS  | SIZE (MB) | SIZE (%) |
+----------------------------------------+----------+-----------+----------+
| posts:*                                | 500      | 0.56      | 2.79     |
| post_meta:*                            | 440      | 18.48     | 92.78    |
| terms:*                                | 192      | 0.12      | 0.63     |
| options:*                              | 109      | 0.52      | 2.59     |
+----------------------------------------+----------+-----------+----------+
Try redis-memory-analyzer (RMA) - a console tool that scans the Redis keyspace in real time and aggregates memory usage statistics by key pattern. You can use this tool on production servers without a maintenance window. It shows you detailed statistics about each key pattern in your Redis server.
You can also scan the Redis db for all or only selected Redis types, such as "string", "hash", "list", "set", and "zset". Matching by pattern is also supported.
RMA also tries to discern key names by pattern; for example, if you have keys like 'user:100' and 'user:101', the application will pick out the common pattern 'user:*' in the output, so you can analyze the most memory-hungry data in your instance.
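A hedged invocation sketch (host and port are the usual Redis defaults; check the project's README, as flags may differ by version):
# scan a local instance; RMA aggregates memory stats by key pattern
rma -s localhost -p 6379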

What is the best way to use a console when developing?

For scripting languages, what is the most effective way to utilize a console when developing? Are there ways to be more productive with a console than a "compile and run" only language?
Added clarification: I am thinking more along the lines of Ruby, Python, Boo, etc. Languages that are used for full blown apps, but also have a way to run small snippets of code in a console.
I am thinking more along the lines of Ruby, ...
Well, for Ruby, the irb interactive prompt is a great tool for "practicing" something simple. Here are the things I'll mention about irb to give you an idea of effective use:
Automation. You are allowed a .irbrc file that will be automatically executed when launching irb. That means you can load your favorite libraries or do whatever you want in full Ruby automatically. To see what I mean check out some of the ones at dotfiles.org.
Autocompletion. That makes writing code even easier. Can't remember that string method to remove newlines? "".ch<tab> produces chop and chomp. NOTE: you have to enable autocompletion for irb yourself, as shown below.
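The standard way to turn it on is a one-liner in your ~/.irbrc:
# in ~/.irbrc - enables tab completion in irb
require 'irb/completion'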
Divide and Conquer. irb makes the small things really easy. If you're writing a function to manipulate strings, the ability to test the code interactively right in the prompt saves a lot of time! For instance you can just open up irb and start running functions on an example string and have working and tested code already ready for your library/program.
Learning, Experimenting, and Hacking. Something like the session below would take a very long time to test in C/C++, or even Java. If you tried testing it all at once, you might seg-fault and have to start over.
Here I'm just learning how the String#[] function works.
joe[~]$ irb
>> "12341:asdf"[/\d+/]
# => "12341"
>> "12341:asdf"[/\d*/]
# => "12341"
>> "12341:asdf"[0..5]
# => "12341:"
>> "12341:asdf"[0...5]
# => "12341"
>> "12341:asdf"[0, ':']
TypeError: can't convert String into Integer
from (irb):5:in `[]'
from (irb):5
>> "12341:asdf"[0, 5]
# => "12341"
Testing and Benchmarking. These are now nice and easy to perform. Here is someone's idea of emulating the Unix time function for quick benchmarking. Just add it to your .irbrc file and it's always there!
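I don't have that exact snippet handy, but a minimal sketch of the idea (the method name is mine) would be something like:
# in ~/.irbrc - rough emulation of the Unix time command for a Ruby block
def time_it
  start = Time.now
  result = yield
  puts "Elapsed: #{Time.now - start} seconds"
  result
end
# usage: time_it { 100_000.times { "abc".chomp } }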
Debugging - I haven't used this much myself, but there is always the ability to debug code like this. Or pull out some code and run it in irb to see what it's actually doing.
I'm sure I'm missing some things, but I hit on my favorite points. You really have zero limitations in shells, so you're limited only by what you can think of doing. I almost always have a few shells running. Bash, JavaScript, and Ruby's irb, to name a few. I use them for a lot of things!
I think it depends on the console. The usefulness of a CMD console on Windows pales in comparison to a PowerShell console.
You didn't say what OS you're using, but on Linux I've been using a tabbed window manager (wmii) for a year or so, and it has radically changed the way I use applications - consoles or otherwise.
I often have four or more consoles and other apps on a virtual desktop and with wmii I don't have to fiddle with resizing windows to line everything up just so. I can trivially rearrange them into vertical columns, stack them up vertically, have them share equal amounts of vertical or horizontal space, and move them between screens.
Say you open two consoles on your desktop. You'd get this (with apologies for the clunky artwork):
----------------
| |
| 1 |
| |
----------------
----------------
| |
| 2 |
| |
----------------
Now I want them side-by-side. I enter SHIFT-ALT-L in window 2 to move it rightwards and create two columns:
------- -------
| || |
| || |
| 1 || 2 |
| || |
| || |
------- -------
Now I could open another console and get
------- -------
| || 2 |
| || |
| | -------
| 1 | -------
| || 3 |
| || |
------- -------
Then I want to temporarily view console 3 full-height, so I hit ALT-s in it and get:
------- -------
| | -------
| || |
| 1 || 3 |
| || |
| || |
------- -------
Consoles 2 and 3 are stacked up now.
I could also give windows tags. For example, in console 2 I could type ALT-SHIFT-t www+dev and that console would be visible in the 'www' and 'dev' virtual desktops. (The desktops are created if they don't already exist.) Even better, the console can be in a different visual configuration (e.g., stacked and full-screen) on each of those desktops.
Anyway, I can't do tabbed window managers justice here. I don't know if it's relevant to your environment but if you get the chance to try this way of working you probably won't look back.
I've added a shortcut to my Control-Shift-C key combination to bring up my Visual Studio 2008 console. This alone has saved me countless seconds when I need to register a DLL or run any other command. I imagine if you leverage this with another command tool, you may see massive productivity increases.
Are you kidding?
In my Linux environment, the console is my lifeblood. I'm proficient in bash scripting, so to me a console is very much like sitting in a REPL for Python or Lisp. You can quite literally do anything.
I actually write tools used by my team in bash, and the console is the perfect place to do that development. I really only need an editor as a backing store for things as I figure them out.