WSL2 I/O speeds on the Linux filesystem are slow - windows-subsystem-for-linux

Trying out WSL2 for the first time. Running Ubuntu 18.04 on a Dell Latitude 9510 with an SSD. I noticed build speeds of a React project were brutally slow. Per all the articles on the web, I'm running the project out of ~ and not the Windows mount. I ran a benchmark in ~ with sysbench --test=fileio --file-test-mode=seqwr run and got:
File operations:
reads/s: 0.00
writes/s: 3009.34
fsyncs/s: 3841.15
Throughput:
read, MiB/s: 0.00
written, MiB/s: 47.02
General statistics:
total time: 10.0002s
total number of events: 68520
Latency (ms):
min: 0.01
avg: 0.14
max: 22.55
95th percentile: 0.31
sum: 9927.40
Threads fairness:
events (avg/stddev): 68520.0000/0.00
execution time (avg/stddev): 9.9274/0.00
If I'm reading this correctly, that wrote 47 MiB/s. I ran the same test on my Mac mini and got 942 MiB/s. Is this normal? It seems like Linux I/O speeds on WSL2 are unusably slow. Any thoughts on ways to speed this up?
---edit---
Not sure if this is a fair comparison, but here's the output of winsat disk -drive c on the same machine from the Windows side. Smoking fast:
> Dshow Video Encode Time 0.00000 s
> Dshow Video Decode Time 0.00000 s
> Media Foundation Decode Time 0.00000 s
> Disk Random 16.0 Read 719.55 MB/s 8.5
> Disk Sequential 64.0 Read 1940.39 MB/s 9.0
> Disk Sequential 64.0 Write 1239.84 MB/s 8.6
> Average Read Time with Sequential Writes 0.077 ms 8.8
> Latency: 95th Percentile 0.219 ms 8.9
> Latency: Maximum 2.561 ms 8.7
> Average Read Time with Random Writes 0.080 ms 8.9
> Total Run Time 00:00:07.55
---edit 2---
Windows version: Windows 10 Pro, Version 20H2 Build 19042

Late answer, but I had the same issue and wanted to post my solution for anyone who runs into this problem:
Windows Defender seems to destroy read speeds in WSL. I added the entire rootfs folder as an exclusion. If you're comfortable turning off Windows Defender entirely, I recommend that as well. Any antivirus probably has similar issues, so adding the WSL directories as exclusions is probably your best bet.
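For example, from an elevated PowerShell prompt (a sketch: the package folder name below is just an example and varies by distro and version, so adjust the path to match your installation):
Add-MpPreference -ExclusionPath "$env:LOCALAPPDATA\Packages\CanonicalGroupLimited.Ubuntu18.04onWindows_79rhkp1fndgsc"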

Related

Tensorflow strange CPU usage

My problem is with TensorFlow and CPU usage.
My system:
CPU => AMD FX-8320 (8 cores @ 3.5 GHz, 8 threads)
Graphics => GTX 970
RAM => 16 GB, I believe DDR3-2600
I want to run an A3C algorithm for StarCraft II (pysc2) on my PC, which works fine, but the CPU usage is somewhat strange.
If I start the algorithm with 4 workers, I get about 150k steps in 1 hour, and all CPUs are at about 25-30%.
If I start the same algorithm with 8 workers, I get about 120k steps in 1 hour, and all CPUs are still at about 25-30%.
If I instead start the algorithm with 4 workers twice, each gets about 150k steps in 1 hour, and CPU usage is 60-70%.
Why can't I start the algorithm with 8 workers, get double the number of steps in 1 hour, and have the CPU used at 70%?

Can't configure GPU bios properly

I have 6 RX 470 GPUs. They should each be mining at an average of 25-27 MH/s, but they're only doing 20 MH/s. Overall that's 120 instead of 150-170. I think the problem is the GPU BIOS configuration, but I can't figure out anything else. Any suggestions?
25 MH/s is what you would expect from a stock RX 480. To get the same hashrate from an RX 470, you'd be looking at overclocking the memory speed (+600). As for how to overclock, it depends on whether you're running Linux or Windows.
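On Linux with the in-kernel amdgpu driver, a hedged sketch (assumptions: OverDrive must be unlocked with the boot parameter amdgpu.ppfeaturemask=0xffffffff, card0 is the GPU in question, and the clock/voltage numbers are placeholders -- read the file first to see your card's states and allowed ranges):
cat /sys/class/drm/card0/device/pp_od_clk_voltage             # show current states and limits
echo "m 1 2100 1000" | sudo tee /sys/class/drm/card0/device/pp_od_clk_voltage  # memory state 1: clock MHz, voltage mV
echo "c" | sudo tee /sys/class/drm/card0/device/pp_od_clk_voltage               # commit the change
On Windows, tools such as MSI Afterburner expose the same memory-clock offset through a GUI.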

What is the performance overhead of XADisk for read and write operations?

What does XADisk do in addition to reading/writing from the underlying file? How does that translate into a percentage of the read/write throughput (approximately)?
It depends on the file size; if the files are large, set the "heavyWrite" flag to true when opening the XAFileOutputStream.
A test with 500 files of 1 MB each; below is the time taken, averaged over 10 executions:
Java IO - 37.5 seconds
Java NIO - 24.8 seconds
XADisk - 30.3 seconds
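For illustration, a minimal sketch of opening a heavyWrite stream through the standalone XADisk 1.x API (method names are from memory and the paths are placeholders, so check them against your XADisk version):
import java.io.File;
import org.xadisk.bridge.proxies.interfaces.Session;
import org.xadisk.bridge.proxies.interfaces.XAFileOutputStream;
import org.xadisk.bridge.proxies.interfaces.XAFileSystem;
import org.xadisk.bridge.proxies.interfaces.XAFileSystemProxy;
import org.xadisk.filesystem.standalone.StandaloneFileSystemConfiguration;

public class HeavyWriteSketch {
    public static void main(String[] args) throws Exception {
        // Boot a local XADisk instance; directory and instance id are example values
        XAFileSystem xafs = XAFileSystemProxy.bootNativeXAFileSystem(
                new StandaloneFileSystemConfiguration("/tmp/xadisk-system", "instance-1"));
        xafs.waitForBootup(-1); // block until the instance is ready

        Session session = xafs.createSessionForLocalTransaction();
        File target = new File("/tmp/large-output.dat");
        session.createFile(target, false); // false = regular file, not a directory

        // heavyWrite = true hints that this stream will write a large amount of data
        XAFileOutputStream out = session.createXAFileOutputStream(target, true);
        out.write(new byte[1024 * 1024]); // 1 MB sample payload
        out.close();

        session.commit();
        xafs.shutdown();
    }
}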

G1 garbage collection having one slow worker

I'm having an issue with a GC pause (~400 ms) which I'm trying to reduce. I noticed that I always have one worker a lot slower than the others:
2013-06-03T17:24:51.606+0200: 605364.503: [GC pause (mixed)
Desired survivor size 109051904 bytes, new threshold 1 (max 1)
- age 1: 47105856 bytes, 47105856 total
, 0.47251300 secs]
[Parallel Time: 458.8 ms]
[GC Worker Start (ms): 605364503.9 605364503.9 605364503.9 605364503.9 605364503.9 605364504.0
Avg: 605364503.9, Min: 605364503.9, Max: 605364504.0, Diff: 0.1]
--> [**Ext Root Scanning (ms)**: **356.4** 3.1 3.7 3.6 3.2 3.0
Avg: 62.2, **Min: 3.0, Max: 356.4, Diff: 353.4**] <---
[Update RS (ms): 0.0 22.4 33.6 21.8 22.3 22.3
Avg: 20.4, Min: 0.0,
As you can see, one worker took 356 ms when the others took only 3 ms!
If someone has an idea, or thinks this is normal, I'd like to hear it.
[I'd rather post this as a comment, but I still lack the necessary points to do so]
No idea as to whether it is normal, but I've come across the same problem:
2014-01-16T13:52:56.433+0100: 59577.871: [GC pause (young), 2.55099911 secs]
[Parallel Time: 2486.5 ms]
[GC Worker Start (ms): 59577871.3 59577871.4 59577871.4 59577871.4 59577871.4 59577871.5 59577871.5 59577871.5
Avg: 59577871.4, Min: 59577871.3, Max: 59577871.5, Diff: 0.2]
[Ext Root Scanning (ms): 152.0 164.5 159.0 183.7 1807.0 117.4 113.8 138.2
Avg: 354.5, Min: 113.8, Max: 1807.0, Diff: 1693.2]
I've been unable to find much on the subject, but there is this thread: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2013-February/001484.html which says:
Basically, as you surmise, one of the GC worker threads is being held up
when processing a single root. I've seen a similar issue that's caused
by filling up the code cache (where JIT-compiled methods are held).
The code cache is treated as a single root and so is claimed in its
entirety by a single GC worker thread. As the code cache fills up,
the thread that claims the code cache to scan starts getting held up.
A full GC clears the issue because that's where G1 currently does
class unloading: the full GC unloads a whole bunch of classes, allowing
the compiled code of any of the unloaded classes' methods to be
freed by the nmethod sweeper. So after a full GC the number of
compiled methods in the code cache is smaller.
It could also be just the sheer number of loaded classes, as the
system dictionary is also treated as a single claimable root.
I think I'll try enabling code cache flushing and will let you know. If you manage to solve this problem, please let me know; I'm striving to get it resolved as well.
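For reference, this is what I plan to try (a sketch; the code cache size is just an example value, and flag behavior varies between JVM versions):
java -XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=256m -XX:+PrintGCDetails -jar yourapp.jar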
Kind regards

ThinkingSphinx returns no results when in test mode in Cucumber

I'm on a project upgrading from Rails 2 to Rails 3. We are removing Ultrasphinx (which is not supported in Rails 3) and replacing it with ThinkingSphinx. One problem: the Cucumber tests for searching, which used to work, are now failing because ThinkingSphinx is not indexing the records in test mode.
This is the relevant part of env.rb:
require 'cucumber/thinking_sphinx/external_world'
Cucumber::ThinkingSphinx::ExternalWorld.new
Cucumber::Rails::World.use_transactional_fixtures = false
And here is the step (declared in my common_steps.rb file) that indexes my objects:
Given /^ThinkingSphinx is indexed$/ do
puts "Indexing the new database objects"
# Update all indexes
ThinkingSphinx::Test.index
sleep(0.25) # Wait for Sphinx to catch up
end
And this is what I have in my .feature file (after the model objects are created)
And ThinkingSphinx is indexed
This is the output of ThinkingSphinx when it's run in test mode (this is WRONG: it should be finding documents, but it is not):
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff
using config file 'C:/Users/PaulG/Programming/Projects/TechTV/config/test.sphinx.conf'...
indexing index 'collection_core'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.027 sec, 0 bytes/sec, 0.00 docs/sec
distributed index 'collection' can not be directly indexed; skipping.
indexing index 'video_core'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.018 sec, 0 bytes/sec, 0.00 docs/sec
distributed index 'video' can not be directly indexed; skipping.
total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 8 writes, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg
rotating indices: succesfully sent SIGHUP to searchd (pid=4332).
In comparison, this is the output I get when I run
rake ts:index
to index the development environment:
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff
using config file 'C:/Users/PaulG/Programming/Projects/TechTV/config/development.sphinx.conf'...
indexing index 'collection_core'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 39 bytes
total 0.031 sec, 1238 bytes/sec, 127.04 docs/sec
distributed index 'collection' can not be directly indexed; skipping.
indexing index 'video_core'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 62 bytes
total 0.023 sec, 2614 bytes/sec, 168.66 docs/sec
distributed index 'video' can not be directly indexed; skipping.
total 10 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 20 writes, 0.001 sec, 0.1 kb/call avg, 0.0 msec/call avg
rotating indices: succesfully sent SIGHUP to searchd (pid=5476).
Notice how it's actually finding documents in my development database, but not in my test database. The indexer works in dev, but not in test? I've spent 2 days on this and am no closer to a solution. Any help would be overwhelmingly appreciated.
I figured it out this morning; hopefully I can save someone else the trouble I went through. It turns out it wasn't a fault of Cucumber, but of DatabaseCleaner.
I fixed the issue by changing this line in env.rb:
DatabaseCleaner.strategy = :transaction
to
DatabaseCleaner.strategy = :truncation
With the transaction strategy, records created by the tests sit inside an uncommitted transaction, so the Sphinx indexer (which uses its own database connection) never sees them; truncation commits real rows and cleans up by emptying the tables instead.
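For context, the relevant parts of env.rb end up looking like this (a sketch assembled from the lines above; the rest of the Cucumber setup is assumed):
require 'cucumber/thinking_sphinx/external_world'

Cucumber::ThinkingSphinx::ExternalWorld.new
Cucumber::Rails::World.use_transactional_fixtures = false

# Truncation commits real rows, so the Sphinx indexer's separate
# database connection can actually see the test data.
DatabaseCleaner.strategy = :truncation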