Thinking Sphinx partially rebuilding index?

After running a script to populate my database, I ran rake ts:rebuild, but Sphinx only partially rebuilds the indexes:
Stopped searchd daemon (pid: 23309).
Generating configuration to /home/guest_dp/config/development.sphinx.conf
Sphinx 2.1.4-id64-release (rel21-r4421)
Copyright (c) 2001-2013, Andrew Aksyonoff
Copyright (c) 2008-2013, Sphinx Technologies Inc (http://sphinxsearch.com)
using config file '/home/guest_dp/config/development.sphinx.conf'...
indexing index 'episode_core'...
collected 4469 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4469 docs, 8938 bytes
total 0.071 sec, 124488 bytes/sec, 62244.07 docs/sec
indexing index 'episode_delta'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.013 sec, 0 bytes/sec, 0.00 docs/sec
indexing index 'organization_core'...
.
.
.
skipping non-plain index 'episode'...
skipping non-plain index 'organization'...
skipping non-plain index 'person'...
skipping non-plain index 'position'...
skipping non-plain index 'profession'...
skipping non-plain index 'segment'...
skipping non-plain index 'tv_show'...
total 12816 reads, 0.005 sec, 0.2 kb/call avg, 0.0 msec/call avg
total 116 writes, 0.020 sec, 52.3 kb/call avg, 0.1 msec/call avg
Started searchd successfully (pid: 23571).
What does skipping non-plain index mean?

Each of those is a distributed index, which contains the _core and _delta plain indices (e.g. episode contains both episode_core and episode_delta). There's nothing to do to index them directly, because distributed indices don't contain data; they just point to other indices.
In other words: what you're seeing is completely normal. All of your indices are being processed appropriately.
Sphinx used to have a slightly different message: Distributed index 'episode' can not be directly indexed; skipping - same deal.
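For illustration, the distributed indices in the generated development.sphinx.conf look roughly like this (a sketch of what Thinking Sphinx emits, using the episode index as an example):
index episode
{
  type  = distributed
  local = episode_core
  local = episode_delta
}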

WSL2 I/O speeds on Linux filesystem are slow

Trying out WSL2 for the first time. Running Ubuntu 18.04 on a Dell Latitude 9510 with an SSD. I noticed that build speeds of a React project were brutally slow. Per all the articles on the web, I'm running the project out of ~ and not the Windows mount. I ran a benchmark with sysbench --test=fileio --file-test-mode=seqwr run (in ~) and got:
File operations:
    reads/s:        0.00
    writes/s:       3009.34
    fsyncs/s:       3841.15
Throughput:
    read, MiB/s:    0.00
    written, MiB/s: 47.02
General statistics:
    total time:              10.0002s
    total number of events:  68520
Latency (ms):
    min:             0.01
    avg:             0.14
    max:            22.55
    95th percentile: 0.31
    sum:          9927.40
Threads fairness:
    events (avg/stddev):         68520.0000/0.00
    execution time (avg/stddev): 9.9274/0.00
If I'm reading this correctly, that wrote 47 MiB/s. I ran the same test on my Mac mini and got 942 MiB/s. Is this normal? It seems like the Linux I/O speeds on WSL2 are unusably slow. Any thoughts on ways to speed this up?
---edit---
Not sure if this is a fair comparison, but here is the output of winsat disk -drive c on the same machine from the Windows side. Smoking fast:
> Dshow Video Encode Time 0.00000 s
> Dshow Video Decode Time 0.00000 s
> Media Foundation Decode Time 0.00000 s
> Disk Random 16.0 Read 719.55 MB/s 8.5
> Disk Sequential 64.0 Read 1940.39 MB/s 9.0
> Disk Sequential 64.0 Write 1239.84 MB/s 8.6
> Average Read Time with Sequential Writes 0.077 ms 8.8
> Latency: 95th Percentile 0.219 ms 8.9
> Latency: Maximum 2.561 ms 8.7
> Average Read Time with Random Writes 0.080 ms 8.9
> Total Run Time 00:00:07.55
---edit 2---
Windows version: Windows 10 Pro, Version 20H2 Build 19042
Late answer, but I had the same issue and wanted to post my solution for anyone who has the problem:
Windows Defender seems to destroy the read speeds in WSL. I added the entire rootfs folder as an exclusion. If you're comfortable turning off Windows Defender, I recommend that as well. Any antivirus probably has similar issues, so adding the WSL directories as exclusions is probably your best bet.
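If you want a quick second data point after adding the exclusions (besides re-running sysbench), a rough sequential-write check can be done with a few lines of Python; the file path and sizes below are only placeholders:
import os
import time

path = os.path.expanduser("~/io_test.bin")  # placeholder: a file inside the Linux filesystem
block = b"\0" * (1 << 20)                   # 1 MiB per write
blocks = 256                                # 256 MiB total

t0 = time.time()
with open(path, "wb") as f:
    for _ in range(blocks):
        f.write(block)
    f.flush()
    os.fsync(f.fileno())                    # make sure the data actually reaches the disk
elapsed = time.time() - t0

print("sequential write: %.1f MiB/s" % (blocks / elapsed))
os.remove(path)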

Should I trim outliers from input features

Almost half of my input feature columns have extreme outliers; for example, where the mean is 19.6, the max is 2908.0. Is that OK, or should I trim them to mean + std?
        msg_cnt_in_x  msg_cnt_in_other  msg_cnt_in_y  \
count       330096.0          330096.0      330096.0
mean            19.6               2.6          38.3
std             41.1               8.2          70.7
min              0.0               0.0           0.0
25%              0.0               0.0           0.0
50%              3.0               1.0           8.0
75%             21.0               2.0          48.0
max           2908.0            1296.0        4271.0
There is no general answer to that. It depends very much on your problem and data set.
You should look into your data set and check whether these outlier data points are actually valid and important. If they are caused by some errors during data collection you should delete them. If they are valid, then you can expect similar values in your test data and thus the data points should stay in the data set.
If you are unsure, just test both and pick the one that works better.
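If you do decide to cap them, clipping (winsorizing) each column at something like mean + 3*std keeps the rows while bounding the extremes. A minimal pandas sketch, using placeholder data in place of your real feature table:
import numpy as np
import pandas as pd

# Placeholder stand-in for the real feature table (heavy-tailed, like the describe() output above).
df = pd.DataFrame({
    "msg_cnt_in_x": np.random.exponential(scale=20, size=100000),
    "msg_cnt_in_other": np.random.exponential(scale=3, size=100000),
    "msg_cnt_in_y": np.random.exponential(scale=40, size=100000),
})

# Cap each column at mean + 3*std instead of dropping rows; the threshold is a choice, not a rule.
clipped = df.copy()
for col in df.columns:
    cap = df[col].mean() + 3 * df[col].std()
    clipped[col] = df[col].clip(upper=cap)

print(df.describe().loc[["mean", "std", "max"]])
print(clipped.describe().loc[["mean", "std", "max"]])
If you go this route, compute the caps on the training split only and reuse them on the test split, otherwise you leak information.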

tf.subtract takes too long for a large array

TensorFlow's tf.subtract takes too long for a large array.
My workstation configuration:
CPU: Xeon E5 2699 v3
Mem: 384 GB
GPU: NVIDIA K80
CUDA: 8.5
CUDNN: 5.1
Tensorflow: 1.1.0, GPU version
The following is the test code and result.
import tensorflow as tf
import numpy as np
import time

W = 3000
H = 4000
in_a = tf.placeholder(tf.float32, (W, H))
in_b = tf.placeholder(tf.float32, (W, H))

def test_sub(number):
    sess = tf.Session()
    out = tf.subtract(in_a, in_b)
    for i in range(number):
        a = np.random.rand(W, H)
        b = np.random.rand(W, H)
        feed_dict = {in_a: a,
                     in_b: b}
        t0 = time.time()
        out_ = sess.run(out, feed_dict=feed_dict)
        t_ = (time.time() - t0) * 1000
        print "index:", str(i), " total time:", str(t_), " ms"

test_sub(20)
Results:
index: 0 total time: 338.145017624 ms
index: 1 total time: 137.024879456 ms
index: 2 total time: 132.538080215 ms
index: 3 total time: 133.152961731 ms
index: 4 total time: 132.885932922 ms
index: 5 total time: 135.06102562 ms
index: 6 total time: 136.723041534 ms
index: 7 total time: 137.926101685 ms
index: 8 total time: 133.605003357 ms
index: 9 total time: 133.143901825 ms
index: 10 total time: 136.317968369 ms
index: 11 total time: 137.830018997 ms
index: 12 total time: 135.458946228 ms
index: 13 total time: 132.793903351 ms
index: 14 total time: 144.603967667 ms
index: 15 total time: 134.593963623 ms
index: 16 total time: 135.535001755 ms
index: 17 total time: 133.697032928 ms
index: 18 total time: 136.134147644 ms
index: 19 total time: 133.810043335 ms
The test result shows that tf.subtract takes more than 130 ms to perform a 3000x4000 subtraction, which is obviously too long, especially on an NVIDIA K80 GPU.
Can anyone provide some methods to optimize the tf.subtract?
Thanks in advance.
You're measuring not only the execution time of tf.subtract but also the time required to transfer the input data from CPU memory to GPU memory: this is your bottleneck.
To avoid it, don't feed the data through placeholders. Generate it with TensorFlow if you only need random data, or, if you have to read real data, use the TensorFlow input pipeline, which creates threads that read the input for you ahead of time and then feed the graph without ever leaving it.
It's important to do as many operations as possible within the TensorFlow graph in order to remove the data-transfer bottleneck.
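For example, a rough sketch (assuming TF 1.x and that the inputs really can be generated randomly, as in the benchmark above) that keeps the input generation inside the graph:
import tensorflow as tf
import time

W = 3000
H = 4000

# The random inputs are produced by the graph itself, so there is no
# per-step host-to-device copy of the 3000x4000 arrays.
a = tf.random_uniform((W, H), dtype=tf.float32)
b = tf.random_uniform((W, H), dtype=tf.float32)
out = tf.subtract(a, b)

with tf.Session() as sess:
    for i in range(20):
        t0 = time.time()
        sess.run(out)
        print("index: %d total time: %.3f ms" % (i, (time.time() - t0) * 1000))
Fetching out still copies the result back to the host; if you only need a reduced value (a mean, say), fetch that instead so even less data crosses the bus.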
It sounds reasonable that the time I measured includes the data transfer time from CPU memory to GPU memory.
Since I have to read the input data (the input data are images generated by a mobile phone and sent to TensorFlow one by one), does that mean placeholders must be used?
In that situation, if two images are not generated at the same time (i.e., the second image arrives long after the first), how can the input pipeline threads read the input data ahead of time, given that the second image does not yet exist while TensorFlow is processing the first? Could you give me a simple example of the TensorFlow input pipeline?

G1 garbage collection having one slow worker

I'm having an issue with a GC pause (~400 ms) which I'm trying to reduce. I noticed that I always have one worker that is a lot slower than the others:
2013-06-03T17:24:51.606+0200: 605364.503: [GC pause (mixed)
Desired survivor size 109051904 bytes, new threshold 1 (max 1)
- age 1: 47105856 bytes, 47105856 total
, 0.47251300 secs]
[Parallel Time: 458.8 ms]
[GC Worker Start (ms): 605364503.9 605364503.9 605364503.9 605364503.9 605364503.9 605364504.0
Avg: 605364503.9, Min: 605364503.9, Max: 605364504.0, Diff: 0.1]
--> [**Ext Root Scanning (ms)**: **356.4** 3.1 3.7 3.6 3.2 3.0
Avg: 62.2, **Min: 3.0, Max: 356.4, Diff: 353.4**] <---
[Update RS (ms): 0.0 22.4 33.6 21.8 22.3 22.3
Avg: 20.4, Min: 0.0,
As you can see, one worker took 356 ms while the others took only 3 ms!
If someone has an idea, or thinks it's normal, please let me know.
[I'd rather post this as a comment, but I still lack the necessary points to do so]
No idea as to whether it is normal, but I've come across the same problem:
2014-01-16T13:52:56.433+0100: 59577.871: [GC pause (young), 2.55099911 secs]
[Parallel Time: 2486.5 ms]
[GC Worker Start (ms): 59577871.3 59577871.4 59577871.4 59577871.4 59577871.4 59577871.5 59577871.5 59577871.5
Avg: 59577871.4, Min: 59577871.3, Max: 59577871.5, Diff: 0.2]
[Ext Root Scanning (ms): 152.0 164.5 159.0 183.7 1807.0 117.4 113.8 138.2
Avg: 354.5, Min: 113.8, Max: 1807.0, Diff: 1693.2]
I've been unable to find much about the subject, but I did find this: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2013-February/001484.html
Basically, as you surmise, one of the GC worker threads is being held up when processing a single root. I've seen a similar issue that's caused by filling up the code cache (where JIT-compiled methods are held). The code cache is treated as a single root and so is claimed in its entirety by a single GC worker thread. As the code cache fills up, the thread that claims the code cache to scan starts getting held up.
A full GC clears the issue because that's where G1 currently does class unloading: the full GC unloads a whole bunch of classes, allowing the compiled code of any of the unloaded classes' methods to be freed by the nmethod sweeper. So after a full GC the number of compiled methods in the code cache is lower.
It could also be just the sheer number of loaded classes, as the system dictionary is also treated as a single claimable root.
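For reference, the HotSpot options involved are along these lines (the cache size is just an example value, not a recommendation):
-XX:+UseCodeCacheFlushing -XX:ReservedCodeCacheSize=256m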
I think I'll try enabling code cache flushing and will let you know. If you manage to solve this problem, please let me know; I'm striving to get it sorted out as well.
Kind regards

ThinkingSphinx returns no results when in test mode in Cucumber

I'm on a project upgrading from Rails 2 -> 3. We are removing Ultrasphinx (which is not supported in Rails 3) and replacing it with ThinkingSphinx. One problem - the Cucumber tests for searching, which used to work, are failing as ThinkingSphinx is not indexing the files in test mode.
This is the relevant part of env.rb:
require 'cucumber/thinking_sphinx/external_world'
Cucumber::ThinkingSphinx::ExternalWorld.new
Cucumber::Rails::World.use_transactional_fixtures = false
And here is the step (declared in my common_steps.rb file) that indexes my objects:
Given /^ThinkingSphinx is indexed$/ do
  puts "Indexing the new database objects"
  # Update all indexes
  ThinkingSphinx::Test.index
  sleep(0.25) # Wait for Sphinx to catch up
end
And this is what I have in my .feature file (after the model objects are created)
And ThinkingSphinx is indexed
This is the output of ThinkingSphinx when it's run in test mode (this is WRONG: it should be finding documents, but it is not):
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff
using config file 'C:/Users/PaulG/Programming/Projects/TechTV/config/test.sphinx.conf'...
indexing index 'collection_core'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.027 sec, 0 bytes/sec, 0.00 docs/sec
distributed index 'collection' can not be directly indexed; skipping.
indexing index 'video_core'...
collected 0 docs, 0.0 MB
total 0 docs, 0 bytes
total 0.018 sec, 0 bytes/sec, 0.00 docs/sec
distributed index 'video' can not be directly indexed; skipping.
total 0 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 8 writes, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg
rotating indices: succesfully sent SIGHUP to searchd (pid=4332).
In comparison, this is the output I get when I run
rake ts:index
to index the development environment:
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff
using config file 'C:/Users/PaulG/Programming/Projects/TechTV/config/development.sphinx.conf'...
indexing index 'collection_core'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 39 bytes
total 0.031 sec, 1238 bytes/sec, 127.04 docs/sec
distributed index 'collection' can not be directly indexed; skipping.
indexing index 'video_core'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 62 bytes
total 0.023 sec, 2614 bytes/sec, 168.66 docs/sec
distributed index 'video' can not be directly indexed; skipping.
total 10 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 20 writes, 0.001 sec, 0.1 kb/call avg, 0.0 msec/call avg
rotating indices: succesfully sent SIGHUP to searchd (pid=5476).
Notice how it's actually finding documents in my development database, but not my test database. The indexer is working in dev, but not in test. I've spent two days on this and am no closer to a solution. Any help would be overwhelmingly appreciated.
I figured it out this morning; hopefully I can save someone else the trouble I experienced. It turns out it wasn't a fault of Cucumber, but of DatabaseCleaner: with the transaction strategy, the test data is never actually committed, so the Sphinx indexer (which reads the test database from a separate process) never sees any records.
I fixed this issue by changing this line in env.rb:
DatabaseCleaner.strategy = :transaction
to
DatabaseCleaner.strategy = :truncation