tfjs-node on an old CPU (without AVX) - tensorflow

I want to take my first steps with tfjs using Node.js. At the moment, for tests, I can only use a computer with the following configuration:
Windows 7 SP1
8 GB RAM
E7500 (no AVX)
GeForce 750Ti
node v12.19.0
When using tfjs-node, I get the error:
return process.dlopen (module, path.toNamespacedPath (filename));
As far as I understand, this happens because the processor is very old and lacks AVX.
Can I somehow rebuild tfjs-node to work on my processor? Ideally I would build tfjs-node-gpu. If that is possible, what do I need to do?
I've come across builds from fo40225 (https://github.com/fo40225), but they are for Python.

I solved the problem.
First I tried upgrading from Windows 7 to Windows 10, but that didn't help.
Therefore, I decided to rebuild tensorflow.dll. After many attempts, I came up with this setup:
Bazel 3.1
Python 3.8
NumPy installed globally
VS BuildTools 2019
TensorFlow branch 2.3, compiled with: bazel build -c opt //tensorflow/tools/lib_package:libtensorflow
After that I copied the DLL to the folder node_modules\@tensorflow\tfjs-node\lib\napi-v6
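The final copy step can be sketched in Python as follows. This is only an illustration, not part of the original fix: the function name install_rebuilt_dll, the backup file name tensorflow.dll.bak, and the assumption that the build output is a single file named tensorflow.dll are mine.

```python
import shutil
from pathlib import Path


def install_rebuilt_dll(built_dll, project_dir, napi_dir="napi-v6"):
    """Copy a rebuilt tensorflow.dll over the one shipped with tfjs-node.

    Hypothetical helper: the existing DLL, if any, is kept next to the
    new one as tensorflow.dll.bak so the change can be undone.
    """
    target_dir = (Path(project_dir) / "node_modules" / "@tensorflow"
                  / "tfjs-node" / "lib" / napi_dir)
    if not target_dir.is_dir():
        raise FileNotFoundError(f"tfjs-node binding folder not found: {target_dir}")

    target = target_dir / "tensorflow.dll"
    if target.exists():
        # Back up the prebuilt (AVX) DLL before overwriting it.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(Path(built_dll), target)
    return target
```

Calling install_rebuilt_dll with the path to the rebuilt DLL and the project folder performs the copy. The napi_dir argument is there because the folder name (napi-v6 in my case) may differ for other Node versions.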

Related

How can I run Mozilla TTS/Coqui TTS training with CUDA on a Windows system?

I have a machine with a Quadro P5000 graphics card, running Windows 10. I'd like to train a TTS voice on this system. What do I need to install to make this work?
Here's what to install/do:
Download and install Python 3.8 (not 3.9+) for Windows. During the installation, ensure that you:
Opt to install it for all users.
Opt to add Python to the PATH.
Download and install CUDA Toolkit 10.1 (not 11.0+).
Download "cuDNN v7.6.5 (November 5th, 2019), for CUDA 10.1" (not cuDNN v8+), extract it, and then copy what's inside the cuda folder into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1.
Download the latest 64-bit version of eSpeak NG (no version constraints :-) ).
Download the latest 64-bit version of Git for Windows (no version constraints :-) ).
Open a PowerShell prompt to a folder where you'd like to install Coqui TTS.
Run git clone https://github.com/coqui-ai/TTS.git.
Run cd TTS.
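Run python -m venv . (the single trailing dot is part of the command and means the current folder).
Run .\Scripts\pip install -e . (again with a single trailing dot).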
Run the following command (this differs from the command you get from the PyTorch website because of a known issue):
.\Scripts\pip install torch==1.8.0+cu101 torchvision==0.9.0+cu101 torchaudio===0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
Put the following into a script called "test_cuda.py" in the TTS folder:
import torch
x = torch.rand(5, 3)
print(x)
print(torch.cuda.is_available())
Run the script via .\Scripts\python ./test_cuda.py and confirm the output looks like this (the first part should have just random numbers, but the last line must read True; if it does not, CUDA is not installed properly):
tensor([[0.2141, 0.7808, 0.9298],
[0.3107, 0.8569, 0.9562],
[0.2878, 0.7515, 0.5547],
[0.5007, 0.6904, 0.4136],
[0.2443, 0.4158, 0.4245]])
True
Put the following into a script called "train.bat" in the TTS folder, and then customize it for your configuration file:
set PYTHONIOENCODING=UTF-8
set PYTHONLEGACYWINDOWSSTDIO=UTF-8
set PHONEMIZER_ESPEAK_PATH=C:/Program Files/eSpeak NG/espeak-ng.exe
.\Scripts\python.exe ./TTS/bin/train_tacotron.py --config_path "C:/path/to/your/config.json"
Run the script via .\train.bat.
If you are using a different model than Tacotron or need to pass other parameters into the training script, feel free to further customize train.bat.
If you are just getting started with TTS training in general, take a peek at "How do I get started training a custom voice model with Mozilla TTS on Ubuntu 20.04?".
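If torch.cuda.is_available() prints False, one thing worth ruling out is an incomplete cuDNN copy. The sketch below checks for the files I would expect the cuDNN v7.x Windows archive to place under the CUDA folder. The file list and the function name missing_cudnn_files are assumptions on my part, so compare them against what is actually inside your extracted cuda folder.

```python
from pathlib import Path

# Assumption: the cuDNN 7.x Windows archive ships these three files
# inside its "cuda" folder. Adjust the list if your archive differs.
EXPECTED_CUDNN_FILES = (
    Path("bin") / "cudnn64_7.dll",
    Path("include") / "cudnn.h",
    Path("lib") / "x64" / "cudnn.lib",
)


def missing_cudnn_files(cuda_root):
    """Return the expected cuDNN files that are absent under cuda_root."""
    root = Path(cuda_root)
    return [rel for rel in EXPECTED_CUDNN_FILES if not (root / rel).is_file()]
```

For the setup described above, cuda_root would be C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1. An empty list means all three files were found.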

nv-nsight-cu-cli caused Tensorflow to fail

I've downloaded the newest Nsight Compute profiling tool and I want to use it to benchmark Tensorflow applications. The code I'm using is here. It runs perfectly fine when I execute it directly, and benchmarking it with nvprof ./mnist.py caused no problems at all. However, when I try to run it with the command sudo ./nv-nsight-cu-cli [path to the file], I get the following error:
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
I suspect that nv-nsight-cu-cli somehow didn't recognize the environment variable at all. Is there a fix or workaround?
You need to search for differences between the two environments:
env variables
  LD_LIBRARY_PATH
/etc/ld.so.conf
/etc/ld.so.conf.d/*
cuBLAS
  Is installation complete/not broken?
  Is it installed at the same location on both machines?
  Versions
...
You can start with locate libcublas.so on both machines to see if there's a difference. Alternatively, you can run the program under strace -f -e open to check where it tries to load libcublas.so from.
Your error has (for now) nothing to do with GPUs: libcublas.so.9.0 simply cannot be found. Find it, find out why Tensorflow cannot find it, and your problem will be solved.
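To compare the two environments, a small helper like the one below can list which directories of a given search path actually contain the library. It is a rough sketch: find_shared_library is a hypothetical name, and it only inspects the directories you pass in, not the ld.so cache or the ld.so.conf files. Since sudo commonly resets variables such as LD_LIBRARY_PATH, it may be worth running it both with and without sudo.

```python
import os
from pathlib import Path


def find_shared_library(name, search_path):
    """Return files whose names start with `name` in each directory of search_path.

    search_path uses the LD_LIBRARY_PATH format: directories joined by
    os.pathsep (":" on Linux). Empty or missing entries are skipped.
    """
    hits = []
    for entry in search_path.split(os.pathsep):
        if not entry:
            continue
        directory = Path(entry)
        if directory.is_dir():
            # Matches e.g. libcublas.so, libcublas.so.9.0, libcublas.so.9.0.176
            hits.extend(sorted(directory.glob(name + "*")))
    return hits
```

A typical call would be find_shared_library("libcublas.so", os.environ.get("LD_LIBRARY_PATH", "")). An empty result under sudo but not in your normal shell would point at the environment rather than at the installation.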
It appears that GP100 is not supported by the tool at this moment.
The answer is found here:
Nsight Compute only supports Pascal (other than GP100) and later GPUs.

tensorflow inception for retraining / fine tuning with a pretrained model: inception_train

I tried to retrain (new images, new classes) on top of the pretrained inception model. I therefore followed the instructions in the inception readme:
https://github.com/tensorflow/models/tree/master/inception#how-to-construct-a-new-dataset-for-retraining
I successfully built and ran build_image_data using bazel, as described in the tutorial. Afterwards I successfully built inception_train using bazel:
~/tensorflowmodels/models/inception# bazel build inception/inception_train
INFO: Found 1 target...
Target //inception:inception_train up-to-date (nothing to build)
INFO: Elapsed time: 0.073s, Critical Path: 0.00s
However, running bazel-bin/inception/inception_train I always get the following:
~/tensorflowmodels/models/inception# bazel-bin/inception/inception_train --train_dir="/" --validation_dir="/" --data_dir="/images_jpg/" --pretrained_model_checkpoint_path="/tensorflowmodels/models/inception/inception-v3/" --fine_tune=True --initial_learning_rate=0.001 --input_queue_memory_factor=1 --num_gpus=1
-bash: bazel-bin/inception/inception_train: No such file or directory
Naturally, I would say there is a 99.9999% chance it's a typo. I then tried to run inception_train.py directly with Python. I had to change some import locations, and it finally ran with the parameters. However, the script stops without any error message after the CUDA drivers are initialized.
Any help on how to solve this (or perform fine tuning / retraining with inception) would be very much appreciated.
tensorflow version: 0.9rc0
CPU: Xeon 5, 24 cores
GPU: Grid K2 8 GB
OS: Ubuntu 14.04
BTW, I already posted this as a GitHub issue (which was closed, since it was considered more of a case for Stack Overflow).

Systemtap libdwfl error on Linux

I am trying to set up the SystemTap tool for profiling OS processes on a virtual Linux machine. I am using VirtualBox to run the image. Via
rpm -q kernel
and
cat /proc/version
The version obtained is:
Linux version 2.6.32-5-686 (Debian 2.6.32-48squeeze4)
I have correctly downloaded and installed the tool and written a simple program (.stp). However, I keep getting the same error, which I have searched for in many places without success:
After executing:
sudo stap my_profiler.stp
I get:
semantic error: libdwfl failure (all kernel modules found): no error
Pass 3: translation failed. Try again with another '--vp 001' option.
According to https://sourceware.org/systemtap/SystemTap_Beginners_Guide/errors.html:
semantic error: libdwfl failure
There was a problem processing the debugging information. In most cases, this error results from the installation of a kernel-debuginfo package whose version does not match the probed kernel exactly. The installed kernel-debuginfo package itself may have some consistency or correctness problems.
I have found no relevant information on the "kernel-debuginfo" package. I have also tried the verbose option, which did not help. I even tried with an old snapshot of the VM. Any ideas?
The code of the .stp program I ran:
probe timer.profile {
    printf("Process: %s\n", execname())
    printf("Process ID: %d\n", pid())
}
Found the problem! It seems I was using the wrong version of the Linux kernel. I was using the default kernel mentioned in the question (2.6.32-5-686), which appears to have problems with its debug info. All I did was try the same thing with another kernel (Linux version 3.9.6, built with gcc version 4.7.2, Debian 4.7.2-5), and it worked without trouble :)
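Since the guide quoted above says the debug info must match the probed kernel exactly, it may help to derive the exact release string first. Here is a minimal sketch in Python. Both function names are hypothetical, and the linux-image-<release>-dbg naming is my assumption about how Debian names its kernel debug-symbol packages, so verify it against your package index.

```python
import re


def kernel_release(proc_version):
    """Extract the kernel release from the text of /proc/version."""
    match = re.match(r"Linux version (\S+)", proc_version)
    if match is None:
        raise ValueError("unrecognized /proc/version text")
    return match.group(1)


def debian_debug_package(release):
    # Assumption: Debian ships kernel debug symbols in a package named
    # linux-image-<release>-dbg; other distributions use other names
    # (the guide's "kernel-debuginfo" is the Red Hat style name).
    return f"linux-image-{release}-dbg"
```

With the version string from the question, kernel_release returns 2.6.32-5-686, and that is the release a debug-symbol package would have to match.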

poclbm not reporting hashes to deepbit or slush

I run poclbm on my system, but for some reason both deepbit and slush don't "see" the work being performed. My system reports about 200 megahashes per second. I tried mining with my CPU using the same settings, and then both deepbit and slush recognized that work was being performed.
These are the errors I am getting from the respective mining programs (every minute or so):
poclbm error: pit.deepbit.net:8332 22/02/2013 21:50:59, Verification failed, check hardware! (0:0:Cypress, d47b7ba0)
cgminer error: [2013-02-22 22:18:51] GPU0: invalid nonce - HW error
I am using Ubuntu 12.10 (Quantal Quetzal) with the 12.10 version of poclbm and an ATI 5800 series video card. The video drivers are installed and work as far as I can tell. When I run "aticonfig --odgc --adapter=all", the GPU does seem to be utilized by poclbm (around 70% utilization or so).
I found the solution through an IRC channel (#cgminer on Freenode). Basically, at least on the version of Ubuntu that I have (12.10), version 2.8 of the SDK does NOT work properly with cgminer or poclbm. I was instructed to download version 2.4 of the SDK here:
http://developer.amd.com/Downloads/AMD-APP-SDK-v2.4-lnx32.tgz
http://developer.amd.com/Downloads/AMD-APP-SDK-v2.4-lnx64.tgz
Some distributions require the "2.7" version so I'll put the links here:
http://developer.amd.com/Downloads/AMD-APP-SDK-v2.7-lnx32.tgz
http://developer.amd.com/Downloads/AMD-APP-SDK-v2.7-lnx64.tgz
I compiled it. Apparently there is no "make install" target in this Makefile, so you have to copy the files to your lib directory manually:
for 32 bit: $ cp -pv lib/x86/* /usr/lib/
for 64 bit: $ cp -pv lib/x86_64/* /usr/lib/
Also copy the include files: $ rsync -avl include/CL/ /usr/include/CL/
With the libraries installed in the appropriate directories, I recompiled cgminer and then it worked. I also tried it with poclbm and it worked with that too.
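For what it's worth, the manual library copy can also be scripted. The following Python sketch mirrors the cp commands above under my own assumptions: the function names are hypothetical, any machine string other than x86_64/amd64 is treated as 32-bit, and only regular files are copied (as with cp without -r).

```python
import shutil
from pathlib import Path


def sdk_lib_subdir(machine):
    """Pick the SDK lib subfolder for a platform.machine() style string."""
    return "x86_64" if machine.lower() in ("x86_64", "amd64") else "x86"


def copy_sdk_libs(sdk_root, dest_dir, machine):
    """Copy the files in lib/<arch> of the SDK into dest_dir, keeping metadata."""
    src = Path(sdk_root) / "lib" / sdk_lib_subdir(machine)
    dest = Path(dest_dir)
    copied = []
    for item in sorted(src.iterdir()):
        if item.is_file():  # skip subdirectories, like cp without -r
            shutil.copy2(item, dest / item.name)
            copied.append(item.name)
    return copied
```

Run from the extracted SDK folder, a call such as copy_sdk_libs(".", "/usr/lib", platform.machine()) would correspond to the cp step and needs root privileges. The include files still have to be copied separately.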
Hm, I experienced the same error with poclbm and cgminer. Then I found https://bitcointalk.org/index.php?topic=139406.msg1502120#msg1502120 and tried phoenix, and everything is OK now. Hope it helps. Sorry for my bad English.