How to solve the error "Conv1D: DNN library is not found" in TensorFlow?
And are the compatibilities given by TensorFlow at https://www.tensorflow.org/install/source valid backwards as well?
This question is possibly a duplicate of (and certainly related to) "Colab: (0) UNIMPLEMENTED: DNN library is not found" and "Unimplemented Error Node: 'sequential/conv1d/Conv1D' DNN library is not found running in Jupyter on Windows". However, I could not quite follow their solutions. Here is my problem:
I am training a convolutional neural network (CNN) with Keras/TensorFlow. On my PC it appears to run fine. However, I must get it to work on GPUs, and that is where I am running into all sorts of issues. The latest:
Node: 'XXX/1st_Conv1D/Conv1D'
DNN library is not found.
[[{{node XXX/1st_Conv1D/Conv1D}}]] [Op:__inference_train_function_5577]
XXX is a name chosen by me in the code. The training starts fine and then abruptly aborts after some time with the above message.
I have installed conda (WITHOUT admin rights) on the GPU server (I will not get admin rights; it's a university system).
conda list gives the following:
cudnn Version 8.4.1.50
keras Version 2.10.0
keras-preprocessing Version 1.1.2
tensorflow Version 2.10.0
tensorflow-base Version 2.10.0
tensorflow-estimator Version 2.10.0
cuda-toolkit Version 12.0.0
cuda-tools Version 12.0.0
cuda-cudart Version 12.0.107
cuda-python Version 11.8.1
cudatoolkit Version 11.8
python Version 3.9.15
And various other packages (many cuda-something). (I have not installed bazel.) The listed packages cudatoolkit and cuda-toolkit are not a typo, and neither are python and cuda-python (from the nvidia channel of Anaconda). My understanding from https://www.tensorflow.org/install/source is that this should be fine. Some people mention setting library paths, but I don't get it; for that I need a more idiot-proof explanation :-( (Keep in mind, I have no admin rights.)
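For reference, a quick way to see what the installed TensorFlow was actually built against, and whether it sees a GPU at all, is the following (a minimal sketch, assuming TensorFlow 2.x, where tf.sysconfig.get_build_info() is available):
import tensorflow as tf

# CUDA/cuDNN versions the TensorFlow wheel was compiled against
build = tf.sysconfig.get_build_info()
print(tf.__version__, build.get("cuda_version"), build.get("cudnn_version"))

# GPUs TensorFlow can actually see at runtime
print(tf.config.list_physical_devices("GPU"))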
In my python code I import the following:
from keras import backend as K
import tensorflow as tf
from tensorflow import keras
from keras import Input
from keras.models import Model
from keras.layers import BatchNormalization
from keras.layers import LeakyReLU
from keras.layers import Activation
from keras.layers import Dense
from keras.layers import Reshape
from keras.layers import Conv1D
from keras import initializers
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from keras.callbacks import ReduceLROnPlateau
from keras.models import load_model
from keras.optimizers import Adam
from keras.utils import Sequence
from keras.metrics import MeanSquaredError
from keras.metrics import MeanAbsoluteError
from keras.metrics import RootMeanSquaredError
# for plotting
import matplotlib
# import matplotlib.pyplot as plt
matplotlib.use('Agg')
from matplotlib import pyplot as plt
import matplotlib.colors as mcolors
import matplotlib as mpl
import matplotlib.patches as mpatches
from matplotlib.colors import to_rgb
from matplotlib.collections import PolyCollection
from matplotlib.legend_handler import HandlerTuple
import seaborn as sns # for violin plots
I also import pandas, numpy, wave, time, sys, json, argparse, pathlib, datetime and random. They have not caused any problems so far.
The model is then trained with:
history = model.fit(x=train_data_gen,
                    validation_data=vali_data_gen,
                    batch_size=BATCH_SIZE,
                    epochs=EPOCHS,
                    verbose=VERBOSITY,
                    shuffle=SHUFFLE,
                    max_queue_size=QUEUE_SIZE,
                    use_multiprocessing=MULTI_PROCESSING,
                    workers=WORKERS,
                    callbacks=[early_stopping,
                               model_checking,
                               reduce_lr])
where MULTI_PROCESSING is set to True (see https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly). This works fine on a PC, but is obviously slow; I MUST get it to work on GPUs. QUEUE_SIZE is set to 1000 and WORKERS to 48; I can lower them. Some people suggested memory problems, though I cannot imagine that.
Any help is welcome. I am getting desperate! Thanks in advance.
The CUDA and cuDNN versions installed on your system do not match the tested build configurations for the TensorFlow 2.10 GPU setup (see the tested build configurations table at https://www.tensorflow.org/install/source).
You need to install the specified versions, cuDNN 8.1 and CUDA 11.2, which are compatible with TensorFlow 2.10, to enable GPU support on your system.
Please check that link for the compatible versions and installation instructions. Thank you.
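A common way to get matching CUDA/cuDNN versions into a user-level conda environment (no admin rights required) is sketched below; the exact conda-forge package versions are an assumption based on the tested configuration for TF 2.10, so adjust them if your channel offers slightly different builds:
# create an isolated environment in your home directory (no admin rights needed)
conda create -n tf210 python=3.9
conda activate tf210

# CUDA toolkit 11.2 and cuDNN 8.1 from conda-forge (the versions TF 2.10 was tested against)
conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0

# TensorFlow itself via pip
pip install tensorflow==2.10

# make the environment's libraries visible at runtime (this is the "library paths" step)
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/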
Related
I am running a very basic sentiment analysis pipeline utilising the XLM-Roberta model from Hugging Face. I am trying to ensure I am utilising the M1 chip, as I will be looping over ~10e7 entries.
To be consistent, I am running a fresh install of PyTorch following the yml file and steps outlined in this (very useful) video; I subsequently pip installed sentencepiece and protobuf (version 3.2.0) to deal with a few subsequent errors. When running a simple pipeline model, however, I am faced with the below:
# Imports
import pandas as pd
import datetime as dt
import itertools
from transformers import pipeline, AutoTokenizer
sentiment_model = pipeline(model="cardiffnlp/twitter-xlm-roberta-base-sentiment", return_all_scores=True)
ValueError: google.__spec__ is None
Interestingly, following the install method for TensorFlow from the same channel runs fine, but it does not access the M1 chip and simply runs on the CPU.
Has anyone faced this prior or have a method such that I can run PyTorch?
Many thanks in advance.
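For the narrower question of whether PyTorch can use the M1 GPU at all (this does not address the google.__spec__ error itself), a minimal check of the MPS backend looks roughly like this; moving the pipeline's model to that device afterwards depends on your transformers version and is left as an assumption:
import torch

# True if this PyTorch build includes Metal (MPS) support and the M1 GPU is usable
print(torch.backends.mps.is_built())
print(torch.backends.mps.is_available())

# Pick the M1 GPU when available, otherwise fall back to the CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(device)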
I am trying to save a model with tensorflow-agents. First I define the following:
collect_policy = tf_agent.collect_policy
saver = PolicySaver(collect_policy, batch_size=None)
and then save the model like this:
saver.save('my_directory/')
This works OK in Google Colab, but I am getting the following error on my local PC.
AttributeError: module 'tensorflow.python.saved_model.nested_structure_coder' has no attribute 'StructureCoder'
These are the library versions I am using:
tensorflow 2.9.1
tf-agents 0.11.0
Tl;dr
Make sure you have the tensorflow-probability version that is compatible with TensorFlow 2.9.x and tf-agents 0.11.0:
pip uninstall tensorflow-probability
pip install tensorflow-probability==0.17.0
(0.19.0 for TF 2.11, 0.18.0 for TF 2.10, or look at the release notes)
Also make sure to restart your kernel from the notebook.
What the problem was
StructureCoder has been moved into the TensorFlow API. Therefore, dependent libraries have made corresponding changes, like this in tf-agents and like this in tensorflow-probability. Your machine is somehow picking up an older version that depends on the previous location of nested_structure_coder.
For me, I was using
tensorflow 2.9.0
tf-agents 0.13.0
tensorflow-probability 0.17.0
Try making an explicit import in your notebook:
import tensorflow_probability
print(tensorflow_probability.__version__) # was 0.17.0 for me
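If the wrong version keeps being picked up, printing the module paths as well shows which installation Python is actually loading (a small extension of the check above):
import tf_agents
import tensorflow_probability

# __file__ reveals which site-packages directory each library is loaded from
print(tf_agents.__version__, tf_agents.__file__)
print(tensorflow_probability.__version__, tensorflow_probability.__file__)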
I am using Colab to run a text analysis code. I want to get universal-sentence-encoder-large from tensorflow_hub.
But any time I run the block containing the code below:
module = hub.Module("https://tfhub.dev/google/universal-sentence-encoder-large/3")
I get this error:
RuntimeError: variable_scope module_8/ was unused but the
corresponding name_scope was already taken.
I would appreciate any idea of how this error can be fixed.
The TF Hub USE-3 module doesn't work with TensorFlow version 2.0.
Hence, if you change the version from 2.0 to 1.15, it works without any error.
Please find the working code mentioned below:
!pip install tensorflow==1.15
!pip install "tensorflow_hub>=0.6.0"
!pip3 install tensorflow_text==1.15
import tensorflow as tf
import tensorflow_hub as hub
import numpy as np
import tensorflow_text
module = hub.Module("https://tfhub.dev/google/universal-sentence-encoder-large/3")
Please find the GitHub Gist of the Google Colab notebook as well.
With TensorFlow 2 in Google Colab you should use hub.load(url) instead of hub.Module(url).
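For example, a minimal TF2-style sketch (the /5 URL is the TF2 SavedModel variant of the large encoder; adjust the version if you need a different one):
import tensorflow_hub as hub

# TF2: load the SavedModel and call it directly, no hub.Module or sessions needed
embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder-large/5")
embeddings = embed(["Hello world.", "How are you?"])
print(embeddings.shape)  # expected: (2, 512)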
I installed the Object Detection API, and it takes a lot of space. Please tell me which files to keep just to detect objects in images. Thank you.
You need utils, protoc, and the .ipynb script:
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from utils import label_map_util
from utils import visualization_utils as vis_util
You can see this clearly from the https://github.com/tensorflow/models/blob/master/research/object_detection/object_detection_tutorial.ipynb script on the official repo. Install all the required libraries (NumPy, etc.) and copy the utils directory.
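As an illustration of why the utils directory is needed, a typical use of label_map_util from that tutorial looks roughly like this (the .pbtxt path is a placeholder):
from utils import label_map_util

# Map numeric class IDs to human-readable names; the label map path is only an example
category_index = label_map_util.create_category_index_from_labelmap(
    'data/mscoco_label_map.pbtxt', use_display_name=True)
print(category_index[1])  # e.g. {'id': 1, 'name': 'person'}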
I had a wx script working on WinXP (at work). It was upgraded to Win7 64-bit. I installed Python 2 and wxPython (both 32-bit). Now my script doesn't want to run; it says "ImportError: NumPy not found." So I installed numpy from numpy.org, but it didn't change anything. I can import wx, and I can import numpy, but when I try to run my wx script, it says that numpy is not installed. I removed and reinstalled everything, but nothing changed.
What can I do?
Presumably your numpy is too "new" or your wxPython is too old.
For example, the combination wxPython < 3.0 with numpy > 1.9 will not work for the plot module (wxPython 2.9.5 + numpy 1.8.0 and wxPython 3.0.2 + numpy 1.9.2 do actually work).
The reason should be the file <site-packages>/wx/lib/plot.py (here from 2.9.5):
# Needs NumPy
try:
    import numpy.oldnumeric as _Numeric
except:
    msg = """
    This module requires the NumPy module, which could not be
    imported. It probably is not installed (it's not part of the
    standard Python distribution). See the Numeric Python site
    (http://numpy.scipy.org) for information on downloading source or
    binaries."""
    raise ImportError, "NumPy not found.\n" + msg
and as used in 3.0.2:
# Needs NumPy
try:
    import numpy as np
except:
    ...
numpy.oldnumeric is no longer part of numpy 1.9.2; wx.lib.plot was developed for ancient array libraries, and you can clearly see its age.
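To confirm which combination you actually have, a quick check like the following (a minimal sketch) can help; importing wx.lib.plot directly reproduces the error when the versions clash:
import wx
import numpy

print(wx.__version__)
print(numpy.__version__)

# This import fails with "ImportError: NumPy not found." if the versions do not match
import wx.lib.plot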