cudaError_t 1 : "__global__ function call is not configured" returned from 'cublasCreate(&handle_)' - gpu

I am running ASR experiments with Kaldi on an SGE cluster consisting of two workstations with TITAN XP GPUs.
At random, I run into the following problem:
ERROR (nnet3-train[5.2.62~4-a2342]:FinalizeActiveGpu() cudaError_t 1 : "__global__ function call is not configured" returned from 'cublasCreate(&handle_)'
I guess something is wrong with the GPU driver or the hardware.
Could you please offer some help?
And here is the complete log

I had a similar issue running darknet on a TX2
with reference to
Switch to root with
sudo su
Then source the catkin_ws.
Then launch darknet.
After that it runs.
Here is my result
I hope you can solve it in a similar way.


How to show “Intra-package call graph” using “godoc”

How can I show the “Intra-package call graph” using godoc, as described in Intra-package call graph?
I use this command
$ GO111MODULE=off godoc -http=:6060 -analysis=type,pointer
to start a local server and everything is fine. I just cannot find where the “Intra-package call graph” is.
Any help is appreciated!
It just takes a long time to run the pointer analysis: approximately 4.5 minutes on my laptop. The "Intra-package call graph" for the whole package and for every function appeared only after "2021/03/16 15:04:28 Pointer analysis running..." had disappeared from my terminal.
Also, "-analysis=pointer" on its own is sufficient.

BG95 Can't Activate - AT+QIACT=1 returning error

I'm trying to get a BG95 to activate on Hologram.
Here are my commands:
AT+QCFG="band",F,180A,180A OK
AT+QCFG="iotopmode",2 OK
AT+QCFG="nwscanseq",020301 OK
AT+QCFG="nwscanmode",0 OK
AT+QCFG="snrscan",0 OK
AT+QICSGP=1,1,"hologram","","",1 OK
At first I thought it was antenna/signal related so I ran AT+CSQ and got this:
+csq: 11,99
I believe this tells me I have a good signal.
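For reference, the first number in a +CSQ reply is an RSSI index, not a dBm value. Below is a minimal sketch of the conversion, assuming the standard 3GPP TS 27.007 mapping (0 to 31 correspond to -113 to -51 dBm in 2 dB steps, 99 means "not known"). The function name is made up for illustration.

```python
import re


def parse_csq(line):
    """Parse a '+CSQ: <rssi>,<ber>' reply.

    Returns (rssi_dbm, ber); either value is None when the modem
    reports it as unknown or outside the 0..31 / 0..7 ranges.
    """
    m = re.match(r"\s*\+CSQ:\s*(\d+)\s*,\s*(\d+)", line, re.IGNORECASE)
    if m is None:
        raise ValueError("not a +CSQ reply: %r" % line)
    rssi, ber = int(m.group(1)), int(m.group(2))
    # Assumed mapping: index 0 is -113 dBm, each step adds 2 dB.
    rssi_dbm = -113 + 2 * rssi if 0 <= rssi <= 31 else None
    ber = ber if 0 <= ber <= 7 else None
    return rssi_dbm, ber


print(parse_csq("+csq: 11,99"))  # (-91, None)
```

Under that mapping, an index of 11 is about -91 dBm, which sits in the lower half of the 0 to 31 range, so the signal may be usable without being strong.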
Next I tried AT+QNWINFO and get this:
+QNWINFO: "eMTC","311480","LTE BAND 13",5230
In my mind this is saying it's connected to a network.
After trying that I tried to activate again and got this:
The weird thing is that it activated just fine about a week ago with pure AT commands. I did try using an Arduino library with it (WisLTEBG96TCPIP), which may have changed a setting. I've done a factory reset, but it still won't activate.
Another strange thing is the Hologram dashboard. Every once in a while it shows the SIM as connected, even though I can't activate.
I have tried 2 different SIM cards and get the same activation error.
Any help would be greatly appreciated!
Verizon has cut off all non-ODI products. If your hardware has not been Verizon ODI 'certified', it will no longer be allowed to connect to their network. I have 5 new pet rocks thanks to them. The solution is to purchase new modems from vendors that have been through the Verizon ODI program, or to switch carriers.
I had the same problem before. After a lot of emailing with the network operator, I found out that there isn't an LTE Cat-M1 (eMTC) network in my area. I tested successfully in another area.
Also, before setting the AT+QCFG commands, try AT+CFUN=0,
and after setting the AT+QCFG commands, try AT+CFUN=1.
Before sending AT+QIACT, try the 'AT+CEREG?' command several times and tell me what it returns.
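To make the AT+CEREG? check easier to read, here is a minimal sketch that decodes the `<stat>` field of the reply, assuming the status codes defined for +CEREG in 3GPP TS 27.007. The function name and the sample replies are made up for illustration and are not taken from this thread.

```python
import re

# <stat> values for +CEREG as defined in 3GPP TS 27.007.
CEREG_STAT = {
    0: "not registered, not searching",
    1: "registered, home network",
    2: "not registered, searching",
    3: "registration denied",
    4: "unknown",
    5: "registered, roaming",
}


def parse_cereg(line):
    """Parse a '+CEREG: <n>,<stat>[,...]' reply to AT+CEREG?.

    Returns (stat, description, registered).
    """
    m = re.match(r"\s*\+CEREG:\s*(\d+)\s*,\s*(\d+)", line, re.IGNORECASE)
    if m is None:
        raise ValueError("not a +CEREG reply: %r" % line)
    stat = int(m.group(2))
    description = CEREG_STAT.get(stat, "other (see modem manual)")
    return stat, description, stat in (1, 5)


# Hypothetical replies, for demonstration only.
print(parse_cereg("+CEREG: 0,5"))  # (5, 'registered, roaming', True)
print(parse_cereg("+CEREG: 0,2"))  # (2, 'not registered, searching', False)
```

If the status stays at 2 or 3 across several polls, the module is not registered for EPS service, and AT+QIACT would be expected to fail no matter what the APN settings are.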

How to get an edge's id using TraCi?

I'm using Python code with the traci library to find out whether any vehicles are within a certain distance of a chosen vehicle. To test a solution I'm trying to implement, I need to know a vehicle's current edge.
I'm on Ubuntu 18.04.3 LTS, using Sublime to edit the code, and the os, sys, optparse, subprocess, random and math libraries. I've tried using getLaneId and getEdgeId. The latter is not in the documentation, but I thought I had seen it somewhere and tried it.
Another option I had was getNeighbors, but I didn't know exactly how to use it, and it returned the same error message as the previous commands.
def run():
    step = 0
    while traci.simulation.getMinExpectedNumber() > 0:
        step += 1
        if step > 2:
All of them returned the following error message: AttributeError: VehicleDomain instance has no attribute 'getLaneId'. But I think the vehicle domain does have the getLaneId attribute, since it is in the documentation:
I was expecting it to return the edge's ID. I need help with this problem. Thank you in advance.
The TraCI command for the edge ID is getRoadID, which can be found in the VehicleDomain class of the _vehicle module. The signature is as follows:
traci._vehicle.VehicleDomain.getRoadID(self, vehicleID)
It needs to be getLaneID with a capital D.
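Putting the two answers together: in a running simulation these calls are normally made through the module-level `traci.vehicle` object, as `traci.vehicle.getRoadID(vehID)` and `traci.vehicle.getLaneID(vehID)`. Below is a minimal sketch, assuming SUMO's usual lane naming of `<edgeID>_<laneIndex>`. The helper names and the stub class are hypothetical and exist only so the snippet runs without a SUMO installation.

```python
def edge_from_lane_id(lane_id):
    """Strip the trailing '_<laneIndex>' from a SUMO lane ID to get the edge ID."""
    edge_id, sep, index = lane_id.rpartition("_")
    if not sep or not index.isdigit():
        raise ValueError("unexpected lane ID: %r" % lane_id)
    return edge_id


def current_edge(vehicle_domain, veh_id):
    """Return the ID of the edge a vehicle is on.

    In a live simulation, pass traci.vehicle as vehicle_domain.
    """
    return vehicle_domain.getRoadID(veh_id)


class StubVehicleDomain:
    """Hypothetical stand-in for traci.vehicle, for demonstration only."""

    def __init__(self, lanes):
        self._lanes = lanes  # maps vehicle ID -> lane ID

    def getLaneID(self, veh_id):
        return self._lanes[veh_id]

    def getRoadID(self, veh_id):
        return edge_from_lane_id(self._lanes[veh_id])


stub = StubVehicleDomain({"veh0": "gneE3_0"})
print(current_edge(stub, "veh0"))                 # gneE3
print(edge_from_lane_id(stub.getLaneID("veh0")))  # gneE3
```

Note the capitalisation: the method names end in `ID`, not `Id`, which is what the AttributeError in the question points to.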

julia on PBS cluster: what to give to addprocs()?

I'm trying to set up a cluster across machines on a PBS-managed cluster. I'm perfectly able to compute within one node by saying julia -p 12 (after having reserved one node with 12 CPUs).
I understand that to use several machines, I have to add them to the master process with addprocs. I was able to do that on a different cluster (SGE). On this one, something is going wrong.
You can see everything I'm doing, including submit scripts etc., on this branch of a GitHub repo.
To get a list of machines, I parse the PBS_NODEFILE, which for the case of a submit script with option
#PBS -l nodes=2:ppn=12 # give me 2 nodes with 12 processors each
looks something like this:
I parse this file with bind_pe_procs() in sge.jl in the repo and give a vector of machine names to addprocs. When I submit this I get an SSH error; I put the resulting output up in a gist. I don't know what it means.
Does this have to do with a system setting, i.e. do I have to talk to the sysadmin about SSH between machines? What are the right questions to ask?
I am unsure about what exactly I have to give to addprocs(). I don't want to add the master process (I don't want worker 1 SSHing into itself?), so I exclude ENV["HOST"] = node001 from my list. But what about all the processors with the same name node002? Do I list all of those
machines = [ "red0347" for i=1:12]
or just once
machines = ["red0347"]
in addprocs(machines)
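To make the two options concrete, here is a minimal sketch of collapsing a nodefile into one entry per host with a slot count. It is written in Python for illustration only, and the same logic is a few lines of Julia. It assumes the nodefile repeats each hostname once per reserved processor slot, which is how PBS/Torque nodefiles are usually laid out. The hostnames and the function name are made up.

```python
from collections import Counter


def machines_from_nodefile(text, exclude=None):
    """Collapse nodefile text into [(host, slots), ...] in first-seen order.

    exclude: optional hostname to leave out entirely (e.g. the master's host).
    """
    hosts = [line.strip() for line in text.splitlines() if line.strip()]
    counts = Counter(h for h in hosts if h != exclude)
    return list(counts.items())


# Hypothetical nodefile content: 2 nodes with 3 slots each.
nodefile = "node001\nnode001\nnode001\nnode002\nnode002\nnode002\n"

print(machines_from_nodefile(nodefile))
# [('node001', 3), ('node002', 3)]
print(machines_from_nodefile(nodefile, exclude="node001"))
# [('node002', 3)]
```

Current Julia documentation describes addprocs as accepting either plain host strings (one worker each) or (host, count) tuples. Whether the Julia version installed on the cluster supports the tuple form is an assumption worth checking; if it does not, repeating the hostname once per desired worker expresses the same thing.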

openAcc how to profile

Hi, I was using the CAPS OpenACC compilers, but something strange happens when I try to get some preliminary profile results.
At first, I ran the code with HMPPRT_LOG_LEVEL="info" set, which generates some profile results with timestamps.
[ 2.612337] ( 0) INFO : Upload edgelengths[0:129600] (element_size=8, queue=none, location=gravity_openacc.c:50)
[ 2.613485] ( 0) INFO : Call __hmpp_acc_region__2ha750yb (queue=none, location=gravity_openacc.c:50)
[ 2.614367] ( 0) INFO : Free edgelengths[0:129600] (element_size=8, queue=none, location=gravity_openacc.c:50)
So I guess the kernel execution time is calculated as 2.614367-2.613485=0.000882 s.
But when I set CUDA_PROFILE=1, the profile below is shown:
method=[ __hmpp_acc_region__2ha750yb_parallel_region_1 ] gputime=[ 492.480 ] cputime=[ 13.000 ] occupancy=[ 0.250 ]
So I'm quite confused by these two results. Which one is correct?
Does anyone have an explanation?
The CUDA profiler shows you just the time it takes to execute the CUDA kernel, while the log you obtain with HMPPRT_LOG_LEVEL="info" gives you the overall time it takes to execute the region, which is not exactly the same thing, because you may have some code that is executed on the host for example.
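The two figures can be put on the same scale. Below is a minimal sketch, assuming the HMPP log timestamps are in seconds (as the question assumes) and that the legacy CUDA command-line profiler reports gputime in microseconds. The helper names are made up, and the Free line is shortened to its timestamp and message.

```python
import re


def log_timestamp(line):
    """Extract the leading '[ <seconds>]' timestamp from an HMPP log line."""
    m = re.match(r"\[\s*([0-9.]+)\s*\]", line)
    return float(m.group(1))


def gputime_us(line):
    """Extract the gputime field from a CUDA_PROFILE line (assumed microseconds)."""
    m = re.search(r"gputime=\[\s*([0-9.]+)\s*\]", line)
    return float(m.group(1))


call_line = "[ 2.613485] ( 0) INFO : Call __hmpp_acc_region__2ha750yb (queue=none, location=gravity_openacc.c:50)"
free_line = "[ 2.614367] ( 0) INFO : Free edgelengths[0:129600]"
profile_line = ("method=[ __hmpp_acc_region__2ha750yb_parallel_region_1 ] "
                "gputime=[ 492.480 ] cputime=[ 13.000 ] occupancy=[ 0.250 ]")

region_us = (log_timestamp(free_line) - log_timestamp(call_line)) * 1e6
kernel_us = gputime_us(profile_line)

print(round(region_us, 1))              # 882.0
print(kernel_us)                        # 492.48
print(round(region_us - kernel_us, 1))  # 389.5
```

Under those assumptions the region takes about 882 microseconds from Call to Free, of which about 492 are the kernel itself. The remaining time of roughly 390 microseconds would then be launch overhead, synchronisation and host-side work inside the region.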