torch.save(predictions, predictions_path, pickle_module=dill) system dead - system

I trained the whole network and got the parameters. When I use one pth to infer the result on a test dataset, just testing, by debugging, each time when it goes to "torch.save(predictions, predictions_path, pickle_module=dill) ", the whole system gets dead. The clock stops and the mouse cannot move. Push contl+alt+F1 to tty but no response. Nothing can do. Everything is dead. If I restart the machine by the "start" button, the system then says "lost inducing document" and the whole system dumped and I have to re-install the whole Linux. If I wait for the dead system to be back, it costs one day or so, and finally, it gets back to response and just shows "system error".
I searched for all possible solutions and tried but not work. I bought new memory banks, updated and upgraded all software, checked error logs, and optimized many system services. But all is useless. Each time it got dead for one day. The system is centos for ubuntu. Anyone can help?

Related

paladin and anyone else hit same target and causes server crash

Playing on a linux hosted AzerothCore rev. be423a91b535 master branch.
Whenever as a paladin I hit a target with my abilities and then someone else swings at it the entire server crashes. The only error that shows up in log is this.
We've tested this with multiple characters on multiple accounts and and in all instances after casting either of my judgement of light, or judgement of mana causes the server to crash completely when another character swings at said target.
It does not happen if I do not use these abilities and just attach the target at the same time with another player.
This is a clean server recently set up, no additional modules or changes beyond lowering guild signing size.
error image
I've also gotten a snapshot of the server config if that has useful information.
Server config
Apparently there was a module added QAston Proc, and this causes world crashes. If anyone else has this issue updating should clear it up as the commit was removed.

Your connection was interrupted. ERR_NETWORK_CHANGED when parse with Selenium

I wrote a parser in selenium, which clicks to different links and parses data. But from time to time I get an error
Your connection was interrupted. ERR_NETWORK_CHANGED
Perhaps this error is not related to selenium.
But, if I wait a little (about 1 second), then the connection appears again and I can continue to parse.
What way is there to solve this problem programmatically? Maybe in this case, somehow, I can reload the page until the connection appears again?
For me using a mac system and chrome for a daily basis, I encounter this problem now and then and finally found it might be helpful to solve this problem by just turning off the virtual machines or dock containers whichever can adjust your network configuration.
Maybe not fit for your case but it could be a hint. You can turn them off for a while if you have them running.

Is it possible to accurately log what applications the user has launched through the linux kernel?

My goal is to write to a file (that the user whenever the user launches an application, such as FireFox) and timestamp the event.
The tricky part is having to do this from the kernel (or a module loaded onto the kernel).
From the research I've done so far (sources listed below), the execve system call seemed the most viable. As it had the filename of the process it was handling which seemed like gold at the time, but I quickly learned that it wasn't as useful as I thought since this system call isn't limited to user-related operations.
So then I thought of using ps -ef as it listed all the current running processes and I would just have to filter through which ones were applications opened by the user.
But the issue with that method is that I would have to poll every X seconds so, it has the potential to miss something if the user launched and closed an application within the time that I didn't call ps -ef.
I've also realized that writing to a file would be a challenge as well, since you don't have access to the standard library from the kernel. So my guess for that would be making use of proc somehow to allow the user to actually access the information that I'm trying to log.
Basically I'm running out of leads and I'd greatly appreciate it if anyone could point me in the right direction.
Thanks.
Sources:
http://tldp.org/LDP/lkmpg/2.6/html/x978.html (not very recent)
https://0xax.gitbooks.io/linux-insides/content/SysCall/syscall-4.html
First, writing to a file or reading a real file from the kernel is a bad idea which is not used in the kernel. There is of course VFS files, like /sys/fs or /proc, but this is a special case and this is allowed.
See this article in Linux Journal,
"Driving Me Nuts - Things You Never Should Do in the Kernel" by Greg Kroach-Hrtman
http://www.linuxjournal.com/article/8110
Every new process that is created in Linux, adds an entry under /proc,
as /proc/pidNum, where pidNum is the Process ID of the new process.
You can find out the name of the new application which was invoked simply by
cat /proc/pidNum/cmdline.
So for example, if your crond daemon has pid 1336, then
$cat /proc/1336/cmdline
will give
cron
And there are ways to monitor adding entries to a folder in Linux.

MS Access occassional "Cannot open any more databases" error

I have an MS Access (2013) application with a split database. Everything seems to run smoothly except for occasionally I will get Error 3048: Cannot open any more databases.
The error occurs when the front end tries to run vba code which involves pulling data from the back end and will stall on any line with: Set DB = OpenDatabase() or DoCmd.RunSQL() commands.
The strange thing is that this error seems to be time based. I can access the back end hundreds of times without error if I do it quickly enough but after some time has passed (~1 hr) the error shows up. In fact, I can open the application and leave it running in the background (with no code running) then go back into it after an hour and I will get the error the first time the code tries to open the back end.
I've searched the length and breath of this site and google for solutions so I know this error has been addressed before. To save people reiterating the usual fixes I will list what I've tested for so far with no success:
Recordset limit: I'm not leaving any recordsets open, every time I open one I make sure to close it. The same for the databases. All my requests
to the back end are done via 3 or 4 vba functions and each of these has
a Rec.Close or DB.Close corresponding to every OpenRecordset()
and OpenDatabase() and I never have more than 2 recordsets open at
a time.
Control limit: I have 151 controls on the biggest form in the application so I should be below the limit (I believe this is 245 for a single form?)
Corrupt database: I've copied all my forms and code to a new Access database and run a Compact and Repair.
Machine Issue: I've tested the application on several machines and reproduced the same error.
Anyway with most of the above situations I would expect the application not to run at all, instead of running fine for a set amount of time and then crashing.
Some other points of note:
Citrix Users: The users are split between normal windows machines users who are experiencing this error and others who are using the application through a virtual desktop software (Citrix) who are having no issues. Unfortunately I don't know enough about this virtual desktop to really work out what that implies.
Background vs Foreground: Some users have claimed that the application only crashes if it has been running for a long time AND they switch over to another program and switch back. I've confirmed that simply switching between the application and other programs doesn't cause it to crash but haven't yet been able to leave it running in the foreground long enough to confirm if it crashes without switching between programs.
I've been struggling with this for days, anyone able to help me out?

inittab respawn of Node.js too fast

So I am trying to keep my Node server on a embedded computer running when it is out in the field. This lead me to leveraging inittab's respawn action. Here is the file I added to inittab:
node:5:respawn:node /path/to/node/files &
I know for a fact that when I startup this node application from command line, it does not get to the bottom of the main body and console.log "done" until a good 2-3 seconds after I issue the command.
So I feel like in that 2-3 second window the OS just keeps firing off respawns of the node app. I see in the error logs too in fact that the kernel ends up killing off a bunch of node processes because its running out of memory and stuff... plus I do get the 'node' process respawning too fast will suspend for 5 minutes message too.
I tried wrapping this in a script, dint work. I know I can use crontab but thats every minute... am I doing something wrong? or should I have a different approach all together?
Any and all advice is welcome!
TIA
Surely too late for you, but in case someone else finds such a problem: try removing the & from the command invocation.
What happens is that when the command goes to the background (thanks to the &), the parent (init) sees that it exited, and respawns it. Result: a storm of new instantations of your command.
Worse, you mention embedded, so I guess you are using busybox, whose init won't rate-limit the respawning - as would other implementations. So the respawning will only end when the system is out of memory.
inittab is overkill for this. I found out what I need is a process monitor. I found one that is lightweight and effective; it has some good reports of working great out in the field. http://en.wikipedia.org/wiki/Process_control_daemon
Using this would entail configuring this daemon to start and monitor your Node.js application for you.
That is a solution that works from the OS side.
Another way to do it is as follows. So if you are trying to keep Node.js running like I was, there are several modules written meant to keep other Node.js apps running. To mention a couple there are forever and respawn. I chose to use respawn.
This method entails starting one app written in Node.js that uses the respawn module to start and monitor the actual Node.js app you were interested in keeping running anyway.
Of course the downside of this is that if the Node.js engine (V8) goes down altogether then both your monitoring and monitored process will go down with it :-(. But its better than nothing!
PCD would be the ideal option. It would go down probably only if the OS goes down, and if the OS goes down then hope fully one has a watchdog in place to reboot the device/hardware.
Niko