Running Sidekiq jobs against more than one database?

I have one Rails app which uses different databases depending on the domain name (i.e., it supports multiple websites). This works by loading up different environments, without issue.
I am trying to figure out how to run the same set of Sidekiq jobs for each of them.
Sidekiq runs on a worker-server instance.
I have tried running a second instance of Sidekiq on the worker's command line, giving it a different pidfile, logfile, environment, and config file.
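For example, something along these lines (the paths and environment name here are placeholders; these flags existed in the Sidekiq versions contemporary with Rails 5):

bundle exec sidekiq -e production_site2 -C config/sidekiq_site2.yml -P tmp/pids/sidekiq_site2.pid -L log/sidekiq_site2.log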
Problem 1: In the dashboard, all recurring tasks listed in the first instance's config file are gone, and only the task from my second instance's config file shows on the Recurring Jobs tab.
Problem 2: For that job, if I try to enqueue it, I get uninitialized constant JofProductUpdateLive. I am guessing this is because I defined the class in app/jobs/jof_product_update_live.rb on the worker, and it is being looked for on the master server?
Problem 3: If my theory about the error is correct and I place that file on the master server, it seems to me it will run with environment/db1, and I'm not sure how to run it with db2/environment2.
I'm seeking any advice on how to set something like this up, as I have tried every idea that came my way and so far had zero success. I have also combed through every Sidekiq forum I could find, to no avail.
Thanks for any help!

Check out the Apartment gem and apartment-sidekiq.
https://github.com/influitive/apartment-sidekiq
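A minimal sketch of how that combination might look, assuming Postgres-schema-style tenants and tenant names I've made up (site_one, site_two). Per the apartment-sidekiq README, the gem adds client/server middleware that records the current tenant with each job and switches back to that tenant on the worker:

# Gemfile
gem 'apartment'
gem 'apartment-sidekiq'

# config/initializers/apartment.rb
Apartment.configure do |config|
  # one tenant (schema/database) per website; names are hypothetical
  config.tenant_names = %w[site_one site_two]
end

# Enqueue a job for a specific tenant; the worker then runs it
# against that tenant's database.
Apartment::Tenant.switch('site_one') do
  JofProductUpdateLive.perform_async
end

One Sidekiq process can then serve all of the databases, which sidesteps the one-instance-per-site setup and its clashing recurring-job schedules.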

Related

serverless - aws - SecureLambdaFunction env

I have the following case:
I'm setting several environment variables in my serverless.yml file, like:
ONE_CLIENT_SECRET=${ssm:/one/key_one~true}
ONE_CLIENT_PUBLIC=${ssm:/one/key_two~true}
ANOTHER_SERVICE_KEY=${ssm:/two/key_one~true}
ANOTHER_SERVICE_SECRET=${ssm:/two/key_two~true}
Let's say I have around 10 such variables. When I try to deploy, I get the following error:
An error occurred: SecureLambdaFunction - Lambda was unable to configure your environment variables because the environment variables you have provided exceeded the 4KB limit. String measured: JSON_WITH_MY_VARIABLES_HERE
So I cannot deploy. I have an idea of what the problem is, but I don't have a clear path to solve it, so my questions are:
1) How can I extend the 4KB limit?
2) Assuming my variables are set via SSM (I'm using the EC2 Parameter Store to save them; this part is more for the Serverless team or someone who knows the topic): how does it work behind the scenes?
- When I run sls deploy, does it fetch the values and include them in the .zip file? (This is what I think it does; I just want to confirm.) Or does it fetch the values when I execute the lambdas? I ask because when I go to the AWS Lambda console I can see them set there.
Thanks!
After taking a deeper look, I came to the following conclusion:
Using the pattern ONE_CLIENT_SECRET=${ssm:/one/key_one~true} means that the Serverless framework downloads the values at packaging time and embeds them into the project. This is where the problem comes from, and you can see it after deploying: your variables show up in plain text on the Lambda console.
My solution was to use a middy middleware to load the SSM values when the lambda executes. This means you need to structure your code so that nothing runs until the variables are available, and you need a good strategy for caching them across cold starts; otherwise the lookups add time to every execution.
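As an illustration of the runtime approach (the *_PARAM names are my own invention; the exact middy wiring is omitted), serverless.yml would pass only the parameter names, so nothing secret is resolved at deploy time:

environment:
  ONE_CLIENT_SECRET_PARAM: /one/key_one
  ONE_CLIENT_PUBLIC_PARAM: /one/key_two

The handler (via the middy SSM middleware, or the AWS SDK directly) then resolves those names into values at runtime, ideally once per container rather than on every invocation.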
The 4KB limit cannot be changed, and after reading about it, that seems reasonable.
So, short story: if you hit this problem, find the combination of runtime middleware and embedded values that works best for you.

Is it possible to accurately log what applications the user has launched through the Linux kernel?

My goal is to write to a file whenever the user launches an application (such as Firefox), and to timestamp the event.
The tricky part is having to do this from the kernel (or a module loaded into the kernel).
From the research I've done so far (sources listed below), the execve system call seemed the most viable, as it carries the filename of the program being executed, which seemed like gold at the time. But I quickly learned that it wasn't as useful as I thought, since this system call isn't limited to user-initiated launches.
So then I thought of using ps -ef, as it lists all currently running processes, and I would just filter out the ones that are applications opened by the user.
The issue with that method is that I would have to poll every X seconds, so it could miss an application that the user launched and closed between two invocations of ps -ef.
I've also realized that writing to a file would be a challenge, since you don't have access to the standard library from the kernel. My guess there would be to make use of proc somehow, so the user can actually access the information that I'm trying to log.
Basically I'm running out of leads and I'd greatly appreciate it if anyone could point me in the right direction.
Thanks.
Sources:
http://tldp.org/LDP/lkmpg/2.6/html/x978.html (not very recent)
https://0xax.gitbooks.io/linux-insides/content/SysCall/syscall-4.html
First, reading or writing a regular file from inside the kernel is a bad idea and is not done in the kernel itself. There are of course VFS-backed files, such as those under /sys or /proc, but these are a special case where it is allowed.
See this article in Linux Journal:
"Driving Me Nuts - Things You Never Should Do in the Kernel" by Greg Kroah-Hartman
http://www.linuxjournal.com/article/8110
Every new process created in Linux adds an entry under /proc,
as /proc/pidNum, where pidNum is the process ID of the new process.
You can find out the name of the newly invoked application simply with
cat /proc/pidNum/cmdline.
So for example, if your crond daemon has pid 1336, then
$ cat /proc/1336/cmdline
will give
cron
And there are ways to be notified of new entries appearing under a directory in Linux; for /proc in particular, the kernel's netlink process-events connector is the usual mechanism, since inotify does not work on procfs.

inittab respawn of Node.js too fast

So I am trying to keep my Node server on an embedded computer running while it is out in the field. This led me to leveraging inittab's respawn action. Here is the line I added to inittab:
node:5:respawn:node /path/to/node/files &
I know for a fact that when I start this Node application from the command line, it does not reach the bottom of its main body and console.log "done" until a good 2-3 seconds after I issue the command.
So I suspect that in that 2-3 second window the OS just keeps firing off respawns of the Node app. In fact, I see in the error logs that the kernel ends up killing off a bunch of node processes because it is running out of memory, and I also get the 'node' process respawning too fast, will suspend for 5 minutes message.
I tried wrapping the command in a script; that didn't work. I know I could use crontab, but that only fires every minute... am I doing something wrong? Or should I take a different approach altogether?
Any and all advice is welcome!
TIA
Surely too late for you, but in case someone else runs into this problem: try removing the & from the command invocation.
What happens is that when the command goes to the background (thanks to the &), the parent (init) sees that it exited and respawns it. The result: a storm of new instantiations of your command.
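In other words, the entry should keep node in the foreground so init can supervise it, i.e. the same line as before minus the trailing &:

node:5:respawn:node /path/to/node/files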
Worse, you mention embedded, so I guess you are using BusyBox, whose init won't rate-limit the respawning, as other implementations would. So the respawning will only end when the system runs out of memory.
inittab is overkill for this. What I found out I needed is a process monitor. I found one that is lightweight and effective, with good reports of working well in the field: http://en.wikipedia.org/wiki/Process_control_daemon
Using it entails configuring the daemon to start and monitor your Node.js application for you.
That is a solution that works from the OS side.
Another way to do it is as follows. If you are trying to keep a Node.js app running, as I was, there are several modules written to keep other Node.js apps running; to mention a couple, there are forever and respawn. I chose respawn.
This method entails starting one app, written in Node.js, that uses the respawn module to start and monitor the actual Node.js app you wanted to keep running in the first place.
Of course, the downside is that if the Node.js engine (V8) goes down altogether, then both the monitoring and the monitored process go down with it :-(. But it's better than nothing!
PCD would be the ideal option: it would probably only go down if the OS goes down, and if the OS goes down, hopefully one has a watchdog in place to reboot the device/hardware.
Niko

Running Rails code/initializers but not via Rake

I keep running into a recurring issue with my application. Basically, I have certain code that I want to run when the server first starts up, to check whether certain things have been defined (e.g. a schedule, particular columns in the database, the existence of files, etc.) and then act accordingly.
However, I definitely don't want this code to run when I'm starting a Rake task (or running a 'generate', etc.). For example, I don't want the database fields to be checked under Rake, because the Rake task might be the very migration that defines those fields. Another example: I have a dynamic schedule for Resque, but I don't want to load it when starting the Resque workers. And so on and so forth...
And I definitely need the Rake tasks to load the environment!
Is there any way of determining how the application has been loaded? I want the code to run when the app is loaded via 'rails server', Apache/Passenger, the console, etc., but not at other times.
If not, where or how could you define this code to ensure it is only executed in the manner described above?
The easiest way is to check an environment variable in your initialization code, with something like

if ENV['need_complex_init']
  do_complex_init
end

and then run the application with need_complex_init=1 rails s.
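A related trick you sometimes see (my own addition, not part of the answer above, so verify it against your Rails version) is to inspect the program name, which points at the rake binary while Rake tasks run:

# config/initializers/complex_init.rb
# $PROGRAM_NAME ($0) is the path to the rake executable during rake
# tasks, so skip the expensive startup checks in that case.
unless File.basename($PROGRAM_NAME) == 'rake'
  do_complex_init
end

The environment-variable approach is more explicit, though, since it lets you opt in per invocation rather than hard-coding which entry points count.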

TeamCity: Managing deployment dependencies for acceptance tests?

I'm trying to configure a set of build configurations in TeamCity 6 and am trying to model a specific requirement in the cleanest manner TeamCity allows.
I have a set of acceptance tests (around 4-8 suites of tests grouped by the functional area of the system they pertain to) that I wish to run in parallel (I'll model them as build configurations so they can be distributed across a set of agents).
From my initial research, it seems that having an AcceptanceTests meta-build config that pulls in the set of individual acceptance test configs via snapshot dependencies should do the trick. Then all I have to do is say that my Commit build config should trigger AcceptanceTests, and they'll all get pulled in. So, let's say I also have AcceptanceSuiteA, AcceptanceSuiteB, and AcceptanceSuiteC.
So far, so good. (I know I could also turn it around and have the Commit config trigger AcceptanceSuiteA, AcceptanceSuiteB, and AcceptanceSuiteC directly; the problem there is that I'd need to manually aggregate the results to determine the overall success of the acceptance tests as a whole.)
The complicating bit is that while AcceptanceSuiteC just needs some Commit artifacts and can then live on its own, AcceptanceSuiteA and AcceptanceSuiteB need to:
DeploySite (let's say it takes 2 minutes and I can't afford to spin up a completely isolated one just for this run)
Run tests against the deployed site
The problem is that I need to be able to ensure that:
the website only gets configured once
the website does not get clobbered while the two suites are running
If I set up DeploySite as a build config and have AcceptanceSuiteA and AcceptanceSuiteB pull it in as a snapshot dependency, then AFAICT:
a subsequent or parallel run of AcceptanceSuiteB could trigger another DeploySite, which would clobber the deployment that AcceptanceSuiteA and/or AcceptanceSuiteB are in the middle of using.
While I can say Limit the number of simultaneously running builds to force only one to happen at a time, what I need is one at a time and none while the dependent pieces are still running.
Is there a way in TeamCity to model such a hierarchy?
EDIT: Ideas:
A crap solution is that DeploySite could set an 'in use' marker flag and have the AcceptanceTests config clear that flag after AcceptanceSuiteA and AcceptanceSuiteB have completed. The problem then becomes one of having the next DeploySite down the pipeline wait until said gate has been opened again. (Doing a blocking wait within the build doesn't feel right; I want it to show as 'not yet started' rather than looking like it's taking a long time to do something.) However, this kind of set-a-flag-over-here-and-have-this-bit-check-it is exactly the sort of mutable state/flakiness smell I'm trying to get away from.
EDIT 2: If I could programmatically alter the agent configuration, I could set Agent Requirements to require InUse=false, then set the flag when a deploy starts and clear it after the tests have run.
It seems you should first go look on the JetBrains DevNet forums and the YouTrack tracker, and remember to use the magic word clobber in your search.
Then you install the groovy-plug plugin and use its StartBuildPrecondition facility:
To use the feature, add a system.locks.readLock.<name> or system.locks.writeLock.<name> property to the build configuration.
A build with a writeLock will only start when there are no running builds holding a read or write lock of the same name.
A build with a readLock will only start when there are no running builds holding a write lock of the same name.
Use this to express the fact that the dependent configs 'read' and the DeploySite config 'writes' the shared item.
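For example (the lock name siteDeployment is my own invention; per my reading of the plugin description, the property name is what matters, not its value):

# on the DeploySite build configuration
system.locks.writeLock.siteDeployment=1

# on the AcceptanceSuiteA and AcceptanceSuiteB build configurations
system.locks.readLock.siteDeployment=1

That way a new deploy waits for any running suites to finish, and the suites keep a fresh deploy out while they are using the site.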
(This is not a fully productised solution, hence the tracker item remains open.)
EDIT: And I still don't know whether the lock should go under Build Parameters | System Properties, and what the exact name format should be. Is it locks.writeLock.MYLOCKNAME (i.e., showing up in the config with the reference syntax %system.locks.writeLock.MYLOCKNAME%)?
Other puzzlers: how does one give read access to builds triggered by the completion of a writeLock build? Does the lock get dropped until the next one picks it up (which would allow another writer in), or is it necessary to have something queue up the parent and the child dependency at the same time?