Kubernetes scale down specific pods - replication

I have a set of Pods running commands that can take up to a couple of seconds. There is a process that keeps track of open requests and which Pod each request is running on. I'd like to use that information when scaling down pods, either by specifying which pods to try to leave up, or by specifying which pods to shut down. Is it possible to specify this type of information when changing the number of replicas, e.g. "I want X replicas; try not to kill my long-running tasks on pods A, B, C"?

This isn't currently possible. When you scale down the number of replicas, the system will choose one to remove; there isn't a way to "hint" at which one you'd like it to remove.
One thing you can do is change the labels on running pods, which can affect their membership in the replication controller. This can be used to quarantine pods that you want to debug (so that they won't be part of a service or removed by a scaling event), but it might also work for your use case.
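For example (the pod name and label key are hypothetical; note the controller will immediately start a replacement pod once this one stops matching its selector):
# remove the "app" label so the pod no longer matches the controller's selector
kubectl label pod my-pod-abc12 app-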

You can annotate a specific pod with controller.kubernetes.io/pod-deletion-cost: -999 and enable the PodDeletionCost feature gate. This feature is alpha in 1.21 and beta (enabled by default) in 1.22.
The controller.kubernetes.io/pod-deletion-cost annotation can be set to offer a hint on the cost of deleting a pod compared to other pods belonging to the same ReplicaSet. Pods with a lower deletion cost are deleted first.
https://github.com/kubernetes/kubernetes/pull/99163
https://github.com/kubernetes/kubernetes/pull/101080
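A minimal sketch of using it (pod and deployment names are hypothetical):
# hint that this pod is the cheapest one to delete, then scale down
kubectl annotate pod my-pod-abc12 controller.kubernetes.io/pod-deletion-cost=-999
kubectl scale deployment my-deployment --replicas=2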

I've been looking for a solution to this myself, and I also can't find one out of the box.
However, there might be a workaround (would love it if you could test and confirm; a kubectl sketch follows the steps).
steps:
1. delete replication controller
2. delete X desired pods
3. recreate replication controller of size X
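A rough kubectl sketch of those steps (all names are hypothetical; --cascade=orphan, or --cascade=false on older clients, keeps the pods alive while the controller is deleted):
# 1. delete the controller without deleting its pods
kubectl delete rc my-rc --cascade=orphan
# 2. delete the specific pods you want gone
kubectl delete pod my-pod-1 my-pod-2
# 3. recreate the controller with the desired replica count
kubectl create -f my-rc.yaml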

As mentioned above, the workaround for this action should be something like this:
alias k=kubectl
k delete pod <pod_name> && k scale --replicas=<current_replicas - 1> deploy/<name_of_deployment>
Make sure you don't have an active HPA resource related to the deployment.

Related

Local repository taking huge space

I am using GitLab for my project. There are 3 or 4 different repositories which take up a huge amount of space. Is there a better way to handle the large space they take up? I also have serious performance issues on my computer. I have had to delete the local repository every time branch work is completed, to free up space. This means I am cloning the repo every time I need to work on a new branch, which sometimes takes 30 minutes; that doesn't help either and consumes a huge amount of time. I am also working on all three repositories sequentially, which means cloning and deleting 4 times for one assigned piece of work, which doesn't seem efficient.
Is it possible at all to keep all 4 repos locally and still be efficient with space and computer performance?
I am using VS Code.
Any suggestions appreciated.
What are the best strategies to keep the local repositories and yet stay efficient and avoid deleting them every time?
I've had similar problems in the past with repos that contained many branches, tags, and commits. What helped me was using the --single-branch option of the clone command.
Since you mentioned you're using VS Code and GitLab, I'm also assuming you're using the GitLab WorkFlow VS Code extension. Sadly, I don't know how to specify the --single-branch option with that extension, but there should be a way.
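From a plain terminal it would look something like this (the URL and branch name are placeholders; adding --depth 1 on top gives a shallow clone that also drops old history):
git clone --single-branch --branch main https://gitlab.com/your-group/your-repo.git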

Running sidekiq jobs against more than one database?

I have one Rails app, which uses different databases depending on the domain name (i.e. it supports multiple websites). This works by loading up different environments, without issue.
I am trying to figure out how to run the same set of Sidekiq jobs for each of them.
Sidekiq runs on a worker-server instance.
I have tried running a second instance of Sidekiq on the command line of the worker, giving it a different pidfile, logfile, environment and config file.
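Roughly like this (paths and names are made up; the -L and -P flags existed in older Sidekiq versions but were removed later):
bundle exec sidekiq -e production2 -C config/sidekiq_site2.yml -P tmp/pids/sidekiq_site2.pid -L log/sidekiq_site2.log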
Problem 1: In the Dashboard, all recurring tasks listed in the first instance of Sidekiq's config file are gone, and only the task from my 2nd instance's config file is there on the recurring jobs tab.
Problem 2: For that job, if I try to enqueue it, I get uninitialized constant JofProductUpdateLive. I am guessing this is because I defined the class in app/jobs/jof_product_update_live.rb on the worker, and it is looking for it on the master server?
Problem 3: If my theory for the error is correct and I place that file on the master server, it seems to me it will run with environment/db1, and I'm not sure how to run it with db2/environment2?
I'm seeking any advice as to how to set something like this up, as I have tried every idea that came my way and as of yet, zero success. I have also combed through every forum I could find on sidekiq to no avail.
Thanks for any help !
Check out the Apartment gem and apartment-sidekiq.
https://github.com/influitive/apartment-sidekiq

Tools used to update dynamic properties without even restarting the application/server

In my project I am trying to set things up so that I can update dynamic properties in the server/application without restarting it.
We face the problem that whenever we have to update or change some properties which are dynamic in nature, we have to restart the server/application every time, and this results in unavailability of the server for that period.
I have already found one tool, Archaius-ZooKeeper, for this: https://github.com/Netflix/archaius/
We are trying to do it for JBoss servers, where we use a war file to deploy on the server.
Please suggest any other methods, tools or technologies that can be used for this.
Thanks in advance.
You could consider JRebel, which allows you to redeploy your app without any downtime; you can then use JRebel Remoting to redeploy from Eclipse to a remote server.
You may use ZooKeeper. You have to create a znode and add the properties to it. All your servers/applications should read from this znode and also put a watch on it for data changes.
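For example, with the stock ZooKeeper CLI (paths and values are made up; the -w watch flag needs ZooKeeper 3.5+):
# write the shared properties into a znode, then update them later
zkCli.sh -server zk1:2181 create /myapp/config "timeout=30"
zkCli.sh -server zk1:2181 set /myapp/config "timeout=60"
# each application reads the znode and sets a watch for changes
zkCli.sh -server zk1:2181 get -w /myapp/config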
Alternately, you may use a database to store the properties along with their modification times. Whenever you change the value of a property, the corresponding modification time is changed. All your applications/servers keep pulling the delta at some interval (maybe 2 secs / 5 secs etc.).
Or you may have the properties hosted on a web server, on NFS, or in some distributed cache etc. All your applications/servers keep reading them at some interval to detect any changes.
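A minimal polling sketch for the hosted-file variant (the URL and interval are arbitrary):
# fetch the properties every 5 seconds and react only on change
while true; do
  curl -s -o /tmp/app.properties.new http://config-host/app.properties
  if ! cmp -s /tmp/app.properties.new /tmp/app.properties; then
    mv /tmp/app.properties.new /tmp/app.properties
    echo "properties changed - tell the app to reload them"
  fi
  sleep 5
done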
You can use Spring Cloud Zookeeper. I shared a little example here.

TeamCity: Managing deployment dependencies for acceptance tests?

I'm trying to configure a set of build configurations in TeamCity 6 and am trying to model a specific requirement in the cleanest possible manner enabled by TeamCity.
I have a set of acceptance tests (around 4-8 suites of tests grouped by the functional area of the system they pertain to) that I wish to run in parallel (I'll model them as build configurations so they can be distributed across a set of agents).
From my initial research, it seems that having an AcceptanceTests meta-build config that pulls in the set of individual acceptance test configs via snapshot dependencies should do the trick. Then all I have to do is say that my Commit build config should trigger AcceptanceTests and they'll all get pulled in. So, let's say I also have AcceptanceSuiteA, AcceptanceSuiteB and AcceptanceSuiteC.
So far, so good (I know I could also turn it around the other way and have the Commit config trigger AcceptanceSuiteA, AcceptanceSuiteB and AcceptanceSuiteC; the problem there is that I'd need to manually aggregate the results to determine the overall success of the acceptance tests as a whole).
The complicating bit is that while AcceptanceSuiteC just needs some Commit artifacts and can then live on its own, AcceptanceSuiteA and AcceptanceSuiteB need to:
DeploySite (let's say it takes 2 minutes and I can't afford to spin up a completely isolated one just for this run)
Run tests against the deployed site
The problem is that I need to be able to ensure that:
the website only gets configured once
The website does not get clobbered while the two suites are running
If I set up DeploySite as a build config and have AcceptanceSuiteA and AcceptanceSuiteB pull it in as a snapshot dependency, AFAICT:
a subsequent or parallel run of AcceptanceSuiteB could trigger another DeploySite which would clobber the deployment that AcceptanceSuiteA and/or AcceptanceSuiteB are in the middle of using.
While I can set "Limit the number of simultaneously running builds" to force only one to happen at a time, I need it to be one at a time and not while the dependent pieces are still running.
Is there a way in TeamCity to model such a hierarchy?
EDIT: Ideas:
A crap solution is that DeploySite could set an 'in use' flag marker and then have the AcceptanceTests config clear that flag [after AcceptanceSuiteA and AcceptanceSuiteB have completed]. The problem then becomes one of having the next DeploySite down the pipeline wait until said gate has been opened again (doing a blocking wait within the build doesn't feel right; I want it to be flagged as 'not yet started' rather than looking like it's taking a long time to do something). However, this sort of 'put a flag over here and have this bit check it' approach is exactly the mutable state / flakiness smell I'm trying to get away from.
EDIT 2: if I could programmatically alter the agent configuration, I could set Agent Requirements to require InUse=false, then set the flag when a deploy starts and clear it after the tests have run.
It seems you should go look on the JetBrains DevNet and the YouTrack tracker first, and remember to use the magic word 'clobber' in your search.
Then you install groovy-plug and use the StartBuildPrecondition facility:
To use the feature, add a system.locks.readLock.<name> or system.locks.writeLock.<name> property to the build configuration.
The build with writeLock will only start when there are no builds running with read or write locks of the same name.
The build with readLock will only start when there are no builds running with write lock of the same name.
Use those locks to manage the fact that the dependent configs 'read' and the DeploySite config 'writes' the shared item.
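If that reading is right, the wiring might look like this in each configuration's properties (the lock name DeployedSite is hypothetical, and the exact key format is exactly what the EDIT below questions):
# DeploySite build configuration (the writer)
system.locks.writeLock.DeployedSite=true
# AcceptanceSuiteA / AcceptanceSuiteB build configurations (the readers)
system.locks.readLock.DeployedSite=true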
(This is not a fully productised solution, hence the tracker item remains open.)
EDIT: And I still don't know whether the lock should live under Build Parameters | System Properties, and what the exact name format should be: is it locks.writeLock.MYLOCKNAME (i.e., showing up in config with the reference syntax %system.locks.writeLock.MYLOCKNAME%)?
Other puzzlers are: how does one give builds triggered by the completion of a writeLock task read access? Does the lock get dropped until the next one picks it up (which would allow another writer in), or is it necessary to have something queue up the parent and child dependency at the same time?

Ubuntu + PBS + Apache? How can I show a list of running jobs as a website?

Is there a plugin/package to display status information for a PBS queue? I am currently running an Apache webserver on the login node of my PBS cluster. I would like to display status info and have the ability to perform minimal queries without writing it from scratch (or modifying an age-old Python script, a la jobmonarch). Note: the accepted/bountied solution must work with Ubuntu.
Update: In addition to Ganglia, as noted below, I also looked at the Rocks Cluster Toolkit, but I firmly want to stay with Ubuntu. So I've updated the question to reflect that.
Update 2: I've also looked at PBSWeb as well as MyPBS; neither one appears to suit my needs. The first is too out of date with the current system, and the second is more focused on cost estimation and project budgeting. They're both nice, but I'm more interested in resource availability, job completion, and general status updates. So I'm probably just going to write my own from scratch, starting Aug 15th.
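If it comes to that, a minimal starting point could be a CGI wrapper around qstat (assuming the PBS client tools are installed on the web host and Apache has CGI enabled; the script just HTML-escapes the plain-text output):
#!/bin/sh
# serve the current PBS queue as a web page
echo "Content-Type: text/html"
echo ""
echo "<html><body><h1>PBS queue</h1><pre>"
qstat -a 2>&1 | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
echo "</pre></body></html>"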
Have you tried Ganglia?
I have no personal experience with it, but a few sysadmins I know are using it.
The following pages may help:
http://taos.groups.wuyasea.com/articles/how-to-setup-ganglia-to-monitor-server-stats/3
http://coe04.ucalgary.ca/rocks-documentation/2.3.2/monitoring-pbs.html
my two cents
Have you tried using Nagios? http://www.nagios.org/