Kubernetes on GCP: removed role, account gone; how to restore permissions?

Whilst 'hardening' the accounts (namely removing or toning down accounts with editor permissions on the projects), I removed editor from what appears to be the Kubernetes account that Container Engine uses on the back end of gcloud commands.
Once you remove the last role from an account it vanishes - hard lesson to learn!
Removed editor
serviceAccount:386242358897@cloudservices.gserviceaccount.com
It meant I initially couldn't deploy because the cluster couldn't access Container Registry.
So I deleted the cluster and recreated it, expecting the account to get recreated. That failed due to insufficient permissions.
So I manually removed the compute instances (it wouldn't have had permissions to recreate them), then the templates and then the cluster.
As the UI now thinks you have no clusters, it looks like you are back to the beginning. So I ran my scripts and they failed.
ERROR: (gcloud.container.clusters.create) Operation [https://container.googleapis.com/v1/projects/xxxx/zones/europe-west2-b/operations/operation-xxxx'
startTime: u'2017-10-17T17:59:41.515667863Z'
status: StatusValueValuesEnum(DONE, 3)
statusMessage: u'Deploy error: "Not all instances running in IGM. Expect 1. Current actions &{Abandoning:0 Creating:0 CreatingWithoutRetries:0 Deleting:0 None:0 Recreating:1 Refreshing:0 Restarting:0 Verifying:0 ForceSendFields:[] NullFields:[]}. Errors [https://www.googleapis.com/compute/beta/projects/xxxx/zones/europe-west2-b/instances/gke-xxxx-default-pool-xxxx:PERMISSIONS_ERROR]".'
targetLink: u'https://container.googleapis.com/v1/projects/xxxx/zones/europe-west2-b/clusters/xxxx'
zone: u'europe-west2-b'>] finished with error: Deploy error: "Not all instances running in IGM. Expect 1. Current actions &{Abandoning:0 Creating:0 CreatingWithoutRetries:0 Deleting:0 None:0 Recreating:1 Refreshing:0 Restarting:0 Verifying:0 ForceSendFields:[] NullFields:[]}. Errors [https://www.googleapis.com/compute/beta/projects/xxxx/zones/europe-west2-b/instances/xxxx:PERMISSIONS_ERROR]".
Updated property [container/cluster].
When I try to create a cluster through the UI I get this:
Permission denied (HTTP 403): Google Compute Engine: Required 'compute.zones.get' permission for 'projects/xxxx/zones/us-central1-a'
Have done a number on it!
My problem is that I don't see a way of giving permissions back to whatever account it is trying to use (as I cannot see that account if it exists) nor can I see how to attach a new service account with permissions that are needed to whatever is doing the work under the hood.
UPDATE:
So ...
I recreated the account at the organisation level and gave it the service account role there, because you cannot modify the domain of the accounts at project level.
I then modified it at the project level to have editor permissions.
This means I can deploy a cluster, but I still cannot create a load balancer due to insufficient permissions:
Error creating load balancer (will retry): Error getting LB for service default/bot: googleapi: Error 403: Required 'compute.forwardingRules.get' permission for 'projects/xxxx/regions/europe-west2/forwardingRules/xxxx', forbidden
The account having the problem this time is:
service-xxx@container-engine-robot.iam.gserviceaccount.com
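For reference, re-granting roles to the two Google-managed accounts mentioned above could be sketched as below. This is a minimal sketch under assumptions, not a confirmed fix: the project ID and project number are hypothetical placeholders, and the role chosen for the container-engine-robot account (`roles/container.serviceAgent`) is an assumption, since the post only confirms that the cloudservices account originally held editor. The code only builds the `gcloud projects add-iam-policy-binding` command lines; it does not run them.

```python
import shlex
from typing import List, Tuple


def iam_binding_command(project_id: str, member: str, role: str) -> List[str]:
    """Build the argv for `gcloud projects add-iam-policy-binding`."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member=serviceAccount:{member}",
        f"--role={role}",
    ]


def default_agent_bindings(project_number: str) -> List[Tuple[str, str]]:
    """Return (account, role) pairs for the two accounts named in the post.

    The role for the container-engine-robot account is an assumption.
    """
    return [
        (f"{project_number}@cloudservices.gserviceaccount.com",
         "roles/editor"),
        (f"service-{project_number}@container-engine-robot.iam.gserviceaccount.com",
         "roles/container.serviceAgent"),
    ]


if __name__ == "__main__":
    # "my-project" and "123456789012" are hypothetical placeholders.
    for member, role in default_agent_bindings("123456789012"):
        print(shlex.join(iam_binding_command("my-project", member, role)))
```

Printing the commands first, rather than executing them, leaves room to review each binding before it is applied.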

So ...
I played with recreating accounts etc. Eventually got Kubernetes working again.
A week later tried to use datastore and discovered that AppEngine was dead beyond dead.
The only recourse was to start a new project from scratch.
The answer to this question is (some may laugh at its self-evidence, but we are all in a rush at some point):
DO NOT CREATE USER ACCOUNTS OR GIVE THEM PERMISSIONS BEYOND WHAT THEY NEED BECAUSE DELETING THEM LATER IS REALLY NOT WORTH THE RISK.
Thank you for listening :D

Related

SSAS permission issue (I figured it out while writing this, so I'm just sharing it)

I installed SSAS on a SQL Server 2016 SP2 CU15 Developer Edition server. At the last step, it showed this message:
[screenshot: error message after installation]
SSAS then stopped running, and when I tried to start it manually in SSCM it failed with an error message like this:
I patched it with the latest CU17 and tried to start the service again; it still failed with the same error.
I changed the service account to the built-in Local Service account in SSCM and it worked: SSAS could be brought online.
Then I changed the service account back to domain\account and it failed again with an error message like this:
[screenshot: error message of applying service account]
I added the service account to the Windows Administrators group and tried to apply it in SSCM again, and it worked.
I guessed it must be a Windows-level permission issue.
Then I found this:
https://learn.microsoft.com/en-us/previous-versions/sql/sql-server-2012/ms143504(v=sql.110)?redirectedfrom=MSDN#Windows%20Configure%20Windows%20Service%20Accounts%20and%20Permissions
The SSAS service account should be granted the following permissions in the local security policy:
Log on as a service (SeServiceLogonRight)
For tabular only (mine is tabular; not sure about cube):
Increase a process working set (SeIncreaseWorkingSetPrivilege)
Adjust memory quotas for a process (SeIncreaseQuotaSizePrivilege)
Lock pages in memory (SeLockMemoryPrivilege) – this is needed only when paging is turned off entirely.
For failover cluster installations only:
Increase scheduling priority (SeIncreaseBasePriorityPrivilege)
It works.
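The list above can be turned into a small checklist. The sketch below is only an illustration: the privilege names are copied from the list in this post, while the function names and the idea of passing in the already-granted rights by hand are assumptions made for the example.

```python
from typing import Iterable, Set

# Privilege names exactly as listed in the post above.
BASE = {"SeServiceLogonRight"}
TABULAR = {
    "SeIncreaseWorkingSetPrivilege",
    "SeIncreaseQuotaSizePrivilege",
    "SeLockMemoryPrivilege",
}
FAILOVER = {"SeIncreaseBasePriorityPrivilege"}


def required_privileges(tabular: bool, failover_cluster: bool,
                        paging_disabled: bool = False) -> Set[str]:
    """Return the rights the SSAS service account needs for this setup."""
    required = set(BASE)
    if tabular:
        required |= TABULAR
        if not paging_disabled:
            # Lock pages in memory is needed only when paging is turned off.
            required.discard("SeLockMemoryPrivilege")
    if failover_cluster:
        required |= FAILOVER
    return required


def missing_privileges(granted: Iterable[str], tabular: bool,
                       failover_cluster: bool,
                       paging_disabled: bool = False) -> Set[str]:
    """Return the required rights that are not in `granted`."""
    needed = required_privileges(tabular, failover_cluster, paging_disabled)
    return needed - set(granted)
```

For a tabular instance whose account has only "Log on as a service", `missing_privileges(["SeServiceLogonRight"], tabular=True, failover_cluster=False)` returns the two remaining tabular rights.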

Drone error: Login Failed. User limit reached

Recently, some colleagues started working in my team, so I showed them the basics of Drone, but when they tried to access our Drone server they got this message:
Login Failed. User limit reached
We log in via GitHub and they have access to the repositories. In fact, one of them committed something which ran the job without any problems; he just could not see it as he could not log in. Any ideas on why he gets that message? I have checked our configuration and it doesn't seem to have any limit on the number of users on Drone.
It looks like I reached the limits of the trial license.
I checked the limits of my current license at the /varz URL (e.g. https://cloud.drone.io/varz).
For details on user seats and repos, see https://docs.drone.io/enterprise/usage/
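A rough way to automate that check is sketched below. It assumes the /varz endpoint returns JSON with a `license` object containing `kind`, `seats` and `seats_used` fields; those field names are an assumption and should be verified against your own server's output. The sample payload is made up for illustration and is not real server output.

```python
import json
import urllib.request
from typing import Any, Dict


def fetch_varz(url: str) -> Dict[str, Any]:
    """Fetch and decode the /varz document (may require authentication)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def license_summary(varz: Dict[str, Any]) -> Dict[str, Any]:
    """Summarise seat usage; the field names are assumed, not confirmed."""
    lic = varz.get("license") or {}
    seats = lic.get("seats", 0)
    used = lic.get("seats_used", 0)
    return {
        "kind": lic.get("kind", "unknown"),
        "seats": seats,
        "seats_used": used,
        # Treat seats == 0 as "no seat limit reported".
        "limit_reached": seats > 0 and used >= seats,
    }


# Illustrative payload only.
SAMPLE_VARZ = {"license": {"kind": "trial", "seats": 5, "seats_used": 5}}

if __name__ == "__main__":
    print(license_summary(SAMPLE_VARZ))
```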

MySQL error 1449 reappearing even though definer was set to resolve initial error?

On Monday I messed up a database.
We have an application running on a VPS, using cPanel and phpMyAdmin, and I informed the developers that I would be running some queries on the DB to extract information.
So, I did a few large queries using the "Visual Builder" query tool and the web-application got stuck. The queries weren't loading and even refreshing the page did not work. The website wasn't loading and users couldn't log in. So I used WHM to log in as root and kill the queries manually. After I did this, the system was still not running.
Then, the database completely freaked out and I got these error messages:
After doing this, the DB somehow fixed itself and the web application was working again. However, we saw that we could not update some jobs or add new jobs in the system. If you pressed the "SAVE" button on a job, the system just gave an "undefined" message.
The developers had a look and discovered this was causing the issue:
The devs went ahead and added the definer and the issue was resolved. The blacked-out "user"@"1.0.0.0" is the actual cPanel account username.
However, this did not last: yesterday evening the exact same situation occurred again. The web application was running fine on Tuesday and most of Wednesday, then all of a sudden users couldn't update their jobs again, which means the definer user was removed once again even though nobody did anything in the database.
Has anyone encountered this issue before? I read this thread on the topic and even though what they say makes sense, I believe the developers did this but the error still occurred.
When I log into phpMyAdmin via cPanel, I get a weird user called "cpses_234ikjih@localhost.com". Does this perhaps have something to do with this error? I believe that before the server went crazy, this user was simply the name of the cPanel account (for example, "cPanelAccountName@localhost.com").
To summarize your post, what I'm seeing is that you have a MySQL user, the user disappeared, you recreated the user, and it went away again.
There must be some external factor here. Someone could have access to your database and is deleting the user maliciously or out of misunderstanding, there could be a scheduled job, or it could be something to do with your web host.
I'd start by auditing the database accounts, and restricting access as much as possible. Check any interface that's exposed to the web, such as WordPress, Joomla, or other applications.
You should also enable logging; MySQL offers several degrees of it. I think the most useful for you would be the audit log, although honestly I've never used that specifically. Enabling it will log future events. The binary log may contain a record of what has already occurred.
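As part of that audit, it can help to list which stored objects name a definer that no longer exists as an account, since that is the situation error 1449 reports. The sketch below is illustrative only: the queries read the `DEFINER` columns of the standard `information_schema` tables, and the helper name and sample values are made up for the example. Running the queries and feeding the rows in is left to whatever MySQL client you already use.

```python
from typing import Iterable, List, Tuple

# Queries whose result rows can be fed to orphaned_definers().
DEFINER_QUERIES = {
    "views": "SELECT CONCAT(TABLE_SCHEMA, '.', TABLE_NAME), DEFINER "
             "FROM information_schema.VIEWS",
    "triggers": "SELECT CONCAT(TRIGGER_SCHEMA, '.', TRIGGER_NAME), DEFINER "
                "FROM information_schema.TRIGGERS",
    "routines": "SELECT CONCAT(ROUTINE_SCHEMA, '.', ROUTINE_NAME), DEFINER "
                "FROM information_schema.ROUTINES",
}
ACCOUNTS_QUERY = "SELECT CONCAT(User, '@', Host) FROM mysql.user"


def normalise(account: str) -> str:
    """Strip quoting so 'user'@'host' and user@host compare equal."""
    return account.replace("'", "").replace("`", "")


def orphaned_definers(objects: Iterable[Tuple[str, str]],
                      accounts: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (object, definer) pairs whose definer is not a known account."""
    known = {normalise(a) for a in accounts}
    return [
        (name, normalise(definer))
        for name, definer in objects
        if normalise(definer) not in known
    ]
```

Running this check on a schedule and recording the results would also show when the account disappears, which narrows down the external factor.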
SOLVED
I managed to solve this by changing the MySQL database password and the cPanel account password.
I read a post by someone saying that there was a session file which perhaps stored an old session, and that changing passwords could resolve this. Luckily it did; I have not had error 1449 appear for 5 days now.

batch job occasionally gets "does not have bigquery.jobs.create permission" error after multiple successful queries

I have a Python batch job running as a service account using bq.cmd to load multiple datastore backups.
It has been running successfully for 2 years, but recently in the middle of some runs (after multiple successful loads by the same user into the same dataset) it fails, continuously returning: "does not have bigquery.jobs.create permission".
Restarting the job, with no changes, usually succeeds.
bq.cmd load --quiet --source_format=DATASTORE_BACKUP --project_id=blah-blah --replace project-name:data_set_name.TableName gs://project-datastore-backup/2018-08-30-03_00_01/blahblah.TableName.backup_info
gcloud components are up to date.
Any suggestions welcome
There's a public bug with a similar issue which was resolved by recreating the service account. If you don't see any actual changes to the IAM permissions occurring in the logs, then I'd try with a new service account.
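Since restarting the job usually succeeds, one stopgap while investigating is to retry only the failing load when this specific message appears. The sketch below is a workaround under assumptions, not a fix for the underlying cause: the function name, attempt count and delays are made up for illustration, and matching on the error text is a heuristic.

```python
import subprocess
import time
from typing import Callable, Sequence


def run_with_retry(cmd: Sequence[str], attempts: int = 3,
                   base_delay: float = 5.0,
                   runner: Callable = subprocess.run,
                   sleep: Callable[[float], None] = time.sleep):
    """Run cmd, retrying with exponential backoff on the permission error."""
    for attempt in range(1, attempts + 1):
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result
        output = (result.stdout or "") + (result.stderr or "")
        # Retry only the intermittent error from the question; fail fast
        # on anything else, and give up after the last attempt.
        if "bigquery.jobs.create" not in output or attempt == attempts:
            raise RuntimeError(
                f"command failed after {attempt} attempt(s): {output.strip()}"
            )
        sleep(base_delay * 2 ** (attempt - 1))
```

The `runner` and `sleep` parameters exist so the wrapper can be exercised without calling bq.cmd or actually waiting.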

What permissions/policies are needed to support loadUserProfile=true for new application pools?

Something happened on my development workstation (Windows 8.1) in the last few weeks which requires me to either run my App Pools with the "Load User Profile" setting at False or not run with the identity set to ApplicationPoolIdentity. If I were to create a new app pool, using ApplicationPoolIdentity as the identity and with loadUserProfile=true, the following happens when trying to load the application in a browser:
A number of errors in the Windows Event Log (both System and Application types):
Warning event 1509 - Windows cannot copy file \\?\C:\Users\Default\AppData\Local\Microsoft\VSCommon\12.0\SQM\sqmdata-7236-039-00000.sqm to location \\?\C:\Users\[Name of App Pool]\AppData\Local\Microsoft\VSCommon\12.0\SQM\sqmdata-7236-039-00000.sqm. This error may be caused by network problems or insufficient security rights.
Error event 1511 - Windows cannot find the local profile and is logging you on with a temporary profile. Changes you make to this profile will be lost when you log off.
Another 1509 warning
Error event 1500 - Windows cannot log you on because your profile cannot be loaded. Check that you are connected to the network, and that your network is functioning correctly. DETAIL - Only part of a ReadProcessMemory or WriteProcessMemory request was completed.
5 x event 5022 warnings - The Windows Process Activation Service failed to create a worker process for the application pool '[App Pool Name]'. The data field contains the error number.
Finally an error 5002 - Application pool '[App Pool Name]' is being automatically disabled due to a series of failures in the process(es) serving that application pool.
The App Pool is shut down, as error 5002 said.
"HTTP Error 503. The service is unavailable." is then seen in the browser. Any further requests are met with the same (which makes sense since the app pool is shut off).
I've seen a common "fix" for this here and here which basically say to turn off profile loading. Yes, it makes the problem go away, but this doesn't get to the root cause. I know that it is possible to run with this configuration as I have a Windows 2012 machine which supports the configuration just fine. In this case, hitting an app with a new app pool set to ApplicationPoolIdentity and loadUserProfile=true actually creates the new user profile (I can watch as the profiles folder is created in C:\Users) and the app runs merrily. What's worse is I know this configuration worked on the problem machine just a few weeks ago. I have a number of App Pools I created which have their own profiles and folders under the C:\Users folder. These app pools work just fine NOW with the ApplicationPoolIdentity and loadUserProfile=true settings. It's just that NEW app pools refuse to run and load a user profile.
Does anyone have any insight into what might be going on?
Edit: I read the bottom of this recent article. It's a bit contradictory in saying that the setting can be turned on, but also says:
Only the standard application pools (DefaultAppPool and Classic .NET AppPool) have user profiles on disk. No user profile is created if the Administrator creates a new application pool.
However, if you want, you can configure IIS application pools to load the user profile by setting the LoadUserProfile attribute to "true".
I'm very confused.
The SQM file listed in the event log warning was created by a Windows or Visual Studio update. When the user profile service or application pool runs and tries to create a new profile, it tries to copy the file to the profile. The SQM file requires administrator permissions to copy. The user profile service or application pool does not have sufficient permissions to copy the file, an error is generated, and the user profile is not created. Without a user profile, the application pool cannot run because it doesn't have an isolated secure place to store data.
Remove or delete the SQM file from the source directory, and the user profile will be created successfully when the app pool is initialized. You can also change the permissions on the SQM file, but I'm not sure what the appropriate permissions should be. The user profile service runs as "LocalSystem Account". See its documentation for permission info. It's unclear to me whether the application pool identity itself is being used to perform the copy operation, or the local system account.
If you remove the file from the source directory, you could also manually copy the file where it was trying to go as well.
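To find such files before deleting anything, something like the following sketch could be used. It assumes the default profile lives at C:\Users\Default, as in the event log path above; the function names and the idea of moving files to a quarantine folder rather than deleting them are choices made for the example. It would need to run with sufficient rights to read and move the files.

```python
import shutil
from pathlib import Path
from typing import Iterable, List

# Default profile location taken from the event log message above.
DEFAULT_PROFILE = Path(r"C:\Users\Default")


def find_sqm_files(profile_root: Path = DEFAULT_PROFILE) -> List[Path]:
    """Return every *.sqm file below the given profile folder."""
    return sorted(profile_root.rglob("*.sqm"))


def move_aside(files: Iterable[Path], quarantine: Path) -> List[Path]:
    """Move the files into a quarantine folder instead of deleting them."""
    quarantine.mkdir(parents=True, exist_ok=True)
    moved = []
    for source in files:
        target = quarantine / source.name
        shutil.move(str(source), str(target))
        moved.append(target)
    return moved
```

Moving the files keeps them recoverable in case something else turns out to depend on them.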
After a very brief search about what SQM is, it seems to stand for "Software Quality Metrics". Usually such a file would contain metrics, logs, or the like to send back to the program authors. I don't know if this is the case with this file or not, but it doesn't seem important to include it in the new profile.
I can't take 100% credit for this answer, as I was tipped off by a comment attached to an answer on some other question. I can't find the link to it in the 50 browser tabs open for troubleshooting this. That guy deserves a thank you, because I believe this is a much better solution than compromising the security of a server by pooling all the resources together like in IIS 6.
P.S. As noted in your comment, a bug report has been filed.