GCP Server crashing with "cloudsql.enable_instance_password_validation" - sql

Over the past day I have been experiencing this error very frequently which results in the cloud instance needing to be reset in order to continue connections:
ERROR: unrecognized configuration parameter "cloudsql.enable_instance_password_validation"
This is operating on a PostgreSQL 14 GCP Cloud SQL community shared 1 vCPU, 0.614 GB instance but was also tested on the standard 1 vCPU, 3.7 GB instance where the problem persisted.
The only code that has changed since this occurrence is a listen/notify call with a Golang PGX pool interface which has been reverted and the problem persists.
The problem hits regularly with any database calls (within 30 mins of a reset) and I have not set anything involving "enable_instance_password_validation" - I am also unable to find any parameters involving this name.

This was an interesting one. @enocom was correct in pointing out that the error reported by PostgreSQL was a red herring. There were several red herrings along the way, because the problem spanned both the Go server instance and the PostgreSQL server.
The real reason for the crashes was that the Golang PGX pool interface for interacting with the Postgres DB had a maximum 'DBMaxPools' cap of 14 connections by default. When we get to this number of connections the server hangs without any error log and all further connections are refused perpetually.
To solve this problem all that is required is adding a 'pool_max_conns' within the dbURI when calling:
dbPool, err := pgxpool.Connect(context.Background(), dbURI)
This can be done with something like:
dbURI = fmt.Sprintf(`postgresql://%s:%s@%s:%s/%s?pool_max_conns=%s`, config.DBUser, config.DBPassword, config.DBAddress, config.DBPort, config.DBName, config.DBMaxPools)
Detailed instructions can be found here: pgxpool package info
When setting the PGX max connections, be sure to set this number equal or close to the maximum number of concurrent requests for the server instance, to avoid this happening again. For GCP Cloud Run instances, this can be set within Deployment Revisions, under 'Maximum requests per container'.

Related

JDBC - SQLException : The cursor has been previously released and is unavailable

I have a working program in Java that has been in production for a long time now. It updates a few thousand records in the database twice per day.
After the new year, when the database took a hit (a lot of processing happens on the 1st) and I updated the other parts of the code to a new version (the whole process consists of 5 programs that are run together in an Eclipse project; this is the 3rd of the 5, and I did not change this program even a little bit), I started getting this SQL exception:
The cursor has been previously released and is unavailable
Where does the exception happen?
While iterating ResultSet, doesn't matter how many rows it already read. (can happen on third, can happen on 2000th row).
ResultSet is created on connection that was used before and is read only.
ResultSet is created on newly created Statement object.
->Updates on the same table are done with another write-only connection transaction-wise.
This example is probably not reproducible.
Database: IBM Informix Dynamic Server Version 14.10.FC7
Eclipse version: 2021-12 (4.22)
Java version: 1.8.0_131
JDBC driver version: 4.50.1
readCon = DriverManager.getConnection(url, user, passwd);
writeCon = DriverManager.getConnection(url, user, passwd);
Statement st = readCon.createStatement();
ResultSet rs = st.executeQuery(select from table_X....);
while (rs.next()) {
    // commit is not happening if transaction didn't begin
    writeCon.<commit transaction, begin transaction>
    writeCon.UpdateUsingPreparedStatement(update table_X...)
}
...
NOTE: This program runs smoothly, without any problems, when the process is run again starting from this program (from step 3).
What did I learn from trying to search how to solve this?
I didn't find much on the Internet; the only solution suggested was to update the JDBC driver to 4.50.1 (which I am already using).
In almost all cases where I have seen this comes down to 2 types of problems.
Concurrent use of the same JDBC objects. This is almost always the case. The program has threads and the threads are reusing connections or Statement objects. This often blows up on you in high concurrent environments because you end up operating on the same internal statement id and one thread is then closing the statement/cursor on your other Thread so it looks like the statement closed on you suddenly.
As you say you do have 2 connections but do make absolutely sure nothing is shared among threads. I've debugged a number of customer applications that thought they had proper separation but in fact did end up sharing some objects among threads. Turning on the SQLIDEBUG or instructing the driver to dump the protocol tracing events will show who sent the close on the statement. Support teams can help with this analysis. Usually when I do this, I find the close was sent by another thread right in the middle of the work you really wanted done.
Much rarer, another issue will occasionally cause the cursor to get closed, but in those cases you would see very obvious prior exceptions from the JDBC driver and/or the server before you hit this "statement already closed" error. It could be that you hit the problem described in "WARN - Failed to getImportedKeys The cursor has been previously released and is unavailable", in which case upgrading the driver does fix it.
My guess is that you have connection objects shared among threads that don't clash 99% of the time, but when the system gets really busy that 1% shows up and causes the issue you are seeing.

SQL server 2008 replication without reinitialize

I have two databases in different servers - center_db on siglv01\sql2008 and center_db on sig\sql2008.
Can I restart replication without needing to reinitialize it? The connection dropped more than 3 days ago and is now too slow: so I want to start replication without a reinitialize.
Based on the brief conversation above, I don't think you can do this without a re-init. Specifically, the distribution database only keeps so many commands before it starts trimming. The default is 72 hours. If the last command delivered to all of your subscribers is older than that, the distribution database doesn't have what it needs to play forward all of the activity that has happened since then.
Your only hope would be if the distribution agent is still running (it knows when the above situation happens and will give you an error saying as much). If so, try to figure out why delivery is slow (troubleshoot this like any other "slow application"; replication isn't magic) and see if it can get caught up that way. Depending on how many commands remain undelivered, it may be faster to just re-init.

SQL Server - Timed Out Exception

We are facing a SQL timeout issue, and I found that the error event ID is either 5586 or 3355 (unable to connect / network issue). I could also see a few other DB-related error event IDs (3351 and 3760, permission issues) reported at different times.
What could be the reason? Any help would be appreciated.
Can you elaborate a little? When is this happening? Can you reproduce the behavior or is it sporadic?
It appears SharePoint is involved. Is it possible there is high demand for a large file?
You should check for blocking/locking that might be preventing your query from completing. Also, if you have lots of computed/calculated columns (or just LOTS of data), your query may take a long time to compute.
Finally, if you can't find something blocking your result or optimize your query, it's possible to increase the timeout duration (set it to "0" for no timeout). Do this in Enterprise Manager under the server or database settings.
Troubleshooting Kerberos Errors. It never fails.
Are some of your webapps running under either the Local Service or Network Service account? If so, and your databases are not on the same machine (i.e. SharePoint is on machine A and SQL on machine B), authentication will fail for some tasks (e.g. timer job related actions) but not all. For instance, it seems content databases are still accessible (weird, I know, but I've seen it happen).

Stop Monitoring SQL Services for Registered Servers in SSMS

Question: Is it possible to stop SSMS from monitoring the service status of registered servers?
Details:
SSMS 2008 monitors the service status of every registered server. From what I have seen, it seems to reach out to every registered server every minute or so to check its status; in my case that is over 100 servers. This process has raised issues with our Security and Network departments. Network initially identified it as suspicious traffic because it appeared as if an unknown utility was scanning the network for SQL Servers. Security was concerned because the Security Event Logs on each server are being filled up with my logon events.
I have looked all over for a setting but can't seem to find one. Am I missing it somewhere?
TIA,
Brian
I finally found an answer!!
While it is not possible (at least that I've found) to stop SSMS from checking the service status of registered servers it is possible to change the interval at which it checks it.
The short version is to create the following registry keys (DWORD):
(SQL Server 2008)
HKLM\Software\Microsoft\Microsoft SQL Server\100\Tools\Shell | PollingInterval = 600 (decimal)
(SQL Server 2005)
HKLM\Software\Microsoft\Microsoft SQL Server\90\Tools\Shell | PollingInterval = 600 (decimal)
This will make SSMS connect automatically every minute instead of every few seconds.
See this MS Connect Post for details.
Since it doesn't appear that there's any way to stop these status checks by SSMS, can you focus on helping them to see their harmlessness?
Can the network group allow certain exceptions to this particular rule (pinging servers on port 1433) in their scanning software, which would allow you and your group to monitor SQL Server uptime? Even if you weren't using SSMS, this type of sweeping monitoring activity is pretty common, and you'll know the requests will only ever come from a handful of workstations.
I don't think these SQL status checks generate any more events in the security log than any other activity, so maybe they were just concerned because it was something they weren't expecting. Could the security group be convinced that these events aren't dangerous, again as long as they're coming from certain approved workstations?
If neither of these is an option (or even if it is), you could help mitigate the problem by not connecting to all your SQL servers at once. Maybe just connect to the ones you need at the time - it looks like loading the entire list actively connects to each of them, but just connecting to the ones you intend to use in that session might help reduce the number of network sessions open.
I hope this helps - if it doesn't, or you've got some additional input that might help find a workaround, please post it!

SQL Server Express Idle Mode Partial Data Returns?

I'm attempting to help our network engineers troubleshoot a situation for one of our clients. This client purchased a point-of-sale system from quite literally a "mom-and-pop" vendor, and said vendor recommended SQL Server Express 2005 as the back-end database to save the client from having to incur extra licensing fees. (Please don't get me started on that!)
We didn't write the app, and because it's a commercial app, we have no source code available. (Not that it would help us if we did; the thing was built in PowerBuilder, so we don't have tooling for it.) The app does none of its own logging, that we can ascertain. All we have to go on is SQL Server Express's own logging.
In the application, an end user swipes a membership card. Occasionally (a few times a day), the swipe will not return data from the database. The message on screen will say, "Member 123 not found." (The member numbers are actually six digits, "000123.") A rescan immediately afterward returns the member data correctly.
We've eliminated the scanner itself as a source of issues -- it routinely scans the full six-digit number. A scan of SQL Server Express's log indicates that it is coming back online from being idle, often at the point of the scan (but also at several other times per day). (Idle mode is explained here.)
I understand that allocating/deallocating RAM the way SQL Express does is a time-consuming process, especially if we're talking about hundreds of megabytes at a time -- which appears to be the case.
What we're not sure of is whether or not we're getting back partial data, or if the app is simply failing to connect to the database and displaying a generic error message. Since everything is so opaque, and the client is (for obvious reasons) unwilling to pay us to sit in their facility for 8 hours or so to physically see it happen (perhaps with network monitoring/packet sniffing tools), we're kind of at a loss.
At this point, our recommendation is that the client upgrade to SQL Server 2005 Workgroup Edition, with 5 CALs. But that doesn't completely sit well with me as the solution to this issue, because I'm reasonably certain that no SQL Server ever returns partial data -- if you can't connect, you can't connect. (That said, I still recommend it because it's a solution to a number of their other issues!)
I don't have much experience with Express. (I never use it for anything but local development, and there only at home; I certainly never recommend it to my clients.)
My question to those who might have experience with Express is, have you ever seen an instance of SQL Express return partial data, without the app itself being the cause of it? Specifically, have you seen this behavior when returning from idle mode?
(For what it's worth, we're inclined to believe that the app is failing to connect and merely displaying a generic error message, lopping off leading zeroes on the member ID when it does. That seems the most reasonable answer -- a third question might be, do you guys concur with that assessment?)
I've never heard of or experienced SQL Server Express returning partial data. It's essentially the same code base as the full SQL Server.
It is more likely that the application is experiencing a timeout (which defaults to 30 seconds) due to SQL Server Express going idle. The application probably receives a timeout that it does not expect and does not handle it well.
The problem and possible solutions are discussed in this forum thread: http://social.msdn.microsoft.com/forums/en-US/sqlexpress/thread/a8fbf8d6-9949-47a5-a32b-50f8131f1127/
I suspect you have a connection string that looks like this:
Data Source=.\SQLEXPRESS; Integrated Security=True;AttachDbFilename=|DataDirectory|\myDatabase.mdf;User Instance=True
From the referenced thread:
This connection string will cause an initial connection to the main instance (.\SQLEXPRESS) and then instruct the main instance to spawn a new instance of SQL Server under the user's context and attach the database specified to that new User Instance. The User Instance is a completely separate running instance of SQL Server from the main instance that is unique to the user and that will be shut down when there are no longer any connections to it.
This is totally different than attaching a database to the main instance, which stays running at all times, unless you've manually shut it down. If your question is about the main instance going into an Idle state, then your question is not unique to SQL Express and you should ask this question in the Database Engine forum. I believe all Editions of SQL Server have an Idle state and the other forum would be where you can find out how to affect that behavior.