I have a simple Python script that is called by a remote procedure in BigQuery through an Apache Spark external data connection, which is provided as part of the public preview.
print("Hello World!")
I created a remote procedure in BigQuery, in a public dataset called public, using:
CREATE OR REPLACE PROCEDURE
`tumult-labs.public.helloworld`()
WITH CONNECTION `tumult-labs.us.bigspark` OPTIONS (engine='SPARK',
main_file_uri='gs://analytics-procedures/helloworld.py')
LANGUAGE python
and then invoked it with:
CALL `tumult-labs.public.helloworld`()
This will simply show up as Hello World! in the logs as expected.
I can call this remote procedure from within my organization, but how do I expose the Apache Spark external data connector so that people from outside the organization can call and use my remote procedures as well?
I created a user from outside the organization (a private Gmail account) that has been granted access to the connector's public preview, and that user can create and run the script outlined above. However, when the private Gmail account tries to run the tumult-labs.public.helloworld procedure, it encounters this error:
Access Denied: Connection projects/tumult-labs/locations/us/connections/bigspark: User does not have bigquery.connections.use permission for connection projects/tumult-labs/locations/us/connections/bigspark. at [1:1]
The dataset that the remote procedure resides under, tumult-labs.public, is shared with allUsers, but the issue now is that there is no permission setting that makes the external data connection tumult-labs.us.bigspark public.
Is there a way to make the procedure use the caller's own Apache Spark external data connection, or to make tumult-labs's us.bigspark connection public?
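For reference, a per-user grant of the BigQuery Connection User role (which carries bigquery.connections.use) would presumably look like the sketch below (the Gmail address is a placeholder), but that is per-user sharing rather than the allUsers-style grant I am after:
gcloud projects add-iam-policy-binding tumult-labs \
  --member="user:external.user@gmail.com" \
  --role="roles/bigquery.connectionUser"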
Related
I want to set up a .NET Core web application on Cloud Run with a Google Cloud SQL database. I easily deployed the database, which has a public IP, on Cloud SQL, and my web application with a Docker container on Cloud Run. I can access the database with SQL Server Management Studio without any difficulties, and the web app is up and running as expected. The only piece missing is the link between them that allows them to connect.
In my web app, I have a connection string in this format:
Data Source=***;Initial Catalog=***;User ID=***;Password=***;Pooling=true;Trusted_Connection=false;Connection Timeout=60;Integrated Security=false;Persist Security Info={0};Encrypt=true;TrustServerCertificate=true;MultipleActiveResultSets=true;
Once I have the public IP and the connection name from Cloud SQL, what exactly should the connection string be, and what are the next steps?
Furthermore, in the connections tab under Cloud Run Service, I added the Cloud SQL connection. This is supposed to configure a Cloud SQL Proxy for me.
In order to connect to Cloud SQL from Cloud Run, you must follow this guide
You have already made some configurations in the Connections tab, as stated in the Configuring Cloud Run section. Since you configured your instance with a public IP, you can check the Public IP part of the guide to be sure that all steps were followed.
Briefly, the steps are:
Configure the service account for your service. Make sure that the service account has the appropriate Cloud SQL roles and permissions to connect to Cloud SQL.
The service account for your service needs one of the following IAM roles:
Cloud SQL Client (preferred)
Cloud SQL Admin
If the authorizing service account belongs to a different project than the Cloud SQL instance, the Cloud SQL Admin API and IAM permissions will need to be added for both projects.
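For illustration, granting the preferred role might look like this (the project ID and service account address are placeholders):
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" \
  --role="roles/cloudsql.client"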
Like any configuration change, setting a new configuration for the Cloud SQL connection leads to the creation of a new Cloud Run revision. Subsequent revisions will also automatically get this Cloud SQL connection, unless you make explicit updates to change it.
Go to Cloud Run.
Configure the service:
  If you are adding Cloud SQL connections to an existing service:
    Click on the service name.
    Click on the Connections tab.
    Click Deploy.
  Enable connecting to a Cloud SQL instance:
    Click Advanced Settings.
    Click on the Connections tab.
    If you are adding a connection to a Cloud SQL instance in your project, select the desired Cloud SQL instance from the dropdown menu.
    If you are deleting a connection, hover your cursor to the right of the connection to display the Trash icon, and click it.
  Click Create or Deploy.
After you've double-checked the steps above, you can continue with the Connecting to Cloud SQL section, following the steps on the Public IP tab.
Connect with Unix sockets
Once correctly configured, you can connect your service to your Cloud SQL instance's Unix domain socket accessed on the environment's filesystem at the following path: /cloudsql/INSTANCE_CONNECTION_NAME.
The INSTANCE_CONNECTION_NAME can be found on the Overview page for your instance in the Google Cloud Console or by running the following command:
gcloud sql instances describe [INSTANCE_NAME].
These connections are automatically encrypted without any additional configuration.
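For example, assuming a hypothetical project and instance, an instance connection name has the form my-project:us-central1:my-instance, so the socket path would be:
/cloudsql/my-project:us-central1:my-instance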
The code samples shown below are extracts from more complete examples on the GitHub site. To see this snippet in the context of a web application, view the README on GitHub.
// Equivalent connection string:
// "Server=<dbSocketDir>/<INSTANCE_CONNECTION_NAME>;Uid=<DB_USER>;Pwd=<DB_PASS>;Database=<DB_NAME>;Protocol=unix"
String dbSocketDir = Environment.GetEnvironmentVariable("DB_SOCKET_PATH") ?? "/cloudsql";
String instanceConnectionName = Environment.GetEnvironmentVariable("INSTANCE_CONNECTION_NAME");
var connectionString = new MySqlConnectionStringBuilder()
{
    // The Cloud SQL proxy provides encryption between the proxy and instance.
    SslMode = MySqlSslMode.None,
    // Remember - storing secrets in plain text is potentially unsafe. Consider using
    // something like https://cloud.google.com/secret-manager/docs/overview to help keep
    // secrets secret.
    Server = String.Format("{0}/{1}", dbSocketDir, instanceConnectionName),
    UserID = Environment.GetEnvironmentVariable("DB_USER"),     // e.g. 'my-db-user'
    Password = Environment.GetEnvironmentVariable("DB_PASS"),   // e.g. 'my-db-password'
    Database = Environment.GetEnvironmentVariable("DB_NAME"),   // e.g. 'my-database'
    ConnectionProtocol = MySqlConnectionProtocol.UnixSocket
};
connectionString.Pooling = true;
// Specify additional properties here.
return connectionString;
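As a usage sketch (assuming the MySQL ADO.NET client the builder above comes from; MySqlConnection and MySqlCommand exist in both MySql.Data and MySqlConnector), the builder can then be turned into an open connection:
// Build the final connection string and open a pooled connection.
using (var connection = new MySqlConnection(connectionString.ConnectionString))
{
    connection.Open();
    // Run a trivial query to verify connectivity over the Unix socket.
    using (var command = new MySqlCommand("SELECT NOW();", connection))
    {
        var serverTime = command.ExecuteScalar();
    }
}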
Google recommends that you use Secret Manager to store sensitive information such as SQL credentials. You can pass secrets as environment variables or mount as a volume with Cloud Run.
After creating a secret in Secret Manager, update an existing service with the following command:
gcloud run services update SERVICE_NAME \
--add-cloudsql-instances=INSTANCE_CONNECTION_NAME \
--update-env-vars=INSTANCE_CONNECTION_NAME=INSTANCE_CONNECTION_NAME_SECRET \
--update-secrets=DB_USER=DB_USER_SECRET:latest \
--update-secrets=DB_PASS=DB_PASS_SECRET:latest \
--update-secrets=DB_NAME=DB_NAME_SECRET:latest
See also:
GoogleCloudPlatform/dotnet-docs-samples on GitHub
I have enabled Private Link by setting the "Deny public network access" knob to Yes in the Firewall settings on my Azure SQL Database server. Everything is working as expected except external data sources (external tables). The external tables are simply links to tables in another Azure SQL database that belongs to the same server. Before I enabled Private Link, everything worked fine. If I try to query the external tables, I get this error message:
"Error retrieving data from [mydbserver].database.windows.net.[mydbname]. The underlying error message received was: 'Reason: An instance-specific error occurred while establishing a connection to SQL Server. Connection was denied since Deny Public Network Access is set to Yes (https://learn.microsoft.com/azure/azure-sql/database/connectivity-settings#deny-public-network-access). To connect to this server, use the Private Endpoint from inside your virtual network (https://learn.microsoft.com/azure/sql-database/sql-database-private-endpoint-overview#how-to-set-up-private-link-for-azure-sql-database)."
I can't find anything in the docs about any limitation regarding external data sources and external tables in combination with Private Link setup.
The external tables were created in the standard way, using "CREATE EXTERNAL DATA SOURCE" and "CREATE EXTERNAL TABLE". I have also tried to recreate the data source and the tables after enabling Private Link, but the error remains.
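For reference, a minimal sketch of that pattern, with all object names and credentials replaced by placeholders:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<StrongPassword>';
CREATE DATABASE SCOPED CREDENTIAL RemoteCred
    WITH IDENTITY = 'remote_user', SECRET = '<RemotePassword>';
CREATE EXTERNAL DATA SOURCE RemoteDb WITH (
    TYPE = RDBMS,
    LOCATION = 'mydbserver.database.windows.net',
    DATABASE_NAME = 'otherdb',
    CREDENTIAL = RemoteCred
);
CREATE EXTERNAL TABLE dbo.RemoteTable (
    Id INT,
    Name NVARCHAR(100)
) WITH (DATA_SOURCE = RemoteDb);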
Want to reiterate the answer to the same question posted on Microsoft Q&A: External tables not working when “Deny public network access” is set to Yes
The limitation is with Polybase as it currently does not support Private Link at this time. As per the PG:
Polybase does not support using private link at this time. Please direct the customer to use Managed Identity to secure the connection to Azure Storage.
Admittedly this may not be a workable solution for you, but if the data you need to access is extracted to a storage account and then imported via the method referenced by the PG, it could work. The same process can be reversed, with the endpoints flipped, and it can be done within the security of a VNet plus Managed Identity.
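As a hedged sketch of the storage-account route the PG suggests (storage URL, table, and file names are placeholders; BLOB_STORAGE data sources are the kind used by BULK INSERT):
CREATE DATABASE SCOPED CREDENTIAL MsiCred WITH IDENTITY = 'Managed Identity';
CREATE EXTERNAL DATA SOURCE BlobStage WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://mystorageaccount.blob.core.windows.net/staging',
    CREDENTIAL = MsiCred
);
BULK INSERT dbo.StagedTable
FROM 'extract.csv'
WITH (DATA_SOURCE = 'BlobStage', FORMAT = 'CSV');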
You need to use the server name yourdbserver.privatelink.database.windows.net in the external data source definition.
Afterwards you may receive another error saying that this name is incorrect. In that case you are experiencing a DNS problem, and you need to add an entry to the hosts file of your VM with the private IP of the endpoint. If your VM is outside of that VNet, it's another story: then you need to add the public IP of your endpoint to your hosts file instead. I'm still trying to solve this with a proper DNS setup and haven't figured it out yet.
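For illustration, such a hosts-file entry would look something like this (the IP address and server name are placeholders):
10.0.1.5    yourdbserver.privatelink.database.windows.net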
For more information, see:
https://techcommunity.microsoft.com/t5/azure-database-support-blog/lesson-learned-126-deny-public-network-access-allow-azure/ba-p/1244037
I am having issues enabling the repository service in the Informatica Admin Console. Here are the steps I have taken so far:
Create a new repository service with the option to create contents: it keeps spinning and after a while it times out. When I log back in, I see the repository service in the Admin Console with the option to disable it, but the service is unavailable. I am also unable to see the repository tables in the metadata schema, and I cannot connect using the PowerCenter Repository Manager either.
Create a new repository service without creating contents: a disabled repository service is created. To add/restore contents, I try to enable the service, but it keeps spinning and nothing happens. After a while it times out, and when I log back in I see the option to disable it, but the service is unavailable. Therefore I am unable to add contents.
I am looking for some helpful insight to resolve this issue.
Thanks!
While creating the repository, did you provide the database user name and password? And does that user have the necessary privileges to act as an Informatica repository user?
The database user must have the necessary privileges. Even if the credentials provided during creation are wrong, the repository service will still be created, since contents can be added at any time in Informatica. Kindly delete the repository and create a new one, providing accurate credentials.
Execute these queries to give the user the visibility Informatica needs and to increase the open-cursor limit:
ALTER SYSTEM SET OPEN_CURSORS = 1000 SCOPE = BOTH;
GRANT CONNECT, RESOURCE, CREATE VIEW, SELECT ANY DICTIONARY TO USER_NAME;
Make sure you execute these queries as the SYS user.
In case you want to fix the issue without deleting the repository you created, go to:
Actions -> Repository contents -> Create
I am trying to share a file containing a table of information pulled from an external SQL query connection. It works fine for me, as I have the connection set up on my PC, but when I send the file out it asks for connection credentials. I could go to each PC and enter the credentials, but I would prefer that end users can open the file and use it without entering any credentials, and that they can refresh the data as and when needed.
How would I set this up or is it even possible?
Thanks in advance.
Your connection string should use Windows Authentication, and the local user must be a member of a domain group that has the privileges to run the query on SQL Server.
If you go to Connection Properties, open the connection, and click on Definition, the Integrated Security tag should be set to SSPI.
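For illustration, a connection string using Windows Authentication might look like this (the provider, server, and database names are placeholders):
Provider=SQLOLEDB.1;Integrated Security=SSPI;Initial Catalog=MyDatabase;Data Source=MyServer;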
I have a sqlcmd command that writes its result to a file placed in a shared folder.
exec xp_cmdshell 'sqlcmd -S $dataSource -d $dbName -i $inputFilePath -o $outputFilePath'
Now, what if the shared drive is protected and requires a username and password?
How do I supply credentials in the sqlcmd call so that it can get past the share's authentication?
xp_cmdshell will execute under the NT (Windows) credentials of:
the impersonated login, if you are logged in using Windows credentials
the service account, if you are logged in using SQL credentials and no explicit credential object exists
the explicit credential, if you are logged in using SQL credentials associated with a credential (see CREATE CREDENTIAL)
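As a sketch, the credential route for SQL logins is typically set up through the xp_cmdshell proxy account (the account and password below are placeholders):
EXEC sp_xp_cmdshell_proxy_account 'DOMAIN\share_user', 'placeholder-password';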
If you insist on accessing a remote resource (a file share) using the default context, you're up shit creek without a paddle, as impersonated access to remote resources is a 'double hop' and requires constrained delegation for at least one of the cases (logged in using NT credentials).
A better option is to explicitly map the remote share \\server\share locally as a drive X: and then access drive X: instead. Mapping a drive locally allows for persisted credentials to be stored, but you have to be careful to make sure the mapping is visible in the service account session. Which is... basically impossible, see Map a network drive to be used by a service.
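For illustration, the mapping itself would be something along these lines (the share path and account are placeholders), subject to the service-session caveat above:
net use X: \\server\share /user:DOMAIN\share_user /persistent:yes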
Now that you know why you cannot do this properly, and before you start pulling your hair out and turning white from constantly fighting failures that are difficult to troubleshoot, stand back and look at the problem from a different angle: why do you want to use xp_cmdshell to call sqlcmd? Call sqlcmd directly, from a job/process. SQL Agent has all the support you need for this; just set the job to run under a proxy account with appropriate credentials to connect to both the remote share and the destination $datasource.
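A minimal sketch of that SQL Agent setup, assuming a CmdExec job step and placeholder names throughout:
-- Store the Windows account that can reach the share.
CREATE CREDENTIAL ShareCredential
    WITH IDENTITY = 'DOMAIN\share_user', SECRET = 'placeholder-password';
-- Expose it to SQL Agent as a proxy usable by CmdExec job steps.
EXEC msdb.dbo.sp_add_proxy
    @proxy_name = 'SqlcmdProxy', @credential_name = 'ShareCredential';
EXEC msdb.dbo.sp_grant_proxy_to_subsystem
    @proxy_name = 'SqlcmdProxy', @subsystem_name = 'CmdExec';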