So I have a number of datasets under the same GCP BQ project, and I want to allow an external user to have read-only and read/write access on a few of them, but other datasets should not be visible to him. What's the best approach for this?
P.S. Probably not going to create an email account for him under our domain, so I'm thinking service accounts.
Just figured out one way to do it:
1) Create a service account for the external user (with the BigQuery Job User role so it can be used to run queries in this project).
2) In the GCP console web UI, for each dataset to share, click "SHARE DATASET", and in the pop-up panel add the service account created in step 1 with the appropriate role (BigQuery Data Viewer or BigQuery Data Editor).
Not sure if there's a cleaner way.
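The two steps above can also be scripted instead of clicked through. A minimal sketch, assuming the `google-cloud-bigquery` Python client is installed and credentials that may update the datasets are available. The service account email and dataset names are hypothetical. In a dataset's access list, the legacy roles `READER` and `WRITER` correspond to BigQuery Data Viewer and BigQuery Data Editor.

```python
from typing import Dict, List

# Hypothetical values for illustration only.
SERVICE_ACCOUNT = "external-user@my-project.iam.gserviceaccount.com"
GRANTS: Dict[str, str] = {
    "shared_readonly": "READER",   # BigQuery Data Viewer on this dataset
    "shared_readwrite": "WRITER",  # BigQuery Data Editor on this dataset
}


def missing_grants(existing: List[Dict[str, str]], email: str, role: str) -> List[Dict[str, str]]:
    """Return the access entries still to be added (empty if already granted)."""
    wanted = {"role": role, "userByEmail": email}
    return [] if wanted in existing else [wanted]


def share_datasets(project: str) -> None:
    """Apply GRANTS to the datasets in `project`. This calls the BigQuery API."""
    from google.cloud import bigquery  # lazy import so the helper above works offline

    client = bigquery.Client(project=project)
    for dataset_name, role in GRANTS.items():
        dataset = client.get_dataset(f"{project}.{dataset_name}")
        entries = list(dataset.access_entries)
        current = [
            {"role": e.role, "userByEmail": e.entity_id}
            for e in entries
            if e.entity_type == "userByEmail"
        ]
        for grant in missing_grants(current, SERVICE_ACCOUNT, role):
            entries.append(
                bigquery.AccessEntry(
                    role=grant["role"],
                    entity_type="userByEmail",
                    entity_id=grant["userByEmail"],
                )
            )
        dataset.access_entries = entries
        client.update_dataset(dataset, ["access_entries"])
```

Datasets that are not listed in `GRANTS` are left untouched, so the service account gets no dataset-level access to them.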
Issue: In GCP IAM I have >30 users assigned the pre-defined roles BigQuery Data Viewer and BigQuery Data Editor, and now when I create a new dataset, it's automatically accessible to these 30+ users because of "policy inheritance".
Question: As BQ project admin, I want a newly created dataset only accessible to certain users (a small subset of the 30+ users). What's the best approach to do this? Thanks!
You cannot override permissions granted at higher levels. So, if you want to restrict access at the dataset level, the best approach would be to:
1) Remove the current roles BigQuery Data Viewer and BigQuery Data Editor at the project level.
2) Grant the roles again, but only at the dataset level.
This also complies with the recommended best practice of least privilege. Also, if possible, use groups to grant the permissions, as it will be easier to manage.
In addition to this, you could use another project to create the dataset and allow access to the desired subset of users; however, I wouldn't recommend this approach, as it only makes it more difficult to manage the data and the users with access to it.
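Step 1 can be prepared offline. A rough sketch, assuming the project policy was exported as JSON (for example with `gcloud projects get-iam-policy PROJECT_ID --format=json`) and will be written back with `gcloud projects set-iam-policy`. The function name and sample members are made up for illustration, and conditional bindings are not handled.

```python
import copy
from typing import Dict, List, Tuple

# Project-level roles that make every dataset readable or writable.
DATA_ROLES = ("roles/bigquery.dataViewer", "roles/bigquery.dataEditor")


def strip_project_data_roles(policy: dict) -> Tuple[dict, Dict[str, List[str]]]:
    """Return (new_policy, removed), where removed maps role -> members dropped."""
    new_policy = copy.deepcopy(policy)  # leave the caller's policy untouched
    removed: Dict[str, List[str]] = {}
    kept = []
    for binding in new_policy.get("bindings", []):
        if binding["role"] in DATA_ROLES:
            removed.setdefault(binding["role"], []).extend(binding["members"])
        else:
            kept.append(binding)
    new_policy["bindings"] = kept
    return new_policy, removed


# Hypothetical policy for illustration.
policy = {
    "bindings": [
        {"role": "roles/bigquery.dataViewer",
         "members": ["user:alice@example.com", "user:bob@example.com"]},
        {"role": "roles/bigquery.jobUser",
         "members": ["user:alice@example.com"]},
    ]
}
new_policy, removed = strip_project_data_roles(policy)
```

The `removed` mapping is the list of members who then need the same role granted again on individual datasets (step 2), ideally via a group.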
I need to grant access to BigQuery to 30 PhD students at a university on Google Cloud Platform.
Can I give each of them standalone access, i.e. one student cannot see another's work unless access is granted?
Creating one project per student is not cost-effective.
So can I set up 30 separate access controls within a single project?
The students need full access to BigQuery (create, edit, join, download, run) on their respective datasets.
The documentation is indeed confusing. Don't grant any project-level permissions. Just as Katie Sinatra said, go to the dataset in the Web UI, click the drop-down arrow, choose "Share dataset", add the email and grant "Can edit". At the time of this writing, after you do the above, the user won't be able to see the dataset in the Web UI, but they can still query it in the Web UI just fine if they specify the table fully, i.e. `project.dataset.table`. (I tested it.) The user can also manually add the dataset so that it is displayed in the Web UI. Here is how to do it: https://cloud.google.com/bigquery/bigquery-web-ui#displayprojects.
What I am still unsure about is who pays when the user runs a query after you do the above. My guess is the user is paying. If you want the dataset's original owner (the project under which the dataset was created) to be billed, then my guess is that you need to grant the project-level BigQuery Job User role to the user/email in addition to the above. Then the user will be able to select the project in the GCP console and click "BigQuery" to go to the BigQuery Web UI, and be billed under that project. (By the way, if you do this, the user can see the dataset in the Web UI.)
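On the billing question, my understanding is that a query job is billed to the project it runs in, not to the project that owns the table, and that the user needs permission to create jobs in that project. A small sketch of the distinction, assuming the `google-cloud-bigquery` client. The helper names are invented for this example.

```python
def qualified_table(project: str, dataset: str, table: str) -> str:
    """Return a backtick-quoted project.dataset.table reference for standard SQL."""
    return f"`{project}.{dataset}.{table}`"


def count_rows(billing_project: str, data_project: str, dataset: str, table: str) -> int:
    """Run a query job in `billing_project` against a table owned by `data_project`."""
    from google.cloud import bigquery  # lazy import; needs credentials at call time

    # The client's project is where the job runs, and so where it is billed.
    client = bigquery.Client(project=billing_project)
    sql = f"SELECT COUNT(*) AS n FROM {qualified_table(data_project, dataset, table)}"
    rows = client.query(sql).result()
    return next(iter(rows))["n"]
```

So a student with only dataset-level access would pass their own project as `billing_project`, while one who was also given BigQuery Job User on the shared project could pass that project instead.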
As JL-HN said, it is documented, but it is a bit confusing. To give access to a specific dataset, you only need to go to the dataset, click the drop-down arrow and choose "Share dataset". Then you only need to add the email of the student who will handle that dataset.
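For 30 students it may be easier to generate the per-student sharing plan than to click through it. A minimal sketch under a few assumptions: one dataset per student, "Can edit" expressed as the `WRITER` role in the dataset access list, and dataset IDs derived from the email's local part. The naming scheme is hypothetical, and two students with the same local part would collide.

```python
import re
from typing import Dict, List


def dataset_id_for(email: str) -> str:
    """Derive a dataset ID (letters, digits, underscores) from an email's local part."""
    local = email.split("@", 1)[0]
    return "student_" + re.sub(r"[^A-Za-z0-9_]", "_", local)


def plan_student_datasets(emails: List[str]) -> Dict[str, List[Dict[str, str]]]:
    """Map each dataset ID to an access list that lets only its student edit it."""
    return {
        dataset_id_for(email): [{"role": "WRITER", "userByEmail": email}]
        for email in emails
    }


# Hypothetical addresses for illustration.
plan = plan_student_datasets(["jane.doe@uni.example", "li-wei@uni.example"])
```

Each student appears only in their own dataset's list, so one student cannot see another's work unless an extra entry is added on purpose.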
I'm attempting to connect BigQuery to Looker. I am pulling sample data from a Google Sheets document to a BigQuery dataset; this part is working fine, as my internal BigQuery queries are running just fine for this dataset. Using this documentation from the Looker forums, I tried to create a service account key to connect my BigQuery dataset to Looker. Unfortunately, the documentation is slightly out of date: Google now asks which service account (compute engine default service account, app engine default service account, or a new service account that can have any of multiple roles) you want to attach the key to.
Thus far, I have tried using P12 keys created for the compute engine default service account, the app engine default service account, as well as a new Project Owner service account. When I create the connection in Looker, the admin page confirms that the connection "can connect, can cancel queries, can run simple select query" (I need it to do more complex things, but am just trying to connect at all right now). Using the SQL Runner to test a simple select 10 query out, I was able to query the public datasets, e.g. hacker_news or usa_names. However, whenever I tried to run the same query on my personal sample dataset, I received this error:
Failed to retrieve data - The job encountered an internal error during execution and was unable to complete successfully.
The permissions for the base Google Sheet that the BigQuery project is pulling from are set to be viewable by my coworkers who have the link. I have also been adding each service account I test as an editor (which I assume has the highest permissions). At this point, I am creating new service accounts with each of the different possible roles to see if it's a permissions issue from the role perspective. Nothing has worked so far, so any insight would be helpful!
UPDATE: I have created a new table within the same BigQuery dataset. The new table was created using a CSV file, which was simply a download of my previous table in Google Sheets. I updated the connection to Looker. When I wrote a select 10 query pulling from the new table, it worked fine and ran very quickly. This seems to imply that the problem is something about the permissions between Google Sheets and Google BigQuery.
I've been wanting to do something like this myself for a bit, saw this question, and decided to dig in.
First thing I found was this "documentation" over in the looker discourse:
https://discourse.looker.com/t/live-spreadsheets-in-databases/2698/7
In there, it describes the steps necessary to get this working.
Two important things that you are probably missing, based on your description of events so far (since it sounds like you've already attached the sheet to your dataset and are able to query it from the BigQuery UI):
Make sure you share the Google Sheet with the service account you are using to connect Looker to BigQuery. This is the Username from the Connections tab of the Admin page in Looker.
Make sure you have enabled the Drive and Sheets APIs for your Google project. You can do that via the API Library. Just search for "Drive" (or "Sheets"), click on the name, and then click on the "Enable" button on the API detail page.
Once I did the above, I had to wait a few minutes before things started working. I'll go out on a limb and guess that this was because Looker needed to cycle its internal connection pool before the new permissions took effect. So you may need to run a few failing queries, or wait out the connection pool, before this goes into effect.
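If you want to check the service account outside Looker, something like the following should reproduce the situation. It is only a sketch, assuming the `google-cloud-bigquery` and `google-auth` packages and a JSON key file for the service account (not a P12 key). The function name is invented. The relevant detail is that credentials used to query a Sheets-backed table need the Drive scope in addition to the BigQuery scope.

```python
# OAuth scopes: BigQuery alone is not enough for tables backed by Google Sheets.
SCOPES = [
    "https://www.googleapis.com/auth/bigquery",
    "https://www.googleapis.com/auth/drive",
]


def can_read_sheet_table(key_path: str, table: str) -> bool:
    """Try a one-row query against `table` (project.dataset.table) as the service account."""
    from google.cloud import bigquery  # lazy imports; needed only when called
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file(
        key_path, scopes=SCOPES
    )
    client = bigquery.Client(credentials=credentials, project=credentials.project_id)
    try:
        client.query(f"SELECT * FROM `{table}` LIMIT 1").result()
        return True
    except Exception as exc:  # report any API error instead of raising
        print(f"Query failed: {exc}")
        return False
```

If this fails while a CSV-backed table in the same dataset works, the sheet is probably not shared with the service account or the Drive API is not enabled.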
Hope that helps.
I'm trying to set up customer access to some of my BigQuery data. I'll start off with my requirements, then what I think the solution needs to be, though I'm not sure how to execute it.
Requirements
Separate billing per customer for queries
I don't want to make my dataset public
Read only access to specific datasets
Accessible via Excel connector
No access rights to my main project
They manage their own access privileges; I don't want to have to add and remove individual users from direct dataset access on behalf of all our clients.
Nice to have - Web UI access
What I've Done
Created a new Google Developer Project
Added a view-only user on that project
Added a service account
Granted access to my BigQuery dataset to the service account
Here are the options for granting dataset access from the documentation:
I imagine that I need to set up some sort of special group, but I can't figure out how to do it.
Thanks in advance!
In BigQuery there are two different concepts:
The first one is billing (for queries and any other billable activity), which is linked to a Google Cloud project.
The second one is access to a dataset.
Having said that, to fulfil your requirements you'd create a separate project for each of the customers, and grant access to the datasets in the granularity that you would want.
That way you would have the costs for each of the projects separated but billed to you. Be careful to give them only read access to the project, unless you want them to be able to create other services like VM or deploy GAE apps, as they'd be billed to you as well.
For example, grant access to dataset [MyDatasetA] to users X and Y in projects Project1 and Project2, and access to [MyDatasetB] to users Y and Z in projects Project2 and Project3.
Thus, each project is accountable for the queries their users run, and you have your access control on each dataset without it being public.
Separate billing per customer for queries. Done with the independent projects.
I don't want to make my dataset public. Done with fine-grained access control.
Read only access to specific datasets. Same as above.
Accessible via Excel connector. It should work without problems as they'd be first class BQ users.
No access rights to my main project. Again possible if they are restricted to their own projects.
They manage their own access privileges. This is trickier. I think they'd need more than read access to the datasets or more than read access to the projects to be able to add new users, if you use the project groups as access control.
Nice to have - Web UI access. Check out https://bigquery.cloud.google.com/
The project groups are groups that let you select members with Viewer, Developer or Owner roles in one click, without the hassle of adding each member manually.
You already get three groups set up for you to use: Viewers, Editors and Owners of the original project.
But you may create your own Google Groups and give those groups the permissions you want.
One hint when doing so: new users will usually need to display your project so that it appears in the BQ online browser. This is done by clicking on the arrow next to the project name in the BQ online browser, then Switch to project, then Display project, and entering the name of the project that the dataset belongs to.
Edit: Improved the explanation about Group access
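To make the group idea concrete, here is a small sketch of what the dataset access lists could look like when read access is granted to one Google Group per customer. The group addresses are hypothetical. The assumption is that each customer administers the membership of their own group, so you only touch the dataset when a customer is added or removed, not for individual users.

```python
from typing import Dict, List

# Hypothetical mapping: dataset -> customer Google Groups allowed to read it.
DATASET_READERS: Dict[str, List[str]] = {
    "MyDatasetA": ["customer1@googlegroups.com", "customer2@googlegroups.com"],
    "MyDatasetB": ["customer2@googlegroups.com", "customer3@googlegroups.com"],
}


def reader_entries(groups: List[str]) -> List[Dict[str, str]]:
    """Build read-only dataset access entries, one per group."""
    return [{"role": "READER", "groupByEmail": group} for group in groups]


def datasets_visible_to(group: str, mapping: Dict[str, List[str]]) -> List[str]:
    """List the datasets a given group can read, sorted by name."""
    return sorted(name for name, groups in mapping.items() if group in groups)


access_a = reader_entries(DATASET_READERS["MyDatasetA"])
```

Queries would still be run from each customer's own project, which keeps the billing separation described above.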
Our organization has been using SAS BI Dashboard for several months now for internal use. Now we are working on a project where roughly 100 people in other, outside organizations will need to log on to our BI Dashboard site to view an individualized dashboard for their organization. We plan to use row-level permissions in an Information Map to control who is allowed to see what in terms of the data behind the dashboard indicators.
How would you recommend creating roughly 100 individual log-ons for outside users?
Is there a way to automate the process rather than manually creating all the accounts?
If I create the log-on name and password for each outside user, how/where would I store that in Management Console?
Any help would be appreciated - our office is small enough that we do not have a dedicated IT person or fully-trained SAS administrator, so I'm in over my head. Thanks!
As an ex SAS consultant, I can tell you briefly how I have solved this problem.
First, creating the users in batch should be easy. There are tons of scripts out there that will teach you this. I would recommend creating them in your LDAP server (probably Active Directory), to have them in a central place. That way, you can treat them the same way as you do the internal users.
To get them into the metadata server, you should take a look at the macros that SAS provides for this:
The following macros are the core components used to import and synchronize user accounts from Active Directory to SAS metadata: %MDUIMPC , %MDUIMPLB , %MDUEXTR , %MDUCMP , %MDUCHGV , %MDUCHGLB. They are located in the following directory: [SAS Home]\SASFoundation\9.3\core\sasmacro.
This SGF proceeding will give you a practical description of the process:
http://support.sas.com/resources/papers/proceedings12/377-2012.pdf
As for the question you did not ask, "how to present the BI Dashboard web application to the external users": you need to set up a reverse proxy web server in a secure zone (DMZ). See this document for details: http://support.sas.com/resources/thirdpartysupport/v92m3/appservers/ApacheProxyJBoss.pdf
Hope this helps!
Stig