Running queries in BigQuery without being a project User - google-bigquery

It seems the only way to share a dataset so that another person can run queries on its tables is to make that person a project user (see permissions).
This means the user would have access to all the datasets in the project, which seems highly inconvenient. Am I missing something?

To run a query, a user needs the bigquery.jobs.create permission.
If the user already has this permission in any other project, you can simply share your dataset with that user at the Can View access level.
If the user is new and does not yet have bigquery.jobs.create in any other project, you can add the user to your project with this permission only.
You still need to add the user to the ACL of that specific dataset.
Important: if you give a user permission to create jobs in your project, you will be billed for the queries that user runs there.
If the user has their "own" project and only has view access to your data, the bill goes to the user's project.
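As a rough sketch of the dataset-sharing step, the dataset ACL can be treated as a list of entries in the shape the BigQuery REST API uses for a dataset's `access` field. The helper name `add_reader` and the e-mail addresses are made up for illustration, and the resulting list would still have to be sent to BigQuery (for example via a dataset patch), which this snippet does not do.

```python
from copy import deepcopy


def add_reader(access, email):
    """Return a copy of a dataset's access list with `email` added as READER.

    `access` follows the shape of the `access` field of a BigQuery dataset
    resource: a list of dicts such as {"role": "READER", "userByEmail": ...}.
    READER is the role behind the "Can View" access level.
    """
    entry = {"role": "READER", "userByEmail": email}
    updated = deepcopy(access)
    if entry not in updated:  # keep the operation idempotent
        updated.append(entry)
    return updated


# Hypothetical existing ACL and user, for illustration only.
current_access = [{"role": "OWNER", "userByEmail": "owner@example.com"}]
new_access = add_reader(current_access, "analyst@example.com")
```

Granting only this entry gives read access to one dataset; where the query is billed still depends on which project the user creates the job in.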

Related

Does service account need BQ job user role in the same project as datasets it will query?

My GCP expert tells me that my SA only needs the Data Viewer role in the project that holds the datasets I want to query, and that as long as it has the Job User role in any other project, the query job should work.
But when I run the query I get this error:
google.api_core.exceptions.Forbidden: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/... : Access Denied: Project ... : User does not have bigquery.jobs.create permission in project ....
So does the SA need BQ job user role in exactly the same project where the datasets are?
Your GCP expert is correct!
"as long as it has job user role in any other project ..."
You just need to make sure you are running the job from within the project where that SA has the Job User role. That project will be billed for the cost of running the job.
To avoid the error you are facing, the account needs the bigquery.jobs.create permission in the project where the job runs, as the error message says. You have two options:
1.- Create a custom role with that permission.
2.- Add the BigQuery Job User or BigQuery User role. Both of them include the bigquery.jobs.create permission you need.
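The rule discussed in both answers can be modelled in a few lines. This is only a toy model of the check, not how BigQuery actually evaluates IAM: the function name `can_run_query`, the project IDs, and the `grants` structure are invented for the example. The role IDs are the predefined roles named above.

```python
# Predefined roles named in the answer as including bigquery.jobs.create.
JOB_ROLES = {"roles/bigquery.jobUser", "roles/bigquery.user"}
# Role that lets a principal read table data (the "data viewer" role in the question).
READ_ROLES = {"roles/bigquery.dataViewer"}


def can_run_query(grants, job_project, data_project):
    """Toy check: may the principal run a job in job_project over data_project?

    grants maps project ID -> set of role names held by the principal there.
    """
    may_create_job = bool(grants.get(job_project, set()) & JOB_ROLES)
    may_read_data = bool(grants.get(data_project, set()) & READ_ROLES)
    return may_create_job and may_read_data


# Hypothetical service account: viewer on the data project, job user elsewhere.
sa_grants = {
    "data-project": {"roles/bigquery.dataViewer"},
    "billing-project": {"roles/bigquery.jobUser"},
}

# Running the job from the data project fails, as in the 403 above.
runs_in_data_project = can_run_query(sa_grants, "data-project", "data-project")
# Running it from the project with the job role works, and that project is billed.
runs_in_billing_project = can_run_query(sa_grants, "billing-project", "data-project")
```

In practice this means pointing the client at the project that holds the job role and referring to the tables by their fully qualified names.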

Google Big Query : How to get the authorization to change the authorizations on Data Sets

I would like to add a member (xxxxxxx@developer.gserviceaccount.com) to the list of members that are allowed to read the tables in a dataset.
However, in the BigQuery console, when I click on SHARE DATASET, I get the message:
**"You don't have permission to edit the permissions of the selected resource"**
However, I have permission to use the query editor and to run queries on this dataset.
How can I add a member to the list of members who can read this dataset, so that it can be accessed from a virtual machine?
You are receiving this error message because your account does not have the permissions required to share a dataset with another member.
To fix this, you (if possible) or an admin of the project, i.e. someone with the project editor or project owner role, will need to assign the permissions that allow you to share datasets.
You can see all the available permissions in this document: Predefined roles and permissions.
For a comprehensive document on controlling access to datasets, see: Controlling access to datasets.
As long as you have the appropriate roles/permissions, you should have no problems sharing BigQuery datasets.

Limit access to each other's database in Google Cloud Big Query

I need to grant access to BigQuery on Google Cloud Platform to 30 PhD students at a university.
Can I give each of them standalone access, i.e. one student cannot see another's work unless access is granted?
Creating one project per student is not very cost effective.
So can I set up 30 separate access controls within a single project?
The students need full access to BigQuery (create, edit, join, download, run) for their respective databases.
The document is indeed confusing. Don't grant any project-level permissions. As Katie Sinatra said, go to the dataset in the Web UI, open the drop-down arrow, choose "Share dataset", add the email, and grant "Can edit". At the time of this writing, after you do the above, the user won't see the dataset in the Web UI, but they can still query it in the Web UI just fine if they specify the table in full, i.e. `project.dataset.table` (I tested it). The user can also manually add the dataset so that it is displayed in the Web UI. Here is how to do it: https://cloud.google.com/bigquery/bigquery-web-ui#displayprojects.
What I am still unsure about is who pays when the user runs a query after you do the above. My guess is that the user pays. If you want the dataset's original owner (the project under which the dataset was created) to be billed, then my guess is that you need to grant the user/email the project-level BigQuery Job User role in addition to the above. Then the user will be able to select the project in the GCP console, click "BigQuery" to open the BigQuery Web UI, and be billed under the project. (By the way, if you do this, the user can see the dataset in the Web UI.)
As JL-HN said, it is documented, but it is a bit confusing. To give access to a specific dataset, you only need to go to the dataset, open the drop-down arrow and click on "Share dataset". Then add the email of the student who will handle that dataset.
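Since a shared dataset may not show up in the user's Web UI, queries have to name the table in full. Here is a small sketch of building such a query string. The function name `qualified_query` and the project, dataset, and table names are placeholders, and the backtick quoting assumes standard SQL.

```python
def qualified_query(project, dataset, table, limit=10):
    """Build a SELECT that names the table as `project.dataset.table`."""
    for part in (project, dataset, table):
        if "`" in part:
            raise ValueError("identifier must not contain a backtick")
    return f"SELECT * FROM `{project}.{dataset}.{table}` LIMIT {int(limit)}"


# Placeholder names for one student's dataset.
sql = qualified_query("university-project", "student_01", "results")
```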

BigQuery - Grant Access to Other Google Cloud Platform Projects

I'm trying to set up customer access to some of my BigQuery data. I'll start off with my requirements, then what I think the solution needs to be, though I'm not sure how to execute it.
Requirements
Separate billing per customer for queries
I don't want to make my dataset public
Read only access to specific datasets
Accessible via Excel connector
No access rights to my main project
They manage their own access privileges, I don't want to have to add and remove individual users from direct dataset access on behalf of all our clients.
Nice to have - Web UI access
What I've Done
Created a new Google Developer Project
Added a view-only user on that project
Added a service account
Granted access to my BigQuery dataset to the service account
Here are the options for granting dataset access from the documentation:
I imagine that I need to set up some sort of special group, but I can't figure out how to do it.
Thanks in advance!
In BigQuery there are two different concepts:
The first one is billing (for queries and any other billable activity), which is linked to a Google Cloud project.
The second one is access to a dataset.
Having said that, to fulfil your requirements you'd create a separate project for each customer, and grant access to the datasets at whatever granularity you want.
That way you would have the costs for each of the projects separated but billed to you. Be careful to give them only read access to the project, unless you want them to be able to create other services like VMs or deploy GAE apps, as those would be billed to you as well.
For example, grant access to dataset [MyDatasetA] to users X and Y in projects Project1 and Project2, and access to [MyDatasetB] to users Y and Z in projects Project2 and Project3.
Thus, each project is accountable for the queries its users run, and you keep access control on each dataset without making it public.
Separate billing per customer for queries. Done with the independent projects.
I don't want to make my dataset public. Done with fine-grained access control.
Read only access to specific datasets. Same as above.
Accessible via Excel connector. It should work without problems as they'd be first class BQ users.
No access rights to my main project. Again possible if they are restricted to their own projects.
They manage their own access privileges. This is trickier. I think they'd need more than read access to the datasets or more than read access to the projects to be able to add new users, if you use the project groups as access control.
Nice to have - Web UI access. Check out https://bigquery.cloud.google.com/
The project groups are groups that let you select members with the Viewer, Developer or Owner role in one click, without the hassle of adding each member manually.
You already get three groups set up for you to use: Viewers, Editors and Owners of the original project.
But you may create your own Google Groups and give those groups the permissions you want.
One hint when doing so: new users will usually need to display your project so that it appears in the BQ online browser. This is done by clicking on the arrow next to the project name in the BQ online browser, then Switch to project, then Display project, and entering the name of the project that the dataset belongs to.
Edit: Improved the explanation about Group access
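The combination of per-dataset access and customer-managed groups could be laid out as below. It is a sketch only: the group addresses and the helper `reader_entries` are invented, and the entries use the `groupByEmail` form of a dataset `access` entry from the BigQuery REST API. Each customer would then manage the membership of its own Google Group.

```python
# Which customer groups may read which dataset. Dataset names mirror the
# example in the answer; the group addresses are hypothetical.
DATASET_GROUPS = {
    "MyDatasetA": ["customer1@example.com", "customer2@example.com"],
    "MyDatasetB": ["customer2@example.com", "customer3@example.com"],
}


def reader_entries(groups):
    """Turn a list of Google Group addresses into read-only access entries."""
    return [{"role": "READER", "groupByEmail": g} for g in groups]


# One access list per dataset, ready to be merged into that dataset's ACL.
access_by_dataset = {
    dataset: reader_entries(groups) for dataset, groups in DATASET_GROUPS.items()
}
```

Keeping the mapping in one place makes it easy to review which customer can read which dataset, while adding or removing individual users happens in the group itself.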

How to create multiple repositories in Pentaho

I would like to know how to create different (multiple) repositories in Pentaho Enterprise version.
Below are some points which I would like to add.
1. Different repositories for different users, so one user can't access another user's transformations and jobs.
2. One user can't access the DB connections of other users in different repositories.
My main concern is security: one user should not be able to access or update transformations created by other users.
Is this possible? Please help me with this.
Thanks to all in advance.
This is exactly how my repos are set up. I use database repos on PostgreSQL for all my users. To create a new repo, just click the green + button at the top right of the Repository Connection dialog.
To keep users out of each others sandboxes, I create a different schema for each user and assign DB permissions accordingly. Note, the schema has to be created before you create the repo. Of course I'm DB superuser so I can get into all their repos.
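The per-user schema setup could be scripted along these lines. This is a sketch under assumptions: the user names are invented, each user is assumed to already exist as a PostgreSQL role, and each schema is simply named after its user. It only generates the SQL text; running it against the database is left out.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def schema_statements(users):
    """Yield one CREATE SCHEMA statement per user, owned by that user's role."""
    for user in users:
        ident = quote_ident(user)
        yield f"CREATE SCHEMA IF NOT EXISTS {ident} AUTHORIZATION {ident};"


# Hypothetical developer accounts.
statements = list(schema_statements(["alice", "bob"]))
```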
When you create a connection for a repo, go to the advanced tab and specify that user's schema in the 'Preferred schema name' box. Note, this connection will not appear in your list of connections stored in the repo; it's in the repositories.xml file in the .kettle directory. I also created a template xml file that I can tweak and give out to anyone who comes on board as a developer. That way they only see their repo in the connection dialog, but my repositories.xml has all of their repos.
You can do this with file based repos as well, but of course you'd handle permissions through the file system rather than the DB.
It's also true that repos can have multiple users. I use this feature when members of the same group need to share transforms. For example the Data Warehouse group is all in one repo, but each has their own directory; the other group has their own repo, etc.
I am not sure that you can create multiple instances of the same repository, but I suggest using a single repository with different users and different user-level permissions.
Your concerns can be resolved with user-level permissions on the repo.