I am new to BigQuery. My objective is to create a query that drills down to specific values.
The following query displays IAM roles, such as serviceAccountUser or serviceAccountTokenCreator.
SELECT name, iam.role AS roles, (ARRAY_TO_STRING(iam.members, "")) AS members
FROM dataset_blah_7553a5b37_Project,
UNNEST (iam_policy.bindings) AS iam
GROUP by name, roles, members
HAVING role = 'roles/iam.serviceAccountUser' OR role = 'roles/iam.serviceAccountTokenCreator'
It displays for example the following:
Row name roles members
1 //cloudresourcemanager.googleapis.com/projects/1234567890 roles/iam.ServiceAccountUser serviceAccount:xyz321#blah.com
2 //cloudresourcemanager.googleapis.com/projects/1234567890 roles/iam.ServiceAccountUser user: ericcartman#blah.com
3 //cloudresourcemanager.googleapis.com/projects/1234567890 roles/iam.ServiceAccountTokenCreator group: gcp-blah-group001#blah.com
4 //cloudresourcemanager.googleapis.com/projects/1234567890 roles/iam.ServiceAccountUser user: stanmarshn#blah.com
5 //cloudresourcemanager.googleapis.com/projects/1234567890 roles/iam.ServiceAccountUser serviceAccount:abc1234#blah.com
6 //cloudresourcemanager.googleapis.com/projects/1234567890 roles/iam.ServiceAccountUser serviceAccount:opr3234#blah.com
7 //cloudresourcemanager.googleapis.com/projects/1234567890 roles/iam.ServiceAccountUser serviceAccount:stu2224#blah.com
However, I need to drill down further to filter only "group:" and "user:", such as
group: gcp-blah-001#blah.com
user: ericcartman#blah.com
How can I accomplish it? Subqueries?
Yes, you can filter with a subquery in BigQuery. [1]
You can try something like:
SELECT table1.name, table1.roles, table1.members
FROM (SELECT name, iam.role AS roles, ARRAY_TO_STRING(iam.members, "") AS members
      FROM dataset_blah_7553a5b37_Project,
      UNNEST (iam_policy.bindings) AS iam
      GROUP BY name, roles, members
      HAVING roles = 'roles/iam.serviceAccountUser'
          OR roles = 'roles/iam.serviceAccountTokenCreator') AS table1
WHERE table1.members = "group: gcp-blah-001#blah.com" OR table1.members = "user: ericcartman#blah.com"
[1]https://cloud.google.com/bigquery/docs/reference/standard-sql/subqueries#table_subquery_concepts
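Since each binding's members field is itself an array, another option (a sketch, assuming the same dataset and column names as above) is to unnest the members as well and filter on the prefix, so you don't have to match an exact concatenated string:

```sql
-- Unnest the members array too, then keep only group: and user: entries.
SELECT name, iam.role AS roles, member
FROM dataset_blah_7553a5b37_Project,
UNNEST(iam_policy.bindings) AS iam,
UNNEST(iam.members) AS member
WHERE iam.role IN ('roles/iam.serviceAccountUser',
                   'roles/iam.serviceAccountTokenCreator')
  AND (STARTS_WITH(member, 'group:') OR STARTS_WITH(member, 'user:'))
```

This returns one row per matching member instead of one concatenated string per binding, which is usually easier to filter further.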
Related
I have been pulling my hair out trying to translate my business requirement into a SQL query.
The case in itself is not that complex. I have two main tables, user and list, and a many-to-many relationship between the two of them through the userList table, where a user can be connected to a list by having a specific role, like being the owner or a collaborator.
Relevant properties for that example:
user -> id
list -> id | isPublic (BOOL)
userList -> id | userId | listId | role (OWNER|COLLABORATOR)
I have an API endpoint that allows for a user (requester) to retrieve all the lists of another user (target). I have the following 2 cases:
Case 1: requester = target
I have that under control with:
`SELECT
"list".*,
"userList"."id" AS "userList.id",
"userList"."listId" AS "userList.listId"
FROM
"list" AS "list"
INNER JOIN
"userList" AS "userList"
ON "list"."id" = "userList"."listId" AND "userList"."userId" = '${targetUserId}'`;
Case 2: requester =/= target
This is where I'm stuck as I have the following additional constraints:
a user can only see other users' lists that are public
OR lists for which they have the role of COLLABORATOR or OWNER
So I'm looking to do something like:
Get all the lists for which target is connected to (no matter the role) AND for which, EITHER the list is public OR the requester is also connected to that list.
Sample Data
user table
id
----------
john
anna
list table
id | isPublic
------------------
1 false
2 true
3 false
userList table
id | userId | listId | role
- john 1 OWNER
- john 2 OWNER
- john 3 OWNER
- anna 1 COLLABORATOR
Anna requests to see all of John's lists
desired result:
[{
id: 1,
isPublic: false
}, {
id: 2,
isPublic: true
}]
Any help would be greatly appreciated :)
Many thanks
Does this do what you want?
select l.*
from list l
inner join userlist ul on ul.listid = l.id
group by l.id
having
    bool_or(ul.role = 'OWNER' and ul.userid = 'john')  -- owned by John
    and (l.ispublic or bool_or(ul.userid = 'anna'))    -- public or allowed to Anna
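If the bool_or version is hard to follow, the same constraints can be written with EXISTS instead (a sketch assuming the table and column names above, with 'john' as the target and 'anna' as the requester; the target's connection is kept role-independent, per the stated requirement):

```sql
-- John's lists that are either public or that Anna is also connected to.
SELECT l.*
FROM list l
JOIN userlist ul
  ON ul.listid = l.id AND ul.userid = 'john'   -- target is connected
WHERE l.ispublic
   OR EXISTS (SELECT 1
              FROM userlist r
              WHERE r.listid = l.id
                AND r.userid = 'anna');        -- requester is connected
```

On the sample data this keeps list 1 (Anna is a collaborator) and list 2 (public), and excludes list 3.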
We have a table in Redshift:
people
people_id people_tele people_email role
1 8989898332 john#gmail.com manager
2 8989898333 steve#gmail.com manager
3 8989898334 andrew#gmail.com manager
4 8989898335 george#gmail.com manager
I have a few users who would query the table like:
select * from people where role = 'manager' limit 1;
The system users basically phone these people to up-sell products. So, when the query returns results, it should never return the same person twice.
For ex.
If User A executes the query - select * from people where role = 'manager' limit 1;, then he should get the result:
people_id people_tele people_email role
1 8989898332 john#gmail.com manager
If User B executes the query - select * from people where role = 'manager' limit 1;, then he should get the result:
people_id people_tele people_email role
2 8989898333 steve#gmail.com manager
APPROACH 1
So, I thought of adding an is_processed column so that the same results are not returned. After User A executes the query, the table would look something like:
people_id people_tele people_email role is_processed
1 8989898332 john#gmail.com manager 1
2 8989898333 steve#gmail.com manager 0
3 8989898334 andrew#gmail.com manager 0
4 8989898335 george#gmail.com manager 0
APPROACH 2
Another thought was to create another table called - query_history where I have:
query_id people_id processed_time
1 1 22 Jan 2020, 4pm
2 2 22 Jan 2020, 5pm
QUESTION
My question is what happens when User A and User B queries at the EXACT same time? The system would return the same people_id at that moment and 2 phone calls would be made to the same person.
How can I solve the concurrency problem?
You can build on your Approach 1 by adding a randomiser, which makes concurrent sessions unlikely to pick the same row:
SELECT * FROM people
WHERE role = 'manager'
AND is_processed = 0
order by random()
limit 1;
Refer: https://docs.aws.amazon.com/redshift/latest/dg/r_RANDOM.html
Maybe you can solve it with transactions? Try some try/catch maneuver.
Transaction MySQL
edit: Sorry, for some reason I thought you were working with MySQL. https://docs.aws.amazon.com/redshift/latest/dg/stored-procedure-transaction-management.html
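A sketch of what such a transaction could look like in Redshift, assuming the is_processed column from Approach 1 (note Redshift's UPDATE has no RETURNING clause, so the application checks the affected-row count to see whether it won the race):

```sql
BEGIN;

-- Step 1: pick a candidate (run from the application).
SELECT people_id
FROM people
WHERE role = 'manager' AND is_processed = 0
ORDER BY random()
LIMIT 1;

-- Step 2: try to claim the candidate from step 1 (id 1 shown here).
-- The "AND is_processed = 0" guard means only one caller can win.
UPDATE people
SET is_processed = 1
WHERE people_id = 1 AND is_processed = 0;

COMMIT;

-- If the UPDATE reports 0 rows affected, another session claimed the
-- row first: retry from step 1.
```

The randomiser from the other answer only makes collisions unlikely; the conditional UPDATE is what makes the claim itself atomic.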
I am considering switching to PostgreSQL, because of the JSON support. However, I am wondering, if the following would be possible with a single query:
Let's say there are two tables:
Table 1) organisations:
ID (INT) | members (JSONB) |
------------+---------------------------------------------------------|
1 | [{ id: 23, role: "admin" }, { id: 24, role: "default" }]|
2 | [{ id: 23, role: "user" }]
Table 2) users:
ID (INT) | name TEXT | email TEXT |
------------+-----------+---------------|
23 | Max | max#gmail.com |
24 | Joe | joe#gmail.com |
Now I want to get a result like this (all I have is the ID of the organisation [1]):
ID (INT) | members (JSONB) |
------------+--------------------------------------------------------|
1 | [{ id: 23, name: "Max", email: "max#gmail.com", role:
"admin" },
{ id: 24, name: "Joe", email: "joe#gmail.com ", role:
"default" }]
(1 row)
I know this is not what JSONB is intended for and that there is a better solution for storing this data in SQL, but I am just curious if it would be possible.
Thanks!
Yes it is possible to meet this requirement with Postgres. Here is a solution for 9.6 or higher.
SELECT o.id, JSONB_AGG(
         JSONB_BUILD_OBJECT(
           'id'    , u.id,
           'name'  , u.name,
           'email' , u.email,
           'role'  , e.usr->'role'
         )
       )
FROM organisations o
CROSS JOIN LATERAL JSONB_ARRAY_ELEMENTS(o.members) AS e(usr)
INNER JOIN users u ON (e.usr->'id')::text::int = u.id
GROUP BY o.id
See this db fiddle.
Explanation:
the JSONB_ARRAY_ELEMENTS function splits the organisation's json array into rows (one per user); it is usually used in combination with JOIN LATERAL
to join the users table, we access the content of the id field using the -> operator
for each user, JSONB_BUILD_OBJECT is used to create a new object by passing a list of key/value pairs; most values come from the users table, except the role, which is taken from the organisation's json element
the query aggregates by organisation id, using JSONB_AGG to combine the above objects into a json array
For more information, you may also have a look at Postgres JSON Functions documentation.
There might be more ways to do that. One way would use jsonb_to_recordset() to transform the JSON into a record set you can join. Then create a JSON from the result using jsonb_build_object() for the individual members and jsonb_agg() to aggregate them in a JSON array.
SELECT jsonb_agg(jsonb_build_object('id', "u"."id",
'name', "u"."name",
'email', "u"."email",
'role', "m"."role"))
FROM "organisations" "o"
CROSS JOIN LATERAL jsonb_to_recordset(o."members") "m" ("id" integer,
"role" text)
INNER JOIN "users" "u"
ON "u"."id" = "m"."id";
db<>fiddle
What functions are available in detail depends on the version. But since you said you consider switching, assuming a more recent version should be fair.
I have the below two tables, which I need to combine in my daily reports.
Table 1: Resource_Created
FirstName LastName ObjDate Resource Login
TestDemo1 TestDemo1 5-Oct-12 AD TESTDEMO1
Table 2: Resource_Deleted
FirstName LastName ObjDate Resource Login
TestDemo4 TestDemo4 5-Oct-12 AD TESTDEMO4
TestDemo5 TestDemo5 5-Oct-12 AD TESTDEMO5
TestDemo6 TestDemo6 5-Oct-12 AD TESTDEMO6
TestDemo4 TestDemo4 5-Oct-12 Bio TESTDEMO4
TestDemo4 TestDemo4 5-Oct-12 VPN TESTDEMO4
TestDemo5 TestDemo5 5-Oct-12 VPN TESTDEMO5
TestDemo6 TestDemo6 5-Oct-12 VPN TESTDEMO6
I wrote two queries individually like
Query 1:
select distinct Resource as Resource,
count (distinct Login) as CountRes
from Resource_Created
where ObjDate between '4-Oct-12' and '6-Oct-12'
group by Resource ;
Result:
Resource CountRes
AD 1
Query 2:
select distinct Resource as Resource,
count (distinct Login) as CountRes
from Resource_Deleted
where ObjDate between '4-Oct-12' and '6-Oct-12'
group by Resource ;
Result
Resource CountRes
AD 3
VPN 3
Bio 1
I wish to combine these two queries so that one table can display these values.
select COALESCE (Resource_Created.Resource, Resource_Deleted.Resource) as Resource ,
count (distinct Resource_Created.usrlogin) as aobj,
count (distinct Resource_Deleted.usrlogin) as bobj
FROM target_failed FULL OUTER JOIN target_resource
on Resource_Created.Resource = Resource_Deleted.Resource
where
Resource_Created.ObjDate between '04-OCT-2012' and '06-OCT-2012' and
Resource_Deleted.ObjDate between '04-OCT-2012' and '06-OCT-2012'
group by COALESCE(Resource_Created.Resource, Resource_Deleted.Resource);
My Result was
**Resource aobj bobj**
AD 1 3
Expected Result
Resource aobj bobj
AD 1 3
VPN Null 3
Bio Null 1
Please could anyone help me resolve the issue? I am just an OO developer who writes basic SQL queries. It would be greatly appreciated.
Simply use a subquery in the FROM clause:
select
    Resource,
    sum(CreatedLogin) as CountCreated,
    sum(DeletedLogin) as CountDeleted
from
    (select ObjDate,
            Resource,
            1 as DeletedLogin,
            0 as CreatedLogin
     from Resource_Deleted
     union all
     select ObjDate,
            Resource,
            0 as DeletedLogin,
            1 as CreatedLogin
     from Resource_Created
    ) TABLE_ALL
where ObjDate between TO_DATE('4-Oct-12') and TO_DATE('6-Oct-12')
group by Resource
I'm working on a project for my university with Rails 3/PostgreSQL, where we have Users, Activities and Venues. A user has many activities, and a venue has many activities. An activity belongs to a user and to a venue, and therefore has a user_id and a venue_id.
What I need is a SQL query (or even a method from Rails itself?) to find mutual venues between several users. For example, I have 5 users who have visited different venues, and only 2 venues were visited by all 5 users. So I want to retrieve those 2 venues.
I've started by retrieving all activities from the 5 users:
SELECT a.user_id as user, a.venue_id as venue
FROM activities AS a
WHERE a.user_id=116 OR a.user_id=227 OR a.user_id=229 OR a.user_id=613 OR a.user_id=879
But now I need a way to find out the mutual venues.
Any idea?
thx,
tux
I'm not entirely familiar with sql syntax for postgresql, but try this:
select venue_id, COUNT(distinct user_id) from activities
Where user_id in (116,227,229,613,879)
group by venue_id
having COUNT(distinct user_id) = 5
EDIT:
You will need to change the '5' to however many users you care about (how many you are looking for).
I tested this on a table structure like so:
user_id venue_id id
----------- ----------- -----------
1 1 1
2 6 2
3 3 3
4 4 4
5 5 5
1 2 6
2 2 7
3 2 8
4 2 9
5 2 10
The output was:
venue_id
----------- -----------
2 5
You would have to come up with some parameters for your search. For example, 5 users may have 2 venues in common, but not 3.
If you want to see what Venues these five users have in common, you can start by doing this:
SELECT a.venue_id, count(1) as NoOfUsers
FROM activities AS a
WHERE a.user_id=116 OR a.user_id=227 OR a.user_id=229 OR a.user_id=613 OR a.user_id=879
group by a.venue_id
That would tell you, for those users, how many of them have visited each venue. So you have degrees of "venue sharing".
But if you want to see ONLY the venues that were visited by all five users, you'd add a line at the end:
SELECT a.venue_id, count(1) as NoOfUsers
FROM activities AS a
WHERE a.user_id=116 OR a.user_id=227 OR a.user_id=229 OR a.user_id=613 OR a.user_id=879
group by a.venue_id
having count(1) = 5 --the number of users in the query
You should also consider changing your WHERE statement from
WHERE a.user_id=116 OR a.user_id=227 OR a.user_id=229 OR a.user_id=613 OR a.user_id=879
to
WHERE a.user_id in (116, 227, 229, 613, 879)
In SQL it would be something like:
Select distinct v.venue_id
from venues v
join activities a on a.venue_id = v.venue_id
join users u on u.user_id = a.user_id
where u.user_id in (116,227,229,613,879)
You need to join up your tables to get all the venues that have had activities from those users. When you are just learning, it is sometimes simpler to visualize it if you use subqueries. At least that's what I found for me.