Concurrency for select query in Redshift


We have a table in Redshift named people:
people_id  people_tele  people_email      role
1          8989898332   john@gmail.com    manager
2          8989898333   steve@gmail.com   manager
3          8989898334   andrew@gmail.com  manager
4          8989898335   george@gmail.com  manager
I have a few users who would query the table like this:
select * from people where role = 'manager' limit 1;
The system's users phone these people to up-sell products. So once a person has been returned by the query, that person should never be returned again.
For example:
If User A executes the query select * from people where role = 'manager' limit 1; then he should get the result:
people_id  people_tele  people_email      role
1          8989898332   john@gmail.com    manager
If User B then executes the same query, he should get the result:
people_id  people_tele  people_email      role
2          8989898333   steve@gmail.com   manager
APPROACH 1
I thought of adding an is_processed column so that the same row is not returned twice. After User A executes the query, the table would look something like this:
people_id  people_tele  people_email      role     is_processed
1          8989898332   john@gmail.com    manager  1
2          8989898333   steve@gmail.com   manager  0
3          8989898334   andrew@gmail.com  manager  0
4          8989898335   george@gmail.com  manager  0
APPROACH 2
Another idea was to create a second table, query_history, containing:
query_id  people_id  processed_time
1         1          22 Jan 2020, 4pm
2         2          22 Jan 2020, 5pm
QUESTION
My question is: what happens when User A and User B query at the EXACT same time? The system would return the same people_id to both of them at that moment, and 2 phone calls would be made to the same person.
How can I solve the concurrency problem?

You can solve it with your Approach 1 by adding a randomiser to the query. This makes it unlikely, though not impossible, that two simultaneous queries pick the same row:
SELECT * FROM people
WHERE role = 'manager'
AND is_processed = 0
ORDER BY random()
LIMIT 1;
Refer: https://docs.aws.amazon.com/redshift/latest/dg/r_RANDOM.html

Maybe you can solve it with transactions? Try wrapping the select and the update in one transaction, with some try/catch handling around it.
Edit: Sorry, for some reason I thought you were working with MySQL. For Redshift, see https://docs.aws.amazon.com/redshift/latest/dg/stored-procedure-transaction-management.html
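
To make the transaction idea concrete, here is a minimal sketch that combines it with Approach 1. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not Redshift code, and the helper names make_people_db and claim_next_person are made up for illustration. The shape is what matters: take a write lock, select one unprocessed row, mark it processed, and commit, so that two callers are never handed the same person. On Redshift the locking step would have to be expressed with Redshift's own transaction and locking facilities, which this sketch does not cover.

```python
import sqlite3


def make_people_db():
    """Build an in-memory copy of the people table from the question."""
    # isolation_level=None puts sqlite3 in autocommit mode, so the explicit
    # BEGIN / COMMIT statements below are what delimit the transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE people ("
        " people_id INTEGER PRIMARY KEY,"
        " people_tele TEXT, people_email TEXT, role TEXT,"
        " is_processed INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO people (people_id, people_tele, people_email, role)"
        " VALUES (?, ?, ?, ?)",
        [
            (1, "8989898332", "john@gmail.com", "manager"),
            (2, "8989898333", "steve@gmail.com", "manager"),
            (3, "8989898334", "andrew@gmail.com", "manager"),
            (4, "8989898335", "george@gmail.com", "manager"),
        ],
    )
    return conn


def claim_next_person(conn, role):
    """Return one unprocessed person and mark them processed, or None."""
    # BEGIN IMMEDIATE takes SQLite's write lock before the SELECT runs, so
    # no other writer can claim the same row between the read and the update.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT people_id, people_tele, people_email, role"
            " FROM people WHERE role = ? AND is_processed = 0"
            " ORDER BY people_id LIMIT 1",
            (role,),
        ).fetchone()
        if row is not None:
            conn.execute(
                "UPDATE people SET is_processed = 1 WHERE people_id = ?",
                (row[0],),
            )
        conn.execute("COMMIT")
        return row
    except Exception:
        conn.execute("ROLLBACK")
        raise


if __name__ == "__main__":
    db = make_people_db()
    print(claim_next_person(db, "manager"))  # User A
    print(claim_next_person(db, "manager"))  # User B gets a different row
```

Once every manager has been claimed, claim_next_person returns None, which the calling application can treat as "nobody left to call".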

Related

SQL Tinder to prioritize order for users who already like you

profile
---
id name
1 John
2 Jane
3 Jill
...
swipe
---
id profile_1_id profile_2_id liked
1 2 1 true
2 3 1 false
...
If you've used Tinder before, you might recognize that it seems to fetch an initial card deck that consists of:
users who already like you that you can instantly match with, pushed to the top
other users
(out of scope for this question but it also sprinkles in some more attractive users)
If we extend the example to 100+ users, id=1 John was looking at the app, and we fetched with a limit of 20, it would guarantee Jane comes back (since Jane already likes John and John could match right away) + 19 others to fill the rest of John's deck to keep John swiping for more.
What is the SQL for "get people who like John first then fill the rest with random users"? Would this be a WHERE(case if else) or some other statement?

Here is a query that should meet your need.
It works by using conditional sorting with CASE. Users that liked John are given higher priority and appear sorted by id. Other users are given a lower, random priority; this also means that, for a given user, this part of the list will not always be the same (which, I believe, fits your purpose). The number of output records is then controlled by a LIMIT clause.
I tested the query in this db fiddle. You need to replace the question mark (?) in the CASE clause with the id of user for which you are generating a card (1 for John in your sample data).
SELECT
p.id,
p.name
FROM
profile p
LEFT JOIN swipe s ON s.profile_1_id = p.id AND s.liked = true
ORDER BY
CASE s.profile_2_id
WHEN ? THEN 0
ELSE FLOOR(random() * 10) + 1
END,
p.id
LIMIT 20

You could try something like this but I think you're oversimplifying. Do you want to exclude not liked people from the others?
select * from profile p
left outer join swipe s on (p.id=profile_1_id and s.profile_2_id = 1 and liked = true)
where
p.id<>1
order by coalesce(profile_2_id , random()*-1000000) desc
limit 20
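
As a rough, runnable illustration of the "likers first, then random others" ordering discussed in these answers, here is a sketch using Python's built-in sqlite3 module as a stand-in database. The helper names make_swipe_db and build_deck are hypothetical, and the sketch assumes at most one swipe row per pair of profiles (otherwise a profile could appear twice).

```python
import sqlite3

# Likers of the viewer sort first (sort key 0), everyone else after (key 1);
# random() shuffles rows inside each of the two groups.
DECK_SQL = """
SELECT p.id, p.name
FROM profile p
LEFT JOIN swipe s
       ON s.profile_1_id = p.id
      AND s.profile_2_id = :viewer
      AND s.liked = 1
WHERE p.id <> :viewer
ORDER BY (s.id IS NULL), random()
LIMIT :deck_size
"""


def make_swipe_db():
    """Build the profile and swipe tables with the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE profile (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE swipe ("
        " id INTEGER PRIMARY KEY,"
        " profile_1_id INTEGER, profile_2_id INTEGER, liked INTEGER)"
    )
    conn.executemany(
        "INSERT INTO profile VALUES (?, ?)",
        [(1, "John"), (2, "Jane"), (3, "Jill")],
    )
    # liked is stored as 1 (true) or 0 (false) because SQLite has no boolean type.
    conn.executemany(
        "INSERT INTO swipe VALUES (?, ?, ?, ?)",
        [(1, 2, 1, 1), (2, 3, 1, 0)],
    )
    conn.commit()
    return conn


def build_deck(conn, viewer_id, deck_size=20):
    """Return (id, name) rows: people who liked the viewer first, then the rest."""
    params = {"viewer": viewer_id, "deck_size": deck_size}
    return conn.execute(DECK_SQL, params).fetchall()


if __name__ == "__main__":
    db = make_swipe_db()
    print(build_deck(db, 1))  # Jane liked John, so she comes first
```

Putting the liked check into the join condition, rather than the WHERE clause, keeps profiles that never swiped in the result so they can fill the rest of the deck.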

SQL filter search according to multiple column values

I am dealing with one table (3+ million rows, SQL Server).
I need to filter results according to the two columns below:
...FromID|ToID|Column5|...
...1001|2001
...1002|2020
...1003|5000
...1001|3000
...2001|1001
Now User1 can access records with FromID or ToId 1001.
FromID|ToID
1001|2001
1001|3000
2001|1001
User2 can access records with FromID or ToID 1002,1003,3000
FromID|ToID
1002|2020
1003|5000
1001|3000
What is the most efficient way to do this?
Do I need to create a view for each user? (This is running on enterprise; the user count will be 100 at most.)
Thanks.
PS. My very first question. O.o

Your access criteria seem to be fairly arbitrary. User1 gets 1001, user2 gets 1002, 1003, and 3000, and I assume users 3 through 99 have arbitrary access as well. In that case, I recommend that you create a table, call it useraccess for this example:
user |accessID
---------------
user1|1001
user2|1002
user2|1003
user2|3000
... |...
Now when you want to know which rows a user can access, you can do this:
SELECT t.FromID, t.ToID, [[other columns you care about]]
FROM yourtable t
JOIN useraccess a ON t.FromID = a.accessID OR t.ToID = a.accessID
WHERE a.[user] = 'user2'
(user is a reserved word in SQL Server, so the column name has to be bracketed.)
You can either run that query dynamically or create a view based on it. The usual tradeoffs between views and direct queries apply.
Edit: I just saw your note that you already have a UserRights table, so you already have step 1 completed.
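
Here is a small runnable sketch of the access-table idea, using Python's built-in sqlite3 module as a stand-in for SQL Server. The helper names make_access_db and rows_for_user are hypothetical, and the column is called user_name here (an assumption) to sidestep the reserved word. It uses EXISTS instead of a join so that a row is returned only once even if a user has access to both its FromID and its ToID.

```python
import sqlite3

ROWS_FOR_USER_SQL = """
SELECT t.FromID, t.ToID
FROM yourtable t
WHERE EXISTS (
    SELECT 1
    FROM useraccess a
    WHERE a.user_name = ?
      AND a.accessID IN (t.FromID, t.ToID)
)
ORDER BY t.rowid
"""


def make_access_db():
    """Build the sample data table and the useraccess table from the answer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (FromID INTEGER, ToID INTEGER)")
    conn.execute("CREATE TABLE useraccess (user_name TEXT, accessID INTEGER)")
    conn.executemany(
        "INSERT INTO yourtable VALUES (?, ?)",
        [(1001, 2001), (1002, 2020), (1003, 5000), (1001, 3000), (2001, 1001)],
    )
    conn.executemany(
        "INSERT INTO useraccess VALUES (?, ?)",
        [("user1", 1001), ("user2", 1002), ("user2", 1003), ("user2", 3000)],
    )
    conn.commit()
    return conn


def rows_for_user(conn, user_name):
    """Return the (FromID, ToID) rows the given user may see, in insert order."""
    return conn.execute(ROWS_FOR_USER_SQL, (user_name,)).fetchall()


if __name__ == "__main__":
    db = make_access_db()
    print(rows_for_user(db, "user1"))
    print(rows_for_user(db, "user2"))
```

With this layout, granting or revoking access is an insert or delete in useraccess rather than a change to a per-user view.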

SQL query for mutual visited places

I'm working on a project for my university with Rails 3/PostgreSQL, where we have Users, Activities and Venues. A user has many activities, and a venue has many activities. An activity belongs to a user and to a venue, and therefore has a user_id and a venue_id.
What I need is a SQL query (or even a method from Rails itself?) to find mutual venues between several users. For example, I have 5 users who have visited different venues, and only 2 venues were visited by all 5 users. So I want to retrieve those 2 venues.
I've started by retrieving all activities from the 5 users:
SELECT a.user_id as user, a.venue_id as venue
FROM activities AS a
WHERE a.user_id=116 OR a.user_id=227 OR a.user_id=229 OR a.user_id=613 OR a.user_id=879
But now I need a way to find out the mutual venues.
Any idea?
thx,
tux

I'm not entirely familiar with sql syntax for postgresql, but try this:
select venue_id, COUNT(distinct user_id) from activities
Where user_id in (116,227,229,613,879)
group by venue_id
having COUNT(distinct user_id) = 5
EDIT:
You will need to change the '5' to however many users you care about (how many you are looking for).
I tested this on a table structure like so:
user_id venue_id id
----------- ----------- -----------
1 1 1
2 6 2
3 3 3
4 4 4
5 5 5
1 2 6
2 2 7
3 2 8
4 2 9
5 2 10
The output was:
venue_id
----------- -----------
2 5
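
The GROUP BY / HAVING approach can be tried out with a short sketch like the following, which uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The helper names make_activities_db and mutual_venues are made up for illustration. The data is the test table above plus one extra repeat visit (user 1 at venue 2 again, an added assumption) to show why the count is taken over distinct users.

```python
import sqlite3


def make_activities_db():
    """Build the activities test table, plus one repeat visit."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE activities ("
        " id INTEGER PRIMARY KEY, user_id INTEGER, venue_id INTEGER)"
    )
    visits = [
        (1, 1), (2, 6), (3, 3), (4, 4), (5, 5),
        (1, 2), (2, 2), (3, 2), (4, 2), (5, 2),
        (1, 2),  # hypothetical repeat visit, not in the original table
    ]
    conn.executemany(
        "INSERT INTO activities (user_id, venue_id) VALUES (?, ?)", visits
    )
    conn.commit()
    return conn


def mutual_venues(conn, user_ids):
    """Return venue ids visited by every one of the given users."""
    wanted = sorted(set(user_ids))
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        "SELECT venue_id FROM activities"
        f" WHERE user_id IN ({placeholders})"
        " GROUP BY venue_id"
        # DISTINCT makes repeat visits by one user count only once.
        " HAVING COUNT(DISTINCT user_id) = ?"
        " ORDER BY venue_id"
    )
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]


if __name__ == "__main__":
    db = make_activities_db()
    print(mutual_venues(db, [1, 2, 3, 4, 5]))
```

Passing the number of users as a parameter avoids having to edit the HAVING clause by hand when the list of users changes.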

You would have to come up with some parameters for your search. For example, 5 users may have 2 venues in common, but not 3.
If you want to see which venues these five users have in common, you can start by doing this:
SELECT a.venue_id, count(distinct a.user_id) as NoOfUsers
FROM activities AS a
WHERE a.user_id=116 OR a.user_id=227 OR a.user_id=229 OR a.user_id=613 OR a.user_id=879
group by a.venue_id
That gives you, for each venue, how many of those users visited it, so you get degrees of "venue sharing". (Counting distinct users matters because one user can visit the same venue more than once.)
But if you want to see ONLY the venues that were visited by all five users, add a line at the end:
SELECT a.venue_id, count(distinct a.user_id) as NoOfUsers
FROM activities AS a
WHERE a.user_id=116 OR a.user_id=227 OR a.user_id=229 OR a.user_id=613 OR a.user_id=879
group by a.venue_id
having count(distinct a.user_id) = 5 --the number of users in the query
You should also consider changing your WHERE statement from
WHERE a.user_id=116 OR a.user_id=227 OR a.user_id=229 OR a.user_id=613 OR a.user_id=879
to
WHERE a.user_id in (116, 227, 229, 613, 879)

In SQL it would be something like:
select v.venue_id
from venues v
join activities a on a.venue_id = v.venue_id
join users u on u.user_id = a.user_id
where u.user_id in (116,227,229,613,879)
group by v.venue_id
having count(distinct u.user_id) = 5
You need to join your tables to get all the venues that have had activities by those users; the group by and having then keep only the venues visited by all five. When you are just learning, it is sometimes simpler to visualize this with subqueries. At least that's what I found.

SQL: How to select rows from a table while ignoring the duplicate field values?

How to select rows from a table while ignoring the duplicate field values?
Here is an example:
id user_id message
1 Adam "Adam is here."
2 Peter "Hi there this is Peter."
3 Peter "I am getting sick."
4 Josh "Oh, snap. I'm on a boat!"
5 Tom "This show is great."
6 Laura "Textmate rocks."
What I want to achieve is to select the recently active users from my db. Let's say I want to select the 5 most recently active users. The problem is that the following script selects Peter twice.
mysql_query("SELECT * FROM messages ORDER BY id DESC LIMIT 5 ");
What I want is to skip the row when it gets to Peter again, and select the next result, in our case Adam. So I don't want to show my visitors that the recently active users were Laura, Tom, Josh, Peter, and Peter again. That does not make any sense; instead I want to show them this way: Laura, Tom, Josh, Peter, (skipping Peter) and Adam.
Is there an SQL command i can use for this problem?

Yes. "DISTINCT".
SELECT DISTINCT(user_id) FROM messages ORDER BY id DESC LIMIT 5

Maybe you could exclude duplicate users using GROUP BY, ordering by each user's most recent message id:
SELECT user_id, MAX(id) AS last_id FROM messages GROUP BY user_id ORDER BY last_id DESC LIMIT 5;
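
One way to get each user only once is to group by user_id and sort by each user's highest message id. The sketch below checks that idea against the sample rows from the question, using Python's built-in sqlite3 module as a stand-in for MySQL; the helper names make_messages_db and recent_users are hypothetical.

```python
import sqlite3

RECENT_USERS_SQL = """
SELECT user_id, MAX(id) AS last_id
FROM messages
GROUP BY user_id
ORDER BY last_id DESC
LIMIT ?
"""


def make_messages_db():
    """Build the messages table with the six sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, user_id TEXT, message TEXT)"
    )
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?, ?)",
        [
            (1, "Adam", "Adam is here."),
            (2, "Peter", "Hi there this is Peter."),
            (3, "Peter", "I am getting sick."),
            (4, "Josh", "Oh, snap. I'm on a boat!"),
            (5, "Tom", "This show is great."),
            (6, "Laura", "Textmate rocks."),
        ],
    )
    conn.commit()
    return conn


def recent_users(conn, how_many=5):
    """Return the most recently active users, newest first, without repeats."""
    rows = conn.execute(RECENT_USERS_SQL, (how_many,)).fetchall()
    return [user_id for user_id, _last_id in rows]


if __name__ == "__main__":
    print(recent_users(make_messages_db()))
```

Sorting on the aggregated MAX(id) rather than on the raw id column is what makes the ordering well defined once the rows have been grouped.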

Using different columns values twice in a single SQL query?

I have a MySQL table called "User" containing multiple mixed values like this:
[user_id] [user_email] [birthday]
---------------------------------
1         x@xxx.com    01/01/1981
2         y@yyy.com    02/02/1982
3         z@zzz.com    03/03/1983
I have another table called "Name" which contains name of the user, but also of some movies like this:
[node_id] [name] [user_id]
----------------------------------
9 John Doe 1
10 Star Wars 90
11 Mike Smith 2
12 Mary Lord 3
13 Rocky III 91
Finally, I have a third table named "Vote", which is a relationship between a user and the movies he likes.
[vote_id] [node_id] [user_id]
------------------------------
1 10 1
2 10 2
3 13 1
12 10 3
13 13 2
What I'm struggling to do is write a query that uses the "name" value twice, for two separate things: the name of the user, and the name of the movie he likes. Like this:
[user_id] [user_name] [birthday] [movie_name]
-------------------------------------------------
1         John Doe    01/01/1981 Star Wars
2         Mike Smith  02/02/1982 Star Wars
1         John Doe    01/01/1981 Rocky III
3         Mary Lord   03/03/1983 Star Wars
2         Mike Smith  02/02/1982 Rocky III
SELECT user.id,
node.name,
user.birthday,
IF(node.type = "movie", node.name, "")
FROM user,
node
JOIN vote ON vote.user_id = user.user_id
WHERE user.id = node.id
I think I'm all mixed up... anyone can help please?

Assuming your schema is exactly what you posted above this should work verbatim.
Query
SELECT user.user_id,
node.name user_name,
user.birthday,
(select node.name from node where node_id = vote.node_id) as movie_name
FROM user
JOIN node ON user.user_id = node.user_id
JOIN vote ON vote.user_id = user.user_id

You have got the database structure wrong. Store the user name in your first table "User"

I would strongly suggest that you store the user_name in the users table. With that change you can then have a much more simple query and a properly normalized schema.
New proposed schema.
users table
(added name column)
[user_id] [name] [user_email] [birthday]
1         name1  x@xxx.com    01/01/1981
2         name2  y@yyy.com    02/02/1982
3         name3  z@zzz.com    03/03/1983
nodes table (call this movies)
(removed user entries and the user_id column as you'll be using votes to link these to users)
[node_id] [name]
10        Star Wars
13        Rocky III
votes table (call this something like movies_users)
(removed the vote_id column as it's just a join table)
[node_id] [user_id]
10 1
10 2
13 1
10 3
13 2
Then your query should look something like this:
select users.user_id, users.name, users.birthday, nodes.name as movie_name
from users
join votes on users.user_id = votes.user_id
join nodes on votes.node_id = nodes.node_id

select user_id,user_name,birthday,name
from user,name,vote
where (and here you do all the joins like user_id from one table equals user_id from another table)
But there is a problem here that makes it impossible for me to write the correct code: you have two fields in two different tables, user_name and name. Do you want to join the tables on this name? I don't understand. I think you are mixing the movie names with the user names; please reformulate the question.

I agree with the other answers that you would be better off if you moved the user name into the user table. However, if you are stuck with your current table structure, try this:
SELECT user.user_id,
uname.name user_name,
user.birthday,
movie.name movie_name
FROM user
JOIN node uname ON uname.user_id = user.user_id
JOIN vote ON vote.user_id = user.user_id
JOIN node movie ON vote.node_id = movie.node_id
(Assuming votes can only be cast for Movies, it should be unnecessary to blank out non-movies as these should never exist.)
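
To see the "join the same table twice under two aliases" technique run against the tables as posted, here is a sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The helper names make_votes_db and votes_with_names are hypothetical, and the ORDER BY on vote_id is an addition so that the output order is predictable.

```python
import sqlite3

# node is joined twice: once as uname (the row holding the user's name)
# and once as movie (the row the vote points at).
VOTES_SQL = """
SELECT user.user_id,
       uname.name AS user_name,
       user.birthday,
       movie.name AS movie_name
FROM user
JOIN node uname ON uname.user_id = user.user_id
JOIN vote ON vote.user_id = user.user_id
JOIN node movie ON vote.node_id = movie.node_id
ORDER BY vote.vote_id
"""


def make_votes_db():
    """Build the user, node and vote tables with the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user (user_id INTEGER, user_email TEXT, birthday TEXT)")
    conn.execute("CREATE TABLE node (node_id INTEGER, name TEXT, user_id INTEGER)")
    conn.execute("CREATE TABLE vote (vote_id INTEGER, node_id INTEGER, user_id INTEGER)")
    conn.executemany(
        "INSERT INTO user VALUES (?, ?, ?)",
        [
            (1, "x@xxx.com", "01/01/1981"),
            (2, "y@yyy.com", "02/02/1982"),
            (3, "z@zzz.com", "03/03/1983"),
        ],
    )
    conn.executemany(
        "INSERT INTO node VALUES (?, ?, ?)",
        [
            (9, "John Doe", 1),
            (10, "Star Wars", 90),
            (11, "Mike Smith", 2),
            (12, "Mary Lord", 3),
            (13, "Rocky III", 91),
        ],
    )
    conn.executemany(
        "INSERT INTO vote VALUES (?, ?, ?)",
        [(1, 10, 1), (2, 10, 2), (3, 13, 1), (12, 10, 3), (13, 13, 2)],
    )
    conn.commit()
    return conn


def votes_with_names(conn):
    """Return (user_id, user_name, birthday, movie_name), one row per vote."""
    return conn.execute(VOTES_SQL).fetchall()


if __name__ == "__main__":
    for row in votes_with_names(make_votes_db()):
        print(row)
```

The two aliases are what let one query read name from two different rows of the same table.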
