Best practices to store user information in Redis


I'm building a simple web application where I need to store user information once the user is logged in.
The information that I need to store is in JSON format, like:
User: {'name': 'John', 'roles': [1,2,3]}
I was thinking of storing the information under a key that combines "User" and the user ID, like:
User_1: {'name': 'John', 'roles': [1,2,3]}
I'd like to know whether this is good practice (with fast retrieval in mind) or whether there is a better alternative.

Option 1:
Using key: value pair.
"User_1": "name: John, roles=[1,2,3]"
Time complexity: O(1).
[x] But after getting the data from Redis you have to parse it again.
Option 2:
Using separate keys for name and roles.
"User_1_name": "John",
"User_1_roles": [1,2,3]
Time complexity: O(1) per request. In total for your example case it's constant.
[x] Extra call for same user.
Option 3:
Using hash.
user_1: {
"name": "John",
"roles": [1, 2, 3]
}
Time complexity: O(N) when reading the whole hash, where N is the number of fields in the hash. For your example case it's effectively constant.
In my opinion, using a hash is the best option.
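One caveat with the hash option: Redis hash fields hold flat strings, so a nested value such as roles has to be serialized before it is stored. Below is a minimal sketch, assuming JSON encoding for every field. The helper names to_hash_fields and from_hash_fields are hypothetical, and the resulting mapping is what you would hand to your client's HSET call (for example redis-py's hset(key, mapping=...)).

```python
import json


def to_hash_fields(user):
    """Flatten a user dict into string fields suitable for a Redis hash."""
    # Every value is JSON-encoded so lists and numbers survive the round trip.
    return {field: json.dumps(value) for field, value in user.items()}


def from_hash_fields(fields):
    """Rebuild the user dict from the string fields read back from the hash."""
    return {field: json.loads(raw) for field, raw in fields.items()}


user = {"name": "John", "roles": [1, 2, 3]}
fields = to_hash_fields(user)      # what would be written with HSET
restored = from_hash_fields(fields)  # what you would do after HGETALL
```

This keeps the single round trip of the hash approach, at the cost of a small decode step per field.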

I guess your question is about the key, not the value.
It's good practice to save each user under a user + ID key.
If your users have another identifying attribute (an email address, for example), you'll also need a key that maps that attribute to the user ID.
So each user will have these keys in Redis:
User_1: {'name': 'John', 'roles': [1,2,3], 'email': 'blabla@test.com'}
User_blabla@test.com_id: 1
When you need to find a user by email, read User_blabla@test.com_id (whose value is the user's ID, 1), then get User_1.
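The two-step lookup can be sketched as follows. This is only an illustration: a plain dict stands in for Redis, and the function names and the example email address are hypothetical.

```python
def user_key(user_id):
    return f"User_{user_id}"


def email_index_key(email):
    return f"User_{email}_id"


def save_user(store, user_id, user):
    # Write the user record and the email -> id mapping together.
    store[user_key(user_id)] = user
    store[email_index_key(user["email"])] = user_id


def find_user_by_email(store, email):
    # First hop: email -> user id. Second hop: user id -> user record.
    user_id = store.get(email_index_key(email))
    if user_id is None:
        return None
    return store.get(user_key(user_id))


store = {}  # stand-in for a Redis connection
save_user(store, 1, {"name": "John", "roles": [1, 2, 3], "email": "john@example.com"})
found = find_user_by_email(store, "john@example.com")
```

With a real Redis client the two writes would usually be sent together (for example in a transaction), so the index never points at a missing user.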

Related

FaunaDB: Query for all documents not referenced by another collection

I'm working on an app where users learn about different patterns of grammar in a language. There are three collections; users and patterns are interrelated by progress, which looks like this:
Create(Collection("progress"), {
  data: {
    userRef: Ref(Collection("users"), userId),
    patternRef: Ref(Collection("patterns"), patternId),
    initiallyLearnedAt: Now(),
    lastReviewedAt: Now(),
    srsLevel: 1
  }
})
I've learned how to do some basic Fauna queries, but now I have a somewhat more complex relational one. I want to write an FQL query (and the required indexes) to retrieve all patterns for which a given user doesn't have progress. That is, everything they haven't learned yet. How would I compose such a query?

One clarifying assumption - a progress document is created when a user starts on a particular pattern and means the user has some progress. For example, if there are ten patterns and a user has started two, there will be two documents for that user in progress.
If that assumption is valid, your question is "how can we find the other eight?"
The basic approach is:
Get all available patterns.
Get the patterns a user has worked on.
Select the difference between the two sets.
1. Get all available patterns.
This one is trivial with the built-in Documents function in FQL:
Documents(Collection("patterns"))
2. Get the patterns a user has worked on.
To get all the patterns a user has worked on, you'll want to create an index over the progress collection, as you've figured out. Your terms are what you want to search on, in this case userRef. Your values are the results you want back, in this case patternRef.
This looks like the following:
CreateIndex({
  name: "patterns_by_user",
  source: Collection("progress"),
  terms: [
    { field: ["data", "userRef"] }
  ],
  values: [
    { field: ["data", "patternRef"] }
  ],
  unique: true
})
Then, to get the set of all the patterns a user has some progress against:
Match(
  "patterns_by_user",
  Ref(Collection("users"), userId)
)
3. Select the difference between the two sets
The FQL function Difference has the following signature:
Difference( source, diff, ... )
This means you'll want the largest set first, in this case all of the documents from the patterns collection.
If you reverse the arguments you'll get an empty set, because there are no documents in the set of patterns the user has worked on that are not also in the set of all patterns.
From the docs, the return value of Difference is:
When source is a Set Reference, a Set Reference of the items in source that are missing from diff.
This means you'll need to Paginate over the difference to get the references themselves.
Paginate(
  Difference(
    Documents(Collection("patterns")),
    Match(
      "patterns_by_user",
      Ref(Collection("users"), userId)
    )
  )
)
From there, you can do what you need to do with the references. As an example, to retrieve all of the data for each returned pattern:
Map(
  Paginate(
    Difference(
      Documents(Collection("patterns")),
      Match(
        "patterns_by_user",
        Ref(Collection("users"), userId)
      )
    )
  ),
  Lambda("patternRef", Get(Var("patternRef")))
)
Consolidated solution
Create the index patterns_by_user as in step two
Query the difference as in step three
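Why the argument order of Difference matters can be illustrated with plain Python sets. This is only an analogy with made-up pattern IDs, not FQL:

```python
# All patterns in the collection (hypothetical IDs).
all_patterns = {"p1", "p2", "p3", "p4"}
# Patterns the user already has a progress document for.
learned = {"p2", "p4"}

# Largest set first: everything the user has not started yet.
not_learned = all_patterns - learned

# Reversed order: nothing the user learned is missing from the full set.
reversed_order = learned - all_patterns
```

Here not_learned holds the two unstarted patterns, while reversed_order is empty.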

Zapier lazy load input fields choices

I'm building a Zapier app for a platform that has dynamic fields. I have an API that returns the list of fields for one of my resources, for example:
[
{ name: "First Name", key: "first_name", type: "String" },
{ name: "Civility", key: "civility", type: "Multiple" }
]
I build my action's inputFields based on this API:
create: {
[...],
operation: {
inputFields: [
fetchFields()
],
[...]
},
}
The API returns some types that are lists of values (e.g. Civility), but to get these values I have to make another API call.
For now, in my fetchFields function, each time I encounter a type: "Multiple", I make another API call to get the possible values and set them as choices on the input field. However, this is expensive and the page on Zapier takes too long to display the fields.
I tried to use the z.dehydrate feature provided by Zapier, but it doesn't work for input choices.
I can't use a dynamic dropdown here because I can't pass the key of the field whose possible values I'm looking for. For example, to get the possible values for Civility, I need to pass the civility key to my API.
What are the options in this case?

David here, from the Zapier Platform team.
Thanks for writing in! I think what you're doing is possible, but I'm also not 100% sure that I understand what you're asking.
You can have multiple API calls in the function (which it sounds like you do). In the end, the function should return an array of Field objects (as described here).
The key thing you might not be aware of is that subsequent steps have access to a partially-filled bundle.inputData, so you can have a first function that gets field options and allows a user to select something, then a second function that runs and pulls in fields based on that choice.
Otherwise, I think a function that does two API calls (one to fetch the field types and one to turn them into Zapier field objects) is the best bet.
If this didn't answer your question, feel free to email partners@zapier.com or join the Slack org (linked at the bottom of the readme) and we'll try to solve it there.

Should the response body of GET all parent resource return a list of child resource?

Please bear with me if the title is a bit confusing, I will try my best to explain my question below.
Say I have the following two endpoints
api/companies (returns a list of all companies like below)
[{name: "company1", id: 1}, {name: "company2", id: 2}]
api/companies/{companyId}/employees (returns a list of all employees for a specific company like below)
[{name: "employee1", id: 1}, {name: "employee2", id: 2}]
What the client side needs is a list of companies, each of which has a list of employees. The result should look like this:
[
{
name: "company1",
id: 1,
employees: [ {name: "employee1", id: 1}, {name: "employee2", id: 2} ]
},
{
name: "company2",
id: 2,
employees: [ {name: "employee3", id: 3}, {name: "employee4", id: 4} ]
},
]
There are two ways I can think of to do this:
Get a list of companies first, then loop through it and make an API call for each company to get its list of employees. (I'm wondering if this is the better design because of the HATEOAS principle, if I understand it correctly? The smallest unit of resource of api/companies is a company, not an employee, so the client is expected to discover companies as the available resource but not employees.)
a REST client should then be able to use server-provided links dynamically to discover all the available actions and resources it needs
Return a list of employees inside each company object, and return the list of companies through api/companies. Maybe add a boolean query parameter to this endpoint called responseHasEmployees, defaulting to false, so that when a user makes a GET to api/companies?responseHasEmployees=true, the response body has a list of employees inside each company object.
So my question is: which is the better way to achieve the client side's goal? (It doesn't have to be one of the two above.)
Extra info that might be helpful: companies and employees are stored in different tables, and employees table has a company_fk column.

Start by asking yourself a couple of questions:
Is this a common scenario?
Is it logical to request data in this way?
If so, it might make sense to make data available in this way.
Next, do you already have api calls that pass variables implemented?
Based on your HATEOAS principle, you probably shouldn't. Clients shouldn't need to know, or understand, variable values in your url.
If not, stay away from it. Make it as clean for the client side as possible. You could make a third, distinct API endpoint, "api/companiesWithEmployees". This fits your HATEOAS principle: the client doesn't need to know anything about parameters or other workings of the API, only that it will get "Companies with Employees".
Also, the cost is minimal: one additional method in the code base. It's simpler for the client side at a low cost.
Next think about some of the developmental consequences:
Are you opening the door to more specific api requests?
Are you able to maintain a hard line on data you want accessible through the api?
Are you able to maintain your HATEOAS principle in that the clients know everything they need to know based on the api url?
Next incorporate scenarios like this into future api design:
Can you preemptively make similar API calls available? (For example, for Customers and Orders, would you simply make a single API call available that returns the two related to each other?)
Ultimately, my answer to your question would be to go ahead and make this a new API call. The overhead for setting up, testing, and maintaining this particular change seems extremely small, and the likelihood of data being requested in this way appears high.
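If such a combined endpoint is added, the server can assemble the response from two queries (all companies, all employees) rather than one query per company. A minimal sketch, assuming the rows have already been fetched as dicts; the function name is hypothetical, and company_fk is the foreign-key column mentioned in the question:

```python
from collections import defaultdict


def companies_with_employees(companies, employees):
    """Nest each employee under its company using the company_fk column."""
    by_company = defaultdict(list)
    for emp in employees:
        # Drop the foreign key from the nested representation.
        by_company[emp["company_fk"]].append({"name": emp["name"], "id": emp["id"]})
    return [
        {**company, "employees": by_company.get(company["id"], [])}
        for company in companies
    ]


companies = [{"name": "company1", "id": 1}, {"name": "company2", "id": 2}]
employees = [
    {"name": "employee1", "id": 1, "company_fk": 1},
    {"name": "employee2", "id": 2, "company_fk": 1},
    {"name": "employee3", "id": 3, "company_fk": 2},
]
result = companies_with_employees(companies, employees)
```

The grouping is a single pass over the employee rows, so the cost grows with the number of rows rather than with the number of per-company requests.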

I assume that the client you build is going to have an interface to view a list of companies where there will be an option to view employees of the company. So it is best to do it by pull on demand and not load the whole data at once.
If you can consider a property of your resource as a sub-resource, do not add the whole sub-resource data into the main resource API. You may include a referral link which can be used by the client to fetch the sub-resource data.
Here, in your case,
Main-Resource - Companies
Sub-Resource - Employees
Company name, contact number, address - these are properties of the company object and not sub-resources of a company, whereas employees can very well be considered a sub-resource.

Implementing following stream

I am developing a photo-sharing app with a follow system: whoever follows user X will see user X's photos in their following feed.
I am storing my data in Redis as follows:
sadd redis_key+user_id photo_id
set redis_key+photo_id+data data_of_photo
sadd redis_key+follow+user_id follower_id
Now I want to get all the photo_ids of followers directly, without looping.

This is a simple fan-out problem which you can not easily do with Redis directly.
You can do it with Lua but YOU WILL block Redis during the action.
I have an open source project which does the same thing but I do it in code as someone creates a new post. I would imagine this is just like a new photo.
https://github.com/pjuu/pjuu/blob/master/pjuu/posts/backend.py#L252
I use sorted sets though and use the unix timestamp as the score so they are always in order.
When User1 creates a new photo, you look up the list of their followers. If you are using a sorted set you can get this via:
followers = zrange followers:user1 0 -1
Then simply loop over all entries in that list:
for follower in followers: zadd feed:<follower> <timestamp> <photo_id>
This way the new post is pushed out to all users that follow user1.
If you want this done on the fly then bad news: you will need some relational data and a way to query on the values, which you can't do in Redis. For that you would need SQL, Mongo, Couch, etc.
This is only pseudo code as you did not mention which language you use.
EDIT: As per question this is to be done on the Redis side
local followers = redis.call('zrange', KEYS[1], 0, -1)
for key, value in pairs(followers) do
  redis.call('zadd', 'items:'..value, ARGV[1], ARGV[2])
end
return true
This takes the key of the user's followers to iterate over, plus a zset score and value, and adds these to the items of each follower. You will need to change it to suit your exact needs. If you want to use sets you will need to use sscan or something similar. Zsets are easier, though, and are kept in order.
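The fan-out-on-write idea can also be sketched in plain Python. Dicts stand in for the Redis sorted sets here, and the function names and sample IDs are hypothetical:

```python
def fan_out(followers_by_user, feeds, author, photo_id, timestamp):
    """Push a new photo into the feed of every follower of `author`."""
    for follower in followers_by_user.get(author, []):
        # Each feed maps member -> score, like a sorted set.
        feeds.setdefault(follower, {})[photo_id] = timestamp


def read_feed(feeds, user):
    """Return the user's feed, newest photo first."""
    feed = feeds.get(user, {})
    return sorted(feed, key=feed.get, reverse=True)


followers_by_user = {"user1": ["user2", "user3"]}
feeds = {}
fan_out(followers_by_user, feeds, "user1", "photo_10", 1000)
fan_out(followers_by_user, feeds, "user1", "photo_11", 2000)
user2_feed = read_feed(feeds, "user2")
```

The loop runs once per follower at write time, so reading a feed later is a single lookup with no join across users.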

less than greater than filtered queries, aerospike

It's been hard for me to find definitive documentation on Aerospike. Using Aerospike filters, with or without Lua, is it possible for me to:
Order my results server side
Use a filter to do a greater than/less than query
Essentially I want to encode a value (client side) and retrieve the first row from Aerospike whose value is greater than the encoded one.
Another way to put it is the opposite of The Price Is Right: what is the lowest value I can find in Aerospike that is not lower than the one I give?
I'd like a simple way, but I am also open to workarounds (or a flat-out no if it's not reasonable/practical).

Aerospike does not natively support ordering of data on server-side.
Aerospike supports filters on the query. You can specify a range filter for your need. See the example at this link.
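Since ordering is not done on the server, one possible workaround is to let a range filter narrow the candidates and then pick the smallest one on the client. A minimal sketch, assuming the values returned by the range query are already in a Python list; the function name is hypothetical and no Aerospike client calls are shown:

```python
def lowest_at_least(values, target):
    """Return the smallest value that is not lower than `target`, or None."""
    candidates = [v for v in values if v >= target]
    return min(candidates) if candidates else None


# Values as they might come back, unordered, from a range query.
returned_values = [42, 17, 30, 25]
best = lowest_at_least(returned_values, 20)
```

The tighter the upper bound of the range filter, the fewer records have to be shipped to the client for this final step.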

Basic sorting is natively supported in large lists (LDT).
In a Large List your key (index) is always ordered lexically by default.
Please note that the ldt-enabled true directive must be present in the namespace's config section in aerospike.conf.
An example with the JavaScript client:
var key = {ns: 'test', set: 'mySet', key: 'myKey'};
var callback = function (status, result) {console.log(status, result)}
var list = client.LargeList(key, 'targetBinName', null, callback);
// add first item (this also determines the list's value type)
list.add(1, callback);
// add multiple items
list.add([0, 2, 4, 5], callback);
list.add(3, callback);
// get all items
list.scan(function (status, list) {
// list = [0, 1, 2, 3, 4, 5]
})
// select by values range
list.findRange(0, 3, callback)
// filter using udf to do custom gt/lt filtering
list.filter('udfName', callback)
If you need to store objects, you must add a key property that will be the index used for sorting, ranges, duplicates, etc. (no duplicates are allowed by default):
list.add({key: 1})
list.add([{key: 0},{key: 2}])
I'm sure the other languages drivers have the same methods more or less.
more on Large list in Aerospike docs
Large list docs section in NodeJS client on Github

In the past you would have expressed this as a stream UDF, but since release 3.12 a predicate filter would be the correct solution.
Take a look at the PredExp class of the Java client and its examples for building complex filters. Predicate filtering also currently exists for the C, C# and Go clients.
