So I have a table which stores ltree entries along with umask-based permissions for them.
| entry | user | group | mask |
| a | 1 | 1 | 644 |
| a.b | 2 | 1 | 644 |
| a.b.c | 2 | 0 | 600 |
Permissions are inheritable, and the permission check is currently done client-side with no caching: the whole tree is retrieved to check permissions for a given key.
What would be a better approach?
Using separate table to keep rights (this way) - fast to query, slow to update? (10'000 keys, 100' users, 20-30 groups to help organize users => expecting ~200*10000 keys)
Keep the same structure and cache permissions client-side?
Write a stored procedure to query the tree based on the provided entry, user, and group?
Something else?
So I'm trying to model a basic recommended-friend system based on user activity. In this model, people can join activities, and if two people aren't already friends and happen to join the same activity, their recommendation score for each other increases.
Most of my app uses Firebase, but for this system I'm trying to use BigQuery.
The current system I have in mind:
I would have this table to represent friendships. Since it's an undirected graph, A->B being in the table implies that B->A is also in the table.
+-------+-------+--------------+
| User1 | User2 | TimeFriended |
+-------+-------+--------------+
| abc | def | 12345 |
| def | abc | 12345 |
| abc | rft | 3456 |
| ... | ... | ... |
+-------+-------+--------------+
I also plan for activity participation to be stored like so:
+------------+-----------+---------------+------------+
| ActivityId | CreatorID | ParticipantID | TimeJoined |
+------------+-----------+---------------+------------+
| abc | def | eft | 21234 |
| ... | ... | ... | ... |
+------------+-----------+---------------+------------+
Lastly, assume there's a table that stores mutual activities for these recommended friends (not super important, but assume it looks like this):
+-------+-------+------------+
| User1 | User2 | ActivityID |
+-------+-------+------------+
| abc | def | eft |
| ... | ... | ... |
+-------+-------+------------+
So here's the query I want to run:
Get all the participants for a particular activity.
For each of these participants, get all the other participants that aren't their friend
Add that tuple of {participant, other non-friend participant} to the "mutual activities" table
So there are obviously a couple of ways to do this. I could write a simple BigQuery script with looping, but I'm not a fan of that because it'll result in a lot of scans, and since BigQuery doesn't use indexes it won't scale well (in terms of cost).
I could also use a subquery with NOT EXISTS, something like SELECT ParticipantID from activities WHERE activityID = x AND NOT EXISTS {something to show that there doesn't exist a friend relation}, but then it's unclear how to make this work for every participant in one go. I'd be fine with a solution whose table scans scale linearly with the number of participants, but I suspect that even if I somehow get this to work, every NOT EXISTS will result in a full scan per participant pair, resulting in quadratic scaling.
There might be something I can do with joining, but I'm not sure.
Would love some thoughts and guidance on this. I'm not very used to SQL, especially complex queries like this.
PS: If any of y'all would like to suggest another serverless solution rather than BigQuery, go ahead please :)
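The three steps listed above can also be written as one set-based statement instead of a loop. The following is only a minimal sketch: it runs on SQLite (via Python's sqlite3) rather than BigQuery, the table names friendships, activities and mutual_activities are hypothetical names for the three tables shown in the post, and the sample rows are made up. It pairs participants of one activity with a self-join and drops pairs that already appear in the friendship table using a LEFT JOIN ... IS NULL anti-join.

```python
import sqlite3

# SQLite stands in for BigQuery here; only the shape of the query is the point.
# Table names are hypothetical, columns follow the tables in the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friendships (User1 TEXT, User2 TEXT, TimeFriended INTEGER);
CREATE TABLE activities (ActivityId TEXT, CreatorID TEXT,
                         ParticipantID TEXT, TimeJoined INTEGER);
CREATE TABLE mutual_activities (User1 TEXT, User2 TEXT, ActivityID TEXT);

-- abc and def are friends, stored in both directions as described in the post.
INSERT INTO friendships VALUES ('abc', 'def', 12345), ('def', 'abc', 12345);
-- Made-up data: three participants join activity 'act1'.
INSERT INTO activities VALUES
  ('act1', 'zzz', 'abc', 1), ('act1', 'zzz', 'def', 2), ('act1', 'zzz', 'rft', 3);
""")

# Self-join activities to get every ordered pair of distinct participants,
# then keep only the pairs with no matching friendship row (anti-join).
conn.execute("""
INSERT INTO mutual_activities (User1, User2, ActivityID)
SELECT a.ParticipantID, b.ParticipantID, a.ActivityId
FROM activities AS a
JOIN activities AS b
  ON a.ActivityId = b.ActivityId
 AND a.ParticipantID <> b.ParticipantID
LEFT JOIN friendships AS f
  ON f.User1 = a.ParticipantID
 AND f.User2 = b.ParticipantID
WHERE a.ActivityId = ?
  AND f.User1 IS NULL
""", ("act1",))

pairs = sorted(conn.execute(
    "SELECT User1, User2 FROM mutual_activities").fetchall())
print(pairs)
```

The number of output pairs is still quadratic in the number of participants of a single activity, since that is the size of the result itself. Whether the scan cost is acceptable in BigQuery is not verified by this sketch and would need to be checked there.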
I have a table that has a user_id and a new record for each return reason for that user, as shown here:
| user_id | return_reason |
|--------- |-------------- |
| 1 | broken |
| 2 | changed mind |
| 2 | overpriced |
| 3 | changed mind |
| 4 | changed mind |
What I would like to do is generate a foreign key for each combination of values that are applicable in a new table and apply that key to the user_id in a new table. Effectively creating a many to many relationship. The result would look like so:
Dimension Table ->
| reason_id | return_reason |
|----------- |--------------- |
| 1 | broken |
| 2 | changed mind |
| 2 | overpriced |
| 3 | changed mind |
Fact Table ->
| user_id | reason_id |
|--------- |----------- |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 3 |
My thought process is to iterate through the table with a cursor, but this seems like a standard problem and therefore has a more efficient way of doing this. Is there a specific name for this type of problem? I also thought about pivoting and unpivoting. But that didn't seem too clean either. Any help or reference to articles in how to process this is appreciated.
The problem concerns data normalization and relational integrity. Your concept doesn't really make sense - Dimension table shows two different reasons with same ID and Fact table loses a record. Conventional schema for this many-to-many relationship would be three tables like:
Users table (info about users and UserID is unique)
Reasons table (info about reasons and ReasonID is unique)
UserReasons junction table (associates users with reasons - your
existing table). Assuming user could associate with same reason
multiple times, probably also need ReturnDate and OrderID_FK fields
in UserReasons.
So, need to replace reason description in first table (UserReasons) with a ReasonID. Add a number long integer field ReasonID_FK in that table to hold ReasonID key.
To build Reasons table based on current data, use DISTINCT:
SELECT DISTINCT return_reason INTO Reasons FROM UserReasons
In new table, rename return_reason field to ReasonDescription and add an autonumber field ReasonID.
Now run UPDATE action to populate ReasonID_FK field in UserReasons.
UPDATE UserReasons INNER JOIN Reasons ON UserReasons.return_reason = Reasons.ReasonDescription SET UserReasons.ReasonID_FK = Reasons.ReasonID
When all looks good, delete return_reason field.
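As a rough end-to-end illustration of the steps above, here is a sketch that runs them on the sample data from the question. It uses SQLite through Python's sqlite3 module, which is an assumption made for the sake of a runnable example: SQLite has no UPDATE ... INNER JOIN ... SET form, so the update is written as a correlated subquery, and the Reasons table is created with its key up front instead of being altered afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Existing table from the question, plus the new (still empty) ReasonID_FK field.
CREATE TABLE UserReasons (user_id INTEGER, return_reason TEXT, ReasonID_FK INTEGER);
INSERT INTO UserReasons (user_id, return_reason) VALUES
  (1, 'broken'), (2, 'changed mind'), (2, 'overpriced'),
  (3, 'changed mind'), (4, 'changed mind');

-- Reasons table: one row per distinct description, key assigned automatically.
CREATE TABLE Reasons (
  ReasonID INTEGER PRIMARY KEY AUTOINCREMENT,
  ReasonDescription TEXT UNIQUE
);
INSERT INTO Reasons (ReasonDescription)
  SELECT DISTINCT return_reason FROM UserReasons;

-- Populate the foreign key by matching on the description text.
UPDATE UserReasons
SET ReasonID_FK = (SELECT ReasonID FROM Reasons
                   WHERE Reasons.ReasonDescription = UserReasons.return_reason);
""")

reasons = conn.execute(
    "SELECT ReasonID, ReasonDescription FROM Reasons ORDER BY ReasonID").fetchall()

# Joining through the new key reproduces the original user/reason pairs,
# which is the check to make before deleting the return_reason field.
joined = conn.execute("""
SELECT u.user_id, r.ReasonDescription
FROM UserReasons u JOIN Reasons r ON r.ReasonID = u.ReasonID_FK
ORDER BY u.user_id, r.ReasonDescription
""").fetchall()
print(reasons)
print(joined)
```

Note that user 2 keeps two rows in UserReasons, one per reason, so no record is lost.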
I have two tables file & users, I want to see the file info for each user for C:\Users\%USERNAME%\Documents
So e.g. this would get the info from 'example' documents:
SELECT *
FROM file
WHERE path LIKE 'C:\Users\example\Documents\%%';
But the username is coming from the users
SELECT username FROM users;
returns
+--------------------+
| username |
+--------------------+
| Administrator |
| DefaultAccount |
| example |
| Guest |
| WDAGUtilityAccount |
| SYSTEM |
| LOCAL SERVICE |
| NETWORK SERVICE |
+--------------------+
Alternatively, there's:
SELECT directory FROM users;
+---------------------------------------------+
| directory |
+---------------------------------------------+
| |
| |
| C:\Users\example |
| |
| |
| %systemroot%\system32\config\systemprofile |
| %systemroot%\ServiceProfiles\LocalService |
| %systemroot%\ServiceProfiles\NetworkService |
+---------------------------------------------+
This provides the first part of the path, but I still can't append 'Documents' to it and run the file query at the same time.
So, how do I loop through each of the usernames?
I've tried modifying the tables, but neither table can be modified.
This is a great opportunity to use a JOIN query:
SELECT f.*
FROM file f JOIN users u
WHERE f.path LIKE 'C:\Users\' || u.username || '\Documents\%%'
When you run this query, osquery will first generate the list of users, then substitute the username into the path provided to the file table.
JOIN is a really powerful way to combine the results of various tables, and it's well worth taking some time to experiment and learn how to use this power.
Zach's answer is great, but there are times that a user's directory can be named differently than their respective username.
Thankfully, we also have the directory column in the users table which returns a user's home directory. Using this column will prevent directory/username mismatches from causing issues in your query output:
SELECT f.*
FROM file f JOIN users u
WHERE f.path LIKE u.directory || '\Documents\%%';
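To see how the per-user LIKE pattern is built by the join, here is a small sketch using plain SQLite tables from Python. These are ordinary stand-in tables with made-up rows, not osquery's virtual file and users tables, so nothing is read from disk, and the pattern ends in a single % because the doubled %% wildcard is specific to osquery. The user "renamed", whose home directory does not match the username, is a hypothetical example of the mismatch described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Raw string so the Windows backslashes reach SQLite unchanged.
conn.executescript(r"""
CREATE TABLE users (username TEXT, directory TEXT);
CREATE TABLE file (path TEXT);
INSERT INTO users VALUES
  ('example', 'C:\Users\example'),
  ('renamed', 'C:\Users\old_name');
INSERT INTO file VALUES
  ('C:\Users\example\Documents\a.txt'),
  ('C:\Users\old_name\Documents\b.txt'),
  ('C:\Users\example\Desktop\c.txt');
""")

# Each users row contributes its own pattern: directory || '\Documents\%'.
by_directory = conn.execute(r"""
SELECT u.username, f.path
FROM file f JOIN users u
WHERE f.path LIKE u.directory || '\Documents\%'
ORDER BY f.path
""").fetchall()

# Building the pattern from username misses the user whose folder was renamed.
by_username = conn.execute(r"""
SELECT u.username, f.path
FROM file f JOIN users u
WHERE f.path LIKE 'C:\Users\' || u.username || '\Documents\%'
ORDER BY f.path
""").fetchall()
print(by_directory)
print(by_username)
```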
The title might be worded strange, but it's probably because I don't even know if I'm asking the right question.
So essentially what I'm trying to build is a "breadcrumbish" categorization system (like a file directory) where each node has a parent (except for the root) and each node can contain either data or another node. This will be used for organizing email addresses in a database. I have a system right now where you can create a "group" and add email addresses to that group, but it would be very nice to add an organizational system to it.
This (in my head) is in a tree format, but I don't know what tree.
The issue I'm having is building it using MySQL. It's easy to traverse trees that are in memory, but on database, it's a bit trickier.
Image of tree: http://j.imagehost.org/0917/asdf.png
SELECT * FROM Businesses:
Tim's Hardware Store, 7-11, Kwik-E-Mart, Cub Foods, Bob's Grocery Store, CONGLOM-O
SELECT * FROM Grocery Stores:
Cub Foods, Bob's Grocery Store, CONGLOM-O
SELECT * FROM Big Grocery Stores:
CONGLOM-O
SELECT * FROM Churches:
St. Peter's Church, St. John's Church
I think this should be enough information so I can accurately describe what my goal is.
Well, there are a few patterns you could use. Which one is right depends on your needs.
Do you need to select a node and all its children? If so, then a Nested Set Model may be better for you. The table would look like this:
| Name | Left | Right |
| Emails | 1 | 12 |
| Business | 2 | 7 |
| Tim's | 3 | 4 |
| 7-11 | 5 | 6 |
| Churches | 8 | 11 |
| St. Pete | 9 | 10 |
So then, to find anything below a node, just do
SELECT name FROM nodes WHERE Left > *yourleftnode* AND Right < *yourrightnode*
To find everything above the node:
SELECT name FROM nodes WHERE Left < *yourleftnode* AND Right > *yourrightnode*
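The two nested-set queries above can be tried on the sample table with a short sketch like the following. It uses SQLite via Python's sqlite3 purely so that it runs anywhere, and it names the columns lft and rgt (an assumption of this sketch) because LEFT and RIGHT are SQL keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same rows as the table above; lft/rgt replace the keyword-like Left/Right.
conn.executescript("""
CREATE TABLE nodes (name TEXT, lft INTEGER, rgt INTEGER);
INSERT INTO nodes VALUES
  ('Emails', 1, 12), ('Business', 2, 7), ('Tim''s', 3, 4),
  ('7-11', 5, 6), ('Churches', 8, 11), ('St. Pete', 9, 10);
""")

def bounds(name):
    """Look up the left/right values of the node we start from."""
    return conn.execute(
        "SELECT lft, rgt FROM nodes WHERE name = ?", (name,)).fetchone()

def descendants(name):
    """Everything below a node: its interval strictly contains theirs."""
    lft, rgt = bounds(name)
    return [r[0] for r in conn.execute(
        "SELECT name FROM nodes WHERE lft > ? AND rgt < ? ORDER BY lft",
        (lft, rgt))]

def ancestors(name):
    """Everything above a node: their interval strictly contains its own."""
    lft, rgt = bounds(name)
    return [r[0] for r in conn.execute(
        "SELECT name FROM nodes WHERE lft < ? AND rgt > ? ORDER BY lft",
        (lft, rgt))]

print(descendants("Business"))
print(ancestors("Tim's"))
```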
If you only want to query for a specific level, you could use an Adjacency List Model:
| Id | Name | Parent_Id |
| 1 | Email | null |
| 2 | Business | 1 |
| 3 | Tim's | 2 |
To find everything on the same level, just do:
SELECT name FROM nodes WHERE parent_id = *yourparentnode*
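Here is a small runnable sketch of the adjacency list query, again on SQLite through Python's sqlite3. Rows 4 and 5 are extra illustrative rows that are not in the table above. The second query goes beyond what is described above: it is a recursive common table expression that walks a whole subtree, which SQLite supports and which MySQL supports from version 8.0 onward, so treat it as an optional extra rather than part of the pattern itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (
  id INTEGER PRIMARY KEY,
  name TEXT,
  parent_id INTEGER REFERENCES nodes(id)
);
-- Rows 1-3 come from the table above; rows 4-5 are illustrative additions.
INSERT INTO nodes VALUES
  (1, 'Email', NULL), (2, 'Business', 1), (3, 'Tim''s', 2),
  (4, '7-11', 2), (5, 'Churches', 1);
""")

# One level: all nodes that share the given parent.
children = [r[0] for r in conn.execute(
    "SELECT name FROM nodes WHERE parent_id = ? ORDER BY id", (2,))]

# Whole subtree: start at one node and repeatedly follow parent_id links.
subtree = [r[0] for r in conn.execute("""
WITH RECURSIVE sub(id, name) AS (
  SELECT id, name FROM nodes WHERE id = ?
  UNION ALL
  SELECT n.id, n.name FROM nodes n JOIN sub s ON n.parent_id = s.id
)
SELECT name FROM sub ORDER BY id
""", (1,))]
print(children)
print(subtree)
```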
Of course, there's nothing stopping you from using a hybrid approach, which lets you choose whichever style suits the query at hand:
| Id | Name | Parent_Id | Left | Right | Path |
| 1 | Email | null | 1 | 6 | / |
| 2 | Business | 1 | 2 | 5 | /Email/ |
| 3 | Tim's | 2 | 3 | 4 | /Email/Business/ |
Really, it's just a matter of your needs...
The easiest way to do it would be something like this:
Group
- GroupID (PK)
- ParentGroupID
- GroupName
People
- PersonID (PK)
- EmailAddress
- FirstName
- LastName
GroupMembership
- GroupID (PK)
- PersonID (PK)
That should establish a structure where you can have groups that have parent groups and people that can be members of groups (or multiple groups). If a person can only be a member of one group, then get rid of the GroupMembership table and just put a GroupID on the People table.
Complex queries against this structure can get difficult though. There are other less intuitive ways to model this that make querying easier (but often make updates more difficult). If the number of groups is small, the easiest way to handle queries against this is often to load the whole tree of Groups into memory, cache it, and use that to build your queries.
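The "load the whole tree into memory and use it to build your queries" idea could look roughly like the sketch below. It is a minimal illustration under several assumptions: SQLite via Python's sqlite3 stands in for the real database, the table is called Groups because GROUP is a reserved word, the sample rows and email addresses are made up, and the helper names subtree_ids and emails_in are hypothetical.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Groups (GroupID INTEGER PRIMARY KEY, ParentGroupID INTEGER, GroupName TEXT);
CREATE TABLE People (PersonID INTEGER PRIMARY KEY, EmailAddress TEXT,
                     FirstName TEXT, LastName TEXT);
CREATE TABLE GroupMembership (GroupID INTEGER, PersonID INTEGER,
                              PRIMARY KEY (GroupID, PersonID));
INSERT INTO Groups VALUES
  (1, NULL, 'Businesses'), (2, 1, 'Grocery Stores'),
  (3, 2, 'Big Grocery Stores'), (4, NULL, 'Churches');
INSERT INTO People VALUES
  (1, 'tim@example.com', 'Tim', 'H'),
  (2, 'bob@example.com', 'Bob', 'G'),
  (3, 'info@conglom-o.example', 'Conglom', 'O');
INSERT INTO GroupMembership VALUES (1, 1), (2, 2), (3, 3);
""")

# Load the whole group tree once; this dict is the in-memory cache.
children = defaultdict(list)
for group_id, parent_id in conn.execute("SELECT GroupID, ParentGroupID FROM Groups"):
    children[parent_id].append(group_id)

def subtree_ids(group_id):
    """Return the group itself plus all groups nested below it."""
    ids, stack = [], [group_id]
    while stack:
        current = stack.pop()
        ids.append(current)
        stack.extend(children.get(current, []))
    return ids

def emails_in(group_id):
    """Use the cached tree to build a flat IN (...) query over the subtree."""
    ids = subtree_ids(group_id)
    placeholders = ",".join("?" * len(ids))
    sql = ("SELECT DISTINCT p.EmailAddress FROM People p "
           "JOIN GroupMembership m ON m.PersonID = p.PersonID "
           f"WHERE m.GroupID IN ({placeholders}) ORDER BY p.EmailAddress")
    return [r[0] for r in conn.execute(sql, ids)]

print(emails_in(1))
```

The cache has to be rebuilt whenever a group is added, moved or deleted, which is the update cost mentioned above.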
As always when I see questions about modeling trees and hierarchies, my suggestion is that you get a hold of a copy of Joe Celko's book on the subject. He presents various ways to model them in a RDBMS, some of which are fairly imaginative, and he gives the pros and cons for each pattern.
Create an object Group which has a name, many email addresses, and a parent, which can be null.
I would like to know how to backup my data from 2 separate tables (CATEGORIES and SUBCATEGORIES, where SUBCATEGORIES belong to a CATEGORY) in such a way that I can restore the relationship at a later time. I am not sure if mysqldump --opt db_name would suffice.
Example:
Categories:
| ID | name
-----------
| 1 | Audio
| 9 | Video
Subcategories:
| ID | category_id | name
-------------------------
| 1 | 1 | Guitar
| 2 | 1 | Piano
| 3 | 9 | Video Camera
Thanks
mysqldump is sufficient
It generates the SQL needed to rebuild your database. Relationships are not special data (they are just matching values between tables), so a dump is enough to back up the database. Even without the --opt parameter, mysqldump includes the index definitions, so the constraints will remain.
By default, mysqldump also emits the CREATE TABLE statements, which preserves the relationship.
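The same principle can be demonstrated without a MySQL server. The sketch below is an analogy, not mysqldump itself: it uses SQLite's Connection.iterdump() from Python's standard library to produce a SQL dump of the two example tables, replays that dump into an empty database, and checks that each subcategory still joins to its category. The REFERENCES clause on category_id is an assumption of this sketch, since the question does not say whether a foreign key is declared.

```python
import sqlite3

src = sqlite3.connect(":memory:")
src.executescript("""
CREATE TABLE categories (ID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE subcategories (
  ID INTEGER PRIMARY KEY,
  category_id INTEGER REFERENCES categories(ID),
  name TEXT
);
INSERT INTO categories VALUES (1, 'Audio'), (9, 'Video');
INSERT INTO subcategories VALUES
  (1, 1, 'Guitar'), (2, 1, 'Piano'), (3, 9, 'Video Camera');
""")

# The dump is plain SQL: CREATE TABLE statements followed by INSERTs.
dump_sql = "\n".join(src.iterdump())

# Restore into a brand-new database by replaying the dump.
restored = sqlite3.connect(":memory:")
restored.executescript(dump_sql)

# The IDs were dumped verbatim, so the join works exactly as before.
rows = restored.execute("""
SELECT s.name, c.name
FROM subcategories s JOIN categories c ON c.ID = s.category_id
ORDER BY s.ID
""").fetchall()
print(rows)
```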