Bigtable - read_rows and start_key

Is there a way to write the start_key for Bigtable? I was not able to find clear documentation on the syntax for start_key.
Suppose I have a row key of {domain}_{timestamp} of user activity.
To filter the query to a specific domain, I could use a filter (slower) or a start_key (faster).
I have been writing my start_key string as {domain}_. But what if the row key now contains domain, user ID, and timestamp, and I want to filter by any user but a specific time? Can I use something like {domain}_*_{timestamp}?

You have to use a Filter with RegexStringComparator. You can also setStart({domain}_) for better performance. Unfortunately, that is still essentially going to scan everything under {domain}_ and filter on the server side.
It might be faster to do a search with a single user ID, or, if you need all users, to use Table.get(List<Get>) where each Get corresponds to an individual user.
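For reference, a minimal sketch of the scan-plus-regex-filter approach using the HBase 1.x-style client API (which the Cloud Bigtable HBase client exposes). The row-key layout {domain}_{userId}_{timestamp} and the class and method names here are assumptions for illustration:

import java.util.regex.Pattern;

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.RegexStringComparator;
import org.apache.hadoop.hbase.filter.RowFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class DomainTimestampScan {
    // Build a Scan bounded to one domain prefix, with a server-side regex
    // filter that accepts any user ID but pins a specific timestamp.
    public static Scan buildScan(String domain, long timestamp) {
        Scan scan = new Scan();
        // Bound the scan to the "{domain}_" prefix; '`' is the byte right
        // after '_', so it acts as an exclusive stop key for the prefix.
        scan.setStartRow(Bytes.toBytes(domain + "_"));
        scan.setStopRow(Bytes.toBytes(domain + "`"));
        // Everything under the prefix is still read; the regex only filters rows server-side.
        String rowRegex = "^" + Pattern.quote(domain) + "_[^_]+_" + timestamp + "$";
        scan.setFilter(new RowFilter(CompareFilter.CompareOp.EQUAL,
                new RegexStringComparator(rowRegex)));
        return scan;
    }
}

You would then run it like any other HBase scan, e.g. with Table.getScanner(scan).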

Related

LDAP limit user search on specific OUs

I have been wondering whether it is possible to limit OUs in the search base. This is what my hierarchy looks like:
Now, my search base is: dc=prod,dc=prod,dc=co
Is there possibility to limit user search only to these:
OU=PROD,OU=SYS
OU=PROD,OU=Int
OU=UNIX
I'm a noob in this area, so any help would be really welcome.
I am not sure if it is possible to use userSearchBase for multiple OUs (so far my understanding is that it is not possible, although I saw a working example for sssd).
I think some user search filter might do it, but unfortunately I haven't been successful with that.
Yes, you can limit the search base to a single OU or to multiple OUs.
Ranger does accept multiple search bases, for example:
OU=PROD,OU=SYS,dc=prod,dc=prod,dc=co;OU=PROD,OU=Int,dc=prod,dc=prod,dc=co;OU=UNIX,dc=prod,dc=prod,dc=co
A few things to note: the entries have to be separated by ";", and each needs the full path including the "dc" values.

Searching using SOLR on multiple fields

I have two requirements for my SOLR implementation:
I need to be able to search on multiple fields at the same time (preferably with field boosting). This is possible using the dismax parser.
I also have a specific set of indexed fields (for example, a gender field). I need to be able to apply specific filters on those (example: select?q=david&gender:male&status:married). As per my understanding of dismax, this is not possible.
Please suggest whether the second requirement can be handled using dismax (or edismax). For now I am forced to use the standard query parser, even though I really liked dismax.
There is nothing stopping you from using dismax or edismax. Use qf to tell it which fields to search by default, and use fq to apply queries that act as filters.
/select?q=david&fq=gender:male&fq=status:married&qf=name^10 address^3
Filter queries don't affect the score and are cached separately. If you always filter on both gender and status, you could combine them into a single filter query to get a single cache entry instead (fq=gender:male AND status:married).
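If you are querying from Java, a minimal SolrJ sketch of the same request might look like this (the Solr URL, core name, and field names are assumptions for illustration):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class EdismaxFilterExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical Solr URL and core name.
        HttpSolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/users").build();

        SolrQuery query = new SolrQuery("david");   // q=david
        query.set("defType", "edismax");            // search with the edismax parser
        query.set("qf", "name^10 address^3");       // fields to search, with boosts
        query.addFilterQuery("gender:male");        // fq: restricts results, no effect on score
        query.addFilterQuery("status:married");     // fq: cached separately

        QueryResponse response = solr.query(query);
        System.out.println(response.getResults().getNumFound() + " matching documents");
        solr.close();
    }
}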

How can I store invite requests in DynamoDB?

I'm trying to write an invite system for a project I'm working on. I plan on having a "give us your email address and we'll give you a beta invite when we're ready" type of thing. I'm trying to figure out how I can design a DynamoDB table so that I can query for the first x users who haven't already received an invite.
The table I'm thinking about creating would have something like the following columns:
email
date
fulfilled (boolean)
Can I do this with some combination of Hash keys, Range keys, and secondary indexes in DynamoDB? Or is this something that's better suited for a SQL database? The SQL query would be something like this:
SELECT email
FROM invite_request
WHERE fulfilled = 0
ORDER BY date
LIMIT 50;
Your best option here is to use a hash key whose value is either Fulfilled or Unfulfilled. That way you can use the query method, which is much less expensive than a scan. So your record would look like:
:Hash_key => Boolean, :Range_key => DateTime, Attributes => [email => value]
Using this structure you can query all unfulfilled emails within a certain date range and limit the results. Hope this helps. If you want more flexible access to the information later, you may want to look into SimpleDB as a better option, but if you're only using this for a single function, DynamoDB is a great way to go.
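As a rough sketch with the AWS SDK for Java, a query for the 50 oldest unfulfilled requests could look like the following. The table name invite_request, the string hash key fulfilled with value UNFULFILLED, an ISO-8601 date range key, and the email attribute are all assumptions for illustration:

import java.util.Map;

import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.QueryRequest;
import com.amazonaws.services.dynamodbv2.model.QueryResult;

public class PendingInvites {
    public static void main(String[] args) {
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

        // Query the UNFULFILLED partition, oldest requests first, 50 at a time.
        QueryRequest request = new QueryRequest()
                .withTableName("invite_request")
                .withKeyConditionExpression("fulfilled = :f")
                .addExpressionAttributeValuesEntry(":f", new AttributeValue("UNFULFILLED"))
                .withScanIndexForward(true)   // ascending by the date range key
                .withLimit(50);

        QueryResult result = client.query(request);
        for (Map<String, AttributeValue> item : result.getItems()) {
            System.out.println(item.get("email").getS());
        }
    }
}

One wrinkle of this design: since key attributes can't be changed by an update, marking an invite as fulfilled means deleting the item and re-inserting it under the Fulfilled hash key value.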

Multiple searches within a search result set (stored procedure)

Multiple searches within a search result set, using all the search terms entered in that session.
For example, I have a table User (UserId, UserName, UserAddress, UserCity)
What I am trying to do is this: I want to search all the columns in the table, for example using a user's name (which might give me a result set consisting of more than one row). I want to be able to search within that result set again using a new search term (which doesn't necessarily have to contain the first search term), but this time it must search only within the result set of the first search. This might go on, breaking down the result set until what is required is found.
Sorry if I sound confusing with my request. I've tried and still have no clue where to start. I've tried googling and browsed through this website, but couldn't find what I am really looking for.
I want to be able to search within the result set again using a new search term [...], but this time, it must search within the result set of the first search. This might go on, breaking down the result set until what is required is found.
It seems to me that you have not yet understood that SQL is a declarative language, not an imperative one. And yes, there are stored procedures, but these are a procedural extension to SQL and don't alter the fact that SQL is essentially declarative.
So instead of "breaking down the result-set until what is required is found", you specify all criteria at once, and preferably do so without resorting to a stored procedure until you've understood non-procedural SQL.
To give you an example, a query using multiple predicates (facts about the desired result specified in a WHERE clause) might look like this:
SELECT UserId FROM User
WHERE UserName LIKE 'cook%'
AND UserAddress LIKE 'sesam%'
AND UserCity = 'Hamburg';

How can I filter out profanity in numeric IDs?

I want to use numeric IDs in a web application I am developing... However, as the ID is visible to users in the URL, I want to filter out profanity. Things like (I'll leave it to you to figure out what they are):
page.php?id=455
page.php?id=8008135
page.php?id=69
Has anyone solved this? Is this even a real problem?
Does it make sense just to skip numbers in my database sequence?
See also: How can I filter out profanity in base36 IDs?
How about using GUIDs? That would obscure the actual numbers. I would bet most users don't even notice what is in the URL.
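As a small sketch of that idea in Java (the page.php?id= URL pattern is just taken from the question):

import java.util.UUID;

public class PublicIdExample {
    public static void main(String[] args) {
        // Generate an opaque public identifier instead of exposing a raw numeric ID.
        String publicId = UUID.randomUUID().toString();

        // The internal numeric primary key never appears in the URL.
        System.out.println("page.php?id=" + publicId);
        // e.g. page.php?id=3f1c2b9e-6c54-4a8a-9a6d-2f0f4f7f9d21
    }
}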