Couchbase wildcard / variable keys in view - dynamic

After doing some research and testing with Couchbase, I am getting some good results.
However, it seems strange that views must be created a head of time and are not very flexible.
Basically, if I have a view like this..
function(doc, meta) {
emit([doc.name, doc.location, doc.gender, doc.birthYear, doc.birthMonth], null);
}
And I want to query but different keys. Such as, maybe name = "John" and gender = "M"
It doesnt seem I can do startKey = ["John", {}, "M"], endKey = ["John", {}, "M", {}].
Similarly, what if I just want to filter the above by gender and birth month?
It seems i have to manually crate an individual view for every possible type of query, which with lots of data points if less than optimal.
I havnt seen any questions addressing this. Also, I looked into passing args to map or reduce to do any of it dynamically but that cant be done. I'd be stuck pulling ALL records across all group levels then having to manually sort/aggregate this data.
Can this be done?
Thank you

As of Couchbase version 4.x you have N1QL query language. You can specify a filter criteria to select your json objects without having any views in place.
So as per your example, you should be able issue a query like that:
SELECT *
FROM your_bucket_name
WHERE name = 'John' AND gender = 'M'
Here is a N1QL tutorial to get feel of it.
Yet another way is to use Couchbase integration with ElasticSearch and execute search query in ElasticSearch engine that will return you all the keys it found based on your search criteria.

http://www.couchbase.com/communities/n1ql
N1QL is more rich wuering language to couchbase data, which does not as limited as views

Related

Google Bigquery, WHERE clause based on JSON item

I've got a bigquery import from a firestore database where I want to query on a particular field from a document. This was populated via the firestore-bigquery extension and the document data is stored as a JSON string.
I'm trying to use a WHERE clause in my query that uses one of the fields from the JSON data. However this doesn't seem to work.
My query is as follows:
SELECT json_extract(data,'$.title') as title,p
FROM `table`
left join unnest(json_extract_array(data, '$.tags')) as p
where json_extract(data,'$.title') = 'technology'
data is the JSON object and title is an attribute of all of the items. The above query will run but yield 'no results' (There are definitely results there for the title in question as they appear in the table preview).
I've tried using WHERE title = 'technology' as well but this returns an error that title is an unrecognized field (hence the json_extract).
From my research this should work as a standard SQL JSON query but doesn't seem to work on Bigquery. Does anyone know of a way around this?
All I can think of is if I put the results in another table, but I don't know if that's a workable solution as the data is updated via the extension on an update, so I would need to constantly refresh my second table as well.
Edit
I'm wondering if configuring a view would help with this? Though ultimately I would like to query this based on different parameters and the docs here https://cloud.google.com/bigquery/docs/views suggest you can't reference query parameters in a view
I've since managed to work this out, and will share the solution for anyone else with the same problem.
The solution was to use JSON_VALUE in the WHERE clause instead e.g:
where JSON_VALUE(data,'$.title') = 'technology';
I'm still not sure if this is the best way to do this in terms of performance and cost so I will wait to see if anyone else leaves a better answer.

SQL DB2 - How to SELECT or compare columns based on their name?

Thank you for checking my question out!
I'm trying to write a query for a very specific problem we're having at my workplace and I can't seem to get my head around it.
Short version: I need to be able to target columns by their name, and more specifically by a part of their name that will be consistent throughout all the columns I need to combine or compare.
More details:
We have (for example), 5 different surveys. They have many questions each, but SOME of the questions are part of the same metric, and we need to create a generic field that keeps it. There's more background to the "why" of that, but it's pretty important for us at this point.
We were able to kind of solve this with either COALESCE() or CASE statements but the challenge is that, as more surveys/survey versions continue to grow, our vendor inevitably generates new columns for each survey and its questions.
Take this example, which is what we do currently and works well enough:
CASE
WHEN SURVEY_NAME = 'Service1' THEN SERV1_REC
WHEN SURVEY_NAME = 'Notice1' THEN FNOL1_REC
WHEN SURVEY_NAME = 'Status1' THEN STAT1_REC
WHEN SURVEY_NAME = 'Sales1' THEN SALE1_REC
WHEN SURVEY_NAME = 'Transfer1' THEN Null
ELSE Null
END REC
And also this alternative which works well:
COALESCE(SERV1_REC, FNOL1_REC, STAT1_REC, SALE1_REC) as REC
But as I mentioned, eventually we will have a "SALE2_REC" for example, and we'll need them BOTH on this same statement. I want to create something where having to come into the SQL and make changes isn't needed. Given that the columns will ALWAYS be named "something#_REC" for this specific metric, is there any way to achieve something like:
COALESCE(all columns named LIKE '%_REC') as REC
Bonus! Related, might be another way around this same problem:
Would there also be a way to achieve this?
SELECT (columns named LIKE '%_REC') FROM ...
Thank you very much in advance for all your time and attention.
-Kendall
Table and column information in Db2 are managed in the system catalog. The relevant views are SYSCAT.TABLES and SYSCAT.COLUMNS. You could write:
select colname, tabname from syscat.tables
where colname like some_expression
and syscat.tabname='MYTABLE
Note that the LIKE predicate supports expressions based on a variable or the result of a scalar function. So you could match it against some dynamic input.
Have you considered storing the more complicated properties in JSON or XML values? Db2 supports both and you can query those values with regular SQL statements.

Understanding GraphQL

I started experimenting with the GraphQL wp api.
I am querying the menus. As for the documentation, the query is very long
I would expect that querying
{
menus
}
only would bring about all the data nested in menus, it does not.
Why is this? What is the way to getting all nested data in an object as to see what's in there?
Thank you for your time
The rule is that every "leaf" fields in a GraphQL query should be a Scalar something like Int , Boolean , String etc. So if the meuns field in the root Query type is a Scalar , it is a valid query and will return you something.
If not , you have to continue navigating the Menu type and pick the fields that you want to include in the GraphQL query such as :
{
menus {
id
createdDate
}
}
There is no wildcard that can represent all fields in current GraphQL spec.You have to explicitly declare all fields you want to select in the query.By looking at the GraphQL schema, you can know the available fields for each type. One of the tips is to rely on the GraphQL introspection system .It basically means that you can use some of the GraphQL client such as Altair, Graphiql, or GraphQL Playground etc. which most of them will have some auto-suggest function that will guide you to compose a query by suggesting you what fields are available to be included for a type .
P.S. A similar analogy to SQL is that there is no select * from foo , you have to explicitly define the columns that you want to select in the select clause such as select id,name,address from foo.
If you keep in mind that you're getting back a JSON object, you can think of your GraphQL query as defining the left-hand side of the response (this is intentional in how it was designed), e.g. just the keys. So unless there are null values, what you get back should exactly match the shape of the query.
If you want to see what can be queried, you need access to the schema itself. If it's a schema provided by someone else (looks like WordPress in this case), they should also have provided the means to explore and understand it.
That is the main feature of GraphQL, you can specify what data you need from a query. And because of that, you can't just query menus in that way, you need to specify every nested field in menus you need and only then it'll work :)

How do I programmatically build Eloquent queries based on a custom data structure saved in a persistent storage?

I'm building a reporting tool for my Laravel app that will allow users to create reports and save them for later use.
Users will be able to select from a pre-defined list to modify the query, then run the report and save it.
Having never done this before, I was just wondering if it is ok to save the query in the database? This would allow the user to select a saved report and execute the query.
One approach that would be easier / more robust than the suggested approach of saving queries to the database would be build a Controller that constructs the queries based on user input.
You could validate server side that the query parameters match the predefined list of options and then use Eloquent's QueryBuilder to programatically build the queries.
Actual code examples are hard to provide based off of your question however, as it's very broad and doesn't contain any specific examples.
You essentially need to build a converter between your storage mechanism and your data model in PHP. A code example would not add much value because you need to build it based on your needs.
You need to build a data structure (ideally using JSON in this case, since it's powerful enough for this) that defines all the query elements in a way that your business logic is able to read and convert in Eloquent queries.
I have done something similar in the past but for some simple scenarios, like defining variables for queries, instead of actual query elements.
This is how I'd do it, for example:
{
table: 'users',
type: 'SELECT',
fields: ['firstname AS fName', 'lastname AS lName'],
wheres: [
is_admin: false,
is_registered: true
]
}
converts to:
DB::table('users')
->where('is_admin', false)
->where('is_registered', true)
->get(['firstname AS fName', 'lastname AS lName']);
which converts to:
SELECT firstname AS fName, lastname AS lName WHERE is_admin = false AND is_registered = true
Here's an answer about Saving Report Parameters to a db but from a SQL Server Report Service (SSRS) angle.
But it's still a generic enough EAV structure to work for any parameter datatype (strings, ints, dates, etc.).
You might want to skip Eloquent and use mysql stored procedures. Then you only need to save the list of parameters you'd pass to each.
Like the preferred output type (e.g. .pdf, .xlsx, .html), and who to email it to, and who has permission to run it.

Django - finding the extreme member of each group

I've been playing around with the new aggregation functionality in the Django ORM, and there's a class of problem I think should be possible, but I can't seem to get it to work. The type of query I'm trying to generate is described here.
So, let's say I have the following models -
class ContactGroup(models.Model):
.... whatever ....
class Contact(models.Model):
group = models.ForeignKey(ContactGroup)
name = models.CharField(max_length=20)
email = models.EmailField()
...
class Record(models.Model):
contact = models.ForeignKey(Contact)
group = models.ForeignKey(ContactGroup)
record_date = models.DateTimeField(default=datetime.datetime.now)
... name, email, and other fields that are in Contact ...
So, each time a Contact is created or modified, a new Record is created that saves the information as it appears in the contact at that time, along with a timestamp. Now, I want a query that, for example, returns the most recent Record instance for every Contact associated to a ContactGroup. In pseudo-code:
group = ContactGroup.objects.get(...)
records_i_want = group.record_set.most_recent_record_for_every_contact()
Once I get this figured out, I just want to be able to throw a filter(record_date__lt=some_date) on the queryset, and get the information as it existed at some_date.
Anybody have any ideas?
edit: It seems I'm not really making myself clear. Using models like these, I want a way to do the following with pure django ORM (no extra()):
ContactGroup.record_set.extra(where=["history_date = (select max(history_date) from app_record r where r.id=app_record.id and r.history_date <= '2009-07-18')"])
Putting the subquery in the where clause is only one strategy for solving this problem, the others are pretty well covered by the first link I gave above. I know where-clause subselects are not possible without using extra(), but I thought perhaps one of the other ways was made possible by the new aggregation features.
It sounds like you want to keep records of changes to objects in Django.
Pro Django has a section in chapter 11 (Enhancing Applications) in which the author shows how to create a model that uses another model as a client that it tracks for inserts/deletes/updates.The model is generated dynamically from the client definition and relies on signals. The code shows most_recent() function but you could adapt this to obtain the object state on a particular date.
I assume it is the tracking in Django that is problematic, not the SQL to obtain this, right?
First of all, I'll point out that:
ContactGroup.record_set.extra(where=["history_date = (select max(history_date) from app_record r where r.id=app_record.id and r.history_date <= '2009-07-18')"])
will not get you the same effect as:
records_i_want = group.record_set.most_recent_record_for_every_contact()
The first query returns every record associated with a particular group (or associated with any of the contacts of a particular group) that has a record_date less than the date/ time specified in the extra. Run this on the shell and then do this to review the query django created:
from django.db import connection
connection.queries[-1]
which reveals:
'SELECT "contacts_record"."id", "contacts_record"."contact_id", "contacts_record"."group_id", "contacts_record"."record_date", "contacts_record"."name", "contacts_record"."email" FROM "contacts_record" WHERE "contacts_record"."group_id" = 1 AND record_date = (select max(record_date) from contacts_record r where r.id=contacts_record.id and r.record_date <= \'2009-07-18\')
Not exactly what you want, right?
Now the aggregation feature is used to retrieve aggregated data and not objects associated with aggregated data. So if you're trying to minimize number of queries executed using aggregation when trying to obtain group.record_set.most_recent_record_for_every_contact() you won't succeed.
Without using aggregation, you can get the most recent record for all contacts associated with a group using:
[x.record_set.all().order_by('-record_date')[0] for x in group.contact_set.all()]
Using aggregation, the closest I could get to that was:
group.record_set.values('contact').annotate(latest_date=Max('record_date'))
The latter returns a list of dictionaries like:
[{'contact': 1, 'latest_date': somedate }, {'contact': 2, 'latest_date': somedate }]
So one entry for for each contact in a given group and the latest record date associated with it.
Anyway, the minimum query number is probably 1 + # of contacts in a group. If you are interested obtaining the result using a single query, that is also possible, but you'll have to construct your models in a different way. But that's a totally different aspect of your problem.
I hope this will help you understand how to approach the problem using aggregation/ the regular ORM functions.