RavenDB: Indexing documents from multiple collections - ravendb

I have several document collections that occasionally need to be pulled together into a single index for reporting purposes.
This FAQ provides a solution for writing such an index in Raven Studio: http://ravendb.net/faq/indexing-across-entities
While I understand I won't get full compile-time checking, I'm trying to avoid completely unchecked code like this:
public class Assets_ById : AbstractIndexCreationTask
{
public override IndexDefinition CreateIndexDefinition()
{
return new IndexDefinition
{
Map = #"from doc in docs
where doc[""#metadata""][""Raven-Entity-Name""] == ""Cars"" ||
doc[""#metadata""][""Raven-Entity-Name""] == ""Trains"" ||
doc[""#metadata""][""Raven-Entity-Name""] == ""Boats"" ||
doc[""#metadata""][""Raven-Entity-Name""] == ""Planes""
select new
{
Cost = doc.Cost,
Id = doc.Id,
Name = doc.Name,
Type = doc.Type,
};"
}
}
}
Is there something similar to the generic AbstractIndexCreationTask<T> that will allow me to define a heterogeneous index with lambda expressions?

You can use WhereEntityIs(names), like this:
from doc in docs.WhereEntityIs<Vehicle>("Cars", "Trains", "Boats", "Planes")
select new
{
doc.Cost,
doc.Name,
doc.Type
}

Take a look here: https://groups.google.com/forum/#!topic/ravendb/9wvRY0OiGBs
It's basically the same question and the short answer is:
"right now there isn't a better option, but there will be in the future"

Related

How do I read data in one cell and write data into another cell using Google Sheets?

So let's all assume that column B is filled with multiple, short statements. These statements may be used more than once, not at all, or just once throughout the column. I want to be able to read what's in each cell of column B and assign a category to it in column F using the Google Sheets script editor. I'll include some pseudo-code of how I would do something like this normally.
for (var i = 0; i < statements.length; i++) {
if (statements[i] == 'Description One') {
category[i] = 'Category One';
}
else if (statements[i] == 'Description Two') {
category[i] = 'Category Two';
}
// and so on for all known categories....
}
How do I go about accessing a cell for a read and accessing a different cell for a write?
Thanks in advance for the help!
Ok, so after a little more thought on the subject, I've arrived at a solution. It's super simple, albeit tedious
function assignCategory(description) {
if (description == 'Description One') {
return 'Category One';
}
// and so on for all known categories
}
Hopefully someone will see this and be helped anyway, if you guys think of a more efficient and easier to maintain way of doing this, by all means do chime in.
Assuming a sheet such as this one, which has a header and six different columns (where B is the description, and F the category); you could use a dictionary to translate your values as follows:
// (description -> category) dictionary
var translations = {
"cooking": "Cooking",
"sports": "Sport",
"leisure": "Leisure",
"music": "Music",
"others": "Other"
}
function assignCategories() {
var dataRange = SpreadsheetApp.getActiveSheet().getDataRange();
for (var i=2; i<=dataRange.getNumRows(); i++) {
var description = dataRange.getCell(i, 2).getValue();
var category = translations[description];
dataRange.getCell(i, 6).setValue(category);
}
}
In case you need additional ruling (i.e. descriptions that contain cricket must be classified as sport), you could accomplish your desired results by implementing your own custom function and using string functions (such as indexOf) or regular expressions.
Using indexOf
// (description -> category) dictionary
var translations = {
"cooking": "Cooking",
"sports": "Sport",
"leisure": "Leisure",
"music": "Music",
"others": "Other"
}
function assignCategories() {
var dataRange = SpreadsheetApp.getActiveSheet().getDataRange();
for (var i=2; i<=dataRange.getNumRows(); i++) {
var description = dataRange.getCell(i, 2).getValue()
var category = assignCategory(description);
if (category) dataRange.getCell(i, 6).setValue(category);
}
}
function assignCategory(description) {
description = description.toLowerCase();
var keys = Object.keys(translations);
for (var i=0; i<categories.length; i++) {
var currentKey = keys[i];
if (description.indexOf(currentKey) > -1)
return translations[currentKey];
}
}
This version is a bit more sophisticated. It will make the 'description' of each row lowercase in order to better compare with your dictionary, and also uses indexOf for checking whether the 'translation key' appears in the description, rather than checking for an exact match.
You should be aware however that this method will be considerably slower, and that the script may timeout (see GAS Quotas). You could implement ways to 'resume' your script operations such that you can re-run it and continue where it left off, in case that this hinders your operations.

How Do I Return the Index of type T in a Collection based on some criteria?

Kotlin has some pretty cool functions for collections. However, I have come across a problem in which the solution is not apparent to me.
I have a List of Objects. Those Objects have an ID field which coincides with a SQLite database. SQL operations are performed on the database, and a new list is generated. How can the index of an item from the new list be found based on the "ID" field (or any other field for that matter)?
the Collection.find{} function return the object, but not the index.
indexOfFirst can find the index of the first element of a collection that satisfies a specified predicate.
We have a DB SQlite that a call is made to to retrieve parentList We can obtain the items in the ArrayList with this type of code
fun onDoIt(view: View){
initDB()
for (t in 0..X-1) {
var N:String = parentList[t].dept
// NOTE two syntax here [t] and get(t)
if(t == 1){
var B:String = parentList[0].idD.toString()
println("$$$$$$$$$$$$$$$$$$$$$ ====== "+B)
}
var I:String = parentList.get(t).idD.toString()
println("################### id "+I+" for "+N)
}
}
private fun initDB() {
parentList = db.querySPDept()
if (parentList.isEmpty()) {
title = "No Records in DB"
} else {
X = parentList.size
println("**************************************** SIZE " + X)
title = "SP View Activity"
}
}

Proper Way to Retrieve More than 128 Documents with RavenDB

I know variants of this question have been asked before (even by me), but I still don't understand a thing or two about this...
It was my understanding that one could retrieve more documents than the 128 default setting by doing this:
session.Advanced.MaxNumberOfRequestsPerSession = int.MaxValue;
And I've learned that a WHERE clause should be an ExpressionTree instead of a Func, so that it's treated as Queryable instead of Enumerable. So I thought this should work:
public static List<T> GetObjectList<T>(Expression<Func<T, bool>> whereClause)
{
using (IDocumentSession session = GetRavenSession())
{
return session.Query<T>().Where(whereClause).ToList();
}
}
However, that only returns 128 documents. Why?
Note, here is the code that calls the above method:
RavenDataAccessComponent.GetObjectList<Ccm>(x => x.TimeStamp > lastReadTime);
If I add Take(n), then I can get as many documents as I like. For example, this returns 200 documents:
return session.Query<T>().Where(whereClause).Take(200).ToList();
Based on all of this, it would seem that the appropriate way to retrieve thousands of documents is to set MaxNumberOfRequestsPerSession and use Take() in the query. Is that right? If not, how should it be done?
For my app, I need to retrieve thousands of documents (that have very little data in them). We keep these documents in memory and used as the data source for charts.
** EDIT **
I tried using int.MaxValue in my Take():
return session.Query<T>().Where(whereClause).Take(int.MaxValue).ToList();
And that returns 1024. Argh. How do I get more than 1024?
** EDIT 2 - Sample document showing data **
{
"Header_ID": 3525880,
"Sub_ID": "120403261139",
"TimeStamp": "2012-04-05T15:14:13.9870000",
"Equipment_ID": "PBG11A-CCM",
"AverageAbsorber1": "284.451",
"AverageAbsorber2": "108.442",
"AverageAbsorber3": "886.523",
"AverageAbsorber4": "176.773"
}
It is worth noting that since version 2.5, RavenDB has an "unbounded results API" to allow streaming. The example from the docs shows how to use this:
var query = session.Query<User>("Users/ByActive").Where(x => x.Active);
using (var enumerator = session.Advanced.Stream(query))
{
while (enumerator.MoveNext())
{
User activeUser = enumerator.Current.Document;
}
}
There is support for standard RavenDB queries, Lucence queries and there is also async support.
The documentation can be found here. Ayende's introductory blog article can be found here.
The Take(n) function will only give you up to 1024 by default. However, you can change this default in Raven.Server.exe.config:
<add key="Raven/MaxPageSize" value="5000"/>
For more info, see: http://ravendb.net/docs/intro/safe-by-default
The Take(n) function will only give you up to 1024 by default. However, you can use it in pair with Skip(n) to get all
var points = new List<T>();
var nextGroupOfPoints = new List<T>();
const int ElementTakeCount = 1024;
int i = 0;
int skipResults = 0;
do
{
nextGroupOfPoints = session.Query<T>().Statistics(out stats).Where(whereClause).Skip(i * ElementTakeCount + skipResults).Take(ElementTakeCount).ToList();
i++;
skipResults += stats.SkippedResults;
points = points.Concat(nextGroupOfPoints).ToList();
}
while (nextGroupOfPoints.Count == ElementTakeCount);
return points;
RavenDB Paging
Number of request per session is a separate concept then number of documents retrieved per call. Sessions are short lived and are expected to have few calls issued over them.
If you are getting more then 10 of anything from the store (even less then default 128) for human consumption then something is wrong or your problem is requiring different thinking then truck load of documents coming from the data store.
RavenDB indexing is quite sophisticated. Good article about indexing here and facets here.
If you have need to perform data aggregation, create map/reduce index which results in aggregated data e.g.:
Index:
from post in docs.Posts
select new { post.Author, Count = 1 }
from result in results
group result by result.Author into g
select new
{
Author = g.Key,
Count = g.Sum(x=>x.Count)
}
Query:
session.Query<AuthorPostStats>("Posts/ByUser/Count")(x=>x.Author)();
You can also use a predefined index with the Stream method. You may use a Where clause on indexed fields.
var query = session.Query<User, MyUserIndex>();
var query = session.Query<User, MyUserIndex>().Where(x => !x.IsDeleted);
using (var enumerator = session.Advanced.Stream<User>(query))
{
while (enumerator.MoveNext())
{
var user = enumerator.Current.Document;
// do something
}
}
Example index:
public class MyUserIndex: AbstractIndexCreationTask<User>
{
public MyUserIndex()
{
this.Map = users =>
from u in users
select new
{
u.IsDeleted,
u.Username,
};
}
}
Documentation: What are indexes?
Session : Querying : How to stream query results?
Important note: the Stream method will NOT track objects. If you change objects obtained from this method, SaveChanges() will not be aware of any change.
Other note: you may get the following exception if you do not specify the index to use.
InvalidOperationException: StreamQuery does not support querying dynamic indexes. It is designed to be used with large data-sets and is unlikely to return all data-set after 15 sec of indexing, like Query() does.

How do I get a list of fields in a generic sObject?

I'm trying to build a query builder, where the sObject result can contain an indeterminate number of fields. I'm using the result to build a dynamic table, but I can't figure out a way to read the sObject for a list of fields that were in the query.
I know how to get a list of ALL fields using the getDescribe information, but the query might not contain all of those fields.
Is there a way to do this?
Presumably you're building the query up as a string, since it's dynamic, so couldn't you just loop through the fields in the describe information, and then use .contains() on the query string to see if it was requested? Not crazy elegant, but seems like the simplest solution here.
Taking this further, maybe you have the list of fields selected in a list of strings or similar, and you could just use that list?
Not sure if this is exactly what you were after but something like this?
public list<sObject> Querylist {get; set;}
Define Search String
string QueryString = 'select field1__c, field2__c from Object where';
Add as many of these as you need to build the search if the user searches on these fields
if(searchParameter.field1__c != null && searchParameter.field1__c != '')
{
QueryString += ' field1__c like \'' + searchParameter.field1__c + '%\' and ';
}
if(searchParameter.field2__c != null && searchParameter.field2__c != '')
{
QueryString += ' field2__c like \'' + searchParameter.field2__c + '%\' and ';
}
Remove the last and
QueryString = QueryString.substring(0, (QueryString.length()-4));
QueryString += ' limit 200';
add query to the list
for(Object sObject : database.query(QueryString))
{
Querylist.add(sObject);
}
To get the list of fields in an sObject, you could use a method such as:
public Set<String> getFields(sObject sobj) {
Set<String> fieldSet = new Set<String>();
for (String field : sobj.getSobjectType().getDescribe().fields.getMap().keySet()) {
try {
a.get(field);
fieldSet.add(field);
} catch (Exception e) {
}
}
return fieldSet;
}
You should refactor to bulkily this approach for your context, but it works. Just pass in an sObject and it'll give you back a set of the field names.
I suggest using a list of fields for creating both the query and the table. You can put the list of fields in the result so that it's accesible for anyone using it. Then you can construct the table by using result.getFields() and retrieve the data by using result.getRows().
for (sObject obj : result.getRows()) {
for (String fieldName : result.getFields()) {
table.addCell(obj.get(fieldName));
}
}
If your trying to work with a query that's out of your control, you would have to parse the query to get the list of fields. But I wouldn't suggest trying that. It complicates code in ways that are hard to follow.

Does Dapper support the like operator?

Using Dapper-dot-net...
The following yields no results in the data object:
var data = conn.Query(#"
select top 25
Term as Label,
Type,
ID
from SearchTerms
WHERE Term like '%#T%'",
new { T = (string)term });
However, when I just use a regular String Format like:
string QueryString = String.Format("select top 25 Term as Label, Type, ID from SearchTerms WHERE Term like '%{0}%'", term);
var data = conn.Query(QueryString);
I get 25 rows back in the collection. Is Dapper not correctly parsing the end of the parameter #T?
Try:
term = "whateverterm";
var encodeForLike = term => term.Replace("[", "[[]").Replace("%", "[%]");
string term = "%" + encodeForLike(term) + "%";
var data = conn.Query(#"
select top 25
Term as Label,
Type,
ID
from SearchTerms
WHERE Term like #term",
new { term });
There is nothing special about like operators, you never want your params inside string literals, they will not work, instead they will be interpreted as a string.
note
The hard-coded example in your second snippet is strongly discouraged, besides being a huge problem with sql injection, it can cause dapper to leak.
caveat
Any like match that is leading with a wildcard is not SARGable, which means it is slow and will require an index scan.
Yes it does. This simple solution has worked for me everytime:
db.Query<Remitente>("SELECT *
FROM Remitentes
WHERE Nombre LIKE #n", new { n = "%" + nombre + "%" })
.ToList();
Best way to use this to add concat function in query as it save in sql injecting as well, but concat function is only support above than sql 2012
string query = "SELECT * from country WHERE Name LIKE CONCAT('%',#name,'%');"
var results = connection.query<country>(query, new {name});
The answer from Sam wasn't working for me so after some testing I came up with using the SQLite CONCAT equivalent which seems to work:
string sql = "SELECT * FROM myTable WHERE Name LIKE '%' || #NAME || '%'";
var data = IEnumerable data = conn.Query(sql, new { NAME = Name });
Just to digress on Sam's answer, here is how I created two helper methods to make searches a bit easier using the LIKE operator.
First, creating a method for generating a parameterized query, this method uses dynamic: , but creating a strongly typed generic method should be more desired in many cases where you want static typing instead of dynamic.
public static dynamic ParameterizedQuery(this IDbConnection connection, string sql, Dictionary<string, object> parametersDictionary)
{
if (string.IsNullOrEmpty(sql))
{
return null;
}
string missingParameters = string.Empty;
foreach (var item in parametersDictionary)
{
if (!sql.Contains(item.Key))
{
missingParameters += $"Missing parameter: {item.Key}";
}
}
if (!string.IsNullOrEmpty(missingParameters))
{
throw new ArgumentException($"Parameterized query failed. {missingParameters}");
}
var parameters = new DynamicParameters(parametersDictionary);
return connection.Query(sql, parameters);
}
Then adding a method to create a Like search term that will work with Dapper.
public static string Like(string searchTerm)
{
if (string.IsNullOrEmpty(searchTerm))
{
return null;
}
Func<string, string> encodeForLike = searchTerm => searchTerm.Replace("[", "[[]").Replace("%", "[%]");
return $"%{encodeForLike(searchTerm)}%";
}
Example usage:
var sql = $"select * from products where ProductName like #ProdName";
var herringsInNorthwindDb = connection.ParameterizedQuery(sql, new Dictionary<string, object> { { "#ProdName", Like("sild") } });
foreach (var herring in herringsInNorthwindDb)
{
Console.WriteLine($"{herring.ProductName}");
}
And we get our sample data from Northwind DB:
I like this approach, since we get helper extension methods to do repetitive work.
My solution simple to this problem :
parameter.Add("#nomeCliente", dfNomeCliPesquisa.Text.ToUpper());
query = "SELECT * FROM cadastrocliente WHERE upper(nome) LIKE " + "'%" + dfNomeCliPesquisa.Text.ToUpper() + "%'";