RavenDB paging via cursor - ravendb

Paging in RavenDB is done via skip+take. This is the default implementation, and I'm happy with it most of the time. However, for frequently changing data I want paging via a cursor. The cursor/after parameter specifies the last record displayed, so the list can continue from there on the next page.
This should work for data which can be dynamically sorted, so the sorting parameter is not fixed.
GitHub does it this way on the "stars" page, for example: https://github.com/[username]?after=Y3Vyc29&tab=stars
Any ideas how to achieve this in RavenDB?

There is no cursor pagination in RavenDB.
But you can use the '@last-modified' metadata value to iterate continuously over frequently changing data:
from Orders as o
where o.'@metadata'.'@last-modified' > "2018-08-28:12:11"
order by o.'@metadata'.'@last-modified'
select {
    A: o["@metadata"]["@last-modified"]
}
You can also use a Subscription.
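The idea of continuing "after the last record shown" can be sketched independently of RavenDB. The following is only an illustration over an in-memory array, not a RavenDB API: the names Row, pageAfter, encodeCursor and decodeCursor are hypothetical, and it assumes every record has a unique id that can break ties when sort values are equal. A real backend would push the same "greater than (value, id)" comparison into its query, much like the '@last-modified' filter above.

```typescript
// Keyset ("cursor") paging over an in-memory array. All names are hypothetical.
type Row = { id: string; [field: string]: string | number };
type Cursor = { value: string | number; id: string };

// Pack the sort value and id of the last row shown into an opaque string.
function encodeCursor(row: Row, sortField: string): string {
  const cursor: Cursor = { value: row[sortField], id: row.id };
  return encodeURIComponent(JSON.stringify(cursor));
}

function decodeCursor(after: string): Cursor {
  return JSON.parse(decodeURIComponent(after)) as Cursor;
}

// Numbers compare numerically, everything else as plain strings.
function compareValues(a: string | number, b: string | number): number {
  if (typeof a === "number" && typeof b === "number") return a - b;
  const x = String(a);
  const y = String(b);
  return x < y ? -1 : x > y ? 1 : 0;
}

// Total order on (sort value, id); the id breaks ties between equal values.
function compareKeys(aValue: string | number, aId: string, bValue: string | number, bId: string): number {
  const byValue = compareValues(aValue, bValue);
  return byValue !== 0 ? byValue : compareValues(aId, bId);
}

// Return up to pageSize rows that sort strictly after the cursor,
// plus a cursor for the following page when more rows remain.
function pageAfter(rows: Row[], sortField: string, pageSize: number, after?: string): { items: Row[]; next?: string } {
  if (pageSize < 1) throw new Error("pageSize must be at least 1");
  const sorted = [...rows].sort((a, b) => compareKeys(a[sortField], a.id, b[sortField], b.id));
  const cursor = after ? decodeCursor(after) : undefined;
  const remaining = cursor
    ? sorted.filter((r) => compareKeys(r[sortField], r.id, cursor.value, cursor.id) > 0)
    : sorted;
  const items = remaining.slice(0, pageSize);
  const next = remaining.length > pageSize ? encodeCursor(items[items.length - 1], sortField) : undefined;
  return { items, next };
}
```

Because the cursor carries the sort value rather than a position, rows inserted or deleted before it do not shift the next page, which is the property skip+take lacks. The cursor is only meaningful for the sort field it was created with, so a change of sorting has to restart from the first page.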

Related

React-Admin filters that relate to the current results

We're really enjoying using the capabilities offered by React-Admin.
We're using <ReferenceArrayInput> to allow filtering of a <List> by Country. The drop-down contains all countries in the database.
But, we'd like it to just contain the countries that relate to the current set of filtered records.
So, in the context of the React-Admin demo, if we've filtered for Returned, then the Customer drop-down would only contain customers who have returned items. This would make a real difference in finding the records of interest.
Our current plan is to (somehow) handle this in our <DataProvider>. But is there a more React-Admin-friendly way of doing this?
So you want to build dependent filters, which is not a native feature of react-admin - and a complex beast to tame.
First, doing so in the dataProvider will not work, because you'll only have the data of the first page of results. A record in a following page may have another value for your array input.
You could implement that logic in a custom Input component instead. This component can wrap the original <ReferenceArrayInput> and read the current ListContext to get the current data and filter value (https://marmelab.com/react-admin/useListContext.html), then alter the array of possible values using the filter prop (https://marmelab.com/react-admin/ReferenceArrayInput.html#filter).
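The core of such a wrapper is working out which reference ids occur in the records the list currently holds. A minimal sketch of that step, as a plain function with no react-admin dependency, might look like this. The names RecordLike and referencedIds are hypothetical, and the field name passed in (for example country_ids) is an assumption about the data model.

```typescript
// Shape of a list record; only `id` is required, other fields are free-form.
type RecordLike = { id: string | number; [field: string]: unknown };

// Collect the distinct reference ids found in `field` across the given records.
// The field may hold a single id or an array of ids; anything else is ignored.
function referencedIds(records: RecordLike[], field: string): Array<string | number> {
  const seen = new Set<string | number>();
  for (const record of records) {
    const value = record[field];
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) {
      if (typeof v === "string" || typeof v === "number") seen.add(v);
    }
  }
  return Array.from(seen);
}
```

In the wrapper component, the records would come from the list context and the result would be passed to the filter prop of <ReferenceArrayInput>, for instance as { id: ids }. Whether a filter of that shape is understood depends on your dataProvider, so treat it as an assumption to verify.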

React Admin - Make input for filter based on other resource

I am using React Admin to make a dashboard and I have this Lead resource with the status field, that is computed based on another resource, Call, and wanted to make a filter component for Lead's list. The way it works is that for each lead, I query the last call (sorted by a date field) associated with this lead and get its status. The lead status is the status for the last call.
{ filter: { lead }, sort: { date: -1 }, limit: 1 }
the lead status query
I use this query to make a field (which appears in the list, in the row of a single lead), and wanted to know how I can make an input component to use as a filter in the list. I know this pattern is weird, but it's hard to change it in the backend because of how it's structured. I am open to suggestions concerning how to change this messy computed field situation, but as I said, I would be satisfied with knowing how I can create the input component.
The solution I'm going with is a computed field. In my case, since I use MongoDB, it will be done through an aggregation pipeline. As I'm using REST instead of GraphQL, I cannot use a resolver that would only be called when the status field is needed, so this sometimes results in an unneeded aggregation (getting the last Call for a given Lead). However, it only consumes more processing time in the DB and avoids the additional round trip that react-admin would need to compute this field through a reference. And status is an important field that will usually be needed anyway.
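For reference, the rule the computed field has to implement (a lead's status is the status of its most recent call) can be written as a small function. This is only an in-memory sketch of the rule, not the aggregation pipeline itself. The Call shape uses the lead, date and status fields mentioned in the question, and it assumes date is a string that Date.parse understands.

```typescript
// One call record; `lead` holds the id of the lead the call belongs to.
type Call = { lead: string; date: string; status: string };

// Status of the latest call for the given lead, or undefined if it has no calls.
// Mirrors { filter: { lead }, sort: { date: -1 }, limit: 1 } from the question.
function leadStatus(leadId: string, calls: Call[]): string | undefined {
  let latest: Call | undefined;
  for (const call of calls) {
    if (call.lead !== leadId) continue;
    if (latest === undefined || Date.parse(call.date) > Date.parse(latest.date)) {
      latest = call;
    }
  }
  return latest ? latest.status : undefined;
}
```

Once the backend exposes this value as a regular field on Lead, filtering on it in the list becomes an ordinary filter input rather than something that needs a reference lookup.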

Custom Pagination in datatable

I have a web application in which I get data from my database and show it in a datatable. I am facing an issue doing this, as the data that I am fetching has too many rows (200 000). So when I run a query like select * from table_name; my application gets stuck.
Is there a way to handle this problem with JavaScript?
I tried pagination, but I cannot figure out how I would do that, as datatable creates pagination for already rendered data.
Is there a way to run my query with pagination at the backend?
I came across the same problem when working with MongoDB and AngularJS, and I used server-side paging. Since you have a huge number of records, you can try the same approach.
Assume you are displaying 25 records per page.
Backend:
Get the total count of the records using a COUNT query.
Use select * from table_name LIMIT 25 OFFSET ${req.query.pageNumber*25} to query a limited set of records based on the page number.
Frontend:
Instead of using datatable, display the data in a plain HTML table.
Define buttons for next page and previous page.
Define a global variable for pageNumber in the controller/JS file.
Increment pageNumber by 1 when the next-page button is clicked and decrement it by 1 when the previous-page button is clicked.
Use the result of the COUNT query to put an upper limit on pageNumber (with 200 records there are 200/25 = 8 pages, so pageNumber runs from 0 to 7).
So basically select * from table_name LIMIT 25 OFFSET ${req.query.pageNumber*25} limits the number of records to 25. When req.query.pageNumber=1, it skips the first 25 records and sends the next 25 (records 26-50). Similarly, when req.query.pageNumber=2, it skips the first 2*25 records and sends records 51-75.
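The page arithmetic described above can be sketched as two small helpers. The names PAGE_SIZE, pageCount and limitOffset are hypothetical. The sketch assumes zero-based page numbers, as in the answer, and treats the incoming page number as untrusted input, clamping it to the valid range.

```typescript
const PAGE_SIZE = 25;

// Number of pages needed for `totalRows` rows (totalRows comes from the COUNT query).
function pageCount(totalRows: number, pageSize: number = PAGE_SIZE): number {
  return Math.ceil(totalRows / pageSize);
}

// Turn a raw page number (e.g. a query-string value) into LIMIT/OFFSET values.
// Invalid or negative input falls back to page 0; too-large input to the last page.
function limitOffset(rawPage: unknown, totalRows: number, pageSize: number = PAGE_SIZE): { limit: number; offset: number } {
  const lastPage = Math.max(pageCount(totalRows, pageSize) - 1, 0);
  const parsed = parseInt(String(rawPage), 10);
  const page = isNaN(parsed) ? 0 : Math.min(Math.max(parsed, 0), lastPage);
  return { limit: pageSize, offset: page * pageSize };
}
```

On the server, the resulting limit and offset are best passed to the database driver as bound query parameters rather than interpolated into the SQL string, since the page number arrives from the client.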
There are two ways to handle this.
First way: handle paging on the client side.
Get all the data from the database and apply custom paging.
Second way: handle paging on the server side.
Each time, query the database and fetch only the records for the requested page size.
You can use the LIMIT and OFFSET clauses for pagination in MySQL. I understand that loading 200,000 rows (2 lakh) at once hurts performance. You mention wanting to handle this with JavaScript: if JavaScript is only your frontend, it is not going to help you. But since you have a web application, if it runs on Node (as the server), then I can suggest an approach that should help a lot.
Use two variables, named var_pageNo and var_limit. Now use a raw MySQL query such as
select * from <tbl_name> LIMIT var_limit OFFSET (var_pageNo * var_limit);
Write your code around this query, replacing the variables with your desired values. This will improve performance, because only the rows within your specified limit are fetched.
Hope this helps.

Search Items without presentation details in sitecore

Improve search performance.
We are currently on Sitecore 8.1.3 in production and use Lucene Search to make the search work. We will be moving over to SOLR or Coveo search in the near future. That said, we are trying to improve search functionality on our site.
In the current scenario, if a user searches on our site, Lucene search provides us with appropriate search results from Sitecore content items. The results are a list of items, some of which have presentation details whereas others don't (these are basically datasource items, or items pulled in through multilist fields). We display results that have presentation details directly to the user. The datasource items, however, do not have presentation details attached to them, so for such items we display the items that refer to them: as a datasource in presentation details, via a Sitecore link, or through a multilist field.
We are using Globals.LinkDatabase.GetItemReferrers(item, false) method to fetch the item where results items are referring to. We know this method is a heavy method. To improve the performance, we are filtering the items that are returned when we use Globals.LinkDatabase.GetItemReferrers(item, false) method. We select only the latest version of the item, we select an item only if the item has presentation details, we select only if the item is of same language as that of the context language. If the current item doesn't have presentation details, it will search for its related item with presentation details using the same function recursively. This logic or code that we have helps us to improve the performance at some level and yields the required results.
However this code slows down its performance if the number of search results is high. Say if I search for an item in which the Lucene search returns me say 10 items for it, our custom search code will then yield me say 100 related items(assuming the Datasource items of items found in the result can be reused across different items). The performance degrades when the Lucene search provides results with a huge count, say 500. In such scenarios we will be running our code recursively on 500 items and their related items. For better performance we have tried using LINQ query instead of foreach iterations wherever possible. The code works perfectly fine. We do get appropriate results, however the search slows down if the count is high for search items. Want to know if there is any more area where we can improve the performance.
The best way to improve the performance is to have a custom index that has the results you want to search and does not contain items that you do not want to return. In this way, your filtering is 'pre-done' during indexing.
The common way of doing this is to use a computed field that will contain all the 'text' of the page (collating together content from datasources) so that the page's full contents are in a field in the index. This way, even if the text match would have been on a datasource, the page will still come back as a valid search result.
There is a blog from Kam Figy on this topic: https://kamsar.net/index.php/2014/05/indexing-subcontent/
Note that in addition to the computed field, you will also need to patch in the field to the index using a Sitecore config patch file. Kam's blog shows an example of that as well.
You need to index this data together to begin with, rather than trying to piece it together at runtime. You should also try to keep your indexes lean or use queries to restrict the results that are returned to only provide the relevant results.
I agree with the answer from Jason that a separate index is one of the best solutions, combined with a computed field that includes content from all referenced datasources.
Further, I would create a custom crawler which excludes items without any layout from the index. For an index which is only used to provide results for site search, you only care about items with layout, since only they have a navigable URL.
using Sitecore.ContentSearch;
using Sitecore.Data.Items;

namespace MyProject.CMS.Custom.ContentSearch.Crawlers
{
    public class CustomItemCrawler : Sitecore.ContentSearch.SitecoreItemCrawler
    {
        protected override bool IsExcludedFromIndex(SitecoreIndexableItem indexable, bool checkLocation = false)
        {
            // Keep whatever the base crawler already excludes.
            if (base.IsExcludedFromIndex(indexable, checkLocation))
                return true;
            // Additionally exclude items that have no layout.
            return !HasLayout(indexable);
        }

        protected override bool IndexUpdateNeedDelete(SitecoreIndexableItem indexable)
        {
            // Remove an item from the index once it no longer has a layout.
            return !HasLayout(indexable);
        }

        // True when the item has a layout assigned.
        private static bool HasLayout(Item item)
        {
            return item != null && item.Visualization != null && item.Visualization.Layout != null;
        }
    }
}
If for some reason you do not wish to create a separate index, or you only want to keep a single index (since you are using the Content Search API and require a full index for your component queries, or even just to minimise indexing time across multiple indexes), then I would consider creating a custom computed field in the index which stores true or false depending on whether the item has layout. The logic is the same as above. You can then filter in your search to only return results which have layout.
The combination of including the content of the datasource items during indexing and only returning items with layout should result in much better performance of your search queries.

Restriction on retrieving file information from a folder

I am trying to get information about files in a folder using https://apis.live.net/v5.0/folderid/files?
This particular folder of mine has around 5200 files, so I am getting a read timeout when I make the above-mentioned request. Is there any restriction on the number of files that I can request at once?
Note : I am able to successfully retrieve the file information from folder if I restrict the file count to 500 say https://apis.live.net/v5.0/folderid/files?limit=500
In general it's good to page queries that could potentially return a large number of results. You could try using the limit query parameter in combination with the offset query parameter to read sets of the children at a time and see if that works better for you.
I'll quote in the relevant information from the documentation for ease of reference:
Specify the first item to get by setting the offset parameter in the preceding code to the index of the first item that you want to get. For example, to get two items starting with the third item, use FOLDER_ID/files?limit=2&offset=3.
Note In the JavaScript Object Notation (JSON)-formatted object that's returned, you can look in the paging object for the previous and next structures, if they apply, to get the offset and limit parameter values of the previous and next entries, if they exist.
You may also want to consider swapping to the new API, which has its own paging model (using the next links).
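Reading the folder in chunks with limit and offset can be sketched as a simple loop. The names filesUrl and collectAll are hypothetical, and loadPage is a synchronous stand-in for whatever performs the HTTP request and returns the items of one page. A real client would await the request and supply whatever authentication the API requires, which is omitted here.

```typescript
const BASE_URL = "https://apis.live.net/v5.0";

// Build the request URL for one page of a folder's files.
function filesUrl(folderId: string, limit: number, offset: number): string {
  return `${BASE_URL}/${folderId}/files?limit=${limit}&offset=${offset}`;
}

// Request pages of `limit` items until a short (or empty) page signals the end.
function collectAll<T>(loadPage: (url: string) => T[], folderId: string, limit: number): T[] {
  const all: T[] = [];
  let offset = 0;
  for (;;) {
    const page = loadPage(filesUrl(folderId, limit, offset));
    for (const item of page) all.push(item);
    if (page.length < limit) return all;
    offset += limit;
  }
}
```

With around 5200 files and limit=500, this would issue eleven requests, each small enough to stay under the timeout that the single unpaged request hits. Following the previous and next structures of the returned paging object, as the documentation quoted above describes, is an alternative to computing the offsets yourself.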