Facebook-like scroll down and searching/adding - SQL

I am working on enhancing the search functionality of a website.
The current search works like this:
1. read all the rows from the database
2. find keywords in each row and return the results.
The problem is that it is too slow: it has to prepare all the data in the backend, which means reading all the rows from different databases and putting them into HTML.
The solution that comes to my mind is:
show partial search results (say 10), meaning that as soon as enough results have been found in the database it stops reading and searching rows.
once the user scrolls down the page, use AJAX to trigger another round of searching
My questions are:
Is this a good (feasible) way to do it?
Is there any tutorial or resource I should look at?
I know it is kind of an abstract question, but I need advice on this.
Thanks in advance.
Update from my research:
https://github.com/webcreate/infinite-ajax-scroll
this jQuery library can handle the front-end job; a rough back-end sketch follows below.
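For the back end, one way to sketch the idea is to let the database do the limiting with LIMIT/OFFSET (or keyset pagination) instead of scanning every row in application code. A minimal sketch, assuming a hypothetical articles table with a content column; a real implementation would use a full-text index rather than LIKE, but the paging idea is the same:

import sqlite3

# Hypothetical schema and data; the real table and column names will differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany("INSERT INTO articles (content) VALUES (?)",
                 [("sample row %d about databases" % i,) for i in range(100)])

PAGE_SIZE = 10

def search_page(keyword, page):
    # The database only has to produce OFFSET + LIMIT matching rows,
    # so it never materialises the whole table for one request.
    cur = conn.execute(
        "SELECT id, content FROM articles "
        "WHERE content LIKE ? "
        "ORDER BY id "
        "LIMIT ? OFFSET ?",
        ("%" + keyword + "%", PAGE_SIZE, page * PAGE_SIZE),
    )
    return cur.fetchall()

# Each AJAX request triggered by the scroll plugin passes the next page number.
print(len(search_page("databases", 0)))   # first 10 matches
print(len(search_page("databases", 1)))   # next 10 when the user scrolls

The front-end plugin only needs to send the page number (or the id of the last row it received) with each scroll-triggered request.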

Related

Splash issue with https://sailing-channels.com/by-subscribers

I'm attempting a Scrapy-with-Splash project to get a few fields off the website "https://sailing-channels.com/by-subscribers". This site uses JavaScript to retrieve and remove listings as you scroll.
I've not had any luck getting the Splash server to give me the whole set of data, or any of the detailed listings for that matter.
My first question is: can Splash even do this?
I really don't care how I get this data. I would prefer doing it with a program, but any tool that can get me fields from this site into a .csv file would do the job. Does anyone have any suggestions?
Thanks for any advice
Why do you want to render it? They have a pretty good API; check https://sailing-channels.com/api/channels/get?sort=subscribers&skip=0&take=5&_=1548520116425. So you can iterate, increasing the skip argument and parsing the JSON each time.
Looks like a very promising way.
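A rough Python sketch of that iteration, assuming the skip/take parameters behave as the URL above suggests; the shape of the returned JSON (and the "channels" key used below) is a guess and should be checked against a real response:

import csv
import requests

BASE = "https://sailing-channels.com/api/channels/get"
TAKE = 25

rows = []
skip = 0
while True:
    resp = requests.get(BASE, params={"sort": "subscribers",
                                      "skip": skip, "take": TAKE})
    resp.raise_for_status()
    batch = resp.json()
    # The exact JSON shape is an assumption; inspect one real response
    # and adjust the key used to pull out the list of channels.
    channels = batch.get("channels", batch) if isinstance(batch, dict) else batch
    if not channels:
        break                      # nothing left to fetch
    rows.extend(channels)
    skip += TAKE

fieldnames = sorted({key for row in rows for key in row})
with open("channels.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)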

Using CakePHP to build applications that deal with large data-sets?

I am using the DataTables jQuery plugin 1.9 with CakePHP 2.4.
My application is an online database interface. The database is not particularly huge: we have only 26,000 records. However, I found out that even with recursive = -1 and the "Containable" behavior, CakePHP can only find about 5,000 rows before I get a memory-exhausted error, and the view even takes 5 minutes to load!
Of course, using the LIMIT option was just experimental, as I need to list/paginate/search through all of the database records.
My question is: has anyone built a CakePHP application that dealt with a similar number of rows or more? How did you do it? Any detailed documentation or reference about your approach would be greatly appreciated.
I've been looking for a week at setting up server-side processing of DataTables in CakePHP, and none of the solutions/plugins suggested out there worked (e.g. cnizzdotcom's). Increasing the memory limit (up to 1 GB!) didn't help much either.
Unfortunately, if this limitation continues, this will be our last time using CakePHP. But for now, it's critical to find a solution to the problem.
Do you really need to display all 26k items at once in your DataTables?
I'd change it and use AJAX pagination. You could still keep your DataTables pagination and make an async request to CakePHP (Cake should return JSON); in this request you'll need to pass the limit and the page (you could also pass the sort and direction) as parameters:
Search the DataTables docs for AJAX sources.
Create a view for this AJAX request.
In this view, paginate your items using the "page", "limit", "sort" and "direction" parameters passed to the view (via GET, probably).
This view should return JSON in a format that DataTables can understand (a sketch of that response shape is below).
But if you really need to display all 26k at once, then ignore this answer.
Hope this helps
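The asker is on CakePHP, but the contract between DataTables 1.9 and the server is framework-independent: the plugin sends sEcho, iDisplayStart, iDisplayLength and sSearch, and expects sEcho, iTotalRecords, iTotalDisplayRecords and aaData back. A minimal sketch of that contract, written in Python with an in-memory table as a stand-in for the Cake model call:

import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO records (name) VALUES (?)",
                 [("record %05d" % i,) for i in range(26000)])

def datatables_page(params):
    # Build the JSON body DataTables 1.9 expects from its legacy
    # server-side parameters (sEcho, iDisplayStart, iDisplayLength, sSearch).
    start = int(params.get("iDisplayStart", 0))
    length = int(params.get("iDisplayLength", 10))
    search = params.get("sSearch", "")

    total = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
    filtered = conn.execute(
        "SELECT COUNT(*) FROM records WHERE name LIKE ?",
        ("%" + search + "%",)).fetchone()[0]
    rows = conn.execute(
        "SELECT id, name FROM records WHERE name LIKE ? "
        "ORDER BY id LIMIT ? OFFSET ?",
        ("%" + search + "%", length, start)).fetchall()

    return json.dumps({
        "sEcho": int(params.get("sEcho", 1)),      # echoed back for DataTables
        "iTotalRecords": total,                    # rows before filtering
        "iTotalDisplayRecords": filtered,          # rows after filtering
        "aaData": [list(r) for r in rows],         # one inner list per table row
    })

print(datatables_page({"sEcho": "1", "iDisplayStart": "0",
                       "iDisplayLength": "10", "sSearch": "0002"}))

Each page request then touches only 10 rows plus two counts, so the 26k-row table never has to fit in PHP memory at once.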

Formatting couchdb-lucene results with a couchdb list

Situation...
I have a simple CouchApp that lists the emails stored in the Couch database. These emails are queried with a simple view and then piped through a list to give me a pretty table where I can click on an email to view it. That works great.
The next evolution of this app was to add some full-text searching of the subject line of the emails with couchdb-lucene, and I think I have that nailed down, as I can search using Lucene and get valid results back. What I can't quite grasp is how to take those results and pipe them back into my existing list function so they get formatted correctly.
Here is an example of my view + list URL that gives me the HTML
http://localhost:5984/tenant103/_design/Email/_list/emaillist/by_type?startkey=["Email",2367264774866]&endkey=["Email",0]&limit=20&descending=true&include_docs=true
And here is my search URL that also gives me results
http://localhost:5984/_fti/local/tenant103/_design/Email/by_subject?q=OM-2875&include_docs=true
My thinking was I would build the URL like this
http://localhost:5984/_fti/local/tenant103/_design/Email/_list/emaillist/by_subject?q=OM-2875&include_docs=true
But that just returns:
{
    "reason": "bad_request",
    "code": 400
}
This is a learning project for myself with CouchDB so I may not be getting some simple concepts here.
CouchDB-Lucene does not natively support list transformations and CouchDB can only apply list transformations to its own map/reduce views. Sorry about that!
Robert Newson.
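Given that answer, one fallback is to query the couchdb-lucene endpoint with include_docs=true and do the formatting yourself (client side, or in whatever glue code renders the table) instead of in a _list function. A rough Python sketch, reusing the database and design names from the URLs above; the field names on the email documents (subject, from, date) are assumptions:

import requests

LUCENE_URL = ("http://localhost:5984/_fti/local/tenant103/"
              "_design/Email/by_subject")

def search_emails(query):
    # Fetch full docs from couchdb-lucene and format them ourselves,
    # since the results cannot be piped through a _list function.
    resp = requests.get(LUCENE_URL, params={"q": query, "include_docs": "true"})
    resp.raise_for_status()
    rows = resp.json().get("rows", [])
    return [(r["doc"].get("subject"), r["doc"].get("from"), r["doc"].get("date"))
            for r in rows if "doc" in r]

for subject, sender, date in search_emails("OM-2875"):
    print(date, sender, subject)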

Should dynamic help page content be stored in the database or an HTML file?

So, I'm trying to come up with a better way to do a dynamic help module that displays a distinct help page for each page of a website. Currently there is a help.aspx that has a title and a div section that is filled by methods that grab a database record. Each DB record is stored HTML with the specific help content. Now, this works, but it is an utter pain to maintain: when, say, an image changes or the text has to be edited, you have to find and update one or more DB records.
I was thinking that instead I could build a single HTML page that basically shows/hides panels, with the appropriate help content inside each panel. As long as you follow a proper naming convention (name each panel's ID after the page/content it represents), Ctrl+F will get you where you need to go and make it easier to find the content you need.
What I'm curious about is whether this would have an impact on performance. The HTML page would be a fairly large file and would be hosted/run on the server, but it would also remove the need for database calls. Would the work even be worth the benefit here, or am I reinventing the wheel already in place?
Anything dynamic should be stored in the database. A truly usable web application should NEVER need code modified to change content. Hiding content is usually not a good idea: imagine if you expanded your application to 100 different pages that each need their own help page. Then when someone clicks Help, their browser has to load 99 hidden pages to get the 1 that it will show. You need to break your help page down into sections and just store the plain text in the database. I would need to know more about the language and architecture you're using to elaborate further, but take a look below.
The need you're describing is pretty much what MVC (a web application architecture) was built for.
If you're already using ASP.NET and you aren't too far into your project, I would consider switching to MVC. It's an architecture built specifically with dynamic page content in mind. You build different 'Views' (the V in MVC) that dynamically build the HTML based on the content they receive from the Controller (the C in MVC), which pulls its data from the database/Model (the M) and modifies it for the View. Also, once you get into MVC you can couple it with Razor and half of your code gets written for you. It's a wonderful thing.
http://www.asp.net/mvc
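Whatever framework you end up with, the data shape the answer is pointing at is simple: one row of help content per page key, fetched on demand for the page being viewed rather than shipping (or hiding) every page's help at once. A language-agnostic sketch, shown here in Python with made-up table and column names:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE help_sections (page_key TEXT PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO help_sections VALUES (?, ?)", [
    ("orders/index",  "How to browse and filter your orders..."),
    ("orders/create", "How to create a new order..."),
])

def help_for(page_key):
    # Return only the help text for the page being viewed,
    # instead of loading (or hiding) help for every page at once.
    row = conn.execute(
        "SELECT body FROM help_sections WHERE page_key = ?", (page_key,)
    ).fetchone()
    return row[0] if row else "No help available for this page yet."

print(help_for("orders/create"))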

How can I get the full change history for an article on Wikipedia?

I'd like a way to download the content of every version in the history of a popular article on Wikipedia. In other words, I want to get the full contents of every edit for a single article. How would I go about doing this?
Is there a simple way to do this using the Wikipedia API? I looked and didn't find anything that popped out as a simple solution. I've also looked into the scripts on the PyWikipedia Bot page (http://botwiki.sno.cc/w/index.php?title=Template:Script&oldid=3813) and didn't find anything useful. Some simple way to do it in Python or Java would be best, but I'm open to any simple solution that will get me the data.
There are multiple options for this. You can use the Special:Export special page to fetch an XML stream of the page history. Or you can use the API, found under /w/api.php. Use action=query&titles=$TITLE&prop=revisions&rvprop=timestamp|user|content etc. to fetch the history.
Pywikipedia provides an interface to this, but I do not know by heart how to call it. An alternative library for Python, mwclient, also provides this, via site.pages[page_title].revisions()
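A minimal Python sketch of the API route, following the continuation tokens so the whole history comes back rather than just the first batch; the article title is only an example, and rvlimit is kept modest because the API caps it lower when content is requested:

import requests

API = "https://en.wikipedia.org/w/api.php"

def all_revisions(title):
    # Yields (timestamp, user, wikitext) for every revision of one article,
    # following the API's continuation tokens until the history is exhausted.
    params = {
        "action": "query",
        "format": "json",
        "formatversion": 2,
        "titles": title,
        "prop": "revisions",
        "rvprop": "timestamp|user|content",
        "rvslots": "main",
        "rvlimit": 50,
    }
    while True:
        data = requests.get(API, params=params).json()
        for rev in data["query"]["pages"][0].get("revisions", []):
            slots = rev.get("slots", {}).get("main", {})
            yield rev["timestamp"], rev.get("user"), slots.get("content", "")
        if "continue" not in data:
            break
        params.update(data["continue"])   # carries rvcontinue to the next request

# Example article; any title works.
for ts, user, text in all_revisions("Python (programming language)"):
    print(ts, user, len(text))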
Well, one solution is to parse the Wikipedia XML dump.
Just thought I'd put that out there.
If you're only getting one page, that's overkill. But if you don't need the very very latest information, using the XML would have the advantage of being a one-time download instead of repeated network hits.