I'm attempting a Scrapy-with-Splash project to get a few fields off the website "https://sailing-channels.com/by-subscribers". This site uses JavaScript to load and remove listings as you scroll.
I've not had any luck getting the Splash server to give me the whole set of data, or any of the detailed listings for that matter.
My first question is: can Splash even do this?
I really don't care how I get this data. I would prefer doing it with a program, but any tool that can get me fields from this site into a .csv file would do the job. Does anyone have any suggestions?
Thanks for any advice.
Why do you want to render it? They have a pretty good API; check https://sailing-channels.com/api/channels/get?sort=subscribers&skip=0&take=5&_=1548520116425. So you can iterate, increasing the skip argument and parsing the JSON each time.
Looks like a very promising way.
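For example, a minimal Python sketch with requests (the response shape and field names are assumptions - inspect a real response and adjust):

    import csv
    import requests

    API = "https://sailing-channels.com/api/channels/get"
    rows = []
    skip, take = 0, 25

    while True:
        resp = requests.get(API, params={"sort": "subscribers", "skip": skip, "take": take})
        resp.raise_for_status()
        data = resp.json()
        # Assumption: the endpoint returns either a bare JSON list of
        # channel objects or a wrapper dict with a "channels" key.
        channels = data if isinstance(data, list) else data.get("channels", [])
        if not channels:
            break
        rows.extend(channels)
        skip += take

    # Flatten to CSV using the union of all keys seen across channels.
    fieldnames = sorted({key for row in rows for key in row})
    with open("channels.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

Nested values will come out as their repr; flatten them first if you need clean columns.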
I need a little help here; I hope someone can give me a hint or a clue.
First of all, I'm not a programmer. I'm just a web admin who can use a CMS and basic HTML.
I was using PrestaShop for my online shop. In the backend, I can't upload new product images anymore.
The error is just blank, without any message for me. Here is the screenshot:
I appreciate it if someone can help me. Thanks, and sorry for my English.
@PanjiWiyono These errors don't really give us much information, but it's a start. Your JS code should contain an AJAX query that fails when converting the result to JSON (the first error). You should check the exact error that this request returns by inspecting it in the browser developer console.
If you find that the second error is in fact the response of this AJAX query, then we have almost located the problem.
The second error should be related to data size (often the MySQL max_allowed_packet setting). Check this: Error while sending QUERY packet
Anyway, you should check the DbPDO.php class. You can use the debug_backtrace function to display the complete call stack, but if the error is rooted in a basic PHP class issue, knowing which classes are in the call stack won't help much.
Good luck.
Simple solution
Maybe an extra module is in conflict with the PrestaShop core files. Go to Advanced Parameters > Performance and disable third-party modules, then try again.
Another option is to re-upload the admin folder under another name and
check the js folder again.
Hope it will work for you.
Can anyone help with using the Twitter API to upload a profile banner via account/update_profile_banner? I have been searching on Google for so long and can't find any solution. Thanks in advance.
Based on https://gist.github.com/hayesdavis/97756
It looks like the docs are misleading: unless you are uploading a really small image, I expect it is critical to use multipart form data instead of encoding the data in the query params.
Post your example code, though; it's bad Stack Overflow form to just say it doesn't work without showing the code and the errors you are getting.
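That said, here's a rough Python sketch of the multipart approach with requests + requests_oauthlib (the credentials and the banner.png path are placeholders):

    import requests
    from requests_oauthlib import OAuth1

    # Placeholder credentials - substitute your app's keys and tokens.
    auth = OAuth1("CONSUMER_KEY", "CONSUMER_SECRET",
                  "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

    URL = "https://api.twitter.com/1.1/account/update_profile_banner.json"

    # Send the raw image bytes as multipart form data in the "banner"
    # field instead of encoding them into the query string.
    with open("banner.png", "rb") as f:
        resp = requests.post(URL, files={"banner": f}, auth=auth)

    # A 2xx status (201 on success) means the banner was accepted.
    print(resp.status_code, resp.text)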
I am working on enhancing the search functionality of a website.
The current search works as follows:
1. read all the rows from the database;
2. find keywords in each row and return the results.
The problem is that it is too slow, and it has to prepare all the data in the backend, which means reading all the data from different databases and putting it into HTML.
The solution that comes to my mind is:
show partial search results (say, 10), meaning that as soon as enough results are found in the database it stops reading and searching rows;
once the user scrolls down the page, use AJAX to trigger another round of searching.
My questions are:
Is this a good (and feasible) way to do it?
Are there any tutorials or sources I should look up?
I know it is kind of an abstract question, but I need advice on this.
Thanks in advance.
Update on my research:
https://github.com/webcreate/infinite-ajax-scroll
This jQuery lib can do the front-end job.
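On the backend side, the usual counterpart is to let the database do the limiting with LIMIT/OFFSET, so each AJAX call only fetches one page of matches. A minimal sketch, assuming a Python/Flask backend with an SQLite articles table (all hypothetical, since the site's stack isn't stated):

    import sqlite3

    from flask import Flask, jsonify, request

    app = Flask(__name__)
    PAGE_SIZE = 10

    @app.route("/search")
    def search():
        keyword = request.args.get("q", "")
        offset = int(request.args.get("offset", 0))
        # LIMIT/OFFSET makes the database stop after PAGE_SIZE matches
        # instead of the application scanning every row itself.
        conn = sqlite3.connect("site.db")
        rows = conn.execute(
            "SELECT id, title FROM articles WHERE title LIKE ? "
            "LIMIT ? OFFSET ?",
            ("%" + keyword + "%", PAGE_SIZE, offset),
        ).fetchall()
        conn.close()
        return jsonify([{"id": r[0], "title": r[1]} for r in rows])

The infinite-scroll plugin would then request /search?q=...&offset=10, then offset=20, and so on as the user scrolls.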
I'm having a problem when trying to A/B test certain nodes in my node-tree in Umbraco.
What I want to do is to copy a node in the node-tree to a specific spot and use that B-structure to see which of the structures works best, using Google analytics.
For example we have two node structures, let's call them "Private" and "Sweden".
Their structure, with child nodes and properties, is exactly the same. The only difference between them is the property values (content). The "Private" URL is www.mysite.com/Private and the "Sweden" URL is www.mysite.com/Sweden.
What I would like to do is change every link in the B-structure so that it points to its counterpart in the A-structure. The problem is that since these are two different structures, each page has two different alternative links.
In other words, it should happen by chance that a visitor enters the B-structure, and they should then be moved back to the A-structure on their next click.
We manage which page loads (either the A-node or the B-node) with scripts, so that each node has a 50% chance, and if a visit lands on the B-node, Google Analytics will save the data. What we can't manage is making every link on that page point to the A-node.
I'd appreciate any help I can get.
Regards,
David
There are a couple of approaches that seem likely to give you a start, at least.
The /config/urlrewriting.config file allows you to set up multiple redirect rules within Umbraco, so a rule along these lines (the rule name and virtualUrl pattern are illustrative) might work in sending all requests (whether /sweden/pagename/ or /private/pagename/) back to the private structure. Not sure how GA will handle it:
<add name="swedenToPrivate" virtualUrl="^~/sweden/(.*)" rewriteUrlParameter="ExcludeFromClientQueryString" destinationUrl="http://www.mysite.com/private/$1" redirect="Domain" redirectMode="Permanent" ignoreCase="true" />
Secondly, a simple HttpModule (http://support.microsoft.com/kb/307996) can process all page requests and redirect as required - you could do a _gaq.push here, directly or indirectly.
I'd be interested to know how you get on - it seems a good area for extension to Umbraco.
I'm not sure I have understood perfectly what you need to do, so please excuse any assumptions that may prove mistaken. Here's what I think:
Since A & B nodes should share the same HTML content (besides the links, of course), why don't you make the link href attribute dynamic by using a bit of Razor in the template or macro, something like (the href pattern here is illustrative):
@{ var isBNode = CurrentPage.Parent.Name == "Sweden"; }
<a href="@(isBNode ? "/Private/" + CurrentPage.UrlName : CurrentPage.Url)">My link</a>
A similar approach would work if you are using web forms.
We finally came to the decision to use the alternative-template solution. Since there seems to be no generic solution for our case, we had to create an alternative template with specific macros to render the different information for every document type we're using.
Creating dynamic links for every page would be a hell of a job at this stage of the project, since there are so many pages and links. Also, some links are made in JavaScript, so that's another problem.
I copied the A-structure to another node, only in order to be able to change property values. There might be a problem logging and tracking the information with Google Analytics, though, so that's the next step for us in this project. In our alternative templates we're getting the property values from the B-structure.
Still, if anyone has a better solution I'd highly appreciate it!
Regards,
David
I'd like a way to download the content of every page in the history of a popular article on Wikipedia. In other words I want to get the full contents of every edit for a single article. How would I go about doing this?
Is there a simple way to do this using the Wikipedia API? I looked and didn't find anything that popped out as a simple solution. I've also looked into the scripts on the PyWikipedia Bot page (http://botwiki.sno.cc/w/index.php?title=Template:Script&oldid=3813) and didn't find anything useful. A simple way to do it in Python or Java would be best, but I'm open to any simple solution that will get me the data.
There are multiple options for this. You can use the Special:Export special page to fetch an XML stream of the page history. Or you can use the API, found under /w/api.php. Use action=query&titles=$TITLE&prop=revisions&rvprop=timestamp|user|content etc. to fetch the history.
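As a sketch in Python with requests, following the API's continuation protocol ("Coffee" is a placeholder title; on older MediaWiki versions drop rvslots and read rev["*"] instead):

    import requests

    API = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "query",
        "format": "json",
        "titles": "Coffee",  # placeholder article title
        "prop": "revisions",
        "rvprop": "timestamp|user|content",
        "rvslots": "main",
        "rvlimit": 50,  # the API caps content requests at 50 revisions per call
    }

    while True:
        data = requests.get(API, params=params).json()
        page = next(iter(data["query"]["pages"].values()))
        for rev in page.get("revisions", []):
            text = rev["slots"]["main"]["*"]
            print(rev["timestamp"], rev.get("user"), len(text))
        if "continue" not in data:
            break
        # Carry the continuation tokens into the next request.
        params.update(data["continue"])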
Pywikipedia provides an interface to this, but I do not know by heart how to call it. An alternative library for Python, mwclient, also provides this, via site.pages[page_title].revisions()
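With mwclient that boils down to a few lines (again with a placeholder title):

    import mwclient

    site = mwclient.Site("en.wikipedia.org")
    # Iterates over the full revision history, newest first by default.
    for rev in site.pages["Coffee"].revisions(prop="ids|timestamp|user|content"):
        print(rev["revid"], rev.get("user"), len(rev.get("*", "")))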
Well, one solution is to parse the Wikipedia XML dump.
Just thought I'd put that out there.
If you're only getting one page, that's overkill. But if you don't need the very latest information, using the XML dump would have the advantage of being a one-time download instead of repeated network hits.