I am currently scraping a very slow website, so I bought a Windows 2000 virtual machine for it. However, this machine uses a static IP, and as a result I got banned.
Now I am wondering if it would make sense to move my script (Scrapy, Selenium and Chrome) to AWS.
Does anybody have experience using AWS for crawling? What is the typical price for it?
You don’t need a different machine, you need a proxy. Or even better, a smart proxy.
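As a rough illustration of the proxy idea, here is a minimal Python sketch that rotates requests over a pool of proxies using only the standard library. The proxy addresses are hypothetical placeholders, and the round-robin rotation is just one simple policy; a "smart" proxy service would normally handle rotation for you.

```python
import itertools
import urllib.request

# Hypothetical proxy endpoints; replace with the ones your provider gives you.
PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080",
]

# Endless round-robin iterator over the pool.
_proxy_pool = itertools.cycle(PROXIES)


def next_proxy_settings():
    """Return a scheme-to-proxy mapping for the next proxy in the rotation."""
    proxy = next(_proxy_pool)
    return {"http": proxy, "https": proxy}


def fetch(url, timeout=30):
    """Fetch url through the next proxy in the pool and return the body bytes."""
    handler = urllib.request.ProxyHandler(next_proxy_settings())
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

With Scrapy, the same effect is usually achieved by setting `request.meta["proxy"]`, which the built-in HttpProxyMiddleware picks up. Chrome driven by Selenium can be started with the `--proxy-server` argument.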
I currently run an Apache server on Ubuntu 14.04, and also have a Tomcat server and a Calibre server (running on ports 8080 and 8081 respectively).
I can reach them through Firefox by typing
http://localhost:8080 // For TOMCAT
http://localhost:8081 // For Calibre
I'd like to know what I should tweak or set to be able to reach them by typing
http://tomcat.localhost/ or http://localhost/~user/tomcat
and
http://calibre.localhost/ or http://localhost/~user/Calibre
(I'd prefer the first option if possible)
Is it possible to do this without installing a DNS server? (I can use one if needed, but I'd be happier not to use a technology I'm not comfortable with)
I tried a PHP include or redirection in localhost/~user/Calibre/index.php, but this is very inelegant (and I couldn't get it to work properly anyway)
The goal is to use it from different computers on my local network (so a solution that works across browsers and computers would be better, but I'd be happy if it works only on my computer for the moment).
Any help would be greatly appreciated.
Thanks a lot
You should create a virtual host for each domain you want to use. Follow this article to achieve this, and let me know if you have any questions.
https://www.digitalocean.com/community/tutorials/how-to-set-up-apache-virtual-hosts-on-ubuntu-14-04-lts
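For the two services in the question, one possible approach (an assumption on my part, the linked article only covers plain virtual hosts) is a name-based virtual host per service that reverse-proxies to the local port. The sketch below is a small Python script that prints such Apache blocks plus a matching hosts-file line, so no DNS server is needed. It assumes mod_proxy and mod_proxy_http are enabled (for example with `a2enmod proxy proxy_http`).

```python
# Apache reverse-proxy virtual host, one per backend service.
VHOST_TEMPLATE = """\
<VirtualHost *:80>
    ServerName {name}
    ProxyPreserveHost On
    ProxyPass / http://localhost:{port}/
    ProxyPassReverse / http://localhost:{port}/
</VirtualHost>
"""

# Host names and ports taken from the question.
BACKENDS = {"tomcat.localhost": 8080, "calibre.localhost": 8081}


def render_vhost(name, port):
    """Return the Apache configuration block for one backend."""
    return VHOST_TEMPLATE.format(name=name, port=port)


def render_hosts_line(names, address="127.0.0.1"):
    """Return a hosts-file line mapping all names to one address."""
    return address + " " + " ".join(names)


if __name__ == "__main__":
    for name, port in BACKENDS.items():
        print(render_vhost(name, port))
    print(render_hosts_line(BACKENDS))
```

The printed blocks would go into a file under /etc/apache2/sites-available/ and the last line into /etc/hosts. For other computers on the network, their hosts files would need the server's LAN address instead of 127.0.0.1.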
Is there a way to get the difficulty for any coin even if there isn't a blockchain site like http://blockchain.info/ (they have an API)? I need to access it programmatically, and I want to get it from the source, so scraping it from a site that already lists them all isn't an option. I'm using an Ubuntu VPS, so the RAM and especially the disk space are limited; hence, I can't have a lot of blockchains installed on it.
Most if not all coins have daemons you can use. You can run the daemon on the server and make a request to it. They should all be similar to the PHP one.
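A minimal Python sketch of such a request, assuming a Bitcoin-style daemon with its JSON-RPC interface enabled. `getdifficulty` is the Bitcoin Core method name and 8332 its default RPC port; other coins may use different ports, and the credentials shown in the usage note are hypothetical placeholders.

```python
import base64
import json
import urllib.request


def build_rpc_request(url, user, password, method, params=None):
    """Build an authenticated JSON-RPC POST request for a coin daemon."""
    payload = json.dumps(
        {"jsonrpc": "1.0", "id": "difficulty", "method": method, "params": params or []}
    ).encode("utf-8")
    # The daemon expects HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json", "Authorization": "Basic " + token},
    )


def get_difficulty(url, user, password, timeout=10):
    """Ask the daemon for the current difficulty and return the result field."""
    request = build_rpc_request(url, user, password, "getdifficulty")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["result"]
```

With a daemon running locally, a call would look like `get_difficulty("http://127.0.0.1:8332/", "rpcuser", "rpcpassword")`.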
My boss wants me to test our web application to demonstrate how much traffic the web site can handle.
The app is a JSF/JPA/Oracle application, everything is running on one rack mounted server at a local hosting company's data centre.
The truth is, we don't know how much traffic it can handle before it gets unresponsive or shuts down altogether.
What would be a good way to pound on the web app from the internet, simulating tonnes of traffic? I was thinking of setting up a number of different Amazon EC2 virtual machines and getting them to pretend to be web visitors, but is there some kind of software I can run on these machines so they behave like lots of web visitors?
Also, it doesn't have to be free; I'd be willing to pay for a solution or a tool.
Any suggestions or help is greatly appreciated!
Thanks, Rob
Try this, mate:
http://httpd.apache.org/docs/1.3/programs/ab.html
Did you try setting up Selenium Grid to run tests in parallel? This will simulate actual user actions on the application and in turn can stress the app server. You can install a performance monitoring utility on the server to monitor the load generated.
Or you could use JMeter to simulate multiple users accessing your application. You can ask your network admin team to route this traffic over the internet instead of your local network.
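To make concrete what these tools do, here is a minimal Python sketch of a concurrent load generator: it fires a fixed number of GET requests with a given concurrency and reports failures and response times. It is only an illustration under simple assumptions (one URL, no sessions or think time); ab and JMeter are far more capable, and the function names here are my own.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_fetch(url, timeout=30):
    """Issue one GET request and return (ok, seconds)."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()
            ok = 200 <= response.status < 400
    except OSError:  # URLError (including HTTPError) and socket timeouts
        ok = False
    return ok, time.perf_counter() - start


def run_load(url, total_requests, concurrency, fetch=timed_fetch):
    """Send total_requests (at least 1) using concurrency worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(fetch, [url] * total_requests))
    durations = sorted(seconds for _, seconds in results)
    failures = sum(1 for ok, _ in results if not ok)
    return {
        "requests": total_requests,
        "failures": failures,
        # Middle element of the sorted timings (upper median for even counts).
        "median_seconds": durations[len(durations) // 2],
        "max_seconds": durations[-1],
    }
```

Running `run_load(url, 1000, 50)` from several EC2 machines at once and raising the concurrency step by step would show roughly where response times start to climb and failures appear.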
I'm trying to set up a server based on our needs for a new website. Basically, I need to build a website based on SocialEngine, and according to the platform's requirements (found here: http://www.socialengine.net/support/documentation/article?q=152&question=SocialEngine-Requirements) it requires the web server to be Apache based.
Now my issue comes with the addition of a web application that needs to be included in the site. The web application requires the server to be capable of Asynchronous Request Processing, and is currently only supported by Tomcat or GlassFish.
I found a couple of tutorials, such as this one http://www.serverwatch.com/tutorials/article.php/2203891/Integrating-Tomcat-with-Apache.htm, that explain how to "integrate" Tomcat into Apache. Would a server running Tomcat alone be able to handle the applet needs as well as serve the Apache (assuming HTTP) needs from the SocialEngine platform? Are there any hosting providers any of you would recommend?
Although I've done a lot of front-end work before, this is the first time I have to deal with any of the back-end details, so my knowledge of server-side functionality is really poor. Please let me know if I'm not asking the right questions.
Thanks
You wouldn't really be able to use Tomcat for both apps, since the other one needs PHP. It's pretty common to have both Tomcat and Apache running on the same server. You might want to look up more recent documentation on mixing them, even this but definitely have a look at mod_proxy_ajp.
What's the other application? It's a little tricky to set up Asynchronous Request Processing if you are new to server apps, but there is also a lot of documentation, so if you're game, you can probably figure it out OK. You might also want to see if that app would work with node.js (hosting info here)
If you want to set it all up yourself, you could get a virtual private server from Rackspace Cloud or a similar host. Alternatively, you could get a shared host that has the required apps already set up; that would limit your ability to customize the environment and may require 2 hosting plans, but would be easier to set up. It also depends somewhat on whether both apps need to be on the same machine for any reason and/or on the same domain.
A regular LAMP stack will run SE4 just fine. However, you will need to do some tuning to get the page loads under 3 seconds. You will want to disable any Apache modules that you aren't using with a2dismod. For instance, if you're not using any Ruby on the site, run a2dismod ruby. This will help get memory usage under control. APC is a must.
For a much more in depth read on tuning php/apache, please read this: Performance tuning on Apache, PHP, MySQL, WordPress v1.1 – Updated
I would like to be able to give someone a "bundle" of software to be able to host anywhere. Is there a way to do this so that the person is charged by Amazon for the amount of time that they used but does not have to deal with setting up an EC2 account and installing an image?
Also, it doesn't have to be EC2. What I am looking for is a way for people to host their own cloud service.
I'll split your question into two parts:
1. Not requiring your customer to set up an account on EC2 (or whatever)
2. Not having to install an image
For the first, I don't think you'll be able to find a solution where the person who is to be charged by the host provider can be charged without setting up an account!
For the second, I think you are mainly concerned about the annoyance of your customer having to do a full image install and then configure your software package into that image... VM/VPS setup can be a chore.
If this is the case, you can simplify this a lot by pre-preparing a full VM that has an OS installed as well as your fully configured software and support packages. Then simply give your "someone"/customer the entire VM as one .iso file and they can host it at http://www.elastichosts.com quite easily. ElasticHosts lets you set up completely arbitrary VMs, unlike most (all?) other VPS providers that currently limit you to a selection of OS images that you then need to install your software into.
I don't think you can get any easier, unless I completely misunderstand your question.