Protect page from requests - httpwebrequest

What's another way, instead of reCAPTCHA, to protect the booting page?
My customers do not like filling these things out, and I'm out of ideas.

Without more details, it is difficult to know what is appropriate for your situation, but here are some things that may or may not work for you:
Instead of a CAPTCHA, use a simple question: arithmetic, perhaps, or asking someone to type a word that is undistorted rather than distorted like a CAPTCHA, or having everyone always type the same word into a box. This is nowhere near as strong as a CAPTCHA, but it may be enough depending on your needs (a rough sketch follows this list).
Password protect your site. Depending on your needs, you can use a shared password, individual accounts specific to your site, or something like Facebook Connect / OAuth / OpenID.
You can try a robots.txt file, but that will only keep well-behaved robots away. Attackers will, of course, ignore it.
You can firewall off your server so people can only access it from a certain subnet. If everyone using the site has access to the same VPN, they can use that VPN to reach the site.
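For the simple-question idea in the first suggestion, here is a minimal PHP sketch, not from the original answer; the field name and session key are made up for illustration, and the two blocks belong to two separate requests:

// Rendering the form (e.g. form.php): generate a small sum and remember the answer.
session_start();
$a = rand(1, 9);
$b = rand(1, 9);
$_SESSION['expected_answer'] = $a + $b;
echo '<label>What is ' . $a . ' + ' . $b . '? <input name="answer"></label>';

// Handling the submission (e.g. submit.php): compare against the session value.
session_start();
if (!isset($_POST['answer']) || (int) $_POST['answer'] !== $_SESSION['expected_answer']) {
    exit('Wrong answer, please go back and try again.');
}
// ...otherwise continue processing the request...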

If you don't like any of @Trott's suggestions, here is a simple CAPTCHA replacement, though I'm not sure how long it will hold up (a somewhat sophisticated attacker could crack it):
Add this into your form:
<input name="dummy" value="" style="display: hidden"/>
Then in your server code,
if params['dummy'].empty?
  # human user: the hidden field was left blank
else
  # spambot: it filled in a field no human can see
end
This relies on spambots compulsively filling out unknown form fields (so that they don't leave out any mandatory ones); a real user will never see the field and will always leave it empty.

Related

Basic Web Development Questions (building a working test site)

I am new to this site and to coding. I have taught myself HTML and I understand CSS. I have been putting together a site of mine using my basic knowledge. I have no college experience, but this is MY DREAM to put this site together, so I have done a lot of research and read books to get started, but I have hit a roadblock now. Here is what I have done:
- I have put together all of the front-end pages and the design using HTML/CSS. So I have all of the pages that would be involved with the site ready to go, all designed and laid out how I want them.
I guess I would call it the "skeleton" of the site. Any page that a user would be directed to, I have in a folder.
I have put together a little "demo" for myself to mimic a user experience. For example, I created a login page that "looks" how I want it to, but it doesn't actually store or save any logins.
This is my first question:
What is my next step? I admit it sounds stupid, but I am self-taught and I really have the ambition to achieve this; I just cannot figure out where to go from here in order to actually make a functioning site. All I have right now is my HTML "demo", where basically I have to follow a certain path down my site that mimics what a user would do. Right now, when I click the "sign up" button on my HTML form, it just redirects to my "new user" page, and it is the same formula throughout the rest of the demo: I link my other designed HTML pages together to give a sort of "user experience". But I REALLY want to have working accounts and saved data.
How do I create/save a user login for my site? Do I need to get a SQL database? Is there a free one to use while I build the site? Honestly, I really need someone who is willing to help me with the steps in this journey without me sharing my entire site (I wish to keep it to myself), but I understand this is basic web stuff; I am just genuinely lost as to how to take it to the next level. I have all of the HTML done, and now I need a way to actually make it work. I would love to talk with someone about this roadblock I seem to have hit. Thank you so much, I would be grateful. :)
---- Basically, what programming languages do I need to learn, or, when looking for someone to hire, what should they be skilled in? Is there any software, site, or database that I need? Please help!
HTML and CSS are the languages that make up the front end of a website, like you said. In order for your website to have dynamic content (content specific to a user) and the ability to actually process logins, etc., there needs to be a server involved. A webpage is a text document that is interpreted by a browser: HTML makes up the content, and CSS tells the browser how you want it to look. What you are missing, primarily, are server-side scripts, most commonly (in my experience) PHP. You can also include JavaScript for client-side effects.
Specific to your question about a user login: yes, you will need a database. The process should look something like this:
User visits login page
User enters information into an HTML form
User clicks submit
Form is submitted to a server URL using the 'POST' method
Server validates the form content
Server checks database for username or email (whichever you are using)
If the username/email exists, it compares the passwords
Server sends a response back to the client, either good or bad
Once the user is validated, you can redirect the user to the dashboard or user section.
Please keep in mind this is a very simplistic version of events. There are more in-depth steps that need to be taken. For example, your passwords should never be stored in the database as plain text; you should use a one-way hashing algorithm to make them unreadable. Then, when a password is given to the server, it should be hashed and the hashes compared. You can also use salts when hashing for more security. The form should be submitted over SSL to prevent man-in-the-middle attacks, etc.
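As a concrete illustration (not part of the original answer), here is a minimal PHP sketch of the hash-and-compare step using the built-in password functions; $storedHash stands in for the value you would load from your database:

// Registration: hash the submitted password and store only the hash.
$hash = password_hash($_POST['password'], PASSWORD_DEFAULT); // salting is handled for you
// ...INSERT $hash into your users table along with the username...

// Login: look up the stored hash for the submitted username, then compare.
if (password_verify($_POST['password'], $storedHash)) {
    // match: log the user in and redirect to the dashboard
} else {
    // no match: send back a failure response
}

password_hash() generates and embeds a salt automatically, which covers the salting point above.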
Sounds like you are off to a good start, but in order to make it work you have to add the server logic. Self-teaching will get you as far as you are willing to let it. I taught myself how to do web programming, and now I do it as a business. The Internet is a great resource. There are a ton of great tutorials online that will show you how to do everything I just laid out.

What is preventing people from using someone else's CAPTCHA as their own?

Why (other than moral reasons) don't more people use the CAPTCHAs of other sites as their own while selling the solving of said CAPTCHAs?
To me, such a system seems like it would be simple to implement:
set up a script that does something on another website that requires a CAPTCHA to be completed, through the use of a proxy service
when a user on your site performs a task that requires the completion of a CAPTCHA, simply serve them the CAPTCHA that the other site asks you to solve
when the user solves the CAPTCHA, your script can perform the desired action on the other site that is the source of the CAPTCHA, and the user on your site is also verified through this process
Is this commonplace? If not, why not? What, if anything, could be done to prevent this?
Fetching the CAPTCHA: assume you could easily fetch the exact image of the CAPTCHA from the foreign host. To do that, you have to pass the referrer check (most browsers, navigated by humans, do send the HTTP referer). You would also have to save the session ID and the secret from the hidden input.
Checking the result: the foreign host must link the saved variables to the ones associated with the session of your first request, which requires you to implement some tricky cURL handling. You would also have to handle multiple parallel requests, all from your single IP.
Your server will probably use more resources hijacking a CAPTCHA on a foreign host than it would generating its own.
Preventive measures:
an HTTP referer check (sketched after this list)
limiting requests from a single IP to, e.g., 5 per minute
good session handling and tricky cookies
it's not impossible to reverse engineer JavaScript, but the more complicated your JavaScript is, ...
the attacker has to find a pattern that recognizes the result on the foreign host; the easiest signature may be the Location header field, leading either to /path/success.html or /path/tryagain.php
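A minimal PHP sketch of the referer check, as an illustration only; the header is client-supplied and can be spoofed, so this is just a speed bump:

// Refuse to serve the CAPTCHA if the Referer header is missing or
// points at a different host than our own.
// Note: in production, strip any ":port" from HTTP_HOST before comparing.
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$refHost = parse_url($referer, PHP_URL_HOST);

if ($refHost === null || $refHost === false || $refHost !== $_SERVER['HTTP_HOST']) {
    http_response_code(403);
    exit; // likely a foreign site trying to leech the CAPTCHA
}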
Challenge:
I took a moment to prepare an example: http://woisteinebank.de/test/
In this example, I attach keys to the session_id(); and save it in the database.
Through session_regenerate_id(); I have a fresh session on every request.
In check.php, I compare the database values to the $_GET values.
Try to find a way to leech this CAPTCHA and I'll try to defend it. Every time you successfully use my CAPTCHA on your site, I'll adjust the defense.

Stop spam without captcha

I want to stop spammers from using my site, but I find CAPTCHAs very annoying. I am not just talking about the "type the text" kind, but anything that requires users to waste their time proving they are human.
What can I do here?
Requiring Javascript to post data blocks a fair amount of spam bots while not interfering with most users.
You can also use a nifty trick:
<input type="text" id="not_human" name="name" />
<input type="text" name="actual_name" />
<style>
#not_human { display: none }
</style>
Most bots will populate the first field, so you can block them.
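A server-side counterpart in PHP, assuming the two field names from the HTML above (this check is not in the original answer):

// "name" is the hidden honeypot field; "actual_name" is the one real users see and fill in.
if (!empty($_POST['name'])) {
    exit; // the invisible field was filled in: almost certainly a bot
}
$name = isset($_POST['actual_name']) ? $_POST['actual_name'] : '';
// ...continue processing the legitimate submission...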
I combine a few methods that seem quite successful so far:
Provide an input field with the name email and hide it with CSS (display: none). When the form is submitted, check whether this field is empty; bots tend to fill it with a bogus email address.
Provide another hidden input field which contains the time the page was loaded. Check whether the time between loading and submitting the page is larger than the minimum time it takes to fill in the form; I use between 5 and 10 seconds.
Then check whether the number of GET parameters is what you would expect. If your form's action is POST and the underlying URL of your submission page is index.php?p=guestbook&sub=submit, then you expect 2 GET parameters. Bots try to add GET parameters, so this check would fail for them.
And finally, check that the HTTP_USER_AGENT is set (bots sometimes don't set it) and that the HTTP_REFERER is the URL of the page containing your form. Bots sometimes POST directly to the submission page, causing the HTTP_REFERER to be something else. (A rough sketch of these checks follows below.)
I got most of my information from http://www.braemoor.co.uk/software/antispam.shtml and http://www.nogbspam.com/.
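A rough PHP sketch tying a few of these checks together; the field and session names are illustrative, the two blocks belong to two separate requests, and the expected GET parameters follow the index.php?p=guestbook&sub=submit example above:

// When rendering the form: remember the load time and output the honeypot field.
session_start();
$_SESSION['form_loaded_at'] = time();
echo '<input type="text" name="email" value="" style="display: none">';

// When handling the POST (separate request):
session_start();
$tooFast   = (time() - (isset($_SESSION['form_loaded_at']) ? $_SESSION['form_loaded_at'] : 0)) < 5;
$honeypot  = !empty($_POST['email']);            // bots fill the bogus "email" field
$badAgent  = empty($_SERVER['HTTP_USER_AGENT']); // some bots send no user agent
$extraGets = count($_GET) !== 2;                 // expecting only p and sub

if ($tooFast || $honeypot || $badAgent || $extraGets) {
    exit; // treat as spam
}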
Integrate the Akismet API to automatically filter your users' posts.
If you're looking for a .NET solution, the Ajax Control Toolkit has a control named NoBot.
NoBot is a control that attempts to provide CAPTCHA-like bot/spam prevention without requiring any user interaction. NoBot has the benefit of being completely invisible. NoBot is probably most relevant for low-traffic sites where blog/comment spam is a problem and 100% effectiveness is not required.
NoBot employs a few different anti-bot techniques:
Forcing the client's browser to perform a configurable JavaScript calculation and verifying the result as part of the postback. (Ex: the calculation may be a simple numeric one, or may also involve the DOM for added assurance that a browser is involved)
Enforcing a configurable delay between when a form is requested and when it can be posted back. (Ex: a human is unlikely to complete a form in less than two seconds)
Enforcing a configurable limit to the number of acceptable requests per IP address per unit of time. (Ex: a human is unlikely to submit the same form more than five times in one minute)
More discussion and a demonstration can be found in this blog post by Jacques-Louis Chereau on NoBot.
<ajaxToolkit:NoBot
    ID="NoBot2"
    runat="server"
    OnGenerateChallengeAndResponse="CustomChallengeResponse"
    ResponseMinimumDelaySeconds="2"
    CutoffWindowSeconds="60"
    CutoffMaximumInstances="5" />
I would be careful using CSS or Javascript tricks to ensure a user is a genuine real life human, as you could be introducing accessibility issues, cross browser issues, etc. Not to mention spam bots can be fairly sophisticated, so employing cute little CSS display tricks may not even work anyway.
I would look into Akismet.
Also, you can be creative in the way you validate user data. For example, let's say you have a registration form that requires a user email and address. You can be fairly hardcore in how you validate the email address, even going so far as to ensure the domain is actually set up to receive mail, and that there is a mailbox on that domain that matches what was provided. You could also use Google Maps API to try and geolocate an address and ensure it's valid.
To take this even further, you could implement "hard" and "soft" validation errors. If the mail address doesn't match a regex validation string, then that's a hard fail. Not being able to check the DNS records of the domain to ensure it accepts mail, or that the mailbox exists, is a "soft" fail. When you encounter a soft fail, you could then ask for CAPTCHA validation. This would hopefully reduce the amount of times you'd have to push for CAPTCHA verification, because if you're getting enough activity on the site, valid people should be entering valid data at least some of the time!
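A small PHP sketch of that hard/soft distinction for the email check; rejectSubmission(), requireCaptcha(), and acceptSubmission() are hypothetical helpers standing in for your own handling:

$email = isset($_POST['email']) ? $_POST['email'] : '';

if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
    rejectSubmission('Invalid email address');                   // hard fail: reject outright
} elseif (!checkdnsrr(substr(strrchr($email, '@'), 1), 'MX')) {
    requireCaptcha();                                            // soft fail: fall back to a CAPTCHA
} else {
    acceptSubmission();                                          // domain looks able to receive mail
}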
I realize this is a rather old post, however, I came across an interesting solution called the "honey-pot captcha" that is easy to implement and doesn't require javascript:
Provide a hidden text box!
Most spambots will gladly complete the hidden text box allowing you to politely ignore them.
Most of your users will never even know the difference.
To prevent a user with a screen reader from falling into your trap, simply label the text box "If you are human, leave this blank" or something to that effect.
Tada! Non-intrusive spam-blocking! Here is the article:
http://www.campaignmonitor.com/blog/post/3817/stopping-spambots-with-two-simple-captcha-alternatives
Since it is extremely hard to avoid spam 100%, I recommend reading this IBM article, posted two years ago, titled 'Real Web 2.0: Battling Web spam', where visitor behavior and control workflow are analyzed well and concisely:
Web spam comes in many forms, including:
Spam articles and vandalized articles on wikis
Comment spam on Weblogs
Spam postings on forums, issue trackers, and other discussion sites
Referrer spam (when spam sites pretend to refer users to a target site that lists referrers)
False user entries on social networks
Dealing with Web spam is very difficult, but a Web developer neglects spam prevention at his or her peril. In this article, and in a second part to come later, I present techniques, technologies, and services to combat the many sorts of Web spam.
The article also links to a very interesting "...hashcash technique for minimizing spam on Wikis and such, in addition to e-mail."
How about a human-readable question that tells the user to type in the first letter of the value they put in the first-name field and the last letter of the last-name field, or something like that?
Or show some hidden fields which are filled in by JavaScript with values such as the referrer and so on, then check those fields for equality against the values you stored in the session beforehand.
If the values are empty, the user simply has no JavaScript, which by itself does not mean spam; but a bot will at least fill in some of them (incorrectly).
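A sketch of that session comparison in PHP; the two blocks belong to two different requests, and js_token is a made-up field name:

// form.php: store a token in the session and let JavaScript copy it into a hidden field.
session_start();
$_SESSION['js_token'] = bin2hex(random_bytes(16));
echo '<input type="hidden" name="js_token" id="js_token" value="">';
echo '<script>document.getElementById("js_token").value = "' . $_SESSION['js_token'] . '";</script>';

// submit.php: an empty field only means JavaScript was off; a wrong value is suspicious.
session_start();
$submitted = isset($_POST['js_token']) ? $_POST['js_token'] : '';
if ($submitted !== '' && $submitted !== $_SESSION['js_token']) {
    exit; // filled in, but not by our script: likely a bot
}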
In any case, you should pick one of these approaches: a honeypot field or BOTCHA.

Figure out if a website has restricted/password protected area

I have a big list of websites and I need to know if they have areas that are password protected.
I am thinking about doing this: downloading all of them with httrack and then writing a script that looks for keywords like "Log In" and "401 Forbidden". But the problem is that these websites are all different, some static and some dynamic (HTML, CGI, PHP, Java applets...), and most of them won't use the same keywords...
Do you have any better ideas?
Thanks a lot!
Looking for password fields will only get you so far and won't help with sites that use HTTP authentication. Looking for 401s will help with HTTP authentication, but won't get you sites that don't use it, or ones that don't return a 401. Looking for links like "log in" or for "username" fields will get you some more.
I don't think that you'll be able to do this entirely automatically and be sure that you're actually detecting all the password-protected areas.
You'll probably want to take a library that is good at web automation and write a little program of your own that reads the list of target sites from a file, checks each one, and writes one file of "these are definitely passworded" and another of "these are not". Then you might want to go manually check the ones that are not and modify your program to accommodate what you find. Using httrack is great for grabbing data, but it's not going to help with detection; if you write your own "check for password-protected area" program in a general-purpose language, you can do more checks, and you can avoid generating more requests per site than are necessary to determine that a password-protected area exists.
You may need to ignore robots.txt
I recommend using the Python port of Perl's Mechanize, or whatever nice web-automation library your preferred language has. Almost all modern languages will have a nice library for opening and searching through web pages and looking at HTTP headers.
If you are not capable of writing this yourself, you're going to have a rather difficult time using httrack or wget or similar and then searching through responses.
Look for forms with password fields.
You may need to scrape the site to find the login page. Look for links with phrases like "log in", "login", "sign in", "signin", or scrape the whole site (needless to say, be careful here).
I would use httrack with several limits and then search the downloaded files for password fields.
Typically, a login form could be found within two links of the home page. Almost all ecommerce sites, web apps, etc. have login forms that are accessed just by clicking on one link on the home page, but another layer or even two of depth would almost guarantee that you didn't miss any.
I would also limit the speed at which httrack downloads, tell it not to download any non-HTML files, and prevent it from downloading external links. I'd also limit the number of simultaneous connections to the site to 2 or even 1. This should work for just about all of the sites you are looking at, and it should keep you off the hosts.deny list.
You could just use wget and do something like:
wget -A html,php,jsp,htm -S -r http://www.yoursite.com 2> output_yoursite.txt
This will cause wget to download the entire site recursively, but only files with the extensions listed after the -A option; in this case the list tries to avoid heavy files.
The server headers printed by -S go to stderr, which is redirected here to output_yoursite.txt; you can then parse that file for 401 responses, which mean that part of the site requires authentication, and also search the downloaded files according to Konrad's recommendation.
Looking for 401 codes won't reliably catch these sites, as they might not produce links to anything you don't have privileges for; that is, until you are logged in, a site won't show you anything you need to log in for. On the other hand, some sites (ones with all static content, for example) manage to pop up a login dialog box for some pages, so looking for password input tags would also miss things.
My advice: find a spider program that you can get the source for, add in whatever tests (plural) you plan on using, and make it stop at the first positive result. Look for a spider that can be throttled way back, can ignore non-HTML files (maybe by making HEAD requests and looking at the MIME type), and can work with more than one site independently and simultaneously.
You might try using cURL and just attempting to connect to each site in turn (possibly put them in a text file and read each line, try to connect, repeat).
You can set up one of the callbacks to check the HTTP response code and do whatever you need from there.
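A rough PHP/cURL sketch of that loop; the sites.txt file name and the detection heuristics (401 status or a password field in the HTML) are just placeholders, not the only checks you might want:

// Flag sites whose home page returns 401 or whose HTML contains a password field.
$sites = file('sites.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

foreach ($sites as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);
    $body = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    $protected = ($code === 401)
        || ($body !== false && stripos($body, 'type="password"') !== false);
    echo $url . ($protected ? " -> looks password-protected\n" : " -> no obvious login found\n");
}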

Block spam from "tell a friend" forms

You have to have a form on your website so people can send an email to a friend if they find something interesting. You can force people to be logged in (which is not a good option in my case). You can add a time delay (this is not really urgent email, so it can wait for 5 minutes). Do you have this problem? How would you solve it?
Edit: I am mostly interested in stopping manual spam
Do you have a problem with automated scripting of your form, or people genuinely using it too much?
The simple solution to the bot problem is a Captcha, such as ReCaptcha. The user-friendliness is questionable, but it would perhaps solve your problem.
You can also use something different from all those captcha scripts. Let me tell you what I do:
- I create an MD5 hash:
$secretWord = 'TryToHashMe';
$formID     = 'myForm';
$md5Value   = md5($secretWord . $formID);
echo '<input type="hidden" name="form-check" value="' . $md5Value . '">';
echo '<input type="hidden" name="bot-check" value="">';
These are two very simple checks because: 1) bots try to fill in all of your inputs, so the empty bot-check field will get populated, and 2) if the hash is not provided (or does not match), the POST request came from outside your site. The hashing could also be extended with a session or a cookie.
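The verification side would then look something like this (a sketch only; the secret and field names match the snippet above):

$secretWord = 'TryToHashMe';
$formID     = 'myForm';

$hashOk  = isset($_POST['form-check']) && $_POST['form-check'] === md5($secretWord . $formID);
$humanOk = isset($_POST['bot-check']) && $_POST['bot-check'] === '';

if ($hashOk && $humanOk) {
    // process the submission
} else {
    exit; // hash missing/wrong or honeypot filled: likely a bot or an off-site POST
}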
All the best!
I would recommend a CAPTCHA, or, if you would like something a bit less intrusive, a simple math problem (which changes), so you just have something like:
For spam protection: Type what Two Plus Two is here _________
I did this on my personal website and never had a problem (and I saw a lot of attempts by spambots fail).
This service has very good anti-spam measures.
http://www.tellafriendking.com/features.php?showall=1#spam-free
FYI, I am involved with the company, so I'm not entirely unbiased, but we do get a lot of refugees who come to us to end their spam problems with other services or downloaded scripts.
Edit:
If you feel the need to vote down, perhaps you should leave a comment too...
The best solution is to use an all-purpose bot-filtering solution. I know this is an old post, but a new botnet was recently discovered that uses these "send to a friend" modules to send spam (not a new technique, but with some interesting new advancements).
According to one security vendor (good tips), “At a minimum, they should include a rate-limiting mechanism that will prevent an IP address from issuing unreasonable numbers of requests over a specific period of time. Other DIY solutions are to have all users fill in CAPTCHAs and to enforce registration as a prerequisite to sending out an email message.”
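As a rough illustration of that rate-limiting suggestion (not from the quoted vendor), a minimal PHP sketch using one counter file per IP; the limits and file location are arbitrary, and a real implementation would use a database or cache with proper locking:

$ip     = $_SERVER['REMOTE_ADDR'];
$bucket = sys_get_temp_dir() . '/taf_' . md5($ip); // one counter file per IP
$limit  = 5;                                       // max requests...
$window = 60;                                      // ...per 60 seconds

$data = @json_decode(@file_get_contents($bucket), true);
if (!is_array($data) || time() - $data['start'] > $window) {
    $data = array('start' => time(), 'count' => 0); // start a new window
}
$data['count']++;
file_put_contents($bucket, json_encode($data));

if ($data['count'] > $limit) {
    http_response_code(429); // Too Many Requests
    exit('Please wait a minute before sending more messages.');
}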