Basic Web Development Questions (building a working test site) - sql

I am new to this site and coding. I have self taught myself html and I understand css. I have been putting together a site of mine using my basic knowledge. I have no college experience but this is MY DREAM to put this site together so I have done a lot of research and read books to get started but I have hit a roadblock now. Here is what I have done:
-I have put together all of the front end pages and design using html/css. So, I have all of the pages that would be involved with the site, ready to go. All designed and have the layout how I wish it to be.
I guess I would call it the "skeleton" of the site. Any page that a user would be directed to, I have in a folder.
I have put together a little "demo" for myself to mimic a user experience. For example, I created a login page that "looks" how i want it to be but it doesnt actually store or save any logins.
This is my first question:
What is my next step? I admit it sounds stupd but I am self taught and I really have the ambition to acheive this I just can not figure out where to go from here in order to actually make a functioning site. All I have right now is my html "demo" where basically I have to follow a certain path down my site that mimics what a user would do on the site. I have it now where I click on the "sign up" button on my html form and it basically just redirects to my "new user" page. Then it is the same formula throughout the rest of my demo. I just put my other html pages I have designed into the html to sort of give a "user experience" to the demo. But I REALLY want to be able to have working accounts and saved data.
How do I create/save a user login to my site? DO i need to get a sql database? Is there a free one to use while i build the site? Honestly i really need someone who is willing to help me out with the steps in this journey without me sharing my entire site (i wish to keep it to my self) but.. i understand this is basic web stuff i just am genuinely lost as how to take it to the next level. I have all of the html done and now i need a way to actually make it work. I wish to conversate with someone please about this kink in the chain i am seeming to find myself in please. Thank you so much and I would be grateful. :)
----basically what programming languages do i need to learn, or when looking for someone to hire, what should they be skilled in? any software or sites or databases that i need? please help!!!

HTML and CSS are the languages that make up the front end of a website, like you said. In order for your website to have dynamic content (content specific to a user) and the ability to actually process logins, etc., there needs to be a server involved. A webpage is a text document that is interpreted by a browser. HTML makes up the content and CSS tells the browser how you want it to look. What you are missing, primarily, is server scripts, most commonly, in my experience, PHP. You can also include JavaScript for client-side effects.
Specific to your question about a user login, yes, you will need a database. The process should look something like this.
User visits login page
User enters information into an HTML form
User clicks submit
Form is submitted to a server URL using the 'POST' method
Server validates the form content
Server checks database for username or email (whichever you are using)
If the username/email exists, it compares the passwords
Server sends a response back to the client, either good or bad
Once the user is validated, you can redirect the user to the dashboard or user section.
Please keep in mind this is a very simplistic version of events. There are more in depth steps that need to be taken, for example, your passwords should never be stored in a database as plain text, you should use a one-way encryption (hashing) algorithm to make them unreadable. Then when a password is given to the server it should be hashed and you should compare the hashes. You can also use salts when hashing for more security. The form should use SSL to prevent man in the middle attacks, etc.
Sounds like you are off to a good start, but in order to make it work you have to add the server logic. Self-teaching will get you as far as you are willing to let it. I taught myself how to do web programming, and now I do it as a business. The Internet is a great resource. There are a ton of great tutorials online that will show you how to do everything I just laid out.

Related

Create individual pages or dynamically create a page?

I'm unsure of what to do here I would like to ask what process should I use or what are the web standards in accomplishing this.
Scenario:
A User inputs information (e.g. a classified ads post). How should I generate the page of the users post? Should I create a new page with the information provided by the user or should I create just one page that dynamically changes with the content requested?
Generally whats the process I should take?
I hear a lot of talk about using cms, but I think I can create my own simple "cms" that can provide my needs. But I need to know what direction should I use. Creating a NEW PAGE or Creating a SINGLE DYNAMIC PAGE?
I agree that you should be able to create your own "CMS" that will meet your needs. If I understand your question, I would definitely create one page that is dynamically loaded from the database. Think about the server load you could create if you created a new page for every entry. One of the things I advise every client that I develop a website for is to think about what they want their website to be doing for their business in 3 years. I would advise you to think through the same question and I believe you will see why I recommend using a dynamic page.
Good luck with your site.

Protect page from requests

What's another way, instead of recaptcha, to protect the booting page?
My customers do not like to be filling these things out and I'm out of ideas
Without more details, it will be difficult to know what is appropriate for your situation. Here are some things that may or may not work for you depending on your situation:
Instead of a CAPTCHA use a simple question. Arithmetic perhaps. Or asking someone to type a word that is undistorted rather than distorted like a CAPTCHA. Or having someone always type the same word into a box. This is nowhere near as strong as a CAPTCHA, but it may be enough depending on your needs.
Password protect your site. Depending on your needs, you can use a shared password, individual accounts specific to your site, or something like Facebook Connect / OAuth / OpenID.
You can try a robots.txt file, but that will only keep well-behaved robots away. Attackers will, of course, ignore it.
You can firewall off your server so people can only access it from a certain subnet. If everyone using the site has access to the same VPN, then they can use VPN to access the site.
If you don't like any of #Trott's suggestions - A simple CAPTCHA replacement, but I am not sure for how long (a somewhat sophisticated attacker could crack it):
Add this into your form:
<input name="dummy" value="" style="display: hidden"/>
Then in your server code,
if params['dummy'].empty?
# user
else
# spambot!
end
This relies on spambots compulsively filling out unknown form fields (so that they don't leave out any mandatory ones); but a user will never see it, and thus always leave empty.

Dynamic url shortening script for text input

We are looking for script, which automatically detects url, as you type and shorten it, in text input window, before press "submit". The shortening service used is http://yourls.org/
Have you tried implementing one yourself? Deploy the shortener to your own web site (it's written in PHP, as far as I can see from a cursory glance at the web site) and provide a simple Ajax endpoint which will dynamically perform a shortening conversion, then implement calls to that from the main page using JavaScript.
You might want to impose a reasonable delay to allow the user to finish typing, to avoid performing lots of unnecessary conversions of bogus URLs (which may require, e.g. writes to a file or database - I haven't looked at how the library referenced does things).
I'm not sure what you're trying to achieve; if you create new shortened URLs for each substring before the user has finished typing the full URL, you will just proliferate your database.
I don't see how shortening a URL before it's finished makes sense.
If you want to relieve the user from the arduous task of clicking the submit button, then initiate the submit using javascript (jQuery, or something). I'm not sure if that's what you want to do.
http://monkeytooth.net/2010/12/htaccess-php-how-to-wordpress-slugs/
simple means of implementing the concept its a lot more easier than one would think. Querying a DB or some other means of matching the slug/id with the that of which is found in the URL wouldn't be all to hard either. The linked article doesn't really go in depth as what to do next but catching and breaking the URL apart is the essential process of making it work. I have person used the method myself on several sites and it works like a charm for me and the sites it was used on.

How do I implement a secure upload/download area?

I've been asked to create a solution where people log in and are able to upload and download off of our work server. So John uploads a photo, and Jen can download it, for example. They also have to authenticate themselves.
Can someone give me a rough overview of how to implement this? I'm familiar enough with MySQL, C#, and JavaScript.
The rough overview
This should just be a matter of planning out the pieces.
at the very top of the page, put some code that checks if a user is logged in. If not, show a login form (or redirect to...). If they are logged in, show the rest of the page. If not, you'll need some logic to show a form, and then check it once it's submitted for authentication, and set a SESSION cookie or something similar.
Once the user is logged in, on the homepage, you might have an file-upload form and a listing of existing files. How you would style would depend on how many files you might expect to have. To keep things extremely simple, you could simple iterate through whatever files are in the upload directory. If you expect many more files than that, you may consider using a db.
Handle a file upload by sanitizing filenames (checking for filetype/filesize if you want to limit those) and putting the file into the directory.
Force the users to download the files (instead of having the browser decide what to do with them) for security purposes. Implementing this on certain filetypes may also be acceptable.
Other thoughts
You probably would not want the users to be able to excecute any files, so keeping the file directory hidden would be a good idea.
Keeping track of who uploaded and downloaded what is also doable, but would add another layer of complication to the script.

Figure out if a website has restricted/password protected area

I have a big list of websites and I need to know if they have areas that are password protected.
I am thinking about doing this: downloading all of them with httrack and then writing a script that looks for keywords like "Log In" and "401 Forbidden". But the problem is these websites are different/some static and some dynamic (html, cgi, php,java-applets...) and most of them won't use the same keywords...
Do you have any better ideas?
Thanks a lot!
Looking for password fields will get you so far, but won't help with sites that use HTTP authentication. Looking for 401s will help with HTTP authentication, but won't get you sites that don't use it, or ones that don't return 401. Looking for links like "log in" or "username" fields will get you some more.
I don't think that you'll be able to do this entirely automatically and be sure that you're actually detecting all the password-protected areas.
You'll probably want to take a library that is good at web automation, and write a little program yourself that reads the list of target sites from a file, checks each one, and writes to one file of "these are definitely passworded" and "these are not", and then you might want to go manually check the ones that are not, and make modifications to your program to accomodate. Using httrack is great for grabbing data, but it's not going to help with detection -- if you write your own "check for password protected area" program with a general purpose HLL, you can do more checks, and you can avoid generating more requests per site than would be necessary to determine that a password-protected area exists.
You may need to ignore robots.txt
I recommend using the python port of perls mechanize, or whatever nice web automation library your preferred language has. Almost all modern languages will have a nice library for opening and searching through web pages, and looking at HTTP headers.
If you are not capable of writing this yourself, you're going to have a rather difficult time using httrack or wget or similar and then searching through responses.
Look for forms with password fields.
You may need to scrape the site to find the login page. Look for links with phrases like "log in", "login", "sign in", "signin", or scrape the whole site (needless to say, be careful here).
I would use httrack with several limits and then search the downloaded files for password fields.
Typically, a login form could be found within two links of the home page. Almost all ecommerce sites, web apps, etc. have login forms that are accessed just by clicking on one link on the home page, but another layer or even two of depth would almost guarantee that you didn't miss any.
I would also limit the speed that httrack downloads, tell it not to download any non-HTML files, and prevent it from downloading external links. I'd also limit the number of simultaneous connections to the site to 2 or even 1. This should work for just about all of the sites you are looking at, and it should be keep you off the hosts.deny list.
You could just use wget and do something like:
wget -A html,php,jsp,htm -S -r http://www.yoursite.com > output_yoursite.txt
This will cause wget to download the entire site recursively, but only download endings listed with the -A option, in this case try to avoid heavy files.
The header will be directed to file output_yoursite.txt which you then can parse for the header value 401, which means that the part of the site requires authentication, and parse the files accordingly to Konrad's recommendation also.
Looking for 401 codes won't reliably catch them as sites might not produce links to anything you don't have privileges for. That is, until you are logged in, it won't show you anything you need to log in for. OTOH some sites (ones with all static content for example) manage to pop a login dialog box for some pages so looking for password input tags would also miss stuff.
My advice: find a spider program that you can get source for, add in whatever tests (plural) you plan on using and make it stop of the first positive result. Look for a spider that can be throttled way back, can ignore non HTML files (maybe by making HEAD requests and looking at the mime type) and can work with more than one site independently and simultaneously.
You might try using cURL and just attempting to connect to each site in turn (possibly put them in a text file and read each line, try to connect, repeat).
You can set up one of the callbacks to check the HTTP response code and do whatever you need from there.