I was reading a few articles on salts and password hashes and a few people were mentioning rainbow attacks. What exactly is a rainbow attack and what are the best methods to prevent it?
The wikipedia article is a bit difficult to understand. In a nutshell, you can think of a Rainbow Table as a large dictionary with pre-calculated hashes and the passwords from which they were calculated.
The difference between Rainbow Tables and other dictionaries is simply the method by which the entries are stored. The Rainbow Table is optimized for hashes and passwords, and thus achieves great space efficiency while still maintaining good look-up speed. But in essence, it's just a dictionary.
When an attacker steals a long list of password hashes from you, he can quickly check if any of them are in the Rainbow Table. For those that are, the Rainbow Table will also contain what string they were hashed from.
Of course, there are just too many hashes to store them all in a Rainbow Table. So if a hash is not in the particular table, the hacker is out of luck. But if your users use simple English words and you have hashed them just once, there is a good chance that a decent Rainbow Table will contain the password.
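As a rough illustration, here is a minimal Python sketch of the "precomputed dictionary" idea. A real rainbow table stores hash chains rather than every hash/password pair, which is where the space savings come from, but the lookup step is conceptually the same (the candidate list and hashes here are just examples):

```python
import hashlib

# A plain lookup table: precompute hashes for candidate passwords.
# (A real rainbow table stores hash chains to save space, but cracking
# still boils down to a lookup rather than brute force.)
candidates = ["password", "letmein", "123456", "qwerty"]
table = {hashlib.md5(p.encode()).hexdigest(): p for p in candidates}

# An unsalted hash stolen from the victim's database:
stolen_hash = hashlib.md5(b"letmein").hexdigest()

# Recovering the password is now just a dictionary lookup.
print(table.get(stolen_hash))  # -> "letmein"
```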
It's when somebody uses a Rainbow table to crack passwords.
If you are worried about this, you should use Salt. There is also a Stack Overflow question that might help you understand salt a little better than Wikipedia...
This is a useful article on Rainbow Tables for the lay person. (Not suggesting you are a layperson, but it's well written and concise.)
Broadly speaking, you hash a vast number of possible short plaintext strings (e.g. candidate passwords), and store the hashes alongside the plaintext. This makes it (relatively) straightforward to simply look up the plaintext when you have the hash.
This is most useful for weak and/or unsalted password hashes. A popular example is the LAN Manager hash, used by versions of Windows up to XP to store user passwords.
Note that a pre-computed rainbow table for even something as simple as the LM hash takes a lot of CPU time to generate and occupies a fair amount of space (on the order of 10s of gigabytes IIRC).
Rainbow Tables basically allow someone to store a large number of precomputed hashes feasibly.
This makes it easy to crack your hashed passwords, since instead of performing a whole heap of hashing functions, the work has already been done and they virtually just have to do a database lookup.
The best protection against this kind of attack is to use a salt (random characters) in your password. i.e. instead of storing md5(password), store md5(password + salt), or even better md5(salt + md5(password)).
Even with rainbow tables, it is going to be near impossible to store all possible salted hashes.
BTW, obviously you have to store your salt with your hash so that you can authenticate the user.
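To make the salting idea concrete, here is a minimal Python sketch (the function names are illustrative; in practice a slow, dedicated password hash such as bcrypt, scrypt, or PBKDF2 is preferable to a single fast hash):

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); the salt is stored in the DB next to the digest."""
    salt = os.urandom(16)                                    # random per-user salt
    digest = hashlib.sha256(salt + password.encode()).digest()
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.sha256(salt + password.encode()).digest()
    return hmac.compare_digest(candidate, digest)            # constant-time compare
```

Because every user has a different salt, an attacker would need a separate precomputed table per salt value, which defeats the whole point of precomputation.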
Late to the party, but I was also aware of Rainbow Tables as a method of attack on hashed/unsalted passwords. However, http://codahale.com/how-to-safely-store-a-password/ was shared on Twitter recently; depending on your needs and concerns, you may not be able to salt your way to safe password storage, and the article recommends an adaptive hash such as bcrypt instead.
I hope this is informative to you.
Wikipedia is your friend:
http://en.wikipedia.org/wiki/Rainbow_table
Ok ok, I know you are probably all going to kill me for asking this, but I got into a friendly programmer argument with a co-worker about one of our database tables. He asked a question to which I know the answer, but I couldn't explain why it is the better way.
I will simplify the situation for the sake of the question. We have a fairly large table of people/users. Amongst other data being stored, the columns in question are a simNumber, a cellNumber, and the ipAddress of that sim.
Now I am saying that we should make a table, let's call it SimTable, put those 3 columns in it, and then put a FK in the UsersTable linking the two. Why? Because that's what I have always been taught: NORMALISE your tables!!! Ok, so all is good in that regard.
But now my friend says: yes, but when you want to query a user's phone number, SQL now has to go and:
search for the user
search for the sim fk
search for the correct sim row in the sim database
get the phone number
Now when I go and request 10,000 users' phone numbers, the number of operations performed grows considerably.
Vs the other approach
search for the user
find the phone number
Now the argument is purely performance based. As much as I understand why we normalize data (to remove redundancy, improve maintainability, make changes to data in one table that propagate, etc.), it does appear to me that the approach with all the data in one table will be faster, or will at least require fewer tasks/operations to give me the data I want.
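To make the comparison concrete, here is roughly what the two query shapes look like (a sketch using SQLite purely for illustration; the table and column names are made up, but the same SQL applies to SQL Server):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalised design (names made up for this example)
    CREATE TABLE SimTable (simId INTEGER PRIMARY KEY,
                           simNumber TEXT, cellNumber TEXT, ipAddress TEXT);
    CREATE TABLE Users    (userId INTEGER PRIMARY KEY, name TEXT,
                           simId INTEGER REFERENCES SimTable(simId));
""")

# Normalised: fetching 10,000 users' phone numbers is still one
# set-based query; the join resolves through SimTable's primary key
# rather than a separate lookup per user.
rows = conn.execute("""
    SELECT u.name, s.cellNumber
    FROM Users u
    JOIN SimTable s ON s.simId = u.simId
""").fetchall()

# Denormalised alternative (cellNumber stored directly on Users):
#   SELECT name, cellNumber FROM Users
```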
So what is the case in this situation? I do hope that I have not asked anything insanely silly; it is early in the morning, so do forgive me if I'm not thinking clearly.
The technology involved is MS SQL Server 2012.
[EDIT]
This article below also touches on some of the concepts I have mentioned above
http://databases.about.com/od/specificproducts/a/Should-I-Normalize-My-Database.htm
The goal of normalization is not performance. The goal is to model your data correctly with minimum redundancy so you avoid data anomalies.
Say for example two users share the same phone. If you store the phones in the user table, you'd have sim number, IP address, and cell number stored one each user's row.
Then you change the IP address on one row but not the other. How can one sim number have two IP addresses? Is that even valid? Which one is correct? How would you fix such discrepancies? How would you even detect them?
There are times when denormalization is worthwhile, if you really need to optimize data access for one query that you run very frequently. But denormalization comes at a cost, so be prepared to commit yourself to a lot more manual work to take responsibility for data integrity. More code, more testing, more cleanup tasks. Do those count when considering "performance" of the project overall?
Re comments:
I agree with @JoelBrown: as soon as you implement your first case of denormalization, you compromise on data integrity.
I'll expand on what Joel mentions as "well-considered." Denormalization benefits specific queries. So you need to know which queries you have in your app, and which ones you need to optimize for. Do this conservatively, because while denormalization can help a specific query, it harms performance for all other uses of the same data. So you need to know whether you need to query the data in different ways.
Example: suppose you are designing a database for StackOverflow, and you want to support tags for questions. Each question can have a number of tags, and each tag can apply to many questions. The normalized way to design this is to create a third table, pairing questions with tags. That's the physical data model for a many-to-many relationship:
Questions ----<- QuestionsTagged ->---- Tags
But you figure you don't want to do the join to get tags for a given question, so you put tags into a comma-separated string in the questions table. This makes it quicker to query a given question and its associated tags.
But what if you also want to query for one specific tag and find its related questions? If you use the normalized design, it's simply a query against the many-to-many table, but on the tag column.
But if you denormalize by storing tags as a comma-separated list in the Questions table, you'd have to search for tags as substrings within that comma-separated list. Searching for substrings can't be indexed with a standard B-tree style index, and therefore searching for related questions becomes a costly table-scan. It's also more complex and inefficient to insert and delete a tag, or to apply constraints like uniqueness or foreign keys.
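To make the contrast concrete, here is a small sketch of the two query shapes (SQLite syntax, names invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Questions       (questionId INTEGER PRIMARY KEY,
                                  title TEXT, tagsCsv TEXT);
    CREATE TABLE Tags            (tagId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE QuestionsTagged (questionId INTEGER, tagId INTEGER,
                                  PRIMARY KEY (questionId, tagId));
    CREATE INDEX idx_tagged_by_tag ON QuestionsTagged (tagId, questionId);
""")

# Normalised: "questions with tag X" is an indexed lookup on the
# many-to-many table.
by_tag_normalised = """
    SELECT q.title
    FROM Tags t
    JOIN QuestionsTagged qt ON qt.tagId = t.tagId
    JOIN Questions q        ON q.questionId = qt.questionId
    WHERE t.name = ?
"""

# Denormalised: searching inside the comma-separated tagsCsv column
# cannot use a B-tree index, so every Questions row must be scanned.
by_tag_denormalised = """
    SELECT title FROM Questions
    WHERE ',' || tagsCsv || ',' LIKE '%,' || ? || ',%'
"""

print(conn.execute(by_tag_normalised, ("sql",)).fetchall())
print(conn.execute(by_tag_denormalised, ("sql",)).fetchall())
```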
That's what I mean by denormalization making an improvement for one type of query at the expense of other uses of the data. That's why it's a good idea to start out with everything in normal form, and then refactor to denormalized designs later on a case by case basis as your bottlenecks reveal themselves.
This goes back to old wisdom:
"Premature optimization is the root of all evil" -- Donald Knuth
In other words, don't denormalize until you can demonstrate during load testing that (a) it makes a real improvement to performance that justifies the loss of data integrity, and (b) it does not degrade performance of other cases unacceptably.
It sounds like you already understand the benefits of normalisation, so I won't cover these.
There are a couple of considerations here:
1. Does a user always have one and only one phone number?
If so, then it is still normalised to add these to the user table. However, if the user can have either no phone number or multiple phone numbers, then the phone details should be held in a separate table.
Assuming you have these in separate tables, if after conducting performance tests you find that joining these 2 tables has a significant effect on performance, then you may choose to deliberately denormalise them for performance gains.
Others have already provided some good points and you may also want to take a look at this.
I'd just like to mention one more aspect that is often overlooked: I/O tends to be the greatest component of the cost of most queries, and denormalization generally increases the storage size of data, therefore making the DBMS cache "smaller".
If your normalized database fits into cache and denormalized doesn't, you may actually observe a performance decrease for the latter.
And you won't be able to spot that in development, unless you actually have the amount of data that is similar to production. This is one of many reasons why you should never, ever denormalize without solid measurements (on representative amounts of data) to justify it.
Can somebody explain whether this algorithm is secure or not? Is there an attack that breaks it? The algorithm uses common XOR cryptography but has some differences:
M(1) = key XOR Message(1)
M(2) = h(key) XOR Message(2)
M(3) = h(h(key)) XOR Message(3)
and so on
Notes:
M(i) is ciphered text
Message(i) is message that we are going to cipher it
key and Message(i) have the same length
the attacker has only the ciphertext and knows the key-derivation scheme (continued hashing) and that XOR encryption is used
hash algorithm is SHA-512
If the attacker ever gets to know a plaintext-ciphertext pair, he can calculate the corresponding key, and from that he can calculate all later keys. In other words, it's trivially vulnerable to a known-plaintext attack.
Note that when I say that the attacker guesses the message, I don't mean that he's sure that his guess is correct. He might make a few trillion guesses, and if one of them is correct, your whole scheme is broken.
And of course you must not ever reuse a key.
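To illustrate why, here is a minimal Python sketch of the scheme as described (SHA-512 keystream, key hashed once more per block) together with the known-plaintext attack: recovering one block's keystream from a known message immediately yields every later block's keystream. The seed and messages are invented for the example:

```python
import hashlib

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, messages: list[bytes]) -> list[bytes]:
    """The scheme from the question: M(i) = h^(i-1)(key) XOR Message(i)."""
    out, k = [], key
    for m in messages:
        out.append(xor(k, m))
        k = hashlib.sha512(k).digest()   # next block's key
    return out

# 64-byte key and 64-byte message blocks, as the question requires.
key = hashlib.sha512(b"secret seed").digest()
messages = [b"A" * 64, b"attack at dawn".ljust(64), b"retreat at dusk".ljust(64)]
ciphertexts = encrypt(key, messages)

# --- attacker's side: one known (or guessed) plaintext block is enough ---
recovered_k1 = xor(ciphertexts[0], messages[0])        # = key
recovered_k2 = hashlib.sha512(recovered_k1).digest()   # = h(key)
recovered_k3 = hashlib.sha512(recovered_k2).digest()   # = h(h(key))

print(xor(ciphertexts[1], recovered_k2))  # b'attack at dawn ...'
print(xor(ciphertexts[2], recovered_k3))  # b'retreat at dusk ...'
```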
A more secure (but twice as slow) algorithm would be:
Key(i+1) = h("A"+key)
M(i) = h("B"+key) XOR Message(i)
Or a construction similar to CTR mode:
M(i) = h(i+key) XOR Message(i)
But I still wouldn't use either.
But there is no reason to use such a homebrew algorithm. There are plenty of existing algorithms that work well. For example if you like a stream cipher design, you could use AES in CTR mode.
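If you do want a stream-cipher-style construction, something like the following does the job with a vetted primitive. This is a sketch using the third-party `cryptography` package (which must be installed separately); in real use, the nonce must be fresh for every message encrypted under the same key and transmitted alongside the ciphertext:

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(32)      # AES-256 key
nonce = os.urandom(16)    # unique per message; never reuse with the same key

def aes_ctr(data: bytes) -> bytes:
    cipher = Cipher(algorithms.AES(key), modes.CTR(nonce))
    ctx = cipher.encryptor()
    return ctx.update(data) + ctx.finalize()

ciphertext = aes_ctr(b"attack at dawn")
plaintext = aes_ctr(ciphertext)   # CTR mode is symmetric: the same op decrypts
```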
Studying encryption algorithms is great fun. Just remember you are playing, not producing anything serious. As long as you are only keeping things like your personal diary (or maybe even passwords) encrypted and you keep the data secure, you will probably be fine. This kind of counts as security through obscurity. I would not recommend encrypting mass quantities of data that you REALLY need to keep private or anything that is available and of interest to the outside world, however.
In this case, if your message is shorter than the key size and hash block size and the key is single use and random, you are effectively using a one-time pad, so everything is swell. Provided your random number key generation is perfect, you have an unbreakable encryption mechanism. As you add each block to the message, you are effectively calculating new keys using SHA-512, not adding any particular value. If an attacker can assume the message consists of printable text, and if the message is long or the key is used repeatedly, it would not be too difficult to find the original key.
It would be more effective to calculate:
M(1) = h(N + key) XOR Message(1)
M(2) = h(M(1)) XOR Message(2)
M(3) = h(M(2)) XOR Message(3)
(where N is the number of times the key has been used, which is passed in clear text.)
That way the bad guys can’t calculate your key sequence ahead of time and decrypt your message before you can. Also by using a salted hash of the key, the attacker won’t be able to predict the key sequence that will be used next time.
I read somewhere:
The first rule of cryptography is “Cryptography should be left to experts.”
The second rule is “You are not an expert.”
There is a reason people get PhDs in things like Computer Science and Mathematics. There is a lot to learn and discover. Something like this looks fine to me but no doubt it has a gaping hole that an attacker could drive a truck through.
Have fun and don't let grouchy people like me get you down.
/Bob Bryan
I'm creating a script to detect weak passwords within a MySQL database. Which method would work the best?
I've been researching a few methods, but can't seem to decide which one would offer the best results with the best performance. I currently have the following methods in mind:
Extract passwords, and perform a dictionary attack on each (sketched below).
Still extracting passwords, but to a file and use a tool like Hydra.
Perform a regex matching, that hits on basic passwords.
Please note that all passwords in the database are hashed with MD5.
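For the first option, the core of a dictionary check against unsalted MD5 hashes is small. A sketch (the wordlist path is an assumption, and `rows` stands in for the username/hash pairs you would SELECT out of MySQL):

```python
import hashlib

def weak_passwords(rows, wordlist_path="wordlist.txt"):
    """rows: iterable of (username, md5_hex) pairs pulled from the users table.

    Returns the users whose hash matches a dictionary word. Because the
    hashes are unsalted MD5, one pass over the wordlist covers all users.
    """
    with open(wordlist_path, encoding="utf-8") as f:
        lookup = {hashlib.md5(w.strip().encode()).hexdigest(): w.strip() for w in f}
    return [(user, lookup[h]) for user, h in rows if h in lookup]

# Example usage with rows you would normally fetch from the database:
rows = [("alice", hashlib.md5(b"letmein").hexdigest()),
        ("bob",   "5f4dcc3b5aa765d61d8327deb882cf99")]   # md5("password")
print(weak_passwords(rows))
```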
Every now and then I come across applications that force you to change passwords once in a while. Almost universally, they have this strange requirement for the new password: it has to be "significantly" different from your previous password(s).
While at first this sounds logical, next thing I think is: how do they do that? Do they store my passwords in plain text? I would have accepted the answer that they do, if it wasn't for the fact that these are kinds of applications that pretend to care about security so much they force you to change your password if it is expired! Microsoft Exchange is one example of this.
I'm not very good at cryptography and hash functions, so my question is this: Is it possible to enforce this kind of policy without storing passwords in plain text?
Do you know how this policy is implemented in real world applications?
UPDATE: An Example.
I was recently changing my Microsoft Exchange password. I only use Web Access, so it might be different a little -- I have no idea.
So, it forces me to change my password. What I do sometimes is change it to something new and then change it back almost immediately. The freaky part is that it did not allow me to even change it back because of this. I tried changing it a little, by adding a letter in front or changing one symbol; no luck, it kept complaining.
With a typical hash, the best you can do is see if the new password is exactly equal to previous ones. You can break the password into multiple hashes in order to get more flexible with comparison, for example 3 hashes:
Alpha characters only
Numeric characters only
All other characters
You could for example require all the hashes to change to be accepted, to prevent users from just changing their password from SecretPassword01 to SecretPassword02.
A cryptographic expert may weigh in here on if this could be made as secure as a single hash.
NOTE that this is not as secure as a single hash, so before you go implementing this, make sure you have really done your research.
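A rough sketch of the three-hash idea described above (with the same caveat: splitting the password by character class leaks structure, so treat this as an illustration of the mechanism rather than a recommendation):

```python
import hashlib
import re

def class_hashes(password: str, salt: bytes) -> tuple[str, str, str]:
    alpha  = "".join(re.findall(r"[A-Za-z]", password))
    digits = "".join(re.findall(r"[0-9]", password))
    other  = "".join(re.findall(r"[^A-Za-z0-9]", password))
    h = lambda s: hashlib.sha256(salt + s.encode()).hexdigest()
    return h(alpha), h(digits), h(other)

old = class_hashes("SecretPassword01", b"salt")
new = class_hashes("SecretPassword02", b"salt")

# Require every component hash to change before accepting the new password:
changed_enough = all(o != n for o, n in zip(old, new))
print(changed_enough)  # False: only the numeric part changed
```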
When changing your password you're usually asked for the old one to confirm your identity. It's then trivial to compare the old one and the new one to see how much they differ. TBH I don't know how to compare against several previous passwords without storing them, but that's getting into the territory of ridiculous policies anyway.
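Since the old password is available in plain text only at the moment of the change request, the similarity check needs no stored plaintext at all. A minimal sketch (the threshold value is an arbitrary example):

```python
from difflib import SequenceMatcher

def significantly_different(old: str, new: str, threshold: float = 0.6) -> bool:
    """True if the new password is sufficiently dissimilar to the old one.
    Both strings exist only in memory while the change request is handled."""
    return SequenceMatcher(None, old.lower(), new.lower()).ratio() < threshold

print(significantly_different("Hunter2!", "Hunter3!"))   # False: too similar
print(significantly_different("Hunter2!", "plover-42"))  # True
```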
I'm thinking of hashing user passwords with two different salt strings, one stored in the code which is the same for all users and another stored in the database for which each user has their own unique value.
Would this be more effective than simply storing the values in the database?
Any advice or opinions appreciated.
Thanks
The effect is miniscule if anything at all. Consider that a static, hard coded salt can be viewed as nothing more than an alteration to the hashing algorithm - it happens exactly the same way every time, so it may as well be considered part of the algorithm.
But the purpose of the salt is to create some randomness that is similar to extending the (minimum) strength of the password, for the purpose of making offline cracking (including rainbow tables) more resource intensive (non-rainbow-table cracking will require more CPU time, and rainbow tables will require all salts for all strings).
The only way that you'd get any value from this is while the static salt is unknown - the equivalent to the algorithm being unknown. If your binary or your source is available to the attacker, then reverse engineering will demonstrate the algorithm and the hard coded salt.
And if this issue goes public, you will probably have to deal with flack from many security enthusiasts who believe that anything not perfect is completely broken, even though your product already does the right thing and the additional step is just useless.
And, of course, you'll have to deal with maintenance issues of having a static salt - backwards compatibility and bug fixes around the hashing code can be a pain.
The very small benefit of static keys (or salts) is simply not worth the cost. Always make keys and salts dynamic.
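For completeness, here is a sketch of what the two-salt arrangement discussed above (an application-wide "pepper" in code plus a per-user salt in the database) looks like; as noted, the pepper only adds value for as long as it stays out of the attacker's hands:

```python
import hashlib
import hmac
import os

PEPPER = b"application-wide secret kept in code/config, not in the DB"

def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)                       # per-user, stored in the DB
    digest = hashlib.sha256(PEPPER + salt + password.encode()).digest()
    return salt, digest

def verify(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.sha256(PEPPER + salt + password.encode()).digest()
    return hmac.compare_digest(candidate, digest)
```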