Why do license/product keys contain hyphens/dashes?

All the license keys I've seen look something like this: XXXX-XXXX-XXXX-XXXX
But why do they always contain dashes? Is there a specific advantage over keys like this: XXXXXXXXXXXXXXXX?
Follow-up question:
What's the best format for a license key? What I mean is:
How long (in characters) should it be?
How large should the character set be? (E.g. [a-z, A-Z, 0-9] plus two special characters, just [a-z, A-Z], or something else?)
Dashes? If so, after how many characters? (XXXX-XXXX, XXXXX-XXXXX, or something else?)

For the separation, because they often have to be manually entered and verified by a human. The initial reading - and then cross-checking the sequence to catch errors - is much easier as a series of chunks than as a single contiguous string.
As for the length, this depends on a few factors: how globally unique you want the key to be, how hard you want it to be to "brute force" (generate valid but unauthorized keys), and how easy you want it to be to enter. Most license keys also have undisclosed checksum or signing mechanisms to reduce the efficiency of brute forcing, though these can themselves add to the length.
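To make the trade-offs concrete, here is a minimal Python sketch of a generator along those lines. Nothing here reflects any real vendor's scheme; the alphabet, chunk size, and single check character are illustrative choices:

import secrets

# The alphabet avoids easily confused characters (0/O, 1/I) to ease
# manual entry; 32 symbols means each character carries 5 bits.
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"

def make_key(chunks: int = 4, chunk_len: int = 4) -> str:
    # Generate all but the last character randomly...
    body = [secrets.choice(ALPHABET) for _ in range(chunks * chunk_len - 1)]
    # ...and make the last character a simple mod-N checksum, so most
    # typos (and naive guesses) can be rejected offline.
    check = sum(ALPHABET.index(c) for c in body) % len(ALPHABET)
    body.append(ALPHABET[check])
    raw = "".join(body)
    # The dashes are inserted purely for human readability.
    return "-".join(raw[i:i + chunk_len] for i in range(0, len(raw), chunk_len))

def is_valid(key: str) -> bool:
    raw = key.replace("-", "")
    if not raw or any(c not in ALPHABET for c in raw):
        return False
    return ALPHABET.index(raw[-1]) == sum(ALPHABET.index(c) for c in raw[:-1]) % len(ALPHABET)

print(make_key())  # e.g. 7QKM-2XWP-ABCD-EF3G

A real scheme would replace the toy checksum with a keyed signature so that valid keys can't be generated from the published validation code alone.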


Handling dynamic (user supplied) column names

When writing applications that manage data, it is often useful to let the end user create or remove classes of data that are best represented as columns. For example, I'm working on a dictionary-building application; a user might decide they want to add, say, an "alternate spelling" field to their data, which could very easily be represented as another column.
Usually, I just name the column based on whatever the user called it ("alternate_spelling" in this case); however, a user-defined string that isn't explicitly sanitized as a database identifier bothers me. Since column names can't be bound like values, I'm trying to figure out how to sanitize the column names.
So my question is: what should I be doing? Can I get away with just quoting things? There are lots of questions asking how to bind column names in SQL, with many responses saying one should never need to, but none explaining the correct approach to handling variable columns. I'm working in Python specifically, but I think this question is more general.
It depends on which database you are using...
According to the PostgreSQL documentation:
"SQL identifiers and key words must begin with a letter (a-z, but also letters with diacritical marks and non-Latin letters) or an underscore (_). Subsequent characters in an identifier or key word can be letters, underscores, digits (0-9), or dollar signs ($). Note that dollar signs are not allowed in identifiers according to the letter of the SQL standard, so their use might render applications less portable"
(Also keep in mind the maximum length allowed for the name; PostgreSQL truncates identifiers to 63 bytes by default.)
I was looking for something like this. I still wouldn't trust it with user-supplied names - I'd look those up from the database catalog instead - but I think it is robust enough to check data provided by your own backend.
That is, just because something comes from your internal data tables or YAML config files doesn't guarantee that an attacker couldn't have compromised those sources, so why not add another layer of checking right before composing SQL queries?
The following is written for PostgreSQL but should mostly work elsewhere. No, it doesn't cover ALL characters that are legal in column and table names, only those used in my databases.
import re

class SecurityException(Exception):
    """Concerns security."""

class UnsafeSqlException(SecurityException):
    """SQL fragment looks unsafe."""

def is_safe_sql_name(sql: str, error_on_empty: bool = False, raise_on_false: bool = True) -> bool:
    """Check that something looks like an object name."""
    patre = re.compile(r"^[a-z][a-z0-9_]{0,254}$", re.IGNORECASE)
    if not isinstance(sql, str):
        raise TypeError(f"sql should be a string {sql=}")
    if not sql:
        if error_on_empty:
            raise ValueError(f"empty sql {sql=}")
        return False
    res = bool(patre.match(sql))
    if not res and raise_on_false:
        raise UnsafeSqlException(f"{sql=}")
    return res
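As a usage sketch, the check above can serve as the first layer, with the driver's own identifier quoting as a second. Assuming psycopg2 (since the question mentions Python) and a hypothetical ALTER TABLE scenario:

from psycopg2 import sql

def add_user_column(conn, table: str, column: str) -> None:
    # First layer: reject anything that doesn't look like a plain identifier.
    is_safe_sql_name(column)  # raises UnsafeSqlException on suspicious input
    # Second layer: have the driver quote the identifiers; sql.Identifier
    # handles the quoting that ordinary parameter binding can't do.
    query = sql.SQL("ALTER TABLE {} ADD COLUMN {} text").format(
        sql.Identifier(table), sql.Identifier(column)
    )
    with conn.cursor() as cur:
        cur.execute(query)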

Selecting rows where a column contains at least one character not in a whitelist

I'm converting some sensitive data from a low-security encryption to a higher security encryption (specifically, from CFMX_COMPAT to AES with a 256-bit key). I intend to encode my AES-encrypted strings using Hex, and CFMX_COMPAT is extremely likely to use special characters, so finding records that aren't yet converted should be as simple as (pseudocode):
select from table where column has at least one character not in [A-Z0-9]
Is this possible in SQL? If so, how?
I found this documentation, but I had no idea it was possible in a simple LIKE clause. Awesome!
select top 10 foo
from bar
where foo like '%[^A-Z0-9]%'
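One caveat: bracket character classes in LIKE are a SQL Server (T-SQL) extension, not standard SQL; on PostgreSQL or MySQL you would reach for their regex operators instead. Failing both, the same sweep is easy to do client-side. Here's a rough Python sketch; the table and column names are placeholders, and sqlite3 just stands in for whatever DB-API driver you use:

import re
import sqlite3  # stand-in; any DB-API connection works the same way

# A value containing any character outside [A-Z0-9] has not been
# converted to hex-encoded AES yet.
not_hex = re.compile(r"[^A-Z0-9]")

conn = sqlite3.connect("data.db")  # placeholder database
for row_id, value in conn.execute("SELECT id, col FROM tbl"):  # placeholder names
    if value is not None and not_hex.search(value):
        print(f"row {row_id} still needs conversion: {value!r}")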

What characters count as the same character under UTF-8 Unicode collation, and what VB.NET function can be used to merge them?

Also, what's the VB.NET function that will map all those different characters to their most standard form?
For example, ToLower would map A and a to the same character, right?
I need the same function for these characters
German:
ß === s
Ü === u
Χιοσ == Χίος
Otherwise, sometimes I insert Χιοσ, and later when I insert Χίος, MySQL complains that the ID already exists.
So I want to create a unique ID that maps all those variant characters to a more stable form.
For the encoding aspect of things, look at String.Normalize. Note also the overload that lets you specify the particular normal form to convert the string to, though the default normal form (C) will work just fine for nearly everyone who wants to "map all those different characters into their most standard form".
However, things get more complicated once you move into the database and deal with collations.
Unicode normalization never changes character case. It covers only cases where the characters are basically equivalent: they look the same¹ and mean the same thing. For example, Χιοσ != Χίος even after normalization: the two sigma characters (σ and ς) are considered non-equivalent, while the accented iota (\u1F30) is equivalent to the two-character sequence of plain iota (\u03B9) followed by the combining accent (\u0313).
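You can see both behaviors with Python's unicodedata module; this is just a sketch of the concept, not the VB.NET API the question asks about:

import unicodedata

precomposed = "Χ\u1f30ος"        # iota with psili as one code point
decomposed  = "Χ\u03b9\u0313ος"  # plain iota + combining psili

# Canonically equivalent sequences compare unequal until normalized...
print(precomposed == decomposed)                 # False
print(unicodedata.normalize("NFC", precomposed) ==
      unicodedata.normalize("NFC", decomposed))  # True

# ...but normalization never unifies sigma (σ) with final sigma (ς):
print(unicodedata.normalize("NFC", "Χιοσ") ==
      unicodedata.normalize("NFC", "Χίος"))      # False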
Your real problem seems to be that you are using Unicode strings as primary keys, which is not the most popular database design practice. Such primary keys take up more space than needed and are bound to change over time (even if the initial version of the application does not plan to support that). And, as you have seen, they are sensitive to collations. Instead of identifying records by Unicode strings, have the database schema generate meaningless sequential integers for you as you insert the records, and demote the Unicode strings to mere attributes of the records. This way they can be the same or different as you please.
It may still be useful to normalize them before storing, for the purpose of searching and safer subsequent processing; but the particular case-insensitive collation that you use will no longer restrict you in any way.
¹ Almost the same, in the case of compatibility normalization as opposed to canonical normalization.
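A sketch of that design with SQLAlchemy (the table and column names are invented for illustration): the integer id is the primary key, and the normalized Unicode string becomes an ordinary unique attribute:

import unicodedata
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Place(Base):  # hypothetical table
    __tablename__ = "places"
    # Meaningless sequential integer as the primary key...
    id = Column(Integer, primary_key=True)
    # ...with the Unicode string demoted to a mere (unique) attribute,
    # stored in a single normal form for predictable comparisons.
    name = Column(String(255), unique=True, nullable=False)

    def __init__(self, name: str):
        self.name = unicodedata.normalize("NFC", name)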

How can you query a SQL database for malicious or suspicious data?

Lately I have been doing a security pass on a PHP application, and I've already found and fixed one XSS vulnerability (in both the input validation and the output encoding).
How can I query the database to make sure there isn't any malicious data still residing in it? The fields in question should be text with a few allowable symbols (-, #, spaces) but shouldn't contain any special HTML characters (<, ", ', >, etc.).
I assume I should use regular expressions in the query; does anyone have prebuilt regexes especially for this purpose?
If you only care about non-alphanumerics and you're on SQL Server, you can use:
SELECT *
FROM MyTable
WHERE MyField LIKE '%[^a-z0-9]%'
This will show you any row where MyField has anything except a-z and 0-9.
EDIT:
The updated pattern would be: LIKE '%[^a-z0-9!-# ]%' ESCAPE '!'
I had to add the ESCAPE character since you want to allow literal dashes (-); inside the brackets, an unescaped dash would start a character range.
For the same reason that you shouldn't validate input against a blacklist (i.e. a list of illegal characters), I'd try to avoid doing the same in your search. I'm commenting without knowing the intent of the fields holding the data (i.e. name, address, "about me", etc.), but my suggestion would be to construct your query to identify what you do want in your database and then flag the exceptions.
The reason is that there are simply so many different character patterns used in XSS. Take a look at the XSS Cheat Sheet and you'll start to get an idea. Particularly once you get into character encoding, just looking for things like angle brackets and quotes is not going to get you very far.
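To make that concrete, here's a small Python sketch of the whitelist approach, flagging values that contain anything outside the allowed set. The allowed symbols come from the question; the sample rows and function are made up:

import re

# Whitelist matching the fields described in the question: letters,
# digits, dashes, '#', and spaces. Anything else gets flagged for review.
ALLOWED = re.compile(r"^[A-Za-z0-9#\- ]*$")

def suspicious_rows(rows):
    """Yield (row_id, value) pairs whose value falls outside the whitelist."""
    for row_id, value in rows:
        if value is not None and not ALLOWED.match(value):
            yield row_id, value

# Example: rows as they might come back from a DB-API cursor.
sample = [(1, "Apt #4 - Building 7"), (2, '<script>alert("xss")</script>')]
for row_id, value in suspicious_rows(sample):
    print(f"row {row_id} needs review: {value!r}")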

How do you know when to use varchar and when to use text in SQL?

It seems like a very arbitrary decision.
Both can accomplish the same thing in most cases.
Limiting the varchar length seems to me like shooting yourself in the foot, because you never know how long a field you will need.
Is there any specific guideline for choosing VARCHAR or TEXT for your string fields?
I will be using PostgreSQL with the SQLAlchemy ORM framework for Python.
In PostgreSQL there is no technical difference between varchar and text.
You can think of varchar(nnn) as a text column with a check constraint that prohibits storing larger values.
So whenever you want a length constraint, use varchar(nnn).
If you don't want to restrict the length of the data, use text.
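Since you mention SQLAlchemy: in its terms the choice looks like this (the model below is invented for illustration); String(n) becomes VARCHAR(n) and Text becomes TEXT in the generated DDL:

from sqlalchemy import Column, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Article(Base):  # hypothetical model
    __tablename__ = "articles"
    id = Column(Integer, primary_key=True)
    # String(n) maps to VARCHAR(n): the database enforces the limit.
    title = Column(String(200), nullable=False)
    # Text maps to TEXT: unconstrained length.
    body = Column(Text)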
This sentence is wrong:
Limiting the varchar length seems to me like shooting yourself in the foot, because you never know how long a field you will need.
If you are saving, for example, MD5 hashes, you do know exactly how large the field you're storing is (32 hex characters), and your storage becomes more efficient. Other examples are:
Usernames (64 max)
Passwords (128 max)
Zip codes
Addresses
Tags
Many more!
In brief:
Variable-length fields save space, but because each field can have a different length, they make table operations slower.
Fixed-length fields make table operations fast, although they must be large enough for the maximum expected input, and so can use more space.
Think of an analogy to arrays and linked lists, where arrays are fixed length fields, and linked lists are like varchars. Which is better, arrays or linked lists? Lucky we have both, because they are both useful in different situations, so too here.
In most cases you do know what the max length of a string in a field is. In the case of a first or last name, for example, you don't need more than 255 characters. So you choose which type to use by design; if you always use text, you're wasting resources.
Check this article on PostgresOnline; it also links to two other useful articles.
Most problems with TEXT in PostgreSQL occur when you're using tools, applications, and drivers that treat TEXT very differently from VARCHAR, because other databases behave very differently with these two datatypes.
Database designers almost always know how many characters a column needs to hold. US delivery addresses need to hold up to 64 characters. (The US Postal Service publishes addressing guidelines that say so.) US ZIP codes are 5 characters long.
A database designer will look at representative sample data from her clients when she's specifying columns. She'll ask herself questions like "What's the longest product name?" And when the answer is "70 characters", she won't make the column 3000 characters wide.
VARCHAR has a limit of 8k in SQL Server (I think). Most applications don't require nearly that much storage for a single column.