How do you add attribute validation to an LDAP schema?


e.g.
attributetype ( 2.16.840.1.113730.3.1.39
NAME 'preferredLanguage'
DESC 'RFC2798: preferred written or spoken language for a person'
EQUALITY caseIgnoreMatch
SUBSTR caseIgnoreSubstringsMatch
SYNTAX 1.3.6.1.4.1.1466.115.121.1.15
SINGLE-VALUE )
I've read that I could add {4096} onto the end of the syntax to set a recommended length, but that some LDAP servers ignore it, none treat it as validation, and it is not meant to be used as a maximum. OpenLDAP is the implementation I'm tied to.
Is that correct? Is there a better way to add simple validation? Max and min length and not null ought to cover my use cases. Thanks in advance.

You should consult the LDAP standards documentation. RFC 4512 is quite clear on this question:
for instance, "1.3.6.4.1.1466.0{64}" suggests that server implementations should allow a string to be 64 characters long, although they may allow longer strings.
The key words are "suggests" and "may": the number in braces is a hint about the minimum upper bound a server should support, not a limit it enforces.
As for not-null, the same admonition applies: consult the standards documentation to locate a directory schema syntax that does not allow for null octet strings.
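Since the length bound in the schema is only a suggestion, one option is to check values in the client before they are written to the directory. The following is a minimal sketch of that idea, not an OpenLDAP feature; the function name `validate_value` and its default limits are made up for illustration.

```python
from typing import List, Optional


def validate_value(value: Optional[str], min_len: int = 1, max_len: int = 4096) -> List[str]:
    """Return a list of problems with `value`; an empty list means it passed.

    Hypothetical client-side check: the limits are chosen by the application,
    they are not read from the LDAP schema.
    """
    if value is None:
        return ["value is required"]
    errors = []
    if len(value) < min_len:
        errors.append(f"value is shorter than {min_len} characters")
    if len(value) > max_len:
        errors.append(f"value is longer than {max_len} characters")
    return errors


if __name__ == "__main__":
    print(validate_value("en-GB", min_len=2, max_len=64))
    print(validate_value(None))
```

The check would run in whatever code builds the add or modify request, so anything that bypasses that code still reaches the server unchecked.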

Related

Why is the use of 'from' behind the table name allowed? What does it do?

SQL for DB2 is pretty strict, that's why I was surprised this query succeeded:
select 1 from sysibm.sysdummy1 from
Is it exactly the same as?
select 1 from sysibm.sysdummy1
If the double from is allowed, why isn't a double where/select/order by/having allowed? Is there any difference in the output when running this query on a 'real' table?

Db2 (for Linux, Unix, Windows) provides a list of reserved schemas and words. As stated in the docs, the list is not enforced by Db2, but the recommendation is to not use them for portability reasons.
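A trailing from succeeds where a trailing where does not because of what the grammar allows at that position. After the table name, Db2 accepts an optional correlation name (alias), so the second from is presumably taken as an alias for sysibm.sysdummy1 and the query returns the same result. An optional WHERE clause can also follow at that point, so a trailing where is parsed as the start of an incomplete WHERE clause, which violates the grammar rules. Thus, the recommendation is to respect the list of reserved words and not use them. You may (freedom of expression... ;-) ), but you should be considerate...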

OpenLDAP telephoneNumber schema

I am trying to create a phonebook with OpenLDAP 2.4.31 using the standard schemas.
Inserting a number containing a hash (#) or asterisk (*) fails with a syntax error.
RFCs tell me that a number is the following: Printable string (alphabetic, digits, ', (, ), +, ,, -, ., /, :, ?, and space) and "
How can I edit the schema to support # and * characters?

We are having the exact same issue! Mobile networks offer a variety of services and information accessed using numbers that include either pound (hash) or star. It's a perfectly valid question for a perfectly normal use of a phone number field.
Having had a very casual look at RFC 4517, I see that it's really true! The LDAP RFC offers only a very limited selection of basic syntax types, and telephoneNumber maps to PrintableString. Probably a case of the RFC working group being more interested in the RFC for its own sake than in its practical application. I mean, which would be more useful in a phone number field: '?' or '#'?
As already alluded to, hacking cosine.schema can lead to even larger problems and is not upgrade-safe. FYI, there are a few other LDAP servers out there, and many are a bit more flexible about how they implement the RFC. Have a look at OpenDJ:
https://forgerock.org/opendj/
Any server-side 'fix' in this case will likely no longer be strictly RFC compliant, which runs the risk of your original syntax issues revisiting you, if you ever need to exchange LDIF with other LDAP systems. But IMHO changing the client mapping to another unrelated field type could hardly be called 'better', especially from an onlooker's perspective. So either get another LDAP server which is more forgiving or change the field mapping on the client - either way presents risks and should be understood as a limitation of RFC 4517.

You would have to change the OID in the telephoneNumber schema entry to refer to a more general attribute syntax OID as per the RFCs. Not a good idea. You would be better off using a different attribute.
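If you end up validating or cleaning numbers on the client side, a small check against the character list quoted in the question shows which characters would trip the syntax error. This is only a sketch: the allowed set below is copied from the question's list, not taken from OpenLDAP's own validator, and `rejected_characters` is a made-up helper name.

```python
import string

# Character list as quoted in the question (an assumption, not verified
# against the server): letters, digits, ' ( ) + , - . / : ? space and "
ALLOWED = set(string.ascii_letters + string.digits + "'()+,-./:? \"")


def rejected_characters(number: str) -> list:
    """Return the characters outside the quoted list, in order of first appearance."""
    seen = []
    for ch in number:
        if ch not in ALLOWED and ch not in seen:
            seen.append(ch)
    return seen


if __name__ == "__main__":
    for candidate in ["+1 555 0100", "*100#"]:
        print(candidate, rejected_characters(candidate))
```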

What does the SQL Standard say about usage of backtick(`)?

I once spent hours debugging a simple SQL query using mysql_query() in PHP/MySQL, only to realise that I had missed the backticks around the table name. Since then I have always used them around table names.
But when I used the same in SQLite/C++, the symbol is not even recognized. It's confusing, whether to use this or not? What does standard say about usage of it?
Also, it would be helpful if anyone could tell me when to use quotes and when not. I mean around values and field names.

The SQL standard (current version is ISO/IEC 9075:2011, in multiple parts) says nothing about the 'back-tick' or 'back-quote' symbol (Unicode U+0060 or GRAVE ACCENT); it doesn't recognize it as a character with special meaning that can appear in SQL.
The Standard SQL mechanism for quoting identifiers is with delimited identifiers enclosed in double quotes:
SELECT "select" FROM "from" WHERE "where" = "group by";
In MySQL, that might be written:
SELECT `select` FROM `from` WHERE `where` = `group by`;
In MS SQL Server, that might be written:
SELECT [select] FROM [from] WHERE [where] = [group by];
The trouble with the SQL Standard notation is that C programmers are used to enclosing strings in double quotes, so most DBMS use double quotes as an alternative to the single quotes recognized by the standard. But that then leaves you with a problem when you want to enclose identifiers.
Microsoft took one approach; MySQL took another; Informix allows interchangeable use of single and double quotes, but if you want delimited identifiers, you set an environment variable and then you have to follow the standard (single quotes for strings, double quotes for identifiers); DB2 only follows the standard, AFAIK; SQLite appears to follow the standard; Oracle also appears to follow the standard; Sybase appears to allow either double quotes (standard) or square brackets (as with MS SQL Server, which means SQL Server might allow double quotes too). This page (link AWOL since 2013, now available in The Wayback Machine) documented all these servers (and was helpful filling out the gaps in my knowledge) and notes whether the strings inside delimited identifiers are case-sensitive or not.
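As a quick check of the standard notation, here is a small sketch that runs the double-quoted form against SQLite through Python's built-in sqlite3 module. The table and column names are just the keywords from the example above, and the string value is written with single quotes as the standard expects.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Keywords used as identifiers, delimited with double quotes.
conn.execute('CREATE TABLE "from" ("select" INTEGER, "where" TEXT)')
conn.execute(
    'INSERT INTO "from" ("select", "where") VALUES (?, ?)',
    (1, "group by"),
)

# Double quotes for identifiers, single quotes for the string literal.
rows = conn.execute(
    'SELECT "select" FROM "from" WHERE "where" = \'group by\''
).fetchall()
conn.close()

if __name__ == "__main__":
    print(rows)
```

This only shows that SQLite accepts the notation; it says nothing about how other servers behave.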
As to when to use a quoting mechanism around identifiers, my attitude is 'never'. Well, not quite never, but only when absolutely forced into doing so.
Note that delimited identifiers are case-sensitive; that is, "from" and "FROM" refer to different columns (in most DBMS — see URL above). Most of SQL is not case-sensitive; it is a nuisance to know which case to use. (The SQL Standard has a mainframe orientation — it expects names to be converted to upper-case; most DBMS convert names to lower-case, though.)
In general, you must delimit identifiers which are keywords to the version of SQL you are using. That means most of the keywords in Standard SQL, plus any extras that are part of the particular implementation(s) that you are using.
One continuing source of trouble is when you upgrade the server, where a column name that was not a keyword in release N becomes a keyword in release N+1. Existing SQL that worked before the upgrade stops working afterwards. Then, at least as a short-term measure, you may be forced into quoting the name. But in the ordinary course of events, you should aim to avoid needing to quote identifiers.
Of course, my attitude is coloured by the fact that Informix (which is what I work with mostly) accepts this SQL verbatim, whereas most DBMS would choke on it:
CREATE TABLE TABLE
(
DATE INTEGER NOT NULL,
NULL FLOAT NOT NULL,
FLOAT INTEGER NOT NULL,
NOT DATE NOT NULL,
INTEGER FLOAT NOT NULL
);
Of course, the person who produces such a ridiculous table for anything other than demonstration purposes should be hung, drawn, quartered and then the residue should be made to fix the mess they've created. But, within some limits which customers routinely manage to hit, keywords can be used as identifiers in many contexts. That is, of itself, a useful form of future-proofing. If a word becomes a keyword, there's a moderate chance that the existing code will continue to work unaffected by the change. However, the mechanism is not perfect; you can't create a table with a column called PRIMARY, but you can alter a table to add such a column. There is a reason for the idiosyncrasy, but it is hard to explain.

Trailing underscore
You said:
it would be helpful if anyone could tell me when to use quotes and when not
Years ago I surveyed several relational database products looking for commands, keywords, and reserved words. Shockingly, I found over a thousand distinct words.
Many of them were surprisingly counter-intuitive as a "database word". So I feared there was no simple way to avoid unintentional collisions with reserved words while naming my tables, columns, and such.
Then I found this tip somewhere on the internets:
Use a trailing underscore in all your SQL naming.
Turns out the SQL specification makes an explicit promise to never use a trailing underscore in any SQL-related names.
Because the SQL spec is copyright-protected, I cannot quote it directly. But section 5.2.11 <token> and <separator> from a supposed-draft of ISO/IEC 9075:1992, Database Language SQL (SQL-92) says (in my own re-wording):
In the current and future versions of the SQL spec, no keyword will end with an underscore
➥ Though oddly dropped into the SQL spec without discussion, that simple statement to me screams out “Name your stuff with a trailing underscore to avoid all naming collisions”.
Instead of:
person
name
address
…use:
person_
name_
address_
Since adopting this practice, I have found a nice side-effect. In our apps we generally have classes and variables with the same names as the database objects (tables, columns, etc.). So an inherent ambiguity arises as to when referring to the database object versus when referring to the app state (classes, vars). Now the context is clear: When seeing a trailing underscore on a name, the database is specifically indicated. No underscore means the app programming (Java, etc.).
Further tip on SQL naming: For maximum portability, use all-lowercase with underscore between words, as well as the trailing underscore. While the SQL spec requires (not suggests) an implementation to store identifiers in all uppercase while accepting other casing, most/all products ignore this requirement. So after much reading and experimenting, I learned the all-lowercase with underscores will be most portable.
If using all-lowercase, underscores between words, plus a trailing underscore, you may never need to care about enquoting with single-quotes, double-quotes, back-ticks, or brackets.
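If your app-side names are camelCase, the convention is easy to apply mechanically. Here is a rough sketch; `to_sql_name` is a name I made up, and it only handles simple ASCII camelCase, nothing clever.

```python
import re


def to_sql_name(app_name: str) -> str:
    """Convert an app-side name to all-lowercase with underscores plus a trailing underscore."""
    # Put an underscore at each lower-to-upper boundary: personName -> person_Name
    with_breaks = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", app_name)
    # Lowercase, drop any underscore already at the end, then add exactly one.
    return with_breaks.lower().rstrip("_") + "_"


if __name__ == "__main__":
    for name in ["person", "personName", "address_"]:
        print(name, "->", to_sql_name(name))
```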

T-SQL language specification and lexing rules

I'm thinking about writing a templating tool for generating T-SQL code, which will include delimited sections like the ones below:
SELECT
~~idcolumn~~
FROM
~~table~~
WHERE
~~table~~.flag = 1
Notice the double-tildes delimiting bits? This is an idea for an escape sequence in my templating language. But I want to be certain that the escape sequence is valid -- that it will never occur in a valid T-SQL statement. Problem is, I can't find any official Microsoft description of the T-SQL language.
Does anyone know of an official specification for the T-SQL language, or at least the lexing rules? So I can make an informed decision about the escape sequence.
UPDATES:
Thanks for the suggestions so far, but I'm not looking for confirmation of the '~~' escape sequence per se. What I need is a document I can point to and say 'Microsoft says this character sequence is totally impossible in T-SQL.' For instance, Microsoft publishes the language specification for C# here, which includes a description of what characters can go into valid C# programs (see page 67 of the pdf). I'm looking for a similar reference.
The double-tilde: "~~" is actually perfectly good T-SQL. For instance; "(SELECT ~~1)" returns '1'.

There are several well known and often used formats for template parameters, one example being $(paramname) (also used in other scripts as well as T-SQL scripts)
Why not use an existing format?

It doesn't matter if ~~ is legal TSQL or not, if you provide an escape for producing ~~ in actual TSQL when you need it.
Since template parameters have to have a nonzero-length identifier, you have a peculiar case where the identifier length is ridiculously "zero", e.g., ~~~~. This kind of thing makes an ideal escape sequence, since it is useless for anything else. Simply process your template text; whenever you find ~~name~~ replace it by the named parameter string, and whenever you find ~~~~ replace it by ~~. Now, if ~~ is needed in the final TSQL, just write ~~~~ in your template.
I suspect that even if you do this, that the number of times you'll actually write ~~~~ in practice will be close to zero, so the reason for doing it is theoretical completeness and giving you a warm fuzzy feeling that you can write anything in a template.
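The scheme is short enough to sketch: a placeholder is ~~ followed by a name and ~~, and a placeholder with an empty name stands for a literal ~~. The following is one possible way to write it, assuming parameter names are plain word characters; `render` and the sample parameter values are made up for illustration.

```python
import re
from typing import Mapping

# ~~name~~ is a parameter; ~~~~ (empty name) is the escape for a literal ~~.
PLACEHOLDER = re.compile(r"~~(\w*)~~")


def render(template: str, params: Mapping[str, str]) -> str:
    """Substitute named parameters and unescape ~~~~ in a single pass."""

    def substitute(match):
        name = match.group(1)
        if name == "":
            return "~~"  # escape sequence
        return params[name]  # raises KeyError for an unknown parameter

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    template = "SELECT ~~idcolumn~~ FROM ~~table~~ WHERE ~~table~~.flag = 1"
    print(render(template, {"idcolumn": "id", "table": "orders"}))
    print(render("SELECT ~~~~1", {}))
```

Doing both replacements in one pass matters: a literal ~~ produced by the escape is never rescanned as the start of a placeholder.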

Well, I'm not sure about a complete description of the language, but it appears that ~~ could occur in an identifier provided that it is quoted (in brackets, typically).
You may have more luck with a convention saying you don't support identifiers with ~~ in them. Or, just reserve your own lexical symbols and don't worry about ~~ occurring elsewhere.

You could treat quoted literals and strings as content, regardless if they contain your escape-sequence. It would make it more robust.
Run the text through a lexer to separate each token. If the token is a string or a quoted literal, treat it as such. But if it is a literal that begins and ends with ~~, you can safely assume it is a template placeholder.
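A full T-SQL lexer is more than a few lines, but the idea can be sketched with a single regular expression that recognises string literals and bracketed identifiers first and only then looks for placeholders. This is a simplified illustration, not a real tokenizer: it ignores comments and double-quoted identifiers, and `render_outside_literals` is a made-up name.

```python
import re
from typing import Mapping

TOKEN = re.compile(
    r"""
    '(?:[^']|'')*'            # string literal; '' is an escaped quote
  | \[(?:[^\]]|\]\])*\]       # bracketed identifier; ]] is an escaped bracket
  | ~~(?P<name>\w+)~~         # template placeholder
    """,
    re.VERBOSE,
)


def render_outside_literals(template: str, params: Mapping[str, str]) -> str:
    """Replace ~~name~~ only where it is not inside a string or bracketed identifier."""

    def substitute(match):
        name = match.group("name")
        if name is None:
            return match.group(0)  # literal or identifier: copy through unchanged
        return params[name]

    return TOKEN.sub(substitute, template)


if __name__ == "__main__":
    sql = "SELECT ~~col~~ FROM [~~t~~] WHERE x = '~~lit~~'"
    print(render_outside_literals(sql, {"col": "id"}))
```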

I'm not sure you'll find something that will never occur in a valid statement. Consider:
DECLARE @TemplateBreakingString varchar(100) = '~~I hope this works~~'
or
CREATE TABLE [~~TemplateBreakingTable~~] (IDField INT Identity)

Your escape sequence can occur in string literals, but that is all. That said, Microsoft owns t-sql, and they are free to do anything they want with it moving forward for future versions of sql server. Still, I think ~~ is safe enough.

What's the significance of # in SQL?

I have the following code to create an object type in Oracle (PL??)
CREATE OR REPLACE TYPE STAFF_T as OBJECT(Staff_ID# NUMBER, Person PERSON_T); \
I'd like to know what is the significance of the # appended to the Staff_ID variable in the declaration?

No special meaning.
Oracle allows using $, _ and # in identifiers, just like any other alphanumeric characters, but the identifier should begin with an alpha character (a letter).
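That rule is simple enough to write down as a pattern. The sketch below covers only what is stated above (a leading letter, then letters, digits, _, $ or #) and assumes ASCII names; it does not check length limits or reserved words, and `is_plain_identifier` is a made-up helper.

```python
import re

# A letter first, then any mix of letters, digits, _, $ and #.
NONQUOTED_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_$#]*")


def is_plain_identifier(name: str) -> bool:
    """True if `name` fits the nonquoted-identifier shape described above."""
    return NONQUOTED_IDENTIFIER.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ["Staff_ID#", "#Staff_ID", "PERSON_T"]:
        print(name, is_plain_identifier(name))
```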

That's part of the column name Staff_ID#. The pound sign is an allowable part of an identifier (table/column name) in PL/SQL. See here

Whoever wrote the code probably didn't mean anything special by #.
But # apparently means something to Oracle, although I don't know what. From the SQL Language Reference:
Oracle strongly discourages you from using $ and # in nonquoted identifiers.
Here are some guesses for what the warning is about:
it's related to a really old bug (the warning goes back to at least Oracle 7)
Oracle plans to do something with it in a future version
that character isn't available on all keyboards, character sets, or platforms that Oracle supports
The data dictionary uses the number sign a lot, and as far as I can tell it works just fine for user objects. But just to be safe you might want to remove it.
