I have tried looking for answers online, but I am lacking the right nomenclature to find any answers matching my question.
The DB I am working with is an inconsistent mess. I am currently trying to import a number of maintenance codes which I have to link to a pre-existing Excel table. For this reason, the maintenance codes I import have to follow one uniform format.
The table is designed to work with 2-3 digit numbers (time lengths), followed by a time unit.
For example, SERV-01W and SERV-03M.
As these used to be added to the DB by hand, a large number of older maintenance codes are actually written with 1 digit numbers.
For example, SERV-1W and SERV-3M.
I would like to replace the old codes with the new codes. In other words, I want to add a leading 0 if only one digit is used in the code.
REPLACE(T.Code,'-[0-9][DWM]','-0[0-9][DWM]') unfortunately does not work, most likely because I am using wildcards in the result string.
What would be a good way of handling this issue?
Thank you in advance.
Assuming I understand your requirement this should get you what you are after:
WITH VTE AS(
    SELECT *
    FROM (VALUES('SERV-03M'),
                ('SERV-01W'),
                ('SERV-1Q'),
                ('SERV-4X')) V(Example))
SELECT Example,
       ISNULL(STUFF(Example, NULLIF(PATINDEX('%-[0-9][A-z]%',Example),0)+1,0,'0'),Example) AS NewExample
FROM VTE;
Instead of trying to replace the pattern, I used PATINDEX to find the pattern and then inject the extra '0' character. If the pattern wasn't found, so 0 was returned by PATINDEX, I forced the expression to return NULL and then wrapped the entire thing with a further ISNULL, so that the original value was returned.
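For anyone who wants to check the expected results outside the database, the same "find the position, then inject a 0" idea can be sketched in Python. This is only an illustration, not part of the T-SQL solution: the helper name pad_code is made up, and [A-Za-z] is my assumption for the letter class (the T-SQL range [A-z] depends on the collation).

```python
import re

# A hyphen that is directly followed by one digit and then a letter.
# The lookahead leaves the digit and letter untouched, much like STUFF
# inserting at a position found by PATINDEX.
# Assumption: [A-Za-z] stands in for the collation-dependent [A-z].
_SINGLE_DIGIT = re.compile(r"-(?=[0-9][A-Za-z])")


def pad_code(code: str) -> str:
    """Return code with a leading 0 added to a one-digit number (hypothetical helper)."""
    # count=1: only the first match is padded, since PATINDEX also
    # reports only the first position. No match returns code unchanged.
    return _SINGLE_DIGIT.sub("-0", code, count=1)


for example in ["SERV-03M", "SERV-01W", "SERV-1Q", "SERV-4X"]:
    print(example, "->", pad_code(example))
```

Codes that already have two digits are left alone because the character after the first digit is not a letter.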
I find a CASE expression to be a simple way to express the logic:
SELECT (CASE WHEN code LIKE '%-[0-9][0-9]%'
THEN code
ELSE REPLACE(code, '-', '-0')
END)
That is, if the code has two digits, then do nothing. Otherwise, add a zero. The code should be quite clear on what it is doing.
This is not generalizable (it doesn't add two zeros for instance), but it does do exactly what you are asking for.
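To make the behaviour of the CASE expression easy to try out, here is a rough Python equivalent. It is a sketch only: the function name pad_code_case is hypothetical, and it assumes each code contains a single hyphen, because (like REPLACE) it puts a 0 after every hyphen it finds.

```python
import re


def pad_code_case(code: str) -> str:
    """Mirror of the CASE expression above (hypothetical helper)."""
    # WHEN code LIKE '%-[0-9][0-9]%' THEN code
    if re.search(r"-[0-9][0-9]", code):
        return code
    # ELSE REPLACE(code, '-', '-0')
    # Note: this touches every hyphen in the string, not just the first.
    return code.replace("-", "-0")


print(pad_code_case("SERV-1W"))   # one digit: gets padded
print(pad_code_case("SERV-03M"))  # two digits: returned as is
```

If a code can contain more than one hyphen, the ELSE branch would need to be narrowed to the hyphen in front of the number.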
I have to write a select statement following the following pattern:
[A-Z][0-9][0-9][0-9][0-9][A-Z][0-9][0-9][0-9][0-9][0-9]
The only thing I'm sure of is that the first A-Z WILL be there. All the rest is optional and the optional part is the problem. I don't really know how I could do that.
Some example data:
B/0765/E 3
B/0765/E3
B/0764/A /02
B/0749/K
B/0768/
B/0784//02
B/0807/
My guess is that I should first remove all the whitespace and the / characters from the data and then execute the select statement. But I'm having some problems writing the LIKE pattern itself. Could anyone help me out?
The underlying reason for this is that I'm migrating a database. In the old database the values are all in one field, but in the new one they are split into several fields. I first have to write a "control script" to find out which records in the old database are not correct.
Even the following isn't working:
where someColumn LIKE '[a-zA-Z]%';
You can use regular expressions via XQuery to define this pattern. There are many questions on Stack Overflow about patterns in DB2, and they have been solved with regular expressions.
DB2: find field value where first character is a lower case letter
Emulate REGEXP like behaviour in SQL
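As a way to prototype the regular expression before moving it into the database, here is a minimal Python sketch of the "strip, then match" idea from the question. Everything in it is an assumption on my part: the names normalize and is_valid are made up, and I read "all the rest is optional" as "up to 4 digits, an optional letter, then up to 5 digits". Adjust the quantifiers if the real rule differs.

```python
import re

# Assumed layout after stripping: one required uppercase letter,
# up to 4 digits, an optional uppercase letter, then up to 5 digits.
_PATTERN = re.compile(r"[A-Z][0-9]{0,4}[A-Z]?[0-9]{0,5}")


def normalize(value: str) -> str:
    """Remove whitespace and slashes, as suggested in the question."""
    return re.sub(r"[\s/]", "", value)


def is_valid(value: str) -> bool:
    """True if the stripped value matches the assumed layout in full."""
    return _PATTERN.fullmatch(normalize(value)) is not None


samples = ["B/0765/E 3", "B/0765/E3", "B/0764/A /02", "B/0749/K",
           "B/0768/", "B/0784//02", "B/0807/"]
for sample in samples:
    print(repr(sample), normalize(sample), is_valid(sample))
```

One thing to keep in mind: stripping the slashes throws away the field boundaries, so a value with the digits in the wrong field can still pass. If the boundaries matter for the migration, match against the original value with the slashes left in the pattern.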
I have a piece of code that assigns attributes to an NSAttributedString depending on whether certain keywords are present in the string or not. In other words, syntax highlighting.
To find if a certain string has those keywords I am currently using regular expressions to find the location of those words with "\\bKEYWORD\\b". The problem is, obviously, performance.
I first tried NSRegularExpression, but performance was so slow that scrolling my text view was nearly impossible. I then tried Oniguruma and things improved, but it's still noticeably slow. I may try PCRE, but I don't think I'll gain much.
So, my question is: how can I speed up regular expression searches? Maybe caching the compiled expression?
It sounds like you're searching for each word individually. I would create an array of search words, then join them together with the regex alternation symbol |.
Given search words like: alpha, bravo, charlie, delta, echo
Resulting compiled regex: \b(?:alpha|bravo|charlie|delta|echo)\b
The non-capturing group construct (?:...) is a bit faster than the capturing syntax (...).
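The alternation idea is not specific to any one regex engine, so here is a small sketch of it in Python rather than Objective-C. The function name keyword_ranges is made up for the example; the point is that the pattern is built and compiled once, and each piece of text is then scanned in a single pass.

```python
import re

keywords = ["alpha", "bravo", "charlie", "delta", "echo"]

# Build the alternation once and keep the compiled pattern around.
# re.escape guards against keywords containing regex metacharacters.
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in keywords) + r")\b"
)


def keyword_ranges(text):
    """Return (start, end, keyword) for every whole-word keyword hit."""
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(text)]


print(pattern.pattern)
print(keyword_ranges("alpha beta bravo alphabet"))
```

The ranges returned by a single scan can then be used to apply the attributes, instead of running one search per keyword.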
I'm writing my own syntax and want characters that do not have obvious common meanings in that syntax [1]. Is there a list of the common meanings of punctuation characters (e.g. '?' could be part of a ternary operator, or part of a regex) so I can try to pick those which may not have 'obvious' syntax (I can be the judge of that :-).
[1] It's actually an extended Fortran FORMAT, but the details are irrelevant here
Here is an exhaustive survey of syntax across languages.
I am loath to be so defeatist, but it does sound as though this doesn't exist (a list of all the symbols and operators across languages). A quick look around would give a good idea of what is commonplace.
Assuming that you will restrict yourself to ASCII, the shortlist is more or less what you can see on your keyboard, and I can think of a few uses for most of them. So maybe avoiding conflicts is a bit ambitious. Of course, it depends on who the user of this syntax will be; if, for example, symbols that are relatively unused in Fortran would be suitable, then that is more realistic.
This link: Fortran 95 Spec gives a list of Fortran operators, which might help if you want to avoid them.
I'm sorry if any of this is a statement of the obvious or missing the point, or just not very helpful :)
I would say the letters [a-z] and [A-Z] do not have an obvious syntactic meaning. For instance, you could use an uppercase T as an operator:
x T v
The downside is that people like to use letters for variables.
Other than that, you might want to investigate multicharacter operators. The downside of these, however, is that they quickly become wearisome to type. For example:
scalar = vec4i *+ vec4j
if you had, say, a fused multiply-add operator. Well, that one isn't so bad, but I'm sure you can find more cumbersome ones.
Possible Duplicate:
Is there a good reason to use upper case for T-SQL keywords?
I personally find a string of lowercase characters to be more readable than a string of uppercase characters. Is some old/popular flavor of SQL case-sensitive or something?
For reference:
select
this.Column1,
case when this.Column2 is null then 0 else this.Column2 end
from dbo.SomeTable this
inner join dbo.AnotherTable another on this.id = another.id
where
this.Price > 100
vs.
SELECT
this.Column1,
CASE WHEN this.Column2 IS NULL THEN 0 ELSE this.Column2 END
FROM dbo.SomeTable this
INNER JOIN dbo.AnotherTable another ON this.id = another.id
WHERE
this.Price > 100
The former just seems so much more readable to me, but I see the latter way more often.
I agree with you - to me, uppercase is just SHOUTING.
I let my IDE handle making keywords stand out, via syntax highlighting.
I don't know of a historical reason for it, but by now it's just a subjective preference.
Edit to further make clear my reasoning:
Would you uppercase your keywords in any other modern language? Made up example:
USING (EditForm form = NEW EditForm()) {
IF (form.ShowDialog() == DialogResult.OK) {
IF ( form.EditedThing == null ) {
THROW NEW Exception("No thing!");
}
RETURN form.EditedThing;
} ELSE {
RETURN null;
}
}
Ugh!
Anyway, it's pretty clear from the votes which style is more popular, but I think we all agree that it's just a personal preference.
I think the latter is more readable. You can easily separate the keywords from table and column names, etc.
One thing I'll add to this which I haven't seen anyone bring up yet:
If you're using ad hoc SQL from within a programming language you'll have a lot of SQL inside strings. For example:
insertStatement = "INSERT INTO Customers (FirstName, LastName) VALUES ('Jane','Smith')"
In this case syntax coloring probably won't work so the uppercasing could be helping readability.
From Joe Celko's "SQL Programming Style" (ISBN 978-0120887972):
Rule:
Uppercase the Reserved Words.
Rationale:
Uppercase words are seen as a unit,
rather than being read as a series of
syllables or letters. The eye is drawn
to them, and they act to announce a
statement or clause. That is why
headlines and warning signs work.
Typographers use the term bouma for
the shape of a word. The term appears
in Paul Saenger's book (1975). Imagine
each letter on a rectangular card that
just fits it, so that you see the
ascenders, descenders, and baseline
letters as various "Lego blocks" that
are snapped together to make a word.
The bouma of an uppercase word is
always a simple, dense rectangle, and
it is easy to pick out of a field of
lowercase words.
What I find compelling is that this is the only book about SQL heuristics, written by a well-known author of SQL works. So is this the absolute truth? Who knows. It sounds reasonable enough and I can at least point out the rule to a team member and tell them to follow it (and if they want to blame anyone I give them Celko's email address :)
Code has punctuation which SQL statements lack. There are dots and parentheses and semicolons to help you keep things separate. Code also has lines. Despite the fact that you can write a SQL statement on multiple physical lines, it is a single statement, a single "line of code."
IF I were to write English text without any of the normal punctuation IT might be easier if I uppercased the start of new clauses THAT way it'd be easier to tell where one ended and the next began OTHERWISE a block of text this long would probably be very difficult to read NOT that I'd suggest it's easy to read now BUT at least you can follow it I think
Mostly it's tradition. We like to keep keywords and our namespace names separate for readability, and since in many DBMSes table and column names are case sensitive, we can't upper case them, so we upper case the keywords.
I prefer lower case keywords. SQL Server Management Studio color codes the keywords, so there is no problem distinguishing them from the identifiers.
And upper case keywords feel so... well... BASIC... ;)
-"BASIC, COBOL and FORTRAN called from the eighties, and they wanted their UPPERCASE KEYWORDS back." ;)
I like to use upper case for SQL keywords. I think my mind skips over them as they are really blocky, and concentrates on what's important. The blocky words split up the important bits when you lay the query out like this:
SELECT
s.name,
m.eyes,
m.foo
FROM
muppets m,
muppet_shows ms,
shows s
WHERE
m.name = 'Gonzo' AND
m.muppetId = ms.muppetId AND
ms.showId = s.showId
(The lack of ANSI joins is an issue for another question.)
There is a psychology study that shows lowercase is quicker to read than uppercase because the outlines of the words are more distinctive. However, this effect can disappear with lots of practice reading uppercase.
What's worse is that the majority of developers at my office believe in capitals for SQL keywords, so I have had to change to uppercase. Majority rules.
I believe lowercase is easier to read, especially given that SQL keywords are highlighted in blue anyway.
In the glory days, keywords were in capitals because we were developing on green screens!
The question is: if we don't write C# keywords in uppercase then why do I have to write SQL keywords in uppercase?
Like someone else has said - capitals are SHOUTING!
Back in the 1980s, I used to capitalize database names, and leave SQL keywords in lower case. Most writers did the opposite, capitalizing the SQL keywords. Eventually, I started going along with the crowd.
Just in passing, I'll mention that in most published code snippets in C, C++, or Java the language keywords are always in lower case, and upper case keywords may not even be recognized as such by some parsers. I don't see a good reason for using a convention in SQL that is the opposite of the one you use in the programming language, even when the SQL is embedded in source code.
And I'm not defending the use of all caps for database names. It actually looks a little like "shouting". And there are better conventions, like using a few upper case letters in database names. (By "database names" I mean the names of schemas, schema objects like tables, and maybe a few other things.) Just because I did it in the 80s doesn't mean I have to defend it today.
Finally, "De gustibus non disputandum est".
It's just a matter of readability and helps you quickly distinguish SQL keywords.
Btw, that question was already answered:
Is SQL syntax case sensitive?
I prefer using upper case as well for keywords in SQL.
Yes, lower case is more readable, but to me, having to take an extra second to scan through the query will do you good most of the time. Once it's done and tested you should rarely ever see it again anyway (the DAL, a stored procedure or whatever will hide it from you).
If you are reading it for the first time, capitalized WHERE and JOIN will jump right out at you, as they should.
It’s just a question of readability. Using UPPERCASE for the SQL keywords helps make the script more understandable.
I capitalize SQL to make it more "contrasty" to the host language (mostly C# these days).
It's just a matter of preference and/or tradition really...
Apropos of nothing perhaps, but I prefer typesetting SQL keywords in small caps. That way they look capitalized to most readers, but they aren't the same as the ugly ALL CAPS style.
A further advantage is that I can leave the code as is and print it in the traditional style. (I use the listings package in LaTeX for pretty-printing code.)
Some SQL developers here like to lay it out like this:
SELECT s.name, m.eyes, m.foo
FROM muppets m, muppet_shows ms, shows s
WHERE m.name = 'Gonzo' AND m.muppetId = ms.muppetId AND ms.showId = s.showId
They claim this is easier to read, unlike your one-field-per-line approach, which I use myself.