How to use regular expression in T-SQL? - sql

For example:
where strword match with {%{J{GC * GC}J} or strword={%{J{GC * GC}J}

I typically try to avoid posting answers that are simply links to somewhere else. But in this case it's a fairly big answer as you have to involve CLR in your approach.
So - in this case I'm going to make an exception and just give you a link. I feel better about it since it's an official Microsoft doc and they are pretty good about not moving things around.
Here's the walk through from Microsoft on using RegEx with SQLServer.
It has good sample code and is extensive in it's coverage. If your OK with adding CLR to your solution then it should give you exactly what you need.
Update: Turns out Microsoft did in fact change that link. Another walk through by a respected company (Red Gate) can be found here: https://www.red-gate.com/simple-talk/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/

Related

Rewrite this exceedingly long query

I just stumbled across this gem in our code:
my $str_rep="lower(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(field,'-',''),'',''),'.',''),'_',''),'+',''),',',''),':',''),';',''),'/',''),'|',''),'\',''),'*',''),'~','')) like lower('%var%')";
I'm not really an expert in DB, but I have a hunch it can be rewritten in a more sane manner. Can it?
It depends on the DBMS you are using. I'll post some examples (feel free to edit this answer to add more).
MySQL
There is really not much to do; the only way to replace all the characters is nesting REPLACE functions as it has already been done in your code.
Oracle DB
Your clause can be rewritten by using the TRANSLATE function.
SQL Server
Like in MySQL there aren't any functions similar to Oracle's TRANSLATE. I have found some (much longer) alternatives in the answers to this question. In general, however, queries become very long. I don't see any real advantages of doing so, besides having a more structured query that can be easily extended.
Firebird
As suggested by Mark Rotteveel, you can use SIMILAR TO to rewrite the entire clause.
If you are allowed to build your query string via Perl you can also use a for loop against an array containing all the special characters.
EDIT: Sorry I did not see you indicated the DB in the tags. Consider only the last part of my answer.
Your flagged this as Perl, but it's probably not?
Here is a Perl solution anyway:
$var =~ s/[\-\.\_\+\,\:\;\/\|\\\*\~]+//g;
Sorry I don't know the languages concerned, but a couple of things come to mind.
Firstly you could look for a replace text function that does more that just a single character. Many languages have them. Some also do regular expression based find and replace.
Secondly the code looks like it is attempting to strip a specific list of characters. This list may not include all that is necessary which means a relatively high (pain in the butt) maintenance problem. A simpler solution might be to invert the problem and ask what characters do you want to keep? Inverting like this sometimes yields a simpler solution.

Installation of full-text research on SQL Server 2012

I may need some assistance for my installation of FTS on my computer.
I have the requirement of practicing some stuff concerning FTS.
And at the beginning, I used SELECT SERVERPROPERTY('IsFullTextInstalled'); to check if FTS was installed on my desktop and the result was 0.
Then I started to find the solution by asking Professor Goo (Google), yet I still haven't found a solution that can resolve my problem after seeing some articles for approximately one hour.
And the followings are some information
Any suggestion, please.
I've found the answer and I'll answer to myself though it may be kind of idiotic for asking this. Well, what I had in my hand is 'ENU\x64\SQLEXPRWT_x64_ENU.exe' and, a person should download Express with Advanced Services(ENU\x64\SQLEXPRADV_x64_ENU.exe) and thne add the feature, like FTS, to the existing instance or the new one. In the end, the person can start to explore the functions provided by FTS.

Math needed for Sql Server

I've been working in Sql server jobs since 2 years now. Although I like it, sometimes I get the feeling that at certain times, I stall too much on some tasks, and I seem to be discouraged easily from things that involve relatively simple logic. It's like, at some point I must repeat a logical condition inside my head more than 2 or 3 times in order to understand it completely.
I have the feeling that this might be of my lack of math knowledge. Can anyone please let me know what area of mathematics I can study, that would improve my Sql server coding skills?
Thank you.
The field of maths most likely to be useful to you is Boolean logic
Set Theory is good for second place however it will often go into more detail that you are likely to need/use in understanding most sql queries.
A quick cheat that you may find useful is if you feed a boolean expression into wolfram alpha it will spit out a truth table for you which some find a much easier way of visualising the expression.
http://www.wolframalpha.com/input/?i=a+or+not+b
I recommend you study symbolic logic.
I'd suggest reading up on Set based Math.
See this link: http://weblogs.sqlteam.com/jeffs/archive/2007/04/30/thinking-set-based-or-not.aspx
Set theory helped me somewhat. Studied it in college years before I got into SQL, but being able to think of a bunch of numbers as a semi-amorphous blob of data and not as an ordered list of items really helps.
Get a copy of this book. It should prove to be most useful: The Art of SQL, by Stephane Faroult.

Code related web searches

Is there a way to search the web which does NOT remove punctuation? For example, I want to search for window.window->window (Yes, I actually do, this is a structure in mozilla plugins). I figure that this HAS to be a fairly rare string.
Unfortunately, Google, Bing, AltaVista, Yahoo, and Excite all strip the punctuation and just show anything with the word "window" in it. And according to Google, on their site, at least, there is NO WAY AROUND IT.
In general, searching for chunks of code must be hard for this reason... anyone have any hints?
google codesearch ("window.window->window" but it doesn't seem to get any relevant result out of this request)
There is similar tools all over the internet like codase or koders but I'm not sure they let you search exactly this string. Anyway they might be useful to you so I think they're worth mentioning.
edit: It is very unlikely you'll find a general purpose search engine which will allow you to search for something like "window.window->window" because most search engines will do some processing on the document before storing it. For instance they might represent it internally as vectors of words (a vector space model) and use that to do the search, not the actual original string. And creating such a vector involves first cutting the document according to punctuation and other critters. This is a very complex and interesting subject which I can't tell you much more about. My bad memory did a pretty good job since I studied it at school!
BTW they might do the same kind of processing on your query too. You might want to read about tf-idf which is probably light years from what google and his friends are doing but can give you a hint about what happens to your query.
There is no way to do that, by itself in the main Google engine, as you discovered -- however, if you are looking for information about Mozilla then the best bet would be to structure your query something more like this:
"window.window->window" +Mozilla
OR +XUL
+ Another search string related to what you are
trying to do.
SymbolHound is a web search that does not remove punctuation from the queries. There is an option to search source code repositories (like the now-discontinued Google Code Search), but it also has the option to search the Internet for special characters. (primarily programming-related sites such as StackOverflow).
try it here: http://www.symbolhound.com
-Tom (co-founder)

Good Use Cases of Comments

I've always hated comments that fill half the screen with asterisks just to tell you that the function returns a string, I never read those comments.
However, I do read comments that describe why something is done and how it's done (usually the single line comments in the code); those come in really handy when trying to understand someone else's code.
But when it comes to writing comments, I don't write that, rather, I use comments only when writing algorithms in programming contests, I'd think of how the algorithm will do what it does then I'd write each one in a comment, then write the code that corresponds to that comment.
An example would be:
//loop though all the names from n to j - 1
Other than that I can't imagine why anyone would waste valuable time writing comments when he could be writing code.
Am I right or wrong? Am I missing something? What other good use cases of comments am I not aware of?
Comments should express why you are doing something not what you are doing
It's an old adage, but a good metric to use is:
Comment why you're doing something, not how you're doing it.
Saying "loop through all the names from n to j-1" should be immediately clear to even a novice programmer from the code alone. Giving the reason why you're doing that can help with readability.
If you use something like Doxygen, you can fully document your return types, arguments, etc. and generate a nice "source code manual." I often do this for clients so that the team that inherits my code isn't entirely lost (or forced to review every header).
Documentation blocks are often overdone, especially is strongly typed languages. It makes a lot more sense to be verbose with something like Python or PHP than C++ or Java. That said, it's still nice to do for methods & members that aren't self explanatory (not named update, for instance).
I've been saved many hours of thinking, simply by commenting what I'd want to tell myself if I were reading my code for the first time. More narrative and less observation. Comments should not only help others, but yourself as well... especially if you haven't touched it in five years. I have some ten year old Perl that I wrote and I still don't know what it does anymore.
Something very dirty, that I've done in PHP & Python, is use reflection to retrieve comment blocks and label elements in the user interface. It's a use case, albeit nasty.
If using a bug tracker, I'll also drop the bug ID near my changes, so that I have a reference back to the tracker. This is in addition to a brief description of the change (inline change logs).
I also violate the "only comment why not what" rule when I'm doing something that my colleagues rarely see... or when subtlety is important. For instance:
for (int i = 50; i--; ) cout << i; // looping from 49..0 in reverse
for (int i = 50; --i; ) cout << i; // looping from 49..1 in reverse
I use comments in the following situations:
High-level API documentation comments, i.e. what is this class or function for?
Commenting the "why".
A short, high-level summary of what a much longer block of code does. The key word here is summary. If someone wants more detail, the code should be clear enough that they can get it from the code. The point here is to make it easy for someone browsing the code to figure out where some piece of logic is without having to wade through the details of how it's performed. Ideally these cases should be factored out into separate functions instead, but sometimes it's just not do-able because the function would have 15 parameters and/or not be nameable.
Pointing out subtleties that are visible from reading the code if you're really paying attention, but don't stand out as much as they should given their importance.
When I have a good reason why I need to do something in a hackish way (performance, etc.) and can't write the code more clearly instead of using a comment.
Comment everything that you think is not straightforward and you won't be able to understand the next time you see your code.
It's not a bad idea to record what you think your code should be achieving (especially if the code is non-intuitive, if you want to keep comments down to a minimum) so that someone reading it a later date, has an easier time when debugging/bugfixing. Although one of the most frustrating things to encounter in reading someone else's code is cases where the code has been updated, but not the comments....
I've always hated comments that fill half the screen with asterisks just to tell you that the function returns a string, I never read those comments.
Some comments in that vein, not usually with formatting that extreme, actually exist to help tools like JavaDoc and Doxygen generate documentation for your code. This, I think, is a good form of comment, because it has both a human- and machine-readable format for documentation (so the machine can translate it to other, more useful formats like HTML), puts the documentation close to the code that it documents (so that if the code changes, the documentation is more likely to be updated to reflect these changes), and generally gives a good (and immediate) explanation to someone new to a large codebase of why a particular function exists.
Otherwise, I agree with everything else that's been stated. Comment why, and only comment when it's not obvious. Other than Doxygen comments, my code generally has very few comments.
Another type of comment that is generally useless is:
// Commented out by Lumpy Cheetosian on 1/17/2009
...uh, OK, the source control system would have told me that. What it won't tell me is WHY Lumpy commented out this seemingly necessary piece of code. Since Lumpy is located in Elbonia, I won't be able to find out until Monday when they all return from the Snerkrumph holiday festival.
Consider your audience, and keep the noise level down. If your comments include too much irrelevant crap, developers will just ignore them in practice.
BTW: Javadoc (or Doxygen, or equiv.) is a Good Thing(tm), IMHO.
I also use comments to document where a specific requirement came from. That way the developer later can look at the requirement that caused the code to be like it was and, if the new requirement conflicts with the other requirment get that resolved before breaking an existing process. Where I work requirments can often come from different groups of people who may not be aware of other requirements then system must meet. We also get frequently asked why we are doing a certain thing a certain way for a particular client and it helps to be able to research to know what requests in our tracking system caused the code to be the way it is. This can also be done on saving the code in the source contol system, but I consider those notes to be comments as well.
Reminds me of
Real programmers don't write documentation
I wrote this comment a while ago, and it's saved me hours since:
// NOTE: the close-bracket above is NOT the class Items.
// There are multiple classes in this file.
// I've already wasted lots of time wondering,
// "why does this new method I added at the end of the class not exist?".