REGEXP Oracle SQL

I have a CLOB field, rq_dev_comments, in which the username should be replaced with "anonymous".
Update <TABLE>.req
Set rq_dev_comments = regexp_REPLACE(rq_dev_comments,
'\<[bB]\>.*gt;,', '<b>anonymous ')
where length(rq_dev_comments) > ...
Now my question is whether there is a way to check beforehand whether "anonymous" is already set, and how to reduce the number of rows the update has to touch.
Example:
rq_dev_comments = "<html><b>HendrikHeim</b>: I found an error....</html>"
Desired: "<html><b>Anonymous</b>: I found an error....</html>"

The following solution will not catch cases where "username" may appear more than once, and some but not all occurrences have already been replaced with "anonymous". So think twice before you use it. (The same would apply to ANY solutions along the lines of what you asked!)
Add the following to your WHERE clause:
... where length(...) ....
and dbms_lob.instr(rq_dev_comments, '<b>Anonymous') = 0
"= 0" means the search pattern wasn't found in the input string.
Another thing: In the example you show "anonymous" capitalized (with upper case A), but in your code you have it all lower case. Decide one way or another and be consistent. Good luck!
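As a rough illustration of the "skip what is already anonymized" idea outside the database, here is a minimal Python sketch. It assumes (this is not stated in the question) that every <b>...</b> element in a comment holds a username; the names BOLD_NAME and anonymize are made up for the example. The negative lookahead leaves names that were already replaced alone, so a second run changes nothing, and comments where only some occurrences were replaced are still handled.

```python
import re

# Assumption: a bold element always wraps a username.
# The lookahead (?!anonymous</b>) skips elements that are already anonymized;
# IGNORECASE covers <B> as well as both spellings of "Anonymous".
BOLD_NAME = re.compile(r"<b>(?!anonymous</b>)[^<]*</b>", re.IGNORECASE)


def anonymize(comment):
    """Return (new_comment, number_of_replacements)."""
    return BOLD_NAME.subn("<b>Anonymous</b>", comment)


if __name__ == "__main__":
    text = "<html><b>HendrikHeim</b>: I found an error....</html>"
    once, changed = anonymize(text)
    print(once, changed)
    # A second pass finds nothing left to replace.
    print(anonymize(once))
```

The replacement count plays the same role as the dbms_lob.instr check above: a row whose count would be 0 does not need to be updated.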

Open Refine: Exporting nested XML with templating

I have a question regarding the templating option for XML in Open Refine. Is it possible to export data from two columns in a nested XML-structure, if both columns contain multiple values, that need to be split first?
Here's an example to illustrate better what I mean. My columns look like this:
Column1 | Column2
https://d-nb.info/gnd/119119110;https://d-nb.info/gnd/118529889 | Grützner, Eduard von;Elisabeth II., Großbritannien, Königin
https://d-nb.info/gnd/1037554086;https://d-nb.info/gnd/1245873660 | Müller, Jakob;Meier, Anina
Each value separated by semicolon in Column1 has a corresponding value in Column2 in the right order and my desired output would look like this:
<rootElement>
<recordRootElement>
...
<edm:Agent rdf:about="https://d-nb.info/gnd/119119110">
<skos:prefLabel xml:lang="zxx">Grützner, Eduard von</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/118529889">
<skos:prefLabel xml:lang="zxx">Elisabeth II., Großbritannien, Königin</skos:prefLabel>
</edm:Agent>
...
</recordRootElement>
<recordRootElement>
...
<edm:Agent rdf:about="https://d-nb.info/gnd/1037554086">
<skos:prefLabel xml:lang="zxx">Müller, Jakob</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/1245873660">
<skos:prefLabel xml:lang="zxx">Meier, Anina</skos:prefLabel>
</edm:Agent>
...
</recordRootElement>
</rootElement>
(note: in my initial posting, the position of the root element was not indicated and it looked like this:
<edm:Agent rdf:about="https://d-nb.info/gnd/119119110">
<skos:prefLabel xml:lang="zxx">Grützner, Eduard von</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/118529889">
<skos:prefLabel xml:lang="zxx">Elisabeth II., Großbritannien, Königin</skos:prefLabel>
</edm:Agent>
)
I managed to split the values separated by ";" for both columns like this
{{forEach(cells["Column1"].value.split(";"),v,"<edm:Agent rdf:about=\""+v+"\">"+"\n"+"</edm:Agent>")}}
{{forEach(cells["Column2"].value.split(";"),v,"<skos:prefLabel xml:lang=\"zxx\">"+v+"</skos:prefLabel>")}}
but I can't find out how to nest the split skos:prefLabel into the edm:Agent element. Is that even possible? If not, I would work with separate columns or another workaround, but I wanted to make sure there isn't a more direct way first.
Thank you!
Kristina
I am going to expand the answer from RolfBly using the Templating Exporter from OpenRefine.
I do have the following assumptions:
There is some other column left of Column1 acting as record identifying column (see first screenshot).
The columns actually have some proper names
The columns URI and Name are the only columns with multiple values. Otherwise we might produce empty XML elements with the following recipe.
We will use the information about records available via GREL to determine whether to write a <recordRootElement> or not.
Recipe:
Split first Name and then URI on the separator ";" via "Edit cells" => "Split multi-valued cells".
Go to "Export" => "Templating..."
In the prefix field use the value
<?xml version="1.0" encoding="utf-8"?>
<rootElement>
Please note that I skipped the namespace imports for edm, skos, rdf and xml.
In the row template field use the value:
{{if(row.index - row.record.fromRowIndex == 0, '<recordRootElement>', '')}}
<edm:Agent rdf:about="{{escape(cells['URI'].value, 'xml')}}">
<skos:prefLabel xml:lang="zxx">{{escape(cells['Name'].value, 'xml')}}</skos:prefLabel>
</edm:Agent>
{{if(row.index - row.record.fromRowIndex == row.record.rowCount - 1, '</recordRootElement>', '')}}
The row separator field should just contain a linebreak.
In the suffix field use the value:
</rootElement>
Disclaimer: If you're keen on using only OpenRefine, this won't be the answer you were hoping for. There may be ways in OR that I don't know of. That said, here's how I would do it.
Edit: The trick is to keep URL and literal side by side on one line. b2m's answer does just that: go from right to left when splitting, not from left to right. You can then skip steps 2 and 3, to get the result in the image.
1. Split each column into 2 columns by the separator ;. You'll get 4 columns: 1 and 3 belong together, and 2 and 4 belong together. I'm assuming this will be the case consistently in your data.
2. Export 1 and 3 to a file, and export 2 and 4 to another file, of any convenient format, using the custom tabular exporter.
3. Concatenate those two files into one single file using an editor (I use Notepad++), or any other method you may prefer. Several ways to Rome here. Result in OR would be something like this.
You then have all sorts of options to put text strings in front, between and after your two columns.
In OR, you could use transform on column URL to build your XML using the below code
(note the \n for newline, that's probably just a line feed, you may want to use \r\n for carriage return + line feed if you're using Windows).
'<edm:Agent rdf:about="' + value + '">\n<skos:prefLabel xml:lang="zxx">' + cells.Name.value + '</skos:prefLabel>\n</edm:Agent>'
to get your XML in one column, like so
which you can then export using the custom tabular exporter again. Or instead you could use Add column based on this column in a similar manner, if you want to retain your URL column.
You could even do this in the editor without re-importing the file back into OR, but that's beyond the scope of this answer.
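For anyone doing the pairing outside OpenRefine, the "keep URL and literal side by side" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name agents_xml is hypothetical, the two arguments are the raw cell values of one row, and both cells are assumed to hold the same number of ";"-separated values in matching order.

```python
from xml.sax.saxutils import escape, quoteattr


def agents_xml(uris, names, sep=";"):
    """Pair the i-th URI with the i-th name and nest the label in the agent."""
    uri_list = uris.split(sep)
    name_list = names.split(sep)
    if len(uri_list) != len(name_list):
        # The pairing only makes sense when both cells line up.
        raise ValueError("both cells must hold the same number of values")
    parts = []
    for uri, name in zip(uri_list, name_list):
        parts.append(
            # quoteattr adds the surrounding quotes and escapes the value.
            f"<edm:Agent rdf:about={quoteattr(uri)}>\n"
            f'<skos:prefLabel xml:lang="zxx">{escape(name)}</skos:prefLabel>\n'
            "</edm:Agent>"
        )
    return "\n".join(parts)


if __name__ == "__main__":
    print(agents_xml(
        "https://d-nb.info/gnd/119119110;https://d-nb.info/gnd/118529889",
        "Grützner, Eduard von;Elisabeth II., Großbritannien, Königin",
    ))
```

The record wrapper (<recordRootElement>) and the namespace declarations are left out here, just as in the recipes above.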

SQL - ignore where clause if null / no input

I'm building an SSRS report and would like one of my parameters to be optional, so that the report runs whether or not a value is entered.
Here is an example query for a better understanding:
SELECT
C1
,C2
,C3
FROM
db_Database..tb_Table
WHERE
tb_Table_DateTime between [THEN] and [NOW]
AND
tb_Table_Integer IN (@Integer)
I'm trying to work out if, in my query, I can ignore the whole:
AND tb_Table_Integer IN (@Integer)
line if the user chooses not to input any number.
Basically, I want all data returned unless specified otherwise via @Integer.
If not possible in the query, can this be achieved in Visual Studio?
Cheers.
This is typically handled by doing:
WHERE . . . AND
(@Integer IS NULL OR tb_Table_Integer = @Integer)
Do not use IN (@Integer). It sort of implies that you think that @Integer could be a list. That is not possible.
The most common way to do this is with coalesce or nullif. Like this:
WHERE coalesce(@Integer,tb_Table_Integer) = tb_Table_Integer
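The two WHERE patterns above can be compared with a small, self-contained Python sketch. It uses an in-memory SQLite database rather than SQL Server, so the table t, its columns and the :wanted parameter are stand-ins invented for the example; only the shape of the WHERE clause carries over.

```python
import sqlite3


def fetch_with_or(conn, wanted=None):
    # "Parameter is NULL, or the column equals the parameter."
    sql = ("SELECT c1, n FROM t "
           "WHERE (:wanted IS NULL OR n = :wanted) ORDER BY c1")
    return conn.execute(sql, {"wanted": wanted}).fetchall()


def fetch_with_coalesce(conn, wanted=None):
    # Falls back to comparing the column with itself when no value is given.
    sql = ("SELECT c1, n FROM t "
           "WHERE coalesce(:wanted, n) = n ORDER BY c1")
    return conn.execute(sql, {"wanted": wanted}).fetchall()


def make_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (c1 TEXT, n INTEGER)")
    conn.executemany(
        "INSERT INTO t VALUES (?, ?)",
        [("a", 1), ("b", 2), ("c", 2), ("d", None)],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print(fetch_with_or(conn))           # no filter applied
    print(fetch_with_or(conn, 2))        # filtered on n = 2
    print(fetch_with_coalesce(conn))     # note which row is missing
```

One difference worth knowing: when the column itself contains NULLs, the coalesce form drops those rows even if no value is supplied, because NULL = NULL does not evaluate to true. The IS NULL OR form keeps them.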

Peoplesoft CreateRowset with related display record

According to the Peoplebook here, CreateRowset function has the parameters {FIELD.fieldname, RECORD.recname} which is used to specify the related display record.
I tried to use it as follows (just an example):
&rs1 = CreateRowset(Record.User, Field.UserId, Record.UserName);
&rs1.Fill();
For &k = 1 To &rs1.ActiveRowCount
MessageBox(0, "", 999999, 99999, &rs1(&k).UserName.Name.Value);
End-for;
(Record.User contains only UserId(key), Password.
Record.UserName contains UserId(key), Name.)
I cannot get the value of UserName.Name. Am I misunderstanding the usage of this parameter?
Fill is the problem. From the doco:
Note: Fill reads only the primary database record. It does not read
any related records, nor any subordinate rowset records.
Having said that, it is the only way I know to bulk-populate a standalone rowset from the database, so I can't easily see a use for the field in the rowset.
Simplest solution is just to create a view, but that gets old very soon if you have to do it a lot. Alternative is to just loop through the rowset yourself loading the related fields. Something like:
For &k = 1 To &rs1.ActiveRowCount
&rs1(&k).UserName.UserId.value = &rs1(&k).User.UserId.value;
&rs1(&k).UserName.SelectByKey();
End-for;

Parse SQL with REGEX to find Physical Update

I've spent a bit of time trying to bend regex to my will, but it's beaten me.
Here's the problem, for the following text...
--to be matched
UPDATE dbo.table
UPDATE TOP 10 PERCENT dbo.table
--do not match
UPDATE #temp
UPDATE TOP 10 PERCENT #temp
I'd like to match the first two UPDATE statements and not match the last two. So far I have the regex...
UPDATE\s?\s+[^#]
I've been trying to get the regex to ignore the TOP 10 PERCENT part, as it just gets in the way. But I haven't been successful.
Thanks in advance.
I'm using .net 3.5
I assume you're trying to parse real SQL syntax (looks like SQL Server) so I've tried something that is more suitable for that (rather than just detecting the presence of #).
You can try regex like:
UPDATE\s+(TOP.*?PERCENT\s+)?(?!(#|TOP.*?PERCENT|\s)).*
It checks for UPDATE followed by optional TOP.*?PERCENT and then by something that is not TOP.*?PERCENT and doesn't start with #. It doesn't check just for the presence of # as this may legitimately appear in other position and not mean a temp table.
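A quick way to sanity-check that pattern against the four statements from the question is a short script. The sketch below uses Python's re module rather than .NET, on the assumption that the constructs involved (lazy quantifier, optional group, negative lookahead) behave the same in both engines; the helper name is_physical_update is made up for the example.

```python
import re

# The pattern from the answer above, unchanged.
PHYSICAL_UPDATE = re.compile(
    r"UPDATE\s+(TOP.*?PERCENT\s+)?(?!(#|TOP.*?PERCENT|\s)).*"
)


def is_physical_update(statement):
    """True when the statement updates something other than a #temp table."""
    return PHYSICAL_UPDATE.search(statement) is not None


if __name__ == "__main__":
    for stmt in (
        "UPDATE dbo.table",
        "UPDATE TOP 10 PERCENT dbo.table",
        "UPDATE #temp",
        "UPDATE TOP 10 PERCENT #temp",
    ):
        print(stmt, "->", is_physical_update(stmt))
```

The TOP.*?PERCENT alternative inside the lookahead is what stops the engine from backtracking, skipping the optional group, and then accepting the word TOP itself as the table name.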
As I understand it, you want a regex to interact with SQL code, not actually querying a database?
You can use a negative look ahead to check if the line has #temp:
(?m)^(?!.*#temp).*UPDATE
(?!...) will fail the whole match if what's inside it matches, ^ matches the beginning of the line when combined with the m modifier. (?m) is the inline version of this modifier, as I don't know how/where you plan on using the regex.
See demo here.
@Robin's solution is much better, but in case you need a regex that uses simpler mechanisms, here is one:
UPDATE\s+(TOP\s+10\s+PERCENT\s+)?[a-z\.]+
sqlconsumer, here's a fully functioning C# .NET program. Does it do what you're looking for?
using System;
using System.Text.RegularExpressions;
class Program {
static void Main() {
string s1 = "UPDATE dbo.table";
string s2 = "UPDATE TOP 10 PERCENT dbo.table";
string s3 = "UPDATE #temp";
string s4 = "UPDATE TOP 10 PERCENT #temp";
string pattern = @"UPDATE\s+(?:TOP 10 PERCENT\s+)?dbo\.\w+";
Console.WriteLine(Regex.IsMatch(s1, pattern) );
Console.WriteLine(Regex.IsMatch(s2, pattern));
Console.WriteLine(Regex.IsMatch(s3, pattern));
Console.WriteLine(Regex.IsMatch(s4, pattern));
Console.WriteLine("\nPress Any Key to Exit.");
Console.ReadKey();
} // END Main
} // END Program
The Output:
True
True
False
False

regular expression to pull words beginning with #

Trying to parse an SQL string and pull out the parameters.
Ex: "select * from table where [Year] between #Yr1 and #Yr2"
I want to pull out "#Yr1" and "#Yr2"
I have tried many patterns, but none has worked, such as:
matches = Regex.Matches(sSQL, "\b#\w*\b")
and
matches = Regex.Matches(sSQL, "\b\#\w*\b")
Any help?
You're trying to put a word boundary after the #, rather than before. Maybe this:
\W(#[A-Z0-9a-z]+)
or
\W(#[^\s]+)
I would have gone with
/(?:^|\s)(#\w+)(?:\s|$)/
or if you didn't want to include the #
/(?:^|\s)#(\w+)(?:\s|$)/
though I also like joel's above, so maybe one of these
/(?:^|\s)(#[^\s]+)(?:\s|$)/
/(?:^|\s)#([^\s]+)(?:\s|$)/
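The reason the patterns in the question find nothing is that \b only matches between a word character and a non-word character, and both the space and the marker in front of the parameter name are non-word characters. A minimal Python sketch of an alternative follows; it is not the .NET code from the question, the function name extract_params is hypothetical, and the marker character is passed in as an argument so that no particular sigil is assumed.

```python
import re


def extract_params(sql, sigil):
    """Return every token made of the sigil followed by word characters.

    The lookbehind (?<!\\w) rejects a sigil glued to the end of a word,
    e.g. the middle of an e-mail-like token.
    """
    pattern = re.compile(r"(?<!\w)" + re.escape(sigil) + r"\w+")
    return pattern.findall(sql)


if __name__ == "__main__":
    query = "select * from table where [Year] between #Yr1 and #Yr2"
    print(extract_params(query, "#"))
```

Because the lookbehind does not consume any characters, a parameter at the very start of the string or directly after an opening parenthesis is found as well.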