SQL Strip the Font Format (Colour or other)

I have a problem stripping the formatting out of a notes table.
Here is an example:
";\red31\green73\blue125;
\viewkind4\uc1\ltrpar\f0\fs20 USEFUL TEXT BODY \cf1\f3
\ltrpar\f0\fs17
"
How can I get rid of that stuff? To play it safe, I don't want to blindly replace everything that follows a '\'.
Many thanks,
Rick

You're making it quite difficult for yourself by not replacing '\'.
If you look at http://other9.tripod.com/Refs/easy-rtf.html you will see that there are many different RTF codes and that the codes have no fixed length.
Additionally, unlike HTML, RTF does not require a "closing" tag for each code, which makes it harder still.
The only thing I can think of is to record all possible RTF codes (or use an RTF parser library) so that you can recognize whether a \ starts an RTF code or not.
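As a rough illustration of the tokenizing idea above, here is a minimal Python sketch. It is not a full RTF parser: it assumes the input is a short fragment like the one in the question, it drops hex-escaped bytes such as \'e9 instead of decoding them, and it does not skip the contents of font or colour tables (so stray semicolons from a colour table would survive). The name strip_rtf_controls is made up for this example.

```python
import re

# One pattern, scanned left to right, so an escaped backslash (\\) is consumed
# before whatever follows it can be mistaken for a control word.
_RTF_TOKEN = re.compile(
    r"\\([\\{}])"            # escaped literal: \\ \{ \}  (kept via group 1)
    r"|\\[a-zA-Z]+-?\d* ?"   # control word, optional number, one delimiter space
    r"|\\'[0-9a-fA-F]{2}"    # hex-escaped byte such as \'e9 (dropped in this sketch)
    r"|\\[^a-zA-Z]"          # any other control symbol, e.g. \~ or \*
    r"|[{}]"                 # group delimiters
)


def strip_rtf_controls(fragment: str) -> str:
    """Remove RTF control sequences, keeping escaped literals and plain text."""
    text = _RTF_TOKEN.sub(lambda m: m.group(1) or "", fragment)
    # Collapse the whitespace left behind by removed codes and line breaks.
    return " ".join(text.split())


sample = r"\viewkind4\uc1\ltrpar\f0\fs20 USEFUL TEXT BODY \cf1\f3"
print(strip_rtf_controls(sample))  # USEFUL TEXT BODY
```

Because escaped backslashes are matched first, a literal backslash in the note text (written \\ in RTF) is kept rather than treated as the start of a code.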


unable to perform search on custom_field(JIRA-Python)

I'm getting the below error when I search on custom_field.
{"errorMessages":["Field \'customfield_10029\' does not exist or you do not have permission to view it."],"warningMessages":[]}
But I have sufficient permissions (Admin) to access that field, and I have also made the field visible.
URL = 'https://xyz.atlassian.net/rest/api/2/search?jql=status="In+Progress"+and+customfield_10029=125&fields=id,key,status'
Custom fields in JQL searches are referenced using the abbreviation 'cf' followed by their ID inside square brackets '[id]', so your URL would be:
URL = 'https://xyz.atlassian.net/rest/api/2/search?jql=status="In+Progress"+and+cf[10029]=125&fields=id,key,status'
Make sure you percent-encode (URL-encode) the square brackets, as %5B and %5D, using your language's URL-encoding function.
PS. Generally speaking, it's much easier to reference custom fields in JQL searches by their names, not their IDs. It makes the search URL easier to read and understand what is being searched for.
I get a 400 response code with customized field syntax:
https://domain/rest/api/2/search?maxResults=500&jql=cf[10025]='xxxxxxxxxd'&fields=id,key,issuetype,status,customfield_10025
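The percent-encoding step mentioned in the answer could look something like this in Python. This is only a sketch of how to build the query string; build_search_url is a made-up helper, the host is the placeholder from the question, and it makes no claim about why a particular request returns 400.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, jql: str, fields: list[str]) -> str:
    """Build a search URL with the JQL and field list percent-encoded."""
    # urlencode escapes the square brackets, quotes, '=' and ',' in the values.
    query = urlencode({"jql": jql, "fields": ",".join(fields)})
    return f"{base_url}/rest/api/2/search?{query}"


url = build_search_url(
    "https://xyz.atlassian.net",                   # placeholder host from the question
    'status = "In Progress" AND cf[10029] = 125',  # JQL using the cf[id] form
    ["id", "key", "status"],
)
print(url)
```

Writing the JQL as a plain string and letting urlencode do the escaping avoids hand-placing + and %5B/%5D in the URL.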

How to determine Thousands Separator using Format in VBA

I would like to determine the Thousands Separator used while running VBA code on a target machine without resorting to calling system built-in functions such as (Separator = Application.ThousandsSeparator).
I am using the following simple code using 'Format':
ThousandSeparator = Mid(Format(1000, "#,#"), 2, 1)
The above seems to work fine, and I would like to confirm whether this is a safe method of doing it without resorting to system calls.
I would expect the result to be a single char string in the form of , or . or ' or a Space as applicable to the locale on the machine.
Please note that I want to only use a language statement such as Format or similar (no sys calls). Also this relates to Thousands Separator not Decimal Separator. This article Using VBA to detect which decimal sign the computer is using does not help or answer my question. Thanks
Thanks in advance.
The strict answer to whether it is safe to use Format to get the thousands separator is No.
E.g. on Windows, it is possible to enter up to three characters into the Thousands Separator field in the regional settings in the control panel.
Suppose you enter asd and click OK.
If you now call Format(1000, "#,#") it will give you 1a000. That is only the first letter of your thousands separator. You have failed to retrieve it correctly.
Reading the registry:
? CreateObject("WScript.Shell").RegRead("HKCU\Control Panel\International\sThousand")
you get back asd in full.
To be fair, the Excel international properties do not seem to be of much help either. Application.International(xlThousandsSeparator) in this situation will return the separator originally defined in your computer's locale, not the value you've overridden it to.
That said, the practical answer is Yes, because it would appear (and if you happen to know for sure, please post an answer here) that no culture uses a multi-character thousands separator (even in China, where scary things like 1億2345万6789 or 1億2345萬6789 exist, each separator is just one UTF-16 character), and you are probably happy to ignore the people who decided to play with their locale settings in that fashion.

Scrapy: how to solve the "empty" item in html due to a foreign language symbol?

One of the scraped items seems to contain no content when shown in HTML. In the MySQL database it does have content, including a non-regular - (dash) that is slightly longer. It could be a dash symbol from Chinese input, or something similar. I am copying it below, though I'm not sure whether it will keep its original form. The web link is here and this non-regular dash is in the title and at the beginning of the description.
**Hospitalist – Chattanooga**
To further prove it, the CSV file exported from MySQL converts this weird dash to ?€?. Most likely this weird symbol causes the non-display problem.
I want to either delete this weird symbol or replace it with a , or a regular dash. Where can it be done? During Scrapy? Or in MySQL? Sorry this is not a specific coding question. I need some guidance before figuring out any code for this problem.
The long dash is called an em dash (see fileformat - EM dash).
The reason you are seeing it is likely due to the chosen encoding.
Try setting a different encoding or replacing the EM dash with the , sign as you mentioned in your question.
In PHP you can do so with the following code:
// chr(151) is the em dash in Windows-1252; str_replace returns the new string
$output = str_replace(chr(151), ',', $input);
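Since the scraping side here is Scrapy, the same replacement could also be done in Python before the item is stored. The following is a minimal sketch, assuming the scraped text is already a decoded Python str; normalize_dashes is a made-up helper name, and where you call it (for example on the title and description fields in an item pipeline) is up to you.

```python
# Map the en dash (U+2013) and the em dash (U+2014) to a plain ASCII hyphen.
_DASHES = {0x2013: "-", 0x2014: "-"}


def normalize_dashes(text: str) -> str:
    """Replace en and em dashes with a regular hyphen-minus."""
    return text.translate(_DASHES)


print(normalize_dashes("Hospitalist \u2013 Chattanooga"))  # Hospitalist - Chattanooga
```

This only hides the symptom for these two characters; other non-ASCII text would still depend on the database, export and page all agreeing on one encoding.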

Special unicode question mark characters in database table

Firstly, thanks to anyone who reads this and responds for your assistance.
I'm having a problem where I have a site (primarily in English) with many translations for different languages. I have a database which stores these translations. Unfortunately, one of the languages seems to be populated with question mark characters between each regular character. Because of this, any text which contains these characters won't show up in IE.
Are there any SQL statements that will seek these characters out and remove them? There's a find/replace option, but I can't seem to find a rule that applies.
Thanks for any help you can give.
As an example, this is how text shows in a table:
�i�O�N� �k�i�t� �d�e� �s�u�p�p�o�r�t� �V�é�l�o� - which stops it showing in IE.
Removing these as below will show it in IE:
iON kit de support Vélo
Any idea how I go about this?
Thanks :)
Your translation database contains mangled data that has come from misinterpreting UTF-16-encoded input as ISO-8859-1 (or the closely related Windows code page 1252; you can't tell the difference from the example data).
You could attempt to undo the damage by extracting the data, encoding it back to what is hopefully the original set of bytes, and re-decoding it, then inserting it back into the database. For example in PHP:
$mangled = "i\0O\0N\0 \0k\0i\0t\0 \0d\0e\0 \0s\0u\0p\0p\0o\0r\0t\0 \0V\0\xE9\0l\0o\0";
$fixed = iconv('utf-16le', 'utf-8', $mangled);
# "iON kit de support V\xC3\xA9lo"
but really it would be best to go back to the original input data and re-import it properly.
Just removing the zero bytes from a UTF-16-encoded byte string (str_replace("\0", '', $mangled)) doesn't really fix it. It would work for the ASCII characters (U+0000–U+007F), but you would end up with ISO-8859-1 bytes for characters U+0080–U+00FF (more usually you would want UTF-8), and any characters outside that range would remain unreadable nonsense.
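For comparison, the same round trip can be sketched in Python. It assumes the mangled text has been read as a str in which every original byte became one ISO-8859-1 character, and that the original was UTF-16 little-endian as in the PHP example; fix_utf16_mojibake is a made-up name. The second half shows why stripping the zero bytes is not enough for a character outside U+0000–U+00FF.

```python
def fix_utf16_mojibake(mangled: str) -> str:
    """Undo UTF-16LE bytes that were wrongly decoded as ISO-8859-1."""
    raw = mangled.encode("latin-1")   # back to the (hopefully) original bytes
    return raw.decode("utf-16-le")    # decode them the way they were meant


mangled = "V\x00\xe9\x00l\x00o\x00"
print(fix_utf16_mojibake(mangled))    # Vélo

# A character outside U+0000-U+00FF has no zero byte to strip:
# U+0141 is the bytes 41 01 in UTF-16LE, which were misread as "A\x01".
mangled_l = "\u0141".encode("utf-16-le").decode("latin-1")
stripped = mangled_l.replace("\x00", "")   # still "A\x01", not the real letter
repaired = fix_utf16_mojibake(mangled_l)   # the real letter, U+0141
```

If the stored text was actually misread as Windows-1252 rather than ISO-8859-1, or has an odd number of bytes, this sketch will raise or give wrong results, so try it on a copy of the data first.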

Controlling Doxygen's LaTeX output for making PDF documentation

I'm using Doxygen to generate documentation for my code. I need to make a PDF version of this and using Doxygen's LaTeX output appears to be the way to do it.
However, I've run into a number of annoying problems, and not knowing anything about LaTeX previously, I haven't really got much of an idea of how to approach them, and the countless references for LaTeX-related things are not much help...
I worked out how to create a custom style in a .sty file and how to get Doxygen to use it. After a lot of searching I found out how to set the page margins etc. through this, and I'm guessing that perhaps this is the file I want for doing the other things I want, but I can't seem to find any commands for doing what I want :(
The table of contents at the start of the document contains a lot of items I'd rather it didn't, as it makes the contents very long. Is there some way to limit the contents to just, say, the first two levels, rather than having entries for every single individual function, variable, etc.? I'd quite like to keep all the bookmarks, however. I did try the "COMPACT_LATEX" option, but as well as removing items on the contents pages, it removed the bookmarks and the member lists at the start of each section, which I do really want to keep.
Is there a way to change the order of things, like putting the full class description at the start of the section, rather than after all the members and attributes?
Wow, that's kind of evil of Doxygen.
Okay, to get around the tocdepth counter problem, add the following line to your .sty file:
\AtBeginDocument{\setcounter{tocdepth}{2}}% or whatever level you want
You can set the PDF bookmarks depth to a separate value:
% requires you \usepackage{hyperref} first
\hypersetup{
bookmarksdepth = section, % or whatever level you want
}
Also note that if you have a list of figures/tables, the tocdepth must be at least 2 for them to show up.
I don't see any way of rearranging those items within the LaTeX files---Doxygen just barfs them out there, so we can't do much. You'll have to poke around the Doxygen documentation to see if there's any way to specify the order I guess. (Here's hoping!)
You're so close.
Googling "latex contents level" brought me to "LaTeX - customizing the depth of the table of contents for different parts of the thesis", which suggests
\setcounter{tocdepth}{n}
where n starts at zero for only the highest level division. This is presumably defined in all the default styles, but is worth a try in Doxygen.
You could write a Perl/Awk script to simply delete the unwanted lines from the table of contents. For the file burble.tex, LaTeX will generate the file burble.toc, which will contain lines such as:
\contentsline {subsection}{Class F rewrites}{38}
\contentsline {subsection}{Class M rewrites}{39}
\contentsline {section}{\numberline {7}Definition and properties of the translation}{44}
\contentsline {paragraph}{Well-formedness}{54}
Simple regexes will identify which level each line belongs to, and you can filter the file based on that. Once you have the table of contents the way you want it, insert \nofiles in the appropriate place (the style sheet?), which means that LaTeX will read the auxiliary files but not overwrite them.
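The answer suggests Perl or Awk; the same filtering can be sketched in Python. This is only an illustration: filter_toc and rewrite_toc are made-up names, the default set of levels to keep is an assumption, and it only inspects the first argument of each \contentsline entry. Lines that are not \contentsline entries are passed through untouched.

```python
import re
from pathlib import Path

# The first braced argument of \contentsline names the level (section, paragraph, ...).
_CONTENTSLINE = re.compile(r"\\contentsline\s*\{(\w+)\}")


def filter_toc(lines, keep_levels=("chapter", "section")):
    """Keep only \\contentsline entries whose level is in keep_levels."""
    kept = []
    for line in lines:
        match = _CONTENTSLINE.match(line.lstrip())
        if match is None or match.group(1) in keep_levels:
            kept.append(line)
    return kept


def rewrite_toc(path, keep_levels=("chapter", "section")):
    """Filter a .toc file in place (UTF-8 is assumed here)."""
    toc = Path(path)
    lines = toc.read_text(encoding="utf-8").splitlines(keepends=True)
    toc.write_text("".join(filter_toc(lines, keep_levels)), encoding="utf-8")
```

You would run something like rewrite_toc("burble.toc") after a LaTeX pass, then add \nofiles as described above so the filtered file is not regenerated.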