Keep text formatting in SQL - sql

I have a text area that inserts its content into a SQL table. Is there a way to keep the formatting of the text and then use it in HTML?

I'll assume you're talking about preserving line breaks.
Either:
Output the text inside a <pre> tag
or
Convert newlines to <br /> tags before insertion to the DB. (E.g. nl2br in PHP).
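A minimal sketch of the second option in Python, for illustration only. It assumes the conversion happens when the page is rendered rather than before insertion, which is a different choice from the one suggested above and leaves the stored text unmodified. The name nl2br is borrowed from PHP, and the html.escape call is an addition so that markup typed by the user is shown literally.

```python
import html

def nl2br(text: str) -> str:
    """Escape HTML special characters, then turn line breaks into <br /> tags."""
    escaped = html.escape(text)
    # Normalize Windows (CR LF) and old Mac (CR) line endings to LF first.
    normalized = escaped.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", "<br />")
```

Call it on the value read back from the table just before writing it into the page.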

If you mean keeping the line breaks (the Enters), replace char 10 and char 13 with <br/>.
In SQL (note the line breaks inside the string literals):
select replace('
test
test','
','<br/>')
This results in <br/>test<br/>test

Text is text is text. Insert the text into the table including its markup and it will come out that way as well.
...or am I misunderstanding your question?

Related

SQL cannot search

In my SQL table Image, when I run the search query
SELECT * FROM Image WHERE platename LIKE 'WDD 666'
it returns no results (searching on other columns works without problems).
All the column data was inserted by C# code. (If I enter the data manually, the search works.)
Now I suspect that the text WDD 666 isn't made of plain English letters. Is this possible?
In C#, the plate number is generated as a string by a Tesseract wrapper.
What should I do to search for the plate number?
Thanks in advance and sorry for my bad English.
Since your case matches, I'm going to rule out Case-sensitivity.
There may be leading or trailing blank spaces. Try this:
SELECT * FROM Image WHERE platename LIKE '%WDD 666%'
Try running this command:
SELECT '*'+plateName+'*', len(plateName)
FROM image
I suspect platename has some non-printable characters in the field.
It appears to be a CR/LF at the end of the data. You can use
UPDATE image SET plateName = replace(plateName,char(13)+char(10),'')
WHERE plateName like '%'+char(13)+char(10)+'%'
If you get a positive row count, you'll know there was CR/LF data and that it was removed. If you run the select afterwards, your lengths should be 7 and 8 based on your sample data.
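The diagnosis above can be reproduced in a small Python script. This is only a sketch: it uses SQLite (through the standard sqlite3 module) as a stand-in for the asker's database, so string concatenation is written || instead of +, and the second plate value is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE image (plateName TEXT)")
# Simulate rows written by application code with a trailing CR/LF.
conn.executemany(
    "INSERT INTO image (plateName) VALUES (?)",
    [("WDD 666\r\n",), ("ABC 1234\r\n",)],  # second value is hypothetical
)

def count_matches(plate: str) -> int:
    """Count rows whose plateName matches the pattern exactly (no wildcards)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM image WHERE plateName LIKE ?", (plate,)
    ).fetchone()
    return row[0]

before = count_matches("WDD 666")  # the hidden CR/LF prevents a match

# Strip CR/LF pairs, touching only the rows that contain them.
cursor = conn.execute(
    "UPDATE image SET plateName = replace(plateName, char(13) || char(10), '') "
    "WHERE plateName LIKE '%' || char(13) || char(10) || '%'"
)
fixed_rows = cursor.rowcount

after = count_matches("WDD 666")
lengths = [r[0] for r in conn.execute("SELECT length(plateName) FROM image ORDER BY rowid")]
```

A longer-term fix is probably to trim the recognized text in the C# code before inserting it, so the stray characters never reach the table.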

Convert text with HTML character encoding to database characterset

Our application receives data from various sources. Some of these contain HTML character entities instead of regular characters. So instead of the string "â" we receive the string "&acirc;".
How can we convert "&acirc;" to a character in the database character set using SQL/PLSQL?
Our database is 10GR2.
I believe UTL_I18N.UNESCAPE_REFERENCE and UTL_I18N.ESCAPE_REFERENCE are what you're looking for:
UTL_I18N.UNESCAPE_REFERENCE('hello &lt; &#xe5;')
This returns 'hello <'||chr(229).
http://docs.oracle.com/cd/B28359_01/appdev.111/b28419/u_i18n.htm#i998992
You can use the CHR() function to convert an ascii character number to a character representation.
SELECT chr(226)
FROM dual;
CHR(226)
--------
â
For more information see: http://www.techonthenet.com/oracle/functions/chr.php
Hope it helps...
One solution:
replace(your_test, '&acirc;', chr(226))
but you'd have to nest many replace functions, one for each entity you need to replace. This might be very slow if you have to replace many.
You can write your own function that searches for the ampersand and replaces the entity when one is found.
Have you searched the Oracle Supplied Packages manual? I know they have a function that does the opposite for a few entities.
To convert a column in Oracle which contains HTML items to plain text, you could use:
trim(regexp_replace(UTL_I18N.unescape_reference(column_name), '<[^>]+>'))
It will replace HTML character entities as stated above, but it will also remove HTML tags and strip leading and trailing spaces.
I hope it will help someone.
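If the data passes through application code before it reaches the database, the same unescape-then-strip-tags idea can be sketched in Python. This assumes such a preprocessing step exists and is not a PL/SQL solution; to_plain_text is a hypothetical helper name, and the order of operations mirrors the Oracle expression above.

```python
import html
import re

TAG_PATTERN = re.compile(r"<[^>]+>")

def to_plain_text(value: str) -> str:
    """Decode HTML character references, drop tags, and trim whitespace."""
    decoded = html.unescape(value)           # "&acirc;" -> "â", "&#xe5;" -> "å"
    without_tags = TAG_PATTERN.sub("", decoded)
    return without_tags.strip()
```

As with the SQL version, decoding before stripping means an escaped "&lt;b&gt;" is treated as a tag and removed; swap the two steps if escaped markup should survive as text.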

How to get data from a .rtf file or excel file into database(sqlite) in iphone sdk?

I have lots of data in a .rtf file (usernames and passwords). How can I load that data into a table? I'm using sqlite3.
I have created a "userDatabase.sql" database containing a table "usersList" with the fields "username" and "password". I want to load the list of data in the "list.rtf" file into my table "usersList". Please help me.
Thanks in advance.
Praveena.
I would write a little parser. Re-save the .rtf as a .txt file and assume it looks like this:
user1:pass1
user2:pass2
user5:pass5
Now do this (in your code):
open the .txt file (NSString -stringWithContentsOfFile:usedEncoding:error:)
read line by line
for each line, fetch user and password (NSArray -componentsSeparatedByString)
store user/password into your DB
Best,
Christian
Edit: for parsing Excel sheets I recommend exporting them as a CSV file and then doing the same.
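The steps above can be sketched in Python with the standard sqlite3 module. This is an illustration only, since the question targets the iPhone SDK; it assumes the user:pass line format shown above, takes the text as a string instead of reading a file, and uses the table and column names from the question. parse_credentials and load_users are hypothetical helper names.

```python
import sqlite3

def parse_credentials(text: str) -> list[tuple[str, str]]:
    """Turn 'user:pass' lines into (user, password) pairs, skipping bad lines."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so passwords may contain colons.
        user, password = line.split(":", 1)
        pairs.append((user, password))
    return pairs

def load_users(conn: sqlite3.Connection, text: str) -> int:
    """Insert the parsed pairs into usersList and return how many were stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS usersList (username TEXT, password TEXT)"
    )
    pairs = parse_credentials(text)
    conn.executemany(
        "INSERT INTO usersList (username, password) VALUES (?, ?)", pairs
    )
    conn.commit()
    return len(pairs)
```

Using parameter placeholders (?) instead of building the INSERT string by hand keeps quotes inside usernames or passwords from breaking the statement.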
Parsing RTF files is mostly trivial. They're actually text, not binary (unlike .doc, .pdf, etc.).
Last I used it, I remember the file format wasn't too difficult either.
Example:
{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Calibri;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\sa200\sl276\slmult1\lang9\f0\fs22 Username Password\par
Username2 Password2\par
UsernameN PasswordN\par
}
Do a regular expression match to get the last { ... } part. Be sure to match { and not \{.
Next, parse the text as you want, but keep in mind that:
everything starting with a \ is escaped, I would write a little function to unescape the text
the special identifier \par is for a new line
there are other special identifiers, such as \b which toggles bolding text
the color change identifier, \cfN changes the text color according to the color table defined in the file header. You would want to ignore this identifier since we're talking about plain text.
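A rough Python sketch of these rules, tried only against the sample shown above. It is a variation on the approach described: instead of matching the last { ... } part, it deletes the nested header groups and keeps what is left. It ignores hex escapes such as \'92 and Unicode escapes, so treat it as a starting point and not as a complete RTF reader; rtf_to_lines is a hypothetical helper name.

```python
import re

# Private-use placeholders protect escaped characters while braces are processed.
ESCAPES = {"\\\\": "\ue000", "\\{": "\ue001", "\\}": "\ue002"}
PAR = "\ue003"

SAMPLE = r"""{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0 Calibri;}}
{\*\generator Msftedit 5.41.21.2510;}\viewkind4\uc1\pard\sa200\sl276\slmult1\lang9\f0\fs22 Username Password\par
Username2 Password2\par
UsernameN PasswordN\par
}"""

def rtf_to_lines(rtf: str) -> list[str]:
    """Extract plain-text lines from a simple RTF document."""
    text = rtf.strip()
    for escaped, marker in ESCAPES.items():
        text = text.replace(escaped, marker)
    # Drop the outermost group braces, then remove nested groups (font table etc.).
    if text.startswith("{") and text.endswith("}"):
        text = text[1:-1]
    while True:
        text, removed = re.subn(r"\{[^{}]*\}", "", text)
        if removed == 0:
            break
    # \par marks a new line; the lookahead keeps \pard from matching.
    text = re.sub(r"\\par(?![a-zA-Z]) ?", PAR, text)
    # Raw line breaks in the file carry no meaning in RTF.
    text = text.replace("\r", "").replace("\n", "")
    # Remove remaining control words such as \b, \cf1 or \fs22.
    text = re.sub(r"\\[a-zA-Z]+-?\d* ?", "", text)
    text = text.replace(PAR, "\n")
    for escaped, marker in ESCAPES.items():
        text = text.replace(marker, escaped[1])
    return [line.strip() for line in text.split("\n") if line.strip()]
```

Each returned line can then be split on whitespace to get the username and password.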

Replace character in SQL results

This is from an Oracle SQL query. It has these weird skinny rectangle shapes in the database in places where apostrophes should be. (I wish we could paste screenshots in here.)
It looks like this when I copy and paste the results.
spouse�s
Is there a way to write a SQL SELECT statement that searches for this character in the field and replaces it with an apostrophe in the results?
Edit: I need to change only the results in a SELECT statement for reporting purposes, I can't change the Database.
I ran this
select dump('�') from dual;
which returned
Typ=96 Len=3: 239,191,189
This seems to work so far
select translate('What is your spouse�s first name?', '�', '''') from dual;
but this doesn't work
select translate(Fieldname, '�', '''') from TableName
Select FN from TN
What is your spouse�s first name?
SELECT DUMP(FN, 1016) from TN
Typ=1 Len=33 CharacterSet=US7ASCII: 57,68,61,74,20,69,73,20,79,6f,75,72,20,73,70,6f,75,73,65,92,73,20,66,69,72,73,74,20,6e,61,6d,65,3f
EDIT:
So I have established that it is the backquote character. I can't get the DB updated, so I'm trying this code
SELECT REGEX_REPLACE(FN,"\0092","\0027") FROM TN
and I'm getting ORA-00904: "Regex_Replace": invalid identifier
This seems to be a problem with your charset configuration. Check your NLS_LANG and the other NLS_xxx environment/registry values. You have to check the Oracle server, your client, and the client that inserted the data.
Try to DUMP the value. You can do it with a select as simple as:
SELECT DUMP(the_column)
FROM xxx
WHERE xxx
UPDATE: I think that before trying to replace anything, you should look for the root of the problem. If this happens because of a charset problem, bad data can cause you much bigger trouble.
UPDATE 2: Answering the comments: the problem may not be on the database server side; it may be on the client side. The problem (if this is the problem) can be a translation in the communication between server and client, caused by mismatched server and client configuration. For instance, if the server is set to the UTF8 charset and your client uses US7ASCII, then all accented characters will appear as ?.
Another possibility is that the server and your client are both set to UTF8 but the application is not able to display UTF8 characters; then the problem is on the application side.
UPDATE 3: On your examples:
select translate('What...: it works because the � is exactly the same character on both sides, since you pasted it in both places.
select translate(Fieldname...: it does not work because the � is not what is stored in the database; it's the character the client receives, possibly because some translation occurs between the table and what is displayed to you.
Next step: look at the DUMP syntax and try to extract the codes for the mysterious character (from the table, not by pasting �!).
I would say there's a good chance the character is a single-tick "smart quote" (I hate the name). The smart quotes are characters 0x91-0x94 (using a Windows encoding), or Unicode U+2018, U+2019, U+201C, and U+201D.
I'm going to propose a front-end application-based, client-side approach to the problem:
I suspect that this problem has more to do with a mismatch between the font you are trying to display the word spouse�s with, and the character �. That icon appears when you are trying to display a character in a Unicode font that doesn't have the glyph for the character's code.
The Oracle database will dutifully return whatever characters were INSERTed into its column. It's more up to you, and your application, to interpret what it will look like given the font you are trying to display your data with in your application, so I suggest investigating what this mysterious � character is that is replacing your apostrophes. Start by using FerranB's recommended DUMP().
Try running the following query to get the character code:
SELECT DUMP(<column with weird character>, 1016)
FROM <your table>
WHERE <column with weird character> like '%spouse%';
If that doesn't grab your actual text from the database, you'll need to modify the WHERE clause to actually grab the offending column.
Once you've found the code for the character, you could replace it with the built-in REGEXP_REPLACE() function, matching on the character's code and supplying the ASCII / C0 Controls and Basic Latin character 0x0027 ('), using code similar to this:
UPDATE <table>
SET <column with offending character>
= REGEXP_REPLACE(<column with offending character>,
CHR(<decimal character code of �>),
'''')
WHERE REGEXP_LIKE(<column with offending character>, CHR(<decimal character code of �>));
If you aren't familiar with Unicode and different ways of character encoding, I recommend reading Joel's article The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!). I wasn't until I read that article.
EDIT: If you're seeing 0x92, there's likely a charset mismatch here:
0x92 in CP-1252 (the default Windows code page) is the right single quotation mark (U+2019), which looks much like an apostrophe. This code isn't a valid ASCII character, and in ISO-8859-1 it is a non-printing control code, not a punctuation mark. So probably either the database is in CP-1252 encoding (I don't find that likely), or a database connection which spoke CP-1252 inserted it, or somehow the apostrophe got converted to 0x92. The database is returning values that are valid in CP-1252 (or some other charset where 0x92 is valid), but your db client connection isn't expecting CP-1252. Hence, the weird question mark.
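The byte values quoted in the question can be checked outside the database. The sketch below decodes the hex list from the DUMP(FN, 1016) output, assuming (as suggested here) that the bytes were written by a CP-1252 client; decode_dump is a hypothetical helper name.

```python
def decode_dump(hex_codes: str, encoding: str = "cp1252") -> str:
    """Decode a comma-separated list of hex byte values, as printed by DUMP(col, 1016)."""
    raw = bytes(int(code, 16) for code in hex_codes.split(","))
    return raw.decode(encoding)

# Byte list copied from the DUMP output in the question.
dump = ("57,68,61,74,20,69,73,20,79,6f,75,72,20,73,70,6f,75,73,65,92,73,"
        "20,66,69,72,73,74,20,6e,61,6d,65,3f")

text = decode_dump(dump)              # 0x92 becomes U+2019
fixed = text.replace("\u2019", "'")   # swap in a plain apostrophe for reporting

# The bytes 239,191,189 from dump('�') are just the UTF-8 encoding of U+FFFD,
# the replacement character the client displays, not the stored byte.
replacement = bytes([239, 191, 189]).decode("utf-8")
```

This supports the point that the stored byte is 0x92 and that the � seen on screen is produced on the client side.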
And FerranB is likely right. I would talk with your DBA or some other admin about this to get the issue straightened out. If you can't, I would try either doing the update above (seems like you can't), or doing this:
INSERT INTO <table> (<normal table columns>,...,<column with offending character>)
SELECT <all normal columns>, REGEXP_REPLACE(<column with offending character>,
CHR(146), -- 0x92
'''') -- ASCII/ISO-8859-1 apostrophe
FROM <table>
WHERE REGEXP_LIKE(<column with offending character>, CHR(146));
DELETE FROM <table> WHERE REGEXP_LIKE(<column with offending character>, CHR(146));
Before you do this you need to understand what actually happened. It looks to me that someone inserted non-ascii strings in the database. For example Unicode or UTF-8. Before you fix this, be very sure that this is actually a bug. The apostrophe comes in many forms, not just the "'".
TRANSLATE() is a useful function for replacing or eliminating known single character codes.

Delimiting User Input

What is the best character to use to delimit user input?
For example if a user has an infinite number of textboxes to type things into, but each textbox's value will be concatenated into a single database field, what is the safest character to delimit each input?
I think it should be a character not on your typical keyboard. Is there a character out there just for this?
You could use one of the ASCII control characters. There's one called "Record Separator" which has a hex value of 0x1E that might fit your needs.
Edit: Incidentally, if you want to do a proper job, you should probably ensure that \x1E is escaped in user input. One way to do this would be to use another ASCII control character: \x1B which is the "escape" control code. Thus, "\x1E" in input becomes "\x1B\x1E" and "\x1B" becomes "\x1B\x1B".
Keep in mind, of course, that because these are non-printing control codes, they can't be displayed. If you want a printable representation, you might want to go with a normal character like the comma and just escape it from input.
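The escaping scheme described above can be sketched in Python as follows. The function names are made up for illustration, and how the packed string is stored in the database field is left out.

```python
SEP = "\x1e"  # ASCII Record Separator
ESC = "\x1b"  # ASCII Escape

def join_fields(fields: list[str]) -> str:
    """Escape ESC and SEP inside each field, then join the fields with SEP."""
    escaped = [f.replace(ESC, ESC + ESC).replace(SEP, ESC + SEP) for f in fields]
    return SEP.join(escaped)

def split_fields(packed: str) -> list[str]:
    """Reverse join_fields: split on unescaped SEP and undo the escaping."""
    fields, current, i = [], [], 0
    while i < len(packed):
        ch = packed[i]
        if ch == ESC and i + 1 < len(packed):
            # The character after ESC is taken literally.
            current.append(packed[i + 1])
            i += 2
        elif ch == SEP:
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields
```

The ESC replacement has to run before the SEP replacement; otherwise the escape characters added for separators would themselves be doubled. Note that an empty list and a list holding one empty string both pack to an empty string, so store the field count separately if that distinction matters.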
I guess one approach is to use a comma, and then to escape commas within the user input. It's probably not safe to assume any character (or even a sequence of characters) can't appear in user input -- if you can enter it in your code, then there's a way the user can enter it into a text box!
Normally commas or semi-colons are used for splitting data. What about | which the average user never uses?
How about a combination of keys? e.g.
|::|
so
this|::|and|::|that. Plus Those:Here and there.|::|Even this|that works
Any markup language will do for this. They're a little verbose but at least they'll be future proofing your field.
use ♥
ftw