When converting from iso-8859-1 to utf-8, Latin Capital Letter A with Circumflex (Â) always comes along with § - sql

Consider the steps below.
vi tgh.sql
Copy the content below into the file and save it:
select to_char(sysdate,'DD-MON-YYYY')||'§' from dual
Now check the encoding of this file:
file -i tgh.sql
You will see output as below:
tgh.sql: text/plain; charset=iso-8859-1
Now convert it to UTF-8 using the command below:
iconv -f iso-8859-1 -t utf-8 tgh.sql -o tghconv.sql
Let's check the encoding of the new file:
file -i tghconv.sql
You will see output as below:
tghconv.sql: text/plain; charset=utf-8
But if you cat this new file, you will see a completely different result:
cat tghconv.sql
You will see output as below:
select to_char(sysdate,'DD-MON-YYYY')||'Â§' from dual
We see an extra Â in the output. It was not there when we created the file. I understand that the UTF-8 byte sequence for § is 0xC2 0xA7, that the UTF-16 code unit for § is 0x00A7, and that the code for Â is 0x00C2, so my conclusion is that the file is somehow getting converted to UTF-16. How can I avoid this? I expect my output to be as below even after converting to UTF-8:
select to_char(sysdate,'DD-MON-YYYY')||'§' from dual
Any suggestion?
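The symptom described above is consistent with a display problem rather than a UTF-16 conversion: the UTF-8 encoding of § is the two bytes C2 A7, and a viewer that interprets those bytes as ISO-8859-1 shows them as Â followed by §. A minimal Python sketch of that reasoning follows. It only models the bytes; whether the terminal really assumes ISO-8859-1 is an assumption about the asker's setup.

```python
# Model what "iconv -f iso-8859-1 -t utf-8" does to the section sign.
latin1_bytes = "§".encode("iso-8859-1")  # one byte: a7
utf8_bytes = latin1_bytes.decode("iso-8859-1").encode("utf-8")  # two bytes: c2 a7

# A viewer that assumes ISO-8859-1 renders each UTF-8 byte as its own character.
misread = utf8_bytes.decode("iso-8859-1")

print(latin1_bytes.hex(), utf8_bytes.hex(), misread)
```

If this is what is happening, the converted file is already correct, and the thing to change would be the terminal or locale encoding (so that it reads UTF-8), not the file.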
Another update to code
Considering that the issue might be with my keyboard, I slightly changed the test approach above. Further details are below.
vi tgh.sql
Copy the content below into the file and save it:
select to_char(sysdate,'DD-MON-YYYY')||unistr('\00A7') from dual
Execute the above SQL file using Oracle sqlplus as below:
_resultset=$(sqlplus -s username/password@TNSAliasnameforDB <<EOF
whenever sqlerror exit sql.sqlcode
whenever oserror exit sql.sqlcode
set heading off feedback off linesize 32736 trimspool on termout off tab off pagesize 0 long 100000 longc 100000;
spool tgh.out;
@tgh.sql
/
spool off;
EXIT
EOF
)
Now let's open the output file tgh.out:
cat tgh.out
Running this, we get the output 03-AUG-2022Â§. Again the result is the same: there is an additional Â in the output. If we check the encoding of the file (by running file -i tgh.out), we see that it is utf-8.
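To tell a display problem apart from a real double conversion, it can help to look at the raw bytes instead of what cat shows. The sketch below is a hypothetical helper (the function name and labels are made up for illustration) that classifies how the section sign is stored: A7 alone is ISO-8859-1, C2 A7 is correct UTF-8, and C3 82 C2 A7 means UTF-8 text was converted to UTF-8 a second time.

```python
def classify_section_sign(data: bytes) -> str:
    """Guess how the section sign (U+00A7) is stored in raw bytes."""
    # Check the longest pattern first, because it contains the shorter ones.
    if b"\xc3\x82\xc2\xa7" in data:
        return "double-encoded UTF-8"  # an actual Â (c3 82) precedes the § (c2 a7)
    if b"\xc2\xa7" in data:
        return "utf-8"  # correct UTF-8; a visible Â would be a display issue
    if b"\xa7" in data:
        return "iso-8859-1"
    return "not found"


# Stand-in for the contents of tgh.out; with a real file, pass
# open("tgh.out", "rb").read() instead.
print(classify_section_sign("03-AUG-2022§".encode("utf-8")))
```

A "utf-8" result would point at the terminal. A "double-encoded UTF-8" result would point at a conversion step on the way out of the database, possibly a client character set setting, though that part is a guess.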

Related

How export CSV with SQLCMD in UTF-16?

I am trying to export the data from different queries with SQLCMD. My files must be exported in UTF-16; currently they are exported in UTF-8 with BOM, and I have also managed plain UTF-8, but I cannot get it to export in UTF-16.
Reviewing the encoding of the file in Notepad++, this is the format in which it should be exported (UTF-16 LE BOM); in the encoding list it appears under the name UCS-2 Little Endian.
SET @cmd= 'sqlcmd -s"-" -f i:1252 -Q "SELECT * FROM Machines" -f o:65001 -o "C:\CSV\report_machines.csv"'
EXEC master..xp_cmdshell @cmd
I checked the page (https://www.example-code.com/sql/load_text_file_using_code_page.asp) and tried code page 1200; however, SQL Server returns the following message:
Sqlcmd: The code page <1200> specified in option -f is invalid or not installed on this system.
I want to avoid using another program. If anyone knows how to solve this problem, I would greatly appreciate it.
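The sqlcmd documentation lists a -u switch that stores the output file in Unicode format, which may be worth trying in place of -f o:65001 before anything else. If post-processing turns out to be acceptable after all, a minimal Python sketch of re-encoding a UTF-8 export as UTF-16 LE with BOM could look like this. The function name and the second path in the usage comment are hypothetical.

```python
import codecs
from pathlib import Path


def utf8_to_utf16le_bom(src: str, dst: str) -> None:
    """Re-encode a UTF-8 file (with or without BOM) as UTF-16 LE with BOM."""
    # "utf-8-sig" drops a leading UTF-8 BOM if one is present.
    text = Path(src).read_bytes().decode("utf-8-sig")
    # Write the BOM explicitly so the byte order does not depend on the machine.
    Path(dst).write_bytes(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))


# Example call (output path is made up):
# utf8_to_utf16le_bom(r"C:\CSV\report_machines.csv", r"C:\CSV\report_machines_utf16.csv")
```

Working on bytes rather than text-mode files keeps the original line endings untouched.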

Can't export sqlite to json because the json option disappeared?

I'm trying to export specific columns of my sqlite database into json.
I found a question relevant to my issue here: Impossible to export SQLite query results to JSON?
There, they used the sqlite3 -json option to solve their problem and export their SQLite database to JSON.
However, when I try to do the same terminal command I get this in return:
sqlite3: Error: unknown option: -json
and sqlite3 -help returns the following list of options:
-append append the database to the end of the file
-ascii set output mode to 'ascii'
-bail stop after hitting an error
-batch force batch I/O
-column set output mode to 'column'
-cmd COMMAND run "COMMAND" before reading stdin
-csv set output mode to 'csv'
-deserialize open the database using sqlite3_deserialize()
-echo print commands before execution
-init FILENAME read/process named file
-[no]header turn headers on or off
-help show this message
-html set output mode to HTML
-interactive force interactive I/O
-line set output mode to 'line'
-list set output mode to 'list'
-lookaside SIZE N use N entries of SZ bytes for lookaside memory
-maxsize N maximum size for a --deserialize database
-memtrace trace all memory allocations and deallocations
-newline SEP set output row separator. Default: '\n'
-nullvalue TEXT set text string for NULL values. Default ''
-pagecache SIZE N use N slots of SZ bytes each for page cache memory
-quote set output mode to 'quote'
-readonly open the database read-only
-separator SEP set output column separator. Default: '|'
-stats print memory stats before each finalize
-version show SQLite version
-vfs NAME use NAME as the default VFS
json is nowhere to be found. Very strange, because the documentation shows the following:
$ sqlite3 --help
Usage: ./sqlite3 [OPTIONS] FILENAME [SQL]
FILENAME is the name of an SQLite database. A new database is created
if the file does not previously exist.
OPTIONS include:
-A ARGS... run ".archive ARGS" and exit
-append append the database to the end of the file
-ascii set output mode to 'ascii'
-bail stop after hitting an error
-batch force batch I/O
-box set output mode to 'box'
-column set output mode to 'column'
-cmd COMMAND run "COMMAND" before reading stdin
-csv set output mode to 'csv'
-deserialize open the database using sqlite3_deserialize()
-echo print commands before execution
-init FILENAME read/process named file
-[no]header turn headers on or off
-help show this message
-html set output mode to HTML
-interactive force interactive I/O
-json set output mode to 'json'
-line set output mode to 'line'
-list set output mode to 'list'
-lookaside SIZE N use N entries of SZ bytes for lookaside memory
-markdown set output mode to 'markdown'
-maxsize N maximum size for a --deserialize database
-memtrace trace all memory allocations and deallocations
-mmap N default mmap size set to N
-newline SEP set output row separator. Default: '\n'
-nofollow refuse to open symbolic links to database files
-nonce STRING set the safe-mode escape nonce
-nullvalue TEXT set text string for NULL values. Default ''
-pagecache SIZE N use N slots of SZ bytes each for page cache memory
-quote set output mode to 'quote'
-readonly open the database read-only
-safe enable safe-mode
-separator SEP set output column separator. Default: '|'
-stats print memory stats before each finalize
-table set output mode to 'table'
-tabs set output mode to 'tabs'
-version show SQLite version
-vfs NAME use NAME as the default VFS
-zip open the file as a ZIP Archive
As you can see, -json is clearly listed as an option (as well as many others that don't show up for me).
What's going on?
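The options missing from the first list (-box, -json, -markdown, -table) are the output modes that were added to the sqlite3 shell in SQLite 3.33.0, so an older installed binary would explain the difference; sqlite3 -version shows which release is on the PATH. Where upgrading the shell is not an option, a minimal Python sketch using the standard sqlite3 and json modules can produce similar output. The table and column names below are made up for the demo.

```python
import json
import sqlite3


def query_to_json(conn: sqlite3.Connection, sql: str) -> str:
    """Run a query and return its rows as a JSON array of objects."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]  # column names of the result set
    rows = [dict(zip(columns, row)) for row in cur.fetchall()]
    return json.dumps(rows, ensure_ascii=False)


# Demo with an in-memory database; selecting specific columns works as usual.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, email TEXT)")
conn.execute("INSERT INTO people VALUES (1, 'Ana', 'ana@example.com')")
print(query_to_json(conn, "SELECT id, name FROM people"))
```

BLOB columns would need extra handling, since json.dumps cannot serialize bytes.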

Creating a Format File for Bulk Import

I am trying to create a format file to bulk import a .csv file, but I am getting an error.
The command I used:
"BCP -SMSSQLSERVER01.[Internal_Checks].[Jan_Flat] format out -fC:\Desktop\exported data\Jan_FlatFormat.fmt -c -T -Uasda -SMSSQLSERVER01 -PPASSWORD"
I am getting this error:
"A valid table name is required for in, out, or format options."
Can anyone suggest what I need to do?
According to the bcp Utility documentation the first parameter should be a [Database.]Schema.{Table | View | "query"}, so don't put -SMSSQLSERVER01 where you've got it. Also use format nul instead of format out.
Try using:
bcp.exe [Internal_Checks].[Jan_Flat] format nul "-fC:\Desktop\exported data\Jan_FlatFormat.fmt" -c -SMSSQLSERVER01 -T -Uasda -PPASSWORD
Note the quotes " around the -f switch because your path name contains space characters.
Also note that the -c switch causes single-byte characters (ASCII/OEM/codepage with SQLCHAR) to be written out. If your table contains nchar, nvarchar or ntext columns you should consider using the -w switch instead so as to write out UTF-16 encoded data (using SQLNCHAR).

how to guess file encoding

I have a file (an author list from the Library of Congress) with lines like:
Arteaga, Ana Mar�ia
Corval�an-V�asquez, Oscar E.
(when printed to linux console)
I'd like to read those (either into a pandas dataframe or a set of lines)
df = pd.read_csv(fname, sep='\t', header='infer', lineterminator=None,encoding='latin1') #lineterminator \r\n hits error
or
with open(fname,'r',encoding='ISO-8859-1') as fp:
    lines=fp.readlines()
but neither is quite right, giving me output like
Arteaga, Ana Marâia
(again when printed to console)
when I am pretty sure the actual name here should be María.
Does someone recognize this format?
OK, this seems to be the 'marc-8' format.
yaz-iconv -f marc8 -t utf8 infile.txt > outfile.txt
took care of the conversion to UTF-8, with the sole hiccup being that yaz removed all the line terminators (for both the \r\n and \n versions of the file).
Those can be restored with something along the lines of
sed 's/\[/\n\[/g' outfile.txt > outfile_utf.txt
(for example in my case where each line starts with a '[' character)
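The "Marâia" output from the latin1 attempts fits MARC-8: its acute accent is the byte 0xE2, stored before the letter it modifies, and 0xE2 happens to be â in ISO-8859-1. The toy Python sketch below illustrates only that one accent; it is not a MARC-8 decoder, and the raw bytes are reconstructed from the console output shown in the question, so treat them as an assumption.

```python
import unicodedata

MARC8_ACUTE = 0xE2        # MARC-8 (ANSEL) acute accent, stored BEFORE its base letter
UNICODE_ACUTE = "\u0301"  # Unicode combining acute, stored AFTER its base letter


def decode_marc8_acute_only(data: bytes) -> str:
    """Toy decoder: handles ASCII plus the MARC-8 acute accent, nothing else."""
    out = []
    i = 0
    while i < len(data):
        if data[i] == MARC8_ACUTE and i + 1 < len(data):
            # Swap the order: base letter first, then the combining accent.
            out.append(chr(data[i + 1]) + UNICODE_ACUTE)
            i += 2
        else:
            out.append(chr(data[i]))
            i += 1
    # NFC merges letter + combining accent into one precomposed character.
    return unicodedata.normalize("NFC", "".join(out))


raw = b"Arteaga, Ana Mar\xe2ia"      # assumed raw bytes of the first sample line
print(raw.decode("latin1"))          # what the latin1 attempts produce
print(decode_marc8_acute_only(raw))  # accent moved onto the following letter
```

For real data, a full converter such as the yaz-iconv command above is the safer route, since MARC-8 has many more diacritics and escape sequences than this sketch covers.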

Running SQL from Batch File [duplicate]

I am writing a shell script and want to use sqlplus. When I write:
#!/bin/bash
result=$(sqlplus -s user/pass@DB << EOF
set trimspool on;
set linesize 32000;
SET SPACE 0;
SELECT MAX(DNNCMNT_ANSWER_TIME) FROM TKIMEI.DNNCMNT_IMEI_APPRV;
/
exit;
EOF
)
echo "$result"
the result in the txt file is (I'm executing it as ksh sql.sh > result.txt):
MAX(DNNCM
---------
10-MAR-14
MAX(DNNCM
---------
10-MAR-14
It automatically puts an empty line at the beginning of the file and writes the result twice.
How can I fix it?
Remove the slash. It's causing the previous command (the select) to be repeated:
http://docs.oracle.com/cd/B10501_01/server.920/a90842/ch13.htm#1006932
Also, talk to your DBA about setting up external OS authentication so you don't have to hardcode the password in a shell script for security reasons. Once set up, you can replace the login/password combo with just a slash:
http://docs.oracle.com/cd/E25054_01/network.1111/e16543/authentication.htm#i1007520