MSI DB, Visual Basic and cp1252 encoded string problems - vba

I have this line of code (generated by MakeMSI):
oRec.StringData(2) = "A publicitar a aplicação"
oRec is a record from an MSI database, opened with:
oInstaller = MkObject("WindowsInstaller.Installer")
oMsi = oInstaller.OpenDatabase(MsiName, msiOpenDatabaseModeDirect)
oMsi.OpenView(selectQuery)
After executing and committing, the string "A publicitar a aplicação" is converted to "A publicitar a aplicaçao" (ã is converted to a) in the database. I'm 100% sure the database is cp1252 encoded, because when I edit the field manually and insert ã, it displays correctly. Any ideas how to work around this?
EDIT:
When building the installer on Portuguese Windows, everything is fine.

What is the codepage of the computer where you edit the property?
I don't know if VBA uses Unicode internally to store strings or not. If it does, then it should work on any computer; if it does not, then it should work correctly only where the system code page supports ‘ã’.
So another part of the problem is the source file itself: to work as expected it should be Unicode-enabled (UTF-8 or UTF-16), and the interpreter should handle it this way. Otherwise, you'll get unexpected results wherever the current code page is not compatible with cp1252.
Check the setting for Language for non-Unicode programs in the Regional settings in Windows. It should be set to Portuguese.
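To see why the system code page matters, here is a minimal Python sketch (Python rather than VBScript; `simulate_best_fit` is a hypothetical helper, not part of any Windows Installer API). It assumes, purely for illustration, that the build machine's ANSI code page is cp1250, which contains ç but not ã, and it imitates an accent-dropping fallback with Unicode decomposition. The real Windows conversion tables may behave differently.

```python
import unicodedata


def simulate_best_fit(text, codepage):
    """Keep characters the code page can encode; strip accents from the rest.

    This only approximates a lossy ANSI conversion; it is an assumption,
    not a reimplementation of what Windows actually does.
    """
    out = []
    for ch in text:
        try:
            ch.encode(codepage)  # representable in the target code page
            out.append(ch)
        except UnicodeEncodeError:
            # Decompose (e.g. a-tilde -> 'a' + combining tilde), drop the marks.
            decomposed = unicodedata.normalize("NFKD", ch)
            base = "".join(c for c in decomposed if not unicodedata.combining(c))
            out.append(base or "?")
    return "".join(out)


original = "A publicitar a aplica\u00e7\u00e3o"
print(simulate_best_fit(original, "cp1252"))  # code page that contains both letters
print(simulate_best_fit(original, "cp1250"))  # code page that lacks a-tilde
```

Under these assumptions only the ã is degraded, which matches the symptom in the question.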

Related

UTF-8 encoding for .sql files created or modified in SSMS

The default encoding for files saved by my SSMS (v18) is ASCII, not UTF-8.
I’ve tried the steps below to set the default encoding to UTF-8 (so that I wouldn’t have to remember to set it every time I create/modify a file), but the default encoding remains ASCII.
Outside of SSMS, I can change the encoding in a text editor, but I don’t want to have to do that every time.
Have you encountered this issue and, if so, how did you resolve it?
Steps I tried:
From within SSMS, open the “template” file: C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Binn\ManagementStudio\SqlWorkbenchProjectItems\Sql\SQLFile.sql.
Re-save using correct encoding:
File => Save As
Click the arrow next to the Save button
Choose the relevant encoding: Unicode (UTF-8 with signature) - Codepage 65001
This is supposed to result in all new query windows having UTF-8 as the default encoding. But, this doesn’t work for me.
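One way to check whether the re-save in the steps above actually took effect is to look for the UTF-8 signature (byte order mark) at the start of the file. This is only a diagnostic sketch in Python; the helper names are hypothetical, and it does not change how SSMS chooses its default encoding.

```python
import codecs


def has_utf8_signature(data):
    """Return True if the bytes start with the UTF-8 signature (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def file_has_utf8_signature(path):
    """Read only the first three bytes of the file and test them."""
    with open(path, "rb") as f:
        return has_utf8_signature(f.read(len(codecs.BOM_UTF8)))
```

Calling `file_has_utf8_signature` on the template file, and then on a newly saved query file, would show at which point the signature is lost.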

Issue creating a language model for Sinhala using SRILM

I'm trying to create a Sinhala voice recognition system using pocketsphinx. I use the SRILM tool to create the language model. My source files for creating the language model are here. I'm using Cygwin on Windows 8.1 to run SRILM 1.7.1. But once I run the command
ngram-count -vocab sinhalalexicon.txt -text sinhalacorpus.Train -order 3 -write sinhala.count -unk
I'm getting
iconv: Invalid or incomplete multibyte or wide character
iconv: Invalid or incomplete multibyte or wide character
What did I do wrong here? The sinhalacorpus.Train file was created manually using Notepad++.
I found the solution to my issue. Once I converted the corpus and lexicon files to Unix format and changed the encoding to UTF-8 without BOM, it worked. I used Notepad++ to make the changes.
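The same conversion can be scripted instead of done by hand in Notepad++. The following is a minimal Python sketch, assuming the file already contains UTF-8 text and only the BOM and the Windows line endings need to go; `to_unix_utf8_no_bom` is a hypothetical helper name.

```python
import codecs


def to_unix_utf8_no_bom(raw):
    """Strip a leading UTF-8 BOM and convert CRLF line endings to LF."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    # Fail loudly if the input is not valid UTF-8 (assumption stated above).
    raw.decode("utf-8")
    return raw.replace(b"\r\n", b"\n")


def convert_file(path):
    """Rewrite the file in place; binary mode avoids any newline translation."""
    with open(path, "rb") as f:
        raw = f.read()
    with open(path, "wb") as f:
        f.write(to_unix_utf8_no_bom(raw))
```

It would be applied to both files named in the command, for example `convert_file("sinhalacorpus.Train")` and `convert_file("sinhalalexicon.txt")`.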

doxygen latex make fails for input encoding error

I have a git repo project in eclipse which I have been documenting using doxygen (v1.8.4).
If I run the latex make on a fresh clone of the project it runs fine and the PDF is made.
However, if I then run a doxy build, which completes OK, then attempt to run the latex make, it fails for
! Package inputenc Error: Keyboard character used is undefined
(inputenc) in inputencoding `utf8'.
See the inputenc package documentation for explanation.
Type H <return> for immediate help.
...
I have tried switching the encoding of the doxyfile by setting DOXYFILE_ENCODING to ISO-8859-1 with no change in the result... How can I fix this?? Thanks.
EDIT: I have used no non-UTF-8 chars as far as I know in my files, the file referenced before the error is very short and definitely doesn't have non-UTF-8 chars in it. I've even tried clearing my latex output dir and building from scratch with no luck...
EDIT: I realised that the doxy build only appears to run correctly. It doesn't show any errors, but it should, for example, run DOT and build about 10 graphs. The console output says Running dot, but it doesn't say generating graph (n/x) like it should when it actually makes the graphs...
Short answer: So by a slow process of elimination I found that this was caused by a single apostrophe in a file that had appeared to be already built and made without error!!
Long answer: First I used the project properties to flip the encoding from the default Cp1252 to UTF-8. Then I removed files one by one, rebuilding and remaking after each removal, until the make ran successfully. I re-added all the files but deleted the content of the most recently removed file and tested the make, to confirm that it was this file and only this file that caused the issue. The make ran fine. So I pasted the content back into the empty file and started deleting smaller and smaller sections of it, again rebuilding and remaking each time, until I was left with a good make without the apostrophe and a bad one with it... I simply retyped the apostrophe (as this would then force it to be a UTF-8 char) and success!! Such an annoying bug!
Dude, you did it the hard way. Why not let Python do the work for you:
import sys

fn = sys.argv[1]  # path of the file to scan
with open(fn, "rb") as f:
    data = f.read()
for i, ch in enumerate(data):  # iterating over bytes yields ints in Python 3
    if ch > 0x7F:  # non-ASCII byte
        print("byte: 0x%02X, idx: %d, file: %s" % (ch, i, fn))
        # show up to 30 bytes of context on each side
        print("txt: %r" % data[max(0, i - 30):i + 30])
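A variation on the same idea reports only the bytes that are not valid UTF-8, so legitimate multi-byte characters are not flagged. This is a sketch, not something from the answer above: `invalid_utf8_offsets` is a hypothetical helper, and the 0x92 example assumes the stray apostrophe was a Cp1252 right single quote.

```python
def invalid_utf8_offsets(data):
    """Return the byte offsets at which UTF-8 decoding of `data` fails."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder decodes cleanly
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice data[pos:].
            offsets.append(pos + err.start)
            pos += err.end  # resume after the offending bytes
    return offsets


# 0x92 is the Cp1252 right single quote; on its own it is not valid UTF-8.
print(invalid_utf8_offsets(b"it\x92s a test"))
```

Running it over every source file would narrow the search to a byte offset without the remove-and-rebuild cycle.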

How to print sqlite to file in utf-8?

I've opened sqlite3.exe in windows console and made a database with special characters.
.dump showed me the sql query with special characters.
Then I changed output to file: .output file.sql
And executed the .dump command.
The special characters were missing when I imported the database using .read file.sql.
I used pragma encoding="UTF-8"; but it didn't change anything (I don't know if it should).
The Windows console makes it hard to use UTF-8 correctly, and the Microsoft compiler has lots of bugs that make it impossible to use UTF-8 with portable I/O functions.
If you have entered data in the Windows console, those strings are not valid UTF-8.
If a non-ASCII string is output with correct characters in the Windows console, it is not valid UTF-8.
To ensure that your data is valid UTF-8, you have to go through files.
Alternatively, use any SQLite shell that does not use the console (such as the SQLite Manager Firefox extension).
This works fine for CP852, but could be used for any codepage known to iconv.
chcp 852 >NUL
echo INSERT into NAMES (name,timestamp) VALUES ('ěščřžýáíé','1429001515'); | iconv.exe -f cp852 -t utf-8 | ..\utilities\sqlite3.exe test.db
Windows can handle Unicode internally, but if you print it on the console (with the 'echo' command, for example), the characters get mismatched. Re-encoding on the fly solves this problem.
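Both ideas from the answers above (re-encoding console bytes, and going through files rather than the console) can be sketched with Python's standard library. This is an illustration under assumptions, not the sqlite3.exe workflow from the question: `reencode_to_utf8`, `dump_utf8` and `load_utf8` are hypothetical helper names.

```python
import sqlite3


def reencode_to_utf8(raw, source_codepage):
    """Python counterpart of `iconv -f <codepage> -t utf-8`."""
    return raw.decode(source_codepage).encode("utf-8")


def dump_utf8(conn):
    """Return the SQL dump of a connection as UTF-8 bytes (console not involved)."""
    return "\n".join(conn.iterdump()).encode("utf-8")


def load_utf8(dump):
    """Rebuild an in-memory database from a UTF-8 encoded SQL dump."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(dump.decode("utf-8"))
    return conn
```

Writing the result of `dump_utf8` to disk with `open(path, "wb")` keeps the bytes exactly as produced, so the file can be read back without any console code page in between.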

stata odbc sqlfile

I am trying to load data from a database (either MS Access or SQL Server) using odbc sqlfile. The code seems to run without any error, but I am not getting any data. I am using the following code: odbc sqlfile("sqlcode.sql"),dsn("mysqlodbcdata"). Note that sqlcode.sql contains just a SQL SELECT statement. The same SQL code returns data with odbc load,exec(sqlstmt) dsn("mysqlodbcdata"). Can anyone suggest how I can use odbc sqlfile to import data? This would be a great help for me.
Thanks
Joy
sqlfile doesn't load any data. It just executes (and displays the results when the loud option is specified), without loading any data into Stata. That's somewhat counter-intuitive, but true. The reasons are somewhat opaquely explained in the pdf/dead tree manual entry for the odbc command.
Here's a more helpful answer. Suppose you have your SQL file named sqlcode.sql. You can open it in Stata (as long as it's not too long, where too long depends on your flavor of Stata). Basically, -file read- reads the SQL code line by line, storing the results in a local macro named exec. Then you pass that macro as an argument to the -odbc load- command:
Updated Code To Deal With Some Double Quotes Issues
Cut & paste the following code into a file called loadsql.ado, which you should put in a directory where Stata can see it (like ~/ado/personal). You can find such directories with the -adopath- command.
program define loadsql
*! Load the output of an SQL file into Stata, version 1.3 (dvmaster#gmail.com)
version 14.1
syntax using/, DSN(string) [User(string) Password(string) CLEAR NOQuote LOWercase SQLshow ALLSTRing DATESTRing]
#delimit;
tempname mysqlfile exec line;
file open `mysqlfile' using `"`using'"', read text;
file read `mysqlfile' `line';
while r(eof)==0 {;
local `exec' `"``exec'' ``line''"';
file read `mysqlfile' `line';
};
file close `mysqlfile';
odbc load, exec(`"``exec''"') dsn(`"`dsn'"') user(`"`user'"') password(`"`password'"') `clear' `noquote' `lowercase' `sqlshow' `allstring' `datestring';
end;
/* All done! */
The syntax in Stata is
loadsql using "./sqlfile.sql", dsn("mysqlodbcdata")
You can also add all the other odbc load options, such as clear, as well. Obviously, you will need to change the file path and the odbc parameters to reflect your setup. This code should do the same thing as -odbc sqlfile("sqlfile.sql"), dsn("mysqlodbcdata")- plus actually load the data.
I also added the functionality to specify your DB credentials like this:
loadsql using "./sqlfile.sql", dsn("mysqlodbcdata") user("user_name") password("not12345")
For "--XYZ" style comments, do something like this (assuming you don't have "--" in your SQL code):
if strpos(`"``line''"', "--") > 0 {;
local `line' = substr(`"``line''"', 1, strpos(`"``line''"', "--")-1);
};
I had to post this as an answer otherwise the formatting would've been all messed up, but it's obviously referring to Dimitriy's code.
(You could also define a local macro holding the position of the "--" string to make your code a little cleaner.)
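For readers who want to see the logic outside Stata, here is a rough Python sketch of what the answers above do: read the SQL file line by line, cut each line at "--", and join the pieces into one statement. It is not Stata code, `sql_text_to_statement` is a hypothetical helper, and it carries the same assumption that "--" never appears inside the SQL itself.

```python
def sql_text_to_statement(text):
    """Collapse a multi-line SQL script into a single-line statement.

    Everything from "--" to the end of a line is treated as a comment,
    which is only safe when "--" never occurs inside a string literal.
    """
    parts = []
    for line in text.splitlines():
        cut = line.find("--")
        if cut >= 0:
            line = line[:cut]  # drop the trailing comment
        line = line.strip()
        if line:
            parts.append(line)
    return " ".join(parts)


script = "SELECT a, -- first column\n  b\nFROM t\n"
print(sql_text_to_statement(script))
```

The resulting string plays the role of the exec macro that is passed to -odbc load- in the Stata program.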