Strange interaction between JDOM and SSIS

I apologize for the long post, but this problem is not easily stated.
I recently wrote a piece of Java to reconfigure some SSIS packages for a colleague, using jdom to parse and manipulate the XML. The program worked, but the resulting files crashed. We were able to trace the crash to an odd mostly-nonprinting character in the original files, which was not reproduced in the files written by jdom.
What's strange about this character is that it doesn't show up in all editors. The Oxygen XML editor, for example, doesn't even see it. However, in notepad, the original copyright notice appears like this:
<DTS:Property DTS:Name="TaskContact">Execute SQL Task; Microsoft Corporation; Microsoft
SQL Server v9; Â© 2004 Microsoft Corporation; All Rights
Reserved;http://www.microsoft.com/sql/support/default.asp;1</DTS:Property>
and the transformed version of the same element:
<DTS:Property DTS:Name="TaskContact">Execute SQL Task; Microsoft Corporation; Microsoft
SQL Server v9; © 2004 Microsoft Corporation; All Rights
Reserved;http://www.microsoft.com/sql/support/default.asp;1</DTS:Property>
(the problem character is the Â just before the copyright symbol)
Running a global replace on the packages in question, where Â -> "" and © -> "(c)", made the problem go away. It now turns out, however, that the problem comes back when unmodified elements are put into the modified packages, so I'm no longer sure what is at the root of the problem.
Again, I'm sorry for the long post, but I didn't want to leave out any details. Any insights or suggestions would be greatly appreciated; I'm pretty well stumped.
My colleague will be sending me the error messages from his attempts to load these; I can post those if they're useful.

As to the root of the problem: the file is being written in one encoding and read in another. See my answer to this question: "£ becomes Â£ Why? XML ISO encoding issue?"
When reading that answer, just substitute the copyright symbol, © (Unicode U+00A9), for the pound sign, £. Hopefully you can find the place where the encoding mix-up is occurring.
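To make the mix-up concrete, here is a minimal Python sketch (not the poster's Java/JDOM code). It assumes the original packages are UTF-8 and that Notepad is reading them as Windows-1252; both are assumptions, not something confirmed in the question.

```python
# The copyright sign is the single code point U+00A9.
text = "\u00a9 2004 Microsoft Corporation"

# Written as UTF-8, that one character becomes two bytes: C2 A9.
utf8_bytes = text.encode("utf-8")

# Read back as Windows-1252 (assumed here to be what Notepad calls ANSI),
# each of those bytes is shown as its own character: U+00C2 then U+00A9.
misread = utf8_bytes.decode("cp1252")

# Undoing the wrong decode recovers the original text, which shows that
# the bytes on disk were never damaged, only misinterpreted.
repaired = misread.encode("cp1252").decode("utf-8")

print(utf8_bytes[:2].hex())
print([hex(ord(ch)) for ch in misread[:2]])
print(repaired == text)
```

If that is what is happening, the extra character is not really in the file at all; it is the first byte of the UTF-8 copyright sign as seen by a reader using the wrong encoding.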

How to parse an OFX (version 1.0.2) file in Power BI?

I just read
How to parse a OFX (Version 1.0.2) file in PHP?
I am not a developer. What easy tool can I use to run that code with no coding skill or inclination? Running it through a web browser is pretty hard for non-developers.
I need this to load the file into Power BI, which accepts M code, a JSON source, or XML, but not SGML OFX or PHP.
Thanks in advance
Welcome Didier to StackOverflow!
I'm going to try to give you a clue as to how I'd approach the problem. But keep in mind that your question really lacks the details we need to help you, so I'm asking you to update it with example data that you want to integrate into PowerBI. Also, I'm not too familiar with PowerBI or PHP, and I won't go into making the PHP code you linked run for you.
Rather, I'd suggest converting your OFX file into XML, and then using PowerBI's XML import on that converted file.
From your linked question, I get that your OFX file is in SGML format. There's a program specifically designed to convert SGML into XML (which is just a restricted form of SGML) called osx. I've detailed how to install it on Linux and Mac OS in another question related to SGML-to-XML down-converting; if you're on Windows, you may have luck by just downloading a really ancient (32bit) version of it from ftp://ftp.jclark.com/pub/sp/win32/sp1_3_4.zip. Alternatively, you can use my sgmljs.net software as explained in Converting HTML to XML though that tutorial is really about the much more complex task of converting HTML to XML/XHTML and will probably confuse you.
Anyway, if you manage to install osx, running it on your OFX file (which I assume to have the name yourfile.ofx just for illustration) is just a matter of invoking (on the Windows or Linux/Mac OS command line):
osx yourfile.ofx > yourfile.xml
This produces yourfile.xml, which you can then attempt to load with PowerBI.
Chances are your OFX file has additional text at the beginning (lines like XYZ:0001 that come before <ofx>). In that case, you can just remove those lines using a text editor before invoking osx on it. Maybe you also need a .dtd file or additional instructions at the top of the OFX file informing SGML about the grammar of your file; it's really difficult to say without seeing actual test data.
Before bothering with SGML and all that, however, I suggest removing those first few lines in your OFX file (everything up to the first < character) and checking whether PowerBI already recognizes the changed input file as XML (which, judging from other OFX example files, has a good chance of succeeding). Be sure to work on a copy of your original file rather than overwriting it. Then come back and update your question with your results and example data.
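If editing the file by hand is awkward, the "drop everything before the first <" step can be sketched in a few lines of Python. The function names are made up for illustration. It works on raw bytes so it makes no assumption about the file's encoding, and it writes to a separate file so the original is left untouched.

```python
from pathlib import Path


def strip_ofx_header(data: bytes) -> bytes:
    """Return the data starting at the first '<'; unchanged if there is none."""
    start = data.find(b"<")
    return data if start == -1 else data[start:]


def write_stripped_copy(source: Path, target: Path) -> None:
    """Write a header-free copy of source to target (hypothetical helper)."""
    target.write_bytes(strip_ofx_header(source.read_bytes()))


# Made-up sample data: a few header lines followed by the markup.
sample = b"OFXHEADER:100\nDATA:OFXSGML\nVERSION:102\n\n<OFX>\n</OFX>\n"
print(strip_ofx_header(sample))
```

For a real file you would call something like write_stripped_copy(Path("yourfile.ofx"), Path("yourfile.xml")). Whether the result is well-formed XML still depends on your data, since SGML allows elements without closing tags.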

VisualStudio 2013 weird compile issues

Issue
I'm having some really odd compile issues using Visual Studio 2013, and it's really disrupting my team's workflow.
The issue is hard to explain, but I will provide screenshots and code snippets to help people understand the problems we are facing.
We have a project that we recently moved from Visual Studio 2010 to Visual Studio 2013 and upgraded to .NET 4.5; the project is an ASP.NET Web Forms project.
The code compiles and runs, but oddly, when I change any of the class files, sometimes even just by adding a comment such as 'Test Comment, it fails to compile.
The errors shown in the error window are all weird, and IntelliSense shows errors in the wrong place; some of the errors are even completely off. An example is "_To is not defined" on the line Dim _Town as String, or, on the same line, "'ring' is not defined", which is obviously part of the word String.
ScreenShots
Here are some of the errors after I added the failing code at line 44 and then commented it out and re-compiled
What I've tried
I've tried changing the files' line endings to make sure they are all Windows CR+LF, and I've tried snooping in the build output, but there is nothing I can see there that helps.
I even pulled the solution down from source control on another machine to test, and it had the same issue. It didn't actually compile properly at all on the new machine, but I don't know yet whether these two issues are related.
I had the same issue as yours, and also the same scenario (I had upgraded a very old VB.NET project to a 2013 project).
The issue seems to be related to file encoding. I don't know the exact cause, but it might be a result of having multiple files with different encodings (in my case, some files were ANSI and others were UTF-8 with BOM).
If you aren't sure about having files with different encoding, open them in Notepad++. You should see the file encoding in the bottom-right corner.
First, I converted the offending file to ANSI to see whether that would resolve the problem. I opened the offending file in Notepad++, selected Encoding -> Convert to ANSI, saved, selected Encoding -> Encode in UTF-8 without BOM, saved again, and reloaded the file in VS. After that, the project compiled successfully.
However, I didn't want to do this every time I changed the file (since VS converts it back to UTF8). So I copied all the old files (in my case there were only 4) to a temporary directory, deleted them from VS, created new files with the same names, and copy/pasted the content into each one. Now all my files are in UTF8, and I no longer have this issue.
The solution is to give all your files the same encoding: either convert them all to ANSI, or convert them all to UTF8. (UTF8 seems to be the default encoding for newly created files in Visual Studio, so I suggest converting them to UTF8.)
If you have a lot of files I think you can try to convert them to UTF8 using Notepad++.
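If there are too many files to convert by hand, the same conversion can be scripted. The sketch below is only an illustration, not something I ran against this project. It assumes "ANSI" means Windows-1252, that the sources match *.vb, and that UTF-8 with a BOM is the wanted target; all three are adjustable parameters. Run it on a copy or on a clean source-control checkout.

```python
import codecs
from pathlib import Path


def decode_source(data: bytes, ansi_codec: str = "cp1252") -> str:
    """Decode bytes using the BOM if present, else UTF-8, else the ANSI codec."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        return data.decode("utf-16")
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in the ANSI code page.
        return data.decode(ansi_codec)


def convert_tree_to_utf8(root: Path, pattern: str = "*.vb", with_bom: bool = True) -> int:
    """Rewrite every matching file under root as UTF-8; return the file count."""
    target = "utf-8-sig" if with_bom else "utf-8"
    count = 0
    for path in root.rglob(pattern):
        text = decode_source(path.read_bytes())
        # Working on bytes keeps the existing CR+LF line endings untouched.
        path.write_bytes(text.encode(target))
        count += 1
    return count
```

A call such as convert_tree_to_utf8(Path("path/to/solution")) would then rewrite every .vb file under that folder; the path is a placeholder.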
Regards.

How can I find out what's causing differences in generated Sandcastle docs?

In Noda Time, we generate our documentation using Sandcastle and SHFB. We then commit the documentation back into the source repository - primarily because that makes it easy to view the latest (and historical) docs.
I'm the primary developer for the project, but I use two computers - and unfortunately, at the moment they're building different documentation even though they're both updated to the same source.
The two computers are the same in every important way I can think of:
Sandcastle 2.7.2.0
SHFB 1.9.6.0
VS 2012 Professional (both reported version 11.0.50727.1 in "Programs", both "Version 11.0.51106.01 Update 1" in the "About" page)
Latest version of local help content for .NET Framework 4.5 (and no local help content for other framework versions)
Steps taken to ensure a clean build:
Deleted the SHFB cache folder (C:\Users\Jon\AppData\Local\EWSoftware\Sandcastle Help File Builder\Cache)
Deleted the folder the documentation is generated into
Deleted the user settings file related to the SHFB project file
Deleted the symbol cache in Visual Studio
Still the differences remain. They appear to be limited to documentation inherited from MSDN itself, in particular Object.Finalize.
Version 1 (generated on machine "Chubby"):
<div class="summary">Allows an object to try to free resources and perform
other cleanup operations before it is reclaimed by garbage collection.</div>
Version 2 (generated on machine "Sandy"):
<div class="summary">Allows an <a
href="http://msdn2.microsoft.com/en-us/library/e5kfa45b" target="_blank">
Object</a> to attempt to free resources and perform other cleanup operations
before the <a href="http://msdn2.microsoft.com/en-us/library/e5kfa45b"
target="_blank">Object</a> is reclaimed by garbage collection.</div>
Both link to the same MSDN documentation, which looks like version 1 (no links to Object).
Looking at a few of the changed files, the change is consistent and restricted to this member.
Where might Sandcastle be getting this documentation from, and how can I get both computers to behave the same way?
EDIT: One more fragment of information - after cleaning the cache and rebuilding the docs on both machines, there are three files in the SHFB Cache directory:
Reflection.cache has the same size on both machines
MsdnUrl.cache has the same size on both machines
.NETFramework_4.0.0319_E8879A28.cache has size 13,377,733 bytes on Chubby, and 13,337,949 bytes on Sandy
EDIT: Significant progress! I've found where the difference is probably coming from...
The file c:\Windows\Microsoft.NET\Framework\v2.0.50727\en\mscorlib.xml:
On Chubby is 8,005,263 bytes with a date of 12th December 2011, and has the non-linked text for Finalize
On Sandy is 9,740,370 bytes with a date of 31st August 2009, and has the text for Finalize which includes links
On both machines, mscorlib.dll itself is the same size (4,550,656 bytes) and has a modified date of 13th September 2012.
But how can I get them to be the same? Where does that difference come from? (Service packs?)
EDIT: Okay, the version in c:\Windows was a red herring - it's the version in c:\Program Files (x86)\Reference Assemblies\Microsoft\Framework that's to blame. I'm going to see if I can find out why that might be different between installations...
A couple of ideas considering your recent edits, although I agree this is a bit of a shot in the dark...
I would use a tool like "Beyond Compare" to compare the .Net Framework files and XML files on both machines ("folder compare" profile).
Favour the binary level comparison to be perfectly sure... if both of your machines are local, it should be very fast.
You can also try to run Mark Russinovich's Process Monitor ( http://live.sysinternals.com/procmon.exe ) on both machines and run the documentation building process.
This way, you will see which files are being read from and involved in the help file building process, and where they are coming from...
You will get a lot of output as it will show everything that happens in your system; you may want to disable registry and network monitoring, to only leave file monitoring, and also exclude any process unrelated to the documentation building process.
I'm not a help generation expert, but I would think that the text comes from the XML files, so you may also want to add a filter that shows only the XML files.
If you can identify the files involved, then you might just have to copy them from one machine over to the other.
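If a comparison tool is not available on both machines, the binary comparison can be approximated with a small script: hash the XML files on each machine, then diff the two listings. This is a rough sketch with made-up function names, and the *.xml pattern is an assumption about which files matter.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path, pattern: str = "*.xml") -> dict[str, str]:
    """Map each matching file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            manifest[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def diff_manifests(left: dict[str, str], right: dict[str, str]) -> list[str]:
    """Return the names that are missing on one side or whose digests differ."""
    names = left.keys() | right.keys()
    return sorted(name for name in names if left.get(name) != right.get(name))
```

Run hash_tree on the same folder on each machine, save the results (for example as JSON), and pass both to diff_manifests to get the list of files worth inspecting.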

Visual Studio 2008 - Default Encoding style with .SQL files

Tools used:
Visual Studio 2008
Team Foundation Server or Visual Source Safe
Backstory:
We add our SQL files to our source control. We do this by adding them to the solution with the .sql extension and checking them in.
By default these files are saved as unicode. What that means is that user A can save foo.sql and user B can get latest and grab foo.sql
Problem:
Unfortunately, since the encoding is Unicode by default, if foo.sql happens to have a file size that is divisible by 8 bits, the system will open the file in the wrong format. This makes the file look as though it contains Chinese characters instead of normal SQL statements.
This can be fixed if user A manually changes the encoding type to Western European, but that's a huge pain. It's also very difficult to notice that user A forgot to set the encoding manually until a problem occurs.
Question:
Is there a way to have Visual Studio make Western European the default encoding for SQL files? Is there a way to batch update the encoding type of files in Visual Studio?
You can update the Visual Studio default files. See the answers in this post:
SQL file encoding in Visual Studio
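Separately from changing the defaults, the "hard to notice" part can be handled by checking the files before check-in. The sketch below is not a Visual Studio feature; it is a small standalone script with made-up names that reports which .sql files start with a Unicode byte order mark.

```python
import codecs
from pathlib import Path

# Byte order marks to look for, with the label to report for each.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8 with BOM"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def sniff_bom(data: bytes) -> str:
    """Name the encoding implied by a leading BOM, or 'no BOM' if none is found."""
    for bom, label in BOMS:
        if data.startswith(bom):
            return label
    return "no BOM"


def report(root: Path, pattern: str = "*.sql") -> dict[str, str]:
    """Map every matching file under root to its sniffed BOM label."""
    return {p.as_posix(): sniff_bom(p.read_bytes()[:4]) for p in sorted(root.rglob(pattern))}
```

A file reported as "no BOM" may still be ANSI or BOM-less UTF-8; the script cannot tell those apart, it only flags the files that were saved as Unicode with a BOM.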

Pex Generated Tests Encoded UCS-2 Little Endian, Why, how to change?

Hi there,
I noticed that when I generate a Pex test solution, the default encoding of the files is UCS-2 Little Endian. This is a problem, because all the rest of the files are normally encoded as Windows ANSI
(I'm getting this info from Notepad++), and it's confirmed by my CI breaking.
Does anyone know:
1) why is it using this encoding?
2) how to change it so by default it uses Windows ANSI like the rest of the files
NOTE: I know this is the issue because I saved the file with Windows ANSI encoding and it all works.
I know I probably shouldn't have, but I went and posted this same question on the Pex forum
link to the question
and this was the answer from Peli (he is heavily involved in the Pex project, AFAIK)
Copy of the Answer
1) why is it using this encoding?
There is no particular reason for this, other than that we decided to use this particular encoding. We will switch to Windows-1252 (ANSI) encoding for source files in the future. XML files will still be encoded as UTF-8.
2) how to change it so by default it uses Windows ANSI like the rest of the files
Unfortunately, this is hard-coded in Pex and you cannot change this. The next release of Pex (0.93) will use ANSI.