Why weblate import of JSON does not respect the original order? - weblate

I've just installed Weblate to host our translation projects.
It's working fine, but the import process of our JSON files is creating the strings in a different order than the original ones.
The result is that nerby messages are not the same than in the original files; the strings are mixed, and so it will be very difficult for translators to work in that order.
Why is the original order (present in JSON files) not respected?
Is it possible to respect it?
Thanks in advance.

The reason is that translate-toolkit which is library that Weblate uses doesn't respect it. It should be quite easy to adjust jsonl10n storage to use OrderedDict to fix this.
Update: This seems to be fixed as of translate-toolkit 1.14.0-rc1.

Related

how-to-parse-a-ofx-version-1-0-2-file-in power BI?

I just read
How to parse a OFX (Version 1.0.2) file in PHP?
I am not a developer. What easy tool can I use to make this code run with no code skill or appetence ? web browser is pretty hard to use for non dev guys.
I need this to use the file into Power BI, which accept M code, json source or xml, but not sgml ofx or PHP.
Thanks in advance
Welcome Didier to StackOverflow!
I'm going to try and give you a clue how I'd approach the problem here. But keep in mind that your question really lacks details for us to help you, and I'm asking to update your question with example data that you want to integrate into PowerBI. Also, I'm not too familiar with PowerBI nor PHP, and won't go into making that PHP code you linked run for you.
Rather, I'd suggest to convert your OFX file into XML, and then use PowerBI's XML import on that converted file.
From your linked question, I get that your OFX file is in SGML format. There's a program specifically designed to convert SGML into XML (which is just a restricted form of SGML) called osx. I've detailed how to install it on Linux and Mac OS in another question related to SGML-to-XML down-converting; if you're on Windows, you may have luck by just downloading a really ancient (32bit) version of it from ftp://ftp.jclark.com/pub/sp/win32/sp1_3_4.zip. Alternatively, you can use my sgmljs.net software as explained in Converting HTML to XML though that tutorial is really about the much more complex task of converting HTML to XML/XHTML and will probably confuse you.
Anyway, if you manage to install osx, running it on your OFX file (which I assume to have the name yourfile.ofx just for illustration) is just a matter of invoking (on the Windows or Linux/Mac OS command line):
osx yourfile.ofx > yourfile.xml
to result in yourfile.xml which you can attempt to load with PowerBI.
Chances are your OFX file has additional text at the beginning (lines like XYZ:0001 that come before <ofx>). In that case, you can just remove those lines using a text editor before invoking osx on it. Maybe you also need a .dtd file or additional instructions at the top of the OFX file informing SGML about the grammar of your file; it's really difficult to say without seeing actual test data.
Before bothering with SGML and all that, however, I suggest to remove those first few lines in your OFX file (everything until the first < character) and check if PowerBI can already recognize your changed input file as XML (which, from other OFX example files, has a good chance of succeeding). Be sure to work on a copy of your original file rather than overwriting it. Then come back and update your question with your results and example data.

Read Sentinel-2 L1C view angles from rasterio

I am trying to read the view angles from a Sentinel-2 image (L1C SAFE compact format) for executing an atmospheric correction algorithm. I can get those values by parsing the file MTD_TL.xml, but I am not able to get them through rasterio.
I have tried to access to those data using the xml:SENTINEL2 and the xml:VRT metadata domains, but I can only access to the values from the file MTD_MSIL1C.xml (the main metadata file).
The whole point of using rasterio is being able of using GDAL's virtual file system, as the images will be read from S3 buckets. Any alternatives for easily reading MTD_TL.xml through the virtual file system would be also valid (and really appreciated).
Thank you!!
I answer to myself.
I could not find how to get the values I require, but according to https://gdal.org/user/virtual_file_systems.html the function VSIFOpenL may be used for opening the file. After that, manual parsing will do the trick :)
Ps. I must read the documentation slowly.

Scala and sbt: Storing SQL in a resource

I'd like to store a database schema in its own file, and have my Scala code retrieve it (and execute it via JDBC).
It seems to me that sbt wants me to store the file as: src/main/resources/packagename/my.sql. Putting it there, I see it's in the jar - but I can't seem to access it from Scala.
Specifically, getClass().getResource("my.sql") returns a null pointer, and so does any other form I can think of.
How should I load the file? Or is there a better way to do it?
I had an almost identical problem.
The only difference is my file is in src/main/resources (without any packages).
This worked for me.
val is:InputStream = Github.getClass().getResourceAsStream( "/repo.json" );
Why don't you generate a file, and store it as "my.sql", and later search for it, where it appears in the filesystem?

Find duplicate PDFs

I'm looking for a utility that will help me find duplicate PDFs. The problem: I have a 1000s of PDF files. Some are duplicates. They are not easy to detect due differing files names and small differences in file size. Is there a utility/algorithm/library that can help me find the duplicates or show me files that are very similar (or degree of difference)?
Create an MD5 hash for each file and store it in a database. Identical files will then sort next to each other, or you can quickly search for a pre-existing key.
The problem is not yet solved in any way. What I do, is I use fdupes http://premium.caribe.net/~adrian2/fdupes.html to find exact duplicates.
But most of all, I use a workflow which minimizes duplicates. Every document that enters my system gets indexed with this perl-script I wrote: http://seegras.discordia.ch/Programs/fileindex which puts some name and an md5-sum of it into ~/.fileindex.md5 Now I can change metadata of the local PDF-files or whatever (and run fileindex again), and whenever I accidently download the same file again, I will stil lhave the md5-sum of the original file, and thus can detect whether it's a duplicate.
There's also exif-meta and exif-rename on http://seegras.discordia.ch/Programs/ which help with setting PDF metadata and with renaming PDF-files according to metadata; and if you're tagging all the files correctly, you will end up with duplicate filenames, indicating that they might be the same document within a different file.
If the files were created by the different tools, they could look the same but generate very different results because they are structured totally differently. I made some suggestions in a blog article at https://blog.idrsolutions.com/2010/09/comparing-2-pdf-files/
DiffPDF looks like something that might help you.
I remember that there is a UNIX utility called pdf2txt (see the package poppler-utils). You can try to extract the text from the files and make a textual diff.

Should I merge .pbxproj files with git using merge=union?

I'm wondering whether the merge=union option in .gitattributes makes sense for .pbxproj files.
The manpage states for this option:
Run 3-way file level merge for text files, but take lines from both versions, instead of leaving conflict markers. This tends to leave the added lines in the resulting file in random order and the user should verify the result.
Normally, this should be fine for the 90% case of adding files to the project. Does anybody have experience with this?
Not a direct experience, but:
This SO question really advices again merging .pbxproj files.
The pbxproj file isn't really human mergable.
While it is plain ASCII text, it's a form of JSON. Essentially you want to treat it as a binary file.
(hence a gitignore solution)
Actually, Peter Hosey adds in the comment:
It's a property list, not JSON. Same ideas, different syntax.
Yet, according to this question:
The truth is that it's way more harmful to disallow merging of that .pbxproj file than it is helpful.
The .pbxproj file is simply JSON (similar to XML). From experience, just about the ONLY merge conflict you were ever get is if two people have added files at the same time. The solution in 99% of the merge conflict cases is to keep both sides of the merge.
So a merge 'union' (with a gitattributes merge directive) makes sense, but do some test to see if it does the same thing than the script mentioned in the last question.
See also this question for potential conflicts.
See the Wikipedia article on Property List
I have been working with a large team lately and tried *.pbxproj merge=union, but ultimately had to remove it.
The problem was that braces would become out of place on a regular basis, which made the files unreadable. It is true that tho does work most of the time - but fails maybe 1 out of 4 times.
We are back to using *.pbxproj -crlf -merge for now. This seems to be the best solution that is workable for us.