I'm wondering whether the merge=union option in .gitattributes makes sense for .pbxproj files.
The manpage states for this option:
Run 3-way file level merge for text files, but take lines from both versions, instead of leaving conflict markers. This tends to leave the added lines in the resulting file in random order and the user should verify the result.
Normally, this should be fine for the 90% case of adding files to the project. Does anybody have experience with this?
Not a direct experience, but:
This SO question really advises against merging .pbxproj files.
The pbxproj file isn't really human mergable.
While it is plain ASCII text, it's a form of JSON. Essentially you want to treat it as a binary file.
(hence a gitignore solution)
Actually, Peter Hosey adds in the comment:
It's a property list, not JSON. Same ideas, different syntax.
Yet, according to this question:
The truth is that it's way more harmful to disallow merging of that .pbxproj file than it is helpful.
The .pbxproj file is simply JSON (similar to XML). From experience, just about the ONLY merge conflict you will ever get is if two people have added files at the same time. The solution in 99% of the merge conflict cases is to keep both sides of the merge.
So a merge 'union' (with a gitattributes merge directive) makes sense, but do some tests to see whether it does the same thing as the script mentioned in the last question.
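For reference, the attribute would go in a .gitattributes file at the repository root. This is only a sketch of the single line involved; verify the merge results on a few test branches before relying on it:

# merge project files line by line, keeping lines from both sides instead of conflict markers
*.pbxproj merge=union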
See also this question for potential conflicts.
See the Wikipedia article on Property List
I have been working with a large team lately and tried *.pbxproj merge=union, but ultimately had to remove it.
The problem was that braces would become misplaced on a regular basis, which made the files unreadable. It is true that it does work most of the time - but it fails maybe 1 out of 4 times.
We are back to using *.pbxproj -crlf -merge for now. This seems to be the best solution that is workable for us.
Related
Here is a complex problem that I am struggling to find a clean solution for:
Imagine having a Snakemake workflow with several rules that can be parameterized in some way. Now, we might want to test different parameter settings for some rules, to see how the results differ. However, ideally, if these rules depend on the output of other rules that are not parameterized, we want to re-use these non-changing files, instead of re-computing them for each of our parameter settings. Furthermore, if at all possible, all this should be optional, so that in the default case, a user does not see any of this.
There is inherent complexity in there (to specify which files are re-used, etc). I am also aware that this is not exactly the intended use case of Snakemake ("reproducible workflows"), but is more of a meta-feature for experimentation.
Here are some approaches:
Naive solution: Add wildcards for each possible parameter to the file paths. This gets ugly, hard to maintain, and hard to extend really quickly. Not a solution.
A nice approach might be to name each run, and have an individual config file for that name which contains all the settings that we need. Then, we only need a wildcard for such a named set of parameter settings. That would probably require reading some table or meta-config file and processing it. That doesn't solve the re-use issue, though. Also, it means we would need multiple config files for one snakemake call, and it seems that this is not possible (they would instead update each other, rather than being considered as individual configs to be run separately).
Somehow use sub-workflows, by specifying individual config files each time, e.g., via a wildcard. Not sure whether this can be done (e.g., configfile: path/to/{config_name}.yaml). Still not a solution for file re-use.
Quick-and-dirty: Run all the rules up to the last output file that is shared between different configurations. Then, manually (or with some extra script) create directories with symlinks to this "base" run, with individual config files that specify the parameters for the per-config runs. This still necessitates calling snakemake individually for each of these directories, making cluster usage harder.
None of these solve all issues though. Any ideas appreciated!
Thanks in advance, all the best
Lucas
Snakemake now offers the Paramspace helper to solve this! https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html?highlight=parameter#parameter-space-exploration
I have not tried it yet, but it seems like the solution to the issue!
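Roughly, usage looks like the sketch below (adapted from the linked docs; the params.tsv columns, file paths, and script name are placeholders):

import pandas as pd
from snakemake.utils import Paramspace

# each row of params.tsv is one parameter combination; the columns become wildcards
paramspace = Paramspace(pd.read_csv("params.tsv", sep="\t"))

rule all:
    input:
        # one output per parameter instance, e.g. results/alpha~1.0/beta~2.0/plot.pdf
        expand("results/{params}/plot.pdf", params=paramspace.instance_patterns)

rule simulate:
    output:
        # wildcard_pattern expands to something like "alpha~{alpha}/beta~{beta}"
        f"results/{paramspace.wildcard_pattern}/plot.pdf"
    params:
        # instance resolves the current wildcards back into concrete parameter values
        simulation=paramspace.instance
    script:
        "scripts/simulate.py"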
When I am trying to open any .EXE file, I am getting the information in encoded form. Any idea how to see the content of an .EXE file?
I need to know what Database tables are used in the particular .EXE.
Ah, now we are getting closer to the real question.
It is probably much more productive to ask the targeted databases about the SQL queries being executed during the run, or for a top ten shortly afterwards.
The table-names might not be hard-coded recognizably as such in the executable.
They might be obtained by a lookup, and some prefixing or other transformation might be in place.
Admittedly, they likely are clear text.
Easiest is probably to just transfer to a Unix server and use STRINGS on the image.
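For example, on a Linux box (the grep pattern is just a placeholder for whatever table name you suspect):

$ strings image_to_test.exe | sort -u > image_strings.txt
$ grep -i customer image_strings.txt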
I wanted to include the source here, but that failed, and I cannot find how to attach a file. Below you'll find a link to OpenVMS Macro program source for a STRINGS-like tool. Not sure how long the link will survive.
Just read for instructions, save (strings.mar), compile ($ MACRO strings), link ($link strings), and activate ($ mcr sys$login:strings image_to_test.exe)
OpenVMS Macro String program text
Good luck!
Hein
Use analyze/image to view the contents of an executable image file.
I'm guessing you are trying to look in the EXE because you do not have access to the source. I do something like this:
$ dump/record/byte/hex/out=a.a myexe.exe
Then look at a.a with any text editor (132 columns). The linker groups string literals together, and they are mostly near the beginning of the EXE, so you don't have to look too far into the file. Of course, this only helps if the database references are string literals.
The string literal might be broken across a block (512 byte) boundary, so if you use search in your editor, try looking for substrings.
Aksh - you are chasing your tail on this one. It's a false dawn. Even if you could find the database tables (and you can't), you will need the source of the .exe to do anything sensible with it or with the problem you are trying to solve. It's possible to write a program which just lists all the tables in a database without reading any of them. So you could spend an awful lot of effort and get nowhere. Hope this helps.
I have set up a project in DataGrip with several sql files spread over a couple of directories like this:
My hope is to manage the complexity as this turns into hundreds of files. This is a learning/proof of concept level effort right now.
What I want to do is have a way to run/build/publish this project, but at present the best I have found is to select the files and then do a "Run Files" (CTRL+SHIFT+F10). This worked for a bit, but now I have a foreign key that gets run in the wrong order. I don't want to resort to a hack like prefixing the file names with integers to force a specific order; it feels like a real kludge.
How should I accomplish this? I must have missed something, since the alternative is very manual and error-prone. If it matters, the database I am working against is Oracle.
Since DataGrip 2020.1, one can create a Run Configuration and specify a data source and multiple files or scripts.
Refer to the DataGrip blog post for details.
I am using a command line tool (ng-xi18n) to extract the i18n strings from an angular 2 app I wrote. The output of this command is a messages.xlf file. Coming from a .po background, and being not familiar with .xlf, I assumed that this file is the equivalent to the .pot file (correct me if I am wrong).
I then assumed that if I want to translate my app, I had to cp messages.xlf messages.de.xlf to have a copy (messages.de.xlf) of the template file (messages.xlf) where I can translate each message into German (hence the .de.xlf).
After translating some dummy texts and running the app, I saw that it worked as expected, so I quit translating and continued developing the app. After some time, I added more i18n strings, and eventually thought that I had to update my template. And this is where things became hard to maintain. I updated the template messages.xlf file, and quickly wondered how I could get the new strings into my already translated messages.de.xlf file without losing my progress.
When I was developing using .po files, this was no problem thanks to good tools like poEdit, but I didn't find anything comparable for .xlf. After trying some tools, I thought that the best choice would be Lokalize, but I didn't find a way to merge the template file into already translated (but outdated) files either.
Up to now, this was rather an essay than a question, so here's a quick summary:
Is the workflow of dealing with .xlf files really comparable to .po as I initially thought (described above), or is it completely different?
How am I supposed to update my already translated files?
What are the best practices dealing with .xlf files?
What are proof of concept tools to work with .xlf?
Sidenotes:
The Lokalize handbook was not helpful at all. I see a lot of functions that sound promising, like:
"File" > "Update file from template". I did not find anything in the handbook to explain this function. If I click on this, nothing happens.
"Sync" > "Open file for sync/merge". This seems to be a function to merge two similar files (by multiple translators) rather than a tool to update the translation file from a template. Even though there is a tooltip in Lokalize's primary sync tab, notifying me about "x unmatched entries", I just couldn't find anything to append those unmatched entries to my .de.xlf file.
[Update] Turns out, I had similar issues as in this question. After downgrading my version of Lokalize to the suggested one, many issues (including the ones mentioned in the question) disappeared. However, now the "Update file from template" option is greyed out, and I don't know why.
I also tried OmegaT, which does not work at all on my platform (Ubuntu 16.04).
[Update] Virtaal works great for merging new strings from a template, but the UI in general is very poorly designed...
Googling did not help, as every hit seems to be related to XCode or something.
Thanks for any help in advance, I really appreciate it
I wrote a small npm command line tool called xliffmerge.
In principle it does the same thing that Roland Oldengarm does with his gulp tasks, described in his blog article.
It is free and you can have a look at it at https://github.com/martinroob/ngx-i18nsupport#readme
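As a rough sketch (check the README above for the exact package name and flags; this is from memory), a typical invocation merges the freshly extracted messages.xlf into the language files:

$ npm install -g ngx-i18nsupport
$ xliffmerge --profile xliffmerge.json en de

Here xliffmerge.json is assumed to tell the tool where messages.xlf and the language files live; new trans-units from the template get added to messages.de.xlf while existing translations are kept.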
The best workflow automation solution I have seen described so far is from Roland Oldengarm's blog entry "Angular 2: Automated i18n workflow using gulp". To summarize, in a few dozen lines of Gulp code he created the tooling to handle some of the challenges you faced. Specifically it runs ng-xi18n to extract the messages; creates an English translation with sources copied to targets; updates existing translations by adding new trans-units, keeping existing ones, and removing missing ones; and then exposes all xlf files as TypeScript string constants. These last strings can then be imported to supply the bootstrapModule with its translation provider options.
Caveat: I have not used this exact solution (and code) myself, but I was able to expose generated xlf as TypeScript strings and use them in an app in a manner similar to what he described. As for maintaining translations, I have leveraged IntelliJ IDEA (WebStorm) file comparison features and Counterparts Lite (for Mac) for that. My own efforts are still in early stages but are working end to end for an application that is in active development.
Official Angular docs are now updated for Internationalization (i18n) at https://angular.io/docs/ts/latest/cookbook/i18n.html including a section specifically for creating a translation source file with the ng-xi18n tool.
I'm looking for a utility that will help me find duplicate PDFs. The problem: I have thousands of PDF files. Some are duplicates. They are not easy to detect due to differing file names and small differences in file size. Is there a utility/algorithm/library that can help me find the duplicates, or show me files that are very similar (or their degree of difference)?
Create an MD5 hash for each file and store it in a database. Identical files will then sort next to each other, or you can quickly search for a pre-existing key.
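A minimal sketch in Python (the directory name is a placeholder; note this only catches byte-identical copies, not re-saved or re-exported ones):

import hashlib
import os
from collections import defaultdict

def md5_of(path, chunk_size=1 << 20):
    # read the file in chunks so large PDFs don't have to fit in memory
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

groups = defaultdict(list)
for root, _dirs, files in os.walk("pdfs"):
    for name in files:
        if name.lower().endswith(".pdf"):
            path = os.path.join(root, name)
            groups[md5_of(path)].append(path)

# any hash with more than one path is a set of exact duplicates
for digest, paths in groups.items():
    if len(paths) > 1:
        print(digest, paths)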
The problem is not solved in any general way yet. What I do is use fdupes http://premium.caribe.net/~adrian2/fdupes.html to find exact duplicates.
But most of all, I use a workflow which minimizes duplicates. Every document that enters my system gets indexed with this perl script I wrote: http://seegras.discordia.ch/Programs/fileindex which puts its name and an md5-sum of it into ~/.fileindex.md5. Now I can change the metadata of the local PDF files or whatever (and run fileindex again), and whenever I accidentally download the same file again, I will still have the md5-sum of the original file, and thus can detect whether it's a duplicate.
There's also exif-meta and exif-rename on http://seegras.discordia.ch/Programs/ which help with setting PDF metadata and with renaming PDF-files according to metadata; and if you're tagging all the files correctly, you will end up with duplicate filenames, indicating that they might be the same document within a different file.
If the files were created by different tools, they could look the same but compare very differently because they are structured totally differently. I made some suggestions in a blog article at https://blog.idrsolutions.com/2010/09/comparing-2-pdf-files/
DiffPDF looks like something that might help you.
I remember that there is a UNIX utility called pdftotext (in the poppler-utils package). You can try to extract the text from the files and make a textual diff.
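For example (file names are placeholders):

$ pdftotext first.pdf first.txt
$ pdftotext second.pdf second.txt
$ diff first.txt second.txt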