Quick uses for scripting languages? - scripting

I feel that there are a lot of quick uses for scripting languages that you may only think of if you have the shell open at all times. I leave a terminal tab open with python running and have solved many problems with a few lines of code typed off the top of my head. What are some of your less obvious uses for the scripting language of your choice.

Most recently in my Windows centric world I have used it to rename large numbers of files, search/filter log files for a specific occurrence, perform network diagnostics, and a host of smaller things I can't think of at the moment that some of my colleagues not having a UNIX background would never have thought of.

I just used a Lua script in SciTE to take a selected SVG path and do some operations on it (find min values and translate to 0, scale, round up values to avoid having a ton of decimal digits). It is just handy.

Reformat text is some complicated way;
Prepare some text based on a template logic;
Rename multiple files (e.g. music collection or photos);
etc.

Something very similar was discussed in the Wikibooks article Ad Hoc Data Analysis From The Command Line.
This mostly discusses the use of Unix commands rather than scripting languages, but the principle is the same ... have a shell open at all times.

Use BeautifulSoup to clean up some HTML.

Related

Replace words/phrases in existing PDF or docx with other words

I am trying to make a dynamic PDF generator as an .NET Core API. I want to take an existing PDF, or .docx file, and edit it so it replaces the current name (John Doe) with something that can be replaced like #NAME_PLACEHOLDER.
I then want to transform #NAME_PLACEHOLDER -> John Doe (or whatever is in the KeyValuePair or Dictionary<string, string>).
I am running this on a Docker environment, so I can easily execute commands and I am willing to do that as well.
So far I have tried a few things:
1) pdf2htmlEX
Executes as pdf2htmlEX file.pdf
Does the job pretty well
Can be converted back to PDF using Google Chrome headless or similar
Problem: Only the characters used in the PDF can be used to replace. So if I only use A, B, C as characters, it will make D into Times New Roman (or default font)
2) LibreOffice ODT to PDF
This was pretty nice, because I could simply unzip the .odt file, open content.xml, search and replace, then save it as an .odt file again
Could be converted into PDF rather easily using soffice --convert-to pdf
LibreOffice is quite nice
Problem 1: Microsoft Word -> Save as ODT tends to break the formatting, so we have to use LibreOffice to go and change it back again
Problem 2: We don't want to move away from Microsoft's Office suite
3) HTML to PDF using Chrome Headless
What you see is what you get
By far the best option, if we're all developers aaand have unlimited time
Problem 1: Only our developers can make changes, since our marketing department do not know HTML
Problem 2: Our existing PDFs would have to be rewritten in HTML
As you can see, I have tried a bunch of things. None of them, except Chrome Headless, has lived up to my expectations. What I really like about #3 is what you see is what you get. I can make the whole thing in HTML, press CTRL+P and see what it looks like as a finished PDF, basically.
I am looking for a better solution, though. It can be paid. It can be free. All I need is to change out words/phrases with other words dynamically, which apparently seems like a tough thing to do.
Thanks for specifying what you've already found clearly. It helps a lot providing a succinct answer.
The conversion is always tricky - I'm sure you know Word has trouble displaying/editing some Word documents itself.
I have experience regarding point #2 "LibreOffice ODT to PDF" and can suggest a few things to test:
Don't use Microsoft to do the docx->odt conversion. It's not good as you know. Use LibreOffice itself to do this step. The rest of your process remains the same.
For some documents, Libre Office does doc->odt much better. So, you can instead work with DOC format and get a better result without any other changes.
You won't be able to remove the devs from the process, but you can certainly reduce their role allowing your business/marketing teams to have more direct input simply by:
get the starting point document to the devs to run through the conversion process. The devs can "clean up" the document to make it convert nicely.
make this version of the document the "official" starting point. The business or technical teams can load it, adjust it, and put it back into the process.
if possible, expose a test-platform to the business teams so they can download, adjust, upload and render to PDF. This cycle means they will be able to achieve more and if they're good, do impressive stuff without any dev input.
the above steps simply mean don't expect perfect conversion of arbitrary complex documents. Starting from a (even complex) working baseline is great.
Some of that might show you that your #2 is actually going to get the best overall results.
I hope that helps.

What's the best method for traversing long scripts?

When your script gets to be thousands of lines long, finding a particular function or variable declaration gets to be a real pain. Are there any methods you can use to avoid this?
It really depends on the language and the editor you use.
If the language supports importing from external files, as most of them do, you should refactor your script into smaller modules and import/include them into your main script.
Also, most editors have some means of searching within a file and some, such as 'TextMate' (on Mac) or 'e', its Windows clone, provide a special view displaying all the symbols within the source which you can click on to immediately reorient the editor to the chosen target.
You split your code into separate files/modules/etc., generally organized by similar functionality, and require them in your main script.

Other options for a Rebol editor|IDE?

I currently use Programmer's Notepad with the Rebol syntax scheme. It's not bad--does any insightful person have another suggestion?
For my Windows programming work I use the Zeus editor, but I'm not sure if it does Rebol?
Another windows option is TextPad. It is commercial but it is quite a useful editor.
There are 2 Rebol syntax files available from the official site
http://www.textpad.com/add-ons/synn2t.html
I also wrote a TextPad syntax file generator uploaded it to rebol.org
http://www.rebol.org/view-script.r?script=textpad-syngen.r
It is probably quite easy to modify this script to support other editors.
vim.
Especially with the following binding in your _vimrc/.vimrc:
nnoremap <Leader>fr :w<CR>:silent ! %<CR>
In normal mode, Leaderfr saves your current file and executes it: (fr is a memo for 'fast-run')
:wEnter save current file
:silent execute without messages: ! open shell % paste current file name Enter
Leader is usually \ key, I have this mapped to spacebar. In case anyone is interested on how to do that, post a comment.
Programmer’s Notepad better than Crimson Editor with Code Folding and Great Project Management
http://www.pnotepad.org/
It's opensource so you can even modify it in C++
For Windows, there is Crimson Editor or E with the REBOL bundle.
For Mac, there is TextMate.
Emacs, I believe has a REBOL syntax too.
Sublime Text is a really nice Windows editor (commercial, but reasonably priced) that supports TextMate configurations (well, at least for syntax and snippets) so if you manage to get a REBOL bundle from somewhere, you can use it with this.
SciTE also has REBOL syntax coloring support because the Scintilla editor component it's based on includes this.
Notepad++ should also support REBOL syntax coloring, being Scintilla based, but as it is currently distributed, the support is not compiled in. If you're so inclined, you could probably compile it yourself and add the support back in. It might be worth it because Notepad++ is quite a good editor too.
I can't include proper links because I don't have enough rep, but this should do:
www.sublimetext.com
www.scintilla.org/SciTE.html
www.scintilla.org/index.html
notepad-plus-plus.org
http://rebol.wik.is/index.php?title=Notepad%2b%2b
which is a REBOL plugin for Notepad++
I use JEdit which not only has REBOL syntax highlighting but also auto-indenting. It has most of the features you'd expect from a text editor (e.g. block selection, configurable keyboard shortcuts).
There are versions for Windows, Mac OS X and Linux so if you choose to work cross-platform you won't need to learn a new editor. The web page is Jedit.org1
I use UltraEdit.
with its advanced project, syntax highlighting, macro, command-line control and total keyboard shortcut configuration, per language and project, you can program the editor to do just about all of what you need at a single click of the mouse, or keyboard.
my setup starts rebol on any file, and assigns a launch "default project script " to a shortcut, so wherever I am in the files, I still launch the project's relevant script. change project, it will run that new project's scripts. another key for unit tests, another key for "find in all opened files, etc, etc..
also, the actual text-exiting, when combined with a few macros which create functions, objects, and more using the clipboard and "currently highlighted" text makes it much faster than any Visual IDE including MSVC.
ultra edit itself has thousands of other advanced features, and they all work... really they do.
I've tried other editors and they always fall short when I start to push them.
yeah, you have to buy it... but its cheap (like one or two hours of your life salary ;-)
so considering you might use it for several months or years... its a cheap investment.
also, ultra edit is now released on linux and the mac port is just around the corner.
I use EditPlus for several years, it is not free but not expensive. It has Rebol syntax highlighting file (downloadable from its web site).
It is especially useful & very fast if you work with huge files (over 100 mb) or with lots of files (say 300 files.), find & replace takes a second.
For syntax highlighting and a simple autocomplete, you can use http://komodoide.com/komodo-edit/.
It's free and open source with several nice features, including folder browsing while editing, which I personally find very useful.
There is also a bunch of other languages supported in case you want to take a closer look and give this editor a chance.

Which is it Perl or perl, TIF or TIFF, ant or Ant, ClearCase or Clear Case?

In one sentence I have manage to create 16 possible variations on how I present information. Does it matter as long as the context is clear? Do any common mistakes irritate you?
regarding Perl: How should I capitalize Perl?
TIFF stands for Tagged Image File Format, whereas the extension of files using that format is often ".tif".
That is for the purpose of compatibility with 8.3 filenames, I believe.
I generally like the Perl way of capitalizing when used as a proper noun, but lowercasing when referring to the command itself (assuming the command is lowercase to begin with).
Well, Perl and TIFF have already been answered, so I'll add the last two
the Apache Foundation writes "Apache Ant".
Rational ClearCase (or sometimes "IBM Rational ClearCase") is written as such at its web site.
Even though Perl was originally an acronym for Practical Extration and Report Language, it is written Perl.
These things dont 'bother' me as much as they provide insights into the level of knowledge of the speaker/author. You see, we work in a industry that requires precision, so precision in language does matter as it affects the understanding of the consumer.
The one that really seems to bother me is when people fully upper case JAVA as though it was an acronym.

Batch source-code aware spell check

What is a tool or technique that can be used to perform spell checks upon a whole source code base and its associated resource files?
The spell check should be source code aware meaning that it would stick to checking string literals in the code and not the code itself. Bonus points if the spell checker understands common resource file formats, for example text files containing name-value pairs (only check the values). Super-bonus points if you can tell it which parts of an XML DTD or Schema should be checked and which should be ignored.
Many IDEs can do this for the file you are currently working with. The difference in what I am looking for is something that can operate upon a whole source code base at once.
Something like a Findbugs or PMD type tool for mis-spellings would be ideal.
As you mentioned, many IDEs have this functionality already, and one such IDE is Eclipse. However, unlike many other IDEs Eclipse is:
A) open source
B) designed to be programmable
For instance, here's an article on using Eclipse's code formatting functionality from the command line:
http://www.peterfriese.de/formatting-your-code-using-the-eclipse-code-formatter/
In theory, you should be able to do something similar with it's spell-checking mechanism. I know this isn't exactly what you're looking for, and if there is a program for doing spell-checking in code then obviously that'd be better, but if not then Eclipse may be the next best thing.
This seems little old but seems to do a good job
Source Code Spell Checker