Reading a large-single XML line to a variable using Batch Script - batch-processing

I have a xml file which only contains a single line, but the problem is the line is very large, so it seems that I can't store in a variable.
What i want is this,
given tag1, tag2.....tag900, I want to break each tag into a line as follow:
tag1
tag2
tag3
......
tag900

Do not attempt to do this using native batch. It will be extremely difficult, and any solution will be very slow.
The problem is native batch cannot read lines > 8k, and batch does not have a good way to read partial lines.
There is a method that creates a test file that has size >= your file that consists of a single repeated character. A binary file compare ( FC /B ) is then done and the results are parsed character by character expressed as hex codes. It's a bit more complex than that, but I don't think you want to go there.
The only other option is to use SET /P to read in 1021 chars at a time, and then parse and piece things together. But this is unproven, and again, I don't think worth the effort.
If you want to use a native scripting language than I suggest VBScript or JScript. (Perhaps PowerShell, but I don't really know much about its capabilities).
You could download a Unix text processing tool like sed that has been ported to Windows.
I don't do much with XML, but I've got to believe there is a free tool geared specifically for XML that would make your job fairly easy.
Basically, use anything except batch! (this is coming from someone whose hobby is solving problems with batch)

Related

Multi-line text in a .env file

In vue, is there a way to have a value span multiple lines in an .env file. Ex:
Instead of:
someValue=[{"someValue":"Here is a really really long piece which should be split into multiple lines"}]
I want to do something like:
someValue=`[{"someValue":"Here is a really
really long piece which
should be split into multiple lines"}]`
Doing the latter gives me a JSON parsing error if I try to do JSON.parse(someValue) in my code
I don't know if this will work, but I can't format a comment appropriately enough to get the point across so see if this will work:
someValue=[{"someValue":"Here is a really\
really long piece which\
should be split into multiple lines"}]
Where "\" should escape the newline similar to how you can write long bash commands while escaping the newline. I'm not certain the .env interpreter will support it though.
EDIT
Looks like this won't work. This syntax was actually proposed, but I don't think it was incorporated. See motdotla/dotenv#333 (which is what Vue uses to parse .env).
Like #zero298 said, this isn't possible. Likely you could delimit the entry with a character that wouldn't show up normally in the text (^ is a good candidate), then parse it within the application using string.replace('^', '\n');

Failure to read full line including embedded zero bytes

Lua script:
i=io.read()
print(i)
Command line:
echo -e "sala\x00m" | lua ll.lua
Output:
sala
I want it to print all character from input, similar to this:
salam
in HEX editor:
0000000: 7361 6c61 006d 0a sala.m.
How can I print all character from input?
You tripped over one of the few places where the Lua standard library is still not 8-bit-clean.
Specifically, file reading line-by-line is not embedded-0 proof.
The reason it isn't yet is an unfortunate combination of:
Only standard C90 or equally portable constructs are allowed for the core, which does not provide for efficient 0-clean text parsing.
Every solution discussed to date on the mailinglist under that constraint has considerable overhead.
Embedded 0-bytes in text files are quite rare.
Workarounds:
Use a modified library, fixing these formats: "*l" "*L" for file:read(...)
parse your raw data yourself. (read a block using a number or as much as possible using "*a")
Badger the Lua developers/maintainers for a bugfix until they give in.

How to write a custom assembly compiler (sort of) in VB.NET

I've been trying to write a simple script compiler for a custom language used by the Game Boy Advance's Z80 processor.
All I want it to do is look at a human-readable command, take it and its arguments and convert it into a hexadecimal value into a ROM file. That's it. Each command is a byte, and each may take a different number of arguments - arguments can be either 8, 16, or 32 bits and each command has a specific number of arguments that it takes.
All of this sort of code is handled by the game and converted into workable machine code within the game's memory, so I'm not writing a full-on assembly compiler if you will. The game automatically knows how many args a command has, what each command does, exactly how to execute it as it is, etc.
For instance, you have command 0x4E, which takes in one 8-bit argument and another 32-bit argument. In hex that would obviously be 4E XX YY YY YY YY. I want my compiler to read it from text as foo 0xXX 0xYYYYYYYY and directly write it into a file as the former.
My question is, how would I do that in VB.NET? I know it's probably a very simple answer, but I see a lot of different options to write it to a file--some work and most don't for me. Could you give me some sample code as to how I would do this?
Writing an assembly compiler as I understand it is not so simple. I recomed you to use one already written see: Software Development Tools for Z80 Family
If you are still interested in writing it here are instructions:
Write the text you want to translate to some file (or memory stream)
Read it line by line
Parse the line either splitting it to an array or with regular
expressions
Identify command and arguments (as far as I remember it some commands
does not have arguments)
Translate the command to Hex (with a collection or dictionary of
commands)
Write results to an array remembering the references for jump
addresses
When everything is translated resolve addresses and write them to
right places.
I think that the most tricky part is to deal with symbolic addressees.
If you are still interested write a first piece of code (or ask how to do it) and continue with next ones.
This sounds like an assembler, even if it for a 'custom language'.
Start by parsing the command lines. use string.split method to convert the string to an array of strings. the first element in the array is your foo, you can then look that up and output 4E, then convert the subsequent elements to bytes.

SAS : read in PDF file

I am looking for ways to read in a PDF file with SAS. Apparently this is not basic functionality and there is very little to be found on the internet. (Let alone that google is not easy with PDF in you search giving you also links to PDF documents that go about other things.)
The only things that can be found, are people looking for ways to import data into datasets from a PDF. For me, that is not even necesarry. I would like to be able to read the contents of the PDF file in one big character variable. If possible, it would even be better to be able to read in the file's binary data.
Is this possible with SAS and how? (I got it to work in Access VBA, but can't find any similar ways in SAS.)
(In the end, the purpose is to convert this to base64 and put that base64-string into an XML document.)
You probably will not be able to read the entire file into one character variable since the maximum size of a character variable is around 33 KB. A simple way to read in one line at a time, though, is something like the following:
%let pdfFileName = Test.pdf;
%let lineSize = 2000;
data base;
format text_line $&lineSize..;
infile "&pdfFileName" lrecl=&lineSize;
input text_line $;
run;
This requires that you have a general idea of the maximum record length ahead of time, but you could write additional code to determine the maximum record size prior to reading in the file. In this example each line of text is read into one character variable named "text_line." From there, you could use a RETAIN statement or double trailers (##) in the INPUT line to process multiple lines at a time. The SAS web-site has plenty of documentation on how to read and process text from various types of input files.

Is there a tool to clean the output of the script(1) tool?

script(1) is a tool for keeping a record of an interactive terminal session; by default it writes to the file transcript. My problem is that I use ksh93, which has readline features, and so the transcript is mucked up with all sorts of terminal escape sequences and it can be very difficult to reconstruct the command that was actually executed. Not to mention the stray ^M's and the like.
I'm looking for a tool that will read a transcript file written by script, remove all the junk, and reconstruct what the shell thought it was executing, so I have something that shows $PS1 and the commands actually executed. Failing that, I'm looking for suggestions on how to write such a tool, ideally using knowledge from the terminfo database, or failing that, just using ANSI escape sequences.
A cheat that looks in shell history, as long as it really really works, would also be acceptable.
Doesn't cat/more work by default for browsing the transcript? Do you intend to create a script out of the commands actually executed (which in my experience can be dangerous)?
Anyway, 3 years without an answer, so I will give it a shot with an incomplete solution. If your are only interested in the commands actually typed, remove the non-printable characters, then replace PS1' with something readable and unique, and grep for that unique string. Like this:
$ sed -i 's/[^[:print:]]//g' transcript
$ sed 's/]0;cartman#southpark: ~cartman#southpark:~/CARTMAN/g' transcript | grep CARTMAN
Explanation: After first sed, PS1' can be taken from one of the first few lines of the transcript file, as is -- PS1' is different from PS1 -- and can be modified with a unique readable string ("CARTMAN" here). Note that the dollar sign at the end of the prompt was left out intentionally.
In the few examples that I tried, this didn't solve everything but took care of most issues.
This is essentially the same question asked recently in Can I programmatically “burn in” ANSI control codes to a file using unix utils? -- removing all nonprinting characters will not fix
embedded escape sequences
backspace/overstriking for underlining
use of carriage-returns for overstriking