Convert .mdt to .html in Standard ML - input

How do I read a file called "first.mdt", convert it to HTML syntax, and write the result to "second.html", using only the TextIO and String modules of Standard ML?
fun copyTextFile(first: string, second: string) =
  let
    val ins = TextIO.openIn first
    val outs = TextIO.openOut second
    fun helper(copt: char option) =
      case copt of
        NONE => (TextIO.closeIn ins; TextIO.closeOut outs)
      | SOME(c) => (TextIO.output1(outs,c); helper(TextIO.input1 ins))
  in
    helper(TextIO.input1 ins)
  end
This is just the first part of the code, and is supposed to copy the contents of the .mdt file as-is to the .html file. The code runs and does not show any errors, but the contents do not get copied.

Related

Scalding Unit Test - How to Write A Local File?

I work at a place where Scalding writes are augmented with a specific API to track dataset metadata. When converting from normal writes to these special writes, there are some intricacies with respect to Key/Value, TSV/CSV, Thrift ... datasets. I would like to check that the binary file is the same before and after conversion to the special API.
Since I cannot share the specific API for the metadata-inclusive writes, I am only asking how to write a unit test for the .write method on a TypedPipe.
implicit val timeZone: TimeZone = DateOps.UTC
implicit val dateParser: DateParser = DateParser.default
implicit def flowDef: FlowDef = new FlowDef()
implicit def mode: Mode = Local(true)
val fileStrPath = root + "/test"
println("writing data to " + fileStrPath)
TypedPipe
.from(Seq[Long](1, 2, 3, 4, 5))
// .map((x: Long) => { println(x.toString); System.out.flush(); x })
.write(TypedTsv[Long](fileStrPath))
.forceToDisk
The above doesn't seem to write anything to local (OSX) disk.
So I wonder if I need to use a MiniDFSCluster something like this:
def setUpTempFolder: String = {
  val tempFolder = new TemporaryFolder
  tempFolder.create()
  tempFolder.getRoot.getAbsolutePath
}
val root: String = setUpTempFolder
println(s"root = $root")
val tempDir = Files.createTempDirectory(setUpTempFolder).toFile
val hdfsCluster: MiniDFSCluster = {
  val configuration = new Configuration()
  configuration.set(MiniDFSCluster.HDFS_MINIDFS_BASEDIR, tempDir.getAbsolutePath)
  configuration.set("io.compression.codecs", classOf[LzopCodec].getName)
  new MiniDFSCluster.Builder(configuration)
    .manageNameDfsDirs(true)
    .manageDataDfsDirs(true)
    .format(true)
    .build()
}
hdfsCluster.waitClusterUp()
val fs: DistributedFileSystem = hdfsCluster.getFileSystem
val rootPath = new Path(root)
fs.mkdirs(rootPath)
However, my attempts to get this MiniCluster to work haven't panned out either - somehow I need to link the MiniCluster with the Scalding write.
Note: The Scalding JobTest framework for unit testing isn't going to work, because the data actually written is sometimes wrapped in a bijection codec or set up with case class wrappers before the writes made by the metadata-inclusive write APIs.
Any ideas how I can write a local file (without using the Scalding REPL) with either Scalding alone or a MiniCluster? (If using the latter, I need a hint on how to read the file.)
Answering my own question: there is an example of how to use a mini cluster for exactly this, reading and writing to HDFS. With it I will be able to cross-read my different writes and examine them. The example is in the tests for Scalding's TypedParquet type.
HadoopPlatformJobTest is an extension for JobTest that uses a MiniCluster.
With some hand-waving over the details in the link, the bulk of the code is this:
"TypedParquetTuple" should {
  "read and write correctly" in {
    import com.twitter.scalding.parquet.tuple.TestValues._
    def toMap[T](i: Iterable[T]): Map[T, Int] = i.groupBy(identity).mapValues(_.size)
    HadoopPlatformJobTest(new WriteToTypedParquetTupleJob(_), cluster)
      .arg("output", "output1")
      .sink[SampleClassB](TypedParquet[SampleClassB](Seq("output1"))) {
        toMap(_) shouldBe toMap(values)
      }
      .run()
    HadoopPlatformJobTest(new ReadWithFilterPredicateJob(_), cluster)
      .arg("input", "output1")
      .arg("output", "output2")
      .sink[Boolean]("output2")(toMap(_) shouldBe toMap(values.filter(_.string == "B1").map(_.a.bool)))
      .run()
  }
}

Kotlin Comparison between BufferedReader::readText and String always false

I read the stdout and stderr of a command-line process using:
import java.io.BufferedReader
import java.io.InputStreamReader

fun runCommand(vararg commands: String): Pair<String, String> {
    val proc = Runtime.getRuntime().exec(commands)
    // proc.inputStream is the standard output of the child process
    val stdIn = BufferedReader(InputStreamReader(proc.inputStream))
    val stdErr = BufferedReader(InputStreamReader(proc.errorStream))
    // use() closes each reader, so no explicit close() call is needed
    return Pair(stdIn.use(BufferedReader::readText).trim(), stdErr.use(BufferedReader::readText).trim())
}
This gives me a Pair<String, String> with the output of stdout and stderr.
However, no matter how I try to compare these Strings to another String, the comparison always returns false.
Things I've tried:
runCommand("nordvpn", "account").first.compareTo("You are not logged in.")
runCommand("nordvpn", "account").first == "You are not logged in."
runCommand("nordvpn", "account").first.equals("You are not logged in.")
Might this have something to do with the encoding?
Or am I just reading the output incorrectly?
Any help would be appreciated!
Thanks to @gidds' comment, I found that for some reason the output of the command (stdout) had "-CR SP SP CR" (- CarriageReturn Space Space CarriageReturn) prepended, which I removed with a simple String.drop(5).
Edit: After some more thinking, I assume that the aforementioned characters were responsible for coloring the command's output in the terminal (yellow).
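Hidden characters like these are easiest to spot by printing an escaped form of the string instead of the string itself. The following is a minimal Python sketch of that idea, not the Kotlin fix used above. It assumes the raw output contains carriage returns and possibly ANSI color sequences; the helper name clean and the sample string are hypothetical.

```python
import re

# Matches ANSI CSI sequences such as "\x1b[33m" (assumed to be the coloring mechanism).
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def clean(text):
    """Drop ANSI color codes and anything overwritten by a carriage return."""
    text = ANSI_CSI.sub("", text)
    text = text.replace("\r\n", "\n")  # normalize Windows line endings first
    # A bare "\r" moves the cursor to the line start, so keep only what follows the last one.
    lines = [line.rsplit("\r", 1)[-1] for line in text.split("\n")]
    return "\n".join(lines).strip()


raw = "-\r  \rYou are not logged in."  # hypothetical sample modeled on the output described above
print(repr(raw))   # the escaped form makes the leading "-\r  \r" visible
print(clean(raw) == "You are not logged in.")
```

Cleaning by pattern is less brittle than dropping a fixed number of characters, because the comparison keeps working if the prefix changes length.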

first `readLine` is skipped inside a `case - of` control flow in Nim-lang

I have the following code.
import lib
var stat = false
when isMainModule:
  while stat != true:
    echo("Option: ")
    var opt = readChar(stdin)
    case opt
    of 'q':
      stat = true
    of 'n':
      echo("Salu: ")
      var ss = readLine(stdin)
      echo("Nam: ")
      var nn = readLine(stdin)
      let k = prompt("Rust")
    else: discard
What I am trying to achieve is to prompt for and receive user input for two variables, one after the other. After choosing n, I expect the Salu prompt first and, once the user has supplied input, the Nam prompt.
However, this is what I get when I compile and run the code with nim c -r src/mycode.nim:
~~> nim c -r src/cmdparsing.nim
...
...
...
CC: stdlib_system.nim
CC: cmdparsing.nim
Hint: [Link]
Hint: operation successful (48441 lines compiled; 2.338 sec total; 66.824MiB peakmem; Debug Build) [SuccessX]
Hint: /home/XXXXX/Development/nim_devel/mycode/src/mycode [Exec]
Option:
n
Salu:
Nam:
Salu is echoed, but readLine doesn't wait for my input and Nam is echoed immediately. The stacked readLine calls in the prompt procedure, however, wait for user input one after the other as expected.
What am I missing here? Could someone enlighten me?
The code for prompt lives in lib.nim and is as follows:
proc prompt*(name: string): bool =
  echo("Salutation: ")
  var nn = readLine(stdin)
  echo(nn&"."&name)
  echo("Diesel")
  var dd = readLine(stdin)
  echo(dd)
  return true
You do a readChar to get the opt value, and then you input two chars: n and \n. The first is the opt value, the second gets buffered or retained in the stdin waiting for further reading. The next time you try to read a line, the \n that's still hanging is interpreted as a new line, and immediately assigned to ss. You don't see anything because the line is empty except for the newline char.
E.g.
var opt = readChar(stdin)
case opt
of 'n':
  var ss = readLine(stdin)
  echo ss
else:
  discard
Compile and run, but in the input write something like "ntest". n fires the first branch of case, test (the remainder of stdin) is assigned to ss, and echoed.
You have two options to solve the problem:
Read a line instead of a char, and store only the first char with something like var opt = readLine(stdin)[0].
Use the rdstdin module:
import rdstdin
var ss = readLineFromStdin("Salu:")
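The leftover-newline effect is not specific to Nim. Here is a minimal Python sketch that reproduces it with an in-memory stream standing in for stdin; both function names are hypothetical and exist only for this illustration.

```python
import io


def read_char_then_line(stream):
    """Mirror readChar followed by readLine: the newline typed after the option is still pending."""
    opt = stream.read(1)       # consumes only the option character
    rest = stream.readline()   # returns whatever is left on that same line
    return opt, rest


def read_line_then_line(stream):
    """Mirror the first fix: read the whole option line and keep its first character."""
    opt = stream.readline()[:1]
    line = stream.readline()
    return opt, line


typed = "n\nAlice\n"  # the user types n, Enter, Alice, Enter
print(read_char_then_line(io.StringIO(typed)))  # second element is just the leftover newline
print(read_line_then_line(io.StringIO(typed)))  # second element is the line the user meant to enter
```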

Combine two TTS outputs in a single mp3 file not working

I want to combine two requests to the Google cloud text-to-speech API in a single mp3 output. The reason I need to combine two requests is that the output should contain two different languages.
The code below works fine for many language pair combinations, but unfortunately not for all. If I request, for example, a sentence in English and one in German and combine them, everything works. If I request one in English and one in Japanese, I can't combine the two files into a single output. The output contains only the first sentence, followed by silence in place of the second sentence.
I have now tried multiple ways to combine the two outputs, but the result stays the same. The code below should show the issue.
Please run the code first with:
python synthesize_bug.py --t1 'Hallo' --code1 de-De --t2 'August' --code2 de-De
This works perfectly.
python synthesize_bug.py --t1 'Hallo' --code1 de-De --t2 'こんにちは' --code2 ja-JP
This doesn't work. The single files are ok, but the combined files contain silence instead of the Japanese part.
Also, if used with two Japanese sentences, everything works.
I already filed a bug report at Google with no response yet, but maybe it's just me who is doing something wrong here with encoding assumptions. Hope someone has an idea.
#!/usr/bin/env python
import argparse
# [START tts_synthesize_text_file]
def synthesize_text_file(text1, text2, code1, code2):
    """Synthesizes speech from the input file of text."""
    from apiclient.discovery import build
    import base64
    service = build('texttospeech', 'v1beta1')
    collection = service.text()
    data1 = {}
    data1['input'] = {}
    data1['input']['ssml'] = '<speak><break time="2s"/></speak>'
    data1['voice'] = {}
    data1['voice']['ssmlGender'] = 'FEMALE'
    data1['voice']['languageCode'] = code1
    data1['audioConfig'] = {}
    data1['audioConfig']['speakingRate'] = 0.8
    data1['audioConfig']['audioEncoding'] = 'MP3'
    request = collection.synthesize(body=data1)
    response = request.execute()
    audio_pause = base64.b64decode(response['audioContent'].decode('UTF-8'))
    raw_pause = response['audioContent']
    ssmlLine = '<speak>' + text1 + '</speak>'
    data1 = {}
    data1['input'] = {}
    data1['input']['ssml'] = ssmlLine
    data1['voice'] = {}
    data1['voice']['ssmlGender'] = 'FEMALE'
    data1['voice']['languageCode'] = code1
    data1['audioConfig'] = {}
    data1['audioConfig']['speakingRate'] = 0.8
    data1['audioConfig']['audioEncoding'] = 'MP3'
    request = collection.synthesize(body=data1)
    response = request.execute()
    # The response's audio_content is binary.
    with open('output1.mp3', 'wb') as out:
        out.write(base64.b64decode(response['audioContent'].decode('UTF-8')))
        print('Audio content written to file "output1.mp3"')
    audio_text1 = base64.b64decode(response['audioContent'].decode('UTF-8'))
    raw_text1 = response['audioContent']
    ssmlLine = '<speak>' + text2 + '</speak>'
    data2 = {}
    data2['input'] = {}
    data2['input']['ssml'] = ssmlLine
    data2['voice'] = {}
    data2['voice']['ssmlGender'] = 'MALE'
    data2['voice']['languageCode'] = code2 #'ko-KR'
    data2['audioConfig'] = {}
    data2['audioConfig']['speakingRate'] = 0.8
    data2['audioConfig']['audioEncoding'] = 'MP3'
    request = collection.synthesize(body=data2)
    response = request.execute()
    # The response's audio_content is binary.
    with open('output2.mp3', 'wb') as out:
        out.write(base64.b64decode(response['audioContent'].decode('UTF-8')))
        print('Audio content written to file "output2.mp3"')
    audio_text2 = base64.b64decode(response['audioContent'].decode('UTF-8'))
    raw_text2 = response['audioContent']
    result = audio_text1 + audio_pause + audio_text2
    with open('result.mp3', 'wb') as out:
        out.write(result)
        print('Audio content written to file "result.mp3"')
    raw_result = raw_text1 + raw_pause + raw_text2
    with open('raw_result.mp3', 'wb') as out:
        out.write(base64.b64decode(raw_result.decode('UTF-8')))
        print('Audio content written to file "raw_result.mp3"')
# [END tts_synthesize_text_file]
if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument('--t1')
    parser.add_argument('--code1')
    parser.add_argument('--t2')
    parser.add_argument('--code2')
    args = parser.parse_args()
    synthesize_text_file(args.t1, args.t2, args.code1, args.code2)
You can find the answer here:
https://issuetracker.google.com/issues/120687867
Short answer: It's not clear why it is not working, but Google suggests a workaround to first write the files as .wav, combine and then re-encode the result to mp3.
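The combining step of that workaround can be sketched with Python's standard wave module. This is only an illustration under assumptions: it presumes each request was made with audioEncoding set to 'LINEAR16' so that the decoded audioContent is a complete WAV file with its header, and that all parts share the same sample rate. The helper names concat_wav and silent_wav are hypothetical, and the final re-encoding to MP3 is not shown because it needs an external encoder.

```python
import io
import wave


def concat_wav(blobs):
    """Join several in-memory WAV files that share one audio format into a single WAV."""
    fmt = None
    frames = []
    for blob in blobs:
        with wave.open(io.BytesIO(blob), "rb") as reader:
            this_fmt = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate())
            if fmt is None:
                fmt = this_fmt
            elif this_fmt != fmt:
                raise ValueError("all inputs must share channels, sample width and rate")
            frames.append(reader.readframes(reader.getnframes()))
    if fmt is None:
        raise ValueError("no input given")
    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        writer.setnchannels(fmt[0])
        writer.setsampwidth(fmt[1])
        writer.setframerate(fmt[2])
        writer.writeframes(b"".join(frames))  # one header, all samples back to back
    return out.getvalue()


def silent_wav(n_frames, rate=24000):
    """Build a mono 16-bit WAV of silence, used here only as stand-in input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


combined = concat_wav([silent_wav(100), silent_wav(50)])
with wave.open(io.BytesIO(combined), "rb") as check:
    print(check.getnframes())
```

Unlike raw byte concatenation, this writes a single header and joins only the sample data, so every part is guaranteed to be decoded with the same format.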
I have managed to do this in NodeJS with just one function (I don't know how optimal it is, but at least it works). Maybe you can take inspiration from it.
I used the memory-streams dependency from npm:
var streams = require('memory-streams');
function mergeAudios(audios) {
  var reader = new streams.ReadableStream();
  var writer = new streams.WritableStream();
  audios.forEach(element => {
    if (element instanceof streams.ReadableStream) {
      element.pipe(writer)
    }
    else {
      writer.write(element)
    }
  });
  reader.append(writer.toBuffer())
  return reader
}
The input parameter is a list that contains either ReadableStream objects or response.audioContent values from the synthesizeSpeech operation. A ReadableStream is piped into the writer; audio content is written with the write method. At the end, all content is passed into a ReadableStream.

Groovy write to file (newline)

I created a small function that simply writes text to a file, but I am having issues making it write each piece of information to a new line. Can someone explain why it puts everything on the same line?
Here is my function:
public void writeToFile(def directory, def fileName, def extension, def infoList) {
  File file = new File("$directory/$fileName$extension")
  infoList.each {
    file << ("${it}\n")
  }
}
The simple code I'm testing it with is something like this:
def directory = 'C:/'
def folderName = 'testFolder'
def c
def txtFileInfo = []
String a = "Today is a new day"
String b = "Tomorrow is the future"
String d = "Yesterday is the past"
txtFileInfo << a
txtFileInfo << b
txtFileInfo << d
c = createFolder(directory, folderName) //this simply creates a folder to drop the txt file in
writeToFile(c, "garbage", ".txt", txtFileInfo)
The above creates a text file in that folder and the contents of the text file look like this:
Today is a new dayTomorrow is the futureYesterday is the past
As you can see, the text is all bunched together instead of separated on a new line per text. I assume it has something to do with how I am adding it into my list?
As @Steven points out, a better way would be:
public void writeToFile(def directory, def fileName, def extension, def infoList) {
  new File("$directory/$fileName$extension").withWriter { out ->
    infoList.each {
      out.println it
    }
  }
}
This handles the line separator for you and closes the writer as well
(and it doesn't open and close the file each time you write a line, which could be slow in your original version).
It looks to me like you're working on Windows, in which case a newline is not simply \n but rather \r\n.
You can always get the correct line separator through System.getProperty("line.separator"), for example.
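The two points made so far (open the file once, and let the runtime pick the platform line separator) can be illustrated outside Groovy as well. The following is a minimal Python sketch of the same idea, not a translation of the Groovy API; write_lines and the temporary path are hypothetical names chosen for the example.

```python
import os
import tempfile


def write_lines(path, lines):
    """Write each item on its own line, opening the file only once."""
    # In text mode with the default newline setting, "\n" is translated to os.linesep on write.
    with open(path, "w", encoding="utf-8") as out:
        for line in lines:
            out.write(f"{line}\n")


target = os.path.join(tempfile.mkdtemp(), "garbage.txt")
write_lines(target, ["Today is a new day", "Tomorrow is the future", "Yesterday is the past"])
with open(target, "rb") as raw:
    print(raw.read())  # shows the separator bytes actually written on this platform
```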
I came across this question and was inspired by the other contributors. I needed to append some content to a file, one line at a time. Here is what I did:
class Doh {
  def ln = System.getProperty('line.separator')
  File file //assume it's initialized
  void append(String content) {
    file << "$content$ln"
  }
}
Pretty neat I think :)
Might be cleaner to use PrintWriter and its method println.
Just make sure you close the writer when you're done
#Comment for ID:14.
For me it is easier to write:
out.append it
instead of
out.println it
On my machine, println wrote only the first file of the ArrayList; with append, the whole list is written to the file.
Thanks anyway for the quick-and-dirty solution.