Passing iterator as argument in Nim fails "attempting to call undeclared routine" - iterator

I'm trying to learn Nim by rewriting a bioinformatics tool that I have already implemented in other languages.
The following version compiles and appears to run correctly:
from strutils import join
from sequtils import zip
type
  Nucleotides = distinct string
  Qualities = distinct string
  #Nucleotides = string
  #Qualities = string
  Fastq = tuple
    name: string
    nucls: Nucleotides
    quals: Qualities
# proc `==` (ns, ms: Nucleotides): bool =
#   string(ns) == string(ms)
# https://nim-by-example.github.io/types/distinct/
proc `==` (ns, ms: Nucleotides): bool {.borrow.}
proc makeFastq(name, nucls, quals: string): Fastq =
  result = (name: name, nucls: nucls.Nucleotides, quals: quals.Qualities)
proc bestQuals(quals1, quals2: string): string =
  let N = min(quals1.len, quals2.len)
  result = newStringOfCap(N)
  for pair in zip(quals1, quals2):
    result.add(chr(max(ord(pair.a), ord(pair.b))))
proc bestQuals(quals1, quals2: Qualities): Qualities =
  result = bestQuals(string(quals1), string(quals2)).Qualities
proc fuseFastq(rec1, rec2: Fastq): Fastq =
  result = (name: rec1.name, nucls: rec1.nucls, quals: bestQuals(rec1.quals, rec2.quals))
proc `$` (record: Fastq): string =
  result = join([
    record.name,
    string(record.nucls),
    "+",
    string(record.quals)], "\n")
iterator parseFastqs(input: File): Fastq =
  var
    nameLine: string
    nucLine: string
    quaLine: string
  while not input.endOfFile:
    nameLine = input.readLine()
    nucLine = input.readLine()
    discard input.readLine()
    quaLine = input.readLine()
    yield makeFastq(nameLine, nucLine, quaLine)
proc deduplicate() =
  var
    record: Fastq
  record = (name: "", nucls: "".Nucleotides, quals: "".Qualities)
  for fastq in parseFastqs(stdin):
    if record.nucls != fastq.nucls:
      if record.name != "":
        echo $record
      record = fastq
    else:
      record = fuseFastq(record, fastq)
      continue
  if record.name != "":
    echo $record
when isMainModule:
  deduplicate()
Now, I would like to have deduplicate take as argument the "thing" (currently an iterator) that generates Fastq tuples. It would indeed seem much cleaner to have the when isMainModule part deal with reading from stdin or possibly something else in the future (a file passed as command-line argument, for instance):
proc deduplicate(inputFqs: <some relevant type>) =
  var
    record: Fastq
  record = (name: "", nucls: "".Nucleotides, quals: "".Qualities)
  for fastq in inputFqs:
    if record.nucls != fastq.nucls:
      if record.name != "":
        echo $record
      record = fastq
    else:
      record = fuseFastq(record, fastq)
      continue
  if record.name != "":
    echo $record
when isMainModule:
  let inputFqs = parseFastqs(stdin)
  deduplicate(inputFqs)
Is there a simple and efficient way to do this?
I naïvely tried the following:
proc deduplicate(inputFqs: iterator) =
  var
    record: Fastq
  record = (name: "", nucls: "".Nucleotides, quals: "".Qualities)
  for fastq in inputFqs:
    if record.nucls != fastq.nucls:
      if record.name != "":
        echo $record
      record = fastq
    else:
      record = fuseFastq(record, fastq)
      continue
  if record.name != "":
    echo $record
when isMainModule:
  let inputFqs = parseFastqs(stdin)
  deduplicate(inputFqs)
And this results in the following compilation error: Error: attempting to call undeclared routine: 'parseFastqs'.
I searched and understood from the manual that I should make my iterator a closure iterator. So I started simply by just using the {.closure.} pragma:
iterator parseFastqs(input: File): Fastq {.closure.} =
But I kept having the same error.
So I tried mimicking more closely the example given in the manual:
iterator parseFastqs(input: File): Fastq {.closure.} =
  var
    nameLine: string
    nucLine: string
    quaLine: string
  while not input.endOfFile:
    nameLine = input.readLine()
    nucLine = input.readLine()
    discard input.readLine()
    quaLine = input.readLine()
    yield makeFastq(nameLine, nucLine, quaLine)
proc deduplicate(inputFqs: iterator(): Fastq {.closure.}) =
  var
    record: Fastq
  record = (name: "", nucls: "".Nucleotides, quals: "".Qualities)
  for fastq in inputFqs:
    if record.nucls != fastq.nucls:
      if record.name != "":
        echo $record
      record = fastq
    else:
      record = fuseFastq(record, fastq)
      continue
  if record.name != "":
    echo $record
deduplicate(parseFastqs(stdin))
This resulted in a type error:
Error: type mismatch: got (iterator (): Fastq{.closure.})
but expected one of:
iterator items[T](a: set[T]): T
iterator items(a: cstring): char
iterator items[T](a: openArray[T]): T
iterator items[IX, T](a: array[IX, T]): T
iterator items(a: string): char
iterator items[T](a: seq[T]): T
iterator items(E: typedesc[enum]): E:type
iterator items[T](s: HSlice[T, T]): T
expression: items(inputFqs)
What am I doing wrong?
Edit: Solving the type issue
It seems that the type mismatch can be solved by changing for fastq in inputFqs: to for fastq in inputFqs():. The situation is back to Error: attempting to call undeclared routine: 'parseFastqs'.
Some tinkering with the example from the manual reveals that the type of the iterator parameter does not need to have the parentheses. The following compiles and runs correctly:
iterator count0(): int {.closure.} =
  yield 0
iterator count2(): int {.closure.} =
  var x = 1
  yield x
  inc x
  yield x
proc invoke(iter: iterator) =
  for x in iter(): echo x
invoke(count0)
invoke(count2)
Now I would be interested in the meaning of the parentheses in the original example:
proc invoke(iter: iterator(): int {.closure.}) =

You must loop over an iterator
for item in myIterator():
  echo repr item
or you could transform it into a sequence
import sequtils
echo toSeq(myIterator())


I'm trying to create and pass a new record which contains a map from the jython processor in streamsets but getting this error?

I want the newRecord to contain a map of column names and column values. I am getting the following error, which I am not able to resolve:
Record1-Error Record1 SCRIPTING_04 - Script sent record to error: write(): 1st arg can't be coerced to com.streamsets.pipeline.stage.util.scripting.ScriptRecord : (View Stack Trace... )
from datetime import datetime
metadata_dict = {}
for metadata in sdc.records[0].value['XMLData']['Metadata'][0]['FieldDefinitions'][0]['FieldDefinition']:
    metadata_dict [metadata['attr|id']] = metadata ['attr|alias']
for record in sdc.records:
    try:
        for row in record.value['XMLData']['Record']:
            newRecord = sdc.createRecord(str(datetime.now()))
            newRecord = sdc.createMap (False)
            value = row ['Field']
            for values in value:
                column_id = values ['attr|id']
                column_name = metadata_dict [column_id]
                for a in values:
                    if a == 'value':
                        column_value = values ['value']
                    elif a == 'ListValues':
                        column_value = values ['ListValues']
                    elif a == 'Groups':
                        column_value = values ['Groups']
                    elif a == 'Users':
                        column_value = values ['Users']
                newRecord[column_name] = column_value
            sdc.output.write(newRecord)
    except Exception as e:
        sdc.error.write(record, str(e))
You have a bug in your code:
newRecord = sdc.createMap (False)
Here, you create a map and place it into a newRecord variable.
sdc.output.write(newRecord)
Here, you're trying to write a map, not a record, into the output.
You should do something like:
newRecord = sdc.createRecord(...
myMap = sdc.createMap(...)
myMap['foo'] = 'bar'
newRecord.value = myMap
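To see the record/map distinction outside of a pipeline, here is a minimal sketch. FakeSdc, FakeOutput and FakeRecord are hypothetical stand-ins written only for this illustration; they mimic just the calls that appear in the question (createRecord, createMap, output.write) and are not the StreamSets API. The emit_row helper is likewise an assumed name.

```python
from datetime import datetime


class FakeRecord:
    """Hypothetical stand-in for a record; only the `value` attribute is modeled."""
    def __init__(self, record_id):
        self.record_id = record_id
        self.value = None


class FakeOutput:
    """Hypothetical stand-in for sdc.output; rejects anything that is not a record."""
    def __init__(self):
        self.written = []

    def write(self, record):
        if not isinstance(record, FakeRecord):
            raise TypeError("write(): expected a record, got %s" % type(record).__name__)
        self.written.append(record)


class FakeSdc:
    """Hypothetical stand-in exposing only the calls used in the question."""
    def __init__(self):
        self.output = FakeOutput()

    def createRecord(self, record_id):
        return FakeRecord(record_id)

    def createMap(self, list_map):
        return {}


def emit_row(sdc, columns):
    new_record = sdc.createRecord(str(datetime.now()))  # the record itself
    row_map = sdc.createMap(False)                      # a separate variable for the map
    for name, value in columns.items():
        row_map[name] = value
    new_record.value = row_map                          # attach the map to the record
    sdc.output.write(new_record)                        # write the record, not the map
    return new_record


sdc = FakeSdc()
emit_row(sdc, {"Title": "foo", "Owner": "bar"})
```

The point of the sketch is only the variable handling: the record and the map live in two different names, and the map reaches the output through the record's value.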

How to check the type of a document element (sub document vs list)? "ReqlServerCompileError: Variable name not found in: var_1"

I extend the RethinkDb API by providing some extra functions.
For example I simplify the expression
site_ids = r.table('periods')\
['regions']\
.concat_map(lambda row: row['sites'])\
['id']
to
site_ids = f['periods']\
.unwind('regions.sites.id')
using a custom unwind method that is able to resolve a path of nested document elements. If an item in the given path is a list, its entries are concatenated with concat_map. Otherwise the item is accessed with bracket notation:
def unwind(self, path):
    items = path.split('.')
    cursor = self._cursor
    for item in items:
        is_list = isinstance(cursor[item].run().next(), list)
        if is_list:
            cursor = cursor.concat_map(lambda row: row[item])
        else:
            cursor = cursor[item]
    return self.wrap(self._f, cursor)
=> How can I improve the type check to find out if an element is a list? The check should not require an extra .run() and it should work in main queries as well as in sub queries.
My current implementation with the expression
is_list = isinstance(cursor[item].run().next(), list)
works fine in "main queries" like
result = f['periods'] \
.unwind('regions.sites.plants.product.process.technologies')\
.populate_with('periods', 'technologies')\
.sum('specific_cost_per_year') \
.run()
It does not work in sub queries, e.g. inside a mapping function:
def period_mapper(period):
    return {
        'year': period['start'],
        'site_ids': f.wrap(period).unwind('regions.sites.id')
    }
f.table('periods')\
    .map(period_mapper)\
    .run()
I get the error
rethinkdb.errors.ReqlServerCompileError: Variable name not found in:
var_1['regions']
^^^^^
because I am not able to .run() a query on the passed variable argument "period".
I tried to replace the if-then-else condition with r.branch but that did not help.
=> How can I choose an operator based on the type of the current cursor content in a better way?
Code of my selection class that wraps a RethinkDb cursor:
from rethinkdb.ast import RqlQuery
# needs to inherit from RqlQuery for the json serialization to work
class AbstractSelection(RqlQuery):
    def __init__(self, f, cursor):
        self._f = f
        self._cursor = cursor
    def __getitem__(self, identifier):
        cursor = self._cursor[identifier]
        return self.wrap(self._f, cursor)
    def __repr__(self):
        return self._cursor.__repr__()
    def __str__(self):
        return self._cursor.__str__()
    def build(self):
        return self._cursor.build()
    @property
    def _args(self): # required for json serialization
        return self._cursor._args
    @property
    def optargs(self): # required for json serialization
        return self._cursor.optargs
    def wrap(self, r, cursor):
        raise NotImplementedError('Needs to be implemented by inheriting class')
    def unwind(self, path):
        items = path.split('.')
        cursor = self._cursor
        for item in items:
            is_list = isinstance(cursor[item].run().next(), list)
            if is_list:
                cursor = cursor.concat_map(lambda row: row[item])
            else:
                cursor = cursor[item]
        return self.wrap(self._f, cursor)
    def pick(self, path, query):
        return self.unwind(path).get(query)
    def populate(self, collection_name, path):
        return self.map(lambda identifier:
            self._f[collection_name]
                .pick(path, {'id': identifier})
        )
    def get(self, query):
        cursor = self._cursor.filter(query)[0]
        return self.wrap(self._f, cursor)
    def to_array(self):
        return [item for item in self._cursor]
I managed to use type_of in combination with branch. Accessing the item with bracket notation returns a STREAM and I had to get the first item with [0] before using type_of to check for the 'ARRAY' type. This also works if the property is not an array:
def unwind(self, path):
    items = path.split('.')
    cursor = self._cursor
    r = self._f._r
    for item in items:
        cursor = r.branch(
            cursor[item][0].type_of() == 'ARRAY',
            cursor.concat_map(lambda row: row[item]),
            cursor[item]
        )
    return self.wrap(self._f, cursor)
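For readers who want to check what result the path resolution is supposed to produce, the following is a plain-Python analogue that works on in-memory dicts and lists. It is not ReQL and does not touch the database; the function name unwind_local and the sample periods data are assumptions made for this sketch. Unlike the branch version above, which inspects the first element of the stream, this sketch decides per document whether a field is a list.

```python
def unwind_local(documents, path):
    """Resolve a dotted path over a list of dicts, flattening list-valued fields."""
    current = list(documents)
    for key in path.split('.'):
        next_level = []
        for doc in current:
            value = doc[key]
            if isinstance(value, list):
                next_level.extend(value)   # plays the role of concat_map
            else:
                next_level.append(value)   # plays the role of bracket access
        current = next_level
    return current


# Sample data (assumed shape, modeled on the 'regions.sites.id' path in the question)
periods = [
    {'start': 2020, 'regions': [
        {'sites': [{'id': 'a'}, {'id': 'b'}]},
        {'sites': [{'id': 'c'}]},
    ]},
]

site_ids = unwind_local(periods, 'regions.sites.id')
print(site_ids)  # ['a', 'b', 'c']
```

Running a small fixture like this next to the real query can help confirm that the server-side version flattens the same levels.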

using paste0 in file name in exams2moodle

I am trying to create a loop to automate exam generation using the exams package.
I have created a series of exercises like this:
gr1 <- c("ae1_IntroEst_1.Rmd","ae1_IntroEst_2.Rmd","ae1_IntroEst_3.Rmd","ae1_IntroEst_4.Rmd")
gr2 <- c("ae1_IntroProcEst_1.Rmd","ae1_IntroProcEst_2.Rmd","ae1_IntroProcEst_3.Rmd","ae1_IntroProcEst_4.Rmd")
...etc...
Now, I am creating a loop to export all the exercises to Moodle XML:
for (i in 1:2){
  grupo <- paste0("gr",i)
  exams2moodle(grupo, name = paste0("mt1_",i, "_M"), dir = "nops_moodle", encoding = "UTF-8", schoice = list(answernumbering = "none", eval = ee))
}
But I am getting this error:
Error in xexams(file, n = n, nsamp = nsamp, driver = list(sweave = list(quiet = quiet, : The following files cannot be found: gr11.
If I replace "grupo" by "gr1" then it works... (but I am generating 20 exercises). I can't figure it out...
Any ideas?
Thanks!
This happens because grupo is a string: "gr1". exams2moodle therefore receives that string as its first argument and treats it as a file name, not as the vector of files you intended.
If you want to use a variable whose name is stored in a string, use get (get: Return the Value of a Named Object).
Check the sample code:
> x <- 'foo'
> foo <- 'bar'
> x
[1] "foo"
> get(x)
[1] "bar"
>
In your case:
for (i in 1:2){
  grupo <- paste0("gr",i)
  exams2moodle(get(grupo), name = paste0("mt1_",i, "_M"), dir = "nops_moodle", encoding = "UTF-8", schoice = list(answernumbering = "none", eval = ee))
}

how can we pass multiple arguments in the background functions in karate feature file

I am passing two arguments to my custom function, but when I call it in the Background section the first argument is dropped and only the second one is received.
Here is the sample code:
* def LoadToTigerGraph =
"""
function(args1,args2) {
  var CustomFunctions = Java.type('com.optum.graphplatform.util.CareGiverTest');
  var cf = new CustomFunctions();
  return cf.testSuiteTrigger(args1,args2);
}
"""
#*eval if (karate.testType == "component") karate.call(LoadToTigerGraph '/EndTestSample.json')
* def result = call LoadToTigerGraph "functional","/EndTestSample.json"
output :
test type is ************/EndTestSample.json
path is *************undefined
call accepts only a single argument, so when you want to pass two values you need to send them as two key/value pairs in one JSON object:
* def result = call LoadToTigerGraph { var1: "functional", var2: "/EndTestSample.json" }
Then declare the function as function(args) and read args.var1 and args.var2 inside it.

Kotlin: how to swap character in String

I would like to swap a string from "abcde" to "bcdea". So I wrote my code as below in Kotlin
var prevResult = "abcde"
var tmp = prevResult[0]
for (i in 0..prevResult.length - 2) {
    prevResult[i] = prevResult[i+1] // Error on prevResult[i]
}
prevResult[prevResult.length-1] = tmp // Error on prevResult[prevResult.length-1]
The compiler reports errors on the two lines marked in the comments above. What did I do wrong? How could I fix this and get what I want?
Strings in Kotlin, just like in Java, are immutable, so there is no string.set(index, value) (which is what string[index] = value is equivalent to).
To build a string from pieces you could use a StringBuilder, construct a CharSequence and use joinToString, operate on a plain array (char[]) or do result = result + nextCharacter (creates a new String each time -- this is the most expensive way).
Here's how you could do this with StringBuilder:
var prevResult = "abcde"
var tmp = prevResult[0]
var builder = StringBuilder()
for (i in 0..prevResult.length - 2) {
    builder.append(prevResult[i+1])
}
builder.append(tmp) // Don't really need tmp, use prevResult[0] instead.
var result = builder.toString()
However, a much simpler way to achieve your goal ("bcdea" from "abcde") is to just "move" one character:
var result = prevResult.substring(1) + prevResult[0]
or using the drop and take extension functions on String:
var result = prevResult.drop(1) + prevResult.take(1)
You can use drop(1) and first() (or take(1)) to do it in one line:
val str = "abcde"
val r1 = str.drop(1) + str.first()
val r2 = str.drop(1) + str.take(1)
As to your code, Kotlin String is immutable and you cannot modify its characters. To achieve what you want, you can convert a String to CharArray, modify it and then make a new String of it:
val r1 = str.toCharArray().let {
    for (i in 0..it.lastIndex - 1)
        it[i] = it[i+1]
    it[it.lastIndex] = str[0] // str is unchanged
    String(it)
}
(let is used for conciseness to avoid creating more variables)
Also, you can write a more general version of this operation as an extension function for String:
fun String.rotate(n: Int) = drop(n % length) + take(n % length)
Usage:
val str = "abcde"
val r1 = str.rotate(1)
Simpler solution: Just use toMutableList() to create a MutableList of Char and then join it all together with joinToString.
Example:
Given a String input, we want to exchange characters at positions posA and posB:
val chars = input.toMutableList()
val temp = chars[posA]
chars[posA] = chars[posB]
chars[posB] = temp
return chars.joinToString(separator = "")
Since Strings are immutable, you will have to copy the source string into an array, make changes to the array, then create a new string from the modified array. Look into:
getChars() to copy the string chars into an array.
Perform your algorithm on that array, making changes to it as needed.
Convert the modified array back into a String with String(char[]).