In JMeter: getting "Error invoking bsh method: eval"

BeanShellInterpreter: Error invoking bsh method: eval Sourced file: inline evaluation of: ``import org.apache.jmeter.services.FileServer; String path=FileServer.getFileSer . . . '' : Attempt to access property on undefined variable or class name
2022-01-27 19:50:51,923 WARN o.a.j.e.BeanShellPostProcessor: Problem in BeanShell script: org.apache.jorphan.util.JMeterException: Error invoking bsh method: eval Sourced file: inline evaluation of: ``import org.apache.jmeter.services.FileServer; String path=FileServer.getFileSer . . . '' : Attempt to access property on undefined variable or class name
JMeter BeanShell code (BeanShell PostProcessor):
import org.apache.jmeter.services.FileServer;
String path=FileServer.getFileServer().getBaseDir();
var1= vars.get("userid");
var2= vars.get("username");
var3= vars.get("userfullname");
var4= ${exam_id};
f = new FileOutputStream("C://apache-jmeter-5.4.1/apache-jmeter-5.4.1/bin/Script4.csv",true);
p = new PrintStream(f);
this.interpreter.setOut(p);
p.println(var1+ "," +var2 + "," +var3 + "," +var4);
f.close();

Change this line:
var4= ${exam_id};
to this one:
var4= vars.get("exam_id");
Since JMeter 3.1 it's recommended to use JSR223 Test Elements and the Groovy language for scripting, so it is worth considering migrating to Groovy.
If you run this code with more than one thread (virtual user) you will face a race condition resulting in data corruption when multiple threads write to the same file. JMeter provides the sample_variables property, which writes variable values to the .jtl results file; if you need to store the data in a separate file, go for the Flexible File Writer plugin.
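For example, assuming the same variable names as in the script above, adding the following line to user.properties (or passing it with -J on the command line) tells JMeter to append those values to every result row:
sample_variables=userid,username,userfullname,exam_id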

Python unit tests for Foundry's transforms?

I would like to set up tests on my transforms in Foundry, passing test inputs and checking that the output is the expected one. Is it possible to call a transform with dummy datasets (a .csv file in the repo), or should I create functions inside the transform to be called by the tests (data created in code)?
If you check your platform documentation under Code Repositories -> Python Transforms -> Python Unit Tests, you'll find quite a few resources there that will be helpful.
The sections on writing and running tests in particular are what you're looking for.
// START DOCUMENTATION
Writing a Test
Full documentation can be found at https://docs.pytest.org
Pytest finds tests in any Python file that begins with test_.
It is recommended to put all your tests into a test package under the src directory of your project.
Tests are simply Python functions that are also named with the test_ prefix and assertions are made using Python’s assert statement.
PyTest will also run tests written using Python’s builtin unittest module.
For example, in transforms-python/src/test/test_increment.py a simple test would look like this:
def increment(num):
    return num + 1


def test_increment():
    assert increment(3) == 5
Running this test will cause checks to fail with a message that looks like this:
============================= test session starts =============================
collected 1 item
test_increment.py F [100%]
================================== FAILURES ===================================
_______________________________ test_increment ________________________________
    def test_increment():
>       assert increment(3) == 5
E       assert 4 == 5
E        +  where 4 = increment(3)

test_increment.py:5: AssertionError
========================== 1 failed in 0.08 seconds ===========================
Testing with PySpark
PyTest fixtures are a powerful feature that enables injecting values into test functions simply by adding a parameter of the same name. This feature is used to provide a spark_session fixture for use in your test functions. For example:
def test_dataframe(spark_session):
    df = spark_session.createDataFrame([['a', 1], ['b', 2]], ['letter', 'number'])
    assert df.schema.names == ['letter', 'number']
// END DOCUMENTATION
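To answer the original question more directly, one common pattern is to keep the transformation logic in a plain DataFrame-in/DataFrame-out function and test that function against dummy data built in code with the spark_session fixture. A minimal sketch, where clean_accounts and its module path are hypothetical stand-ins for your own logic (assumed here to keep only active accounts):

# transforms-python/src/test/test_clean_accounts.py (hypothetical example)
from myproject.clean_accounts import clean_accounts  # assumed helper, not part of the docs above


def test_clean_accounts_filters_inactive(spark_session):
    # Dummy input created in code
    input_df = spark_session.createDataFrame(
        [("a1", "active"), ("a2", "inactive")],
        ["account_id", "status"],
    )

    result = clean_accounts(input_df)

    # Only the active account should survive
    assert result.count() == 1
    assert result.collect()[0]["account_id"] == "a1"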
If you don't want to specify your schemas in code, you can also read in a file from your repository by following the instructions in the documentation under How To -> Read file in Python repository
// START DOCUMENTATION
Read file in Python repository
You can read other files from your repository into the transform context. This might be useful in setting parameters for your transform code to reference.
To start, in your Python repository edit setup.py:
setup(
    name=os.environ['PKG_NAME'],
    # ...
    package_data={
        '': ['*.yaml', '*.csv']
    }
)
This tells Python to bundle the yaml and csv files into the package. Then place a config file (for example config.yaml, but it can also be csv or txt) next to your Python transform (e.g. read_yml.py, see below):
- name: tbl1
  primaryKey:
    - col1
    - col2
  update:
    - column: col3
      with: 'XXX'
You can read it in your transform read_yml.py with the code below:
from transforms.api import transform_df, Input, Output
from pkg_resources import resource_stream
import yaml
import json


@transform_df(
    Output("/Demo/read_yml")
)
def my_compute_function(ctx):
    stream = resource_stream(__name__, "config.yaml")
    docs = yaml.load(stream)
    return ctx.spark_session.createDataFrame([{'result': json.dumps(docs)}])
So your project structure would be:
some_folder
    config.yaml
    read_yml.py
This will output a single row in your dataset, with one column "result" containing:
[{"primaryKey": ["col1", "col2"], "update": [{"column": "col3", "with": "XXX"}], "name": "tbl1"}]
// END DOCUMENTATION
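And if you prefer to keep your test inputs as .csv files in the repository (as asked), you can also load such a file into a DataFrame inside a test and feed it to the same logic. A rough sketch, assuming a fixtures/accounts.csv sits next to the test module and clean_accounts is the same hypothetical helper as above:

import os

from myproject.clean_accounts import clean_accounts  # hypothetical helper


def test_clean_accounts_from_csv(spark_session):
    # Resolve the fixture path relative to this test file
    fixture = os.path.join(os.path.dirname(__file__), "fixtures", "accounts.csv")
    input_df = spark_session.read.csv(fixture, header=True, inferSchema=True)

    result = clean_accounts(input_df)

    # Replace with whatever your transform actually guarantees
    assert "account_id" in result.columns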

How to write specific data from JMeter execution output to CSV / Notepad using BeanShell scripting

We are working on a web services automation project using JMeter 4.0. In the response, JMeter returns data in JSON format, but we would like to store only specific data (Account ID, Customer ID or Account inquiry fields) from that JSON in a CSV file; at the moment the data ends up in the CSV file unformatted.
Looking for a workaround on this.
We are using the following code:
import java.io.File;
import org.apache.jmeter.services.FileServer;
Result = "FAIL";
Responce = prev.getResponseDataAsString():
if(Responce.contains("data"))
Result = "PASS";
f = new FileOutputStream("C:/Users/Amar.pawar/Desktop/testoup.csv",true);
p = new PrintStream(f);
p.println(vars.get("/ds1odmc") + "," + Result):
p.close();
f.close():
The following error is encountered:
Error invoking bsh method: eval In file: inline evaluation of: ``import java.io.File; import org.apache.jmeter.services.FileServer; Result = "FA . . . '' Encountered ":" at line 5, column 42.
We are looking to save specific data to a CSV (or txt) file instead of the complete output in unformatted form. Please look into the matter and suggest.
Looks like a typo. You use : instead of ; in three lines:
Responce = prev.getResponseDataAsString():
...
p.println(vars.get("/ds1odmc") + "," + Result):
...
f.close():
And if the problem is still not solved, it may be useful to check the article about testing complex logic with JMeter Beanshell.

snakemake STAR module issue and extra question

I discovered that the snakemake STAR module outputs as 'BAM Unsorted'.
Q1: Is there a way to change this to:
--outSAMtype BAM SortedByCoordinate
When I add the option in the 'extra' options I get an error message about duplicate definition:
EXITING: FATAL INPUT ERROR: duplicate parameter "outSAMtype" in input "Command-Line"
SOLUTION: keep only one definition of input parameters in each input source
Nov 15 09:46:07 ...... FATAL ERROR, exiting
logs/star/se/UY2_S7.log
Should I consider adding a sorting module behind STAR instead?
Q2: How can I take a module from the wrapper repo and make it a local module, allowing me to edit it?
The code:
__author__ = "Johannes Köster"
__copyright__ = "Copyright 2016, Johannes Köster"
__email__ = "koester@jimmy.harvard.edu"
__license__ = "MIT"


import os
from snakemake.shell import shell

extra = snakemake.params.get("extra", "")
log = snakemake.log_fmt_shell(stdout=True, stderr=True)

fq1 = snakemake.input.get("fq1")
assert fq1 is not None, "input-> fq1 is a required input parameter"
fq1 = [snakemake.input.fq1] if isinstance(snakemake.input.fq1, str) else snakemake.input.fq1
fq2 = snakemake.input.get("fq2")
if fq2:
    fq2 = [snakemake.input.fq2] if isinstance(snakemake.input.fq2, str) else snakemake.input.fq2
    assert len(fq1) == len(fq2), "input-> equal number of files required for fq1 and fq2"
input_str_fq1 = ",".join(fq1)
input_str_fq2 = ",".join(fq2) if fq2 is not None else ""
input_str = " ".join([input_str_fq1, input_str_fq2])

if fq1[0].endswith(".gz"):
    readcmd = "--readFilesCommand zcat"
else:
    readcmd = ""

outprefix = os.path.dirname(snakemake.output[0]) + "/"

shell(
    "STAR "
    "{extra} "
    "--runThreadN {snakemake.threads} "
    "--genomeDir {snakemake.params.index} "
    "--readFilesIn {input_str} "
    "{readcmd} "
    "--outSAMtype BAM Unsorted "
    "--outFileNamePrefix {outprefix} "
    "--outStd Log "
    "{log}")
Q1: Is there a way to change this to:
--outSAMtype BAM SortedByCoordinate
I would add another sorting rule after the wrapper, as it is the most 'standardized' way of doing it. You can also use another wrapper for sorting.
There is an explanation from the author of snakemake for the reason why the default is unsorted and why there is no option for sorted output in the wrapper:
https://bitbucket.org/snakemake/snakemake/issues/440/pre-post-wrapper
Regarding the SAM/BAM issue, I would say any wrapper should always output the optimal file format. Hence, whenever I write a wrapper for a read mapper, I ensure that output is not SAM. Indexing and sorting should not be part of the same wrapper I think, because such a task has a completely different behavior regarding parallelization. Also, you would lose the mapping output if something goes wrong during the sorting or indexing.
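If you go the separate-rule route, a minimal sketch using the samtools sort wrapper could look like this (the wrapper version, paths and thread count are assumptions to adapt to your own workflow):

rule samtools_sort:
    input:
        "star/{sample}/Aligned.out.bam"
    output:
        "star/{sample}/Aligned.sortedByCoord.out.bam"
    threads: 8
    wrapper:
        "0.31.1/bio/samtools/sort"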
Q2: How can I take a module from the wrapper repo and make it a local module, allowing me to edit it?
If you wanted to do this, one way would be to download a local copy of the wrapper. In the shell portion of the downloaded wrapper, change Unsorted to {snakemake.params.outsamtype}. In your Snakefile, change wrapper to script, point it at path/to/downloaded/wrapper, and add the outsamtype parameter:
rule star_se:
    input:
        fq1 = "reads/{sample}_R1.1.fastq"
    output:
        # see STAR manual for additional output files
        "star/{sample}/Aligned.out.bam"
    log:
        "logs/star/{sample}.log"
    params:
        # path to STAR reference genome index
        index="index",
        # optional parameters
        extra="",
        outsamtype = "SortedByCoordinate"
    threads: 8
    script:
        "path/to/downloaded/wrapper"
I think a separate rule without a wrapper for sorting, or even making your own STAR rule, is better. Modifying the wrapper defeats the whole purpose of it.

Jython - importing a text file to assign global variables

I am using Jython and wish to import a text file that contains many configuration values such as:
QManager = MYQM
ProdDBName = MYDATABASE
etc.
.. and then I am reading the file line by line.
What I am unable to figure out is this: as I read each line, I assign whatever is before the = sign to a local loop variable named MYVAR and whatever is after the = sign to a local loop variable named MYVAL. How do I ensure that once the loop finishes I have a set of global variables such as QManager, ProdDBName, etc.?
I've been working on this for days - I really hope someone can help.
Many thanks,
Bret.
See other question: Properties file in python (similar to Java Properties)
Automatically setting global variables is not a good idea in my opinion. I would prefer a global ConfigParser object or dictionary. If your config file is similar to Windows .ini files, then you can read it and set some global variables with something like:
def read_conf():
    global QManager
    import ConfigParser
    conf = ConfigParser.ConfigParser()
    conf.read('my.conf')
    QManager = conf.get('QM', 'QManager')
    print('Conf option QManager: [%s]' % (QManager))
(this assumes you have [QM] section in your my.conf config file)
If you want to parse the config file without the help of ConfigParser or a similar module, then try:
my_options = {}

f = open('my.conf')
for line in f:
    if '=' in line:
        k, v = line.split('=', 1)
        k = k.strip()
        v = v.strip()
        print('debug [%s]:[%s]' % (k, v))
        my_options[k] = v
f.close()
print('-' * 20)
# this will show just read value
print('Option QManager: [%s]' % (my_options['QManager']))
# this will fail with KeyError exception
# you must be aware of non-existing values or values
# where case differs
print('Option qmanager: [%s]' % (my_options['qmanager']))
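If you really do want module-level globals named after the keys (as asked), a rough sketch continuing from the snippet above is to push the parsed dictionary into the module namespace; the same caveats about missing or differently-cased keys still apply:

# Sketch only: turns every key of my_options into a global variable,
# so the values parsed above become accessible as plain names.
globals().update(my_options)

print('Global QManager: [%s]' % (QManager))
print('Global ProdDBName: [%s]' % (ProdDBName))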

How can I incorporate the current input filename into my Pig Latin script?

I am processing data from a set of files which contain a date stamp as part of the filename. The data within the file does not contain the date stamp. I would like to process the filename and add it to one of the data structures within the script. Is there a way to do that within Pig Latin (an extension to PigStorage maybe?) or do I need to preprocess all of the files using Perl or the like beforehand?
I envision something like the following:
-- Load two fields from file, then generate a third from the filename
rawdata = LOAD '/directory/of/files/' USING PigStorage AS (field1:chararray, field2:int, field3:filename);
-- Reformat the filename into a datestamp
annotated = FOREACH rawdata GENERATE
    REGEX_EXTRACT(field3,'*-(20\d{6})-*',1) AS datestamp,
    field1, field2;
Note the special "filename" datatype in the LOAD statement. Seems like it would have to happen there as once the data has been loaded it's too late to get back to the source filename.
You can use PigStorage by specifying -tagsource as follows:
A = LOAD 'input' using PigStorage(',','-tagsource');
B = foreach A generate INPUT_FILE_NAME;
The first field in each tuple will contain the input path (INPUT_FILE_NAME).
According to API doc http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/builtin/PigStorage.html
Dan
The Pig wiki has an example of PigStorageWithInputPath, which puts the filename in an additional chararray field:
Example
A = load '/directory/of/files/*' using PigStorageWithInputPath()
as (field1:chararray, field2:int, field3:chararray);
UDF
// Note that there are several versions of Path and FileSplit. These are intended:
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit;
import org.apache.pig.builtin.PigStorage;
import org.apache.pig.data.Tuple;

public class PigStorageWithInputPath extends PigStorage {
    Path path = null;

    @Override
    public void prepareToRead(RecordReader reader, PigSplit split) {
        super.prepareToRead(reader, split);
        path = ((FileSplit)split.getWrappedSplit()).getPath();
    }

    @Override
    public Tuple getNext() throws IOException {
        Tuple myTuple = super.getNext();
        if (myTuple != null)
            myTuple.append(path.toString());
        return myTuple;
    }
}
-tagSource is deprecated in Pig 0.12.0.
Instead use
-tagFile - Appends input source file name to beginning of each tuple.
-tagPath - Appends input source file path to beginning of each tuple.
A = LOAD '/user/myFile.TXT' using PigStorage(',','-tagPath');
DUMP A ;
will give you the full file path as the first column:
( hdfs://myserver/user/blo/input/2015.TXT,439,43,05,4,NAVI,PO,P&C,P&CR,UC,40)
Reference: http://pig.apache.org/docs/r0.12.0/api/org/apache/pig/builtin/PigStorage.html
A way to do this in Bash and PigLatin can be found at: How Can I Load Every File In a Folder Using PIG?.
What I've been doing lately though, and find to be much cleaner, is embedding Pig in Python. That lets you throw all sorts of variables and such between the two. A simple example is:
#!/path/to/jython.jar

# explicitly import Pig class
from org.apache.pig.scripting import Pig

# COMPILE: compile method returns a Pig object that represents the pipeline
P = Pig.compile(
    "a = load '$in'; store a into '$out';")

input = '/path/to/some/file.txt'
output = '/path/to/some/output/on/hdfs'

# BIND and RUN
results = P.bind({'in': input, 'out': output}).runSingle()

if results.isSuccessful():
    print 'Pig job succeeded'
else:
    raise Exception('Pig job failed')
Have a look at Julien Le Dem's great slides as an introduction to this, if you're interested. There's also a ton of documentation at http://pig.apache.org/docs/r0.9.2/cont.pdf.