Issue accessing lexical scope using B - variables

For debugging purposes I'd like to access the lexical scope of different subroutines that have a specific attribute set. That works fine. The problem is that when a variable stores a string, I get an empty string back. I do something like this:
$pad = $cv->PADLIST; # $cv is the coderef to the sub
@scratchpad = $pad->ARRAY; # getting the scratchpad
@varnames = $scratchpad[0]->ARRAY; # getting the variable names
@varcontents = $scratchpad[1]->ARRAY; # getting the content of the vars
for (0 .. $#varnames) {
    eval {
        my $name = $varnames[$_]->PV;
        my $content;
        # following line matches numbers, works so far
        $content = $varcontents[$_]->IVX if (scalar($varcontents[$_]) =~ /PVIV=/);
        # should match strings, but gives me undef
        $content = B::perlstring($varcontents[$_]->PV) if (scalar($varcontents[$_]) =~ /PV=/);
        print "DEBUGGER> Local variable: ", $name, " = ", $content, "\n";
    }; # there are special vars that throw an error, but I don't care about them
}
As I said in the comment, the eval is there to swallow the errors from the B::SPECIAL objects in the scratchpad.
Output:
Local variable: $test = 42
Local variable: $text = 0
The first output is correct; the second should output "TEXT" instead of 0.
What am I doing wrong?
EDIT: With a little more coding I got all the values of the variables, but they are not stored at the same indexes as in @varnames. So now the question is in which order the values are stored in @varcontents.
use strict;
use warnings;
use B;
sub testsub {
my $testvar1 = 42;
my $testvar2 = 21;
my $testvar3 = "testval3";
print "printtest1";
my $testvar4 = "testval4";
print "printtest2";
return "returnval";
}
no warnings "uninitialized";
my $coderef = \&testsub;
my $cv = B::svref_2object ( $coderef );
my $pad = $cv->PADLIST; # get scratchpad object
my @scratchpad = $pad->ARRAY;
my @varnames = $scratchpad[0]->ARRAY; # get varnames out of scratchpad
my @varcontents = $scratchpad[1]->ARRAY; # get content array out of scratchpad
my @vars; # array to store variable names and "undef" for special objects (print values, return values, etc.)
for (0 .. $#varnames) {
    eval { push @vars, $varnames[$_]->PV; };
    if ($@) { push @vars, "undef"; }
}
my @cont; # array to store the content of the variables and special objects
for (0 .. $#varcontents) {
    eval { push @cont, $varcontents[$_]->IV; };
    eval { push @cont, $varcontents[$_]->PV; };
}
print $vars[$_], "\t\t\t", $cont[$_], "\n" for (0 .. $#cont);
EDIT2: Added a runnable script to demonstrate the issue: variable names and variable values are not stored at the same index of the two arrays (@varnames and @varcontents).
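For what it's worth, the name pad and the value pad do line up index for index; the misalignment in the script above comes from pushing up to two entries per slot. Below is a minimal sketch of index-aligned dumping that dispatches on the B class of each value instead of pattern-matching its stringification. The sub and variable names are invented for the demo, and the pad is inspected while the sub is still running, since `my` slots are cleared when their scope is left:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use B;

my @report;

sub demo {
    my $num  = 42;
    my $text = "TEXT";
    inspect();            # inspect while the pad slots are live
}

sub inspect {
    my $cv = B::svref_2object(\&demo);
    # First element: the pad names; second: the values for depth 1.
    my ($names_av, $vals_av) = $cv->PADLIST->ARRAY;
    my @names = $names_av->ARRAY;
    my @vals  = $vals_av->ARRAY;
    for my $i (0 .. $#names) {
        # Unnamed/special slots either die in PV or return no name.
        my $name = eval { $names[$i]->PV } or next;
        my $sv = $vals[$i];
        my $content = $sv->isa('B::PV') ? $sv->PV
                    : $sv->isa('B::IV') ? $sv->IV
                    : next;
        push @report, "$name = $content";
    }
}

demo();
print "$_\n" for @report;
```

On threaded perls, constants such as the "printtest1" strings occupy unnamed pad slots; the name check skips them here, which is (if I read the pad layout correctly) why they seemed to shift the indexes in the two-array version.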

Related

Is it possible to create subroutine signatures at run time?

I've got a method in a class:
method options(*@opt) {
    if !@!valid-options {
        my $out = (cmd 'bin/md2html -h').out;
        my @matches = $out ~~ m:g/\s'--'(<-[\s]>+)/;
        for @matches -> $opt {
            push @!valid-options, $opt[0].Str;
        }
    }
    for @opt -> $opt {
        when !($opt (elem) @!valid-options) {
            warn "'$opt' is not a valid option";
        }
        push @!options, '--' ~ $opt;
    }
}
The method checks the options to see if they are valid and, if they are, places them into an attribute.
I pass args into the options method like this, as words:
$obj.options: <ftables ftasklists github>;
This works. But it got me wondering if it was possible to pass in the options as named flags like this:
$obj.options: :ftables, :ftasklists, :github
But since I don't know all the command's options ahead of time, I'd need to generate the named arguments dynamically. Is this possible? I tried this but had no luck:
# create a signature
my @params = Parameter.new(name => ':$option', type => Bool, :!default);
my $sig = Signature.new(:@params);
my &blah = -> $sig { say 'this works too' } ;
&blah(:option1);
Currently, there is no way to do that, short of using EVAL.
You can add a slurpy hash to any sub signature to catch all unexpected named arguments:
sub foo(*%_) { .say for %_.keys }
foo :bar, :baz; # bar baz
Creating your own signatures at runtime may become possible / easier when the RakuAST has landed.
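To connect the slurpy-hash answer back to the original method: a named slurpy lets callers write `:ftables, :ftasklists` without the method declaring the flag names ahead of time. A rough sketch - the `Converter` class and its internals are invented for this demo, and the validation step from the original method is omitted:

```raku
# Sketch: positional words and :named flags are treated alike;
# only flags passed as True are kept.
class Converter {
    has @!options;
    method options(*@opt, *%flags) {
        my @all = flat @opt, %flags.grep(*.value).map(*.key);
        @!options.append: @all.map('--' ~ *);
    }
    method option-list { @!options.sort }
}

my $obj = Converter.new;
$obj.options: <ftables ftasklists>;
$obj.options: :github;
say $obj.option-list;
```

The hash loses the order the flags were passed in, which is why the sketch sorts before displaying; that trade-off is inherent to the named-slurpy approach.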

Perl6: large gzipped files read line by line

I'm trying to read a gz file line by line in Perl6, however, I'm getting blocked:
I know from How to read gz file line by line in Perl6 how to read the whole file, however that method, reading everything into :out, uses far too much RAM to be usable except on very small files.
I don't understand how to use Perl6's Compress::Zlib to get everything line by line, although I opened an issue on their github https://github.com/retupmoca/P6-Compress-Zlib/issues/17
I'm trying to translate this Perl5 Compress::Zlib code, which works perfectly in Perl5:
use Compress::Zlib;
my $file = "data.txt.gz";
my $gz = gzopen($file, "rb") or die "Error reading $file: $gzerrno";
while ($gz->gzreadline($_) > 0) {
# Process the line read in $_
}
die "Error reading $file: $gzerrno" if $gzerrno != Z_STREAM_END ;
$gz->gzclose() ;
to something like this using Inline::Perl5 in Perl6:
use Compress::Zlib:from<Perl5>;
my $file = 'chrMT.1.vcf.gz';
my $gz = Compress::Zlib::new(gzopen($file, 'r'));
while ($gz.gzreadline($_) > 0) {
print $_;
}
$gz.gzclose();
but I can't see how to translate this :(
I'm also confused by the Archive::Libarchive example https://github.com/frithnanth/perl6-Archive-Libarchive/blob/master/examples/readfile.p6 - I don't see how I can get something like item 3 here.
There should be something like for $file.IO.lines(gz) -> $line { or something like that in Perl6; if it exists, I can't find it.
How can I read a large file line by line without reading everything into RAM in Perl6?
Update: Now tested, which revealed an error, now fixed.
Solution #2
use Compress::Zlib;
my $file = "data.txt.gz" ;
my $handle = try open $file or die "Error reading $file: $!" ;
my $zwrap = zwrap($handle, :gzip) ;
for $zwrap.lines {
.print
}
CATCH { default { die "Error reading $file: $_" } }
$handle.close ;
I've tested this with a small gzipped text file.
I don't know much about gzip etc. but figured this out based on:
Knowing P6;
Reading Compress::Zlib's README and choosing the zwrap routine;
Looking at the module's source code, in particular the signature of the zwrap routine our sub zwrap ($thing, :$zlib, :$deflate, :$gzip);
And trial and error, mainly to guess that I needed to pass the :gzip adverb.
Please comment on whether my code works for you. I'm guessing the main thing is whether it's fast enough for the large files you have.
A failed attempt at solution #5
With solution #2 working I would have expected to be able to write just:
use Compress::Zlib ;
.print for "data.txt.gz".&zwrap(:gzip).lines ;
But that fails with:
No such method 'eof' for invocant of type 'IO::Path'
This is presumably because this module was written before the reorganization of the IO classes.
That led me to @MattOates' "IO::Handle like object with .lines ?" issue. I note no response, and I saw no related repo at https://github.com/MattOates?tab=repositories.
I am focusing on the Inline::Perl5 solution that you tried.
For the call to $gz.gzreadline($_): it seems like gzreadline tries to return the line read from the zip file by modifying its input argument $_ (treated as an output argument, but it is not a true Perl 5 reference variable[1]), but the modified value is not returned to the Perl 6 script.
Here is a possible workaround:
Create a wrapper module in the current directory, e.g. ./MyZlibWrapper.pm:
package MyZlibWrapper;
use strict;
use warnings;
use Compress::Zlib ();
use Exporter qw(import);
our @EXPORT = qw(gzopen);
our $VERSION = 0.01;

sub gzopen {
    my ( $fn, $mode ) = @_;
    my $gz = Compress::Zlib::gzopen( $fn, $mode );
    my $self = {gz => $gz};
    return bless $self, __PACKAGE__;
}

sub gzreadline {
    my ( $self ) = @_;
    my $line = "";
    my $res = $self->{gz}->gzreadline($line);
    return [$res, $line];
}

sub gzclose {
    my ( $self ) = @_;
    $self->{gz}->gzclose();
}

1;
Then use Inline::Perl5 on this wrapper module instead of Compress::Zlib. For example ./p.p6:
use v6;
use lib:from<Perl5> '.';
use MyZlibWrapper:from<Perl5>;
my $file = 'data.txt.gz';
my $mode = 'rb';
my $gz = gzopen($file, $mode);
loop {
    my ($res, $line) = $gz.gzreadline();
    last if $res == 0;
    print $line;
}
$gz.gzclose();
[1]
In Perl 5 you can modify an input argument that is not a reference, and the change will be reflected in the caller. This is done by modifying entries in the special @_ array variable. For example: sub quote { $_[0] = "'$_[0]'" } $str = "Hello"; quote($str) will quote $str even if $str is not passed by reference.
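That aliasing behaviour is easy to verify in isolation:

```perl
#!/usr/bin/perl
# @_ aliases the caller's arguments, so assigning to $_[0]
# modifies the variable the caller passed in.
use strict;
use warnings;

sub quote { $_[0] = "'$_[0]'" }

my $str = "Hello";
quote($str);
print "$str\n";   # prints 'Hello' (with the quotes)
```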

How to read PHP output row by row and create HTML-links from each row

I have this PHP script:
<?php
$old_path = getcwd();
chdir('/var/www/html/SEARCHTOOLS/');
$term1 = $_POST['query1'];
$term2 = $_POST['query2'];
$var = "{$term1} {$term2}";
$outcome = shell_exec("searcher $var");
chdir($old_path);
echo "<pre>$outcome</pre>";
?>
On a search page, two search words are entered and the search button is pushed. The search result turns up as a web page like this:
/var/www/html/SEARCHTOOLS/1974-1991.pdf:1
/var/www/html/SEARCHTOOLS/1974-1991.pdf:3
/var/www/html/SEARCHTOOLS/1974-1991.pdf:7
/var/www/html/SEARCHTOOLS/1974-1991.pdf:7
/var/www/html/SEARCHTOOLS/1974-1991.pdf:9
/var/www/html/SEARCHTOOLS/1974-1991.pdf:13
/var/www/html/SEARCHTOOLS/1974-1991.pdf:13
The result shows links to individual PDF files with a page number in each file, but they are not clickable.
Is there a way to make these links clickable so that it opens up for instance in Evince or Acrobat at the correct page number?
Many thanks in advance.
/Paul
I found a correct answer to my problem. It took some time, but here it is:
<?php
// Get current working directory and put it as variable
$old_path = getcwd();
// Change directory
chdir('/var/www/html/SEARCHTOOLS/');
// Create first variable as result of first searchword on searchpage
$term1 = $_POST['query1'];
// Create second variable as result of second searchword on searchpage
$term2 = $_POST['query2'];
// Create a variable combining first AND second variable
$var = "{$term1} {$term2}";
// Create a variable as the result of the executed search using command "sokare" and variable "$var"
$outcome = shell_exec("sokare $var");
// Return to starting directory
chdir($old_path);
// Split the variable "$outcome" into lines, one per page of a PDF file where the variable "$var" is found
foreach(preg_split("/((\r?\n)|(\r\n?))/", $outcome) as $line){
// Create a variable out of the given pagenumber in PDF-file
$end = substr($line, strpos($line, ":") + 1);
// Trim the line by removing leading directories from line
$line2 = str_replace('/var/www/html', '', $line);
// Change a string from lower to upper case
$line2 = str_replace('searchtools', 'SEARCHTOOLS', $line2);
// Remove the colon and anything behind it from line
$line2 = array_shift(explode(':', $line2));
// Add suffix to line to facilitate linking to pagenumber in PDF-file
$line3 = str_replace(" ", "_", $line2).'#page=';
// Add pagenumber from the variable "$end"
$line3 = str_replace(" ", "_", $line3).$end;
// Print each line as a correct URL-link
echo "<pre><a href=$line3>$line3</a></pre>";
}
?>
The search results will now turn up as (and are clickable):
/SEARCHTOOLS/1974-1991.pdf#page=1
/SEARCHTOOLS/1974-1991.pdf#page=3
/SEARCHTOOLS/1974-1991.pdf#page=7
Just a small edit. The line ....
// Add suffix to line to facilitate linking to pagenumber in PDF-file
$line3 = str_replace(" ", "_", $line2).'#page=';
...works better with:
// Add suffix to line to facilitate linking to pagenumber in PDF-file
if (substr($line2, -3) == 'pdf') {
$line3 = $line2.'#page=';
}
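As a side note, the link-building part can be condensed into one pass; this sketch (with a hard-coded sample line standing in for the shell_exec output) also HTML-escapes the href, which matters if file names ever contain special characters:

```php
<?php
// Sketch with a hypothetical sample line instead of real search output.
$line = '/var/www/html/SEARCHTOOLS/1974-1991.pdf:7';

// Split path and page number at the last colon.
$page = substr($line, strrpos($line, ':') + 1);
$path = substr($line, 0, strrpos($line, ':'));

// Strip the document root so the link is relative to the web root.
$path = str_replace('/var/www/html', '', $path);

// Escape for safe use inside an HTML attribute.
$href = htmlspecialchars($path . '#page=' . $page, ENT_QUOTES);
echo "<pre><a href=\"$href\">$href</a></pre>\n";
```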

perl CGI parameters not all showing

I am passing about seven fields from an HTML form to a Perl CGI script.
Some of the values are not getting recovered using a variety of methods (POST, GET, CGI.pm or raw code).
That is, this code
my $variable = $q->param('varname');
resulted in about half the variables either being empty or undef, although the latter may have been a coincidence arising from the HTML page, which uses JavaScript.
I wrote a test page on the same platform with a simple form going to a simple CGI script, and again only half the parameters were represented. The remaining values were empty after the assignment.
I tried both POST and GET. I also tried GET and printed the query string after attempting to write out the variables; everything was in the query string as it should be. I'm using CGI.pm for this.
I tried to see if the variable values had been parsed successfully by CGI.pm by creating a version of my test CGI code which just displays the
parameters on the HTML page. The result is a bunch of odd strings like
CGI=HASH(0x02033)->param('qSetName')
suggesting that assignment of these values results in a cast of some kind, so I was unable to tell if they actually 'contained' the proper values.
My real form uses POST, so I just commented out the CGI.pm code and iterated over STDIN and it had all the name-value pairs as it should have.
Everything I've done points to CGI.pm, so I will try reinstalling it.
Here's the test code that missed half the vars:
#!/usr/bin/perl
use CGI;
my $q = new CGI;
my $subject = $q->param('qSetSubject');
my $topic = $q->param('qTopicName');
my $userName = $q->param('uName');
my $accessLevel = $q->param('accessLevel');
my $category = $q->param('qSetCat');
my $type = $q->param('qSetType');
print "Content-Type: text/html\n\n";
print "<html>\n<head><title>Test CGI<\/title><\/head>\n<body>\n\n<h2>Here Are The Variables:<\/h2>\n";
print "<list>\n";
print "<li>\$q->param(\'qSetSubject\') = $subject\n";
print "<li>\$q->param(\'qTopicName\') = $topic\n";
print "<li>\$q->param(\'uName\') = $userName\n";
print "<li>\$q->param(\'qSetCat\') = $accessLevel\n";
print "<li>\$q->param(\'qSetType\') = $category\n";
print "<li>\$q->param(\'accessLevel\') = $type\n";
print "<\/list>\n";
The results of ikegami's code are here:
qSetSubject: precalculus
qTopicName: polar coordinates
uName: kjtruitt
accessLevel: private
category: mathematics
type: grid-in
My attempt to incorporate ikegami's code
%NAMES = (
seqNum => 'seqNum',
uName => 'userName',
qSetName => 'setName',
accessLevel => 'accessLevel',
qSetCat => 'category',
qTopicName => 'topic',
qSetType => 'type',
qSetSubject => 'subject',
);
use CGI;
my $cgi = CGI->new();
print "Content-Type:text/html\n\n";
#print($cgi->header('text/plain'));
for my $name ($cgi->param) {
for ($cgi->param($name)) {
#print("$name: ".( defined($_) ? $_ : '[undef]' )."\n");
print "$NAMES{$name} = $_\n";
${$NAMES{$name}} = $_;
}
}
print "<html>\n<head><title>Test CGI<\/title><\/head>\n<body>\n\n<h2>Here Are The Variables:<\/h2>\n";
print "Hello World!\n";
print "<list>\n";
print "<li>\$q->param(\'qSetSubject\') = $subject\n";
print "<li>\$q->param(\'qTopicName\') = $topic\n";
print "<li>\$q->param(\'uName\') = $userName\n";
print "<li>\$q->param(\'qSetCat\') = $accessLevel\n";
print "<li>\$q->param(\'qSetType\') = $category\n";
print "<li>\$q->param(\'accessLevel\') = $type\n";
print "<\/list>\n";
You are receiving
qSetSubject: precalculus
qTopicName: polar coordinates
uName: kjtruitt
accessLevel: private
category: mathematics
type: grid-in
so
my $category = $q->param('qSetCat');
my $type = $q->param('qSetType');
should be replaced with
my $category = $q->param('category');
my $type = $q->param('type');
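A way to avoid this class of mix-up altogether is to read every parameter into one hash keyed by parameter name, so a variable can never be paired with the wrong param() call. The query string passed to CGI->new below is only there to make the sketch runnable without a web server (CGI.pm accepts an initial query string for debugging):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use CGI;

# Seed the object from a literal query string for this offline sketch;
# in a real CGI script, CGI->new reads the request as usual.
my $q = CGI->new('qSetSubject=precalculus;qSetCat=mathematics;qSetType=grid-in');

# One hash, keyed by the real parameter names.
my %param = map { ($_, scalar $q->param($_)) } $q->param;

print "$_ = $param{$_}\n" for sort keys %param;
```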

Losing time with displays and printed content when cleaning XML files

I'm trying to create a PowerShell routine to clean XML files automatically. I have successfully created my routine, and I'm able to clean a file with different functions and scripts. But I want to launch my PowerShell routine every time I have a new XML file, so I've added a system to deal with every file in a directory.
Now that I'm calling my routine on many files, it displays rows even though I don't use Write-Host, and I'm losing a lot of time cleaning the XML files.
Here is my code:
param ([string] $sourceDirectory, [string] $targetDirectory, [string] $XSDFileName, [string] $dataSourceName, [string] $databaseName)
clear
function clearLocalVariables{
#This functions clears my local variables
}
function createSQLNodesList{
param ([string] $dataSourceName,[string] $databaseName)
#This function creates a list of available and allowed nodes in my XML Files from SQL databases.
}
The following functions are used to check my nodes, and this is where the unwanted output appears when the routine is launched more than once:
function isNodeNameValid {
param ([string] $testedNodeName)
# This function is used to return the value of the nodeAnalysis function.
# It selects wich list the node will be analysed depending on the fact that
# it is a node for the aspect of the XML or for data.
# - $testedNodeName is a string representing the XML node analysed.
# If the node name is a 5 length string, begins with an A, and is composed of
# 4 digits ('AXXXX'), then it is data.
if(($testedNodeName.Length -eq 5) -and ($testedNodeName.Substring(0,1) -eq "A" ) -and ($testedNodeName.Substring(1,4) -match "^[-]?[0-9.]+$")) {
return nodeAnalysis -nodesList $nodesSQL -testedNodeName $testedNodeName
#Else, it is in the list for the aspect of the XML.
} else {
return nodeAnalysis -nodesList $nodesXML -testedNodeName $testedNodeName
}
}
function nodeAnalysis {
param ($nodesList,[string] $testedNodeName)
# This function is used to analyse each node name given.
# It compares the name of the name analysed to each node in the array given in parameter.
# - $nodesList is the corresponding array depending on the isNodeNameValid() method.
# - $testedNodeName is a string representing the XML node analysed.
# We compare each node of the node array to the testedNodeName. If the testedNodeName is in this array, the method returns 1.
foreach($nodeName in $nodesList) {
if ($testedNodeName -eq $nodeName) {
return 1
}
}
#If the node correspond to any node of the list, then the method returns 0.
return 0
}
# -- XML Nodes recursive cleaning method -- #
function cleanXMLContent {
param ($XMLDoc,[int] $endOfLeaf, [int] $boucle)
#This is the function I have trouble with (displays and efficiency):
while($endOfFile -ne 1) {
if($endOfLeaf -eq 1) {
if($XMLDoc.Name -eq "#document"){
$endOfFile = 1
}
if($XMLDoc.NextSibling) {
$XMLDoc = $XMLDoc.NextSibling
$endOfLeaf = 0
} else {
$XMLDoc = $XMLDoc.ParentNode
$endOfLeaf = 1
}
} else {
if(!(isNodeNameValid -testedNodeName $XMLDoc.Name)) {
if($XMLDoc.PreviousSibling) {
$nodeNameToDelete = $XMLDoc.Name
$siblingNodeName = $XMLDoc.PreviousSibling.Name
$XMLDoc = $XMLDoc.ParentNode
$XMLDoc.RemoveChild($XMLDoc.SelectSingleNode($nodeNameToDelete))
$XMLDoc = $XMLDoc.SelectSingleNode($siblingNodeName)
} else {
$nodeNameToDelete = $XMLDoc.Name
$XMLDoc = $XMLDoc.ParentNode
$XMLDoc.RemoveChild($XMLDoc.SelectSingleNode($nodeNameToDelete))
}
} else {
if($XMLDoc.HasChildNodes) {
$XMLDoc = $XMLDoc.FirstChild
$endOfLeaf = 0
} else {
if($XMLDoc.NextSibling) {
$XMLDoc = $XMLDoc.NextSibling
$endOfLeaf = 0
} else {
if($XMLDoc.ParentNode) {
$XMLDoc = $XMLDoc.ParentNode
if($XMLDoc.NextSibling) {
$endOfLeaf = 1
} else {
$XMLDoc = $XMLDoc.ParentNode
$endOfLeaf = 1
}
}
}
}
}
}
}
Write-Host "- Cleaning XML Nodes OK" -ForegroundColor Green
}
function createXSDSchema {
param ([string] $XSDFileName)
#This function is used to create XSD corresponding File
}
function cleanFile {
param ([string] $fileName, [string] $source, [string] $target, [string] $XSDFileName, [string] $dataSourceName, [string] $databaseName)
# -- Opening XML File -- #
#Creation of the XML Document iteration path
$date = Get-Date
[string] $stringDate = ($date.Year*10000 + $date.Month*100 + $date.Day) * 1000000 + ($date.Hour * 10000 + $date.Minute* 100 + $date.Second)
$date = $stringDate.substring(0,8) + "_" + $stringDate.substring(8,6)
#determining the path of the source and the target files.
$XMLDocPath = $source + $fileName
$XMLFutureFileNamePreWork = $fileName.Substring(0,$fileName.Length - 4)
$XMLFuturePath = $target + $XMLFutureFileNamePreWork + "cleaned" #_"+$date
#Creation of the XML Document
$XMLDoc = New-Object System.Xml.XmlDocument
$XMLFile = Resolve-Path($XMLDocPath)
#Loading of the XML File
$XMLDoc.Load($XMLFile)
[XML] $XMLDoc = Get-Content -Path $XMLDocPath
#If the XML Document exists, then we clean it.
if($XMLDoc.HasChildNodes) {
#The XML Document is cleaned.
cleanXMLContent $XMLDoc.FirstChild -endOfLeaf 0
Write-Host "- XML Cleaned" -ForegroundColor Green
#If it is a success, then we save it in a new file.
#if($AnalysisFinished -eq 1) {
#Modifying the XSD Attribute
#setting the XSD name into the XML file
createXSDSchema -XSDFileName $XSDFileName
#Creation of the XML Document
$XMLDoc.Save($XMLFuturePath+".xml")
Write-Host "- Creation of the new XML file successful at "$XMLFuturePath -ForegroundColor Green
#Creation of the XSD Corresponding Document
#createXSDSchema -XMLPath $XMLFuturePath
#}
} else {
Write-Host "Impossible"
}
}
Here I'm executing the whole process with the different functions. When I launch each function separately it works, but with many files it displays content and I lose a lot of time:
cd $sourceDirectory
$files = Get-ChildItem $sourceDirectory
# -- Local Variables Cleaning -- #
clearLocalVariables
Write-Host "- Variable cleaning successful" -ForegroundColor Green
# -- SQL Connection -- #
$nodesSQL = createSQLNodesList -dataSourceName $dataSourceName -databaseName $databaseName
foreach($file in $files){
cleanFile -fileName $file -source $sourceDirectory -target $targetDirectory -XSDFileName $XSDFileName -dataSourceName $dataSourceName -databaseName $databaseName
}
Do you have any idea how to avoid these displays of content?
I get a lot of blank rows, which multiplies the cleaning time by 10 or 15.
First, refrain from loading XML files twice. Use either
$XMLDoc = New-Object System.Xml.XmlDocument
$XMLDoc.Load($XMLFile)
or
[xml]$XMLDoc = Get-Content -Path $XMLFile
They both do the same thing.
Next, replace the iterative XML traversal with a recursive one:
function Clean-XmlContent {
Param(
[Parameter(Mandatory=$true)]
[Xml.XmlElement]$Node
)
if ($Node.HasChildNodes) {
foreach ($child in $Node.ChildNodes) {
if ($child -is [Xml.XmlElement]) { Clean-XmlContent $child }
}
}
if (-not (Test-NodeName -NodeName $Node.LocalName)) {
$Node.ParentNode.RemoveChild($Node)
}
}
and call it with the XML root node:
Clean-XmlContent $XMLDoc.DocumentElement
Also, simplify the node name validation:
function Test-NodeName {
Param(
[string]$NodeName
)
if ($NodeName -match '^A\d{4}$') {
return ($nodesSQL -contains $NodeName)
} else {
return ($nodesXML -contains $NodeName)
}
}
That should speed up things considerably.
Thanks to Ansgar Wiechers, I have found a way to accelerate my code: I now traverse the XML recursively. This way my code is much faster, but the content of the deleted rows was still printed.
But to avoid having the content of the deleted nodes printed on screen, I had to use:
[void]$RemovedNode.ParentNode.RemoveChild($RemovedNode)
Instead of:
$RemovedNode.ParentNode.RemoveChild($RemovedNode)
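The [void] cast is one of three common ways to stop PowerShell emitting uncaptured return values such as the node that RemoveChild() hands back; a standalone sketch with a throwaway XML document:

```powershell
# Sketch (hypothetical document): three equivalent ways to keep
# RemoveChild's return value out of the output stream.
$doc = New-Object System.Xml.XmlDocument
$doc.LoadXml('<root><keep/><drop1/><drop2/><drop3/></root>')

$n = $doc.SelectSingleNode('//drop1')
[void]$doc.DocumentElement.RemoveChild($n)        # cast to void

$n = $doc.SelectSingleNode('//drop2')
$null = $doc.DocumentElement.RemoveChild($n)      # assign to $null

$n = $doc.SelectSingleNode('//drop3')
$doc.DocumentElement.RemoveChild($n) | Out-Null   # pipe away (slowest of the three)

$doc.OuterXml
```

Assigning to $null or casting to [void] is generally faster than Out-Null, since nothing enters the pipeline at all.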