What is the best way to save a hash! block in Rebol? - rebol

I am using Rebol2 and would like to persist a HASH! block.
Currently I am converting it to-string then using save.
Is there a better way? For example:
r: make hash! ["one" "two"]
I want to save this to a file, then load it back to r.

you are very near your goal. Just use save/all and load
>> r: make hash! ["one" "two"]
== make hash! ["one" "two"]
>> save/all %htest r
>> r: load %htest
== make hash! ["one" "two"]
If you want the same result in Red you just need one command more
>> r: do load %htest
== make hash! ["one" "two"]

You do that just like with any other value - save, then load.
There are zero benefits in using hash! for persistent storage, by the way. What load gives you back is a plain block! ([make hash! [...]]). Populating hashtable from this loaded data takes more time compared to just loading block!, but gives you faster lookups thereafter.
In other words, you can just:
>> save %database [one two]
>> make hash! load %database
== make hash! [one two]
As described in the tutorial that you linked.

Related

How to print the variable loaded and assigned from file

I am just learning machine learning using Octave. I want to load the data from the file and assign the data to one variable, then I just want to print the data to the console.The data in data.txt file is several-rows and two-columns matrix.
data = load('data.txt');
x = data(:, 1);
y = data(:, 2);
printf x;
printf y;
After executing the code, x and y will show up on the console, it is not what I expected, I just want to check the data loaded from the file, how to print this? Do I use wrong command?
Just type data in console and press enter . This will print the contents of the data variable with all its elements. Do NOT add a semicolon at end of line. This would suppress output.
The explicit way doing this is the command
display(data)

Pig: positionals counting from right?

I have a tuple in my pig script:
((,v1,fb,fql))
I know I can choose elements from the left as $0 (blank), $1 ("v1"), etc. But can I choose elements from the right? The tuples will be different lengths, but I would always like to get the last element.
You can't. You can however write a python UDF to extract it:
# Make sure to include the appropriate ouputSchema
def getFromBack(TUPLE, pos):
# gets elements from the back of TUPLE
# You can also do TUPLE[-pos], but this way it starts at 0 which is closer
# to Pig
return TUPLE[::-1][pos]
# This works the same way as above, but may be faster
# return TUPLE[-(pos + 1)]
And used like:
register 'myudf.py' using jython as pythonUDFs ;
B = FOREACH A GENERATE pythonUDFs.getFromBack(T, 0) ;

How to process various tasks like video acquisition parallel in Matlab?

I want to acquire image data from stereo camera simultaneously, or in parallel, save somewhere and read the data when need.
Currently I am doing
for i=1:100
start([vid1 vid2]);
imageData1=getdata(vid1,1);
imageData2=getdata(vid2,1);
%do several calculations%
....
end
In this cameras are working serially and it is very slow. How can I make 2 cameras work at a time???
Please help..
P.S : I also tried parfor but it does not help .
Regards
No Parallel Computing Toolbox required!
The following solution can generally solve problems like yours:
First the videos, I just use some vectors as "data" and save them to the workspace, these would be your two video files:
% Creating of some "videos"
fakevideo1 = [1 ; 1 ; 1];
save('fakevideo1','fakevideo1');
fakevideo2 = [2 ; 2 ; 2];
save('fakevideo2','fakevideo2');
The basic trick is to create a function which generates another instance of Matlab:
function [ ] = parallelinstance( fakevideo_number )
% create command
% -sd (set directory), pwd (current directory), -r (run function) ...
% finally "&" to indicate background computation
command = strcat('matlab -sd',{' '},pwd,{' '},'-r "processvideo(',num2str(fakevideo_number),')" -nodesktop -nosplash &');
% call command
system( command{1} );
end
Most important is the use of & at the end of the terminal command!
Within this function another function is called where the actual video processing is done:
function [] = processvideo( fakevideo_number )
% create file and variable name
filename = strcat('fakevideo',num2str(fakevideo_number),'.mat');
varname = strcat('fakevideo',num2str(fakevideo_number));
% load video to workspace or whatever
load(filename);
A = eval(varname);
% do what has to be done
results = A*2;
% save results to workspace, file, grandmothers mailbox, etc.
save([varname 'processed'],'results');
% just to show that both processes run parallel
pause(5)
exit
end
Finally call the two processes in your main script:
% function call with number of video: parallelinstance(fakevideo_number)
parallelinstance(1);
parallelinstance(2);
My code is completely executable, so just play around a bit. I tried to keep it simple.
After all you will find two .mat files with the processed video "data" in your workspace.
Be aware to adjust the string fakevideo to name root of all your video files.

[Q]uestion about reading and saving a large txt-file via {RSQLite} line by line into a DB

Since my hardware is very limited (a dual core with 32bit Win7 and 4GB of ram - I need to make the best of it.....) I try to save a large text-file (about 1.2GB) into a DB, which I can then trigger by SQL-like queries to do some analytics on particular subgroups.
To be honest I'm not familiar with this area and since I could not find help regarding my issues via "googling", I just quickly show what I came up with and how I thought things would look like:
First I check how many columns my txt-file has:
k <- length(scan("data.txt", nlines=1, sep="\t", what="character"))
Then I open a connection to the text file so that it does not need to be opened
again for every single line:
filecon<-file("data.txt", open="r")
Then I initialize a connection (dbcon) to an SQLite database
dbcon<- dbConnect(dbDriver("SQLite"), dbname="mydb.dbms")
I find out where the position of the first line is
pos<-seek(filecon, rw="r")
Since the first line contains the column-names I save them for later use
col_names <- unlist(strsplit(readLines(filecon, n=1), "\t"))
Next, I test to read the first 10 lines, line by line,
and save them into a DB, which themself (should) contain k - columns with columns-names = col_names.
for(i in 1:10) {
# prints the iteration number in hundreds
if(i %% 100 == 0) {
print(i)
}
# read one line into a variable tt
tt<-readLines(filecon, n=1)
# parse tt into a variable tt2, since tt is a string
tt2<-unlist(strsplit(tt, "\t"))
# Every line, read and parsed from the text file, is immediately saved
# in the SQLite database table "results" using the command dbWriteTable()
dbWriteTable(conn=dbcon, name="results", value=as.data.frame(t(tt2[1:k]),stringsAsFactors=T), col.names=col_names, append=T)
pos<-c(pos, seek(filecon, rw="r"))
}
If I run this I get the following error
Warning messages:
1: In value[[3L]](cond) :
RS-DBI driver: (error in statement: table results has 738 columns but 13 values were supplied)
Why should I supply 738 columns? If I change k (which is 12) to 738, the code works but then I need to trigger the columns by V1, V2, V3,.... and not by the column-names I intended to supply
res <- dbGetQuery(dbcon, "select V1, V2, V3, V4, V5, V6 from results")
Any help or even a small hint is very much appreciated!

Perl SQL file write delayed

Here is the simple perl script fetching data from SQL.
Read data and write on a file OUTFILE, and print the data on screen for every 10000th line.
One thing I am curious is that the printing the data on screen terminates very quickly(in 30 seconds), however, data fetching and writing on a file ends very slowly(30 minutes later).
The amount of data is not large. The output files size is less than 100Mbyte.
while ( my ($a,$b) = $curSqlEid->fetchrow_array() )
{
printf OUTFILE ("%s,%d\n", $a,$b);
$counter ++;
if($counter % 10000 == 0){
printf ("%s,%d\n", $a,$b);
}
}
$curSqlEid->finish();
$dbh->disconnect();
close(OUTFILE);
You are suffering from buffering.
Handles other than STDERR are buffered by default, and most handles use a block buffering. That means Perl will wait until there is 8KB* of data to write before sending anything to the system.
STDOUT is special. When is attached to a terminal (and only then), it uses a different kind of buffering: line buffering. When using line buffering, the data is flushed every time a newline is encountered in the data to write.
You can see this by running
$ perl -e'print "abc"; print "def"; sleep 5; print "\n"; sleep 5;'
[ 5 seconds pass ]
abcdef
[ 5 seconds pass ]
$ perl -e'print "abc"; print "def"; sleep 5; print "\n"; sleep 5;' | cat
[ 10 seconds pass ]
abcdef
The solution is to turn off buffering.
use IO::Handle qw( ); # Not needed on Perl 5.14 or later
OUTFILE->autoflush(1);
* — 8KB is the default. It can be configured when Perl is compiled. It used to be a non-configurable 4KB until 5.14.
I think you are seeing the output file size as 0 while the script is running and displaying on the console. Do not go by that. The file size will show up only once the script has finished. This is due to output buffering.
Anyways, the delay cannot be as large as 30 min. Once the script is done, you should see the output file data.
I tried various things, but the final conclusion is that python and perl has basically different handling data flow from DB. It looks like in perl, it is possible to handle data line by line while the data is transferred from DB. However, in Python it needs to wait until the entire data download from the server to process it.