Why does Rebol's copy-file fail with really big files when Windows Explorer doesn't?

I tried Carl's function from
http://www.rebol.com/article/0281.html
With a 155 MB file it works.
Then I tested it with a 7 GB file and it fails, without reporting any limit.
Why is there a limit? I can't see anything in the code that imposes one.
There's no error message:
>> copy-file to-rebol-file "D:\#mirror_ftp\cpmove.tar" to-rebol-file "D:\#mirror_ftp\testcopy.tar"
0:00
== none
>>

REBOL 2 uses 32-bit signed integers, so it can't handle files bigger than 2147483647 bytes (2^31 - 1), which is roughly 2 GB. REBOL 3 uses 64-bit integers, so it won't have that limitation.
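For a quick sanity check of where that limit falls relative to the file sizes in the question (plain Python arithmetic, not REBOL):

# REBOL 2 file offsets are 32-bit signed integers, so the ceiling is 2^31 - 1 bytes.
limit = 2**31 - 1                # 2147483647 bytes, about 2.0 GiB
print(155 * 1024**2 <= limit)    # True  -> a 155 MB file fits under the limit
print(7 * 1024**3 <= limit)      # False -> a 7 GB file exceeds it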

Related

How to interact with a subprocess through its stdin, stdout, stderr in Smalltalk?

This Python code shows how to start a process on Windows 10, send it string commands, and read its string responses through the process's stdin and stdout pipes:
Python 3.8.0 (tags/v3.8.0:fa919fd, Oct 14 2019, 19:37:50) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from subprocess import *
>>> p = Popen("c:/python38/python.exe", stdin=PIPE, stdout=PIPE)
>>> p.stdin.write(b"print(1+9)\n")
11
>>> p.communicate()
(b'10\r\n', None)
>>>
As you can see, the python.exe process returned 10 as the answer to print(1+9). Now I want to do the same in Pharo (or Squeak) on Windows 10. I expect something similar: short, simple, understandable, and actually working.
I installed OSProcess and ProcessWrapper (they were missing in Pharo; it's also strange that I got a warning that they are not marked for Pharo 8.0 and were not checked to work with it, but OK), and I tried ProcessWrapper, PipeableOSProcess (copy-pasting different snippets from the web), etc., with zero success! The results were:
- nothing happened; python.exe was not started
- the VM error console opened (the white console at the bottom of Pharo, controlled via the F2 menu)
- various exceptions
- etc.
Would somebody show me a simple working example of how to start a process, send it commands, read its answers, then send again, and so on in a loop? I plan to run such communication in a detached thread and use it as a service, because Pharo (and Smalltalk in general) is missing most bindings, so I will fall back to subprocess communication like in the "good" old days...
I know how to call a command and get its output:
out := LibC resultOfCommand: 'dir ', aDir.
but I am talking about another scenario: communicating with a running process interactively (for example, with SSH or something similar, like python.exe in the example above).
PS. Maybe it's possible to do it with LibC #pipe:mode even?
Let me start by saying that PipeableOSProcess is probably broken on Windows. I tried it and it just opened a command line window and nothing else (it did not freeze my Pharo 8). The whole OSProcess package does not work correctly in my eyes.
So I took a shot at LibC, which is not even supposed to work on Windows:
I'm a module defining access to standard LibC. I'm available under Linux and OSX, but not under Windows for obvious reasons :)
Next, Python's Windows support is probably much better than Pharo's.
The solution, which is more of a workaround using files, is to use LibC and #runCommand: (I tried to come up with an example similar to the one you showed above):
| count command result outputFile errorFile |
count := 9 + 1. "The counting"
command := 'echo ', count asString. "command run at the command line"
outputFile := 'output'. "a file into which the output is redirected"
errorFile := 'error'. "a file where the error output is redirected"
result := LibC runCommand: command, "run the command"
    ' >', outputFile,               "redirect the output to output file"
    ' 2>', errorFile.

"reading back the value from output file"
outputFile asFileReference contents lines.

"reading back the value from the error file - which is empty in this case"
errorFile asFileReference contents lines.
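For comparison, here is the same file-redirection workaround sketched in Python (LibC's #runCommand: is essentially the C system() call, which os.system mirrors; the output and error file names are the same placeholders as above):

import os

count = 9 + 1
# run the command and redirect stdout/stderr to files, as runCommand: does above
os.system(f"echo {count} > output 2> error")

# read the result back from the output file
with open("output") as f:
    print(f.read().strip())   # '10'
# the error file is empty in this case
with open("error") as f:
    print(f.read())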

Find out block size of a device

I am trying to find out the block size of a file system. I found these 2 commands:
[root@node1 test]# stat -fc %s .
4096
[root@node1 test]# blockdev --getbsz /dev/mapper/node1_test
512
Why is the result different? Which is the correct one?
Many thanks.
Both answers are correct; they just measure different things, and both values are in bytes. stat -fc %s reports the filesystem's block size, i.e. the allocation unit the filesystem works with (4096 bytes here), while blockdev --getbsz reports the block size the kernel uses for I/O on the underlying block device (512 bytes here, typically its sector size).
Also note that stat's %s format sequence only means "block size" when used with -f/--file-system; without -f it is the file's total size in bytes. See the note about the format sequences on this page for stat.
Reference for blockdev.
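If you would rather read the filesystem block size programmatically than parse stat output, a minimal sketch using Python's os.statvfs (Linux/macOS only; the values are in bytes):

import os

st = os.statvfs(".")      # filesystem statistics for the current directory
print(st.f_bsize)         # preferred I/O block size in bytes (4096 in the example above)
print(st.f_frsize)        # fundamental filesystem block ("fragment") size in bytes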

Why does a received ZFS dataset use less space than the original?

I have a dataset on server1 that I want to back up to a second server, server2.
Server1 (original):
zfs list -o name,used,avail,refer,creation,usedds,usedsnap,origin,compression,compressratio,refcompressratio,mounted,atime,lused storage/iscsi/webhost-old produces:
NAME USED AVAIL REFER CREATION USEDDS USEDSNAP ORIGIN COMPRESS RATIO REFRATIO MOUNTED ATIME LUSED
storage/iscsi/webhost-old 67,8G 1,87T 67,8G Út kvě 31 6:54 2016 67,8G 16K - lz4 1.00x 1.00x - - 67,4G
Sending volume to the 2nd server:
zfs send storage/iscsi/webhost-old | pv | ssh -c arcfour,aes128-gcm@openssh.com root@10.0.0.2 zfs receive -Fduv pool/bkp-storage
received 69,6GB stream in 378 seconds (189MB/sec)
Server2 zfs list produces:
NAME USED AVAIL REFER CREATION USEDDS USEDSNAP ORIGIN COMPRESS RATIO REFRATIO MOUNTED ATIME LUSED
pool/bkp-storage/iscsi/webhost-old 36,1G 3,01T 36,1G Pá pro 29 10:25 2017 36,1G 0 - lz4 1.15x 1.15x - - 28,4G
Why is there such a difference in sizes? Thanks.
From what you posted, I noticed 3 things that seemed odd:
(1) the compressratio is 1.15x on system 2, but 1.00x on system 1
(2) on system 2, used is 1.27x higher than logicalused
(3) the logicalused value and the stream size zfs receive reports are ~2.3x higher on system 1 than on system 2 (a quick check of these ratios follows the list)
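The ratios come straight from the zfs list output above; a quick check in Python:

print(36.1 / 28.4)   # ~1.27: USED vs LUSED on system 2
print(67.4 / 28.4)   # ~2.37: LUSED on system 1 vs LUSED on system 2
print(69.6 / 28.4)   # ~2.45: the stream size zfs receive reported vs LUSED on system 2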
These terms are all defined in the man page, but it is still confusing to reverse-engineer an explanation for them in practice.
(1) could happen if you enabled compression on the source dataset after you wrote all the data to it, since ZFS doesn't rewrite the data to compress it when you enable that setting. The data sent by zfs send is uncompressed unless you use -c, but system 2 will try to compress it as it runs zfs receive if the setting is enabled on the destination dataset. If both system 1 and system 2 had the same compression settings before the data was written, they would have the same compressratio as well.
(2) can happen due to metadata written along with your data, but in this case it's too high for "normal" metadata, which accounts for 1-2% of most pools. It's probably caused by a pool-wide setting, like configuring RAID-Z, or a weird combination of striping and mirroring (like 4 stripes, but with one of them being a mirror).
For (3), I re-read the man page to try to figure it out:
logicalused
The amount of space that is "logically" consumed by this dataset and
all its descendents. See the used property. The logical space
ignores the effect of the compression and copies properties, giving a
quantity closer to the amount of data that applications see.
If you were sending a dataset (instead of a single iSCSI volume) and the send size matched system 2's logicalused value (instead of system 1's), I would guess that you forgot to send some child datasets (i.e. that you didn't use zfs send -R). However, neither of those is true in this case.
I had to do some additional digging -- this blog post from 2005 might contain the explanation. If system 1 didn't have compression enabled when the data was written (like I guessed above for (1)), the function responsible for not writing zeroed-out blocks (zio_compress_data) would not be run, so you probably have a bunch of empty blocks written to disk, and accounted for in the logicalused size. However, since lz4 is configured on system 2, it would run there, and those blocks would not be counted.

Unable to merge large files in R

I have run into a problem.
I have 10 large separate files, of file type File (no extension) and without column headers, totalling nearly 4 GB, which need to be merged. I have been told they are pipe-delimited text files, so I added the txt file extension to each file, which I hope is not the problem. RStudio crashes when I use the following code...
multmerge = function(mypath){
  filenames = list.files(path = mypath, full.names = TRUE)
  datalist = lapply(filenames, function(x){read.csv(file = x, header = F, sep = "|")})
  Reduce(function(x, y) {merge(x, y, all = T)}, datalist)
}
mymergeddata = multmerge("C://FolderName//FolderName")
or when I try to do something like this...
temp1 <- read.csv(file="filename.txt", sep="|")
:
temp10 <- read.csv(file="filename.txt", sep="|")
SomeData = Reduce(function(x, y) merge(x, y), list(temp1..., temp10))
I am seeing errors such as
"Error: C stack usage is too close to the limit r" and
"In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
Reached total allocation of 8183Mb: see help(memory.size)"
Then I saw that someone asked a question on SO as I was writing this question (here), so I was wondering whether SQL commands could be used in RStudio or SSMS to merge these large files. If they can, how should the merge be done? If it can be done, please can you advise me how to do this. I will keep looking around on the net.
If it can't, then what is the best method to merge these rather large files? Can this be achieved in RStudio, or is there open-source software for it?
I am working on a PC with 64-bit Windows and 8 GB of RAM. I have included the R and SQL tags to see what options there are.
Thanks in advance if anyone can help me.
Your machine doesn't have enough memory for your selected operations.
You have 10 files totalling about 4 GB.
When you merge the 10 files you create another object which is also about 4 GB, putting you very close to your machine's limit.
Your operating system, R, and whatever else you're running also consume RAM, so it's no surprise that you run out of memory.
I'd suggest taking a stepwise approach if you don't have access to a bigger machine:
- take the first two files and merge them;
- delete those file objects from R and keep only the merged one;
- load the third file and merge it with the earlier merged result.
Repeat until done. A rough sketch of the same stepwise idea follows below.
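The question is about R, but purely as an illustration of that stepwise idea, here is a Python/pandas sketch (the path, file pattern, and the outer-join choice are assumptions mirroring merge(x, y, all = T)):

import gc
import glob
import pandas as pd

# hypothetical location of the ten pipe-delimited files
files = sorted(glob.glob("C:/FolderName/FolderName/*.txt"))

# start with the first file, then fold in one file at a time
merged = pd.read_csv(files[0], sep="|", header=None)
for path in files[1:]:
    nxt = pd.read_csv(path, sep="|", header=None)
    merged = pd.merge(merged, nxt, how="outer")  # like merge(x, y, all = T) in R
    del nxt                                      # keep only the running merged result
    gc.collect()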

Best way to generate data to fill USB memory?

I need to fill a USB memory stick and I want others to be able to repeat this in an easy way, so I don't want to write "find a file that fills the memory" and make them hunt around for such a file.
Rather, I want to generate X MB of data and write it to a file that can then be transferred to the USB stick.
How would you do that (on Windows)?
If you are on Windows, you can use fsutil:
fsutil file createnew D:\fatboy.tmp SIZE
If you are on OSX, Solaris, or another system that provides it, you can use mkfile to make a big file:
mkfile SIZE PathToFileOnUSB
e.g.
mkfile 10m /some/where/on/USB/10MegFile
Or, if your system lacks mkfile, use dd to fill storage quickly and easily.
So, if you want 10MB, use:
dd if=/dev/zero of=PathToFileOnUSB bs=1m count=10
That says: dump data, reading from /dev/zero (which supplies an endless stream of zeroes) and writing to the file called PathToFileOnUSB, using a block size (bs) of 1 megabyte, and do this 10 times (count=10).
If you want X MB, use:
dd if=/dev/zero of=PathToFileOnUSB bs=1m count=X
If you want to fill the device, write until error without specifying a count:
dd if=/dev/zero of=PathToFileOnUSB bs=1m
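If you prefer a single cross-platform way to generate the X MB of data, a small Python sketch (the path and size are placeholders, like PathToFileOnUSB above):

# write size_mb megabytes of zero bytes to the target path, one megabyte at a time
size_mb = 10
chunk = b"\0" * (1024 * 1024)

with open("PathToFileOnUSB", "wb") as f:
    for _ in range(size_mb):
        f.write(chunk)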