Inverse of `split` function: `join` a string using a delimeter - series

IN Red and Rebol(3), you can use the split function to split a string into a list of items:
>> items: split {1, 2, 3, 4} {,}
== ["1" " 2" " 3" " 4"]
What is the corresponding inverse function to join a list of items into a string? It should work similar to the following:
>> join items {, }
== "1, 2, 3, 4"

There's no inbuild function yet, you have to implement it yourself:
>> join: function [series delimiter][length: either char? delimiter [1][length? delimiter] out: collect/into [foreach value series [keep rejoin [value delimiter]]] copy {} remove/part skip tail out negate length length out]
== func [series delimiter /local length out value][length: either char? delimiter [1] [length? delimiter] out: collect/into [foreach value series [keep rejoin [value delimiter]]] copy "" remove/part skip tail out negate length length out]
>> join [1 2 3] #","
== "1,2,3"
>> join [1 2 3] {, }
== "1, 2, 3"
per request, here is the function split into more lines:
join: function [
series
delimiter
][
length: either char? delimiter [1][length? delimiter]
out: collect/into [
foreach value series [keep rejoin [value delimiter]]
] copy {}
remove/part skip tail out negate length length
out
]

There is an old modification of rejoin doing that
rejoin: func [
"Reduces and joins a block of values - allows /with refinement."
block [block!] "Values to reduce and join"
/with join-thing "Value to place in between each element"
][
block: reduce block
if with [
while [not tail? block: next block][
insert block join-thing
block: next block
]
block: head block
]
append either series? first block [
copy first block
] [
form first block
]
next block
]
call it like this rejoin/with [..] delimiter
But I am pretty sure, there are other, even older solutions.

Following function works:
myjoin: function[blk[block!] delim [string!]][
outstr: ""
repeat i ((length? blk) - 1)[
append outstr blk/1
append outstr delim
blk: next blk ]
append outstr blk ]
probe myjoin ["A" "B" "C" "D" "E"] ", "
Output:
"A, B, C, D, E"

Related

Alternative rules in PARSE with SKIP only did not output expected results

I encounter a snippet of code:
blk: [1 #[none] 2 #[none] 3 #[none]]
probe parse blk [
any [
set s integer! (print 'integer) | (print 'none) skip
]
]
the output is:
integer
none
integer
none
integer
none
none
true
Note that in front of true there are two nones. While the next code snippet output the expected output:
blk: [1 #[none] 2 #[none] 3 #[none]]
probe parse blk [
any [
set s integer! (print 'integer) | and none! (print 'none) skip
]
]
output:
integer
none
integer
none
integer
none
true
Why the previous one could not output the same result with the last one?
Your first rule should better be
probe parse blk [
any [
set s integer! (print 'integer) | skip (print 'none)
]
]
as in your first rule you print none, if there is no integer and just skip after. This causes, that you print none even when the cursor is at the end. The skip just ends the parsing.
In your second rule the none! is not true at the end, so the parsing stops. It can be written shorter
probe parse blk [
any [
set s integer! (print 'integer) | none! (print 'none)
]
]
In your second rule the and does not move the cursor forward, so you need the additional skip. Without and the none! eats one item already.

How do I convert set-words in a block to words

I want to convert a block from block: [ a: 1 b: 2 ] to [a 1 b 2].
Is there an easier way of than this?
map-each word block [ either set-word? word [ to-word word ] [ word ] ]
Keeping it simple:
>> block: [a: 1 b: 2]
== [a: 1 b: 2]
>> forskip block 2 [block/1: to word! block/1]
== b
>> block
== [a 1 b 2]
I had same problem so I wrote this function. Maybe there's some simpler solution I do not know of.
flat-body-of: function [
"Change all set-words to words"
object [object! map!]
][
parse body: body-of object [
any [
change [set key set-word! (key: to word! key)] key
| skip
]
]
body
]
These'd create new blocks, but are fairly concise. For known set-word/value pairs:
collect [foreach [word val] block [keep to word! word keep val]]
Otherwise, you can use 'either as in your case:
collect [foreach val block [keep either set-word? val [to word! val][val]]]
I'd suggest that your map-each is itself fairly concise also.
I like DocKimbel's answer, but for the sake of another alternative...
for i 1 length? block 2 [poke block i to word! pick block i]
Answer from Graham Chiu:
In R2 you can do this:
>> to block! form [ a: 1 b: 2 c: 3]
== [a 1 b 2 c 3]
Or using PARSE:
block: [ a: 1 b: 2 ]
parse block [some [m: set-word! (change m to-word first m) any-type!]]

How do I refer to variable in func argument when same is used in foreach

How can I refer to date as argument in f within the foreach loop if date is also used as block element var ? Am I obliged to rename my date var ?
f: func[data [block!] date [date!]][
foreach [date o h l c v] data [
]
]
A: simple, compose is your best friend.
f: func[data [block!] date [date!]][
foreach [date str] data compose [
print (date)
print date
]
]
>> f [2010-09-01 "first of sept" 2010-10-01 "first of october"] now
7-Sep-2010/21:19:05-4:00
1-Sep-2010
7-Sep-2010/21:19:05-4:00
1-Oct-2010
You need to either change the parameter name from date or assign it to a local variable.
You can access the date argument inside the foreach loop by binding the 'date word from the function specification to the data argument:
>> f: func[data [block!] date [date!]][
[ foreach [date o h l c v] data [
[ print last reduce bind find first :f 'date 'data
[ print date
[ ]
[ ]
>> f [1-1-10 1 2 3 4 5 2-1-10 1 2 3 4 5] 8-9-10
8-Sep-2010
1-Jan-2010
8-Sep-2010
2-Jan-2010
It makes the code very difficult to read though. I think it would be better to assign the date argument to a local variable inside the function as Graham suggested.
>> f: func [data [block!] date [date!] /local the-date ][
[ the-date: :date
[ foreach [date o h l c v] data [
[ print the-date
[ print date
[ ]
[ ]
>> f [1-1-10 1 2 3 4 5 2-1-10 1 2 3 4 5] 8-9-10
8-Sep-2010
1-Jan-2010
8-Sep-2010
2-Jan-2010

I need to generate 50 Millions Rows csv file with random data: how to optimize this program?

The program below can generate random data according to some specs (example here is for 2 columns)
It works with a few hundred of thousand lines on my PC (should depend on RAM). I need to scale to dozen of millions row.
How can I optimize the program to write directly to disk ? Subsidiarily how can I "cache" the parsing rule execution as it is always the same pattern repeated 50 Millions times ?
Note: to use the program below, just type generate-blocks and save-blocks output will be db.txt
Rebol[]
specs: [
[3 digits 4 digits 4 letters]
[2 letters 2 digits]
]
;====================================================================================================================
digits: charset "0123456789"
letters: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
separator: charset ";"
block-letters: [A B C D E F G H I J K L M N O P Q R S T U V W X Y Z]
blocks: copy []
generate-row: func[][
Foreach spec specs [
rule: [
any [
[
set times integer! [['digits (
repeat n times [
block: rejoin [block random 9]
]
)
|
'letters (repeat n times [
block: rejoin [ block to-string pick block-letters random 24]
]
)
]
|
[
'letters (repeat n times [block: rejoin [ block to-string pick block-letters random 24]
]
)
|
'digits (repeat n times [block: rejoin [block random 9]]
)
]
]
|
{"} any separator {"}
]
]
to end
]
block: copy ""
parse spec rule
append blocks block
]
]
generate-blocks: func[m][
repeat num m [
generate-row
]
]
quote: func[string][
rejoin [{"} string {"}]
]
save-blocks: func[file][
if exists? to-rebol-file file [
answer: ask rejoin ["delete " file "? (Y/N): "]
if (answer = "Y") [
delete %db.txt
]
]
foreach [field1 field2] blocks [
write/lines/append %db.txt rejoin [quote field1 ";" quote field2]
]
]
Use open with /direct and /lines refinement to write directly to file without buffering the content:
file: open/direct/lines/write %myfile.txt
loop 1000 [
t: random "abcdefghi"
append file t
]
Close file
This will write 1000 random lines without buffering.
You can also prepare a block of lines (lets say 10000 rows) then write it directly to file, this will be faster than writing line-by-line.
file: open/direct/lines/write %myfile.txt
loop 100 [
b: copy []
loop 1000 [append b random "abcdef"]
append file b
]
close file
this will be much faster, 100000 rows less than a second.
Hope this will help.
Note that, you can change the number 100 and 1000 according to your needs an memory of your pc, and use b: make block! 1000 instead of b: copy [], it will be faster.

Set Difference Operation between an Object Body and a Block definition in Rebol

I want to be able to modify Object dynamically by adding / removing properties or methods on the fly. For Adding no problem, for Removing I thought about using Set Difference Math Operator but it behaves weirdly as far as I can see when removing a method from the object.
For example if I have
O: make object! [
a: 1
f: func [][]
b: 1
]
I can substract [a: 1 b: 1] with no problem
>> difference third O [b: 1 a: 1]
== [f: func [][]]
But I cannot substract f: func[][]:
>> difference third O [f: func[][]]
== [a: 1 b: func [][] func []]
>>
Output is weird (I put strange maybe it doesn't sound english as I'm not english native :) )
Why and what should I do instead ?
Thanks.
Issue #1: Difference Discards Duplicates From Both Inputs
Firstly, difference shouldn't be thought of as a "subtraction" operator. It gives you one of each element that is unique in each block:
>> difference [1 1 2 2] [2 2 2 3 3 3]
== [1 3]
>> difference [2 2 2 3 3 3] [1 1 2 2]
== [3 1]
So you'd get an equivalent set by differencing with [a: 1 b: 1] and [1 a: b:]. This is why the second 1 is missing from your final output. Even differencing with the empty set will remove any duplicate items:
>> difference [a: 1 b: 1] []
== [a: 1 b:]
If you're looking to actually search and replace a known sequential pattern, then what you want is more likely replace with your replacement as the empty set:
>> replace [a: 1 b: 1] [b: 1] []
== [a: 1]
Issue #2: Function Equality Is Based On Identity
Two separate functions with the same definition will evaluate to two distinct function objects. For instance, these two functions both take no parameters and have no body, but when you use a get-word! to fetch them and compare they are not equal:
>> foo: func [] []
>> bar: func [] []
>> :foo == :bar
== false
So another factor in your odd result is that f: is being subtracted out of the set, and the two (different) empty functions are unique and thus both members of the differenced set.
R2 is a little weirder than R3 and I can't get :o/f to work. But the following is a way to get an ''artificially correct-looking version'' of the difference you are trying to achieve:
>> foo: func [] []
>> o: make object! [a: 1 f: :foo b: 2]
>> difference third o compose [f: (:foo)]
== [a: 1 b: 2]
Here you're using the same function identity that you put in the object in the block you are subtracting.
In R3, difference does not support function values in this way. It may relate to the underlying implementation being based on map! which cannot have ''function values'' as keys. Also in Rebol 3, using difference on an object is not legal. So even your first case won't work. :(
Issue #3: This isn't how to add and remove properties
In Rebol 3 you can add properties to an object dynamically with no problems.
>> obj: object [a: 1]
== make object! [
a: 1
]
>> append obj [b: 2]
== make object! [
a: 1
b: 2
]
But as far as I know of, you cannot remove them once they have been added. You can set them to none of course, but the reflection APIs will still report them as being there.
If you want to make trying to read them throw an error you can set it to an error object and then protect them from reads. A variant of this also works in R2:
>> attempt [obj/b: to-error "invalid member"]
== none
>> probe obj
== make object! [
a: 1
b: make error! [
code: 800
type: 'User
id: 'message
arg1: "invalid member"
arg2: none
arg3: none
near: none
where: none
]
]
>> obj/b
** User error: "invalid member"
R3 takes this one step further and lets you protect the member from writes, and even hide the member from having any new bindings made to it.
>> protect 'obj/b
== obj/b
>> obj/b: 100
** Script error: protected variable - cannot modify: b
>> protect/hide 'obj/b
== obj/b
>> obj
== make object! [
a: 1
]
If you need to dynamically add and remove members in R2, you might also consider a data member in your object which is a block. Blocks and objects are interchangeable for many operations, e.g:
>> data: [a: 1 b: 2]
== [a: 1 b: 2]
>> data/a
== 1
>> data/b
== 2
And you can remove things from them...
>> remove/part (find data (to-set-word 'a)) 2
== [b: 2]
It all depends on your application. The main thing object! has going over block! is the ability to serve as a context for binding words...
You cannot dynamically add or remove words from an object in Rebol 2. If you wish to simulate this behaviour you need to create and return a new object.