It seems that if I write a set to a file, it's not in a format where it can be read back in easily as a set. Here's an example:
#lang racket
(let ([out (open-output-file "test.rkt" #:exists 'replace)])
  (write (set 1 2 3 4 5) out)
  (close-output-port out))
This makes a file with #<set: 1 3 5 2 4>, which the reader complains about. There is a related unanswered question on the mailing list here.
The way I'm getting around it right now is by printing literally the string "(set " to a file, then all the integers with spaces, then a closing ")". Super ugly and I would like to use the reader if possible.
You can use the Racket serialization library to do this. Here's an example:
Welcome to Racket v6.4.0.7.
-> (require racket/serialize)
-> (with-output-to-file "/tmp/set.rktd"
     (lambda () (write (serialize (set 1 2 3)))))
-> (with-input-from-file "/tmp/set.rktd"
     (lambda () (deserialize (read))))
(set 1 3 2)
Note that a serialized value is just a special kind of s-expression, so you can manipulate it like any other value (store it in a database, write it to disk, send it over a network, etc.):
-> (serialize (set 1 2 3))
'((3)
1
(((lib "racket/private/set-types.rkt")
.
deserialize-info:immutable-custom-set-v0))
0
()
()
(0 #f (h - (equal) (1 . #t) (3 . #t) (2 . #t))))
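For the file round trip, you can wrap the pattern above in a pair of helpers. A minimal sketch, using racket/serialize as above (the names save-value and load-value are made up for illustration):
#lang racket
(require racket/serialize)

;; Serialize a value and write it to path, replacing any old file.
(define (save-value path v)
  (with-output-to-file path
    (lambda () (write (serialize v)))
    #:exists 'replace))

;; Read a serialized value back from path.
(define (load-value path)
  (with-input-from-file path
    (lambda () (deserialize (read)))))

;; (save-value "test.rktd" (set 1 2 3 4 5))
;; (load-value "test.rktd") ; => (set 1 2 3 4 5)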
I am reading in a file (see below). The example file has 13 rows.
A|doe|chemistry|100|A|
B|shea|maths|90|A|
C|baba|physics|80|B|
D|doe|chemistry|100|A|
E|shea|maths|90|A|
F|baba|physics|80|B|
G|doe|chemistry|100|A|
H|shea|maths|90|A|
I|baba|physics|80|B|
J|doe|chemistry|100|A|
K|shea|maths|90|A|
L|baba|physics|80|B|
M|doe|chemistry|100|A|
I am then iterating over these rows using a For Each scope (batch size 5) and calling a REST API for each batch.
Depending on the REST API response (success or failure), I write the payloads to the respective success/error files.
I have mocked the API so that the first batch of 5 records fails and the rest of the records succeed.
While writing to the success/error files I am using the following transformation:
output application/csv quoteValues=true,header=false,separator="|"
---
payload
All of this works fine.
Success log file:
"F"|"baba"|"physics"|"80"|"B"
"G"|"doe"|"chemistry"|"100"|"A"
"H"|"shea"|"maths"|"90"|"A"
"I"|"baba"|"physics"|"80"|"B"
"J"|"doe"|"chemistry"|"100"|"A"
"K"|"shea"|"maths"|"90"|"A"
"L"|"baba"|"physics"|"80"|"B"
"M"|"doe"|"chemistry"|"100"|"A"
Error log file:
"A"|"doe"|"chemistry"|"100"|"A"
"B"|"shea"|"maths"|"90"|"A"
"C"|"baba"|"physics"|"80"|"B"
"D"|"doe"|"chemistry"|"100"|"A"
"E"|"shea"|"maths"|"90"|"A"
Now what I want to do is add the original row/line number to each record in these files, so that when this goes to production, whoever is monitoring the files can easily correlate the records with the original file.
So, as an example, for the error log file (the first batch failed, which is rows 1 to 5) I want to prepend these numbers to each of the rows:
"1"|"A"|"doe"|"chemistry"|"100"|"A"
"2"|"B"|"shea"|"maths"|"90"|"A"
"3"|"C"|"baba"|"physics"|"80"|"B"
"4"|"D"|"doe"|"chemistry"|"100"|"A"
"5"|"E"|"shea"|"maths"|"90"|"A"
I'm not sure what I should write in DataWeave to achieve this.
Inside the For Each scope you have access to the counter vars.counter (or whatever name you've chosen, since it's configurable).
You will need to iterate over each chunk of records to add the position to each one. You can use something like:
%dw 2.0
output application/csv quoteValues=true,header=false,separator="|"
var batchSize = 5
---
payload map (
    { counter: batchSize * (vars.counter - 1) + ($$ + 1) } ++ $
)
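For instance, with the sample file above the second batch (rows F to J, vars.counter = 2) would be written to the success file as:
"6"|"F"|"baba"|"physics"|"80"|"B"
"7"|"G"|"doe"|"chemistry"|"100"|"A"
"8"|"H"|"shea"|"maths"|"90"|"A"
"9"|"I"|"baba"|"physics"|"80"|"B"
"10"|"J"|"doe"|"chemistry"|"100"|"A"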
Or, if you prefer, you can use the update function (though this will add the record counter as the last column instead):
%dw 2.0
output application/csv quoteValues=true,header=false,separator="|"
var batchSize = 5
---
payload map (
    $ update {
        case .counter! -> batchSize * (vars.counter - 1) + ($$ + 1)
    }
)
Remember to replace the batchSize variable in this code with the same value you're using in the For Each scope (if it's parameterised, even better).
Edit 1 -
Clarification: the - 1 is needed because the For Each counter starts at 1, while the + 1 is needed because the $$ index from map is zero-based. For example, the third record ($$ = 2) of the second batch (vars.counter = 2) gets 5 * (2 - 1) + (2 + 1) = 8, its row number in the original file.
Here is another workaround that keeps things simple without using any external variables. The script is split in two parts: the first handles the Error group and the second the Success group.
%dw 2.0
output application/csv quoteValues=true,header=false,separator="|"
// Will be used for creating a counter for Error group
var errorIdx = 1
// Will be used for creating a counter for Success group
var successIdx = 6
---
// Error items: the first 5 rows (counter 1..5)
(payload[0 to 4] map (items, idx) -> ({ "0": idx + errorIdx } ++ items))
++
// Success items: row 6 and the remaining rows (counter 6..13)
(payload[5 to -1] map (items, idx) -> ({ "0": idx + successIdx } ++ items))
DataWeave inline variables:
errorIdx is the starting value for the error counter
successIdx is the starting value for the success counter
This extracts the elements at indexes 0 to 4:
payload[0 to 4]
This extracts the elements from index 5 to the end:
payload[5 to -1]
Would anyone be able to provide me with a working example of dataframe zipping in C#? I am a bit lost with the operation.
Thanks!
The frame.Zip operation is the same thing as zipAlign in the better-documented F# API, so have a look at zipAlign in this section of the documentation.
Given a frame df1:
A
1 -> 1
2 -> 2
And a frame df2:
A
2 -> 2
3 -> 3
When you call df1.Zip(df2, (int a, int b) => a + b), you get:
A
1 -> <missing>
2 -> 4
3 -> <missing>
That is, for cells where both frames contain a value, a + b is calculated. For all other cells, you get a missing value. Note that you need type annotations on the lambda parameters, and they have to match the type of the values in the frame (for values of a non-matching type, the function just returns the value from the first frame unchanged).
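Putting that into C#, something like the following sketch should work; the frame construction with SeriesBuilder and Frame.FromRows is just one way to set up the sample data, so treat those details as an assumption rather than the only option:
using System;
using Deedle;

class ZipExample
{
    static void Main()
    {
        // df1: row keys 1 and 2, single column "A"
        var df1 = Frame.FromRows(new[] {
            KeyValue.Create(1, new SeriesBuilder<string, int> { { "A", 1 } }.Series),
            KeyValue.Create(2, new SeriesBuilder<string, int> { { "A", 2 } }.Series)
        });

        // df2: row keys 2 and 3, single column "A"
        var df2 = Frame.FromRows(new[] {
            KeyValue.Create(2, new SeriesBuilder<string, int> { { "A", 2 } }.Series),
            KeyValue.Create(3, new SeriesBuilder<string, int> { { "A", 3 } }.Series)
        });

        // Zip aligns both frames on row and column keys and applies
        // the function wherever both sides have a value; the rest of
        // the cells come out as missing.
        var sum = df1.Zip<int, int, int>(df2, (a, b) => a + b);
        sum.Print();
    }
}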
I'm using Gforth, and I want to create a word in a definition. In the cmd line of Gforth I can type:
create foo
ok
Or more specifically, I defined an array word that expects a size on the stack and creates a word that turns an index into an address within that array:
: array ( n -- ) ( i -- addr )
  create cells allot
  does> swap cells + ;
So if I type 10 array foo I can then use foo later.
But if I write 10 array foo within another definition, it gives me a compilation error. I've tried replacing foo with s" foo", which compiles, but it blows up at run time, saying:
Attempt to use zero-length string as a name
Is there a way to do this?
One way to do it in Gforth:
: bar 10 s" foo" ['] array execute-parsing ;
Other implementations do it differently, e.g. http://pfe.sourceforge.net/words/w-header-015.html
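A short usage sketch, assuming the array definition from the question and the bar word above:
bar            \ creates foo as a 10-cell array
42 3 foo !     \ store 42 in cell 3 of foo
3 foo @ .      \ fetch it back and print it: 42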
It's not easy to do in Standard Forth, but this may be good enough:
: bar 10 s" array foo" evaluate ;
I guess most of what you want to do can be done by defining words, i.e. using create ... does> ... This allows you to define a word with specialized behaviour.
E.g.:
: 2const create , , does> 2@ ;
can be used to create double constants like 2 3 2const a-double (that stashes 2 and 3 away in a-double) and then a-double pushes two values (2 3).
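A quick transcript of that example in Gforth:
2 3 2const a-double  ok
a-double . . 3 2  ok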
I have a similar problem to the one in this question:
selecting every Nth column in using SQLDF or read.csv.sql
I want to read some columns of a large file (a table of 150 rows and >500,000 columns, space-separated, filled with numeric data, with only a 32-bit system available). This file has no header, so the code in the thread above didn't work, and I decided to write a new post.
Do you have an idea to solve this problem?
I thought about something like the following, but any solution based on fread or read.table would also be fine:
MyConnection <- file("path/file.txt")
df <- sqldf("select column 1 100 1000 235612 from MyConnection",
            file.format = list(header = FALSE, sep = " "))
You can use substr to specify the start and end position of the columns you want to read in if they are fixed width:
x <- tempfile()
cat("12345", "67890", "09876", "54321", sep = "\n", file = x)
myfile <- file(x)
sqldf("select substr(V1, 1, 1) var1, substr(V1, 3, 5) var2 from myfile")
#   var1 var2
# 1    1  345
# 2    6  890
# 3    9   76
# 4    5  321
See this blog post for some more examples. The "select" statement can easily be constructed with paste if you know the details about the column starting positions and widths.
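For many columns you wouldn't write the substr calls by hand. A minimal sketch of building the statement with paste, where the starts and widths vectors are hypothetical values for your file:
# Starting positions and widths of the columns you want to keep.
starts <- c(1, 3)
widths <- c(1, 3)
sel <- paste(sprintf("substr(V1, %d, %d) var%d",
                     starts, widths, seq_along(starts)),
             collapse = ", ")
sel
# "substr(V1, 1, 1) var1, substr(V1, 3, 3) var2"
sqldf(paste("select", sel, "from myfile"))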
Is there a no-fuss serialization method for Haskell, similar to Erlang's term_to_binary/binary_to_term calls? Data.Binary seems unnecessarily complicated and raw. See this example, where you basically encode terms to integers by hand.
Use Data.Binary, and one of the deriving scripts that come with the package.
It's very simple to derive Binary instances via the 'derive' or 'deriveM' functions provided in the tools set of Data.Binary.
derive :: (Data a) => a -> String
For any 'a' in Data, it derives a Binary instance for you as a String. There's a putStr version too, deriveM.
Example:
*Main> deriveM (undefined :: Drinks)
instance Binary Main.Drinks where
  put (Beer a) = putWord8 0 >> put a
  put Coffee = putWord8 1
  put Tea = putWord8 2
  put EnergyDrink = putWord8 3
  put Water = putWord8 4
  put Wine = putWord8 5
  put Whisky = putWord8 6
  get = do
    tag_ <- getWord8
    case tag_ of
      0 -> get >>= \a -> return (Beer a)
      1 -> return Coffee
      2 -> return Tea
      3 -> return EnergyDrink
      4 -> return Water
      5 -> return Wine
      6 -> return Whisky
      _ -> fail "no parse"
The example you cite shows what the machine-generated output looks like -- yes, it is all bits at the lowest level! Don't write it by hand though -- use a tool to derive it for you via reflection.
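As a side note, with a modern GHC and binary >= 0.7 you can get the same effect with generic deriving instead of an external tool. A minimal sketch; the Drinks declaration (including Beer's Int field) is reconstructed from the derived instance above, so treat it as an assumption:
{-# LANGUAGE DeriveGeneric #-}
import Data.Binary (Binary, decode, encode)
import GHC.Generics (Generic)

data Drinks
  = Beer Int | Coffee | Tea | EnergyDrink | Water | Wine | Whisky
  deriving (Show, Generic)

-- The put/get methods are filled in from the Generic representation.
instance Binary Drinks

main :: IO ()
main = print (decode (encode (Beer 2)) :: Drinks)  -- prints: Beer 2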