Clojure: From Prismatic Schema nested data structure to SQL Tables

I would like to convert nested Prismatic Schema specs into flat SQL table definitions, so that each row is validated against the schema when it is added.
(defn max-length [l] (s/pred (fn [x] (<= (count x) l))))
(defn min-length [l] (s/pred (fn [x] (>= (count x) l))))
(defn length [l] (s/pred (fn [x] (= (count x) l))))
(defn matches [r] (s/pred (fn [s] (re-matches r s))))
(defn linked-to-many [& keys] ???)
(defn schema->tables [coll] ???)
(defn add-line [t coll] ???)
(def schema
{:ideas {:id s/Uuid ; not sure about this type
:title (s/both s/String (max-length 50))
:tags (linked-to-many :tags :id) ; a list of tag ids
:votes (linked-to-many :users :id) ; a list of user ids
(s/optional :desc) (s/both s/String (max-length 300))}
:users {:id s/Uuid
:f-name (s/both s/String (min-length 2))
:l-name (s/both s/String (min-length 2))
:zip (s/both s/Int (length 5))
:email (s/both (min-length 5) (matches #".*@.*"))
(s/optional :no-spam) s/Bool}
:tags {:id
:name (s/both s/String (max-length 15))
:related (linked-to-many :tags :id) } ; similar tag ids
(schema->tables schema)
;=> sql tables...
(add-line :tags {:name "Too long tag name which will trigger prismatic error"
:related [1 23 8]})
;=> prismatic validation error
I suppose it should create 6 SQL tables adding :votes, :tags-ideas and :tags-tags tables...
How can I achieve this?
Are there any libraries that can help me with this project?
(Maybe I could use something like aggregate?)

Related

Can't get Clojure macro to execute without expansion error

I'm writing a macro that looks through the metadata on a given symbol and removes any entries that are not keywords, i.e. the key name doesn't start with a ":" e.g.
(meta (var X)) ;; Here's the metadata for testing...
=>
{:line 1,
:column 1,
:file "C:\\Users\\Joe User\\AppData\\Local\\Temp\\form-init11598934441516564808.clj",
:name X,
:ns #object[clojure.lang.Namespace 0x12ed80f6 "thic.core"],
OneHundred 100,
NinetyNine 99}
I want to remove the entries "OneHundred" and "NinetyNine" and leave the rest of the metadata untouched.
So I have a bit of code that works:
(let [Hold# (meta (var X))] ;;Make a copy of the metadata to search.
(map (fn [[kee valu]] ;;Loop through each metadata key/value.
(if
(not= \: (first (str kee))) ;; If we find a non-keyword key,
(reset-meta! (var X) (dissoc (meta (var X)) kee)) ;; remove it from X's metadata.
)
)
Hold# ;;map through this copy of the metadata.
)
)
It works. The entries for "OneHundred" and "NinetyNine" are gone from X's metadata.
Then I code it up into a macro. God bless REPL's.
(defmacro DelMeta! [S]
`(let [Hold# (meta (var ~S))] ;; Hold onto a copy of S's metadata.
(map ;; Scan through the copy looking for keys that DON'T start with ":"
(fn [[kee valu]]
(if ;; If we find metadata whose keyname does not start with a ":"
(not= \: (first (str kee)))
(reset-meta! (var ~S) (dissoc (meta (var ~S)) kee)) ;; remove it from S's metadata.
)
)
Hold# ;; Loop through the copy of S's metadata so as to not confuse things.
)
)
)
Defining the macro with defmacro works without error.
macroexpand-1 on the macro, e.g.
(macroexpand-1 '(DelMeta! X))
expands into the proper code. Here:
(macroexpand-1 '(DelMeta! X))
=>
(clojure.core/let
[Hold__2135__auto__ (clojure.core/meta (var X))]
(clojure.core/map
(clojure.core/fn
[[thic.core/kee thic.core/valu]]
(if
(clojure.core/not= \: (clojure.core/first (clojure.core/str thic.core/kee)))
(clojure.core/reset-meta! (var X) (clojure.core/dissoc (clojure.core/meta (var X)) thic.core/kee))))
Hold__2135__auto__))
BUT!!!
Actually invoking the macro at the REPL with a real parameter blatzes out the most incomprehensible error message:
(DelMeta! X) ;;Invoke DelMeta! macro with symbol X.
Syntax error macroexpanding clojure.core/fn at (C:\Users\Joe User\AppData\Local\Temp\form-init11598934441516564808.clj:1:1).
([thic.core/kee thic.core/valu]) - failed: Extra input at: [:fn-tail :arity-1 :params] spec: :clojure.core.specs.alpha/param-list
(thic.core/kee thic.core/valu) - failed: Extra input at: [:fn-tail :arity-n :params] spec: :clojure.core.specs.alpha/param-list
Oh, all-powerful and wise Clojuregods, I beseech thee upon thy mercy.
Whither is my sin?
You don't need a macro here. Also, you are misunderstanding the nature of a Clojure keyword, and the complications of a Clojure Var vs a local variable.
Keep it simple to start by using a local "variable" in a let block instead of a Var:
(ns tst.demo.core
(:use tupelo.core tupelo.test))
(dotest
(let [x (with-meta [1 2 3] {:my "meta"})
x2 (vary-meta x assoc :your 25 'abc :def)
x3 (vary-meta x2 dissoc 'abc )]
(is= x [1 2 3])
(is= x2 [1 2 3])
(is= x3 [1 2 3])
(is= (meta x) {:my "meta"})
(is= (meta x2) {:my "meta", :your 25, 'abc :def})
(is= (meta x3) {:my "meta", :your 25})))
So we see that the values of x, x2, and x3 are identical; only their metadata differs. That is the purpose of metadata. The second set of tests shows the effect of vary-meta, which is the best way to change the metadata attached to a value.
When we use a Var, it is not only a global value, but it is like a double-indirection of pointers in C. Please see this question:
When to use a Var instead of a function?
This answer also clarifies the difference between a string, a symbol, and a keyword. This is important.
Consider this code
(def ^{:my "meta"} data [1 2 3])
(spyx data)
(spyx-pretty (meta (var data)))
and the result:
data => [1 2 3]
(meta (var data)) =>
{:my "meta",
:line 19,
:column 5,
:file "tst/demo/core.cljc",
:name data,
:ns #object[clojure.lang.Namespace 0x4e4a2bb4 "tst.demo.core"]}
(is= data [1 2 3])
(is= (set (keys (meta (var data))))
#{:my :line :column :file :name :ns})
So we have added the key :my to the metadata as desired. How can we alter it? For a Var, use the function alter-meta!
(alter-meta! (var data) assoc :your 25 'abc :def)
(is= (set (keys (meta (var data))))
#{:ns :name :file 'abc :your :column :line :my})
So we have added 2 new entries to the metadata map. One has the keyword :your as key with value 25, the other has the symbol abc as key with value :def (a keyword).
We can also use alter-meta! to remove a key/val pair from the metadata map:
(alter-meta! (var data) dissoc 'abc )
(is= (set (keys (meta (var data))))
#{:ns :name :file :your :column :line :my})
Keyword vs Symbol vs String
A string literal in a source file has double quotes at each end, but they are not characters in the string. Similarly a keyword literal in a source file needs a leading colon to identify it as such. However, neither the double-quotes of the string nor the colon of the keyword are a part of the name of that value.
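A quick REPL check illustrates this. It is only a minimal sketch using the core function name, which returns the text of a keyword, symbol, or string without any of the reader syntax around it:

(name :my)      ;=> "my"
(name 'abc)     ;=> "abc"
(name "text")   ;=> "text"
(count "text")  ;=> 4, the double quotes are not counted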
Thus, you can't identify a keyword by the colon. You should use these functions to identify different data types:
string?
keyword?
symbol?
The above are from the Clojure CheatSheet. So, the code you really want is:
(defn remove-metadata-symbol-keys
[var-obj]
(assert (var? var-obj)) ; verify it is a Var
(doseq [k (keys (meta var-obj))]
(when (not (keyword? k))
(alter-meta! var-obj dissoc k))))
with a sample:
(def ^{:some "stuff" 'other :things} myVar [1 2 3])
(newline) (spyx-pretty (meta (var myVar)))
(remove-metadata-symbol-keys (var myVar))
(newline) (spyx-pretty (meta (var myVar)))
and result:
(meta (var myVar)) =>
{:some "stuff",
other :things, ; *** to be removed ***
:line 42,
:column 5,
:file "tst/demo/core.cljc",
:name myVar,
:ns #object[clojure.lang.Namespace 0x9b9155f "tst.demo.core"]}
(meta (var myVar)) => ; *** after removing non-keyword keys ***
{:some "stuff",
:line 42,
:column 5,
:file "tst/demo/core.cljc",
:name myVar,
:ns #object[clojure.lang.Namespace 0x9b9155f "tst.demo.core"]}
The above code was all run using this template project.
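For completeness, the spec error in the question comes from syntax-quote: inside a backquoted form, plain symbols such as kee and valu are namespace-qualified (thic.core/kee), and fn rejects qualified symbols as parameter names. If a macro were still wanted, a minimal sketch could use auto-gensym names (the trailing #) and doseq instead of the lazy map. The name del-meta! is hypothetical, not part of any library:

;; Hypothetical macro: removes every non-keyword key from a Var's metadata.
;; k# is an auto-gensym, so syntax-quote does not qualify it with a namespace.
(defmacro del-meta! [sym]
  `(doseq [k# (keys (meta (var ~sym)))]
     (when-not (keyword? k#)
       (alter-meta! (var ~sym) dissoc k#))))

(def ^{:some "stuff" 'other :things} my-var [1 2 3])

(del-meta! my-var)

(every? keyword? (keys (meta (var my-var))))   ;=> true
(:some (meta (var my-var)))                    ;=> "stuff"

The function version above remains simpler, since nothing here needs to happen at compile time.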

How to bind var's name and value in the clojure macro?

Assume I have some (more than 20) variables that I want to save to files. I don't want to repeat the same code 20 times.
I wrote a macro, but it gave me an error.
my test case:
;-----------------------------------------------
(defn processor [ some-parameters ]
(let [
;after some operation ,got these data:
date-str ["JN01","JN02","JN03","JN04"];length 8760
date-temperature (map #(str %2 "," %1) [3.3,4.4,5.5,6.6] date-str) ; all vector's length are 8760
date-ws (map #(str %2 "," %1) [0.2,0.1,0.3,0.4] date-str) ;
;... many variables such like date-relative-humidity,date-pressure, name starts with "date-",
; all same size
]
;(doseq [e date-temperature]
; (println e))
(spit "output-variable_a.TXT"
(with-out-str
(doseq [e date-temperature]
(println e))))
;same 'spit' part will repeat many times
))
(processor 123)
; I NEED to output other variables(ws, wd, relative-humidity, ...)
; Output example:
;JN01,3.3
;JN02,4.4
;JN03,5.5
;JN04,6.6
;-----------------------------------------------
What I want is a macro/function that I can use this way:
(write-to-text temperature,ws,wd,pressure,theta-in-k,mixradio)
and that will do the work for every variable.
I don't know how to write such a macro/function.
My macro is posted here, but it doesn't work:
(defmacro write-array [& rest-variables ]
`(doseq [ vname# '~rest-variables ]
;(println vname# vvalue#)
(println "the vname# is" (symbol vname#))
(println "resolve:" (resolve (symbol (str vname# "-lines"))))
(println "resolve2:" (resolve (symbol (str "ws-lines"))))
(let [ vvalue# 5] ;(var-get (resolve (symbol vname#)))]
;----------NOTE: commented out cause '(symbol vname#)' won't work.
;1(spit (str "OUT-" vname# ".TXT" )
;1 (with-out-str
;1 (doseq [ l (var-get (resolve (symbol (str vname# "-lines"))))]
;1 (println l))))
(println vname# vvalue#))))
I found that the problem is the (symbol vname#) part: this method only works for a GLOBAL variable and cannot be bound to date-temperature in the LET form, so resolving (symbol vname#) returns nil.
It looks like you want to write a file of delimited values using binding names and their values from inside a let. Macros transform code during compilation and so they cannot know the run-time values that the symbols you pass are bound to. You can use a macro to emit code that will be evaluated at run-time:
(defmacro to-rows [& args]
(let [names (mapv name args)]
`(cons ~names (map vector ~@args))))
(defn get-stuff []
(let [nums [1 2 3]
chars [\a \b \c]
bools [true false nil]]
(to-rows nums chars bools)))
(get-stuff)
=> (["nums" "chars" "bools"]
[1 \a true]
[2 \b false]
[3 \c nil])
Alternatively you could produce a hash map per row:
(defmacro to-rows [& args]
(let [names (mapv name args)]
`(map (fn [& vs#] (zipmap ~names vs#)) ~@args)))
=> ({"nums" 1, "chars" \a, "bools" true}
{"nums" 2, "chars" \b, "bools" false}
{"nums" 3, "chars" \c, "bools" nil})
You would then need to write that out to a file, either using data.csv or similar code.
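As a minimal sketch of the "similar code" option, the rows could be joined by hand with clojure.string. The name rows->csv and the file name out.csv are hypothetical, and this version does no quoting or escaping, so it is only suitable for values that contain no commas or newlines:

(require '[clojure.string :as str])

;; Hypothetical helper: commas between cells, newlines between rows.
(defn rows->csv [rows]
  (str/join "\n" (map #(str/join "," %) rows)))

(rows->csv [["nums" "chars"] [1 \a] [2 \b]])
;=> "nums,chars\n1,a\n2,b"

;; Writing it out would then be a single spit call, for example:
;; (spit "out.csv" (rows->csv (get-stuff)))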
To see what to-rows expands to, you can use macroexpand. This is the code being generated at compile-time that will be evaluated at run-time. It does the work of getting the symbol names at compile-time, but emits code that will work on their bound values at run-time.
(macroexpand '(to-rows x y z))
=> (clojure.core/cons ["x" "y" "z"] (clojure.core/map clojure.core/vector x y z))
As an aside, I'm assuming you aren't typing thousands of literal values into let bindings. I think this answers the question as asked but there could likely be a more direct approach than this.
I think you are looking for the function name. To demonstrate:
user=> (defmacro write-columns [& columns]
(let [names (map name columns)]
`(str ~@names)))
#'user/write-columns
user=> (write-columns a b c)
"abc"
You can first capture the variable names and their values into a map:
(defmacro name-map
[& xs]
(let [args-list# (cons 'list (map (juxt (comp keyword str) identity) xs))]
`(into {} ~args-list#)))
If you pass the var names to the macro,
(let [aa 11
bb 22
cc 33]
(name-map aa bb cc))
It gives you a map which you can then use for any further processing:
=> {:aa 11, :bb 22, :cc 33}
(def result *1)
(run!
(fn [[k v]] (println (str "spit file_" (name k) " value: " v)))
result)
=>
spit file_aa value: 11
spit file_bb value: 22
spit file_cc value: 33
Edit: Just noticed it's similar to Taylor's macro. The difference is this one works with primitive types as well, while Taylor's works for the original data (vars resolving to collections).
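Once the names and values are in a map like this, the repeated spit from the question can become an ordinary function. The sketch below is an assumption about the desired output (one file per entry, named OUT-<name>.TXT, one line per element); write-columns! and tmp-dir are hypothetical names:

(require '[clojure.java.io :as io]
         '[clojure.string :as str])

;; Hypothetical helper: writes one file per map entry into dir.
(defn write-columns! [dir name->lines]
  (doseq [[k lines] name->lines]
    (spit (io/file dir (str "OUT-" (name k) ".TXT"))
          (str/join "\n" lines))))

;; The system temp directory is used here only to keep the example harmless.
(def tmp-dir (System/getProperty "java.io.tmpdir"))

(write-columns! tmp-dir {:date-temperature ["JN01,3.3" "JN02,4.4"]
                         :date-ws          ["JN01,0.2" "JN02,0.1"]})

(slurp (io/file tmp-dir "OUT-date-temperature.TXT"))
;=> "JN01,3.3\nJN02,4.4"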

Split lines in clojure while reading from file

I am learning clojure at school and I have an exam coming up. I was just working on a few things to make sure I get the hang of it.
I am trying to read from a file line by line and as I do, I want to split the line whenever there is a ";".
Here is my code so far
(defn readFile []
(map (fn [line] (clojure.string/split line #";"))
(with-open [rdr (reader "C:/Users/Rohil/Documents/work.txt.txt")]
(doseq [line (line-seq rdr)]
(clojure.string/split line #";")
(println line)))))
When I do this, I still get the output:
"I;Am;A;String;"
Am I missing something?
I'm not sure if you need this at school, but since Gary already gave an excellent answer, consider this as a bonus.
You can do elegant transformations on lines of text with transducers. The ingredient you need is something that allows you to treat the lines as a reducible collection and which closes the reader when you're done reducing:
(defn lines-reducible [^BufferedReader rdr]
(reify clojure.lang.IReduceInit
(reduce [this f init]
(try
(loop [state init]
(if (reduced? state)
@state
(if-let [line (.readLine rdr)]
(recur (f state line))
state)))
(finally
(.close rdr))))))
Now you're able to do the following, given input work.txt:
I;am;a;string
Next;line;please
Count the length of each 'split'
(require '[clojure.string :as str])
(require '[clojure.java.io :as io])
(into []
(comp
(mapcat #(str/split % #";"))
(map count))
(lines-reducible (io/reader "/tmp/work.txt")))
;;=> [1 2 1 6 4 4 6]
Sum the length of all 'splits'
(transduce
(comp
(mapcat #(str/split % #";"))
(map count))
+
(lines-reducible (io/reader "/tmp/work.txt")))
;;=> 24
Sum the length of all words until we find a word that is longer than 5
(transduce
(comp
(mapcat #(str/split % #";"))
(map count))
(fn
([] 0)
([sum] sum)
([sum l]
(if (> l 5)
(reduced sum)
(+ sum l))))
(lines-reducible (io/reader "/tmp/work.txt")))
or with take-while:
(transduce
(comp
(mapcat #(str/split % #";"))
(map count)
(take-while #(<= % 5)))
+
(lines-reducible (io/reader "/tmp/work.txt")))
Read https://tech.grammarly.com/blog/building-etl-pipelines-with-clojure for more details.
TL;DR embrace the REPL and embrace immutability
Your question was "what am I missing?" and to that I'd say you're missing one of the best features of Clojure, the REPL.
Edit: you might also be missing that Clojure uses immutable data structures so
consider this code snippet:
(doseq [x [1 2 3]]
(inc x)
(prn x))
This code does not print "2 3 4"
it prints "1 2 3" because x isn't a mutable variable.
During the first iteration (inc x) gets called, returns 2, and that gets thrown away because it wasn't passed to anything, then (prn x) prints the value of x which is still 1.
Now consider this code snippet:
(doseq [x [1 2 3]] (prn (inc x)))
During the first iteration the inc passes its return value to prn so you get 2
Long example:
I don't want to rob you of the opportunity to solve the problem yourself so I'll use a different problem as an example.
Given the file "birds.txt"
with the data "1chicken\n 2duck\n 3Larry"
you want to write a function that takes a file and returns a sequence of bird names
Let's break this problem down into smaller chunks:
first let's read the file and split it up into lines
(slurp "birds.txt") will give us the whole file as a string
clojure.string/split-lines will give us a collection with each line as an element in the collection
(clojure.string/split-lines (slurp "birds.txt")) gets us ["1chicken" "2duck" "3Larry"]
At this point we could map some function over that collection to strip out the number like (map #(clojure.string/replace % #"\d" "") birds-collection)
or we could just move that step up the pipeline when the whole file is one string.
Now that we have all of our pieces we can put them together in a functional pipeline where the result of one piece feeds into the next
In Clojure there is a nice macro to make this more readable, the -> macro
It takes the result of one computation and injects it as the first argument to the next
so our pipeline looks like this:
(-> "C:/birds.txt"
slurp
(clojure.string/replace #"\d" "")
clojure.string/split-lines)
A last note on style: for Clojure functions you want to stick to kebab case, so readFile should be read-file.
I would keep it simple, and code it like this:
(ns tst.demo.core
(:use tupelo.test)
(:require [tupelo.core :as t]
[clojure.string :as str] ))
(def text
"I;am;a;line;
This;is;another;one
Followed;by;this;")
(def tmp-file-name "/tmp/lines.txt")
(dotest
(spit tmp-file-name text) ; write it to a tmp file
(let [lines (str/split-lines (slurp tmp-file-name))
result (for [line lines]
(for [word (str/split line #";")]
(str/trim word)))
result-flat (flatten result)]
(is= result
[["I" "am" "a" "line"]
["This" "is" "another" "one"]
["Followed" "by" "this"]])
Notice that result is a doubly-nested (2D) matrix of words. The simplest way to undo this is the flatten function to produce result-flat:
(is= result-flat
["I" "am" "a" "line" "This" "is" "another" "one" "Followed" "by" "this"])))
You could also use apply concat as in:
(is= (apply concat result) result-flat)
If you want to avoid building up a 2D matrix in the first place, you can use a generator function (a la Python) via lazy-gen and yield from the Tupelo library:
(dotest
(spit tmp-file-name text) ; write it to a tmp file
(let [lines (str/split-lines (slurp tmp-file-name))
result (t/lazy-gen
(doseq [line lines]
(let [words (str/split line #";")]
(doseq [word words]
(t/yield (str/trim word))))))]
(is= result
["I" "am" "a" "line" "This" "is" "another" "one" "Followed" "by" "this"])))
In this case, lazy-gen creates the generator function.
Notice that for has been replaced with doseq, and the yield function places each word into the output lazy sequence.
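For comparison, the same flat result can be sketched with only clojure.string and mapcat, without the Tupelo dependency. The name split-words is hypothetical, and the blank-string filter is an added assumption to cope with empty lines:

(require '[clojure.string :as str])

;; Hypothetical helper: every ";"-separated word of every line, trimmed, in order.
(defn split-words [text]
  (->> (str/split-lines text)
       (mapcat #(str/split % #";"))
       (map str/trim)
       (remove str/blank?)
       vec))

(split-words "I;am;a;line;\nThis;is;another;one\nFollowed;by;this;")
;=> ["I" "am" "a" "line" "This" "is" "another" "one" "Followed" "by" "this"]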

Clojure optimization of an inversion counter

I'm new to Clojure. I was wondering how I could optimize an algorithm to count the number of inversions in a list. From what I understand, Clojure doesn't do tail call optimization unless specifically asked to? How do you get it to do this?
This first attempt with a mutated variable has a runtime of about 3.5s. But my second attempt was a functional version and it takes about 1m15s! and both require growing the stack size quite a bit (like -Xss12m).
How would I go about getting better performance?
I'd prefer to not have mutable variables (like the functional one) if possible. You can create the array file by typing something like seq 100000 | sort -R > IntArray.txt.
The first attempt w/ mutable variable:
(use 'clojure.java.io)
(def inversions 0)
(defn merge_and_count' [left right left_len]
(if (empty? right) left
(if (empty? left) right
(if (<= (first left) (first right))
(cons (first left) (merge_and_count' (rest left) right (- left_len 1)))
(let [_ (def inversions (+ inversions left_len))]
(cons (first right) (merge_and_count' left (rest right) left_len)))
))))
(defn inversion_count [list]
(if (or (empty? list) (nil? (next list))) list
(let [mid (quot (count list) 2)]
(merge_and_count' (inversion_count (take mid list))
(inversion_count (drop mid list)) mid)
)))
(defn parse-int [s]
(Integer. (re-find #"\d+" s )))
(defn get-lines [fname]
(with-open [r (reader fname)]
(doall (map parse-int (line-seq r)))))
(let [list (get-lines "IntArray.txt")
_ (inversion_count list)]
(print inversions))
My second attempt to be purely functional (no mutability):
(use 'clojure.java.io)
(defn merge_and_count' [left right inversions]
(if (empty? right) (list left inversions)
(if (empty? left) (list right inversions)
(if (<= (first left) (first right))
(let [result (merge_and_count' (rest left) right inversions)]
(list (cons (first left) (first result)) (second result)))
(let [result (merge_and_count' left (rest right) (+ inversions (count left)))]
(list (cons (first right) (first result)) (second result)))
))))
(defn inversion_count [list' list_len]
(if (or (empty? list') (nil? (next list'))) (list list' 0)
(let [mid (quot list_len 2)
left (inversion_count (take mid list') mid)
right (inversion_count (drop mid list') (- list_len mid))]
(merge_and_count' (first left) (first right) (+ (second left) (second right)))
)))
(defn parse-int [s]
(Integer. (re-find #"\d+" s )))
(defn get-lines [fname]
(with-open [r (reader fname)]
(doall (map parse-int (line-seq r)))))
(let [list (get-lines "IntArray.txt")
result (inversion_count list 100000)]
(print (second result)))
The stack overflows due to the recursion in merge_and_count'. I tried this approach, which uses loop/recur instead, and for 100000 items it came back instantly.
(defn merge_and_count [left right inversions]
(loop [l left r right inv inversions result []]
(cond (and (empty? r) (empty? l)) [result inv]
(empty? r) [(apply conj result l) inv]
(empty? l) [(apply conj result r) inv]
(<= (first l) (first r)) (recur (rest l) r inv (conj result (first l)))
:else (recur l (rest r) (+ inv (count l)) (conj result (first r))))))
You need to use this code in place of merge_and_count' in your second approach.
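Put together, a self-contained version might look like the sketch below. It is not the original poster's code: the names merge-and-count and count-inversions are hypothetical, it tracks the remaining length of the left half in n instead of calling count on a seq at every step, and it uses a transient vector for the output. Only the merge uses loop/recur; the divide step still recurses, but only to a depth of about log2 of the input size:

;; Merges two sorted collections, returning [merged inversions-found-while-merging].
(defn merge-and-count [left right]
  (loop [l left, r right, n (count left), inv 0, out (transient [])]
    (cond
      (empty? l) [(persistent! (reduce conj! out r)) inv]
      (empty? r) [(persistent! (reduce conj! out l)) inv]
      (<= (first l) (first r))
      (recur (rest l) r (dec n) inv (conj! out (first l)))
      ;; (first r) jumps ahead of all n remaining left elements.
      :else
      (recur l (rest r) n (+ inv n) (conj! out (first r))))))

;; Returns [sorted-vector total-inversions].
(defn count-inversions [xs]
  (let [v (vec xs)
        n (count v)]
    (if (< n 2)
      [v 0]
      (let [mid    (quot n 2)
            [l li] (count-inversions (subvec v 0 mid))
            [r ri] (count-inversions (subvec v mid))
            [m mi] (merge-and-count l r)]
        [m (+ li ri mi)]))))

(count-inversions [2 4 1 3 5])
;=> [[1 2 3 4 5] 3]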

How to compare data structures in clojure and highlight differences

Is there a nice way to print out differences in clojure data structures? In Perl for example there is Test::Differences which helps a lot.
Take a look at clojure.data/diff: http://clojure.github.io/clojure/clojure.data-api.html#clojure.data/diff
Examples:
async-demo.core> (use 'clojure.data)
nil
async-demo.core> (diff {:a 2 :b 4} {:a 2})
({:b 4} nil {:a 2})
async-demo.core> (diff [1 2 3 4] [1 2 6 7])
[[nil nil 3 4] [nil nil 6 7] [1 2]]
async-demo.core> (diff #{"one" "two" "three"} #{"one" "fourty-four"})
[#{"two" "three"} #{"fourty-four"} #{"one"}]
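diff also recurses into nested maps, so only the differing leaves are reported. A small sketch with made-up data:

(require '[clojure.data :refer [diff]])

;; Result is (only-in-first only-in-second in-both).
(diff {:a {:b 1 :c 2}} {:a {:b 1 :c 3}})
;=> ({:a {:c 2}} {:a {:c 3}} {:a {:b 1}})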
You can also visually diff two data structures using the gui-diff library.
(gui-diff {:a 1} {:a 2}) will shell out to an OS-appropriate gui diffing program to diff the two, potentially very large, data structures.
What I really searched for was difform. Using clojure.data/diff is nice but does not work well in a unit test where larger structures are compared. Here is an example where data/diff does not perform as well as difform in my opinion:
(defn example []
(let [data [{:foo 1} {:value [{:bar 2}]}]]
(diff data
[{:value [{:bar 2}]}])
(difform data
[{:value [{:bar 2}]}])))
;; => diff output
;; => [[{:foo 1} {:value [{:bar 2}]}] [{:value [{:bar 2}]}] nil]
;; => difform output
[{:
- foo 1} {:
value [{:bar 2}]}]
;; => nil
Differ is a recent library that seems to do a nice job:
(def person-map {:name "Robin"
:age 25
:sex :male
:phone {:home 99999999
:work 12121212}})
(def person-diff (differ/diff person-map {:name "Robin Heggelund Hansen"
:age 26
:phone {:home 99999999}}))
;; person-diff will now be [{:name "Robin Heggelund Hansen"
;; :age 26}
;; {:sex 0
;; :phone {:work 0}}]
The fanciest diff I have seen is deep-diff.
It produces a nice, colorful diff.
There is also the editscript library, which produces a diff as an array of patches.