TemporalPoint as an array - encog

Looks like my original question was not well formed, so I will make it a little simpler:
Why is TemporalPoint an array?
In the book I see a TemporalPoint object called point.
A brief code sample shows point.Data[0] and point.Data[1]. What would Data[1] represent?
The actual full code example shows only Data[0] being used, so there is no example in the book that actually uses Data[1].

Many examples of forecasting use a single value to predict future trends. For example, you might use data[0] to hold the value of company1 in order to predict company1. However, you can also use multiple values to predict company1. Perhaps you want to put the volume of company1 into data[1]. Or you might want to use the values of company1, company2 and company3 to predict company1. In that case you place company1, company2 and company3 into data[0], data[1] and data[2] respectively.
If you were using the above length-3 array, then you would have three TemporalDataDescription objects, one describing each array element. There is some more information in the Javadoc for TemporalMLDataSet and TemporalDataDescription. Links at: http://www.heatonresearch.com/encog
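Putting that together, here is a minimal sketch of the three-series setup in the C# style the book uses. This follows Encog 3 naming, but the exact member names are assumptions to check against your Encog version, and prices1..prices3 are hypothetical double[] series:

// Window sizes: 5 past points in, 1 future point out (arbitrary example values).
var dataset = new TemporalMLDataSet(5, 1);

// One TemporalDataDescription per array element of each TemporalPoint.
dataset.AddDescription(new TemporalDataDescription(
    TemporalDataDescription.Type.Raw, true, true));   // Data[0]: company1, input and predicted
dataset.AddDescription(new TemporalDataDescription(
    TemporalDataDescription.Type.Raw, true, false));  // Data[1]: company2, input only
dataset.AddDescription(new TemporalDataDescription(
    TemporalDataDescription.Type.Raw, true, false));  // Data[2]: company3, input only

for (int i = 0; i < prices1.Length; i++)
{
    var point = new TemporalPoint(3) { Sequence = i }; // 3 = one slot per description
    point.Data[0] = prices1[i];
    point.Data[1] = prices2[i];
    point.Data[2] = prices3[i];
    dataset.Points.Add(point);
}
dataset.Generate(); // build the sliding-window training pairs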


Linear regression output is given as separated and NAN values

I'm trying to create the best linear regression model and my code is:
Daugialype2 <- lm(TNFBL~IL-4_konc_BL+ MCP-1_konc_BL+IL-8_konc_BL+TGF-β1_konc_BL)
summary(Daugialype2) #this code is working, I get a normal output
BUT
Then I want to introduce more variables to the model, e.g.
Daugialype2 <- lm(TNFBL~IL-4_konc_BL+ MCP-1_konc_BL+IL-8_konc_BL+TGF-β1_konc_BL+MiR_181_BL)
For unknown reasons, my output looks like this (even though without the MiR_181_BL variable, the output was good):
[screenshot: model summary output showing NA/NaN values]
I don't know where the problem is; I don't get any error message. Could it be in the variable itself?
My variable looks like this (while the other variables have fewer digits after the comma):
[screenshot: the MiR_181_BL variable's values]
It's my very first model. Thank you for your answers!

Cypher-Neo4j Node Single Property change to Array

Following is a node we have in the DB:
P:Person { name:"xxx", skill:"Java" }
and after a while, we would like to change the skill property to a skill array. Is it possible?
P:Person { name:"xxx", skill:["Java", "Javascript"] }
Which Cypher query should I use?
If you have a single skill value in skill, then just do
MATCH (p:Person)
WHERE HAS(p.skill)
SET p.skill=[p.skill]
If there are multiple values stored in a single comma-separated string that you need to convert to an array, such as P:Person { name:"xxx", skill:"Java,JavaScript" }, then this should work:
MATCH (p:Person)
SET p.skill= split(p.skill,",")
In fact, I think your real problem here is not how to get an array property on a node, but how to store your data. Your data model is wrong in my opinion: storing data as arrays in Neo4j is not common, since you have relationships to represent multiple skills (in your example).
How to create your data model
From your question, I can already see that you have one User, and one User can have 1..n skills.
I guess that one day (maybe tomorrow) you will need to know which users are able to use Java, C++, PHP, and every other skill.
So here you can already see that every skill should have its own node.
What is the correct model in this case?
I think that, still using only what you said in the question, you should have something like this:
(:Person{name:"Foo"})-[:KNOWS]->(:Skill{name:"Bar"})
Using such a data model, you can get every Skill known by a Person with this query:
MATCH (:Person{name:"Foo"})-[:KNOWS]->(skill:Skill)
RETURN skill //or skill.name if you just want the name
and you can also get every Person who knows a Skill using this:
MATCH (:Skill{name:"Bar"})<-[:KNOWS]-(person)
RETURN person //Or person.name if you just want the name
Keep in mind
Storing array values in properties should be the last option when you are using Neo4j.
If the same property value can be found on multiple nodes, you can create a node to store it; you will then be able to link it to the other nodes using relationships, and finding every node having property X = Y will be easier.
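If you later decide to refactor an existing skill array into that model, here is a migration sketch (assuming skill already holds an array, e.g. after the [p.skill] conversion above, and a Neo4j version with UNWIND):
MATCH (p:Person)
WHERE HAS(p.skill)
UNWIND p.skill AS skillName          // one row per skill in the array
MERGE (s:Skill {name: skillName})    // one shared node per distinct skill name
MERGE (p)-[:KNOWS]->(s)
REMOVE p.skill                       // drop the old property once it is linked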

Associative arrays for grandmothers using awk

I have a really hard time wrapping my head around arrays and associative arrays in awk.
Say you want to compare two different columns in two different files using associative arrays; how would you do it? Let's say column 1 in file 1 with column 2 in file 2, then print the matching, corresponding values of file 1 in a new column in file 2. Please explain each step really simply, as if talking to your grandmother; I mean, super-thoroughly and super-simple.
Cheers
Simple explanation of associative arrays (aka maps), not specifically for awk:
Unlike a normal array, where each element has a numeric index, an associative array uses a "key" instead of an index. You can think of it as being like a simple flat-file database, where each record has a key and a value. So if you have, e.g. some salary data:
Fred 10000
John 12000
Sara 11000
you could store it in an associative array, a, like this:
a["Fred"] = 10000
a["John"] = 12000
a["Sara"] = 11000
and then when you wanted to retrieve a salary for a person you would just look it up using their name as the key, e.g.
salary = a[person]
You can of course modify the values too, so if you wanted to give Fred a 10% pay rise you could do it like this:
a["Fred"] = a["Fred"] * 1.1
And if you wanted to set Sara's salary to be the same as John's you could write:
a["Sara"] = a["John"]
So an associative array is just an array that maps keys to values. Note that the keys do not need to be strings, and the values do not need to be numeric, but the basic concept is the same regardless of the data types. Note also that one obvious constraint is that keys need to be unique.
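To connect this back to the two-file question in awk itself: read file 1 first and remember each row keyed on column 1, then while reading file 2 look its column 2 up in the array. A minimal sketch (assuming whitespace-separated columns, and assuming the value you want carried over is file 1's column 2):
awk 'NR==FNR { val[$1] = $2; next }   # first file only: key = col 1, value = col 2
     $2 in val { print $0, val[$2] }  # second file: append the matching value
' file1 file2
NR==FNR is the standard trick for "am I still reading the first file?": FNR resets to 1 for each new input file while NR keeps counting, so the two are only equal during the first file.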
Grandma - let's say you want to make jam out of strawberries, raspberries, and blueberries, one jar of each.
You have a shelf on your wall with room/openings for 3 jars on it. That shelf is an associative array: shelf[]
Stick a label named "strawberry" beneath any one of the 3 openings. That is the index of an associative array: shelf["strawberry"]
Now place a jar of strawberry jam in the opening above that label. That is the contents of the associative array indexed by the word "strawberry": shelf["strawberry"] = "the jar of strawberry jam"
Repeat for raspberry and blueberry.
When you feel like making yourself a delicious sandwich, go to your shelf (array), look for the label (index) named "strawberry" and pick up the jar sitting above it (contents/value), open and apply liberally to bread (preferably Mothers Pride end slices).
Now - if a wolf comes to the door, do not open it in case he steals your sandwich or worse!

SPARQL query with blank node can be complex

I read this blog article, Problems of the RDF model: Blank Nodes, and it mentions that using blank nodes can complicate the handling of data.
Can you give me an example of why using blank nodes makes it difficult to perform a SPARQL query?
I do not understand the complexity of blank nodes.
Can you explain to me the meaning and semantics of an existential variable?
I do not clearly understand the explanation given in the RDF Semantics Recommendation, 1.5. Blank Nodes as Existential Variables.
Existential Variables
In the (first-order) predicate calculus, there is existential quantification which lets us make assertions about things that exist, without saying (or, possibly, knowing) which specific individuals in the domain we're actually talking about. For instance, a sentence like
hasUserId(JoshuaTaylor,1281433)
entails the sentence
∃x.hasUserId(x,1281433)
Of course, there are lots of scenarios in which the second sentence could be true without the first one being true. In that sense, the second sentence gives us less information than the first. It's also important to note that the variable x in the second sentence doesn't provide any way to find out which element in the domain of discourse actually has the given userId. It also doesn't make any claim that there's only one such thing that has the given user id. To make that clearer, we might use an example:
∃y.hasAge(y,29)
This is presumably true, since someone or something out there is age 29. Note that we can't talk about y as the individual that is age 29, though, because there could be lots of them. All this sentence tells us is that there is at least one.
Even though we used different variables in the two sentences, there's nothing to say that the individuals with the specified properties might not be the same. This is particularly important in nested quantification, e.g.,
∃x.∃y.likes(x, y)
This sentence could be true because there is one individual in the domain that likes itself. Just because x and y have different names in the sentence doesn't mean that they might not refer to the same individual.
Blank Nodes as Existential Variables
There is an RDF entailment model defined in RDF Semantics. It has been described more in another Stack Overflow question, RDF Graph Entailment. The idea is that an RDF graph is treated as a big existential quantification over the blank nodes mentioned in the graph. E.g., if the triples in the graph are t1, …, tn, and the blank nodes that appear in those triples are b1, …, bm, then the graph is a formula:
∃b1, …, bm.(t1 ∧ … ∧ tn)
Based on the discussion of the existential variables above, note that this means that blank nodes in the data might refer to same element of the domain, or different elements, and that it's not required that exactly one element could take the place of a blank node. This means that a graph with blank nodes, when interpreted in this manner, provides much less information than you might expect.
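To see concretely what is lost, consider this small hypothetical graph:
@prefix : <http://example.org/> .
:Alice :knows [ :name "Bob" ] .
:Carol :knows [ :name "Bob" ] .
Under the existential reading this says only that each of :Alice and :Carol knows someone named "Bob"; it neither asserts nor rules out that the two blank nodes denote the same person.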
Blank Nodes in Real Data
Now, the discussion above is useful if people are using blank nodes as existential variables. In many cases, authors think of them more as anonymous, but definite and distinct objects. E.g., if we casually write
@prefix : <https://stackoverflow.com/q/20629437/1281433/> .
:Carol :hasAddress [ :hasNumber 4222 ;
                     :hasStreet :Clinton_Way ] .
we may well be trying to say that there is a single address out there with the specified properties, but according to the RDF entailment model, that's not what we're doing.
In practice, this isn't so much of a problem, because we're usually not using RDF entailment. What is a problem, though, is that since the scope of blank nodes is local to a graph, we can't run a SPARQL query against an endpoint asking for Carol's address and get back an IRI that we can reuse. If we run a query like this:
prefix : <https://stackoverflow.com/q/20629437/1281433/>
construct {
  :Mike :hasAddress ?address
}
where {
  :Carol :hasAddress ?address
}
then we get back the following (unhelpful) graph as a result:
@prefix : <https://stackoverflow.com/q/20629437/1281433/> .
:Mike :hasAddress [] .
We won't have a way to get more information about the address because all we have now is a blank node. If we had used IRIs, e.g.,
@prefix : <https://stackoverflow.com/q/20629437/1281433/> .
:Carol :hasAddress :address1267389 .
:address1267389 :hasNumber 4222 ;
                :hasStreet :Clinton_Way .
then the query would have produced something more helpful:
@prefix : <https://stackoverflow.com/q/20629437/1281433/> .
:Mike :hasAddress :address1267389 .
Why is this more useful? The first case is like having the data
∃x.(hasAddress(Carol,x) ∧ hasNumber(x,4222) ∧ hasStreet(x,ClintonWay))
and getting back a result
∃y.hasAddress(Mike,y)
Sure, it's possible that Mike and Carol have the same address, but from these sentences there's no way to know for sure. It's much more helpful to have data like
hasAddress(Carol,address1267389)
hasNumber(address1267389,4222)
hasStreet(address1267389,ClintonWay)
and getting back a result
hasAddress(Mike,address1267389)
From this, you know that they have the same address, and you can ask things about it.
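For example, a follow-up query against the IRI-based data above can recover the shared details (a sketch):
prefix : <https://stackoverflow.com/q/20629437/1281433/>
select ?number ?street where {
  :Mike :hasAddress ?address .
  ?address :hasNumber ?number ;
           :hasStreet ?street .
}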
Conclusion
How much this will affect your data and its consumers depends on what the typical use cases are. For automatically constructed graphs, it may be hard to know in advance just what kind of data you'll need to be able to refer to later, so it's a good idea to generate IRIs for as many of your resources as you can. Since IRIs are free-form, it's usually not too hard to do this. For instance, if you've got some sensible “base” IRI, e.g.,
http://example.org/myData/
then you can easily append suffixes to identify your resources. E.g.,
http://example.org/myData/addresses/addr1
http://example.org/myData/addresses/addr2
http://example.org/myData/addresses/addr3
http://example.org/myData/individuals/ind34
http://example.org/myData/individuals/ind35
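If your endpoint supports SPARQL 1.1, you can even mint such IRIs while copying blank-node data, for instance with STRUUID (a sketch; the addresses/ path segment is just an illustrative choice):
prefix : <https://stackoverflow.com/q/20629437/1281433/>
construct {
  :Carol :hasAddress ?addrIri .
  ?addrIri :hasNumber ?number ;
           :hasStreet ?street .
}
where {
  :Carol :hasAddress ?addr .
  ?addr :hasNumber ?number ;
        :hasStreet ?street .
  bind(iri(concat("http://example.org/myData/addresses/", struuid())) as ?addrIri)
}
Note that this mints one fresh IRI per matched row, so it is only a starting point.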

Tips for designing a serialization file format that will permit easy merging

Say I'm building a UML modeling tool. There's some hierarchical organization to the data, and model elements need to be able to refer to others. I need some way to save model files to disk. If multiple people might be working on the files simultaneously, the time will come to merge these model files. Also, it would be nice to compare two revisions in source control and see what has changed. This seems like it would be a common problem across many domains.
For this to work well using existing difference and merge tools, the file format should be text, separated onto multiple lines.
What are some existing serialization formats that do a good job (or poor job) addressing such problems? Or, if designing a custom file format, what are some tips / guidelines / pitfalls?
Bonus question: Any additional guidance if I want to eventually split the model up into multiple files, each separately source controlled?
I solved that problem long ago for Octave/MATLAB; now I need something for C#.
The task was to merge two Octave structs into one. I found no merge tool and no fitting serializer, so I had to come up with something.
The most important design decision was to split the struct tree into lines, each containing the complete path and the content of the leaf.
The basic idea was:
Serialize the struct to lines, where each line represents a basic variable (matrix, string, float, ...).
An array or matrix of structs will have the index in the path.
Concatenate the two resulting text files and sort the lines.
Detect collisions and do collision handling (very easy, because the same properties will be positioned directly under each other after the line sorting).
Deserialize.
Example:
>> s1
s1 =
  scalar structure containing the fields:
    b =
      2x2 struct array containing the fields:
        bruch
    t = Textstring
    f = 3.1416
    s =
      scalar structure containing the fields:
        a = 3
        b = 4
will be serialized to
root.b(1,1).bruch=txt2base('isfloat|[ [ 0, 4 ] ; [ 1, 0 ] ; ]');
root.b(1,2).bruch=txt2base('isfloat|[ [ 1, 6 ] ; [ 1, 0 ] ; ]');
root.b(2,1).bruch=txt2base('isfloat|[ [ 2, 7 ] ; [ 1, 0 ] ; ]');
root.b(2,2).bruch=txt2base('isfloat|[ [ 7 ] ; [ 1 ] ; ]');
root.f=txt2base('isfloat|[3.1416]');
root.s.a=txt2base('isfloat|[3]');
root.s.b=txt2base('isfloat|[4]');
root.t=txt2base('ischar|Textstring');
The advantage of this method is that it is very easy to implement and it is human-readable. First you have to write the two functions base2txt and txt2base, which convert basic types to strings and back. Then you just walk the tree recursively and, for each struct property, write the path to the property (here separated by ".") and its content to one line.
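The recursive walk itself is only a few lines. A sketch in Octave (ignoring struct arrays and their indices for brevity; base2txt is the user-written converter described above):
function lines = struct2lines (s, path)
  % Flatten a struct tree into "path=txt2base('...');" lines.
  lines = {};
  for [val, key] = s                          % iterate over the fields of s
    p = [path "." key];
    if (isstruct (val))
      lines = [lines; struct2lines(val, p)];  % recurse into sub-structs
    else
      lines{end+1, 1} = sprintf ("%s=txt2base('%s');", p, base2txt (val));
    endif
  endfor
endfunction
Called as struct2lines(s1, "root"), this produces lines like the ones shown above, ready to be concatenated with the second file's lines and sorted.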
The big disadvantage is that my implementation of it, at least, is very slow.
The answer to the second question (is there already something like this out there?): I don't know, but I searched for a while, so I don't think so.
Some guidelines:
The format should be designed so that when only one thing has changed in a model, there is only one corresponding change in the file. Some counterexamples:
It's no good if the file format uses arbitrary reference IDs that change every time you edit and save the model.
It's no good if array items are stored with their indices listed explicitly, since inserting items into the middle of an array will cause all the following indices to get shuffled down. That will cause those items to show up in a 'diff' unnecessarily.
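For instance, in a hypothetical line-based format, storing
child[0] = ClassA
child[1] = ClassB
child[2] = ClassC
means inserting a new first child rewrites every line below it, while an index-free form such as
child = ClassA
child = ClassB
child = ClassC
keeps the insertion to a one-line diff.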
Regarding references: if IDs are created serially, then two people editing the same revision of the model could end up creating new elements with the same ID. This will become a problem when merging.
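One simple way to avoid serial-ID collisions, sketched in C# (the language the first answer mentions as its target), is to mint a globally unique ID once when an element is created and keep it stable across saves:

using System;

// Generate once at element creation; store it in the file; never regenerate on save.
// Two people editing the same revision will still get distinct IDs.
string elementId = Guid.NewGuid().ToString("N"); // e.g. "3f2504e04f8941d39a0c0305e82c3301"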