I would like to be able to declare one or two variables of the same type from a set called group. I know that one and lone can be used to declare exactly one or zero/one variables respectively. My attempt so far is:
one x : group, lone y : from | {...}
However, this doesn't appear to work. My aim is to have either one or two variables that I can then use in the following expression.
There might be confusion here.
If you write one x:group | expr, this means that there should be exactly one x in group for which the expression expr holds.
Knowing this, you could express that kind of constraint as follows.
For example, assume there is a field named size describing a relation from group to Int.
Expressing that at least one and at most two groups have a size of 5 can be done as follows:
one x,y : group | (x + y).size=5
In this example, x+y will yield one or two group elements depending on whether x=y or not.
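The remark about x+y can be illustrated outside Alloy with ordinary sets. This is only a minimal Python sketch of the union behaviour, not Alloy semantics in general; the function name union_size and the group names are made up for illustration.

```python
def union_size(x, y):
    """Number of distinct elements in the union of the singletons {x} and {y}."""
    return len({x} | {y})


# Hypothetical group names, used only for illustration.
same = union_size("g1", "g1")       # x and y denote the same group
different = union_size("g1", "g2")  # x and y denote different groups
```

When x and y coincide the union collapses to a single element, which is why a single quantifier over two variables can cover both the "one" and the "two" case.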
I have three sets, and I want to know which elements do not belong to their symmetric difference sets.
Set1={1*125}
Set2={20*450}
Set3={45*235}
I denote the symmetric difference of set A and set B by symAB.
I calculate sym12, sym13 and sym23. I have one if statement, like this: if element x does not belong to symAB, then display x.
How can I code this conditional statement?
Best
You cannot display just parts of a symbol. You can only use a condition to decide whether a symbol is displayed at all; see also: How Display some of 2 dimension parameter?
What you could do in your case is define a symbol containing all elements that are not in the other symbol, and display it if it is not empty, like this:
Set x /1*4/
symAB /2,4/
notSymAB(x);
notSymAB(x) = not SymAB(x);
Display$(card(notSymAB)>0) notSymAB;
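For readers who want to check the logic outside GAMS, here is a rough Python sketch of the same idea using the ranges of Set1 and Set2 from the question. It assumes the universe is simply the union of the two sets; the variable names are illustrative and not taken from the original model.

```python
# Ranges mirror Set1 and Set2 from the question (GAMS 1*125 is inclusive).
set1 = set(range(1, 126))
set2 = set(range(20, 451))

# Assumption: the universe is the union of the two sets being compared.
universe = set1 | set2

sym12 = set1 ^ set2           # elements that are in exactly one of the two sets
not_sym12 = universe - sym12  # elements outside the symmetric difference

# Only display the result when it is non-empty, as the conditional Display does.
if not_sym12:
    print(sorted(not_sym12))
```

Under that assumption, the elements outside the symmetric difference are exactly the ones the two sets have in common.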
From what I understand, EATV (which Datomic does not have) would be a great fit for as-of queries. On the other hand, I see no use case for EAVT.
This is analogous to row/primary key access. From the docs: "The EAVT index provides efficient access to everything about a given entity. Conceptually this is very similar to row access style in a SQL database, except that entities can possess arbitrary attributes rather than being limited to a predefined set of columns."
The immutable time/history side of Datomic is a motivating use case for it, but in general, it's still optimized around typical database operations, e.g. looking up an entity's attributes and their values.
Update:
Datomic stores datoms (in segments) in the index tree. So you navigate to a particular E's segment using the tree and then retrieve the datoms about that E in the segment, which are EAVT datoms. From your comment, I believe you're thinking of this as the navigation of more b-tree like structures at each step, which is incorrect. Once you've navigated to the E, you are accessing a leaf segment of (sorted) datoms.
You are not looking for a single value at a specific point in time. You are looking for a set of values up to a specific point in time T. History is on a per value basis (not attribute basis).
For example, assert X, retract X then assert X again. These are 3 distinct facts over 3 distinct transactions. You need to compute that X was added, then removed and then possibly added again at some point.
You can do this with SQL:
create table Datoms (
    E bigint not null,
    A bigint not null,
    V varbinary(1536) not null,
    T bigint not null,
    Op bit not null -- 1 = assert, 0 = retract
);

select E, A, V
from Datoms
where E = 1 and T <= 42
group by E, A, V
having 0 < sum(case Op when 1 then +1 else -1 end);
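The fifth component Op of the datom tells you whether the value is asserted (1) or retracted (0). By summing over this value (as +1/-1) we arrive at either 1 or 0: 1 if the value is asserted as of T, 0 if it has since been retracted. The having clause keeps only the values whose sum is positive.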
Asserting the same value twice does nothing, and you always retract the old value before you assert a new value. The last part is a prerequisite for the algorithm to work out this nicely.
With an EAVT index, this is a very efficient query and it's quite elegant. You can build a basic Datomic-like system in just 150 lines of SQL like this. It is the same pattern repeated for any permutation of EAVT index that you want.
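To try the as-of query without a database server, here is a small sketch using Python's built-in sqlite3 module. The column types are simplified to SQLite types, and the sample datoms and the helper name as_of are assumptions made for illustration. It replays the assert, retract, assert sequence described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Datoms (E integer not null, A integer not null, "
    "V text not null, T integer not null, Op integer not null)"
)
# Hypothetical history for entity 1, attribute 10:
# assert 'x' at T=1, retract 'x' at T=2, assert 'y' at T=3.
conn.executemany(
    "insert into Datoms values (?, ?, ?, ?, ?)",
    [(1, 10, "x", 1, 1), (1, 10, "x", 2, 0), (1, 10, "y", 3, 1)],
)


def as_of(entity, t):
    """Return the (E, A, V) facts that hold for `entity` as of transaction `t`."""
    rows = conn.execute(
        "select E, A, V from Datoms where E = ? and T <= ? "
        "group by E, A, V "
        "having 0 < sum(case Op when 1 then 1 else -1 end) "
        "order by A, V",
        (entity, t),
    )
    return rows.fetchall()
```

Calling as_of(1, 2) returns an empty list because the assertion and the retraction cancel out, while as_of(1, 3) returns only the newly asserted value.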
I have a pivot table chart in QlikView that has a dimension and an expression. The dimension is a column with 5 possible values: 'a','b','c','d','e'.
Is there a way to restrict the values to 'a','b' and 'c' only?
I would prefer to enforce this from the chart properties with a condition, instead of choosing the values from a listbox if possible.
Thank you very much, I_saw_drones! I do have a problem, though. I have different expressions defined depending on the category, like this:
IF([Category] = 'A', COUNT({<[field1] = {'x','y'}>} [field2]), IF([Category] = 'B', SUM({<[field3] = {'z'}>} [field4]), IF([Category] = 'C', ..., 0)))
In this case, where would I add {$<Category={'A','B','C'}>}? My expression so far doesn't help because, although I tell QV to use a different formula/calculation for each category, the dimension still represents the category overall (all 5 values).
One possible method to do this is to use QlikView's Set Analysis to create an expression which sums only your desired values.
For this example, I have a very simple load script:
LOAD * INLINE [
Category, Value
A, 1
B, 2
C, 3
D, 4
E, 5
];
I then have the following Pivot Table Chart set up with a single expression which just sums the values:
What we need to do is to modify the expression, so that it only sums A, B and C from the Category field.
If I then use QlikView's Set Analysis to modify the expression to the following:
=sum({$<Category={A,B,C}>} Value)
I then achieve my desired result:
This then restricts my Pivot Table Chart to displaying only these three values for Category without me having to make a selection in a Listbox. The form of this expression also allows other dimensions to be filtered at the same time (i.e. the selections "add up"), so I could say, filter on a Country dimension, and my restriction for Category would still be applied.
How this works
Let's pick apart the expression:
=sum({$<Category={A,B,C}>} Value)
Here you can recognise the original form we had before (sum(Value)), but with a modification. The part {$<Category={A,B,C}>} is the Set Analysis part and has this format: {set_identifier<set_modifier>}. Coming back to our original expression:
{: Set Analysis expressions always start with a {.
$: Set Identifier: This symbol represents the current selections in the QlikView document. This means that any subsequent restrictions are applied on top of the existing selections. 1 can also be used, this represents the full set of data in your document irrespective of selections.
<: Start of the set modifiers.
Category={A,B,C}: The dimension that we wish to place a restriction on. The values required are contained within the curly braces and in this case they are ORed together.
>: End of the set modifiers.
}: End of the set analysis expression.
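As a loose analogy only, the effect of the set modifier on the sample data can be mimicked in Python by filtering rows before aggregating. This is not how QlikView evaluates Set Analysis internally, it ignores the current selections that $ stands for, and the function name sum_by_category is made up for illustration.

```python
# Same rows as the inline load script above.
rows = [("A", 1), ("B", 2), ("C", 3), ("D", 4), ("E", 5)]


def sum_by_category(rows, allowed):
    """Sum Value per Category, keeping only categories in `allowed`."""
    totals = {}
    for category, value in rows:
        if category in allowed:  # values inside the braces are ORed together
            totals[category] = totals.get(category, 0) + value
    return totals


restricted = sum_by_category(rows, {"A", "B", "C"})
```

Categories D and E never reach the aggregation, which is why they drop out of the chart without any listbox selection.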
Set Analysis can be quite complex and I've only scratched the surface here. I would definitely recommend checking the QlikView topic "Set Analysis" in both the installed helpfile and the reference manual (PDF).
Finally, Set Analysis in QlikView is quite powerful; however, it should be used sparingly as it can lead to performance problems. In this case, as this is a fairly simple expression, the performance should be reasonable.
Whoa! A year later, but what you are looking for is something like this:
Go to the Dimensions sheet, select the Category dimension, and click the Edit Dimension button.
There you can use something like this:
= If(Match(Category, 'a', 'b', 'c'), Category, Null())
This will make the object display only the a, b and c categories, plus a line for the Null value.
Lastly, check the "Suppress value when null" option on the Dimensions sheet to hide that line.
See you around.
I just thought of another solution to this, which may still be useful to people looking for it.
How about creating a bookmark with the categories that you want and then setting the expressions to be evaluated in the context of that bookmark only?
(Will expand on this later, but take a look at how set analysis can be affected by a bookmark)
I am trying to understand if I can perform a query with Neo4j that contains both WITH and HAVING clauses. I have this so far:
MATCH (n)-[r:RELATIONSHIP*1..3]->(m)
SET m:LABEL
WITH m
MATCH (m:LABEL)-[r2:RELATIONSHIP]->(q:OTHERLABEL)
WHERE r2.time<100
RETURN p,r2,q;
I now need to add to the same query something that in SQL would look like this:
MATCH (n)-[r:RELATIONSHIP*1..3]->(m)
SET m:LABEL
WITH m
MATCH (m:LABEL)-[r2:RELATIONSHIP]->(q:OTHERLABEL)
WHERE r2.time<100
AND WHERE count(q)=3
RETURN m,r2,q;
I know that Cypher doesn't let me do that without something like a HAVING clause, but when I try to add it to my query it conflicts with the previous WITH clause.
Is this feasible, or is it too nested for Cypher to allow?
You can have as many WITH statements as you want; WITH just pipes query results from one part to the next. Actually, WITH + WHERE = HAVING.
MATCH (n)-[r:RELATIONSHIP*1..3]->(m)
SET m:LABEL
WITH m
MATCH (m:LABEL)-[r2:RELATIONSHIP]->(q:OTHERLABEL)
WHERE r2.time<100
WITH m,collect([r2,q]) as paths
WHERE size(paths) = 3
RETURN m,paths;
Btw. I don't know where your p comes from.
I'm not sure what your reference is for HAVING in Cypher, but that's not the problem with the query.
Drop the second WHERE: in Cypher, you use WHERE once and can then extend it with whatever boolean operators you need.
Your first filter condition tests individual relationships (r2), but the second tests an aggregate (count(q)). You can't test a flat pattern and an aggregate from the same pattern at the same time.
Return things that you have actually bound (what is p?).
You may also want to change the second MATCH: m is already bound, but you are re-matching it with the just-created label. All in all, try something like
MATCH (n)-[r:RELATIONSHIP*1..3]->(m)
SET m:LABEL
WITH m
MATCH (m)-[r2:RELATIONSHIP]->(q:OTHERLABEL)
WHERE r2.time<100
WITH m, collect(r2) as rr, collect(q) as qq
WHERE size(qq) = 3
RETURN m,rr,qq;
This filters first on the flat relationship r2 and then on the size of the aggregate. For a flat WHERE .. AND .., try something like
MATCH (n)-[r:RELATIONSHIP*1..3]->(m)
SET m:LABEL
WITH m
MATCH (m)-[r2:RELATIONSHIP]->(q:OTHERLABEL)
WHERE r2.time<100 AND q.someProp = 10
RETURN m,r2,q;
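The difference between a row-level filter and a filter on an aggregate can be sketched in plain Python. The rows below are hypothetical stand-ins for matched (m, r2.time, q) triples, and the function name having_count is invented for illustration. It mirrors the WITH ... collect ... WHERE pattern rather than anything Neo4j does internally.

```python
from collections import defaultdict

# Hypothetical rows standing in for matched (m, r2.time, q) triples.
rows = [
    ("m1", 10, "q1"), ("m1", 20, "q2"), ("m1", 30, "q3"),
    ("m2", 10, "q4"), ("m2", 150, "q5"), ("m2", 20, "q6"),
]


def having_count(rows, max_time, wanted):
    """Filter rows, group by m, then keep groups with exactly `wanted` members."""
    groups = defaultdict(list)
    for m, time, q in rows:
        if time < max_time:      # row-level filter, like WHERE r2.time < 100
            groups[m].append(q)  # grouping step, like WITH m, collect(q)
    # Filter on the aggregate, which is what HAVING does in SQL.
    return {m: qq for m, qq in groups.items() if len(qq) == wanted}


result = having_count(rows, 100, 3)
```

Here m2 is dropped because only two of its relationships survive the row-level filter, so the aggregate condition fails for that group.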
What is the difference between :: and . in pig?
When do I use one vs the other?
E.g., I know that :: is needed in a join when a field exists in both aliases:
A = foreach (join B by (x), C by (y)) generate B::y as b_y, C::y as c_y;
and I need . when accessing group fields:
A = foreach (group B by (x,y)) generate group.x as x, group.y as y, SUM(B?z) as z;
However, do I pass B::z or B.z to SUM above instead of B?z?
In Pig, :: is used as a disambiguation tool after operations which could possibly create naming collisions. Notably, this happens with JOIN, CROSS, and FLATTEN. Consider two relations, A:{(id:int, name:chararray)} and B:{(id:int, location:chararray)}. If you want to associate names with locations, naturally you would do:
C = JOIN A BY id, B BY id;
Without the disambiguation operator, your schema would be
C:{(id:int, name:chararray, id:int, location:chararray)}
Now you can't tell which field id refers to. To avoid this, Pig will instead do
C:{(A::id:int, A::name:chararray, B::id:int, B::location:chararray)}
Likewise, you could FLATTEN two bags whose tuples have fields with the same name, and they would also collide. So the same operator is used in this case as well. When there is no such conflict, you do not need to use the full name: name is unambiguous here. To simplify C, then, you can do this:
D = FOREACH C GENERATE A::id, name, location;
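The naming rule can be mimicked with a few lines of Python. This is only a sketch of the idea and not Pig's actual resolver; the helpers join_schema and resolve are hypothetical names.

```python
def join_schema(left_name, left_fields, right_name, right_fields):
    """Prefix every field with its source relation, as in the joined schema above."""
    return ([f"{left_name}::{f}" for f in left_fields]
            + [f"{right_name}::{f}" for f in right_fields])


def resolve(schema, name):
    """Resolve a full or short field name; raise if a short name is ambiguous."""
    matches = [f for f in schema if f == name or f.split("::")[-1] == name]
    if len(matches) != 1:
        raise ValueError(f"{name!r} matches {len(matches)} fields")
    return matches[0]


schema = join_schema("A", ["id", "name"], "B", ["id", "location"])
```

With this schema, resolve(schema, "name") succeeds because only one field ends in name, whereas resolve(schema, "id") raises an error until the prefix is spelled out as A::id or B::id.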
The . operator, by contrast, projects fields from bags and tuples. If you have a bag b with schema {(x:int, y:int, z:int)}, the projection b.y yields a bag with just the specified field: {(y:int)}. You can project multiple fields at once with parentheses: b.(y,z) yields {(y:int, z:int)}.
When used with tuples, the result is a tuple with just the specified fields. If the tuple t has schema (x:int, y:int, z:int), then t.x is the tuple (x:int) and t.(y,z) is the tuple (y:int, z:int).
To your specific question about SUM: note that SUM, along with the other summary statistic UDFs, takes a bag as its argument. Therefore, you need to create a bag with just the one field per tuple that you want to sum. Use the projection operator, ., for this: B.z.
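To make the projection step concrete, here is a rough Python model in which a bag is a list of records. The helpers project and pig_sum are illustrative names and not part of Pig, and the sample data is made up.

```python
def project(bag, *fields):
    """Mimic b.y or b.(y,z): keep only the named fields of each tuple in the bag."""
    return [tuple(t[f] for f in fields) for t in bag]


def pig_sum(bag):
    """Mimic SUM: add up the single field of each tuple in a one-column bag."""
    return sum(t[0] for t in bag)


# Hypothetical bag with schema {(x:int, y:int, z:int)}.
b = [{"x": 1, "y": 2, "z": 3}, {"x": 4, "y": 5, "z": 6}]
```

In this model pig_sum(project(b, "z")) plays the role of SUM(B.z): the projection first narrows every tuple to the one field, and the aggregate then runs over the resulting bag.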
IIRC you get :: as a side effect of some statements. You need not bother about it unless (as you mentioned) a name exists under two different prefixes.
The . is different in that you are going inside the structure.
group.x as x, group.y as y is equivalent to FLATTEN(group)
SUM(B?z) - here you should do SUM(B.z), to specify that you need a particular field to SUM.