If I have a grammar rule like
a: A (C|D|E)
I can create an AST for the rule by attaching a rewrite rule to each alternative (C, D, E) like this:
a: A (C -> ^(A C)
| D -> ^(A D)
| E -> ^(A E))
But if I have a slightly different grammar rule like
a: (A|B) (C|D|E)
how do I create an AST for every possible match? I first tried this:
a: (A|B) (C|D|E) -> ^((A|B) (C|D|E))
but it did not work.
Is there a simple way to solve this problem?
Thanks in advance. :)
You have two options:
1
a : (left=A | left=B) (right=C | right=D | right=E) -> ^($left $right)
;
or:
2
a : left right -> ^(left right)
;
left
: A
| B
;
right
: C
| D
| E
;
Personally, I prefer the 2nd option.
I have a problem with some code.
If I write Recenzes select: [:a | a komponenta nazev = 'Hitachi P21'] I get the right records. But if I use something like this:
| brzdy |
brzdy := (((
(Sekces select: [:b | b nazev = 'Brzdy']) collect: [:b | b komponenty]) flatten)
select: [:c | c vyrobce nazev = 'Hitachi']) collect: [:d | d nazev].
I can get 'Hitachi P21' by returning it with ^. But if I use the variable 'brzdy' here: Recenzes select: [:a | a komponenta nazev = brzdy] I don't get anything.
In a nutshell: I want to show the 'Recenzes' for the 'Komponenty' which are in the 'Sekces' with the value 'Brzdy'; they are saved in the column 'Komponenty' (a Set) for 'Recenzes' and 'Sekces'.
Does anyone know why this returns nothing?
Since brzdy is the result of a #collect: message, it is a collection of strings, not a single string. Therefore no element a can satisfy the condition a komponenta nazev = brzdy, because you would be comparing a string with a collection, which are objects of different classes. Try something along the lines of
Recenzes select: [:a | brzdy includes: a komponenta nazev]
As a side note, remember that you can eliminate some parentheses by using select:thenCollect: instead of (select: blah) collect: bluh. For instance
brzdy := (Sekces select: [:b | b nazev = 'Brzdy'] thenCollect: [:b | b komponenty]) flatten
select: [:c | c vyrobce nazev = 'Hitachi']
thenCollect: [:d | d nazev]
(I'm not familiar with the #flatten message, so I can't tell whether it is necessary or superfluous).
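For readers who don't use Smalltalk, the same pitfall can be sketched in Python. The names and the sample values below are made up for illustration; only 'Hitachi P21' comes from the question. Comparing a string with a collection selects nothing, while a membership test (the analogue of includes:) selects the matching items.

```python
# Hypothetical stand-ins for the values in the question.
brzdy = ["Hitachi P21", "Hitachi X5"]  # what the collect: returns, a collection of names
review_names = ["Hitachi P21", "Brembo B2", "Hitachi X5"]  # component name of each review

# A string is never equal to a list, so this selects nothing.
by_equality = [name for name in review_names if name == brzdy]

# Membership test, the analogue of `brzdy includes: a komponenta nazev`.
by_membership = [name for name in review_names if name in brzdy]

print(by_equality)
print(by_membership)
```

The first list comes out empty for the same reason the original select: did.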
For later processing with the CAPS project, I need to combine two different graphs into one:
Graph3=Graph1+Graph2
I searched for a way to do that and found UNION ALL, but it doesn't work as I expected. Is there another way to do this with Cypher?
Example:
val Graph1=session.cypher("""
| FROM GRAPH mergeGraph
| MATCH (from)-[via]->(to)
|WHERE substring(from.geohash,0,5)=substring(to.geohash,0,5)
| CONSTRUCT
| CREATE (h1:HashNode{geohash:substring(from.geohash,0,5)})-[COPY OF via]->(h1)
| RETURN GRAPH
""".stripMargin).graph
which contains this pattern:
val Graph2=session.cypher("""
| FROM GRAPH mergeGraph
| MATCH (from)-[via]->(to)
|WHERE substring(from.geohash,0,5)<>substring(to.geohash,0,5)
| CONSTRUCT
| CREATE (:HashNode{geohash:substring(from.geohash,0,5)})-[COPY OF via]->(:HashNode{geohash:substring(to.geohash,0,5)})
| RETURN GRAPH
""".stripMargin).graph
which contains this pattern:
With unionAll:
Graph3=Graph1.unionAll(Graph2)
I get this graph:
As you can see, the green nodes are the nodes of Graph2, but without their relationships! That is not what I expected.
I have a grouped result whose schema looks exactly like this:
| grouped | group:chararray | log:bag{:tuple(driverId:chararray,truckId:chararray,eventTime:chararray,eventType:chararray,longitude:chararray,latitude:chararray,eventKey:chararray,CorrelationId:chararray,driverName:chararray,routeId:chararray,routeName:chararray,eventDate:chararray)}
When I run the following:
x = FOREACH grouped GENERATE {log.driverId, log.truckId, log.driverName};
illustrate x;
the output I get is:
| x | :bag{:tuple(:bag{:tuple(driverId:chararray)})} |
------------------------------------------------------------------------------------
| | {({(11), (11)}), ({(74), (39)}), ({(Jamie Engesser), (Jamie Engesser)})} |
------------------------------------------------------------------------------------
whereas my expectation is:
{({(11, 74, Jamie Engesser), (11,39,Jamie Engesser)})
I found a solution.
Since group was a tuple and the adjacent result was a bag, I had to use a nested FOREACH like this:
x = FOREACH grouped {
    val1 = group;
    vals = FOREACH log GENERATE driverId, truckId, driverName;
    GENERATE val1, vals;
};
This selects only the required attributes from each tuple in the bag, so each output tuple keeps its fields together.
Please comment if someone knows a better or easier way of doing it.
Thanks
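To illustrate why the two forms differ, here is a small Python sketch (not Pig, and not how Pig is implemented). The variable names are made up, and the eventType value is a placeholder; the other values come from the output above. Projecting each field of the bag separately yields one bag per column, while projecting inside a nested loop yields one tuple per row.

```python
# One group's bag, trimmed to four fields. "Normal" is a placeholder value.
log = [
    {"driverId": "11", "truckId": "74", "driverName": "Jamie Engesser", "eventType": "Normal"},
    {"driverId": "11", "truckId": "39", "driverName": "Jamie Engesser", "eventType": "Normal"},
]
wanted = ("driverId", "truckId", "driverName")

# Like {log.driverId, log.truckId, log.driverName}: one bag per column.
per_column = [[row[field] for row in log] for field in wanted]

# Like the nested FOREACH log GENERATE ...: one tuple per row.
per_row = [tuple(row[field] for field in wanted) for row in log]

print(per_column)
print(per_row)
```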
Can data in Hive be transposed, so that the rows become columns and the columns become rows? If there is no built-in function for it, is there a way to do it in a couple of steps?
I have a table like this:
| ID | Names | Proc1 | Proc2 | Proc3 |
| 1 | A1 | x | b | f |
| 2 | B1 | y | c | g |
| 3 | C1 | z | d | h |
| 4 | D1 | a | e | i |
I want it to be like this:
| A1 | B1 | C1 | D1 |
| x | y | z | a |
| b | c | d | e |
| f | g | h | i |
I have been looking up other related questions and they all mention using lateral views and explode, but is there a way to selectively choose columns for lateral(ly) view(ing) and explod(ing)?
Also, what might be the rough process to achieve what I would like to do? Please help me out. Thanks!
Edit: I have been reading this link: https://cwiki.apache.org/Hive/languagemanual-lateralview.html and it shows me half of what I want to achieve. The first example in the link is basically what I'd like, except that I don't want the rows to repeat and I want them as column names. Any ideas on how to get the data into a form such that an explode would result in my desired output? Or the other way around, i.e., explode first and then apply another step that leads to my desired output table? Thanks again!
I don't know of a way to do this out of the box in Hive, sorry. You get close with explode etc., but I don't think it can get the job done.
Conceptually, I think it's hard to do a transpose without knowing in advance what the columns of the destination table are going to be. This is true for Hive in particular, because the metadata (how many columns, their types, their names, etc.) is kept in a database, the metastore. It's also true in general, because not knowing the columns beforehand would require holding the data in memory (ok, sure, with spills), and users would need to be careful about not overflowing memory (just like with dynamic partitioning in Hive).
In any case, long story short: if you know the columns of the destination table beforehand, life is good. To the best of my knowledge there is no single command for this in Hive, but you could use a bunch of if clauses and case statements in the select clause to transpose the data (ugly, I know, but that's how I have done it in the past). Something along the lines of the question "SQL - How to transpose?"
Do let me know how it goes!
As Mark pointed out, there's no easy way to do this in Hive, since Hive has no PIVOT. You may also run into issues with the case/when 'trick', since you have multiple value columns (proc1, proc2, proc3).
For testing purposes, you may try a different approach:
select v, o1, o2, o3 from (
select k,
v,
LEAD(v,3) OVER() as o1,
LEAD(v,6) OVER() as o2,
LEAD(v,9) OVER() as o3
from (select transform(name,proc1,proc2,proc3) using 'python strm.py' AS (k, v)
from input_table) q1
) q2 where k = 'A1';
where strm.py is:
import sys

# Emit one (name, value) row per proc column of each input row.
for line in sys.stdin:
    line = line.strip()
    name, proc1, proc2, proc3 = line.split('\t')
    for value in (proc1, proc2, proc3):
        print('%s\t%s' % (name, value))
The trick here is to use a Python script in the map phase which emits each column of a row as a distinct row. Then every third row (since we have 3 proc columns) contributes to the resulting row, which we build by peeking forward with LEAD.
Although this query does the job, it has a drawback: as the input grows, every additional input row needs another LEAD column that peeks 3 rows further ahead, which may hurt performance. Still, you may evaluate it for testing purposes.
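To make the peek-forward idea concrete, here is a minimal Python simulation of the query, using the table from the question. It is only a sketch: the lead helper is a made-up stand-in for LEAD, and it assumes the emitted rows keep their input order. As far as I know, a window with an empty OVER() and no ORDER BY does not guarantee that order in Hive.

```python
# The table from the question: (name, proc1, proc2, proc3).
rows = [
    ("A1", "x", "b", "f"),
    ("B1", "y", "c", "g"),
    ("C1", "z", "d", "h"),
    ("D1", "a", "e", "i"),
]

# Map phase (what strm.py does): one (k, v) pair per proc column.
pairs = [(name, value) for name, *procs in rows for value in procs]

def lead(i, offset):
    """Value `offset` rows ahead of row i, or None past the end (like LEAD)."""
    j = i + offset
    return pairs[j][1] if j < len(pairs) else None

n = 3  # number of proc columns
transposed = [
    (v, lead(i, n), lead(i, 2 * n), lead(i, 3 * n))
    for i, (k, v) in enumerate(pairs)
    if k == "A1"  # keep only the rows that start a result row
]

for row in transposed:
    print(row)
```

Each kept row starts from one of A1's values and picks up the value in the same position from B1, C1 and D1.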
I'm kinda new to RDP/Pairwise Disjoint Test and this is just a sample problem. I already have the answer and I would just like to verify if this is correct.
Grammar:
<GU> ::= du<GU>bi<MI> | <HO> | ru
<MI> ::= ra | fa | <HO>
<HO>::= bi<HO> | bi
Solution:
<HO> will generate a string of "bi" OR one "bi"
<MI> will generate one "ra" OR one "fa" OR (string of "bi" OR one "bi")
So <GU> will generate
du <GU> bi {ra | fa | {bi's | bi} } | {bi's | bi} | ru
Here are the candidate sentences; the task is to find which of them can be produced by the grammar:
a. dudurubifabira
b. dubibibira
c. dubirubirurafa
d. dududubibibifabirabibibi
e. dududubibifarabirabibi
My answer is "b" and "d".
Am I correct?
It looks like "a" can also be generated by the grammar:
<GU>
-> du<GU>bi<MI>
-> dudu<GU>bi<MI>bi<MI>
-> dudurubi<MI>bi<MI>
-> dudurubifabi<MI>
-> dudurubifabira
Otherwise, your end result seems to be correct. I'd be careful about saying a "bi" will generate something though, since it's a terminal.
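If you want to check the candidates mechanically, here is a minimal sketch of a backtracking recognizer for the grammar in Python. It is not part of the exercise and all function names are my own. Each function returns the set of positions where its nonterminal can end, which sidesteps the fact that both <HO> alternatives start with "bi" (so the rule is not pairwise disjoint).

```python
def ends_ho(s, i):
    """End positions of <HO> ::= bi<HO> | bi when it starts at index i."""
    ends = set()
    while s.startswith("bi", i):
        i += 2
        ends.add(i)
    return ends

def ends_mi(s, i):
    """End positions of <MI> ::= ra | fa | <HO> when it starts at index i."""
    ends = ends_ho(s, i)
    if s.startswith("ra", i) or s.startswith("fa", i):
        ends.add(i + 2)
    return ends

def ends_gu(s, i):
    """End positions of <GU> ::= du<GU>bi<MI> | <HO> | ru when it starts at index i."""
    ends = ends_ho(s, i)
    if s.startswith("ru", i):
        ends.add(i + 2)
    if s.startswith("du", i):
        # Try every place the inner <GU> could end, then require "bi" and an <MI>.
        for j in ends_gu(s, i + 2):
            if s.startswith("bi", j):
                ends |= ends_mi(s, j + 2)
    return ends

def derivable(s):
    """True if the whole string can be derived from <GU>."""
    return len(s) in ends_gu(s, 0)

candidates = {
    "a": "dudurubifabira",
    "b": "dubibibira",
    "c": "dubirubirurafa",
    "d": "dududubibibifabirabibibi",
    "e": "dududubibifarabirabibi",
}
accepted = sorted(k for k, s in candidates.items() if derivable(s))
print(accepted)
```

Running it accepts a, b and d, which agrees with the derivation above.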