I'm kinda new to RDP/Pairwise Disjoint Test and this is just a sample problem. I already have the answer and I would just like to verify if this is correct.
Grammar:
<GU> ::= du<GU>bi<MI> | <HO> | ru
<MI> ::= ra | fa | <HO>
<HO>::= bi<HO> | bi
Solution:
<HO> will generate a string of "bi"s OR one "bi"
<MI> will generate one "ra" OR one "fa" OR (a string of "bi"s OR one "bi")
So <GU> will generate
du <GU> bi {ra | fa | {bi's | bi} } | {bi's | bi} | ru
Here are the sentences that can be produced by the grammar:
a. dudurubifabira
b. dubibibira
c. dubirubirurafa
d. dududubibibifabirabibibi
e. dududubibifarabirabibi
My answer is "b" and "d".
Am I correct?
Looks like "a" can also be generated by the grammar:
<GU>
-> du<GU>bi<MI>
-> dudu<GU>bi<MI>bi<MI>
-> dudurubi<MI>bi<MI>
-> dudurubifabi<MI>
-> dudurubifabira
Otherwise, your end result seems to be correct. I'd be careful about saying a "bi" will generate something though, since it's a terminal.
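To double-check the candidate sentences mechanically, here is a minimal sketch of a backtracking recognizer for the grammar in the question. The function names (parse_gu, parse_mi, parse_ho, accepts) are made up for this illustration; each parse function returns the set of positions at which its nonterminal could end.

```python
def parse_ho(s, i):
    """<HO> ::= bi<HO> | bi  (one or more "bi")."""
    ends = set()
    while s.startswith("bi", i):
        i += 2
        ends.add(i)
    return ends


def parse_mi(s, i):
    """<MI> ::= ra | fa | <HO>."""
    ends = set()
    for terminal in ("ra", "fa"):
        if s.startswith(terminal, i):
            ends.add(i + 2)
    return ends | parse_ho(s, i)


def parse_gu(s, i):
    """<GU> ::= du<GU>bi<MI> | <HO> | ru."""
    ends = set()
    if s.startswith("ru", i):
        ends.add(i + 2)
    ends |= parse_ho(s, i)
    if s.startswith("du", i):
        # Try every way the inner <GU> can end, then require "bi" and an <MI>.
        for j in parse_gu(s, i + 2):
            if s.startswith("bi", j):
                ends |= parse_mi(s, j + 2)
    return ends


def accepts(s):
    """True if the whole string can be derived from <GU>."""
    return len(s) in parse_gu(s, 0)


candidates = {
    "a": "dudurubifabira",
    "b": "dubibibira",
    "c": "dubirubirurafa",
    "d": "dududubibibifabirabibibi",
    "e": "dududubibifarabirabibi",
}
accepted = sorted(label for label, text in candidates.items() if accepts(text))
print(accepted)
```

Returning a set of end positions rather than a single one matters here because <HO> is ambiguous about how many "bi"s it consumes before the "bi" terminal that follows <GU>.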
According to mv-expand documentation:
Expands multi-value array or property bag.
mv-expand is applied on a dynamic-typed column so that each value in the collection gets a separate row. All the other columns in an expanded row are duplicated.
Just like the mv-expand operator will create a row each for the elements in the list -- Is there an equivalent operator/way to make each element in a list an additional column?
I checked the documentation and found Bag_Unpack:
The bag_unpack plugin unpacks a single column of type dynamic by treating each property bag top-level slot as a column.
However, it doesn't seem to work on the list, and rather works on top-level JSON property.
Using bag_unpack (like the below query):
datatable(d:dynamic)
[
dynamic({"Name": "John", "Age":20}),
dynamic({"Name": "Dave", "Age":40}),
dynamic({"Name": "Smitha", "Age":30}),
]
| evaluate bag_unpack(d)
It will do the following:
Name Age
John 20
Dave 40
Smitha 30
Is there a command/way (see some_command_which_helps) I can achieve the following (convert a list to columns):
datatable(d:dynamic)
[
dynamic(["John", "Dave"])
]
| evaluate some_command_which_helps(d)
That translates to something like:
Col1 Col2
John Dave
Is there an equivalent where I can convert a list/array to multiple columns?
For reference: We can run the above queries online on Log Analytics in the demo section if needed (however, it may require login).
You could try something along the following lines
(that said, from an efficiency standpoint, you may want to check your options of restructuring the data set to begin with, using a schema that matches how you plan to actually consume/query it)
datatable(d:dynamic)
[
dynamic(["John", "Dave"]),
dynamic(["Janice", "Helen", "Amber"]),
dynamic(["Jane"]),
dynamic(["Jake", "Abraham", "Gunther", "Gabriel"]),
]
| extend r = rand()
| mv-expand with_itemindex = i d
| summarize b = make_bag(pack(strcat("Col", i + 1), d)) by r
| project-away r
| evaluate bag_unpack(b)
which will output:
|Col1 |Col2 |Col3 |Col4 |
|------|-------|-------|-------|
|John |Dave | | |
|Janice|Helen |Amber | |
|Jane | | | |
|Jake |Abraham|Gunther|Gabriel|
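For readers who want to see the shape of the transformation outside Kusto, here is a rough Python sketch of the same idea: number each list element, pack it into a per-row bag keyed by "Col<index>", then unpack the bags into a common set of columns. The function name lists_to_columns is hypothetical, and empty strings stand in for the empty cells shown above.

```python
def lists_to_columns(rows):
    """Return (column_names, table) with one column per list position."""
    # One bag per input row, mirroring make_bag(pack(strcat("Col", i + 1), d)).
    bags = [
        {"Col%d" % (i + 1): value for i, value in enumerate(row)}
        for row in rows
    ]
    # The output schema is as wide as the longest list, as with bag_unpack.
    width = max((len(row) for row in rows), default=0)
    columns = ["Col%d" % (i + 1) for i in range(width)]
    # Missing positions become empty cells.
    table = [[bag.get(col, "") for col in columns] for bag in bags]
    return columns, table


columns, table = lists_to_columns([
    ["John", "Dave"],
    ["Janice", "Helen", "Amber"],
    ["Jane"],
    ["Jake", "Abraham", "Gunther", "Gabriel"],
])
print(columns)
for row in table:
    print(row)
```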
To extract key value pairs from text and convert them to columns without hardcoding the key names in query:
print message="2020-10-15T15:47:09 Metrics: duration=2280, function=WorkerFunction, count=0, operation=copy_into, invocationId=e562f012-a994-4fc9-b585-436f5b2489de, tid=lct_b62e6k59_prd_02, table=SALES_ORDER_SCHEDULE, status=success"
| extend Properties = extract_all(#"(?P<key>\w+)=(?P<value>[^, ]*),?", dynamic(["key","value"]), message)
| mv-apply Properties on (summarize make_bag(pack(tostring(Properties[0]), Properties[1])))
| evaluate bag_unpack(bag_)
| project-away message
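The regular expression in that query can be tried on its own. The following is a small Python sketch (not Kusto) that applies the same pattern to the same message and collects the pairs into a dictionary, which plays the role of the property bag; the name extract_pairs is made up for this example.

```python
import re

# Same pattern as in the query: a word key, "=", then a value up to a comma or space.
PAIR = re.compile(r"(?P<key>\w+)=(?P<value>[^, ]*),?")


def extract_pairs(message):
    """Collect every key=value pair in the message into a dict."""
    return {m.group("key"): m.group("value") for m in PAIR.finditer(message)}


message = (
    "2020-10-15T15:47:09 Metrics: duration=2280, function=WorkerFunction, "
    "count=0, operation=copy_into, "
    "invocationId=e562f012-a994-4fc9-b585-436f5b2489de, "
    "tid=lct_b62e6k59_prd_02, table=SALES_ORDER_SCHEDULE, status=success"
)
properties = extract_pairs(message)
for key, value in properties.items():
    print(key, value)
```

Note that all values come out as strings; any typing (for example treating duration as a number) would be a separate step.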
id | name | parent_id
ab | file | de
ad | song | de
bc | Bob |ad
mn | open.txt | bc
Assuming
ab is the ID of file and bc is the parent ID of file
then to store you can either use the bulk-insert utility
Or you can use the following Cypher query:
CREATE (A {id:'ab', name: 'file'}), (B {id:'bc', name: 'folder'}), (A)-[:child]->(B)
To query, depending on the data you would like to extract use a Cypher query similar to:
MATCH (c)-[:child]->(p) RETURN c,p
For the type of query you're running, I believe it would be better to maintain a reverse edge [:parent] and modify your query as follows:
GRAPH.QUERY Makinga "MATCH (r:Resource{Id:'6e3f67da-43ed-11e9-b149-d3f886f8337c'})-[:parent*1..]->(b:Resource) RETURN count(b) as count"
This is related to the way RedisGraph describes connections and applies filters.
I am currently updating an already existing SSIS package.
The current package pulls data from an Excel spreadsheet that is provided by our IT Department. It lists the machine names of computers and counts them for a license report.
I currently have the Job (derived column) strip off the M (Mobile) or D (Desktop) from the first part of the machine name so that it returns just the user name, which is what I need for the report.
MBRUBAKERBR => BRUBAKERBR
However, our IT Department just implemented Windows 7 and with it a new Naming convention.
Now there is a 76A, B, C or D that is added to the end of all of the updated machines. If the machine has not been updated then it stays with the older Naming Convention (seen Above).
There are also machines that have to stay on XP; their names have been updated to have X3A, B, C or D at the end.
MBRUBAKERBR76A or DBRUBAKERX3C
What I need is to remove the last part of the name so that I just get the user name out of it for reporting.
The issue is that I can't use a LEFT, RIGHT, LTRIM or RTRIM expression, as some of the computer names will only have the M or D in front (as they have not yet been upgraded).
What can I do to remove these characters without rebuilding this package?
UPDATE: I would really like to update the existing Expression that Removed the M and D.
Here is the Expression that I am using.
SUBSTRING(Name,2,50)
this is in a Derived Column in my SSIS Package.
As for Sample Data here is what it looks like coming in.
| Name |
| MBrubakerBR76A |
| MBROCKSKX3A |
| DGOLDBERGZA |
| MWILLIAMSEL |
| DEASTST76C |
| DCUSICKEVX3D |
This is what I want it to return.
| Name |
| BRUBAKERBR |
| BROCKSK |
| GOLDBERGZA |
| WILLIAMSEL |
| EASTST |
| CUSICKEV |
Let me know if you need any more information or examples.
First determine whether the machine has been upgraded. If it has, strip off the last three characters and the first letter; if it has not, strip off only the first letter. I avoided the trim functions to keep the code clear.
SELECT
machineName,
CASE WHEN RIGHT(machineName, 3) Like '%[0-9]%' THEN
SUBSTRING(machineName, 2, len(machineName) - 4)
ELSE
RIGHT(machineName, len(machineName)-1)
END AS UserName
From MachineList
SSIS Expression
Since pattern matching is not available in SSIS expressions, try this instead (note that SSIS expressions use == for equality):
(LEFT(RIGHT(machineName, 3), 2) == "X3" || LEFT(RIGHT(machineName, 3), 2) == "76") ? SUBSTRING(machineName, 2, LEN(machineName) - 4) : RIGHT(machineName, LEN(machineName) - 1)
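The rule is easy to sanity-check against the sample data outside SSIS. Below is a minimal Python sketch of the same logic; the function name user_name is made up, and it assumes (as the expression above does) that an upgraded name always ends in "76" or "X3" followed by exactly one letter.

```python
def user_name(machine_name):
    """Strip the leading M/D and, if present, the trailing 76x / X3x suffix."""
    marker = machine_name[-3:-1]  # e.g. "76" in "...76A", "X3" in "...X3C"
    if marker in ("76", "X3"):
        return machine_name[1:-3]
    return machine_name[1:]


samples = [
    "MBrubakerBR76A",
    "MBROCKSKX3A",
    "DGOLDBERGZA",
    "MWILLIAMSEL",
    "DEASTST76C",
    "DCUSICKEVX3D",
]
for name in samples:
    print(name, "->", user_name(name))
```

The sketch does not change letter case, so a mixed-case input such as MBrubakerBR76A keeps its casing. A user name that itself happens to end in "76" or "X3" plus one letter would be cut short, which is a limitation shared with the expression.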
Can data in Hive be transposed? As in, the rows become columns and columns are the rows? If there is no function straight up, is there a way to do it in a couple of steps?
I have a table like this:
| ID | Names | Proc1 | Proc2 | Proc3 |
| 1 | A1 | x | b | f |
| 2 | B1 | y | c | g |
| 3 | C1 | z | d | h |
| 4 | D1 | a | e | i |
I want it to be like this:
| A1 | B1 | C1 | D1 |
| x | y | z | a |
| b | c | d | e |
| f | g | h | i |
I have been looking up other related questions and they all mention using lateral views and explode, but is there a way to selectively choose columns for lateral(ly) view(ing) and explod(ing)?
Also, what might be the rough process to achieve what I would like to do? Please help me out. Thanks!
Edit: I have been reading this link: https://cwiki.apache.org/Hive/languagemanual-lateralview.html and it shows me half of what I want to achieve. The first example in the link is basically what I'd like except that I don't want the rows to repeat and want them as column names. Any ideas on how to get the data to a form such that if I do an explode, it would result in my desired output, or the other way, ie, explode first to lead to another step that would then lead to my desired output table. Thanks again!
I don't know of a way out of the box in hive to do this, sorry. You get close with explode etc. but I don't think it can get the job done.
Conceptually, I think it's hard to do a transpose without knowing in advance what the columns of the destination table are going to be. This is true for Hive in particular, because the metadata about how many columns there are, their types, their names, etc. lives in a database, the metastore. And it's true in general, because not knowing the columns beforehand would require holding the data in memory in some form (OK, sure, with spills), and users would need to be careful not to overflow memory (just like with dynamic partitioning in Hive).
In any case, long story short, if you know the columns of the destination table beforehand, life is good. There isn't a set command in hive per se, to the best of my knowledge, but you could use a bunch of if clauses and case statements (ugly I know, but that's how I have done the same in the past) in the select clause to transpose the data. Something along the lines of SQL - How to transpose?
Do let me know how it goes!
As Mark pointed out, there's no easy way to do this in Hive since PIVOT isn't present in Hive, and you may also run into issues when trying to use the case/when 'trick' because you have multiple value columns (proc1, proc2, proc3).
As for testing purposes, you may try a different approach:
select v, o1, o2, o3 from (
select k,
v,
LEAD(v,3) OVER() as o1,
LEAD(v,6) OVER() as o2,
LEAD(v,9) OVER() as o3
from (select transform(name,proc1,proc2,proc3) using 'python strm.py' AS (k, v)
from input_table) q1
) q2 where k = 'A1';
where strm.py:
import sys

# Emit one (name, value) row per proc column of each tab-separated input row.
for line in sys.stdin:
    line = line.strip()
    name, proc1, proc2, proc3 = line.split('\t')
    print('%s\t%s' % (name, proc1))
    print('%s\t%s' % (name, proc2))
    print('%s\t%s' % (name, proc3))
The trick here is to use a python script in the map phase which emits each column of a row as distinct rows. Then every third (since we have 3 proc columns) row will form the resulting row which we get by peeking forward (lead).
Although this query does the job, it has a drawback: as the input grows, you need to keep peeking at every next 3rd element in the query, which may lead to a performance hit. Still, you may evaluate it for testing purposes.
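To see why peeking ahead in steps of 3 lines the values up, here is a small pure-Python simulation of the emit-then-lead idea on the sample table. It is only a sketch: the function names are invented, and it assumes the emitted rows stay in the original table order, which a window with an empty OVER() is not guaranteed to preserve.

```python
def emit_rows(table):
    """Mimic strm.py: one (name, value) pair per proc column."""
    for name, *procs in table:
        for value in procs:
            yield name, value


def transpose_with_lead(table, first_name):
    """Keep the rows of the first name and peek ahead one step per other name."""
    pairs = list(emit_rows(table))
    step = len(table[0]) - 1  # number of proc columns (3 in the example)
    result = []
    for idx, (key, _value) in enumerate(pairs):
        if key != first_name:
            continue
        # Offsets 0, step, 2*step, ... play the role of v, LEAD(v,3), LEAD(v,6), ...
        row = [
            pairs[j][1] if j < len(pairs) else None  # None stands in for NULL
            for j in range(idx, idx + step * len(table), step)
        ]
        result.append(row)
    return result


table = [
    ("A1", "x", "b", "f"),
    ("B1", "y", "c", "g"),
    ("C1", "z", "d", "h"),
    ("D1", "a", "e", "i"),
]
transposed = transpose_with_lead(table, "A1")
for row in transposed:
    print(row)
```

Each additional input row needs one more lead offset, which is the growth problem mentioned above.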
If I got a grammar rule like
a: A (C|D|E)
I can create AST for the rule by attaching rewrite rules for each alternative(C, D, E) like this:
a: A (C -> ^(A C)
| D -> ^(A D)
| E -> ^(A E))
But, if I got another slightly different grammar rule like
a: (A|B) (C|D|E)
how do I create AST for every possible match? I first tried like this:
a: (A|B) (C|D|E) -> ^((A|B) (C|D|E))
but, it did not work.
Is there a simple way to solve this problem?
Thanks in advance. :)
You have two options:
1
a : (left=A | left=B) (right=C | right=D | right=E) -> ^($left $right)
;
or:
2
a : left right -> ^(left right)
;
left
: A
| B
;
right
: C
| D
| E
;
Personally, I prefer the 2nd option.