I just took my midterm but couldn't answer this question.
Construct grammar for the following languages
L={(a* ba* ba*)*}
The outermost rule is Kleene closure, *. What's inside parentheses is a language unto itself. This suggests the following productions for our grammar:
S := e
S := SL
Here, e is the empty string and L is the start symbol for a grammar generating the language corresponding to the regular expression inside the parentheses.
The language L consists of any number of a's, followed by a b, followed by any number of a's, followed by a b, followed by any number of a's. We can first define "any number of a's":
A := e
A := Aa
And then the definition of L is straightforward:
L := AbAbA
The complete grammar is therefore:
S := e
S := SL
L := AbAbA
A := e
A := Aa
You need to create an automaton grammar that generates strings of digits in which no digit is repeated. The strings can be of any length from 1 to 10, because all ten digits can be used.
A single digit is also considered a sequence
I have a grammar for three digits, but for all 10 I can't make a grammar in any way:
S->1A|2B|3C
A->2D|3E|εD|εE
B->1D|3G|εD|εG
C->1E|2G|εE|εG
D->3|ε
E->2|ε
G->1|ε
Example:
allowed:
1234567890
1235678
09841
154
etc
not allowed:
1231
09877
11544
etc
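One possible way to scale the three-digit idea to all ten digits is to use one nonterminal per set of digits already used, with productions of the form N_U -> d N_(U+d) | d for every digit d not in U. The Python sketch below is only an illustration under that assumption; build_rules and accepts are hypothetical names, not part of the original post. It builds the rule table and uses it to check the examples above.

```python
from itertools import combinations


def build_rules(digits="0123456789"):
    """Map each set of already-used digits to its (digit, next_set) choices."""
    rules = {}
    for size in range(len(digits)):
        for used in combinations(digits, size):
            used_set = frozenset(used)
            rules[used_set] = [
                (d, used_set | {d}) for d in digits if d not in used_set
            ]
    return rules


def accepts(rules, text):
    """Follow the rules digit by digit; a string may stop after any digit."""
    state = frozenset()          # start symbol: no digit used yet
    for ch in text:
        choices = dict(rules.get(state, []))
        if ch not in choices:    # digit already used, or not a digit at all
            return False
        state = choices[ch]
    return len(text) > 0         # the empty string is not a sequence


rules = build_rules()
for sample in ["1234567890", "1235678", "09841", "154", "1231", "09877", "11544"]:
    print(sample, accepts(rules, sample))
```

With ten digits this table has 2^10 - 1 = 1023 nonterminals, which is why such a grammar is impractical to write out by hand even though it is easy to generate.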
I need to get the function, procedure, cursor names and other objects from a PL/SQL package body file (*.spb) in Notepad++, for example from this sql script:
create or replace PACKAGE BODY pac_emp3 AS
PROCEDURE p_buscar_salario_emp3 (p_employee_id IN employees.employee_id%TYPE,
p_employee_name OUT employees.first_name%type,
p_string IN OUT varchar2)
AS
v_salario employees.salary%TYPE;
BEGIN
SELECT salary, first_name INTO v_salario, p_employee_name FROM employees WHERE employees.employee_id = p_employee_id;
p_string := 'Procedimiento terminado';
DBMS_OUTPUT.PUT_LINE('Salario: '|| v_salario);
END p_buscar_salario_emp3;
FUNCTION f_foo RETURN NUMBER IS
SELECT 1+1 FROM DUAL;
RETURN 1;
END;
END pac_emp3;
In this case, I need to extract only:
PROCEDURE p_buscar_salario_emp3
or to reduce the text to just the object type and the object name:
PROCEDURE p_buscar_salario_emp3
FUNCTION f_foo
Same with FUNCTION names, etc.
I understand that it's possible with a regular expression, but which regex?
Ctrl+H
Find what: (?:\A|\G)(?:(?!(?:PROCEDURE|FUNCTION)).)*((?:PROCEDURE|FUNCTION)\s+\w+)(?:(?!(?:PROCEDURE|FUNCTION)).)*
Replace with: $1\n or $1\r\n
check Wrap around
check Regular expression
CHECK . matches newline
Replace all
Explanation:
(?:\A|\G) # non capture group, beginning of string or restart from last match position
(?: # start non capture group
(?! # start negative lookahead
(?:PROCEDURE|FUNCTION) # non capture group PROCEDURE or FUNCTION (you can add other keywords)
) # end lookahead
. # any character
)* # end group, may appear 0 or more times
( # start group 1
(?:PROCEDURE|FUNCTION) # non capture group PROCEDURE or FUNCTION (you can add other keywords)
\s+ # 1 or more spaces
\w+ # 1 or more word character
) # end group 1
(?: # start non capture group
(?! # start negative lookahead
(?:PROCEDURE|FUNCTION) # non capture group PROCEDURE or FUNCTION (you can add other keywords)
) # end lookahead
. # any character
)* # end group, may appear 0 or more times
Replacement:
$1 # content of group 1
\n # linefeed, use \r\n for windows line break
Result for given example:
PROCEDURE p_buscar_salario_emp3
FUNCTION f_foo
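Outside Notepad++, the same extraction can be sketched in Python. Python's re module has no \G anchor, so this sketch swaps the replace-everything-else approach for a direct search with findall; OBJECT_PATTERN and list_objects are illustrative names, and the sample is an abbreviated copy of the package body from the question.

```python
import re

# Keywords to list; extend the alternation with other object types as needed.
OBJECT_PATTERN = re.compile(r"\b(PROCEDURE|FUNCTION)\s+(\w+)")


def list_objects(source):
    """Return 'KEYWORD name' for each declaration found in the source text."""
    return [f"{kind} {name}" for kind, name in OBJECT_PATTERN.findall(source)]


sample = """create or replace PACKAGE BODY pac_emp3 AS
PROCEDURE p_buscar_salario_emp3 (p_employee_id IN employees.employee_id%TYPE,
p_string IN OUT varchar2)
AS
BEGIN
p_string := 'Procedimiento terminado';
END p_buscar_salario_emp3;
FUNCTION f_foo RETURN NUMBER IS
RETURN 1;
END;
END pac_emp3;"""

print("\n".join(list_objects(sample)))
```

Like the Notepad++ expression, this is purely textual: a keyword that appears inside a comment or a string literal would be reported as well.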
This regex should work for you.
((PROCEDURE|FUNCTION) \S+)
If you need to add more terms, type them in like so:
((PROCEDURE|FUNCTION|NEW_TERM) \S+)
I have a table with a structure like this...
the_geom data
geom1 data1+3000||data2+1000||data3+222
geom2 data1+500||data2+900||data3+22232
I want to create a function that returns the records by user request.
Example: for data2, retrieve geom1,1000 and geom2, 900
So far I have created this function (see below), which works quite well, but I am facing a parameter substitution problem (you can see I am not able to substitute 'data2' with $1 in the regex below, although I can use $1 later in the function):
regexp_matches(t::text, E'(data2[\+])([0-9]+)'::text)::text)[2]::integer
MY FUNCTION
create or replace function get_counts(taxa varchar(100))
returns setof record
as $$
SELECT t2.counter,t2.the_geom
FROM (
SELECT (regexp_matches(t.data::text, E'(data2[\+])([0-9]+)'::text)::text)[2]::integer as counter,the_geom
from (select the_geom,data from simple_inpn2 where data ~ $1::text) as t
) t2
$$
language sql;
SELECT get_counts('data2') will work **but we should be able to make this substitution**:
regexp_matches(t::text, E'($1... instead of E'(data2....
I think it's more of a syntax issue, as the function execution gives no error; it just interprets $1 as a string and gives no result.
thanks in advance,
An E'$1' is a string literal (using the escape string syntax) containing a dollar sign followed by a one. An unquoted $1 is the first parameter to your function. So this:
(regexp_matches(t, E'($1[\+])([0-9]+)'))[2]::integer
as you've found, won't interpolate the $1 with the function's first parameter.
The regex is just a string, a string with an internal structure but still just a string. If you know that $1 will be a normal word then you could say:
(regexp_matches(t, E'(' || $1 || E'[\+])([0-9]+)'))[2]::integer
to paste your strings together into a suitable regex. However, it is better to be a little paranoid: sooner or later someone is going to call your function with a string like 'ha ha (', so you should be prepared for it. The easiest way that I can think of to add an arbitrary string to a regex is to escape all the non-word characters:
-- Don't forget to escape the escaped escapes! Hence all the backslashes.
str := regexp_replace($1, E'(\\W)', E'\\\\\\1', 'g');
and then paste str into the regex as above:
(regexp_matches(t, E'(' || str || E'[\+])([0-9]+)'))[2]::integer
or better, build the regex outside the regexp_matches to cut down on the nested parentheses:
re := E'(' || str || E'[\+])([0-9]+)';
-- ...
select (regexp_matches(t, re))[2]::integer ...
PostgreSQL doesn't have Perl's \Q...\E and the (?q) metasyntax applies until the end of the regex so I can't think of any better way to paste an arbitrary string into the middle of a regex as a non-regex literal value than to escape everything and let PostgreSQL sort it out.
Using this technique, we can do things like:
=> do $$
declare
m text[];
s text;
r text;
begin
s = E'''{ha)?';
r = regexp_replace(s, E'(\\W)', E'\\\\\\1', 'g');
r = '(ha' || r || ')';
raise notice '%', r;
select regexp_matches(E'ha''{ha)?', r) into m;
raise notice '%', m[1];
end$$;
and get the expected
NOTICE: ha'{ha)?
output. But if you leave out the regexp_replace escaping step, you'll just get an
invalid regular expression: parentheses () not balanced
error.
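The same escape-then-concatenate idea can be tried outside the database. The Python sketch below is an analogue for experimentation, not PostgreSQL code; escape_non_word and extract_count are hypothetical names, and the input strings are taken from the examples above.

```python
import re


def escape_non_word(text):
    """Backslash-escape every non-word character, as the regexp_replace call does."""
    return re.sub(r"(\W)", r"\\\1", text)


def extract_count(data, key):
    """Return the integer that follows 'key+' in data, or None if key is absent."""
    pattern = "(" + escape_non_word(key) + r"[+])([0-9]+)"
    match = re.search(pattern, data)
    return int(match.group(2)) if match else None


row = "data1+3000||data2+1000||data3+222"
print(extract_count(row, "data2"))      # key from the question
print(extract_count(row, "ha ha ("))    # hostile key is treated as literal text

# Without the escaping step the pasted string breaks the pattern.
try:
    re.compile("(ha" + "'{ha)?" + ")")
except re.error as exc:
    print("invalid regular expression:", exc)
```

Python also ships re.escape for this purpose; the hand-written version is used here only because it mirrors the regexp_replace call step by step.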
As an aside, I don't think you need all that casting so I removed it. The regexes and escaping are noisy enough, there's no need to throw a bunch of colons into the mix. Also, I don't know what your standard_conforming_strings is set to or which version of PostgreSQL you're using so I've gone with E'' strings everywhere. You'll also want to switch your procedure to PL/pgSQL (language plpgsql) to make the escaping easier.
How can the negation meta-character, ~, be used in ANTLR's lexer- and parser rules?
Negating can occur inside lexer and parser rules.
Inside lexer rules you can negate characters, and inside parser rules you can negate tokens (lexer rules). But both lexer- and parser rules can only negate either single characters, or single tokens, respectively.
A couple of examples:
lexer rules
To match one or more characters except lowercase ascii letters, you can do:
NO_LOWERCASE : ~('a'..'z')+ ;
(the negation-meta-char, ~, has a higher precedence than the +, so the rule above equals (~('a'..'z'))+)
Note that 'a'..'z' matches a single character (and can therefore be negated), but the following rule is invalid:
ANY_EXCEPT_AB : ~('ab') ;
Because 'ab' (obviously) matches 2 characters, it cannot be negated. To match a token that consists of 2 characters, but not 'ab', you'd have to do the following:
ANY_EXCEPT_AB
: 'a' ~'b' // any two chars starting with 'a' followed by any other than 'b'
| ~'a' . // other than 'a' followed by any char
;
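To see that the two alternatives together cover every two-character string except 'ab', the rule can be mirrored in a few lines of Python. This is only an illustration of the logic, not ANTLR code, and matches_any_except_ab is a made-up name.

```python
from itertools import product


def matches_any_except_ab(text):
    """Mirror of: 'a' ~'b' | ~'a' .   (exactly two characters, but not "ab")."""
    if len(text) != 2:
        return False
    first, second = text
    a_then_not_b = first == "a" and second != "b"   # 'a' ~'b'
    not_a_then_any = first != "a"                   # ~'a' .
    return a_then_not_b or not_a_then_any


# Check every two-character string over a small alphabet.
for pair in ("".join(p) for p in product("abc", repeat=2)):
    print(pair, matches_any_except_ab(pair))
```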
parser rules
Inside parser rules, ~ negates a certain token, or more than one token. For example, you have the following tokens defined:
A : 'A';
B : 'B';
C : 'C';
D : 'D';
E : 'E';
If you now want to match any token except the A, you do:
p : ~A ;
And if you want to match any token except B and D, you can do:
p : ~(B | D) ;
However, if you want to match any two tokens other than A followed by B, you cannot do:
p : ~(A B) ;
Just as with lexer rules, you cannot negate more than a single token. To accomplish the above, you need to do:
p
: A ~B
| ~A .
;
Note that the . (DOT) char in parser rules does not match any character as it does inside lexer rules. Inside parser rules, it matches any token (A, B, C, D or E, in this case).
Note that you cannot negate parser rules. The following is illegal:
p : ~a ;
a : A ;
I know of two types of left recursion, immediate and indirect, and I don't think the following grammar falls into any of them, but is that the case?
And is this grammar an LL grammar? Why or why not?
E ::= T+E | T
T ::= F*T | F
F ::= id | (E)
I assume you start with E. Both of E’s alternatives start with a T. Both of T’s alternatives start with F. Both of F’s alternatives start with a terminal symbol. Thus, the grammar is not left recursive.
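The argument above follows the leftmost symbol of each alternative until a terminal is reached. A minimal Python sketch of that check, assuming the grammar has no empty productions (so only the first symbol of each alternative matters), could look like this; the dictionary encoding and the function names are illustrative.

```python
def leftmost_nonterminals(grammar):
    """For each nonterminal, the nonterminals that can start one of its alternatives."""
    return {
        head: {alt[0] for alt in alternatives if alt and alt[0] in grammar}
        for head, alternatives in grammar.items()
    }


def is_left_recursive(grammar):
    """True if some nonterminal can reach itself through leftmost symbols."""
    edges = leftmost_nonterminals(grammar)
    for start in grammar:
        seen, stack = set(), list(edges[start])
        while stack:
            symbol = stack.pop()
            if symbol == start:      # immediate or indirect left recursion
                return True
            if symbol not in seen:
                seen.add(symbol)
                stack.extend(edges[symbol])
    return False


GRAMMAR = {
    "E": [["T", "+", "E"], ["T"]],
    "T": [["F", "*", "T"], ["F"]],
    "F": [["id"], ["(", "E", ")"]],
}

print(is_left_recursive(GRAMMAR))
```

For comparison, a rule such as E ::= E+T | T would make the same check report left recursion, because E appears as the leftmost symbol of its own alternative.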