DFA for strings not containing 101101 as substring - finite-automata

Given the language L = { w | w ∈ {0,1}*, w does not contain the substring 101101 }, construct a DFA for it.
I understand that if I could draw the DFA for the set of all strings over {0,1}* in which 101101 occurs as a substring, then I could simply take the complement to get the required DFA.
Can anyone help me with the construction of a DFA for L?

I think this DFA will satisfy your requirement.

Your idea is right. Here are the steps written out:
First, make a DFA for the language of all strings containing 101101 as a substring. All such strings can begin and end with anything, so long as they have 101101 somewhere in between. In other words, a regular expression for this language is (0+1)*101101(0+1)*. A DFA looks like this:
| State            | on 0 | on 1 |
|------------------|------|------|
| (q0)  start      | q0   | q1   |
| (q1)             | q2   | q1   |
| (q2)             | q0   | q3   |
| (q3)             | q2   | q4   |
| (q4)             | q5   | q1   |
| (q5)             | q0   | q6   |
| [[q6]] accepting | q6   | q6   |

Here state qi means that the longest suffix of the input read so far that is a prefix of 101101 has length i. Note that on a mismatch the DFA does not always fall back to q0: it falls back to the longest prefix of 101101 that is still a suffix of what has been read (for example, from q3 on input 0 the last two symbols read are "10", so we go to q2).
State [[q6]] is the accepting state since that's where you end up if you see the necessary substring.
Second, we need to take the complement. For a DFA, this is easy: we change all accepting states to non-accepting, and vice-versa. So, our new DFA looks the same, but has (q6) non-accepting and [[q0]], [[q1]], [[q2]], [[q3]], [[q4]], [[q5]] accepting.
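If you want to sanity-check the construction, here is a minimal sketch in Python of the complemented machine (the transition table is the one above, with q6 made non-accepting and every other state accepting):

# States 0..6: state i means the longest suffix read so far that is a
# prefix of "101101" has length i. State 6 means "101101 has been seen";
# after complementation every state except 6 accepts.
DELTA = {
    0: {'0': 0, '1': 1},
    1: {'0': 2, '1': 1},
    2: {'0': 0, '1': 3},
    3: {'0': 2, '1': 4},
    4: {'0': 5, '1': 1},
    5: {'0': 0, '1': 6},
    6: {'0': 6, '1': 6},
}
ACCEPTING = {0, 1, 2, 3, 4, 5}   # complement of {6}

def accepts(word):
    state = 0
    for ch in word:
        state = DELTA[state][ch]
    return state in ACCEPTING

assert accepts("10110")           # no complete 101101 yet
assert not accepts("001011010")   # contains 101101 starting at the third symbol
assert accepts("") and accepts("101111")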

Here is the answer to your question. First draw the DFA that accepts exactly the strings containing 101101 as a substring. Then change the non-final states to final states and the final states to non-final states. That's it, the required DFA is ready. You will get a DFA that rejects every string containing 101101.

Related

How can I combine PostgreSQL's ArrayField ANY option with LIKE

I'm trying to filter a queryset on the first characters of an element in an ArrayField in postgresql.
Data
| id | registration_date | sbi_codes           |
|----|-------------------|---------------------|
| 1  | 2007-11-13        | {9002, 1002, 85621} |
| 2  | 2010-10-11        | {1002, 9022, 9033}  |
| 3  | 2019-02-02        | {9001, 8921}        |
| 4  | 2012-02-02        | {120}               |
I've tried the following (which obviously doesn't work), but I think it clearly indicates what I'm trying to achieve.
select count(*)
from administrations_administration
where '90' = left(any(sbi_codes),2)
or
select count(*)
from administrations_administration
where '90%' like any(sbi_codes)
So the sbi_codes can be for example 9002 or 9045, And I'm trying to filter all the records that contain an element that starts with 90.
expected result
| count | sbi_codes |
|-------|-----------|
| 3     | 90        |
Thanks!
The thing on the left-hand side of LIKE is the string, in which % is just a %. The thing on the right-hand side is the pattern, in which % is a wildcard. Using ANY doesn't change these semantics; the pattern still goes on the right.
To solve this, you could create your own operator which is like LIKE, but has its arguments reversed.
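A minimal sketch of that idea, run from Python with psycopg2 (the function name reverse_like, the operator name <~~ and the connection string are made up for illustration; the table and column names come from the question, and sbi_codes is cast to text[] in case it is stored as an integer array):

import psycopg2  # assumes a reachable PostgreSQL instance

conn = psycopg2.connect("dbname=mydb user=me")  # hypothetical connection string
with conn, conn.cursor() as cur:
    # A LIKE with reversed arguments: pattern on the left, value on the right.
    cur.execute("""
        CREATE FUNCTION reverse_like(pat text, val text) RETURNS boolean
            AS 'SELECT $2 LIKE $1' LANGUAGE sql IMMUTABLE;
        CREATE OPERATOR <~~ (PROCEDURE = reverse_like, LEFTARG = text, RIGHTARG = text);
    """)
    # ANY expands to '90%' <~~ element for each array element,
    # i.e. element LIKE '90%'.
    cur.execute("""
        SELECT count(*)
        FROM administrations_administration
        WHERE '90%' <~~ ANY (sbi_codes::text[])
    """)
    print(cur.fetchone()[0])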

keep the extra whitespaces in display of pandas dataframe in jupyter notebook

In a Jupyter notebook, runs of extra whitespace in a dataframe are collapsed when it is displayed. But sometimes that is not what I want, e.g.
import pandas as pd
df = pd.DataFrame({'A': ['a   b', 'c'], 'B': [1, 2]})
df
The result I get:
| | A | B |
|---|-----|---|
| 0 | a b | 1 |
| 1 | c | 2 |
But I want:
|   | A     | B |
|---|-------|---|
| 0 | a   b | 1 |
| 1 | c     | 2 |
Is it possible? Thanks
It's actually HTML: pandas dutifully writes all the spaces into the HTML markup (the front-end format used by Jupyter Notebook), but HTML, by default, collapses runs of adjacent whitespace into a single space. Use the style object to change this:
df.style.set_properties(**{'white-space': 'pre'})
You unfortunately can't change the default render style of a DataFrame yet. You can write a function to wrap that line:
def print_df(df):
    return df.style.set_properties(**{'white-space': 'pre'})
print_df(df)
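Putting it together with the frame from the question (a minimal sketch; the Styler only affects the HTML rendering in the notebook, not the plain-text repr):

import pandas as pd

df = pd.DataFrame({'A': ['a   b', 'c'], 'B': [1, 2]})

# The default HTML table collapses the run of spaces in column A;
# the property below tells the browser to preserve it.
styled = df.style.set_properties(**{'white-space': 'pre'})
styled  # as the last expression in a notebook cell, this renders with the spaces kept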

Can I have data tables in the Examples of Cucumber?

I am looking for something like the example below. Let me know if there is a way to do it.
Scenario Outline: Verify
Examples:
| name | rollno | marks                              |
| raj  | 110    | | science | maths | test      |    |
|      |        | | 95      | 20    | finaltest |    |
|      |        | | 100     | 20    | midterm   |    |
Nested data tables are not supported in examples.
Why not move the science, maths marks up a level instead?
Or you could use a delimited string of all the subjects and their marks in one column - Science$95#Maths$20... Split this in your step definition to create your objects, etc.
Or one column with a delimited string of subjects and another a delimited string with marks of the respective subjects. Makes things clearer than one string but you will need the splitting logic twice.
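A minimal sketch of that splitting logic in Python (step definitions are usually written in Java, Ruby or JavaScript, so treat this purely as an illustration of the parsing; the Science$95#Maths$20 format is the one suggested above):

def parse_marks(cell):
    """Turn 'Science$95#Maths$20' into {'Science': 95, 'Maths': 20}."""
    marks = {}
    for pair in cell.split('#'):
        subject, score = pair.split('$')
        marks[subject] = int(score)
    return marks

print(parse_marks("Science$95#Maths$20"))  # {'Science': 95, 'Maths': 20}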

Oracle SQL regex extraction

I have data as follows in a column
+----------------------+
| my_column |
+----------------------+
| test_PC_xyz_blah |
| test_PC_pqrs_bloh |
| test_Mobile_pqrs_bleh|
+----------------------+
How can I extract the following as columns?
+----------+-------+
| Platform | Value |
+----------+-------+
| PC | xyz |
| PC | pqrs |
| Mobile | pqrs |
+----------+-------+
I tried using REGEXP_SUBSTR
Default first pattern occurrence for platform:
select regexp_substr(my_column, 'test_(.*)_(.*)_(.*)') as platform from table
Getting second pattern occurrence for value:
select regexp_substr(my_column, 'test_(.*)_(.*)_(.*)', 1, 2) as value from table
This isn't working, however. Where am I going wrong?
For Non-empty tokens
select regexp_substr(my_column,'[^_]+',1,2) as platform
,regexp_substr(my_column,'[^_]+',1,3) as value
from my_table
;
For possibly empty tokens
select regexp_substr(my_column,'^.*?_(.*)?_.*?_.*$',1,1,'',1) as platform
,regexp_substr(my_column,'^.*?_.*?_(.*)?_.*$',1,1,'',1) as value
from my_table
;
+----------+-------+
| PLATFORM | VALUE |
+----------+-------+
| PC | xyz |
+----------+-------+
| PC | pqrs |
+----------+-------+
| Mobile | pqrs |
+----------+-------+
(.*) is greedy by nature and matches any character, including _, so test_(.*)_(.*)_(.*) matches your entire string, and REGEXP_SUBSTR called without a subexpression argument returns that whole match rather than an individual group. Asking for occurrence 2 then gives NULL, because the pattern matches the string only once. The trick is to match runs of characters excluding _. This can be done with the group ([^_]+): the negated character class [^_] matches any character except _. If you have a more specific pattern you can use it instead, such as [A-Za-z] or [[:alnum:]]. Once the string is sliced into substrings separated by _, just pick the 2nd and 3rd occurrences.
ex:
SELECT REGEXP_SUBSTR( my_column,'(([^_]+))',1,2) as platform, REGEXP_SUBSTR( my_column,'(([^_]+))',1,3) as value from table;
Note: AFAIK, apart from the subexpression argument of REGEXP_SUBSTR used in the answer above, there is no straightforward way in Oracle to extract matching groups. You can use REGEXP_REPLACE for this purpose, but it is unlike other programming languages, where you can extract just group 2 and group 3 directly. See this link for an example.
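The same token-slicing idea, illustrated with Python's re module purely to show the pattern (the sample values come from the question):

import re

samples = ["test_PC_xyz_blah", "test_PC_pqrs_bloh", "test_Mobile_pqrs_bleh"]

token = re.compile(r"[^_]+")               # one run of characters that are not '_'
for s in samples:
    parts = token.findall(s)               # e.g. ['test', 'PC', 'xyz', 'blah']
    platform, value = parts[1], parts[2]   # 2nd and 3rd tokens
    print(platform, value)                 # PC xyz / PC pqrs / Mobile pqrs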

Let pandas use 0-based row number as index when reading Excel files

I am trying to use pandas to process a series of XLS files. The code I am currently using looks like:
import pandas

with pandas.ExcelFile(data_file) as xls:
    data_frame = pandas.read_excel(xls, header=[0, 1], skiprows=2, index_col=None)
And the format of the XLS file looks like
+---------------------------------------------------------------------------+
| REPORT |
+---------------------------------------------------------------------------+
| Unit: 1000000 USD |
+---------------------------------------------------------------------------+
| | | | | Balance |
+ ID + Branch + Customer ID + Customer Name +--------------------------+
| | | | | Daily | Monthly | Yearly |
+--------+---------+-------------+---------------+-------+---------+--------+
| 111111 | Branch1 | 1 | Company A | 10 | 5 | 2 |
+--------+---------+-------------+---------------+-------+---------+--------+
| 222222 | Branch2 | 2 | Company B | 20 | 25 | 20 |
+--------+---------+-------------+---------------+-------+---------+--------+
| 111111 | Branch1 | 3 | Company C | 30 | 35 | 40 |
+--------+---------+-------------+---------------+-------+---------+--------+
Even though I explicitly gave index_col=None, pandas still takes the ID column as the index. I am wondering what the right way is to make the 0-based row numbers the index.
pandas currently doesn't support parsing MultiIndex columns without also parsing a row index. Related issue here - it probably could be supported, but this gets tricky to define in an unambiguous way.
It's a hack, but the easiest way to work around this right now is to add a blank column on the left side of the data, then read it in like this.
pd.read_excel('file.xlsx', header=[0,1], skiprows=2).reset_index(drop=True)
Edit:
If you can't / don't want to modify the files, a couple options are:
If the data has a known / common header, use pd.read_excel(..., skiprows=4, header=None) and assign the columns yourself, as suggested by #ayhan (a sketch of this is shown after these options).
If you need to parse the header, use pd.read_excel(..., skiprows=2, header=0), then munge the second level of labels into a MultiIndex. This will probably mess up dtypes, so you may also need to do some typecasting (pd.to_numeric) as well.
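A minimal sketch of the first option above (the file name report.xls is hypothetical, and the two-level column labels are written out by hand from the sample layout in the question):

import pandas as pd

# Two-level column labels matching the report layout.
cols = pd.MultiIndex.from_tuples([
    ("ID", ""), ("Branch", ""), ("Customer ID", ""), ("Customer Name", ""),
    ("Balance", "Daily"), ("Balance", "Monthly"), ("Balance", "Yearly"),
])

# Skip the report banner and both header rows; header=None keeps pandas from
# turning any column into an index, so the default 0-based RangeIndex is used.
df = pd.read_excel("report.xls", skiprows=4, header=None)
df.columns = cols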