Can you please help me convert this finite automaton to a regular grammar?
          a     b
____________________
--> q1 |  q2    q1
    q2 |  q1    q3
    q3 |  q4    q3
<-- q4 |  q3    q1
____________________
First we read the transition table as a finite automaton (FA).
From the FA we get:
Initial state - q1, final state - q4
Non-terminals (V) - {q1, q2, q3, q4} (non-terminals are usually denoted with capital letters)
Terminals (T) - {a, b}
Start symbol - q1
Using the FA, we write the production rules of the regular grammar. Each transition from state X to state Y on symbol x gives a rule X --> x Y, and if Y is a final state it also gives X --> x:
P: { q1 --> a q2
     q1 --> b q1
     q2 --> a q1
     q2 --> b q3
     q3 --> a q4
     q3 --> b q3
     q3 --> a
     q4 --> a q3
     q4 --> b q1 }
The given FA is therefore converted to a regular grammar.
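The conversion above can be sketched in a few lines of Python. This is only an illustration: the names DELTA, FINAL_STATES and fa_to_right_linear are made up for this example, and the sketch assumes the start state is not a final state (otherwise an extra empty-string rule would be needed).

```python
# Transition table from the question: state -> {symbol: next state}.
DELTA = {
    "q1": {"a": "q2", "b": "q1"},
    "q2": {"a": "q1", "b": "q3"},
    "q3": {"a": "q4", "b": "q3"},
    "q4": {"a": "q3", "b": "q1"},
}
FINAL_STATES = {"q4"}


def fa_to_right_linear(delta, final_states):
    """Return right-linear production rules for a DFA transition table."""
    rules = []
    for state, moves in delta.items():
        for symbol, target in moves.items():
            # Every transition X --x--> Y becomes X -> x Y.
            rules.append(f"{state} -> {symbol} {target}")
            # If Y is final, the word may end here, so also add X -> x.
            if target in final_states:
                rules.append(f"{state} -> {symbol}")
    return rules


rules = fa_to_right_linear(DELTA, FINAL_STATES)
for rule in rules:
    print(rule)
```

Running it prints nine rules, the same set as P above (the order of the two q3 rules on a may differ).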
I have a dataset like the one below, which I split into two groups using the conditions shown after it.
Employee No  Event date  Event Description  Quarter  Year
102          2021-10-12  First Hire         Q4       21
103          2021-11-02  First Hire         Q4       21
102          2022-01-01  Terminated         Q1       22
102          2021-12-12  Shift Change       Q4       21
101          2021-12-03  First Hire         Q4       21
103          2021-11-05  Terminated         Q4       21
101          2021-12-04  Terminated         Q4       21
105          2022-02-26  First Hire         Q1       22
106          2022-02-26  First Hire         Q1       22
102          2022-03-29  Second Hire        Q1       22
107          2021-05-04  First Hire         Q2       21
108          2022-04-04  First Hire         Q2       22
109          2022-03-03  Terminated         Q1       22
109          2021-12-29  First Hire         Q4       21
109          2022-04-01  Second Hire        Q2       22
109          2022-01-10  Shift Change       Q1       22
import pandas as pd

df = pd.DataFrame.from_dict(
    {
        'Employee No': [102,103,102,102,101,103,101,105,106,102,107,108,109,109,109,109],
        'Event date': ['2021-10-12', '2021-11-02', '2022-01-01', '2021-12-12','2021-12-03','2021-11-05','2021-12-04','2022-02-26','2022-02-26','2022-03-29','2021-05-04','2022-04-04','2022-03-03','2021-12-29','2022-04-01','2022-01-10'],
        'Event Description': ['First Hire', 'First Hire', 'Terminated', 'Shift Change','First Hire','Terminated','Terminated','First Hire','First Hire','Second Hire','First Hire','First Hire','Terminated','First Hire','Second Hire','Shift Change'],
        'Quarter': ['Q4', 'Q4', 'Q1', 'Q4','Q4','Q4','Q4','Q1','Q1','Q1','Q2','Q2','Q1','Q4','Q2','Q1'],
        'Year': ['21', '21', '22', '21','21','21','21','22','22','22','21','22','22','21','22','22']
    }
)
# First hired
cond1 = df["Event Description"].eq("First Hire")
# Terminated, restricted to employees who have a First Hire event
# (the event dates are not compared here)
cond2 = (
    df["Event Description"].eq("Terminated")
    & df["Employee No"].isin(df.loc[cond1, "Employee No"])
)
df[cond2]
I need to visualize, as a table (heatmap or pivot table), the distribution of terminations by quarter and year on the vertical axis against first hires by quarter and year on the horizontal axis.
The value in each cell should be the count of unique employees matching cond2, placed by the quarter and year of their cond1 (first hire) date.
How can I do this with groupby or crosstab in pandas, or with any other approach?
Thank you in advance.
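One possible approach, sketched under some assumptions: pair each termination with the same employee's first hire via a merge, then count unique employees per (termination period, hire period). The Period column and the _term/_hire suffixes are names invented for this sketch, the data is a subset of the rows above, and "terminated after first hire" is read as termination date on or after the hire date.

```python
import pandas as pd

# A subset of the question's rows, enough to show the shape of the result.
df = pd.DataFrame(
    {
        "Employee No": [102, 103, 102, 101, 103, 101, 107, 109, 109],
        "Event date": ["2021-10-12", "2021-11-02", "2022-01-01", "2021-12-03",
                       "2021-11-05", "2021-12-04", "2021-05-04", "2022-03-03",
                       "2021-12-29"],
        "Event Description": ["First Hire", "First Hire", "Terminated",
                              "First Hire", "Terminated", "Terminated",
                              "First Hire", "Terminated", "First Hire"],
        "Quarter": ["Q4", "Q4", "Q1", "Q4", "Q4", "Q4", "Q2", "Q1", "Q4"],
        "Year": ["21", "21", "22", "21", "21", "21", "21", "22", "21"],
    }
)
df["Event date"] = pd.to_datetime(df["Event date"])
df["Period"] = df["Year"] + df["Quarter"]  # e.g. "21Q4"

cols = ["Employee No", "Event date", "Period"]
hires = df.loc[df["Event Description"].eq("First Hire"), cols]
terms = df.loc[df["Event Description"].eq("Terminated"), cols]

# Pair each termination with the same employee's first hire,
# keeping only terminations dated on or after that hire.
pairs = terms.merge(hires, on="Employee No", suffixes=("_term", "_hire"))
pairs = pairs[pairs["Event date_term"] >= pairs["Event date_hire"]]

# Rows: termination period; columns: first-hire period; cells: unique employees.
table = (
    pairs.groupby(["Period_term", "Period_hire"])["Employee No"]
    .nunique()
    .unstack(fill_value=0)
)
print(table)
```

With this subset, every terminated employee was first hired in 21Q4, so the table has a single column. The resulting frame can be passed to a heatmap plotting function if a visual is preferred.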
I am currently querying a single table, but the issue is that my fiscal year does not follow calendar quarters. It goes:
Q1 01/30/2021 to 04/30/2021
Q2 05/01/2021 to 07/30/2021
Q3 07/31/2021 to 10/29/2021
Q4 10/30/2021 to 01/28/2022
Does anyone know how I could create a column that shows the fiscal quarter based on the dates above?
Outcome:
Order Number  Order Date  Amount  Fiscal Year
1             01/01/2021  5       FY21Q1
2             05/15/2021  5       FY21Q2
3             08/10/2021  10      FY21Q3
4             09/10/2020  8       FY20Q3
Thanks
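As a rough illustration of the lookup logic (not of any particular SQL dialect), the quarter boundaries can be kept in a small table and each order date matched against it. FISCAL_QUARTERS and fiscal_quarter are names made up for this sketch. The Q4 end date is taken as 01/28/2022, which is an assumption, since a quarter starting 10/30/2021 cannot end earlier in 2021. Dates outside the listed ranges return None because the question gives no boundaries for other fiscal years.

```python
from datetime import date

# (label, first day, last day) for each fiscal quarter listed in the question.
# Assumption: Q4 ends on 2022-01-28.
FISCAL_QUARTERS = [
    ("FY21Q1", date(2021, 1, 30), date(2021, 4, 30)),
    ("FY21Q2", date(2021, 5, 1), date(2021, 7, 30)),
    ("FY21Q3", date(2021, 7, 31), date(2021, 10, 29)),
    ("FY21Q4", date(2021, 10, 30), date(2022, 1, 28)),
]


def fiscal_quarter(d):
    """Return the fiscal label for a date, or None if no range covers it."""
    for label, start, end in FISCAL_QUARTERS:
        if start <= d <= end:
            return label
    return None


print(fiscal_quarter(date(2021, 5, 15)))  # order 2 in the example
print(fiscal_quarter(date(2021, 8, 10)))  # order 3 in the example
```

In SQL the same idea is usually expressed either as a CASE expression over the date ranges or as a join against a small calendar table holding these boundaries.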
I have a CTE that returns data like this. The counts and process_ids columns follow one of two formats:
client_id   day           counts   process_ids
--------------------------------------------------------------------------------------------
abc1        Feb-01-2021   3        C1,C2 | C3,C4,C5 | C6,C7
abc2        Feb-05-2021   2, 3     C10,C11,C12 | C13,C14 # C15,C16 | C17,C18
I want to get the output below from this CTE by splitting on counts and process_ids:
client_id day counts process_ids
--------------------------------------------------------
abc1 Feb-01-2021 3 C1
abc1 Feb-01-2021 3 C2
abc1 Feb-01-2021 3 C3
abc1 Feb-01-2021 3 C4
abc1 Feb-01-2021 3 C5
abc1 Feb-01-2021 3 C6
abc1 Feb-01-2021 3 C7
abc2 Feb-05-2021 2 C10
abc2 Feb-05-2021 2 C11
abc2 Feb-05-2021 2 C12
abc2 Feb-05-2021 2 C13
abc2 Feb-05-2021 2 C14
abc2 Feb-05-2021 3 C15
abc2 Feb-05-2021 3 C16
abc2 Feb-05-2021 3 C17
abc2 Feb-05-2021 3 C18
The idea is to split counts and process_ids according to whichever of the two formats below a row follows.
UseCase 1
The counts column holds a single number and the process_ids column uses only the | delimiter.
UseCase 2
The counts column holds two numbers separated by a , delimiter, and the process_ids column uses the # delimiter along with the pipe.
I am working with Amazon Redshift and I am not sure how to split the rows out as needed.
Is this possible?
This might look a bit hairy at first sight but has been built up from solid techniques and gives the desired result...
SQL
WITH seq_0_9 AS (
    SELECT 0 AS d
    UNION ALL SELECT 1 AS d
    UNION ALL SELECT 2 AS d
    UNION ALL SELECT 3 AS d
    UNION ALL SELECT 4 AS d
    UNION ALL SELECT 5 AS d
    UNION ALL SELECT 6 AS d
    UNION ALL SELECT 7 AS d
    UNION ALL SELECT 8 AS d
    UNION ALL SELECT 9 AS d
),
numbers AS (
    SELECT a.d + b.d * 10 + c.d * 100 + 1 AS n
    FROM seq_0_9 a, seq_0_9 b, seq_0_9 c
),
processed AS (
    SELECT client_id,
           day,
           REPLACE(counts, ' ', '') AS counts,
           REPLACE(REPLACE(process_ids, ' ', ''), '|', ',') AS process_ids
    FROM tbl
),
split_pids AS (
    SELECT client_id,
           day,
           counts,
           split_part(process_ids, '#', n) AS process_ids,
           n AS n1
    FROM processed
    CROSS JOIN numbers
    WHERE split_part(process_ids, '#', n) IS NOT NULL
      AND split_part(process_ids, '#', n) != ''
),
split_counts AS (
    SELECT client_id,
           day,
           split_part(counts, ',', n) AS counts,
           process_ids,
           n1,
           n AS n2
    FROM split_pids
    CROSS JOIN numbers
    WHERE split_part(counts, ',', n) IS NOT NULL
      AND split_part(counts, ',', n) != ''
),
matched_up AS (
    SELECT * FROM split_counts WHERE n1 = n2
)
SELECT client_id,
       day,
       counts,
       split_part(process_ids, ',', n) AS process_ids
FROM matched_up
CROSS JOIN numbers
WHERE split_part(process_ids, ',', n) IS NOT NULL
  AND split_part(process_ids, ',', n) != '';
Demo
Online rextester demo (using PostgreSQL but should be compatible with Redshift): https://rextester.com/FNA16497
Brief Explanation
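The first two CTEs generate a numbers table (1 to 1000 inclusive) by cross-joining the digits 0 to 9 with themselves three times. The numbers table is then cross-joined with split_part in several Common Table Expressions: first to split process_ids on #, then to split counts on , and keep only the pieces at matching positions (n1 = n2), and finally to split each remaining process_ids group on , so that everything is done in a single SQL statement.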
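For readers who find the SQL dense, the same matching logic can be sketched in plain Python. This only illustrates the steps, it is not Redshift code, and the function name explode_row is made up for the example.

```python
def explode_row(client_id, day, counts, process_ids):
    """Split one input row into one output row per process id."""
    # Positions in counts (split on ",") line up with groups in
    # process_ids (split on "#"); this is the n1 = n2 match in the SQL.
    count_parts = [c.strip() for c in counts.split(",")]
    pid_groups = process_ids.split("#")
    rows = []
    for count, group in zip(count_parts, pid_groups):
        # Inside a group, "|" and "," both just separate ids.
        for pid in group.replace("|", ",").split(","):
            pid = pid.strip()
            if pid:
                rows.append((client_id, day, count, pid))
    return rows


for row in explode_row("abc2", "Feb-05-2021", "2, 3",
                       "C10,C11,C12 | C13,C14 # C15,C16 | C17,C18"):
    print(row)
```

A row with a single count and no # (UseCase 1) falls out of the same code, since both splits then yield exactly one piece.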
I have built an example script, starting from this TSV
client_id day counts process_ids
abc1 Feb-01-2021 3 C1,C2 | C3,C4,C5 | C6,C7
abc2 Feb-05-2021 2,3 C10,C11,C12 | C13,C14 # C15,C16 | C17,C18
This is the pretty printed version
+-----------+-------------+--------+-------------------------------------------+
| client_id | day | counts | process_ids |
+-----------+-------------+--------+-------------------------------------------+
| abc1 | Feb-01-2021 | 3 | C1,C2 | C3,C4,C5 | C6,C7 |
| abc2 | Feb-05-2021 | 2,3 | C10,C11,C12 | C13,C14 # C15,C16 | C17,C18 |
+-----------+-------------+--------+-------------------------------------------+
I have written this Miller procedure
mlr --tsv clean-whitespace then put -S '
if ($process_ids=~"[|]" && $counts=~"^[0-9]$")
{$process_ids=gsub($process_ids," *[|] *",",")}
elif($process_ids=~"[#]")
{$process_ids=gsub(gsub($process_ids," *[|] *",",")," *# *","#");$counts=gsub($counts,",","#")}' then \
put '
asplits = splitnv($counts, "#");
bsplits = splitnv($process_ids, "#");
n = length(asplits);
for (int i = 1; i <= n; i += 1) {
outrec = $*;
outrec["counts"] = asplits[i];
outrec["process_ids"] = bsplits[i];
emit outrec;
}
' then \
uniq -a then \
filter -x -S '$counts=~"[#]"' then \
cat -n then \
nest --explode --values --across-records -f process_ids --nested-fs "," then \
cut -x -f n input.tsv
that gives you
client_id day counts process_ids
abc1 Feb-01-2021 3 C1
abc1 Feb-01-2021 3 C2
abc1 Feb-01-2021 3 C3
abc1 Feb-01-2021 3 C4
abc1 Feb-01-2021 3 C5
abc1 Feb-01-2021 3 C6
abc1 Feb-01-2021 3 C7
abc2 Feb-05-2021 2 C10
abc2 Feb-05-2021 2 C11
abc2 Feb-05-2021 2 C12
abc2 Feb-05-2021 2 C13
abc2 Feb-05-2021 2 C14
abc2 Feb-05-2021 3 C15
abc2 Feb-05-2021 3 C16
abc2 Feb-05-2021 3 C17
abc2 Feb-05-2021 3 C18
I have a dataset returned from an MS Access SQL query that looks like Table 1.
Table 1
Year Quarter P1 P2
2013 Q1 1 6
2013 Q2 2 9
2013 Q3 5 1
2013 Q4 6 4
2014 Q1 4 3
2014 Q2 8 2
2014 Q3 6 5
2014 Q4 2 4
2015 Q1 2 3
2015 Q2 1 1
I would like to transpose the data to look like Table 2.
Table 2
Year Quarter Value P1
2014 Q3 P1 6
2014 Q3 P2 5
2014 Q4 P1 2
2014 Q4 P2 4
2015 Q1 P1 2
2015 Q1 P2 3
2015 Q2 P1 1
2015 Q2 P2 1
I've been looking around the internet and understand that I need to use TRANSPOSE in the query, but I can't figure out how to use it, especially since I don't want to transpose the first two columns.
I think you can do what you want with union all:
select year, quarter, 'P1' as value, p1
from table1
union all
select year, quarter, 'P2' as value, p2
from table1;
You might want to add where clauses to get only the rows in your desired results.
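To see why union all produces the long layout, here is a small Python sketch of the same unpivot. The function name unpivot is made up for the example, and the rows are a subset of Table 1.

```python
# Two rows of Table 1 in the wide layout.
rows = [
    {"Year": 2014, "Quarter": "Q3", "P1": 6, "P2": 5},
    {"Year": 2014, "Quarter": "Q4", "P1": 2, "P2": 4},
]


def unpivot(rows, value_columns):
    """Emit one (year, quarter, column name, value) tuple per value column."""
    out = []
    # One pass over the rows per column, like one SELECT per UNION ALL branch.
    for col in value_columns:
        for r in rows:
            out.append((r["Year"], r["Quarter"], col, r[col]))
    return out


long_rows = sorted(unpivot(rows, ["P1", "P2"]))
for row in long_rows:
    print(row)
```

The sort at the end plays the role of an ORDER BY on year and quarter; without it the P1 rows come first, followed by the P2 rows.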
I am trying to return a list of products that only appear in one quarter. I have been fiddling around with my code for a while and tried a COUNT, but realised that this wouldn't work, as it counts the number of entries a product has rather than the number of quarters it appears in.
These are my three tables:
SALES FACT TABLE
TIME_KEY PRODUCT_KEY BRANCH_KEY LOCATION_KEY POUNDS_SOLD AVG_SALES UNITS_SOLD
----------------------------- ----------- ---------- ------------ ----------- ---------- ----------
22-DEC-13 08.31.18.442000000 2 B1 L19 21542.39 10821.2 100
21-DEC-10 21.19.37.182000000 3 B8 L5 65487 32793.5 100
13-SEP-13 06.36.03.720000000 7 B2 L15 78541.84 39470.92 400
24-JUN-13 12.21.45.186000000 1 B7 L13 94115 47167.5 220
18-SEP-07 12.58.06.873000000 8 B2 L2 54000 27250 500
11-FEB-11 18.06.08.475000000 8 B9 L6 11123 5636.5 150
28-SEP-13 15.06.20.153000000 6 B3 L16 45896.31 23008.16 120
22-DEC-08 19.34.48.490000000 5 B6 L3 87451.01 43875.51 300
23-JUL-13 20.08.51.173000000 6 B6 L14 69542 34971 400
20-DEC-13 22.47.24.962000000 9 B4 L17 21584.39 10872.2 160
21-DEC-06 19.11.50.472000000 5 B10 L1 10000 27250 500
13-MAR-13 14.13.58.555000000 1 B2 L11 62413 31256 99
06-MAR-13 18.15.40.365000000 4 B6 L10 94785 47542.5 300
20-DEC-13 23.35.12.683000000 2 B5 L18 52359.19 26289.6 220
15-MAR-13 19.11.58.459000000 4 B9 L12 66499.84 33299.92 100
19-DEC-11 13.17.34.443000000 9 B2 L7 51449 26049.5 650
14-FEB-12 10.20.20.787000000 10 B5 L8 66589 33394.5 200
19-DEC-09 10.09.41.844000000 3 B7 L4 99125 49687.5 250
22-MAR-12 19.36.24.790000000 10 B2 L9 62331.66 31765.83 1200
11-JAN-14 19.18.58.595000000 7 B8 L20 35214.85 17667.43 120
TIME DIMENSION TABLE
TIME_KEY DAY DAY_OF_WEEK MONTH QUARTER YEAR
----------------------------- ---------- ----------- --------- ------- ----------
13-MAR-13 14.13.58.555000000 13 WEDNESDAY MARCH Q1 2013
22-DEC-13 08.31.18.442000000 22 SUNDAY DECEMBER Q4 2013
21-DEC-10 21.19.37.182000000 21 TUESDAY DECEMBER Q4 2010
15-MAR-13 19.11.58.459000000 15 FRIDAY MARCH Q1 2013
21-DEC-06 19.11.50.472000000 21 THURSDAY DECEMBER Q4 2006
28-SEP-13 15.06.20.153000000 28 SATURDAY SEPTEMBER Q3 2013
11-JAN-14 19.18.58.595000000 11 SATURDAY JANUARY Q1 2014
11-FEB-11 18.06.08.475000000 11 FRIDAY FEBRUARY Q1 2011
20-DEC-13 22.47.24.962000000 20 FRIDAY DECEMBER Q4 2013
14-FEB-12 10.20.20.787000000 14 TUESDAY FEBRUARY Q1 2012
24-JUN-13 12.21.45.186000000 24 MONDAY JUNE Q2 2013
20-DEC-13 23.35.12.683000000 20 FRIDAY DECEMBER Q4 2013
19-DEC-09 10.09.41.844000000 19 SATURDAY DECEMBER Q4 2009
06-MAR-13 18.15.40.365000000 6 WEDNESDAY MARCH Q1 2013
22-DEC-08 19.34.48.490000000 22 MONDAY DECEMBER Q4 2008
23-JUL-13 20.08.51.173000000 23 TUESDAY JULY Q3 2013
13-SEP-13 06.36.03.720000000 13 FRIDAY SEPTEMBER Q3 2013
18-SEP-07 12.58.06.873000000 18 TUESDAY SEPTEMBER Q3 2007
19-DEC-11 13.17.34.443000000 19 MONDAY DECEMBER Q4 2011
22-MAR-12 19.36.24.790000000 22 THURSDAY MARCH Q1 2012
PRODUCT DIMENSION TABLE
PRODUCT_KEY PRODUCT_NAME BRAND TYPE SUPPLIER_TYPE
----------- ------------------------- -------------------- ---------- ----------------
1 SVF1521P2EB SONY LAPTOP WHOLESALER
2 15-A003SA COMPAQ LAPTOP WHOLESALER
3 15-N271SA HP LAPTOP RETAIL
4 15-N290SA HP LAPTOP RETAIL
5 E6400 DELL LAPTOP RETAIL
6 SVF1521C2EB SONY LAPTOP WHOLESALER
7 SVF1532K4EB SONY LAPTOP WHOLESALER
8 C50-A-1CK TOSHIBA LAPTOP WHOLESALER
9 NX.MF8EK.001 ACER LAPTOP RETAIL
10 NP915S3G-K01UK SAMSUNG LAPTOP RETAIL
This is the code that I am running:
SELECT DISTINCT product.product_name, product.brand, quarter, SUM (sales.units_sold), COUNT (quarter)
FROM sales
INNER JOIN product
ON product.product_key=sales.product_key
INNER JOIN time
ON sales.time_key=time.time_key
GROUP BY quarter, product.product_name, product.brand
ORDER BY brand;
Below is the result of running the query as it stands, which is obviously not what I want:
PRODUCT_NAME BRAND QUARTER SUM(SALES.UNITS_SOLD) COUNT(QUARTER)
------------------------- -------------------- ------- --------------------- --------------
NX.MF8EK.001 ACER Q4 810 2
15-A003SA COMPAQ Q4 320 2
E6400 DELL Q4 800 2
15-N271SA HP Q4 350 2
15-N290SA HP Q1 400 2
NP915S3G-K01UK SAMSUNG Q1 1400 2
SVF1521C2EB SONY Q3 520 2
SVF1521P2EB SONY Q1 99 1
SVF1521P2EB SONY Q2 220 1
SVF1532K4EB SONY Q1 120 1
SVF1532K4EB SONY Q3 400 1
C50-A-1CK TOSHIBA Q1 150 1
C50-A-1CK TOSHIBA Q3 500 1
I know it's probably simple for you guys, but I can sense that I am nearly there. I think I just have something the wrong way round and am not translating my intention into code.
The desired output would display products that have only been sold in one quarter. If they were sold in two quarters they would not be considered seasonal.
In your query count(quarter) is going to be the same as count(*). You need to remove quarter from the group by and do the comparison on the number of quarters in a having clause:
SELECT product.product_name, product.brand, MIN(quarter) as quarter, SUM(sales.units_sold)
FROM sales INNER JOIN
product
ON product.product_key = sales.product_key INNER JOIN
time
ON sales.time_key = time.time_key
GROUP BY product.product_name, product.brand
HAVING min(quarter) = max(quarter)
ORDER BY brand;
You could also use:
HAVING count(distinct quarter) = 1
However, count(distinct) is less efficient than most other aggregation functions.
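The logic behind both HAVING variants can be illustrated in Python: collect the set of quarters per product and keep the products whose set has exactly one member. The (product, quarter) pairs below are a subset taken from the tables in the question, and the variable names are made up for this sketch.

```python
from collections import defaultdict

# (product_name, quarter) for a few sales rows from the question.
sales = [
    ("NX.MF8EK.001", "Q4"), ("NX.MF8EK.001", "Q4"),
    ("15-N290SA", "Q1"), ("15-N290SA", "Q1"),
    ("SVF1521P2EB", "Q1"), ("SVF1521P2EB", "Q2"),
    ("C50-A-1CK", "Q1"), ("C50-A-1CK", "Q3"),
]

quarters_by_product = defaultdict(set)
for product, quarter in sales:
    quarters_by_product[product].add(quarter)

# Equivalent to HAVING count(distinct quarter) = 1; with a single
# distinct quarter, min(quarter) and max(quarter) are the same value.
seasonal = {
    product: min(quarters)
    for product, quarters in quarters_by_product.items()
    if len(quarters) == 1
}
print(seasonal)
```

Products with two sales in the same quarter are kept, while products sold in two different quarters are dropped, which is the distinction the original COUNT (quarter) could not make.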