How to print lines based on values in multiple columns - awk

I'd like to print every line in my file that satisfies the following:
print line if column 3 or column 4 or column 5 is less than 10
Example
Emma A 10 4 7
Sally A 4 4 7
Jack B 15 19 2
Jeff C 15 20 25
Mary A 15 20 25
Meg C 2 7 9
Output
Emma A 10 4 7
Sally A 4 4 7
Jack B 15 19 2
Meg C 2 7 9

It's pretty simple with awk:
awk '$3<10 || $4<10 || $5<10' file
The output:
Emma A 10 4 7
Sally A 4 4 7
Jack B 15 19 2
Meg C 2 7 9
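For comparison, the same filter can be sketched in plain Python. This is only an illustration of the logic: the function name `matching_lines` and its parameters are made up for this example, it assumes columns 3 to 5 always hold numbers, and it skips lines with fewer than five fields (which is not necessarily what awk would do with them).

```python
SAMPLE = """Emma A 10 4 7
Sally A 4 4 7
Jack B 15 19 2
Jeff C 15 20 25
Mary A 15 20 25
Meg C 2 7 9"""


def matching_lines(lines, threshold=10, columns=(3, 4, 5)):
    """Yield lines where any of the given 1-based columns is below threshold."""
    for line in lines:
        fields = line.split()  # split on runs of whitespace
        if len(fields) < max(columns):
            continue  # assumption: ignore lines that are too short
        if any(float(fields[c - 1]) < threshold for c in columns):
            yield line


selected = list(matching_lines(SAMPLE.splitlines()))
print("\n".join(selected))
```

The `any(...)` call plays the role of the chained `||` conditions in the awk one-liner.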

How can I add rows iteratively to a select result set in pl sql?

The work_order table has a wo_no column. When I query the work_order table, I want two additional columns (task_no, task_step_no) in the result set.
This should be repeated for every wo_no in the work_order table: task_no should go up to 5 and task_step_no should go up to 2000. (Please have a look at the attached image if the expected result set is not clear.)
Any idea how to get such a result set in PL/SQL?
One option is to use 2 row generators cross joined to your current table.
with
  work_order (wo_no) as
    (select 1 from dual union all
     select 2 from dual
    ),
  task (task_no) as
    (select level from dual connect by level <= 5),
  step (task_step_no) as
    (select level from dual connect by level <= 20)  --> you'd have 2000 here
select y.wo_no, t.task_no, s.task_step_no
from work_order y cross join task t cross join step s
order by 1, 2, 3;
Result:
WO_NO TASK_NO TASK_STEP_NO
---------- ---------- ------------
1 1 1
1 1 2
1 1 3
1 1 4
1 1 5
1 1 6
1 1 7
1 1 8
1 1 9
1 1 10
1 1 11
1 1 12
1 1 13
1 1 14
1 1 15
1 1 16
1 1 17
1 1 18
1 1 19
1 1 20
1 2 1
1 2 2
1 2 3
1 2 4
1 2 5
1 2 6
1 2 7
1 2 8
1 2 9
1 2 10
1 2 11
1 2 12
1 2 13
1 2 14
1 2 15
1 2 16
1 2 17
1 2 18
1 2 19
1 2 20
1 3 1
1 3 2
1 3 3
1 3 4
1 3 5
1 3 6
1 3 7
1 3 8
1 3 9
1 3 10
1 3 11
1 3 12
1 3 13
1 3 14
1 3 15
1 3 16
1 3 17
1 3 18
1 3 19
1 3 20
1 4 1
1 4 2
1 4 3
1 4 4
1 4 5
1 4 6
1 4 7
1 4 8
1 4 9
1 4 10
1 4 11
1 4 12
1 4 13
1 4 14
1 4 15
1 4 16
1 4 17
1 4 18
1 4 19
1 4 20
1 5 1
1 5 2
1 5 3
1 5 4
1 5 5
1 5 6
1 5 7
1 5 8
1 5 9
1 5 10
1 5 11
1 5 12
1 5 13
1 5 14
1 5 15
1 5 16
1 5 17
1 5 18
1 5 19
1 5 20
2 1 1
2 1 2
2 1 3
2 1 4
2 1 5
2 1 6
2 1 7
2 1 8
2 1 9
2 1 10
2 1 11
2 1 12
2 1 13
2 1 14
2 1 15
2 1 16
2 1 17
2 1 18
2 1 19
2 1 20
2 2 1
2 2 2
2 2 3
2 2 4
2 2 5
2 2 6
2 2 7
2 2 8
2 2 9
2 2 10
2 2 11
2 2 12
2 2 13
2 2 14
2 2 15
2 2 16
2 2 17
2 2 18
2 2 19
2 2 20
2 3 1
2 3 2
2 3 3
2 3 4
2 3 5
2 3 6
2 3 7
2 3 8
2 3 9
2 3 10
2 3 11
2 3 12
2 3 13
2 3 14
2 3 15
2 3 16
2 3 17
2 3 18
2 3 19
2 3 20
2 4 1
2 4 2
2 4 3
2 4 4
2 4 5
2 4 6
2 4 7
2 4 8
2 4 9
2 4 10
2 4 11
2 4 12
2 4 13
2 4 14
2 4 15
2 4 16
2 4 17
2 4 18
2 4 19
2 4 20
2 5 1
2 5 2
2 5 3
2 5 4
2 5 5
2 5 6
2 5 7
2 5 8
2 5 9
2 5 10
2 5 11
2 5 12
2 5 13
2 5 14
2 5 15
2 5 16
2 5 17
2 5 18
2 5 19
2 5 20
200 rows selected.
As you already have the work_order table, you'd just use it in the FROM clause (not as a CTE):
with
task (task_no) as
(select level from dual connect by level <= 5),
step (task_step_no) as
(select level from dual connect by level <= 20)
select y.wo_no, t.task_no, s.task_step_no
from work_order y cross join task t cross join step s
order by 1, 2, 3;
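To make the shape of that cross join concrete, here is a small Python sketch of the same idea. It is not Oracle code: the list `work_orders` is a hypothetical stand-in for the wo_no values in work_order, and the two ranges mimic the row generators (with 20 steps instead of 2000, as in the demo above).

```python
from itertools import product

work_orders = [1, 2]      # stand-in for the wo_no values in work_order
task_nos = range(1, 6)    # 1..5, like "connect by level <= 5"
step_nos = range(1, 21)   # 1..20 here; the real case would use 1..2000

# product() pairs every work order with every task and every step,
# which is what the two cross joins do.
rows = list(product(work_orders, task_nos, step_nos))

print(len(rows))   # 2 * 5 * 20 combinations
print(rows[:3])
```

Because the inputs are already sorted, the tuples come out in the same order as `order by 1, 2, 3`.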

Repeat rows based on second frame

I would like to ask for your support. I have tried many things, without success.
Suppose you have two different frames, a long frame (LF) with a high number of rows and a short frame (SF) with a low number of rows, for example:
SF=pd.DataFrame({"col1":[1,2,3],"col2":[4,5,6]})
LF=pd.DataFrame({"col_long":[1,2,3,4,5,6,7,8,9,10,11]})
I need to loop through the values of specific columns from the short frame, let's say "col1" and "col2", and combine them with the long frame along axis 1. I have a solution which works, like this:
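import pandas as pd

EMPTY_FRAME=pd.DataFrame()
SF=pd.DataFrame({"col1":[1,2,3],"col2":[4,5,6]})
LF=pd.DataFrame({"col_long":[1,2,3,4,5,6,7,8,9,10,11]})
for i in range(len(SF.index)):
    # copy the i-th SF row into every row of LF
    LF["col1"]=SF["col1"].values[i]
    LF["col2"]=SF["col2"].values[i]
    # stack this block under the previous ones (DataFrame.append was removed in pandas 2.0)
    EMPTY_FRAME=pd.concat([EMPTY_FRAME, LF])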
Result (EMPTY_FRAME):
   col_long col1 col2
0 1 1 4
1 2 1 4
2 3 1 4
3 4 1 4
4 5 1 4
5 6 1 4
6 7 1 4
7 8 1 4
8 9 1 4
9 10 1 4
10 11 1 4
0 1 2 5
1 2 2 5
2 3 2 5
3 4 2 5
4 5 2 5
5 6 2 5
6 7 2 5
7 8 2 5
8 9 2 5
9 10 2 5
10 11 2 5
0 1 3 6
1 2 3 6
2 3 3 6
3 4 3 6
4 5 3 6
5 6 3 6
6 7 3 6
7 8 3 6
8 9 3 6
9 10 3 6
10 11 3 6
This works, but it gets pretty confusing because SF has many columns and I might forget some of them. So the question: is there a better and shorter way to get the same result?
I would be really grateful for any ideas on how to improve my code.
You can do a cross join and then reindex to retain the column order:
out = (SF.assign(k=1).merge(LF.assign(k=1),on='k').drop(columns='k')
         .reindex(columns=LF.columns.union(SF.columns,sort=False)))
out.index = out['col_long'].factorize()[0] #if required
print(out)
col_long col1 col2
0 1 1 4
1 2 1 4
2 3 1 4
3 4 1 4
4 5 1 4
5 6 1 4
6 7 1 4
7 8 1 4
8 9 1 4
9 10 1 4
10 11 1 4
0 1 2 5
1 2 2 5
2 3 2 5
3 4 2 5
4 5 2 5
5 6 2 5
6 7 2 5
7 8 2 5
8 9 2 5
9 10 2 5
10 11 2 5
0 1 3 6
1 2 3 6
2 3 3 6
3 4 3 6
4 5 3 6
5 6 3 6
6 7 3 6
7 8 3 6
8 9 3 6
9 10 3 6
10 11 3 6
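The reason a cross join removes the risk of forgetting a column can be seen in a plain-Python sketch of the same operation. This is an illustration only, not pandas code; `short_rows`, `long_rows` and `combined` are hypothetical names, with each row held as a dict.

```python
short_rows = [
    {"col1": 1, "col2": 4},
    {"col1": 2, "col2": 5},
    {"col1": 3, "col2": 6},
]
long_rows = [{"col_long": v} for v in range(1, 12)]

# Cross join: every short row is paired with every long row.
# Merging the dicts carries over all columns automatically, and putting
# long_row first keeps col_long as the leading column.
combined = [
    {**long_row, **short_row}
    for short_row in short_rows
    for long_row in long_rows
]

print(len(combined))
print(combined[0])
```

No column is named explicitly in the merge step, so adding more columns to the short rows would need no further changes.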

Pandas: How can I assign a group number according to a specific value?

DataFrame
pd.DataFrame({'a': range(20)})
>>
a
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
11 11
12 12
13 13
14 14
15 15
16 16
17 17
18 18
19 19
Expected result:
a group_num
0 0 1
1 1 1
2 2 2
3 3 2
4 4 3
5 5 3
6 6 4
7 7 4
8 8 5
9 9 5
10 10 6
11 11 6
12 12 7
13 13 7
14 14 8
15 15 8
16 16 9
17 17 9
18 18 10
19 19 10
What I want to do is assign a group number, from 1 to 10, to each row according to its value.
The idea is to sort these values, split them into 10 groups, and assign the numbers 1 to 10 to the groups.
But I have no idea how to implement this in Pandas.
I need your help.
I believe you need qcut for evenly sized bins:
df['b'] = pd.qcut(df['a'], 10, labels=range(1, 11))
print (df)
a b
0 0 1
1 1 1
2 2 2
3 3 2
4 4 3
5 5 3
6 6 4
7 7 4
8 8 5
9 9 5
10 10 6
11 11 6
12 12 7
13 13 7
14 14 8
15 15 8
16 16 9
17 17 9
18 18 10
19 19 10
And if you wanted to create groups of 2 you can use this:
df['b'] = df['a'].floordiv(2)+1
You can also use //:
df['G']=df.a//2+1
df
Out[609]:
a G
0 0 1
1 1 1
2 2 2
3 3 2
4 4 3
5 5 3
6 6 4
7 7 4
8 8 5
9 9 5
10 10 6
11 11 6
12 12 7
13 13 7
14 14 8
15 15 8
16 16 9
17 17 9
18 18 10
19 19 10
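The two ideas in these answers (fixed-size groups via floor division, and equally sized groups via sorting) can be sketched without pandas. The helper `equal_sized_groups` below is a hypothetical rank-based illustration; it is not how `pd.qcut` is implemented, and it assumes the number of values divides reasonably evenly into the groups.

```python
values = list(range(20))

# Groups of 2 consecutive values: 0,1 -> 1; 2,3 -> 2; and so on.
pair_groups = [v // 2 + 1 for v in values]


def equal_sized_groups(data, n_groups):
    """Assign 1..n_groups by rank so each group gets about the same count."""
    order = sorted(range(len(data)), key=lambda i: data[i])
    groups = [0] * len(data)
    for rank, idx in enumerate(order):
        groups[idx] = rank * n_groups // len(data) + 1
    return groups


print(pair_groups)
print(equal_sized_groups(values, 10))
```

For the sample data both approaches give the same numbering; they differ once the values are unsorted or not consecutive, where only the rank-based version still yields equally sized groups.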

Display the rows based on the last occurrence of an element in a column in a Pandas dataframe

I want to display only the last occurrence of each row, where rows are matched on the values in the name and qty columns. I would like to drop the rows which do not match this criterion.
before:
name qty price
0 Adam 10 1
1 Rose 11 9
2 Jack 10 12
3 Jack 5 11
4 Rose 15 4
5 Jack 12 17
6 Adam 10 8
7 Rose 11 4
8 Jack 6 23
9 Jack 12 9
10 Jack 10 4
after:
name qty price
0 Jack 5 11
1 Rose 15 4
2 Adam 10 8
3 Rose 11 4
4 Jack 6 23
5 Jack 12 9
6 Jack 10 4
I believe you are trying to get the last occurrence of each name, qty grouping.
df.groupby(['name', 'qty'], as_index=False).last()
name qty price
0 Adam 10 8
1 Jack 5 11
2 Jack 6 23
3 Jack 10 4
4 Jack 12 9
5 Rose 11 4
6 Rose 15 4
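Note that the groupby result is sorted by name and qty, while the "after" table in the question keeps the original row order. A plain-Python sketch of "keep the last occurrence, preserve order" is shown below; `keep_last` is a hypothetical helper and the rows are the sample data from the question as (name, qty, price) tuples.

```python
rows = [
    ("Adam", 10, 1), ("Rose", 11, 9), ("Jack", 10, 12), ("Jack", 5, 11),
    ("Rose", 15, 4), ("Jack", 12, 17), ("Adam", 10, 8), ("Rose", 11, 4),
    ("Jack", 6, 23), ("Jack", 12, 9), ("Jack", 10, 4),
]


def keep_last(records):
    """Keep only the last row for each (name, qty) pair, in original order."""
    last_index = {}
    for i, (name, qty, _price) in enumerate(records):
        last_index[(name, qty)] = i  # later rows overwrite earlier ones
    return [records[i] for i in sorted(last_index.values())]


for row in keep_last(rows):
    print(row)
```

In pandas, `df.drop_duplicates(subset=['name', 'qty'], keep='last')` should behave the same way if the original order matters.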

SQL Existing Column Conditional Update Query

I have this data
AnsID QuesID AnsOrder
-----------------------
1 5 NULL
2 5 NULL
3 5 NULL
4 5 NULL
5 5 NULL
6 3 NULL
7 3 NULL
8 3 NULL
9 3 NULL
10 3 NULL
11 4 NULL
12 4 NULL
13 4 NULL
14 4 NULL
15 4 NULL
16 7 NULL
17 9 NULL
18 9 NULL
19 9 NULL
20 9 NULL
21 8 NULL
22 8 NULL
23 8 NULL
24 8 NULL
Want to UPDATE it into this format
AnsID QuesID AnsOrder
-----------------------
1 5 1
2 5 2
3 5 3
4 5 4
5 5 5
6 3 1
7 3 2
8 3 3
9 3 4
10 3 5
11 4 1
12 4 2
13 4 3
14 4 4
15 4 5
16 7 1
17 9 1
18 9 2
19 9 3
20 9 4
21 8 1
22 8 2
23 8 3
24 8 4
Basically, I want to fill the AnsOrder column with a counter that restarts at 1 for each QuesID value and increases in ascending AnsID order, as shown in the table above.
You can generate row numbers per QuesID with row_number() and assign them to AnsOrder like this:
; with ord as (
select *,
row_number() over (partition by quesID
order by AnsID) rn
from table1
)
update ord
set ansorder = rn
I've ordered by AnsID for consistency.
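What `row_number() over (partition by ... order by ...)` computes can be sketched in Python as a counter per group. This is only an illustration of the numbering logic, not a replacement for the UPDATE; `number_within_group` is a hypothetical helper and `answers` rebuilds the (AnsID, QuesID) pairs from the question.

```python
from collections import defaultdict

# (AnsID, QuesID) pairs from the sample data
ques_ids = [5] * 5 + [3] * 5 + [4] * 5 + [7] + [9] * 4 + [8] * 4
answers = list(zip(range(1, 25), ques_ids))


def number_within_group(pairs):
    """Return (AnsID, QuesID, AnsOrder), numbering rows per QuesID by AnsID."""
    counters = defaultdict(int)
    result = []
    for ans_id, ques_id in sorted(pairs):  # order by AnsID
        counters[ques_id] += 1             # partition by QuesID
        result.append((ans_id, ques_id, counters[ques_id]))
    return result


numbered = number_within_group(answers)
for row in numbered:
    print(row)
```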