Pandas How to rename columns that don't have names, but they're indexed as 0, 1, 2, 3... etc - pandas

I don't know how to rename columns that are unnamed.
I have tried both approaches, putting the indices in quotes and not, like this, and neither worked:
train_dataset_with_pred_new_df.rename(columns={
0 : 'timestamp', 1 : 'open', 2 : 'close', 3 : 'high', 4 : 'low', 5 : 'volume', 6 : 'CCI7', 7 : 'DI+',
8 : 'DI-', 9 : 'ADX', 10 : 'MACD Main', 11 : 'MACD Signal', 12 : 'MACD histogram', 13 : 'Fisher Transform',
14 : 'Fisher Trigger'
})
And
train_dataset_with_pred_new_df.rename(columns={
'0' : 'timestamp', '1' : 'open', '2' : 'close', '3' : 'high', '4' : 'low', '5' : 'volume', '6' : 'CCI7', '8' : 'DI+',
'9' : 'DI-', '10' : 'ADX', '11' : 'MACD Main', '12' : 'MACD Signal', '13' : 'MACD histogram', '15' : 'Fisher Transform',
'16' : 'Fisher Trigger'
})
So if neither worked, how do I rename them?
Thank you for your help in advance :)

pandas.DataFrame.rename returns a new DataFrame when the parameter inplace is False (the default).
You need to reassign your dataframe:
train_dataset_with_pred_new_df = train_dataset_with_pred_new_df.rename(columns={
0 : 'timestamp', 1 : 'open', 2 : 'close', 3 : 'high', 4 : 'low', 5 : 'volume', 6 : 'CCI7', 7 : 'DI+',
8 : 'DI-', 9 : 'ADX', 10 : 'MACD Main', 11 : 'MACD Signal', 12 : 'MACD histogram', 13 : 'Fisher Transform',
14 : 'Fisher Trigger'})
Or simply use inplace=True:
train_dataset_with_pred_new_df.rename(columns={
0 : 'timestamp', 1 : 'open', 2 : 'close', 3 : 'high', 4 : 'low', 5 : 'volume', 6 : 'CCI7', 7 : 'DI+',
8 : 'DI-', 9 : 'ADX', 10 : 'MACD Main', 11 : 'MACD Signal', 12 : 'MACD histogram', 13 : 'Fisher Transform',
14 : 'Fisher Trigger'
}, inplace=True)

df.rename(columns={ df.columns[1]: "your value" }, inplace = True)

What you are trying to do is rename the index, not existing columns. So rename the index instead of the columns:
train_dataset_with_pred_new_df.rename(
index={ 0 : 'timestamp', 1 : 'open', 2 : 'close', 3 : 'high', 4 : 'low', 5 : 'volume', 6 : 'CCI7', 7 : 'DI+', 8 : 'DI-', 9 : 'ADX', 10 : 'MACD Main', 11 : 'MACD Signal', 12 : 'MACD histogram', 13 : 'Fisher Transform', 14 : 'Fisher Trigger'
}, inplace=True)

As it looks like you want to reassign all names, simply do:
df.columns = ['timestamp', 'open', 'close', 'high', 'low', 'volume',
'CCI7', 'DI+', 'DI-', 'ADX', 'MACD Main', 'MACD Signal',
'MACD histogram', 'Fisher Transform', 'Fisher Trigger']
Or, in a chain:
df.set_axis(['timestamp', 'open', 'close', 'high', 'low', 'volume',
'CCI7', 'DI+', 'DI-', 'ADX', 'MACD Main', 'MACD Signal',
'MACD histogram', 'Fisher Transform', 'Fisher Trigger'],
axis=1)
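As a quick sanity check, here is a minimal sketch (with made-up data and only three of the target names) showing why integer keys work on a default RangeIndex while string keys silently do nothing:

```python
import pandas as pd

# A DataFrame built without column names gets integer labels: 0, 1, 2
df = pd.DataFrame([[1627776000, 40000.0, 40100.0]])

# String keys don't match the integer labels, so nothing is renamed
unchanged = df.rename(columns={'0': 'timestamp', '1': 'open', '2': 'close'})

# Integer keys match; remember to reassign (or pass inplace=True)
df = df.rename(columns={0: 'timestamp', 1: 'open', 2: 'close'})

print(list(unchanged.columns))  # [0, 1, 2]
print(list(df.columns))         # ['timestamp', 'open', 'close']
```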

Related

How to solve a query with case conditions returning null values?

I don't usually post in this forum, but I came up with a problem I can't seem to fix, since I'm fairly new to programming.
I'm having a CASE problem with a query for a DataTable that needs to classify orders based on an order line and date conditions: for example, if the order took 3 or fewer days to complete it goes to column 1, if it took more than 5 days to column 2, and between 4 and 5 days to column 3.
I'm using AJAX for the DataTable.
This is how the code looks
Ajax code
$table = <<<EOT
(
SELECT ce.lineanegocio AS linea, COUNT(c.entity) AS qty,
DATE_FORMAT(c.date_creation,'%Y-%c') AS date_creation1,
DATE_FORMAT(pe.date_creation,'%Y-%c') AS date_creation2,
CASE WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) <= 3 AND ce.lineanegocio = '11' THEN 1
WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) <= 3 AND ce.lineanegocio = '15' THEN 2
WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) <= 3 AND ce.lineanegocio = '12' THEN 3
WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) <= 3 AND ce.lineanegocio = '20' THEN 4
ELSE 5 END AS date_diff1,
CASE WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) > 5 AND ce.lineanegocio = '11' THEN 6
WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) > 5 AND ce.lineanegocio = '15' THEN 7
WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) > 5 AND ce.lineanegocio = '12' THEN 8
WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) > 5 AND ce.lineanegocio = '20' THEN 9
ELSE 10 END AS date_diff2,
CASE WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) > 3 AND TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) <= 5 AND ce.lineanegocio = '11' THEN 11
WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) > 3 AND TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) <= 5 AND ce.lineanegocio = '15' THEN 12
WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) > 3 AND TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) <= 5 AND ce.lineanegocio = '12' THEN 13
WHEN TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) > 3 AND TIMESTAMPDIFF(DAY, c.date_creation, pe.date_creation) <= 5 AND ce.lineanegocio = '20' THEN 14
ELSE 15 END AS date_diff3
FROM llx_commande c
INNER JOIN llx_commande_extrafields ce ON c.rowid = ce.fk_object
INNER JOIN llx_pedidosestado_fechaestado pe ON c.rowid = pe.id_pedido
WHERE (c.fk_statut = '9' OR c.fk_statut = '11') AND (pe.estado = '11' OR pe.estado = '9')
GROUP BY linea
ORDER BY qty DESC
) temp
EOT;
Datatable code
<script type="text/javascript">
$(document).ready(function() {
var table = $('#lineapedidos').DataTable( {
"processing": true,
"stateSave": true,
dom: '',
"bInfo": false,
"serverSide": true,
"ajax": "getPedidos2.php",
"columns": [
{ "data": 0 },
{ "data": 1 },
{ "data": 2 },
{ "data": 3 },
{ "data": 4 },
],
columnDefs: [
{
targets: [2, 3, 4],
render: function ( data, type, row, meta ) {
return type === 'display' ?
data + '%' :
data;
}
}
]
} );
} );
</script>
How the table looks
So the problem is the NULL values that show up in the table; in "linea negocio 11" it should say something like 50%, 35%, 15%.
If I have to bring any other information about this code please let me know. Thanks in advance!
PS: I wrote the cases separately with a COALESCE, since I read somewhere that it won't return a NULL result, but it didn't do the trick.
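For what it's worth, the three-way bucketing the CASE expressions encode can be sketched in plain Python (the day counts and bucket names below are hypothetical, assuming the buckets are <= 3 days, > 5 days, and 4-5 days):

```python
def bucket(days_to_complete):
    """Classify an order by completion time, mirroring the three CASE expressions."""
    if days_to_complete <= 3:
        return 'fast'      # CASE ... <= 3
    elif days_to_complete > 5:
        return 'slow'      # CASE ... > 5
    else:
        return 'medium'    # CASE ... > 3 AND ... <= 5

orders = [2, 4, 6, 3, 5, 8]          # hypothetical TIMESTAMPDIFF(DAY, ...) values
counts = {}
for d in orders:
    counts[bucket(d)] = counts.get(bucket(d), 0) + 1

# Percentages per bucket, which is what the table is supposed to display
total = len(orders)
percentages = {k: round(100 * v / total) for k, v in counts.items()}
print(percentages)  # {'fast': 33, 'medium': 33, 'slow': 33}
```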

Select with subtotals using postgres sql

I've the following query:
select
json_build_object('id', i.id, 'task_id', i.task_id, 'time_spent', i.summary)
from
intervals I
where
extract(month from "created_at") = 10
and extract(year from "created_at") = 2021
group by
i.id, i.task_id
order by i.task_id
Which gives the following output:
json_build_object
{"id" : 53, "task_id" : 1, "time_spent" : "3373475"}
{"id" : 40, "task_id" : 1, "time_spent" : "3269108"}
{"id" : 60, "task_id" : 2, "time_spent" : "2904084"}
{"id" : 45, "task_id" : 4, "time_spent" : "1994341"}
{"id" : 38, "task_id" : 5, "time_spent" : "1933766"}
{"id" : 62, "task_id" : 5, "time_spent" : "2395378"}
{"id" : 44, "task_id" : 6, "time_spent" : "3304280"}
{"id" : 58, "task_id" : 6, "time_spent" : "3222501"}
{"id" : 48, "task_id" : 6, "time_spent" : "1990195"}
{"id" : 55, "task_id" : 7, "time_spent" : "1984300"}
How can I add subtotals of time_spent by each task?
I'd like to have an array structure of objects like this:
{
"total": 3968600,
"details": [
{"id" : 55, "task_id" : 7, "time_spent" : "1984300"},
{"id" : 55, "task_id" : 7, "time_spent" : "1984300"}
]
}
How can I achieve it? Thank you!
You may try the following modification, which groups your data by task_id and uses json_agg and json_build_object to produce your desired schema.
select
json_build_object(
'total', SUM(i.summary),
'details',json_agg(
json_build_object(
'id', i.id,
'task_id', i.task_id,
'time_spent', i.summary
)
)
) as result
from
intervals I
where
extract(month from "created_at") = 10
and extract(year from "created_at") = 2021
group by
i.task_id
order by i.task_id
See working demo fiddle online here
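To double-check the target shape, the same grouping can be sketched in plain Python with a few of the rows from the question (assuming time_spent should simply be summed per task_id):

```python
from itertools import groupby

rows = [
    {"id": 55, "task_id": 7, "time_spent": "1984300"},
    {"id": 44, "task_id": 6, "time_spent": "3304280"},
    {"id": 58, "task_id": 6, "time_spent": "3222501"},
]

result = []
# groupby needs its input sorted by the same key, like GROUP BY task_id
for task_id, group in groupby(sorted(rows, key=lambda r: r["task_id"]),
                              key=lambda r: r["task_id"]):
    details = list(group)
    result.append({
        "total": sum(int(r["time_spent"]) for r in details),  # SUM(i.summary)
        "details": details,                                    # json_agg(...)
    })

print(result[0]["total"])  # 3304280 + 3222501 = 6526781
```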

Mixed Integer Linear Optimization with Pyomo - Travelling salesman problem

I am trying to solve a travelling salesman problem with the Pyomo framework. However, I am stuck, as the solver reports that the model is infeasible.
import numpy as np
import pyomo.environ as pyo
from pyomo.environ import *
from pyomo.opt import SolverFactory
journey_distances = np.array([[0, 28, 34, 45, 36],
[28, 0, 45, 52, 64],
[34, 45, 0, 11, 34],
[45, 52, 11, 0, 34],
[36, 64, 34, 34, 0]])
# create variables - integers
num_locations = journey_distances.shape[0]
model = pyo.ConcreteModel()
model.journeys = pyo.Var(range(num_locations), range(num_locations), domain=pyo.Binary, bounds = (0,None))
journeys = model.journeys
# add A to B constraints
model.AtoB = pyo.ConstraintList()
model.BtoA = pyo.ConstraintList()
AtoB = model.AtoB
BtoA = model.BtoA
AtoB_sum = [sum([ journeys[i,j] for j in range(num_locations) if i!=j]) for i in range(num_locations)]
BtoA_sum = [sum([ journeys[i,j] for i in range(num_locations) if j!=i]) for j in range(num_locations)]
for journey_sum in range(num_locations):
    AtoB.add(AtoB_sum[journey_sum] == 1)
    if journey_sum < num_locations - 1:
        BtoA.add(BtoA_sum[journey_sum] == 1)
# add auxiliary variables to ensure that each successive journey ends and starts on the same town, e.g. A to B, then B to C
# u_j - u_i >= -(n - 1) + n*journeys_{ij} for i,j = 1...n, i!=j
model.successive_aux = pyo.Var(range(0,num_locations), domain = pyo.Integers, bounds = (0,num_locations-1))
model.successive_constr = pyo.ConstraintList()
successive_aux = model.successive_aux
successive_constr = model.successive_constr
successive_constr.add(successive_aux[0] == 1)
for i in range(num_locations):
    for j in range(num_locations):
        if i != j:
            successive_constr.add(successive_aux[j] - successive_aux[i] >= -(num_locations - 1) + num_locations*journeys[i,j])
obj_sum = sum([ sum([journey_distances [i,j]*journeys[i,j] for j in range(num_locations) if i!=j]) for i in range(num_locations)])
model.obj = pyo.Objective(expr = obj_sum, sense = minimize)
opt = SolverFactory('cplex')
results = opt.solve(model)
journey_res = np.array([model.journeys[journey].value for journey in journeys])
print(journey_res)
# results output is:
print(results)
Problem:
- Lower bound: -inf
Upper bound: inf
Number of objectives: 1
Number of constraints: 31
Number of variables: 26
Number of nonzeros: 98
Sense: unknown
Solver:
- Status: ok
User time: 0.02
Termination condition: infeasible
Termination message: MIP - Integer infeasible.
Error rc: 0
Time: 0.10198116302490234
# model.pprint()
7 Set Declarations
AtoB_index : Size=1, Index=None, Ordered=Insertion
Key : Dimen : Domain : Size : Members
None : 1 : Any : 5 : {1, 2, 3, 4, 5}
BtoA_index : Size=1, Index=None, Ordered=Insertion
Key : Dimen : Domain : Size : Members
None : 1 : Any : 4 : {1, 2, 3, 4}
journeys_index : Size=1, Index=None, Ordered=False
Key : Dimen : Domain : Size : Members
None : 2 : journeys_index_0*journeys_index_1 : 25 : {(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4)}
journeys_index_0 : Size=1, Index=None, Ordered=False
Key : Dimen : Domain : Size : Members
None : 1 : Any : 5 : {0, 1, 2, 3, 4}
journeys_index_1 : Size=1, Index=None, Ordered=False
Key : Dimen : Domain : Size : Members
None : 1 : Any : 5 : {0, 1, 2, 3, 4}
successive_aux_index : Size=1, Index=None, Ordered=False
Key : Dimen : Domain : Size : Members
None : 1 : Any : 5 : {0, 1, 2, 3, 4}
successive_constr_index : Size=1, Index=None, Ordered=Insertion
Key : Dimen : Domain : Size : Members
None : 1 : Any : 21 : {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21}
2 Var Declarations
journeys : Size=25, Index=journeys_index
Key : Lower : Value : Upper : Fixed : Stale : Domain
(0, 0) : 0 : None : 1 : False : True : Binary
(0, 1) : 0 : None : 1 : False : True : Binary
(0, 2) : 0 : None : 1 : False : True : Binary
(0, 3) : 0 : None : 1 : False : True : Binary
(0, 4) : 0 : None : 1 : False : True : Binary
(1, 0) : 0 : None : 1 : False : True : Binary
(1, 1) : 0 : None : 1 : False : True : Binary
(1, 2) : 0 : None : 1 : False : True : Binary
(1, 3) : 0 : None : 1 : False : True : Binary
(1, 4) : 0 : None : 1 : False : True : Binary
(2, 0) : 0 : None : 1 : False : True : Binary
(2, 1) : 0 : None : 1 : False : True : Binary
(2, 2) : 0 : None : 1 : False : True : Binary
(2, 3) : 0 : None : 1 : False : True : Binary
(2, 4) : 0 : None : 1 : False : True : Binary
(3, 0) : 0 : None : 1 : False : True : Binary
(3, 1) : 0 : None : 1 : False : True : Binary
(3, 2) : 0 : None : 1 : False : True : Binary
(3, 3) : 0 : None : 1 : False : True : Binary
(3, 4) : 0 : None : 1 : False : True : Binary
(4, 0) : 0 : None : 1 : False : True : Binary
(4, 1) : 0 : None : 1 : False : True : Binary
(4, 2) : 0 : None : 1 : False : True : Binary
(4, 3) : 0 : None : 1 : False : True : Binary
(4, 4) : 0 : None : 1 : False : True : Binary
successive_aux : Size=5, Index=successive_aux_index
Key : Lower : Value : Upper : Fixed : Stale : Domain
0 : 0 : None : 4 : False : True : Integers
1 : 0 : None : 4 : False : True : Integers
2 : 0 : None : 4 : False : True : Integers
3 : 0 : None : 4 : False : True : Integers
4 : 0 : None : 4 : False : True : Integers
1 Objective Declarations
obj : Size=1, Index=None, Active=True
Key : Active : Sense : Expression
None : True : minimize : 28*journeys[0,1] + 34*journeys[0,2] + 45*journeys[0,3] + 36*journeys[0,4] + 28*journeys[1,0] + 45*journeys[1,2] + 52*journeys[1,3] + 64*journeys[1,4] + 34*journeys[2,0] + 45*journeys[2,1] + 11*journeys[2,3] + 34*journeys[2,4] + 45*journeys[3,0] + 52*journeys[3,1] + 11*journeys[3,2] + 34*journeys[3,4] + 36*journeys[4,0] + 64*journeys[4,1] + 34*journeys[4,2] + 34*journeys[4,3]
3 Constraint Declarations
AtoB : Size=5, Index=AtoB_index, Active=True
Key : Lower : Body : Upper : Active
1 : 1.0 : journeys[0,1] + journeys[0,2] + journeys[0,3] + journeys[0,4] : 1.0 : True
2 : 1.0 : journeys[1,0] + journeys[1,2] + journeys[1,3] + journeys[1,4] : 1.0 : True
3 : 1.0 : journeys[2,0] + journeys[2,1] + journeys[2,3] + journeys[2,4] : 1.0 : True
4 : 1.0 : journeys[3,0] + journeys[3,1] + journeys[3,2] + journeys[3,4] : 1.0 : True
5 : 1.0 : journeys[4,0] + journeys[4,1] + journeys[4,2] + journeys[4,3] : 1.0 : True
BtoA : Size=4, Index=BtoA_index, Active=True
Key : Lower : Body : Upper : Active
1 : 1.0 : journeys[1,0] + journeys[2,0] + journeys[3,0] + journeys[4,0] : 1.0 : True
2 : 1.0 : journeys[0,1] + journeys[2,1] + journeys[3,1] + journeys[4,1] : 1.0 : True
3 : 1.0 : journeys[0,2] + journeys[1,2] + journeys[3,2] + journeys[4,2] : 1.0 : True
4 : 1.0 : journeys[0,3] + journeys[1,3] + journeys[2,3] + journeys[4,3] : 1.0 : True
successive_constr : Size=21, Index=successive_constr_index, Active=True
Key : Lower : Body : Upper : Active
1 : 1.0 : successive_aux[0] : 1.0 : True
2 : -Inf : -4 + 5*journeys[0,1] - (successive_aux[1] - successive_aux[0]) : 0.0 : True
3 : -Inf : -4 + 5*journeys[0,2] - (successive_aux[2] - successive_aux[0]) : 0.0 : True
4 : -Inf : -4 + 5*journeys[0,3] - (successive_aux[3] - successive_aux[0]) : 0.0 : True
5 : -Inf : -4 + 5*journeys[0,4] - (successive_aux[4] - successive_aux[0]) : 0.0 : True
6 : -Inf : -4 + 5*journeys[1,0] - (successive_aux[0] - successive_aux[1]) : 0.0 : True
7 : -Inf : -4 + 5*journeys[1,2] - (successive_aux[2] - successive_aux[1]) : 0.0 : True
8 : -Inf : -4 + 5*journeys[1,3] - (successive_aux[3] - successive_aux[1]) : 0.0 : True
9 : -Inf : -4 + 5*journeys[1,4] - (successive_aux[4] - successive_aux[1]) : 0.0 : True
10 : -Inf : -4 + 5*journeys[2,0] - (successive_aux[0] - successive_aux[2]) : 0.0 : True
11 : -Inf : -4 + 5*journeys[2,1] - (successive_aux[1] - successive_aux[2]) : 0.0 : True
12 : -Inf : -4 + 5*journeys[2,3] - (successive_aux[3] - successive_aux[2]) : 0.0 : True
13 : -Inf : -4 + 5*journeys[2,4] - (successive_aux[4] - successive_aux[2]) : 0.0 : True
14 : -Inf : -4 + 5*journeys[3,0] - (successive_aux[0] - successive_aux[3]) : 0.0 : True
15 : -Inf : -4 + 5*journeys[3,1] - (successive_aux[1] - successive_aux[3]) : 0.0 : True
16 : -Inf : -4 + 5*journeys[3,2] - (successive_aux[2] - successive_aux[3]) : 0.0 : True
17 : -Inf : -4 + 5*journeys[3,4] - (successive_aux[4] - successive_aux[3]) : 0.0 : True
18 : -Inf : -4 + 5*journeys[4,0] - (successive_aux[0] - successive_aux[4]) : 0.0 : True
19 : -Inf : -4 + 5*journeys[4,1] - (successive_aux[1] - successive_aux[4]) : 0.0 : True
20 : -Inf : -4 + 5*journeys[4,2] - (successive_aux[2] - successive_aux[4]) : 0.0 : True
21 : -Inf : -4 + 5*journeys[4,3] - (successive_aux[3] - successive_aux[4]) : 0.0 : True
13 Declarations: journeys_index_0 journeys_index_1 journeys_index journeys AtoB_index AtoB BtoA_index BtoA successive_aux_index successive_aux successive_constr_index successive_constr obj
If anyone can see what the problem is, and let me know, then that would be a great help.
I'm not overly familiar w/ coding TSP problems, and I'm not sure of all the details in your code, but this (below) is a problem. It seems you are coding successive_aux (call it sa for short) as a sequencing of integers. In this snippet (I chopped down to 3 points), if you think about the legal route of 0-1-2-0, sa_1 > sa_0 and sa_2 > sa_1, then it is infeasible to require sa_0 > sa_2. Also, your bounds on sa appear infeasible as well. In this example, sa_0 is 1, and the upper bound on sa is 2. Those are 2 "infeasibilities" in your formulation.
Key : Lower : Body : Upper : Active
1 : 1.0 : successive_aux[0] : 1.0 : True
2 : -Inf : -2 + 3*journeys[0,1] - (successive_aux[1] - successive_aux[0]) : 0.0 : True
3 : -Inf : -2 + 3*journeys[0,2] - (successive_aux[2] - successive_aux[0]) : 0.0 : True
4 : -Inf : -2 + 3*journeys[1,0] - (successive_aux[0] - successive_aux[1]) : 0.0 : True
5 : -Inf : -2 + 3*journeys[1,2] - (successive_aux[2] - successive_aux[1]) : 0.0 : True
6 : -Inf : -2 + 3*journeys[2,0] - (successive_aux[0] - successive_aux[2]) : 0.0 : True
7 : -Inf : -2 + 3*journeys[2,1] - (successive_aux[1] - successive_aux[2]) : 0.0 : True
I'm not an optimization expert, but it looks like you need to change the distances between the cities, since you're basically saying that the distance from city1 to city1 = 0, city2 to city2 = 0, etc. If you change these distances to a very large number (say 1000000), the optimizer will never pick going from city1 back to city1.
Hope this helps.
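To make the first answer's point concrete, here is a minimal sketch (pure Python, no solver) of the usual convention for this kind of ordering constraint: fix the ordering variable of the start town to 0 and apply the inequality only to pairs that exclude the start town, so the "return to start" edge no longer forces a contradiction. The tour and u-values below are made up for illustration:

```python
n = 5
tour = [0, 1, 2, 3, 4]  # hypothetical legal route 0-1-2-3-4-0
x = {(tour[k], tour[(k + 1) % n]): 1 for k in range(n)}  # chosen edges

# Ordering variables: u[0] fixed to 0, u[i] = position of town i in the tour
u = {town: pos for pos, town in enumerate(tour)}

# u[j] - u[i] >= -(n - 1) + n * x[i, j], applied only for i, j != 0
violations = [
    (i, j)
    for i in range(1, n)
    for j in range(1, n)
    if i != j and u[j] - u[i] < -(n - 1) + n * x.get((i, j), 0)
]
print(violations)  # [] -- the legal tour satisfies every inequality
```

Including town 0 in the pair loop would add the edge (4, 0), where u[0] - u[4] = -4 can never reach the required -(n - 1) + n = 1, which is exactly the infeasibility described above.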

Why am I getting a KeyError of '6' when creating another column based on a value from another column

Data can be found here https://github.com/dqc002/Learning/blob/main/Data%20gathered.csv
I'm trying to create another column based on a value from another column, and it's giving me an error of
KeyError Traceback (most recent call last)
<ipython-input-79-4bdbdff831dc> in <module>
14
15
---> 16 data['STATUS'] = data['ACT-STATUS'].apply(lambda x: ACT[x])
17 data
~\anacondafinal\envs\forcartopy\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
4136 else:
4137 values = self.astype(object)._values
-> 4138 mapped = lib.map_infer(values, f, convert=convert_dtype)
4139
4140 if len(mapped) and isinstance(mapped[0], Series):
pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-79-4bdbdff831dc> in <lambda>(x)
14
15
---> 16 data['STATUS'] = data['ACT-STATUS'].apply(lambda x: ACT[x])
17 data
KeyError: '6'
data=pd.read_csv('Data gathered1.csv')
data
ACT = {'0': 'No Activity',
'1A' : 'CONTAMINATION CONFIRMED',
'1B' : 'CONTAMINATION CONFIRMED',
'2A' :'INVESTIGATION',
'2B': 'INVESTIGATION',
'3':'CORRECTIVE ACTION PLANNING',
'4': 'IMPLEMENT ACTION',
'5': 'MONITOR ACTION',
'6A':'ACTION COMPLETED',
'6B':'ACTION COMPLETED',
'6C': 'INACTIVE',
'6D': 'INACTIVE'
}
data['STATUS'] = data['ACT-STATUS'].apply(lambda x: ACT[x])
data
You might want to use map here. Based on the dataset, here's the input/output via map:
ACT = {'0': 'No Activity',
'1A' : 'CONTAMINATION CONFIRMED',
'1B' : 'CONTAMINATION CONFIRMED',
'2A' :'INVESTIGATION',
'2B': 'INVESTIGATION',
'3':'CORRECTIVE ACTION PLANNING',
'4': 'IMPLEMENT ACTION',
'5': 'MONITOR ACTION',
'6A':'ACTION COMPLETED',
'6B':'ACTION COMPLETED',
'6C': 'INACTIVE',
'6D': 'INACTIVE'
}
df['ACT-STATUS 5.236'] = df['ACT-STATUS 5.236'].astype(str)
df['STATUS'] = df['ACT-STATUS 5.236'].map(ACT)
print(df[['ACT-STATUS 5.236','STATUS']])
ACT-STATUS 5.236 STATUS
0 0 No Activity
1 0 No Activity
2 2A INVESTIGATION
3 2A INVESTIGATION
4 6 NaN
5 1A CONTAMINATION CONFIRMED
6 6 NaN
7 6 NaN
8 6 NaN
9 nan NaN
First, the column headers of your Data gathered.csv have some spaces like FILE NAME and extra data like ACT-STATUS 5.236.
COUNTY,Division, FILE NAME,File Number,LOCATION,LATITUDE,LONGITUDE,CONTAMINANTS,DATE,ENF-STATUS,ACT-STATUS 5.236,Category
Then when you do data['ACT-STATUS'].apply(lambda x: ACT[x]), pandas treats each value of the ACT-STATUS column as x, then uses x as a key into the ACT dictionary. However, your ACT dictionary doesn't have the key '6', so it raises the error. You may want
import numpy as np
data['ACT-STATUS'].apply(lambda x: ACT.get(x, np.nan))
where ACT.get(x, np.nan) returns NaN if x is not found among the keys of ACT.
data['ACT-STATUS 5.236'].unique() gives the following result
['0' '2A' '6' '1A' '4' '5' '4,5A' '2B' '3' '6C']
4,5A is also an outlier value. You may need to come up with a workaround to handle it.
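A minimal sketch of the difference, using a hypothetical two-row column containing the unmapped value '6' and a trimmed-down ACT dictionary:

```python
import numpy as np
import pandas as pd

ACT = {'0': 'No Activity', '6A': 'ACTION COMPLETED'}
s = pd.Series(['0', '6'])  # '6' has no entry in ACT

# apply with ACT[x] would raise KeyError: '6'; dict.get substitutes a default
status = s.apply(lambda x: ACT.get(x, np.nan))

# Series.map does the same lookup and yields NaN for missing keys
status_via_map = s.map(ACT)

print(status.tolist())  # ['No Activity', nan]
```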

Pandas Df.head() does not display when called inside the method()?

I cannot see the output of pandas dataframe.head() or dataframe.describe() when the call is made inside a method.
import pandas as pd

def develop_df():
    studentData = {
        0 : {
            'name' : 'Aadi',
            'age' : 16,
            'city' : 'New york'
        },
        1 : {
            'name' : 'Jack',
            'age' : 34,
            'city' : 'Sydney'
        },
    }
    print("Now lets print student data")
    print(studentData)
    print("%" * 80)
    print("Create a df and then print head")
    st_df = pd.DataFrame(studentData)
    st_df.head()
    print("%" * 80)
develop_df()
Output:
Now lets print student data
{0: {'name': 'Aadi', 'age': 16, 'city': 'New york'}, 1: {'name': 'Jack', 'age': 34, 'city': 'Sydney'}}
Create a df and then print head
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
But, as seen when called outside the method, it works.
studentData = {
0 : {
'name' : 'Aadi',
'age' : 16,
'city' : 'New york'
},
1 : {
'name' : 'Jack',
'age' : 34,
'city' : 'Sydney'
},
}
print("Now lets print student data")
print(studentData)
print("%" * 80)
print("Create a df and then print head")
st_df = pd.DataFrame(studentData)
st_df.head()
Output:
Now lets print student data
{0: {'name': 'Aadi', 'age': 16, 'city': 'New york'}, 1: {'name': 'Jack', 'age': 34, 'city': 'Sydney'}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Create a df and then print head
0 1
age 16 34
city New york Sydney
name Aadi Jack
Any suggestion on resolving it?
To pretty-print within a function, first import the display_html function:
from IPython.display import display_html
Then wrap display_html around any calls to df.head() within a function definition, for example:
display_html(st_df.head())
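Outside IPython/Jupyter, a simpler alternative is to print the frame explicitly: head() only returns a DataFrame, and the implicit rich display happens just for the last expression of a notebook cell, not for expressions inside a function. A minimal sketch:

```python
import pandas as pd

def develop_df():
    studentData = {
        0: {'name': 'Aadi', 'age': 16, 'city': 'New york'},
        1: {'name': 'Jack', 'age': 34, 'city': 'Sydney'},
    }
    st_df = pd.DataFrame(studentData)
    print(st_df.head())  # print() forces output even inside a function
    return st_df

df = develop_df()
```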