Pandas dataframe [] issue - pandas

I am new to Python. Could someone help me understand why the first code statement below runs without error, but the second one raises a KeyError (I do not even know what a KeyError is)?
P.S.: data is a DataFrame; 'StrategyCumulativePct' and 'BuyHold' are two columns in that DataFrame.
data[['StrategyCumulativePct', 'BuyHold']].plot()
data['StrategyCumulativePct', 'BuyHold'].plot()
On the other hand, may I ask why, when I have only written 20 lines of code, the traceback sometimes points an arrow at lines 2000 / 3000.... in files I never created? Thanks.
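For the bracket question, the difference can be reproduced on a tiny frame; the values below are placeholders and only the column names match the question. (As for the arrows at lines 2000 / 3000: the traceback also shows frames from pandas' own source files, which run on your behalf, not only the lines you wrote.)

import pandas as pd

# Placeholder data; only the column names match the question.
data = pd.DataFrame({'StrategyCumulativePct': [0.00, 0.01, 0.03],
                     'BuyHold': [0.00, 0.02, 0.01]})

# Double brackets pass a *list* of labels, so pandas returns a two-column
# sub-DataFrame, which .plot() can draw.
subset = data[['StrategyCumulativePct', 'BuyHold']]

# Single brackets pass the tuple ('StrategyCumulativePct', 'BuyHold') as ONE
# column label. No column has that tuple as its name (that would require a
# MultiIndex), so pandas raises KeyError.
# data['StrategyCumulativePct', 'BuyHold']      # KeyError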

Related

length of 'dimnames' [1] not equal to array extent

I am new to using R and actually to most programming languages, so I am a bit lost here. Hope you can help. I am using RCMap, for which I have 4 csv documents, and I get the following error:
Error in dimnames(x) <- dn :
length of 'dimnames' [1] not equal to array extent
I am sure it has something to do with my own data, because I get normal output if I use other people's data. However, I don't know where the problem is (not even in which of the four documents). I do have a lot of missing data, but changing the missing data to either blank spaces or NA does not change the error.
The documents of other people that I am able to run also contain missing data, although to a lesser extent.
Hope you can help,
best wishes, Doriene
I had a similar problem and it helped when I put a space in front of c__bacilli.
Ex: test <- subset_taxa(phylo, Class==" c__Bacilli")

why pandas df.drop() doesn't drop all indexes unless inplace used

I started writing this question in another form, but found a solution in the meantime. I have a dataframe shaped 55k x 4. For a couple of hours I couldn't understand why I can't drop the rows I need to drop. I had something like this:
print(df.shape)
indexes_to_drop = list()
for row in df.itertuples(index=True):
    if some_complex_function(row[1]):
        indexes_to_drop.append(row[0])
print(len(indexes_to_drop))
df = df.drop(index=indexes_to_drop)
print(df.shape)
My output was like:
55000 x 4
2500
52500 x 4
However, once I displayed some rows from my df, I was still able to find rows I thought were deleted. Of course, my first thought was to check some_complex_function. But I logged everything it did, and it was just fine.
So, I tried couple other ways of deleting rows using index, for example:
df = df.drop(df.index[ignore_indexes])
Still, the shape was fine, but not the rows.
Then I tried with iterrows() instead of itertuples. Same thing.
I thought maybe there was something wrong with indexing. You know: index number vs index label. I tested my code on a small dataframe and everything worked like a charm.
Then I realized I do some stuff with my df before I run the above code. So I reset the index first, like this:
df.reset_index(inplace=True, drop=True)
The indexes changed and started counting from 0, but my results were still wrong.
Then I tried this: df.drop(index=indexes_to_drop, inplace=True)
And BOOM it worked.
Right now I'm not looking for the solution, as I apparently found one. I'd like to know WHY dropping rows without "inplace" didn't work. I don't get that.
Cheers!
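As a general illustration of what can produce this impression (a sketch of the drop semantics, not necessarily the cause in this exact case): drop() without inplace=True returns a new DataFrame and never touches the original object, so any other name or earlier reference that still points at the old object keeps showing the "deleted" rows.

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4]})
other_ref = df                     # a second name bound to the SAME object

df = df.drop(index=[0, 1])         # rebinds df to a brand-new 2-row frame
print(df.shape)                    # (2, 1) -> the drop looks like it worked
print(other_ref.shape)             # (4, 1) -> the original object is untouched

# inplace=True mutates the shared object instead, so every name sees the change:
other_ref.drop(index=[2], inplace=True)
print(other_ref.shape)             # (3, 1)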

I cannot understand why "in" doesn't work correctly

sp01 is a dataframe which contains the S&P 500 index, and I have a dataframe, interest, which contains a daily interest rate. The two datasets start on the same date, but their sizes are not the same, which causes an error.
I want to keep only the matching dates, so I tried to check every date using the "in" operator. But "in" doesn't work the way I expect. This is the code:
print(sp01.Date[0], type(sp01.Date[0]) )
->1976-06-01, str
print(interest.DATE[0], type(interest.DATE[0]) )
->1976-06-01, str
print(sp01.Date[0] in interest.DATE)
->False
I can never understand why the result is False. Of course, the first dates of sp01 and interest are exactly the same;
I checked that with code as well. So True should come out, but False came out. I'm going mad!!! Please help me.
I solved it! The problem is that the "in" operator does not do what I expected on a pandas Series: it checks membership in the Series index (the labels), not in the values. Those two columns are pandas Series, so I had to change one of them to a list.
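For reference, a minimal sketch of that behaviour (the dates below are placeholders): on a Series, "in" looks at the index labels, which is why converting to a list, or checking against .values, makes the test look at the dates themselves.

import pandas as pd

dates = pd.Series(['1976-06-01', '1976-06-02', '1976-06-03'])

print('1976-06-01' in dates)            # False -> "in" checks the index (0, 1, 2)
print(0 in dates)                       # True  -> 0 is an index label
print('1976-06-01' in dates.values)     # True  -> checks the actual values
print('1976-06-01' in list(dates))      # True  -> same idea, via a plain list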

How to insert empty rows and let data start on certain row number?

I am completely new to coding and started to experiment with Python and pandas. Quite an adventure, and I am learning a lot. I found a lot of solutions already here on Stack, but not for my latest quest.
With pandas I imported and edited a txt file so that I could export it as a csv file. But to be able to import this csv file into another program, I need the header row to start on row number 20. So I actually need 19 empty rows above it.
Can somebody guide me in the right direction?
You can join your dataframe with an empty dataframe:
empty_data = [[''] * len(df.columns) for i in range(19)]
empty_df = pd.DataFrame(empty_data, columns=df.columns)
pd.concat((df, empty_df))
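Note that, as written, pd.concat((df, empty_df)) places the blank rows after the data, and to_csv still writes the header on the first line of the file. If the requirement is literally that the header lands on row 20 of the exported csv, one possible sketch, assuming a plain to_csv export (output.csv is a placeholder path), is to write 19 blank lines first and then let to_csv append the header and data:

import pandas as pd

df = pd.DataFrame({'col_a': [1, 2], 'col_b': [3, 4]})   # placeholder data

# 19 blank lines, then header + data, so the header ends up on line 20.
with open('output.csv', 'w', newline='') as f:
    f.write('\n' * 19)
    df.to_csv(f, index=False)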

Octave: quadprog index issue?

I am trying to run several files of code for an assignment. I am trying to solve an optimization problem using the "quadprog" function from the "optim" package.
quadprog is supposed to solve the optimization problem in a certain format and takes the inputs H, f, A, b, Aeq, Beq, lb, ub.
The issue I am having involves my f which is a column vector of constants. To clarify, f looks like c*[1,1,1,1,1,1] where c is a constant. Quadprog seems to run my code just fine for certain values of c, but gives me the error:
error: index (_,49): but object has size 2x2
error: called from
quadprog at line 351 column 32
for other values of c. So, for example, 1/3 works, but 1/2 doesn't. Does anyone have any experience with this?
Sorry for not providing a working example. My code runs over several files and I seem to only be having problems with a specific value set that is very big. Thanks!
You should try the qp native Octave function.
You mention f is c*[1,1,1,1,1,1], but if c is a scalar, that is a row vector, not a column vector. It seems very odd that a scalar value would produce a dimensions error...