Changing column names in Julia to strings - dataframe

This might seem like a silly question. I started with Julia very recently and ran into this trivial problem.
I create a matrix as follows:
Matice = rand(10, 10)
Matice = convert(DataFrame, Matice)
I wanted to change the column names to A, B, C, ...
NewColNames = Array(String,ncol(Matice))
for i = 1:(ncol(Matice))
NewColNames[i] = string('A' + (i-1))
end
names!(Matice, NewColNames)
The last line produces an error.
I also tried something more direct, such as:
for i = 1:(ncol(Matice))
names(Matice)[i] = string('A' + (i-1))
end
But that doesn't work either.
Any help will be greatly appreciated.

I thought it worth adding three remarks:
1) Probably the simplest code to generate the symbols you want is
Symbol.('A':'A'+ncol-1)
where ncol is the number of columns you want.
2) Currently you can create a DataFrame from a vector of vectors and give names for the variables in the constructor, e.g.:
DataFrame([rand(10) for i in 1:10], Symbol.('A':'J'))
3) When this PR to DataFrames is merged and released (probably in a few weeks at most), you will be able to write the same using a matrix:
DataFrame(rand(10, 10), Symbol.('A':'J'))

With help of Dan, posting an alternative solution without creating a NewColNames vector:
names!(Matice, [Symbol('A' + (i-1)) for i in 1:(ncol(Matice))])


What is the meaning of concatenation in the case of the cm operator in PDF?

I understand that "cm" concatenates two CTMs, however, it's not obvious to me what the specific definition of concatenation is. Reading through to "graphics state operators", in the specification, has not helped me.
Thus far I've looked at a whole bunch of different resources about matrix concatenation. There seems to be a number of different ways concatenation is defined for matrices: some examples seem to show it as:
[1,2; 3,4] concat [5,6; 7,8] = [1,2,5,6; 3,4,7,8]
... however that would seem to break the transform matrices, so I assume that's not it.
Another option is that they just mean matrix addition:
[1,2; 3,4] + [5,6; 7,8] = [6,8; 10,12]
but I feel, if it were just a matrix addition, they would just call it addition/matrix addition.
My last idea is:
[1,2; 3,4] + [5,6; 7,8] = [15,26; 37,48]
but that seems like a bizarre approach, not least because it would have numbers behaving like text.
Thanks in advance
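As far as I can tell from the PDF specification (the section on coordinate transformations), concatenation here means matrix multiplication, not any of the three ideas above. The six operands a b c d e f of cm stand for the 3x3 matrix [a b 0; c d 0; e f 1], and the new CTM is that matrix multiplied by the old CTM (CTM' = M x CTM, with points treated as row vectors). A minimal sketch in plain Python, where the helper names to_3x3, matmul, concat and apply are made up for illustration:

```python
def to_3x3(a, b, c, d, e, f):
    """Expand the six cm operands into the full 3x3 matrix."""
    return [[a, b, 0], [c, d, 0], [e, f, 1]]


def matmul(p, q):
    """Ordinary 3x3 matrix product p x q."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def concat(m, ctm):
    """Effect of `a b c d e f cm`: premultiply the current matrix by m."""
    return matmul(m, ctm)


def apply(ctm, x, y):
    """Transform the point (x, y), treated as the row vector [x y 1]."""
    row = [x, y, 1]
    out = [sum(row[k] * ctm[k][j] for k in range(3)) for j in range(3)]
    return out[0], out[1]


ctm = to_3x3(1, 0, 0, 1, 0, 0)                  # start from the identity
ctm = concat(to_3x3(1, 0, 0, 1, 10, 20), ctm)   # 1 0 0 1 10 20 cm (translate)
ctm = concat(to_3x3(2, 0, 0, 2, 0, 0), ctm)     # 2 0 0 2 0 0 cm (scale)
point = apply(ctm, 1, 1)
```

Here the point (1, 1) is scaled first and then translated, which matches the rule that the most recently concatenated matrix is applied to coordinates first.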

Defining a function in Pandas

I am new to Pandas and I am taking this course online. I know there is a way to define a function to make this code cleaner but I'm not sure how to go about it.
noshow = len((df[
(df['Gender'] == 'M') \
& (df['No_show'] == 'Yes') \
& (df['Persons_age'] == 'Child')
]))
noshow
There are multiple Gender values, multiple No_show answers, and multiple Persons_age values, and I don't want to have to write out the code for each combination.
I've gotten the code for a single function but not for the multiple iterations.
def print_noshow_percentage(column_name, value, percentage_text):
    total = (df[column_name] == value).sum()
    noshow = len((df[(df[column_name] == value) & (df['No_show'] == 'Yes')]))
    print(int((noshow / total) * 100), percentage_text)
I hope this makes sense. Thanks for any help!
Welcome to Stack Exchange. Your desired output is not entirely clear, but I think what you are trying to do is get a summary of every possible combination of age, gender, and no_show in your df. To accomplish this you can use the built-in groupby method of pandas (see its documentation).
As mentioned by @ALollz, the following code will get you everything you need to know about your counts in terms of percentages.
counts = df.groupby(['Gender', 'Persons_age'])['No_show'].value_counts(normalize=True)
Now you need to decide what to do with it. You can either iterate through the dataframe printing each line, or you can find specific combinations or you can print out the whole thing.
In general, it is better to look for a built in method than to try to build a function outside of pandas. There are a lot of different ways to do things and checking the documentation is a good place to start.
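To make the groupby suggestion concrete, here is a small sketch on made-up data. The six rows below are invented for illustration; only the column names Gender, Persons_age and No_show come from the question.

```python
import pandas as pd

# Invented sample data; the real df comes from the course dataset.
df = pd.DataFrame({
    "Gender": ["M", "M", "F", "F", "M", "F"],
    "Persons_age": ["Child", "Child", "Adult", "Adult", "Adult", "Child"],
    "No_show": ["Yes", "No", "Yes", "Yes", "No", "No"],
})

# Share of each No_show answer within every (Gender, Persons_age) group.
counts = df.groupby(["Gender", "Persons_age"])["No_show"].value_counts(normalize=True)

# Look up one specific combination: male children who did not show up.
male_child_noshow = counts.loc[("M", "Child", "Yes")]

# Or walk through every combination that occurs in the data.
for (gender, age, answer), share in counts.items():
    print(gender, age, answer, f"{share:.0%}")
```

Combinations that never occur in the data simply do not appear in counts, so a lookup for one of those would raise a KeyError.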

Reading .txt into Julia DataFrame as Date Type

Is there a way to read date ("2000-01") variables from text files into a Julia DataFrame directly, as a date? There's no documentation on this from what I have seen.
df = readtable("path/dates.txt", eltypes = [Date, Date])
This doesn't work, even though it seems like it should. My usual process is to read the dates in as strings and then loop over each row to create a new date variable. This has become a bottleneck in some of my processes now, due to the size of the DataFrames.
My usual flow is to do something like this:
full_df[:real_date] = Date(full_df[:temp_dte_string], "m/d/y")
Thank you
I don't think there's currently any way to do the loading in a single step like your first suggested code. However you can speed up the second method somewhat by making a DateFormat object and calling Date with that instead of with a string.
(This is mentioned briefly here.)
dfmt = Dates.DateFormat("m/d/y")
full_df[:real_date] = Date(full_df[:temp_dte_string], dfmt)
(For some reason I thought Date was not vectorized and have been doing this inside a for loop in all my code. Whoops.)
By "delete a variable", do you mean delete a column or a row? If you mean the former, then there are a few other ways to do this, including things like
function vectorin(a, b) #IMHO this should be in base
    bset = Set(b)
    [i in bset for i in a]
end
df = DataFrame(A1="", A2="", A3="", D="", E="", F="") #Some long list of columns
badCols = [:D, :F] #Some long list of columns you want to remove
df = df[names(df)[!vectorin(names(df), badCols)]]
Sometimes I read in csv files with a lot of extra columns, then just do something like
df = readtable("data.csv")
df = df[[:Only, :the, :cols, :I, :want]]

How to assign a value to a specific mgrid entry in matplotlib?

I am new to matplotlib and scipy. I want to create a two-dimensional mgrid in matplotlib and assign values that I have generated to individual cells in this two-dimensional array. How can I do it? I am looking for an assignment function such as a[i,j] = k but I can't find one. Any clues?
Thanks in advance.
Ranga
OK. I think I found the answer. What I wanted to do was better done with a numpy.array. So the way to do this (for me) was:
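import numpy as np

# Build a rows x cols grid of zeros (rows and cols are assumed to be defined earlier)
t = []
zeroRow = []
for j in range(cols):
    zeroRow.append(0)
for i in range(rows):
    t.append(zeroRow)
# np.array copies the nested lists, so reusing the same zeroRow object is harmless here
spectrogramData = np.array(t, float)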
Later I read the values from a file where the row and column are stored and assign to the spectrogramData
spectrogramData[row][column] = valueRead
My confusion was not knowing how to access the wrapped array. It is accessed like any two dimensional array.
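For what it's worth, the same zero-filled array can probably be built more directly with numpy.zeros, and cells can be assigned with the a[i, j] = k style of indexing. A small sketch, with made-up sizes and a made-up value standing in for what would be read from the file:

```python
import numpy as np

rows, cols = 3, 4  # hypothetical sizes, not from the original post

# One call replaces the nested-list construction.
spectrogram_data = np.zeros((rows, cols), dtype=float)

# Assign a single cell; [1, 2] addresses the same element as [1][2].
row, column, value_read = 1, 2, 7.5  # stand-ins for values read from a file
spectrogram_data[row, column] = value_read
```

The comma form indexes the array once, whereas [row][column] first extracts a row and then indexes into it; both end up at the same cell.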
Thanks for responding!

Mathematica: Commands return no output, but itself. Bug?

I am working with Wolfram Mathematica 8 and have the following problem. I have an optimization problem under certain constraints and want an analytical (symbolic) solution. I am maximizing the function piA. My input is:
piA[a_, WA1_, WA0_] =
a/(1 + a)*(X - (y*WA1 + 1)^(1/y)) - 1/(1 + a) ((y*WA0 + 1)^(1/y));
Maximize[{piA[a, WA1, WA0], WA0 >= -1/y, WA1 >= -1/y}, WA0]
What I get most of the times is:
Maximize[{-((1 + WA0 y)^((1/y))/(1 + a)) + (
a (X - (1 + WA1 y)^(1/y)))/(1 + a), WA0 >= -(1/y), WA1 >= -(1/y)},a]
Basically, the command does nothing, but outputs itself. Only once I have managed to get the proper output (too long to paste here). I have tested it with simpler functions and it works. Unfortunately, I cannot understand what causes the problem. It is not a syntax problem, since it has worked like that several times. Any help would be very much appreciated.
P.S. Just checked again and my input ALWAYS generates the wrong output. The time it generated the solution was when I accidentally set parameters X and y to certain numbers.
The most likely reason is that, given the function and constraints, Mathematica doesn't know how to maximize your function with respect to WA0. Note that you also have the free variables X and a in there, and it might not have enough information about the domain of X and a to be able to properly form a solution to your equation.
I've had instances where I tried feeding in some equations and constraints and Mathematica simply couldn't do anything with them because they were too general. This may be the case here as well. Is there a specific problem you're trying to solve, and is there any way you could give Mathematica more context?
I don't think this is a bug at all, but it's unfortunate that sometimes Mathematica will just spit back your input when it doesn't have any rules for solving what you gave it.
The usual reason these things happen seems to be that the expressions given are too general for Mathematica to handle, or that it's faced with a set of expressions that are ill-formed.
Just as an example, I tried passing in fractions into a function I wrote that specifically looked for rational expressions, thinking it would work. It turned out that it needed to handle both Rational[a, b] and Times[a, Power[b, -1]]. It could be the case that Mathematica is not expecting a constraint to be of the form GreaterEqual[a, b].
Mathematica returns an answer if you assign the variable a some value. Maybe you could build your strategy on that? In fact it does provide an answer if you assign a value to any of the variables.
( I would need more background of the problem to go from there... )