How do I create a text-to-speech system given HMM models for every phone?

I am trying to create a text-to-speech system for the Kannada language using HTK (the Hidden Markov Model Toolkit). I am following the tutorial on Voxforge.org to generate HMMs for every Kannada phoneme. I trained the models on my data and ended up with a set of HMMs after 9 successive re-estimations. The first 100 lines of the resulting file (hmm9/hmmdefs) look like this:
~o
<STREAMINFO> 1 25
<VECSIZE> 25<NULLD><MFCC_D_N_Z_0><DIAGC>
~s "silst"
<MEAN> 25
-8.981787e+00 9.576690e+00 -5.363592e+00 7.546162e+00 -2.035893e+00 1.368924e+01 -1.560227e+00 1.069209e+01 -1.187764e+00 7.615524e+00 -2.514401e+00 1.025364e+01 2.944104e+00 2.461911e-01 1.115181e+00 -3.759977e-02 7.252287e-01 1.149914e-01 8.399552e-01 3.023236e-01 2.565392e-01 -1.392404e-01 6.415843e-02 -1.413524e-03 -2.068892e+00
<VARIANCE> 25
7.971278e+01 2.110036e+01 1.939896e+01 1.365947e+01 2.379106e+01 2.825374e+01 2.110220e+01 2.366302e+01 1.793456e+01 1.325843e+01 1.291668e+01 1.298042e+01 9.060905e+00 2.023319e+00 2.965916e+00 2.247055e+00 3.701807e+00 5.488801e+00 2.863564e+00 7.988983e+00 3.203042e+00 3.015911e+00 1.897855e+00 2.752292e+00 4.656968e+01
<GCONST> 1.009808e+02
~h "sp"
<BEGINHMM>
<NUMSTATES> 3
<STATE> 2
~s "silst"
<TRANSP> 3
0.000000e+00 6.046680e-01 3.953321e-01
0.000000e+00 7.563286e-01 2.436714e-01
0.000000e+00 0.000000e+00 0.000000e+00
<ENDHMM>
~h "\340\262\202"
<BEGINHMM>
<NUMSTATES> 5
<STATE> 2
<MEAN> 25
6.015578e+00 -1.092545e+00 5.592138e+00 2.388550e+00 -5.738769e+00 -1.192297e+01 3.629476e-01 -9.733001e+00 -9.800635e+00 -3.803736e+00 -1.282426e+00 -4.283876e+00 6.507627e-01 9.431242e-01 7.103178e-01 3.613995e-01 -6.547348e-01 3.122466e-01 4.086074e-01 1.523578e-01 -1.648303e+00 6.070523e-01 -8.467742e-01 7.640757e-01 -9.809423e-01
<VARIANCE> 25
7.796819e+00 1.998214e+01 2.138531e+01 2.325875e+01 3.568517e+01 4.934142e+01 4.020288e+01 4.395386e+01 5.728282e+01 4.333259e+01 4.898003e+01 5.033849e+01 4.539665e-01 1.352478e+00 2.180626e+00 2.384371e+00 2.703094e+00 2.922996e+00 3.646270e+00 4.031106e+00 6.244075e+00 3.810220e+00 4.399445e+00 4.742214e+00 4.219064e-01
<GCONST> 9.904237e+01
<STATE> 3
<MEAN> 25
8.230138e+00 3.391990e+00 7.918058e+00 7.893150e-01 -4.886593e+00 -7.784183e+00 -1.977627e+00 -5.161526e+00 -1.425691e+01 2.598373e-01 -6.913644e+00 -4.903273e-01 -5.491136e-01 1.191998e+00 5.495291e-01 -5.040157e-01 1.796028e+00 1.397739e+00 -3.372138e-01 1.118834e+00 8.360423e-01 1.942233e-01 -4.129717e-01 4.928542e-01 -1.360264e+00
<VARIANCE> 25
7.438219e+00 2.445082e+01 2.108792e+01 2.080677e+01 4.452454e+01 4.637808e+01 4.500817e+01 5.165137e+01 5.889799e+01 4.406474e+01 5.085579e+01 4.713411e+01 1.122957e+00 1.781516e+00 1.638479e+00 2.044909e+00 4.016158e+00 4.268800e+00 4.197068e+00 4.704672e+00 5.295690e+00 4.290681e+00 4.821241e+00 4.186268e+00 1.247262e+00
<GCONST> 1.023382e+02
<STATE> 4
<MEAN> 25
1.485957e+00 6.803011e+00 9.217879e+00 7.743183e-01 2.929435e+00 -2.796502e+00 -4.663482e-01 -2.663082e+00 -5.646381e+00 -1.620271e+00 -4.276723e+00 9.053932e-01 -3.141532e+00 -1.006466e+00 -4.909436e-01 -7.754492e-01 2.238380e+00 5.363849e-01 8.221800e-01 -3.755744e-01 4.192139e+00 -8.950262e-01 1.567237e+00 -1.683249e-02 8.296179e-01
<VARIANCE> 25
3.641056e+01 3.190582e+01 2.878690e+01 2.753709e+01 4.602863e+01 4.876221e+01 4.515038e+01 5.164213e+01 6.707198e+01 4.130317e+01 4.181741e+01 4.647356e+01 2.847422e+00 5.915011e+00 4.902977e+00 5.104756e+00 6.492997e+00 6.636117e+00 6.856898e+00 6.270217e+00 6.925057e+00 5.473263e+00 5.736817e+00 5.553099e+00 9.164440e+00
<GCONST> 1.135293e+02
<TRANSP> 5
0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
0.000000e+00 8.058720e-01 1.941280e-01 0.000000e+00 0.000000e+00
0.000000e+00 0.000000e+00 6.787453e-01 3.212548e-01 0.000000e+00
0.000000e+00 0.000000e+00 0.000000e+00 6.043503e-01 3.956497e-01
0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
<ENDHMM>
~h "\340\262\203"
<BEGINHMM>
<NUMSTATES> 5
<STATE> 2
<MEAN> 25
5.805896e-01 -1.075882e+01 -2.814176e+00 -5.880020e+00 9.052986e+00 -5.357531e+00 5.210671e-01 -3.864521e+00 -3.081197e+00 -3.748333e+00 -4.464721e+00 -9.172723e+00 -1.705454e-01 -2.218179e-01 -1.914008e-01 -1.184491e-01 8.222175e-02 7.313553e-02 3.581692e-01 2.845684e-01 1.727469e-01 5.520494e-01 2.320103e-01 -2.316576e-02 -2.695204e-01
<VARIANCE> 25
2.977049e+01 1.933836e+01 7.313203e+01 2.030999e+01 6.759410e+01 3.599974e+01 3.177183e+01 4.166393e+01 4.396541e+01 5.064403e+01 3.757855e+01 3.437418e+01 2.450529e+00 2.951347e+00 5.625500e+00 1.981921e+00 7.028037e+00 4.438006e+00 2.735678e+00 5.700938e+00 6.298313e+00 4.537711e+00 3.482989e+00 2.986932e+00 7.007732e-01
<GCONST> 1.053795e+02
<STATE> 3
<MEAN> 25
3.327909e-01 -7.221850e+00 -7.173618e+00 -5.016214e+00 5.876769e+00 -3.370543e+00 1.606741e+00 -1.376615e+00 1.960738e+00 1.589902e+00 1.996540e+00 -5.060460e+00 3.495924e-01 1.866842e+00 5.888807e-01 9.426065e-01 -9.187853e-01 8.452010e-01 -7.540780e-02 2.513928e-02 -3.550608e-01 4.580146e-01 -3.656684e-01 1.935040e+00 -2.721846e+00
<VARIANCE> 25
2.727732e+01 2.661234e+01 4.245024e+01 2.600391e+01 4.045695e+01 6.376625e+01 1.882898e+01 6.454602e+01 4.445516e+01 5.115188e+01 2.588016e+01 6.268848e+01 1.138753e+00 3.187240e+00 6.774692e+00 3.562576e+00 2.859416e+00 2.514062e+00 2.626426e+00 4.330146e+00 2.819590e+00 3.446536e+00 4.056371e+00 3.272135e+00 1.644712e+00
<GCONST> 1.038539e+02
<STATE> 4
<MEAN> 25
-1.948545e+00 5.389034e+00 7.606988e-01 2.563978e+00 3.025390e-01 -2.773341e+00 1.604488e+00 3.484166e-02 -1.042359e+00 -9.634234e-01 -2.470519e-01 9.371792e-01 -2.833774e+00 1.188266e+00 8.226773e-01 6.984392e-01 5.602998e-01 1.270509e+00 2.334537e-01 5.804406e-01 6.506713e-01 -2.372813e-01 1.170728e+00 8.938893e-01 -2.155946e+00
<VARIANCE> 25
4.566143e+01 5.729873e+01 3.032546e+01 4.105791e+01 3.591120e+01 1.047685e+02 3.953066e+01 5.429741e+01 4.366869e+01 3.534993e+01 2.732262e+01 5.425205e+01 5.152478e+00 1.152582e+01 3.550126e+00 5.687152e+00 1.209841e+01 1.279304e+01 4.226732e+00 9.372541e+00 4.770387e+00 7.745139e+00 5.311399e+00 6.597965e+00 2.797117e+01
<GCONST> 1.177988e+02
<TRANSP> 5
0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
0.000000e+00 9.086774e-01 9.132260e-02 0.000000e+00 0.000000e+00
0.000000e+00 0.000000e+00 7.875688e-01 2.124312e-01 0.000000e+00
0.000000e+00 0.000000e+00 0.000000e+00 7.366875e-01 2.633125e-01
0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
<ENDHMM>
~h "\340\262\205"
<BEGINHMM>
<NUMSTATES> 5
<STATE> 2
<MEAN> 25
-1.364586e-01 -1.345779e+01 -1.243845e+01 -6.815464e+00 9.149407e+00 -5.414480e+00 8.044611e+00 -4.851705e+00 3.946866e+00 -4.012804e+00 -3.772848e+00 -7.529723e+00 1.129064e-02 -4.110667e-01 7.924367e-01 -6.968765e-01 -1.525430e+00 -9.059939e-01 -1.230243e+00 7.858636e-01 4.188629e-01 -1.171953e+00 1.233250e-01 -1.297955e+00 1.128136e+00
<VARIANCE> 25
1.083044e+01 1.837658e+01 2.447788e+01 2.060703e+01 3.049649e+01 5.438379e+01 3.610032e+01 5.379135e+01 3.230940e+01 3.523213e+01 3.347173e+01 3.463765e+01 1.293990e+00 4.274662e+00 2.051608e+00 2.855648e+00 2.757517e+00 5.047538e+00 4.031400e+00 6.269907e+00 3.726271e+00 4.057119e+00 3.291831e+00 3.767739e+00 5.445541e+00
<GCONST> 1.028119e+02
<STATE> 3
<MEAN> 25
1.217768e+00 -6.389030e+00 -2.646137e+00 -3.905793e+00 7.397957e-01 -1.201395e+01 2.211602e-01 -5.126017e+00 2.621910e+00 -6.144597e+00 -2.084508e+00 -8.465837e+00 5.732061e-01 2.775116e+00 2.996845e+00 1.764418e+00 -8.277674e-01 -4.310024e-01 -1.470319e+00 -7.550560e-01 -1.513498e+00 3.458426e-01 -6.254972e-02 9.949619e-01 -2.641093e+00
<VARIANCE> 25
9.996929e+00 4.527835e+01 4.282214e+01 3.972993e+01 3.381993e+01 6.505142e+01 3.322316e+01 5.223600e+01 4.259438e+01 3.215968e+01 3.654654e+01 3.350212e+01 9.168934e-01 1.936100e+00 1.460505e+00 2.197761e+00 3.082174e+00 5.383616e+00 3.715859e+00 6.141081e+00 4.185381e+00 3.373948e+00 3.559795e+00 2.988963e+00 2.619537e+00
<GCONST> 1.026411e+02
<STATE> 4
<MEAN> 25
7.228215e-01 3.678457e+00 5.183236e+00 1.126518e+00 1.464845e+00 -7.542987e+00 -1.289414e+00 -3.014408e+00 -2.044827e+00 -2.407705e+00 -4.620550e+00 -2.489383e+00 -1.007321e+00 1.561653e+00 2.128999e+00 4.428890e-01 3.319249e-01 1.499440e+00 -8.010308e-02 1.836691e-01 -5.105380e-01 6.720528e-01 -4.222759e-02 1.465378e+00 -1.665329e+00
<VARIANCE> 25
3.001524e+01 5.585024e+01 3.712964e+01 3.331260e+01 4.078825e+01 9.574130e+01 3.707666e+01 6.811467e+01 6.065076e+01 4.812584e+01 3.788456e+01 4.958126e+01 3.807814e+00 3.643170e+00 2.988422e+00 3.972459e+00 4.200939e+00 4.923275e+00 4.783486e+00 5.365061e+00 6.556737e+00 4.130150e+00 5.004657e+00 3.739977e+00 3.390516e+00
<GCONST> 1.109406e+02
<TRANSP> 5
0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
0.000000e+00 8.390214e-01 1.609786e-01 0.000000e+00 0.000000e+00
0.000000e+00 0.000000e+00 7.104794e-01 2.895207e-01 0.000000e+00
0.000000e+00 0.000000e+00 0.000000e+00 4.845260e-01 5.154740e-01
0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
<ENDHMM>
Given this file, how do I convert arbitrary Kannada input text (whose phones are encoded in octal form in the file above) to speech? I understand that Julius does it the other way around, i.e. speech to text. However, I want to convert text to speech. Any suggestions are appreciated.

This model is meant for recognition. To convert text to speech you will need a richer model: besides mel-cepstral coefficients, it has to include excitation (fundamental frequency) and duration, and the parameters are modeled in a much wider context (phonetic, linguistic, prosodic).
You can start with MaryTTS: https://github.com/marytts/marytts/wiki/HMMVoiceCreation
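On the octal escapes mentioned in the question: names such as "\340\262\202" in hmmdefs are UTF-8 bytes written as octal escapes. A minimal Python sketch, assuming the file uses exactly this three-digit escape form, recovers the readable phone labels. The helper names decode_htk_name and list_hmm_names are made up for illustration and are not part of HTK. The sketch only reads labels; it does not synthesize speech.

```python
import re


def decode_htk_name(name):
    """Turn octal escapes such as \\340\\262\\202 back into UTF-8 text."""
    raw = re.sub(
        rb"\\([0-7]{3})",                        # backslash + three octal digits
        lambda m: bytes([int(m.group(1), 8)]),   # one escape -> one byte
        name.encode("latin-1"),
    )
    return raw.decode("utf-8")


def list_hmm_names(hmmdefs_text):
    """Collect the names from every ~h "..." line of an hmmdefs file."""
    quoted = re.findall(r'^~h "(.*)"$', hmmdefs_text, flags=re.M)
    return [decode_htk_name(q) for q in quoted]


# Tiny stand-in for hmm9/hmmdefs, using three names from the listing above.
sample = (
    '~h "sp"\n<BEGINHMM>\n<ENDHMM>\n'
    '~h "\\340\\262\\202"\n<BEGINHMM>\n<ENDHMM>\n'
    '~h "\\340\\262\\205"\n<BEGINHMM>\n<ENDHMM>\n'
)
names = list_hmm_names(sample)
print(names)
```

Here the second and third names decode to the Kannada code points U+0C82 and U+0C85.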

Related

How to drop values for all rows in pandas

I have a DataFrame that looks like this:
protein IHD CM ARR VD CHD CCD VOO
0 q9uku9 0.000000 0.039457 0.032901 0.014793 0.006614 0.006591 0.000000
1 o75461 0.000000 0.005832 0.027698 0.000000 0.000000 0.006634 0.000000
There are thousands of rows of proteins. I want to drop the rows where the values for all of the diseases are less than 0.01. How do I do this?
You can use loc in combination with any. Basically, you want to keep all rows where any value is greater than or equal to 0.01. Note that I adjusted your example so that the second protein has all values < 0.01.
import pandas as pd
df = pd.DataFrame([
['q9uku9', 0.000000, 0.039457, 0.032901, 0.014793, 0.006614, 0.006591, 0.000000 ],
['o75461', 0.000000, 0.005832, 0.007698, 0.000000, 0.000000, 0.006634, 0.000000]
], columns=['protein', 'IHD', 'CM', 'ARR', 'VD', 'CHD', 'CCD', 'VOO'])
df = df.set_index('protein')
df_filtered = df.loc[(df >= 0.01).any(axis=1)]
Which gives:
IHD CM ARR VD CHD CCD VOO
protein
q9uku9 0.0 0.039457 0.032901 0.014793 0.006614 0.006591 0.0
>>> df
protein IHD CM ARR VD CHD CCD VOO
0 q9uku9 0.0 0.039457 0.032901 0.014793 0.006614 0.006591 0.0
1 o75461 0.0 0.005832 0.027698 0.000000 0.000000 0.006634 0.0
2 d4acr8 0.0 0.001490 0.003920 0.000000 0.000000 0.009393 0.0
>>> df.loc[~(df.select_dtypes(float) < 0.01).all(axis="columns")]
protein IHD CM ARR VD CHD CCD VOO
0 q9uku9 0.0 0.039457 0.032901 0.014793 0.006614 0.006591 0.0
1 o75461 0.0 0.005832 0.027698 0.000000 0.000000 0.006634 0.0
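The two filters above, any(>= 0.01) and not all(< 0.01), select the same rows as long as there are no missing values. The sketch below uses a cut-down copy of the data plus one hypothetical row ("x00000", not in the original) containing a NaN. It shows where they diverge: a NaN compares as False either way, so the first form drops that row and the second keeps it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "protein": ["q9uku9", "o75461", "d4acr8", "x00000"],  # last row is hypothetical
        "CM": [0.039457, 0.005832, 0.001490, np.nan],
        "ARR": [0.032901, 0.027698, 0.003920, 0.002000],
    }
)

values = df.select_dtypes(float)  # only the numeric disease columns

# Keep rows where at least one value reaches the threshold.
keep_any = (values >= 0.01).any(axis="columns")
# Keep rows unless every value is below the threshold.
keep_not_all = ~(values < 0.01).all(axis="columns")

kept_any = df.loc[keep_any, "protein"].tolist()
kept_not_all = df.loc[keep_not_all, "protein"].tolist()
print(kept_any)
print(kept_not_all)
```

If missing values are possible, decide up front whether they should count as "below threshold", for example by calling fillna(0) first.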

Julia - using CartesianIndices with an array

I am trying to access specific elements of an NxN matrix 'msk', with indices stored in a Mx2 array 'idx'. I tried the following:
N = 10
msk = zeros(N,N)
idx = [1 5;6 2;3 7;8 4]
#CIs = CartesianIndices(( 2:3, 5:6 )) # this works, but not what I want
CIs = CartesianIndices((idx[:,1],idx[:,2]))
msk[CIs] .= 1
I get the following: ERROR: LoadError: MethodError: no method matching CartesianIndices(::Tuple{Array{Int64,1},Array{Int64,1}})
Is this what you want? (I am using your definitions)
julia> msk[CartesianIndex.(eachcol(idx)...)] .= 1;
julia> msk
10×10 Array{Float64,2}:
0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Note that I use a vector of CartesianIndex:
julia> CartesianIndex.(eachcol(idx)...)
4-element Array{CartesianIndex{2},1}:
CartesianIndex(1, 5)
CartesianIndex(6, 2)
CartesianIndex(3, 7)
CartesianIndex(8, 4)
This is because CartesianIndices is documented as:
"Define a region R spanning a multidimensional rectangular range of integer indices."
so the region it defines must be rectangular, whereas the indices here are scattered.
Another way to get the required indices would be e.g.:
julia> CartesianIndex.(Tuple.(eachrow(idx)))
4-element Array{CartesianIndex{2},1}:
CartesianIndex(1, 5)
CartesianIndex(6, 2)
CartesianIndex(3, 7)
CartesianIndex(8, 4)
or (this time we use linear indexing into msk as it is just a Matrix)
julia> [x + (y-1)*size(msk, 1) for (x, y) in eachrow(idx)]
4-element Array{Int64,1}:
41
16
63
38

pandas iterate over 3 data frames element wise into a function

I wrote:
def revertcheck(basevalue, first, second):
    if basevalue == 1:
        return 0
    elif basevalue > first and first > second:
        return -abs(first - second)
    elif basevalue < first and first < second:
        return -abs(first - second)
    else:
        return abs(first - second)
Now I have three correlation matrices of the same size, each of type
pandas.core.frame.DataFrame
I want to iterate over every element, and feed all those 3 values into my function at a time. Can someone give me a hint how to do that?
AAPL AMZN BAC GE GM GOOG GS SNP XOM
AAPL 1.000000 0.567053 0.410656 0.232328 0.562110 0.616592 0.800797 -0.139989 0.147852
AMZN 0.567053 1.000000 -0.012830 0.071066 0.271695 0.715317 0.146355 -0.861710 -0.015936
BAC 0.410656 -0.012830 1.000000 0.953016 0.958784 0.680979 0.843638 0.466912 0.942582
GE 0.232328 0.071066 0.953016 1.000000 0.935008 0.741110 0.667574 0.308813 0.995237
GM 0.562110 0.271695 0.958784 0.935008 1.000000 0.857678 0.857719 0.206432 0.899904
GOOG 0.616592 0.715317 0.680979 0.741110 0.857678 1.000000 0.632255 -0.326059 0.675568
GS 0.800797 0.146355 0.843638 0.667574 0.857719 0.632255 1.000000 0.373738 0.623147
SNP -0.139989 -0.861710 0.466912 0.308813 0.206432 -0.326059 0.373738 1.000000 0.369004
XOM 0.147852 -0.015936 0.942582 0.995237 0.899904 0.675568 0.623147 0.369004 1.000000
Let's assume basevalue, first and second are your three DataFrames of exactly the same size and structure. Then you can do what you want in a vectorised manner. The basevalue == 1 mask is applied last so that it takes precedence over the other conditions, just as the first branch of revertcheck does:
output = abs(first - second)
output = output.mask((basevalue > first) & (first > second), -abs(first - second))
output = output.mask((basevalue < first) & (first < second), -abs(first - second))
output = output.mask(basevalue == 1, 0)
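The same branching can also be written with numpy.select, which evaluates its conditions in order and so mirrors the if/elif chain of revertcheck directly. This is a sketch under the same assumption of three identically labelled DataFrames. The wrapper name revertcheck_frames and the small 2x2 inputs are made up for illustration.

```python
import numpy as np
import pandas as pd


def revertcheck(basevalue, first, second):
    # Scalar reference version from the question.
    if basevalue == 1:
        return 0
    elif basevalue > first and first > second:
        return -abs(first - second)
    elif basevalue < first and first < second:
        return -abs(first - second)
    else:
        return abs(first - second)


def revertcheck_frames(basevalue, first, second):
    b, f, s = basevalue.to_numpy(), first.to_numpy(), second.to_numpy()
    diff = np.abs(f - s)
    # The first matching condition wins, like if/elif.
    conditions = [b == 1, (b > f) & (f > s), (b < f) & (f < s)]
    choices = [np.zeros_like(diff), -diff, -diff]
    result = np.select(conditions, choices, default=diff)
    return pd.DataFrame(result, index=basevalue.index, columns=basevalue.columns)


labels = ["AAPL", "AMZN"]  # illustrative 2x2 inputs, not the real matrices
basevalue = pd.DataFrame([[1.0, 0.6], [0.6, 1.0]], index=labels, columns=labels)
first = pd.DataFrame([[1.0, 0.4], [0.8, 1.0]], index=labels, columns=labels)
second = pd.DataFrame([[1.0, 0.1], [0.9, 1.0]], index=labels, columns=labels)

result = revertcheck_frames(basevalue, first, second)
print(result)
```

Working on the underlying arrays skips pandas label alignment, so this version relies on the three frames having the same row and column order.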

Unable to set cell values in Python Pandas SparseDataFrame

I'm having a hard time getting cell values to stick in a SparseDataFrame when updating by index/column. I've tried setting cell values using df.at, df.ix, and df.loc, but the DataFrame remains empty.
df = pd.SparseDataFrame(np.zeros((10,10)), default_fill_value=0)
df.at[1,1] = 1
df.ix[2,2] = 1
df.loc[3,3] = 1
df
0 1 2 3 4 5 6 7 8 9
0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
5 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
6 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
7 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
8 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
9 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Any one of these options works fine on a standard DataFrame.
The one option I've found that works is
df = df.set_value(1, 1, 1)
But this is terribly slow for a large matrix.
[edit] I did see a four-year-old answer that suggested the approach below, but it implied that more direct methods were in the works. I also haven't tested its speed, but it seems that converting whole slices of the matrix to dense and back would be much slower than directly updating a (row, col, val) entry, as you can with scipy sparse matrices.
df = pd.SparseDataFrame(columns=np.arange(250000), index=np.arange(250000))
s = df[2000].to_dense()
s[1000] = 1
df[2000] = s
In [11]: df.ix[1000,2000]
Out[11]: 1.0
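For readers on current pandas: SparseDataFrame was removed in pandas 1.0, so the code above no longer runs there. One possible workaround, sketched here on the assumption that scipy is installed, is to do the cell-by-cell updates in a scipy.sparse.lil_matrix and convert to a sparse-backed DataFrame only once, at the end. This is an alternative technique, not the SparseDataFrame API from the question.

```python
import pandas as pd
from scipy import sparse

# LIL format is designed for cheap incremental (row, col) = value updates.
mat = sparse.lil_matrix((10, 10))
mat[1, 1] = 1
mat[2, 2] = 1
mat[3, 3] = 1

# Convert once, after all updates; the columns get a sparse dtype.
df = pd.DataFrame.sparse.from_spmatrix(mat.tocsr())

print(df.loc[1, 1], df.loc[0, 0])
print(df.sparse.density)  # fraction of cells that are actually stored
```

Doing the updates on the scipy side avoids converting whole columns to dense and back for every assignment.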

How do I aggregate sub-dataframes in pandas?

Suppose I have a two-level multi-indexed DataFrame:
In [1]: index = pd.MultiIndex.from_tuples([(i,j) for i in range(3)
: for j in range(1+i)], names=list('ij') )
: df = pd.DataFrame(0.1*np.arange(2*len(index)).reshape(-1,2),
: columns=list('xy'), index=index )
: df
Out[1]:
x y
i j
0 0 0.0 0.1
1 0 0.2 0.3
1 0.4 0.5
2 0 0.6 0.7
1 0.8 0.9
2 1.0 1.1
And I want to run a custom function on every sub-dataframe:
In [2]: def my_aggr_func(subdf):
:     return subdf['x'].mean() / subdf['y'].mean()
:
: level0 = df.index.levels[0].values
: pd.DataFrame({'mean_ratio': [my_aggr_func(df.loc[i]) for i in level0]},
: index=pd.Index(level0, name=index.names[0]) )
Out[2]:
mean_ratio
i
0 0.000000
1 0.750000
2 0.888889
Is there an elegant way to do it with df.groupby('i').agg(__something__) or something similar?
You need GroupBy.apply, which passes each group to the function as a DataFrame:
df1 = df.groupby('i').apply(my_aggr_func).to_frame('mean_ratio')
print (df1)
mean_ratio
i
0 0.000000
1 0.750000
2 0.888889
You don't need the custom function. You can calculate the 'within group means' with agg then perform an eval to get the ratio you want.
df.groupby('i').agg('mean').eval('x / y')
i
0 0.000000
1 0.750000
2 0.888889
dtype: float64
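For a self-contained check, the sketch below rebuilds the example frame from the question and computes the ratio from the group means with plain column arithmetic instead of eval. Grouping by level="i" is an equivalent spelling chosen here for explicitness.

```python
import numpy as np
import pandas as pd

# Rebuild the two-level example frame from the question.
index = pd.MultiIndex.from_tuples(
    [(i, j) for i in range(3) for j in range(1 + i)], names=list("ij")
)
df = pd.DataFrame(
    0.1 * np.arange(2 * len(index)).reshape(-1, 2),
    columns=list("xy"),
    index=index,
)

# Per-group means, then an ordinary column-wise division.
means = df.groupby(level="i").mean()
mean_ratio = (means["x"] / means["y"]).rename("mean_ratio")
print(mean_ratio)
```

This yields one value per top-level group, matching the ratios shown above.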