I have an actual plane with known 3D coordinates of its four corners relative to a landmark. Its coordinates are:
Front left corner: -32.5100 128.2703 662.2551
Front right corner: 65.2244 131.0850 656.1088
Back left corner: -23.4983 129.0271 838.3724
Back right corner: 74.1135 131.4294 833.4199
I am now creating a 3D OBJ plane in Blender with an image mapped onto it as a texture. Following a tutorial on texturing a plane in Blender, I got the OBJ and MTL files shown below. I tried replacing the geometric vertices in the OBJ file directly with my own coordinates, but the vertices are not connected correctly in MeshLab. Any idea how to modify the OBJ file?
Thanks,
OBJ File:
# Blender v2.76 (sub 0) OBJ File: ''
# www.blender.org
mtllib planePhantom.mtl
o Plane
v -0.088000 0.000000 0.049250
v 0.088000 0.000000 0.049250
v -0.088000 0.000000 -0.049250
v 0.088000 0.000000 -0.049250
vt 0.000000 0.000000
vt 1.000000 0.000000
vt 1.000000 1.000000
vt 0.000000 1.000000
vn 0.000000 1.000000 0.000000
usemtl Material.001
s off
f 1/1/1 2/2/1 4/3/1 3/4/1
MTL file:
# Blender MTL File: 'None'
# Material Count: 1
newmtl Material.001
Ns 96.078431
Ka 1.000000 1.000000 1.000000
Kd 0.640000 0.640000 0.640000
Ks 0.500000 0.500000 0.500000
Ke 0.000000 0.000000 0.000000
Ni 1.000000
d 1.000000
illum 0
map_Kd IMG_0772_cropped_unsharpmask_100_4_0.jpeg
The plane as it appears in MeshLab before replacing the vertices:
Okay, it turns out that the order in which the vertices are connected was wrong; I have already figured it out.
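For readers hitting the same problem, here is a minimal Python sketch of one vertex/face arrangement that keeps the quad connected. It assumes the four corners are written in the same order as the Blender plane's vertices (front left, front right, back left, back right), so that the original face line, which walks the perimeter 1 -> 2 -> 4 -> 3, still applies. The normal is only an estimate from two edges (the four measured corners may not be exactly coplanar), and its sign may need flipping depending on which side should face the viewer.

```python
import numpy as np

# Corner coordinates from the question, in the same order as the Blender
# plane's vertices: front left, front right, back left, back right.
corners = np.array([
    [-32.5100, 128.2703, 662.2551],  # v1 front left
    [65.2244, 131.0850, 656.1088],   # v2 front right
    [-23.4983, 129.0271, 838.3724],  # v3 back left
    [74.1135, 131.4294, 833.4199],   # v4 back right
])

# Estimate a unit normal from the two edges that meet at v1.
normal = np.cross(corners[1] - corners[0], corners[2] - corners[0])
normal /= np.linalg.norm(normal)

lines = ["mtllib planePhantom.mtl", "o Plane"]
lines += ["v {:.4f} {:.4f} {:.4f}".format(*c) for c in corners]
lines += [
    "vt 0.000000 0.000000",
    "vt 1.000000 0.000000",
    "vt 1.000000 1.000000",
    "vt 0.000000 1.000000",
]
lines.append("vn {:.6f} {:.6f} {:.6f}".format(*normal))
lines += ["usemtl Material.001", "s off"]
# The face must walk the perimeter (v1 -> v2 -> v4 -> v3); listing the
# vertices as 1 2 3 4 would cross the diagonal and twist the quad.
lines.append("f 1/1/1 2/2/1 4/3/1 3/4/1")

obj_text = "\n".join(lines) + "\n"
print(obj_text)
```

If the corners are stored in a different order, the indices in the `f` line have to be permuted so that consecutive entries are always adjacent corners.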
I want to calculate the TF-IDF of keywords for a given genre. These keywords were never part of a text; they were already separated, but in a different format. I extracted them from that format and put them into lists. I did the same with the genres.
I had a df in this format:
```keywords,genres
['k1','k2','k3'],['g1','g2']
['k2','k5','k7'],['g1','g3']
['k1','k2','k9'],['g4']
['k6','k7','k8'],['g3','g5']
...```
I used explode on the genres col and got:
```['k1','k2','k3'],g1
['k1','k2','k3'],g2
['k2','k5','k7'],g1
['k2','k5','k7'],g3
['k1','k2','k9'],g4
['k6','k7','k8'],g3
['k6','k7','k8'],g5
...```
then I 'grouped by' genre to have this df_agg:
```genres,keywords
g1,['k1','k2','k3','k2','k5','k7']
g2,['k1','k2','k3']
g3,['k2','k5','k7','k6','k7','k8']
g4,['k1','k2','k9']
g5,['k6','k7','k8']
...```
So I made these changes to calculate the TF-IDF for the keywords per genre, but I'm not sure whether this is the correct format: df_agg['keywords'] is a list, whereas all the examples I see online start from a text and extract the tokens from it. Doesn't my df_agg structure suggest that the genres are the documents and the keywords are ready-made tokens?
Should I do something different?
What you're doing is a bit unconventional, but if you wish to do so you can proceed as follows: take one step back and join your tokens into a string:
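import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
tfidf = TfidfVectorizer()
# Join each genre's keyword list into one space-separated "document"
tfidf_matrix = tfidf.fit_transform(df_agg["keywords"].apply(lambda x: " ".join(x))).toarray()
which you can put into a df, if you wish:
# get_feature_names_out() returns the terms in column order;
# vocabulary_ is a term -> index dict whose key order need not match the columns
df_tfidf = pd.DataFrame(tfidf_matrix, columns=tfidf.get_feature_names_out())
print(df_tfidf)
k1 k2 k3 k5 k6 k7 k8 \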
0 0.359600 0.605014 0.433206 0.433206 0.000000 0.359600 0.000000
1 0.562638 0.473309 0.677803 0.000000 0.000000 0.000000 0.000000
2 0.000000 0.279457 0.000000 0.400198 0.400198 0.664401 0.400198
3 0.503968 0.423954 0.000000 0.000000 0.000000 0.000000 0.000000
4 0.000000 0.000000 0.000000 0.000000 0.609818 0.506204 0.609818
k9
0 0.000000
1 0.000000
2 0.000000
3 0.752515
4 0.000000
I have a sparse matrix that stores computed similarities between a set of documents. The matrix is an ndarray.
0 1 2 3 4
0 1.000000 0.000000 0.000000 0.000000 0.000000
1 0.000000 1.000000 0.067279 0.000000 0.000000
2 0.000000 0.067279 1.000000 0.025758 0.012039
3 0.000000 0.000000 0.025758 1.000000 0.000000
4 0.000000 0.000000 0.012039 0.000000 1.000000
I would like to transform this data into a three-column dataframe as follows.
docA docB similarity
1 2 0.067279
2 3 0.025758
2 4 0.012039
This final result does not contain the matrix diagonal or zero values. It also lists each document pair only once (i.e. in one row only). Is there a built-in / efficient method to achieve this end result? Any pointers would be much appreciated.
Thanks!
Convert the dataframe to an array:
x = df.to_numpy()
Get a list of non-diagonal non-zero entries from the sparse symmetric distance matrix:
i, j = np.triu_indices_from(x, k=1)
v = x[i, j]
ijv = np.concatenate((i, j, v)).reshape(3, -1).T
ijv = ijv[v != 0.0]
Convert it back to a dataframe:
df_ijv = pd.DataFrame(ijv)
I'm not sure if this is any faster or anything but an alternative way to do the middle step is to convert the numpy array to an ijv or "triplet" sparse matrix:
from scipy import sparse
coo = sparse.coo_matrix(x)
ijv = np.concatenate((coo.row, coo.col, coo.data)).reshape(3, -1).T
Now given a symmetric distance matrix, all you need to do is to keep the non-zero elements on the upper right triangle. You could loop through these. Or you could pre-mask the array with np.triu_indices_from(x, k=1), but that kind of defeats the whole purpose of this supposedly faster method... hmmm.
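Putting the first approach together, here is a small self-contained sketch using the 5x5 matrix from the question. The column names docA, docB and similarity come from the desired output; building the frame from a dict (rather than a concatenated array) is my own choice so that the document indices stay integers.

```python
import numpy as np
import pandas as pd

# Similarity matrix from the question
sim = np.array([
    [1.000000, 0.000000, 0.000000, 0.000000, 0.000000],
    [0.000000, 1.000000, 0.067279, 0.000000, 0.000000],
    [0.000000, 0.067279, 1.000000, 0.025758, 0.012039],
    [0.000000, 0.000000, 0.025758, 1.000000, 0.000000],
    [0.000000, 0.000000, 0.012039, 0.000000, 1.000000],
])

# Indices of the strict upper triangle (k=1 skips the diagonal),
# so each document pair appears exactly once.
i, j = np.triu_indices_from(sim, k=1)
v = sim[i, j]

# Drop the zero similarities
keep = v != 0
pairs = pd.DataFrame({"docA": i[keep], "docB": j[keep], "similarity": v[keep]})
print(pairs)
```

For a large and genuinely sparse matrix, the scipy.sparse route may scale better, but I have not measured it.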
I am using pandas describe function for the below result:
dt_d=dt.describe()
print(dt_d)
count 120.00000 120.000000 120.000000 120.000000
mean 5.89000 3.060000 3.795833 1.190833
std 0.84589 0.441807 1.792861 0.757372
min 4.30000 2.000000 1.000000 0.100000
25% 5.17500 2.800000 1.575000 0.300000
50% 5.80000 3.000000 4.450000 1.400000
75% 6.40000 3.325000 5.100000 1.800000
max 7.90000 4.400000 6.900000 2.500000
If I want to take a cell from the describe function, for example, from the mean row, the mean in the third column, how will I be able to call it on its own?
df.describe() returns a DataFrame so you can just index it as you would any other DataFrame, using .loc.
import pandas as pd
import numpy as np
np.random.seed(123)
df = pd.DataFrame(np.random.randint(1,10,(10,3)))
df.describe()
# 0 1 2
#count 10.00000 10.000000 10.000000
#mean 4.30000 2.400000 5.400000
#std 2.58414 1.429841 2.458545
#min 1.00000 1.000000 1.000000
#25% 2.25000 1.000000 4.250000
#50% 4.00000 2.500000 6.000000
#75% 5.00000 3.000000 7.000000
#max 9.00000 5.000000 8.000000
df.describe().loc['mean', 2]
#5.4
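As a further sketch with named columns (the frame and its column names a, b, c are made up for illustration), the same cell can be reached by label or by position:

```python
import pandas as pd

# Hypothetical data with named columns
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, 5.0, 6.0],
    "c": [7.0, 8.0, 9.0],
})
summary = df.describe()

mean_c = summary.loc["mean", "c"]   # by row label and column label
mean_c_pos = summary.iloc[1, 2]     # by position: row 1 is "mean", column 2 is "c"
mean_row = summary.loc["mean"]      # the whole "mean" row as a Series

print(mean_c, mean_c_pos)
print(mean_row)
```

Label-based access with .loc is usually the safer choice, since it does not depend on the row order of the describe() output.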
I have a cohort of N people and I computed a correlation matrix of some quantities (q1_score,...q5_score)
df.groupby('participant_id').corr()
Out[130]:
q1_score q2_score q3_score q4_score q5_score
participant_id
11.0 q1_score 1.000000 -0.748887 -0.546893 -0.213635 -0.231169
q2_score -0.748887 1.000000 0.639649 0.324976 0.335596
q3_score -0.546893 0.639649 1.000000 0.154539 0.151233
q4_score -0.213635 0.324976 0.154539 1.000000 0.998752
q5_score -0.231169 0.335596 0.151233 0.998752 1.000000
14.0 q1_score 1.000000 -0.668781 -0.124614 -0.352075 -0.244251
q2_score -0.668781 1.000000 -0.175432 0.360183 0.184585
q3_score -0.124614 -0.175432 1.000000 -0.137993 -0.125115
q4_score -0.352075 0.360183 -0.137993 1.000000 0.968564
q5_score -0.244251 0.184585 -0.125115 0.968564 1.000000
17.0 q1_score 1.000000 -0.799223 -0.814424 -0.790587 -0.777318
q2_score -0.799223 1.000000 0.787238 0.658524 0.640786
q3_score -0.814424 0.787238 1.000000 0.702570 0.701440
q4_score -0.790587 0.658524 0.702570 1.000000 0.998996
q5_score -0.777318 0.640786 0.701440 0.998996 1.000000
18.0 q1_score 1.000000 -0.595545 -0.617691 -0.472409 -0.477523
q2_score -0.595545 1.000000 0.386705 0.148761 0.115068
q3_score -0.617691 0.386705 1.000000 0.806637 0.782345
q4_score -0.472409 0.148761 0.806637 1.000000 0.982617
q5_score -0.477523 0.115068 0.782345 0.982617 1.000000
I need to compute the median values of the correlations across all participants. That is, I need to take the correlation between item J and item K for every participant and find the median of those values.
I am sure it is a one-liner, but I'm struggling to work it out (still learning pandas by example).
Stack your data, and do another groupby:
df.groupby('participant_id').corr().stack().groupby(level = [1,2]).median()
Edit: Actually, you don't need to stack if you don't want to:
df.groupby('participant_id').corr().groupby(level = [1]).median()
works too.
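A quick sanity check of the second one-liner on synthetic data (random scores for three made-up participants, three questions; none of these numbers come from the question):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
cols = ["q1_score", "q2_score", "q3_score"]

# Synthetic cohort: 20 observations for each of three participants
frames = []
for pid in (11, 14, 17):
    block = pd.DataFrame(rng.normal(size=(20, 3)), columns=cols)
    block.insert(0, "participant_id", pid)
    frames.append(block)
df = pd.concat(frames, ignore_index=True)

# One correlation matrix per participant, stacked in a MultiIndex frame
corr = df.groupby("participant_id").corr()

# Group by the inner index level (the question name) and take the median
# across participants: the result is a single questions-by-questions matrix.
median_corr = corr.groupby(level=1).median()
print(median_corr)

# Manual check for one pair of questions
manual = np.median([
    g[cols].corr().loc["q1_score", "q2_score"]
    for _, g in df.groupby("participant_id")
])
print(manual)
```

Each cell of median_corr is the median, over participants, of the correlation between the row question and the column question.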
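IIUC, you want the mean correlation of each participant across all questions (here df holds the grouped correlation matrix):
df.where(df != 1).mean(axis=1).groupby(level=0).mean()
where masks the self-correlations (the 1s on the diagonal), mean(axis=1) averages across the questions in each row, and groupby(level=0).mean() then averages those row means per participant_id. (Older pandas versions also accepted mean(level=0); that argument has since been removed.)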
Output:
participant_id
11.0 0.086416
14.0 -0.031493
17.0 0.130800
18.0 0.105896
dtype: float64
Edit: I used mean instead of median; the same logic works with median:
df.where(df != 1).median(axis=1).groupby(level=0).median()
I am simply trying to cycle through a list of (10) names using an incrementing counter by taking the modulus of the counter with respect to the length of the list. However, the code seems to skip a number here and there. I have tried both modf() and modff() and different type castings, but no luck.
Here is an example of the code:
defaultNameList = [NSArray arrayWithObjects:@"RacerX",@"Speed",@"Sprittle",@"Chim-Chim",@"Pops",@"Dale",@"Junior",@"Chip",@"Fred",@"Barney", nil];
float intpart;
int pickName = (int)(modff(entryCount/10.0,&intpart) * 10.0);
NSLog(@"%ld %f %f %f %d %@",entryCount, entryCount/10.0, modff(entryCount/10.0,&intpart), modff(entryCount/10.0,&intpart) * 10.0 ,pickName, [defaultNameList objectAtIndex:pickName]);
The console gives:
0 0.000000 0.000000 0.000000 0 RacerX
1 0.100000 0.100000 1.000000 1 Speed
2 0.200000 0.200000 2.000000 2 Sprittle
3 0.300000 0.300000 3.000000 3 Chim-Chim
4 0.400000 0.400000 4.000000 4 Pops
5 0.500000 0.500000 5.000000 5 Dale
6 0.600000 0.600000 6.000000 6 Junior
7 0.700000 0.700000 7.000000 6 Junior
8 0.800000 0.800000 8.000000 8 Fred
9 0.900000 0.900000 9.000000 8 Fred
10 1.000000 0.000000 0.000000 0 RacerX
As far as I can tell it should not skip pickName = 7 or 9, but it does.
Casting to (int) truncates the value. That is, if the result cannot be represented exactly in the floating-point format used on the actual architecture and ends up slightly less than the exact value, the cast rounds it toward zero (so 6.9999999 becomes 6). To solve this problem, round the number instead of truncating:
int pickName = (int)(modff(entryCount / 10.0, &intpart) * 10.0 + 0.5);
(This assumes that the number is not negative.)
However, since you're working with integers here, and floating-point arithmetic is both more expensive and inexact, you should consider using the modulo operator instead (which operates on integers):
int pickName = entryCount % 10;
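The effect can be reproduced outside Objective-C. The following Python sketch is an approximation of the original code, not a port: it rounds the quotient to single precision with struct to mimic what modff() sees, then compares truncation, rounding, and the integer modulo. The helper names are made up for illustration.

```python
import math
import struct

def to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def pick_truncated(entry_count):
    # Mimics (int)(modff(entryCount / 10.0, &intpart) * 10.0)
    frac, _ = math.modf(to_float32(entry_count / 10.0))
    return int(frac * 10.0)  # int() truncates toward zero

def pick_rounded(entry_count):
    # Adding 0.5 before truncating rounds to nearest (non-negative input only)
    frac, _ = math.modf(to_float32(entry_count / 10.0))
    return int(frac * 10.0 + 0.5)

def pick_modulo(entry_count):
    # Pure integer arithmetic, no rounding issues
    return entry_count % 10

truncated = [pick_truncated(n) for n in range(11)]
rounded = [pick_rounded(n) for n in range(11)]
modulo = [pick_modulo(n) for n in range(11)]

print(truncated)  # [0, 1, 2, 3, 4, 5, 6, 6, 8, 8, 0]
print(rounded)    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
print(modulo)     # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
```

The truncated list skips 7 and 9, matching the console output in the question, because the single-precision values nearest to 0.7 and 0.9 lie slightly below them.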