Using a .poly file, I created models with holes in them. I would like to mark/identify the faces bounding these holes. Is there a way? Maybe using regions?
You can do it using the [boundary marker] attribute. See the example below.
Example (with arbitrary numbers):
342 3 0 1
- - -
1 0.312500 0.000000 0.000000 1
2 0.304126 0.000000 0.031250 1
. . .
303 -0.004873 -0.259999 -0.017013 2
304 -0.008291 -0.267013 -0.008944 2
. . .
where the format is:
[index] [x_coordinate] [y_coordinate] [z_cordinate] [boundary marker]
So now, you have marked vertices. So in the next part of the .poly file, we'll specify faces.
676 1
- - -
1 0 1
3 15 16 13
1 0 1
3 13 16 17
. . .
1 0 2
3 304 303 335
1 0 2
3 303 309 335
where the format is:
[# of polygons] [# of holes] [boundary marker]
[# of corners] [corner 1] [corner 2] ... [corner #]
Related
I have created a set of 4 clusters using kmeans, but I'd like to reorder the clusters in an ascending manner to have a predictable way of outputting an analysis every time the script is executed.
The resulting df with the clusters is something like:
customer_id recency frequency monetary_value recency_cluster \
0 44792907512250289 21 1 43.76 0
1 4277896431638207047 443 1 73.13 1
2 1509512561185834874 559 1 37.50 1
3 -8259919882769629944 437 1 34.38 1
4 8269311313560571571 133 2 324.78 0
5 6521698907264712834 311 1 6.32 3
6 9102795320443090762 340 1 174.99 3
7 6203217338400763719 39 1 77.50 0
8 7633758030510673403 625 1 95.26 2
9 -2417721548925747504 644 1 76.84 2
frequency_cluster monetary_value_cluster
0 1 0
1 1 0
2 1 0
3 1 0
4 0 1
5 1 0
6 1 1
7 1 0
8 1 0
9 1 0
The recency clusters are not sorted by the data, I'd like for example that the recency cluster 0 to be the one with the min value = 1.0 (recency cluster 1).
recency_cluster count mean std min 25% 50% 75% max
0 17609.0 700.900960 56.895995 609.0 651.0 697.0 749.0 807.0
1 16458.0 102.692672 62.952229 1.0 47.0 101.0 159.0 210.0
2 17166.0 515.971746 56.592490 418.0 466.0 517.0 567.0 608.0
3 18634.0 317.599227 58.852980 211.0 269.0 319.0 367.0 416.0
Using something like:
rfm_df.groupby('recency_cluster')['recency'].transform('min')
Will return a colum with the min value of each clusters
0 1
1 418
2 418
3 418
4 1
...
69862 609
69863 1
69864 211
69865 609
69866 211
I guess there's got to be a way to convert this categories [1,211,418,609] into [0, 1, 2, 3] in order to get the desired result but I can't come up with a solution.
Or maybe there's a better approach to the problem.
Edit: I did this and I think it's working:
rfm_df['recency_normalized_cluster'] = rfm_df.groupby('recency_cluster')['recency'].transform('min').astype('category').cat.codes
rfm_df['recency_normalized_cluster'] = rfm_df.groupby('recency_cluster')['recency'].transform('min').astype('category').cat.codes
I am Learning Python using pandas, I do not know, how to pivot a data frame with columns with a multilevel index. I have the following pivot table :
df= df.pivot_table(index=["FECHA",'Planta'],
aggfunc = {'Menor_F0' :np.sum, 'Menor_fc' :np.sum,
"Total_Muestras " : "count"
})
it gives: PD: it is correct
Menor_F0 Menor_fc Total_Muestras
FECHA Planta
01/2014 455 0 0 2
470 1 2 5
01/2016 455 0 0 1
470 0 1 2
But I want to visualize it, in this form, how can I do it?
FECHA 01/2014 01/2016
Menor_F0 Menor_fc Total_Muestras Menor_F0 Menor_fc Total_Muestras
PLANTA
455 0 0 2 0 0 1
470 1 2 5 0 1 2
You can try stack and unstack:
df.stack().unstack(level=(0,2))
I have a data frame like this:
> head(a)
FID IID FLASER PLASER DIABDUR HBA1C ESRD pheno
1 fam1000-03 G1000 1 1 38 10.2 1 control
2 fam1001-03 G1001 1 1 15 7.3 1 control
3 fam1003-03 G1003 1 2 17 7.0 1 case
4 fam1005-03 G1005 1 1 36 7.7 1 control
5 fam1009-03 G1009 1 1 23 7.6 1 control
6 fam1052-03 G1052 1 1 32 7.3 1 control
My df has 1698 obs of which 828 who have "case" in pheno column and 836 who have "control" in pheno column.
I make a histogram via:
library(ggplot2)
ggplot(a, aes(x=HBA1C, fill=pheno)) +
geom_histogram(binwidth=.5, position="dodge")
I would like to have the y-axis show the percentage of individuals which
have either "case" or "control" in pheno instead of the count. So percentage would be calculated for each group on y axis ("case" or "control"). I also do have NAs in my plot and it would be good to exclude those from the plot.
I guess I can remove NAs from pheno with this:
ggplot(data=subset(a, !is.na(pheno)), aes(x=HBA1C, fill=pheno)) + geom_histogram(binwidth=.5, position="dodge")
This can be achieved like so:
Note: Concerning the NAs you were right. Simply subset for non-NA values or use dplyr::filter or ...
a <- read.table(text = "id FID IID FLASER PLASER DIABDUR HBA1C ESRD pheno
1 fam1000-03 G1000 1 1 38 10.2 1 control
2 fam1001-03 G1001 1 1 15 7.3 1 control
3 fam1003-03 G1003 1 2 17 7.0 1 case
4 fam1005-03 G1005 1 1 36 7.7 1 control
5 fam1009-03 G1009 1 1 23 7.6 1 control
6 fam1052-03 G1052 1 1 32 7.3 1 control
7 fam1052-03 G1052 1 1 32 7.3 1 NA", header = TRUE)
library(ggplot2)
ggplot(a, aes(x=HBA1C, fill=pheno)) +
geom_histogram(aes(y = ..count.. / tapply(..count.., ..group.., sum)[..group..]),
position='dodge', binwidth=0.5) +
scale_y_continuous(labels = scales::percent)
Created on 2020-05-23 by the reprex package (v0.3.0)
I'd like to select a subset of columns from a DataFrame while applying a transformation to some of those columns at the same time. Is it possible to transform a column when that column is selected as one in a list of columns?
For example, I have a column StartDate that is of type np.datetime[64] that I'd like to extract the month from.
When dealing with that Series on its own, I'd do something like
print(df['StartDate'].transform(lambda x: x.month))
to see the transformed data. Can I accomplish the same thing when the above expression is part of a list of columns? Something like:
print(df[['ColumnA', 'ColumnB', 'StartDate'.transform(lambda x: x.month)]])
Of course the above gives the error
AttributeError: 'str' object has no attribute 'month'
So, if my data looks like:
Metadata | Metadata | 2020-01-01
Metadata | Metadata | 2020-02-06
Metadata | Metadata | 2020-02-25
I'd like to see:
Metadata | Metadata | 1
Metadata | Metadata | 2
Metadata | Metadata | 2
Without appending a new separate "Month" column to the DataFrame. Is this possible?
If you have some data like below
df = pd.DataFrame({'col1' : np.random.randint(10, size = 366), 'col2': np.random.randint(10, size = 366),'StartDate' : pd.date_range('2018', '2019')})
which looks like
col1 col2 StartDate
0 0 2 2018-01-01
1 8 0 2018-01-02
2 0 5 2018-01-03
3 3 4 2018-01-04
4 8 6 2018-01-05
... ... ... ...
361 8 8 2018-12-28
362 9 9 2018-12-29
363 4 1 2018-12-30
364 2 4 2018-12-31
365 0 9 2019-01-01
You could redefine the column, or you could assign and create a temporary view, like.
df.assign(StartDate = df['StartDate'].dt.month)
which outputs.
col1 col2 StartDate
0 0 2 1
1 8 0 1
2 0 5 1
3 3 4 1
4 8 6 1
... ... ... ...
361 8 8 12
362 9 9 12
363 4 1 12
364 2 4 12
365 0 9 1
This also doesn't change the original dataframe. If you want to create a permanent version, then just reassign.
df = df.assign(StartDate = df['StartDate'].dt.month)
You could also take this further, such as.
df.assign(StartDate = df['StartDate'].dt.month, col1 = df['col1'] + 100)[['col1', 'StartDate']]
You can apply whatever transform you need and then access any columns you want after assigning these transforms.
col1 StartDate
0 105 1
1 109 1
2 108 1
3 101 1
4 108 1
... ... ...
361 104 12
362 102 12
363 109 12
364 102 12
365 100 1
I guess you could use the attribute name of the Series.
Something like:
dt_to_month = lambda x: [d.month for d in x] if x.name == 'StartDate' else x
df[['ColumnA', 'ColumnB', 'StartDate']].apply(dt_to_month)
will do the trick.
I need to create spline programmatically. I've made something like:
0
SECTION
2
HEADER
9
$ACADVER
1
AC1006
0
ENDSEC
0
SECTION
2
TABLES
0
TABLE
2
LAYER
0
LAYER
2
shape
70
64
62
250
6
CONTINUOUS
0
LAYER
2
holes
70
64
62
250
6
CONTINUOUS
0
ENDTAB
0
ENDSEC
0
SECTION
2
ENTITIES
0
SPLINE
8
shape
100
AcDbSpline
210
0
220
0
230
1
70
4
71
3
72
11
73
4
74
4
42
0.0000001
43
0.0000001
44
0.0000000001
40
0
40
0
40
0
40
0
40
1
40
1
40
1
40
2
40
2
40
2
40
2
10
0
20
0
30
0
10
100
20
50
30
0
10
40
20
40
30
0
10
15
20
23
30
0
11
0
21
0
31
0
11
200
21
200
31
0
11
80
21
80
31
0
11
432
21
234
31
0
0
ENDSEC
0
EOF
When I'm trying to open it in Autodesk TrueView, I'm getting an error:
Undefined group code 210 for object on line 54.
Invalid or incomplete DXF input -- drawing discarded.
Where is the error? When I'm copying just SPLINE section to the DXF generated by AI everything works fine. So I think I need to add something in the header section or something.
This file is DXF version AC1006 which is older than DXF R12. The SPLINE entity
requires at least DXF version AC1012 DXF R13/R14. But with DXF version AC1012
the tag structure of DXF files is changed (OBJECTS and CLASSES sections, SubClassMarkers ...), so just editing the DXF version
does not work.
See also: http://ezdxf.readthedocs.io/en/latest/dxfinternals/filestructure.html#minimal-dxf-content
Also the SPLINE entity seems to be invalid, it has no handle (5) and no owner
tag (330), and the whole AcDbEntity subclass is missing.
Your spline is of degree 3 with 11 knots (0, 0,0,0,1,1,1,2,2,2,2) and 4 control points ( (0,0), (100,50),(40,40),(15,23) ). This might be the problem culprit. You should either have 4 control points and 8 knots or 7 control points and 11 knots.
You may need to assign a handle to the SPLINE, since you're specifying $ACADVER = AC1018 = AutoCAD 2004 where item handles are required.
Try adding a 5-code pair right before the layer designation, like so, where AAAA is a unique hex-encoded handle:
0
SPLINE
5 <-- add these two lines
AAAA <--
8
shape
100
AcDbSpline