extracting lines with pivot column - awk

Input file:
S 235 1365 * 0 * * * 15 1 c81 592
H 235 296 99.7 + 0 0 3I296M1066I 14 1 s15018 1
H 235 719 95.4 + 0 0 174D545M820I 15 1 c2664 10
H 235 764 99.1 + 0 0 55I764M546I 15 1 c6519 4
H 235 792 100 + 0 0 180I792M393I 14 1 c407 107
S 236 1365 * 0 * * * 15 1 c474 152
H 236 279 95 + 0 0 765I279M321I 10-1 1 s7689 1
H 236 301 99.7 - 0 0 908I301M156I 15 1 s8443 1
H 236 563 95.2 - 0 0 728I563M74I 17 1 c1725 12
H 236 97 97.9 - 0 0 732I97M536I 17 1 s11472 1
S 237 1365 * 0 * * * 15 1 c474 152
H 237 279 95 + 0 0 765I279M321I 15 1 s7689 1
S 238 1365 * 0 * * * 12 1 c474 152
H 238 279 95 + 0 0 765I279M321I 10-1 1 s7689 1
H 238 301 99.7 - 0 0 908I301M156I 15 1 s8443 1
H 238 563 95.2 - 0 0 728I563M74I 17 1 c1725 12
H 238 97 97.9 - 0 0 732I97M536I 17 1 s11472 1
The output I want is shown below.
Example 1: specifying the ninth-column values "10-1", "15", and "17".
S 236 1365 * 0 * * * 15 1 c474 152
H 236 279 95 + 0 0 765I279M321I 10-1 1 s7689 1
H 236 301 99.7 - 0 0 908I301M156I 15 1 s8443 1
H 236 563 95.2 - 0 0 728I563M74I 17 1 c1725 12
H 236 97 97.9 - 0 0 732I97M536I 17 1 s11472 1
Example 2: specifying the ninth-column values "14" and "15".
S 235 1365 * 0 * * * 15 1 c81 592
H 235 296 99.7 + 0 0 3I296M1066I 14 1 s15018 1
H 235 719 95.4 + 0 0 174D545M820I 15 1 c2664 10
H 235 764 99.1 + 0 0 55I764M546I 15 1 c6519 4
H 235 792 100 + 0 0 180I792M393I 14 1 c407 107
Example 3: specifying the ninth-column value "15".
S 237 1365 * 0 * * * 15 1 c474 152
H 237 279 95 + 0 0 765I279M321I 15 1 s7689 1
I would like to extract sets of lines that share the same value in the second column, but only those sets whose ninth column contains specific values. A set must contain all of the specified values.
Set 238 has "12" in the ninth column, which is not one of the specified values, so I do not want it to be extracted.
This question is very similar to the question "Extracting lines using two criteria".

There are many possible approaches, but IMHO the most robust and the easiest to extend later is to create a hash table of the desired values (goodVals[] below) and then test whether the current $9 is a value that's not in that table:
BEGIN { split("10-1 15 17",tmp); for (i in tmp) goodVals[tmp[i]] }
$2 != prevPivot { prtCurrSet() }
!($9 in goodVals) { isBadSet=1 }
{ currSet = currSet $0 ORS; prevPivot = $2 }
END { prtCurrSet() }
function prtCurrSet() {
    if ( !isBadSet ) {
        printf "%s", currSet
    }
    currSet = ""
    isBadSet = 0
}
Given the new requirement from your comment, here's a solution for one possible interpretation of that requirement:
$ cat tst.awk
BEGIN { split("10-1 15 17",tmp); for (i in tmp) goodVals[tmp[i]] }
$2 != prevPivot { prtCurrSet() }
{ seen[$9]; currSet = currSet $0 ORS; prevPivot = $2 }
END { prtCurrSet() }
function prtCurrSet(    val,allGoodPresent) {
    allGoodPresent = 1
    for (val in goodVals) {
        if ( !(val in seen) ) {
            allGoodPresent = 0
        }
    }
    if ( allGoodPresent ) {
        printf "%s", currSet
    }
    currSet = ""
    delete seen
}
$ awk -f tst.awk file
S 236 1365 * 0 * * * 15 1 c474 152
H 236 279 95 + 0 0 765I279M321I 10-1 1 s7689 1
H 236 301 99.7 - 0 0 908I301M156I 15 1 s8443 1
H 236 563 95.2 - 0 0 728I563M74I 17 1 c1725 12
H 236 97 97.9 - 0 0 732I97M536I 17 1 s11472 1
and here's another:
$ cat tst.awk
BEGIN { split("10-1 15 17",tmp); for (i in tmp) goodVals[tmp[i]] }
$2 != prevPivot { prtCurrSet() }
{ seen[$9]; currSet = currSet $0 ORS; prevPivot = $2 }
END { prtCurrSet() }
function prtCurrSet(    val,allGoodPresent,someBadPresent) {
    allGoodPresent = 1
    for (val in goodVals) {
        if ( !(val in seen) ) {
            allGoodPresent = 0
        }
        delete seen[val]
    }
    someBadPresent = length(seen)
    if ( allGoodPresent && !someBadPresent ) {
        printf "%s", currSet
    }
    currSet = ""
    delete seen
}
$ awk -f tst.awk file
S 236 1365 * 0 * * * 15 1 c474 152
H 236 279 95 + 0 0 765I279M321I 10-1 1 s7689 1
H 236 301 99.7 - 0 0 908I301M156I 15 1 s8443 1
H 236 563 95.2 - 0 0 728I563M74I 17 1 c1725 12
H 236 97 97.9 - 0 0 732I97M536I 17 1 s11472 1
Unfortunately your posted sample input/output isn't adequate to test the differences.
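To compare the interpretations side by side, here is a minimal Python sketch of the same group-by-pivot logic. The function name select_sets, the mode names, and the abbreviated SAMPLE (a few lines per set, copied from the question's input) are illustrative assumptions and not part of the awk scripts above.

```python
from itertools import groupby

# Abbreviated copy of the question's input: a few lines per set.
SAMPLE = """\
S 235 1365 * 0 * * * 15 1 c81 592
H 235 296 99.7 + 0 0 3I296M1066I 14 1 s15018 1
S 236 1365 * 0 * * * 15 1 c474 152
H 236 279 95 + 0 0 765I279M321I 10-1 1 s7689 1
H 236 563 95.2 - 0 0 728I563M74I 17 1 c1725 12
S 237 1365 * 0 * * * 15 1 c474 152
H 237 279 95 + 0 0 765I279M321I 15 1 s7689 1
S 238 1365 * 0 * * * 12 1 c474 152
H 238 279 95 + 0 0 765I279M321I 10-1 1 s7689 1
H 238 301 99.7 - 0 0 908I301M156I 15 1 s8443 1
H 238 563 95.2 - 0 0 728I563M74I 17 1 c1725 12
"""


def select_sets(text, wanted, mode="exact"):
    """Return the pivot values (column 2) of the sets that pass the test.

    mode="no_bad":   every 9th-column value of the set is in `wanted`
    mode="all_good": every value in `wanted` occurs in the set
    mode="exact":    both of the above
    """
    wanted = set(wanted)
    rows = [line.split() for line in text.splitlines() if line.strip()]
    kept = []
    # groupby only merges adjacent rows, like the prevPivot test in awk.
    for pivot, group in groupby(rows, key=lambda fields: fields[1]):
        seen = {fields[8] for fields in group}
        no_bad = seen <= wanted
        all_good = wanted <= seen
        if mode == "no_bad":
            keep = no_bad
        elif mode == "all_good":
            keep = all_good
        else:
            keep = no_bad and all_good
        if keep:
            kept.append(pivot)
    return kept


if __name__ == "__main__":
    for mode in ("no_bad", "all_good", "exact"):
        print(mode, select_sets(SAMPLE, ["10-1", "15", "17"], mode))
```

On this abbreviated input the three modes select different sets for "10-1", "15", "17": 236 and 237 for no_bad, 236 and 238 for all_good, and only 236 for exact. The exact mode also reproduces Examples 2 and 3 from the question.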


Coordinates extracted from PDF are not exact

I'm working on rendering a georeferenced PDF within a map. I was able to retrieve the geolocation information from the PDF, but the coordinates I get are not correct: they are a few meters away from where they should be.
Opening the same PDF in Avenza Maps shows this list of coordinates, and these are correct:
[-26.413082, -51.561534, -26.435838, -51.561643, -26.435909, -51.543773,-26.413152, -51.543667]
With my approach (reading the PDF as a string and applying a regex) I get these values:
[-26.43302 -51.56133 -26.41418 -51.56124 -26.41424 -51.54409 -26.43309 -51.54418]
[-26.45579 -51.59842 -26.41777 -51.59822 -26.41811 -51.51036 -26.45613 -51.51053]
Unfortunately, neither of the two sets matches the correct location (as shown in Avenza).
That said, I opened the PDF in Notepad and found other values (related to conversion and other information), and I believe there may be some way to use this information to convert the coordinates I got into the correct ones.
Here is the information:
<?xpacket end="w"?>
endstream
endobj
294 0 obj
3495
endobj
295 0 obj
/DeviceRGB
endobj
296 0 obj
<</Length 297 0 R>>stream
/GS_init gs
/Group_6 Do
endstream
endobj
297 0 obj
24
endobj
298 0 obj
<</ExtGState 2 0 R/ColorSpace << /CS_P 295 0 R >>/XObject << /Group_6 6 0 R >>>>endobj
299 0 obj
<</Type /Group/S /Transparency/CS 295 0 R/I false/K false>>endobj
300 0 obj
<</Type /Page/Parent 301 0 R/Contents 296 0 R/Resources 298 0 R/MediaBox [0 0 841.88808 1190.5488]/ArtBox [0 0 841.88808 1190.5488]/UserUnit 1/Group 299 0 R/VP[<</Type /Viewport/BBox [14.1732 147.400915455 822.0456 1133.350548016]/Name (þÿ T S B I I)/Measure<</Type /Measure/Subtype /GEO/Bounds [0 0 0 1 1 1 1 0 0 0]/GPTS [ -26.43302 -51.56133 -26.41418 -51.56124 -26.41424 -51.54409 -26.43309 -51.54418]/LPTS [ 0 0 0 1 1 1 1 0]/GCS<</Type /PROJCS/WKT (PROJCS["SIRGAS_2000_UTM_Zone_22S",GEOGCS["GCS_SIRGAS_2000",DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",10000000.0],PARAMETER["Central_Meridian",-51.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]])>>>>>><</Type /Viewport/BBox [14.1732 14.1732 239.961243463 122.688692878]/Name (þÿ R e f e r e n c i a _ M a p a)/Measure<</Type /Measure/Subtype /GEO/Bounds [0 0 0 1 1 1 1 0 0 0]/GPTS [ -26.45579 -51.59842 -26.41777 -51.59822 -26.41811 -51.51036 -26.45613 -51.51053]/LPTS [ 0 0 0 1 1 1 1 0]/GCS<</Type /PROJCS/WKT (PROJCS["SIRGAS_2000_UTM_Zone_22S",GEOGCS["GCS_SIRGAS_2000",DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",500000.0],PARAMETER["False_Northing",10000000.0],PARAMETER["Central_Meridian",-51.0],PARAMETER["Scale_Factor",0.9996],PARAMETER["Latitude_Of_Origin",0.0],UNIT["Meter",1.0]])>>>>>>]>>endobj
301 0 obj
<</Type /Pages/Kids [ 300 0 R ]/Count 1>>endobj
302 0 obj
<<>>endobj
303 0 obj
<</Type /Catalog/Pages 301 0 R/PageMode /UseNone/PageLayout /SinglePage/ViewerPreferences <</PrintScaling /None /FitWindow true /DisplayDocTitle true>>/OpenAction [300 0 R /Fit]/OCProperties<</OCGs [ 10 0 R 11 0 R 12 0 R 13 0 R 14 0 R 15 0 R 16 0 R 17 0 R 18 0 R 19 0 R 20 0 R 21 0 R 22 0 R 35 0 R 36 0 R 43 0 R 44 0 R 47 0 R 50 0 R 53 0 R 56 0 R 59 0 R 62 0 R 63 0 R 64 0 R 65 0 R 66 0 R 67 0 R 68 0 R 69 0 R 76 0 R 77 0 R 80 0 R 83 0 R 90 0 R 93 0 R 96 0 R 99 0 R 102 0 R 105 0 R 108 0 R 111 0 R 114 0 R 117 0 R 120 0 R 123 0 R 126 0 R 129 0 R 132 0 R 135 0 R 138 0 R 141 0 R 148 0 R 149 0 R 152 0 R 155 0 R 158 0 R 161 0 R 176 0 R ]/D<</Name (Layers Tree)/Order [ 176 0 R 161 0 R 158 0 R 148 0 R [ 155 0 R 152 0 R 149 0 R ] 141 0 R 138 0 R 135 0 R 132 0 R 129 0 R 126 0 R 123 0 R 120 0 R 117 0 R 114 0 R 111 0 R 108 0 R 105 0 R 102 0 R 99 0 R 96 0 R 93 0 R 90 0 R 83 0 R 80 0 R 62 0 R [ 76 0 R [ 77 0 R ] 63 0 R [ 69 0 R 68 0 R 67 0 R 64 0 R [ 66 0 R 65 0 R ] ] ] 10 0 R [ 59 0 R 43 0 R [ 56 0 R 53 0 R 50 0 R 47 0 R 44 0 R ] 11 0 R [ 36 0 R 35 0 R 22 0 R 21 0 R 12 0 R [ 20 0 R 19 0 R 18 0 R 17 0 R 16 0 R 15 0 R 14 0 R 13 0 R ] ] ] ]/ListMode /VisiblePages>>>>/Metadata 293 0 R>>endobj
304 0 obj
<</Type/XRef/Size 305/W[1 4 2]/Filter/FlateDecode/Info 292 0 R/Root 303 0 R/ID [<c9167b70223726438d277b1b4409c053> <c9167b70223726438d277b1b4409c053>]/Length 923>>stream
I need someone to point me to a way to get the correct coordinates. I hope this information helps.
The PDF content in your question includes two Viewport dictionaries.
These dictionaries map a location on the page ("BBox")
onto the GPTS, which are interpreted using the specified WKT.
This is covered in the PDF 2.0 reference, ISO 32000-2, sections 12.9 and 12.10.
Unfortunately, this spec is not freely available, and it's not cheap.
Here are some definitions from the spec:
BBox:
A rectangle in default user space coordinates specifying the location of the viewport on the page.
The two coordinate pairs of the rectangle shall be specified in normalised form; that is, lower-left followed by upper-right, relative to the measuring coordinate system. This ordering shall determine the orientation of the measuring coordinate system (that is, the direction of the positive x and y axes) in this viewport, which may have a different rotation from the page.
GPTS:
(Required; PDF 2.0) An array of numbers that shall be taken pairwise, defining points in geographic space as degrees of latitude and longitude, respectively when defining a geographic coordinate system. These values shall be based on the geographic coordinate system described in the GCS dictionary. When defining a projected coordinate system, this array contains values in a planar projected coordinate space as eastings and northings. For Geospatial3D, when Geospatial feature information is present (requirement type Geospatial3D) in a 3D annotation, the GPTS array is required to hold 3D point coordinates as triples rather than pairwise where the third value of each triple is an elevation value.
NOTE 2 Any projected coordinate system includes an underlying geographic coordinate system.
WKT:
A string of Well Known Text describing the geographic coordinate system.
The assumption is, if you're interested in Geospatial coordinates,
then you know what a WKT is, and what the projection means.
This may be enough information for you to map the geo coordinates for the
separate viewports to their locations on the page.
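As a rough illustration of that mapping, here is a minimal Python sketch that interpolates latitude and longitude for a page point from the first viewport's BBox, LPTS and GPTS. The function name page_to_latlon is hypothetical, and the sketch assumes that the LPTS are exactly the four corners of the unit square mapped onto the BBox and that interpolating directly in degrees is accurate enough over such a small area. A stricter approach would project the GPTS with the WKT and fit the transform in UTM metres.

```python
# Values copied from the first viewport (TSBII) in the question.
BBOX = (14.1732, 147.400915455, 822.0456, 1133.350548016)  # llx, lly, urx, ury
LPTS = [(0, 0), (0, 1), (1, 1), (1, 0)]                    # unit-square corners
GPTS = [(-26.43302, -51.56133), (-26.41418, -51.56124),
        (-26.41424, -51.54409), (-26.43309, -51.54418)]    # (lat, lon) per corner


def page_to_latlon(x, y, bbox=BBOX, lpts=LPTS, gpts=GPTS):
    """Bilinearly interpolate (lat, lon) for a point in default user space.

    Assumption: lpts holds the four unit-square corners, in any order.
    Points outside the BBox are extrapolated with the same formula.
    """
    llx, lly, urx, ury = bbox
    u = (x - llx) / (urx - llx)   # 0 at the left edge of the BBox, 1 at the right
    v = (y - lly) / (ury - lly)   # 0 at the bottom edge, 1 at the top
    corner = dict(zip(lpts, gpts))
    weights = {
        (0, 0): (1 - u) * (1 - v),
        (0, 1): (1 - u) * v,
        (1, 1): u * v,
        (1, 0): u * (1 - v),
    }
    lat = sum(w * corner[key][0] for key, w in weights.items())
    lon = sum(w * corner[key][1] for key, w in weights.items())
    return lat, lon


if __name__ == "__main__":
    print(page_to_latlon(14.1732, 147.400915455))  # lower-left corner of the BBox
    print(page_to_latlon(0.0, 0.0))                # lower-left corner of the page
```

Evaluated at the page corner (0, 0), the sketch gives roughly (-26.43584, -51.56164), which is close to one of the Avenza pairs in the question. That suggests Avenza lists the corners of the whole page while the GPTS describe the corners of the viewport, but this is an inference from the numbers, not something the PDF states.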
Here are the PDF Viewports in more readable form:
/VP [
<<
/Type
/Viewport
/BBox [14.1732 147.400915455 822.0456 1133.350548016]
/Name (TSBII)
/Measure <<
/Type
/Measure
/Subtype
/GEO
/Bounds [0 0 0 1 1 1 1 0 0 0]
/GPTS [ -26.43302 -51.56133 -26.41418 -51.56124
-26.41424 -51.54409 -26.43309 -51.54418]
/LPTS [ 0 0 0 1 1 1 1 0]
/GCS<<
/Type
/PROJCS
/WKT (
PROJCS["SIRGAS_2000_UTM_Zone_22S",
GEOGCS["GCS_SIRGAS_2000",
DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],
PRIMEM["Greenwich",0.0],
UNIT["Degree",0.0174532925199433]
],
PROJECTION["Transverse_Mercator"],
PARAMETER["False_Easting",500000.0],
PARAMETER["False_Northing",10000000.0],
PARAMETER["Central_Meridian",-51.0],
PARAMETER["Scale_Factor",0.9996],
PARAMETER["Latitude_Of_Origin",0.0],
UNIT["Meter",1.0]
]
)
>>
>>
>>
<<
/Type
/Viewport
/BBox [14.1732 14.1732 239.961243463 122.688692878]
/Name (Referencia_Mapa)
/Measure <<
/Type
/Measure
/Subtype
/GEO
/Bounds [0 0 0 1 1 1 1 0 0 0]
/GPTS [ -26.45579 -51.59842 -26.41777 -51.59822
-26.41811 -51.51036 -26.45613 -51.51053]
/LPTS [ 0 0 0 1 1 1 1 0]
/GCS<<
/Type
/PROJCS
/WKT (
PROJCS["SIRGAS_2000_UTM_Zone_22S",
GEOGCS["GCS_SIRGAS_2000",
DATUM["D_SIRGAS_2000",SPHEROID["GRS_1980",6378137.0,298.257222101]],
PRIMEM["Greenwich",0.0],
UNIT["Degree",0.0174532925199433]
],
PROJECTION["Transverse_Mercator"],
PARAMETER["False_Easting",500000.0],
PARAMETER["False_Northing",10000000.0],
PARAMETER["Central_Meridian",-51.0],
PARAMETER["Scale_Factor",0.9996],
PARAMETER["Latitude_Of_Origin",0.0],
UNIT["Meter",1.0]])
>>
>>
>>
]
>>
Note that a PDF file is a structured document and cannot be reliably parsed as a plain string. These specific elements could be compressed, or might occur multiple times for different pages. You'll need a toolkit that can access pages, resources, and dictionaries in order to locate the viewports.

Display command in AMPL

I have a two-dimensional variable in AMPL and I want to display it. I want to change the order of the indices, but I do not know how to do that. Below are my code, my data, the output I get, and the output I would like to have.
Here is my code:
param n;
param t;
param w;
param p;
set Var, default{1..n};
set Ind, default{1..t};
set mode, default{1..w};
var E{mode, Ind};
var B{mode,Var};
var C{mode,Ind};
param X{mode,Var,Ind};
var H{Ind};
minimize obj: sum{m in mode,i in Ind}E[m,i];
s.t. a1{m in mode, i in Ind}: sum{j in Var} X[m,j,i]*B[m,j] -C[m,i] <=E[m,i];
solve;
display C;
data;
param w:=4;
param n:=9;
param t:=2;
param X:=
[*,*,1]: 1 2 3 4 5 6 7 8 9 :=
1 69 59 100 70 35 1 1 0 0
2 34 31 372 71 35 1 0 1 0
3 35 25 417 70 35 1 0 0 1
4 0 10 180 30 35 1 0 0 0
[*,*,2]: 1 2 3 4 5 6 7 8 9 :=
1 64 58 68 68 30 2 1 0 0
2 44 31 354 84 30 2 0 1 0
3 53 25 399 85 30 2 0 0 1
4 0 11 255 50 30 2 0 0 0
The output of this code using glpsol looks like this:
C[1,1].val = -1.11111111111111
C[1,2].val = -1.11111111111111
C[2,1].val = -0.858585858585859
C[2,2].val = -1.11111111111111
C[3,1].val = -0.915032679738562
C[3,2].val = -1.11111111111111
C[4,1].val = 0.141414141414141
C[4,2].val = 0.2003367003367
but I want the result to be like this:
C[1,1].val = -1.11111111111111
C[2,1].val = -0.858585858585859
C[3,1].val = -0.915032679738562
C[4,1].val = 0.141414141414141
C[1,2].val = -1.11111111111111
C[2,2].val = -1.11111111111111
C[3,2].val = -1.11111111111111
C[4,2].val = 0.2003367003367
Any ideas?
You can use for loops and printf commands in your .run file:
for {i in Ind}
    for {m in mode}
        printf "C[%d,%d] = %.4f\n", m, i, C[m,i];
or even:
printf {i in Ind, m in mode} "C[%d,%d] = %.4f\n", m, i, C[m,i];
I don't get the same numerical results as you, but anyway the output works:
C[1,1] = 0.0000
C[2,1] = 0.0000
C[3,1] = 0.0000
C[4,1] = 0.0000
C[1,2] = 0.0000
C[2,2] = 0.0000
C[3,2] = 0.0000
C[4,2] = 0.0000
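The only thing that changes the order is which index the outer loop runs over. The following Python sketch shows the same iteration order outside AMPL. The dictionary C holds the values copied from the question's output, and ordered_keys is a hypothetical helper name.

```python
# Values copied from the question's output, keyed by (m, i).
C = {
    (1, 1): -1.11111111111111, (1, 2): -1.11111111111111,
    (2, 1): -0.858585858585859, (2, 2): -1.11111111111111,
    (3, 1): -0.915032679738562, (3, 2): -1.11111111111111,
    (4, 1): 0.141414141414141, (4, 2): 0.2003367003367,
}
modes = range(1, 5)   # the set "mode" (1..w with w = 4)
inds = range(1, 3)    # the set "Ind" (1..t with t = 2)


def ordered_keys():
    """Return (m, i) pairs with i in the outer loop and m in the inner loop."""
    return [(m, i) for i in inds for m in modes]


if __name__ == "__main__":
    for m, i in ordered_keys():
        print(f"C[{m},{i}] = {C[m, i]:.4f}")
```

The subscripts stay in the declared order (m, i); only the traversal order is swapped, which is what the printf forms above do as well.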

What does each number represent in MNIST?

I have successfully downloaded the MNIST data as files with the .npy extension. When I print a few columns of the first image, I get the following result. What does each number represent here?
import numpy as np
a = np.load("training_set.npy")
print(a[:1,100:200])
print(a.shape)
[[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 18
18 18 126 136 175 26 166 255 247 127 0 0 0 0 0 0 0 0
0 0 0 0 30 36 94 154 170 253 253 253 253 253 225 172 253 242
195 64 0 0 0 0 0 0 0 0]]
(60000, 784)
These are the intensity values (0-255) for each of the 784 pixels (28x28) of a MNIST image; the total number of training images is 60,000 (you'll find 10,000 more images in the test set).
(60000, 784) means 60,000 samples (images), each one consisting of 784 features (pixel values).
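To see where a printed number sits in the picture, the flat index can be converted back to a row and column. The sketch below uses plain Python so that it runs without the data file. The helper names pixel_position and to_rows are illustrative, and it assumes the usual row-major layout of the flattened 28x28 image. With NumPy the equivalent step is a[0].reshape(28, 28).

```python
WIDTH = 28  # each image is 28x28, so a flat row holds 28 * 28 = 784 values


def pixel_position(flat_index, width=WIDTH):
    """Map an index into the 784-long row to (row, column) in the image."""
    return divmod(flat_index, width)


def to_rows(flat_pixels, width=WIDTH):
    """Split a flat sequence of pixel values into image rows."""
    return [list(flat_pixels[start:start + width])
            for start in range(0, len(flat_pixels), width)]


if __name__ == "__main__":
    # The slice in the question starts at flat index 100, and its first
    # non-zero value (3) is the 53rd number printed, i.e. flat index 152.
    print(pixel_position(152))  # (5, 12)
```

So the value 3 in the question's printout is the pixel in row 5, column 12 of the first image (counting from zero), and the run of larger values after it is part of the stroke of the digit.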

matplotlib surface plot linewidth wrong

I am trying to plot a surface using matplotlib.
The problem is that even though I set the linewidth to zero in the code and show() displays the plot correctly without lines, the generated PDF still has lines in it.
Can anyone tell me how to resolve this problem?
Here is the code that I am using to plot:
#!/usr/bin/env python
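import numpy as np
from matplotlib import cm
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the '3d' projection on older matplotlib versions
import scipy.ndimage as ndimage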
vmaxValue=400
#plt.ion()
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
fileName="test"
csvFile=fileName+".csv"
outputFile=fileName+".pdf"
pgfFile=fileName+".pgf"
data = np.genfromtxt(csvFile)
# Delete the first row and first column.
Z = np.delete(data, (0), axis=0)
Z = np.delete(Z, (0), axis=1)
Z2 = ndimage.gaussian_filter(Z, sigma=0.85, order=0)
X, Y = np.meshgrid(np.arange(1,len(Z[0])+1,1), np.arange(1,len(Z)+1,1))
surf = ax.plot_surface(X, Y, Z2, linewidth=0, cstride=1, rstride=1, cmap=cm.coolwarm, antialiased=False, vmin=0, vmax=vmaxValue)
#plt.colorbar()
fig.colorbar(surf, ax=ax, shrink=0.75,pad=-0.05, aspect=15)
ax.set_zlim(0,vmaxValue)
ax.set_xlabel(r'$\alpha$')
ax.set_ylabel('processors')
ax.set_zlabel('Exploration Time(seconds)')
ax.view_init(20, -160)
fig.savefig(outputFile,format="pdf", bbox_inches='tight')
Here is the CSV file, test.csv:
Processors graph1 graph2 graph3 graph4 graph5 graph6 graph7 graph8 graph9 graph10 graph11 graph12 graph13 graph14 graph15 graph16 graph17 graph18 graph19 graph20 graph21 graph22 graph23
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 10 7 190 180 360 180 360 180 360 180 360 180 360 180 360
3 0 0 0 0 0 0 0 0 0 64 52 85 247 274 180 360 360 360 360 360 360 360 360
4 0 0 0 0 0 0 0 0 6 1 1 2 180 180 187 187 180 180 360 360 180 180 360
5 0 0 0 0 0 0 0 0 0 0 180 177 175 180 180 360 360 360 360 360 540 540 360
6 0 0 0 0 0 0 0 1 1 1 1 1 181 181 180 180 180 180 360 360 360 360 180
7 0 0 0 0 0 0 0 8 12 6 6 7 8 8 180 180 180 180 180 180 180 360 180
8 0 0 0 0 0 0 0 0 180 133 175 166 148 180 180 180 180 180 180 180 180 180 360
9 0 0 0 0 0 0 0 0 0 180 180 180 180 180 180 180 180 180 180 180 180 180 360
10 0 0 0 0 0 0 0 0 0 0 180 180 180 180 180 180 180 180 180 180 180 180 360
11 0 0 0 0 0 0 0 0 0 0 0 180 180 180 180 180 180 180 180 180 180 180 360
12 0 0 0 0 0 0 0 0 0 0 0 0 180 180 180 180 180 180 180 180 180 180 180
13 0 0 0 0 0 0 0 0 0 0 0 0 0 180 180 180 180 180 180 180 180 180 180
14 0 0 0 0 0 0 0 0 0 0 0 0 0 0 180 180 180 180 180 180 180 180 180
15 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 180 180 180 180 180 180 180 180
16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 180 180 180 180 180 180 180
17 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 180 180 180 180 180 180
18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 180 180 180 180 180
19 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 180 180 180 180
20 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 180 184 180
21 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 180 180
22 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 180
23 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
24 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
25 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Thanks
Apparently it has to do with PDF antialiasing.
If you don't mind some extra computation time, this answer suggests that you can get rid of the lines by plotting the same surface multiple times:
for i in range(k):
    ax.plot_surface(X, Y, Z2, linewidth=0, cstride=1, rstride=1, cmap=cm.coolwarm, antialiased=False, vmin=0, vmax=vmaxValue)
Note that k=2 worked pretty well for me. I wouldn't say it doubles your file size every time you do it, but the size rises by a noticeable amount.
If you feel more adventurous, you can check this answer regarding the same issue with contourf. In summary:
for c in cnt.collections:
    c.set_edgecolor("face")
Unfortunately, matplotlib 3D objects have no collections attribute, so it won't work right away, but hopefully this gives you some inspiration.

Extract helix residues from DSSP with awk

I would like to extract helix (H) residues from DSSP files.
1CRN.dssp
31 37 A K H < S+
32 38 A V H < S+
33 39 A F H >< S-
34 40 A G G >< S+
35 41 A K G > S+
1GB5.dssp
113 242 B G H 3>>S+
114 243 B I H <45S+
115 244 B L H X45S+
116 245 B S H 3<5S+
117 246 B K T >X5S+
I want to save the output in the following format.
>1CRN
KVF
>1GB5
GILS
How can I do this with awk? Your suggestions would be appreciated!
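I assume it's the 'H' in the 5th column that indicates "helix(H) residues"?
awk '{
    if (FNR == 1) {
        if (NR > 1) printf "\n"   # end the previous file's sequence line
        print ">" FILENAME
    }
    if ($5 == "H") {
        printf "%s", $4           # avoid using input data as the format string
    }
}
END { printf "\n" }' file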
output
>tstDat.txt
KVF
IHTH
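For comparison, here is a minimal Python sketch of the same filter that also produces one header per structure, as the question asks. The names helix_residues, to_fasta and EXCERPTS are illustrative. The sketch assumes whitespace-separated rows like the excerpts in the question. Real DSSP files are fixed-width, so splitting on whitespace can misalign rows in which a column is blank.

```python
# Excerpts copied from the question, keyed by structure name.
EXCERPTS = {
    "1CRN": [
        "31 37 A K H < S+",
        "32 38 A V H < S+",
        "33 39 A F H >< S-",
        "34 40 A G G >< S+",
        "35 41 A K G > S+",
    ],
    "1GB5": [
        "113 242 B G H 3>>S+",
        "114 243 B I H <45S+",
        "115 244 B L H X45S+",
        "116 245 B S H 3<5S+",
        "117 246 B K T >X5S+",
    ],
}


def helix_residues(lines):
    """Join the residue letter (4th field) of rows whose 5th field is H."""
    residues = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 5 and fields[4] == "H":
            residues.append(fields[3])
    return "".join(residues)


def to_fasta(named_excerpts):
    """Build one '>name' header plus one sequence line per structure."""
    records = [f">{name}\n{helix_residues(lines)}"
               for name, lines in named_excerpts.items()]
    return "\n".join(records)


if __name__ == "__main__":
    print(to_fasta(EXCERPTS))
```

On the two excerpts this prints the headers >1CRN and >1GB5 followed by KVF and GILS, which is the format requested in the question.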