I labeled part of a GeoTIFF image with a polygon using annotation software whose output is the X,Y image coordinates (including sub-pixel ones) of each point in the polygon, in XML format.
The question is how to convert these points to UTM coordinates in Python and write them out in GeoJSON format.
From the GeoTiff image, I can extract the following information:
gdalinfo -mm test-area.tif
Driver: GTiff/GeoTIFF
Files: test-area.tif
Size is 1356, 1351
Coordinate System is:
PROJCRS["WGS 84 / UTM zone 11N",
BASEGEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]],
CONVERSION["UTM zone 11N",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",-117,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",500000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",0,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["unknown"],
AREA["World - N hemisphere - 120°W to 114°W - by country"],
BBOX[0,-120,84,-114]],
ID["EPSG",32611]]
Data axis to CRS axis mapping: 1,2
Origin = (432390.000000000000000,3727776.000000000000000)
Pixel Size = (3.000000000000000,-3.000000000000000)
Metadata:
AREA_OR_POINT=Area
Image Structure Metadata:
INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left ( 432390.000, 3727776.000) (117d43'46.05"W, 33d41'15.97"N)
Lower Left ( 432390.000, 3723723.000) (117d43'44.94"W, 33d39' 4.38"N)
Upper Right ( 436458.000, 3727776.000) (117d41' 8.05"W, 33d41'16.87"N)
Lower Right ( 436458.000, 3723723.000) (117d41' 7.01"W, 33d39' 5.28"N)
Center ( 434424.000, 3725749.500) (117d42'26.51"W, 33d40'10.63"N)
Band 1 Block=1356x1 Type=Int16, ColorInterp=Gray
Computed Min/Max=185.000,4470.000
NoData Value=32767
Band 2 Block=1356x1 Type=Int16, ColorInterp=Undefined
Computed Min/Max=299.000,4895.000
NoData Value=32767
Band 3 Block=1356x1 Type=Int16, ColorInterp=Undefined
Computed Min/Max=276.000,5419.000
NoData Value=32767
Band 4 Block=1356x1 Type=Int16, ColorInterp=Undefined
Computed Min/Max=659.000,5466.000
NoData Value=32767
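For a north-up raster such as this one, the Origin and Pixel Size lines are enough to map image coordinates to UTM. The following is a minimal sketch, not a definitive implementation: the function names and the example polygon are hypothetical, it assumes the annotation tool places (0, 0) at the outer corner of the upper-left pixel, and the "crs" member follows the legacy (pre-RFC 7946) GeoJSON convention for naming a non-WGS84 CRS.

```python
import json

# Values copied from the gdalinfo output above.
ORIGIN_X, ORIGIN_Y = 432390.0, 3727776.0  # upper-left corner of the raster (metres)
PIXEL_W, PIXEL_H = 3.0, -3.0              # pixel size; negative height means north-up


def pixel_to_utm(col, row):
    """Map image coordinates (col, row) to UTM easting/northing.

    Assumes no rotation terms in the geotransform and that (0, 0) is the
    outer corner of the upper-left pixel (an assumption about the tool).
    """
    return ORIGIN_X + col * PIXEL_W, ORIGIN_Y + row * PIXEL_H


def polygon_to_geojson(pixel_points):
    """Build a GeoJSON Feature whose coordinates are in EPSG:32611."""
    ring = [list(pixel_to_utm(c, r)) for c, r in pixel_points]
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # GeoJSON linear rings must be closed
    return {
        "type": "Feature",
        "properties": {},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        # Legacy way of naming the CRS; RFC 7946 readers may ignore it.
        "crs": {"type": "name",
                "properties": {"name": "urn:ogc:def:crs:EPSG::32611"}},
    }


# Hypothetical sub-pixel polygon, as it might come out of the XML file.
points = [(10.5, 20.25), (100.0, 20.25), (100.0, 80.75)]
print(json.dumps(polygon_to_geojson(points), indent=2))
```

As a sanity check, the raster size (1356, 1351) maps to the Lower Right corner reported by gdalinfo.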
I have text data that looks like the following after extracting from a file and cleaning. I want to put the data into a pandas dataframe where the columns are ('EXAMINATION', 'TECHNIQUE', 'COMPARISON', 'FINDINGS', 'IMPRESSION'), and each cell in each row contains the extracted data related to the column name (i.e. the keyword).
'FINAL REPORT EXAMINATION: CHEST PA AND LAT INDICATION: F with new onset ascites eval for infection TECHNIQUE: Chest PA and lateral COMPARISON: None FINDINGS: There is no focal consolidation pleural effusion or pneumothorax Bilateral nodular opacities that most likely represent nipple shadows The cardiomediastinal silhouette is normal Clips project over the left lung potentially within the breast The imaged upper abdomen is unremarkable Chronic deformity of the posterior left sixth and seventh ribs are noted IMPRESSION: No acute cardiopulmonary process'
For example, under the column TECHNIQUE there should be a cell containing "Chest PA and lateral", and under the column IMPRESSION, there should be a cell containing "No acute cardiopulmonary process".
The solution is below. Please note the following assumptions:
Keywords as presented are located in that order within the sample text.
The keywords are not contained within the text to be extracted.
Each keyword is followed by a ": " (the colon and whitespace are removed).
Solution
import pandas as pd
sample = "FINAL REPORT EXAMINATION: CHEST PA AND LAT INDICATION: F with new onset ascites eval for infection TECHNIQUE: Chest PA and lateral COMPARISON: None FINDINGS: There is no focal consolidation pleural effusion or pneumothorax Bilateral nodular opacities that most likely represent nipple shadows The cardiomediastinal silhouette is normal Clips project over the left lung potentially within the breast The imaged upper abdomen is unremarkable Chronic deformity of the posterior left sixth and seventh ribs are noted IMPRESSION: No acute cardiopulmonary process"
keywords = ["EXAMINATION", "TECHNIQUE", "COMPARISON", "FINDINGS", "IMPRESSION"]
# Create function to extract text between each of the keywords
def extract_text_using_keywords(clean_text, keyword_list):
    extracted_texts = []
    for prev_kw, current_kw in zip(keyword_list, keyword_list[1:]):
        prev_kw_index = clean_text.index(prev_kw)
        current_kw_index = clean_text.index(current_kw)
        extracted_texts.append(clean_text[prev_kw_index + len(prev_kw) + 2:current_kw_index])
        # Extract the text after the final keyword in keyword_list (i.e. "IMPRESSION")
        if current_kw == keyword_list[-1]:
            extracted_texts.append(clean_text[current_kw_index + len(current_kw) + 2:len(clean_text)])
    return extracted_texts
# Extract text
result = extract_text_using_keywords(sample, keywords)
# Create pandas dataframe
df = pd.DataFrame([result], columns=keywords)
print(df)
# To append future results to the end of the pandas df you can use
# df.loc[len(df)] = result
Output
EXAMINATION TECHNIQUE COMPARISON FINDINGS IMPRESSION
0 CHEST PA AND LAT INDICATION: F with new onset ... Chest PA and lateral None There is no focal consolidation pleural effusi... No acute cardiopulmonary process
It looks like the input is organized such that EXAMINATION, TECHNIQUE, etc. occur in that order.
One approach is to iterate over pairs of strings and use .split() to select the content between them:
import pandas as pd
data = 'FINAL REPORT EXAMINATION: CHEST PA AND LAT INDICATION: F with new onset ascites eval for infection TECHNIQUE: Chest PA and lateral COMPARISON: None FINDINGS: There is no focal consolidation pleural effusion or pneumothorax Bilateral nodular opacities that most likely represent nipple shadows The cardiomediastinal silhouette is normal Clips project over the left lung potentially within the breast The imaged upper abdomen is unremarkable Chronic deformity of the posterior left sixth and seventh ribs are noted IMPRESSION: No acute cardiopulmonary process'
strings = ('EXAMINATION','TECHNIQUE', 'COMPARISON','FINDINGS', 'IMPRESSION', '')
out = {}
for s1, s2 in zip(strings, strings[1:]):
    if not s2:
        text = data.split(s1)[1]
    else:
        text = data.split(s1)[1].split(s2)[0]
    out[s1] = [text]
print(pd.DataFrame(out))
Which results in:
EXAMINATION TECHNIQUE COMPARISON FINDINGS IMPRESSION
0 : CHEST PA AND LAT INDICATION: F with new onse... : Chest PA and lateral : None : There is no focal consolidation pleural effu... : No acute cardiopulmonary process
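Note that the cells above still carry the leading ": " and surrounding whitespace. A possible variant, offered as a sketch rather than as either answerer's method, splits on "KEYWORD:" with a regular expression and strips each piece. The function name split_report is hypothetical, and the sample is a shortened copy of the text in the question.

```python
import re


def split_report(text, keywords):
    """Return {keyword: text following 'KEYWORD:' up to the next keyword}."""
    pattern = r"\b(" + "|".join(re.escape(k) for k in keywords) + r"):"
    parts = re.split(pattern, text)
    # With one capture group, re.split yields
    # [preamble, kw1, body1, kw2, body2, ...].
    return {kw: body.strip() for kw, body in zip(parts[1::2], parts[2::2])}


# Shortened copy of the sample report from the question.
sample = (
    "FINAL REPORT EXAMINATION: CHEST PA AND LAT INDICATION: F with new onset "
    "ascites eval for infection TECHNIQUE: Chest PA and lateral COMPARISON: None "
    "FINDINGS: There is no focal consolidation "
    "IMPRESSION: No acute cardiopulmonary process"
)
keywords = ["EXAMINATION", "TECHNIQUE", "COMPARISON", "FINDINGS", "IMPRESSION"]

row = split_report(sample, keywords)
for key, value in row.items():
    print(f"{key}: {value!r}")
```

Passing the result as pd.DataFrame([row]) gives a one-row frame with the keywords as columns. As in the answers above, INDICATION is not in the keyword list, so its text stays inside the EXAMINATION cell.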
In some image processing applications, the three RGB values of a pixel are represented by a single value.
For example: The single value for RGB(2758, 5541, 4055) is 4542.64
There are some existing questions on how to obtain a single pixel value from 8-bit RGB images, but none of the answers work with 48-bit RGB images. How can I obtain that value?
If I do (2758 + 5541 + 4055) / 3 the result is 4118, which is close but not the same.
It appears that you are trying to determine the grayscale formula used to arrive at that given value. I suggest that you read Seven grayscale conversion algorithms by Tanner Helland.
Based on your example of:
The single value for RGB(2758, 5541, 4055) is 4542.64
It appears that value is computed using the formula:
Gray = (Red * 0.3 + Green * 0.59 + Blue * 0.11)
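The weights can be checked against the example from the question with a few lines of Python. This is only a sketch: the function name is hypothetical, and nothing in the formula depends on the bit depth, so the same weights apply to 16-bit-per-channel (48-bit) values.

```python
def weighted_gray(red, green, blue):
    """Weighted grayscale value using the 0.3 / 0.59 / 0.11 weights above."""
    return red * 0.3 + green * 0.59 + blue * 0.11


r, g, b = 2758, 5541, 4055
print(round(weighted_gray(r, g, b), 2))  # weighted value
print((r + g + b) / 3)                   # plain average, for comparison
```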
I have a map file in a tiff format that is not geo-referenced, but I know the following projection information: +proj=lcc +lon_0=10.856 +lat_0=51.322 +lat_1=42 +lat_2=57 +datum=WGS84. I would like to transform it into EPSG:3857.
In my first attempt I tried to fix four arbitrary points and perform warping:
gdal_translate -a_srs "+proj=lcc +lon_0=10.856 +lat_0=51.322 +lat_1=42 +lat_2=57 +datum=WGS84" -of GTiff -gcp 19725 11865 1 56 -gcp 103755 12990 23 56 -gcp 112755 80400 23 46 -gcp 8925 78990 1 46 original.tif translated.tif
But the coordinates of the corners of the warped image were not as expected and the map was distorted.
Corner Coordinates:
Upper Left ( 1208480.629, 6678543.890) ( 10d51'21.48"E, 51d19'21.08"N)
Lower Left ( 1208480.629, 6678541.123) ( 10d51'21.48"E, 51d19'21.03"N)
Upper Right ( 1208486.180, 6678543.890) ( 10d51'21.66"E, 51d19'21.08"N)
Lower Right ( 1208486.180, 6678541.123) ( 10d51'21.66"E, 51d19'21.03"N)
Center ( 1208483.404, 6678542.507) ( 10d51'21.57"E, 51d19'21.06"N)
After reading many posts, here and elsewhere, I came to this:
Converting the UL and LR corners to x and y
cs2cs +proj=latlong +lon_0=10.856 +lat_0=51.322 +lat_1=42 +lat_2=57 +datum=WGS84 +to +init=epsg:3857
-3.63 57.28 > -404089.75 7817564.89
23.76 44.39 > 2644951.10 5525995.28
Geo-referencing the tiff
gdal_translate -a_srs "+proj=lcc +lon_0=10.856 +lat_0=51.322 +lat_1=42 +lat_2=57 +datum=WGS84" -a_ullr -404089.75 7817564.89 2644951.10 5525995.28 original.tif translated.tif
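One thing worth checking at this step: the corner values passed to -a_ullr come from a cs2cs call whose target is EPSG:3857, while -a_srs declares the Lambert conformal conic projection, and -a_ullr bounds are interpreted in the assigned SRS. The sketch below (plain Python, hypothetical function name) applies the spherical Web Mercator forward formula to the two corners so the result can be compared with the cs2cs output above.

```python
import math

R = 6378137.0  # sphere radius used by EPSG:3857 (the WGS 84 semi-major axis)


def web_mercator(lon_deg, lat_deg):
    """Forward spherical (Web) Mercator projection, result in metres."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Upper-left and lower-right corners as given in the cs2cs step.
for lon, lat in [(-3.63, 57.28), (23.76, 44.39)]:
    x, y = web_mercator(lon, lat)
    print(f"{lon} {lat} -> {x:.2f} {y:.2f}")
```

If the printed values agree with the cs2cs results, the bounds are Web Mercator coordinates rather than coordinates in the LCC projection, which may explain the implausible corner latitudes and longitudes in the gdalinfo output.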
The result of gdalinfo translated.tif is as follows:
Size is 14735, 11333
Coordinate System is:
PROJCS["unnamed",
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433],
AUTHORITY["EPSG","4326"]],
PROJECTION["Lambert_Conformal_Conic_2SP"],
PARAMETER["standard_parallel_1",42],
PARAMETER["standard_parallel_2",57],
PARAMETER["latitude_of_origin",51.322],
PARAMETER["central_meridian",10.856],
PARAMETER["false_easting",0],
PARAMETER["false_northing",0],
UNIT["metre",1,
AUTHORITY["EPSG","9001"]]]
Origin = (-404089.750000000000000,7817564.889999999664724)
Pixel Size = (206.925066168985410,-202.203265684284787)
Metadata:
AREA_OR_POINT=Area
TIFFTAG_RESOLUTIONUNIT=2 (pixels/inch)
TIFFTAG_XRESOLUTION=144
TIFFTAG_YRESOLUTION=144
Image Structure Metadata:
INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left ( -404089.750, 7817564.890) (146d18'53.31"E, 73d27'53.56"N)
Lower Left ( -404089.750, 5525995.280) (158d45'33.92"W, 88d 1'24.64"N)
Upper Right ( 2644951.100, 7817564.890) (172d26'19.88"W, 64d26'59.68"N)
Lower Right ( 2644951.100, 5525995.280) (138d13'57.20"E, 73d22'12.76"N)
Center ( 1120430.675, 6671780.085) (161d52'17.63"W, 79d37'35.75"N)
However, the coordinates of the corners above are completely off. When I attempted to warp it, the output image grew so large that I had to stop the process after a couple of minutes.
gdalwarp -overwrite -t_srs EPSG:3857 -r near -co COMPRESS=LZW translated.tif warped.tif
The result of gdalinfo warped.tif is as follows:
Corner Coordinates:
Upper Left (-20037508.339,29704755.766) (180d 0' 0.00"W, 88d54'44.28"N)
Lower Left (-20037508.339, 9464937.842) (180d 0' 0.00"W, 64d26'59.58"N)
Upper Right (20037503.600,29704755.766) (179d59'59.85"E, 88d54'44.28"N)
Lower Right (20037503.600, 9464937.842) (179d59'59.85"E, 64d26'59.58"N)
Center ( -2.370,19584846.804) ( 0d 0' 0.08"W, 84d41'15.52"N)
What did I miss?
Let's simplify the problem: how do I geo-reference the image file without converting the projection into EPSG:3857?
The first step I took was to embed the projection info and the four corner points:
gdal_translate -a_srs "+proj=lcc +lon_0=10.856 +lat_0=51.322 +lat_1=42 +lat_2=57 +datum=WGS84" -of GTiff -gcp 0 0 -4.226944 57.35083 -gcp 14735 0 27.78 57.59555 -gcp 14735 11333 24.30917 44.06611 -gcp 0 11333 -3.537778 43.6 original.tif georeferenced.tif
In the second step I warped georeferenced.tif without specifying a target projection:
gdalwarp -overwrite -r near georeferenced.tif warped.tif
When I checked the info for warped.tif, I got the following corner coordinates:
Corner Coordinates:
Upper Left ( -4.5777770, 57.6508975) ( 10d51'21.36"E, 51d19'21.08"N)
Lower Left ( -4.5777770, 43.6558450) ( 10d51'21.36"E, 51d19'20.62"N)
Upper Right ( 26.7396454, 57.6508975) ( 10d51'22.99"E, 51d19'21.08"N)
Lower Right ( 26.7396454, 43.6558450) ( 10d51'22.99"E, 51d19'20.62"N)
I was hoping to get:
Corner Coordinates:
Upper Left (0.0, 0.0) (-4.226944, 57.35083,0)
Lower Left (0.0, 11333.0) (-3.537778, 43.6,0)
Upper Right (14735.0, 0.0) (27.78, 57.59555,0)
Lower Right (14735.0, 11333.0) (24.30917, 44.06611,0)
Are you sure your GCPs are correct? They should be in the form:
[-gcp pixel line easting northing [elevation]]*
You have one GCP which suggests that your input file is at least 112755 x 80400 pixels (w x h) in size. It's possible, of course, but it's a large file.
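A quick way to test this is to compare each GCP's pixel/line pair with the raster size. The sketch below uses the size that gdalinfo reports in the question (14735 x 11333) and the GCPs from the first gdal_translate attempt; the helper name is hypothetical. GCPs outside the raster are not forbidden, so treat a hit as a hint that pixel/line may have been entered in some other unit, not as proof.

```python
# Raster size reported by gdalinfo in the question (width, height in pixels).
WIDTH, HEIGHT = 14735, 11333

# GCPs from the first gdal_translate attempt: (pixel, line, easting, northing).
gcps = [
    (19725, 11865, 1, 56),
    (103755, 12990, 23, 56),
    (112755, 80400, 23, 46),
    (8925, 78990, 1, 46),
]


def outside_raster(gcps, width, height):
    """Return the GCPs whose pixel/line position falls outside the raster."""
    return [g for g in gcps
            if not (0 <= g[0] <= width and 0 <= g[1] <= height)]


suspicious = outside_raster(gcps, WIDTH, HEIGHT)
print(f"{len(suspicious)} of {len(gcps)} GCPs lie outside the raster")
for pixel, line, easting, northing in suspicious:
    print(f"  pixel={pixel} line={line} -> ({easting}, {northing})")
```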
I think you should combine your first attempt with your later gdalwarp step. Since gdalwarp doesn't accept GCPs on the command line and gdal_translate can't warp, this takes two steps.
You can output your first step to a VRT file, so it takes less processing and disk space.
gdal_translate -of VRT -a_srs "+proj=lcc +lon_0=10.856 +lat_0=51.322 +lat_1=42 +lat_2=57 +datum=WGS84" -gcp 19725 11865 1 56 -gcp 103755 12990 23 56 -gcp 112755 80400 23 46 -gcp 8925 78990 1 46 original.tif georeferenced.vrt
And then warp the VRT:
gdalwarp -overwrite -t_srs EPSG:3857 -r near -co COMPRESS=LZW georeferenced.vrt warped.tif
I have a jpg image that I would like to convert to a GeoTiff. I have done the following with GDAL:
gdal_translate -a_srs "+proj=latlong +datum=WGS84" -of GTiff -co "INTERLEAVE=PIXEL" -a_ullr -112.749767303467 40.2223700746836 -112.731463909149 40.2063119735618 image0.jpg image0.tif
This provides a GeoTiff with the following Geo Data (using gdalinfo):
Driver: GTiff/GeoTIFF
Files: image0.tif
Size is 853, 980
Coordinate System is:
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433],
AUTHORITY["EPSG","4326"]]
Origin = (-112.749767303467000,40.222370074683603)
Pixel Size = (0.000021457672120,-0.000016385817471)
Metadata:
AREA_OR_POINT=Area
Image Structure Metadata:
INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left (-112.7497673, 40.2223701) (112d44'59.16"W, 40d13'20.53"N)
Lower Left (-112.7497673, 40.2063120) (112d44'59.16"W, 40d12'22.72"N)
Upper Right (-112.7314639, 40.2223701) (112d43'53.27"W, 40d13'20.53"N)
Lower Right (-112.7314639, 40.2063120) (112d43'53.27"W, 40d12'22.72"N)
Center (-112.7406156, 40.2143410) (112d44'26.22"W, 40d12'51.63"N)
Band 1 Block=853x3 Type=Byte, ColorInterp=Red
Band 2 Block=853x3 Type=Byte, ColorInterp=Green
Band 3 Block=853x3 Type=Byte, ColorInterp=Blue
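The Origin and Pixel Size lines in this output follow directly from the -a_ullr bounds and the raster size, which confirms that the file is georeferenced in degrees. A minimal check in Python (the function name is hypothetical):

```python
def pixel_size(ulx, uly, lrx, lry, width, height):
    """Pixel size implied by upper-left/lower-right bounds and raster size."""
    return (lrx - ulx) / width, (lry - uly) / height


# Bounds from the -a_ullr option and size from gdalinfo (853 x 980).
px, py = pixel_size(-112.749767303467, 40.2223700746836,
                    -112.731463909149, 40.2063119735618,
                    853, 980)
print(f"{px:.15f} {py:.15f}")  # compare with the Pixel Size line above
```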
I now want to add a projection to the file using gdalwarp, so I do this:
gdalwarp -of GTiff -co "INTERLEAVE=PIXEL" -s_srs "+proj=utm +zone=12 +datum=WGS84" -t_srs "+proj=latlong +datum=WGS84" -r cubic image0.tif image0proj.tif
The output from gdalinfo for this is:
Driver: GTiff/GeoTIFF
Files: image0proj.tif
Size is 974, 860
Coordinate System is:
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433],
AUTHORITY["EPSG","4326"]]
Origin = (-115.489754008884220,0.000362780166170)
Pixel Size = (0.000000000168394,-0.000000000168394)
Metadata:
AREA_OR_POINT=Area
Image Structure Metadata:
INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left (-115.4897540, 0.0003628) (115d29'23.11"W, 0d 0' 1.31"N)
Lower Left (-115.4897540, 0.0003626) (115d29'23.11"W, 0d 0' 1.31"N)
Upper Right (-115.4897538, 0.0003628) (115d29'23.11"W, 0d 0' 1.31"N)
Lower Right (-115.4897538, 0.0003626) (115d29'23.11"W, 0d 0' 1.31"N)
Center (-115.4897539, 0.0003627) (115d29'23.11"W, 0d 0' 1.31"N)
Band 1 Block=974x2 Type=Byte, ColorInterp=Red
Band 2 Block=974x2 Type=Byte, ColorInterp=Green
Band 3 Block=974x2 Type=Byte, ColorInterp=Blue
I was expecting the output to contain projection information in UTM starting like this:
Coordinate System is:
PROJCS["UTM",
GEOGCS["WGS 84",
What am I doing wrong?
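One observation from the listings above: image0.tif is already georeferenced in lat/long (EPSG:4326), while the gdalwarp command names UTM as the source (-s_srs) and lat/long as the target (-t_srs), which looks like the reverse of the stated goal. For reference, the UTM zone and the matching WGS 84 / UTM EPSG code can be derived from the image's centre coordinates. This is a sketch with hypothetical function names; it assumes -180 <= longitude < 180 and ignores the special zones around Norway and Svalbard.

```python
import math


def utm_zone(lon_deg):
    """Standard 6-degree UTM zone number for a longitude in [-180, 180)."""
    return int(math.floor((lon_deg + 180.0) / 6.0)) + 1


def wgs84_utm_epsg(lon_deg, lat_deg):
    """EPSG code of the WGS 84 / UTM zone: 326xx north, 327xx south."""
    base = 32600 if lat_deg >= 0 else 32700
    return base + utm_zone(lon_deg)


# Centre of image0.tif as reported by gdalinfo.
lon, lat = -112.7406156, 40.2143410
print(utm_zone(lon), wgs84_utm_epsg(lon, lat))
```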
I am trying to plot data sets consisting of three values per line:
X-coordinate, Y-coordinate and the number of occurrences.
Example:
1 2 10
3 1 2
3 2 1
For every line, I would like to draw a dot at x,y with a diameter that depends on the third value.
Is that possible with Gnuplot?
Create a 2D plot with variable point size. See the demo.
Example:
plot 'dataFile.dat' u 1:2:3 w points lt 1 pt 10 ps variable
This is basically equivalent to the existing answer, just shorter:
plot 'dataFile.dat' with circles
Credit: Gnuplot: plot with circles of a defined radius