Using BigQuery Geo Viz,
I am trying to visualize a polygon and its centroid point, simultaneously on the same map.
I tried the ST_UNION function but could not really combine the two GEOGRAPHYs.
Any idea how to visualize both GEOGRAPHYs?
Polygon:
POLYGON((-95.7082555 29.9212101, -95.665885 29.907145, -95.7742806214083 29.82947355, -95.7303605 29.8538605, -95.659484 29.901497, -95.662932 29.894958, -95.8441482 29.7265376, -95.646749 29.905534, -95.810012 29.719363, -95.664174 29.883618, -95.639718 29.910045, -95.652796 29.89204, -95.649915 29.886317, -95.650089 29.881912, -95.641443 29.897741, -95.632912 29.911674, -95.653458 29.864561, -95.635056 29.864431, -95.636533 29.757219, -95.623339 29.903466, -95.597235 29.75367, -95.3636989932886 29.8063167449664, -95.575123 29.920295, -95.3944858832763 29.94248964622, -95.147033 30.013214, -95.586588 29.947706, -95.456723 31.3287239, -95.69717 29.96911, -95.674433 29.943844, -95.678203 29.935184, -95.7082555 29.9212101))
Centroid point:
POINT(-95.5606651932764 30.2307053050834)
Try selecting the two structures separately and using UNION ALL to gather them in the same visualization:
SELECT ST_GeogFromText('POLYGON((-95.7082555 29.9212101, -95.665885 29.907145, -95.7742806214083 29.82947355, -95.7303605 29.8538605, -95.659484 29.901497, -95.662932 29.894958, -95.8441482 29.7265376, -95.646749 29.905534, -95.810012 29.719363, -95.664174 29.883618, -95.639718 29.910045, -95.652796 29.89204, -95.649915 29.886317, -95.650089 29.881912, -95.641443 29.897741, -95.632912 29.911674, -95.653458 29.864561, -95.635056 29.864431, -95.636533 29.757219, -95.623339 29.903466, -95.597235 29.75367, -95.3636989932886 29.8063167449664, -95.575123 29.920295, -95.3944858832763 29.94248964622, -95.147033 30.013214, -95.586588 29.947706, -95.456723 31.3287239, -95.69717 29.96911, -95.674433 29.943844, -95.678203 29.935184, -95.7082555 29.9212101))') t UNION ALL SELECT ST_GeogFromText('POINT(-95.5606651932764 30.2307053050834)') t
If your intention is to show the polygon and the point in the same visualization, this works, as you can see in the image below:
Please let me know if this is what you are looking for.
For the simple scenario presented in the question, with just one polygon and its centroid, the simple solution below works:
#standardSQL
WITH objects AS (
SELECT 'POLYGON((-95.7082555 29.9212101, -95.665885 29.907145, -95.7742806214083 29.82947355, -95.7303605 29.8538605, -95.659484 29.901497, -95.662932 29.894958, -95.8441482 29.7265376, -95.646749 29.905534, -95.810012 29.719363, -95.664174 29.883618, -95.639718 29.910045, -95.652796 29.89204, -95.649915 29.886317, -95.650089 29.881912, -95.641443 29.897741, -95.632912 29.911674, -95.653458 29.864561, -95.635056 29.864431, -95.636533 29.757219, -95.623339 29.903466, -95.597235 29.75367, -95.3636989932886 29.8063167449664, -95.575123 29.920295, -95.3944858832763 29.94248964622, -95.147033 30.013214, -95.586588 29.947706, -95.456723 31.3287239, -95.69717 29.96911, -95.674433 29.943844, -95.678203 29.935184, -95.7082555 29.9212101))' wkt_string UNION ALL
SELECT 'POINT(-95.5606651932764 30.2307053050834)'
)
SELECT ST_GEOGFROMTEXT(wkt_string) geo
FROM objects
This can be visualized with different tools, as in the example below.
For a more realistic scenario, when you have many polygons and need to visualize them along with their centroids, you can use the approach below (based on an example with US states):
#standardSQL
SELECT state_geom state, ST_CENTROID(state_geom) centroid
FROM `bigquery-public-data.utility_us.us_states_area`
with the result as below,
which can be visualized as in the examples below (showing just a few states to give an idea).
Finally, you can combine all such polygons (states in this example) with their centroids in one visualization, as below.
Another of the many options is to add some metrics and more attributes to the query, for example state_name and area_land_meters, and make your visualization data-driven with dynamic tooltips, as in the example below.
I have some data on YouTube channel descriptions which are quite messy as you'd imagine. I'd like to filter channels whose description is in English, but I'm not sure how to go about it. Here's a sample of what the data looks like
WITH
foo AS (
SELECT ".olá sejam muito bem vindos. este canal foi criado" AS x
UNION ALL SELECT "Hello, I am Abhy and welcome to my channel." AS x
UNION ALL SELECT "Channels I love: Labrant Fam, Norris Nuts, La Familia Diamond, Piper Rockelle" AS x
UNION ALL SELECT "हेलो दोस्तो रमेश और सागर और सुखदेव आपका स्वागत करते हैं इस चैनल के ऊपर" AS x
UNION ALL SELECT "Hi, I'm K-POP RANDOM👩🇲🇨 === 🌈KPOP RANDOM DANCE🌈 === 🌻I hope you can enjoy" AS x
UNION ALL SELECT 'Public TV Kannada news channel. The slogan is "Yaara Aasthiyoo Alla, Idu Nimma TV"' AS x
UNION ALL SELECT "Instagram: www.instagram.com/whatsfordinner5291/" AS x
UNION ALL SELECT "Welcome to RunningBoy12, a gaming channel brought to you by RO!" as x
)
select * from foo
My idea is to hand-label some records, measure the frequency of foreign characters and words, and then fit a logistic regression model to the data using BigQuery ML. Is there a better way?
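The character-frequency feature mentioned above could be computed along these lines. This is only a rough sketch: the function name non_latin_ratio is made up for illustration, and it looks at the script of each letter, not the language. Portuguese text, for example, is written in Latin letters and scores 0 just like English, so word-level features would still be needed to separate those.

```python
import unicodedata


def non_latin_ratio(text: str) -> float:
    """Fraction of alphabetic characters that are not Latin-script letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0  # no letters at all (only digits, punctuation or emoji)
    non_latin = [
        ch for ch in letters
        # Unicode names of Latin letters start with "LATIN", accents included.
        if not unicodedata.name(ch, "").startswith("LATIN")
    ]
    return len(non_latin) / len(letters)


# Three of the sample descriptions from the question.
samples = [
    "Hello, I am Abhy and welcome to my channel.",
    ".olá sejam muito bem vindos. este canal foi criado",
    "हेलो दोस्तो रमेश और सागर और सुखदेव आपका स्वागत करते हैं इस चैनल के ऊपर",
]
for s in samples:
    print(round(non_latin_ratio(s), 2), s[:30])
```

A ratio like this could serve as one input column for the logistic regression, next to whatever word-based features are hand-labelled.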
You can detect the language with the Cloud Translation API. Run it on each record before inserting it; you could call the API from Cloud Functions or, if you need more complicated ETL, from Cloud Dataflow.
When a text is classified as English, insert the record into whichever database you want.
This way you don't have to store non-English text in your database, which saves money on storage and querying. Instead of BigQuery, Cloud Firestore could be an option; it depends on the service you want to build.
Here is the Cloud Translation API documentation:
https://cloud.google.com/translate/docs/advanced/detecting-language-v3#before_you_begin
Comparison of databases:
https://db-engines.com/en/system/Amazon+DocumentDB%3BGoogle+BigQuery%3BGoogle+Cloud+Firestore
I would like to compute the shortest distance from the yellow point in the image below to the polygon boundary using built in BigQuery Geo functions.
I could not find anything myself.
Here is the query that builds the example.
WITH objects AS(
SELECT 'POLYGON((-84.3043408314983 33.78004925, -84.3058929975152 33.7780287948446, -84.3026549053438 33.77962155, -84.3018234603607 33.7798783, -84.3041030408163 33.7785105714286, -84.2983655895464 33.7814847396304, -84.2869801170094 33.7772419185107, -84.2842584693878 33.7827876938775, -84.2863881748169 33.7848439284835, -84.2963746470588 33.7897689411765, -84.2979002513655 33.790508814658, -84.2978883265306 33.7851126734694, -84.300035153059 33.78268675, -84.3043408314983 33.78004925))' wkt_string
UNION ALL
SELECT 'POINT(-84.2998716702097 33.7796025711153)' wkt_string
)
SELECT ST_GEOGFROMTEXT(wkt_string) geo
FROM objects
This is the function I was looking for:
ST_CLOSESTPOINT(geography_1, geography_2[, use_spheroid]).
Use the ST_Distance function to compute the shortest distance between shapes:
WITH objects AS(
SELECT
'POLYGON((-84.3043408314983 33.78004925, -84.3058929975152 33.7780287948446, -84.3026549053438 33.77962155, -84.3018234603607 33.7798783, -84.3041030408163 33.7785105714286, -84.2983655895464 33.7814847396304, -84.2869801170094 33.7772419185107, -84.2842584693878 33.7827876938775, -84.2863881748169 33.7848439284835, -84.2963746470588 33.7897689411765, -84.2979002513655 33.790508814658, -84.2978883265306 33.7851126734694, -84.300035153059 33.78268675, -84.3043408314983 33.78004925))'
AS poly,
'POINT(-84.2998716702097 33.7796025711153)' AS point
)
SELECT ST_Distance(ST_GEOGFROMTEXT(poly), ST_GEOGFROMTEXT(point))
FROM objects
One caveat: it computes the distance between the point and the polygon, so if the point is inside the polygon, the distance is 0. If you really want the distance to the polygon boundary, add ST_Boundary to the mix:
WITH objects AS(
...
)
SELECT ST_Distance(ST_Boundary(ST_GEOGFROMTEXT(poly)), ST_GEOGFROMTEXT(point))
FROM objects
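To make the difference between the two distances concrete, here is a toy sketch in plain Python. It works on flat x/y coordinates with a hand-written ray-casting test, so it only illustrates the idea; it is not how BigQuery computes geodesic distances, and all function names are invented for the example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def segment_distance(p: Point, a: Point, b: Point) -> float:
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def contains(ring: List[Point], p: Point) -> bool:
    """Ray-casting point-in-polygon test for a closed ring."""
    px, py = p
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def boundary_distance(ring: List[Point], p: Point) -> float:
    """Distance to the outline only (the ST_Boundary variant)."""
    return min(segment_distance(p, a, b) for a, b in zip(ring, ring[1:]))


def polygon_distance(ring: List[Point], p: Point) -> float:
    """Distance to the filled polygon: 0 for any point inside it."""
    return 0.0 if contains(ring, p) else boundary_distance(ring, p)


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0), (0.0, 0.0)]
inside_point = (1.0, 2.0)
print(polygon_distance(square, inside_point))   # 0.0
print(boundary_distance(square, inside_point))  # 1.0
```

For a point outside the polygon the two functions agree, which mirrors the behaviour described above for the plain and the ST_Boundary queries.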
This has me stumped for more than a day now, and the examples I could find have not worked. I am new to SQLAlchemy and I find the documentation not very enlightening.
The query (so far):
prey = alias(ensembl_genes, name='prey')
bait = alias(ensembl_genes, name='bait')
query = db.session.query(tap,prey,bait).\
join(prey, tap.c.TAP_PREY_ENSEMBL_GENE_ID==prey.c.ENSEMBL_GENE_ID).\
join(bait, tap.c.TAP_BAIT_ENSEMBL_GENE_ID==bait.c.ENSEMBL_GENE_ID).\
filter(\
or_(\
tap.c.TAP_PREY_ENSEMBL_GENE_ID=='ENSG00000100360',\
tap.c.TAP_BAIT_ENSEMBL_GENE_ID=='ENSG00000100360'\
)\
).\
order_by(desc(tap.c.TAP_UNIQUE_PEPTIDE_COUNT))
tap refers to a table of interacting genes. One interactor is designated the 'bait' and the other the 'prey'. Prey and Bait are aliases for the same table that holds additional information on these genes. The objective is to select all interactions with a given gene 'ENSG00000100360' as either bait or prey.
The problem:
This query returns about 20 or so columns, but I need only six specific ones, two from each of the original tables (I'd like to rename them as well). From examples found on the interwebz I thought I should add:
options(
Load(tap).load_only('TAP_UNIQUE_PEPTIDE_COUNT','TAP_SEQUENCE_COVERAGE'),
Load(prey).load_only('ENSEMBL_GENE_SYMBOL','ENSEMBL_GENE_ID'),
Load(bait).load_only('ENSEMBL_GENE_SYMBOL','ENSEMBL_GENE_ID')
)
But this gives me the following error:
File "/Users/jvandam/Github/syscilia/tools/BDT/quest/blueprints/genereport.py", line 246, in createTAPMSView
Load(tap).load_only('TAP_UNIQUE_PEPTIDE_COUNT','TAP_SEQUENCE_COVERAGE')
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sqlalchemy/orm/strategy_options.py", line 82, in __init__
self.path = insp._path_registry
AttributeError: 'Table' object has no attribute '_path_registry'
I have not been able to find anything on google about what to do about this.
The sqlalchemy table objects are created from the database table metadata.
What I am trying to emulate using the sqlalchemy orm statements is:
SELECT
prey.ENSEMBL_GENE_SYMBOL AS PREY_ENSEMBL_GENE_SYMBOL,
prey.ENSEMBL_GENE_ID AS PREY_ENSEMBL_GENE_ID,
bait.ENSEMBL_GENE_SYMBOL AS BAIT_ENSEMBL_GENE_SYMBOL,
bait.ENSEMBL_GENE_ID AS BAIT_ENSEMBL_GENE_ID,
t.TAP_UNIQUE_PEPTIDE_COUNT AS UNIQUE_PEPTIDE_COUNT,
t.TAP_SEQUENCE_COVERAGE AS SEQUENCE_COVERAGE
FROM TAP as t
INNER JOIN ENSEMBL_GENES AS prey
ON t.TAP_PREY_ENSEMBL_GENE_ID=prey.ENSEMBL_GENE_ID
INNER JOIN ENSEMBL_GENES AS bait
ON t.TAP_BAIT_ENSEMBL_GENE_ID=bait.ENSEMBL_GENE_ID
WHERE
t.TAP_PREY_ENSEMBL_GENE_ID='ENSG00000100360'
OR t.TAP_BAIT_ENSEMBL_GENE_ID='ENSG00000100360'
ORDER BY t.TAP_UNIQUE_PEPTIDE_COUNT DESC
Can anyone help me fix my query?
Thanks in advance!
John
Just replace the db.session.query(tap,prey,bait).\ part with the code below. Because tap, prey and bait are Table objects and aliases rather than mapped classes, the columns are reached through .c, as in the rest of your query:
# select_from(tap) is needed so that the FROM and JOINs are in the desired order
db.session.query(
prey.c.ENSEMBL_GENE_SYMBOL.label("PREY_ENSEMBL_GENE_SYMBOL"),
prey.c.ENSEMBL_GENE_ID.label("PREY_ENSEMBL_GENE_ID"),
bait.c.ENSEMBL_GENE_SYMBOL.label("BAIT_ENSEMBL_GENE_SYMBOL"),
bait.c.ENSEMBL_GENE_ID.label("BAIT_ENSEMBL_GENE_ID"),
tap.c.TAP_UNIQUE_PEPTIDE_COUNT.label("UNIQUE_PEPTIDE_COUNT"),
tap.c.TAP_SEQUENCE_COVERAGE.label("SEQUENCE_COVERAGE"),
).\
select_from(tap).\
This will select only the columns you need.
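If it helps to see the expected result shape, the target SQL from the question can be run against a throwaway in-memory SQLite database using only the standard library. This is a sketch for checking column names and ordering, not a replacement for the SQLAlchemy query; the table layout is reduced to the columns the query touches, and every row and gene symbol below is made up except the gene ID ENSG00000100360.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ENSEMBL_GENES (
    ENSEMBL_GENE_ID TEXT PRIMARY KEY,
    ENSEMBL_GENE_SYMBOL TEXT
);
CREATE TABLE TAP (
    TAP_PREY_ENSEMBL_GENE_ID TEXT,
    TAP_BAIT_ENSEMBL_GENE_ID TEXT,
    TAP_UNIQUE_PEPTIDE_COUNT INTEGER,
    TAP_SEQUENCE_COVERAGE REAL
);
-- Hypothetical sample rows, for illustration only.
INSERT INTO ENSEMBL_GENES VALUES
    ('ENSG00000100360', 'GENE_A'),
    ('ENSG_DEMO_1', 'GENE_B'),
    ('ENSG_DEMO_2', 'GENE_C');
INSERT INTO TAP VALUES
    ('ENSG00000100360', 'ENSG_DEMO_1', 3, 0.10),
    ('ENSG_DEMO_2', 'ENSG00000100360', 7, 0.25),
    ('ENSG_DEMO_1', 'ENSG_DEMO_2', 9, 0.40);
""")

# Same shape as the SQL in the question: one table joined twice under
# two aliases, six renamed columns, filtered on either side.
query = """
SELECT
    prey.ENSEMBL_GENE_SYMBOL AS PREY_ENSEMBL_GENE_SYMBOL,
    prey.ENSEMBL_GENE_ID AS PREY_ENSEMBL_GENE_ID,
    bait.ENSEMBL_GENE_SYMBOL AS BAIT_ENSEMBL_GENE_SYMBOL,
    bait.ENSEMBL_GENE_ID AS BAIT_ENSEMBL_GENE_ID,
    t.TAP_UNIQUE_PEPTIDE_COUNT AS UNIQUE_PEPTIDE_COUNT,
    t.TAP_SEQUENCE_COVERAGE AS SEQUENCE_COVERAGE
FROM TAP AS t
INNER JOIN ENSEMBL_GENES AS prey
    ON t.TAP_PREY_ENSEMBL_GENE_ID = prey.ENSEMBL_GENE_ID
INNER JOIN ENSEMBL_GENES AS bait
    ON t.TAP_BAIT_ENSEMBL_GENE_ID = bait.ENSEMBL_GENE_ID
WHERE t.TAP_PREY_ENSEMBL_GENE_ID = ?
   OR t.TAP_BAIT_ENSEMBL_GENE_ID = ?
ORDER BY t.TAP_UNIQUE_PEPTIDE_COUNT DESC
"""
cursor = conn.execute(query, ("ENSG00000100360", "ENSG00000100360"))
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns)
for row in rows:
    print(row)
```

The labelled SQLAlchemy query should come back with the same six column names, which makes a quick side-by-side comparison possible.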
I need to calculate the area of overlap between polygons in the same table. Ideally I would like to use MSSQL spatial capabilities for this (something like @a.SHAPE.STIntersection(@b.SHAPE).STArea()).
But I do not know how to do this for polygons in the same layer.
Thanks!
Freddie
I have knocked up a little example for you that shows how this can be accomplished.
SELECT
a.Geog1.STIntersection(b.Geog2) AS OverlapGeog
, a.Geog1.STIntersection(b.Geog2).STArea() AS AreaOverlap
FROM
(
SELECT
GEOGRAPHY::STGeomFromText('POINT(0.0 0.0)',4326).STBuffer(100) AS Geog1
) a
INNER JOIN
(
SELECT
GEOGRAPHY::STGeomFromText('POINT(0.001 0.0)',4326).STBuffer(100) AS Geog2
) b
On
a.Geog1.STIntersects(b.Geog2) = 1
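For polygons that all live in one table, the same join would presumably become a self-join that pairs each row with every later row, so that no pair is counted twice and no shape is compared with itself (in T-SQL, something like a condition on a hypothetical ID column, a.ID < b.ID). The pairing logic can be sketched in Python with axis-aligned rectangles standing in for real polygons; the shapes and names here are invented for illustration.

```python
from itertools import combinations
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def overlap_area(a: Rect, b: Rect) -> float:
    """Area shared by two axis-aligned rectangles, 0.0 if they are disjoint."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0


# One "table" of shapes keyed by a hypothetical ID.
shapes: Dict[str, Rect] = {
    "A": (0.0, 0.0, 2.0, 2.0),
    "B": (1.0, 1.0, 3.0, 3.0),
    "C": (5.0, 5.0, 6.0, 6.0),
}

# combinations() yields each unordered pair once, like a self-join on a.ID < b.ID.
overlaps = {}
for (id_a, rect_a), (id_b, rect_b) in combinations(sorted(shapes.items()), 2):
    area = overlap_area(rect_a, rect_b)
    if area > 0:
        overlaps[(id_a, id_b)] = area

print(overlaps)  # {('A', 'B'): 1.0}
```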
I've imported US Census Shapefiles (All Roads) into SQL Server 2008 R2. I'd like to find out which "road" a particular lat/long coordinate falls on. What does that query look like?
I'm new to GIS; searched around without luck. Thanks!
Here are the top 10 rows as a sample data set:
MULTILINESTRING ((-73.924385 40.865365, -73.92249799999999 40.866064, -73.920611999999991 40.866758999999995, -73.919215 40.867275, -73.918414 40.867584), (-73.92662 40.864525, -73.924385 40.865365))
LINESTRING (-73.91434 40.862521, -73.915523999999991 40.863040999999996, -73.917063 40.863690999999996, -73.918943 40.864463, -73.919361999999992 40.864809, -73.919996 40.865797, -73.920611999999991 40.866758999999995, -73.921213999999992 40.867692999999996, -73.921725999999992 40.868497, -73.922145 40.86915, -73.922343 40.869459)
LINESTRING (-73.91704399999999 40.867025999999996, -73.918414 40.867584, -73.919754 40.868114)
LINESTRING (-73.91911 40.859573, -73.919845999999993 40.859898, -73.921235 40.860476)
LINESTRING (-73.917913 40.869667, -73.918109 40.86987, -73.918269999999993 40.870035, -73.918643 40.870421, -73.919249999999991 40.871010999999996, -73.919671 40.872076)
LINESTRING (-73.917913 40.869667, -73.918109 40.86987, -73.918269999999993 40.870035, -73.918643 40.870421, -73.919249999999991 40.871010999999996, -73.919671 40.872076)
LINESTRING (-73.911771 40.868096, -73.913352 40.868777, -73.915183 40.869551, -73.91588 40.869847)
LINESTRING (-73.911227 40.871655, -73.91268 40.872292)
LINESTRING (-73.911227 40.871655, -73.91268 40.872292)
LINESTRING (-73.932523 40.854538, -73.932092 40.855157, -73.931654999999992 40.855754999999995, -73.929509 40.857341999999996)
The following returns the closest road to the search point specified. It was kinda slow until we added spatial indexes to the roads table.
DECLARE @search_point GEOGRAPHY, @search_buffer_metres INT = 50
SET @search_point = geography::STGeomFromText( 'POINT(174.083058 -35.410539)',4326 )
SELECT TOP 1
ROAD_NAME,SHAPE.STDistance( @search_point ) AS MetresFromSearchPoint
FROM
ROADS
WHERE
SHAPE.STIntersects( @search_point.STBuffer(@search_buffer_metres) ) = 1
ORDER BY
SHAPE.STDistance( @search_point )
This is a non-trivial problem, and you probably won't get a satisfactory answer here. The problem with trying to resolve the street of a house with the TIGER data is that there isn't really enough information to make a good deterministic decision: all that the TIGER files provide is the geographic coordinates for each end of a road segment, along with the name of the road and a few other bits of information.
My house is a great example of why this is difficult. The property is bounded by a residential road, a tertiary road, and an interstate highway. My house is set far back on the lot, and so the rooftop lat/lon is geographically closer to the highway than either of the other two roads. My address of course is on the residential street, but there is no way you'd be able to determine that from the data that you have.