How to connect to AWS Neptune (graph database) with Python?

I'm following this tutorial:
https://docs.aws.amazon.com/neptune/latest/userguide/access-graph-gremlin-python.html
How can I add a node and then retrieve the same node?
from __future__ import print_function # Python 2/3 compatibility
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
graph = Graph()
remoteConn = DriverRemoteConnection('wss://your-neptune-endpoint:8182/gremlin','g')
g = graph.traversal().withRemote(remoteConn)
print(g.V().limit(2).toList())
remoteConn.close()
All the above is doing right now is retrieving 2 nodes, right?

If you want to add a vertex and then return information about that vertex (assuming you did not provide your own ID), you can do something like:
newId = g.addV("mylabel").id().next()
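The returned ID can then be used to read the same vertex back. A minimal sketch (the "mylabel" label and the "name" property are purely illustrative):
# add a vertex and capture the ID that Neptune generated for it
newId = g.addV("mylabel").property("name", "example").id().next()
# fetch the same vertex by its ID and print its label, ID and properties
print(g.V(newId).valueMap(True).next())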

Related

Synapse Analytics Auto ML Predict No module named 'azureml.automl'

I'm following the official tutorial from Microsoft: https://learn.microsoft.com/en-us/azure/synapse-analytics/machine-learning/tutorial-score-model-predict-spark-pool
When I execute:
#Bind model within Spark session
model = pcontext.bind_model(
    return_types=RETURN_TYPES,
    runtime=RUNTIME,
    model_alias="Sales", #This alias will be used in PREDICT call to refer this model
    model_uri=AML_MODEL_URI, #In case of AML, it will be AML_MODEL_URI
    aml_workspace=ws #This is only for AML. In case of ADLS, this parameter can be removed
).register()
I get: No module named 'azureml.automl'
Based on a repro from my end, the code you have shared works as expected, and I don't see the error message you are experiencing.
I have even tested the same code on a newly created Apache Spark 3.1 runtime, and it works as expected.
I would request you to create a new cluster and see if you are able to run the above code.
I solved it. In my case it works best like this:
Imports
#Import libraries
from pyspark.sql.functions import col, pandas_udf, udf, lit
from notebookutils.mssparkutils import azureML
from azureml.core import Workspace, Model
from azureml.core.authentication import ServicePrincipalAuthentication
import joblib
import pandas as pd
ws = azureML.getWorkspace("AzureMLService")
spark.conf.set("spark.synapse.ml.predict.enabled","true")
Predict function
def forecastModel():
    model_path = Model.get_model_path(model_name="modelName", _workspace=ws)
    modeljob = joblib.load(model_path + "/model.pkl")
    validation_data = spark.read.format("csv") \
        .option("header", True) \
        .option("inferSchema", True) \
        .option("sep", ";") \
        .load("abfss://....csv")
    validation_data_pd = validation_data.toPandas()
    predict = modeljob.forecast(validation_data_pd)
    return predict
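Calling it is then straightforward; a minimal usage sketch (remember that "modelName" and the abfss:// path inside forecastModel are placeholders to fill in):
# run the forecast against the validation data and inspect the output
predictions = forecastModel()
print(predictions)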

Can't add vertex with python in neptune workbench

I'm trying to put together a demo of Neptune using Neptune workbench, but something's not working right. I've got this block set up:
from __future__ import print_function # Python 2/3 compatibility
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
graph = Graph()
cluster_url = #my cluster
remoteConn = DriverRemoteConnection( f'wss://{cluster_url}:8182/gremlin','g')
g = graph.traversal().withRemote(remoteConn)
import uuid
tmp = uuid.uuid4()
tmp_id = str(tmp)
def get_id(name):
    uid = uuid.uuid5(uuid.NAMESPACE_DNS, f"{name}.licensing.company.com")
    return str(uid)
def add_sku(name):
    tmp_id = get_id(name)
    g.addV('SKU').property('id', tmp_id, 'name', name)
    return name
def get_values():
    return g.V().properties().toList()
The problem is that calling add_sku doesn't result in a vertex being added to the graph. Doing the same operation in a cell with gremlin magic works, and I can retrieve values through python, but I can't add vertices. Does anyone see what I'm missing here?
The Python code is not working because it is missing a terminal step (next() or iterate()) on the end of it, which forces it to evaluate. Note also that each key/value pair needs its own property() step; extra arguments to property() are treated as meta-properties, which Neptune does not support. With both fixes it should work:
g.addV('SKU').property('id', tmp_id).property('name', name).next()
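As a side note, if the intent is to set the vertex's actual ID (which Neptune supports) rather than a property that merely happens to be called 'id', T.id can be used; a sketch of add_sku along those lines:
from gremlin_python.process.traversal import T

def add_sku(name):
    tmp_id = get_id(name)
    # T.id sets the real vertex ID; iterate() submits the traversal without returning a value
    g.addV('SKU').property(T.id, tmp_id).property('name', name).iterate()
    return name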

Trouble geo mapping with datashader, holoviews and bokeh

I'm trying to map Google phone history locations onto a map using holoviews, datashader and bokeh, mostly very similar to the examples given on the datashader website. But when I do, the map overlay doesn't work, as the lat/long gets mangled up.
import datashader as ds
import geoviews as gv
import holoviews as hv
from holoviews.operation.datashader import datashade, dynspread
from datashader import transfer_functions as tf
from colorcet import fire
hv.extension('bokeh')
> df2.head()
           lat         long
0  -37.7997515  144.9636466
1  -37.7997515  144.9636466
2  -37.7997369  144.9636036
3  -37.7997387  144.9636358
4  -37.7997515  144.9636466
This works to produce an image of the data,
ds_viz = ds.Canvas().points(df2,'lat','long')
tf.set_background(tf.shade(ds_viz, cmap=fire),"black")
However, when I try to overlay it with a map, it doesn't work:
from bokeh.models import WMTSTileSource
url = 'https://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{Z}/{Y}/{X}.jpg'
tile_opts = dict(width=1000,height=600,bgcolor='black',show_grid=False)
map_tiles = gv.WMTS(url).opts(style=dict(alpha=0.5), plot=tile_opts)
points = hv.Points(df2, kdims=['long','lat'])
trips = datashade(points, cmap=fire,width=1000, height=600)
map_tiles * trips
What am I doing wrong?
It looks like your points are in lon,lat but your map is in Web Mercator coordinates, so you need to project your points into Web Mercator before you overlay them. GeoViews offers comprehensive support for projections, but for this specific case Datashader provides the special-purpose function datashader.utils.lnglat_to_meters. Something like this should work:
from datashader.utils import lnglat_to_meters
df2.loc[:, 'long'], df2.loc[:, 'lat'] = lnglat_to_meters(df2.long, df2.lat)
Projecting can be slow, so you may want to save the resulting df2 to a Parquet file so that you only have to do it once.
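Putting it together, a minimal sketch of the corrected overlay (using new x/y columns so the original degree values are preserved; map_tiles is the tile layer defined in the question):
from datashader.utils import lnglat_to_meters

# project lon/lat degrees into Web Mercator metres to match the tile source
df2['x'], df2['y'] = lnglat_to_meters(df2['long'], df2['lat'])
points = hv.Points(df2, kdims=['x', 'y'])
trips = datashade(points, cmap=fire, width=1000, height=600)
map_tiles * trips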

Checking if a geocoordinate point is land or ocean with cartopy?

I want to know, given a latitude and longitude, whether a coordinate is land or sea.
According to https://gis.stackexchange.com/questions/235133/checking-if-a-geocoordinate-point-is-land-or-ocean
from mpl_toolkits.basemap import Basemap
bm = Basemap() # default: projection='cyl'
print(bm.is_land(99.675, 13.104)) #True
print(bm.is_land(100.539, 13.104)) #False
The problem is that basemap is deprecated. How do I perform this with cartopy?
A question which deals with point containment testing of country geometries using cartopy can be found at Polygon containment test in matplotlib artist.
Cartopy has the tools to achieve this, but there is no built-in method such as "is_land". Instead, you need to get hold of the appropriate geometry data, and query that using standard shapely predicates.
import cartopy.io.shapereader as shpreader
import shapely.geometry as sgeom
from shapely.ops import unary_union
from shapely.prepared import prep
land_shp_fname = shpreader.natural_earth(resolution='50m',
                                         category='physical', name='land')
land_geom = unary_union(list(shpreader.Reader(land_shp_fname).geometries()))
land = prep(land_geom)
def is_land(x, y):
    return land.contains(sgeom.Point(x, y))
This gives the expected results for two sample points:
>>> print(is_land(0, 0))
False
>>> print(is_land(0, 10))
True
If you have access to it, fiona will make this easier (and snappier):
import fiona
import cartopy.io.shapereader as shpreader
import shapely.geometry as sgeom
from shapely.prepared import prep
geoms = fiona.open(
    shpreader.natural_earth(resolution='50m',
                            category='physical', name='land'))
land_geom = sgeom.MultiPolygon([sgeom.shape(geom['geometry'])
                                for geom in geoms])
land = prep(land_geom)
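The is_land helper defined above can then be used unchanged, since it only relies on the prepared land geometry.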
Finally, I produced (back in 2011) the shapely.vectorized functionality to speed up this kind of operation when testing many points at the same time. The code is available as a gist at https://gist.github.com/pelson/9785576, and it includes a proof-of-concept of land containment testing for the UK.
Another tool you may be interested in reading about is geopandas, as this kind of containment testing is one of its core capabilities.
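For instance, a minimal geopandas sketch of the same test (reusing the Natural Earth land shapefile downloaded through cartopy's shapereader; the dataset choice simply mirrors the examples above):
import geopandas as gpd
import cartopy.io.shapereader as shpreader
from shapely.geometry import Point

land_df = gpd.read_file(shpreader.natural_earth(resolution='50m',
                                                category='physical', name='land'))
# a point is on land if any land polygon contains it
print(land_df.contains(Point(0, 10)).any())  # True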

Pygraphviz and fixed node positions

How do I force pygraphviz to maintain fixed positions for my nodes. Assume that you have the following code
from __future__ import absolute_import
from __future__ import unicode_literals
from __future__ import print_function
from __future__ import division
import pygraphviz as pgv
A=pgv.AGraph()
A.add_node(1,color='red',pos="0,1")
A.add_node(2,color='blue',pos="1,10")
A.add_node(3,color='yellow',pos="2,2")
A.add_edge(1,2,color='green')
A.add_edge(2,3)
A.add_edge(2,2,"1")
A.add_edge(1,3)
A.graph_attr['epsilon']='0.001'
print(A.string()) # print dot file to standard output
A.layout('dot') # layout with dot
A.draw('foo.pdf') # write to file
How do I force the nodes to show up at the predetermined positions (0,1), (1,10) and (2,2) respectively?
Looking at http://pygraphviz.github.io/documentation/pygraphviz-1.4rc1/reference/agraph.html and the signature for the draw method, it looks as if layout is only needed if pos is not present.
"
If prog is not specified and the graph has positions (see layout()) then no additional graph positioning will be performed."
In your case you could just try without using A.layout(). just A.draw('foo.pdf')
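For the record, a minimal sketch of that idea combined with the common alternative of pinning nodes (a trailing "!" in pos pins a node, and neato's -n flag tells it to honor the supplied coordinates):
import pygraphviz as pgv

A = pgv.AGraph()
# "!" after the coordinates pins each node for layout programs such as neato
A.add_node(1, color='red', pos="0,1!")
A.add_node(2, color='blue', pos="1,10!")
A.add_node(3, color='yellow', pos="2,2!")
A.add_edge(1, 2, color='green')
A.add_edge(2, 3)
A.add_edge(1, 3)
# with -n, neato keeps the given node positions and only routes the edges
A.draw('foo.pdf', prog='neato', args='-n')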