folium choropleth and geojson not rendering in jupyter - pandas

I cannot get a folium map to display in Jupyter when all 33 London boroughs are included in the GeoJSON file, but the map displays fine if fewer boroughs (up to 23) are included. If I save the map as an HTML file and open it separately, it works just fine.
Here is the version of the code that works (using just the first 23 boroughs):
m = folium.Map(location=[51.5, -0.1], zoom_start=10)
m.choropleth(
    geo_data={"type": geo_london["type"], "features": geo_london["features"][:23]},  # 23 of the boroughs
    data=df["Underground"],
    columns=["LA", 'Underground'],
    key_on='feature.properties.name',
    fill_color='BuPu',
    fill_opacity=0.9,
    line_opacity=0.2,
    legend_name='Underground Usage',
    highlight=True
)
Here is the version that doesn't work:
m = folium.Map(location=[51.5, -0.1], zoom_start=10)
m.choropleth(
    geo_data=geo_london,  # all 33 boroughs
    data=df["Underground"],
    columns=["LA", 'Underground'],
    key_on='feature.properties.name',
    fill_color='BuPu',
    fill_opacity=0.9,
    line_opacity=0.2,
    legend_name='Underground Usage',
    highlight=True
)
Other things to note:
- I parsed the GeoJSON file using json within Python, so geo_london is a dictionary.
- If I do m.save('mymap.html') and open the map, the second version also works fine.
- I have the same problem if I don't use the data in a choropleth but instead use folium.GeoJson(geo_london).add_to(m).
- folium 0.5.0
- The data is a pandas Series.

You are probably hitting the bug described here: https://github.com/python-visualization/folium/issues/768 (folium displays nothing on Chrome if the number of overlaid images is greater than 80).
Try a different browser such as Firefox or Safari.
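If switching browsers is not an option, a workaround consistent with your observation that the saved HTML renders fine is to save the map and pull it back into the notebook as an IFrame (a sketch, not an official folium fix):

# Sketch: display the map through a saved HTML file instead of inline,
# sidestepping Chrome's inline-rendering limit.
from IPython.display import IFrame

m.save('mymap.html')
IFrame('mymap.html', width=700, height=500)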

Related

How to increase length of output table or dataframe in Jupyter Notebook?

I am working in a Jupyter notebook and have been facing issues increasing the length of its output. I can see the output as follows:
I tried increasing the default length of the columns in pandas with no success. Can you please help me with it?
If you were using the typical way to view a dataframe in Jupyter (see my puzzlement about your screenshot in my comments to your original post), it would be something like this, adapted from the answer to 'Pretty-print an entire Pandas Series / DataFrame':
import pandas as pd

with pd.option_context('display.max_rows', None, 'display.max_columns', None):
    display(df)
(Note this works with text-based viewing too; the answer to 'Pretty-print an entire Pandas Series / DataFrame' uses print(df).)
Adjust 'display.max_colwidth' if you want the entire column text to show (None lifts the width limit; older pandas used -1, which is deprecated):
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.max_colwidth', None):
    display(df)
(If you prefer text like you posted, replace display() with print().)
Generally, with the solutions above, the output area in Jupyter gets scrollbars so you can still navigate to view everything.
You can also set the number of rows shown to be lower to save space; see the example here.
You may also be interested in 'Pandas dataframe hide index functionality?' or 'Using python / Jupyter Notebook, how to prevent row numbers from printing?'.
As pointed out here, setting global options is covered in the pandas documentation for top-level options.
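For example, a minimal sketch of the global-option route (so every cell is affected, not just one with-block; the option names come from the pandas docs):
import pandas as pd

# Apply display options globally instead of via a context manager.
pd.set_option('display.max_rows', None)     # show all rows
pd.set_option('display.max_columns', None)  # show all columns

# Restore the defaults later if the output gets unwieldy.
pd.reset_option('display.max_rows')
pd.reset_option('display.max_columns')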
For display() to work these days you don't need to do anything extra. But if you are using an old Jupyter or it doesn't work, try adding the following towards the top of your notebook file and running it as a cell first:
from IPython.display import display

Template Matching through python API on Linux desktop

I'm following the tutorial on using your own template images to do object 3D pose tracking, but I'm trying to get it working on Ubuntu 20.04 with a live webcam stream.
I was able to successfully make my index .pb file with extracted KNIFT features from my custom images.
It seems the next thing to do is load the provided template matching graph (in mediapipe/graphs/template_matching/template_matching_desktop.pbtxt) (replacing the index_proto_filename of the BoxDetectorCalculator with my own index file), and run it on a video input stream to track my custom object.
I was hoping that would be easiest to do in python, but am running into dependency problems.
(I installed mediapipe python with pip3 install mediapipe)
First, I couldn't find how to directly load a .pbtxt file as a graph in the mediapipe python API, but that's ok. I just load the text it contains and use that.
import os
import mediapipe as mp

# expanduser() resolves the leading "~"; abspath() alone would not expand it
template_matching_graph_filepath = os.path.abspath(os.path.expanduser(
    "~/mediapipe/mediapipe/graphs/template_matching/template_matching_desktop.pbtxt"))
graph = mp.CalculatorGraph(graph_config=open(template_matching_graph_filepath).read())
But I get missing calculator targets.
No registered object with name: OpenCvVideoDecoderCalculator; Unable to find Calculator "OpenCvVideoDecoderCalculator"
or
[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/text_format.cc:309] Error parsing text-format mediapipe.CalculatorGraphConfig: 54:70: Could not find type "type.googleapis.com/mediapipe.TfLiteInferenceCalculatorOptions" stored in google.protobuf.Any.
It seems similar to this troubleshooting case, but since I'm not trying to compile an application, I'm not sure how to link in the missing calculators.
How do I make the mediapipe Python API aware of these graphs?
UPDATE:
I made decent progress by adding the graphs that template_matching depends on to the cc_library deps of the mediapipe/python/BUILD file:
cc_library(
    name = "builtin_calculators",
    deps = [
        "//mediapipe/calculators/image:feature_detector_calculator",
        "//mediapipe/calculators/image:image_properties_calculator",
        "//mediapipe/calculators/video:opencv_video_decoder_calculator",
        "//mediapipe/calculators/video:opencv_video_encoder_calculator",
        "//mediapipe/calculators/video:box_detector_calculator",
        "//mediapipe/calculators/tflite:tflite_inference_calculator",
        "//mediapipe/calculators/tflite:tflite_tensors_to_floats_calculator",
        "//mediapipe/calculators/util:timed_box_list_id_to_label_calculator",
        "//mediapipe/calculators/util:timed_box_list_to_render_data_calculator",
        "//mediapipe/calculators/util:landmarks_to_render_data_calculator",
        "//mediapipe/calculators/util:annotation_overlay_calculator",
        ...
I also modified solution_base.py so it knows about BoxDetector's options.
from mediapipe.calculators.video import box_detector_calculator_pb2
...
CALCULATOR_TO_OPTIONS = {
    'BoxDetectorCalculator':
        box_detector_calculator_pb2.BoxDetectorCalculatorOptions,
Then I rebuilt and installed mediapipe python from source with:
~/mediapipe$ python3 setup.py install --link-opencv
Then I was able to make my own class derived from SolutionBase:
import numpy as np
from typing import NamedTuple

from mediapipe.python.solution_base import SolutionBase

class ObjectTracker(SolutionBase):
    """Process a video stream and output a video with edges of templates highlighted."""

    def __init__(self, object_knift_index_file_path):
        # object_pose_estimation_binary_file_path (defined elsewhere) points to
        # the compiled template-matching binary graph
        super().__init__(
            binary_graph_path=object_pose_estimation_binary_file_path,
            calculator_params={"BoxDetector.index_proto_filename": object_knift_index_file_path},
        )

    def process(self, image: np.ndarray) -> NamedTuple:
        return super().process(input_data={'input_video': image})

ot = ObjectTracker(object_knift_index_file_path="/path/to/my/object_knift_index.pb")
Finally, I process a video frame from a cv2.VideoCapture:
import cv2

cv_video = cv2.VideoCapture(0)
result, frame = cv_video.read()
input_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # the graph expects RGB
res = ot.process(image=input_frame)
So close! But I run into this error which I just don't know what to do with.
/usr/local/lib/python3.8/dist-packages/mediapipe/python/solution_base.py in process(self, input_data)
326 if data.shape[2] != RGB_CHANNELS:
327 raise ValueError('Input image must contain three channel rgb data.')
--> 328 self._graph.add_packet_to_input_stream(
329 stream=stream_name,
330 packet=self._make_packet(input_stream_type,
RuntimeError: Graph has errors:
Calculator::Open() for node "BoxDetector" failed: ; Error while reading file: /usr/local/lib/python3.8/dist-packages/
Looks like CalculatorNode::OpenNode() is trying to open the python API install path as a file. Maybe it has to do with the default_context. I have no idea where to go from here. :(

Use folium Map as holoviews DynamicMap

I have a folium.Map that contains custom HTML popups with clickable URLs. These popups open when clicking on the polygons of the map, a feature that doesn't seem achievable with holoviews.
My ideal example of the final application I want to build with holoviews/geoviews is here, with the source code here, but I would like to swap the main map for my folium map and plot polygons instead of rasterized points. When I try to create the holoviews.DynamicMap from the folium.Map, holoviews complains (of course) that the data type "map" is not accepted. Is this somehow still possible?
I have found a notebook on GitHub where a holoviews plot is embedded in a folium map using a workaround that writes out HTML and reads it back in, but it seems impossible to embed a folium map into holoviews such that other plots can be updated from this figure using Streams.
Here is some toy data (from here) for the datasets that I use. For simplicity, let's assume I just had point data instead of polygons:
import folium as fm

def make_map():
    m = fm.Map(location=[20.59, 78.96], zoom_start=5)
    green_p1 = fm.map.FeatureGroup()
    # assumes a dataframe df with Latitude/Longitude columns (the toy point data)
    for row in df.itertuples():
        green_p1.add_child(
            fm.CircleMarker(
                [row.Latitude, row.Longitude],
                radius=10,
                fill=True,
                fill_color='green',  # the original snippet used an undefined fill_color
                fill_opacity=0.7,
            )
        )
    m.add_child(green_p1)
    return m
If I understand it correctly, this now needs to be tweaked so that it can be passed as the first argument to a holoviews.DynamicMap:
hv.DynamicMap(make_map, streams=my_streams)
where my_streams are some other plots that should be updated with the extent of the folium map.
Is that somehow possible or is my strategy wrong?
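For reference, the HTML-roundtrip workaround mentioned above might look roughly like the sketch below. This is only a one-way embedding, under the assumption that hv.Div will render folium's self-generated iframe HTML; it does not wire the folium map up to holoviews Streams:
import holoviews as hv
hv.extension('bokeh')

def folium_div(m, width=600, height=400):
    # A folium map renders itself as an HTML/iframe snippet via _repr_html_();
    # hv.Div displays raw HTML, so the map can at least sit inside a layout.
    return hv.Div(m._repr_html_()).opts(width=width, height=height)

# Usage: folium_div(make_map()) can be laid out next to holoviews plots,
# but it cannot emit extent events back to my_streams.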

TensorFlow: Opening log data written by SummaryWriter

After following this tutorial on summaries and TensorBoard, I've been able to successfully save and look at data with TensorBoard. Is it possible to open this data with something other than TensorBoard?
By the way, my application is to do off-policy learning. I'm currently saving each state-action-reward tuple using SummaryWriter. I know I could manually store/train on this data, but I thought it'd be nice to use TensorFlow's built in logging features to store/load this data.
As of March 2017, the EventAccumulator tool has been moved from TensorFlow core to the TensorBoard backend. You can still use it to extract data from TensorBoard log files as follows:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

event_acc = EventAccumulator('/path/to/summary/folder')
event_acc.Reload()
# Show all tags in the log file
print(event_acc.Tags())
# E.g. get wall-clock times, step numbers, and values for a scalar 'Accuracy'
w_times, step_nums, vals = zip(*event_acc.Scalars('Accuracy'))
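If pandas is available, the extracted tuples can then be collected into a DataFrame for further analysis (a small follow-up sketch, not part of the original answer):
import pandas as pd

# Gather the scalar events extracted above into a tidy table.
df = pd.DataFrame({'wall_time': w_times, 'step': step_nums, 'value': vals})
print(df.head())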
Easy: the data can actually be exported to a .csv file within TensorBoard under the Events tab, which can then be loaded into a pandas dataframe in Python, for example. Make sure you check the Data download links box.
For a more automated approach, check out the TensorBoard readme:

If you'd like to export data to visualize elsewhere (e.g. iPython Notebook), that's possible too. You can directly depend on the underlying classes that TensorBoard uses for loading data: python/summary/event_accumulator.py (for loading data from a single run) or python/summary/event_multiplexer.py (for loading data from multiple runs, and keeping it organized). These classes load groups of event files, discard data that was "orphaned" by TensorFlow crashes, and organize the data by tag.

As another option, there is a script (tensorboard/scripts/serialize_tensorboard.py) which will load a logdir just like TensorBoard does, but write all of the data out to disk as json instead of starting a server. This script is setup to make "fake TensorBoard backends" for testing, so it is a bit rough around the edges.
I think the data are encoded as protobufs in RecordReader format. To get serialized strings out of the files you can use py_record_reader or build a graph with the TFRecordReader op, and to deserialize those strings to protobuf use the Event schema. If you get a working example, please update this question, since we seem to be missing documentation on this.
I did something along these lines for a previous project. As mentioned by others, the main ingredient is TensorFlow's event accumulator:
from tensorflow.python.summary import event_accumulator as ea

acc = ea.EventAccumulator("folder/containing/summaries/")
acc.Reload()

# Print tags of contained entities; use these names to retrieve entities as below
print(acc.Tags())

# E.g. get all values and steps of a scalar called 'l2_loss'
xy_l2_loss = [(s.step, s.value) for s in acc.Scalars('l2_loss')]

# Retrieve images, e.g. the first one tagged 'generator'
img = acc.Images('generator/image/0')[0]  # Images() returns a list of image events
with open('img_{}.png'.format(img.step), 'wb') as f:
    f.write(img.encoded_image_string)
You can also use tf.train.summary_iterator: to extract events from a ./logs folder where only classic scalars lr, acc, loss, val_acc and val_loss are present, you can use this GIST: tensorboard_to_csv.py
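A minimal sketch of that route (TF 1.x API; under TF 2.x the same function lives at tf.compat.v1.train.summary_iterator):
import tensorflow as tf

# Iterate over every event in a single tfevents file and print scalar values.
for event in tf.train.summary_iterator('path/to/events.out.tfevents.xxx'):
    for value in event.summary.value:
        if value.HasField('simple_value'):  # classic scalars only
            print(value.tag, event.step, value.simple_value)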
Chris Cundy's answer works well when you have fewer than 10000 data points in your tfevents file. However, when you have a large file with over 10000 data points, TensorBoard will automatically sample them and give you at most 10000 points. This is a quite annoying underlying behavior, as it is not well documented. See https://github.com/tensorflow/tensorboard/blob/master/tensorboard/backend/event_processing/event_accumulator.py#L186.
To get around it and get all data points, a slightly hacky way is:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# A dict-like object that claims to contain every key and returns 0
# ("keep all events") for any size-guidance lookup.
class FalseDict(object):
    def __getitem__(self, key):
        return 0
    def __contains__(self, key):
        return True

event_acc = EventAccumulator('path/to/your/tfevents', size_guidance=FalseDict())
It looks like for TensorBoard version >= 2.3 you can streamline the process of converting your tb events to a pandas dataframe using tensorboard.data.experimental.ExperimentFromDev().
It requires you to upload your logs to TensorBoard.dev, though, which is public. There are plans to expand this capability to locally stored logs in the future.
https://www.tensorflow.org/tensorboard/dataframe_api
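Usage follows the linked dataframe API docs (the experiment ID below is a placeholder for the ID you get after uploading):
import tensorboard as tb

# "XXXXXXXXXXXX" stands in for the experiment ID printed by `tensorboard dev upload`.
experiment = tb.data.experimental.ExperimentFromDev("XXXXXXXXXXXX")
df = experiment.get_scalars()
print(df.head())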
You can also use the EventFileLoader to iterate through a TensorBoard file:
from tensorboard.backend.event_processing.event_file_loader import EventFileLoader

for event in EventFileLoader('path/to/events.out.tfevents.xxx').Load():
    print(event)
Surprisingly, the Python package tbparse has not been mentioned yet.
From documentation:
Installation:
pip install tensorflow  # or tensorflow-cpu
pip install -U tbparse  # requires Python >= 3.7
Note: If you don't want to install TensorFlow, see Installing without TensorFlow.
We suggest using an additional virtual environment for parsing and plotting the tensorboard events. So no worries if your training code uses Python 3.6 or older versions.
Reading one or more event files with tbparse only requires 5 lines of code:
from tbparse import SummaryReader
log_dir = "<PATH_TO_EVENT_FILE_OR_DIRECTORY>"
reader = SummaryReader(log_dir)
df = reader.scalars
print(df)

nbconvert from notebook to LaTeX/PDF cuts off the dataframe

I have a notebook that displays lots of dataframe tables in HTML format. However, when I use nbconvert to convert the notebook to LaTeX and then to PDF, some tables with wider bodies get cut off. I have tried using pd.set_option('display.width', xxxx), but it didn't seem to work. The issue is not in the display of the pandas dataframe (the table displays just fine in the notebook), but rather in converting the notebook to LaTeX/PDF. Is there anything I can do to resolve this problem?