Cannot group boxplots or add manual legend using ggpattern and 2 datasets - ggplot2

I am quite new to R and currently struggling with a specific plot I want to create.
I would like to group my boxplots (two per Area: Good Habitat, Average Habitat, etc.) and add a legend. Since I am using two datasets from two separate Excel spreadsheets, setting fill = Area in the aesthetics neither groups the plots nor produces a legend. I also tried adding a legend manually, as seen in other posts, but again it does not work with two datasets.
This is the base code I use for this plot. The striped boxplots are supposed to show that larvae are present, whereas the empty boxplots show hostplant availability:
#DATASETS
ControlPlots <- read_excel("ApolloLarva_EnvVari_PerGrid.xlsx", sheet = "ControlPlots_InHabitat")
LarvaDataAreas <- read_excel("ApolloLarva_M_CombinedAreas.xlsx", sheet = "OnlyLarvaPlots")
#Renaming Areas for x-axis
ControlPlots$Area <- gsub("^.*Core", "Good Habitat", ControlPlots$Area)
LarvaDataAreas $Area <- gsub("^.*Core_Area", "Good Habitat", LarvaDataAreas $Area)
ControlPlots$Area <- gsub("^.*Ref", "Average Habitat", ControlPlots$Area)
LarvaDataAreas $Area <- gsub("^.*Ref_Area", "Average Habitat", LarvaDataAreas $Area)
ControlPlots$Area <- gsub("^.*Rest2020", "Targeted Restoration", ControlPlots$Area)
LarvaDataAreas $Area <- gsub("^.*Rest_Area_2020", "Targeted Restoration", LarvaDataAreas $Area)
ControlPlots$Area <- gsub("^.*Rest2021", "Non-targeted Restoration", ControlPlots$Area)
#Plot
a <- ggplot(ControlPlots, aes(x = Area, y = Hostplant)) +
  geom_boxplot(colour = "black") +
  geom_boxplot_pattern(data = LarvaDataAreas, fill = "white", colour = "black",
                       pattern_density = 0.02, pattern_spacing = 0.01,
                       pattern_fill = "black", pattern_colour = "black", alpha = 0.8) +
  theme_bw() +
  labs(x = "Areas",
       y = "Hostplant cover [%]",
       title = "")
If anyone has any tips, I would be really grateful! Thank you!!
UPDATE: Dataset:
dput(ControlPlots2)
structure(list(Area = c("Rest2020", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Core", "Core",
"Core", "Core", "Core", "Core", "Core", "Core", "Ref", "Ref",
"Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref",
"Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref",
"Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref",
"Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref",
"Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref",
"Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref", "Ref",
"Ref", "Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020",
"Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020",
"Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020",
"Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020",
"Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020",
"Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020",
"Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020",
"Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020", "Rest2020",
"Rest2020", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021", "Rest2021",
"Rest2021"), GridID = c(13069, 11053, 11053, 11053, 11053, 11053,
11053, 11053, 11053, 11053, 11053, 11053, 11053, 11053, 11052,
11052, 11052, 11052, 11052, 11052, 11052, 11052, 11052, 11052,
11052, 11051, 11051, 11051, 11051, 11051, 11051, 11154, 11154,
11154, 11154, 11154, 11154, 11154, 11154, 11154, 11154, 11153,
11153, 11153, 11153, 11153, 11153, 11153, 11153, 11153, 11153,
11153, 11153, 11153, 11153, 11152, 11152, 11152, 11152, 11152,
11152, 11152, 11152, 11152, 11152, 11255, 11255, 11255, 11255,
11255, 11255, 11255, 11255, 11255, 11255, 11255, 11254, 11254,
11254, 11254, 11254, 11254, 11254, 11254, 11254, 11254, 11254,
11253, 11253, 11253, 11253, 11253, 11253, 11253, 11253, 11253,
11253, 11253, 11253, 12161, 12161, 12161, 12161, 12161, 12161,
12160, 12160, 12160, 12263, 12263, 12263, 12263, 12263, 12263,
12263, 12262, 12262, 12262, 12262, 12262, 12262, 12262, 12262,
12262, 12261, 12261, 12261, 12261, 12261, 12261, 12261, 12261,
12466, 12466, 12466, 12466, 12466, 12466, 12466, 12567, 12567,
12567, 12567, 12567, 12567, 12668, 12668, 12668, 12668, 12668,
12668, 12668, 12668, 12667, 12667, 12667, 13272, 13272, 13272,
13272, 13272, 13272, 13272, 13272, 13171, 13171, 13171, 13171,
13171, 13171, 13171, 13171, 13171, 13171, 13171, 13171, 13171,
13171, 13171, 13171, 13171, 13171, 13171, 13170, 13170, 13170,
13170, 13170, 13170, 13170, 13170, 13170, 13170, 13170, 13170,
13170, 13170, 13170, 13170, 13069, 13069, 13069, 13069, 13069,
10867, 10867, 10867, 10867, 10766, 10766, 10766, 10766, 10766,
10766, 10766, 10767, 10767, 10767, 10767, 10767, 10767, 10767,
10767, 10767, 10768, 10768, 10768, 10768, 10768, 10768, 10768,
10666, 10666, 10666, 10666, 10666, 10666, 10666, 10666, 10666,
10666, 10666, 10666, 10666, 10666, 10666, 10565, 10565, 10565,
10565, 10565, 10565, 10566, 10566, 10566, 10566, 10566, 10566,
10566, 10566, 10566, 10566, 10568, 10568, 10568, 10568, 10568,
10568, 10568, 10568, 10568, 10568, 10668, 10668, 10668, 10668,
10668, 10668, 10668, 10668, 10668, 10668, 10669, 10669, 10669,
10669, 10669, 10669, 10669, 10669, 10669, 10669, 10669, 10669,
10770, 10770, 10770, 10770, 10770, 10770, 10770, 10770, 10770,
10770, 10770, 10770), Hostplant = c(0, 1, 0, 0, 5, 6, 12, 14,
0, 5, 12, 13, 16, 3, 1, 0, 0, 14, 0, 2, 2, 10, 6, 0, 0, 0, 0,
0, 0, 0, 1, 23, 2, 3, 0, 14, 5, 0, 0, 0, 0, 3, 1, 4, 0, 6, 6,
2, 9, 3, 6, 7, 36, 16, 3, 2, 1, 0, 4, 16, 1, 0, 8, 0, 4, 0, 9,
7, 0, 2, 8, 4, 0, 9, 2, 7, 16, 1, 0, 0, 5, 2, 5, 0, 4, 13, 4,
4, 9, 0, 6, 1, 1, 1, 1, 1, 3, 22, 1, 0, 0, 1, 22, 0, 1, 0, 0,
0, 2, 0, 3, 7, 0, 17, 0, 2, 4, 5, 9, 0, 3, 3, 0, 0, 1, 2, 4,
0, 1, 6, 0, 6, 6, 6, 15, 2, 25, 0, 5, 21, 14, 0, 3, 2, 3, 0,
12, 10, 2, 0, 13, 4, 0, 1, 11, 6, 0, 1, 0, 0, 3, 0, 0, 0, 0,
0, 0, 3, 0, 0, 0, 0, 3, 18, 0, 0, 3, 0, 13, 0, 0, 0, 0, 0, 5,
0, 5, 0, 5, 10, 7, 0, 3, 21, 4, 0, 3, 0, 0, 0, 0, 0, 3, 15, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), LarvaeCount = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -306L))
Larva dataset:
dput(LarvaDataAreas2)
structure(list(Area = c("Rest_Area_2020", "Rest_Area_2020", "Rest_Area_2020",
"Ref_Area", "Ref_Area", "Ref_Area", "Ref_Area", "Ref_Area", "Ref_Area",
"Ref_Area", "Ref_Area", "Ref_Area", "Ref_Area", "Ref_Area", "Ref_Area",
"Ref_Area", "Core_Area", "Core_Area", "Core_Area", "Core_Area",
"Core_Area", "Core_Area", "Core_Area", "Core_Area", "Core_Area",
"Core_Area", "Core_Area", "Core_Area", "Core_Area", "Core_Area",
"Core_Area", "Core_Area", "Core_Area", "Core_Area", "Core_Area",
"Core_Area", "Core_Area", "Core_Area", "Core_Area", "Core_Area",
"Core_Area", "Core_Area", "Core_Area"), FID = c(241, 243, 291,
226, 162, 150, 151, 154, 156, 158, 161, 174, 179, 181, 210, 213,
24, 1, 2, 3, 5, 7, 11, 12, 13, 14, 16, 17, 18, 19, 23, 29, 30,
31, 36, 38, 41, 44, 64, 78, 108, 123, 135), Hostplant = c(5,
9, 15, 17, 24, 4, 9, 6, 8, 12, 12, 11, 19, 18, 11, 12, 17, 5,
10, 24, 8, 12, 11, 10, 5, 20, 0, 4, 24, 8, 4, 1, 5, 4, 4, 2,
8, 8, 4, 9, 16, 5, 29)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -43L))

As suggested in the comments, the easiest approach is to bind the two data frames and map a source-identifying column to an aesthetic. dplyr::bind_rows lets you create such an ID column on the fly.
library(tidyverse)
library(ggpattern)

## bind data sets; the argument names become the values of the ID column
df <- bind_rows(test = LarvaDataAreas, control = ControlPlots, .id = "control")

ggplot(df, aes(x = Area, y = Hostplant)) +
  geom_boxplot_pattern(aes(pattern = control),
                       pattern_density = 0.02, pattern_spacing = 0.01,
                       pattern_colour = "black", alpha = 0.8) +
  scale_pattern_manual(NULL, values = c("none", "stripe"))
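If you want more descriptive legend keys than the IDs created by bind_rows, one option (a sketch, assuming the same df as above; the label strings are made up) is to name the values and pass labels to the manual scale:
scale_pattern_manual(NULL,
                     values = c(control = "none", test = "stripe"),
                     labels = c(control = "Hostplant availability", test = "Larvae present"))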

Related

How to get data from the device using IOBufferMemoryDescriptor in driverKit

I'm trying to create a driver for my USB device, using iOS and DriverKit.
I'm basing my code on the example used at WWDC: https://github.com/knightsc/USBApp
My driver starts fine when the device is connected, and the readCompleted method is called fine, but action->GetReference() returns only \0 characters.
Also, to confirm that the USB device is actually working, I connected it to my Mac first; using PyUSB I can see that it returns data in chunks of 1024 bytes on interface 0.
This is the data I get in PyUSB:
array('B', [6, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0, 10, 0, 11, 0, 12, 0, 13, 0, 14, 0, 15, 0, 16, 0, 17, 0, 18, 0, 19, 0, 20, 0, 21, 0, 22, 0, 23, 0, 24, 0, 25, 0, 26, 0, 27, 0, 28, 0, 29, 0, 30, 0, 31, 0, 32, 0, 33, 0, 34, 0, 35, 0, 36, 0, 37, 0, 38, 0, 39, 0, 40, 0, 41, 0, 42, 0, 43, 0, 44, 0, 45, 0, 46, 0, 47, 0, 48, 0, 49, 0, 50, 0, 51, 0, 52, 0, 53, 0, 54, 0, 55, 0, 56, 0, 57, 0, 58, 0, 59, 0, 60, 0, 61, 0, 62, 0, 63, 0, 64, 0, 65, 0, 66, 0, 67, 0, 68, 0, 69, 0, 70, 0, 71, 0, 72, 0, 73, 0, 74, 0, 75, 0, 76, 0, 77, 0, 78, 0, 79, 0, 80, 0, 81, 0, 82, 0, 83, 0, 84, 0, 85, 0, 86, 0, 87, 0, 88, 0, 89, 0, 90, 0, 91, 0, 92, 0, 93, 0, 94, 0, 95, 0, 96, 0, 97, 0, 98, 0, 99, 0, 100, 0, 101, 0, 102, 0, 103, 0, 104, 0, 105, 0, 106, 0, 107, 0, 108, 0, 109, 0, 110, 0, 111, 0, 112, 0, 113, 0, 114, 0, 115, 0, 116, 0, 117, 0, 118, 0, 119, 0, 120, 0, 121, 0, 122, 0, 123, 0, 124, 0, 125, 0, 126, 0, 127, 0, 128, 0, 129, 0, 130, 0, 131, 0, 132, 0, 133, 0, 134, 0, 135, 0, 136, 0, 137, 0, 138, 0, 139, 0, 140, 0, 141, 0, 142, 0, 143, 0, 144, 0, 145, 0, 146, 0, 147, 0, 148, 0, 149, 0, 150, 0, 151, 0, 152, 0, 153, 0, 154, 0, 155, 0, 156, 0, 157, 0, 158, 0, 159, 0, 160, 0, 161, 0, 162, 0, 163, 0, 164, 0, 165, 0, 166, 0, 167, 0, 168, 0, 169, 0, 170, 0, 171, 0, 172, 0, 173, 0, 174, 0, 175, 0, 176, 0, 177, 0, 178, 0, 179, 0, 180, 0, 181, 0, 182, 0, 183, 0, 184, 0, 185, 0, 186, 0, 187, 0, 188, 0, 189, 0, 190, 0, 191, 0, 192, 0, 193, 0, 194, 0, 195, 0, 196, 0, 197, 0, 198, 0, 199, 0, 200, 0, 201, 0, 202, 0, 203, 0, 204, 0, 205, 0, 206, 0, 207, 0, 208, 0, 209, 0, 210, 0, 211, 0, 212, 0, 213, 0, 214, 0, 215, 0, 216, 0, 217, 0, 218, 0, 219, 0, 220, 0, 221, 0, 222, 0, 223, 0, 224, 0, 225, 0, 226, 0, 227, 0, 228, 0, 229, 0, 230, 0, 231, 0, 232, 0, 233, 0, 234, 0, 235, 0, 236, 0, 237, 0, 238, 0, 239, 0, 240, 0, 241, 0, 242, 0, 243, 0, 244, 0, 245, 0, 246, 0, 247, 0, 248, 0, 249, 0, 250, 0, 251, 0, 252, 0, 253, 0, 254, 0, 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
This is the Ivars:
struct Mk1dDriver_IVars
{
    IOUSBHostInterface       *interface;
    IOUSBHostPipe            *inPipe;
    OSAction                 *ioCompleteCallback;
    IOBufferMemoryDescriptor *inData;
    uint16_t                  maxPacketSize;
};
This is the Start method:
kern_return_t
IMPL(Mk1dDriver, Start)
{
    kern_return_t ret;
    IOUSBStandardEndpointDescriptors descriptors;

    ret = Start(provider, SUPERDISPATCH);
    __Require(kIOReturnSuccess == ret, Exit);

    ret = RegisterService();
    if (ret != kIOReturnSuccess)
    {
        Log("Start() - Failed to register service with error: 0x%08x.", ret);
        goto Exit;
    }

    ivars->interface = OSDynamicCast(IOUSBHostInterface, provider);
    __Require_Action(NULL != ivars->interface, Exit, ret = kIOReturnNoDevice);

    ret = ivars->interface->Open(this, 0, NULL);
    __Require(kIOReturnSuccess == ret, Exit);

    ret = ivars->interface->CopyPipe(kMyEndpointAddress, &ivars->inPipe);
    __Require(kIOReturnSuccess == ret, Exit);

    ret = ivars->interface->CreateIOBuffer(kIOMemoryDirectionIn,
                                           1024,
                                           &ivars->inData);
    __Require(kIOReturnSuccess == ret, Exit);

    ret = OSAction::Create(this,
                           Mk1dDriver_ReadComplete_ID,
                           IOUSBHostPipe_CompleteAsyncIO_ID,
                           0,
                           &ivars->ioCompleteCallback);
    __Require(kIOReturnSuccess == ret, Exit);

    ret = ivars->inPipe->AsyncIO(ivars->inData,
                                 ivars->maxPacketSize,
                                 ivars->ioCompleteCallback,
                                 0);
    __Require(kIOReturnSuccess == ret, Exit);

    os_log(OS_LOG_DEFAULT, "Finish");

    // WWDC slides don't show the full function
    // i.e. this is still unfinished
Exit:
    return ret;
}
The only difference compared with the code from Apple is that I set the capacity in CreateIOBuffer to 1024, because if I leave it at 0 it returns an error that memory could not be allocated: kIOReturnNoMemory.
And the ReadComplete method:
void
IMPL(Mk1dDriver, ReadComplete)
{
    char output[1024];
    memcpy(action->GetReference(), &output, 1024);
    os_log(OS_LOG_DEFAULT, "ReadComplete");
}
If I put a breakpoint at the log statement, I can see that every position in output is \0.
Any idea what I'm doing wrong?
Thanks
You need to store a reference to the IOBufferMemoryDescriptor* that you asked AsyncIO to write into (ivars->inData), so that you can access it when the completion callback ReadComplete is called. You can store it in the memory that you access with GetReference().
You also need to set the size of the custom memory that should be allocated for you. Currently you are allocating 0 bytes; see OSAction::Create.
In ReadComplete you can then call GetReference() to access that memory. Since you know it contains a pointer to the IOBufferMemoryDescriptor that the data was written to, you can use it with memcpy.
Something like this:
ret = OSAction::Create(this,
                       Mk1dDriver_ReadComplete_ID,
                       IOUSBHostPipe_CompleteAsyncIO_ID,
                       sizeof(IOBufferMemoryDescriptor*),
                       &ivars->ioCompleteCallback);
// Store the pointer value itself (hence &ivars->inData, not ivars->inData)
// in the action's reference memory.
memcpy(ivars->ioCompleteCallback->GetReference(),
       &ivars->inData, sizeof(IOBufferMemoryDescriptor*));
Remember that the first parameter to memcpy is the destination.
void IMPL(Mk1dDriver, ReadComplete)
{
    // recover the descriptor pointer stashed in the action's reference memory
    // (note &ptr: we copy into the pointer variable itself)
    IOBufferMemoryDescriptor* ptr;
    memcpy(&ptr, action->GetReference(), sizeof(IOBufferMemoryDescriptor*));

    // look up the buffer's address range and copy the received bytes out
    IOAddressSegment addressSegment{};
    ptr->GetAddressRange(&addressSegment);

    char output[1024];
    memcpy(output, (void *)addressSegment.address, addressSegment.length);
}

Problem converting a dataframe column into a list

I have a csv file with following structure:
Tokens,Tags,Polarities
"['i', 'agree', 'about', 'arafat', '.', 'i', 'mean', ',', 'shit', ',', 'they', 'even', 'gave', 'one', 'to', 'jimmy', 'carter', 'ha', '.', 'it', 'should', 'be', 'called', ""''"", 'the', 'worst', 'president', ""''"", 'prize', '.']","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]","[-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 0, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1]"
"['musicmonday', 'britney', 'spears', '-', 'lucky', 'do', 'you', 'remember', 'this', 'song', '?', 'it', '`', 's', 'awesome', '.', 'i', 'love', 'it', '.']","[0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]","[-1, 2, 2, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1]"
"['wtf', '?', 'hilary', 'swank', 'is', 'coming', 'to', 'my', 'school', 'today', ',', 'just', 'to', 'chill', '.', 'lol', 'wow']","[0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]","[-1, -1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1]"
"['my', '3-year-old', 'was', 'amazed', 'yesterday', 'to', 'find', 'that', ""'"", 'real', ""'"", '10', 'pin', 'bowling', 'is', 'nothing', 'like', 'it', 'is', 'on', 'the', 'wii', '...']","[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]","[-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, -1]"
"['God', 'damn', '.', 'That', 'Sony', 'remote', 'for', 'google', 'is', 'fucking', 'hideeeeeous', '!']","[0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]","[-1, -1, -1, -1, -1, -1, -1, 0, -1, -1, -1, -1]"
I am trying to read the file as follows:
twitter_train = pd.read_csv('twitter_train.csv')
Then I can see that it has a correct structure:
twitter_train.head(3)
Tokens Tags Polarities
0 ['i', 'agree', 'about', 'arafat', '.', 'i', 'm... [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -...
1 ['musicmonday', 'britney', 'spears', '-', 'luc... [0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... [-1, 2, 2, -1, -1, -1, -1, -1, -1, -1, -1, -1,...
2 ['wtf', '?', 'hilary', 'swank', 'is', 'coming'... [0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... [-1, -1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1,...
I want to convert each column to a list of lists, for example:
twitter_train_lists = twitter_train['Tokens'].tolist()
But I get an incorrect structure, with an extra \ or " attached to each element in the list and around each list itself:
['[\'i\', \'agree\', \'about\', \'arafat\', \'.\', \'i\', \'mean\', \',\', \'shit\', \',\', \'they\', \'even\', \'gave\', \'one\', \'to\', \'jimmy\', \'carter\', \'ha\', \'.\', \'it\', \'should\', \'be\', \'called\', "\'\'", \'the\', \'worst\', \'president\', "\'\'", \'prize\', \'.\']',
"['musicmonday', 'britney', 'spears', '-', 'lucky', 'do', 'you', 'remember', 'this', 'song', '?', 'it', '`', 's', 'awesome', '.', 'i', 'love', 'it', '.']",
"['wtf', '?', 'hilary', 'swank', 'is', 'coming', 'to', 'my', 'school', 'today', ',', 'just', 'to', 'chill', '.', 'lol', 'wow']",
'[\'my\', \'3-year-old\', \'was\', \'amazed\', \'yesterday\', \'to\', \'find\', \'that\', "\'", \'real\', "\'", \'10\', \'pin\', \'bowling\', \'is\', \'nothing\', \'like\', \'it\', \'is\', \'on\', \'the\', \'wii\', \'...\']',
"['God', 'damn', '.', 'That', 'Sony', 'remote', 'for', 'google', 'is', 'fucking', 'hideeeeeous', '!']"]
How can I extract the lists properly from this csv file to get the correct structure:
[['i', 'agree', 'about', 'arafat', '.', 'i', 'mean', ',', 'shit', ',', 'they', 'even', 'gave', 'one', 'to', 'jimmy', 'carter', 'ha', '.', 'it', 'should', 'be', 'called', "''", 'the', 'worst', 'president', "''", 'prize', '.'],
['musicmonday', 'britney', 'spears', '-', 'lucky', 'do', 'you', 'remember', 'this', 'song', '?', 'it', '`', 's', 'awesome', '.', 'i', 'love', 'it', '.'],
['wtf', '?', 'hilary', 'swank', 'is', 'coming', 'to', 'my', 'school', 'today', ',', 'just', 'to', 'chill', '.', 'lol', 'wow'],
['my', '3-year-old', 'was', 'amazed', 'yesterday', 'to', 'find', 'that', "'", 'real', "'", '10', 'pin', 'bowling', 'is', 'nothing', 'like', 'it', 'is', 'on', 'the', 'wii', '...'],
['God', 'damn', '.', 'That', 'Sony', 'remote', 'for', 'google', 'is', 'fucking', 'hideeeeeous', '!']]
You can find the original dataset file here: https://github.com/1tangerine1day/Aspect-Term-Extraction-and-Analysis/tree/master/data
Update:
I tried another way but had the same problem:
import csv
with open('twitter_train.csv', newline='') as f:
    reader = csv.reader(f)
    data = list(reader)
Another incorrect output:
print(data[3])
["['wtf', '?', 'hilary', 'swank', 'is', 'coming', 'to', 'my', 'school', 'today', ',', 'just', 'to', 'chill', '.', 'lol', 'wow']", '[0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]', '[-1, -1, 1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1]']
Thanks in advance!
The values in your csv are actually strings, not lists. You need to make them actual lists.
twitter_train = pd.read_csv('twitter_train.csv')
twitter_train['Tokens'] = list(twitter_train['Tokens'].str.strip("['").str.rstrip("']").str.split("', '"))
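A more robust alternative (a sketch; the string-stripping approach can mangle tokens that themselves contain quotes or ", ") is to parse each cell as a Python literal with ast.literal_eval:
import ast
import pandas as pd

twitter_train = pd.read_csv('twitter_train.csv')
# each cell holds the string repr of a list, so parse it back into a real list
twitter_train_lists = twitter_train['Tokens'].apply(ast.literal_eval).tolist()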

Python - Convert from Response Variable to Pandas Dataframe

I ran a LIWC analysis and it gives me the following results (below). I would like to turn the result into a pandas dataframe. If anyone can chip in, that would be wonderful.
Thanks in advance :)
Best,
David
resp = requests.post(url, auth=(api_key, api_secret), data=data)
resp1 = resp
print(resp.json())
{'plan_usage': {'call_limit': 1000, 'calls_made': 6, 'calls_remaining': 994, 'percent_used': 0.6, 'start_date': '2020-12-09T03:05:57.779556Z', 'end_date': '2020-12-23T03:05:57.779556Z'}, 'results': [{'response_id': 'd1382f42-5c28-4528-ab2e-81b80ba185e2', 'request_id': 'req-1', 'language': 'en', 'version': 'v1.0.0', 'summary': {'word_count': 57, 'words_per_sentence': 11.4, 'sentence_count': 5, 'six_plus_words': 0.2982456140350877, 'emojis': 0, 'emoticons': 0, 'hashtags': 0, 'urls': 0}, 'liwc': {'scores': {'analytical_thinking': 80.77394876079086, 'authentic': 38.8220872694557, 'clout': 50, 'emotional_tone': 97.58138119866139, 'dictionary_words': 0.8771929824561403, 'categories': {'achievement': 0, 'adjectives': 0.017543859649122806, 'adverbs': 0.03508771929824561, 'affect': 0.05263157894736842, 'affiliation': 0.017543859649122806, 'all_punctuation': 0.10526315789473684, 'anger_words': 0, 'anxiety_words': 0, 'apostrophes': 0, 'articles': 0.12280701754385964, 'assent': 0, 'auxiliary_verbs': 0.14035087719298245, 'biological_processes': 0, 'body': 0, 'causation': 0, 'certainty': 0, 'cognitive_processes': 0.05263157894736842, 'colons': 0, 'commas': 0.017543859649122806, 'comparisons': 0, 'conjunctions': 0.07017543859649122, 'dashes': 0, 'death': 0, 'differentiation': 0, 'discrepancies': 0.017543859649122806, 'drives': 0.03508771929824561, 'exclamations': 0, 'family': 0, 'feel': 0, 'female': 0, 'filler_words': 0, 'focus_future': 0, 'focus_past': 0, 'focus_present': 0.14035087719298245, 'friends': 0.017543859649122806, 'function_words': 0.543859649122807, 'health': 0, 'hear': 0, 'home': 0, 'i': 0.03508771929824561, 'impersonal_pronouns': 0.03508771929824561, 'informal_language': 0, 'ingestion': 0, 'insight': 0, 'interrogatives': 0.017543859649122806, 'leisure': 0.14035087719298245, 'male': 0, 'money': 0, 'motion': 0.05263157894736842, 'negations': 0, 'negative_emotion_words': 0, 'netspeak': 0, 'nonfluencies': 0, 'numbers': 0, 'other_grammar': 0.2807017543859649, 'other_punctuation': 0, 'parentheses': 0, 'perceptual_processes': 0.017543859649122806, 'periods': 0.08771929824561403, 'personal_concerns': 0.14035087719298245, 'personal_pronouns': 0.03508771929824561, 'positive_emotion_words': 0.05263157894736842, 'power': 0, 'prepositions': 0.10526315789473684, 'pronouns': 0.07017543859649122, 'quantifiers': 0.05263157894736842, 'question_marks': 0, 'quotes': 0, 'relativity': 0.17543859649122806, 'religion': 0, 'reward': 0.017543859649122806, 'risk': 0, 'sad_words': 0, 'see': 0.017543859649122806, 'semicolons': 0, 'sexual': 0, 'she_he': 0, 'social': 0.03508771929824561, 'space': 0.10526315789473684, 'swear_words': 0, 'tentative': 0.03508771929824561, 'they': 0, 'time': 0.017543859649122806, 'time_orientation': 0.14035087719298245, 'verbs': 0.19298245614035087, 'we': 0, 'work': 0, 'you': 0}}}, 'sallee': {'counts': {'emotions': {'admiration': 5, 'amusement': 0, 'anger': 0, 'boredom': 0, 'calmness': 0, 'curiosity': 0, 'desire': 0, 'disgust': 0, 'excitement': 0.375, 'fear': 0, 'gratitude': 2, 'joy': 6.375, 'love': 5, 'pain': 0, 'sadness': 0, 'surprise': 0}, 'goodfeel': 13.375, 'ambifeel': 0, 'badfeel': 0, 'emotionality': 13.375, 'sentiment': 13.375, 'non_emotion': None}, 'scores': {'emotions': {'admiration': 0.3333333333333333, 'amusement': 0, 'anger': 0, 'boredom': 0, 'calmness': 0, 'curiosity': 0, 'desire': 0, 'disgust': 0, 'excitement': 0.03614457831325301, 'fear': 0, 'gratitude': 0.16666666666666666, 'joy': 0.3893129770992366, 'love': 0.3333333333333333, 'pain': 0, 'sadness': 0, 'surprise': 0}, 
'goodfeel': 0.2015065913370998, 'ambifeel': 0, 'badfeel': 0, 'emotionality': 0.2015065913370998, 'sentiment': 0.6541600137038615, 'non_emotion': 0.7984934086629002}, 'emotion_word_count': 4}}]}
import pandas as pd

js = resp.json()
# flatten the nested result dict into a single-row dataframe with dotted column names
df = pd.json_normalize(js['results'][0])
df.columns
Index(['response_id', 'request_id', 'language', 'version',
'summary.word_count', 'summary.words_per_sentence',
'summary.sentence_count', 'summary.six_plus_words', 'summary.emojis',
'summary.emoticons',
...
'sallee.scores.emotions.pain', 'sallee.scores.emotions.sadness',
'sallee.scores.emotions.surprise', 'sallee.scores.goodfeel',
'sallee.scores.ambifeel', 'sallee.scores.badfeel',
'sallee.scores.emotionality', 'sallee.scores.sentiment',
'sallee.scores.non_emotion', 'sallee.emotion_word_count'],
dtype='object', length=150)
df.iloc[0]
response_id d1382f42-5c28-4528-ab2e-81b80ba185e2
request_id req-1
language en
version v1.0.0
summary.word_count 57
...
sallee.scores.badfeel 0
sallee.scores.emotionality 0.202
sallee.scores.sentiment 0.654
sallee.scores.non_emotion 0.798
sallee.emotion_word_count 4
Name: 0, Length: 150, dtype: object
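As a side note, pd.json_normalize also accepts the whole list of results, producing one row per result, which is convenient once you send several texts in one request:
df = pd.json_normalize(js['results'])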

KeyError: "None of [Index([...] are in the [columns]

I've got a numpy array with shape (3, 50):
data = np.array([[0, 3, 0, 2, 0, 0, 1, 2, 2, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 2, 0, 0,
0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 1, 0, 0, 7, 0, 0, 0, 0,
1, 1, 2, 0, 0, 2],
[0, 0, 0, 0, 0, 3, 0, 1, 6, 1, 1, 0, 0, 0, 0, 2, 0, 0, 1, 0, 1, 0,
3, 0, 0, 0, 0, 0, 0, 5, 2, 2, 2, 1, 0, 0, 1, 0, 1, 3, 2, 0, 0, 0,
0, 0, 2, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0]])
and the following column names:
new_cols = [f'description_word_{i+1}_count' for i in range(50)]
I'm trying to add new columns to an already existing dataframe like this:
df[new_cols] = data
but get the error:
KeyError: "None of [Index(['description_word_1_count',
'description_word_2_count',\n 'description_word_3_count',
'description_word_4_count',\n 'description_word_5_count',
'description_word_6_count',\n 'description_word_7_count',
'description_word_8_count',\n 'description_word_9_count',
'description_word_10_count',\n 'description_word_11_count',
'description_word_12_count',\n 'description_word_13_count',
'description_word_14_count',\n 'description_word_15_count',
'description_word_16_count',\n 'description_word_17_count',
'description_word_18_count',\n 'description_word_19_count',
'description_word_20_count',\n 'description_word_21_count',
'description_word_22_count',\n 'description_word_23_count',
'description_word_24_count',\n 'description_word_25_count',
'description_word_26_count',\n 'description_word_27_count',
'description_word_28_count',\n 'description_word_29_count',
'description_word_30_count',\n 'description_word_31_count',
'description_word_32_count',\n 'description_word_33_count',
'description_word_34_count',\n 'description_word_35_count',
'description_word_36_count',\n 'description_word_37_count',
'description_word_38_count',\n 'description_word_39_count',
'description_word_40_count',\n 'description_word_41_count',
'description_word_42_count',\n 'description_word_43_count',
'description_word_44_count',\n 'description_word_45_count',
'description_word_46_count',\n 'description_word_47_count',
'description_word_48_count',\n 'description_word_49_count',
'description_word_50_count'],\n dtype='object')] are in the
[columns]"
Also, I don't know where it finds the '\n' symbols in my column names.
At the same time, creating a new dataframe from the data works fine:
new_df = pd.DataFrame(data=data, columns=new_cols)
Does anyone know what is causing the error?
Suppose you have a df like this:
df = pd.DataFrame({'person': [1,1,1], 'event': ['A','B','C']})
You can add new columns like this:
import pandas as pd
import numpy as np
data = np.array([[0, 3, 0, 2, 0, 0, 1, 2, 2, 0, 1, 0, 0, 0, 0, 0, 0, 2, 1, 2, 0, 0,
0, 0, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 1, 0, 0, 7, 0, 0, 0, 0,
1, 1, 2, 0, 0, 2],
[0, 0, 0, 0, 0, 3, 0, 1, 6, 1, 1, 0, 0, 0, 0, 2, 0, 0, 1, 0, 1, 0,
3, 0, 0, 0, 0, 0, 0, 5, 2, 2, 2, 1, 0, 0, 1, 0, 1, 3, 2, 0, 0, 0,
0, 0, 2, 0, 0, 0],
[1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0]])
new_cols = [f'description_word_{i+1}_count' for i in range(50)]
df[new_cols] = pd.DataFrame(data, index=df.index)
I think the problem is that you are using syntax that creates a single series, when you actually need to create several series at once; in other words, a dataframe. Wrapping the array in a DataFrame with a matching index lets pandas create all the new columns in one assignment. (The '\n' symbols are not in your column names; they are just line wrapping in the printed repr of the Index.)
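An equivalent construction (a sketch using the same df, data and new_cols as above) is to build the new columns as their own dataframe and concatenate:
import pandas as pd
# align on df's index so the rows match up, then glue the frames together column-wise
df = pd.concat([df, pd.DataFrame(data, columns=new_cols, index=df.index)], axis=1)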

Can't use deployed TF BERT model to get GCloud online predictions from SavedModel: "Bad Request" error

I trained a BERT model based on this notebook.
I export it as a tf SavedModel this way:
def serving_input_fn():
    receiver_tensors = {
        "input_ids": tf.placeholder(dtype=tf.int32, shape=[1, MAX_SEQ_LENGTH])
    }
    features = {
        "input_ids": receiver_tensors['input_ids'],
        "input_mask": 1 - tf.cast(tf.equal(receiver_tensors['input_ids'], 0), dtype=tf.int32),
        "segment_ids": tf.zeros(dtype=tf.int32, shape=[1, MAX_SEQ_LENGTH]),
        "label_ids": tf.placeholder(tf.int32, [None], name='label_ids')
    }
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)
estimator._export_to_tpu = False
estimator.export_saved_model("export", serving_input_fn)
Then if I try to use the saved model locally it works:
from tensorflow.contrib import predictor
predict_fn = predictor.from_saved_model("export/1575241274/")
print(predict_fn({
"input_ids": [[101, 10468, 99304, 11496, 171, 112, 10176, 22873, 119, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
}))
# {'probabilities': array([[-0.01023898, -4.5866656 ]], dtype=float32), 'labels': 0}
Then I uploaded the SavedModel to a bucket and created a model and a model version on gcloud this way:
gcloud alpha ai-platform versions create v1gpu --model [...] --origin=[...] --python-version=3.5 --runtime-version=1.14 --accelerator=^:^count=1:type=nvidia-tesla-k80 --machine-type n1-highcpu-4
No issue there, the model is deployed and displayed as working in the console.
But if I try to get predictions, as such:
import googleapiclient.discovery
service = googleapiclient.discovery.build('ml', 'v1')
name = 'projects/[project_name]/models/[model_name]/versions/v1gpu'
response = service.projects().predict(
    name=name,
    body={'instances': [{
"input_ids": [[101, 10468, 99304, 11496, 171, 112, 10176, 22873, 119, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
    }]}
).execute()
print(response["predictions"])
All I get is the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/dist-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
return wrapped(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/googleapiclient/http.py", line 851, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://ml.googleapis.com/v1/projects/[project_name]/models/[model_name]/versions/v1gpu:predict?alt=json returned "Bad Request">
I get the same error if I test the model from the gcloud console using the "Test your model with sample input data" feature.
Edit:
The saved_model has a tagset "serve" and signature_def "serving_default".
Output of "saved_model_cli show --dir 1575241274/ --tag_set serve --signature_def serving_default":
The given SavedModel SignatureDef contains the following input(s):
  inputs['input_ids'] tensor_info:
      dtype: DT_INT32
      shape: (1, 128)
      name: Placeholder:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['labels'] tensor_info:
      dtype: DT_INT32
      shape: ()
      name: loss/Squeeze:0
  outputs['probabilities'] tensor_info:
      dtype: DT_FLOAT
      shape: (1, 2)
      name: loss/LogSoftmax:0
Method name is: tensorflow/serving/predict
The body of the request sent to the API has the form:
{"instances": [<instance 1>, <instance 2>, ...]}
As specified in the documentation, we need something like this:
{
  "instances": [
    <object>
    ...
  ]
}
In this case you have:
{
  "instances": [
    {
      "input_ids": [ <object> ]
    }
    ...
  ]
}
You need to replace input_ids with instances:
{
"instances":
[
[101, 10468, 99304, 11496, 171, 112, 10176, 22873, 119, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
]
}
Note: it would help to see your saved_model_cli output.
The gcloud local predict command is also a good option for testing.
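For example (a sketch; instances.json is a hypothetical file containing one JSON instance per line, and the model dir is the export path from the question):
gcloud ai-platform local predict --model-dir=export/1575241274/ --json-instances=instances.json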
It depends on the signature of the model. In my case I have the following signature (keeping just the input part):
The given SavedModel SignatureDef contains the following input(s):
  inputs['attention_mask'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 128)
      name: serving_default_attention_mask:0
  inputs['input_ids'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 128)
      name: serving_default_input_ids:0
  inputs['token_type_ids'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 128)
      name: serving_default_token_type_ids:0
and I need to pass data in the following format (in this case 2 examples):
{'instances':
[
{'input_ids': [101, 143, 18267, 15470, 90395, ...],
'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, .....],
'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, .....]
},
{'input_ids': [101, 17664, 143, 30728, .........],
'attention_mask': [1, 1, 1, 1, 1, 1, 1, .......],
'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, ....]
}
]
}
I am using it with a Keras model and TensorFlow 2.2.0.
I guess in your case you need (for 2 examples):
{'instances':
[
{'input_ids': [101, 143, 18267, 15470, 90395, ...]},
{'input_ids': [101, 17664, 143, 30728, .........]}
]
}
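Rendering that guess with the question's client code (a sketch; it assumes the single-input signature from the question's edit and pads the example sequence to the model's fixed length of 128):
# example input_ids from the question, padded with zeros to MAX_SEQ_LENGTH = 128
input_ids = [101, 10468, 99304, 11496, 171, 112, 10176, 22873, 119, 102] + [0] * 118
response = service.projects().predict(
    name=name,
    body={'instances': [{'input_ids': input_ids}]}
).execute()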