How to fix UnicodeDecodeError for bytes from requests? - selenium

I have the following full working example code using selenium-wire to record all requests made.
import os
import sys
import json
from seleniumwire import webdriver
driver = webdriver.Chrome()
driver.get("http://www.google.com")
list_requests = []
for request in driver.requests:
    req = {
        "method": request.method,
        "url": request.url,
        "body": request.body.decode(), # to avoid json error
        "headers": {k:str(v) for k,v in request.headers.__dict__.items()} # to avoid json error
    }
    if request.response:
        resp = {
            "status_code": request.response.status_code,
            "reason": request.response.reason,
            "body": request.response.body.decode(), # ???
            "headers": {k:str(v) for k,v in request.response.headers.__dict__.items()} # to avoid json error
        }
        req["response"] = resp
    list_requests.append(req)
with open(f"test.json", "w") as outfile:
    json.dump(list_requests, outfile)
However, decoding the response body raises an error:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte
and if I skip decoding the response body, I get a different error:
TypeError: Object of type bytes is not JSON serializable
I do not care about the encoding; I just want to write the 'body' to the JSON file in some way. If needed, the offending byte/character can be removed.
Any ideas how to solve this?
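A minimal sketch of two ways to get an arbitrary bytes body into JSON, assuming (as stated above) that losing undecodable bytes is acceptable. The helper names body_as_text and body_as_base64 are hypothetical and not part of selenium-wire; the sample bytes are made up.

```python
import base64
import json


def body_as_text(body: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for undecodable bytes (lossy)."""
    return body.decode("utf-8", errors="replace")


def body_as_base64(body: bytes) -> str:
    """Encode as an ASCII base64 string (lossless and reversible)."""
    return base64.b64encode(body).decode("ascii")


# Made-up sample: 0xb5 is not a valid UTF-8 start byte.
raw = b"\x1f\xb5 not valid utf-8"
record = {"text": body_as_text(raw), "b64": body_as_base64(raw)}
serialized = json.dumps(record)  # both values are plain str, so this succeeds
```

The lossy variant keeps the file readable; the base64 variant is the one to use if the exact bytes might be needed later.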

I used the following approach to extract a field (some_key) from a JSON response:
from gzip import decompress
import json
some_key = None
for request in driver.requests:
    if request.response:
        if request.method == 'POST':
            print(request.method + ' ' + request.url)
            try:
                # try to parse the json response to extract the data
                data = json.loads(request.response.body)
                print('parsed as json')
                if 'some_key' in data:
                    some_key = data['some_key']
            except UnicodeDecodeError:
                try:
                    # decompress on UnicodeDecodeError and parse the json response to extract the data
                    data = json.loads(decompress(request.response.body))
                    print('decompressed and parsed as json')
                    if 'some_key' in data:
                        some_key = data['some_key']
                except json.decoder.JSONDecodeError:
                    data = request.response.body
                    print('decompressed and not parsed')
            print(data)
print(some_key)
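gzip.decompress resolved the UnicodeDecodeError for me: in my case the response body was gzip-compressed, so it had to be decompressed before it could be parsed.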
Hope this will be helpful.
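As a variation on the approach above, here is a small sketch that checks for the gzip magic bytes (0x1f 0x8b) before parsing instead of waiting for a UnicodeDecodeError. The function name decode_json_body is hypothetical, and the sketch only handles gzip; bodies compressed with other encodings (for example br or deflate) would need their own handling.

```python
import gzip
import json


def decode_json_body(body: bytes):
    """Parse a JSON body, gunzipping first if it starts with the gzip magic bytes."""
    if body[:2] == b"\x1f\x8b":  # gzip magic number
        body = gzip.decompress(body)
    return json.loads(body)


# Made-up sample payloads: one gzip-compressed, one plain.
compressed = gzip.compress(json.dumps({"some_key": 42}).encode("utf-8"))
plain = b'{"some_key": 42}'
```

With selenium-wire this would be called as decode_json_body(request.response.body).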

Related

Sending a SOAP request with a file attachment [duplicate]


Tensorflow serving: Unable to base64 decode

I used the slim package's resnet_v2_152 to train a classification model.
It was then exported to a .pb file to provide a service.
Because the input is an image, it is encoded with web-safe base64. The relevant graph code looks like:
serialized_tf_example = tf.placeholder(dtype=tf.string, name='tf_example')
decoded = tf.decode_base64(serialized_tf_example)
I then encode an image with base64 such that:
img_path = '/Users/wuyanxue/Desktop/not_emoji1.jpeg'
img_b64 = base64.b64encode(open(img_path, 'rb').read())
s = str(img_b64, encoding='utf-8')
s = s.replace('+', '-').replace(r'/', '_')
My POST data is structured as follows:
post_data = {
    'signature_name': 'predict',
    'instances': [{
        'inputs':
            { 'b64': s }
    }]
}
Finally, I post an HTTP request to this server:
res = requests.post('server_address', json=post_data)
It gives me:
'{ "error": "Failed to process element: 0 key: inputs of \\\'instances\\\' list. Error: Invalid argument: Unable to base64 decode" }'
Why does this error occur, and how can it be fixed?
I had the same issue when using Python 3. I solved it by making the format string a bytes literal (note the b prefix) instead of the default str, because base64.b64encode returns bytes in Python 3:
b'{"instances" : [{"b64": "%s"}]}' % base64.b64encode(
    dl_request.content)
Hope that helps, please see this answer for extra info.
This question is already solved.
post_data = {
    'signature_name': 'predict',
    'instances': [{
        'inputs':
            { 'b64': s }
    }]
}
The inputs value is wrapped in a 'b64' key, which tells TensorFlow Serving to base64-decode s itself.
This decoding is handled internally by TensorFlow Serving.
So the placeholder:
serialized_tf_example = tf.placeholder(dtype=tf.string, name='tf_example')
directly receives the raw binary input data, NOT the base64 string.
So, finally,
decoded = tf.decode_base64(serialized_tf_example)
is NOT necessary.
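For illustration, a minimal sketch of building such a request body in Python 3. It assumes the server expects the standard base64 alphabet under the 'b64' key (whether the web-safe variant is also accepted is not established here). The helper name build_predict_payload and the sample bytes are hypothetical; 'predict' and 'inputs' are the names used in the question. No request is sent.

```python
import base64
import json


def build_predict_payload(image_bytes: bytes) -> dict:
    """Wrap raw image bytes in the {'b64': ...} form shown above."""
    # b64encode returns bytes in Python 3; JSON needs a str.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "signature_name": "predict",
        "instances": [{"inputs": {"b64": encoded}}],
    }


# Made-up bytes standing in for the contents of an image file.
payload = build_predict_payload(b"\xff\xd8\xff fake jpeg bytes")
body = json.dumps(payload)
```

The resulting dict can be passed as json=payload to requests.post, as in the question.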

How to send a byte array as a part of Json with karate framework

I have an endpoint that consumes JSON with two attributes, like
{id='12344', data=byte_array}
so I wrote a test:
Feature: submitted request
  Scenario: submitted request
    * def convertToBytes =
    """
    function(arg) {
      var StreamUtils = Java.type('my.utils.StreamUtils');
      // it reads the stream and converts it to a byte array
      return StreamUtils.getBytes(arg);
    }
    """
    Given url 'http://my-server/post'
    And def image = convertToBytes(read('classpath:images/image_1.jpg'));
    And request {id:1, data: "#(image)"}
    When method POST
    Then status 200
However, I got an exception from Karate without much detail:
ERROR com.intuit.karate - http request failed: [B cannot be cast to [Ljava.lang.Object;
Any hints on how to submit byte arrays as part of JSON with Karate?
I don't think you can do that. Either the whole request should be binary (byte-array) or you do a multi-part request, where binary is Base64 encoded. As far as I know you can't put binary inside JSON. There is something called Binary JSON though.
EDIT: after assuming that the byte[] has to be Base64 encoded:
Background:
  * url demoBaseUrl
  * def Base64 = Java.type('java.util.Base64')
Scenario: json with byte-array
  Given path 'echo', 'binary'
  And def encoded = Base64.encoder.encodeToString('hello'.bytes);
  And request { message: 'hello', data: '#(encoded)' }
  When method post
  Then status 200
  And def expected = Base64.encoder.encodeToString('world'.bytes);
  And match response == { message: 'world', data: '#(expected)' }
I just added this test to the Karate demos, and it is working fine. Here is the commit.

Aerospike: zlib/bz2 store and retrieve didn't work

I am compressing a string using zlib, then storing it in an Aerospike bin. On retrieving and decompressing it, I get "zlib.error: Error -5 while decompressing data: incomplete or truncated stream".
When I compared the original compressed data with the retrieved compressed data, something was missing at the end of the retrieved data.
I am using Aerospike 3.7.3 & python client 2.0.1
Please help
Thanks
Update: I tried using bz2. It throws ValueError: couldn't find end of stream on retrieve and decompress. It looks like Aerospike is stripping off the last byte, or something else, from the blob.
Update: Posting the code:
import aerospike
import bz2
config = {
    'hosts': [
        ( '127.0.0.1', 3000 )
    ],
    'policies': {
        'timeout': 1000 # milliseconds
    }
}
client = aerospike.client(config)
client.connect()
content = "An Aerospike Query"
content_bz2 = bz2.compress(content)
key = ('benchmark', 'myset', 55)
#client.put(key, {'bin0':content_bz2})
(key, meta, bins) = client.get(key)
print bz2.decompress(bins['bin0'])
I get the following error:
Traceback (most recent call last):
  File "asread.py", line 22, in <module>
    print bz2.decompress(bins['bin0'])
ValueError: couldn't find end of stream
The bz2.compress method returns a string (a Python 2 str), and the client sees that type and tries to convert it to the server's as_str type. If it runs into a \0 in an unexpected position it will truncate the string, causing your error.
Instead, make sure to cast binary data to a bytearray, which the client converts to the server's as_bytes type. On the read operation, bz2.decompress will work with the bytearray data and give you back the original string.
from __future__ import print_function
import aerospike
import bz2
config = {'hosts': [( '33.33.33.91', 3000 )]}
client = aerospike.client(config)
client.connect()
content = "An Aerospike Query"
content_bz2 = bytearray(bz2.compress(content))
key = ('test', 'bytesss', 1)
client.put(key, {'bin0':content_bz2})
(key, meta, bins) = client.get(key)
print(type(bins['bin0']))
bin0 = bz2.decompress(bins['bin0'])
print(type(bin0))
print(bin0)
Gives back
<type 'bytearray'>
<type 'str'>
An Aerospike Query
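The failure mode can be reproduced without a server. The sketch below is Python 3 and does not use the Aerospike client at all: it simulates a stored blob losing its tail by slicing off the last byte, and shows that bz2 then refuses to decompress it. The helper name can_decompress is hypothetical.

```python
import bz2


def can_decompress(blob) -> bool:
    """Return True if blob is a complete bz2 stream."""
    try:
        bz2.decompress(bytes(blob))
        return True
    except (ValueError, OSError):
        return False


original = b"An Aerospike Query"
stored = bytearray(bz2.compress(original))  # bytearray, as recommended above
truncated = stored[:-1]                     # simulate a blob that lost its last byte
```

Here can_decompress(stored) succeeds while can_decompress(truncated) does not, which matches the symptom described in the question.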

Sending form data with an HTTP PUT request using Grinder API

I'm trying to replicate the following successful cURL operation with Grinder.
curl -X PUT -d "title=Here%27s+the+title&content=Here%27s+the+content&signature=myusername%3A3ad1117dab0ade17bdbd47cc8efd5b08" http://www.mysite.com/api
Here's my script:
from net.grinder.script import Test
from net.grinder.script.Grinder import grinder
from net.grinder.plugin.http import HTTPRequest
from HTTPClient import NVPair
import hashlib
test1 = Test(1, "Request resource")
request1 = HTTPRequest(url="http://www.mysite.com/api")
test1.record(request1)
log = grinder.logger.info
test1.record(log)
m = hashlib.md5()
class TestRunner:
    def __call__(self):
        params = [NVPair("title","Here's the title"),NVPair("content", "Here's the content")]
        params.sort(key=lambda param: param.getName())
        ps = ""
        for param in params:
            ps = ps + param.getValue() + ":"
        ps = ps + "myapikey"
        m.update(ps)
        params.append(NVPair("signature", ("myusername:" + m.hexdigest())))
        request1.setFormData(tuple(params))
        result = request1.PUT()
The test runs okay, but it seems that my script doesn't actually send any of the params data to the API, and I can't work out why. There are no errors generated, but I get a 401 Unauthorized response from the API, indicating that a successful PUT request reached it, but obviously without a signature the request was rejected.
This isn't exactly an answer, more of a workaround that I came up with, that I've decided to post since this question hasn't yet received any responses, and it may help anyone else trying to achieve the same thing.
The workaround is basically to use the httplib and urllib modules to build and make the PUT request instead of the HTTPClient module.
import hashlib
import httplib, urllib
....
params = [("title", "Here's the title"),("content", "Here's the content")]
params.sort(key=lambda param: param[0])
ps = ""
for param in params:
    ps = ps + param[1] + ":"
ps = ps + "myapikey"
m = hashlib.md5()
m.update(ps)
params.append(("signature", "myusername:" + m.hexdigest()))
params = urllib.urlencode(params)
print params
headers = {"Content-type": "application/x-www-form-urlencoded"}
conn = httplib.HTTPConnection("www.mysite.com:80")
conn.request("PUT", "/api", params, headers)
response = conn.getresponse()
print response.status, response.reason
print response.read()
conn.close()
(Based on the example at the bottom of this documentation page.)
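For readers on Python 3, the signing and encoding step of the workaround can be sketched as below. This is an adaptation, not the original code: the function name build_signed_form is hypothetical, and encoding the string as UTF-8 before hashing is an assumption (Python 3's hashlib requires bytes). The parameter values are the ones from the question.

```python
import hashlib
from urllib.parse import urlencode


def build_signed_form(params, api_key, username):
    """Sort params by name, sign their values with MD5, and URL-encode the result."""
    ordered = sorted(params, key=lambda pair: pair[0])
    # Same scheme as above: each value followed by ":", then the API key.
    to_sign = "".join(value + ":" for _, value in ordered) + api_key
    digest = hashlib.md5(to_sign.encode("utf-8")).hexdigest()
    ordered.append(("signature", username + ":" + digest))
    return urlencode(ordered)


form_body = build_signed_form(
    [("title", "Here's the title"), ("content", "Here's the content")],
    api_key="myapikey",
    username="myusername",
)
```

The returned string is what would be sent as the PUT body with the application/x-www-form-urlencoded content type.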
Refer to the multi-part form posting example in the Grinder script gallery, but change the POST to a PUT. It works for me.
# Imports as used by the script gallery example.
from HTTPClient import Codecs, NVPair
from jarray import zeros
files = ( NVPair("self", "form.py"), )
parameters = ( NVPair("run number", str(grinder.runNumber)), )
# This is the Jython way of creating an NVPair[] Java array
# with one element.
headers = zeros(1, NVPair)
# Create a multi-part form encoded byte array.
data = Codecs.mpFormDataEncode(parameters, files, headers)
grinder.logger.output("Content type set to %s" % headers[0].value)
# Call the version of PUT that takes a byte array.
result = request1.PUT("/upload", data, headers)