ClickHouse Protobuf output format - HTTP headers

I run a ClickHouse server in Docker with just one table and several rows in it. I can request all the data in the default format with clickhouse-client (over TCP) or with a GUI tool like DBeaver (over HTTP).
SELECT * FROM some_table;
I can also change the format to something special:
SELECT * FROM some_table FORMAT Pretty;
I want to request data from ClickHouse in Protobuf format. The query looks like this:
SELECT * FROM some_table FORMAT Protobuf SETTINGS format_schema = 'proto_file:ProtoStructure';
I have proto_file.proto in the same directory as clickhouse-client, so I can make the request over TCP with it (successfully).
But I don't know the data structure of the TCP request, so I can't reproduce it in my own program. So I tried to execute the same request over HTTP (through DBeaver) to intercept and reproduce it. Unfortunately, I can't execute the script in DBeaver properly, because it complains about proto_file.proto (File not found; I don't know where to place the file to make it work). The only thing I know is that the format is specified by the X-ClickHouse-Format HTTP header, but I can't find any info about where in the HTTP request I should place the content of the proto file.
So, the main question: Is there any examples of pure HTTP request to clickhouse for protobuf data output format?

SETTINGS format_schema = 'proto_file:ProtoStructure' -- this is a feature of the clickhouse-client application; it's only possible with clickhouse-client.
clickhouse-client is the rich client. It queries data from clickhouse-server using the TCP/native protocol and forms the Protobuf output by itself using the schema file.
clickhouse-server is also able to form Protobuf using .proto files (over the HTTP and GRPC protocols). But in this case the .proto files should be placed on the clickhouse-server node in the /var/lib/clickhouse/format_schemas/ folder.
https://clickhouse.com/docs/en/operations/server-configuration-parameters/settings/#server_configuration_parameters-format_schema_path
For example, I created a .proto file:
cat /var/lib/clickhouse/format_schemas/test.proto
syntax = "proto3";
message TestMessage {
    int64 id = 1;
    uint32 blockNo = 2;
    string val1 = 3;
    float val2 = 4;
    uint32 val3 = 5;
};
and made it available to the server: chown clickhouse.clickhouse test.proto
Now I can do this:
curl -o out.protobuf 'localhost:8123/?format_schema=test:TestMessage&query=select+1+id+from+numbers(10)+format+Protobuf'
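To answer the main question (a pure HTTP request rather than curl): the same call can be made from any HTTP client. Below is a minimal sketch in Python using the requests library, assuming the server and the test.proto schema from the example above:

import requests

# Minimal sketch of the same request as the curl example above, made from Python.
# Assumes a ClickHouse server on localhost:8123 and test.proto already placed in
# /var/lib/clickhouse/format_schemas/ on the server node.
resp = requests.get(
    "http://localhost:8123/",
    params={
        "query": "SELECT 1 AS id FROM numbers(10) FORMAT Protobuf",
        "format_schema": "test:TestMessage",
    },
)
resp.raise_for_status()

with open("out.protobuf", "wb") as f:
    f.write(resp.content)  # raw Protobuf payload produced by the server

The format is specified inside the query text (FORMAT Protobuf) and the schema via the format_schema URL parameter, exactly as in the curl example; no proto file content is sent in the HTTP request itself, since the server resolves test:TestMessage against its format_schemas folder.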


How can I configure a specific serialization method to use only for Celery ping?

I have a celery app which has to be pinged by another app. This other app uses json to serialize celery task parameters, but my app has a custom serialization protocol. When the other app tries to ping my app (app.control.ping), it throws the following error:
"Celery ping failed: Refusing to deserialize untrusted content of type application/x-stjson (application/x-stjson)"
My whole codebase relies on this custom encoding, so I was wondering if there is a way to configure a json serialization but only for this ping, and to continue using the custom encoding for the other tasks.
These are the relevant celery settings:
accept_content = [CUSTOM_CELERY_SERIALIZATION, "json"]
result_accept_content = [CUSTOM_CELERY_SERIALIZATION, "json"]
result_serializer = CUSTOM_CELERY_SERIALIZATION
task_serializer = CUSTOM_CELERY_SERIALIZATION
event_serializer = CUSTOM_CELERY_SERIALIZATION
Changing any of the last 3 to [CUSTOM_CELERY_SERIALIZATION, "json"] causes the app to crash, so that's not an option.
Specs: celery=5.1.2
python: 3.8
OS: Linux docker container
Any help would be much appreciated.
Changing any of the last 3 to [CUSTOM_CELERY_SERIALIZATION, "json"] causes the app to crash, so that's not an option.
That's because result_serializer, task_serializer, and event_serializer don't accept a list, just a single str value, unlike e.g. accept_content.
A list is possible for e.g. accept_content because, if there are 2 items, we can check whether the type of an incoming request is one of those 2 items. It isn't possible for e.g. result_serializer, because if there were 2 items, which one should be chosen for the result of task-A? (Hence the need for a single value.)
This means that if you set result_serializer = 'json', it will have a global effect: the results of all tasks (the returned values, which can be retrieved by calling e.g. response.get()) would be serialized/deserialized using the json serializer. Thus, it might work for the ping, but it might not for tasks whose results can't be directly serialized/deserialized to/from JSON and really need the custom stjson serializer.
Currently, with Celery==5.1.2, it seems that a task-specific result_serializer setting isn't possible, so we can't have a single task encoded as 'json' rather than 'stjson' without setting it globally for all tasks. I assume the same applies to ping.
Open request to add result_serializer option for tasks
A short discussion in another question
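To make the single-value, global nature of this setting concrete, here is a minimal sketch (a hypothetical configuration, not code from the question):

from celery import Celery

app = Celery("my_app")
# result_serializer takes one serializer name, not a list; whatever is set here
# applies to the results of every task registered on this app.
app.conf.result_serializer = "json"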
Not the best solution, but a workaround: instead of fixing it on your app's side, you may opt to just add support for serializing/deserializing content of type 'application/x-stjson' in the other app.
other_app/celery.py
import ast
from celery import Celery
from kombu.serialization import register

# This is just a possible implementation. Replace with the actual
# serializer/deserializer for stjson in your app.
def stjson_encoder(obj):
    return str(obj)

def stjson_decoder(obj):
    obj = ast.literal_eval(obj)
    return obj

register(
    'stjson',
    stjson_encoder,
    stjson_decoder,
    content_type='application/x-stjson',
    content_encoding='utf-8',
)

app = Celery('other_app')
app.conf.update(
    accept_content=['json', 'stjson'],
)
Your app still accepts and responds in the stjson format, but now the other app is configured to be able to parse that format.
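With that registration in place, the ping issued from the other app should be able to deserialize the reply. A small usage sketch (the timeout value here is arbitrary):

# Sketch: issuing the ping from other_app once 'stjson' is registered and accepted.
replies = app.control.ping(timeout=1.0)
print(replies)  # e.g. [{'worker1@example-host': {'ok': 'pong'}}]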

SQL compilation error while copying data from Snowflake to S3 using s3a:// and s3n://

I am trying to copy results from Snowflake to Amazon S3 using s3n:// and s3a:// URLs, but I am getting a SQL compilation error.
The SQL query is in this format:
COPY INTO '&s3_path/&curr_dt/pvc'
FROM (
SELECT OBJECT_CONSTRUCT('id',id,'keyword',keyword)
FROM brands_delta)
CREDENTIALS = (AWS_KEY_ID='&aws_key_id' AWS_SECRET_KEY='&aws_secret_key')
FILE_FORMAT = (TYPE=JSON)
SINGLE = false
OVERWRITE = true
MAX_FILE_SIZE = 1073741824;
The error in the log files is as follows:
001011 (42601): SQL compilation error:
invalid URL prefix found in: 's3a://abc/prod-runs/input/2021-01-19/pvc'
The URI protocol determines the code/software used by the client to access the resource given in the URI.
In this case, Snowflake is the client software, and it doesn't use the s3a/s3n protocols (those are Hadoop S3 connector schemes), so I'm not sure why you are trying to use them. For an Amazon S3 location, Snowflake expects the s3:// prefix, e.g. 's3://abc/prod-runs/input/2021-01-19/pvc'.
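For illustration, here is a hedged sketch of the same COPY using the s3:// prefix, run through the snowflake-connector-python package (the connection parameters and credentials are placeholders, not values from the question):

import snowflake.connector

# Same COPY INTO as in the question, but with the s3:// prefix instead of s3a://.
# All values in angle brackets are placeholders.
sql = """
COPY INTO 's3://abc/prod-runs/input/2021-01-19/pvc'
FROM (SELECT OBJECT_CONSTRUCT('id', id, 'keyword', keyword) FROM brands_delta)
CREDENTIALS = (AWS_KEY_ID='<aws_key_id>' AWS_SECRET_KEY='<aws_secret_key>')
FILE_FORMAT = (TYPE = JSON)
SINGLE = FALSE
OVERWRITE = TRUE
MAX_FILE_SIZE = 1073741824
"""

conn = snowflake.connector.connect(user="<user>", password="<password>", account="<account>")
try:
    conn.cursor().execute(sql)
finally:
    conn.close()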

Compilation of contract with include

I'm currently struggling to compile a contract (in aeternity's Sophia language) with an include of a custom library, "Library.aes", which resides in a separate file at the same level of the filesystem as the contract that uses it.
The library looks like this:
namespace Library =
  type number = int
  function inc(x : number) : number = x + 1
The contract uses it like this:
include "Library.aes"
When I compile the contract (locally, using a compiler node), I always get:
"Couldn't find include file 'Library.aes'\n"
I also tried passing the full path in the include; same result.
Is there a need to define the options.file_system attribute somehow?
Let's use the same example:
~/Quviq/Aeternity/aesophia_http [git:master]: FOO="include \\\"Bar.aes\\\"\\n\\ncontract Foo =\\n entrypoint foo() = Bar.bar()"
~/Quviq/Aeternity/aesophia_http [git:master]: BAR="namespace Bar =\\n function bar() = 42"
~/Quviq/Aeternity/aesophia_http [git:master]: curl -H "Content-Type: application/json" -d "{\"code\":\"$FOO\",\"options\":{\"backend\":\"fate\",\"file_system\":{\"Bar.aes\":\"$BAR\"}}}" -X POST http://localhost:3080/compile
{"bytecode":"cb_+IJGA6AANCB3UsSiP2HGHRML0dG95vNT9JsqZQMjPYAfEG1w6cC4Va3+RNZEHwA3ADcAGg6CPwEDP/5sbA2iAjcABwEDVP64/p9/ADcABwQDEWxsDaKjLwMRRNZEHxFpbml0EWxsDaIhLkJhci5iYXIRuP6ffw1mb2+CLwCFNC4yLjAAfreb3w=="}
Beware the quoting of strings, but apart from that it is rather straightforward.
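For reference, the same /compile request is easier to read from Python, where Bar.aes is passed under options.file_system without any shell escaping (a sketch assuming a compiler node on localhost:3080, as above):

import requests

# Sketch: same /compile call as the curl example, with Bar.aes supplied via
# options.file_system so the compiler can resolve the include.
bar = 'namespace Bar =\n  function bar() = 42'
foo = 'include "Bar.aes"\n\ncontract Foo =\n  entrypoint foo() = Bar.bar()'

resp = requests.post(
    "http://localhost:3080/compile",
    json={
        "code": foo,
        "options": {
            "backend": "fate",
            "file_system": {"Bar.aes": bar},
        },
    },
)
print(resp.json())  # expected to contain a "bytecode" field on success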
This post is pretty old, but I thought I'd respond anyway.
Did you try this using aeproject?
https://github.com/aeternity/aepp-aeproject-js
Try putting your contract code under that file structure and use the deploy script, so you can work it out locally first and then deploy to testnet or mainnet.
If you have Erlang installed you may use aesophia_cli instead of hosting a local node. Then it should search for include files in the same directory as your main .aes file.

Using Leigh's version of S3Wrapper.cfc - can't get past init()

I am new to S3 and need to use it for image storage. I found a half dozen versions of an S3 wrapper for CF, but it appears that the only one set up for v4 signatures is the one modified by Leigh:
https://gist.github.com/Leigh-/26993ed79c956c9309a9dfe40f1fce29
I dropped it in the com directory and created a "test" page that contains the following code:
s3 = createObject('component','com.S3Wrapper').init(application.s3.AccessKeyId,application.s3.SecretAccessKey);
but got the following error:
So I changed line 37 from
variables.Sv4Util = createObject('component', 'Sv4').init(arguments.S3AccessKey, arguments.S3SecretAccessKey);
to
variables.Sv4Util = createObject('component', 'Sv4Util').init(arguments.S3AccessKey, arguments.S3SecretAccessKey);
Now I am getting:
I feel like going through Leigh's code and starting to change things is a bad idea, since I have lurked here for years and know Leigh's code is solid.
Does anyone know if there are any examples of how to use this anywhere? If not, what am I doing wrong? If it makes a difference, I am using Lucee 5 and not Adobe's CF engine.
UPDATE :
I followed Leigh's directions and the error is now gone. I added some more code to my test page, which now looks like this:
<cfscript>
    s3 = createObject('component','com.S3v4').init(application.s3.AccessKeyId, application.s3.SecretAccessKey);
    bucket = "imgbkt.domain.com";
    obj = "fake.ping";
    region = "s3-us-west-1";
    test = s3.getObject(bucket, obj, region);
    writeDump(test);
    test2 = s3.getObjectLink(bucket, obj, region);
    writeDump(test2);
    writeDump(s3);
</cfscript>
Regardless of what I put in for bucket, obj, or region, I get:
Just in case, I did go to AWS and get new keys:
Leigh, if you are still around, or anyone who has used one of the S3 wrappers: any suggestions or guidance?
UPDATE #2:
Even after Alex's help I am not able to get this to work. The link I receive from getObjectLink is not valid, and getObject never downloads an object. I thought I would try the putObject method:
test3 = s3.putObject(bucketName=bucket,regionName=region,keyName="favicon.ico");
writeDump(test3);
to see if there was any additional information. I received this:
I did find this article https://shlomoswidler.com/2009/08/amazon-s3-gotcha-using-virtual-host.html but it is pretty old, and since S3 specifically suggests using dots in bucket names I don't think it is relevant any longer. There is obviously something I am doing wrong, but I have spent hours trying to resolve this and I can't seem to figure out what it might be.
I will give you a rundown of what the code does:
getObjectLink returns an HTTP URL for the file fake.ping that is found by looking in the bucket imgbkt.domain.com of region s3-us-west-1. This link is temporary and expires after 60 seconds by default.
getObject invokes getObjectLink and immediately requests the URL using HTTP GET. The response is then saved to the directory of S3v4.cfc with the filename fake.ping by default. Finally, the function returns the full path of the downloaded file: E:\wwwDevRoot\taa\fake.ping
To save the file in a different location, you would invoke:
downloadPath = 'E:\';
test = s3.getObject(bucket,obj,region,downloadPath);
writeDump(test);
The HTTP request is synchronous, meaning the file will have been downloaded completely by the time the function returns the file path.
If you want to access the actual content of the file, you can do this:
test = s3.getObject(bucket,obj,region);
contentAsString = fileRead(test); // returns the file content as string
// or
contentAsBinary = fileReadBinary(test); // returns the content as binary (byte array)
writeDump(contentAsString);
writeDump(contentAsBinary);
(You might want to stream the content if the file is large, since fileRead/fileReadBinary read the whole file into a buffer. Use fileOpen to stream the content.)
Does that help you?

JSON Schema for FHIR false positives

I am new to JSON Schema and am trying to validate JSON against the HL7-FHIR schemas. Data that I think should be invalid (and that the official Java-based validator says is invalid) shows up as valid.
For example, {"dog": "food"} should be invalid, because when I run the validator, I get:
> java -jar org.hl7.fhir.validator.jar bad.json -defn definitions.json.zip
.. load FHIR from definitions.json.zip
.. connect to tx server # http://tx.fhir.org/r3
(vnull-null)
.. validate
*FAILURE* validating bad.json: error:1 warn:0 info:0
Fatal # $ (line 1, col2) : Unable to find resourceType property
But if I paste the fhir.schema.json file from here into a JSON Schema validator like the one here, and evaluate {"dog": "food"}, it's valid.
It's valid even if I supply a resourceType, which I thought might cause the restrictions to kick in. It's also valid if I copy an example I expect to be valid—say, this Practitioner example—and change some of the types (set name to be a string rather than an array, for example).
I'm not sure if I'm running into a problem with the HL7-FHIR JSON Schema in particular or with JSON Schemas in general. I believe my question is different than this one because it appears that we're up to release 3.0, and so the schema I'm using is updated.
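For reference, the same check can be reproduced locally with the Python jsonschema package instead of an online validator. This is only a sketch; it assumes fhir.schema.json has been downloaded next to the script:

import json
import jsonschema

# Sketch: validate the suspect document against a locally downloaded copy of
# the FHIR JSON Schema (fhir.schema.json). The file name/path is an assumption.
with open("fhir.schema.json") as f:
    schema = json.load(f)

instance = {"dog": "food"}
try:
    jsonschema.validate(instance=instance, schema=schema)
    print("valid according to jsonschema")
except jsonschema.ValidationError as e:
    print("invalid:", e.message)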