Setting maxJsonLength property for USQL Newtonsoft JSON extractor

Setting maxJsonLength property for USQL Newtonsoft JSON extractor - azure-data-lake

I'm using USQL for Azure data lake project and my input is in JSON format and I'm extracting each line and converting it back to JSON tuple. But the issue is some string lengths are bigger 102400 and Newtonsoft JSON extractor defaults to 102400 maximum length and this is causing failure on these records. Is it possible to change maxJsonLength property to bigger value to handle these large inputs? I found a property MaximumLength in Newtonsoft.Json.XML file inside assemblies, but it is also not working.
Any suggestion is highly appreciated.

Please note that U-SQL currently has a string data type size limit of 128kB (in UTF-8 encoded data). So even if you increase your Newtonsoft properties, you may run into that limit. Can you provide more information on what you are exactly doing and what the exact error message is to provide you more concrete answers?

Related

Return binary data via RFC

I want to return binary data in ABAP, for example a PNG image file.
Which data type should I use? string, xstring, ...?
I use the PyRFC SDK: https://github.com/SAP/PyRFC

xstring
Sidenotes if you have large data:
max size of an xstring is 2GB (depending also on profile parameter ztta/max_memreq_MB)
If you use an internal table of xstrings (e.g. dictionary type XSTRINGS_TABLE), dynamic memory allocation is easier because it will not be requested in one go, as is the case for a flat xstring

How to load CanvasSvgDocument from file?

In a C++/winrt project I have a large number of small svg resources to be loaded from file. Since it would be slow to reload them all from disk at each CreateResources event from the CanvasVirtualControl I have loaded them in advance and stored the data for each in an array. When CreateResources happens my intent is to load a CanvasSvgDocument for each of these by using the CanvasSvgDocument method LoadFromXml(System.string). However, If I create an svgDocument using the resourcecreator, I get an invalid argument crash when calling LoadFromXml(). The resourceCreator argument looks right (VS preview 6 now allows me to see local variables!) and the xml data string argument looks like the valid svg data, so my best guess about the crash is that the data string is the wrong format. The file data is UTF-8. If I convert that to a std::wstring as I must for the LoadFromXml argument can it still be understood as byte data?
For example, I create the std::wstring this way, given a pointer to unsigned char file data and its length in bytes:
m_data_string = std::wstring(data, data + dataLength);
When CreateResources is triggered that datastring is referenced this way:
m_svg = CanvasSvgDocument(resourceCreator);
m_svg.LoadFromXml(resourceCreator, m_data_string);
But LoadFromXml crashes with that invalid parameter error. I see that the length of the data string is correct, but of course that is the number of characters, not the actual size of the data. Could there be a conflict between the UTF-8 attribute in the svg and the fact that it is now recorded as 16-bit characters? If so, how would one load an xml document from such data?
[Update] with the suggestion that I use winrt::to_hstring. I read the unsigned char data into a std::string,
std::string cstring = std::string("");
cstring.assign(data, data + dataLength);
Then I convert that:
m_data_string = winrt::to_hstring(cstring);
And finally try to load an svg as before:
m_svg.LoadFromXml(resourceCreator, m_data_string);
And it crashes as before. I notice that in the debugger that converted string in neither case appeared to be gibberish - in both cases it read in the debugger as the expected svg data. But if this hstring is wide chars wouldn't that be a conflict with the attribute in the svg that identifies it as UTF-8?
[Update] I'm starting to wonder if anyone has ever used CanvasSvgDocument.Draw() to draw an svg loaded from a file. The files are now loading without crashing without any change to their internal encoding reference. But - they won't draw. These files - 239 of them - are UTF-8, svg 1.1, and they display nicely if opened in Edge or any browser. But if I load the file data to an hstring, create a CanvasSvgDocument and then use CanvasSvgDocument.LoadFromXml to load them, they do not draw when called by CanvasSvgDocument's draw method. Other drawing of shapes, etc. works fine during the drawing session. Here is what could be a hint: If I call GetXML() on one of these svgs after it is loaded, what is returned is just this:
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"></svg>
That is, the drawing information is not there. Or is this the full extent of what GetXml() is meant to return? That wouldn't seem useful. So perhaps CanvasSvgDocument.LoadFromXml(ResourceCreator, String) doesn't actually work yet?
So I'm back to asking again: is there a way to load a functional CanvasSvgDocument from file data?

My first answer here was wrong: the fault in my code above is that LoadFromXml() is a static method, and someone pointed out to me elsewhere, I was discarding the returned result. It should be theSvg = CanvasSvgDocument::LoadFromXml(string).
Having corrected that, I'm back to the problem of loading UTF-8 data in a method whose argument is a wide-character string. Changing the internal reference to UTF-16 doesn't help after all. Loading the svg with CanvasSvgDocument::LoadAsync(filestream) works, but if I want to load these without re-accessing the disk I will need to find a way to make RandomAccessFileStream from a buffer of bytes and then use LoadAsync. I think. Unless there is some other way to make LoadFromXmL() work - at present it fails with an invalid argument error.

VTK: The data array in the element may be too short

I'm trying to visualize some data in vtr format. For this purposes I've created a couple npy files by this library, then I've converted this files by PyEVTK into the vtr format (like in the lowlevel.py example). But when I'm trying to visualize this data by ParaView, an error appears:
ERROR: In /var/tmp/portage/sci-visualization/paraview-4.0.1-r1/work/ParaView-v4.0.1-source/VTK/IO/XML/vtkXMLDataReader.cxx, line 510
vtkXMLRectilinearGridReader (0x36bb080): Cannot read point data array "Pressure" from PointData in piece 0. The data array in the element may be too short.
Can anybody explain, what exactly means this error message, and what's wrong with the my visualization data?
Solved:
I made a stupid mistake - data size in header was different from the actual data size, and this was the cause of error.

This error may be coming from the XML header declaration, that may not contain all data needed. You may miss the header_type that contains the size of each info written between each set of data.
<VTKFile type="UnstructuredGrid" version="0.1" byte_order="BigEndian" header_type="UInt64">

How to write to a file in Go

I have seen How to read/write from/to file using golang? and http://golang.org/pkg/os/#File.Write but could not get answer.
Is there a way, I can directly write an array of float/int to a file. Or do I have to change it to byte/string to write it. Thanks.

You can use the functions in the encoding/binary package for this purpose.
As far as writing an entire array at once goes, there are no functions for this. You will have to iterate the array and write each element individually. Ideally, you should prefix these elements with a single integer, denoting the length of the array.
If you want a higher level solution, you can try the encoding/gob package:
Package gob manages streams of gobs - binary values exchanged between an Encoder (transmitter) and a Decoder (receiver). A typical use is transporting arguments and results of remote procedure calls (RPCs) such as those provided by package "rpc".

00626 SQL Loader error

How to avoid
"characterset conversion buffer overflow" error in sql*loader? error # 00626.
I am not able to find this on internet please suggest me the solution for this.

What is the character set of the input datafile? You might try specifying the character set in the control file:
CHARACTERSET char_set_name LENGTH SEMANTICS CHARACTER
By default, if not specified, Oracle will use byte length semantics. Thus, if you define a field length in your control file as VARCHAR(20), in byte semantics you'd have 20 byte buffer, but in character length semantics you might have a 40 byte buffer. This would be my guess as to what could be the source of the error.

It's not a lot of help, but here's what the Oracle error manual has to say about that error:
SQL*Loader-00626: Character set
conversion buffer overflow.
Cause: A conversion from the datafile character set to the client
character set required more space than
that allocated for the conversion
buffer. The size of the conversion
buffer is limited by the maximum size
of a varchar2 column.
Action: The input record is rejected. The data will not fit into
the column.
It sounds like there isn't any way to work around this within SQLLoader. If it is affecting a small number of records then it may be easiest to simply handle those manually. If it is many records, then you probably need to find or create a different loading tool.

Just a few ideas for you to think about:
You could try to load different parts of the "string" into different fields in the database .. maybe that way you can work around the limitation.
You could try to do the character set conversion in a different tool .. some text editors may give you some options .. and then load the file without it requiring the conversion.
Not sure if there's any merit in these ideas, but hopefully you can work something out.

Thanks for all your help. This problem has been resolved. We split the file and loaded in chunks and it worked fine

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas