Write null value to Parquet file - apache

I'm using Parquet CPP library to write data from MySQL database to a parquet file. I have two questions:
1) What does the REPETITION in schema mean? Is it related to table constraints when we define a column as NULL or NOT NULL?
2) How to insert NULL value into a column? Do I just pass a null pointer to the value parameter?
WriteBatch(int64_t num_levels, const int16_t* def_levels,
const int16_t* rep_levels,
const typename ParquetType::c_type* values)
Thanks in advance!

@Ivy.W I have been using Parquet CPP recently at work, and this is what I understood:
The Parquet schema needs to know about each column of the table that you are going to read from or write to. If the column is nullable, its repetition type is optional; if it is not nullable, its repetition type is required. The third repetition type, repeated, is used for nested structures such as lists and maps. Let me give a quick intro to definition and repetition levels:
The definition level tells Parquet whether a value is present or NULL and, for nested schemas, at which level of the path the NULL occurs. Together, the definition and repetition levels make it possible to reconstruct the original record structure.
A field can be optional, required or repeated. If the field is required, it can't be null, so no definition level is needed for it. If it is a top-level optional field, the definition level is 0 for null and 1 for non-null. If the schema is nested, additional values are used accordingly.
e.g
message ExampleDefinitionLevel {
optional group a {
optional group b {
optional string c;
}
}
}
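For the column a.b.c, the definition level is 0 when a is null, 1 when a is defined but b is null, 2 when a and b are defined but c is null, and 3 when c itself is defined.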
Repetition level:
Repetition level is only applicable to nested structures such as lists, maps etc.
For example, when a user can have multiple phone numbers, the field will be "repeated".
e.g
message list{
repeated string list;
}
The data would be like: ["a","b","c"] and would look like:
{
list:"a",
list:"b",
list:"c"
}
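To write a null, make sure the schema declares the column as nullable (optional) and pass 0 as the definition level for that row; WriteBatch takes care of the rest. The values array should then contain only the non-null values, and for a flat (non-nested) column rep_levels can be a null pointer.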
Please refer to https://blog.twitter.com/engineering/en_us/a/2013/dremel-made-simple-with-parquet.html
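As a rough illustration of the point about nulls, here is a minimal sketch of how the two arrays for a flat optional INT32 column could be prepared before calling WriteBatch. The names LeveledColumn and MakeLevels are hypothetical helpers, not part of the Parquet library, and the sketch assumes a top-level optional column whose maximum definition level is 1.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical container for the arrays of one flat OPTIONAL column.
struct LeveledColumn {
    std::vector<int16_t> def_levels;  // one entry per row: 0 = NULL, 1 = present
    std::vector<int32_t> values;      // non-null values only, in row order
};

// Hypothetical helper: split nullable rows into definition levels and
// a packed array holding only the values that are actually present.
LeveledColumn MakeLevels(const std::vector<std::optional<int32_t>>& rows) {
    LeveledColumn out;
    out.def_levels.reserve(rows.size());
    for (const auto& row : rows) {
        if (row.has_value()) {
            out.def_levels.push_back(1);
            out.values.push_back(*row);
        } else {
            out.def_levels.push_back(0);  // NULL: a level, but no value
        }
    }
    // There can never be more values than levels.
    assert(out.values.size() <= out.def_levels.size());
    return out;
}
```

Assuming writer is a parquet::Int32Writer* obtained from your row group writer, the call would then look roughly like writer->WriteBatch(col.def_levels.size(), col.def_levels.data(), nullptr, col.values.data()).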

Related

Error trying to screen an internal table abap

I'm learning ABAP and I keep trying to write an internal table and display it. There's a syntax error message at the line WRITE: / I_EJSEIS:
"I_EJSEIS" cannot be converted to a character-like value
I just don't understand it.
TYPES: S_EJSEIS LIKE SPFLI.
DATA: I_EJSEIS TYPE TABLE OF S_EJSEIS WITH HEADER LINE,
WA_EJSEIS TYPE S_EJSEIS.
SELECT FLTYPE
FROM SPFLI
INTO TABLE I_EJSEIS
WHERE CARRID = 'LH'.
LOOP AT I_EJSEIS.
WRITE: / I_EJSEIS.
ENDLOOP.
According to the ABAP documentation of WRITE dobj:
For dobj, those data types are allowed that are grouped under the generic type simple:
All flat data types; flat structures are handled like a data object of type c and can only contain any character-like components.
The data types string and xstring
enumerated types; the name (maximum three characters) of the enumerated constant is used in uppercase letters, which defines the current enumerated value.
In your case, I_EJSEIS is a (flat) structure containing at least one non-character component (e.g. fltime, distance, period), so it doesn't fall in any of the categories above.
The workaround is to display the fields individually:
WRITE: / I_EJSEIS-FLTYPE, I_EJSEIS-FLTIME.

What is an efficient way to parse a query result into a struct that has two fields, a string and an array of structs, using pkg sqlx?

I have written the following query and found myself trying to find an efficient way to parse the result of it:
SELECT modes.mode_name, confs.config_name, confs.field1, confs.field2, confs.field3
FROM modes
JOIN confs ON modes.config_name = confs.name
WHERE modes.mode_name = $1 ORDER BY confs.config_name ASC;
For each mode there are multiple corresponding configs: the table modes has two columns forming a primary key, mode_name and config_name.
Here are the structs I have to use:
type Mode struct {
Name string `db:"mode_name"`
Configs []Config
}
type Config struct {
Name string `db:"name" json:"-"`
Mode string `db:"mode_name" json:"mode_name,omitempty"`
Field1 float32 `db:"field1" json:"field1,omitempty"`
Field2 float32 `db:"field2" json:"field2,omitempty"`
Field3 float32 `db:"field3" json:"field3,omitempty"`
}
I expect to find a way to populate Mode struct with the data from the query above:
Name from mode_name
Then parse each corresponding config into Config struct and add them to []Configs into the Configs field
I have studied the docs for pkg sqlx, picked and tried several options that looked promising:
sqlx.QueryxContext alongside attempting to iterate over Rows with StructScan
sqlx.NamedExec - parsing directly into the struct (which fails yet again as mine has an embedded struct inside).
Both of them failed and I am beginning to think there might be no elegant way to solve the task in these circumstances with the aforementioned tools.

How to call BAPI_MATERIAL_SAVEDATA with custom fields from NCo?

In our current project we are using SAPNCO3 with RFC calls. The requirement is to create a material with the function "BAPI_MATERIAL_SAVEDATA" and some custom fields (via EXTENSIONIN). The problem now is how to extend the needed structures "BAPI_TE_MARA/X" so that they can carry the custom fields. I cannot find any function for this.
Please have a look at the Code snippet at the bottom.
Thank you!
Tobias
var BAPI_TE_MARA = repo.GetStructureMetadata("BAPI_TE_MARA");
IRfcStructure structure = BAPI_TE_MARA.CreateStructure();
structure.SetValue("MATERIAL", material.Number);
//structure.SetValue("ZMM_JOB_REFERENCE", "f");
BAPI_MATERIAL_SAVEDATA has two table parameters EXTENSIONIN and EXTENSIONINX to which you pass lines with the values of your custom fields.
These table parameters have to indicate what extension structures you want to use and their values.
As these custom fields may extend different tables of the material, you have to indicate different extension structures depending on which table these fields belong to:
For the table MARA, the extension structures are BAPI_TE_MARA and BAPI_TE_MARAX.
For the table MARC, the extension structures are BAPI_TE_MARC and BAPI_TE_MARCX.
As a rule of thumb, these extension structures should preferably have only character-like fields, to simplify the programming and to support IDocs.
For instance, if you have the custom fields ZZCNAME (7 characters) and ZZCTEXT (50 characters) in the table MARA, they will also be defined in BAPI_TE_MARA and have the same names and types. In BAPI_TE_MARAX, you also have two fields with the same names, but always of length 1 character and their values must be 'X' to indicate that a value is passed in BAPI_TE_MARA (useful in case a blank value is passed that must not be ignored). The X extension structures are essential especially in "change" BAPIs.
If you want to pass values to the BAPI, you must first initialize these structures:
BAPI_TE_MARA:
MATERIAL     ZZCNAME ZZCTEXT
------------ ------- -------
000000012661 NAME    TEXT
BAPI_TE_MARAX:
MATERIAL     ZZCNAME ZZCTEXT
------------ ------- -------
000000012661 X       X
Then, you must initialize the two parameters of the BAPI:
EXTENSIONIN (notice that there are 3 spaces between NAME and TEXT because the technical length of ZZCNAME is 7 characters and its value "NAME" occupies only 4 characters):
STRUCTURE    VALUEPART1 (240 Char)   VALUEPART2 (240) VALUEPART3 (240) VALUEPART4 (240)
------------ ----------------------- ---------------- ---------------- ----------------
BAPI_TE_MARA 000000012661NAME   TEXT
EXTENSIONINX:
STRUCTURE     VALUEPART1 (240 Char) VALUEPART2 (240) VALUEPART3 (240) VALUEPART4 (240)
------------- --------------------- ---------------- ---------------- ----------------
BAPI_TE_MARAX 000000012661XX
Consequently, your program must:
concatenate all BAPI_TE_MARA fields together and copy the resulting string into fields VALUEPART1 to VALUEPART4 of EXTENSIONIN as if it was a 960 characters field
concatenate all BAPI_TE_MARAX fields together and copy the resulting string into fields VALUEPART1 to VALUEPART4 of EXTENSIONINX
I guess you may use ToString() to get one concatenated string of characters of all fields of a structure, and to set the value of VALUEPART1, VALUEPART2, etc., you'll probably need to initialize them individually from the string of characters with Substring.
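The padding and splitting described above can be sketched independently of NCo. The following is only an illustration in C++: PadField and SplitValueParts are hypothetical helper names, and the field lengths (12, 7 and 50) are taken from the example, so the real technical lengths on your system have to be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pad (or truncate) a value to the fixed technical length of its field.
std::string PadField(const std::string& value, std::size_t length) {
    std::string out = value.substr(0, length);
    out.resize(length, ' ');  // right-pad with blanks
    return out;
}

// Cut the concatenated structure into VALUEPART1..VALUEPART4 sized pieces.
// Parts that receive no data stay empty.
std::vector<std::string> SplitValueParts(const std::string& data,
                                         std::size_t part_length = 240,
                                         std::size_t part_count = 4) {
    // The whole structure must fit into the available parts (960 chars).
    assert(data.size() <= part_length * part_count);
    std::vector<std::string> parts(part_count);
    for (std::size_t i = 0;
         i < part_count && i * part_length < data.size(); ++i) {
        parts[i] = data.substr(i * part_length, part_length);
    }
    return parts;
}
```

For the example row, PadField("000000012661", 12) + PadField("NAME", 7) + PadField("TEXT", 50) yields a 69-character string that fits entirely into VALUEPART1. The same two steps map onto string padding and Substring in .NET.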
My comment was half correct and half incorrect: I wasn't aware of the extension technique in this BAPI, so I didn't know that this structure is really used by it.
You asked:
The problem now is how to extend the needed structures "BAPI_TE_MARA/X" so that they can carry the custom fields?
and what I said indeed stays valid: you can't extend the interface from NCo, only on the backend.
You write:
At this If I load BAPI_TE_MARA there aren't any custom fields but the material
and this leads me to think that your ABAP developers did only half of the work. The things to be done on the SAP backend:
1. Extend the MARA table with custom Z fields (in SAP this is called an append structure)
2. Extend the interface structure BAPI_TE_MARA with fields that correspond exactly to the MARA fields
If you don't see any custom fields in BAPI_TE_MARA except MATERIAL, step 2 is probably missing on the SAP side. From what I gathered from your comments, they created the structure ZMM_S_MATMAS_ADDITION but appended it only to MARA, not to BAPI_TE_MARA.
What is missing from Sandra's excellent holistic answer is step 3: for this whole construction to work, some customizing needs to be done.
The T130F table must contain your custom fields. To maintain the entry for T130F, go to transaction SPRO or directly to the maintenance view V_130F.
SPRO way: go to SPRO -> Logistics-General -> Material Master -> Field Selection -> Assign fields to field Selection Groups and maintain the entry in the table.
Sample ABAP code that does the thing:
DATA: ls_headdata TYPE bapimathead,
lt_extensionin TYPE STANDARD TABLE OF bapiparex,
ls_extensionin LIKE LINE OF lt_extensionin,
lt_extensioninx TYPE STANDARD TABLE OF bapiparexx,
ls_extensioninx LIKE LINE OF lt_extensioninx,
lt_messages TYPE bapiret2_t,
ls_bapi_te_mara TYPE bapi_te_mara,
ls_bapi_te_marax TYPE bapi_te_marax.
ls_headdata-material = |{ ls_headdata-material ALPHA = IN }|.
ls_headdata-basic_view = 'X'.
ls_bapi_te_mara-material = ls_headdata-material.
ls_bapi_te_mara-zztest1 = '322223'.
ls_bapi_te_marax-material = ls_headdata-material.
ls_bapi_te_marax-zztest1 = 'X'.
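" Pass the whole extension structure (key plus custom fields) as one string
ls_extensionin-structure = 'BAPI_TE_MARA'.
ls_extensionin-valuepart1 = ls_bapi_te_mara.
APPEND ls_extensionin TO lt_extensionin.
" The X structure is passed the same way, including the material key
ls_extensioninx-structure = 'BAPI_TE_MARAX'.
ls_extensioninx-valuepart1 = ls_bapi_te_marax.
APPEND ls_extensioninx TO lt_extensioninx.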
CALL FUNCTION 'BAPI_MATERIAL_SAVEDATA'
EXPORTING
headdata = ls_headdata
TABLES
returnmessages = lt_messages
extensionin = lt_extensionin
extensioninx = lt_extensioninx.
CALL FUNCTION 'BAPI_TRANSACTION_COMMIT'
EXPORTING
wait = 'X'.
Based on this you can model your .Net code for BAPI calling.
P.S. Pay attention to the first line with ALPHA = IN. The input to the material number field must be in the fully qualified 18-char format with leading zeroes, e.g. 000000000000000323, otherwise the update will fail.
Always extend structure EMARA and not MARA, BAPI_TE_MARA, ... directly.

Flatbuffers: can I change int field to struct with 1 int?

Based on a very good approach for null fields proposed by the main contributor to flatbuffers:
https://github.com/google/flatbuffers/issues/333#issuecomment-155856289
The easiest way to get a null default for an integer field is to wrap
it in a struct.
struct myint { x:int; }
table mytable { scalar:myint; }
This will get you null if scalar isn't present. It also doesn't take
up any more space on the wire than a regular int.
Also based on the flatbuffers documentation:
https://google.github.io/flatbuffers/md__schemas.html
You can't change types of fields once they're used, with the exception of same-size data where a reinterpret_cast would give you a desirable result, e.g. you could change a uint to an int if no values in current data use the high bit yet.
My question is can I treat int as reinterpret_cast-able to myint?
In other words, if I start with just a simple int as a field, can I later on decide that I actually want this int to be nullable and change it to myint? I know that all values that used to be default value in the first int schema will be read as null in the myint schema and I am ok with that.
Of course the obvious follow up question is can I do the same thing for all scalar types?
Though that isn't explicitly documented, yes, int and myint are wire-format compatible (they are both stored inline). Like you say, you will lose any default value instances to become null.
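The "stored inline, same size" argument can be illustrated outside of FlatBuffers with a plain C++ struct. This is only a sketch of the layout reasoning, not generated FlatBuffers code: MyInt and ReadAsMyInt are hypothetical names, and the presence check that makes the field nullable in a table is not modelled here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical C++ mirror of the schema struct `struct myint { x:int; }`.
struct MyInt {
    int32_t x;
};

// The wrapper adds no bytes and keeps the alignment of a plain int.
static_assert(sizeof(MyInt) == sizeof(int32_t), "wrapper adds no bytes");
static_assert(alignof(MyInt) == alignof(int32_t), "wrapper keeps alignment");

// Reinterpret the bytes of a stored int as the wrapper struct.
// memcpy is used instead of a pointer cast to stay clear of aliasing issues.
MyInt ReadAsMyInt(const int32_t& stored) {
    MyInt out;
    assert(sizeof(out) == sizeof(stored));
    std::memcpy(&out, &stored, sizeof(out));
    return out;
}
```

Reading a stored 42 through ReadAsMyInt gives a struct whose x is 42, which is the sense in which the two representations are interchangeable.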

AS400 RPGLE/free dynamic variables in operations

I'm fairly certain after years of searching that this is not possible, but I'll ask anyway.
The question is whether it's possible to use a dynamic variable in an operation when you don't know the field name. For example, I have a data structure that contains a few hundred fields. The operator selects one of those fields and the program needs to know what data resides in the field from the data structure passed. So we'll say that there are 100 fields, and field50 is what the operator chose to operate on. The program would be passed in the field name (i.e. field50) in the FLDNAM variable. The program would read something like this the normal way:
/free
if field50 = 'XXX';
// do something
endif;
/end-free
The problem is that I would have to code this 100 times for every operation. For example:
/free
if fldnam = 'field1';
// do something
elseif fldnam = 'field2';
// do something
..
elseif fldnam = 'field50';
// do something
endif;
/end-free
Is there any possible way of performing an operation on a field not yet known? (i.e. IF FLDNAM(pointer data) = 'XXX' then do something)
If the data structure is externally-described and you know what file it comes from, you could use the QUSLFLD API to find out the offset, length, and type of the field in the data structure, and then use substring to get the data and then use other calculations to get the value, depending on the data type.
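The offset-and-length idea can be sketched as a lookup table plus a substring. The example below is written in C++ purely to show the technique. FieldInfo, FieldCatalog and FieldValue are hypothetical names, the catalog is assumed to have been filled beforehand from the field list (for instance from QUSLFLD output), and only character fields are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Position of one field inside the flat record buffer.
struct FieldInfo {
    std::size_t offset;  // zero-based start position
    std::size_t length;  // length in characters
};

// Field name -> position, assumed to be loaded once from the field list.
using FieldCatalog = std::map<std::string, FieldInfo>;

// Return the raw character data of the named field, or nothing if the
// field is unknown or does not fit inside the record.
std::optional<std::string> FieldValue(const FieldCatalog& catalog,
                                      const std::string& record,
                                      const std::string& name) {
    auto it = catalog.find(name);
    if (it == catalog.end()) return std::nullopt;
    const FieldInfo& f = it->second;
    if (f.offset + f.length > record.size()) return std::nullopt;
    std::string value = record.substr(f.offset, f.length);
    assert(value.size() == f.length);
    return value;
}
```

With this in place, a single comparison against FieldValue(catalog, record, fldnam) replaces the long if/elseif chain. Numeric or packed fields would need an extra conversion step based on the data type.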
Simple answer, no.
RPG's simply not designed for that. Few languages are.
You may want to look at scripting languages. Perl for instance, can evaluate on the fly. REXX, which comes installed on the IBM i, has an INTERPRET keyword.
REXX Reference manual