Keen IO mixed property values (integers as strings)

Since Keen is not strongly typed, I've noticed it is possible to send data of different types into the same property. For instance, some events may have a property whose value is a String (sent surrounded by quotes), and some whose value is an integer (sent without quotes). In the case of mathematical operations, what is the expected behavior?

Our comparator will only compute mathematical operations on numbers. If you have a property whose values are mixed, the operation will apply only to the numbers; strings will be ignored. You can see the values in your property by running a select_unique query with that property as the target_property, then (if you're using the Explorer) selecting JSON from the drop-down in the top-right. Any values you see there that are surrounded by quotes will be ignored by a mathematical query type (minimum, maximum, median, average, percentile, and sum).
If you are just starting out, and you know you want to be able to do mathematical operations on this property, we recommend making sure that you always send integers as numbers (without quotes). If you really want to keep your dataset clean, you can even start a new collection once you've made sure you are no longer sending any strings.
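
The behavior described above can be illustrated with a small Python sketch. It does not call Keen's API: the `events` list, the `visitor_score` property name, and both helper functions are hypothetical, and the filtering only imitates the rule that mathematical queries skip string values.

```python
from numbers import Number

# Hypothetical events: the same property holds numbers and quoted strings.
events = [
    {"visitor_score": 10},
    {"visitor_score": "20"},   # sent with quotes, so it is a string
    {"visitor_score": 30},
    {"visitor_score": "n/a"},
]


def numeric_values(events, prop):
    """Return only the values that are real numbers (bools excluded)."""
    return [
        e[prop]
        for e in events
        if isinstance(e.get(prop), Number) and not isinstance(e.get(prop), bool)
    ]


def coerce_numeric(value):
    """Turn a numeric-looking string into a number before sending it."""
    if isinstance(value, str):
        try:
            return int(value)
        except ValueError:
            try:
                return float(value)
            except ValueError:
                return value  # leave genuinely non-numeric strings alone
    return value


# Imitates a "sum" query: the strings "20" and "n/a" are ignored.
total_as_sent = sum(numeric_values(events, "visitor_score"))

# Cleaning the values first lets "20" take part in the sum.
cleaned = [{"visitor_score": coerce_numeric(e["visitor_score"])} for e in events]
total_cleaned = sum(numeric_values(cleaned, "visitor_score"))
```

Here `total_as_sent` is 40 and `total_cleaned` is 60, which is the gap you would want to close by coercing values on the client before they are sent.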

Yes, you're correct, Keen can accept data of different types as the value for your properties. An example of Keen's lenient data typing is that a property such as VisitorID can contain both numbers (e.g. 14558) and strings (e.g. "14558").
This article from the Keen site is useful for seeing where you can check data types: https://keen.io/docs/data-collection/data-modeling-guide-200/#check-for-data-type-mismatch

Related

STRING type or SSTRING element for a text field in table? Pros and cons

I need to create a Z table to store reasons for modifications of a certain custom object.
In the UI, the user will pick a reason ID and then optionally fill a text box. The table will have more or less the fields below:
key objectID
key changeReasonID
changedOn
changedBy
comments
My doubt is with the comments field. I read the documentation about the limitations of STRING and SSTRING, but it's not clear to me if a STRING type field used in a transparent table has a limited length or not.
Even if the length is not limited (at least by the DB), I'm not sure whether this approach is a good idea. Would you recommend CHAR/SSTRING types with a fixed length instead?
Note: my system is running on an MSSQL database.

Strings have unlimited length, both in ABAP structures/tables, and in the database.
Most databases will store only a pointer in this column that points to the real CLOB value which is stored in a different memory segment. As a result, they restrict the usage of these columns, and may not allow you to use them as a key or index.
If I remember correctly, ABAP supports a maximum of 16 string fields per structure, which naturally limits its use cases. Also consider that ABAP structures have a maximum size.
For your case, if the comment will remain the only long field, and if you are actually fine with storing unlimited input (--> security constraints?), string sounds like a reasonable option.
If you are unsure what the future will bring, or to be on the safe side regarding security, you might want to opt for sstring or simply a long char instead.

Do I need to convert a value?

While I was selecting a unit from a database table I noticed, via transaction SE16N, that there are two different values for the same field. An unconverted and a converted value. With my SELECT statement, I receive the unconverted one. Do I need to convert this value in order to continue working with it?

First of all, it's probably worth explaining the concepts of "converted value" and "unconverted value" (better known as "external value" and "internal value").
Internal values are the actual values used by the programs and stored in the database, and the external values are only calculated at the time of the display, on screen, printout, and so on.
It's very practical to see a meaningful code. As Legxis explained, for the internal value of the unit of measure "ST" (a unit of measure which indicates that the number is a number of pieces), an English user would prefer to see PCS (English word "pieces"), while a German user would prefer to see ST (German word "Stück").
The conversion algorithm is defined at the DDIC domain level (transaction code SE11) via the "conversion routine" field, a 5-character code which defines the conversion function modules which are called automatically at display time. For instance, the Unit of measure is related to the domain MEINS, which has the routine CUNIT which corresponds to the function modules CONVERSION_EXIT_CUNIT_INPUT and CONVERSION_EXIT_CUNIT_OUTPUT.
CONVERSION_EXIT_CUNIT_INPUT does the conversion from the external value (displayed) to the internal value (program and database)
CONVERSION_EXIT_CUNIT_OUTPUT does the conversion from the internal value (program and database) to the external value (displayed)
These function modules are automatically called in SAP rendering technologies like SAP GUI, SAPscript, Smart Form, SAP Adobe form, BSP, Web Dynpro, etc. The "OUTPUT" function module is also called if you call the ABAP statement WRITE.
Note that the "output length" defined for a DDIC domain may be of some importance, because one may define an output length (displayed) larger than the internal length. For instance, the language code is stored internally on one character but displayed on two characters: the language code "V" (Swedish) is displayed as "SV", and the language code "S" (Spanish) is displayed as "ES".
Finally, if you understand the concept well, you should conclude that you usually don't need to convert anything yourself. Converting manually is useful only if you want to define an interface which is not one of the SAP-supported technologies mentioned above.
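
To make the internal/external distinction concrete, here is a minimal Python sketch of the two directions. It is only an illustration: the lookup table, the language keys "EN" and "DE", and the function names are assumptions that mirror the "ST" example from this answer; nothing here is read from an SAP system or reproduces the real function modules.

```python
# Illustrative lookup only: (internal value, logon language) -> external value.
# The entries mirror the "ST" example above; they are not read from an SAP system.
EXTERNAL_UNITS = {
    ("ST", "EN"): "PCS",
    ("ST", "DE"): "ST",
}


def unit_output(internal, language):
    """Internal -> external, the role CONVERSION_EXIT_CUNIT_OUTPUT plays."""
    return EXTERNAL_UNITS.get((internal, language), internal)


def unit_input(external, language):
    """External -> internal, the role CONVERSION_EXIT_CUNIT_INPUT plays."""
    for (internal, lang), shown in EXTERNAL_UNITS.items():
        if lang == language and shown == external:
            return internal
    return external


stored = "ST"                          # what a SELECT returns
shown = unit_output(stored, "EN")      # what an English user sees
round_trip = unit_input(shown, "EN")   # back to the stored value
```

The program logic works with `stored` throughout; `shown` exists only at the display boundary.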

The table rows you SELECT in ABAP contain only the unconverted values. Use these to, e.g., JOIN with other tables or call methods/function modules. Conversion is only relevant when displaying the data.
By the way: Nonetheless these conversions with "good intentions" can cause problems. Values with type NUMC (numeric characters) for example are often trimmed/stripped during conversion when they have leading zeros. But some function modules do not work when these leading zeros are missing.
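
The leading-zero pitfall can be sketched in Python. The functions below only imitate the general pattern of padding on input and stripping on output; the field length of 10 and the function names are assumptions for illustration, not SAP code.

```python
FIELD_LENGTH = 10  # assumed length of the NUMC field


def to_internal(external, length=FIELD_LENGTH):
    """Pad a purely numeric value with leading zeros to the field length."""
    return external.zfill(length) if external.isdigit() else external


def to_external(internal):
    """Strip leading zeros for display, keeping a single "0" for zero."""
    if internal.isdigit():
        return internal.lstrip("0") or "0"
    return internal


displayed = to_external("0000012345")   # the stripped form a user would see
restored = to_internal(displayed)       # padded again before calling other code
```

A function module that expects the padded form would need `restored`, not `displayed`, which is why passing a display value onward can fail.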

Naming conventions for the same variable as a different type?

Many people believe Hungarian notation is bad. How, then, do you name variables that represent the same value cast to different types?
I've got a variable called value, that might be a string, or a decimal. What would you call the different formats? strValue, decValue? valueAsString?

I think it would largely depend on the context. For instance if the string value was named age, and the decimal was the parsed value then perhaps parsedAge or something along those lines. Really it comes down to what makes sense given what you are doing and the lifetime of that variable. If it only exists long enough to actually collect and parse the value, then I would give the better name to the parsed variable or worry less about the naming of the intermediary.
If you actually need to hold on to both values, then I might consider creating a struct or some similar data structure that represents the various forms of that data value, to prevent the need to shift between string and decimal formats.
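
As a sketch of that last suggestion, one option is a small immutable record that carries both forms, so neither needs a type prefix in its name. The class name `ParsedValue` and its fields are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation
from typing import Optional


@dataclass(frozen=True)
class ParsedValue:
    raw: str                    # the text exactly as it was received
    parsed: Optional[Decimal]   # the decimal form, or None if parsing failed

    @classmethod
    def from_text(cls, text: str) -> "ParsedValue":
        try:
            return cls(raw=text, parsed=Decimal(text.strip()))
        except InvalidOperation:
            return cls(raw=text, parsed=None)


age = ParsedValue.from_text(" 42.5 ")
bad = ParsedValue.from_text("forty-two")
```

Callers then write `age.raw` or `age.parsed`, and the variable itself keeps the domain name (`age`) rather than a type-flavored one.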

SQL Server 2008 - Default column value - should I use null or empty string?

For some time I've been debating whether, for columns where I don't know if data will be passed in, I should set the value to an empty string ('') or just allow NULL.
I would like to hear what the recommended practice is here.
If it makes a difference, I'm using C# as the consuming application.

I'm afraid that...
it depends!
There is no single answer to this question.
As indicated in other responses, at the level of SQL, NULL and empty string have very different semantics. The former indicates that the value is unknown; the latter indicates that the value is this "invisible thing" (in displays and reports) but is nonetheless a "known value". An example commonly given in this context is that of the middle name. A NULL value in the "middle_name" column would indicate that we do not know whether the underlying person has a middle name or not, and if so what this name is. An empty string would indicate that we "know" that this person does not have a middle name.
This said, two other kinds of factors may help you choose between these options, for a given column.
The very semantics of the underlying data, at the level of the application.
Some considerations in the way SQL works with null values
Data semantics
For example, it is important to know if the empty string is a valid value for the underlying data. If that is the case, we may lose information if we also use the empty string for "unknown info". Another consideration is whether some alternate value may be used when we do not have info for the column; maybe 'n/a' or 'unspecified' or 'tbd' are better values.
SQL behavior and utilities
Considering SQL behavior, the choice of using or not using NULL may be driven by space considerations, by the desire to create a filtered index, or by the convenience of the COALESCE() function (which can be emulated with CASE statements, but in a more verbose fashion). Another consideration is whether any query may attempt to concatenate multiple columns (as in SELECT name + ', ' + middle_name AS LongName etc.), since a NULL operand then affects the whole result.
Beyond the validity of the choice of NULL vs. empty string in a given situation, a general consideration is to try to be as consistent as possible, i.e. to stick to ONE particular way, and to depart from it only purposely and explicitly, for good reasons and in few cases.
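
The difference is easy to observe in a runnable sketch. The example below uses Python's built-in sqlite3 module as a stand-in, so details differ from SQL Server: SQLite concatenates with `||` rather than `+`, and SQL Server's result for `+` with a NULL operand depends on the CONCAT_NULL_YIELDS_NULL setting. The `person` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT NOT NULL, middle_name TEXT)")
conn.executemany(
    "INSERT INTO person (name, middle_name) VALUES (?, ?)",
    [("Ann", None), ("Bob", ""), ("Cy", "Lee")],
)

# Concatenating with NULL yields NULL; the empty string concatenates normally.
plain = conn.execute(
    "SELECT name || ', ' || middle_name FROM person ORDER BY name"
).fetchall()

# COALESCE substitutes a fallback for NULL before concatenating.
coalesced = conn.execute(
    "SELECT name || ', ' || COALESCE(middle_name, '') FROM person ORDER BY name"
).fetchall()

# NULL and '' are matched by different predicates.
unknown = conn.execute(
    "SELECT COUNT(*) FROM person WHERE middle_name IS NULL"
).fetchone()[0]
empty = conn.execute(
    "SELECT COUNT(*) FROM person WHERE middle_name = ''"
).fetchone()[0]
conn.close()
```

Ann's row comes back as NULL (Python `None`) from the plain concatenation but as `'Ann, '` once COALESCE is applied, and each of the two counting queries matches exactly one row.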

Don't use empty string if there is no value. If you need to know if a value is unknown, have a flag for it. But 9 times out of 10, if the information is not provided, it's unknown, and that's fine.

NULL means unknown value. An empty string means a known value - a string with length zero. These are totally different things.

empty when I want a valid default value that may or may not be changed, for example, a user's middle name.
NULL when it is an error if the ensuing code does not set the value explicitly.
However, by initializing strings with the Empty value instead of null, you can reduce the chances of a NullReferenceException occurring.

Theory aside, I tend to view:
Empty string as a known value
NULL as unknown
In this case, I'd probably use NULL.
One important thing is to be consistent: mixing NULLs and empty strings will end in tears.
On a practical implementation level, an empty string takes 2 bytes in SQL Server, whereas NULLs are bitmapped. In some conditions, and for wide or large tables, it makes a difference in performance because there is more data to shift around.

Term to represent all possible values of a variable

Is there a term to represent a set of all possible values a variable can assume?
Analogy:
In mathematics a domain of a function is a set of values a function is defined on (function can take as an argument).
Examples:
A variable of type UInt16 can hold values in range [0-65536).
Completion status (represented by a double value) can hold a value in range [0-100].
Gender (represented by an Enum) can hold one of { Male, Female }.
Q:
What is a term to describe all possible values a variable can (contextually) assume?
Basically need a short version of "set of values for a variable". I have seen term type being used to describe such a range, but Type often encompasses other bits of information (e.g. a name, operations, module).
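
The three examples can be written down as explicit value sets in Python, whatever name one settles on for the concept. Python has no built-in UInt16, so the range below only models it; all names here are chosen for illustration.

```python
from enum import Enum

# UInt16: every integer in the half-open interval [0, 65536).
UINT16_VALUES = range(0, 65536)


class Gender(Enum):
    MALE = "Male"
    FEMALE = "Female"


def is_valid_completion(value: float) -> bool:
    """Completion status: any value in the closed interval [0, 100]."""
    return 0.0 <= value <= 100.0


largest = UINT16_VALUES[-1]
gender_values = {g.value for g in Gender}
```

The discrete cases are finite and can be enumerated, while the completion status is a continuous interval that can only be described by a membership test.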

value set
domain
value range

I've also heard "value space" as a term for this.

I would just call it the "range", or "range of values".

Domain would be the math term.

I don't know of programming-specific jargon with that meaning, but "domain" itself seems like a pretty good one...
[EDIT] Read the comments to this, and I actually prefer "range".

I don't know if this is the exact terminology (if it even has one), but I have always referred to it as a range or, in the case of enums, options.

Range is the proper term, as in "this method will return values within the range of..."; "The expected range of this variable is:..." etc.

For atomic types, the type itself describes the range (e.g. int has a range of -2,147,483,648 to 2,147,483,647).
Anything that is a custom type may or may not have a range because custom types (e.g. struct, class, interface) are composite types that can be made up of atomic or other custom types.
The definition of a type will also vary between different languages.
The long and short of it is generally you will only be able to apply a range to atomic types based on a specific language.

It depends on the type system. In some programming languages, a "string" can hold a sequence of characters, and an "unsigned int" can only hold non-negative whole numbers. In others, like Python, a variable can hold anything at all because it doesn't have a fixed type.

Our quants here say it is called a value set. They get paid tons of money to create them so I believe them!

You may think of a variable as containing an element that is a member of a set of numbers.
As such, domain is a good descriptor for the possible values of this set.
Range is also often used in a similar context. Here we talk of the range of a function, as the set of values the function can take on. Since a variable always contains the result of some expression or computation, range clearly makes sense too.
Either is appropriate in the proper context.
