My group was writing a code when a problem gives us an exponential formula. One member used ^ as exponential operator, and the other used **. Then, each method returned a different value:
Im not trying to know which one is correct, but i would like to know why each operator retrieved a different value. What is the difference between the mathematical logic behind each of them.
I would like to know why these two operators, which essentialy have the same utility, retrieved different values.
Related
Background:
Our group is going through a Cloudera upgrade to 6.1.1 and I have been tasked with determining how to handle the loss of the implicit data type conversion across data types. See link below for the relevant Release Note details.
https://docs.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_611_incompatible_changes.html#hive_union_all_returns_incorrect_data
Not only does this issue affect UNION ALL queries, but there is a function that performs comparisons on columns of different data types (i.e, STRING to BIGINT).
The group has decided that we do not want to change the underlying table meta data. So the solution is to allow for potential data loss by using the CAST() function to cast the data. In the case of UNION ALL, we cast to the destination table's meta data. But, when performing comparisons, I am trying to determine the simplest and easiest way to perform comparisons without getting erroneous results.
Question:
Can I simply cast everything to either STRING or VARCHAR() when performing the comparison? Are there any potential problems that might create incorrect results?
Update:
If there are problems with this approach, is there a correct solution to handle this?
Note: this is my first engagement working with Hadoop/HIVE and I have learned that everything I know in RDBMS land does not always apply.
It is possible that you will have problems. For instance, if comparing a string to an int, then:
'1.00' = 1 --> true, because the values are compared as numbers
But as strings:
'1.00' = '1' --> false, because the values are compared as strings
You can get similar issues with dates, I think.
Since Keen is not strongly typed, I've noticed it is possible to send data of different types into the same property. For instance, some events may have a property whose value is a String (sent surrounded by quotes), and some whose value is an integer (sent without quotes). In the case of mathematical operations, what is the expected behavior?
Our comparator will only compute mathematical operations on numbers. If you have a property whose values are mixed, the operation will only apply to the numbers, strings will be ignored. You can see the values in your property by running a select_unique query on that property as the target_property, then (if you're using the Explorer) selecting JSON from the drop-down in the top-right. Any values you see there that are surrounded by quotes will be ignored by a mathematical query type (minimum, maximum, median, average, percentile, and sum).
If you are just starting out, and you know you want to be able to do mathematical operations on this property, we recommend making sure that you always send integers as numbers (without quotes). If you really want to keep your dataset clean, you can even start a new collection once you've made sure you are no longer sending any strings.
Yes, you're correct, Keen can accept data of different types as the value for your properties. An example of Keen's lenient data type is that a property such as VisitorID can contain both numbers (ie 14558) or strings (ie "14558").
This is article from the Keen site is useful for seeing where you can check data types: https://keen.io/docs/data-collection/data-modeling-guide-200/#check-for-data-type-mismatch
I'm using VB.Net 2013 and really could use some help. Perhaps I have been staring at it too long. I am presented with a value from a variable. The specific value is this
3.190E+01+3.366E+01+8.036E+00
The value is actually 3 smaller values in scientific notation as follows
3.190E+01
3.366E+01
8.036E+00
I need to get the individual values into individual variables. Once I have the individual values I need to calculate the notation of each value so 3.190E+01 is equivalent to 3.190*10^1 and 8.036E+00 is equivalent to 8.036*10^0. I can probably figure out the last part of this question if I can just get the individual values. The caveat is that the numbers will vary in size and the scientific notation part will not always be the same. I do believe it will always be E+XX though so possible to use some regex stuff that I don't fully understand.
Thank you, I look forward to your help and it is very much appreciated.
https://wiki.apache.org/pig/UDFManual
The example UDF has a null-check on the input tuple in the exec method. The various built-in methods sometimes do and sometimes don't.
Are there actually any cases where a Pig script will cause a UDF to be called with a null input tuple? Certainly an empty input tuple is normal and expected, or a tuple of one null value, but I've never the tuple itself be null.
A tuple may be null because a previous UDF returns null.
Think on log analysis system, where you 1. parse the log, 2. enrich it with external data.
LOG --(PARSER)--> PARSED_LOG --(ENRICHENER)--> ENRICHED_LOG
If one log LOG have wrong format and cannot be parsed, the UDF PARSED_LOG may returns null.
Therefore, if used directly, the ENRICHENER have to test the input.
You can FILTER those null values before too, specially if the tuples are used multiple times or STORED.
My best understanding after working with Pig for a while is that bare nulls are never passed bare nulls, always a non-null Tuple (which itself may contain nulls.)
I'm trying to extract data (using SPUFI) from a DB2 table to a file, with one of the output fields converting a decimal field to the same format as a COBOL comp field.
So e.g. today's date (20141007) would be ..ëõ
The SQL HEX function converts 20141007 to 013353CF, and doing a SELECT of x'013353CF' gives me the desired result, but obviously that's a constant, I'm trying to find an equivalent function.
Basically an inverse of the HEX function.
I've come across a couple of suggestions using user defined functions. Problem is, we've only recently upgraded to DB2 10 and new function mode isn't enabled yet, which means I don't have access to any control functions in a UDF.
I suspect I'm out of luck, but wondering if anyone has any suggestions.
I appreciate this is completely the wrong tool for the job, and would be easier to just write a COBOL program to do it, but various constraints are preventing that. I'm limited to just SQL functions and possibly JCL).
I thought I had a solution using a recursive UDF to get around the lack of control functions, but that's not allowed either.