Pentaho: logarithmic scale in charts

Is there a way to set a logarithmic scale on a Pentaho chart?
The result of my SQL query covers a very wide range (from 1 to 1000000).
I could change my SQL query to something like this:
"Select LOG(10, wide_value)..."
But then the Y-axis labels on the Pentaho chart change the same way (1, 2, 3...). I need the chart to show the original values (10, 100, 1000...).
All I found on the web are old forum threads saying that this is still a feature request, or suggesting "CCC" (too much documentation for something so simple) or modifying the source code (even worse). None of them were useful.
Hint: I'm using a legacy implementation on Pentaho BI Server 6.0. If newer versions have an option for this, that would be useful too; maybe there is something similar in my old version.

This is the solution in case someone else needs it.
My query originally delivered something like:
Select
   x_axis as "x_axis",
   y_axis_wide_value as "value"
From ...
It must be modified to:
Select
x_axis as "x_axis",
LOG(10, y_axis_wide_value) as "value"
From ...
My graphic in Pentaho's "Components Panel" is a "CCC Line Chart".
What I needed is for the Y axis to show the real values. That is achieved as follows:
In "Advanced Properties", locate "orthoAxisTickFormatter" and enter the following function:
function fun (value)
{
     return Math.pow (10, value);
}
Some additional settings for better viewing:
orthoAxisOriginIsZero: false
orthoAxisZeroLine: false
And that was all. Now my graphs are displayed with logarithmic scale.
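The idea behind this fix is a round trip: the query stores log10 of each value, and the tick formatter undoes it for display. Here is a minimal Python sketch of that round trip, outside Pentaho. The function names and sample values are made up for illustration and are not part of any Pentaho or CCC API.

```python
import math


def to_log10(values):
    """Transform raw values the same way LOG(10, value) does in the query."""
    return [math.log10(v) for v in values]


def tick_label(tick):
    """Inverse transform, mirroring the tick formatter: 10 raised to the tick."""
    return round(10 ** tick)


raw = [1, 10, 1000, 1000000]               # hypothetical query results
plotted = to_log10(raw)                    # what the chart actually plots
labels = [tick_label(t) for t in plotted]  # what the axis should display

print(plotted)
print(labels)
```

Note that the logarithm is only defined for positive values, so rows with zero or negative values would need to be filtered out or handled before the transform.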

Feature and FeatureView versioning

My team is interested in a feature store solution that enables rapid experimentation with features, probably using feature versioning. In the Feast Slack history, I found Benjamin Tan's post describing their Feast workflow, which includes FeatureView versioning:
insights_v1 = FeatureView(
    features=[
        Feature(name="insight_type", dtype=ValueType.STRING)
    ]
)
insights_v2 = FeatureView(
    features=[
        Feature(name="customer_id", dtype=ValueType.STRING),
        Feature(name="insight_type", dtype=ValueType.STRING)
    ]
)
Is this the recommended best practice for FeatureView versioning? It looks like Features do not have a version field. Is there a recommended strategy for Feature versioning?
Creating a new column for each Feature version is one approach:
driver_rating_v1
driver_rating_v2
But that could get unwieldy if we want to experiment with dozens of permutations of the same Feature.
Featureform appears to have support for feature versions through the "variant" field, but their documentation is a bit unclear.

Adding additional clarity on Featureform: Variant is analogous to version. You'd supply a string which then becomes an immutable identifier for the version of the transformation, source, etc. Variant is one of the common metadata fields provided in the Featureform API.
Using an ecommerce dataset and Spark, here's an example of using the variant field to version a source (a Parquet file in this case):
orders = spark.register_parquet_file(
    name="orders",
    variant="default",
    description="This is the core dataset. From each order you might find all other information.",
    file_path="path_to_file",
)
You can set the variant variable ahead of time:
VERSION = "v1"  # You can change this to rerun the definitions with new variants
orders = spark.register_parquet_file(
    name="orders",
    variant=VERSION,
    description="This is the core dataset. From each order you might find all other information.",
    file_path="path_to_file",
)
And you can create versions or variants of the transformations. Here I'm taking a dataframe called total_paid_per_customer_per_day and aggregating it.
# Get average order value per day
@spark.df_transformation(inputs=[("total_paid_per_customer_per_day", "default")], variant="skeller88_20220110")
def average_daily_transaction(df):
    from pyspark.sql.functions import mean
    return df.groupBy("day_date").agg(mean("total_customer_order_paid").alias("average_order_value"))
There are some more details on the Featureform CLI here: https://docs.featureform.com/getting-started/interact-with-the-cli
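To picture what "an immutable identifier for the version" means, here is a toy Python sketch of a registry keyed by (name, variant). It is not the Featureform API: the VariantRegistry class, its methods, and the sample definitions are hypothetical and only illustrate the concept.

```python
class VariantRegistry:
    """Toy registry: each (name, variant) pair is registered once and never overwritten."""

    def __init__(self):
        self._entries = {}

    def register(self, name, variant, definition):
        key = (name, variant)
        if key in self._entries:
            # Immutability: an existing variant cannot be redefined.
            raise ValueError(f"{name}:{variant} already exists; choose a new variant")
        self._entries[key] = definition
        return key

    def get(self, name, variant):
        return self._entries[(name, variant)]

    def variants(self, name):
        return sorted(v for (n, v) in self._entries if n == name)


registry = VariantRegistry()
registry.register("driver_rating", "v1", "mean rating over 7 days")
registry.register("driver_rating", "v2", "mean rating over 30 days")
print(registry.variants("driver_rating"))
```

Under this model, experimenting with many permutations of one feature means adding variants under a single name rather than adding a new column name for each version.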

Can this list be sorted by date in Velocity?

I've found this code for getting articles by tag and displaying them as a list of links in XWiki, but I want the list sorted by date.
Does anyone have a suggestion?
{{velocity}}
#set ($list = $xwiki.tag.getDocumentsWithTag('myTag'))
#foreach($reference in $list)
#set ($document = $xwiki.getDocument($reference))
#set ($label = $document.getTitle())
[[$label>>$reference]]
#end
{{/velocity}}
Thanks in advance!

Sorting in Velocity can hit one of two performance penalties:
1. Actually sorting in Velocity with a hand-written sorting algorithm -> unnecessarily complicated.
2. Loading all the document results into memory (a collection) and sorting that collection with the sort/collection tool -> you risk quickly running out of memory if the result is larger than you expected.
The easiest alternative, given that there is XWiki running behind it, would be to do an XWQL query for the XWiki.TagClass objects stored inside the documents and do the sorting at the database level. At this point, in Velocity, you only need to display the results (the example uses the tag 'conference'; substitute your own):
{{velocity}}
#foreach ($docStringRef in $services.query.xwql("from doc.object(XWiki.TagClass) tagsObj where 'conference' member of tagsObj.tags order by doc.creationDate DESC").setLimit(10).execute())
#set ($document = $xwiki.getDocument($docStringRef))
[[$document.title>>$docStringRef]]
#end
{{/velocity}}
For future reference, the list of available Velocity tools in XWiki might also be useful: https://extensions.xwiki.org/xwiki/bin/view/Extension/Velocity%20Module#HVelocityTools They can help with common operations, including sorting (as mentioned in point 2 above).
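The same principle (let the database filter, sort, and limit, then only display) can be sketched outside XWiki. This Python example uses an in-memory SQLite table as a stand-in; the docs table, its columns, and the sample rows are hypothetical and do not reflect XWiki's actual schema.

```python
import sqlite3

# In-memory stand-in for the wiki's document store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (title TEXT, tag TEXT, created TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [
        ("Older talk", "conference", "2020-03-01"),
        ("Newest talk", "conference", "2023-09-15"),
        ("Unrelated page", "misc", "2024-01-01"),
        ("Middle talk", "conference", "2021-06-30"),
    ],
)

# Filtering, ordering, and limiting all happen in the database,
# so only the rows that will be displayed are loaded into memory.
rows = conn.execute(
    "SELECT title FROM docs WHERE tag = ? ORDER BY created DESC LIMIT ?",
    ("conference", 10),
).fetchall()
titles = [title for (title,) in rows]
print(titles)
```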

Open CMIS - Querying string property results in weird behavior

I'm executing the following SQL query:
SELECT doc.cmis:description, doc.cmis:name
FROM cmis:document doc
WHERE IN_FOLDER(doc,'folderID')
This results in something like the following:
doc.cmis:description = "this is description"
doc.cmis:name = "fileName"
Now, if I add the following condition, it returns zero results:
and doc.cmis:description = 'this is description'
However, if I change the condition to the following, it works:
and doc.cmis:description like '%'
If I add one character (but, interestingly, not two) as below, it also works:
and doc.cmis:description like '%t%'
Interestingly, the same kind of condition works fine with doc.cmis:name (as well as other properties).
Does anyone have a clue as to why this strange behavior occurs?

The specification leaves it to the implementer whether cmis:description is queryable.
Anyway, which Alfresco version are you using? There was a bug some time ago, but it should be fixed by now: the cmis:description field should be queryable, although I don't know whether the fix is in Enterprise or Community.
By the way, I am currently using Alfresco Community 4.2.f and I have the same problem.

How do I tell Wolfram Alpha that I want it to compute the terms of an integer sequence?

I want to compute a list of the terms of the sequence
(14747-40*n)/(2621440*(41-n)), n from 1 to 40
I tried entering the above in Wolfram Alpha, and it plots a graph of the function. This isn't what I want. I've tried variations on this command, as well as guessing at various keywords to put before it, and I either get back the same thing or something unhelpful.
The help page on sequences suggests various things you might do with sequences, but doesn't say how to do something as simple as this.

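The following works (note that the trailing n, as written, multiplies each term by n, and the values listed below include that factor; drop it to get the sequence exactly as stated in the question):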
Table[(14747-40*n)/(2621440*(41-n)) n, {n, 1, 40}]
Clicking on "approximate form" then on "copy plaintext" gives the following:
{0.000140257, 0.000286924, 0.000440507, 0.000601567, 0.000770728,
0.000948683, 0.00113621, 0.00133417, 0.00154356, 0.00176547,
0.00200115, 0.00225204, 0.00251976, 0.00280618, 0.00311345,
0.00344409, 0.00380101, 0.00418764, 0.00460803, 0.00506701,
0.00557035, 0.00612508, 0.00673974, 0.00742493, 0.00819385,
0.00906326, 0.0100547, 0.0111963, 0.0125257, 0.0140939, 0.0159728,
0.0182658, 0.0211282, 0.0248041, 0.0297003, 0.0365488, 0.0468139,
0.0639122, 0.0980936, 0.200607}
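As a cross-check outside Wolfram Alpha, here is a short Python sketch that computes the terms exactly with fractions. The Table expression as written above ends in a factor n, so the sketch offers both forms; the function name term and the times_n flag are chosen here for illustration.

```python
from fractions import Fraction


def term(n, times_n=False):
    """Exact value of (14747 - 40n) / (2621440 (41 - n)), optionally multiplied by n."""
    value = Fraction(14747 - 40 * n, 2621440 * (41 - n))
    return value * n if times_n else value


# Sequence exactly as stated in the question.
as_asked = [term(n) for n in range(1, 41)]
# Sequence as produced by the Table expression above, with its trailing factor n.
as_in_table = [term(n, times_n=True) for n in range(1, 41)]

print([float(t) for t in as_asked[:3]])
print([float(t) for t in as_in_table[:3]])
```

Using Fraction keeps every term exact; converting to float is only needed for display.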

Translate SQL to OCL?

I have a piece of SQL that I want to translate to OCL. I'm not good at SQL, so I hope this will improve maintainability. We are using Interbase 2009 and Delphi 2007 with Bold and model-driven development. Now my hope is that someone here speaks both good SQL and OCL :-)
The original SQL:
Select Bold_Id, MessageId, ScaniaId, MessageType, MessageTime, Cancellation, ChassieNumber, UserFriendlyFormat, ReceivingOwner, Invalidated, InvalidationReason,
(Select Parcel.MCurrentStates From Parcel
Where ScaniaEdiSolMessage.ReceivingOwner = Parcel.Bold_Id) as ParcelState From ScaniaEdiSolMessage
Where MessageType = 'IFTMBP' and
not Exists (Select * From ScaniaEdiSolMessage EdiSolMsg
Where EdiSolMsg.ChassieNumber = ScaniaEdiSolMessage.ChassieNumber and EdiSolMsg.ShipFromFinland = ScaniaEdiSolMessage.ShipFromFinland and EdiSolMsg.MessageType = 'IFTMBF') and
invalidated = 0 Order By MessageTime desc
After a small simplification:
Select Bold_Id, (Select Parcel.MCurrentStates From Parcel
where ScaniaEdiSolMessage.ReceivingOwner = Parcel.Bold_Id) From ScaniaEdiSolMessage
Where MessageType = 'IFTMBP' and not Exists (Select * From ScaniaEdiSolMessage
EdiSolMsg Where EdiSolMsg.ChassieNumber = ScaniaEdiSolMessage.ChassieNumber and
EdiSolMsg.ShipFromFinland = ScaniaEdiSolMessage.ShipFromFinland and
EdiSolMsg.MessageType = 'IFTMBF') and invalidated = 0
NOTE: There are 2 cases for MessageType, 'IFTMBP' and 'IFTMBF'.
So the table to be listed is ScaniaEdiSolMessage.
It has attributes like:
MessageType: String
ChassieNumber: String
ShipFromFinland: Boolean
Invalidated: Boolean
It also has a link named ReceivingOwner to the Parcel table, with Bold_Id as the key.
So it seems that it lists all rows of ScaniaEdiSolMessage and then has a subquery that also lists all rows of ScaniaEdiSolMessage under the alias EdiSolMsg. Then it excludes almost all rows. In fact, the query above gives one hit from 28000 records.
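To make that reading of the query concrete, here is a minimal Python sketch of the same filter over in-memory records. The Message class, its field names, and the sample rows are hypothetical stand-ins for the real table; the subselect for ParcelState and the ordering by MessageTime are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    bold_id: int
    message_type: str
    chassie_number: str
    ship_from_finland: bool
    invalidated: bool


def unmatched_iftmbp(messages):
    """Keep valid IFTMBP messages that have no IFTMBF counterpart (the NOT EXISTS part)."""

    def has_matching_iftmbf(m):
        return any(
            other.message_type == "IFTMBF"
            and other.chassie_number == m.chassie_number
            and other.ship_from_finland == m.ship_from_finland
            for other in messages
        )

    return [
        m
        for m in messages
        if m.message_type == "IFTMBP"
        and not m.invalidated
        and not has_matching_iftmbf(m)
    ]


messages = [
    Message(1, "IFTMBP", "A1", True, False),   # excluded: message 2 matches it
    Message(2, "IFTMBF", "A1", True, False),
    Message(3, "IFTMBP", "B2", False, False),  # kept: no IFTMBF with the same keys
    Message(4, "IFTMBP", "C3", True, True),    # excluded: invalidated
    Message(5, "IFTMBF", "B2", True, False),   # same chassis as 3, different ship flag
]
result = unmatched_iftmbp(messages)
print([m.bold_id for m in result])
```

The inner any(...) plays the role of the correlated subquery: for every candidate row it scans the same table again, which is why the query touches ScaniaEdiSolMessage twice.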
In OCL it is easy to list all instances:
ScaniaEdiSolMessage.allinstances
It is also easy to filter rows with select, for example:
ScaniaEdiSolMessage.allinstances->select(shipFromFinland and not invalidated)
But I do not understand how to write an OCL expression that matches the SQL above.

Listen to Gabriel and Stephanie, learn more SQL.
You state that you want to make the code more maintainable, yet the number of developers who understand SQL is greater by far than the number of developers who understand OCL.
If you leave the project tomorrow after converting this to OCL, the chances that you'd be able to find someone who could maintain the OCL are very slim. However, the chances that you could find someone to maintain the SQL are very high.
Don't try to fit a square peg in a round hole just because you're good with round hammers :)

There is a project, Dresden OCL, that might help you.
Dresden OCL provides a set of tools to parse and evaluate OCL constraints on various models like UML, EMF and Java. Furthermore, Dresden OCL provides tools for Java/AspectJ and SQL code generation. The tools of Dresden OCL can be used either as a library for other projects or as a plug-in project that extends Eclipse with OCL support.
I haven't used it, but there is a demo showing how the tool generates SQL from a model and OCL constraints. I realize you're asking for the opposite, but maybe using this you can figure it out. There is also a paper that describes OCL->SQL transformations by the same people.

With MDriven (successor of Bold for Delphi) I would do it like this:
When working between OCL and SQL, everything becomes easier if you think about the different sets of information you need to check, and then use OCL operators such as ->intersection to find the set you are after.
So in your case you might have a set like this:
ScaniaEdiSolMessage.allinstances->select(shipFromFinland and not invalidated)
but you also have a set like this:
ScaniaEdiSolMessage.allinstances->select(m|m.ReceivingOwner.MessageType = 'IFTMBP')
And you furthermore have this criterion:
Parcel.allinstances->select(p|p.Messages->exists(m|m.MessageType = 'IFTMBF')).Messages
If all these sets have the same result type (collection of ScaniaEdiSolMessage), you can simply intersect them to get your desired result:
ScaniaEdiSolMessage.allinstances->select(shipFromFinland and not invalidated)
->intersection(ScaniaEdiSolMessage.allinstances->select(m|m.ReceivingOwner.MessageType = 'IFTMBP'))
->intersection(Parcel.allinstances->select(p|p.Messages->exists(m|m.MessageType = 'IFTMBF')).Messages
)
And looking at that we can reduce it a bit to:
ScaniaEdiSolMessage.allinstances
->select(m|m.shipFromFinland and (not m.invalidated) and
(m.ReceivingOwner.MessageType = 'IFTMBP'))
->intersection(Parcel.allinstances->select(p|
p.Messages->exists(m|m.MessageType = 'IFTMBF')).Messages
)
