Qlik Sense: How to aggregate strings into a single row in script

I am trying to aggregate strings that belong to the same product code into one row. Which Qlik Sense aggregation function should I use?
I am able to aggregate integers in an example like this, but string aggregation fails.

Have you tried MaxString()? It is a string aggregation function.

As x3ja mentioned, you can use an aggregation function in charts that will work for strings, including:
MaxString()
Only()
Concat()
These can produce the kind of result you're looking for.
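To make the difference between these functions concrete, here is a minimal Python sketch (not Qlik script) of what a Concat()-style and a MaxString()-style aggregation do per product code. The sample rows, the delimiter, and the function names are assumptions made for illustration, and Qlik's own ordering rules may differ.

```python
from collections import defaultdict

# Hypothetical sample rows: (product_code, text)
rows = [
    ("P1", "red"),
    ("P1", "large"),
    ("P2", "blue"),
]


def concat_by_key(rows, delimiter=", "):
    """Join all strings that share a key, similar in spirit to Concat()."""
    grouped = defaultdict(list)
    for key, text in rows:
        grouped[key].append(text)
    return {key: delimiter.join(values) for key, values in grouped.items()}


def max_string_by_key(rows):
    """Keep the string that sorts last per key, similar in spirit to MaxString()."""
    result = {}
    for key, text in rows:
        if key not in result or text > result[key]:
            result[key] = text
    return result


combined = concat_by_key(rows)
largest = max_string_by_key(rows)
```

The first keeps every string for a product code in one cell, while the second keeps only one of them, which matters if the goal is to lose no information.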
It's worth noting, though, that this sort of problem is almost always an issue with the underlying data model. Depending on what your source data looks like, you should consider investigating your use of Join and/or Concatenate. You can see more info on how to use those functions on this Qlik Help page.
A very basic Join in the load script can combine the data so that everything shows up in a single record, without needing any aggregations in the table chart.


List of aggregation functions in Spark SQL

I'm looking for a list of pre-defined aggregation functions in Spark SQL. I have in mind something analogous to Presto Aggregate Functions.
I Ctrl+F'd around a little in the SQL API docs to no avail... it's also hard to tell at a glance which functions are for aggregation vs. not. For example, if I didn't know avg is an aggregation function I'd be hard pressed to tell it is one (in a way that's actually scalable to the full set of functions):
avg - avg(expr) - Returns the mean calculated from values of a group.
If such a list doesn't exist, can someone at least confirm to me that there's no pre-defined function like any/bool_or or all/bool_and to determine if any or all of a boolean column in a group are true (or false)?
For now, my workaround is
select grp_col, count(if(bool_col, true, NULL)) > 0 any_agg
Take a look at the Aggregate functions section of the Spark docs.
The list of functions is under RelationalGroupedDataset, specifically the APIs that return a DataFrame (not a RelationalGroupedDataset):
https://spark.apache.org/docs/latest/api/scala/index.html?org/apache/spark/sql/RelationalGroupedDataset.html#org.apache.spark.sql.RelationalGroupedDataset
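The count-based workaround from the question can be sketched end to end as follows. This is only an illustration run against SQLite from Python rather than Spark; the table name `t` and the sample rows are assumptions, and the `all_agg` column is an extension of the same idea, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: a grouping column and a boolean stored as 0/1.
conn.execute("CREATE TABLE t (grp_col TEXT, bool_col INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", 0), ("b", 0), ("b", 0), ("c", 1), ("c", 1)],
)

query = """
SELECT grp_col,
       -- "any": at least one true value in the group
       COUNT(CASE WHEN bool_col THEN 1 END) > 0 AS any_agg,
       -- "all": no false value in the group
       COUNT(CASE WHEN NOT bool_col THEN 1 END) = 0 AS all_agg
FROM t
GROUP BY grp_col
ORDER BY grp_col
"""
result = conn.execute(query).fetchall()
```

COUNT skips NULLs, so a CASE without an ELSE branch counts only the rows that match the condition.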

How can I use the new UDF functionality to create a "Dynamic SQL statement"?

Is there a way to use a UDF to construct a SQL statement from a template and input variables, and then run that query?
The documentation https://cloud.google.com/bigquery/user-defined-functions?hl=en says:
A UDF is similar to the "Map" function in a MapReduce: it takes a
single row as input and produces zero or more rows as output. The
output can potentially have a different schema than the input.
So your UDF receives just a single row.
Therefore, no: a UDF is not meant for the purpose you described in your question.
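The row-in, rows-out shape described in the documentation can be pictured with a small Python sketch. This is not BigQuery's UDF API; the function name `explode_tags` and the row layout are made up for illustration.

```python
def explode_tags(row):
    """Map-style function: one input row in, zero or more output rows out."""
    for tag in row["tags"].split(","):
        if tag:
            # The output schema (id, tag) differs from the input schema (id, tags).
            yield {"id": row["id"], "tag": tag}


# Hypothetical input table; the second row produces no output rows.
table = [{"id": 1, "tags": "a,b"}, {"id": 2, "tags": ""}]
output = [out for row in table for out in explode_tags(row)]
```

The function only ever sees the row it is handed, so it has no place from which to assemble and run a new query.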
You might take a look at views - maybe that will suit you better:
https://cloud.google.com/bigquery/querying-data#views

What is the use case for the Merge function in SQL CLR?

I am writing a CLR user-defined aggregate function to implement median. I understand all the other functions I have to implement, but I cannot understand what the Merge function is for.
I have a vague idea that if the aggregate is evaluated partially (i.e. some rows of a group are evaluated in one instance and the remaining rows in another), then the partial values need to be combined. If that is the case, is there a way to test this?
Please let me know if any of the above is not clear or if you need any further information.
Your vague idea is correct.
From Requirements for CLR User-Defined Aggregates:
This method can be used to merge another instance of this aggregate
class with the current instance. The query processor uses this method
to merge multiple partial computations of an aggregation.
The parameter to Merge is another instance of your aggregate, and you should merge the aggregated data in that instance into your current instance.
You can have a look at the sample string concatenation aggregate. Its Merge method adds the concatenated strings from the parameter to the current instance of the aggregate class.
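The role of Merge for a median can be sketched in Python as below. This is an illustration of the idea, not the CLR contract: the class and method names loosely mirror Init, Accumulate, Merge and Terminate, and the data is made up.

```python
import statistics


class MedianAggregate:
    """Toy aggregate that keeps every value so partial results can be merged."""

    def __init__(self):
        # Plays the role of Init: start with empty state.
        self.values = []

    def accumulate(self, value):
        # Called once per row; NULL-like values are ignored.
        if value is not None:
            self.values.append(value)

    def merge(self, other):
        # Fold the partial state of another instance into this one.
        self.values.extend(other.values)

    def terminate(self):
        # Produce the final result from the combined state.
        return statistics.median(self.values) if self.values else None


data = [5, 1, 9, 3, 7, 2]

# One instance sees every row.
single = MedianAggregate()
for v in data:
    single.accumulate(v)

# Two instances each see part of the group and are then merged.
left, right = MedianAggregate(), MedianAggregate()
for v in data[:3]:
    left.accumulate(v)
for v in data[3:]:
    right.accumulate(v)
left.merge(right)
```

Comparing `single.terminate()` with `left.terminate()` is one way to test the merge logic. The same kind of check could probably be written as a unit test that calls Merge on the CLR class directly, without relying on the query processor to split the work.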

Dynamic Pivot Query without storing query as String

I am fully familiar with the following method in the link for performing a dynamic pivot query. Is there an alternative method to perform a dynamic pivot without storing the Query as a String and inserting a column string inside it?
http://www.simple-talk.com/community/blogs/andras/archive/2007/09/14/37265.aspx
Short answer: no.
Long answer:
Well, that's still no. But I will try to explain why. As of today, when you run a query, the DB engine needs to know the structure of the result set (number of columns, column names, data types, etc.) that the query will return. Therefore, you have to define the structure of the result set when you request data from the DB. Think about it: have you ever run a query where you did not know the result set structure beforehand?
That applies even when you do select *, which is just syntactic sugar. In the end, the returned structure is "all columns in such table(s)".
By assembling a string, you dynamically generate the structure that you desire, before asking for the result set. That's why it works.
Finally, you should be aware that a dynamically assembled string could, in theory, ask for a result set with an unbounded number of columns, however unlikely that is in practice. Of course, that's not possible and the query would fail, but I'm sure you see the implication.
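The two-step nature of the string approach can be sketched as follows, using SQLite from Python instead of SQL Server. The `sales` table, its columns, and the quoting helpers are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source table to pivot.
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", "Q1", 10), ("east", "Q2", 20), ("west", "Q1", 5)],
)


def quote_ident(name):
    """Quote a value for use as a column name."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote a value for use as a string literal."""
    return "'" + value.replace("'", "''") + "'"


# Step 1: a first query discovers which columns the pivot will have.
quarters = [
    row[0]
    for row in conn.execute("SELECT DISTINCT quarter FROM sales ORDER BY quarter")
]

# Step 2: the result set structure is fixed by building the query text.
columns = ", ".join(
    f"SUM(CASE WHEN quarter = {quote_literal(q)} THEN amount ELSE 0 END) AS {quote_ident(q)}"
    for q in quarters
)
pivot_sql = f"SELECT region, {columns} FROM sales GROUP BY region ORDER BY region"

cursor = conn.execute(pivot_sql)
header = [d[0] for d in cursor.description]
rows = cursor.fetchall()
```

By the time the engine sees `pivot_sql`, every output column is already spelled out, which is exactly the requirement described above.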
Update
I found this, which reinforces the reasons why it does not work.
Here:
SSIS relies on knowing the metadata of the dataflow in advance and a
dynamic pivot (which is what you are after) is not compatible with
that.
I'll keep looking and adding here.

How best to sum multiple boolean values via SQL?

I have a table that contains, among other things, about 30 columns of boolean flags that denote particular attributes. I'd like to return them, sorted by frequency, as a recordset along with their column names, like so:
Attribute   Count
attrib9     43
attrib13    27
attrib19    21
etc.
My efforts thus far can achieve something similar, but I can only get the attributes in columns using conditional SUMs, like this:
SELECT SUM(IIF(a.attribIndex=-1,1,0)), SUM(IIF(a.attribWorkflow =-1,1,0))...
Plus, the query is already getting a bit unwieldy with all 30 SUM/IIFs and won't handle any changes in the number of attributes without manual intervention.
The first six characters of the attribute columns are the same (attrib), and no other column in the table starts with them. Is it possible to use wildcards in column names to pick up all the applicable columns?
Also, can I pivot the results to give me a sorted two-column recordset?
I'm using Access 2003 and the query will eventually be via ADODB from Excel.
This depends on whether or not you have the attribute names anywhere in the data. If you do, then birdlips' answer will do the trick. However, if the names are only column names, you've got a bit more work to do, and I'm afraid you can't do it with simple SQL.
No, you can't use wildcards for column names in SQL. You'll need procedural code to do this (i.e., a VB module in Access; you could do it within a stored procedure if you were on SQL Server). Use that code to build the SQL.
It won't be pretty. I think you'll need to do it one attribute at a time: select a string whose value is that attribute name and the count-where-True, then either A) run that and store the result in a new row in a scratch table, or B) append all those selects together with "Union" between them before running the batch.
My Access VB is more than a bit rusty, so I don't trust myself to give you anything like executable code....
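As a rough illustration of option B, here is the same idea in Python against SQLite rather than Access VB. The table name `items`, the column `attribOther`, and the sample rows are assumptions; `-1` stands for True as in the question, and the column lookup would have to be done differently in Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with three boolean flag columns (-1 = True, 0 = False).
conn.execute(
    "CREATE TABLE items ("
    "id INTEGER, attribIndex INTEGER, attribWorkflow INTEGER, attribOther INTEGER)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [(1, -1, 0, 0), (2, -1, -1, 0), (3, -1, -1, 0)],
)

# Procedural step: find the columns whose names start with "attrib".
all_columns = [row[1] for row in conn.execute("PRAGMA table_info(items)")]
attrib_columns = [c for c in all_columns if c.startswith("attrib")]

# One SELECT per attribute: the column name as a string plus its count of True values.
selects = [
    f"SELECT '{c}' AS Attribute, "
    f"SUM(CASE WHEN \"{c}\" = -1 THEN 1 ELSE 0 END) AS Cnt FROM items"
    for c in attrib_columns
]

# Glue the SELECTs together and sort by frequency.
sql = " UNION ALL ".join(selects) + " ORDER BY Cnt DESC, Attribute"
result = conn.execute(sql).fetchall()
```

Because the SQL is generated from the column list, adding or removing an attribute column needs no manual change to the query.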
Just a simple count and group by should do it:
Select attribute_name
, count(*)
from attribute_table
group by attribute_name
To answer your comment, use analytic functions for that:
Select attribute_table.*
, count(*) over(partition by attribute_name) cnt
from attribute_table
In Access, crosstab queries (the traditional tool for transposing datasets) need at least 3 numeric/date fields to work. However, since the output is going to Excel, have you considered just outputting the data to a hidden sheet and then using a pivot table?