Loading Cross Table With Multiple Levels in Flat Format in QlikView

Is it possible to load a cross table in QlikView which has multiple layers in columns and rows? This is an example of a table in Excel which I would like to load as a flat table in QlikView:

It's not trivial in QlikView... BUT... luckily for you I keep a demo file that shows exactly how to accomplish loading multi-header tables!
You can download it from here.
It takes time to understand the logic, so study it carefully.

This is pretty straightforward using the CrossTable function. Assuming your leading data columns (e.g. "Total_CountriesWithAlloc", "Total Countries incl. costs not allocated") are fixed and constant in number, you can perform a LOAD * from your source to pick up the "new" month columns as they appear over time.
Here's a basic example (just drop the script into a new document and run it to see conceptually how it works).
[Data]:
CrossTable(Months, Amount, 2)
LOAD * INLINE [
StaticDataA,StaticDataB,Jan,Feb,Mar
AAA,XXX,10,20,30
BBB,YYY,41,41,41
CCC,ZZZ,72,82,92
];
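For reference, after the reload the wide table becomes a long one; each month column of each input row becomes its own row:
StaticDataA  StaticDataB  Months  Amount
AAA          XXX          Jan     10
AAA          XXX          Feb     20
AAA          XXX          Mar     30
... (9 rows in total, one per StaticData row per month)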
Based on the clarifications, you could try something like this:
QUALIFY *;

// The first 3 rows of the sheet are the header rows; keep them aside with a row index.
[Headers]:
LOAD
    RecNo() as RecordIndex
    ,*
FROM [lib://qlikid_graeme.smith/Multi Header.xlsx]
(ooxml, no labels, table is Sheet1)
WHERE RecNo() < 4
;

// Everything from row 4 onwards is the actual data.
[UnpivotedRawData]:
LOAD
    *
FROM [lib://qlikid_graeme.smith/Multi Header.xlsx]
(ooxml, no labels, table is Sheet1)
WHERE RecNo() >= 4
;

UNQUALIFY *;

// Unpivot: the first 2 columns stay as qualifiers; the rest become ColRef/Amount pairs.
[PivotedData]:
CrossTable(ColRef, Amount, 2)
LOAD
    *
RESIDENT
    [UnpivotedRawData]
;
DROP TABLE UnpivotedRawData;

// Rename the two qualifier fields to the concatenated text of their three header rows.
// (CONCAT accepts an optional delimiter and sort-weight, e.g. CONCAT([Headers.A], '', [Headers.RecordIndex]),
// should you need to guarantee that the header rows concatenate in row order.)
RenameMapData:
LOAD 'UnpivotedRawData.A' as FromFieldName, CONCAT([Headers.A]) as ToFieldName RESIDENT Headers;
CONCATENATE LOAD 'UnpivotedRawData.B' as FromFieldName, CONCAT([Headers.B]) as ToFieldName RESIDENT Headers;

RenameMapDataMap:
MAPPING LOAD * RESIDENT RenameMapData;
DROP TABLE RenameMapData;
RENAME FIELDS USING RenameMapDataMap;

// Map each unpivoted column reference (e.g. 'UnpivotedRawData.C') to its concatenated header text.
ConcatenatedHeaders:
LOAD 'UnpivotedRawData.C' as ColRef, CONCAT([Headers.C]) as ConcatenatedFieldName RESIDENT Headers;
CONCATENATE LOAD 'UnpivotedRawData.D' as ColRef, CONCAT([Headers.D]) as ConcatenatedFieldName RESIDENT Headers;
CONCATENATE LOAD 'UnpivotedRawData.E' as ColRef, CONCAT([Headers.E]) as ConcatenatedFieldName RESIDENT Headers;
DROP TABLE Headers;
In this example I have just concatenated the 3 column-header rows, but since you have the row ID you could easily adapt this to split them into new fields if you want. Also, you should automate the creation of the ConcatenatedHeaders table above by iterating dynamically through NoOfFields() on the table rather than hard-coding it as I have done, particularly if you have a dynamic number of incoming columns; a sketch of such a loop follows.
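For illustration, such a loop might look roughly like this (an untested sketch: it assumes the field order in Headers is RecordIndex, A, B, then the data columns, i.e. the data columns start at field number 4, and that the qualified field names from the script above are still in place):
// Hypothetical replacement for the hard-coded ConcatenatedHeaders section above.
FOR vFieldNo = 4 TO NoOfFields('Headers')
    LET vHeaderField = FieldName($(vFieldNo), 'Headers');                    // e.g. 'Headers.C'
    LET vColRef = 'UnpivotedRawData.' & SubField('$(vHeaderField)', '.', 2); // e.g. 'UnpivotedRawData.C'
    [ConcatenatedHeaders]: // identical field sets auto-concatenate across iterations
    LOAD
        '$(vColRef)' as ColRef
        ,CONCAT([$(vHeaderField)]) as ConcatenatedFieldName
    RESIDENT Headers;
NEXT vFieldNo
DROP TABLE Headers;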
This is the output in Qlik Sense (I have done this on Qlik Sense Cloud as I am at home on my Mac); the script should work in QlikView too, though.
If you need to split out the individual fields (as per clarified requirements), just replace the last section of the script with this:
ConcatenatedHeaders:
LOAD 'UnpivotedRawData.C' as ColRef, CONCAT([Headers.C], '~') as ConcatenatedFieldName RESIDENT Headers;
CONCATENATE LOAD 'UnpivotedRawData.D' as ColRef, CONCAT([Headers.D], '~') as ConcatenatedFieldName RESIDENT Headers;
CONCATENATE LOAD 'UnpivotedRawData.E' as ColRef, CONCAT([Headers.E], '~') as ConcatenatedFieldName RESIDENT Headers;
DROP TABLE Headers;

// Split the '~'-delimited header text back into one field per header row.
JOIN (ConcatenatedHeaders)
LOAD
    ColRef
    ,SubField(ConcatenatedFieldName, '~', 1) as Row1Header
    ,SubField(ConcatenatedFieldName, '~', 2) as Row2Header
    ,SubField(ConcatenatedFieldName, '~', 3) as Row3Header
RESIDENT
    ConcatenatedHeaders;
The data should look like this:

Related

Qlik - Building a dynamic view

I have a SQL query that creates a table, and every month 2 new columns will be added to that table for the current month.
I have tried without success to set up a flat table (visual) in Qlik that will automatically expand every month to include these columns. Is there a way to do this, and if so, please point me in the right direction.
You can have a look at the CrossTable prefix.
This prefix allows a wide table to be converted to a long table.
So if we have data like this (an Item column followed by one column per month), we can run the following script:
CrossTable:
CrossTable(Month, Sales)
LOAD Item,
[2022-10],
[2022-11],
[2022-12],
[2023-01],
[2023-02],
[2023-03],
[2023-04]
FROM
[C:\Users\User1\Documents\SO_75447715.xlsx]
(ooxml, embedded labels, table is Sheet1);
The final data will look like below. As you can see, there are only 3 columns: all the Excel month columns (after Item) are now collapsed under one field, Month, and all the values are collapsed under the Sales column.
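For illustration (with made-up numbers, as the actual values depend on the source file), the long table has this shape:
Item  Month    Sales
A     2022-10  100
A     2022-11  120
B     2022-10  80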
Having the data in this format then allows creating "normal" charts by adding the Month column as a dimension and using sum(Sales) as an expression.
P.S. If you don't want to manage the new columns being added, then the script can be:
CrossTable(Month, Sales)
LOAD
    // the default qualifier count of 1 keeps the first column (Item) as-is
    *
FROM
...

Dynamic list of variables in process in Azure Data Factory

I have a lookup config table that stores 1) the source table and 2) the list of variables to process, for example:
SQL Lookup Table:
tableA, variableX,variableY,variableZ <-- tableA has more than these 3 variables, i.e it has other variables such as variableV, variable W but they do not need to be processed
tableB, variableA,variableB <-- tableB has more than these 2 variables
Hence, I will need to dynamically connect to each table and process the specific variables in each table. The processing step is to convert the Julian date (in integer format) to a standard date (date format). Example SQL query:
select dateadd(dd, (variableX - ((variableX/1000) * 1000)) - 1, dateadd(yy, variableX/1000, 0)) FROM [dbo].[tableA]
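To make the formula concrete with a made-up value: for variableX = 123045, variableX/1000 = 123, so dateadd(yy, 123, 0) gives 2023-01-01, and the day offset is 123045 - 123*1000 - 1 = 44 days, yielding 2023-02-14.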
The problem is after setting up lookup and forEach in ADF, I am unsure how to loop through the variable array (or string, since SQL DB does not allow me to store array results) and convert all these variables into the standard time format.
The return result should be a processed dataset to be exported to a sink.
Hence, I would like to check: what would be the best way to achieve this in ADF?
Thank you!
I have reproduced this in my local environment. Please see the steps below.
Using a Lookup activity, first get the list of all tables from the control table.
Pass the lookup output to a ForEach activity.
Inside the ForEach activity, add a Lookup activity to get the variables list from the control table where the table name is the current item of the ForEach activity.
@concat('select table_variables from control_tb where table_name = ''',item().table_name,'''')
Convert the Lookup2 activity output value to an array using a Set Variable activity.
@split(activity('Lookup2').output.firstRow.table_variables,',')
Create another pipeline (pipeline2) with 2 parameters (table name (string) and variables (array)) and add a ForEach activity in pipeline2.
Pass the array parameter to the ForEach activity in pipeline2 and use a Copy activity inside it to copy data from source to sink.
Finally, inside the ForEach activity of pipeline 1, add an Execute Pipeline activity that calls pipeline2, passing the current table name and the variables array.
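For the parameter values on that Execute Pipeline activity, expressions along these lines should work (the variable name is illustrative): @item().table_name for the table-name parameter, and @variables('table_variables') for the array parameter.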

How can Data Studio read a repeatable column as values of a single record?

I'm moving a Mongo collection into BigQuery to do analysis and visualizations in Google Data Studio. I'm specifically trying to map all results of a locations collection, which has multiple records, one for each location. Each record stores the lat/long as an array of 2 numbers.
In Data Studio, when I try to map the locations.coordinates value, it croaks, because it only pulls in the first value of the array. If, instead of mapping it, I output the result as a table, I see 2 rows for each record, with the _id being the same and locations.coordinates differing between a row that has the latitude (locations.coordinates[0]) and another row that has the longitude (locations.coordinates[1]).
I think I have to do this as a scheduled query in BigQuery that runs after every data sync, but I'm hoping there is a way to do this as a calculated field or a blended data set in Google Data Studio.
Data as it exists in Mongo
Data as it exists in BigQuery
Data as it exists in Data Studio
Additional: BigQuery record types
You can address values in arrays directly and transform your data accordingly using struct etc.:
WITH t AS (
  SELECT * FROM UNNEST([
    STRUCT('a' AS company, STRUCT([-71.2, 42.0] AS coordinates, 'Point' AS type) AS location),
    ('b', ([-71.0, 42.2], 'Point')),
    ('c', ([-71.4, 42.4], 'Point'))
  ])
)
--show source structure of example data
--SELECT * FROM t
SELECT
  * EXCEPT(location),
  STRUCT(
    location.coordinates[SAFE_OFFSET(0)] AS long,
    location.coordinates[SAFE_OFFSET(1)] AS lat,
    location.type
  ) AS location
FROM t
There's offset() for 0-based access, ordinal() for 1-based access, and with the safe_ prefix you don't trigger errors in case the index doesn't exist in the array. If you need to know when values are missing, then you should use the versions without safe_.
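A quick illustration (with made-up values):
SELECT
  arr[OFFSET(0)]      AS first_by_offset,   -- 0-based: 'a'
  arr[ORDINAL(1)]     AS first_by_ordinal,  -- 1-based: also 'a'
  arr[SAFE_OFFSET(5)] AS out_of_range       -- NULL (plain OFFSET(5) would raise an error)
FROM (SELECT ['a', 'b', 'c'] AS arr);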
Anyway, this structure is flat, by choosing specific values from the array. It should work with Data Studio or any other visualization tool; there are no repeated rows anymore.

Filter by length of array in Pig

I have data stored in Avro format. One of the fields of each record (array_field, say) is an array. Using Pig, how do I obtain only the records whose arrays have, for example, length(array_field) >= 2, and then store the results in Avro files using the same schema as the original input?
This should be doable with something like the code below:
A = LOAD '$INPUT' USING AvroStorage();
B = FILTER A BY SIZE(array_field) >= 2;
STORE B INTO '$OUTPUT' USING AvroStorage('schema', '<schema_here>');

OrientDB embeddedmap query

Let's say I have a Vertex class Data in OrientDB. Data has a property data which is of the type EMBEDDEDMAP.
I can create a new Vertex of this type and assign an object to the property data with the following command:
CREATE VERTEX Data SET data = {'key1':'val1', 'key2':'val2'}
Let's say now that I want to query the database and get records that hold exactly this structure in the data property.
I.e., something along the lines of:
SELECT FROM Data WHERE data = {"key1":"val1","key2":"val2"}
This doesn't work, however. (Note also that the structure in data is arbitrary and can have nested structures: {"key2":{"key2":"val2"}}, etc.)
I know that this query is possible for an embeddedmap type:
SELECT FROM Data WHERE "val1" IN data.key1 AND "val2" IN data.key2
But for arbitrary data structures it would be bothersome to construct such a query. This also made me find out another thing:
Let's say I create two vertices:
CREATE VERTEX Data SET data = {"key1":["one", "two"]}
CREATE VERTEX Data SET data = {"key1":["one"]}
I now want to select only the first of them, for instance with:
SELECT FROM Data WHERE ["one", "two"] IN data.key1
This query, however, returns both records:
#rid #version #class data
#13:7 1 Data {"key1":["one","two"]}
#13:8 1 Data {"key1":["one"]}
I'm guessing I have to do:
SELECT FROM Data WHERE "one" IN data.key1 AND "two" IN data.key1
However, this also seems quite cumbersome for nested lists.
Question: How could I query on a known, arbitrary data structure (embeddedmap)?
NOTE: I'm not asking about specific values in the structure, but about the whole structure.