Append new information in CSV files to existing historical QVD - QlikView

Let's say I have a "master" QVD file named salesHistory.qvd, and I want to append new monthly sales from the file salesMarch.csv.
How do I do that without replacing existing information, but adding new months?
Thanks for the help!

By default, QlikView automatically appends table loads to a previously loaded table if the fields are identical. You can use this to your advantage by using a script similar to the following:
SalesHistory:
LOAD
*
FROM
[salesHistory.qvd] (qvd);
LOAD
*
FROM
[salesMarch.csv]
(txt, utf8, embedded labels, delimiter is ',', msq);
STORE SalesHistory INTO [salesHistory.qvd] (qvd);
This initially loads the contents of your salesHistory.qvd file into a table, then loads the contents of salesMarch.csv and concatenates it into the SalesHistory table (which already contains the contents of salesHistory.qvd).
The final STORE step saves this concatenated table into the salesHistory.qvd file by overwriting it completely.
In the above example, we use * as a field specifier to load all fields from the source files. This means that this approach only works if your QVD file contains the same fields (and field names) as your CSV file.
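If the field names in the CSV do not match the QVD exactly, you can list the fields explicitly and rename them so that they do. The sketch below uses made-up field names (OrderID, SaleDate, SaleAmount) purely for illustration; substitute your own:
SalesHistory:
LOAD
*
FROM
[salesHistory.qvd] (qvd);
LOAD
OrderID,
SaleDate AS Date, // rename the CSV columns to the names used in the QVD
SaleAmount AS Amount
FROM
[salesMarch.csv]
(txt, utf8, embedded labels, delimiter is ',', msq);
Once the renamed fields exactly match the QVD's field names, QlikView appends the CSV rows to SalesHistory automatically, and the STORE step works as before.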
Furthermore, since this script loads the full contents of the QVD file each time it is executed, it will start to duplicate data if it is executed more than once per month, as there is no check for which months already exist in the QVD file. If you need to execute it more than once per month (perhaps due to adjustments), you may wish to apply a WHERE clause to the load from salesHistory.qvd so that only data up to and including the previous month is loaded.
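As a rough sketch of that approach, assuming the data contains a date field (called SalesDate here purely for illustration), the reload from the QVD can be restricted to completed months:
SalesHistory:
LOAD
*
FROM
[salesHistory.qvd] (qvd)
WHERE MonthStart(SalesDate) < MonthStart(Today()); // keep only months before the current one
// ...then load the CSV and STORE back to the QVD exactly as before
Note that adding a WHERE clause means the QVD is no longer read with an optimised load, which is usually acceptable at monthly volumes.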
Finally, you may wish to alter the name of your CSV file so that it is always the same (e.g. salesCurrentMonth.csv) so that you do not have to change the filename in your script.

Related

load a csv file from a particular point onwards

I have the below code to read a particular CSV file. My problem is that the CSV file contains two sets of data, one underneath the other, with different headers.
The starting point of the second data set can vary daily, so I need something that lets me find the row in the CSV where the second dataset begins (it will always start with the number '302') and load the CSV from there. The problem I have is that the code below starts from where I need it to start, but it always includes the headers from the first part of the data, which is wrong.
USE csvImpersonation
FILE 'c:\\myfile.TXT'
SKIP_AT_START 3
RETURN #myfile
#loaddata = SELECT * FROM #myfile
where c1 = '302'
The below is a sample of the text file (after the first 3 rows are skipped, which are just full of file settings, dates, etc).
Any help is much appreciated

Azure Data Factory - How to create multiple datasets and apply different treatments on files in same blob container?

Starting up with Azure Data Factory here.
I have a scenario where I gather csv files (different sources and formats/templates) that I store in a single Azure blob container. I would like to extract the data to an SQL DB. I need to apply different treatments to the files before pushing the data to SQL, based on the format. The format is indicated in each file name (for example: Myfile-formatA-20201201).
I am unclear on my pipeline / dataset setup. I assume I need to create a new (input) dataset for each CSV format, but I cannot find a way to create differentiated datasets by relying on the different naming patterns. If I create a single input dataset instead, I can build a pipeline with differentiated copy activities using that same single input dataset and applying different filtering rules (based on my file naming pattern) - this seems to work fine for files having the same encoding, column delimiters etc., but, as expected, fails for files that do not.
I could not find any official information on how to apply filters when creating multiple datasets from files contained in the same container. Is it possible at all? Or is it a prerequisite to store files with different formats in different containers or directories?
I created a test that copies CSV files of different formats in one pipeline, then selects different copy activities according to the file name. I think this is the answer you want.
In my container, I created CSV files in two formats.
Create a dataset pointing to the input container:
Edit: Do not specify a file in the File Path
Use the Get Metadata1 activity to get the Child Items.
The output is as follows:
Then, in the ForEach1 activity, we can traverse this array. Add the dynamic content @activity('Get Metadata1').output.childItems to the Items tab.
Inside the ForEach1 activity, we can use a Switch1 activity and add the dynamic content @split(item().name,'-')[1] to the Expression. It will get the format name. For example: Myfile-formatA-20201201 -> formatA.
In the default case, we copy the CSV files of formatA.
Edit: in order to select only files with "formatA" in their name, in the copy activity, use the Wildcard file path option:
Key in @item().name, so we can specify one CSV file.
Add formatB case:
Then use the same source dataset.
Edit: as in the previous step, use the Wildcard file path option.
That's all. We can set a different sink for each of these Copy activities.

How to load data from a CSV file with a query in SSIS

I have a CSV file which contains millions of records/rows. The header row is like:
<NIC,Name,Address,Telephone,CardType,Payment>
In my scenario I want to load only the rows where "CardType" is equal to "VIP". How can I perform this operation without loading all the records in the file into a staging table?
I am not loading these records into a data warehouse. I only need to separate this data in the CSV file.
The question isn't super-clear, but it sounds like you want to do some processing of the rows before outputting them back into another CSV file. If that's the case, then you'll want to make use of the various transforms available, notably Conditional Split. In there, you can look for rows where CardType == "VIP" and send those down one output (call it "Valid Rows"), and send the others to the default output. Connect your valid rows output to your CSV destination and that should be it.

How to make a load script that loads multiple files into one table

I have multiple files named by the convention "data_YYYY.MM.xlsx".
I need to load all these files into one QlikView table, but when I do:
Tab:
load Name, Number from [data_*.csv];
I get Tab, Tab-1, Tab-2 tables, one for each file.
I've also tried:
Tab:
add load Name, Number from [data_*.csv];
With the same effect.
If anybody knows how to do it, please help.
This question makes no sense, unless you have omitted some detail.
QlikView will implicitly append all data to the original table 'Tab' by a statement such as:
Tab:
load Name, Number from [data_*.csv] (txt);
Note the file format specified in brackets.
Appending data implicitly occurs whenever a table is loaded with exactly the same field names as a table already created. So in your example, the first file encountered constitutes a load of data from that file. Assuming the field names are indeed referenced as per your question, the resultant table should have two fields in it: 'Name' and 'Number'. When the second file is encountered via the wildcard match, the second load takes place and it appends that data to the table 'Tab'.
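To illustrate, here is a minimal sketch with two hypothetical files following the naming convention from the question:
Tab:
LOAD Name, Number FROM [data_2019.01.csv] (txt);
// Same field names as Tab, so this load is appended to Tab automatically
LOAD Name, Number FROM [data_2019.02.csv] (txt);
After the second LOAD, Tab contains the rows of both files.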
If you wish NOT to rely upon IMPLICIT concatenation (QlikView terminology for appending data to an existing table), then you can write a FOR loop to load your files instead, whilst using the explicit CONCATENATE load prefix to point to the table you wish to append data to.
E.g.
// QV trick to 'declare' a table
Tab:
LOAD null() AS Name
AUTOGENERATE(1)
WHERE RecNo() < 0;
FOR EACH file IN filelist('data_*.csv')
CONCATENATE('Tab')
LOAD * FROM [$(file)] (txt);
NEXT file
This hack works for me:
tmp:
LOAD #1 inline [#1];
Tab:
Concatenate load Name, Number from [data_*.csv];
You can do it this way:
Load * from data_*.csv;
Simply use a mask in the file name.
And for completeness, one way with a loop (here using QVD files):
FOR Each db_schema in '2013-07','2013-08','2013-09'
LOAD ...., '$(db_schema)' AS db_schema
FROM [x-$(db_schema).qvd] (qvd);
NEXT
(This assumes you know the file names in advance.)
Here is another way, if the file names are different:
Tab:
load Name, Number from [data_1.csv];
join (Tab)
load Name, Number from [data_2.csv];

SSIS - Column names as Variable/Changed Programmatically

I'm hoping someone might be able to help me out with this one. I have 24 files in CSV format; they all have the same layout and need to be joined onto some pre-existing data. Each file has a single column that needs to be joined onto the rest of the data, but those columns all have the same names in the original files. I need the columns automatically renamed to the filename as part of the join.
The final name of the column needs to be: Filename - data from another column.
My current approach is to use a foreach container and use the variable generated by the container to name the column, but there's nowhere I can input that value in the join, and even if I did, it'd mess up the output mappings, because the column names would be different.
Does anyone have any thoughts about how to get around these issues? Whoever has an idea will be saving my neck!
EDIT In case some more detail helps with this... SSIS version is 2008 and there are only a few hundred rows per file. It's basically a one time task to collect a full billing history from several bills which are issued monthly.
The source data has three columns, the product number, the product type and the cost.
The destination needs to have 24*3 columns, each of which has a monthly cost for a given product category. There are three product categories and 24 bills (in separate files), hence 24*3.
So hopefully I'm being a bit clearer - all I really need to know how to do, is to change the name of a column using a variable passed in from the foreach file container.
I think the easiest approach is to create a tmp database (aka staging DB), load the data from the files into it, and define stored procedures where you can pass parameters (e.g. file names) and build your own logic...
Cheers, Mario