How to create tff.simulation.datasets.ClientData with a custom number of client IDs from a CSV file? - tensorflow2.0

I have a CSV file containing all the data I need. I want to create a tff.simulation.datasets.ClientData from this CSV so that I can create any number of clients for training and testing. Can anyone tell me how to do it?
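One approach (a minimal sketch, not tested against your data): group the CSV rows by a client-id column with pandas and wrap the result in tff.simulation.datasets.TestClientData, which turns an in-memory dict into a ClientData. The file path and the column names client_id, x, and y below are assumptions; older TFF releases expose the same idea under a different class name (FromTensorSlicesClientData).

    import collections

    import pandas as pd
    import tensorflow_federated as tff

    df = pd.read_csv('data.csv')  # hypothetical path

    # Partition the rows by client id; each value is a structure that
    # tf.data.Dataset.from_tensor_slices can consume.
    client_data_dict = collections.OrderedDict(
        (str(client_id), collections.OrderedDict(
            x=group['x'].values, y=group['y'].values))
        for client_id, group in df.groupby('client_id'))

    client_data = tff.simulation.datasets.TestClientData(client_data_dict)

    # One tf.data.Dataset per client, e.g. for train/test splits:
    ds = client_data.create_tf_dataset_for_client(client_data.client_ids[0])

The number of clients is then controlled by how you assign the client-id column in the CSV (or by slicing client_ids before training).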

Related

Azure Data Factory 2 : How to split a file into multiple output files

I'm using Azure Data Factory and am looking for the complement to the "Lookup" activity. Basically I want to be able to write a single line to a file.
Here's the setup:
Read from a CSV file in blob store using a Lookup activity
Connect the output of that to a For Each
Within the For Each, take each record (a line from the file read by the Lookup activity) and write it to a distinct file, named dynamically.
Any clues on how to accomplish that?
Use Data Flow: use the Derived Column activity to create a filename column, then use that column in the sink. Details on how to implement dynamic filenames in ADF are described here: https://kromerbigdata.com/2019/04/05/dynamic-file-names-in-adf-with-mapping-data-flows/
Data Flow would probably be better for this, but as a quick hack, you can do the following to read the text file line by line in a pipeline:
Define your source dataset to output a line as a single column. Normally I would use "NoDelimiter" for this, but that isn't supported by Lookup. As a workaround, define it with an incorrect Column Delimiter (like | or \t for a CSV file). You should also go to the Schema tab, and CLEAR the schema. This will generate a column in the output named "Prop_0".
In the foreach activity, set the Items to the Lookup's "output.value" and check "Sequential".
Inside the foreach, you can use item().Prop_0 to grab the text of the line.
To the best of my understanding, creating a blob isn't directly supported by pipelines [hence my suggestion above to look into Data Flow]. It is, however, very simple to do in Logic Apps. If I were tackling this problem, I would create a logic app with an HTTP Request Received trigger, then call it from ADF with a Web activity and send the text line and dynamic file name in the payload.
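For reference, the per-line split the pipeline performs is trivial outside ADF; a local-filesystem sketch in Python (file names and the naming scheme are illustrative only):

    import csv
    from pathlib import Path

    src = Path('input.csv')   # hypothetical input file
    out_dir = Path('split')   # hypothetical output folder
    out_dir.mkdir(exist_ok=True)

    with src.open(newline='') as f:
        for i, row in enumerate(csv.reader(f)):
            # Name each output file dynamically, e.g. by row index
            # or by a key column from the row.
            with (out_dir / f'record_{i}.csv').open('w', newline='') as out:
                csv.writer(out).writerow(row)

Each ForEach iteration in the pipeline corresponds to one pass of this loop.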

How to load multiple CSV files into multiple tables

I have multiple CSV files in a folder.
Example:
Member.csv
Leader.csv
I need to load them into database tables.
I have worked on this using a Foreach Loop Container, Data Flow Task, Excel Source, and OLE DB Destination.
We can do it by using Expressions and Precedence Constraints, but how can I do it with a Script Task if I have more than 10 files? I got stuck on this one.
We have a similar issue, our solution is a mixture of the suggestions above.
We have a number of file types sent from our client on a daily basis.
These have a specific filename pattern (e.g. SalesTransaction20160218.csv, Product20160218.csv).
Each of these file types has a staging "landing" table of the structure you expect.
We then have a .NET Script Task that takes the filename pattern and loads that data into the landing table.
The CSV parser also performs various checks (matching the number of columns, some basic data validation) before loading into the landing table.
We are not good enough .NET programmers to dynamically parse an unknown file structure, create the SQL table, and then load the data in. I expect it is feasible; after all, that is what the SSIS Import/Export Wizard does (with some manual intervention).
As an alternative to this (the process is quite delicate), we are experimenting with an HDFS data landing area, which allows us to use analytic tools like R to parse the data within HDFS. After that, we use Pig to load the data into SQL.
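The pattern-to-landing-table loop described above can be sketched in Python; the patterns, table names, expected column counts, and connection string are all assumptions:

    import glob

    import pandas as pd
    from sqlalchemy import create_engine

    # Hypothetical mapping of filename pattern to landing table.
    PATTERNS = {
        'SalesTransaction*.csv': 'landing_sales_transaction',
        'Product*.csv': 'landing_product',
    }
    # Hypothetical expected column counts for basic validation.
    EXPECTED_COLUMNS = {'landing_sales_transaction': 12, 'landing_product': 5}

    engine = create_engine('mssql+pyodbc://...')  # hypothetical connection

    for pattern, table in PATTERNS.items():
        for path in glob.glob(pattern):
            df = pd.read_csv(path)
            # Check the column count before loading, as described above.
            if len(df.columns) != EXPECTED_COLUMNS[table]:
                raise ValueError(f'{path}: unexpected column count')
            df.to_sql(table, engine, if_exists='append', index=False)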

Pentaho generate dynamic password in files

I have a transformation with a Table Input step (which fetches data from the DB) and a CSV Output step (which saves the Table Input data into a CSV file),
and a job which runs this transformation on a weekly basis.
What I want now is that whenever my report is generated, a new dynamic password is created.
Please help me with this. I am using PDI.
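For the password-generation piece itself, a minimal sketch (the length and character set are assumptions; in PDI this logic could live in a scripting step whose output is passed to whatever step protects the file):

    import secrets
    import string

    ALPHABET = string.ascii_letters + string.digits  # assumed character set

    def new_password(length: int = 16) -> str:
        # Cryptographically secure random choice per character.
        return ''.join(secrets.choice(ALPHABET) for _ in range(length))

    print(new_password())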

Creating metadata dynamically from a flat .csv file in CC

I am having some difficulty dynamically creating metadata, which needs to be extracted from the header line of a flat .csv file in CC.
Usually, I manually define the metadata by selecting New Metadata --> Extract from flat file in CC. However, the file's metadata may change with additional columns, so I do not know the metadata of the file in advance and cannot define it with this static approach.
It would be helpful if you could suggest a solution for creating metadata dynamically and using this newly created metadata to connect to other components. Perhaps an example graph file for demonstration would be great.
Thanks,
Andy
I have discovered the following solution.
You just have to fill in the flat .csv filename in the CSV readers and writers.
MetaDataMaster.grf - runs the graphs below.
MetaDataCreator.grf - creates metadata according to the CSV header and writes it into the meta_example.fmt file.
MetaDataUser.grf - reads the CSV according to the created meta_example.fmt file; you can add a Reformat there and use just some of the predefined fields.
You can run the 2nd and 3rd graph separately to test it.
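What MetaDataCreator.grf does can be sketched in Python: read the header line and emit a delimited-record metadata file. The exact .fmt schema belongs to CloverETL, so treat the XML layout below as an approximation, and the input filename as a placeholder:

    import csv

    with open('example.csv', newline='') as f:  # hypothetical input
        header = next(csv.reader(f))

    # One string field per header column; types are assumed, not inferred.
    fields = '\n'.join(
        f'  <Field name="{name.strip()}" type="string"/>' for name in header)

    with open('meta_example.fmt', 'w') as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n'
                '<Record name="example" type="delimited" fieldDelimiter=",">\n'
                f'{fields}\n'
                '</Record>\n')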

Writing data flow to postgresql

I know that by doing:
COPY test FROM '/path/to/csv/example.txt' DELIMITER ',' CSV;
I can import CSV data into PostgreSQL.
However, I do not have a static CSV file. My CSV file gets downloaded several times a day, and it includes data that has previously been imported into the database. So, to keep the database consistent, I would have to leave out this old data.
My best-case idea would be something like the COPY statement above. The worst case would be a Java program that manually checks each entry of the database against the CSV file. Any recommendations for the implementation?
I really appreciate your answer!
You can dump the latest data into a temp table using the COPY command and then merge the temp table into the live table.
If you are using a Java program to execute the COPY command, try the CopyManager API.
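A sketch of the temp-table-plus-merge approach in Python with psycopg2 (the DSN, the key column id, and the anti-join merge strategy are assumptions; on a table with a unique key, INSERT ... ON CONFLICT DO NOTHING would work as well):

    import psycopg2

    conn = psycopg2.connect('dbname=mydb')  # hypothetical DSN
    with conn, conn.cursor() as cur:
        # Stage the fresh download in a temp table shaped like the live one.
        cur.execute('CREATE TEMP TABLE test_stage (LIKE test INCLUDING ALL)')
        with open('/path/to/csv/example.txt') as f:
            cur.copy_expert("COPY test_stage FROM STDIN DELIMITER ',' CSV", f)
        # Merge: keep only rows not already present in the live table.
        cur.execute('''
            INSERT INTO test
            SELECT s.* FROM test_stage s
            LEFT JOIN test t ON t.id = s.id
            WHERE t.id IS NULL''')

In Java, CopyManager.copyIn plays the same role as copy_expert here.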