Load multiple CSV files from a directory using the DSE Graph Loader - DataStax

I have data in multiple CSV files (all with the same header) in a directory. I want to create vertices from those CSV files. How can I load all of the files in a single run of the DSE Graph Loader? I have more than 600 CSV files.

You are correct that the DSE Graph Loader is the tool for the job. The docs have a good example here.
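If listing 600 inputs one by one in the mapping script is the sticking point, a hedged workaround (separate from the loader's own multi-file support described in the docs) is to merge the identically-headered CSVs into one file first and point the loader at that single file. A minimal Python sketch, assuming the files sit in a csv_dir directory and all share exactly the same header row:

import csv
import glob

# Merge many CSV files that share the same header into one file,
# writing the shared header only once.
files = sorted(glob.glob("csv_dir/*.csv"))  # assumed input directory

with open("combined.csv", "w", newline="") as out:
    writer = None
    for path in files:
        with open(path, newline="") as f:
            reader = csv.reader(f)
            header = next(reader, None)     # same header in every file
            if header is None:              # skip empty files
                continue
            if writer is None:
                writer = csv.writer(out)
                writer.writerow(header)     # keep the header once
            for row in reader:
                writer.writerow(row)

The resulting combined.csv can then be declared as a single input in the graph loader mapping script.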

Related

How to add an extension to a file copy activity with Azure Data Factory

The datasets that I ingest from a REST API endpoint do not include the .json extension in the file names (even though they are JSON files). Can someone let me know where I can add a .json extension in the following scenarios?
Scenario 1: adding .json to the relativeURL
Scenario 2: adding .json to the sink
Scenario 3: adding .json to the source - however, I don't think this is possible
Can someone please take a look at the three scenarios and let me know if I can add a .json extension in any of those places?
Thanks to @Scott Mildenberger, we can provide the name of the file and its extension from the sink dataset.
The following is a demonstration of the same. I have a file called sample without an extension.
In the sink dataset, you can simply concatenate the extension to your filename (if it is just a single file, you can give the required name with the extension directly). I have used the following dynamic content (the fileName parameter value is req_filename).
@concat(dataset().fileName,'.json')
The following file would be generated in the sink.

Mosaic Decisions Azure BLOB writer node creating multiple files

I'm using the Mosaic Decisions data flow feature to read a file from Azure Blob Storage, do a few transformations, and write that data back to Azure. It worked fine, except that at the output file path I gave, it created a folder, and inside it I can see many files with strange names like "part-000" in them. What I need is a single file in that output location, not many. Is there a way around this?
Mosaic Decisions uses Apache Spark as its backend execution engine. In Spark, the DataFrame being written is split into multiple partitions, and these partitions are written to the output location in parallel. That is why it creates multiple files at the target location named "part-0000", "part-0001", and so on ("part" here refers to a partition).
The workaround for this is to check "combine-output-files-into-one" in the writer node. This will combine all of the part files into one big file. But use this with caution, and only if you really need a single file, as it comes with a performance trade-off.
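For context, the same effect in plain Spark comes from collapsing the DataFrame to a single partition before writing; presumably the writer-node checkbox does something along these lines internally, though that is an assumption rather than documented Mosaic behavior. A minimal PySpark sketch with hypothetical blob paths:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("single-output-file").getOrCreate()

# Hypothetical input path on Azure Blob Storage.
df = spark.read.csv(
    "wasbs://input@myaccount.blob.core.windows.net/data.csv",
    header=True,
)

# ... transformations go here ...

# coalesce(1) forces a single partition, so the write produces one part file
# instead of part-0000, part-0001, ... at the cost of losing write parallelism.
df.coalesce(1).write.mode("overwrite").csv(
    "wasbs://output@myaccount.blob.core.windows.net/result",
    header=True,
)

Even then, Spark writes a directory containing a single part file (plus a _SUCCESS marker), so turning that into one flat, nicely named file is a separate rename step.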

What is the need of uploading a CSV file for performance testing in BlazeMeter?

I am new to the testing world and have just started working on performance and load testing. I just want to know: when we record a script in BlazeMeter and upload a JMX file for a test, why do we upload a CSV file with it, and what data do we need to enter in that CSV file? Please help.
Thank you
You can generate data for testing (e.g. user names, passwords, etc.), save it in CSV format, and then read the values from the CSV file and use them in your test scenarios as needed. Please refer to the BlazeMeter documentation on using the CSV Data Set Config.
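As a concrete illustration (the file name, column names, and user count below are just examples, not anything BlazeMeter requires), such a data file can be generated with a short Python script:

import csv

# Generate a simple credentials file for data-driven load testing.
with open("users.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["username", "password"])        # header row
    for i in range(1, 101):                          # 100 example test users
        writer.writerow([f"user{i:03d}", f"Passw0rd{i:03d}"])

In the JMeter test plan, a CSV Data Set Config element pointed at users.csv exposes each column as a variable (${username}, ${password}), so each virtual user reads its own row and logs in with different credentials instead of every thread reusing the same hard-coded values.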

I can't see my transformation files (PDI) in the Pentaho User Console

I have CSV files in the local file system that contain the output of a program. I want to use Pentaho CDE to visualize those data in graphs (pie charts, ...).
I can do that if I upload my CSV files directly as a data source, but I would like to read from my file in real time as the data comes in.
I saw that I have to use PDI, and I created a transformation from the input CSV file to an output file. I can visualize the output.
I saved the transformation file (.ktr) in the pentaho-solutions directory, but I can't see it in the Pentaho User Console. I think I have to use a "kettle over kettleTransFromFile" data source, but I can't see my transformation file to load it! I refreshed the cache but still can't see it.
Am I doing this wrong?
Thank you for your time

Inserting realtime data into BigQuery with a file on Compute Engine?

I'm downloading realtime data into a CSV file on a Google Compute Engine instance and want to load this file into BigQuery for realtime analysis.
Is there a way for me to do this without first uploading the file to Cloud Storage?
I tried this: https://cloud.google.com/bigquery/streaming-data-into-bigquery but since my file isn't in JSON, this fails.
Have you tried the bq command-line tool? You can upload CSVs with it.
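If you prefer to script it from the instance rather than call the CLI, the BigQuery Python client can run the same kind of batch load job; a minimal sketch, assuming a hypothetical destination table and a CSV with a header row:

from google.cloud import bigquery

client = bigquery.Client()  # uses the instance's default credentials

table_id = "my_project.my_dataset.my_table"  # hypothetical destination table

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,   # skip the CSV header row
    autodetect=True,       # infer the schema from the data
)

with open("realtime.csv", "rb") as f:
    load_job = client.load_table_from_file(f, table_id, job_config=job_config)

load_job.result()  # wait for the load job to finish
print(client.get_table(table_id).num_rows, "rows now in table")

Note that this is a batch load job, not true streaming; the streaming insert API linked in the question takes individual rows as JSON objects rather than a CSV file, which is why the CSV failed there.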