Jmeter csv file for jdbc request as an input - sql

I have a question using CSV file as an input for Jmeter. Is it possible to use CSV file for storing data like select statements or update statements and then passing all that to Jmeter in JDBC request so it can execute them accordingly.
I'd appreciate any suggestion or pointing me to the right solution.
Thanks

You can create a variable from a csv file using the CSV Data Set Config http://jmeter.apache.org/usermanual/component_reference.html#CSV_Data_Set_Config
Then you just need to use them in your Jdbc Sampler.
The problem here is that you must have you sql on one line.
You can also create variable from a select JDBC Sample. (See variable Names)
http://jmeter.apache.org/usermanual/component_reference.html#JDBC_Request

Yes you can do that.your statements need to be in different column of the CSV file.
define 2 variable names in CSV file and call them in JDBC Request file ${variable name given in CSV file}.

Related

Dynamic filename in Data Factory dataflow source

I’m working with a pipeline that loads table data from onpremise SQL to a datalake csv file dynamically, sinking a .csv file for each table that I already set to load in a versionControl table in a AzureSQL using Foreach.
So, after load the data, i want to update the versionControl table with the lastUpdate date, based on the MAX(lastUpdate) field of each .csv file loaded. To accomplish that, i know that i need to add a dataflow after the copy activity, so i can use the aggregate transformation, but don’t know how to pass the filename to the source of the dataflow dynamically in a parameter.
Thanks!
2 options:
Parameterized dataset. Use a source dataset in the dataflow that has a parameter for the file name. You can then pass in that filename as a pipeline parameter.
Parameterized Source wildcard. You can also use a source dataset in the dataflow that points just to a folder in your container. You can then parameterize the wildcard property in the Source and send in the filename there as a pipeline parameter.

Azure Data Factory 2 : How to split a file into multiple output files

I'm using Azure Data Factory and am looking for the complement to the "Lookup" activity. Basically I want to be able to write a single line to a file.
Here's the setup:
Read from a CSV file in blob store using a Lookup activity
Connect the output of that to a For Each
within the For Each, take each record (a line from the file read by the Lookup activity) and write it to a distinct file, named dynamically.
Any clues on how to accomplish that?
Use Data flow, use the derived column activity to create a filename column. Use the filename column in sink. Details on how to implement dynamic filenames in ADF is describe here: https://kromerbigdata.com/2019/04/05/dynamic-file-names-in-adf-with-mapping-data-flows/
Data Flow would probably be better for this, but as a quick hack, you can do the following to read the text file line by line in a pipeline:
Define your source dataset to output a line as a single column. Normally I would use "NoDelimiter" for this, but that isn't supported by Lookup. As a workaround, define it with an incorrect Column Delimiter (like | or \t for a CSV file). You should also go to the Schema tab, and CLEAR the schema. This will generate a column in the output named "Prop_0".
In the foreach activity, set the Items to the Lookup's "output.value" and check "Sequential".
Inside the foreach, you can use item().Prop_0 to grab the text of the line:
To the best of my understanding, creating a blob isn't directly supported by pipelines [hence my suggestion above to look into Data Flow]. It is, however, very simple to do in Logic Apps. If I was tackling this problem, I would create a logic app with an HTTP Request Received trigger, then call it from ADF with a Web activity and send the text line and dynamic file name in the payload.

Trying to create a table and load data into same table using Databricks and SQL

I Googled for a solution to create a table, using Databticks and Azure SQL Server, and load data into this same table. I found some sample code online, which seems pretty straightforward, but apparently there is an issue somewhere. Here is my code.
CREATE TABLE MyTable
USING org.apache.spark.sql.jdbc
OPTIONS (
url "jdbc:sqlserver://server_name_here.database.windows.net:1433;database = db_name_here",
user "u_name",
password "p_wd",
dbtable "MyTable"
);
Now, here is my error.
Error in SQL statement: SQLServerException: Invalid object name 'MyTable'.
My password, unfortunately, has spaces in it. That could be the problem, perhaps, but I don't think so.
Basically, I would like to get this to recursively loop through files in a folder and sub-folders, and load data from files with a string pattern, like 'ABC*', and load recursively all these files into a table. The blocker, here, is that I need the file name loaded into a field as well. So, I want to load data from MANY files, into 4 fields of actual data, and 1 field that captures the file name. The only way I can distinguish the different data sets is with the file name. Is this possible? Or, is this an exercise in futility?
my suggestion is to use the Azure SQL Spark library, as also mentioned in documentation:
https://docs.databricks.com/spark/latest/data-sources/sql-databases-azure.html#connect-to-spark-using-this-library
The 'Bulk Copy' is what you want to use to have good performances. Just load your file into a DataFrame and bulk copy it to Azure SQL
https://docs.databricks.com/data/data-sources/sql-databases-azure.html#bulk-copy-to-azure-sql-database-or-sql-server
To read files from subfolders, answer is here:
How to import multiple csv files in a single load?
I finally, finally, finally got this working.
val myDFCsv = spark.read.format("csv")
.option("sep","|")
.option("inferSchema","true")
.option("header","false")
.load("mnt/rawdata/2019/01/01/client/ABC*.gz")
myDFCsv.show()
myDFCsv.count()
Thanks for a point in the right direction mauridb!!

How to add headers when we DUMP the data in output file using PIG scripts?

I tried to search for it but cannot find the tip/recommendations.
Here is my situation. I have all the data lined up correctly and output working fine using pig script. Stored the files in a output directory. The output files are more than 100 files so what i have done is accumulated the results file using another pig script.
I was wondering if there is anything in PIG LATIN that will help me add "Header" to the accumulated results file so that business users can quickly use it as it also has headers?
Please advise
If you are using DUMP in Pig script and redirecting the result to a single file, you can use DESCRIBE before DUMP. Doing so will append schema information as header to your output file
A = LOAD 'test' USING PigStorage() AS (col1:int, col2:chararray);
DESCRIBE A;
DUMP A;
output will be something like:
A: {col1: int,col2: chararray}
1,test
2,test
...
Pig can store the schema into a different file ".pig_schema" using PigStorage:
store A into 'outputFile' using PigStorage('\t', '-schema');
will save your data in the outputFile using tabs as delimiters and also creates the schema file.
You can store the header in a separate file, LOAD it and UNION it with your data. Then you need to do an ORDER BY (that might be tricky depending on your data).
Another way would be to use hadoop getmerge.
In general, this is not something pig is very good at, you might as well write a script in another language.

How to write SQL Query that matches data from a .csv file to a table in MySQL?

Is it possible for me to write an SQL query from within PhpMyAdmin that will search for matching records from a .csv file and match them to a table in MySQL?
Basically I want to do a WHERE IN query, but I want the WHERE IN to check records in a .csv file on my local machine, not a column in the database.
Can I do this?
I'd load the .csv content into a new table, do the comparison/merge and drop the table again.
Loading .csv files into mysql tables is easy:
LOAD DATA INFILE 'path/to/industries.csv'
INTO TABLE `industries`
FIELDS TERMINATED BY ';'
IGNORE 1 LINES (`nogaCode`, `title`);
There are a lot more things you can tell the LOAD command, like what char wraps the entries, etc.
I would do the following:
Create a temporary or MEMORY table on the server
Copy the CSV file to the server
Use the LOAD DATA INFILE command
Run your comparison
There is no way to have the CSV file on the client and the table on the server and be able to compare the contents of both using only SQL.
Short answer: no, you can't.
Long answer: you'll need to build a query locally, maybe with a script (Python/PHP) or just uploading the CSV in a table and doing a JOIN query (or just the WHERE x IN(SELECT y FROM mytmmpTABLE...))
For anyone new asking, there is this new tool that i used : Write SQL on CSV file