I am trying to compare two CSV files, most of the time it will have same data but order of data will not be the same. Eg
csv file1
AAA,111,A1A1
BBB,222,B2B2
CCC,333,C3C3
CSV File2
CCC,333,C3C3
BBB,212,B2B2
AAA,111,A1A1
so I want to use third column as Primary key to compare other values. Report the difference. Is this possible to do it in Robotframework or Panda?
If you are making use of robotframework you need to do the following,
install robotframework-csvlib
Use Built-in Collections
Input from your question
csv file1
AAA,111,A1A1
BBB,222,B2B2
CCC,333,C3C3
csv file2
CCC,333,C3C3
BBB,212,B2B2
AAA,111,A1A1
My Solution
In the below approach, we are first reading csv into list of lists for both csv files and then comparing all the list of list items by making use of Collections KW List Should Contain Sub List, here, notice that we are passing an argument "values=True" which compares the value as well.
Code that compares 2 csv files
*** Settings ***
Library CSVLib
Library Collections
*** Test Cases ***
Test CSV
${list1}= read csv as list csv1.csv
log to console ${list1}
${list2}= read csv as list csv2.csv
log to console ${list2}
List Should Contain Sub List ${list1} ${list2} values=True
OUTPUT
(rf1) C:\Users\kgurupra>robot s1.robot
==============================================================================
S1
==============================================================================
Test CSV .[['C1,C2,C3'], ['AAA,111,A1A1'], ['BBB,222,B2B2'], ['CCC,333,C3C3']]
..[['C1,C2,C3'], ['CCC,333,C3C3'], ['BBB,212,B2B2'], ['AAA,111,A1A1']]
Test CSV | FAIL |
Following values were not found from first list: ['BBB,212,B2B2']
------------------------------------------------------------------------------
S1 | FAIL |
1 critical test, 0 passed, 1 failed
1 test total, 0 passed, 1 failed
==============================================================================
Output: C:\Users\kgurupra\output.xml
Log: C:\Users\kgurupra\log.html
Report: C:\Users\kgurupra\report.html
Assuming you've imported your CSV files as pandas DataFrames you can do the following to merge the two while retaining fundamental differences:
df = csv1.merge(csv2, on='<insert name primary key column here>',how='outer')
Adding the suffixes option allows you to more clearly differentiate between identically named columns from each file:
df = csv1.merge(csv2, on='<insert name>',how='outer',suffixes=['_csv1','_csv2'])
After that it depends on what kind of differences you are looking to spot but perhaps a starting point is:
df['difference_1'] = df['column1_csv1'] == df['column1_csv2']
this will create a boolean column which indicates True if observations are the same and False otherwise.
But there are nearly endless options for comparison.
I'm trying to add a new column to my file. I want to add the date to each row of my file.
Filename is: 2016-06-15.txt
The schema my file is:
A B C
7 8 13
I want to obtain:
Date A B C
2016-06-15 7 8 13
For that I'm using Pig with following script:
A = LOAD 'user/cloudera/Analytics/source/file.txt' using PigStorage(' ','-tagPath');
DUMP A ; ****--> ERROR****
STORE A INTO 'user/cloudera/Analytics/source/file.txt' USING PigStorage(' '); ****--> ERROR****
But I'm getting an error and I don't have any log available :( Anyone can help? Many thanks!
You will have to use the -tagFile option to get the file name as the first field.
Before that check to make sure the file path is correct.Looks like a forward slash is missing at the beginning of your file path.Ensure you are using the correct delimiter in PigStorage.Seems like the columns are separated by a tab or multiple spaces.Lastly choose a different folder to Store the new file or else you will get a file exists error.
A = LOAD '/user/cloudera/Analytics/source/2016-06-15.txt' using PigStorage(' ','-tagFile');
STORE A INTO '/user/cloudera/Analytics/NEW_source/2016-06-15.txt' USING PigStorage(' ');
I'm doing some optimization using a model whose number of constraints and variables exceeds the cap for the student version of, say, AMPL, so I've found a webpage [http://www.neos-server.org/neos/solvers/milp:Gurobi/AMPL.html] which can solve my type of model.
I've found however that when using a solver where you can provide a commandfile (which I assume is the same as a .run file) the documentation of NEOS server tells that you should see the documentation of the input file. I'm using AMPL input which according to [http://www.neos-guide.org/content/FAQ#ampl_variables] should be able to print the decision variables using a command file with the appearance:
solve;
display _varname, _var;
The problem is that NEOS claim that you cannot add the:
data datafile;
model modelfile;
commands into the .run file, resulting in that the compiler cannot find the variables.
Does anyone know of a way to work around this?
Thanks in advance!
EDIT: If anyone else has this problem (which I believe many people have based on my Internet search). Try to remove any eventual reset; command from the .run file!
You don't need to specify model or data commands in the script file submitted to NEOS. It loads the model and data files automatically, solves the problem, and then executes the script (command file) you provide. For example submitting diet1.mod model diet1.dat data and this trivial command file
display _varname, _var;
produces the output which includes
: _varname _var :=
1 "Buy['Quarter Pounder w/ Cheese']" 0
2 "Buy['McLean Deluxe w/ Cheese']" 0
3 "Buy['Big Mac']" 0
4 "Buy['Filet-O-Fish']" 0
5 "Buy['McGrilled Chicken']" 0
6 "Buy['Fries, small']" 0
7 "Buy['Sausage McMuffin']" 0
8 "Buy['1% Lowfat Milk']" 0
9 "Buy['Orange Juice']" 0
;
As you can see this is the output from the display command.
I have a table with about 900 records. A sample record looks like this:
Field Names:
ID FNN DSLAM_ID SHORT_CODE PORT_TYPE PANEL SLOT CHANNEL CONNECTION_TYPE SERVICE_TYPE PVCID CHANNEL_TYPE PROD_CODES
Record 1:
1 A99TEST9999 QXXXXENNNN ABCDE DSL48P 1 11 38 ABC ADSL RANDOMIDXXYY N ADESP=NNNNNNN_ABCDEFG_L2PPP
I'd like to build a text file, where for each record it builds a new line and inputs a specific field as a variable.
An example Line:
FNN="[FNN]" : ACTION="" : SERVICE_TYPE="[CONNECTION_TYPE]" : NE_ID="[DSLAM_ID]", NE_DEFN="[SERVICE_TYPE]", PORT="[PANEL] / [SLOT] / [CHANNEL]"
I've seen people write scripts to create Router Configurations before and essentially this is what I want to do to build a Mass Configuration File for an application.
You'll need to get the recordset object and then do something like this:
Open "yourfilename.txt" for Output as #1
While not (recordset.eof)
Print #1, "FNN=" & recordset.fields("FNN").value (add the rest of your string here...)
recordset.movenext
Wend
Close #1
Technically, instead of "#1" you should grab the file number by using the FreeFile() function.
I'm new to Octave/Matlab and I want to plot a 3D-Graph.
I was able to do so using a predefined formula, like this:
x=1:.1:5;
y=1:.1:5;
[xx,yy] = meshgrid(x,y);
z = sin(xx)+sin(yy);
mesh(x,y,z);
But now the question is how to do the same getting the data from a CSV (for example). I know I can use the function csvread, but the big question is how to format the CSV to contain such data.
An example of doing the same graph above but this time grabbing the data from Excel/CSV would be appreciated. Thanks!
Done! I was finally able to do it!
Here's how I did it:
1) I've created a file in Excel with the X values in the cells A2:A42, and the Y values in the cells B1:AP1 (so you form a rectangle).
2) Then in the cells in the middle I put the formula I want (ie =sin(A$2)+sin($B1))
3) Saved the file as CSV (but separated by spaces!) and manually edited it to look this way (the way QtOctave opens matrix files, in Matlab it might be different). For example (note the extra space before each column):
# Created by Octave 3.2.4, Thu Jan 12 19:32:05 2012 ART <diego#notebook2>
# name: z
# type: matrix
# rows: 3
# columns: 3
1 2 3
4 5 6
7 8 9
(if you're not sure how to do it, do what I did: create a simple matrix and export it to see how the exported file looks like!)
4) Octave has a function under Data -> Load matrix from file, which loads that kind of files. Or actually running this command (varname is the name of the resulting variable):
load("-text", "file-where-the-data-is", "varname")
5) Create the graph (ex is the name of the matrix I've just imported):
x=1:.1:5;
y=1:.1:5;
mesh(x,y,ex)