Importing CSV data to SQL using PowerShell

Hi Glorious People of the Interwebz!
I come to you with a humble question (please go easy on me, I am fairly OK in PowerShell, but my SQL skills are minimal... :( )
So I have been tasked with writing a PowerShell script to import data from a number of CSV files into a database, and I have made good progress, based on this (I heavily modified my version). All works dashingly, except one part: when I try to insert the values (I created a sort of "mapping file" to map the CSV headers to the data), I can't seem to use the created string in the VALUES part. So here is what I have:
This is my current PowerShell code (ignore the comments)
This is a sample data CSV
This is my mapping file
What I would want is to replace the
VALUES(
'$($CSVLine.Invoice_Status_Text)',
'$($CSVLine.Invoice_Status)',
'$($CSVLine.Dispute_Required_Text)',
'$($CSVLine.Dispute_Required)',
'$($CSVLine.Dispute_Resolved_Text)',
'$($CSVLine.Dispute_Resolved)',
'$($CSVLine.Sub_Account_Number)',
'$($CSVLine.QTY)',
'$($CSVLine.Date_of_Service)',
'$($CSVLine.Service)',
'$($CSVLine.Amount_of_Service)',
'$($CSVLine.Total)',
'$($CSVLine.Location)',
'$($CSVLine.Dispute_Reason_Text)',
'$($CSVLine.Dispute_Reason)',
'$($CSVLine.Numeric_counter)'
);"
part, for example, with a string generated this way:
But when I replace the long - and honestly, boring to type - values with the $valueString, I get this type of error:
Incorrect syntax was encountered while parsing '$($'.
Not sure if it matters, but my PowerShell version is 7.1.
Can any good people give a suggestion on how to build the VALUES part from my text file...?
Ta,
F.

As commented, wrapping variables inside single quotes makes PowerShell take them literally, so you do not get the contained value (7957) but the literal string $($CSVLine.Numeric_counter) instead.
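A quick demo of the difference (the sample object here is made up purely for illustration):

# single quotes keep the text as typed; double quotes expand subexpressions
$CSVLine = [pscustomobject]@{ Numeric_counter = 7957 }
'VALUES($($CSVLine.Numeric_counter))'   # -> VALUES($($CSVLine.Numeric_counter))
"VALUES($($CSVLine.Numeric_counter))"   # -> VALUES(7957)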
I don't do SQL a lot, but I think I would change the part where you construct the values to insert like this:
# demo, read the csv file in your example
$csv = Import-Csv D:\Test\test.csv -Delimiter ';'
# demo, these are the headers (or better yet, the Property Names to use from the objects in the CSV) as ARRAY
# (you use `$headers = Get-Content -Path 'C:\Temp\SQL\ImportingCSVsIntoSQLv1\config\headers.txt'`)
$headers = 'Invoice_Status_Text','Invoice_Status','Dispute_Required_Text','Dispute_Required',
'Dispute_Resolved_Text','Dispute_Resolved','Sub_Account_Number','QTY','Date_of_Service',
'Service','Amount_of_Service','Total','Location','Dispute_Reason_Text','Dispute_Reason','Numeric_counter'
# capture formatted blocks of values for each row in the CSV
$AllValueStrings = $csv | ForEach-Object {
    # get a list of values using the property names you have in $headers
    $values = foreach ($propertyName in $headers) {
        $value = $_.$propertyName
        # output the VALUE to be captured in $values
        # for SQL, single-quote the string type values; numeric values without quotes
        if ($value -match '^[\d\.]+$') { $value }
        else { "'{0}'" -f $value }
    }
    # output the values for this row in the CSV
    $values -join ",`r`n"
}
# $AllValueStrings will now have as many formatted values to use
# in the SQL as there are records (rows) in the csv
$AllValueStrings
Using your examples, $AllValueStrings would yield
'Ready To Pay',
1,
'No',
2,
'',
'',
'',
'',
'',
'',
'',
'',
'',
'',
'',
7957
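To plug this back into your INSERT, you could then build one statement per row from $AllValueStrings. A minimal sketch, assuming a table name (dbo.Invoices here is made up), a column list built from $headers, and Invoke-Sqlcmd from the SqlServer module (use whatever execution method you already have):

# sketch: build and run one INSERT per CSV row; table and connection details are assumptions
$columnList = $headers -join ', '
foreach ($valueString in $AllValueStrings) {
    $query = "INSERT INTO dbo.Invoices ($columnList)`r`nVALUES (`r`n$valueString`r`n);"
    Invoke-Sqlcmd -ServerInstance 'YourSqlServer' -Database 'YourDatabase' -Query $query
}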

Related

Value cannot be null error in PowerShell

I'm trying to import my CSV file into my SQL database like this, but I'm not sure why it's saying
Exception calling "ExecuteWithResults" with "1" argument(s): "Value cannot be null.
Parameter name: sqlCommands"
even though I don't have any null values in my CSV file, and I have also made sure my table columns accept null values.
$s = New-Object Microsoft.SqlServer.Management.Smo.Server "server name"
$db = $s.Databases.Item("LitHold")
$csvfile = import-csv -delimiter ";" -path "C:\scripts\LitHold-OneDrive\output\Return\2022-01-12-Return.csv"
$csvfile |foreach-object{
$query = "insert into DailyReport VALUES ('$($_.MIN)','$($_.MID)','$($_.UPN)','$($_.Change)','$($_.Type)','$($_.HoldValue)','$($_.OneDrive)','$($_.Mailbox)','$($_.Created)','$($_.Modified)','$($_.MultMID)','$($_.Account)','$($_.ExistOD)')"
}
$result = $db.ExecuteWithResults($query)
# Show output
$result.Tables[0]
My CSV file
// The top line is my column names, and they are already inside my table
"MIN","MID","UPN","Change","Type","Hold Value","OneDrive","Mailbox","Created","Modified","Mult MID","Account","Exist OD"
"338780228","lzlcdg","lzlcdg#NAMQA.COM","Hold Created","OneDrive and Mailbox","Y","https://devf-my.sharepoint.com/personal/lzlcdg_namqa_corpqa_geuc_corp_com","lzlcdg#NAMQA.CORPQA.GEUC.CORP.COM","1/11/2022 11:38:57 AM","1/11/2022 11:38:57 AM","N","",""
"419150027","lzs8rl","lzs8rl#.corpq.com","Hold Created","OneDrive and Mailbox","Y","https://my.sharepoint.com/personal/lzs8rl_namqa_corpqa_gcom","lzs8rl#namqa.corpqa.geuc.corp.com","1/11/2022 11:39:05 AM","1/11/2022 11:39:05 AM","N","",""
Don't remove the column headers, but double-check how they are written: with spaces.
Your code ignores those spaces here:
$($_.HoldValue) --> $($_.'Hold Value')
$($_.MultMID) --> $($_.'Mult MID')
$($_.ExistOD) --> $($_.'Exist OD')
Either keep the code and rewrite the headers (take out the spaces), or make sure you use the property names exactly as they appear in the headers.
If you remove the column headers, the first data line in the CSV file will be used as column headers unless you supply new ones with the -Header parameter. Removing the headers will cause problems if the same field value is encountered more than once, because column headers must be unique.
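For example, if the file really had no header line, you could supply the names yourself (a sketch using the same file path as above):

# hypothetical: only needed when the CSV has no header row of its own
$csvfile = Import-Csv -Delimiter ';' -Path 'C:\scripts\LitHold-OneDrive\output\Return\2022-01-12-Return.csv' -Header 'MIN','MID','UPN','Change','Type','Hold Value','OneDrive','Mailbox','Created','Modified','Mult MID','Account','Exist OD'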
Then there is this line:
$result = $db.ExecuteWithResults($csvfile)
which should be
$result = $db.ExecuteWithResults($query)
AND there is no point in looping over the records of the CSV file and, inside that loop, overwriting your query string on every iteration, so that only the last record will remain...
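Putting those fixes together, the loop could look something like this (a sketch based on your code above; the insert now runs once per record, and the spaced column names are quoted):

$csvfile | ForEach-Object {
    # build the INSERT for this row, using the property names exactly as they appear in the CSV header
    $query = "insert into DailyReport VALUES ('$($_.MIN)','$($_.MID)','$($_.UPN)','$($_.Change)','$($_.Type)','$($_.'Hold Value')','$($_.OneDrive)','$($_.Mailbox)','$($_.Created)','$($_.Modified)','$($_.'Mult MID')','$($_.Account)','$($_.'Exist OD')')"
    # execute inside the loop so every record is inserted, not just the last one
    $null = $db.ExecuteWithResults($query)
}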

Run .sql file containing PL/SQL in PowerShell

I would like to find a way to run a .sql file containing PL/SQL in PowerShell using the .NET Data Provider for Oracle (System.Data.OracleClient).
I would definitely like to avoid using sqlplus for this task.
This is where I am now:
add-type -AssemblyName System.Data.OracleClient
function Execute-OracleSQL
{
    Param
    (
        # UserName required to login
        [string]
        $UserName,
        # Password required to login
        [string]
        $Password,
        # DataSource (This is the TNSNAME of the Oracle connection)
        [string]
        $DataSource,
        # SQL File to execute.
        [string]
        $File
    )
    Begin
    {
    }
    Process
    {
        $FileLines = Get-Content $File
        $crlf = [System.Environment]::NewLine
        $Statement = [string]::Join($crlf, $FileLines)
        $connection_string = "User Id=$UserName;Password=$Password;Data Source=$DataSource"
        try {
            $con = New-Object System.Data.OracleClient.OracleConnection($connection_string)
            $con.Open()
            $cmd = $con.CreateCommand()
            $cmd.CommandText = $Statement
            $cmd.ExecuteNonQuery();
        } catch {
            Write-Error ("Database Exception: {0}`n{1}" -f $con.ConnectionString, $_.Exception.ToString())
            stop-transcript
            exit 1
        } finally {
            if ($con.State -eq 'Open') { $con.close() }
        }
    }
    End
    {
    }
}
but I keep getting the following error message:
ORA-00933: SQL command not properly ended
The content of the file is pretty basic:
DROP TABLE <schema name>.<table name>;
create table <schema name>.<table name>
(
seqtuninglog NUMBER,
sta number,
msg varchar2(1000),
usrupd varchar2(20),
datupd date
);
The file does not contain PL/SQL. It contains two SQL statements, with a semicolon statement separator between them (and another one at the end, which you've said you've removed).
You call ExecuteNonQuery with the contents of that file, but that can only execute a single statement, not two at once.
You have a few options. Off the top of my head and in no particular order:
a) split the statements into separate files, and have your script read and process them in the right order;
b) keep them in one file and have your script split that into multiple statements, based on the separating semicolon (a rough sketch of this is shown after the list) - which is a bit messy and gets nasty if you will actually have PL/SQL at some point, since that has semicolons within one 'statement' block, unless you change everything to use /;
c) wrap the statements in an anonymous PL/SQL block in the file, but as you're using DDL (drop/create) those would also then have to change to dynamic SQL;
d) have your script wrap the file contents in an anonymous PL/SQL block, but then that would have to work out if there is DDL and make that dynamic on the fly;
e) find a library to deal with the statement manipulation so you don't have to work out all the edge cases and combinations (no idea if such a thing exists);
f) use SQL*Plus or SQLcl, which you said you want to avoid.
There may be other options but they all have pros and cons.
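For illustration, option (b) could look roughly like this inside the Process block, splitting the file contents on semicolons and running each piece as its own command. This is only a sketch: it assumes plain SQL statements (no PL/SQL blocks and no semicolons inside string literals) and reuses the $con connection object from the function above.

# sketch of option (b): run each semicolon-separated statement on its own
$Statements = (Get-Content $File -Raw) -split ';' |
    ForEach-Object { $_.Trim() } |
    Where-Object { $_ }            # drop empty fragments, e.g. after the final semicolon
foreach ($Statement in $Statements) {
    $cmd = $con.CreateCommand()
    $cmd.CommandText = $Statement  # no trailing semicolon, so ORA-00933 is avoided
    [void]$cmd.ExecuteNonQuery()
}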

How to parse the XML output from postgres as input for basex in Linux

How can I parse the XML output from Postgres as an input for Basex in Linux?
Oh, I see my answer is somewhat outdated; yet I'll leave it here, as in my opinion the approach you describe in your answer might be overkill for the task at hand.
I am not sure if you even have a question, yet I'd like to propose a fundamentally leaner approach ;-)
I hope it helps a little! Have fun!
For the current use case you may throw away awk, sed, Postgres and wget; you can do all that you need in 25 lines of XQuery:
1) Some basics, fetch a file from a remote server:
fetch:text('https://www.wien.gv.at/statistik/ogd/vie_101.csv')
2) Skip the first line.
I decided to use the header that came with the original file, but you could handle this differently.
fetch:text('https://www.wien.gv.at/statistik/ogd/vie_101.csv')
=> tokenize(out:nl()) (: Split string by newline :)
=> tail() (: Skip first line :)
=> string-join(out:nl()) (: Join strings with newline :)
So, in total, your requirements condense to:
RQ1.:
(: Fetch CSV as Text, split it per line, skip the first line: :)
let $lines := fetch:text('https://www.wien.gv.at/statistik/ogd/vie_101.csv')
=> tokenize(out:nl()) (: Split string by newline :)
=> tail() (: Skip first line :)
=> string-join(out:nl()) (: Join strings with newline :)
(: Parse the csv file, first line contains element names.:)
let $csv := csv:parse($lines, map { "header": true(), "separator": ";"})
for $record in $csv/csv/record
group by $date := $record/REF_DATE
order by $date ascending
return element year_total {
    attribute date { $date },
    attribute population { sum($record/POP_TOTAL) => format-number("0000000") }
}
RQ 2.:
(: Fetch CSV as Text, split it per line, skip the first line: :)
let $lines := fetch:text('https://www.wien.gv.at/statistik/ogd/vie_101.csv')
=> tokenize(out:nl()) (: Split string by newline :)
=> tail() (: Skip first line :)
=> string-join(out:nl()) (: Join strings with newline :)
(: Parse the csv file, first line contains element names.:)
let $csv := csv:parse($lines, map { "header": true(), "separator": ";"})
for $record in $csv/csv/record
group by $date := $record/REF_DATE
order by $date ascending
return element year_total {
    attribute date { $date },
    attribute population { sum($record/POP_TOTAL) => format-number("0000000") },
    for $sub_item in $record
    group by $per-district := $sub_item/DISTRICT_CODE
    return element district {
        attribute name { $per-district },
        attribute population { sum($sub_item/POP_TOTAL) => format-number("0000000") }
    }
}
Including the file write and the date formatted in a more readable way:
(: wrap elements in single root element :)
let $result := element result {
    (: Fetch CSV as Text, split it per line, skip the first line: :)
    let $lines := fetch:text('https://www.wien.gv.at/statistik/ogd/vie_101.csv')
        => tokenize(out:nl())      (: Split string by newline :)
        => tail()                  (: Skip first line :)
        => string-join(out:nl())   (: Join strings with newline :)
    (: Parse the csv file, first line contains element names. :)
    let $csv := csv:parse($lines, map { "header": true(), "separator": ";" })
    for $record in $csv/csv/record
    group by $date := $record/REF_DATE
    order by $date ascending
    return element year_total {
        attribute date { $date => replace("^(\d{4})(\d{2})(\d{2})", "$3.$2.$1") },
        attribute population { sum($record/POP_TOTAL) => format-number("0000000") },
        for $sub_item in $record
        group by $per-district := $sub_item/DISTRICT_CODE
        return element district {
            attribute name { $per-district },
            attribute population { sum($sub_item/POP_TOTAL) => format-number("0000000") },
            $sub_item
        }
    }
}
return file:write("result.xml", $result)
Setup
Data source : http://www.wien.gv.at/statistik/ogd/vie_101.csv
Research questions (RQ):
RQ1: How many people lived in Vienna in total per census?
RQ2: How many people lived in each Viennese district per census?
Preparation
In order to answer the RQs, the Postgres DB was chosen. Adhering to the proverbial saying "Where there's a shell, there's a way", this code shows a neat solution for the BASH (CLI, Debian/Ubuntu flavored). Also, it is much easier to interact with Postgres from the BASH when creating the files needed for further processing. Regarding the installation process please consult:
https://tecadmin.net/install-postgresql-server-on-ubuntu/
First download the file with wget:
cd /path/to/directory/ ;
wget -O ./vie_101.csv http://www.wien.gv.at/statistik/ogd/vie_101.csv ;
Then look at the file with your favorite spreadsheet program (LibreOffice Calc). vie_101 should be in UTF-8 encoding and probably uses a semicolon (;) delimiter. Open, check, change, save.
Some reformatting is needed for ease of processing down the line. First, a header file is created with the appropriate column names. Second, the downloaded file is "beheaded" (the first 2 rows are removed) and "cut" (into the columns of interest). Finally, it is appended to the header file.
echo 'DISTRICT,POPULATION,MALE,FEMALE,DATE' > ./vie.csv ;
declare=$(sed -e 's/,/ INT,/g' ./vie.csv)' INT' ;
sed 's/\;/\,/g' ./vie_101.csv | sed 's/\.//g' | tail -n+3 | cut -d ',' -f4,6-9 >> ./vie.csv ;
Postgres
In order to load data into Postgres, a schema needs to be created first:
echo "create table vie ( $declare );" | sudo -u postgres psql ;
In order to actually load data into Postgres, the previously created and formatted file (vie.csv) needs to be copied by the superuser into the folder accessible by postgres. Only then can the copy command be executed to load the data into Postgres. It needs to be noted that root privileges are required for this operation (sudo).
sudo cp ./vie.csv /var/lib/postgresql/ ;
echo "\copy vie from '/var/lib/postgresql/vie.csv' delimiter ',' csv header ;" | sudo -u postgres psql ;
XML Schema
Before we create our XML document, we have to design the structure of our file. We decided to create an XML schema (schema.xsd) instead of a DTD.
Our schema defines a root element <table> and its child <row>, which are complex elements. The <row> element can occur any number of times. The children of <row> are <district>, <population>, <male>, <female> and <date>. These 5 elements (siblings) are simple elements, and the defined value type is always an integer.
Create XML with Postgres
Since the ultimate goal is to answer the RQs via an XQuery, an XML file is needed. This file (vie_data.xml) needs to be correctly formatted and well formed. As the next step, the query_to_xml command is piped to psql; the flags -Aqt are used to:
-A [aligned mode disable, remove header and + at end of line]
-q [quiet output]
-t [tuples only, removes footer]
echo "select query_to_xml( 'select * from vie order by date asc', true,
false, 'vie' ) ;" | sudo -u postgres psql -Aqt > ./vie_data.xml ;
Now, it is important to export the schema of the table with table_to_xmlschema().
echo "select table_to_xmlschema( 'vie', true, false, '') ;" | sudo -u
postgres psql -Aqt > ./vie_schema.xsd ;
This concludes all tasks within Postgres and the BASH. As the last command, BaseX can be launched:
basexgui
XQuery
Using BaseX, the XML file can easily be validated against the schema via:
validate:xsd('vie_data.xml', 'vie_schema.xsd')
The XML file can be imported by clicking:
Database -> New
General -> Browse -> select the XML file.
Parsing -> turn on "Enable Namespaces" if it is not enabled.
OK
RQ1 can only be answered by grouping the data by 'DATE' via a for loop. Results are saved via:
file:write( 'path/to/directory/file_name' )
file:write( '/path/to/directory/population_year_total.xml',
for $row in //table/row
group by $date := $row/date
order by $date ascending
return <year_total date="{$date}"
population="{sum($row/population)}">
</year_total>)
RQ2 is answered by nesting two for loops. The outer loop groups by DATE and returns the POPULATION total for each DATE given. The inner loop groups by DISTRICT; hence, it returns a sub-sum of the POPULATION.
file:write( '/path/to/directory/district_year_subtotal.xml',
for $row in //table/row
group by $date:= $row/date
order by $date ascending
return <sub_sum date="{$date}"
population="{sum($row/population)}">{
for $sub_item in $row
group by $district := $sub_item/district
order by $district ascending
return <sub_item district="{$district}"
population="{sum($sub_item/population)}"/>
}</sub_sum>)
Done

Inserting a file into a Postgres bytea column using perl/SQL

I'm working with a legacy system and need to find a way to insert files into a pre-existing Postgres 8.2 bytea column using Perl.
So far my searching has led me to believe the following:
there is no consensus best approach for this.
lo_import looks promising, but I'm apparently too perl-tarded to get it to work.
I was hoping to do something like the following:
my $bind1 = "foo";
my $bind2 = "123";
my $file = "/path/to/file.ext";
my $q = q{
INSERT INTO generic_file_table
(column_1,
column_2,
bytea_column
)
VALUES
(?, ?, lo_import(?))
};
my $sth = $dbh->prepare($q);
$sth->execute($bind1, $bind2, $file);
$sth->finish();
My script works w/o the lo_import/bytea part. But with it I get this error:
DBD::Pg::st execute failed: ERROR: column "contents" is of type bytea but expression is of type oid at character 176
HINT: You will need to rewrite or cast the expression.
What I think I'm doing wrong is that I'm not passing the actual binary file to the DB properly. I think I'm passing the file path, but not the file itself. If that's true then what I need to figure out is how to open/read the file into a tmp buffer, and then use the buffer for the import.
Or am I way off base here? I'm open to any pointers, or alternative solutions as long as they work with Perl 5.8/DBI/PG 8.2.
Pg offers two ways to store binary files:
large objects, in the pg_largeobject table, which are referred to by an oid. Often used via the lo extension. May be loaded with lo_import.
bytea columns in regular tables. Represented as octal escapes like \000\001\002fred\004 in PostgreSQL 9.0 and below, or as hex escapes by default in Pg 9.1 and above eg \x0102. The bytea_output setting lets you select between escape (octal) and hex format in versions that have hex format.
You're trying to use lo_import to load data into a bytea column. That won't work.
What you need to do is send PostgreSQL correctly escaped bytea data. In a supported, current PostgreSQL version you'd just format it as hex, bang a \x in front, and you'd be done. In your version you'll have to escape it as octal backslash-sequences and (because you're on an old PostgreSQL that doesn't use standard_conforming_strings) probably have to double the backslashes too.
This mailing list post provides a nice example that will work on your version, and the follow-up message even explains how to fix it to work on less prehistoric PostgreSQL versions too. It shows how to use parameter binding to force bytea quoting.
Basically, you need to read the file data in. You can't just pass the file name as a parameter - how would the database server access the local file and read it? It'd be looking for a path on the server.
Once you've read the data in, you need to escape it as bytea and send that to the server as a parameter.
Update: Like this:
use strict;
use warnings;
use 5.16.3;
use DBI;
use DBD::Pg;
use DBD::Pg qw(:pg_types);
use File::Slurp;
die("Usage: $0 filename") unless defined($ARGV[0]);
die("File $ARGV[0] doesn't exist") unless (-e $ARGV[0]);
my $filename = $ARGV[0];
my $dbh = DBI->connect("dbi:Pg:dbname=regress","","", {AutoCommit=>0});
$dbh->do(q{
DROP TABLE IF EXISTS byteatest;
CREATE TABLE byteatest( blah bytea not null );
});
$dbh->commit();
my $filedata = read_file($filename);
my $sth = $dbh->prepare("INSERT INTO byteatest(blah) VALUES (?)");
# Note the need to specify bytea type. Otherwise the text won't be escaped,
# it'll be sent assuming it's text in client_encoding, so NULLs will cause the
# string to be truncated. If it isn't valid utf-8 you'll get an error. If it
# is, it might not be stored how you want.
#
# So specify {pg_type => DBD::Pg::PG_BYTEA} .
#
$sth->bind_param(1, $filedata, { pg_type => DBD::Pg::PG_BYTEA });
$sth->execute();
undef $filedata;
$dbh->commit();
Thank you to those who helped me out. It took a while to nail this one down. The solution was to open the file and store its contents, then specifically call out the bind variable that is of type bytea. Here is the detailed solution:
.....
## some variables
my $datum1 = "foo";
my $datum2 = "123";
my $file = "/path/to/file.dat";
my $contents;
## open the file and store it
open my $FH, '<', $file or die "Could not open file: $!";
{
local $/ = undef;
$contents = <$FH>;
};
close $FH;
print "$contents\n";
## prepare SQL
my $q = q{
INSERT INTO generic_file_table
(column_1,
column_2,
bytea_column
)
VALUES
(?, ?, ?)
};
my $sth = $dbh->prepare($q);
##bind variables and specifically set #3 to bytea; then execute.
$sth->bind_param(1,$datum1);
$sth->bind_param(2,$datum2);
$sth->bind_param(3,$contents, { pg_type => DBD::Pg::PG_BYTEA });
$sth->execute();
$sth->finish();

How to match similar filenames and rename so that diff tools like Beyond Compare see them as a pair to perform a binary comparison?

I'm looking for the best approach to comparing files that I believe are identical but which have different filenames. Comparison tools like BeyondCompare are great but they don't yet handle different filenames - when comparing files in separate folders they attempt comparisons with the files that have the same name on either side.
(I don't work for or have a financial interest in BeyondCompare, but I use the tool a lot and find it has some great features).
There is MindGems Fast Duplicate File Finder for matching files with different names in any location throughout several folder trees, but this is based on CRC checks, I believe. I am using this tool but am only gradually coming to trust it; so far no faults, but I don't trust it as much as BeyondCompare yet. BeyondCompare offers the complete peace of mind of doing a full binary compare on the file.
In my case the files tend to have similar names, the differences being the ordering of the words, punctuation, case and not all words being present. So it's not easy to use the regex filename filters that some diff tools like Beyond Compare already provide, because the filename substrings can be out of order.
I'm looking for a way to match similar filenames before renaming the files to be the same and then 'feeding' them to a tool like BeyondCompare. Solutions could be scripts or perhaps in the form of an application.
At the moment I have an idea for an algorithm (to implement in Perl) to match the filenames, suited to my problem where the filenames are similar as described above (a rough sketch of the scoring step is shown after the list below).
Can you suggest something better or a completely different approach?
Find a list of files with the exact same filesize
Make a hash of alphanumeric substrings from the first file, using non-alphanumeric characters or space as delimiter
Make a hash of alphanumeric substrings from the second file, using non-alphanumeric characters or space as delimiter
Match occurrences
Find which file has the highest number of substrings.
Calculate a percentage score for the comparison of the pair, based on the number of matches divided by the highest number of substrings.
Repeat the comparison for each file with every other file of the exact same file size
Sort the pair comparisons by percentage score to get suggestions of files to compare.
Rename one file in the pair so that it is the same as the other. Place in separate folders.
Run a comparison tool like BeyondCompare with the files, in folder comparison mode.
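To make the scoring idea concrete, here is a minimal sketch of steps 2-6 (written in PowerShell purely to illustrate the logic; the tokenisation on non-alphanumeric characters, the case-insensitive matching and the example names are my own assumptions, and the same approach ports directly to Perl):

function Get-NameSimilarity {
    param([string]$NameA, [string]$NameB)
    # split each name into alphanumeric substrings, using anything else as a delimiter
    $tokensA = @($NameA -split '[^A-Za-z0-9]+' | Where-Object { $_ } | ForEach-Object { $_.ToLower() })
    $tokensB = @($NameB -split '[^A-Za-z0-9]+' | Where-Object { $_ } | ForEach-Object { $_.ToLower() })
    # count occurrences of one file's substrings in the other's
    $common = @($tokensA | Where-Object { $tokensB -contains $_ }).Count
    # percentage score: matches divided by the highest number of substrings
    $largest = [math]::Max($tokensA.Count, $tokensB.Count)
    if ($largest -eq 0) { return 0 }
    [math]::Round(100 * $common / $largest, 1)
}

# hypothetical usage on two same-sized files whose names differ only in word order and case
Get-NameSimilarity 'Holiday Photos - Beach, 2012.jpg' 'beach holiday 2012 photos.jpg'   # -> 100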
As I already have Fast Duplicate File Finder Pro, this outputs a text report of the duplicates in CSV and XML format.
I will process the CSV to see the groupings and rename the files so that I can get Beyond Compare to do a full binary comparison on them.
Update:
And here is my code. This Perl script will look at each pair of files (in the directories/folders being compared) that are the same and rename one of them to be the same as the other so that the two folders can be run through Beyond Compare which will do a full binary compare (if the flatten folders option is switched on). Binary compare confirms the match so that means that one of each duplicate pair can be purged.
#!/usr/bin/perl -w
use strict;
use warnings;
use File::Basename;
my $fdffCsv = undef;
# fixed
# put matching string - i.e. some or all of path of file to keep here e.g. C:\\files\\keep\\ or just keep
my $subpathOfFileToKeep = "keep";
# e.g. jpg mp3 pdf etc.
my $fileExtToCompare = "jpg";
# changes
my $currentGroup = undef;
my $group = undef;
my $filenameToKeep = "";
my $path = undef;
my $name = undef;
my $extension = undef;
my $filename = undef;
open ( $fdffCsv, '<', "fast_duplicate_filefinder_export_as_csv.csv" );
my @filesToRenameArray = ();
while ( <$fdffCsv> )
{
    my $line = $_;
    my @lineColumns = split( /,/, $line );
    # the first column is an index value
    if ( $lineColumns[0] =~ m/\d+/ )
    {
        $group = $lineColumns[0];
        ( $line ) =~ /("[^"]+")/;
        $filename = $1;
        $filename =~ s/\"//g;
        if ( defined $currentGroup )
        {
            if ( $group == $currentGroup )
            {
                ( $name, $path, $extension ) = fileparse ( $filename, '\..*"' );
                store_keep_and_rename();
            }
            else # group changed
            {
                match_the_filenames();
                ( $name, $path, $extension ) = fileparse ( $filename, '\..*"' );
                store_keep_and_rename();
            }
        }
        else # first time - beginning of file
        {
            $currentGroup = $group;
            ( $name, $path, $extension ) = fileparse ( $filename, '\..*"' );
            store_keep_and_rename();
        }
    }
}
close( $fdffCsv );
match_the_filenames();

sub store_keep_and_rename
{
    if ( $path =~ /($subpathOfFileToKeep)/ )
    {
        $filenameToKeep = $name.$extension;
    }
    else
    {
        push( @filesToRenameArray, $filename );
    }
}

sub match_the_filenames
{
    my $sizeOfFilesToRenameArraySize = scalar( @filesToRenameArray );
    if ( $sizeOfFilesToRenameArraySize > 0 )
    {
        for ( my $index = 0; $index < $sizeOfFilesToRenameArraySize; $index++ )
        {
            my $PreRename = $filesToRenameArray[$index];
            my ( $preName, $prePath, $preExtension ) = fileparse ( $PreRename, '\..*' );
            my $filenameToChange = $preName.$preExtension;
            my $PostRename = $prePath.$filenameToKeep;
            print STDOUT "Filename was: ".$PreRename."\n";
            print STDOUT "Filename will be: ".$PostRename."\n\n";
            rename $PreRename, $PostRename;
        }
    }
    undef( @filesToRenameArray ); @filesToRenameArray = ();
    $currentGroup = $group;
}
Beyond Compare can do that.
Just select the file on the left and the file to compare on the right.
Choose 'compare' or use the align function (right mouse button)