I have this Perl script which pulls data from an Oracle database via sqlplus. The database adds a new entry every time the state changes for a particular serial number. Now we need to pick the entries at every state change and prepare a CSV file with the old state, the new state and other fields. A sample of the db table:
SERIALNUMBER STATE AT OPERATORID SUBSCRIBERID TRANSACTIONID
51223344558899 Available 20081008T10:15:47 vsuser
51223344558857 Available 20081008T10:15:49 vsowner
51223344558899 Used 20081008T10:20:25 vsuser
51223344558860 Stolen 20081008T10:15:49 vsanyone
51223344558857 Damaged 20081008T10:50:49 vsowner
51223344558899 Damaged 20081008T10:50:25 vsuser
51343253335355 Available 20081008T11:15:47 vsindian
my script:
#! /usr/bin/perl
#use warnings;
use strict;
#my $circle =
#my $schema =
my $basePath = "/scripts/Voucher-State-Change";
#my ($sec, $min, $hr, $day, $month, $years) = localtime(time);
#$years_+=1900;$mont_+=1;
#my $timestamp=sprintf("%d%02d%02d",$years,$mont,$moday);
sub getDate {
my $daysago=shift;
$daysago=0 unless ($daysago);
#my @months=qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time-(86400*$daysago));
# YYYYMMDD, e.g. 20060126
return sprintf("%d%02d%02d",$year+1900,$mon+1,$mday);
}
my $filedate=getDate(1);
#my $startdate="${filedate}T__:__:__";
my $startdate="20081008T__:__:__";
print "$startdate\n";
##### Generating output file---
my $outputFile = "${basePath}/VoucherStateChangeReport.$filedate.csv";
open (WFH, ">", "$outputFile") or die "Can't open output file $outputFile for writing: $!\n";
print WFH "VoucherSerialNumber,Date,Time,OldState,NewState,UserId\n";
##### Generating log file---
my $logfile = "${basePath}/VoucherStateChange.$filedate.log";
open (STDOUT, ">>", "$logfile") or die "Can't open logfile $logfile for writing: $!\n";
open (STDERR, ">>", "$logfile") or die "Can't open logfile $logfile for writing: $!\n";
print "$logfile\n";
##### Now login to sqlplus-----
my $SQLPLUS='/opt/oracle/product/11g/db_1/bin/sqlplus -S system/coolman7@vsdb';
`$SQLPLUS \@${basePath}/VoucherQuery1.sql $startdate> ${basePath}/QueryResult1.txt`;
open (FH1, "${basePath}/QueryResult1.txt");
while (my $serial = <FH1>) {
chomp ($serial);
my $count = `$SQLPLUS \@${basePath}/VoucherQuery2.sql $serial $startdate`;
chomp ($count);
$count =~ s/\s+//g;
#print "$count\n";
next if $count == 1;
`$SQLPLUS \@${basePath}/VoucherQuery3.sql $serial $startdate> ${basePath}/QueryResult3.txt`;
# print "select * from sample where SERIALNUMBER = $serial----\n";
open (FH3, "${basePath}/QueryResult3.txt");
my ($serial_number, $state, $at, $operator_id);
my $count1 = 0;
my $old_state;
while (my $data = <FH3>) {
chomp ($data);
#print $data."\n";
my @data = split (/\s+/, $data);
my ($serial_number, $state, $at, $operator_id) = @data[0..3];
#my $serial_number = $data[0];
#my $state = $data[1];
#my $at = $data[2];
#my $operator_id = $data[3];
$count1++;
if ($count1 == 1) {
$old_state = $data[1];
next;
}
my ($date, $time) = split (/T/, $at);
$date =~ s/(\d{4})(\d{2})(\d{2})/$1-$2-$3/;
print WFH "$serial_number,$date,$time,$old_state,$state,$operator_id\n";
$old_state = $data[1];
}
}
close(WFH);
query in VoucherQuery1.sql:
select distinct SERIALNUMBER from sample where AT like '&1';
query in VoucherQuery2.sql:
select count(*) from sample where SERIALNUMBER = '&1' and AT like '&2';
query in VoucherQuery3.sql:
select * from sample where SERIALNUMBER = '&1' and AT like '&2';
and my sample output:
VoucherSerialNumber,Date,Time,OldState,NewState,UserId
51223344558857,2008-10-08,10:50:49,Available,Damaged,vsowner
51223344558899,2008-10-08,10:20:25,Available,Used,vsuser
51223344558899,2008-10-08,10:50:25,Used,Damaged,vsuser
The script works fine, but the problem is that the actual db table has millions of records for a specific day, and that is raising performance issues. Could you please advise how we can improve the efficiency of this script in terms of time and load? The only restriction is that I can't use the DBI module for this.
Also, in case of any error in the SQL queries, the error message ends up in the QueryResult?.txt files. I want to handle these errors and receive them in my log file. How can this be accomplished? Thanks.
I think you need to tune your query. A good starting point is to use EXPLAIN PLAN, if it is an Oracle database.
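For example, since DBI is not an option, you could run EXPLAIN PLAN through the same non-interactive sqlplus call the script already uses. A minimal sketch (the serial number and date literal come from the sample data above, it assumes the default PLAN_TABLE is available, and the DBMS_XPLAN output format varies by Oracle version):
#!/usr/bin/perl
use strict;
use warnings;
# Same sqlplus invocation as in the script above.
my $SQLPLUS = '/opt/oracle/product/11g/db_1/bin/sqlplus -S system/coolman7@vsdb';
# Explain one of the per-serial queries, then print the optimizer's plan.
my $plan = `$SQLPLUS <<'EOF'
EXPLAIN PLAN FOR
  select count(*) from sample
  where SERIALNUMBER = '51223344558899' and AT like '20081008T%';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
EXIT;
EOF
`;
print $plan;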
I am trying to load a 160 GB CSV file into SQL Server using a PowerShell script I got from GitHub, and I get this error:
Exception calling "Add" with "1" argument(s): "Input array is longer than the number of columns in this table."
At C:\b.ps1:54 char:26
+ [void]$datatable.Rows.Add <<<< ($line.Split($delimiter))
+ CategoryInfo : NotSpecified: (:) [], MethodInvocationException
+ FullyQualifiedErrorId : DotNetMethodException
So I checked the same code with a small 3-line CSV: all of the columns match, it has a header in the first row, and there are no extra delimiters, so I'm not sure why I am getting this error.
The code is below
<# 8-faster-runspaces.ps1 #>
# Set CSV attributes
$csv = "M:\d\s.txt"
$delimiter = "`t"
# Set connstring
$connstring = "Data Source=.;Integrated Security=true;Initial Catalog=PresentationOptimized;PACKET SIZE=32767;"
# Set batchsize to 2000
$batchsize = 2000
# Create the datatable
$datatable = New-Object System.Data.DataTable
# Add generic columns
$columns = (Get-Content $csv -First 1).Split($delimiter)
foreach ($column in $columns) {
[void]$datatable.Columns.Add()
}
# Setup runspace pool and the scriptblock that runs inside each runspace
$pool = [RunspaceFactory]::CreateRunspacePool(1,5)
$pool.ApartmentState = "MTA"
$pool.Open()
$runspaces = @()
# Setup scriptblock. This is the workhorse. Think of it as a function.
$scriptblock = {
Param (
[string]$connstring,
[object]$dtbatch,
[int]$batchsize
)
$bulkcopy = New-Object Data.SqlClient.SqlBulkCopy($connstring,"TableLock")
$bulkcopy.DestinationTableName = "abc"
$bulkcopy.BatchSize = $batchsize
$bulkcopy.WriteToServer($dtbatch)
$bulkcopy.Close()
$dtbatch.Clear()
$bulkcopy.Dispose()
$dtbatch.Dispose()
}
# Start timer
$time = [System.Diagnostics.Stopwatch]::StartNew()
# Open the text file from disk and process.
$reader = New-Object System.IO.StreamReader($csv)
Write-Output "Starting insert.."
while ((($line = $reader.ReadLine()) -ne $null))
{
[void]$datatable.Rows.Add($line.Split($delimiter))
if ($datatable.rows.count % $batchsize -eq 0)
{
$runspace = [PowerShell]::Create()
[void]$runspace.AddScript($scriptblock)
[void]$runspace.AddArgument($connstring)
[void]$runspace.AddArgument($datatable) # <-- Send datatable
[void]$runspace.AddArgument($batchsize)
$runspace.RunspacePool = $pool
$runspaces += [PSCustomObject]@{ Pipe = $runspace; Status = $runspace.BeginInvoke() }
# Overwrite object with a shell of itself
$datatable = $datatable.Clone() # <-- Create new datatable object
}
}
# Close the file
$reader.Close()
# Wait for runspaces to complete
while ($runspaces.Status.IsCompleted -notcontains $true) {}
# End timer
$secs = $time.Elapsed.TotalSeconds
# Cleanup runspaces
foreach ($runspace in $runspaces ) {
[void]$runspace.Pipe.EndInvoke($runspace.Status) # EndInvoke method retrieves the results of the asynchronous call
$runspace.Pipe.Dispose()
}
# Cleanup runspace pool
$pool.Close()
$pool.Dispose()
# Cleanup SQL Connections
[System.Data.SqlClient.SqlConnection]::ClearAllPools()
# Done! Format output then display
$totalrows = 1000000
$rs = "{0:N0}" -f [int]($totalrows / $secs)
$rm = "{0:N0}" -f [int]($totalrows / $secs * 60)
$mill = "{0:N0}" -f $totalrows
Write-Output "$mill rows imported in $([math]::round($secs,2)) seconds ($rs rows/sec and $rm rows/min)"
Working with a 160 GB input file is going to be a pain. You can't really load it into any kind of editor, and you can't analyze such a mass of data without some serious automation.
As per the comments, it seems that the data has some quality issues. In order to find the offending data, you could try binary searching. This approach shrinks the data fast. Like so:
1) Split the file in about two equal chunks.
2) Try and load first chunk.
3) If successful, process the second chunk. If not, see 6).
4) Try and load second chunk.
5) If successful, the files are valid, but you have some other data quality issue. Start looking into other causes. If not, see 6).
6) If either load failed, start from the beginning and use the failed file as the input file.
7) Repeat until you narrow down the offending row(s).
Another method would be to use an ETL tool like SSIS. Configure the package to redirect invalid rows into an error log to see what data is not working properly.
Good morning Stack Overflow. I have a PowerShell script that executes a SQL query against an Oracle database and then passes the results to a local shell command. It works, mostly. What is happening is that some of the results are being dropped, and the only significance I can see about those rows is that they have a couple of columns with null values (but only 2 of the 8 columns being returned). When the query is executed in SQL Developer I get all the expected results. This issue applies to the $eventcheck switch; the $statuscheck switch works fine. The PowerShell script is below:
param(
[parameter(mandatory=$True)]$username,
[parameter(mandatory=$True)]$password,
$paramreport,
$paramsite,
[switch]$eventcheck,
[switch]$statuscheck
)
$qry1 = Get-Content .\vantageplus_processing.sql
$qry2 = #"
select max(TO_CHAR(VP_ACTUAL_RPT_DETAILS.ETLLOADER_OUT,'YYYYMMDDHH24MISS')) Completed
from MONITOR.VP_ACTUAL_RPT_DETAILS
where VP_ACTUAL_RPT_DETAILS.REPORTNUMBER = '$($paramreport)' and VP_ACTUAL_RPT_DETAILS.SITE_NAME = '$($paramsite)'
order by completed desc
"#
$connString = #"
Provider=OraOLEDB.Oracle;Data Source=(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST="HOST")(PORT="1521"))
(CONNECT_DATA=(SERVICE_NAME="SERVICE")));User ID="$username";Password="$password"
"#
function Get-OLEDBData ($connectstring, $sql) {
$OLEDBConn = New-Object System.Data.OleDb.OleDbConnection($connectstring)
$OLEDBConn.open()
$readcmd = New-Object system.Data.OleDb.OleDbCommand($sql,$OLEDBConn)
$readcmd.CommandTimeout = '300'
$da = New-Object system.Data.OleDb.OleDbDataAdapter($readcmd)
$dt = New-Object System.Data.DataTable
[void]$da.fill($dt)
$OLEDBConn.close()
return $dt
}
if ($eventcheck)
{
$output = Get-OLEDBData $connString $qry1
ForEach ($lines in $output)
{
start-process -NoNewWindow -FilePath msend.exe -ArgumentList @"
-n bem_snmp01 -r CRITICAL -a CSG_VANTAGE_PLUS -m "The report $($lines.RPT) for site $($lines.SITE) has not been loaded by $($lines.EXPDTE)" -b "vp_reportnumber='$($lines.RPT)'; vp_sitename='$($lines.SITE)'; vp_expectedcomplete='$($lines.SIMEXPECTED)'; csg_environment='Production';"
"# # KEEP THIS TERMINATOR AT THE BEGINNING OF LINE
}
}
if ($statuscheck)
{
$output = Get-OLEDBData $connString $qry2
# $output | foreach {$_.completed}
write-host -nonewline $output.completed
}
So that you can see the data, below is a CSV output from Oracle SQL Developer with the ACTUAL results of the query that is being referenced by my script. Of these results, lines 4, 5, 6, 7, 8 and 10 are the only ones being passed along in the ForEach loop, while the others are not even captured in the $output array. If anyone can advise a method for getting all of the results passed along, I would appreciate it.
"SITE","RPT","LSDTE","EXPDTE","CIMEXPECTED","EXPECTED_FREQUENCY","DATE_TIMING","ETME"
"chrcse","CPHM-054","","2014/09/21 12:00:00","20140921120000","MONTHLY","1",
"chrcse","CPSM-226","","2014/09/21 12:00:00","20140921120000","MONTHLY","1",
"dsh","CPSD-176","2014/09/28 23:20:04","2014/09/30 04:00:00","20140930040000","DAILY","1",1.41637731481481481481481481481481481481
"dsh","CPSD-178","2014/09/28 23:20:11","2014/09/30 04:00:00","20140930040000","DAILY","1",1.4162962962962962962962962962962962963
"exp","CPSM-610","2014/08/22 06:42:10","2014/09/21 09:00:00","20140921090000","MONTHLY","1",39.10936342592592592592592592592592592593
"mdc","CPKD-264","2014/09/24 00:44:32","2014/09/30 04:00:00","20140930040000","DAILY","1",6.35771990740740740740740740740740740741
"nea","CPKD-264","2014/09/24 01:00:31","2014/09/30 03:00:00","20140930030000","DAILY","1",6.34662037037037037037037037037037037037
"twtla","CPOD-034","","2014/09/29 23:00:00","20140929230000","DAILY","0",
"twtla","CPPE-002","2014/09/29 02:40:35","2014/09/30 06:00:00","20140930060000","DAILY","1",1.27712962962962962962962962962962962963
"twtla","CPXX-004","","2014/09/29 23:00:00","20140929230000","DAILY","0",
It appears, actually, that a comment in the query was somehow causing this issue. I removed it and the results started returning normally. I have no idea why this would be the case, unless it has to do with the way the import works (does it import everything as one line?). Presumably it does: Get-Content returns the file as an array of lines, and when that array is coerced into the single string the OleDbCommand expects, the lines are joined with spaces, so a -- line comment would swallow everything after it. Either way, the results are normal now.
The code below works, but all of the data displays in one row (though in different columns) when opened in Excel. The query SHOULD display the data headings, row 1, and row 2. Also, when I open the file, I get a warning that says "The file you are trying to open, 'xxxx.csv', is in a different format than specified by the file extension. Verify that the file is not corrupted...etc. Do you want to open the file now?" Any way to fix that? That may be the cause, too.
tl;dr: export to CSV with multiple rows, not just one, and fix the Excel error. Thanks!
#!/usr/bin/perl
use warnings;
use DBI;
use Text::CSV;
# local time variables
($sec,$min,$hr,$mday,$mon,$year) = localtime(time);
$mon++;
$year += 1900;
# set name of database to connect to
$database=MDLSDB1;
# connection to the database
my $dbh = DBI->connect("dbi:Oracle:$database", "", "")
or die "Can't make database connect: $DBI::errstr\n";
# some settings that you usually want for oracle 10
$dbh->{LongReadLen} = 65535;
$dbh->{PrintError} = 0;
# sql statement to run
$sql="select * from eg.well where rownum < 3";
my $sth = $dbh->prepare($sql);
$sth->execute();
my $csv = Text::CSV->new ( { binary => 1 } )
or die "Cannot use CSV: ".Text::CSV->error_diag ();
open my $fh, ">:raw", "results-$year-$mon-$mday-$hr.$min.$sec.csv";
$csv->print($fh, $sth->{NAME});
while(my $row = $sth->fetchrow_arrayref){
$csv->print($fh, $row);
}
close $fh or die "Failed to write CSV: $!";
while(my $row = $sth->fetchrow_arrayref){
$csv->print($fh, $row);
$csv->print($fh, "\n");
}
CSV rows are delimited by newlines, so simply print a newline after each row, as above.
Another solution is to pass the desired line termination when instantiating the Text::CSV object...
my $csv = Text::CSV->new ( { binary => 1 } )
or die "Cannot use CSV: " . Text::CSV->error_diag();
becomes:
my $csv = Text::CSV->new({ binary => 1, eol => "\r\n" })
or die "Cannot use CSV: " . Text::CSV->error_diag();
I want to retrieve email addresses from a table and use them to send email from a Perl script. How do I use the query result in the mail? I am new to Perl scripting, so please help. I have updated the script as suggested, but there are still some issues. Please tell me where I am going wrong. Thanks in advance.
#!/usr/bin/perl
# $Id: outofstockmail.pl,v 1.0 2012/03/01 21:35:24 isha Exp $
require '/usr/home/fnmugly/main.cfg';
use DBI;
my $dbh = DBI->connect($CFG{'mysql_dsn'},$CFG{'mysql_user'},
$CFG{'mysql_password'})
or &PrintError("Could not connect to the MySQL Database.\nFile could not be made!\n");
$dbh->{RaiseError} = 1; # save having to check each method call
print "<H1>Hello World</H1>\n";
$sql = "Select OS.name, OS.customer_email, OS.product, OS.salesperson,
OS.salesperson_email
from products AS P
LEFT JOIN outofstock_sku AS OS ON OS.product = P.sku
LEFT JOIN tech4less.inventory AS I ON (I.sku = P.sku AND I.status = 'A')
WHERE mail_sent='0'
GROUP BY OS.product";
#$sth = $dbh->do($sql);
my $sth1 = $dbh->prepare($sql);
$sth1->execute;
while ( my @row = $sth1->fetchrow_array ) {
# email
open MAIL, "| $mail_prog -t" || die "Could not connect to sendmail.";
print MAIL "To: $row[1]";
print MAIL "From: $row[4]";
print MAIL "Reply-To:$row[4]";
print MAIL "CC: $row[4]";
print MAIL "Subject: Product requested is back in inventory\n";
print MAIL "\n";
print MAIL "Hi $row[0] , The product $row[2] is available in the stock.\n";
print MAIL "\n";
close MAIL;
$sql = "Update outofstock_sku SET mail_sent='0' WHERE mail_sent='1'";
$sth2 = $dbh->do($sql);
}
$sth = $dbh->do($sql);
$dbh->disconnect();
exit;
This script has too many problems to address the question specifically:
1) Use use strict; use warnings;. It will help you to be more accurate.
2) DBI->connect() takes options as last argument, so you can set RaiseError there:
my $dbh = DBI->connect($dsn, $user, $pwd, { RaiseError => 1 });
3) $dbh->do doesn't return sth object. You need prepare and execute:
my $sth = $dbh->prepare($sql);
$sth->execute;
while ( my @row = $sth->fetchrow_array ) {
...
print MAIL "Hi $row[0]. We're happy to ... $row[1]...\n";
...
}
4) To send mail use a module, for example Email::Sender::Simple.
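A minimal sketch of that approach (the addresses and row values are placeholders standing in for one row from the fetch loop above; it assumes Email::Sender::Simple is installed and a local mail transport is configured):
use strict;
use warnings;
use Email::Sender::Simple qw(sendmail);
use Email::Simple;
use Email::Simple::Creator;
# Placeholder for one fetched row: (name, customer_email, product, salesperson, salesperson_email)
my @row = ( 'Isha', 'customer@example.com', 'SKU-1234', 'Sam', 'sales@example.com' );
my $email = Email::Simple->create(
    header => [
        To      => $row[1],
        From    => $row[4],
        Subject => 'Product requested is back in inventory',
    ],
    body => "Hi $row[0], the product $row[2] is available in stock.\n",
);
sendmail($email);   # throws an exception on failure; wrap in eval {} if you want to keep looping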
Scary to think how much old code like this is still in use out there.
You don't actually say what your question is. Just saying that the code has "some issues" doesn't really help us to help you.
Some suggestions...
Use strict and warnings
Pass RaiseError as an argument to the connect call
Print out the contents of @row so you can be sure that the SQL is correct (see the sketch after this list)
Use Email::Simple to create your email and Email::Sender to send it
(Far less important) Consider using DBIx::Class to talk to the database
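For the third point, a quick way to see exactly what comes back from the database is a Data::Dumper dump inside the fetch loop (a minimal sketch; $sth here stands for whichever executed statement handle your script uses):
use Data::Dumper;
while ( my @row = $sth->fetchrow_array ) {
    print Dumper(\@row);   # every column value, including undefs, so you can verify the SQL
}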
#!/usr/bin/perl
require '/main.cfg';
# use DBI interface for MySQL
use DBI;
# connect to MySQL
$dbh = DBI->connect($CFG{'mysql_dsn'},$CFG{'mysql_user'},$CFG{'mysql_password'}) or
&PrintError("Could not connect to the MySQL Database.\nFile could not be made!\n");
$dbh->{RaiseError} = 1; # save having to check each method call
$mailprog = $CFG{'mail_prog'};
$sql = "Select OS.name,OS.customer_email,OS.product,OS.salesperson_email from products AS P LEFT JOIN outofstock_sku AS OS ON OS.product = P.sku LEFT JOIN inventory AS I ON (I.sku = P.sku AND I.status = 'A') WHERE mail_sent='0' GROUP BY OS.product";
$sth = $dbh->prepare($sql);
$sth->execute();
while($ref = $sth->fetchrow_hashref())
{
open MAIL, "| $mailprog -f sssss\#gmail.com -t" || die "Could not connect to sendmail.";
print "Content-Type: text/html\n\n";
print MAIL "To: $ref->{'customer_email'}\n";
print MAIL "From: \"\" <wwwwwwwwww\n>";
# print MAIL "From: \"\" <$ref->{'salesperson_email'}\n>";
print MAIL "Reply-To: ";
# print MAIL "CC: $ref->{'salesperson_email'}\n";
print MAIL "Subject:#$ref->{'product'}\n";
print MAIL "Hi $ref->{'name'},\nThe product $ref->{'product'} is available in the stock.\n";
close MAIL;
}
$sth->finish();
# Close MySQL connection
$dbh->disconnect();
exit;
Here I have 3 variables in Perl, and I am passing those variables into a SQL query line in a db.sql file, then executing that db.sql file from the same Perl script, but it's not picking up those variables.
$patch_name="CS2.001_PROD_TEST_37987_spyy";
$svn_url="$/S/B";
$ftp_loc="/Releases/pkumar";
open(SQL, "$sqlfile") or die("Can't open file $sqlFile for reading");
while ($sqlStatement = <SQL>)
{
$sth = $dbh->prepare($sqlStatement) or die qq("Can't prepare $sqlStatement");
$sth->execute() or die ("Can't execute $sqlStatement");
}
close SQL;
$dbh->disconnect;
The following is the query in the db.sql file:
insert into branch_build_info(patch_name,branch_name,ftp_path)
values('$patch_name','$svn_url','$ftp_loc/$patch_name.tar.gz');
Please help me with how I can pass the variables so that I can run more than one insert query at a time.
If you can influence the SQL file, you could use placeholders:
insert into ... values(?, ?, ?);
and then supply the parameters at execution:
my $ftp_path = "$ftp_loc/$patch_name.tar.gz";
$sth->execute( $patch_name, $svn_url, $ftp_path) or die ...
That way you only need to prepare the statement once and can execute it as often as you want. But I'm not sure where that SQL file comes from; it would be helpful if you could clarify the complete workflow.
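A minimal sketch of that prepare-once, execute-many pattern (the DSN and credentials are placeholders, the values are the three from the question, and the list exists only to show the statement being reused):
use strict;
use warnings;
use DBI;
# Placeholder connection -- reuse your existing $dbh instead.
my $dbh = DBI->connect('dbi:Oracle:orcl', 'user', 'password', { RaiseError => 1 });
my $sth = $dbh->prepare(
    'insert into branch_build_info (patch_name, branch_name, ftp_path) values (?, ?, ?)'
);
my @patches = (
    [ 'CS2.001_PROD_TEST_37987_spyy', '$/S/B', '/Releases/pkumar' ],
);
for my $p (@patches) {
    my ($patch_name, $svn_url, $ftp_loc) = @$p;
    $sth->execute( $patch_name, $svn_url, "$ftp_loc/$patch_name.tar.gz" );
}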
You can use the String::Interpolate module to interpolate a string that's already been generated, like the string you're pulling out of a file:
while ($sqlStatement = <SQL>)
{
my $interpolated = new String::Interpolate { patch_name => \$patch_name, svn_url => \$svn_url, ftp_loc => \$ftp_loc };
$interpolated->($sqlStatement);
$sth = $dbh->prepare($interpolated) or die qq("Can't prepare $interpolated");
$sth->execute() or die ("Can't execute $interpolated");
}
Alternatively, you could manually perform replacements on the string to be interpolated:
while ($sqlStatement = <SQL>)
{
$sqlStatement =~ s/\$patch_name/$patch_name/g;
$sqlStatement =~ s/\$svn_url/$svn_url/g;
$sqlStatement =~ s/\$ftp_loc/$ftp_loc/g;
$sth = $dbh->prepare($sqlStatement) or die qq("Can't prepare $sqlStatement");
$sth->execute() or die ("Can't execute $sqlStatement");
}
There are a few problems there. It looks like you need to read in all of the SQL file to make the query, but you're just reading it a line at a time and trying to run each line as a separate query; <SQL> just takes the next line from the file.
Assuming there is just one query in the file, something like the following may help:
open( my $SQL_FH, '<', $sqlfile)
or die "Can't open file '$sqlfile' for reading: $!";
my @sql_statement = <$SQL_FH>;
close $SQL_FH;
my $sth = $dbh->prepare( join(' ',@sql_statement) )
or die "Can't prepare @sql_statement: " . $dbh->errstr;
$sth->execute();
However, you still need to expand the variables in your SQL file. If you can't replace them with placeholders as already suggested, then you'll need to either eval the file or parse it in some way...