I have a package which looks as follows:
Notice the two areas which I have marked with a red rectangle: they are identical in every way. Can I change the package to avoid this duplication? It seems to me I cannot move them to a Data Flow Task, since loops and File System Tasks do not exist there.
You can create a sub package for the loop logic, as @Bill mentioned. One way to pass the result set of a query on to another package is shown below. The example uses SSIS 2012; I did similar work in SSIS 2005, where you only need to change the C# code to VB.NET.
In your parent package, create a variable to hold the name of the resultSet variable in the parent package:
In your sub package, create a string variable parentResultSetName:
In your sub package, add a package configuration that maps parentResultSetName to the parent package variable resultSetVariableName:
Now, we can read the resultSet variable by the name in the script task of sub package:
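public void Main()
{
    // Name of the parent package's result set variable, supplied by the package configuration.
    var dsName = Dts.Variables["parentResultSetName"].Value.ToString();
    Variables variables = null;
    DataSet resultSet = null;
    try
    {
        // Lock the parent variable for reading and fetch it by name.
        Dts.VariableDispenser.LockForRead(dsName);
        Dts.VariableDispenser.GetVariables(ref variables);
        resultSet = variables[dsName].Value as DataSet;
        if (resultSet != null)
        {
            MessageBox.Show("Sub package get: " + resultSet.Tables[0].Rows[0][0].ToString());
        }
        Dts.TaskResult = (int)ScriptResults.Success;
    }
    catch (Exception e)
    {
        Dts.Events.FireError(-1, "", e.Message, "", 0);
        // Report the failure so the task does not appear to succeed.
        Dts.TaskResult = (int)ScriptResults.Failure;
    }
    finally
    {
        // Release the lock acquired by GetVariables.
        if (variables != null)
        {
            variables.Unlock();
        }
    }
}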
Here is the result:
Just place both your queries from "select batch logins" tasks to recordset and make another foreach loop over that recordset, executing those two queries from a variable.
I can see that your second foreach loop contains some additional tasks, so you'll have to incorporate them into the "Select all batch logins" task somehow, or add precedence constraints that match in both loops.
An alternative approach is to add an 'action' column to your 'batches' table. (or add an outrigger table). Work out beforehand what you want to do to the records. Then just delete the records and files once.
It looks like you are doing RBAR (row-by-agonizing-row) operations here.
So for example you run a couple of UPDATE statements against your tables that leave the tables in a state where each record is flagged for deletion or not.
Then after that you go through the loop and delete based off what is in the table.
It would make your package a lot simpler, and you can incorporate some 'commit' logic that makes sure a record and its file are always deleted at the same time.
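The flag-then-delete idea can be sketched outside SSIS. The following is a minimal illustration in Python with an in-memory SQLite database, not the package itself; the table layout, the column names (`status`, `action`) and the sample rows are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE batches (id INTEGER PRIMARY KEY, login TEXT, status TEXT, action TEXT)"
)
conn.executemany(
    "INSERT INTO batches (login, status) VALUES (?, ?)",
    [("alice", "expired"), ("bob", "active"), ("carol", "failed")],
)

# Step 1: set-based UPDATEs decide up front what happens to each row.
conn.execute("UPDATE batches SET action = 'delete' WHERE status IN ('expired', 'failed')")
conn.execute("UPDATE batches SET action = 'keep' WHERE action IS NULL")

# Step 2: one loop over the flagged rows does the actual deleting.
flagged = conn.execute(
    "SELECT id, login FROM batches WHERE action = 'delete' ORDER BY id"
).fetchall()
deleted_logins = []
for batch_id, login in flagged:
    with conn:  # commits on success, rolls back if an exception is raised
        # A real package would delete the matching file here, before the row.
        conn.execute("DELETE FROM batches WHERE id = ?", (batch_id,))
    deleted_logins.append(login)

remaining = [row[0] for row in conn.execute("SELECT login FROM batches ORDER BY id")]
```

Because the decision logic lives in the UPDATE statements, the loop body is the same for every row, which is what removes the duplicated container.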
I'm a newbie so don't laugh :#
I'm working with a Microsoft Access 2002-2003 database.
Now, I want to add an array of DataRow objects to an existing table in my database. Is there a way to do that? Right now I'm just adding the rows one at a time with a foreach loop.
thank you
I think that the foreach-loop actually is the best way to do it.
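foreach (DataRow row in yourRowArray)
{
    // DataTable has no Add method; rows are added through its Rows collection.
    // If the rows already belong to another table, use dataTable.ImportRow(row) instead.
    dataTable.Rows.Add(row);
}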
If you are using .NET Framework 3.5 or later, you can also call the CopyToDataTable() extension method on the DataRow array.
Be careful, though: it returns a new DataTable, so assigning the result to your variable replaces the existing table instead of appending to it.
DataTable table = yourDataTable;
DataRow[] yourRowArray = ...;
if (yourRowArray.Length > 0)
{
    table = yourRowArray.CopyToDataTable();
}
I would recommend using the foreach-loop.
What you describe as an array must be a saved file of some type, e.g. Excel or CSV. Be sure it is a clean grid of data without extraneous, non-aligned rows.
Then you can link to that file with Access as a table. This is a manual step using the Access interface - in the ribbon it is the External area. This link remains good - allowing you to replace the excel/csv with a new one as long as the location path and structure of the file do not change.
Then you create an Append query to write all the records from this table into the table in your Access database.
I am using the FileHelpers library with a pipe ("|") delimited file that must have exactly 4 fields. I need to detect when a record has more than 4 fields and save the error.
bla|bla2|bla3|bla4 <- Good Record
bla|bla2|bla3|bla4|bla5 <- Wrong record
FileHelpers throws a BadUsageException, but the message does not describe the occurrence well.
Thanks for answer.
You can use the engine.AfterReadRecord event to tell FileHelpers to skip the record:
engine.AfterReadRecord += Engine_AfterReadRecord;
private void Engine_AfterReadRecord(EngineBase engine, FileHelpers.Events.AfterReadEventArgs<object> e)
{
e.SkipThisRecord = true;
}
As written, this makes the engine skip every record, since I haven't put in any criteria. Just add your own custom logic to decide which records to skip.
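The "custom logic" for this question is a field count check. Below is a minimal sketch of that check in plain Python; it does not use FileHelpers, and the function name `split_records` and the error message format are made up for illustration.

```python
EXPECTED_FIELDS = 4  # from the question: exactly four pipe-delimited fields


def split_records(lines, delimiter="|", expected=EXPECTED_FIELDS):
    """Separate well-formed records from bad ones, keeping a message per bad line."""
    good, errors = [], []
    for line_number, line in enumerate(lines, start=1):
        fields = line.rstrip("\r\n").split(delimiter)
        if len(fields) == expected:
            good.append(fields)
        else:
            errors.append(
                (line_number, f"expected {expected} fields, found {len(fields)}")
            )
    return good, errors


good, errors = split_records(["bla|bla2|bla3|bla4", "bla|bla2|bla3|bla4|bla5"])
```

Recording the line number and the actual field count gives a more descriptive error than a generic exception message.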
I am calling a stored procedure from my Groovy code. The stored proc looks like this
SELECT * FROM blahblahblah
SELECT * FROM suchAndsuch
So basically, two SELECT statements and therefore two ResultSets.
sql.eachRow("dbo.testing 'param1'"){ rs ->
println rs
}
This works fine for a single ResultSet. How can I get the second one (or an arbitrary number of ResultSets, for that matter)?
You would need callWithAllRows() or its variant.
The return type of this method is List<List<GroovyRowResult>>.
Use this when calling a stored procedure that utilizes both output
parameters and returns multiple ResultSets.
This question is kind of old, but I will answer since I came across the same requirement recently and it may be useful as a future reference for me and others.
I'm working on a Spring application with SphinxSearch. When you run a query in Sphinx, you get the results, and then you need to run a second query to get the metadata (number of records, etc.).
// the query
String query = """
SELECT * FROM INDEX_NAME WHERE MATCH('SEARCHTERM')
LIMIT 0,25 OPTION MAX_MATCHES=25;
SHOW META LIKE 'total_found';
"""
// create an instance of our groovy sql (sphinx doesn't use a username or password, jdbc url is all we need)
// connection can be created from java, don't have to use groovy for it
Sql sql = Sql.newInstance('jdbc:mysql://127.0.0.1:9306/?characterEncoding=utf8&maxAllowedPacket=512000&allowMultiQueries=true','sphinx','sphinx123','com.mysql.jdbc.Driver')
// create a prepared statement so we can execute multiple resultsets
PreparedStatement ps = sql.getConnection().prepareStatement(query)
// execute the prepared statement
ps.execute()
// get the first result set and pass to GroovyResultSetExtension
GroovyResultSetExtension rs1 = new GroovyResultSetExtension(ps.getResultSet())
rs1.eachRow {
println it
}
// call getMoreResults on the prepared statement to activate the 2nd set of results
ps.getMoreResults()
// get the second result set and pass to GroovyResultSetExtension
GroovyResultSetExtension rs2 = new GroovyResultSetExtension(ps.getResultSet())
rs2.eachRow {
println it
}
Just some test code, this needs some improving on. You can loop the result sets and do whatever processing...
Comments should be self-explanatory, hope it helps others in the future!
Ok, I have a simple process...
Read a table and get the rows that have a "StatusID" of 1. Simple.
Select ProductID from PreorderStatus where StatusID = 1
For each row returned from that query, perform an action. For simplicity's sake, let's just modify the original table to set the "StatusID" to 2.
Update PreorderStatus set StatusID = 2 where ProductID = @ProductID
In order to do this in SSIS, I have created a simple "Execute SQL Task" with the first statement. In the editor I have set the Result Set to return a Full result set and the Result Name of 0 is set to fill an object variable named ReadySet.
The output is then routed to a For Each Loop container. The Enumerator is set to Foreach ADO Enumerator and the object source variable set to the ReadySet variable from above. I have also mapped the variable v_ProductID to index 0.
Setting a breakpoint at the beginning of the Foreach loop shows the variable being set correctly. GREAT!! Now on to step two....
Now I have placed a new SQL task in the foreach container, and I have a head scratcher: how do I actually use the variable in the SQL statement? Simply using "v___ProductID" or "User::v_ProductID" doesn't seem to work. Mapping a parameter seemed like a good idea (got a @ProductID and everything!) but that didn't seem to work either.
I get the feeling that I am missing something pretty simple but can't tell what. Thanks for any help!!
I think there is a better approach. Here are the approximate steps:
Drag a DataFlow task onto the design surface.
Open it up and add an OLE DB Source and an OLE DB Command component to the design surface.
Modify the source to use the query you have described.
Connect the source to the Command component.
Modify the Command component to use the query "Update PreorderStatus set StatusID = 2 where ProductID = ?" and, on the parameter mapping page, map the ? parameter to the input column coming from the data source.
HTH
When I want to use an execute sql task and vary something based on a variable, I use a stored proc and make the variable the input parameter for the proc.
Then you set the parameter in the Execute SQL Task and set the SQL statement to something like:
exec myproc ?
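Both answers rely on a positional `?` placeholder that is bound to a value at execution time, rather than a variable name written into the SQL text. The sketch below shows the same select-then-update flow in Python with an in-memory SQLite database; it is an illustration of the placeholder idea only, not SSIS, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PreorderStatus (ProductID INTEGER PRIMARY KEY, StatusID INTEGER)")
conn.executemany("INSERT INTO PreorderStatus VALUES (?, ?)", [(10, 1), (11, 3), (12, 1)])

# Step 1: read the rows that are ready (the "full result set" in the question).
ready = [
    row[0]
    for row in conn.execute(
        "SELECT ProductID FROM PreorderStatus WHERE StatusID = 1 ORDER BY ProductID"
    )
]

# Step 2: for each row, run the UPDATE with the value bound to the ? placeholder.
for product_id in ready:
    conn.execute(
        "UPDATE PreorderStatus SET StatusID = 2 WHERE ProductID = ?", (product_id,)
    )
conn.commit()

statuses = dict(conn.execute("SELECT ProductID, StatusID FROM PreorderStatus"))
```

The statement text never changes between iterations; only the bound value does, which is the role the parameter mapping plays in the Execute SQL Task.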
Setting the DBIC_TRACE environment variable to true:
BEGIN { $ENV{DBIC_TRACE} = 1 }
generates very helpful output, especially showing the SQL query that is being executed, but the SQL query is all on one line.
Is there a way to push it through some kinda "sql tidy" routine to format it better, perhaps breaking it up over multiple lines? Failing that, could anyone give me a nudge into where in the code I'd need to hack to add such a hook? And what the best tool is to accept a badly formatted SQL query and push out a nicely formatted one?
"nice formatting" in this context simply means better than "all on one line". I'm not particularly fussed about specific styles of formatting queries
Thanks!
As of DBIx::Class 0.08124 it's built in.
Just set $ENV{DBIC_TRACE_PROFILE} to console or console_monochrome.
From the documentation of DBIx::Class::Storage
If DBIC_TRACE is set then trace information is produced (as when the
debug method is set). ...
debug Causes trace information to be emitted on the debugobj
object. (or STDERR if debugobj has not specifically been set).
debugobj Sets or retrieves the object used for metric collection.
Defaults to an instance of DBIx::Class::Storage::Statistics that is
compatible with the original method of using a coderef as a callback.
See the aforementioned Statistics class for more information.
In other words, you should set debugobj in that class to an object that subclasses DBIx::Class::Storage::Statistics. In your subclass, you can reformat the query the way you want it to be.
First, thanks for the pointers! Partial answer follows ....
What I've got so far ... first some scaffolding:
# Connect to our db through DBIx::Class
my $schema = My::Schema->connect('dbi:SQLite:/home/me/accounts.db');
# See also BEGIN { $ENV{DBIC_TRACE} = 1 }
$schema->storage->debug(1);
# Create an instance of our subclassed (see below)
# DBIx::Class::Storage::Statistics class
my $stats = My::DBIx::Class::Storage::Statistics->new();
# Set the debugobj object on our schema's storage
$schema->storage->debugobj($stats);
And the definition of My::DBIx::Class::Storage::Statistics being:
package My::DBIx::Class::Storage::Statistics;
use base qw<DBIx::Class::Storage::Statistics>;
use Data::Dumper qw<Dumper>;
use SQL::Statement;
use SQL::Parser;
sub query_start {
    my ($self, $sql_query, @params) = @_;
    print "The original sql query is\n$sql_query\n\n";
    my $parser = SQL::Parser->new();
    my $stmt = SQL::Statement->new($sql_query, $parser);
    #printf "%s\n", $stmt->command;
    print "The parameters for this query are:";
    print Dumper \@params;
}
Which solves the problem about how to hook in to get the SQL query for me to "pretty-ify".
Then I run a query:
my $rs = $schema->resultset('SomeTable')->search(
{
'email' => $email,
'others.some_col' => 1,
},
{ join => 'others' }
);
$rs->count;
However SQL::Parser barfs on the SQL generated by DBIx::Class:
The original sql query is
SELECT COUNT( * ) FROM some_table me LEFT JOIN others other_table ON ( others.some_col_id = me.id ) WHERE ( others.some_col_id = ? AND email = ? )
SQL ERROR: Bad table or column name '(others' has chars not alphanumeric or underscore!
SQL ERROR: No equijoin condition in WHERE or ON clause
So ... is there a better parser than SQL::Parser for the job?
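If "better than all on one line" is the only goal, a full parser may not be needed. The following is a naive sketch, written in Python rather than Perl, that simply starts a new line before a handful of clause keywords. The keyword list and the function name `break_sql` are assumptions for illustration, and the approach will also split on keywords that happen to appear inside string literals.

```python
import re

# Clause keywords that start a new line. Longer phrases come first so that
# "LEFT JOIN" is matched as a unit rather than as a bare "JOIN".
_KEYWORDS = [
    "LEFT JOIN", "INNER JOIN", "GROUP BY", "ORDER BY",
    "FROM", "WHERE", "JOIN", "HAVING", "LIMIT",
]
_PATTERN = re.compile(
    r"\s+(" + "|".join(re.escape(k) for k in _KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def break_sql(sql):
    """Replace the whitespace before each clause keyword with a newline."""
    return _PATTERN.sub(lambda m: "\n" + m.group(1), sql.strip())


formatted = break_sql(
    "SELECT COUNT( * ) FROM some_table me LEFT JOIN others other_table "
    "ON ( x = y ) WHERE ( a = ? )"
)
```

The same substitution could be done with a single regex inside `query_start`, since it only needs the raw SQL string that the hook already receives.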