Assume I know or am able to learn any adjacent technology/language--what's the best way to go about automating a number of processing/summary SQL scripts?
I have a number of scripts (clean-up such as updates and deletes, processing such as joins, and post-processing summaries) that I wrote last month and would now like to automate. What are the preferred methods for running the entire process as a series of sequential scripts?
EDIT: All of this is run on MySQL dbs.
Similar to MAW's answer, except I would use a Windows Service instead of a command line app (no GUI), and split out the individual DB scripts into separate tasks within the Service so that they can if necessary be called at different time intervals, log their results separately, and be independently configured.
This depends greatly on your DBMS. In SQL Server there are scheduled jobs which can run any combination of stored procedures/commands on a schedule, similar to Windows scheduled tasks.
Since you've tagged this question with mysql, I'll assume that's what you're running.
One way to do this on MySQL is the Event Scheduler:
http://dev.mysql.com/tech-resources/articles/event-feature.html
If all else fails, you can reference all of these scripts in a stored procedure, then create a simple command line program which connects to that db and calls that procedure. Then schedule that program with Windows Task Scheduler or similar.
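A minimal sketch of such a command line program, in Python. It uses the standard-library sqlite3 module purely as a stand-in so the example runs anywhere; against MySQL you would swap in a MySQL driver and its connection call. The script names and SQL below are hypothetical placeholders, not the asker's actual scripts.

```python
import sqlite3


def run_scripts(conn, scripts):
    """Run (name, sql) pairs in order and stop at the first failure.

    Returns the names of the scripts that completed successfully.
    """
    completed = []
    for name, sql in scripts:
        try:
            # executescript is a sqlite3 helper that runs several statements
            conn.executescript(sql)
        except sqlite3.Error as exc:
            conn.rollback()
            print(f"{name} failed: {exc}")
            break  # later scripts depend on earlier ones, so do not continue
        completed.append(name)
        print(f"{name} ok")
    return completed


if __name__ == "__main__":
    # Hypothetical stand-ins for the clean-up and summary scripts.
    demo_scripts = [
        ("01_cleanup",
         "CREATE TABLE raw(id INTEGER, amount REAL);"
         "INSERT INTO raw VALUES (1, 10.0), (2, NULL), (3, 5.0);"
         "DELETE FROM raw WHERE amount IS NULL;"),
        ("02_summary",
         "CREATE TABLE summary AS SELECT SUM(amount) AS total FROM raw;"),
    ]
    connection = sqlite3.connect(":memory:")
    run_scripts(connection, demo_scripts)
```

Stopping at the first failure keeps a broken clean-up step from feeding bad data into the summaries; the scheduler (cron, Task Scheduler) only needs to launch this one program.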
I am new to Control-M scheduling and the scenario I have at hand is like below:
There is a stored procedure, written in SQL Developer, which creates subpartition queries on a table. Now I need to schedule a Control-M job which runs this stored procedure directly against the database and schema given in the Control-M job parameters. I was able to set up the database connection part, and with the execution type set to embedded query I wrote the SQL statement: EXEC <procedure_name>;
The Control-M job is failing with ORA-00900: invalid SQL statement.
Note: the procedure does not have any partition. Also, when I run the same query in SQL Developer it runs successfully and gives the expected result. The issue is with execution from the Control-M job.
Can anyone please help with a solution? Many thanks!
@thatjeffsmith Control-M is the market-leading job scheduling solution. It exists in Distributed (Unix, Windows and everything else) and Mainframe versions, with the possibility to move between the two.
Control-M can run scripts or commands. You won't be able to run a stored proc straight from a system command prompt, but they have a solution for this. Control-M has many modules that run under the Control-M Agent (the Agent is installed locally on the box where you want to execute stuff), and these modules come in many different varieties: some manage file transfers, others run SAP jobs, Oracle applications, Hadoop, all sorts of things.
The one you need is Control-M for Databases. Install that under your Control-M Agent (it's free for existing customers); it takes 2 minutes to add and it works with all the major DB vendors. Once it is running, set up one or more connection profiles on the Control-M for Databases module (DB name, id, pw, port number, the usual).
Then (back on your Control-M system) create a Control-M for Databases job. You can either choose a stored proc or use Control-M to store your SQL or whatever. Control-M gives you features like variables (including saving your output as a variable), rerun options, dependencies on other jobs, notifications (e.g. for failures or late runners), workload balancing, version control, yadda-yadda-yadda ...
I was able to run the stored procedure with the execution type set to stored procedure in Control-M. Earlier I was not able to do this because of a database connectivity issue; once that was solved, it worked.
Thanks!
I need to collect data from a SQL Server table, format it, and then put it into a different table.
I have access to SQL Server but cannot set up triggers or scheduled jobs.
I can create tables, stored procedures, views and functions.
What can I set up that will automatically collect the data and insert it into a SQL Server table for me?
I would probably create a stored procedure to do this task.
In the stored procedure you can create a CTE or use temp tables (depending on the task) and do all the data manipulation you require. Once done, use INSERT INTO ... SELECT to move the data from the temp table into the existing SQL Server table you need; SELECT INTO is the alternative only when the target table does not exist yet, because it creates a new table.
https://www.w3schools.com/sql/sql_select_into.asp
You can then schedule this stored procedure to run at whatever time you need.
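As a rough illustration of the collect, format and insert step, here is a small Python sketch. It runs against sqlite3 only so that it is self-contained; the table names (source_orders, report_orders), the columns and the formatting rules are all made up for the example, and printf is SQLite's formatting function, so on SQL Server the same INSERT INTO ... SELECT would live in the stored procedure with T-SQL functions instead.

```python
import sqlite3

# Hypothetical tables: source_orders(id, customer, amount) is read,
# report_orders(order_id, customer, amount_text) receives the formatted rows.
LOAD_SQL = (
    "INSERT INTO report_orders (order_id, customer, amount_text) "
    "SELECT id, UPPER(TRIM(customer)), printf('%.2f', amount) "
    "FROM source_orders "
    "WHERE id NOT IN (SELECT order_id FROM report_orders)"
)


def load_formatted(conn):
    """Copy rows not yet loaded into report_orders, formatting them on the way.

    Returns the number of rows inserted by this run.
    """
    cur = conn.execute(LOAD_SQL)
    conn.commit()
    return cur.rowcount
```

The NOT IN filter makes the load safe to repeat: running it twice does not duplicate rows, which matters once something external calls it on a schedule.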
A database is just a storage container. It doesn't "do" things automatically all by itself. Even if you did have the access to create triggers, something would have to happen to the table to cause the trigger to fire, typically a CRUD operation on the parent table. And something external needs to happen to initiate that CRUD operation.
When you start talking about automating a process, you're talking about the function of a scheduler program. SQL Server has one built in, the SQL Agent, and depending on your needs you may find that it's appropriate to enlist help from whoever in your organization does have access to it. I've worked in a couple of organizations, though, that only used the SQL Agent to schedule maintenance jobs, while data manipulation jobs were scheduled through an outside resource. The most common one I've run across is Control-M, but there are other players in that market. I even ran across one homemade scheduler protocol that was just built in C#.NET that worked great.
Based on the limitations you lay out in your question, and the comments you've made in response to others, it sounds to me like you need to socialize your challenge within your organization to find out what their routine mechanism is for setting up data transfers. It's unlikely that this is the first time it's come up, unless the company was founded in the last week or two. It will probably require that you set up your code, probably a stored procedure or maybe an SSIS package, and then work with someone else, perhaps a DBA or a Site Operations team or some such, to get that process automated to fire when you need it to, whether through an Agent job or maybe a file listener.
Well, you have two major options: a stored procedure (SP) or an SSIS package.
Both of them can be scheduled to run at a given time with a simple Job from the SQL Server Agent. Keep in mind that if you are doing this on a separate server you might need to add the source server as a Linked Server so you can access it from the script.
I've done this approach in the past and it has worked great. Note, for security reasons, I am not able to access the remote server's task scheduler, so I go through the SQL Server Agent:
Run a SQL Server Agent job on a schedule of your choice
Use the SQL Server Agent to call an SSIS Package
The SSIS Package then calls an executable which can pull the data you want from your original table, evaluate it, and then insert a formatted version of it, one record at a time. Alternatively, you can simply create a C# script within the SSIS package via a Script Task.
I hope this helps. Please let me know if you need more details.
This is a general question and there are probably some solutions already. Most of the things I have found are related to database development, deployment, etc.
I am looking for a process that runs daily and performs some checks against some tables of a database. The data in these tables is loaded by a lot of users, and the idea is that, by defining some rules, the process will detect "wrong" values loaded by a user.
I know this is a very open question, but do you know if this is possible with some tools: Jenkins, DBGhost, etc.?
Thank you,
Kat
You have many options. Here's one train of thought.
Create a table called data_audit with fields like so:
audit_datetime
table
field
wrong_value
rule_violated
issue_description
Create stored procedures/functions that can detect wrong values and store the data into this audit table.
Depending on your database, you can run the stored procedure on a schedule. For example, if you have SQL Server, you can run the job using SQL Agent. Once the job is finished, you can run another job that gets count(*) from the audit table for today's date. If the count is higher than zero, use the Database Mail feature to email the relevant people to take action.
If you have a database like MySQL or PostgreSQL, write a short program in a language of your choice (PHP/Python/.NET/whatever) to execute the stored procedure, then do count(*) and then email if count was higher than zero. You can run this program using either cron on Linux or Linux-like systems or Task Scheduler in Windows.
You could use tools like Jenkins to schedule such activity. Task Scheduler/cron are built into your operating system and are easy to use. Additional installation like Jenkins is not necessary. If you already have Jenkins installed, you can certainly piggy-back on it.
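The short program described above could look roughly like this in Python. It is only a sketch under several assumptions: sqlite3 stands in for MySQL or PostgreSQL, the rules are expressed as SQL predicates in a hypothetical RULES list rather than in stored procedures, the people table and its columns are invented, the audit column is called table_name because table is a reserved word, and the email step is left to the caller.

```python
import sqlite3
from datetime import date

# Hypothetical rules: (rule name, table, field, predicate that marks a bad value)
RULES = [
    ("age_out_of_range", "people", "age", "age < 0 OR age > 130"),
    ("email_missing_at", "people", "email", "email NOT LIKE '%@%'"),
]


def run_audit(conn, rules, today=None):
    """Record rule violations in data_audit and return today's violation count."""
    today = today or date.today().isoformat()
    for rule, table, field, predicate in rules:
        # Table, field and predicate come from the trusted rules list above,
        # never from user input, so building the statement with an f-string is safe here.
        conn.execute(
            "INSERT INTO data_audit "
            "(audit_datetime, table_name, field, wrong_value, rule_violated) "
            f"SELECT ?, ?, ?, {field}, ? FROM {table} WHERE {predicate}",
            (today, table, field, rule),
        )
    conn.commit()
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM data_audit WHERE audit_datetime = ?", (today,)
    ).fetchone()
    return count
```

A cron or Task Scheduler entry would run a script that calls run_audit once a day and sends the email only when the returned count is greater than zero.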
I am trying to find an ideal way to automatically copy new records from one database to another. The databases have different structures! I achieved it by writing VBS scripts which copy the data from one to the other, triggered from another application which passes arguments to the script. But I ran into issues when there were more than 100 triggers, i.e. 100 wscript processes trying to access the database at once, and they couldn't complete the task.
I want to find a simpler solution inside SQL. I have read about setting up triggers, stored procedures run from SQL Agent, replication, etc. The requirement is that I have to copy records to another database periodically, or whenever there is a new record in the source database.
Which method will suit me the best?
You can use Change Data Capture (CDC) for this. Create an SSIS package that uses CDC and run it periodically through a SQL Server Agent job. CDC records all changes made to the table, and the package applies them to the destination table each time it runs. Please see the link below.
http://sqlmag.com/sql-server-integration-services/combining-cdc-and-ssis-incremental-data-loads
The word "periodically" in your question suggests that you should go for jobs. You can schedule jobs in SQL Server using SQL Server Agent and assign a period. The job will run your script at the assigned frequency.
PrabirS: Change Data Capture
This is a good option, because it uses the transaction log to create something similar to the Command Query Responsibility Segregation (CQRS) pattern.
Alok Gupta: A SQL Job that runs in the SQL Agent
This too is a good option, provided you have something like a modified date so that you can filter for the altered data. You can create a stored procedure and let it run regularly in the SQL Agent.
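To make the modified-date idea concrete, here is a minimal Python sketch of an incremental copy between two differently shaped tables. Everything specific in it is assumed for illustration: the customers and contacts tables, their columns, ISO-formatted timestamps stored as text, and sqlite3 as the engine (INSERT OR REPLACE is SQLite syntax and needs source_id to be a key; in SQL Server you would use MERGE or an update-then-insert inside the stored procedure).

```python
import sqlite3


def copy_changes(src, dst, last_sync):
    """Copy rows modified after last_sync from src to dst.

    Returns the new watermark to store and pass in on the next run.
    """
    rows = src.execute(
        "SELECT id, first_name, last_name, modified_at FROM customers "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_sync,),
    ).fetchall()
    for row_id, first, last, modified_at in rows:
        # Map the source columns onto the differently structured destination.
        dst.execute(
            "INSERT OR REPLACE INTO contacts (source_id, full_name) VALUES (?, ?)",
            (row_id, f"{first} {last}"),
        )
        last_sync = modified_at  # rows are ordered, so this ends at the newest
    dst.commit()
    return last_sync
```

Because a single scheduled run handles every change since the last watermark, there is no need for one process per new record, which is what overloaded the database in the wscript approach.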
A third option could be triggers (the change will happen in the same transaction).
This option is useful for auditing and logging. But you should definitely avoid writing business logic in triggers, as triggers are more or less hidden and occur without directly calling them (similar to CDC actually). I have actually created a trigger about half a year ago that captured the data and inserted it somewhere else in xml-format as the columns in the original table could change over time (multiple projects using the same database(s)).
-Edit-
By the way, your question more or less suggests a lack of a clear design pattern, and that the technique used is not the main problem. You could read up on how an ETL layer is built, or try to implement a "separation of concerns". Note: it is hard to tell if this is the case, but given how you formulated your question, an unclear design is something that comes to mind as a possible problem.
I have an archiving process that basically deletes archived records after a set number of days. Is it better to write a scheduled SQL job or a windows service to accomplish the deletion? The database is mssql2005.
Update:
To speak to some of the answers below, this question is regarding an in house application and not a distributed product.
It depends on what you want to accomplish.
Do you want to store the deleted archives somewhere? Log the changes? A SQL job should perform better since it runs directly in the database, but it is easier to give a service access to resources outside the database. So it depends on what you want to do.
I would think a scheduled SQL job would be a safer solution since if the database is migrated to a new machine, someone doing the migration might forget that there is a windows service involved and forget to start/install it on the new server.
In the past we've had a number of SQL jobs running. Recently, however, we've been moving to calling those processes from .NET code in a client application run as a Windows scheduled task, for two reasons:
It's easier to implement features like logging this way.
We have other batch jobs that don't run in the database, and therefore must be in windows scheduled tasks. This way all the batch jobs of any type will be listed in one place.
Please note that regardless of how you do it, for this task you do not want a service. Services run all day, and will consume a bit of the server's RAM all day.
In this case, you have a task you need to run once a day, every day. As such, you'd want either a job in SQL Server or, as Joel described, an application (console or WinForms) that is set up on a schedule to execute and then unload from the server's memory space.
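For what it's worth, the scheduled console application can be very small. The sketch below is in Python rather than .NET and runs against sqlite3 so it is self-contained; the records table, its archived and archived_at columns, and the text timestamp format are assumptions made for the example, not details from the question.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_archived(conn, retention_days, now=None):
    """Delete archived rows older than retention_days.

    Returns the number of rows removed, which is handy for logging.
    """
    now = now or datetime.now()
    cutoff = (now - timedelta(days=retention_days)).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        "DELETE FROM records WHERE archived = 1 AND archived_at < ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount
```

The program connects, calls purge_archived once, logs the count and exits, so nothing stays resident between runs.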
Is this for you/in house, or is this part of a product that you distribute?
If in house, I'd say the SQL job. That's just another service too.
If it's part of a product that you distribute, I would consider how the installation and support will be.
To follow on Corey's point, if this is externally distributed, will you need to support SQL Express? If not, I'd go with the SQL job directly. Otherwise, you'll have to get more creative as SQL Express does not have the SQL Agent that comes with the full versions of SQL 2005 (as well as MSDE). Without the SQL Agent, you'll need another way to automatically fire the job. It could be a Windows service, a scheduled task (calling a .NET app, PowerShell script, VBScript, etc.), or you could try to implement some trigger in SQL Server directly.