Convert Existing Stored Procedure having multiple update statements on Mosaic - mosaic-decisions

I need to convert an existing stored procedure that contains multiple UPDATE statements.
Example:
The current query in the stored procedure is:
CREATE PROCEDURE [dbo].[SP_Load_Catalog]
AS
BEGIN
SET NOCOUNT ON;
truncate table dbo..Dummy2
update A set Market = b.code_cmmt
From dbo..Dummy2 A
inner join dbo..Dummy3 b on a.cm__chr03 = b.code_value and b.code_fldname = 'xar1'
How can this functionality be achieved on Mosaic?

To achieve this functionality on Mosaic Decisions, follow the steps listed below:
Drag in a Reader Node and configure it with the information about the source data.
Drag in the Custom SQL Process Node. Here, provide the UPDATE SQL queries that are equivalent to the logic mentioned in the Stored Procedure.
Attach a Writer Node and configure it to update the existing table.
After execution of the flow, you should have the desired result.

Related

When to use user defined functions in a SQL Server data warehouse

I am creating a DWH where I load the data into a staging DB, and before loading it into the final DB I apply all the UDFs that I have created to the data.
Source DB : Oracle
Dest DB : SQL Server
ETL Process : SSIS packages
So far I have not been processing anything in staging, in order to keep the load quick.
Question: is it quicker to apply the UDFs while the data is in staging, or should it be done when loading the data into the final DB?
Below facility_cd is a float value and I am passing it to a function emr_get_code_Description to get the corresponding description. The table where it's getting the description from is in the final DB. udf_replace_special_char is a simple function which is replacing a few special characters with NULL.
LTRIM(RTRIM([Dest_DWH].[dbo].udf_replace_special_char([Dest_DWH].[dbo].[emr_get_code_Description](Stg_ap.Facility_cd))))
In general, what is the better practice? Should I update this in staging and then load the data into the final DB after all conversions?
Function definitions :
Function 1 :
USE [PROD_DWH]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER function [dbo].[emr_get_code_Description](@cv int)
returns varchar(80)
as begin
-- Returns the code value display
declare @ret varchar(80)
select @ret = cv.DESCRIPTION
from PROD_DWH.DBO.table cv
where cv.code_value = @cv
and cv.active_ind = 1
return isnull(@ret, 0)
end;
Function 2 :
USE [PROD_DWH]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER function [dbo].[udf_replace_special_char](@var varchar(1000))
returns varchar(1000)
as begin
-- Strips CR, LF and TAB characters and replaces double quotes with single quotes
declare @return_var varchar(1000)
set @return_var = @var
set @return_var = replace(@return_var,CHAR(13),'')
set @return_var = replace(@return_var,CHAR(10),'')
set @return_var = replace(@return_var,CHAR(09),'')
set @return_var = replace(@return_var,CHAR(34),CHAR(39))
return isnull(@return_var, 0)
end;
First of all, as @Nick.McDermaid mentioned in the comments, best practice is to avoid user-defined functions. Many articles describe their effect on query performance:
Removing Function Calls for Better Performance in SQL Server
Performance Considerations of User-Defined Functions in SQL Server 2012
Are SQL Server Functions Dragging Your Query Down?
T-SQL Best Practices - Don't Use Scalar Value Functions in Column List or WHERE Clauses
There is no single ideal answer to this question; it depends on your case. Here are some tips to take into consideration:
First, if you are using SSIS to import data into the staging table, try replacing the user-defined functions with SSIS data flow components such as the Derived Column transformation or Lookups, which can improve the performance of the data import.
If you cannot replace the UDFs with SSIS components: if you are collecting data at high speed into a data lake (staging level) and then loading the data when needed, it is better to avoid using functions when importing data into the staging table.
If you need to guarantee high speed when loading data from the staging table, apply the functions in the first data import phase.
If the first data import phase (to the staging table) and the second phase (from the staging table) are not executed on the same machine, it could be better to execute the functions on the more performant machine.
If a function contains operations like lookups, try replacing them with joins.
...
Update 1
Now that you have posted the functions in your question: you can replace Function 2 with a Derived Column Transformation in your SSIS package:
ISNULL([Column]) ? "" : REPLACE(REPLACE(REPLACE(REPLACE([Column],CHAR(10),""),CHAR(13),""),CHAR(09),""),CHAR(34),CHAR(39))
You can also replace Function 1 with a Lookup Transformation in the SSIS package or with a LEFT JOIN in the SQL query.
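As a rough sketch of the LEFT JOIN option, the query below performs the same lookup as Function 1 once per set instead of once per row. The table names Staging.dbo.Stg_ap_table and PROD_DWH.dbo.code_value_table are placeholders (the question does not give the real names); the column names come from the function definition and the expression in the question.

```sql
-- Placeholder table names; substitute the real staging and code tables.
SELECT
    stg.Facility_cd,
    -- Mirrors "return isnull(@ret, 0)" in Function 1: unmatched codes become '0'.
    ISNULL(cv.DESCRIPTION, '0') AS Facility_Description
FROM Staging.dbo.Stg_ap_table AS stg
LEFT JOIN PROD_DWH.dbo.code_value_table AS cv
    ON  cv.code_value = stg.Facility_cd
    AND cv.active_ind = 1;   -- same filter as the function
```

One difference to keep in mind: if more than one active row exists for a code value, the join returns one output row per match, whereas the scalar function returns a single description.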

Manipulating data from a stored procedure by saving data into a table

Problem: I have a stored procedure in SQL Server 2012 and I want to filter its output so I only get the relevant information.
I am using Execute. The way I see it, I have two options:
save the result of the execution into a table, so I can use it for different purposes
put constraints to the variables in Execute so I only get the results I want
The first method is discussed here:
Insert results of a stored procedure into a temporary table.
My code is (due to company information I can't share the whole thing):
create table #mtable ( .... )
Insert into #mtable
Execute [myProcedure]
The error:
An INSERT EXEC statement cannot be nested.
I assume the error is caused by the code in the stored procedure. How can I fix that problem without looking into the code of the stored procedure? Is there another way to save the content in a table?
My problem can also be solved by option #2. Is it possible for me to manipulate the output from the stored procedure with something like:
Execute [myProcedure] where variable1 > 100

SSIS pre-evaluation phase taking long

I have a data flow that contains an OLE DB source (statement generated through a variable) which calls a stored procedure.
In SSMS, it takes 8 minutes, but the package itself takes 3 times longer to complete.
I've set the validation (DelayValidation) to true, so it still does it at run time. I've also set the validation of the metadata in the data flow component, as well as in the connection manager.
The data flows have ReadUncommitted on them as well.
I'm not sure where else to look; any assistance on how to make this run faster would be great.
I suspect the real problem is in your stored procedure, but I've included some basic SSIS items as well to try to fix your problem:
Ensure connection managers for OLE DB sources are all set to DelayValidation ( = True).
Ensure that ValidateExternalMetadata is set to false.
Set DefaultBufferMaxRows and DefaultBufferSize to correspond to the table's row sizes.
Drop and recreate your destination component in SSIS.
Ensure your stored procedure has SET ANSI_NULLS ON.
Ensure that the SQL in your sproc hits an index.
Add the query hint OPTION (FAST 10000). This hint makes the optimizer choose a plan that returns the first 10,000 rows quickly; 10,000 rows is the default SSIS DefaultBufferMaxRows value.
Review your stored procedure for SQL Server parameter sniffing.
Slow way:
create procedure GetOrderForCustomers(@CustID varchar(20))
as
begin
select * from orders
where customerid = @CustID
end
Fast way:
create procedure GetOrderForCustomersWithoutPS(@CustID varchar(20))
as
begin
declare @LocCustID varchar(20)
set @LocCustID = @CustID
select * from orders
where customerid = @LocCustID
end
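As a rough sketch of the query hint from the list above, OPTION (FAST 10000) goes at the end of the SELECT statement inside the procedure. The procedure name GetOrderForCustomersFast is hypothetical; the orders table and customerid column are taken from the examples above. Whether the hint helps depends on the data and the resulting plan, so treat it as something to test rather than a guaranteed fix.

```sql
-- Hypothetical procedure name, shown only to illustrate where the hint goes.
create procedure GetOrderForCustomersFast(@CustID varchar(20))
as
begin
    -- Copy the parameter to a local variable, as in the "fast way" above.
    declare @LocCustID varchar(20)
    set @LocCustID = @CustID

    select * from orders
    where customerid = @LocCustID
    option (fast 10000)   -- favour a plan that returns the first 10,000 rows quickly
end
```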

How to print select results from within a stored procedure

I have a stored procedure that I'm trying to debug (T-SQL).
It creates a temporary table and has several update statements that fill it with various data.
How can I insert statements to view the contents of this table at various points during the running of the stored procedure?
Ideally, I'll be running this directly from MS SQL Server Management Studio, and simple output to the Message frame will suffice.
Wouldn't it be possible to SELECT what you need? Sorry if I misunderstood the question. Selected variables/tables will be returned as results
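A minimal sketch of that idea, assuming a made-up procedure and temporary table (dbo.DebugDemo and #work are illustrative names, not from the question): SELECT statements placed between the updates show the table contents in the Results pane, while PRINT writes text to the Messages pane.

```sql
-- Illustrative procedure; dbo.DebugDemo and #work are hypothetical names.
CREATE PROCEDURE dbo.DebugDemo
AS
BEGIN
    SET NOCOUNT ON;

    CREATE TABLE #work (id int, status varchar(20));
    INSERT INTO #work (id, status) VALUES (1, 'new'), (2, 'new');

    -- Checkpoint 1: table contents before the update (Results pane).
    SELECT 'after insert' AS debug_step, * FROM #work;

    DECLARE @updated int;
    UPDATE #work SET status = 'processed' WHERE id = 1;
    SET @updated = @@ROWCOUNT;   -- capture immediately, before another statement resets it

    -- Checkpoint 2: table contents after the update (Results pane).
    SELECT 'after update' AS debug_step, * FROM #work;

    -- Text output goes to the Messages pane.
    PRINT 'Rows updated: ' + CAST(@updated AS varchar(10));
END
GO

EXEC dbo.DebugDemo;
```

The debugging SELECT statements add extra result sets, so they should be removed or commented out before other code consumes the procedure's output.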

What is a stored procedure?

What is a "stored procedure" and how do they work?
What is the make-up of a stored procedure (things each must have to be a stored procedure)?
A stored procedure is a batch of SQL statements that can be executed in a couple of ways. Most major DBMSs support stored procedures; however, not all do. You will need to check your particular DBMS's help documentation for specifics. As I am most familiar with SQL Server, I will use that for my samples.
To create a stored procedure the syntax is fairly simple:
CREATE PROCEDURE <owner>.<procedure name>
<Param> <datatype>
AS
<Body>
So for example:
CREATE PROCEDURE Users_GetUserInfo
@login nvarchar(30)=null
AS
SELECT * from [Users]
WHERE ISNULL(@login,login)=login
A benefit of stored procedures is that you can centralize data access logic into a single place that is then easy for DBA's to optimize. Stored procedures also have a security benefit in that you can grant execute rights to a stored procedure but the user will not need to have read/write permissions on the underlying tables. This is a good first step against SQL injection.
Stored procedures do come with downsides, basically the maintenance associated with your basic CRUD operation. Let's say for each table you have an Insert, Update, Delete and at least one select based on the primary key, that means each table will have 4 procedures. Now take a decent size database of 400 tables, and you have 1600 procedures! And that's assuming you don't have duplicates which you probably will.
This is where using an ORM or some other method to auto generate your basic CRUD operations has a ton of merit.
A stored procedure is a set of precompiled SQL statements that are used to perform a special task.
Example: If I have an Employee table
Employee ID   Name        Age   Mobile
---------------------------------------
001           Sidheswar   25    9938885469
002           Pritish     32    9178542436
First I am retrieving the Employee table:
Create Procedure Employee_details
As
Begin
Select * from Employee
End
To run the procedure on SQL Server:
Execute Employee_details
--- (Employee_details is a user-defined name; choose any valid name you want)
Then second, I am inserting the value into the Employee Table
Create Procedure employee_insert
(@EmployeeID int, @Name Varchar(30), @Age int, @Mobile int)
As
Begin
Insert Into Employee
Values (@EmployeeID, @Name, @Age, @Mobile)
End
To run the parametrized procedure on SQL Server:
Execute employee_insert 003,'xyz',27,1234567890
--(Parameter size must be the same as the declared column size)
Example: @Name Varchar(30)
In the Employee table the Name column's size must be varchar(30).
A stored procedure is a group of SQL statements that has been created and stored in the database. A stored procedure accepts input parameters, so a single procedure can be used over the network by several clients using different input data. Stored procedures reduce network traffic and can improve performance. If we modify a stored procedure, all the clients get the updated stored procedure.
Sample of creating a stored procedure
CREATE PROCEDURE test_display
AS
SELECT FirstName, LastName
FROM tb_test;
GO
-- GO ends the batch; without it the EXEC below would become part of the procedure body
EXEC test_display;
Advantages of using stored procedures
A stored procedure allows modular programming.
You can create the procedure once, store it in the database, and call it any number of times in your program.
A stored procedure allows faster execution.
If the operation requires a large amount of SQL code that is performed repetitively, stored procedures can be faster. They are parsed and optimized when they are first executed, and a compiled version of the stored procedure remains in a memory cache for later use. This means the stored procedure does not need to be reparsed and reoptimized with each use, resulting in much faster execution times.
A stored procedure can reduce network traffic.
An operation requiring hundreds of lines of Transact-SQL code can be performed through a single statement that executes the code in a procedure, rather than by sending hundreds of lines of code over the network.
Stored procedures provide better security to your data
Users can be granted permission to execute a stored procedure even if they do not have permission to execute the procedure's statements directly.
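To make the security point concrete, here is a small sketch of granting execute rights without granting table access. The role name report_reader is hypothetical, and the example assumes that test_display and tb_test from the sample above both live in the dbo schema with the same owner, so that ownership chaining applies.

```sql
-- Hypothetical role used only for this illustration.
CREATE ROLE report_reader;
GO

-- Members of the role may run the procedure...
GRANT EXECUTE ON OBJECT::dbo.test_display TO report_reader;
GO

-- ...but no SELECT permission on dbo.tb_test is granted.
-- For a member of report_reader:
--   EXEC dbo.test_display;        -- allowed, returns FirstName and LastName
--   SELECT * FROM dbo.tb_test;    -- denied, no permission on the table itself
```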
In SQL Server we have different types of stored procedures:
System stored procedures
User-defined stored procedures
Extended stored Procedures
System stored procedures are stored in the master database and start with the sp_ prefix. These procedures can be used to perform a variety of tasks that support SQL Server functions, for example for external application calls against the system tables.
Example: sp_helptext [StoredProcedure_Name]
User-defined stored procedures are usually stored in a user database and are typically designed to complete tasks in that database. When coding these procedures, don't use the sp_ prefix: if a name starts with sp_, SQL Server checks the master database first and only then the user-defined database.
Extended stored procedures are procedures that call functions from DLL files. Extended stored procedures are now deprecated, so it is better to avoid using them.
Generally, a stored procedure is a "SQL Function." They have:
-- a name
CREATE PROCEDURE spGetPerson
-- parameters
CREATE PROCEDURE spGetPerson(@PersonID int)
-- a body
CREATE PROCEDURE spGetPerson(@PersonID int)
AS
SELECT FirstName, LastName ....
FROM People
WHERE PersonID = @PersonID
This is a T-SQL focused example. Stored procedures can execute most SQL statements, return scalar and table-based values, and are considered to be more secure because they prevent SQL injection attacks.
Think of a situation like this,
You have a database with data.
There are a number of different applications needed to access that central database, and in the future some new applications too.
If you are going to insert the inline database queries to access the central database, inside each application's code individually, then probably you have to duplicate the same query again and again inside different applications' code.
In that kind of situation, you can use stored procedures (SPs). With stored procedures, you write a number of common queries (procedures) and store them with the central database.
Now the duplication of work will never happen as before and the data access and the maintenance will be done centrally.
NOTE:
In the above situation, you may wonder, "Why can't we introduce a central data access server to interact with all the applications?" Yes, that is a possible alternative. But
the main advantage of SPs over that approach is that, unlike data-access code with inline queries, SPs are pre-compiled statements, so they will execute faster, and communication costs (over networks) will be kept to a minimum.
Opposite to that, SPs will add some more load to the database server. If that would be a concern according to the situation, a centralized data access server with inline queries will be a better choice.
A stored procedure is mainly used to perform certain tasks on a database. For example
Get database result sets from some business logic on data.
Execute multiple database operations in a single call.
Used to migrate data from one table to another table.
Can be called from other programming languages, like Java.
A stored procedure is nothing but a group of SQL statements compiled into a single execution plan.
Create it once and call it any number of times
It reduces the network traffic
Example: creating a stored procedure
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE GetEmployee
@EmployeeID int = 0
AS
BEGIN
SET NOCOUNT ON;
SELECT FirstName, LastName, BirthDate, City, Country
FROM Employees
WHERE EmployeeID = @EmployeeID
END
GO
Alter or modify a stored procedure:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE GetEmployee
@EmployeeID int = 0
AS
BEGIN
SET NOCOUNT ON;
SELECT FirstName, LastName, BirthDate, City, Country
FROM Employees
WHERE EmployeeID = @EmployeeID
END
GO
Drop or delete a stored procedure:
DROP PROCEDURE GetEmployee
A stored procedure is used to retrieve data, modify data, and delete data in database table. You don't need to write a whole SQL command each time you want to insert, update or delete data in an SQL database.
A stored procedure is a precompiled set of one or more SQL statements which perform some specific task.
A stored procedure is executed on its own using EXEC.
A stored procedure can return multiple output parameters.
A stored procedure can be used to implement transactions.
"What is a stored procedure" is already answered in other posts here. What I will post is one less known way of using stored procedure. It is grouping stored procedures or numbering stored procedures.
Syntax Reference
; number
An optional integer that is used to group procedures of the same name. These grouped procedures can be dropped together by using one DROP PROCEDURE statement
Example
CREATE Procedure FirstTest
(
@InputA INT
)
AS
BEGIN
SELECT 'A' + CONVERT(VARCHAR(10),@InputA)
END
GO
CREATE Procedure FirstTest;2
(
@InputA INT,
@InputB INT
)
AS
BEGIN
SELECT 'A' + CONVERT(VARCHAR(10),@InputA)+ CONVERT(VARCHAR(10),@InputB)
END
GO
Use
exec FirstTest 10
exec FirstTest;2 20,30
Result
Another Attempt
CREATE Procedure SecondTest;2
(
@InputA INT,
@InputB INT
)
AS
BEGIN
SELECT 'A' + CONVERT(VARCHAR(10),@InputA)+ CONVERT(VARCHAR(10),@InputB)
END
GO
Result
Msg 2730, Level 11, State 1, Procedure SecondTest, Line 1 [Batch Start Line 3]
Cannot create procedure 'SecondTest' with a group number of 2 because a procedure with the same name and a group number of 1 does not currently exist in the database.
Must execute CREATE PROCEDURE 'SecondTest';1 first.
References:
CREATE PROCEDURE with the syntax for number
Numbered Stored Procedures in SQL Server - techie-friendly.blogspot.com
Grouping Stored Procedures - sqlmag
CAUTION
After you group the procedures, you can't drop them individually.
This feature may be removed in a future version of Microsoft SQL Server.
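As a brief sketch of the caution above, and assuming the FirstTest group from the example exists, a single DROP PROCEDURE on the base name is expected to remove every procedure in the group; no group number is given in the statement.

```sql
-- Assumes FirstTest;1 and FirstTest;2 were created as in the example above.
-- Dropping by base name removes the whole numbered group in one statement.
DROP PROCEDURE FirstTest;
GO
```

Because the group can only be dropped as a whole, removing just FirstTest;2 means dropping the group and recreating the procedures that should remain.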
Put simply,
a stored procedure is a stored program: a program or function stored in the database.
Each stored program contains a body that consists of an SQL statement. This statement may be a compound statement made up of several statements separated by semicolon (;) characters.
CREATE PROCEDURE dorepeat(p1 INT)
BEGIN
SET @x = 0;
REPEAT SET @x = @x + 1; UNTIL @x > p1 END REPEAT;
END;
A stored procedure is a named collection of SQL statements and procedural logic that is compiled, verified and stored in the server database. A stored procedure is typically treated like other database objects and controlled through the server's security mechanism.
In a DBMS, a stored procedure is a set of SQL statements with an assigned name that's stored in the database in compiled form so that it can be shared by a number of programs.
The use of a stored procedure can be helpful in:
Providing controlled access to data (end users can only enter or change data, but can't write procedures)
Ensuring data integrity (data is entered in a consistent manner)
Improving productivity (the statements of a stored procedure need to be written only once)
Stored procedures in SQL Server can accept input parameters and return multiple values through output parameters. They contain programming statements that perform operations in the database and return a status value to a calling procedure or batch.
The benefits of using stored procedures in SQL Server
They allow modular programming.
They allow faster execution.
They can reduce network traffic.
They can be used as a security mechanism.
Here is an example of a stored procedure that takes a parameter, executes a query and returns a result. Specifically, the stored procedure accepts the BusinessEntityID as a parameter and uses it to match the primary key of the HumanResources.Employee table to return the requested employee.
create procedure HumanResources.uspFindEmployee    -- stored procedure name
@businessEntityID int                              -- parameter
as
begin
SET NOCOUNT ON;
Select businessEntityId,                           -- select statement to return one employee row
NationalIdNumber,
LoginID,
JobTitle,
HireDate
From HumanResources.Employee
where businessEntityId = @businessEntityID         -- parameter used as criteria
end
I learned this from essential.com...it is very useful.
A stored procedure lets you keep code on the server. You can pass parameters and get output.
create procedure procedure_name (@para1 int, @para2 decimal)
as
select * from TableName
In stored procedures, statements are written only once, which reduces network traffic between clients and servers.
We can also avoid SQL injection attacks.
If you are using a third-party program in your application for processing payments, the database should expose only the information and activity that this third party has been authorized for. We can achieve this data confidentiality by setting permissions using stored procedures.
An update should change only the table it is targeting and no other table; we can achieve this data integrity by using transaction processing and error handling.
If you want to return one or more items with a data type, it is better to use an output parameter.
In stored procedures, we use an output parameter for anything that needs to be returned. If you want to return only one item with an integer data type, a return value can be used instead, although the return value is normally only used to report the success or failure of the stored procedure.
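The split between output parameters and return values can be sketched as follows. The procedure name dbo.GetEmployeeCount and the status codes 0 and 1 are assumptions made for illustration; the Employees table and City column are borrowed from the GetEmployee example earlier on this page.

```sql
-- Hypothetical procedure: the OUTPUT parameter carries the data,
-- the RETURN value only signals success (0) or "nothing found" (1).
CREATE PROCEDURE dbo.GetEmployeeCount
    @City nvarchar(50),
    @EmployeeCount int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    SELECT @EmployeeCount = COUNT(*)
    FROM Employees
    WHERE City = @City;

    IF @EmployeeCount = 0
        RETURN 1;   -- status: no employees in that city

    RETURN 0;       -- status: success
END
GO

-- Calling it: capture both the status code and the output value.
DECLARE @cnt int, @status int;
EXEC @status = dbo.GetEmployeeCount @City = N'London', @EmployeeCount = @cnt OUTPUT;
SELECT @status AS status_code, @cnt AS employee_count;
```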
Preface: In 1992 the SQL92 standard was created and was popularised by the Firebase DB. This standard introduced the 'Stored Procedure'.
Passthrough Query: A string (normally concatenated programmatically) that evaluates to a syntactically correct SQL statement, normally generated at the server tier (in languages such as PHP, Python, PERL etc). These statements are then passed onto the database.
Trigger: a piece of code designed to fire in response to a database event (typically a DML event), often used for enforcing data integrity.
The best way to explain what a stored procedure is, is to explain the legacy way of executing DB logic (ie not using a Stored Procedure).
The legacy way of creating systems was to use a 'Passthrough Query' and possibly have triggers in the DB.
Pretty much anyone who doesn't use Stored Procedures uses a thing called a 'Passthrough Query'.
With the modern convention of Stored Procedures, triggers became legacy along with 'Passthrough Queries'.
The advantages of stored procedures are:
They can be cached, as the physical text of the Stored Procedure never changes.
They have built-in mechanisms against malicious SQL injection.
Only the parameters need be checked for malicious SQL injection, saving a lot of processor overhead.
Most modern database engines actually compile Stored Procedures.
They increase the degree of abstraction between tiers.
They occur in the same process as the database, allowing for greater optimisation and throughput.
The entire workflow of the back end can be tested without client-side code (for example, with the Execute command in Transact-SQL or the CALL command in MySQL).
They can be used to enhance security because they can be leveraged to disallow the database to be accessed in a way that is inconsistent with how the system is designed to work. This is done through the database user permission mechanism. For example, you can give users privileges only to EXECUTE Stored Procedures rather than SELECT, UPDATE etc. privileges.
No need for the DML layer associated with triggers. (Using so much as one trigger opens up a DML layer, which is very processor intensive.)
In summary when creating a new SQL database system there is no good excuse to use Passthrough Queries.
It is also worth mentioning that it is perfectly safe to use Stored Procedures in legacy systems that already use triggers or Passthrough Queries; migration from legacy to Stored Procedures is therefore very easy and need not take a system down for long, if at all.
create procedure <owner>.<procedure name>
<param> <datatype>
As
<body>
Stored procedures are groups of SQL statements that centralize data access in one place. They are useful for performing multiple operations in one shot.