I had been getting data from production and loading it into a staging DWH. And then loading it further into usable format later in another DWH from staging.
Both staging and the final DWH are on the same server. This process wasn't taking long before, but now it's taking ages to load the data from staging. It takes a few minutes to load data from production into staging, but it takes hours to load it further and I am not sure why.
FYI: I had been testing the loads, so I have truncated/deleted the table a few times and reloaded them
Also I had a non clustered index on one of the columns in actual DWH which I removed
CONSTRAINT [PK_EncounterTB_Encounter_id]
PRIMARY KEY CLUSTERED ([Encounter_id] ASC),
CONSTRAINT [Uniq_EncounterTB_Encounter_table_id]
UNIQUE NONCLUSTERED ([Encounter_Table_id] ASC)
Below is the table structure for staging and I have removed a few of the columns:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Stg_Encounter]
(
[encntr_id] [float] NOT NULL,
[person_id] [float] NOT NULL,
[visit_id_stay_number] [varchar](1000) NULL,
[mrn] [varchar](1000) NULL,
[encntr_type_cd] [float] NULL,
[reg_dt_tm] [datetime2](7) NULL,
[disch_dt_tm] [datetime2](7) NULL,
[admit_cd] [float] NULL,
[visit_cd] [float] NULL,
[source_cd] [float] NULL,
[sepearation_cd] [float] NULL,
[medical_service_cd] [float] NULL,
[reason_problem] [varchar](1000) NULL,
) ON [PRIMARY]
GO
For the actual DWH, the table structure is as below :
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Encounter]
(
[Encounter_Table_id] [int] NOT NULL,
[Encounter_id] [int] NOT NULL,
[Person_id] [int] NOT NULL,
[Visit_ID] [varchar](1000) NULL,
[MRN] [varchar](1000) NULL,
[Encounter_Type] [varchar](1000) NULL,
[Arrival_Dt_Tm] [datetime2](7) NULL,
[Departure_Dt_Tm] [datetime2](7) NULL,
[Mode_of_Arrival] [varchar](1000) NULL,
[Visit_Type] [varchar](1000) NULL,
[Admit_Source] [varchar](1000) NULL,
[Mode_of_Separation] [varchar](1000) NULL,
[Medical_Service] [varchar](1000) NULL,
[Presenting_Problem] [varchar](1000) NULL,
[LOAD_Dt_Tm] [datetime] NOT NULL,
[Data_Source] [varchar](1000) NOT NULL,
CONSTRAINT [PK_EncounterTB_Encounter_id]
PRIMARY KEY CLUSTERED ([Encounter_id] ASC)
) ON [PRIMARY]
GO
Below insert is being used for inserting data :
INSERT INTO [ACTUAL_DWH].[dbo].[Encounter]
(
[Encounter_Table_id]
,[Encounter_id]
,[Person_id]
,[Visit_ID]
,[MRN]
,[Encounter_Type]
,[Arrival_Dt_Tm]
,[Departure_Dt_Tm]
,[Mode_of_Arrival]
,[Visit_Type]
,[Admit_Source]
,[Mode_of_Separation]
,[Medical_Service]
,[Presenting_Problem]
,[MSAU_LOAD_Dt_Tm]
,[Data_Source]
)
SELECT
[Encounter_Table_id]= CONVERT(INT,Stg_e.[encntr_id])
, [Encounter_id] = CONVERT(INT,Stg_e.[encntr_id])
,[Person_id] = CONVERT(INT,Stg_e.[person_id])
,[Visit_ID] = Stg_e.[visit_id_stay_number]
,[MRN] = Stg_e.[mrn]
,[Encounter_Type] = [ACTUAL_DWH].[dbo].[emr_get_code_Description](Stg_e.encntr_type_cd)
,[Arrival_Dt_Tm] = CONVERT(DATETIME,Stg_e.reg_dt_tm)
,[Departure_Dt_Tm] = CONVERT(DATETIME,Stg_e.disch_dt_tm)
,[Mode_of_Arrival] = [ACTUAL_DWH].[dbo].[Description](Stg_e.admit_cd)
,[Visit_Type] = [ACTUAL_DWH].[dbo].[Description](Stg_e.visit_cd)
,[Admit_Source] = [ACTUAL_DWH].[dbo].[Description](Stg_e.source_cd)
,[Mode_of_Separation] = [ACTUAL_DWH].[dbo].[Description](Stg_e.sepearation_cd)
,[Medical_Service] = [ACTUAL_DWH].[dbo].[Description](Stg_e.medical_service_cd)
,[Presenting_Problem] = Stg_e.reason_problem
,[MSAU_LOAD_Dt_Tm] = getdate()
,[Data_Source] = 'SourceName'
FROM [dbo].Stg_Encounter Stg_e
where NOT EXISTS ( SELECT 1 FROM [ACTUAL_DWH].[dbo].Encounter e
WHERE stg_e.encntr_id = e.encounter_id)
The function used is as per below :
USE [ACTUAL_DWH]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER function [dbo].[Description](#cv int)
returns varchar(80)
as begin
declare #ret varchar(80)
select #ret = cv.DESCRIPTION
from ACTUAL_DWH.DBO.CODE_VALUE cv
where cv.code_value = #cv
and cv.active_ind = 1
return isnull(#ret, 0)
end;
I am just confused where I have missed stuff!!! And what can I change. the table has around 6 million rows and it was loading them in a minute.
After the suggestions provided, I got to know that issue is with the function that I am using.I have read about CROSS APPLY but is it a good idea to apply CROSS APPLY on 15 columns?
You can use SQL CREATE INDEX Statement to retrieve data from the database very fast.
CREATE INDEX IX_Encounter
ON [ACTUAL_DWH].[dbo].[Encounter](Encounter_Table_id) ON [PRIMARY]
INSERT INTO [ACTUAL_DWH].[dbo].[Encounter]
(
[Encounter_Table_id]
,[Encounter_id]
,[Person_id]
,[Visit_ID]
,[MRN]
,[Encounter_Type]
,[Arrival_Dt_Tm]
,[Departure_Dt_Tm]
,[Mode_of_Arrival]
,[Visit_Type]
,[Admit_Source]
,[Mode_of_Separation]
,[Medical_Service]
,[Presenting_Problem]
,[MSAU_LOAD_Dt_Tm]
,[Data_Source]
)
SELECT
[Encounter_Table_id]= CONVERT(INT,Stg_e.[encntr_id])
, [Encounter_id] = CONVERT(INT,Stg_e.[encntr_id])
,[Person_id] = CONVERT(INT,Stg_e.[person_id])
,[Visit_ID] = Stg_e.[visit_id_stay_number]
,[MRN] = Stg_e.[mrn]
,[Encounter_Type] = [ACTUAL_DWH].[dbo].[emr_get_code_Description](Stg_e.encntr_type_cd)
,[Arrival_Dt_Tm] = CONVERT(DATETIME,Stg_e.reg_dt_tm)
,[Departure_Dt_Tm] = CONVERT(DATETIME,Stg_e.disch_dt_tm)
,[Mode_of_Arrival] = [ACTUAL_DWH].[dbo].[Description](Stg_e.admit_cd)
,[Visit_Type] = [ACTUAL_DWH].[dbo].[Description](Stg_e.visit_cd)
,[Admit_Source] = [ACTUAL_DWH].[dbo].[Description](Stg_e.source_cd)
,[Mode_of_Separation] = [ACTUAL_DWH].[dbo].[Description](Stg_e.sepearation_cd)
,[Medical_Service] = [ACTUAL_DWH].[dbo].[Description](Stg_e.medical_service_cd)
,[Presenting_Problem] = Stg_e.reason_problem
,[MSAU_LOAD_Dt_Tm] = getdate()
,[Data_Source] = 'SourceName'
FROM [dbo].Stg_Encounter Stg_e
where NOT EXISTS ( SELECT 1 FROM [ACTUAL_DWH].[dbo].Encounter e
WHERE stg_e.encntr_id = e.encounter_id)
You can check more info about index here.INDEX
Just to close this post. As suggested I had tried to break down the query and found that the function was the culprit. I am exploring it further for the resolution.
The function is taking parameter and running a SQL on another table. Which is slowing down the query. If I do the insert without the function it actually takes a few seconds to load 6 million rows.
Related
I am trying to insert a table into multiple databases.
I have figured out how to make it insert the single table into all of the Databases but I need it to exclude about 10 databases like the master, model, tempdb, msdb, then I have some others as well.
EXEC sp_msforeachdb
'
USE ?;
IF [?] NOT IN master, model, tempdb,
CREATE TABLE [?].[dbo].[FaceEnrollments](
[EmployeeNo] [varchar](10) NOT NULL,
[TIN] [varchar](20) NOT NULL,
[irFaceTemplate] [varchar](max) NULL,
[faceBitmap] [varchar](max) NULL,
[faceAttestationAccepted] [bit] NULL,
[EnrollmentDateTime] [datetime] NULL,
CONSTRAINT [PK_FaceEnrollments] PRIMARY KEY CLUSTERED
(
[EmployeeNo] ASC,
[TIN] ASC
)
);
';
This is what I got so far to add the table to all databases.
I have tried multiple things to exclude databases from this, but I cannot figure it out.
I was hoping someone could help me out. Please and Thank you!
you can adjust your dynamic query to this:
EXEC sp_msforeachdb
'
IF ''?'' NOT IN (''master'', ''model'', ''tempdb'')
CREATE TABLE ?.[dbo].[FaceEnrollments](
[EmployeeNo] [varchar](10) NOT NULL,
[TIN] [varchar](20) NOT NULL,
[irFaceTemplate] [varchar](max) NULL,
[faceBitmap] [varchar](max) NULL,
[faceAttestationAccepted] [bit] NULL,
[EnrollmentDateTime] [datetime] NULL,
CONSTRAINT [PK_FaceEnrollments] PRIMARY KEY CLUSTERED
(
[EmployeeNo] ASC,
[TIN] ASC
)
);
';
Also I recommand the better and more flexible sp originally created by Aaron Bertrand, you can download from Github for free :
SQL-Server-First-Responder-Kit/sp_ineachdb
which you can use exclude_list or exclude_pattern parameter to exclude databases:
exec [dbo].[sp_ineachdb]
#exclude_list = 'List of excluded databases'
#command = 'CREATE TABLE [dbo].[FaceEnrollments](...)'
I've been having this problem with my database where it kept on incrementing the id column even though it has been removed. To better understand what I meant, here is a screenshot of my gridview:
As you can see from the id column, everything is fine from 11 - 16. but it suddenly skipped from 25 - 27. What i want to happen is, when i remove an item, i want it to start from the last id which is 16. So the next id should be 17. I hope this makes sense for you guys.
Here is also part of the SQL script:
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[guitarItems]
(
[id] [int] IDENTITY(1,1) NOT NULL,
[type] [varchar](50) NOT NULL,
[brand] [varchar](50) NOT NULL,
[model] [varchar](50) NOT NULL,
[price] [float] NOT NULL,
[itemimage1] [varchar](255) NULL,
[itemimage2] [varchar](255) NULL,
[description] [text] NOT NULL,
[necktype] [varchar](100) NOT NULL,
[body] [varchar](100) NOT NULL,
[fretboard] [varchar](100) NOT NULL,
[fret] [varchar](50) NOT NULL,
[bridge] [varchar](100) NOT NULL,
[neckpickup] [varchar](100) NOT NULL,
[bridgepickup] [varchar](100) NOT NULL,
[hardwarecolor] [varchar](50) NOT NULL,
PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
You can use:
DBCC CHECKIDENT ("YourTableNameHere", RESEED, 1);
Before using it, visit link: https://learn.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-checkident-transact-sql
Primary autoincrement keys in the database are used to uniquely identify a given row and shouldn't be given any business meaning. So leave the primary key as it is and add another column like guitarItemsId. Then when you delete a record from the database you may want to send an additional UPDATE statement in order to decrease the guitarItemsId column of all rows that have the guitarItemsId greater than the one you are currently deleting.
Also, remember that you should never modify the value of a primary key in a relational database because there could be other tables that reference it as a foreign key and modifying it might violate the referential constraints.
I am trying to create a table that dynamically pulls the starting IDENTITY ID based on a variable from another table. The SQL executes successfully but afterwards, I am unable to find my temporary table. The DBCC CHECKIDENT brings back Invalid object name '#address_temp'.
IF OBJECT_ID('tempdb.dbo.#address_temp', 'U') IS NOT NULL
DROP TABLE #address_temp
DECLARE #address_temp_ID VARCHAR(MAX)
SET #address_temp_ID = (SELECT MAX(ID) FROM [PRIMARYDB].[dbo].[ADDRESS])
DECLARE #SQLBULK VARCHAR(MAX)
SET #SQLBULK = 'CREATE TABLE #address_temp(
[ID] [int] IDENTITY(' + #address_temp_ID + ',1) NOT NULL,
[NAME] [varchar](128) NOT NULL,
[ADDRESS1] [varchar](128) NOT NULL,
[ADDRESS2] [varchar](128) NULL,
[CITY] [varchar](128) NULL,
[STATE_ID] [smallint] NULL,
[ZIP] [varchar](10) NOT NULL,
[COUNTY] [varchar](50) NULL,
[COUNTRY] [varchar](50) NULL
CREATE CLUSTERED INDEX pk_add ON #address_temp ([NAME])'
EXEC (#SQLBULK)
DBCC CHECKIDENT('#address_temp')
Table names that start with # are temporary tables and SQL Server treats them differently. First of all they are only available to the session that created them (this is not quite true since they you can find them in the temp name space but they have a unique system generated name)
In any case they won't persist so you don't need to drop them (that happens auto-magically) and you certainly can't look at them after your session ends.... they are gone.
Don't use a temp table, take out the # in the name. Things will suddenly start working.
I am working on a project where I need to extract the data from excel sheet to SQL Server
, well that bit have done successfully. Now my problem is that for a particular column
called product size, I want to update current table based on product size in other table, I am really very confused , please help me out
Please find the table structure
CREATE TABLE [dbo].[T_Product](
[ProductID] [int] IDENTITY(1,1) NOT NULL,
[PartNo] [nvarchar](255) NULL,
[CategoryID] [int] NULL,
[MaterialID] [float] NULL,
[WireformID] [float] NULL,
[ProductName] [nvarchar](50) NULL,
[ProductSize] [nvarchar](50) NULL,
[ProductLength] [varchar](20) NULL,
[ProductActive] [bit] NULL,
[ProductImage] [varchar](60) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[T_ProductSize](
[Code] [int] IDENTITY(1,1) NOT NULL,
[ProductSize] [nvarchar](50) NULL,
[Length] [nchar](20) NULL
) ON [PRIMARY]
GO
OK, so ignore the previous answer, I got the wrong end of the stick!!
You want something like this I think:
UPDATE T_Product
SET [ProductLength] = ps.[Length]
FROM T_Product p
INNER JOIN T_ProductSize ps
ON p.[ProductSize] = ps.[ProductSize]
That will take the Length value from T_ProductSize and place it in T_Product.ProductLength based on the value of T_Product.ProductSize
You mention a foreign key but you haven't included a definition for it. Is it between the two tables in your example? If so, which columns make the key? Is product size the key? If so, then your question doesn't make a lot of sense as the value will be the same in both tables.
Is it possible that you mean the product size is to be stored in a separate table and not in the T_Product table? In that case then instead of ProductSize in T_Product you will want the code from the T_ProductSize table (can I also suggest that instead of 'code' you call it 'ProductSizeCode' or better yet 'ProductSizeId' or similar? having columns simply called code can be very confusing as you have no simple way of know what table that value is in). Also, you should always create a primary key on each table: You cannot have a foreign key without one. They don't have to be clustered, that will depend upon hwo your search the table, but I am using a clustered PK for this example. That would give you something like this:
CREATE TABLE [dbo].[T_Product](
[ProductID] [int] IDENTITY(1,1) NOT NULL,
[PartNo] [nvarchar](255) NULL,
[CategoryID] [int] NULL,
[MaterialID] [float] NULL,
[WireformID] [float] NULL,
[ProductName] [nvarchar](50) NULL,
[ProductSizeId] [int] NOT NULL,
[ProductLength] [varchar](20) NULL,
[ProductActive] [bit] NULL,
[ProductImage] [varchar](60) NULL
CONSTRAINT [PK_T_Product] PRIMARY KEY CLUSTERED
(
[ProductID] ASC
) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[T_ProductSize](
[ProductSizeId] [int] IDENTITY(1,1) NOT NULL,
[ProductSize] [nvarchar](50) NULL,
[Length] [nchar](20) NULL
CONSTRAINT [PK_T_ProductSize] PRIMARY KEY CLUSTERED
(
[ProductSizeId] ASC
) ON [PRIMARY]
) ON [PRIMARY]
GO
--Now add your foreign key to T_Product.
ALTER TABLE [T_Product] WITH NOCHECK ADD CONSTRAINT [FK_Product_ProductSize] FOREIGN KEY([ProductSizeId])
REFERENCES [T_ProductSize] ([ProductSizeId])
GO
ALTER TABLE [T_Product] CHECK CONSTRAINT [FK_Product_ProductSize]
GO
Now, to retrieve your product along with the product size use something like this:
SELECT p.[ProductID], p.[PartNo], p.[CategoryID], p.[MaterialID], p.[WireformID], p.[ProductName],
ps.[ProductSize], ps.[Length], p.[ProductLength], p.[ProductActive], p.[ProductImage]
FROM [T_Product] p
INNER JOIN [T_ProductSize] ps
ON ps.[ProductSizeId] = p.[ProductSizeId]
If I have understood you correctly, then this is what I think you're after. If not, then have another go at explaining what it is you need and I'll try again.
Try this...
UPDATE t2
SET t2.ProductLength = t1.Length
FROM dbo.T_ProductSize t1
INNER JOIN dbo.T_Product t2 ON t1.ProductSize = t2.ProductSize
I have (another) question about indexing.
I use the following code:
CREATE TABLE [dbo].[PnrDetails1](
[OId] [int] IDENTITY(1,1) NOT NULL ,
[file_name] [varchar](256) NOT NULL,
[gds_id] [int] NOT NULL,
[pnr_locator] [varchar](15) NOT NULL,
[first_cust_name] [varchar](50) NOT NULL,
[ticket_number] [varchar](20) NOT NULL,
[full_price] [decimal](18, 0) NOT NULL,
[currency_desc] [varchar](4) NOT NULL,
[user_name] [varchar](50) NOT NULL,
[save_time] [datetime] NOT NULL,
[update_time] [datetime] NOT NULL,
[clerk_id] [int] NOT NULL,
[isUpdated] [bit] NOT NULL,
[isDeleted] [bit] NOT NULL,
[pnr_file_id] [int] NOT NULL
) ON [PRIMARY]
ALTER TABLE [dbo].[PnrDetails1] ADD PRIMARY KEY CLUSTERED
(
[OId] ASC
)ON [PRIMARY]
this is actually a script sql server 2008 created for me,
but when I look at the object explorer I see an ugly name for the index (something like PK_PnrDetai_CB394B1958F2C25C). How can I change it? If so?
While I agree with #marc_s that you should always declare these names up front, I disagree that you have to drop and re-create:
EXEC sp_rename 'PK_PnrDetai_CB394B1958F2C25C', 'my_new_shiny_name', OBJECT;
You can (and you should) explicitly give a name to your primary key constraint:
ALTER TABLE [dbo].[PnrDetails1]
ADD CONSTRAINT PK_PnrDetails1
PRIMARY KEY CLUSTERED([OId] ASC) ON [PRIMARY]
You can only do this at the time of creation - so in your case, you probably have to drop it first, and then re-create it with the proper, readable name.
Or use this, if the table already exists:
DECLARE #new_pk_name VARCHAR(MAX)='pk_new_name' --here set new name of primary key
DECLARE #table VARCHAR(MAX)='dbo.table_name' --here set name of the table
DECLARE #old_pk_name VARCHAR(MAX)
SELECT
#old_pk_name=name
FROM sys.key_constraints
WHERE [type] = 'PK'
AND [parent_object_id] = OBJECT_ID(#table);
SET #old_pk_name=#table+'.'+#old_pk_name;
EXEC sp_rename #old_pk_name, #new_pk_name;
GO