How do I create data during a SELECT statement while doing an INSERT INTO? - sql

Sorry if the question is confusing, but I wasn't sure how exactly to ask it. Basically I have a giant table with 20 columns and millions of rows. I want to insert into another table a subset of those rows and columns, and then also for each of those rows, create a new time and new GUID in the table. For example, let's say I have the following table:
CREATE TABLE [dbo].[AddressData](
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[AddressDataGUID] [uniqueidentifier] ROWGUIDCOL NOT NULL,
[City] [varchar](32) NOT NULL,
[State] [varchar](2) NOT NULL,
[DateModified] [datetime2](7) NOT NULL,
[TS] [timestamp] NOT NULL,
CONSTRAINT [PK_AddressData_ID] PRIMARY KEY CLUSTERED [ID]
Now I also have another table that has City and State, but then it has a bunch of other columns too, like Street, Zip, etc. I want to select every distinct pair of City and State from the bigger table and stick them in the new table, and I also want it to set the DateModified to the current time, and create a new random GUID for each of those distinct pairs. I can't figure out how to do this in sql server. I tried things like the following:
INSERT INTO Community (CommunityGUID, City, [State], DateModified)
VALUES (NEWID(), t.City, t.[State], SYSUTCDATETIME())
SELECT DISTINCT City, [State] FROM FullTable t
However, I can't figure out the correct syntax. I COULD probably take care of this by creating a temp table with all nullable fields, select distinct with those two columns, and then loop through it row by row creating all of these values, and then finally selecting the fully constructed table into the other one. I'm almost tempted to do that, because that's my first instinct as a software developer and not a database developer, but DB developers tend to dislike having stuff like that in DB if there's a way to just do it in one statement and let the DB engine optimize it.

I think you just want a real query for the INSERT:
INSERT INTO Community (CommunityGUID, City, [State], DateModified)
SELECT NEWID(), t.City, t.[State], SYSUTCDATETIME()
FROM (SELECT DISTINCT City, [State]
FROM FullTable t
) t;

Related

Is it good to have multiple inner joins in SQL select statement?

I have a table which looks like this:
CREATE TABLE [dbo].[Devices]
(
[Device_ID] [nvarchar](10) NOT NULL,
[Series_ID] [int] NOT NULL,
[Start_Date] [date] NULL,
[Room_ID] [int] NOT NULL,
[No_Of_Ports] [int] NULL,
[Description] [text] NULL
);
I want to show this table in a gridview, but instead of showing the [Series_ID] column, I want to show 3 columns Series_Name, Brand_Name, and Type_Name from another 3 columns, and instead of showing the [Room_ID] column, I want to show 3 columns Site_Name, Floor_Name, Room_Name from another 3 columns
I can do that by more than 6 inner joins. I am a beginner in SQL and I want to know is this right to have a lot of inner joins in one statement in point of performance?
Based on your query explanation, I assume it will be 2 inner joins instead of 6.
If Series_Name, Brand_Name and Type_Name are in one table with Series_Id as ForeignKey, then you would need one join.
Similarly, If Site_Name, Floor_Name, Room_Name are in one table with Room_ID as ForeignKey, then you would need another innjer join.
Again, it is difficult to tell the exact number of joins without understanding table structure of the other referential tables.

SQL - Joining to Nonexistent Records

After doing a bit of looking, I thought maybe I'd found a solution here: sql join including null and non existing records. Cross Joining my tables seems like a great way to solve my problem, but now I've hit a snag:
Below are the tables I’m using:
CREATE TABLE [dbo].[DCRSales](
[WorkingDate] [smalldatetime] NOT NULL,
[Store] [int] NOT NULL,
[Department] [int] NOT NULL,
[NetSales] [money] NOT NULL,
[DSID] [int] IDENTITY(1,1) NOT NULL)
CREATE TABLE [dbo].[Stores](
[Number] [int] NOT NULL,
[Has_Deli] [bit] NOT NULL,
[Alcohol_Register] [int] NULL,
[Is_Cost_Saver] [bit] NOT NULL,
[Store_Status] [nchar](10) NOT NULL,
[Supervisor_Number] [int] NOT NULL,
[Email_Address] [nchar](20) NOT NULL,
[Sales_Area] [int] NULL,
[PZ_Store_Number] [int] NULL,
[Has_SCO] [bit] NULL,
[SCO_Reg] [nchar](25) NULL,
[Has_Ace] [bit] NULL,
[Ace_Sq_Ft] [int] NULL,
[Open_Date] [datetime] NULL,
[Specialist] [nchar](2) NULL,
[StateID] [int] NOT NULL)
CREATE TABLE [dbo].[DepartmentMap](
[Department_Number] [int] NOT NULL,
[Description] [nvarchar](max) NOT NULL,
[Parent_Department] [int] NOT NULL)
CREATE TABLE [dbo].[ParentDepartments](
[Parent_Department] [int] NOT NULL,
[Description] [varchar](50) NULL
DCRSales is a table holding new and archived data. The archived data is not perfect, meaning that there are of course certain missing date gaps and some stores which currently have a department they didn't have or no longer have a department they used to have. My goal is to join this table to our department list, list the child departments and parent departments and SUM up the netsales for a given date range. In cases where a store does not have a department whatsoever in that date range, I still need to display it as 0.00.
A more robust solution would probably be to just store all departments for each store regardless of whether they have that department or not (with sales set to 0.00 of course). However I imagine doing that and/or solving my problem here would require very similar queries anyway.
The query I have tried is as follows:
WITH CTE AS (
SELECT S.Number as Store, DepartmentMap.Department_Number as Department, ParentDepartments.Parent_Department as Parent, ParentDepartments.Description as ParentDescription, DepartmentMap.Description as ChildDescription
FROM Stores as S CROSS JOIN dbo.DepartmentMap INNER JOIN ParentDepartments ON DepartmentMap.Parent_Department = ParentDepartments.Parent_Department
WHERE S.Number IN(<STORES>) AND Department_Number IN(<DEPTS>)
)
SELECT CTE.Store, CTE.Department, SUM(ISNULL(DCRSales.NetSales, 0.00)) as Sales, CTE.Parent, CTE.ParentDescription, CTE.ChildDescription
FROM CTE LEFT JOIN DCRSales ON DCRSales.Department = CTE.Department AND DCRSales.Store = CTE.Store
WHERE DCRSales.WorkingDate BETWEEN '<FIRSTDAY>' AND '<LASTDAY>' OR DCRSales.WorkingDate IS NULL
GROUP BY CTE.Store, CTE.Department, CTE.Parent, CTE.ParentDescription, CTE.ChildDescription
ORDER BY CTE.Store ASC, CTE.Department ASC
In this query I try to CROSS JOIN each department to a store from the Stores table so that I can get a combination of every store and every department. I also include the Parent Departments of each department along with the child department's description and the parent department's description. I filter this first portion based on store and department, but this does not change the general concept.
With this result set, I then try to join this table to all of the sales in DCRSales that are within a certain date range. I also include the date if it’s null because the results that have a NULL sales also have a NULL WorkingDate.
This query seemed to work, until I noticed that not all departments are being used with all stores. The stores in particular that do not get combined with all departments are the ones that have no data for the given date range (meaning they have been closed). If there is no data for the department, it should still be listed with its department number, parent number, department description and parent description (with Sales as 0.00). Any help is greatly appreciated.
Your WHERE clause is filtering out records that do have sales at some point in time, but not for the desired period of time, those records don't meet either criteria and are therefore excluded.
I might be under-thinking it, but might just need to move:
DCRSales.WorkingDate BETWEEN '<FIRSTDAY>' AND '<LASTDAY>'
To your LEFT JOIN criteria and get rid of WHERE clause. If that's not right, you could filter sales by date in a 2nd cte prior to the join.
What you want is an OUTER JOIN.
See this: https://technet.microsoft.com/en-us/library/ms187518(v=sql.105).aspx
I suggest that this process is probably much too complicated to be done using a single query. I think that you need to perform several queries: to extract the transactions-of-interest into a separate table, then to modify the results one-or-more times in that table before using it to produce your final statistics. A stored procedure, which drives several separately stored queries, could be used to drive the process, which, in several "stages," makes several "passes" over the initially-extracted data.
One piece of important information, for instance, would be to know when a particular store had a particular department. (For instance: store, department, starting_date, ending_date.) This would be a refinement of a possibly-existing table (and, maybe, drawn from it ...) which lists what departments a particular store has today.
Let us hope that department-numbers do not change, or that your company hasn't acquired other companies with the resulting need to "re-map" these numbers in some way.
Also: frankly, if you have access to a really-good statistics package, such as SAS® or SPSS® ... can "R" do this sort of thing? ... you might find yourself better-off. (And no, I don't mean "Microsoft Excel ...") ;-)
When I have been faced with requirements like these (and I have been, many times ...), a stats-package was indispensable. I found that I had to "massage" the process and the extracted data a number of times, in a system of successive-refinement that gradually lead me to a reporting process that I could trust and therefore defend.

Table cell to count rows in other table

I'm not entirely sure if I'm even going about this in the right manner.
MVC+EF site so i could do this in the controller but I would prefer it in the DB if possible.
I have two tables. One contains entries and one contains a list of members. I want the list of members to have a column that contains the count of how many times the member name appears in the list of entries. Can I do this in the definition of the table itself?
I know this query works:
select count(*)
from dbo.Entries
where dbo.Entries.AssignedTo = 'Bob Smith'
But is there any way of doing this? What is the correct syntax?
CREATE TABLE [dbo].[Members] (
[ID] INT IDENTITY (1, 1) NOT NULL,
[Name] NVARCHAR (100) NOT NULL,
[Email] NVARCHAR (500) NOT NULL,
[Count] INT = select count(*) from dbo.Entries where dbo.Entries.AssignedTo = [Name]
PRIMARY KEY CLUSTERED ([ID] ASC)
I've done some searching and have tried a few different syntax's but I'm completely lost at this point so if anyone can get me headed in the correct direction I would really appreciate it.
Thanks in advance.
You could create Members as a view combining both the Entries data and another table. (Warning: Syntax not tested)
CREATE TABLE [_member_data] (
[ID] INT IDENTITY(1, 1) NOT NULL,
[Name] NVARCHAR (100) NOT NULL,
[Email] NVARCHAR (500) NOT NULL,
PRIMARY KEY CLUSTERED ([ID] ASC)
);
CREATE VIEW [dbo].[Members] AS
SELECT ID, Name, Email, COUNT(*)
FROM _member_data JOIN dbo.Entries ON [Name] = [AssignedTo]
GROUP BY 1, 2, 3;
It is possible to take this even further with triggers/rules that rewrite attempted inserts into Members as inserts to the appropriate backing table. But to get the kind of expressive information that you are looking for, you really want to explore using a view.

How to update 2nd table with identity value of inserted rows into 1st table

I have the following table structures
CREATE TABLE [dbo].[WorkItem](
[WorkItemId] [int] IDENTITY(1,1) NOT NULL,
[WorkItemTypeId] [int] NOT NULL,
[ActionDate] [datetime] NOT NULL,
[WorkItemStatusId] [int] NOT NULL,
[ClosedDate] [datetime] NULL,
)
CREATE TABLE [dbo].[RequirementWorkItem](
[WorkItemId] [int] NOT NULL,
[RequirementId] [int] NOT NULL,
)
CREATE TABLE #RequirmentWorkItems
(
RequirementId int,
WorkItemTypeId int,
WorkItemStatusId int,
ActionDate datetime
)
I use the #RequirmentWorkItems table to create workitems for requirements. I then need to INSERT the workitems into the WorkItem table and use the identity values from the WorkItem table to create the cross-reference rows in the RequirementWorkItem table.
Is there a way to do this without cursoring thru each row? And I can't put the RequirementId into the WorkItem table because depending on the WorkItemTypeId the WorkItem could be linked to a Requirement or a Notice or an Event.
So there are really 3 xref tables for WorkItems. Or would it be better to put a RequirementId, NoticeId and EventId in the WorkItem table and 1 of the columns would have a value and other 2 would be null? Hopefully all this makes sense. Thanks.
You should read MERGE and OUTPUT – the swiss army knife of T-SQL for more information about this.
Today I stumbled upon a different use for it, returning values using an OUTPUT clause from a table used as the source of data for an insertion. In other words, if I’m inserting from [tableA] into [tableB] then I may want some values from [tableA] after the fact, particularly if [tableB] has an identity. Observe my first attempt using a straight insertion where I am trying to get a field from #source.[id] that is not used in the insertion:

how can i get Students Grades using Grade,studets tables

how can i get students grades using these 2 tables?
CREATE TABLE [dbo].[Grade](
[ID] [int] NULL,
[From] [int] NULL,
[To] [int] NULL,
[Grade] [nvarchar](50) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[Students](
[ID] [int] NULL,
[Name] nvarchar NULL,
[Score] [int] NULL
) ON [PRIMARY]
Ok, lets take a guess that the Students table has a column (probably an identity column) called 'ID' which is the students' unique identifier.
Let's also assume that the Grade.ID field is a foreign key reference to the Student table (I know it doesn't say that, but it's probably a good idea.)
In that case, try this (untested code):
SELECT Student.Name, Grade.Grade, Grade.From, Grade.To
FROM Student, Grade
WHERE Student.ID = Grade.ID
ORDER BY Student.ID, Grade.To
That ought to give you a list of students, with their grades in ID order.
It will repeat the student name for each grade record however (assuming no null values, of course.)
If you want one student instance with a list of grades, that's significantly harder.
(Btw, the more and better information in the initial question, the better the answers you're likely to get.)
Ok - second shot, now I understand the table structure:
SELECT Students.Name, (select Grades.GradeName from Grades where Students.Score <= Grades.ScoreTo AND Students.Score > Grades.ScoreFrom) AS Grade
FROM Students
I changed the names a little to make them more obvious and distinct (having the same name for a table and a field on that table is confusing.)
This should give you the results you're looking for.
BUT - you must ensure that your data is clean - if you have overlapping ranges, you will get a multiple sub-query return (which will fail.) You can overcome this with a 'SELECT TOP 1' restriction, but it's messy.