SQL operations on all columns of a table - sql

I have many (>48) columns in one table, each column corresponds to a month and contains sales for that month. I need to create another table in which each column equals the addition of the previous 12 columns, e.g. getting the "rolling year" figure, so that e.g. July 2010 has everything from August 2009 through July 2010 added, August 2010 has everything from September 2009 through August 2010, and so on.
I could write this as:
select
[201007TOTAL] = [200908] + [200909] + ... + [201007]
,[201008TOTAL] = [200909] + ... + [201008]
...
...
into #newtable
from #mytable
I was wondering if there was a smarter way of doing this, either creating these as new columns in the table in one step, or perhaps pivoting the data, doing something to it, and re-pivoting?

Although everybody is right that a different database set-up would be best, I thought this was a nice problem to play around with. Here's my setup:
CREATE TABLE TEST
(
ID INT
, [201401] decimal(19, 5)
, [201402] decimal(19, 5)
, [201403] decimal(19, 5)
, [201404] decimal(19, 5)
, [201405] decimal(19, 5)
, [201406] decimal(19, 5)
, [201407] decimal(19, 5)
)
INSERT INTO TEST
VALUES (1, 1, 2, 3, 4, 5, 6, 7)
Just one record with data is enough to test.
This assumes the columns to be summed are consecutive in the table, and that the first of them is the first column with datatype decimal. In other words, the table 'starts' (for want of a better word) with a PK, which is usually INT, possibly followed by descriptions or whatever, followed by the monthly columns to be summed:
DECLARE @OP_START INT
      , @OP_END INT
      , @LOOP INT
      , @DATE VARCHAR(255)
      , @SQL VARCHAR(MAX) = 'SELECT '
      , @COLNAME VARCHAR(MAX)
-- Set date to the max date (= column name)
SET @DATE = '201406'
-- Find the last attribute
SET @OP_END = (
    SELECT MAX(ORDINAL_POSITION)
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = 'TEST'
      AND COLUMN_NAME <= @DATE
)
-- Find the first attribute
SET @OP_START = (
    SELECT MIN(ORDINAL_POSITION)
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = 'TEST'
      AND DATA_TYPE = 'DECIMAL'
)
SET @LOOP = @OP_START
-- Loop through the columns
WHILE @LOOP <= @OP_END
BEGIN
    SET @COLNAME = (
        SELECT COLUMN_NAME
        FROM INFORMATION_SCHEMA.COLUMNS
        WHERE TABLE_NAME = 'TEST'
          AND ORDINAL_POSITION = @LOOP
    )
    -- Build SQL with the found column name
    SET @SQL = @SQL + '[' + @COLNAME + ']' + '+'
    SET @LOOP = @LOOP + 1
END
-- Remove the trailing "+"
SET @SQL = SUBSTRING(@SQL, 1, LEN(@SQL) - 1)
-- Complete the SQL
SET @SQL = @SQL + ' FROM TEST'
-- Execute
EXEC(@SQL)
This should keep adding up the monthly values, regardless of how many you add. Just change the max date to whatever suits you.
I'm NOT saying this is the best way to go, but it is a fun way :P
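For the record, the pivot idea from the question is workable too: unpivot the monthly columns into rows and let a window function do the summing. A minimal sketch against the TEST table above (this assumes SQL Server 2012 or later for the ROWS window frame; re-pivot afterwards if you need the column-per-month layout back):

-- Unpivot the monthly columns into (ID, YearMonth, Amount) rows,
-- then compute a rolling 12-month total per ID.
SELECT ID,
       YearMonth,
       SUM(Amount) OVER (
           PARTITION BY ID
           ORDER BY YearMonth
           ROWS BETWEEN 11 PRECEDING AND CURRENT ROW
       ) AS Rolling12
FROM TEST
UNPIVOT (
    Amount FOR YearMonth IN ([201401], [201402], [201403], [201404], [201405], [201406], [201407])
) AS u;

With fewer than 12 months of history the frame simply sums whatever months are available, which matches the "rolling year" behaviour once 12 months exist.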

Related

How to pull day of month from Column header including row data?

I'm trying to make a report where an employee needs to have worked 30 working days since their last point. I would like some way to show anything with 8.0 as a working day, so column std_01_04 would show as 12/04/2020 and as a work day.
The SQL database has been set up very goofily and I'm having issues pulling from it:
there is a column named yr; the data shows the current year, "2020";
there is a column named mth_cal, showing the current month, "12" for December.
The tricky part is the day. For a column like std_01_15, the last 2 digits are the day of month; this example is the 15th. The data under that column will be 0 or 8: 8 means they worked (a full day), 0 means a day off/holiday. So I need to pull data from a few spots to make a date, and then figure out 30 working days since the last point. Any help would be great.
Edit: This was designed by a different company. I have no way of redesigning this; I can only try to work with it, or else manually enter all holidays/days off.
You can create a stored procedure or function with input parameters @year and @month using the code below to get all days with 8.0 working hours.
declare @year int = 2020
declare @month int = 11
create table #temp (workingDay date)
declare @index int = 1
declare @query nvarchar(200)
declare @paramdef nvarchar(300) = N'@valOut nvarchar(10) OUTPUT'
declare @value nvarchar(10)
while @index <= 30
begin
    set @query = N'select @valOut = STD_01_' + REPLACE(STR(@index, 2), SPACE(1), '0') + ' from work where yr=' + CAST(@year as nvarchar(4));
    exec sp_executesql @query, @paramdef, @valOut = @value output
    if @value = '8.0'
    begin
        insert into #temp values (CAST(@year as nvarchar(4)) + REPLACE(STR(@month + 1, 2), SPACE(1), '0') + REPLACE(STR(@index, 2), SPACE(1), '0'))
    end
    set @index = @index + 1
end
select * from #temp
drop table #temp
This is the result when I run this code in SQL Server.
I think you require results like below. You can use UNPIVOT in SQL Server.
I have tested with the query below.
Create table TestUnpivot(yr int , mnth int, std_01_01 int, std_01_02 int, std_01_03 int, primary key(yr,mnth) )
go
Insert into TestUnpivot values(2019,12,0,8,8),(2020,1,8,0,8),(2020,2,8,8,8)
go
Select * from TestUnpivot
go
Select Yr,Mnth, workd, dayss, Convert(Varchar(2),Mnth)+'/'+right(dayss,2)+'/'+convert(Varchar(4),Yr) as DT
from
(
select * from TestUnpivot
)p
unpivot
(
workd for dayss in (std_01_01,std_01_02,std_01_03)
) as unp
go
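If an actual date value is needed rather than a formatted string, the same unpivoted rows can be fed to DATEFROMPARTS (SQL Server 2012+), parsing the day number out of the column name. A sketch against the TestUnpivot table above:

Select Yr, Mnth, workd, dayss,
       DATEFROMPARTS(Yr, Mnth, CAST(RIGHT(dayss, 2) AS int)) as WorkDate
from TestUnpivot
unpivot
(
    workd for dayss in (std_01_01, std_01_02, std_01_03)
) as unp
where workd = 8 -- keep only full working days

RIGHT(dayss, 2) pulls the day-of-month suffix off the unpivoted column name, so std_01_02 in row (2020, 1, ...) becomes the date 2020-01-02.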

Dynamic SQL Procedure with Pivot displaying counts based on Date Range

I have a table which contains multiple user entries.
I want to pull counts of user entries based on date range passed to a stored procedure.
start date: 11/9/2017
end date: 11/11/2017
However the response needs to be dynamic based on the number of days in the date range.
Here is a desired format:
Now that you have provided examples, I have updated my answer with a solution based on your data.
Note that you are able to change the date range and the query will update accordingly.
Bear in mind that this SQL query is for SQL Server:
create table #tbl1 (
[UserId] int
,[UserName] nvarchar(max)
,[EntryDateTime] datetime
);
insert into #tbl1 ([UserId],[UserName],[EntryDateTime])
values
(1,'John Doe','20171109')
,(1,'John Doe','20171109')
,(1,'John Doe','20171110')
,(1,'John Doe','20171111')
,(2,'Mike Smith','20171109')
,(2,'Mike Smith','20171110')
,(2,'Mike Smith','20171110')
,(2,'Mike Smith','20171110')
;
-- declare variables
declare
    @p1 date
    ,@p2 date
    ,@diff int
    ,@counter1 int
    ,@counter2 int
    ,@dynamicSQL nvarchar(max)
;
-- set variables
set @p1 = '20171109'; -- ENTER THE START DATE IN THE FORMAT YYYYMMDD
set @p2 = '20171111'; -- ENTER THE END DATE IN THE FORMAT YYYYMMDD
set @diff = datediff(dd,@p1,@p2); -- used to calculate the difference in days
set @counter1 = 0; -- first counter to be used in while loop
set @counter2 = 0; -- second counter to be used in while loop
set @dynamicSQL = 'select pivotTable.[UserId] ,pivotTable.[UserName] as [Name] '; -- start of the dynamic SQL statement
-- to get the dates into the query in a dynamic way, you need to do a while loop (or use a cursor)
while (@counter1 <= @diff) -- <= so the end date itself is included
begin
    set @dynamicSQL += ',pivotTable.[' + convert(nvarchar(10),dateadd(dd,@counter1,@p1),120) + '] '
    set @counter1 = (@counter1 + 1)
end
-- continuation of the dynamic SQL statement
set @dynamicSQL += ' from (
    select
        t.[UserId]
        ,t.[UserName]
        ,cast(t.[EntryDateTime] as date) as [EntryDate]
        ,count(t.[UserId]) as [UserCount]
    from #tbl1 as t
    where
        t.[EntryDateTime] >= ''' + convert(nvarchar(10),@p1,120) + ''' ' +
    ' and t.[EntryDateTime] <= ''' + convert(nvarchar(10),@p2,120) + ''' ' +
    'group by
        t.[UserId]
        ,t.[UserName]
        ,t.[EntryDateTime]
) as mainQuery
pivot (
    sum(mainQuery.[UserCount]) for mainQuery.[EntryDate]
    in ('
;
-- the second while loop, used to create the columns in the pivot table
while (@counter2 <= @diff)
begin
    set @dynamicSQL += ',[' + convert(nvarchar(10),dateadd(dd,@counter2,@p1),120) + ']'
    set @counter2 = (@counter2 + 1)
end
-- continuation of the SQL statement
set @dynamicSQL += ')
) as pivotTable'
;
-- the easiest way I could think of to get rid of the leading comma in the query
set @dynamicSQL = replace(@dynamicSQL,'in (,','in (');
print @dynamicSQL -- included so that you can see the SQL statement that is generated
exec sp_executesql @dynamicSQL; -- this will run the generated dynamic SQL statement
drop table #tbl1;
Let me know if that's what you were looking for.
If you are using MySQL, this will do what you want:
SELECT UserID,
UserName,
SUM(Date = '2017-11-09') '2017-11-09',
SUM(Date = '2017-11-10') '2017-11-10',
SUM(Date = '2017-11-11') '2017-11-11'
FROM src
GROUP BY UserID
If you are using SQL Server, you could try it with PIVOT:
SELECT *
FROM
    (SELECT userID, userName, CAST(EntryDateTime AS date) AS EntryDate
     FROM t) src
PIVOT
    (COUNT(userID)
     FOR EntryDate IN ([2017-11-09], [2017-11-10], [2017-11-11])) pvt
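Note the SUM(condition) trick above is MySQL-specific: SQL Server does not treat boolean expressions as 0/1. The SQL Server equivalent is conditional aggregation with CASE, which also sidesteps the PIVOT column-name quoting issue (dates hard-coded here as in the examples above; t is the same table):

SELECT userID,
       userName,
       -- count one per entry falling on each date
       SUM(CASE WHEN CAST(EntryDateTime AS date) = '2017-11-09' THEN 1 ELSE 0 END) AS [2017-11-09],
       SUM(CASE WHEN CAST(EntryDateTime AS date) = '2017-11-10' THEN 1 ELSE 0 END) AS [2017-11-10],
       SUM(CASE WHEN CAST(EntryDateTime AS date) = '2017-11-11' THEN 1 ELSE 0 END) AS [2017-11-11]
FROM t
GROUP BY userID, userName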

simplify if statement with stored procedure

I have the following stored procedure, in which I am repeating similar code. All I am doing is checking conditions based on Sample id1, id2, and id3 in similar fashion. The value of 'y' goes on until it reaches about 10, so it's going to be a big set of 'if'-based statements. I was trying to see if a better solution could be put in place. Thanks.
set @select = 'select * from tbl Sample......'
if (@x = 1 and @y = 1)
    set @where = ' where Sample.id1 >= 1 and Sample.id1 <= 10'
if (@x = 1 and @y = 2)
    set @where = ' where Sample.id1 >= 11 and Sample.id1 <= 20'
if (@x = 2 and @y = 1)
    set @where = ' where Sample.id2 >= 1 and Sample.id2 <= 10'
if (@x = 2 and @y = 2)
    set @where = ' where Sample.id2 >= 11 and Sample.id2 <= 20'
if (@x = 3 and @y = 1)
    set @where = ' where Sample.id3 >= 1 and Sample.id3 <= 10'
if (@x = 3 and @y = 2)
    set @where = ' where Sample.id3 >= 11 and Sample.id3 <= 20' -- increment goes on
exec(@select + @where)
In general, if there is no easy correlation between the values of x, y and the filtered columns id1, id2 etc, then you could move the where predicates into a table keyed by values of x and y, and then use this as a lookup to apply to your PROC. Assuming the SPROC is used heavily, the lookup table can be made permanent and indexed on your x,y input mapping columns.
CREATE TABLE dbo.WhereMappings
(
x INT,
y INT,
Predicate NVARCHAR(MAX),
CONSTRAINT PK_MyWhereMappings PRIMARY KEY(x, y)
)
INSERT INTO dbo.WhereMappings(x, y, Predicate) VALUES
(1, 1, 'Sample.id1 > 5 and Sample.id2 <= 10'),
(1, 2, 'Sample.id1 > 7 and Sample.id2 <= 15'),
(2, 1, 'Sample.id2 > 2 and Sample.id3 <= 18');
Your proc then simplifies to:
CREATE PROC MyProc(@x INT, @y INT) AS
BEGIN
    DECLARE @sql NVARCHAR(MAX);
    DECLARE @predicate NVARCHAR(MAX);
    SELECT TOP 1 @predicate = Predicate
    FROM dbo.WhereMappings WHERE x = @x AND y = @y;
    -- TODO: THROW if predicate not mapped
    SET @sql = CONCAT('SELECT * FROM Sample WHERE ', @predicate);
    EXECUTE(@sql);
END;
Re : What does this solve
Although this hasn't necessarily reduced the complexity of the original queries, it does however allow for a data-only maintenance approach to the mappings, e.g. Admin UI screens could be written to maintain (and validate! think Sql Injection) the predicate mappings, without the need for direct modification to the SPROC.
Edit
After your edit, it does appear that there is a correlation between x, y and the filtered column and range used in the idx predicates: x sets the column, and y sets the range.
In that case, simply append the value of x to an id column name stub, and compute the BETWEEN range as y*10 - 9 to y*10:
You may do something like this:
select
*
from
tbl Sample
where
(@x=1 and @y=1 and Sample.id1>=..and Sample.id1<=..) --(or you could use between)
OR (@x=1 and @y=2 and Sample.id1>=..and Sample.id1<=..)
..
set @select = 'select * from tbl Sample......'
set @where = ' where Sample.id'+convert(nvarchar(10),@x)+' >=....and <=...'
exec(@select+@where)
I would suggest using another SQL table which holds all these conditions, with columns x, y, Min_limit and Max_limit.
Then use a join in your SQL query (assume the above table is named Limit):
select * from tbl Sample smpl
inner join Limit lmt
on @x = lmt.x and @y = lmt.y and
(
    (@x=1 and smpl.id1 >= lmt.Min_limit and smpl.id1 <= lmt.Max_limit) or
    (@x=2 and smpl.id2 >= lmt.Min_limit and smpl.id2 <= lmt.Max_limit) or
    (@x=3 and smpl.id3 >= lmt.Min_limit and smpl.id3 <= lmt.Max_limit)
)
This way I have tried to avoid a dynamic query.
I usually try to find a relation between inputs and outputs and in this case I found this way:
SET @where = 'WHERE Sample.id{0} >= {1} + 1 and Sample.id{0} <= {1} + 10'
SET @where = REPLACE(@where, '{0}', CAST(@x AS varchar(5)))
SET @where = REPLACE(@where, '{1}', CAST((@y - 1) * 10 AS varchar(5)))
I think you want something like:
SET @where = 'where Sample.id' + CAST(@x AS VARCHAR(10)) + ' between ' +
    CAST((@y - 1) * 10 + 1 AS VARCHAR(10)) + ' and ' +
    CAST(@y * 10 AS VARCHAR(10))
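Putting that formula to work inside the procedure, a parameterised sketch could look like this (GetSampleRange is a made-up name; QUOTENAME guards the spliced-in column name and sp_executesql passes the range bounds as real parameters, which limits the injection surface):

CREATE PROC dbo.GetSampleRange (@x int, @y int)
AS
BEGIN
    DECLARE @sql nvarchar(max);
    DECLARE @lo int = (@y - 1) * 10 + 1;
    DECLARE @hi int = @y * 10;
    -- Only the column name has to be spliced into the SQL text;
    -- QUOTENAME protects it, and the bounds travel as parameters.
    SET @sql = N'SELECT * FROM Sample WHERE '
             + QUOTENAME(N'id' + CAST(@x AS nvarchar(10)))
             + N' BETWEEN @lo AND @hi';
    EXEC sp_executesql @sql, N'@lo int, @hi int', @lo = @lo, @hi = @hi;
END;

So x = 2, y = 3 generates SELECT * FROM Sample WHERE [id2] BETWEEN 21 AND 30, with no if-chain to maintain.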

Export data from a non-normalized database

I need to export data from a non-normalized database where there are multiple columns to a new normalized database.
One example is the Products table, which has 30 boolean columns (ValidSize1, ValidSize2, etc.) and every record has a foreign key which points to a Sizes table, where there are 30 columns with the size codes (XS, S, M, etc.). In order to get the valid sizes for a product I have to scan both tables and take the value SizeCodeX from the Sizes table only if ValidSizeX on the product is true. Something like this:
Products Table
--------------
ProductCode <PK>
Description
SizesTableCode <FK>
ValidSize1
ValidSize2
[...]
ValidSize30
Sizes Table
-----------
SizesTableCode <PK>
SizeCode1
SizeCode2
[...]
SizeCode30
For now I am using a "template" query which I repeat 30 times:
SELECT
Products.Code,
Sizes.SizesTableCode, -- I need this code because different codes can have same size codes
Sizes.Size_1
FROM Products
INNER JOIN Sizes
ON Sizes.SizesTableCode = Products.SizesTableCode
WHERE Sizes.Size_1 IS NOT NULL
AND Products.ValidSize_1 = 1
I am just putting this query inside a loop and I replace the "_1" with the loop index:
SET @counter = 1;
SET @max = 30;
SET @sql = '';
WHILE (@counter <= @max)
BEGIN
    SET @sql = @sql + ('[...]'); -- Here goes my query with dynamic indexes
    IF @counter < @max
        SET @sql = @sql + ' UNION ';
    SET @counter = @counter + 1;
END
INSERT INTO DestDb.ProductsSizes EXEC(@sql); -- Insert statement
GO
Is there a better, cleaner or faster method to do this? I am using SQL Server and I can only use SQL/TSQL.
You can prepare a dynamic query using the sys.syscolumns table to get all the column names as rows:
DECLARE @SqlStmt varchar(MAX)
SET @SqlStmt = ''
SELECT @SqlStmt = @SqlStmt + 'SELECT ''' + name + ''' AS column_name UNION ALL '
FROM SYS.Syscolumns WITH (READUNCOMMITTED)
WHERE Object_Id('dbo.Products') = Id AND ([Name] LIKE 'SizeCode%' OR [Name] LIKE 'ProductCode%')
IF REVERSE(@SqlStmt) LIKE REVERSE('UNION ALL ') + '%'
    SET @SqlStmt = LEFT(@SqlStmt, LEN(@SqlStmt) - LEN('UNION ALL '))
PRINT (@SqlStmt)
Well, it seems that a "clean" (and much faster!) solution is the UNPIVOT function.
I found a very good example here:
http://pratchev.blogspot.it/2009/02/unpivoting-multiple-columns.html
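Applied to the schema in the question, unpivoting both tables and joining on the size position could look roughly like this (column lists shortened to three positions for brevity; note that UNPIVOT drops NULL values automatically, which covers the Size_X IS NOT NULL check):

SELECT v.ProductCode,
       sc.SizesTableCode,
       sc.SizeCode
FROM (
    -- one row per (product, size position) where the flag is set
    SELECT ProductCode, SizesTableCode, ValidPos, ValidFlag
    FROM Products
    UNPIVOT (ValidFlag FOR ValidPos IN (ValidSize1, ValidSize2, ValidSize3)) AS u
) AS v
JOIN (
    -- one row per (size table, size position) with its code
    SELECT SizesTableCode, SizePos, SizeCode
    FROM Sizes
    UNPIVOT (SizeCode FOR SizePos IN (SizeCode1, SizeCode2, SizeCode3)) AS u
) AS sc
  ON sc.SizesTableCode = v.SizesTableCode
 AND REPLACE(sc.SizePos, 'SizeCode', '') = REPLACE(v.ValidPos, 'ValidSize', '')
WHERE v.ValidFlag = 1;

The REPLACE calls strip the column-name stubs so that position 1 matches position 1, and so on; a single INSERT ... SELECT of this query replaces the 30-way UNION loop.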

Big ugly cursor

I'm populating a table of about 15 columns from a table of about 1000 columns. I need to grab the time from the big table. That time is broken up into minutes and hours, [rn-min] and [rn-hr], and I need them in an am/pm format in the new table. The table is populated by an outside company so I can't really change much about it; I did get them to put in a Transferred column for me to check. It's big and slow, I only need a few columns, and there are a lot of duplicate/similar rows. In any case, I'm making the smaller table from the bigger table. I wrote a cursor; it's slow, and I was wondering if there was a better way to do it. I can't just use a simple insert (select columns) because I want to change the way the time and date are stored. Thanks, any help or advice is appreciated.
declare data CURSOR READ_ONLY FORWARD_ONLY
for
select [raID],
(otherfields),
CAST([RA-rent-mm] as varchar(2)) + '/' + CAST([RA-rent-dd] as varchar(2)) + '/' +
CAST([RA-Rent-CC] as varchar(2)) + CAST([RA-RENT-YY] as varchar(2)) [Date_Out],
CAST([RA-Rtrn-mm] as varchar(2)) + '/' + CAST([RA-Rtrn-dd] as varchar(2)) +
'/' + CAST([RA-Rtrn-CC] as varchar(2)) + CAST([RA-Rtrn-YY] as varchar(2)) [Date_In],
CAST([RA-RENTAL-HOURS] as varchar(2)),
CAST([RA-RENTAL-Minutes] as varchar(2)),
CAST([RA-RTRN-HOURS] as varchar(2)),
CAST([RA-RTRN-MINUTES] as varchar(2)),
(other fields)
from table_name
where Transfered is null
and [RA-rtrn-mm] != 0 --this keeps me from getting the duplicate/similar rows, once this doesn't equal 0 there aren't anymore rows so I just grab this one
declare @sql as varchar(max)
declare @raID int;
(other fields)
declare @rentDate varchar(8);
declare @rtrnDate varchar(8);
declare @rentHours varchar(2);
declare @rentMinutes varchar(2);
declare @rtrnHours varchar(2);
declare @rtrnMinutes varchar(2);
(other fields)
open data
fetch next from data into
    @raID,
    (other fields),
    @rentDate,
    @rtrnDate,
    @rentHours,
    @rentMinutes,
    @rtrnHours,
    @rtrnMinutes,
    (other fields)
while @@FETCH_STATUS = 0
begin
    set @rentMinutes = left('0' + @rentMinutes, 2); -- padding with 0 if minutes is 1-9
    set @rtrnMinutes = left('0' + @rtrnMinutes, 2);
    -- turning the varchar times into a time, then back to varchar with correct am/pm notation
    declare @rentT time = @rentHours + ':' + @rentMinutes;
    declare @rtnT time = @rtrnHours + ':' + @rtrnMinutes;
    declare @rentTime varchar(7) = convert(varchar(15), @rentT, 100);
    declare @returnTime varchar(7) = convert(varchar(15), @rtnT, 100);
    --print @rentTime;
    set @sql = 'INSERT other_tbl_name(raID, (other fields), Date_Out, Date_In, Time_Out, Time_In, (other fields))
    values (' + cast(@raID as varchar(max)) + ', (other fields),''' + @rentDate + ''',
    ''' + @rtrnDate + ''', ''' + @rentTime + ''', ''' + @returnTime + ''',
    (other fields))';
    --exec(@sql)
    print @sql
    -- need a way to make sure the insert worked before updating
    -- need to update Transferred to keep from updating the same info
    declare @update as varchar(max) = '
    UPDATE Capture.icokc_data
    SET Transfered = 1
    WHERE [raID] = ' + cast(@raID as varchar(10))
    --exec(@update)
    --print @update
    fetch next from data into
        @raID,
        (other fields),
        @rentDate,
        @rtrnDate,
        @rentHours,
        @rentMinutes,
        @rtrnHours,
        @rtrnMinutes,
        (other fields)
end
close data;
deallocate data;
Why don't you bulk insert it, and transform the dates and times in the select?
Something like this:
INSERT other_tbl_name(raID, (other fields), Date_Out, Date_In, Time_Out, Time_In, (other fields))
select
[raID],
(otherfields),
CAST([RA-rent-mm] as varchar(2)) + '/' + CAST([RA-rent-dd] as varchar(2)) + '/' + CAST([RA-Rent-CC] as varchar(2)) + CAST([RA-RENT-YY] as varchar(2)) [Date_Out],
CAST([RA-Rtrn-mm] as varchar(2)) + '/' + CAST([RA-Rtrn-dd] as varchar(2)) + '/' + CAST([RA-Rtrn-CC] as varchar(2)) + CAST([RA-Rtrn-YY] as varchar(2)) [Date_In],
CONVERT(varchar(15),DATEADD(minute, [RA-RENTAL-Minutes], DATEADD(hour, [RA-RENTAL-HOURS], '00:00')), 100) as [Time_out],
CONVERT(varchar(15),DATEADD(minute, [RA-RTRN-MINUTES], DATEADD(hour, [RA-RTRN-HOURS], '00:00')), 100) as [Time_in],
(other fields)
from table_name
where Transfered is null
and [RA-rtrn-mm] != 0
UPDATE Capture.icokc_data
SET Transfered = 1
WHERE [raID] IN
(
select
[raID]
from table_name
where Transfered is null
-- and [RA-rtrn-mm] != 0 -- not sure about this one
)
As it's a direct conversion, i.e. one record in and one record out, I don't really see any reason why it couldn't be done with a single insert query.
Anyhow, don't create queries dynamically. The dynamic queries will be parsed and planned for each iteration, which is most likely the reason for most of the performance problems.
For example, instead of:
declare @update as varchar(max) = '
UPDATE Capture.icokc_data
SET Transfered = 1
WHERE [raID] = ' + cast(@raID as varchar(10))
exec(@update)
just do:
UPDATE Capture.icokc_data
SET Transfered = 1
WHERE [raID] = @raID