I am building a temp table #Table that contains different Categories, Items, and Costs.
For example:
Category Item Cost
-------- ---- ----
Office Desk 100.00
Office Chair 75.00
Office PC 800.00
Home Desk 0.00
By the time I receive the temp table for processing, it contains the individual rows with Category, Item, and Cost, plus Summary rows holding the sum for each category that has a non-zero total:
Category Item Cost Type
-------- ---- ---- -----
Office Desk 100.00 Cost
Office Chair 75.00 Cost
Office PC 800.00 Cost
Office null 975.00 Summary
Home Desk 0.00 Cost
I would now like to add summary rows for the $0.00 categories as well, but am having trouble figuring out how to do so.
INSERT INTO #Table
SELECT X.Category, null, 0.00, 'Summary'
FROM #Table X
...[Get Category data that does not have a Summary row]
I had thought about
WHERE NOT EXISTS ( SELECT * FROM #Table Y WHERE Y.Category = X.Category AND Type = 'Summary')
GROUP BY X.Category
but am concerned about performance, as there could be a lot of rows in this table.
You should benchmark your performance, but I've found joins are faster. Also, if your dataset is very large, adding an index after the data is loaded into the temp table can dramatically speed up subsequent queries against it.
INSERT INTO #Table
SELECT DISTINCT X.Category, null, 0.00, 'Summary' -- DISTINCT: one summary row per category, not one per cost row
FROM #Table X
LEFT JOIN #Table Y ON Y.Category = X.Category AND Y.Type = 'Summary'
WHERE Y.Category IS NULL
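As a rough sketch of the indexing suggestion: the column types and the index name IX_Table_Category_Type below are my assumptions, not something from the question, and the column choice simply follows the Category and Type lookups used in the queries here. Whether it helps depends on your data, so compare timings with and without it.

```sql
-- Assumed shape of the temp table (column types are guesses).
IF OBJECT_ID('tempdb..#Table') IS NOT NULL DROP TABLE #Table;
CREATE TABLE #Table (
    Category varchar(30),
    Item     varchar(30) NULL,
    Cost     decimal(10, 2),
    Type     varchar(10)
);

-- Load the data first...
INSERT INTO #Table (Category, Item, Cost, Type) VALUES
    ('Office', 'Desk',  100.00, 'Cost'),
    ('Office', 'Chair',  75.00, 'Cost'),
    ('Office', 'PC',    800.00, 'Cost'),
    ('Office', NULL,    975.00, 'Summary'),
    ('Home',   'Desk',    0.00, 'Cost');

-- ...then add the index, so the load itself does not pay for index maintenance.
-- IX_Table_Category_Type is a hypothetical name.
CREATE NONCLUSTERED INDEX IX_Table_Category_Type
    ON #Table (Category, Type);
```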
How about doing this in 2 steps -
1 - Insert the raw data straight away - without summary info
2 - Insert more data - SELECT data grouping by Category, using SUM aggregate on cost, and hard coding NULL for Item & SUMMARY for Type field.
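The two steps could look roughly like this. It is only a sketch: the column types are assumed, and it presumes you control how #Table is loaded, which may not be the case if the table arrives with summary rows already in it.

```sql
-- Assumed shape of the temp table (column types are guesses).
IF OBJECT_ID('tempdb..#Table') IS NOT NULL DROP TABLE #Table;
CREATE TABLE #Table (
    Category varchar(30),
    Item     varchar(30) NULL,
    Cost     decimal(10, 2),
    Type     varchar(10)
);

-- Step 1: insert the raw detail rows only, no summary info.
INSERT INTO #Table (Category, Item, Cost, Type) VALUES
    ('Office', 'Desk',  100.00, 'Cost'),
    ('Office', 'Chair',  75.00, 'Cost'),
    ('Office', 'PC',    800.00, 'Cost'),
    ('Home',   'Desk',    0.00, 'Cost');

-- Step 2: one summary row per category, zero totals included.
INSERT INTO #Table (Category, Item, Cost, Type)
SELECT Category, NULL, SUM(Cost), 'Summary'
FROM #Table
WHERE Type = 'Cost'
GROUP BY Category;
```

With the sample rows this adds a 975.00 summary for Office and a 0.00 summary for Home.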
Wondering what the options are to do Excel's SUMPRODUCT in SQL Server.
I have 3 tables:
Transaction list of items being sold
Raw materials making up each item
Price of raw materials by date
For each item sold (table 1), I want to find the total price of raw materials (table 3) on that sold date, considering the % of raw materials that make up the item. Sample data below.
Table 1: Items sold
Item    Date      Qty_sold
------  --------  --------
Pencil  5/1/2022  1
Pencil  6/1/2022  2
Pencil  9/1/2022  1
Table 2: Raw materials making up each item
Item    Raw_material  pct_of_total
------  ------------  ------------
Pencil  Wood          70%
Pencil  Rubber        5%
Pencil  Lead          25%
Table 3: Raw material prices by date
Date      Raw_material  Part_unitprice
--------  ------------  --------------
5/1/2022  Wood          0.20
6/1/2022  Wood          0.21
9/1/2022  Wood          0.21
5/1/2022  Rubber        0.10
6/1/2022  Rubber        0.10
9/1/2022  Rubber        0.12
5/1/2022  Lead          0.50
6/1/2022  Lead          0.55
9/1/2022  Lead          0.50
The result I'm looking for is below, at the same level of detail as Table 1.
Item    Date      Qty_sold  SUMPRODUCT_unitprice
------  --------  --------  --------------------
Pencil  5/1/2022  1         0.27
Pencil  6/1/2022  2         0.2895
Pencil  9/1/2022  1         0.278
My approach would be to:
1 - Join Table 2 (~3k rows) with Table 3 (~72k rows)
2 - Join the resulting table with Table 1 (> 2M rows)
I'm conscious of the number of rows that would need to get crunched with these two joins, and I'm wondering if there is a more sophisticated way of doing this.
This is an example query where you can calculate the required sumproduct value. Just note that I don't see any reason to worry about "huge numbers of rows" here. This is not a cartesian join.
select
t1.Item,
t1.[Date],
t1.Qty_Sold,
sum(t2.pct_of_total * t3.Part_UnitPrice) as SumProduct_UnitPrice
from
#Table1 t1
inner join #Table2 t2
on t2.Item = t1.Item
inner join #Table3 t3
on t3.Raw_Material = t2.Raw_Material
and t3.[Date] = t1.[Date]
group by
t1.Item, t1.[Date], t1.Qty_Sold;
For completeness, here is my test data:
drop table if exists #Table1, #Table2, #Table3
create table #Table1 (Item nvarchar(30), [Date] date, Qty_Sold int)
insert into #Table1 values
('Pencil', '20220501', 1),
('Pencil', '20220601', 2),
('Pencil', '20220901', 1)
create table #Table2 (Item nvarchar(30), Raw_Material nvarchar(30), pct_of_total decimal(5, 2))
insert into #Table2 values
('Pencil', 'Wood', .70),
('Pencil', 'Rubber', .05),
('Pencil', 'Lead', .25)
create table #Table3 ([Date] date, Raw_Material nvarchar(30), Part_UnitPrice decimal(5,2))
insert into #Table3 values
('20220501', 'Wood', .20),
('20220601', 'Wood', .21),
('20220901', 'Wood', .21),
('20220501', 'Rubber', .10),
('20220601', 'Rubber', .10),
('20220901', 'Rubber', .12),
('20220501', 'Lead', .50),
('20220601', 'Lead', .55),
('20220901', 'Lead', .50)
I have a query that can return multiple rows for the same ID from my database. This is because it reads a payment table, and an invoice can be paid multiple times.
So my results can look like this.
ID Company BillAmount AmountPaid
----- --------- ------------ ------------
123 ABC 1000.00 450.00
123 ABC 1000.00 250.00
456 DEF 1200.00 1200.00
I am building this query to feed Crystal Reports. If I just pull the raw data, I won't be able to do any subtotaling in CR, because the bill amount will total $3200 when it is really $2200. I also need to show the balance. I can compute that in CR, but if I pull the balance on each returned line, the total balance due across all records will be wrong, because the "duplicate" rows are counted more than once.
I am not sure what kind of report you need but maybe a query like this might be useful:
select ID, Company, max(BillAmount) as BillAmount, sum(AmountPaid) as AmountPaid
from Payment
group by ID, Company
-improved after Juan Carlos' suggestion
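Here is a self-contained sketch of the grouped query, extended with a balance column since the question asks for one. The #Payment temp table, its column types, and the Balance alias are assumptions made for illustration; the sample rows are the ones from the question.

```sql
-- Hypothetical stand-in for the real payment query/table.
IF OBJECT_ID('tempdb..#Payment') IS NOT NULL DROP TABLE #Payment;
CREATE TABLE #Payment (
    ID         int,
    Company    varchar(25),
    BillAmount decimal(10, 2),
    AmountPaid decimal(10, 2)
);

INSERT INTO #Payment (ID, Company, BillAmount, AmountPaid) VALUES
    (123, 'ABC', 1000.00,  450.00),
    (123, 'ABC', 1000.00,  250.00),
    (456, 'DEF', 1200.00, 1200.00);

-- One row per invoice: the bill amount is taken once, payments are summed.
SELECT ID,
       Company,
       MAX(BillAmount)                   AS BillAmount,
       SUM(AmountPaid)                   AS AmountPaid,
       MAX(BillAmount) - SUM(AmountPaid) AS Balance
FROM #Payment
GROUP BY ID, Company;
```

For the sample rows this gives a balance of 300.00 for invoice 123 and 0.00 for invoice 456, so report-level totals in CR add up correctly.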
There are 2 options available for this.
On the Crystal Reports side:
Crystal Reports has a grouping facility; follow the steps suggested in this link.
For a group summary, after adding the group, put all the fields in the group footer; check this link.
On the SQL side, try the suggestion below (you did not say which SQL dialect or database you use, so I assume SQL Server 2012 or above):
Get the records with 2 extra columns (TotalBill, TotalPaid).
declare @Invoice table (id int, Company varchar(25), BillAmount int)
declare @payment table (id int, InvoiceId int, AmountPaid int)
insert into @Invoice values (1, 'ABC', 1000), (2, 'DFE', 1200)
insert into @payment values (1, 1, 450), (2, 1, 250), (3, 2, 1200)
;with cte as
( select sum(BillAmount) TotalBill from @Invoice i )
Select
i.*, p.AmountPaid,
Sum(AmountPaid) over ( partition by i.id ) InvoiceWiseTotalPaid,
cte.TotalBill,
Sum(AmountPaid) over ( order by i.id ) TotalPaid
from
@Invoice i
Join @payment p on i.id = p.InvoiceId
cross join cte
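Output will be:
id  Company  BillAmount  AmountPaid  InvoiceWiseTotalPaid  TotalBill  TotalPaid
--  -------  ----------  ----------  --------------------  ---------  ---------
1   ABC      1000        450         700                   2200       700
1   ABC      1000        250         700                   2200       700
2   DFE      1200        1200        1200                  2200       1900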
I have several tables all holding small amounts of data on a batch of product. For example, I have a table (called 'Tests') that holds a test number, a test name and the description. This is referenced by my batch table, which holds a test number, test result (as a real) and the batch number itself.
Some batches may have 50 tests, some may have 30, some may have as few as 1.
I was hoping to create a view that converts something like these tables;
BatchNumber TestNum TestResult | TestNumber TestName TestDesc
----------- -------- ----------- | ----------- --------- ---------
1000 1 1.20 | 1 Thickness How thick the product is
1001 1 1.30 | 2 Colour What colour the product is
1001 2 45.1 | 3 Weight How heavy the product is
...
to the following;
BatchNumber Thickness Colour Weight
------------ --------- ------ -------
1000 1.20 NULL NULL
1001 1.30 45.1 NULL
...
The 'null' could just be blank (that would probably be better); I only used it to show my requirement more clearly.
I've found many articles online on the benefit of PIVOTing, UNPIVOTing, UNIONing but none show the direct benefit, or indeed provide a clear and succinct way of using the data without copying data into a new table, which isn't really useful for my need. I was hoping that a view would be possible so that end-user applications can just call that instead of doing the joins locally.
I hope that makes sense, and thank you!
You need a crosstab query for this.
http://www.mssqltips.com/sqlservertip/1019/crosstab-queries-using-pivot-in-sql-server/
DECLARE @Tests Table (TestNumber int, TestName VARCHAR(100), TestDesc varchar(100))
INSERT INTO @Tests
SELECT 1, 'Thickness', '' UNION ALL
SELECT 2, 'Color', '' UNION ALL
SELECT 3, 'Weight', ''
DECLARE @BTests Table (BatchNum int, TestNumber int, TestResult float)
INSERT INTO @BTests
SELECT 1000, 1, 1.20 UNION ALL
SELECT 1001, 1, 1.30 UNION ALL
SELECT 1001, 2, 45.1
;with cte as (
select b.*, t.TestName
from @BTests b
inner join @Tests t on b.TestNumber = t.TestNumber
)
SELECT Batchnum, [Thickness] AS Thickness, [Color] AS Color, [Weight] as [Weight]
FROM
(SELECT Batchnum, TestName, testResult
FROM Cte ) ps
PIVOT
(
SUM (testResult)
FOR testname IN
( [Thickness],[Color], [Weight])
) AS pvt
CREATE VIEW?
e.g.
CREATE VIEW myview AS
SELECT somecolumns from sometable
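To make that concrete, here is a sketch of a view wrapping a pivot over the two tables described in the question. The names dbo.Tests, dbo.BatchTests, and dbo.BatchTestResults are assumptions for illustration, and the tables have to be permanent ones, because a view cannot reference temp tables or table variables. GO is the batch separator used by SSMS and sqlcmd; CREATE VIEW must be the first statement in its batch.

```sql
-- Hypothetical permanent tables matching the question's description.
CREATE TABLE dbo.Tests (
    TestNumber int PRIMARY KEY,
    TestName   varchar(100),
    TestDesc   varchar(100)
);
CREATE TABLE dbo.BatchTests (
    BatchNumber int,
    TestNum     int,
    TestResult  real
);
GO

-- One row per batch, one column per test name listed in the IN (...) clause.
CREATE VIEW dbo.BatchTestResults AS
SELECT BatchNumber, [Thickness], [Colour], [Weight]
FROM (
    SELECT b.BatchNumber, t.TestName, b.TestResult
    FROM dbo.BatchTests AS b
    INNER JOIN dbo.Tests AS t ON t.TestNumber = b.TestNum
) AS src
PIVOT (
    MAX(TestResult) FOR TestName IN ([Thickness], [Colour], [Weight])
) AS pvt;
GO

-- End-user applications can then query the view directly.
SELECT * FROM dbo.BatchTestResults ORDER BY BatchNumber;
```

The trade-off is that the test names are hard-coded in the view: a test added to dbo.Tests only shows up as a column after the view is altered. Batches without a result for a test get NULL in that column.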
Sorry if this has already been posted, but I've been through umpteen posts on pivoting over the past day and still haven't been able to get the result I want.
Background:
In short, I am developing a set of tables that will store a questionnaire dynamically.
I won't go into detail, as it probably isn't relevant.
I basically want to query the table that stores the user input for a set question.
These questions branch off each other, allowing me to show columns and rows per question etc.
Anyway this query:
SELECT qr.*, Question
FROM QuestionRecord qr
INNER JOIN
QuestionRecord P
ON P.ID = qr.ParentQuestionRecordId
JOIN Questions q ON q.ID = qr.QuestionID
Produces this result set :
ID FormRecordId QuestionId ParentQuestionRecordId Value Question
---------------------------------------------------------------------------------------
2 1 31 1 Consultancy Eligible project costs
3 1 32 2 NULL Date
4 1 33 2 25000 Cash Costs £
5 1 34 2 NULL In Kind Costs £
6 1 35 2 25000 Total Costs
7 1 31 1 Orchard day x2 Eligible project costs
8 1 32 7 NULL Date
9 1 33 7 15000 Cash Costs £
10 1 34 7 NULL In Kind Costs £
11 1 35 7 15000 Total Costs
I basically want to Pivot(I think) these rows to look like so:
Eligible project costs Date Cash Costs £ In Kind Costs Total Costs
--------------------------------------------------------------------------------
Consultancy NULL 25000 NULL 25000
Orchard day x2 NULL 15000 NULL 15000
I have tried:
SELECT [Eligible project costs],[Date],[Cash Costs £],[In Kind Costs £],[Total Costs]
FROM
(
SELECT qr.*, Question
FROM QuestionRecord qr
INNER JOIN
QuestionRecord P
ON P.ID = qr.ParentQuestionRecordId
JOIN Questions q ON q.ID = qr.QuestionID
)pvt
PIVOT
(
MIN(Value)
FOR Question IN
([Eligible project costs],[Date],[Cash Costs £],[In Kind Costs £],[Total Costs])
)pivotTable
but this returns each column on a separate row:
Eligible project costs Date Cash Costs £ In Kind Costs Total Costs
--------------------------------------------------------------------------------
Consultancy NULL NULL NULL NULL
NULL NULL NULL NULL NULL
NULL NULL 25000 NULL NULL
NULL NULL NULL NULL NULL
NULL NULL NULL NULL 25000
So that's as close as I have managed to get with it. I was wondering if you guys/girls could help me out :)
Thanks!
Try the following change to your script: in the subquery, replace qr.* with a row number (grp) plus only the Value and Question columns, so that PIVOT groups on grp alone:
SELECT [Eligible project costs],[Date],[Cash Costs £],[In Kind Costs £],[Total Costs]
FROM
(
SELECT
grp = ROW_NUMBER() OVER (PARTITION BY qr.QuestionId ORDER BY qr.ID),
qr.Value,
q.Question
FROM QuestionRecord qr
INNER JOIN
QuestionRecord P
ON P.ID = qr.ParentQuestionRecordId
JOIN Questions q ON q.ID = qr.QuestionID
)pvt
PIVOT
(
MIN(Value)
FOR Question IN
([Eligible project costs],[Date],[Cash Costs £],[In Kind Costs £],[Total Costs])
)pivotTable
I think this should give you the result you are after.
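Change SELECT qr.*, Question to SELECT qr.Value, Question. PIVOT groups by every column of the subquery that is neither aggregated nor pivoted, so the extra columns pulled in by qr.* (ID, ParentQuestionRecordId, and so on) force each record onto its own row.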
What you need, as Andriy more or less pointed out, is something that makes each record unique according to how you want the records grouped. If this is a survey system, I'm going to guess that you've got some sort of ID that identifies who each record belongs to. The reason the values come back on separate rows is that those IDs make every row unique. What you need is to add the respondent ID to your derived table and get rid of the other IDs.
See my example:
declare @table table (ID int identity(1,1), QuestionID int, value varchar(50), Respondent int)
declare @questions table (QID int, name varchar(50))
insert into @questions values (31,'Eligible project costs')
insert into @questions values (32,'Date')
insert into @questions values (33,'Cash Costs')
insert into @questions values (34,'In Kind Costs')
insert into @questions values (35,'Total Costs')
insert into @table values (31,'Consultancy',1)
insert into @table values (32,null,1)
insert into @table values (33,25000,1)
insert into @table values (34,null,1)
insert into @table values (35,25000,1)
insert into @table values (31,'Orchard day x2',2)
insert into @table values (32,null,2)
insert into @table values (33,15000,2)
insert into @table values (34,null,2)
insert into @table values (35,15000,2)
select
[Eligible project costs],[Date],[Cash Costs],[In Kind Costs],[Total Costs]
from
(
select
Respondent,
q.name,
t.Value
from @table t
inner join @questions q
on t.QuestionID=QID
) a
pivot
(
min(Value)
for name in ([Eligible project costs],[Date],[Cash Costs],[In Kind Costs],[Total Costs])
) p
I have an interesting SQL problem that I need help with.
Here is the sample dataset:
Warehouse DateStamp TimeStamp ItemNumber ID
A 8/1/2009 10001 abc 1
B 8/1/2009 10002 abc 1
A 8/3/2009 12144 qrs 5
C 8/3/2009 12143 qrs 5
D 8/5/2009 6754 xyz 6
B 8/5/2009 6755 xyz 6
This dataset represents inventory transfers between two warehouses. There are two records that represent each transfer, and these two transfer records always have the same ItemNumber, DateStamp, and ID. The TimeStamp values for the two transfer records always have a difference of 1, where the smaller TimeStamp represents the source warehouse record and the larger TimeStamp represents the destination warehouse record.
Using the sample dataset above, here is the query result set that I need:
Warehouse_Source Warehouse_Destination ItemNumber DateStamp
A B abc 8/1/2009
C A qrs 8/3/2009
D B xyz 8/5/2009
I can write code to produce the desired result set, but I was wondering if this record combination was possible through SQL. I am using SQL Server 2005 as my underlying database. I also need to add a WHERE clause to the SQL, so that for example, I could search on Warehouse_Source = A. And no, I can't change the data model ;).
Any advice is greatly appreciated!
Regards,
Mark
SELECT source.Warehouse as Warehouse_Source
, dest.Warehouse as Warehouse_Destination
, source.ItemNumber
, source.DateStamp
FROM [table] source -- [table] is a placeholder for the real table name
JOIN [table] dest ON source.ID = dest.ID
AND source.ItemNumber = dest.ItemNumber
AND source.DateStamp = dest.DateStamp
AND dest.TimeStamp = source.TimeStamp + 1 -- the destination row has the larger TimeStamp
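To search on the source warehouse, as the question asks, a WHERE clause on the source side of the self-join should be enough. The sketch below is self-contained: the #Transfers temp table and its column types are my assumptions, the data is the sample from the question, and the syntax is kept to what SQL Server 2005 accepts.

```sql
-- Hypothetical stand-in for the real transfer table.
IF OBJECT_ID('tempdb..#Transfers') IS NOT NULL DROP TABLE #Transfers;
CREATE TABLE #Transfers (
    Warehouse   char(1),
    DateStamp   datetime,
    [TimeStamp] int,
    ItemNumber  varchar(10),
    ID          int
);

INSERT INTO #Transfers (Warehouse, DateStamp, [TimeStamp], ItemNumber, ID)
SELECT 'A', '20090801', 10001, 'abc', 1 UNION ALL
SELECT 'B', '20090801', 10002, 'abc', 1 UNION ALL
SELECT 'A', '20090803', 12144, 'qrs', 5 UNION ALL
SELECT 'C', '20090803', 12143, 'qrs', 5 UNION ALL
SELECT 'D', '20090805',  6754, 'xyz', 6 UNION ALL
SELECT 'B', '20090805',  6755, 'xyz', 6;

-- Pair each source row with the destination row whose TimeStamp is one higher,
-- then keep only transfers that left warehouse A.
SELECT src.Warehouse AS Warehouse_Source,
       dst.Warehouse AS Warehouse_Destination,
       src.ItemNumber,
       src.DateStamp
FROM #Transfers AS src
JOIN #Transfers AS dst
  ON  dst.ID          = src.ID
  AND dst.ItemNumber  = src.ItemNumber
  AND dst.DateStamp   = src.DateStamp
  AND dst.[TimeStamp] = src.[TimeStamp] + 1
WHERE src.Warehouse = 'A';
```

With the sample data this returns only the abc transfer from A to B; the qrs transfer is excluded because A is its destination, not its source.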
Mark,
Here is how you can do this with row_number and PIVOT. With a clustered index or primary key on the columns I suggest, it uses a straight-line query plan with no Sort operation, so it should be particularly efficient.
create table T(
Warehouse char,
DateStamp datetime,
TimeStamp int,
ItemNumber varchar(10),
ID int,
primary key(ItemNumber,DateStamp,ID,TimeStamp)
);
insert into T values ('A','20090801','10001','abc','1');
insert into T values ('B','20090801','10002','abc','1');
insert into T values ('A','20090803','12144','qrs','5');
insert into T values ('C','20090803','12143','qrs','5');
insert into T values ('D','20090805','6754','xyz','6');
insert into T values ('B','20090805','6755','xyz','6');
with Tpaired(Warehouse,DateStamp,TimeStamp,ItemNumber,ID,rk) as (
select
Warehouse,DateStamp,TimeStamp,ItemNumber,ID,
row_number() over (
partition by ItemNumber,DateStamp,ID
order by TimeStamp
)
from T
)
select
max([1]) as Warehouse_Source,
max([2]) as Warehouse_Destination,
ItemNumber,
DateStamp
from Tpaired
pivot (
max(Warehouse) for rk in ([1],[2])
) as P
group by ItemNumber, DateStamp, ID;
go
drop table T;