How to display values of previous rows - sql

I have two tables, and I am struggling to write a query that generates the result I require.
Table 1
CREATE TABLE [Table 1](
[ID] [int] NOT NULL,
[Active_Status] [char](1) NOT NULL,
[Status Change Date] [date] NOT NULL
)
INSERT INTO [Table 1] VALUES (1,'Y','2000-01-15')
INSERT INTO [Table 1] VALUES (1,'N','2003-01-20')
INSERT INTO [Table 1] VALUES (2,'N','2002-01-25')
INSERT INTO [Table 1] VALUES (2,'Y','2003-01-15')
INSERT INTO [Table 1] VALUES (2,'N','2010-01-20')
INSERT INTO [Table 1] VALUES (3,'Y','2005-01-25')
INSERT INTO [Table 1] VALUES (3,'Y','2007-01-20')
INSERT INTO [Table 1] VALUES (3,'N','2011-01-15')
Table 2
CREATE TABLE [Table 2](
[ID] [int] NOT NULL,
[Decision] [varchar](4) NOT NULL,
[Decision Change Date] [date] NOT NULL
)
INSERT INTO [Table 2] VALUES (1,'BUY' ,'2000-05-15')
INSERT INTO [Table 2] VALUES (1,'SELL','2010-05-20')
INSERT INTO [Table 2] VALUES (1,'SELL','2012-05-25')
INSERT INTO [Table 2] VALUES (2,'HOLD','2004-05-15')
INSERT INTO [Table 2] VALUES (2,'BUY' ,'2011-05-10')
INSERT INTO [Table 2] VALUES (3,'SELL','2008-05-15')
INSERT INTO [Table 2] VALUES (3,'BUY' ,'2011-05-25')
My desired output
To start I need to sort my result table by ID and Decision Change Date. Subsequently I need to look up the appropriate Active_Status for the corresponding Decision Change Date.
Likewise I need to display the Active_Status and Decision for the previous period.

Final edit after talking on chat to get a final solution:
CREATE TABLE #Result
(
TICKR_SYMB VARCHAR (15) NOT NULL
,fromReviewStatus char(10)
,toReviewStatus char(10)
,ReviewStatusChangeDate DATETIME
,fromRestrictionStatus char(10)
,toRestrictionStatus char(10)
,RestrictionStatusChangeDate DATETIME
,fromCoverageStatus char(10)
,toCoverageStatus char(10)
,CoverageStatusChangeDate DATETIME
,fromRating VARCHAR(20)
,toRating VARCHAR(20)
,RatingChangeDate DATETIME
)
/* Rating History */
;WITH DecisionsHistory AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY H.TICKR_SYMB ORDER BY H.[Rating Change Date]) AS Row
,H.TICKR_SYMB
,H.[to Rating] AS toRating
,H.[Rating Change Date]
FROM tblTickerRatingHistory H
)
INSERT #Result
(
TICKR_SYMB
,fromRating
,toRating
,RatingChangeDate
)
SELECT
CurrentHistory.TICKR_SYMB
,LastHistory.toRating AS fromRating
,CurrentHistory.toRating
,CurrentHistory.[Rating Change Date]
FROM DecisionsHistory CurrentHistory
LEFT JOIN DecisionsHistory LastHistory
ON LastHistory.Row = (CurrentHistory.Row - 1)
AND LastHistory.TICKR_SYMB = CurrentHistory.TICKR_SYMB
/* ReviewStatus */
;WITH ReviewStatusHistory AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY R.TICKR_SYMB, R.RatingChangeDate ORDER BY H.ReviewStatusChangeDate DESC) AS Row
,R.TICKR_SYMB
,R.RatingChangeDate
,H.ReviewStatus AS ToReviewStatus
,H.ReviewStatusChangeDate
FROM #Result R
LEFT JOIN tblTickerStatusHistory H
ON H.TICKR_SYMB = R.TICKR_SYMB
AND H.ReviewStatusChangeDate < R.RatingChangeDate
)
UPDATE R
SET
fromReviewStatus = LastActiveHistory.toReviewStatus
,toReviewStatus = CurrentActiveHistory.toReviewStatus
,ReviewStatusChangeDate = CurrentActiveHistory.ReviewStatusChangeDate
FROM #Result R
LEFT JOIN ReviewStatusHistory CurrentActiveHistory
ON CurrentActiveHistory.TICKR_SYMB = R.TICKR_SYMB
AND CurrentActiveHistory.RatingChangeDate = R.RatingChangeDate
AND CurrentActiveHistory.Row = 1
LEFT JOIN ReviewStatusHistory LastActiveHistory
ON LastActiveHistory.TICKR_SYMB = R.TICKR_SYMB
AND LastActiveHistory.RatingChangeDate = R.RatingChangeDate
AND LastActiveHistory.Row = 2
/* CoverageStatus */
;WITH CoverageStatusHistory AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY R.TICKR_SYMB, R.RatingChangeDate ORDER BY H.CoverageStatusChangeDate DESC) AS Row
,R.TICKR_SYMB
,R.RatingChangeDate
,H.CoverageStatus AS ToCoverageStatus
,H.CoverageStatusChangeDate
FROM #Result R
LEFT JOIN tblTickerStatusHistory H
ON H.TICKR_SYMB = R.TICKR_SYMB
AND H.CoverageStatusChangeDate < R.RatingChangeDate
)
UPDATE R
SET
fromCoverageStatus = LastActiveHistory.toCoverageStatus
,toCoverageStatus = CurrentActiveHistory.toCoverageStatus
,CoverageStatusChangeDate = CurrentActiveHistory.CoverageStatusChangeDate
FROM #Result R
LEFT JOIN CoverageStatusHistory CurrentActiveHistory
ON CurrentActiveHistory.TICKR_SYMB = R.TICKR_SYMB
AND CurrentActiveHistory.RatingChangeDate = R.RatingChangeDate
AND CurrentActiveHistory.Row = 1
LEFT JOIN CoverageStatusHistory LastActiveHistory
ON LastActiveHistory.TICKR_SYMB = R.TICKR_SYMB
AND LastActiveHistory.RatingChangeDate = R.RatingChangeDate
AND LastActiveHistory.Row = 2
/*RestrictionStatus */
;WITH RestrictionStatusHistory AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY R.TICKR_SYMB, R.RatingChangeDate ORDER BY H.RestrictionStatusChangeDate DESC) AS Row
,R.TICKR_SYMB
,R.RatingChangeDate
,H.RestrictionStatus AS ToRestrictionStatus
,H.RestrictionStatusChangeDate
FROM #Result R
LEFT JOIN tblTickerStatusHistory H
ON H.TICKR_SYMB = R.TICKR_SYMB
AND H.RestrictionStatusChangeDate < R.RatingChangeDate
)
UPDATE R
SET
fromRestrictionStatus = LastActiveHistory.toRestrictionStatus
,toRestrictionStatus = CurrentActiveHistory.toRestrictionStatus
,RestrictionStatusChangeDate = CurrentActiveHistory.RestrictionStatusChangeDate
FROM #Result R
LEFT JOIN RestrictionStatusHistory CurrentActiveHistory
ON CurrentActiveHistory.TICKR_SYMB = R.TICKR_SYMB
AND CurrentActiveHistory.RatingChangeDate = R.RatingChangeDate
AND CurrentActiveHistory.Row = 1
LEFT JOIN RestrictionStatusHistory LastActiveHistory
ON LastActiveHistory.TICKR_SYMB = R.TICKR_SYMB
AND LastActiveHistory.RatingChangeDate = R.RatingChangeDate
AND LastActiveHistory.Row = 2
SELECT
R1.TICKR_SYMB
,R1.fromCoverageStatus
,R1.toCoverageStatus
,R1.CoverageStatusChangeDate
,R1.fromReviewStatus
,R1.toReviewStatus
,R1.ReviewStatusChangeDate
,R1.fromRestrictionStatus
,R1.toRestrictionStatus
,R1.RestrictionStatusChangeDate
,R1.fromRating
,R1.toRating
,R1.RatingChangeDate
FROM #Result R1
ORDER BY TICKR_SYMB, RatingChangeDate

Here's something that should get you started.
create table t1 (
id int, act char(1), scd date
)
create table t2 (
id int, decs varchar(4), dcd date
)
insert into t1 values
( 1, 'y', '20000115'),
( 1, 'n', '20030120'),
( 2, 'n', '20020125'),
( 2, 'y', '20030115'),
( 2, 'n', '20100120'),
( 3,'y','20050125'),
( 3,'y','20070120'),
( 3,'n','20110115')
insert into t2 values
(1,'buy','20000515' ),
(1,'sell', '20100520' ),
(1,'sell', '20120525' ),
(2,'hold', '20040515'),
(2,'buy', '20110510' ),
(3,'sell', '20080515'),
(3,'buy','20110525' )
;with decisions as (
select row_number() over (partition by id order by dcd) as rn,
id, decs, dcd from t2
),
activities as
(
select row_number() over (partition by id order by scd) as rn,
id, act, scd from t1
)
select dec_to.id, x.from_act, x.to_act, x.scd, dec_from.decs as from_dec, dec_to.decs as to_dec, dec_to.dcd from decisions dec_from
right outer join decisions dec_to on dec_from.id = dec_to.id and
dec_to.rn = dec_from.rn + 1
outer apply (
select top 1 act_to.id, act_from.act as from_act, act_to.act as to_act, act_to.scd
from activities act_to
left outer join activities as act_from
on act_from.id = act_to.id and act_from.rn = act_to.rn - 1
where act_to.id = dec_to.id and act_to.scd <= dec_to.dcd
order by act_to.scd desc
) x
order by dec_to.id, dec_to.rn
Yes, it is ugly, and it probably won't perform well on large datasets without proper indexing. However, your requirements are vague, and the rules for which rows go where do not make a whole lot of sense.
The assumption here is that you want the latest activity change row that falls on or before the decision date. This works and produces the following:
ID  FROM_ACT  TO_ACT  SCD         FROM_DEC  TO_DEC  DCD
-----------------------------------------------------------
1   NULL      y       2000-01-15  NULL      buy     2000-05-15
1   y         n       2003-01-20  buy       sell    2010-05-20
1   y         n       2003-01-20  sell      sell    2012-05-25
2   n         y       2003-01-15  NULL      hold    2004-05-15
2   y         n       2010-01-20  hold      buy     2011-05-10
3   y         y       2007-01-20  NULL      sell    2008-05-15
3   y         n       2011-01-15  sell      buy     2011-05-25
You can play with it at SQLFiddle

You can do this by using CROSS APPLY and LAG.
SELECT t2.ID,
LAG(t1.Active_Status, 1, NULL) OVER (PARTITION BY t2.ID ORDER BY t2.[Decision Change Date]) AS [From Active Status],
t1.Active_Status AS [To Active Status], t1.[Status Change Date] AS [Active Status Change Date],
LAG(t2.Decision, 1, NULL) OVER (PARTITION BY t2.ID ORDER BY t2.[Decision Change Date]) AS [From Decision Status],
t2.Decision AS [To Decision Status], t2.[Decision Change Date]
FROM [Table 2] t2
CROSS APPLY (SELECT TOP 1 *
FROM [Table 1]
WHERE ID = t2.ID AND [Status Change Date] < t2.[Decision Change Date]
ORDER BY [Status Change Date] DESC) t1
ORDER BY t2.ID, [Decision Change Date]
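If you want to try the "latest status before each decision, plus LAG" idea without a SQL Server instance, here is a rough sketch using Python's built-in sqlite3 module. It is an adaptation, not the T-SQL above: SQLite has no CROSS APPLY, so the lookup is written as a join on a correlated MAX subquery, and the table and column names (status_history, decision_history, and so on) are made up for the sketch. It assumes the bundled SQLite is 3.25 or newer, which is when window functions arrived.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE status_history (id INTEGER, active_status TEXT, status_date TEXT);
CREATE TABLE decision_history (id INTEGER, decision TEXT, decision_date TEXT);
INSERT INTO status_history VALUES
  (1,'Y','2000-01-15'),(1,'N','2003-01-20'),
  (2,'N','2002-01-25'),(2,'Y','2003-01-15'),(2,'N','2010-01-20'),
  (3,'Y','2005-01-25'),(3,'Y','2007-01-20'),(3,'N','2011-01-15');
INSERT INTO decision_history VALUES
  (1,'BUY','2000-05-15'),(1,'SELL','2010-05-20'),(1,'SELL','2012-05-25'),
  (2,'HOLD','2004-05-15'),(2,'BUY','2011-05-10'),
  (3,'SELL','2008-05-15'),(3,'BUY','2011-05-25');
""")

# For each decision, pick the most recent status row strictly before it,
# then use LAG over the decision rows to expose the previous row's values.
asof_query = """
SELECT d.id,
       LAG(s.active_status) OVER (PARTITION BY d.id ORDER BY d.decision_date) AS from_status,
       s.active_status AS to_status,
       s.status_date,
       LAG(d.decision) OVER (PARTITION BY d.id ORDER BY d.decision_date) AS from_decision,
       d.decision AS to_decision,
       d.decision_date
FROM decision_history AS d
JOIN status_history AS s
  ON s.id = d.id
 AND s.status_date = (SELECT MAX(s2.status_date)
                      FROM status_history AS s2
                      WHERE s2.id = d.id
                        AND s2.status_date < d.decision_date)
ORDER BY d.id, d.decision_date
"""
asof_rows = conn.execute(asof_query).fetchall()
for row in asof_rows:
    print(row)
```

One thing worth noticing when you run it: LAG here looks at the previous decision row, so the "from" status is the status that applied at the previous decision, not the previous entry in the status table. For ID 1's third decision that gives N to N, whereas the OUTER APPLY answer above reports y to n. Which of the two you want depends on what "previous period" means for you.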

Related

SQL to select the 'first' date a project was made inactive for all projects

I am trying to work out the SQL I would need to select certain records, here is an example of what I'm trying to do:
Project number  Active/Inactive  Date
--------------------------------------
1               A                1/1/20
1               I                3/1/20
1               A                5/1/20
1               I                7/1/20
1               I                9/1/20
2               I                1/1/19
2               A                5/1/19
3               A                1/3/20
3               I                3/3/20
3               I                5/3/20
Note: A=Active project, I=Inactive.
What I would like to do is for each project where the project is currently inactive (i.e. the latest date for the project in the above table is set to I), return the row of the longest time ago it was made inactive, but NOT before it was last active (hope this is understandable!). So for the above table the following would be returned:
Project number  Active/Inactive  Date
--------------------------------------
1               I                7/1/20
3               I                3/3/20
So proj number 1 is inactive and the earliest time it was made inactive (after the last time it was active) is 7/1/20. Project 2 is not selected as it is currently active. Project 3 is inactive and the earliest time it was made inactive (after the last time it was active) is 3/3/20.
Thanks.
You could use the 'row_number' function to help you.
create TABLE #PROJECT(ProjectNumber int, [Status] varchar(1), [Date] date)
INSERT INTO #PROJECT VALUES
(1 ,'A' ,'1/1/20'),
(1 ,'I' ,'3/1/20'),
(1 ,'A' ,'5/1/20'),
(1 ,'I' ,'7/1/20'),
(1 ,'I' ,'9/1/20'),
(2 ,'I' ,'1/1/19'),
(2 ,'A' ,'5/1/19'),
(3 ,'A' ,'1/3/20'),
(3 ,'I' ,'3/3/20'),
(3 ,'I' ,'5/3/20')
select * from
(SELECT
row_number() over (partition by projectNumber order by [date]) as [index]
,*
FROM
#PROJECT
WHERE
[STATUS] = 'I'
) as a where [index] = 1
Using some effective date joins, this should work. I am using SQL Server. Create your tables and set up the same data set you provided:
CREATE TABLE dbo.PROJECTS
(
PROJ_NUM int NULL,
STTS char(1) NULL,
STTS_DT date NULL
) ON [PRIMARY]
GO
INSERT INTO dbo.PROJECTS values (1, 'A', '1/1/20');
INSERT INTO dbo.PROJECTS values (1, 'I', '3/1/20');
INSERT INTO dbo.PROJECTS values (1, 'A', '5/1/20');
INSERT INTO dbo.PROJECTS values (1, 'I', '7/1/20');
INSERT INTO dbo.PROJECTS values (1, 'I', '9/1/20');
INSERT INTO dbo.PROJECTS values (2, 'I', '1/1/19');
INSERT INTO dbo.PROJECTS values (2, 'A', '5/1/19');
INSERT INTO dbo.PROJECTS values (3, 'A', '1/3/20');
INSERT INTO dbo.PROJECTS values (3, 'I', '3/3/20');
INSERT INTO dbo.PROJECTS values (3, 'I', '5/3/20');
Write a sub-query that filters out just to the projects that are INACTIVE:
-- sub-query that gives you projects that are inactive
SELECT PROJ_NUM, STTS, STTS_DT FROM dbo.PROJECTS CURRSTTS
WHERE STTS_DT = (SELECT MAX(STTS_DT) FROM dbo.PROJECTS ALLP WHERE ALLP.PROJ_NUM = CURRSTTS.PROJ_NUM)
AND CURRSTTS.STTS = 'I'
;
Write another sub-query that provides you the last active status date for each project:
-- sub-query that gives you last active status date for each project
SELECT PROJ_NUM, STTS, STTS_DT FROM dbo.PROJECTS LASTACTV
WHERE STTS_DT = (SELECT MAX(STTS_DT) FROM dbo.PROJECTS ALLP WHERE ALLP.PROJ_NUM = LASTACTV.PROJ_NUM AND ALLP.STTS = 'A')
;
Combine those two sub-queries into a query that gives you the list of inactive projects with their last active status date:
-- sub-query using the 2 above to show only inactive projects with last active stts date
SELECT CURRSTTS.PROJ_NUM, CURRSTTS.STTS, CURRSTTS.STTS_DT, LASTACTV.STTS_DT AS LASTACTV_STTS_DT FROM dbo.PROJECTS CURRSTTS
INNER JOIN
(SELECT PROJ_NUM, STTS, STTS_DT FROM dbo.PROJECTS LASTACTV
WHERE STTS_DT = (SELECT MAX(STTS_DT) FROM dbo.PROJECTS ALLP WHERE ALLP.PROJ_NUM = LASTACTV.PROJ_NUM AND ALLP.STTS = 'A'))
LASTACTV ON CURRSTTS.PROJ_NUM = LASTACTV.PROJ_NUM
WHERE CURRSTTS.STTS_DT = (SELECT MAX(STTS_DT) FROM dbo.PROJECTS ALLP WHERE ALLP.PROJ_NUM = CURRSTTS.PROJ_NUM)
AND CURRSTTS.STTS = 'I'
Add one more layer to the query that selects the MIN(STTS_DT) that is greater than the LASTACTV_STTS_DT:
-- final query that uses above sub-query
SELECT P.PROJ_NUM, P.STTS, P.STTS_DT
FROM dbo.PROJECTS P
INNER JOIN (
SELECT CURRSTTS.PROJ_NUM, CURRSTTS.STTS, CURRSTTS.STTS_DT, LASTACTV.STTS_DT AS LASTACTV_STTS_DT FROM dbo.PROJECTS CURRSTTS
INNER JOIN
(SELECT PROJ_NUM, STTS, STTS_DT FROM dbo.PROJECTS LASTACTV
WHERE STTS_DT = (SELECT MAX(STTS_DT) FROM dbo.PROJECTS ALLP WHERE ALLP.PROJ_NUM = LASTACTV.PROJ_NUM AND ALLP.STTS = 'A'))
LASTACTV ON CURRSTTS.PROJ_NUM = LASTACTV.PROJ_NUM
WHERE CURRSTTS.STTS_DT = (SELECT MAX(STTS_DT) FROM dbo.PROJECTS ALLP WHERE ALLP.PROJ_NUM = CURRSTTS.PROJ_NUM)
AND CURRSTTS.STTS = 'I'
) SUB ON SUB.PROJ_NUM = P.PROJ_NUM
WHERE P.STTS_DT = (SELECT MIN(STTS_DT) FROM dbo.PROJECTS ALLP WHERE ALLP.PROJ_NUM = P.PROJ_NUM AND ALLP.STTS_DT > SUB.LASTACTV_STTS_DT)
The result I get back matches your desired result.
"Greatest-n-per-group" is the thing to look up when you run across a problem like this again. Here is a query that will get what you need in PostgreSQL.
I realized I changed your column to a boolean, but you will get the gist.
with most_recent_projects as (
select project_number, max(date) date from testtable group by project_number
),
currently_inactive_projects as (
select t.project_number, t.date from testtable t join most_recent_projects mrp on t.project_number = mrp.project_number and t.date = mrp.date where not t.active
),
last_active_date as (
select project_number, date from (
select t.project_number, rank() OVER (
PARTITION BY t.project_number
ORDER BY t.date DESC), t.date
from currently_inactive_projects cip join testtable t on t.project_number = cip.project_number where t.active) t1 where rank = 1
)
-- oldest inactive -- ie, result
select t.project_number, t.active, min(t.date) from last_active_date lad join testtable t on lad.project_number = t.project_number and t.date > lad.date group by t.project_number, t.active;
This is a variation of the "gaps and islands" problem.
The query could look like this:
SELECT
num,
status,
MIN(date) AS date
FROM (
SELECT
*,
MAX(group_id) OVER (PARTITION BY num) AS max_group_id
FROM (
SELECT
*,
SUM(CASE WHEN status = prev_status THEN 0 ELSE 1 END) OVER (PARTITION BY num ORDER BY date) AS group_id
FROM (
SELECT
*,
LAG(status) OVER (PARTITION BY num ORDER BY date) AS prev_status
FROM projects
) groups
) islands
) q
WHERE status = 'I' AND group_id = max_group_id
GROUP BY num, status
ORDER BY num
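To see what the islands query does on the sample data, here is a small sketch that runs essentially the same SQL through Python's built-in sqlite3 module (SQLite 3.25+ assumed for window functions). The dates are rewritten as ISO strings under the assumption that 1/1/20 means day/month/year; the relative order of the rows is the same under the month/day reading. The subquery aliases and the first_inactive column name are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects (num INTEGER, status TEXT, date TEXT);
INSERT INTO projects VALUES
  (1,'A','2020-01-01'),(1,'I','2020-01-03'),(1,'A','2020-01-05'),
  (1,'I','2020-01-07'),(1,'I','2020-01-09'),
  (2,'I','2019-01-01'),(2,'A','2019-01-05'),
  (3,'A','2020-03-01'),(3,'I','2020-03-03'),(3,'I','2020-03-05');
""")

islands_query = """
SELECT num, status, MIN(date) AS first_inactive
FROM (
    SELECT *, MAX(group_id) OVER (PARTITION BY num) AS max_group_id
    FROM (
        -- running count of status changes: each run of equal statuses
        -- (an "island") gets its own group_id
        SELECT *,
               SUM(CASE WHEN status = prev_status THEN 0 ELSE 1 END)
                   OVER (PARTITION BY num ORDER BY date) AS group_id
        FROM (
            SELECT *,
                   LAG(status) OVER (PARTITION BY num ORDER BY date) AS prev_status
            FROM projects
        ) AS flagged
    ) AS islands
) AS q
-- keep only the last island, and only if it is an inactive one
WHERE status = 'I' AND group_id = max_group_id
GROUP BY num, status
ORDER BY num
"""
island_rows = conn.execute(islands_query).fetchall()
print(island_rows)
```

Project 2 drops out because its last island is an active one, and for projects 1 and 3 the MIN picks the first date of the trailing inactive run.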
Another approach using CTEs
WITH last_status AS (
SELECT
*
FROM (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY num ORDER BY date DESC) AS rn
FROM projects
) rns
WHERE rn = 1
),
last_active AS (
SELECT
num,
MAX(date) AS date
FROM projects
WHERE status = 'A'
GROUP BY num
),
last_inactive AS (
SELECT
p.num,
MIN(p.date) AS date
FROM projects p
WHERE p.status = 'I'
AND (
EXISTS (
SELECT 1 FROM last_active la
WHERE la.num = p.num AND la.date < p.date
)
OR NOT EXISTS (
SELECT 1 FROM last_active la
WHERE la.num = p.num
)
)
GROUP BY num
)
SELECT
ls.num,
ls.status,
li.date
FROM last_status ls
JOIN last_inactive li ON li.num = ls.num
WHERE ls.status = 'I'
You can check a working demo with both queries here

Get the earliest record in this group with all the details

I need only the latest record with respect to each REV.NO. There are 2 revisions for REV.NO 2, I need the 26-Feb entry alone. I have to do this to get 100,000 records. Please help.
WITH DocumentAttribute AS ( SELECT * FROM (
SELECT
[ParentDocumentId]
,[AttributeName]
,[AttributeValue]
,DocumentAttribute.[RowValidTo]
FROM [ODS].[asite].[DocumentAttribute] FOR SYSTEM_TIME ALL DocumentAttribute
LEFT JOIN [asite].[Document] FOR SYSTEM_TIME ALL document
ON document.DocumentId = ParentDocumentId
WHERE document.WorkspaceId IN ('1105994') ) AS source_table PIVOT (
MAX([AttributeValue]) For [AttributeName] in
(
[Date Document Due],[Date Document Received],[Document Discipline],[Document Type],[Fabrication Package],[Fabrication Recipient],
[IFF Status],[Incoming Transmittal No.],[Model No.],[Model Revision],[Transmittal No.]
) ) AS PivotTABLE ) SELECT
[DocumentId],
[DocTitle],
[DocRef],
[IssNo],
[RevNo],
LEFT([PublishedDate],11) AS PublishedDate,
[PurposeOfIssue],
[DocStatus],
[Date Document Due],
[Date Document Received] FROM [ODS].[asite].[Document] FOR SYSTEM_TIME ALL Document LEFT JOIN DocumentAttribute
ON Document.DocumentId = DocumentAttribute.[ParentDocumentId]
AND CAST(Document.RowValidTo AS DATE) = CAST(DocumentAttribute.RowValidTo AS DATE) WHERE WorkspaceId IN
('1105994') and DocRef = 'TL601-06MP005' --'KD-CH0202-001-24-1045'
I hope I understand what you mean. It is also a good idea to insert the result into a temp table.
SELECT ceq.documentId,
ceq.RevNo,
ceq.PublishedDate
FROM
(
SELECT c.documentId,
c.RevNo,
c.PublishedDate,
ROW_NUMBER() OVER(PARTITION BY c.RevNo
ORDER BY c.PublishedDate) AS rn
FROM tableC AS c
) AS ceq
WHERE ceq.rn = 1;
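Note that the sort direction inside ROW_NUMBER decides whether rn = 1 is the earliest or the latest row of each group. Below is a minimal sketch of the "latest per RevNo" variant, run through Python's built-in sqlite3 module (SQLite 3.25+ assumed). The table name documents and all of its rows are invented for illustration; they are not the data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (document_id INTEGER, rev_no INTEGER, published_date TEXT);
-- hypothetical rows: revision 2 was published twice
INSERT INTO documents VALUES
  (101, 1, '2021-01-10'),
  (102, 2, '2021-02-20'),
  (103, 2, '2021-02-26'),
  (104, 3, '2021-03-05');
""")

latest_query = """
SELECT document_id, rev_no, published_date
FROM (
    SELECT d.*,
           ROW_NUMBER() OVER (PARTITION BY rev_no
                              ORDER BY published_date DESC) AS rn
    FROM documents AS d
) AS ranked
WHERE rn = 1          -- DESC ordering makes row 1 the newest per rev_no
ORDER BY rev_no
"""
latest_rows = conn.execute(latest_query).fetchall()
print(latest_rows)
```

Dropping the DESC gives the earliest row per group instead, which is what the query above returns.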

How can I split a varchar column to different columns

I have a database table that holds user-specified data for customer orders.
Instead of making a column per custom field, the writer of the software made a three-column system like this:
orderline_ID Field_ID Value
--------------------------------
1 1 50
1 2 today
1 3 green
2 1 80
2 2 next week
2 3 60
I want this data sorted like this:
Orderline_ID 1 2 3
----------------------------------------
1 50 today green
2 80 next week 60
so I can join it in another query I use.
But the code I wrote produced this:
Orderline_ID 1 2 3
-----------------------------------------
1 50 NULL NULL
1 NULL today NULL
1 NULL NULL green
2 80 NULL NULL
2 NULL next week NULL
2 NULL NULL 60
and when I sort by Orderline_ID it results in an error.
The code I used:
SELECT
fldVerkoopOrderRegelID,
(SELECT VOG.fldWaarde
WHERE (VOG.fldVeldNummer = 1) AND (VOG.fldWaarde IS NOT NULL)) AS [aantal vaten],
(SELECT VOG.fldWaarde
WHERE (VOG.fldVeldNummer = 2) AND (VOG.fldWaarde IS NOT NULL)) AS [Vat nett0],
(SELECT VOG.fldWaarde
WHERE (VOG.fldVeldNummer = 3) AND (VOG.fldWaarde IS NOT NULL)) AS [Vat bruto],
(SELECT VOG.fldWaarde
WHERE (VOG.fldVeldNummer = 4) AND (VOG.fldWaarde IS NOT NULL)) AS [cust product code],
(SELECT VOG.fldWaarde
WHERE (VOG.fldVeldNummer = 5) AND (VOG.fldWaarde IS NOT NULL)) AS [extra text],
(SELECT VOG.fldWaarde
WHERE (VOG.fldVeldNummer = 6) AND (VOG.fldWaarde IS NOT NULL)) AS [HS code]
FROM
dbo.tblVerkoopOrderIngaveGegeven AS VOG
WHERE
(fldVerkoopOrderRegelID IS NOT NULL)
This is achievable using left joins.
select t1.orderline_id, t1.Value, t2.Value, t3.Value
from tblVerkoopOrderIngaveGegeven t1
left join tblVerkoopOrderIngaveGegeven t2 on t2.orderline_id = t1.orderline_id and t2.field_id = 2
left join tblVerkoopOrderIngaveGegeven t3 on t3.orderline_id = t1.orderline_id and t3.field_id = 3
where t1.field_id = 1
You can select the distinct order IDs and then left join three derived tables, each supplying one of the columns you need (1, 2, 3).
CREATE TABLE #Orders (
[Orderline_ID] INT,
[Field_ID] INT,
[Value] VARCHAR(MAX)
)
INSERT INTO #Orders SELECT 1, 1, '50'
INSERT INTO #Orders SELECT 1, 2, 'today'
INSERT INTO #Orders SELECT 1, 3, 'green'
INSERT INTO #Orders SELECT 2, 1, '80'
INSERT INTO #Orders SELECT 2, 2, 'next week'
INSERT INTO #Orders SELECT 2, 3, '60'
SELECT
[T].[Orderline_ID],
[T1].[C1],
[T2].[C2],
[T3].[C3]
FROM
(SELECT DISTINCT [Orderline_ID] FROM #Orders ) AS [T]
LEFT JOIN (SELECT [Orderline_ID], [Field_ID], [Value] AS [C1] FROM #Orders) AS [T1] ON ([T].[Orderline_ID] = [T1].[Orderline_ID] AND [T1].[Field_ID] = 1)
LEFT JOIN (SELECT [Orderline_ID], [Field_ID], [Value] AS [C2] FROM #Orders) AS [T2] ON ([T].[Orderline_ID] = [T2].[Orderline_ID] AND [T2].[Field_ID] = 2)
LEFT JOIN (SELECT [Orderline_ID], [Field_ID], [Value] AS [C3] FROM #Orders) AS [T3] ON ([T].[Orderline_ID] = [T3].[Orderline_ID] AND [T3].[Field_ID] = 3)
Using PIVOT is also a way to achieve this.
SELECT orderline_ID,
[1] AS [total barrels],
[2] AS [vat netto],
[3] AS [vat bruto]
FROM
(
SELECT orderline_ID, Field_ID, [Value]
FROM YourSaleOrderInputDataTable
WHERE Field_ID IN (1, 2, 3) -- optional criteria
) AS src
PIVOT
(
MAX([Value])
FOR Field_ID IN ([1], [2], [3])
) AS pvt
ORDER BY orderline_ID;
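If PIVOT is not available (or you just want to check the shape of the result quickly), the same rows-to-columns step can be written as conditional aggregation. The sketch below uses Python's built-in sqlite3 module, so it is SQLite syntax rather than T-SQL; the table name order_values and the output column names field_1 to field_3 are placeholders I made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_values (orderline_id INTEGER, field_id INTEGER, value TEXT);
INSERT INTO order_values VALUES
  (1, 1, '50'), (1, 2, 'today'),     (1, 3, 'green'),
  (2, 1, '80'), (2, 2, 'next week'), (2, 3, '60');
""")

# One output row per orderline: GROUP BY collapses the three input rows,
# and each MAX(CASE ...) keeps only the value of its own field_id.
pivot_query = """
SELECT orderline_id,
       MAX(CASE WHEN field_id = 1 THEN value END) AS field_1,
       MAX(CASE WHEN field_id = 2 THEN value END) AS field_2,
       MAX(CASE WHEN field_id = 3 THEN value END) AS field_3
FROM order_values
GROUP BY orderline_id
ORDER BY orderline_id
"""
pivot_rows = conn.execute(pivot_query).fetchall()
print(pivot_rows)
```

The GROUP BY is the piece missing from the query in the question: without it every source row stays a separate output row, which is why two of the three columns were NULL on each line.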

T-SQL - Copying & Transposing Data

I'm trying to copy data from one table to another, while transposing it and combining it into appropriate rows, with different columns in the second table.
First time posting. Yes this may seem simple to everyone here. I have tried for a couple hours to solve this. I do not have much support internally and have learned a great deal on this forum and managed to get so much accomplished with your other help examples. I appreciate any help with this.
Table 1 has the data in this format.
Type Date Value
--------------------
First 2019 1
First 2020 2
Second 2019 3
Second 2020 4
Table 2 already has the Date rows populated and columns created. It is waiting for the Values from Table 1 to be placed in the appropriate column/row.
Date First Second
------------------
2019 1 3
2020 2 4
For an update, I might use two joins:
update t2
set first = tf.value,
second = ts.value
from table2 t2 left join
table1 tf
on t2.date = tf.date and tf.type = 'First' left join
table1 ts
on t2.date = ts.date and ts.type = 'Second'
where tf.date is not null or ts.date is not null;
Use conditional aggregation:
select date,max(case when type='First' then value end) as First,
max(case when type='Second' then value end) as Second from t
group by date
You can do conditional aggregation:
select date,
max(case when type = 'First' then value end) as First,
max(case when type = 'Second' then value end) as Second
from table1 t
group by date;
After that, you can use a CTE:
with cte as (
select date,
max(case when type = 'First' then value end) as First,
max(case when type = 'Second' then value end) as Second
from table1 t
group by date
)
update t2
set t2.First = t1.First,
t2.Second = t1.Second
from table2 t2 inner join
cte t1
on t1.date = t2.date;
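Here is a small runnable sketch of the transpose-and-copy step, using Python's built-in sqlite3 module. It is SQLite rather than T-SQL, and instead of UPDATE ... FROM with a join it uses correlated subqueries in the SET clause, because older SQLite builds do not support UPDATE ... FROM. The column names First and Second are double-quoted to keep them from being read as keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (type TEXT, date INTEGER, value INTEGER);
CREATE TABLE table2 (date INTEGER, "First" INTEGER, "Second" INTEGER);
INSERT INTO table1 VALUES ('First', 2019, 1), ('First', 2020, 2),
                          ('Second', 2019, 3), ('Second', 2020, 4);
-- table2 starts with only the Date rows populated
INSERT INTO table2 (date) VALUES (2019), (2020);
""")

# For each table2 row, look up the matching value per type in table1.
conn.execute("""
UPDATE table2
SET "First"  = (SELECT MAX(t.value) FROM table1 AS t
                WHERE t.date = table2.date AND t.type = 'First'),
    "Second" = (SELECT MAX(t.value) FROM table1 AS t
                WHERE t.date = table2.date AND t.type = 'Second')
""")

transposed_rows = conn.execute(
    'SELECT date, "First", "Second" FROM table2 ORDER BY date'
).fetchall()
print(transposed_rows)
```

One difference from the join-based updates: this form touches every row of table2 and writes NULL where table1 has no matching row, so add a WHERE clause if existing values must be preserved.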
Seems like you're after a PIVOT
CREATE TABLE #Table1
(
[Type] NVARCHAR(100)
, [Date] INT
, [Value] INT
);
CREATE TABLE #Table2 (
[Date] int
,[First] int
,[Second] int
)
INSERT INTO #Table1 (
[Type]
, [Date]
, [Value]
)
VALUES ( 'First', 2019, 1 )
, ( 'First', 2020, 2 )
, ( 'Second', 2019, 3 )
, ( 'Second', 2020, 4 );
INSERT INTO #Table2 (
[Date]
)
VALUES (2019),(2020)
--Show us what's in the tables
SELECT * FROM #Table1
SELECT * FROM #Table2
--How to pivot the data from Table 1
SELECT * FROM #Table1
PIVOT (
MAX([Value]) --Pivot on this Column
FOR [Type] IN ( [First], [Second] ) --Make column where [Value] is in one of this
) AS [pvt] --Table alias
--which gives
--Date First Second
------------- ----------- -----------
--2019 1 3
--2020 2 4
--Using that we can update #Table2
UPDATE [tbl2]
SET [tbl2].[First] = pvt.[First]
,[tbl2].[Second] = pvt.[Second]
FROM #Table1 tbl1
PIVOT (
MAX([Value]) --Pivot on this Column
FOR [Type] IN ( [First], [Second] ) --Make column where [Value] is in one of this
) AS [pvt] --Table alias
INNER JOIN #Table2 tbl2 ON [tbl2].[Date] = [pvt].[Date]
--Results from #Table 2 after updated
SELECT * FROM #Table2
--which gives
--Date First Second
------------- ----------- -----------
--2019 1 3
--2020 2 4

SQL reporting query

I have a database with following structure.
CREATE TABLE Party
(
PartyID INT IDENTITY
PRIMARY KEY ,
StatusID INT ,
Weigth INT ,
OldWeigth INT
);
GO
CREATE TABLE PartyLocation
(
PartyLocationID INT IDENTITY
PRIMARY KEY ,
PartyID INT FOREIGN KEY REFERENCES dbo.Party ( PartyID ) ,
LocationID INT ,
Distance INT
);
GO
CREATE TABLE PartyRole
(
PartyRoleID INT IDENTITY
PRIMARY KEY ,
PartyID INT FOREIGN KEY REFERENCES dbo.Party ( PartyID ) ,
RoleID INT
);
with some simple data.
INSERT INTO dbo.Party
( StatusID, Weigth, OldWeigth )
VALUES ( 1, -- StatusID - int
10, -- Weigth - int
20 -- OldWeigth - int
),
( 1, 15, 25 ),
( 2, 20, 30 );
INSERT INTO dbo.PartyLocation
( PartyID, LocationID, Distance )
VALUES ( 1, -- PartyID - int
1, -- LocationID - int
100 -- Distance - int
),
( 1, 2, 200 ),
( 1, 3, 300 ),
( 2, 1, 1000 ),
( 2, 2, 2000 ),
( 3, 1, 10000 );
INSERT INTO dbo.PartyRole
( PartyID, RoleID )
VALUES ( 1, -- PartyID - int
1 -- RoleID - int
),
( 1, 2 ),
( 1, 3 ),
( 2, 1 ),
( 2, 2 ),
( 3, 1 );
I want to query the following information
Return the sum of Weigth of all parties that have roleID = 1 in the PartyRole table
Return the sum of OldWeigth of all parties that have statusID = 2
Return the sum of distances of all parties that have locationID = 3
Return the sum of distances of all parties that have roleID = 2
So the expected results are
FilteredWeigth FilteredOldWeigth FilteredDistance AnotherFilteredDistance
-------------- ----------------- ---------------- -----------------------
45 30 600 3600
Can we write a query that will query each table just once? If not, what would be the optimal way to query the data?
You can try this.
SELECT
FilteredWeigth = SUM(CASE WHEN RoleID = 1 AND RN_P = 1 THEN Weigth END) ,
FilteredOldWeigth = SUM(CASE WHEN StatusID = 2 AND RN_P = 1 THEN OldWeigth END),
FilteredDistance = SUM(CASE WHEN LocationID = 3 AND RN_L = 1 THEN Distance END),
AnotherFilteredDistance = SUM(CASE WHEN RoleID = 2 THEN Distance END)
FROM (
SELECT P.Weigth, P.StatusID, P.OldWeigth, PL.LocationID, PL.Distance, PR.RoleID,
RN_P = ROW_NUMBER() OVER (PARTITION BY P.PartyID ORDER BY PL.PartyLocationID),
RN_L = ROW_NUMBER() OVER (PARTITION BY PL.LocationID ORDER BY PR.PartyRoleID)
FROM Party P
INNER JOIN PartyLocation PL ON P.PartyID = PL.PartyID
INNER JOIN PartyRole PR ON P.PartyID = PR.PartyID
) AS T
The query below gives
45 20 300 3600
The third column is 300, which does not correspond to your expected result.
with q1
as
(
select sum(weigth) FilteredWeigth
from party join partyrole on party.partyid = partyrole.partyid
where partyrole.RoleID = '1'
),
q2 as
(
select sum(weigth) OldWeigth from party where StatusID = '2'
),
q3 as (
select sum(Distance) FilteredDistance
from party join PartyLocation on party.partyid = PartyLocation.partyid
where PartyLocation.locationID = '3'
),
q4 as
(
select sum(Distance) AnotherFilteredDistance
from party join partyrole on party.partyid = partyrole.partyid
join PartyLocation on party.partyid = PartyLocation.partyid
where partyrole.RoleID = '2'
)
select FilteredWeigth,OldWeigth,FilteredDistance,AnotherFilteredDistance
from q1,q2,q3,q4
When using individual queries, you can achieve this with the following:
Return sum of Weight of all parties that has roleID = 1 in PartyRole table
SELECT
SUM(Weigth) FilteredWeigth
FROM dbo.Party P
WHERE EXISTS
(
SELECT
1
FROM dbo.PartyRole PR
WHERE PR. PartyID = P.PartyID
AND PR.RoleId = 1
)
Return sum of OldWeigth of all parties that has statusID = 2
SELECT
SUM(OldWeigth) FilteredOldWeigth
FROM dbo.Party P
WHERE P.StatusID = 2
Return sum of distances of all parties that has locationID = 3
SELECT
SUM(Distance) FilteredDistance
FROM dbo.PartyLocation
WHERE LocationID = 3
Return sum of distances of all parties that has roleID = 2
SELECT SUM(Distance) FROM PartyLocation PL
WHERE EXISTS
(
SELECT 1 FROM PartyRole PR
WHERE PR.PartyID = PL.PartyID
AND PR.Roleid = 2
)
If you want all of these in a single result set, you can try a pivot query like this:
WITH CTE
AS
(
SELECT
'FilteredWeigth' ColNm,
SUM(Weigth) Val
FROM dbo.Party P
WHERE EXISTS
(
SELECT
1
FROM dbo.PartyRole PR
WHERE PR. PartyID = P.PartyID
AND PR.RoleId = 1
)
UNION
SELECT
'FilteredOldWeigth' ColNm,
SUM(OldWeigth) Val
FROM dbo.Party P
WHERE P.StatusID = 2
UNION
SELECT
'FilteredDistance' ColNm,
SUM(Distance) Val
FROM dbo.PartyLocation
WHERE LocationID = 3
UNION
SELECT
'AnotherFilteredDistance' ColNm,
SUM(Distance) Val FROM PartyLocation PL
WHERE EXISTS
(
SELECT 1 FROM PartyRole PR
WHERE PR.PartyID = PL.PartyID
AND PR.Roleid = 2
)
)
SELECT
*
FROM CTE
PIVOT
(
SUM(Val)
FOR ColNm IN
(
[FilteredWeigth],[FilteredOldWeigth],[FilteredDistance],[AnotherFilteredDistance]
)
)Pvt
The Result Will be
I could think of only three possible options:
1. A union query with four different select statements, as answered by @ab-bennett
2. Join all tables, then use select statements, as answered by sarslan
3. A mix of 1 and 2, based on experiments
Coming to the question you asked:
Can we write a query that will query each table just once?
Assuming best performance is the goal, following could happen in each of the above cases:
1. All select statements would have their own where clause. This would perform best when the where clause produces few rows compared to count(*). Note that joins can be expensive on very large tables.
2. A join is made once, and the desired output is obtained from the same joined table. This would perform best when the where clause produces a significant number of rows and the tables are not too big to join.
3. You can mix JOIN / IN / EXISTS / WHERE to optimize your queries based on the number of rows in each table. This approach could be used when your dataset cardinality does not vary much.
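As a concrete reference point for the first option, here is a sketch that computes all four figures as independent scalar subqueries in one SELECT, run through Python's built-in sqlite3 module (SQLite syntax, not T-SQL). It does not query each table only once; it is just a way to pin down the expected numbers before trying to optimize. It assumes FilteredDistance means "all distances of parties that have a row with LocationID = 3", which is the reading that yields the 600 in the question. PartyIDs are inserted explicitly instead of relying on IDENTITY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Party (PartyID INTEGER PRIMARY KEY, StatusID INTEGER,
                    Weigth INTEGER, OldWeigth INTEGER);
CREATE TABLE PartyLocation (PartyLocationID INTEGER PRIMARY KEY, PartyID INTEGER,
                            LocationID INTEGER, Distance INTEGER);
CREATE TABLE PartyRole (PartyRoleID INTEGER PRIMARY KEY, PartyID INTEGER,
                        RoleID INTEGER);
INSERT INTO Party VALUES (1, 1, 10, 20), (2, 1, 15, 25), (3, 2, 20, 30);
INSERT INTO PartyLocation (PartyID, LocationID, Distance) VALUES
  (1, 1, 100), (1, 2, 200), (1, 3, 300), (2, 1, 1000), (2, 2, 2000), (3, 1, 10000);
INSERT INTO PartyRole (PartyID, RoleID) VALUES
  (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 1);
""")

# Each figure is its own subquery; EXISTS avoids the row multiplication
# that a three-way join of Party, PartyLocation and PartyRole would cause.
report_query = """
SELECT
  (SELECT SUM(p.Weigth) FROM Party AS p
    WHERE EXISTS (SELECT 1 FROM PartyRole AS pr
                  WHERE pr.PartyID = p.PartyID AND pr.RoleID = 1)) AS FilteredWeigth,
  (SELECT SUM(p.OldWeigth) FROM Party AS p
    WHERE p.StatusID = 2) AS FilteredOldWeigth,
  (SELECT SUM(pl.Distance) FROM PartyLocation AS pl
    WHERE EXISTS (SELECT 1 FROM PartyLocation AS x
                  WHERE x.PartyID = pl.PartyID AND x.LocationID = 3)) AS FilteredDistance,
  (SELECT SUM(pl.Distance) FROM PartyLocation AS pl
    WHERE EXISTS (SELECT 1 FROM PartyRole AS pr
                  WHERE pr.PartyID = pl.PartyID AND pr.RoleID = 2)) AS AnotherFilteredDistance
"""
report_row = conn.execute(report_query).fetchone()
print(report_row)
```

On the sample data this reproduces the expected row from the question, so it can serve as a baseline to compare a single-pass join version against.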