I have a question about creating a running total on an existing query in SQL Server 2008 R2.
Basically, I have a query that combines data from 3 separate tables and groups it to produce a single entry for each combination of ACCOUNT, PROFITCENTRE, TIMEID and DIVISION.
However, I need to alter the values in SIGNEDDATA, such that each entry is the total of all previous months.
e.g.
TIMEID 20120300 (March 2012) would contain -35143.0000000000
TIMEID 20120400 (April 2012) should then contain -36000.0000000000 (March plus the -857 for April)
TIMEID 20120500 (May 2012) should be -36857.0000000000, etc.
What I currently get instead is the sum of SIGNEDDATA for each month, grouped by ACCOUNT, PROFITCENTRE, DIVISION, etc., but not added to previous months.
How can I do this? I have tried removing TIMEID from the GROUP BY clause, but I just get the usual error that the column cannot be selected because it is not contained in either an aggregate function or the GROUP BY clause.
My selection is as follows:
Insert INTO Data_Services.dbo.[NEW_TABLE]
Select
[Account],
[Category],
[DataSrc],
[ProfitCentre],
'MY_FLOW' as [Flow],
[RptCurrency],
[TimeID],
sum(round([SignedData],2,0)) as SignedData,
[Division],
@CurrentTime as CTimeID
FROM
(select
T1.[Account],
T1.[Category],
T1.[DataSrc],
T1.[ProfitCentre],
T1.[Flow],
T1.[RptCurrency],
T1.[TimeID],
sum(round(T1.[SignedData],2,0)) as SignedData,
mbrPC.[Division]
from
MY_DATABASE.dbo.TABLE1 T1
join
MY_DATABASE.dbo.mbrProfitCentre mbrPC on T1.ProfitCentre = mbrPC.[ID]
where
T1.Category = 'WForecast'
and T1.DataSrc in (SELECT [ID] from DSParent)
and T1.Flow = 'MOVEMENT'
and T1.Account in
(SELECT [ID] from AccIEParent)
and TimeID in
(select [TimeID] from MY_DATABASE.dbo.mbrTime
where [Period_Start] <> ''
and [Period_Start] is not null
and convert(date,[Period_Start],101) <= '2013-01-24'
and [CURRYEAR] = 'Y')
group by T1.[Account],
T1.[Category],
T1.[DataSrc],
T1.[ProfitCentre],
T1.[Flow],
T1.[RptCurrency],
T1.[TimeID],
mbrPC.[Division]
UNION
select
T2.[Account],
T2.[Category],
T2.[DataSrc],
T2.[ProfitCentre],
T2.[Flow],
T2.[RptCurrency],
T2.[TimeID],
sum(round(T2.[SignedData],2,0)) as SignedData,
mbrPC.[Division]
from MY_DATABASE.dbo.TABLE2 T2
join MY_DATABASE.dbo.mbrProfitCentre mbrPC
on T2.ProfitCentre = mbrPC.[ID]
where T2.Category = 'WForecast'
and T2.DataSrc in
(SELECT [ID] from DSParent)
and T2.Flow = 'MOVEMENT'
and T2.Account in
(SELECT [ID] from AccIEParent)
and TimeID in
(select [TimeID] from MY_DATABASE.dbo.mbrTime
where [Period_Start] <> ''
and [Period_Start] is not null
and convert(date,[Period_Start],101) <= '2013-01-24'
and [CURRYEAR] = 'Y')
group by T2.[Account],
T2.[Category],
T2.[DataSrc],
T2.[ProfitCentre],
T2.[Flow],
T2.[RptCurrency],
T2.[TimeID],
mbrPC.[Division]
UNION
select
T3.[Account],
T3.[Category],
T3.[DataSrc],
T3.[ProfitCentre],
T3.[Flow],
T3.[RptCurrency],
T3.[TimeID],
sum(round(T3.[SignedData],2,0)) as SignedData,
mbrPC.[Division]
from MY_DATABASE.dbo.TABLE3 T3
join MY_DATABASE.dbo.mbrProfitCentre mbrPC
on T3.ProfitCentre = mbrPC.[ID]
where T3.Category = 'WForecast'
and T3.DataSrc in
(SELECT [ID] from DSParent)
and T3.Flow = 'MOVEMENT'
and T3.Account in
(SELECT [ID] from AccIEParent)
and TimeID in
(select [TimeID] from MY_DATABASE.dbo.mbrTime
where [Period_Start] <> ''
and [Period_Start] is not null
and convert(date,[Period_Start],101) <= '2013-01-24'
and [CURRYEAR] = 'Y')
group by T3.[Account],
T3.[Category],
T3.[DataSrc],
T3.[ProfitCentre],
T3.[Flow],
T3.[RptCurrency],
T3.[TimeID],
mbrPC.[Division]
) a
group by [Account],
[Category],
[DataSrc],
[ProfitCentre],
[Flow],
[RptCurrency],
[TimeID],
[Division]
And the resulting data-set is:
What you're trying to do is slightly more complex than the examples in the question linked below, but the answers there show various methods of calculating a cumulative sum.
how to get cumulative sum
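As a minimal sketch of one such method that works on SQL Server 2008 R2 (where SUM(...) OVER (ORDER BY ...) is not yet available), a CROSS APPLY can sum every row up to and including the current TimeID within the same group. The table variable @Monthly and its sample rows are hypothetical stand-ins for the already grouped monthly result; only the March, April and May figures come from the question.

```sql
-- Hypothetical stand-in for the grouped monthly result of the original query
DECLARE @Monthly TABLE
(
    Account      VARCHAR(20),
    ProfitCentre VARCHAR(20),
    Division     VARCHAR(20),
    TimeID       INT,
    SignedData   DECIMAL(25, 10)
);

INSERT INTO @Monthly (Account, ProfitCentre, Division, TimeID, SignedData)
VALUES ('ACC1', 'PC1', 'DIV1', 20120300, -35143),
       ('ACC1', 'PC1', 'DIV1', 20120400, -857),
       ('ACC1', 'PC1', 'DIV1', 20120500, -857);

-- For each row, add up all rows of the same group with an earlier or equal TimeID
SELECT m.Account,
       m.ProfitCentre,
       m.Division,
       m.TimeID,
       rt.RunningTotal
FROM @Monthly AS m
CROSS APPLY
(
    SELECT SUM(p.SignedData) AS RunningTotal
    FROM @Monthly AS p
    WHERE p.Account      = m.Account
      AND p.ProfitCentre = m.ProfitCentre
      AND p.Division     = m.Division
      AND p.TimeID      <= m.TimeID
) AS rt
ORDER BY m.Account, m.ProfitCentre, m.Division, m.TimeID;
```

With the sample rows this returns -35143, -36000 and -36857 for March, April and May. In the real query, the grouped SELECT could be placed in a CTE or temp table and used in place of @Monthly, with the remaining grouping columns added to the WHERE clause of the inner query.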
I have a statement that I believe should work. However, it feels pretty suboptimal and I can't for the life of me figure out how to optimise it.
I have the following tables:
Transactions
[Id] is PRIMARY KEY IDENTITY
[Hash] has a UNIQUE constraint
[BlockNumber] has an Index
Transfers
[Id] is PRIMARY KEY IDENTITY
[TransactionId] is a Foreign Key referencing [Transactions].[Id]
TokenPrices
[Id] is PRIMARY KEY IDENTITY
TokenPriceAttempts
[Id] is PRIMARY KEY IDENTITY
[TransferId] is a Foreign Key referencing [Transfers].[Id]
What I want to do, is select all the transfers, with a few bits of data from their related transaction (one transaction to many transfers), where I don't currently have a price stored in TokenPrices related to that transfer.
In the first part of the query, I am getting a list of all the transfers, and calculating the DIFF between the nearest found token price. If one isn't found, this is null (what I ultimately want to select). I am allowing 3 hours either side of the transaction timestamp - if nothing is found within that timespan, it will be null.
Secondly, I am selecting from this set, ensuring first that diff is null, as this means the price is missing, and finally that TokenPriceAttempts either has no entry for an attempt to get a price or, if it does, that it has fewer than 5 attempts listed and the last attempt was more than a week ago.
The way I have laid this out results in essentially 3 of the same / similar SELECT statements within the WHERE clause, which feels hugely suboptimal...
How could I improve this approach?
WITH [transferDateDiff] AS
(
SELECT
[t1].[Id],
[t1].[TransactionId],
[t1].[From],
[t1].[To],
[t1].[Value],
[t1].[Type],
[t1].[ContractAddress],
[t1].[TokenId],
[t2].[Hash],
[t2].[Timestamp],
ABS(DATEDIFF(SECOND, [tp].[Timestamp], [t2].[Timestamp])) AS diff
FROM
[dbo].[Transfers] AS [t1]
LEFT JOIN
[dbo].[Transactions] AS [t2]
ON [t1].[TransactionId] = [t2].[Id]
LEFT JOIN
(
SELECT
*
FROM
[dbo].[TokenPrices]
)
AS [tp]
ON [tp].[ContractAddress] = [t1].[ContractAddress]
AND [tp].[Timestamp] >= DATEADD(HOUR, - 3, [t2].[Timestamp])
AND [tp].[Timestamp] <= DATEADD(HOUR, 3, [t2].[Timestamp])
WHERE
[t1].[Type] < 2
)
SELECT
[tdd].[Id],
[tdd].[TransactionId],
[tdd].[From],
[tdd].[To],
[tdd].[Value],
[tdd].[Type],
[tdd].[ContractAddress],
[tdd].[TokenId],
[tdd].[Hash],
[tdd].[Timestamp]
FROM
[transferDateDiff] AS tdd
WHERE
[tdd].[diff] IS NULL AND
(
(
SELECT
COUNT(*)
FROM
[dbo].[TokenPriceAttempts] tpa
WHERE
[tpa].[TransferId] = [tdd].[Id]
)
= 0 OR
(
(
SELECT
COUNT(*)
FROM
[dbo].[TokenPriceAttempts] tpa
WHERE
[tpa].[TransferId] = [tdd].[Id]
)
< 5 AND
(
DATEDIFF(DAY,
(
SELECT
MAX([tpa].[Created])
FROM
[dbo].[TokenPriceAttempts] tpa
WHERE
[tpa].[TransferId] = [tdd].[Id]
),
CURRENT_TIMESTAMP
) >= 7
)
)
)
Here is an attempt to help simplify. I stripped out all the [brackets] that are not really required unless you are running into something like a reserved keyword, or columns with spaces in their name (bad to begin with).
Anyhow, your main query had 3 instances of a select per Id. To eliminate that, I did a LEFT JOIN to a subquery that pulls all transfers of Type < 2 and joins to the price attempts once. This way, the COUNT(*) and MAX(Created) are pre-aggregated once, on the same basis of transfers as your WITH CTE declaration. So you don't have to keep running the 3 queries for each row, and you don't have to query the entire table of all transfers, just those matching the same underlying Type < 2 condition. The resulting subquery has the alias "PQ" (pre-query).
This simplifies the outer WHERE clause by removing the redundant counts per Id.
WITH transferDateDiff AS
(
SELECT
t1.Id,
t1.TransactionId,
t1.[From], -- FROM and TO are reserved keywords, so these two keep their brackets
t1.[To],
t1.Value,
t1.Type,
t1.ContractAddress,
t1.TokenId,
t2.Hash,
t2.Timestamp,
ABS( DATEDIFF( SECOND, tp.Timestamp, t2.Timestamp )) AS diff
FROM
dbo.Transfers t1
LEFT JOIN dbo.Transactions t2
ON t1.TransactionId = t2.Id
LEFT JOIN dbo.TokenPrices tp
ON t1.ContractAddress = tp.ContractAddress
AND tp.Timestamp >= DATEADD(HOUR, - 3, t2.Timestamp)
AND tp.Timestamp <= DATEADD(HOUR, 3, t2.Timestamp)
WHERE
t1.Type < 2
)
SELECT
tdd.Id,
tdd.TransactionId,
tdd.[From],
tdd.[To],
tdd.Value,
tdd.Type,
tdd.ContractAddress,
tdd.TokenId,
tdd.Hash,
tdd.Timestamp
FROM
transferDateDiff tdd
LEFT JOIN
( SELECT
t1.Id,
COUNT(*) Attempts,
MAX(tpa.Created) MaxCreated
FROM
dbo.Transfers t1
JOIN dbo.TokenPriceAttempts tpa
on t1.Id = tpa.TransferId
WHERE
t1.Type < 2
GROUP BY
t1.Id ) PQ
on tdd.Id = PQ.Id
WHERE
tdd.diff IS NULL
AND ( PQ.Attempts IS NULL
OR PQ.Attempts = 0
OR ( PQ.Attempts < 5
AND DATEDIFF(DAY, PQ.MaxCreated, CURRENT_TIMESTAMP ) >= 7
)
)
REVISED to remove the WITH CTE into a single query
SELECT
t1.Id,
t1.TransactionId,
t1.[From], -- FROM and TO are reserved keywords, so these two keep their brackets
t1.[To],
t1.Value,
t1.Type,
t1.ContractAddress,
t1.TokenId,
t2.Hash,
t2.Timestamp
FROM
-- Now, this pre-query is left-joined to token price attempts
-- so ALL Transfers of type < 2 are considered
( SELECT
t1.Id,
-- COUNT(tpa.Id) ignores the NULL row a LEFT JOIN produces, so "no attempts" yields 0
COUNT(tpa.Id) Attempts,
MAX(tpa.Created) MaxCreated
FROM
dbo.Transfers t1
LEFT JOIN dbo.TokenPriceAttempts tpa
on t1.Id = tpa.TransferId
WHERE
t1.Type < 2
GROUP BY
t1.Id ) PQ
-- Now, we can just directly join to transfers for the rest
JOIN dbo.Transfers t1
on PQ.Id = t1.Id
-- and the rest from the WITH CTE construct
LEFT JOIN dbo.Transactions t2
ON t1.TransactionId = t2.Id
LEFT JOIN dbo.TokenPrices tp
ON t1.ContractAddress = tp.ContractAddress
AND tp.Timestamp >= DATEADD(HOUR, - 3, t2.Timestamp)
AND tp.Timestamp <= DATEADD(HOUR, 3, t2.Timestamp)
WHERE
ABS( DATEDIFF( SECOND, tp.Timestamp, t2.Timestamp )) IS NULL
AND ( PQ.Attempts = 0
OR ( PQ.Attempts < 5
AND DATEDIFF(DAY, PQ.MaxCreated, CURRENT_TIMESTAMP ) >= 7 )
)
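A note on counting across a LEFT JOIN, as the pre-query above does: COUNT(*) counts rows, including the NULL-extended row produced for a transfer with no attempts, whereas COUNT(column) counts only non-NULL values. The sketch below illustrates the difference; the table variables @Transfers and @Attempts are hypothetical, cut-down stand-ins for the real tables.

```sql
-- Hypothetical, cut-down versions of Transfers and TokenPriceAttempts
DECLARE @Transfers TABLE (Id INT);
DECLARE @Attempts  TABLE (Id INT, TransferId INT);

INSERT INTO @Transfers (Id) VALUES (1), (2);
-- Transfer 1 has two attempts, transfer 2 has none
INSERT INTO @Attempts (Id, TransferId) VALUES (10, 1), (11, 1);

SELECT t.Id,
       COUNT(*)    AS CountStar,     -- counts the NULL-extended row too
       COUNT(a.Id) AS CountAttempts  -- counts only rows that really matched
FROM @Transfers AS t
LEFT JOIN @Attempts AS a
       ON a.TransferId = t.Id
GROUP BY t.Id
ORDER BY t.Id;
```

For transfer 2 this gives CountStar = 1 but CountAttempts = 0, so a filter such as Attempts = 0 only matches when the count is taken on a column from the right-hand table.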
I don't understand why you are doing this:
LEFT JOIN
(
SELECT
*
FROM
[dbo].[TokenPrices]
)
AS [tp]
ON [tp].[ContractAddress] = [t1].[ContractAddress]
AND [tp].[Timestamp] >= DATEADD(HOUR, - 3, [t2].[Timestamp])
AND [tp].[Timestamp] <= DATEADD(HOUR, 3, [t2].[Timestamp])
Isn't this a
LEFT JOIN [dbo].[TokenPrices] as TP ...
This:
SELECT
COUNT(*)
FROM
[dbo].[TokenPriceAttempts] tpa
WHERE
[tpa].[TransferId] = [tdd].[Id]
could be another CTE instead of a subquery.
In fact, any of your subqueries could be CTEs; part of the point of a CTE is to make things easier to read.
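,TPA
AS
(
-- A CTE cannot reference [tdd] from the outer query, so aggregate per TransferId
-- here and join TPA to [tdd] on [TransferId] = [tdd].[Id] in the final SELECT
SELECT [tpa].[TransferId], COUNT(*) AS Attempts, MAX([tpa].[Created]) AS MaxCreated
FROM [dbo].[TokenPriceAttempts] tpa
GROUP BY [tpa].[TransferId]
)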
When I execute the following script with hive:
select
a.keyno
from
(
select
keyno,
reportyear
from hive_ldtmp.tmp_kzz_company_report_people_count_grow_info
where yeartype=1
and keyno='00003d22be771b36f27a7be24431e407'
and reportyear=2019
) a
left join
(
select
keyno,
reportyear
from hive_ldtmp.tmp_kzz_company_report_people_count_grow_info
where yeartype=2
and keyno='00003d22be771b36f27a7be24431e407'
and reportyear=2019
) b on a.keyno=b.keyno
and a.reportyear=b.reportyear;
The result I get is empty:
0 results.
However, I am sure that when I execute the two queries separately, they have results:
-- a
select
keyno,
reportyear
from hive_ldtmp.tmp_kzz_company_report_people_count_grow_info
where yeartype=1
and keyno='00003d22be771b36f27a7be24431e407'
and reportyear=2019;
-- b
select
keyno,
reportyear
from hive_ldtmp.tmp_kzz_company_report_people_count_grow_info
where yeartype=2
and keyno='00003d22be771b36f27a7be24431e407'
and reportyear=2019;
This is the result (the results of a and b are the same):
keyno                             year
--------------------------------------
00003d22be771b36f27a7be24431e407  2019
Then I changed the storage format of the table from ORC to Textfile,
executed the entire query again, and it returned the result I wanted.
So where is the problem?
In the first query this filter is different where yeartype=2. In the query b which you executed separately it is where yeartype=1
BTW you can eliminate joining with the same table. Use aggregation and filtering, like in this query:
select keyno
from
(
select keyno,
max(case when yeartype = 1 then keyno else null end) as yr1_keyno, --split on yeartype, not reportyear
max(case when yeartype = 2 then keyno else null end) as yr2_keyno,
reportyear
from hive_ldtmp.tmp_kzz_company_report_people_count_grow_info
where yeartype in ( 1, 2)
and keyno='00003d22be771b36f27a7be24431e407'
and reportyear=2019
group by keyno, reportyear
)s where yr1_keyno=yr2_keyno --the same as your INNER JOIN (if join does not duplicate rows)
I'm trying to copy data from one table to another, while transposing it and combining it into appropriate rows, with different columns in the second table.
First time posting. Yes this may seem simple to everyone here. I have tried for a couple hours to solve this. I do not have much support internally and have learned a great deal on this forum and managed to get so much accomplished with your other help examples. I appreciate any help with this.
Table 1 has the data in this format.
Type Date Value
--------------------
First 2019 1
First 2020 2
Second 2019 3
Second 2020 4
Table 2 already has the Date rows populated and columns created. It is waiting for the Values from Table 1 to be placed in the appropriate column/row.
Date First Second
------------------
2019 1 3
2020 2 4
For an update, I might use two joins:
update t2
set first = tf.value,
second = ts.value
from table2 t2 left join
table1 tf
on t2.date = tf.date and tf.type = 'First' left join
table1 ts
on t2.date = ts.date and ts.type = 'Second'
where tf.date is not null or ts.date is not null;
Use conditional aggregation:
select date,max(case when type='First' then value end) as First,
max(case when type='Second' then value end) as Second from t
group by date
You can do conditional aggregation :
select date,
max(case when type = 'First' then value end) as first,
max(case when type = 'Second' then value end) as Second
from table1 t
group by date;
After that you can use cte :
with cte as (
select date,
max(case when type = 'First' then value end) as first,
max(case when type = 'Second' then value end) as Second
from table1 t
group by date
)
update t2
set t2.First = t1.First,
t2.Second = t1.Second
from table2 t2 inner join
cte t1
on t1.date = t2.date;
Seems like you're after a PIVOT
DECLARE @Table1 TABLE
(
[Type] NVARCHAR(100)
, [Date] INT
, [Value] INT
);
DECLARE @Table2 TABLE(
[Date] int
,[First] int
,[Second] int
)
INSERT INTO @Table1 (
[Type]
, [Date]
, [Value]
)
VALUES ( 'First', 2019, 1 )
, ( 'First', 2020, 2 )
, ( 'Second', 2019, 3 )
, ( 'Second', 2020, 4 );
INSERT INTO @Table2 (
[Date]
)
VALUES (2019),(2020)
--Show us what's in the tables
SELECT * FROM @Table1
SELECT * FROM @Table2
--How to pivot the data from Table 1
SELECT * FROM @Table1
PIVOT (
MAX([Value]) --Aggregate this column
FOR [Type] IN ( [First], [Second] ) --Make one column for each of these [Type] values
) AS [pvt] --Table alias
--which gives
--Date First Second
------------- ----------- -----------
--2019 1 3
--2020 2 4
--Using that we can update @Table2
UPDATE [tbl2]
SET [tbl2].[First] = pvt.[First]
,[tbl2].[Second] = pvt.[Second]
FROM @Table1 tbl1
PIVOT (
MAX([Value]) --Aggregate this column
FOR [Type] IN ( [First], [Second] ) --Make one column for each of these [Type] values
) AS [pvt] --Table alias
INNER JOIN @Table2 tbl2 ON [tbl2].[Date] = [pvt].[Date]
--Results from @Table2 after the update
SELECT * FROM @Table2
--which gives
--Date First Second
------------- ----------- -----------
--2019 1 3
--2020 2 4
I have the following query
SELECT
A.IdDepartment,
A.IdParent,
A.Localidad,
A.Codigo,
A.Nombre,
A.Departamento,
A.Fecha,
A.[Registro Entrada],
A.[Registro Salida],
CASE
WHEN (SELECT IdUser FROM Exception WHERE IdUser = A.Codigo) <> ''
THEN(SELECT Description FROM Exception WHERE IdUser = A.Codigo AND A.Fecha BETWEEN BeginingDate AND EndingDate)
ELSE ('Ausente')
END AS Novedades
FROM VW_HORARIOS A
WHERE A.[Registro Entrada] = A.[Registro Salida]
GROUP BY A.IdDepartment,A.IdParent, A.Localidad, A.Codigo, A.Nombre, A.Departamento, A.Fecha, A.[Registro Entrada],A.[Registro Salida]
ORDER BY A.Fecha
The query returns the records selected above. What I want to add is the following: if there was no record on a given date, I want to create a row for that date, but I do not know how to produce a record that does not exist in the data. If someone can help me, I would appreciate it.
You can try something like this. Just fill your own date table with the values that fall within your range of dates.
Remember to verify the last join. I don't know whether those columns form the unique business key in your data sample.
SQL Test Code
declare @DateTable table (Dates date)
insert into @DateTable
values
('2017-01-01'),
('2017-01-02'),
('2017-01-03'),
('2017-01-04'),
('2017-01-05'),
('2017-01-06'),
('2017-01-07'),
('2017-01-08'),
('2017-01-09'),
('2017-01-10')
declare @SampleTable table (DateStamp date,Department nvarchar(50),LocationId nvarchar(50),Code int,name nvarchar(50),Entrada nvarchar(50))
insert into @SampleTable
values
('2017-01-01','BOTELLO','SANTO',5540,'JOSE','Something'),
('2017-01-04','BOTELLO','SANTO',5540,'JOSE','Something'),
('2017-01-06','BOTELLO','SANTO',5540,'JOSE','Something'),
('2017-01-09','BOTELLO','SANTO',5540,'JOSE','Something')
select z.Department,z.LocationId,z.Code,z.name,z.Dates,COALESCE(a.Entrada,'EMPTY') as Entrada from (
Select Department,LocationId,Code,Name,Dates from (
select Department,LocationId,Code,Name,MIN(DateStamp) mind, MAX(Datestamp) maxd from @SampleTable
group by Department,LocationId,Code,Name
)x
CROSS JOIN @DateTable b
where b.Dates between x.mind and x.maxd
) z
left join @SampleTable a on a.Department = z.Department and a.LocationId = z.LocationId and a.Code = z.Code and a.name = z.name
and a.DateStamp = z.Dates
Result
You can use a recursive query building all dates from the minimum date to the maximum date found in your table.
with dates(fecha, maxfecha) as
(
select min(fecha) as fecha, max(fecha) as maxfecha from vw_horarios
union all
select dateadd(dd, 1, fecha) as fecha, maxfecha from dates where fecha < maxfecha
)
select d.fecha, q.*
from dates d
left join ( your query here ) q on q.fecha = d.fecha;
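One caveat with the recursive approach: SQL Server stops a recursive CTE after 100 levels by default, so a date range longer than about 100 days raises an error unless the limit is lifted with OPTION (MAXRECURSION n). Here is a small self-contained sketch of the same idea; the table variable @Horarios and its rows are hypothetical stand-ins for vw_horarios.

```sql
-- Hypothetical stand-in for vw_horarios with gaps between the dates
DECLARE @Horarios TABLE (Fecha DATE, Codigo INT);

INSERT INTO @Horarios (Fecha, Codigo)
VALUES ('2017-01-01', 5540),
       ('2017-01-04', 5540);

WITH dates (fecha, maxfecha) AS
(
    -- Anchor: first and last date present in the data
    SELECT MIN(Fecha), MAX(Fecha) FROM @Horarios
    UNION ALL
    -- Recursive step: add one day until the last date is reached
    SELECT DATEADD(dd, 1, fecha), maxfecha
    FROM dates
    WHERE fecha < maxfecha
)
SELECT d.fecha, h.Codigo
FROM dates AS d
LEFT JOIN @Horarios AS h
       ON h.Fecha = d.fecha
ORDER BY d.fecha
OPTION (MAXRECURSION 0); -- 0 removes the default limit of 100 recursion levels
```

This returns one row per day from 2017-01-01 to 2017-01-04, with Codigo NULL on the two days that have no record. Those NULL rows are the "missing" dates that can then be labelled, for example with COALESCE.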
I have following table:
Card(
MembershipNumber,
EmbossLine,
status,
EmbossName
)
with sample data
(0009,0321,'E0','Finn')
(0009,0322,'E1','Finn')
(0004,0356,'E0','Mary')
(0004,0398,'E0','Mary')
(0004,0382,'E1','Mary')
I want to retrieve rows such that only those rows should appear that have count of MembershipNumber > 1 AND count of status='E0' > 1.
For example, the query should return the following result:
(0004,0356,'E0','Mary')
(0004,0398,'E0','Mary')
I have a query that filters by the MembershipNumber count, but I can't figure out how to filter by status='E0' as well. Here's the query so far:
SELECT *
FROM (SELECT *,
Count(MembershipNumber)OVER(partition BY EmbossName) AS cnt
FROM card) A
WHERE cnt > 1
You can just add WHERE status = 'E0' inside your subquery:
SQL Fiddle (credit to Raging Bull for the fiddle)
SELECT *
FROM (
SELECT *,
COUNT(MembershipNumber) OVER(PARTITION BY EmbossName) AS cnt
FROM card
WHERE status = 'E0'
)A
WHERE cnt > 1
You can do it this way:
select t1.*
from card t1 left join
(select EmbossName
from card
where [status]='E0'
group by EmbossName,[status]
having count(MembershipNumber)>1 ) t2 on t1.EmbossName=t2.EmbossName
where t2.EmbossName is not null and [status]='E0'
Result:
MembershipNumber EmbossLine status EmbossName
---------------------------------------------------
4 356 E0 Mary
4 398 E0 Mary
Sample result in SQL Fiddle
Try:
WITH cnt AS (
SELECT MembershipNumber, status
FROM Card
WHERE status = 'E0'
GROUP BY MembershipNumber, status
HAVING COUNT(MembershipNumber) > 1 AND COUNT(status) > 1
)
SELECT c.*
FROM Card c
INNER JOIN cnt
ON c.MembershipNumber = cnt.MembershipNumber
AND c.status = cnt.status;
You can try this:
DECLARE @DataSource TABLE
(
[MembershipNumber] SMALLINT
,[EmbossLine] SMALLINT
,[status] CHAR(2)
,[EmbossName] VARCHAR(8)
);
INSERT INTO @DataSource ([MembershipNumber], [EmbossLine], [status], [EmbossName])
VALUES (0009,0321,'E0','Finn')
,(0009,0322,'E1','Finn')
,(0004,0356,'E0','Mary')
,(0004,0398,'E0','Mary')
,(0004,0382,'E1','Mary');
SELECT [MembershipNumber]
,[EmbossLine]
,[status]
,[EmbossName]
FROM
(
SELECT *
,COUNT([MembershipNumber]) OVER (PARTITION BY [EmbossName]) AS cnt1
,SUM(IIF([status] = 'E0' , 1, 0)) OVER (PARTITION BY [EmbossName]) AS cnt2
FROM @DataSource
) DS
) DS
WHERE cnt1 > 1
AND cnt2 > 1
AND [status] = 'E0';
The idea is to add a second counter, but instead of COUNT function to use SUM function for counting only the rows that have [status] = 'E0'. Then, in the where clause we are filtering by the two counters and [status] = 'E0'.