Stuck with this query - sql

My database looks like this
DailyData
rID int
Stock varchar
rDate date
Shares int
price float
What I am trying to do is get the data for two dates.
Sample data
rID stock rDate Shares price
11 Stock1 21/03/2016 15 1.22
12 Stock2 21/03/2016 22 2.23
13 Stock3 21/03/2016 17 3.32
14 Stock4 21/03/2016 10 4.24
15 Stock1 22/03/2016 15 1.25
16 Stock2 22/03/2016 20 2.27
17 Stock3 22/03/2016 17 3.32
18 Stock1 23/03/2016 15 1.28
19 Stock2 23/03/2016 20 2.20
20 Stock3 23/03/2016 17 3.32
21 Stock4 23/03/2016 10 4.24
Expected output
Stock Shares-21 Shares-20
Stock1 15 15
Stock2 22 20
Stock3 17 17
Stock4 10 0
My query against a SQL Server CE database:
Select
DD1.Stock, sum(DD1.Shares) as Shares-21, sum(DD2.shares) as Shares-20
from
DailyData DD1, DailyData DD2
where
DD1.rDate = '21/03/2016' and DD2.rDate = '20/03/2016'
and DD1.Stock = DD2.Stock
group by
DD1.Stock
I am getting 7 rows of data instead of 4.
Please help with the query.
******************************* new modification *********************
I followed the suggestion, but it does not seem to work. This is the actual SQL script:
Select P.pName,DD.Stock,
sum(case DD.rDate when '03/21/2016' then DD.Shares else 0 end) as Shares21,
sum(case DD.rDate when '03/20/2016' then DD.Shares else 0 end) as Shares20
from dailyData DD, Portfolios P
where DD.rDate = '03/21/2016' or DD.rDate = '03/20/2016'
and DD.pID = P.pID
and DD.pID=1
group by P.pName,DD.stock
order by P.pName,DD.Stock
For pID=1 there are 23 records for 20-Mar and 21-Mar.
When I run this query, it returns far more than 23 rows. I am expecting only 23 records.

You do not actually need a JOIN here. You could use a CASE WHEN expression inside the aggregate, like this:
SELECT stock,
Sum(CASE rdate
WHEN '21/03/2016' THEN shares
ELSE 0
END) AS Shares21,
Sum(CASE rdate
WHEN '20/03/2016' THEN shares
ELSE 0
END) AS Shares20
FROM dailydata
WHERE rdate = '21/03/2016'
OR rdate = '20/03/2016'
GROUP BY stock
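The conditional aggregation above can be tried end to end with a small harness. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in engine (an assumption, since the question targets SQL Server CE), stores the dates as ISO strings, and compares 21/03 with 22/03 because the sample data contains no rows for 20/03.

```python
import sqlite3

# In-memory SQLite database as a stand-in engine (assumption, not SQL Server CE).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DailyData (rID INTEGER, Stock TEXT, rDate TEXT, Shares INTEGER, price REAL)"
)
conn.executemany("INSERT INTO DailyData VALUES (?, ?, ?, ?, ?)", [
    (11, "Stock1", "2016-03-21", 15, 1.22),
    (12, "Stock2", "2016-03-21", 22, 2.23),
    (13, "Stock3", "2016-03-21", 17, 3.32),
    (14, "Stock4", "2016-03-21", 10, 4.24),
    (15, "Stock1", "2016-03-22", 15, 1.25),
    (16, "Stock2", "2016-03-22", 20, 2.27),
    (17, "Stock3", "2016-03-22", 17, 3.32),
    (18, "Stock1", "2016-03-23", 15, 1.28),
    (19, "Stock2", "2016-03-23", 20, 2.20),
    (20, "Stock3", "2016-03-23", 17, 3.32),
    (21, "Stock4", "2016-03-23", 10, 4.24),
])

# One pass over the table: each SUM only counts the rows for its own date,
# so a stock missing on one of the dates still appears, with 0 in that column.
query = """
SELECT Stock,
       SUM(CASE WHEN rDate = :d1 THEN Shares ELSE 0 END) AS SharesD1,
       SUM(CASE WHEN rDate = :d2 THEN Shares ELSE 0 END) AS SharesD2
FROM DailyData
WHERE rDate IN (:d1, :d2)
GROUP BY Stock
ORDER BY Stock
"""
result = conn.execute(query, {"d1": "2016-03-21", "d2": "2016-03-22"}).fetchall()
for row in result:
    print(row)
```

The sketch uses IN rather than OR for the date filter. If further AND conditions are added to a WHERE clause that contains OR, the OR pair needs its own parentheses, because AND binds more tightly than OR.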

You could use PIVOT with DATEPART
SELECT Stock, [21] AS [Shares-21], [20] AS [Shares-20] FROM
(
SELECT s.Stock, Datepart(d,s.rDate) rDate , s.Shares FROM DailyData s
) src
PIVOT
(
MAX(Shares) for rDate in ([21], [20])
) pvt

Related

Subtract in Union

I have this data, where I want to generate the last row "on the fly" from the first two:
Group 1yr 2yrs 3yrs date code
Port 19 -15 88 1/1/2020 arp
Bench 10 -13 66 1/1/2020 arb
Diff 9 2 22
I am trying to subtract the Port & Bench returns and have the difference on the new row. How can I do this?
Here's my code so far:
Select
date
Group,
Code,
1 yr returnp,
2 yrs returnp,
3yrs return
From timetable
union
Select
date,
Group,
Code,
1 yr returnb,
2 yrs returnb,
3yrs returnb
From timetable
It seems to me that a UNION ALL combined with conditional aggregation should do the trick.
Note that the sum() is wrapped in abs() to match the desired results:
Select *
From YourTable
Union All
Select [Group] = 'Diff'
,[1yr] = abs(sum([1yr] * case when [Group]='Bench' then -1 else 1 end))
,[2yrs] = abs(sum([2yrs] * case when [Group]='Bench' then -1 else 1 end))
,[3yrs] = abs(sum([3yrs] * case when [Group]='Bench' then -1 else 1 end))
,[date] = null
,[code] = null
from YourTable
Results
Group 1yr 2yrs 3yrs date code
Port 19 -15 88 2020-01-01 arp
Bench 10 -13 66 2020-01-01 arb
Diff 9 2 22 NULL NULL
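If you know there are always exactly two rows, something like this would work (the square brackets are needed because the column names start with a digit, and the 'Diff' label keeps the column count equal on both sides of the UNION ALL):
SELECT * FROM timetable
UNION ALL
SELECT
'Diff',
MAX([1yr]) - MIN([1yr]),
MAX([2yrs]) - MIN([2yrs]),
MAX([3yrs]) - MIN([3yrs]),
NULL,
NULL
FROM timetable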
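As a rough check of the UNION ALL idea, here is a minimal sketch run against SQLite through Python's sqlite3 module (a stand-in engine, not the asker's database). The table name timetable and the column order are taken from the question. Unlike the answers above, it computes the signed Port minus Bench difference, so the 2yrs value comes out as -2; wrapping each SUM in ABS() would reproduce the table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names that start with a digit must be quoted.
conn.execute(
    'CREATE TABLE timetable ("Group" TEXT, "1yr" INTEGER, "2yrs" INTEGER, '
    '"3yrs" INTEGER, "date" TEXT, code TEXT)'
)
conn.executemany("INSERT INTO timetable VALUES (?, ?, ?, ?, ?, ?)", [
    ("Port", 19, -15, 88, "2020-01-01", "arp"),
    ("Bench", 10, -13, 66, "2020-01-01", "arb"),
])

# The second SELECT adds Port values and subtracts Bench values,
# producing a single extra 'Diff' row with the same six columns.
query = """
SELECT "Group", "1yr", "2yrs", "3yrs", "date", code FROM timetable
UNION ALL
SELECT 'Diff',
       SUM(CASE WHEN "Group" = 'Bench' THEN -"1yr"  ELSE "1yr"  END),
       SUM(CASE WHEN "Group" = 'Bench' THEN -"2yrs" ELSE "2yrs" END),
       SUM(CASE WHEN "Group" = 'Bench' THEN -"3yrs" ELSE "3yrs" END),
       NULL,
       NULL
FROM timetable
"""
result = conn.execute(query).fetchall()
diff_row = [row for row in result if row[0] == "Diff"][0]
print(diff_row)
```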

How do I transpose a result set and group by week?

I have a view based on query:
SELECT CONVERT(VARCHAR(10), date, 103) AS date,
eventid, name, time, pts
FROM results
WHERE DATEPART(yy, date) = 2019;
This provides a data set such as this:
Date EventID Name Time Points
24/04/2019 10538 Fred Flintstone 22:27 10
24/04/2019 10538 Barney Rubble 22:50 9
24/04/2019 10538 Micky Mouse 23:17 8
24/04/2019 10538 Yogi Bear 23:54 7
24/04/2019 10538 Donald Duck 24:07 6
01/05/2019 10541 Barney Rubble 21:58 10
01/05/2019 10541 Fred Flintstone 22:00 9
01/05/2019 10541 Donald Duck 23:39 8
01/05/2019 10541 Yogi Bear 23:43 7
12/06/2019 10569 Fred Flintstone 22:06 10
12/06/2019 10569 Barney Rubble 22:22 9
12/06/2019 10569 Micky Mouse 23:05 8
12/06/2019 10569 Donald Duck 23:55 7
I need an output row for each name listing the pts per round and a total in the form:
Name 24/04/2019 01/05/2019 12/06/2019 total
Fred Flintstone 10 9 10 29
Barney Rubble 9 10 9 28
Yogi Bear 7 7 7 21
Micky Mouse 8 8 16
Donald Duck 6 8 14
There could be up to 16 non-consecutive event dates for the year.
Nothing wrong with PIVOT but, for me, the easiest and most performant way to do this would be to perform a Cross Tab. The syntax is less verbose, more portable, and easier to understand.
First, some DDL and easily consumable sample data. <<< Learn how to do this; it will get you better answers more quickly.
SET NOCOUNT ON;
SET DATEFORMAT dmy; -- I need this because I'm American
-- DDL and easily consumable sample data
CREATE TABLE #Results
(
[Date] DATE,
EventId INT,
[Name] VARCHAR(40), -- if indexed, go as narrow as possible
[Time] TIME,
Points INT,
INDEX uq_poc_results CLUSTERED([Name],[EventId]) -- a covering index is vital for a query like this
); -- note: ^^^ this is a bad clustered index candidate; I went this route for simplicity
INSERT #Results VALUES
('4/04/2019', 10538, 'Fred Flintstone', '22:27',10),
('24/04/2019',10538, 'Barney Rubble', '22:50',9),
('24/04/2019',10538, 'Micky Mouse ', '23:17',8),
('24/04/2019',10538, 'Yogi Bear', '23:54',7),
('24/04/2019',10538, 'Donald Duck', '23:07',6),
('01/05/2019',10541, 'Barney Rubble', '21:58',10),
('01/05/2019',10541, 'Fred Flintstone', '22:00',9),
('01/05/2019',10541, 'Donald Duck', '23:39',8),
('01/05/2019',10541, 'Yogi Bear', '23:43',7),
('12/06/2019',10569, 'Fred Flintstone', '22:06',10),
('12/06/2019',10569, 'Barney Rubble', '22:22',9),
('12/06/2019',10569, 'Micky Mouse', '23:05',8),
('12/06/2019',10569, 'Donald Duck', '23:55',7);
Note that I created a clustered index on (Name,EventId) - I would use a non-clustered index that covered the columns you need in the real world. If you have a lot of rows then you will want that index.
Basic Cross-Tab
SELECT [Name] = r.[Name],
[24/04/2019] = MAX(CASE r.[Date] WHEN '24/04/2019' THEN r.Points ELSE 0 END),
[01/05/2019] = MAX(CASE r.[Date] WHEN '01/05/2019' THEN r.Points ELSE 0 END),
[12/06/2019] = MAX(CASE r.[Date] WHEN '12/06/2019' THEN r.Points ELSE 0 END)
FROM #Results AS r
GROUP BY r.[Name];
Results:
Name 24/04/2019 01/05/2019 12/06/2019
-------------------- ------------ ------------ ------------
Barney Rubble 9 10 9
Donald Duck 6 8 7
Fred Flintstone 0 9 10
Micky Mouse 8 0 8
Yogi Bear 7 7 0
To get the total, we can wrap this logic in a subquery and add the columns like this:
SELECT
[Name] = piv.N,
[24/04/2019] = piv.D1,
[01/05/2019] = piv.D2,
[12/06/2019] = piv.D3,
Total = piv.D1+piv.D2+piv.D3
FROM
(
SELECT r.[Name],
MAX(CASE r.[Date] WHEN '24/04/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '01/05/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '12/06/2019' THEN r.Points ELSE 0 END)
FROM #Results AS r
GROUP BY r.[Name]
) AS piv(N,D1,D2,D3);
Returns:
Name 24/04/2019 01/05/2019 12/06/2019 Total
------------------- ----------- ----------- ----------- -------
Barney Rubble 9 10 9 28
Donald Duck 6 8 7 21
Fred Flintstone 0 9 10 19
Micky Mouse 8 0 8 16
Yogi Bear 7 7 0 14
Not only does this get you what you need with very little SQL, you also benefit from pre-aggregation inside the subquery. A big advantage of this approach over PIVOT is that you can do multiple aggregations in one query. Below are two examples of how to use this approach for multiple aggregations: the first uses a standard GROUP BY twice, the second uses window aggregate functions (aggregate OVER (PARTITION BY ... ORDER BY ...)):
--==== Traditional Approach
SELECT
[Name] = piv.N,
[24/04/2019] = MAX(piv.D1),
[01/05/2019] = MAX(piv.D2),
[12/06/2019] = MAX(piv.D3),
Total = MAX(f.Ttl),
Avg1 = AVG(piv.D1), -- 1st date (24/04/2019)
Avg2 = AVG(piv.D2), -- 2nd date...
Avg3 = AVG(piv.D3), -- 3rd date...
TotalAvg = AVG(f.Ttl) ,
Mn = MIN(f.Ttl) ,
Mx = MAX(f.Ttl)
FROM
(
SELECT r.[Name],
MAX(CASE r.[Date] WHEN '24/04/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '01/05/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '12/06/2019' THEN r.Points ELSE 0 END)
FROM #Results AS r
GROUP BY r.[Name]
) AS piv(N,D1,D2,D3)
CROSS APPLY (VALUES(piv.D1+piv.D2+piv.D3)) AS f(Ttl)
GROUP BY piv.N;
--==== Leveraging Window Aggregates
SELECT
[Name] = piv.N,
[24/04/2019] = piv.D1,
[01/05/2019] = piv.D2,
[12/06/2019] = piv.D3,
Total = f.Ttl,
Avg1 = AVG(piv.D1) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL)), -- 1st date (24/04/2019)
Avg2 = AVG(piv.D2) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL)), -- 2nd date...
Avg3 = AVG(piv.D3) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL)), -- 3rd date...
TotalAvg = AVG(f.Ttl) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL)),
Mn = MIN(f.Ttl) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL)),
Mx = MAX(f.Ttl) OVER(PARTITION BY piv.N ORDER BY (SELECT NULL))
FROM
(
SELECT r.[Name],
MAX(CASE r.[Date] WHEN '24/04/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '01/05/2019' THEN r.Points ELSE 0 END),
MAX(CASE r.[Date] WHEN '12/06/2019' THEN r.Points ELSE 0 END)
FROM #Results AS r
GROUP BY r.[Name]
) AS piv(N,D1,D2,D3)
CROSS APPLY (VALUES(piv.D1+piv.D2+piv.D3)) AS f(Ttl);
Both Return:
Name 24/04/2019 01/05/2019 12/06/2019 Total Avg1 Avg2 Avg3 TotalAvg Mn Mx
----------------- ----------- ----------- ----------- ------ ------ ------ ------ ---------- ------ ------
Barney Rubble 9 10 9 28 9 10 9 28 28 28
Donald Duck 6 8 7 21 6 8 7 21 21 21
Fred Flintstone 0 9 10 19 0 9 10 19 19 19
Micky Mouse 8 0 8 16 8 0 8 16 16 16
Yogi Bear 7 7 0 14 7 7 0 14 14 14
To handle the columns dynamically you need to have a look at:
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs by Jeff Moden.
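To give a feel for what a dynamic cross tab involves, here is a minimal sketch that generates one MAX(CASE ...) column per distinct event date. It is not the method from the linked article: it builds the SQL text in Python and runs it on SQLite via the sqlite3 module, both of which are assumptions made for illustration. The helper name build_crosstab_sql is hypothetical, and the rows are the ones listed in the question, stored with ISO dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (date TEXT, eventid INTEGER, name TEXT, time TEXT, pts INTEGER)"
)
conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", [
    ("2019-04-24", 10538, "Fred Flintstone", "22:27", 10),
    ("2019-04-24", 10538, "Barney Rubble", "22:50", 9),
    ("2019-04-24", 10538, "Micky Mouse", "23:17", 8),
    ("2019-04-24", 10538, "Yogi Bear", "23:54", 7),
    ("2019-04-24", 10538, "Donald Duck", "24:07", 6),
    ("2019-05-01", 10541, "Barney Rubble", "21:58", 10),
    ("2019-05-01", 10541, "Fred Flintstone", "22:00", 9),
    ("2019-05-01", 10541, "Donald Duck", "23:39", 8),
    ("2019-05-01", 10541, "Yogi Bear", "23:43", 7),
    ("2019-06-12", 10569, "Fred Flintstone", "22:06", 10),
    ("2019-06-12", 10569, "Barney Rubble", "22:22", 9),
    ("2019-06-12", 10569, "Micky Mouse", "23:05", 8),
    ("2019-06-12", 10569, "Donald Duck", "23:55", 7),
])


def build_crosstab_sql(dates):
    """Build a cross-tab query with one MAX(CASE ...) column per event date.

    Date values are bound as parameters; only the column aliases are
    interpolated, with embedded double quotes escaped.
    """
    columns = ",\n       ".join(
        'MAX(CASE WHEN date = ? THEN pts ELSE 0 END) AS "{}"'.format(d.replace('"', '""'))
        for d in dates
    )
    return (
        "SELECT name,\n       " + columns + ",\n       SUM(pts) AS total\n"
        "FROM results\nGROUP BY name\nORDER BY total DESC, name"
    )


# Discover the event dates first, then generate and run the cross tab.
event_dates = [row[0] for row in conn.execute("SELECT DISTINCT date FROM results ORDER BY date")]
crosstab = conn.execute(build_crosstab_sql(event_dates), event_dates).fetchall()
for row in crosstab:
    print(row)
```

In SQL Server the same idea is usually implemented by assembling the column list into a string and executing it as dynamic SQL. Either way the column list has to be known before the final query runs.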

How to split the column values after a certain number

I have a dataset that looks like this:
ID HoursWorked TotalHours
23 1 1
23 1 2
23 1 3
23 0.5 3.5
23 1 4.5
23 1 5.5
23 1 6.5
23 1 7.5
23 1 8.5
61 1 1
61 1 2
What I want to do is if the total hours hits 8 hours, I want to split that row (e.g. 8.5 in the sample data above) so that an employee always has the total hours of 8. If someone works over 8 hours it should continue after hitting 8 in the totalhours column. For example, I want something like this as my final result.
ID HoursWorked TotalHours
23 1 1
23 1 2
23 1 3
23 0.5 3.5
23 1 4.5
23 1 5.5
23 1 6.5
23 1 7.5
23 0.5 8 *
23 0.5 8.5 *
61 1 1
61 1 2
As you can see the row which originally had 8.5 for its totalhours got broken down into two different rows.
I couldn't think of any way to do this in SQL Server. I'd appreciate any help on this.
See if this works:
select ID,HoursWorked,TotalHours from table_name where TotalHours <=8
union
select ID,HoursWorked-(TotalHours-8) as HoursWorked ,8 as TotalHours from table_name where TotalHours >8
union
select ID,(TotalHours-8) as HoursWorked ,TotalHours from table_name where TotalHours >8
This seems rather complicated. This approach takes all the rows before 8 hours. It then finds the row that first passes 8 hours and splits that one as needed:
select id, hoursworked, totalhours
from t
where totalhours <= 8
union all
select t.id, v.hoursworked, v.totalhours
from (select t.*, row_number() over (partition by id order by totalhours) as seqnum
from t
where totalhours > 8
) t cross apply
(values (case when seqnum = 1 then hoursworked - (totalhours - 8) end,
case when seqnum = 1 then 8 end
),
(case when seqnum = 1 and totalhours >= 8 then totalhours - 8 else hoursworked end,
totalhours
)
) v(hoursworked, totalhours)
where v.hoursworked > 0
order by id, totalhours;
Here is a db<>fiddle.
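For comparison, the split can also be written without window functions or CROSS APPLY, under one assumption that is not stated in the question: TotalHours is a running sum of HoursWorked per ID. In that case the row that crosses the 8-hour mark is the one where TotalHours is above 8 while TotalHours - HoursWorked is still below 8. The sketch below runs this variant on SQLite through Python's sqlite3 module, which is a stand-in engine rather than SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, hoursworked REAL, totalhours REAL)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", [
    (23, 1, 1), (23, 1, 2), (23, 1, 3), (23, 0.5, 3.5), (23, 1, 4.5),
    (23, 1, 5.5), (23, 1, 6.5), (23, 1, 7.5), (23, 1, 8.5),
    (61, 1, 1), (61, 1, 2),
])

query = """
-- rows that do not cross the 8-hour mark are returned unchanged
SELECT id, hoursworked, totalhours
FROM t
WHERE totalhours <= 8 OR totalhours - hoursworked >= 8
UNION ALL
-- first half of the crossing row: the hours needed to reach exactly 8
SELECT id, 8 - (totalhours - hoursworked), 8
FROM t
WHERE totalhours > 8 AND totalhours - hoursworked < 8
UNION ALL
-- second half of the crossing row: the hours worked beyond 8
SELECT id, totalhours - 8, totalhours
FROM t
WHERE totalhours > 8 AND totalhours - hoursworked < 8
ORDER BY id, totalhours
"""
split_rows = conn.execute(query).fetchall()
for row in split_rows:
    print(row)
```

If a row lands exactly on 8 hours, no row crosses the mark and nothing is split, which matches the requirement that the total should hit 8 exactly once.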

SQL cumulative sum until a flag value and resetting the sum

I'm still learning SQL and have run into a problem I haven't been able to solve. I'm selecting from a table (say Expense), ordered by date. The table has a column named Charged, and I want a cumulative sum of the charges (this part I figured out). Another column, PayOut, acts as a flag: when PayOut is 1, I want the running sum of Charged (SumValue) to reset to zero. How would I do this? Below is what I have tried, the output I currently get, and the output I want. Note: I saw some posts using CTEs, but they covered a different and more complex scenario.
select ex.date,
ex.Charged,
(case when(ex.PayOut=1) then 0
else sum(ex.Charged) over (order by ex.date)end) as SumValue,
ex.PayOut
from Expense ex
order by ex.date asc
The data looks like this
Date Charged PayOut
01/10/2018 10 0
01/20/2018 5 0
01/30/2018 3 0
02/01/2018 0 1
02/11/2018 12 0
02/21/2018 15 0
Output I get
Date Charged PayOut SumValue
01/10/2018 10 0 10
01/20/2018 5 0 15
01/30/2018 3 0 18
02/01/2018 0 1 0
02/11/2018 12 0 30
02/21/2018 15 0 45
Output Wanted
Date Charged PayOut SumValue
01/10/2018 10 0 10
01/20/2018 5 0 15
01/30/2018 3 0 18
02/01/2018 0 1 0
02/11/2018 12 0 12
02/21/2018 15 0 27
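Create a group number from your PayOut column with a running sum, then use it as the partition in OVER. Each row where PayOut = 1 increments the group number, so the cumulative sum restarts there: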
WITH Expense AS (
SELECT CAST('01/10/2018' AS DATE) AS Date, 10 AS Charged, 0 AS PayOut
UNION ALL SELECT CAST('01/20/2018' AS DATE), 5, 0
UNION ALL SELECT CAST('01/30/2018' AS DATE), 3, 0
UNION ALL SELECT CAST('02/01/2018' AS DATE), 0, 1
UNION ALL SELECT CAST('02/11/2018' AS DATE), 12, 0
UNION ALL SELECT CAST('02/21/2018' AS DATE), 15, 0
)
SELECT
dat.date
,dat.Charged
,dat.PayOut
,dat.PayOutGroup
,SUM(dat.Charged) OVER (PARTITION BY dat.PayOutGroup ORDER BY dat.date) as SumValue
FROM (
SELECT
e.date
,e.Charged
,e.PayOut
,SUM(e.PayOut) OVER (ORDER BY e.date) AS PayOutGroup
FROM Expense e
) dat
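The same two-step idea can be checked with a small harness. This sketch runs the query on SQLite through Python's sqlite3 module, which is an assumption made for convenience (window functions need SQLite 3.25 or later), and stores the dates from the question as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Expense (date TEXT, Charged INTEGER, PayOut INTEGER)")
conn.executemany("INSERT INTO Expense VALUES (?, ?, ?)", [
    ("2018-01-10", 10, 0),
    ("2018-01-20", 5, 0),
    ("2018-01-30", 3, 0),
    ("2018-02-01", 0, 1),
    ("2018-02-11", 12, 0),
    ("2018-02-21", 15, 0),
])

# Inner query: a running sum of the PayOut flag numbers the groups (0, 0, 0, 1, 1, 1).
# Outer query: the cumulative sum of Charged restarts within each group.
query = """
SELECT date, Charged, PayOut, PayOutGroup,
       SUM(Charged) OVER (PARTITION BY PayOutGroup ORDER BY date) AS SumValue
FROM (
    SELECT date, Charged, PayOut,
           SUM(PayOut) OVER (ORDER BY date) AS PayOutGroup
    FROM Expense
)
ORDER BY date
"""
rows = conn.execute(query).fetchall()
payout_groups = [row[3] for row in rows]
sum_values = [row[4] for row in rows]
print(sum_values)
```

Note that the payout row itself belongs to the new group, so a non-zero Charged value on that row would be counted in the restarted sum rather than forced to zero.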

Count parts of total value as columns per row (pivot table)

I'm stuck with a seemingly easy query, but couldn't manage to get it working over the last few hours.
I have a table files that holds file names and some values such as the number of records in the file, the DATE of creation (create_date), the DATE of processing (processing_date) and so on. There can be multiple files for a create date at different hours, and it is likely that they will not be processed on the day of creation; in fact, it can take three days or longer for them to be processed.
So let's assume I have these rows, as an example:
create_date | processing_date
------------------------------
2012-09-10 11:10:55.0 | 2012-09-11 18:00:18.0
2012-09-10 15:20:18.0 | 2012-09-11 13:38:19.0
2012-09-10 19:30:48.0 | 2012-09-12 10:59:00.0
2012-09-11 08:19:11.0 | 2012-09-11 18:14:44.0
2012-09-11 22:31:42.0 | 2012-09-21 03:51:09.0
What I want in a single query is one row per create_date (truncated to the day), with 11 additional columns counting the files by the difference in days between processing_date and create_date, so that the result looks roughly like this:
create_date | diff0days | diff1days | diff2days | ... | diff10days
------------------------------------------------------------------------
2012-09-10 | 0 2 1 ... 0
2012-09-11 | 1 0 0 ... 1
and so on, I hope you get the point :)
I have tried this and so far it works getting a single aggregated column for a create_date with a difference of - for example - 3:
SELECT TRUNC(f.create_date, 'DD') as created, count(1) FROM files f WHERE TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD') = 3 GROUP BY TRUNC(f.create_date, 'DD')
I tried combining the single queries and I tried sub-queries, but that didn't help or at least my knowledge about SQL is not sufficient.
What I need is a hint so that I can include the various differences as columns, like shown above. How could I possibly achieve this?
That's basically the pivoting problem:
SELECT TRUNC(f.create_date, 'DD') as created
, sum(case TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD')
when 0 then 1 end) as diff0days
, sum(case TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD')
when 1 then 1 end) as diff1days
, sum(case TRUNC(f.process_date, 'DD') - trunc(f.create_date, 'DD')
when 2 then 1 end) as diff2days
, ...
FROM files f
GROUP BY
TRUNC(f.create_date, 'DD')
SELECT CreateDate,
sum(CASE WHEN DateDiff(day, CreateDate, ProcessDate) = 1 THEN 1 ELSE 0 END) AS Diff1,
sum(CASE WHEN DateDiff(day, CreateDate, ProcessDate) = 2 THEN 1 ELSE 0 END) AS Diff2,
...
FROM table
GROUP BY CreateDate
ORDER BY CreateDate
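Writing eleven near-identical SUM(CASE ...) columns by hand is tedious, so the column list can be generated. The sketch below is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module instead of Oracle, replaces TRUNC with SQLite's date() function, computes the day difference with julianday(), and stores the sample timestamps without the trailing fractional seconds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (create_date TEXT, processing_date TEXT)")
conn.executemany("INSERT INTO files VALUES (?, ?)", [
    ("2012-09-10 11:10:55", "2012-09-11 18:00:18"),
    ("2012-09-10 15:20:18", "2012-09-11 13:38:19"),
    ("2012-09-10 19:30:48", "2012-09-12 10:59:00"),
    ("2012-09-11 08:19:11", "2012-09-11 18:14:44"),
    ("2012-09-11 22:31:42", "2012-09-21 03:51:09"),
])

# Whole-day difference after truncating both timestamps to their calendar date.
day_diff = (
    "CAST(julianday(date(processing_date)) - julianday(date(create_date)) AS INTEGER)"
)

# One counting column per difference, from diff0days up to diff10days.
columns = ",\n       ".join(
    f"SUM(CASE WHEN {day_diff} = {n} THEN 1 ELSE 0 END) AS diff{n}days"
    for n in range(11)
)

query = f"""
SELECT date(create_date) AS created,
       {columns}
FROM files
GROUP BY date(create_date)
ORDER BY created
"""
pivot = conn.execute(query).fetchall()
for row in pivot:
    print(row)
```

Using ELSE 0 makes empty buckets come back as 0 instead of NULL, which matches the layout asked for in the question.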
As you are using Oracle 11g, you can also get the desired result with a PIVOT query.
Here is an example:
-- sample of data from your question
create table Your_table(create_date, processing_date) as
(
select '2012-09-10', '2012-09-11' from dual union all
select '2012-09-10', '2012-09-11' from dual union all
select '2012-09-10', '2012-09-12' from dual union all
select '2012-09-11', '2012-09-11' from dual union all
select '2012-09-11', '2012-09-21' from dual
)
;
Table created
with t2 as(
select create_date
, processing_date
, to_date(processing_date, 'YYYY-MM-DD')
- To_Date(create_date, 'YYYY-MM-DD') dif
from your_table
)
select create_date
, max(diff0) diff0
, max(diff1) diff1
, max(diff2) diff2
, max(diff3) diff3
, max(diff4) diff4
, max(diff5) diff5
, max(diff6) diff6
, max(diff7) diff7
, max(diff8) diff8
, max(diff9) diff9
, max(diff10) diff10
from (select *
from t2
pivot(
count(dif)
for dif in ( 0 diff0
, 1 diff1
, 2 diff2
, 3 diff3
, 4 diff4
, 5 diff5
, 6 diff6
, 7 diff7
, 8 diff8
, 9 diff9
, 10 diff10
)
) pd
) res
group by create_date
;
Result:
Create_Date Diff0 Diff1 Diff2 Diff3 Diff4 Diff5 Diff6 Diff7 Diff8 Diff9 Diff10
--------------------------------------------------------------------------------
2012-09-10 0 2 1 0 0 0 0 0 0 0 0
2012-09-11 1 0 0 0 0 0 0 0 0 0 1