Here's my dataset of trades, traders and counterparties:
TRADER_ID | TRADER_NAME | EXEC_BROKER | TRADE_AMOUNT | TRADE_ID
ABC123 | Jules Winnfield | GOLD | 10000 | ASDADAD
XDA241 | Jimmie Dimmick | GOLD | 12000 | ASSVASD
ADC123 | Vincent Vega | BARC | 10000 | ZXCZCX
ABC123 | Jules Winnfield | BARC | 15000 | ASSXCQA
ADC123 | Vincent Vega | CRED | 250000 | RFAQQA
ABC123 | Jules Winnfield | CRED | 5000 | ASDQ23A
ABC123 | Jules Winnfield | GOLD | 5000 | AVBDQ3A
I'm looking to produce a repeatable monthly report that gives me a view of trading activity aggregated at the counterparty (the EXEC_BROKER field) level, with subtotals - as shown below:
TRADER_ID | TRADER_NAME | NO._OF_CCP_USED | CCP | TRADED_AMT_WITH_CCP | VALUE_OF_TOTAL_TRADES | TRADES_WITH_CCP | TOTAL_TRADES
ABC123 | Jules Winnfield | 3 | GOLD | 15000 | 35000 | 2 | 4
ABC123 | Jules Winnfield | 3 | BARC | 15000 | 35000 | 1 | 4
ABC123 | Jules Winnfield | 3 | CRED | 5000 | 35000 | 1 | 4
...and so on for the rest.
The idea is to aggregate the number of trades per counterparty (which I have done using a count function), and the sum of traded amounts with the ccp, but I'm struggling to get the 'subtotal' field next to each trader as shown in my desired output above - so you can see here that Jules has dealt with 3 counterparties in total, with 4 trades between them, and a collective amount of 35000.
I have tried combining aggregate functions with OVER (PARTITION BY ...), but to no avail.
SELECT
OT.TRADER_ID,
OT.TRADER_NAME,
OT.EXEC_BROKER,
SUM(OT.TRADE_AMOUNT) AS VALUE_OF_TOTAL_TRADES,
COUNT(OT.TRADE_ID) AS TOTAL_TRADES,
COUNT(OT.EXEC_BROKER) OVER PARTITION BY (OT.TRADER_ID) AS NO._OF_CCP_USED,
SUM(OT.TRADE_AMOUNT) OVER PARTITION BY (OT.EXEC_BROKER) AS TRADED_AMT_WITH_CCP,
COUNT(OT.TRADE_ID) OVER PARTITION BY (OT.EXEC_BROKER) AS TRADES_WITH_CCP
FROM dbo.ORDERS_TRADES OT
GROUP BY OT.TRADER_ID, OT.TRADER_NAME, OT.EXEC_BROKER, OT.TRADE_AMOUNT, OT.TRADE_ID
The code above runs but returns millions of rows. When I remove the partition by lines, I get the desired result minus the subtotal columns I'm looking for.
Any suggestions please? Thanks very much!
EDIT:
Final code which gave me the desired output: updating my question to provide this response (thanks to Gordon Linoff) so that others can benefit:
SELECT
OT.TRADER_ID,
OT.TRADER_NAME,
OT.EXEC_BROKER,
RANK() OVER (PARTITION BY OT.TRADER_ID ORDER BY
SUM(OT.TRADE_AMOUNT) DESC) AS CCP_RANK,
SUM(OT.TRADE_AMOUNT) AS TRADED_AMT_WITH_CCP,
SUM(SUM(OT.TRADE_AMOUNT)) OVER (PARTITION BY OT.TRADER_ID) AS
VALUE_OF_TOTAL_TRADES,
COUNT(*) OVER (PARTITION BY OT.TRADER_ID) AS NUM_OF_CCP_USED,
SUM(COUNT(OT.TRADE_ID)) OVER (PARTITION BY OT.TRADER_ID) AS
TOTAL_TRADES
FROM dbo.ORDERS_TRADES OT
GROUP BY OT.TRADER_ID, OT.TRADER_NAME, OT.EXEC_BROKER
You seem to want:
SELECT OT.TRADER_ID, OT.TRADER_NAME, OT.CCP,
COUNT(*) OVER (PARTITION BY OT.TRADER_ID) as NUM_CCP,
SUM(OT.TRADED_AMT) AS TRADED_AMT_WITH_CCP,
SUM(SUM(OT.TRADED_AMT)) OVER (PARTITION BY OT.TRADER_ID) AS VALUE_OF_TOTAL_TRADES,
COUNT(OT.TRADE_ID) AS CCP_TRADES,
SUM(COUNT(OT.TRADE_ID)) OVER (PARTITION BY OT.TRADER_ID) AS TOTAL_TRADES
FROM ORDERS_TRADES OT
GROUP BY OT.TRADER_ID, OT.TRADER_NAME, OT.CCP;
I'm not sure what your query has to do with the results you want. The columns have little to do with what you are asking.
Here is a db<>fiddle.
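As a rough illustration of how SUM(SUM(...)) OVER (PARTITION BY ...) is evaluated: the inner aggregate runs once per GROUP BY group, and the window function then runs over those already-grouped rows. The Python sketch below mimics the two stages on the question's sample rows. The variable names (grouped, per_trader, report) are made up for the illustration and are not part of the query.

```python
from collections import defaultdict

# Sample rows from the question: (trader_id, ccp, traded_amt, trade_id)
trades = [
    ("ABC123", "GOLD", 10000, "ASDADAD"),
    ("XDA241", "GOLD", 12000, "ASSVASD"),
    ("ADC123", "BARC", 10000, "ZXCZCX"),
    ("ABC123", "BARC", 15000, "ASSXCQA"),
    ("ADC123", "CRED", 250000, "RFAQQA"),
    ("ABC123", "CRED", 5000, "ASDQ23A"),
    ("ABC123", "GOLD", 5000, "AVBDQ3A"),
]

# Stage 1: GROUP BY trader_id, ccp -> one row per (trader, counterparty).
grouped = defaultdict(lambda: [0, 0])  # key -> [sum of amounts, trade count]
for trader_id, ccp, amt, _trade_id in trades:
    grouped[(trader_id, ccp)][0] += amt
    grouped[(trader_id, ccp)][1] += 1

# Stage 2: window over the grouped rows, partitioned by trader_id.
per_trader = defaultdict(lambda: [0, 0, 0])  # trader -> [ccps, total amt, total trades]
for (trader_id, _ccp), (amt, cnt) in grouped.items():
    per_trader[trader_id][0] += 1    # COUNT(*) OVER (PARTITION BY trader_id)
    per_trader[trader_id][1] += amt  # SUM(SUM(amt)) OVER (PARTITION BY trader_id)
    per_trader[trader_id][2] += cnt  # SUM(COUNT(trade_id)) OVER (PARTITION BY trader_id)

# One output row per (trader, counterparty), with the trader subtotals attached.
report = [
    (trader_id, per_trader[trader_id][0], ccp, amt,
     per_trader[trader_id][1], cnt, per_trader[trader_id][2])
    for (trader_id, ccp), (amt, cnt) in grouped.items()
]
```

Because the window runs after grouping, COUNT(*) OVER (...) counts counterparties per trader rather than individual trades.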
Making some assumptions about the nomenclature, here is a solution that doesn't use anything too fancy so it's easy to maintain, though it's not the most efficient:
create table trades
(
TRADER_ID varchar(10),
TRADER_NAME varchar(20),
CCP char(4),
TRADED_AMT decimal(10,2),
TRADE_ID varchar(10) primary key
);
insert trades
values
('ABC123', 'Jules Winnfield', 'GOLD', 10000 , 'ASDADAD'),
('XDA241', 'Jimmie Dimmick ', 'GOLD', 12000 , 'ASSVASD'),
('ADC123', 'Vincent Vega ', 'BARC', 10000 , 'ZXCZCX'),
('ABC123', 'Jules Winnfield', 'BARC', 15000 , 'ASSXCQA'),
('ADC123', 'Vincent Vega ', 'CRED', 250000, 'RFAQQA'),
('ABC123', 'Jules Winnfield', 'CRED', 5000 , 'ASDQ23A'),
('ABC123', 'Jules Winnfield', 'GOLD', 5000 , 'AVBDQ3A');
with trader_totals as
(
select trader_id,
distinct_ccps = count(distinct CCP),
total_amt = sum(traded_amt),
total_count = count(*)
from trades
group by trader_id
)
select trader_id = tr.trader_id,
trader_name = trader_name,
distinct_CCP_count = tt.distinct_ccps,
CCP = tr.CCP,
this_CCP_traded_amt = sum(traded_amt),
total_traded_amt = tt.total_amt,
this_CCP_traded_count = count(*),
total_traded_count = tt.total_count
from trades tr
join trader_totals tt on tt.trader_id = tr.trader_id
group by tr.trader_id,
tr.trader_name,
tr.CCP,
tt.distinct_ccps,
tt.total_amt,
tt.total_count
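The alias = expression form used above is specific to SQL Server. As a sketch only, here is the same query written with the portable expression AS alias form and run through Python's sqlite3 module against an in-memory database, which makes it easy to check the numbers. SQLite is just a convenient stand-in engine here, not what the question uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table trades (trader_id text, trader_name text, ccp text,"
    " traded_amt numeric, trade_id text primary key)"
)
conn.executemany(
    "insert into trades values (?, ?, ?, ?, ?)",
    [
        ("ABC123", "Jules Winnfield", "GOLD", 10000, "ASDADAD"),
        ("XDA241", "Jimmie Dimmick", "GOLD", 12000, "ASSVASD"),
        ("ADC123", "Vincent Vega", "BARC", 10000, "ZXCZCX"),
        ("ABC123", "Jules Winnfield", "BARC", 15000, "ASSXCQA"),
        ("ADC123", "Vincent Vega", "CRED", 250000, "RFAQQA"),
        ("ABC123", "Jules Winnfield", "CRED", 5000, "ASDQ23A"),
        ("ABC123", "Jules Winnfield", "GOLD", 5000, "AVBDQ3A"),
    ],
)

# Per-trader totals in a CTE, joined back to the per-counterparty aggregation.
query = """
with trader_totals as (
    select trader_id,
           count(distinct ccp) as distinct_ccps,
           sum(traded_amt)     as total_amt,
           count(*)            as total_count
    from trades
    group by trader_id
)
select tr.trader_id, tr.trader_name, tt.distinct_ccps, tr.ccp,
       sum(tr.traded_amt) as this_ccp_traded_amt, tt.total_amt,
       count(*)           as this_ccp_traded_count, tt.total_count
from trades tr
join trader_totals tt on tt.trader_id = tr.trader_id
group by tr.trader_id, tr.trader_name, tr.ccp,
         tt.distinct_ccps, tt.total_amt, tt.total_count
order by tr.trader_id, tr.ccp
"""
rows = conn.execute(query).fetchall()
```

The first row of rows is Jules Winnfield's BARC line, carrying the trader-level subtotals (3 counterparties, 35000 total, 4 trades) next to the per-counterparty figures.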
I have 2 tables.
Table 1 is a temp variable table:
declare @Temp as table ( proj_num varchar(10), sum_dom decimal(23,8))
My temp table is populated with a list of project numbers, and a month end accounting dollar amount.
For example:
proj_num | sum_dom
11522 | 2477.15
11524 | 26474.20
41865 | 9012.10
Table 2 is a Project Transactions table.
We're concerned with just the following columns:
proj_num
amount
cost_code
tran_date
Individual values will look something like this:
proj_num | cost_code | amount | tran_date
11522 | LBR | 112.10 | 10/1/2018
11522 | LBR | 1765.90 | 10/2/2018
11522 | MAT | 599.15 | 10/3/2018
11522 | FRT | 57.50 | 10/4/2018
So for this project, since the grand total of $2477.15 is met on 10/3, example output would be:
proj_num | cost_code | amount
11522 | LBR | 1878.00
11522 | MAT | 599.15
I want to sum the amounts (grouped by cost_code, and ordered by tran_date) under the project transaction table until the total sum of values for that project value matches the value in the sum_dom column of the temp table, at which point I will output that data.
Can you help me figure out how to write the query to do that?
I know I should avoid cursors, but I haven't had much luck with my attempts so far. I can't seem to get it to keep a running total.
Running sum is done using SUM(...) OVER (ORDER BY ...). You just need to tell where to stop:
SELECT sq.*
FROM projects
INNER JOIN (
SELECT
proj_num,
cost_code,
amount,
SUM(amount) OVER (PARTITION BY proj_num ORDER BY tran_date) AS running_sum
FROM project_transactions
) AS sq ON projects.proj_num = sq.proj_num
WHERE running_sum <= projects.sum_dom
DB Fiddle
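The query above returns the individual transactions up to the cut-off, while the expected output in the question also sums them per cost_code. A minimal Python sketch of the whole idea, assuming a single project and rows already ordered by tran_date (Decimal is used so the cents compare exactly):

```python
from decimal import Decimal
from itertools import accumulate

# Month-end target per project (the temp table in the question).
sum_dom = {"11522": Decimal("2477.15")}

# (proj_num, cost_code, amount, tran_date), already ordered by tran_date.
transactions = [
    ("11522", "LBR", Decimal("112.10"), "2018-10-01"),
    ("11522", "LBR", Decimal("1765.90"), "2018-10-02"),
    ("11522", "MAT", Decimal("599.15"), "2018-10-03"),
    ("11522", "FRT", Decimal("57.50"), "2018-10-04"),
]

# Running total, like SUM(amount) OVER (PARTITION BY proj_num ORDER BY tran_date).
running = list(accumulate(amount for _, _, amount, _ in transactions))

# Keep rows while the running total has not passed the target (the WHERE clause).
kept = [tx for tx, total in zip(transactions, running) if total <= sum_dom[tx[0]]]

# Extra step for the expected output: sum the kept rows per cost_code.
per_cost_code = {}
for proj_num, cost_code, amount, _ in kept:
    key = (proj_num, cost_code)
    per_cost_code[key] = per_cost_code.get(key, Decimal("0")) + amount
```

In SQL the last step would correspond to wrapping the filtered rows in an outer query with GROUP BY proj_num, cost_code.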
I have a table containing records of Users' internet history. The table's structure contains the User_ID, the Page Accessed, and the Date Accessed of the page. For Example:
+==========================================+
|User_ID | Page_Accessed | Date_Accessed |
+==========================================+
|Johh.Doe | Google | 1/1/2015 |
|Johh.Doe | Google | 1/1/2015 |
|Suzy.Lue | Google | 7/11/2015 |
|Suzy.Lue | Wikipedia | 4/23/2015 |
|Babe Ruth| StackOverflow | 9/1/2015 |
+==========================================+
I am currently trying to use a SQL query that uses:
RANK() OVER (PARTITION BY [Page Accessed] ORDER BY Count(DateAcc))
Then I use a PIVOT() by the various sites. However, after selecting the records WHERE (Num = 1) from the PIVOT() and a GROUP BY [Rank], I end up with a result similar to:
+=================================================+
|Rank | Google | Wikipedia | StackOverflow |
+=================================================+
| 1 | John Doe| NULL | NULL |
| 1 | NULL | Suzy Lue | NULL |
| 1 | NULL | NULL | Babe Ruth |
+=================================================+
Instead I need to reformat my output as:
+=================================================+
|Rank | Google | Wikipedia | StackOverflow |
+=================================================+
| 1 | John Doe| Suzy Lue | Babe Ruth |
+=================================================+
My Current Query:
SELECT Rank, Google, Wikipedia, StackOverflow
FROM(
SELECT TOP (100) PERCENT User_ID, Page_Accessed, COUNT(Date_Accessed) AS Views,
RANK() OVER (PARTITION BY Page_Accessed ORDER BY Count(Date_Accessed) DESC) AS Rank
FROM Record_Table
GROUP BY dbo.location_key.subSite, dbo.user_info_list_parse.Name
ORDER BY Views DESC) AS tb
PIVOT (
max(tb.User_ID) FOR
Page_Accessed IN ( Google, Wikipedia, StackOverflow)
) pvt
WHERE (Num = 1)
Are there any creative solutions to obtain this result?
I think you've already found a solution, but for your information and for others reading this, let me strip the noise from this query. There is no need for the ORDER BY or for TOP (100) PERCENT, and the Views column is redundant. I would simplify the query as follows:
CREATE TABLE InternetHistory
(
[User_ID] varchar(20),
[Page_Accessed] varchar(20),
[Date_Accessed] datetime
)
INSERT InternetHistory VALUES
('Johh.Doe', 'Google', '2015-01-01'),
('Johh.Doe', 'Google', '2015-01-01'),
('Suzy.Lue', 'Google', '2015-07-11'),
('Suzy.Lue', 'Wikipedia', '2015-04-23'),
('Babe Ruth', 'StackOverflow', '2015-09-01')
SELECT * FROM
(
SELECT [User_ID], [Page_Accessed], RANK() OVER (PARTITION BY [Page_Accessed] ORDER BY COUNT(*) DESC) Ranking
FROM InternetHistory
GROUP BY [User_ID], [Page_Accessed]
) AS Src
PIVOT
(
MAX([User_Id]) FOR [Page_Accessed] IN ([Google], [Wikipedia], [StackOverflow])
) AS Pvt
WHERE Ranking = 1
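Dropping the Views column matters because PIVOT implicitly groups by every source column it does not consume, so an extra column can split the result into several rows. The Python sketch below mirrors the three stages of the simplified query (count, keep rank 1, pivot) on the sample data. The names views, top_by_page and pivot_row are illustrative only.

```python
from collections import Counter

# (User_ID, Page_Accessed) pairs from the sample table; dates are not needed here.
history = [
    ("Johh.Doe", "Google"),
    ("Johh.Doe", "Google"),
    ("Suzy.Lue", "Google"),
    ("Suzy.Lue", "Wikipedia"),
    ("Babe Ruth", "StackOverflow"),
]

# GROUP BY User_ID, Page_Accessed with COUNT(*).
views = Counter(history)

# RANK() ... = 1: per page, keep the user(s) with the highest view count.
top_by_page = {}
for (user, page), n in views.items():
    best = max(c for (_, p), c in views.items() if p == page)
    if n == best:
        top_by_page.setdefault(page, []).append(user)

# PIVOT with MAX(User_ID): a single row with one column per page.
pivot_row = {page: max(users) for page, users in top_by_page.items()}
```

If two users tie for first place on a page, MAX(User_ID) keeps only the alphabetically last one, which the sketch reproduces with max(users).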
I'm stuck on the problem of finding daily profits from a db (MS Access) table. The difference from other tips I found online is that my table doesn't have one field for "Price" and one for "Cost", but a field "Type" which distinguishes whether a row is a revenue "S" or a cost "C".
This is the table "Record":
| Date | Price | Quantity | Type |
-----------------------------------
|01/02 | 20 | 2 | C |
|01/02 | 10 | 1 | S |
|01/02 | 3 | 10 | S |
|01/02 | 5 | 2 | C |
|03/04 | 12 | 3 | C |
|03/03 | 200 | 1 | S |
|03/03 | 120 | 2 | C |
So far I tried different solutions like:
SELECT
(SELECT SUM (RS.Price* RS.Quantity)
FROM Record RS WHERE RS.Type='S' GROUP BY RS.Data
) as totalSales,
(SELECT SUM (RC.Price*RC.Quantity)
FROM Record RC WHERE RC.Type='C' GROUP BY RC.Date
) as totalLosses,
ROUND(totalSales-totaleLosses,2) as NetTotal,
R.Date
FROM RECORD R";
in my mind it could work but obviously it doesn't
and
SELECT RC.Data, ROUND(SUM (RC.Price*RC.QuantitY),2) as DailyLoss
INTO #DailyLosses
FROM Record RC
WHERE RC.Type='C' GROUP BY RC.Date
SELECT RS.Date, ROUND(SUM (RS.Price*RS.Quantity),2) as DailyRevenue
INTO #DailyRevenues
FROM Record RS
WHERE RS.Type='S'GROUP BY RS.Date
SELECT Date, DailyRevenue - DailyLoss as DailyProfit
FROM #DailyLosses dlos, #DailyRevenues drev
WHERE dlos.Date = drev.Date";
My problem beyond the correct syntax is the approach to this kind of problem
You can use grouping and conditional summing. Try this:
SELECT data.Date, data.Income - data.Cost as Profit
FROM (
SELECT Record.Date as Date,
SUM(IIF(Record.Type = 'S', Record.Price * Record.Quantity, 0)) as Income,
SUM(IIF(Record.Type = 'C', Record.Price * Record.Quantity, 0)) as Cost
FROM Record
GROUP BY Record.Date
) data
In this case you first create a sub-query to get separate fields for Income and Cost, and then your outer query uses subtraction to get actual profit.
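To check the conditional-sum approach on the sample rows, here is a small sketch run through Python's sqlite3 module. It uses CASE WHEN in place of Access's IIF, and the table and column names (record, trade_date) are lowercase stand-ins chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table record (trade_date text, price real, quantity integer, type text)"
)
conn.executemany(
    "insert into record values (?, ?, ?, ?)",
    [
        ("01/02", 20, 2, "C"),
        ("01/02", 10, 1, "S"),
        ("01/02", 3, 10, "S"),
        ("01/02", 5, 2, "C"),
        ("03/04", 12, 3, "C"),
        ("03/03", 200, 1, "S"),
        ("03/03", 120, 2, "C"),
    ],
)

# One pass over the table: sales and costs are separated inside the aggregates.
rows = conn.execute(
    """
    select trade_date,
           sum(case when type = 'S' then price * quantity else 0 end)
         - sum(case when type = 'C' then price * quantity else 0 end) as profit
    from record
    group by trade_date
    order by trade_date
    """
).fetchall()
```

A day with only costs (03/04 here) still appears, with a negative profit, which the join of two temporary tables in the question would drop.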
I'm facing a challenging request that's had me beating my head against the keyboard. I need to implement a script which will sort and summarize a dataset while accounting for overlapping values which are associated with different identifiers. The table from which I am selecting contains the following columns:
BoxNumber (Need to group by this value, which serves as the identifier)
ProdBeg (Contains the first 'page number' for the document/record)
ProdEnd (Contains the last 'page number' for the document/record)
DateProduced (Date the document was produced)
ArtifactID (Unique identifier for each document)
NumPages (Contains the number of pages associated with each document)
Selecting a sample of the data with no conditions resembles the following (sorry for lousy formatting):
BoxNumber | ProdBeg | ProdEnd | DateProduced | ArtifactID | NumPages
1200 | ABC01 | ABC10 | 12/4/2013 | 1564589 | 10
1201 | ABC11 | ABC20 | 12/4/2013 | 1498658 | 10
1200 | ABC21 | ABC30 | 12/4/2013 | 1648596 | 10
1200 | ABC31 | ABC40 | 12/4/2013 | 1489535 | 10
Using something like the following effectively groups and sorts the data by box number while accounting for different DateProduced dates, but does not account for overlapping ProdBeg/ProdEnd values between different BoxNumbers:
SELECT BoxNumber, MIN(ProdBeg) AS 'ProdBeg', MAX(ProdEnd) AS 'ProdEnd', DateProduced, COUNT(ArtifactID) AS 'Documents', SUM(NumPages) AS 'Pages'
FROM MyTable
GROUP BY BoxNumber, DateProduced
ORDER BY ProdBeg, ProdEnd
This yields:
BoxNumber | ProdBeg | ProdEnd| DateProduced | Documents| Pages
1200 | ABC01 | ABC40 | 12/4/2013 | 3 | 30
1201 | ABC11 | ABC20 | 12/4/2013 | 1 | 10
Here, it becomes apparent that the ProdBeg/ProdEnd values for box 1201 overlap those for box 1200. No variation on the script above will work, as it will inherently ignore any overlaps and only select the min/max. We need something which will produce the following result:
BoxNumber | ProdBeg | ProdEnd | DateProduced | Documents| Pages
1200 | ABC01 | ABC10 | 12/4/2013 | 1 | 10
1201 | ABC11 | ABC20 | 12/4/2013 | 1 | 10
1200 | ABC21 | ABC40 | 12/4/2013 | 2 | 20
I'm just not sure how we can group by box number without showing only distinct values (which can result in overlaps for ProdBeg/ProdEnd). Any suggestions would be greatly appreciated! The environment version is SQL 2008 R2 (SP1).
Yuch. This would be helped at least if you had lead()/lag() as in SQL Server 2012. But it is doable.
The idea is the following:
Add a variable that is the number part of the code (the last two digits).
For each document, look up the end number of the previous document in the same box.
Calculate a flag that marks a gap between that previous end number and the current start number. This is the start of a "group".
Calculate the cumulative sum of the "start of a group" flag. This is a group id.
Do the aggregation.
The following query follows this logic. I didn't include the date produced. This seems redundant with the number, unless a box can appear on multiple days. (Adding the date produced is just a matter of adding the condition to the where clauses.) The resulting query is:
with bp as (
      -- numeric part of the page codes (assumes the last two characters are digits)
      select t.*,
             cast(right(ProdBeg, 2) as int) as pbeg,
             cast(right(ProdEnd, 2) as int) as pend
      from MyTable t
     ),
     bp1 as (
      -- last page of the previous document in the same box
      select bp.*,
             (select top 1 prv.pend
              from bp prv
              where prv.pbeg < bp.pbeg and prv.BoxNumber = bp.BoxNumber
              order by prv.pbeg desc
             ) as prevpend
      from bp
     ),
     bp2 as (
      -- running count of group starts, up to and including the current row
      select bp1.*,
             (select sum(case when bp1a.prevpend = bp1a.pbeg - 1 then 0 else 1 end)
              from bp1 bp1a
              where bp1a.pbeg <= bp1.pbeg and bp1a.BoxNumber = bp1.BoxNumber
             ) as groupid
      from bp1
     )
select BoxNumber, MIN(ProdBeg) AS ProdBeg, MAX(ProdEnd) AS ProdEnd,
       COUNT(ArtifactID) AS 'Documents', SUM(NumPages) AS 'Pages'
FROM bp2
GROUP BY BoxNumber, groupid
ORDER BY ProdBeg, ProdEnd;
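The steps above can also be traced outside SQL. The following Python sketch applies the same gap-based grouping to the sample rows from the question. It assumes, like the query, that the numeric part of a code is its last two characters; page_no and summary are names invented for the example.

```python
from itertools import groupby

# (BoxNumber, ProdBeg, ProdEnd, NumPages) from the question's sample.
docs = [
    (1200, "ABC01", "ABC10", 10),
    (1201, "ABC11", "ABC20", 10),
    (1200, "ABC21", "ABC30", 10),
    (1200, "ABC31", "ABC40", 10),
]


def page_no(code):
    # Assumption: the numeric part is the last two characters of the code.
    return int(code[-2:])


summary = []  # rows of [BoxNumber, ProdBeg, ProdEnd, Documents, Pages]
ordered = sorted(docs, key=lambda d: (d[0], page_no(d[1])))
for box, rows in groupby(ordered, key=lambda d: d[0]):
    prev_end = None
    for _, beg, end, pages in rows:
        if prev_end is None or page_no(beg) != prev_end + 1:
            # Gap to the previous document in this box: start a new group.
            summary.append([box, beg, end, 1, pages])
        else:
            # Contiguous with the previous document: extend the current group.
            group = summary[-1]
            group[2] = end
            group[3] += 1
            group[4] += pages
        prev_end = page_no(end)

# Same ordering as the query's ORDER BY ProdBeg.
summary.sort(key=lambda g: g[1])
```

On the sample data this yields three rows: box 1200 for ABC01 to ABC10, box 1201 for ABC11 to ABC20, and box 1200 again for ABC21 to ABC40 with 2 documents and 20 pages.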
I am having trouble finding what date my customers hit a certain threshold in how much money they make.
customer_id | Amount | created_at
---------------------------
1134 | 10 | 01.01.2010
1134 | 15 | 02.01.2010
1134 | 5 | 03.24.2010
1235 | 10 | 01.03.2010
1235 | 15 | 01.03.2010
1235 | 30 | 01.03.2010
1756 | 50 | 01.05.2010
1756 | 100 | 01.25.2010
To determine how much total amount they made I run a simple query like this:
SELECT customer_id, SUM(amount)
FROM table GROUP BY customer_id
But I need to be able to find for e.g. the date a customer hits $100 in total amount.
Any help is greatly appreciated. Thanks!
Jesse,
I believe you are looking for a version of "running total" calculation.
Take a look at this post: calculate-a-running-total.
There are a number of useful links there.
This article has a lot of code that you could reuse as well: http://www.sqlteam.com/article/calculating-running-totals.
Something like a HAVING clause:
SELECT customer_id, SUM(amount) as Total FROM table GROUP BY customer_id having Total > 100
I'm not sure if MySQL supports subqueries, so take this with a grain of salt:
SELECT customer_id
, MIN(created_at) AS FirstDate
FROM ( SELECT o.customer_id
            , o.created_at
            , ( SELECT SUM(t.amount)
                FROM [Table] t
                WHERE t.customer_id = o.customer_id
                AND t.created_at <= o.created_at
              ) AS RunTot
       FROM [Table] o
     ) x
WHERE x.RunTot >= 100
GROUP BY customer_id
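As a quick check of the correlated-subquery approach, the sketch below runs an equivalent query through Python's sqlite3 module on the question's sample data. The table name payments and the ISO date strings are assumptions made for the example (the question's dates are read as month.day.year).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table payments (customer_id integer, amount integer, created_at text)"
)
conn.executemany(
    "insert into payments values (?, ?, ?)",
    [
        (1134, 10, "2010-01-01"),
        (1134, 15, "2010-02-01"),
        (1134, 5, "2010-03-24"),
        (1235, 10, "2010-01-03"),
        (1235, 15, "2010-01-03"),
        (1235, 30, "2010-01-03"),
        (1756, 50, "2010-01-05"),
        (1756, 100, "2010-01-25"),
    ],
)

# For each row, the subquery sums everything the customer earned up to that date;
# the outer query then takes the earliest date on which the total reached 100.
rows = conn.execute(
    """
    select customer_id, min(created_at) as first_date
    from (select o.customer_id, o.created_at,
                 (select sum(t.amount)
                  from payments t
                  where t.customer_id = o.customer_id
                    and t.created_at <= o.created_at) as run_tot
          from payments o) x
    where x.run_tot >= 100
    group by customer_id
    """
).fetchall()
```

Only customer 1756 reaches the threshold in the sample, on 2010-01-25. Customers who never reach it do not appear in the result.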