We raise bills to clients on various dates, and payments are received irregularly. We need to calculate the payment delay in days until full payment is received for a particular amount due. The data below is sample data for a single client, 0123.
table Due (id, fil varchar(12), amount numeric(10, 2), date DATE)
table Received (id, fil varchar(12), amount numeric(10, 2), date DATE)
Table Due:
id | fil | amount | date
1 | 0123 | 1000 | 2019-jan-01
2 | 0123 | 1500 | 2019-jan-15
3 | 0123 | 1200 | 2019-jan-25
4 | 0123 | 1800 | 2019-feb-10
Table Received:
id | fil | amount | date
1 | 0123 | 1000 | 2019-jan-10
2 | 0123 | 500 | 2019-jan-20
3 | 0123 | 1300 | 2019-jan-25
4 | 0123 | 400 | 2019-feb-08
5 | 0123 | 1000 | 2019-feb-20
The joined table should show:
fil | due_date | due_amount | received_amount | date | delay
0123 | 2019-jan-01 | 1000 | 1000 | | 9
0123 | 2019-jan-15 | 1500 | 500 | |
0123 | | | 1300 | | 10 (since payment completed on 25th jan)
0123 | 2019-jan-25 | 1200 | 400 | |
0123 | | | 1000 | | 26
0123 | 2019-feb-10 | 1800 | | |
I have tried to be as accurate as possible in the calculations. Please excuse any inadvertent errors. I was just coming around to writing a script to do this, but maybe someone will be able to suggest a proper join.
As @DavidHempy said, this is not possible without knowing which invoice each payment is meant for. You can calculate how many days it has been since the account balance was last at or above 0, which might help:
with all_activity as (
    -- bills count as negative amounts, payments as positive
    select date,
           -1 * amount as amount
    from due
    union all
    select date,
           amount
    from received),
totals as (
    select date,
           amount,
           sum(amount) over (order by date),
           case when sum(amount) over (order by date) >= 0
                then true
                else false
           end as nothing_owed
    from all_activity)
select date,
       amount,
       sum,
       date - max(date) filter (where nothing_owed = true) over (order by date)
           as days_since_positive
from totals order by 1
date | amount | sum | days_since_positive
2019-01-01 | -1000.00 | -1000.00 |
2019-01-10 | 1000.00 | 0.00 | 0
2019-01-15 | -1500.00 | -1500.00 | 5
2019-01-20 | 500.00 | -1000.00 | 10
2019-01-25 | -1200.00 | -900.00 | 15
2019-01-25 | 1300.00 | -900.00 | 15
2019-02-08 | 400.00 | -500.00 | 29
2019-02-10 | -1800.00 | -2300.00 | 31
2019-02-20 | 1000.00 | -1300.00 | 41
(9 rows)
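If you are willing to assume that every payment is applied to the oldest unpaid bill first (an assumption, since the data does not say which invoice a payment belongs to), the delays in the question can be reproduced with a short script. This is only a sketch: `fifo_delays`, `DUES` and `PAYMENTS` are names made up for illustration, and amounts are plain integers rather than numeric(10, 2).

```python
from datetime import date

def fifo_delays(dues, payments):
    """Allocate payments to dues oldest-first and report days to full settlement.

    dues, payments: lists of (date, amount) tuples, each sorted by date.
    Returns (due_date, due_amount, settled_on, delay_days) per due;
    settled_on and delay_days are None while a due is still open.
    """
    results = []
    pending = iter(payments)
    credit = 0          # unallocated part of the payment being consumed
    credit_date = None  # date of that payment
    for due_date, due_amount in dues:
        remaining = due_amount
        settled_on = None
        while remaining > 0:
            if credit == 0:
                nxt = next(pending, None)
                if nxt is None:
                    break  # no money left: this due stays open
                credit_date, credit = nxt
            used = min(credit, remaining)
            credit -= used
            remaining -= used
            if remaining == 0:
                settled_on = credit_date
        delay = (settled_on - due_date).days if settled_on is not None else None
        results.append((due_date, due_amount, settled_on, delay))
    return results

# Sample data for client 0123, copied from the question.
DUES = [(date(2019, 1, 1), 1000), (date(2019, 1, 15), 1500),
        (date(2019, 1, 25), 1200), (date(2019, 2, 10), 1800)]
PAYMENTS = [(date(2019, 1, 10), 1000), (date(2019, 1, 20), 500),
            (date(2019, 1, 25), 1300), (date(2019, 2, 8), 400),
            (date(2019, 2, 20), 1000)]

for row in fifo_delays(DUES, PAYMENTS):
    print(row)
```

With the sample data this gives delays of 9, 10 and 26 days for the first three bills and leaves the fourth open, which matches the table in the question.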
SQL Aggregate Over Date Range

Whoever answers this thank you so, so much!
Here's a little snippet of my data:
DATE Score Multiplier Weighting
2022-01-05 3 4 7
2022-01-05 4 7 8
2022-01-06 5 2 4
2022-01-06 3 4 7
2022-01-06 4 7 8
2022-01-07 5 2 4
Each row of this data is when something "happened" and multiple events occur during the same day.
What I need to do is take the rolling average of this data over the past 3 months.
So for ONLY 2022-01-05, my weighted average (called ADJUSTED) would be:
2022-01-05 [(3*4) + (4*7)]/(7+8)
Except I need to do this over the previous 3 months (so on Jan 5, 2022, I'd need the rolling weighted average -- using the "Weighting" column -- over the preceding 3 months; can also use previous 90 days if that makes it easier).
Not sure if this is a clear enough description, but would appreciate any help.
Thank you!
If I have interpreted this correctly, I believe a GROUP BY query will meet the need.
Sample data:
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-05',3,4,7);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-05',4,7,8);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-06',5,2,4);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-06',3,4,7);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-06',4,7,8);
INSERT INTO mytable(DATE,Score,Multiplier,Weighting) VALUES ('2022-01-07',5,2,4);
select date
, sum(score) sum_score
, sum(multiplier) sum_multiplier
, sum(weighting) sum_weight
, (sum(score)*1.0 + sum(multiplier)*1.0) / (sum(weighting)*1.0) ADJUSTED
from mytable
group by date
| date | sum_score | sum_multiplier | sum_weight | ADJUSTED |
| 2022-01-05 | 7 | 11 | 15 | 1.200000000000000 |
| 2022-01-06 | 12 | 13 | 19 | 1.315789473684210 |
| 2022-01-07 | 5 | 2 | 4 | 1.750000000000000 |
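For comparison, here is a sketch of the rolling version as the question words it, where each row contributes Score * Multiplier to the numerator and Weighting to the denominator, taken over the preceding 90 days including the current day. That reading of the formula and the 90-day window are assumptions taken from the question's own example, and `rolling_adjusted` and `SAMPLE` are illustrative names.

```python
from datetime import date, timedelta

def rolling_adjusted(rows, window_days=90):
    """rows: (day, score, multiplier, weighting) tuples.

    For every distinct day, returns
    sum(score * multiplier) / sum(weighting)
    over rows dated within the last `window_days` days, that day included.
    """
    result = {}
    for day in sorted({r[0] for r in rows}):
        start = day - timedelta(days=window_days)
        window = [r for r in rows if start < r[0] <= day]
        numerator = sum(score * mult for _, score, mult, _ in window)
        denominator = sum(weight for _, _, _, weight in window)
        result[day] = numerator / denominator if denominator else None
    return result

# The six sample rows from the question.
SAMPLE = [
    (date(2022, 1, 5), 3, 4, 7),
    (date(2022, 1, 5), 4, 7, 8),
    (date(2022, 1, 6), 5, 2, 4),
    (date(2022, 1, 6), 3, 4, 7),
    (date(2022, 1, 6), 4, 7, 8),
    (date(2022, 1, 7), 5, 2, 4),
]

for day, adjusted in rolling_adjusted(SAMPLE).items():
    print(day, round(adjusted, 4))
```

For 2022-01-05 alone this evaluates to (3*4 + 4*7) / (7 + 8) = 40/15, the figure given in the question. Later days also include the earlier rows, because they fall inside the window.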
Progress query to remove duplicates based on number of duplicates

Our accounting department needs to pull tax data from our MIS every month and submit it online to the Dept. of Revenue. Unfortunately, when the data is pulled, each record is duplicated a varying number of times depending on which jurisdictions we have to pay taxes to. All the accountant needs is the dollar amount for one jurisdiction, for one line, because she enters that on the website.
I've tried using DISTINCT to pull only one record of each type, in conjunction with LEFT() to pull just the first 7 characters of the jurisdiction, but it ended up excluding certain results that should have been included. I believe this was because the posting date and the amount on a couple of transactions were identical. They were separate transactions, but the query treated them as duplicates and ignored them.
Here is one of the queries I've run. My attempts pull most of the data, but usually either too much or not enough:
SELECT DISTINCT LEFT("Sales-Tax-Jurisdiction-Code", 7), "Taxable-Base", "Posting-Date"
FROM ARInvoiceTax
WHERE ("Posting-Date" >= '2019-09-01' AND "Posting-Date" <= '2019-09-30')
AND (("Sales-Tax-Jurisdiction-Code" BETWEEN '55001' AND '56763')
OR "Sales-Tax-Jurisdiction-Code" = 'Dakota Cty TT')
ORDER BY "Sales-Tax-Jurisdiction-Code"
Here is a query that I can use to pull all of the data; the result is shown below it:
SELECT "Sales-Tax-Jurisdiction-Code", "Taxable-Base", "Posting-Date"
FROM ARInvoiceTax
WHERE ("Posting-Date" >= '2019-09-01' AND "Posting-Date" <= '2019-09-30')
AND (("Sales-Tax-Jurisdiction-Code" BETWEEN '55001' AND '56763')
OR "Sales-Tax-Jurisdiction-Code" = 'Dakota Cty TT')
ORDER BY "Sales-Tax-Jurisdiction-Code"
Below is a sample of the output:
Jurisdiction | Tax Amount | Posting Date
5512100City | $50.00 | 2019-09-02
5512100City | $50.00 | 2019-09-03
5512100City | $70.00 | 2019-09-02
5512100Cnty | $50.00 | 2019-09-02
5512100Cnty | $50.00 | 2019-09-03
5512100Cnty | $70.00 | 2019-09-02
5512100State | $70.00 | 2019-09-02
5512100State | $50.00 | 2019-09-02
5512100State | $50.00 | 2019-09-03
5513100Cnty | $25.00 | 2019-09-12
5513100State | $25.00 | 2019-09-12
5514100City | $9.00 | 2019-09-06
5514100City | $9.00 | 2019-09-06
5514100Cnty | $9.00 | 2019-09-06
5514100Cnty | $9.00 | 2019-09-06
5515100State | $12.00 | 2019-09-11
5516100City | $6.00 | 2019-09-13
5516100City | $7.00 | 2019-09-13
5516100State | $6.00 | 2019-09-13
5516100State | $7.00 | 2019-09-13
As you can see, the data can be all over the place. One zip code can have multiple different lines. What the accounting department does now is print a report with this information and, in a spreadsheet, record only one dollar amount per transaction. For example, for 55121, she would need to record $50.00, $50.00 and $70.00 (she tallies them and enters the total on the website); however, the SQL query gives me those three numbers three times.
I can't seem to figure out a query that will pull only one set of the data. Unfortunately, I can't do it based on the words/letters after the 00 because not all jurisdictions have all 3 (city, cnty, state) and thus trying to remove lines based on that removes valid lines as well.
Can you use select distinct? If the first five characters are the zip code and you just want that:
select distinct left(jurisdiction, 5), tax_amount
from t;
Take only City/Cnty/State, whichever sorts first for each 7-character prefix:
select jurisdiction, tax_amount, Posting_Date
from (
select *, dense_rank() over(partition by left(jurisdiction, 7) order by substring(jurisdiction, 8, len(jurisdiction))) rnk
from taxes -- your output here
) t
where rnk=1;
This is SQL Server syntax; you may need other string functions in your DBMS.
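The same idea can be checked outside the database. The sketch below assumes, as the query above does, that the first 7 characters identify the area and that keeping the alphabetically first suffix per prefix is acceptable. `one_jurisdiction_per_prefix` and `SAMPLE_TAX` are illustrative names, and the rows are a subset of the sample output from the question.

```python
def one_jurisdiction_per_prefix(rows, prefix_len=7):
    """rows: (jurisdiction_code, amount, posting_date) tuples.

    Keeps, for every prefix, only the rows whose suffix (City, Cnty, State, ...)
    sorts first, so each transaction is reported once.
    """
    first_suffix = {}
    for code, _, _ in rows:
        prefix, suffix = code[:prefix_len], code[prefix_len:]
        if prefix not in first_suffix or suffix < first_suffix[prefix]:
            first_suffix[prefix] = suffix
    return [
        (code[:prefix_len], amount, posted)
        for code, amount, posted in rows
        if code[prefix_len:] == first_suffix[code[:prefix_len]]
    ]

SAMPLE_TAX = [
    ("5512100City", 50.00, "2019-09-02"),
    ("5512100City", 50.00, "2019-09-03"),
    ("5512100City", 70.00, "2019-09-02"),
    ("5512100Cnty", 50.00, "2019-09-02"),
    ("5512100Cnty", 50.00, "2019-09-03"),
    ("5512100Cnty", 70.00, "2019-09-02"),
    ("5512100State", 70.00, "2019-09-02"),
    ("5512100State", 50.00, "2019-09-02"),
    ("5512100State", 50.00, "2019-09-03"),
    ("5513100Cnty", 25.00, "2019-09-12"),
    ("5513100State", 25.00, "2019-09-12"),
]

for row in one_jurisdiction_per_prefix(SAMPLE_TAX):
    print(row)
```

Unlike SELECT DISTINCT, this keeps two separate transactions that happen to share an amount and a posting date, because rows are filtered by suffix rather than collapsed by value.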
Query a table so that data in one column could be shown as different fields

I have a table that stores customer care data. The table/view has the following structure.
userid calls_received calls_answered calls_rejected call_date
1030 134 100 34 28-05-2018
1012 140 120 20 28-05-2018
1045 120 80 40 28-05-2018
1030 99 39 50 28-04-2018
1045 50 30 20 28-04-2018
1045 200 100 100 28-05-2017
1030 160 90 70 28-04-2017
1045 50 30 20 28-04-2017
This is the sample data. The data is stored on a daily basis.
I have to create a report in a report designer that takes a date as input. When the user selects a date, e.g. 28/05/2018, it is sent as the parameter ${call_date}. I have to query the view so that the result looks like the table below: if the user selects 28/05/2018, then the data for 28/04/2018 and 28/05/2017 should be displayed side by side, in the column order shown.
userid | cl_cur | ans_cur | rej_cur |success_percentage |diff_percent|position_last_month| cl_last_mon | ans_las_mon | rej_last_mon |percentage_lm|cl_last_year | ans_last_year | rej_last_year
1030 | 134 | 100 | 34 | 74.6 % | 14% | 2 | 99 | 39 | 50 | 39.3% | 160 | 90 | 70
1045 | 120 | 80 | 40 | 66.6% | 26.7% | 1 | 50 | 30 | 20 | 60% | 50 | 30 | 20
The objective of this query is to show the data for the selected day, the same day in the previous month and the same day in the previous year in columns, so that the user can compare them. The result is ordered by the selected day's percentage (ans_cur/cl_cur), descending, and that percentage is shown under success_percentage.
The column position_last_month is the position of that employee in the previous month when ordered by percentage, descending. In this example, userid 1030 was in 2nd position last month and userid 1045 was in 1st position. I have to calculate the same for the previous year.
There is also a field called diff_percent, which is the difference in percentage from the person who was in the same position last month. I have to do the same for last year. How can I achieve this result? Please help.
One method is a join:
select t.userid,
       t.calls_received as cl_cur, t.calls_answered as ans_cur, t.calls_rejected as rej_cur,
       tm.calls_received as cl_last_mon, tm.calls_answered as ans_last_mon, tm.calls_rejected as rej_last_mon,
       ty.calls_received as cl_last_year, ty.calls_answered as ans_last_year, ty.calls_rejected as rej_last_year
from t left join
     t tm
     on tm.userid = t.userid and
        tm.call_date = dateadd(month, -1, t.call_date) left join
     t ty
     on ty.userid = t.userid and
        ty.call_date = dateadd(year, -1, t.call_date)
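The join covers the side-by-side columns but not success_percentage or position_last_month. As a rough sketch of how those two could be derived, the script below ranks last month's users by answered/received and attaches that rank to each current row. `shift_months`, `success_pct`, `report` and `CALLS` are made-up names, clamping the day to the end of shorter months is my assumption, and diff_percent is left out because the question does not define it precisely.

```python
import calendar
from datetime import date

def shift_months(d, months):
    """Same day `months` months away, clamped to the target month's last day."""
    idx = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(idx, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def success_pct(received, answered):
    """Answered calls as a percentage of received calls (0.0 if none received)."""
    return 100.0 * answered / received if received else 0.0

def report(rows, selected):
    """rows: (userid, received, answered, rejected, call_date) tuples."""
    by_day = {}
    for userid, rec, ans, rej, day in rows:
        by_day.setdefault(day, {})[userid] = (rec, ans, rej)
    cur = by_day.get(selected, {})
    last_mon = by_day.get(shift_months(selected, -1), {})
    last_year = by_day.get(shift_months(selected, -12), {})
    # Position last month: 1 = highest success percentage.
    ranked = sorted(last_mon, key=lambda u: success_pct(*last_mon[u][:2]), reverse=True)
    position = {u: i + 1 for i, u in enumerate(ranked)}
    out = []
    for u in sorted(cur, key=lambda u: success_pct(*cur[u][:2]), reverse=True):
        out.append({
            "userid": u,
            "cur": cur[u],
            "success_percentage": round(success_pct(*cur[u][:2]), 1),
            "position_last_month": position.get(u),  # None if no row last month
            "last_mon": last_mon.get(u),
            "last_year": last_year.get(u),
        })
    return out

# Sample rows from the question (dates given there as dd-mm-yyyy).
CALLS = [
    (1030, 134, 100, 34, date(2018, 5, 28)),
    (1012, 140, 120, 20, date(2018, 5, 28)),
    (1045, 120, 80, 40, date(2018, 5, 28)),
    (1030, 99, 39, 50, date(2018, 4, 28)),
    (1045, 50, 30, 20, date(2018, 4, 28)),
    (1045, 200, 100, 100, date(2017, 5, 28)),
    (1030, 160, 90, 70, date(2017, 4, 28)),
    (1045, 50, 30, 20, date(2017, 4, 28)),
]

for line in report(CALLS, date(2018, 5, 28)):
    print(line)
```

Users without a row for the earlier date get None, which mirrors the left join. In SQL the position could come from a window function such as rank() over the previous month's rows.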
SQL - Creating a timeline for each ID (Vertica)

I am dealing with the following problem in SQL (using Vertica):
In short -- Create a timeline for each ID (in a table where I have multiple lines, orders in my example, per ID)
What I would like to achieve -- At my disposal I have a table on historical order date and I would like to compute new customer (first order ever in the past month), active customer- (>1 order in last 1-3 months), passive customer- (no order for last 3-6 months) and inactive customer (no order for >6 months) rates.
Which steps I have taken so far -- I was able to construct a table similar to the example presented below:
CustomerID Current order date Time between current/previous order First order date (all-time)
001 2015-04-30 12:06:58 (null) 2015-04-30 12:06:58
001 2015-09-24 17:30:59 147 05:24:01 2015-04-30 12:06:58
001 2016-02-11 13:21:10 139 19:50:11 2015-04-30 12:06:58
002 2015-10-21 10:38:29 (null) 2015-10-21 10:38:29
003 2015-05-22 12:13:01 (null) 2015-05-22 12:13:01
003 2015-07-09 01:04:51 47 12:51:50 2015-05-22 12:13:01
003 2015-10-23 00:23:48 105 23:18:57 2015-05-22 12:13:01
A little bit of intuition: customer 001 placed three orders from which the second one was 147 days after its first order. Customer 002 has only placed one order in total.
What I think the next steps should be -- I would like to know, for each date (including dates on which a given user did not place an order) and for each CustomerID, how long it has been since his/her last order. This implies creating some sort of timeline for each CustomerID. In the example presented above I would get 287 lines (the days between 1st of May 2015 and 11th of February 2016, the timespan of this table) for each CustomerID. I am having difficulty with this step. Once it is done, I want to create a field which shows, at each date, the last order date, the period between the last order date and the current date, and what state someone is in at the current date. For the example presented earlier, this would look something like this:
CustomerID Last order date Current date Time between current date /last order State
001 2015-04-30 12:06:58 2015-05-01 00:00:00 0 00:00:00 New
001 2015-04-30 12:06:58 2015-06-30 00:00:00 60 11:53:02 Active
001 2015-09-24 17:30:59 2016-02-01 00:00:00 129 11:53:02 Passive
002 2015-10-21 17:30:59 2015-10-22 00:00:00 0 06:29:01 New
002 2015-10-21 17:30:59 2015-11-30 00:00:00 39 06:29:01 Active
003 2015-05-22 12:13:01 2015-06-23 00:00:00 31 11:46:59 Active
003 2015-07-09 01:04:51 2015-10-22 00:00:00 105 11:46:59 Inactive
Between the rows shown there should be all the in-between dates, but for the sake of space I have left these out of the table.
When I know for each date what the state is of each customer (active/passive/inactive) my plan is to sum the states and group by date which should give me the sum of new, active, passive and inactive customers. From here on I can easily compute the rates at each date.
Anybody that knows how I can possibly achieve this task?
Note -- If anyone has other ideas how to achieve the goal presented above (using some other approach compared to the approach I had in mind) please let me know!
Suppose you start from a table like this:
SQL> select * from ord order by custid, ord_date ;
custid | ord_date
1 | 2015-04-30 12:06:58
1 | 2015-09-24 17:30:59
1 | 2016-02-11 13:21:10
2 | 2015-10-21 10:38:29
3 | 2015-05-22 12:13:01
3 | 2015-07-09 01:04:51
3 | 2015-10-23 00:23:48
(7 rows)
You can use Vertica's Timeseries Analytic Functions TS_FIRST_VALUE(), TS_LAST_VALUE() to fill gaps and interpolate last_order date to the current date:
Then you just have to join this with a Vertica TIMESERIES generated from the same table, with an interval of one day, running from the day each customer placed his/her first order up to now (current_date):
select custid, status_dt, last_order_dt,
    case
        when status_dt::date - last_order_dt::date < 30 then case
            when nord = 1 then 'New' else 'Active' end
        when status_dt::date - last_order_dt::date < 90 then 'Active'
        when status_dt::date - last_order_dt::date < 180 then 'Passive'
        else 'Inactive'
    end as status
from (
    -- nord counts the orders seen so far for each customer
    select custid, last_order_dt, status_dt,
        conditional_true_event (first_order_dt is null or
            last_order_dt > lag(last_order_dt))
            over(partition by custid order by status_dt) as nord
    from (
        select custid,
            ts_first_value(ord_date) as first_order_dt ,
            ts_last_value(ord_date) as last_order_dt ,
            dt::date as status_dt
        from
            ( select custid, ord_date from ord
              union all
              select distinct(custid) as custid, current_date + 1 as ord_date from ord
            ) z timeseries dt as '1 day' over (partition by custid order by ord_date)
    ) x
) y
where status_dt <= current_date
order by 1, 2
And you will get something like this:
custid | status_dt | last_order_dt | status
1 | 2015-04-30 | 2015-04-30 12:06:58 | New
1 | 2015-05-01 | 2015-04-30 12:06:58 | New
1 | 2015-05-02 | 2015-04-30 12:06:58 | New
1 | 2015-05-29 | 2015-04-30 12:06:58 | New
1 | 2015-05-30 | 2015-04-30 12:06:58 | Active
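To make the classification rules easy to check by hand, here is a small sketch of the same CASE logic outside the database. It assumes the thresholds used in the query above (under 30 days is New for a first order and Active otherwise, under 90 is Active, under 180 is Passive, anything else is Inactive) and works on dates rather than timestamps. `status_on`, `timeline` and `CUST1` are illustrative names.

```python
from bisect import bisect_right
from datetime import date, timedelta

def status_on(order_dates, day):
    """order_dates: sorted list of order dates for one customer."""
    n = bisect_right(order_dates, day)  # orders placed up to and including `day`
    if n == 0:
        return None  # no order yet on this day
    since = (day - order_dates[n - 1]).days
    if since < 30:
        return "New" if n == 1 else "Active"
    if since < 90:
        return "Active"
    if since < 180:
        return "Passive"
    return "Inactive"

def timeline(order_dates, until):
    """Yield (day, status) for every day from the first order through `until`."""
    day = order_dates[0]
    while day <= until:
        yield day, status_on(order_dates, day)
        day += timedelta(days=1)

# Customer 1 from the sample table, with times dropped.
CUST1 = [date(2015, 4, 30), date(2015, 9, 24), date(2016, 2, 11)]

for day, status in timeline(CUST1, date(2015, 5, 3)):
    print(day, status)
```

Counting statuses per day across all customers then gives the new/active/passive/inactive totals the question asks for.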
a Rollup query with some logical netting using Oracle SQL

I have a table "AuctionResults" like below
Auction Action Shares ProfitperShare
Round1 BUY 6 200
Round2 BUY 5 100
Round2 SELL -2 50
Round3 SELL -5 80
Now I need to aggregate the results by auction, taking the BUYS and netting out the SELLS from subsequent rounds on a "First Come First Net" basis.
So in Round1 I bought 6 shares, then sold 2 in Round2 and the remaining 4 in Round3, for a total net profit of 6 * 200 - 2 * 50 - 4 * 80 = 780,
and in Round2 I bought 5 shares and sold 1 in Round3 (because the earlier 4 belonged to Round1), for a net profit of 5 * 100 - 1 * 80 = 420. The resulting output should look like:
Auction NetProfit
Round1 780
Round2 420
Can we do this using just Oracle SQL (10g) and not PL/SQL?
Thanks in advance
I know this is an old question and won't be of use to the original poster, but I wanted to take a stab at this because it was an interesting question. I didn't test it out enough, so I would expect this still needs to be corrected and tuned. But I believe the approach is legitimate. I would not recommend using a query like this in a product because it would be difficult to maintain or understand (and I don't believe this is really scalable). You would be much better off creating some alternate data structures. Having said that, this is what I ran in Postgresql 9.1:
WITH x AS (
SELECT round, action
      ,ABS(shares) AS shares
      ,profitpershare
      -- net shares held before this row
      ,COALESCE( SUM(shares) OVER(ORDER BY round, action
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                  AND 1 PRECEDING)
               , 0) AS previous_net_shares
      -- shares sold before this row
      ,COALESCE( ABS( SUM(CASE WHEN action = 'SELL' THEN shares ELSE 0 END)
                      OVER(ORDER BY round, action
                           ROWS BETWEEN UNBOUNDED PRECEDING
                           AND 1 PRECEDING) ), 0 ) AS previous_sells
FROM AuctionResults
)
SELECT round, shares * profitpershare - deduction AS net
FROM (
SELECT buy.round, buy.shares, buy.profitpershare
      ,SUM( LEAST( LEAST( sell.shares, GREATEST(buy.shares - (sell.previous_sells - buy.previous_sells), 0) )
                  ,GREATEST(sell.shares + (sell.previous_sells - buy.previous_sells) - buy.previous_net_shares, 0)
                 ) * sell.profitpershare ) AS deduction
FROM x buy
    ,x sell
WHERE sell.round > buy.round
AND buy.action = 'BUY'
AND sell.action = 'SELL'
GROUP BY buy.round, buy.shares, buy.profitpershare
) AS y
And the result:
round | net
1 | 780
2 | 420
(2 rows)
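As a cross-check on what "first come, first net" should produce, the rule can also be written as a short procedural sketch. It processes rows in the order given and charges each SELL against the oldest BUY that still has unsold shares. That ordering is my reading of the question, and `fifo_net_profit` and `QUESTION_ROWS` are illustrative names. It uses the four rows from the question, not the extended five-row data set below.

```python
from collections import deque

def fifo_net_profit(rows):
    """rows: (auction, action, shares, profit_per_share) in chronological order.

    Returns {auction: net profit}, where each SELL is deducted from the
    earliest BUY lot that still has shares left.
    """
    net = {}
    open_buys = deque()  # [auction, shares still unsold], oldest first
    for auction, action, shares, pps in rows:
        if action == "BUY":
            net[auction] = net.get(auction, 0) + shares * pps
            open_buys.append([auction, shares])
        else:
            to_sell = abs(shares)
            while to_sell > 0 and open_buys:
                lot = open_buys[0]
                used = min(lot[1], to_sell)
                net[lot[0]] -= used * pps
                lot[1] -= used
                to_sell -= used
                if lot[1] == 0:
                    open_buys.popleft()
    return net

QUESTION_ROWS = [
    ("Round1", "BUY", 6, 200),
    ("Round2", "BUY", 5, 100),
    ("Round2", "SELL", -2, 50),
    ("Round3", "SELL", -5, 80),
]

print(fifo_net_profit(QUESTION_ROWS))
```

For the question's data this yields 780 for Round1 and 420 for Round2, the same figures as the expected output.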
To break it down into pieces, I started with this data set:
CREATE TABLE AuctionResults( round int, action varchar(4), shares int, profitpershare int);
INSERT INTO AuctionResults VALUES(1, 'BUY', 6, 200);
INSERT INTO AuctionResults VALUES(2, 'BUY', 5, 100);
INSERT INTO AuctionResults VALUES(2, 'SELL',-2, 50);
INSERT INTO AuctionResults VALUES(3, 'SELL',-5, 80);
INSERT INTO AuctionResults VALUES(4, 'SELL', -4, 150);
select * from auctionresults;
round | action | shares | profitpershare
1 | BUY | 6 | 200
2 | BUY | 5 | 100
2 | SELL | -2 | 50
3 | SELL | -5 | 80
4 | SELL | -4 | 150
(5 rows)
The query in the "WITH" clause adds some running totals to the table.
"previous_net_shares" indicates how many shares are available to sell before the current record. This also tells me how many 'SELL' shares I need to skip before I can start allocating it to this 'BUY'.
"previous_sells" is a running count of the number of "SELL" shares encountered, so the difference between two "previous_sells" indicates the number of 'SELL' shares used in that time.
round | action | shares | profitpershare | previous_net_shares | previous_sells
1 | BUY | 6 | 200 | 0 | 0
2 | BUY | 5 | 100 | 6 | 0
2 | SELL | 2 | 50 | 11 | 0
3 | SELL | 5 | 80 | 9 | 2
4 | SELL | 4 | 150 | 4 | 7
(5 rows)
With this table, we can do a self-join where each "BUY" record is associated with each future "SELL" record. The result would look like this:
SELECT buy.round, buy.shares, buy.profitpershare
,sell.round AS sellRound, sell.shares AS sellShares, sell.profitpershare AS sellProfitpershare
FROM x buy
,x sell
WHERE sell.round > buy.round
AND buy.action = 'BUY'
AND sell.action = 'SELL'
round | shares | profitpershare | sellround | sellshares | sellprofitpershare
1 | 6 | 200 | 2 | 2 | 50
1 | 6 | 200 | 3 | 5 | 80
1 | 6 | 200 | 4 | 4 | 150
2 | 5 | 100 | 3 | 5 | 80
2 | 5 | 100 | 4 | 4 | 150
(5 rows)
And then comes the crazy part, which tries to calculate the number of shares available to sell in the order versus the number of shares not yet sold for a buy. Here are some notes to help follow that. The "greatest" calls with "0" just say that we can't allocate any shares if we are in the negative.
-- allocated sells
sell.previous_sells - buy.previous_sells
-- shares yet to sell for this buy, if < 0 then 0
GREATEST(buy.shares - (sell.previous_sells - buy.previous_sells), 0)
-- number of sell shares that need to be skipped
