Rolling 12 months SUM from latest available data with varchar YearMonth - sql

Sample Schema: ID (varchar 4), YM (varchar 6), Rev(Float) , P_K (ID,YM)
ID YM Rev
1001 201112 150
1001 201211 200
1001 201212 200
1001 201303 500
... ... ...
1001 201605 400
1023 201112 150
1023 201211 200
1023 201212 200
1023 201303 500
... ... ...
1023 201605 700
I need to create a VIEW in which every row has the SUM of Rev (revenue) for the preceding 12 months, counted back from that row's YM (year-month).
I have tried the following query; however, it returns data from '201512' backwards, whereas I was expecting the YM values to start from '201603', since that is the latest YM data available in the table.
SELECT
fs.ID,
fs.YM,
(SELECT SUM(fsi.Rev) FROM FSource fsi WHERE fsi.YM >= ((100* (fs.YM / 100) + 1)-100) AND fsi.YM <= fs.YM AND fsi.ID= fs.ID) AS Rev
FROM FSource fs
WHERE RIGHT(fs.ym, 2) = (SELECT COUNT(*) FROM FSource fsi WHERE fsi.YM >= ((100* (fs.ym / 100) + 1)-100) AND fsi.YM <= fs.ym AND fsi.ID= fs.ID)
Any help? What could possibly be going wrong here?
Expected Output:
ID YM Rev
... ... ...
1001 201601 1500
1001 201602 2400
1001 201603 2400
1001 201604 6000
1001 201605 4800 (assuming last 12 months were 400 each, date range "201506 to 201605")

You can approach this using a correlated subquery:
select fs.*,
(select sum(fs2.rev)
from fsource fs2
where fs2.id = fs.id and
cast(fs2.ym as int) <= cast(fs.ym as int) and
cast(fs2.ym as int) > cast(fs.ym as int) - 100
) as rev_12months
from fsource fs;
One problem with your query is this expression ((100* (fs.YM / 100) + 1)-100). For a value such as 201603, it turns into 201501, because SQL Server does integer division. Also, it is a bad idea to rely on implicit conversion.
The rows being returned are determined by the outer where. I have no idea what that logic is supposed to be doing.
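To make the arithmetic concrete, here is a small Python sketch (not SQL). The function names and the flat 400-per-month sample rows are assumptions made purely for illustration. It reproduces the lower bound produced by the original expression and the 12-month window used by the correlated subquery above.

```python
def asker_lower_bound(ym: int) -> int:
    """Mirror ((100 * (YM / 100) + 1) - 100) using integer division."""
    return (100 * (ym // 100) + 1) - 100


def rolling_12_month_sum(rows, target_id, target_ym):
    """Sum rev over the 12 months ending at target_ym for one id.

    Uses the same bounds as the correlated subquery:
    ym - 100 < other_ym <= ym.
    """
    hi = int(target_ym)
    lo = hi - 100
    return sum(rev for id_, ym, rev in rows
               if id_ == target_id and lo < int(ym) <= hi)


# Hypothetical data: id 1001 earns 400 every month from 201501 to 201605.
months = [f"2015{m:02d}" for m in range(1, 13)] + [f"2016{m:02d}" for m in range(1, 6)]
rows = [("1001", ym, 400.0) for ym in months]

print(asker_lower_bound(201603))                     # 201501
print(rolling_12_month_sum(rows, "1001", "201605"))  # 4800.0
```

With the original bounds, the window for 201603 would start at 201501 and so cover 15 months rather than 12.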

Related

Dividing a sum value into multiple rows due to field length constraint

I am migrating financial data from a very large table (100 million+ rows) by summarizing the amounts and inserting them into a summary table. I ran into a problem when a summary amount (3 billion) was larger than what the field in the summary table can hold (it can only hold up to 999 million). Changing the field size is not an option, as it requires a change process.
The only option I have is to divide any amount that breaches the size limit into smaller ones so it can be inserted into the table.
I came across the question "SQL - I need to divide a total value into multiple rows in another table", which is similar, except that the number of rows I need to insert is dynamic.
For simplicity, this is what the source table might look like:
account_table
acct_num | amt
-------------------------------
101 125.00
101 550.00
101 650.00
101 375.00
101 475.00
102 15.00
103 325.00
103 875.00
104 200.00
104 275.00
The summary records are as follows
select acct_num, sum(amt)
from account_table
group by acct_num
Account Summary
acct_num | amt
-------------------------------
101 2175.00
102 15.00
103 1200.00
104 475.00
Assuming the maximum value in the destination table is 1000.00, the expected output will be
summary_table
acct_num | amt
-------------------------------
101 1000.00
101 1000.00
101 175.00
102 15.00
103 1000.00
103 200.00
104 475.00
How do I create a query to get the expected result? Thanks in advance.
You need a numbers table. If you have a handful of values, you can define it manually. Otherwise, you might have one on hand or use a similar logic:
with n as (
select (rownum - 1) as n
from account_table
where rownum <= 10
),
a as (
select acct_num, sum(amt) as amt
from account_table
group by acct_num
)
select acct_num,
(case when (n.n + 1) * 1000 < amt then 1000
else amt - n.n * 1000
end) as amt
from a join
n
on n.n * 1000 < amt ;
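As a rough, runnable check of the numbers-table join, the sketch below runs the same idea through Python's sqlite3 module. This is an assumption-laden stand-in: SQLite has no rownum, so the numbers 0..9 come from a recursive CTE named nums (a name chosen here), and the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account_table (acct_num integer, amt real)")
conn.executemany("insert into account_table values (?, ?)", [
    (101, 125), (101, 550), (101, 650), (101, 375), (101, 475),
    (102, 15), (103, 325), (103, 875), (104, 200), (104, 275),
])

chunks = conn.execute("""
    with recursive nums(n) as (       -- numbers 0..9 (stand-in for rownum)
        select 0 union all select n + 1 from nums where n < 9
    ),
    a as (
        select acct_num, sum(amt) as amt
        from account_table
        group by acct_num
    )
    select a.acct_num,
           case when (nums.n + 1) * 1000 < a.amt then 1000
                else a.amt - nums.n * 1000
           end as amt
    from a join nums on nums.n * 1000 < a.amt
    order by a.acct_num, nums.n
""").fetchall()

for row in chunks:
    print(row)
```

Note that ten numbers only cover totals up to ten times the cap; a real numbers table would need enough rows for the largest total.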
A variation along these lines might give some ideas (using the 1,000 of your sample data):
WITH summary AS (
SELECT acct_num
,TRUNC(SUM(amt) / 1000) AS times
,MOD(SUM(amt), 1000) AS remainder
FROM account_table
GROUP BY acct_num
), x(acct_num, times, remainder) AS (
SELECT acct_num, times, remainder
FROM summary
UNION ALL
SELECT s.acct_num, x.times - 1, s.remainder
FROM summary s
,x
WHERE s.acct_num = x.acct_num
AND x.times > 0
)
SELECT acct_num
,CASE WHEN times = 0 THEN remainder ELSE 1000 END AS amt
FROM x
ORDER BY acct_num, amt DESC
The idea is to first build a summary table with div and modulo:
ACCT_NUM TIMES REMAINDER
101 2 175
102 0 15
103 1 200
104 0 475
Then perform a hierarchical query on the summary table based on the number of "times" (i.e. rows) you want, with an extra row for the remainder.
ACCT_NUM AMT
101 1000
101 1000
101 175
102 15
103 1000
103 200
104 475
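The div-and-modulo idea can also be sketched outside the database. The following Python snippet is only an illustration; split_amount is a hypothetical helper, and skipping a zero remainder is an assumption of this sketch (the recursive query above appears to emit a 0 row when the total is an exact multiple of the cap).

```python
def split_amount(total, cap=1000):
    """Split total into full chunks of `cap` plus one remainder chunk."""
    times, remainder = divmod(total, cap)
    chunks = [cap] * int(times)
    if remainder:  # assumption: skip a zero remainder instead of emitting 0
        chunks.append(remainder)
    return chunks


# Per-account totals taken from the "Account Summary" in the question.
summary = {101: 2175, 102: 15, 103: 1200, 104: 475}
split_rows = [(acct, amt)
              for acct, total in summary.items()
              for amt in split_amount(total)]

for row in split_rows:
    print(row)
```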

How to duplicate rows in SQL based on the difference between date columns and divide an aggregated column across the duplicated rows?

I have a table with some records about fuel consumption. The important columns in the table are CONSUME_DATE_FROM and CONSUM_DATE_TO.
I want to calculate the average fuel consumption per car on a monthly basis, but some rows do not fall within a single month. For example, some rows cover three months, and the total amount of gas in litres is aggregated in a single row.
So I need to find the records where the difference between CONSUME_DATE_FROM and CONSUM_DATE_TO spans more than one month, duplicate them (in the current table or a second one) once per month, and divide the total gas in litres between the related rows.
I have this table with the following data:
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER
1 100 2018-10-25 2018-12-01 600
2 101 2018-07-19 2018-07-24 100
3 102 2018-12-31 2019-01-01 400
4 103 2018-03-29 2018-05-29 200
5 104 2018-02-05 2018-02-09 50
The expected output table should be as below
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER
1 100 2018-10-25 2018-12-01 200
1 100 2018-10-25 2018-12-01 200
1 100 2018-10-25 2018-12-01 200
2 101 2018-07-19 2018-07-24 100
3 102 2018-12-31 2019-01-01 200
3 102 2018-12-31 2019-01-01 200
4 103 2018-03-29 2018-05-29 66.66
4 103 2018-03-29 2018-05-29 66.66
4 103 2018-03-29 2018-05-29 66.66
5 104 2018-02-05 2018-02-09 50
Or as below
ID VehicleId CONSUME_DATE_FROM CONSUM_DATE_TO GAS_PER_LITER DATE_RELOAD_GAS
1 100 2018-10-25 2018-12-01 200 2018-10-01
1 100 2018-10-25 2018-12-01 200 2018-11-01
1 100 2018-10-25 2018-12-01 200 2018-12-01
2 101 2018-07-19 2018-07-24 100 2018-07-01
3 102 2018-12-31 2019-01-01 200 2018-12-01
3 102 2018-12-31 2019-01-01 200 2019-01-01
4 103 2018-03-29 2018-05-29 66.66 2018-03-01
4 103 2018-03-29 2018-05-29 66.66 2018-04-01
4 103 2018-03-29 2018-05-29 66.66 2018-05-01
5 104 2018-02-05 2018-02-09 50 2018-02-01
Can someone please help me out with this query?
I'm using oracle database
Your business rule treats the difference between CONSUME_DATE_FROM and CONSUM_DATE_TO as absolute months. So you expect the difference between 2018-10-25 and 2018-12-01 to be three months whereas the difference in days actually equates to about 1.1 months. So we can't use simple date arithmetic to get your desired output, we need to do some additional massaging of the dates.
The query below implements your desired logic by deriving the first day of the month for CONSUME_DATE_FROM and the last day of the month for CONSUM_DATE_TO, then using ceil() to round the difference up to the nearest whole number of months.
This is calculated in a subquery which is used in the main query with the old connect by level trick to multiply a record by level number of times:
with cte as (
select f.*
, ceil(months_between(last_day(CONSUM_DATE_TO)
, trunc(CONSUME_DATE_FROM,'mm'))) as diff
from fuel_consumption f
)
select cte.id
, cte.VehicleId
, cte.CONSUME_DATE_FROM
, cte.CONSUM_DATE_TO
, cte.GAS_PER_LITER/cte.diff as GAS_PER_LITER
, add_months(trunc(cte.CONSUME_DATE_FROM, 'mm'), level-1) as DATE_RELOAD_GAS
from cte
connect by level <= cte.diff
and prior cte.id = cte.id
and prior sys_guid() is not null
;
"what about if add a additional column "DATE_RELOAD_GAS" that display difference date for similar rows"
From your posted sample it seems like DATE_RELOAD_GAS is the first day of the month for each month bounded by CONSUME_DATE_FROM and CONSUM_DATE_TO. I have amended my solution to implement this rule.
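For readers who want to check the month-counting rule without an Oracle instance, here is a minimal Python sketch of the same logic. The helper split_by_month is hypothetical; it counts calendar months from the month of CONSUME_DATE_FROM to the month of CONSUM_DATE_TO inclusive, which should agree with the ceil(months_between(...)) expression above for the sample rows.

```python
from datetime import date


def split_by_month(date_from, date_to, total):
    """Return one (first_of_month, share) pair per calendar month touched."""
    months = (date_to.year - date_from.year) * 12 \
        + (date_to.month - date_from.month) + 1
    share = total / months
    result = []
    for k in range(months):
        idx = date_from.month - 1 + k          # zero-based month offset
        first = date(date_from.year + idx // 12, idx % 12 + 1, 1)
        result.append((first, share))
    return result


# Rows 1 and 3 from the sample data in the question.
row1 = split_by_month(date(2018, 10, 25), date(2018, 12, 1), 600)
row3 = split_by_month(date(2018, 12, 31), date(2019, 1, 1), 400)

for first, share in row1 + row3:
    print(first, share)
```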
Using a connect by level structure and treating to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') as the month, I was able to solve it as below:
select ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO,
trunc(GAS_PER_LITER/max(rn) over (partition by ID order by ID),2) as GAS_PER_LITER,
'01.'||substr(myMonth,5,2)||'.'||substr(myMonth,1,4) as DATE_RELOAD_GAS
from
(
with consumption( ID, VehicleId, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER ) as
(
select 1,100,date'2018-10-25',date'2018-12-01',600 from dual union all
select 2,101,date'2018-07-19',date'2018-07-24',100 from dual union all
select 3,102,date'2018-12-31',date'2019-01-01',400 from dual union all
select 4,103,date'2018-03-29',date'2018-05-29',200 from dual union all
select 5,104,date'2018-02-05',date'2018-02-09', 50 from dual
)
select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
row_number() over (partition by ID order by ID) as rn
from dual join consumption c
on c.ID >= 2
group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
union all
select ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm') myMonth,
VehicleId, c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, GAS_PER_LITER,
row_number() over (partition by ID order by ID) as rn
from dual join consumption c
on c.ID = 1
group by ID, to_char(c.CONSUME_DATE_FROM + level - 1,'yyyymm'), VehicleId,
c.CONSUME_DATE_FROM, c.CONSUM_DATE_TO, c.GAS_PER_LITER
connect by level <= c.CONSUM_DATE_TO - c.CONSUME_DATE_FROM + 1
) q
group by ID, VehicleId, myMonth, CONSUME_DATE_FROM, CONSUM_DATE_TO, GAS_PER_LITER, rn
order by ID, myMonth;
I ran into an interesting issue: if I use c.ID >= 1 as the join condition in the subquery, the query hangs for a very long time, so I split it into two parts with union all, using c.ID >= 2 and c.ID = 1.

Oracle SQL for commission per employee per week

Here is the sample data in an Oracle 11g database that I am working with. I am trying to come up with a SQL query that returns the commission earned by each employee per week. If an employee did not earn a commission in a week, then it should display 0 in the commission_earned column. (Assume the week starts on Monday.)
SQL> select * from SALES;
EMPLID PRODUCT_TYPE PRODUCTID SALE_AMOUNT COMMISSION_EARNED SALE_DATE
1001 Desktop 55355251 750 45 02-MAY-16
1002 Desktop 2332134 600 30 02-MAY-16
1001 Laptop 773643 1200 65 02-MAY-16
1003 Camera 5546232 450 25 03-MAY-16
1002 Printer 445321 150 15 04-MAY-16
1001 Printer 445321 150 15 10-MAY-16
1003 Camera 5546232 450 25 10-MAY-16
I am trying to come up with a sql that would return the total commission earned by each employee per week. I would appreciate any help or pointers.
WEEKOF EMPLID COMMISSION_EARNED
02-MAY-16 1001 110
02-MAY-16 1002 45
02-MAY-16 1003 25
09-MAY-16 1001 15
09-MAY-16 1002 0
09-MAY-16 1003 25
I came up with the SQL below, but it does not display a row for emplid 1002 with commission_earned as 0 for the week starting 09-MAY-16:
SQL> select trunc(sale_date,'IW') WEEKOF,emplid,sum(COMMISSION_EARNED) COMMISSION_EARNED from sales group by trunc(sale_date,'IW'),emplid order by trunc(sale_date,'IW'),emplid;
WEEKOF EMPLID COMMISSION_EARNED
02-MAY-16 1001 110
02-MAY-16 1002 45
02-MAY-16 1003 25
09-MAY-16 1001 15
09-MAY-16 1003 25
You can generate all the rows with a cross join and then use left join to get the information from the table:
select w.wk, e.emplid,
coalesce(sum(t.commission_earned), 0) as commission_earned
from (select distinct trunc(sale_date, 'IW') as wk from sales) w cross join
(select distinct emplid from sales) e left join
sales t
on t.emplid = e.emplid and trunc(t.sale_date, 'IW') = w.wk
group by w.wk, e.emplid
order by w.wk, e.emplid;
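The cross join plus left join pattern can be tried out with Python's sqlite3 module, as in the sketch below. It uses the sales table and sale_date column from the question. Two things are assumptions of this sketch rather than part of the answer: dates are stored as ISO strings, and because SQLite has no trunc(date, 'IW'), the Monday of each week is computed with date(..., 'weekday 0', '-6 days').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sales (emplid integer, commission_earned integer, sale_date text)"
)
conn.executemany("insert into sales values (?, ?, ?)", [
    (1001, 45, "2016-05-02"), (1002, 30, "2016-05-02"), (1001, 65, "2016-05-02"),
    (1003, 25, "2016-05-03"), (1002, 15, "2016-05-04"),
    (1001, 15, "2016-05-10"), (1003, 25, "2016-05-10"),
])

# Move forward to Sunday (or stay on it), then back six days: the week's Monday.
wk_all = "date(sale_date, 'weekday 0', '-6 days')"
wk_s = "date(s.sale_date, 'weekday 0', '-6 days')"

weekly = conn.execute(f"""
    select w.wk, e.emplid,
           coalesce(sum(s.commission_earned), 0) as commission_earned
    from (select distinct {wk_all} as wk from sales) w
    cross join (select distinct emplid from sales) e   -- every week x employee
    left join sales s
      on s.emplid = e.emplid and {wk_s} = w.wk         -- may match nothing
    group by w.wk, e.emplid
    order by w.wk, e.emplid
""").fetchall()

for row in weekly:
    print(row)
```

The row for employee 1002 in the week of 2016-05-09 exists only because of the cross join; coalesce turns its NULL sum into 0.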

Find Max Value and date between date range returning multiple records

I hope someone can help me; I have been working on this all day.
I need to get the max value, along with the date and id associated with that max value, within a specific date range.
Here is my code. I have tried many different versions, but it still returns more than one row per ID:
SELECT distinct bw_s.id, avs.carProd, cd_s.RecordDate,
cd_s.milkProduction as MilkProd,
cd_s.WaterProduction as WaterProd
FROM tblTest bw_s
INNER JOIN tblTestCp cd_s WITH(NOLOCK)
ON bw_s.id=cd_s.id
AND cd_s.recorddate BETWEEN '08/06/2014' AND '10/05/2014'
Inner Join
( select id, max(CarVol) as carProd
from tblTestCp
where recorddate BETWEEN '08/06/2014' AND '10/05/2014'
group by id ) avs
on avs.id = bw_s.id
order by id
I have table like this
id RecordDate carProd MilkProd WaterProd
47790 2014-10-05 132155 0 225
47790 2014-10-01 13444 0 0
47790 2014-08-06 132111 10 100
47790 2014-09-05 10000 500 145
47790 2014-09-20 10000 800 500
47791 2014-09-20 10000 300 500
47791 2014-09-21 10001 400 500
47791 2014-08-21 20001 600 500
And the result should be ( max carprod)
id RecordDate carProd MilkProd WaterProd
47790 2014-10-05 132155 0 225
47791 2014-08-21 20001 600 500
I've assumed that the name of your table is "Data":
SELECT
*
FROM
Data
WHERE
Data.RecordDate BETWEEN '2014-08-21' AND '2014-10-01'
ORDER BY
Data.carProd DESC
LIMIT 1;
Make sure to change the dates to match your particular requirements. Note that LIMIT is MySQL/PostgreSQL syntax; on SQL Server, use SELECT TOP 1 instead.
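Because the expected output has one row per id, a query that sorts the whole range and keeps one row cannot produce it. One possible alternative, sketched here with Python's sqlite3 module, is to join back to the per-id maximum on both the id and the max value. The table name readings and the ISO date strings are assumptions of this sketch; the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""create table readings
                (id integer, RecordDate text, carProd integer,
                 MilkProd integer, WaterProd integer)""")
conn.executemany("insert into readings values (?, ?, ?, ?, ?)", [
    (47790, "2014-10-05", 132155, 0, 225),
    (47790, "2014-10-01", 13444, 0, 0),
    (47790, "2014-08-06", 132111, 10, 100),
    (47790, "2014-09-05", 10000, 500, 145),
    (47790, "2014-09-20", 10000, 800, 500),
    (47791, "2014-09-20", 10000, 300, 500),
    (47791, "2014-09-21", 10001, 400, 500),
    (47791, "2014-08-21", 20001, 600, 500),
])

date_range = ("2014-08-06", "2014-10-05")
top_rows = conn.execute("""
    select r.id, r.RecordDate, r.carProd, r.MilkProd, r.WaterProd
    from readings r
    join (select id, max(carProd) as maxCar
          from readings
          where RecordDate between ? and ?
          group by id) m
      on m.id = r.id and m.maxCar = r.carProd   -- match the max value, not just the id
    where r.RecordDate between ? and ?
    order by r.id
""", date_range * 2).fetchall()

for row in top_rows:
    print(row)
```

If two rows for the same id tie on the maximum, this sketch returns both of them.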

SQL query to find the avg salary based on the nearest client dob's

I have a requirement involving the table below.
Conditions:
I have to take the average of the clients' salaries when a client's date of birth completes a run of three consecutive days of DOBs.
If a client has no such run of three consecutive DOBs, there is no need to take that client into consideration.
Example:
In the table below,
client 17 has preceding client IDs with consecutive DOBs (1-day gaps) -> in this case I'll take the salary AVG for 17 from the salaries of 15, 16 & 17.
client 18 has preceding client IDs with consecutive DOBs -> in this case I'll take the salary AVG for 18 from the salaries of 16, 17 & 18.
Table:
JobType ClientID ClinetDOB's Slaries
.net 1 2012-03-14 300
.net 2 2012-04-11 400
.net 3 2012-04-12 200
.net 4 2012-07-29 400
.net 5 2012-08-17 1200
.net 6 2012-08-18 1400
.net 7 2012-08-19 1400
java 8 2012-04-10 400
java 9 2012-07-29 400
java 10 2012-07-30 600
java 11 2012-08-14 1200
java 12 2012-08-15 1800
java 13 2012-08-16 1100
java 14 2012-09-17 1200
java 15 2012-08-18 2400
java 16 2012-08-19 2400
java 17 2012-08-20 2400
java 18 2012-08-21 1500
The result should look like this:
JobType ClientID ClinetDOB's AVG(Slaries)
.net 7 2012-08-19 1333 --This is the avg of client IDs 5,6,7 (because they have 3 consecutive days of DOBs)
Java 13 2012-08-16 1366 --This is the avg of client IDs 11,12,13 (because they have 3 consecutive days of DOBs)
Java 17 2012-08-20 2400 --This is the avg of client IDs 15,16,17 (because they have 3 consecutive days of DOBs)
Java 18 2012-08-21 2100 --This is the avg of client IDs 16,17,18 (because they have 3 consecutive days of DOBs)
The query below gives some messed-up results.
select t1.ClientID,
t1.ClinetDOBs,
(t1.Slaries + sum (t2.Slaries)) / (count (*) + 1) Avg_Slaries
from table1 t1
inner join table1 t2
on (t1.ClinetDOBs = dateadd(day, 3, t2.ClinetDOBs) and t1.jobtype = t2.jobtype)
group by t1.ClientID,
t1.ClinetDOBs,
t1.Slaries
Please help.
Thank You In advance!
You might try this. The difference is that rows from the previous three days are taken from t2, including the current row being tested, so no double-summing is needed. Also, `having` removes rows that reference only themselves.
select t1.ClientID,
t1.ClinetDOBs,
avg(t2.Slaries) Avg_Slaries
from table1 t1
inner join table1 t2
on t1.ClinetDOBs >= t2.ClinetDOBs
and t1.ClinetDOBs <= dateadd(day, 3, t2.ClinetDOBs)
and t1.jobtype = t2.jobtype
group by t1.ClientID,
t1.ClinetDOBs
having count(*) > 1
You can see it on your last data here.
The following query joins in the records for each of the two preceding days. The joins both bring in the data and act as a filter to ensure that all three records exist:
select tmain.ClientID, tmain.ClinetDOBs,
sum(tmain.slaries + t1.slaries + t2.slaries)/3.0 as avg_slaries
from table1 tmain join
table1 t1
on t1.ClinetDOBs = dateadd(day, -1, tmain.ClinetDOBs) and
t1.jobtype = tmain.jobtype join
table1 t2
on t2.ClinetDOBs = dateadd(day, -2, tmain.ClinetDOBs) and
t2.jobtype = tmain.jobtype
group by tmain.ClientID, tmain.ClinetDOBs, tmain.Slaries
Your question seems odd. Why do the dates have to be sequential, and why do they all have to be there? What happens if there are multiple people on the same date and job title?
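The self-join approach can be checked against the question's sample data with Python's sqlite3 module, as in this sketch. The table name clients, the cleaned-up column names, and the use of SQLite's date(..., '-1 day') in place of dateadd are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table clients (jobtype text, client_id integer, dob text, salary integer)"
)
conn.executemany("insert into clients values (?, ?, ?, ?)", [
    (".net", 1, "2012-03-14", 300), (".net", 2, "2012-04-11", 400),
    (".net", 3, "2012-04-12", 200), (".net", 4, "2012-07-29", 400),
    (".net", 5, "2012-08-17", 1200), (".net", 6, "2012-08-18", 1400),
    (".net", 7, "2012-08-19", 1400), ("java", 8, "2012-04-10", 400),
    ("java", 9, "2012-07-29", 400), ("java", 10, "2012-07-30", 600),
    ("java", 11, "2012-08-14", 1200), ("java", 12, "2012-08-15", 1800),
    ("java", 13, "2012-08-16", 1100), ("java", 14, "2012-09-17", 1200),
    ("java", 15, "2012-08-18", 2400), ("java", 16, "2012-08-19", 2400),
    ("java", 17, "2012-08-20", 2400), ("java", 18, "2012-08-21", 1500),
])

# Inner joins keep only clients whose job type also has DOBs on the
# previous day (p1) and the day before that (p2).
runs = conn.execute("""
    select m.jobtype, m.client_id, m.dob,
           (m.salary + p1.salary + p2.salary) / 3.0 as avg_salary
    from clients m
    join clients p1 on p1.jobtype = m.jobtype and p1.dob = date(m.dob, '-1 day')
    join clients p2 on p2.jobtype = m.jobtype and p2.dob = date(m.dob, '-2 days')
    order by m.client_id
""").fetchall()

for jobtype, client_id, dob, avg_salary in runs:
    print(jobtype, client_id, dob, round(avg_salary, 2))
```

On this data the sketch keeps clients 7, 13, 17 and 18, matching the expected result in the question (which shows the averages truncated to whole numbers). It assumes at most one client per job type and date; duplicates would multiply the joined rows.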
Try
select t1.ClientID,
t1.ClinetDOBs,
avg(t2.Slaries)
from table1 t1
inner join table1 t2
on t2.ClinetDOBs >= t1.ClinetDOBs
and t2.ClinetDOBs <= dateadd(day, 3, t1.ClinetDOBs)
and t1.jobtype = t2.jobtype
group by t1.ClientID,
t1.ClinetDOBs