Build table with previous months (cumulative) - sql

I'm a bit lost with the following problem that I need to solve with an SQL query, no plsql. The idea is to build a cumulative column to calculate all previous months. The input table looks like
Month
1
2
3
..
24
I need to build the following table:
Month Cum_Month
1 1
2 1
2 2
3 1
3 2
3 3
..
24 1
...
24 23
All this in SQL Server 2008, thanks in advance

You can do it like this:
DECLARE @tbl TABLE ([Month] INT)
INSERT @tbl VALUES
(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),
(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24)
SELECT Month
, ROW_NUMBER() OVER (PARTITION BY Month ORDER BY Month) num
FROM @tbl a
JOIN
(
SELECT *
FROM master..spt_values
WHERE type = 'P'
)
b ON b.number < a.Month
master..spt_values is used to generate numbers (filtered by type = 'P' it holds 0 through 2047). Joining it to @tbl on b.number < a.Month yields as many rows per month as the month's value, and ROW_NUMBER then assigns the ordinal 1, 2, ... within each month.

Here's a pretty cool trick that doesn't need your input table at all:
SELECT N.Number as Month, N2.Number as Cum_Month
FROM
(SELECT Number FROM master..spt_values WHERE Number BETWEEN 1 AND 24 AND Type = 'P') N
JOIN (SELECT Number FROM master..spt_values WHERE Number BETWEEN 1 AND 24 AND Type = 'P') N2 ON N.Number >= N2.Number
ORDER BY N.Number, N2.Number
And the Fiddle.
And if you really don't want the last row, 24 24 (why not?), just change the second subquery to BETWEEN 1 AND 23.
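The self-join of a number list against itself can be tried outside SQL Server as well. The following is a minimal sketch, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in engine and a recursive CTE in place of master..spt_values; the names numbers and cumulative_months are made up for this illustration.

```python
import sqlite3

# SQLite stands in for SQL Server here; the recursive CTE "numbers"
# plays the role of master..spt_values (an assumption of this sketch).
QUERY = """
WITH RECURSIVE numbers(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM numbers WHERE n < :max_month
)
SELECT m.n AS month, c.n AS cum_month
FROM numbers AS m
JOIN numbers AS c ON c.n <= m.n   -- every earlier month, plus the month itself
ORDER BY m.n, c.n
"""


def cumulative_months(max_month):
    """Return (month, cum_month) pairs for months 1..max_month."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(QUERY, {"max_month": max_month}).fetchall()
    finally:
        conn.close()


print(cumulative_months(3))
# [(1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (3, 3)]
```

Changing the join condition to c.n < m.n would drop the pair where both numbers are equal for every month, not only for month 24, so restricting the second number list (as suggested above) is the way to drop only the final row.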

Related

SQL query to duplicate each row 12 times

I have a table with the columns site, year and sales. The table is unique on site+year, e.g.
site year sales
-------------------
a 2012 50
b 2013 100
a 2006 35
Now what I want to do is make this table unique on site+year+month. Thus each row gets duplicated 12 times, a month column is added which is labelled from 1-12 and the sales values get divided by 12 thus
site year month sales
-------------------------
a 2012 1 50/12
a 2012 2 50/12
...
a 2012 12 50/12
...
b 2013 1 100/12
...
a 2006 12 35/12
I am currently doing this in Python and it works like a charm, but I need to do it in SQL (ideally PostgreSQL, since I will be using this as a data source for Tableau).
It would be very helpful if someone can provide the explanations with the solution as well, since I am a novice at this
You can use generate_series() for that:
select t.site, t.year, g.month, t.sales / 12
from the_table t
cross join generate_series(1,12) as g (month)
order by t.site, t.year, g.month;
If the column sales is an integer, you should cast that to a numeric to avoid the integer division: t.sales::numeric / 12
Online example: http://rextester.com/GUWPI39685
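To see the cross join and the integer-division pitfall in action without a PostgreSQL server, here is a small sketch that assumes SQLite via Python's sqlite3 module. SQLite has no built-in generate_series() by default, so a recursive CTE named months (a name chosen for this example) produces 1 to 12; split_sales_by_month is likewise a hypothetical helper.

```python
import sqlite3


def split_sales_by_month(rows):
    """Expand (site, year, sales) rows into 12 monthly rows each."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE the_table (site TEXT, year INTEGER, sales INTEGER)"
        )
        conn.executemany("INSERT INTO the_table VALUES (?, ?, ?)", rows)
        return conn.execute(
            """
            WITH RECURSIVE months(month) AS (
                SELECT 1
                UNION ALL
                SELECT month + 1 FROM months WHERE month < 12
            )
            SELECT t.site, t.year, g.month,
                   t.sales / 12.0 AS sales   -- 12.0 avoids integer division
            FROM the_table AS t
            CROSS JOIN months AS g
            ORDER BY t.site, t.year, g.month
            """
        ).fetchall()
    finally:
        conn.close()


monthly = split_sales_by_month([("a", 2012, 50), ("b", 2013, 100), ("a", 2006, 35)])
print(len(monthly))   # 36 rows: 3 source rows x 12 months
print(monthly[0])     # first row is site 'a', year 2006, month 1
```

The cross join pairs every source row with every month, which is why no join condition is needed.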
Try this approach (for T-SQL / MS SQL Server):
DECLARE @T TABLE
(
[site] VARCHAR(5),
[year] INT,
sales INT
)
INSERT INTO @T
VALUES('A',2012,50),('B',2013,100),('C',2006,35)
;WITH CTE
AS
(
SELECT
MonthSeq = 1
UNION ALL
SELECT
MonthSeq = MonthSeq+1
FROM CTE
WHERE MonthSeq <12
)
SELECT
T.[site],
T.[year],
[Month] = CTE.MonthSeq,
sales = T.[sales]/12.0 -- 12.0 avoids integer division on the INT column
FROM CTE
CROSS JOIN @T T
ORDER BY T.[site],T.[year],CTE.MonthSeq

How to divide two values from the different row

I have used this formula.
Quote change = (current month data / previous month data) * 100
Then my data stored on SQL SERVER table look like below :
id DATE DATA
1 2015/01/01 10
2 2015/02/01 20
3 2015/03/01 30
4 2015/04/01 40
5 2015/05/01 50
6 2015/06/01 60
7 2015/07/01 70
8 2015/08/01 80
9 2015/09/01 90
How can I implement this formula in a SQL function?
For example:
The current month is 2015/02/01
Quote change = (Current Month Data / Previous Month Data) * 100
Quote change = (20/10)*100
If the current date is 2015/01/01, there is no earlier data, so I need to show 0 or #.
SQL Server 2012 has a window function called LAG that is very useful in situations like this.
LAG returns the value of a specific column from the previous row (as determined by the ORDER BY part of the OVER clause).
Try this:
;With cte as
(
SELECT Id, Date, Data, LAG(Data) OVER(ORDER BY Date) As LastMonthData
FROM YourTable
)
SELECT Id,
Date,
Data,
CASE WHEN ISNULL(LastMonthData, 0) = 0 THEN 0 ELSE Data * 100.0 / LastMonthData END As Quote
FROM cte
I've used a CTE just so I wouldn't have to write the LAG expression twice.
The CASE expression prevents a divide-by-zero error when LastMonthData is 0, and returns 0 when it is null (the first month).
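The LAG-plus-guard pattern can be checked on a small sample. This is a rough sketch, assuming SQLite 3.25 or later (the first version with window functions) through Python's sqlite3 module; the table name monthly and the function quote_change are invented for the example, and dates are stored as ISO text so that they sort correctly.

```python
import sqlite3


def quote_change(rows):
    """Return (date, data, quote) where quote = data / previous data * 100."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE monthly (id INTEGER, date TEXT, data INTEGER)")
        conn.executemany("INSERT INTO monthly VALUES (?, ?, ?)", rows)
        return conn.execute(
            """
            SELECT date, data,
                   CASE WHEN COALESCE(prev, 0) = 0 THEN 0   -- first row or zero divisor
                        ELSE data * 100.0 / prev            -- 100.0 forces real division
                   END AS quote
            FROM (SELECT date, data,
                         LAG(data) OVER (ORDER BY date) AS prev
                  FROM monthly)
            ORDER BY date
            """
        ).fetchall()
    finally:
        conn.close()


sample = [(1, "2015-01-01", 10), (2, "2015-02-01", 20), (3, "2015-03-01", 30)]
for row in quote_change(sample):
    print(row)
```

With the sample data, January has no previous row and gets 0, February gets 20 / 10 * 100 = 200, and March gets 30 / 20 * 100 = 150.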
You can also use a self-join, as shown below. A LEFT JOIN keeps the first month, which has no previous row:
select a.*, isnull(cast(a.data as decimal(10,2)) / nullif(b.data, 0) * 100, 0) as quote
from TableA as a
left join TableA as b
on b.date = dateadd(mm,-1,a.date)
Let me know if this helps

Joining next Sequential Row

I am planning an SQL statement right now and would like someone to look over my thoughts.
This is my Table:
id stat period
--- ------- --------
1 10 1/1/2008
2 25 2/1/2008
3 5 3/1/2008
4 15 4/1/2008
5 30 5/1/2008
6 9 6/1/2008
7 22 7/1/2008
8 29 8/1/2008
Create Table
CREATE TABLE tbstats
(
id INT IDENTITY(1, 1) PRIMARY KEY,
stat INT NOT NULL,
period DATETIME NOT NULL
)
go
INSERT INTO tbstats
(stat,period)
SELECT 10,CONVERT(DATETIME, '20080101')
UNION ALL
SELECT 25,CONVERT(DATETIME, '20080102')
UNION ALL
SELECT 5,CONVERT(DATETIME, '20080103')
UNION ALL
SELECT 15,CONVERT(DATETIME, '20080104')
UNION ALL
SELECT 30,CONVERT(DATETIME, '20080105')
UNION ALL
SELECT 9,CONVERT(DATETIME, '20080106')
UNION ALL
SELECT 22,CONVERT(DATETIME, '20080107')
UNION ALL
SELECT 29,CONVERT(DATETIME, '20080108')
go
I want to calculate the difference between each statistic and the next, and then calculate the mean value of the 'gaps.'
Thoughts:
I need to join each record with its subsequent row. I can do that using the ever flexible joining syntax, thanks to the fact that I know the id field is an integer sequence with no gaps.
By aliasing the table I could incorporate it into the SQL query twice, then join them together in a staggered fashion by adding 1 to the id of the first aliased table. The first record in the table has an id of 1. 1 + 1 = 2 so it should join on the row with id of 2 in the second aliased table. And so on.
Now I would simply subtract one from the other.
Then I would use the ABS function to ensure that I always get positive integers as a result of the subtraction regardless of which side of the expression is the higher figure.
Is there an easier way to achieve what I want?
The lead analytic function should do the trick:
SELECT period, stat, stat - LEAD(stat) OVER (ORDER BY period) AS gap
FROM tbstats
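The average of the gaps can be computed without pairing rows at all: the signed differences telescope, so their sum equals the last value minus the first, and the mean is that difference divided by one less than the number of rows. (This shortcut applies to signed gaps only; it no longer holds once you take ABS of each gap.)
select sum(case when seqnum = num then stat else - stat end) * 1.0 / (max(num) - 1)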
from (select period, row_number() over (order by period) as seqnum,
count(*) over () as num
from tbstats
) t
where seqnum = num or seqnum = 1;
Of course, you can also do the calculation using lead(), but this will also work in SQL Server 2005 and 2008.
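The telescoping argument is easy to verify numerically. Below is a sketch under a few assumptions: SQLite 3.25+ via Python's sqlite3 module stands in for SQL Server, gaps are taken as next value minus current value, and gap_means is a hypothetical helper. It returns the mean signed gap from LEAD, the mean absolute gap, and the (last - first) / (n - 1) shortcut.

```python
import sqlite3

# stat values from the question, in period order
STATS = [10, 25, 5, 15, 30, 9, 22, 29]


def gap_means(stats):
    """Return (mean signed gap, mean absolute gap, telescoping shortcut)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tbstats (id INTEGER PRIMARY KEY, stat INTEGER, period TEXT)"
        )
        conn.executemany(
            "INSERT INTO tbstats (stat, period) VALUES (?, ?)",
            [(s, "2008-01-%02d" % (i + 1)) for i, s in enumerate(stats)],
        )
        signed, absolute = conn.execute(
            """
            SELECT AVG(next_stat - stat), AVG(ABS(next_stat - stat))
            FROM (SELECT stat,
                         LEAD(stat) OVER (ORDER BY period) AS next_stat
                  FROM tbstats)
            WHERE next_stat IS NOT NULL   -- the last row has no successor
            """
        ).fetchone()
        shortcut = (stats[-1] - stats[0]) / (len(stats) - 1)
        return signed, absolute, shortcut
    finally:
        conn.close()


signed, absolute, shortcut = gap_means(STATS)
print(signed, shortcut)   # both are 19/7
print(absolute)           # 101/7, which the shortcut cannot produce
```

For the sample data the signed mean and the shortcut agree at 19/7, while the mean of the absolute gaps is 101/7, so the ABS variant described in the question needs the row-by-row LEAD (or join) approach.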
You can also achieve this with a join:
SELECT t1.period,
t1.stat,
t1.stat - t2.stat gap
FROM tbstats t1
LEFT JOIN tbstats t2
ON t1.id + 1 = t2.id
To calculate the difference between each statistic and the next, LEAD() and LAG() may be the simplest option. You provide an ORDER BY, and LEAD(something) returns the next something and LAG(something) returns the previous something in the given order.
select
x.id thisStatId,
LAG(x.id) OVER (ORDER BY x.id) lastStatId,
x.stat thisStatValue,
LAG(x.stat) OVER (ORDER BY x.id) lastStatValue,
x.stat - LAG(x.stat) OVER (ORDER BY x.id) diff
from tbStats x

How do I aggregate numbers from a string column in SQL

I am dealing with a poorly designed database column which has values like this
ID cid Score
1 1 3 out of 3
2 1 1 out of 5
3 2 3 out of 6
4 3 7 out of 10
I want the aggregate sum and percentage of Score column grouped on cid like this
cid sum percentage
1 4 out of 8 50
2 3 out of 6 50
3 7 out of 10 70
How do I do this?
You can try this way :
select
t.cid
, cast(sum(s.a) as varchar(5)) +
' out of ' +
cast(sum(s.b) as varchar(5)) as sum
, ((cast(sum(s.a) as decimal))/sum(s.b))*100 as percentage
from MyTable t
inner join
(select
id
, cast(left(score, charindex(' ', score) - 1) as int) a
, cast(substring(score,charindex('out of', score)+7,len(score)) as int) b
from MyTable
) s on s.id = t.id
group by t.cid
[SQLFiddle Demo]
Redesign the table, but on-the-fly as a CTE. Here's a solution that's not as short as you could make it, but that takes advantage of the handy SQL Server function PARSENAME. You may need to tweak the percentage calculation if you want to truncate rather than round, or if you want it to be a decimal value, not an int.
In this or most any solution, you have to count on the column values for Score to be in the very specific format you show. If you have the slightest doubt, you should run some other checks so you don't miss or misinterpret anything.
with
P(ID, cid, Score2Parse) as (
select
ID,
cid,
replace(Score,space(1),'.')
from scores
),
S(ID,cid,pts,tot) as (
select
ID,
cid,
cast(parsename(Score2Parse,4) as int),
cast(parsename(Score2Parse,1) as int)
from P
)
select
cid, cast(round(100e0*sum(pts)/sum(tot),0) as int) as percentage
from S
group by cid;
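For comparison, the same parse-then-aggregate idea can be sketched with plain string functions. This example assumes SQLite via Python's sqlite3 module, where instr and substr play the role of charindex and substring; the function name aggregate_scores is hypothetical, and every Score value is assumed to follow the exact "N out of M" format.

```python
import sqlite3


def aggregate_scores(rows):
    """Sum 'N out of M' scores per cid and compute the percentage."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE scores (id INTEGER, cid INTEGER, score TEXT)")
        conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)
        return conn.execute(
            """
            SELECT cid,
                   SUM(pts) || ' out of ' || SUM(tot) AS total,
                   ROUND(100.0 * SUM(pts) / SUM(tot)) AS percentage
            FROM (SELECT cid,
                         -- text before the first space is the points scored
                         CAST(substr(score, 1, instr(score, ' ') - 1) AS INTEGER) AS pts,
                         -- text after ' out of ' (8 characters) is the maximum
                         CAST(substr(score, instr(score, ' out of ') + 8) AS INTEGER) AS tot
                  FROM scores)
            GROUP BY cid
            ORDER BY cid
            """
        ).fetchall()
    finally:
        conn.close()


data = [(1, 1, "3 out of 3"), (2, 1, "1 out of 5"),
        (3, 2, "3 out of 6"), (4, 3, "7 out of 10")]
print(aggregate_scores(data))
# [(1, '4 out of 8', 50.0), (2, '3 out of 6', 50.0), (3, '7 out of 10', 70.0)]
```

Splitting on the first space rather than taking a fixed number of characters keeps multi-digit scores such as "10 out of 12" intact.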

SQL: create sequential list of numbers from various starting points

I'm stuck on this SQL problem.
I have a column that is a list of starting points (prevdoc), and another column that lists how many sequential numbers I need after the starting point (exdiff).
For example, here are the first several rows:
prevdoc | exdiff
----------------
1 | 3
21 | 2
126 | 2
So I need an output to look something like:
2
3
4
22
23
127
128
I'm lost as to where even to start. Can anyone advise me on the SQL code for this solution?
Thanks!
;with a as
(
select prevdoc + 1 col, exdiff
from <table> where exdiff > 0
union all
select col + 1, exdiff - 1
from a
where exdiff > 1
)
select col
from a
order by col
If your exdiff is going to be a small number, you can make up a virtual table of numbers using SELECT..UNION ALL as shown here and join to it:
select prevdoc+number
from doc
join (select 1 number union all
select 2 union all
select 3 union all
select 4 union all
select 5) x on x.number <= doc.exdiff
order by 1;
I have provided for 5 but you can expand as required. You haven't specified your DBMS, but in each one there will be a source of sequential numbers, for example in SQL Server, you could use:
select prevdoc+number
from doc
join master..spt_values v on
v.number <= doc.exdiff and
v.number >= 1 and
v.type = 'P'
order by 1;
The master..spt_values table contains numbers between 0-2047 (when filtered by type='p').
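Where no ready-made numbers table exists, the same join works against a generated one. The sketch below assumes SQLite via Python's sqlite3 module, with a recursive CTE named numbers (an illustrative name) that counts up to the largest exdiff in the table; expand_ranges is a hypothetical helper.

```python
import sqlite3


def expand_ranges(rows):
    """For each (prevdoc, exdiff), list prevdoc+1 .. prevdoc+exdiff."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE doc (prevdoc INTEGER, exdiff INTEGER)")
        conn.executemany("INSERT INTO doc VALUES (?, ?)", rows)
        result = conn.execute(
            """
            WITH RECURSIVE numbers(n) AS (
                SELECT 1
                UNION ALL
                SELECT n + 1 FROM numbers
                WHERE n < (SELECT MAX(exdiff) FROM doc)
            )
            SELECT d.prevdoc + numbers.n
            FROM doc AS d
            JOIN numbers ON numbers.n <= d.exdiff   -- exdiff rows per starting point
            ORDER BY 1
            """
        ).fetchall()
        return [value for (value,) in result]
    finally:
        conn.close()


print(expand_ranges([(1, 3), (21, 2), (126, 2)]))
# [2, 3, 4, 22, 23, 127, 128]
```

Rows with an exdiff of 0 simply match no number and contribute nothing to the output.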
If the numbers are not too large, then you can use the following trick in most databases:
select t.prevdoc + seqnum
from t join
(select row_number() over (order by column_name) as seqnum
from INFORMATION_SCHEMA.columns
) nums
on seqnum <= t.exdiff
The use of INFORMATION_SCHEMA columns in the subquery is arbitrary. The only purpose is to generate a sequence of numbers at least as long as the maximum exdiff number.
This approach will work in any database that supports the ranking functions. Most databases also have a database-specific way of generating a sequence (such as recursive CTEs in SQL Server and CONNECT BY in Oracle).