Select same number of rows even if data is not there - sql

I want to write a query to always select the same number of rows, even if the data is not there. If the data is not there, I would still like to select something in its place.
For example, if I want to select the amount in my bank account for the last 5 years, but I only have data for the last 3 years, could I still select 5 rows and just have 0's for the two missing years?
| Year | Balance |
| 2014 | $5 |
| 2013 | $10 |
| 2012 | $31 |
| 2011 | $0 | << Doesn't exist
| 2010 | $0 | << Doesn't exist
Is this possible? Thanks for any help.

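In SQL Server, ISNULL replaces a NULL balance with 0; other databases have similar functions, such as COALESCE. Note that this only helps when a row for the year exists with a NULL balance. It does not create rows for years that are missing from the table.
SELECT TOP 5 year, ISNULL(balance, 0) AS balance FROM yourtable ORDER BY year DESC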

The key to solving this is using an outer join. It is tempting to think of your actual table of data as the "main" table, but what you really want is a table of all possible years. I didn't have a table of years, so I made one up on the fly:
select 2010 as Year
union select 2011 as Year
union select 2012 as Year
union select 2013 as Year
union select 2014 as Year
This gives me a list of all the years I care about.
Then I use an outer join to join it to my table with real data. The outer join will just return NULL values for the stuff that isn't there. But I don't want NULLs, I want zeroes, so I use the isnull function to make them zeros if they are null. And then I end up with this:
select YearList.Year, isnull(Bals.Balance, 0) as theBalance
from
(select 2010 as Year
union select 2011 as Year
union select 2012 as Year
union select 2013 as Year
union select 2014 as Year) as YearList
left join (select Year, Balance from Balances) as Bals
on YearList.Year = Bals.Year
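A minimal sketch of the outer-join approach above, runnable with Python's built-in sqlite3 module. SQLite is only a stand-in here, so COALESCE replaces SQL Server's ISNULL, and the three sample rows in Balances are taken from the question's example.

```python
import sqlite3

# In-memory database; only three of the five years have a row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Balances (Year INTEGER PRIMARY KEY, Balance INTEGER)")
conn.executemany("INSERT INTO Balances VALUES (?, ?)",
                 [(2012, 31), (2013, 10), (2014, 5)])

# The year list is the "main" table; Balances is outer-joined onto it.
query = """
SELECT YearList.Year, COALESCE(Bals.Balance, 0) AS theBalance
FROM (SELECT 2010 AS Year
      UNION SELECT 2011
      UNION SELECT 2012
      UNION SELECT 2013
      UNION SELECT 2014) AS YearList
LEFT JOIN Balances AS Bals ON YearList.Year = Bals.Year
ORDER BY YearList.Year DESC
"""
rows = conn.execute(query).fetchall()
for year, balance in rows:
    print(year, balance)
```

Five rows come back regardless of how many years exist in Balances, because the row count is driven by the year list rather than by the data table.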

Related

Linear Interpolation in SQL

I work with crashes and mileage for the same year, stored in the Year column. Crashes are present for every record, but annual mileage is not. NULLs for mileage can occur at the beginning or at the end of the time period for a given customer, and a couple of annual mileage records in between can be missing as well. I do not know how to overcome this. I tried to do it with a CASE expression, but I do not know how to code it properly. The issue needs to be resolved in SQL, and I use SQL Server.
This is what the output looks like; I need to have mileage for every single year for each customer.
The data comes from a proprietary database and the records themselves must stay untouched. I just need query code that turns my current output into output that has mileage for every year. I appreciate any input!
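| Year | Customer | Crashes | Annual_Mileage |
| --- | --- | --- | --- |
| 2009 | 123 | 5 | 3453453 |
| 2010 | 123 | 1 | NULL |
| 2011 | 123 | 0 | 54545 |
| 2012 | 123 | 14 | 376457435 |
| 2013 | 123 | 3 | 63453453 |
| 2014 | 123 | 4 | NULL |
| 2015 | 123 | 15 | 6346747 |
| 2016 | 123 | 0 | NULL |
| 2017 | 123 | 2 | 534534 |
| 2018 | 123 | 7 | NULL |
| 2019 | 123 | 11 | NULL |
| 2020 | 123 | 15 | 565435 |
| 2021 | 123 | 12 | 474567546 |
| 2022 | 123 | 7 | NULL |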
Desired Results
| Year | Customer | Crashes | Annual_Mileage |
| --- | --- | --- | --- |
| 2009 | 123 | 5 | 3453453 |
| 2010 | 123 | 1 | 175399 (prior value is taken) |
| 2011 | 123 | 0 | 54545 |
| 2012 | 123 | 14 | 376457435 |
| 2013 | 123 | 3 | 63453453 |
| 2014 | 123 | 4 | 34900100 (avg of 2 adjacent values) |
| 2015 | 123 | 15 | 6346747 |
| 2016 | 123 | 0 | 3440641 (avg of 2 adjacent values) |
| 2017 | 123 | 2 | 534534 |
| 2018 | 123 | 7 | 534534 (prior value is taken) |
| 2019 | 123 | 11 | 549985 (avg of 2 adjacent values) |
| 2020 | 123 | 15 | 565435 |
| 2021 | 123 | 12 | 474567546 |
| 2022 | 123 | 7 | 474567546 (prior value is taken) |
SELECT Year,
Customer,
Crashes,
CASE
WHEN Annual_Mlg IS NOT NULL THEN Annual_Mlg
WHEN Annual_Mlg IS NULL THEN
CASE
WHEN PREV.Annual_Mlg IS NOT NULL
AND NEXT.Annual_Mlg IS NOT NULL
THEN ( PREV.Annual_Mlg + NEXT.Annual_Mlg ) / 2
ELSE 0
END
END AS Annual_Mlg
FROM #table
The above code doesn't work, but I need to start somewhere and that is what I have currently.
I understand what I need to do; I just do not know how to code it in SQL.
After I applied the row_number() function, I got incorrect output for the first 2 clients, while for the remaining 4 clients row_number() gave correct output. I have no idea why that is. Could it be because I used a "full join" earlier to combine the mileage and crashes tables?
Your use of #table tells me that you're using MS SQL Server (a temporary table, probably in a stored procedure).
You want to:
select all the rows in #table
joined with the matching row (if any) for the previous year, and
joined with the matching row (if any) for the next year
Then it's easy. Assuming the primary key on your #table is composed of the year and customer columns, something like this ought to do you:
select t.year ,
       t.customer ,
       t.crashes ,
       annual_mileage = coalesce(
                          t.annual_mileage ,
                          ( coalesce( p.annual_mileage, 0 ) +
                            coalesce( n.annual_mileage, 0 )
                          ) / 2
                        )
from #table t                                  -- take all the rows
left join #table p on p.year     = t.year - 1  -- with the matching row for
                  and p.customer = t.customer  -- the previous year (if any)
left join #table n on n.year     = t.year + 1  -- and the matching row for
                  and n.customer = t.customer  -- the next year (if any)
Notes:
What value you default to if the previous or next year doesn't exist is up to you (zero? some arbitrary value?)
Is the previous/next year guaranteed to be the current year +/- 1? If not, you may have to use derived tables as the source for the prev/next data, selecting the closest previous/next year (that sort of thing rather complicates the query significantly).
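To see how the zero default in the notes above plays out, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The table name mileage is hypothetical (SQLite has no #table temp-table syntax), and the rows are a subset of the question's data, cut off after 2016 so that the last year has no following row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE mileage (
    year INTEGER, customer INTEGER, crashes INTEGER, annual_mileage INTEGER,
    PRIMARY KEY (year, customer))""")
conn.executemany("INSERT INTO mileage VALUES (?, ?, ?, ?)", [
    (2013, 123, 3, 63453453),
    (2014, 123, 4, None),      # gap with both neighbours present
    (2015, 123, 15, 6346747),
    (2016, 123, 0, None),      # gap with no following year in this sample
])

# Self-join on year - 1 and year + 1; a missing neighbour counts as 0.
query = """
SELECT t.year, t.customer, t.crashes,
       COALESCE(t.annual_mileage,
                (COALESCE(p.annual_mileage, 0) + COALESCE(n.annual_mileage, 0)) / 2
       ) AS annual_mileage
FROM mileage t
LEFT JOIN mileage p ON p.year = t.year - 1 AND p.customer = t.customer
LEFT JOIN mileage n ON n.year = t.year + 1 AND n.customer = t.customer
ORDER BY t.customer, t.year
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

The 2014 gap is filled with the average of its two neighbours, while the 2016 gap gets half of the 2015 value because the missing next year is treated as 0. If you would rather carry the prior value forward in that case, the default has to change accordingly.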
Edited To Note:
If you have discontiguous years for each customer, such that the "previous" and "next" years for a given customer are not necessarily the current year +/- 1, then something like this is probably the most straightforward way to find the previous/next year.
We use a derived table in our from clause and assign a sequential number in lieu of the year for each customer, using the ranking function row_number(). This query, then,
select row_nbr = row_number() over (
partition by x.customer
order by x.year
) ,
x.*
from #table x
would produce results along these lines:
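| row_nbr | customer | year | ... |
| --- | --- | --- | --- |
| 1 | 123 | 1992 | ... |
| 2 | 123 | 1993 | ... |
| 3 | 123 | 1995 | ... |
| 4 | 123 | 2020 | ... |
| 1 | 456 | 2001 | ... |
| 2 | 456 | 2005 | ... |
| 3 | 456 | 2020 | ... |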
And that leads us to this:
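select year           = t.year ,
       customer       = t.customer ,
       crashes        = t.crashes ,
       annual_mileage = coalesce(
                          t.annual_mileage,
                          ( coalesce(p.annual_mileage,0) +
                            coalesce(n.annual_mileage,0)
                          ) / 2
                        )
from (
       select row_nbr = row_number() over (
                          partition by x.customer
                          order by x.year
                        ) ,
              x.*
       from #table x
     ) t
-- p and n must come from the same numbered derived table,
-- because #table itself has no row_nbr column
left join (
       select row_nbr = row_number() over (
                          partition by x.customer
                          order by x.year
                        ) ,
              x.*
       from #table x
     ) p on p.customer = t.customer and p.row_nbr = t.row_nbr-1
left join (
       select row_nbr = row_number() over (
                          partition by x.customer
                          order by x.year
                        ) ,
              x.*
       from #table x
     ) n on n.customer = t.customer and n.row_nbr = t.row_nbr+1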
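The row-number idea can be tried out with a sketch like the following, again using Python's built-in sqlite3 module as a stand-in for SQL Server. It assumes an SQLite build with window-function support (3.25 or later), uses a hypothetical mileage table with made-up values, and writes the numbered set once as a common table expression instead of repeating a derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mileage (year INTEGER, customer INTEGER, annual_mileage INTEGER)")
# Discontiguous years: the neighbours of 2005 are 2001 and 2020.
conn.executemany("INSERT INTO mileage VALUES (?, ?, ?)", [
    (2001, 456, 100),
    (2005, 456, None),
    (2020, 456, 300),
])

query = """
WITH numbered AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY customer ORDER BY year) AS row_nbr,
           year, customer, annual_mileage
    FROM mileage
)
SELECT t.year, t.customer,
       COALESCE(t.annual_mileage,
                (COALESCE(p.annual_mileage, 0) + COALESCE(n.annual_mileage, 0)) / 2
       ) AS annual_mileage
FROM numbered t
LEFT JOIN numbered p ON p.customer = t.customer AND p.row_nbr = t.row_nbr - 1
LEFT JOIN numbered n ON n.customer = t.customer AND n.row_nbr = t.row_nbr + 1
ORDER BY t.customer, t.year
"""
gap_filled = conn.execute(query).fetchall()
for row in gap_filled:
    print(row)
```

Joining on row_nbr rather than on year means the closest existing rows on either side are used, however far apart the years are.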

SQL query for until 2005 but not after 2005?

Find all IDs that taught up to 2005 but have not taught after 2005.
For example:
year ID
2010 A
2009 C
2005 B
2002 D
2002 C
2001 B
2000 A
Then the result should give only B and D.
The table has columns ID and year and I want to print out ID.
SELECT ID
FROM university.teaches
WHERE year <= 2005 AND
year NOT IN (SELECT year FROM university.teaches WHERE year> 2005);
I am trying something like this, but the result includes A and C.
You should check for ids and not years with the IN operator:
SELECT DISTINCT ID
FROM university.teaches
WHERE id NOT IN (SELECT id FROM university.teaches WHERE year > 2005);
The subquery of IN returns all the ids that have taught after 2005, so NOT IN returns all the remaining ids.
Results:
| ID |
| --- |
| B |
| D |
Use GROUP BY and MAX():
select id
from university.teaches
group by id
having max(year) <= 2005;
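Both answers can be checked against the question's sample data with a short sketch using Python's built-in sqlite3 module. The table is named teaches here without the university schema prefix, which is a simplification for the in-memory demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teaches (year INTEGER, id TEXT)")
conn.executemany("INSERT INTO teaches VALUES (?, ?)", [
    (2010, "A"), (2009, "C"), (2005, "B"), (2002, "D"),
    (2002, "C"), (2001, "B"), (2000, "A"),
])

# Variant 1: exclude every id that appears in any year after 2005.
not_in_ids = [r[0] for r in conn.execute("""
    SELECT DISTINCT id FROM teaches
    WHERE id NOT IN (SELECT id FROM teaches WHERE year > 2005)
    ORDER BY id""")]

# Variant 2: keep ids whose latest teaching year is 2005 or earlier.
group_by_ids = [r[0] for r in conn.execute("""
    SELECT id FROM teaches
    GROUP BY id
    HAVING MAX(year) <= 2005
    ORDER BY id""")]

print(not_in_ids, group_by_ids)
```

The two variants agree on this data. One difference worth knowing: if id could be NULL in rows after 2005, the NOT IN version would return no rows at all, whereas the GROUP BY version is unaffected.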

select columns between month and year

I have table with columns:
id month year
1 10 2011
2 1 2012
3 4 2011
4 3 2012
I want to select ids where (month=10 and year=2011) and (month=1 and year=2012). Is that possible?
This is a basic SQL SELECT:
SELECT id FROM myTable WHERE (month = 10 AND year = 2011) OR (month = 1 AND year = 2012);
To search for rows between any two dates, the simplest solution may be to combine the month and year into a single number and then use numeric comparison:
SELECT id
FROM myTable
WHERE year*100 + month BETWEEN 201110 AND 201201
A drawback of this solution is that the expression year*100 + month can't take advantage of indexes on the month and year columns, so it will be slow on very large tables.
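As a quick check of the combined-number comparison, here is a sketch using Python's built-in sqlite3 module with the four rows from the question. The table name myTable follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (id INTEGER, month INTEGER, year INTEGER)")
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?)", [
    (1, 10, 2011), (2, 1, 2012), (3, 4, 2011), (4, 3, 2012),
])

# year*100 + month turns (2011, 10) into 201110, (2012, 1) into 201201, etc.
ids_in_range = [r[0] for r in conn.execute("""
    SELECT id FROM myTable
    WHERE year*100 + month BETWEEN 201110 AND 201201
    ORDER BY id""")]
print(ids_in_range)
```

Ids 3 (201104) and 4 (201203) fall outside the range, so only ids 1 and 2 are returned.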

Append X amount of results from one query to X amount of results from another (with no foreign key join)

This query brings back results like this: select distinct date from dwh.product_count
April, 2013
March, 2013
February, 2013
January, 2013
I'd like to append however many results that brings back to the results from this query:
select distinct p_id from dwh.members a
5
7
8
...etc
So that my results would look like this:
5 April, 2013
5 March, 2013
5 February, 2013
5 January, 2013
7 April, 2013
7 March, 2013
7 February, 2013
7 January, 2013
etc....
What type of query would bring these results?
select id, dt
from
(select distinct p_id as id from dwh.members) s
cross join
(select distinct date as dt from dwh.product_count) t
The same result using the older comma (implicit cross join) syntax:
select id, dt from
( select distinct p_id as id from dwh.members ) s,
( select distinct date as dt from dwh.product_count ) t
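A small sketch of the cross join, using Python's built-in sqlite3 module. The tables are named members and product_count without the dwh schema prefix, and the sample values (including the duplicates) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (p_id INTEGER)")
conn.execute('CREATE TABLE product_count ("date" TEXT)')
conn.executemany("INSERT INTO members VALUES (?)", [(5,), (5,), (7,)])
conn.executemany("INSERT INTO product_count VALUES (?)",
                 [("2013-03",), ("2013-04",), ("2013-04",)])

# Every distinct id is paired with every distinct date; no join key is needed.
pairs = conn.execute("""
    SELECT id, dt
    FROM (SELECT DISTINCT p_id AS id FROM members) s
    CROSS JOIN (SELECT DISTINCT "date" AS dt FROM product_count) t
    ORDER BY id, dt""").fetchall()
for pair in pairs:
    print(pair)
```

With 2 distinct ids and 2 distinct dates the result has 2 x 2 = 4 rows, which is the general rule for a cross join: the row count is the product of the two inputs.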

Generate year to date by month report in SQL [duplicate]

I am trying to put together an SQL statement that returns the SUM of a value by month, but on a year to date basis. In other words, for the month of March, I am looking to get the sum of a value for the months of January, February, and March.
I can easily do a group by to get a total for each month by itself, and potentially calculate the year to date value I need in my application from this data by looping through the results set. However, I was hoping to have some of this work handled with my SQL statement.
Has anyone ever tackled this type of problem with an SQL statement, and if so, what is the trick that I am missing?
My current sql statement for monthly data is similar to the following:
Select month, year, sum(value) from mytable group by month, year
If I include a where clause on the month, and only group by the year, I can get the result for a single month that I am looking for:
select year, sum(value) from mytable where month <= selectedMonth group by year
However, this requires me to have a particular month pre-selected or to utilize 12 different SQL statements to generate one clean result set.
Any guidance that can be provided would be greatly appreciated!
Update: The data is stored on an IBM iSeries.
-- SQL Server table variable (declared with @, not #)
declare @Q table
(
    mmonth int,
    value  int
)
insert into @Q
values
    (1,10),
    (1,12),
    (2,45),
    (3,23)
select sum(January)  as UpToJanuary,
       sum(February) as UpToFebruary,
       sum(March)    as UpToMarch
from (
    select
        case when mmonth <= 1 then sum(value) end as [January],
        case when mmonth <= 2 then sum(value) end as [February],
        case when mmonth <= 3 then sum(value) end as [March]
    from @Q
    group by mmonth
) t
Produces:
UpToJanuary UpToFebruary UpToMarch
22 67 90
You get the idea, right?
NOTE: This could be done more easily with PIVOT, but I don't know whether you are using SQL Server or not.
As far as I know, DB2 does support windowing functions, although I don't know whether this is also supported on the iSeries version.
If windowing functions are supported (I believe IBM calls them OLAP functions), then the following should return what you want (provided I understood your question correctly):
select month,
year,
value,
sum(value) over (partition by year order by month asc) as sum_to_date
from mytable
order by year, month
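The windowed running total can be tried with a sketch like this one, using Python's built-in sqlite3 module in place of DB2 and assuming an SQLite build with window-function support (3.25 or later). The 2011 values are borrowed from the sample data elsewhere in this thread; the 2012 row is made up to show the total restarting per year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (month INTEGER, year INTEGER, value INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    (1, 2011, 120), (2, 2011, 130), (3, 2011, 500), (4, 2011, 10),
    (1, 2012, 50),   # hypothetical row: the running total resets for a new year
])

# With ORDER BY in the window, SUM covers all rows up to the current month.
ytd = conn.execute("""
    SELECT month, year, value,
           SUM(value) OVER (PARTITION BY year ORDER BY month ASC) AS sum_to_date
    FROM mytable
    ORDER BY year, month""").fetchall()
for row in ytd:
    print(row)
```

This query returns one row per source row. If the table holds several rows per month, aggregate to one row per month first (for example with group by month, year) before applying the window.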
create table mon
(
[y] int not null,
[m] int not null,
[value] int not null,
primary key (y,m))
select a.y, a.m, a.value, sum(b.value)
from mon a, mon b
where a.y = b.y and a.m >= b.m
group by a.y, a.m, a.value
2011 1 120 120
2011 2 130 250
2011 3 500 750
2011 4 10 760
2011 5 140 900
2011 6 100 1000
2011 7 110 1110
2011 8 90 1200
2011 9 70 1270
2011 10 150 1420
2011 11 170 1590
2011 12 600 2190
Join the table to itself so that each row is paired with every month at or after it in the same year, then group by that later month (a synthetic "up to month" key) as follows:
select
    sum(value),
    year,
    up_to_month
from (
    select a.value,
           a.year,
           b.month as up_to_month
    from my.rep as a
    join my.rep as b on a.year = b.year and b.month >= a.month
) as t
group by up_to_month, year
This gives:
db2 => select * from my.rep
VALUE YEAR MONTH
----------- ----------- -----------
100 2011 1
200 2011 2
300 2011 3
400 2011 4
db2 -t -f rep.sql
1 YEAR UP_TO_MONTH
----------- ----------- -----------
100 2011 1
300 2011 2
600 2011 3
1000 2011 4