I have a question regarding Oracle SQL.
My data looks like this:
id year
-- ----
1 2000
1 2001
1 2002
1 2003
1 2006
1 2000
2 2001
2 2002
2 2003
3 2003
3 2005
4 2012
4 2013
I want the ids that have all of the years 2001, 2002, and 2003.
My result set:
id
--
1
2
Please help me with this. I tried searching for it, but couldn't figure out how to phrase my particular problem.
SQL
SELECT t.id
FROM years t
WHERE t.year in(2001,2002,2003)
GROUP BY t.id
Sample SqlFiddle
http://sqlfiddle.com/#!2/4ec9f/2/0
Explanation
You want to filter your data set so that it only contains rows for certain years, so that condition goes in the WHERE clause: WHERE t.year in(2001,2002,2003).
Since a single id can appear in several of those years, the result set would contain duplicates. To remove them you can either GROUP BY the id or use the DISTINCT keyword so that each id is returned once.
UPDATE
Based on comments, here's a version that will only display id's that have all three years. We use DISTINCT t.YEAR to avoid counting id's that perhaps would have a single year repeated multiple times. The HAVING COUNT(DISTINCT t.YEAR) = 3 part ensures that we only include id's that have all three years.
SELECT t.id
FROM years t
WHERE t.year in(2001,2002,2003)
GROUP BY t.id
HAVING COUNT(DISTINCT t.YEAR) = 3
Updated sqlFiddle, which includes a data set where id of 3 has two rows for 2003 to show off the logic that only counts unique years for an ID.
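To try the HAVING COUNT(DISTINCT ...) logic without an Oracle instance, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. The table name years comes from the answer; the sample rows are the question's data plus the duplicate 2003 row for id 3 that the updated fiddle is described as containing. The helper name ids_with_all_years is hypothetical.

```python
import sqlite3

# Question's sample data, plus a duplicate (3, 2003) row to exercise DISTINCT.
rows = [(1, 2000), (1, 2001), (1, 2002), (1, 2003), (1, 2006), (1, 2000),
        (2, 2001), (2, 2002), (2, 2003),
        (3, 2003), (3, 2003), (3, 2005),
        (4, 2012), (4, 2013)]


def ids_with_all_years(rows, wanted):
    """Return the ids that have every year in `wanted` (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE years (id INTEGER, year INTEGER)")
    conn.executemany("INSERT INTO years VALUES (?, ?)", rows)
    placeholders = ",".join("?" for _ in wanted)
    sql = f"""
        SELECT t.id
        FROM years t
        WHERE t.year IN ({placeholders})
        GROUP BY t.id
        HAVING COUNT(DISTINCT t.year) = ?   -- one distinct match per wanted year
        ORDER BY t.id
    """
    result = [r[0] for r in conn.execute(sql, (*wanted, len(wanted)))]
    conn.close()
    return result


result = ids_with_all_years(rows, [2001, 2002, 2003])
print(result)
```

Passing the number of wanted years as a parameter keeps the HAVING threshold in sync with the IN list, which is easy to forget when the list is edited by hand.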
select distinct id
from years
where year in(2001,2002,2003)
Related
I'm trying to get the minimum value of open for each year, where every year has multiple rows. This is from app.mode.com; the site only says SQL, so I'm not sure which dialect it is.
SELECT year, open
FROM tutorial.aapl_historical_stock_price
WHERE open =
(
select MIN(open)
FROM tutorial.aapl_historical_stock_price
)
When I use the code above, the result is
Year Open
---- ----
2000 0
2000 0
2000 0
What I'm trying to get is
Year Open
---- ----
2002 0
2001 0
2000 0
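Can someone point out what I'm doing wrong?
Your subquery returns the single overall minimum, so only rows matching that one value come back. Select year and take the minimum of open within each year by grouping on year: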
select
year
, min(open) as <desired_alias>
from your_table
group by 1
order by 1 desc;
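As a rough check of the grouping approach, the sketch below runs it on SQLite through Python. The table name prices and the price values are made up for illustration (the real data lives in tutorial.aapl_historical_stock_price), and the columns are named explicitly instead of using positional GROUP BY 1.

```python
import sqlite3

# Hypothetical sample rows: (year, open). Values are invented for the demo.
sample = [(2000, 0.9), (2000, 1.5), (2001, 0.3), (2001, 0.8), (2002, 0.5)]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE prices (year INTEGER, "open" REAL)')
conn.executemany("INSERT INTO prices VALUES (?, ?)", sample)

# One row per year, carrying that year's lowest opening price.
min_open_by_year = conn.execute(
    """
    SELECT year, MIN("open") AS min_open
    FROM prices
    GROUP BY year
    ORDER BY year DESC
    """
).fetchall()
conn.close()

print(min_open_by_year)
```

The original WHERE open = (SELECT MIN(open) ...) form has no per-year scope, which is why it returned the same year three times.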
How do I average the last 6 months of sales within SQL?
Here are my tables and fields:
IM_ItemWhseHistoryByPeriod.FISCALCALPERIOD,
IM_ItemWhseHistoryByPeriod.FISCALCALYEAR,
And I need to average these fields
IM_ItemWhseHistoryByPeriod.DOLLARSSOLD,
IM_ItemWhseHistoryByPeriod.QUANTITYSOLD,
The part I'm having trouble with is how to average over the last whole 6 months, i.e. FISCALCALPERIOD 2-6 (inside FISCALCALYEAR 2017).
I'm hoping for some help on what the SQL command text should look like since I'm very new to manipulating SQL outside of the UI.
My Existing SQL String:
SELECT IM_ItemWhseHistoryByPeriod.ITEMCODE,
IM_ItemWhseHistoryByPeriod.DOLLARSSOLD,
IM_ItemWhseHistoryByPeriod.QUANTITYSOLD,
IM_ItemWhseHistoryByPeriod.FISCALCALPERIOD,
IM_ItemWhseHistoryByPeriod.FISCALCALYEAR
FROM MAS_AME.dbo.IM_ItemWhseHistoryByPeriod
IM_ItemWhseHistoryByPeriod
ScaisEdge Attempt #1
If FISCALCALYEAR and FISCALCALPERIOD are numeric, you could use
select avg(IM_ItemWhseHistoryByPeriod.DOLLARSSOLD),
avg(IM_ItemWhseHistoryByPeriod.QUANTITYSOLD)
from IM_ItemWhseHistoryByPeriod
where IM_ItemWhseHistoryByPeriod.FISCALCALYEAR = 2017
and IM_ItemWhseHistoryByPeriod.FISCALCALPERIOD between 2 and 6
or, for each item code,
select IM_ItemWhseHistoryByPeriod.ITEMCODE,
avg(IM_ItemWhseHistoryByPeriod.DOLLARSSOLD),
avg(IM_ItemWhseHistoryByPeriod.QUANTITYSOLD)
from IM_ItemWhseHistoryByPeriod
where IM_ItemWhseHistoryByPeriod.FISCALCALYEAR = 2017
and IM_ItemWhseHistoryByPeriod.FISCALCALPERIOD between 2 and 6
group by IM_ItemWhseHistoryByPeriod.ITEMCODE
Try the following solution and see if it works for you:
select avg(DOLLARSSOLD) as AvgDollarsSold,
avg(QUANTITYSOLD) as AvgQtySold
from IM_ItemWhseHistoryByPeriod
where FISCALCALYEAR = 2017
and FISCALCALPERIOD between 2 and 6
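For anyone who wants to sanity-check the per-item average before running it against the real database, here is a small sketch on SQLite via Python. The column names follow the question; the rows and the helper name average_sales are invented for the demo, and the year and period bounds are passed as parameters rather than hard-coded.

```python
import sqlite3

# Hypothetical rows: (ITEMCODE, FISCALCALYEAR, FISCALCALPERIOD, DOLLARSSOLD, QUANTITYSOLD)
sample = [("A", 2017, p, 100.0 * p, float(p)) for p in range(1, 8)]
sample += [("B", 2017, 3, 50.0, 5.0), ("B", 2017, 9, 999.0, 99.0)]


def average_sales(rows, year, first_period, last_period):
    """Average dollars and quantity per item over an inclusive period range."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE IM_ItemWhseHistoryByPeriod ("
        "ITEMCODE TEXT, FISCALCALYEAR INTEGER, FISCALCALPERIOD INTEGER, "
        "DOLLARSSOLD REAL, QUANTITYSOLD REAL)"
    )
    conn.executemany(
        "INSERT INTO IM_ItemWhseHistoryByPeriod VALUES (?, ?, ?, ?, ?)", rows
    )
    result = conn.execute(
        """
        SELECT ITEMCODE, AVG(DOLLARSSOLD), AVG(QUANTITYSOLD)
        FROM IM_ItemWhseHistoryByPeriod
        WHERE FISCALCALYEAR = ?
          AND FISCALCALPERIOD BETWEEN ? AND ?
        GROUP BY ITEMCODE
        ORDER BY ITEMCODE
        """,
        (year, first_period, last_period),
    ).fetchall()
    conn.close()
    return result


averages = average_sales(sample, 2017, 2, 6)
print(averages)
```

Note that BETWEEN 2 AND 6 is inclusive on both ends, so it covers five periods; AVG also only divides by the periods that actually have a row for the item.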
Is there any way to generate a custom sequential number like the following?
I want the Number to be incremented with grouping by the Code and Year.
Code Year Number
A 2016 1
A 2016 2
A 2016 3
B 2016 1
B 2016 2
C 2016 1
A 2017 1
A 2017 2
Any suggestion would be appreciated.
EDIT
Sorry, I was too vague about what I want. I want to generate the unique number at query time, so if I ask for a new number in the data above with Code: A and Year: 2017, I want the Number to be 3. I guess that to get the Number reliably in the future I need to save the Code and Year along with the Number.
Use ROW_NUMBER to assign Number per Code,Year grouping.
SELECT *,
Number = ROW_NUMBER() OVER(PARTITION BY Code, [Year] ORDER BY (SELECT NULL))
FROM tbl
Replace (SELECT NULL) with the column you want the ordering to be based on.
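A small sketch of both parts, run on SQLite through Python (window functions need SQLite 3.25 or later). The seq column is an assumption added here to give ROW_NUMBER a deterministic order, and next_number is a hypothetical helper covering the EDIT: it returns the count of existing rows for the Code and Year plus one.

```python
import sqlite3

# (seq, Code, Year): seq is an assumed insertion-order column, not in the question.
data = [(1, "A", 2016), (2, "A", 2016), (3, "A", 2016), (4, "B", 2016),
        (5, "B", 2016), (6, "C", 2016), (7, "A", 2017), (8, "A", 2017)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (seq INTEGER, Code TEXT, Year INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", data)

# Number restarts at 1 for every (Code, Year) partition.
numbered = conn.execute(
    """
    SELECT Code, Year,
           ROW_NUMBER() OVER (PARTITION BY Code, Year ORDER BY seq) AS Number
    FROM tbl
    ORDER BY seq
    """
).fetchall()


def next_number(conn, code, year):
    """Next sequential number for a Code/Year pair (hypothetical helper)."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM tbl WHERE Code = ? AND Year = ?", (code, year)
    ).fetchone()
    return count + 1


print(numbered)
print(next_number(conn, "A", 2017))
```

If several sessions can request a number at the same time, the count-plus-one lookup and the following insert would need to run in one transaction with suitable locking; that part is not shown here.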
I'm using SAS University Edition to analyze the following table (actually has 2.5M rows in it)
p_id c_id startyear endyear
0001 3201 2008 2013
0001 2131 2013 2015
0013 3201 2006 2010
where p_id is person_id and c_id is companyid.
I want to get number of colleagues (number of persons that worked during an overlapping span at the same companies) in a certain year, so I created a table with the distinct p_ids and do the following query:
PROC SQL;
UPDATE no_colleagues AS t1
SET c2007 = (
SELECT COUNT(DISTINCT t2.p_id) - 1
FROM table AS t2
INNER JOIN table AS t3
ON t3.p_id = t1.p_id
AND t3.c_id = t2.c_id
AND t3.startyear <= t2.endyear /* checks overlapping criteria */
AND t3.endyear >= t2.startyear /* checks overlapping criteria */
AND t3.startyear <= 2007 /* limits number of returns */
AND t2.startyear <= 2007 /* limits number of returns */
);
QUIT;
A single lookup on an indexed query (p_id, c_id, startyear, endyear) takes 0.04 seconds. The query above takes about 1.8 seconds for a single update, and does not use any indexes.
So my question is:
How to improve the query, and/or how to use indices to make sure the self join can use the indices?
Thanks in advance.
Based on your data, I'd do something like this, but maybe you need to tweak the code to fit your needs.
First, create a table with p_id, c_id, year.
So your first guy working at the company 3201 will have 6 observations in this table, one for each worked year.
data have_count;
set have;
do i=startyear to endyear;
worked_in = i;
output;
end;
drop i startyear endyear;
run;
Now you just count and aggregate:
proc sql;
select
worked_in as year
,c_id
,count(distinct p_id) as no_colleagues
from have_count
group by 1,2;
quit;
Result:
year c_id no_colleagues
2006 3201 1
2007 3201 1
2008 3201 2
2009 3201 2
2010 3201 2
2011 3201 1
2012 3201 1
2013 2131 1
2013 3201 1
2014 2131 1
2015 2131 1
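For readers without SAS at hand, the same expand-then-count idea can be sketched in plain Python. The three input rows are the ones from the question; the function name headcount_by_year is hypothetical. Like the PROC SQL step above, it counts distinct people per year and company.

```python
from collections import defaultdict

# (p_id, c_id, startyear, endyear) rows from the question.
have = [("0001", 3201, 2008, 2013),
        ("0001", 2131, 2013, 2015),
        ("0013", 3201, 2006, 2010)]


def headcount_by_year(rows):
    """Expand each span into one entry per year, then count distinct p_ids."""
    people = defaultdict(set)
    for p_id, c_id, startyear, endyear in rows:
        for year in range(startyear, endyear + 1):  # endyear is inclusive
            people[(year, c_id)].add(p_id)
    return {key: len(ids) for key, ids in sorted(people.items())}


counts = headcount_by_year(have)
for (year, c_id), n in counts.items():
    print(year, c_id, n)
```

With 2.5M input rows the expanded table grows by the average span length, so this trades memory for a single pass instead of a correlated update per person.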
A more efficient method:
1) Create a long format table for the results rather than wide format. This will be both easier to populate and easier to work with later.
create table colleagues_by_year (
p_id int,
year int,
colleagues int
);
Now this can be populated with a single insert statement. The only trick is getting the full list of years you want in the final table. There are a few options, but since I'm not too familiar with SAS SQL I'm going to go with a very simple one: a lookup table of years, to which you can join.
create table years (
year int
);
insert into years
values (2007),(2008),...
(A more sophisticated approach would be a recursive query that found the range of all years in the input data).
Now the final insert:
insert into colleagues_by_year
select p_id,
year,
count(*)
from colleagues
join years on
years.year between colleagues.startyear and colleagues.endyear
group by p_id,year
This won't have any rows where the number of colleagues for the year would be 0. If you wanted that you could make years be a left join and only count the rows where years.year is not null.
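The years lookup join can be tried out on SQLite via Python. Note that this sketch is a variation, not the insert above verbatim: assuming the source table holds the original (p_id, c_id, startyear, endyear) rows, it adds a self-join so that the count is the number of distinct other people at the same company in that year. The table name employment and the year range 2006 to 2015 are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employment (p_id TEXT, c_id INTEGER, startyear INTEGER, endyear INTEGER)"
)
conn.executemany(
    "INSERT INTO employment VALUES (?, ?, ?, ?)",
    [("0001", 3201, 2008, 2013), ("0001", 2131, 2013, 2015), ("0013", 3201, 2006, 2010)],
)
conn.execute("CREATE TABLE years (year INTEGER)")
conn.executemany("INSERT INTO years VALUES (?)", [(y,) for y in range(2006, 2016)])

# For each person and each year they worked, count distinct other people
# employed at the same company in that year. The LEFT JOIN keeps years with
# no colleagues, where COUNT(DISTINCT b.p_id) evaluates to 0.
query = """
    SELECT a.p_id, y.year, COUNT(DISTINCT b.p_id) AS colleagues
    FROM employment a
    JOIN years y
      ON y.year BETWEEN a.startyear AND a.endyear
    LEFT JOIN employment b
      ON b.c_id = a.c_id
     AND b.p_id <> a.p_id
     AND y.year BETWEEN b.startyear AND b.endyear
    GROUP BY a.p_id, y.year
    ORDER BY a.p_id, y.year
"""
colleagues_by_year = {(p, y): n for p, y, n in conn.execute(query)}
conn.close()

print(colleagues_by_year)
```

An index on (c_id, startyear, endyear) would be the natural candidate to support the self-join, though whether the optimizer uses it depends on the engine.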
I am trying to generate a report based on the below two tables:
Name        Start Year  End Year  No. Of Students  Fill Order
School-ABC  2000        2004      1                1
School-DEF  2000        2004      2                3
School-GHI  2000        2004      1                2

Name       Start Year  End Year  Joined On
Student-1  2000        2004      01-Jan
Student-2  2000        2004      03-Jan
Student-3  2000        2004      02-Jan
Student-4  2000        2004      15-Jan
The expected output is below:
Name       Start Year  End Year  Joined On  School
Student-1  2000        2004      01-Jan     School-ABC
Student-2  2000        2004      03-Jan     School-DEF
Student-3  2000        2004      02-Jan     School-GHI
Student-4  2000        2004      15-Jan     School-DEF
Logic behind generating the data:
First table contains the list of schools and the seats available (along with the priority in which seats will be allocated to students on FCFS basis)
The second table contains data on the list of students enrolled to schools, with their admission date and the start/end year of course.
I am required to populate based on the "Fill Order", the school that is allocated to each student.
After analyzing the problem for a while, I have come to the conclusion that this might not be achievable using select queries alone. Currently, I am planning to do it with one cursor per table, processing the records row by row. Is there a better way of doing it, or is it possible through select statements? TIA
Note:
The database I use is Oracle 10g
I cannot create any temporary tables or alter the data in any of the tables. I strictly have read-only access to the database.
You could use Oracle analytic functions. row_number() over () can assign a number to each student based on their join date. sum() over () can calculate the first and last student for each school. Combining the two you get:
select stud.name
, stud.startyear
, stud.endyear
, stud.joinedon
, schl.name as SchoolName
from (
select name
, coalesce(sum(NoOfStudents) over (order by FillOrder
range between unbounded preceding and 1 preceding),0)+1 FirstStudent
, sum(NoOfStudents) over (order by FillOrder) as LastStudent
from Schools
) schl
join (
select row_number() over (order by JoinedOn) as StudentRank
, Students.*
from Students
) stud
on stud.StudentRank between schl.FirstStudent and schl.LastStudent
order by
stud.name
Live example at SQL Fiddle.
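The seat-range idea can be checked against the sample data with SQLite from Python (window functions need SQLite 3.25 or later). This is a sketch under a few assumptions: the column names seats, fill_order and joined_on are made up, joined_on is stored as month-day text so it sorts correctly, and the first seat is computed as the running total minus the school's own seats plus one instead of the RANGE ... 1 PRECEDING frame used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schools (name TEXT, seats INTEGER, fill_order INTEGER)")
conn.executemany(
    "INSERT INTO schools VALUES (?, ?, ?)",
    [("School-ABC", 1, 1), ("School-DEF", 2, 3), ("School-GHI", 1, 2)],
)
conn.execute("CREATE TABLE students (name TEXT, joined_on TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [("Student-1", "01-01"), ("Student-2", "01-03"),
     ("Student-3", "01-02"), ("Student-4", "01-15")],
)

# Each school owns a contiguous block of student ranks, in fill order;
# each student gets a rank by join date and lands in the block containing it.
query = """
    WITH schl AS (
        SELECT name,
               SUM(seats) OVER (ORDER BY fill_order) - seats + 1 AS first_student,
               SUM(seats) OVER (ORDER BY fill_order)             AS last_student
        FROM schools
    ),
    stud AS (
        SELECT name,
               ROW_NUMBER() OVER (ORDER BY joined_on) AS student_rank
        FROM students
    )
    SELECT stud.name, schl.name
    FROM stud
    JOIN schl
      ON stud.student_rank BETWEEN schl.first_student AND schl.last_student
    ORDER BY stud.name
"""
allocation = conn.execute(query).fetchall()
conn.close()

print(allocation)
```

Students beyond the total number of seats simply drop out of the inner join; the sample data treats all schools as one 2000 to 2004 cohort, so handling several start/end year groups would mean adding PARTITION BY to both window clauses and matching on those columns in the join.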