Find last (first) instance in table but exclude most recent (oldest) date - sql

I have a table that reflects a monthly census of a certain population. Each month, on an unpredictable day early in that month, the population is polled. Any member who existed at that point is included in that month's poll; any member who didn't is not.
My task is to look through an arbitrary date range and determine which members were added or lost during that time period. Consider the sample table:
ID | Date
2 | 1/3/2010
3 | 1/3/2010
1 | 2/5/2010
2 | 2/5/2010
3 | 2/5/2010
1 | 3/3/2010
3 | 3/3/2010
In this case, member with ID "1" was added between Jan and Feb, and member with ID 2 was lost between Feb and Mar.
The problem I am having is that if I just poll to try and find the most recent entry, I will capture all the members that were dropped, but also all the members that exist on the last date. For example, I could run this query:
SELECT
ID,
Max(Date)
FROM
tableName
WHERE
Date BETWEEN '1/1/2010' AND '3/27/2010'
GROUP BY
ID
This would return:
ID | Date
1 | 3/3/2010
2 | 2/5/2010
3 | 3/3/2010
What I actually want, however, is just:
ID | Date
2 | 2/5/2010
Of course I can manually filter out the last date, but since the start and end date are parameters I want to generalize that. One way would be to run sequential queries. In the first query I'd find the last date, and then use that to filter in the second query. It would really help, however, if I could wrap this logic into a single query.
I'm also having a related problem when I try to find when a member was first added to the population. In that case I'm using a different type of query:
SELECT
ID,
Date
FROM
tableName i
WHERE
Date BETWEEN '1/1/2010' AND '3/27/2010'
AND
NOT EXISTS(
SELECT
ID,
Date
FROM
tableName ii
WHERE
ii.ID=i.ID
AND
ii.Date < i.Date
AND
Date BETWEEN '1/1/2010' AND '3/27/2010'
)
This returns:
ID | Date
1 | 2/5/2010
2 | 1/3/2010
3 | 1/3/2010
But what I want is:
ID | Date
1 | 2/5/2010
I would like to know:
1. Which approach (the MAX() or the subquery with NOT EXISTS) is more efficient and
2. How to fix the queries so that they only return the rows I want, excluding the first (last) date.
Thanks!

You could do something like this:
SELECT
ID,
Max(Date)
FROM
tableName
WHERE
Date BETWEEN '1/1/2010' AND '3/27/2010'
GROUP BY
ID
having max(date) < '3/1/2010'
This filters out anyone polled in March.
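The hard-coded cutoff can be swapped for a subquery that looks up the last poll date inside the requested range, which keeps everything in a single statement driven by the two date parameters. The following is only a sketch under some assumptions, not necessarily what the answerer had in mind: it runs on an in-memory SQLite database through Python's sqlite3 module, the table and column names (census, poll_date) are made up, and the dates are stored as ISO strings so that they compare correctly. The same shape with MIN() finds members who were added after the first poll in the range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; dates are ISO strings so string comparison matches date order.
conn.execute("CREATE TABLE census (id INTEGER, poll_date TEXT)")
conn.executemany(
    "INSERT INTO census VALUES (?, ?)",
    [(2, "2010-01-03"), (3, "2010-01-03"),
     (1, "2010-02-05"), (2, "2010-02-05"), (3, "2010-02-05"),
     (1, "2010-03-03"), (3, "2010-03-03")],
)

params = {"start": "2010-01-01", "end": "2010-03-27"}

# Members whose last poll is earlier than the last poll date in the range.
lost_sql = """
SELECT id, MAX(poll_date) AS last_seen
FROM census
WHERE poll_date BETWEEN :start AND :end
GROUP BY id
HAVING MAX(poll_date) < (SELECT MAX(poll_date) FROM census
                         WHERE poll_date BETWEEN :start AND :end)
ORDER BY id
"""

# Members whose first poll is later than the first poll date in the range.
added_sql = """
SELECT id, MIN(poll_date) AS first_seen
FROM census
WHERE poll_date BETWEEN :start AND :end
GROUP BY id
HAVING MIN(poll_date) > (SELECT MIN(poll_date) FROM census
                         WHERE poll_date BETWEEN :start AND :end)
ORDER BY id
"""

lost = conn.execute(lost_sql, params).fetchall()
added = conn.execute(added_sql, params).fetchall()
print(lost)
print(added)
```

With the sample data, lost contains only member 2 and added contains only member 1.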

Related

How to return one row based upon date in microsoft sql

I have an internal Department that will be switching account numbers in the future. I am trying to build a stored procedure to return only one of two rows based upon an effective date.
Here's the data makeup
|ID | EffectiveDate | AccountNum |
|-- | ------------- | ---------- |
| 1 | 2021-01-01 | 350000 |
| 2 | 2021-09-01 | 950000 |
I know this returns all the data
SELECT Id, EffectiveDate, AccountNum
FROM Account
This returns only the first row and never the 2nd:
SELECT Id, EffectiveDate, AccountNum
FROM Account
WHERE EffectiveDate <= GETDATE()
How would I dynamically return only 1 row based upon today's date?
So if today's date is earlier than the new EffectiveDate, get row one. If today's date is equal to or later than the new EffectiveDate, get row two.
I figured out the solution: select only the TOP 1 AccountNum among the rows whose EffectiveDate is less than or equal to the current date, ordered by EffectiveDate descending. Until the future EffectiveDate arrives, only the original row qualifies, so its value is returned. Once the future date is equal to or earlier than the current date, that row sorts first and its value is returned instead.
select TOP 1 AccountNum
from Account
where EffectiveDate <= GETDATE()
order by EffectiveDate desc
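To check the behaviour on both sides of the switch-over date, it helps to pass the "as of" date in as a parameter instead of relying on GETDATE(). Here is a rough sketch of that, assuming an in-memory SQLite database via Python's sqlite3 (so LIMIT 1 stands in for TOP 1), ISO date strings, and the AccountNum column from the sample data. The helper name account_on is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (Id INTEGER, EffectiveDate TEXT, AccountNum INTEGER)"
)
conn.executemany(
    "INSERT INTO Account VALUES (?, ?, ?)",
    [(1, "2021-01-01", 350000), (2, "2021-09-01", 950000)],
)

def account_on(as_of):
    """Return the account number in effect on the given ISO date, or None."""
    row = conn.execute(
        "SELECT AccountNum FROM Account "
        "WHERE EffectiveDate <= ? "
        "ORDER BY EffectiveDate DESC LIMIT 1",  # LIMIT 1 is SQLite's TOP 1
        (as_of,),
    ).fetchone()
    return row[0] if row else None

print(account_on("2021-06-15"))  # before the switch
print(account_on("2021-09-01"))  # on the switch date
print(account_on("2020-12-31"))  # before any row is effective
```

Note that a date earlier than every EffectiveDate matches no row at all, so the calling code has to handle an empty result.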

Find highest (max) date query, and then find highest value from results of previous query

Here is a table called packages:
id | packages_sent | date | sent_order
1 | 10 | 2017-02-11 | 1
2 | 25 | 2017-03-15 | 1
3 | 5 | 2017-04-08 | 1
4 | 20 | 2017-05-21 | 1
5 | 25 | 2017-05-21 | 2
6 | 5 | 2017-06-19 | 1
This table shows the number of packages sent on a given date; if there were multiple packages sent on the same date (as is the case with rows 4 and 5), then the sent_order keeps track of the order in which they were sent.
I am trying to make a query that will return sum(packages_sent) given the following conditions: first, return the row with the max(date) (given some date provided), and second, if there are multiple rows with the same max(date), return the row with the max(sent_order) (the highest sent_order value).
Here is the query I have so far:
SELECT sum(packages_sent)
FROM packages
WHERE date IN
(SELECT max(date)
FROM packages
WHERE date <= '2017-05-29');
This query correctly finds the max date, which is 2017-05-21, but then for the sum it returns 45 because it is adding rows 4 and 5 together.
I want the query to return the max(date), and if there are multiple rows with the same max(date), then return the row with the max(sent_order). Using the example above with the date 2017-05-29, it should only return 25.
I don't see where a sum() comes into play. You seem to only want the last row:
select p.*
from packages p
where date <= '2017-05-29'
order by date desc, sent_order desc
fetch first 1 row only;
If your data is truly ordered ascending as you show it, then it's easier to use the surrogate key ID field.
SELECT packages_sent
FROM packages
WHERE ID =
(SELECT max(ID)
FROM packages
WHERE date <= '2017-05-29');
Since the ID always increases with date and sent_order, finding its maximum also finds the maximum of the other two in one step.
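If the ID cannot be trusted to follow the date and sent_order, sorting on those two columns directly avoids that assumption. A small sketch with the cutoff date as a parameter, run on in-memory SQLite through Python's sqlite3 (LIMIT 1 plays the role of fetch first 1 row only). The function name latest_packages is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages "
    "(id INTEGER, packages_sent INTEGER, date TEXT, sent_order INTEGER)"
)
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?, ?)",
    [(1, 10, "2017-02-11", 1), (2, 25, "2017-03-15", 1),
     (3, 5, "2017-04-08", 1), (4, 20, "2017-05-21", 1),
     (5, 25, "2017-05-21", 2), (6, 5, "2017-06-19", 1)],
)

def latest_packages(cutoff):
    """packages_sent of the last shipment on or before cutoff, or None."""
    row = conn.execute(
        "SELECT packages_sent FROM packages "
        "WHERE date <= ? "
        "ORDER BY date DESC, sent_order DESC "  # latest date, then latest order
        "LIMIT 1",
        (cutoff,),
    ).fetchone()
    return row[0] if row else None

print(latest_packages("2017-05-29"))
```

For the cutoff 2017-05-29 this picks row 5 (the second shipment on 2017-05-21), which is the 25 the question asks for.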

sql query to get unique id for a row in oracle based on its continuity

I have a problem that needs to be solved using sql in oracle.
I have a dataset like given below:
value | date
-------------
1 | 01/01/2017
2 | 02/01/2017
3 | 03/01/2017
3 | 04/01/2017
2 | 05/01/2017
2 | 06/01/2017
4 | 07/01/2017
5 | 08/01/2017
I need to show the result in the below format:
value | date | Group
1 | 01/01/2017 | 1
2 | 02/01/2017 | 2
3 | 03/01/2017 | 3
3 | 04/01/2017 | 3
2 | 05/01/2017 | 4
2 | 06/01/2017 | 4
4 | 07/01/2017 | 5
5 | 08/01/2017 | 6
The logic is: whenever the value changes from one date to the next, the row gets assigned a new group/id, but if the value is the same as the previous one, the row is part of the same group.
Here is one method using lag() and cumulative sum:
select t.*,
sum(case when value = prev_value then 0 else 1 end) over (order by date) as grp
from (select t.*,
lag(value) over (order by date) as prev_value
from t
) t;
The logic here is simply to count, as a running total, the number of times the value changes from one row to the next.
This assumes that date is actually stored as a date and not a string. If it is a string, then the ordering will not be correct. Either convert to a date or use a column that specifies the correct ordering.
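To see the running-total idea on the sample rows, here is a sketch that can be run outside Oracle. It assumes SQLite 3.25 or later (needed for window functions) through Python's sqlite3, renames the column to dt, and reads the sample dates as consecutive days stored as ISO strings. All of that is an assumption made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (value INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "2017-01-01"), (2, "2017-01-02"), (3, "2017-01-03"),
     (3, "2017-01-04"), (2, "2017-01-05"), (2, "2017-01-06"),
     (4, "2017-01-07"), (5, "2017-01-08")],
)

sql = """
SELECT value, dt,
       -- add 1 at every row whose value differs from the previous row's;
       -- the first row has prev_value NULL, so it also counts as a change
       SUM(CASE WHEN value = prev_value THEN 0 ELSE 1 END)
           OVER (ORDER BY dt) AS grp
FROM (SELECT value, dt,
             LAG(value) OVER (ORDER BY dt) AS prev_value
      FROM t) AS s
ORDER BY dt
"""
rows = conn.execute(sql).fetchall()
groups = [grp for _, _, grp in rows]
print(groups)
```

The first row counts as a change because its prev_value is NULL, which is why the numbering starts at 1.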
Here is a solution using the MATCH_RECOGNIZE clause, introduced in Oracle 12.*
select value, dt, mn as grp
from inputs
match_recognize (
order by dt
measures match_number() as mn
all rows per match
pattern ( a b* )
define b as value = prev(value)
)
order by dt -- if needed
;
Here is how this works: Other than SELECT, FROM and ORDER BY, the query has only one clause, MATCH_RECOGNIZE. What this clause does is: it takes the rows from inputs and it orders them by dt. Then it searches for patterns: one row, marked as a, with no constraints, followed by zero or more rows b, where b is defined by the condition that the value is the same as for the prev[ious] row. What the clause calculates or measures is the match_number() - first "match" of the pattern, second match etc. We use this match number as the group number (grp) in the outer query - that's all we needed!
*Notes: The existence of solutions like this shows why it is important for posters to state their Oracle version. (Run the statement select * from v$version to find out.) Also: date and group are reserved words in Oracle and shouldn't be used as column names. Not even for posting made-up sample data. (There are workarounds but they aren't needed in this case.) Also, whenever using dates like 03/01/2017 in a post, please indicate whether that is March 1 or January 3, there's no way for "us" to tell. (It wasn't important in this case, but it is in the vast majority of cases.)
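For readers without a 12c database at hand, the pattern a b* boils down to splitting the ordered rows into runs of equal consecutive values and numbering the runs. The Python sketch below mimics that with itertools.groupby. It is only an illustration of the matching logic, not of how Oracle evaluates MATCH_RECOGNIZE, and it assumes the rows are already sorted by date (written here as ISO strings).

```python
from itertools import groupby

# Rows as (value, dt), already ordered by dt.
rows = [(1, "2017-01-01"), (2, "2017-01-02"), (3, "2017-01-03"),
        (3, "2017-01-04"), (2, "2017-01-05"), (2, "2017-01-06"),
        (4, "2017-01-07"), (5, "2017-01-08")]

grouped = []
# groupby yields one run per stretch of consecutive equal values,
# which corresponds to one match of "a b*"; enumerate supplies the match number.
for match_number, (_, run) in enumerate(groupby(rows, key=lambda r: r[0]), start=1):
    for value, dt in run:
        grouped.append((value, dt, match_number))

for row in grouped:
    print(row)
```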

How to do a sub-select per result entry in postgresql?

Assume I have a table with only two columns: id, maturity. maturity is some date in the future and represents the date until which a specific entry will be available. Thus it differs between entries but is not necessarily unique. Over time, the number of entries that have not yet reached their maturity date changes.
I need to count the entries in such a table that were available on a specific date (that is, entries that had not reached their maturity). So I basically need to join these two queries:
SELECT generate_series as date FROM generate_series('2015-10-01'::date, now()::date, '1 day');
SELECT COUNT(id) FROM mytable WHERE mytable.maturity > now()::date;
where instead of now()::date I need to put entry from the generated series. I'm sure this has to be simple enough, but I can't quite get around it. I need the resulting solution to remain a query, thus it seems that I can't use for loops.
Sample table entries:
id | maturity
---+-------------------
1 | 2015-10-03
2 | 2015-10-05
3 | 2015-10-11
4 | 2015-10-11
Expected output:
date | count
------------+-------------------
2015-10-01 | 4
2015-10-02 | 4
2015-10-03 | 3
2015-10-04 | 3
2015-10-05 | 2
2015-10-06 | 2
NOTE: The count does not decrease monotonically, because new entries are added over time and raise it.
You have to reference a column of the outer query in the WHERE clause of a subquery. This can be done if the subquery is in the SELECT clause of the outer query:
SELECT generate_series,
(SELECT COUNT(id)
FROM mytable
WHERE mytable.maturity > generate_series)
FROM generate_series('2015-10-01'::date, now()::date, '1 day');
More info: http://www.techonthenet.com/sql_server/subqueries.php
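The correlated subquery can be tried against the sample rows without a PostgreSQL server. The sketch below assumes in-memory SQLite through Python's sqlite3. Because generate_series is not available there by default, a recursive CTE (named days here, a made-up name) produces the date series, and the end date is fixed at 2015-10-06 instead of now() so the result matches the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, maturity TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "2015-10-03"), (2, "2015-10-05"),
     (3, "2015-10-11"), (4, "2015-10-11")],
)

sql = """
WITH RECURSIVE days(d) AS (
    SELECT :start
    UNION ALL
    SELECT date(d, '+1 day') FROM days WHERE d < :end
)
SELECT d,
       -- correlated subquery: evaluated once per generated day
       (SELECT COUNT(id) FROM mytable WHERE mytable.maturity > days.d) AS cnt
FROM days
ORDER BY d
"""
counts = conn.execute(sql, {"start": "2015-10-01", "end": "2015-10-06"}).fetchall()
for day, cnt in counts:
    print(day, cnt)
```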
I think you want to group your data by the maturity Date.
Check this:
select maturity,count(*) as count
from your_table group by maturity;

How can I see if a date is on a weekend?

I have a table:
ID | Name | TDate
1 | John | 1 May 2013, 8:67AM
2 | Jack | 2 May 2013, 6:43AM
3 | Adam | 3 May 2013, 9:53AM
4 | Max | 4 May 2013, 2:13AM
5 | Leny | 5 May 2013, 5:33AM
I need a query that will return all the items where TDate is a weekend. How would I write such a query?
WHAT I HAVE SO FAR
select
table.*,
EXTRACT (DAY FROM table.tdate )
from table
I did a select using EXTRACT to just see if I can get the right values. However, EXTRACT with the parameter DAY returns the day of the month. If I instead use WEEKDAY, as per the documentation here, then I get error:
ERROR: timestamp units "weekday" not recognized
SQL state: 22023
limit 1250
EDIT
TDate has a data type of datetime (timestamp). I just wrote it like that for easy reading. But regardless of the type, I could easily cast between types if need be.
I know the dates 4 May and 5 May are weekends (as they fall on a Saturday and a Sunday). Does Firebird allow for a way to write a query that will return dates if they fall on weekends?
Try this:
SELECT ID, Name, TDate
FROM your_table
WHERE EXTRACT(WEEKDAY FROM TDate) IN (6,0)
UPDATE
The condition must be (0, 6), not (0, 1), because Firebird's EXTRACT(WEEKDAY ...) numbers the days from 0 (Sunday) to 6 (Saturday).
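The same 0 = Sunday, 6 = Saturday numbering can be checked quickly on the sample dates. This sketch does not use Firebird: it assumes in-memory SQLite through Python's sqlite3, where strftime('%w', ...) happens to use the same numbering but returns it as text. The times of day are left out, and the table name visits is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (ID INTEGER, Name TEXT, TDate TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [(1, "John", "2013-05-01"), (2, "Jack", "2013-05-02"),
     (3, "Adam", "2013-05-03"), (4, "Max", "2013-05-04"),
     (5, "Leny", "2013-05-05")],
)

# strftime('%w') yields '0' (Sunday) through '6' (Saturday) as text.
weekend_rows = conn.execute(
    "SELECT ID, Name, TDate FROM visits "
    "WHERE strftime('%w', TDate) IN ('0', '6') "
    "ORDER BY ID"
).fetchall()
weekend_ids = [row[0] for row in weekend_rows]
print(weekend_ids)
```

Only the rows for 4 May and 5 May 2013 (Saturday and Sunday) come back.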