Counting ID's for correct creation date time - sql

I need to count the user IDs for each month, but each user should be counted only in the month of their earliest record.
So if customer A has a min(day) of 04/18, they are counted for 04/18 and for no other month.
My table looks like:
monthyear | id
02/18 A32
04/19 T39
05/19 T39
04/19 Y95
01/18 A32
12/19 I99
11/18 OPT
09/19 TT8
I was doing something like:
SELECT day, id
SUM(CASE WHEN month = min(day) THEN 1 ELSE 0)
FROM testtable
GROUP BY 1
But I'm not sure how to apply that per user ID, so that each user counts as 1 only in the row where day equals their min(day).
Goal table to be:
monthyear | count
01/18 1
02/18 0
11/18 1
04/19 2
05/19 0
09/19 1
12/19 1

Use window functions. Let me assume that your monthyear is really yearmonth, so it sorts correctly:
SELECT yearmonth, COUNT(*) as numstarts
FROM (SELECT tt.*, ROW_NUMBER() OVER (PARTITION BY id ORDER BY yearmonth) as seqnum
FROM testtable tt
) tt
WHERE seqnum = 1
GROUP BY yearmonth;
If you do have the awkward month-year format, then you can use string manipulation to sort by year first and month second. The functions depend on the database, but something like this:
SELECT monthyear, COUNT(*) as numstarts
FROM (SELECT tt.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY RIGHT(monthyear, 2), LEFT(monthyear, 2)) as seqnum
FROM testtable tt
) tt
WHERE seqnum = 1
GROUP BY monthyear;

I assume you have a column that is a real date (so that min() behaves as expected). Select the minimum date for each id (subquery t2), then left join it back to the table and count only the rows that match. Dates (or monthyear values, as in your data) with no match get a count of zero.
select
monthyear
,count(t2.id) as cnt
from testtable t1
left join (
select
min(date) as date
,id
from testtable
group by id
) t2
on t2.date = t1.date
and t2.id = t1.id
group by monthyear

You are looking for the number of new users each month, yes?
Here is one way to do it.
Note that I had to use TO_DATE and TO_CHAR to make sure the month/year text strings sorted correctly. If you use real DATE columns that would be unnecessary.
An additional complexity was adding the empty months (months with zero new users). Ideally the list of months would come from a calendar table rather than from a SELECT DISTINCT on the base table, as done here.
create table x (
monthyear varchar2(20),
id varchar2(10)
);
insert into x values('02/18', 'A32');
insert into x values('04/19', 'T39');
insert into x values('05/19', 'T39');
insert into x values('04/19', 'Y95');
insert into x values('01/18', 'A32');
insert into x values('12/19', 'I99');
insert into x values('11/18', 'OPT');
insert into x values('09/19', 'TT8');
And the query:
with allmonths as(
select distinct monthyear from x
),
firstmonths as(
select id, to_char(min(to_date(monthyear, 'MM/YY')),'MM/YY') monthyear from x group by id
),
firstmonthcounts as(
select monthyear, count(*) cnt
from firstmonths group by monthyear
)
select am.monthyear, nvl(fmc.cnt, 0) as newusers
from allmonths am left join firstmonthcounts fmc on am.monthyear = fmc.monthyear
order by to_date(am.monthyear, 'MM/YY');
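As a rough cross-check of the approach above, here is a minimal sketch that runs the same idea against the sample data using Python's built-in sqlite3 module instead of Oracle. SQLite has no TO_DATE, so the sketch assumes a helper column (called sortkey here, a made-up name) that rearranges 'MM/YY' into a sortable 'YYMM' string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (monthyear TEXT, id TEXT)")
conn.executemany(
    "INSERT INTO x VALUES (?, ?)",
    [("02/18", "A32"), ("04/19", "T39"), ("05/19", "T39"), ("04/19", "Y95"),
     ("01/18", "A32"), ("12/19", "I99"), ("11/18", "OPT"), ("09/19", "TT8")],
)

query = """
WITH keyed AS (
    -- sortkey is an assumed helper: 'MM/YY' rearranged into sortable 'YYMM'
    SELECT id, monthyear,
           substr(monthyear, 4, 2) || substr(monthyear, 1, 2) AS sortkey
    FROM x
),
firstmonths AS (
    -- earliest month per user
    SELECT id, MIN(sortkey) AS sortkey FROM keyed GROUP BY id
)
SELECT k.monthyear, COUNT(f.id) AS newusers
FROM (SELECT DISTINCT monthyear, sortkey FROM keyed) AS k
LEFT JOIN firstmonths AS f ON f.sortkey = k.sortkey
GROUP BY k.monthyear, k.sortkey
ORDER BY k.sortkey
"""
new_user_rows = conn.execute(query).fetchall()
for monthyear, newusers in new_user_rows:
    print(monthyear, newusers)
```

Counting f.id rather than * is what makes months without a first-time user come out as 0.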

Related

How can I obtain the minimum date for a value that is equal to the maximum date?

I am trying to obtain the minimum date on which the value equals the value at the maximum date. So far, I'm able to obtain the value at its maximum date, but I can't seem to obtain the minimum date from which that value remains the same.
Here is what I got so far and the query result:
select a.id, a.end_date, a.value
from database1 as a
inner join (
select id, max(end_date) as end_date
from database1
group by id
) as b on a.id = b.id and a.end_date = b.end_date
where value is not null
order by id, end_date
This query returns the most recent record, but I want the record with the earliest end date from which the value remains the same as the most recent one.
In the following sample table, the record I'd like to obtain is the row where id = 3, as it has the minimum end date from which the value remains the same:
id | end_date | value
1 02/12/22 5
2 02/13/22 5
3 02/14/22 4
4 02/15/22 4
Another option, which takes the problem as described for the sample data shown: get the value at the maximum date, then the row with the minimum id that has that value:
select top(1) t.*
from (
select top(1) Max(end_date)d, [value]
from t
group by [value]
order by d desc
)d
join t on t.[value] = d.[value]
order by t.id;
I'm most likely overthinking this as a Gaps & Island problem, but you can do:
select min(end_date) as first_date
from (
select *, sum(inc) over (order by end_date desc) as grp
from (
select *,
case when value <> lag(value) over (order by end_date desc) then 1 else 0 end as inc
from t
) x
) y
where grp = 0
Result:
first_date
----------
2022-02-14
with data as (
select *,
row_number() over (partition by value order by end_date) as rn,
first_value(value) over (order by end_date desc) as lv
from T
)
select * from data
where value = lv and rn = 1
This does not look strictly for streaks of consecutive days: any date that happens to have the same value as the final date is in contention.
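To see the gaps-and-islands query from above in action, here is a minimal sketch against the sample rows. It uses Python's sqlite3 module rather than SQL Server, assumes a SQLite build with window functions (3.25 or later), and stores the dates as ISO strings so that they sort correctly; the table name t follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, end_date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "2022-02-12", 5), (2, "2022-02-13", 5),
     (3, "2022-02-14", 4), (4, "2022-02-15", 4)],
)

query = """
SELECT MIN(end_date) AS first_date
FROM (
    -- running total of value changes, walking back from the latest date
    SELECT end_date, SUM(inc) OVER (ORDER BY end_date DESC) AS grp
    FROM (
        -- inc = 1 whenever the value differs from the next-later row
        SELECT end_date,
               CASE WHEN value <> LAG(value) OVER (ORDER BY end_date DESC)
                    THEN 1 ELSE 0 END AS inc
        FROM t
    ) AS x
) AS y
WHERE grp = 0  -- rows before the first change, i.e. the latest streak
"""
first_date = conn.execute(query).fetchone()[0]
print(first_date)
```

Unlike the value-matching query, this keeps only the unbroken streak that ends at the latest date.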

Selecting new distinct values over time (ORACLE SQL)

I want to select new distinct values and track them over time.
I have a table where each row represents a score awarded to a particular person.
- timestamp (when the score was awarded)
- name (which person received the score)
- score (what score the person received)
I want the result to look like:
The above table should be interpreted as how many new distinct names appear in each day.
Because 6-NOV is the first day, all the names are new hence 3 new names.
On 7-NOV Michael is the only new name so the value is 1.
On 8-NOV we have 3 new names (Don, Alex, Tina)
And on 9-NOV 0 new names appear, as Jimmy and Sara have both been scored before.
Thanks for the help
Consider:
select t.timestamp, count(n.name)
from (select distinct timestamp from mytable) t
left join (select name, min(timestamp) timestamp from mytable group by name) n
on n.timestamp = t.timestamp
group by t.timestamp
This works by generating the list of distinct timestamps from the table and left joining it with an aggregate query that computes the first timestamp of each name. The outer query then aggregates, counting n.name rather than * so that timestamps with no new names yield 0.
Find the minimum timestamp for each name and then count how many names in each timestamp
select timestamp, count(*) as new_names from
(select name, min(timestamp) as timestamp from mytable
group by name)
group by timestamp
order by timestamp
To include all days even without any names
select t.timestamp, nvl(new_names,0) as new_names from
(select timestamp, count(*) as new_names from
(select name, min(timestamp) as timestamp from mytable
group by name)
group by timestamp) c
RIGHT OUTER JOIN (select distinct timestamp from mytable) t
ON c.timestamp = t.timestamp
order by t.timestamp
To include dates that don't appear in the table at all, you need a list of dates from a calendar somewhere, and you put that in place of the subquery I have RIGHT OUTER JOINed to.
You can generate one like this:
select t.timestamp, nvl(new_names,0) as new_names from
(select timestamp, count(*) as new_names from
(select name, min(timestamp) as timestamp from mytable
group by name)
group by timestamp) c
RIGHT OUTER JOIN (
SELECT TRUNC (SYSDATE - ROWNUM - 1) timestamp
FROM DUAL CONNECT BY ROWNUM < 366
) t
ON c.timestamp = t.timestamp
order by t.timestamp
But you'd have to adjust the -1 and 366 to cover the date range you want, and it's much more standard to use a calendar table that already exists in your database.
With MIN() window function:
select tt.firstdate, count(distinct tt.name) "new names"
from (
select t.*, min(timestamp) over (partition by name) firstdate
from tablename t
) tt
group by tt.firstdate
If you also want the dates where there are not any new names:
select t.timestamp, count(distinct tt.name) "new names"
from tablename t
left join (
select t.*, min(timestamp) over (partition by name) firstdate
from tablename t
) tt on tt.firstdate = t.timestamp
group by t.timestamp
Count only first appearances, use row_number() at first:
select timestamp, sum(frst) as new_names
from (
select timestamp,
case when row_number()
over (partition by name order by timestamp) = 1
then 1 else 0 end frst
from scores)
group by timestamp
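The question's sample table did not survive, so the following sketch uses hypothetical rows invented to match its description (three new names on 6-NOV, Michael on 7-NOV, Don, Alex and Tina on 8-NOV, nobody new on 9-NOV). The name "Pat", the year, and the scores are made up. It runs the "first timestamp per name, left joined to all timestamps" idea through Python's sqlite3 module rather than Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (timestamp TEXT, name TEXT, score INTEGER)")
# Hypothetical rows: only the pattern of new names follows the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("2019-11-06", "Jimmy", 5), ("2019-11-06", "Sara", 3), ("2019-11-06", "Pat", 4),
     ("2019-11-07", "Michael", 2), ("2019-11-07", "Jimmy", 1),
     ("2019-11-08", "Don", 4), ("2019-11-08", "Alex", 5), ("2019-11-08", "Tina", 2),
     ("2019-11-09", "Jimmy", 3), ("2019-11-09", "Sara", 4)],
)

query = """
SELECT t.timestamp, COUNT(n.name) AS new_names
FROM (SELECT DISTINCT timestamp FROM mytable) AS t
LEFT JOIN (SELECT name, MIN(timestamp) AS timestamp  -- first day per name
           FROM mytable GROUP BY name) AS n
       ON n.timestamp = t.timestamp
GROUP BY t.timestamp
ORDER BY t.timestamp
"""
new_name_rows = conn.execute(query).fetchall()
for day, new_names in new_name_rows:
    print(day, new_names)
```

The last day shows up with 0 because the left join keeps it and COUNT(n.name) ignores the unmatched NULL.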
Yet another option is to right join the earliest timestamp of each name to the distinct timestamps. This way, timestamps with no match are also returned, with a new_names count of zero:
SELECT NVL(t1.timestamp,t2.timestamp) AS timestamp,
SUM(NVL2(t1.timestamp,1,0)) AS new_names
FROM (SELECT name, MIN(timestamp) AS timestamp from t group by name) t1
RIGHT JOIN (SELECT DISTINCT timestamp FROM t) t2
ON t2.timestamp = t1.timestamp
GROUP BY NVL(t1.timestamp,t2.timestamp)
ORDER BY timestamp

Select multiple max values after GROUP BY query

Suppose I have a table that looks like this:
date ID income
0 9/1 C 10.40
1 9/3 A 33.90
2 9/3 B 29.10
3 9/4 C 19.30
4 9/4 B 17.80
5 9/5 B 9.55
6 9/5 C 11.10
7 9/5 A 13.10
8 9/7 A 29.10
9 9/7 B 29.10
I want to find the ID that made the most income on each date. The most intuitive approach would be to write
SELECT ID, MAX(income) FROM table GROUP BY date
But two IDs made the same MAX income on 9/7, and I want to retain all ties on the same date; that query would drop one of the IDs on 9/7. Note also that 29.1 appears on both 9/3 and 9/7. Is there another approach?
A join based approach doesn't have this problem, and would retain all records tied for the max income on a given date.
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT date, MAX(income) AS max_income
FROM yourTable
GROUP BY date
) t2
ON t1.date = t2.date AND t1.income = t2.max_income
ORDER BY
t1.date;
The way the above query works is to join the complete original table to a subquery which finds, for each date, the maximum income value. This has the effect of filtering off any record which did not have the max income on a given date. Pay close attention to the join condition, which has two components, the date, and the income.
If your database supports analytic functions, we can also use RANK here:
SELECT date, ID, income
FROM
(
SELECT t.*, RANK() OVER (PARTITION BY date ORDER BY income DESC) rnk
FROM yourTable t
) t
WHERE rnk = 1
ORDER BY date;
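A quick way to convince yourself that both queries above keep the tie on 9/7 is to run them side by side. This is a minimal sketch using Python's sqlite3 module with the question's sample rows. It assumes a SQLite build with window functions (3.25 or later) for RANK, and it adds ID to the ORDER BY only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (date TEXT, ID TEXT, income REAL)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [("9/1", "C", 10.40), ("9/3", "A", 33.90), ("9/3", "B", 29.10),
     ("9/4", "C", 19.30), ("9/4", "B", 17.80), ("9/5", "B", 9.55),
     ("9/5", "C", 11.10), ("9/5", "A", 13.10), ("9/7", "A", 29.10),
     ("9/7", "B", 29.10)],
)

# Join each row to its date's maximum; every row tied for the max survives.
join_query = """
SELECT t1.date, t1.ID, t1.income
FROM yourTable AS t1
INNER JOIN (SELECT date, MAX(income) AS max_income
            FROM yourTable GROUP BY date) AS t2
        ON t1.date = t2.date AND t1.income = t2.max_income
ORDER BY t1.date, t1.ID
"""

# RANK gives tied rows the same rank, so rnk = 1 also keeps all ties.
rank_query = """
SELECT date, ID, income
FROM (SELECT t.*, RANK() OVER (PARTITION BY date ORDER BY income DESC) AS rnk
      FROM yourTable AS t) AS r
WHERE rnk = 1
ORDER BY date, ID
"""

join_rows = conn.execute(join_query).fetchall()
rank_rows = conn.execute(rank_query).fetchall()
for row in join_rows:
    print(row)
```

Swapping RANK for ROW_NUMBER would break the tie arbitrarily and return a single row per date.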
One approach is shown below:
with cte1 as
(
Select t1.*
FROM yourTable t1
INNER JOIN
(
SELECT date, MAX(income) AS max_income
FROM yourTable
GROUP BY date
) t2
ON t1.date = t2.date AND t1.income = t2.max_income
) select min(ID) as ID, date,income from cte1
group by date,income
Since you did not mention which ID you need when two IDs have the same income on a particular date, I took the minimum ID among them. You could use max() instead.
Try the subquery below. Since there is a tie on one date, it takes the minimum ID, which gives you a single ID for 9/7:
select date,min(ID),income
from
(SELECT t1.date, t1.ID,t1.income
FROM yourTable t1
INNER JOIN
(
SELECT date, MAX(income) AS mincome
FROM yourTable
GROUP BY date
) t2 ON t1.date = t2.date AND t1.income = t2.mincome
)X group by date,income

Per year one maximum date row according to previous row date

I have a table with two columns and I want to fetch 6 years of data according to these rules:
The first row is the row with the maximum date that is earlier than or equal to an input date (which I will pass in).
For the second through sixth rows, I need the row with the maximum date that is earlier than the previously selected row's date. There must not be two rows for the same year; I need only the latest one relative to the previous row, and not in the same year.
create table #tbl (id int identity, marketdate date)
insert into #tbl (marketdate)
values('2018-05-31'),
('2017-06-01'),
('2017-05-28'),
('2017-04-28'),
('2016-05-26'),
('2015-04-18'),
('2015-04-20'),
('2015-03-18'),
('2014-05-31'),
('2014-04-18'),
('2013-04-15')
output:
id marketdate
1 2018.05.31
3 2017.05.28
5 2016.05.26
7 2015.04.20
10 2014.04.18
11 2013.04.15
Can't you do this with a simple ORDER BY ... DESC?
SELECT TOP 6 id, max(marketdate) FROM #tbl
WHERE marketdate <= @date
GROUP BY YEAR(marketdate), id, marketdate
ORDER BY YEAR(marketdate) DESC
Based purely on your "Output" given your sample data, I believe the following is what you are after (The max date for each distinct year of data):
SELECT TOP 6
max(marketdate),
Year(marketDate) as marketyear
FROM #tbl
WHERE #tbl.marketdate <= getdate()
GROUP BY YEAR(marketdate)
ORDER BY YEAR(marketdate) DESC;
You can use row_number() if you are using SQL Server:
select top 6
id
, t.marketdate
from ( select rn = row_number() over (partition by year(marketdate) order by marketdate desc)
, id
, marketdate
from #tbl) as t
where t.rn = 1
order by t.marketdate desc
The following recursively searches for the next date, which must be at least one year earlier than the previous date.
Your parameterised start position goes where I chose 2018-06-01.
WITH
recursiveSearch AS
(
SELECT
id,
marketDate
FROM
(
SELECT
yourTable.id,
yourTable.marketDate,
ROW_NUMBER() OVER (ORDER BY yourTable.marketDate DESC) AS relative_position
FROM
yourTable
WHERE
yourTable.marketDate <= '2018-06-01'
)
search
WHERE
relative_position = 1
UNION ALL
SELECT
id,
marketDate
FROM
(
SELECT
yourTable.id,
yourTable.marketDate,
ROW_NUMBER() OVER (ORDER BY yourTable.marketDate DESC) AS relative_position
FROM
yourTable
INNER JOIN
recursiveSearch
ON yourTable.marketDate < DATEADD(YEAR, -1, recursiveSearch.marketDate)
)
search
WHERE
relative_position = 1
)
SELECT
*
FROM
recursiveSearch
WHERE
id IS NOT NULL
ORDER BY
recursiveSearch.marketDate DESC
OPTION
(MAXRECURSION 0)
http://sqlfiddle.com/#!18/56246/13
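The recursive search above can also be read as a plain loop: pick the latest date at or before the start, then repeatedly pick the latest date that is more than one year earlier than the previous pick. The sketch below expresses that reading in Python, not T-SQL. The function names are hypothetical, the limit of six rows comes from the question rather than from the recursive query, and the 29 February handling is an assumption meant to mimic DATEADD(YEAR, -1, ...).

```python
from datetime import date


def one_year_earlier(d):
    """Assumed equivalent of DATEADD(YEAR, -1, d): 29 Feb falls back to 28 Feb."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        return d.replace(year=d.year - 1, day=28)


def pick_dates(rows, start, limit=6):
    """rows: list of (id, marketdate). Returns up to `limit` picked rows."""
    picks = []
    candidates = [r for r in rows if r[1] <= start]
    while candidates and len(picks) < limit:
        best = max(candidates, key=lambda r: r[1])  # latest remaining date
        picks.append(best)
        cutoff = one_year_earlier(best[1])
        # keep only dates strictly more than one year before the pick
        candidates = [r for r in candidates if r[1] < cutoff]
    return picks


sample = [
    (1, date(2018, 5, 31)), (2, date(2017, 6, 1)), (3, date(2017, 5, 28)),
    (4, date(2017, 4, 28)), (5, date(2016, 5, 26)), (6, date(2015, 4, 18)),
    (7, date(2015, 4, 20)), (8, date(2015, 3, 18)), (9, date(2014, 5, 31)),
    (10, date(2014, 4, 18)), (11, date(2013, 4, 15)),
]
picked = pick_dates(sample, date(2018, 6, 1))
for row_id, marketdate in picked:
    print(row_id, marketdate.isoformat())
```

With the sample data this selects the same dates as the question's expected output, which is a useful sanity check on the "at least one year earlier" interpretation.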

group rows in plain sql

I have a Table with columns Date and Number, like so:
date Number
1-1-2012 1
1-2-2012 1
1-3-2012 2
1-4-2012 1
I want to write a SQL query that groups the rows with the same Number and takes the minimum date. Grouping may occur only when the value of Number is the same as in the previous/next row. So the result is
date Number
1-1-2012 1
1-3-2012 2
1-4-2012 1
Try this:
WITH CTE AS(
SELECT * ,ROW_NUMBER() OVER (ORDER BY [DATE] ) -
ROW_NUMBER() OVER (PARTITION BY NUMBER ORDER BY [DATE] ) AS ROW_NUM
FROM TABLE1)
SELECT NUMBER,MIN(DATE) AS DATE
FROM CTE
GROUP BY ROW_NUM,NUMBER
ORDER BY DATE
SELECT Number, MIN(date)
FROM table
GROUP BY Number
ORDER BY Number
Since your requirement is a bit more specific, how about this? I have not tested it myself, but it might work given your requirement:
SELECT date, Number FROM (
SELECT Number,
(SELECT MIN(date) FROM #table t2 WHERE t1.date <> t2.date AND t1.Number = t2.Number) AS date
FROM table t1
) AS a
GROUP BY number, date
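The row-number-difference trick from the first answer in this thread is easier to trust once you see it run. Here is a minimal sketch with the question's four rows, using Python's sqlite3 module instead of SQL Server and assuming a SQLite build with window functions (3.25 or later). The dates are kept as the question's text values, which happen to sort correctly for this sample. The column alias first_date is made up to avoid clashing with the date column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE1 (date TEXT, Number INTEGER)")
conn.executemany(
    "INSERT INTO TABLE1 VALUES (?, ?)",
    [("1-1-2012", 1), ("1-2-2012", 1), ("1-3-2012", 2), ("1-4-2012", 1)],
)

query = """
WITH cte AS (
    -- overall position minus position within the same Number:
    -- the difference stays constant across a run of adjacent equal values
    SELECT date, Number,
           ROW_NUMBER() OVER (ORDER BY date)
         - ROW_NUMBER() OVER (PARTITION BY Number ORDER BY date) AS grp
    FROM TABLE1
)
SELECT MIN(date) AS first_date, Number
FROM cte
GROUP BY grp, Number
ORDER BY first_date
"""
island_rows = conn.execute(query).fetchall()
for first_date, number in island_rows:
    print(first_date, number)
```

Grouping by Number alone would merge the two separate runs of 1, which is why grp has to be part of the GROUP BY.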