Return the first and last value from one column when value from another column changes - sql

I am trying to write a PostgreSQL query to return the first and last dates corresponding to indices. I have a table:
Datetime      Index
March 1 2021  0
March 2 2021  0
March 3 2021  0
March 4 2021  1
March 5 2021  1
March 6 2021  2
In this case, I would want to return:
I am wondering how I would write the PostgreSQL query for this.

I think this can be done with the following:
SELECT MIN("Datetime") AS "Start"
     , MAX("Datetime") AS "End"
     , "Index"
FROM <your_table>
GROUP BY "Index"
ORDER BY "Index"
;
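As a quick check of the GROUP BY approach above, here is a minimal sketch that runs the same query against the sample rows. It uses SQLite through Python's sqlite3 module rather than PostgreSQL, and the table name readings and the ISO date strings are assumptions made for the illustration.

```python
import sqlite3

# In-memory database; "readings" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE readings ("Datetime" TEXT, "Index" INTEGER)')
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        ("2021-03-01", 0),
        ("2021-03-02", 0),
        ("2021-03-03", 0),
        ("2021-03-04", 1),
        ("2021-03-05", 1),
        ("2021-03-06", 2),
    ],
)

# One row per index value with its earliest and latest date.
query = """
SELECT MIN("Datetime") AS "Start",
       MAX("Datetime") AS "End",
       "Index"
FROM readings
GROUP BY "Index"
ORDER BY "Index"
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Note that this groups by the index value itself, so it only matches "when the value changes" if an index value never comes back after a different one. If it can, each run would need its own group.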

Related

How to use SAS/SQL to create a table with certain conditions from a dataset

I have a dataset with ID and event_year (an event means something happened that year; a person can have more than one record in this table, with more than one event year, e.g. ID 1 can have three entries with event_year 2017, 2018, 2019). Example dataset:
ID event_year
1 2017
1 2018
1 2019
2 2018
2 2017
From this I need a table of all IDs where the event_year is between 2017 and 2021, to make a frequency table counting the people with an event_year in each of the set years 2017, 2018, 2019, 2020, 2021 (these are what I refer to as study year x).
Year frequency
2017 2
2018 2
2019 1
2020 1
2021 0
Another condition: for study year x, if a person didn't have an event_year in x but had an event_year in x-1, they should be included in the frequency of year x. For example, ID 1 above should be counted once in each of 2017, 2018, 2019 and 2020, because for year 2020 they didn't have an event_year in 2020 but did in 2019, so they are included in 2020. I apologise if this is confusing and would be happy to clarify.
If I understood your question, this should work:
data have;
input ID event_year;
datalines;
1 2017
1 2018
1 2019
2 2018
2 2017
3 2017
3 2020
;
run;
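For the next step (your additional requirement of being included in the year after the last event) we need the data grouped by ID and, within each ID, ordered by year, so that last.ID really is the latest event.
proc sort data=have;
    by ID event_year;
run;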
We just add one extra row per ID, where the year is the last year + 1.
data have;
    set have;
    by ID;
    output;                          /* keep the original row */
    if last.ID then do;
        event_year = event_year + 1; /* extra row for the following year */
        output;
    end;
run;
Now we just check how many different IDs every year had. If you want to check only for certain years, just add a where clause (for example, where event_year in (2017, 2018, 2019, 2020, 2021) ).
proc sql;
    create table want as
    select event_year, count(distinct ID) as frequency
    from have
    group by event_year
    ;
quit;
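For comparison, here is a hedged sketch of the same count written as a single SQL query and run on SQLite through Python's sqlite3 module. It is not SAS and not the answer's exact method. It applies the "x-1" rule to every event year instead of only the last one, so an ID with a gap (such as ID 3 with 2017 and 2020) is also counted in 2018. The table name have and the sample rows come from the answer above; the names expanded and study_year are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE have (id INTEGER, event_year INTEGER)")
conn.executemany(
    "INSERT INTO have VALUES (?, ?)",
    [(1, 2017), (1, 2018), (1, 2019), (2, 2018), (2, 2017), (3, 2017), (3, 2020)],
)

# Each event counts for its own year and for the following year.
# UNION removes duplicates, and COUNT(DISTINCT id) counts each person once per year.
query = """
WITH expanded AS (
    SELECT id, event_year AS study_year FROM have
    UNION
    SELECT id, event_year + 1 FROM have
)
SELECT study_year, COUNT(DISTINCT id) AS frequency
FROM expanded
WHERE study_year BETWEEN 2017 AND 2021
GROUP BY study_year
ORDER BY study_year
"""
frequency = conn.execute(query).fetchall()
for year, count in frequency:
    print(year, count)
```

A study year in which nobody qualifies simply does not appear in the result. To show it with a frequency of 0, the query would need a list of years to join against.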

big query SQL - repeatedly/recursively change a row's column in the select statement based on the values in previous row

I have a table like the one below:
customer  date         end date
1         jan 1 2021   jan 30 2021
1         jan 2 2021   jan 31 2021
1         jan 3 2021   feb 1 2021
1         jan 27 2021  feb 26 2021
1         feb 3 2021   mar 5 2021
2         jan 2 2021   jan 31 2021
2         jan 10 2021  feb 9 2021
2         feb 10 2021  mar 12 2021
Now I want to update the value in the 'end date' column of a row based on the previous row's 'end date' and the current row's 'date'.
If the date in the current row < the end date of the previous row, I want to set the end date of the current row = (end date of the previous row).
I want to do this repeatedly for all the rows (grouped by customer).
I want the output as below. I just need it in the select statement instead of updating/inserting in a table.
Note: in the output below, once the end date of the second row has been updated with the value from the first row (jan 30 2021), the third row's date (jan 3 2021) is evaluated against the updated value in the second row (jan 30 2021), not against the second row's value before the update (jan 31 2021).
customer  date         end date
1         jan 1 2021   jan 30 2021
1         jan 2 2021   jan 30 2021  [updated because current date < previous end date]
1         jan 3 2021   jan 30 2021  [updated because current date < previous end date]
1         jan 27 2021  jan 30 2021  [updated because current date < previous end date]
1         feb 3 2021   mar 5 2021
2         jan 2 2021   jan 31 2021
2         jan 10 2021  jan 31 2021  [updated because current date < previous end date]
2         feb 10 2021  mar 12 2021
I think I would go this way. I read the data source twice so that the operation can be done in a select statement, without updating or inserting into the table.
input table:
1|2021-01-01|2021-01-30
1|2021-01-02|2021-01-31
1|2021-01-03|2021-02-01
1|2021-01-27|2021-02-26
1|2021-02-03|2021-03-05
2|2021-01-02|2021-01-31
2|2021-01-10|2021-02-09
2|2021-02-10|2021-03-12
code:
with num_raw_data as (
  -- number the rows per customer in date order so "previous row" is well defined
  select row_number() over (partition by customer order by date_init) as num,
         customer, date_init, date_end
  from `project-id.data-set.table`
), analyzed_data as (
  select r.num,
         r.customer,
         r.date_init,
         r.date_end,
         -- validation = 1 when this row starts before the previous row's end date
         case when date_init < (select date_end
                                from num_raw_data
                                where num = r.num - 1
                                  and customer = r.customer
                                  and extract(month from r.date_init) = extract(month from date_init))
              then 1 else 0 end as validation
  from num_raw_data r
)
select customer,
       date_init,
       case when validation != 0
            then (select min(date_end)
                  from analyzed_data
                  where validation = 0
                    and customer = ad.customer
                    and date_init < ad.date_end)
            else date_end end as date_end
from analyzed_data ad
order by customer, num
output:
1|2021-01-01|2021-01-30
1|2021-01-02|2021-01-30
1|2021-01-03|2021-01-30
1|2021-01-27|2021-01-30
1|2021-02-03|2021-03-05
2|2021-01-02|2021-01-31
2|2021-01-10|2021-01-31
2|2021-02-10|2021-03-12
The validation column from analyzed_data tells me which rows need to change. I'm not sure it's fast (probably not), but it works for the scenario in your question.
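The rule from the question (each row is compared against the already-updated end date of the row before it) is a sequential carry-forward. As a reference for checking any SQL version, here is a minimal sketch of it in plain Python, not BigQuery. The function name carry_end_dates is made up for the illustration; the rows are the input table above.

```python
from datetime import date
from itertools import groupby

# (customer, date_init, date_end) rows from the input table above.
rows = [
    (1, date(2021, 1, 1), date(2021, 1, 30)),
    (1, date(2021, 1, 2), date(2021, 1, 31)),
    (1, date(2021, 1, 3), date(2021, 2, 1)),
    (1, date(2021, 1, 27), date(2021, 2, 26)),
    (1, date(2021, 2, 3), date(2021, 3, 5)),
    (2, date(2021, 1, 2), date(2021, 1, 31)),
    (2, date(2021, 1, 10), date(2021, 2, 9)),
    (2, date(2021, 2, 10), date(2021, 3, 12)),
]


def carry_end_dates(rows):
    """Replace a row's end date with the previous row's (updated) end date
    whenever the row starts before that end date, per customer."""
    result = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for customer, group in groupby(ordered, key=lambda r: r[0]):
        prev_end = None  # no previous row at the start of each customer
        for _, start, end in group:
            if prev_end is not None and start < prev_end:
                end = prev_end
            result.append((customer, start, end))
            prev_end = end  # the updated value is what the next row sees
    return result


adjusted = carry_end_dates(rows)
for customer, start, end in adjusted:
    print(customer, start, end)
```

The key detail is that prev_end is assigned after the update, which is what separates this from a plain "previous row" comparison.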

How to get column value comparison in sql?

I have a table as below. The table holds the price of a product for each day in a year. I would like to get price change for each day by year.
Product Year 1Jan 2Jan .................... 31Dec
A 2018 10 20 .................... 120
A 2019 130 150 .................... 200
B 2018 15 23 .................... 90
B 2019 113 130 .................... 220
I would like to compare columns sequentially with year overlaps and get output as below.
• For the year 2018, subtracting the value of 1 Jan from 2 Jan (2 Jan - 1 Jan) gives the new value of 2 Jan.
• For the year 2018, subtracting the value of 2 Jan from 3 Jan (3 Jan - 2 Jan) gives the new value of 3 Jan.
• For the year 2018, subtracting the value of 30 Dec from 31 Dec (31 Dec - 30 Dec) gives the new value of 31 Dec.
• Now, for the year 2019, subtracting the value of 31 Dec (year 2018) from 1 Jan (year 2019) gives the new value of 1 Jan 2019.
So, in a nutshell, the value of a column is its own value minus the previous day's value.
Product Year 1Jan 2Jan .................... 31Dec
A 2018 10 10 .................... 15 (just assume value of 30Dec column is 105)
A 2019 10 20 .................... 10 (just assume value of 30Dec column is 190)
B 2018 15 8 .................... 8 (just assume value of 30Dec column is 82)
B 2019 23 17 .................... 10 (just assume value of 30Dec column is 210)
Let me know if anything is not clear.
There is nothing logically difficult in this query, but it is tedious to write out:
SELECT Product
      ,Year
      ,"1Jan"
      ,"2Jan" - "1Jan" AS "2Jan"
      ,"3Jan" - "2Jan" AS "3Jan"
.
.
.
      ,"31Dec" - "30Dec" AS "31Dec"
FROM YOUR_TAB
ORDER BY Product
        ,Year;
First of all, I think the design of the table could be better, but that's a topic for some other time. For now, the code below should work:
SELECT Product, Year,
       [1Jan] AS '1st Jan',
       [2Jan] - [1Jan] AS '2nd Jan',
       [3Jan] - [2Jan] AS '3rd Jan',
       [4Jan] - [3Jan] AS '4th Jan',
.
.
.
.
.
       [31Dec] - [30Dec] AS '31st Dec'
FROM [table name];
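Both queries above subtract columns within one row, so they do not cover the year overlap the question asks for (1 Jan 2019 minus 31 Dec 2018). Here is a minimal sketch of the full rule in plain Python, not SQL. Each "year" is shortened to three days, and the prices are hypothetical values chosen so that the result matches the pattern of the expected output in the question. The function name daily_changes is made up for the illustration.

```python
def daily_changes(prices):
    """For each (product, year) row, replace every value with its difference
    from the previous day's value, carrying the last day of one year over
    into the first day of the next year for the same product."""
    changes = {}
    previous = {}  # last price seen so far, per product
    for product, year in sorted(prices):
        row = []
        for value in prices[(product, year)]:
            # The very first value of a product has no predecessor and is kept as is.
            row.append(value - previous.get(product, 0))
            previous[product] = value
        changes[(product, year)] = row
    return changes


# Hypothetical data: three "days" per year instead of 365.
prices = {
    ("A", 2018): [10, 20, 35],
    ("A", 2019): [45, 65, 75],
    ("B", 2018): [15, 23, 31],
    ("B", 2019): [54, 71, 81],
}

changes = daily_changes(prices)
for key in sorted(changes):
    print(key, changes[key])
```

In SQL the same idea would usually mean unpivoting the day columns into rows first, which is also why the table design is worth revisiting.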

distribute a value starting from the first months

Consider a query such as the following:
Select MONTH, sum(RECEIVABLES), sum(COLLECTED) from TABLE1 group by MONTH
Result:
MONTH RECEIVABLES COLLECTED
JANUARY 2 0
FEBRUARY 1 0
MARCH 3 0
Now suppose a collection of 4 is made in APRIL.
Question: how can this APRIL amount of 4 be distributed across the COLLECTED column, starting from the first month?
The result should be as follows:
MONTH RECEIVABLES COLLECTED
JANUARY 2 2
FEBRUARY 1 1
MARCH 3 1
APRIL 0 0
Using SQL or a stored procedure.
Thanks.
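The distribution described above is a first-in, first-out allocation: each month takes as much of the collected amount as it is owed, and the rest moves on to the next month. Here is a minimal sketch of that logic in plain Python, not SQL or a stored procedure. The function name distribute is made up for the illustration.

```python
def distribute(receivables, amount):
    """Allocate `amount` to months in order, never exceeding what a month is owed.

    receivables: list of (month, receivable) in chronological order.
    Returns (rows, leftover), where rows are (month, receivable, collected).
    """
    remaining = amount
    rows = []
    for month, due in receivables:
        paid = min(due, remaining)
        rows.append((month, due, paid))
        remaining -= paid
    return rows, remaining


months = [("JANUARY", 2), ("FEBRUARY", 1), ("MARCH", 3), ("APRIL", 0)]
allocation, leftover = distribute(months, 4)
for month, due, paid in allocation:
    print(month, due, paid)
```

In SQL, the same result is often computed from a running total of RECEIVABLES compared against the collected amount, but that depends on the database and is not shown here.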

I want output of sql Query and i already done some query part but it fails

This is the query I have written:
SELECT Date_current,
COUNT(*) as'Total'
FROM Call_Register
WHERE (DATEDIFF(dd,'02/1/2014',Date_current) >=0)
AND (DATEDIFF(dd,'02/12/2014',Date_current) <=0)
GROUP BY Date_current
HAVING COUNT(*)>=(convert(int,'02/12/2014'))
ORDER BY Date_current
But this query gives me an error:
Conversion failed when converting the varchar value '02/12/2014' to data type int.
Date Total
---------- -------
Feb 3 2014 2:58PM 10
Feb 4 2014 2:59PM 10
Please help me. I am getting output like:
Date Total
---------- -------
Feb 3 2014 2:58PM 1
Feb 3 2014 2:59PM 1
Feb 3 2014 3:00PM 1
Feb 3 2014 3:08PM 1
Feb 3 2014 3:20PM 1
Feb 3 2014 4:05PM 1
Feb 3 2014 4:17PM 1
Feb 3 2014 4:19PM 1
Feb 3 2014 4:21PM 1
Feb 3 2014 4:24PM 1
Feb 4 2014 1:11PM 1
Feb 4 2014 2:35PM 1
Feb 4 2014 2:37PM 1
Feb 4 2014 5:19PM 1
Firstly, you should either use the culture invariant date format yyyyMMdd, or explicitly set the date format using SET DATEFORMAT DMY, or prepare to get inconsistent results.
Secondly, the following is potentially very inefficient:
WHERE (DATEDIFF(dd,'02/1/2014',Date_current) >=0)
AND (DATEDIFF(dd,'02/12/2014',Date_current) <=0)
If you have an index on Date_Current it will not be used because you are performing a function on it. You should instead use:
WHERE Date_Current >= '20140102'
AND Date_Current <= '20141202'
You then have a sargable query. I have had to guess at whether '02/1/2014' meant 1st February 2014, or 2nd January 2014 as it is not clear (hence the importance of my first point).
Finally (this part has already been answered but including it for completeness as I couldn't not point out the first two errors) you cannot convert to int here:
convert(int,'02/12/2014')
You presumably need to convert to date time first:
CONVERT(INT, CONVERT(DATETIME, '20141202'))
Although I suspect this is still not what you want: you would just be filtering to the days that have 41973 records or more, which seems like a fairly arbitrary filter.
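To see where 41973 comes from: as far as I know, SQL Server converts a DATETIME to INT as the number of days since 1 January 1900. The same arithmetic in a short Python sketch (the variable names are only for illustration):

```python
from datetime import date

# Day 0 for SQL Server DATETIME values.
sql_server_epoch = date(1900, 1, 1)

# '20141202' read as 2 December 2014, as in the answer above.
threshold = (date(2014, 12, 2) - sql_server_epoch).days
print(threshold)
```

So the HAVING clause compares a row count with a day number, which is why it is unlikely to be the intended filter.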
You need to cast your string to a datetime first; only then can you convert it to int:
CAST('02/12/2014' AS Datetime)
Try this:
SELECT Date_current, COUNT(*) AS 'Total'
FROM Call_Register
WHERE (DATEDIFF(dd,'02/1/2014',Date_current)>=0) AND
      (DATEDIFF(dd,'02/12/2014',Date_current)<=0)
GROUP BY Date_current
HAVING COUNT(*)>=(convert(int, CAST('02/12/2014' AS Datetime)))
ORDER BY Date_current
At last I got the answer to my question, thanks everyone:
SELECT cast(Date_current as DATE),COUNT(*) AS 'Total'
From Call_Register
WHERE (DATEDIFF(dd,'02/1/2014',Date_current)>=0) AND
(DATEDIFF(dd,'02/13/2014',Date_current)<=0)
Group By cast(Date_current as DATE)
Can I use ORDER BY in this? I want the result in descending order; please help with that.
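ORDER BY can be combined with GROUP BY as long as it sorts on the grouped expression. In the SQL Server query above, that should mean appending ORDER BY CAST(Date_current AS DATE) DESC. Here is a minimal sketch of the same pattern run on SQLite through Python's sqlite3 module, so date() stands in for the CAST. The sample timestamps and the call_date alias are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Call_Register (Date_current TEXT)")
conn.executemany(
    "INSERT INTO Call_Register VALUES (?)",
    [
        ("2014-02-03 14:58:00",),
        ("2014-02-03 14:59:00",),
        ("2014-02-04 13:11:00",),
        ("2014-02-04 14:35:00",),
        ("2014-02-04 17:19:00",),
    ],
)

# Group by calendar day, then sort the days newest first.
query = """
SELECT date(Date_current) AS call_date, COUNT(*) AS Total
FROM Call_Register
WHERE Date_current >= '2014-02-01' AND Date_current < '2014-02-14'
GROUP BY date(Date_current)
ORDER BY call_date DESC
"""
daily_totals = conn.execute(query).fetchall()
for call_date, total in daily_totals:
    print(call_date, total)
```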