I have a dataset with repeated ids, each row having a date and a value. I am trying to write a SQL query that finds the difference between the dates for each id and sums the values in another column. The data set looks like this:
id | date | value |
1 2015-06-01 10
1 2015-09-22 25
2 2015-12-10 15
2 2015-07-11 20
2 2015-10-18 25
3 2015-04-05 30
3 2015-05-02 45
4 2015-06-01 20
And what I am trying to get to is this:
id | date_diff | value
1 113 35
2 152 60
3 27 75
4 0 20
The idea is that date_diff is the difference between the earliest and latest date for each id, and value is the sum of the values for that id. However, the datediff function isn't working for me, and I think this may be one of the first issues. It returns this error:
function datediff(unknown, date, timestamp without time zone) does not exist
Hint: No function matches the given name and argument types. You might need to add explicit type casts.
I'm using a public data set, so that might be part of the issue. Any help or ideas would be great!
Let's create a test table:
create table test
(
id int ,
date DATE,
value int
);
insert into test values (1, CAST('2015-06-01' AS DATE), 10);
insert into test values (1, CAST('2015-09-22' AS DATE), 25);
insert into test values (2, CAST('2015-12-10' AS DATE), 100);
insert into test values (2, CAST('2015-07-11' AS DATE), 200);
Let's look at the data:
select * from test ORDER BY id;
1 2015-06-01 10
1 2015-09-22 25
2 2015-12-10 100
2 2015-07-11 200
Let's do the magic.
select datediff(day, MIN(DATE), MAX(DATE)) as diff_in_days, sum(value) as sum_of_values FROM test group by id;
113 35
152 300
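Please note that you didn't specify what should happen when an id has three or more rows. The query still runs (it takes the span between the earliest and the latest date), but whether that makes sense for your application, I don't know. Also note that datediff(day, ...) is SQL Server-style syntax. The error in the question looks like it comes from PostgreSQL, where subtracting two DATE values, MAX(date) - MIN(date), gives the number of days directly.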
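For anyone who wants to check the numbers without a database server, here is a minimal sketch that runs the same aggregation on the question's full data set. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for the original database, so it uses julianday() instead of datediff; dates are stored as ISO text.

```python
import sqlite3

# In-memory SQLite database standing in for the original server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (1, "2015-06-01", 10),
        (1, "2015-09-22", 25),
        (2, "2015-12-10", 15),
        (2, "2015-07-11", 20),
        (2, "2015-10-18", 25),
        (3, "2015-04-05", 30),
        (3, "2015-05-02", 45),
        (4, "2015-06-01", 20),
    ],
)

# Days between the earliest and latest date per id, plus the sum of values.
rows = conn.execute(
    """
    SELECT id,
           CAST(julianday(MAX(date)) - julianday(MIN(date)) AS INTEGER) AS diff_in_days,
           SUM(value) AS sum_of_values
    FROM test
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

An id with a single row gets a difference of 0, and an id with three rows gets the span from its first to its last date.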
I have a table in Oracle SQL that holds client IDs and the date and time of each login to the application:
ID | LOGGED
----------------
11 | 2021-07-10 12:55:13.278
11 | 2021-08-10 13:58:13.211
11 | 2021-02-11 12:22:13.364
22 | 2021-01-10 08:34:13.211
33 | 2021-04-02 14:21:13.272
I need to select only those clients (ID) who logged in at least once in the last month (August) and at least once in exactly one of the two preceding months (June or July).
It is currently September, so I need clients who:
logged in at least once in August,
and at least once in July or June,
if logged in June -> not logged in July
if logged in July -> not logged in June
As a result I need like below:
ID
----
11
How can I do that in Oracle SQL? Be aware that the column "LOGGED" is a timestamp, like: 2021-01-10 08:34:13.211
Maybe consider this:
select id
from yourtable
group by id
having count(case months_between(trunc(sysdate,'MM'), trunc(logged,'MM'))
             when 1 then 1 end) >= 1
   -- count distinct months, so several logins in the same month count once
   and count(distinct case
             when months_between(trunc(sysdate,'MM'), trunc(logged,'MM')) in (2,3)
             then trunc(logged,'MM') end) = 1
There is one thing I don't understand.
You wrote:
minimum 1 time in one month preceding August (June or July)
and then:
if logged in June -> not logged in July
if logged in July -> not logged in June
If you need logins in EXACTLY one of those months (June or July, but not both),
use my query above.
If you need at least one login in June or July (possibly both), then:
select id
from yourtable
group by id
having count(case
months_between(trunc(sysdate,'MM'),
trunc(logged,'MM')
) when 1 then 1 end
) >= 1
and count
(case when
months_between(trunc(sysdate,'MM') ,
trunc(logged,'MM')
) in (2,3) then 1 end
) >= 1
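To make the difference between the two readings concrete, here is a rough sketch of the same HAVING logic. It assumes SQLite via Python's sqlite3 instead of Oracle, so months_between is replaced by a year*12+month index, and the "current" month is pinned to 2021-09 instead of sysdate. Client 44 is a hypothetical extra row, not from the question, added to show where the two variants diverge.

```python
import sqlite3

REFERENCE_MONTH = "2021-09"  # assumption: fixed "current" month instead of sysdate

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER, logged TEXT)")
conn.executemany(
    "INSERT INTO logs VALUES (?, ?)",
    [
        (11, "2021-07-10 12:55:13.278"),
        (11, "2021-08-10 13:58:13.211"),
        (11, "2021-02-11 12:22:13.364"),
        (22, "2021-01-10 08:34:13.211"),
        (33, "2021-04-02 14:21:13.272"),
        # hypothetical client (not in the question): June, July and August
        (44, "2021-06-03 09:00:00.000"),
        (44, "2021-07-04 09:00:00.000"),
        (44, "2021-08-05 09:00:00.000"),
    ],
)


def month_index(year_month):
    """Turn 'YYYY-MM' into a running month number (year * 12 + month)."""
    year, month = year_month.split("-")
    return int(year) * 12 + int(month)


QUERY = """
WITH m AS (
    SELECT id,
           :ref - (CAST(strftime('%Y', logged) AS INTEGER) * 12
                   + CAST(strftime('%m', logged) AS INTEGER)) AS months_ago
    FROM logs
)
SELECT id
FROM m
GROUP BY id
HAVING COUNT(CASE WHEN months_ago = 1 THEN 1 END) >= 1
   AND COUNT(DISTINCT CASE WHEN months_ago IN (2, 3) THEN months_ago END) {cmp} 1
ORDER BY id
"""


def clients(cmp_op):
    """Run the query with '=' (exactly one month) or '>=' (at least one)."""
    params = {"ref": month_index(REFERENCE_MONTH)}
    return [row[0] for row in conn.execute(QUERY.format(cmp=cmp_op), params)]


exactly_one_month = clients("=")
at_least_one_month = clients(">=")
print(exactly_one_month, at_least_one_month)
```

With "=" only client 11 qualifies. With ">=" the hypothetical client 44, who logged in during both June and July, qualifies as well.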
Your question needs some clarification, but based on what you were describing I am seeing a couple of options.
The simplest one is probably a combination of data densification (to generate a row for every month for each id) plus an analytic function (to enable inter-row calculations). Here's a simple example of this:
rem create a dummy table with some more data (you do not seem to worry about the exact timestamp)
drop table logs purge;
create table logs (ID number, LOGGED timestamp);
insert into logs values (11, to_timestamp('2021-07-10 12:55:13.278','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (11, to_timestamp('2021-07-11 12:55:13.278','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (11, to_timestamp('2021-08-10 13:58:13.211','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (11, to_timestamp('2021-02-11 12:22:13.364','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (11, to_timestamp('2021-04-11 12:22:13.364','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (22, to_timestamp('2021-01-10 08:34:13.211','yyyy-mm-dd HH24:MI:SS.FF'));
insert into logs values (33, to_timestamp('2021-04-02 14:21:13.272','yyyy-mm-dd HH24:MI:SS.FF'));
commit;
The following SQL densifies your data and lists the login count for a month and for the previous month on the same row, so that you can do a comparative calculation. I have not done that part, but I am hoping you get the idea.
with t as
(-- dummy artificial table just to create a time dimension for densification
select distinct to_char(sysdate - rownum,'yyyy-mm') mon
from dual connect by level < 300),
l_sparse as
(-- aggregating your login info per month
select id, to_char(logged,'yyyy-mm') mon, count(*) cnt
from logs group by id, to_char(logged,'yyyy-mm') ),
l_dense as
(-- densification with partition outer join
select t.mon, l.id, cnt from l_sparse l partition by (id)
right outer join t on (l.mon = t.mon)
)
-- final analytical function to list current and previous row info in same record
select mon, id
, cnt
, lag(cnt) over (partition by id order by mon asc) prev_cnt
from l_dense
order by id, mon;
Part of the result:
MON     ID CNT PREV_CNT
------- -- --- --------
2020-12 11
2021-01 11
2021-02 11   2
2021-03 11            2
2021-04 11   1
2021-05 11            1
2021-06 11
2021-07 11   3
2021-08 11   2        3
2021-09 11            2
2020-12 22
2021-01 22   2
2021-02 22            2
2021-03 22
2021-04 22
...
You can see for ID 11 that 2021-08 has logins for both the current and the previous month, so you can do math on it. (That would require another subselect/with branch.)
Alternatives to this would be:
inter-row calculation plus time math between two logged timestamps
pattern matching
I did not drill into those; there is not enough info about your real requirement.
I'm working with Oracle and cannot achieve the query I need for the moment.
Suppose I have the following table :
- ID Date Type Value
- 1 01/12/2016 prod 1
- 2 01/01/2017 test 10
- 3 01/06/2017 test 20
- 4 01/12/2017 prod 30
- 5 15/12/2017 test 40
- 6 01/01/2018 test 50
- 7 01/06/2018 test 60
- 8 01/12/2018 prod 70
I need to sum the VALUES between the "prod" TYPES + the last "prod" VALUE.
The results should be :
- 1 01/12/2016 - 1
- 2 01/01/2017 - 60
- 3 01/06/2017 - 60
- 4 01/12/2017 - 60
- 5 15/12/2017 - 220
- 6 01/01/2018 - 220
- 7 01/06/2018 - 220
- 8 01/12/2018 - 220
I first had to sum VALUES by YEAR without taking TYPES into account.
The need changed and I don't see how to start to identify, for each line, which is the previous "prod" DATE and sum each VALUE including the last "prod" TYPE.
Thanks
You can define the groups using a cumulative sum on type = 'prod', computed in reverse order, then use a window function for the final summation:
select t.*,
       sum(value) over (partition by grp) as total
from (select t.*,
             sum(case when type = 'prod' then 1 else 0 end) over (order by id desc) as grp
      from t
     ) t
order by id;
To see the grouping logic, look at:
ID Date Type Value Grp
1 01/12/2016 prod 1 3
2 01/01/2017 test 10 2
3 01/06/2017 test 20 2
4 01/12/2017 prod 30 2
5 15/12/2017 test 40 1
6 01/01/2018 test 50 1
7 01/06/2018 test 60 1
8 01/12/2018 prod 70 1
This identifies the groups that need to be summed. The DESC is because "prod" ends a group. If "prod" started a group (i.e. was included with the sum on the next row), then ASC would be used.
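The grouping and the totals can be checked with a small sketch like the one below. It is only an illustration under a few assumptions: SQLite (3.25 or newer, for window functions) through Python's sqlite3 instead of Oracle, and a date column renamed to dt and stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, dt TEXT, type TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "2016-12-01", "prod", 1),
        (2, "2017-01-01", "test", 10),
        (3, "2017-06-01", "test", 20),
        (4, "2017-12-01", "prod", 30),
        (5, "2017-12-15", "test", 40),
        (6, "2018-01-01", "test", 50),
        (7, "2018-06-01", "test", 60),
        (8, "2018-12-01", "prod", 70),
    ],
)

# Inner query: reverse running count of 'prod' rows gives the group number.
# Outer query: sum of value over each group, repeated on every row.
rows = conn.execute(
    """
    SELECT id, grp, SUM(value) OVER (PARTITION BY grp) AS total
    FROM (SELECT t.*,
                 SUM(CASE WHEN type = 'prod' THEN 1 ELSE 0 END)
                     OVER (ORDER BY id DESC) AS grp
          FROM t) AS g
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

Each "prod" row closes its group, so ids 2 to 4 share a total of 60 and ids 5 to 8 share a total of 220.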
Gordon Linoff's answer is great.
The version below is just a different flavor (Oracle 12c+), using MATCH_RECOGNIZE.
Setup:
ALTER SESSION SET NLS_DATE_FORMAT = 'DD/MM/YYYY';
CREATE TABLE TEST_TABLE(
THE_ID INTEGER,
THE_DATE DATE,
THE_TYPE CHAR(4),
THE_VALUE INTEGER);
INSERT INTO TEST_TABLE VALUES (1,TO_DATE('01/12/2016'),'prod',1);
INSERT INTO TEST_TABLE VALUES (2,TO_DATE('01/01/2017'),'test',10);
INSERT INTO TEST_TABLE VALUES (3,TO_DATE('01/06/2017'),'test',20);
INSERT INTO TEST_TABLE VALUES (4,TO_DATE('01/12/2017'),'prod',30);
INSERT INTO TEST_TABLE VALUES (5,TO_DATE('15/12/2017'),'test',40);
INSERT INTO TEST_TABLE VALUES (6,TO_DATE('01/01/2018'),'test',50);
INSERT INTO TEST_TABLE VALUES (7,TO_DATE('01/06/2018'),'test',60);
INSERT INTO TEST_TABLE VALUES (8,TO_DATE('01/12/2018'),'prod',70);
COMMIT;
Query:
SELECT
THE_ID, THE_DATE, MAX(RUNNING_GROUP_SUM) OVER (PARTITION BY THE_MATCH_NUMBER) AS GROUP_SUM
FROM TEST_TABLE
MATCH_RECOGNIZE (
ORDER BY THE_ID
MEASURES
MATCH_NUMBER() AS THE_MATCH_NUMBER,
RUNNING SUM(THE_VALUE) AS RUNNING_GROUP_SUM
ALL ROWS PER MATCH
AFTER MATCH SKIP PAST LAST ROW
PATTERN (TEST_TARGET{0,} PROD_TARGET)
DEFINE TEST_TARGET AS THE_TYPE = 'test',
PROD_TARGET AS THE_TYPE = 'prod')
ORDER BY THE_ID ASC;
Result:
THE_ID THE_DATE GROUP_SUM
---------- ---------- ----------
1 01/12/2016 1
2 01/01/2017 60
3 01/06/2017 60
4 01/12/2017 60
5 15/12/2017 220
6 01/01/2018 220
7 01/06/2018 220
8 01/12/2018 220
I have a table which contains _id, underSubHeadId, wefDate, price.
Whenever a product is created or its price is edited, an entry is also made in this table.
What I want is: if I enter a date, I get the latest price of each distinct underSubHeadId on or before that date (or the first price after it if no such entry is found).
_id underSubHeadId wefDate price
1 1 2016-11-01 5
2 2 2016-11-01 50
3 1 2016-11-25 500
4 3 2016-11-01 20
5 4 2016-11-11 30
6 5 2016-11-01 40
7 3 2016-11-20 25
8 5 2016-11-15 52
If I enter 2016-11-20 as date I should get
1 5
2 50
3 25
4 30
5 52
I have achieved this with the ROW_NUMBER function in SQL Server, but I want the same result in SQLite, which doesn't have such a function.
Also, if a date with no earlier entries is entered, such as 2016-10-25, I want the price from the first available date.
For example, for 1 we would get a price of 5, as the nearest (and first) entry is 2016-11-01.
This is the SQL Server query, which works fine:
select underSubHeadId, price
from (select underSubHeadId, price,
             ROW_NUMBER() OVER (Partition By underSubHeadId order by wefDate desc) rn
      from rates
      where wefDate <= '2016-11-20') newTable
where newTable.rn = 1
Thank You
This is a little tricky, but here is one way:
select t.*
from t
where t.wefDate = (select max(t2.wefDate)
from t t2
where t2.underSubHeadId = t.underSubHeadId and
t2.wefdate <= '2016-11-20'
);
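Here is a sketch of the correlated-subquery idea run against SQLite through Python's sqlite3, using the question's rates table. The COALESCE fallback to the earliest wefDate is my own assumption about how to cover the "date before any entry" case from the question; it is not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rates (_id INTEGER, underSubHeadId INTEGER, wefDate TEXT, price INTEGER)"
)
conn.executemany(
    "INSERT INTO rates VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2016-11-01", 5),
        (2, 2, "2016-11-01", 50),
        (3, 1, "2016-11-25", 500),
        (4, 3, "2016-11-01", 20),
        (5, 4, "2016-11-11", 30),
        (6, 5, "2016-11-01", 40),
        (7, 3, "2016-11-20", 25),
        (8, 5, "2016-11-15", 52),
    ],
)

QUERY = """
SELECT r.underSubHeadId, r.price
FROM rates r
WHERE r.wefDate = COALESCE(
        -- latest entry on or before the requested date
        (SELECT MAX(r2.wefDate) FROM rates r2
          WHERE r2.underSubHeadId = r.underSubHeadId AND r2.wefDate <= :d),
        -- assumed fallback: earliest entry when nothing precedes the date
        (SELECT MIN(r3.wefDate) FROM rates r3
          WHERE r3.underSubHeadId = r.underSubHeadId))
ORDER BY r.underSubHeadId
"""


def prices_as_of(date_text):
    """Return (underSubHeadId, price) pairs effective on the given ISO date."""
    return conn.execute(QUERY, {"d": date_text}).fetchall()


print(prices_as_of("2016-11-20"))
print(prices_as_of("2016-10-25"))
```

Because the dates are ISO-formatted text, plain string comparison orders them correctly. If two rows for the same id shared a wefDate, both would come back, so a tie-breaker on _id would be needed in that case.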
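-- Note: max(price) is the highest price up to that date, which is the latest price only if prices never go down.
select underSubHeadId, max(price)
from t
where wefDate <= '2016-11-20'
group by underSubHeadId;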
I've got a table something like..
[DateValueField][Hour][Value]
2014-09-01 1 200
...
2014-09-01 24 400
2014-09-02 1 220
...
2014-09-02 24 200
...
For each DateValueField I need a single value, the average Value for Hour between 6 and 12 for example, and I need it displayed for all hours of that day, not just 6-12. For instance...
[DateValueField][Hour][Value]
2014-09-01 1 300
...
2014-09-01 24 300
2014-09-02 1 190
...
2014-09-02 24 190
...
Query I'm trying is...
select DateValueField, Hour,
(select avg(Value) as Value from MyTable where Hour
between 6 and 12) as Value from MyTable
where DateValueField between '2014' and '2015'
group by DateValueField, Hour
order by DateValueField, Hour
But it gives me the Value as an average of ALL Values but I need it averaged out for that particular day between the hours I specify.
I'd appreciate some help/advice. Thanks!
You can use a derived table to get the average value between hours 6 and 12 grouped by date, and then join that to your original table:
select t1.DateValueField, t1.Hour, t2.avg_value
from MyTable t1
join (
select DateValueField, avg(Value) avg_value
from MyTable
where hour between 6 and 12
group by DateValueField
) t2 on t2.DateValueField = t1.DateValueField
order by t1.DateValueField, t1.Hour
Note: You may want to use a left join if some of your dates don't have values between hours 6 and 12 but you still want to retrieve all rows from MyTable.
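To illustrate the left join variant, here is a minimal sketch using SQLite through Python's sqlite3. The rows are made-up sample values (the question elides most hours), chosen so the 6-12 averages come out to the 300 and 190 from the question; 2014-09-03 is a hypothetical day with no readings between hours 6 and 12.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (DateValueField TEXT, Hour INTEGER, Value INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        # made-up sample rows, not the asker's real data
        ("2014-09-01", 1, 200),
        ("2014-09-01", 6, 250),
        ("2014-09-01", 12, 350),
        ("2014-09-01", 24, 400),
        ("2014-09-02", 1, 220),
        ("2014-09-02", 6, 180),
        ("2014-09-02", 12, 200),
        ("2014-09-02", 24, 200),
        ("2014-09-03", 1, 500),  # no rows between hours 6 and 12
    ],
)

# Per-day average over hours 6-12, attached to every row of that day.
rows = conn.execute(
    """
    SELECT t1.DateValueField, t1.Hour, t2.avg_value
    FROM MyTable t1
    LEFT JOIN (SELECT DateValueField, AVG(Value) AS avg_value
               FROM MyTable
               WHERE Hour BETWEEN 6 AND 12
               GROUP BY DateValueField) t2
           ON t2.DateValueField = t1.DateValueField
    ORDER BY t1.DateValueField, t1.Hour
    """
).fetchall()

for row in rows:
    print(row)
```

With an inner join the 2014-09-03 row would disappear; with the left join it is kept and its average is NULL (None in Python).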
Excuse me for posting a similar question. Please consider this:
date value
18/5/2010, 1 pm 40
18/5/2010, 2 pm 20
18/5/2010, 3 pm 60
18/5/2010, 4 pm 30
18/5/2010, 5 pm 60
18/5/2010, 6 pm 25
19/5/2010, 6 pm 300
19/5/2010, 6 pm 450
19/5/2010, 6 pm 375
20/5/2010, 6 pm 250
20/5/2010, 6 pm 310
The query is to get the date and value for each day such that the value obtained for that day is max. If the max value is repeated on that day, the lowest time stamp is selected. The result should be like:
18/5/2010, 3 pm 60
19/5/2010, 6 pm 450
20/5/2010, 6 pm 310
The query should take in a date range like the one given below and find results for that range in the above fashion:
where
date >= to_date('26/03/2010','DD/MM/YYYY') AND
date < to_date('27/03/2010','DD/MM/YYYY')
If you provide a CREATE TABLE and INSERT, it makes it a lot easier to provide a tested answer.
create table i (i_dt date, i_val number);
insert into i values (to_date('18/5/2010 1pm','dd/mm/yyyy hham'), 40);
insert into i values (to_date('18/5/2010 2pm','dd/mm/yyyy hham'), 20);
insert into i values (to_date('18/5/2010 3pm','dd/mm/yyyy hham'), 60);
insert into i values (to_date('18/5/2010 4pm','dd/mm/yyyy hham'), 30);
insert into i values (to_date('18/5/2010 5pm','dd/mm/yyyy hham'), 60);
insert into i values (to_date('18/5/2010 6pm','dd/mm/yyyy hham'), 25 );
insert into i values (to_date('19/5/2010 6pm','dd/mm/yyyy hham'), 300 );
insert into i values (to_date('19/5/2010 6pm','dd/mm/yyyy hham'), 450 );
insert into i values (to_date('19/5/2010 6pm','dd/mm/yyyy hham'), 375 );
insert into i values (to_date('20/5/2010 6pm','dd/mm/yyyy hham'), 250 );
insert into i values (to_date('20/5/2010 6pm','dd/mm/yyyy hham'), 310 );
select i_dt, i_val from
(select i.*, rank() over (partition by trunc(i_dt) order by i_val desc, i_dt asc) rn
from i)
where rn = 1;
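The same ranking idea can be tried outside Oracle with a sketch like this one. It assumes SQLite 3.25+ (for window functions) through Python's sqlite3, timestamps stored as ISO text, date() in place of trunc(), and a date range I picked to cover the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE i (i_dt TEXT, i_val INTEGER)")
conn.executemany(
    "INSERT INTO i VALUES (?, ?)",
    [
        ("2010-05-18 13:00:00", 40),
        ("2010-05-18 14:00:00", 20),
        ("2010-05-18 15:00:00", 60),
        ("2010-05-18 16:00:00", 30),
        ("2010-05-18 17:00:00", 60),
        ("2010-05-18 18:00:00", 25),
        ("2010-05-19 18:00:00", 300),
        ("2010-05-19 18:00:00", 450),
        ("2010-05-19 18:00:00", 375),
        ("2010-05-20 18:00:00", 250),
        ("2010-05-20 18:00:00", 310),
    ],
)

# Rank rows within each day: highest value first, earliest timestamp on ties.
rows = conn.execute(
    """
    SELECT i_dt, i_val
    FROM (SELECT i_dt, i_val,
                 RANK() OVER (PARTITION BY date(i_dt)
                              ORDER BY i_val DESC, i_dt ASC) AS rn
          FROM i
          WHERE i_dt >= :start AND i_dt < :end) AS ranked
    WHERE rn = 1
    ORDER BY i_dt
    """,
    {"start": "2010-05-18", "end": "2010-05-21"},
).fetchall()

for row in rows:
    print(row)
```

On 18/5 the value 60 appears at both 3 pm and 5 pm, and the secondary ORDER BY on the timestamp is what picks 3 pm.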
You are aggregating your data, so use grouping and aggregation functions. You can add any where clause you want, but I copied your where clause in, changing the dates so every record is selected. Borrowing Gary's create table and insert statements:
select min(i_dt) keep (dense_rank last order by i_val) i_dt
     , max(i_val) i_val
from i
where i_dt >= to_date('26/03/2010','dd/mm/yyyy')
and i_dt < to_date('27/05/2010','dd/mm/yyyy')
group by trunc(i_dt);
I_DT I_VAL
------------------- ----------
18-05-2010 15:00:00 60
19-05-2010 18:00:00 450
20-05-2010 18:00:00 310
3 rows selected.
Regards,
Rob.
I've not tried this, but I would think you want something like:
select max(date)
from table
where date >= to_date('26/03/2010','DD/MM/YYYY') AND date < to_date('27/03/2010','DD/MM/YYYY')
group by trunc(date)