Timestamp difference between rows in PostgreSQL

In PostgreSQL I have a table
CREATE TABLE cubenotification.newnotification
(
idnewnotification serial NOT NULL,
contratto text,
idlocation numeric,
typology text,
idpost text,
iduser text,
idobject text,
idwhere text,
status text DEFAULT 'valid'::text,
data_crea timestamp with time zone DEFAULT now(),
username text,
usersocial text,
url text,
apikey text,
CONSTRAINT newnotification_pkey PRIMARY KEY (idnewnotification )
)
Let's say that the typology field can be "owned_task" or "fwd_task".
What I'd like to get from the DB is the timestamp difference in seconds strictly between data_crea of the row with typology "fwd_task" and data_crea of the row with typology "owned_task", for every pair (idobject, iduser). I'd also like to get EXTRACT(WEEK FROM data_crea) as "weeks" and group the results by "weeks". My main problem is computing the ordered timestamp difference between two rows with the same idobject, the same iduser and a different typology.
EDIT:
Here is some sample data
and an sqlfiddle link: http://sqlfiddle.com/#!12/6cd64/2

What you are looking for is the JOIN of two subselects:
SELECT EXTRACT(WEEK FROM o.data_crea) AS owned_week
, iduser, idobject
, EXTRACT('epoch' FROM (o.data_crea - f.data_crea)) AS diff_in_sek
FROM (SELECT * FROM newnotification WHERE typology = 'owned_task') o
JOIN (SELECT * FROM newnotification WHERE typology = 'fwd_task') f
USING (iduser, idobject)
ORDER BY 1,4
I order by week and timestamp difference. The week number is based on the week of 'owned_task'.
This assumes there is exactly one row for 'owned_task' and one for 'fwd_task' per (iduser, idobject), not one per week. Your specification leaves room for interpretation.
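The join of the two subselects can be tried out without a PostgreSQL server. The following is only a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in, the sample rows are invented, timestamps are stored as ISO-8601 text, and SQLite's strftime('%s', ...) replaces EXTRACT('epoch' FROM ...). The week column is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE newnotification (
    typology  TEXT,
    iduser    TEXT,
    idobject  TEXT,
    data_crea TEXT   -- ISO-8601 text; SQLite has no timestamptz type
);
-- invented sample rows: one 'fwd_task' and one 'owned_task' per (iduser, idobject)
INSERT INTO newnotification VALUES
    ('fwd_task',   'u1', 'o1', '2013-01-07 10:00:00'),
    ('owned_task', 'u1', 'o1', '2013-01-07 10:01:30'),
    ('fwd_task',   'u2', 'o2', '2013-01-15 08:00:00'),
    ('owned_task', 'u2', 'o2', '2013-01-15 09:00:00');
""")

# Same shape as the answer: one subselect per typology, joined on the pair.
rows = conn.execute("""
SELECT o.iduser, o.idobject,
       strftime('%s', o.data_crea) - strftime('%s', f.data_crea) AS diff_in_sec
FROM  (SELECT * FROM newnotification WHERE typology = 'owned_task') o
JOIN  (SELECT * FROM newnotification WHERE typology = 'fwd_task')  f
      ON o.iduser = f.iduser AND o.idobject = f.idobject
ORDER BY diff_in_sec
""").fetchall()

print(rows)
```

If a pair has more than one row per typology, the join multiplies rows, which is why the one-row-per-pair assumption matters.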

Related

How to conditionally left join records in one table that fall between two date periods in another table?

I currently have the following two tables:
Please note I am using SQLite.
The first table is called url_table - it has a unique ID, a start_time, an end_time and a reference.
I am aware that the start_time and end_time columns are TEXT and that a solution may need them converted to a date format, but I am not yet sure how to do that.
CREATE TABLE "url_table" (
"ID" TEXT,
"start_time" TEXT,
"end_time" TEXT,
"reference" TEXT
);
INSERT INTO "url_table" ("ID", "start_time", "end_time", "reference")
VALUES ('abcd', '2019-10-10 17:00', '2019-10-10 17:10', 'boy');
INSERT INTO "url_table" ("ID", "start_time", "end_time", "reference")
VALUES ('efgh', '2019-11-10 18:00', '2019-11-10 18:10', 'girl');
The second table is calling_table.
It contains a unique ID (c_ID), a start_time (which should be a date but is TEXT) and a reference.
CREATE TABLE "calling_table" (
"c_ID" TEXT,
"start_time" TEXT,
"reference" TEXT
);
INSERT INTO "calling_table" ("c_ID", "start_time", "reference")
VALUES ('agfhfghd', '2019-10-10 17:05', 'boy');
INSERT INTO "calling_table" ("c_ID", "start_time", "reference")
VALUES ('fghfghfghrty', '2019-11-10 18:05', 'girl');
My question is the following:
I would like to left join the calling_table to the url_table by the common column "reference" - but I would like to left join in such a way that I am joining where the calling_table's start_time record is between the url_table's start_time and end_time.
So for example - the first and second record of calling_table both have a start_time that falls within the first two records of the url_table - so this information will be joined to the url_table.
I am unsure how to do this with a left join.
Any help appreciated
I think the start_time and end_time columns should be dates for this to work. However, I do not know how to do that in SQLite, so I have left them as TEXT.
sql fiddle here
Expected result is the following:
ID   | start_time       | end_time         | reference | c_ID     | call.start_time  | call.reference
-----+------------------+------------------+-----------+----------+------------------+---------------
abcd | 2019-10-10 17:00 | 2019-10-10 17:00 | boy       | agfhfghd | 2019-10-10 17:05 | boy
I think the logic you want is:
select . . . -- list the columns you want here
from calling_table ct left join
url_table ut
on ct.reference = ut.reference and
ct.start_time >= ut.start_time and
ct.start_time < ut.end_time;
Here is a SQL fiddle.
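To check the range condition against the sample data from the question, here is a small sketch using Python's built-in sqlite3 module. The third calling_table row ('late_call') is a hypothetical addition to show what the left join returns when no window matches. Because the TEXT timestamps are in the sortable 'YYYY-MM-DD HH:MM' form, plain string comparison orders them correctly and no date conversion is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE url_table (ID TEXT, start_time TEXT, end_time TEXT, reference TEXT);
CREATE TABLE calling_table (c_ID TEXT, start_time TEXT, reference TEXT);
INSERT INTO url_table VALUES
    ('abcd', '2019-10-10 17:00', '2019-10-10 17:10', 'boy'),
    ('efgh', '2019-11-10 18:00', '2019-11-10 18:10', 'girl');
INSERT INTO calling_table VALUES
    ('agfhfghd',     '2019-10-10 17:05', 'boy'),
    ('fghfghfghrty', '2019-11-10 18:05', 'girl'),
    ('late_call',    '2019-11-10 19:00', 'girl');  -- hypothetical: outside every window
""")

# Join on reference AND on the call falling inside [start_time, end_time).
matches = conn.execute("""
SELECT ct.c_ID, ut.ID
FROM calling_table ct
LEFT JOIN url_table ut
       ON ct.reference = ut.reference
      AND ct.start_time >= ut.start_time
      AND ct.start_time <  ut.end_time
ORDER BY ct.start_time
""").fetchall()

print(matches)
```

The unmatched call is kept with NULL (None in Python) for the url_table columns. Swapping the two tables around the LEFT JOIN would instead keep every url_table row.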

SQLite: finding abnormal values over time

I have the following SQLite table:
CREATE TABLE test (
id INTEGER NOT NULL,
date TEXT,
account TEXT,
........
value TEXT,
.......
PRIMARY KEY (id),
CONSTRAINT composite UNIQUE (date, account)
)
I want to find all the account numbers where the value is greater than 0 on 2 separate dates. I'm thinking:
SELECT * from test WHERE value> 0 GROUP BY account
is probably a start, but I don't know how to evaluate the size of the groups.
One way to phrase this query is to aggregate over accounts having a greater than zero value, and then retain those accounts having two or more distinct dates:
SELECT
account
FROM test
WHERE value > 0
GROUP BY account
HAVING COUNT(DISTINCT date) >= 2
I see that your value column is declared as TEXT. I think this should probably be an integer if you want to do numeric comparisons with this column.
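A quick way to try the HAVING approach is the sketch below, using Python's built-in sqlite3 module. The rows are invented, the elided columns from the question are omitted, and since value is kept as TEXT here, the query adds a CAST (an assumption on my part, not in the query above) to force a numeric comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test (
    id      INTEGER NOT NULL,
    date    TEXT,
    account TEXT,
    value   TEXT,
    PRIMARY KEY (id),
    CONSTRAINT composite UNIQUE (date, account)
);
-- invented rows: only account A is positive on two different dates
INSERT INTO test (id, date, account, value) VALUES
    (1, '2020-01-01', 'A', '5'),
    (2, '2020-01-02', 'A', '3'),
    (3, '2020-01-01', 'B', '7'),
    (4, '2020-01-02', 'B', '0'),
    (5, '2020-01-01', 'C', '0.0'),
    (6, '2020-01-02', 'C', '0.0');
""")

accounts = conn.execute("""
SELECT account
FROM test
WHERE CAST(value AS REAL) > 0        -- numeric comparison despite the TEXT column
GROUP BY account
HAVING COUNT(DISTINCT date) >= 2     -- at least two separate dates
ORDER BY account
""").fetchall()

print(accounts)
```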

Postgres: How to find nearest tsrange from timestamp outside of ranges?

I am modeling (in Postgres 9.6.1 / PostGIS 2.3.1) a booking system for local services provided by suppliers:
create table supplier (
id serial primary key,
name text not null check (char_length(name) < 280),
type service_type,
duration interval,
...
geo_position geography(POINT,4326)
...
);
Each supplier keeps a calendar with time slots when he/she is available to be booked:
create table timeslot (
id serial primary key,
supplier_id integer not null references supplier(id),
slot tstzrange not null,
constraint supplier_overlapping_timeslot_not_allowed
exclude using gist (supplier_id with =, slot with &&)
);
For when a client wants to know which nearby suppliers are available to book at a certain time, I create a view and a function:
create view supplier_slots as
select
supplier.name, supplier.type, supplier.geo_position, supplier.duration, ...
timeslot.slot
from
supplier, timeslot
where
supplier.id = timeslot.supplier_id;
create function find_suppliers(wantedType service_type, near_latitude text, near_longitude text, at_time timestamptz)
returns setof supplier_slots as $$
declare
nearpoint geography;
begin
nearpoint := ST_GeographyFromText('SRID=4326;POINT(' || near_longitude || ' ' || near_latitude || ')'); -- WKT order is "lon lat"
return query
select * from supplier_slots
where type = wantedType
and tstzrange(at_time, at_time + duration) <# slot
order by ST_Distance( nearpoint, geo_position )
limit 100;
end;
$$ language plpgsql;
All this works really well.
Now, for the suppliers that did NOT have a bookable time slot at the requested time, I would like to find their closest available timeslots, before and after the requested at_time, also sorted by distance.
This has my mind spinning a little bit and I can't find any suitable operators to give me the nearest tsrange.
Any ideas on the smartest way to do this?
The solution depends on the exact definition of what you want.
Schema
I suggest these slightly adapted table definitions to make the task simpler, enforce integrity and improve performance:
CREATE TABLE supplier (
supplier_id serial PRIMARY KEY,
supplier text NOT NULL CHECK (length(supplier) < 280),
type service_type,
duration interval,
geo_position geography(POINT,4326)
);
CREATE TABLE timeslot (
timeslot_id serial PRIMARY KEY,
supplier_id integer NOT NULL REFERENCES supplier,
slot_a timestamptz NOT NULL,
slot_z timestamptz NOT NULL,
CONSTRAINT timeslot_range_valid CHECK (slot_a < slot_z),
CONSTRAINT timeslot_no_overlapping
EXCLUDE USING gist (supplier_id WITH =, tstzrange(slot_a, slot_z) WITH &&)
);
CREATE INDEX timeslot_slot_z ON timeslot (supplier_id, slot_z);
CREATE INDEX supplier_geo_position_gist ON supplier USING gist (geo_position);
Save two timestamptz columns slot_a and slot_z instead of the tstzrange column slot - and adapt constraints accordingly. This treats all ranges as default inclusive lower and exclusive upper bounds automatically now - which avoids corner case errors / headache.
Collateral benefit: only 16 bytes for 2 timestamptz instead of 25 bytes (32 with padding) for the tstzrange.
All queries you might have had on slot keep working with tstzrange(slot_a, slot_z) as drop-in replacement.
Add an index on (supplier_id, slot_z) for the query at hand.
And a spatial index on supplier.geo_position (which you probably have already).
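The point about inclusive lower and exclusive upper bounds can be illustrated outside the database. The following is just a sketch in Python, with a hypothetical helper named overlaps. It mirrors what the && operator reports for two tstzrange values built with the default '[)' bounds: slots that merely touch do not overlap.

```python
from datetime import datetime

def overlaps(a_start, a_end, b_start, b_end):
    """True if half-open ranges [a_start, a_end) and [b_start, b_end) share any instant."""
    return a_start < b_end and b_start < a_end

nine, ten, eleven = (datetime(2017, 1, 1, hour) for hour in (9, 10, 11))

# 09:00-10:00 followed directly by 10:00-11:00: allowed, no overlap.
back_to_back = overlaps(nine, ten, ten, eleven)
# 09:00-11:00 contains 10:00-11:00: overlap.
nested = overlaps(nine, eleven, ten, eleven)

print(back_to_back, nested)
```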
Depending on data distribution in type, a couple of partial indexes for types common in queries might help performance:
CREATE INDEX supplier_geo_type_foo_gist ON supplier USING gist (geo_position)
WHERE type = 'foo'::service_type;
Query / Function
This query finds the X closest suppliers who offer the correct service_type (100 in the example), each with the one closest matching time slot (defined by the time distance to the start of the slot). I combined this with actually matching slots, which may or may not be what you need.
CREATE FUNCTION f_suppliers_nearby(_type service_type, _lat text, _lon text, at_time timestamptz)
RETURNS TABLE (supplier_id int
, name text
, duration interval
, geo_position geography(POINT,4326)
, distance float
, timeslot_id int
, slot_a timestamptz
, slot_z timestamptz
, time_dist interval
) AS
$func$
WITH sup_nearby AS ( -- find matching or later slot
SELECT s.supplier_id, s.supplier AS name, s.duration, s.geo_position
, ST_Distance(ST_GeographyFromText('SRID=4326;POINT(' || _lon || ' ' || _lat || ')') -- WKT order is "lon lat"
, s.geo_position) AS distance
, t.timeslot_id, t.slot_a, t.slot_z
, CASE WHEN t.slot_a IS NOT NULL
THEN GREATEST(t.slot_a - at_time, interval '0') END AS time_dist
FROM supplier s
LEFT JOIN LATERAL (
SELECT *
FROM timeslot
WHERE supplier_id = s.supplier_id -- correlate with the outer supplier row
AND slot_z > at_time + s.duration -- excl. upper bound
ORDER BY slot_z
LIMIT 1
) t ON true
WHERE s.type = _type
ORDER BY distance -- output column alias, not a column of s
LIMIT 100
)
SELECT *
FROM (
SELECT DISTINCT ON (supplier_id) * -- 1 slot per supplier
FROM (
TABLE sup_nearby -- matching or later slot
UNION ALL -- earlier slot
SELECT s.supplier_id, s.name, s.duration, s.geo_position
, s.distance
, t.timeslot_id, t.slot_a, t.slot_z
, GREATEST(at_time - t.slot_a, interval '0') AS time_dist
FROM sup_nearby s
CROSS JOIN LATERAL ( -- this time CROSS JOIN!
SELECT *
FROM timeslot
WHERE supplier_id = s.supplier_id
AND slot_z <= at_time -- excl. upper bound
ORDER BY slot_z DESC
LIMIT 1
) t
WHERE s.time_dist IS DISTINCT FROM interval '0' -- exact matches are done
) sub
ORDER BY supplier_id, time_dist -- pick temporally closest slot per supplier
) sub
ORDER BY time_dist, distance; -- matches first, ordered by distance; then misses, ordered by time distance
$func$ LANGUAGE sql;
I did not use your view supplier_slots and optimized for performance instead. The view may still be convenient. You might include tstzrange(slot_a, slot_z) AS slot for backward compatibility.
The basic query to find the 100 closest suppliers is a textbook "K Nearest Neighbour" problem. A GiST index works well for this. Related:
How do I query all rows within a 5-mile radius of my coordinates?
The additional task (find the temporally nearest slot) can be split in two tasks: to find the next higher and the next lower row. The core feature of the solution is to have two subqueries with ORDER BY slot_z LIMIT 1 and ORDER BY slot_z DESC LIMIT 1, which result in two very fast index scans.
I combined the first one with finding actual matches, which is a (smart, I think) optimization, but may distract from the actual solution.
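The core idea, one "next higher" and one "next lower" lookup, each with ORDER BY ... LIMIT 1, can be tried in isolation. The sketch below makes several assumptions: it uses Python's built-in sqlite3 module instead of Postgres, stores timestamps as sortable text, uses invented slots for a single supplier, and leaves out the spatial part and the duration check entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE timeslot (
    timeslot_id INTEGER PRIMARY KEY,
    supplier_id INTEGER NOT NULL,
    slot_a      TEXT NOT NULL,
    slot_z      TEXT NOT NULL
);
CREATE INDEX timeslot_slot_z ON timeslot (supplier_id, slot_z);
-- invented slots for supplier 1
INSERT INTO timeslot VALUES
    (1, 1, '2017-01-01 08:00', '2017-01-01 09:00'),
    (2, 1, '2017-01-01 12:00', '2017-01-01 13:00'),
    (3, 1, '2017-01-01 16:00', '2017-01-01 17:00');
""")

at_time = '2017-01-01 10:00'

# Next higher row: first slot ending after the requested time.
next_slot = conn.execute("""
    SELECT timeslot_id FROM timeslot
    WHERE supplier_id = ? AND slot_z > ?
    ORDER BY slot_z LIMIT 1""", (1, at_time)).fetchone()

# Next lower row: last slot that has already ended by the requested time.
prev_slot = conn.execute("""
    SELECT timeslot_id FROM timeslot
    WHERE supplier_id = ? AND slot_z <= ?
    ORDER BY slot_z DESC LIMIT 1""", (1, at_time)).fetchone()

print(next_slot, prev_slot)
```

Both lookups filter and sort on (supplier_id, slot_z), which is the column order of the index suggested above.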

Use interval to add on timestamp value from other table, in Oracle

I'm a beginner in SQL and I am working with Oracle. I am having trouble with interval arithmetic: I want to add an integer value, stored in another table, to a timestamp column as a number of minutes.
To test my schema I used a data generator. As a result, some of the generated data is unreliable, and I need to check for overlaps between two appointments, i.e. cases where the same patient is scheduled for two treatments that overlap.
I have treatments_appointments table that contains these attributes:
treatments_appointments(app_id NUMBER(38) NOT NULL,
[fk] care_id NUMBER(38) NOT NULL,
[fk] doctor_id NUMBER(38) NOT NULL,
[fk] room_id NUMBER(38) NOT NULL,
[fk] branch_id NUMBER(38) NOT NULL,
[fk] patient_id NUMBER(38) NOT NULL,
appointment_time TIMESTAMP NOT NULL)
Below is the code I wrote; it produces an error message:
SELECT app1.app_id
FROM treatment_appointment app1
INNER JOIN treatment_appointment app2
ON app1.patient_id = app2.patient_id
WHERE app1.appointment_time >= app2.appointment_time AND
app1.appointment_time <=
app2.appointment_time + interval (to_char(select care_categories.care_duration where app2.care_id = care_categories.care_id)) minute
AND
app1.app_id != app2.app_id
The error message is:
ORA-00936: missing expression
Sorry about my English and thanks for answering my question!
You can only use a fixed string value for an INTERVAL literal, not a variable, an expression or a column value. But you can use the NUMTODSINTERVAL function to convert a number of minutes into an interval. Instead of:
interval (to_char(select care_categories.care_duration
where app2.care_id = care_categories.care_id)) minute
Use:
numtodsinterval((select care_categories.care_duration from care_categories
where app2.care_id = care_categories.care_id), 'MINUTE')
Although you should join to that table in the main query rather than doing a subquery for every row:
SELECT app1.app_id
FROM treatment_appointment app1
INNER JOIN treatment_appointment app2
ON app1.patient_id = app2.patient_id
INNER JOIN care_categories cc
ON app2.care_id = cc.care_id
WHERE app1.appointment_time >= app2.appointment_time AND
app1.appointment_time <=
app2.appointment_time + numtodsinterval(cc.care_duration, 'MINUTE') AND
app1.app_id != app2.app_id
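The shape of the joined query can be checked without an Oracle instance. This is only a sketch under clear assumptions: it runs on Python's built-in sqlite3 module, the rows are invented, only the columns used by the query are created, and SQLite's datetime(ts, '+N minutes') modifier stands in for Oracle's NUMTODSINTERVAL(n, 'MINUTE').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE care_categories (care_id INTEGER PRIMARY KEY, care_duration INTEGER);
CREATE TABLE treatment_appointment (
    app_id           INTEGER PRIMARY KEY,
    care_id          INTEGER NOT NULL,
    patient_id       INTEGER NOT NULL,
    appointment_time TEXT NOT NULL
);
INSERT INTO care_categories VALUES (1, 30), (2, 60);  -- durations in minutes
INSERT INTO treatment_appointment VALUES
    (10, 1, 100, '2020-05-01 09:00:00'),
    (11, 2, 100, '2020-05-01 09:20:00'),  -- starts during appointment 10
    (12, 1, 100, '2020-05-01 11:00:00'),
    (13, 1, 200, '2020-05-01 09:10:00');  -- different patient
""")

overlapping = conn.execute("""
SELECT app1.app_id
FROM treatment_appointment app1
JOIN treatment_appointment app2 ON app1.patient_id = app2.patient_id
JOIN care_categories cc         ON app2.care_id = cc.care_id
WHERE app1.appointment_time >= app2.appointment_time
  AND app1.appointment_time <=
      datetime(app2.appointment_time, '+' || cc.care_duration || ' minutes')
  AND app1.app_id != app2.app_id
""").fetchall()

print(overlapping)
```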

Select date between two dates

I have this select statement to return the number of rows between two dates. It only returns one, while there are four rows.
SELECT count(*) as number
FROM PFServicesLogging
WHERE User_ID = '784198013531599'
AND ServiceType = 1
AND InsertDate between "2013-11-11" and "2013-11-19"
table structure is
CREATE TABLE `PFServicesLogging`
(
ID INTEGER NOT NULL,
ServiceType INTEGER NOT NULL,
Datetime DATETIME,
User_ID TEXT,
Frequency INTEGER,
`InsertDate` Date,
PRIMARY KEY(ID)
)
If your date fields include the time of day, do not use between. Use
where YourDateField >= StartDate
and YourDateField < TheDayAfterTheEndDate
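The difference between BETWEEN and the half-open range shows up as soon as the stored values carry a time of day. The sketch below assumes SQLite (via Python's built-in sqlite3 module), invented rows, and dates stored as text. Other engines with true date types compare differently at midnight of the end date, but still miss the rest of that day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PFServicesLogging (ID INTEGER PRIMARY KEY, InsertDate TEXT);
-- invented rows, all inside 2013-11-11 .. 2013-11-19
INSERT INTO PFServicesLogging VALUES
    (1, '2013-11-11 08:00:00'),
    (2, '2013-11-15 12:30:00'),
    (3, '2013-11-19 00:00:00'),
    (4, '2013-11-19 17:45:00');
""")

# BETWEEN stops at the bare end date, so rows on the 19th fall outside.
with_between = conn.execute(
    "SELECT count(*) FROM PFServicesLogging "
    "WHERE InsertDate BETWEEN '2013-11-11' AND '2013-11-19'").fetchone()[0]

# Half-open range: from the start date up to, but excluding, the day after the end date.
half_open = conn.execute(
    "SELECT count(*) FROM PFServicesLogging "
    "WHERE InsertDate >= '2013-11-11' AND InsertDate < '2013-11-20'").fetchone()[0]

print(with_between, half_open)
```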
Edit Starts Here
Reading the comments, maybe the problem has nothing to do with the dates. Maybe it's the user_id or service type. To troubleshoot, replace your where clause with
where 1 = 1
/*
User_ID = '784198013531599'
AND ServiceType = 1
AND InsertDate between "2013-11-11" and "2013-11-19"
*/
and run your query. Take your filters out of the comment block one by one so that you can see which one causes the unexpected results.