I have a table named villas, and this table has a column named reserved_dates of type daterange[].
I want to keep the booked dates in the reserved_dates column.
Villas are booked between certain dates.
For Example:
Check In Date: 2023-02-05 Check Out Date: 2023-02-15.
and in this case I can manually add {"[2023-02-05,2023-02-15)"} value to the reserved_dates column.
What I want is, for example, when a client chooses these dates:
Check In Date: 2023-02-10
Check Out Date: 2023-02-20
I want to check whether the selected date range conflicts with any range already in the database.
And if there is no conflicting reservation, I want to add the new range. How can I do that?
Or what can I do for this problem?
I couldn't find the result I wanted, or examples of how to use these newer range types, on many blog platforms or in the PostgreSQL 14 documentation.
I am able to manually add a date range to reserved_dates, but I can't work out how to reject the new reservation when the reservations overlap.
An exclusion constraint will do what you want. You have to include more than just the date range (one room cannot be booked more than once on any given day, but two separate rooms can be booked on the same day).
CREATE EXTENSION IF NOT EXISTS BTREE_GIST;

CREATE TABLE demo_table(
    RoomNumber INTEGER NOT NULL,
    CheckIn DATE NOT NULL,
    CheckOut DATE NOT NULL,
    CHECK (CheckIn < CheckOut),
    EXCLUDE USING gist (RoomNumber WITH =, daterange(CheckIn, CheckOut, '[)'::text) WITH &&)
);
The && operator is the range overlap (range1 && range2), which you can test in a regular SELECT query too.
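For example, a pre-insert conflict check against the table above is a plain query like this (a sketch; the room number 1 is arbitrary and the dates come from the question):

SELECT EXISTS (
    SELECT 1
    FROM demo_table
    WHERE RoomNumber = 1
      AND daterange(CheckIn, CheckOut, '[)') && daterange('2023-02-10', '2023-02-20', '[)')
) AS is_conflicting;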
EDIT: I have seen your comments.
Point 1: the devil is in the details. You named your table villas (plural), suggesting there is more than one villa to manage, in which case there should be an additional column identifying which villa a reservation belongs to (if not RoomNumber INTEGER, call it VillaName TEXT).
Honestly, even if you have only 1 villa, it would not hurt to plan for the future and make it so adding another one in the future does not require you to change your entire schema.
Point 2: I do not know why you would store the reservations in an array; it is probably a bad design choice (it will not let you use an index, for instance, nor delete past records as easily).
UNNEST is a quick fix for you. It turns your array elements into records.
Example:
SELECT *
FROM (
    SELECT UNNEST(reserved_dates) AS Reserved, [add other columns here]
    FROM villas
) R
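With your current array design, the same idea gives you the conflict check you asked for (a sketch; the villa_id column is an assumption, since your table needs some identifier):

SELECT EXISTS (
    SELECT 1
    FROM villas v
    CROSS JOIN UNNEST(v.reserved_dates) AS r(reserved)
    WHERE v.villa_id = 1
      AND r.reserved && daterange('2023-02-10', '2023-02-20', '[)')
) AS is_conflicting;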
But the correct way to do things, as was said in the comments, would be more along the lines of:
CREATE TABLE villaReservation ([...]);
INSERT INTO villaReservation SELECT UNNEST(reserved_dates), ... FROM villas;

-- the conflict check then becomes a plain query:
SELECT * FROM villaReservation
WHERE Reserved && daterange('2023-02-10', '2023-02-20', '[)'::text);
Last thing: I personally prefer keeping the 2 bounds of ranges separate in tables (above, keep separate check-in and check-out dates).
It makes migrating from PostgreSQL to another DBMS easier (the table's CREATE script will not need to be adjusted).
It might not apply in your case, but it makes it possible to have ranges in the form [date1, date1), that is, empty ranges that still have a position in time. I actually encountered one use case where things needed to be saved this way, albeit in a different context than yours.
I am trying to copy data from one table to another table, which works fine, but I only want to copy certain data from one of the columns.
Insert Into Period (Invoice_No, Period_Date)
Select Invoice_Seq_No, Inv_Comment
From Invoices
Where INV_Comment LIKE '%November 2015';
The Inv_Comment column contains free-form comments and the date in different formats, e.g. "paid on November 2015" or "paid on Aug" or "July 2015". What I am trying to do is to copy only the "November 2015" part of the comment into the new table.
The above code only copies the entire data of the Inv_Comment field and I only want to copy the date. The date part can be in one of three formats: MON YYYY, DD.MM.YYYY or only the month i.e. MON
How can I extract only the date part I am interested in?
For your very simple example query you can use the substr() function, using the length of your fixed value to count back from the end of the string, as that document describes:
If position is negative, then Oracle counts backward from the end of char.
So you can do:
select invoice_seq_no, substr(inv_comment, -length('November 2015'))
from invoices
where inv_comment like '%November 2015';
But it's clear from the comments that you really want to find all dates, in various formats, and not always at the end of the free-form text. One option is to search the text repeatedly for all the possible formats and values, starting with the most specific (e.g. DD.MM.YYYY) and going down to the least specific (e.g. just MON). You could insert just the sequence numbers into your table to start with, and then repeatedly update the rows that do not yet have values set:
insert into period (invoice_no) select invoice_seq_no from invoices;
update period p
set period_date = (
select case when instr(i.inv_comment, '15.09.2015') > 0 then
substr(i.inv_comment, instr(i.inv_comment, '15.09.2015'), length('15.09.2015'))
end
from invoices i
where i.invoice_seq_no = p.invoice_no
)
where period_date is null;
then repeat the update with another date, or a more generic November 2015 pattern, etc. But specifying every possible date isn't going to be feasible, so you could use regular expressions. There are probably better patterns for this, but as an example:
update period p
set period_date = (
select regexp_substr(i.inv_comment, '[0-3][0-9][-./][0-1][0-9][-./][12]?[901]?[0-9]{2}')
from invoices i
where i.invoice_seq_no = p.invoice_no
)
where period_date is null;
which matches (or attempts to match) anything looking like DD.MM.YYYY, followed by maybe:
update period p
set period_date = (
select regexp_substr(i.inv_comment,
'(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|'
|| 'Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)([[:space:]]+[12]?[901]?[0-9]{2})?')
from invoices i
where i.invoice_seq_no = p.invoice_no
)
where period_date is null;
which matches any short or long month name. You may have mixed case though - aug, Aug, AUG - so you might want to use the match parameter to make it case-insensitive. This isn't supposed to be a complete solution though, and you may need further formats. There are some ideas on other questions.
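For instance, the month search above becomes case-insensitive by passing 'i' as the match parameter of regexp_substr(), with position and occurrence left at their defaults of 1 (shown explicitly here):

select regexp_substr(i.inv_comment,
    '(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|'
    || 'Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)', 1, 1, 'i') as month_match
from invoices i;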
You may really want actual dates, which means breaking down a bit more, and then assuming missing years - perhaps taking the year from another column (order date?) if it isn't available in the comments, though that gets a bit messy around year-end. But you can essentially do the same thing, just passing each extracted value through to_date() with a format mask matching the search expression you're using.
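For instance, assuming period_date is (or is altered to be) a DATE column, the DD.MM.YYYY matches could be converted like this, following the same update pattern as above:

update period p
set period_date = (
    select to_date(regexp_substr(i.inv_comment,
               '[0-3][0-9]\.[0-1][0-9]\.[12][0-9]{3}'), 'DD.MM.YYYY')
    from invoices i
    where i.invoice_seq_no = p.invoice_no
)
where period_date is null;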
There will always be mistakes, typos, odd formatting etc., so even if this approach identifies most patterns, you'll probably end up with some that are left blank and will need to be set manually by a human looking at the comments; and some that are just wrong. But this is why dates shouldn't be stored as strings at all - having them mixed in with other text just makes things even worse.
Here you're dealing with strings containing disparate date information. Several string operations may be needed.
I need to add a unique constraint to an Oracle database table where a foreign key value may appear more than once only if the date ranges in two other columns do not overlap.
e.g.
car_id  start_date  end_date
3       01/10/2012  30/09/2013
3       01/10/2013  30/09/2014  -- okay, no overlap
3       01/10/2014  30/09/2015  -- okay, no overlap
4       01/10/2012  30/09/2013  -- okay, different foreign key
3       01/11/2013  01/01/2014  -- * not allowed: overlapping dates for this car
Any suggestions? Thanks in advance.
The last time I saw this requirement and a solution for it, it looked like this:
Create an AFTER statement trigger. In this trigger, do a self join on your table like this:
select count(*)
from your_table a
join your_table b
  on a.car_id = b.car_id
 and a.rowid < b.rowid  -- exclude each row matching itself
 and (a.start_date between b.start_date and b.end_date
      or
      b.start_date between a.start_date and a.end_date)
If the count is zero then everything is OK. If the count is greater than zero, raise an exception and the statement will be rolled back.
Note: this will not work well for tables with millions of rows and many inserts.
It works on small lookup tables or, for a big table, with infrequent (e.g. batch) inserts.
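For illustration, a minimal sketch of such a statement-level trigger (your_table, the trigger name and the error message are placeholders):

create or replace trigger trg_no_car_overlap
after insert or update on your_table
declare
    v_count integer;
begin
    -- statement-level trigger, so querying the table itself is allowed
    select count(*)
    into v_count
    from your_table a
    join your_table b
      on a.car_id = b.car_id
     and a.rowid < b.rowid
     and (a.start_date between b.start_date and b.end_date
          or b.start_date between a.start_date and a.end_date);
    if v_count > 0 then
        raise_application_error(-20001, 'Overlapping date ranges for the same car_id');
    end if;
end;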
I take it that cars are tracked through some sort of process and every date records a state change. For example, you show that car #3 underwent a state change on 1 Oct 2012, again on 1 Oct 2013 and again on 1 Oct 2014. The final entry implies that the state changed again on 1 Oct 2015. Where is the entry showing that? Or is the state something that always lasts exactly one year -- making it possible to specify the end of the state as soon as the state begins? If so, then the entry showing the state change on 1 Nov 2013 is simply wrong. But the one-year specification could just be a coincidence. You could have just picked simplistic data points for your example data.
Your concern at this point is to strictly distinguish valid data from accurate data. We design databases (or should) with an emphasis on data integrity or validity. That means we constrain each piece of data as sharply as possible so it is consistent with the specifications of that piece of data.
For example, the car id field is a foreign key -- generally to a table that defines each instance of the car entity. So we know that at least two cars exist, with ids of 3 and 4. Otherwise those values could not appear in the example you show.
But what about accuracy or correctness? Suppose that, in the last entry in your example, the car id 3 should really have been 4. There is no way to tell from within the database. This illustrates the difference. Both 3 and 4 are valid values and we are able to constrain these to only valid values. But only one is correct -- assuming for a moment they are the only two cars so far defined. The point is, there is no test, no way to constrain the values to the one that is correct. We can check for validity -- not accuracy.
What you are trying to do is check for accuracy with a validity test. You may claim the "no overlaps" restriction becomes a validity check, but this is just a sort of accuracy check. We can sometimes perform tests to signal data anomalies that indicate an inaccuracy exists somewhere. For example, the overlap could mean the end date of 30 Sep 2014 (second row) is wrong or the start date of 1 Nov 2013 (last row) is wrong or both could be wrong. We have no idea which situation this represents. So we can't just prevent the last row from being entered into the database -- it might be correct with the second row being incorrect.
Invalid data is invalid on its own. Suppose an attempt is made to insert a row for car id 15 and there is no entry for car 15 in the CARS table. Then the value 15 is invalid and the row can be (and should be) prevented from ever entering the table. But date period overlaps are caused by wrong data somewhere -- we have no way of knowing exactly where. We can signal the inconsistency to the user or make a log entry somewhere to have someone look into the problem, but we shouldn't reject the row that "caused" the overlap when it could very well be the existing row that contains the wrong data.
Accuracy, like the data itself, originates from outside the database. If we are lucky enough to be able to detect instances of inaccuracy, the solution also lies outside the database. The best we can do is flag it and have someone investigate to determine what data is correct and what is incorrect and (hopefully) correct the inaccuracy.
UPDATE: Having discussed a bit the concepts of data integrity and accuracy and the differences between them, here is a design idea that may be an improvement.
Note: this is based on the assumption that the date ranges form an unbroken range for each car from the first entry to the last. That is, there are no gaps.
Simple: do away with the end_date field altogether. The first entry for a car sets up the current state of that car with no end date specified. The clear implication is that the state continues indefinitely into the future until the next state change is inserted. The start date of the second state change then becomes the end date of the first state change. Continue as needed.
create table Car_States(
Car_ID int not null,
Start_Date date not null,
..., -- other info
constraint FK_Car_States_Car foreign key( Car_ID )
references Cars( ID ),
constraint PK_Car_States primary key( Car_ID, Start_Date )
);
Now let's look at the data
car_id  start_date
3       01/10/2012
3       01/10/2013  -- okay, no overlap
3       01/10/2014  -- okay, no overlap
4       01/10/2012  -- okay, different foreign key
3       01/11/2013  -- What does this mean???
Before that final row was entered, here is how the data is read for the car with id = 3: Car 3 started life in a particular state on 1 Oct 2012, changed to another state on 1 Oct 2013 and then again on 1 Oct 2014 where it remains.
Now the final row is entered: Car 3 started life in a particular state on 1 Oct 2012, changed to another state on 1 Oct 2013, changed to another state on 1 Nov 2013 and then again on 1 Oct 2014 where it remains.
As we can see, we are able to absorb the new data easily into the model. The design makes it impossible to have gaps or overlaps.
But is this really an improvement? What if the last entry was a mistake -- possibly meant to be for a different car instead of car 3? Or the wrong dates were entered. The new model just accepted the incorrect data with no complaints and we proceed not knowing we have incorrect data in the table.
This is true. But how is it any different from the original scenario? The last row represents "wrong" data. The question was, "How do I prevent this?" The answer is, in both cases, "You can't! Sorry." The best either design can do is detect the discrepancy and bring it to someone's attention.
One might think that with the original design, with the start and end dates in the same row, it is easy to determine if a new period overlaps any previously defined period. But this is also easily determined with the start-date-only design. What is important is that the test for discovering such possible inaccuracies before the data is written to the table lies primarily with the application, not just within the database.
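For example, with the start-date-only design each period's implied end date (and hence any overlap test) falls out of a single window function (a sketch against the Car_States table above):

select Car_ID, Start_Date,
       lead(Start_Date) over (partition by Car_ID order by Start_Date) as Implied_End_Date
from Car_States;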
It is up to the users and/or some automated process to verify new and existing data and determine if any inaccuracies exist. The advantage of using only one date is that, after displaying a warning message with an "Are you sure?" response, the new record can be inserted and the operation is finished. With two dates, other records must be found and their dates resynched to match the new period.
The table below contains customer reservations. Customers come in and create one record in this table, and on their last day the checkout_date field is updated with the current time.
The Table
Now I need to calculate the nights spent (due nights) for all customers.
The Query
SELECT reservations.customerid, reservations.roomno, rooms.rate,
    reservations.checkin_date, reservations.billed_nights, reservations.status,
    DateDiff("d", reservations.checkin_date, Date())
        + Abs(DateDiff("s", #12/30/1899 14:30:0#, Time()) > 0) AS Due_nights
FROM reservations, rooms
WHERE reservations.roomno = rooms.roomno;
What I need is: if the customer has checkout status, the due nights should be calculated from checkin_date up to the checkout date instead of the current date; also, if the customer has a checkout date, there is no need to add the extra value for the 14:30 cutoff.
My current query result is shown below; my computer time is 14:39, so it adds 1 to every row.
Since you want to calculate the due nights up to the checkout date, or up to the current date if the guest is still checked in, I would suggest you use an Immediate If (IIF).
The condition to check is the status of the room: if it is checkout, then use checkout_date, else use Now(), something like:
SELECT
reservations.customerid,
reservations.roomno,
rooms.rate,
reservations.checkin_date,
reservations.billed_nights,
reservations.status,
DateDiff("d", checkin_date, IIF(status = 'checkout', checkout_date, Now())) As DueNights
FROM
reservations
INNER JOIN
rooms
ON reservations.roomno = rooms.roomno;
As you might have noticed, I used an explicit JOIN. This is clearer than the implicit join of the two tables through the WHERE clause. Hope this helps!
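If you also want to keep the original 14:30 cutoff for guests who have not checked out yet, the extra term can be wrapped in the same IIF (a sketch reusing the cutoff logic from your original query):

SELECT reservations.customerid, reservations.roomno, rooms.rate,
    reservations.checkin_date, reservations.billed_nights, reservations.status,
    DateDiff("d", reservations.checkin_date,
        IIF(reservations.status = 'checkout', reservations.checkout_date, Date()))
    + IIF(reservations.status = 'checkout', 0,
        Abs(DateDiff("s", #12/30/1899 14:30:0#, Time()) > 0)) AS Due_nights
FROM reservations
INNER JOIN rooms ON reservations.roomno = rooms.roomno;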
I am proposing to have a table (the design isn't settled on yet and can be altered dependent upon the views expressed in reply to this question) that will have a primary key of type int (using auto increment) and a field (ReturnPeriod of type Nchar) that will contain data in the form of '06 2013' (representing in this instance June 2013).
I would simply like to return 06 or whatever happens to be in the last record entered in the table. This table will never grow by more than 4 records per annum (so it will never be that big). It also has a column indicating the date that the last entry was created.
That column seems, to my mind at least, the most suitable candidate for getting the last record. So essentially I'd like to know whether SQL has an inbuilt function for comparing the date the query is run against the nearest match in a column, and for returning the first two characters of a field.
So far I have:
Select Mid(ReturnPeriod,1,2) from Returns
Where DateReturnEntered = <and this is where I'm stuck>
What I'm looking for is a WHERE clause that would get me the last entered record, using the date the query is run as its reference point (DateReturnEntered of type Date contains the date a record was entered).
Of course there may be an even easier way to guarantee that one has the last record in which case I'm open to suggestions.
Thanks
I think you should store ReturnPeriod as a DATETIME: for example, not '06 2013' as a VARCHAR but 01.06.2013 as a DATETIME (the first day of June 2013).
In this case, if I've got your question right, you can use GETDATE() to get the current time:
SELECT TOP 1 MONTH(ReturnPeriod)
FROM Returns
WHERE DateReturnEntered<=GETDATE()
ORDER BY DateReturnEntered DESC
If you store ReturnPeriod as a varchar then
SELECT TOP 1 LEFT(ReturnPeriod,2)
FROM Returns
WHERE DateReturnEntered<=GETDATE()
ORDER BY DateReturnEntered DESC
I would store your ReturnPeriod as a date datatype, using a nominal 1st of the month, e.g. 1 Jun 2013, if you don't have the actual date.
This will allow direct comparison against your entered date, with trivial formatting of the return value if required.
Your query would then find the latest date prior to your date entered.
SELECT MONTH(MAX(ReturnPeriod)) AS ReturnMonth
FROM Returns
WHERE ReturnPeriod <= #DateReturnEntered
I've been given a stack of data where a particular value has been collected sometimes as a date (YYYY-MM-DD) and sometimes as just a year.
Depending on how you look at it, this is either a variance in type or margin of error.
This is a subprime situation, but I can't afford to go back and recover the data, and I can't discard any of it either.
What's the optimal (e.g. least worst :) ) SQL table design that will accept either form while avoiding monstrous queries and allowing maximum use of database features like constraints and keys*?
*i.e. Entity-Attribute-Value is out.
You could store the year, month and day components in separate columns. That way, you only need to populate the columns for which you have data.
If it comes in as just a year, make it default to 01 for the month and day: YYYY-01-01.
This way you can still use a date/datetime datatype and don't have to worry about invalid dates.
Either bring it in as a string unmolested, and modify it so it's consistent in another step, or modify the year-only values during the import like SQLMenace recommends.
I'd store the value in a DATETIME type and another value (just an integer will do, or some kind of enumerated type) that signifies its precision.
It would be easier to give more information if you mentioned what kind of queries you will be doing on the data.
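For illustration, a minimal sketch of that idea (all names here are made up; the precision flag records how much of the stored value is meaningful):

CREATE TABLE CollectedDates (
    Id int IDENTITY(1,1) PRIMARY KEY,
    RecordedDate datetime NOT NULL,  -- year-only values padded to YYYY-01-01
    DatePrecision tinyint NOT NULL   -- e.g. 1 = year only, 3 = full date
);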
Either fix it, then store it (OK, not an option)
Or store it broken, with a fixed computed column.
Something like this
CREATE TABLE ...
    ...
    Broken varchar(20),
    Fixed AS CAST(CASE WHEN Broken LIKE '[12][0-9][0-9][0-9]'
                       THEN Broken + '0101'
                       ELSE Broken END AS datetime)
This also allows you to distinguish good from bad source data.
If you don't always have a full date, what sort of keys and constraints would you need? Perhaps store two columns of data; a full date, and a year. For data that has only year, the year is stored and date is null. For items with full info, both are populated.
I'd put three columns in the table:
The provided value (YYYY-MM-DD or YYYY)
A date column, Date or DateTime data type, which is nullable
A year column, as an integer or char(4) depending upon your needs.
I'd always populate the year column, and populate the date column only when the provided value is a full date.
And, because you've kept the provided value, you can always re-process down the road if needs change.
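A sketch of that three-column layout (the names are illustrative):

CREATE TABLE CollectedValues (
    ProvidedValue varchar(10) NOT NULL,  -- as received: 'YYYY-MM-DD' or 'YYYY'
    FullDate date NULL,                  -- populated only when a full date was provided
    YearPart int NOT NULL                -- always populated
);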
An alternative solution would be that of a date mask (like a netmask for IP addresses). Store the date in a regular datetime field, and add an additional field of type smallint or similar in which you indicate which parts are present (you could even go binary here):
If you have YYYY-MM-DD, you have 3 bits of data, each of which is 1 if the corresponding part is present and 0 if not.
Example:
Date        Mask
2009-12-05  7 (111)
2009-12-01  6 (110: only the year and month are known; the day is set to the default 1)
2009-01-20  5 (101: for some strange reason, only the year and the day are known. January has 31 days, so it will never generate an error)
Which solution is better depends on what you will do with it.
This is better when you want to select rows with full dates that fall within a certain period (less to write). It also makes it easier to compare dates whose masks are e.g. 7, 6 or 4. It may also take up less memory (date + smallint may be smaller than int + int + int; only if the datetime uses 64 bits and smallint takes as much space as int would it be the same).
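For example, selecting only the fully-known dates within a period is then a plain test on the mask (table and column names assumed):

SELECT recorded_date
FROM collected_dates
WHERE date_mask = 7  -- 111: year, month and day all known
  AND recorded_date BETWEEN '2009-01-01' AND '2009-12-31';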
I was going to suggest the same solution as ninesided did above. Additionally, you could have a date field and a field that quantitatively represents your uncertainty. This offers the advantage of being able to represent things like "on or about Sept 23, 2010". The problem is that to represent the case where you only know the year, you'd have to set your date to be the middle of the year, with 182.5 days' uncertainty (assuming a non-leap year), which seems ugly.
You could use a similar but distinct approach with a mask that represents what date parts you're confident about - that's what SQLMenace offered in his answer above.
+1 each to recommendations from ninesided, Nikki9696 and Jeff Siver - I support all those answers though none was exactly what I decided upon.
My solution:
a date column used only for complete dates
an int column used for years
a constraint to ensure integrity between the two
a trigger to populate the year if only date is supplied
Advantages:
1. can run simple (one-column) queries on the date column with missing data ignored (by using NULL for what it was designed for)
2. can run simple (one-column) queries on the year column for any row with a date (because year is automatically populated)
3. can insert either year or date or both (provided they agree)
4. no fear of disagreement between columns
5. self-explanatory, intuitive
I would argue that methods using YYYY-01-01 to signify missing data (when flagged as such with a second explanatory column) fail seriously on points 1 and 5.
Example code for SQLite 3:
create table events
(
    rowid integer primary key,
    event_year integer,
    event_date date,
    check (event_year = cast(strftime('%Y', event_date) as integer))
);

create trigger year_trigger after insert on events
begin
    update events set event_year = cast(strftime('%Y', event_date) as integer)
    where rowid = new.rowid and event_date is not null;
end;

-- various methods to insert
insert into events (event_year, event_date) values (2008, '2008-02-23');
insert into events (event_year) values (2009);
insert into events (event_date) values ('2010-01-19');

-- select events in January without expressions on the supplementary columns
select rowid, event_date from events where strftime('%m', event_date) = '01';