Is it possible to return part of a field from the last row entered into a table - sql

I am proposing to have a table (the design isn't settled on yet and can be altered dependent upon the views expressed in reply to this question) that will have a primary key of type int (using auto increment) and a field (ReturnPeriod of type Nchar) that will contain data in the form of '06 2013' (representing in this instance June 2013).
I would simply like to return 06 or whatever happens to be in the last record entered in the table. This table will never grow by more than 4 records per annum (so it will never be that big). It also has a column indicating the date that the last entry was created.
That column seems, to my mind at least, to be the most suitable candidate for getting the last record, so essentially I'd like to know whether SQL has a built-in function for finding the value in a column nearest to the date the query is run, and for returning the first two characters of a field.
So far I have:
Select Mid(ReturnPeriod,1,2) from Returns
Where DateReturnEntered = <and this is where I'm stuck>
What I'm looking for is a where clause that would get me the last entered record, using the date the query is run as its reference point (DateReturnEntered, of type Date, contains the date a record was entered).
Of course there may be an even easier way to guarantee that one has the last record in which case I'm open to suggestions.
Thanks

I think you should store ReturnPeriod as a datetime: for example, not '06 2013' as a VARCHAR but 01.06.2013 as a DATETIME (the first day of 06.2013).
In this case, if I've got your question right, you can use GETDATE() to get the current time:
SELECT TOP 1 MONTH(ReturnPeriod)
FROM Returns
WHERE DateReturnEntered<=GETDATE()
ORDER BY DateReturnEntered DESC
If you store ReturnPeriod as a varchar then
SELECT TOP 1 LEFT(ReturnPeriod,2)
FROM Returns
WHERE DateReturnEntered<=GETDATE()
ORDER BY DateReturnEntered DESC
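For anyone who wants to try the "newest row, first two characters" idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, the Id column name and the helper latest_return_month are made up for illustration, and SQLite spells things slightly differently (LIMIT 1 instead of TOP 1, substr instead of LEFT, date('now') instead of GETDATE()).

```python
import sqlite3


def latest_return_month(conn):
    """Return the first two characters of ReturnPeriod from the newest row."""
    row = conn.execute(
        """
        SELECT substr(ReturnPeriod, 1, 2)
        FROM Returns
        WHERE DateReturnEntered <= date('now')
        ORDER BY DateReturnEntered DESC, Id DESC  -- Id breaks ties on the same day
        LIMIT 1
        """
    ).fetchone()
    return row[0] if row else None


# Hypothetical sample data, stored as ISO dates so text ordering is date ordering.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Returns ("
    "Id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "ReturnPeriod TEXT, DateReturnEntered TEXT)"
)
conn.executemany(
    "INSERT INTO Returns (ReturnPeriod, DateReturnEntered) VALUES (?, ?)",
    [("03 2013", "2013-04-02"), ("06 2013", "2013-07-01"), ("09 2013", "2013-10-03")],
)

print(latest_return_month(conn))  # prints 09
```

Since the table has an auto-increment key, ordering by that key alone would probably work just as well as ordering by the date column.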

I would store your ReturnPeriod as a date datatype, using a nominal 1st of the month, e.g. 1 Jun 2013, if you don't have the actual date.
This will allow direct comparison against your entered date, with trivial formatting of the return value if required.
Your query would then find the latest date prior to your date entered.
SELECT MONTH(MAX(ReturnPeriod)) AS ReturnMonth
FROM Returns
WHERE ReturnPeriod <= @DateReturnEntered

Related

Copying only certain values from one row into another table

I am trying to copy data from one table to another table, which works fine, but I only want to copy certain data from one of the columns.
Insert Into Period (Invoice_No, Period_Date)
Select Invoice_Seq_No, Inv_Comment
From Invoices
Where INV_Comment LIKE '%November 2015';
The Inv_Comment column contains free-form comments and the date in different formats, e.g. "paid on November 2015" or "paid on Aug" or "July 2015". What I am trying to do is to copy only the "November 2015" part of the comment into the new table.
The above code copies the entire contents of the Inv_Comment field, and I only want to copy the date. The date part can be in one of three formats: MON YYYY, DD.MM.YYYY, or only the month, i.e. MON.
How can I extract only the date part I am interested in?
For your very simple example query you can use the substr() function, using the length of your fixed value to count back from the end of the string, as the documentation describes:
If position is negative, then Oracle counts backward from the end of char.
So you can do:
select invoice_seq_no, substr(inv_comment, -length('November 2015'))
from invoices
where inv_comment like '%November 2015';
But it's clear from the comments that you really want to find all dates, in various formats, and not always at the end of the free-form text. One option is to search the text repeatedly for all the possible formats and values, starting with the most specific (e.g. DD.MM.YYYY) and then going down to least specific (e.g. just MON). You could insert just the sequence numbers into your table to start with, and then repeatedly update the rows that do not yet have values set:
insert into period (invoice_no) select invoice_seq_no from invoices;
update period p
set period_date = (
select case when instr(i.inv_comment, '15.09.2015') > 0 then
substr(i.inv_comment, instr(i.inv_comment, '15.09.2015'), length('15.09.2015'))
end
from invoices i
where i.invoice_seq_no = p.invoice_no
)
where period_date is null;
then repeat the update with another date, or a more generic November 2015 pattern, etc. But specifying every possible date isn't going to be feasible, so you could use regular expressions. There are probably better patterns for this, but as an example:
update period p
set period_date = (
select regexp_substr(i.inv_comment, '[0-3][0-9][-./][0-1][0-9][-./][12]?[901]?[0-9]{2}')
from invoices i
where i.invoice_seq_no = p.invoice_no
)
where period_date is null;
which matches (or attempts to match) anything looking like DD.MM.YYYY, followed by maybe:
update period p
set period_date = (
select regexp_substr(i.inv_comment,
'(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|'
|| 'Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)([[:space:]]+[12]?[901]?[0-9]{2})?')
from invoices i
where i.invoice_seq_no = p.invoice_no
)
where period_date is null;
which matches any short or long month name. You may have mixed case though - aug, Aug, AUG - so you might want to use the match parameter to make it case-insensitive. This isn't supposed to be a complete solution though, and you may need further formats. There are some ideas on other questions.
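To make the "most specific pattern first" idea easier to experiment with, here is a rough sketch of the same approach in Python's re module rather than Oracle's regexp_substr. The patterns are loosely adapted from the ones above (not identical), the function name extract_period is made up, and case-insensitivity is switched on via re.IGNORECASE. Treat it as a starting point only; real comments will need more formats.

```python
import re

# DD.MM.YYYY or DD.MM.YY, with '.', '-' or '/' as the separator.
NUMERIC_DATE = re.compile(r"[0-3][0-9][-./][01][0-9][-./](?:[12][0-9])?[0-9]{2}")

# Short or long month name, optionally followed by a four-digit year.
MONTH_YEAR = re.compile(
    r"\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|"
    r"Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|"
    r"Dec(?:ember)?)\b(?:\s+[12][0-9]{3})?",
    re.IGNORECASE,
)


def extract_period(comment):
    """Return the first date-like fragment, trying the most specific pattern first."""
    for pattern in (NUMERIC_DATE, MONTH_YEAR):
        match = pattern.search(comment)
        if match:
            return match.group(0)
    return None  # left blank for a human to review


for text in ("paid on November 2015", "paid on Aug", "settled 15.09.2015 by cheque"):
    print(text, "->", extract_period(text))
```

The word boundaries around the month names are there so that something like "Mayor" is not picked up as "May".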
You may really want actual dates, which means breaking down a bit more, and then assuming missing years - perhaps taking the year from another column (order date?) if it isn't available in the comments, though that gets a bit messy around year-end. But you can essentially do the same thing, just passing each extracted value through to_date() with a format mask matching the search expression you're using.
There will always be mistakes, typos, odd formatting etc., so even if this approach identified most patterns, you'll probably end up with some that are left blank, and will need to be set manually by a human looking at the comments; and some that are just wrong. But this is why dates shouldn't be stored as strings at all - having them mixed in with other text is just making things even worse.
Here you're dealing with strings containing disparate date information. Several string operations may be needed.

Default value for datetime

How can you search for dates (datetimes) that contain a default value i.e. ''. I guess it is not:
select * from table where dateofbirth=''
All the dates seem to have a default value of '1900-01-01'. However, there are people in my database who have a date of birth on or before this date (historic people mainly). Therefore I cannot do:
select * from table where dateofbirth='1900-01-01'
I know that some versions of SQL Server have a default date of: 1899-12-31.
I guess it is better to use nulls for unknown dates. I cannot do that in this case.
I have read through lots of questions on here about finding dates using SQL but I have not found an answer to my specific question.
You can get the default DateTime value as:
SELECT CONVERT(DATETIME, 0)
And apply it to the filter as appropriate:
SELECT * FROM [Table] WHERE DateOfBirth = CONVERT(DATETIME, 0)
Or, if you need to select earlier dates as well:
SELECT * FROM [Table] WHERE DateOfBirth <= CONVERT(DATETIME, 0)
The best you are going to get is what you listed:
select * from table where dateofbirth='1900-01-01'
As you know, the problem is that if someone was really born on 1/1/1900, you will also include them. But there's really no way for your query to know the difference.
To fix this, you would need to change what your system is using for the default value (e.g. NULL or change to datetime2 or date datatype and use 1/1/0001). Then update all your 1/1/1900 values to the new default value. Yes, this will erroneously update any existing people with 1/1/1900 birthdays, but at least it will prevent any future occurrences.
In SQL Server terms, a default value for column X is only used when a new record is first created and a value is not provided for that column. After the initial creation of the record the value is just a value, same as any other. Within a single table there is no way to distinguish between records that hold the default value in column X because it was supplied and records that hold it because it was defaulted.
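A small sketch of that point, using Python's sqlite3 module as a stand-in for SQL Server. The People table, its rows and the explicit DEFAULT clause are assumptions made for the demo: one row gets the default because no date was supplied, the other is given the same date on purpose, and afterwards the filter cannot tell them apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE People ("
    "Name TEXT, DateOfBirth TEXT NOT NULL DEFAULT '1900-01-01')"
)

# No date supplied: the column default is applied.
conn.execute("INSERT INTO People (Name) VALUES ('unknown birthday')")
# Date supplied explicitly, and it happens to equal the default.
conn.execute(
    "INSERT INTO People (Name, DateOfBirth) VALUES ('born on the day', '1900-01-01')"
)

matches = conn.execute(
    "SELECT Name FROM People WHERE DateOfBirth = '1900-01-01' ORDER BY Name"
).fetchall()
print(matches)  # both rows come back; nothing records which one was defaulted
```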
This won't help you now, but an alternative to nulls that is sometimes used is a 'magic value'. In the case of dates of birth, the maximum datetime value of 31st December 9999 could be used to indicate an unknown value (assuming your system isn't expected to be in use in 8,000 years' time :). Some people (including me) don't really approve of magic values because there's no way in the database of indicating their magic status.
Rhys

Date/Time data types and declaration in SQL Server

I would like to have a date and time column in my table. The main purpose of having these 2 columns is to be able to return query results like:
Number of treatments done in the period November 2011.
Number of people working in shifts between 00:01 and 08:00 hours.
I have two tables, which have the following attributes in them(among others):
Shift(day, month, year)
Treatment(start_time, date)
For the first table- Shift, query results need to return values in
terms of (ex: December 30,2012)
For the second table, start_time needs to have values like 0001 and
0800(as I mentioned above). While, date can return values like
'November 2011'.
Initially I thought using the date datatype for declaring each of the day/month/year/date variables would do the job. But this doesn't seem to work out. Should I use int, varchar and int respectively for day, month and year respectively? Also, since the date variable does not have component parts, will date datatype work here? Lastly, if I use timestamp data type for the start_time attribute, what should be the value I enter in the insert column- should it be 08:00:00?
I'm using SQL Server 2014.
Thank You for your help.
AFAIK it is better to use a single column of type DateTime instead of two columns which hold the date and the time separately.
You can then query this column by either date or time, by casting it to the corresponding type:
DECLARE @ChangeDateTime AS DATETIME = '2012-12-09 16:07:43.937'
SELECT CAST(@ChangeDateTime AS DATE) AS [ChangeDate],
       CAST(@ChangeDateTime AS TIME) AS [ChangeTime]
which results in:
ChangeDate ChangeTime
---------- ----------------
2012-12-09 16:07:43.9370000
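As a rough illustration of how both kinds of question from the post (treatments in November 2011, things happening between 00:01 and 08:00) can be answered from one datetime column, here is a sketch using Python's sqlite3 module. The started_at column and the sample rows are invented, and SQLite's time() function plays the role of CAST(... AS TIME) here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Treatment (id INTEGER PRIMARY KEY, started_at TEXT)")
conn.executemany(
    "INSERT INTO Treatment (started_at) VALUES (?)",
    [
        ("2011-11-03 07:30:00",),
        ("2011-11-20 14:15:00",),
        ("2011-12-01 00:30:00",),
    ],
)

# Treatments in November 2011: a half-open range on the single column.
in_november = conn.execute(
    "SELECT COUNT(*) FROM Treatment "
    "WHERE started_at >= '2011-11-01' AND started_at < '2011-12-01'"
).fetchone()[0]

# Treatments starting between 00:01 and 08:00, on any day.
early_starts = conn.execute(
    "SELECT COUNT(*) FROM Treatment "
    "WHERE time(started_at) BETWEEN '00:01:00' AND '08:00:00'"
).fetchone()[0]

print(in_november, early_starts)  # prints 2 2
```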

How to extract dates with datatype DATETIME from column A in table X and put them into table Y while changing the datatype to DATE

Long title, easy meaning:
How is it possible to extract from a date like "2014-04-04 10:47:30.000", which is stored in one column, its components like year, month and day?
I'm not interested in the time.
For example, I have a table called "Incidents". Inside the table we have a column called "IncidentID" and a column called "ReportingDate", in which dates like the one above are stored. Let's say we have about 50k incidents, and therefore also 50k dates.
A year has 365 days. I want to query for the count of the incidents which were reported on different dates, for instance on the 5th of October 2013.
So: how can I get the components of the date and put them into another table, with separate columns for the components, and how can I query for the incidents as well?
I guess at first I have to change the datatype of the date from DATETIME to DATE, but I'm not quite sure how to go further. Could anyone help me with some code and an explanation of what it does, for a SQL noob? :-)
To achieve this
I want to query for the count of the Incidents, which were reported on
different dates - for instance on the 5th of October 2013.
you don't have to do this:
I guess at first I have to change the datatype of the date from
DATETIME to DATE, but I'm not quite sure how to go further.
Just query
SELECT
IncidentID
FROM incidents
WHERE ReportingDate >= '20131005'
AND ReportingDate < '20131006'
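Since the question asked for a count rather than a list of IDs, here is a small sketch of the same half-open range wrapped in COUNT(*), plus a per-day breakdown, using Python's sqlite3 module as a stand-in for SQL Server. The sample rows are invented, and date() here is SQLite's function (in SQL Server the equivalent would be a cast to DATE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Incidents (IncidentID INTEGER PRIMARY KEY, ReportingDate TEXT)"
)
conn.executemany(
    "INSERT INTO Incidents (ReportingDate) VALUES (?)",
    [
        ("2013-10-04 23:59:59.000",),
        ("2013-10-05 00:00:00.000",),
        ("2013-10-05 10:47:30.000",),
        ("2013-10-06 00:00:00.000",),
    ],
)

# Incidents reported on 5 October 2013: from midnight up to, but not including, the next midnight.
on_the_fifth = conn.execute(
    "SELECT COUNT(*) FROM Incidents "
    "WHERE ReportingDate >= '2013-10-05' AND ReportingDate < '2013-10-06'"
).fetchone()[0]

# Count per calendar day, ignoring the time part.
per_day = conn.execute(
    "SELECT date(ReportingDate) AS day, COUNT(*) "
    "FROM Incidents GROUP BY day ORDER BY day"
).fetchall()

print(on_the_fifth)  # prints 2
print(per_day)
```

No second table with separate year/month/day columns is needed for this; the components can be derived at query time.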

How to design SQL tables when column data arrives in multiple types/margins of error?

I've been given a stack of data where a particular value has been collected sometimes as a date (YYYY-MM-DD) and sometimes as just a year.
Depending on how you look at it, this is either a variance in type or margin of error.
This is a subprime situation, but I can't afford to recover or discard any data.
What's the optimal (eg. least worst :) ) SQL table design that will accept either form while avoiding monstrous queries and allowing maximum use of database features like constraints and keys*?
*i.e. Entity-Attribute-Value is out.
You could store the year, month and day components in separate columns. That way, you only need to populate the columns for which you have data.
If it comes in as just a year, make it default to 01 for month and day, i.e. YYYY-01-01.
This way you can still use a date/datetime datatype and don't have to worry about invalid dates.
Either bring it in as a string unmolested, and modify it so it's consistent in another step, or modify the year-only values during the import like SQLMenace recommends.
I'd store the value in a DATETIME type and another value (just an integer will do, or some kind of enumerated type) that signifies its precision.
It would be easier to give more information if you mentioned what kind of queries you will be doing on the data.
Either fix it, then store it (OK, not an option),
or store it broken, with a computed column that holds the fixed value.
Something like this:
CREATE TABLE ...
...
Broken varchar(20),
Fixed AS CAST(CASE WHEN Broken LIKE '[12][0-9][0-9][0-9]' THEN Broken + '0101' ELSE Broken END AS datetime)
This also allows you to detect good from bad source data
If you don't always have a full date, what sort of keys and constraints would you need? Perhaps store two columns of data; a full date, and a year. For data that has only year, the year is stored and date is null. For items with full info, both are populated.
I'd put three columns in the table:
The provided value (YYYY-MM-DD or YYYY)
A date column, Date or DateTime data type, which is nullable
A year column, as an integer or char(4) depending upon your needs.
I'd always populate the year column, populate the date column only when the provided value is a date.
And, because you've kept the provided value, you can always re-process down the road if needs change.
An alternative solution would be a date mask (like an IP netmask). Store the date in a regular datetime field, and add an extra field of type smallint or similar, where you indicate which parts are present (you could even go binary here):
If you have YYYY-MM-DD, you would have 3 bits of data, which will have the values 1 if data is present and 0 if not.
Example:
Date Mask
2009-12-05 7 (111)
2009-12-01 6 (110, only year and month are known, and day is set to default 1)
2009-01-20 5 (101, for some strange reason, only the year and the day are known. January has 31 days, so it will never generate an error)
Which solution is better depends on what you will do with it.
This is better when you want to select those with full dates, which are between a certain period (less to write). Also this way it's easier to compare any dates which have masks like 7,6,4. It may also take up less memory (date + smallint may be smaller than int+int+int, and only if datetime uses 64 bit, and smallint uses up as much as int, it will be the same).
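A minimal sketch of how the date-plus-mask pair might be built and read back, written in Python for brevity. The bit values follow the example above (4 = year, 2 = month, 1 = day); the function names encode and known_parts are made up for illustration.

```python
from datetime import date

YEAR, MONTH, DAY = 4, 2, 1  # bit flags, matching the 111 / 110 / 101 example


def encode(year, month=None, day=None):
    """Return (stored_date, mask); missing parts default to 1 in the stored date."""
    mask = YEAR
    if month is not None:
        mask |= MONTH
    if day is not None:
        mask |= DAY
    return date(year, month or 1, day or 1), mask


def known_parts(mask):
    """List which components of the stored date are real data."""
    names = ((YEAR, "year"), (MONTH, "month"), (DAY, "day"))
    return [name for bit, name in names if mask & bit]


print(encode(2009, 12, 5))     # full date, mask 7
print(encode(2009, 12))        # day defaulted to 1, mask 6
print(encode(2009, day=20))    # month defaulted to January, mask 5
print(known_parts(5))          # ['year', 'day']
```

Selecting only fully known dates then becomes a filter on mask = 7, which is the "less to write" case mentioned above.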
I was going to suggest the same solution as @ninesided did above. Additionally, you could have a date field and a field that quantitatively represents your uncertainty. This offers the advantage of being able to represent things like "on or about Sept 23, 2010". The problem is that to represent the case where you only know the year, you'd have to set your date to be the middle of the year, with 182.5 days' uncertainty (assuming non-leap year), which seems ugly.
You could use a similar but distinct approach with a mask that represents what date parts you're confident about - that's what SQLMenace offered in his answer above.
+1 each to recommendations from ninesided, Nikki9696 and Jeff Siver - I support all those answers though none was exactly what I decided upon.
My solution:
a date column used only for complete dates
an int column used for years
a constraint to ensure integrity between the two
a trigger to populate the year if only date is supplied
Advantages:
can run simple (one-column) queries on the date column with missing data ignored (by using NULL for what it was designed for)
can run simple (one-column) queries on the year column for any row with a date (because year is automatically populated)
insert either year or date or both (provided they agree)
no fear of disagreement between columns
self explanatory, intuitive
I would argue that methods using YYYY-01-01 to signify missing data (when flagged as such with a second explanatory column) fail seriously on points 1 and 5.
Example code for Sqlite 3:
create table events
(
    rowid integer primary key,
    event_year integer,
    event_date date,
    -- passes when either side is NULL, so year-only and date-only rows are accepted
    check (event_year = cast(strftime('%Y', event_date) as integer))
);
create trigger year_trigger after insert on events
begin
    -- fill in the year from the date when only the date was supplied
    update events set event_year = cast(strftime('%Y', event_date) as integer)
    where rowid = new.rowid and event_date is not null;
end;
-- various methods to insert
insert into events (event_year, event_date) values (2008, '2008-02-23');
insert into events (event_year) values (2009);
insert into events (event_date) values ('2010-01-19');
-- select events in January without expressions on supplementary columns
select rowid, event_date from events where strftime('%m', event_date) = '01';
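For anyone wanting to check the "no fear of disagreement between columns" claim, here is a short sketch that loads the same schema through Python's sqlite3 module (with the string literals in single quotes) and tries one consistent and one inconsistent insert. The variable names are mine, and the behaviour shown is what I'd expect from SQLite's CHECK handling rather than something stated in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table events
    (
        rowid integer primary key,
        event_year integer,
        event_date date,
        check (event_year = cast(strftime('%Y', event_date) as integer))
    );
    create trigger year_trigger after insert on events
    begin
        update events set event_year = cast(strftime('%Y', event_date) as integer)
        where rowid = new.rowid and event_date is not null;
    end;
    """
)

# Date only: the trigger should fill in the year afterwards.
conn.execute("insert into events (event_date) values ('2010-01-19')")
filled_year = conn.execute("select event_year from events").fetchone()[0]

# Year and date that disagree: the check constraint should refuse the row.
try:
    conn.execute(
        "insert into events (event_year, event_date) values (2008, '2009-02-23')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(filled_year, rejected)  # prints 2010 True
```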