Let's assume that we have the following input parameters:
date [Date]
period [Integer]
The task is to build a table with two columns: date and dayname.
So, if we have date = 2018-07-12 and period = 3 the table should look like this:
date |dayname
-------------------
2018-07-12|THURSDAY
2018-07-13|FRIDAY
2018-07-14|SATURDAY
My solution is the following:
select add_days(date, -1) into previousDay from "DUMMY";
for i in 1..:period do
select add_days(previousDay, i) into nextDay from "DUMMY";
:result.insert((nextDay, dayname(nextDay)));
end for;
But I don't like the loop. I suspect it could become a performance problem if I need to put more complicated values into the result table.
What would be a better way to achieve this?
Running through a loop and inserting values one by one is most certainly the slowest possible option to accomplish the task.
Instead, you could use SAP HANA's time series feature.
With a statement like
SELECT to_date(GENERATED_PERIOD_START)
FROM SERIES_GENERATE_TIMESTAMP('INTERVAL 1 DAY', '01.01.0001', '31.12.9999')
you could generate a bounded range of valid dates with a given interval length.
In my tests, this approach brought the time to insert a set of dates down from about 9 minutes to 7 seconds.
I've written about that some time ago here and here if you want some more examples for that.
In those examples, I even included the use of series tables that allow for efficient compression of timestamp column values.
The Series Data functions include SERIES_GENERATE_DATE, which returns a set of values in the date data type, so you don't have to convert the returned data into the desired date format.
Here is some sample code:
declare d int := 5;
declare dstart date := '01.01.2018';
SELECT generated_period_start FROM SERIES_GENERATE_DATE('INTERVAL 1 DAY', :dstart, add_days(:dstart, :d));
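For anyone without a HANA system at hand, the same set-based idea can be tried out elsewhere. The following is only a sketch, assuming Python's built-in sqlite3 module as a stand-in: a recursive CTE plays the role of SERIES_GENERATE_DATE, and the names date_table and DAY_NAMES are made up for this illustration.

```python
import sqlite3

# strftime('%w') in SQLite returns 0 for Sunday through 6 for Saturday.
DAY_NAMES = ["SUNDAY", "MONDAY", "TUESDAY", "WEDNESDAY",
             "THURSDAY", "FRIDAY", "SATURDAY"]


def date_table(start, period):
    """Return `period` consecutive (date, dayname) rows beginning at `start`."""
    conn = sqlite3.connect(":memory:")
    rows = conn.execute(
        """
        WITH RECURSIVE offsets(n) AS (
            SELECT 0 WHERE :period > 0
            UNION ALL
            SELECT n + 1 FROM offsets WHERE n + 1 < :period
        )
        SELECT date(:start, '+' || n || ' days') AS d,
               CAST(strftime('%w', date(:start, '+' || n || ' days')) AS INTEGER) AS dow
        FROM offsets
        ORDER BY n
        """,
        {"start": start, "period": period},
    ).fetchall()
    conn.close()
    # Map the weekday number to its name on the client side.
    return [(d, DAY_NAMES[dow]) for d, dow in rows]


for row in date_table("2018-07-12", 3):
    print(row)
```

All rows come back from a single statement, with no row-by-row insert in a loop.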
I need to store a value associated with a month and a user in a table, and I want to run queries on it. I don't know if there is a column data type for this kind of need. If not, should I:
Create a string field and build year-month concatenation (2017-01)
Create an int field and build year-month concatenation (201701)
Create two columns (one year and one month)
Create a date column at the beginning of the month (2017-01-01 00:00:00)
Something else?
The objective is to run queries like (pseudo-SQL):
SELECT val FROM t WHERE year_month = THIS_YEAR_MONTH and user_id='adc1-23...';
I would suggest not thinking too hard about the problem and just using the first date/time of the month. Postgres has plenty of date-specific functions -- from date_trunc() to age() to + interval -- to support dates.
You can readily convert them to the format you want, get the difference between two values, and so on.
If you phrase your query as:
where year_month = date_trunc('month', now()) and user_id = 'adc1-23...'
Then it can readily take advantage of an index on (user_id, year_month) or (year_month, user_id).
If you want to display values in YYYY-MM format, you can use to_char(your_datetime_column,'YYYY-MM')
example:
SELECT to_char(now(),'YYYY-MM') as year_month
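The "first day of the month" design can be tried out quickly. This is just a sketch, assuming Python's built-in sqlite3 module in place of Postgres: date(x, 'start of month') stands in for date_trunc('month', x), and the table contents, the user id 'user-1' and the helper value_for_month are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# year_month always holds the first day of the month as an ISO date.
# The primary key doubles as the (user_id, year_month) index.
conn.execute(
    """
    CREATE TABLE t (
        user_id    TEXT NOT NULL,
        year_month TEXT NOT NULL,
        val        INTEGER,
        PRIMARY KEY (user_id, year_month)
    )
    """
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("user-1", "2017-01-01", 10), ("user-1", "2017-02-01", 20)],
)


def value_for_month(conn, user_id, any_day):
    """Look up the value for the month that contains `any_day`."""
    row = conn.execute(
        "SELECT val FROM t "
        "WHERE user_id = ? AND year_month = date(?, 'start of month')",
        (user_id, any_day),
    ).fetchone()
    return row[0] if row else None


print(value_for_month(conn, "user-1", "2017-01-17"))
# Display form, analogous to to_char(..., 'YYYY-MM'):
print(conn.execute("SELECT strftime('%Y-%m', year_month) FROM t ORDER BY year_month").fetchall())
```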
Here trip 1 involves two activity codes on a single day and also concludes within that day. Most other activities are single-day as well, but I have one trip that spans more than one day.
What is the best way to store a date range in that column for trips that span more than one day?
Splitting the column into separate begin date and end date columns doesn't seem to make sense, as there would be many blank values?
trip_id(pk,fk)  Activity_code(pk,fk)  date
1               a1                    1st October 2015
1               a2                    1st October 2015
2               a3                    2nd - 5th October 2015
Keep in mind that I need to search the activity codes by month, e.g. list all the activity codes that occur in October.
Is it possible to store a date range in a single column, or is there another design solution?
Is there any data type that can represent a date range as a single value?
PS: oracle 11g e
Store the date ranges as FirstDate/LastDate or FirstDate/Duration.
This allows you to store the values in the native format for dates. Storing dates as strings is a bad, bad idea, because strings don't have all the built-in functionality provided for native date types.
Don't worry about the additional storage for a second date or duration. In fact, the two columns together are probably smaller than storing the value as a string.
Splitting the date into start date and end date would be ideal. Storing dates as strings is not recommended. If you store your dates as strings then there is a possibility of malformed data being stored in the column since a VARCHAR2 column will allow any value. You will have to build strong validations in your script while inserting the data which is unnecessary.
Secondly, you will not be able to perform simple operations like calculating the duration/length of the trip easily if both the start_date and end_date are stored in the same column. If they are stored in different columns it would be as simple as
SELECT trip_id, activity_code, end_date - start_date FROM trips;
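The "which activities occur in October" search from the question then becomes a range-overlap test on the two columns. A rough sketch, assuming Python's built-in sqlite3 module rather than Oracle, with ISO date strings in place of a DATE type; the rows for a4 and a5 and the helper activities_in_month are hypothetical additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE trips (
        trip_id       INTEGER NOT NULL,
        activity_code TEXT    NOT NULL,
        start_date    TEXT    NOT NULL,
        end_date      TEXT    NOT NULL,
        PRIMARY KEY (trip_id, activity_code),
        CHECK (end_date >= start_date)
    )
    """
)
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?, ?)",
    [
        (1, "a1", "2015-10-01", "2015-10-01"),  # single-day: start = end
        (1, "a2", "2015-10-01", "2015-10-01"),
        (2, "a3", "2015-10-02", "2015-10-05"),
        (3, "a4", "2015-09-29", "2015-10-01"),  # hypothetical: spills into October
        (4, "a5", "2015-11-10", "2015-11-12"),  # hypothetical: November only
    ],
)


def activities_in_month(conn, month_start, next_month_start):
    """Codes whose date range overlaps [month_start, next_month_start)."""
    rows = conn.execute(
        "SELECT activity_code FROM trips "
        "WHERE start_date < ? AND end_date >= ? "
        "ORDER BY activity_code",
        (next_month_start, month_start),
    ).fetchall()
    return [code for (code,) in rows]


print(activities_in_month(conn, "2015-10-01", "2015-11-01"))

# Trip length in days, analogous to end_date - start_date in Oracle.
days_a3 = conn.execute(
    "SELECT CAST(julianday(end_date) - julianday(start_date) AS INTEGER) "
    "FROM trips WHERE activity_code = 'a3'"
).fetchone()[0]
print(days_a3)
```

Single-day activities simply store the same value in both columns, so nothing is left blank.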
I would like to have a date and time column in my table. The main purpose of having these 2 columns is to be able to return query results like:
Number of treatments done in the period November 2011.
Number of people working in shifts between 00:01 and 08:00 hours.
I have two tables, which have the following attributes in them(among others):
Shift(day, month, year)
Treatment(start_time, date)
For the first table, Shift, query results need to return values like 'December 30, 2012'.
For the second table, start_time needs to hold values like 0001 and 0800 (as mentioned above), while date can return values like 'November 2011'.
Initially I thought that declaring each of the day/month/year/date columns with the date data type would do the job, but this doesn't seem to work. Should I use int, varchar and int for day, month and year respectively? Also, since the date column does not have component parts, will the date data type work here? Lastly, if I use the timestamp data type for the start_time attribute, what value should I enter when inserting: should it be 08:00:00?
I'm using SQL Server 2014.
Thank you for your help.
AFAIK it is better to use a single column of type DateTime rather than two columns that hold the date and the time separately.
You can then query this column by either date or time by casting it to the corresponding type:
DECLARE @ChangeDateTime AS DATETIME = '2012-12-09 16:07:43.937'
SELECT CAST(@ChangeDateTime AS DATE) AS [ChangeDate],
CAST(@ChangeDateTime AS TIME) AS [ChangeTime]
which results in:
ChangeDate ChangeTime
---------- ----------------
2012-12-09 16:07:43.9370000
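To see how the two queries from the question look against a single datetime column, here is a small sketch. It assumes Python's built-in sqlite3 module rather than SQL Server, so time(...) stands in for CAST(... AS TIME); the table name treatment, the column started_at and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE treatment (started_at TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO treatment VALUES (?)",
    [
        ("2011-10-31 23:59:00",),
        ("2011-11-03 07:30:00",),
        ("2011-11-15 14:00:00",),
        ("2011-11-20 22:10:00",),
        ("2011-12-01 00:30:00",),
    ],
)

# Treatments in November 2011: a half-open range on the one column.
in_november = conn.execute(
    "SELECT COUNT(*) FROM treatment "
    "WHERE started_at >= '2011-11-01' AND started_at < '2011-12-01'"
).fetchone()[0]

# Rows starting between 00:01 and 08:00, on any day: compare the time part only.
early = conn.execute(
    "SELECT COUNT(*) FROM treatment "
    "WHERE time(started_at) BETWEEN '00:01:00' AND '08:00:00'"
).fetchone()[0]

print(in_november, early)
```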
I've been given a stack of data where a particular value has been collected sometimes as a date (YYYY-MM-DD) and sometimes as just a year.
Depending on how you look at it, this is either a variance in type or margin of error.
This is a subprime situation, but I can't afford to recover or discard any data.
What's the optimal (eg. least worst :) ) SQL table design that will accept either form while avoiding monstrous queries and allowing maximum use of database features like constraints and keys*?
*i.e. Entity-Attribute-Value is out.
You could store the year, month and day components in separate columns. That way, you only need to populate the columns for which you have data.
If it comes in as just a year, default the month and day to 01, i.e. YYYY-01-01.
This way you can still use a date/datetime data type and don't have to worry about invalid dates.
Either bring it in as a string unmolested, and modify it so it's consistent in another step, or modify the year-only values during the import like SQLMenace recommends.
I'd store the value in a DATETIME type and another value (just an integer will do, or some kind of enumerated type) that signifies its precision.
It would be easier to give more information if you mentioned what kind of queries you will be doing on the data.
Either fix it, then store it (OK, not an option),
or store it broken, alongside a computed column that fixes it.
Something like this:
CREATE TABLE ...
...
Broken varchar(20),
Fixed AS CAST(CASE WHEN Broken LIKE '[12][0-9][0-9][0-9]' THEN Broken + '0101' ELSE Broken END AS datetime)
This also lets you tell good source data from bad.
If you don't always have a full date, what sort of keys and constraints would you need? Perhaps store two columns of data; a full date, and a year. For data that has only year, the year is stored and date is null. For items with full info, both are populated.
I'd put three columns in the table:
The provided value (YYYY-MM-DD or YYYY)
A date column, Date or DateTime data type, which is nullable
A year column, as an integer or char(4) depending upon your needs.
I'd always populate the year column, populate the date column only when the provided value is a date.
And, because you've kept the provided value, you can always re-process down the road if needs change.
An alternative solution would be a date mask (like a netmask for IP addresses). Store the date in a regular datetime field, and add a second field of type smallint or similar that indicates which parts are present (it could even be binary):
For YYYY-MM-DD you need 3 bits, one per component, each set to 1 if that component is present and 0 if not.
Example:
Date        Mask
2009-12-05  7 (111)
2009-12-01  6 (110, only year and month are known, and day is set to the default 1)
2009-01-20  5 (101, for some strange reason, only the year and the day are known. January has 31 days, so it will never generate an error)
Which solution is better depends on what you will do with it.
The mask approach is better when you want to select the rows with full dates that fall within a certain period (less to write). It also makes it easier to compare dates that have masks like 7, 6 or 4. It may also take up less storage: date + smallint may be smaller than int + int + int, and the two come out the same only if datetime uses 64 bits and smallint takes up as much space as int.
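A minimal sketch of the mask idea in Python, assuming the input arrives as 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' text. The constant names and the helper parse_partial are invented for this example, and the odd "year and day only" case (mask 5) cannot be expressed in that input format, so it is not covered.

```python
from datetime import date

# One bit per date component, matching the 3-bit masks in the table above.
YEAR, MONTH, DAY = 4, 2, 1
FULL_DATE = YEAR | MONTH | DAY  # 7


def parse_partial(text):
    """Return (date, mask); missing components default to 1."""
    parts = [int(p) for p in text.split("-")]
    month = day = 1
    mask = YEAR
    if len(parts) >= 2:
        month = parts[1]
        mask |= MONTH
    if len(parts) == 3:
        day = parts[2]
        mask |= DAY
    return date(parts[0], month, day), mask


for raw in ("2009-12-05", "2009-12", "2009"):
    value, mask = parse_partial(raw)
    print(raw, value.isoformat(), mask, "full" if mask == FULL_DATE else "partial")
```

Selecting only complete dates is then a plain mask = 7 filter next to the usual date range condition.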
I was going to suggest the same solution as @ninesided did above. Additionally, you could have a date field and a field that quantitatively represents your uncertainty. This offers the advantage of being able to represent things like "on or about Sept 23, 2010". The problem is that to represent the case where you only know the year, you'd have to set your date to be the middle of the year, with 182.5 days' uncertainty (assuming non-leap year), which seems ugly.
You could use a similar but distinct approach with a mask that represents what date parts you're confident about - that's what SQLMenace offered in his answer above.
+1 each to recommendations from ninesided, Nikki9696 and Jeff Siver - I support all those answers though none was exactly what I decided upon.
My solution:
a date column used only for complete dates
an int column used for years
a constraint to ensure integrity between the two
a trigger to populate the year if only date is supplied
Advantages:
can run simple (one-column) queries on the date column with missing data ignored (by using NULL for what it was designed for)
can run simple (one-column) queries on the year column for any row with a date (because year is automatically populated)
insert either year or date or both (provided they agree)
no fear of disagreement between columns
self explanatory, intuitive
I would argue that methods using YYYY-01-01 to signify missing data (when flagged as such with a second explanatory column) fail seriously on points 1 and 5.
Example code for Sqlite 3:
create table events
(
rowid integer primary key,
event_year integer,
event_date date,
-- passes when event_date is null, because the comparison then yields null
check (event_year = cast(strftime('%Y', event_date) as integer))
);
-- fill in the year when only a date was supplied
create trigger year_trigger after insert on events
begin
update events set event_year = cast(strftime('%Y', event_date) as integer)
where rowid = new.rowid and event_date is not null;
end;
-- various methods to insert
insert into events (event_year, event_date) values (2008, '2008-02-23');
insert into events (event_year) values (2009);
insert into events (event_date) values ('2010-01-19');
-- select events in January without expressions on supplementary columns
select rowid, event_date from events where strftime('%m', event_date) = '01';
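The behaviour claimed above can be checked with a few lines of Python and its built-in sqlite3 module. This is only a sketch: it re-creates the same schema in an in-memory database (with single-quoted string literals), and the variable names are made up for the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table events
    (
        rowid integer primary key,
        event_year integer,
        event_date date,
        check (event_year = cast(strftime('%Y', event_date) as integer))
    );
    create trigger year_trigger after insert on events
    begin
        update events set event_year = cast(strftime('%Y', event_date) as integer)
        where rowid = new.rowid and event_date is not null;
    end;
    """
)

# Date only: the trigger fills in the year.
conn.execute("insert into events (event_date) values ('2010-01-19')")
filled_year = conn.execute(
    "select event_year from events where event_date = '2010-01-19'"
).fetchone()[0]

# Year only: the date stays NULL.
conn.execute("insert into events (event_year) values (2009)")
null_dates = conn.execute(
    "select count(*) from events where event_date is null"
).fetchone()[0]

# Year and date that disagree: the check constraint rejects the row.
try:
    conn.execute("insert into events (event_year, event_date) values (2008, '2009-02-23')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(filled_year, null_dates, rejected)
```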