Given a data set in MS SQL Server 2012 where travelers take trips (with trip_ID as the unique ID) and each trip has a start_date and an end_date, I'm looking to find, for each traveler, the trip_IDs of trips that overlap, along with the range of that overlap. So if the initial table looks like this:
| trip_ID | traveler | start_date | end_date | trip_length |
|---------|----------|------------|------------|-------------|
| AB24 | Alpha | 2017-01-29 | 2017-01-31 | 2|
| BA02 | Alpha | 2017-01-31 | 2017-02-10 | 10|
| CB82 | Charlie | 2017-02-20 | 2017-02-23 | 3|
| CA29 | Bravo | 2017-02-26 | 2017-02-28 | 2|
| AB14 | Charlie | 2017-03-06 | 2017-03-08 | 2|
| DA45 | Bravo | 2017-03-26 | 2017-03-29 | 3|
| BA22 | Bravo | 2017-03-28 | 2017-04-03 | 6|
I'm looking for a query that will append three columns to the original table: overlap_id, overlap_start, and overlap_end. The idea is that each row will have a value (or NULL) for an overlapping trip, along with the start and end dates of the overlap itself. Like this:
| trip_ID | traveler | start_date | end_date   | trip_length | overlap_id | overlap_start | overlap_end |
|---------|----------|------------|------------|-------------|------------|---------------|-------------|
| AB24    | Alpha    | 2017-01-29 | 2017-01-31 | 2           | BA02       | 2017-01-31    | 2017-01-31  |
| BA02    | Alpha    | 2017-01-31 | 2017-02-10 | 10          | AB24       | 2017-01-31    | 2017-01-31  |
| CB82    | Charlie  | 2017-02-20 | 2017-02-23 | 3           | NULL       | NULL          | NULL        |
| CA29    | Bravo    | 2017-02-26 | 2017-02-28 | 2           | NULL       | NULL          | NULL        |
| AB14    | Charlie  | 2017-03-06 | 2017-03-08 | 2           | NULL       | NULL          | NULL        |
| DA45    | Bravo    | 2017-03-26 | 2017-03-29 | 3           | BA22       | 2017-03-28    | 2017-03-29  |
| BA22    | Bravo    | 2017-03-28 | 2017-04-03 | 6           | DA45       | 2017-03-28    | 2017-03-29  |
I've tried variations of the approach from "Overlapping Dates in SQL" to inform my approach, but it's not returning the right answers. I'm only looking for overlaps within the same traveler (i.e., within Alpha or within Bravo, not between Alpha and Bravo).
For the overlap_id column, I think the code would have to test whether start_date plus each offset in range(0, trip_length) falls within the start_date-to-end_date range of any other trip by the same traveler; if so, overlap_id is set to the ID of the matching trip. If this is the right concept, I'm not sure how to treat trip_length as a variable so I can test the whole range of offsets, i.e., run the test for trip_length - x until trip_length - x = 0.
--This might be the bare bones of an answer
UPDATE trips  -- placeholder table name
SET overlap_id = CASE
    WHEN DATEADD(day, trip_length, start_date) = (SELECT DATEADD(day, o.trip_length, o.start_date)
                                                  FROM trips AS o
                                                  WHERE o.traveler = trips.traveler) THEN ...
You can join the table with itself; the join condition below is the overlap test:
SELECT t.*,
       o.trip_ID AS overlap_id,
       o.start_date,
       o.end_date
FROM t
LEFT JOIN t AS o ON t.trip_ID <> o.trip_ID -- a trip always overlaps itself, so exclude it
    AND o.traveler = t.traveler            -- same traveler
    AND t.start_date <= o.end_date         -- overlap test
    AND t.end_date >= o.start_date
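To fill overlap_start and overlap_end as well, take the later of the two start dates and the earlier of the two end dates. SQL Server 2012 has no GREATEST/LEAST, but CASE expressions do the job. A minimal sketch, assuming the table is named trips:
SELECT t.trip_ID, t.traveler, t.start_date, t.end_date, t.trip_length,
       o.trip_ID AS overlap_id,
       CASE WHEN o.trip_ID IS NULL THEN NULL           -- no overlapping trip: keep NULL
            WHEN o.start_date > t.start_date THEN o.start_date
            ELSE t.start_date END AS overlap_start,    -- later of the two start dates
       CASE WHEN o.trip_ID IS NULL THEN NULL
            WHEN o.end_date < t.end_date THEN o.end_date
            ELSE t.end_date END AS overlap_end         -- earlier of the two end dates
FROM trips AS t
LEFT JOIN trips AS o
       ON o.trip_ID <> t.trip_ID
      AND o.traveler = t.traveler
      AND t.start_date <= o.end_date
      AND t.end_date >= o.start_date;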
I am currently studying SQL and I am still a newbie. I have a task where I need to split rows that contain arrays of dates and user IDs into one row per combination. I really need help.
+-------+------------------------------+---------------------------+
| TYPE  | DATES                        | USER_ID                   |
+-------+------------------------------+---------------------------+
| WORK | ["2022-06-02", "2022-06-03"] | {74042,88357,83902,88348} |
| LEAVE | ["2022-05-16", "2022-05-26"] | {83902,74042,88357,88348} |
+-------+------------------------------+---------------------------+
The end result should look like this; each user ID should be paired with each of its respective dates:
+-------+------------+---------+
| TYPE | DATES | USER_ID |
+-------+------------+---------+
| LEAVE | 05/16/2022 | 74042 |
| LEAVE | 05/16/2022 | 88357 |
| LEAVE | 05/16/2022 | 88348 |
| LEAVE | 05/16/2022 | 83902 |
| LEAVE | 05/26/2022 | 74042 |
| LEAVE | 05/26/2022 | 88357 |
| LEAVE | 05/26/2022 | 88348 |
| LEAVE | 05/26/2022 | 83902 |
| WORK  | 06/02/2022 | 74042   |
| WORK  | 06/02/2022 | 88357   |
| WORK  | 06/02/2022 | 88348   |
| WORK  | 06/02/2022 | 83902   |
| WORK  | 06/03/2022 | 74042   |
| WORK  | 06/03/2022 | 88357   |
| WORK  | 06/03/2022 | 88348   |
| WORK  | 06/03/2022 | 83902   |
+-------+------------+---------+
Create table:
CREATE TABLE work_leave (
TYPE varchar,
DATES date,
USER_ID integer
);
INSERT INTO work_leave
VALUES ('LEAVE', '05/16/2022', 74042),
('LEAVE', '05/16/2022', 88357),
('LEAVE', '05/16/2022', 88348),
('LEAVE', '05/16/2022', 83902),
('LEAVE', '05/26/2022', 74042),
('LEAVE', '05/26/2022', 88357),
('LEAVE', '05/26/2022', 88348),
('LEAVE', '05/26/2022', 83902),
('WORK', '06/2/2022', 74042),
('WORK', '06/2/2022', 88357),
('WORK', '06/2/2022', 88348),
('WORK', '06/2/2022', 83902),
('WORK', '06/3/2022', 74042),
('WORK', '06/3/2022', 88357),
('WORK', '06/3/2022', 88348),
('WORK', '06/3/2022', 83902);
WITH date_ends AS (
SELECT
type,
ARRAY[min(dates),
max(dates)] AS dates
FROM
work_leave
GROUP BY
type
),
users AS (
SELECT
type,
array_agg(DISTINCT user_id
ORDER BY user_id) AS user_ids
FROM
work_leave
GROUP BY
type
)
SELECT
de.type,
de.dates,
u.user_ids
FROM
date_ends AS de
JOIN
users as u
ON de.type = u.type;
type | dates | user_ids
-------+-------------------------+---------------------------
LEAVE | {05/16/2022,05/26/2022} | {74042,83902,88348,88357}
WORK | {06/02/2022,06/03/2022} | {74042,83902,88348,88357}
I adjusted the data slightly for simplicity. Here's one idea:
WITH rows (type, dates, user_id) AS (
VALUES ('WORK', array['2022-06-02', '2022-06-03'], array[74042,88357,83902,88348])
, ('LEAVE', array['2022-05-16', '2022-05-26'], array[83902,74042,88357,88348])
)
SELECT r1.type, x.*
FROM rows AS r1
CROSS JOIN LATERAL (
SELECT r2.dates, r3.user_id
FROM unnest(r1.dates) AS r2(dates)
, unnest(r1.user_id) AS r3(user_id)
) AS x
;
The result:
 type  |   dates    | user_id
-------+------------+---------
 WORK  | 2022-06-02 |   74042
 WORK  | 2022-06-02 |   88357
 WORK  | 2022-06-02 |   83902
 WORK  | 2022-06-02 |   88348
 WORK  | 2022-06-03 |   74042
 WORK  | 2022-06-03 |   88357
 WORK  | 2022-06-03 |   83902
 WORK  | 2022-06-03 |   88348
 LEAVE | 2022-05-16 |   83902
 LEAVE | 2022-05-16 |   74042
 LEAVE | 2022-05-16 |   88357
 LEAVE | 2022-05-16 |   88348
 LEAVE | 2022-05-26 |   83902
 LEAVE | 2022-05-26 |   74042
 LEAVE | 2022-05-26 |   88357
 LEAVE | 2022-05-26 |   88348
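If the values actually arrive raw as shown in the question (DATES looks like a JSON array, USER_ID like a Postgres array), the same unnesting works on them directly. A hedged sketch, assuming a hypothetical raw table raw_work_leave(type text, dates jsonb, user_id integer[]):
SELECT t.type,
       d.dates::date AS dates,   -- each JSON array element, cast to date
       u.user_id
FROM raw_work_leave AS t
CROSS JOIN LATERAL jsonb_array_elements_text(t.dates) AS d(dates)
CROSS JOIN LATERAL unnest(t.user_id) AS u(user_id);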
Say in MonetDB (specifically, the embedded version from the "MonetDBLite" R package) I have a table "events" containing entity ID codes and event start and end dates, of the format:
| id | start_date | end_date |
| 1 | 2010-01-01 | 2010-03-30 |
| 1 | 2010-04-01 | 2010-06-30 |
| 2 | 2018-04-01 | 2018-06-30 |
| ... | ... | ... |
The table is approximately 80 million rows of events, attributable to approximately 2.5 million unique entities (ID values). The dates appear to align nicely with calendar quarters, but I haven't thoroughly checked them so assume they can be arbitrary. However, I have at least sense-checked them for end_date > start_date.
I want to produce a table "nonevent_qtrs" listing calendar quarters where an ID has no event recorded, e.g.:
| id | last_doq |
| 1 | 2010-09-30 |
| 1 | 2010-12-31 |
| ... | ... |
| 1 | 2018-06-30 |
| 2 | 2010-03-30 |
| ... | ... |
(doq = day of quarter)
If the extent of an event spans any days of the quarter (including the first and last dates), then I wish for it to count as having occurred in that quarter.
To help with this, I have produced a "calendar table"; a table of quarters "qtrs", covering the entire span of dates present in "events", and of the format:
| first_doq | last_doq |
| 2010-01-01 | 2010-03-30 |
| 2010-04-01 | 2010-06-30 |
| ... | ... |
And tried using a non-equi merge like so:
create table nonevents
as select
id,
last_doq
from
events
full outer join
qtrs
on
start_date > last_doq or
end_date < first_doq
group by
id,
last_doq
But this is a) terribly inefficient and b) certainly wrong, since most IDs are listed as being non-eventful for all quarters.
How can I produce the table "nonevent_qtrs" I described, which contains a list of quarters for which each ID had no events?
If it's relevant, the ultimate use-case is to calculate runs of non-events to look at time-till-event analysis and prediction. Feels like run length encoding will be required. If there's a more direct approach than what I've described above then I'm all ears. The only reason I'm focusing on non-event runs to begin with is to try to limit the size of the cross-product. I've also considered producing something like:
| id | last_doq | event |
| 1 | 2010-01-31 | 1 |
| ... | ... | ... |
| 1 | 2018-06-30 | 0 |
| ... | ... | ... |
But although more useful this may not be feasible due to the size of the data involved. A wide format:
| id | 2010-01-31 | ... | 2018-06-30 |
| 1 | 1 | ... | 0 |
| 2 | 0 | ... | 1 |
| ... | ... | ... | ... |
would also be handy, but since MonetDB is column-store I'm not sure whether this is more or less efficient.
Let me assume that you have a table of quarters with the start date and end date of each quarter; this is your qtrs table. You really need this if you want the quarters that don't exist. After all, how far back in time or forward in time do you want to go?
Then, you can generate all id/quarter combinations and filter out the ones that exist:
select i.id, q.last_doq
from (select distinct id from events) i cross join
     qtrs q left join
     events e
     on e.id = i.id and
        e.start_date <= q.last_doq and
        e.end_date >= q.first_doq
where e.id is null;
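Since the ultimate use-case is run-length encoding of the non-event quarters, a gaps-and-islands pass can collapse consecutive quarters into runs. A hedged sketch, assuming the result above is stored as nonevent_qtrs(id, last_doq) and that qtrs carries a hypothetical sequential quarter number qtr_n:
SELECT id,
       MIN(last_doq) AS run_first_qtr,
       MAX(last_doq) AS run_last_qtr,
       COUNT(*)      AS run_length
FROM (
    SELECT n.id, n.last_doq, q.qtr_n,
           -- consecutive quarters share the same (qtr_n - row_number) value
           q.qtr_n - ROW_NUMBER() OVER (PARTITION BY n.id ORDER BY q.qtr_n) AS grp
    FROM nonevent_qtrs n
    JOIN qtrs q ON q.last_doq = n.last_doq
) runs
GROUP BY id, grp;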
I have an MS Access database with rainfall data for several climate stations.
For each day of each station, I want to calculate the rainfall on the previous day (if recorded), and the sum of the rainfall over the previous 3 and 7 days.
Due to the huge amount of data and the limitations of Access, I made a query that handles one station at a time. Then I applied an auxiliary query to find the dates first; for each station, the following SQL statement is applied (and saved as the RainFallStudy query):
SELECT
[173].ID, [173].AirportCode, [173].RFmm,
DateSerial([rYear], [rMonth], [rDay]) AS DateSer,
[DateSer]-1 AS DM1,
[DateSer]-2 AS DM2,
[DateSer]-3 AS DM3,
[DateSer]-4 AS DM4,
[DateSer]-5 AS DM5,
[DateSer]-6 AS DM6,
[DateSer]-7 AS DM7
FROM
[173]
WHERE
((([173].AirportCode) = 786660));
I used DM1, DM2, etc. as the date serial of day-1, day-2, etc.
Then I used another query that joins the RainFallStudy query to itself with left joins, as shown in the figure.
The SQL statement is:
SELECT
RainFallStudy.ID, RainFallStudy.AirportCode,
RainFallStudy.RFmm AS RF0, RainFallStudy.DateSer,
RainFallStudy.DM1, RainFallStudy_1.RFmm AS RF1,
RainFallStudy_2.RFmm AS RF2, RainFallStudy_3.RFmm AS RF3,
RainFallStudy_4.RFmm AS RF4, RainFallStudy_5.RFmm AS RF5,
RainFallStudy_6.RFmm AS RF6, RainFallStudy_7.RFmm AS RF7,
Nz([rf1], 0) + Nz([rf2], 0) + Nz([rf3], 0) + Nz([rf4], 0) + Nz([rf5], 0) + Nz([rf6], 0) + Nz([rf7], 0) AS RF_W
FROM
((((((RainFallStudy
LEFT JOIN
RainFallStudy AS RainFallStudy_1 ON RainFallStudy.DM1 = RainFallStudy_1.DateSer)
LEFT JOIN
RainFallStudy AS RainFallStudy_2 ON RainFallStudy.DM2 = RainFallStudy_2.DateSer)
LEFT JOIN
RainFallStudy AS RainFallStudy_3 ON RainFallStudy.DM3 = RainFallStudy_3.DateSer)
LEFT JOIN
RainFallStudy AS RainFallStudy_4 ON RainFallStudy.DM4 = RainFallStudy_4.DateSer)
LEFT JOIN
RainFallStudy AS RainFallStudy_5 ON RainFallStudy.DM5 = RainFallStudy_5.DateSer)
LEFT JOIN
RainFallStudy AS RainFallStudy_6 ON RainFallStudy.DM6 = RainFallStudy_6.DateSer)
LEFT JOIN
RainFallStudy AS RainFallStudy_7 ON RainFallStudy.DM7 = RainFallStudy_7.DateSer;
Now I suffer from the slow performance of this query, as the records of each station range from 1,000 to 750,000 rows! Is there a better way to get what I need with a faster SQL statement? The second question: can I make this a standalone SQL statement (one query, without the auxiliary query)? I will use it from Python, which, to my knowledge, requires a single SQL statement.
Thanks in advance.
Update
As requested by @Andre, here is some sample data from table [173]:
+-------+-------------+-------+--------+------+-------+
| ID    | AirportCode | rYear | rMonth | rDay | RFmm  |
+-------+-------------+-------+--------+------+-------+
| 11216 | 409040      | 2012  | 1      | 23   | 0.51  |
| 11217 | 409040      | 2012  | 1      | 24   | 0     |
| 11218 | 409040      | 2012  | 1      | 25   | 0     |
| 11219 | 409040      | 2012  | 1      | 26   | 2.03  |
| 11220 | 409040      | 2012  | 1      | 27   | 0     |
| 11221 | 409040      | 2012  | 1      | 28   | 0     |
| 11222 | 409040      | 2012  | 1      | 29   | 0     |
| 11223 | 409040      | 2012  | 1      | 30   | 0     |
| 11224 | 409040      | 2012  | 1      | 31   | 0.25  |
| 11225 | 409040      | 2012  | 2      | 1    | 0     |
| 11226 | 409040      | 2012  | 2      | 2    | 0     |
| 11227 | 409040      | 2012  | 2      | 3    | 4.32  |
| 11228 | 409040      | 2012  | 2      | 4    | 13.21 |
| 11229 | 409040      | 2012  | 2      | 5    | 1.02  |
| 11230 | 409040      | 2012  | 2      | 6    | 0     |
| 11231 | 409040      | 2012  | 2      | 7    | 0     |
| 11232 | 409040      | 2012  | 2      | 8    | 0     |
| 11233 | 409040      | 2012  | 2      | 9    | 0     |
| 11234 | 409040      | 2012  | 2      | 10   | 5.08  |
| 11235 | 409040      | 2012  | 2      | 11   | 0     |
| 11236 | 409040      | 2012  | 2      | 12   | 12.95 |
| 11237 | 409040      | 2012  | 2      | 13   | 5.59  |
| 11238 | 409040      | 2012  | 2      | 14   | 0.25  |
| 11239 | 409040      | 2012  | 2      | 15   | 0     |
| 11240 | 409040      | 2012  | 2      | 16   | 0     |
| 11241 | 409040      | 2012  | 2      | 17   | 0     |
| 11242 | 409040      | 2012  | 2      | 18   | 0     |
| 11243 | 409040      | 2012  | 2      | 19   | 0     |
| 11244 | 409040      | 2012  | 2      | 20   | 14.48 |
| 11245 | 409040      | 2012  | 2      | 21   | 9.65  |
| 11246 | 409040      | 2012  | 2      | 22   | 3.05  |
| 11247 | 409040      | 2012  | 2      | 23   | 0     |
| 11248 | 409040      | 2012  | 2      | 24   | 0     |
| 11249 | 409040      | 2012  | 2      | 25   | 0     |
| 11250 | 409040      | 2012  | 2      | 26   | 0     |
| 11251 | 409040      | 2012  | 2      | 27   | 0     |
| 11252 | 409040      | 2012  | 2      | 28   | 7.37  |
| 11253 | 409040      | 2012  | 2      | 29   | 0     |
+-------+-------------+-------+--------+------+-------+
And here is the sample desired output:
+-------+-------------+------------+---------+-----------+-----------+----------+
| ID    | AirportCode | DateSer    | ThisDay | Yesterday | Prev3days | PrevWeek |
+-------+-------------+------------+---------+-----------+-----------+----------+
| 11216 | 409040      | 23-01-2012 | 0.51    | 0         | 0         | 0        |
| 11217 | 409040      | 24-01-2012 | 0       | 0.51      | 0.51      | 0.51     |
| 11218 | 409040      | 25-01-2012 | 0       | 0         | 0.51      | 0.51     |
| 11219 | 409040      | 26-01-2012 | 2.03    | 0         | 0.51      | 0.51     |
| 11220 | 409040      | 27-01-2012 | 0       | 2.03      | 2.03      | 2.54     |
| 11221 | 409040      | 28-01-2012 | 0       | 0         | 2.03      | 2.54     |
| 11222 | 409040      | 29-01-2012 | 0       | 0         | 2.03      | 2.54     |
| 11223 | 409040      | 30-01-2012 | 0       | 0         | 0         | 2.54     |
| 11224 | 409040      | 31-01-2012 | 0.25    | 0         | 0         | 2.03     |
| 11225 | 409040      | 01-02-2012 | 0       | 0.25      | 0.25      | 2.28     |
| 11226 | 409040      | 02-02-2012 | 0       | 0         | 0.25      | 2.28     |
| 11227 | 409040      | 03-02-2012 | 4.32    | 0         | 0.25      | 0.25     |
| 11228 | 409040      | 04-02-2012 | 13.21   | 4.32      | 4.32      | 4.57     |
| 11229 | 409040      | 05-02-2012 | 1.02    | 13.21     | 17.53     | 17.78    |
| 11230 | 409040      | 06-02-2012 | 0       | 1.02      | 18.55     | 18.8     |
| 11231 | 409040      | 07-02-2012 | 0       | 0         | 14.23     | 18.8     |
| 11232 | 409040      | 08-02-2012 | 0       | 0         | 1.02      | 18.55    |
| 11233 | 409040      | 09-02-2012 | 0       | 0         | 0         | 18.55    |
| 11234 | 409040      | 10-02-2012 | 5.08    | 0         | 0         | 18.55    |
| 11235 | 409040      | 11-02-2012 | 0       | 5.08      | 5.08      | 19.31    |
| 11236 | 409040      | 12-02-2012 | 12.95   | 0         | 5.08      | 6.1      |
| 11237 | 409040      | 13-02-2012 | 5.59    | 12.95     | 18.03     | 18.03    |
| 11238 | 409040      | 14-02-2012 | 0.25    | 5.59      | 18.54     | 23.62    |
| 11239 | 409040      | 15-02-2012 | 0       | 0.25      | 18.79     | 23.87    |
| 11240 | 409040      | 16-02-2012 | 0       | 0         | 5.84      | 23.87    |
| 11241 | 409040      | 17-02-2012 | 0       | 0         | 0.25      | 23.87    |
| 11242 | 409040      | 18-02-2012 | 0       | 0         | 0         | 18.79    |
| 11243 | 409040      | 19-02-2012 | 0       | 0         | 0         | 18.79    |
| 11244 | 409040      | 20-02-2012 | 14.48   | 0         | 0         | 5.84     |
| 11245 | 409040      | 21-02-2012 | 9.65    | 14.48     | 14.48     | 14.73    |
| 11246 | 409040      | 22-02-2012 | 3.05    | 9.65      | 24.13     | 24.13    |
| 11247 | 409040      | 23-02-2012 | 0       | 3.05      | 27.18     | 27.18    |
| 11248 | 409040      | 24-02-2012 | 0       | 0         | 12.7      | 27.18    |
| 11249 | 409040      | 25-02-2012 | 0       | 0         | 3.05      | 27.18    |
| 11250 | 409040      | 26-02-2012 | 0       | 0         | 0         | 27.18    |
| 11251 | 409040      | 27-02-2012 | 0       | 0         | 0         | 27.18    |
| 11252 | 409040      | 28-02-2012 | 7.37    | 0         | 0         | 12.7     |
| 11253 | 409040      | 29-02-2012 | 0       | 7.37      | 7.37      | 10.42    |
+-------+-------------+------------+---------+-----------+-----------+----------+
I created an additional column rDate (DateTime) and filled it with this query:
UPDATE Rainfall SET Rainfall.rDate = DateSerial([rYear],[rMonth],[rDay]);
Then your desired result can be achieved with several subqueries, using SUM() for the last two columns:
SELECT r.ID, r.AirportCode, r.rDate, r.RFmm,
(SELECT RFmm FROM Rainfall r1 WHERE r1.AirportCode = r.AirportCode AND r1.rDate = r.rDate-1) AS Yesterday,
(SELECT SUM(RFmm) FROM Rainfall r3 WHERE r3.AirportCode = r.AirportCode AND r3.rDate BETWEEN r.rDate-3 AND r.rDate-1) AS Prev3days,
(SELECT SUM(RFmm) FROM Rainfall r7 WHERE r7.AirportCode = r.AirportCode AND r7.rDate BETWEEN r.rDate-7 AND r.rDate-1) AS PrevWeek
FROM Rainfall r
Make sure AirportCode and rDate are indexed for larger numbers of records.
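For example (the index name is illustrative):
CREATE INDEX idxStationDate ON Rainfall (AirportCode, rDate);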
Result:
+-------+-------------+------------+-------+-----------+-----------+----------+
| ID | AirportCode | rDate | RFmm | Yesterday | Prev3days | PrevWeek |
+-------+-------------+------------+-------+-----------+-----------+----------+
| 11216 | 409040 | 23.01.2012 | 0,51 | | | |
| 11217 | 409040 | 24.01.2012 | 0 | 0,51 | 0,51 | 0,51 |
| 11218 | 409040 | 25.01.2012 | 0 | 0 | 0,51 | 0,51 |
| 11219 | 409040 | 26.01.2012 | 2,03 | 0 | 0,51 | 0,51 |
| 11220 | 409040 | 27.01.2012 | 0 | 2,03 | 2,03 | 2,54 |
| 11221 | 409040 | 28.01.2012 | 0 | 0 | 2,03 | 2,54 |
| 11222 | 409040 | 29.01.2012 | 0 | 0 | 2,03 | 2,54 |
| 11223 | 409040 | 30.01.2012 | 0 | 0 | 0 | 2,54 |
| 11224 | 409040 | 31.01.2012 | 0,25 | 0 | 0 | 2,03 |
| 11225 | 409040 | 01.02.2012 | 0 | 0,25 | 0,25 | 2,28 |
| 11226 | 409040 | 02.02.2012 | 0 | 0 | 0,25 | 2,28 |
| 11227 | 409040 | 03.02.2012 | 4,32 | 0 | 0,25 | 0,25 |
| 11228 | 409040 | 04.02.2012 | 13,21 | 4,32 | 4,32 | 4,57 |
| 11229 | 409040 | 05.02.2012 | 1,02 | 13,21 | 17,53 | 17,78 |
+-------+-------------+------------+-------+-----------+-----------+----------+
Use Nz() to avoid NULL values in the first row.
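For example, wrapping the Yesterday subquery from the query above (a sketch of the same column):
Nz((SELECT RFmm FROM Rainfall r1
    WHERE r1.AirportCode = r.AirportCode AND r1.rDate = r.rDate-1), 0) AS Yesterday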
It appears that you store the day in separate fields (rYear, rMonth, rDay). So, in order to get the date you use the DateSerial function. This means that in order to use the date for a join or where clause, Access must calculate the date for the entire table. You need to store the date in a separate field and index it to avoid the calculation.
I'm attempting to pull down records that are filtered by two date columns - I need to show all "active" records. Currently, I am able to pull records using the latest "effective date", but the problem is I may have active records across multiple effective dates.
An "active" record is defined as a record with an effective date prior to or equal to current date (see notes for current date assumptions), with an end date that is equal to or greater than current date. An "inactive" record would be the first and second rows of data in my example, an active record would be the third row of data.
I'm working with a data set similar to this:
+-------------+----------------+----------+---------+---------+---------+
| Mode Name | Effective Date | End Date | Mode ID | Param 1 | Param 2 |
+-------------+----------------+----------+---------+---------+---------+
| Single Mode | 20110102 | 20120313 | 1 | Green | Metal |
| Single Mode | 20120314 | 20131122 | 1 | Green | Wood |
| Single Mode | 20131123 | 29991231 | 1 | Orange | Plastic |
| Multi Mode | 20110102 | 20120313 | 5 | Orange | Plastic |
| Multi Mode | 20120314 | 20120501 | 5 | Red | Metal |
| Triple Mode | 20120314 | 20120314 | 3 | Blue | Cloth |
| Triple Mode | 20120315 | 20131122 | 3 | Red | Wood |
| Triple Mode | 20131123 | 20131130 | 3 | Red | Wood |
| Triple Mode | 20131201 | 29991231 | 3 | Orange | Wood |
| Double Mode | 20131123 | 29991231 | 2 | Green | Metal |
| Double Mode | 20131202 | 29991231 | 2 | Brown | Plastic |
| Quad Mode | 20131202 | 29991231 | 4 | Black | Wood |
| Quad Mode | 20131203 | 29991231 | 4 | Green | Plastic |
| Zero Mode | 20090704 | 29991231 | 0 | Blue | Cloth |
+-------------+----------------+----------+---------+---------+---------+
What I need to do is query so that each "active" mode is shown, but only the latest active mode as defined by the "effective date" column. "Ended" modes should not be shown. An "ended" mode is defined as having an end date prior to current date - with "29991231" being defined as "no end date". Ideally, the data set above would filter down to this:
+-------------+----------------+----------+---------+---------+---------+
| Mode Name | Effective Date | End Date | Mode ID | Param 1 | Param 2 |
+-------------+----------------+----------+---------+---------+---------+
| Single Mode | 20131123 | 29991231 | 1 | Orange | Plastic |
| Triple Mode | 20131201 | 29991231 | 3 | Orange | Wood |
| Double Mode | 20131202 | 29991231 | 2 | Brown | Plastic |
| Quad Mode | 20131203 | 29991231 | 4 | Green | Plastic |
| Zero Mode | 20090704 | 29991231 | 0 | Blue | Cloth |
+-------------+----------------+----------+---------+---------+---------+
Some notes:
Assume "current date" in this example is 2013-12-16.
You cannot filter on end date alone - as due to the way our system works, an end date of "29991231" does not guarantee a record is ended. For example, given two records with ending dates of "29991231", the one with the more recent effective date will supersede the one with an older effective date.
Some records will not be shown at all because they are ended prior to the current date.
This is an old system that is terribly designed. I'm sure there are ton of better ways to store data (believe me, what I'm showing you is NOT the worst part) - but unfortunately I'm stuck with what I have.
Here you go
SELECT
YourTable.ModeName,
YourTable.EffectiveDate,
YourTable.EndDate,
YourTable.ModeId,
YourTable.Param1,
YourTable.Param2
FROM
YourTable INNER JOIN
(SELECT
ModeName,
MAX(EffectiveDate) AS MaximumEffectiveDate
FROM YourTable AS YourTable_1
WHERE (CAST(GETDATE() AS date) BETWEEN CONVERT(date, CONVERT(char(8), EffectiveDate), 112) AND CONVERT(date, CONVERT(char(8), EndDate), 112))
GROUP BY ModeName) AS GroupedByMode ON YourTable.ModeName = GroupedByMode.ModeName AND
YourTable.EffectiveDate = GroupedByMode.MaximumEffectiveDate
Just replace GETDATE() with the date of your choice.
Hopefully this is what you need; I copied your data and tested it to get the same results you have.
You can use a CTE to get the single record for each:
with MaxDate as (select
[Mode Name],
max([Effective Date]) as mdate
from
table1
group by [Mode Name])
select
*
from
table1 t1
inner join
MaxDate on mdate = [Effective Date]
and t1.[Mode Name] = MaxDate.[Mode Name]
where [End Date] = 29991231
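An alternative that does the filtering and the "latest effective date" pick in one pass is a windowed query. A hedged sketch, where the table name YourTable and the integer literals for the assumed current date 2013-12-16 are placeholders:
SELECT [Mode Name], [Effective Date], [End Date], [Mode ID], [Param 1], [Param 2]
FROM (
    SELECT t.*,
           -- latest effective date first within each mode
           ROW_NUMBER() OVER (PARTITION BY [Mode ID]
                              ORDER BY [Effective Date] DESC) AS rn
    FROM YourTable AS t
    WHERE [Effective Date] <= 20131216  -- effective on or before the current date
      AND [End Date] >= 20131216        -- not yet ended (29991231 = no end date)
) AS x
WHERE rn = 1;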
So I have been doing pretty well on my project (link to previous StackOverflow question) and have managed to learn quite a bit, but there is one problem that has been really dogging me for days and I just can't seem to solve it.
It has to do with using the UNIX_TIMESTAMP call to convert dates in my SQL database to UNIX time format, but for some reason only one set of dates in my table is giving me issues!
==============
So these are the values I am getting -
#abridged here, see the results from the SELECT statement below to see the rest
#of the fields outputted
| firstVst | nextVst | DOB |
| 1206936000 | 1396238400 | 0 |
| 1313726400 | 1313726400 | 278395200 |
| 1318910400 | 1413604800 | 0 |
| 1319083200 | 1413777600 | 0 |
when I use this SELECT statement:
SELECT SQL_CALC_FOUND_ROWS *, UNIX_TIMESTAMP(firstVst) AS firstVst,
       UNIX_TIMESTAMP(nextVst) AS nextVst, UNIX_TIMESTAMP(DOB) AS DOB
FROM people
ORDER BY ref DESC;
So my big question is: why in the heck are 3 out of 4 of my DOBs being set to a date of 0 (i.e., 12/31/1969 on my PC)? Why is this not happening in my other fields?
I can see the data quite well using a simpler SELECT statement, and the DOB field looks fine...?
#formatting broken to change some variable names etc.
select * FROM people;
| ref | lastName | firstName | DOB | rN | lN | firstVst | disp | repName | nextVst |
| 10001 | BlankA | NameA | 1968-04-15 | 1000000 | 4600000 | 2008-03-31 | Positive | Patrick Smith | 2014-03-31 |
| 10002 | BlankB | NameB | 1978-10-28 | 1000001 | 4600001 | 2011-08-19 | Positive | Patrick Smith | 2011-08-19 |
| 10003 | BlankC | NameC | 1941-06-08 | 1000002 | 4600002 | 2011-10-18 | Positive | Patrick Smith | 2014-10-18 |
| 10004 | BlankD | NameD | 1952-08-01 | 1000003 | 4600003 | 2011-10-20 | Positive | Patrick Smith | 2014-10-20 |
It's because those DOBs are from before the UNIX epoch, which starts at 00:00:00 UTC on 1 January 1970 (12/31/1969 in your local time zone), so anything prior to that would be negative; MySQL's UNIX_TIMESTAMP() returns 0 instead for dates outside its supported range.
From Wikipedia:
Unix time, or POSIX time, is a system for describing instants in time, defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970, not counting leap seconds.
A bit more elaboration: basically, what you're trying to do isn't possible with UNIX_TIMESTAMP(), whose supported range only runs from 1970 to early 2038. Depending on what it's for, there may be a different way you can do this, but using UNIX timestamps probably isn't the best idea for dates like that.
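If a signed seconds-since-epoch value would do, one workaround is to compute the offset yourself, which goes negative for pre-1970 dates. A sketch reusing the people table and DOB column from the question; note that unlike UNIX_TIMESTAMP(), this performs no time-zone conversion:
SELECT ref, lastName, firstName,
       -- signed seconds relative to 1970-01-01; negative before the epoch
       TIMESTAMPDIFF(SECOND, '1970-01-01 00:00:00', DOB) AS DOB
FROM people;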