T-SQL checking if a date in 1 table is between 2 dates in another table - sql

I have a main table CleanData and a reference table TIME_TABLE_VERSION ttv. In CleanData there is no primary key but each row has a date column called CALENDAR_DATE1.
I want to return all rows from CleanData where CleanData.CALENDAR_DATE1 is between ttv.ACTIVATION_DATE and ttv.DEACTIVATION_DATE.
What's tricky is the timing of when ttv table gets updated. If you look below you'll see that the last record in ttv has a deactivation date of July 16, 2022. However, this table will get updated in the future by truncating the previous row deactivation date and a new row gets inserted. So for example the next time ttv gets updated will be around June 30th and there will be a new row for TIME_TABLE_VERSION_ID = 191 with an activation date of July 3, 2022; the deactivation date for TIME_TABLE_VERSION_ID = 190 will get truncated from July 16, 2022 to July 2, 2022 upon update. Note that this ttv table update will happen in advance, when CleanData.CALENDAR_DATE1 is still less than the ttv.ACTIVATION_DATE in TIME_TABLE_VERSION_ID = 191. If I simply select MAX TIME_TABLE_VERSION_ID then there will be a few days of missing data returned from CleanData between June 30th and July 3rd.
I'm trying to write a query that will factor in when CleanData.CALENDAR_DATE1 is less than the most recent ttv.ACTIVATION_DATE, and give all the rows in CleanData with a Calendar_DATE1 between TIME_TABLE_VERSION_ID -1, until CleanData.CALENDAR_DATE1 is >= the ttv.ACTIVATION_DATE in TIME_TABLE_VERSION_ID + 0 (most recent).
Any ideas how to fix my query?
SELECT
CleanData.*
FROM
TIME_TABLE_VERSION AS ttv
INNER JOIN
CleanData ON CleanData.CALENDAR_DATE1 BETWEEN ttv.ACTIVATION_DATE AND ttv.DEACTIVATION_DATE
AND (CASE
WHEN (CleanData.CALENDAR_DATE1 < (SELECT ttv1.ACTIVATION_DATE FROM TIME_TABLE_VERSION ttv1 WHERE ttv.TIME_TABLE_VERSION_ID = ttv1.TIME_TABLE_VERSION_ID))
THEN (ttv.TIME_TABLE_VERSION_ID = (SELECT MAX (ttv1.TIME_TABLE_VERSION_ID)-1 FROM TIME_TABLE_VERSION ttv1))
ELSE (ttv.TIME_TABLE_VERSION_ID = (SELECT MAX(ttv1.TIME_TABLE_VERSION_ID) FROM TIME_TABLE_VERSION ttv1)) END)
ORDER BY
CleanData.CALENDAR_DATE1
Here's what table ttv looks like:
TIME_TABLE_VERSION_ID  TIME_TABLE_VERSION_NAME  ACTIVATION_DATE          DEACTIVATION_DATE
---------------------  -----------------------  -----------------------  -----------------------
184                    Feb22_01                 2022-02-06 00:00:00.000  2022-02-26 23:59:59.000
185                    Feb22_02                 2022-02-27 00:00:00.000  2022-03-19 23:59:59.000
186                    Feb22_03                 2022-03-20 00:00:00.000  2022-04-09 23:59:59.000
187                    Feb22_04                 2022-04-10 00:00:00.000  2022-04-23 23:59:59.000
188                    Apr22_01                 2022-04-24 00:00:00.000  2022-05-14 23:59:59.000
189                    Apr22_02                 2022-05-15 00:00:00.000  2022-05-28 23:59:59.000
190                    Apr22_03                 2022-05-29 00:00:00.000  2022-07-16 23:59:59.000
Note there is no TIME_TABLE_VERSION_ID or TIME_TABLE_VERSION_NAME in CleanData to join to. I only have the CALENDAR_DATE1 in CleanData and the ACTIVATION_DATE and DEACTIVATION_DATE in ttv.
Note also that I have no ability to change the structure of either table, I have to work with what's there in both.
Thanks so much for any help you can offer!
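One possible reading of the requirement is "return the CleanData rows that belong to the version that is active right now", which can be expressed by picking the version whose date window contains a reference date instead of picking MAX(TIME_TABLE_VERSION_ID). The sketch below is only an illustration under that assumption. It uses Python's sqlite3 module as a stand-in for SQL Server, the CleanData rows, the end date of version 191 and the `as_of` value are hypothetical, and in T-SQL the reference date would more likely be GETDATE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TIME_TABLE_VERSION (
    TIME_TABLE_VERSION_ID INTEGER,
    ACTIVATION_DATE       TEXT,
    DEACTIVATION_DATE     TEXT
);
CREATE TABLE CleanData (CALENDAR_DATE1 TEXT);

-- State after the early update: 190 is truncated and 191 already exists.
INSERT INTO TIME_TABLE_VERSION VALUES
    (189, '2022-05-15 00:00:00', '2022-05-28 23:59:59'),
    (190, '2022-05-29 00:00:00', '2022-07-02 23:59:59'),
    (191, '2022-07-03 00:00:00', '2022-07-23 23:59:59');  -- hypothetical end date

-- Hypothetical CleanData rows.
INSERT INTO CleanData VALUES
    ('2022-05-20 00:00:00'),
    ('2022-06-29 00:00:00'),
    ('2022-06-30 00:00:00'),
    ('2022-07-01 00:00:00');
""")

# Hypothetical "today": 191 has been inserted but is not active yet.
as_of = "2022-07-01 00:00:00"

# Selecting by MAX(ID) picks 191, whose window contains no CleanData rows yet.
max_id_rows = conn.execute("""
    SELECT cd.CALENDAR_DATE1
    FROM TIME_TABLE_VERSION AS ttv
    JOIN CleanData AS cd
      ON cd.CALENDAR_DATE1 BETWEEN ttv.ACTIVATION_DATE AND ttv.DEACTIVATION_DATE
    WHERE ttv.TIME_TABLE_VERSION_ID =
          (SELECT MAX(TIME_TABLE_VERSION_ID) FROM TIME_TABLE_VERSION)
""").fetchall()

# Selecting the version whose window contains the reference date picks 190.
rows = conn.execute("""
    SELECT ttv.TIME_TABLE_VERSION_ID, cd.CALENDAR_DATE1
    FROM TIME_TABLE_VERSION AS ttv
    JOIN CleanData AS cd
      ON cd.CALENDAR_DATE1 BETWEEN ttv.ACTIVATION_DATE AND ttv.DEACTIVATION_DATE
    WHERE ? BETWEEN ttv.ACTIVATION_DATE AND ttv.DEACTIVATION_DATE
    ORDER BY cd.CALENDAR_DATE1
""", (as_of,)).fetchall()

print(max_id_rows)
print(rows)
```

Because the activation and deactivation windows do not overlap, at most one version matches the reference date, so no ID arithmetic such as MAX(ID) - 1 is needed.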

Related

combine two rows with 2 months into one row of one month, containing null values into one

I would like to have a dataframe where 1 row only contains one month of data.
month cust_id closed_deals cum_closed_deals checkout cum_checkout
2019-10-01 1 15 15 null null
2019-10-01 1 null 15 210 210
2019-11-01 1 27 42 null 210
2019-11-01 1 null 42 369 579
Expected result:
month cust_id closed_deals cum_closed_deals checkout cum_checkout
2019-10-01 1 15 15 210 210
2019-11-01 1 27 42 369 579
At first, I thought a normal GROUP BY would work, but when I tried to group by only "month" and "cust_id", I got an error saying that closed_deals and checkout also need to be in the GROUP BY.
You may simply aggregate by the (first of the) month and cust_id and take the max of all other columns:
SELECT
month,
cust_id,
MAX(closed_deals) AS closed_deals,
MAX(cum_closed_deals) AS cum_closed_deals,
MAX(checkout) AS checkout,
MAX(cum_checkout) AS cum_checkout
FROM yourTable
GROUP BY
month,
cust_id;
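This works because MAX ignores NULL values, so each group keeps the one non-null value per column. A minimal sketch to check this against the sample rows from the question, using Python's sqlite3 module as a stand-in for the actual engine (the table name yourTable is the placeholder from the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE yourTable (
        month TEXT, cust_id INTEGER,
        closed_deals INTEGER, cum_closed_deals INTEGER,
        checkout INTEGER, cum_checkout INTEGER
    )
""")
# Sample rows from the question; None is stored as NULL.
conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?, ?, ?, ?)", [
    ("2019-10-01", 1, 15, 15, None, None),
    ("2019-10-01", 1, None, 15, 210, 210),
    ("2019-11-01", 1, 27, 42, None, 210),
    ("2019-11-01", 1, None, 42, 369, 579),
])

merged = conn.execute("""
    SELECT month, cust_id,
           MAX(closed_deals), MAX(cum_closed_deals),
           MAX(checkout), MAX(cum_checkout)
    FROM yourTable
    GROUP BY month, cust_id
    ORDER BY month, cust_id
""").fetchall()

for row in merged:
    print(row)
```

Taking MAX of the cumulative columns relies on them never decreasing within a month, which holds for the sample data.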

How to bring future days to past date and then revert to same old days using postgresql?

I have a db with 6 tables. Each table has a list of date and datetime columns as shown below
Table 1 Table 2 .... Table 6
Date_of_birth Exam_date exam_datetime Result_date Result_datetime
2190-01-13 2192-01-13 2192-01-13 09:00:00 2194-04-13 2194-04-13 07:12:00
2184-05-21 2186-05-21 2186-05-21 07:00:00 2188-02-03 2188-02-03 09:32:00
2181-06-17 2183-06-17 2183-06-17 05:00:00 2185-07-23 2185-07-23 12:40:00
What I would like to do is shift all these future days back to the past date (definitely has to be less than the current date) but retain the same chronological order. Meaning, we can see that the person was born first, then he took the exam, and finally, he got his results.
In addition, I should be able to revert the changes and get back the future dates again.
I expect my output to be something like below
Stage 1 - shift back to old days (it can be any day but it has to be in the past and retain chronological order)
Table 1 Table 2 .... Table 6
Date_of_birth Exam_date exam_datetime Result_date Result_datetime
1990-01-13 1992-01-13 1992-01-13 09:00:00 1994-04-13 1994-04-13 07:12:00
1984-05-21 1986-05-21 1986-05-21 07:00:00 1988-02-03 1988-02-03 09:32:00
1981-06-17 1983-06-17 1983-06-17 05:00:00 1985-07-23 1985-07-23 12:40:00
Stage 2 - Shift forward to future days as how it was earlier
Table 1 Table 2 .... Table 6
Date_of_birth Exam_date exam_datetime Result_date Result_datetime
2190-01-13 2192-01-13 2192-01-13 09:00:00 2194-04-13 2194-04-13 07:12:00
2184-05-21 2186-05-21 2186-05-21 07:00:00 2188-02-03 2188-02-03 09:32:00
2181-06-17 2183-06-17 2183-06-17 05:00:00 2185-07-23 2185-07-23 12:40:00
Subtract two centuries:
update table1
set date_of_birth = date_of_birth - interval '200 year';
You can do something similar for all the other dates.
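A fixed offset keeps the chronological order and can be undone by adding the same interval back. The sketch below illustrates the round trip with Python's sqlite3 module rather than PostgreSQL, so the date arithmetic uses SQLite's date() modifiers instead of interval; the table and column names follow the answer. One caveat: a 29 February date might not survive the round trip, because the year 200 years earlier is not always a leap year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (date_of_birth TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?)",
    [("2190-01-13",), ("2184-05-21",), ("2181-06-17",)],
)

def read_dates():
    # Return the dates in insertion order.
    return [r[0] for r in conn.execute(
        "SELECT date_of_birth FROM table1 ORDER BY rowid")]

original = read_dates()

# Stage 1: shift every date 200 years into the past.
conn.execute(
    "UPDATE table1 SET date_of_birth = date(date_of_birth, '-200 years')")
shifted = read_dates()

# Stage 2: revert by adding the same 200 years back.
conn.execute(
    "UPDATE table1 SET date_of_birth = date(date_of_birth, '+200 years')")
restored = read_dates()

print(shifted)
print(restored == original)
```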

GROUP BY several hours

I have a table where our product records its activity log. The product starts working at 23:00 every day and usually works one or two hours. This means that once a batch started at 23:00, it finishes about 1:00am next day.
Now I need statistics on how many posts are registered per batch, but I cannot figure out a query that would let me achieve this. So far I have the following SQL code:
SELECT COUNT(*), DATEPART(DAY,registrationtime),DATEPART(HOUR,registrationtime)
FROM RegistrationMessageLogEntry
WHERE registrationtime > '2014-09-01 20:00'
GROUP BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
ORDER BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
which results in following
count day hour
....
1189 9 23
8611 10 0
2754 10 23
6462 11 0
1885 11 23
I.e. I want the number for 9th 23:00 grouped with the number for 10th 00:00, 10th 23:00 with 11th 00:00 and so on. How could I do it?
You can do it very easily. Use DATEADD to add an hour to the original registrationtime. If you do so, all the registrationtimes will be moved to the same day, and you can simply group by the day part.
You could also do it in a more complicated way using CASE WHEN, but that is overkill given this easy solution.
I had to do something similar a few days ago. I had fixed timespans for work shifts to group by where one of them could start on one day at 10pm and end the next morning at 6am.
What I did was:
Define a "shift date": the day on which the shift started, with the time part set to zero, for every entry in the table. I did this by checking whether the entry's timestamp fell between 0am and 6am; in that case I took only the date part of DATEADD(dd, -1, entryDate), which gives the previous day for all entries between 0am and 6am.
I also added an ID for the shift: 0 for the first one (6am to 2pm), 1 for the second one (2pm to 10pm) and 2 for the last one (10pm to 6am).
I was then able to group over the shift date and shift IDs.
Example:
Consider the following source entries:
Timestamp SomeData
=============================
2014-09-01 06:01:00 5
2014-09-01 14:01:00 6
2014-09-02 02:00:00 7
Step one extended the table as follows:
Timestamp SomeData ShiftDay
====================================================
2014-09-01 06:01:00 5 2014-09-01 00:00:00
2014-09-01 14:01:00 6 2014-09-01 00:00:00
2014-09-02 02:00:00 7 2014-09-01 00:00:00
Step two extended the table as follows:
Timestamp SomeData ShiftDay ShiftID
==============================================================
2014-09-01 06:01:00 5 2014-09-01 00:00:00 0
2014-09-01 14:01:00 6 2014-09-01 00:00:00 1
2014-09-02 02:00:00 7 2014-09-01 00:00:00 2
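The two steps can be sketched as a small Python function. This is only an illustration of the idea, not the code the answer used: the function name shift_key is hypothetical, shift IDs 0, 1 and 2 follow the example table, and the exact boundary handling (06:00:00 belongs to the first shift) is an assumption.

```python
from datetime import datetime, timedelta

def shift_key(ts):
    """Return (shift_day, shift_id) for a timestamp.

    Shifts (assumed boundaries): 0 = 06:00-14:00, 1 = 14:00-22:00,
    2 = 22:00-06:00 of the next day.
    """
    hour = ts.hour
    if 6 <= hour < 14:
        shift_id = 0
    elif 14 <= hour < 22:
        shift_id = 1
    else:
        shift_id = 2

    shift_day = ts.date()
    if hour < 6:
        # Early-morning entries belong to the shift that started the day before.
        shift_day -= timedelta(days=1)
    return shift_day, shift_id

entries = [
    datetime(2014, 9, 1, 6, 1),
    datetime(2014, 9, 1, 14, 1),
    datetime(2014, 9, 2, 2, 0),
]
keys = [shift_key(ts) for ts in entries]
for ts, key in zip(entries, keys):
    print(ts, key)
```

Grouping by the returned pair then puts all entries of one shift into the same group, even when the shift crosses midnight.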
If you add one hour to registrationtime, you will be able to group by the date part:
GROUP BY
CAST(DATEADD(HOUR, 1, registrationtime) AS date)
If the starting hour must be reflected accurately in the output (as 9, 23, 10, 23 rather than as 10, 0, 11, 0), you could obtain it as MIN(registrationtime) in the SELECT clause:
SELECT
count = COUNT(*),
day = DATEPART(DAY, MIN(registrationtime)),
hour = DATEPART(HOUR, MIN(registrationtime))
Finally, in case you are not aware, you can reference columns by their aliases in ORDER BY:
ORDER BY
day,
hour
just so that you do not have to repeat the expressions.
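Putting these fragments together, a runnable sketch might look like the following. It uses Python's sqlite3 module in place of SQL Server, so date(registrationtime, '+1 hour') stands in for CAST(DATEADD(HOUR, 1, registrationtime) AS date), and the sample timestamps are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE RegistrationMessageLogEntry (registrationtime TEXT)")

# Hypothetical log entries: two batches, each spanning midnight.
sample = [
    "2014-09-09 23:10:00", "2014-09-09 23:50:00", "2014-09-10 00:20:00",
    "2014-09-10 23:05:00", "2014-09-11 00:15:00", "2014-09-11 00:45:00",
]
conn.executemany(
    "INSERT INTO RegistrationMessageLogEntry VALUES (?)",
    [(ts,) for ts in sample],
)

# Adding one hour moves the 23:00 entries onto the next calendar day,
# so each batch falls on a single date and can be grouped by it.
batches = conn.execute("""
    SELECT COUNT(*) AS count,
           MIN(registrationtime) AS batch_start
    FROM RegistrationMessageLogEntry
    WHERE registrationtime > '2014-09-01 20:00'
    GROUP BY date(registrationtime, '+1 hour')
    ORDER BY batch_start
""").fetchall()

for count, batch_start in batches:
    print(count, batch_start)
```

Unlike grouping on DATEPART(DAY, ...), grouping on the shifted date also keeps batches apart across month boundaries.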
The query below will give you what you are expecting.
;WITH CTE AS
(
SELECT COUNT(*) Count, DATEPART(DAY,registrationtime) Day,DATEPART(HOUR,registrationtime) Hour,
RANK() over (partition by DATEPART(HOUR,registrationtime) order by DATEPART(DAY,registrationtime),DATEPART(HOUR,registrationtime)) Batch_ID
FROM RegistrationMessageLogEntry
WHERE registrationtime > '2014-09-01 20:00'
GROUP BY DATEPART(DAY, registrationtime), DATEPART(HOUR,registrationtime)
)
SELECT SUM(COUNT) Count,Batch_ID
FROM CTE
GROUP BY Batch_ID
ORDER BY Batch_ID
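You can write CASE expressions as below, so that hour 23 is counted with the following day and all other rows keep their own day and hour:
CASE WHEN DATEPART(HOUR,registrationtime) = 23
THEN DATEPART(DAY,registrationtime)+1
ELSE DATEPART(DAY,registrationtime)
END,
CASE WHEN DATEPART(HOUR,registrationtime) = 23
THEN 0
ELSE DATEPART(HOUR,registrationtime)
END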

create sequential number column index into table with data

I wanted to do something like this post, so I tried:
SELECT
ROW_NUMBER() OVER(ORDER BY t.[Data Saida] ) AS id,
t.codigo, t.[Data Saida], t.Entidade, t.DataEnt,
t.[GTEntT Nº], t.Estado, t.[GTSaida Nº], t.[Observações1],
t.Eequisitante, t.Certificado, T.Resultado, T.Seleccionar, t.[Tipo de Intervenção]
FROM
[Movimento ferramentas] t;
However I ended up getting something like
Syntax error (missing operator) in query expression ROW_NUMBER() OVER(ORDER BY t.[Data Saida] )
Is it because ROW_NUMBER() OVER() is SQL Server only or am I doing something wrong?
I'm working with MS Access 2010.
Here's a row from that table:
To add an AutoNumber field to an existing table, simply open the table in Design View, type in the Field Name and select "AutoNumber" from the drop-down list for the Data Type.
Access will populate the new field with AutoNumber values for any existing records in the table.
Edit re: influencing the order in which AutoNumber values are applied to existing records
As with many other database operations, there is essentially no guarantee that Access will use any particular order when assigning the AutoNumber values to existing records. However, if we look at a couple of examples we can see how Access will likely do it.
Consider the following sample table named [Events]. The rows were entered in random order and there is no primary key:
EventDate Event
---------- --------------
2012-06-01 it's June
2012-10-01 it's October
2012-09-01 it's September
2012-12-01 it's December
2012-11-01 it's November
2012-07-01 it's July
2012-04-01 it's April
2012-08-01 it's August
2012-02-01 it's February
2012-01-01 it's January
2012-03-01 it's March
2012-05-01 it's May
Now we'll simply add an AutoNumber field named [ID] using the procedure above. After that has been done
SELECT * FROM Events ORDER BY ID
returns
EventDate Event ID
---------- -------------- --
2012-06-01 it's June 1
2012-10-01 it's October 2
2012-09-01 it's September 3
2012-12-01 it's December 4
2012-11-01 it's November 5
2012-07-01 it's July 6
2012-04-01 it's April 7
2012-08-01 it's August 8
2012-02-01 it's February 9
2012-01-01 it's January 10
2012-03-01 it's March 11
2012-05-01 it's May 12
Now let's revert to the old copy of the table and see if the existence of a primary key makes a difference. We'll make [EventDate] the primary key, save the changes to the table, and then add the [ID] AutoNumber field. After that is done, the select statement above gives us
EventDate Event ID
---------- -------------- --
2012-06-01 it's June 1
2012-10-01 it's October 2
2012-09-01 it's September 3
2012-12-01 it's December 4
2012-11-01 it's November 5
2012-07-01 it's July 6
2012-04-01 it's April 7
2012-08-01 it's August 8
2012-02-01 it's February 9
2012-01-01 it's January 10
2012-03-01 it's March 11
2012-05-01 it's May 12
Hmmm, same thing. So it looks like the AutoNumber values get assigned to the table in natural order (the order in which the records were added to the table) even if there is a primary key.
Okay, if that's the case then let's use a make-table query to create a new copy of the table in a different order
SELECT Events.EventDate, Events.Event
INTO Events2
FROM Events
ORDER BY Events.EventDate;
Now let's add the [ID] AutoNumber field to that new table and see what we get:
SELECT * FROM Events2 ORDER BY ID
returns
EventDate Event ID
---------- -------------- --
2012-01-01 it's January 1
2012-02-01 it's February 2
2012-03-01 it's March 3
2012-04-01 it's April 4
2012-05-01 it's May 5
2012-06-01 it's June 6
2012-07-01 it's July 7
2012-08-01 it's August 8
2012-09-01 it's September 9
2012-10-01 it's October 10
2012-11-01 it's November 11
2012-12-01 it's December 12
If that is the order we want then we can just delete the [Events] table and rename [Events2] to [Events].
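The same "copy in the desired order, then number" idea can be tried outside Access. The sketch below uses Python's sqlite3 module, so it shows the general technique rather than Access behaviour: an AUTOINCREMENT key plays the role of the AutoNumber field, and only a few of the sample rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Events (EventDate TEXT, Event TEXT)")
# A few rows from the sample table, entered out of order.
conn.executemany("INSERT INTO Events VALUES (?, ?)", [
    ("2012-06-01", "it's June"),
    ("2012-10-01", "it's October"),
    ("2012-01-01", "it's January"),
    ("2012-03-01", "it's March"),
])

# New table with an auto-numbered ID (stand-in for an Access AutoNumber).
conn.execute("""
    CREATE TABLE Events2 (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        EventDate TEXT,
        Event TEXT
    )
""")
# Copy the rows in date order so the IDs are handed out in that order.
conn.execute("""
    INSERT INTO Events2 (EventDate, Event)
    SELECT EventDate, Event FROM Events ORDER BY EventDate
""")

numbered = conn.execute(
    "SELECT ID, EventDate FROM Events2 ORDER BY ID").fetchall()
for row in numbered:
    print(row)
```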

Join against date range, aggregate by SUM

I need to gather the SUM of sales made on a certain category item, grouped by day for a selected date range (anywhere from one week to 12 weeks), and return 0 instead of NULL for days where no transactions have occurred.
My original idea was to use a pre-populated table called "calendar" (shown below) which has about 10yrs of dates which I could LEFT JOIN my "products" table against to get days when no sales occurred as a 0 SUM.
Result was too large to deal with, so I'm trying to first copy the selected range of dates to an empty table called "datetable" which shares the same column names as "calendar". So I have 3 tables:
"calendar" table. It has 10 years worth of dates with following column names:
IsoDate DayNameOfWeek
2012-01-01 Sun
2012-01-02 Mon
2012-01-03 Tue
2012-01-04 Wed
2012-01-05 Thu
2012-01-06 Fri
2012-01-07 Sat
2012-01-08 Sun
2012-01-09 Mon
2012-01-10 Tue
etc for 10yrs
"datetable" table (this is created empty with two columns to prefill from "calendar" table so the date range data for the LEFT JOIN is more compact):
IsoDate DayNameOfWeek
"products" table. It is where I'm storing sales for each ProductCat:
ExpDate ProductCat Amount
2012-01-03 28 232
2012-01-04 29 100
2012-01-04 29 1002
2012-01-06 12 12
2012-01-06 29 9
2012-01-07 10 100
2012-01-07 29 122
2012-01-07 29 17
The output I'm looking for based on a single "ProductCat" number, in this case 29:
IsoDate DayNameOfWeek AmountSummed
2012-01-01 Sun 0
2012-01-02 Mon 0
2012-01-03 Tue 0
2012-01-04 Wed 1102
2012-01-05 Thu 0
2012-01-06 Fri 9
2012-01-07 Sat 139
2012-01-08 Sun 0
2012-01-09 Mon 0
2012-01-10 Tue 0
My code is below. The initial insert works fine but I'm not sure of the syntax that will make the second part with the JOIN and the SUM work:
INSERT INTO datetable (IsoDate, DayNameOfWeek)
SELECT IsoDate, DayNameOfWeek
FROM calendar
WHERE IsoDate
BETWEEN '2012-07-01' AND '2012-07-10'
SELECT ExpDate, SUM(IFNULL(Amount, 0))
AS AmountSummed
FROM products
WHERE ProductCat = 29
AND ExpDate BETWEEN '2012-07-01' AND '2012-07-10'
LEFT JOIN products
ON datetable.IsoDate=products.ExpDate
GROUP BY datetable.IsoDate
EDIT
This is the code that works now:
SELECT C.IsoDate,IFNULL(SUM(P.Amount),0) AS AmountSummed
FROM calendar C LEFT OUTER JOIN products P ON C.IsoDate=P.ExpDate
AND P.ProductCat = 29
WHERE C.IsoDate BETWEEN '2012-07-01' AND '2012-07-10'
GROUP BY C.IsoDate, C.DayNameOfWeek
ORDER BY C.IsoDate
You've pretty much got what you need. However, you don't need the datetable.
Your query should look like this, with the ProductCat filter kept in the ON clause so that days without sales are not filtered out after the join:
SELECT C.IsoDate, C.DayNameOfWeek, IFNULL(SUM(P.Amount),0) AS AmountSummed
FROM calendar C LEFT JOIN products P ON C.IsoDate=P.ExpDate
AND P.ProductCat = 29
WHERE C.IsoDate BETWEEN '2012-07-01' AND '2012-07-10'
GROUP BY C.IsoDate, C.DayNameOfWeek
ORDER BY C.IsoDate
If you really want to use your datetable, just substitute it for calendar and remove the C.IsoDate BETWEEN '2012-07-01' AND '2012-07-10' condition (assuming that the datetable was empty before you started), because datetable already contains exactly the dates you are looking for.
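Where the ProductCat filter sits matters for a LEFT JOIN: in the ON clause it only restricts which product rows are matched, while in the WHERE clause it also removes the calendar days that had no match. A small sketch to check this against the sample data from the question, using Python's sqlite3 module as a stand-in for MySQL (the January date range matches the sample rows rather than the July range used in the queries):

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar (IsoDate TEXT, DayNameOfWeek TEXT)")
conn.execute(
    "CREATE TABLE products (ExpDate TEXT, ProductCat INTEGER, Amount INTEGER)")

# Ten calendar days starting on Sunday 2012-01-01, as in the question.
names = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
start = date(2012, 1, 1)
conn.executemany("INSERT INTO calendar VALUES (?, ?)", [
    ((start + timedelta(days=i)).isoformat(), names[i % 7]) for i in range(10)
])
conn.executemany("INSERT INTO products VALUES (?, ?, ?)", [
    ("2012-01-03", 28, 232), ("2012-01-04", 29, 100), ("2012-01-04", 29, 1002),
    ("2012-01-06", 12, 12), ("2012-01-06", 29, 9), ("2012-01-07", 10, 100),
    ("2012-01-07", 29, 122), ("2012-01-07", 29, 17),
])

query = """
    SELECT C.IsoDate, C.DayNameOfWeek, IFNULL(SUM(P.Amount), 0) AS AmountSummed
    FROM calendar C
    LEFT JOIN products P ON C.IsoDate = P.ExpDate {on_filter}
    WHERE C.IsoDate BETWEEN '2012-01-01' AND '2012-01-10' {where_filter}
    GROUP BY C.IsoDate, C.DayNameOfWeek
    ORDER BY C.IsoDate
"""

# Filter in ON: every calendar day is kept, unmatched days sum to 0.
on_rows = conn.execute(
    query.format(on_filter="AND P.ProductCat = 29", where_filter="")).fetchall()

# Filter in WHERE: days without a category 29 sale disappear.
where_rows = conn.execute(
    query.format(on_filter="", where_filter="AND P.ProductCat = 29")).fetchall()

print(len(on_rows), len(where_rows))
```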