In SQLite, I am trying to combine two tables. Specifically, I want to match each lab result date to a diagnosis date that falls within 0-7 days of follow-up (minimum 0 days, i.e. the same day, up to a maximum of 7 days). I have attached the tables here (note: the ID, ENCID, lab result date, and diag_date values are not real). Is there a way to combine both tables so that the first row of Table 1 (10/29/2020) is not attached to the DIAG_DATE of 11/19/2020, since it falls outside the 7-day window? If not, is it possible in Python?
Table 1
ID ENCID LAB RESULT DATE
1 098 10/29/2020
1 098 11/17/2020
1 098 11/15/2020
1 098 11/12/2020
1 098 11/19/2020
Table 2
ID ENCID DIAG_DATE
1 098 11/19/2020
1 098 10/01/2021
My goal:
Table 3
ID ENCID LAB_RESULT_DATE DIAG_DATE
1 098 11/12/2020 11/19/2020
1 098 11/15/2020 11/19/2020
1 098 11/17/2020 11/19/2020
1 098 11/19/2020 11/19/2020
Here is my SQLite code below (I am aware this is not right):
CREATE TABLE table3 AS
SELECT *
FROM table1
JOIN table2
WHERE table1.ID=table2.ID AND table1.ENCID=table2.ENCID AND DIAG_DATE >= LAB_RESULT_DATE
HAVING MAX(DIAG_DATE)>MIN(LAB_RESULT_DATE)
ORDER BY table1.ID ASC
You can join both tables on their ENCID and dates.
You need to check whether the time frame of the second ON condition is enough to capture all dates and times; otherwise you can adjust it, for example by adding '-10 seconds'.
SELECT t1.*, t2."DIAG_DATE"
FROM tab1 t1
JOIN tab2 t2
  ON t1."ENCID" = t2."ENCID"
 AND "LAB RESULT DATE" BETWEEN DATE("DIAG_DATE", '-7 day') AND "DIAG_DATE"
ID ENCID LAB RESULT DATE DIAG_DATE
1 98 2020-11-17 01:00:00 2020-11-19 01:00:00
1 98 2020-11-15 01:00:00 2020-11-19 01:00:00
1 98 2020-11-12 01:00:00 2020-11-19 01:00:00
1 98 2020-11-19 01:00:00 2020-11-19 01:00:00
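The question also asks about Python; a minimal, runnable sketch using Python's built-in sqlite3 module is below. It assumes the dates have first been converted to ISO format (YYYY-MM-DD), since SQLite's date functions do not understand MM/DD/YYYY strings, and it joins on ID as well as ENCID, as in the question's own query.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE table1 (ID INTEGER, ENCID TEXT, LAB_RESULT_DATE TEXT);
CREATE TABLE table2 (ID INTEGER, ENCID TEXT, DIAG_DATE TEXT);
INSERT INTO table1 VALUES
  (1,'098','2020-10-29'), (1,'098','2020-11-17'), (1,'098','2020-11-15'),
  (1,'098','2020-11-12'), (1,'098','2020-11-19');
INSERT INTO table2 VALUES (1,'098','2020-11-19'), (1,'098','2021-10-01');
""")

# Keep a lab result only when it falls 0-7 days before the diagnosis date.
rows = con.execute("""
SELECT t1.ID, t1.ENCID, t1.LAB_RESULT_DATE, t2.DIAG_DATE
FROM table1 t1
JOIN table2 t2
  ON t1.ID = t2.ID
 AND t1.ENCID = t2.ENCID
 AND t1.LAB_RESULT_DATE BETWEEN DATE(t2.DIAG_DATE, '-7 day') AND t2.DIAG_DATE
ORDER BY t1.LAB_RESULT_DATE
""").fetchall()
for r in rows:
    print(r)
```

The 10/29/2020 row drops out because it is more than 7 days before the 11/19/2020 diagnosis, which matches the Table 3 described in the question.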
Need help tallying fork truck training completions at work. Here is an example of the tables I have, and the table I need to create:
table 1:
date       | is_work_day
2023-01-25 | 1
2023-01-26 | 1
2023-01-27 | 1
2023-01-28 | 0
2023-01-29 | 1
2023-01-30 | 0
table 2:
employee_id | training_passed | test_date
001         | 1               | 2023-01-25
002         | 1               | 2023-01-26
003         | 0               | 2023-01-26
004         | 1               | 2023-01-26
005         | 0               | 2023-01-27
006         | 1               | 2023-01-29
need table:
date       | cumulative_passed_training
2023-01-26 | 2
2023-01-27 | 2
2023-01-29 | 3
The table should count the total passed trainings, but only starting on 2023-01-26 and should only show dates that are work days. Any help would be greatly appreciated.
I think I need to JOIN the two tables, and then SUM the training_passed column, but am unsure how to get it to start at a certain date, and how to make it only show work days on the final table.
JOIN on the date column and add the passed-test check as a JOIN condition. Also GROUP BY the date so you can count per day:
select t1.date, count(t2.employee_id)
from table1 t1
join table2 t2
  on t1.date = t2.test_date
 and t2.training_passed = 1
group by t1.date
It would make no difference if you put the condition
t2.training_passed = 1
in a where clause instead of the INNER JOIN.
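To get from the per-day count to the cumulative total that starts on a given date and skips non-work days, a window function can be layered on top (available in SQLite 3.25+ and most other engines). A runnable sketch of this idea with Python's built-in sqlite3 module and the sample data from the question; the LEFT JOIN keeps work days with zero passes in the running total:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE table1 (date TEXT, is_work_day INTEGER);
CREATE TABLE table2 (employee_id TEXT, training_passed INTEGER, test_date TEXT);
INSERT INTO table1 VALUES
  ('2023-01-25',1),('2023-01-26',1),('2023-01-27',1),
  ('2023-01-28',0),('2023-01-29',1),('2023-01-30',0);
INSERT INTO table2 VALUES
  ('001',1,'2023-01-25'),('002',1,'2023-01-26'),('003',0,'2023-01-26'),
  ('004',1,'2023-01-26'),('005',0,'2023-01-27'),('006',1,'2023-01-29');
""")

# Inner query: passes per work day from the start date onward.
# Outer query: running total via SUM() OVER (ORDER BY date).
rows = con.execute("""
SELECT date,
       SUM(daily_passed) OVER (ORDER BY date) AS cumulative_passed_training
FROM (
    SELECT t1.date AS date, COUNT(t2.employee_id) AS daily_passed
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t1.date = t2.test_date AND t2.training_passed = 1
    WHERE t1.is_work_day = 1 AND t1.date >= '2023-01-26'
    GROUP BY t1.date
) AS d
ORDER BY date
""").fetchall()
for r in rows:
    print(r)
```

Note that with the LEFT JOIN used here, unlike the INNER JOIN in the answer, moving `t2.training_passed = 1` to the WHERE clause would change the result: it would drop 2023-01-27, the work day with no passes.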
In AWS Athena, I am trying to join two tables in the db using the date, but one of the tables (table2) is not clean, and contains values that are not dates, as shown below.
| table2.date |
| ---- |
|6/02/2021|
|9/02/2021|
|1431 BEL & 1628 BEL."|
|15/02/2021|
|and failed to ....|
|18/02/2021|
|19/02/2021|
I am not able to have any influence in cleaning this table up.
My current query is:
SELECT *
FROM table1
LEFT JOIN table2
ON table1.operation_date = cast(date_parse(table2."date",'%d/%m/%Y') as date)
LIMIT 10;
I've tried using regexp_like(col, '[a-z]'), but this still leaves the values that are numerical but not dates.
How do I get the query to ignore the values that are not dates?
You may wrap the conversion expression in the try function, which resolves to NULL when the conversion fails.
select
  try(date_parse(col, '%d/%m/%Y'))
from (values
    ('6/02/2021'),
    ('9/02/2021'),
    ('1431 BEL & 1628 BEL.'),
    ('15/02/2021'),
    ('and failed to ....'),
    ('18/02/2021'),
    ('19/02/2021')
) as t(col)
 # | _col0
---+------------------------
 1 | 2021-02-06 00:00:00.000
 2 | 2021-02-09 00:00:00.000
 3 | NULL
 4 | 2021-02-15 00:00:00.000
 5 | NULL
 6 | 2021-02-18 00:00:00.000
 7 | 2021-02-19 00:00:00.000
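If the cleanup ever has to happen outside Athena, the same pattern can be mirrored in plain Python, where a failed parse becomes None just as try() yields NULL. The try_parse helper below is hypothetical, not an Athena function:

```python
from datetime import datetime

def try_parse(value, fmt="%d/%m/%Y"):
    """Return a date for parseable strings, None otherwise (like Athena's try())."""
    try:
        return datetime.strptime(value, fmt).date()
    except ValueError:
        return None

raw = ["6/02/2021", "9/02/2021", "1431 BEL & 1628 BEL.",
       "15/02/2021", "and failed to ....", "18/02/2021", "19/02/2021"]
parsed = [try_parse(v) for v in raw]  # non-dates become None
```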
I have two tables like so -
Table 1 -
patient admit_dt discharge_dt
323 2020-01-09 2020-02-01
323 2020-02-18 2020-02-27
231 2020-02-13 2020-02-17
Table 2 -
patient admit_dt discharge_dt
323 2020-02-05 2020-02-07
231 2020-02-23 2020-02-28
The output I am needing is
patient
323
The logic is - if one patient goes from table 1 into table 2 and ends up back in table 1 within 30 days, we want to count them in the output.
Patient 231 is not included in the result because they didn't go back to table 1.
If I understand correctly, you can use join:
select t1.patient
from table1 t1
join table2 t2
  on t2.patient = t1.patient
 and t2.admit_dt > t1.discharge_dt
join table1 tt1
  on tt1.patient = t1.patient
 and tt1.admit_dt > t2.discharge_dt;
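A runnable check of this idea, sketched with Python's built-in sqlite3 (the question does not state its dialect). The 30-day rule, which the join above leaves implicit, is made explicit here with julianday(); that extra condition is an assumption layered on top of the answer:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE table1 (patient INTEGER, admit_dt TEXT, discharge_dt TEXT);
CREATE TABLE table2 (patient INTEGER, admit_dt TEXT, discharge_dt TEXT);
INSERT INTO table1 VALUES
  (323,'2020-01-09','2020-02-01'),
  (323,'2020-02-18','2020-02-27'),
  (231,'2020-02-13','2020-02-17');
INSERT INTO table2 VALUES
  (323,'2020-02-05','2020-02-07'),
  (231,'2020-02-23','2020-02-28');
""")

# table1 stay -> later table2 stay -> back in table1 within 30 days
rows = con.execute("""
SELECT DISTINCT t1.patient
FROM table1 t1
JOIN table2 t2
  ON t2.patient = t1.patient
 AND t2.admit_dt > t1.discharge_dt
JOIN table1 tt1
  ON tt1.patient = t1.patient
 AND tt1.admit_dt > t2.discharge_dt
 AND julianday(tt1.admit_dt) - julianday(t2.discharge_dt) <= 30
""").fetchall()
```

Patient 323 qualifies (readmitted to table 1 eleven days after the table 2 discharge); patient 231 has no table 1 stay after the table 2 stay, so the third join finds no row.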
I am dealing with the following problem in SQL (using Vertica):
In short -- Create a timeline for each ID (in a table where I have multiple lines, orders in my example, per ID)
What I would like to achieve -- At my disposal I have a table of historical order dates, and I would like to compute rates of new customers (first order ever in the past month), active customers (>1 order in the last 1-3 months), passive customers (no order for the last 3-6 months) and inactive customers (no order for >6 months).
Which steps I have taken so far -- I was able to construct a table similar to the example presented below:
CustomerID | Current order date  | Time between current/previous order | First order date (all-time)
001        | 2015-04-30 12:06:58 | (null)                              | 2015-04-30 12:06:58
001        | 2015-09-24 17:30:59 | 147 05:24:01                        | 2015-04-30 12:06:58
001        | 2016-02-11 13:21:10 | 139 19:50:11                        | 2015-04-30 12:06:58
002        | 2015-10-21 10:38:29 | (null)                              | 2015-10-21 10:38:29
003        | 2015-05-22 12:13:01 | (null)                              | 2015-05-22 12:13:01
003        | 2015-07-09 01:04:51 | 47 12:51:50                         | 2015-05-22 12:13:01
003        | 2015-10-23 00:23:48 | 105 23:18:57                        | 2015-05-22 12:13:01
A little bit of intuition: customer 001 placed three orders from which the second one was 147 days after its first order. Customer 002 has only placed one order in total.
What I think the next steps should be -- I would like to know, for each date (including dates on which a certain user did not place an order) and for each CustomerID, how long it has been since his/her last order. This implies creating some sort of timeline for each CustomerID. In the example presented above I would get 287 lines for each CustomerID (the days between the 1st of May 2015 and the 11th of February 2016, the timespan of this table). I have difficulties solving this step.

When I have performed this step, I want to create a field which shows, at each date, the last order date, the period between the last order date and the current date, and what state someone is in at the current date. For the example presented earlier, this would look something like this:
CustomerID | Last order date     | Current date        | Time between current date/last order | State
001        | 2015-04-30 12:06:58 | 2015-05-01 00:00:00 | 0 00:00:00                           | New
...
001        | 2015-04-30 12:06:58 | 2015-06-30 00:00:00 | 60 11:53:02                          | Active
...
001        | 2015-09-24 17:30:59 | 2016-02-01 00:00:00 | 129 11:53:02                         | Passive
...
...
002        | 2015-10-21 17:30:59 | 2015-10-22 00:00:00 | 0 06:29:01                           | New
...
002        | 2015-10-21 17:30:59 | 2015-11-30 00:00:00 | 39 06:29:01                          | Active
...
...
003        | 2015-05-22 12:13:01 | 2015-06-23 00:00:00 | 31 11:46:59                          | Active
...
003        | 2015-07-09 01:04:51 | 2015-10-22 00:00:00 | 105 11:46:59                         | Inactive
...
At the dots there should be all the inbetween dates but for sake of space I have left these out of the table.
When I know for each date what the state is of each customer (active/passive/inactive) my plan is to sum the states and group by date which should give me the sum of new, active, passive and inactive customers. From here on I can easily compute the rates at each date.
Anybody that knows how I can possibly achieve this task?
Note -- If anyone has other ideas how to achieve the goal presented above (using some other approach compared to the approach I had in mind) please let me know!
EDIT
Suppose you start from a table like this:
SQL> select * from ord order by custid, ord_date ;
custid | ord_date
--------+---------------------
1 | 2015-04-30 12:06:58
1 | 2015-09-24 17:30:59
1 | 2016-02-11 13:21:10
2 | 2015-10-21 10:38:29
3 | 2015-05-22 12:13:01
3 | 2015-07-09 01:04:51
3 | 2015-10-23 00:23:48
(7 rows)
You can use Vertica's Time Series Analytic Functions TS_FIRST_VALUE() and TS_LAST_VALUE() to fill the gaps and carry the last order date forward to the current date. You just have to combine the original rows with a time series generated from the same table at an interval of one day, starting from the day each customer placed his/her first order and running up to now (current_date):
select
custid,
status_dt,
last_order_dt,
case
when status_dt::date - last_order_dt::date < 30 then case
when nord = 1 then 'New' else 'Active' end
when status_dt::date - last_order_dt::date < 90 then 'Active'
when status_dt::date - last_order_dt::date < 180 then 'Passive'
else 'Inactive'
end as status
from (
select
custid,
last_order_dt,
status_dt,
conditional_true_event (first_order_dt is null or
last_order_dt > lag(last_order_dt))
over(partition by custid order by status_dt) as nord
from (
select
custid,
ts_first_value(ord_date) as first_order_dt ,
ts_last_value(ord_date) as last_order_dt ,
dt::date as status_dt
from
( select custid, ord_date from ord
union all
select distinct(custid) as custid, current_date + 1 as ord_date from ord
) z timeseries dt as '1 day' over (partition by custid order by ord_date)
) x
) y
where status_dt <= current_date
order by 1, 2
;
And you will get something like this:
custid | status_dt | last_order_dt | status
--------+------------+---------------------+---------
1 | 2015-04-30 | 2015-04-30 12:06:58 | New
1 | 2015-05-01 | 2015-04-30 12:06:58 | New
1 | 2015-05-02 | 2015-04-30 12:06:58 | New
...
1 | 2015-05-29 | 2015-04-30 12:06:58 | New
1 | 2015-05-30 | 2015-04-30 12:06:58 | Active
1 | 2015-05-31 | 2015-04-30 12:06:58 | Active
...
etc.
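For readers without Vertica, the same gap-fill-and-classify idea can be sketched in plain Python. The daily_status helper below is hypothetical; the 30/90/180-day thresholds follow the SQL above, and the order dates are the sample data truncated to days:

```python
from datetime import date, timedelta

# Sample order history per customer (dates only).
orders = {
    1: [date(2015, 4, 30), date(2015, 9, 24), date(2016, 2, 11)],
    2: [date(2015, 10, 21)],
    3: [date(2015, 5, 22), date(2015, 7, 9), date(2015, 10, 23)],
}

def daily_status(order_dates, up_to):
    """One (day, last_order, status) row per day, from first order to up_to."""
    order_dates = sorted(order_dates)
    timeline, last, nord = [], None, 0
    d = order_dates[0]
    while d <= up_to:
        # Carry the last order date forward (the TS_LAST_VALUE step).
        while nord < len(order_dates) and order_dates[nord] <= d:
            last = order_dates[nord]
            nord += 1
        age = (d - last).days
        if age < 30:
            status = 'New' if nord == 1 else 'Active'
        elif age < 90:
            status = 'Active'
        elif age < 180:
            status = 'Passive'
        else:
            status = 'Inactive'
        timeline.append((d, last, status))
        d += timedelta(days=1)
    return timeline

tl = daily_status(orders[1], date(2016, 2, 15))
```

Summing the per-day statuses over all customers then gives the daily new/active/passive/inactive rates the question asks for.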
I'm trying to write a script to load records into a table based on the data in another table.
The steps are listed below.
This is in SQL and I'm new to programming. Any help with the code will be much appreciated.
Table1 - Source table and has 3 columns
Material Num | Mfg Date | Sold Date
1            | 1/1/2013 | 1/15/2013
2            | 2/1/2013 | null
3            | 3/1/2013 | null
4            | 4/1/2013 | 4/15/2013
5            | 5/1/2013 | 5/15/2013
Table2 - Target table
Month/Yr | Count
Jan-13   | 0
Feb-13   | 0
Mar-13   | 1
Apr-13   | 2
May-13   | 2