I have two tables. Let's call the first one A and the other B.
A is:
ID  Doc_ID  Date
1   1a      1-Jan-2020
1   1a      1-Feb-2020
1   1b      1-Mar-2020
2   1a      1-Jan-2020
B is:
ID  Doc2_ID  Date
1   2a       1-Mar-2020
1   2a       1-Apr-2020
2   2b       1-Feb-2020
2   2a       1-Mar-2020
Now, using SQL, I want to create a table that has all the values in table A plus the difference between the date in table A and the closest date in table B. For example, 1-Jan-2020 should be subtracted from 1-Mar-2020 and, similarly, 1-Feb-2020 should be subtracted from 1-Mar-2020. Can you please help me with this?
I am using the query below in Azure Databricks:
%sql
SELECT a.ID, a.Doc_ID, DATEDIFF(b.DATE, a.DATE) as day FROM a
LEFT JOIN b
ON a.ID = b.ID
AND a.DATE < b.DATE
But this generates more than one row per row of A in the results, i.e. it subtracts from every date in table B that fulfils the join conditions. For example, it subtracts 1-Jan-2020 from both 1-Mar-2020 and 1-Apr-2020, whereas I want it to subtract only from the closest date in table B, i.e. 1-Mar-2020.
The expected outcome should be:
ID  Doc_ID  day
1   1a      59
1   1a      30
1   1b      0
2   1a      30
The day column for the first two rows was obtained by subtracting the respective dates in table A from 1-Mar-2020, i.e. the closest date in table B for ID 1.
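One possible way to get a single row per row of A is to look up only the earliest qualifying B date with a correlated subquery. The sketch below is not Databricks code: it runs the idea on SQLite through Python's sqlite3 module, stores the dates in ISO format, and uses julianday() in place of DATEDIFF. It also assumes `>=` rather than `<`, because the expected 0 for the 1b row implies that a same-day match should count. In Spark SQL the same idea could presumably be written as DATEDIFF(MIN(b.Date), a.Date) over the existing join with a GROUP BY on the columns of A.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (ID INTEGER, Doc_ID TEXT, Date TEXT);
    CREATE TABLE b (ID INTEGER, Doc2_ID TEXT, Date TEXT);
    -- Sample rows from the question, with dates rewritten as ISO strings (assumption)
    INSERT INTO a VALUES (1, '1a', '2020-01-01'), (1, '1a', '2020-02-01'),
                         (1, '1b', '2020-03-01'), (2, '1a', '2020-01-01');
    INSERT INTO b VALUES (1, '2a', '2020-03-01'), (1, '2a', '2020-04-01'),
                         (2, '2b', '2020-02-01'), (2, '2a', '2020-03-01');
""")

# For each row of a, pick only the earliest b.Date on or after a.Date
# for the same ID, then take the difference in whole days.
closest_query = """
SELECT a.ID,
       a.Doc_ID,
       CAST(julianday((SELECT MIN(b.Date)
                         FROM b
                        WHERE b.ID = a.ID
                          AND b.Date >= a.Date))
            - julianday(a.Date) AS INTEGER) AS day
FROM a
ORDER BY a.ID, a.Date
"""
closest_rows = conn.execute(closest_query).fetchall()
for row in closest_rows:
    print(row)
```

Because 2020 is a leap year, the exact calendar differences for this data come out as 60, 29, 0 and 31 rather than the rounded 59, 30, 0 and 30 listed in the question.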
This is my first post here, and the first problem I haven't been able to find a solution to on my own. I have a MainTable that contains the fields Date and MinutesActiveWork (and other fields that are not relevant). I have a second table that contains the fields ID, id_Workarea, GoalOfActiveMinutes and GoalActiveFrom.
I want to make a query that returns all records from MainTable, and the active goal for the date.
Example:
Maintable (Date = dd/mm/yyyy)
ID Date ActvWrkMin WrkAreaID
1 01-01-2019 45 1
2 02-01-2019 50 1
3 03-01-2019 48 1
GoalTable:
ID id_Workarea Goal GlActvFrm
1 1 45 01-01-2019
2 2 90 01-01-2019
3 1 50 03-01-2019
What I want from my query:
IDMain Date ActvWrkMin Goal WrkAreaID
1 01-01-2019 45 45 1
2 02-01-2019 50 45 1
3 03-01-2019 48 50 1
The query that I have now is really close to what I want. The problem is that it outputs every goal whose start date is earlier than the date from MainTable (it makes sense why, but I don't know what criteria to add to fix it). Like so:
IDMain Date ActvWrkMin Goal WrkAreaID
1 01-01-2019 45 45 1
2 02-01-2019 50 45 1
3 03-01-2019 48 45 1 <-- Don't want this one
3 03-01-2019 48 50 1
My query
SELECT tblMain.Date, tblMain.ActiveWorkMins, tblGoal.Goal
FROM VtblSumpMain AS tblMain LEFT JOIN (
SELECT VtblGoalsForWorkareas.idWorkArea, VtblGoalsForWorkareas.Goal, VtblGoalsForWorkareas.GoalActiveFrom (THIS IS THE DATE FIELD)
FROM VtblGoalsForWorkareas
WHERE VtblGoalsForWorkareas.idWorkArea= 1) AS tblGoal ON tblMain.Date > tblGoal.GoalActiveFrom
ORDER BY tblMain.Date
(I know I could do this pretty simply with DLookup, but that is just not fast enough.)
Thanks for any advice!
For this, I think you need a correlated subquery that picks the most recent goal on or before each date, like the one below.
select tblMain.id,tblMain.Date,tblMain.ActvWrkMin, tblMain.WrkAreaID,
(select top 1 Goal
from GoalTable as gtbl
where gtbl.id_workarea = 1
and tblmain.[Date] >= gtbl.glActvFrm order by gtbl.glActvFrm desc) as Goal
from Maintable as tblMain
Check the below image for the result which is generated from this query.
I hope this will solve your issue.
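As a rough check of the subquery approach against the sample data, here is a small sketch run on SQLite via Python's sqlite3 module rather than Access. It makes a few assumptions that are not in the answer: dates are stored as ISO strings so that text comparison orders them correctly, LIMIT 1 stands in for TOP 1, and the subquery is correlated on the work area instead of hard-coding 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Maintable (ID INTEGER, Date TEXT, ActvWrkMin INTEGER, WrkAreaID INTEGER);
    CREATE TABLE GoalTable (ID INTEGER, id_Workarea INTEGER, Goal INTEGER, GlActvFrm TEXT);
    -- Sample rows from the question, with dd-mm-yyyy rewritten as ISO dates (assumption)
    INSERT INTO Maintable VALUES (1, '2019-01-01', 45, 1),
                                 (2, '2019-01-02', 50, 1),
                                 (3, '2019-01-03', 48, 1);
    INSERT INTO GoalTable VALUES (1, 1, 45, '2019-01-01'),
                                 (2, 2, 90, '2019-01-01'),
                                 (3, 1, 50, '2019-01-03');
""")

# For every main row, the subquery returns at most one goal:
# the one with the latest GlActvFrm that is not after the row's date.
goal_query = """
SELECT m.ID, m.Date, m.ActvWrkMin,
       (SELECT g.Goal
          FROM GoalTable AS g
         WHERE g.id_Workarea = m.WrkAreaID
           AND g.GlActvFrm <= m.Date
         ORDER BY g.GlActvFrm DESC
         LIMIT 1) AS Goal,
       m.WrkAreaID
FROM Maintable AS m
ORDER BY m.Date
"""
goal_rows = conn.execute(goal_query).fetchall()
for row in goal_rows:
    print(row)
```

Because the subquery yields a single value per main row, the extra row for 03-01-2019 with goal 45 does not appear.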
We have thousands of records in our data and want to count jobs per date and per category in a single query. Is it possible?
The required output looks like this:
TypesJobs 01 02 03 04 05 06 07
A 2 1 6 4 1 3 4
B 10 12 8 10 12 9 13
C 3 5 4 3 2 5 4
Here each cell is the count of jobs of that type for one day; the columns 01, 02, 03, ... are the days of the month.
You can use conditional aggregation, something like this:
select typesjobs,
sum(case when month(datecol) = 1 then 1 else 0 end) as month_01,
sum(case when month(datecol) = 2 then 1 else 0 end) as month_02,
. . .
from t
where <date condition here>
group by typesjobs;
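Since the columns in the question appear to be days of the month, a day-based variant of the same conditional aggregation might look like the sketch below. It runs on SQLite through Python's sqlite3 module; the table name jobs, the sample rows and the March 2024 date range are all made up for illustration, and strftime('%d', ...) is SQLite's way of extracting the day, standing in for whatever day function the real database offers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (typesjobs TEXT, datecol TEXT);
    -- Hypothetical sample data; the last row falls outside the filtered month
    INSERT INTO jobs VALUES ('A', '2024-03-01'), ('A', '2024-03-01'), ('A', '2024-03-02'),
                            ('B', '2024-03-01'), ('B', '2024-03-03'),
                            ('C', '2024-03-02'),
                            ('A', '2024-04-01');
""")

# Build one SUM(CASE ...) column per day instead of typing each one out.
# Only days 1-3 are generated here to keep the output short.
day_columns = ",\n       ".join(
    f"SUM(CASE WHEN strftime('%d', datecol) = '{d:02d}' THEN 1 ELSE 0 END) AS day_{d:02d}"
    for d in range(1, 4)
)
pivot_query = f"""
SELECT typesjobs,
       {day_columns}
FROM jobs
WHERE datecol >= '2024-03-01' AND datecol < '2024-04-01'
GROUP BY typesjobs
ORDER BY typesjobs
"""
pivot_rows = conn.execute(pivot_query).fetchall()
for row in pivot_rows:
    print(row)
```

Each row of the result holds one job type followed by its count for day 01, 02 and 03 of the filtered month.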
In SQL Server I have two tables that look like this:
TEST SCRIPT 'a collection of test scripts'
(PK)
ID Description Count
------------------------
A12 Proj/Num/Dev 12
B34 Gone/Tri/Tel 43
C56 Geff/Ben/Dan 03
SCRIPT HISTORY 'the history of the aforementioned scripts'
(FK) (PK)
ScriptID ID Machine Date Time Passes
----------------------------------
A12 01 DEV012 6/26/15 16:54 4
A12 02 DEV596 6/28/15 13:12 9
A12 03 COM199 3/12/14 14:22 10
B34 04 COM199 6/30/13 15:45 12
B34 05 DEV012 6/30/15 13:13 14
B34 06 DEV444 6/12/15 11:14 14
C56 07 COM321 6/29/14 02:19 12
C56 08 ANS042 6/24/14 20:10 18
C56 09 COM432 6/30/15 12:24 4
C56 10 DEV444 4/20/12 23:55 2
In a single query, how would I write a select statement that takes just one entry for each DISTINCT script in TEST SCRIPT and pairs it with the values in only the TOP 1 most recent run time in SCRIPT HISTORY?
For example, the solution to the example tables above would be:
OUTPUT
ScriptID ID Machine Date Time Passes
---------------------------------------------------
A12 02 DEV596 6/28/15 13:12 9
B34 05 DEV012 6/30/15 13:13 14
C56 09 COM432 6/30/15 12:24 4
The way you describe the problem maps almost directly onto CROSS APPLY:
select h.*
from testscript ts cross apply
(select top 1 h.*
from history h
where h.scriptid = ts.id
order by h.date desc, h.time desc
) h;
Please try something like this:
select *
from SCRIPT SCR
left join (select MAX(SCRIPT_HISTORY.Date) as Date, SCRIPT_HISTORY.ScriptID
from SCRIPT_HISTORY
group by SCRIPT_HISTORY.ScriptID
) SH on SCR.ID = SH.ScriptID
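Note that a join on MAX(Date) alone yields only the latest date per script, not the machine, time or passes of that run. A commonly used alternative that returns the whole latest row is ROW_NUMBER() partitioned by script. The sketch below tries it on SQLite (3.25 or newer, for window functions) through Python's sqlite3 module; the table names test_script and script_history are placeholders, and the dates are stored as ISO strings, which is an assumption made so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_script (ID TEXT PRIMARY KEY, Description TEXT, Count INTEGER);
    CREATE TABLE script_history (ScriptID TEXT, ID TEXT PRIMARY KEY, Machine TEXT,
                                 Date TEXT, Time TEXT, Passes INTEGER);
    INSERT INTO test_script VALUES ('A12', 'Proj/Num/Dev', 12),
                                   ('B34', 'Gone/Tri/Tel', 43),
                                   ('C56', 'Geff/Ben/Dan', 3);
    -- History rows from the question, with m/d/yy rewritten as ISO dates (assumption)
    INSERT INTO script_history VALUES
        ('A12', '01', 'DEV012', '2015-06-26', '16:54', 4),
        ('A12', '02', 'DEV596', '2015-06-28', '13:12', 9),
        ('A12', '03', 'COM199', '2014-03-12', '14:22', 10),
        ('B34', '04', 'COM199', '2013-06-30', '15:45', 12),
        ('B34', '05', 'DEV012', '2015-06-30', '13:13', 14),
        ('B34', '06', 'DEV444', '2015-06-12', '11:14', 14),
        ('C56', '07', 'COM321', '2014-06-29', '02:19', 12),
        ('C56', '08', 'ANS042', '2014-06-24', '20:10', 18),
        ('C56', '09', 'COM432', '2015-06-30', '12:24', 4),
        ('C56', '10', 'DEV444', '2012-04-20', '23:55', 2);
""")

# Number each script's runs from newest to oldest, then keep only run number 1.
latest_query = """
SELECT ScriptID, ID, Machine, Date, Time, Passes
FROM (SELECT h.*,
             ROW_NUMBER() OVER (PARTITION BY h.ScriptID
                                ORDER BY h.Date DESC, h.Time DESC) AS rn
        FROM script_history AS h
        JOIN test_script AS ts ON ts.ID = h.ScriptID) AS ranked
WHERE rn = 1
ORDER BY ScriptID
"""
latest_rows = conn.execute(latest_query).fetchall()
for row in latest_rows:
    print(row)
```

On the sample data this keeps history rows 02, 05 and 09, which matches the output requested in the question.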
I am using DB2 to take a table, split it into partitions and then order rows within each partition. The table I have is like:
ID DATE EVENT
-- ---- -----
01 1999-06-01 a
01 1999-06-01 b
01 2006-01-01 a
01 2011-12-31 c
02 1999-01-01 a
02 2003-01-01 a
02 2003-01-01 b
02 2009-11-12 b
where I want to order it to get the following...
ID DATE EVENT SEQUENCE
-- ---- ----- --------
01 1999-06-01 a 1
01 1999-06-01 b 1
01 2006-01-01 a 2
01 2011-12-31 c 3
02 1999-01-01 a 1
02 2003-01-01 a 2
02 2003-01-01 b 2
02 2009-11-12 b 3
so I am trying:
select a.*, row_number() over(partition by ID,order by DATE) from mytable a
which gives me:
ID DATE EVENT SEQUENCE
-- ---- ----- --------
01 1999-06-01 a 1
01 1999-06-01 b 2
01 2006-01-01 a 3
01 2011-12-31 c 4
02 1999-01-01 a 1
02 2003-01-01 a 2
02 2003-01-01 b 3
02 2009-11-12 b 4
where as you can see, even though a consecutive row may have the same date as the previous row, this is ignored and the SEQUENCE column is iterated.
How do I ensure that if the next row has the same date that the sequence is preserved until a row with a later date appears?
Thanks very much.
Clearly, the row_number() function would not return the same number for different rows within the window. You need to use the dense_rank() function.
By the way, your query has a syntax error, and it is not a good idea to use reserved words ('DATE' in this case) for column names.
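You could use the DENSE_RANK function instead, which assigns the same rank to rows that share the same ordering value and leaves no gaps after ties. Ordering by DATE ascending gives the earliest date in each ID the value 1, as in your desired output:
select a.*, DENSE_RANK() OVER(PARTITION BY ID ORDER BY DATE) AS SEQUENCE from mytable a;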
References:
Using OLAP specifications
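To see how DENSE_RANK behaves on the question's data, here is a small sketch that runs on SQLite (3.25 or newer, for window functions) through Python's sqlite3 module rather than DB2. It orders by date ascending so that the earliest date within each ID gets 1; the SEQUENCE alias is taken from the desired output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (ID TEXT, DATE TEXT, EVENT TEXT);
    INSERT INTO mytable VALUES
        ('01', '1999-06-01', 'a'), ('01', '1999-06-01', 'b'),
        ('01', '2006-01-01', 'a'), ('01', '2011-12-31', 'c'),
        ('02', '1999-01-01', 'a'), ('02', '2003-01-01', 'a'),
        ('02', '2003-01-01', 'b'), ('02', '2009-11-12', 'b');
""")

# DENSE_RANK gives tied dates the same number and does not skip
# numbers afterwards, unlike ROW_NUMBER (always unique) or RANK (gaps).
rank_query = """
SELECT ID, DATE, EVENT,
       DENSE_RANK() OVER (PARTITION BY ID ORDER BY DATE) AS SEQUENCE
FROM mytable
ORDER BY ID, DATE, EVENT
"""
rank_rows = conn.execute(rank_query).fetchall()
for row in rank_rows:
    print(row)
```

The SEQUENCE values come out as 1, 1, 2, 3 for ID 01 and 1, 2, 2, 3 for ID 02.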
I just started to program in SQL and I have a bit of a problem (n.b., I am working with a table that comes from a game). My table is something like this, where ID refers to a single person, H to a certain hour of playing and IF to a certain condition:
ID H IF
01 1 0
01 2 0
01 3 0
02 1 0
02 2 1
03 1 0
03 2 1
03 3 0
03 4 1
In this case player 01 played for three hours, player 02 for two hours and player 03 for four hours. In each of these hours they may or may not have performed an action. If they did, a 1 appears in the IF column.
Now, my question is: how can I query so that I get a table with only the IDs of the people who never performed the action? I do not want to rule out only the rows with IF = 1; I want to rule out all the rows with that ID. In this case it should become:
01 1 0
01 2 0
01 3 0
Any help?
This should do it.
select *
from table
where Id not in (select Id from table where IF = 1)
SELECT ID FROM Table GROUP BY ID HAVING SUM(IF)=0
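Both suggestions can be compared on the sample data with the sketch below, which runs on SQLite through Python's sqlite3 module. The table name plays is a placeholder, and the IF column is wrapped in double quotes because IF is a keyword in several SQL dialects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plays (ID TEXT, H INTEGER, "IF" INTEGER);
    INSERT INTO plays VALUES
        ('01', 1, 0), ('01', 2, 0), ('01', 3, 0),
        ('02', 1, 0), ('02', 2, 1),
        ('03', 1, 0), ('03', 2, 1), ('03', 3, 0), ('03', 4, 1);
""")

# Variant 1: keep every row whose ID never appears with IF = 1.
rows_no_action = conn.execute("""
    SELECT ID, H, "IF"
    FROM plays
    WHERE ID NOT IN (SELECT ID FROM plays WHERE "IF" = 1)
    ORDER BY ID, H
""").fetchall()

# Variant 2: return just the IDs whose IF values sum to zero.
ids_no_action = conn.execute("""
    SELECT ID
    FROM plays
    GROUP BY ID
    HAVING SUM("IF") = 0
""").fetchall()

print(rows_no_action)
print(ids_no_action)
```

The first variant returns the three full rows for player 01, while the second returns only the ID. One caveat with NOT IN: if the subquery can produce a NULL ID, the condition is never true and no rows come back, so NOT EXISTS is often preferred when ID is nullable.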