Left join combining GETDATE() - sql

I have the tables below and I am trying to LEFT JOIN from table A to table B to get Code and Time. The issue is that I get multiple rows for each code. What I want is one row per Code, with the latest Time that is earlier than GETDATE().
Tables:
Code:
SELECT
[ID],
Date_Time
FROM Table_A AS A
LEFT JOIN Table_B AS B
ON A.ID = B.Project_Code

You can use apply:
select a.*, b.*
from a cross apply
(select top (1) b.*
from b
where b.code = a.code and b.time < getdate()
order by b.time desc
) b;
This assumes that time is really a datetime. If you just want to compare times, then use convert(time, getdate()).

You need to add an extra condition to the join so that it only returns records from Table_B before the specified datetime, then simply use MAX to get the latest:
SELECT a.[ID],
Date_Time = MAX(b.Date_Time)
FROM Table_A AS a
LEFT JOIN Table_B AS b
ON b.Project_Code = a.ID
AND b.Date_Time < GETDATE()
GROUP BY a.ID;
If the column Date_time uses the TIME data type (your sample data suggests it does, but your column name suggests it does not), then you will need to convert GETDATE() to a time:
SELECT a.[ID],
Date_Time = MAX(b.Date_Time)
FROM Table_A AS a
LEFT JOIN Table_B AS b
ON b.Project_Code = a.ID
AND b.Date_Time < CONVERT(TIME, GETDATE())
GROUP BY a.ID;
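To make the effect of putting the date filter in the ON clause concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The sample rows and the fixed `cutoff` value (used in place of GETDATE() so the result is reproducible) are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table_A (ID TEXT PRIMARY KEY);
CREATE TABLE Table_B (Project_Code TEXT, Date_Time TEXT);
INSERT INTO Table_A VALUES ('P1'), ('P2'), ('P3');
INSERT INTO Table_B VALUES
    ('P1', '2024-01-01 08:00:00'),
    ('P1', '2024-01-02 09:00:00'),
    ('P1', '2024-02-01 10:00:00'),  -- after the cutoff, must be ignored
    ('P2', '2024-03-01 11:00:00');  -- only row is after the cutoff
""")

# Fixed cutoff standing in for GETDATE() (assumption, for reproducibility).
cutoff = "2024-01-15 00:00:00"

rows = conn.execute("""
    SELECT a.ID, MAX(b.Date_Time) AS Date_Time
    FROM Table_A AS a
    LEFT JOIN Table_B AS b
        ON b.Project_Code = a.ID
       AND b.Date_Time < ?      -- filter in ON keeps unmatched rows of Table_A
    GROUP BY a.ID
    ORDER BY a.ID
""", (cutoff,)).fetchall()

for row in rows:
    print(row)
```

Because the filter sits in the ON clause rather than in a WHERE clause, P2 and P3 still appear, with a NULL Date_Time.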
If there are other columns you need to return from Table_B, then you will need to use OUTER APPLY, or a subquery with a ranking function.
OUTER APPLY
SELECT a.[ID],
b.Date_Time,
b.SomeOtherColumn
FROM Table_A AS a
OUTER APPLY
( SELECT TOP 1 b.Date_Time, b.SomeOtherColumn
FROM Table_B AS b
WHERE A.ID = B.Project_Code
AND b.Date_Time < GETDATE()
ORDER BY b.Date_Time DESC
) AS b;
SUBQUERY WITH RANKING FUNCTION
SELECT a.[ID],
b.Date_Time,
b.SomeOtherColumn
FROM Table_A AS a
LEFT JOIN
( SELECT b.Project_Code,
b.Date_Time,
b.SomeOtherColumn,
RowNumber = ROW_NUMBER() OVER(PARTITION BY b.Project_Code ORDER BY b.Date_Time DESC)
FROM Table_B AS b
WHERE b.Date_Time < GETDATE()
) AS b
ON b.Project_Code = a.ID
AND b.RowNumber = 1;
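The ranking-function approach can be tried out with the following sketch. It runs against SQLite through Python's sqlite3 module rather than SQL Server (window functions need SQLite 3.25 or later). The sample data and the fixed `cutoff` replacing GETDATE() are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table_A (ID TEXT PRIMARY KEY);
CREATE TABLE Table_B (Project_Code TEXT, Date_Time TEXT, SomeOtherColumn TEXT);
INSERT INTO Table_A VALUES ('P1'), ('P2');
INSERT INTO Table_B VALUES
    ('P1', '2024-01-01 08:00:00', 'first'),
    ('P1', '2024-01-02 09:00:00', 'second'),
    ('P1', '2024-02-01 10:00:00', 'future');
""")

# Fixed cutoff standing in for GETDATE() (assumption).
cutoff = "2024-01-15 00:00:00"

ranked_rows = conn.execute("""
    SELECT a.ID, b.Date_Time, b.SomeOtherColumn
    FROM Table_A AS a
    LEFT JOIN (
        -- Number the rows of each project, newest first; no ORDER BY
        -- is needed at the end of the derived table itself.
        SELECT Project_Code, Date_Time, SomeOtherColumn,
               ROW_NUMBER() OVER (PARTITION BY Project_Code
                                  ORDER BY Date_Time DESC) AS RowNumber
        FROM Table_B
        WHERE Date_Time < ?
    ) AS b
        ON b.Project_Code = a.ID
       AND b.RowNumber = 1
    ORDER BY a.ID
""", (cutoff,)).fetchall()

for row in ranked_rows:
    print(row)
```

P1 gets its latest row before the cutoff together with the extra column, while P2, which has no rows in Table_B, is kept with NULLs.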

Related

Can I rewrite this SQL query to avoid the "The ORDER BY clause is invalid "

My query originally was this:
select a.ID, a.TransactionID, b.Result
from MyDB.Result a inner join MyDB.ResultData b on a.ID=b.ID
where a.ID < 100000 and a.CreatedOn > '2020-01-01'
order by a.ID
But I got this error in the Elasticsearch JDBC input:
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.
To get around it I refactored it to this:
select TOP 500 a.ID, a.TransactionID, b.Result
from MyDB.Result a inner join MyDB.ResultData b on a.ID=b.ID
where a.ID < 100000 and a.CreatedOn > '2020-01-01'
order by a.ID
Is there any way I can rewrite it so as not to have to ask for TOP 500 and instead let the JDBC plugin use its inbuilt settings?
EDIT:
This is a section from the logs. This does look to be running as a part of a bigger query.
(1.251403s) SELECT TOP (1) count(*) AS [COUNT] FROM (select TOP 500 a.ID, a.TransactionID, b.Result from MyDB.Result a inner join MyDB.ResultData b on a.ID=b.ResultID where a.ID < 100000 and a.CreatedOn > '2020-01-01' order by a.ID) AS [T1]
(8.845048s) SELECT * FROM (select TOP 500 a.ID, a.TransactionID, b.Result from MyDB.Result a inner join MyDB.ResultData b on a.ID=b.ResultID where a.ID < 100000 and a.CreatedOn > '2020-01-01' order by a.ID) AS [T1] ORDER BY 1 OFFSET 0 ROWS FETCH NEXT 500 ROWS ONLY
When you use a subquery in the FROM clause, ORDER BY is only allowed inside it if TOP is also specified.
Subquery rules
ORDER BY can only be specified when TOP is also specified.
Read about all subquery rules
What you can do is move the ORDER BY out of the derived table in the FROM clause. I understand that you want all the rows rather than only 500.
SELECT TOP (1) count(*) AS [COUNT] FROM (select a.ID,
a.TransactionID, b.Result from MyDB.Result a inner join MyDB.ResultData b
on a.ID=b.ResultID where a.ID < 100000 and a.CreatedOn > '2020-01-01') AS [T1];
SELECT * FROM (select a.ID, a.TransactionID, b.Result from MyDB.Result a
inner join MyDB.ResultData b on a.ID=b.ResultID where a.ID < 100000
and a.CreatedOn > '2020-01-01') AS [T1] ORDER BY ID;
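The logged queries suggest that the plugin wraps the supplied statement in a derived table, once for counting and once for paging. The sketch below imitates that wrapping with Python's sqlite3 module. The wrapper strings are hypothetical and only modelled on the log lines. SQLite pages with LIMIT/OFFSET instead of OFFSET ... FETCH, and it does not reject ORDER BY inside a derived table, so this shows only the shape of the wrapping, not the SQL Server error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Result (ID INTEGER, TransactionID TEXT);
INSERT INTO Result VALUES (3, 't3'), (1, 't1'), (2, 't2');
""")

# Statement handed to the wrapper: no ORDER BY and no TOP.
inner_sql = "SELECT ID, TransactionID FROM Result WHERE ID < 100000"

# Hypothetical wrappers modelled on the logged queries (assumption).
count_sql = f"SELECT count(*) FROM ({inner_sql}) AS T1"
page_sql = f"SELECT * FROM ({inner_sql}) AS T1 ORDER BY 1 LIMIT 2 OFFSET 0"

total = conn.execute(count_sql).fetchone()[0]
page = conn.execute(page_sql).fetchall()

print(total)
print(page)
```

The ordering comes entirely from the outer ORDER BY, so the inner statement does not need one.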

Removing duplicate values from a column in SQL

I have two tables, A (group_id, id, subject) and B (id, date). Below is the result of joining A and B on id. I have tried using DISTINCT and PARTITION BY to remove the duplicates in the group_id field only, but with no luck:
My code:
select
a.group_id, a.id, a.subject, b.date
from
A a
inner join
(select
b.*,
row_number() over (partition by group_id order by date asc) as seqnum
from
B b) b on a.id = b.id and seqnum = 1
order by
date desc;
I got this error when I ran the code:
Partitioning can not be used stand-alone in query near 'partition by group_id order by date asc) as seqnum from B' at line 1
This is my expected result:
Thank you in advance!
It looks like you want the earliest date for each row in the table you show. Your question mentions two tables, but you only show one.
I recommend a correlated subquery in most databases:
select b.*
from b
where b.date = (select min(b2.date)
from b b2
where b2.group_id = b.group_id
);
I see. You need to join first and then use row_number():
select ab.*
from (select a.group_id, a.id, a.subject, b.date,
row_number() over (partition by a.group_id order by b.date) as seqnum
from A a join
B b
on a.id = b.id
) ab
where seqnum = 1
order by date desc;
You are almost there. But the column that you are trying to partition by (i.e. group_id) comes from table a, which is not available in the subquery.
You would need to JOIN and assign the row number in a subquery, and then filter in the outer query.
select *
from (
select
a.group_id,
a.id,
a.subject,
b.date,
row_number() over (partition by a.group_id order by b.date asc) as seqnum
from a
inner join b on a.id = b.id
) t
where seqnum = 1
ORDER BY date desc;
Another way to achieve your goal, though it may not be the most efficient one:
SELECT
A.group_id, A.id, B.Date, A.subject
FROM A
INNER JOIN B
ON A.Id = B.Id
INNER JOIN
(
SELECT
A.Group_id, MIN(B.Date) AS Date
FROM A
INNER JOIN B
ON A.Id = B.Id
GROUP BY A.group_id
) AS supportTable
ON A.group_id = supportTable.group_id
AND B.Date = supportTable.Date
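The two strategies in this thread (join first and rank with row_number(), or join back to the MIN date per group) can be compared on a small data set. The sketch below uses Python's sqlite3 module and made-up rows. The comparison only holds when a group has no ties on its earliest date, since the MIN-based join returns every tied row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (group_id INTEGER, id INTEGER, subject TEXT);
CREATE TABLE B (id INTEGER, date TEXT);
INSERT INTO A VALUES (1, 10, 'math'), (1, 11, 'art'),
                     (2, 20, 'bio'),  (2, 21, 'chem');
INSERT INTO B VALUES (10, '2021-03-01'), (11, '2021-01-15'),
                     (20, '2021-02-01'), (21, '2021-05-01');
""")

# Join first, then number the rows inside each group by date.
ranked = conn.execute("""
    SELECT group_id, id, subject, date
    FROM (
        SELECT a.group_id, a.id, a.subject, b.date,
               ROW_NUMBER() OVER (PARTITION BY a.group_id
                                  ORDER BY b.date) AS seqnum
        FROM A AS a
        JOIN B AS b ON a.id = b.id
    ) AS ab
    WHERE seqnum = 1
    ORDER BY date DESC
""").fetchall()

# Join back to the earliest date per group.
via_min = conn.execute("""
    SELECT a.group_id, a.id, a.subject, b.date
    FROM A AS a
    JOIN B AS b ON a.id = b.id
    JOIN (
        SELECT a2.group_id, MIN(b2.date) AS date
        FROM A AS a2
        JOIN B AS b2 ON a2.id = b2.id
        GROUP BY a2.group_id
    ) AS s ON a.group_id = s.group_id AND b.date = s.date
    ORDER BY b.date DESC
""").fetchall()

print(ranked)
print(via_min)
```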

SQL: Modifying Inner Join to Select One Row

I have two tables, A and B that I want to inner join on location. However, for each row in A, there are many rows in B whose location matches. I want to end up with at most the same number of rows as in A. Specifically, I want to take the row in B where date is earliest. Here's what I have so far:
SELECT *
FROM A
INNER JOIN B ON A.location = B.location
How would I modify this so that each row in A only gets joined with a single row in B (using the earliest date)?
Attempt:
SELECT *
FROM A
INNER JOIN B ON A.location = B.location
AND B.date = (SELECT MIN(date) FROM B)
Is that the right approach?
You can use the ANSI/ISO standard row_number() function:
SELECT *
FROM A INNER JOIN
(SELECT B.*, ROW_NUMBER() OVER (PARTITION BY B.location ORDER BY B.date) as seqnum
FROM B
) B
ON A.location = B.location AND seqnum = 1;
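If you only need a single row overall (the earliest match across all of A), TOP (1) with ORDER BY is enough; note that this does not return one row per row of A:
SELECT TOP(1) * FROM A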
INNER JOIN B ON
A.LOCATION=B.LOCATION
ORDER BY B.DATE
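To see why the attempt with an uncorrelated MIN(date) falls short, the sketch below compares it with the row_number() version on invented data, using Python's sqlite3 module. The table contents are assumptions for illustration. The uncorrelated subquery finds the single earliest date in all of B, so only locations that happen to have that exact date survive the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (location TEXT, name TEXT);
CREATE TABLE B (location TEXT, date TEXT);
INSERT INTO A VALUES ('NY', 'a1'), ('LA', 'a2');
INSERT INTO B VALUES ('NY', '2020-01-05'), ('NY', '2020-01-01'),
                     ('LA', '2020-02-01'), ('LA', '2020-03-01');
""")

# The attempt from the question: global minimum over all of B.
global_min = conn.execute("""
    SELECT A.location, B.date
    FROM A
    INNER JOIN B ON A.location = B.location
               AND B.date = (SELECT MIN(date) FROM B)
    ORDER BY A.location
""").fetchall()

# Earliest row per location via ROW_NUMBER().
per_location = conn.execute("""
    SELECT A.location, B.date
    FROM A
    INNER JOIN (
        SELECT location, date,
               ROW_NUMBER() OVER (PARTITION BY location
                                  ORDER BY date) AS seqnum
        FROM B
    ) AS B ON A.location = B.location AND B.seqnum = 1
    ORDER BY A.location
""").fetchall()

print(global_min)
print(per_location)
```

The first query loses LA entirely, while the second keeps one row for each location in A.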

How would I write this query as a join instead of a correlated query?

Netezza can't use correlated subqueries in SELECT statements, which is unfortunate because I can't think of a single way to avoid one in my particular case. I was thinking about doing something with ROW_NUMBER(); however, I can't include windowing functions in a HAVING clause.
I've got the following query:
select
a.*
,( select b.col1
from b
where b.ky = a.ky
and a.date <= b.date
order by b.date desc
limit 1
) as new_col
from a
Any suggestions?
This should return the expected result:
select *
from
(
select
a.*
,b.col1 as b_col1
,row_number()
over (partition by a.ky
order by b.date desc NULLS FIRST) as rn
from a left join b
on b.ky = a.ky
and a.date <= b.date
) as dt
where rn = 1
I'm not completely sure I understand your question, but is this what you're looking for?
SELECT TOP 1 a.*, b.col1 FROM a JOIN b ON a.ky = b.ky
WHERE a.date <= b.date ORDER BY b.date desc
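As a sanity check, the correlated form and the left join plus row_number() form can be run side by side. The sketch below uses Python's sqlite3 module, because SQLite accepts both forms; Netezza itself is not involved. The sample rows are invented, and it assumes that `ky` identifies a single row of `a`, as the `partition by a.ky` rewrite does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (ky INTEGER, date TEXT);
CREATE TABLE b (ky INTEGER, date TEXT, col1 TEXT);
INSERT INTO a VALUES (1, '2020-01-01'), (2, '2020-06-01'), (3, '2020-01-01');
INSERT INTO b VALUES (1, '2020-02-01', 'x'),
                     (1, '2020-03-01', 'y'),
                     (2, '2020-01-01', 'old');  -- earlier than a.date for ky 2
""")

# Original shape: correlated scalar subquery in the SELECT list.
correlated = conn.execute("""
    SELECT a.ky, a.date,
           (SELECT b.col1
            FROM b
            WHERE b.ky = a.ky AND a.date <= b.date
            ORDER BY b.date DESC
            LIMIT 1) AS new_col
    FROM a
    ORDER BY a.ky
""").fetchall()

# Rewrite: left join, rank the matches per row of a, keep the first.
joined = conn.execute("""
    SELECT ky, date, b_col1
    FROM (
        SELECT a.ky, a.date, b.col1 AS b_col1,
               ROW_NUMBER() OVER (PARTITION BY a.ky
                                  ORDER BY b.date DESC) AS rn
        FROM a
        LEFT JOIN b ON b.ky = a.ky AND a.date <= b.date
    ) AS dt
    WHERE rn = 1
    ORDER BY ky
""").fetchall()

print(correlated)
print(joined)
```

Rows of `a` without a qualifying match keep a NULL in the new column in both versions, which is why the rewrite needs a left join rather than an inner join.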

Aliasing derived table which is a union of two selects

I can't get the syntax right for aliasing the derived table correctly:
SELECT * FROM
(SELECT a.*, b.*
FROM a INNER JOIN b ON a.B_id = b.B_id
WHERE a.flag IS NULL AND b.date < NOW()
UNION
SELECT a.*, b.*
FROM a INNER JOIN b ON a.B_id = b.B_id
INNER JOIN c ON a.C_id = c.C_id
WHERE a.flag IS NOT NULL AND c.date < NOW())
AS t1
ORDER BY RAND() LIMIT 1
I'm getting a Duplicate column name of B_id. Any suggestions?
The problem isn't the union, it's the select a.*, b.* in each of the inner select statements - since a and b both have B_id columns, that means you have two B_id cols in the result.
You can fix that by changing the selects to something like:
select a.*, b.col_1, b.col_2 -- repeat for columns of b you need
In general, I'd avoid using select table1.* in queries you're using from code (rather than just interactive queries). If someone adds a column to the table, various queries can suddenly stop working.
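The duplicated name is easy to see by inspecting the result columns. The sketch below uses Python's sqlite3 module with made-up table definitions. SQLite tolerates the duplicate where MySQL rejects it inside a derived table, so this only demonstrates where the second B_id comes from and how an explicit column list avoids it. The alias `b_date` is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (A_id INTEGER, B_id INTEGER, flag TEXT);
CREATE TABLE b (B_id INTEGER, date TEXT);
INSERT INTO a VALUES (1, 100, NULL);
INSERT INTO b VALUES (100, '2020-01-01');
""")

# Both tables contribute a B_id column when each is expanded with *.
star = conn.execute(
    "SELECT a.*, b.* FROM a INNER JOIN b ON a.B_id = b.B_id")
star_names = [col[0] for col in star.description]

# Listing the columns of b explicitly leaves a single B_id.
explicit = conn.execute(
    "SELECT a.*, b.date AS b_date FROM a INNER JOIN b ON a.B_id = b.B_id")
explicit_names = [col[0] for col in explicit.description]

print(star_names)
print(explicit_names)
```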
In your derived table, you are retrieving the column id that exists in table a and table b, so you need to choose one of them or give an alias to them:
SELECT * FROM
(SELECT a.*, b.[all columns except id]
FROM a INNER JOIN b ON a.B_id = b.B_id
WHERE a.flag IS NULL AND b.date < NOW()
UNION
SELECT a.*, b.[all columns except id]
FROM a INNER JOIN b ON a.B_id = b.B_id
INNER JOIN c ON a.C_id = c.C_id
WHERE a.flag IS NOT NULL AND c.date < NOW())
AS t1
ORDER BY RAND() LIMIT 1
First, you could use UNION ALL instead of UNION. The two subqueries will have no common rows because their conditions on a.flag are mutually exclusive.
Another way you could write it is:
SELECT a.*, b.*
FROM a
INNER JOIN b
ON a.B_id = b.B_id
WHERE ( a.flag IS NULL
AND b.date < NOW()
)
OR
( a.flag IS NOT NULL
AND EXISTS
( SELECT *
FROM c
WHERE a.C_id = c.C_id
AND c.date < NOW()
)
)
ORDER BY RAND()
LIMIT 1