I know this has been asked a lot but I can't seem to get my query working.
I'm trying to get only one row per id in a query that looks like this:
SELECT a.id, b.name
FROM table1 a
LEFT JOIN table2 b ON a.key = b.key
WHERE a.Date =
(SELECT MAX(a1.date) FROM table1 a1 WHERE a1.primarykey = a.primarykey)
GROUP BY a.id, b.name
I do not need to group by b.name, but I have to because I need to group by id.
Right now I get multiple occurrences of b.name, which duplicates a.id, when I only want the b.name that corresponds to the latest date for each a.id.
Can anyone point me to the right way to do this?
Thank you
I guess this condition:
WHERE a1.primarykey = a.primarykey
should be:
WHERE a1.key = a.key
and that key is not the primary key of table1. If you really mean the primary key, there is no point in searching for MAX(date) per primary key, since each primary key has only one date.
If I'm right, try row_number():
SELECT t.id, t.name
FROM (
SELECT a.id, b.name,
row_number() over (partition by a.key order by a.date desc) rn
FROM table1 a LEFT JOIN table2 b
ON a.key = b.key
) t
WHERE t.rn = 1
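The row_number() approach above can be tried on a small in-memory SQLite database. This is only a sketch under assumptions: the sample rows are made up, the column key is renamed to k (a hypothetical name, chosen because KEY is an SQL keyword), and the SQLite library bundled with Python needs to be 3.25 or newer for window functions.

```python
import sqlite3

# Hypothetical sample data; only the table names and the id/name/date
# columns come from the question. "key" is renamed to "k" here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, k TEXT, date TEXT);
    CREATE TABLE table2 (k TEXT, name TEXT);
    INSERT INTO table1 VALUES
        (1, 'x', '2020-01-01'),
        (2, 'x', '2020-03-01'),
        (3, 'y', '2020-02-01');
    INSERT INTO table2 VALUES ('x', 'alpha'), ('y', 'beta');
""")

# Number the rows of each key from newest to oldest, then keep rn = 1.
latest_per_key = conn.execute("""
    SELECT t.id, t.name
    FROM (
        SELECT a.id, b.name,
               row_number() OVER (PARTITION BY a.k ORDER BY a.date DESC) AS rn
        FROM table1 a
        LEFT JOIN table2 b ON a.k = b.k
    ) t
    WHERE t.rn = 1
    ORDER BY t.id
""").fetchall()

print(latest_per_key)
```

With this data, key x keeps only id 2 (its latest date) and key y keeps id 3, so each key appears once.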
It looks like you would get one row per id if you removed b.name from your GROUP BY clause.
I'm not sure why you would need to group on b.name if you already group on a.id.
Try this:
SELECT a.id, b.name FROM (
SELECT a1.id, a1.key,
rank() over (partition by a1.key order by a1.date desc) md FROM table1 a1 ) a
LEFT JOIN table2 b ON a.key = b.key
WHERE a.md = 1;
But I don't understand whether you need to group by id or by key, so double-check that.
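One detail worth checking with the rank() approach is where the md = 1 test goes. In a LEFT JOIN, a predicate on the left-hand side placed in the ON clause does not remove rows; it only decides whether a match is attempted. The sketch below compares both placements on made-up data in SQLite (the column key is renamed to the hypothetical k, and SQLite 3.25+ is assumed for window functions).

```python
import sqlite3

# Hypothetical sample data, as small as possible.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, k TEXT, date TEXT);
    CREATE TABLE table2 (k TEXT, name TEXT);
    INSERT INTO table1 VALUES
        (1, 'x', '2020-01-01'),
        (2, 'x', '2020-03-01'),
        (3, 'y', '2020-02-01');
    INSERT INTO table2 VALUES ('x', 'alpha'), ('y', 'beta');
""")

# Subquery that ranks the rows of each key from newest to oldest.
ranked = """
    SELECT a1.id, a1.k,
           rank() OVER (PARTITION BY a1.k ORDER BY a1.date DESC) AS md
    FROM table1 a1
"""

# Filter in ON: every row of the left side survives, older rows get NULL names.
in_on = conn.execute(f"""
    SELECT a.id, b.name
    FROM ({ranked}) a
    LEFT JOIN table2 b ON a.k = b.k AND a.md = 1
    ORDER BY a.id
""").fetchall()

# Filter in WHERE: only the newest row per key is returned.
in_where = conn.execute(f"""
    SELECT a.id, b.name
    FROM ({ranked}) a
    LEFT JOIN table2 b ON a.k = b.k
    WHERE a.md = 1
    ORDER BY a.id
""").fetchall()

print(in_on)
print(in_where)
```

Here the ON version still returns id 1 (with a NULL name), while the WHERE version returns only ids 2 and 3.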
Related
I have a table created by joining a table to itself.
SELECT a.id id1, b.id id2, ST_Distance(a.geom, b.geom) dist
FROM yy_cluster_1 a, yy_cluster_1 b
WHERE ST_DWithin(a.geom, b.geom, 12) AND a.id <> b.id
ORDER BY a.id, b.id
This results in a table like the one below on the left. I am trying to number rows to produce the result on the right
For the life of me I cannot think how to do this, let alone efficiently.
Use dense_rank() along with least() and greatest():
SELECT
a.id id1,
b.id id2,
ST_Distance(a.geom, b.geom) dist,
dense_rank() over(order by least(a.id, b.id), greatest(a.id, b.id)) new_id
FROM
yy_cluster_1 a
INNER JOIN yy_cluster_1 b ON ST_DWithin(a.geom, b.geom, 12) AND a.id <> b.id
ORDER BY a.id, b.id
Note that I changed your query so it performs an explicit join instead of an old-school implicit join.
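To see what the dense_rank() numbering does without PostGIS, here is a minimal sketch in SQLite. It assumes a hypothetical pairs table that stands in for the output of the self-join, and it uses SQLite's two-argument min() and max(), which behave like least() and greatest(). SQLite 3.25+ is assumed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the self-join result (id1, id2).
    CREATE TABLE pairs (id1 INTEGER, id2 INTEGER);
    INSERT INTO pairs VALUES (1, 2), (1, 3), (2, 1), (3, 1);
""")

# (1, 2) and (2, 1) normalise to the same (smaller, larger) pair,
# so dense_rank() gives them the same number.
numbered = conn.execute("""
    SELECT id1, id2,
           dense_rank() OVER (ORDER BY min(id1, id2), max(id1, id2)) AS new_id
    FROM pairs
    ORDER BY id1, id2
""").fetchall()

print(numbered)
```

Both orderings of a pair share one new_id: (1, 2) and (2, 1) get 1, while (1, 3) and (3, 1) get 2.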
I have two tables, A (group_id, id, subject) and B (id, date). Below is the result of joining A and B on id. I have tried using distinct and partition to remove the duplicates in the group_id field only, but with no luck:
My code:
select
a.group_id, a.id, a.subject, b.date
from
A a
inner join
(select
b.*,
row_number() over (partition by group_id order by date asc) as seqnum
from
B b) b on a.id = b.id and seqnum = 1
order by
date desc;
I got this error when I ran the code:
Partitioning can not be used stand-alone in query near 'partition by group_id order by date asc) as seqnum from B' at line 1
This is my expected result:
Thank you in advance!
It looks like you want the earliest date for each row in the table you show. Your question mentions two tables, but you only show one.
I recommend a correlated subquery in most databases:
select b.*
from b
where b.date = (select min(b2.date)
from b b2
where b2.group_id = b.group_id
);
I see. You need to join first and then use row_number():
select ab.*
from (select a.group_id, a.id, a.subject, b.date,
row_number() over (partition by a.group_id order by b.date) as seqnum
from A a join
B b
on a.id = b.id
) ab
where seqnum = 1
order by date desc;
You are almost there, but the column you are trying to partition by (i.e. group_id) comes from table a, which is not available in the subquery on b.
You would need to JOIN and assign the row number in a subquery, and then filter in the outer query.
select *
from (
select
a.group_id,
a.id,
a.subject,
b.date,
row_number() over (partition by a.group_id order by b.date asc) as seqnum
from a
inner join b on a.id = b.id
) t
where seqnum = 1
order by date desc;
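The "join first, then number the rows" pattern can be checked on a tiny data set. The sketch below uses SQLite with made-up rows for A and B (the values are assumptions, only the column names come from the question) and requires SQLite 3.25+ for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample rows.
    CREATE TABLE A (group_id INTEGER, id INTEGER, subject TEXT);
    CREATE TABLE B (id INTEGER, date TEXT);
    INSERT INTO A VALUES (1, 10, 'math'), (1, 11, 'art'), (2, 12, 'bio');
    INSERT INTO B VALUES (10, '2021-05-01'), (11, '2021-01-01'), (12, '2021-03-01');
""")

# group_id only exists in A, so the window function has to run on the
# joined rows; the outer query then keeps the earliest row per group.
earliest = conn.execute("""
    SELECT group_id, id, subject, date
    FROM (
        SELECT A.group_id, A.id, A.subject, B.date,
               row_number() OVER (PARTITION BY A.group_id ORDER BY B.date ASC) AS seqnum
        FROM A
        INNER JOIN B ON A.id = B.id
    ) t
    WHERE seqnum = 1
    ORDER BY date DESC
""").fetchall()

print(earliest)
```

Group 1 has two rows, and only the one with the earlier date (id 11) is kept; group 2 keeps its single row.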
Another way to achieve your goal, though it may not be the most efficient one:
SELECT
A.group_id, A.id, B.Date, A.subject
FROM A
INNER JOIN B
ON A.Id = B.Id
INNER JOIN
(
SELECT
A.Group_id, MIN(B.Date) AS Date
FROM A
INNER JOIN B
ON A.Id = B.Id
GROUP BY A.group_id
) AS supportTable
ON A.group_id = supportTable.group_id
AND B.Date = supportTable.Date
I am trying to do a simple LEFT JOIN of a table in couchbase. Here is what I have:
SELECT
a.*,
b.id,
b.name
FROM my_table AS a LEFT JOIN my_table AS b
ON KEYS a.pid
WHERE a.id='abc'
but for some reason the result I get is not including the fields of the table on the right side. Can anyone help me to achieve something similar to what we can do in relational database SQL as below?
SELECT
a.*,
b.id,
b.name
FROM my_table AS a LEFT JOIN my_table AS b
ON a.pid=b.id
WHERE a.id='abc'
thanks!
If nothing matches, the right side of the JOIN is projected as MISSING in N1QL (JSON), rather than as NULL as in SQL. That is, you get the left-side document and the right side is MISSING. To get NULL instead:
SELECT
a.*,
(CASE WHEN b IS MISSING THEN NULL ELSE b.id END) AS id,
(CASE WHEN b IS MISSING THEN NULL ELSE b.name END) AS name
FROM my_table AS a LEFT JOIN my_table AS b
ON KEYS a.pid
WHERE a.id='abc';
If the document b is matched but b.name is MISSING, you still get MISSING. If you need NULL in that case too, try one of these:
SELECT
a.*,
(CASE WHEN b.id IS MISSING THEN NULL ELSE b.id END) AS id,
(CASE WHEN b.name IS MISSING THEN NULL ELSE b.name END) AS name
FROM my_table AS a LEFT JOIN my_table AS b
ON KEYS a.pid
WHERE a.id='abc';
SELECT
a.*,
IFMISSING(b.id,NULL) AS id,
IFMISSING(b.name,NULL) AS name
FROM my_table AS a LEFT JOIN my_table AS b
ON KEYS a.pid
WHERE a.id='abc';
I'm not sure how to rewrite the query below. I'm trying to join table_a to the most recent table_b record. I'm currently testing with only one ID, but different criteria on table_a may be added later:
Select t.*
from table_a t
left join table_b d on d.id = T.id and d.MOD_DATE IN (SELECT MAX(mod_date) FROM table_b d2 WHERE d2.id = t.id)
where T.id = 123456
Any suggestions?
I think you are looking for something like:
SELECT t.*
FROM table_a t
LEFT JOIN (
SELECT d.*
FROM table_b d
INNER JOIN (
SELECT id
, MAX(mod_date) mod_date_max
FROM table_b d2
GROUP BY id
) db
ON db.id = d.id
AND db.mod_date_max = d.mod_date
) d
ON d.id = T.id
WHERE T.id = 123456
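Note that a WHERE condition on the right-hand table (d) would turn the left join into an inner join. Here the filter is on table_a, so rows without a match in table_b are still returned.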
Also, if you get an error, please post the error message as well, not just its number.
I also found the same could be achieved with the following query:
SELECT * FROM table_a t
WHERE id IN (
SELECT id
FROM (
SELECT id,MAX(MOD_DATE)
FROM table_b
WHERE id = 123456
GROUP BY id
)
)
I have a table like below.
Id    amount
--------------
10    12345
10    12345
12    34567
13    34567
As per my business requirement, the same id with the same amount is not a duplicate record. Different ids with the same amount are duplicate records.
From the sample records above, I have to get the duplicated amount values and their counts, counting only rows whose ids differ.
The expected query result is 34567 with a count of 2.
If you need to display the id as well:
SELECT a.*
FROM
(
SELECT id, amount, count(1) OVER (PARTITION BY amount) num_dup
FROM table1
)a
WHERE a.num_dup >1
Update: if you care only about distinct ids, use COUNT(DISTINCT id) instead of COUNT(1). Note that not every database accepts DISTINCT inside a window function.
More examples.
With joining another table
SELECT a.*
FROM
(
SELECT a.id, a.amount,
count(distinct a.id) OVER (PARTITION BY a.amount) num_dup
FROM table1 a
INNER JOIN table2 b ON (b.id = a.id)
)a
WHERE a.num_dup >1
Without window function and without table1.id :
SELECT a.amount, count(distinct a.id)
FROM table1 a
INNER JOIN table2 b ON (b.id = a.id)
GROUP BY a.amount
HAVING count(distinct a.id) >1 ;
Without window function and with table1.id :
SELECT b.*
FROM
(
SELECT a.amount, count(distinct a.id)
FROM table1 a
INNER JOIN table2 b ON (b.id = a.id)
GROUP BY a.amount
HAVING count(distinct a.id) >1
)a
INNER JOIN table1 b ON (b.amount = a.amount)
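The two variants without a window function can be run against the sample rows from the question. This is a sketch in SQLite with one simplification that is an assumption: the join to table2 is left out, because the question does not describe that table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Sample rows taken from the question.
    CREATE TABLE table1 (id INTEGER, amount INTEGER);
    INSERT INTO table1 VALUES (10, 12345), (10, 12345), (12, 34567), (13, 34567);
""")

# Amounts that occur under more than one distinct id.
dups = conn.execute("""
    SELECT amount, COUNT(DISTINCT id) AS num_ids
    FROM table1
    GROUP BY amount
    HAVING COUNT(DISTINCT id) > 1
""").fetchall()

# Same filter, joined back to table1 to list the ids involved.
with_ids = conn.execute("""
    SELECT b.id, b.amount
    FROM (
        SELECT amount
        FROM table1
        GROUP BY amount
        HAVING COUNT(DISTINCT id) > 1
    ) a
    INNER JOIN table1 b ON b.amount = a.amount
    ORDER BY b.id
""").fetchall()

print(dups)
print(with_ids)
```

Amount 12345 appears twice but under the same id, so it is not reported; 34567 appears under ids 12 and 13 and is reported with a count of 2.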