This is causing me more trouble than it should.
I have the following sample tables:
 __________________     __________
| Name  | Number   |   | Number   |
|_______|__________|   |__________|
| Alice | 1        |   | 1        |
| Bob   | 2        |   | 1        |
|_______|__________|   |__________|
I want my result to be:
 ______________________________________
| Name   | Number    | Count(Number) |
|________|___________|_______________|
| Alice  | 1         | 2             |
| Bob    | 2         | 0             |
|________|___________|_______________|
I keep going back and forth, but I'm sure this shouldn't be so tricky. I assume I'm missing something.
I've modified Gordon's answer:
select t1.name, t1.number, count(t2.number)
from table1 t1,
table2 t2
where t1.number = t2.number (+)
group by t1.name, t1.number;
You need a join and aggregation. However, the join needs to be a left outer join to keep all the rows:
select t1.name, t1.number, count(t2.number)
from table1 t1 left outer join
table2 t2
on t1.number = t2.number
group by t1.name, t1.number;
And, the count() is counting the non-NULL values in the second table, so you can get 0 when there is no match.
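To see the effect on the sample data, here is a minimal sketch that runs the left outer join through Python's built-in sqlite3 module. The choice of SQLite is an assumption made only so the example is self-contained; the question itself appears to target Oracle. It also shows why count(t2.number) is used rather than count(*).

```python
import sqlite3

# In-memory SQLite database (an assumption; the original question targets another engine).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT, number INTEGER);
    CREATE TABLE table2 (number INTEGER);
    INSERT INTO table1 VALUES ('Alice', 1), ('Bob', 2);
    INSERT INTO table2 VALUES (1), (1);
""")

# COUNT(t2.number) skips the NULL produced for Bob's unmatched row.
counts = conn.execute("""
    SELECT t1.name, t1.number, COUNT(t2.number)
    FROM table1 t1
    LEFT OUTER JOIN table2 t2 ON t1.number = t2.number
    GROUP BY t1.name, t1.number
    ORDER BY t1.name
""").fetchall()
print(counts)  # [('Alice', 1, 2), ('Bob', 2, 0)]

# COUNT(*) counts rows, so Bob's NULL-padded row is reported as 1.
star_counts = conn.execute("""
    SELECT t1.name, COUNT(*)
    FROM table1 t1
    LEFT OUTER JOIN table2 t2 ON t1.number = t2.number
    GROUP BY t1.name
    ORDER BY t1.name
""").fetchall()
print(star_counts)  # [('Alice', 2), ('Bob', 1)]
```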
Could someone explain to me why
select "talent".* from "talent"
left join "push" on "push"."talentId" = "talent"."id"
where ("push"."offerId" = '403' or "push"."offerId" is null)
yields fewer results than
select "talent".* from "talent"
left join "push" on "push"."talentId" = "talent"."id" and "push"."offerId" in ('403')
The way I see it, it should boil down to the same result, but it doesn’t, and I’m not sure what I’m missing.
The first one does not contain rows that have no entry in the push table.
I’d expect them to be caught by the or "push"."offerId" is null.
EDIT:
here is an example:
talent table
+----+------+
| id | name |
+----+------+
| 1 | John |
| 2 | Bob |
| 3 | Jack |
+----+------+
push table
+----+----------+---------+
| id | talentId | offerId |
+----+----------+---------+
| 1 | 1 | 403 |
| 2 | 1 | 42 |
| 3 | 2 | 123 |
| 3 | 2 | 456 |
+----+----------+---------+
With this data, the query with the where clause returns only
+----+------+---------+
| id | name | offerId |
+----+------+---------+
| 1 | John | 403 |
+----+------+---------+
while the one with the on condition returns all wanted rows
+----+------+---------+
| id | name | offerId |
+----+------+---------+
| 1 | John | 403 |
| 2 | Bob | null |
| 3 | Jack | null |
+----+------+---------+
The difference is when there is a match but on another row. This is best shown with a small example.
Consider:
t1:
x y
1 abc
1 def
2 xyz
t2:
x y
1 def
Then the left join version returns all three rows in t1:
select *
from t1 left join
t2
on t1.x = t2.x and t1.y = t2.y;
The filtering in the where clause version:
select *
from t1 left join
t2
on t1.x = t2.x
where t2.y = 'abc' or t2.y is null;
returns only one row. The row that is returned is 2/xyz, because x = 2 has no match in t2 and so t2.y is NULL. Both rows with x = 1 do match in t2, so t2.y is not NULL. And it is not 'abc' either (it is 'def'). So they are filtered out.
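Both queries can be run against the sample t1 and t2 from this answer. This is a sketch using Python's sqlite3 module, which is an assumption, since the question does not name an engine. The explicit column list and ORDER BY are added only to make the output easy to compare.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (x INTEGER, y TEXT);
    CREATE TABLE t2 (x INTEGER, y TEXT);
    INSERT INTO t1 VALUES (1, 'abc'), (1, 'def'), (2, 'xyz');
    INSERT INTO t2 VALUES (1, 'def');
""")

# Condition in ON: every t1 row survives, unmatched ones get NULLs for t2.
on_rows = conn.execute("""
    SELECT t1.x, t1.y, t2.x, t2.y
    FROM t1 LEFT JOIN t2 ON t1.x = t2.x AND t1.y = t2.y
    ORDER BY t1.x, t1.y
""").fetchall()
print(on_rows)
# [(1, 'abc', None, None), (1, 'def', 1, 'def'), (2, 'xyz', None, None)]

# Condition in WHERE: applied after the join, so t1 rows that matched
# a t2 row with a different y are dropped.
where_rows = conn.execute("""
    SELECT t1.x, t1.y, t2.x, t2.y
    FROM t1 LEFT JOIN t2 ON t1.x = t2.x
    WHERE t2.y = 'abc' OR t2.y IS NULL
    ORDER BY t1.x, t1.y
""").fetchall()
print(where_rows)  # [(2, 'xyz', None, None)]
```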
Yes, there is something you are missing.
WHERE and join conditions are only exchangeable for inner joins.
An outer join a LEFT JOIN b ON ... is defined as:
- the result of the inner join
- in addition, for every row in a that did not find a match that way, a result row where the b values are replaced with NULL.
So, no matter what the join condition is, the result will always contain at least one row for each row of a.
But a WHERE condition is evaluated (logically) after the join, so it can exclude rows from a from the query result.
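The definition above can be checked by building the outer join by hand. The sketch below uses the talent and push rows from the question (the push id column is left out, and offerId is stored as an integer; both are simplifications of mine) and runs on SQLite through Python's sqlite3 module, which is an assumption about the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE talent (id INTEGER, name TEXT);
    CREATE TABLE push (talentId INTEGER, offerId INTEGER);
    INSERT INTO talent VALUES (1, 'John'), (2, 'Bob'), (3, 'Jack');
    INSERT INTO push VALUES (1, 403), (1, 42), (2, 123), (2, 456);
""")

# The LEFT JOIN with the offer filter inside the ON clause.
left_join = conn.execute("""
    SELECT t.id, t.name, p.offerId
    FROM talent t
    LEFT JOIN push p ON p.talentId = t.id AND p.offerId = 403
    ORDER BY t.id
""").fetchall()

# The same result spelled out: inner join rows, plus one NULL-padded
# row for every talent row that found no match under the join condition.
by_definition = conn.execute("""
    SELECT t.id, t.name, p.offerId
    FROM talent t
    JOIN push p ON p.talentId = t.id AND p.offerId = 403
    UNION ALL
    SELECT t.id, t.name, NULL
    FROM talent t
    WHERE NOT EXISTS (SELECT 1 FROM push p
                      WHERE p.talentId = t.id AND p.offerId = 403)
    ORDER BY 1
""").fetchall()

print(left_join)      # [(1, 'John', 403), (2, 'Bob', None), (3, 'Jack', None)]
print(by_definition)  # same rows
```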
I want to join tables in such a way that it fetches only latest record from one of the tables:
The following are my data
Table_One:
+----+------+
| ID | Name |
+----+------+
| 1 | John |
| 2 | Tom |
| 3 | Anna |
+----+------+
Table_two:
+----+----------+-----------+
| ID | Visit ID | Date |
+----+----------+-----------+
| 1 | 2513 | 5/5/2001 |
| 1 | 84654 | 10/5/2012 |
| 1 | 454 | 4/20/2018 |
| 2 | 754 | 4/5/1999 |
| 2 | 654 | 8/8/2010 |
| 2 | 624 | 4/9/1982 |
| 3 | 7546 | 7/3/1997 |
| 3 | 246574 | 6/4/2015 |
| 3 | 15487 | 3/4/2017 |
+----+----------+-----------+
Results needed after Join:
+----+------+----------+-----------+
| ID | Name | Visit ID | Date |
+----+------+----------+-----------+
| 1 | John | 454 | 4/20/2018 |
| 2 | Tom | 654 | 8/8/2010 |
| 3 | Anna | 246574 | 6/4/2015 |
+----+------+----------+-----------+
Different database engines have varying ways to get the top row from table 2 per group (you can google for "SQL windowing functions" and your product). Since you don't state what engine you're using it's impossible to give the most appropriate or most performant solution.
The following method should work in most or all SQL engines but will not be especially performant over a large data set (it will benefit from a composite index Table2(ID, Date)). The details of how you specify the aliases for the tables may differ a bit among engines but you can use this as a guide. A windowing function solution will probably be more efficient.
SELECT T1.ID, T1.Name, T2.VisitID, T2.Date FROM Table1 T1 INNER JOIN Table2 T2
ON T1.ID = T2.ID
WHERE NOT EXISTS (SELECT * FROM Table2 T2B WHERE T2B.ID = T1.ID AND T2B.Date > T2.Date)
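As a rough check of the NOT EXISTS approach, here is a sketch on SQLite via Python's sqlite3 module. The engine, the column name VisitID, and storing the dates as ISO yyyy-mm-dd strings (so that plain string comparison orders them correctly) are all assumptions of mine, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_One (ID INTEGER, Name TEXT);
    CREATE TABLE Table_Two (ID INTEGER, VisitID INTEGER, Date TEXT);
    INSERT INTO Table_One VALUES (1, 'John'), (2, 'Tom'), (3, 'Anna');
    -- Dates from the question, rewritten as ISO strings (assumption).
    INSERT INTO Table_Two VALUES
        (1, 2513, '2001-05-05'), (1, 84654, '2012-10-05'), (1, 454, '2018-04-20'),
        (2, 754, '1999-04-05'), (2, 654, '2010-08-08'), (2, 624, '1982-04-09'),
        (3, 7546, '1997-07-03'), (3, 246574, '2015-06-04'), (3, 15487, '2017-03-04');
""")

# Keep a visit only if no later visit exists for the same ID.
latest = conn.execute("""
    SELECT T1.ID, T1.Name, T2.VisitID, T2.Date
    FROM Table_One T1
    INNER JOIN Table_Two T2 ON T1.ID = T2.ID
    WHERE NOT EXISTS (SELECT 1 FROM Table_Two T2B
                      WHERE T2B.ID = T1.ID AND T2B.Date > T2.Date)
    ORDER BY T1.ID
""").fetchall()

for row in latest:
    print(row)
# (1, 'John', 454, '2018-04-20')
# (2, 'Tom', 654, '2010-08-08')
# (3, 'Anna', 15487, '2017-03-04')
```

With the sample rows as posted, Anna's most recent visit is the one dated 3/4/2017, so that is the row this query keeps for her. If two visits for the same ID share the latest date, both are returned.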
I suspect you have SQL Server. If so, you can use APPLY:
select o.*, tt.*
from Table_One o
cross apply ( select top 1 t.[Visit ID], t.[Date]
from table_two t
where t.id = o.id
order by t.date desc
) tt;
You can filter out "latest visit" using
SELECT ID,MAX(DATE) FROM TABLE_TWO GROUP BY ID;
You then join that to TABLE_ONE (... ON .ID = .ID) to pick up the Name column and then join that again to TABLE_TWO (... ON ID=ID AND DATE=DATE) if you need to pick up the VISIT_ID as well.
Specific DBMSs might have proprietary/idiosyncratic extensions, typically serving the purpose of allowing the optimizer to do a better job (e.g. allowing the optimizer to understand that the "joining back to TABLE_TWO" can be eliminated). Thinking here of SELECT TOP 1 ... and the like.
SELECT ID, Name, Visit_ID, Date
FROM
(SELECT Table2.ID, Table1.Name, Table2.Visit_ID, Table2.Date,
ROW_NUMBER() OVER (PARTITION BY Table2.ID ORDER BY Table2.Date DESC) AS seq
FROM Table2 LEFT OUTER JOIN
Table1 ON Table2.ID = Table1.ID) AS mainTable
WHERE seq = 1
I'm not 100% sure this is correct, since grouping by the Visit ID might just throw every record right back at you. However, you can find some great documentation here: https://www.w3resource.com/sql/aggregate-functions/max-date.php
select t1.ID,t1.Name,t2.visit_ID, Max(t2.Date) from Table_Two t2
inner join Table_One t1
on(t2.ID=t1.ID)
group by t1.ID,t1.Name,t2.visit_ID
Something like this should work though. I think this is also the same as what @Erwin Smout proposes:
select a.ID, t1.Name, a.Date, t2.Visit_ID
from (
select ID, max(Date) as Date from Table_Two
group by ID) a
inner join Table_One t1
on (a.ID = t1.ID)
inner join Table_Two t2
on (a.ID = t2.ID and a.Date = t2.Date)
This is the answer to your question
SELECT t1.ID, t1.Name, t2.visit_id, t2.Date FROM table1 t1 INNER JOIN table2 t2 ON t1.ID = t2.ID WHERE NOT EXISTS (SELECT * FROM table2 t2b WHERE t2b.ID = t1.ID AND t2b.Date > t2.Date)
I have two tables. They have the same data but from different sources. I would like to return all columns from both tables for rows where the id in table 2 occurs more than once in table 1. Another way to look at it: if table2.id occurs only once in table1.id, don't bring it back.
I have been thinking it would be some combination of group by and order by clauses that can get this done, but it's not giving the right results. How would you express this in a SQL query?
Table1
| id | info | state | date |
| 1 | 123 | TX | 12-DEC-09 |
| 1 | 123 | NM | 12-DEC-09 |
| 2 | 789 | NY | 14-DEC-09 |
Table2
| id | info | state | date |
| 1 | 789 | TX | 14-DEC-09 |
| 2 | 789 | NY | 14-DEC-09 |
Output
| table2.id | table2.info | table2.state | table2.date | table1.id | table1.info | table1.state | table1.date |
| 1         | 789         | TX           | 14-DEC-09   | 1         | 123         | TX           | 12-DEC-09   |
| 1         | 789         | TX           | 14-DEC-09   | 1         | 123         | NM           | 12-DEC-09   |
If you are using MSSQL, try using a Common Table Expression:
WITH cte AS (SELECT T1.ID, COUNT(*) as Num FROM Table1 T1
INNER JOIN Table2 T2 ON T1.ID = T2.ID
GROUP BY T1.ID
HAVING COUNT(*) > 1)
SELECT * FROM cte
INNER JOIN Table1 T1 ON cte.ID = T1.ID
INNER JOIN Table2 T2 ON cte.ID = T2.ID
First, I would suggest adding an auto-incrementing column to your tables to make queries like this much easier to write (you still keep your ID as you have it now for relational-mapping). For example:
Table 1:
TableID int
ID int
Info int
State varchar
Date date
Table 2:
TableID int
ID int
Info int
State varchar
Date date
Then your query becomes really easy, with no need for grouping, CTEs, or ROW_NUMBER() OVER partitioning:
SELECT *
FROM Table2 T2
JOIN Table1 T1
ON T2.ID = T1.ID
JOIN Table1 T1Duplicate
ON T2.ID = T1Duplicate.ID
AND T1.TableID <> T1Duplicate.TableID
It's a lot easier to read. Furthermore, there are lots of scenarios where an auto-incrementing ID field is beneficial.
I find this a much simpler way to do it:
select TableA.*,TableB.*
from TableA
inner join TableB
on TableA.id=TableB.id
where TableA.id in
(select distinct id
from TableA
group by id
having count(*) > 1)
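Applied to the sample Table1 and Table2 from the question, the subquery approach can be sketched as follows. SQLite through Python's sqlite3 module is my assumption for the engine, dates are kept as plain text, and the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, info INTEGER, state TEXT, date TEXT);
    CREATE TABLE Table2 (id INTEGER, info INTEGER, state TEXT, date TEXT);
    INSERT INTO Table1 VALUES
        (1, 123, 'TX', '12-DEC-09'),
        (1, 123, 'NM', '12-DEC-09'),
        (2, 789, 'NY', '14-DEC-09');
    INSERT INTO Table2 VALUES
        (1, 789, 'TX', '14-DEC-09'),
        (2, 789, 'NY', '14-DEC-09');
""")

# Keep only ids that appear more than once in Table1, then join both tables.
dupes = conn.execute("""
    SELECT t2.*, t1.*
    FROM Table2 t2
    INNER JOIN Table1 t1 ON t1.id = t2.id
    WHERE t1.id IN (SELECT id FROM Table1 GROUP BY id HAVING COUNT(*) > 1)
    ORDER BY t1.state
""").fetchall()

for row in dupes:
    print(row)
# (1, 789, 'TX', '14-DEC-09', 1, 123, 'NM', '12-DEC-09')
# (1, 789, 'TX', '14-DEC-09', 1, 123, 'TX', '12-DEC-09')
```

The row for id 2 is left out because that id occurs only once in Table1.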
gurus!
I'm using SQL Server linked tables in Access Forms. In MainTable I need to update and insert records, but Access won't allow it; for updates it says "This Recordset is not updateable". I know it's because of DISTINCT, but that is necessary for the TableType records: I need only one related name_ds from TableType (even just the first by npr), so that the result has just these 7 MainTable records and not 16 (as without DISTINCT).
Any workarounds?
Simple structure -
MainTable: id, npr, name, type, datasource_fk.
TableDS: id, name_ds, something.
TableType: id, npr, name_type, something_type.
Data -
MainTable:
1;12;"Olie";"percentage";1
2;15;"Tol";"count";2
3;13;"Opp";"percentage";1
4;12;"Hypq";"count";3
5;14;"Gete";"count";1
6;;"Mour";"count";2
7;;"Ellt";"percentage";3
TableDS:
1;"City1";"q"
2;"City2";"a"
3;"State1";"z"
4;"State2";"x"
TableType:
1;12;"City1";"w"
2;15;"City1";"s"
3;13;"City1";"x"
4;14;"City2";"w"
5;14;"City1";"s"
6;13;"City3";"p"
7;12;"City1";"t"
8;12;"City1";"n"
9;12;"State1";"r"
10;15;"State1";"r"
SQL, result -
SELECT DISTINCT t3.npr AS npr_type, t1.npr, t1.id, t1.name, t2.name_ds, t1.datasource_fk, t1.types
FROM (MainTable AS t1 LEFT JOIN TableDS AS t2 ON t1.datasource_fk = t2.id) LEFT JOIN TableType AS t3 ON t1.npr = t3.npr;
---------------------------------------------------------------------
| npr_type | npr | id | name | name_ds | datasource_fk | types      |
---------------------------------------------------------------------
|          |     | 6  | Mour | City2   | 2             | count      |
|          |     | 7  | Ellt | State1  | 3             | percentage |
| 12       | 12  | 1  | Olie | City1   | 1             | percentage |
| 12       | 12  | 4  | Hypq | State1  | 3             | count      |
| 13       | 13  | 3  | Opp  | City1   | 1             | percentage |
| 14       | 14  | 5  | Gete | City1   | 1             | count      |
| 15       | 15  | 2  | Tol  | City2   | 2             | count      |
---------------------------------------------------------------------
You are getting 16 matches on your joins because MainTable npr column matches multiple times with TableType npr column.
1;12;"Olie";"percentage";1
Matches to
7;12;"City1";"t"
8;12;"City1";"n"
9;12;"State1";"r"
1;12;"City1";"w"
Your best bet is to use a where clause on the column TableType.something_type. You can try a LEFT JOIN on TableDS and TableType using multiple columns, but really, you may need to adjust your data. In other words, deactivate some rows. The following query will show you what you're up against:
SELECT t3.npr AS npr_type,
t1.npr,
t1.id,
t1.name,
t2.name_ds,
t1.datasource_fk,
t1.type,
t3.something_type
FROM #MainTable t1
LEFT JOIN #TableDS AS t2
ON t1.datasource_fk = t2.id
LEFT JOIN #TableType AS t3
ON t1.npr = t3.npr
ORDER BY t3.npr,
t1.npr,
t1.id,
t1.name,
t2.name_ds,
t1.datasource_fk,
t1.type,
t3.something_type
So, after you figure out your data, you may be able to do something like:
SELECT t3.npr AS npr_type,
t1.npr,
t1.id,
t1.name,
t2.name_ds,
t1.datasource_fk,
t1.type,
t3.something_type
FROM #MainTable t1
LEFT JOIN #TableDS AS t2
ON t1.datasource_fk = t2.id
LEFT JOIN #TableType AS t3
ON t1.npr = t3.npr
WHERE
(t1.npr = 12 AND t3.something_type = 'n')
OR
(t1.npr = 14 AND t3.something_type = 's')
OR
(t1.npr = 13 AND t3.something_type = 'p')
OR
(t1.npr = 15 AND t3.something_type = 's')
OR
(t1.npr IS NULL)
Sorry for the broad title, I had a hard time coming up with a brief way of describing what I am looking to do. I have two tables (examples below) that I want to join but under a certain condition.
The main table has a field called "DateVal", the second table has a field called "Day". After joining on field "JoinField" I only want to keep rows where the day value in "DateVal" is less than the value of "Day". However, if this criterion is met for multiple values of "Day", I only want to keep the first instance.
In the second table below, for JoinField "A" there are three rows: for the first I only want it to return times when the day of the month is between 1-10, for the second only when the day of the month is between 11-20, and for the last 21-31.
A left or inner join will bring back all values, the only way I can think of to get around this is to do a complete join and only return for min("Day"). Can anyone think of a more efficient way?
Thanks in advance.
Table 1
-------------------------------
| ID | JoinField | DateVal |
-------------------------------
| 1 | A | 01/01/2014 |
| 2 | A | 01/16/2014 |
| 3 | B | 05/20/2013 |
-------------------------------
Table 2
--------------------------------
| JoinField | Day | FieldToAdd |
--------------------------------
| A | 10 | A |
| A | 20 | AA |
| A | 31 | AAA |
| B | 15 | B |
| B | 31 | BB |
--------------------------------
Desired Results
--------------------------------------------
| ID | JoinField | DateVal | FieldToAdd |
--------------------------------------------
| 1 | A | 01/01/2014 | A |
| 2 | A | 01/16/2014 | AA |
| 3 | B | 05/20/2014 | BB |
--------------------------------------------
You can do this in a variety of ways. I think a correlated subquery is the easiest way to express it, but unfortunately, the following doesn't work in Oracle:
select t1.*,
(select t.fieldtoadd
from (select t2.*
from table2 t2
where t2.joinfield = t1.joinfield and
t2.day >= extract(day from t1.dateval)
order by t2.day
) t
where rownum = 1
) as fieldtoadd
from table1 t1;
You can instead do this with a join and window functions:
select *
from (select t1.*, t2.fieldtoadd,
row_number() over (partition by t1.id order by t2.day) as seqnum
from table1 t1 left outer join
table2 t2
on t2.joinfield = t1.joinfield and
t2.day >= extract(day from t1.dateval)
) t
where seqnum = 1;
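For comparison, here is a sketch of the correlated-subquery form on an engine that does allow it. It uses SQLite through Python's sqlite3 module, which is an assumption, since the question is about Oracle. Dates are stored as ISO strings and the day of the month is taken with strftime('%d', ...), both of which are SQLite-specific choices of mine. The bucket boundary is treated as inclusive (day 10 falls in the 1-10 bucket).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER, JoinField TEXT, DateVal TEXT);
    CREATE TABLE table2 (JoinField TEXT, Day INTEGER, FieldToAdd TEXT);
    -- DateVal values from the question as ISO strings (assumption).
    INSERT INTO table1 VALUES
        (1, 'A', '2014-01-01'), (2, 'A', '2014-01-16'), (3, 'B', '2013-05-20');
    INSERT INTO table2 VALUES
        ('A', 10, 'A'), ('A', 20, 'AA'), ('A', 31, 'AAA'),
        ('B', 15, 'B'), ('B', 31, 'BB');
""")

# For each table1 row, pick the table2 row with the smallest Day that is
# still on or after the day of the month in DateVal.
bucketed = conn.execute("""
    SELECT t1.ID, t1.JoinField, t1.DateVal,
           (SELECT t2.FieldToAdd
            FROM table2 t2
            WHERE t2.JoinField = t1.JoinField
              AND t2.Day >= CAST(strftime('%d', t1.DateVal) AS INTEGER)
            ORDER BY t2.Day
            LIMIT 1) AS FieldToAdd
    FROM table1 t1
    ORDER BY t1.ID
""").fetchall()

for row in bucketed:
    print(row)
# (1, 'A', '2014-01-01', 'A')
# (2, 'A', '2014-01-16', 'AA')
# (3, 'B', '2013-05-20', 'BB')
```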