I have a simple question about how to count the occurrences of values in a table B that reference a value in a table A. The following example explains it better:
Let's say I have two simple tables: A with an attribute ID, and B with an attribute ID that references A.ID. I use this code to find the number of occurrences of each A.ID value in B:
SELECT A.ID, COUNT(*)
FROM A JOIN B ON A.ID = B.ID
GROUP BY A.ID
Is it possible to achieve the same result using something like the following code...
SELECT ID, -- SOMETHING --
FROM A
....
... using this subquery?
SELECT COUNT(*)
FROM B WHERE B.ID = A.ID
Thank you very much
I think you might be referring to a correlated subquery:
select a.id, (select count(1) from b where id=a.id) cnt from a;
The a.id term in the subquery binds with table a in the outer query.
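One difference between the two forms is worth checking: the inner join only returns IDs that occur in b, while the correlated subquery returns every row of a, with a count of 0 where nothing matches. Here is a minimal sketch using Python's sqlite3 module with an in-memory database; the sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER REFERENCES a(id));
    INSERT INTO a (id) VALUES (1), (2), (3);
    INSERT INTO b (id) VALUES (1), (1), (2);
""")

# Join + GROUP BY: id 3 has no rows in b, so it is absent from the result.
join_counts = conn.execute("""
    SELECT a.id, COUNT(*)
    FROM a JOIN b ON a.id = b.id
    GROUP BY a.id
    ORDER BY a.id
""").fetchall()

# Correlated subquery: evaluated once per row of a, so id 3 appears with 0.
subquery_counts = conn.execute("""
    SELECT a.id, (SELECT COUNT(*) FROM b WHERE b.id = a.id) AS cnt
    FROM a
    ORDER BY a.id
""").fetchall()

print(join_counts)      # [(1, 2), (2, 1)]
print(subquery_counts)  # [(1, 2), (2, 1), (3, 0)]
```

If the zero-count rows are unwanted, they can be filtered out afterwards; if they are wanted from the join form, a LEFT JOIN with COUNT(b.id) should give the same rows.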
Related
I'm trying to select * from two tables (a and b) using a join (columns a.id and b.id), given that the count of a column (b.owner) in b is lower than 3, i.e. a person's name can occur at most twice.
I've tried:
SELECT a.*, COUNT(b.owner) AS b_count
FROM a LEFT JOIN b on a.id = b.id
GROUP BY b.owner HAVING COUNT(b_count) <3
As I'm pretty new to SQL, I'm pretty stuck here. How can I resolve this issue? The result should be all columns for owners who do not appear more than twice in the data.
The query you are trying to run does not work because columns are missing from the GROUP BY clause.
As you are outputting all columns from table a (with SELECT a.*), you need to include all of those columns in the GROUP BY clause, so that the database knows which set of fields to group by when it performs the aggregation (in your case COUNT(b.owner)). The HAVING clause should also repeat the aggregate expression, COUNT(b.owner), rather than count the b_count alias.
Example
Considering that your table a has 3 columns below:
CREATE TABLE persons (
id INTEGER,
name VARCHAR(50),
birthday DATE,
PRIMARY KEY (id)
);
.. and your table b the following and referencing the first table as below:
CREATE TABLE sales (
id INTEGER,
person_id INTEGER,
sale_value DECIMAL,
PRIMARY KEY (id),
FOREIGN KEY (person_id) REFERENCES persons(id)
);
.. you should query it aggregating the COUNT() by those 3 columns:
SELECT a.id, a.name, a.birthday, COUNT(b.person_id) AS b_count
FROM persons a
LEFT JOIN sales b ON a.id = b.person_id
GROUP BY a.id, a.name, a.birthday
HAVING COUNT(b.person_id) < 3
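To see what this query returns, here is a small sketch run through Python's sqlite3 module. The persons and sales rows are invented for the example: Ann has three sales, Bob has one, and Cy has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (
        id INTEGER,
        name VARCHAR(50),
        birthday DATE,
        PRIMARY KEY (id)
    );
    CREATE TABLE sales (
        id INTEGER,
        person_id INTEGER,
        sale_value DECIMAL,
        PRIMARY KEY (id),
        FOREIGN KEY (person_id) REFERENCES persons(id)
    );
    -- Hypothetical sample rows.
    INSERT INTO persons VALUES
        (1, 'Ann', '1990-01-01'),
        (2, 'Bob', '1985-05-05'),
        (3, 'Cy',  '2000-03-03');
    INSERT INTO sales VALUES
        (1, 1, 10), (2, 1, 20), (3, 1, 30),  -- Ann: 3 sales
        (4, 2, 5);                           -- Bob: 1 sale, Cy: none
""")

# Every non-aggregated column of persons is listed in GROUP BY.
rows = conn.execute("""
    SELECT a.id, a.name, a.birthday, COUNT(b.person_id) AS b_count
    FROM persons a
    LEFT JOIN sales b ON a.id = b.person_id
    GROUP BY a.id, a.name, a.birthday
    HAVING COUNT(b.person_id) < 3
    ORDER BY a.id
""").fetchall()

print(rows)  # [(2, 'Bob', '1985-05-05', 1), (3, 'Cy', '2000-03-03', 0)]
```

Because of the LEFT JOIN, Cy is kept with a count of 0; COUNT(b.person_id) skips the NULL produced for unmatched rows.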
Alternative
If the count of records in the 2nd table is not important to you, you could use a different "strategy" here that avoids the JOIN between the tables (useful when joining two huge tables) and avoids repeating all the columns from a in both the SELECT and the GROUP BY. Note that, unlike the LEFT JOIN above, this only returns persons that have at least one row in the 2nd table.
First, identify the ids that have fewer than 3 occurrences:
SELECT b.person_id
FROM sales b
GROUP BY b.person_id
HAVING COUNT(b.id) < 3;
.. and using it in the WHERE clause to retrieve all the columns from the 1st table only for the ids that resulted from the previous query:
SELECT a.*
FROM persons a
WHERE a.id IN (....other query here....);
.. the query then reads in a more step-by-step order, which is perhaps easier to visualize while you are getting familiar with SQL:
SELECT a.*
FROM persons a
WHERE a.id IN (SELECT b.person_id
FROM sales b
GROUP BY b.person_id
HAVING COUNT(b.id) < 3);
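A quick check of the IN version, again as a sketch with Python's sqlite3 module and invented rows (Ann: three sales, Bob: one, Cy: none). It also shows a NOT IN variant, which is my own assumption about what you might want if persons without any sales should be kept; it relies on person_id never being NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (id INTEGER PRIMARY KEY, name VARCHAR(50), birthday DATE);
    CREATE TABLE sales (
        id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES persons(id),
        sale_value DECIMAL
    );
    -- Hypothetical sample rows.
    INSERT INTO persons VALUES
        (1, 'Ann', '1990-01-01'),
        (2, 'Bob', '1985-05-05'),
        (3, 'Cy',  '2000-03-03');
    INSERT INTO sales VALUES (1, 1, 10), (2, 1, 20), (3, 1, 30), (4, 2, 5);
""")

# IN: only ids that appear in sales fewer than 3 times. Cy never appears
# in sales, so Cy is not returned.
in_rows = conn.execute("""
    SELECT a.*
    FROM persons a
    WHERE a.id IN (SELECT b.person_id
                   FROM sales b
                   GROUP BY b.person_id
                   HAVING COUNT(b.id) < 3)
    ORDER BY a.id
""").fetchall()

# NOT IN variant (assumption): exclude ids with 3 or more sales instead,
# which keeps persons that have no sales at all.
not_in_rows = conn.execute("""
    SELECT a.*
    FROM persons a
    WHERE a.id NOT IN (SELECT b.person_id
                       FROM sales b
                       GROUP BY b.person_id
                       HAVING COUNT(b.id) >= 3)
    ORDER BY a.id
""").fetchall()

print(in_rows)      # [(2, 'Bob', '1985-05-05')]
print(not_in_rows)  # [(2, 'Bob', '1985-05-05'), (3, 'Cy', '2000-03-03')]
```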
In Standard SQL, you can use:
SELECT a.*, COUNT(b.owner) AS b_count
FROM a LEFT JOIN
b
ON a.id = b.id
GROUP BY a.id
HAVING COUNT(b.owner) < 3;
This may not work in all databases (and it assumes that a.id is unique/primary key). An alternative would be to use a correlated subquery:
SELECT a.*
FROM (SELECT a.*,
(SELECT COUNT(*)
FROM b
WHERE a.id = b.id
) as b_count
FROM a
) a
WHERE b_count < 3;
I'm trying to take the distinct IDs that appear in table a, filter table b for only these distinct IDs from table a, and present the remaining columns from b. I've tried:
SELECT * FROM
(
SELECT DISTINCT
a.ID,
a.test_group,
b.ch_name,
b.donation_amt
FROM table_a a
INNER JOIN table_b b
ON a.ID=b.ID
ORDER by a.ID;
) t
This doesn't seem to work. This query worked:
SELECT DISTINCT a.ID, a.test_group, b.ch_name, b.donation_amt
FROM table_a a
inner join table_b b
on a.ID = b.ID
order by a.ID
But I'm not entirely sure this is the correct way to go about it. Is this second query only going to take unique combinations of a.ID and a.test_group, or does it know to only take distinct values of a.ID, which is what I want?
Your first and second queries are essentially the same (the first one only fails because you cannot use ; inside a subquery). Once that is fixed, both produce the same result.
Even your second query, which you think is giving you the desired output, does not produce what you actually want.
DISTINCT works on the entire column list of the SELECT clause, not on a single column.
In your case, if the same a.ID has different a.test_group values, the result will contain multiple rows with the same a.ID and different a.test_group.
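To make the DISTINCT behaviour concrete, here is a small sketch with Python's sqlite3 module. The rows are hypothetical. The second query is only one possible reading of "filter table b for the IDs that appear in table a" and is an assumption on my part, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (ID INTEGER, test_group TEXT);
    CREATE TABLE table_b (ID INTEGER, ch_name TEXT, donation_amt REAL);
    -- Hypothetical rows: ID 1 appears in two test groups.
    INSERT INTO table_a VALUES (1, 'control'), (1, 'treatment'), (2, 'control');
    INSERT INTO table_b VALUES (1, 'charity_x', 10.0), (2, 'charity_y', 5.0),
                               (9, 'charity_z', 1.0);
""")

# DISTINCT compares whole rows, so ID 1 still shows up twice.
distinct_rows = conn.execute("""
    SELECT DISTINCT a.ID, a.test_group, b.ch_name, b.donation_amt
    FROM table_a a
    INNER JOIN table_b b ON a.ID = b.ID
    ORDER BY a.ID, a.test_group
""").fetchall()

# Assumed alternative: keep only rows of table_b whose ID exists in table_a,
# without duplicating them per test group.
filtered_b = conn.execute("""
    SELECT b.*
    FROM table_b b
    WHERE b.ID IN (SELECT ID FROM table_a)
    ORDER BY b.ID
""").fetchall()

print(distinct_rows)
# [(1, 'control', 'charity_x', 10.0), (1, 'treatment', 'charity_x', 10.0),
#  (2, 'control', 'charity_y', 5.0)]
print(filtered_b)
# [(1, 'charity_x', 10.0), (2, 'charity_y', 5.0)]
```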
I have this table:
and would like to convert it to the following:
Please help me, I've been stuck on this for way too long. Using GROUP BY isn't working for me.
WITH A as (SELECT id, a FROM XXX WHERE a is not null),
B as (SELECT id, b FROM XXX WHERE b is not null)
SELECT A.a, B.b, A.id FROM A
INNER JOIN B on A.id = B.id;
For this dataset, simple aggregation would do what you want:
select min(a) a, min(b) b, id
from mytable
group by id
This takes advantage of the fact that aggregate functions ignore null values; we could get the very same result with max() as we did with min().
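Since the original tables were not shown, here is a sketch under the assumption that each id has one row holding a (with b NULL) and another row holding b (with a NULL). It uses Python's sqlite3 module, and the values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER, a TEXT, b TEXT);
    -- Assumed shape of the data: each id is split across two sparse rows.
    INSERT INTO mytable VALUES
        (1, 'x', NULL), (1, NULL, 'y'),
        (2, 'p', NULL), (2, NULL, 'q');
""")

# min() skips NULLs, so each group collapses to its single non-NULL value.
min_rows = conn.execute("""
    SELECT min(a) AS a, min(b) AS b, id
    FROM mytable
    GROUP BY id
    ORDER BY id
""").fetchall()

# max() gives the same result when there is one non-NULL value per group.
max_rows = conn.execute("""
    SELECT max(a) AS a, max(b) AS b, id
    FROM mytable
    GROUP BY id
    ORDER BY id
""").fetchall()

print(min_rows)  # [('x', 'y', 1), ('p', 'q', 2)]
```

If an id had several non-NULL values in the same column, min() and max() would no longer agree, and you would have to decide which value to keep.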
I saw answers to a related question, but couldn't really apply what they are doing to my specific case.
I have a large table (300k rows) that I need to join with another even larger (1-2M rows) table efficiently. For my purposes, I only need to know whether a matching row exists in the second table. I came up with a nested query like so:
SELECT
id,
CASE cnt WHEN 0 then 'NO_MATCH' else 'YES_MATCH' end as match_exists
FROM
(
SELECT
A.id as id, count(*) as cnt
FROM
A, B
WHERE
A.id = B.foreing_id
GROUP BY A.id
) AS id_and_matches_count
Is there a better and/or more efficient way to do it?
Thanks!
You just want a left outer join:
SELECT
A.id as id, count(B.foreing_id) as cnt
FROM A
LEFT OUTER JOIN B ON
A.id = B.foreing_id
GROUP BY A.id
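Putting the CASE expression from the question on top of this left outer join could look like the sketch below, run with Python's sqlite3 module on invented rows. The column name foreing_id is kept as spelled in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY);
    CREATE TABLE B (foreing_id INTEGER);
    -- Hypothetical rows: id 2 has no match in B.
    INSERT INTO A (id) VALUES (1), (2), (3);
    INSERT INTO B (foreing_id) VALUES (1), (1), (3);
""")

# COUNT(B.foreing_id) ignores the NULL from unmatched rows, so it is 0
# exactly when no matching row exists.
matches = conn.execute("""
    SELECT
        A.id AS id,
        CASE WHEN COUNT(B.foreing_id) = 0 THEN 'NO_MATCH'
             ELSE 'YES_MATCH' END AS match_exists
    FROM A
    LEFT OUTER JOIN B ON A.id = B.foreing_id
    GROUP BY A.id
    ORDER BY A.id
""").fetchall()

print(matches)  # [(1, 'YES_MATCH'), (2, 'NO_MATCH'), (3, 'YES_MATCH')]
```

With the inner join from the question, cnt can never be 0, so the 'NO_MATCH' branch would never be reached; the outer join is what makes it reachable.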
Assume I have two data tables and a linking table as such:
A B A_B_Link
----- ----- -----
ID ID A_ID
Name Name B_ID
2 Questions:
I would like to write a query so that I have all of A's columns and a count of how many B's are linked to A, what is the best way to do this?
Is there a way to have a query return a row with all of the columns from A and a column containing all of linked names from B (maybe separated by some delimiter?)
Note that the query must return distinct rows from A, so a simple left outer join is not going to work here...I'm guessing I'll need nested select statements?
For your first question:
SELECT A.ID, A.Name, COUNT(ab.B_ID) AS bcount
FROM A LEFT JOIN A_B_Link ab ON (ab.A_ID = A.ID)
GROUP BY A.ID, A.Name;
This outputs one row per row of A, with the count of matching B's. Note that you must list all columns of A in the GROUP BY statement; there's no way to use a wildcard here.
An alternate solution is to use a correlated subquery, as @Ray Booysen shows:
SELECT A.*,
(SELECT COUNT(*) FROM A_B_Link
WHERE A_B_Link.A_ID = A.ID) AS bcount
FROM A;
This works, but a correlated subquery may be evaluated once per row of A, so on some databases it can perform worse than the join.
For your second question, you need something like MySQL's GROUP_CONCAT() aggregate function. In MySQL, you can get a comma-separated list of B.Name per row of A like this:
SELECT A.*, GROUP_CONCAT(B.Name) AS bname_list
FROM A
LEFT OUTER JOIN A_B_Link ab ON (A.ID = ab.A_ID)
LEFT OUTER JOIN B ON (ab.B_ID = B.ID)
GROUP BY A.ID;
There's no easy equivalent in Microsoft SQL Server before version 2017, which added STRING_AGG(). For older versions, check here for another question on SO about this:
"Simulating group_concat MySQL function in MS SQL Server 2005?"
Or Google for 'microsoft SQL server "group_concat"' for a variety of other solutions.
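Both queries can be tried out without a MySQL server. The sketch below uses Python's sqlite3 module, relying on the fact that SQLite also has a group_concat() aggregate; this is SQLite's function, not MySQL's, and the order of the concatenated names is not guaranteed. The sample rows are hypothetical, and all selected columns of A are listed in the GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE B (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE A_B_Link (A_ID INTEGER, B_ID INTEGER);
    -- Hypothetical rows: a1 links to two B rows, a2 links to none.
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO B VALUES (10, 'b10'), (11, 'b11');
    INSERT INTO A_B_Link VALUES (1, 10), (1, 11);
""")

# Question 1: one row per row of A with the number of linked B rows.
counts = conn.execute("""
    SELECT A.ID, A.Name, COUNT(ab.B_ID) AS bcount
    FROM A LEFT JOIN A_B_Link ab ON (ab.A_ID = A.ID)
    GROUP BY A.ID, A.Name
    ORDER BY A.ID
""").fetchall()

# Question 2: one row per row of A with the linked names joined by commas.
names = conn.execute("""
    SELECT A.ID, A.Name, GROUP_CONCAT(B.Name) AS bname_list
    FROM A
    LEFT OUTER JOIN A_B_Link ab ON (A.ID = ab.A_ID)
    LEFT OUTER JOIN B ON (ab.B_ID = B.ID)
    GROUP BY A.ID, A.Name
    ORDER BY A.ID
""").fetchall()

print(counts)  # [(1, 'a1', 2), (2, 'a2', 0)]
print(names)
```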
For #1
SELECT AOuter.*,
(SELECT COUNT(*) FROM A_B_Link WHERE A_B_Link.A_ID = AOuter.ID)
FROM A as AOuter
SELECT A.*, COUNT(B_ID)
FROM A
LEFT JOIN A_B_Link ab ON ab.A_ID=A.ID