How to get a distinct count of users from two related but different tables - sql

Apologies for this, but SQL is not a strong point for me, and whilst this appears similar to lots of other questions, I cannot translate those to this situation successfully.
I have two tables that are related by a common value (id in table 1 and Issue in table 2) if a row in table 2 exists.
I need to get a distinct count of users raising particular issues. I have users in both tables, with the table 2 user taking precedence if it exists.
There is always a REPORTER in Table 1, but there may not be a Stringvalue of Name (fieldtype = 1) in table 2. If there is a Stringvalue then that is the "User" and the Reporter can be ignored.
Table 1
| id | Reporter| Type |
| 1 | 111111 | 1 |
| 2 | 111111 | 2 |
| 3 | 222222 | 2 |
| 4 | 333333 | 1 |
| 5 | 111111 | 1 |
| 6 | 666666 | 1 |
Table 2
|issue | Stringvalue | fieldType|
| 1 | Fred | 1 |
| 1 | bananas | 2 |
| 2 | Jack | 1 |
| 5 | Steve | 1 |
I have a total of 4 issues of the right type (1,4,5,6), three reporters (111111,333333,666666) and two Stringvalues(Fred, Steve).
My total count of Distinct Users = 4 (Fred, 333333, Steve, 666666)
Result Table
| id| T1.Reporter | T2.Name |
| 1| Null | Fred |
| 4| 333333 | Null |
| 5| Null | Steve |
| 6| 666666 | Null |
How do I get this result in SQL!
Closest try so far:
SELECT
table1.REPORTER,
TO_CHAR(NULL) "NAME"
FROM table1
Where table1.TYPE =1
AND table1.REPORTER <> '111111'
Union
SELECT
TO_CHAR(NULL) "REPORTER",
table2.STRINGVALUE "NAME"
FROM table2,
table1
WHERE table2.ISSUE = table1.ID
AND table2.fieldtype= 1
and table1.issuetype = 1
Without explicitly excluding the default table 1 Reporter, this gets returned in my results even when there is a name value in table 2.
I have tried EXISTS and IN but cannot get the syntax right or the correct results. As soon as I try any join that links the ID and Issue values, the results always end up constrained to the matching rows or include all values. And adding additional conditions to the ON clause does not return the correct results.
I have tried too many permutations to list. Logically this sounds like it should be doable with a union plus WHERE EXISTS, or a left outer join, but my skills are lacking to make this work.

You need to use a LEFT JOIN and that is where you specify the fieldtype = 1 clause:
SELECT
table1.id,
CASE
WHEN table2.Stringvalue IS NOT NULL THEN table2.Stringvalue
ELSE table1.Reporter
END AS TheUser
FROM table1
LEFT JOIN table2 ON table1.id = table2.issue AND table2.fieldType = 1
WHERE table1.Type = 1
Result:
+------+---------+
| id | TheUser |
+------+---------+
| 1 | Fred |
| 4 | 333333 |
| 5 | Steve |
| 6 | 666666 |
+------+---------+

If I understand correctly, you want a left join and count(distinct). Here is what I think you are looking for:
select count(distinct coalesce(t2.stringvalue, t1.reporter) )
from table1 t1 left join
table2 t2
on t1.id = t2.issue and t2.fieldtype = 1
where t1.id in (1, 4, 5, 6);
You need to learn how to use explicit JOIN syntax. As a simple rule: Never use commas in the FROM clause. Always use explicit JOIN syntax. For one thing, it is more powerful, making it easy to express outer joins.
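The two answers above can be checked against the sample data with a small script. This is only a sketch: it uses Python's built-in sqlite3 module rather than the asker's database, the helper name distinct_users is made up for illustration, and it assumes Reporter is stored as text and that "the right type" means Type = 1.

```python
import sqlite3


def distinct_users():
    """Return (per-issue users, distinct user count) for the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (id INTEGER, reporter TEXT, type INTEGER);
        CREATE TABLE table2 (issue INTEGER, stringvalue TEXT, fieldtype INTEGER);
        INSERT INTO table1 VALUES
            (1, '111111', 1), (2, '111111', 2), (3, '222222', 2),
            (4, '333333', 1), (5, '111111', 1), (6, '666666', 1);
        INSERT INTO table2 VALUES
            (1, 'Fred', 1), (1, 'bananas', 2), (2, 'Jack', 1), (5, 'Steve', 1);
    """)
    # The fieldtype filter lives in the ON clause, so issues without a
    # name row survive the LEFT JOIN with NULLs on the table2 side.
    joined = """
        FROM table1 t1
        LEFT JOIN table2 t2 ON t1.id = t2.issue AND t2.fieldtype = 1
        WHERE t1.type = 1
    """
    rows = conn.execute(
        "SELECT t1.id, COALESCE(t2.stringvalue, t1.reporter) AS the_user"
        + joined + " ORDER BY t1.id"
    ).fetchall()
    count = conn.execute(
        "SELECT COUNT(DISTINCT COALESCE(t2.stringvalue, t1.reporter))" + joined
    ).fetchone()[0]
    conn.close()
    return rows, count


if __name__ == "__main__":
    users, total = distinct_users()
    print(users)
    print(total)
```

COALESCE picks the table 2 name when one exists and falls back to the Reporter otherwise, which is the same rule as the CASE expression in the first answer.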

Related

How can I define IIf parameters across different records in a table?

I've defined a query that filters out records that are null in a specific field. I'd like to also calculate a query field that returns the type of record that follows the record that was filtered out, if it matches the parameters. The way I thought to do this was with an IIf statement with multiple parameters:
Preparing: IIf([tblCustomers!OrderId]=([tblCustomers!OrderId]+1)
AND [tblCustomers!OrderStatus]="Preparing","Preparing","")
This didn't work as I hoped, but I wasn't too surprised, as it would have to return data from the field initially tested. So, the argument that adds 1 is actually doing nothing.
Is there a way to target the next record in the table, test if it matches one of two or three strings, then return which one it is?
Edit: Following @mazoula's solution, it seems a correlated subquery is indeed the answer here. Following the guide on allenbrowne.com (linked by June7), I seemed to be on the right track. Here is my code for retrieving the status of a previous record:
SELECT tblCustomers.AccountId,
tblCustomers.OrderId,
tblCustomers.OrderStatus,
tblCustomers.OrderShipped,
tblCustomers.OrderNotes,
(SELECT TOP 1 Dupe.OrderStatus
FROM tblCustomers AS Dupe
WHERE Dupe.AccountId = tblCustomers.AccountId
AND Dupe.OrderId > tblCustomers.OrderId
ORDER BY Dupe.AccountId DESC, Dupe.OrderId) AS NextStatus
FROM tblCustomers
WHERE (((tblCustomers.OrderShipped)="N") AND
((tblCustomers.OrderNotes) Is Null))
ORDER BY tblCustomers.AccountId DESC;
Unfortunately, I am met with the following error:
At most one record can be returned by this subquery
Doing a little more research, I found that incorporating an INNER JOIN expression should solve this.
...
FROM tblCustomers
INNER JOIN OrderStatus Dupe ON Dupe.AccountId = tblCustomers.AccountId
WHERE ...
This is where I've hit another roadblock and, when the syntax is at least correct, I receive the error:
Join expression not supported.
Is this a simple syntax issue, or have I misunderstood the role of a Join expression?
In Access 2016 I do this in two parts, because Access throws the error "must use an updateable query" when I try to update based on a subquery. For instance, suppose I want to replace the Null values in TableA.Field3 with 'a' if the next record's Field3 is 'a':
tableA:
-------------------------------------------------------------------------------------
| ID | Field1 | Field2 | Field3 |
-------------------------------------------------------------------------------------
| 1 | a | 1 | |
-------------------------------------------------------------------------------------
| 2 | b | 2 | |
-------------------------------------------------------------------------------------
| 3 | c | 3 | a |
-------------------------------------------------------------------------------------
| 4 | d | 4 | b |
-------------------------------------------------------------------------------------
| 5 | e | 5 | |
-------------------------------------------------------------------------------------
| 6 | f | 6 | b |
-------------------------------------------------------------------------------------
I make a table on which to base the update query:
Replacement: (SELECT TOP 1 Dupe.Field3 FROM [TableA] as Dupe WHERE Dupe.ID > [TableA].[ID])
'SQL PANE'
SELECT TableA.ID, TableA.Field1, TableA.Field2, TableA.Field3, (SELECT TOP 1 Dupe.Field3 FROM [TableA] as Dupe WHERE Dupe.ID > [TableA].[ID]) AS Replacement INTO TempTable
FROM TableA;
TempTable:
----------------------------------------------------------------------------------------------------------
| ID | Field1 | Field2 | Field3 | Replacement |
----------------------------------------------------------------------------------------------------------
| 1 | a | 1 | | |
----------------------------------------------------------------------------------------------------------
| 2 | b | 2 | | a |
----------------------------------------------------------------------------------------------------------
| 3 | c | 3 | a | b |
----------------------------------------------------------------------------------------------------------
| 4 | d | 4 | b | |
----------------------------------------------------------------------------------------------------------
| 5 | e | 5 | | b |
----------------------------------------------------------------------------------------------------------
| 6 | f | 6 | b | |
----------------------------------------------------------------------------------------------------------
Finally do the Update
UPDATE TempTable INNER JOIN TableA ON TempTable.ID = TableA.ID SET TableA.Field3 = [TempTable].[Replacement]
WHERE (((TempTable.Replacement)='a'));
TableA after update
-------------------------------------------------------------------------------------
| ID | Field1 | Field2 | Field3 |
-------------------------------------------------------------------------------------
| 1 | a | 1 | |
-------------------------------------------------------------------------------------
| 2 | b | 2 | a |
-------------------------------------------------------------------------------------
| 3 | c | 3 | a |
-------------------------------------------------------------------------------------
| 4 | d | 4 | b |
-------------------------------------------------------------------------------------
| 5 | e | 5 | |
-------------------------------------------------------------------------------------
| 6 | f | 6 | b |
Notes: In the Make Table query, remember to sort TableA and Dupe in the same way. Here we use the default sort of increasing ID for TableA, then grab the first record with a higher ID using the default sort again. The only reason I did the filtering to 'a' in the update query is that it made the Make Table query simpler.
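The "next record" lookup can be sketched outside Access to see what the correlated subquery returns. This example is an assumption-laden stand-in: it runs on SQLite through Python's sqlite3 module, uses LIMIT 1 where Access would use TOP 1, treats the blank Field3 cells as NULL, and adds an explicit ORDER BY instead of relying on the default sort. The function name next_field3 is hypothetical.

```python
import sqlite3


def next_field3():
    """For each row of TableA, fetch Field3 of the row with the next higher ID."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE TableA (ID INTEGER PRIMARY KEY, Field1 TEXT,
                             Field2 INTEGER, Field3 TEXT);
        INSERT INTO TableA VALUES
            (1, 'a', 1, NULL), (2, 'b', 2, NULL), (3, 'c', 3, 'a'),
            (4, 'd', 4, 'b'),  (5, 'e', 5, NULL), (6, 'f', 6, 'b');
    """)
    rows = conn.execute("""
        SELECT a.ID,
               (SELECT d.Field3
                FROM TableA AS d
                WHERE d.ID > a.ID      -- only later records
                ORDER BY d.ID          -- nearest one first
                LIMIT 1) AS Replacement
        FROM TableA AS a
        ORDER BY a.ID
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in next_field3():
        print(row)
```

Ordering the subquery by a unique key is what guarantees a single row per lookup; ordering by a column with ties is one way to end up with the "At most one record can be returned by this subquery" error in Access.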

Update rows in one table depending on a column in another table

I have 2 tables in different databases in SQL Server.
database1.table_A
id | name | present |
-----|------------|-----------|
1 | jon | 1 |
2 | ham | 0 |
3 | sam | 1 |
7 | tom | 1 |
database2.table_B
absentid |
----------|
1 |
5 |
7 |
For every id value present in table_B, I want the value of present in table_A to be 0. So, my final result should look like -
id | name | present |
-----|------------|-----------|
1 | jon | 0 |
2 | ham | 0 |
3 | sam | 1 |
7 | tom | 0 |
I want to confirm if the following query I wrote is correct or if there are any better ways to do this:
update database1.table_A
set present=0
FROM database1.table_A t1
inner join
database2.table_B t2
ON t1.id = t2.absentid;
If you want to set present = 1 if they are not in the table, then you would use left join:
update t1
set present = (case when t2.absentid is null then 1 else 0 end)
from database1.table_A t1 left join
database2.table_B t2
on t1.id = t2.absentid;
Otherwise, if you want to keep the value in that case, your version is fine.
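Both behaviours can be tried on the sample data. Note the swap: this sketch runs on SQLite via Python's sqlite3 module, not SQL Server, and it uses IN and EXISTS subqueries instead of UPDATE ... FROM, since that form is more portable across engines. Both tables sit in one in-memory database, and the function names are made up for illustration.

```python
import sqlite3


def make_db():
    """Build the sample tables in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_A (id INTEGER, name TEXT, present INTEGER);
        CREATE TABLE table_B (absentid INTEGER);
        INSERT INTO table_A VALUES
            (1, 'jon', 1), (2, 'ham', 0), (3, 'sam', 1), (7, 'tom', 1);
        INSERT INTO table_B VALUES (1), (5), (7);
    """)
    return conn


def mark_absent_only():
    """Set present = 0 for listed ids; leave every other row untouched."""
    conn = make_db()
    conn.execute(
        "UPDATE table_A SET present = 0 "
        "WHERE id IN (SELECT absentid FROM table_B)"
    )
    rows = conn.execute(
        "SELECT id, name, present FROM table_A ORDER BY id").fetchall()
    conn.close()
    return rows


def reset_all():
    """Set present = 0 for listed ids and present = 1 for all others."""
    conn = make_db()
    conn.execute("""
        UPDATE table_A
        SET present = CASE
            WHEN EXISTS (SELECT 1 FROM table_B b
                         WHERE b.absentid = table_A.id) THEN 0
            ELSE 1
        END
    """)
    rows = conn.execute(
        "SELECT id, name, present FROM table_A ORDER BY id").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(mark_absent_only())
    print(reset_all())
```

The difference shows up on row 2 (ham): the first form keeps its existing 0, while the second overwrites it with 1 because id 2 is not in table_B.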

difference between 'where' null and 'on' in a left join

Could someone explain to me why
select "talent".* from "talent"
left join "push" on "push"."talentId" = "talent"."id"
where ("push"."offerId" = '403' or "push"."offerId" is null)
yields fewer results than
select "talent".* from "talent"
left join "push" on "push"."talentId" = "talent"."id" and "push"."offerId" in ('403')
The way I see it, both should boil down to the same result, but they don't, and I'm not sure what I am missing.
The first one does not contain rows that have no entry in the push table.
I'd expect them to be caught by the or "push"."offerId" is null.
EDIT:
here is an example:
talent table
+----+------+
| id | name |
+----+------+
| 1 | John |
| 2 | Bob |
| 3 | Jack |
+----+------+
push table
+----+----------+---------+
| id | talentId | offerId |
+----+----------+---------+
| 1 | 1 | 403 |
| 2 | 1 | 42 |
| 3 | 2 | 123 |
| 3 | 2 | 456 |
+----+----------+---------+
With this data, the query with the where clause returns only
+----+------+---------+
| id | name | offerId |
+----+------+---------+
| 1 | John | 403 |
+----+------+---------+
while the one with the on condition returns all wanted rows
+----+------+---------+
| id | name | offerId |
+----+------+---------+
| 1 | John | 403 |
| 2 | Bob | null |
| 3 | Jack | null |
+----+------+---------+
The difference is when there is a match but on another row. This is best shown with a small example.
Consider:
t1:
x y
1 abc
1 def
2 xyz
t2:
x y
1 def
Then the left join version returns all three rows in t1:
select *
from t1 left join
t2
on t1.x = t2.x and t1.y = t2.y;
The filtering in the where clause version:
select *
from t1 left join
t2
on t1.x = t2.x
where t2.y = 'abc' or t2.y is null;
returns only one row. The row that is returned is 2/xyz: it has no match in t2, so t2.y is NULL. Both x = 1 rows do match in t2, so for them t2.y is not NULL. And it is 'def', not 'abc', so they are filtered out.
Here is a db<>fiddle.
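For readers without the fiddle, the t1/t2 example can be reproduced locally. This is a minimal sketch using Python's sqlite3 module (any engine should give the same rows for these two queries); the function names are made up for illustration.

```python
import sqlite3


def make_db():
    """Build t1 and t2 from the example above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t1 (x INTEGER, y TEXT);
        CREATE TABLE t2 (x INTEGER, y TEXT);
        INSERT INTO t1 VALUES (1, 'abc'), (1, 'def'), (2, 'xyz');
        INSERT INTO t2 VALUES (1, 'def');
    """)
    return conn


def filter_in_on():
    """Both conditions in ON: every t1 row is kept, matched or not."""
    conn = make_db()
    rows = conn.execute("""
        SELECT t1.x, t1.y, t2.y
        FROM t1 LEFT JOIN t2 ON t1.x = t2.x AND t1.y = t2.y
        ORDER BY t1.x, t1.y
    """).fetchall()
    conn.close()
    return rows


def filter_in_where():
    """Condition in WHERE: applied after the join, so it can drop t1 rows."""
    conn = make_db()
    rows = conn.execute("""
        SELECT t1.x, t1.y, t2.y
        FROM t1 LEFT JOIN t2 ON t1.x = t2.x
        WHERE t2.y = 'abc' OR t2.y IS NULL
        ORDER BY t1.x, t1.y
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(filter_in_on())
    print(filter_in_where())
```

In the second query the two x = 1 rows are joined to t2's 'def' row first, so by the time WHERE runs their t2.y is neither 'abc' nor NULL.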
Yes, there is something you are missing.
WHERE and join conditions are only exchangeable for inner joins.
An outer join a LEFT JOIN b ON ... is defined as:
the result of the inner join
in addition, for every row in a that did not find a match that way, we get a result row where the b values are replaced with NULL.
So, no matter what the join condition is, the result will always contain at least one row for each row of a.
But a WHERE condition is evaluated (logically) after the join, so it can exclude rows from a from the query result.

Selecting column from one table and count from another

t1
id | name | include
-------------------
1 | foo | true
2 | bar | true
3 | bum | false
t2
id | some | table_1_id
-------------------------
1 | 42 | 1
2 | 43 | 1
3 | 42 | 2
4 | 44 | 1
5 | 44 | 3
Desired output:
name | count(some)
------------------
foo | 3
bar | 1
What I have currently from looking through other solutions here:
SELECT a.name,
COUNT(r.some)
FROM t1 a
JOIN t2 r on a.id=r.table_1_id
WHERE a.include = 'true'
GROUP BY a.id,
r.some;
but that seems to get me
name | count(r.some)
--------------------
foo | 1
foo | 1
bar | 1
foo | 1
I'm no sql expert (I can do simple queries) so I'm googling around as well but finding most of the solutions I find give me this result. I'm probably missing something really easy.
Just remove the second column from the group by clause
SELECT a.name,
COUNT(r.some)
FROM t1 a
JOIN t2 r on a.id=r.table_1_id
WHERE a.include = 'true'
GROUP BY a.name
Columns you use only inside an aggregate function such as sum() or count() should be left out of the GROUP BY clause. Put only those columns there whose distinct values should each produce one output row.
This is because grouping on multiple columns puts rows in the same group only when all of the grouped column values are the same.
See "Using group by on multiple columns" for more info.
In your case, wherever some is equal, table_1_id is not (and vice versa), so no rows can be grouped together and each one is displayed individually.
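The effect of the extra GROUP BY column can be seen by running both versions on the question's data. This is a sketch on SQLite via Python's sqlite3 module; it assumes include is stored as the text 'true'/'false', quotes "include" and "some" defensively since some engines reserve them, and adds ORDER BY clauses only to make the output order predictable. The function names are hypothetical.

```python
import sqlite3


def make_db():
    """Build t1 and t2 from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t1 (id INTEGER, name TEXT, "include" TEXT);
        CREATE TABLE t2 (id INTEGER, "some" INTEGER, table_1_id INTEGER);
        INSERT INTO t1 VALUES (1, 'foo', 'true'), (2, 'bar', 'true'),
                              (3, 'bum', 'false');
        INSERT INTO t2 VALUES (1, 42, 1), (2, 43, 1), (3, 42, 2),
                              (4, 44, 1), (5, 44, 3);
    """)
    return conn


def grouped_by_id_and_some():
    """Original query: one group per (id, some) pair, so every count is 1."""
    conn = make_db()
    rows = conn.execute("""
        SELECT a.name, COUNT(r."some")
        FROM t1 a JOIN t2 r ON a.id = r.table_1_id
        WHERE a."include" = 'true'
        GROUP BY a.id, r."some"
        ORDER BY a.id, r."some"
    """).fetchall()
    conn.close()
    return rows


def grouped_by_name():
    """Fixed query: one group per name, counting all of its t2 rows."""
    conn = make_db()
    rows = conn.execute("""
        SELECT a.name, COUNT(r."some")
        FROM t1 a JOIN t2 r ON a.id = r.table_1_id
        WHERE a."include" = 'true'
        GROUP BY a.name
        ORDER BY a.name
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(grouped_by_id_and_some())
    print(grouped_by_name())
```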
If the entries are like,
id | some | table_1_id
-------------------------
1 | 42 | 1
2 | 43 | 1
3 | 42 | 2
4 | 42 | 1
Then the output would have been.,
name | count
------------------
foo | 2 (for 42)
foo | 1 (for 43)
bar | 1 (for 42)
So if you want to group on one column, as Juergen said, you can remove r.some from the GROUP BY clause.

Can I combine values from multiple rows in another table into multiple columns of one row using SQL?

I have two tables:
T1:
| M_ID | P_ID1 | P_ID2 | rest of T1 columns |
| 0 | 0 | 1 | ... |
| 1 | 2 | 3 | ... |
T2:
| P_ID | Type | A | B |
| 0 | 1 | a | e |
| 1 | 2 | b | f |
| 2 | 1 | c | g |
| 3 | 2 | d | h |
Now, I want to have a query that selects this:
| M_ID | P_1a | P_1b | P_2a | P_2b | rest of T1 columns |
| 0 | a | e | b | f | ... |
| 1 | c | g | d | h | ... |
So, in words: I want to select all columns from T1, but I want to replace P_ID1 with the columns from T2, where the P_ID is equal to P_ID1, and the type is 1, and basically the same for P_ID2.
I can obviously get the information I need with multiple queries, but I was wondering if there is a way that I can do this with one query. Any ideas?
I'm currently using SQL Server 2008r2, but I'd also be interested in solutions for other database software.
Thanks for the help!
Sure, you just need to use a join:
select T1.M_ID, t2_1.A as P_1a, t2_1.B as P_1b, t2_2.A as P_2a, t2_2.B as P_2b, ...
from T1, T2 t2_1, T2 t2_2
where T1.P_ID1 = t2_1.P_ID and T1.P_ID2 = t2_2.P_ID
Basically, we are joining T1 onto T2 twice, once for the P_1 values and a second time for the P_2 values. You need to alias T2 when you join it twice to distinguish between the two (that's what t2_1 and t2_2 are: a means of distinguishing between the two instances of the joined T2).
This is the same as @John Pickup's solution, only using modern join syntax:
select T1.M_ID, t2_1.A as P_1a, t2_1.B as P_1b, t2_2.A as P_2a, t2_2.B as P_2b, ...
from T1
join T2 t2_1 on T1.P_ID1 = t2_1.P_ID
join T2 t2_2 on T1.P_ID2 = t2_2.P_ID
I only post this as a separate answer because there is no code formatting in comments.
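The double join can be tried on the sample rows. This is a sketch on SQLite via Python's sqlite3 module rather than SQL Server 2008 R2; the "rest of T1 columns" are left out, the aliases p1 and p2 and the function name are made up, and the Type conditions in the ON clauses are an addition based on the question's wording, since the answers above do not include them.

```python
import sqlite3


def pivot_pairs():
    """Join T2 twice so each T1 row picks up A/B for both of its P_IDs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE T1 (M_ID INTEGER, P_ID1 INTEGER, P_ID2 INTEGER);
        CREATE TABLE T2 (P_ID INTEGER, "Type" INTEGER, A TEXT, B TEXT);
        INSERT INTO T1 VALUES (0, 0, 1), (1, 2, 3);
        INSERT INTO T2 VALUES (0, 1, 'a', 'e'), (1, 2, 'b', 'f'),
                              (2, 1, 'c', 'g'), (3, 2, 'd', 'h');
    """)
    rows = conn.execute("""
        SELECT T1.M_ID,
               p1.A AS P_1a, p1.B AS P_1b,
               p2.A AS P_2a, p2.B AS P_2b
        FROM T1
        JOIN T2 AS p1 ON T1.P_ID1 = p1.P_ID AND p1."Type" = 1  -- first lookup
        JOIN T2 AS p2 ON T1.P_ID2 = p2.P_ID AND p2."Type" = 2  -- second lookup
        ORDER BY T1.M_ID
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in pivot_pairs():
        print(row)
```

With inner joins, a T1 row disappears if either lookup finds nothing; switching both to LEFT JOIN would keep such rows with NULLs in the missing columns.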