Joining and grouping to equate on two tables - sql

I've tried to minify this problem as much as possible. I've got two tables which share some Id's (among other columns)
tbl1    tbl2
id      id
----    ----
1       1
1       1
2       1
        2
        2
Firstly, I can get each table to resolve to a simple count of how many of each Id there is:
select id, count(*) from tbl1 group by id
select id, count(*) from tbl2 group by id
id | tbl1-count    id | tbl2-count
---------------    ---------------
1  | 2             1  | 3
2  | 1             2  | 2
but then I'm at a loss, I'm trying to get the following output which shows the count from tbl2 for each id, divided by the count from tbl1 for the same id:
id | count of id in tbl2 / count of id in tbl1
==========
1 | 1.5
2 | 2
So far I've got this:
select tbl1.Id, tbl2.Id, count(*)
from tbl1
join tbl2 on tbl1.Id = tbl2.Id
group by tbl1.Id, tbl2.Id
which just gives me... well... something nowhere near what I need, to be honest! I was trying count(tbl1.Id), count(tbl2.Id) but get the same multiplied amount (because I'm joining I guess?) - I can't get the individual representations into individual columns where I can do the division.

This gives consideration to your naming of tables -- the query from tbl2 needs to be first so the results will include all records from tbl2. The LEFT JOIN will include all results from the first query, but only join those results that exist in tbl1. (Alternatively, you could use a FULL OUTER JOIN or UNION both results together in the first query.) I also added an IIF to give you an option if there are no records in tbl1 (dividing by null would produce null anyway, but you can do what you want).
Counts are cast as decimal so that the ratio will be returned as a decimal. You can adjust precision as required.
SELECT tb2.id, tb2.table2Count, tb1.table1Count,
IIF(ISNULL(tb1.table1Count, 0) != 0, tb2.table2Count / tb1.table1Count, null) AS ratio
FROM (
SELECT id, CAST(COUNT(1) AS DECIMAL(18, 5)) AS table2Count
FROM tbl2
GROUP BY id
) AS tb2
LEFT JOIN (
SELECT id, CAST(COUNT(1) AS DECIMAL(18, 5)) AS table1Count
FROM tbl1
GROUP BY id
) AS tb1 ON tb1.id = tb2.id
(A subquery with a LEFT JOIN lets the query optimizer decide how to generate the results and will generally outperform a CROSS APPLY, as that executes a calculation for every record.)
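To see what the LEFT JOIN buys you, here is a rough sketch that runs the same shape of query against SQLite through Python's sqlite3 module. SQLite is only a stand-in engine here (the question is about SQL Server), the extra id 3 in tbl2 is a made-up row with no match in tbl1, and a CASE expression replaces IIF/ISNULL to keep the SQL portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (id INTEGER);
    CREATE TABLE tbl2 (id INTEGER);
    INSERT INTO tbl1 VALUES (1), (1), (2);
    -- id 3 is a hypothetical extra row that has no match in tbl1
    INSERT INTO tbl2 VALUES (1), (1), (1), (2), (2), (3);
""")

rows = conn.execute("""
    SELECT tb2.id,
           -- CASE without ELSE yields NULL when tbl1 has no rows for the id
           CASE WHEN tb1.table1Count IS NOT NULL AND tb1.table1Count <> 0
                THEN tb2.table2Count * 1.0 / tb1.table1Count
           END AS ratio
    FROM (SELECT id, COUNT(*) AS table2Count FROM tbl2 GROUP BY id) AS tb2
    LEFT JOIN (SELECT id, COUNT(*) AS table1Count FROM tbl1 GROUP BY id) AS tb1
      ON tb1.id = tb2.id
    ORDER BY tb2.id
""").fetchall()

print(rows)  # [(1, 1.5), (2, 2.0), (3, None)]
```

With an inner join the row for id 3 would disappear instead of showing a NULL ratio.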

Assuming your expected results are wrong, then this is how I would do it:
CREATE TABLE T1 (ID int);
CREATE TABLE T2 (ID int);
GO
INSERT INTO T1 VALUES(1),(1),(2);
INSERT INTO T2 VALUES(1),(1),(1),(2),(2);
GO
SELECT T1.ID AS OutID,
(T2.T2Count * 1.) / COUNT(T1.ID) AS OutCount --Might want a CONVERT to a smaller scale and precision decimal here
FROM T1
CROSS APPLY (SELECT T2.ID, COUNT(T2.ID) AS T2Count
FROM T2
WHERE T2.ID = T1.ID
GROUP BY T2.ID) T2
GROUP BY T1.ID,
T2.T2Count;
GO
DROP TABLE T1;
DROP TABLE T2;

You can aggregate in subqueries and then join:
select t1.id, t2.cnt * 1.0 / t1.cnt
from (select id, count(*) as cnt
from tbl1
group by id
) t1 join
(select id, count(*) as cnt
from tbl2
group by id
) t2
on t1.id = t2.id
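The multiplied counts from the question come from the join pairing every tbl1 row with every matching tbl2 row before anything is counted. A minimal sketch of the difference, assuming SQLite via Python's sqlite3 module as a stand-in for SQL Server:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (id INTEGER);
    CREATE TABLE tbl2 (id INTEGER);
    INSERT INTO tbl1 VALUES (1), (1), (2);
    INSERT INTO tbl2 VALUES (1), (1), (1), (2), (2);
""")

# Join first, count second: each id is counted rows_in_tbl1 * rows_in_tbl2 times.
joined_counts = conn.execute("""
    SELECT tbl1.id, COUNT(*)
    FROM tbl1
    JOIN tbl2 ON tbl1.id = tbl2.id
    GROUP BY tbl1.id
    ORDER BY tbl1.id
""").fetchall()

# Count first, join second: one row per id on each side, so the division is safe.
ratios = conn.execute("""
    SELECT t1.id, t2.cnt * 1.0 / t1.cnt AS ratio
    FROM (SELECT id, COUNT(*) AS cnt FROM tbl1 GROUP BY id) AS t1
    JOIN (SELECT id, COUNT(*) AS cnt FROM tbl2 GROUP BY id) AS t2
      ON t1.id = t2.id
    ORDER BY t1.id
""").fetchall()

print(joined_counts)  # [(1, 6), (2, 2)]
print(ratios)         # [(1, 1.5), (2, 2.0)]
```

The `* 1.0` matters: without it both counts are integers and the division truncates.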

Related

How to write BigQuery/SQL query to divide average of a column in one table from column in second/another table

I am looking for SQL query to SELECT ratio of Value1/Value2 from two tables as below..
Table_1:
-------
id Type Count
1. A. 2
2. B. 3
3. A. 1
4. A. 4
5. B. 2
Table_2:
id Type Max
1. A. 10
2. B. 10
where Value1 = SELECT AVG(Count) FROM Table_1 GROUP BY Type
and Value2 = SELECT Max FROM Table_2.
I tried below query
WITH table1 (SELECT AVG(Count) as avg_cnt...),
table2 (SELECT Max as max_val....)
SELECT table1.avg_cnt/table2.max_val
FROM ..
But it did not work.
You can just aggregate and calculate:
select t2.type, avg(t1.count) / t2.max
from table1 t1 join
table2 t2
on t1.type = t2.type
group by t2.type, t2.max;
No subqueries or CTEs are needed.
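As a quick check of the numbers, here is a sketch of the same join-and-aggregate query run on the sample data. It assumes SQLite through Python's sqlite3 module rather than BigQuery, and it quotes the count and max column names because they collide with function names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, type TEXT, "count" INTEGER);
    CREATE TABLE table2 (id INTEGER, type TEXT, "max" INTEGER);
    INSERT INTO table1 VALUES (1,'A',2), (2,'B',3), (3,'A',1), (4,'A',4), (5,'B',2);
    INSERT INTO table2 VALUES (1,'A',10), (2,'B',10);
""")

# AVG returns a float in SQLite, so the division is not truncated.
ratios = dict(conn.execute("""
    SELECT t2.type, AVG(t1."count") / t2."max" AS ratio
    FROM table1 t1
    JOIN table2 t2 ON t1.type = t2.type
    GROUP BY t2.type, t2."max"
""").fetchall())

for type_name in sorted(ratios):
    print(type_name, round(ratios[type_name], 4))
```

Type A averages (2 + 1 + 4) / 3 and type B averages (3 + 2) / 2, each divided by 10.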
You can try using a join:
Select A.cnt / table2.max
from
(
select type, avg(count) cnt
from table1
group by type
) A join table2 on A.type = table2.type

Getting common value count of text array column in Postgres

I have a table which looks like this:
id num
--- ----
1 {'1','2','3','3'}
2 {'2','3'}
3 {'5','6','7'}
Here id is a unique column and num is a text array which can contain duplicate elements. I want to do something like an intersection between two consecutive rows so that I get the count of common elements between num of two rows. Consider something like a set where duplicates are considered only once. For example, for the above table I am expecting something like the following:
id1 id2 count
--- --- -----
1 2 2
1 3 0
2 1 2
2 3 0
3 1 0
3 2 0
It is not necessary to get the output like the above. The only part I am concerned about is count.
I have the following query which gives the output only for one id compared with one other id:
select unnest(num) from emp where id=1
intersect
select unnest(num) from emp where id=2
How can I generalize it to get the required output?
A straightforward approach puts the intersection of the unnested arrays in a subquery and gets their count.
SELECT t1.id id1,
t2.id id2,
(SELECT count(*)
FROM (SELECT num1.num
FROM unnest(t1.num) num1(num)
INTERSECT
SELECT num2.num
FROM unnest(t2.num) num2(num)) x) count
FROM emp t1
INNER JOIN emp t2
ON t2.id > t1.id
ORDER BY t1.id,
t2.id;
Should you be only interested in whether the arrays share elements or not but not in the exact count, you can also use the overlap operator &&.
SELECT t1.id id1,
t2.id id2,
t1.num && t2.num intersection_not_empty
FROM emp t1
INNER JOIN emp t2
ON t2.id > t1.id
ORDER BY t1.id,
t2.id;
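If you want to double-check what the counts should be, the set semantics of INTERSECT can be mirrored outside the database. This is only a plain Python sketch over the sample rows, not a replacement for the Postgres query; the dict layout is my own assumption.

```python
from itertools import permutations

# Sample rows from the question: id -> num (a text array that may repeat values).
emp = {
    1: ["1", "2", "3", "3"],
    2: ["2", "3"],
    3: ["5", "6", "7"],
}

# set() drops duplicates and & is set intersection, mirroring INTERSECT.
common_counts = [
    (id1, id2, len(set(emp[id1]) & set(emp[id2])))
    for id1, id2 in permutations(sorted(emp), 2)
]

for row in common_counts:
    print(row)
```

This lists both directions of every pair, like the expected output in the question, whereas the queries here join on t2.id > t1.id and return each pair once.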
For the example data, this works:
with t as (
select v.*
from (values (1000, array['acct', 'hr']), (1005, array['dev', 'hr'])) v(empid, depts)
)
select t1.empid, t2.empid,
(select count(distinct d1)
from unnest(t1.depts) d1 join
unnest(t2.depts) d2
on d1 = d2
) cnt
from t t1 join
t t2
on t1.empid < t2.empid;
I'm not 100% sure this is what you intend, though.

Get data from 2 tables in one sql query

I have 2 tables which are as below in SQL and I want to get data from these 2 tables which is shown as Expected result
I suppose the values in columns 4, 5, and 6 must be the sums from T1 and T2:
SELECT
t1.No, t1.Month, t1.Salary + t2.Salary,
( t1.PresenceTime + t2.PresenceTime ) AS PresenceTime,
( t1.AbsencePaidTime + t2.AbsencePaidTime ) AS AbsencePaidTime,
( t1.PresenceTargetTime + t2.PresenceTargetTime ) AS PresenceTargetTime
FROM TABLE1 t1 JOIN TABLE2 t2 ON t1.No=t2.No AND t1.Month=t2.Month;
Not sure whether JOIN only on No or Month is enough.
This is not a join that you need, rather a UNION. You can do
SELECT * FROM TABLE1
UNION
SELECT * FROM TABLE2

Deleting equal number of records with positive and negative values in a table

I have a table with multiple negative and positive values. I want to delete the records with negative values along with an equal number of records that have the matching positive values. I'm not sure how to explain this scenario...
I will give a brief example:
I have a table with 6 records, of which 2 have a negative value and 4 have a positive value
Name | number
A | 1
A |-1
A | 1
A |-1
A | 1
A | 1
So here I want to delete an equal number of records with negative and positive values,
so my output should be
Name | Number
A | 1
A | 1
By using Row_number
;WITH CTE AS (
select *,ROW_NUMBER()OVER(PARTITION BY number ORDER BY (SELECT NULL)) -1 RN from Table1 )
Select Name, number from CTE WHERE RN NOT IN (1,0)
The following query assumes that your table has a column called id which is either a primary key or some other means of ordering your records. Without any order, your question cannot be answered, and in fact the data sample you showed us would have no meaning, since internally records have no order in a SQL database.
WITH cte1 AS (
SELECT t1.id, t1.number, SUM(t2.number) as sum
FROM yourTable t1
INNER JOIN yourTable t2 on t1.id >= t2.id
GROUP BY t1.id, t1.number
),
cte2 AS (
SELECT MAX(id) AS cutoff
FROM cte1
WHERE sum = 0
)
SELECT t.*
FROM yourTable t
WHERE t.id > (SELECT cutoff FROM cte2)
Note that I used the old school way of computing a running sum because you never told us the version of SQL Server which you are using. Hence, I didn't want to make assumptions about what you have available.
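Here is a rough sketch of the running-sum idea on the sample data, using SQLite through Python's sqlite3 module as a stand-in for SQL Server. The id column is an assumption (the question shows no ordering column), and I renamed the sum alias to running_sum for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The id column is an assumption: the original table shows no ordering column.
    CREATE TABLE yourTable (id INTEGER PRIMARY KEY, name TEXT, number INTEGER);
    INSERT INTO yourTable (name, number) VALUES
        ('A', 1), ('A', -1), ('A', 1), ('A', -1), ('A', 1), ('A', 1);
""")

remaining = conn.execute("""
    WITH cte1 AS (
        -- Self-join: for each row, sum every row up to and including it.
        SELECT t1.id, SUM(t2.number) AS running_sum
        FROM yourTable t1
        JOIN yourTable t2 ON t1.id >= t2.id
        GROUP BY t1.id
    ),
    cte2 AS (
        -- Last position where positives and negatives have cancelled out.
        SELECT MAX(id) AS cutoff FROM cte1 WHERE running_sum = 0
    )
    SELECT t.id, t.name, t.number
    FROM yourTable t
    WHERE t.id > (SELECT cutoff FROM cte2)
    ORDER BY t.id
""").fetchall()

print(remaining)  # [(5, 'A', 1), (6, 'A', 1)]
```

One thing to watch: if the running sum never reaches zero, cutoff is NULL and the final comparison returns no rows at all.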
declare @negvalrecs int = (select COUNT(*) from tab where Number < 0)
delete
from tab
where Number < 0
delete top (@negvalrecs)
from tab
where Number > 0
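A sketch of the same count-then-delete idea, assuming SQLite through Python's sqlite3 module. SQLite has no DELETE TOP, so a rowid subquery with LIMIT is swapped in for it; that substitution is mine and not part of the T-SQL above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab (name TEXT, number INTEGER);
    INSERT INTO tab VALUES ('A',1), ('A',-1), ('A',1), ('A',-1), ('A',1), ('A',1);
""")

# Count the negatives before deleting them.
neg_count = conn.execute("SELECT COUNT(*) FROM tab WHERE number < 0").fetchone()[0]
conn.execute("DELETE FROM tab WHERE number < 0")

# Remove the same number of positive rows (stand-in for DELETE TOP (n)).
conn.execute(
    "DELETE FROM tab WHERE rowid IN "
    "(SELECT rowid FROM tab WHERE number > 0 LIMIT ?)",
    (neg_count,),
)

remaining = conn.execute("SELECT name, number FROM tab").fetchall()
print(remaining)  # [('A', 1), ('A', 1)]
```

Note that this ignores name and magnitude, so it only fits data like the sample where every row is A with 1 or -1.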
Thanks for all your inputs!
I have a solution for it. We will be needing row number function for it.
--Providing row number to rows
select *,row_number () over (partition by name,number order by name) R into #1 from Table
--Taking negative values
select * into #2 from #1 where number<0
--Now Deleting those records from the main table by joining this table
delete #1 from #1 a inner join #2 b on a.name=b.name and a.number=b.number and a.r<=b.r
delete #1 from #1 a inner join #2 b on a.name=b.name and a.number=-(b.number) and a.r<=b.r
Hope it helps!
I recently encountered a similar problem and this is how I resolved it.
I also had records in the table where there were no negatives for a given name; the UNION ALL is there to bring in such records.
SELECT t1.name, t1.number
FROM table t1
LEFT OUTER JOIN
(SELECT name, number FROM table where number < 0) t2
ON
t1.name = t2.name and t1.number = t2.number
WHERE t1.number > 0 and t2.number IS NOT NULL
UNION ALL
SELECT t1.name, t1.number
FROM table t1
LEFT OUTER JOIN
(SELECT name, number FROM table where number < 0) t2
ON
t1.name = t2.name
WHERE t1.number > 0 and t2.number IS NULL;
Try this,
delete from table_name
where substring(ltrim(rtrim(number)),1,1)='-'

Querying two tables to filter data using select case

I have two tables
Table 1 looks like this
ID Repeats
-----------
A 1
A 1
A 0
B 2
B 2
C 2
D 1
Table 2 looks like this
ID values
-----------
A 100
B 200
C 100
D 300
Using a view I need a result like this
ID values Repeats
-------------------
A 100 NA
B 200 2
C 100 2
D 300 1
that means, I want unique ID, its values and Repeats. Repeats value should display NA when there are multiple values against single ID and it should display the Repeats value in case there is single value for repeats.
Initially I needed to display the max value of repeats so I tried the following view
ALTER VIEW [dbo].[BookingView1]
AS
SELECT bv.*, bd2.Repeats FROM Table1 bv
JOIN
(
SELECT distinct bd.id, bd.Repeats FROM table2 bd
JOIN
(
SELECT Id, MAX(Repeats) AS MaxRepeatCount
FROM table2
GROUP BY Id
) bd1
ON bd.Id = bd1.Id
AND bd.Repeats = bd1.MaxRepeatCount
) bd2
ON bv.Id = bd2.Id;
and this returns the correct result but when trying to implement the CASE it fails to return unique ID results. Please help!!
One method uses outer apply:
select t2.*, t1.repeats
from table2 t2 outer apply
(select (case when max(repeats) = min(repeats) then max(repeats)
else 'NA'
end) as repeats
from table1 t1
where t1.id = t2.id
) t1;
Two notes:
This assumes that repeats is a string. If it is a number, you need to cast it to a string.
It also assumes that repeats is not NULL.
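The MIN = MAX trick can be tried on the sample data with a small sketch. It assumes SQLite through Python's sqlite3 module, which has no OUTER APPLY, so a LEFT JOIN with GROUP BY is swapped in; repeats is stored as a number here and cast to text so both CASE branches return strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id TEXT, repeats INTEGER);
    CREATE TABLE table2 (id TEXT, "values" INTEGER);
    INSERT INTO table1 VALUES
        ('A',1), ('A',1), ('A',0), ('B',2), ('B',2), ('C',2), ('D',1);
    INSERT INTO table2 VALUES ('A',100), ('B',200), ('C',100), ('D',300);
""")

rows = conn.execute("""
    SELECT t2.id,
           t2."values",
           -- MIN = MAX means every repeats value for this id is the same
           CASE WHEN MIN(t1.repeats) = MAX(t1.repeats)
                THEN CAST(MAX(t1.repeats) AS TEXT)
                ELSE 'NA'
           END AS repeats
    FROM table2 t2
    LEFT JOIN table1 t1 ON t1.id = t2.id
    GROUP BY t2.id, t2."values"
    ORDER BY t2.id
""").fetchall()

for row in rows:
    print(row)
```

Only A mixes different repeats values (1 and 0), so it is the only id that comes back as NA.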
For the sake of completeness, I'm including another approach that will work if repeats is NULL. However, Gordon's answer has a much simpler query plan and should be preferred.
Option 1 (Works with NULLs):
SELECT
t1.ID, t2.[Values],
CASE
WHEN COUNT(*) > 1 THEN 'NA'
ELSE CAST(MAX(Repeats) AS VARCHAR(2))
END Repeats
FROM (
SELECT DISTINCT t1.ID, t1.Repeats
FROM #table1 t1
) t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values]
Option 2 (does not contain explicit subqueries, but does not work with NULLs):
SELECT DISTINCT
t1.ID,
t2.[Values],
CASE
WHEN COUNT(t1.Repeats) OVER (PARTITION BY COUNT(DISTINCT t1.Repeats), t1.ID) > 1 THEN 'NA'
ELSE CAST(t1.Repeats AS VARCHAR(2))
END Repeats
FROM #table1 t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values], t1.Repeats
NOTE:
This may not give desired results if table2 has different values for the same ID.