Select rows from table such that sum of computed values of their column is less than given limit - sql

I have a table myTable of the following structure:
id: int PRIMARY KEY
number: int
I'd like to randomly select 3 rows at most from myTable under the conditions:
1. Three rows at most should be selected.
2. We should select id and a value calculated as 0.3*RAND()*number converted to INT. The alias of the computed column is randomValue.
3. Only rows with randomValue>0 should be included in the result.
4. The sum of the randomValues should be less than a given threshold, say 60.
So far, I've written this:
SELECT TOP 3 id,randomValue
FROM(
SELECT id, CONVERT(INT,(0.3*RAND()*number)) AS randomValue
FROM myTable
WHERE number>0
) AS D
WHERE randomValue>0
ORDER BY NEWID()
The code above selects at most 3 random rows where randomValue is greater than zero. However, I don't know how to fulfill condition 4, i.e. how to ensure that the sum of randomValues in the selected rows is less than 60.
This is myTable where I'm testing the solutions:

WITH
random_values AS (
SELECT
id,
CONVERT(INT,(0.3*RAND()*number)) AS randomValue
FROM myTable
WHERE number>0
),
valid_sets AS (
SELECT
t1.id id1,
t2.id id2,
t3.id id3
FROM random_values t1
INNER JOIN random_values t2 ON (t2.id > t1.id)
INNER JOIN random_values t3 ON (t3.id > t2.id)
WHERE t1.randomValue + t2.randomValue + t3.randomValue < 60
)
SELECT c.id,c.number
FROM (SELECT TOP 1 * FROM valid_sets ORDER BY NEWID()) a
UNPIVOT(id FOR n IN (id1,id2,id3)) b
INNER JOIN myTable c ON (b.id = c.id)
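To look at the selection logic on its own, here is a minimal Python sketch (not T-SQL). It assumes the random values have already been computed once per row, for example materialised into a temp table; the function name pick_rows and the sample data are made up for illustration. It enumerates every combination of one to three eligible rows whose sum stays under the limit and picks one of them at random, so it also covers the "at most 3" case.

```python
import itertools
import random


def pick_rows(rows, limit=60, max_rows=3, rng=random):
    """Return up to max_rows (id, value) pairs whose values sum to < limit.

    rows: iterable of (id, random_value) pairs, values already computed.
    Returns an empty list when no combination qualifies.
    """
    # Only rows with a positive value are eligible (randomValue > 0).
    eligible = [(row_id, value) for row_id, value in rows if value > 0]

    # Every combination of 1..max_rows eligible rows whose total stays
    # strictly below the limit.
    candidates = [
        combo
        for size in range(1, max_rows + 1)
        for combo in itertools.combinations(eligible, size)
        if sum(value for _, value in combo) < limit
    ]
    return list(rng.choice(candidates)) if candidates else []


# Hypothetical sample: ids with already-computed random values.
sample = [(1, 25), (2, 0), (3, 40), (4, 10), (5, 70)]
chosen = pick_rows(sample, rng=random.Random(0))
print(chosen)
```

Enumerating all combinations grows roughly with the cube of the row count, so this is only a sketch for small tables.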

Related

Joining and grouping to equate on two tables

I've tried to minify this problem as much as possible. I've got two tables which share some Id's (among other columns)
tbl1   tbl2
id     id
----   ----
1      1
1      1
2      1
       2
       2
Firstly, I can get each table to resolve to a simple count of how many of each Id there is:
select id, count(*) from tbl1 group by id
select id, count(*) from tbl2 group by id
id | tbl1-count     id | tbl2-count
---------------     ---------------
1  | 2              1  | 3
2  | 1              2  | 2
but then I'm at a loss, I'm trying to get the following output which shows the count from tbl2 for each id, divided by the count from tbl1 for the same id:
id | count of id in tbl2 / count of id in tbl1
==========
1 | 1.5
2 | 2
So far I've got this:
select tbl1.Id, tbl2.Id, count(*)
from tbl1
join tbl2 on tbl1.Id = tbl2.Id
group by tbl1.Id, tbl2.Id
which just gives me... well... something nowhere near what I need, to be honest! I was trying count(tbl1.Id), count(tbl2.Id) but get the same multiplied amount (because I'm joining I guess?) - I can't get the individual representations into individual columns where I can do the division.
This gives consideration to your naming of tables -- the query from tbl2 needs to be first so the results will include all records from tbl2. The LEFT JOIN will include all results from the first query, but only join those results that exist in tbl1. (Alternatively, you could use a FULL OUTER JOIN or UNION both results together in the first query.) I also added an IIF to give you an option if there are no records in tbl1 (dividing by null would produce null anyway, but you can do what you want).
Counts are cast as decimal so that the ratio will be returned as a decimal. You can adjust precision as required.
SELECT tb2.id, tb2.table2Count, tb1.table1Count,
IIF(ISNULL(tb1.table1Count, 0) != 0, tb2.table2Count / tb1.table1Count, null) AS ratio
FROM (
SELECT id, CAST(COUNT(1) AS DECIMAL(18, 5)) AS table2Count
FROM tbl2
GROUP BY id
) AS tb2
LEFT JOIN (
SELECT id, CAST(COUNT(1) AS DECIMAL(18, 5)) AS table1Count
FROM tbl1
GROUP BY id
) AS tb1 ON tb1.id = tb2.id
(A subquery with a LEFT JOIN will allow the query optimizer to determine how to generate the results and will generally outperform a CROSS APPLY, as that executes a calculation for every record.)
Assuming your expected results are wrong, then this is how I would do it:
CREATE TABLE T1 (ID int);
CREATE TABLE T2 (ID int);
GO
INSERT INTO T1 VALUES(1),(1),(2);
INSERT INTO T2 VALUES(1),(1),(1),(2),(2);
GO
SELECT T1.ID AS OutID,
(T2.T2Count * 1.) / COUNT(T1.ID) AS OutCount --Might want a CONVERT to a smaller scale and precision decimal here
FROM T1
CROSS APPLY (SELECT T2.ID, COUNT(T2.ID) AS T2Count
FROM T2
WHERE T2.ID = T1.ID
GROUP BY T2.ID) T2
GROUP BY T1.ID,
T2.T2Count;
GO
DROP TABLE T1;
DROP TABLE T2;
You can aggregate in subqueries and then join:
select t1.id, t2.cnt * 1.0 / t1.cnt
from (select id, count(*) as cnt
from tbl1
group by id
) t1 join
(select id, count(*) as cnt
from tbl2
group by id
) t2
on t1.id = t2.id
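As a quick check of the aggregate-then-join idea, here is a small sketch that runs the same query shape against the sample data. It uses Python's sqlite3 module with an in-memory SQLite database purely as a stand-in for SQL Server, which is an assumption of this example and not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (id INTEGER);
    CREATE TABLE tbl2 (id INTEGER);
    INSERT INTO tbl1 VALUES (1), (1), (2);
    INSERT INTO tbl2 VALUES (1), (1), (1), (2), (2);
""")

# Aggregate each table on its own first, then join the two small
# result sets, so the join cannot multiply the counts.
ratios = conn.execute("""
    SELECT t1.id, t2.cnt * 1.0 / t1.cnt AS ratio
    FROM (SELECT id, COUNT(*) AS cnt FROM tbl1 GROUP BY id) AS t1
    JOIN (SELECT id, COUNT(*) AS cnt FROM tbl2 GROUP BY id) AS t2
      ON t1.id = t2.id
    ORDER BY t1.id
""").fetchall()

print(ratios)  # [(1, 1.5), (2, 2.0)]
```

Because each side is reduced to one row per id before the join, the counts match the two separate GROUP BY queries from the question.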

SQL: Joining data from table with multiple conditions where value matches

Table1 has columns POS, ID, NAME, TYPE
I have been running a standard query as so;
SELECT POS, NAME FROM Table1 WHERE ID = 100 AND TYPE LIKE '%someType%' ORDER BY POS ASC
Table2 has columns POS, ID, VALUE, ROLE
SELECT POS, VALUE FROM Table2 WHERE ID = 100 AND ROLE LIKE '%someType%' ORDER BY POS ASC
I would like to combine these two in order to return a recordset with 3 columns: POS, Table1.NAME, and Table2.VALUE... No matter what I try with inner joins, I keep getting way more rows than I should. Also, if the corresponding value in the other table does not exist, I would like it to return a null or something, so that essentially a recordset could look like this:
POS NAME VALUE
1 A DF1
2 B DF1
3 C DF2
4 C null
5 null DF3
etc...
Is this possible at all?
You seem to be looking for a simple join. Something like:
select t1.pos, t1.name, t2.value
from table1 t1
inner join table2 t2
on t2.pos = t1.pos
and t2.id = t1.id
and t2.role like '%someType%'
where
t1.type like '%someType%'
and t1.id = 100
order by t1.pos
Please note that if either of your original queries returns more than one row for the same pos, you will get a Cartesian product of those rows in the result set.
Or if you want to allow rows coming from both tables, even if there is no match in the other table, then you can full outer join both queries, like:
select coalesce(t1.pos, t2.pos) pos, t1.name, t2.value
from (
select pos, name from table1 where id = 100 and type like '%someType%'
) t1
full outer join (
select pos, value from table2 where id = 100 and role like '%someType%'
) t2
on t1.pos = t2.pos
order by coalesce(t1.pos, t2.pos)
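To check the outer-join behaviour against the sample recordset from the question, here is a sketch using Python's sqlite3 module. SQLite is only a stand-in here, and since older SQLite versions lack FULL OUTER JOIN, the example swaps in a LEFT JOIN plus a UNION ALL of the unmatched table2 rows, which yields the same rows. The sample rows and the 'someType' values are assumptions reconstructed from the desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (pos INTEGER, id INTEGER, name TEXT, type TEXT);
    CREATE TABLE table2 (pos INTEGER, id INTEGER, value TEXT, role TEXT);
    INSERT INTO table1 VALUES
        (1, 100, 'A', 'someType'), (2, 100, 'B', 'someType'),
        (3, 100, 'C', 'someType'), (4, 100, 'C', 'someType');
    INSERT INTO table2 VALUES
        (1, 100, 'DF1', 'someType'), (2, 100, 'DF1', 'someType'),
        (3, 100, 'DF2', 'someType'), (5, 100, 'DF3', 'someType');
""")

# The left join keeps every filtered table1 row; the second branch adds
# the filtered table2 rows that found no partner in table1.
rows = conn.execute("""
    SELECT t1.pos, t1.name, t2.value
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t2.pos = t1.pos AND t2.id = t1.id AND t2.role LIKE '%someType%'
    WHERE t1.id = 100 AND t1.type LIKE '%someType%'
    UNION ALL
    SELECT t2.pos, NULL, t2.value
    FROM table2 t2
    WHERE t2.id = 100 AND t2.role LIKE '%someType%'
      AND NOT EXISTS (
          SELECT 1 FROM table1 t1
          WHERE t1.pos = t2.pos AND t1.id = 100
            AND t1.type LIKE '%someType%')
    ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

Positions 4 and 5 come back with a NULL (None in Python) on the side that has no matching row, as in the desired recordset.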

Getting common value count of text array column in Postgres

I have a table which looks like this:
id num
--- ----
1 {'1','2','3','3'}
2 {'2','3'}
3 {'5','6','7'}
Here id is a unique column and num is a text array which can contain duplicate elements. I want to do something like an intersection between every two rows so that I get the count of common elements between the num arrays of the two rows. Consider something like a set, where duplicates are counted only once. For example, for the above table I am expecting something like the following:
id1 id2 count
--- --- -----
1 2 2
1 3 0
2 1 2
2 3 0
3 1 0
3 2 0
It is not necessary to get the output like the above. The only part I am concerned about is count.
I have the following query which gives the output only for one id compared with one other id:
select unnest(num) from emp where id=1
intersect
select unnest(num) from emp where id=2
How can I generalize it to get the required output?
A straightforward approach puts the intersection of the unnested arrays in a subquery and gets their count.
SELECT t1.id id1,
t2.id id2,
(SELECT count(*)
FROM (SELECT num1.num
FROM unnest(t1.num) num1(num)
INTERSECT
SELECT num2.num
FROM unnest(t2.num) num2(num)) x) count
FROM emp t1
INNER JOIN emp t2
ON t2.id > t1.id
ORDER BY t1.id,
t2.id;
Should you be only interested in whether the arrays share elements or not but not in the exact count, you can also use the overlap operator &&.
SELECT t1.id id1,
t2.id id2,
t1.num && t2.num intersection_not_empty
FROM emp t1
INNER JOIN emp t2
ON t2.id > t1.id
ORDER BY t1.id,
t2.id;
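To sanity-check the expected counts outside the database, the set semantics of INTERSECT can be mirrored in a few lines of Python. This is only an illustration of the counting rule (duplicates count once); the function name common_counts is made up, and like the queries above it lists each pair once with id1 < id2.

```python
from itertools import combinations

# Sample rows from the question: id -> array of text elements.
emp = {
    1: ["1", "2", "3", "3"],
    2: ["2", "3"],
    3: ["5", "6", "7"],
}


def common_counts(table):
    """Return (id1, id2, count) for every pair of ids with id1 < id2."""
    return [
        # Converting to sets drops duplicates, like INTERSECT does.
        (id1, id2, len(set(table[id1]) & set(table[id2])))
        for id1, id2 in combinations(sorted(table), 2)
    ]


print(common_counts(emp))  # [(1, 2, 2), (1, 3, 0), (2, 3, 0)]
```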
For the example data, this works:
with t as (
select v.*
from (values (1000, array['acct', 'hr']), (1005, array['dev', 'hr'])) v(empid, depts)
)
select t1.empid, t2.empid,
(select count(distinct d1)
from unnest(t1.depts) d1 join
unnest(t2.depts) d2
on d1 = d2
) cnt
from t t1 join
t t2
on t1.empid < t2.empid;
I'm not 100% sure this is what you intend, though.

Deleting equal number of records with positive and negative values in a table

I have a table with multiple negative and positive values. I want to delete the records with negative values together with an equal number of records that have the matching positive value. I'm not sure how to explain this scenario...
I will give a brief example:
I have a table with 6 records, of which 2 have a negative value and 4 have a positive value:
Name | number
A | 1
A |-1
A | 1
A |-1
A | 1
A | 1
So here I want to delete an equal number of records with negative and positive values,
so my output should be
Name | Number
A | 1
A | 1
By using Row_number
;WITH CTE AS (
select *,ROW_NUMBER()OVER(PARTITION BY number ORDER BY (SELECT NULL)) -1 RN from Table1 )
Select Name, number from CTE WHERE RN NOT IN (1,0)
The following query assumes that your table has a column called id, which is either a primary key or provides some other means of ordering your records. Without any order, your question cannot be answered, and in fact the data sample you showed us would have no meaning, since records have no inherent order in a SQL database.
WITH cte1 AS (
SELECT t1.id, t1.number, SUM(t2.number) as sum
FROM yourTable t1
INNER JOIN yourTable t2 on t1.id >= t2.id
GROUP BY t1.id, t1.number
),
cte2 AS (
SELECT MAX(id) AS cutoff
FROM cte1
WHERE sum = 0
)
SELECT t.*
FROM yourTable t
WHERE t.id > (SELECT cutoff FROM cte2)
Note that I used the old school way of computing a running sum because you never told us the version of SQL Server which you are using. Hence, I didn't want to make assumptions about what you have available.
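Here is a small sketch of the running-sum idea run against the sample data, with both CTEs chained in a single WITH clause. It uses Python's sqlite3 module as a stand-in for SQL Server, and the id column and its values are assumptions added to give the rows an order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourTable (id INTEGER PRIMARY KEY, name TEXT, number INTEGER);
    INSERT INTO yourTable (id, name, number) VALUES
        (1, 'A', 1), (2, 'A', -1), (3, 'A', 1),
        (4, 'A', -1), (5, 'A', 1), (6, 'A', 1);
""")

remaining = conn.execute("""
    WITH cte1 AS (
        -- running sum of number up to and including each id
        SELECT t1.id, SUM(t2.number) AS running_sum
        FROM yourTable t1
        INNER JOIN yourTable t2 ON t1.id >= t2.id
        GROUP BY t1.id
    ),
    cte2 AS (
        -- last id at which positives and negatives cancel out
        SELECT MAX(id) AS cutoff FROM cte1 WHERE running_sum = 0
    )
    SELECT t.name, t.number
    FROM yourTable t
    WHERE t.id > (SELECT cutoff FROM cte2)
    ORDER BY t.id
""").fetchall()

print(remaining)  # [('A', 1), ('A', 1)]
```

If the running sum never reaches zero, cutoff is NULL and the query returns no rows; wrapping the subquery in COALESCE(..., 0) would be one way to handle that case.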
declare @negvalrecs int = (select COUNT(*) from tab where Number < 0)
delete
from tab
where Number < 0
delete top (@negvalrecs)
from tab
where Number > 0
Thanks for all your inputs!
I have a solution for it. We will be needing row number function for it.
--Providing row number to rows
select *,row_number () over (partition by name,number order by name) R into #1 from Table
--Taking negative values
select * into #2 from #1 where number<0
--Now Deleting those records from the main table by joining this table
delete #1 from #1 a inner join #2 b on a.name=b.name and a.number=b.number and a.r<=b.r
delete #1 from #1 a inner join #2 b on a.name=b.name and a.number=-(b.number) and a.r<=b.r
Hope it helps!
I recently encountered a similar problem and this is how I resolved it.
I also had records in the table with no negatives for a given name; the UNION ALL is there to bring in such records.
SELECT t1.name, t1.number
FROM table t1
LEFT OUTER JOIN
(SELECT name, number FROM table where number < 0) t2
ON
t1.name = t2.name and t1.number = t2.number
WHERE t1.number > 0 and t2.number IS NOT NULL
UNION ALL
SELECT t1.name, t1.number
FROM table t1
LEFT OUTER JOIN
(SELECT name, number FROM table where number < 0) t2
ON
t1.name = t2.name
WHERE t1.number > 0 and t2.number IS NULL;
Try this,
delete from table_name
where substring(ltrim(rtrim(number)),1,1)='-'

Querying two tables to filter data using select case

I have two tables
Table 1 looks like this
ID Repeats
-----------
A 1
A 1
A 0
B 2
B 2
C 2
D 1
Table 2 looks like this
ID values
-----------
A 100
B 200
C 100
D 300
Using a view I need a result like this
ID values Repeats
-------------------
A 100 NA
B 200 2
C 100 2
D 300 1
That is, I want each unique ID with its values and Repeats. Repeats should display NA when there are multiple different Repeats values for a single ID, and the Repeats value itself when there is only a single value.
Initially I needed to display the max value of repeats so I tried the following view
ALTER VIEW [dbo].[BookingView1]
AS
SELECT bv.*, bd2.Repeats FROM Table1 bv
JOIN
(
SELECT distinct bd.id, bd.Repeats FROM table2 bd
JOIN
(
SELECT Id, MAX(Repeats) AS MaxRepeatCount
FROM table2
GROUP BY Id
) bd1
ON bd.Id = bd1.Id
AND bd.Repeats = bd1.MaxRepeatCount
) bd2
ON bv.Id = bd2.Id;
and this returns the correct result but when trying to implement the CASE it fails to return unique ID results. Please help!!
One method uses outer apply:
select t2.*, t1.repeats
from table2 t2 outer apply
(select (case when max(repeats) = min(repeats) then max(repeats)
else 'NA'
end) as repeats
from table1 t1
where t1.id = t2.id
) t1;
Two notes:
This assumes that repeats is a string. If it is a number, you need to cast it to a string.
This also assumes that repeats is not null.
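To see the MAX = MIN test at work on the sample data, here is a sketch using Python's sqlite3 module as a stand-in for SQL Server. Since SQLite has no OUTER APPLY, the example swaps in a LEFT JOIN with GROUP BY, and it assumes repeats is numeric, so the value is cast to text to share a column with 'NA'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id TEXT, repeats INTEGER);
    CREATE TABLE table2 (id TEXT, "values" INTEGER);
    INSERT INTO table1 VALUES
        ('A', 1), ('A', 1), ('A', 0), ('B', 2), ('B', 2), ('C', 2), ('D', 1);
    INSERT INTO table2 VALUES ('A', 100), ('B', 200), ('C', 100), ('D', 300);
""")

# If the largest and smallest repeats for an id are equal, there is only
# one distinct value, so show it; otherwise show 'NA'.
rows = conn.execute("""
    SELECT t2.id, t2."values",
           CASE WHEN MAX(t1.repeats) = MIN(t1.repeats)
                THEN CAST(MAX(t1.repeats) AS TEXT)
                ELSE 'NA'
           END AS repeats
    FROM table2 t2
    LEFT JOIN table1 t1 ON t1.id = t2.id
    GROUP BY t2.id, t2."values"
    ORDER BY t2.id
""").fetchall()

for row in rows:
    print(row)
```

An id with no rows in table1 also ends up as 'NA' here, because the comparison of two NULLs is not true.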
For the sake of completeness, I'm including another approach that will work if repeats is NULL. However, Gordon's answer has a much simpler query plan and should be preferred.
Option 1 (Works with NULLs):
SELECT
t1.ID, t2.[Values],
CASE
WHEN COUNT(*) > 1 THEN 'NA'
ELSE CAST(MAX(Repeats) AS VARCHAR(2))
END Repeats
FROM (
SELECT DISTINCT t1.ID, t1.Repeats
FROM #table1 t1
) t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values]
Option 2 (does not contain explicit subqueries, but does not work with NULLs):
SELECT DISTINCT
t1.ID,
t2.[Values],
CASE
WHEN COUNT(t1.Repeats) OVER (PARTITION BY COUNT(DISTINCT t1.Repeats), t1.ID) > 1 THEN 'NA'
ELSE CAST(t1.Repeats AS VARCHAR(2))
END Repeats
FROM #table1 t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values], t1.Repeats
NOTE:
This may not give desired results if table2 has different values for the same ID.