select all rows by distinct values with limit on distinct values - sql

Let's say we have 2 tables:
Table1: Table2:
id | t2id id | col
---------- ----------
1 | 1 1 | a
2 | 2 2 | b
3 | 2 3 | c
4 | 1 4 | d
5 | 3 5 | e
6 | 3 6 | f
7 | 4 7 | g
8 | 5 8 | h
9 | 1 9 | i
10 | 4 10 | j
My question is:
Is there any short way to put limit for distinct results of Table1.t2id column?
For example: if limit = 2 then all rows with t2id from 1 to 2 (or any other values) are selected.
Expected result (with limit = 2):
Res:
id | t2id
----------
1 | 1
2 | 2
3 | 2
4 | 1
9 | 1
Note:
Any information or suggestion are accepted

You could use just the where clause
Select id,t2id
from table1
where t2id<=2
Or you can use where .. between
Select id,t2id
from table1
where t2id between 1 and 2

I believe you want to:
Create a subquery with all the columns you need + this one: DENSE_RANK() OVER (ORDER BY Table1.t2id) AS MyRank
outside of the sub-query, add a where on MyRank
Complete solution:
SELECT id, tb2id
FROM (
SELECT id, tb2id, DENSE_RANK() OVER (ORDER BY Table1.t2id) AS MyRank
FROM table1
) MySubQuery
WHERE MyRank <= 2
This will adapt to JOINs with table2 (with potential multiplicity increase) and non-consecutive values in tb2id.

You can also use in:
select t1.*
from table1 t1
where t1.t2_id in (select t2.id from table2 t2 limit 2);
The advantage of this approach is that it is easy to make it random:
select t1.*
from table1 t1
where t1.t2_id in (select t2.id from table2 t2 order by random() limit 2);

Related

Select row with max value saving distinct column

I have values
- id|type|Status|Comment
- 1 | P | 1 | AAA
- 2 | P | 2 | BBB
- 3 | P | 3 | CCC
- 4 | S | 1 | DDD
- 5 | S | 2 | EEE
I wan to get values for each type with max status and with comment from the row with max status:
- id|type|Status|Comment
- 3 | P | 3 | CCC
- 5 | S | 2 | EEE
All the existing questions on SO do not care about the right correspondence of Max type and value.
This gives you one row per type, which have max status
select * from (
select your_table.*, row_number() over(partition by type order by Status desc) as rn from your_table
) tt
where rn = 1
Corrected: The below will use a subquery to figure out each type and what the max status is, then it joins that onto the original table and uses the where clause to only select those rows where the status equals the max status. Of note, if you have multiple records with the same max status, you will get both of them to come up.
WITH T1 AS (SELECT type, MAX(STATUS) AS max_status FROM table_name GROUP BY type)
SELECT t2.id, t2.type, t2.status, t2.comment
FROM T1 LEFT JOIN table_name t2 ON t2.type= T1.type
WHERE t2.status = T1.max_status

Creating sql view where the select is dependent on the value of two columns

I want to create a view in my database, based on these three tables:
I would like to select the rows in table3 that has the highest value in Weight, for rows that has the same value in Count.
Then I want them grouped by Category_ID and ordered by Date, so that if two rows in table3 are identical, I want the newest.
Let me give you an example:
Table1
ID | Date | UserId
1 | 2015-01-01 | 1
2 | 2015-01-02 | 1
Table2
ID | table1_ID | Category_ID
1 | 1 | 1
2 | 2 | 1
Table3
ID | table2_ID | Count | Weight
1 | 1 | 5 | 10
2 | 1 | 5 | 20 <-- count is 5 and weight is highest
3 | 1 | 3 | 40
4 | 2 | 5 | 10
5 | 2 | 3 | 40 <-- newest of the two equal rows
Then the result should be row 2 and 5 from table 3.
PS I'm doing this in mssql.
PPS I'm sory if the title is not appropriate, but I did not know how to formulate a good one.
SELECT
*
FROM
(
SELECT
t3.*
,RANK() OVER (PARTITION BY [Count] ORDER BY [Weight] DESC, Date DESC) highest
FROM TABLE3 t3
INNER JOIN TABLE2 t2 ON t2.Id = t3.Table2_Id
INNER JOIN TABLE1 t1 ON t1.Id = t2.Table1_Id
) t
WHERE t.Highest = 1
This will group by the Count (which must be the same). Then it will determine which has the highest weight. If two of more of them have the same 'heighest' weight, it takes the one with the most recent date first.
You can use RANK() analytic function here, and give those rows a rank and than choose the first rank for each ID
Something like
select *
from
(select
ID, table2_ID, Count, Weight,
RANK() OVER (PARTITION BY ID ORDER BY Count, Weight DESC) as Highest
from table3)
where Highest = 1;
This is the syntax for Oracle, if you not using it look in the internet for the your syntax which should be almost the same

SELECT query with cross rows WHERE statement

I'll try to explain the type of the query that I want:
Assume I have a table like this:
| ID | someID | Number |
|----|--------|--------|
| 1 | 1 | 10 |
| 2 | 1 | 11 |
| 3 | 1 | 14 |
| 4 | 2 | 10 |
| 5 | 2 | 13 |
Now, I want to find the someID that have a specific numbers (For example query for numbers 10, 11, 14 will return someID 1 and query for numbers 10, 13 will return 2). But, if someID contains all the query numbers but also more numbers, it will not return by the query. (For example query for 10, 11 will return nothing).
Is it possible?
SELECT t1.someId
FROM yourTable t1
WHERE t1.number IN (10,14,11)
GROUP BY t1.someID
HAVING COUNT(DISTINCT t1.ID) = (SELECT COUNT(DISTINCT t2.ID) FROM yourTable t2 WHERE t1.someID=t2.someID)
Example Fiddle
select someID
from yourtable
where number in (10,11,14)
and not exists (select * from yourtable t2 where number not in(10,11,14)
and t2.someid=yourtable.someid)
group by someID
having count(distinct ID) = 3
Where 3 is the number of items you are querying for
Yes, once you get the query numbers into a table variable (say it's called #QNums, with one column named QNum)) try
Select distinct someId
From table t
Where exists (Select * from #QNums
where QNum = t.Number)
And not Exists (Select * From table t2
Where someId = t.someId
And not exists(Select * From #QNums
where QNum = t3.Number))

Sequence grouping in TSQL

I'm trying to group data in sequence order. Say I have the following table:
| 1 | A |
| 1 | A |
| 1 | B |
| 1 | B |
| 1 | C |
| 1 | B |
I need the SQL query to output the following:
| 1 | A | 1 |
| 1 | A | 1 |
| 1 | B | 2 |
| 1 | B | 2 |
| 1 | C | 3 |
| 1 | B | 4 |
The last column is a group number that is incremented in each group. The important thing to note is that rows 3, 4 and 5 contain the same data which should be grouped into 2 groups not 1.
For MSSQL2008:
Suppose you have a SampleStatuses table:
Status Date
A 2014-06-11
A 2014-06-14
B 2014-06-25
B 2014-07-01
A 2014-07-06
A 2014-07-19
B 2014-07-21
B 2014-08-13
C 2014-08-19
you write the following:
;with
cte as (
select top 1 RowNumber, 1 as GroupNumber, [Status], [Date] from SampleStatuses order by RowNumber
union all
select c1.RowNumber,
case when c2.Status <> c1.Status then c2.GroupNumber + 1 else c2.GroupNumber end as GroupNumber, c1.[Status], c1.[Date]
from cte c2 join SampleStatuses c1 on c1.RowNumber = c2.RowNumber + 1
)
select * from cte;
you get this result:
RowNumber GroupNumber Status Date
1 1 A 2014-06-11
2 1 A 2014-06-14
3 2 B 2014-06-25
4 2 B 2014-07-01
5 3 A 2014-07-06
6 3 A 2014-07-19
7 4 B 2014-07-21
8 4 B 2014-08-13
9 5 C 2014-08-19
The normal way you would do what you want is the dense_rank function:
select key, val,
dense_rank() over (order by key, val)
from t
However, this does not address the problem of separating the last groups.
To handle this, I have to assume there is an "id" column. Tables, in SQL, do not have an ordering, so I need the ordering. If you are using SQL Server 2012, then you can use the lag() function to get what you need. Use the lag to see if the key, val pair is the same on consecutive rows:
with t1 as (
select id, key, val,
(case when key = lead(key, 1) over (order by id) and
val = lead(val, 1) over (order by id)
then 1
else 0
end) as SameAsNext
from t
)
select id, key, val,
sum(SameAsNext) over (order by id) as GroupNum
from t
Without SQL Server 2012 (which has cumulative sums), you have to do a self-join to identify the beginning of each group:
select t.*,
from t left outer join
t tprev
on t.id = t2.id + 1 and t.key = t2.key and t.val = t2.val
where t2.id is null
With this, assign the group as the minimum id using a join:
select t.id, t.key, t.val,
min(tgrp.id) as GroupId
from t left outer join
(select t.*,
from t left outer join
t tprev
on t.id = t2.id + 1 and t.key = t2.key and t.val = t2.val
where t2.id is null
) tgrp
on t.id >= tgrp.id
If you want these to be consecutive numbers, then put them in a subquery and use dense_rank().
This will give you rankings on your columns.
It will not give you 1,2,3 however.
It will give you 1,3,6 etc based on how many in each grouping
select
a,
b,
rank() over (order by a,b)
from
table1
See this SQLFiddle for a clearer idea of what I mean: http://sqlfiddle.com/#!3/0f201/2/0

sql problem,challenge

I want to get
id a b c
--------------------
1 1 100 90
6 2 50 100
...from:
id a b c
--------------------
1 1 100 90
2 1 300 50
3 1 200 20
4 2 200 30
5 2 300 70
6 2 50 100
It's the row with the smallest b group by a.
How to do it with sql?
EDIT
I thought it can be achieved by
select * from table group by a having min(b);
which I found later it's wrong.
But is it possible to do it with having statement?
I'm using MySQL
SELECT t1.*
FROM mytable t1
LEFT OUTER JOIN mytable t2
ON (t1.a=t2.a AND t1.b>t2.b)
WHERE t2.a IS NULL;
This works because there should be no matching row t2 with the same a and a lesser b.
update: This solution has the same issue with ties that other folks have identified. However, we can break ties:
SELECT t1.*
FROM mytable t1
LEFT OUTER JOIN mytable t2
ON (t1.a=t2.a AND (t1.b>t2.b OR t1.b=t2.b AND t1.id>t2.id))
WHERE t2.a IS NULL;
Assuming for instance that in the case of a tie, the row with the lower id should be the row we choose.
This doesn't do the trick:
select * from table group by a having min(b);
Because HAVING MIN(b) only tests that the least value in the group is not false (which in MySQL means not zero). The condition in a HAVING clause is for excluding groups from the result, not for choosing the row within the group to return.
In MySQL:
select t1.* from test as t1
inner join
(select t2.a, min(t2.b) as min_b from test as t2 group by t2.a) as subq
on subq.a=t1.a and subq.min_b=t1.b;
Here is the proof:
mysql> create table test (id int unsigned primary key auto_increment, a int unsigned not null, b int unsigned not null, c int unsigned not null) engine=innodb;
Query OK, 0 rows affected (0.55 sec)
mysql> insert into test (a,b,c) values (1,100,90), (1,300,50), (1,200,20), (2,200,30), (2,300,70), (2,50,100);
Query OK, 6 rows affected (0.39 sec)
Records: 6 Duplicates: 0 Warnings: 0
mysql> select * from test;
+----+---+-----+-----+
| id | a | b | c |
+----+---+-----+-----+
| 1 | 1 | 100 | 90 |
| 2 | 1 | 300 | 50 |
| 3 | 1 | 200 | 20 |
| 4 | 2 | 200 | 30 |
| 5 | 2 | 300 | 70 |
| 6 | 2 | 50 | 100 |
+----+---+-----+-----+
6 rows in set (0.00 sec)
mysql> select t1.* from test as t1 inner join (select t2.a, min(t2.b) as min_b from test as t2 group by t2.a) as subq on subq.a=t1.a and subq.min_b=t1.b;
+----+---+-----+-----+
| id | a | b | c |
+----+---+-----+-----+
| 1 | 1 | 100 | 90 |
| 6 | 2 | 50 | 100 |
+----+---+-----+-----+
2 rows in set (0.00 sec)
Use:
SELECT DISTINCT
x.*
FROM TABLE x
JOIN (SELECT t.a,
MIN(t.b) 'min_b'
FROM TABLE T
GROUP BY t.a) y ON y.a = x.a
AND y.min_b = x.b
You're right. select min(b), a from table group by a. If you want the entire row, then you've use analytics function. That depends on database s/w.
It depends on the implementation, but this is usually faster than the self-join method:
SELECT id, a, b, c
FROM
(
SELECT id, a, b, c
, ROW_NUMBER() OVER(PARTITION BY a ORDER BY b ASC) AS [b IN a]
) As SubqueryA
WHERE [b IN a] = 1
Of course it does require that you SQL implementation be fairly up-to-date with the standard.