I have data like this (col2 is of type Date)
| col1 | col2 |
------------------------------
| 1 | 17/10/2007 07:19:07 |
| 1 | 17/10/2007 07:18:56 |
| 1 | 31/12/2070 |
| 2 | 28/11/2008 15:23:14 |
| 2 | 31/12/2070 |
How would I select the rows where col1 is distinct and col2 has the greatest value? Like this:
| col1 | col2 |
------------------------------
| 1 | 31/12/2070 |
| 2 | 31/12/2070 |
SELECT col1, MAX(col2) FROM some_table GROUP BY col1;
select col1, max(col2)
from table
group by col1
I reckon it would be:
select col1, max(col2)
from DemoTable
group by col1
unless I've missed something obvious.
select col1, max(col2) from MyTable
group by col1
SELECT Col1, MAX(Col2) FROM YourTable GROUP BY Col1
In Oracle and MS SQL:
SELECT *
FROM (
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2 DESC) rn
FROM table t
) q
WHERE rn = 1
This will select other columns along with col1 and col2
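As a rough illustration of the ROW_NUMBER() approach above, here is a minimal sketch that runs the same query shape against an in-memory SQLite database through Python's sqlite3 module. The table name some_table, the extra note column, and the ISO-formatted date strings are assumptions made for the demo (SQLite has no native date type, so text in ISO order stands in for the Date column); window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "note" is a hypothetical extra column, added only to show that
# non-grouped columns come along with the winning row.
conn.execute("CREATE TABLE some_table (col1 INTEGER, col2 TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?, ?)",
    [
        (1, "2007-10-17 07:19:07", "a"),
        (1, "2007-10-17 07:18:56", "b"),
        (1, "2070-12-31 00:00:00", "c"),
        (2, "2008-11-28 15:23:14", "d"),
        (2, "2070-12-31 00:00:00", "e"),
    ],
)

# Number the rows of each col1 group from newest to oldest, keep number 1.
latest_rows = conn.execute(
    """
    SELECT col1, col2, note
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2 DESC) AS rn
        FROM some_table t
    ) q
    WHERE rn = 1
    ORDER BY col1
    """
).fetchall()
print(latest_rows)
```

Unlike the plain GROUP BY with MAX(col2), this keeps the whole row that holds the greatest col2, which is what makes the extra column available.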
Related
I'm trying to get the max count of a field.
This is what I have and what I've tried so far.
| col1 | col2 |
| A | B |
| A | B |
| A | D |
| A | D |
| A | D |
| C | F |
| C | G |
| C | F |
I'm trying to get the max count occurrences of col2, grouped by col1.
With this query I get the occurrences grouped by col1 and col2.
SELECT col1, col2, count(*) as conta
FROM tab
WHERE ...
GROUP by col1, col2
ORDER BY col1, col2
And I get:
| col1 | col2 | conta |
| A | B | 2 |
| A | D | 3 |
| C | F | 2 |
| C | G | 1 |
Then I used this query to get max of count:
SELECT max(conta) as conta2, col1
FROM (
SELECT col1, col2, count(*) as conta
FROM tab
WHERE ...
GROUP BY col1, col2
ORDER BY col1, col2
) AS derivedTable
GROUP BY col1
And I get:
| col1 | conta |
| A | 3 |
| C | 2 |
What I'm missing is the value of col2. I would like something like this:
| col1 | col2 | conta |
| A | D | 3 |
| C | F | 2 |
The problem is that if I try to select col2, I get an error saying that the column must appear in the GROUP BY clause or be used in an aggregate function. Adding it to the GROUP BY is not the right approach, though.
Simpler & faster (and correct):
SELECT DISTINCT ON (col1)
col1, col2, count(*) AS conta
FROM tab
GROUP BY col1, col2
ORDER BY col1, conta DESC;
db<>fiddle here (based on a_horse's fiddle)
DISTINCT ON is applied after aggregation, so we don't need a subquery or CTE. Consider the sequence of events in a SELECT query:
Best way to get result count before LIMIT was applied
Select first row in each GROUP BY group?
You can combine GROUP BY with a window function - which gets evaluated after the group by:
with cte as (
SELECT col1, col2,
count(*) as conta,
dense_rank() over (partition by col1 order by count(*) desc) as rnk
FROM tab
WHERE ...
GROUP by col1, col2
)
select col1, col2, conta
from cte
where rnk = 1
order by col1, col2;
If several col2 values tie for the highest count within the same col1, this returns all of them. If you want only one row per col1, use row_number() instead of dense_rank().
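To make the tie behaviour concrete, here is a small sketch using Python's sqlite3 module (SQLite 3.25+ for window functions). It is a variant of the query above, not the same statement: the counts are computed in one CTE and ranked in a second one. The group E with two values that each occur once is hypothetical data added to force a tie; the remaining rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (col1 TEXT, col2 TEXT)")
rows = (
    [("A", "B")] * 2
    + [("A", "D")] * 3
    + [("C", "F"), ("C", "G"), ("C", "F")]
    + [("E", "H"), ("E", "I")]  # hypothetical tie: both values occur once
)
conn.executemany("INSERT INTO tab VALUES (?, ?)", rows)

QUERY = """
    WITH counts AS (
        SELECT col1, col2, count(*) AS conta
        FROM tab
        GROUP BY col1, col2
    ),
    ranked AS (
        SELECT col1, col2, conta, {window_call} AS rnk
        FROM counts
    )
    SELECT col1, col2, conta
    FROM ranked
    WHERE rnk = 1
    ORDER BY col1, col2
"""

# dense_rank() gives tied rows the same rank, so both E rows survive.
with_ties = conn.execute(
    QUERY.format(
        window_call="dense_rank() OVER (PARTITION BY col1 ORDER BY conta DESC)"
    )
).fetchall()

# row_number() numbers rows uniquely; col2 is added as a tie-breaker so the
# surviving row is deterministic.
one_per_group = conn.execute(
    QUERY.format(
        window_call="row_number() OVER (PARTITION BY col1 ORDER BY conta DESC, col2)"
    )
).fetchall()

print(with_ties)
print(one_per_group)
```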
Online example
Possibly not the most elegant solution, but using a common table expression may help.
with cte as (
select col1, col2, count(*) as total
from dtable
group by col1, col2
)
select col1, col2, total
from cte c
where total = (select max(total)
from cte cc
where cc.col1 = c.col1)
order by col1 asc
Returns
col1|col2|total|
----+----+-----+
A | D | 3|
C | F | 2|
from the docs
I misunderstood the question. Here is your solution:
;with tablex as
(Select col1, col2, Count(col2) as Count From Your_Table Group by col1, col2),
aaaa as
(Select ROW_NUMBER() over (partition by col1 order by Count desc) as row, * From tablex)
Select * From aaaa Where row = 1
Using a window function:
select distinct on (col1) col1, col2, cnt
from
(
select col1, col2, count(*) over (partition by col1, col2) cnt
from the_table
) t
order by col1, cnt desc;
| col1 | col2 | cnt |
| A | D | 3 |
| C | F | 2 |
This solution does not handle ties: if two col2 values share the highest count within a col1, only one of them is returned.
What I need is the following.
Currently, it's repeating column names with the regular group by and sum.
| column 1 | column2 | column3 | sum |
|-------------|-------------|----------|-----|
|main product |sub product1 |subsub 1 | 500|
|main product |sub product1 |subsub 2 | 300|
|main product |sub product2 |subsub 1 | 300|
I want to suppress the repeated values, the way an Excel pivot table does, so what I need is this:
| column 1 | column2 | column3 | sum |
|-------------|-------------|----------|-----|
|main product |sub product1 |subsub 1 | 500|
|main product | |subsub 2 | 300|
|main product |sub product2 |subsub 1 | 300|
Can someone help me with this?
We can approximate this behavior with the help of ROW_NUMBER:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY col3) rn
FROM yourTable
)
SELECT col1, CASE WHEN rn = 1 THEN col2 ELSE '' END AS col2, col3, sum
FROM cte t
ORDER BY col1, t.col2, col3;
You can use row_number():
select col1,
case when row_number() over(partition by col1, col2 order by col3) = 1 then col2 else '' end as col2,
col3, sum
from table t
order by col1, t.col2, col3;
I want to write a non-recursive common table expression (CTE) in postgres to calculate a cumulative sum, here's an example, input table:
----------------------
1 | A | 0 | -1
1 | B | 3 | 1
2 | A | 1 | 0
2 | B | 3 | 2
An output should look like this:
----------------------
1 | A | 0 | -1
1 | B | 3 | 1
2 | A | 1 | -1
2 | B | 6 | 3
As you can see, the cumulative sums of columns 3 and 4 are calculated. This is easy to do with a recursive CTE, but how is it done with a non-recursive one?
Use window functions. Assuming that your table has columns col1, col2, col3 and col4, that would be:
select
t.*,
sum(col3) over(partition by col2 order by col1) col3,
sum(col4) over(partition by col2 order by col1) col4
from mytable t
You would use a window function for a cumulative sum. I don't see what the sum is in your example, but the syntax is something like:
select t.*, sum(x) over (order by y) as cumulative_sum
from t;
For your example, this would seem to be:
select t.*,
sum(col3) over (partition by col2 order by col1) as new_col3,
sum(col4) over (partition by col2 order by col1) as new_col4
from t;
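As a quick check of the window-function approach, the sketch below loads the four input rows from the question into an in-memory SQLite database via Python's sqlite3 module and runs the same kind of query. The column names col1 to col4 and the table name t follow the answers' assumption; window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 TEXT, col3 INTEGER, col4 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, "A", 0, -1), (1, "B", 3, 1), (2, "A", 1, 0), (2, "B", 3, 2)],
)

# Running totals of col3 and col4 within each col2 value, ordered by col1.
cumulative = conn.execute(
    """
    SELECT col1, col2,
           sum(col3) OVER (PARTITION BY col2 ORDER BY col1) AS new_col3,
           sum(col4) OVER (PARTITION BY col2 ORDER BY col1) AS new_col4
    FROM t
    ORDER BY col1, col2
    """
).fetchall()
print(cumulative)
```

The result matches the desired output in the question: for B the second row accumulates to 6 and 3, for A to 1 and -1.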
I have a table
| group | col1 | col2 |
| 1 | test1 | val1 |
| 1 | test2 | val2 |
| 3 | test3 | val3 |
| 3 | test4 | val4 |
I need to select rows by priority. For example, if a row has test1 in col1, show it; if there is none, show the row with test2. Don't forget about group: the priority applies only among rows within the same group.
I expect this result:
| group | col1 | col2 |
| 1 | test1 | val1 |
| 3 | test3 | val3 |
In standard SQL, you seem to want:
select t.*
from t
order by (case when col1 = 'test1' then 1
when col1 = 'test2' then 2
else 3
end)
fetch first 1 row only;
EDIT:
For the revised question, you can use distinct on:
select distinct on ("group") t.*
from t
order by "group",
(col1 = 'test1') desc,
(col1 = 'test2') desc;
Please use the query below:
select * from
(select "group", col1, col2, row_number() over (partition by "group" order by col1) as rnk
from table) sub where rnk = 1;
This is the query that works:
select * from
(select "group",
col1,
col2,
row_number() over (partition by "group" order by (case when col1 = 'test1' then 1
when col1 = 'test2' then 2
else 3
end)) as rnk
from test) AS tab1 where rnk = 1;
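The priority idea can be tried out with a short sketch in Python's sqlite3 module (SQLite 3.25+). The table name test comes from the answer above; quoting "group" (a reserved word) and using col1 as a final tie-breaker are assumptions added here so the result is deterministic for groups that contain neither test1 nor test2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE test ("group" INTEGER, col1 TEXT, col2 TEXT)')
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (1, "test2", "val2"),  # inserted first on purpose: order must not matter
        (1, "test1", "val1"),
        (3, "test3", "val3"),
        (3, "test4", "val4"),
    ],
)

# Lower CASE value means higher priority; col1 breaks remaining ties.
picked = conn.execute(
    """
    SELECT "group", col1, col2
    FROM (
        SELECT "group", col1, col2,
               row_number() OVER (
                   PARTITION BY "group"
                   ORDER BY CASE WHEN col1 = 'test1' THEN 1
                                 WHEN col1 = 'test2' THEN 2
                                 ELSE 3
                            END,
                            col1
               ) AS rnk
        FROM test
    ) AS tab1
    WHERE rnk = 1
    ORDER BY "group"
    """
).fetchall()
print(picked)
```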
I have a table that looks like the first example.
I'm trying to write an MSSQL 2012 statement that will display results like the second example.
Basically I want null values instead of duplicate values in columns 1 and 2. This is for readability purposes during reporting.
This seems like it should be possible, but I'm drawing a blank. No amount of joins or unions I've written has rendered the results I need.
+------+------+------+
| Col1 | Col2 | Col3 |
+------+------+------+
| 1 | 2 | 4 |
| 1 | 2 | 5 |
| 1 | 3 | 6 |
| 1 | 3 | 7 |
+------+------+------+

+------+------+------+
| Col1 | Col2 | Col3 |
+------+------+------+
| 1 | 2 | 4 |
| null | null | 5 |
| null | 3 | 6 |
| null | null | 7 |
+------+------+------+
I would do this with no subqueries at all:
select (case when row_number() over (partition by col1 order by col2, col3) = 1
then col1
end) as col1,
(case when row_number() over (partition by col1, col2 order by col3) = 1
then col2
end) as col2,
col3
from t
order by t.col1, t.col2, t.col3;
Note that the order by at the end of the query is very important. The result set that you want depends critically on the ordering of the rows. Without the order by, the result set could be in any order. So, the query might look like it works, and then suddenly fail one day or on a slightly different set of data.
Using a common table expression with row_number():
;with cte as (
select *
, rn_1 = row_number() over (partition by col1 order by col2, col3)
, rn_2 = row_number() over (partition by col1, col2 order by col3)
from t
)
select
col1 = case when rn_1 > 1 then null else col1 end
, col2 = case when rn_2 > 1 then null else col2 end
, col3
from cte
Without the CTE:
select
col1 = case when rn_1 > 1 then null else col1 end
, col2 = case when rn_2 > 1 then null else col2 end
, col3
from (
select *
, rn_1 = row_number() over (partition by col1 order by col2, col3)
, rn_2 = row_number() over (partition by col1, col2 order by col3)
from t
) sub
rextester demo: http://rextester.com/UYA17142
returns:
+------+------+------+
| col1 | col2 | col3 |
+------+------+------+
| 1 | 2 | 4 |
| NULL | NULL | 5 |
| NULL | 3 | 6 |
| NULL | NULL | 7 |
+------+------+------+
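For anyone who wants to experiment with the row_number() technique outside SQL Server, here is a minimal sketch using Python's sqlite3 module (SQLite 3.25+). It uses the sample data from the question; the output aliases shown_col1 and shown_col2 are hypothetical names chosen so that the final ORDER BY unambiguously sorts on the underlying table columns rather than on the blanked-out values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 2, 4), (1, 2, 5), (1, 3, 6), (1, 3, 7)],
)

# A value is shown only on the first row of its group; CASE without ELSE
# yields NULL for every later row.
report = conn.execute(
    """
    SELECT CASE WHEN row_number() OVER (PARTITION BY col1 ORDER BY col2, col3) = 1
                THEN col1 END AS shown_col1,
           CASE WHEN row_number() OVER (PARTITION BY col1, col2 ORDER BY col3) = 1
                THEN col2 END AS shown_col2,
           col3
    FROM t
    ORDER BY t.col1, t.col2, t.col3
    """
).fetchall()
print(report)
```

Python prints SQL NULL as None, so the rows correspond to the table returned above. The explicit ORDER BY on the original columns is what keeps the row carrying the visible value at the top of each group.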