My problem is that I would like to select some records which appears in a row.
For example we have table like this:
x
x
x
y
y
x
x
y
Query should give answer like this:
x 3
y 2
x 2
y 1
SQL tables represent unordered sets. Your question only makes sense if there is a column that specifies the ordering. If so, you can use the difference-of-row-numbers to determine the groups and then aggregate:
select col1, count(*)
from (select t.*,
row_number() over (order by <ordering col>) as seqnum,
row_number() over (partition by col1 order by <ordering col>) as seqnum_2
from t
) t
group by col1, (seqnum - seqnum_2)
I made a SQL Fiddle
http://sqlfiddle.com/#!18/f8900/5
CREATE TABLE [dbo].[SomeTable](
[data] [nchar](1) NULL,
[id] [int] IDENTITY(1,1) NOT NULL
);
INSERT INTO SomeTable
([data])
VALUES
('x'),
('x'),
('x'),
('y'),
('y'),
('x'),
('x'),
('y')
;
select * from SomeTable;
WITH SomeTable_CTE (Data, total, BaseId, NextId)
AS
(
SELECT
Data,
1 as total,
Id as BaseId,
Id+1 as NextId
FROM SomeTable
where not exists(
Select * from SomeTable Previous
where Previous.Id+1 = SomeTable.Id
and Previous.Data = SomeTable.Data)
UNION ALL
select SomeTable_CTE.Data, SomeTable_CTE.total+1, SomeTable_CTE.BaseId as BaseId, SomeTable.Id+1 as NextId
from SomeTable_CTE inner join SomeTable on
SomeTable.Data = SomeTable_CTE.Data
and
SomeTable.Id = SomeTable_CTE.NextId
)
SELECT Data, max(total) as total
FROM SomeTable_CTE
group by Data, BaseId
order by BaseId
The elephant in the room is the missing column(s) to establish the order of rows.
SELECT col1, count(*)
FROM (
SELECT col1, order_column
, row_number() OVER (ORDER BY order_column)
- row_number() OVER (PARTITION BY col1 ORDER BY order_column) AS grp
FROM tbl
) t
GROUP BY col1, grp
ORDER BY min(order_column);
To exclude partitions with only a single row, add a HAVING clause:
SELECT col1, count(*)
FROM (
SELECT col1, order_column
, row_number() OVER (ORDER BY order_column)
- row_number() OVER (PARTITION BY col1 ORDER BY order_column) AS grp
FROM tbl
) t
GROUP BY col1, grp
HAVING count(*) > 1
ORDER BY min(order_column);
db<>fiddle here
Add a final ORDER BY to maintain original order (and a meaningful result). You may want to add a column like min(order_column) as well.
Related:
Find the longest streak of perfect scores per player
Select longest continuous sequence
Group by repeating attribute
Related
I have ROW_NUMBER() OVER (ORDER BY NULL) rnum in a sql statement to give me row numbers. is there a way to attach the max rnum to the same dataset going out?
what I want is the row_number() which I get, but I also want the MAXIMUM rownumber of the total return on each record.
SELECT
ROW_NUMBER() OVER (ORDER BY NULL) rnum,
C.ID, C.FIELD1, C."NAME", C.FIELD2, C.FIELD3
FROM SCHEMA.TABLE
WHERE (C.IS_CRNT = 1)
), MAX_NUM as (
SELECT DATA.ID, max(rnum) as maxrnum from DATA GROUP BY DATA.COMPONENT_ID
) select maxrnum, DATA.* from DATA JOIN MAX_NUM on DATA.COMPONENT_ID = MAX_NUM.COMPONENT_ID
DESIRED RESULT (ASSUMING 15 records):
1 15 DATA
2 15 DATA
3 15 DATA
etc...
I think you want count(*) as a window function:
SELECT ROW_NUMBER() OVER (ORDER BY NULL) as rnum,
COUNT(*) OVER () as cnt,
C.ID, C.FIELD1, C."NAME", C.FIELD2, C.FIELD3
FROM SCHEMA.TABLE
WHERE C.IS_CRNT = 1
Based on my assumptions in your dataset, this is the approach I would take:
WITH CTE AS (
select C.ID, C.FIELD1, C."NAME", C.FIELD2, C.FIELD3, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID)
FROM SCHEMA.TABLE WHERE (C.IS_CRNT = 1))
SELECT *, (select count(*) from cte) "count" from cte;
I have a Oracle table with the following columns
Table Structure
In a query I need to return all the records with CPER>=40 which is trivial. However, apart from CPER>=40 I need to list 5 random records for each CPID.
I have attached a sample list of records. However, in my table I have around 50,000 records.
Appreciate if you can help.
Oracle solution:
with CTE as
(
select t1.*,
row_number() over(order by DBMS_RANDOM.VALUE) as rn -- random order assigned
from MyTable t1
where CPID <40
)
select *
from CTE
where rn <=5 -- pick 5 at random
union all
select t2.*, null
from my_table t2
where CPID >= 40
SQL Server:
with CTE as
(
select t1.*,
row_number() over(order by newid()) as rn -- random order assigned
from MyTable t1
where CPID <40
)
select *
from CTE
where rn <=5 -- pick 5 at random
union all
select t2.*, null
from my_table t2
where CPID >= 40
How about something like this...
SELECT *
FROM (SELECT CID,
CVAL,
CPID,
CPER,
Row_number() OVER (partition BY CPID ORDER BY CPID ASC ) AS RN
FROM Table) tmp
WHERE CPER>=40 OR pids <= 5
However, this is not random.
Assuming that you want five additional random records, you can do:
select t.*
from (select t.*,
row_number() over (partition by cpid,
(case when cper >= 40 then 1 else 2 end)
order by dbms_random.value
) as seqnum
from t
) t
where seqnum <= 5 or cper >= 40;
The row_number() is enumerating the rows for each cpid in two groups -- based on the cper value. The outer where is taking all cper values in the range you want as well as five from the other group.
I have this query:
Select *
from
(Select
*
ROW_NUMBER() OVER (PARTITION BY TID ORDER BY TID) AS RowNumber
from
MyTable
where
Eid = 'C1') as a
where
a.RowNumber = 1
and it displays these results:
Column1 Column2 RowNumber
------------------------------
Value1 value2 1
I want to ignore the RowNumber column in the select statement and I don't want to list all columns in select query (100+ columns and given is just an example).
How to do this in SQL Server?
Well, you would have to list all the columns in the outer select, if you use a subquery and row_number() to get a unique row.
An alternative method uses a correlated subquery, but requires having some unique column in the table. If you have one:
select t.*
from mytable t
where t.col = (select max(t2.col) from mytable t2 where t2.tid = t.tid and t2.eid = 'C1');
With the right indexes, this can have better performance than the row_number() version.
If you don't have a unique column, you can do:
select t.*
from (select distinct tid from mytable where eid = 'C1') tc cross apply
(select top 1 t.*
from mytable t
where t.tid = tc.tid and t.eid = 'C1'
) t;
Wrap your query as a subquery and select specific columns from it like so:
SELECT x.Column1, x.Column2
FROM
(
Select * from (Select * ROW_NUMBER() OVER (PARTITION BY TID ORDER BY TID)
AS RowNumber from MyTable where Eid="C1") as a where a.RowNumber=1
) AS x
OR Change your original Select to:
Select a.[Column1], a.[Column2]
from
(
Select * ROW_NUMBER() OVER (PARTITION BY TID ORDER BY TID)
AS RowNumber from MyTable where Eid="C1"
) as a
Where a.RowNumber=1
Replace * from your query in clarify exactly columnd which you whant
select x.Column1, x.Column2 FROM (
Select * from (Select * ROW_NUMBER() OVER (PARTITION BY TID ORDER BY TID)
AS RowNumber from MyTable where Eid="C1") as a where a.RowNumber=1) AS x
Is it possible to group array elements in PostgreSQL?
Example, I have 2 related arrays like this (I say related because the first array indicates actions and the second array represents those action's times:
col0 = 'any_value'
col1 = array1['a','b','b','c','c','a','a','a','c']
col2 = array2[1,2,3,4,5,6,7,8,9]
and I would like to output the following result:
col0 = 'any_value'
array_result1['a','b','c','a','c']
array_result2[1,2,4,6,9]
A way the array can be unnested is by using ordinality, this is an example query, but it returns a distinct selection of the array elements which removes the repeated ones:
select col0,
array_agg(x order by rn) as unique_array1
from (
select
distinct on (col0, a.x) col0,
a.x,
a.rn
from table_a,
unnest(array1) with ordinality as a (x,rn)
order by 1,2,3
) unnested_ordered
group by col0;
So the result of this would be:
col0 = 'any_value'
array_result1['a','b','c']
But as you can see it is missing many elements.
EDIT:
To describe more my issue, In the end I would like to know when each of the array_result1 actions are initially done.
So for the example result
array_result1['a','b','c','a','c']
*array_result2[1,2,4,6,9]
*I supose the position of the array starts at 1 and not 0, i also fixed the last element, it should be 9 not 7
would help me to know, when did the first action 'a' happen and when did the second action 'a' happen so I can calculate the time for action 'a' to return into the path I am building.
So first time action 'a' happened was = 1
Second time it happened was = 6
So action 'a' appears twice in the path(array) and it takes 5 time units to re appear. That is why I need the second array with the times on which the actions happened (the first time each action happened)
You could use LATERAL and calculate group using ROW_NUMBER:
DROP TABLE IF EXISTS table_a;
CREATE TABLE table_a(col0 VARCHAR(10), col1 text[],col2 int[]);
INSERT INTO table_a(col0, col1, col2)
VALUES ('any_value',array['a','b','b','c','c','a','a','a','c'],
array[1,2,3,4,5,6,7,8,9]);
Main query:
SELECT col0,
col1,
unique_col1
FROM table_a,
LATERAL (SELECT ARRAY_AGG(x ORDER BY grp) AS unique_col1
FROM ( SELECT DISTINCT x,
rn - ROW_NUMBER() OVER(PARTITION BY x ORDER BY rn) AS grp
FROM unnest(col1) WITH ORDINALITY AS a(x,rn)
) AS sub
) AS lat1
Output:
EDIT:
Calculating second array:
SELECT col0,
col1,
unique_col1,
col2,
unique_col2
FROM table_a,
LATERAL (SELECT ARRAY_AGG(x ORDER BY grp) AS unique_col1
FROM ( SELECT DISTINCT x,
rn - ROW_NUMBER() OVER(PARTITION BY x ORDER BY rn) AS grp
FROM unnest(col1) WITH ORDINALITY AS a(x,rn)
) AS sub
) AS lat1,
LATERAL (
SELECT array_agg(x ORDER BY rn) AS unique_col2
FROM unnest(col2) WITH ORDINALITY AS b(x,rn)
WHERE rn IN (
SELECT SUM(c) OVER(ORDER BY grp) - (c-1) AS result
FROM (SELECT grp, COUNT(*) AS c
FROM ( SELECT x,
rn - ROW_NUMBER() OVER(PARTITION BY x ORDER BY rn) AS grp
FROM unnest(col1) WITH ORDINALITY AS a(x,rn)
) AS sub
GROUP BY grp) AS s
)
) AS lat2
Remark:
It generates second array from values, not its position, so when you have:
col2 = array[9,8,7,6,5,4,3,2,1]
you will get:
[9,8,6,4,1]
If you want only positions you could use:
...
LATERAL (
SELECT array_agg(result ORDER BY result) AS unique_col2
FROM (
SELECT SUM(c) OVER(ORDER BY grp) - (c-1) AS result
FROM (SELECT grp, COUNT(*) AS c
FROM ( SELECT x,
rn - ROW_NUMBER() OVER(PARTITION BY x ORDER BY rn) AS grp
FROM unnest(col1) WITH ORDINALITY AS a(x,rn)
) AS sub
GROUP BY grp) AS s
) AS s1
) AS lat2
And the result will be:
[1,2,4,6,9]
EDIT 2
In above version there is small mistake. The ARRAY_AGG should be ordered by rn not grp:
DROP TABLE IF EXISTS table_a;
CREATE TABLE table_a(col0 VARCHAR(10), col1 text[],col2 int[]);
INSERT INTO table_a(col0, col1, col2)
VALUES ('any_value',array['a','b','b','c','c','a','a','a','c'],
array[1,2,3,4,5,6,7,8,9]);
INSERT INTO table_a(col0, col1, col2)
VALUES ('any_value2',array['a','b','a','a','c','a'],array[1,2,3,4,5,6]);
SELECT *
FROM table_a,
LATERAL (SELECT ARRAY_AGG(x ORDER BY rn) AS unique_col1
FROM
(SELECT x, grp, MIN(rn) AS rn
FROM (SELECT x,
rn - ROW_NUMBER() OVER(PARTITION BY x ORDER BY rn) AS grp,
rn
FROM unnest(col1) WITH ORDINALITY AS a(x,rn)
) AS sub
GROUP BY x, grp) AS s
) AS lat1;
I'm using row_number() expression but I don't get result as I expected. I have a sql table and some rows are duplicate. They have same 'BATCHID' and I want to get second row number for these, for others I use first row number. How can I do it?
SELECT * FROM (SELECT * , ROW_NUMBER() OVER (PARTITION BY BATCHID ORDER BY SCAQTY) Rn FROM SAYIMDCPC ) t
WHERE Rn=1
This code returns to me only first rows, but I want to get second rows for duplicated items.
ROW_NUMBER() gives every row a unique counter. You'd want to use RANK(), which is similar, but gives rows with identical values the same score:
SELECT *
FROM (SELECT * , RANK() OVER (PARTITION BY batchid ORDER BY scaqry) rk
FROM sayimdcpc) t
WHERE rk = 1
If some values are only shown once, but some twice (and perhaps more than twice), you don't want the "first" row, you want the "max" row. Try reversing your order condition:
SELECT *
FROM (SELECT * ,
ROW_NUMBER() OVER (PARTITION BY BATCHID ORDER BY SCAQTY DESC) Rn
FROM SAYIMDCPC ) t
WHERE Rn=1
As a side note, it's still better to explicitly list out all columns; for instance, you probably don't need Rn outside of this query...
Simply try this,
SELECT * FROM (SELECT * , ROW_NUMBER() OVER (ORDER BY SCAQTY) Rn FROM SAYIMDCPC ) t
WHERE Rn=1
To rephrase it another way, it sounds like you're saying, "When there's a single row for the Batch ID, return the single row. When there are 2 or more rows, return the second row." That's going to require inspecting your Rn value to see what its max is. I don't think you can do it in a single query.
So I'd try something like this:
WITH NumberedRows AS (
SELECT * ,
ROW_NUMBER() OVER (PARTITION BY BATCHID ORDER BY SCAQTY) Rn
FROM SAYIMDCPC
)
, MaxNumber AS (
SELECT Max(RN) as MaxRn,
BATCHID
FROM NumberedRows
GROUP BY BATCHID
)
, NonDupes AS (
SELECT *
FROM NumberedRows
WHERE BATCHID NOT IN (SELECT BATCHID FROM MaxNumber WHERE MaxNumber = 1)
)
, SecondRows AS (
SELECT (
FROM NumberedRows
WHERE BATCHID NOT IN (SELECT BATCHID FROM MaxNumber WHERE MaxNumber > 1)
AND Rn = 2
)
SELECT
FROM NonDupes
UNION ALL
SELECT *
FROM SecondRows
Please try this. It will select max row num, if no duplicate then it should be first one otherwise second
select * from (SELECT * , ROW_NUMBER() OVER (PARTITION BY BATCHID ORDER BY SCAQTY) Rn FROM SAYIMDCPC ) d,
(SELECT batchid,max(Rn) maxRn FROM (SELECT * , ROW_NUMBER() OVER (PARTITION BY BATCHID ORDER BY SCAQTY) Rn FROM SAYIMDCPC ) t
group by batchid) q
where d.batchid = q.batchid and d.rn = q.maxrn
correct me if i am wrong
e.g sample data
BatchID, SCAQTY
1 , 10
2 , 10
2 , 20
2 , 30
is your expectation result like below?
**Expectation Result 1**
BatchID , SCAQty
1 , 10
2 , 30
or
**Expectation Result 2**
BatchID , SCAQty
1 , 10
2 , 20
2 , 30
based on my understanding what you want to perform is Expectation Result 1, so i guess Query below should able to help u, you just need to add desc for SCAQTY in your query
SELECT * FROM (SELECT * ,
ROW_NUMBER() OVER (PARTITION BY BATCHID ORDER BY SCAQTY DESC) Rn FROM SAYIMDCPC ) t
WHERE Rn=1
Total result set with duplicates and non duplicates. The first column "IsDuplicate" indicates if the column is a duplicate or not.
;WITH d1 AS (
SELECT Seq = ROW_NUMBER() OVER (PARTITION BY BATCHID ORDER BY SCAQTY)
,*
FROM SAYIMDCPC
)
SELECT IsDuplicate = CONVERT(BIT, Seq)
,*
FROM d1
This will give you only the duplicates:
;WITH d1 AS (
SELECT Seq = ROW_NUMBER() OVER (PARTITION BY BATCHID ORDER BY SCAQTY)
,*
FROM SAYIMDCPC
)
SELECT IsDuplicate = CONVERT(BIT, Seq)
,*
FROM d1
WHERE Seq > 1
This will give you only the non duplicates (as in your first query)
;WITH d1 AS (
SELECT Seq = ROW_NUMBER() OVER (PARTITION BY BATCHID ORDER BY SCAQTY)
,*
FROM SAYIMDCPC
)
SELECT IsDuplicate = CONVERT(BIT, Seq)
,*
FROM d1
WHERE Seq = 1