I have a problem that I believe can be solved using a CTE.
I created the SQL query shown below:
with CTE as
(
SELECT Distinct T.testType as name,'tmpRequirement_TestType_Canvas' as template,'n/a' as layout,
0 as x,0 as y,'855' as width,'42' as height,
T.id,T.testType FROM [dbo].[tProperty] P
INNER JOIN tTest_Type T
on P.tTest_Type_id = T.id)
select * from CTE
I got results like the ones below:
What I would like to get is shown below:
Basically, I would like 'y' to increase by 50 for each row. Is there any way to do this?
Thanks in advance.
You need an ordering, but row_number() will do this:
select . . .,
50 * row_number() over (order by ?) as y
from cte;
? is for the column that specifies the ordering.
Using your sample from above, this will start from 0 and add 50 for each row:
select setNumber from
(select setNumber from
(select 50 * ROW_NUMBER() over(order by id) as setNumber from yourTableOrCTE) as a
union all
select 0) as a order by setNumber
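The same offsets can also be produced without the UNION ALL by subtracting 1 from the row number before multiplying. This is a minimal sketch for SQL Server, assuming id is the column that defines the order; the inline VALUES rows are made-up sample data standing in for the question's CTE:

```sql
-- Sample rows standing in for the CTE from the question (hypothetical data).
WITH CTE AS (
    SELECT *
    FROM (VALUES (1, 'Unit'), (2, 'Integration'), (3, 'Regression')) AS T(id, testType)
)
SELECT testType AS name,
       0 AS x,
       -- ROW_NUMBER() starts at 1, so subtract 1 to make the first y equal 0.
       50 * (ROW_NUMBER() OVER (ORDER BY id) - 1) AS y,
       id
FROM CTE
ORDER BY y;
```

With three rows this gives y values of 0, 50 and 100. Computing y in the outer query rather than inside the CTE keeps the row number from making every row unique before the DISTINCT in the original CTE is applied.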
I have tried the following approaches, none of which worked:
Using SELECT TOP 50 PERCENT: BigQuery does not have a TOP clause.
Using LIMIT (SELECT COUNT(*) FROM tabl)/2: BigQuery does not accept a non-integer value here.
Using SET to store the median value and then filtering with WHERE.
In BigQuery I would use the window function percent_rank().
select t.* except (prnk)
from (select t.*, percent_rank() over(order by id) prnk from mytable t) t
where prnk <= 0.5
Note: any answer to your question will require that you provide a column to order your data. I assumed that this column is called id.
One method uses window functions:
select t.* except (seqnum, cnt)
from (select t.*, row_number() over (order by ?) as seqnum,
count(*) over () as cnt
from t
) t
where seqnum <= cnt / 2;
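The two window-function approaches differ slightly when the row count is odd. A small sketch in BigQuery Standard SQL, using a made-up five-row table with id as the assumed ordering column, shows the values each filter compares against:

```sql
-- Hypothetical five-row table; id is the assumed ordering column.
WITH mytable AS (
  SELECT id FROM UNNEST([10, 20, 30, 40, 50]) AS id
)
SELECT
  id,
  ROW_NUMBER() OVER (ORDER BY id) AS seqnum,   -- 1, 2, 3, 4, 5
  COUNT(*) OVER () AS cnt,                     -- 5 on every row
  PERCENT_RANK() OVER (ORDER BY id) AS prnk    -- 0, 0.25, 0.5, 0.75, 1
FROM mytable
ORDER BY id;
```

Here seqnum <= cnt / 2 keeps two rows (ids 10 and 20), because cnt / 2 is 2.5, while prnk <= 0.5 keeps three rows (ids 10, 20 and 30).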
Another possibility would be to limit the data with a WHERE clause instead of LIMIT. This is an example if you want to filter by an ID (it only returns half of the rows if the IDs are contiguous and start at 1):
SELECT * FROM table_name as t
WHERE t.id <= (SELECT COUNT(*) FROM table_name)/2;
And if you want to filter by the row number:
SELECT t.* except (rn)
FROM (
SELECT t.*, ROW_NUMBER() OVER () AS rn
FROM table_name as t
) AS t
WHERE t.rn <= (SELECT COUNT(*) FROM table_name)/2;
To scale up, you can use an approx algorithm to find the 50% point:
DECLARE mid_date TIMESTAMP DEFAULT (
SELECT APPROX_QUANTILES(creation_date, 2)[OFFSET(1)] mid_date
FROM `fh-bigquery.stackoverflow_archive.201909_posts_answers` )
;
SELECT mid_date
, COUNTIF(creation_date > mid_date) first_half
, COUNTIF(creation_date < mid_date) second_half
FROM `fh-bigquery.stackoverflow_archive.201909_posts_answers`
Looks like it works well:
Now let's get these records out:
CREATE TABLE `temp.fifty_percent`
AS
SELECT *
FROM `fh-bigquery.stackoverflow_archive.201909_posts_answers`
WHERE creation_date < (
SELECT APPROX_QUANTILES(creation_date, 2)[OFFSET(1)] mid_date
FROM `fh-bigquery.stackoverflow_archive.201909_posts_answers`
)
This method will happily scale, while solutions using OVER(ORDER BY) won't.
Is there a more optimised way in SQL Server to write this code? I am trying to find the 2nd duplicate.
WITH CTE AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY id,AN_KEY ORDER BY [ENTITYID]) AS [rn]
FROM [data].[dbo].[TRANSFER]
)
select *
INTO dbo.#UpSingle
from CTE
where RN=2
UPDATE:
As GurV pointed out - this query doesn't solve the problem. It will only give you the items that have exactly two duplicates, but not the row where the second duplicate lies.
I am just going to leave this here for reference purposes.
Original Answer
Why not try something like this from another SO post: Finding duplicate values in a SQL table
SELECT
id, AN_KEY, COUNT(*)
FROM
[data].[dbo].[TRANSFER]
GROUP BY
id, AN_KEY
HAVING
COUNT(*) = 2
I gather from your original SQL that the columns you would want to group by are:
Id
AN_KEY
Here is another way to get the second duplicate row (in order of increasing ENTITYID, of course):
select *
from [data].[dbo].[TRANSFER] a
where [ENTITYID] = (
select min([ENTITYID])
from [data].[dbo].[TRANSFER] b
where [ENTITYID] > (
select min([ENTITYID])
from [data].[dbo].[TRANSFER] c
where b.id = c.id
and b.an_key = c.an_key
)
and a.id = b.id
and a.an_key = b.an_key
)
Provided there is an index on id, an_key and ENTITYID columns, performance of both your query and this should be acceptable.
Let me assume that this query does what you want:
WITH CTE AS (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY id, AN_KEY
ORDER BY [ENTITYID]) AS [rn]
FROM [data].[dbo].[TRANSFER] t
)
SELECT *
INTO dbo.#UpSingle
FROM CTE
WHERE RN = 2;
For performance, you want a composite index on [data].[dbo].[TRANSFER](id, AN_KEY, ENTITYID).
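A sketch of such an index in SQL Server. The index name is made up, and whether it helps in practice depends on the existing indexes and the data:

```sql
-- Hypothetical index name; the key order matches PARTITION BY (id, AN_KEY)
-- followed by ORDER BY (ENTITYID).
CREATE NONCLUSTERED INDEX IX_TRANSFER_id_ANKEY_ENTITYID
    ON [data].[dbo].[TRANSFER] (id, AN_KEY, ENTITYID);
```

Because the query selects every column, the engine may still need to look up the remaining columns in the base table for the rows it returns.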
I have thousands of groups in a table, something like :
1..
1..
2..
2..
2..
2..
3..
3..
.
.
.
10000..
10000..
How can I write a SELECT that gives me the top 3 groups each time?
I want something like SELECT TOP 3 from rows, but it has to return the first three groups, not the first three rows.
You can try this:
;with cte as (
-- one row per group; ORDER BY is not allowed inside a CTE without TOP
select distinct groupId from mytable
)
select * from mytable where groupId in (select top 3 groupId from cte order by groupId)
You can use DENSE_RANK to assign a number to each group. All members of the same group will have the same number. Then in an outer query, select top 3 groups:
SELECT *
FROM (SELECT *, DENSE_RANK() OVER (ORDER BY id) AS rnk
FROM mytable ) t
WHERE t.rnk <= 3
The above query assumes that id is the column used to group records together.
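To illustrate how the numbering behaves, here is a small self-contained sketch for SQL Server. The VALUES rows are made-up sample data, with id as the group column and val as a hypothetical payload column:

```sql
-- Hypothetical sample data: id is the group, val is a member of the group.
SELECT t.id, t.val, t.rnk
FROM (
    SELECT id, val, DENSE_RANK() OVER (ORDER BY id) AS rnk
    FROM (VALUES (1, 'a'), (1, 'b'),
                 (2, 'c'),
                 (5, 'd'), (5, 'e'),
                 (9, 'f')) AS mytable(id, val)
) t
WHERE t.rnk <= 3
ORDER BY t.id;
```

Groups 1, 2 and 5 receive ranks 1, 2 and 3, so all five of their rows are returned, while group 9 (rank 4) is filtered out. Gaps in the id values do not matter, because DENSE_RANK numbers the distinct values consecutively.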
Use the ranking function ROW_NUMBER():
SELECT *
FROM (SELECT *,
Row_number()
OVER(
partition BY GroupId
ORDER BY GroupId) AS [rn]
FROM YourTable) t
WHERE rn <= 3
Check this MSDN doc for details of all ranking functions.
There is a SQL TOP clause that does this:
SELECT TOP number|percent column_name(s) FROM table_name;
A description of what it does, and of the equivalent statements in other databases such as MySQL and MS Access, can be found here: http://www.w3schools.com/sql/sql_top.asp
My bad, I misread your question. This will return the top rows, not groups. Could you explain what you are trying to do in more detail?
SELECT *
FROM
(SELECT *
,ROW_NUMBER() OVER (PARTITION BY [Group] ORDER BY [Group] ASC)rn
FROM TableName
)A
WHERE rn <= 3
I have a SELECT statement that generates the following output.
If you look at the output, there are a few rows with duplicate AdjusterIds.
What I need is to get only the FIRST ROW for each repeated ID, along with the other rows.
I tried several approaches with GROUP BY, DISTINCT, etc., but had no luck.
Please note that there are several columns that I omitted for simplicity.
You can use a CTE + ROW_NUMBER:
WITH CTE AS
(
SELECT t.*, RN = ROW_NUMBER() OVER (PARTITION BY AdjusterIds
ORDER BY AdjusterIds)
FROM dbo.TableName t
)
SELECT * FROM CTE WHERE RN = 1
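Because the ORDER BY above uses the same column as the PARTITION BY, which row counts as the first one in each group is not defined. If a column exists that defines the intended order, it can go in the ORDER BY instead. This is a sketch for SQL Server with made-up data, where ClaimDate is a hypothetical column that is not in the question:

```sql
-- Hypothetical sample rows; ClaimDate is an assumed ordering column.
WITH CTE AS
(
    SELECT t.*,
           RN = ROW_NUMBER() OVER (PARTITION BY AdjusterIds
                                   ORDER BY ClaimDate)  -- earliest row per id first
    FROM (VALUES (7, '2020-01-05'), (7, '2020-01-02'), (8, '2020-01-03'))
         AS t(AdjusterIds, ClaimDate)
)
SELECT AdjusterIds, ClaimDate FROM CTE WHERE RN = 1;
```

For this data the query keeps (7, '2020-01-02') and (8, '2020-01-03').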
I have a SQL query that does some ranking, like this:
SELECT RANK() OVER(PARTITION BY XXX ORDER BY yyy,zzz,oooo) as ranking, *
FROM SomeTable
WHERE ranking = 1 --> this is not possible
I want to use that ranking in a WHERE condition at the end.
Now I nest this query in another query and do filtering on the ranking there, but is there no easier or faster way to filter on such values from the SELECT statement?
Use a CTE (Common Table Expression) - sort of an "inline" view just for the next statement:
;WITH MyCTE AS
(
SELECT
RANK() OVER(PARTITION BY XXX ORDER BY yyy,zzz,oooo) as ranking,
*
FROM SomeTable
)
SELECT *
FROM MyCTE
WHERE ranking = 1 --> this is now possible!
Sorry for the former post, I forgot: window functions can only be used in the SELECT or ORDER BY clauses.
You'll have to use a subquery:
SELECT * FROM
(
SELECT RANK() OVER(PARTITION BY XXX ORDER BY yyy,zzz,oooo) as ranking, *
FROM SomeTable
) t
WHERE ranking = 1
Or a CTE.
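In SQL Server specifically, another option that avoids the nesting is TOP (1) WITH TIES, since window functions are allowed in ORDER BY. This is a minimal sketch with made-up rows standing in for the real table; it relies on every row with ranking 1 tying for first place in the sort:

```sql
-- Hypothetical rows: two groups (name) with two ids each.
SELECT TOP (1) WITH TIES name, id
FROM (VALUES ('a', 1), ('a', 2), ('b', 3), ('b', 4)) AS SomeTable(name, id)
ORDER BY RANK() OVER (PARTITION BY name ORDER BY id);
```

This returns the rows ('a', 1) and ('b', 3). The ranking value itself is not part of the output unless it is also added to the SELECT list.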
select * from (
select RANK() OVER(PARTITION BY name ORDER BY id) as ranking, *
from PostTypes
) A
where A.ranking = 1
https://data.stackexchange.com/stackoverflow/query/edit/59515