Add Identity column to a view in SQL Server 2008 - sql

This is my view:
Create View [MyView] as
(
Select col1, col2, col3 From Table1
Union All
Select col1, col2, col3 From Table2
)
I need to add a new column named Id, and it must be unique, so I am thinking of adding it as an identity column. The view returns a large amount of data, so I need an approach that performs well. The view also combines two SELECT queries with UNION ALL, which I think may complicate things. What do you suggest?

Use the ROW_NUMBER() function in SQL Server 2008.
Create View [MyView] as
SELECT ROW_NUMBER() OVER( ORDER BY col1 ) AS id, col1, col2, col3
FROM(
Select col1, col2, col3 From Table1
Union All
Select col1, col2, col3 From Table2 ) AS MyResults
GO

A view is just a stored query that does not contain the data itself, so you cannot add a stable ID to it. If you need an ID for other purposes, such as paging, you can do something like this:
create view MyView as
(
select row_number() over ( order by col1) as ID, col1 from (
Select col1 From Table1
Union All
Select col1 From Table2
) a
)

There is no guarantee that the rows returned by a query using ROW_NUMBER() will be ordered exactly the same with each execution unless the following conditions are true:
Values of the partitioned column are unique. [Partitions are parent-child, like a boss with 3 employees; not relevant here, so ignore.]
Values of the ORDER BY columns are unique. [If col1 is unique, row_number should be stable.]
Combinations of values of the partition column and ORDER BY columns are unique. [If you need 10 columns in your ORDER BY to make it unique, go for it; that makes row_number stable.]
There is a secondary issue here because this is a view. SQL Server does not allow ORDER BY in a view unless TOP (or FOR XML) is also specified, and even then the ORDER BY only determines which rows TOP returns; it does not guarantee the order of the rows when you select from the view. Ignoring row_number() for a second, the usual workaround looks like this:
create view MyView as
(
select top 10000000 col1 -- or: top 99.9999999 percent
from (
Select col1 From Table1
Union All
Select col1 From Table2
) a order by col1
)
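As a sketch of the uniqueness conditions quoted above: if the ORDER BY inside OVER() is unique across the combined rows, the numbering becomes repeatable as long as the underlying data does not change. The src column below is a hypothetical tag added only for illustration, and the example assumes that (col1, col2, col3) is unique within each table, which the question does not state:

```sql
-- Sketch only: src is a hypothetical column that tells the two sources apart,
-- so identical rows in Table1 and Table2 still sort in a fixed order.
-- Assumes (col1, col2, col3) is unique within each table.
CREATE VIEW MyView AS
SELECT ROW_NUMBER() OVER (ORDER BY src, col1, col2, col3) AS id,
       col1, col2, col3
FROM (
    SELECT 1 AS src, col1, col2, col3 FROM Table1
    UNION ALL
    SELECT 2 AS src, col1, col2, col3 FROM Table2
) AS MyResults;
```

The wider ORDER BY makes the sort more expensive, so this trades performance for repeatable IDs.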

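Using "row_number() over (order by col1) as ID" can be expensive on a large result set, because the rows from both tables have to be sorted by col1 before they can be numbered.
Generating the ID with NEWID() avoids the sort, although the values are regenerated every time the view is queried: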
Create View [MyView] as
(
Select ID = isnull(cast(newid() as varchar(40)), '')
, col1
, col2
, col3
From Table1
Union All
Select ID = isnull(cast(newid() as varchar(40)), '')
, col1
, col2
, col3
From Table2
)

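Use ROW_NUMBER() with "order by (select null)". This avoids the sort, so it is less expensive, and it still gives you a unique number per row, although the numbering is not guaranteed to be the same on each execution.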
Create View [MyView] as
SELECT ROW_NUMBER() over (order by (select null)) as id, *
FROM(
Select col1, col2, col3 From Table1
Union All
Select col1, col2, col3 From Table2 ) R
GO

Related

"WHERE" clause with subqueries in "IN"

I'm trying (on Impala SQL) to get the rows that have the biggest or smallest difference between two columns, and I'm trying something like this:
SELECT *
FROM table
WHERE col1 - col2 IN ( SELECT MAX(col1-col2)
FROM table, SELECT MIN(col1-col2) FROM table )
Using only one subquery works, but if I add both of them inside IN it gives an error.
Any suggestions on how I can do this?
Use a subquery join:
SELECT *
FROM table t
JOIN (
SELECT MIN(col1 - col2) AS min_diff, MAX(col1 - col2) AS max_diff
FROM table
) AS agg ON t.col1 - t.col2 IN (agg.min_diff, agg.max_diff)
In your case you cannot use IN like that. You need to either join the two subqueries or combine them into a single list with UNION. For example:
SELECT * FROM table WHERE col1 - col2 IN ( SELECT MAX(col1-col2) FROM table union SELECT MIN(col1-col2) FROM table)
Hope it helps.
Use union as follows:
SELECT * FROM table WHERE col1 - col2 IN ( SELECT MAX(col1-col2) FROM table
Union
SELECT MIN(col1-col2) FROM table )
-- update
Use rank as follows:
SELECT *
FROM (SELECT t.*,
Rank() over (order by col1 - col2) as rn,
Rank() over (order by col1 - col2 desc) as rnd
FROM table t) t
Where rn = 1 or rnd = 1
I would prefer using a CTE, as shown below:
with difference as
(
select min(col1-col2) minDifference,max(col1-col2) maxDifference
from table
)
select *
from table as t
cross join difference as d
where t.col1-t.col2 in (d.minDifference,d.maxDifference)

Select group by with a max predicate

Quite often I have to do queries like below:
select col1, max(id)
from Table
where col2 = 'value'
and col3 = ( select max(col3)
from Table
where col2 = 'value'
)
group by col1
Are there any other ways to avoid subqueries and temp tables? Basically I need a group by on all the rows with a particular max value. Assuming all proper indices are used.
You can use an OLAP function to achieve this. I would say this solution is marginally better in that your predicates are not duplicated between the main query and subquery, so you don't violate DRY:
SELECT col1, max(id) AS max_id
FROM (
select col1, id,
RANK() OVER (ORDER BY col3 DESC) AS irow
from Table
where col2 = 'value'
) subquery
WHERE subquery.irow = 1
GROUP BY col1

Generate Unique ID On a Select in DB2

I have a select that looks like this:
SELECT * FROM (SELECT DISTINCT COL1, COL2, COL3
FROM view a WHERE conditions ....
) QUERY
WHERE CONDITIONS... LIMIT 20 OFFSET 0
I'm executing this from Java and I need the query to return a unique id for each row.
So I tried:
SELECT TRIM(CHAR(HEX(GENERATE_UNIQUE()))) AS GUID, QUERY.* FROM (SELECT DISTINCT COL1, COL2, COL3
FROM view a WHERE conditions ....
) QUERY
WHERE CONDITIONS... LIMIT 20 OFFSET 0
This returns an error telling me I can't use this function in that place.
If I try:
SELECT * FROM (SELECT DISTINCT TRIM(CHAR(HEX(GENERATE_UNIQUE()))) AS GUID, COL1, COL2, COL3
FROM view a WHERE conditions ....
) QUERY
WHERE CONDITIONS... LIMIT 20 OFFSET 0
I get duplicated rows, because the generated GUID makes every row distinct, so it is as if I ran the query without DISTINCT.
Does anyone know a way to do it?
I don't know the DB2 version (I have tried all the solutions from "How to check db2 version").
If a numeric id would do, how about just using row_number():
SELECT CAST(ROW_NUMBER() OVER (ORDER BY COL1, COL2, COL3) as VARCHAR(255)) as unique_id,
QUERY.*
FROM (SELECT DISTINCT COL1, COL2, COL3
FROM view a
WHERE conditions ....
) QUERY
WHERE CONDITIONS...
LIMIT 20 OFFSET 0

SQL script for retrieving 5 unique values in a table ( google big query )

I am looking for a query that returns 5 unique values from a table.
The table has more than 100 columns. Is there any way I can get the unique values?
I am using Google BigQuery and tried this option:
select col1, col2, ... coln
from tablename
where col1 is not null and col2 is not null
group by col1,col2... coln
order by col1, col2... coln
limit 5
But the problem is that it returns zero records if all the columns are null.
Thanks
R
I think you might be able to do this in Google bigquery, assuming that the types for the columns are compatible:
select colname, colvalue
from (select 'col1' as colname, col1 as colvalue
from t
where col1 is not null
group by col1
limit 5
),
(select 'col2' as colname, col2 as colvalue
from t
where col2 is not null
group by col2
limit 5
),
. . .
For those not familiar with the syntax, a comma in the FROM clause means UNION ALL, not CROSS JOIN, in legacy BigQuery SQL. Why did they have to change this?
Try this one; I hope it works:
;With CTE as (
select * ,ROW_NUMBER () over (partition by isnull(col1,''),isnull(col2,'')... isnull(coln,'') order by isnull(col1,'')) row_id
from tablename
) select * from CTE where row_id =1
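In BigQuery standard SQL, where a comma in the FROM clause is a cross join rather than a UNION ALL, the per-column idea from the first answer might be sketched as follows. The table name t is taken from that answer, and the cast to STRING is an assumption added so that columns of different types can be combined:

```sql
-- Sketch only: t is a placeholder table name.
-- Each branch returns up to 5 distinct non-null values of one column;
-- CAST(... AS STRING) is an assumption so the branches have matching types.
(SELECT DISTINCT 'col1' AS colname, CAST(col1 AS STRING) AS colvalue
 FROM t
 WHERE col1 IS NOT NULL
 LIMIT 5)
UNION ALL
(SELECT DISTINCT 'col2' AS colname, CAST(col2 AS STRING) AS colvalue
 FROM t
 WHERE col2 IS NOT NULL
 LIMIT 5);
```

A column that is entirely null simply contributes no rows, so the other columns are still returned.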

Multiple rows match, but I only want one?

Sometimes I wish to perform a join whereby I take the largest value of one column. To do this I have to use max() and a GROUP BY, which prevents me from retrieving the other columns from the row that held the max (because they are not contained in the GROUP BY or in an aggregate function).
To fix this, I join the max value back to the original data source to get the other columns. However, my problem is that this sometimes returns more than one row.
So far I have something like:
SELECT * FROM
(SELECT Col1, Max(Col2) AS Col2 FROM Table GROUP BY Col1) tab1
JOIN
(SELECT Col1, Col2 FROM Table) tab2
ON tab1.Col2 = tab2.Col2
If the above query now returns three rows (which match the largest value for column2) I have a bit of a headache.
If there was an extra column- col3 and for the rows returned by the above query, I only wanted to return the one which was, say the minimum Col3 value- how would I do this?
If you are using SQL Server 2005 or later, you can do it like this:
CTE way:
;WITH CTE
AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY Col1 ORDER BY Col2 DESC) AS RowNbr,
table.*
FROM
table
)
SELECT
*
FROM
CTE
WHERE
CTE.RowNbr=1
Subquery way:
SELECT
*
FROM
(
SELECT
ROW_NUMBER() OVER(PARTITION BY Col1 ORDER BY Col2 DESC) AS RowNbr,
table.*
FROM
table
) AS T
WHERE
T.RowNbr=1
As I understand it, it could be something like this:
SELECT * FROM
(SELECT Col1, Max(Col2) AS Col2 FROM Table GROUP BY Col1) tab1
JOIN
(SELECT Col1, Col2 FROM Table) tab2
ON tab1.Col2 = tab2.Col2 and Col3 = (select min(Col3) from table )
Assuming you are using SQL Server 2005 or later, you can make use of window functions here. I have chosen ROW_NUMBER(), but it is not the only option.
;WITH T AS
( SELECT *,
ROW_NUMBER() OVER(PARTITION BY Col1 ORDER BY Col2 DESC) [RowNumber]
FROM Table
)
SELECT *
FROM T
WHERE RowNumber = 1
The PARTITION BY within the OVER clause is equivalent to your group by in your subquery, then your ORDER BY determines the order in which to start numbering the rows. In this case Col2 DESC to start with the highest value of col2 (Equivalent to your MAX statement).
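For the tie-break asked about in the question (several rows share the largest Col2, and only the one with the smallest Col3 should be kept), a minimal sketch is to add Col3 as a second sort key. This assumes SQL Server 2005 or later, as above, and [Table] stands in for the real table name:

```sql
-- Sketch only: [Table] is a placeholder for the real table name.
-- Within each Col1 group, the highest Col2 sorts first;
-- ties on Col2 are broken by the lowest Col3.
;WITH T AS
(   SELECT  *,
            ROW_NUMBER() OVER (PARTITION BY Col1
                               ORDER BY Col2 DESC, Col3 ASC) AS RowNumber
    FROM    [Table]
)
SELECT  *
FROM    T
WHERE   RowNumber = 1;
```

If two rows also tie on Col3, ROW_NUMBER() still returns exactly one of them, but which one is not guaranteed.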