Auto id for select statements in Oracle - sql

Is it possible to create or have an auto id column in a select statement in Oracle?
Example:
Assume we have a table ITEMS without an id
Normal select-statement
Select name
from ITEMS
What I'm looking for is something like this
select AutoIdGen(), name
from ITEMS

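You can use ROWID or ROWNUM in Oracle, like this (ROWNUM numbers the rows as they are returned, while ROWID is the address of the row rather than a counter):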
SELECT ROWID,ROWNUM,name from ITEMS;

You can use row_number for this. The row_number analytic function works a little differently than rownum. If you want, you can also partition the results, or number them by different columns than the ones the results are sorted on.
select row_number() over (order by name)
, name
from ITEMS
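As a rough illustration of the row_number answer above, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database through Python's sqlite3 module. SQLite only stands in for Oracle here, and the sample rows and the auto_id alias are assumptions made for the demo.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle; the table mirrors ITEMS
# from the question and the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [("pear",), ("apple",), ("fig",)],
)

# row_number() assigns 1, 2, 3, ... following the ORDER BY inside OVER().
# The outer ORDER BY makes the output order match the generated ids.
rows = conn.execute(
    """
    SELECT row_number() OVER (ORDER BY name) AS auto_id, name
    FROM items
    ORDER BY auto_id
    """
).fetchall()

for auto_id, name in rows:
    print(auto_id, name)
```

The ids are generated per query, so they are only stable as long as the ordering column identifies the rows uniquely.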

Related

Oracle SQL - Retrieve select and new column with the sum of an existing column

So, I am running a large query that returns n columns, but additionally, I would like to retrieve another column as the sum of an existing column.
I know that this works:
select A.*, (select sum(column_n) from my_query) from my_query A
The problem is my_query is quite large and I don't want to repeat it twice.
Thanks, guys.
Use window functions:
select q.*, sum(column_n) over ()
from my_query q;
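To see what the window-function answer returns, here is a small sketch using SQLite through Python's sqlite3 module as a stand-in for Oracle. The readings table, its rows, and the total_n alias are hypothetical; my_query is written once as a common table expression so it is not repeated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical base table; in the question, my_query is a much larger query.
conn.execute("CREATE TABLE readings (label TEXT, column_n INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 10), ("b", 20), ("c", 30), ("d", -5)],
)

# my_query is defined once as a CTE. sum(...) OVER () computes the total over
# every row of my_query and repeats it on each output row.
rows = conn.execute(
    """
    WITH my_query AS (
        SELECT label, column_n FROM readings WHERE column_n > 0
    )
    SELECT q.*, sum(column_n) OVER () AS total_n
    FROM my_query q
    ORDER BY q.label
    """
).fetchall()

for row in rows:
    print(row)
```

An empty OVER () means the whole result set is treated as a single window, which is why every row carries the same total.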

Handling duplicates in BigQuery (Nested Table)

I think this is a very simple question, but I would like some guidance. I don't want to drop the table and load a new one with the deduplicated records. Is it possible in BigQuery to do something like a DELETE FROM based on the query below? PS: This is a nested table!
SELECT
*
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY id, date_register) row_number
FROM
dataset.table)
WHERE
row_number = 1
order by id, date_register
To de-duplicate in place, without re-creating the table - use MERGE:
MERGE `temp.many_random` t
USING (
SELECT DISTINCT *
FROM `temp.many_random`
)
ON FALSE
WHEN NOT MATCHED BY SOURCE THEN DELETE
WHEN NOT MATCHED BY TARGET THEN INSERT ROW
It's simpler than the current accepted answer, as it won't ask you to match the current partitioning or clustering - it will just respect it.
Update: please also check Felipe Hoffa's answer which is simpler, and learn more on this post: BigQuery Deduplication.
You need to exclude row_number from output and overwrite your table using CREATE OR REPLACE TABLE:
CREATE OR REPLACE TABLE your_table
PARTITION BY DATE(date_register) AS
SELECT
* EXCEPT(row_number)
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY id, date_register) row_number
FROM your_table)
WHERE
row_number = 1
If you don't have a partition field defined at the source, I recommend that you create a new table with the partition field to make this query work so that you can automate the process.
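The ROW_NUMBER de-duplication idea can be tried out locally with a sketch like the one below. It uses SQLite through Python's sqlite3 module as a stand-in for BigQuery, so it has no partitioning, no nested fields, and no `* EXCEPT`; the columns are listed by name instead. The events table, its columns, and the events_dedup name are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with one exact duplicate row.
conn.execute("CREATE TABLE events (id INTEGER, date_register TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "x"),
        (1, "2024-01-01", "x"),  # duplicate of the row above
        (2, "2024-01-02", "y"),
    ],
)

# Number the rows inside each (id, date_register) group and keep only the
# first one. The helper column rn is left out by naming the wanted columns.
conn.execute(
    """
    CREATE TABLE events_dedup AS
    SELECT id, date_register, payload
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY id, date_register) AS rn
        FROM events
    )
    WHERE rn = 1
    """
)

rows = conn.execute(
    "SELECT id, date_register, payload FROM events_dedup ORDER BY id"
).fetchall()
print(rows)
```

Without an ORDER BY inside the window, which duplicate survives is arbitrary. That is harmless when the duplicates are identical, but worth an explicit ordering when they differ in other columns.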

How to use DISTINCT while selecting all columns including a sequence number column?

My query is to avoid duplicate in a particular column while selecting all columns. But DISTINCT is not working since seq.number column is also being selected.
Any idea to make the query work
In the example query below, seq_num is a unique key.
select DISTINCT(name), seq_num from table_1;
Edit: including sample data in picture: https://i.stack.imgur.com/Y3NYn.jpg
For two columns this query will be enough:
SELECT name, min(seq_num)
FROM table_1
GROUP BY name
For more columns, use the row_number analytic function:
SELECT name, col1, col2, .... col500, seq_num
FROM (
SELECT t.*, row_number() over (partition by name order by seq_num ) As rn
FROM table_1 t
)
WHERE rn = 1
The above queries pick only one row with a given name and the smallest seq_num value for each name.
You cannot do what you want. Please read more about the DISTINCT clause and query result sets, and you will see that DISTINCT is not suitable for your issue. If you provide some sample data showing what you have and what the query should return, we will help where possible.

Return only the newest rows from a BigQuery table with duplicate items

I have a table with many duplicate items – Many rows with the same id, perhaps with the only difference being a requested_at column.
I'd like to do a select * from the table, but only return one row with the same id – the most recently requested.
I've looked into group by id but then I need to do an aggregate for each column. This is easy with requested_at – max(requested_at) as requested_at – but the others are tough.
How do I make sure I get the value for title, etc that corresponds to that most recently updated row?
I suggest a similar form that avoids a sort in the window function:
SELECT *
FROM (
SELECT
*,
MAX(<timestamp_column>)
OVER (PARTITION BY <id_column>)
AS max_timestamp,
FROM <table>
)
WHERE <timestamp_column> = max_timestamp
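Here is a small sketch of the MAX() OVER (PARTITION BY ...) form, run against SQLite through Python's sqlite3 module as a stand-in for BigQuery. The requests table and its rows are hypothetical, and requested_at is a plain integer to keep the demo short. One behavior worth knowing: when two rows of the same id share the latest timestamp, this form returns both of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; id 3 has two rows tied on the latest requested_at.
conn.execute("CREATE TABLE requests (id INTEGER, requested_at INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?)",
    [
        (1, 1, "old"),
        (1, 2, "new"),
        (2, 5, "only"),
        (3, 7, "a"),
        (3, 7, "b"),
    ],
)

# Attach the per-id maximum to every row, then keep the rows that equal it.
rows = conn.execute(
    """
    SELECT id, requested_at, title
    FROM (
        SELECT *,
               MAX(requested_at) OVER (PARTITION BY id) AS max_requested_at
        FROM requests
    )
    WHERE requested_at = max_requested_at
    ORDER BY id, title
    """
).fetchall()

for row in rows:
    print(row)
```

If exactly one row per id is required even with ties, a ROW_NUMBER() filter with a tie-breaking ORDER BY is the safer choice.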
Try something like this:
SELECT *
FROM (
SELECT
*,
ROW_NUMBER()
OVER (
PARTITION BY <id_column>
ORDER BY <timestamp_column> DESC)
row_number,
FROM <table>
)
WHERE row_number = 1
Note it will add a row_number column, which you might not want. To fix this, you can select individual columns by name in the outer select statement.
In your case, it sounds like the requested_at column is the one you want to use in the ORDER BY.
And, you will also want to use allow_large_results, set a destination table, and specify no flattening of results (if you have a schema with repeated fields).

SQLServer SQL query with a row counter

I have a SQL query, that returns a set of rows:
SELECT id, name FROM users where group = 2
I also need to include a column that has an incrementing integer value, so the first row needs to have a 1 in the counter column, the second a 2, the third a 3, etc.
The query shown here is just a simplified example, in reality the query could be arbitrarily complex, with several joins and nested queries.
I know this could be achieved using a temporary table with an autonumber field, but is there a way of doing it within the query itself?
For starters, something along the lines of:
SELECT my_first_column, my_second_column,
ROW_NUMBER() OVER (ORDER BY my_order_column) AS Row_Counter
FROM my_table
However, it's important to note that the ROW_NUMBER() OVER (ORDER BY ...) construct only determines the values of Row_Counter, it doesn't guarantee the ordering of the results.
Unless the SELECT itself has an explicit ORDER BY clause, the results could be returned in any order, dependent on how SQL Server decides to optimise the query. (See this article for more info.)
The only way to guarantee that the results will always be returned in Row_Counter order is to apply exactly the same ordering to both the SELECT and the ROW_NUMBER():
SELECT my_first_column, my_second_column,
ROW_NUMBER() OVER (ORDER BY my_order_column) AS Row_Counter
FROM my_table
ORDER BY my_order_column -- exact copy of the ordering used for Row_Counter
The above pattern will always return results in the correct order and works well for simple queries, but what about an "arbitrarily complex" query with perhaps dozens of expressions in the ORDER BY clause? In those situations I prefer something like this instead:
SELECT t.*
FROM
(
SELECT my_first_column, my_second_column,
ROW_NUMBER() OVER (ORDER BY ...) AS Row_Counter -- complex ordering
FROM my_table
) AS t
ORDER BY t.Row_Counter
Using a nested query means that there's no need to duplicate the complicated ORDER BY clause, which means less clutter and easier maintenance. The outer ORDER BY t.Row_Counter also makes the intent of the query much clearer to your fellow developers.
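The nested-query pattern can be checked with a sketch like this one, which uses SQLite through Python's sqlite3 module as a stand-in for SQL Server. The users table, the grp column, and the two-column ordering are assumptions standing in for the "complex ordering" placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data; grp plays the role of an extra ordering column.
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, grp INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "carol", 1), (2, "alice", 2), (3, "bob", 2)],
)

# The ordering is written once, inside ROW_NUMBER(). The outer query only
# sorts by the generated counter, so the two can never drift apart.
rows = conn.execute(
    """
    SELECT t.*
    FROM (
        SELECT id, name,
               ROW_NUMBER() OVER (ORDER BY grp DESC, name) AS row_counter
        FROM users
    ) AS t
    ORDER BY t.row_counter
    """
).fetchall()

for row in rows:
    print(row)
```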
In SQL Server 2005 and up, you can use the ROW_NUMBER() function, which has options for the sort order and the groups over which the counts are done (and reset).
The simplest way is to use a variable as a row counter. However, it takes two actual SQL commands: one to set the variable, and then the query, as follows:
SET @n=0;
SELECT @n:=@n+1, a.* FROM tablename a
Your query can be as complex as you like, with joins etc. I usually make this a stored procedure. You can have all kinds of fun with the variable, even use it to calculate against field values. The key is the := assignment. Note that this is MySQL user-variable syntax; SQL Server does not accept := inside a SELECT.
Here's a different approach.
If you have several tables of data that are not joinable, or for some reason you don't want to count all the rows at the same time but still want them to be part of the same row count, you can create a table that does the job for you.
Example:
create table #test (
rowcounter int identity,
invoicenumber varchar(30)
)
insert into #test(invoicenumber) select [column] from [Table1]
insert into #test(invoicenumber) select [column] from [Table2]
insert into #test(invoicenumber) select [column] from [Table3]
select * from #test
drop table #test
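The same idea can be sketched outside SQL Server. The version below uses SQLite through Python's sqlite3 module, where an INTEGER PRIMARY KEY AUTOINCREMENT column plays the role of the identity column. The three source tables, their contents, and the ORDER BY on each insert are assumptions added to make the demo deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three hypothetical source tables that cannot be joined to each other.
sources = {
    "table1": ["A1", "A2"],
    "table2": ["B1"],
    "table3": ["C1", "C2"],
}
for table, invoices in sources.items():
    conn.execute(f"CREATE TABLE {table} (invoicenumber TEXT)")
    conn.executemany(
        f"INSERT INTO {table} VALUES (?)", [(inv,) for inv in invoices]
    )

# Temporary table whose auto-incrementing key acts as the shared row counter.
conn.execute(
    """
    CREATE TEMP TABLE test (
        rowcounter INTEGER PRIMARY KEY AUTOINCREMENT,
        invoicenumber TEXT
    )
    """
)

# Each insert continues the numbering where the previous one stopped.
for table in sources:
    conn.execute(
        f"INSERT INTO test (invoicenumber) "
        f"SELECT invoicenumber FROM {table} ORDER BY invoicenumber"
    )

rows = conn.execute("SELECT * FROM test ORDER BY rowcounter").fetchall()
print(rows)
conn.execute("DROP TABLE test")
```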