SQL query that numbers the returned result

How can I write a single SQL query that selects a column from a table but returns two columns, where the additional one contains the index of the row (a new numbering, from 1 to n)? It must not use functions that do this directly (like row_number()).
Any ideas?
Edit: it must be a single SELECT query.

You can do this on any database:
SELECT (SELECT COUNT (1) FROM field_company fc2
WHERE fc2.field_company_id <= fc.field_company_id) AS row_num,
fc.field_company_name
FROM field_company fc
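
As a rough check of the correlated COUNT approach above, here is a minimal sketch that runs the same query shape through Python's built-in sqlite3 module. SQLite is only a stand-in for whatever database you use, and the sample rows are made up for illustration. Note that the numbering is only gap-free when field_company_id is unique.

```python
import sqlite3

def numbered_rows(conn):
    """Return (row_num, name) pairs using a correlated COUNT subquery."""
    sql = """
        SELECT (SELECT COUNT(*)
                FROM field_company fc2
                WHERE fc2.field_company_id <= fc.field_company_id) AS row_num,
               fc.field_company_name
        FROM field_company fc
        ORDER BY row_num
    """
    return conn.execute(sql).fetchall()

# Hypothetical sample data; the ids are deliberately non-contiguous and unordered.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE field_company ("
    "field_company_id INTEGER PRIMARY KEY, field_company_name TEXT)"
)
conn.executemany(
    "INSERT INTO field_company VALUES (?, ?)",
    [(10, "Acme"), (30, "Core"), (20, "Bolt")],
)
rows = numbered_rows(conn)
```

Each row's number is the count of ids less than or equal to its own, so the subquery runs once per row; on large tables this can be slow without an index on the id column.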

SET NOCOUNT ON
DECLARE @item_table TABLE
(
row_num INT IDENTITY(1, 1) NOT NULL PRIMARY KEY, --THE IDENTITY STATEMENT IS IMPORTANT!
field_company_name VARCHAR(255)
)
INSERT INTO @item_table (field_company_name)
SELECT field_company_name FROM field_company
SELECT * FROM @item_table

If you are using Oracle or another database that supports sequence objects, create a new sequence for this purpose. Next, create a view and populate it like this:
insert into the view as select column_name, sequence.nextval from table

In MySQL you can use a user variable (initialised to 0 in the same query):
SELECT Row, Column1
FROM (SELECT @row := @row + 1 AS Row, Column1 FROM table1, (SELECT @row := 0) AS init)
AS derived1

I figured out a hackish way to do this that I'm a bit ashamed of. On Postgres 8.1:
SELECT generate_series, (SELECT username FROM users LIMIT 1 OFFSET generate_series) FROM generate_series(0,(SELECT count(*) - 1 FROM users));
I believe this technique will work even if your source table does not have unique ids or identifiers.

On SQL Server 2005 and higher, you can use OVER to accomplish this:
SELECT rank() over (order by company_id) as rownum
, company_name
FROM company

Related

Update columns in DB2 using randomly chosen static values provided at runtime

I would like to update rows with values chosen randomly from a set of possible values.
Ideally I would be able to provide these values at runtime, using JdbcTemplate from a Java application.
Example:
In a table, column "name" can contain any name. The goal is to run through the table and change all names to equal to either "Bob" or "Alice".
I know that this can be done by creating an SQL function. I tested that and it worked, but I wonder whether it is possible with just a simple query.
This does not work; it seems that the value is computed once and applied to all rows:
UPDATE test.table
SET first_name =
(SELECT a.name
FROM
(SELECT a.name, RAND() idx
FROM (VALUES('Alice'), ('Bob')) AS a(name) ORDER BY idx FETCH FIRST 1 ROW ONLY) as a)
;
I tried using MERGE INTO, but it won't even run (possible_names is not found in SET query). I am yet to figure out why:
MERGE INTO test.table
USING
(SELECT
names.fname
FROM
(VALUES('Alice'), ('Bob'), ('Rob')) AS names(fname)) AS possible_names
ON ( test.table.first_name IS NOT NULL )
WHEN MATCHED THEN
UPDATE SET
-- select random name
first_name = (SELECT fname FROM possible_names ORDER BY idx FETCH FIRST 1 ROW ONLY)
;
EDIT: If possible, I would like to only focus on fields being updated and not depend on knowing primary keys and such.
Db2 seems to be optimizing away the subselect that returns your supposedly random name, materializing it only once, hence all rows in the target table receive the same value.
To force subselect execution for each row you need to somehow correlate it to the table being updated, for example:
UPDATE test.table
SET first_name =
(SELECT a.name
FROM (VALUES('Alice'), ('Bob')) AS a(name)
ORDER BY RAND(ASCII(SUBSTR(first_name, 1, 1)))
FETCH FIRST 1 ROW ONLY)
or maybe even
UPDATE test.table
SET first_name =
(SELECT a.name
FROM (VALUES('Alice'), ('Bob')) AS a(name)
ORDER BY first_name, RAND()
FETCH FIRST 1 ROW ONLY)
Now that the result of subselect seems to depend on the value of the corresponding row in the target table, there's no choice but to execute it for each row.
If your table has a primary key, this would work. I've assumed the PK is column id.
UPDATE test.table t
SET first_name =
( SELECT name from
( SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY R) AS RN FROM
( SELECT *, RAND() R
FROM test.table, TABLE(VALUES('Alice'), ('Bob')) AS d(name)
)
)
AS u
WHERE t.id = u.id and rn = 1
)
;
There might be a nicer/more efficient solution, but I'll leave that to others.
FYI I used the following DDL and data to test the above.
create table test.table(id int not null primary key, first_name varchar(32));
insert into test.table values (1,'Flo'),(2,'Fred'),(3,'Sue'),(4,'John'),(5,'Jim');

SQL Server random using seed

I want to add a column to my table with a random number using seed.
If I use RAND:
select *, RAND(5) as random_id from myTable
I get the same value (0.943597390424144, for example) in the random_id column for all the rows. I want this value to be different for every row, and every time I pass the same seed (0.5, for example) I want to get the same values again (as a seed should work).
How can I do this?
(For example, in PostgreSQL I can write
SELECT setseed(0.5);
SELECT t.* , random() as random_id
FROM myTable t
and I will get a different value in each row.)
Edit:
After I saw the comments here, I have managed to work this out somehow - but it's not efficient at all.
If someone has an idea how to improve it - it will be great. If not - I will have to find another way.
I used the basic idea of the example in here.
Creating a temporary table with blank seed value:
select * into t_myTable from (
select t.*, -1.00000000000000000 as seed
from myTable t
) as temp
Adding a random number for each seed value, one row at a time(this is the bad part...):
USE CPatterns;
GO
DECLARE @seed float;
DECLARE @id int;
DECLARE VIEW_CURSOR CURSOR FOR
select id
from t_myTable t;
OPEN VIEW_CURSOR;
FETCH NEXT FROM VIEW_CURSOR
into @id;
set @seed = RAND(5);
WHILE @@FETCH_STATUS = 0
BEGIN
set @seed = RAND();
update t_myTable set seed = @seed where id = @id
FETCH NEXT FROM VIEW_CURSOR
into @id;
END;
CLOSE VIEW_CURSOR;
DEALLOCATE VIEW_CURSOR;
GO
Creating the view using the seed value and ordering by it
create view my_view AS
select row_number() OVER (ORDER BY seed, id) AS source_id ,t.*
from t_myTable t
I think the simplest way to get a repeatable random id in a table is to use row_number() or a fixed id on each row. Let me assume that you have a column called id with a different value on each row.
The idea is just to use this as a seed:
select rand(id*1) as random_id
from mytable;
Note that the seed for the id is an integer and not a floating point number. If you wanted a floating point seed, you could do something with checksum():
select rand(checksum(id*0.5)) as random_id
. . .
If you are doing this for sampling (where you would say random_id < 0.1 for a 10% sample, for instance), then I often use modulo arithmetic on row_number():
with t as (
select t.*, row_number() over (order by id) as seqnum
from mytable t
)
select *
from t
where ((seqnum * 17 + 71) % 101) < 10
This returns about 10% of the numbers (okay, really 10/101). And you can adjust the sample by fiddling with the constants.
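
The 10/101 figure can be checked with a few lines of Python. This is only a sketch of the arithmetic, not of the SQL itself; the function name in_sample and the threshold parameter are made up for illustration. A row is kept when the remainder is below 10, out of 101 possible remainders.

```python
def in_sample(seqnum, threshold=10):
    """True when the row's sequence number lands in the sample bucket."""
    return (seqnum * 17 + 71) % 101 < threshold

# 1010 consecutive sequence numbers = 10 full cycles of 101 remainders.
picked = [n for n in range(1, 1011) if in_sample(n)]
```

Because 17 and 101 share no common factor, every run of 101 consecutive sequence numbers hits each remainder exactly once, so the sample size is predictable and the same rows are picked on every run. Changing the threshold changes the sample fraction; changing 17 or 71 changes which rows are picked.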
Someone suggested a similar query using newid(), but I'm giving you the solution that works for me.
There's a workaround that uses newid() instead of rand() and achieves the same goal. You can execute it on its own or as a column in a SELECT. It produces a random value per row rather than the same value for every row in the select statement.
If you need a random number from 1 to N, just replace 100 with the desired number.
SELECT TOP 10 [Flag forca]
,1+ABS(CHECKSUM(NEWID())) % 100 AS RANDOM_NEWID
,RAND() AS RANDOM_RAND
FROM PAGSEGURO_WORK.dbo.jobSTM248_tmp_leitores_iso
So, in case it helps someone someday, here's what I eventually did.
I generate the seeded random values on the server side (Java in my case), and then create a table with two columns: the id and the generated random_id.
Then I create the view as an inner join between that table and the original data.
The generated SQL looks something like this:
CREATE TABLE SEED_DATA(source_id INT PRIMARY KEY, random_id float NOT NULL);
select Rand(5);
insert into SEED_DATA values(1,Rand());
insert into SEED_DATA values(2, Rand());
insert into SEED_DATA values(3, Rand());
.
.
.
insert into SEED_DATA values(1000000, Rand());
and
CREATE VIEW DATA_VIEW
as
SELECT row_number() OVER (ORDER BY random_id, id) AS source_id,column1,column2,...
FROM
( select * from SEED_DATA tmp
inner join my_table i on tmp.source_id = i.id) TEMP
In addition, I create the random numbers in batches, 10,000 or so in each batch(may be higher), so it will not weigh heavily on the server side, and for each batch I insert it to the table in a separate execution.
All of that because I couldn't find a good way to do what I want purely in SQL. Updating row after row is really not efficient.
My own conclusion from this story is that SQL Server is sometimes really annoying...
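
For illustration, the application-side approach described above might look roughly like this in Python instead of Java. It is a sketch under assumptions: SQLite stands in for SQL Server, the table and column names (seed_data, my_table, column1) and the helper names are hypothetical, and the batch size is kept tiny so the batching is visible.

```python
import random
import sqlite3

def load_seed_data(conn, ids, seed, batch_size=10000):
    """Insert one reproducible pseudo-random value per id, in batches."""
    rng = random.Random(seed)  # private generator: the sequence depends only on the seed
    conn.execute(
        "CREATE TABLE seed_data (source_id INTEGER PRIMARY KEY, random_id REAL NOT NULL)"
    )
    ids = list(ids)
    for start in range(0, len(ids), batch_size):
        batch = [(i, rng.random()) for i in ids[start:start + batch_size]]
        conn.executemany("INSERT INTO seed_data VALUES (?, ?)", batch)

def shuffled_ids(seed):
    """Build a small demo table and return its ids ordered by the seeded values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, column1 TEXT)")
    conn.executemany(
        "INSERT INTO my_table VALUES (?, ?)",
        [(i, f"row {i}") for i in range(1, 21)],
    )
    load_seed_data(conn, range(1, 21), seed, batch_size=8)
    rows = conn.execute(
        "SELECT i.id FROM seed_data s JOIN my_table i ON s.source_id = i.id "
        "ORDER BY s.random_id, i.id"
    ).fetchall()
    return [r[0] for r in rows]

first_run = shuffled_ids(5)
second_run = shuffled_ids(5)
```

Running it twice with the same seed yields the same ordering, which is the property the question asks for. The ids must be fed to the generator in the same order each time for that to hold.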
You could derive a random number from a seed:
rand(row_number() over (order by ___, ___, ___))
Then cast that to a varchar, and use the last 3 characters as another seed.
That would give you a nice random value:
rand(right(cast(rand(row_number() over (order by x, y, z)) as varchar(15)), 3))

Sql server order by value not by field name

Suppose my table structure is like this:
ID OEReference
--- ------------
1 00000634B9
2 00000634B6
3 0005000053
4 0002855071
5 0000940148
6 0001414825
7 00000634B9
I want the output to keep the order in which I supply the OEReference values.
My SQL is:
Select * from mytable where OEReference in ('00000634B9','0001414825','00000634B6')
The statement above does not return the result set in the order of the IN clause. As far as I know, a plain ORDER BY clause on the column cannot do this.
How can I do it with a simple SQL statement in SQL Server? Thanks.
You could use a temporary table as a filter. An inner join will enforce the filter, and you can sort on the identity column:
declare @filter table (id int identity, ref varchar(50))
insert @filter values ('00000634B9')
insert @filter values ('0001414825')
insert @filter values ('00000634B6')
select *
from YourTable yt
join @filter filter
on filter.ref = yt.OEReference
order by
filter.id
Here is my solution:
SELECT [id], [OEReference]
FROM [Tbl]
where [OEReference] in ('002', '001')
order by case [OEReference]
when '002' then 1
when '001' then 2
end
Please note: this can reduce the performance of your server, depending on how many rows the table has. However, you can easily add an index on OEReference. Of course, you should generate such a query dynamically.
My solution is far from ideal, but maybe you will find it useful.
Happy coding!
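
Generating the ORDER BY CASE query dynamically could look something like the sketch below. It uses Python's sqlite3 module as a stand-in for SQL Server, and the helper name select_in_given_order is made up. The table contents are the sample rows from the question; the values are bound as parameters rather than pasted into the SQL text.

```python
import sqlite3

def select_in_given_order(conn, refs):
    """Fetch rows whose OEReference is in refs, in the order refs lists them."""
    if not refs:
        return []
    placeholders = ", ".join("?" for _ in refs)
    # One WHEN branch per value; the branch position becomes the sort key.
    whens = " ".join(f"WHEN ? THEN {pos}" for pos in range(len(refs)))
    sql = (
        "SELECT ID, OEReference FROM mytable "
        f"WHERE OEReference IN ({placeholders}) "
        f"ORDER BY CASE OEReference {whens} END, ID"
    )
    # The values are bound twice: once for IN, once for the CASE branches.
    return conn.execute(sql, list(refs) + list(refs)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER PRIMARY KEY, OEReference TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "00000634B9"), (2, "00000634B6"), (3, "0005000053"), (4, "0002855071"),
     (5, "0000940148"), (6, "0001414825"), (7, "00000634B9")],
)
ordered = select_in_given_order(conn, ["00000634B9", "0001414825", "00000634B6"])
```

Rows that share a reference (ids 1 and 7 here) are tie-broken by ID, which is a choice made for this sketch; the original answers leave that order unspecified.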

Looking for SQL constraint: SELECT COUNT(*) from tBoss < 2

I'd like to limit the entries in a table. Let's say in table tBoss. Is there a SQL constraint that checks how many tuples are currently in the table? Like
SELECT COUNT(*) from tBoss < 2
Firebird says:
Invalid token.
Dynamic SQL Error.
SQL error code = -104.
Token unknown - line 3, column 8.
SELECT.
You could do this with a check constraint and a scalar function. Here's how I built a sample.
First, create a table:
CREATE TABLE MyTable
(
MyTableId int not null identity(1,1)
,MyName varchar(100) not null
)
Then create a function for that table. (You could add the row count limit as a parameter if you want more flexibility.)
CREATE FUNCTION dbo.MyTableRowCount()
RETURNS int
AS
BEGIN
DECLARE @HowMany int
SELECT @HowMany = count(*)
from MyTable
RETURN @HowMany
END
Now add a check constraint using this function to the table
ALTER TABLE MyTable
add constraint CK_MyTable__TwoRowsMax
check (dbo.MyTableRowCount() < 3)
And test it:
INSERT MyTable (MyName) values ('Row one')
INSERT MyTable (MyName) values ('Row two')
INSERT MyTable (MyName) values ('Row three')
INSERT MyTable (MyName) values ('Row four')
A disadvantage is that every time you insert to the table, you have to run the function and perform a table scan... but so what, the table (with clustered index) occupies two pages max. The real disadvantage is that it looks kind of goofy... but everything looks goofy when you don't understand why it has to be that way.
(The trigger solution would work, but I like to avoid triggers whenever possible.)
Does your database have triggers? If so, add a trigger that rolls back any insert that would leave more than 2 rows in the table:
Create Trigger MyTrigName
On tBoss
For Insert
As
If (Select Count(*) From tBoss) > 2
RollBack Transaction
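
The same trigger idea can be tried out quickly with Python's sqlite3 module. This is a sketch in SQLite's trigger dialect, not Firebird or T-SQL syntax; the trigger name, the name column, and the sample values are made up. Here the guard runs before the insert and rejects it once two rows exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tBoss (name TEXT NOT NULL);

    -- Refuse the insert when the table already holds two rows.
    CREATE TRIGGER tBoss_max_two_rows
    BEFORE INSERT ON tBoss
    WHEN (SELECT COUNT(*) FROM tBoss) >= 2
    BEGIN
        SELECT RAISE(ABORT, 'tBoss may hold at most 2 rows');
    END;
""")

rejected = []
for name in ["Ann", "Bea", "Cy"]:
    try:
        conn.execute("INSERT INTO tBoss (name) VALUES (?)", (name,))
    except sqlite3.DatabaseError:
        # RAISE(ABORT, ...) surfaces in Python as a database error.
        rejected.append(name)

row_count = conn.execute("SELECT COUNT(*) FROM tBoss").fetchone()[0]
```

The first two inserts succeed and the third is refused, so the table never grows past two rows.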
But to answer your question directly: to use the count in a predicate, just put the select subquery inside parentheses, like this:
[First part of sql statement ]
Where (SELECT COUNT(*) from tBoss) < 2
To find multiples in a database, your best bet is a subquery. For example (I am assuming you are looking for duplicated rows of some sort):
SELECT id FROM tBoss WHERE id IN ( SELECT id FROM tBoss GROUP BY id HAVING count(*) > 1 )
where id is the possibly duplicated column
SELECT COUNT(*) FROM tBoss WHERE someField < 2 GROUP BY someUniqueField

Paging in Pervasive SQL

How to do paging in Pervasive SQL (version 9.1)? I need to do something similar like:
//MySQL
SELECT foo FROM table LIMIT 10, 10
But I can't find a way to define offset.
Tested query in PSQL:
select top n *
from tablename
where id not in(
select top k id
from tablename
)
Here n is the number of records you need to fetch at a time,
and k is a multiple of n (e.g. n = 5; k = 0, 5, 10, 15, ...).
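
To see the n and k bookkeeping in action, here is a small sketch using Python's sqlite3 module. SQLite has no TOP, so LIMIT plays that role here; the helper name fetch_page and the 12-row sample table are made up. An ORDER BY is added in both places, which the query above leaves out, so that the pages are deterministic.

```python
import sqlite3

def fetch_page(conn, page_size, page_index):
    """Return one page of ids; page_index is zero-based, so k = page_size * page_index."""
    skip = page_size * page_index
    sql = """
        SELECT id FROM tablename
        WHERE id NOT IN (SELECT id FROM tablename ORDER BY id LIMIT ?)
        ORDER BY id
        LIMIT ?
    """
    return [row[0] for row in conn.execute(sql, (skip, page_size))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO tablename VALUES (?)", [(i,) for i in range(1, 13)])

first_page = fetch_page(conn, 5, 0)
second_page = fetch_page(conn, 5, 1)
last_page = fetch_page(conn, 5, 2)
```

The inner query names the k rows to skip and the outer query takes the next n. The cost grows with k, since the skipped ids are recomputed for every page.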
Our paging required that we be able to pass in the current page number and page size (along with some additional filter parameters) as variables. Since select top @page_size doesn't work in MS SQL, we ended up creating a temporary table or table variable that assigns each row's primary key an identity value, which can later be filtered on for the desired page number and size.
** Note that if you have a GUID primary key or a compound key, you just have to change the object id on the temporary table to a uniqueidentifier or add the additional key columns to the table.
The down side to this is that it still has to insert all of the results into the temporary table, but at least it is only the keys. This works in MS SQL, but should be able to work for any DB with minimal tweaks.
declare @page_number int, @page_size int
-- add any additional search parameters here

--create the temporary table with the identity column and the id
--of the record that you'll be selecting. This is an in memory
--table, so if the number of rows you'll be inserting is greater
--than 10,000, then you should use a temporary table in tempdb
--instead. To do this, use
--CREATE TABLE #temp_table (row_num int IDENTITY(1,1), objectid int)
--and change all the references to @temp_table to #temp_table
DECLARE @temp_table TABLE (row_num int IDENTITY(1,1), objectid int)

--insert into the temporary table with the ids of the records
--we want to return. It's critical to make sure the order by
--reflects the order of the records to return so that the row_num
--values are set in the correct order and we are selecting the
--correct records based on the page
INSERT INTO @temp_table (objectid)
/* Example: Select that inserts records into the temporary table
SELECT personid FROM person WITH (NOLOCK)
inner join degree WITH (NOLOCK) on degree.personid = person.personid
WHERE person.lastname = @last_name
ORDER BY person.lastname asc, person.firsname asc
*/

--get the total number of rows that we matched
DECLARE @total_rows int
SET @total_rows = @@ROWCOUNT

--calculate the total number of pages based on the number of
--rows that matched and the page size passed in as a parameter
DECLARE @total_pages int
--add the @page_size - 1 to the total number of rows to
--calculate the total number of pages. This is because sql
--always rounds down for division of integers
SET @total_pages = (@total_rows + @page_size - 1) / @page_size

--return the result set we are interested in by joining
--back to the @temp_table and filtering by row_num
/* Example: Selecting the data to return. If the insert was done
properly, then you should always be joining the table that contains
the rows to return to the objectid column on the @temp_table
SELECT person.* FROM person WITH (NOLOCK)
INNER JOIN @temp_table tt ON person.personid = tt.objectid
*/
--return only the rows in the page that we are interested in
--and order by the row_num column of the @temp_table to make sure
--we are selecting the correct records
WHERE tt.row_num < (@page_size * @page_number) + 1
AND tt.row_num > (@page_size * @page_number) - @page_size
ORDER BY tt.row_num
I face this problem in MS Sql too... no Limit or rownumber functions. What I do is insert the keys for my final query result (or sometimes the entire list of fields) into a temp table with an identity column... then I delete from the temp table everything outside the range I want... then use a join against the keys and the original table, to bring back the items I want. This works if you have a nice unique key - if you don't, well... that's a design problem in itself.
Alternative with slightly better performance is to skip the deleting step and just use the row numbers in your final join. Another performance improvement is to use the TOP operator so that at the very least, you don't have to grab the stuff past the end of what you want.
So... in pseudo-code... to grab items 80-89...
create table #keys (rownum int identity(1,1), key varchar(10))
insert #keys (key)
select TOP 89 key from myTable ORDER BY whatever
delete #keys where rownum < 80
select <columns> from #keys join myTable on #keys.key = myTable.key
I ended up doing the paging in code. I just skip the first records in loop.
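
Skipping rows in application code can be as small as the sketch below. It is written in Python with a made-up helper name, page_of, and works on any iterable of rows, such as a database cursor.

```python
from itertools import islice

def page_of(rows, page_size, page_index):
    """Skip page_size * page_index rows of any iterable, then take page_size."""
    start = page_size * page_index
    return list(islice(rows, start, start + page_size))

# Stand-in for a result set: twelve rows identified by the numbers 1..12.
second_page = page_of(range(1, 13), 5, 1)
last_page = page_of(range(1, 13), 5, 2)
```

The database still sends the skipped rows to the client, so this suits modest result sets better than deep pages.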
I thought I had come up with an easy way to do the paging, but it seems that Pervasive SQL doesn't allow ORDER BY clauses in subqueries. This should work on other databases, though (I tested it on Firebird):
select *
from (select top [rows] * from
(select top [rows * pagenumber] * from mytable order by id)
order by id desc)
order by id