Adding same random rows to table in SQL - sql

I have the following base table:
_ID_ _Name_
1 Bart Smit
2 Ahmed Lissabon
3 Medina Aziz
4 Ben Joeson
I would like to assign random titles to the rows of the table above. However, the same title should be assigned to the same person every time the query is run. Suppose I have the following table of titles:
_Titles_
Captain
Mr.
Ms.
Prince
King
Queen
Lieutenant
Doctor
Sir
So the output could look like this:
_ID_ _Title_ _Name_
1 Doctor Bart Smit
2 King Ahmed Lissabon
3 Captain Medina Aziz
4 Sir Ben Joeson
But it should assign those same titles to those same names every time I run the code. Currently I use NEWID() in combination with a CROSS APPLY, which assigns different random titles to the names on every run:
SELECT _ID_, R._Title_, _NAME_
FROM TABLE
CROSS APPLY
(
SELECT TOP 1 Title
FROM #titles
WHERE TABLE.[_ID_]= TABLE.[_ID_]
ORDER BY NEWID()
) R

If you want the same result every time, don't just run the query; store the result by updating the table:
WITH cte AS (
SELECT t.*, R.title AS new_title
FROM TABLE t
CROSS APPLY(SELECT TOP 1 Title
FROM #titles
ORDER BY NEWID()) R
WHERE _TITLE_ IS NULL
)
UPDATE cte
SET _Title_ = new_title;

You can't use any random calculation if it has to be repeatable (unless you add a new column to store that random value).
This applies checksum and modulo to get a repeatable value:
select *
from tab
join
( select *
,row_number() over (order by title) -1 as rn
from titles
) as t
-- or simply hardcoded 9
on abs(checksum(tab.Name,tab.id) % (select count(*) from titles)) = t.rn
;
Of course, this will still return different results when the number of titles changes (or you could add an ID column to titles instead of computing row_number).
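The checksum-and-modulo idea can be tried outside the database. The following is only a minimal sketch in Python: `zlib.crc32` stands in for T-SQL's `CHECKSUM` (it is an assumption, not the same hash), and the helper name `assign_title` is hypothetical. It shows that a stable hash of the row's key, taken modulo the number of titles, picks the same title on every run.

```python
import zlib

# Sorting the titles plays the role of row_number() over (order by title).
titles = sorted(["Captain", "Mr.", "Ms.", "Prince", "King", "Queen",
                 "Lieutenant", "Doctor", "Sir"])
people = [(1, "Bart Smit"), (2, "Ahmed Lissabon"),
          (3, "Medina Aziz"), (4, "Ben Joeson")]

def assign_title(person_id, name):
    """Pick a title from a stable hash of (id, name); hypothetical helper."""
    key = f"{person_id}:{name}".encode("utf-8")
    # crc32 is deterministic across runs, unlike NEWID() or Python's hash() on str
    return titles[zlib.crc32(key) % len(titles)]

assigned = [(pid, assign_title(pid, name), name) for pid, name in people]
for row in assigned:
    print(row)
```

The titles look arbitrary but never change between runs, as long as the list of titles and the hashed columns stay the same.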

Related

How to update a column by repositioning the values in a random order

Okay, so this table will work as an example of what I am working with. This table consists of the name of someone and the order they are in compared to others:
NAME ORDER
ZAC 1
JEFF 2
BART 3
KATE 4
My goal is to take the numbers in ORDER and reposition them randomly and update that into the table, keeping the NAME records in the same position that they were in originally.
Example of the desired result:
NAME ORDER
ZAC 3
JEFF 1
BART 4
KATE 2
Using the table above, I have tried the following solutions:
#1
Update TEST_TABLE
Set ORDER = dbms_random.value(1,4);
This resulted in random numbers between 1 and 4 inclusive, but the numbers could repeat, so the same number could appear in ORDER multiple times.
Example of the attempted solution:
NAME ORDER
ZAC 3
JEFF 1
BART 3
KATE 2
#2
Update TEST_TABLE
Set ORDER = (Select dbms_random.value(1,4) From dual);
This resulted in the same random number being copied into each ORDER record, so if the number came out at 3, then it would change them all to 3.
Example of the attempted solution:
NAME ORDER
ZAC 3
JEFF 3
BART 3
KATE 3
This is my first time posting to StackOverflow, and I am relatively new to Oracle, so hopefully I proposed this question properly.
How about this?
Sample data:
SQL> select * from test order by rowid;
NAME C_ORDER
---- ----------
Zac 1
Jeff 2
Bart 3
Kate 4
Table is updated based on value acquired by the row_number analytic function which sorts data randomly; matches are found by the rowid value:
SQL> merge into test a
using (with counter (cnt) as
(select count(*) from test)
select t.rowid rid,
row_number() over(order by dbms_random.value(1, c.cnt)) rn
from counter c cross join test t
) b
on (a.rowid = b.rid)
when matched then update set
a.c_order = b.rn;
4 rows merged.
Result:
SQL> select * from test order by rowid;
NAME C_ORDER
---- ----------
Zac 3
Jeff 4
Bart 1
Kate 2
SQL>
How about this?
MERGE INTO test d USING
(SELECT rownum AS new_order,
name
FROM (SELECT *
FROM test
ORDER BY dbms_random.value)) s
ON (d.name = s.name)
WHEN matched THEN
UPDATE
SET d.sort_order = s.new_order;
The new order is built by simply sorting the original data by random values and using rownum to number those randomly ordered records from 1 to N.
I use NAME to match the records, but you should use the primary key or rowid, as in Littlefoot's answer.
Or at least use an indexed column that uniquely identifies a row (for speed, when the table contains a lot of data).
The simplest is to sort the data randomly and join on the "name" column:
merge into data dst
using (
select rownum as rn, name from (
select name from data order by dbms_random.value()
)
) src
on (src.name = dst.name)
when matched then
update set ord = src.rn
;
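To check that this kind of shuffle yields each number exactly once, here is a small sketch using Python's built-in sqlite3 module rather than Oracle. It assumes SQLite 3.25 or later for window functions; `random()` stands in for `dbms_random.value`, and the temporary table name `shuffled` is made up for the example. The random numbering is materialized once, so every row is matched against the same shuffle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (name TEXT PRIMARY KEY, c_order INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?)",
                 [("Zac", 1), ("Jeff", 2), ("Bart", 3), ("Kate", 4)])

# Number the rows 1..N in random order and store that numbering once.
conn.execute("""
    CREATE TEMP TABLE shuffled AS
    SELECT name, ROW_NUMBER() OVER (ORDER BY random()) AS rn
    FROM test
""")

# Copy the stored numbers back, matching on the unique key.
conn.execute("""
    UPDATE test
    SET c_order = (SELECT rn FROM shuffled WHERE shuffled.name = test.name)
""")

rows = conn.execute("SELECT name, c_order FROM test").fetchall()
print(rows)
```

The names stay where they were and the c_order values are a permutation of 1 to 4, so no number repeats.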

Update SQL column using Rank() function

I have a table with existing data. For each unique value in the first column of this table, we have a column that is supposed to be in sequential order, but this table has gotten out of order. I want to run a SQL statement that will put this second column back in order. I was able to see the results I want with this SQL:
select FORMULA_ID, ATTRIB_CODE, ATTRIB_VAL, ATTRIB_ORDER,
rank() over (partition by formula_id order by attrib_code, attrib_val) AS WANT_THIS
from ATTRIB
Which yields:
FORMULA_ID ATTRIB_CODE ATTRIB_VAL ATTRIB_ORDER WANT_THIS
----------- -------------------- ---------------- ------------ ---------
2791 C_BRAND ROMAN HOLIDAY 3 1
2791 C_ENDUSE DINNER 4 2
2791 C_ENDUSE SNACK 6 3
2791 C_ENDUSER 10-17 7 4
2791 C_PRODTYPE SALAD 13 5
2791 C_RELIG ANY 14 6
2821 C_ALLERGEN PEANUT 1 1
2821 C_ALLERGEN SOY 2 2
2821 C_BRAND ROMAN HOLIDAY 1 3
2821 C_ENDUSE DINNER 1 4
As you can see, the WANT_THIS column orders the rows and resets to 1 when it gets to a new FORMULA_ID. But I don't know how to convert this into an UPDATE statement that will actually put the value in WANT_THIS into the column ATTRIB_ORDER. Is there a way to convert the SQL above into an UPDATE statement?
This is one way:
WITH CTE AS
(
SELECT FORMULA_ID,
ATTRIB_CODE,
ATTRIB_VAL,
ATTRIB_ORDER,
RANK() OVER (PARTITION BY formula_id
ORDER BY attrib_code, attrib_val) AS WANT_THIS
FROM ATTRIB
)
UPDATE CTE
SET ATTRIB_ORDER = WANT_THIS;
This should work on MySQL (version 8.0 or later, which added window functions such as rank()):
UPDATE attrib
LEFT JOIN (
SELECT formula_id, attrib_code, attrib_val,
rank() over (partition by formula_id order by attrib_code, attrib_val)
want_this FROM attrib
) AS new_values
ON
attrib.formula_id = new_values.formula_id AND
attrib.attrib_code = new_values.attrib_code AND
attrib.attrib_val = new_values.attrib_val
SET
attrib_order = new_values.want_this
Short description: We are updating the attrib table. First we calculate new_values using a subquery. Then we connect (LEFT JOIN) the subquery with the existing attrib table. Once the join is made, we know exactly which row each want_this value should be applied to. The ON condition is long here, and it would be better to use a unique identifier if possible.
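As a rough illustration of the unique-identifier suggestion, the sketch below runs the same idea in SQLite through Python's sqlite3 module (assuming SQLite 3.25+ for window functions). The `id` column is a hypothetical surrogate key that is not part of the original table, and only a few of the sample rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE attrib (
        id INTEGER PRIMARY KEY,   -- hypothetical surrogate key
        formula_id INTEGER,
        attrib_code TEXT,
        attrib_val TEXT,
        attrib_order INTEGER
    )
""")
conn.executemany(
    "INSERT INTO attrib (formula_id, attrib_code, attrib_val, attrib_order) "
    "VALUES (?, ?, ?, ?)",
    [(2791, "C_BRAND", "ROMAN HOLIDAY", 3),
     (2791, "C_ENDUSE", "DINNER", 4),
     (2791, "C_ENDUSE", "SNACK", 6),
     (2821, "C_ALLERGEN", "PEANUT", 1),
     (2821, "C_ALLERGEN", "SOY", 2),
     (2821, "C_BRAND", "ROMAN HOLIDAY", 1)])

# Compute the rank per formula_id, then match it back by the unique id.
conn.execute("""
    UPDATE attrib
    SET attrib_order = (
        SELECT want_this
        FROM (SELECT id,
                     RANK() OVER (PARTITION BY formula_id
                                  ORDER BY attrib_code, attrib_val) AS want_this
              FROM attrib) AS ranked
        WHERE ranked.id = attrib.id
    )
""")

orders = [r[0] for r in conn.execute(
    "SELECT attrib_order FROM attrib "
    "ORDER BY formula_id, attrib_code, attrib_val")]
print(orders)
```

With a single key column the match condition shrinks to one comparison, and the numbering restarts at 1 for each formula_id.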

Complex SQL query or queries

I looked at other examples, but I don't know enough about SQL to adapt it to my needs. I have a table that looks like this:
ID Month NAME COUNT First LAST TOTAL
------------------------------------------------------
1 JAN2013 fred 4
2 MAR2013 fred 5
3 APR2014 fred 1
4 JAN2013 Tom 6
5 MAR2014 Tom 1
6 APR2014 Tom 1
This could be in separate queries, but I need 'First' to equal the first month that a particular name is used, so every row with fred would have JAN2013 in the First field, for example. I need the 'Last' column to equal the month of the last record of each name, and finally I need the 'Total' column to be the sum of all the counts for each name, so in each row that had fred the total would be 10 in this sample data. This is over my head. Can one of you assist?
This is crude but should do the trick. I renamed your fields a bit because you are using a bunch of "RESERVED" sql words and that is bad form.
;WITH cte as
(
Select
[NAME]
,[txtMONTH]
,[nmCOUNT]
-- txtMONTH must sort chronologically (a date, or a number like 201301); text like 'JAN2013' sorts alphabetically
,ROW_NUMBER() over (partition by NAME order by txtMONTH ASC) as FirstMonth
,ROW_NUMBER() over (partition by NAME order by txtMONTH DESC) as LastMonth
-- a windowed SUM gives the per-name total without collapsing the rows
,SUM([nmCOUNT]) over (partition by NAME) as TotNameCount
From [Table]
)
,cteFirst as
(
Select
NAME
,[nmCOUNT]
,[TotNameCount]
,[txtMONTH] as ansFirst
From cte
Where FirstMonth = 1
)
,cteLast as
(
Select
NAME
,[txtMONTH] as ansLast
From cte
Where LastMonth = 1
)
Select c.NAME, c.nmCOUNT, c.ansFirst, l.ansLast, c.TotNameCount
From cteFirst c
LEFT JOIN cteLast l on c.NAME = l.NAME
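For comparison, here is a sketch that fills in first, last and total on every original row, which is what the question describes. It uses SQLite via Python's sqlite3 module (assuming SQLite 3.25+ for window functions). The table name `monthly_counts`, the column names, and the `month_key` helper column are all made up for the example; `month_key` turns text such as 'JAN2013' into a sortable number like 201301, since the month text itself sorts alphabetically.

```python
import sqlite3

MONTHS = {m: i for i, m in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}

def month_key(text):
    """'JAN2013' -> 201301, so months compare chronologically."""
    return int(text[3:]) * 100 + MONTHS[text[:3].upper()]

data = [(1, "JAN2013", "fred", 4), (2, "MAR2013", "fred", 5),
        (3, "APR2014", "fred", 1), (4, "JAN2013", "Tom", 6),
        (5, "MAR2014", "Tom", 1), (6, "APR2014", "Tom", 1)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_counts (id INTEGER PRIMARY KEY, "
             "month_txt TEXT, name TEXT, cnt INTEGER, month_key INTEGER)")
conn.executemany("INSERT INTO monthly_counts VALUES (?, ?, ?, ?, ?)",
                 [(i, m, n, c, month_key(m)) for i, m, n, c in data])

rows = conn.execute("""
    SELECT id, month_txt, name, cnt,
           FIRST_VALUE(month_txt) OVER (PARTITION BY name
                                        ORDER BY month_key) AS first_month,
           FIRST_VALUE(month_txt) OVER (PARTITION BY name
                                        ORDER BY month_key DESC) AS last_month,
           SUM(cnt) OVER (PARTITION BY name) AS total
    FROM monthly_counts
    ORDER BY id
""").fetchall()
for row in rows:
    print(row)
```

Every fred row carries JAN2013, APR2014 and a total of 10, matching the example in the question.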

SQL: How to get the AVG(MIN(number))?

I am looking for the AVERAGE (overall) of the MINIMUM number (grouped by person).
My table looks like this:
Rank Name
1 Amy
2 Amy
3 Amy
2 Bart
1 Charlie
2 David
5 David
1 Ed
2 Frank
4 Frank
5 Frank
I want to know the AVERAGE of the lowest scores. For these people, the lowest scores are:
Rank Name
1 Amy
2 Bart
1 Charlie
2 David
1 Ed
2 Frank
Giving me a final answer of 1.5 - because three people have a MIN(Rank) of 1 and the other three have a MIN(Rank) of 2. That's what I'm looking for - a single number.
My real data has a couple hundred rows, so it's not terribly big. But I can't figure out how to do this in a single, simple statement. Thank you for any help.
Try this:
;WITH MinScores
AS
(
SELECT
"Rank",
Name,
ROW_NUMBER() OVER(PARTITION BY Name ORDER BY "Rank") row_num
FROM Table1
)
SELECT
CAST(SUM("Rank") AS DECIMAL(10, 2)) /
COUNT("Rank")
FROM MinScores
WHERE row_num = 1;
Selecting the set of minimum values is straightforward. The cast() is necessary to avoid integer division later. You could also avoid integer division by casting to float instead of decimal. (But you should be aware that floats are "useful approximations".)
select name, cast(min(rank) as decimal) as min_rank
from Table1
group by name
Now you can use the minimums as a common table expression, and select from it.
with minimums as (
select name, cast(min(rank) as decimal) as min_rank
from Table1
group by name
)
select avg(min_rank) avg_min_rank
from minimums
If you happen to need to do the same thing on a platform that doesn't support common table expressions, you can a) create a view of minimums, and select from that view, or b) use the minimums as a derived table.
You might try using a derived table to get the minimums, then get the average minimum in the outer query, as in:
-- Get the avg min rank as a decimal
select avg(MinRank * 1.0) as AvgRank
from (
-- Get everyone's min rank
select min([Rank]) as MinRank
from MyTable
group by Name
) as a
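The derived-table approach is easy to verify against the sample data. The sketch below uses SQLite through Python's sqlite3 module instead of SQL Server; the table name `scores` is an assumption. SQLite's `AVG` always returns a floating-point value, so no cast is needed here, whereas the cast or `* 1.0` matters on engines that average integers as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE scores ("rank" INTEGER, name TEXT)')
conn.executemany("INSERT INTO scores VALUES (?, ?)", [
    (1, "Amy"), (2, "Amy"), (3, "Amy"), (2, "Bart"), (1, "Charlie"),
    (2, "David"), (5, "David"), (1, "Ed"),
    (2, "Frank"), (4, "Frank"), (5, "Frank")])

# Inner query: one minimum per person. Outer query: average of those minimums.
avg_min = conn.execute("""
    SELECT AVG(min_rank)
    FROM (SELECT MIN("rank") AS min_rank
          FROM scores
          GROUP BY name) AS minimums
""").fetchone()[0]
print(avg_min)  # 1.5
```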
I think the easiest one will be
for max
select name , max_rank = max(rank)
from table
group by name;
for average
select name , avg_rank = avg(rank)
from table
group by name;

Deleting all but top 10 rows

I have the following SQL statement which seems to be deleting every row in the selected table. What it should be doing is deleting all but the top 10 where the difficulty level equals one.
DELETE FROM Scores
WHERE Highscore_ID
NOT IN (SELECT TOP 10 highScore FROM Scores WHERE difficulty = 1 ORDER BY HighScore DESC)
Any suggestions as to why this would be deleting all rows? When running the subquery, it selects the proper rows, however, when deleting, it seems to want to delete every row.
You compare Highscore_ID with the column highScore. Do those columns really have the same values?
Then it should be
DELETE FROM Scores
WHERE Highscore_ID NOT IN
(SELECT TOP 10 HighScore_ID
FROM Scores
WHERE difficulty = 1
ORDER BY HighScore DESC);
EDIT:
Try this
DELETE s
FROM Scores AS s
JOIN (SELECT ROW_NUMBER() OVER (ORDER BY HighScore DESC) AS your_row, HighScore_ID
FROM Scores
WHERE difficulty = 1) AS score2 ON score2.HighScore_ID = s.HighScore_ID
WHERE s.difficulty = 1 AND score2.your_row > 10
Don't know if this is a typo or a real error but your select clause should refer to highscore_id, not highscore.
Try this:
with a as
(
select ROW_NUMBER() over(order by name) as ordinal, * from test
)
delete from a where a.ordinal > 10;
Related: http://www.ienablemuch.com/2012/03/possible-in-sql-server-deleting-any-row.html
Sample data:
CREATE TABLE [beatles]
([name] varchar(14));
INSERT INTO [beatles]
([name])
VALUES
('john'),
('paul'),
('george'),
('ringo'),
('pete'),
('brian'),
('george martin');
Query:
with a as
(
select *, row_number() over(order by name) ordinal
from beatles
)
delete from a
where ordinal > 4;
select * from beatles;
Before deleting:
NAME
brian
george
george martin
john
paul
pete
ringo
After deleting:
NAME
brian
george
george martin
john
Live test: http://www.sqlfiddle.com/#!3/0adcf/6
My first guess would be "SELECT TOP 10 highScore FROM Scores WHERE difficulty = 1 ORDER BY HighScore DESC" could be returning null.
My second guess would be "highscore_id" is different from "highscore", so there's no overlap (and NOT IN then matches, and deletes, every row).
Definitely double-check your subquery, and make sure it's returning the keys you expect!
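One way to double-check the subquery is to run the corrected delete against a small throwaway table. The sketch below does that in SQLite through Python's sqlite3 module, so it uses `LIMIT 10` where SQL Server uses `TOP 10`. The sample scores are invented, and the extra `difficulty = 1` filter on the outer delete is an assumption about the intent (rows for other difficulty levels are left alone).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (highscore_id INTEGER PRIMARY KEY, "
             "highscore INTEGER, difficulty INTEGER)")
# Fifteen scores (1..15) at difficulty 1, three more at difficulty 2.
sample = [(s, 1) for s in range(1, 16)] + [(100, 2), (200, 2), (300, 2)]
conn.executemany(
    "INSERT INTO scores (highscore, difficulty) VALUES (?, ?)", sample)

# The subquery returns the key column, the same column the outer WHERE compares.
conn.execute("""
    DELETE FROM scores
    WHERE difficulty = 1
      AND highscore_id NOT IN (SELECT highscore_id
                               FROM scores
                               WHERE difficulty = 1
                               ORDER BY highscore DESC
                               LIMIT 10)
""")

kept = [r[0] for r in conn.execute(
    "SELECT highscore FROM scores WHERE difficulty = 1 ORDER BY highscore")]
others = conn.execute(
    "SELECT COUNT(*) FROM scores WHERE difficulty = 2").fetchone()[0]
print(kept, others)
```

Only the ten highest difficulty-1 scores remain, and the difficulty-2 rows are untouched.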