How can I speed up my code for long tables (1M rows)?
I have a table named names. Its id column contains the values 1, 2, 5, 7.
ID | NAME
1 | Homer
2 | Bart
5 | March
7 | Lisa
I need to find the missing sequence numbers in the table, and my SQL query below does find them.
It is similar to the problem asked here, but my solution is different. I am expecting results like:
id
----
3
4
6
(3 rows)
So, my code (for PostgreSQL):
SELECT series AS id
FROM generate_series(1, (SELECT ID FROM names ORDER BY ID DESC LIMIT 1), 1)
series LEFT JOIN names ON series = names.id
WHERE id IS NULL;
Use max(id) to get the largest id:
Result here
SELECT series AS id
FROM generate_series(1, (select max(id) from names), 1)
series LEFT JOIN names ON series = names.id
WHERE id IS NULL;
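The same anti-join idea can be tried out without a PostgreSQL server. The sketch below is only an illustration: it uses Python's sqlite3 module as a stand-in, replaces generate_series with a recursive CTE (an assumption, since generate_series is not available in a default SQLite build), and uses NOT EXISTS in place of LEFT JOIN ... IS NULL. The helper name missing_ids is hypothetical.

```python
import sqlite3

def missing_ids(conn):
    """Return the ids between 1 and max(id) that are absent from names."""
    sql = """
        WITH RECURSIVE series(n) AS (
            -- start at 1 only if the table has rows at all
            SELECT 1 WHERE EXISTS (SELECT 1 FROM names)
            UNION ALL
            SELECT n + 1 FROM series
            WHERE n < (SELECT max(id) FROM names)
        )
        SELECT n
        FROM series
        WHERE NOT EXISTS (SELECT 1 FROM names WHERE names.id = series.n)
        ORDER BY n
    """
    return [row[0] for row in conn.execute(sql)]

# Sample data from the question
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [(1, "Homer"), (2, "Bart"), (5, "March"), (7, "Lisa")],
)
print(missing_ids(conn))  # [3, 4, 6]
```

Because id is the primary key here, both max(id) and the per-number lookup can use the index, which is the part that matters on long tables.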
Okay, so this table will work as an example of what I am working with. This table consists of the name of someone and the order they are in compared to others:
NAME | ORDER
ZAC | 1
JEFF | 2
BART | 3
KATE | 4
My goal is to take the numbers in ORDER and reposition them randomly and update that into the table, keeping the NAME records in the same position that they were in originally.
Example of the desired result:
NAME | ORDER
ZAC | 3
JEFF | 1
BART | 4
KATE | 2
Using the table above, I have tried the following solutions:
#1
Update TEST_TABLE
Set ORDER = dbms_random.value(1,4);
This produced random numbers between 1 and 4 inclusive, but the numbers could repeat, so ORDER could contain the same number several times.
Example of the attempted solution:
NAME | ORDER
ZAC | 3
JEFF | 1
BART | 3
KATE | 2
#2
Update TEST_TABLE
Set ORDER = (Select dbms_random.value(1,4) From dual);
This resulted in the same random number being copied into each ORDER record, so if the number came out at 3, then it would change them all to 3.
Example of the attempted solution:
NAME | ORDER
ZAC | 3
JEFF | 3
BART | 3
KATE | 3
This is my first time posting to StackOverflow, and I am relatively new to Oracle, so hopefully I proposed this question properly.
How about this?
Sample data:
SQL> select * from test order by rowid;
NAME C_ORDER
---- ----------
Zac 1
Jeff 2
Bart 3
Kate 4
The table is updated with the value returned by the row_number analytic function, which numbers the rows in random order; rows are matched by their rowid:
SQL> merge into test a
     using (with counter (cnt) as
              (select count(*) from test)
            select t.rowid rid,
                   row_number() over (order by dbms_random.value(1, c.cnt)) rn
            from counter c cross join test t
           ) b
     on (a.rowid = b.rid)
     when matched then update set
       a.c_order = b.rn;
4 rows merged.
Result:
SQL> select * from test order by rowid;
NAME C_ORDER
---- ----------
Zac 3
Jeff 4
Bart 1
Kate 2
SQL>
How about this?
MERGE INTO test d USING
(SELECT rownum AS new_order,
name
FROM (SELECT *
FROM test
ORDER BY dbms_random.value)) s
ON (d.name = s.name)
WHEN matched THEN
UPDATE
SET d.sort_order = s.new_order;
The new order is built by simply sorting the original data by random values and using rownum to number those randomly ordered records from 1 to N.
I use NAME to match the records, but you should use the primary key or rowid, as in Littlefoot's answer.
Failing that, use at least an indexed column that uniquely identifies a row (for speed, when the table contains a lot of data).
The simplest approach is to sort the data randomly and join on the "name" column:
merge into data dst
using (
select rownum as rn, name from (
select name from data order by dbms_random.value()
)
) src
on (src.name = dst.name)
when matched then
update set ord = src.rn
;
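The idea shared by these answers (produce the numbers 1 to N in a random order and write them back through a unique row key) can also be sketched outside Oracle. The example below is an illustration under assumptions: it uses Python's sqlite3 module instead of Oracle, does the shuffle on the client side with random.shuffle rather than with dbms_random, and borrows the test / c_order names from the first answer. The helper name shuffle_order is hypothetical.

```python
import random
import sqlite3

def shuffle_order(conn, rng=random):
    """Assign 1..N to c_order in random order, matching rows by rowid."""
    rowids = [r[0] for r in conn.execute("SELECT rowid FROM test ORDER BY rowid")]
    new_orders = list(range(1, len(rowids) + 1))
    rng.shuffle(new_orders)  # a permutation, so no value can repeat
    with conn:  # commit all updates together
        conn.executemany(
            "UPDATE test SET c_order = ? WHERE rowid = ?",
            zip(new_orders, rowids),
        )

# Sample data from the question
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (name TEXT, c_order INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [("Zac", 1), ("Jeff", 2), ("Bart", 3), ("Kate", 4)],
)
shuffle_order(conn)
for row in conn.execute("SELECT name, c_order FROM test ORDER BY rowid"):
    print(row)
```

Shuffling a list of 1..N is what guarantees uniqueness; drawing an independent random number per row, as in attempt #1, does not.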
I have a table (in SQL Server) that stores records as shown below. The purpose of Old_Id is change tracking.
When I want to update a record, the original record has to stay unchanged; instead, a new record is inserted with a new Id, the updated values, and the modified record's Id in the Old_Id column.
Id Name Old_Id
---------------------
1 Paul null
2 Paul 1
3 Jim null
4 Paul 2
5 Tim null
My question is:
When I search for id = 1 or 2 or 4, I want to select all related records.
In this case I want to see the records with the following ids: 1, 2, 4
How can this be written in a stored procedure?
Even if this design is bad practice, I can't change the logic because it's a legacy database, and quite a large one.
Can anyone help with this?
You can do that with a recursive common table expression (CTE):
WITH cte_history AS (
SELECT
h.id,
h.name,
h.old_id
FROM
history h
WHERE old_id IS NULL
and id in (1,2,4)
UNION ALL
SELECT
e.id,
e.name,
e.old_id
FROM
history e
INNER JOIN cte_history o
ON o.id = e.old_id
)
SELECT * FROM cte_history;
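Note that the anchor above only finds a chain when the root id (the one with old_id IS NULL) is among the searched ids. If only a single id from the middle of a chain is known, one possible variant first walks up to the root and then back down. The sketch below illustrates this with Python's sqlite3 module as a stand-in for SQL Server (an assumption: in T-SQL the RECURSIVE keyword is omitted and :search_id would be a procedure parameter such as @id). The helper name related_records is hypothetical.

```python
import sqlite3

def related_records(conn, search_id):
    """Return every version in the chain that contains search_id."""
    sql = """
        WITH RECURSIVE
        up(id, old_id) AS (            -- walk from the searched id to the root
            SELECT id, old_id FROM history WHERE id = :search_id
            UNION ALL
            SELECT h.id, h.old_id
            FROM history h JOIN up ON h.id = up.old_id
        ),
        chain(id, name, old_id) AS (   -- walk from the root down to all versions
            SELECT h.id, h.name, h.old_id
            FROM history h
            WHERE h.old_id IS NULL AND h.id IN (SELECT id FROM up)
            UNION ALL
            SELECT h.id, h.name, h.old_id
            FROM history h JOIN chain c ON h.old_id = c.id
        )
        SELECT id, name, old_id FROM chain ORDER BY id
    """
    return conn.execute(sql, {"search_id": search_id}).fetchall()

# Sample data from the question
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (id INTEGER PRIMARY KEY, name TEXT, old_id INTEGER)")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [(1, "Paul", None), (2, "Paul", 1), (3, "Jim", None), (4, "Paul", 2), (5, "Tim", None)],
)
print(related_records(conn, 2))  # [(1, 'Paul', None), (2, 'Paul', 1), (4, 'Paul', 2)]
```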
I want to expand each row in TableA into 4 rows. The result holds all the columns from TableA plus two additional columns:
SetID: ranges from 0 to 3 and is unique within the rows generated from one TableA row.
Random: a random permutation of the SetID values within the same grouping.
I use SQLite and would prefer a pure SQL solution.
Table A:
Description
-----------
A
B
Desired output:
Description | SetID | Random
------------|-------|-------
A | 0 | 2
A | 1 | 0
A | 2 | 3
A | 3 | 1
B | 0 | 3
B | 1 | 2
B | 2 | 0
B | 3 | 1
My attempt so far creates 4 rows for each row in TableA but doesn't get the permutation right: the wrong column just contains a random number from 0 to 3, which can repeat. I need exactly one each of 0, 1, 2 and 3 for each unique value in Description, in random order.
SELECT
Description,
SetID,
abs(random()) % 4 AS wrong
FROM
TableA
LEFT JOIN
TableB
ON
1 = 1
Table B:
SetID
-----
0
1
2
3
Use a cross join
SELECT Description,
SetID,
abs(random()) % 4 AS wrong
FROM TableA
CROSS JOIN TableB
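The cross join alone still leaves the column holding independent random draws. One way to turn them into a permutation, sketched here under assumptions, is to store one random key per generated row and then rank each row by its key within its Description group. The SQL is plain SQLite; Python's sqlite3 module is used only to run it. The temporary table keyed, its column k, and the helper name are hypothetical.

```python
import sqlite3

def expand_with_permutation(conn):
    """Return (Description, SetID, Random) with Random a permutation of 0..3."""
    conn.execute("DROP TABLE IF EXISTS temp.keyed")
    # Materialize one random key per row so it is not re-evaluated later
    conn.execute("""
        CREATE TEMP TABLE keyed AS
        SELECT a.Description AS Description, b.SetID AS SetID, random() AS k
        FROM TableA a CROSS JOIN TableB b
    """)
    # Rank = number of rows in the same group with a smaller key
    # (SetID breaks the extremely unlikely tie between two keys)
    return conn.execute("""
        SELECT x.Description, x.SetID,
               (SELECT count(*) FROM keyed y
                WHERE y.Description = x.Description
                  AND (y.k < x.k OR (y.k = x.k AND y.SetID < x.SetID))) AS Random
        FROM keyed x
        ORDER BY x.Description, x.SetID
    """).fetchall()

# Sample data from the question
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (Description TEXT)")
conn.execute("CREATE TABLE TableB (SetID INTEGER)")
conn.executemany("INSERT INTO TableA VALUES (?)", [("A",), ("B",)])
conn.executemany("INSERT INTO TableB VALUES (?)", [(0,), (1,), (2,), (3,)])
for row in expand_with_permutation(conn):
    print(row)
```

On SQLite builds with window functions, row_number() OVER (PARTITION BY Description ORDER BY random()) - 1 should give the same effect in a single statement.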
Consider a solution in your specialty, R. As you know, R maintains excellent database packages, one of which is RSQLite. Additionally, R can run commands via the connection without the need to import very large datasets.
Your solution is essentially a random sampling without replacement. Simply have R run the sampling and concatenate list items into an SQL string.
The code below creates a table in the SQLite database: R sends the CREATE TABLE command to the SQL engine, with no import or export of data. Should you need to run this for every four rows, run an iterative loop in a defined function that outputs the SQL string. For append queries, change the CREATE TABLE AS to an INSERT INTO ... SELECT statement.
library(RSQLite)
sqlite <- dbDriver("SQLite")
conn <- dbConnect(sqlite,"C:\\Path\\To\\Database\\File\\newexample.db")
# SAMPLE WITHOUT REPLACEMENT
randomnums <- as.list(sample(0:3, 4, replace=F))
# SQL CONCATENATION
sql <- sprintf("CREATE TABLE PermutationsTable AS
SELECT a.Description, b.SetID,
(select %d from TableB WHERE TableB.SetID = b.SetID AND TableB.SetID=0
union select %d from TableB WHERE TableB.SetID = b.SetID AND TableB.SetID=1
union select %d from TableB WHERE TableB.SetID = b.SetID AND TableB.SetID=2
union select %d from TableB WHERE TableB.SetID = b.SetID AND TableB.SetID=3)
As RandomNumber
from TableA a, TableB b;",
randomnums[[1]], randomnums[[2]],
randomnums[[3]], randomnums[[4]])
# RUN QUERY
dbSendQuery(conn, sql)
dbDisconnect(conn)
You will notice a nested union subquery. This is used to achieve the inline random numbers for each row. Also, to return all possible combinations from all tables, no join statements are needed, simply list tables in FROM clause.
How can I transform this table from this
id name
1 sam
2 nick
3 ali
4 farah
5 josef
6 fadi
to
id1 name1 id2 name2 id3 name3 id4 name4
1 sam 2 nick 3 ali 4 farah
5 josef 6 fadi
The reason I need this is that I have a database and need to do a mail merge using Word, and I want to print every 4 rows on one page. MS Word can only print one row per page, so I want an SQL query in which one row represents 4 rows.
Thanks in advance
Ali
You don't need to create a query for this in Access. Word has a merge field called <<Next Record>> which forces moving to the next record. If you look at how label documents are created using the Mail Merge Wizard, you'll see that's how it's done.
Updated - Doing this in SQL
The columns in simple SELECT statements are derived from the columns from the underlying table/query (or from expressions). If you want to define columns based on the data, you need to use a crosstab query.
First create a query with a running count for each person (say your table is called People), and calculate the row and column position from the running count:
SELECT People.id, Count(*)-1 AS RunningCount, int(RunningCount/4) AS RowNumber, RunningCount Mod 4 AS ColumnNumber
FROM People
LEFT JOIN People AS People_1 ON People.id >= People_1.id
GROUP BY People.id;
(You won't be able to view this in the Query Designer, because the JOIN isn't comparing with = but with >=.)
This query returns the following results:
id RunningCount RowNumber ColumnNumber
1 0 0 0
2 1 0 1
3 2 0 2
4 3 0 3
5 4 1 0
6 5 1 1
Assuming this query is saved as Positions, the following query will return the results:
TRANSFORM First(Item) AS FirstOfItem
SELECT RowNumber
FROM (
SELECT ID AS Item, RowNumber, "id" &( ColumnNumber + 1) AS ColumnHeading
FROM Positions
UNION ALL SELECT Name, RowNumber, "name" & (ColumnNumber +1)
FROM Positions
INNER JOIN People ON Positions.id = People.id
) AS AllValues
GROUP BY AllValues.RowNumber
PIVOT AllValues.ColumnHeading In ("id1","name1","id2","name2","id3","name3","id4","name4");
The UNION is there so each record in the People table will have two columns - one with the id, and one with the name.
The PIVOT clause forces the columns to the specified order, and not in alphabetical order (e.g. id1, id2 ... name1, name2...)
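The position arithmetic behind the two queries (a zero-based running count, integer division by 4 for the row, remainder for the column) can be checked outside Access. The following is a small Python sketch of that arithmetic only, not of the Access queries themselves; the function name to_wide_rows is hypothetical.

```python
def to_wide_rows(people, per_row=4):
    """Group (id, name) pairs into rows of per_row people, keyed id1, name1, ..."""
    rows = {}
    for running_count, (pid, name) in enumerate(sorted(people)):
        # Same calculation as Int(RunningCount/4) and RunningCount Mod 4
        row_number, column_number = divmod(running_count, per_row)
        row = rows.setdefault(row_number, {})
        row[f"id{column_number + 1}"] = pid
        row[f"name{column_number + 1}"] = name
    return [rows[r] for r in sorted(rows)]

people = [(1, "sam"), (2, "nick"), (3, "ali"), (4, "farah"), (5, "josef"), (6, "fadi")]
for row in to_wide_rows(people):
    print(row)
```

The second row contains only id1/name1 and id2/name2, matching the partially filled last row in the desired output.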
Let's say I have this table:
Fld | Number
1 | 5
2 | 2
And I want to make a select that retrieves as many Fld as the Number field has:
|Fld |
1
1
1
1
1
2
2
How can I achieve this? I was thinking about making a temporary table and inserting data based on the Number, but I was wondering if this could be done with a single SELECT statement.
PS: I'm new to SQL
You can join with a numbers table:
SELECT Fld
FROM yourtable
JOIN Numbers
ON Numbers.Number <= yourtable.Number
A numbers table is just a table with a list of numbers:
Number
1
2
3
etc...
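A quick way to check the numbers-table join is to run it against the sample data. The sketch below uses Python's sqlite3 module purely as a convenient test bed (the question does not name its RDBMS, so this is an assumption). The join keeps every number from 1 up to each row's Number, so a row with Number = 5 is returned five times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (Fld INTEGER, Number INTEGER)")
conn.execute("CREATE TABLE Numbers (Number INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO yourtable VALUES (?, ?)", [(1, 5), (2, 2)])
# The numbers table must reach at least the largest Number in yourtable
conn.executemany("INSERT INTO Numbers VALUES (?)", [(n,) for n in range(1, 101)])

repeated = [row[0] for row in conn.execute("""
    SELECT Fld
    FROM yourtable
    JOIN Numbers ON Numbers.Number <= yourtable.Number
    ORDER BY Fld
""")]
print(repeated)  # [1, 1, 1, 1, 1, 2, 2]
```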
Not a great solution, since you still query your table twice, but maybe you can work from it:
SELECT t1.fld, t1.number
FROM table t1, (
SELECT ROWNUM number FROM dual
CONNECT BY LEVEL <= (SELECT MAX(number) FROM table)) t2
WHERE t2.number<=t1.number
It generates the maximum number of rows needed and then filters them against each row's number.
I don't know if your RDBMS version supports it (although I rather suspect it does), but here is a recursive version:
WITH remaining (fld, times) as (SELECT fld, 1
FROM <table>
UNION ALL
SELECT a.fld, a.times + 1
FROM remaining as a
JOIN <table> as b
ON b.fld = a.fld
AND b.number > a.times)
SELECT fld, times
FROM remaining
ORDER BY fld, times
Given your source data table, it outputs this (count included for verification):
fld times
=============
1 1
1 2
1 3
1 4
1 5
2 1
2 2