I have this local db that I'm playing with. It pulls a list of users, does something with each, and then deletes the records. The delete is VERY slow:
db.all("select id, username from users", (err, rows) => {
  rows.forEach((row) => {
    // do stuff with row
    db.run("delete from users where id = ?", row.id, (err) => {
      if (err) {
        throw err;
      }
    });
  });
});
It is a simple db: CREATE TABLE IF NOT EXISTS users(id INTEGER PRIMARY KEY, username text NOT NULL)
Deleting a record can take as much as 20 seconds on a list of 100k records. What am I doing wrong here and how can I speed this up?
db.all will fetch all the rows at once. This is slow, consumes a lot of memory, and all rows must be fetched before any processing starts.
Instead, use db.each. It fetches a row and acts on it immediately.
There's also no need to use where in (?). For a single value, use where id = ?. This may or may not affect performance.
db.each("select id, username from users", (err, row) => {
  // do stuff with row
  db.run("delete from users where id = ?", row.id, (err) => {
    if (err) {
      throw err;
    }
  });
});
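If you also need to know when the scan has finished, node-sqlite3's db.each takes an optional completion callback as its last argument; a minimal sketch:

db.each(
  "select id, username from users",
  (err, row) => {
    // per-row work, e.g. the delete above
  },
  (err, count) => {
    // called once, after the last row has been retrieved
    console.log("scanned " + count + " rows");
  }
);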
I'm trying to make a freelancing website that caches gigs in Redis. To categorise them there are two fields, "categoryId" and "skillId", and I want to keep the gigs sorted by the "createdAt" field, which is a date. So I have two options, and I have some blank spots about the first one.
Option 1
I'm holding my gigs in a sorted set and building the key from two parameters, categoryId and skillId. The problem is that a user may want to select gigs for a specific category with any skill, but may also want to select gigs by both categoryId and skillId. For that reason I used a key like
`gigs:${categoryId}:${skillId != null ? skillId : "*"}`
Here's my full code:
export const addGigToSortedSet = async (value) => {
return new Promise<string>((resolve, reject) => {
let date =
value.gigCreatedAt != null && value.createdAt != undefined
? Math.trunc(Date.parse(<string>value.createdAt) / 1000)
: Date.now();
redisClient
.zAdd(`gigs:${value.gigCategory}:${value.gigSkill}`, {
score: date,
value: JSON.stringify(value),
})
.then((res) => {
if (res == 1) {
resolve("Başarılı");
} else {
reject("Hata");
return;
}
});
});
};
export const multiAddGigsToSortedSet = async (gigs: any[]) => {
return new Promise((resolve, reject) => {
let multiClient = redisClient.multi();
for (const gig of gigs) {
let date =
gig.gigCreatedAt != null && gig.createdAt != undefined
? Math.trunc(Date.parse(<string>gig.createdAt) / 1000)
: Date.now();
multiClient.zAdd(`gigs:${gig.gigCategory}:${gig.gigSkill}`, {
score: date,
value: JSON.stringify(gig),
});
}
multiClient.exec().then((replies) => {
if (replies.length > 0) {
resolve(replies);
} else {
reject("Hata");
return;
}
});
});
};
export const getGigsFromSortedSet = async (
categoryId: string,
page: number,
limit: number,
skillId?: string
) => {
return new Promise<string[]>((resolve, reject) => {
redisClient
.zRange(
`gigs:${categoryId}:${skillId != null ? skillId : "*"}`,
(page - 1) * limit,
page * limit
)
.then((res) => {
if (res) {
resolve(res.reverse());
} else {
reject("Hata");
return;
}
});
});
};
Option 2
Option two is much simpler, but less effective in terms of storage usage.
I'll create two sorted sets, one for category and one for skill, and then use ZINTERSTORE to get my values; and I can easily get gigs by category alone, since I have a separate set for it.
So my question is: which is the more effective solution, and will this line give me gigs for a given category when no skill parameter is supplied?
`gigs:${categoryId}:${skillId != null ? skillId : "*"}`
Your approach #2 is the most common implementation. See https://redis.io/docs/reference/patterns/indexes/
But...
Indexes created with sorted sets are able to index only a single
numerical value. Because of this you may think it is impossible to
index something which has multiple dimensions using this kind of
indexes, but actually this is not always true. If you can efficiently
represent something multi-dimensional in a linear way, then it is
often possible to use a simple sorted set for indexing.
For example the Redis geo indexing API uses a sorted set to index
places by latitude and longitude using a technique called Geo hash.
The sorted set score represents alternating bits of longitude and
latitude.
Therefore, if you can find an encoding scheme that combines your "categoryId" and "skillId" into a single value, you could use a single sorted set.
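For reference, a minimal sketch of approach #2 with node-redis v4 (the gigs:category:* and gigs:skill:* key names are assumptions, not taken from your code):

// Index each gig in two sorted sets, one per dimension, both scored by createdAt:
await redisClient.zAdd(`gigs:category:${gig.gigCategory}`, { score: date, value: JSON.stringify(gig) });
await redisClient.zAdd(`gigs:skill:${gig.gigSkill}`, { score: date, value: JSON.stringify(gig) });

// To query by both, intersect into a temporary key and page through it.
// The default SUM aggregation doubles each score, which keeps the createdAt ordering intact.
const dest = `tmp:gigs:${categoryId}:${skillId}`;
await redisClient.zInterStore(dest, [`gigs:category:${categoryId}`, `gigs:skill:${skillId}`]);
const res = await redisClient.zRange(dest, (page - 1) * limit, page * limit);
await redisClient.expire(dest, 60); // keep the intersection around briefly as a cache
return res.reverse(); // newest first, mirroring getGigsFromSortedSet above

// To query by category only, read gigs:category:${categoryId} directly with zRange.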
I'm new to using Express.js. I'm working on my first endpoint which is to create a user. To accomplish this, it first has to be checked whether the username or email address already exists. After some research on how to do this, here's the code I've come up with:
// Check whether the user_name or email already exists
pool.query('SELECT CASE WHEN EXISTS (SELECT * FROM users WHERE user_name = $1 OR email = $2) THEN CAST(1 AS BIT) ELSE CAST(0 AS BIT) END', [values.user_name, values.email], (error, results) => {
if (error) {
throw error
}
if (results.rows[0].case === '1') {
console.log('User exists so send message to this effect back to client');
} else {
console.log('User does not exist so we can call INSERT query');
pool.query('INSERT INTO users (first_name, last_name, user_name, email, password, last_password, password_salt) VALUES ($1, $2, $3, $4, $5, $6, $7) RETURNING *', [values.first_name, values.last_name, values.user_name, values.email, values.password, values.password, 'tmp_salt'], (error, results) => {
if (error) {
console.error('Something went wrong');
throw error;
}
console.log('Results from INSERT:', results);
res.status(201).send(`User added with user_id: ${results.rows[0].user_id}`);
});
}
});
It's obviously not finished yet, but I'm curious whether this nested approach is the best way to do it. In other words, first I'm checking for the existence of user_name and/or email, and only if neither exists am I performing the INSERT.
What do you think?
There are really two different questions there:
Is checking then inserting the right approach?
Is nesting the right approach?
Is checking then inserting the right approach?
Not usually, no, it leaves you open to a race condition:
Client A sends joe@example.com / funkypassword
Client B sends joe@example.com / somethingelse
The main thread picks up the task for Client A's request and starts the asynchronous check to see if the user exists
While waiting for that asynchronous result, the main thread picks up Client B's request and starts the asynchronous check to see if the user exists
Both checks come back clear (no user exists)
The main thread inserts one of those sets of details (Client A's or Client B's)
The main thread tries to insert the other one of those sets of details (Client B's or Client A's)
At this point, if the database is set up correctly, you'll get an insertion error (primary or unique key violation).
Instead, you ensure the DB is set up that way and expect to get an error if the user already exists, and don't do the check at all. Just do the insert, and look at the error if you get one to see if it's a primary or unique key constraint violation.
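For example, with node-postgres a unique-key violation surfaces as error.code === '23505' (this sketch assumes UNIQUE constraints on user_name and email; the column list is abbreviated):

pool.query(
  'INSERT INTO users (user_name, email) VALUES ($1, $2) RETURNING user_id',
  [values.user_name, values.email],
  (error, results) => {
    if (error) {
      if (error.code === '23505') { // unique_violation
        return res.status(409).send('That user_name or email is already in use');
      }
      throw error; // some other failure
    }
    res.status(201).send(`User added with user_id: ${results.rows[0].user_id}`);
  }
);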
Is nesting the right approach?
This particular task may not need multiple DB operations (see above), but many others will. So is nesting the right way to handle that?
It's certainly been the main way to handle it for a long time, but it has the issue of quickly becoming "callback hell" where you have lots of nested operations, each in its own callback to a previous operation, etc., etc. It can get very hard to manage. But you can do it that way if you like, and many have done for some time.
The more modern alternative is to use promises and async/await. In an async function, you can make the function wait for an asynchronous process to complete before continuing with its logic. That way, your logic isn't buried in callbacks.
Suppose you have to do two or three database operations, where whether it's two or three depends on the first operation's result, and where information from earlier calls is needed by later ones. With nesting, you might do something like this:
pool.query("DO THE FIRST THING", (error1, results1) => {
if (error1) {
/*...handle/report error...*/
return;
}
const thirdThing = (results2) => {
pool.query("DO THE THIRD THING with results1 and results2 (which may be null)", (error3, results3) => {
if (error3) {
/*...handle/report error...*/
return;
}
/*...finish your work using `results1`, `results2`, and `results3`...*/
});
};
if (/*results1 says we should do the second thing*/) {
pool.query("DO THE SECOND THING with results1", (error2, results2) => {
if (error2) {
/*...handle/report error...*/
return;
}
thirdThing(results2);
});
} else {
thirdThing(null);
}
});
You might isolate the three operations as functions, but while that's good for reuse if it's relevant and possibly for debugging, it doesn't help much with the callback hell:
function firstThing(callback) {
pool.query("DO THE FIRST THING", callback);
}
function secondThing(paramA, callback) {
pool.query("DO THE SECOND THING with paramA", callback);
}
function thirdThing(paramA, paramB, callback) {
pool.query("DO THE THIRD THING with paramA and paramB (which may be null)", callback);
}
// ...and then where your code was:
const done = (error, results1, results2, results3) => {
if (error) {
/*...handle/report error...*/
} else {
/*...do your final work here using `results1`, `results2` (which may be `null`), and `results3`...*/
}
};
firstThing((error1, results1) => {
if (error1) {
done(error1);
} else if (/*results1 says we should do the second thing*/) {
secondThing(results1, (error2, results2) => {
if (error2) {
done(error2);
} else {
thirdThing(results1, results2, (error3, results3) => {
done(error3, results1, results2, results3);
});
}
});
} else {
thirdThing(results1, null, (error3, results3) => {
done(error3, results1, null, results3);
});
}
});
But suppose we had a poolQuery function that put a promise wrapper around pool.query (a sketch of one appears at the end of this answer). Here's how that could look:
async function firstThing() {
return await poolQuery("DO THE FIRST THING");
}
async function secondThing(paramA) {
return await poolQuery("DO THE SECOND THING with paramA");
}
async function thirdThing(paramA, paramB) {
return await poolQuery("DO THE THIRD THING with paramA and paramB (which may be null)");
}
// ...and then where your code was, making it an `async` function:
try {
const results1 = await firstThing();
const results2 = (/*results1 says we should do the second thing*/)
? await secondThing(results1)
: null;
const results3 = await thirdThing(results1, results2);
/*...do your final work here using `results1`, `results2` (which may be `null`), and `results3`...*/
} catch (error) {
/*...handle/report error...*/
}
Or if you aren't going to reuse those queries, then:
try {
const results1 = await poolQuery("DO THE FIRST THING");
const results2 = (/*results1 says we should do the second thing*/)
? await poolQuery("DO THE SECOND THING with results1")
: null;
const results3 = await poolQuery("DO THE THIRD THING with results1 and results2 (which may be null)");
/*...do your final work here using `results1`, `results2` (which may be `null`), and `results3`...*/
} catch (error) {
/*...handle/report error...*/
}
How simple and clear is that? :-) It gets even more powerful when you have loops and such involved.
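For completeness, here's one shape that poolQuery wrapper could take; with node-postgres, pool.query already returns a promise when called without a callback, so the wrapper can simply forward to it:

function poolQuery(text, params) {
  // Resolves with the results object, rejects with the error.
  return pool.query(text, params);
}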
I am working on a feature in an app where users can log journal entries, and I need to implement lazy loading. However, I need to load the entries in reverse, since I need to display the most recent entries first. I have not been able to do this, even with the COUNT(*) aggregate function, because I do not want to use GROUP BY. Here is my code:
export const lazyLoadEntriesByGoalId = (goalId, amountLastLoaded) => {
//load journal entries
const transactionPromise = new Promise((resolve, reject) => {
database.transaction((tx) => {
tx.executeSql(
`SELECT * FROM journal
WHERE goalId = ?
AND id < ?
ORDER BY id DESC
LIMIT 5;`,
[goalId, amountLastLoaded],
(_, result) => {
resolve(result.rows._array);
},
(_, err) => {
reject(err);
}
);
});
});
return transactionPromise;
};
The amountLastLoaded is for seeing how many entries are already loaded. I am considering using "COUNT(*) - ?", but expo-sqlite throws an error if I do that. What can I do?
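Since amountLastLoaded is a count of rows already shown rather than an id, one option is to express that directly with LIMIT/OFFSET instead of the id < ? comparison; a sketch of just the query, everything else unchanged:

tx.executeSql(
  `SELECT * FROM journal
   WHERE goalId = ?
   ORDER BY id DESC
   LIMIT 5 OFFSET ?;`,
  [goalId, amountLastLoaded], // skip the rows that are already loaded
  (_, result) => resolve(result.rows._array),
  (_, err) => reject(err)
);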
I am new to mobile development and I would like to perform simple queries like deleting a specific row from my SQLite table. But it doesn't work: the row still exists in my database table.
This is my code:
export default class App extends Component{
constructor(props){
super(props);
db = SQlite.openDatabase(
{
name: 'gad.db',
createFromLocation: 1,
},
this.successToOpenDB,
this.failToOpenDB,
);
}
successToOpenDB()
{
db.transaction(tx =>
{
tx.executeSql("DELETE FROM songs WHERE content='content2' ", [] ,(tx, results) =>
{
console.log('DELETION OK');
},
(tx, error) =>
{
console.log("DELETION KO");
});
});
}
failToOpenDB(err){
console.log(err);
alert("not connected to database");
}
Can anyone please help? Thanks in advance.
There must be some issue with the value you use as the delete condition, whether that's the primary key or, here, "content".
First, try to retrieve the primary key of the row you want to delete, and store it in some state, maybe:
var deleted = results.rows.item(results.rows.length - 1).ID;
My primary key is ID, so I am now able to get the required row.
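Putting that together, a sketch of the select-then-delete flow (table and column names taken from the question; ID is the assumed primary key column):

db.transaction(tx => {
  tx.executeSql("SELECT * FROM songs WHERE content = ?", ['content2'], (tx, results) => {
    if (results.rows.length > 0) {
      var id = results.rows.item(results.rows.length - 1).ID; // primary key of the matched row
      tx.executeSql("DELETE FROM songs WHERE ID = ?", [id], (tx, delResults) => {
        console.log('rows affected: ' + delResults.rowsAffected);
      });
    }
  });
});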
Also, after the delete query, you can check in the success callback whether any rows were affected, as follows:
(tx, results) => {
  if (results.rowsAffected > 0) {
    // then proceed ahead
  }
}
In my program I insert some data into a table and get back its id, and I need to enter that id into another table together with a unique, randomly generated string. But in case the insertion fails because the random string already exists, how can I repeat the insertion until it succeeds?
I'm using pg-promise to talk to PostgreSQL. I can run a program like this that inserts the data into both tables, provided the random string doesn't already exist:
db.none(
`
WITH insert_post AS
(
INSERT INTO table_one(text) VALUES('abcd123')
RETURNING id
)
INSERT INTO table_two(id, randstr)
VALUES((SELECT id FROM insert_post), '${randStrFn()}')
`
)
.then(() => console.log("Success"))
.catch(err => console.log(err));
I'm unsure if there is any easy SQL/JS/pg-promise based solution that I could make use of.
I would encourage the author of the question to seek a pure-SQL solution to his problem, as in terms of performance it would be significantly more efficient than anything else.
But since the question was about how to re-run queries with pg-promise, I will provide an example, in addition to the one already published, except without acquiring and releasing the connection for every attempt, and with proper data integrity.
db.tx(t => {
// BEGIN;
return t.one('INSERT INTO table_one(text) VALUES($1) RETURNING id', 'abcd123', a => +a.id)
.then(id => {
var f = attempts => t.none('INSERT INTO table_two(id, randstr) VALUES($1, $2)', [id, randStrFn()])
.catch(error => {
if (--attempts) {
return f(attempts); // try again
}
throw error; // give up
});
return f(3); // try up to 3 times
});
})
.then(data => {
// COMMIT;
// success, data = null
})
.catch(error => {
// ROLLBACK;
});
Since you are trying to re-run a dependent query, you should not let the first query remain successful if all your attempts at the second query fail; you should roll all the changes back, i.e. use a transaction - method tx, as shown in the code.
This is why we split your WITH query into two inserts inside the transaction, to ensure such integrity.
UPDATE
Below is a better version of it, though. Because errors inside the transaction need to be isolated in order to avoid breaking the transaction stack, each attempt should be inside its own SAVEPOINT, which means using another transaction level:
db.tx(t => {
// BEGIN;
return t.one('INSERT INTO table_one(text) VALUES($1) RETURNING id', 'abcd123', a => +a.id)
.then(id => {
var f = attempts => t.tx(sp => {
// SAVEPOINT level_1;
return sp.none('INSERT INTO table_two(id, randstr) VALUES($1, $2)', [id, randStrFn()]);
})
.catch(error => {
// ROLLBACK TO SAVEPOINT level_1;
if (--attempts) {
return f(attempts); // try again
}
throw error; // give up
});
return f(3); // try up to 3 times
});
})
.then(data => {
// 1) RELEASE SAVEPOINT level_1;
// 2) COMMIT;
})
.catch(error => {
// ROLLBACK;
});
I would also suggest using pg-monitor, so you can see and understand what is happening underneath, and which queries are in fact being executed.
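Attaching pg-monitor is done against the same initialization options object that pg-promise was initialized with:

const initOptions = {/* your pg-promise initialization options */};
const pgp = require('pg-promise')(initOptions);
const monitor = require('pg-monitor');
monitor.attach(initOptions); // every query, transaction and error is now logged to the console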
P.S. I'm the author of pg-promise.
The easiest way is to put it into a function and then re-call that in the catch:
const insertPost = (post, numRetries) => {
return db.none(
`
WITH insert_post AS
(
INSERT INTO table_one(text) VALUES('abcd123')
RETURNING id
)
INSERT INTO table_two(id, randstr)
VALUES((SELECT id FROM insert_post), '${randStrFn()}')
`
)
.then(() => console.log("Success"))
.catch(err => {
console.log(err)
if (numRetries < 3) {
return insertPost(post, numRetries + 1);
}
throw err;
});
};
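Called, for example, as:

insertPost(post, 0)
  .then(() => console.log('inserted'))
  .catch(err => console.error('all retries failed', err));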