What is the difference between Process ID (PID) and Execution ID for a process?

In a Process Control Block (PCB) there is accounting information, which stores data such as the amount of CPU time used by the process, time limits, the execution ID, etc.
There is also Process ID information in the PCB.
So what is the difference between the Process ID (PID) and the Execution ID of a process?

Related

Can this SQL operation be done without doing it row by row (RBAR)?

I have a set of tasks, with some tasks being more important than others.
Each task does some work on one or more databases.
These tasks are assigned to workers that will perform the task (threads in an application poll a table).
When the worker has done the task, it sets the value back to null to signal it can accept work again.
When assigning the tasks to the workers, I would like to impose an upper limit on the number of database connections that can be used at any one time, so a task that uses a database that is currently at its limit will not be assigned to a worker.
I can get the number of database connections available by subtracting the databases of tasks that are currently assigned to workers from the database limits.
My problem is this: how do I select tasks that can run, in order of importance, based on the number of database connections available, without doing it row by row?
I'm hoping the example below illustrates my problem:
On the right are the available database connections, decreasing as we go down the list of tasks in order of importance.
If I'm selecting them in order of the importance of a task, then the connections available to the next task depend on whether the previous one was selected, which depends on whether there was space for all its database connections.
In the case above, task 7 can run only because task 6 couldn't.
Also, task 8 can't run because task 5 took the last connection to database C, as it's a more important task.
Question:
Is there a way to work this out without using while loops and doing it row by row?
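For illustration, here is a minimal set-based sketch under a simplifying assumption: each task needs exactly one connection to exactly one database, and the available connections per database have already been computed as described above. All table and column names are hypothetical.

    -- Hypothetical schema: tasks(task_id, importance, db_name, worker_id),
    -- db_limits(db_name, available). One database and one connection per
    -- task, a simplification of the real problem.
    SELECT task_id
    FROM (
        SELECT t.task_id,
               l.available,
               row_number() OVER (PARTITION BY t.db_name
                                  ORDER BY t.importance DESC) AS rn
        FROM   tasks t
        JOIN   db_limits l USING (db_name)
        WHERE  t.worker_id IS NULL          -- unassigned tasks only
    ) ranked
    WHERE rn <= available;                  -- take tasks while capacity lasts

Under that assumption, a running row number per database implements the greedy selection. Once a task can span several databases, whether a row is selected depends on every earlier selection across all databases, which is exactly what pushes this toward a recursive CTE or a row-by-row loop.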

How to assign multiple workers to one task at the same time

I have the following requirement: there are some tasks and some workers.
For any given task, we need one worker, or a pair of workers with different levels, working at the same time.
For any given task, we must finish it before a specific time.
My idea is to define one worker, or a combination of two workers, as a human resource, and assign the human-resource object to the task.
Then write a hard constraint rule for the task's finish time: penalize it when the finish time is later than the specified time.
Besides Kent's 2 options, there is a more complex scenario that requires:
Option 3: The auto delay to last design pattern.
Give each Task 1 or more TaskAssignment instances. Some tasks require only 1 Worker, so they have only 1 TaskAssignment. Those that require multiple workers have multiple.
2 task assignments for the same task must start and end at the same time (the workers need to work together). Delay until the last-arriving worker: when worker 1 arrives at 10:00 and worker 2 arrives at 10:30, delay worker 1 until 10:30 to start the task. So if the task takes 1 hour, both workers are done at 11:30 (worker 1 will have lost half an hour, but the solver will automatically start avoiding that because of your existing soft constraints that already favor time efficiency).
If 2 task assignments for the same task have the same worker, incur a hard constraint penalty.

PostgreSQL row read lock

Let’s say I have a table called Withdrawals (id, amount, user_id, status).
Whenever a withdrawal is initiated, this is the flow:
Verify that the user has a sufficient balance (calculated as the sum of amounts received minus the sum of withdrawal amounts; see the sketch after this list).
Insert a row with amount, user_id, and status = 'pending'.
Call third-party software through gRPC to initiate the withdrawal (actually send the money) and wait for a response.
Update the row with status = 'completed' as soon as we get a positive response, or delete the entry if the withdrawal failed.
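The balance check in the first step might look something like this (a sketch only; the deposits table and the column names are hypothetical):

    SELECT (SELECT coalesce(sum(amount), 0) FROM deposits    WHERE user_id = $1)
         - (SELECT coalesce(sum(amount), 0) FROM withdrawals WHERE user_id = $1)
           AS balance;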
However, I have a concurrency problem in this flow.
Let’s say the user makes two full-balance withdrawal requests roughly 50 ms apart:
Request 1
User has enough balance
Create Withdrawal (balance = 0)
Update withdrawal status
Request 2 (after ~50ms)
User has enough balance (which is not true; the other insert hasn’t been committed yet)
Create Withdrawal (balance goes negative)
Update withdrawal status
Right now, we are using Redis to lock withdrawals for a specific user if they arrive within x ms of each other, to avoid this situation; however, this is not the most robust solution. As we are developing an API for businesses, with our current solution we would be blocking legitimate withdrawals that could be requested at the same time.
Is there any way to lock, and make subsequent insert queries wait, based on the user_id of the Withdrawals table?
This is a property of transaction isolation. There is a lot written about it, and I would highly recommend the overview in Designing Data-Intensive Applications; I found it the most helpful description for improving my own understanding.
The default Postgres level is READ COMMITTED, which allows each of these concurrent transactions to see a similar funds-available state, even though they should be dependent on each other.
One way to address this would be to run each of these transactions at the SERIALIZABLE isolation level.
SERIALIZABLE All statements of the current transaction can only see
rows committed before the first query or data-modification statement
was executed in this transaction. If a pattern of reads and writes
among concurrent serializable transactions would create a situation
which could not have occurred for any serial (one-at-a-time) execution
of those transactions, one of them will be rolled back with a
serialization_failure error.
This should enforce the correctness of your application at a cost to availability, i.e. in this case the second transaction will not be allowed to modify the records and will be rejected, which will require a retry. For a POC or a low-traffic application this is usually a perfectly acceptable first step, as you can ensure correctness right now.
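As a minimal sketch, assuming the Withdrawals table from the question (and with the balance check reduced to a placeholder), the flow would look something like this, with the application retrying on serialization failures:

    BEGIN ISOLATION LEVEL SERIALIZABLE;

    -- 1. Check the balance (placeholder for: sum received - sum withdrawn).
    SELECT coalesce(sum(amount), 0) AS withdrawn
    FROM   withdrawals
    WHERE  user_id = $1;

    -- 2. If the balance is sufficient, record the pending withdrawal.
    INSERT INTO withdrawals (amount, user_id, status)
    VALUES ($2, $1, 'pending');

    COMMIT;
    -- If a concurrent transaction read and wrote an overlapping set of rows,
    -- one of the two fails with SQLSTATE 40001 (serialization_failure) and
    -- must be retried by the application.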
Also, in the book referenced above I think there was an example of how ATMs handle availability. They allow for this race condition, and for the user to overdraw if they are unable to connect to the centralized bank, but bound the maximum withdrawal to minimize the blast radius!
Another architectural way to address this is to take the transactions offline and make them asynchronous, so that each user-invoked transaction is published to a queue; by having a single consumer of the queue you naturally avoid any race conditions. The tradeoff here is similar: there is a fixed throughput available from a single worker, but it does help to address the correctness issue for right now :P
Locking across machines (like using Redis across Postgres/gRPC) is called distributed locking, and there is a good amount written about it: https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html

Aerospike Design | Request Flow Internals | Resources

Where can I find information about how a read/write request flows through the cluster when fired from the client API?
The Aerospike configuration reference ( http://www.aerospike.com/docs/reference/configuration ) mentions transaction queues, service threads, transaction threads, etc., but they are not discussed in the architecture document. I want to understand how this works so that I can configure it accordingly.
From client to cluster node
In your application, a record's key is the 3-tuple (namespace, set, identifier). The key is passed to the client for all key-value methods (such as get and put).
The client then hashes the (set, identifier) portion of the key through RIPEMD-160, resulting in a 20B digest. This digest is the actual unique identifier of the record within the specified namespace of your Aerospike cluster. Each namespace has 4096 partitions, which are distributed across the nodes of the cluster.
The client uses 12 bits of the digest to determine the partition ID of this specific key. Using the partition map, the client looks up the node that owns the master partition corresponding to the partition ID. As the cluster grows, the cost of finding the correct node stays constant (O(1)), as it does not depend on the number of records or the number of nodes.
The client converts the operation and its data into an Aerospike wire protocol message, then uses an existing TCP connection from its pool (or creates a new one) to send the message to the correct node (the one holding this partition ID's master replica).
Service threads and transaction queues
When an operation message comes in as a NIC transmit/receive queue interrupt, a service thread picks up the message from the NIC. What happens next depends on the namespace this operation is supposed to execute against. If it is an in-memory namespace, the service thread will perform all of the following steps. If it's a namespace whose data is stored on SSD, the service thread will place the operation on a transaction queue, and one of the queue's transaction threads will perform the following steps.
Primary index lookup
Every record has a 64B metadata entry in the in-memory primary index. The primary index is expressed as a collection of sprigs per partition, with each sprig implemented as a red-black tree.
The thread (either a transaction thread or the service thread, as mentioned above) finds the partition ID from the record's digest, and skips to the correct sprig of the partition.
Exist, Read, Update, Replace
If the operation is an exists, a read, an update or a replace, the thread acquires a record lock, during which other operations wait to access the specific sprig. This is a very short-lived lock. The thread walks the red-black tree to find the entry with this digest. If the operation is an exists, and the metadata entry does exist, the thread will package the appropriate message and respond. For a read, the thread will use the pointer metadata to read the record from the namespace storage.
An update needs to read the record as described above, and then merge in the bin data. A replace is similar to an update, but it skips first reading the current record. If the namespace is in-memory the service thread will write the modified record to memory. If the namespace stores on SSD the merged record is placed in a streaming write buffer, pending a flush to the storage device. The metadata entry in the primary index is adjusted, updating its pointer to the new location of the record. Aerospike performs a copy-on-write for create/update/replace.
Updates and replaces also need to be communicated to the replica(s) if the replication factor of the namespace is greater than 1. After the record-locking process, the operation is also parked in the RW Hash (Serializer) while the replica write completes. This is where other transactions on the same record queue up until they hit the transaction pending limit (AKA a hot key). The replica write(s) are handled by a different thread (rw-receive), releasing the transaction or service thread to move on to the next operation. When the replica writes complete, the RW Hash lock is released, and the rw-receive thread packages the reply message and sends it back to the client.
Create and Delete
If the operation is a new record being written, or a record being deleted, the partition sprig needs to be modified.
Like update/replace, these operations acquire the record-level lock and will go through the RW hash. Because they add or remove a metadata entry from the red-black tree representing the sprig, they must also acquire the index tree reduction lock. The same process happens when the namespace supervisor thread finds expired records and removes them from the primary index. A create operation adds an element to the partition sprig.
If the namespace stores on SSD, the create will load the record into a streaming write buffer, pending a flush to SSD, and ahead of the replica write. It will update the metadata entry in the primary index, adjusting its pointer to the new block.
A delete removes the metadata entry from the partition sprig of the primary index.
Summary
exists/read grab the record-level lock, and hold it for the shortest amount of time. That's also the case for update/replace when replication factor is 1.
update/replace also grab the RW hash lock, when replication factor is higher than 1.
create/delete also grab the index tree reduction lock.
For in-memory namespaces the service thread does all the work up to potentially the point of replica writes.
For data on SSD namespaces the service thread throws the operation onto a transaction queue, after which one of its transaction threads handles things such as loading the record into a streaming write buffer for writes, up until the potential replica write.
The rw-receive thread deals with replica writes and returning the message after the update/replace/create/delete write operation.
Further reading
I've addressed key-value operations, but not batch, scan or query. The difference between batch reads and single-key reads is easier to understand once you know how a single read works.
Durable deletes do not remove the metadata entry of the record from the primary index. Instead, they are a new write operation of a tombstone: there will be a new 64B entry in the primary index, and a 128B entry on the SSD for the record.
Performance optimizations with CPU pinning. See: auto-pin, service-threads, transaction-queues.
Set service-threads and transaction-queues equal to the number of cores in your CPU, or use CPU pinning via the auto-pin config parameter if it is available in your version and possible in your OS environment.
Set transaction-threads-per-queue to 3 (the default is 4; for object sizes < 1KB on a non data-in-memory namespace, 3 is optimal).
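As a rough sketch, the service context of aerospike.conf for a pre-4.7 server on an 8-core machine might then look like this (values are illustrative; check the configuration reference for your version before applying):

    service {
        service-threads 8                  # = number of CPU cores
        transaction-queues 8               # = number of CPU cores
        transaction-threads-per-queue 3    # see the note above
        auto-pin cpu                       # if your version and OS allow it
    }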
This changes with server version 4.7+: the transaction is now handled by the service thread itself. By default, the number of service threads is now set to 5 x the number of CPU cores. Once a service thread picks a transaction up from the socket buffer, it carries it through to completion, unless the transaction ends up in the rwHash (e.g. writes waiting for replication). The transaction queue still exists internally, but is only relevant for transaction restarts when transactions queue up in the rwHash (multiple pending transactions for the same digest).

DBMS Transaction and Serializable

We know:
A schedule in which transactions are aligned in such a way that one
transaction is executed first. When the first transaction completes
its cycle then next transaction is executed. Transactions are ordered
one after other. This type of schedule is called serial schedule as
transactions are executed in a serial manner.
and
This execution does no harm if two transactions are mutually
independent and working on different segment of data but in case these
two transactions are working on same data, results may vary.
So what is the benefit of determining that two transactions form a serializable schedule? If the results may vary, what is the benefit?
When transactions access the same variables but are executed serially (ie not executed simultaneously) there is a sense in which "results may vary" from when there is only one transaction ever executing (possibly repeatedly). With serial transactions we don't know what order the (non-overlapping) transactions are executed in. All we know at the start of the execution of a repeating transaction is that other transactions may have changed variables since the end of the last execution of the repeating transaction. (Although we generally know something about how they have been left.)
There is nothing wrong with such "varying results" because they just reflect that the transactions were requested at varying times.
When transactions access the same variables and are executed simultaneously (ie not serially) then for each transaction "results may vary" (in another sense) from how we normally understand the code. That normal understanding relies on only one transaction executing at a time. Eg normally if code reads a variable twice without writing to it then we expect to get the same value. But that's not guaranteed if another transaction writes to it in between the reads. Eg normally if code reads a variable then we expect to get the value that the variable actually had. But that's not guaranteed if we get some of its bytes and then another transaction writes to it and then we get the rest of the bytes from that new value.
But if transactions are serializable then they can be executed non-serially (with overlap) but with the same result as if they were executed serially (with no overlap). Then code means what it normally means when there is only one transaction executing.
So we have to make sure that the system acts as if transactions are serial or else we have no idea what our program does.
A serializable schedule is an interleaving of operations from multiple transactions that gives the same result as some serial(ized) schedule. The benefit of executing a serializable schedule that is different from just doing all of one transaction's operations after another's is improved throughput from doing multiple operations from multiple transactions at the same time.
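A concrete (hypothetical) example: two transactions that each add 1 to the same counter, interleaved so that the schedule is not serializable:

    -- Initially: n = 10 in the counters row with id = 1.
    -- T1: SELECT n FROM counters WHERE id = 1;      -- reads 10
    -- T2: SELECT n FROM counters WHERE id = 1;      -- reads 10
    -- T1: UPDATE counters SET n = 11 WHERE id = 1;  -- writes 10 + 1
    -- T2: UPDATE counters SET n = 11 WHERE id = 1;  -- writes 10 + 1 (lost update)
    -- Final result: n = 11. Both serial orders (T1 then T2, or T2 then T1)
    -- end with n = 12, so no serial schedule produces this result.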
PS
Your quotes appear on a web page that is a mess. It doesn't even define "serializable schedule". The text between your quotations is
In a multi-transaction environment, serial schedules are considered as
benchmark. The execution sequence of instruction in a transaction
cannot be changed but two transactions can have their instruction
executed in random fashion.
But the second sentence should start But in a non-serial schedule.... Because in a serial schedule "Transactions are ordered one after other." So the "results may vary" in the quotation is in a non-serial schedule.
But you did not respond to my comment:
Does "This execution" refer to a serial execution of transactions or
to a non-serial execution of transactions? (What came before your
second quote?)