Do you have to spend the entire output of a bitcoin transaction? - bitcoin

After reading about how bitcoin is transferred, I learned that you take the outputs of a previous transaction and use them as inputs for a current transaction; in other words, you use the transaction ID of the previous transaction to supply coins to the current one. There can be multiple inputs and outputs, but if you want to spend the bitcoin in an output, do you have to spend the entire output, or can you spend part of one output and then use that transaction ID later to spend the rest of it? How would the UTXO database keep track of this and handle the math?

Any input value that is not assigned to an output is claimed by the miner of the block; the transaction fee is simply the difference between a transaction's inputs and its outputs.
If you want to spend the bitcoin in an output, do you have to spend the entire output, or can you spend part of one output and then use that transaction id later to spend the rest of that output?
You have to spend the entire output: an output is consumed in full when it is referenced as an input, and the UTXO set simply removes it and adds the new outputs you create. Any value you do not reassign to a new output goes to the miner as a fee, so the usual approach is to pay the "change" back to an address you control.
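To make the bookkeeping concrete, here is a minimal, purely illustrative Python sketch of the change pattern; the amounts, txid, and addresses below are made-up placeholders, not real wallet or library calls.

# Illustrative only: spending one UTXO in full and sending the "change" back.
# Amounts are in satoshis; txid and addresses are hypothetical placeholders.
utxo_value = 50_000_000          # one unspent output worth 0.5 BTC
payment = 30_000_000             # amount you actually want to send
fee = 10_000                     # whatever you leave unassigned goes to the miner

change = utxo_value - payment - fee
assert change >= 0, "input value must cover payment plus fee"

transaction = {
    # the whole previous output is consumed as the input...
    "inputs": [{"txid": "abc123...", "vout": 0, "value": utxo_value}],
    # ...and its value is redistributed across new outputs
    "outputs": [
        {"address": "recipient_address", "value": payment},
        {"address": "your_change_address", "value": change},
    ],
}

# The UTXO set afterwards simply deletes the spent output and adds the two new
# ones; nothing is ever "partially spent".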


Orderbook matching engine

My question is more of a conceptual one than a coding question, but I would also accept code (that would be the ideal answer).
I have a huge dataset of per-second order book snapshots (that is, for each second I have the best 200 ask prices and their volumes, and the best 200 bid prices and their volumes). This is real data: real orders that were submitted at some point in time. Each state is represented as a pandas DataFrame with the columns timestamp, side, price, volume. For example:
2023-02-14 00:01:01, 'ask', 19874.11, 0.3
There are many ask and bid orders per state. My question is the following: for a state s_i, if I submit a limit order with a specified price and volume, how would that change state s_(i+1)? (This is just a simulation.) The same question applies to a market order with some volume.
Purpose:
I am trying to optimize order execution, and there is existing literature on this subject. The idea is that when I train my agent, I want to reflect each decision it makes, so I can update the next states based on the actions the agent has taken.
Literature:
https://www.econstor.eu/bitstream/10419/216206/1/1696077540.pdf
You can try deploying your own exchange and testing there, if you can implement the order-handling logic you need.
There is an open-source crypto exchange project, OpenCEX; here is a link to it:
https://github.com/Polygant/OpenCEX
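Short of running a full exchange, the matching mechanics asked about can also be approximated directly on a snapshot. Below is a rough, simplified sketch (not OpenCEX code), assuming price-time priority, that your own order does not change the orders appearing in later historical snapshots, and the DataFrame columns described in the question:

import pandas as pd

def apply_market_order(book: pd.DataFrame, side: str, volume: float) -> pd.DataFrame:
    """Consume liquidity from the opposite side of the book, best price first."""
    opposite = "ask" if side == "buy" else "bid"
    levels = book[book["side"] == opposite].sort_values(
        "price", ascending=(opposite == "ask")).copy()
    remaining = volume
    for idx, row in levels.iterrows():
        take = min(remaining, row["volume"])
        levels.at[idx, "volume"] = row["volume"] - take
        remaining -= take
        if remaining <= 0:
            break
    levels = levels[levels["volume"] > 0]
    return pd.concat([book[book["side"] != opposite], levels], ignore_index=True)

def apply_limit_order(book: pd.DataFrame, side: str, price: float, volume: float) -> pd.DataFrame:
    """A crossing limit order fills like a market order up to its price;
    any residual volume rests on our side of the book."""
    opposite = "ask" if side == "buy" else "bid"
    crossing = book[(book["side"] == opposite) &
                    ((book["price"] <= price) if side == "buy" else (book["price"] >= price))]
    filled = min(volume, crossing["volume"].sum())
    if filled > 0:
        book = apply_market_order(book, side, filled)
    rest = volume - filled
    if rest > 0:
        row = pd.DataFrame([{"timestamp": book["timestamp"].iloc[0],
                             "side": "bid" if side == "buy" else "ask",
                             "price": price, "volume": rest}])
        book = pd.concat([book, row], ignore_index=True)
    return book

This ignores queue position within a price level and any market impact on later snapshots, which is exactly where the execution-optimization literature you cite becomes relevant.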

Check number of slots used by a query in BigQuery

Is there a way to check how many slots were used by a query over the period of its execution in BigQuery? I checked the execution plan, but I could only see the slot time in milliseconds; I could not see any parameter or graph showing the number of slots used over the course of execution. I even tried looking at Stackdriver Monitoring, but I could not find anything like this. Please let me know if it can be calculated in some way, or if I can see it somewhere I might have missed.
A BigQuery job will report the total number of slot-milliseconds from the extended query stats in the job metadata, which is analogous to computational cost. Each stage of the query plan also indicates input stats for the stage, which can be used to indicate the number of units of work each stage dispatched.
More details about the representation can be found in the REST reference for jobs. See statistics.query.totalSlotMs and statistics.query.queryPlan[].parallelInputs for more information.
BigQuery now provides a key in the Jobs API JSON called "timeline". This structure provides "statistics.query.timeline[].completedUnits" which you can obtain either during job execution or after. If you choose to pull this information after a job has executed, "completedUnits" will be the cumulative sum of all the units of work (slots) utilised during the query execution.
The question might have two parts though: (1) Total number of slots utilised (units of work completed) or (2) Maximum parallel number of units used at a point in time by the query.
For (1), the answer is as above, given by "completedUnits".
For (2), you might need to consider the maximum value of queryPlan.parallelInputs across all query stages, which would indicate the maximum "number of parallelizable units of work for the stage" (https://cloud.google.com/bigquery/query-plan-explanation)
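For reference, here is a rough sketch of pulling both numbers for a finished job with the google-cloud-bigquery Python client; the job ID is a placeholder, and property names may differ slightly between client versions:

from google.cloud import bigquery

client = bigquery.Client()
job = client.get_job("my-job-id")  # placeholder job ID (add location= if needed)

# (1) Total slot-milliseconds consumed, i.e. the computational cost.
print("total slot ms:", job.slot_millis)

# Cumulative units of work completed over time, from the job timeline.
for entry in job.timeline:
    print(entry.elapsed_ms, entry.completed_units, entry.active_units)

# (2) A rough upper bound on parallelism: the maximum parallel inputs of any stage.
max_parallel = max(stage.parallel_inputs for stage in job.query_plan)
print("max parallel inputs in any stage:", max_parallel)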
If, after this, you additionally want to know if the 2000 parallel slots that you are allocated across your entire on-demand query project is sufficient, you'd need to find the point in time across all queries taking place in your project where the slots being utilised is at a maximum. This is not a trivial task, but Stackdriver monitoring provides the clearest view for you on this.

Rails + PostgreSQL: collision of two operations

I've got a shop-like application (Rails 3.2 + PostgreSQL) where two of my resources/tables are Users and Operations. It has the following characteristics:
Amongst other attributes, Users have a certain :credit at each moment in time.
Operations represent either:
A purchase of a product (whose price is deducted from the credit of the User who purchased it).
A purchase of credit (the amount of which is added to the User's credit).
Each Operation stores:
:precredit - The credit the User had before the Operation.
:postcredit - The final credit after the Operation.
:price - The amount of money involved, whether it's positive or negative.
There was a problem with two Operations, since they happened in exactly the same second (my guess is that there was an internet problem for a while and then both queries were executed in the same second; see below).
This is the sequence of operations sorted by created_at (credit operations add to the credit and product operations subtract from it):
Category:credit Precredit:2.9 Price:30.0 Postcredit:32.9 Created_at:16:34:02
Category:product Precredit:32.9 Price:30.0 Postcredit:2.9 Created_at:16:42:06
Category:credit Precredit:32.9 Price:5.0 Postcredit:37.9 Created_at:16:42:06
Category:product Precredit:37.9 Price:4.0 Postcredit:33.9 Created_at:16:45:24
As one can see, Operation#3 should have a precredit = 2.9, which is the postcredit of Operation#2. However, the result of Operation#2 is not taken into account when Operation#3 is executed.
Ideally I would have:
Category:credit Precredit:2.9 Price:30.0 Postcredit:32.9 Created_at:16:34:02
Category:product Precredit:32.9 Price:30.0 Postcredit:2.9 Created_at:16:42:06
Category:credit Precredit:2.9 Price:5.0 Postcredit:-2.1 Created_at:16:42:06
Note that Operation#3 would've raised an error due to enough_balance?-type validations resulting in false.
Questions
Any ideas regarding how this might have happened?
How can this type of collisions be avoided?
I'm not sure how you're creating the operations, but this kind of situation can happen in concurrent environments; consider the following example:
Process A: gets the User object to obtain the current credit (equal to precredit)
Process B: gets the User object to obtain the current credit (at this point both have the same value)
Process A: calculates the postcredit (precredit +/- value)
Process B: calculates the postcredit
Process B: saves the record
Process A: saves the record
Even if the records in process A and process B are not saved in the exact same millisecond (which is even more unlikely), both records are still saved with the same precredit, because of how that value was calculated. This is a classic concurrency problem, and it is solved with a lock (see Peterson's algorithm).
Rails provides a mechanism for this; I recommend you take a look at http://api.rubyonrails.org/classes/ActiveRecord/Locking/Pessimistic.html. The object you'll want to lock will probably be the user.
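Under the hood, Rails' pessimistic locking issues a SELECT ... FOR UPDATE. The same pattern, sketched here in plain SQL driven from Python/psycopg2 rather than ActiveRecord (the table and column names are assumed to mirror the question, not your actual schema):

import psycopg2

conn = psycopg2.connect("dbname=shop")  # placeholder connection string

def purchase(user_id: int, price: float) -> None:
    """Deduct `price` from the user's credit atomically: the row lock makes a
    concurrent operation wait until this transaction commits."""
    with conn, conn.cursor() as cur:
        # Lock the user row; a second process reaching this point blocks here.
        cur.execute("SELECT credit FROM users WHERE id = %s FOR UPDATE", (user_id,))
        precredit = cur.fetchone()[0]
        postcredit = precredit - price
        if postcredit < 0:
            raise ValueError("insufficient balance")
        cur.execute(
            "INSERT INTO operations (user_id, precredit, price, postcredit) "
            "VALUES (%s, %s, %s, %s)",
            (user_id, precredit, price, postcredit))
        cur.execute("UPDATE users SET credit = %s WHERE id = %s", (postcredit, user_id))
    # Leaving the `with conn` block commits and releases the lock.

In Rails the equivalent is wrapping the read-calculate-write sequence in user.with_lock { ... }, which starts a transaction and locks the user row in the same way.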

SQL Server payment estimate probability - best maths equations and recursive query?

I have a table which holds a list of transactions.
Task: To estimate the next transaction amount.
Problem:
The actual payment period for each row is variable; it can be weekly, monthly, or anything chosen by the end user.
Can anyone suggest a good method to estimate the next payment based on previous data?
At the moment I basically convert the figure back to a daily amount and then multiply by the period, i.e. week/month/quarter/year. Then, given the history, I choose the result that has the highest incidence (count).
This does not generate accurate estimates, due to payments within payments that I don't need to care about, e.g. a £100 real payment with +£20 of additional charges that are irrelevant.
Another way is to calculate the average, standard deviation, and variance between payments, then choose the highest-probability value.
The problem is, I've been unable to code this in SQL.
SELECT [Identifier]
,[DateTranEntered]
,[Type]
,[TranDateFrom]
,[TranDateTo]
,[Amount]
,[ReferenceForTran]
,[CreatedDate]
FROM [TranTable]
Perhaps something that recurses through the table, calculates the daily amount for every transaction, and then, using the variance and incidence, chooses from the last 'x' transactions what the estimated next payment is?
The problem is I have gotten stuck writing the recursive query for this.
Any thoughts about this?
SQL Server Analysis services has a suite of data mining tools that provide algorithms such as Linear Regressions, Decision Trees and Neural Networks. You can learn more about them here: http://msdn.microsoft.com/en-us/library/ms175595.aspx. It sounds like Linear Regressions might be the best place to start for this problem.
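Not the SQL the question asks for, but to illustrate the simpler statistical approach described above (normalize each payment to a daily rate, then estimate from the distribution), here is a rough Python/pandas sketch; the column names follow the query above and the days-per-period mapping is an assumption:

import pandas as pd

# Hypothetical days-per-period mapping; adjust to however Type is actually coded.
PERIOD_DAYS = {"weekly": 7, "monthly": 30, "quarterly": 91, "yearly": 365}

def estimate_next_payment(trans: pd.DataFrame) -> float:
    """trans has columns Amount, Type, TranDateFrom, TranDateTo (as in the query)."""
    df = trans.copy()
    # Prefer the actual covered interval; fall back to the nominal period length.
    days = (pd.to_datetime(df["TranDateTo"]) - pd.to_datetime(df["TranDateFrom"])).dt.days
    days = days.where(days > 0, df["Type"].map(PERIOD_DAYS))
    df["daily"] = df["Amount"] / days
    # Re-express every historical payment at the most recent period length, then
    # take the mode (highest incidence), falling back to the median.
    latest_days = days.iloc[-1]
    normalized = (df["daily"] * latest_days).round(2)
    modes = normalized.mode()
    return float(modes.iloc[0]) if not modes.empty else float(normalized.median())

Dropping rows that fall more than a couple of standard deviations from the mean daily amount before taking the estimate would filter out the irrelevant add-on charges mentioned in the question.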

Q-learning value update

I am working on the power management of a device using a Q-learning algorithm. The device has two power modes, i.e., idle and sleep. When the device is asleep, the requests for processing are buffered in a queue. The Q-learning algorithm seeks to minimize a cost function, which is a weighted sum of the immediate power consumption and the latency caused by an action.
c(s,a) = lambda * p_avg + (1 - lambda) * avg_latency
In each state, the learning algorithm takes an action (executing a time-out value) and evaluates the effect of that action in the next state (using the formula above). The actions consist of executing time-out values from a pool of pre-defined time-outs. The parameter lambda in the equation above is a power-performance trade-off parameter (0 <= lambda <= 1). It defines whether the algorithm should aim for power saving (lambda --> 1) or for minimizing latency (lambda --> 0). The latency for each request is calculated as queuing time + execution time.
The problem is that the learning algorithm always favors small time-out values in the sleep state, because the average latency for small time-out values is always lower and hence their cost is also small. When I change the value of lambda from low to high, I don't see any effect on the final output policy: it always selects small time-out values as the best actions in each state. Instead of the average power and average latency for each state, I have tried using the overall average power consumption and overall average latency to compute the cost of a state-action pair, but it doesn't help. I also tried using the total energy consumption and the total latency experienced by all requests, but that doesn't help either. My question is: what could be a better cost function for this scenario? I update the Q-value as follows:
Q(s,a) = Q(s,a) + alpha * [c(s,a) + gamma * min_{a'} Q(s',a') - Q(s,a)]
where alpha is a learning rate (decreased slowly) and gamma = 0.9 is a discount factor.
To answer the questions posed in the comments:
shall I use the entire power consumption and entire latency for all
the requests to calculate the cost in each state (s,a)?
No. In Q-learning, reward is generally considered an instantaneous signal associated with a single state-action pair. Take a look at Sutton and Barto's page on rewards. As shown, the instantaneous reward r_{t+1} is subscripted by the time step, indicating that it is indeed instantaneous. Note that the return R_t is what accumulates rewards (it sums the rewards received from time t onward), and the learning algorithm handles that accumulation for you. Thus, there is no need for you to explicitly keep track of accumulated latency and power consumption (and doing so is likely to be counter-productive).
or shall I use the immediate power consumption and average latency
caused by an action a in state s?
Yes. To underscore the statement above, see the definition of an MDP on page 4 here. The relevant bit:
The reward function specifies expected instantaneous reward as a
function of the current state and action
As I indicated in a comment above, problems in which reward is being "lost" or "washed out" might be better solved with a Q(lambda) implementation because temporal credit assignment is performed more effectively. Take a look at Sutton and Barto's chapter on TD(lambda) methods here. You can also find some good examples and implementations here.
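To make the update concrete, here is a minimal tabular sketch of the cost-minimizing Q-learning step described in the question; the time-out pool, state encoding, and cost inputs are placeholders for your power and latency measurements:

import random
from collections import defaultdict

ALPHA, GAMMA, LAMBDA, EPSILON = 0.1, 0.9, 0.5, 0.1
TIMEOUTS = [1, 5, 10, 50, 100]   # pool of pre-defined time-out values (placeholder)

Q = defaultdict(float)           # Q[(state, action)] -> expected cost

def cost(power: float, latency: float) -> float:
    """Instantaneous cost of the transition just observed, not an accumulated total."""
    return LAMBDA * power + (1 - LAMBDA) * latency

def choose_action(state):
    """Epsilon-greedy with respect to cost: explore, otherwise pick the cheapest time-out."""
    if random.random() < EPSILON:
        return random.choice(TIMEOUTS)
    return min(TIMEOUTS, key=lambda a: Q[(state, a)])

def update(state, action, power, latency, next_state) -> None:
    """One Q-learning step; the target uses min over next actions because cost is minimized."""
    best_next = min(Q[(next_state, a)] for a in TIMEOUTS)
    target = cost(power, latency) + GAMMA * best_next
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])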