Why are transaction cost and execution cost always the same in Remix IDE? - solidity

I recently started to learn how to develop smart contracts using Solidity in the Remix IDE.
I'm using Remix VM (London) environment.
My question is, how can transaction costs and execution costs be the same in all transactions?
I know that transaction cost is the cost of putting data on the blockchain, and execution cost is the cost of executing it.
I'd appreciate your help. Thanks.

The transaction and execution cost will be the same every time if the same work is being done.
If you have a function that calculates A + B and returns C, then the work is always identical and the cost will always be the same.
If you have a function that saves a string supplied by the user, then the cost will change based on the size of the string the user inputs.
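A rough back-of-the-envelope sketch of why the second case varies (Python, not an exact EVM gas model; it assumes the commonly cited ~20,000 gas to write a fresh 32-byte storage slot and ignores all other per-transaction costs). Pure computation like A + B touches no storage, so its cost is fixed on every call:

import math

SSTORE_NEW_SLOT = 20_000  # approx. gas to write a previously empty 32-byte slot

def rough_storage_gas(text: str) -> int:
    # A stored string occupies roughly one storage slot per 32 bytes of data.
    slots = math.ceil(len(text.encode()) / 32)
    return slots * SSTORE_NEW_SLOT

print(rough_storage_gas("hi"))                              # 1 slot  -> ~20,000 gas
print(rough_storage_gas("a much longer user string" * 5))   # 4 slots -> gas grows with input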

Related

gem5: how can I monitor when a branch misprediction happens while simulating the O3 CPU?

I'm currently planning to build an IPC performance counter for the Out-Of-Order (O3) CPU using gem5.
I've read a paper about building an accurate performance counter for an O3 CPU whose idea is top-down interval analysis (the paper is "A Performance Counter Architecture for Computing Accurate CPI Components"). I'm planning to apply this idea, and to do so I have to capture the moments when branch mispredictions, I-cache misses, D-cache misses, etc. happen and increment a counter for each event. I've looked at gem5/src/cpu/o3/decode.cc and there are lines about mispredictions like the ones below.
[screenshot: branch-misprediction handling in gem5/src/cpu/o3/decode.cc]
I'm trying to write code like the line below (I think I should create a new object for the IPC counter),
if(decodeInfo[tid].branchMispredict == true) counter++;
but I'm struggling to find where to start.
Thanks for reading.
gem5 provides a statistics framework for capturing hardware events. The O3 CPU model already implements a large number of statistics, including for branch mispredictions (at decode and at execute). I suggest you have a look at them and assess whether they're sufficient for your needs.
Statistics are reported at the end of the simulation in m5out/stats.txt.
PS: these statistics are completely accurate and costless; they do not have the limitations of a real counter, such as only sampling every N cycles or the overhead of communicating values to software (for example via interrupts). If you need to model those effects, you may want to use a performance counter model, such as the Arm PMU model.
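For example, once the run has finished you can pull the misprediction counters out of stats.txt with a few lines of Python (a sketch only; the exact statistic names vary between gem5 versions, so grep your own stats.txt rather than relying on any particular name):

# Sketch: collect every statistic whose name mentions mispredictions.
# stats.txt lines have the form: <name>  <value>  # <description>
def mispredict_stats(path="m5out/stats.txt"):
    stats = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2 and "mispred" in parts[0].lower():
                try:
                    stats[parts[0]] = float(parts[1])
                except ValueError:
                    pass  # skip entries whose value is not numeric (e.g. nan)
    return stats

for name, value in mispredict_stats().items():
    print(name, value)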

Recommended way of measuring execution time in Tensorflow Federated

I would like to know whether there is a recommended way of measuring execution time in Tensorflow Federated. To be more specific: if one would like to extract the execution time for each client in a certain round, e.g., for each client involved in a FedAvg round, by saving a time stamp before the local training starts and another just before sending back the updates, what is the best (or simply correct) strategy to do this? Furthermore, since the clients' code runs in parallel, are such time stamps misleading (especially considering the hypothesis that different clients may be using differently sized models for local training)?
To be very practical: is it appropriate to call tf.timestamp() at the beginning and at the end of @tf.function client_update(model, dataset, server_message, client_optimizer) -- this is probably a simplified signature -- and then subtract the two time stamps?
I have the feeling that this is not the right way to do it, given that clients run in parallel on the same machine.
Thanks to anyone who can help me with this.
There are multiple potential places to measure execution time, so a first step might be defining very specifically what the intended measurement is.
Measuring the training time of each client as proposed is a great way to get a sense of the variability among clients. This could help identify whether rounds frequently have stragglers. Using tf.timestamp() at the beginning and end of the client_update function seems reasonable. The question correctly notes that this happens in parallel; summing all of these times would be akin to CPU time rather than wall-clock time.
Measuring the time it takes to complete all client training in a round would generally be the maximum of the values above. This might not hold when simulating FL in TFF, as TFF may decide to run some clients sequentially due to system resource constraints. In a real deployment all of these clients would run in parallel.
Measuring the time it takes to complete a full round (the maximum time it takes to run a client, plus the time it takes for the server to update) could be done by moving the tf.timestamp calls to the outer training loop. This would mean wrapping the call to trainer.next() in the snippet on https://www.tensorflow.org/federated. This would be most similar to elapsed real time (wall-clock time).
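A minimal sketch of the two extremes (per-client time inside the tf.function versus wall-clock time per round), assuming client_update is the simplified function from the question and trainer/state come from the iterative process in the tensorflow.org/federated snippet:

import time
import tensorflow as tf

@tf.function
def client_update(model, dataset, server_message, client_optimizer):
    start = tf.timestamp()            # taken when the op executes, not at tracing
    # ... local training loop over `dataset` goes here ...
    elapsed = tf.timestamp() - start  # per-client duration; summing these over
    return elapsed                    # clients is akin to CPU time, not wall-clock

# Wall-clock time for a full round (client work + server update):
# round_start = time.time()
# state, metrics = trainer.next(state, federated_train_data)
# print("round took", time.time() - round_start, "s")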

Ethereum: How to deploy slightly bigger smart contracts?

How can I deploy large smart contracts? I tried it on Kovan and Ropsten and ran into issues with both. I have 250 lines of code plus all the ERCBasic and Standard files imported.
The size of the compiled bin file is 23 kB. Has anybody experienced similar problems with contracts of this size?
UPDATE:
It is possible to decrease the compiled contract size by doing the compilation as follows:
solc filename.sol --optimize
In my case it turned 23 kB into about 10 kB.
Like everything else in Ethereum, limits are imposed by the gas consumed by your transaction. While there is no exact size limit, there is a block gas limit, and the amount of gas you provide has to be within that limit.
When you deploy a contract, there is an intrinsic gas cost, the cost of the constructor execution, and the cost of storing the bytecode. The intrinsic gas cost is static, but the other two are not. The more gas consumed in your constructor, the less is available for storage. Usually, there is not a lot of logic within a constructor and the vast majority of gas consumed will be based on the size of the contract. I'm just adding that point here to illustrate that it is not an exact contract size limit.
Easily the bulk of the gas consumption comes from storing your contract bytecode on the blockchain. The Ethereum Yellowpaper (see page 9) dictates that the cost for storing the contract is
cost = Gcodedeposit * o
where o is the size (in bytes) of the optimized contract bytecode and Gcodedeposit is 200 gas/byte.
If your bytecode is 23 kB, then this alone will cost ~4.6M gas. Add that to the intrinsic gas cost and the cost of the constructor execution, and you are probably getting close to the block gas limit.
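As a quick sanity check of that estimate (Python; code-deposit cost only, ignoring intrinsic gas and constructor execution):

G_CODEDEPOSIT = 200            # gas per byte of deployed bytecode (Yellowpaper)

unoptimized = 23_000           # ~23 kB bin file from the question
optimized = 10_000             # ~10 kB after `solc --optimize`

print(G_CODEDEPOSIT * unoptimized)  # 4,600,000 -> ~4.6M gas just for code storage
print(G_CODEDEPOSIT * optimized)    # 2,000,000 -> ~2M gas after optimization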
To avoid this problem, you need to break down your contracts into libraries/split contracts, remove duplicate logic, remove non-critical functions, etc.
For some more low level examples of deployment cost, see this answer and review this useful article on Hackernoon.
It is possible to get around the max contract size limitation by implementing the Transparent Contract Standard: https://github.com/ethereum/EIPs/issues/1538

How to measure GPU bandwidth with C code

I have a question about how to measure the bandwidth of a GPU. I have tried several different approaches, but none of them work. For example, I tried to divide the amount of data transferred by the time taken in order to calculate the bandwidth. However, since the GPU can switch which warps are currently executing, the amount of data transferred varies during execution. I wonder whether you could give me some advice about how to do this. That would be really appreciated.
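The usual recipe is effective bandwidth = bytes moved / elapsed time, with the copy timed by GPU events so that warp scheduling inside the device does not distort the measurement. A minimal sketch in Python with CuPy (purely for brevity; the CUDA C equivalent wraps a cudaMemcpy between cudaEventRecord calls and reads cudaEventElapsedTime):

import cupy as cp

n = 1 << 26                                   # 64M floats = 256 MiB per buffer
src = cp.zeros(n, dtype=cp.float32)
dst = cp.empty_like(src)

start, end = cp.cuda.Event(), cp.cuda.Event()
start.record()
cp.copyto(dst, src)                           # device-to-device copy
end.record()
end.synchronize()

seconds = cp.cuda.get_elapsed_time(start, end) / 1e3   # event time is in ms
bytes_moved = src.nbytes + dst.nbytes         # src is read, dst is written
print(f"effective bandwidth: {bytes_moved / seconds / 1e9:.1f} GB/s")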

dynamic optimization of running programs

I was told that running programs generate probability data used to optimize repeated instructions.
For example, if an "if-then-else" control structure has been evaluated TRUE 8/10 times, then the next time the "if-then-else" statement is being evaluated, there is an 80% chance the condition will be TRUE. This statistic is used to prompt hardware to load the appropriate data into the registers assuming the outcome will be TRUE. The intent is to speed up the process. If the statement does evaluate to TRUE, data is already loaded to the appropriate registers. If the statement evaluates to FALSE, then the other data is loaded in and simply written over what was decided "more likely".
I have a hard time understanding how the probability calculations don't outweigh the performance cost of the decisions they're trying to improve. Is this something that really happens? Does it happen at a hardware level? Is there a name for this?
I can't seem to find any information about the topic.
This is done. It's called branch prediction. The cost is non-trivial, but it's handled by dedicated hardware, so the cost is almost entirely in terms of extra circuitry -- it doesn't affect the time taken to execute the code.
That means the real cost would be one of lost opportunity -- i.e., if there was some other way of designing a CPU that used that amount of circuitry for some other purpose and gained more from doing so. My immediate guess is that the answer is usually no -- branch prediction is generally quite effective in terms of return on investment.
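To make the idea concrete, here is a toy software model of the classic 2-bit saturating counter that simple hardware predictors use (illustrative only; real CPUs do this in dedicated circuitry and layer much more sophisticated history-based schemes on top):

class TwoBitPredictor:
    """Per-branch 2-bit saturating counter: states 0-1 predict not-taken, 2-3 predict taken."""

    def __init__(self):
        self.state = 2  # start as weakly taken

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Saturate at 0 and 3 so a single odd outcome doesn't flip the prediction.
        self.state = min(self.state + 1, 3) if taken else max(self.state - 1, 0)

predictor = TwoBitPredictor()
outcomes = [True] * 8 + [False] * 2   # a branch that is taken 80% of the time
correct = 0
for taken in outcomes:
    correct += predictor.predict() == taken
    predictor.update(taken)
print(f"correct predictions: {correct}/{len(outcomes)}")   # 8/10 for this sequence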