In what scenarios can the blockchain size decrease for bitcoin? - bitcoin

I am running a private bitcoin network for which I changed the target time between two blocks to 12 seconds and the difficulty adjustment interval to 25 blocks. I ran the network for about 4 hours with 50 nodes. In the logs of one of the nodes I observed that the blockchain height increased up to a maximum of 181 and then started decreasing, all the way down to 38. What could explain such strange behaviour?
Please refer to the log below:
2015-11-04 01:58:47 receive version message: /Satoshi:0.11.99/: version 70011, blocks=181, us=0.0.0.0:0, peer=2, peeraddr=127.0.0.1:44117
2015-11-04 01:58:47 UpdateTip: new best=0000005265ca4ce01ad0d06f45cf475bf303de3d64e942c5cf1177e00f346c78 height=180 log2_work=37.083283 tx=30941 date=2015-11-04 01:53:17 progress=1.000000 cache=0.0MiB(1tx)
2015-11-04 01:58:47 UpdateTip: new best=00000052a34cedf3c5ddbeb46d36644654523db855c4cce984d2623e840dd219 height=179 log2_work=37.082953 tx=30940 date=2015-11-04 01:53:10 progress=1.000000 cache=0.0MiB(2tx)
2015-11-04 01:58:47 UpdateTip: new best=00000030fd7652affb883f05fe0c98e7fe3fbc3cfd74808e061ed05ec61c22e6 height=178 log2_work=37.082623 tx=30939 date=2015-11-04 01:52:55 progress=1.000000 cache=0.0MiB(3tx)
2015-11-04 01:58:47 AddToWallet c32bcbd8102c602a5e71ee717232e204435f331dce6fbfb9eb5d552698faa95b
2015-11-04 01:58:47 AddToWallet 1c91517aeadd12bcbcfdf4a1423b671d405543ae9abfbd87078969ce1971663f
2015-11-04 01:58:47 AddToWallet b11f9c2e3b1ab3d3983da63783bb95903d89405243d0716ea88272a9261b7a33

Are all 50 nodes mining? What might happen is that some nodes fall out of sync and keep mining on top of earlier blocks. If one of those side branches ends up with more accumulated work than the current chain tip, the node will reorganize onto it, which looks like the chain rolling back.
However, the log entries all carry the same second, which might indicate a race condition between receiving the blocks and printing the log messages.

The log you provided shows two peers.
If those (2+1) are the only nodes in the network, then your chain is not going to be stable without more fine-tuning of the parameters.
My guess is that your changed rules caused a chain split followed by a reorganization: the node switched to a branch with more accumulated work, and the extra blocks on the abandoned branch became orphans after the reorganization.
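If you want to confirm that a reorganization is what happened, Bitcoin Core's getchaintips RPC lists the active tip alongside any known side branches; this assumes bitcoin-cli can reach one of your nodes:

bitcoin-cli getchaintips

A node that recently reorganized will typically show, next to the "active" tip, an abandoned branch with a non-zero branchlen and a status such as "valid-fork" or "headers-only".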


Tracking down S3 Costs (S3FS)

During a transition, our S3 costs jumped a lot due to ListBucket and HeadObject calls, and we are trying to figure out how to debug the sudden increase. We made some changes that should NOT have affected it, but the major changes seem to be:
10-20X increase in HeadObject calls
Sudden appearance of ListBucket calls
I have attached a chart showing the jump between April 10, 2018 and April 14, 2018. In the days in between, we made the following changes:
Changed from (Debian 8) S3FS v1.61 (super old, from 2012, not even on GitHub) to v1.84 (latest)
https://github.com/s3fs-fuse/s3fs-fuse
Moved from the N. Virginia region to the N. California region (10% higher cost)
The giant yellow bars show the files being moved with the AWS CLI (April 11 to 13)
In order to try to calm this down, we added to the mount command in /etc/fstab the following:
noatime,stat_cache_expire=3600,enable_noobj_cache
The bars that look uneven starting April 14 are now stable at around $25/day
The options that were already there have been present since the start (no change):
_netdev,allow_other,use_cache=/tmp,umask=0000,use_path_request_style,ensure_diskfree=10240
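For reference, a full /etc/fstab line combining the old and new options might look like the following; the bucket name and mount point here are placeholders, not the actual values:

mybucket /mnt/s3fs fuse.s3fs _netdev,allow_other,use_cache=/tmp,umask=0000,use_path_request_style,ensure_diskfree=10240,noatime,stat_cache_expire=3600,enable_noobj_cache 0 0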
We have done the following to try to debug this
Enabled S3 Logging
Dumped the logs into Athena and then exported a CSV into MySQL
These logs are just one day's worth
Screenshot "query 1" shows that there is 4.8m hits into a path ... basically, we think it is traversing the entire directory tree (with most like about 100k files) looking for a file if it exists
Screenshot "query 2" shows the same thing (kind of) where it is also doing down a path
Not really sure what else to do but our normal bill of about $5/day (including other services) is now about $25/day (5x increase) .. with the /etc/fstab changes, it is down to $13/day but still trying to get it to $5/day if we can get back to the zero ListBucket calls and 20% of the HeadObject calls.
Any ideas on what to try greatly appreciated.
The ListBucket and HeadObject API calls were being made by updatedb (and locate).
Solution: Add your mount point (in my case /mnt/s3fs) to PRUNEPATHS in /etc/updatedb.conf so that updatedb skips it when it scans.
https://linux.die.net/man/5/updatedb.conf
https://github.com/s3fs-fuse/s3fs-fuse/issues/193#issuecomment-109617253
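For example, the resulting line in /etc/updatedb.conf might look like this; the other paths are typical distribution defaults and will vary, the only addition that matters is the mount point:

PRUNEPATHS="/tmp /var/spool /media /mnt/s3fs"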

Aerospike cluster not cleaning available blocks

We use Aerospike in our projects and ran into a strange problem.
We have a 3-node cluster, and after some node restarts it stopped working.
So we put together a test to demonstrate our problem.
We made a test cluster: 3 nodes, replication factor = 2.
Here is our namespace config:
namespace test {
    replication-factor 2
    memory-size 100M
    high-water-memory-pct 90
    high-water-disk-pct 90
    stop-writes-pct 95
    single-bin true
    default-ttl 0
    storage-engine device {
        cold-start-empty true
        file /tmp/test.dat
        write-block-size 1M
    }
}
We wrote 100 MB of test data, and after that we had this situation:
available pct was about 66% and disk usage about 34%
All good :)
Then we stopped one node. After migration we saw available pct = 49% and disk usage = 50%
We returned the node to the cluster, and after migration disk usage went back to roughly its previous value of about 32%, but available pct on the old nodes stayed at 49%
We stopped a node one more time:
available pct = 31%
Repeating one more time, we got to this situation:
available pct = 0%
Our cluster crashed; clients got AerospikeException: Error Code 8: Server memory error
So how can we reclaim available pct?
If your defrag-q is empty (you can see whether it is by grepping the logs), then the issue is likely that your namespace is smaller than your post-write-queue. Blocks on the post-write-queue are not eligible for defragmentation, so you would see avail-pct trending down with no defragmentation to reclaim the space. By default the post-write-queue is 256 blocks, which with your 1M write-block-size equates to 256 MB. If your namespace is smaller than that, you will see avail-pct continue to drop until you hit stop-writes. You can reduce the size of the post-write-queue dynamically (i.e. no restart needed) using the following command; here I suggest 8 blocks:
asinfo -v 'set-config:context=namespace;id=<NAMESPACE>;post-write-queue=8'
If you are happy with this value you should amend your aerospike.conf to include it so that it persists after a node restart.
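A sketch of the namespace stanza from the question with that value persisted; only the post-write-queue line is new, everything else is unchanged:

namespace test {
    replication-factor 2
    memory-size 100M
    high-water-memory-pct 90
    high-water-disk-pct 90
    stop-writes-pct 95
    single-bin true
    default-ttl 0
    storage-engine device {
        cold-start-empty true
        file /tmp/test.dat
        write-block-size 1M
        post-write-queue 8
    }
}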

ServerXmlHttpRequest hanging sometimes when doing a POST

I have a job that periodically does some work involving ServerXmlHttpRequest to perform an HTTP POST. The job runs every 60 seconds.
And normally it runs without issue. But there's about a 1 in 50,000 chance (every two or three months) that it will hang:
IXMLHttpRequest http = new ServerXmlHttpRequest();
http.open("POST", deleteUrl, false, "", "");
http.send(stuffToDelete); <---hang
When it hangs, not even the Task Scheduler (with the option enabled to kill the job if it takes longer than 3 minutes to run) can end the task. I have to connect to the remote customer's network, get on the server, and use Task Manager to kill the process.
And then it's good for another month or three.
Eventually I started using Task Manager to create a process dump,
so I could analyze where the hang is. After five crash dumps (over the last 11 months or so) I get a consistent picture:
ntdll.dll!_NtWaitForMultipleObjects#20()
KERNELBASE.dll!_WaitForMultipleObjectsEx#20()
user32.dll!MsgWaitForMultipleObjectsEx()
user32.dll!_MsgWaitForMultipleObjects#20()
urlmon.dll!CTransaction::CompleteOperation(int fNested) Line 2496
urlmon.dll!CTransaction::StartEx(IUri * pIUri, IInternetProtocolSink * pOInetProtSink, IInternetBindInfo * pOInetBindInfo, unsigned long grfOptions, unsigned long dwReserved) Line 4453 C++
urlmon.dll!CTransaction::Start(const wchar_t * pwzURL, IInternetProtocolSink * pOInetProtSink, IInternetBindInfo * pOInetBindInfo, unsigned long grfOptions, unsigned long dwReserved) Line 4515 C++
msxml3.dll!URLMONRequest::send()
msxml3.dll!XMLHttp::send()
Contoso.exe!FrobImporter.TFrobImporter.DeleteFrobs Line 971
Contoso.exe!FrobImporter.TFrobImporter.ImportCore Line 1583
Contoso.exe!FrobImporter.TFrobImporter.RunImport Line 1070
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.HandleFrobImport Line 433
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.CoreExecute Line 71
Contoso.exe!CommandLineProcessor.TCommandLineProcessor.Execute Line 84
Contoso.exe!Contoso.Contoso Line 167
kernel32.dll!#BaseThreadInitThunk#12()
ntdll.dll!__RtlUserThreadStart()
ntdll.dll!__RtlUserThreadStart#8()
So I do a ServerXmlHttpRequest.send, and it never returns. It will sit there for days, causing the system to miss financial transactions, until come Sunday night I get a call that it's broken.
It is probably of no help unless someone knows how to debug at this level, but the registers in the stalled thread at the time of the dump are:
EAX 00000030
EBX 00000000
ECX 00000000
EDX 00000000
ESI 002CAC08
EDI 00000001
EIP 732A08A7
ESP 0018F684
EBP 0018F6C8
EFL 00000000
Windows Server 2012 R2
Microsoft IIS/8.5
Default timeouts of ServerXmlHttpRequest
You can use serverXmlHttpRequest.setTimeouts(...) to configure the four classes of timeouts:
resolveTimeout: The value is applied to mapping host names (such as "www.microsoft.com") to IP addresses; the default value is infinite, meaning no timeout.
connectTimeout: A long integer. The value is applied to establishing a communication socket with the target server, with a default timeout value of 60 seconds.
sendTimeout: The value applies to sending an individual packet of request data (if any) on the communication socket to the target server. A large request sent to a server will normally be broken up into multiple packets; the send timeout applies to sending each packet individually. The default value is 30 seconds.
receiveTimeout: The value applies to receiving a packet of response data from the target server. Large responses will be broken up into multiple packets; the receive timeout applies to fetching each packet of data off the socket. The default value is 30 seconds.
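For example, here is a sketch in the same style as the snippet above; the millisecond values are only illustrative, not recommendations:

http.setTimeouts(30000,   // resolveTimeout - the default is infinite
                 60000,   // connectTimeout
                 30000,   // sendTimeout
                 30000);  // receiveTimeout
http.open("POST", deleteUrl, false, "", "");
http.send(stuffToDelete);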
KB305053 (a server that decides to keep the connection open will cause ServerXmlHttpRequest to wait for the connection to close) seems like it could plausibly be the issue, but the 30-second default receive timeout should have taken care of that.
Possible workaround - Add myself to a Job
The Windows Task Scheduler is unable to terminate the task, even though the option to do so is enabled.
I will look into using the Windows Job API to add my own process to a job, and use SetInformationJobObject to set a time limit on the process:
CreateJobObject
AssignProcessToJobObject
SetInformationJobObject
to limit my process to three minutes of execution time:
PerProcessUserTimeLimit
If LimitFlags specifies JOB_OBJECT_LIMIT_PROCESS_TIME, this member is the per-process user-mode execution time limit, in 100-nanosecond ticks. Otherwise, this member is ignored. The system periodically checks to determine whether each process associated with the job has accumulated more user-mode time than the set limit. If it has, the process is terminated. If the job is nested, the effective limit is the most restrictive limit in the job chain.
Although, since the Task Scheduler itself uses Job objects to limit a task's time, I'm not hopeful that a Job object will be able to end the process either.
Edit: Job objects cannot limit a process by elapsed time - only by user-mode execution time. And a process that is sitting idle, waiting on an object, will not accumulate any user time - certainly not three minutes' worth.
Bonus Reading
How can a ServerXMLHTTP GET request hang? (GET, not POST)
KB305053: ServerXMLHTTP Stops Responding When You Send a POST Request (which says the timeout should expire; where mine does not)
MS Forums: oHttp.Send - Hangs (HEAD, not POST)
MS Forums: ASP to test SOAP WebService using MSXML2.ServerXMLHTTP Send hangs
CC to MS Support Forums
Consider switching to a newer, supported API.
msxml6.dll using MSXML2.ServerXMLHTTP.6.0
winhttpcom.dll using WinHttp.WinHttpRequest.5.1.
The msxml3.dll library is no longer supported and is only kept around for compatibility reasons. Plus, there were a number of security and stability improvements included with msxml4.dll (and newer) that you are missing out on.
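As a rough sketch in the same pseudocode style as the question, assuming the MSXML6 coclass ServerXMLHTTP60 (ProgID MSXML2.ServerXMLHTTP.6.0); how you actually instantiate it depends on your language's COM support:

IServerXMLHTTPRequest http = new ServerXMLHTTP60();  // from msxml6.dll instead of msxml3.dll
http.open("POST", deleteUrl, false, "", "");
http.send(stuffToDelete);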

How to optimize golang program that spends most time in runtime.osyield and runtime.usleep

I've been working on optimizing code that analyzes social graph data (with lots of help from https://blog.golang.org/profiling-go-programs) and I've successfully reworked a lot of slow code.
All data is loaded into memory from the db first, and the data analysis from there appears CPU bound (max memory consumption < 10 MB, one CPU core at 100%)
But now most of my program's time seems to be in runtime.osyield and runtime.usleep. What's the way to prevent that?
I've set GOMAXPROCS=1 and the code does not spawn any goroutines (other than what the golang libraries may call).
This is my top10 output from pprof
(pprof) top10
62550ms of 72360ms total (86.44%)
Dropped 208 nodes (cum <= 361.80ms)
Showing top 10 nodes out of 77 (cum >= 1040ms)
flat flat% sum% cum cum%
20760ms 28.69% 28.69% 20850ms 28.81% runtime.osyield
14070ms 19.44% 48.13% 14080ms 19.46% runtime.usleep
11740ms 16.22% 64.36% 23100ms 31.92% _/C_/code/sc_proto/cloudgraph.(*Graph).LeafProb
6170ms 8.53% 72.89% 6170ms 8.53% runtime.memmove
4740ms 6.55% 79.44% 10660ms 14.73% runtime.typedslicecopy
2040ms 2.82% 82.26% 2040ms 2.82% _/C_/code/sc_proto.mAvg
890ms 1.23% 83.49% 1590ms 2.20% runtime.scanobject
770ms 1.06% 84.55% 1420ms 1.96% runtime.mallocgc
760ms 1.05% 85.60% 760ms 1.05% runtime.heapBitsForObject
610ms 0.84% 86.44% 1040ms 1.44% _/C_/code/sc_proto/cloudgraph.(*Node).DeepestChildren
(pprof)
The _/C_/code/sc_proto/* functions are my code.
The output from pprof's web command is not reproduced here; a better, SVG version of the graph is at https://goo.gl/Tyc6X4
I found the answer myself, so I'm posting it here for anyone else who is having a similar problem. Special thanks to @JimB for sending me down the right path.
As can be seen from the graph, the paths which lead to osyield and usleep are garbage collection routines. This program was using a linked list that generated a lot of pointers, which created a lot of work for the GC, which occasionally blocked execution of my code while it cleaned up my mess.
Ultimately the solution to this problem came from https://software.intel.com/en-us/blogs/2014/05/10/debugging-performance-issues-in-go-programs (which was an awesome resource, by the way). I followed the instructions about the memory profiler there, and the recommendation to replace collections of pointers with slices cleared up my garbage collection issues; my code is much faster now!
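A minimal sketch of the kind of change that helped; the names below (listNode, weights, sumList, sumSlice) are made up for illustration, not from the original program, but they show how a pointer-heavy linked list can be replaced by a slice of values so the GC has far fewer objects to trace:

package main

import "fmt"

// Pointer-heavy representation: every node is a separate heap object,
// and every next pointer is something the GC has to trace.
type listNode struct {
	weight float64
	next   *listNode
}

// Value-based representation: one contiguous backing array, no per-element pointers.
type weights []float64

// sumList walks the linked list.
func sumList(head *listNode) float64 {
	total := 0.0
	for n := head; n != nil; n = n.next {
		total += n.weight
	}
	return total
}

// sumSlice walks the slice; same result, far less GC work.
func sumSlice(w weights) float64 {
	total := 0.0
	for _, v := range w {
		total += v
	}
	return total
}

func main() {
	// Build both representations with the same data.
	var head *listNode
	w := make(weights, 0, 1000)
	for i := 0; i < 1000; i++ {
		head = &listNode{weight: float64(i), next: head}
		w = append(w, float64(i))
	}
	fmt.Println(sumList(head), sumSlice(w))
}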

How to dump Permgen?

I wanted to take a dump of the permgen of an application server.
I do not want to use -XX:+TraceClassLoading -XX:+TraceClassUnloading, as I do not want to restart the server, and I do not want to use jconsole either.
Is there any tool like jmap (which I use for heap dumps; I didn't find any option for permgen) that can get the permgen, so that I only have to supply the pid?
jmap -permstat <pid>
is going to produce output like this:
30337 intern Strings occupying 2746200 bytes.
class_loader classes bytes parent_loader alive? type
<bootstrap> 2031 7253392 null live <internal>
0x517474f0 1 1760 null dead sun/reflect/DelegatingClassLoader#0x43f95d38
0x4f83f670 1 1744 0x4ebfb8e8 dead sun/reflect/DelegatingClassLoader#0x43f95d38
[...]
total = 287 10020 35889952 N/A alive=3, dead=284 N/A
This is not a full dump, but it will allow you to do some investigation.
I am still looking into how to find more information.
It is not possible to 'dump permgen' the way it is done for the heap.
In addition to jmap -permstat, which others have presented, you can analyze a standard heap dump to shed some light on your permanent generation, as described in this blog entry: 'The Unknown Generation: Perm'.
Because a heap dump does not really contain a lot of information about perm space, perm problems are difficult to tackle. Recently, I found this great article by Sporar, Sundararajan and Kieviet. The authors shed some light on the permanent generation. Of course, I had to check right away if and how I can use the Eclipse Memory Analyzer to analyze this “unknown” generation. This is what this blog is about.
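To produce the heap dump that approach works from, jmap with just the pid is enough (the file name is arbitrary), and the resulting file can then be opened in the Eclipse Memory Analyzer:

jmap -dump:live,format=b,file=heap.hprof <pid>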