How to open local bitcoin database - bitcoin

I am trying to extract data from the local Bitcoin database. As far as I know, bitcoin-qt uses BerkeleyDB. I installed BerkeleyDB from the Oracle website and found a DLL for .NET there: libdb_dotnet60.dll. When I try to open a file, I get a DatabaseException. Here is my code:
using BerkeleyDB;
class Program
{
static void Main(string[] args)
{
var btreeConfig = new BTreeDatabaseConfig();
var btreeDb = BTreeDatabase.Open(@"c:\Users\<user>\AppData\Roaming\Bitcoin\blocks\blk00000.dat", btreeConfig);
}
}
Does anyone have examples of how to work with a Bitcoin database (in any other language)?

What are you trying to extract? Only the wallet.dat file is a Berkeley DB database.
Blocks are stored one after another in the blkxxxxx.dat files. Each block is preceded by four bytes that identify the network and four bytes that give the block size.
An index of unspent outputs is stored as a LevelDB database.
Knowing what type of information you are looking for would help.
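The block-file layout described above can be read with a few lines of code. The following is a minimal sketch in Python, not Bitcoin Core's own reader: it assumes the mainnet network identifier f9 be b4 d9 and a little-endian 4-byte size, and it stops at the first record that does not start with that identifier (for example, zero padding at the end of a file). The function name iter_raw_blocks is made up for this example.

```python
import io
import struct

# Assumption: mainnet network identifier; other networks use other values.
MAINNET_MAGIC = bytes.fromhex("f9beb4d9")


def iter_raw_blocks(stream, magic=MAINNET_MAGIC):
    """Yield the raw bytes of each block in a blkNNNNN.dat-style stream."""
    while True:
        prefix = stream.read(8)  # 4 bytes network id + 4 bytes block size
        if len(prefix) < 8:
            return  # end of file or truncated record
        if prefix[:4] != magic:
            return  # padding or unknown data: stop reading
        (size,) = struct.unpack("<I", prefix[4:])  # little-endian uint32
        block = stream.read(size)
        if len(block) < size:
            return  # truncated block
        yield block


if __name__ == "__main__":
    # Synthetic data standing in for a real block file.
    fake = MAINNET_MAGIC + struct.pack("<I", 3) + b"abc"
    for raw in iter_raw_blocks(io.BytesIO(fake)):
        print(len(raw), "bytes")
```

Parsing the header and transactions inside each raw block is a separate step that this sketch does not attempt.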

There is a library, NBitcoin: https://github.com/MetacoSA/NBitcoin
How to enumerate blocks:
var store = new BlockStore(@"C:\Bitcoin\blocks\", Network.Main);
// this loop will enumerate all blocks ordered by height starting with genesis block
foreach (var block in store.EnumerateFolder())
{
var item = block.Item;
string blockID = item.Header.ToString();
foreach (var tx in item.Transactions)
{
string txID = tx.GetHash().ToString();
string raw = tx.ToHex();
}
}

In .NET you could use something like BitcoinBlockchain, which is available as a NuGet package at https://www.nuget.org/packages/BitcoinBlockchain/. Its usage is trivial. If you want to see how it is implemented, the sources are available on GitHub.
If you want to store the blockchain in a SQL database that you can query faster and in more ways than the raw blockchain, you could use something like the BitcoinDatabaseGenerator tool available at https://github.com/ladimolnar/BitcoinDatabaseGenerator.

Batch read from DBs

I'm a bit confused about how Go's database/sql package reads large datasets into memory. In a previous Stack Overflow question, How to set fetch size in golang?, there seem to be conflicting ideas on whether large result sets are read in batches.
I am writing a Go binary that connects to different remote databases based on the input parameters, fetches the results, and converts them to a CSV file. Suppose I have a query that returns a lot of rows, say 20 million. Loading them all into memory at once would be very expensive. Does the library batch the results automatically and load the next batch into memory only on rows.Next()?
If the database/sql package does not handle it, are there options in the various driver packages?
https://github.com/golang/go/issues/13067 - From this issue and discussion, I understand that the general idea is to have the driver packages handle this. As mentioned in the issue and also in this blog post, https://oralytics.com/2019/06/17/importance-of-setting-fetched-rows-size-for-database-query-using-golang/, Go's Oracle driver package has an option that I can pass for batching. But I am not able to find an equivalent in the other driver packages.
To summarize:
1. Does database/sql batch read results automatically?
   If yes, then my 2nd and 3rd questions do not matter.
2. If no, are there options that I can pass to the various driver packages to set the batch size, and where can I find what these options are? I have already looked at the pgx docs and cannot find anything there that sets a batch size.
3. Is there any other way to batch reads, such as a prepared statement with a configuration specifying the batch size?
Some clarifications:
My question is: when a query returns a large dataset, is the entire dataset loaded into memory, or is it batched internally by some code that is called downstream from rows.Next?
From what I can see, a chunk reader is created with a default size of 8 KB and is used to chunk the results. Are there cases where this does not happen, or are the results from the database always chunked?
Is the 8 KB buffer size that the chunk reader uses configurable?
For more clarity, I am adding the existing Java implementation, which I am looking to rewrite in Go.
private static final int RESULT_SIZE = 10000;
private void generate() {
... //connection and other code...
Statement stmt = connection.createStatement(ResultSet.TYPE_FORWARD_ONLY,
ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(RESULT_SIZE);
ResultSet resultset = stmt.executeQuery(dataQuery);
String fileInHome = getFullFileName(filePath, manager, parentDir);
rsToCSV(resultset, new BufferedWriter(new FileWriter(fileInHome)));
}
private void rsToCSV(ResultSet rs, BufferedWriter os) throws SQLException {
ResultSetMetaData metaData = rs.getMetaData();
int columnCount = metaData.getColumnCount();
try (PrintWriter pw = new PrintWriter(os)) {
readHeaders(metaData, columnCount, pw);
if (rs.next()) {
readRow(rs, metaData, columnCount, pw);
while (rs.next()) {
pw.println();
readRow(rs, metaData, columnCount, pw);
}
}
}
}
The stmt.setFetchSize(RESULT_SIZE); call hints to the driver how many rows to fetch from the database per round trip; the rows are then written to the CSV one by one.
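For comparison, the same fetch-in-batches idea can be sketched with Python's DB-API, where cursor.fetchmany(n) returns at most n rows per call. This is only an illustration of the pattern, not a statement about what database/sql or any Go driver does internally; the in-memory SQLite database, the table t, and the function name export_query_to_csv are assumptions made for the example.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, out, batch_size=10000):
    """Write the rows of `query` to `out` as CSV, `batch_size` rows at a time."""
    cur = conn.cursor()
    cur.execute(query)
    writer = csv.writer(out, lineterminator="\n")
    # Header row taken from the cursor's column metadata.
    writer.writerow([col[0] for col in cur.description])
    total = 0
    while True:
        batch = cur.fetchmany(batch_size)  # holds at most batch_size rows
        if not batch:
            break
        writer.writerows(batch)
        total += len(batch)
    cur.close()
    return total


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     [(i, f"row{i}") for i in range(25)])
    buf = io.StringIO()
    export_query_to_csv(conn, "SELECT id, name FROM t ORDER BY id", buf, 10)
    print(buf.getvalue())
```

Whether a batch size like this also limits what the driver buffers from the network depends on the driver, which is exactly the open question above.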

OutOfMemory on custom extractor

I have stitched a lot of small XML files into one file, and then made a custom extractor to return rows with one byte array that corresponds to each file.
Run on remote/master:
Run it for one file (gzipped, 11 MB): it works fine.
Run it for more than one file: I get a System.OutOfMemoryException.
Run on local/master:
Run it for one or more files (gzipped, 500+ MB): it works fine.
Extractor looks like this:
public override IEnumerable<IRow> Extract(IUnstructuredReader input, IUpdatableRow output)
{
using (var stream = new StreamReader(input.BaseStream))
{
var xml = stream.ReadToEnd();
// Clean stitched XML
xml = UtilsXml.CleanXml(xml);
// Get nodes - one for each stitched file
var d = new XmlDocument();
d.LoadXml(xml);
var root = d.FirstChild;
for (int i = 0; i < root.ChildNodes.Count; i++)
{
output.Set<object>(1, Encoding.ASCII.GetBytes(root.ChildNodes[i].OuterXml.ToString()));
yield return output.AsReadOnly();
}
yield break;
}
}
and error message looks like this:
==== Caught exception System.OutOfMemoryException
at System.Xml.XmlDocument.CreateTextNode(String text)
at System.Xml.XmlLoader.LoadAttributeNode()
at System.Xml.XmlLoader.LoadNode(Boolean skipOverWhitespace)
at System.Xml.XmlLoader.LoadDocSequence(XmlDocument parentDoc)
at System.Xml.XmlDocument.Load(XmlReader reader)
at System.Xml.XmlDocument.LoadXml(String xml)
at Microsoft.Analytics.Tools.Formats.Text.XmlByteArrayRowExtractor.<Extract>d__0.MoveNext()
at ScopeEngine.SqlIpExtractor<ScopeEngine::GZipInput,Extract_0_Data0>.GetNextRow(SqlIpExtractor<ScopeEngine::GZipInput\,Extract_0_Data0>* , Extract_0_Data0* output) in d:\data\ccs\jobs\bc367467-ef86-43d2-a937-46ba2d4cc524_v0\sqlmanaged.h:line 1924
So what am I doing wrong? And how do I debug this on remote?
Thanks!
Unfortunately local run does not enforce memory allocations, so you would have to check memory in local vertex debug yourself.
Looking at your code above, I see that you are loading XML documents into a DOM. Please note that an XML DOM can explode the data size from the string representation up to a factor of 10 or more (I have seen 2 to 12 in my times as the resident SQL XML guru).
Each UDO today only gets 0.5 GB of RAM to play with, so I assume that your XML DOM document(s) grow beyond that.
The usual recommendation is to use the XmlReader interface (there is a reader-based extractor in the samples on http://usql.io as well) and scan through the document(s) to find the information you are looking for.
If your documents are always small enough (e.g., <20MB), you may want to make sure that you release the memory of the other documents and operate one document at a time.
We do have plans to allow you to annotate your UDO with memory needs, but that is still a bit out.
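The streaming approach recommended above can be sketched as follows. The example is in Python rather than C#, with xml.etree.ElementTree.iterparse playing the role of XmlReader, and it assumes the stitched file has a single root element whose direct children are the original documents. The function name iter_child_documents is hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def iter_child_documents(stream):
    """Yield each direct child of the root element as serialized bytes."""
    depth = 0
    root = None
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            if root is None:
                root = elem  # first element seen is the root
            depth += 1
        else:
            depth -= 1
            if depth == 1:
                # elem is a complete direct child of the root.
                elem.tail = None  # drop whitespace between documents
                yield ET.tostring(elem)
                root.remove(elem)  # release it so the tree does not grow


if __name__ == "__main__":
    stitched = b"<files><doc id='1'><a>x</a></doc><doc id='2'/></files>"
    for doc in iter_child_documents(io.BytesIO(stitched)):
        print(len(doc), "bytes")
```

Because each child is removed from the tree after it is yielded, the tree does not accumulate documents that have already been processed, instead of holding the whole stitched file as a DOM.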

How to access files stored in SQL Server's FileTable?

As far as I know, SQL Server has had a new feature, FileTable, since version 2012. It allows us to store files in the file system and to use them from T-SQL.
I am trying to use this feature and I have no idea how to do it properly.
Generally, I don't know how to access files stored in a FileTable. Let's suppose I have an ASP.NET MVC app with a lot of images that I show on web pages in img tags. I would like to store these images in a FileTable and access them as files from the file system. But I don't know where these files are stored or how to use them as files. Currently my images are stored in the web application directory, in the images folder, and I write something like this:
<img src='/images/mypicture.png' />
And if I move my images to a FileTable, what should I write in src?
<img src='path-toimage-in-filetable' />
I don't think you still need this, anyways I'll post my answer for anyone else interested.
First, a FileTable is still a table, so to read data from it you need a SELECT statement, something like:
select name, file_stream from filetable_name
where
name = 'file_name' and
file_type = 'file_extension'
Just execute a statement like this in your ASP.NET app, then fetch the results and use the file_stream column to get the binary data of the stored file. To serve the file to an HTML page, first create an action in your controller that returns the retrieved file:
public ActionResult GetFile(){
..
return File(file.file_stream,file.file_type);
}
After this, put something like this in your HTML tag:
<img src="/controller/GetFile" />
Hope this helps!
If you want to know the schema of a FileTable, see here.
I assume that by FileTable you actually mean FILESTREAM. A couple of notes about that:
This feature is best used if your files are actually files
The files should be, on average, larger than 1 MB, although there can be exceptions to this rule. If they are smaller than 1 MB on average, you may be better off using a VARBINARY(MAX) or XML data type as appropriate; for images of only a few KB, consider a VARBINARY(MAX) column.
Accessing these files will require an open transaction and that the database is properly configured for FILESTREAM
You can get some significant advantages bypassing the normal SQL engine/database file method of data access by telling SQL Server that you want to access the file directly, however it's not meant for directly accessing the file on the file system and attempting to do so can break SQL's management of these files (transactional consistency, tracking, locking, etc.).
It's pretty likely that your use case would be better served by using a CDN and storing image URLs in the table, if you really need SQL for this. You can use FILESTREAM to do this (see the code sample below for one implementation), but you'll be hammering your SQL Server for every request unless you also store the images somewhere that the browser can properly cache (my example doesn't do that). And if you store them somewhere else for rendering in the browser, you might as well store them there to begin with (you won't have transactional consistency for those images once they're copied to some other drive/disk/location anyway).
With all that said, here's an example of how you'd access the FILESTREAM data using ADO.NET:
public static string connectionString = ...; // get your connection string from encrypted config
// assumes your FILESTREAM data column is called Img in a table called ImageTable
const string sql = @"
SELECT
Img.PathName(),
GET_FILESTREAM_TRANSACTION_CONTEXT()
FROM ImageTable
WHERE ImageId = @id";
public string RetreiveImage(int id)
{
string serverPath;
byte[] txnToken;
string base64ImageData = null;
using (var ts = new TransactionScope())
{
using (var conn = new SqlConnection(connectionString))
{
conn.Open();
using (SqlCommand cmd = new SqlCommand(sql, conn))
{
cmd.Parameters.Add("@id", SqlDbType.Int).Value = id;
using (SqlDataReader rdr = cmd.ExecuteReader())
{
rdr.Read();
serverPath = rdr.GetSqlString(0).Value;
txnToken = rdr.GetSqlBinary(1).Value;
}
}
using (var sfs = new SqlFileStream(serverPath, txnToken, FileAccess.Read))
{
// sfs will now work basically like a FileStream. You can either copy it locally or return it as a base64 encoded string
using (var ms = new MemoryStream())
{
sfs.CopyTo(ms);
base64ImageData = Convert.ToBase64String(ms.ToArray());
}
}
}
ts.Complete();
// assume this is PNG image data; replace png with jpeg etc. as appropriate. Might store the type in the table if it will vary...
return "data:image/png;base64," + base64ImageData;
}
}
Obviously, if you have lots of images to handle like this, this is not an ideal method: don't try to make an instance of SQL Server into what you should be using a CDN for. However, if you have other really good reasons, you should try to grab as many images as possible in a single request/transaction (e.g. if you know you're displaying 50 images on a page, get all 50 with a single transaction scope, don't use 50 transaction scopes; this code won't handle that).
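The string that the method above returns is a data URI of the form data:<MIME type>;base64,<payload>, where the registered MIME type for PNG is image/png. As a small sketch of just that last step, in Python and with a hypothetical helper name to_data_uri:

```python
import base64


def to_data_uri(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw bytes as a base64 data URI usable in an img src attribute."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


if __name__ == "__main__":
    # Placeholder bytes standing in for real image data.
    print(to_data_uri(b"abc"))
```

Base64 encoding makes the payload about one third larger than the raw bytes, so inlining images this way is mainly suited to small ones.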

Updating Data Source Login Credentials for SSRS Report Server Tables

I have added a lot of reports with an invalid data source login to an SSRS report server, and I want to update the user name and password with a script so I don't have to update each report individually.
However, from what I can tell, the fields are stored as image columns and are encrypted. I can't find anything about how they are encrypted or how to update them. It appears that the user name and password are stored in the dbo.DataSource table. Any ideas? I want the script to run in SQL.
Example Login Info:
I would be very, very, VERY leery of hacking the Reporting Services tables. It may be that someone out there can offer a reliable way to do what you suggest, but it strikes me as a good way to clobber your entire installation.
My suggestion would be that you make use of the Reporting Services APIs and write a tiny app to do this for you. The APIs are very full-featured -- pretty much anything you can do from the Report Manager website, you can do with the APIs -- and fairly simple to use.
The following code does NOT do exactly what you want -- it points the reports to a shared data source -- but it should show you the basics of what you'd need to do.
public void ReassignDataSources()
{
using (ReportingService2005 client = new ReportingService2005())
{
var reports = client.ListChildren(FolderName, true).Where(ci => ci.Type == ItemTypeEnum.Report);
foreach (var report in reports)
{
SetServerDataSource(client, report.Path);
}
}
}
private void SetServerDataSource(ReportingService2005 client, string reportPath)
{
var itemSources = client.GetItemDataSources(reportPath);
if (itemSources.Any())
client.SetItemDataSources(
reportPath,
new DataSource[] {
new DataSource() {
Item = CreateServerDataSourceReference(),
Name = itemSources.First().Name
}
});
}
private DataSourceDefinitionOrReference CreateServerDataSourceReference()
{
return new DataSourceReference() { Reference = _DataSourcePath };
}
I doubt this answers your question directly, but I hope it can offer some assistance.
MSDN Specifying Credentials
MSDN also suggests using shared data sources for this very reason: See MSDN on shared data sources

How can I get references to BlockBlob objects from CloudBlobDirectory.ListBlobs?

I am using the Microsoft Azure .NET client libraries to interact with Azure cloud storage. I need to be able to access additional information about each blob in its metadata collection. I am currently using CloudBlobDirectory.ListBlobs() method to get a list of blobs in a particular directory of a directory structure I've devised in the blob names. The ListBlobs() method returns a list of IListBlobItem objects. They only have a couple of properties: Url and references to parent directory and parent container. I need to get to the metadata of the actual blob objects.
I envisioned there would be a way to either cast the IListBlobItem to a BlockBlob object or use the IListBlobItem to get a reference to the BlockBlob, but I can't seem to find a way to do that.
My question is: Is there a way to get a BlockBlob object from this method, or do I have to use a different way of getting the actual BlockBlob objects? If different, then can you suggest a way to achieve this, while also being able to filter by the "directory" scheme?
OK... I found a way to do this, and while it seems a little clunky and indirect, it does achieve the main thing I thought should be doable, which is to cast the IListBlobItem directly to a CloudBlockBlob object.
What I am doing is getting the list from the Directory object's ListBlobs() method and then looping over each item in the list and casting the item to a CloudBlockBlob object and then calling the FetchAttributes() method to retrieve the properties (including the metadata). Then add a new "info" object to a new list of info objects. Here's the code I'm using:
CloudBlobDirectory dir = container.GetDirectoryReference(dirPath);
var blobs = dir.ListBlobs(true);
foreach (IListBlobItem item in blobs)
{
CloudBlockBlob blob = (CloudBlockBlob)item;
blob.FetchAttributes();
files.Add(new ImageInfo
{
FileUrl = item.Uri.ToString(),
FileName = item.Uri.PathAndQuery.Replace(restaurantId.ToString().PadLeft(3, '0') + "/", ""),
ImageName = blob.Metadata["Name"]
});
}
The whole "Blob" concept seems needlessly complex and doesn't seem to achieve what I'd have thought would have been one of the main features of the Blob wrapper. That is, a way to expand search capabilities by allowing a query over name, directory, container and metadata. I'd have thought you could construct a linq query that would read somewhat like: "return a list of all blobs in the 'images' container, that are in the 'natural/landscapes/' directory path that have a metadata key of 'category' with the value of 'sunset'". There doesn't seem to be a way to do that and that seems to be a missed opportunity to me. Oh, well.
If I'm wrong and way off base here, please let me know.
This approach was developed for Java, but I hope it can be adapted to any other supported language. Although the functionality you ask for has not been explicitly implemented yet, I think I found a different (hopefully less clunky) way to access CloudBlockBlob data from a ListBlobItem element.
The following code can be used to delete, for example, every blob inside a specific directory.
String blobUri;
CloudBlobClient blobClient = /* Obtain your blob client */
try{
CloudBlobContainer container = /* Obtain your blob container */
for (ListBlobItem blobItem : container.listBlobs(blobPrefix)) {
if (blobItem instanceof CloudBlob) {
CloudBlob blob = (CloudBlob) blobItem;
if (blob.exists()){
System.out.println("Deleting blob " + blob.getName());
blob.delete();
}
}
}
}catch (URISyntaxException | StorageException ex){
Logger.getLogger(BlobOperations.class.getName()).log(Level.SEVERE, null, ex);
}
The previous answers are good. I just wanted to point out 2 things:
1) Asynchronous programming is recommended nowadays and is supported by the Azure SDK, so try to use it:
CloudBlobDirectory dir = container.GetDirectoryReference(dirPath);
var blobs = dir.ListBlobs(true);
foreach (IListBlobItem item in blobs)
{
CloudBlockBlob blob = (CloudBlockBlob)item;
await blob.FetchAttributesAsync(); //Use async calls...
}
2) Fetching metadata in a separate call is inefficient: on top of the listing request, the code makes one more HTTP request per blob. The ListBlobs() method can return the metadata in the same call if you set the BlobListingDetails parameter:
CloudBlobDirectory dir = container.GetDirectoryReference(dirPath);
var blobs = dir.ListBlobs(useFlatBlobListing: true, blobListingDetails: BlobListingDetails.Metadata);
I recommend using the second snippet if possible, since it is the most efficient way to fetch metadata.