Best practice for retrieving data from the blockchain when data is hashed - solidity

I read a very good article on "Programming Smart Contracts on Ethereum" in the Jan/Feb 2023 issue of Code magazine.
The author uses Solidity to create a contract to read/write a string containing sensitive data. He first base64 encodes a JSON string and then passes it to the contract which is converted to a hash (to reduce gas fees) and written to the blockchain.
However, the only method in the contract is to verify that the proof exists by looking for the hash. This means you have to know ahead of time the data that was stored, encode it, and then search for the proof based on that hash.
But if you don't know what data you need, you can't create a hash. e.g. Retrieving all customer payment transactions
The approach I see to this is storing the block number to a local database. Then fetch the data using the block number.
example db table:
tblTransactions
-customerID(int)
-blockNumber(string)
My questions are:
Can you retrieve data using a block number?
Is this the best approach?

Related

Complete Cryptocurrencies Data

I am working on cryptocurrencies blockchain data and I want data from the beginning of the time of that particular cryptocurrency. Is there any way to download complete block data in the Postgresql file?
https://blockchair.com/dumps is although offering this but they limit the download speed and number of downloading files. Moreover, I am also waiting for their reply. Meanwhile, I am finding some other ways or websites to download complete data of multiple cryptocurrencies in SQL format. I cannot download the .csv or .tsv file because it takes a lot of space on my laptop. Therefore, I want to use any other format (preferably .sql format)
This is depends on cryptocurrency, you have. I can suggest you, how to fetch data from Bitcoin-compatible crypto, i.e. Bitcoin, Emercoin, etc.
The cryptocurrency node (wallet) has JSON RPC API interface.Usinf this API, you retrieve all your data with following commands from this command list:
Get total block counter with command getblockcount.
Iterate block number from 0 to result of getblockcount. For each number, call getblockhash.
For result of getblockhash call getblock. This function provide transactions list, enclosed in this block.
For ech transaction (nested loop), call getrawtransaction. Hint: if you call getrawtransaction with 3rd argument "1", node automatically decodes transaction, and return you decoded transaction in JSON.
You can extract from a transaction vectors vin and vout, and upload all data into your SQL database.

How to properly store a JSON object into a Table?

I am working on a scenario where I have invoices available in my Data Lake Store.
Invoice example (extremely simplified):
{
"business_guid":"b4f16300-8e78-4358-b3d2-b29436eaeba8",
"ingress_timestamp": 1523053808,
"client":{
"name":"Jake",
"age":55
},
"transactions":[
{
"name":"peanut",
"amount":100
},
{
"name":"avocado",
"amount":2
}
]
}
All invoices are stored in ADLS, and can be queried. But, It is my desire to provide access to the same data inside an ALD DB.
I am not an expert on unstructed data: I have RDBMS background. Taking that into consideration, I can only think of 2 possible scenarios:
2/3 tables - invoice, client (could be removed) and transaction. In this scenario, I would have to create an invoice ID to be able to build relationships between those tables
1 table - client info could be normalized into invoice data. But, transactions could (maybe) be defined as an SQL.ARRAY<SQL.MAP<string, object>>
I have mainly 3 questions:
What is the correct way of doing so? Solution 1 seems much better structured.
If I go with solution 1, how do I properly create an ID (probably GUID)? Is it acceptable to require ID creation when working with ADL?
Is there another solution I am missing here?
Thanks in advance!
This type of question is a bit like do you prefer your sauce on the pasta or next to the pasta :). The answer is: it depends.
To answer your 3 questions more seriously:
#1 has the benefit of being normalized that works well if you want to operate on the data separately (e.g., just clients, just invoices, just transactions) and want to the benefits of normalization, get the right indexing, and are not limited by the rowsize limits (e.g., your array of map needs to fit into a row). So I would recommend that approach unless your transaction data is always small and you always access the data together and mainly search on the column data.
U-SQL per se has no understanding of the hierarchy of the JSON document. Thus, you would have to write an extractor that turns your JSON into rows in a way that it either gives you the correlation of the parent to the child (normally done by stepwise downwards navigation with cross apply) and use the key value of the parent data item as the foreign key, or have the extractor generate the key (as int or guid).
There are some sample JSON extractors on the U-SQL GitHub site (start at http://usql.io) that can get you started with the JSON to rowset conversion. Note that you will probably want to optimize the extraction at some point to be JSON Reader based so you process larger docs without loading it into memory.

Retrieving 'to' and 'from' addresses in transaction directly from blockchain

I'm trying to get a list of addresses that have made transactions with a given bitcoin address for a project that examines how people use bitcoin for non-nefarious purposes. I've got a lot of addresses so a web based blockchain explorer like blockchain.info isn't practical.
I've downloaded the blockchain and used bitcoin-abe to dump it into a sqlite database. However I'm not finding addresses anywhere. Are the actual addresses called something different in the blockchain?
The spending conditions, i.e., who is able to spend a given output, are encoded as scripts in the output. What is commonly referred to as a Bitcoin address is little more than a default script format (either pay-to-pubkey or pay-to-pubkey-hash) which require a signature from a private key matching the pubkey in the script. For example P2PKH scripts look like this:
OP_DUP OP_HASH160 <PubkeyHash> OP_EQUALVERIFY OP_CHECKSIG
This checks that the pubkey on the stack matches the hash, and then checks that the signature and pubkey are valid for the transaction.
ABE stores the output scripts, but appears not to create an index for the addresses. So you probably want to convert the addresses that you're looking for into the script version (see the wiki for details on how to extract the pubkey hash or pubkey from the address). Once you have the pubkey hash or pubkey you construct a binary script similar to this (hexencoded):
76a914<pubkey-hash>88ac
You should then be able to search for these in the database ABE gives you.
You need to write a cron job using the BTC address of the user and check whether the transaction is made or not.
https://www.blockchain.com/explorer
Ex. https://github.com/bitpay/insight

Bitcoin : Edit CTxOut class in bitcoins source

Can we edit and add new field to CTxOut class in order to send additional information to transaction object and then to the blockchain?
I got the following answer from here
Note that you should only do this, putting extra data into the block chain, if it is really necessary. The block chain has to be stored by every full node, so try not to take up all our hard drive space with unnecessary stuff whenever possible.
With that said, if you do want to add extra data to your transaction, then add an additional output to the transaction, for which the scriptPubKey has the following form:
OP_RETURN {80 bytes of whatever data you want}
80 bytes was chosen because it is big enough for a 64 byte hash and 16 other extra bytes of data, but not big enough to store anything maliciously big (like a movie collection). This transaction output is automatically un-spendable, and so will not be kept in the UTXO set in any pruning. The other UTXOs from your transaction will still be safe.

Searching Authorize.net CIM Records

Has anyone come up with an elegant way to search data stored on Authorize.net's Customer Information Manager (CIM)?
Based on their XML Guide there doesn't appear to be any search capabilities at all. That's a huge short-coming.
As I understand it, the selling point for CIM is that the merchant doesn't need to store any customer information. They merely store a unique identifier for each and retrieve the data as needed. This may be great from a PCI Compliance perspective, but it's horrible from a flexibility standpoint.
A simple search like "Show me all orders from Texas" suddenly becomes very complicated.
How are the rest of you handling this problem?
The short answer is, you're correct: There is no API support for searching CIM records. And due to the way it is structured, there is no easy way to use CIM alone for searching all records.
To search them in the manner you describe:
Use getCustomerProfileIdsRequest to get all the customer profile IDs you have stored.
For each of the CustomerProfileIds returned by that request, use getCustomerProfileRequest to get the specific record for that client.
Examine each record at that time, looking for the criterion you want, storing the pertinent records in some other structure; a class, a multi-dimensional array, an ADO DataTable, whatever.
Yes, that's onerous. But it is literally the only way to proceed.
The previously mentioned reporting API applies only to transactions, not the Customer Information Manager.
Note that you can collect the kind of data you want at the time of recording a transaction, and as long as you don't make it personally identifiable, you can store it locally.
For example, you could run a request for all your CIM customer profile records, and store the state each customer is from in a local database.
If all you store is the state, then you can work with those records, because nothing ties the state to a specific customer record. Going forward, you could write logic to update the local state record store at the same time customer profile records are created / updated, too.
I realize this probably isn't what you wanted to hear, but them's the breaks.
This is likely to be VERY slow and inefficient. But here is one method. Request an array of all the customer Id's, and then check each one for the field you want... in my case I wanted a search-by-email function in PHP:
$cimData = new AuthorizeNetCIM;
$profileIds = $cimData->getCustomerProfileIds();
$profileIds = $cimData->getCustomerProfileIds();
$array = $profileIds->xpath('ids');
$authnet_cid = null;
/*
this seems ridiculously inefficient...
gotta be a better way to lookup a customer based on email
*/
foreach ( $array[0]->numericString as $ids ) { // put all the id's into an array
$response = $cimData->getCustomerProfile($ids); //search an individual id for a match
//put the kettle on
if ($response->xml->profile->email == $email) {
$authnet_cid = $ids;
$oldCustomerProfile = $response->xml->profile;
}
}
// now that the tea is ready, cream, sugar, biscuits, you might have your search result!
CIM's primary purpose is to take PCI compliance issues out of your hands by allowing you to store customer data, including credit cards, on their server and then access them using only a unique ID. If you want to do reporting you will need to keep track of that kind of information yourself. Since there's no PCI compliance issues with storing customer addresses, etc, it's realistic to do this yourself. Basically, this is the kind of stuff that needs to get flushed out during the design phase of the project.
They do have a new reporting API which may offer you this functionality. If it does not it's very possible it will be offered in the near future as Authnet is currently actively rolling out lots of new features to their APIs.