CouchDB 2.0: update_seq is not a number - couchdb-2.0

According to the official documentation for CouchDB 2.0:
http://docs.couchdb.org/en/2.0.0/api/database/common.html
GET /{db}
Gets information about the specified database.
Parameters:
db – Database name
Request Headers:
Accept – application/json, text/plain
Response Headers:
Content-Type – application/json, text/plain; charset=utf-8
Response JSON Object:
committed_update_seq (number) – The number of committed update.
compact_running (boolean) – Set to true if the database compaction routine is operating on this database.
db_name (string) – The name of the database.
disk_format_version (number) – The version of the physical format used for the data when it is stored on disk.
data_size (number) – The number of bytes of live data inside the database file.
disk_size (number) – The length of the database file on disk. Views indexes are not included in the calculation.
doc_count (number) – A count of the documents in the specified database.
doc_del_count (number) – Number of deleted documents
instance_start_time (string) – Timestamp of when the database was opened, expressed in microseconds since the epoch.
purge_seq (number) – The number of purge operations on the database.
**update_seq (number) – The current number of updates to the database.**
Status Codes:
200 OK – Request completed successfully
404 Not Found – Requested database not found
The update_seq should be returned as a number, but when we run the request
http://192.168.1.48:5984/testing **(CouchDB 2.0)** the response is
{"db_name":"testing","update_seq":"0-g1AAAAFTeJzLYWBg4MhgTmEQTM4vTc5ISXLIyU9OzMnILy7JAUoxJTIkyf___z8rkQGPoiQFIJlkT1idA0hdPGF1CSB19QTV5bEASYYGIAVUOp8YtQsgavcTo_YARO19YtQ-gKgFuTcLANRjby4","sizes":{"file":33952,"external":0,"active":0},"purge_seq":0,"other":{"data_size":0},"doc_del_count":0,"doc_count":0,"disk_size":33952,"disk_format_version":6,"data_size":0,"compact_running":false,"instance_start_time":"0"}
Previously, in CouchDB 1.6.1, when we ran the request
http://192.168.1.80:5984/learners (CouchDB 1.6.1) the response was
{"db_name":"learners","doc_count":0,"doc_del_count":3,**"update_seq":6**,"purge_seq":0,"compact_running":false,"disk_size":12386,"data_size":657,"instance_start_time":"1487830025605920","disk_format_version":6,"committed_update_seq":6}
So please explain: is this an exception in CouchDB 2.0, or something else?

The CouchDB docs are not up to date on this one. CouchDB 2.0 introduced clustering, and with clustering the update_seq had to be changed to a unique string.
You should treat the update_seq as an opaque identifier, not as something with an inherent meaning. If the update_seq has changed, the database itself has changed.
That said, the first part of the update_seq is a number, so if you really need the numeric sequence, you can parse it. But I would strongly advise against relying on it, because the update_seq format might change in a future version of CouchDB.
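For example, a minimal Java sketch of that parsing (the helper name is mine; it just assumes the current "N-<opaque>" layout holds):

// Best-effort extraction of the numeric prefix of a CouchDB 2.0 update_seq.
// The format is not part of the API contract, so don't build logic on it.
static long updateSeqPrefix(String updateSeq) {
    int dash = updateSeq.indexOf('-');
    return Long.parseLong(dash >= 0 ? updateSeq.substring(0, dash) : updateSeq);
}

// updateSeqPrefix("0-g1AAAAFTeJzL...") returns 0, matching the response above.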

Related

How to set TTL on RocksDB properly?

I am trying to use RocksDB with TTL. The way I initialise RocksDB is as below:
Options options = new Options().setCreateIfMissing(true)
        .setWriteBufferSize(8 * SizeUnit.KB).setMaxWriteBufferNumber(3)
        .setCompressionType(CompressionType.LZ4_COMPRESSION).setKeepLogFileNum(1);
// open the database with a TTL of 10 seconds (readOnly = false)
db = TtlDB.open(options, this.dbpath, 10, false);
I have set the TTL to 10 seconds, but the key-value pairs are not being deleted after 10 seconds. What's happening here?
That's by design:
This API should be used to open the db when key-values inserted are meant to be removed from the db in a non-strict 'ttl' amount of time therefore, this guarantees that key-values inserted will remain in the db for at least ttl amount of time and the db will make efforts to remove the key-values as soon as possible after ttl seconds of their insertion
-- from the RocksDB Wiki-page on TTL.
That means values are only removed during compaction, and staleness is not checked during reads.
One of the good things about RocksDB is that its source is quite readable. The files you would want to look at are the header and source for TtlDb. In the header you will find the compaction filter which removes stale values (the compaction's Filter contract is documented well in its header). In the TtlDb source you can verify for yourself that Get does not check whether the value is stale; it just strips the timestamp (which simply gets appended to the value on insert).
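If you want to see that behaviour for yourself, here is a small sketch against the RocksJava binding (path and timings are arbitrary; compactRange() simply forces a compaction so the TTL filter gets a chance to run):

import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.TtlDB;

public class TtlCompactionDemo {
    public static void main(String[] args) throws Exception {
        RocksDB.loadLibrary();
        try (Options options = new Options().setCreateIfMissing(true);
             TtlDB db = TtlDB.open(options, "/tmp/ttl-demo", 10, false)) {
            db.put("key".getBytes(), "value".getBytes());
            Thread.sleep(11_000);                                 // wait past the 10 s TTL
            System.out.println(db.get("key".getBytes()) != null); // true: stale value still readable
            db.compactRange();                                    // compaction applies the TTL filter
            System.out.println(db.get("key".getBytes()) != null); // false once the entry is dropped
        }
    }
}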

What is the best method to extract recurring blob data and put it in another table? - SQL

I'm developing a new webpage (in the .NET Framework, if that helps) for the scenario below. Every single day, we get a cab drivers report.
Date | Blob
-------------------------------------------------------------
15/07 | {"DriverName1":"100kms", "DriverName2":"10kms", "Hash":"Value"...}
16/07 | {"DriverName1":"50kms", "DriverName3":"100kms", "Hash":"Value"}
Notice that the 'Blob' is the actual data received in JSON format; it contains information about the distance covered by each driver on that particular day.
I have written a service which reads the above table, breaks the blob down further, and puts it into a new table like the one below:
Date  | DriverName  | KmsDriven
15/07 | DriverName1 | 100
15/07 | DriverName2 | 10
16/07 | DriverName3 | 100
16/07 | DriverName1 | 50
By populating this, I can easily do the following queries:
How many drivers drove on a particular day.
How 'DriverName1' did for a particular week, etc.
My questions here are:
Is there anything in the .NET / SQL world that specifically addresses this, or am I reinventing the wheel here?
Is this the right way to use the Blob data?
Are there any design patterns to adhere to here?
Is there anything in the .NET / SQL world that specifically addresses this, or am I reinventing the wheel here?
Well, there are JSON parsers available, for example Newtonsoft's Json.NET. Or you can use SQL Server's own functions. Once you have extracted individual values from JSON, you can write them into corresponding columns (in your new table).
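As a rough sketch of that extraction step (shown in Java with Jackson purely for illustration, since that's the only language with code elsewhere on this page; with Json.NET or SQL Server's JSON functions the idea is identical, and the table/column names in the comment are assumptions):

import java.util.Map;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;

public class BlobFlattener {
    public static void main(String[] args) throws Exception {
        String date = "15/07";
        String blob = "{\"DriverName1\":\"100kms\",\"DriverName2\":\"10kms\"}";
        Map<String, String> entries =
                new ObjectMapper().readValue(blob, new TypeReference<Map<String, String>>() {});
        for (Map.Entry<String, String> e : entries.entrySet()) {
            if (!e.getValue().endsWith("kms")) continue;   // skip non-driver keys such as "Hash"
            int kms = Integer.parseInt(e.getValue().replace("kms", ""));
            // here you would INSERT INTO DriverKms (Date, DriverName, KmsDriven) VALUES (?, ?, ?)
            System.out.println(date + " | " + e.getKey() + " | " + kms);
        }
    }
}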
Is this the right way to use the Blob data?
No. It violates the principle of atomicity, and therefore the first normal form.
Are there any design patterns to adhere to here?
I'm not sure about "patterns", but I don't see why you would need a BLOB in this case.
Assuming the data is uniform (i.e. it always has the same fields), you can just declare the columns you need and write directly to them (as you already proposed).
Otherwise, you may consider using SQL Server's XML data type, which will enable you to extract some of the sections within an XML document, or insert a new section without replacing your whole document.

RavenDB: Is there a way to know the actual "Last-Modified" timestamp in a slave database?

I replicate data from a RavenDB 1.0 database (master) to a RavenDB 2.5 database (slave). The replication is done, and I query the RavenDB 2.5 database by LastModified using the index "Raven/DocumentsByEntityName". I found that the Last-Modified metadata for all documents has been updated to today's date, so I have no way to get the correct query result. The Last-Modified metadata for the documents in the original 1.0 DB is a date before today.
Is there any way that I can get the real Last-Modified date for the replicated documents? Or is there a Created-Timestamp in the metadata?
Every time the document is updated, the Last-Modified date is reset. It doesn't matter if you did it yourself, or if it was done via one of Raven's own processes such as replication.
If the dates are important for your domain, you might consider adding properties for them to the document itself.
But if all you're after is a creation date in the metadata, you can add one using a custom bundle. You can write your own, or use the one in the Raven.Contrib project.
But no, Raven doesn't keep a creation date on its own, so if you've already lost the last-modified date then there's no way to get it back.

Database schema for HTTP transactions

I have a script that makes an HTTP call to a web service, captures the response, and parses it.
For every transaction, I would like to save the following pieces of data in a relational DB.
HTTP request time
HTTP request headers
HTTP response time
HTTP response code
HTTP response headers
HTTP response content
I am having a tough time visualizing a schema for this.
My initial thoughts were to create 2 tables.
Table 'Transactions':
1. transaction id (not null, not unique)
2. timestamp (not null)
3. type (response or request) (not null)
4. headers (null)
5. content (null)
6. response code (null)
'transaction id' will be some sort of checksum derived from combining the timestamp with the header text.
The reason I compute this transaction id is to have an id that can distinguish two transactions, but at the same time be used to link a request with its response.
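For concreteness, something like the sketch below is what I have in mind (SHA-256 chosen arbitrarily; the helper is only an illustration). The same id would be stored on both the request row and its matching response row:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

class TransactionIds {
    // Hash the request timestamp together with the raw request header text.
    static String transactionId(String requestTimestamp, String requestHeaders) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest((requestTimestamp + "\n" + requestHeaders).getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.toString();
    }
}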
What will this table be used for?
The script will run every 5 minutes and log all of this into the DB. Plus, every time it runs, the script will check the last time a successful transaction was made. Also, at the end of the day, the script generates a summary of all the transactions made that day and emails it.
Any ideas on how I can improve this design? What kind of normalization and/or optimization techniques should I apply to this schema? Should I split it up into two or more tables?
I decided to use a NoSQL approach to this, and it has worked. I used MongoDB. The flexibility it offers with document structure, and not having to have a fixed number of attributes, really helped.
It's probably not the best solution to the problem, but I was able to optimize performance using compound indexes.
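As a rough illustration of the shape this took with the MongoDB Java driver (database, collection, and field names here are placeholders, not my real ones):

import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

public class TransactionLogSketch {
    public static void main(String[] args) {
        MongoCollection<Document> tx = MongoClients.create("mongodb://localhost:27017")
                .getDatabase("monitoring").getCollection("transactions");

        // One document per transaction: request and response live together, and extra
        // attributes can be added later without any schema change.
        tx.insertOne(new Document("requestTime", new java.util.Date())
                .append("requestHeaders", new Document("Accept", "application/json"))
                .append("responseTime", new java.util.Date())
                .append("responseCode", 200)
                .append("responseHeaders", new Document("Content-Type", "application/json"))
                .append("responseContent", "..."));

        // Compound index backing the "last successful transaction" check and the daily summary.
        tx.createIndex(Indexes.compoundIndex(
                Indexes.ascending("responseCode"), Indexes.descending("requestTime")));
    }
}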

When are text field lengths enforced in Salesforce? Can I get the actual field length?

I was importing data from Salesforce today when my BULK INSERT failed with too-long data: longer than the field length reported by Salesforce itself. I discovered that this field, which Salesforce describes as a TEXT(40), has values up to 255 characters long. I can only guess that the field had a 255-character limit in the past, was changed to TEXT(40), and Salesforce has not yet applied the new limit.
When are field lengths enforced? Only when new data is inserted or modified? Are they enforced at any other point, such as a weekly schedule?
Second, is there any way to know the actual field length limit? As a database guy, not being able to rely on the metadata I've been given makes me cringe. As just one random example, if we were to restore this table from backup I assume that the long values would bomb, or possibly be truncated.
I'm using the SOAP API.
Field lengths are enforced on create/update. If you later reduce the length, existing records are not truncated. I imagine this is because Salesforce stores these as 255-character fields regardless.
Pragmatically speaking, the "actual" field limit for any text field in Salesforce should be considered 255, because it's possible that at some point in the past, records were inserted when the limit was as high as 255.
And you're right that if you were to dump that table and re-insert it, you very well could have rejected records due to values that exceed the field size as currently defined.
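If you do want to read the declared limit, the describe call in the SOAP API reports a length for each field; a sketch using the WSC partner client is below (credentials and the object name are placeholders), though given the caveat above you'd still want to size your staging columns at 255:

import com.sforce.soap.partner.Connector;
import com.sforce.soap.partner.DescribeSObjectResult;
import com.sforce.soap.partner.Field;
import com.sforce.soap.partner.PartnerConnection;
import com.sforce.ws.ConnectorConfig;

public class FieldLengths {
    public static void main(String[] args) throws Exception {
        ConnectorConfig config = new ConnectorConfig();
        config.setUsername("user@example.com");          // placeholder credentials
        config.setPassword("password+securityToken");
        PartnerConnection conn = Connector.newConnection(config);

        // Ask Salesforce for the metadata it currently holds for each field.
        DescribeSObjectResult describe = conn.describeSObject("Contact");
        for (Field f : describe.getFields()) {
            System.out.println(f.getName() + " -> declared length " + f.getLength());
        }
    }
}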