How does Bigtable handle Tablet Server failures? - bigtable

I have read that a Bigtable cluster consists of a single master server and multiple tablet servers. When a tablet server fails, the master sends a message to a tablet server, the backup copy of the tablet is made primary, and GFS automatically creates an extra replica of the tablet.
Is that enough? I would like to know how the failure is actually handled: what is the procedure, and which steps are followed?

There is a single distributed file system that Bigtable uses, not several different ones, and all the tablet servers access the same file system.
In this file system, the tablets (SSTables) are already stored durably and replicated. When a single tablet server fails, the master assigns a new tablet server to handle the data that the failed server was serving. No data copying is necessary, because the new tablet server can access the same data as the old one on the same distributed file system.
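To make the reassignment idea concrete, here is a minimal toy sketch in Python. It is not Bigtable code: the Master class, the server names (ts1, ts2) and the tablet names are all made up for illustration, and a plain dict stands in for the distributed file system. It only shows that the master changes the assignment while the data itself stays where it is; real recovery details are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    """Toy master: tracks which live tablet server serves each tablet."""
    shared_fs: dict       # tablet name -> data; stands in for the distributed file system
    live_servers: list    # names of the tablet servers that are currently alive
    assignment: dict = field(default_factory=dict)  # tablet name -> server name

    def assign_all(self):
        # Spread the tablets over the live servers round-robin.
        for i, tablet in enumerate(sorted(self.shared_fs)):
            self.assignment[tablet] = self.live_servers[i % len(self.live_servers)]

    def handle_failure(self, failed):
        # Drop the failed server and hand its tablets to the survivors.
        self.live_servers.remove(failed)
        orphaned = [t for t, s in self.assignment.items() if s == failed]
        for i, tablet in enumerate(orphaned):
            self.assignment[tablet] = self.live_servers[i % len(self.live_servers)]
        return orphaned

    def read(self, tablet):
        # Whichever server is assigned, it reads the same data from the shared file system.
        return self.shared_fs[tablet]


fs = {"tablet-a": "rows a..f", "tablet-b": "rows g..m", "tablet-c": "rows n..z"}
master = Master(shared_fs=fs, live_servers=["ts1", "ts2"])
master.assign_all()
moved = master.handle_failure("ts1")
print("reassigned:", moved, "->", master.assignment)
```

Note that handle_failure only touches the assignment table; nothing in shared_fs is copied or modified.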

Related

How to use batch process for IBM MDM SE 11.6, Virtual MDM implementation

I have a requirement where an external source wants to send its data (CDE) to IBM MDM, which will run it through the PME algorithm and send EIDs back to the external source.
Note: The external source data will not be stored into MDM database.
I want to take the batch processing route, where the external source drops a file on a server and MDM picks it up and processes it through its PME engine.
Question: I am not familiar with batch processing in IBM MDM. Any pointers or guidance on how to achieve this would be very helpful.
I proposed using the member search web service as one solution, but because of the very high volume of external source data (100k-200k records per day), that option is not feasible.
Thank you for reading.

Redis active-active replication

I am using Redis version 2.8.3. I want to build a Redis cluster, but this cluster should have multiple masters. That is, I need multiple nodes that accept writes and propagate them to all other nodes.
I was able to build a cluster with a master and multiple slaves. I just added this line to each slave's redis.conf file:
slaveof myMasterIp myMasterPort
That's all. Then I tried writing to the database via the master. The write was replicated to all slaves, which I really like.
But when I tried to write via a slave, it told me that slaves are not allowed to write. So I set the slave's read-only status in its redis.conf file to false, and after that I could write to the database through the slave.
But I realized that the write is not replicated to the master, so it is not replicated to the other slaves either.
This means I could not build an active-active cluster.
I tried to find out whether Redis has active-active cluster capability, but I could not find a definite answer.
Is it possible to build an active-active cluster with Redis?
If so, how can I do it?
Thank you!
Redis v2.8.3 does not support multi-master setups. The real question, however, is why do you want to set one up? Put differently, what challenge/problem are you trying to solve?
It looks like the challenge you're trying to solve is how to reduce the network load (more on that below) by eliminating over-the-net reads. Since Redis isn't multi-master (yet), the only way to do it is by setting up each app server with a master and a slave (to the other master) - i.e. grand total of 4 Redis instances (and twice the RAM).
The simple scenario is when each app updates only a mutually-exclusive subset of the database's keys. In that scenario this kind of setup may actually be beneficial (at least in the short term). If, however, both apps can touch all keys or if even just one key is "shared" for writes between the apps, then you'll need to bake locking/conflict resolution/etc... logic into your apps to consolidate local master and slave differences (and that may be a bit of an overkill). In either case, however, you'll end up with too many (i.e. more than 1) Redises, which means more admin effort at the very least.
Also note that by colocating app and database on the same server you're setting yourself up for near-certain scalability failure. What will happen when you need more compute resources for your apps or Redis? How will you add yet another app server to the mix?
Which brings me back to the actual problem you are trying to solve - network load. Why exactly is that an issue? Are your apps so throughput-heavy or is the network so thin that you are willing to go to such lengths? Or maybe latency is the issue that you want to resolve? Be that as it may, I recommend that you consider a time-proven design instead, namely separating Redis from the apps and putting it on its own resources. True, the network will hit you in the face and you'll have to work around/with it (which is what everybody else does). On the other hand, you'll have more flexibility and control over your much simpler setup and that, in my book, is a huge gain.
Redis Enterprise has had this feature for quite a while, but if you are looking for an open source solution KeyDB is a fork with Active Active support (called Active Replica).
Setting it up is just a little more work than standard replication:
Both servers must have "active-replica yes" in their respective configuration files
On server B execute the command "replicaof [A address] [A port]"
Server B will drop its database and load server A's dataset
On server A execute the command "replicaof [B address] [B port]"
Server A will drop its database and load server B's dataset (including the data it just transferred in the prior step)
Both servers will now propagate writes to each other. You can test this by writing to a key on Server A and ensuring it is visible on B and vice versa.
https://github.com/JohnSully/KeyDB/wiki/KeyDB-(Redis-Fork):-Active-Replica-Support
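The order of the steps above matters, so here is a small Python sketch that just spells the sequence out for two nodes. It does not talk to KeyDB; the function name active_replica_plan and the addresses 10.0.0.1 and 10.0.0.2 are placeholders I made up. Only the "active-replica yes" setting and the "replicaof" command come from the steps above.

```python
def active_replica_plan(a, b):
    """Return the ordered (server, action) steps for two KeyDB nodes.

    a and b are (host, port) tuples for server A and server B.
    """
    a_host, a_port = a
    b_host, b_port = b
    return [
        ("A", "config: active-replica yes"),
        ("B", "config: active-replica yes"),
        ("B", f"replicaof {a_host} {a_port}"),  # B drops its data and loads A's dataset
        ("A", f"replicaof {b_host} {b_port}"),  # A then loads B's dataset, which now matches
    ]


# Placeholder addresses, for illustration only.
plan = active_replica_plan(("10.0.0.1", 6379), ("10.0.0.2", 6379))
for server, action in plan:
    print(f"server {server}: {action}")
```

Because each server drops its own database when it becomes a replica, the node whose data you want to keep should play the role of A.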

Multiple iOS devices + centralized sqlite database

Can I share same sqlite database from 2 different iOS devices or update one from another?
Not easily. You can sync the db over iCloud but there is a high chance that data will be overwritten.
I recommend writing some sort of syncing tool that can detect and merge changes originating on either device.
A common method is to use a timestamp column, fetch all rows modified since the last sync, and then apply them to the other database.
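A rough sketch of the timestamp approach, using Python's sqlite3 module with two in-memory databases standing in for the two devices. The notes table, its columns and the "newest write wins" rule are assumptions for the example, not something the question specifies; on iOS the same SQL would be issued through whichever SQLite wrapper you use, and the rows would travel over the network rather than between two local connections.

```python
import sqlite3


def make_db():
    # One in-memory database per simulated device.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT, modified INTEGER)")
    return db


def sync(source, target, last_sync):
    """Copy rows changed in source after last_sync into target; the newest write wins."""
    rows = source.execute(
        "SELECT id, body, modified FROM notes WHERE modified > ?", (last_sync,)
    ).fetchall()
    for row_id, body, modified in rows:
        current = target.execute(
            "SELECT modified FROM notes WHERE id = ?", (row_id,)
        ).fetchone()
        # Insert rows the target lacks; overwrite only if the source copy is newer.
        if current is None:
            target.execute("INSERT INTO notes VALUES (?, ?, ?)", (row_id, body, modified))
        elif modified > current[0]:
            target.execute(
                "UPDATE notes SET body = ?, modified = ? WHERE id = ?",
                (body, modified, row_id),
            )
    target.commit()
    return len(rows)


device_a, device_b = make_db(), make_db()
device_a.execute("INSERT INTO notes VALUES (1, 'from A', 100)")
device_b.execute("INSERT INTO notes VALUES (1, 'older on B', 50)")
device_b.execute("INSERT INTO notes VALUES (2, 'only on B', 120)")

# Two-way sync: run once in each direction with the same last-sync time.
sync(device_a, device_b, last_sync=0)
sync(device_b, device_a, last_sync=0)
```

Newest-write-wins silently discards the older edit, so if both devices can change the same row you would want a real merge step in place of the UPDATE. Deleted rows are also not handled here.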
You can take a look at OpenMobster's Sync Service. It supports multi-device sync operations with conflict management.
Besides this it supports the following sync modes:
two-way
one-way device
one-way server
bootup
You can run in complete offline mode, and the changes will be tracked automatically and synchronized when the network returns. At that point, sync happens with the Cloud and with other devices that hold the same data that you hold locally.
The project is located at: http://openmobster.googlecode.com
You can look at the following iOS sync tutorial to get an idea of how this thing works:
http://code.google.com/p/openmobster/wiki/iPhoneSyncApp

Is there a log file of running processes in Server Advantage

My name is Josue
I need your help with this:
Is there any way to audit or monitor the server processes that connect to the Advantage Database Server?
Is there a log of running processes?
Thanks
There is no existing log of processes that use Advantage Database Server. Because it is a client/server architecture, there is no mechanism that I am aware of that can easily associate a connection on the server to a specific process.
However, it would be possible to use the system procedure sp_mgGetConnectedUsers() to obtain some of this information. It might be possible to use it to obtain the information you are looking for at a given point in time (a snapshot).
The output of that procedure includes three fields that you might be interested in. The Address column gives the address of the machine that connected to Advantage. It is typically the IP address of the client application. But it can also be of the form "IPC Connection N", which indicates that it is using shared memory for communications; this means that the client process is running on the same machine as the server.
The TSAddress column might also be of interest. If the connection is made by a client that is running through terminal services (e.g., a remote desktop), then that column contains the IP address of the client machine. If you are interested in knowing processes that originate from the server machine itself, then you would need this field to differentiate between those and clients that connected through terminal services.
The other column of potential interest would be ApplicationID. By default, that field contains the process name (e.g., the executable) of the client application. This could help identify the actual process. It is not guaranteed, though. The application itself can change that value through mechanisms such as sp_SetApplicationID.
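As an illustration of how those three columns could be combined, here is a small Python sketch that labels each row of a snapshot. How the rows are fetched from Advantage is left out. The function name classify_connection and the sample rows are made up for the example; only the column names Address, TSAddress and ApplicationID and the "IPC Connection N" form come from the description above.

```python
def classify_connection(row):
    """Label one row of sp_mgGetConnectedUsers() output.

    row is a dict holding the Address, TSAddress and ApplicationID columns.
    """
    address = row.get("Address") or ""
    ts_address = row.get("TSAddress") or ""
    if ts_address:
        # Checked first: a terminal services session may also show up as an IPC connection.
        origin = f"terminal services client at {ts_address}"
    elif address.startswith("IPC Connection"):
        origin = "local process on the server (shared memory)"
    else:
        origin = f"remote client at {address}"
    return f"{row.get('ApplicationID', '?')}: {origin}"


# Hypothetical snapshot rows, for illustration only.
snapshot = [
    {"Address": "192.168.1.20", "TSAddress": "", "ApplicationID": "billing.exe"},
    {"Address": "IPC Connection 3", "TSAddress": "", "ApplicationID": "nightly_job.exe"},
    {"Address": "IPC Connection 4", "TSAddress": "192.168.1.77", "ApplicationID": "reports.exe"},
]
labels = [classify_connection(row) for row in snapshot]
for label in labels:
    print(label)
```

Running something like this on a schedule and appending the output to a file would give you a crude log, with the caveat already mentioned that ApplicationID can be changed by the client.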

Advantage Database Replication

I have a client that wants two sites to have the ability to sync databases so information at Site A can be synced with Site B so the two sites can look at the same data.
I'm not even sure of the infrastructure required. Would a VPN be required to connect the two databases, or would an internet-based database work, i.e., Site A to InternetDatabase and Site B to InternetDatabase? Each site would copy data to it periodically, the InternetDatabase would sync it, and the sites could then pull the data down.
My other thought was something like Dropbox. If Site A and Site B use a Dropbox account to sync the ADT files etc., can the database at each site then sync with those ADT files?
Thanks
If the two sites update completely different tables, then something like Dropbox might work for that. Dropbox does not synchronize/merge the contents of files. That means if both site A and site B updated some file, then you would be responsible for writing the code to merge the changes.
Advantage Database Server has support for replication built in natively, so that would likely be the simplest solution. Advantage replication is performed on a record-by-record basis and is handled asynchronously. If the target database cannot be reached, the updates are stored in a queue and processed periodically. If the connection between the two sites is open/available constantly, the lag between the source update and the replicated update is typically small but obviously depends on the network bandwidth and latency.
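The queueing behavior can be pictured with a toy Python model. This is not how Advantage implements it internally; the Replicator class, the link_up flag and the customer keys are invented for the sketch, and a dict stands in for the table at the other site.

```python
from collections import deque


class Replicator:
    """Toy model of asynchronous, record-by-record replication with a retry queue."""

    def __init__(self, target):
        self.target = target   # dict standing in for the table at the other site
        self.queue = deque()   # updates waiting to be delivered, oldest first
        self.link_up = True    # whether the target can currently be reached

    def record_update(self, key, value):
        # The source has already committed locally; queue the change for the target.
        self.queue.append((key, value))
        self.flush()

    def flush(self):
        # Deliver queued updates in order; leave them queued while the link is down.
        delivered = 0
        while self.queue and self.link_up:
            key, value = self.queue.popleft()
            self.target[key] = value
            delivered += 1
        return delivered


site_b = {}
rep = Replicator(site_b)
rep.record_update("cust-1", "Alice")
rep.link_up = False
rep.record_update("cust-2", "Bob")       # queued, site B is unreachable
rep.record_update("cust-1", "Alice S.")  # queued behind it
pending = len(rep.queue)
rep.link_up = True
delivered = rep.flush()                  # stands in for the periodic retry
```

While the link is down, Site B simply lags behind; once it is back, the queued updates are applied in their original order.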
You could use a VPN for the connection between the two sites, but it would not be required. If you do not use some kind of VPN, though, you should make sure the communication is encrypted between the two sites (it is an option when setting up the subscriptions).
Edit: For the communication, all you need is "normal" network connectivity. The primary issue is dealing with things like firewalls and NAT. With Advantage, you define which port it uses. If you use a TCP/IP connection, you would need to make sure the firewall allows inbound connections on the configured port to the ads.exe process. You can use UDP as well, but if you are dealing with firewalls, it is probably going to be simpler with TCP.
Your question about duplicate keys is a good one. If both sites either add a record with the same primary key or update the same record concurrently, then it results in a conflict. There is an option to simply ignore conflicts in which case the last update wins. More realistically, you would want to write an ON CONFLICT trigger to handle the conflicts.