Scalable RDBMS alternative, NoSQL, NewSQL [closed] - sql

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
I am looking for a scalable alternative to a traditional DBMS such as PostgreSQL or MySQL.
Traditional databases do not give me the following features:
Auto sharding to ensure linear scalability.
Replication with automatic failover and recovery to ensure high availability.
No single point of failure.
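To make the first requirement concrete, here is a minimal sketch of hash-based sharding, assuming rows are routed by a string key. The helper names `shard_for` and `distribute` and the `user:N` keys are hypothetical and do not come from any of the databases mentioned here; real systems typically use range or consistent-hash partitioning so that adding a shard does not move most of the keys.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer so the result is stable across runs.
    return int.from_bytes(digest[:8], "big") % shard_count


def distribute(keys, shard_count):
    """Group keys by the shard they would be stored on."""
    shards = {i: [] for i in range(shard_count)}
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards


if __name__ == "__main__":
    placement = distribute([f"user:{n}" for n in range(1000)], shard_count=4)
    for shard, members in placement.items():
        print(shard, len(members))
```

With plain modulo routing like this, changing `shard_count` reassigns most keys, which is one reason automatic sharding is usually left to the database rather than the application.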
MongoDB looks like a good candidate if I can sacrifice transactions.
I've also looked at several NewSQL databases. NewSQL seems suitable for my purposes: VoltDB, TiDB, CockroachDB. But I'm worried about whether they are production-ready.
Maybe there are extensions that allow PostgreSQL or MySQL to run in clustered mode out of the box.

You should check out Vitess. It's used at YouTube and by a few other companies.
PS: I work on that project.

TiDB
Compatibility with MySQL
It supports the MySQL protocol, so you can run your existing MySQL scripts on TiDB without change.
Use cases
It is used by many big-name companies such as Mobike, Uber, and Pinterest. At Mobike, the big data team uses TiDB as a replica that is kept in sync with the online database. Queries consisting of analysis and aggregation requests are then run against that replica. Last but not least, the cloud computing platform of Tencent, the technology giant, recommends that customers use HTAP based on TiDB for both OLTP and OLAP.

Related

Reporting off of an in memory data store? [closed]

Closed 2 years ago.
I need to generate reports based on a dataset pulled from a third-party API, but I can't store the data on disk. (If I were allowed to store the data, I would put it in a relational database and write a query joining several tables to generate the export, for instance as a CSV.) I've been reading about Redis and I wanted to know whether it is a potential solution here as a temporary datastore. Or would I have a hard time putting the tables in the dataset together? If not Redis, what is the recommended way to cache data for reporting purposes in an Azure environment?
I'm filling in a lot of the gaps with assumptions, but to answer your question, yes.
Azure Redis Cache could be used to run your reports "in memory" generally speaking.
For the solution, "it depends" on:
the type of data
how you ingest the data
the type of reports you are trying to run
You have a platform that can run reports with Azure Redis Cache, but you still need to model the data properly to build your reports. Redis is not a relational database. Without more details, you should start here: https://redis.io/topics/data-types-intro
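As a rough illustration of what "modeling the data properly" can mean, here is a minimal sketch in which plain Python dicts stand in for Redis (an assumption, so that it runs without a server). Each record is stored the way a Redis hash would be (HSET), and a per-customer set of order ids (SADD) plays the role of the join index. The key patterns, field names, and the `load`/`report_csv` helpers are all hypothetical.

```python
import csv
import io


def load(customers, orders):
    """Store records under Redis-style keys; dicts emulate hashes and sets."""
    hashes, sets = {}, {}
    for c in customers:
        hashes[f"customer:{c['id']}"] = c          # like HSET customer:<id> ...
    for o in orders:
        hashes[f"order:{o['id']}"] = o             # like HSET order:<id> ...
        index_key = f"customer:{o['customer_id']}:orders"
        sets.setdefault(index_key, set()).add(o["id"])  # like SADD
    return hashes, sets


def report_csv(hashes, sets):
    """Join customers to their orders in application code and emit CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["customer", "order_id", "total"])
    for key in sorted(k for k in hashes if k.startswith("customer:")):
        customer = hashes[key]
        for order_id in sorted(sets.get(f"{key}:orders", ())):  # like SMEMBERS
            order = hashes[f"order:{order_id}"]
            writer.writerow([customer["name"], order["id"], order["total"]])
    return out.getvalue()


if __name__ == "__main__":
    h, s = load(
        [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}],
        [{"id": 10, "customer_id": 1, "total": 25},
         {"id": 11, "customer_id": 2, "total": 40}],
    )
    print(report_csv(h, s))
```

The point of the sketch is that the "join" happens in your own code, following index keys you maintain yourself, rather than in a query planner.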

SQL Server Enterprise vs IBM iSeries as400 reliability and performance [closed]

Closed 4 years ago.
I'm having a difficult time drumming up a good comparison between these two enterprise-level systems. I'm wondering if anyone has ever come up with some concrete statistics for the two. Does anyone here know of performance benchmarks comparing them? I'm looking for reliability (uptime), access speeds, things of that nature. The struggle is that most of the information that presents itself is opinion based; I'm looking for concrete facts about the two.
In addition to what has already been posted: comparing those two environments is really apples to oranges. IBM's offering (which is actually Power Systems running the IBM i OS) is a "self-contained" all-in-one, whereas with any SQL Server system there are simply too many ways to affect performance and stability, depending on the hardware the SQL Server database is running on. Running benchmarks to prove or disprove the viability of running your company's business isn't going to be an easy job...

Basic benchmark for oltp databases [closed]

Closed 6 years ago.
I'm looking for benchmarks that tested different RDBMSs running in the same environment, to use as a reference for a project. I'm not looking for any test in particular; I just want a source of comparison for a few RDBMSs, something like TechEmpower's benchmarks for development frameworks. Does anyone know where I can find this? It would be really helpful for me. Thanks in advance.
TPC-C, at http://www.tpc.org, is a benchmark that simulates a complete computing environment in which many users execute transactions against a database.
The benchmark is centered around the transactions of an order-entry environment.
These transactions include entering and delivering orders, recording payments, checking the status of orders, and monitoring the level of stock at the warehouses.
TPC-C involves a mix of five concurrent transactions of different types and complexity, either executed on-line or queued for deferred execution. The environment it models is characterized by:
The simultaneous execution of multiple transaction types that span a breadth of complexity
On-line and deferred transaction execution modes
Significant disk input/output
Transaction integrity (ACID properties)
Non-uniform distribution of data access through primary and secondary keys
Databases consisting of many tables with a wide variety of sizes, attributes, and relationships
Contention on data access and update
It's used for selecting the best hardware and database, and for comparing price/performance.
You can find TPC-C results for many database engines running in different environments on the TPC site, which defines transaction processing and database benchmarks and delivers trusted results to the industry.
You can sort the results and concentrate on the environment you need.
Concentrate on two numbers in the results table (per database and hardware/OS):
tpmC: absolute throughput in transactions per minute; the higher the number, the better the performance
Price/tpmC: total system price divided by tpmC; lower is better, and it is useful only as a relative comparison
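A small worked example may help in reading those two columns. The sketch below uses made-up systems and made-up numbers (none of them are published TPC results) and ranks the entries both by raw tpmC and by Price/tpmC, assuming Price/tpmC is simply the total system price divided by tpmC.

```python
from dataclasses import dataclass


@dataclass
class Result:
    system: str         # hypothetical system name
    tpmc: float         # throughput, transactions per minute
    total_price: float  # total system price, in dollars

    @property
    def price_per_tpmc(self) -> float:
        """Price/tpmC: dollars paid per unit of throughput."""
        return self.total_price / self.tpmc


# Illustrative figures only, not real benchmark submissions.
results = [
    Result("system-a", 500_000, 250_000),
    Result("system-b", 1_200_000, 900_000),
    Result("system-c", 300_000, 120_000),
]

fastest = max(results, key=lambda r: r.tpmc)
best_value = min(results, key=lambda r: r.price_per_tpmc)

if __name__ == "__main__":
    for r in sorted(results, key=lambda r: r.price_per_tpmc):
        print(f"{r.system}: tpmC={r.tpmc:,.0f} price/tpmC={r.price_per_tpmc:.2f}")
```

In this invented data set the fastest system is not the cheapest per transaction, which is why both numbers are worth looking at.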

Best free database management system for beginners (with capability for 20gb DB) [closed]

Closed 7 years ago.
I'm trying to "open" (?) a 20 GB database with a .sql extension and can't find any documentation for beginners that doesn't already assume database access. I think as a first step I need some database management system. It doesn't need to have any development-type capabilities, just the ability to compile (right word?) SQL.
Any suggestions?
Thanks,
Erin
The best RDBMS system to use as a beginner is probably going to be either MySQL or SQLite.
MySQL is an excellent database system and is capable of holding an extremely large amount of data. The data however is not stored in a movable file as you have described here.
SQLite is also an excellent database system and is capable of creating database files that you can move from one machine to another quite easily. The downside is that for a data set that large, you will likely have performance issues.
Based on the size of the file you mentioned, it sounds like what you have is a file with a whole bunch of SQL statements in it. Without seeing the contents of the file, it is extremely difficult to say which RDBMS it came from, but at the very least you should install MySQL and learn how to use it.
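To show what "running" a file of SQL statements means, here is a minimal sketch that uses Python's built-in sqlite3 module with an in-memory database instead of MySQL (a swapped-in tool, chosen only because it needs no installation). The `run_sql_script` helper and the tiny `pets` script are hypothetical, and the sketch assumes the statements are written in a dialect SQLite accepts, which a dump from another RDBMS may not be.

```python
import sqlite3


def run_sql_script(script_text: str) -> sqlite3.Connection:
    """Execute every statement in a SQL script against a fresh in-memory DB."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(script_text)  # runs the statements one after another
    return conn


# Stand-in for the contents of a (much smaller) .sql file.
dump = """
CREATE TABLE pets (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO pets (name) VALUES ('Rex');
INSERT INTO pets (name) VALUES ('Tom');
"""

if __name__ == "__main__":
    conn = run_sql_script(dump)
    print(conn.execute("SELECT COUNT(*) FROM pets").fetchone()[0])
    conn.close()
```

For a 20 GB file, reading the whole script into memory like this is not practical; with MySQL installed, the usual route is to feed the file to the command-line client, for example `mysql -u user -p dbname < dump.sql`, where the user and database names are placeholders.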

Distributed log system [closed]

Closed 2 years ago.
I need to store logs in a distributed file system.
Let's say that I have many types of logs. Each log type is recorded in its own file. But this file can be huge, so it must be distributed across many nodes (with replication for data durability).
These files must support append/get operations.
Is there a distributed system that achieves my needs?
Thanks!
I would recommend Flume, a log pulling infrastructure from the folks at Cloudera:
http://github.com/cloudera/flume
You can also try out Scribe from Facebook:
http://github.com/facebook/scribe
Combine a NAS with a NoSQL database like MongoDB and you'll have storage that is distributed, large, and fault tolerant.
Of course, without more specific details, such as how much data there is and the structure of the logs (or lack thereof), it's really hard to recommend a real product.
For example, if by "huge" you really mean 2 TB or less, and the data is highly structured, then a regular SQL server in a two-machine clustered environment for failover will do just fine.
However, if by "huge" you mean exabyte level or more, and/or unstructured data, then you need several large (and very expensive) NAS devices, on which you run a set of NoSQL databases that are clustered for failover and/or multi-master relationships...
You can use Logstash to collect the logs and centralize them with an Elasticsearch cluster. The local logs could be rolling log files, so that they remain small.
Further you can use Graylog2 to analyze and view your logs.
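The rolling local log files mentioned above can be sketched with Python's standard `logging.handlers.RotatingFileHandler`. This is only the local half of the setup; shipping the files to Logstash or Elasticsearch is not shown. The logger name, file name, and size limits are arbitrary assumptions for the demo.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def make_rolling_logger(path, max_bytes=1024, backups=3):
    """Return (logger, handler) writing to `path`, rolling over at max_bytes."""
    logger = logging.getLogger("rolling-demo")  # arbitrary demo name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep demo output out of the root logger
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger, handler


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        logger, handler = make_rolling_logger(os.path.join(tmp, "app.log"))
        for n in range(200):
            logger.info("event number %d", n)
        # Release the file before the temporary directory is removed.
        logger.removeHandler(handler)
        handler.close()
        print(sorted(os.listdir(tmp)))
```

Old data beyond `backupCount` files is discarded, so a collector has to pick the entries up before they roll off.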