FlatBuffer schema design for frameworks - flatbuffers

I'm looking for advice on structuring FlatBuffer schemas for a framework which allows users to extend the data types defined by the framework, but also allows the framework developers to add new fields when new versions of the framework are published.
My original thinking was that when you create a project using this framework, it would generate several FlatBuffer schema files which you could then edit for your specific project. You could then compile the schemas and start developing code using the framework APIs.
However, this becomes a problem when the framework developers decide to add fields to the base types. As you probably know, FlatBuffers requires that any additional fields be appended to the end (or at least have a higher ID than other fields). So there is a conflict between the additions made by the framework developer and the framework user.
One possible solution would be to have a set of 'non-user-extensible' types that are owned by the framework creator, and which should not be modified by users of the framework; and these types would then be embedded within the data types defined by the framework user. However, given the restrictions on fields changing size, I am not sure if this would even work.
I'm also willing to hear alternatives to using flatbuffers if it turns out that there is no good solution otherwise.

To have open-ended extension like that, you should really have the framework authors and users work in two separate tables, where one can own the other. There is no good way to extend a single table if all contributors aren't sharing the schema in source control.
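For illustration, here is a minimal sketch of that two-table split in FlatBuffers IDL (the file names, namespaces, and fields are made up for the example):

```
// framework.fbs -- owned by the framework authors; users never edit this file.
namespace Framework;

table FrameworkData {
  id:uint;
  created_at:ulong;
  // Framework authors append new fields here in later releases.
}
```

```
// project.fbs -- generated once for the user's project, then owned by the user.
include "framework.fbs";

namespace MyProject;

table ProjectData {
  framework:Framework.FrameworkData;  // framework-owned table nested by reference
  my_custom_field:string;
  // Users append their own fields here without ever touching framework.fbs.
}

root_type ProjectData;
```

Because each table has a single owner, appending fields never creates conflicting field IDs between the two parties.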
If these extensions must be in a single object for whatever reason, then Protocol Buffers is more flexible than FlatBuffers, since it doesn't require adjacent field ids. You can simply say that all field ids >=1000 are for framework users, for example.
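For example, a hypothetical Protocol Buffers message partitioned by field-number convention (the message, fields, and the 1000 cutoff are illustrative; the split is a project convention, not a protobuf rule):

```proto
syntax = "proto3";

message Item {
  // Field numbers 1-999: reserved by convention for the framework authors.
  string name = 1;
  int64 created_at = 2;

  // Field numbers 1000 and up: free for framework users to add in their copy.
  string my_custom_field = 1000;
}
```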

In retrospect (answering my own question two years later), it seems that FlatBuffers was not the right choice for my use case. These days I'm using a combination of msgpack (in cases where I care about byte-size) and JSON (in cases where I don't) and I'm pretty happy with each.

Related

Append structure to standard table or create Z table?

Nowadays SAP recommends keeping the core clean in order to be able to move to the cloud and always be able to update to the latest version without having to worry or retest; this also applies to on-premise systems.
I got the requirement to add a Z field to the QMEL table to link its notifications to SAP PS projects (the PROJ table). The QMEL table already has a structure, CI_QMEL, ready to be extended, and the related BAPIs support this extension.
But in order to keep the core clean, I'm considering challenging the functional requirement and suggesting that we create a ZNOTIF_PROJ table with the same key as QMEL (the notification ID). This would be totally separated from the standard, but at the same time the official BAPI wouldn't be able to support it, so a wrapper on top would be needed to update both the standard and the custom table, and everything would become more complex.
Should I stick to the old extension style or go for a new table?
Personally I prefer extending standard tables. Having BAPIs, standard transactions, etc, work as expected is worth far more than a nebulous idea like a "clean core."
As long as you're not modding core code or extending tables in an incorrect manner, customizing the system in ways supported by SAP is not a bad thing. You should consider your future upgrade plans (S/4 on-prem vs cloud, for example) when deciding the right answer, but don't make things too hard on yourself.
S/4 on-premise and cloud already provide functionality for adding new fields and tables. You can do this in the web UI, much like in SAP CRM, so there is no problem extending the existing structure. The help page about this functionality is here.

Dapper.Rainbow VS Dapper.Contrib

Can someone please explain the difference between Dapper.Rainbow vs. Dapper.Contrib?
I mean when do you use SqlMapperExtensions.cs of Dapper.Contrib and when should you use Dapper.Rainbow?
I’ve been using Dapper for a while now and have wondered what the Contrib and Rainbow projects were all about myself. After a bit of code review, here are my thoughts on their uses:
Dapper.Contrib
Contrib provides a set of extension methods on the IDbConnection interface for basic CRUD operations:
Get
Insert
Update
Delete
The key component of Contrib is that it provides tracking for your entities to identify if changes have been made.
For example, using the Get method with an interface as the type constraint will return a dynamically generated proxy class with an internal dictionary to track what properties have changed.
You can then use the Update method which will generate the SQL needed to only update those properties that have changed.
Major Caveat: to get the tracking goodness of Contrib, you must use an Interface as your type constraint to allow the proxy class to be generated.
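For a concrete feel, here is a minimal sketch of that interface-based usage (the ICar interface, the Cars table it maps to by convention, and the connection string are all hypothetical):

```csharp
using System.Data.SqlClient;
using Dapper.Contrib.Extensions;

// Hypothetical entity; by Contrib's default conventions ICar maps to a "Cars"
// table and the Id property is treated as the key.
public interface ICar
{
    int Id { get; set; }
    string Make { get; set; }
    string Model { get; set; }
}

public static class ContribExample
{
    public static void Demo()
    {
        using (var connection = new SqlConnection("...your connection string..."))
        {
            // Because ICar is an interface, Get<T> returns a generated proxy that
            // keeps an internal dictionary of which properties have been changed.
            var car = connection.Get<ICar>(1);

            car.Model = "Fiesta";       // marks the proxy as dirty

            // Update uses that change tracking when building the UPDATE statement.
            connection.Update(car);
        }
    }
}
```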
Dapper.Rainbow
Rainbow is an abstract class that you can use as a base class for your Dapper classes to provide basic CRUD operations:
Get
Insert
Update
Delete
As well as some commonly used methods such as First (gets the first record in a table) and All (gets all records in a table).
For all intents and purposes, Rainbow is basically a wrapper for your most commonly used database interactions and will build up the boring SQL based on property names and type constraints.
For example, with a Get operation, Rainbow will build up a vanilla SQL query and return all columns and then map those values back to the type used as the constraint.
Similarly, the insert/update methods will dynamically build up the SQL needed for an insert/update based on the type constraint's property names.
Major Caveat: Rainbow expects all your tables to have an identity column named “Id”.
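A minimal sketch of the Rainbow pattern described above (the Car class, Cars table, and connection string are hypothetical; requires the Dapper.Rainbow package):

```csharp
using System.Data.SqlClient;
using Dapper;   // Database<T> and Table<T> come from the Dapper.Rainbow package

// Hypothetical POCO; Rainbow expects the backing table to have an identity column named Id.
public class Car
{
    public int Id { get; set; }
    public string Make { get; set; }
    public string Model { get; set; }
}

// The database class lists each table you want Rainbow to manage.
public class MyDatabase : Database<MyDatabase>
{
    public Table<Car> Cars { get; set; }
}

public static class RainbowExample
{
    public static void Demo()
    {
        var connection = new SqlConnection("...your connection string...");
        connection.Open();
        var db = MyDatabase.Init(connection, commandTimeout: 2);

        // Rainbow builds the boring SQL from the property names of Car.
        int? id = db.Cars.Insert(new { Make = "Ford", Model = "Focus" });
        var car = db.Cars.Get(id.Value);
        db.Cars.Update(id.Value, new { Model = "Fiesta" });
        var allCars = db.Cars.All();
    }
}
```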
Differences?
The major difference between Contrib and Rainbow (IMO) is that one tracks changes to your entities and the other doesn't:
Use Contrib when you want to be able to track changes in your entities.
Use Rainbow when you want to use something more along the lines of a standard ADO.NET approach.
On a side note: I wish I had looked into Rainbow earlier as I have built up a very similar base class that I use with Dapper.
From the article @anthonyv quoted (That annoying INSERT problem, getting data into the DB):
There are now 2 other APIs you can choose from as well (besides Rainbow) for CRUD: Dapper.Contrib and Dapper Extensions. I do not think that one-size-fits-all. Depending on your problem and preferences there may be an API that works best for you. I tried to present some of the options. There is no blessed "best way" to solve every problem in the world.
I suspect what Sam was trying to convey in the above quote and the related blog post was: Your scenario may require a lot of custom mapping (use vanilla Dapper), or it may need to track entity changes (use Contrib), or you may have common usage scenarios (use Rainbow) or you may want to use a combination of them all. Or not even use Dapper. YMMV.
This post by Adam Anderson describes the differences between several CRUD Dapper extension libraries:
Dapper Contrib (Automatic change tracking - only whether dirty or not, Attributes for custom mapping, No composite key support, No manual key support)
Dapper Rainbow (Manual change tracking using Snapshotter, Attributes for custom mapping, No composite key support, No manual key support)
Dapper Extensions (No change tracking, Fluent config for custom mapping, Supports composite keys, Supports manual key specification), also includes a predicate system for simple queries (NOTE: deprecated - does not support recent Dapper versions nor .NET core)
Dapper SimpleCRUD (No change tracking, Attributes for custom mapping, No composite key support, Supports manual key specification), also includes filtering/paging helpers, async support, automatic POCO class generation (through T4)
Sam describes in detail what the difference is in his post: http://samsaffron.com/archive/2012/01/16/that-annoying-insert-problem-getting-data-into-the-db-using-dapper.
Basically, it's the usual "no one size fits all" answer, and it's up to you to decide which approach to go with based on your needs:
There are now 2 other APIs you can choose from as well (besides Rainbow) for CRUD: Dapper.Contrib and Dapper Extensions. I do not think that one-size-fits-all. Depending on your problem and preferences there may be an API that works best for you. I tried to present some of the options. There is no blessed "best way" to solve every problem in the world.

Schema versioning using Fluent NHibernate

I've tried reading some previous answers but it's not clear whether or not any of them apply to my situation, as far as I can see. Most of the questions seem to refer to web applications. I figure I'm better off stating my requirements and going from there instead of trying to reverse-engineer advice meant for a different situation. I'm essentially asking two questions:
What does (Fluent) NHibernate support that would, in principle, allow me to achieve the requirements? I'd prefer to use the Fluent API if possible;
What am I going to have to write myself to develop a working solution?
Broadly, the requirements are as follows:
What I'd like to do is use FNH to persist and rehydrate models for a desktop application that would have roughly the same usage model as MS Office, for example - that is, work is kept as self-contained files which are loaded into a local instance of the application.
The current version of the application must be able to import files from all previous versions and preserve all information except that which is declared to the user to be unsupported; by 'import' I mean 'transcribe the model information contained in file A into new file B such that file B is fully compatible with the current version, aside from that which is declared to be unsupported.'
The current version of the application must be able to export a current model to be compliant with only the most recent issue of the previous major version of the application. It is not required to supply legacy compatibility with any older revisions of the previous major version.
The nature of the product is such that updates to the file format happen fairly frequently: as a ballpark figure, aim to be able to release to users every six months or so if necessary, with the format changing in development much more frequently than that.
I have no objection to writing code to handle this, provided that:
The coding does not take an inordinate amount of time for arbitrarily complicated changes to the schema;
I am able to verify whether or not the translation between versions is complete by calling the FNH API through unit tests;
I can verify that any given model will round-trip correctly between versions and only lose data which is declared to the user to be unsupported between product versions;
So, to summarise:
What, if anything, does Fluent NHibernate supply to enable this kind of use-case?
Can the requirements be readily satisfied as they are, or will I have to make them more specific and constrained?
What should I investigate as to coding myself?
I would suggest using a document database, something like RavenDB, MongoDB, etc., for what you are trying to do. I think these would be a better fit than trying to force an RDBMS (SQL Server, Oracle, etc.) and consequently NHibernate to do something that it's not all that good at. Not to say that it can't, but you will end up jumping through all sorts of hoops to accomplish what you are asking.
One thing to note is that Fluent NHibernate only puts a fluent API over NHibernate's class mapping.
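To make that last point concrete, a ClassMap is just a code-based stand-in for an hbm.xml mapping file; here is a minimal sketch (the Document class, table, and column names are hypothetical):

```csharp
using FluentNHibernate.Mapping;

// Hypothetical model; NHibernate needs virtual members so it can proxy the class.
public class Document
{
    public virtual int Id { get; set; }
    public virtual string Title { get; set; }
    public virtual int SchemaVersion { get; set; }   // e.g. stamp each file with its format version
}

// The fluent equivalent of a Document.hbm.xml mapping file.
public class DocumentMap : ClassMap<Document>
{
    public DocumentMap()
    {
        Table("Documents");
        Id(x => x.Id);
        Map(x => x.Title);
        Map(x => x.SchemaVersion);
    }
}
```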

Entity Framework - Schema Upgrade, Multiple DBMS, and Code First

I'm looking into using Microsoft's Entity Framework in an upcoming project which is a point release of an existing product. Our current product supports two DBMS (Oracle and SQL Server), the schema of each is maintained in separate .sql script files.
The entity framework (4.1) looks appealing because it allows various scenarios to be implemented automatically via code generation, reflection, etc. However, as far as I can tell, some of these benefits appear to be mutually exclusive of others.
For example, to support multiple DBMSes, I am inferring that I would need to use a model-first or code-first design, in which case EF would generate the schema for each according to the model (I have seen few to no posts or documentation on this, so I may be wrong). This means that our existing schema would need to be either abandoned (model-first) or mapped (code-first). Additionally, updating the schema would require manual scripts, as EF does not appear to support schema upgrades (without wiping out data).
Are model-first and code-first the only viable means of supporting multiple DBMSes in EF? I realize that technically it would be impossible to guarantee that two arbitrary schemas are the same, so I am thinking this is true.
Are there any potential pitfalls of code-first and mapping to multiple DBMS systems? For example, Oracle does not have auto-increment columns; you have to use sequences. How is this mapped in the DbContext? Do I need to create separate maps for each DBMS?
Does EF support any mechanism to upgrade an existing DBMS schema to one that is representative of the EF model (schema recreation =/= upgrade), or am I limited to doing this manually?
I did come up with one possible way to use database first and support multiple DBMSes, however it is a maintenance nightmare. The idea was to add another layer of abstraction to the two generated data models and create converter classes for each of the EF generated models. This seems like the best way of doing it so that each DBMS could potentially have its own model, yet my code would handle the mapping. But in doing this, what am I really gaining from EF? Maybe query generation, but is that worth it?
Actually both the model-first and the database-first approaches have the same constraints. Both use an EDMX file, which contains an SSDL part (a description of the store = the database layer) tied directly to a single database provider, so if you want two different database providers you must have two different SSDL parts and keep them in sync. You can use a single CSDL (a description of the conceptual layer = your model classes) and a single MSL or two MSLs (a description of the mapping between SSDL and CSDL; a single file is possible only if tables and columns have exactly the same names in both SSDLs). As far as I know, an EDMX file can contain only a single SSDL, CSDL, and MSL part, so I expect that the designer has no support for this scenario and you will have to modify the second SSDL manually or use two EDMXs = model each change twice.
The code-first approach can make this much simpler, but the question is how good the Oracle provider is when using code-first and database generation. The provider is responsible for correctly interpreting needed features like sequences in the case of auto-increment columns.
EF itself currently has no support for upgrading an existing DB. When using EDMX, the process of database generation is controlled either by a T4 template or a Workflow, so it can be customized, and there is already a separate feature called Entity Designer Database Generation Power Pack which allows incremental building of the database with the model-first approach. The problem is that this feature uses the VS Database tools, and I think those tools work only with SQL Server. I never liked these automated tools, so I still think that database upgrades should be controlled manually with the help of some tool that produces a difference script between the current and the last deployed database versions. You should need the diff script only when deploying the new version to a production environment. In a testing and a development environment you can always recreate the whole database.
There should be no abstraction needed when working with two EDMX models. The models must produce the same conceptual layer. In that case you need only a single set of POCO classes which are mapped by conventions (same class name as the entity, same properties with the same types and accessibility), so they will work with both models.
Edit:
Based on @Tridus's answer, I'm just adding that you can create the databases first and use the fluent API from EF 4.1 to map them. Your databases must have exactly the same schema (table names, column names, etc.) and they can't use any provider-specific features (I hope sequences will not be a problem because they are just the way Oracle handles auto-increment columns).
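A minimal sketch of mapping an existing table with the EF 4.1 fluent API (the Notification class and the table/column names are made up for the example):

```csharp
using System.Data.Entity;

// Hypothetical POCO mapped onto an existing table rather than a generated one.
public class Notification
{
    public int Id { get; set; }
    public string Text { get; set; }
}

public class MyContext : DbContext
{
    public DbSet<Notification> Notifications { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Point the model at the existing schema; the same mapping must be valid
        // against both the SQL Server and the Oracle database.
        modelBuilder.Entity<Notification>().ToTable("NOTIFICATIONS");
        modelBuilder.Entity<Notification>().HasKey(n => n.Id);
        modelBuilder.Entity<Notification>().Property(n => n.Text).HasColumnName("TEXT");
    }
}
```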
This is actually fairly doable with a database-first design, but there are some caveats you won't be able to get around easily due to how the databases handle things differently.
Sequences are one (in that they're just ignored by EF entirely). You can fake that in Oracle by putting a trigger on the table that populates it on Insert, but I also found that if you have to update the model later then EF "forgets" that the column is an identity column and it'll try to stick a 0 in it again. I also found it unreliable in Oracle to try and get the new ID if you use a trigger. We just wound up selecting from the sequence and setting the ID on the object before doing the insert because that's how you usually do it in Oracle. You could also use a stored procedure that handles it.
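A rough sketch of that "select from the sequence yourself" approach (OrdersContext, Order, and ORDERS_SEQ are hypothetical names; the key would also need to be configured as not database-generated so EF inserts the value you assign):

```csharp
using System.Data.Entity;
using System.Linq;

public class Order
{
    public int Id { get; set; }              // assigned manually from the sequence
    public string CustomerName { get; set; }
}

public class OrdersContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
    // Note: Order.Id should be configured as DatabaseGeneratedOption.None so EF
    // sends the value we assign instead of expecting the database to generate it.
}

public static class OracleSequenceExample
{
    public static void InsertOrder()
    {
        using (var context = new OrdersContext())
        {
            var order = new Order { CustomerName = "Contoso" };

            // Ask Oracle for the next sequence value explicitly instead of relying
            // on a trigger to populate the ID during the insert.
            order.Id = (int)context.Database
                .SqlQuery<decimal>("SELECT ORDERS_SEQ.NEXTVAL FROM DUAL")
                .Single();

            context.Orders.Add(order);
            context.SaveChanges();
        }
    }
}
```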
Numbers aren't handled the same way. SQL Server uses number formats that map to Int32, Int64, etc. Oracle's number format is totally different, and a full-range Int32 in SQL Server is a Number(10,0) in Oracle... which is actually an Int64 in EF because it's bigger than an Int32. I also found that Oracle's EF provider likes to use Decimal a lot even when it doesn't have to, but that's probably just a beta issue.
Stored Procedures in Oracle require some values to be put in app.config/web.config in order to work in EF. I'm not sure if that's going to just be clutter in SQL Server or if it'll cause problems.
Finally, EF Code First is pretty immature and according to the docs doesn't support changing the database structure in this version. I'm not sure if Oracle's provider supports it either (it might, haven't tried it).
Most of this is stuff you can get around, but you're going to need to do some work to hide the differences from the rest of your code and it'll probably take a wrapper layer to do it.
Edit: Regarding your fourth point - EF 4.1 can generate partial POCO classes. Instead of writing a wrapper around each of the generated models to hide any differences, you can create another partial class code file that won't be regenerated when you update the model, and then add properties/methods that hide the differences. Your app code would just have to be aware to use those instead, and they'd handle the issue (like the number issue I mentioned: you could completely hide it with another property that does the necessary casting for Oracle).
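A sketch of that partial-class idea (the Invoice class and property names are invented; one half stands in for the file the model generator would produce, the other is the hand-written half that never gets overwritten):

```csharp
// Stand-in for the generated half of the class (normally lives in a generated .cs file).
public partial class Invoice
{
    public decimal Total { get; set; }   // Oracle NUMBER surfaces as decimal here
}

// Hand-written half in a separate file; it survives model regeneration.
public partial class Invoice
{
    // App code uses this property and never sees the Oracle-specific type.
    public int TotalAsInt
    {
        get { return (int)Total; }
        set { Total = value; }
    }
}
```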

How does Virtuemart do EAV without using EAV?

I understand the three basic failures in EAV, namely that it takes a lot of work to reassemble the data. However, I want a database where I can add custom fields. A lot of people say that Virtuemart allows custom fields but without using an EAV database structure. Can someone explain how this can be done or provide links?
I believe they store custom fields in a chunk of XML or YAML or other domain-specific language.
Basically, they use Martin Fowler's Serialized LOB pattern.
This makes it hard to use SQL expressions to query the custom attributes. You have to fetch the whole row back into your application and parse out the custom attributes. But this is no worse than the pain caused by EAV.
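As a rough, generic illustration of the Serialized LOB idea (not Virtuemart's actual code; the Product class, column, and the choice of JSON here are hypothetical), the custom attributes sit in one text column and get parsed back out in application code:

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

// Hypothetical row shape: fixed columns plus one LOB column holding the custom fields.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string CustomFieldsJson { get; set; }   // e.g. {"color":"red","size":"XL"}

    // Deserialize the LOB back into attributes the application can work with;
    // the database itself cannot easily query inside this column.
    public IDictionary<string, string> GetCustomFields()
    {
        return string.IsNullOrEmpty(CustomFieldsJson)
            ? new Dictionary<string, string>()
            : JsonConvert.DeserializeObject<Dictionary<string, string>>(CustomFieldsJson);
    }
}
```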
See http://web.archive.org/web/20110709125812/http://sankuru.biz/en/blog/8-joomla-configuration-issues/35-the-cck-buzz-content-creation-kit-and-the-eav-problem.html
Virtuemart and CCK
Virtuemart (VM) custom user fields are CCK-style, but do not rely on EAV. Therefore, they are very usable, and useful. I do recommend their use.
VM product types are also CCK-style, but unfortunately do rely on EAV. Therefore, I avoid VM product types like the plague. Instead, I just manually create additional fields in the product record.
The VM attribute system (simple, custom, advanced) is actually too underpowered to be considered CCK-grade.
A good improvement to VM would consist in rephrasing the VM product types and attributes to non-EAV CCK-style custom fields (and therefore make them work more like the VM custom user fields).