I have to generate Avro schemas from UML models. We're using Enterprise Architect to model our classes.
Can somebody help with generating Avro Schema files (*.avsc) from Enterprise Architect?
EA does not itself support Avro, and to my knowledge there are no third-party extensions that do either. To build one would likely require a solid week or more of effort, and that's for an EA extension expert like myself. So while I'm confident it could be done, I don't think you'll have much luck finding someone to do it for free.
An alternative might be to model and generate an XML schema, which EA can do, and convert that using xml-avro or some other conversion tool.
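Whichever route you take, it helps to know what the target format looks like. An .avsc file is just JSON with a "record" declaration at the top level. Below is a minimal, hand-written sketch for a hypothetical Customer class (all names are made up, not from EA), generated here with plain Python so you can see the shape a converter would have to produce:

```python
import json

# A minimal Avro schema (.avsc) for a hypothetical "Customer" class.
# An .avsc file is plain JSON whose top-level "type" is "record".
schema = {
    "type": "record",
    "name": "Customer",
    "namespace": "com.example.model",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
        # Optional attributes map to a union with "null".
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}

with open("Customer.avsc", "w") as f:
    json.dump(schema, f, indent=2)
```

A UML class maps fairly naturally onto a record, attributes onto fields, and optional attributes onto unions with "null", which is what a conversion tool would be automating.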
Does anyone know of domain-specific languages (DSLs) that facilitate data extraction and transformation as part of an Extract-Transform-Load (ETL) pipeline?
I'd like to extract data from a third-party SQL database and transform it into an already defined JSON format for storage in my application. There are many possible database schemata to extract from, so I was wondering whether this can already be configured with the help of a (commonly used) extraction language (ideally one that is also agnostic to other data sources such as web services, etc.).
I had a look around, but other than a few research papers I couldn't find much in terms of agreed standards for ETL (minus the 'L' which I've got covered) and I don't want to reinvent the wheel.
I'd appreciate any pointers in the right direction.
Creating a good, all-encompassing DSL for ETL is, I believe, not just hard but a bit of a fool's errand: to handle the many real-world ETL complexities, you end up re-creating a general-purpose language.
And ETL "without programming skill", as this research paper attempts, will struggle with the messiness of cleaning and conforming disparate source systems.
Using a general-purpose language by itself is of course possible, but very time-consuming due to the low abstraction level and all the infrastructure code you'd have to implement.
Graphical ETL tools and some ETL DSLs address this by adding scripts or calling out to external programs. While useful and often essential, this has the disadvantage of mixing multiple programming models, with the associated mental and technical friction when moving between them.
A different, and I believe better, approach is instead to add ETL capabilities to a general-purpose language. Done well, you combine the benefits of ETL-specific functionality and a high abstraction level with the power of a general-purpose language and its large ecosystem, all delivered via a single programming model.
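To make the idea concrete, here is a toy sketch of extract, transform, and load steps expressed directly in a general-purpose language (plain Python with in-memory SQLite; all table and column names are made up):

```python
import sqlite3

# Toy end-to-end ETL entirely inside one general-purpose language.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")

# Extract: a pretend source system with messy data.
src.execute("CREATE TABLE raw_orders (id INTEGER, amount TEXT)")
src.executemany("INSERT INTO raw_orders VALUES (?, ?)",
                [(1, " 10.50 "), (2, "3"), (3, None)])

# Transform: clean and conform rows; drop unusable ones.
def transform(rows):
    for oid, amount in rows:
        if amount is None:
            continue  # conform: skip rows with missing amounts
        yield oid, float(amount.strip())

# Load: write conformed rows to the destination.
dst.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
rows = src.execute("SELECT id, amount FROM raw_orders")
dst.executemany("INSERT INTO orders VALUES (?, ?)", transform(rows))
dst.commit()

total = dst.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

The cleaning rule, the flow of rows, and the target schema all live in one program with one debugger and one test framework, which is the single-programming-model point made above.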
As one example of this latter approach, my company provides actionETL, a cross-platform .NET ETL library that combines an ETL mindset with the advantages of modern application development. For example, it provides familiar control flow and dataflow ETL capabilities, and uses internal DSLs in several places to simplify configuration. Do try it out if it sounds like a good fit.
actionETL now also has a free Community edition.
Cheers,
Kristian
I have to integrate text files into a database using JBoss Fuse. Do I have to use JDBC, or what is the best approach? Thanks!
Camel offers some easy ways to load a text file into a database (JDBC, SQL, Hibernate, JPA, etc.).
For a small job, I'd be tempted to just use the SQL component -- it's pretty basic.
GitHub has some great examples provided by some very good books about Camel. These are often cut-and-paste ready.
Good luck, there are good tools available.
My goal is to write the specification of a simple client-server application protocol for our project, where there will be a few kinds of client: iOS (Swift), Android (Java), and probably Web (HTTP/WebSocket). The server is Python. Our team decided to use MessagePack as the data-structure serializer for the different requests/responses.
So now I'm thinking about how to describe such data structures. I don't want to write the whole specification manually and spend time working out my own rules and conventions. I would rather point my colleagues on the client side to the description of an existing notation system.
My question is a common one:
How do you approach such a task? Do you write plain text in your native language, or do you use some notation system? Is it reasonable to use a notation system together with an existing serializer? I mean ASN.1; it seems clear.
First let me explain. My platform is mostly Windows and my data mostly resides in a relational database (SQL Server 2008). I primarily work with C# but occasionally work with Perl and JavaScript. I was looking to learn what a graph database could do for my data, but there seems to be a continual stream of tools and utilities that I need to install and learn. I am so busy learning the tools that I lose focus of what I really want, which is to work with a graph database.
It seems that Neo4j is relatively small and should be accessible enough to evaluate its features. I would like to import my data from an existing SQL database into Neo4j, with the relationships established initially via the foreign keys. The idea seems relatively straightforward, but it seems I need to learn Java, PHP, etc. not only to access Neo4j but also to access the existing database. I was wondering if anyone had recommendations, tools, or documentation that would accomplish this goal fairly simply. Do I go down the route of PHP? Java? What additional libraries/packages do I need? What tools are most useful? Thank you.
I think you want the Neo4j batch importer, which can be found on GitHub. Using this, we were able to export 20 million nodes and relationships and import them into Neo4j.
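The importer works from CSV files, so the main job is turning tables into node files and foreign keys into relationship files. Below is a rough sketch in Python using an in-memory SQLite database as a stand-in for the source (table names are made up), writing CSVs with the header tokens Neo4j's bulk CSV importer understands (`:ID`, `:LABEL`, `:START_ID`, `:END_ID`, `:TYPE`):

```python
import csv
import sqlite3

# Stand-in source database (made-up schema with one foreign key).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id));
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1), (11, 2);
""")

# One node file per table; the primary key becomes the node :ID.
# (In a real import you'd use ID groups, e.g. ":ID(Customer)",
# to keep the key spaces of different tables separate.)
with open("customers.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["customerId:ID", "name", ":LABEL"])
    for cid, name in db.execute("SELECT id, name FROM customers"):
        w.writerow([cid, name, "Customer"])

with open("orders.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow(["orderId:ID", ":LABEL"])
    for (oid,) in db.execute("SELECT id FROM orders"):
        w.writerow([oid, "Order"])

# Each foreign key becomes a relationship row.
with open("placed_by.csv", "w", newline="") as f:
    w = csv.writer(f)
    w.writerow([":START_ID", ":END_ID", ":TYPE"])
    for oid, cid in db.execute("SELECT id, customer_id FROM orders"):
        w.writerow([oid, cid, "PLACED_BY"])
```

For SQL Server you'd swap the sqlite3 connection for an ODBC one, but the table-to-nodes, foreign-key-to-relationships mapping stays the same.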
I think you will have to write your own. Neo4jD looks interesting.
Core principles and tenets for designing a system: is this really Web 3.0?
The core principles would include an understanding of RDF, RDFS, and OWL, and of how arbitrary knowledge can be represented using these specs. Understanding what reasoners can do with semantic data is essential.
Then comes the idea of intelligent agents.
Then comes an understanding of why this is all needed and why it was designed that way. This would give you a clue about how promising (or not) this technology is.
For a software architect it would also be good to know how OWL data can be stored efficiently (in an RDBMS or in some other way, for example).
As for myself, I find this interesting, but in reality it is still very far from the point where average users can benefit from it on the regular Web.
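To give a feel for what a reasoner does (deriving facts that are only implicit in the data), here is a toy sketch, not real OWL tooling: plain Python tuples stand in for RDF triples, and two RDFS-style entailment rules are applied to a fixpoint:

```python
# Toy triples: (subject, predicate, object). Names are made up.
triples = {
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
    ("rex", "type", "Dog"),
}

def infer(kb):
    """Apply two RDFS-style rules until no new facts appear:
    1. subClassOf is transitive.
    2. type propagates up the class hierarchy."""
    kb = set(kb)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in kb:
            for s2, p2, o2 in kb:
                # (A subClassOf B), (B subClassOf C) => (A subClassOf C)
                if p == p2 == "subClassOf" and o == s2:
                    new.add((s, "subClassOf", o2))
                # (x type A), (A subClassOf B) => (x type B)
                if p == "type" and p2 == "subClassOf" and o == s2:
                    new.add((s, "type", o2))
        if not new <= kb:
            kb |= new
            changed = True
    return kb

facts = infer(triples)
```

Nobody ever stated that rex is an Animal, yet the reasoner derives it; real reasoners do the same kind of thing at scale, with far richer OWL semantics.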