How to move data model automatically? - gooddata

Can I somehow pull a data model from someone's platform to mine, so I don't need to create the model manually? Is there a chance to make it automatically? Thanks

for some cloud datawarehouses GoodData supports “model scan” I believe where you can connect to the database and generate GD LDM based on that. I believe similar mechanism is also in CN. BUT - I would not recommend doing it for production use especially on top of some random database. It might still require some manual tuning to make a reasonable LDM. And from our experience in GD the LDM should be based on the reporting requirements. But for a quick and dirty demo, it might work out of the box.

Related

Is there a way to let non-technical individuals utilize BigQuery reports?

I want to have an access port for non-tech savvy individuals in which they could make reports of their own without needing to know SQL what-so-ever.
It would be best if I could create custom fields of myself, and then just let the users in the access port pick and choose whichever they like with a custom date range.
I've explored the options Google Data Studio offers, but it looks to me like it mostly puts an emphasis on data visualization.
In addition, my attempts to make custom queries with it were not successful, since the platform is rigid in terms of deciding which field is a metric and which is a dimension (and it does so inaccurately). This makes it hard to query reports as you normally would using BigQuery, which doesn't have these somewhat arbitrary limitations.
Perhaps I've misunderstood something about the platform due to my limited experience with it, but it looks like Data Studio isn't going to fit the bill for me.
EDIT: In addition, the platform should have a way of exporting said reports as CSV files, a feature that Data Studio doesn't have as far as I know.
It would be great to receive suggestions for a different platform which would better fit my needs, or even suggestions on how to make better use of Data Studio.
Have you looked at using a tool like redash (https://redash.io)? Assuming your GA360 data is in BigQuery you can connect redash to BQ. Then you can author queries and visualize.
You can also use the Google Could SDK to connect to BQ and run custom queries to generate new tables in BQ based on the GA360 session data. Then use redash, or any tool, to report/visualize.

What is the best approach to archiving operational data?

I have a sql server 2012 database which is the backend to an asp.net MVC application, storing customer and order information. This database is accessed under high load and high usage.
I know have a requirement to be able to generate ad hoc reports from the database accessing the same data as the MVC application works with. I am concerned what impact this would have on the database server and the database itself, around locking etc. As such their is a distinction between the data, for the app its operational, but for the reports its more data warehouse oriented.
Therefore I am looking at my options as to the best approach to avoid such.
I am considering creating another database on a different server and archive the data to it using a sql job at regular intervals during the day. Only concern around this is that it would require maintenance and also a dependency to ensure any necessary changes are made to the target database when the source database changes.
What other options opened to me in such a situation and what advice could be given regarding such? What is the best approach to such?
You don't have to think of your own solution to keep the databases in sync. SQL Server has build in ways to achieve this.
Database Mirroring
Replication
Always On Availability Groups
If you're using Enterprise Edition of SQL Server 2012 then I would look into Always On Availability Groups if not then (Transactional) Replication. Both of these solutions can keep a second read-only and near real-time copy of the database.
As Steve McConell suggests you should make no assumptions about performance. You should just measure it before making any decisions. It is not a wise choice to make design choices without knowing the actual performance overhead. So I would suggest to measure, or simulate the performance overhead before even consider using a complex architecture, because you would not know if it's worth the trouble.
Anyway, I think that your approach is right. I would create a windows service which periodically retrieves the data I need from my database and stores them in my warehouse (the new database). I don't think you would ever find a tool keeping consistency between the two schemas, unless you want one schema to be an exact copy of the other.
I don't know your exact needs and perhaps my suggestion is an overkill but I would encourage you to consider using an OLAP approach in the data warehouse where your reporting data will come from. I have to warn you that these systems are oriented in really big data and advanced reporting needs but perhaps you can take some ideas from them. Since you are familiar with the Microsoft ecosystem, I would suggest using Business Intelligence Studio. You could there build an OLAP cube using your normal database as data source and integrate advanced reporting.
Hope I helped.

JSON vs classic schema design [duplicate]

The Project
I've been asked to work on an interesting project -- what amounts to a basic Web CMS -- that uses HTML/CSS/jQuery with PHP. However, one requirement is that there won't be a database to house the data (they want flat files for the documents/pages -- preferable in JSON format).
In a very basic sense, it'll be used to generate HTML pages via a very "non-techie" interface. Each installation would only have around 20 pages, but a few may get up to 100. It has to be fairly easy to drop onto a PHP capable server and run, with very little setup needed.
What's Out There
There are tons of CMS options and quite a few flat file versions. But an OSS or other existing CMS is not an option. They need a simple propriety system.
Initial Thoughts
So flat files it is... but I'd really like to get some feedback on the drawbacks, and if it is worth the effort to try and convince them to use something like MySQL (SQLite or CouchDB are out since none of the servers can be configured to run them at the present time).
Of course the document files are pretty straightforward, but we're also talking about login info for 1 or 2 admins per installation, a few lists, as well as configs/settings (which also can easily be stored in a file with protection).
The Dilemma
If there are benefits to using MySQL rather than JOSN formatted files and some arrays in a simple project like this -- beyond my own pre-conceived notions :) -- I'll be sure to argue them.
But honestly I can't see any that outweigh their need to not have a database system.
I'd appreciate you insight and opinions.
If you can't cite a specific need for relational table design, then you're good with flat files. Build as specified. The moment you can cite a specific need, let them know; upgrading isn't that hard, if you're perception is timely (that is, if you aren;t in the position of having to normalize data that should have been integrated earlier).
It's a shame you can't use CouchDB, this seems like the perfect application for it. Keep in mind that using flat-files severely constrains your architecture and, especially, scalability.
What's the best case scenario for your CMS app? It's successful and people want to use it more? If you're using flat-files it'll be harder to service and improve your system (e.g. make it more robust, and add new features for future versions) and performance will not scale well. So "success" in this case is at best short-lived, as success translates into more and more work for less and less gains in feature-set and performance.
Then again, if the CSM is designed right, then switching between a flat file to RDMS should be as simple as using a different data access file.
Will this be installed on any shared hosting sites. For this to work somewhat safely, a mechanism like suEXEC needs to be set up properly as the web server will need write permissions to various directories.
What would be cool with a simple site that was feed via JSON and jQuery is that the site wouldn't need to load on each click. Just the relevant data would change. You could then use hashes in the location bar to keep track of where you were (ex. http://localhost/#about)
The problem being if they are editing the raw JSON file they can mess it up pretty quick. I think your admin tools would have to generate the JSON files based on the input so that you can ensure nothing breaks. The admin tools would be more entailed then the site (though isn't that always the case with dynamic sites)
What is the predicted data sizes for the CMS?
A large reason for the use of a RDMS is quick,specific access to large amounts of data. The data format might not be large, but if there is a lot of the data, then it might be better in the long run for a RDMS.
Then again, if the CSM is designed right, then switching between a flat file to RDMS should be as simple as using a different data access file.
While an RDBMS may be necessary for a very large CMS, a small one could run off flat files very well. A lot of CMS products out there fall down in that regard, I think, by throwing an RDBMS into the mix when there's no real need.
However, if you are using flat files, there are security issues which others have highlighted. Another issue I've come across is hosting providers using the disable_functions directive in php.ini to disable file I/O functions like fopen() and friends. If you're hosting your CMS on a box you control, you won't have this problem but if you're using a third-party provider, check first.
As the original poster, I wasn't signed in, so I'm following up to the answers so far in an answer (sorry if this is bad form).
There may instances where this is on
a shared host.
Though the JSON files can technically
be edited, this won't be the case.
The admin interface will be robust
enough to do all of the creating/editing of pages
The size for each install will be
relatively small -- 1 - 2 admins,
10-100 pages. A few lists of common
items may run longer (snippets of
copy for example).
Security will be a big issue -- any
other options suggestions on this
specifically?
Well, isn't there a problem with they being distrustful to any database system? Isn't the problem more in their thinking than in technology? Maybe they are afraid of database because it sounds complex to them. In that case, if you just present them some very simple CMS (like CMS made simple, which I've heard is really simple and the learning process is very fast), if they see everything is easy then may be they just don't care what's behind, if it's a database or whatever!
They could hear to arguments like better maintenance, lower cost of maintenance, much better handover to another webmaster than proprietary solutions (they are not dependent on you) etc.

Get my database under Version Control using a DVCS [Mercurial]

What would be the best approach for versioning my whole database ?
Creating a file for each database object (table,view,procedsure..) or rather having one file for all DDL scripts and any new change will be put in a separate file ?
What about handling changes made in a Database manager tool ?
I'd like to have a generic solutions for any kind of RDBMS.
Are there any other options ?
I'm a huge VCS fan in general and a big Mercurial booster, but I really think you're going down the wrong path.
VCSs aren't just about iterative changes, the "what", they're also about answering the "who", "when", and "why". For a database those answers are a lot less interesting or hard to provide to the VCS. If you're doing nightly exports and commits the "who" will always be "cron" and the "why" will always be "midnight".
The other thing modern VCSs do really well is helping you merge changes from multiple branches. That's less applicable in the database world. Very seldom do you say "I want this table structure, but this data", and if you do the text/diff merge isn't going to help you much.
The thing that does do "what" and "when" very well is an incremental backup system, and that's probably the better fit.
At work we use Tivoli and at home I use rdiff-backup and duplicity, but there are plenty of great options.
I guess my general rule of thumb is "if it was typed by hand by a human then it does into source control, and if it was generated/exported then it goes in the incremental backups"
Certainly you can make this work, but I don't think it will buy you much over the more traditional backup solutions.
Have a look at this post
If you need generic solution - put everything in the scripts (simple text files) and put under Version Control system (can be used any of VCS).
Grouping similar database objects into scripts will be depend on your requirement.
So you may for example:
Store table/indexes/ in one or several script
Each procedure store in individual script or combine small procedures into one script.
However need to remember one important thing with this approach: don't forget change scripts if you changed table/view/procedure directly in databases and don't create/recreate/compile you db objects in database after changing scripts.
SQL Source Control currently supports SVN and TFS, but Mercurial requests are increasing rapidly and we're hoping to have a story for this very soon.
We use UserVoice to measure demand so please vote accordingly if you're interesting in this: http://redgate.uservoice.com/forums/39019-sql-source-control

Does an ORM integrate with existing applications or do I not understand?

Assume Hibernate for the ORM.
I'm not sure how to ask this. I want to build an application that can replace part of another. For example, say I have an application with various modules, called the "big" app. This application may handle HR, financial, purchases, skill sets, etc. But maybe, for whatever reason, I don't like the skill set module, but I like the rest of the application. I want to build an app that uses the same database that the rest of the "big" app uses but use my software as the front end for that piece.
I could build my app and have it hit the database directly with no ORM. My question is is there an advantage to using an ORM here. I'm thinking there is because if the "big" app goes away and another app is purchased, we could continue to use my version of skill set because I am using hibernate instead of hitting things directly. I'm still learning but I thought that my application used objects that I named and that in the case I just described I'd have to change my mapping files only or/and my code very little.
Here is another example. I have a legacy application and legacy database. It uses database X. I decide that I no longer like the old terminal emulator application that is used to get the data and that I want a graphical version. I can use hibernate with my application and when I finally decide to get rid of the legacy database and change to the latest Oracle or SQL Server, I can do so with minimal headache? Or is my database going to change so much that it wouldn't have matter anyway (I'm suggesting that upon changing to a new database more information will want to be captured)?
I was hoping for comments, if I am misunderstanding why hibernate/ORM might or might not be a benefit.
Thank you.
I do not think you will have a huge benefit frmo hibernate if the database schema changes to something completely different, you might have to change more than just your mapping - especially if more "structure" is added to the database (tables, column and such schema things). That said, if the database was structured mostly the same way, but lets say just the column names and tables names changes and a couple of tables are merged or something like that - you can get by with just changing your mapping.
But I would really recommend using hbernate for database agnosticity, that's is a pretty easy path.
AND then just because it doesn't exactly helps you if your entire database is changed, it such an incredible amount of other forces, that I would choose that over direct DB access most of the time.
Lastly you could think about using a service-layer such as the repository pattern that abstracs away the data access, so the business of your appilcation wouldn't need to change if the database changes.
Switching from one DBMS to another (ala Oracle to SQL Server) is one thing that using an ORM would certainly make much easier.
As for switching from one "big app" to another "big app", I doubt if using an ORM would help that much. It's likely that the database structure and business logic would be different enough that you would find yourself rewriting lots of code anyways.
You can generate domain objects with Hibernate Tools, if you do that than it will be painless and fast. however if you write all the objects by hand you will die. i think its good idea to rewrite part of the app and get to know hibernate better.
I think it's generally a bad idea to make any decision based on the
unknowns versus the knowns. Whether you're deciding on a data
access/persistence strategy, what car to buy, or what college to go
to, you should put the most weight on the things you know you want
today, rather than worrying about what may or may not happen tomorrow.
So when considering ORMs, I wouldn't worry too much about things such as apps
"going away" or DBMSs changing (unless that's either already been talked about, or
there's a history of this in your company). I'm not saying that these aren't things that will never happen, but rather that they should take a back seat to the generally much more important considerations of maintainability, performance, and developer productivity.
So in short, choose an ORM based on its ability to solve the problems and satisfy the requirements that you have today.