Is it possible to develop templates in Tableau? - data-visualization

I am starting to learn Tableau. I have reached an expert level at collecting, preparing, and processing data in many languages and programs. I would like to know whether Tableau would allow me to convert data into compelling visualizations that others could use as a guide.

Related

Domain-specific language to perform data extraction and transformation in an ETL pipeline

Does anyone know of domain-specific languages (DSLs) that facilitate data extraction and transformation as part of an Extract-Transform-Load (ETL) pipeline?
I'd like to extract data from a 3rd-party SQL database and transform it into an already-defined JSON format to store in my application. There are many different possible database schemata to extract data from, so I was wondering whether there is already a way to configure this with the help of a (commonly used) extraction language (ideally one that is also agnostic to other data sources, such as web services, etc.).
I had a look around, but other than a few research papers I couldn't find much in terms of agreed standards for ETL (minus the 'L' which I've got covered) and I don't want to reinvent the wheel.
I'd appreciate any pointers in the right direction.
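To make concrete the kind of configuration I'm hoping for, here is a rough Python sketch of a declarative mapping driving generic extract/transform code; all table, column, and field names are invented for illustration:

    import json
    import sqlite3  # stand-in driver; the real source is a 3rd-party SQL database

    # Declarative mapping: target JSON field -> source column name, or a
    # callable for derived values. All names here are invented.
    MAPPING = {
        "customerId": "cust_id",
        "fullName": lambda row: row["first_name"] + " " + row["last_name"],
        "signupDate": "created_at",
    }

    def extract_transform(conn, query, mapping):
        """Run the extraction query and map each row onto the target JSON shape."""
        cursor = conn.execute(query)
        columns = [d[0] for d in cursor.description]
        for values in cursor:
            row = dict(zip(columns, values))
            yield {field: (src(row) if callable(src) else row[src])
                   for field, src in mapping.items()}

    conn = sqlite3.connect("thirdparty.db")
    for record in extract_transform(conn, "SELECT * FROM customers", MAPPING):
        print(json.dumps(record))

Swapping in a different schema would then only mean writing a new mapping, not new code.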
Creating a good, all-encompassing DSL for ETL is, I believe, not just hard but a bit of a fool's errand. To handle the many real-world ETL complexities, you end up re-creating a general-purpose language.
And ETL "without programming skill", as this research paper attempts, will struggle with the messiness of cleaning and conforming disparate source systems.
Using a general-purpose language by itself is of course possible, but very time consuming due to the low abstraction level and all the infrastructure code you'd have to implement.
Graphical ETL tools and some ETL DSLs address this by adding scripts or calling out to external programs. While useful and essential, this does have the disadvantage of employing multiple different programming models, with associated mental and technical friction when moving between them.
A different and, I believe, better approach is instead to add ETL capabilities to a general-purpose language. Done well, you combine the benefits of ETL-specific functionality and a high abstraction level with the power of a general-purpose language and its large ecosystem, all delivered via a single programming model.
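To illustrate the idea with a toy sketch (deliberately not actionETL's API, which is .NET): a fluent pipeline embedded in Python, where every step is ordinary host-language code:

    # Toy "internal DSL" for dataflow inside a general-purpose language:
    # composable steps chained fluently, any step can be arbitrary Python.
    class Pipeline:
        def __init__(self, source):
            self.source = source        # any iterable of rows
            self.steps = []

        def transform(self, fn):
            self.steps.append(("map", fn))
            return self                 # fluent chaining

        def keep(self, pred):
            self.steps.append(("filter", pred))
            return self

        def run(self, sink):
            rows = iter(self.source)
            for kind, fn in self.steps:
                rows = map(fn, rows) if kind == "map" else filter(fn, rows)
            for row in rows:
                sink(row)

    rows = [{"qty": 3, "price": 2.5}, {"qty": 0, "price": 9.9}]
    (Pipeline(rows)
     .keep(lambda r: r["qty"] > 0)
     .transform(lambda r: {**r, "total": r["qty"] * r["price"]})
     .run(print))

The point is the single programming model: the pipeline reads declaratively, yet dropping down to plain Python for a messy edge case needs no escape hatch.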
As one example of this latter approach, my company provides actionETL, a cross-platform .NET ETL library that combines an ETL mindset with the advantages of modern application development. For example, it provides familiar control flow and dataflow ETL capabilities, and uses internal DSLs in several places to simplify configuration. Do try it out if it sounds like a good fit.
actionETL now also has a free Community edition.
Cheers,
Kristian

Is there a way to directly migrate SAP BO reports into MicroStrategy?

I have my existing BI reporting in the SAP BO software and now I want to migrate everything to MicroStrategy. Is there any way to migrate those reports to MicroStrategy directly, or failing that, can I migrate the dimensions and measures created in SAP BO to attributes and metrics in MicroStrategy? Please suggest a way to do that effectively.
I researched this topic on other platforms, such as the MicroStrategy community and Google, but none of them answered my question clearly.
If I remember correctly, there was once an internal tool in MicroStrategy to do that. I never used it, and from what I remember it was quite rough and still required a lot of manual work.
You can try reading this presentation from MicroStrategy to get an idea of the process and the possible approaches.
Personally, I once did a conversion from BO to MicroStrategy. It was possible to reuse most, if not all, of the tables created for Business Objects, but in MicroStrategy I created everything from scratch; the design of the dashboards was different, to make them more interactive and easier to use.
Of course this approach may not seem feasible for big projects (which is why colleagues were using the above-mentioned tool), but I think rebuilding from scratch with a small scope (small team) and building on that will give the best result in the long run. The main issue is that this can take time, and sometimes organizations don't want to wait, but that is their problem :)

Automation & piping of diverse tasks

I am looking for recommendations for a very generic automation/task-execution tool. The scope is somewhere between a script, a build system like make, and orchestration tools like Ansible or Puppet. The best I can do is describe my rather vague 'requirements' and hope for clues as to how others have solved these problems. Sorry for the long description; I guess I don't really know exactly what I want the solution to do. I profit from programming answers on SO all the time, but I am not entirely sure if my open-ended question is acceptable here.
--
We work as data analysts/system validators in a corporate setting. We perform a range of diverse tasks and interact with lots of ever-changing systems. Each little step we take is arguably mundane/easy, but the bigger picture only forms if lots of iterations with slightly different inputs or combinations are repeated. It is a bit like looking for a needle in a haystack, but the concrete problem is slightly different every time. This makes it hard to use a normal script or automation tool, which requires more structure to work. But doing things semi-manually without a big team does not allow us to cover all the analyses/cases we want/need.
To give an applied example: a typical task could involve setting up a big calculation in a vendor system, extracting its ASCII output from a web server, and parsing it. Then we would pull raw input data from a set of configuration files and databases. This is piped into some of our home-grown replication tools/models living in C++. Then both the system's results and our replication are scanned for interesting outliers (e.g. regression tested), and only this subset is uploaded for human analysts to investigate, nicely presented in an Excel sheet.
We can do all these things easily by hand as a one-off, or maybe using ad-hoc tools/scripts. We just can't do it repeatedly for ever so slightly different settings. We seem to need a library of 'common tasks' that are specialized by just a few inputs (e.g. a task to download a time series and scan for outliers; parameters would be db access/login and maybe parameters defining what an outlier is in that context). And then I need to chain these tasks together to make complex tasks repeatable and simple to build up from atomic steps, something like the sketch below.
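A rough Python sketch of the kind of parameterised building block I have in mind; the outlier definition is one example parameter, and the fetch step is just a placeholder:

    import statistics

    def outlier_scan(series, z_threshold=3.0):
        # One "common task" building block; z_threshold is the kind of
        # per-context parameter described above.
        mean = statistics.fmean(series)
        sd = statistics.stdev(series)
        return [(i, x) for i, x in enumerate(series)
                if sd and abs(x - mean) / sd > z_threshold]

    def fetch_series(db_params):
        # Placeholder for the parameterised download step; db_params would
        # carry the db access/login details.
        raise NotImplementedError

    # Chained into a repeatable compound task:
    # outliers = outlier_scan(fetch_series(db_params), z_threshold=2.5)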
I have not found anything that really does something like this. There seem to be specialist scripts or tools available for each niche, but nothing that combines all the different tasks I need to perform.
I have so far been toying on and off with a minimalist sqlite database which controls a set of Python 'scripts'/wrappers. These scripts take input parameters from the database, and they are chained/piped based on the database. The scripts write their results back to the database, mostly as plain text and floats/ints. This kind of db interface is very error prone and complicated for humans; the idea is to have (template) scripts writing (concrete/parametrised) scripts to the db for execution, like the system rolling itself out before executing. I am not sure if this is a smart idea, but the db drives the scripts, without much interaction among these building-block scripts, rather than the conventional bunch of scripts calling each other and dumping some data into a db as an afterthought. So far we have lots of separate wrappers (scripts) to talk to all the systems and do the work; what is really missing is something tying it all together and controlling it.
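As a minimal illustration of that db-driven control loop (the schema and column names are invented; the real thing has more bookkeeping):

    import sqlite3, subprocess

    # Invented schema: one row per concrete (parametrised) script
    # invocation; results are written back as plain text.
    conn = sqlite3.connect("control.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS tasks (
        id INTEGER PRIMARY KEY,
        script TEXT,                  -- which wrapper to run
        params TEXT,                  -- serialized input parameters
        status TEXT DEFAULT 'pending',
        result TEXT)""")

    def run_pending(conn):
        rows = conn.execute("SELECT id, script, params FROM tasks "
                            "WHERE status = 'pending'").fetchall()
        for task_id, script, params in rows:
            proc = subprocess.run(["python", script, params],
                                  capture_output=True, text=True)
            conn.execute("UPDATE tasks SET status = ?, result = ? WHERE id = ?",
                         ("done" if proc.returncode == 0 else "failed",
                          proc.stdout, task_id))
        conn.commit()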
I am (obviously) more interested in data/flow transparency, repeatability, and chaining mini-programs together into bigger units than in speed or scaling to larger data sets. All the heavier lifting is either done in the systems we interact with, or delegated to C++ called from these Python scripts. This is not a production system with stability requirements and fixed goals, but rather a flexible analysis/investigation helper.
I really hope someone here has previously run into exactly this problem severely limiting their productivity, and we can just piggyback off your solution or ideas.
I would suggest that you consider STAF (Software Test Automation Framework). It's open source, distributed, and cross-platform. It will run just about any task on just about any platform. It has a variety of plugin "Services" available for specific purposes, or you can create your own custom Service. You can also extend the functionality through scripting (Jython). It's also well documented and reasonably well supported through user forums by IBM.

Feasibility of data mining a program's call stack using AOP

I am reading an article in IEEE Computer magazine about using data mining on applications.
The part that is intriguing to me is the idea that we can have software that monitors the execution flow of a program and puts the data into a database, where we can do some data mining.
This data could then be used by a data-mining tool to look for information, such as whether there are certain call patterns that might suggest changing the API; ideally, it might also be able to detect bugs, in that if functions must be called in some order, it can help detect violations of that order.
There are probably other uses, but this would be a start.
So, would such a tool be useful?
I am thinking that AOP may be the only way to really do this on a dynamic application, as you could then track the flow of every call and save the stack, and perhaps gather some other information, such as parameters.
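In a dynamic language, the AOP-style "advice" could be as simple as a decorator. A minimal Python sketch of the kind of call logging I mean (the traced functions are invented):

    import functools

    call_log = []   # in a real setup this would be written to a database

    def traced(fn):
        # Decorator standing in for AOP "around" advice: record each call
        # and its arguments before delegating to the original function.
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            call_log.append((fn.__qualname__, args, kwargs))
            return fn(*args, **kwargs)
        return wrapper

    @traced
    def open_session():
        pass

    @traced
    def query(sql):
        pass

    open_session()
    query("SELECT 1")
    # call_log now holds the ordered call sequence, ready to be mined for
    # patterns such as "query must be preceded by open_session".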
Unfortunately, software engineers don't tend to be experts in data mining, and those who do data mining may not be experts in writing complex applications.
For me, where this would get interesting is to then start to analyze distributed applications, or those using cloud computing, but that may be very complicated.
Second question: is this the type of question that should be a community wiki?
Yes, I think it would be useful.
No, it shouldn't be a community wiki.
Check out the book "Programming Collective Intelligence" by Segaran for some good programmatic use of data mining strategies.

What is your reporting tool of choice? [closed]

Every project invariably needs some type of reporting functionality, from a foreach loop in your language of choice to a full-blown BI platform.
To get the job done, what tools, widgets, and platforms has the group used, with success, frustration, or failure?
For knocking out fairly "run of the mill" reports, SQL Reporting Services is really quite impressive.
For complicated analysis, loading the data (maybe pre-aggregated) into an Excel Pivot table is usually adequate for most users.
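As a sketch of that pre-aggregation step (pandas here, with invented data; the output lands in Excel for users to pivot further):

    import pandas as pd

    # Invented sample data; the point is the pre-aggregation step.
    sales = pd.DataFrame({
        "region": ["East", "East", "West"],
        "month":  ["Jan", "Feb", "Jan"],
        "amount": [100, 150, 80],
    })
    summary = sales.pivot_table(index="region", columns="month",
                                values="amount", aggfunc="sum")
    summary.to_excel("summary.xlsx")   # hand this to users to pivot further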
I've found you can spend a lot of time (and money) building a comprehensive "ad-hoc" reporting suite and after the first month or two of "wow factor", 99% of the reports generated will be the same report with minor differences in a fixed set of parameters.
Don't accept it when a user says they want "ad-hoc" reports without specifying what goals and targets they're looking for. They are just fishing, and they need to spend as much time THINKING about THEIR reporting requirements as YOU would have to spend BUILDING their solution.
I've spent too much time building "the system that can report everything", only for it to become out of date or out of favour before it was finished. Much better to get the quick wins out of the way as quickly as possible and then spend time "systemising" the most important reports.
For most reports we use BIRT.
I've used Reporting Services and Crystal fairly extensively, and I'm writing a few reports using Excel (ick) at the moment.
Reporting Services is pretty good for simple reports, but as soon as you need total control over formatting, complex formulas, charts, etc., Crystal is a long way ahead. I also find Crystal to be far more usable; being able to change things within the report preview is invaluable (it may be possible in later versions of RS?).
RS also needs to be deployed to a web server, which limits its usefulness if you are writing applications that need to be deployed externally.
Older versions of Crystal were very buggy, but the latest ones are much better; it's much more mature than Reporting Services.
For a lot of projects we use ActiveReports.
I am a committer on the BIRT project, so I am biased. BIRT provides a very well thought-out report object model (ROM) and an appropriate API for the various design and deployment functions that are needed. In addition, BIRT provides the best multi-language support and the ability to separate development from design through the use of CSS.
BIRT can be embedded into your application for no license cost through the REAPI or it can be purchased through a couple of commercial offerings.
Cognos is a robust suite of tools (we use it as a front-end for an Oracle back-end), but there's a pronounced lack of documentation on how to accomplish complex reporting tasks -- mostly, you end up banging on it until you get something to work.
I wouldn't discount the usefulness of using Microsoft Access as a reporting front-end. It doesn't have that useful Web-enabled functionality, but for in-house reports it's very versatile and surprisingly powerful.
We use i-net Clear Reports for our reporting (seeing as how we "eat our own dog food"). ;)
It:
- is like Crystal Reports,
- can read Crystal Reports templates,
- has a more useful API,
- costs less than Crystal Reports (and if you factor in support costs, less than open source),
- is platform independent because it is written in Java.
We also offer a free and fully functional report designer.
If you have all the money in the world, go with Cognos. They provide a data cube that essentially makes the reporting "developer free" and the end user can create reports, dashboards, anything they like.
For the "common man", I've grown quite fond of the ComponentOne reports for .NET library/tools. It has a similar feel to Crystal Reports, but has a very friendly XML format that you and edit under the hood and none of the headaches with versioning, keys, and other items that I've had to deal with when making simple updates to either the report or the underlying version.
I don't really have much SSAS work to do but I've been quite taken with this:
Cube Browser for ASP.net
It offers many of the capabilities of an Excel pivot table in a web app (though I'm not enough of an expert on Excel to really know the whole of the pivot table's capabilities; it at least looks comparable to Visual Studio's cube browser).
Unfortunately the demos don't seem to be online anymore :(
I would have to agree, I really like SQL Server Reporting Services. It just does stuff, and does it easily.
Crystal Reports, because it is easy to take the exact same report file and:
1 - Post it on the intranet
2 - Embed it in an application
3 - Schedule it to be emailed as an Excel output every so often to whoever needs it
Also (as I already suggested), it exports easily to Excel, PDF, and other formats.
We've been using BIRT, which had a steep learning curve for me until I realized how many WYSIWYG features it had (I started editing the XML source code directly, which I don't recommend). There are some output-specific tricks (like using a 0 left margin to avoid getting a blank A column when outputting to XLS format), but for the most part it's quick and easy to use, edit, and preview.
I have also been impressed by how easy it is to intermix different datasets in a single report. While not a silver bullet, it's a better all-around tool than 99.999% of people are going to build on their own.
"Give them data and they will love you for it"
Out of the methods and tools I've used in the past, I would rank them in the following order based on abilities/versatility/usability/speed to deploy. I'm leaving cost out of it because while it is always a factor it is a different factor for everyone.
1. Cognos (version 8)
2. SQL Server Reporting Services
3. Crystal Reports
4. Custom-written code
I haven't used any of the other tools mentioned. Cognos 8 is nothing short of awesome. While pricey, you are only limited by your imagination. It can do anything.
This isn't so much a positive suggestion as a cautionary tale against Crystal Reports... As with other people, getting the right version of the Crystal runtime is important, but having done that, I still had this problem:
Spent weeks developing reports that had embedded images.
Tested on dev and staging environment, all A-OK.
Deploy to live server - doesn't work... Hmmm...
Spent two weeks trawling forums and looking for advice; eventually got a response from a Crystal representative on their forums. He suggested that he had seen a similar problem to do with MS Paint being set up as the default application for a certain file extension.
At this point, we gave up trying (after I convinced my boss that this wasn't a take-the-piss answer, but actually a formal response from Crystal). Handily, we were migrating to new servers about a month later (where the reports worked), but honestly, I wouldn't touch them again...
Oh, and have used SSRS and found it to be pretty good for most things (particularly the most recent version).
Tableau software is an amazing tool for running your reports and easily getting deep analysis.
For simple reports I use the standard ReportViewer included in Visual Studio.
For more complicated reports, and ones that require more performance, I've used both Report Sharp Shooter and DevExpress XtraReports. Surprisingly, in both products creating tables isn't as easy as it should be, but both are faster than ReportViewer and handle multi-column reports, barcodes, and aggregate data extremely well.
We use Cognos, it's a fairly complex system, but very powerful.
I have a small reporting set, made in 2 months:
- at least 10 times faster than Crystal Reports;
- easy editing;
- .NET formulas;
- easy usage;
- small code usage;
- serialization and deserialization (fast and small);
- extreme security;
- multi-threaded;
- no errors.
We had used MS Reporting Services, but we were completely unhappy with it.
Reasons:
- difficult server configuration is required
- it is not possible to embed the report editor into our app without buying a SQL Server license for every user
- you can only use the built-in report-parameter input UI or send parameters from the app; you cannot create a parameters UI in the report designer
Now we are using Stimulsoft Reports. It has none of the limitations of MS Reporting Services, and we and our users are happy with it.
1) I would think Reporting Services is very good for most needs when it comes to developing table-based reports and also matrix reports (drilldown, pivot-like functionality), considering the price of Cognos etc. An SME can't even dream of getting Cognos, AFAIK.
2) Report scheduling/subscription functionality can be invoked to send reports to a set of users (data driven). Subscriptions can be delivered to custom locations, such as an SFTP server, by writing .NET code.
3) Using Report Models, end users can drag and drop columns and develop customized reports.
To Note:
1) It can get trickier once you develop really complex graphical/dashboard-style reports that involve a few charts and small tables displayed on an A4 page. Report Designer (the tool we use to design reports) and the web display use different rendering engines, so if you develop complex graphical reports it is better to deploy them often and see how they look.
2) If you write custom functionality, you may have to change the XML configuration files (RSReportServer.config etc.). If there is any problem with the edit, the ReportServer service may stop, so be careful to back everything up before doing anything custom.
Cognos with an Oracle backend is what we use. We also use Spotfire for visualization on top of Cognos.
I'm the CTO at Windward, and I do believe that Windward Reports is by far the easiest to use and that you can do more with it than with any other reporting tool; both traits stem from the same reason: you design your reports in Word, Excel, & PowerPoint.
As for the generated reports: it's fast, it's rock solid, and incorporating it into your program can take as little as 3 lines of code.
We use Crystal Reports where I work. It has quite a few limitations, and we find ourselves doing almost all of the logic in Database procedures and Views.
One limitation to note is that Crystal Reports does not allow multiple layered sub-reports. In other words, you cannot have a sub-report inside a sub-report.