Is there a way to let non-technical individuals utilize BigQuery reports? - google-bigquery

I want to have an access portal for non-tech-savvy individuals in which they could make reports of their own without needing to know SQL whatsoever.
It would be best if I could create custom fields myself, and then just let the users in the portal pick and choose whichever they like with a custom date range.
I've explored the options Google Data Studio offers, but it looks to me like it mostly puts an emphasis on data visualization.
In addition, my attempts to make custom queries with it were not successful, since the platform is rigid in terms of deciding which field is a metric and which is a dimension (and it does so inaccurately). This makes it hard to query reports as you normally would using BigQuery, which doesn't have these somewhat arbitrary limitations.
Perhaps I've misunderstood something about the platform due to my limited experience with it, but it looks like Data Studio isn't going to fit the bill for me.
EDIT: In addition, the platform should have a way of exporting said reports as CSV files, a feature that Data Studio doesn't have as far as I know.
It would be great to receive suggestions for a different platform which would better fit my needs, or even suggestions on how to make better use of Data Studio.

Have you looked at using a tool like Redash (https://redash.io)? Assuming your GA360 data is in BigQuery, you can connect Redash to BQ. Then you can author queries and visualize.
You can also use the Google Cloud SDK to connect to BQ and run custom queries to generate new tables in BQ based on the GA360 session data. Then use Redash, or any other tool, to report/visualize.
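For the CSV requirement specifically, a scheduled script can run a fixed query over a user-chosen date range and hand back a file. Below is a minimal sketch with the BigQuery Python client; the project, dataset, table and field names are placeholders rather than anything from your setup:

    # Minimal sketch: run a parameterized query over a custom date range and
    # dump the result to CSV. Project, dataset, table and field names are
    # placeholders - swap in your own GA360 export tables and fields.
    import csv
    import datetime
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default GCP credentials/project

    sql = """
        SELECT date, trafficSource.medium AS medium, COUNT(*) AS sessions
        FROM `my-project.my_dataset.ga_sessions_*`
        WHERE PARSE_DATE('%Y%m%d', date) BETWEEN @start_date AND @end_date
        GROUP BY date, medium
        ORDER BY date
    """

    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("start_date", "DATE", datetime.date(2019, 1, 1)),
            bigquery.ScalarQueryParameter("end_date", "DATE", datetime.date(2019, 1, 31)),
        ]
    )

    rows = client.query(sql, job_config=job_config).result()

    with open("report.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([field.name for field in rows.schema])
        for row in rows:
            writer.writerow(list(row))

If you'd rather materialize the result as a new BQ table (as described above) and point Redash or Data Studio at it, QueryJobConfig also accepts a destination table.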

Related

Alter data source

Power BI can connect to various data sources and run SELECT queries.
Is it also possible to run other queries (INSERT INTO, UPDATE, ...)?
Right now I need this for a PostgreSQL database, but I could also use it for others in the future.
No, you can't run INSERT/UPDATE queries directly from Power BI. That isn't the idea of the tool. If you find you need it, then there is probably a major flaw in your design, or you are not using the right tool for the job. But there are a few ways to work around this (again, I'm not saying that you SHOULD do it). Usually this is done in combination with a custom-written Power App, embedded in your report in a Power Apps visual. The idea is that the app will write to the database, and will refresh your report after that (if needed).
You can start here, and I recommend you look at this in-depth session - Writing back data to PowerBI from your reports.
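For what it's worth, the write-back itself is just an ordinary INSERT executed by whatever sits between the report and the database (the Power App, a Power Automate flow, or a small API of your own). A rough sketch of that last option against PostgreSQL, where the table, columns and connection details are made up for illustration:

    # Hypothetical write-back endpoint: Power BI itself never runs the INSERT;
    # a Power App (or this tiny API) does, and the report is refreshed after.
    # Table and column names are invented for illustration only.
    import psycopg2
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    @app.route("/comments", methods=["POST"])
    def add_comment():
        payload = request.get_json()
        conn = psycopg2.connect("dbname=reporting user=writer password=secret host=localhost")
        try:
            # connection context manager commits on success, rolls back on error
            with conn, conn.cursor() as cur:
                cur.execute(
                    "INSERT INTO report_comments (report_id, author, body) VALUES (%s, %s, %s)",
                    (payload["report_id"], payload["author"], payload["body"]),
                )
        finally:
            conn.close()
        return jsonify({"status": "ok"}), 201

    if __name__ == "__main__":
        app.run()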
To be very straightforward, the answer is no. Power BI is a data analysis platform. There are probably some advanced ways to do it, but manipulating data from a report, or from any BI tool, is neither logical nor a good idea. You can find answers on various blogs where the same question has been asked. For more details, you can check the links below:
help link 1
help link 2

Can I see or browse raw segment.com events on Keen.io?

Isn't there a way to list raw events from segment.com, with all their attributes, on Keen.io?
It seems the UI only allows me to create aggregations.
I believe you're looking for extractions. These give you back the raw events from whatever source sent them, with all their attributes. Keep in mind, however, that an extraction will generally be slower than any other kind of query. Furthermore, doing too many of them can cause irate Keen engineers* to get on your case.
*In this case, probably me.
The easiest way to do this is to log into your Keen account and go to the Explorer. Click preview collections to pull up a schema inspector tool and sample events.
You can also run an extraction query in the Explorer or any of the query SDKs, and you can easily download a CSV of your raw events.
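If you want to pull those raw events programmatically instead, a rough sketch of calling the extraction resource over HTTP and saving the result as CSV could look like the following; the project ID, read key and collection name are placeholders, and the endpoint and parameter names should be double-checked against the Keen API docs:

    # Rough sketch: pull raw events via an extraction query and write a CSV.
    # PROJECT_ID, READ_KEY and the collection name are placeholders; verify
    # the endpoint and parameters against the current Keen API documentation.
    import csv
    import requests

    PROJECT_ID = "your-project-id"
    READ_KEY = "your-read-key"

    resp = requests.get(
        f"https://api.keen.io/3.0/projects/{PROJECT_ID}/queries/extraction",
        params={
            "api_key": READ_KEY,
            "event_collection": "pageviews",   # whatever Segment is sending
            "timeframe": "this_7_days",
        },
        timeout=60,
    )
    resp.raise_for_status()
    events = resp.json()["result"]

    if events:
        # nested properties (e.g. the "keen" metadata object) come back as
        # dicts; flatten them further if you need one column per attribute
        fieldnames = sorted({key for event in events for key in event})
        with open("raw_events.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(events)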
Data Explorer Guide has more tips & tricks as well.

How to turn MongoDB collection into a 'Table'

I've been given access to a cloud MongoDB (MongoLab) and need to extract some data into Excel so I can analyse it. The data isn't particularly complicated or large and is well suited to a 'normal' relational structure.
My research suggests things are trickier because the data has 'nested' aspects, although conceptually it's pretty clear how this would become a table. Here is what a document in the collection looks like; essentially the stuff highlighted blue would be columns in the table, while the yellow would create a row for each "marketing_event", with the specifics of each event also being in a column:
Ideally I would use Power Query to get the data into Power Pivot but at this point anything will do!
I've tried a bunch of things, none of which has got me much closer to the end result I'm looking for:
I downloaded MongoVue, which I used to successfully connect to the database, and while it enabled me to see the data in a basic table form, it does nothing with the nested stuff, and the documentation is minimal in terms of how it could be of more use.
I also tried Pentaho PDI based on this article: http://sqlmag.com/blog/integrating-mongodb-and-open-source-data-stores-power-pivot but the steps aren't detailed, and although I can see the collection, trying to replicate some sample queries I found on the web was totally unsuccessful.
I've tried to get a trial of Simba's ODBC connector, but as yet the download doesn't seem to be working. I have contacted them but haven't had a response just yet.
I've even installed Mongo locally and tried to use the command prompt to connect, which I was unable to do. Even if I pursued this, I wouldn't be confident about knowing where to start in terms of creating the end product.
Happy to hear any suggestions or recommendations.
TIA
Jacob
Here's a solid ODBC driver that helps maintain the fidelity of your MongoDB data by exposing the nested MongoDB data model as a set of relational tables to Excel and other ODBC apps. In the sample document above, this driver will do exactly what you're looking for. The embedded documents and arrays can be extracted as separate related tables from the fields at the root level of the document.
https://www.progress.com/odbc/mongodb
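If a one-off table is enough and you'd rather script it than install an ODBC driver, the same "root fields become columns, embedded array becomes rows" idea can be done in a few lines of Python with pymongo and pandas. The connection string and field names (marketing_events, customer_id, name) below are guesses based on the document described in the question:

    # Rough sketch: flatten the nested collection into a flat CSV that Excel /
    # Power Query can open. Connection string and field names are assumptions
    # based on the question's description, not the actual schema.
    import pandas as pd
    from pymongo import MongoClient

    client = MongoClient("mongodb://user:password@ds012345.mongolab.com:12345/mydb")
    docs = list(client["mydb"]["customers"].find())

    # One output row per embedded marketing event, with the root-level fields
    # repeated on every row (these become the "blue" columns from the question).
    table = pd.json_normalize(
        docs,
        record_path="marketing_events",
        meta=["_id", "customer_id", "name"],
        record_prefix="event.",
        errors="ignore",
    )

    table.to_csv("marketing_events.csv", index=False)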
I don't know if you have already found a solution, but Simba's ODBC driver provides support for nested arrays.
Have a look here:
https://www.simba.com/resources/webinars/connect-tableau-big-data-source. This is an example of how to connect Tableau to MongoDB. You might find it helpful.
And some more information on handling no-sql data in BI tools is provided in this whitepaper: http://info.mongodb.com/rs/mongodb/images/MongoDB_BI_Analytics.pdf

I need advice choosing a NoSQL database for a project with a lot of minute related information

I am currently working on a private project that is going to use Google's GTFS spec to get information about hundreds of public transit agencies, their routes, stations, times, and other related information. I will be getting my information from here and the Google Code wiki page with similar info. There is a lot of data and it's partitioned into multiple CSV-formatted text files. These can be huge, some ranging from 80-100 MB of data.
With the data I have, I want to translate it all into a nice solid database that I can build layers on top of to use for my project. I will be using GPS positioning to pinpoint a location and all surrounding stations/stops.
My goal is to access all the information for all these stops and stations with as few calls as possible, while keeping datasets small for queried results.
I am currently leaning towards MongoDB and CouchDB for their GeoSpatial support that can really optimize getting small datasets. But I also need to be sure to link all the stops on a route because I will be propagating information along a transit route for that line. In this case I have found that I can benefit from a Graph DB like Neo4j and OrientDB, but from what I know, neither has GeoSpatial support nor am I 100% sure that a Graph DB would be what I need.
The perfect solution might not exist, but I come here asking for help on finding the best possible one for my situation. I know I will possibly have to work around limitations of whatever I choose, but I want to at least have done my research and know that it's the best I can get at the moment.
It has also been suggested that I splinter the data into multiple DBs, but that could get very messy because all the information is very tightly interconnected through IDs.
Any help would be appreciated.
Obviously a graph database fits your problem 100%. My advice here is to go for some geospatial module on top of Neo4j or OrientDB, although there are some other free and open source implementations.
I think the best one right now, with all the geospatial features implemented, is the neo4j-spatial package. But as far as I know, you can also reproduce most of the geospatial functionality on your own if necessary.
BTW, talking about splitting: if the amount of data/queries will be high, I strongly recommend sharing the load and thinking about the model in those terms. Surely you can do something.
I've used Mongo's GeoSpatial features and can offer some guidance if you need help with a C# or javascript implementation - I would recommend it to start because it's super easy to use. I'm learning all about Neo4j right now and I am working on a hybrid approach that takes advantage of both Mongo and Neo4j. You might want to cross reference the documents in Mongo to the nodes in Neo4j using the Mongo object id.
For my hybrid implementation, I'm storing profiles and any other large static data in Mongo. In Neo4j, I'm storing relationships like friend and friend-of-friend. If I wanted to analyze movies two friends are most likely to want to watch together (or really any other relationship I hadn't thought of initially), by keeping that object id reference I can simply add some code instructing each node go out and grab a list of movies from the related profile.
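For reference, the Mongo geospatial lookup mentioned above ("all surrounding stations/stops") is only a few lines once the stops collection has a 2dsphere index; the collection, field names and coordinates here are purely illustrative:

    # Illustrative sketch: find all transit stops within 500 m of a GPS fix.
    # Collection name, field names and coordinates are placeholders.
    from pymongo import MongoClient, GEOSPHERE

    client = MongoClient("mongodb://localhost:27017")
    stops = client["transit"]["stops"]

    # One-time index on the GeoJSON location field.
    stops.create_index([("location", GEOSPHERE)])

    nearby = stops.find({
        "location": {
            "$near": {
                "$geometry": {"type": "Point", "coordinates": [-73.9857, 40.7484]},
                "$maxDistance": 500,  # metres
            }
        }
    })

    for stop in nearby:
        print(stop["stop_id"], stop["stop_name"])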
Added 2011-02-12:
Just wanted to follow up on this "hybrid" idea as I created prototypes for and implemented a few more solutions recently where I ended up using more than one database. Martin Fowler refers to this as "Polyglot Persistence."
I'm finding that I am often using a combination of a relational database, document database and a graph database (in my case this is generally SQL Server, MongoDB and Neo4j). Since the question is related to data modeling as much as it is to geospatial, I thought I would touch on that here:
I've used Neo4j for site organization (similar to the idea of hypermedia in the REST model), modeling social data and building recommendations (often based on social data). As a result, I will generally model this part of the application before I begin programming.
I often end up using MongoDB for prototyping the rest of the application because it provides such a simple persistence mechanism. I like to start developing an application with the user interface, so this ends up working well.
When I start moving entities from Mongo to SQL Server, the context is usually important - for instance, if I have an application that allows users to build daily reports based on periodically collected data, it may make sense to run a procedure that builds those reports each night and stores daily report objects in Mongo that may be combined into larger aggregate reports as needed (obviously this doesn't consider a few special cases, but that is not relevant to the point)...on the other hand, if users need to pull on-demand reports limited to very specific time periods, it may make sense to keep everything in SQL server and build those reports as needed.
That said, and this deserves more intense thought, here are some considerations that may be helpful:
I generally try to store entities in a relational database if I find that pulling an entity from the database - in other words (in the context of a relational database), querying the data required to generate an entity or list of entities that fulfills the requested parameters - does not require significant processing (multiple joins, for instance)
Do you require ACID compliance (aside: if you have a graph problem, you can leverage Neo4j for this)? There are document databases with ACID compliance, but there's a reason Mongo is not: What does MongoDB not being ACID compliant really mean?
One use of Mongo I saw in the wild that I thought was worthy of mention - Hadoop was being used to compute massive hash tables that were then stored in Mongo. I believe a similar approach is used by TripAdvisor for user based customization in terms of targeting offers, advertising, etc..
NoSQL only exists because MySQL users assume that all databases share MySQL's performance problems when the database grows large and/or becomes complex.
I suggest that you use PostGIS. You can use the same database for the rest of your data needs as well.
http://postgis.refractions.net/
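To make the PostGIS suggestion concrete, the "stations near a GPS point" lookup is a single query; the table and column names below are assumptions, not anything taken from the GTFS feeds:

    # Illustrative sketch: stops within 500 m of a GPS position using PostGIS.
    # Assumes a stops table with a geography(Point, 4326) column named geom.
    import psycopg2

    conn = psycopg2.connect("dbname=transit user=reader host=localhost")
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT stop_id, stop_name
            FROM stops
            WHERE ST_DWithin(geom, ST_SetSRID(ST_MakePoint(%s, %s), 4326)::geography, 500)
            """,
            (-73.9857, 40.7484),
        )
        for stop_id, stop_name in cur.fetchall():
            print(stop_id, stop_name)
    conn.close()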

Custom user-driven reports on a known schema

There's an upcoming project at work to fill a requirement that end-users be able to generate custom reports off their data within our fixed/known-schema relational database.
The interface needs to be very user friendly, and so transposing all of T-SQL's language concepts into a graphical paradigm is far too complex for both the project team and the end user.
What research or products, open-source or otherwise, exist around satisfying this kind of business need? I'm aware of general business analytics tools, but this is more specific, and I'm trying to understand the problem domain better rather than trying to reverse engineer it from vendor marketing materials.
I assume the research would be in the form of some encoding of the schema that specifies which joins and tables are allowed and which fields are available, and then a method for allowing the user to select one particular valid combination among the possible many, generate the query, and display the results.
Brainstorming - feature support in order of complexity: SELECT, WHERE filters, FULL JOIN, LEFT JOIN, sorting, paging, grouping, aggregation, HAVING filter.
My backup plan is to just dumb it down to pre-written SQL views (with JOINs built in) with the ability to display available columns with custom row-wise filtering. Paging and sorting are doable. By itself, this doesn't allow for grouping, aggregate functions, HAVING filters, or other inter-row analysis.
As a follow-up to Dems' post (the comment box wasn't big enough :) )..
Agreed on most counts. If your data is mostly analytic, then you might want to look into a tool like PowerPivot. In this case, you can write a general query and then allow the users to derive reports based on the result set in a familiar tool (Excel).
At the core of every ad hoc reporting engine, you will find a few common themes:
Metadata
There will be some way of describing the schema such that the model may be easily consumed by the user. Sql Server Reporting Services (SSRS) requires you to build a metadata model in order to use the report builder. When using PowerPivot, you can alias column names to make them more readable, but in the end, you are simply providing a flat dataset and allowing the user to build the joins/relationships.
Query Builder
Once the metadata has been manipulated by the user, an intermediary system must be in place to convert the conceptual report into an actual query. Many tools are measured based on the complexity of the Sql that they produce as this can greatly affect performance. One way to get around this is to create views that the reporting engine may build queries against. One of the best open source examples of this that I have seen is the engine that backs Hibernate/NHibernate (look into how the various Dialects are used when building queries).
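To make the query-builder theme concrete, here is a very stripped-down sketch of the idea: the metadata is just a whitelist of views and the columns they expose, and the user's point-and-click selections are converted into a parameterized query rather than pass-through SQL. All view and column names are hypothetical, and the placeholder and paging syntax will depend on your driver and database:

    # Toy sketch of a metadata-driven report query builder. Users never write
    # SQL: they pick a view, columns and filters from a whitelist, and the
    # builder emits a parameterized query. All names here are hypothetical.
    ALLOWED_VIEWS = {
        "sales_report_view": {"order_date", "region", "product", "quantity", "revenue"},
        "customer_report_view": {"customer_id", "signup_date", "country", "lifetime_value"},
    }
    ALLOWED_OPERATORS = {"=", "<>", "<", ">", "<=", ">="}

    def build_report_query(view, columns, filters, order_by=None):
        """filters is a list of (column, operator, value) tuples."""
        if view not in ALLOWED_VIEWS:
            raise ValueError(f"unknown view: {view}")
        exposed = ALLOWED_VIEWS[view]
        if not columns or not set(columns) <= exposed:
            raise ValueError("column not exposed for this view")

        sql = f"SELECT {', '.join(columns)} FROM {view}"
        params, clauses = [], []
        for column, op, value in filters:
            if column not in exposed or op not in ALLOWED_OPERATORS:
                raise ValueError("filter not allowed")
            clauses.append(f"{column} {op} %s")   # %s style; use ? for pyodbc
            params.append(value)
        if clauses:
            sql += " WHERE " + " AND ".join(clauses)
        if order_by:
            if order_by not in exposed:
                raise ValueError("cannot sort by that column")
            sql += f" ORDER BY {order_by}"
        return sql, params

    # The UI's selections for a report become, for example:
    sql, params = build_report_query(
        "sales_report_view",
        ["order_date", "region", "revenue"],
        [("order_date", ">=", "2015-01-01"), ("region", "=", "EMEA")],
        order_by="order_date",
    )
    # sql    -> SELECT order_date, region, revenue FROM sales_report_view
    #           WHERE order_date >= %s AND region = %s ORDER BY order_date
    # params -> ['2015-01-01', 'EMEA']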
Rendering Engine
In my experience, building a rendering engine is not a road you want to go down. There are many device-specific concerns as well as look & feel problems (i.e. how do you plan on representing cascading joins/relationships?). Every rendering engine has its own quirks (PowerPivot uses Excel, SSRS has a service that builds the raw result and returns it to the consuming application) that must be accounted for, so be careful how you choose.
Earlier I mentioned that I agreed on most counts. I would not recommend encouraging your users to learn Sql or allowing them to pass-through Sql to the underlying data-store. This opens the door to malicious code being written and can become a security nightmare. Not to mention that most business users think in terms of flat tables, not hierarchical sets.
Figure out what your users are comfortable with and try to fit your solution to that domain. I have often found that for sophisticated business users something like PowerPivot is perfect. For more day-to-day end users, having "canned" reports that might be modified by the end user via a simple user interface that allows them to modify restrictions/groupings/sorting is more useful.
There are many options out there, but the best of them cost money.
I really like QlikView as an easy-to-use report designer for semi-technical people. If your user base is more technically minded it may be a bit restrictive, but if your user base has no logical thought capabilities, it's too complicated. That's the biggest trap I see you falling into...
- No, I want more than that!
- No, that's too complicated for me!
- At the same time...
If you were to build your own tool-set internally, you'd probably be best sticking with OLAP cubes. Let people slice and dice the data as they like, but with all the relationships pre-defined. Do it right and you can just point an Excel Pivot Table at the OLAP Cube and let them play...
The next up, as Bobby D says, could be SQL Server Reporting Services, or something similar.
But if your users end up wanting absolute flexibility, the tool they need is SQL itself. Unfortunately, all tools follow the same trend: The more flexible and powerful, the more time you need to spend learning/training.