I am working on Power BI and use SQL Server as the database. I use views or direct tables as the source for Power BI. My views are simple SELECT queries with simple joins, so I am not finding any scope for query optimization. Query execution takes time in SQL Server, and the tables have millions of rows, increasing day by day.
Now I am thinking of using Impala alongside SQL Server. I am getting clean data from RapidMiner. I have not used Impala before, so I have some doubts; please answer if you can. I have zero knowledge of Impala.
1. Can we create a connection between RapidMiner and Impala? If so, what are the steps? Google gives me some steps which are difficult to understand.
2. Can we create a connection between Impala and SQL Server?
3. Can we create a view in Impala and put joins inside that view? I know Impala supports views and joins separately, but can we combine them?
4. Suppose the SQL Server-Impala connection is made, and I have one table in Impala and one table in SQL Server Management Studio. Can I join both tables in Impala? For this, can we create a connection between Impala and SQL Server Management Studio?
5. Can I use all tables or views created in SQL Server from Impala (after making the connection between SQL Server and Impala)? That is, my tables and views are in SQL Server, but I am fetching the data in Impala.
6. All tables are stored in SQL Server. Can I do join operations on these tables in Impala?
7. Can I make views in Impala using tables which are stored in SQL Server?
8. Can I create all tables in Impala and do ETL operations like SUM, ADD, and DATEADD in Impala?
9. Can I create all tables in Impala and do ETL operations like SUM, ADD, and DATEADD in Power Query?
10. Can I create views in SQL Server, load them into an Impala table, and use that in Power Query?
11. Can I create all tables and views with joins in Impala?
12. How can I optimise my query in SQL Server, and if I run the same query on the same data in Impala, will the execution time be reduced?
My SQL query is like this:
CREATE VIEW test AS
SELECT *
FROM table_a a
INNER JOIN table_b b ON a.id = b.id
INNER JOIN table_c c ON b.name = c.name;
GO
The output is 3,000,000 rows, increasing day by day.
I have also tried querying the tables directly instead of using the view, but the execution time does not decrease.
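For question 12, one commonly suggested starting point is to index the join columns so the joins can seek instead of scan. This is a sketch only; the dbo schema and the generic table names follow the view above and may not match the real schema.
CREATE NONCLUSTERED INDEX IX_table_b_id ON dbo.table_b (id);
CREATE NONCLUSTERED INDEX IX_table_c_name ON dbo.table_c (name);
-- Selecting only the columns Power BI actually needs, instead of SELECT *,
-- also reduces the amount of data transferred on each refresh.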
Related
I am using a SQL database and MS Access 2019 as the front end. The SQL database tables are linked to the Access db using an ODBC connection.
All my queries (they have multiple joined linked tables) run just fine, but as soon as I add a join to a table stored in the Access app (for example, a small table just for mapping values) the query will slow to a crawl. Doesn't matter if the joined fields are indexed or what type of join I'm using.
If anyone has seen this behaviour and found a solution I would much appreciate hearing it.
Joining tables from two separate databases requires the client app to retrieve both tables in their entirety in order to determine the rows needed. That's why it's slow.
If your Access table is small, try using a stored procedure on the SQL side with the data from Access moved to a temporary table. (Or better yet, move the Access table to SQL).
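For illustration, a minimal T-SQL sketch of the temp-table idea above; the table dbo.Orders and the columns StatusCode, Code, and Label are assumptions, not from the original post.
-- Create a temp table on the SQL Server side and fill it with the rows from
-- the small Access mapping table (one insert per row, or an append query),
-- then perform the join entirely on the server:
CREATE TABLE #AccessMapping (Code INT PRIMARY KEY, Label VARCHAR(100));

SELECT o.OrderID, o.Amount, m.Label
FROM dbo.Orders AS o
INNER JOIN #AccessMapping AS m ON m.Code = o.StatusCode;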
I have noticed that Create Table As Select (CTAS) statements in SQL Data Warehouse are extremely fast compared to SELECT INTO statements.
I want to know what magic Microsoft did to make them so fast?
The magic is: PolyBase with minimal transaction logging.
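For context, a hedged sketch of what a CTAS statement looks like in SQL Data Warehouse; the table, columns, and distribution choice are illustrative assumptions.
-- CTAS creates and loads the target table in a single, minimally logged operation:
CREATE TABLE dbo.SalesCopy
WITH (DISTRIBUTION = HASH(CustomerId), CLUSTERED COLUMNSTORE INDEX)
AS
SELECT CustomerId, OrderDate, Amount
FROM dbo.Sales;
-- The fully logged equivalent would be:
-- SELECT CustomerId, OrderDate, Amount INTO dbo.SalesCopy FROM dbo.Sales;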
So here's what I am trying to do.
I have a table which has 2 columns, QC_Check and Query. For each QC_Check I have a query, and there are several records like this.
Is there a way, using a SQL transformation, to fetch the SQL query stored in the Query column into Informatica, run the queries in Teradata, and store the results somewhere?
Although I have not tried it myself, this should be possible using a SQL transformation in Query mode with dynamic SQL queries.
Use the table with the Query column as a source, create a SQL transformation in Query mode, and connect the Query column to the SQL transformation.
Write ~Query_Port~ in the SQL editor of the SQL transformation.
If you want to capture the results from your query, you have to configure output ports for columns you retrieve from the database.
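For illustration, one hypothetical shape of the driving table and its contents; the database, table, and query text are assumptions, not from the original post.
-- Source table read by the mapping; each row carries one QC query.
CREATE TABLE qc_checks (
    QC_Check VARCHAR(50),
    Query    VARCHAR(1000)
);
INSERT INTO qc_checks VALUES ('row_count',
    'SELECT COUNT(*) AS result FROM sales_db.orders');
INSERT INTO qc_checks VALUES ('null_check',
    'SELECT COUNT(*) AS result FROM sales_db.orders WHERE customer_id IS NULL');
-- The SQL transformation substitutes each row's Query value for ~Query_Port~
-- and runs it against Teradata; aliasing every query's result column the same
-- way (AS result) lets a single output port capture all the results.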
I have a simple Access database I use to create a number of reports. I've linked to a Teradata database server in our organization to pull some additional employee-level details. There is a simple left join on employee number, and all I pull is the name and the role.
The query without the connect takes maybe a minute or so to run and is very quick once loaded. Left joining on the Teradata connection slows everything down to a crawl. It can take 10 minutes or so to run the query through Excel. When the query is loaded in Access, scrolling through it is very slow.
I should note there's no performance issues with the Teradata server. I pull unrelated reports from the same and different tables, with complex joins and the speed is very quick.
I tried creating an even simpler query that does almost nothing, and the performance issues are still there. Here is the code:
SELECT EMPL_DETAILS_CURR.NM_PREFX, EMPL_DETAILS_CURR.NM_GIVEN,
MC.DT_APP_ENTRY, MC.CHANNEL_IND
FROM MC LEFT JOIN EMPL_DETAILS_CURR ON MC.EMP_ID = EMPL_DETAILS_CURR.EMP_ID;
There are only 7000 records in MC.
If you are joining data between MS Access tables and Teradata tables, the join has to be completed by the Microsoft JET engine on your local machine. That means the data in your Teradata tables is being brought down to your local machine so that it can be joined.
If the tables are all on Teradata and accessed via linked tables in MS Access, the join may still be occurring locally. I would suggest running the query as a pass-through query so that the SQL is sent to Teradata for execution and only the results are returned to MS Access when the query completes.
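For illustration, a minimal sketch of the SQL a pass-through query could send, assuming both tables actually live on Teradata (as the paragraph above supposes); the hr_db database qualifier is an assumption.
SELECT e.NM_PREFX, e.NM_GIVEN, m.DT_APP_ENTRY, m.CHANNEL_IND
FROM hr_db.MC AS m
LEFT JOIN hr_db.EMPL_DETAILS_CURR AS e
    ON m.EMP_ID = e.EMP_ID;
-- Because the whole statement executes on Teradata, only the joined result set
-- crosses the ODBC connection back to Access.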
An application connected to MS SQL Server will create views in which each result row is an analysis aggregating 1-10k records. The applicable criteria across the resulting view will return dozens to tens of thousands of rows. The view plus criteria will then be ordered by some user-specified column in the view, most likely one of the aggregated columns. Response times are expected to degrade quickly when an aggregated column is used for ordering.
A while back, this problem was solved pretty easily (in Oracle 9i) with materialized views.
Any ideas on how to get a similar solution in MS SQL Server 2005?
You can use Indexed views for this.
Read here for SQL 2005: http://msdn.microsoft.com/en-us/library/dd171921.aspx
Read here for SQL 2008: http://msdn.microsoft.com/en-us/library/dd171921.aspx
Materialized views are not the same as indexed views. SQL Server indexed views have multiple limitations, such as restrictions on outer joins, certain aggregates, and common table expressions.
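For illustration, a minimal sketch of an indexed view; the table dbo.Sales and its columns are assumptions, and Amount is assumed NOT NULL (a requirement for SUM in an indexed view).
-- The view must be schema-bound, use two-part names, and include COUNT_BIG(*)
-- because it contains GROUP BY.
CREATE VIEW dbo.vw_SalesByCustomer
WITH SCHEMABINDING
AS
SELECT CustomerId,
       SUM(Amount)  AS TotalAmount,
       COUNT_BIG(*) AS RowCnt
FROM dbo.Sales
GROUP BY CustomerId;
GO
-- The unique clustered index materializes the aggregated rows, so ordering by
-- TotalAmount no longer recomputes the aggregation at query time.
CREATE UNIQUE CLUSTERED INDEX IX_vw_SalesByCustomer
    ON dbo.vw_SalesByCustomer (CustomerId);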