I have a database with 10 tables. Each table holds 70 months of data, from 2011 to the current month, and will grow by one month going forward. Each table has an identity ROW_ID column with a clustered primary key index, plus non-unique non-clustered indexes on ID and Date. The tables hold more than 90 million rows.
All I need is to create a fact table and dimension tables from those 10 tables in order to build a data warehouse cube. I have searched many blogs on Google and could not figure out a solution. Can anyone suggest the best approach to creating the fact and dimension tables?
Once the cube is built with the 70 months of data, it should refresh monthly and load only the new month's data. I do not want the cube to reload everything from 2011.
Any suggestions will be helpful.
Thanks.
I have customer data stored in one table, and 21 other tables have foreign keys to the customer table. I want to find the data size for each customer in SQL Server.
One more thing: some other tables have foreign keys to these 21 tables, and I want to find and add the data size from those tables as well.
How can I find the TOTAL data size? Any ideas?
You must find all the tables associated with the customer table by a foreign key, using the ISO-standard INFORMATION_SCHEMA views: TABLE_CONSTRAINTS, REFERENTIAL_CONSTRAINTS, and so on.
Once you have all the tables, get the object_id of each one.
Then use sys.dm_db_partition_stats and aggregate the used_page_count for each table. Because pages are 8 KB, multiply the SUM by 8 for KB, or divide it by 128 for MB.
If you want the average size per customer, divide the result by the count of customers. Due to fragmentation, the exact data size for each customer can vary heavily for the very same data when other data movements are executed or when some maintenance is done.
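The steps above can be sketched as two queries. All table and schema names here (dbo.Customer, dbo.Orders) are illustrative assumptions; substitute your own:

```sql
-- 1. Find every table with a foreign key referencing dbo.Customer
--    (assumed name) via the ISO INFORMATION_SCHEMA views.
SELECT tc.TABLE_SCHEMA, tc.TABLE_NAME
FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS rc
JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
  ON rc.CONSTRAINT_NAME = tc.CONSTRAINT_NAME
 AND rc.CONSTRAINT_SCHEMA = tc.CONSTRAINT_SCHEMA
JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS pk
  ON rc.UNIQUE_CONSTRAINT_NAME = pk.CONSTRAINT_NAME
 AND rc.UNIQUE_CONSTRAINT_SCHEMA = pk.CONSTRAINT_SCHEMA
WHERE pk.TABLE_NAME = 'Customer';

-- 2. Sum used pages for one of those tables; pages are 8 KB,
--    so * 8 gives KB and / 128 gives MB.
SELECT SUM(ps.used_page_count) * 8     AS size_kb,
       SUM(ps.used_page_count) / 128.0 AS size_mb
FROM sys.dm_db_partition_stats ps
WHERE ps.[object_id] = OBJECT_ID(N'dbo.Orders');  -- assumed table name
```

Run the second query once per table returned by the first (or join the two), then divide the grand total by the customer count for a per-customer average.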
If you need more help, please post the exact names of your tables, including schema names.
My question is about table partitioning in SQL Server 2008.
I have a program that loads data into a table every 10 mins or so. Approx 40 million rows per day.
The data is bcp'ed into the table and needs to be able to be loaded very quickly.
I would like to partition this table based on the date the data is inserted into the table. Each partition would contain the data loaded in one particular day.
The table should hold the last 50 days of data, so every night I need to drop any partitions older than 50 days.
I would like to have a process that aggregates data loaded into the current partition every hour into some aggregation tables. The summary will only ever run on the latest partition (since all other partitions will already be summarised) so it is important it is partitioned on insert_date.
Generally when querying the data, the insert date is specified (or multiple insert dates). The detailed data is queried by drilling down from the summarised data and as this is summarised based on insert date, the insert date is always specified when querying the detailed data in the partitioned table.
Can I create an Insert_date column in the table with a default of GETDATE() and then partition on it somehow?
OR
I can create an insert_date column in the table and populate it with a hard-coded value of today's date.
What would the partition function look like?
Would separate tables and a partitioned view be better suited?
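For reference, a daily partition function with a defaulted insert_date column could look roughly like this. This is a sketch only; the names (pf_InsertDate, ps_InsertDate, dbo.Loads) and boundary dates are assumptions:

```sql
-- RANGE RIGHT with one boundary per day: each partition holds
-- exactly one day's loads.
CREATE PARTITION FUNCTION pf_InsertDate (date)
AS RANGE RIGHT FOR VALUES ('2024-01-01', '2024-01-02', '2024-01-03');

CREATE PARTITION SCHEME ps_InsertDate
AS PARTITION pf_InsertDate ALL TO ([PRIMARY]);

CREATE TABLE dbo.Loads
(
    load_id     bigint IDENTITY NOT NULL,
    insert_date date NOT NULL
        CONSTRAINT df_Loads_InsertDate DEFAULT (CAST(GETDATE() AS date)),
    payload     varchar(200) NULL
) ON ps_InsertDate (insert_date);

-- Nightly maintenance: add tomorrow's boundary, merge out the
-- partition that has aged past the retention window.
ALTER PARTITION SCHEME ps_InsertDate NEXT USED [PRIMARY];
ALTER PARTITION FUNCTION pf_InsertDate() SPLIT RANGE ('2024-01-04');
ALTER PARTITION FUNCTION pf_InsertDate() MERGE RANGE ('2024-01-01');
```

With this shape, bcp loads that omit insert_date pick up today's date from the default and land in the current day's partition automatically.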
I have tried both, and even though I think partitioned tables are cooler, after trying to teach others how to maintain the code afterwards it just wasn't justified. In that scenario we used a hard-coded date field set in the insert statement.
Now I use separate tables (31 days / 31 tables) plus an aggregation table, and there is an ugly UNION ALL query that stitches the month's data together.
Advantage: super simple SQL, simple C# code for bcp, and nobody has complained about complexity.
But if you have the infrastructure and a gaggle of .NET/SQL gurus, I would choose the partitioning strategy.
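The separate-tables approach described above can be sketched as a partitioned view. Table and view names are assumptions; the CHECK constraint on each daily table is what lets the optimizer skip tables when insert_date is specified in the WHERE clause:

```sql
-- One table per day, each constrained to a single date.
CREATE TABLE dbo.Loads_20240101
(
    load_id     bigint NOT NULL,
    insert_date date NOT NULL
        CONSTRAINT ck_Loads_20240101 CHECK (insert_date = '2024-01-01'),
    payload     varchar(200) NULL
);

CREATE TABLE dbo.Loads_20240102
(
    load_id     bigint NOT NULL,
    insert_date date NOT NULL
        CONSTRAINT ck_Loads_20240102 CHECK (insert_date = '2024-01-02'),
    payload     varchar(200) NULL
);
GO

-- The "ugly UNION ALL" view over the daily tables.
CREATE VIEW dbo.Loads_All
AS
SELECT load_id, insert_date, payload FROM dbo.Loads_20240101
UNION ALL
SELECT load_id, insert_date, payload FROM dbo.Loads_20240102;
```

Dropping a day is then just dropping that day's table, and bcp always targets the single current-day table.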
What is the best practice to get around the SSAS 2 billion distinct value limitation on a column? My data set grows by 2 billion rows every 10 months, and one of the measures in the cube is a row count that runs on the PK. Because adding partitions does not help resolve the problem, would creating new cubes with identical info be the right approach?
Do you have to run the count on a column? This should only be necessary if you have duplicates in your table. Could you not just count rows? Then just omit the primary key column.
I have these 2 tables and I need to create a relationship between them so that I can import them into SSAS Tabular and run some analysis.
The first table has RollingQuarter(Moving Quarter) data. The second is a basic Date table with Date as PK.
Can anyone suggest ways to create a relationship with these?
I'll be using SQL Server 2012.
I could re-create a new date table also.
I think you may have a rough time finding a relationship with these tables.
Your top data table is derived data. It's an average over three months, reported monthly. The Quantity column applies to that window, not to a particular date like all of the stuff in the second table. So what would any relationship really mean?
If you have the primary data that were used to calculate your moving average, then use those instead. Then you can relate dates between the two tables.
But if your analysis is such that you don't need the primary data for the top table, then just pick the middle of each quarter (March 15th 2001 for the first record) and use that as your independent variable for your time series on the top. Then you can relate them by that.
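The mid-quarter-date suggestion can be sketched in SQL. The table and column names (dbo.RollingQuarterData, QuarterStartDate) are assumptions, since the actual schema isn't shown:

```sql
-- Derive a representative mid-quarter date for each rolling-quarter
-- row so it can relate to the Date table's PK. For a quarter starting
-- Feb 1, this yields Mar 15 (start + 1 month + 14 days).
SELECT rq.RollingQuarter,
       rq.Quantity,
       DATEADD(DAY, 14, DATEADD(MONTH, 1, rq.QuarterStartDate)) AS MidQuarterDate
FROM dbo.RollingQuarterData rq;
```

Persisting MidQuarterDate (e.g. as a computed column) gives SSAS Tabular a single-column relationship to the Date table.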
Hi, I am new to SSAS 2005/08. I want to create a cube from one table stored in an OLTP database. The table contains billions of records.
How do I select dimensions and facts from this single table?
Please help me.
A dimension derived from data in the fact table is known as a degenerate dimension:
http://en.wikipedia.org/wiki/Degenerate_dimension
Here's a link discussing how to model data as both a dimension and a fact attribute, if that's what you're wanting to do:
http://www.ralphkimball.com/html/07dt/KU97ModelingDataBothFactDimen.pdf
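To illustrate the degenerate dimension idea: a transaction identifier such as an order number stays on the fact table itself rather than getting its own dimension table. All names below are illustrative assumptions:

```sql
-- Star-schema fact table sketch: date_key and product_key point at
-- real dimension tables, while order_number is a degenerate
-- dimension carried directly on the fact row.
CREATE TABLE dbo.FactSales
(
    date_key     int           NOT NULL,  -- FK to a DimDate table
    product_key  int           NOT NULL,  -- FK to a DimProduct table
    order_number varchar(20)   NOT NULL,  -- degenerate dimension
    quantity     int           NOT NULL,
    amount       decimal(18,2) NOT NULL
);
```

In SSAS you would then build a dimension on order_number sourced from the fact table itself, which is exactly the pattern the Wikipedia article describes.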