SSAS Tabular - Only process data where row is new or modified - sql

I have an SSAS cube that imports a view of data from a source system. Each time it is processed, it imports the full view. I want to improve performance by processing only the rows that are new or have changed since the last process. Can anyone advise on the best way of doing this?
The view has an ID column, along with a created date and a modified date, if that helps.
I have not known what to try, even after googling.

What is your model processing strategy? Why are you doing a Process Full?
There is an option called Process Add, which loads only the rows returned by a query you supply, so you can use a filter condition to select just the data that has not yet been added to the model. Note that Process Add appends rows; it does not update rows that are already in the partition.
Here's a quick snippet of Process Add logic:
[Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices")
$server = New-Object Microsoft.AnalysisServices.Server
$server.connect("localhost\K12")
$db = $server.Databases.Item("AdventureWorks Tabular Model SQL 2012")
$dsv = $db.DataSourceViews.GetByName("Sandbox")
$cube = $db.Cubes.GetByName("Model")
$measureGroup = $cube.MeasureGroups.GetByName("Internet Sales")
$partition = $measureGroup.Partitions.GetByName("Internet Sales")
$queryBinding = New-Object Microsoft.AnalysisServices.QueryBinding( $dsv.DataSourceID, "SELECT * FROM FactInternetSales WHERE OrderDateKey >= 20120215" )
$partition.Process( "ProcessAdd", $queryBinding )
$server.Disconnect()

Use the query below to select, for each id, the row with the latest modified_date, plus any rows with no created_date:
SELECT t.*
FROM table t
WHERE t.modified_date = (SELECT MAX(m.modified_date) FROM table m WHERE m.id = t.id)
OR t.created_date IS NULL;
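A common complement to the query above is to remember when the model was last processed and select only the rows created or modified after that point. The sketch below illustrates the filter in Python against an in-memory SQLite table. The table name `source_view`, the sample rows, and the `last_processed` value are assumptions made for illustration and are not part of the original model.

```python
import sqlite3


def changed_since(conn, last_processed):
    """Return rows created or modified after the last processing time."""
    sql = """
        SELECT id, created_date, modified_date
        FROM source_view
        WHERE created_date > :last
           OR modified_date > :last
        ORDER BY id
    """
    return conn.execute(sql, {"last": last_processed}).fetchall()


# Hypothetical stand-in for the source view described in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_view (id INTEGER, created_date TEXT, modified_date TEXT)"
)
conn.executemany(
    "INSERT INTO source_view VALUES (?, ?, ?)",
    [
        (1, "2016-01-01", None),          # old row, never modified
        (2, "2016-01-05", "2016-03-10"),  # old row, modified after the watermark
        (3, "2016-03-12", None),          # row created after the watermark
    ],
)

rows = changed_since(conn, "2016-03-01")
print([r[0] for r in rows])
```

The same WHERE clause could serve as the query binding text for an incremental load, with the watermark stored wherever the processing job keeps its state.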

Related

update main table from two others with modification of data before updating

I have two tables that I need to update. I have to go through every record in one table, make some modifications to the data, then upload the modified data to another table.
The two tables have 3,000 records and 11,000 records. I also have to check some info from a third table with about 50 records.
Dim id As String
Dim fly_SQL As String
Dim code As Variant
Dim rsc1 As DAO.Recordset
Dim rsc2 As DAO.Recordset
id = "user1"
Dim rsc1_conn As ADODB.Connection
Set rsc1_conn = CreateObject("ADODB.Connection")
rsc1_conn.Provider = "SQLOLEDB"
rsc1_conn.ConnectionString = "SERVER=companyserver;UID=" & id & ";Trusted_Connection=Yes;DATABASE=DATAbank"
rsc1_conn.Open
Set rsc1 = CurrentDb.OpenRecordset("SELECT * FROM main_database", dbOpenDynaset, dbSeeChanges)
rsc1.MoveFirst
Do Until rsc1.EOF
    ' Fetch the alt_db rows that match the current main record
    fly_SQL = "SELECT * FROM alt_db WHERE alt_db.number = " & rsc1!net_number
    Set rsc2 = CurrentDb.OpenRecordset(fly_SQL)
    Do Until rsc2.EOF
        ' Look up the activity type for this alt_db row
        code = DLookup("type_def", "third_rec", "activity = '" & rsc2!activity & "'")
        ' do stuff
        rsc2.MoveNext
    Loop
    rsc1.MoveNext
Loop
The two tables both use net_number as a reference which on the main is primary key unique, but the alt_db has multiple entries.
So basically I have to loop through each net_number on the main, look at the matching net_number on the alt_db then compare an activity field
with a third table to see which field I update on the main! If it's a Project management expense I put it in the main.PM_cost. The net_number in alt_db might repeat for 10 other expenses that need to be funneled into their proper expense categories in the main DB! As an example:
Main table looks like
net_number
first record shows
main.netnumber = 123456
main.cont_cost
main.PM_cost
main.mgmt_cost
alt_db table looks like
alt_db.net_number
alt_db.activity
alt_db.PM_cost
alt_db.const_cost
alt_db.mgmt_cost
third_rec looks like
third_rec.code
third_type
where the code can be something like con1 or sabb, for example:
code    type
sauf    construction
con1    management
I130    project management
And needed rules:
check alt_db.activity with third_rec.act and return activity type
If activity type is construction then I put the alt_db.cost into main.const_cost
If activity type is project_mgmt then I put the alt_db.cost into main.PM_cost. The alt_db.activity could be con1 or SAF4 and the type is determined by the third_rec table.
Trying to figure out the best (most efficient way) to go about this.
Any suggestions?
The above code will surely be missing proper variable definitions and such but it's just for explaining my dilemma!
I could probably do it with DLookup but I don't think that would be very efficient!
Pete
The best way turned out to be building a query that joins the filtered data from Main and alt_db and groups the activities by net_number. A CASE expression, driven by the third table, then determines which alt_db fields to use, and the result is written to the proper fields on the main table.
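The set-based approach described above can be sketched as follows, using Python with an in-memory SQLite database so that it runs on its own. The column layout is an assumption: a single `cost` column on `alt_db`, a `type` column on `third_rec`, and a rule sending "management" to `mgmt_cost` are guesses based on the tables described in the question, and the sample figures are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main (net_number INTEGER PRIMARY KEY,
                       const_cost REAL, PM_cost REAL, mgmt_cost REAL);
    CREATE TABLE alt_db (net_number INTEGER, activity TEXT, cost REAL);
    CREATE TABLE third_rec (code TEXT, type TEXT);

    INSERT INTO main VALUES (123456, 0, 0, 0);
    INSERT INTO third_rec VALUES ('sauf', 'construction'),
                                 ('con1', 'management'),
                                 ('I130', 'project management');
    INSERT INTO alt_db VALUES (123456, 'sauf', 100),
                              (123456, 'sauf', 50),
                              (123456, 'I130', 30),
                              (123456, 'con1', 20);
""")

# One pass: join each expense to its activity type, then use CASE to
# route the cost into one column per type and sum per net_number.
totals = conn.execute("""
    SELECT a.net_number,
           SUM(CASE WHEN t.type = 'construction' THEN a.cost ELSE 0 END),
           SUM(CASE WHEN t.type = 'project management' THEN a.cost ELSE 0 END),
           SUM(CASE WHEN t.type = 'management' THEN a.cost ELSE 0 END)
    FROM alt_db AS a
    JOIN third_rec AS t ON t.code = a.activity
    GROUP BY a.net_number
""").fetchall()

# Write each group's totals back to the matching main record.
conn.executemany(
    "UPDATE main SET const_cost = ?, PM_cost = ?, mgmt_cost = ? "
    "WHERE net_number = ?",
    [(c, p, m, n) for n, c, p, m in totals],
)

result = conn.execute("SELECT * FROM main").fetchone()
print(result)
```

Compared with a DLookup call per row inside nested loops, this touches each table once, which is usually the cheaper option at these table sizes.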

Using R's shinydashboard, how do I perform reactive calculations based on an SQL query's results?

Currently I am building a KPI dashboard using shinydashboard, R and MS Access. In order to fulfill user requirements to be able to drill down to atomic data I am using a MS Access database (soon SQL) as the underlying data architecture. I am using RODBC as the DB driver connection library.
Reactive inputs include {warehouse, date1, date2}. Date inputs are in date range format on the ui.R page.
To reactively calculate the KPIs in question, I need the metrics in the database to be queried interactively based on user inputs, then calculated. I am able to get the interactively generated query to work when running it manually, but not in shinydashboard. I am also able to render the query's output data table using renderDataTable within R & shiny, indicating that the query is working. However, I cannot figure out why I can't do a simple proof-of-principle with the simple "sum" statement below.
server.R
query_brk = reactive({
q = ifelse(input$house == 'Saint Louis',
paste0("SELECT *
FROM T_Breakage
WHERE Warehouse = 'STL' AND Date BETWEEN #",
as.character(format(input$dates[1], "%m/%d/%Y")),
"# AND #",
as.character(format(input$dates[2], "%m/%d/%Y")), "#"),
paste0("SELECT *
FROM T_Breakage
WHERE Warehouse = 'KC' AND Date BETWEEN #",
format(input$dates[1], "%m/%d/%Y"),
"# AND #",
format(input$dates[2], "%m/%d/%Y"), "#"))
q
})
t_breakage = reactive({
odbc_connection = odbcConnectAccess2007(reporting_db)
t = sqlQuery(odbc_connection, "SELECT * FROM T_Breakage WHERE Warehouse = 'STL' AND Date BETWEEN #06/01/2015# AND #03/31/2016#")# query=query_brk())
t
})
output$total_breakage = reactive({
valueBox(
scales::comma((round(sum(t_breakage()[, 'Cases'], na.rm=TRUE)))),
'Warehouse Breakage', icon=icon('trash-o')
)
})
I have checked to make sure (1) the query is rendering properly and in the correct format for Access, and (2) that the query works as specified. I am reasonably certain that the breakdown appears at the "output$total_breakage" stage in the code.
I am likely missing something simple -- any help resolving this issue is greatly appreciated. Thanks for your time & thought.

How would I implement this query in an SSIS dataflow?

So I have something that I did in an execute SQL task but my project manager would rather see it in a data flow task.
INSERT INTO [dbo].[lookup_product]
([dim_global_data_source_id]
,[source_product]
,[source_product_type]
,[source_grade]
,[source_gauge]
,[source_width]
)
SELECT distinct
dim_global_data_source_id,
product_desc,
product_type,
grade,
gauge,
size1
FROM Staging_informix_Coil_is
where not exists
(select source_product
from lookup_product
where lookup_product.dim_global_data_source_id = Staging_informix_Coil_is.dim_global_data_source_id
and isnull(lookup_product.source_product,'') = isnull(Staging_informix_Coil_is.product_desc,'')
and lookup_product.source_product_type = Staging_informix_Coil_is.product_type
and isnull(lookup_product.source_grade,'') = isnull(Staging_informix_Coil_is.grade,'')
and isnull(lookup_product.source_gauge,0) = isnull(Staging_informix_Coil_is.gauge,0)
and isnull(lookup_product.source_width,0) = isnull(Staging_informix_Coil_is.size1,0)
)
That's the query. I need this in a workflow. Someone help me out or give me a sample
I'm with your project manager on this one. I would create a Data Flow Task. The first component would be an OLE DB Source, containing just your first SELECT (no WHERE clause).
The next component would be a Lookup, selecting the columns you need to match on from lookup_product. On the Columns tab I would match the columns as you have in your WHERE clause. On the General tab I would set it to Redirect Rows to No Match output.
The final component is an OLE DB Destination, pointing at the lookup_product table. I would connect this to the Lookup using the No Match output.
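To make the routing in that data flow concrete, here is a small Python model of what the Lookup's No Match output would carry to the destination. It is only an illustration of the logic, not SSIS code: the function names and sample rows are hypothetical, and it simplifies NULL handling (Python treats two missing `product_type` values as equal, whereas the original query's plain `=` comparison would not).

```python
def match_key(row):
    """Build the comparison key, applying the same defaults as the ISNULL calls."""
    source_id, product, product_type, grade, gauge, width = row
    return (
        source_id,
        "" if product is None else product,
        product_type,
        "" if grade is None else grade,
        0 if gauge is None else gauge,
        0 if width is None else width,
    )


def no_match_rows(source_rows, lookup_rows):
    """Return the distinct source rows that have no match in the lookup rows."""
    existing = {match_key(r) for r in lookup_rows}
    new_rows = []
    for row in source_rows:
        key = match_key(row)
        if key not in existing:
            existing.add(key)  # also suppresses later duplicates, like DISTINCT
            new_rows.append(row)
    return new_rows


# Hypothetical sample data: one product already in lookup_product,
# and a staging feed holding that product plus a new one twice.
lookup_product = [(1, "coil", "HR", "A36", 0.25, 48)]
staging = [
    (1, "coil", "HR", "A36", 0.25, 48),
    (1, "coil", "HR", None, 0.25, 48),
    (1, "coil", "HR", None, 0.25, 48),
]

new_rows = no_match_rows(staging, lookup_product)
print(new_rows)
```

Only the rows returned here would reach the OLE DB Destination; matched rows simply leave the flow.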

Need to pull only last date in table that stores change dates SQL / ODBC

Hope somebody can help me with this. I'm trying to pull a list of forthcoming titles (I work in publishing) via ODBC/MS Query. I want (amongst other things) to show their internal status (approved, prepress etc.). The database stores the change dates for the statuses, so I seem to be getting one line per status per title. If the title has changed status 6 times, I get 6 lines. But I only want to show the latest status...
The date is in BL_PROJECT_TO_STATUS.STATUS_DATE (I've inserted a date criterion below, just to make it more visible).
How can this be done? I'm very new to ODBC and would appreciate the help a lot.
SELECT DISTINCT
BL_PROJECT.EXP_PUB_DATE, BL_PROJECT.EAN, BL_PROJECT.TITEL,
MEDIATYPE.DESCRIPTION, BL_PROJECT_STATUS.DESCRIPTION
FROM
FIRMA1.BL_PROJECT BL_PROJECT, FIRMA1.BL_PROJECT_STATUS BL_PROJECT_STATUS,
FIRMA1.BL_PROJECT_TO_STATUS BL_PROJECT_TO_STATUS, FIRMA1.MEDIATYPE MEDIATYPE
WHERE
BL_PROJECT.PROJECT_ID = BL_PROJECT_TO_STATUS.PROJECT_ID AND
BL_PROJECT_TO_STATUS.STATUS_ID = BL_PROJECT_STATUS.CODE AND
BL_PROJECT.MEDIATYPE = MEDIATYPE.ID AND
((BL_PROJECT.PROJECT_TYPE = 2) AND
(BL_PROJECT.EXP_PUB_DATE Between SYSDATE AND (SYSDATE+90)) AND
(BL_PROJECT_TO_STATUS.STATUS_DATE = {ts '2013-11-20 00:00:00'}))
ORDER BY
BL_PROJECT.EXP_PUB_DATE, BL_PROJECT.EAN, BL_PROJECT.TITEL
Here is the general idea. You can adapt it with your table and field names.
select somefields
from sometables
join
(select something, max(datetimefield) maxdt
from table1
where whatever
group by something) temp
on table1.something = temp.something
and table1.datetimefield = temp.maxdt
etc
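As a runnable illustration of that pattern, the sketch below uses Python with an in-memory SQLite table. The table `project_to_status` and its rows are simplified, invented stand-ins for BL_PROJECT_TO_STATUS; the point is the join back to the per-project maximum date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project_to_status (
        project_id INTEGER, status_id INTEGER, status_date TEXT);
    INSERT INTO project_to_status VALUES
        (1, 10, '2013-10-01'),
        (1, 20, '2013-11-20'),
        (2, 10, '2013-09-15');
""")

# The derived table finds the latest status date per project; joining on
# both the project and that date keeps one status row per project.
latest = conn.execute("""
    SELECT s.project_id, s.status_id, s.status_date
    FROM project_to_status AS s
    JOIN (SELECT project_id, MAX(status_date) AS maxdt
          FROM project_to_status
          GROUP BY project_id) AS temp
      ON temp.project_id = s.project_id
     AND temp.maxdt = s.status_date
    ORDER BY s.project_id
""").fetchall()
print(latest)
```

If two status changes for the same project share the same timestamp, both rows come back, so a tie-breaker would be needed in that case.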

Finding number of rows in each GROUP BY result

I have a database that tells each system's current and new model. (Each row has a system name, current model, and new model.) I need to group the results by the current model and new model, so if the current and new model are the same, I want one result. I got that to work, but I also need to know how many systems are in each group. How can I do that? I'm using CodeIgniter (CI).
This is my query
$query2 = $this->db->query('SELECT * FROM rollout_systems WHERE scope_ID = '.$id.' GROUP BY EAM_Model, new_Model');
(EAM_Model = current model)
I don't know if I clarified enough. I need to be able to display the current system, the new system, and the number of systems in that group.
So if 3 rows have "blah" as their current system and "blahblah" as their new system, I want it to say
Current: Blah
New: Blah Blah
number of systems: 3
It doesn't make sense to do a select * and a group by in the same statement. The columns you group by need to be in your select but then you should have an aggregate function like count(*) in your select. I'm not sure I totally understand what you are trying to do, but I think you want something like:
SELECT EAM_Model, new_Model, count(*)
FROM rollout_systems
WHERE scope_ID = '.$id.'
GROUP BY EAM_Model, new_Model
If you just want the number of items, use SELECT EAM_Model, new_Model, COUNT(*) FROM ... etc. When you group a query, you normally want to be performing an aggregate like COUNT or SUM or else it doesn't make sense to be grouping the data.
$query2 = $this->db->query('SELECT EAM_Model, new_Model, COUNT(*) FROM rollout_systems WHERE scope_ID = '.$id.' GROUP BY EAM_Model, new_Model');
Is it this you are asking for?
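For a self-contained check of the grouped count, here is a sketch in Python with an in-memory SQLite table. The `system_name` column, the sample rows, and the scope value 7 are invented for illustration. It also binds the scope ID as a parameter instead of concatenating it into the SQL string, which is a separate hardening step and not something the answers above require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rollout_systems (
        system_name TEXT, scope_ID INTEGER, EAM_Model TEXT, new_Model TEXT);
    INSERT INTO rollout_systems VALUES
        ('sys1', 7, 'Blah', 'Blah Blah'),
        ('sys2', 7, 'Blah', 'Blah Blah'),
        ('sys3', 7, 'Blah', 'Blah Blah'),
        ('sys4', 7, 'Other', 'Other'),
        ('sys5', 8, 'Blah', 'Blah Blah');
""")

# Select the grouping columns plus COUNT(*) so each group reports its size.
groups = conn.execute("""
    SELECT EAM_Model, new_Model, COUNT(*) AS system_count
    FROM rollout_systems
    WHERE scope_ID = ?
    GROUP BY EAM_Model, new_Model
    ORDER BY EAM_Model
""", (7,)).fetchall()

for current, new, count in groups:
    print(f"Current: {current} / New: {new} / number of systems: {count}")
```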