I am currently loading data into a SQL Server database using SSIS. The plan is to run the load every week, but the day it runs may vary depending on when the data is pushed through.
I use SSIS to read data from an Excel worksheet and insert each row into the database (about 150 rows per week). The only thing all the rows in a load have in common is the date. I want to stamp each row with a date when it is pushed through. Because the push day may vary, I can't use the current date; instead I want to use one week after the date entered for the previous set of rows.
But because there are about 150 rows, I don't know how to achieve this. Ideally I could set this up in SQL Server so that every time a new set of rows is entered, it adds 7 days to the date of the previous set. But I would also be happy to do this in SSIS.
Does anyone have any clue how to achieve this? Alternatively, I don't mind doing this in C# either.
Here's one way to do what you want:
Create a column for tracking the data entry date in your target table.
Add an Execute SQL Task before the Data Flow Task. This task will retrieve the latest data entry date + 7 days. The query should be something like:
select dateadd(day,7,max(trackdate)) from targettable
Assign the SQL result to a package variable.
Add a Derived Column Transformation between your Source and Destination components in the Data Flow Task. Create a dummy column to hold the tracking date and assign the variable to it.
When you map the Excel columns to the table columns in the Data Flow Task, map the dummy column created earlier to the tracking date column. Now when you write the data to the database, your tracking column will hold the desired date.
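The steps above can be sketched outside SSIS to show the logic: read the next tracking date once, then stamp every incoming row with it. This is only an illustration, not the SSIS package itself. It uses an in-memory SQLite database as a stand-in for SQL Server, and the `item` column and the function names are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here. The names targettable and
# trackdate follow the query above; the item column is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE targettable (item TEXT, trackdate TEXT)")
conn.execute("INSERT INTO targettable VALUES ('old row', '2024-01-01')")


def next_track_date(conn):
    # SQLite counterpart of:
    #   select dateadd(day,7,max(trackdate)) from targettable
    row = conn.execute(
        "SELECT date(max(trackdate), '+7 days') FROM targettable"
    ).fetchone()
    return row[0]


def load_rows(conn, items):
    # Read the date once (the Execute SQL Task), then attach it to every
    # incoming row (the Derived Column) before inserting.
    track_date = next_track_date(conn)
    conn.executemany(
        "INSERT INTO targettable (item, trackdate) VALUES (?, ?)",
        [(item, track_date) for item in items],
    )
    return track_date


stamped = load_rows(conn, ["row a", "row b"])
```

Note that on an empty table the maximum is NULL, so the very first load would need a seed date supplied some other way.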
My problem is described below:
I have a calculated field, say Opportunity. My organization has a predefined rule that the target value for the current month is one third of the value from two months back. For example, the target opportunity value for April is one third of February's value. I need to show the current month's opportunity and the target value in the same worksheet. How can I achieve this in Tableau?
I get the base data from Oracle tables through a custom SQL query, calculate the opportunity value in Tableau for each row, and then show the sum over a time range, say the last 6 months.
The best way to do this would be to write the target value in SQL. It'll be easier (no need for data blending / complex calcs) and also more performant as it would be a hardcoded value in your dataset.
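The rule is small enough to sketch independently of the tool. The example below is a hedged illustration of the calculation the SQL would need to perform, not the query itself: it takes monthly opportunity totals and pairs each month with one third of the total from two months earlier. The function names and the sample figures are made up.

```python
from datetime import date


def months_back(month_start, n):
    # Step back n calendar months from the first day of a month.
    index = month_start.year * 12 + (month_start.month - 1) - n
    return date(index // 12, index % 12 + 1, 1)


def with_targets(monthly_opportunity):
    # monthly_opportunity maps first-of-month dates to summed opportunity.
    # Each month is paired with a target of 1/3 of the value two months back,
    # or None when that earlier month is not in the data.
    result = {}
    for month, value in monthly_opportunity.items():
        source = monthly_opportunity.get(months_back(month, 2))
        target = source / 3 if source is not None else None
        result[month] = (value, target)
    return result


# Made-up sample totals for February to April.
totals = {
    date(2024, 2, 1): 900.0,
    date(2024, 3, 1): 600.0,
    date(2024, 4, 1): 750.0,
}
rows = with_targets(totals)
```

Here April ends up with a target of 300.0 (one third of February's 900.0), while February has no target because December is outside the sample data.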
I am trying to measure the duration of a Dataflow pipeline that pulls messages from Pub/Sub and loads them into a BigQuery table. I cannot find a way to get the last modified time of a row in a BigQuery table, although the table itself has a last modified datetime.
Does anyone know how to record a last modified datetime on each row of a BigQuery table?
You should include the current timestamp in the application that creates the output data structure. That would be the event time in some sense (you can add more granularity by adding event times on the client or on the server depending on how your events originate).
Then you possibly want to record the time before processing (right after the message is read from Pub/Sub). Then you want to record the time right before you write into BigQuery.
You can do both of these with a DoFn as an extra step or include it as the first action in the first transformation and the last action in the last transformation that you have in your pipeline.
Include these new columns respectively to the table schema of the output BigQuery table.
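As a rough sketch of the stamping logic only, the plain-Python helpers below add a timestamp field to a row and compute the elapsed time between two such fields. In a real pipeline, something like `stamp` could be called from inside a DoFn's `process` method; the field names `event_time`, `read_time`, and `write_time` are assumptions, not names from the pipeline in question.

```python
from datetime import datetime, timezone


def stamp(row, field, now=None):
    # Return a copy of the row with a UTC ISO-8601 timestamp under `field`.
    # `now` can be passed explicitly, which keeps the function testable.
    now = now or datetime.now(timezone.utc)
    stamped = dict(row)
    stamped[field] = now.isoformat()
    return stamped


def duration_seconds(row, start_field, end_field):
    # Elapsed seconds between two timestamp fields of the same row.
    start = datetime.fromisoformat(row[start_field])
    end = datetime.fromisoformat(row[end_field])
    return (end - start).total_seconds()


# Hypothetical message that already carries the producer's event time.
message = {"payload": "hello", "event_time": "2024-04-01T12:00:00+00:00"}

# Stamp right after reading from Pub/Sub, then right before the BigQuery write.
row = stamp(message, "read_time", datetime(2024, 4, 1, 12, 0, 2, tzinfo=timezone.utc))
row = stamp(row, "write_time", datetime(2024, 4, 1, 12, 0, 5, tzinfo=timezone.utc))
```

With all three timestamps stored as columns, the pipeline duration per row can be computed later with an ordinary query over the table.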
I need to synchronize some data from one database to another using a Kettle/Spoon transformation. The logic is that I need to select the latest date that already exists in the destination database, then select from the source database everything after that date. Which transformation steps do I need for this?
Thank you.
There can be many solutions:
If you have timestamp columns in both the source and destination tables, you can use two Table Input steps. In the first one, select only the maximum last-updated timestamp from the destination; then pass it as a variable to the second Table Input step and use it as a filter on the source data.
If you just want new data to reach the destination table and you don't care much about the timestamp, I would suggest using the Insert/Update step for output. It brings all the data into the stream; if it finds an identical matching row, it won't insert anything. If it doesn't find a match, it inserts the new row. If it finds modifications to an existing row in the destination table, it updates that row accordingly.
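The first approach can be sketched in a few lines to make the data flow concrete. This is a minimal illustration under assumptions, not a Kettle transformation: two in-memory SQLite databases stand in for the source and destination, and the `records` table with its `updated_at` column is hypothetical. `INSERT OR REPLACE` is used as a rough analogue of the Insert/Update step.

```python
import sqlite3


def make_db(rows):
    # Build an in-memory database with a hypothetical records table.
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
    )
    db.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    return db


def sync_new_rows(source, dest):
    # First input: latest timestamp already present in the destination.
    last = dest.execute("SELECT max(updated_at) FROM records").fetchone()[0]
    # Second input: only newer source rows (everything if dest is empty).
    if last is None:
        new_rows = source.execute(
            "SELECT id, name, updated_at FROM records"
        ).fetchall()
    else:
        new_rows = source.execute(
            "SELECT id, name, updated_at FROM records WHERE updated_at > ?",
            (last,),
        ).fetchall()
    # Output: insert new ids, overwrite rows whose id already exists.
    dest.executemany("INSERT OR REPLACE INTO records VALUES (?, ?, ?)", new_rows)
    return len(new_rows)


source = make_db([
    (1, "a", "2024-01-01 10:00:00"),
    (2, "b", "2024-01-02 10:00:00"),
    (3, "c", "2024-01-03 10:00:00"),
])
dest = make_db([(1, "a", "2024-01-01 10:00:00")])
copied = sync_new_rows(source, dest)
```

Running the sync a second time copies nothing, because the destination's maximum timestamp has caught up with the source.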
Every day I need to add a new column for the current date (sysdate) to a query result, another column the next day, and so on.
So the same SELECT would gain one new column for the current date on each new day.
Is this somehow possible in SQL without using INSERT, UPDATE, or other data-modifying statements?
Thank you very much for your answers :)
I'm using Oracle SQL Developer.
SQL queries give you a fixed number of columns and a variable number of rows. If each day brings more data, you would normally write a query that returns more rows. It's up to a GUI to display the retrieved data in the most convenient way (one column per day in your example).
So, yes and no. Yes, you can (very easily) write a query that gives you more data each day. No, you cannot write a query that results in a new column every day. But as mentioned, this is not what SQL is made for anyhow. SQL is a data retrieval language. You use a programming language or a reporting tool to display the data to your users.
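To illustrate the split of responsibilities, here is a small sketch of what the display side might do: the query returns one row per item and day, and the client pivots those rows into one column per day. The row layout `(item, day, value)` and the function name are assumptions made for the example.

```python
from collections import defaultdict


def pivot_by_day(rows):
    # rows: (item, day, value) tuples, as a query would return them.
    days = sorted({day for _, day, _ in rows})
    table = defaultdict(dict)
    for item, day, value in rows:
        table[item][day] = value
    # One header cell per day; missing item/day combinations become None.
    header = ["item"] + days
    body = [
        [item] + [table[item].get(day) for day in days]
        for item in sorted(table)
    ]
    return header, body


rows = [
    ("A", "2024-04-01", 5),
    ("A", "2024-04-02", 7),
    ("B", "2024-04-02", 3),
]
header, body = pivot_by_day(rows)
```

The query stays the same from day to day; only the number of rows it returns grows, and the pivot adds the extra column on the client.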
I have an Access database in which one table has a time column showing the total time (endtime - starttime). I need that column averaged as hh:mm (seconds are not needed). I need to store this average in another table and then display it in a textbox, with conditional formatting (color) for certain time ranges. I will need to do this for a daily range and a monthly range, and I wonder what the best way to accomplish it would be. The monthly and daily averages need to update each time records are added to the table.
My thought was to pull the daily times into an array, average the array, and store that average in another table. Then I would use the daily average table to populate the textbox, along with the conditional formatting, and do the same for the monthly average.
It is by no means difficult to obtain this information from a query; a time is just the decimal portion of a number.
SELECT Format(Avg(CDbl([Atime2])-CDbl([ATime1])),"hh:mm:ss") AS Diff
FROM Table;
Or
SELECT Sum(DateDiff("n",[ATime1],[ATime2])) AS SumMins,
Count([ATime1]) AS CountRecs,
Avg(DateDiff("n",[ATime1],[ATime2])) AS AvgMins
FROM Table;
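The second query returns its average in minutes, because the "n" interval in DateDiff counts minutes. Turning that into the hh:mm form asked for is a small arithmetic step; the sketch below shows the same calculation end to end in Python, with made-up start and end times and a hypothetical function name.

```python
from datetime import datetime


def average_hhmm(intervals):
    # intervals: (start, end) datetime pairs.
    # Returns the mean duration formatted as "hh:mm", rounded to the minute.
    minutes = [(end - start).total_seconds() / 60 for start, end in intervals]
    avg = round(sum(minutes) / len(minutes))
    return f"{avg // 60:02d}:{avg % 60:02d}"


# Made-up records for one day.
day = [
    (datetime(2024, 4, 1, 9, 0), datetime(2024, 4, 1, 10, 30)),   # 90 minutes
    (datetime(2024, 4, 1, 13, 0), datetime(2024, 4, 1, 13, 50)),  # 50 minutes
]
average = average_hhmm(day)
```

The same function works for a monthly range by passing in all of the month's intervals instead of a single day's.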
Furthermore, MS Access 2010 has data macros and calculated columns that are good even outside of Access.
Finally, it is not generally recommended that you store a value that becomes invalid at every edit when the value can easily be calculated.