How do I prevent a duplicate file from being imported into an Access table - VBA

I have an Access database I am building for sales reporting. I am automating a process to import sales transactions from the point of sale on a weekly basis. I want to develop a simple check to validate that the file being imported is not the same as the previous week's file.
The file will always have the same file name and be in the same folder, which Access will look in when the macro I have written runs.
My proposed solution was to create a staging table for loading the sales transactions into, and a backup of that staging table for comparison. Each week I would back up the staging table, which holds last week's transactions, and then load the new file into the staging table. To validate that the newly loaded file is not identical to the previous week's, I would sum the "Total Sell" column in both the backup table and the staging table and compare the two totals.
I need help creating the code/query to do this and inserting it into the macro I have built, or help coming up with any other solution.
I have searched quite a bit on the web but haven't found a solution to this.
This is a link to sample data
https://drive.google.com/file/d/0BwD_Ubcf_4voSnN2elFvTWI2QTA/view?usp=sharing
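For what it's worth, the sum comparison described above can be done in a single Access query. A sketch, using SalesStaging and SalesStagingBackup as stand-in table names (Access SQL allows derived tables in the FROM clause):

SELECT a.NewTotal, b.PrevTotal
FROM (SELECT Sum([Total Sell]) AS NewTotal FROM SalesStaging) AS a,
     (SELECT Sum([Total Sell]) AS PrevTotal FROM SalesStagingBackup) AS b
WHERE a.NewTotal = b.PrevTotal;

If this query returns a row, the two files have the same total and the new file is likely last week's duplicate.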

Please read my comment on Gene Skuratovsky's solution. I'd suggest creating another table there (in pseudocode):
TABLE ImportedFiles
(
ImportID Integer PrimaryKey,
FilePath String,
ImportDate Date
)
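In concrete Access DDL, that pseudocode could look like the sketch below (run once, e.g. via CurrentDb.Execute; the field size is just a reasonable guess):

CREATE TABLE ImportedFiles (
    ImportID AUTOINCREMENT PRIMARY KEY,
    FilePath TEXT(255),
    ImportDate DATETIME
);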
Before you start importing a file, check whether a record for it already exists in this table ;)
You can use the DLookup function to check whether the record exists.
Function FileHasBeenImported(ByVal sFullFileName As String) As Boolean
    ' DLookup returns Null when no matching record is found
    FileHasBeenImported = Not IsNull(DLookup("[FilePath]", "ImportedFiles", "[FilePath] = """ & sFullFileName & """"))
End Function

Each record should be date/time "stamped" (you would certainly design something like a [Transaction_DateTime] field into your record structure). If the files you import are indeed strictly weekly, you can check just one record to see whether this is a new file or an old one. Otherwise, check them all.
Edit:
You do not need the time part; the date is sufficient. Assuming you import data into a recordset, you will need something like this (brute force will work just fine, and quickly too):
rstX.MoveFirst
Do
    If rstX("Trans Date") > Date - 7 Then
        MsgBox "Found a transaction less than 1 week old!"
        Exit Do
    End If
    rstX.MoveNext
Loop Until rstX.EOF
If rstX.EOF Then
    MsgBox "All transactions are at least 1 week old!"
End If
Modify this as appropriate.

Related

Need to create Period over Period Issue Reporting in SQL Server 2016

I am responsible for creating period-over-period and trend reporting for our team's Issue Management department. What I need to do is copy the Issues table at month-end into a new table, IssuesHist, and add a column with the current date, for example 1/31/21. Then at the next month-end I need to take another copy of the Issues table, append it to the existing IssuesHist table, and again add the column with the current date, for example 2/28/21.
I need to do this to be able to run comparative analysis on a period-over-period basis. The goal is to be able to identify any activity (opening new issues, closing old ones, reopening issues, etc.) that occurred over the period.
For example: the Issues table holds the current data from our front-end tool. At month-end I need to copy it into the new IssuesHist table, adding the date column. Then at the following month-end, after the Issues table has changed, I need to append the new snapshot to the bottom of the existing IssuesHist table with the new date, so that I can run queries comparing the periods and identify any changes.
My research suggests that a temporal table may be the best solution here, but I am unable to alter our existing database's tables to include system versioning.
Please let me know what solution would work best, and if you have any SQL statement tips.
Thank you!
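If temporal tables are indeed out of reach, the snapshot-append described above is just a pair of statements. A sketch, assuming the Issues and IssuesHist names from the question, a stable Issues schema, and an IssueID key column (the key name is an assumption):

-- One-time setup: the first snapshot creates IssuesHist with the extra date column
SELECT i.*, CAST(GETDATE() AS date) AS SnapshotDate
INTO IssuesHist
FROM Issues AS i;

-- Every month-end after that: append the current snapshot
INSERT INTO IssuesHist
SELECT i.*, CAST(GETDATE() AS date)
FROM Issues AS i;

-- Example comparison: issues present at 2/28/21 that did not exist at 1/31/21
SELECT cur.*
FROM IssuesHist AS cur
LEFT JOIN IssuesHist AS prev
    ON prev.IssueID = cur.IssueID AND prev.SnapshotDate = '2021-01-31'
WHERE cur.SnapshotDate = '2021-02-28'
  AND prev.IssueID IS NULL;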

Store Report Data in a SQL table

I have a report that is run every quarter. The report is based on current values and creates a scorecard. We do this for about 50 locations and then have to manually create a report to compare the previous run to the current run. I'd like to automate this by saving the report data to a table for each location and each quarter; then we can run reports that show how the data changes over time.
Data Sample:
Employees Active
Employees with ref checks
Clients Active
Clients with careplans
The reports are fairly complex, pulling data from many different tables, so recreating them as a single query may not work or may be just as complex. Any ideas on how to get the report data into a table without having to export each report to a CSV or Excel file and import it manually?
If each scorecard has some dimensions (or metric names) and aggregate values (or metric values), then you can just add a time series table with columns for:
date
location or business unit
or instead of date and location, a scorecard ID (linking to another table with scorecard metadata)
dimension grouping
scores/values/metrics
Then, assuming you're creating the reports with a stored procedure, you can add a flag parameter that tells the procedure to also update this table while generating a specific report on a date. This might be less work and/or faster than importing from CSVs, especially if you stage the intermediate report data in a temporary table that you can select from when additionally storing it into the time series table described above.
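A minimal T-SQL sketch of that flag-parameter idea; the table, procedure, and metric names here are illustrative, not from the question:

CREATE TABLE dbo.ScorecardHistory (
    ReportDate  date         NOT NULL,
    Location    varchar(100) NOT NULL,
    MetricName  varchar(100) NOT NULL,
    MetricValue int          NOT NULL
);
GO
CREATE PROCEDURE dbo.GenerateScorecard
    @Location varchar(100),
    @SaveToHistory bit = 0
AS
BEGIN
    -- Stand-in for the existing complex report queries: stage the rows
    -- once so they can be both returned and archived from the same data.
    SELECT MetricName, MetricValue
    INTO #Report
    FROM (VALUES ('Employees Active', 120),
                 ('Clients Active',   340)) AS v(MetricName, MetricValue);

    IF @SaveToHistory = 1
        INSERT INTO dbo.ScorecardHistory (ReportDate, Location, MetricName, MetricValue)
        SELECT CAST(GETDATE() AS date), @Location, MetricName, MetricValue
        FROM #Report;

    SELECT MetricName, MetricValue FROM #Report;  -- the report output itself
END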

Get the oldest row from table

I am coding an application that deals with files, so I have a table that contains information about all the files registered in the application.
My "files" table looks like this: ID, Path and LastScanTime.
The algorithm that I use in my application is simple:
Take the oldest row (LastScanTime is the oldest)
Extract the file path
Do some magics on this file (takes exactly 5 minutes)
Update the LastScanTime to the current time (now)
Go to step "1"
So far the task is pretty simple. For this, I am going to use this SQL statement to get the oldest item:
SELECT TOP 1 * FROM files ORDER BY [LastScanTime] ASC
and at the end of the item's processing (preventing the item from being selected again immediately):
UPDATE Files SET [LastScanTime]=GETDATE() WHERE Id=@ItemID
Now I am going to add some complexity to the algorithm:
Take the 3 oldest rows (LastScanTime is the oldest)
For each row, do:
A. Extract the file path
B. Do some magics on this file (takes exactly 5 minutes)
C. Update the LastScanTime to the current time (now)
D. Go to step "1"
The problem I am facing now is that the whole process is going to run in parallel (no more serial processing), so changing my SQL statement to the following is not enough!
SELECT TOP 3 * FROM files ORDER BY [LastScanTime] ASC
Why this SQL statement isn't enough?
Let's say I run my code and start executing the first 3 items. A minute later, I want to execute another 3 items. This SQL statement will retrieve exactly the same "oldest" items that we have already started to process.
Possible solution
Implement a combined SELECT & UPDATE that gets the 3 oldest items and immediately updates their last scan time. Since there is no combined SELECT & UPDATE statement, what happens if another SELECT comes in while the first SELECT is executing? Both statements will get the same results. This is a problem... Another problem is that we mark the item as "scanned recently" before the scan has really finished. What happens if the scan terminates with an error?
I'm looking for tips and tricks to solve this problem. The solutions can add columns as needed.
I'll appreciate your help.
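For reference, one common SQL Server pattern for this is to claim the rows in a single atomic statement: an UPDATE over a TOP (3) CTE with OUTPUT, using READPAST so concurrent workers skip rows that are already claimed. A sketch against the table described above:

WITH NextFiles AS (
    SELECT TOP (3) Id, [Path], LastScanTime
    FROM Files WITH (ROWLOCK, UPDLOCK, READPAST)
    ORDER BY LastScanTime ASC
)
UPDATE NextFiles
SET LastScanTime = GETDATE()
OUTPUT inserted.Id, inserted.[Path];  -- returns the claimed rows to the worker

Because the select-and-stamp happens in one statement, two concurrent callers can never receive the same files. For the error case, a separate nullable column (say, ScanStartedTime, an added column) could be stamped here instead and cleared on success, so crashed scans can be detected and retried.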
Well, I usually have the habit of keeping two different date fields in the database: one is AddedDate and the other is ModifiedDate.
So the algorithm in your terms will be:
Take the oldest row (AddedDate is the oldest)
Extract the file path
Do some process on this file
Update the ModifiedDate to the current time (now)
It seems that you are about to reinvent an event queue in SQL. Standard approaches like RabbitMQ or ActiveMQ may solve your problem.

One to Many - Calculated Column

I am trying to teach myself the new Tabular model in SQL Server 2012 SSAS to handle some analytic reports that were previously handled by (slow) stored procedures.
I've made decent progress on most of it, just figuring out how things work and how to add the calculations I need, but I have been banging my head against the following:
I have a table that has file information -- it has:
ID
FileName
CurrentStatus
UploadedBy
And then a table that has the statuses the file went through (the many side of a one-to-many relationship with the file table):
FileID
StatusID
TimeStamp
What I'm trying to do is add a calculated column to the File table that returns the TimeStamp for when a file entered a particular status, i.e. StatusID = 100 is "uploaded". I want to add a calculated column called UploadedDate on the File table that has the associated TimeStamp from the FileStatus table.
It seems like this should be doable with DAX but I just can't seem to wrap my head around it. Any ideas out there?
In advance, many thanks,
Brent
Here's a formula that should work for what you want to do...
=MAXX(
    CALCULATETABLE(
        'FileStatus',
        'FileStatus'[StatusID] = 100
    ),
    'FileStatus'[TimeStamp]
)
I'm assuming each file can only be in each status once (there is only one row per FileID that has StatusID 100). I believe you can just use a LOOKUPVALUE formula. The formula for your UploadedDate calculated column would be something like
=LOOKUPVALUE(FileStatus[TimeStamp], FileStatus[FileID], File[ID], FileStatus[StatusID], 100)
Here's the MSDN description of LOOKUPVALUE: you provide the column containing the value you want returned, the column you want to search, and the value you are searching for. You can add multiple search criteria to the lookup. Here's a blog post that contains a good example.

What do I gain by adding a timestamp column called recordversion to a table in ms-sql?

You can use that column to make sure your users don't overwrite data from another user.
Let's say user A pulls up record 1 and at the same time user B pulls up record 1. User A edits the record and saves it. Five minutes later, user B edits the record - but doesn't know about user A's changes. When he saves his changes, you use the recordversion column in your UPDATE's WHERE clause, which will prevent user B from overwriting what user A did. You can detect this condition and throw some kind of "data out of date" error.
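A small T-SQL illustration of that pattern; the table, data, and variable names are made up:

-- rowversion (the current name for the timestamp type) changes on every write
CREATE TABLE dbo.Customers (
    Id            int PRIMARY KEY,
    Name          varchar(100) NOT NULL,
    RecordVersion rowversion
);

-- The app read the row earlier and kept the RecordVersion value it saw.
DECLARE @Id int = 1,
        @NewName varchar(100) = 'New name',
        @OriginalVersion binary(8) = 0x00000000000007D1;  -- captured at read time

UPDATE dbo.Customers
SET Name = @NewName
WHERE Id = @Id
  AND RecordVersion = @OriginalVersion;  -- matches nothing if someone saved first

IF @@ROWCOUNT = 0
    RAISERROR('The record was changed by another user since you loaded it.', 16, 1);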
Nothing that I'm aware of, or that Google seems to find quickly.
You don't get anything inherent by using that name for a column. Sure, you can create a column and do the record versioning described in the other response, but there's nothing special about the column name. You could call the column anything you want and do versioning, and you could call any column RecordVersion and nothing special would happen.
Timestamp is mainly used for replication. I have also used it successfully to determine whether the data has been updated since the last feed to the client (when I needed to send a delta feed) and thus pick out only the records which have changed since then. This does require another table that stores the timestamp values (in a varbinary field) at the time you run the feed, so you can compare against them on the next run.
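A sketch of that delta-feed bookkeeping; the watermark table and the RV column name are invented:

-- Holds the rowversion high-water mark from the previous feed run
CREATE TABLE dbo.FeedWatermark (LastVersion binary(8) NOT NULL);

DECLARE @LastVersion binary(8) = (SELECT TOP (1) LastVersion FROM dbo.FeedWatermark);
DECLARE @NextVersion binary(8) = MIN_ACTIVE_ROWVERSION();  -- excludes in-flight transactions

-- RV is the source table's timestamp/rowversion column
SELECT * FROM dbo.Orders
WHERE RV >= @LastVersion AND RV < @NextVersion;

-- Advance the watermark for the next run
UPDATE dbo.FeedWatermark SET LastVersion = @NextVersion;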
If you think that timestamp records the date or time of the last update, it does not do that; you would need datetime fields with default constraints (to record the original datetime) and triggers (to maintain the updated datetime) to store that information.
Also, if you want to keep track of your data, it's a good idea to add these four columns to every table:
CreatedBy(varchar) | CreatedOn(date) | ModifiedBy(varchar) | ModifiedOn(date)
While it doesn't give you full history, it lets you know who created an entry and when, and who last modified it and when. Those four columns provide pretty powerful tracking without any serious overhead in your DB.
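For example (dbo.Invoices is a hypothetical table; SUSER_SNAME() records the database login, which may or may not be your notion of the application user):

ALTER TABLE dbo.Invoices ADD
    CreatedBy  varchar(128) NOT NULL CONSTRAINT DF_Invoices_CreatedBy DEFAULT SUSER_SNAME(),
    CreatedOn  datetime     NOT NULL CONSTRAINT DF_Invoices_CreatedOn DEFAULT GETDATE(),
    ModifiedBy varchar(128) NULL,  -- set by the application (or a trigger) on every update
    ModifiedOn datetime     NULL;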
Obviously, you could create a full-blown logging system that tracks every change and gives you full history, but that's not the solution for the issue I think you are describing.