Inefficient query or am I reaching the limits of Access? - ms-access-2007

I have been taking an online class on relational databases and created an Access database (for the first time) to practice my SQL queries and solve a couple of work-related problems along the way. The database consists of three tables. The primary table records company-wide sales summary information at the branch/store/menu item level (i.e. the lowest level of detail). With three periods of data, the database is presently 1.3GB, and that one table contains 4,262,421 records.
Everything has gone well until I attempted to run the following query:
SELECT P1.*, P13.[Price?] AS P13Price
FROM (SELECT * FROM PBASE WHERE Period = 13) AS P13, (SELECT * FROM PBASE WHERE Period = 1) AS P1
WHERE P1.Key = P13.Key and P1.[Price?]<>P13.[Price?];
To explain, the big table is PriceAccData, so I first ran a query (PBASE) that adds a field to PriceAccData that I can use as a key to compare price changes from one period to the next (a combination of branch, store, and menu item). Then I used subqueries to create a data set from the last period of 2013 (Period 13) and the first period of 2014 (Period 1). From there I attempted to identify items that had changed in price from one period to the next in the WHERE clause.
Is there a more efficient way to write the query or to accomplish the comparison? It works for one branch at a time, but it takes a long time and locks up Access if I run it for more than one branch.

Subqueries can be inefficient in Access and are best kept as a last resort. There's usually a way to JOIN the tables directly for better efficiency. I suggest something along the lines of:
SELECT ...... FROM PBASE P13 INNER JOIN PBASE P1 ON P13.KEY=P1.KEY
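This gives you the data for the two periods once each alias is restricted to its period (P13.Period = 13 and P1.Period = 1 in the WHERE clause); then you can add your inequality check on [Price?]. Let me know if you need further help with that.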
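As a rough illustration of the self-join suggested above, here is a minimal sketch using Python's built-in sqlite3 module rather than Access. The column names ItemKey and Price are stand-ins for the question's Key and [Price?], the sample rows are made up, and the index on (Period, ItemKey) is only an assumption about what might help; the Access query engine may behave differently from SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PBASE (ItemKey TEXT, Period INTEGER, Price REAL);
-- Assumed index: lets each alias be filtered by period and matched by key.
CREATE INDEX idx_pbase_period_key ON PBASE (Period, ItemKey);
INSERT INTO PBASE VALUES
  ('A', 13, 2.50), ('A', 1, 2.50),   -- price unchanged
  ('B', 13, 3.00), ('B', 1, 3.25),   -- price changed
  ('C', 13, 1.00);                   -- no Period 1 row
""")

# Join the table to itself; each alias is restricted to one period.
changed = conn.execute("""
SELECT P1.ItemKey, P13.Price AS P13Price, P1.Price AS P1Price
FROM PBASE AS P13
INNER JOIN PBASE AS P1 ON P13.ItemKey = P1.ItemKey
WHERE P13.Period = 13
  AND P1.Period = 1
  AND P1.Price <> P13.Price
ORDER BY P1.ItemKey
""").fetchall()

print(changed)
```

Only item B comes back here, since A did not change and C has no Period 1 row to compare against.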

Related

Calculating the difference in days from two records in an Access Database

I am creating an Access Database from a very complex Excel Spreadsheet. The process has been going well until I got to this problem. The solution is easy in Excel, but I cannot figure out how to do it in Access.
Here is what I had before in Excel.
I had a list of Customers on one sheet with multiple fields. I then had another sheet act as a report that would run a VBA macro to search through the table of all customers and list out every customer by name that was an inbound call from our contact center (Que Call), when that call came and then would calculate a third column for the number of days between calls. This last column is where I am running into difficulties translating to Access. In Excel, I would just have it do something like in cell C3 =SUM(B3-B2). Given that the table looked like this:
        Column A     Column B    Column C
Row 1   Name         Date        Time Lapse
Row 2   Customer 1   7/1/2019    ----------
Row 3   Customer 2   7/2/2019    =SUM(B3-B2) <-- 1 day
Row 4   Customer 3   7/4/2019    =SUM(B4-B3) <-- 2 days
In Access:
I have a report that goes through my table of customers and lists only those from our contact center (Que Call), but I can't figure out how to add the calculation of time between calls, as the design only allows me to affect one row. How do I make this calculation? Is it a SQL query that I need to write? I would prefer not to have a separate table for call center calls or a separate column in my customers table to calculate this, as some customers are not from the call center. Can I just run a report or a query? Any advice or help would be greatly appreciated.
Current SQL Code:
SELECT
[Customers].FullName,
[Customers].ID,
[Customers].QueCall,
[Customers].Status,
[Customers].InterestLevel,
[Customers].State,
[Customers].Product,
[Customers].Created,
[Customers].LastContact,
[Customers].PrimaryNote
FROM
Customers
WHERE
((([Customers].QueCall)=True))
ORDER BY
[Customers].Created;
Describe exactly how it isn't working (error message, unexpected results, etc...)
It just lists out the customers and does not allow me to calculate the difference between when the records were created (i.e. when they were first contacted). I have found many things online about how to calculate the difference between two columns of the same record, but not between two different records, let alone two records that may not directly follow each other, as there may be other non Que Call customers between them in the customer table.
Describe the desired results
I would like to have a column in the end report that shows how many days elapsed between records that were Que Calls.
Thank you in advance for any input that you may have.
Consider a correlated aggregate subquery, where an inner query on the same source, Customers, is correlated with the outer query by a date comparison on the Created field (assumed to be the contact date) and restricted to Que Call records. Notice the use of the table aliases, c and sub, for the correlation.
Use DateDiff for the difference between dates. To use this query, paste it into the SQL view of the Query Designer and save the object. It can then serve as the record source for forms and reports, be opened on its own, or be used in application code as a recordset.
SELECT
c.FullName,
c.ID,
c.QueCall,
c.Status,
c.InterestLevel,
c.State,
c.Product,
c.Created,
c.LastContact,
c.PrimaryNote,
DateDiff("d",
(SELECT Max(sub.Created)
FROM Customers AS sub
WHERE sub.QueCall = True
AND sub.Created < c.Created),
c.Created) AS TimeElapsed
FROM
Customers AS c
WHERE
c.QueCall = True
ORDER BY
c.Created;
Do be aware that for large tables this correlated subquery can be taxing in time and performance. Allow time for it to complete, and look into storing the output in a temp table with a Make-Table Query to avoid re-running it.
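To see the "days since the previous Que Call" idea on concrete rows, here is a small sketch in Python's sqlite3 module, not Access. It assumes dates stored as ISO text and uses SQLite's julianday() where Access would use DateDiff; the sample customers, including the non Que Call "Walk-in" row, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (
  ID INTEGER PRIMARY KEY, FullName TEXT, QueCall INTEGER, Created TEXT);
INSERT INTO Customers VALUES
  (1, 'Customer 1', 1, '2019-07-01'),
  (2, 'Customer 2', 1, '2019-07-02'),
  (3, 'Walk-in',    0, '2019-07-03'),   -- not a Que Call, must be skipped
  (4, 'Customer 3', 1, '2019-07-04');
""")

# For each Que Call row, find the latest earlier Que Call and subtract dates.
rows = conn.execute("""
SELECT c.FullName,
       CAST(julianday(c.Created) - julianday(
              (SELECT MAX(sub.Created)
               FROM Customers AS sub
               WHERE sub.QueCall = 1
                 AND sub.Created < c.Created)) AS INTEGER) AS DaysSincePrev
FROM Customers AS c
WHERE c.QueCall = 1
ORDER BY c.Created
""").fetchall()

for name, days in rows:
    print(name, days)
```

The first Que Call has no predecessor, so its value is NULL (None in Python), mirroring the dashes in the spreadsheet layout from the question.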

SQL to Spotfire query filtering issue with multiple tables

I am trying to calculate hours flowing in and out of a cost center. When the cost center lends out an employee for an hour it's +1 and when they borrow an employee for an hour it's -1.
Right now I'm using a query that says
select
columns
from dbo.table
where EmployeeCostCenter <> ProjectCostCenter
So when ProjectCostCenter = ID_CostCenter it returns +HoursQuantity.
Then I update ID_CostCenter = EmployeeCostCenter, and where ID_CostCenter = EmployeeCostCenter it takes -HoursQuantity.
That works fine. The problem is when I import it to Spotfire I can't filter on the main table even after I added the table relations. Can anyone explain why?
I can upload the actual code if needed, but I use 4 queries and a couple of them are quite lengthy. The main table, a temp table to calculate incoming hours, and a temp table to calculate outgoing hours are the only ones involved in this problem I think.
(moved to answer to avoid lengthy discussion)
Data relations are used to propagate filtering and marking between different data sets. Just like in an RDBMS, the relation is what Spotfire uses as the link between data sets; it's the same as the column or columns you join on. Thus, any column that you wish to filter in TableA and have the result set limited in TableB (or vice versa) must be a relation.
Column matches aren't related columns, but are associated for aggregations, category axes, etc. within each visualization. So if TableA has "amount" and TableB has "amount debit" and you wanted to use both of these in an expression, say Sum([TableA].[amount],[TableB].[amount debit]), they would need to be matched in order not to produce erroneous results.
Lastly, once you set up your relations, you should check your filter panel to set up how you want the filtering to work. You can have the rows included, excluded, or ignored altogether. Here is a link explaining that.

MS Access - Log daily totals of query in new table

I have an ODBC database that I've linked to an Access table. I've been using Access to generate some custom queries/reports.
However, this ODBC database changes frequently and I'm trying to discover where the discrepancy is coming from. (hundreds of thousands of records to go through, but I can easily filter it down into what I'm concerned about)
Right now I've been manually pulling the data each day, exporting to Excel, counting the totals for each category I want to track, and logging in another Excel file.
I'd rather automate this in Access if possible, but haven't been able to get my head around it yet.
I've already linked the ODBC databases I'm concerned with, and can generate the query I want to generate.
What I'm struggling with is how to capture this daily and then log that total so I can trend it over a given time period.
If the data were constant, this would be easy for me to understand/do. However, the data can change daily.
EX: This is a database of work orders. Work orders (which are basically my primary key) are assigned to different departments. A single work order can belong to many different departments and have multiple tasks/holds/actions tied to it.
Work Order 0237153-03 could be assigned to Department A today, but then could be reassigned to Department B tomorrow.
These work orders also have "ranking codes" such as Priority A, B, C. These too can be changed at any given time. Today Work Order 0237153-03 could be priority A, but tomorrow someone may decide that it should actually be Priority B.
This is why I want to capture all available data each day (The new work orders that have come in overnight, and all the old work orders that may have had changes made to them), count the totals of the different fields I'm concerned about, then log this data.
Then repeat this every day.
The question you ask is very vague, so here is a general answer.
You are counting the items you get from a database table.
It may be that you don't need to actually count them every day: if the table in the database stores all the data for every day, you simply need to create a query that counts the items in the table for each day stored in it.
You are right that this would be best done in Access.
You might not have to "log the counts in another table", though.
It seems you are quite new to Access, so you might benefit from these links: videos numbered 61, 70 here and also video 7 here
These will help, or buy a book / use web resources.
PART2.
If you have to bodge it because you can't get the ODBC database to use triggers/data macros to log a history, you could store a history yourself like this... BUT you have to do it EVERY day.
0. On day 1 take a full copy of the ODBC data as YOURTABLE. Add a field "dump Number" and set it to 1 for every record.
1. Link to the ODBC data every day.
2. Join from YOURTABLE to the ODBC table and find any records that have changed (i.e. test just the fields you want to monitor and see whether any of them have changed).
3. Append these changed records to YOURTABLE with a new value for "dump number" (2 on the second day). This MUST always increment!
4. You can now write SQL to get the most recent record for each primary key.
SELECT *
FROM Mytable
INNER JOIN
(
SELECT PrimaryKeyFields, MAX(DumpNumber) AS MAXDumpNumber
FROM Mytable
GROUP BY PrimaryKeyFields
) AS T1
ON t1.PrimaryKeyFields = Mytable.PrimaryKeyFields
AND t1.MAXDumpNumber= Mytable.DumpNumber
5. You can compare the most recent records with any previous records, i.e. get the previous dump.
Note that replacing the last condition in the above SQL with the following will NOT work (unless you always keep every record!):
AND t1.MAXDumpNumber-1 = Mytable.DumpNumber
Use something like this to get the previous row:
SELECT Mytable.*
FROM Mytable
INNER JOIN
(
SELECT Mytable.PrimaryKeyFields
, MAX(Mytable.DumpNumber) AS MAXDumpNumber
FROM Mytable
INNER JOIN
(
SELECT PrimaryKeyFields
, MAX(DumpNumber) AS MAXDumpNumber
FROM Mytable
GROUP BY PrimaryKeyFields
) AS TabLatest
ON TabLatest.PrimaryKeyFields = Mytable.PrimaryKeyFields
AND
TabLatest.MAXDumpNumber <> Mytable.DumpNumber
GROUP BY Mytable.PrimaryKeyFields
) AS T1
ON T1.PrimaryKeyFields = Mytable.PrimaryKeyFields
AND T1.MAXDumpNumber = Mytable.DumpNumber
Note that the <> in the innermost join condition is VERY important: it excludes the latest dump, so the MAX that remains is the previous one.
Create 4 and 5 as MS Access named queries (or SS views) and then treat them like tables to do the comparison.
Make sure you have indexes created on the PK fields and the DumpNumber, and that the combination is unique. This will speed things up.
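The "latest" and "previous" queries from steps 4 and 5 can be tried out on a toy history table. The sketch below uses Python's sqlite3 module instead of Access; the History table, its columns (WorkOrder, Dept, Priority, DumpNumber), and the sample rows are all hypothetical. Work order WO-1 deliberately has no row for dump 3, which is the case where "MAXDumpNumber - 1" would fail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE History (WorkOrder TEXT, Dept TEXT, Priority TEXT, DumpNumber INTEGER);
-- Unique index on key + dump number, as the answer recommends.
CREATE UNIQUE INDEX idx_history ON History (WorkOrder, DumpNumber);
INSERT INTO History VALUES
  ('WO-1', 'A', 'A', 1),
  ('WO-2', 'A', 'B', 1),
  ('WO-1', 'B', 'A', 2),   -- reassigned to Department B
  ('WO-1', 'B', 'B', 4);   -- priority changed; nothing logged in dump 3
""")

LATEST = """
SELECT h.WorkOrder, h.Dept, h.Priority, h.DumpNumber
FROM History AS h
INNER JOIN (SELECT WorkOrder, MAX(DumpNumber) AS MaxDump
            FROM History GROUP BY WorkOrder) AS t
  ON t.WorkOrder = h.WorkOrder AND t.MaxDump = h.DumpNumber
ORDER BY h.WorkOrder
"""

PREVIOUS = """
SELECT h.WorkOrder, h.Dept, h.Priority, h.DumpNumber
FROM History AS h
INNER JOIN (
    SELECT h2.WorkOrder, MAX(h2.DumpNumber) AS PrevDump
    FROM History AS h2
    INNER JOIN (SELECT WorkOrder, MAX(DumpNumber) AS MaxDump
                FROM History GROUP BY WorkOrder) AS latest
      ON latest.WorkOrder = h2.WorkOrder
     AND latest.MaxDump <> h2.DumpNumber   -- drop the latest row first
    GROUP BY h2.WorkOrder
) AS p
  ON p.WorkOrder = h.WorkOrder AND p.PrevDump = h.DumpNumber
ORDER BY h.WorkOrder
"""

latest_rows = conn.execute(LATEST).fetchall()
previous_rows = conn.execute(PREVIOUS).fetchall()
print(latest_rows)
print(previous_rows)
```

WO-2 never changed, so it has a latest row but no previous one, and it drops out of the second result.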
Finish it in time for Christmas... and flag this as an answer!

Linking a table to two columns in a second table

I have an issue where I think my major problem is figuring out how to phrase it to get an acceptable answer from Google.
The situation:
Table A is 'Invoices'. It has a column that links to Table B, 'Jobs', in two places: it links either to our 'Job Number' column or to the 'Client Number' column. The major issue is that 'Client Number' and 'Job Number' can be the same number if we set the job up instead of the client setting the job up.
What I'm getting is that every time there is the same number in either column the results are duplicated.
Now this is extremely simplifying the situation to try and make it a bit more understandable, but I am essentially looking for a statement that looks at Table A, gets the value, then compares it against Column B1; if that doesn't match, it compares it against B2; if that doesn't match either, it excludes the row from the results. The key is that if the value matches against B1, it doesn't go on to compare against B2.
Any help with this would be greatly appreciated, even if it is just a point in the direction of the very obvious operator or function that does this. It's hitting the end of a very long day.
Thank you.
Edit:
A further description:
Invoice Table
---------------------------------
PK, INVOICE_NUMBER, LINK_TO_JOB
Job Table
---------------------------------
PK, JOB_NUMBER, CLIENT_JOB_NUMBER
Now the crux of the matter is that both PKs are database-generated sequential numbers, so there is no overlap there. The invoice number and the job number are both application-generated sequential numbers with no overlap. The link to job is application generated and, when an invoice is raised, links to one of two fields in the jobs table based on rules. For simplicity, let's say those rules are: if there is a Client Job Number, link to that; if not, link to the job number.
Now the Client Job Number is a field that is written into by people, so lots of mistakes can and do happen, but lots of crap gets put in this field as well. Stuff like 'Email' and 'Fax' are very common entries. So when there is crap in there like 'Email', it links to a series of other fields holding the same 'Email' tag.
So that's problem one.
Problem two Where Statement:
SELECT INVOICE_NUMBER,
LINK_TO_JOB,
JOB_NUMBER,
CLIENT_JOB_NUMBER
FROM JOBS_TABLE a,
INVOICE_TABLE b
How do I set up the where statement to get the desired result? I've tried:
WHERE (LINK_TO_JOB = JOB_NUMBER OR LINK_TO_JOB = CLIENT_JOB_NUMBER)
This returns lots of multiples, such as when the job number and client job number are identical and when there are multiple identical written-in answers ('email' etc.). Now this might be unavoidable, and I will end up using a DISTINCT with this where statement to do the best I can with what I have. However, what I want to do is:
WHERE (LINK_TO_JOB = JOB_NUMBER (+) OR LINK_TO_JOB = CLIENT_JOB_NUMBER (+))
Which comes back with an error, as you cannot use outer joins with an OR operator.
If nothing comes from this I might just have to go with the OR connection and then throw in the Select Distinct and then build redundancy into Invoicing process so that when the database misses links a manual process catches them.
Although I'm all ears for any ideas.
One way of doing this would be to use a set operation. UNION will give you a distinct set of values. You haven't given much detail so I'm guessing at the specifics: you'll need to amend them for your needs.
with j as ( select * from jobs )
select j.*, inv.*
from invoices inv
join j on ( inv.job_no = j.job_no)
union
select j.*, inv.*
from invoices inv
join j on ( inv.job_no = j.client_no)
The underlying reason for your difficulties is that the data model is half-cooked. In a proper design INVOICES.JOB_NO would have a foreign key relationship referencing JOBS.JOB_NO. Whereas JOBS.CLIENT_NO would be an additional piece of information, a business key, but would not be referenced by INVOICES. Of course it can be displayed on an actual invoice, that's why Nature gave us joins.
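The UNION removes exact duplicate rows, but it does not by itself express the question's rule of "try JOB_NUMBER first, and only fall back to CLIENT_JOB_NUMBER when that fails". One possible way to encode that priority is a NOT EXISTS guard on the fallback branch. The sketch below uses Python's sqlite3 module rather than Oracle, takes the table and column names from the question's edit, and runs on invented rows, so treat it as an illustration under those assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE JOBS_TABLE (PK INTEGER PRIMARY KEY, JOB_NUMBER TEXT, CLIENT_JOB_NUMBER TEXT);
CREATE TABLE INVOICE_TABLE (PK INTEGER PRIMARY KEY, INVOICE_NUMBER TEXT, LINK_TO_JOB TEXT);
INSERT INTO JOBS_TABLE VALUES
  (1, '100', '100'),     -- we set the job up: both numbers identical
  (2, '101', '100'),     -- another job whose client number collides with job 100
  (3, '102', 'C900'),
  (4, '103', 'Email'),   -- free-text junk in the client field
  (5, '104', 'Email');
INSERT INTO INVOICE_TABLE VALUES
  (10, 'INV1', '100'),
  (11, 'INV2', 'C900'),
  (12, 'INV3', 'Email'),
  (13, 'INV4', 'X999');  -- matches nothing, so it is excluded
""")

# Plain OR: every job matching on either column is returned.
naive = conn.execute("""
SELECT i.INVOICE_NUMBER, j.JOB_NUMBER
FROM INVOICE_TABLE AS i
JOIN JOBS_TABLE AS j
  ON i.LINK_TO_JOB = j.JOB_NUMBER OR i.LINK_TO_JOB = j.CLIENT_JOB_NUMBER
""").fetchall()

# Prioritised: use the client number only if no job number matches at all.
rows = conn.execute("""
SELECT i.INVOICE_NUMBER, j.JOB_NUMBER
FROM INVOICE_TABLE AS i
JOIN JOBS_TABLE AS j
  ON i.LINK_TO_JOB = j.JOB_NUMBER
  OR (i.LINK_TO_JOB = j.CLIENT_JOB_NUMBER
      AND NOT EXISTS (SELECT 1 FROM JOBS_TABLE AS j2
                      WHERE j2.JOB_NUMBER = i.LINK_TO_JOB))
ORDER BY i.INVOICE_NUMBER, j.JOB_NUMBER
""").fetchall()

print(len(naive), rows)
```

INV1 now links only to job 100, but INV3 still matches both 'Email' jobs: no join condition can disambiguate free-text values, which is the data-model problem described above.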
Use SELECT DISTINCT to remove the duplicates from your results set.
OK, well, group effort here. I used the union join as suggested by APC and modified it to fit my data and all of its eccentricities (read: the French couldn't data model their way out of a paper bag). Then I surrounded everything in a distinct statement, as suggested by user1871207 and Hikaru-Shindo.
But negative marks go to me. The reason my question was so unclear was severalfold, but the big piece of information that was difficult for me to grasp and explain was that invoices are not always for jobs, coupled with the fact that invoices can be consolidated (which just went and screwed everything up). This is just a big mess that I have, with your help, managed to put a very small piece of two-year-old scotch tape on.
My only hope for a continued career here is to use the exceptions that come up (and they will come at me like a spider monkey!) to hopefully amend the entire invoice process so that we can report some basic profit and loss numbers.
Cheers for all your help.

What is a fast way of joining two tables and using the first table column to "filter" the second table?

I am trying to develop a SQL Server 2005 query, but so far I have been unsuccessful. I have tried every approach that I know, such as derived tables, sub-queries, CTEs, etc., but I couldn't solve the problem. I won't post the queries I tried here because they involve many other columns and tables, but I will try to explain the problem with a simpler example:
There are two tables: PARTS_SOLD and PARTS_PURCHASED. The first contains products that were sold to customers, and the second contains products that were purchased from suppliers. Both tables contain a foreign key associated with the movement itself, which contains the dates, etc.
Here is the simplified schema:
Table PARTS_SOLD:
part_id
date
other columns
Table PARTS_PURCHASED
part_id
date
other columns
What I need is to join every row in PARTS_SOLD with a unique row from PARTS_PURCHASED, chosen by part_id and the maximum "date", where that "date" is equal to or before the "date" column from PARTS_SOLD. In other words, for every sale of an item, I need to collect some information from the last purchase event for that item.
The problem itself is that I didn't find a way of joining the PARTS_PURCHASED table with PARTS_SOLD table using the column "date" from PARTS_SOLD to limit the MAX(date) of the PARTS_PURCHASED table.
I could have done this with a cursor to solve the problem with the tools I know, but every table has millions of rows, and perhaps using cursors or sub-queries that evaluate a query for every row would make the process very slow.
You aren't going to like my answer. Your database is designed incorrectly, which is why you can't get the data back out the way you want. Even using a cursor, you would not get good data from this. Assume that you purchased 5 of part 1 on May 31, 2010. Assume that on June 1, you sold ten of part 1. Matching just on date, you would match all ten to the May 31 purchase even though that is clearly not correct; some parts might have been purchased on May 23 and some may have been purchased on July 19, 2008.
If you want to know which purchased part relates to which sold part, your database design should include the PartPurchasedID as part of the PartsSold record and this should be populated at the time of the purchase, not later for reporting when you have 1,000,000 records to sort through.
Perhaps the following would help:
SELECT S.*
FROM PARTS_SOLD S
INNER JOIN (SELECT PART_ID, MAX(DATE) AS MAX_DATE
FROM PARTS_PURCHASED
GROUP BY PART_ID) D
ON (D.PART_ID = S.PART_ID)
WHERE D.MAX_DATE <= S.DATE
Share and enjoy.
I'll toss this out there, but it's likely to contain all kinds of mistakes... both because I'm not sure I understand your question and because my SQL is... weak at best. That being said, my thought would be to try something like:
SELECT * FROM PARTS_SOLD
INNER JOIN (SELECT part_id, max(date) AS max_date
FROM PARTS_PURCHASED
GROUP BY part_id) AS subtable
ON PARTS_SOLD.part_id = subtable.part_id
AND PARTS_SOLD.date >= subtable.max_date
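Both grouped queries above compare each sale against the single latest purchase per part, so a sale made before that purchase finds no match at all. To pick the latest purchase at or before each individual sale, the MAX has to be correlated with the sale row. Here is a minimal sketch of that shape using Python's sqlite3 module, not SQL Server 2005; the unit_cost column, the index, and the sample rows are assumptions for illustration. On SQL Server 2005 itself, OUTER APPLY with TOP 1 ... ORDER BY date DESC is a commonly used alternative for the same lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PARTS_PURCHASED (part_id INTEGER, date TEXT, unit_cost REAL);
CREATE TABLE PARTS_SOLD (part_id INTEGER, date TEXT);
-- Assumed index so the per-sale lookup can seek on (part_id, date).
CREATE INDEX idx_purchased ON PARTS_PURCHASED (part_id, date);
INSERT INTO PARTS_PURCHASED VALUES
  (1, '2010-05-23',  9.0),
  (1, '2010-05-31', 10.0),
  (1, '2010-06-15', 11.0),
  (2, '2010-06-10',  5.0);
INSERT INTO PARTS_SOLD VALUES
  (1, '2010-06-01'),
  (1, '2010-06-20'),
  (2, '2010-06-01');   -- sold before any recorded purchase
""")

# For each sale, join the purchase row whose date is the latest one <= sale date.
rows = conn.execute("""
SELECT s.part_id, s.date AS sold_date, p.date AS purchased_date, p.unit_cost
FROM PARTS_SOLD AS s
LEFT JOIN PARTS_PURCHASED AS p
  ON p.part_id = s.part_id
 AND p.date = (SELECT MAX(p2.date)
               FROM PARTS_PURCHASED AS p2
               WHERE p2.part_id = s.part_id
                 AND p2.date <= s.date)
ORDER BY s.part_id, s.date
""").fetchall()

for row in rows:
    print(row)
```

The LEFT JOIN keeps sales with no earlier purchase (part 2 here) and returns NULLs for the purchase columns. If two purchases of the same part share a date, the sale matches both rows, so a tie-breaker would be needed in that case.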