SQL Adding unmatched Join results to output

Maybe I am approaching the entire problem wrong - or inefficiently.
Essentially, I am trying to combine two views of data, one of them a log table, based upon 2 criteria:
RoomName field match
Timestamp matches
vw_FusionRVDB_Schedule (RoomName, StartTime, EndTime, Subject, etc)
This contains the schedule of events for all indexed rooms - times in UTC.
vw_FusionRVDB_DisplayUsageHistory (RoomName, OnTime, OffTime, etc)
This is a log of activity that has been pared down to show only when the room display has been turned on and off - times in UTC.
I want to match display on/off activities with the events scheduled in the room when the logged activities occurred.
The query is really long, and includes a lot of derived fields. Hopefully just focusing on the join section will make it more clear.
SELECT <foo>
FROM dbo.vw_FusionRVDB_Schedule
INNER JOIN dbo.vw_FusionRVDB_DisplayUsageHistory
ON dbo.vw_FusionRVDB_Schedule.RoomName =
dbo.vw_FusionRVDB_DisplayUsageHistory.RoomName
AND dbo.vw_FusionRVDB_Schedule.EndTime >=
dbo.vw_FusionRVDB_DisplayUsageHistory.OnTime
AND dbo.vw_FusionRVDB_Schedule.StartTime <=
dbo.vw_FusionRVDB_DisplayUsageHistory.OffTime
This query is working great. By design, some events are listed more than once. This happens when there are multiple on/off display cycles that occur within the window of the same event. Similarly, if a room display is turned on before or during one event and stays on through a following event, data from that single log entry is used on both the first and second event record. So this query is doing exactly what is needed in this aspect.
However, I also want to add back into the output the scheduled events (from the vw_FusionRVDB_Schedule view) that have no corresponding logged activities in the vw_FusionRVDB_DisplayUsageHistory view.
I have tried various forms of UNION with another query of the vw_FusionRVDB_Schedule view, with null values in the fields otherwise taken or derived from the vw_FusionRVDB_DisplayUsageHistory view. But it adds all scheduled activities back in - not just the ones with no match from the initial join.
I can provide more details if needed. Thank you in advance.

HepC answered in the comments. I was letting the results confuse me. A left join did the trick.
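
A minimal sketch of why the left join works, run against an in-memory SQLite database through Python so it can be executed as-is. The table names `schedule` and `display_usage` and the sample rows are hypothetical stand-ins for the two views; the join conditions are the ones from the question. Every scheduled event is kept, and events with no overlapping on/off cycle come back with NULLs in the usage columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins for the two views in the question.
CREATE TABLE schedule (RoomName TEXT, StartTime TEXT, EndTime TEXT, Subject TEXT);
CREATE TABLE display_usage (RoomName TEXT, OnTime TEXT, OffTime TEXT);
INSERT INTO schedule VALUES
  ('A', '2020-01-01 09:00', '2020-01-01 10:00', 'Standup'),
  ('A', '2020-01-01 13:00', '2020-01-01 14:00', 'Review'),
  ('B', '2020-01-01 09:00', '2020-01-01 10:00', 'Training');
INSERT INTO display_usage VALUES
  ('A', '2020-01-01 08:55', '2020-01-01 09:30'),
  ('A', '2020-01-01 09:40', '2020-01-01 10:05');
""")

# Same ON conditions as the INNER JOIN in the question; only the join type changes.
rows = conn.execute("""
SELECT s.RoomName, s.Subject, u.OnTime, u.OffTime
FROM schedule AS s
LEFT JOIN display_usage AS u
  ON s.RoomName = u.RoomName
 AND s.EndTime >= u.OnTime
 AND s.StartTime <= u.OffTime
ORDER BY s.RoomName, s.StartTime, u.OnTime
""").fetchall()

for row in rows:
    print(row)
```

The 'Standup' event still appears twice (once per display cycle), while 'Review' and 'Training' each appear once with empty usage columns, which is the behaviour the UNION attempts were trying to reach.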

Related

Reducing database load from consecutive queries

I have an application which calls the database multiple times to achieve one simple goal.
A little information about this application: in short, it scrapes data from a webpage and stores specific information from that page in a database. The important fields for this query are player name, position, kill points, and class.
Player name can change or remain the same every day.
Position: multiple players can sit at the same position.
Kill points can increase or remain the same every day.
Class: there are only two possibilities for a given name. For example, A can change to B or remain A (and the same in reverse), but it cannot be C, D, E, or F.
The player name can change on any day, and position can also change depending on the kill point increase since the last update, which brings us back to the goal: search the database day by day, from the current date as far back as 2021-02-22, starting at the most recent entry for a player name and backtracking to the previous day to check whether that player name is still the same or has changed.
The main reference for detecting the change is the kill points. As the days go on, this number will either stay exactly the same or increase; it can never decrease.
So now onto the implementation of this application.
The first query which runs finds the most recent entry for the player name
SELECT TOP(1) * FROM [changes] WHERE [CharacterName]=#charname AND [Territory]=#territory AND [Archived]=0 ORDER BY [Recorded] DESC
Then continue to check the previous days entries with the following query:
SELECT TOP(1) * FROM [changes] WHERE [Territory]=#territory AND [CharacterName]=#charname AND [Recorded]=#searchdate AND ([Class] LIKE '%{Class}%' OR [Class] LIKE '%{GetOpposite(Class)}%') AND [Archived]=0
If no results are found, the application then looks for an alternative name with the following query:
SELECT TOP(5) * FROM [changes] WHERE [Kills] <= #kills AND [Recorded]='{Data.Recorded.AddDays(-1):yyyy-MM-dd}' AND [Territory]=#territory AND [Mode]=#mode AND ([Class] LIKE #original OR [Class] LIKE #opposite) AND [Archived]=0 ORDER BY [Kills] DESC
The aim of the query above is to get the top 5 entries that are the closest possible matches, and then cross-reference them with the day ahead:
SELECT COUNT(*) FROM [changes] WHERE [CharacterName]=#CharacterName AND [Territory]=#Territory AND [Recorded]=#SearchedDate AND [Archived]=0
When checking the day ahead, if the character name is not found there, it is considered to be the old player name for this specific character. Otherwise, if all 5 results have been searched and every one of them is present in the day-ahead search, the name is considered to be new to the table.
From the date this application started running up to today's date, that comes to over 400 individual queries on the database to achieve one goal.
It is also worth noting that this table grows by 14,400 - 14,500 rows every day.
The overall question: is it possible to combine these queries into fewer calls to the database and improve performance?

What you can do to improve performance will be based on what parts of the application stack you can manipulate. Things to try:
Store Less Data - Database content retrieval speed is largely based on how well the database is ordered/normalized and just how much data needs to be searched for each query. Managing a cache of prior scraped pages and only storing data when there's been a change between the current scrape and the last one would guarantee less redundant requests to the db.
Separate specific classes of data - Separating data into dedicated tables would allow you to query a specific table for a specific character, etc... effectively removing one where clause.
Reduce time between queries - Fewer incoming concurrent requests means less resource contention and faster response times to prior requests.
Use another data structure - The only reason you're using top() is because you need data ordered in some specific way (most-recent, etc...). If you used an in-code data structure that keeps the data ordered and is still easily queryable, you could perhaps offload some SQL requests to this structure instead of the db.
The suggestions above are not exhaustive, but what you do to improve performance is largely a function of what in the application stack you have the ability to modify.
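
As a rough illustration of the "Store Less Data" suggestion, here is a small Python sketch that compares the current scrape with a cached copy of the previous one and keeps only the rows that changed. The function name `changed_rows`, the dictionary layout, and the choice of player name as the key are all assumptions made for the example; they are not taken from the asker's application, and a real version would need a key that survives name changes.

```python
def changed_rows(previous, current):
    """Return only the scraped rows that differ from the last cached snapshot."""
    return {
        key: row
        for key, row in current.items()
        if previous.get(key) != row  # new key, or same key with different values
    }

# Hypothetical snapshots: yesterday's cached scrape and today's scrape.
previous = {
    "Alice": {"kills": 10, "cls": "A"},
    "Bob": {"kills": 7, "cls": "B"},
}
current = {
    "Alice": {"kills": 10, "cls": "A"},  # unchanged, so not stored again
    "Bob": {"kills": 9, "cls": "B"},     # kill points increased
    "Cara": {"kills": 1, "cls": "A"},    # not seen before
}

to_store = changed_rows(previous, current)
print(sorted(to_store))
```

Only the rows in `to_store` would be written to the database, so days on which nothing changed for a player add no rows for later queries to scan.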

Access Database - Most Recent Record - Max Function

I'm in the process of building a database to keep track of loaned equipment. I'm trying to build a query that will display the latest record of each machine's location.
Relevant table is:
Movements:
Movement ID (PK)
EntryDate (Automatically generated on record entry)
Serial (FK from a table called stock, which holds Make, Model etc)
Location (Where the machine is)
Status (Things like: Available, Testing, Sold etc)
Current query is:
SELECT Movements.Serial, Max(Movements.EntryDateMovements) AS MaxOfEntryDateMovements
FROM Movements
GROUP BY Movements.Serial;
Which spits out the latest date of a record, and the serial associated with it.
What I need is the status to be shown in the results, but it still be grouped by the serial.
My issue is that when I try to add that, it either comes back with an error about the expression not being part of the aggregate function, or I get more results than expected, as it no longer keeps the results unique to the serial.
I'm pretty new to Access, and have so far been able to muddle through guides, books, and this site to get everything else working, but I'm stuck at this hurdle.
Any help would be much appreciated.

Select top 1 *
from Movements
order by EntryDateMovements desc
This will give you everything for the newest record. This is TSQL but I think it carries over to Access.

Try this
SELECT t.Serial, t.EntryDateMovements, t.Location, t.Status
FROM Movements AS t
INNER JOIN (SELECT Movements.Serial, Max(Movements.EntryDateMovements) AS MaxOfEntryDateMovements
FROM Movements
GROUP BY Movements.Serial) AS MaxMovements
ON t.Serial = MaxMovements.Serial AND t.EntryDateMovements = MaxMovements.MaxOfEntryDateMovements
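
To see what this join returns, here is a small runnable sketch using Python's built-in SQLite driver rather than Access. The sample rows are invented, and SQLite is only a stand-in, so treat it as an illustration of the technique (join each row to its group's maximum date) rather than a test of Access itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movements (
    MovementID INTEGER PRIMARY KEY,
    EntryDateMovements TEXT,
    Serial TEXT,
    Location TEXT,
    Status TEXT
);
-- Invented sample data: two movements per machine.
INSERT INTO Movements (EntryDateMovements, Serial, Location, Status) VALUES
  ('2020-01-01', 'S1', 'Warehouse', 'Available'),
  ('2020-02-01', 'S1', 'Lab',       'Testing'),
  ('2020-01-15', 'S2', 'Warehouse', 'Available'),
  ('2020-03-01', 'S2', 'Customer',  'Sold');
""")

# The subquery finds the newest date per serial; the join pulls the full row for it.
latest = conn.execute("""
SELECT t.Serial, t.EntryDateMovements, t.Location, t.Status
FROM Movements AS t
INNER JOIN (
    SELECT Serial, MAX(EntryDateMovements) AS MaxOfEntryDateMovements
    FROM Movements
    GROUP BY Serial
) AS MaxMovements
  ON t.Serial = MaxMovements.Serial
 AND t.EntryDateMovements = MaxMovements.MaxOfEntryDateMovements
ORDER BY t.Serial
""").fetchall()

for row in latest:
    print(row)
```

Each serial appears once, with the location and status from its most recent movement. If two movements for the same serial share the same latest date, both rows would be returned.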

Multiple aggregates via join in Teradata

I have two tables
Email Contact History
Place of Service
that share a primarymembercustomerid. The Email Contact History has three fields:
Campaigncode
Primarymembercustomerid
maildate
and the Place of Service table has three fields
primarymembercustomerid
servicedate
serviceshortDesc
primarymembercustomerids are selected for E-mail campaigns, then if they walk into one of our branch offices and receive services, they show up in the Place of Service table. I want to count the number of primarymembercustomer ids that are mailed, and right next to it I want to have a count of primarymembercustomerids that showed up to a branch office.
What I have so far:
select
ch.campaigncode,
pos.serviceshortdesc,
count(ch.primarymembercustomerid),
count(pos.primarymembercustomerid)
from mktprodvm.cdmv_prmmbr_contacthist_email ch
right outer join mktprodvm.cdmv_pos pos on ch.primarymembercustomerid = pos.primarymembercustomerid
where ch.campaigncode = 'EDT_ALLACMO'
and pos.servicedate between '2017-02-01' and '2017-02-28'
group by 1,2
What I'm ending up with is a count of primarymembercustomerids that walked into a branch during that time period, but I'm not getting the total count of primarymembercustomerids that were e-mailed. I thought that by doing a right outer join I would get the total number of primarymembercustomerids that were mailed, but it's not working for me. I feel like I need to do some kind of subquery or correlated subquery, but I've read about how to use them and I don't think that's right. I've never used them before and to be quite honest I'm not that great of a SQL coder either. Thanks for any help!

Because of my low reputation on this site I can't comment, so I'm writing this as an answer.
I think that you are using the wrong type of join (or writing the tables in the wrong order). If you don't want to lose rows of your main table, the Email Contact History, you have to do a LEFT JOIN, not a RIGHT JOIN.
Also, I don't know if it's possible, but I'm guessing that a primarymembercustomerid can have more than one service, and since you are selecting the serviceshortdesc, a single email might count in different rows of your answer set and the total won't be accurate. According to what you said you want, I don't see a reason for including the service description in the SELECT.
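
A sketch of the two suggestions above, run on SQLite through Python instead of Teradata, with shortened hypothetical table names (`contacthist`, `pos`) and invented rows. It makes two choices the answer does not spell out: the date filter sits in the ON clause, because a WHERE condition on the service table would discard the mailed members with no visit, and COUNT(DISTINCT ...) is used so that a member with several services is counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified versions of the two tables in the question.
CREATE TABLE contacthist (campaigncode TEXT, primarymembercustomerid INTEGER, maildate TEXT);
CREATE TABLE pos (primarymembercustomerid INTEGER, servicedate TEXT, serviceshortdesc TEXT);
INSERT INTO contacthist VALUES
  ('EDT_ALLACMO', 1, '2017-01-20'),
  ('EDT_ALLACMO', 2, '2017-01-20'),
  ('EDT_ALLACMO', 3, '2017-01-20'),
  ('OTHER',       5, '2017-01-20');
INSERT INTO pos VALUES
  (1, '2017-02-10', 'Service X'),
  (1, '2017-02-12', 'Service Y'),   -- same member, second service
  (2, '2017-03-05', 'Service X'),   -- outside the date window
  (4, '2017-02-11', 'Service Z');   -- never mailed
""")

counts = conn.execute("""
SELECT ch.campaigncode,
       COUNT(DISTINCT ch.primarymembercustomerid)  AS mailed,
       COUNT(DISTINCT pos.primarymembercustomerid) AS visited
FROM contacthist AS ch
LEFT JOIN pos
  ON ch.primarymembercustomerid = pos.primarymembercustomerid
 AND pos.servicedate BETWEEN '2017-02-01' AND '2017-02-28'
WHERE ch.campaigncode = 'EDT_ALLACMO'
GROUP BY ch.campaigncode
""").fetchall()

print(counts)
```

Three members were mailed for the campaign and one of them visited a branch in February, so both totals sit side by side in a single row.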

Return first 'unsorted' join in Oracle SQL

I have a table 'ACCOUNTS', with fields ACCTNO and ACPARENT. One account can be the parent of another. One account can have many children.
It's been discovered that certain external processes are using the 'first child' in certain reports and outputs - but there's no actual 'reason' for any particular child to be 'first', just an unintended bug in the code.
First step in untangling this - I need a query, that can be re-run (but not often, so optimisation is not really a factor) that will identify, for all accounts that are parents, what their 'first child' is.
Problem - the 'first child' isn't necessarily anything to do with record ID. If I run the following query, for example:
SELECT ACCTNO FROM ACCOUNTS WHERE ACPARENT = '80005217';
I get a result of:
ACCTNO
______
80007325
80007310
80007315
80007298
I can absolutely, 100% confirm that for this particular example, account 80007325 is the account ID being used as the 'first child'.
On the flipside, if I run a naive query of:
SELECT A1.ACCTNO, A2.ACCTNO AS CHILDACCOUNT FROM ACCOUNTS A1
INNER JOIN ACCOUNTS A2 ON A1.ACCTNO = A2.ACPARENT
WHERE A1.ACCTNO IN
(SELECT ACPARENT FROM ACCOUNTS);
then if I scroll down to where 80005217 is the parent account, I see the following list:
CHILDACCOUNT
______
80007298
80007310
80007315
80007325
It's sorted, which is exactly what I don't want.
Is there a query that will get me a list of what I want in a single query? A list of all parent accounts, and their 'first child' as returned by SQL unsorted?

To guarantee records coming in a fixed order we must provide the database with sort criteria in the ORDER BY clause. If there is no attribute which defines "first-ness" then no guarantee is possible. Without an ORDER BY clause the records are essentially in an uncontrolled order, although because of database internals they often fall into some kind of pattern.
So, what makes account 80007325 the first child WHERE ACPARENT = '80005217'? Clearly not numerical order. Is there some other criterion? Date created? A flag column? Seems like you need to talk to your users. Do they really care which records come first? All the time or just in some specific report?
If your users cannot specify the criteria there's not much you can do...
...although I might be tempted to sort CHILDACCOUNT numerically by ACCTNO whenever it is displayed. At least that would provide consistency, and the users will get used to it.
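
If a deterministic rule such as "lowest ACCTNO" is acceptable, the list of parents and their first child can be produced in one query. The sketch below runs on SQLite through Python with the account numbers from the question. It does not reproduce the unsorted order the external processes currently see, which cannot be guaranteed; it only shows the consistent alternative suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ACCOUNTS (ACCTNO TEXT PRIMARY KEY, ACPARENT TEXT);
INSERT INTO ACCOUNTS VALUES
  ('80005217', NULL),
  ('80007325', '80005217'),
  ('80007310', '80005217'),
  ('80007315', '80005217'),
  ('80007298', '80005217');
""")

# "First child" is defined here as the lowest ACCTNO per parent (an explicit rule).
first_children = conn.execute("""
SELECT ACPARENT, MIN(ACCTNO) AS FIRSTCHILD
FROM ACCOUNTS
WHERE ACPARENT IS NOT NULL
GROUP BY ACPARENT
ORDER BY ACPARENT
""").fetchall()

print(first_children)
```

For parent 80005217 this picks 80007298, not the 80007325 that the existing processes happen to use, so adopting it would be a change in behaviour to agree with the users.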

How to handle reoccurring calendar events and tasks (SQL Server tables & C#)

I need to schedule events, tasks, appointments, etc. in my DB. Some of them will be one-time appointments, and some will be reoccurring "To-Dos" which must be checked off. After looking at Google's calendar layout and others, plus doing a lot of reading, here is what I have so far.
Calendar table (Could be called schedule table I guess): Basic_Event Title, start/end, reoccurs info.
Calendar occurrence table: ties to schedule table, occurrence specific text, next occurrence date / time????
Looked here at how SQL Server does its jobs: http://technet.microsoft.com/en-us/library/ms178644.aspx
but this is slightly different.
Why two tables: I need to track status of each instance of the reoccurring task. Otherwise this would be much simpler...
so... on to the questions:
1) Does this seem like the proper way to go about it? Is there a better way to handle the multiple occurrence issue?
2) How often / how should I trigger creation of the occurrences? I really don't want to create a bunch of occurrences... BUT... What if the user wants to view next year's calendar...

Makes sense to have your schedule definition for a task in one table and then a separate table to record each instance of that separately - that's the approach I've taken in the past.
And with regards to creating the occurrences, there's probably no need to create them all up front. Especially when you consider tasks that repeat indefinitely! Again, the approach I've used in the past is to only create the next occurrence. When that instance is actioned, the next instance is then calculated and created.
This leaves the issue of viewing future occurrences. For this, you can start off with the initial/next scheduled occurrence and just calculate the future occurrences on-the-fly at display time.
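
A minimal Python sketch of calculating occurrences on the fly, assuming the simplest kind of recurrence: a fixed interval in days from the next stored occurrence. The function name `occurrences` and its parameters are hypothetical; real schedules (monthly, weekdays only, end dates) would need a richer rule.

```python
from datetime import date, timedelta

def occurrences(next_due, interval_days, range_start, range_end):
    """Yield the dates of a fixed-interval task that fall inside a display range.

    Nothing is stored: dates are derived from the single stored next occurrence.
    """
    current = next_due
    while current <= range_end:
        if current >= range_start:
            yield current
        current += timedelta(days=interval_days)

# A weekly task whose next stored occurrence is 1 January 2024,
# displayed on a calendar view covering 10-31 January 2024.
upcoming = list(occurrences(date(2024, 1, 1), 7, date(2024, 1, 10), date(2024, 1, 31)))
print(upcoming)
```

Viewing next year's calendar is then just a call with a different range, and no occurrence rows need to exist until an instance is actually actioned.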

While this isn't an exact answer to your question, I've solved this problem before in SQL Server (though the database here is irrelevant) by modeling a solution based on Unix's cron.
Instead of string parsing we used integer columns in a table to store the various time units.
We had events which could be scheduled; they could either point to a one-time schedule table that represented a distinct point in time (a date/time) or to the recurring schedule table which is modelled after cron.
Additionally remember to model your solution correctly. An event has a duration but the duration is unrelated to the schedule (but an event's duration may impact the schedule by causing conflicts). Do not try to model duration as part of your schedule.
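
The cron-style recurring schedule could look something like the following Python sketch. The field names, the use of `None` as the wildcard (the equivalent of a NULL column), and the `matches` helper are assumptions for illustration; the answer only says that integer columns stored the time units.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class RecurringSchedule:
    """One row of a cron-like schedule table; None means 'any value'."""
    minute: Optional[int] = None
    hour: Optional[int] = None
    day_of_month: Optional[int] = None
    month: Optional[int] = None
    day_of_week: Optional[int] = None  # 0 = Monday, as in datetime.weekday()

def matches(schedule: RecurringSchedule, when: datetime) -> bool:
    """Return True if the given moment satisfies every non-wildcard field."""
    fields = [
        (schedule.minute, when.minute),
        (schedule.hour, when.hour),
        (schedule.day_of_month, when.day),
        (schedule.month, when.month),
        (schedule.day_of_week, when.weekday()),
    ]
    return all(wanted is None or wanted == actual for wanted, actual in fields)

every_monday_9am = RecurringSchedule(minute=0, hour=9, day_of_week=0)
print(matches(every_monday_9am, datetime(2024, 1, 1, 9, 0)))  # a Monday
print(matches(every_monday_9am, datetime(2024, 1, 2, 9, 0)))  # a Tuesday
```

Duration is deliberately absent from the schedule row, in line with the advice above to keep it on the event.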

In the past when we've done this, we had 2 tables:
1) Schedules -> Includes recurrence information
2) Exceptions -> Edit/changes to specific instances
Using SQL, it's possible to get the list of "Schedules" that have at least one instance in a given date range. Then you can expand in the GUI where each instance lies.
