Access 2016 SQL: Find minimum absolute difference between two columns of different tables

I haven't been able to figure out exactly how to put together this SQL string. I'd really appreciate it if someone could help me out. I am using Access 2016, so please only provide answers that will work with Access. I have two queries that both have different fields except for one in common. I need to find the minimum absolute difference between the two similar columns. Then, I need to be able to pull the data from that corresponding record. For instance,
qry1.Col1 | qry1.Col2
-----------|-----------
10245.123 | Have
302044.31 | A
qry2.Col1 | qry2.Col2
----------------------
23451.321 | Great
345622.34 | Day
Find minimum absolute difference in a third query, qry3. For instance, Min(Abs(qry1!Col1 - qry2!Col1)). I imagine it would produce one of these tables for each value in qry1.Col1. For the value 10245.123,
qry3.Col1
----------
13206.198
335377.217
Since 13206.198 is the minimum absolute difference, I want to pull the record corresponding to that from qry2 and associate it with the data from qry1 (I'm assuming this uses a JOIN). Resulting in a fourth query like this,
qry4.Col1 (qry1.Col1) | qry4.Col2 (qry1.Col2) | qry4.Col3 (qry2.Col2)
----------------------------------------------------------------------
10245.123 | Have | Great
302044.31 | A | Day
If this is all doable in one SQL string, that would be great. If a couple of steps are required, that's okay as well. I would just like to avoid the time-consuming route of doing this with loops and RecordSet.FindFirst in VBA.

You can use a correlated subquery:
select q1.*,
(select top 1 q2.col2
from qry2 as q2
order by abs(q2.col1 - q1.col1), q2.col2
) as qry2_col2
from qry1 as q1;

Related

Selecting limited results from two tables

I apologise if this has been asked before. I'm still not certain how to phrase my question for the title, so wasn't sure what to search for.
I have a hundred or so databases in the same instance, one for each of my customers, named for the customer, and they all have the same structure. I want to select a single result set that includes the database name along with the most recent date entry in one of the tables. I can pull the database names from sys.databases, but then for each database I want to select the most recent date from Events.Date_Logged so that my result set looks something like this:
Cust_Name | Latest_Event
-----------|-------------
Customer1 | 01/02/2020
Customer2 | 02/02/2020
Customer3 | 03/02/2020
I'm really struggling with the syntax though. I either get just a single row returned or every single event for each customer. I think my joins are as rusty as hell.
Any help would be appreciated.
What I suggest you do:
Declare a result variable (of type table)
Use a cursor to go over every database
Inside the cursor: do a select top 1 ... order by date desc to get the most recent record. Save this result in the result variable.
After the cursor print the result variable.
That should do the trick.
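The steps above can be sketched roughly as follows. This is not T-SQL: it uses Python's sqlite3 module, with attached in-memory databases standing in for the per-customer databases, a Python loop standing in for the cursor, and a list standing in for the table variable. The sample dates are made up. In SQL Server itself the per-database query would have to be built as dynamic SQL inside the cursor loop.

```python
import sqlite3

# Hypothetical sample data: one database per customer, each with an Events table.
SAMPLE = {
    "Customer1": ["2020-01-15", "2020-02-01"],
    "Customer2": ["2020-02-02", "2019-12-31"],
    "Customer3": ["2020-02-03"],
}

# Autocommit mode, because SQLite refuses ATTACH inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
for name, dates in SAMPLE.items():
    conn.execute(f"ATTACH DATABASE ':memory:' AS {name}")
    conn.execute(f"CREATE TABLE {name}.Events (Date_Logged TEXT)")
    conn.executemany(f"INSERT INTO {name}.Events VALUES (?)", [(d,) for d in dates])


def latest_event_per_database(connection, db_names):
    """Visit every database in turn and keep its most recent Date_Logged."""
    results = []  # plays the role of the table variable
    for name in db_names:
        row = connection.execute(
            f"SELECT Date_Logged FROM {name}.Events "
            "ORDER BY Date_Logged DESC LIMIT 1"
        ).fetchone()
        results.append((name, row[0] if row else None))
    return results


latest = latest_event_per_database(conn, list(SAMPLE))
for cust_name, latest_event in latest:
    print(cust_name, latest_event)
```

The dates are stored as ISO strings so that ordering them as text matches chronological order.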

How do you 'join' multiple SQL data sets side by side (that don't link to each other)?

How would I go about joining results from multiple SQL queries so that they are side by side (but unrelated)?
The reason I am thinking of this is so that I can run 1 query in Google Big Query and it will return 1 single table which I can import into Excel and do some charts.
e.g. Query 1 looks at dataset TableA and returns:
**Metric:** Sales
**Value:** 3,402
And then Query 2 looks at dataset TableB and returns:
**Name:** John
**DOB:** 13 March
They would both use different tables and different filters, etc.
What would I do to make it look like:
---Sales----------John----
---3,402-------13 March----
Or alternatively:
-----Sales--------3,402-----
-----John-------13 March----
Or is there a totally different way to do this?
I can see the use case for the above. I've used something similar to create a single table from multiple tables with different metrics to query in Data Studio so that, for example, filters apply to all data in the dataset. However, in that case the data did share some dimensions that made it worthwhile.
If you are going to put those together with no relationship between the tables, I'd have 4 columns with TYPE describing the data in that row to make for easier filtering.
Type | Sales | Name | DOB
Use UNION ALL to put the rows together so you have something like
"Sales" | 3402 | null | null
"Customer Details" | null | John | 13 March
However, like the others said, make sure you have a good reason to do that; otherwise you're just creating a bigger table to query for no reason.
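Here is a minimal sketch of that UNION ALL layout. It runs against SQLite through Python's sqlite3 module rather than BigQuery, and the table and column names (TableA.metric, TableA.value, TableB.name, TableB.dob) are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source tables for the two unrelated queries.
    CREATE TABLE TableA (metric TEXT, value INTEGER);
    CREATE TABLE TableB (name TEXT, dob TEXT);
    INSERT INTO TableA VALUES ('Sales', 3402);
    INSERT INTO TableB VALUES ('John', '13 March');
""")

# Each branch fills only its own columns and pads the rest with NULL;
# the leading type column says which query a row came from.
combined = conn.execute("""
    SELECT 'Sales' AS type, value AS sales, NULL AS name, NULL AS dob
      FROM TableA
    UNION ALL
    SELECT 'Customer Details', NULL, name, dob
      FROM TableB
""").fetchall()

for row in combined:
    print(row)
```

Both branches must list the same number of columns in the same order, which is why the unused ones are padded with NULL.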

Why does ROW_NUMBER in a view not respect filters

I am using SQL Server 2014.
I have created a view which surfaces patient history answers such as tobacco use, alcohol use, etc. These answers are time stamped but not linked to the appointment identifiers, and I need to find the most recent answer relative to the appointment date.
The data the view surfaces looks like this:
PATIENT_ID | ANSWER_DATE | TOBACCO_USE
1 | 1/1/2018 | No
1 | 1/5/2018 | Yes
1 | 1/10/2018 | Quit
2 | 1/1/2018 | No
I know I can use ROW_NUMBER() in an inline query when I join to this table to get the ranking I need, but I really want to add a ROW_NUMBER() OVER (PARTITION BY PATIENT_ID ORDER BY ANSWER_DATE DESC) AS 'rnkDesc' column to the view, to make it simpler for other developers to properly join to this table.
With this new column a SELECT * from the view looks like this:
PATIENT_ID | ANSWER_DATE | TOBACCO_USE | rnkDesc
1 | 1/1/2018 | No | 3
1 | 1/5/2018 | Yes | 2
1 | 1/10/2018 | Quit | 1
2 | 1/1/2018 | No | 1
That is as expected, now I join from my appointments table like so:
FROM APPOINTMENTS appt
LEFT JOIN myHistoryView his
on appt.PATIENT_ID = his.PATIENT_ID
and his.ANSWER_DATE <= APPT.APPT_DATE
and his.rnkDesc = 1
This does not work though, as it appears like the ROW_NUMBER is evaluated before the filters are applied. If I filter my view where PATIENT_ID = 1 and ANSWER_DATE = 1/5/2018 then rnkDesc still shows 2, instead of 1 like it would if I was using ROW_NUMBER in an inline query.
I am really interested in why this behaves this way. I can code around it by using an inline query.
I know that these ranking functions are nondeterministic, and would have thought the engine would filter the result set in the view before it generates the ROW_NUMBER. I tried this with RANK and DENSE_RANK as well, and it appears that these also behave the same way (determined before the filters are applied).
If you implemented this with CTEs or a subquery, you'd see the same results. This is necessary because sometimes you need to generate a rank on a result, and then have that rank be unchanged by outer queries. So it is as if the rank is generated first as part of the subquery, and then it is "locked in" so you can filter the results based on that row number.
Imagine that one of the filters in your outer query was actually rnkDesc = 2, which is one way to get things like the "2nd most". If the row number were not generated until after the outer query's filter was applied, this approach would be impossible: how do you filter on the value of something that hasn't been determined yet? This is the same reason that filtering on a window function usually requires nesting a subquery or a CTE first, so that you can filter on the generated results without them being renumbered or re-ranked by the outer query.
Therefore it makes sense to lock-in windowed function results based on the nesting they occur in. You have to kind of think about this kind of nesting in terms of subresults.
So that answers your question of "Why" it is this way. I understand what you're trying to achieve though and why you want to do this. You're trying to simplify the use of the window function and have it apply dynamically to the final result. I'm honestly not sure "how" you'd do this without just including the window function in the outer query. There may be a way to embed it in a UDF, but I'm not sure.
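The "locked in" behaviour can be reproduced with a small sketch. It uses SQLite (3.25 or later, for window functions) through Python's sqlite3 module as a stand-in for SQL Server 2014, and the base table name history is an assumption. Dates are written as ISO strings so they compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical base table behind the view from the question.
    CREATE TABLE history (PATIENT_ID INTEGER, ANSWER_DATE TEXT, TOBACCO_USE TEXT);
    INSERT INTO history VALUES
        (1, '2018-01-01', 'No'),
        (1, '2018-01-05', 'Yes'),
        (1, '2018-01-10', 'Quit'),
        (2, '2018-01-01', 'No');
    CREATE VIEW myHistoryView AS
    SELECT PATIENT_ID, ANSWER_DATE, TOBACCO_USE,
           ROW_NUMBER() OVER (PARTITION BY PATIENT_ID
                              ORDER BY ANSWER_DATE DESC) AS rnkDesc
      FROM history;
""")

# Filtering the view: the numbering was computed over all of the patient's rows.
via_view = conn.execute("""
    SELECT ANSWER_DATE, rnkDesc
      FROM myHistoryView
     WHERE PATIENT_ID = 1 AND ANSWER_DATE <= '2018-01-05'
     ORDER BY ANSWER_DATE
""").fetchall()

# Inline window function: WHERE runs first, so only surviving rows are numbered.
inline = conn.execute("""
    SELECT ANSWER_DATE,
           ROW_NUMBER() OVER (PARTITION BY PATIENT_ID
                              ORDER BY ANSWER_DATE DESC) AS rnkDesc
      FROM history
     WHERE PATIENT_ID = 1 AND ANSWER_DATE <= '2018-01-05'
     ORDER BY ANSWER_DATE
""").fetchall()

print(via_view)
print(inline)
```

Through the view, the 2018-01-05 answer keeps rank 2 because the 2018-01-10 row was already counted; inline, the same row gets rank 1.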
It's important to understand that there is a clear, indeed deterministic, order of evaluation in SQL.
The ROW_NUMBER on the view has already been evaluated before the left-join occurs. The ON clause does not act as a "filter" on the table to which the join refers, but as a condition that must be met for a join between the tables to occur.
You could of course create a "parameterised view" (a table-valued inline function), which allows you to pass in the filter date to a where-clause before the ROW_NUMBER is applied in the select-clause, and then outer-apply onto it. That may be appropriate if a large number of queries use the same functionality.
But otherwise I'd be inclined to leave the "substance use history" view unadulterated with any row-numbering (unless it is used independently by other queries to get the absolute latest answer), and write the "the latest row on or before the current appointment" logic inline.
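One way the inline "latest row on or before the current appointment" logic might look is sketched below. It again uses SQLite via Python's sqlite3 module rather than SQL Server, so there is no OUTER APPLY; instead the row number is computed after the date condition has been applied in the join. The history table name, the APPT_ID column and the appointment rows are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; APPT_ID is an assumed appointment key.
    CREATE TABLE history (PATIENT_ID INTEGER, ANSWER_DATE TEXT, TOBACCO_USE TEXT);
    CREATE TABLE APPOINTMENTS (APPT_ID INTEGER, PATIENT_ID INTEGER, APPT_DATE TEXT);
    INSERT INTO history VALUES
        (1, '2018-01-01', 'No'),
        (1, '2018-01-05', 'Yes'),
        (1, '2018-01-10', 'Quit'),
        (2, '2018-01-01', 'No');
    INSERT INTO APPOINTMENTS VALUES
        (1, 1, '2018-01-07'),
        (2, 1, '2018-01-15'),
        (3, 2, '2017-12-01');
""")

# The join keeps only answers dated on or before each appointment; the row
# number is then assigned per appointment, so rn = 1 is the latest such answer.
latest_answers = conn.execute("""
    SELECT APPT_ID, APPT_DATE, TOBACCO_USE
      FROM (SELECT appt.APPT_ID, appt.APPT_DATE, his.TOBACCO_USE,
                   ROW_NUMBER() OVER (PARTITION BY appt.APPT_ID
                                      ORDER BY his.ANSWER_DATE DESC) AS rn
              FROM APPOINTMENTS AS appt
              LEFT JOIN history AS his
                ON his.PATIENT_ID = appt.PATIENT_ID
               AND his.ANSWER_DATE <= appt.APPT_DATE) AS ranked
     WHERE rn = 1
     ORDER BY APPT_ID
""").fetchall()

for row in latest_answers:
    print(row)
```

The appointment with no earlier answer still appears, with NULL for TOBACCO_USE, because of the left join.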

SQL getting record for maximum value: why not use "ORDER BY"?

I know that the "select record corresponding to the maximum value for a field" has been exhaustively answered, but I was wondering why nobody suggested using an ORDER BY clause to get the right row.
For example, I have this table:
| other_field | target_field |
| 1 | 15 |
| 2 | 25 |
| 3 | 20 |
and I want to find the other_field value corresponding to the maximum target_field (e.g. in this case, I want to find 2).
Many people suggested using GROUP and JOIN, however my first idea was to use:
SELECT other_field FROM table ORDER by target_field DESC LIMIT 1;
Is there anything wrong with this? The only problem I can think of is that maybe ordering takes longer than just finding the maximum (although on the other hand the JOIN might also take a while).
Thanks!
EDIT: sorry guys for the late replies, I'm new here and I was expecting to get some e-mails for notifications :)
Yes.
Unless an index on target_field lets the engine read the rows in order, it has to sort (or at least scan) every record before it can return any data, which can be highly inefficient. It will return what you want, but not necessarily in the best possible way. Aggregate functions tend to do it better and quicker.
With your current query, once you reached a much higher data load, it would take ages to process and materialize. (With smaller data sets, you should be fine)
If you need a single value from one or more tables, then go for MAX and GROUP BY.
If you are querying only one table and require multiple columns, then it is OK to use ORDER BY ... DESC.
If you need a single value from a single table, then MAX is preferred here too.
I hope that makes my points clear.
You can try to use the following query :
select top 1 other_field from tester order by target_field desc;
It works well in Sybase. Not sure of other databases.
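For comparison, here is a small sketch that runs both styles against the table from the question. It uses SQLite through Python's sqlite3 module, so the row limit is spelled LIMIT 1 rather than TOP 1, and the table name tester is borrowed from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tester (other_field INTEGER, target_field INTEGER);
    INSERT INTO tester VALUES (1, 15), (2, 25), (3, 20);
""")

# Style 1: sort descending and keep the first row.
by_order = conn.execute("""
    SELECT other_field
      FROM tester
     ORDER BY target_field DESC
     LIMIT 1
""").fetchone()[0]

# Style 2: compute the maximum, then fetch the row(s) that carry it.
by_max = conn.execute("""
    SELECT other_field
      FROM tester
     WHERE target_field = (SELECT MAX(target_field) FROM tester)
""").fetchall()

print(by_order)
print(by_max)
```

One practical difference: if several rows share the maximum, the MAX version returns all of them, while the ORDER BY version returns a single, arbitrarily chosen one unless a tie-breaker is added to the ordering.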

Best way to use hibernate for complex queries like top N per group

I have been working for a while on a reporting application where I use Hibernate to define my queries. However, more and more I get the feeling that for reporting use cases this is not the best approach.
The queries only return partial columns, and thus not typed objects (unless you cast all fields in Java).
It is hard to express queries without going straight into SQL or HQL.
My current problem is that I want to get the top N per group, for example the last 5 days per element in a group, where on each day I display the amount of visitors.
The result should look like:
| RowName | 1-1-2009 | 2-1-2009 | 3-1-2009 | 4-1-2009 | 5-1-2009
| SomeName| 1 | 42 | 34 | 32 | 35
What is the best approach to transform the data which is stored per day per row to an output like this? Is it time to fall back on regular sql and work with untyped data?
I really want to use typed objects for my results but java makes my life pretty hard for that. Any suggestions are welcome!
Using the Criteria API, you can do this:
Session session = ...;
Criteria criteria = session.createCriteria(MyClass.class);
criteria.setFirstResult(0); // offsets are zero-based: start at the first row
criteria.setMaxResults(5);  // return at most five rows
// ... any other criteria ...
List topFive = criteria.list();
To do this in vanilla SQL (and to confirm that Hibernate is doing what you expect) check out this SO post:
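Separately from whatever that post covers, a common plain-SQL pattern for top N per group is to number the rows within each group and keep the first N. The sketch below assumes SQLite 3.25 or later via Python's sqlite3 module; the visits table, its columns and its data are made up for the illustration, and N is 2 to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical per-day visitor counts, one row per element per day.
    CREATE TABLE visits (row_name TEXT, day TEXT, visitors INTEGER);
    INSERT INTO visits VALUES
        ('SomeName', '2009-01-01', 1),
        ('SomeName', '2009-01-02', 42),
        ('SomeName', '2009-01-03', 34),
        ('Other',    '2009-01-01', 7),
        ('Other',    '2009-01-02', 9),
        ('Other',    '2009-01-03', 5);
""")

N = 2  # how many of the most recent days to keep per element

# Number each element's rows from newest to oldest, then keep the first N.
top_n = conn.execute("""
    SELECT row_name, day, visitors
      FROM (SELECT row_name, day, visitors,
                   ROW_NUMBER() OVER (PARTITION BY row_name
                                      ORDER BY day DESC) AS rn
              FROM visits) AS ranked
     WHERE rn <= ?
     ORDER BY row_name, day
""", (N,)).fetchall()

for row in top_n:
    print(row)
```

The result is still one row per element per day; pivoting the days into columns for the report would be a separate step in SQL or in the application code.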