Selecting limited results from two tables - sql

I apologise if this has been asked before. I'm still not certain how to phrase my question for the title, so wasn't sure what to search for.
I have a hundred or so databases in the same instance, one for each of my customers, named for the customer, and they all have the same structure. I want to select a single result set that includes the database name along with the most recent date entry in one of the tables. I can pull the database names from sys.databases, but then for each database I want to select the most recent date from Events.Date_Logged so that my result set looks something like this:
_______________________________
| | |
|Cust_Name |Latest_Event |
|_______________|_______________|
| | |
|Customer1 |01/02/2020 |
|_______________|_______________|
| | |
|Customer2 |02/02/2020 |
|_______________|_______________|
| | |
|Customer3 |03/02/2020 |
|_______________|_______________|
I'm really struggling with the syntax though. I either get just a single row returned or every single event for each customer. I think my joins are as rusty as hell.
Any help would be appreciated.

What I suggest you do:
Declare a result variable (of type table)
Use a cursor to go over every database
Inside the cursor: do a select top 1 ... order by date desc to get the most recent record. Save this result in the result variable.
After the cursor print the result variable.
That should do the trick.

Related

How to compare value with multiple modified values from another table in BigQuery?

I am using Google BigQuery and I got the following issue:
I have a table (A) like this:
| time | request |
|------------------------|-----------------|
|2019-09-24 11:10:00 UTC | fakewebsite.com |
|2019-09-24 11:10:00 UTC | realwebsite.com |
|........................|.................|
|2019-09-24 11:10:00 UTC | foobwebsite.com |
|2019-09-24 11:10:00 UTC | barrwebsite.com |
And another table (B) like this:
| blacklist |
|---------------|
| foo.com |
| ... |
| bar.com |
I want to make a query that will grab a modified version of the values inside the blacklist field of table B as follows:
SPLIT(NET.REG_DOMAIN(blacklist), CONCAT('.',NET.PUBLIC_SUFFIX(blacklist)))[OFFSET(0)] AS to_exclude --this will return only "foo" from "foo.com"
and then return all values from the request field of table A where none of the to_exclude was found.
I know how to do this for one value but I don't know how to do this for multiple. I am looking for something like the following:
#standardSQL
WITH tmp_blacklist AS
(SELECT
SPLIT(NET.REG_DOMAIN(blacklist), CONCAT('.',NET.PUBLIC_SUFFIX(blacklist)))[OFFSET(0)] AS to_exclude
FROM
mydataset.B)
SELECT
request
FROM
mydataset.A
WHERE
request NOT LIKE ("%value1%", "%value2%", ..., "%valuen%") -- I can't use OR along with the NOT LIKE since the values are too many and they will change.
The n values are the values of the tmp_blacklist table.
Also if I don't define the table with the WITH and I define it after the NOT LIKE I am going to get the following error: Scalar subquery produced more than one element which makes sense if LIKE expects only one element. But then again that's half of the job done if it get's fixed since I want the "%value%" and not just the value of the table.
Now I searched online for a way to do this and I found people saying that it can't be done and then some workarounds with combinations of LIKE and IN where people said it will be very slow if one of the tables grows to have tons of data(my case).
What is the best way to do this?
One method uses not exists:
SELECT a.request
FROM mydataset.A a
WHERE NOT EXISTS (SELECT 1
FROM tmp_blacklist bl
WHERE a.request LIKE CONCAT('%', bl.to_exclude, '%'
);
Note that this can be expensive. You might want to test constructing the exclusion string as:
'value1|value2|value3'
and then using regular expressions.

SELECT MAX values for duplicate values in another column

I am having some trouble finding an answer for this one, so I apologize if it was somewhere else.
I have a table 'dbo.MileageImport' that has the following layout which I pulled to find duplicate entries:
|KEY | DATA |
---------------------
|V9864653 | 180288 |
|V9864653 | 22189 |
|V9864811 | 11464 |
|V9864811 | 12688 |
What I am having troubles with is when I run the following SQL in a DB2 environment:
SELECT KEY, MIN(DATA)
FROM dbo.MileageImport
GROUP BY KEY
HAVING (COUNT(KEY)>1);
It ends up pulling the following data:
|KEY | DATA |
---------------------
|V9864811 | 11464 |
|V9864653 | 180288 |
For some reason it's pulling the MIN value for V9864811, but not V9864653. If I inverse that and put MAX instead of MIN, it pulls the opposite values.
Is there something I am missing here so I can pull the MIN DATA value for only duplicate KEY records, or is there another way to do this? The report where this data comes from changes from month to month, so there could be different keys that end up being duplicated that I need to correct. Ultimately I am turning this into a DELETE statement to delete the lower of the two (or more) duplicated mileage entries.
Is your DATA column numerical? or a VARCHAR?
If you find its better to change it to a number if you can, maybe an integer if you aren't having any fractions and its just round numbers.
if not, then you could cast them to an integer value, but if there are lots of transactions or its a big table it will be slow and not ideal. Its bad practise to do that if you could just change the datatype!
SELECT KEY, MIN(CAST(DATA as Int))
FROM dbo.MileageImport
GROUP BY KEY
HAVING (COUNT(KEY)>1)

MariaDB - embed function to automatically sum columns and store result?

it is possible to store a function IN the table to automatically sum a group of columns and store the result in a final column?
ie:
+----+------------+-----------+-------------+------------+
| id | appleCount | pearCount | bananaCount | totalFruit |
+----+------------+-----------+-------------+------------+
| 1 | 300 | 60 | 120 | 480 |
+----+------------+-----------+-------------+------------+
where the column totalFruit is automatically calculated from the previous three columns and updated as the other columns update. in this specific application, there is ONLY going to be the one row. it would be spanky-handy to be able to just push the updated counts and then pull the calculated total out. i seem to recall reading about this ability somewhere, but for the life of me, i can't recall where... :poop:
if there is not way to do this, that's cool. but if there is... :smile:
TIA!
WR!
Yes, it is possible. But is it worth it? It is simple enough to do
SELECT ...
appleCount + pearCount + bananaCount AS totalFruit
...
See MariaDB Generated Columns for how to generate the extra column -- either as a real extra column or "virtual". What version of MariaDB?--There are a number of changes over time.
(MySQL users: 5.7.6 has a similar MySQL Generated Columns.)

Access 2016 SQL: Find minimum absolute difference between two columns of different tables

I haven't been able to figure out exactly how to put together this SQL string. I'd really appreciate it if someone could help me out. I am using Access 2016, so please only provide answers that will work with Access. I have two queries that both have different fields except for one in common. I need to find the minimum absolute difference between the two similar columns. Then, I need to be able to pull the data from that corresponding record. For instance,
qry1.Col1 | qry1.Col2
-----------|-----------
10245.123 | Have
302044.31 | A
qry2.Col1 | qry2.Col2
----------------------
23451.321 | Great
345622.34 | Day
Find minimum absolute difference in a third query, qry3. For instance, Min(Abs(qry1!Col1 - qry2!Col1) I imagine it would produce one of these tables for each value in qry1.Col1. For the value 10245.123,
qry3.Col1
----------
13206.198
335377.217
Since 13206.198 is the minimum absolute difference, I want to pull the record corresponding to that from qry2 and associate it with the data from qry1 (I'm assuming this uses a JOIN). Resulting in a fourth query like this,
qry4.Col1 (qry1.Col1) | qry4.Col2 (qry1.Col2) | qry4.Col3 (qry2.Col2)
----------------------------------------------------------------------
10245.123 | Have | Great
302044.31 | A | Day
If this is all doable in one SQL string, that would be great. If a couple of steps are required, that's okay as well. I just would like to avoid having to time consumingly do this using loops and RecordSet.Findfirst in VBA.
You can use a correlated subquery:
select q1.*,
(select top 1 q2.col2
from qry2 as q2
order by abs(q2.col1 - q1.col1), q2.col2
) as qry2_col2
from qry1 as q1;

How should I go about implementing an "autonumber" field in SQL Server 2005?

I'm aware of IDENTITY fields but I have a feeling that I couldn't use one to solve my problem.
Let's say I have multiple clients. Each client has multiple orders. Each client needs to have their orders numbered sequentially, specific to them.
Example table structure:
Orders:
OrderID | ClientID | ClientOrderID | etc...
Some example rows for this table would be:
OrderID | ClientID | ClientOrderID | etc...
1 | 1 | 1 | ...
2 | 1 | 2 | ...
3 | 2 | 1 | ...
4 | 3 | 1 | ...
5 | 1 | 3 | ...
6 | 2 | 2 | ...
I know the naive way would be to take the MAX ClientOrderID for any client and use that value for INSERTs but that would be subject to concurrency issues. I was considering using a transaction but I'm not quite sure what the broadest isolation scope that can be used for this. I'll be using LINQ to SQL but I have feeling that isn't relevant.
Somebody correct me if I'm wrong, but as long as your MAX() call is in the same step as your insert, you won't have a problem with concurrency.
So, you could not do
select #newOrderID=max(ClientOrderID) + 1
from orders
where clientid=#myClientID;
insert into ( ClientID, ClientOrderID, ...)
values( #myClientID, #newOrderID, ...);
But you can do
insert into ( ClientID, ClientOrderID, ...)
select #myClientID, max(ClientOrderID) + 1, ...
from orders
where clientid=#myClientID;
I'm assuming OrderID is an identity column.
Again, if I'm incorrect on this, please let me know. Preferably with a URL
You could use a Repository pattern to handle your Orders and let it control the number of each specific clients order number. If you implement the OrderRepository correctly it could control the concurrency and number the order before saving it to the database (let the repository and not the db set the number).
Repository pattern: http://martinfowler.com/eaaCatalog/repository.html
One possibility (though I don't like to do this) is to have a lookup table that would tell you the greatest Order Number given for each vendor. Inside of a transaction, you'd fetch the most recent one from VendorOrderNumber, save your new order, increment the value in VendorOrderNumber, commit transaction.
This is an odd way to store data, but assuming you need it, there is nothing built-in that you can use.
Your suggestion of Max(ClientOrderID) is straight forward and pretty easy to implement (follow John MacIntyre's advice). It will probably work acceptably well on tables with a few thousand orders. As the table grows this approach will of course slow down.
Nick DeVore's suggestion of a lookup table is a little messier to implement but won't substantially be affected by data growth.
Depending on where/when you actually need the ClientOrderID, you could calculate the id when needed like this:
SELECT *,
ROW_NUMBER() OVER(ORDER BY OrderID) AS ClientOrderID
FROM Orders
WHERE ClientID = 1
This assumes that the ClientOrderIDs are in the same sequence as the OrderID. Without actually persisting the ID, it is awkward to use as a key to anything else. This approach should not be affected by data growth.