MariaDB - embed function to automatically sum columns and store result? - automation

it is possible to store a function IN the table to automatically sum a group of columns and store the result in a final column?
ie:
+----+------------+-----------+-------------+------------+
| id | appleCount | pearCount | bananaCount | totalFruit |
+----+------------+-----------+-------------+------------+
| 1 | 300 | 60 | 120 | 480 |
+----+------------+-----------+-------------+------------+
where the column totalFruit is automatically calculated from the previous three columns and updated as the other columns update. in this specific application, there is ONLY going to be the one row. it would be spanky-handy to be able to just push the updated counts and then pull the calculated total out. i seem to recall reading about this ability somewhere, but for the life of me, i can't recall where... :poop:
if there is not way to do this, that's cool. but if there is... :smile:
TIA!
WR!

Yes, it is possible. But is it worth it? It is simple enough to do
SELECT ...
appleCount + pearCount + bananaCount AS totalFruit
...
See MariaDB Generated Columns for how to generate the extra column -- either as a real extra column or "virtual". What version of MariaDB?--There are a number of changes over time.
(MySQL users: 5.7.6 has a similar MySQL Generated Columns.)

Related

Open sums in SQL / dynamic selection of tables

Much ink has been spilled on the topic of sum types in SQL. The standard solutions are called absorption, separation, and partition; see, e.g.: https://www.inf.unibz.it/~montali/teaching/1415/dpm/slides/4.relational-mapping.pdf .
I want to ask about how to encode open sums. Normal sums allow a field to be one of a fixed set of several different types; with open sums, this set is not fixed.
The basic setup in our program: There is a list of "triggers," where each trigger can be one of many different things. Plugins can be written defining new trigger types, although the set of trigger types can be assumed to be known at compile time.
We want a table of all triggers.
Our current best idea:
Dynamically create a materialized view of the following form:
id | id_in_plugin_table | thing_in_main_program_it_refs | plugin_name
---------------------------------------------------------------------
1 | 27 | 8 | RegexTrigger
2 | 27 | 12 | RidiculouslyUnsafeCustomJSTrigger
This relation is automatically generated from the various plugin tables, each of which have their own ID and a thing_in_main_program_it_refs field.
For illustration, here's what the referenced tables may look like.
RegexTrigger table:
id | thing_in_main_program_it_refs | regex
---------------------------------------------------------------------
27 | 8 | hel*o
RidiculouslyUnsafeCustomJSTrigger
id | thing_in_main_program_it_refs | custom_js
---------------------------------------------------------------------
27 | 12 | (x) => isPrime(x.length())
Either use two roundtrips to lookup the plugin table and then query it, or combine them into a single SQL program which uses EXEC.
I'm happy with part 1, but not with part 2. Neither option sounds efficient, and the latter option uses EXEC.
So, we're looking for either (a) a better way to dynamically select a table in a query, or (b) a different approach to open sums.

Selecting limited results from two tables

I apologise if this has been asked before. I'm still not certain how to phrase my question for the title, so wasn't sure what to search for.
I have a hundred or so databases in the same instance, one for each of my customers, named for the customer, and they all have the same structure. I want to select a single result set that includes the database name along with the most recent date entry in one of the tables. I can pull the database names from sys.databases, but then for each database I want to select the most recent date from Events.Date_Logged so that my result set looks something like this:
_______________________________
| | |
|Cust_Name |Latest_Event |
|_______________|_______________|
| | |
|Customer1 |01/02/2020 |
|_______________|_______________|
| | |
|Customer2 |02/02/2020 |
|_______________|_______________|
| | |
|Customer3 |03/02/2020 |
|_______________|_______________|
I'm really struggling with the syntax though. I either get just a single row returned or every single event for each customer. I think my joins are as rusty as hell.
Any help would be appreciated.
What I suggest you do:
Declare a result variable (of type table)
Use a cursor to go over every database
Inside the cursor: do a select top 1 ... order by date desc to get the most recent record. Save this result in the result variable.
After the cursor print the result variable.
That should do the trick.

SELECT MAX values for duplicate values in another column

I am having some trouble finding an answer for this one, so I apologize if it was somewhere else.
I have a table 'dbo.MileageImport' that has the following layout which I pulled to find duplicate entries:
|KEY | DATA |
---------------------
|V9864653 | 180288 |
|V9864653 | 22189 |
|V9864811 | 11464 |
|V9864811 | 12688 |
What I am having troubles with is when I run the following SQL in a DB2 environment:
SELECT KEY, MIN(DATA)
FROM dbo.MileageImport
GROUP BY KEY
HAVING (COUNT(KEY)>1);
It ends up pulling the following data:
|KEY | DATA |
---------------------
|V9864811 | 11464 |
|V9864653 | 180288 |
For some reason it's pulling the MIN value for V9864811, but not V9864653. If I inverse that and put MAX instead of MIN, it pulls the opposite values.
Is there something I am missing here so I can pull the MIN DATA value for only duplicate KEY records, or is there another way to do this? The report where this data comes from changes from month to month, so there could be different keys that end up being duplicated that I need to correct. Ultimately I am turning this into a DELETE statement to delete the lower of the two (or more) duplicated mileage entries.
Is your DATA column numerical? or a VARCHAR?
If you find its better to change it to a number if you can, maybe an integer if you aren't having any fractions and its just round numbers.
if not, then you could cast them to an integer value, but if there are lots of transactions or its a big table it will be slow and not ideal. Its bad practise to do that if you could just change the datatype!
SELECT KEY, MIN(CAST(DATA as Int))
FROM dbo.MileageImport
GROUP BY KEY
HAVING (COUNT(KEY)>1)

Storing a COUNT of values in a table

I have a table with data along the (massively simplified) lines of:
User | Value
-----|------
UsrA | 100
UsrA | 102
UsrB | 100
UsrA | 100
UsrB | 101
and, for reasons far to obscure to go into, I need to store the COUNT of each value in a table for future retrieval - ending up with something like
User | Value100Count | Value101Count | Value102Count
-----|---------------|---------------|--------------
UsrA | 2 | 0 | 1
UsrB | 1 | 1 | 0
However, there could be up to 255 different Values - meaning potentially 255 different ValueXCount columns. I know this is a horrible way to do things, but is there an easy way to get the data into a format that can be easily INSERTed into the destination table? Is there a better way to store the COUNT of values per user (unfortunately I do need to store this information; grabbing it from the source table each time isn't an option)?
The whole thing isn't very pretty, but you know that, rather than your table with 255 columns I'd consider setting up another table with:
User | Value | CountOfValue
And set a primary key over User and Value.
You could then insert the count's for given user/value combos into the CountOfValue field
As I said, the design is horrible and it feels like you would be better off starting from scratch, normalizing and doing counts live.
Check out indexed views. You can maintain the table automatically, with integrity and as a bonus it can get used in queries that already do count(*) on that data.

How should I go about implementing an "autonumber" field in SQL Server 2005?

I'm aware of IDENTITY fields but I have a feeling that I couldn't use one to solve my problem.
Let's say I have multiple clients. Each client has multiple orders. Each client needs to have their orders numbered sequentially, specific to them.
Example table structure:
Orders:
OrderID | ClientID | ClientOrderID | etc...
Some example rows for this table would be:
OrderID | ClientID | ClientOrderID | etc...
1 | 1 | 1 | ...
2 | 1 | 2 | ...
3 | 2 | 1 | ...
4 | 3 | 1 | ...
5 | 1 | 3 | ...
6 | 2 | 2 | ...
I know the naive way would be to take the MAX ClientOrderID for any client and use that value for INSERTs but that would be subject to concurrency issues. I was considering using a transaction but I'm not quite sure what the broadest isolation scope that can be used for this. I'll be using LINQ to SQL but I have feeling that isn't relevant.
Somebody correct me if I'm wrong, but as long as your MAX() call is in the same step as your insert, you won't have a problem with concurrency.
So, you could not do
select #newOrderID=max(ClientOrderID) + 1
from orders
where clientid=#myClientID;
insert into ( ClientID, ClientOrderID, ...)
values( #myClientID, #newOrderID, ...);
But you can do
insert into ( ClientID, ClientOrderID, ...)
select #myClientID, max(ClientOrderID) + 1, ...
from orders
where clientid=#myClientID;
I'm assuming OrderID is an identity column.
Again, if I'm incorrect on this, please let me know. Preferably with a URL
You could use a Repository pattern to handle your Orders and let it control the number of each specific clients order number. If you implement the OrderRepository correctly it could control the concurrency and number the order before saving it to the database (let the repository and not the db set the number).
Repository pattern: http://martinfowler.com/eaaCatalog/repository.html
One possibility (though I don't like to do this) is to have a lookup table that would tell you the greatest Order Number given for each vendor. Inside of a transaction, you'd fetch the most recent one from VendorOrderNumber, save your new order, increment the value in VendorOrderNumber, commit transaction.
This is an odd way to store data, but assuming you need it, there is nothing built-in that you can use.
Your suggestion of Max(ClientOrderID) is straight forward and pretty easy to implement (follow John MacIntyre's advice). It will probably work acceptably well on tables with a few thousand orders. As the table grows this approach will of course slow down.
Nick DeVore's suggestion of a lookup table is a little messier to implement but won't substantially be affected by data growth.
Depending on where/when you actually need the ClientOrderID, you could calculate the id when needed like this:
SELECT *,
ROW_NUMBER() OVER(ORDER BY OrderID) AS ClientOrderID
FROM Orders
WHERE ClientID = 1
This assumes that the ClientOrderIDs are in the same sequence as the OrderID. Without actually persisting the ID, it is awkward to use as a key to anything else. This approach should not be affected by data growth.