Further calculations from calculated fields in libreoffice base - hsqldb

Working on LibreOffice 4.4.5.2 / HSQLDB 1.8.0.10
Calculated fields are quite easy, e.g. "Field Name1" + "Field Name2" in a third field gives a simple sum.
In my database I think I need further calculations from calculated fields within the same query.
These two sql statements are in the same query:
BuyPrice
S/H Paid
TaxPaid
"BuyPrice" + "S/H Paid" + "TaxPaid"
When the query runs, this outputs to a field with an alias of Total Cost
SellPrice
S/H Charged
"SellPrice" + "S/H Charged" - ( "SellPrice" * 0.132 + "S/H Charged" * 0.132 )
This outputs to a field with an alias of NET
This is exactly what I need; however, I also need a third calculated field for Profit. I can't just enter "NET" - "Total Cost". If I create another query on top of the first one, I can reference the aliases and it works just fine, but I can only get this into two separate "Table Controls".
Should these possibly be separate queries?
I simply don't know enough about any of this to get it to work. Any help or suggestions would be greatly appreciated.

As you have found, you cannot reference a column alias from another expression in the same query. You have to write out the entire calculation
"SellPrice" + "S/H Charged" - ( "SellPrice" * 0.132 + "S/H Charged" * 0.132 ) - ("BuyPrice" + "S/H Paid" + "TaxPaid")
and alias it to a column PROFIT. If you write two queries, the program internally writes out all the code like this when it runs. Two queries add a bit of overhead because the queries have to be combined. The overhead may be worth it if it makes the queries easier for you to maintain.
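The two-level idea can also be written as one statement with a derived table, so all three calculated columns come back in a single result set. The following is only a sketch: it runs against SQLite through Python's sqlite3 module as a stand-in for HSQLDB, and the table name "Sales", the subquery alias priced, and the sample row are assumptions made for illustration. Whether Base's embedded HSQLDB 1.8 accepts exactly this syntax has not been checked here.

```python
import sqlite3

# In-memory SQLite database standing in for the Base/HSQLDB file.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Sales" ('  # hypothetical table name
    '"BuyPrice" REAL, "S/H Paid" REAL, "TaxPaid" REAL, '
    '"SellPrice" REAL, "S/H Charged" REAL)'
)
# Made-up sample row, for illustration only.
conn.execute('INSERT INTO "Sales" VALUES (10.0, 2.0, 1.0, 20.0, 5.0)')

# The inner SELECT defines the two aliases; the outer SELECT may then
# refer to them by name to compute Profit.
profit_sql = '''
SELECT "Total Cost", "NET", "NET" - "Total Cost" AS "Profit"
FROM (
    SELECT
        "BuyPrice" + "S/H Paid" + "TaxPaid" AS "Total Cost",
        "SellPrice" + "S/H Charged"
            - ("SellPrice" * 0.132 + "S/H Charged" * 0.132) AS "NET"
    FROM "Sales"
) AS priced
'''
total_cost, net, profit = conn.execute(profit_sql).fetchone()
print(total_cost, net, profit)
```

The outer query adds nothing but the subtraction, so each formula is still written only once.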


Add a number value to column in SQL query using SELECT method

I am working on adding a query that calculates tuition costs. It should do this by using the Tuition table, which only includes the FullTimeCost (a static number for the student fees) and the PerUnitCost (the cost per credit hour).
I am trying to use a SELECT to return 3 more columns, 1 constant value of 12 called units, and 2 more that calculate the rest based on simple math.
The problem I am having is that I cannot seem to make the column Units have a default value of 12.
This is my code. The issue I am having is that, with this approach, the later formulas do not recognize the columns created in the previous lines.
All I need is for the 3rd Line to recognize Units so it can multiply by 12 as intended. Also this is for school, so a comment saying just change it to 12 is not useful.
SELECT
FullTimeCost, PerUnitCost,
12 AS Units,
PerUnitCost * Units AS TotalPerUnitCost,
FullTimeCost + TotalPerUnitCost AS TotalTuition
FROM
Tuition
You cannot re-use a column alias in the select. However, SQL Server gives you a convenient way to define the alias in the from clause, so you can use it:
SELECT t.FullTimeCost, t.PerUnitCost, v.Units,
v2.TotalPerUnitCost,
(t.FullTimeCost + v2.TotalPerUnitCost) AS TotalTuition
FROM Tuition t CROSS APPLY
(VALUES (12)) v(units) CROSS APPLY
(VALUES (t.PerUnitCost * v.Units)) v2(TotalPerUnitCost);
Use a CTE to "add" your constant as a column and then apply the calculation. Without context, a variable would also be just as simple and useful.
with cte as (select FullTimeCost, PerUnitCost, 12 as Units
from dbo.Tuition
)
SELECT
FullTimeCost, PerUnitCost,
Units,
PerUnitCost * Units AS TotalPerUnitCost,
FullTimeCost + PerUnitCost * Units AS TotalTuition
FROM cte
order by ...;
There are, of course, other ways to accomplish this. Not certain what your coursework has covered but I assume that recent topics should have provided techniques to do this.
Using apply as shown in Gordon's answer is the most elegant solution; another way, also noted in the comments, is to use a derived table.
As you have no doubt gathered, the problem is that during query compilation the optimizer does not "see" the calculated column aliases, as it can (generally) only access columns available from tables in the from clause or, as shown by Gordon, from an apply().
What you can also do is use a derived table, by first selecting the columns you need from your table and also adding your additional columns.
You then wrap this in parentheses; it is now a derived table, i.e. the result of the parenthesized query is itself a table available to an outer select.
You then use this as the source for an outer select, which has visibility of any additional columns you have added.
A complication with your query is that you want to add a constant value Units and then reference it, and also reference a second calculated column that makes use of Units.
I would simply use a single derived table to calculate the TotalPerUnitCost, you don't need Units since it's used only once.
select
FullTimeCost, PerUnitCost, TotalPerUnitCost,
FullTimeCost + TotalPerUnitCost as TotalTuition
from (
select FullTimeCost, PerUnitCost, PerUnitCost * 12 as TotalPerUnitCost
from Tuition
)t
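If Units has to stay a real column that later expressions refer to, each alias can be introduced one level below the place where it is used. The sketch below chains two CTEs for that purpose. It runs on SQLite through Python's sqlite3 module rather than SQL Server, and the CTE names and the sample costs (500 and 100) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tuition (FullTimeCost REAL, PerUnitCost REAL)")
# Made-up sample costs, for illustration only.
conn.execute("INSERT INTO Tuition VALUES (500.0, 100.0)")

# Each level only refers to names defined one level below it.
tuition_sql = """
WITH with_units AS (
    SELECT FullTimeCost, PerUnitCost, 12 AS Units
    FROM Tuition
),
with_unit_total AS (
    SELECT FullTimeCost, PerUnitCost, Units,
           PerUnitCost * Units AS TotalPerUnitCost
    FROM with_units
)
SELECT FullTimeCost, PerUnitCost, Units, TotalPerUnitCost,
       FullTimeCost + TotalPerUnitCost AS TotalTuition
FROM with_unit_total
"""
row = conn.execute(tuition_sql).fetchone()
print(row)
```

The constant 12 appears exactly once, so changing the number of units means editing a single line.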

How to combine calculated fields with the same foreign key?

The title might not be entirely accurate to my problem but I couldn't think of how to word it.
I'm using a view to calculate the adjusted unit price of an item after accounting for initial setup costs, so the calculated field looks something like this:
"SetupFee" + "Shipping" + ( "UnitPrice" * "Quantity" ) / "Quantity"
The table these fields are from is called ItemInvoice and has a foreign key to a table called just Item, which contains item-specific information such as the description. The problem I'm having is that this method will give me two different outputs when there are multiple invoices for an item. For instance, if I have one invoice for item 1 and two for item 2 it shows me:
ItemName - (Calculation)
Item1 - 45.11
Item2 - 60.30
Item2 - 50.67
I'm really rusty on my SQL and databases in general (and am using LibreOffice Base for the first time) and am wondering how I would combine the costs for both invoices and divide them both by the total quantity. So something like:
("SetupFee1" + "Shipping1" + ( "UnitPrice1" * "Quantity1" ) + "SetupFee2" + "Shipping2" + ( "UnitPrice2" * "Quantity2" )) / ("Quantity1" + "Quantity2")
...or just averaging the two separate results, though I'm not sure how that would change the decimal values.
So far I've made this view entirely in LibreOffice Base's Design View, but I'm not averse to using SQL if I know what I'm going to be doing.
EDIT: LibreOffice 7.0 and embedded HSQLDB database.
Okay so I think I've figured it out, though I feel like there should maybe be a simpler way of doing it.
Definitely had to delve into SQL statements on this one, but this seems to do what I want:
SELECT "Item Name", AVG("Adjusted Unit Price") AS "Average Adjusted Unit Price"
FROM (
SELECT "Item"."ItemName" AS "Item Name", ( "SetupFee" + "Shipping" + ( "UnitPrice" * "Quantity" ) ) / "Quantity" AS "Adjusted Unit Price"
FROM "ItemInvoice", "Item" WHERE "ItemInvoice"."ItemID" = "Item"."ItemID"
)
GROUP BY "Item Name"
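Note that averaging the per-invoice prices is not the same as the first formula in the question (combined cost divided by combined quantity) when the invoices have different quantities. A sketch of the combined version follows, using SUM on both the cost and the quantity. It runs on SQLite through Python's sqlite3 module as a stand-in for HSQLDB, and the sample invoices are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Item" ("ItemID" INTEGER PRIMARY KEY, "ItemName" TEXT);
CREATE TABLE "ItemInvoice" (
    "ItemID" INTEGER, "SetupFee" REAL, "Shipping" REAL,
    "UnitPrice" REAL, "Quantity" INTEGER);
-- Made-up sample data: one invoice for Item1, two for Item2.
INSERT INTO "Item" VALUES (1, 'Item1'), (2, 'Item2');
INSERT INTO "ItemInvoice" VALUES
    (1, 10.0, 5.0, 2.0, 10),
    (2, 20.0, 10.0, 3.0, 10),
    (2, 10.0, 10.0, 3.0, 30);
""")

# Total cost over all invoices of an item, divided by total quantity.
combined_sql = '''
SELECT "Item"."ItemName",
       SUM("SetupFee" + "Shipping" + "UnitPrice" * "Quantity")
           / SUM("Quantity") AS "Adjusted Unit Price"
FROM "ItemInvoice"
JOIN "Item" ON "ItemInvoice"."ItemID" = "Item"."ItemID"
GROUP BY "Item"."ItemName"
ORDER BY "Item"."ItemName"
'''
rows = conn.execute(combined_sql).fetchall()
print(rows)
```

With this sample data the two Item2 invoices cost 60 and 110 for 10 and 30 units, so the combined price is 170 / 40 = 4.25, whereas the plain average of the two per-invoice prices (6.00 and about 3.67) would be about 4.83.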

SQL Select/From/Where Run Speed

I have a program that is pulling data from a Visual FoxPro table and dumping it into a Dataset with VB.net. My connection string works great, and the query I'm using usually runs with respectable speed. As I've run it more, however, I've learned that there is a large amount of "bad" data in my table. So now I'm trying to refine my query to buffer against the "bad" data, but what I thought would be a very small tweak has yielded massive performance losses, and I'm not particularly sure why.
My original query is:
'Pull desired columns for orders that have not "shipped" and were received in past 60 days.
'To "ship", an order must qualify with both an updated ship date and Sales Order #.
sqlSelect = "SELECT job_id,cust_id,total_sale,received,due,end_qty,job_descr,shipped,so "
sqlFrom = "FROM job "
sqlWhere = "WHERE fac = 'North Side' AND shipped < {12/30/1899} AND so = '' AND received >= DATE()-60;"
sql = sqlSelect & sqlFrom & sqlWhere
This has a run-time of about 20 seconds; while I'd prefer it to be quicker, it's not a problem. In my original testing (and occasional debugging), I replaced sqlWhere with sqlWhere = "WHERE job_id = 127350". This runs pretty much instantaneously.
Now the problem block: Once I replaced sqlWhere with
'Find jobs that haven't "shipped" OR were received within last 21 days.
'Recently shipped items are desired in results.
sqlWhere = "WHERE fac = 'North Side' AND ((shipped < {12/30/1899} AND so = '') OR received >= DATE()-21);"
My performance jumped to about 3 min 40 sec. This time is almost exactly the same as the time to run with sqlWhere = "WHERE received >= DATE();".
I'm not the moderator of these tables; I'm merely pulling from them to create a series of reports for our users. My best guess is that the received field is not indexed and that this is the cause of my performance drop-off. But while my first search returns about 100 records, pulling the jobs only from today returns about 5, and still takes about 11x as long.
So my question is three part:
1) Would someone be able to explain the phenomenon I'm experiencing right now? I feel like I'm somewhat on the right track, but my knowledge of SQL has been limited to circumstantial use within other languages...
2) Is there something I'm missing, or some better way to obtain the results I need? There are a large volume of records that haven't "shipped", but simply because the user only input a shipped date or s/o, and didn't do the other. I need a way to view very recent orders (regardless of "shipped" status), and then also view less recent orders that have "bad" data, so I can get the user in the habit of cleaning up the data.
3) Is it bad SQL practice to overconstrain a WHERE clause? If I run fifteen field comparisons, joined together with nested ANDs/ORs, am I wasting my time when I could be doing something much cleaner?
Many thanks,
B
If your WHERE clause filters on a non-indexed column, the SQL engine must do a table scan, i.e. look at every record in the table.
The difference between the two queries is having the OR instead of the AND. When you have a non-indexed column in an AND, the SQL engine can use the indexes on the other columns to narrow down the number of records it has to check against the non-indexed column. When you have an OR, it must look at every record in the table and compare on that column.
Adding an index on the Received column would probably fix the performance issue.
In general, there are two things you don't want to have happen in your WHERE clause.
1. A primary condition on a non-indexed column
2. Using a calculation on a column. For example, doing WHERE Shipped-2 < date() is often worse than doing Shipped < Date() + 2, because the former doesn't typically allow the index to be used.
Refining your query through multiple WHERE conditions is generally a good thing. The fewer records you need to return to your application, the better your performance will be, but you need to have appropriate indexing in place.
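The second point can be checked by asking the engine for its plan. The sketch below does that with SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module. This is a stand-in only: Visual FoxPro has its own optimizer and its own tools for inspecting it, and the simplified job table, the index name, and the plan helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the job table, with dates stored as ISO text.
conn.execute("CREATE TABLE job (job_id INTEGER PRIMARY KEY, received TEXT)")
conn.execute("CREATE INDEX idx_job_received ON job (received)")

def plan(sql):
    """Return the planner's description of each step of the query."""
    # The description text is the fourth column of each plan row.
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Bare column compared with a computed constant: the index can be searched.
bare_column = plan(
    "SELECT job_id FROM job WHERE received >= date('now', '-21 days')"
)
# Column wrapped in a calculation: every row has to be evaluated.
wrapped_column = plan(
    "SELECT job_id FROM job WHERE date(received, '+21 days') >= date('now')"
)
print(bare_column)
print(wrapped_column)
```

The exact wording of the plan text differs between SQLite versions, but the first query should be reported as a SEARCH on the index and the second as a SCAN.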

Repeating operations vs multilevel queries

I have always wondered how I should approach queries like these and which solution is better. I guess the sample code will explain it better.
Let's imagine we have a table that has 3 columns:
(int)Id
(nvarchar)Name
(int)Value
I want to get the basic columns plus a number of calculations on the Value column, with each calculation based on a previous one. In other words, something like this:
SELECT
*,
Value + 10 AS NewValue1,
Value / NewValue1 AS SomeOtherValue,
(Value + NewValue1 + SomeOtherValue) / 10 AS YetAnotherValue
FROM
MyTable
WHERE
Name LIKE 'A%'
Obviously this will not work. NewValue1, SomeOtherValue and YetAnotherValue are on the same level in the query so they can't refer to each other in the calculations.
I know of two ways to write queries that will give me the desired result. The first one involves repeating the calculations.
SELECT
*,
Value + 10 AS NewValue1,
Value / (Value + 10) AS SomeOtherValue,
(Value + (Value + 10) + (Value / (Value + 10))) / 10 AS YetAnotherValue
FROM
MyTable
WHERE
Name LIKE 'A%'
The other one involves constructing a multilevel query like this:
SELECT
t2.*,
(t2.Value + t2.NewValue1 + t2.SomeOtherValue) / 10 AS YetAnotherValue
FROM
(
SELECT
t1.*,
t1.Value / t1.NewValue1 AS SomeOtherValue
FROM
(
SELECT
*,
Value + 10 AS NewValue1
FROM
MyTable
WHERE
Name LIKE 'A%'
) t1
) t2
But which one is the right way to approach the problem or simply "better"?
P.S. Yes, I know that "better" or even "good" solution isn't always the same thing in SQL and will depend on many factors.
I have tried a number of different combinations of calculations in both variants. They always produced the same execution plan, so it can be assumed that there is no difference in performance. From the code usability perspective the first approach is obviously better, as the code is more readable and compact.
There is no "right" way to write such queries. SQL Server, as with most databases (MySQL being a notable exception), does not create intermediate tables for each subquery. Instead, it optimizes the query as a whole and often moves all the calculations for the expressions into a single processing node.
The reason that column aliases cannot be re-used at the same level goes back to the ANSI standard definition. In particular, nothing in the standard specifies the order of evaluation for the individual expressions. Without knowing the order, SQL cannot guarantee that an alias is defined before it is evaluated.
I often write multi-level queries -- either using subqueries or CTEs -- to make queries more readable and more maintainable. But then again, I will also copy logic from one variable to the other because it is expedient. In my opinion, this is something that the writer of the query needs to decide on, taking into account whether the query is part of the code for a system that needs to be maintained, local coding standards, whether the query is likely to be modified, and similar considerations.
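To illustrate that the two styles are interchangeable as far as results go, here is a small sketch that runs both against the same data and compares the rows. It uses SQLite through Python's sqlite3 module rather than SQL Server, writes the multilevel variant with CTEs instead of nested subqueries, and the sample rows are made up for illustration. It says nothing about execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Id INTEGER, Name TEXT, Value REAL)")
# Made-up sample rows; only the names starting with 'A' pass the filter.
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, "Alpha", 10.0), (2, "Beta", 20.0), (3, "Avery", 30.0)],
)

# Variant 1: every calculation is written out in full.
repeated_sql = """
SELECT Id,
       Value + 10 AS NewValue1,
       Value / (Value + 10) AS SomeOtherValue,
       (Value + (Value + 10) + (Value / (Value + 10))) / 10 AS YetAnotherValue
FROM MyTable
WHERE Name LIKE 'A%'
ORDER BY Id
"""

# Variant 2: each level names one new column that the next level reuses.
layered_sql = """
WITH t1 AS (
    SELECT Id, Value, Value + 10 AS NewValue1
    FROM MyTable
    WHERE Name LIKE 'A%'
),
t2 AS (
    SELECT t1.*, Value / NewValue1 AS SomeOtherValue
    FROM t1
)
SELECT Id, NewValue1, SomeOtherValue,
       (Value + NewValue1 + SomeOtherValue) / 10 AS YetAnotherValue
FROM t2
ORDER BY Id
"""

repeated_rows = conn.execute(repeated_sql).fetchall()
layered_rows = conn.execute(layered_sql).fetchall()
print(repeated_rows == layered_rows)
```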

Ms Access : Query to work out percentage

I have a database which currently records the number of times someone does a certain procedure and the scores they have received. The scoring is done by selecting a value of either N, B or C.
I have currently written a query which counts the total number of times a procedure is done and the number of times each score is received.
Here is the result of the query (original: http://www.flickr.com/photos/mattcripps/6673555339/)
and here is the code
TRANSFORM Count(ed.[Entry ID]) AS [CountOfEntry ID]
SELECT ap.AdultProcedureName, ap.Target, Count(ed.[Entry ID]) AS [Total Of Entry ID]
FROM tblAdultProcedures AS ap LEFT JOIN tblEntryData AS ed ON ap.AdultProcedureName = ed.[Adult Procedure]
GROUP BY ap.AdultProcedureName, ap.Target
PIVOT ed.Grade;
If a score of N or B is given, that is deemed below standard, and C is deemed at standard. Is there a way I can add something to my query which will show me, as a percentage, how many of the procedures were at standard and how many below?
I really can't get my head round this, so any help would be great.
Thanks in advance
UPDATE TabProd
SET PrecProd = (PrecProd * 1.1)
WHERE Código IN (1,2,3,4)
I did something very similar to this on a pretty large scale.
My issue was the need to be able to run queries over specific (but user variable) timeframes and output similar percentage of total results in a report.
I won't get into the date issue, but my solution was to run the "sum" function on the total line for my specific reject criteria to get the reject totals, then use a divide expression to create a new column (a defined expression) in the same query, pulling from the joined table of "Total net production", joined by a common reference: job ID.
For your case it sounds like you want to sum the two failure types. You would simply add defined expressions that divide each failure count by your total instances, and format them as percents in your output report. To finish the data portion of your report you then need a third expression defining your "non-fail percent", which would be 1.0 - N/total - B/total; both ratios will already have been defined in the query as the N and B failure rates.
Then it's a matter of pulling that information into your report and formatting it. It definitely CAN be done.
Hope this helps.
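As a rough illustration of the divide expressions described above, the sketch below counts the below-standard (N or B) and at-standard (C) grades per procedure and divides each by the total. It runs on SQLite through Python's sqlite3 module, not Access. The table entries, its column names, and the sample rows are assumptions made for illustration, and Access SQL has no CASE expression, so there the conditions would be written with IIf instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for tblEntryData.
conn.execute("CREATE TABLE entries (procedure_name TEXT, grade TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [
        ("Cannulation", "N"), ("Cannulation", "B"),
        ("Cannulation", "C"), ("Cannulation", "C"),
        ("Suturing", "C"),
    ],
)

# Count each outcome with a conditional sum, then divide by the total.
percent_sql = """
SELECT procedure_name,
       COUNT(*) AS total,
       100.0 * SUM(CASE WHEN grade IN ('N', 'B') THEN 1 ELSE 0 END)
           / COUNT(*) AS pct_below_standard,
       100.0 * SUM(CASE WHEN grade = 'C' THEN 1 ELSE 0 END)
           / COUNT(*) AS pct_at_standard
FROM entries
GROUP BY procedure_name
ORDER BY procedure_name
"""
percent_rows = conn.execute(percent_sql).fetchall()
for percent_row in percent_rows:
    print(percent_row)
```

Multiplying by 100.0 before dividing keeps the arithmetic in floating point, so the percentages are not truncated to whole numbers.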