SQL: getting a percentage from a column of quantities

I have a set of shelves; some are being used and some are not. I want to get the percentage of shelves that are in use (I am calling this from JavaScript via AJAX). What is a good way to do this, and could you please provide an example?
This is the query I have so far, which gets the nulls and sets the quantity to 0:
SELECT warehouse_locations.location, ISNULL(product_stock_warehouse.quantity, 0) as quantity
FROM product_stock_warehouse
RIGHT JOIN warehouse_locations ON product_stock_warehouse.location = warehouse_locations.location
WHERE warehouse_locations.location LIKE 'A21%'
ORDER BY product_stock_warehouse.quantity
If a shelf's quantity is not 0, it is "full" and therefore counts towards the percentage in use.
I am using MS SQL Server.

I think you need aggregation. The idea is something like this:
SELECT wl.location, COALESCE(SUM(psw.quantity), 0) as total_quantity,
(CASE WHEN COALESCE(SUM(psw.quantity), 0) = 0 THEN 'EMPTY' ELSE 'USED' END) as status
FROM warehouse_locations wl LEFT JOIN
product_stock_warehouse psw
ON psw.location = wl.location
WHERE wl.location LIKE 'A21%'
GROUP BY wl.location
ORDER BY total_quantity;
Some notes:
LEFT JOIN is much easier to follow than RIGHT JOIN. It means "keep all rows in the first table" rather than "keep all rows in some table later in the FROM clause that I haven't seen yet".
Table aliases make a query easier to write and to read.
Use GROUP BY to get one row per "shelf", which I assume is the same as a "location".
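The question also asks for the overall percentage of shelves in use. A minimal sketch of that calculation, built on the same join (not tested against your schema; COUNT(DISTINCT ...) guards against a location having several stock rows, and the 100.0 keeps the division from being truncated to an integer):
SELECT 100.0 * COUNT(DISTINCT CASE WHEN COALESCE(psw.quantity, 0) > 0 THEN wl.location END)
       / COUNT(DISTINCT wl.location) AS pct_used
FROM warehouse_locations wl LEFT JOIN
     product_stock_warehouse psw
     ON psw.location = wl.location
WHERE wl.location LIKE 'A21%';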

Related

Altering an existing SQL statement to add an additional column of data without affecting the existing results or performance - what is the best approach?

In this query I want to add a new column that gives the SUM of a.VolumetricCharge, but only where PremiseProviderBillings.BillingCategory = 'Water'. I don't want to add that condition in the obvious place (the WHERE clause), since that would limit the rows returned; I only want it applied to the new column.
SELECT b.customerbillid,
-- Here i need SUM(a.VolumetricCharge) but where a.BillingCategory is equal to 'Water'
Sum(a.volumetriccharge) AS Volumetric,
Sum(a.fixedcharge) AS Fixed,
Sum(a.vat) AS VAT,
Sum(a.discount) + Sum(deferral) AS Discount,
Sum(Isnull(a.estimatedconsumption, 0)) AS Consumption,
Count_big(*) AS Records
FROM dbo.premiseproviderbillings AS a WITH (nolock)
LEFT JOIN dbo.premiseproviderbills AS b WITH (nolock)
ON a.premiseproviderbillid = b.premiseproviderbillid
-- Cannot add a where here since that would limit the results and change the output
GROUP BY b.customerbillid;
Bit of a tricky one, as what you're asking for will definitely affect performance (you're asking SQL Server to do more work, after all!).
However, we can add a column to your results which performs a conditional sum so that it does not affect the result of the other columns.
The answer lies in using a CASE expression!
Sum(
    CASE
        WHEN a.billingcategory = 'Water' THEN a.volumetriccharge
        ELSE 0
    END
) AS WaterVolumetric
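Slotted into your original query (nothing else changes, only the extra column is added), that looks roughly like this:
SELECT b.customerbillid,
       Sum(CASE WHEN a.billingcategory = 'Water' THEN a.volumetriccharge ELSE 0 END) AS WaterVolumetric,
       Sum(a.volumetriccharge) AS Volumetric,
       Sum(a.fixedcharge) AS Fixed,
       Sum(a.vat) AS VAT,
       Sum(a.discount) + Sum(deferral) AS Discount,
       Sum(Isnull(a.estimatedconsumption, 0)) AS Consumption,
       Count_big(*) AS Records
FROM dbo.premiseproviderbillings AS a WITH (nolock)
       LEFT JOIN dbo.premiseproviderbills AS b WITH (nolock)
              ON a.premiseproviderbillid = b.premiseproviderbillid
GROUP BY b.customerbillid;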

My Joins in query not pulling through correctly

Good evening. Could someone please help me with the following? I am trying to join two tables. The first is wbr_global.gl_ap_details, which stores historic GL information. The second table, sandbox.utr_fixed_mapping, is where account mapping is stored. For example, account number 60820 is mapped as Employee Relation. The first table needs the mapping from the second table linked on the account number. The output I am getting is not right and is way too big. Any help would be appreciated!
select sandbox.utr_fixed_mapping_na.new_mapping_1,sum(wbr_global.gl_ap_details.amount)
from wbr_global.gl_ap_details
LEFT JOIN sandbox.utr_fixed_mapping_na ON wbr_global.gl_ap_details.account_number = sandbox.utr_fixed_mapping_na.account_number
Where gl_ap_details.cost_center = '1172'
and gl_ap_details.period_name = 'JUL-21'
and gl_ap_details.ledger_name = 'Amazon.com, Inc.'
Group by 1;
I tried adding the cast function but after 5000 seconds of the query running I canceled it.
The query itself looks OK, but a few minor changes would help. Learn to use table aliases; that way you don't have to keep typing the long database.table.column names all over, and the SQL is easier to read anyhow.
Notice the aliases "gl" and "fm" after the tables are declared; those aliases are then used to reference the columns. Easier to read, would you agree?
Added GL Account number as described below the query.
select
gl.account_number,
fm.new_mapping_1,
sum(gl.amount)
from
wbr_global.gl_ap_details gl
LEFT JOIN sandbox.utr_fixed_mapping_na fm
ON gl.account_number = fm.account_number
Where
gl.cost_center = '1172'
and gl.period_name = 'JUL-21'
and gl.ledger_name = 'Amazon.com, Inc.'
Group by
gl.account_number,
fm.new_mapping_1
Now, as for your query returning null: this just means that there are records within the gl_ap_details table with an account number that is not found in the utr_fixed_mapping_na table. So, to see WHAT GL account numbers do NOT exist, I have added the account number to the query. It's possible there are MULTIPLE records in gl_ap_details that are not found in the mapping table. So, you may get
GLAccount Description SumOfAmount
glaccount1 null $someAmount
glaccount37 null $someAmount
glaccount49 null $someAmount
glaccount72 Depreciation $someAmount
glaccount87 Real Estate $someAmount
glaccount92 Building $someAmount
glaccount99 Salaries $someAmount
I obviously made up the GL account numbers just to show the point. You may have multiple unmapped accounts, where the null row's total amount is actually masking how many different GL account numbers were NOT found.
Once you find which are missing, you can check / confirm they SHOULD be in the mapping table.
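If it helps, here is a minimal sketch that lists just the unmapped account numbers (same tables, aliases and filters as the query above; untested):
select distinct gl.account_number
from wbr_global.gl_ap_details gl
LEFT JOIN sandbox.utr_fixed_mapping_na fm
    ON gl.account_number = fm.account_number
Where gl.cost_center = '1172'
    and gl.period_name = 'JUL-21'
    and gl.ledger_name = 'Amazon.com, Inc.'
    and fm.account_number is null;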
FEEDBACK.
Since you are already aware of the missing numbers, let's consider a Cartesian result. If there are multiple entries in the mapping table for the same G/L account number, you will get a Cartesian result, thus bloating your numbers. To clarify, let's say your mapping table has
Mapping file.
GL Descr1 NewMapping
1 test Salaries
1 testView Buildings
1 Another Depreciation
And your GL_AP_Details has
GL Amount
1 $100
Your total for the query would result in $300 because the query is trying to join the AP Details GL #1 to EACH of the entries in the mapping file thus bloating the amount. You could also add a COUNT(*) as NumberOfEntries to the query to see how many transactions it THINKS it is processing. Is there some "unique ID" in the GL_AP_Details table? If so, then you could also do a count of DISTINCT ID values. If they are different (distinct is lower than # of entries), I think THAT is your culprit.
select
fm.new_mapping_1,
sum(gl.amount),
count(*) as NumberOfEntries,
count( distinct gl.UniqueIdField ) as DistinctTransactions
from
wbr_global.gl_ap_details gl
LEFT JOIN sandbox.utr_fixed_mapping_na fm
ON gl.account_number = fm.account_number
Where
gl.cost_center = '1172'
and gl.period_name = 'JUL-21'
and gl.ledger_name = 'Amazon.com, Inc.'
Group by
fm.new_mapping_1
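If DistinctTransactions does come back lower than NumberOfEntries, a quick way to confirm the duplication is to look for account numbers that appear more than once in the mapping table. A sketch (untested, names taken from the join above):
select fm.account_number, count(*) as MappingRows
from sandbox.utr_fixed_mapping_na fm
group by fm.account_number
having count(*) > 1;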
Might you also need to limit the mapping table for a specific prophecy or mec view?
If you "think" that the result of an aggregate is wrong, then the easiest way to verify this is to select the individual rows that correlate to 1 record in the aggregate output and inspect the records, looking for duplications.
For instance, pick 'Building Management':
SELECT fixed.new_mapping_1,details.amount,*
FROM wbr_global.gl_ap_details details
LEFT JOIN sandbox.utr_fixed_mapping_na fixed ON details.account_number = fixed.account_number
WHERE details.cost_center = '1172'
AND details.period_name = 'JUL-21'
AND details.ledger_name = 'Amazon.com, Inc.'
AND fixed.new_mapping_1 = 'Building Management'
Notice that we tack a ,* onto the end of the projection; this shows you everything the query has access to. Look for repeating sections of data that you were not expecting. Then, depending on which table they originate from, you might add additional criteria to the JOIN or to the WHERE, or you might need to group by additional columns.
This type of issue is really hard to comment on in a forum like this, because it is highly specific to your schema and the data contained within it, making any solution dependent on criteria you are not likely to publish online.
Generally, if you think a calculation is wrong, you need to compute it manually to verify it. The advice above helps you inspect the data your query is using; either construct your own query or use other tools to build the data set that lets you compute the correct values by hand, then work them back into (or replace) your original query.
The speed issues are out of scope here; we can comment on the poor schema design, but I suspect you don't have a choice. In the utr_fixed_mapping_na table you should make account_number the same column type as the source data, or add a new column that holds the data in the original type; then you can set up indexes on those columns to improve the speed of the join.
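As a sketch only (the exact DDL depends on your database, and I am assuming here that gl_ap_details.account_number is numeric while the mapping table stores it as text; adjust to whichever side has the "wrong" type):
alter table sandbox.utr_fixed_mapping_na
    add column account_number_num bigint;

update sandbox.utr_fixed_mapping_na
    set account_number_num = cast(account_number as bigint);

create index idx_utr_fixed_mapping_na_acct
    on sandbox.utr_fixed_mapping_na (account_number_num);

-- then join on the new, correctly typed column:
-- ON gl.account_number = fm.account_number_num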

Failed UPDATE with CASE

I'm trying to write a query which will update reorder_level based on how much of an item was sold within a particular time period.
with a as (select invoice_itemized.itemnum, inventory.itemname,
sum(invoice_itemized.quantity) as sold
from invoice_itemized
join inventory on invoice_itemized.itemnum=inventory.itemnum and
inventory.vendor_number='COR' and inventory.dept_id='cigs'
join invoice_totals on
invoice_itemized.invoice_number=invoice_totals.invoice_number and
invoice_totals.datetime>=dateadd(month,-1,getdate())
group by invoice_itemized.itemnum, inventory.itemname)
update inventory
set reorder_level = case when a.sold/numpervencase>=5 then 30
when a.sold/numpervencase>=2 then 20
when a.sold/numpervencase>=1 then 5
else 1 end,
reorder_quantity = 1
from a
join inventory_vendors on a.itemnum=inventory_vendors.itemnum
Replacing the update with a select performs entirely as expected, returning proper results from the case and selecting 94 rows.
With the update in place, all of the rows affected by the update (6,758) got set to 1.
Run this, and eyeball the results:
with a as (select invoice_itemized.itemnum, inventory.itemname,
sum(invoice_itemized.quantity) as sold
from invoice_itemized
join inventory on invoice_itemized.itemnum=inventory.itemnum and
inventory.vendor_number='COR' and inventory.dept_id='cigs'
join invoice_totals on
invoice_itemized.invoice_number=invoice_totals.invoice_number and
invoice_totals.datetime>=dateadd(month,-1,getdate())
group by invoice_itemized.itemnum, inventory.itemname)
select a.sold, numpervencase, a.sold/numpervencase,
case
when a.sold/numpervencase>=5 then 30
when a.sold/numpervencase>=2 then 20
when a.sold/numpervencase>=1 then 5
else 1
end,
*
from a
join inventory_vendors on a.itemnum=inventory_vendors.itemnum
It is always a good idea to SELECT before you UPDATE, to check that the data ends up as you expect.
all of the rows affected by the update got set to 1
I put the raw ingredients into the query above; see if the sums worked out as expected. You might need to cast one of the operands to something with decimal places:
1/2 = 0
1.0/2 = 0.5
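For instance, casting one operand in the CASE tests would look something like this (a sketch; it assumes sold and numpervencase are integer columns):
case
    when cast(a.sold as decimal(10,2)) / numpervencase >= 5 then 30
    when cast(a.sold as decimal(10,2)) / numpervencase >= 2 then 20
    when cast(a.sold as decimal(10,2)) / numpervencase >= 1 then 5
    else 1
end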
And it updated far more rows than I was expecting
Every row that comes out of that select will be updated. Identify the rows you don't want to update and put a WHERE clause in to remove them.
Am i overthinking this?
Undertesting, probably
Do I even need the cte?
It makes things easier to express, but no - you could get the same result by pasting the contents of the CTE in as a subquery; that is (effectively) what the DB does anyway.
Do i have my from statement in the wrong place?
We don't know what result you're after, so that one is impossible to answer beyond "doing so would probably generate a syntax error, so... no".
The actual problem seems to be:
your CASE WHEN is always going to ELSE - find out why
your CTE selects too many rows (I couldn't tell if the number you posted was the number you got or the number you were expecting, but it's pretty moot without example data) - find out why
Solved. When I added another join to the update it worked correctly. I had to add: join inventory on inventory_vendors.itemnum=inventory.itemnum
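For reference, the corrected UPDATE with that extra join looks something like this (reconstructed from the statements above; not re-run here):
with a as (select invoice_itemized.itemnum, inventory.itemname,
    sum(invoice_itemized.quantity) as sold
    from invoice_itemized
    join inventory on invoice_itemized.itemnum=inventory.itemnum and
        inventory.vendor_number='COR' and inventory.dept_id='cigs'
    join invoice_totals on
        invoice_itemized.invoice_number=invoice_totals.invoice_number and
        invoice_totals.datetime>=dateadd(month,-1,getdate())
    group by invoice_itemized.itemnum, inventory.itemname)
update inventory
set reorder_level = case when a.sold/numpervencase>=5 then 30
        when a.sold/numpervencase>=2 then 20
        when a.sold/numpervencase>=1 then 5
        else 1 end,
    reorder_quantity = 1
from a
join inventory_vendors on a.itemnum=inventory_vendors.itemnum
join inventory on inventory_vendors.itemnum=inventory.itemnum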

CASE Clause on select clause throwing 'SQLCODE=-811, SQLSTATE=21000' Error

This query works perfectly well in Oracle, but it is not working in DB2. It throws
DB2 SQL Error: SQLCODE=-811, SQLSTATE=21000, SQLERRMC=null, DRIVER=3.61.65
when the subquery under the THEN clause returns 2 rows.
However, my question is: why would that subquery execute in the first place, when my WHEN clause always turns out to be false?
SELECT
CASE
WHEN (SELECT COUNT(1)
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779) = 1
THEN
(SELECT ST.FACILITY_ALIAS_ID
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779
)
ELSE NULL
END STAPPFAC
FROM SHIPMENT SHIPMENT
WHERE SHIPMENT.SHIPMENT_ID IN (2779);
The SQL standard does not require short-circuit evaluation (i.e. a guaranteed evaluation order for the parts of a CASE expression). Oracle chooses to specify short-circuit evaluation; DB2, however, seems not to.
Rewriting your query a little for DB2 (8.1+ only, for FETCH in subqueries) should allow it to run (I'm unsure whether you need the added ORDER BY, and I don't have DB2 to test on at the moment):
SELECT
CASE
WHEN (SELECT COUNT(1)
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779) = 1
THEN
(SELECT ST.FACILITY_ALIAS_ID
FROM STOP ST,
FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC=1
AND ST.SHIPMENT_ID = 2779
ORDER BY ST.SHIPMENT_ID
FETCH FIRST 1 ROWS ONLY
)
ELSE NULL
END STAPPFAC
FROM SHIPMENT SHIPMENT
WHERE SHIPMENT.SHIPMENT_ID IN (2779);
Hmm... you're running the same query twice. I get the feeling you're not thinking in sets (how SQL operates), but in a more procedural form (i.e. how most common programming languages work). You probably want to rewrite this to take advantage of how RDBMSs are supposed to work:
SELECT Current_Stop.facility_alias_id
FROM SYSIBM/SYSDUMMY1
LEFT JOIN (SELECT MAX(Stop.facility_alias_id) AS facility_alias_id
FROM Stop
JOIN Facility
ON Facility.facility_id = Stop.facility_id
AND Facility.is_dock_sched_fac = 1
WHERE Stop.shipment_id = 2779
HAVING COUNT(*) = 1) Current_Stop
ON 1 = 1
(no sample data, so not tested. There's a couple of other ways to write this based on other needs)
This should work on all RDBMSs.
So what's going on here, why does this work? (And why did I remove the reference to Shipment?)
First, let's look at your query again:
CASE WHEN (SELECT COUNT(1)
FROM STOP ST, FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC = 1
AND ST.SHIPMENT_ID = 2779) = 1
THEN (SELECT ST.FACILITY_ALIAS_ID
FROM STOP ST, FACILITY FAC
WHERE ST.FACILITY_ID = FAC.FACILITY_ID
AND FAC.IS_DOCK_SCHED_FAC = 1
AND ST.SHIPMENT_ID = 2779)
ELSE NULL END
(First off, stop using the implicit-join syntax - that is, comma-separated FROM clauses - always explicitly qualify your joins. For one thing, it's way too easy to miss a condition you should be joining on)
...from this it's obvious that the subquery is the 'same' in both branches, and it shows what you're attempting: if the dataset has one row, return it; otherwise the result should be null.
Enter the HAVING clause:
HAVING COUNT(*) = 1
This is essentially a WHERE clause for aggregates (functions like MAX(...), or here, COUNT(...)). This is useful when you want to make sure some aspect of the entire set matches a given criteria. Here, we want to make sure there's just one row, so using COUNT(*) = 1 as the condition is appropriate; if there's more (or less! could be 0 rows!) the set will be discarded/ignored.
Of course, using HAVING means we're using an aggregate, the usual rules apply: all columns must either be in a GROUP BY (which is actually an option in this case), or an aggregate function. Because we only want/expect one row, we can cheat a little, and just specify a simple MAX(...) to satisfy the parser.
At this point, the new subquery returns one row (containing one column) if there was only one row in the initial data, and no rows otherwise (this part is important). However, we actually need to return a row regardless.
FROM SYSIBM/SYSDUMMY1
This is a handy dummy table present on all DB2 installations. It has exactly one row, with a single character column (IBMREQD) whose value is 'Y'. We're actually interested in the fact that it has only one row...
LEFT JOIN (SELECT ... )
ON 1 = 1
A LEFT JOIN takes every row in the preceding set (all joined rows from the preceding tables), and multiplies it by every row in the next table reference, multiplying by 1 in the case that the set on the right (the new reference, our subquery) has no rows. (This is different from how a regular (INNER) JOIN works, which multiplies by 0 in the case that there is no row) Of course, we only maybe have 1 row, so there's only going to be a maximum of one result row. We're required to have an ON ... clause, but there's no data to actually correlate between the references, so a simple always-true condition is used.
To get our data, we just need to get the relevant column:
SELECT Current_Stop.facility_alias_id
... if there's the one row of data, it's returned. In the case that there is some other count of rows, the HAVING clause throws out the set, and the LEFT JOIN causes the column to be filled in with a null (no data) value.
So why did I remove the reference to Shipment? First off, you weren't using any data from the table - the only column in the result set was from the subquery. I also have good reason to believe that there would only be one row returned in this case - you're specifying a single shipment_id value (which implies you know it exists). If we don't need anything from the table (including the number of rows in that table), it's usually best to remove it from the statement: doing so can simplify the work the db needs to do.

Writing a query to include certain values but exclude others when looking for a latest time period

I am trying to write a query that looks for people that have a certain code with the latest period (year), but not if they also have another code for that latest period (year). I'll be explicit just so my example makes sense.
I want people who have the code A1, A2, A3, A4 or A5 but not AG, AP or AQ. There are people who have an A1 code for a period (like 2014) and an AG code for the same period; I'd like to exclude them. Not everyone has a code, so the field value could be NULL.
Is there a more concise way (i.e. fewer characters) to express this than the way I did?
SELECT
people.firstName
FROM
people
WHERE EXISTS (
SELECT *
FROM codes
WHERE
codes.people_id = people.id
AND period = (SELECT MAX(period) FROM codes codes2 WHERE codes2.people_id = codes.people_id)
AND code LIKE 'A[1-5]'
)
AND NOT EXISTS (
SELECT *
FROM codes
WHERE
codes.people_id = people.id
AND period = (
SELECT MAX(period)
FROM codes codes2
WHERE codes2.people_id = codes.people_id
)
AND code LIKE 'A[GPQ]'
)
Schema is as follows:
People
id (PK)
firstName
Codes
people_id (FK) many to one relation with People table
code (e.g. "A1", "A2", "AG")
period (e.g. "2013", "2014")
There are many ways you could do this. I'm not an SQL expert, but I can't see your query being too bad. If you want to try to reduce the number of subqueries, you could consider using a GROUP BY clause along with a SUM aggregate in a HAVING clause.
I started updating your code as follows:
SELECT
people.firstName
FROM
people
LEFT JOIN codes AS a15 ON a15.people_id = people.id AND a15.code LIKE 'A[1-5]'
LEFT JOIN codes AS agpq ON agpq.people_id = people.id AND agpq.code LIKE 'A[GPQ]'
GROUP BY
people.firstName
HAVING
SUM(CASE WHEN a15.code IS NULL THEN 0 ELSE 1 END) > 0
AND SUM(CASE WHEN agpq.code IS NULL THEN 0 ELSE 1 END) = 0
This, however, doesn't take into account the period-specific requirements you described. You could add the period to the GROUP BY clause, or add it to a WHERE clause or one of the JOIN constraints, but I'm not quite sure from your description exactly what you're after (I don't believe this is through any fault of your own; I just can't personally align the code provided with the description).
I would also like to point out that the SUM functions above will not give an accurate count of the number of matching codes. This is because if both A[GPQ] and A[1-5] return at least one row, the number returned by each constraint will be multiplied by the number returned for the other. It can, however, be used to determine whether there are "any" returned items, since if the criteria are matched it will have a SUM(...) > 0.
I'm sure a more experienced SQL Developer / DBA will be able to poke many holes in my proposed query but it might give them or someone else something to work from and hopefully gives you ideas for alternatives to using sub-queries.
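For what it's worth, here is another hedged sketch that folds the "latest period" requirement in with a single conditional aggregation over codes. It assumes SQL Server-style LIKE character classes (as used in the question) and has not been run against the real schema:
SELECT p.firstName
FROM people p
JOIN (
    SELECT c.people_id,
        SUM(CASE WHEN c.code LIKE 'A[1-5]' THEN 1 ELSE 0 END) AS wanted,
        SUM(CASE WHEN c.code LIKE 'A[GPQ]' THEN 1 ELSE 0 END) AS excluded
    FROM codes c
    WHERE c.period = (SELECT MAX(c2.period)
                      FROM codes c2
                      WHERE c2.people_id = c.people_id)  -- only each person's latest period
    GROUP BY c.people_id
) latest ON latest.people_id = p.id
WHERE latest.wanted > 0      -- at least one A1-A5 code in the latest period
  AND latest.excluded = 0;   -- and no AG/AP/AQ code in that same period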