SQLite - Filter record based on specific calculated sum - sql

I'm given a table with a varying number of records that contain persons, their weight, and an order value which determines in which order the persons should be chosen ...
create table line (
    id int not null PRIMARY KEY,
    name varchar(255) not null,
    weight int not null,
    turn int unique not null,
    check (weight > 0)
);
I have to retrieve the last record from the table for which 1000 lbs is not exceeded when adding up the persons' weights in order.
E.g. if the table is
id | name  | weight | order |
---+-------+--------+-------+
 5 | Mark  |    250 |     1 |
 4 | James |    175 |     5 |
 3 | John  |    350 |     2 |
 6 | James |    400 |     3 |
 1 | Rick  |    500 |     6 |
 2 | Mike  |    200 |     4 |
The query should return only James, as the weights of the first three people picked (per order) fit within 1000 lbs: 250 + 350 + 400 = 1000.
People with the same name might be in the table; therefore the id should be used for the calculation.
How would I write an SQLite statement that can filter this accordingly?
I should first order by order but from there on my SQL knowledge is too limited to figure out how to calculate the sum within 1000...
SELECT name, weight, order FROM line ORDER BY order ASC

We can do this with a correlated subquery:
SELECT
    id,
    name,
    weight,
    "order"
FROM line t1
WHERE
    (SELECT SUM(t2.weight) FROM line t2 WHERE t2."order" <= t1."order") <= 1000
ORDER BY
    "order" DESC
LIMIT 1;
The correlated subquery in the query above calculates the running sum of weight as given by the order. The WHERE clause just restricts to only line records whose cumulative weight does not exceed 1000. Then, we order descending by the order column with LIMIT 1 to target the James record.
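If your SQLite is 3.25 or newer, window functions are available, so the same running sum can be written without the correlated subquery, much like the Db2 answer further down. A sketch, assuming that version:
SELECT id, name, weight, "order"
FROM (
    SELECT id, name, weight, "order",
           SUM(weight) OVER (ORDER BY "order") AS running_weight
    FROM line
) t
WHERE running_weight <= 1000
ORDER BY "order" DESC
LIMIT 1;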
By the way, avoid using SQL keywords like order to name your columns.

SQL keywords are not case-sensitive, so changing the case of order will not help. Wrap the column name inside []: [order]
SELECT name, weight, order FROM line ORDER BY [order] ASC
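Standard double-quote identifier quoting also works in SQLite and is more portable:
SELECT name, weight, "order" FROM line ORDER BY "order" ASC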

Related

Stop SQL Select After Sum Reached

My database is Db2 for IBM i.
I have read-only access, so my query must use only basic SQL select commands.
==============================================================
Goal:
I want to select every record in the table until the sum of the amount column exceeds the predetermined limit.
Example:
I want to match every item down the table until the sum of matched values in the "price" column >= $9.00.
The desired result is every row up to and including the one where the running total of price first reaches $9.00.
Is this possible?
You may use the SUM analytic function to calculate a running total of price and then filter by its value:
with a as (
    select
        t.*,
        sum(price) over (order by salesid asc) as price_rsum
    from t
)
select *
from a
where price_rsum <= 9
SALESID | PRICE | PRICE_RSUM
------: | ----: | ---------:
   1001 |     5 |          5
   1002 |     3 |          8
   1003 |     1 |          9
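If you instead need to include the row that first reaches or passes the limit (for cases where the limit falls in the middle of a row), filtering on the running total minus the current price is a common variant of the same query:
select *
from a
where price_rsum - price < 9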

(Edited) Given a number limit, write a PostgreSQL query that returns the last person's name whose value fits within that limit after summing

Trying to write a query that (given a table that looks like this) sums up everyone's value in the order they appear and returns the name of the person one place before the person whose running total exceeds the given number limit. If the number limit given was 500, bernard should be returned, because 250 + 300 exceeds 500. If the number limit given is 1000, bob should be returned, because 250 + 300 + 250 + 250 exceeds the 1000 number limit.
id | name    | value
---+---------+------
 1 | bernard |   250
 2 | bernice |   300
 3 | bob     |   250
 4 | buddha  |   250
 5 | cheesy  |   200
 6 | dog     |   200
You want a cumulative sum. But, SQL tables represent unordered sets, so you need a column that specifies the ordering. Let me assume it is called id for typing convenience.
select t.*
from (select t.*, sum(value) over (order by id) as running_sum
      from t
     ) t
where running_sum <= 1000   -- keep only rows whose running total is still within the limit
order by id desc
limit 1;
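As a quick check against the question's other example: with 500 in place of 1000 the running sums are 250 and 550, so the same query returns bernard:
select t.*
from (select t.*, sum(value) over (order by id) as running_sum
      from t
     ) t
where running_sum <= 500
order by id desc
limit 1;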

SQL Query to return a distinct count of one column while allowing a full summation of a second column, grouped by a third

I'm writing a query in Access 2010 and I can't use COUNT(DISTINCT ...), so I'm running into a bit of trouble with what can be found below:
An example of my table is as follows
Provider  | Member ID | Dollars | Status
----------+-----------+---------+-------
FacilityA |      1001 |      50 | Pended
FacilityA |      1001 |     100 | Paid
FacilityA |      1002 |     200 | Paid
FacilityB |      1005 |      30 | Pended
FacilityB |      1009 |      90 | Pended
FacilityC |      1001 |     100 | Paid
FacilityC |      1008 |     500 | Paid
I want to return the total # of unique members that have visited each facility, but I also want to get the total dollar amount that is Pended, so for this example the ideal output would be
Provider  | # members | Total Pended charges
----------+-----------+---------------------
FacilityA |         2 |                   50
FacilityB |         2 |                  120
FacilityC |         2 |                    0
I tried using some code I found here: Count Distinct in a Group By aggregate function in Access 2007 SQL
and here:
SQL: Count distinct values from one column based on multiple criteria in other columns
Copying the code from the first link provided by gzaxx:
SELECT cd.DiagCode, Count(cd.CustomerID)
FROM (select distinct DiagCode, CustomerID from CustomerTable) as cd
Group By cd.DiagCode;
I can make this work for counting the members:
SELECT cd.Provider_Number, Count(cd.Member_ID)
FROM (select distinct Provider_Number, Member_ID from Claims_Table) as cd
Group By cd.Provider_Number;
However, no matter what I try I can't get a second portion dealing with the dollars to work without causing an error or messing up the calculation on the member count.
SELECT cd.Provider_Number,
-- claims_table.Member_ID, claims_table.Dollars
SUM(IIF ( Claims_Table.Status = 'Pended' , Claims_Table.Dollars , 0 )) as Dollars_Pending,
Count(cd.Member_ID) as Uniq_Members,
Sum(Dollars) as Dollar_Wrong
FROM (select distinct Provider_Number, Member_ID from Claims_Table) as cd inner join #claims_table
ON claims_table.Provider_Number=cd.Provider_Number and claims_table.Member_ID = cd.Member_ID
Group By cd.Provider_Number;
This should work fine based only on the table you described (named Tabelle1):
SELECT cd.Provider, Count(cd.MemberID) as [# Members],
       NZ(SUM(cd.PendedDollars), 0) as [Total Pended charges]
FROM (SELECT Provider, MemberID,
             SUM(SWITCH([Status]='Pended', Dollars)) as PendedDollars
      FROM Tabelle1
      GROUP BY Provider, MemberID) as cd
GROUP BY cd.Provider;
Explanation
The inner query collapses the table to one row per Provider and MemberID, so the outer Count(cd.MemberID) counts each member only once per facility. The SWITCH([Status]='Pended', Dollars) returns the Dollars only if the status is Pended, so the SUMs add up nothing but pended charges. The NZ(.., 0) turns the total into 0 for providers that have no pended claims at all. Against the sample rows above this yields 2 members per facility and 50 / 120 / 0 pended dollars, as in the desired output.
EDIT: This was tested on Access 2016

How to get the first rows after order

How can I get only the first few rows after I apply ORDER BY to a table?
In SQL 2012, let's say I have a table:
-----------------------
| Sales | ProductType |
-----------------------
|   120 | Foodstuff   |
|   100 | Electronic  |
|   200 | Mobile      |
Now the problem is:
I select with order by Sales DESC
and I only want to get 2 rows.
You can use the limit clause.
SELECT *
FROM tablename
ORDER BY sales DESC
LIMIT n;
Where n is the number of rows you want to select
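If "SQL 2012" means SQL Server 2012, note that it has no LIMIT clause; the equivalents there are TOP, or OFFSET/FETCH when you need paging:
SELECT TOP (2) *
FROM tablename
ORDER BY sales DESC;

SELECT *
FROM tablename
ORDER BY sales DESC
OFFSET 0 ROWS FETCH NEXT 2 ROWS ONLY;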

Remove redundant SQL price cost records

I have a table costhistory with fields id,invid,vendorid,cost,timestamp,chdeleted. It looks like it was populated with a trigger every time a vendor updated their list of prices.
It has redundant records, since rows were written whether or not the price had changed since the previous record.
Example:
id | invid | vendorid | cost | timestamp | chdeleted
---+-------+----------+------+-----------+----------
 1 |   123 |        1 |  100 | 1/1/01    |         0
 2 |   123 |        1 |  100 | 1/2/01    |         0
 3 |   123 |        1 |  100 | 1/3/01    |         0
 4 |   123 |        1 |  500 | 1/4/01    |         0
 5 |   123 |        1 |  500 | 1/5/01    |         0
 6 |   123 |        1 |  100 | 1/6/01    |         0
I would want to remove records with ID 2,3,5 since they do not reflect any change since the last price update.
I'm sure it can be done, though it might take several steps.
Just to be clear, this table has swelled to 100gb and contains 600M rows. I am confident that a proper cleanup will take this table's size down by 90% - 95%.
Thanks!
The approach you take will vary depending on the database you are using. For SQL Server 2005+, the following query should give you the records you want to remove:
select id
from (
    select id, Rank() over (Partition BY invid, vendorid, cost order by timestamp) as Rank
    from costhistory
) tmp
where Rank > 1
You can then delete them like this:
delete from costhistory
where id in (
    select id
    from (
        select id, Rank() over (Partition BY invid, vendorid, cost order by timestamp) as Rank
        from costhistory
    ) tmp
    where Rank > 1
)
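One caveat: because the ranking partitions by cost alone, it also flags id 6, where the price drops back to 100 after the two 500 records, even though that row does reflect a change. If rows like that must survive, comparing each row to the immediately preceding one with LAG (available from SQL Server 2012 on) is an alternative; a sketch against the same table:
select id
from (
    select id, cost,
           lag(cost) over (partition by invid, vendorid order by timestamp) as prev_cost
    from costhistory
) tmp
where cost = prev_cost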
I would suggest that you recreate the table using a group by query. Also, I assume the "id" column is not used in any other tables; if it is, then you need to fix those tables as well.
Deleting such a large quantity of records is likely to take a long, long time.
The query would look like:
insert into newversionoftable(invid, vendorid, cost, timestamp, chdeleted)
    select invid, vendorid, cost, min(timestamp), chdeleted
    from costhistory
    group by invid, vendorid, cost, chdeleted
If you do opt for a delete, I would suggest:
(1) Fix the code first, so no duplicates are going in.
(2) Determine the duplicate ids and place them in a separate table.
(3) Delete in batches.
To find the duplicate ids, use something like:
select *
from (select id,
             row_number() over (partition by invid, vendorid, cost, chdeleted order by timestamp) as seqnum
      from costhistory
     ) t
where seqnum > 1
If you want to keep the most recent version instead, then use "timestamp desc" in the order by clause.
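For steps (2) and (3), one way to batch the deletes, assuming the flagged ids were saved into a work table called duplicate_ids (a name used here purely for illustration):
delete top (10000) from costhistory
where id in (select id from duplicate_ids);
Repeat the statement until it affects zero rows.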