How to do cumulative query - sql

I would share ddl I am try in my local :
Table Inv :
create table inv(
inv_id integer not null primary key,
document_no varchar(150) not null,
grandtotal integer not null);
Tabel Pay :
create table pay(
pay_id integer not null primary key,
document_no varchar(150) not null,
inv_id integer references inv(inv_id),
payamt integer not null);
Insert into Inv :
insert into inv(inv_id, document_no, grandtotal) values
(1,'ABC18',50000),(2,'ABC19',45000);
Insert into Pay :
insert into pay(pay_id, document_no, inv_id, payamt) values
(1,'DEF18-1',1,20000),(2,'DEF18-2',1,30000);
How to make cumulative query? I am try
select inv.document_no, inv.grandtotal, sum(pay.payamt),
sum(pay.payamt)- inv.grandtotal as total
from inv, pay
where inv.inv_id= pay.inv_id
group by inv.document_no, inv.grandtotal
But it doesn't give me the expected result.

First of all, do not use that Join syntax, I am advising you to not use it. You can see the reason why here
Bad Habits to kick : using old style joins
From your ddl you share and your query I assume you want to see the history of your transaction and do cumulative?
This query should work :
SELECT inv.document_no AS doc_inv,
inv.grandtotal AS total_inv,
COALESCE(pay.document_no, '-') AS doc_pay,
COALESCE(pay.payamt, '0') AS total_pay,
COALESCE(( inv.grandtotal - Sum(pay.payamt)
OVER(
partition BY inv.inv_id
ORDER BY pay.pay_id) ), inv.grandtotal)
AS cumulative
FROM inv
LEFT OUTER JOIN pay
ON inv.inv_id = pay.inv_id
I am using Left Outer Join because there are Inv not get Pay in your insert Data. And of course it is only guessing without more guidance.
And what do you need is Window Function
Definition :
Performs a calculation across a set of table rows that are somehow
related to the current row.
And about join table you can read here : Join Documentation
Here Demo :
Demo<>Fiddle

Related

How to join Views with aggregate functions?

My problem:
In #4, I'm having trouble joining two Views because the other has an aggregate function. Same with #5
Question:
Create a view name it as studentDetails, that would should show the student name, enrollment date, total price per unit and subject description of students who are enrolled on the subject Science or History.
Create a view, name it as BiggestPrice, that will show the subject id and highest total price per unit of all the subjects. The view should show only the highest total price per unit that are greater than 1000.
--4.) Create a view name it as studentDetails, that would should show the student name,
-- enrollment date the total price per unit and subject description of students who are
-- enrolled on the subject Science or History.
CREATE VIEW StudentDetails AS
SELECT StudName, EnrollmentDate
--5.) Create a view, name it as BiggestPrice, that will show the subject id and highest total
-- price per unit of all the subjects. The view should show only the highest total price per unit
-- that are greater than 1000.
CREATE VIEW BiggestPrice AS
SELECT SubjId, SUM(Max(Priceperunit)) FROM Student, Subject
GROUP BY Priceperunit
Here is my table:
CREATE TABLE Student(
StudentId char(5) not null,
StudName varchar2(50) not null,
Age NUMBER(3,0),
CONSTRAINT Student_StudentId PRIMARY KEY (StudentId)
);
CREATE table Enrollment(
EnrollmentId varchar2(10) not null,
EnrollmentDate date not null,
StudentId char(5) not null,
SubjId Number(5) not null,
constraint Enrollment_EnrollmentId primary key (EnrollmentId),
constraint Enrollment_StudentId_FK foreign key (StudentId) references Student(StudentId),
constraint Enrollment_SubjId_Fk foreign key (SubjId) references Subject(SubjId)
);
Create table Subject(
SubjId number(5,0) not null,
SubjDescription varchar2(200) not null,
Units number(3,0) not null,
Priceperunit number(9,0) not null,
Constraint Subject_SubjId_PK primary key (SubjId)
);
Since this appears to be a homework question.
You need to use JOINs. Your current query:
CREATE VIEW StudentDetails AS
SELECT StudName, EnrollmentDate
Does not have a FROM clause and the query you have for question 5 uses the legacy comma join syntax with no WHERE filter; this is the same as a CROSS JOIN and will connect every student to every subject and is not what you want.
Don't use the legacy comma join syntax and use ANSI joins and explicitly state the join condition.
SELECT <expression list>
FROM student s
INNER JOIN enrollment e ON ...
INNER JOIN subject j ON ...
Then you can fill in the ... based on the relationships between the tables (typically the primary key of one table = the foreign key of another table).
Then for the <expression list> you need to include the columns asked for in the question: student name and enrolment date and subject name would just be those columns from the appropriate tables; and total price-per-unit (which I assume is actually total-price-per-subject) would be a calculation.
Then for the last part of question 4.
who are enrolled on the subject Science or History.
Add a WHERE filter to only include rows for those subjects.
For question 5, you do not need any JOINS as the question only asks about details in the SUBJECT table.
You need to add a WHERE filter to show "only the highest total price per unit that are greater than 1000". This is a simple multiplication and then you can filter by comparing if it is > 1000.
Then you need to limit the query to return only the row with the "highest total price per unit of all the subjects". From Oracle 12, this would be done with an ORDER BY clause in descending order of total price and then using FETCH FIRST ROW ONLY or FETCH FIRST ROW WITH TIES.
Not sure if i get it fully, but i think its this :
Notes:
Always use Id's to filter records:
where su.SubjId in (1,2)
You can find max record using max() at subquery and join it with main query like this :
where su2.SubjId = su.SubjId
You cannot use alias as filter so you can filter it like:
( su.Units * su.Priceperunit ) > 1000
CREATE VIEW StudentDetails AS
select s.StudName,
e.EnrollmentDate,
su.SubjDescription,
su.Units * su.Priceperunit TotalPrice
from student s
inner join Enrollment e
on e.StudentId = s.StudentId
inner join Subject su
on su.SubjId = e.SubjId
where su.SubjId in (1,2)
CREATE VIEW BiggestPrice AS
select su.SubjId, ( su.Units * su.Priceperunit ) TotalPrice
from Subject su
where ( su.Units * su.Priceperunit ) =
(
select max(su2.Units * su2.Priceperunit)
from Subject su2
where su2.SubjId = su.SubjId
)
and ( su.Units * su.Priceperunit ) > 1000

Group by count multiple tables

Need to find out why my group by count query is not working. I am using Microsoft SQL Server and there are 2 tables I am trying to join.
My query needs to bring up the number of transactions made for each type of vehicle. The output of the query needs to have a separate row for each type of vehicle such as ute, hatch, sedan, etc.
CREATE TABLE vehicle
(
vid INT PRIMARY KEY,
type VARCHAR(30) NOT NULL,
year SMALLINT NOT NULL,
price DECIMAL(10, 2) NOT NULL,
);
INSERT INTO vehicle
VALUES (1, 'Sedan', 2020, 240)
CREATE TABLE purchase
(
pid INT PRIMARY KEY,
vid INT REFERENCES vehicle(vid),
pdate DATE NOT NULL,
datepickup DATE NOT NULL,
datereturn DATE NOT NULL,
);
INSERT INTO purchase
VALUES (1, 1, '2020-07-12', '2020-08-21', '2020-08-23')
I have about 10 rows on information in each table I just haven't written it out.
This is what I wrote but it doesn't return the correct number of transactions for each type of car.
SELECT
vehicle.vid,
COUNT(purchase.pid) AS NumberOfTransactions
FROM
purchase
JOIN
vehicle ON vehicle.vid = purchase.pid
GROUP BY
vehicle.type;
Any help would be appreciated. Thanks.
Your GROUP BY and SELECT columns are inconsistent. You should write the query like this:
SELECT v.Type, COUNT(*) AS NumPurchases
FROM Purchase p JOIN
Vehicle v
ON v.vID = p.pID
GROUP BY v.Type;
Note the use of table aliases so the query is easier to write and read.
If this doesn't produce the expected values, you will need to provide sample data and desired results to make it clear what the data really looks like and what you expect.

SQL Server 2008 R2: Show only recently added records

I have two tables:
Cust : Contains customer details like customer ID and customer Name.
Cust_Address : This table contains customer ID and customer address.
Table: Cust
create table cust
(
cust_id int,
cust_name varchar(10)
);
Records Insertion:
insert into cust values(1,'A');
insert into cust values(2,'B');
insert into cust values(3,'C');
insert into cust values(4,'D');
Table: Cust_Address
create table cust_address
(
cust_id int,
cust_add varchar(50)
);
Records Insertion:
insert into cust_address values(1,'US');
insert into cust_address values(2,'UK');
insert into cust_address values(3,'UAE');
insert into cust_address values(4,'SA');
insert into cust_address values(1,'AUS');
insert into cust_address values(2,'IND');
insert into cust_address values(3,'SL');
insert into cust_address values(1,'CHINA');
Now I want to show the result which contains the latest customer address which have been inserted in the table Cust_Address.
Expected Result:
Cust_ID Cust_Name Cust_Add
-------------------------------
1 A CHINA
2 B IND
3 C SL
4 D SA
Here is the SQLFiddle for tables and its records.
You are not able to retrieve the rows in any particular order. You need some more info to get an order.
The best way is primary index in Cust_address
CustAddrID int identity(1, 1) not null primary key
You can also have a CreatedOn column that will have default value equal to getDate()
After that you can figure out what is the last inserted value for CustAddr for each Cust record.
In case you are not able to add new column there then maybe
change tracking functionality. But your issue seems to be too trivial for that.
There are also Temporal Tables in SQL Server 2016. But again it's probably too much.
Here is an example how you can get the address using primary key CustAddrID
SQL Fiddle
select cust_name, cust_add
from cust C
join
(select
cust_add, cust_id,
row_number() over (partition by cust_id order by cust_add_id desc) rn
from cust_address ) CLA
on CLA.cust_id = C.cust_id and
CLA.rn = 1
Identity column increases every time when we insert new value to the table. The correct value for your case will be the record with the highest cust_add_id and specified cust_id.
In the above query we generates numbers in desc order starting from 1 using row_number() function for each cust_id (partition by cust_id). Finally we take only the records with generated number rn equal to 1 CLA.rn = 1 and we join it to cust table.
You can replace row_number() by max(cust_add_id) and group by cust_id. However in that case you need to join cust_add table twice.
You will not be able to get the rows out of the link table in the order they were inserted.
You need to have a column for this.
Imagine how big the meta-data would be if you needed to keep a record for each record for creation! Would you also want to keep meta-data on your meta-data so you know when the meta-data was updated? The space use can quickly escalate.
SQL Server keeps some stats but something this specific will need to come from a user-defined field.
So you either use a identity column in the CustAddr table [CustAddr int identity(1, 1) not null primary key] or add a column for createdDateAndTime DateTime Default GetDate().

SQL Server 2005 query optimization with Max subquery

I've got a table that looks like this (I wasn't sure what all might be relevant, so I had Toad dump the whole structure)
CREATE TABLE [dbo].[TScore] (
[CustomerID] int NOT NULL,
[ApplNo] numeric(18, 0) NOT NULL,
[BScore] int NULL,
[OrigAmt] money NULL,
[MaxAmt] money NULL,
[DateCreated] datetime NULL,
[UserCreated] char(8) NULL,
[DateModified] datetime NULL,
[UserModified] char(8) NULL,
CONSTRAINT [PK_TScore]
PRIMARY KEY CLUSTERED ([CustomerID] ASC, [ApplNo] ASC)
);
And when I run the following query (on a database with 3 million records in the TScore table) it takes about a second to run, even though if I just do: Select BScore from CustomerDB..TScore WHERE CustomerID = 12345, it is instant (and only returns 10 records) -- seems like there should be some efficient way to do the Max(ApplNo) effect in a single query, but I'm a relative noob to SQL Server, and not sure -- I'm thinking I may need a separate key for ApplNo, but not sure how clustered keys work.
SELECT BScore
FROM CustomerDB..TScore (NOLOCK)
WHERE ApplNo = (SELECT Max(ApplNo)
FROM CustomerDB..TScore sc2 (NOLOCK)
WHERE sc2.CustomerID = 12345)
Thanks much for any tips (pointers on where to look for optimization of sql server stuff appreciated as well)
When you filter by ApplNo, you are using only part of the key. And not the left hand side. This means the index has be scanned (look at all rows) not seeked (drill to a row) to find the values.
If you are looking for ApplNo values for the same CustomerID:
Quick way. Use the full clustered index:
SELECT BScore
FROM CustomerDB..TScore
WHERE ApplNo = (SELECT Max(ApplNo)
FROM CustomerDB..TScore sc2
WHERE sc2.CustomerID = 12345)
AND CustomerID = 12345
This can be changed into a JOIN
SELECT BScore
FROM
CustomerDB..TScore T1
JOIN
(SELECT Max(ApplNo) AS MaxApplNo, CustomerID
FROM CustomerDB..TScore sc2
WHERE sc2.CustomerID = 12345
) T2 ON T1.CustomerID = T2.CustomerID AND T1.ApplNo= T2.MaxApplNo
If you are looking for ApplNo values independent of CustomerID, then I'd look at a separate index. This matches your intent of the current code
CREATE INDEX IX_ApplNo ON TScore (ApplNo) INCLUDE (BScore);
Reversing the key order won't help because then your WHERE sc2.CustomerID = 12345 will scan, not seek
Note: using NOLOCK everywhere is a bad practice

Database design - Multiple 1 to many relationships

What would be the best way to model 1 table with multiple 1 to many relatiionships.
With the above schema if Report contains 1 row, Grant 2 rows and Donation 12. When I join the three together I end up with a Cartesian product and result set of 24. Report joins to Grant and creates 2 rows, then Donation joins on that to make 24 rows.
Is there a better way to model this to avoid the caresian product?
example code
DECLARE #Report
TABLE (
ReportID INT,
Name VARCHAR(50)
)
INSERT
INTO #Report
(
ReportID,
Name
)
SELECT 1,'Report1'
DECLARE #Grant
TABLE (
GrantID INT IDENTITY(1,1) PRIMARY KEY(GrantID),
GrantMaker VARCHAR(50),
Amount DECIMAL(10,2),
ReportID INT
)
INSERT
INTO #Grant
(
GrantMaker,
Amount,
ReportID
)
SELECT 'Grantmaker1',10,1
UNION ALL
SELECT 'Grantmaker2',999,1
DECLARE #Donation
TABLE (
DonationID INT IDENTITY(1,1) PRIMARY KEY(DonationID),
DonationMaker VARCHAR(50),
Amount DECIMAL(10,2),
ReportID INT
)
INSERT
INTO #Donation
(
DonationMaker,
Amount,
ReportID
)
SELECT 'Grantmaker1',10,1
UNION ALL
SELECT 'Grantmaker2',3434,1
UNION ALL
SELECT 'Grantmaker3',45645,1
UNION ALL
SELECT 'Grantmaker4',3,1
UNION ALL
SELECT 'Grantmaker5',34,1
UNION ALL
SELECT 'Grantmaker6',23,1
UNION ALL
SELECT 'Grantmaker7',67,1
UNION ALL
SELECT 'Grantmaker8',78,1
UNION ALL
SELECT 'Grantmaker9',98,1
UNION ALL
SELECT 'Grantmaker10',43,1
UNION ALL
SELECT 'Grantmaker11',107,1
UNION ALL
SELECT 'Grantmaker12',111,1
SELECT *
FROM #Report r
INNER JOIN
#Grant g
ON r.ReportID = g.ReportID
INNER JOIN
#Donation d
ON r.ReportID = d.ReportID
Update 1 2011-03-07 15:20
Cheers for the feedback so far, to add to this scenario there are also 15 other 1 to many relationships coming from the one report table. These tables can't for various business reasons be grouped together.
Is there any relationship at all between Grants and Donations? If there isn't, does it make sense to pull back a query that shows a pseudo relationship between them?
I'd do one query for grants:
SELECT r.*, g.*
FROM #Report r
JOIN #Grant g ON r.ReportID = g.ReportID
And another for donations:
SELECT r.*, d.*
FROM #Report r
JOIN #Donation d ON r.ReportID = d.ReportID
Then let your application show the appropriate data.
However, if Grants and Donations are similar, then just make a more generic table such as Contributions.
Contributions
-------------
ContributionID (PK)
Maker
Amount
Type
ReportID (FK)
Now your query is:
SELECT r.*, c.*
FROM #Report r
JOIN #Contribution c ON r.ReportID = c.ReportID
WHERE c.Type = 'Grant' -- or Donation, depending on the application
If you're going to join on ReportID, then no, you can't avoid a lot of rows. When you omit the table "Report", and just join "Donation" to "Grant" on ReportId, you still get 24 rows.
SELECT *
FROM Grant g
INNER JOIN
Donation d
ON g.ReportID = d.ReportID
But the essential point is that it doesn't make sense in the real world to match up donations and grants. They're completely independent things that essentially have nothing to do with each other.
In the database, the statement immediately above will join each row in Grants to every matching row in Donation. The resulting 24 rows really shouldn't surprise you.
When you need to present independent things to the user, you should use a report writer or web application (for example) that selects the independent things, well, independently. Select donations and put them into one section of a report or web page, then select grants and put them into another section of the report or web page, and so on.
If the table "Report" is supposed to help you record which sections go into a particular report, then you need a structure more like this:
create table reports (
reportid integer primary key,
report_name varchar(35) not null unique
);
create table report_sections (
reportid integer not null references reports (reportid),
section_name varchar(35), -- Might want to reference a table of section names
section_order integer not null,
primary key (reportid, section_name)
);
The donation and grant tables look almost identical. You could make them one table and add a column that is something like DonationType. Would reduce complexity by 1 table. Now if donations and grants are completely different and have different subtables associated with them then keeping them seperate and only joining on one at a time would be ideal.