SQL Server Roll-up (Sum) Value from associative table

I have the following situation: a “Client” table (parent) and an “Opportunity” table (child). See the example below.
Client Table
| Id | Name
------------------
|1 | Client A
|2 | Client B
|3 | Client C
Opportunity Table
| Id | ClientId | Value
---------------------------------
| 10 | 1 | 1000
| 11 | 1 | 3000
| 12 | 2 | 1500
| 13 | 3 | 2000
I want to show the sum of all Opportunity values (OppValue) on the client record.
Expected Output
| Id | Name | OppValue
-----------------------------
|1 | Client A | 4000
|2 | Client B | 1500
|3 | Client C | 2000
The business requirement is to filter on “OppValue” with criteria such as greater than, less than, or null, but not by opportunity create date or similar fields. We expect users to add about 500 clients and 45,000 new opportunities each year. Based on the above, I can think of three options:
Calculate OppValue using SQL Query (Group By or Partition By)
Create View or Calculated Field using UDF
Create a new field in the “Client” table and populate it using Application business logic (outside SQL).
Which of these solutions, in your opinion, will work best in terms of user experience (speed) and maintenance?
If there is a better approach, please let me know.
Many thanks in advance.

Start with a view:
create view client_opp as
select c.*, o.OppValue
from Client c outer apply
(select sum(o.Value) as OppValue
from Opportunity o
where o.ClientId = c.Id
) o;
Be sure you have an index on Opportunity(ClientId, Value), or at least on Opportunity(ClientId). Note that this uses apply quite specifically, so the view should work well even when used in a query with additional filtering.
If this works performance-wise, then you are done. Other methods using triggers and UDFs require a bit more maintenance in the database. You can definitely use them, but I would recommend waiting to see if this meets your performance needs.

You can use apply :
create view client_view as
select c.*, ot.OppValue
from ClientTable c cross apply
( select sum(ot.Value) as OppValue
from OpportunityTable ot
where ot.ClientId = c.Id
) ot;

Try the following query, which uses an inner join and the sum() aggregate function.
Create table Client
(Id int, Name Varchar(20))
insert into Client values
(1, 'Client A'),
(2, 'Client B'),
(3, 'Client C')
create table Opportunity
(Id int, ClientId int, Value int)
insert into Opportunity values
(10, 1, 1000 ),
(11, 1, 3000 ),
(12, 2, 1500 ),
(13, 3, 2000 )
Select Client.Id, Client.Name, sum(Value) as Value
from Client
inner join Opportunity on Client.Id = Opportunity.ClientId
group by Client.Id, Client.Name
Output
Id Name Value
----------------------
1 Client A 4000
2 Client B 1500
3 Client C 2000
To create a view you can use the following create view example.
Syntax:
Create View <ViewName>
as
<View Query>
Example:
create view MyView
as
Select Client.Id, Client.Name, sum(Value) as Value
from Client
inner join Opportunity on Client.Id = Opportunity.ClientId
group by Client.Id, Client.Name
Selecting the result from a view as created above.
select * from MyView
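The inner join above returns only clients that have at least one opportunity, while the question also asks to filter on a null OppValue. The following is a minimal sketch of that case, using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made so the example is self-contained). The view name ClientOpp, the helper clients_where, and the extra row 'Client D' are hypothetical and not part of the original thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Client (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Opportunity (Id INTEGER PRIMARY KEY, ClientId INTEGER, Value INTEGER);
-- Index covering the per-client sum, as suggested earlier in the thread.
CREATE INDEX IX_Opportunity_Client ON Opportunity (ClientId, Value);

-- 'Client D' is a hypothetical client with no opportunities.
INSERT INTO Client VALUES (1, 'Client A'), (2, 'Client B'), (3, 'Client C'), (4, 'Client D');
INSERT INTO Opportunity VALUES (10, 1, 1000), (11, 1, 3000), (12, 2, 1500), (13, 3, 2000);

-- LEFT JOIN keeps clients without opportunities; their OppValue is NULL.
CREATE VIEW ClientOpp AS
SELECT c.Id, c.Name, SUM(o.Value) AS OppValue
FROM Client c
LEFT JOIN Opportunity o ON o.ClientId = c.Id
GROUP BY c.Id, c.Name;
""")


def clients_where(condition):
    # `condition` is a hard-coded, trusted predicate in this demo; real code
    # should build it from validated operators and bound parameters.
    sql = f"SELECT Id, Name, OppValue FROM ClientOpp WHERE {condition} ORDER BY Id"
    return conn.execute(sql).fetchall()


greater = clients_where("OppValue > 1500")
less = clients_where("OppValue < 2000")
missing = clients_where("OppValue IS NULL")
print(greater, less, missing)
```

Because the filter is applied to the view, the application code stays the same whether the sum is computed on the fly or later replaced by a stored column.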

HQL, insert two rows if a condition is met

I have the following table called table_persons in Hive:
+--------+------+------------+
| people | type | date |
+--------+------+------------+
| lisa | bot | 19-04-2022 |
| wayne | per | 19-04-2022 |
+--------+------+------------+
If type is "bot", I have to add two rows to the table d1_info; if type is "per", I only have to add one row, so that the result is the following:
+---------+------+------------+
| db_type | info | date |
+---------+------+------------+
| x_bot | x | 19-04-2022 |
| x_bnt | x | 19-04-2022 |
| x_per | b | 19-04-2022 |
+---------+------+------------+
How can I add two rows if this condition is met?
With a CASE WHEN, maybe?
You may try using a UNION ALL to duplicate the rows of type bot. The following example unions a first query, which selects all records, with a second query, which selects only those of type bot.
Edit
In response to the edited question, I have added a flag column named original (storing 1 or 0) to differentiate each original row from its duplicate:
SELECT
p1.*,
1 as original
FROM
table_persons p1
UNION ALL
SELECT
p1.*,
0 as original
FROM
table_persons p1
WHERE p1.type='bot'
You may then insert the result into your other table d1_info, using the above query as a subquery or CTE and applying the desired transformations with CASE expressions, e.g.:
INSERT INTO d1_info
(`db_type`, `info`, `date`)
WITH merged_data AS (
SELECT
p1.*,
1 as original
FROM
table_persons p1
UNION ALL
SELECT
p1.*,
0 as original
FROM
table_persons p1
WHERE p1.type='bot'
)
SELECT
CONCAT('x_',CASE
WHEN m1.type='per' THEN m1.type
WHEN m1.original=1 AND m1.type='bot' THEN m1.type
ELSE 'bnt'
END) as db_type,
CASE
WHEN m1.type='per' THEN 'b'
ELSE 'x'
END as info,
m1.date
FROM
merged_data m1
ORDER BY m1.people,m1.date;
See working demo db fiddle here
I think what you want is to create a new table that captures your logic. This would simplify your query and make it so you could easily add new types without having to edit logic of a case statement. It may also make it cleaner to view your logic later.
CREATE TABLE table_persons (
`people` VARCHAR(5),
`type` VARCHAR(3),
`date` VARCHAR(10)
);
INSERT INTO table_persons
VALUES
('lisa', 'bot', '19-04-2022'),
('wayne', 'per', '19-04-2022');
CREATE TABLE info (
`type` VARCHAR(5),
`db_type` VARCHAR(5),
`info` VARCHAR(1)
);
insert into info
values
('bot', 'x_bot', 'x'),
('bot', 'x_bnt', 'x'),
('per','x_per','b');
and then you can easily do a join:
select
info.db_type,
info.info,
persons.date date
from
table_persons persons inner join info
on
info.type = persons.type
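As a rough check of the mapping-table idea, the sketch below runs the same join as an INSERT ... SELECT into d1_info. It uses Python's built-in sqlite3 module rather than Hive (an assumption made only to keep the example runnable), and the d1_info column types are guesses based on the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_persons (people TEXT, type TEXT, date TEXT);
INSERT INTO table_persons VALUES
    ('lisa', 'bot', '19-04-2022'),
    ('wayne', 'per', '19-04-2022');

-- Mapping table: one row per output row that a given type should produce.
CREATE TABLE info (type TEXT, db_type TEXT, info TEXT);
INSERT INTO info VALUES
    ('bot', 'x_bot', 'x'),
    ('bot', 'x_bnt', 'x'),
    ('per', 'x_per', 'b');

-- Target table; column types are assumed from the question's expected output.
CREATE TABLE d1_info (db_type TEXT, info TEXT, date TEXT);

-- A 'bot' person matches two mapping rows, a 'per' person matches one.
INSERT INTO d1_info (db_type, info, date)
SELECT i.db_type, i.info, p.date
FROM table_persons p
JOIN info i ON i.type = p.type;
""")

rows = conn.execute(
    "SELECT db_type, info, date FROM d1_info ORDER BY db_type"
).fetchall()
print(rows)
```

Adding a new type then only requires new rows in the mapping table, not a change to the query.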

SQL Server avoid repeat same joins

I'm writing the query below, in which I repeat the same joins multiple times. Is there a better way to do it? (SQL Server on Azure)
Ex.
Table: [Customer]
[Id_Customer] | [CustomerName]
1 | Tomy
...
Table: [Store]
[Id_Store] | [StoreName]
1 | SuperMarket
2 | BestPrice
...
Table: [SalesFrutes]
[Id_SalesFrutes] | [FruteName] | [Fk_Id_Customer] | [Fk_Id_Store]
1 | Orange | 1 | 1
...
Table: [SalesVegetable]
[Id_SalesVegetable] | [VegetableName] | [Fk_Id_Customer] | [Fk_Id_Store]
1 | Pea | 1 | 2
...
Select * From [Customer] as C
left join [SalesFrutes] as SF on SF.[Fk_Id_Customer] = C.[Id_Customer]
left join [SalesVegetable] as SV on SV.[Fk_Id_Customer] = C.[Id_Customer]
left join [Store] as S1 on S1.[Id_Store] = SF.[Fk_Id_Store]
left join [Store] as S2 on S2.[Id_Store] = SV.[Fk_Id_Store]
In my real case, I have many [Sales...] tables to join with [Customer], and many other tables similar to [Store] to join to each [Sales...] table, so the number of repeated joins grows quickly. Is there a better way to do it?
Bonus question: I would also like FruteName, VegetableName, etc. to appear under one common column, alongside StoreName and the name of the source Sales table.
The Expected Result is:
[CustomerName] | [FoodName] | [SalesTableName] | [StoreName]
Tomy | Orange | SalesFrute | SuperMarket
Tomy | Pea | SalesVegetable | BestPrice
...
Thank you!!
Based on the information provided, I would suggest the query below, which uses a CTE to "fix" the data model and make the query easier to write.
Since you say your real-world scenario differs from the information provided, it might not work for you as is. It could still be applicable if the tables share, say, 80% of their columns: you can use placeholder/NULL values where needed to union the data sets and still minimise the number of joins, e.g. to your Store table.
with allSales as (
select Id_SalesFrutes as Id, FruteName as FoodName, 'Fruit' as SaleType, Fk_Id_Customer as Id_Customer, Fk_Id_Store as Id_Store
from SalesFrutes
union all
select Id_SalesVegetable, VegetableName, 'Vegetable', Fk_Id_Customer, Fk_Id_Store
from SalesVegetable
-- union all ... etc. for the remaining Sales tables
)
select c.CustomerName, s.FoodName, s.SaleType, st.StoreName
from Customer c
join allSales s on s.Id_customer=c.Id_customer
join Store st on st.Id_Store=s.Id_Store
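The UNION ALL approach can be tried end to end with the sample rows from the question. This is a minimal sketch using Python's built-in sqlite3 module in place of Azure SQL (an assumption for the sake of a runnable example). It labels each branch with the source table name to match the SalesTableName column of the expected result, which assumes that column is meant to hold the table name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Id_Customer INTEGER, CustomerName TEXT);
CREATE TABLE Store (Id_Store INTEGER, StoreName TEXT);
CREATE TABLE SalesFrutes (Id_SalesFrutes INTEGER, FruteName TEXT,
                          Fk_Id_Customer INTEGER, Fk_Id_Store INTEGER);
CREATE TABLE SalesVegetable (Id_SalesVegetable INTEGER, VegetableName TEXT,
                             Fk_Id_Customer INTEGER, Fk_Id_Store INTEGER);

INSERT INTO Customer VALUES (1, 'Tomy');
INSERT INTO Store VALUES (1, 'SuperMarket'), (2, 'BestPrice');
INSERT INTO SalesFrutes VALUES (1, 'Orange', 1, 1);
INSERT INTO SalesVegetable VALUES (1, 'Pea', 1, 2);
""")

# Each Sales table is normalised to the same four columns, so Customer and
# Store are each joined once, however many Sales tables are unioned.
query = """
WITH allSales AS (
    SELECT FruteName AS FoodName, 'SalesFrutes' AS SalesTableName,
           Fk_Id_Customer AS Id_Customer, Fk_Id_Store AS Id_Store
    FROM SalesFrutes
    UNION ALL
    SELECT VegetableName, 'SalesVegetable', Fk_Id_Customer, Fk_Id_Store
    FROM SalesVegetable
)
SELECT c.CustomerName, s.FoodName, s.SalesTableName, st.StoreName
FROM Customer c
JOIN allSales s ON s.Id_Customer = c.Id_Customer
JOIN Store st ON st.Id_Store = s.Id_Store
ORDER BY s.FoodName
"""

result = conn.execute(query).fetchall()
print(result)
```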

Recursive Function appropriate?

Hi guys, I'm wondering whether you could help me with a recursive query in SQL, or tell me whether a recursive query is even the right choice.
Say I have columns like so:
ID | CUS | CASHIERID | RECEIPTID | PAYMENTNUM | ORIGINALRECEIPT
Now assume there is data like so:
+----------+--------+-------------+-------------+--------------+------------------+
| ID | CUS | CASHIERID | RECEIPTID | PAYMENTNUM | ORIGINALRECEIPT |
+----------+--------+-------------+-------------+--------------+------------------+
| 1 | jeff | 2 | 123 | 00005 | NULL |
| 4 | jeff | 2 | 548 | 00005 | 123 |
| 16 | jeff | 2 | 897 | 00005 | 123 |
| 151 | jeff | 2 | 1095 | 00005 | 123 |
+----------+--------+-------------+-------------+--------------+------------------+
Now say the database is huge and there can be any number of related receipts. As we see above, the row with ID 1 is the original and all the others are related to it (refunds or similar). Suppose I am given the RECEIPTID of any one of these rows. What is the best route to get all of the parent/child rows? My first thought is a simple IF/ELSE: if ORIGINALRECEIPT is not empty, filter with a WHERE clause on whatever it contains. But for the sake of argument, could a recursive query of some sort take any RECEIPTID and return all 4 records?
EDIT
Hi guys, a bit of a change. I got a recursive query working, but the database is huge. The user inputs a receipt ID, and the recursive query then finds all reissued (newer) receipts related to it by following the 'prevRecep' column, which holds the ID of the preceding receipt, so the receipts form a chain as mentioned in the comments. It works without problems on the small test database, but on the huge database it is extremely slow: it has been running for 40 minutes and still has not finished. There is an index on cus, cashierid, receiptid, but unfortunately for now I can't have an index on any other column. I know that already slows the query down a lot, because it uses the prevRecep column. Is there any way I can speed it up, or a better approach? Below is the recursive query:
with cte as (
select *
from receipts
where cus = 'jeff' and cashierid = '2' and receiptid = '548'
union all
select cur.*
from receipts cur
inner join cte on cur.prevRecep = cte.receiptid
)
select * from cte
Yes, a recursive query should be fine:
declare @ReceiptId int = 123;
with cte as (
-- Anchor: the parent rows
select *
from Receipts
where ReceiptId = @ReceiptId and OriginalReceipt is null
union all
-- Recursive part: the children. There could be multiple levels: parent, child, subchild, ...
select Receipts.*
from Receipts
inner join cte on cte.ReceiptId = Receipts.OriginalReceipt
)
select * from cte;
By the way, if your parent-child relations don't have more than one level, then the query doesn't need to be recursive; a simple UNION ALL is enough:
declare @ReceiptId int = 123;
select *
from Receipts
where ReceiptId = @ReceiptId
union all
select Receipts.*
from Receipts
where OriginalReceipt = @ReceiptId
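Both queries above start from the original receipt, whereas the question asks for the whole group given any of its receipt IDs. One way to sketch that is to first walk up to the row whose OriginalReceipt is NULL and then walk down from it. The example below uses Python's built-in sqlite3 module instead of SQL Server (an assumption made for a self-contained demo), assumes ReceiptId is unique, and the names up, down and related_receipts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Receipts (Id INTEGER, Cus TEXT, CashierId INTEGER,
                       ReceiptId INTEGER, PaymentNum TEXT, OriginalReceipt INTEGER);
INSERT INTO Receipts VALUES
    (1,   'jeff', 2, 123,  '00005', NULL),
    (4,   'jeff', 2, 548,  '00005', 123),
    (16,  'jeff', 2, 897,  '00005', 123),
    (151, 'jeff', 2, 1095, '00005', 123);
""")

FAMILY_SQL = """
WITH RECURSIVE
-- Step 1: climb from the given receipt to the root (OriginalReceipt IS NULL).
up(ReceiptId, OriginalReceipt) AS (
    SELECT ReceiptId, OriginalReceipt FROM Receipts WHERE ReceiptId = :rid
    UNION ALL
    SELECT r.ReceiptId, r.OriginalReceipt
    FROM Receipts r
    JOIN up ON r.ReceiptId = up.OriginalReceipt
),
-- Step 2: descend from the root to every related receipt.
down(ReceiptId) AS (
    SELECT ReceiptId FROM up WHERE OriginalReceipt IS NULL
    UNION ALL
    SELECT r.ReceiptId
    FROM Receipts r
    JOIN down ON r.OriginalReceipt = down.ReceiptId
)
SELECT ReceiptId FROM down ORDER BY ReceiptId
"""


def related_receipts(receipt_id):
    """Return the ReceiptIds of the whole group that contains receipt_id."""
    return [row[0] for row in conn.execute(FAMILY_SQL, {"rid": receipt_id})]


print(related_receipts(548))
```

With the sample data, any of the four receipt IDs yields the same group of four receipts; an unknown ID yields an empty list.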

SQL query filter list within a list

Consider a db with 2 tables like below. CompanyId is a foreign key in Department table pointing to Company. Company names are unique but the department names are not.
Table : Department
+-------+-----------+-----------+-------------------+
| id | name | CompanyId | phone |
+-------+-----------+-----------+-------------------+
| 1 | Sales | 1 | 214-444-1934 |
| 2 | R&D | 1 | 555-111-1834 |
| 3 | Sales | 2 | 214-222-1734 |
| 4 | Finance | 2 | 817-333-1634 |
| 5 | Sales | 3 | 214-555-1434 |
+-------+-----------+-----------+-------------------+
Table : Company
+-------+-----------+
| id | name |
+-------+-----------+
| 1 | Best1 |
| 2 | NewTec |
| 3 | JJA |
+-------+-----------+
I have a filter like the one below. When the department name list is null (empty), all departments of that company should be included in the result; when there is a list, only the listed departments should be included.
[ {
companyName: "Best1",
departmentName: ["Sales", "R&D"]
},
{
companyName: "NewTec",
departmentName: ["Finance"]
} ,
{
companyName: "JJA",
departmentName: null
}
]
Note: The filter is dynamic (a request to an API endpoint) and may include thousands of companies and departments.
I want a SQL query that returns all department ids that fit the criteria. For this example the result would be "1,2,4,5" (all department ids except 3, the id of NewTec's Sales department).
I'm looking for an efficient SQL and/or LINQ query to return the result.
I could loop through the companies and filter the departments for each one, but that would mean one round trip to the database per company when using an ORM. Is there a better way to handle this case?
Here is the SQL query you want:
SELECT
d.id
FROM Department d
INNER JOIN Company c
ON d.CompanyId = c.id
WHERE
(c.name = 'Best1' AND d.name IN ('Sales', 'R&D')) OR
(c.name = 'NewTec' AND d.name = 'Finance') OR
c.name = 'JJA';
Demo
You want to deal with a variable number of conditions. There are mainly two ways to solve this:
Build the query string dynamically from your criteria.
Put the criteria in a separate table and query against that table.
With a filter table as such:
COMPANY_NAME | DEPARTMENT_NAME
-------------+----------------
Best1 | Sales
Best1 | R&D
NewTec | Finance
JJA | (null)
The unvarying (!) query would be:
SELECT *
FROM Department d
INNER JOIN Company c ON d.CompanyId = c.id
WHERE EXISTS
(
SELECT *
FROM Filter f
WHERE f.company_name = c.name
AND (f.department_name = d.name OR f.department_name IS NULL)
);
Here is a demo: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=49ff15426776536acc6f5bd7f88aaf8f (I've hijacked Tim's demo for this :-)
And here is an idea how to combine the two approaches mentioned: Make the filter table a temporary view, i.e. put it in a WITH clause. But in order to keep the query unvarying, you fill that WITH clause from a stored procedure which does the dynamic part. The stored procedure would read an XML containing the criteria and select rows from it. Thus you may be able to have your ORM call this procedure to get the results you are after.
Here is a thread explaining how to build a query in a stored procedure taking input from an XML: Creating a query from XML input parameter in SQL Server stored procedure and verifying output
The query would then look something like:
WITH Filter AS (SELECT company_name, department_name FROM dbm.GetMyFilterDataFromXml(@Xml))
SELECT *
FROM Department d
INNER JOIN Company c ON d.CompanyId = c.id
WHERE EXISTS
(
SELECT *
FROM Filter f
WHERE f.company_name = c.name
AND (f.department_name = d.name OR f.department_name IS NULL)
);
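The filter-table idea can also be driven directly from the API request, without XML. The sketch below flattens the criteria into (company, department) rows, loads them into a temporary table, and runs the unvarying EXISTS query once. It uses Python's built-in sqlite3 module in place of SQL Server (an assumption for a runnable demo); the variable criteria and the temporary table layout are illustrative, and an empty department list is treated like null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Company (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE Department (id INTEGER PRIMARY KEY, name TEXT,
                         CompanyId INTEGER REFERENCES Company(id), phone TEXT);
INSERT INTO Company VALUES (1, 'Best1'), (2, 'NewTec'), (3, 'JJA');
INSERT INTO Department VALUES
    (1, 'Sales',   1, '214-444-1934'),
    (2, 'R&D',     1, '555-111-1834'),
    (3, 'Sales',   2, '214-222-1734'),
    (4, 'Finance', 2, '817-333-1634'),
    (5, 'Sales',   3, '214-555-1434');
CREATE TEMP TABLE Filter (company_name TEXT, department_name TEXT);
""")

# The request payload from the question, as Python data.
criteria = [
    {"companyName": "Best1", "departmentName": ["Sales", "R&D"]},
    {"companyName": "NewTec", "departmentName": ["Finance"]},
    {"companyName": "JJA", "departmentName": None},
]

# One filter row per listed department; a NULL department means "all".
filter_rows = []
for item in criteria:
    departments = item["departmentName"] or [None]
    filter_rows.extend((item["companyName"], dep) for dep in departments)
conn.executemany("INSERT INTO Filter VALUES (?, ?)", filter_rows)

department_ids = [row[0] for row in conn.execute("""
    SELECT d.id
    FROM Department d
    JOIN Company c ON d.CompanyId = c.id
    WHERE EXISTS (
        SELECT 1
        FROM Filter f
        WHERE f.company_name = c.name
          AND (f.department_name = d.name OR f.department_name IS NULL)
    )
    ORDER BY d.id
""")]
print(department_ids)
```

The criteria travel as bound parameters, so the query text stays constant however many companies the request contains.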

Interview question: SQL recursion in one statement

Someone I know went to an interview and was given the following problem to solve. I've thought about it for a few hours and believe that it's not possible to do without using some database-specific extensions or features from recent standards that don't have wide support yet.
I don't remember the story behind what is being represented, but it's not relevant. In simple terms, you're trying to represent chains of unique numbers:
chain 1: 1 -> 2 -> 3
chain 2: 42 -> 78
chain 3: 4
chain 4: 7 -> 8 -> 9
...
This information is already stored for you in the following table structure:
id | parent
---+-------
1 | NULL
2 | 1
3 | 2
42 | NULL
78 | 42
4 | NULL
7 | NULL
8 | 7
9 | 8
There could be millions of such chains and each chain can have an unlimited number of entries. The goal is to create a second table that would contain the exact same information, but with a third column that contains the starting point of the chain:
id | parent | start
---+--------+------
1 | NULL | 1
2 | 1 | 1
3 | 2 | 1
42 | NULL | 42
78 | 42 | 42
4 | NULL | 4
7 | NULL | 7
8 | 7 | 7
9 | 8 | 7
The claim (made by the interviewers) is that this can be achieved with just 2 SQL queries. The hint they provide is to first populate the destination table (I'll call it dst) with the root elements, like so:
INSERT INTO dst SELECT id, parent, id FROM src WHERE parent IS NULL
This will give you the following content:
id | parent | start
---+--------+------
1 | NULL | 1
42 | NULL | 42
4 | NULL | 4
7 | NULL | 7
They say that you can now execute just one more query to get to the goal shown above.
In my opinion, you can do one of two things. Use recursion in the source table to get to the front of each chain, or continuously execute some version of SELECT start FROM dst WHERE dst.id = src.parent after each update to dst (i.e. can't cache the results).
I don't think either of these situations is supported by common databases like MySQL, PostgreSQL, SQLite, etc. I do know that in PostgreSQL 8.4 you can achieve recursion using WITH RECURSIVE query, and in Oracle you have START WITH and CONNECT BY clauses. The point is that these things are specific to database type and version.
Is there any way to achieve the desired result using regular SQL92 in just one query? The best I could do is fill-in the start column for the first child with the following (can also use a LEFT JOIN to achieve the same result):
INSERT INTO dst
SELECT s.id, s.parent,
(SELECT start FROM dst AS d WHERE d.id = s.parent) AS start
FROM src AS s
WHERE s.parent IS NOT NULL
If there was some way to re-execute the inner select statement after each insert into dst, then the problem would be solved.
It cannot be implemented in any static SQL that follows ANSI SQL-92.
But, as you said, it can easily be implemented with Oracle's CONNECT BY:
SELECT id,
parent,
CONNECT_BY_ROOT id
FROM src
START WITH parent IS NULL
CONNECT BY PRIOR id = parent
In SQL Server you would use a Common Table Expression (CTE).
To replicate the stored data I've created a temporary table
-- Create a temporary table
CREATE TABLE #SourceData
(
ID INT
, Parent INT
)
-- Setup data (ID, Parent)
INSERT INTO #SourceData VALUES (1, NULL);
INSERT INTO #SourceData VALUES (2, 1);
INSERT INTO #SourceData VALUES (3, 2);
INSERT INTO #SourceData VALUES (42, NULL);
INSERT INTO #SourceData VALUES (78, 42);
INSERT INTO #SourceData VALUES (4, NULL);
INSERT INTO #SourceData VALUES (7, NULL);
INSERT INTO #SourceData VALUES (8, 7);
INSERT INTO #SourceData VALUES (9, 8);
Then I create the CTE to compile the data result:
-- Perform CTE
WITH RecursiveData (ID, Parent, Start) AS
(
-- Base query
SELECT ID, Parent, ID AS Start
FROM #SourceData
WHERE Parent IS NULL
UNION ALL
-- Recursive query
SELECT s.ID, s.Parent, rd.Start
FROM #SourceData AS s
INNER JOIN RecursiveData AS rd ON s.Parent = rd.ID
)
SELECT * FROM RecursiveData
Which will output the following (row order may vary):
id | parent | start
---+--------+------
1 | NULL | 1
42 | NULL | 42
4 | NULL | 4
7 | NULL | 7
2 | 1 | 1
78 | 42 | 42
8 | 7 | 7
3 | 2 | 1
9 | 8 | 7
Then I clean up :)
-- Clean up
DROP TABLE #SourceData
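For comparison, the same recursive-CTE idea can populate dst in a single statement on an engine that supports WITH RECURSIVE. This is a minimal sketch using Python's built-in sqlite3 module (an assumption; it is not the interviewers' intended answer and is not SQL-92). The table names src and dst follow the question, and the CTE name chain is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src (id INTEGER PRIMARY KEY, parent INTEGER);
INSERT INTO src VALUES
    (1, NULL), (2, 1), (3, 2),
    (42, NULL), (78, 42),
    (4, NULL),
    (7, NULL), (8, 7), (9, 8);

CREATE TABLE dst (id INTEGER PRIMARY KEY, parent INTEGER, start INTEGER);

-- Anchor rows are the chain heads; each recursive step attaches the next
-- link and carries the head's id along as `start`.
WITH RECURSIVE chain(id, parent, start) AS (
    SELECT id, parent, id FROM src WHERE parent IS NULL
    UNION ALL
    SELECT s.id, s.parent, c.start
    FROM src s
    JOIN chain c ON s.parent = c.id
)
INSERT INTO dst SELECT id, parent, start FROM chain;
""")

dst_rows = conn.execute("SELECT id, parent, start FROM dst ORDER BY id").fetchall()
print(dst_rows)
```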
There is no recursive query support in ANSI-92, because it was added in ANSI-99. Oracle has had its own recursive query syntax (CONNECT BY) since v2. While Oracle supported the WITH clause since 9i, SQL Server is the first I knew of to support the recursive WITH/CTE syntax; Oracle didn't start until 11gR2. PostgreSQL added support in 8.4+. MySQL has had a request in for WITH support since 2006, and I highly doubt you'll see it in SQLite.
The example you gave is only two levels deep, so you could use:
INSERT INTO dst
SELECT a.id,
a.parent,
COALESCE(c.id, b.id) AS start
FROM SRC a
LEFT JOIN SRC b ON b.id = a.parent
LEFT JOIN SRC c ON c.id = b.parent
WHERE a.parent IS NOT NULL
You'd have to add a LEFT JOIN for the number of levels deep, and add them in proper sequence to the COALESCE function.