A running summary of totals in SQL Server

I have come up against an issue where I want to summarize results in a query.
Example as follows:
NAME | FRUIT | PRICE
-----+-------+------
JOHN | APPLE | 2
JOHN | APPLE | 2
JOHN | APPLE | 2
JOHN | APPLE | 2
DAVE | GRAPE | 3
DAVE | GRAPE | 3
DAVE | GRAPE | 3
This is my table at the moment. What I need, though, is a summary of each customer's business, like below:
NAME | FRUIT | PRICE
-----+-------+------
JOHN | APPLE | 2
JOHN | APPLE | 2
JOHN | APPLE | 2
JOHN | APPLE | 2
JOHN | TOTAL | 8
DAVE | GRAPE | 3
DAVE | GRAPE | 3
DAVE | GRAPE | 3
DAVE | TOTAL | 9
I have tried grouping the rows, but the result does not reflect what I want. If John had different fruits, each fruit would need to be summed before moving on to the next, and a total is needed for every value in the NAME field, as there will be a number of customers.
Any advice would be great.
EDIT
I have tried using ROLLUP, but I keep getting the totals of all values in a separate column, whereas I would like to see them formatted as shown above.

A solution with UNION and GROUP BY.
;WITH PricesWithTotals AS
(
SELECT
Name,
Fruit,
Price
FROM
YourTable
UNION ALL
SELECT
Name,
Fruit = 'TOTAL',
Price = SUM(Price)
FROM
YourTable
GROUP BY
Name
)
SELECT
Name,
Fruit,
Price
FROM
PricesWithTotals
ORDER BY
Name,
CASE WHEN Fruit <> 'TOTAL' THEN 1 ELSE 999 END ASC,
Fruit
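To check the ordering logic outside SQL Server, here is a minimal sketch that runs the same UNION ALL plus GROUP BY idea through Python's built-in sqlite3 module. SQLite is only a stand-in here (an assumption, not the asker's setup), and the literal is written as 'TOTAL' in both places so the comparison also holds under a case-sensitive collation.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Name TEXT, Fruit TEXT, Price INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [("JOHN", "APPLE", 2)] * 4 + [("DAVE", "GRAPE", 3)] * 3,
)

query = """
WITH PricesWithTotals AS (
    -- detail rows
    SELECT Name, Fruit, Price FROM YourTable
    UNION ALL
    -- one summary row per customer
    SELECT Name, 'TOTAL' AS Fruit, SUM(Price) AS Price
    FROM YourTable
    GROUP BY Name
)
SELECT Name, Fruit, Price
FROM PricesWithTotals
ORDER BY Name,
         CASE WHEN Fruit <> 'TOTAL' THEN 1 ELSE 999 END,  -- totals sort last
         Fruit
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Customers come back in alphabetical order (DAVE before JOHN), with each customer's TOTAL row directly after that customer's detail rows.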

This will get you a running total per customer per fruit:
create table #Sales([Name] varchar(20), Fruit varchar(20), Price int)
insert into #Sales([Name], Fruit, Price)
values
('JOHN','APPLE',2),
('JOHN','APPLE',2),
('JOHN','APPLE',2),
('JOHN','APPLE',2),
('DAVE','GRAPE',3),
('DAVE','GRAPE',3),
('DAVE','GRAPE',3)
Select c.*
, SUM(Price) OVER (PARTITION BY c.[Name], c.[Fruit] ORDER BY c.[Name], c.[Fruit] rows between unbounded preceding and current ROW ) as RunningTotal
from #Sales c
order by c.[Name], c.[Fruit] asc
drop table #Sales
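Because the sample rows are identical, the order in which the running total accumulates inside each partition is not pinned down by Name and Fruit alone. The sketch below adds a hypothetical SaleId column (not in the original table) to make the order deterministic, and runs the window function through Python's sqlite3 module as a stand-in for SQL Server. It assumes SQLite 3.25 or later, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SaleId is a hypothetical surrogate key added only to fix the row order.
conn.execute(
    "CREATE TABLE Sales (SaleId INTEGER PRIMARY KEY, Name TEXT, Fruit TEXT, Price INTEGER)"
)
conn.executemany(
    "INSERT INTO Sales (Name, Fruit, Price) VALUES (?, ?, ?)",
    [("JOHN", "APPLE", 2)] * 4 + [("DAVE", "GRAPE", 3)] * 3,
)

query = """
SELECT Name, Fruit, Price,
       SUM(Price) OVER (
           PARTITION BY Name, Fruit
           ORDER BY SaleId
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS RunningTotal
FROM Sales
ORDER BY Name, Fruit, SaleId
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data the RunningTotal column reads 3, 6, 9 for DAVE and then 2, 4, 6, 8 for JOHN.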

The solution to your problem is GROUPING SETS. However, your rows are not unique, so this adds a unique sequence number just so you can keep your original rows:
with numbered as (
select t.*, row_number() over (order by (select null)) as seqnum
from t
)
select name,
coalesce(fruit, 'Total') as fruit,
sum(price) as price
from numbered
group by grouping sets ( (name, fruit, seqnum), (name) )
order by name,
(case when fruit is not null then 1 else 2 end);

Related

Select multiple distinct rows

I have a table with following data.
id country serial other_column
1 us 123 1
2 us 456 1
3 gb 123 1
4 gb 456 1
5 jp 777 1
6 jp 888 1
7 us 123 2
8 us 456 3
9 gb 456 4
10 us 123 1
11 us 123 1
Is there a way to fetch 2 rows per unique country and unique serial?
For example, expecting following results from my query.
us,123,1 comes twice because there were 3 of the same kind and I want at most 2 rows per unique country and unique serial.
us,123,1
us,123,1
us,456,1
gb,123,1
gb,456,1
jp,777,1
jp,888,1
I can't use:
select distinct country, serial from my_table;
since I want 2 rows per distinct combination of country and serial. Please advise.
Assign DENSE_RANK and ROW_NUMBER to your data set using a CTE or subquery then return rows with a ROW_NUMBER less than 3 and a DENSE_RANK equal to 1. Also, since you did not specify the ORDERING, I've added a custom ORDER BY to handle your Country sorting to match your desired output above.
SELECT
ID,
Country,
Serial,
other_column
FROM
(SELECT
*,
ROW_NUMBER() OVER (PARTITION BY Country, Serial ORDER BY Country, Serial, other_column) AS RN,
DENSE_RANK() OVER (PARTITION BY Country, Serial ORDER BY Country, Serial, other_column) AS DR
FROM my_table) A
WHERE RN < 3 AND DR = 1
ORDER BY CASE WHEN Country = 'us' THEN 1
WHEN Country = 'gb' THEN 2
WHEN Country = 'jp' THEN 3
ELSE 4
END ASC, Country, Serial, other_column, ID
Result:
| ID | Country | Serial | other_column |
|----|---------|--------|---------------|
| 1 | us | 123 | 1 |
| 10 | us | 123 | 1 |
| 2 | us | 456 | 1 |
| 3 | gb | 123 | 1 |
| 4 | gb | 456 | 1 |
| 5 | jp | 777 | 1 |
| 6 | jp | 888 | 1 |
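As a quick check of the ROW_NUMBER plus DENSE_RANK filter, here is a rough sketch that replays the question's data through Python's sqlite3 module (a stand-in for the real database, assuming SQLite 3.25+ for window functions). One deliberate difference from the query above: id is added as a tie-breaker in the ROW_NUMBER ordering, which is my assumption, so that the choice among identical rows is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER, country TEXT, serial INTEGER, other_column INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "us", 123, 1), (2, "us", 456, 1), (3, "gb", 123, 1),
        (4, "gb", 456, 1), (5, "jp", 777, 1), (6, "jp", 888, 1),
        (7, "us", 123, 2), (8, "us", 456, 3), (9, "gb", 456, 4),
        (10, "us", 123, 1), (11, "us", 123, 1),
    ],
)

query = """
SELECT id, country, serial, other_column
FROM (
    SELECT *,
           -- numbers rows inside each (country, serial) group; id breaks ties
           ROW_NUMBER() OVER (PARTITION BY country, serial
                              ORDER BY other_column, id) AS rn,
           -- 1 for every row sharing the lowest other_column in the group
           DENSE_RANK() OVER (PARTITION BY country, serial
                              ORDER BY other_column) AS dr
    FROM my_table
) AS a
WHERE rn < 3 AND dr = 1
ORDER BY id
"""
rows = conn.execute(query).fetchall()
kept_ids = [r[0] for r in rows]
print(kept_ids)
```

The kept ids are the same seven rows as in the result table above, just listed in id order rather than by the custom country ordering.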
You didn't specify any logic for which rows to pick based on the other_column value, so I am assuming it doesn't matter to you.
Having said that, my code will always pick rows per unique country and serial in ascending order of other_column. For example, if you have 3 rows:
us, 123, 1
us, 123, 1
us, 123, 2
it will pick the first two, since other_column is sorted ASC. If you want it the other way around, change the ORDER BY inside the OVER clause to DESC.
If there are fewer than 3 rows for the country and serial, it will pick just 1 row.
For example:
us, 456, 1
us, 456, 1
us, 123, 1
us, 123, 2
us, 123, 3
would result in:
us, 456,1
us, 123,1
us, 123,2
with main as (
select
*,
count(*) over(partition by country, serial) as total_occurence,
row_number() over(partition by country, serial order by other_column) as rank_
from <table_name>
),
conditions as (
select *,
case when total_occurence < 3 and rank_ = 1 then true
when total_occurence >=3 and rank_ in (1,2) then true else
false end as is_relevant
from main
)
select * from conditions where is_relevant

pull all data only if there are distinct within a group in SQL

I have table with the following columns (product ID, product group code, product category)
I only want to pull the data if there are two or more unique product categories within each product group. For example, I have the following data.
Product id | product group code | product category
1 | a | Apple
2 | a | Orange
3 | a | Apple
4 | b | Toys
5 | b | Toys
I only want to see all the rows for each product group code that has more than one unique product category. The output I want to see is:
Product id product group code product category
1 | a | Apple
2 | a | Orange
3 | a | Apple
Thanks
"I only want to pull the data if there are two or more unique product categories within each product group."
This answer is based on the results you show, which are consistent with that sentence. The paragraph just before the results is unclear.
One method is exists:
select t.*
from t
where exists (select 1
from t t2
where t2.product_group_code = t.product_group_code and
t2.product_category <> t.product_category
);
How about two nested selects? One groups by product_group_code and keeps the groups whose COUNT(DISTINCT product_category) is greater than one; the other joins those "good groups" back to the original input.
WITH
-- your input as an in-line table
input(Product_id,product_group_code,product_category) AS (
SELECT 1,'a','Apple'
UNION ALL SELECT 2,'a','Orange'
UNION ALL SELECT 3,'a','Apple'
UNION ALL SELECT 4,'b','Toys'
UNION ALL SELECT 5,'b','Toys'
)
,
good_grp AS (
SELECT
product_group_code
FROM input
GROUP BY product_group_code
HAVING COUNT(DISTINCT product_category) >1
)
SELECT
i.*
FROM input i
JOIN good_grp USING(product_group_code)
ORDER BY 1
-- returning ...
Product_id | product_group_code | product_category
-----------+--------------------+------------------
1 | a | Apple
2 | a | Orange
3 | a | Apple
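The EXISTS approach and the HAVING COUNT(DISTINCT ...) approach should return the same rows on this data. Here is a small sketch that runs both through Python's sqlite3 module to compare them. The table name products is made up for the example, and SQLite is only a stand-in for whatever database you are using.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "products" is a hypothetical table name chosen for this sketch.
conn.execute(
    "CREATE TABLE products (product_id INTEGER, product_group_code TEXT, product_category TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [(1, "a", "Apple"), (2, "a", "Orange"), (3, "a", "Apple"),
     (4, "b", "Toys"), (5, "b", "Toys")],
)

# Variant 1: keep a row if its group contains some other category.
exists_query = """
SELECT t.*
FROM products t
WHERE EXISTS (SELECT 1
              FROM products t2
              WHERE t2.product_group_code = t.product_group_code
                AND t2.product_category <> t.product_category)
ORDER BY t.product_id
"""

# Variant 2: find groups with more than one distinct category, then join back.
having_query = """
WITH good_grp AS (
    SELECT product_group_code
    FROM products
    GROUP BY product_group_code
    HAVING COUNT(DISTINCT product_category) > 1
)
SELECT p.*
FROM products p
JOIN good_grp g ON g.product_group_code = p.product_group_code
ORDER BY p.product_id
"""

exists_rows = conn.execute(exists_query).fetchall()
having_rows = conn.execute(having_query).fetchall()
print(exists_rows)
print(having_rows)
```

One difference to keep in mind: if product_category can be NULL, the <> comparison in the EXISTS version never matches a NULL category and COUNT(DISTINCT ...) ignores NULLs, so both variants need extra handling in that case.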

Selecting specific distinct column in SQL

I am trying to create a select statement that does a specific distinct on one column, so that the same fruit does not appear more than once within each id. If a fruit has multiple rows under an id, I would like to use only 1 approved row in preference to a rotten one. If there is only 1 row for a fruit under that id, use it.
SELECT id, fruit, fruitweight, status
FROM myfruits
Raw data from current select
id | fruit | fruitweight | status
1 | apple | .2 | approved
1 | apple | .8 | approved
1 | apple | .1 | rotten
1 | orange | .5 | approved
2 | grape | .1 | rotten
2 | orange | .7 | approved
2 | orange | .5 | approved
How it should be formatted after constraint
id | fruit | fruitweight | status
1 | apple | .2 | approved
1 | orange | .5 | approved
2 | grape | .1 | rotten
2 | orange | .7 | approved
I can do something along the lines of select distinct id,fruit,fruitweight,status from myfruits,
but that will only take out the duplicates if all columns are the same.
CTE with aggregate and row_number.
create table #YourTable (id int, fruit varchar(64), fruitweight decimal(2,1), status varchar(64))
insert into #YourTable
values
(1,'apple',0.2,'approved'),
(1,'apple',0.8,'approved'),
(1,'apple',0.1,'rotten'),
(1,'orange',0.5,'approved'),
(2,'grape',0.1,'rotten'),
(2,'orange',0.7,'approved'),
(2,'orange',0.5,'approved')
;with cte as(
select
id
,fruit
,fruitweight = min(fruitweight)
,[status]
,RN = row_number() over (partition by id, fruit order by case when status = 'approved' then 1 else 2 end)
from
#YourTable
group by
id,fruit,status)
select
id
,fruit
,fruitweight
,status
from
cte
where RN = 1
Another method, without the aggregate, assuming you want the lowest fruitweight:
;with cte as(
select
id
,fruit
,fruitweight
,[status]
,RN = row_number() over (partition by id, fruit order by case when status = 'approved' then 1 else 2 end, fruitweight)
from
#YourTable)
select
id
,fruit
,fruitweight
,status
from
cte
where RN = 1
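For anyone who wants to try the ROW_NUMBER version without a SQL Server instance, here is a minimal sketch using Python's sqlite3 module as a stand-in (assuming SQLite 3.25+ for window functions). It uses the myfruits table name from the question and stores fruitweight as REAL, which is my simplification of decimal(2,1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myfruits (id INTEGER, fruit TEXT, fruitweight REAL, status TEXT)"
)
conn.executemany(
    "INSERT INTO myfruits VALUES (?, ?, ?, ?)",
    [
        (1, "apple", 0.2, "approved"), (1, "apple", 0.8, "approved"),
        (1, "apple", 0.1, "rotten"), (1, "orange", 0.5, "approved"),
        (2, "grape", 0.1, "rotten"), (2, "orange", 0.7, "approved"),
        (2, "orange", 0.5, "approved"),
    ],
)

query = """
WITH cte AS (
    SELECT id, fruit, fruitweight, status,
           ROW_NUMBER() OVER (
               PARTITION BY id, fruit
               -- approved rows first, then the lightest weight
               ORDER BY CASE WHEN status = 'approved' THEN 1 ELSE 2 END,
                        fruitweight
           ) AS rn
    FROM myfruits
)
SELECT id, fruit, fruitweight, status
FROM cte
WHERE rn = 1
ORDER BY id, fruit
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that with an ascending fruitweight tie-break the orange row kept for id 2 is the .5 one, whereas the question's sample output shows .7. If the weight matters, the ORDER BY inside the OVER clause is the place to adjust it.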
Another option is using the WITH TIES clause.
Example
Select top 1 with ties *
From YourTable
Order By Row_Number() over (Partition By id,fruit order by status,fruitweight)
A shorter version of scsimon's solution without aggregates.
If you have SQL Server < 2012, you'll have to use case instead of iif.
select
id
,fruit
,fruitweight
,status
from
(
select
id
,fruit
,fruitweight
,status
,rownum = row_number() over(partition by id, fruit order by iif(status = 'approved', 0, 1), fruitweight desc)
from myfruits
) x
where rownum = 1
EDIT: I started writing before scsimon edited his post to include a version without aggregates...

Select all data with no duplicate data

I have some data in database:
Name | Country | Status
Mary | USA | Pending
Jane | Japan | Pending
Jane | Korea | Pending
Adrian | China | Pending
Peter | Singapore | Pending
Jack | Malaysia | Pending
Adrian | China | Updated
Jane | Japan | Updated
How can I use a SELECT query to select all the data without duplicates? (If duplicate rows exist, select only the one whose Status is Updated.)
Try:
SELECT Name, Country, MAX(Status) as Status FROM (
SELECT TOP 100 PERCENT *
FROM NameCountry
ORDER BY Name ASC, Country ASC, Status DESC
) G
GROUP BY G.Name, G.Country
ORDER BY G.Name, G.Country
From your comment, you seem to mean only data where the first two columns are duplicated. The easiest way, I think, is to use row_number(), which is available in most databases:
select name, country, status
from (select t.*,
row_number() over (partition by name, country
order by (case when status = 'Updated' then 0 else 1 end)
) as seqnum
from t
) t
where seqnum = 1
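Here is a small sketch of the row_number() approach on the question's data, run through Python's sqlite3 module as a stand-in database (assuming SQLite 3.25+ for window functions). The table name NameCountry is borrowed from the first answer. The CASE expression ranks 'Updated' first, since the question asks for the Updated row when duplicates exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NameCountry (name TEXT, country TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO NameCountry VALUES (?, ?, ?)",
    [
        ("Mary", "USA", "Pending"), ("Jane", "Japan", "Pending"),
        ("Jane", "Korea", "Pending"), ("Adrian", "China", "Pending"),
        ("Peter", "Singapore", "Pending"), ("Jack", "Malaysia", "Pending"),
        ("Adrian", "China", "Updated"), ("Jane", "Japan", "Updated"),
    ],
)

query = """
SELECT name, country, status
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY name, country
               -- 'Updated' sorts first, so it wins when a duplicate exists
               ORDER BY CASE WHEN status = 'Updated' THEN 0 ELSE 1 END
           ) AS seqnum
    FROM NameCountry t
) AS ranked
WHERE seqnum = 1
ORDER BY name, country
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unlike MAX(Status), this does not depend on 'Updated' happening to sort after 'Pending' alphabetically, so it should keep working if more status values are added later.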

Data Matching with SQL and assigning Identity ID's

How do I write a query that will match data and produce an identity for it?
For Example:
RecordID | Name
1 | John
2 | John
3 | Smith
4 | Smith
5 | Smith
6 | Carl
I want a query which will assign an identity after matching exactly on Name.
Expected Output:
RecordID | Name | ID
1 | John | 1X
2 | John | 1X
3 | Smith | 1Y
4 | Smith | 1Y
5 | Smith | 1Y
6 | Carl | 1Z
Note: The ID should be unique for every match. Also, it can be numbers or varchar.
Can somebody help me with this? The main thing is to assign the IDs.
Thanks.
How about this:
with temp as
(
select 1 as id,'John' as name
union
select 2,'John'
union
select 3,'Smith'
union
select 4,'Smith'
union
select 5,'Smith'
union
select 6,'Carl'
)
SELECT *, DENSE_RANK() OVER
(ORDER BY Name) as NewId
FROM TEMP
Order by id
The first part is for testing purposes only.
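To see what DENSE_RANK actually assigns here, this is a minimal sketch using Python's sqlite3 module as a stand-in database (assuming SQLite 3.25+ for window functions). The table name records and the alias MatchId are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "records" is a hypothetical table name for this sketch.
conn.execute("CREATE TABLE records (RecordID INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(1, "John"), (2, "John"), (3, "Smith"),
     (4, "Smith"), (5, "Smith"), (6, "Carl")],
)

query = """
SELECT RecordID, Name,
       -- same Name gets the same number; numbers have no gaps
       DENSE_RANK() OVER (ORDER BY Name) AS MatchId
FROM records
ORDER BY RecordID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The ids follow the alphabetical order of Name, so Carl gets 1, John gets 2 and Smith gets 3. They are recomputed on every run, which means inserting a new name can shift the existing ids. If the ids must stay stable, they would need to be stored rather than derived.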
Please try:
SELECT *,
Rank() over (order by Name ASC)
FROM table
This structure seems to work:
CREATE TABLE #Table
(
Department VARCHAR(100),
Name VARCHAR(100)
);
INSERT INTO #Table VALUES
('Sales','michaeljackson'),
('Sales','michaeljackson'),
('Sales','jim'),
('Sales','jim'),
('Sales','jill'),
('Sales','jill'),
('Sales','jill'),
('Sales','j');
WITH Cte_Rank AS
(
SELECT [Name],
rw = ROW_NUMBER() OVER (ORDER BY [Name])
FROM #Table
GROUP BY [Name]
)
SELECT a.Department,
a.Name,
b.rw
FROM #Table a
INNER JOIN Cte_Rank b
ON a.Name = b.Name;