SQL Counting Duplicates in a Column

I have been stuck on this problem for a while and have searched the net for an answer.
My problem is:
I have duplicates in one column. I want to count how many duplicates there are in that column and then divide a field by that count. I want to be able to do this for each record in the column as well.
Basically I want the script to behave like this:
Count number of duplicates -> divide field A by count of duplicates.
Sample data:
t1.Invoiceno | t2.Amount | t2.orderno
-------------------------------------
201412 200 P202
201412 200 P205
302142 500 P232
201412 300 P211
450402 250 P102
450402 250 P142
450402 250 P512
Desired Result:
Invoiceno | Amount | orderno| duplicates|amount_new
-------------------------------------------------
201412 200 P202 2 100
201412 200 P205 2 100
302142 500 P232 1 500
201552 300 P211 1 300
450402 1200 P102 3 400
450402 1200 P142 3 400
450402 1200 P512 3 400
I do not want to insert new columns into the table, I just want the results to show the two new columns.

Here is one way:
select A / dups.dups
from t cross join
(select count(*) as dups
from (select onecol
from t
group by onecol
having count(*) > 1
) o
) dups
EDIT:
Well, now that the problem is clarified to something more reasonable, you can use a similar approach to the above, but the dups subquery needs to be aggregated by invoice and amount:
select t.amount / dups.dups as new_amount
from t join
(select invoice, amount, count(*) as dups
from t
group by invoice, amount
) dups
on t.invoice = dups.invoice and t.amount = dups.amount;

Here is another way:
-- Table variables are declared with @ (a # prefix is for temp tables created with CREATE TABLE)
DECLARE @tempTable TABLE ( ID int , A int)
INSERT INTO @tempTable VALUES (1, 12)
INSERT INTO @tempTable VALUES (1, 12)
INSERT INTO @tempTable VALUES (2, 20)
INSERT INTO @tempTable VALUES (2, 24)
INSERT INTO @tempTable VALUES (2, 15)
INSERT INTO @tempTable VALUES (3, 10)
INSERT INTO @tempTable VALUES (5, 12)
-------------------------------------------
;WITH DupsCTE (ID, DuplicateCount) AS
(
SELECT ID, COUNT(*) AS DuplicateCount FROM @tempTable GROUP BY ID
)
-- A and DuplicateCount are both int, so the division below is integer division
SELECT t.ID, t.A,
c.DuplicateCount, t.A / c.DuplicateCount AS ModifiedA
FROM
@tempTable t
INNER JOIN DupsCTE c ON c.ID = t.ID
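A window function can produce both new columns without a join or a second pass over the table. This is only a minimal sketch, assuming a database that supports COUNT(*) OVER (...) and the VALUES row constructor (for example SQL Server 2008 or later). The CTE name invoices and its inline rows are stand-ins for the asker's joined t1/t2 tables, copied from the sample data in the question. Multiplying by 1.0 is there to avoid integer division.

```sql
-- Hypothetical stand-in for the asker's joined tables, using the question's sample rows
WITH invoices (invoiceno, amount, orderno) AS (
    SELECT *
    FROM (VALUES
        (201412, 200, 'P202'),
        (201412, 200, 'P205'),
        (302142, 500, 'P232'),
        (201412, 300, 'P211'),
        (450402, 250, 'P102'),
        (450402, 250, 'P142'),
        (450402, 250, 'P512')
    ) AS v (invoiceno, amount, orderno)
)
SELECT invoiceno,
       amount,
       orderno,
       -- number of rows sharing the same invoice and amount
       COUNT(*) OVER (PARTITION BY invoiceno, amount) AS duplicates,
       -- * 1.0 forces decimal division instead of integer division
       amount * 1.0 / COUNT(*) OVER (PARTITION BY invoiceno, amount) AS amount_new
FROM invoices
ORDER BY invoiceno, orderno;
```

Nothing is inserted or altered in the underlying tables; the two extra columns exist only in the result set. Partitioning by invoice and amount follows the EDIT in the first answer; partition by invoiceno alone if only the invoice number should define a duplicate.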

Related

Agg Functions while Partitioning Data in SQL

I have a table that looks like this:
store_id industry_id cust_id amount gender
1 100 1000 1.00 M
2 100 1000 2.05 M
3 100 1000 3.15 M
4 100 1000 4.00 M
5 100 2000 5.00 F
6 200 2000 5.20 F
7 200 5000 6.05 F
8 200 6000 7.10 F
Here's the code to create this table:
CREATE TABLE t1(
store_id int,
industry_id int,
cust_id int,
amount float,
gender char
);
INSERT INTO t1 VALUES(1,100,1000,1.00, 'M');
INSERT INTO t1 VALUES(2,100,1000,2.05, 'M');
INSERT INTO t1 VALUES(3,100,1000,3.15, 'M');
INSERT INTO t1 VALUES(4,100,1000,4.00, 'M');
INSERT INTO t1 VALUES(5,100,2000,5.00, 'F');
INSERT INTO t1 VALUES(6,200,2000,5.20, 'F');
INSERT INTO t1 VALUES(7,200,5000,6.05, 'F');
INSERT INTO t1 VALUES(8,200,6000,7.10, 'F');
The question I'm trying to answer is: What is the avg. transaction amount for the top 20% of customers by industry?
This should yield these results:
store_id industry_id avg_amt_top_20
1 100 4.80
2 100 4.80
3 100 4.80
4 100 4.80
5 100 4.80
6 200 7.10
7 200 7.10
8 200 7.10
Here's what I have so far:
SELECT
store_id, industry_id,
avg(CASE WHEN percentile>=0.80 THEN amount ELSE NULL END) OVER(PARTITION BY industry_id) as cust_avg
FROM(
SELECT store_id, industry_id, amount, cume_dist() OVER(
PARTITION BY industry_id
ORDER BY amount desc) AS percentile
FROM t1
) tmp
GROUP BY store_id, industry_id;
This fails on the GROUP BY (contains nonaggregated column 'amount'). What's the best way to do this?
What is the avg. transaction amount for the top 20% of customers by industry?
Based on this question, I don't see why store_id is in the results.
If I understand correctly, you need to aggregate to get the total by customer. Then you can use NTILE() to determine the top 20%. The final step is aggregating by industry:
SELECT industry_id, AVG(total)
FROM (SELECT cust_id, industry_id, SUM(amount) as total,
NTILE(5) OVER (PARTITION BY industry_id ORDER BY SUM(amount) DESC) as tile
FROM t1
GROUP BY cust_id, industry_id
) t
WHERE tile = 1
GROUP BY industry_id

Join query result in duplicated rows

-----------tblDListTest---------
id listid trackingcode
1 125 trc1
2 125 trc1
3 125 trc1
4 126 trc4
5 126 trc5
---------------------------------
---------tblTrcWeightTest----------
id weight trackingcode
1 20 trc1
2 30 trc1
3 40 trc1
4 50 trc4
5 70 trc5
I need to display each trackingcode with its weight.
In tblDListTest, there are 3 records for listid 125.
I want to display only those 3 records, each with its weight.
I am using this query:
set transaction isolation level read uncommitted
select DL.id, DL.listid, DL.trackingcode, tw.weight
from tblDListTest DL
inner join tblTrcWeightTest tw on DL.trackingcode = tw.trackingcode
where DL.listid = 125
My query result :
id listid trackingcode weight
1 125 trc1 20
1 125 trc1 30
1 125 trc1 40
2 125 trc1 20
2 125 trc1 30
2 125 trc1 40
3 125 trc1 20
3 125 trc1 30
3 125 trc1 40
But I want the following result:
id listid trackingcode weight
1 125 trc1 20
2 125 trc1 30
3 125 trc1 40
You need a unique key (any combination of fields that results in a unique value) in one of the tables.
In your example, trc1 appears 3 times in each table.
SQL doesn't know how to pair up these rows, so it returns the Cartesian product of the possible combinations.
If you can't use a unique value in the join, you can use a SELECT DISTINCT DL.id, DL.listid, DL.trackingcode, tw.weight ....
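To see where the extra rows come from, it can help to count how many times each join key occurs on each side. The sketch below assumes SQL Server 2008 or later (for the VALUES row constructor) and rebuilds the two tables from the question as CTEs named DL and tw; the helper CTEs dl_counts and tw_counts are names made up for this illustration.

```sql
-- Inline copies of the question's two tables
WITH DL (id, listid, trackingcode) AS (
    SELECT *
    FROM (VALUES
        (1, 125, 'trc1'),
        (2, 125, 'trc1'),
        (3, 125, 'trc1'),
        (4, 126, 'trc4'),
        (5, 126, 'trc5')
    ) AS v (id, listid, trackingcode)
),
tw (id, weight, trackingcode) AS (
    SELECT *
    FROM (VALUES
        (1, 20, 'trc1'),
        (2, 30, 'trc1'),
        (3, 40, 'trc1'),
        (4, 50, 'trc4'),
        (5, 70, 'trc5')
    ) AS v (id, weight, trackingcode)
),
-- How often each tracking code occurs on each side of the join
dl_counts AS (
    SELECT trackingcode, COUNT(*) AS dl_rows FROM DL GROUP BY trackingcode
),
tw_counts AS (
    SELECT trackingcode, COUNT(*) AS tw_rows FROM tw GROUP BY trackingcode
)
SELECT d.trackingcode,
       d.dl_rows,
       w.tw_rows,
       -- an inner join on trackingcode alone returns this many rows per code
       d.dl_rows * w.tw_rows AS joined_rows
FROM dl_counts d
INNER JOIN tw_counts w ON w.trackingcode = d.trackingcode
ORDER BY d.trackingcode;
```

For trc1 this reports 3 rows on each side and 9 joined rows, which matches the nine rows in the asker's result. Any code with more than one row on both sides will multiply in the same way.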
There are duplicates between your tables. You would want to see something like this:
;WITH DL (id, listid, trackingcode) AS (
SELECT CONVERT(int, id), listid, trackingcode FROM (
VALUES
('1','125','trc1'),
('2','125','trc1'),
('3','125','trc1'),
('4','126','trc4'),
('5','126','trc5')
) AS A (id, listid, trackingcode)
),
tw (id, weight, trackingcode) AS (
SELECT CONVERT(int, id), weight, trackingcode FROM (
VALUES
('1','20','trc1'),
('2','30','trc1'),
('3','40','trc1'),
('4','50','trc4'),
('5','70','trc5')
) AS A (id, weight, trackingcode)
)
SELECT DISTINCT DL.listid,
DL.trackingcode,
tw.weight
FROM DL
INNER JOIN tw ON DL.trackingcode = tw.trackingcode
WHERE DL.listid = 125
You can use row_number() to enumerate the values and then use that for the join:
select dl.id, dl.listid, dl.trackingcode, tw.weight
from (select dl.*, row_number() over (partition by trackingcode order by id) as seqnum
from tblDListTest dl
) dl inner join
(select tw.*, row_number() over (partition by trackingcode order by id) as seqnum
from tblTrcWeightTest tw
) tw
on dl.trackingcode = tw.trackingcode and dl.seqnum = tw.seqnum
where dl.listid = 125;
You can just use something like this.
-- Table variables are declared with @
DECLARE @tblDListTest table (
ID INT,
listid INT,
trackingcode VARCHAR(20)
)
DECLARE @tblTrcWeightTest table (
ID INT,
weight INT,
trackingcode VARCHAR(20)
)
INSERT INTO @tblDListTest (ID,listid,trackingcode)
VALUES (1, 125, 'trc1'),
(2, 125, 'trc1'),
(3, 125, 'trc1'),
(4, 126, 'trc4'),
(5, 126, 'trc5')
INSERT INTO @tblTrcWeightTest (ID,weight,trackingcode)
VALUES (1, 20, 'trc1'),
(2, 30, 'trc1'),
(3, 40, 'trc1'),
(4, 50, 'trc4'),
(5, 70, 'trc5')
-- Joins on ID, which only works if the IDs line up between the two tables
SELECT A.ID, A.listid, A.trackingcode, B.weight
FROM @tblDListTest A
JOIN @tblTrcWeightTest B ON B.ID = A.ID
WHERE A.listid = 125
You can use a subquery:
select twt.id, tt.listid, twt.trackingcode, twt.weight
from tblTrcWeightTest twt cross apply (
select top 1 tdt.listid
from tblDListTest tdt
where tdt.trackingcode = twt.trackingcode
) tt
where twt.trackingcode = 'trc1';

Switch SQL result columns to rows and include a summary row beneath

I'd like the below columns to populate in the place of rows and then include a summary row beneath it:
table1
ID NAME Value Group
001 Bob 100 A
002 Don 200 A
003 Fay 300 B
Below is an example of the desired output:
GROUP NO SUM
Group A 2 300
Group B 1 300
Total 3 600
select coalesce('Group '+ [group], 'Total') [Group], count([group]) No, sum(value) SUM
from table1
group by [group] with rollup
http://sqlfiddle.com/#!6/655f4/3
Declare @YourTable Table ([ID] varchar(50),[NAME] varchar(50),[Value] int,[Group] varchar(50))
Insert Into @YourTable Values
('001','Bob',100,'A')
,('002','Don',200,'A')
,('003','Fay',300,'B')
Select [Group] = IsNull('Group '+[Group],'Total')
,[No] = count(*)
,[Sum] = sum(Value)
From @YourTable
Group By rollup([Group])
Returns
Group No Sum
Group A 2 300
Group B 1 300
Total 3 600
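Both queries above label the summary row by testing for NULL, which would also relabel a real row whose group is NULL. A possible alternative, sketched here under the assumption of SQL Server 2008 or later, is to test GROUPING() instead, which returns 1 only on the row that ROLLUP adds. The column names person_name, val and grp and the fourth row ('004', 'Ann', 50, NULL) are hypothetical, added only to show the difference; they are not part of the asker's data.

```sql
SELECT CASE WHEN GROUPING(grp) = 1
            THEN 'Total'                                -- row added by ROLLUP
            ELSE 'Group ' + COALESCE(grp, '(none)')     -- real rows, including a NULL group
       END       AS [Group],
       COUNT(*)  AS [No],
       SUM(val)  AS [Sum]
FROM (VALUES
        ('001', 'Bob', 100, 'A'),
        ('002', 'Don', 200, 'A'),
        ('003', 'Fay', 300, 'B'),
        ('004', 'Ann',  50, NULL)   -- hypothetical row with no group
     ) AS t (id, person_name, val, grp)
GROUP BY ROLLUP(grp)
ORDER BY GROUPING(grp), grp;   -- keeps the Total row last
```

With the original three rows the result is the same as the queries above; the difference only shows up when the grouping column itself can be NULL.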

How to select Remain values after subtract with one Fixed value

Need To select Data From One Table After Minus With One Value
This is the question I already asked, and its solution handles a single input value for the table. I now need the same thing with a separate input value for each category, and a separate output for each category.
For example (based on the previous question):
Table 1
SNo Amount categories
1 100 type1
2 500 type1
3 400 type1
4 100 type1
5 100 type2
6 200 type2
7 300 type2
8 500 type3
9 100 type3
and
values for type1 - 800
values for type2 - 200
values for type3 - 100
and the output needed is,
for type1:
800 - 100 (record 1) = 700
700 - 500 (record 2) = 200
200 - 400 (record 3) = -200
The output table therefore starts from record 3, with a balance of 200.
Table-Output
SNo Amount
1 200
2 100
That means that after subtracting 800 from the first table, the first 2 records are removed and the third record is left with a balance of 200.
The same operation is needed for the remaining types. How can I do this?
with T1 as
(
select t.*,
SUM(Amount) OVER (PARTITION BY [Type] ORDER BY [SNo])
-
CASE WHEN Type='Type1' then 800
WHEN Type='Type2' then 200
WHEN Type='Type3' then 100
END as Total
from t
)select Sno,Type,
CASE WHEN Amount>Total then Total
Else Amount
end as Amount
from T1 where Total>0
order by Sno
UPD: If types are not fixed then you should create a table for them, for example:
CREATE TABLE types
([Type] varchar(5), [Value] int);
insert into types
values
('type1',800),
('type2',200),
('type3',100);
and use the following query:
with T1 as
(
select t.*,
SUM(Amount) OVER (PARTITION BY t.[Type] ORDER BY [SNo])
-
ISNULL(types.Value,0) as Total
from t
left join types on (t.type=types.type)
)select Sno,Type,
CASE WHEN Amount>Total then Total
Else Amount
end as Amount
from T1 where Total>0
order by Sno
UPDATE: For MSSQL 2005 just replace SUM(Amount) OVER (PARTITION BY t.[Type] ORDER BY [SNo]) with (select SUM(Amount) from t as t1
where t1.Type=t.Type
and t1.SNo<=t.SNo)
with T1 as
(
select t.*,
(select SUM(Amount) from t as t1
where t1.Type=t.Type
and t1.SNo<=t.SNo)
-
ISNULL(types.Value,0) as Total
from t
left join types on (t.type=types.type)
)select Sno,Type,
CASE WHEN Amount>Total then Total
Else Amount
end as Amount
from T1 where Total>0
order by Sno

How to take column value count

Using SQL Server 2000
Table1
Column1
20
30
40
20
40
30
30
I want to take a count like this
20 - 2
30 - 3
40 - 2
If the value 20, 30 or 40 does not appear in the column, it should still be displayed with a count of zero: 20 - 0, 30 - 0 or 40 - 0.
For example
Column1
20
30
20
30
30
Expected output
20 - 2
30 - 3
40 - 0
The column will only ever contain 20, 30 or 40; no other values will appear.
How can I write this query?
select item,count (item) from table group by item
EDIT : ( after your edit)
CREATE TABLE #table1 ( numbers int )
insert into #table1 (numbers) select 20
insert into #table1 (numbers) select 30
insert into #table1 (numbers) select 40
SELECT [num]
FROM [DavidCard].[dbo].[sssssss]
select numbers,count (num) from #table1 LEFT JOIN [sssssss] ON #table1.numbers = [sssssss].num group by numbers
SQL Query 101:
SELECT Column1, COUNT(*)
FROM dbo.YourTable
GROUP BY Column1
ORDER BY Column1
Update: if you want to get a list of possible values, and their potential count (or 0) in another table, you need two tables, basically - one with all the possible values, one with the actual values - and a LEFT OUTER JOIN to put them together - something like:
SELECT
p.Column1, ISNULL(COUNT(t.Column1), 0)
FROM
(SELECT 20 AS 'Column1'
UNION
SELECT 30
UNION
SELECT 40) AS p
LEFT OUTER JOIN
dbo.YourTable t ON t.Column1 = p.Column1
GROUP BY
p.Column1
ORDER BY
p.Column1
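One detail in the query above is easy to miss: the count has to be taken on a column from the outer-joined table (COUNT(t.Column1)), not COUNT(*). A small sketch of the difference follows, with the actual values from the asker's second example written inline as a derived table instead of dbo.YourTable; the aliases matched_rows and joined_rows are made up for this illustration. It avoids the VALUES row constructor so that it should also run on older versions.

```sql
SELECT p.Column1,
       COUNT(t.Column1) AS matched_rows,  -- ignores the NULLs produced by the outer join
       COUNT(*)         AS joined_rows    -- counts the NULL-extended row as well
FROM (SELECT 20 AS Column1
      UNION ALL SELECT 30
      UNION ALL SELECT 40) AS p           -- all possible values
LEFT OUTER JOIN
     (SELECT 20 AS Column1
      UNION ALL SELECT 30
      UNION ALL SELECT 20
      UNION ALL SELECT 30
      UNION ALL SELECT 30) AS t           -- actual values from the second example
     ON t.Column1 = p.Column1
GROUP BY p.Column1
ORDER BY p.Column1;
```

For 20 and 30 both columns agree (2 and 3). For 40, matched_rows is 0 but joined_rows is 1, because the left join still returns one row for 40 with NULLs on the right-hand side. Since COUNT never returns NULL, the ISNULL wrapper in the query above is harmless but not required.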