sql query that will get a distinct type, brand, and model, but get a count of how many duplicates were found - sql-server-2005

I have a table "Competitor" and here are some of its columns:
Type | Brand | Model | Date | Resolution | etc.
The table will have duplicate Model entries (obviously with the same Brand, but possibly a different Type; there are two possible types, 'ProAV' and 'Disti'). I need to build a query that outputs a table like this:
Top (ProAV) | Top (Disti) | Last Occurrence | Brand | Model | Resolution | etc.
Basically I need a query that will get a distinct type, brand, and model, but get a count of how many duplicates were found and put that number in either Top (ProAV) or Top (Disti), whichever Type it has. I would need to pull the most recent (given Date) out of the duplicates, so that I can put its Date as the Last Occurrence field. I hope this makes sense, let me know if it doesn't.

SELECT SUM(CASE WHEN Type = 'ProAV' THEN 1 ELSE 0 END) AS TopProAV,
SUM(CASE WHEN Type = 'Disti' THEN 1 ELSE 0 END) AS TopDisti,
MAX(Date) AS LastOccurrence,
Brand, Model, Resolution
FROM Competitor
GROUP BY Brand, Model, Resolution
EDIT: Based on the comment, you could use a subquery or CTE to accomplish what you want. Something like:
WITH cteMaxDate AS (
    SELECT SUM(CASE WHEN Type = 'ProAV' THEN 1 ELSE 0 END) AS TopProAV,
           SUM(CASE WHEN Type = 'Disti' THEN 1 ELSE 0 END) AS TopDisti,
           MAX(Date) AS LastOccurrence,
           Brand, Model, Resolution
    FROM Competitor
    GROUP BY Brand, Model, Resolution
)
SELECT md.TopProAV, md.TopDisti,
       md.LastOccurrence,
       md.Brand, md.Model, md.Resolution,
       c.AdditionalColumn1, c.AdditionalColumn2
FROM cteMaxDate md
INNER JOIN Competitor c
    ON md.Brand = c.Brand
    AND md.Model = c.Model
    AND md.Resolution = c.Resolution
    AND md.LastOccurrence = c.Date
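
The grouped query and the join back to the most recent row can be tried end to end on a small in-memory database. This is only a sketch: it uses Python's sqlite3 module rather than SQL Server, and the sample rows and the Price column (standing in for the "additional columns") are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Price is a hypothetical extra column, not part of the original question.
conn.execute(
    "CREATE TABLE Competitor "
    "(Type TEXT, Brand TEXT, Model TEXT, Date TEXT, Resolution TEXT, Price INTEGER)"
)
conn.executemany(
    "INSERT INTO Competitor VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("ProAV", "Acme", "X1", "2011-01-05", "1080p", 900),
        ("Disti", "Acme", "X1", "2011-02-10", "1080p", 850),
        ("ProAV", "Acme", "X1", "2011-03-15", "1080p", 800),
        ("Disti", "Bolt", "Z9", "2011-01-20", "720p", 400),
    ],
)

query = """
WITH cteMaxDate AS (
    SELECT SUM(CASE WHEN Type = 'ProAV' THEN 1 ELSE 0 END) AS TopProAV,
           SUM(CASE WHEN Type = 'Disti' THEN 1 ELSE 0 END) AS TopDisti,
           MAX(Date) AS LastOccurrence,
           Brand, Model, Resolution
    FROM Competitor
    GROUP BY Brand, Model, Resolution
)
SELECT md.TopProAV, md.TopDisti, md.LastOccurrence,
       md.Brand, md.Model, md.Resolution, c.Price
FROM cteMaxDate md
INNER JOIN Competitor c
    ON md.Brand = c.Brand
   AND md.Model = c.Model
   AND md.Resolution = c.Resolution
   AND md.LastOccurrence = c.Date
ORDER BY md.Brand, md.Model
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Each Brand/Model/Resolution group comes back once, with its per-type counts and the extra column taken from the row that carries the latest Date. If two rows in a group share that latest Date, the join returns both.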

Do you have a limited number of Types? If so, you can solve your problem using PIVOT.
More specifically, given the table
Type Model
---- -----
A X
B X
C Y
A Z
NULL NULL
you run this query
Select Model, [A], [B], [C]
From
(select Model, Type
from dbo.Competitor) as SourceTable
PIVOT
(Count([Type]) for [Type] in ([A], [B], [C])) as PivotTable
to get
Model A B C
------ - - -
X 1 1 0
Y 0 0 1
Z 1 0 0

Related

How can I sum a row's value in my table based on a specific type?

My table looks like this. It's a table with an inventory of clothes.
Basically, a user can enter a type of clothing and a quantity.
When they do, a new row is added to the table with the date of the input.
Type 2 is for shoes and type 3 is for shirts.
What I'm trying to do is sum the quantity based on the type, like this:
So I tried this:
SELECT name, type, sum(quantity)
from Clothes
where type="2"
group by name
But it didn't work; it sums all the types of clothes. How can I do this?

Use case expressions to do conditional aggregation:
SELECT name,
SUM(case when type = 3 then quantity else 0 end) shirts,
SUM(case when type = 2 then quantity else 0 end) shoes
from Clothes
group by name
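
To see the conditional aggregation at work, here is a minimal sketch using Python's sqlite3 module. The table layout (name, type, quantity, date) and the sample rows are assumptions, since the question's screenshots are not available.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: type 2 = shoes, type 3 = shirts, as stated in the question.
conn.execute("CREATE TABLE Clothes (name TEXT, type INTEGER, quantity INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO Clothes VALUES (?, ?, ?, ?)",
    [
        ("Alice", 2, 3, "2020-01-01"),
        ("Alice", 3, 5, "2020-01-02"),
        ("Alice", 2, 1, "2020-01-03"),
        ("Bob", 3, 4, "2020-01-01"),
    ],
)

# Each SUM only counts the rows whose type matches; other rows contribute 0.
totals = conn.execute(
    """
    SELECT name,
           SUM(CASE WHEN type = 3 THEN quantity ELSE 0 END) AS shirts,
           SUM(CASE WHEN type = 2 THEN quantity ELSE 0 END) AS shoes
    FROM Clothes
    GROUP BY name
    ORDER BY name
    """
).fetchall()
print(totals)  # [('Alice', 5, 4), ('Bob', 4, 0)]
```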

You should group by the type too.
Doing this, you'll get a table with three columns:
the first with the name, the second with the type, and the third with the quantity.
SELECT name, type, sum(quantity)
from Clothes
group by name,type
Then format the data as you wish.
If instead you want the exact result straight from the query, you need to dig a little deeper and use a CASE expression inside the SUM function, contributing zero when the row is not of the selected type:
select name,
sum(case when type = 3 then quantity else 0 end) as Shirts,
sum(case when type = 2 then quantity else 0 end) as Shoes
from Clothes
group by name;

A solution using PIVOT achieves the same result, aggregating the quantities into one column per value of the type column:
SELECT [ProductName], [2] AS Shoes, [3] AS Shirts
FROM
(SELECT [ProductName], [ProductType], [Quantity] FROM [Inventory_Table])
AS DataSource
PIVOT
(SUM([Quantity]) FOR [ProductType] IN ([2], [3])) AS pvt_table
Note: For the above to work in SQL Server T-SQL I had to replace the [Name] and [Type] columns with other columns names.

Create function that returns table in SQL

I wanted to create a view with some logic (such as a for loop or if .. else), but since that's not supported in SQL,
I thought of creating a table function that takes no parameters and returns a table.
I have a table for orders as below
OrderId Destination Category Customer
----------------------------------------
6001 UK 5 Adam
6002 GER 3 Jack
And table for tracking orders as below
ID OrderID TrackingID
-----------------------
1 6001 1
2 6001 2
3 6002 2
And here are the types of tracking
ID Name
--------------
1 Processing
2 Shipped
3 Delivered
As you can see in the tracking order table, an order number may have more than one record, depending on how many tracking events occurred.
We have more than 25 tracking types that I didn't include here, which means one order can exist 25 times in the tracking order table.
With that being said, my requirement is to create a view as below, with the condition that an order must belong to category 5 or 3 (we have more than 15 categories).
And whenever I run the function it must return the updated information.
So, for example, when a new tracking event occurs and is inserted into the tracking order table, I want to run my function and see the update in the corresponding flag column (e.g. isDelivered).
I'm really confused about the best way to achieve this. I don't need the exact script; I just need to understand how to achieve it, as I'm not very familiar with SQL.

It could be done with a crosstab query using conditional aggregation. Something like this
select o.OrderID,
max(case when tt.[Name]='Processing' then 1 else 0 end) isPrepared,
max(case when tt.[Name]='Shipped' then 1 else 0 end) isShipped,
max(case when tt.[Name]='Delivered' then 1 else 0 end) isDelivered
from orders o
join tracking_orders tro on o.OrderID=tro.OrderID
join tracking_types tt on tro.TrackingID=tt.ID
where o.category in(3, 5)
group by o.OrderID;
[EDIT] To break out Category 3 orders, 3 additional columns were added to the cross tab.
select o.OrderID,
max(case when tt.[Name]='Processing' then 1 else 0 end) isPrepared,
max(case when tt.[Name]='Shipped' then 1 else 0 end) isShipped,
max(case when tt.[Name]='Delivered' then 1 else 0 end) isDelivered,
max(case when tt.[Name]='Processing' and o.category=3 then 1 else 0 end) isC3Prepared,
max(case when tt.[Name]='Shipped' and o.category=3 then 1 else 0 end) isC3Shipped,
max(case when tt.[Name]='Delivered' and o.category=3 then 1 else 0 end) isC3Delivered
from orders o
join tracking_orders tro on o.OrderID=tro.OrderID
join tracking_types tt on tro.TrackingID=tt.ID
where o.category in(3, 5)
group by o.OrderID;
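
On the requirement that the result always reflects new tracking rows: a plain view is re-evaluated every time it is queried, so a table function is not needed for that. The sketch below illustrates this with Python's sqlite3 module instead of SQL Server. The table names follow the answer above, while the view name order_status is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (OrderID INTEGER, Destination TEXT, Category INTEGER, Customer TEXT);
    CREATE TABLE tracking_orders (ID INTEGER, OrderID INTEGER, TrackingID INTEGER);
    CREATE TABLE tracking_types (ID INTEGER, Name TEXT);

    INSERT INTO orders VALUES (6001, 'UK', 5, 'Adam'), (6002, 'GER', 3, 'Jack');
    INSERT INTO tracking_orders VALUES (1, 6001, 1), (2, 6001, 2), (3, 6002, 2);
    INSERT INTO tracking_types VALUES (1, 'Processing'), (2, 'Shipped'), (3, 'Delivered');

    -- Hypothetical view name; one flag column per tracking type of interest.
    CREATE VIEW order_status AS
    SELECT o.OrderID,
           MAX(CASE WHEN tt.Name = 'Processing' THEN 1 ELSE 0 END) AS isPrepared,
           MAX(CASE WHEN tt.Name = 'Shipped'    THEN 1 ELSE 0 END) AS isShipped,
           MAX(CASE WHEN tt.Name = 'Delivered'  THEN 1 ELSE 0 END) AS isDelivered
    FROM orders o
    JOIN tracking_orders tro ON o.OrderID = tro.OrderID
    JOIN tracking_types tt ON tro.TrackingID = tt.ID
    WHERE o.Category IN (3, 5)
    GROUP BY o.OrderID;
    """
)


def order_flags(order_id):
    """Return (isPrepared, isShipped, isDelivered) for one order."""
    return conn.execute(
        "SELECT isPrepared, isShipped, isDelivered FROM order_status WHERE OrderID = ?",
        (order_id,),
    ).fetchone()


before = order_flags(6002)
# A new 'Delivered' tracking event arrives for order 6002.
conn.execute("INSERT INTO tracking_orders VALUES (4, 6002, 3)")
after = order_flags(6002)
print(before, after)  # (0, 1, 0) (0, 1, 1)
```

Note that the inner joins drop orders that have no tracking rows at all; a LEFT JOIN would keep them with all flags at 0.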

Selecting count by row combinations

I'm struggling with what at first sight appeared to be a simple SQL query :)
So I have the following table, which has three columns: PlayerId, Gender, Result (all of type integer).
What I'm trying to do is select distinct players of gender 2 (male) with the count of each result.
There are about 50 possible results, so the new table should have 51 columns:
|PlayerId | 1 | 2 | 3 | ... | 50 |
So I would like to see how many times each individual male (gender 2) player got a specific result.
*** In case the question is still not entirely clear: after each game I insert a row with the player ID, gender and result (from 1 to 50) the player achieved in that game. Now I'd like to see how many times each player achieved specific results.

If there are 50 results and you want them in columns, then you are talking about a pivot. I tend to do these with conditional aggregation:
select PlayerId,
       sum(case when result = 1 then 1 else 0 end) as result_01,
       sum(case when result = 2 then 1 else 0 end) as result_02,
       . . .
       sum(case when result = 50 then 1 else 0 end) as result_50
from t
group by PlayerId;
You can choose a particular gender if you like, with where gender = 2. But why not calculate all at the same time?
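
Typing 50 nearly identical expressions by hand is error-prone, so one option is to generate them. The following is a sketch in Python with sqlite3; the table name t comes from the answer above, and the sample rows are invented.

```python
import sqlite3

# Build the 50 conditional-aggregation columns instead of typing them by hand.
columns = ",\n       ".join(
    f"SUM(CASE WHEN Result = {r} THEN 1 ELSE 0 END) AS result_{r:02d}"
    for r in range(1, 51)
)
query = (
    f"SELECT PlayerId,\n       {columns}\n"
    "FROM t\nWHERE Gender = 2\nGROUP BY PlayerId\nORDER BY PlayerId"
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (PlayerId INTEGER, Gender INTEGER, Result INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 2, 1), (1, 2, 1), (1, 2, 50), (2, 1, 3), (3, 2, 2)],
)

rows = conn.execute(query).fetchall()
for row in rows:
    # Column 0 is PlayerId; column r holds the count for result r.
    print(row[0], row[1], row[2], row[50])
```

The generated text is ordinary static SQL, so it can also be pasted into a view or a stored query once it has been produced.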

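Try grouping by player and result, which returns one row per combination instead of one column per result:
select PlayerId, result, count(*)
from your_table
where Gender = 2
group by PlayerId, result;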

select PlayerId from tablename where result = 'specific result you want' and gender = 2 group by PlayerId

The easiest way is to use pivoting:
;with cte as(Select * from t
Where gender = 2)
Select * from cte
Pivot(count(gender) for result in([1],[2],[3],....,[50]))p
Fiddle http://sqlfiddle.com/#!3/8dad5/3
One note: keeping gender in the scores table is a bad idea. It is better to make a separate table for players and keep gender there.

Count in a VIEW

I have a Deaths table with a row for each personID that died. There is also a column for the reason of death and a date for when he/she died.
I need to count all the people that died by Illness, Accident, Suicide, etc.
I want my output to be like this:
| Illness | Accident | Suicide |
| 32 | 55 | 3 |
I can easily create a view like this:
CREATE VIEW viewDeaths AS
SELECT COUNT(personID) AS Illness
FROM Deaths
WHERE Reason = 'Illness';
And it displays correctly, but how do I do it with multiple conditions?
The main purpose is to display the different values for each reason on a graph in a C# application

Simply use multiple subqueries:
CREATE VIEW viewDeaths AS
SELECT Illness = (SELECT COUNT(*) FROM dbo.Deaths d
WHERE d.Reason = 'Illness'),
Accident = (SELECT COUNT(*) FROM dbo.Deaths d
WHERE d.Reason = 'Accident'),
Suicide = (SELECT COUNT(*) FROM dbo.Deaths d
WHERE d.Reason = 'Suicide')

Do it this way, which returns one row per reason rather than one column per reason:
CREATE VIEW viewDeaths AS
SELECT Reason, COUNT(personID) AS DeathCount
FROM dbo.Deaths
-- WHERE Reason IN (.....)
GROUP BY Reason

Since it's a view, you can't use dynamic SQL, so either a static PIVOT or a CASE expression would be the best way to do it:
CREATE VIEW viewDeaths
AS
SELECT SUM(CASE WHEN Reason = 'Illness' THEN 1 ELSE 0 END) Illness,
SUM(CASE WHEN Reason = 'Accident' THEN 1 ELSE 0 END) Accident,
SUM(CASE WHEN Reason = 'Suicide' THEN 1 ELSE 0 END) Suicide
FROM dbo.Deaths;
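
As a quick check of the CASE-based view, here is a minimal sketch using Python's sqlite3 module (not SQL Server, so the dbo. prefix is dropped). The DeathDate column name and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- DeathDate is an assumed name for the question's date column.
    CREATE TABLE Deaths (personID INTEGER, Reason TEXT, DeathDate TEXT);
    INSERT INTO Deaths VALUES
        (1, 'Illness',  '2015-01-01'),
        (2, 'Accident', '2015-02-01'),
        (3, 'Illness',  '2015-03-01'),
        (4, 'Suicide',  '2015-04-01'),
        (5, 'Other',    '2015-05-01');

    CREATE VIEW viewDeaths AS
    SELECT SUM(CASE WHEN Reason = 'Illness'  THEN 1 ELSE 0 END) AS Illness,
           SUM(CASE WHEN Reason = 'Accident' THEN 1 ELSE 0 END) AS Accident,
           SUM(CASE WHEN Reason = 'Suicide'  THEN 1 ELSE 0 END) AS Suicide
    FROM Deaths;
    """
)

# The view yields a single row with one count per reason.
counts = conn.execute("SELECT Illness, Accident, Suicide FROM viewDeaths").fetchone()
print(counts)  # (2, 1, 1)
```

Unlike the subquery version, this reads the table once, and rows with other reasons simply add 0 to every column.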

How to identify subsequent user actions based on prior visits

I want to identify the users who visited section a and then subsequently visited b. Given the following data structure. The table contains 300,000 rows and updates daily with approx. 8,000 rows:
**USERID** **VISITID** **SECTION** Desired Solution--> **Conversion**
1 1 a 0
1 2 a 0
2 1 b 0
2 1 b 0
2 1 b 0
1 3 b 1
Ideally I want a new column that flags the visit to section b. For example on the third visit User 1 visited section b for the first time. I was attempting to do this using a CASE WHEN statement but after many failed attempts I am not sure it is even possible with CASE WHEN and feel that I should take a different approach, I am just not sure what that approach should be. I do also have a date column at my disposal.
Any suggestions on a new way to approach the problem would be appreciated. Thanks!

Correlated sub-queries should be avoided at all cost when working with Redshift. Keep in mind there are no indexes for Redshift so you'd have to rescan and restitch the column data back together for each value in the parent resulting in an O(n^2) operation (in this particular case going from 300 thousand values scanned to 90 billion).
The best approach when you are looking to span a series of rows is to use an analytic function. There are a couple of options depending on how your data is structured but in the simplest case, you could use something like
select case
when section != lag(section) over (partition by userid order by visitid)
then 1
else 0
end
from ...
This assumes that your data for userid 2 increments the visitid as below. If not, you could also order by your timestamp column
**USERID** **VISITID** **SECTION** Desired Solution--> **Conversion**
1 1 a 0
1 2 a 0
2 1 b 0
2 *2* b 0
2 *3* b 0
1 3 b 1
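
The LAG expression above can be tried on the sample data. This sketch uses Python's sqlite3 module rather than Redshift and assumes a bundled SQLite of version 3.25 or later (needed for window functions); the table name visits is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (userid INTEGER, visitid INTEGER, section TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 2, "a"), (2, 1, "b"), (2, 2, "b"), (2, 3, "b"), (1, 3, "b")],
)

# LAG returns NULL for a user's first visit, so the comparison is not true
# there and the CASE falls through to 0.
flagged = conn.execute(
    """
    SELECT userid, visitid, section,
           CASE
               WHEN section != LAG(section) OVER (PARTITION BY userid ORDER BY visitid)
               THEN 1
               ELSE 0
           END AS conversion
    FROM visits
    ORDER BY userid, visitid
    """
).fetchall()
for row in flagged:
    print(row)
```

Only user 1's third visit is flagged. Note that this expression flags any change of section, including a move from b back to a; comparing against section = 'b' with a previous value of 'a' would be a stricter variant.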

select t.*, case when v.ts is null then 0 else 1 end as conversion
from tbl t
left join (select *
from tbl x
where section = 'b'
and exists (select 1
from tbl y
where y.userid = x.userid
and y.section = 'a'
and y.ts < x.ts)) v
on t.userid = v.userid
and t.visitid = v.visitid
and t.section = v.section
Fiddle:
http://sqlfiddle.com/#!15/5b954/5/0
I added sample timestamp data as that field is necessary to determine whether a comes before b or after b.
To incorporate analytic functions you could use:
(I've also made it so that only the first occurrence of B (after an A) will get flagged with the 1)
select t.*,
case
when v.first_b_after_a is not null
then 1
else 0
end as conversion
from tbl t
left join (select userid, min(ts) as first_b_after_a
from (select t.*,
sum( case when t.section = 'a' then 1 end)
over( partition by userid
order by ts ) as a_sum
from tbl t) x
where section = 'b'
and a_sum is not null
group by userid) v
on t.userid = v.userid
and t.ts = v.first_b_after_a
Fiddle: http://sqlfiddle.com/#!1/fa88f/2/0
