SQL count issues - sql

Sorry if this is a basic question.
I have a table
store-ProdCode
13p I10x
13p I20x
13p I30x
14a K38z
17a K38y
my data set has nearly 100,000 records.
What I'm trying to do is, for every store find the top 10 prodCode.
I am unsure of how to do this but what I tried was:
select s_code as store, prod_code,count (prod_code)
from top10_secondary
where prod_code is not null
group by store,prod_code
order by count(prod_code) desc limit 10
this is giving me something completely different and i'm unsure on how I go about achieving my final result.
The expected output should be: for every store(s_code) display the top 10 prodcode (top 10 calculated by the count)
so:
store--prodcode--result
1a abc 5
1a abd 4
1a xxx 7
--- this will be done until top 10 prodcodes for 1a are done--
2a dgf 1
2a ldk 6
--process completes until end of data is reached and top 10 prodcodes are displayed for each store
All help is appreciated. What is the best way to do this?
Thanks

One method uses row_number(), something like this:
select s.*
from (select s_code as store, prod_code, count(prod_code),
row_number() over (partition by s_code order by count(prod_code) desc) as seqnum
from top10_secondary
where prod_code is not null
group by s_code, prod_code
) s
where seqnum <= 10;
You can use window functions directly in an aggregation query. The subquery is needed only to reference the sequence number for filtering.

Related

SQL counting query

Sorry if this is a basic question.
Basically, I have a table that is as follows, below is a basic sample
store-ProdCode-result
13p I10x 5
13p I20x 7
13p I30x 8
14a K38z 23
17a K38z 23
my data set has nearly 100,000 records.
What I'm trying to do is, for every store find the top 10 prodCode.
I am unsure of how to do this but what I tried was:
select s_code as store, prod_code,count (prod_code)
from top10_secondary
where prod_code is not null
group by store,prod_code
order by count(prod_code) desc limit 10
this is giving me something completely different and i'm unsure on how I go about achieving my final result.
All help is appreciated.
Thanks
The expected output should be: for every store(s_code) display the top 10 prodcode
so:
store--prodcode--result
1a abc 5
1a abd 4
2a dgf 1
2a ldk 6
.(10 times until next store code)
You can use the table twice in the FROM clause, once for the data, and once to get a count of how many records have fewer results for that store.
SELECT a.s_code, a.prod_code, count(*)
FROM top10_secondary a
LEFT OUTER JOIN top10_secondary b
ON a.s_code = b.s_code
AND b.result < a.result
GROUP BY a.s_code, a.prod_code
HAVING count(*) < 10
With this technique though, you may get more than 10 records per store if the 10th result value exists multiple times. Because the limit rule is simply "include record as long as there are less than 10 records with result values than mine"
It looks like in your case, "result" is a ranking, so they would not be duplicated per store.
This is a good case for Window functions.
SELECT
s_code,
prod_code,
prod_count
FROM
(
SELECT
s_code,
prod_code,
prod_count,
RANK() OVER (PARTITION BY s_code ORDER BY prod_Count DESC) as prod_rank
FROM
(SELECT s_code as store, prod_code, count(prod_Code) prod_count FROM table GROUP BY s_code, prod_code) t1
) t2
WHERE prod_rank <= 10
The inner most query gets the count of each product at the store. The second inner more query determines the rank for those products for each store based on that count. Then the outer most query limits the results based on that rank.
o

Querying Data Backward in ORACLE SQL

I have a simple question regarding oracle sql. So i have this table
WEEKNUM DATA
1 10
2 4
3 6
4 7
So i want to make a view that shows like this,
WEEKNUM DATA ACCUM_DATE
1 10 10
2 4 14
3 6 20
4 7 27
I spend hours on this simple one but couldnt get any luck
thanks a lot
SELECT weeknum,
data,
sum(data) over (order by weeknum) accum_data
FROM your_table_name
should work. I'm using the sum analytic function here and assuming that you want to start with the smallest weeknum value and keep increasing the running total as the weeknum values increase. I'm also assuming that you never want to reset the accumulated sum. If you're trying to do something like generating an accumulated sum that restarts each year, you'd want to add a partition by to the analytic function.
You could use a Cross JOin in this case
Query:
select
A.WEEKNUM
, A.DATA
, SUM(B.DATA) DA
from table1 A
cross join table1 B
WHERE A.WEEKNUM>=B.WeekNUM
GROUP BY A.WEEKNUM
, A.DATA
order by A.WEEKNUM
Result:
WEEKNUM DATA DA
1 10 10
2 4 14
3 6 20
4 7 27
Thanks guys but i just found out this method works perfectly,
OVER (ORDER BY WEEKNUM ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS CUMULATIVE_WEIGHT
Or use a sub-select to calculate:
select WEEKNUM, DATA, (select sum(DATA) from tablename t2
where t2.weeknum <= t1.weeknum) as ACCUM_DATE
from tablename t1

In SQL, I need to generate a ranking (1st, 2nd, 3rd) column, getting stuck on "ties"

I have a query that calculates points based on multiple criteria, and then orders the result set based on those points.
SELECT * FROM (
SELECT
dbo.afunctionthatcalculates(Something, Something) AS Points1
,dbo.anotherone(Something, Something) AS Points2
,dbo.anotherone(Something, Something) AS Points3
,[TotalPoints] = dbo.function(something) + dbo.function(something)
) AS MyData
ORDER BY MyData.TotalPoints
So my first stab at adding placement, rankings.. was this:
SELECT ROW_NUMBER() OVER(MyData.TotalPoints) AS Ranking, * FROM (
SELECT same as above
) AS MyData
ORDER BY MyData.TotalPoints
This adds the Rankings column, but doesn't work when the points are tied.
Rank | TotalPoints
--------------------
1 100
2 90
3 90
4 80
Should be:
Rank | TotalPoints
--------------------
1 100
2 90
2 90
3 80
Not really sure about how to resolve this.
Thank you for your help.
You should use the DENSE_RANK() function which takes the ties into account, as described here: http://msdn.microsoft.com/en-us/library/ms173825.aspx
DENSE_RANK() instead of ROW_NUMBER()

counting subquery in SQL

I have the following query to count how many times each process_track_id occurs in a table:
SELECT
a.process_track_id,
COUNT(1) AS 'num'
FROM
transreport.process_name a
GROUP BY
a.process_track_id
This returns the following results:
process_track_id | num
1 14
2 44
3 16
5 8
6 18
7 17
8 14
This is great. Now is the part where I am stuck. I would like to get the following table:
num count
8 1
14 2
16 1
17 1
18 1
44 1
Where num are the distinct counts from the first table, and count is how many times that frequency occurs.
Here is what I have tried (it's a subquery, but I'm not sold on the method) and I haven't been able to get it to work just yet. I'm new to SQL and I think I'm missing out on some some key aspects of the syntax.
SELECT
X.id_count,
count(1) as 'num_count'
FROM
(SELECT
a.process_track_id,
COUNT(1) AS 'id_count'
FROM
transreport.process_name a
GROUP BY
a.process_track_id
--COUNT(1) AS 'id_count'
) X;
Any ideas?
It's probably good to keep in mind that this may have to be run on a database with at least 1 million records, and I don't have the ability to create a new table in the process.
Thanks!
Here's the subquery method you were driving at:
SELECT id_count, COUNT(*) AS 'num_count'
FROM (SELECT a.process_track_id
,COUNT(*) AS 'id_count'
FROM transreport.process_name a
GROUP BY a.process_track_id
)sub
GROUP BY id_count
Not sure there's a better method as the aggregation needs to run once anyway.
Try this
SELECT x.num, COUNT(*) AS COUNT
FROM (
SELECT
a.process_track_id, -- <--- You may removed this column
COUNT(*) AS 'num'
FROM
transreport.process_name a
GROUP BY
a.process_track_id
) X
GROUP BY X.num

Select unique records [duplicate]

This question already has answers here:
Fetch the rows which have the Max value for a column for each distinct value of another column
(35 answers)
Closed 2 years ago.
I'm working with a table that has about 50 colums and 100,000 rows.
One column, call it TypeID, has 10 possible values:
1 thourgh 10.
There can be 10,000 records of TypeID = 1, and 10,000 records of TypeID = 2 and so one.
I want to run a SELECT statement that will return 1 record of each distinct TypeID.
So something like
TypeID JobID Language BillingDt etc
------------------------------------------------
1 123 EN 20130103 etc
2 541 FR 20120228 etc
3 133 FR 20110916 etc
4 532 SP 20130822 etc
5 980 EN 20120714 etc
6 189 EN 20131009 etc
7 980 SP 20131227 etc
8 855 EN 20111228 etc
9 035 JP 20130615 etc
10 103 EN 20100218 etc
I've tried:
SELECT DISTINCT TypeID, JobID, Language, BillingDt, etc
But that produces multiple TypeID rows of the same value. I get a whole bunch of '4', '10', and so on.
This is an ORACLE Database that I'm working with.
Any advise would be greatly appreciated; thanks!
You can use ROW_NUMBER() to get the top n per group:
SELECT TypeID,
JobID,
Language,
BillingDt,
etc
FROM ( SELECT TypeID,
JobID,
Language,
BillingDt,
etc,
ROW_NUMBER() OVER(PARTITION BY TypeID ORDER BY JobID) RowNumber
FROM T
) T
WHERE RowNumber = 1;
SQL Fidle
You may need to change the ORDER BY clause to fit your requirements, as you've not said how to pick one row per TypeID I had to guess.
WITH RankedQuery AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY TypeID ORDER BY [ordercolumn] DESC) AS rn
FROM [table]
)
SELECT *
FROM RankedQuery
WHERE rn = 1;
This will return the top row for each type id, you can add an order by if you want a specific row, not just any.