How to obtain list in SQL query

In one application, I have a table with three fields: Id, Name, and Value.
Id | Name | Value
---|------|------
1  | A    | 5
2  | B    | 9
3  | C    | 9
4  | D    | 5
5  | E    | 6
6  | F    | 6
Now, how can I obtain a cross table from the above? I mean, as follows:
Value | Count
---- | ----
5 | 2
6 | 2
7 | 0
8 | 0
9 | 2
can you help, please?

First, you need to create a tally table; there are many methods for that. You will use the tally table to number off all the values between the min and max of your source table. Once you have all the numbers between min and max, LEFT JOIN them onto a version of your table where you use COUNT() and GROUP BY to total the number of times each value appears.
Below Table A is the tally table.
Table B is your aggregated source table.
DECLARE @MinValue INT
DECLARE @MaxValue INT
SET @MinValue = (SELECT MIN(Value) FROM dbo.MyTable)
SET @MaxValue = (SELECT MAX(Value) FROM dbo.MyTable)

SELECT number AS Value, COALESCE([Count], 0) AS [Count]
FROM (
    SELECT DISTINCT number
    FROM master..spt_values
    WHERE number BETWEEN @MinValue AND @MaxValue
) AS A
LEFT JOIN (
    SELECT Value, COUNT(Value) AS [Count]
    FROM dbo.MyTable
    GROUP BY Value
) AS B
    ON A.number = B.Value
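If you prefer not to depend on master..spt_values, here is a minimal sketch of the same idea using a recursive CTE to generate the number range (assuming SQL Server 2008+ for the inline DECLARE initializers):

DECLARE @MinValue INT = (SELECT MIN(Value) FROM dbo.MyTable);
DECLARE @MaxValue INT = (SELECT MAX(Value) FROM dbo.MyTable);

-- Recursively generate every integer between min and max, then LEFT JOIN the
-- per-value counts onto that range so missing values show up as 0.
;WITH Numbers AS (
    SELECT @MinValue AS number
    UNION ALL
    SELECT number + 1 FROM Numbers WHERE number < @MaxValue
)
SELECT N.number AS Value, COALESCE(B.[Count], 0) AS [Count]
FROM Numbers N
LEFT JOIN (
    SELECT Value, COUNT(*) AS [Count]
    FROM dbo.MyTable
    GROUP BY Value
) AS B
    ON N.number = B.Value
OPTION (MAXRECURSION 32767);  -- the default recursion limit is only 100 levels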

get the nth-lowest value in a `group by` clause

Here's a tough one: I have data coming back in a temporary table foo in this form:
id n v
-- - -
1 3 1
1 3 10
1 3 100
1 3 201
1 3 300
2 1 13
2 1 21
2 1 300
4 2 1
4 2 7
4 2 19
4 2 21
4 2 300
8 1 11
Grouping by id, I need to get the row with the nth-lowest value for v based on the value in n. For example, for the group with an ID of 1, I need to get the row which has v equal to 100, since 100 is the third-lowest value for v.
Here's what the final results need to look like:
id n v
-- - -
1 3 100
2 1 13
4 2 7
8 1 11
Some notes about the data:
the number of rows for each ID may vary
n will always be the same for every row with a given ID
n for a given ID will never be greater than the number of rows with that ID
the data will already be sorted by id, then v
Bonus points if you can do it in generic SQL instead of oracle-specific stuff, but that's not a requirement (I suspect that rownum may factor prominently in any solutions). It has in my attempts, but I wind up confusing myself before I get a working solution.
I would use the row_number function: in one CTE, number the rows within each group and keep only the rows whose row number is at most the n column value; then, in another CTE, number those rows again, ordered by v descending.
Taking rn = 1 from that second CTE gives the maximum v among each group's first n rows, which is the nth-lowest value.
CREATE TABLE foo(
id int,
n int,
v int
);
insert into foo values (1,3,1);
insert into foo values (1,3,10);
insert into foo values (1,3,100);
insert into foo values (1,3,201);
insert into foo values (1,3,300);
insert into foo values (2,1,13);
insert into foo values (2,1,21);
insert into foo values (2,1,300);
insert into foo values (4,2,1);
insert into foo values (4,2,7);
insert into foo values (4,2,19);
insert into foo values (4,2,21);
insert into foo values (4,2,300);
insert into foo values (8,1,11);
Query 1:
with cte as (
    select id, n, v
    from (
        select t.*, row_number() over(partition by id, n order by v) as rn
        from foo t
    ) t1
    where rn <= n
), maxcte as (
    select id, n, v, row_number() over(partition by id, n order by v desc) rn
    from cte
)
select id, n, v
from maxcte
where rn = 1
Results:
| ID | N | V |
|----|---|-----|
| 1 | 3 | 100 |
| 2 | 1 | 13 |
| 4 | 2 | 7 |
| 8 | 1 | 11 |
Use a window function:
select *
from (
    select t.*, row_number() over(partition by id, n order by v) as rn
    from foo t
) t1
where t1.rn = t1.n
For the OP's sample output you could hard-code a specific rank such as t1.rn = 3, but per the description the general condition is t1.rn = t1.n, as used above.
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=65abf8d4101d2d1802c1a05ed82c9064
If your database is version 12.1 or higher then there is a much simpler solution:
SELECT DISTINCT ID, n, NTH_VALUE(v,n) OVER (PARTITION BY ID) AS v
FROM foo
ORDER BY ID;
| ID | N | V |
|----|---|-----|
| 1 | 3 | 100 |
| 2 | 1 | 13 |
| 4 | 2 | 7 |
| 8 | 1 | 11 |
Depending on your real data you may have to add an ORDER BY clause and/or a windowing clause such as RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING; see the NTH_VALUE documentation.
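For example, a sketch of the fully specified version (assuming the foo table above and Oracle 12.1+):

-- Order each partition by v and widen the window to the whole partition so
-- NTH_VALUE can always see the nth row.
SELECT DISTINCT ID, n,
       NTH_VALUE(v, n) OVER (
           PARTITION BY ID
           ORDER BY v
           RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS v
FROM foo
ORDER BY ID;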

SQL Server - Select Distinct of two columns, where the distinct column selected has a maximum value based on two other columns

I have 2 tables - TC and T, with columns specified below. TC maps to T on column T_ID.
TC
----
T_ID,
TC_ID
T
-----
T_ID,
V_ID,
Datetime,
Count
My current result set is:
V_ID | TC_ID | Datetime   | Count  |
-----|-------|------------|--------|
2 | 1 | 2013-09-26 | 450600 |
2 | 1 | 2013-12-09 | 14700 |
2 | 1 | 2014-01-22 | 15000 |
2 | 1 | 2014-01-22 | 15000 |
2 | 1 | 2014-01-22 | 7500 |
4 | 1 | 2014-01-22 | 1000 |
4 | 1 | 2013-12-05 | 0 |
4 | 2 | 2013-12-05 | 0 |
Using the following query:
select T.V_ID,
TC.TC_ID,
T.Datetime,
T.Count
from T
inner join TC
on TC.T_ID = T.T_ID
Result set I want:
V_ID | TC_ID | Datetime   | Count  |
-----|-------|------------|--------|
2 | 1 | 2014-01-22 | 15000 |
4 | 1 | 2014-01-22 | 1000 |
4 | 2 | 2013-12-05 | 0 |
I want to write a query to select each distinct V_ID + TC_ID combination, but only with the maximum datetime, and for that datetime the maximum count. E.g. for the distinct combination of V_ID = 2 and TC_ID = 1, '2014-01-22' is the maximum datetime, and for that datetime, 15000 is the maximum count, so select this record for the new table. Any ideas? I don't know if this is too ambitious for a query and I should just handle the result set in code instead.
One method uses row_number():
select v_id, tc_id, datetime, count
from (select T.V_ID, TC.TC_ID, T.Datetime, T.Count,
row_number() over (partition by t.V_ID, tc.tc_id
order by datetime desc, count desc
) as seqnum
from t join
tc
on tc.t_id = t.t_id
) tt
where seqnum = 1;
The only issue is that some rows have the same maximum datetime value. SQL tables represent unordered sets, so there is no way to determine which is really the maximum -- unless the datetime really has a time component or another column specifies the ordering within a day.
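For instance, a sketch of that tiebreaker idea (this assumes T_ID uniquely identifies rows in T; substitute whatever column actually defines the ordering):

select v_id, tc_id, [datetime], [count]
from (select T.V_ID, TC.TC_ID, T.[Datetime], T.[Count],
             -- T_ID breaks ties when Datetime and Count are both equal
             row_number() over (partition by T.V_ID, TC.TC_ID
                                order by T.[Datetime] desc, T.[Count] desc, T.T_ID desc
                               ) as seqnum
      from T
      inner join TC
        on TC.T_ID = T.T_ID
     ) tt
where seqnum = 1;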
It is possible to solve this using CTEs: first, extract the data from your query; second, get the max dates; third, get the highest count for each max date:
;WITH Dataset AS
(
select T.V_ID,
TC.TC_ID,
T.[Datetime],
T.[Count]
from T
inner join TC
on TC.T_ID = T.T_ID
),
MaxDates AS
(
SELECT V_ID, TC_ID, MAX(t.[Datetime]) AS MaxDate
FROM Dataset t
GROUP BY t.V_ID, t.TC_ID
)
SELECT t.V_ID, t.TC_ID, t.[Datetime], MAX(t.[Count]) AS [Count]
FROM Dataset t
INNER JOIN MaxDates m ON t.V_ID = m.V_ID AND t.TC_ID = m.TC_ID AND m.MaxDate = t.[Datetime]
GROUP BY t.V_ID, t.TC_ID, t.[Datetime]
Just to keep it simple:
You need to group by T.V_ID and TC.TC_ID, selecting the max of the date; then, to get the maximum count, you must use a subquery, as follows:
select T.V_ID,
       TC.TC_ID,
       max(T.[Datetime]) as Date_Time,
       (select max(tb.[Count]) from T as tb
        where tb.V_ID = T.V_ID and tb.[Datetime] = max(T.[Datetime])) as [Count]
from T
inner join TC
    on TC.T_ID = T.T_ID
group by T.V_ID, TC.TC_ID

Order by pairs of values

I have a set of rankings, ordered by group and ranking:
Group | Rank
------------
A | 1
A | 2
A | 3
A | 4
A | 5
A | 6
B | 1
B | 2
B | 3
B | 4
C | 1
C | 2
C | 3
C | 4
C | 5
D | 1
D | 2
D | 3
D | 4
I want to interleave the groups, ordered by group and rank, n rankings per group at a time (here, n=2):
Group | Rank
------------
A | 1
A | 2
B | 1
B | 2
C | 1
C | 2
D | 1
D | 2
A | 3
A | 4
B | 3
B | 4
C | 3
C | 4
D | 3
D | 4
A | 5
A | 6
C | 5
I have achieved the desired result with loops and table variables (code pasted here because I got a nondescript syntax error in SQL Fiddle):
CREATE TABLE Rankings([Group] NCHAR(1), [Rank] INT)
INSERT Rankings
VALUES
('A',1),
('A',2),
('A',3),
('A',4),
('A',5),
('A',6),
('B',1),
('B',2),
('B',3),
('B',4),
('C',1),
('C',2),
('C',3),
('C',4),
('C',5),
('D',1),
('D',2),
('D',3),
('D',4)
-- input
DECLARE @n INT = 2 --number of group rankings per rotation
-- output
DECLARE @OrderedRankings TABLE([Group] NCHAR(1), [Rank] INT)
--
-- in-memory rankings.. we will be deleting used rows
DECLARE @RankingsTemp TABLE(GroupIndex INT, [Group] NCHAR(1), [Rank] INT)
INSERT @RankingsTemp
SELECT
    ROW_NUMBER() OVER (PARTITION BY [Rank] ORDER BY [Group]) - 1 AS GroupIndex,
    [Group],
    [Rank]
FROM Rankings
ORDER BY [Group], [Rank]
-- loop variables
DECLARE @MaxGroupIndex INT = (SELECT MAX(GroupIndex) FROM @RankingsTemp)
DECLARE @RankingCount INT = (SELECT COUNT(*) FROM @RankingsTemp)
DECLARE @i INT
WHILE (@RankingCount > 0)
BEGIN
    SET @i = 0;
    WHILE (@i <= @MaxGroupIndex)
    BEGIN
        INSERT INTO @OrderedRankings ([Group], [Rank])
        SELECT TOP(@n) [Group], [Rank]
        FROM @RankingsTemp
        WHERE GroupIndex = @i
        ORDER BY [Rank];

        -- delete the rows just consumed (the CTE must be followed directly by the DELETE)
        WITH T AS (
            SELECT TOP(@n) *
            FROM @RankingsTemp
            WHERE GroupIndex = @i
            ORDER BY [Rank]
        )
        DELETE FROM T;

        SET @i = @i + 1;
    END
    SET @RankingCount = (SELECT COUNT(*) FROM @RankingsTemp)
END
SELECT @RankingCount as RankingCount, @MaxGroupIndex as MaxGroupIndex
-- view results
SELECT * FROM @OrderedRankings
How can I achieve the desired ordering with a set-based approach (no loops, no table variables)?
I'm using SQL Server Enterprise 2008 R2.
Edit: To clarify, I need no more than n rows per group to appear contiguously. The goal of this query is to yield an ordering, when read sequentially, offers an equal representation (n rows at a time) of each group, with respect to rank.
Perhaps something like this...SQL FIDDLE
ORDER BY
    CEILING([Rank] * 1.0 / 2), [Group], [Rank]
Working fiddle above (column names changed slightly)
Updated: I was burned by integer math; it should work now. Multiplying by 1.0 forces the int to decimal, so implicit casting doesn't drop the remainder that CEILING needs in order to round correctly.
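Put together against the Rankings table from the question, a sketch with n = 2:

-- CEILING([Rank] / n) assigns ranks 1..n to block 1, n+1..2n to block 2, and so
-- on, so ordering by block first interleaves the groups n rows at a time.
SELECT [Group], [Rank]
FROM Rankings
ORDER BY CEILING([Rank] * 1.0 / 2), [Group], [Rank]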
Assuming you have a relatively low number of ranks, this would work:
Order by
case when rank <= n then 10
when rank <= 2*n then 20
when rank <= 3*n then 30
when rank <= 4*n then 40
when rank <= 5*n then 50 --more cases here if needed
else 100
end
, group
, rank
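As an alternative sketch (relying on SQL Server's truncating integer division), integer division gives the same block numbering for any n without enumerating cases:

-- ([Rank] - 1) / 2 yields 0 for ranks 1-2, 1 for ranks 3-4, and so on.
SELECT [Group], [Rank]
FROM Rankings
ORDER BY ([Rank] - 1) / 2, [Group], [Rank]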

Remove partial duplicates sql server

I am altering an existing view within SQL Server. My union statement creates something along the lines of:
Col1 | C2 | C3 | C4
-----|----|------|-----
1 A | B | NULL | NULL
2 A | B | C | NULL
3 A | B | C | D
4 E | F | NULL | NULL
5 E | F | G | NULL
However, I only want (in this scenario) rows 3 and 5. I need to omit rows 1 and 2 because they contain duplicate info: columns one, two, and three contain the same info as row 3, but row 3 is the most 'complete'. Row 5 is kept over row 4 for the same reason.
Is this an outer join / intersect issue? How the heck do you create a view in this manner?
Assuming that Col1 is not NULL, we can use ROW_NUMBER with an ORDER BY on the concatenated value of all four columns:
;with cte as
(
    select ROW_NUMBER() over (partition by Col1
                              order by (coalesce(Col1, '') +
                                        coalesce([C2], '') +
                                        coalesce([C3], '') +
                                        coalesce([C4], '')) desc) as seq,
           *
    FROM Table1
)
select * from cte
where seq = 1
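If the concatenated string does not sort the way you expect, an alternative sketch is to rank by how many of the columns are non-NULL instead (this assumes "most complete" simply means "fewest NULLs"):

-- The row with the most populated columns gets seq = 1 within each Col1 group.
;with cte as
(
    select *,
           ROW_NUMBER() over (partition by Col1
                              order by (case when [C2] is null then 0 else 1 end +
                                        case when [C3] is null then 0 else 1 end +
                                        case when [C4] is null then 0 else 1 end) desc) as seq
    FROM Table1
)
select * from cte
where seq = 1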

SQL - min() gets the lowest value, max() the highest, what if I want the 2nd (or 5th or nth) lowest value?

The problem I'm trying to solve is that I have a table like this:
a and b refer to points in a different table. distance is the distance between the points.
| id | a_id | b_id | distance | delete |
| 1 | 1 | 1 | 1 | 0 |
| 2 | 1 | 2 | 0.2345 | 0 |
| 3 | 1 | 3 | 100 | 0 |
| 4 | 2 | 1 | 1343.2 | 0 |
| 5 | 2 | 2 | 0.45 | 0 |
| 6 | 2 | 3 | 110 | 0 |
....
The important column I'm looking at is a_id. If I wanted to keep the closest b for each a, I could do something like this:
update mytable set "delete" = 1 from (select a_id, min(distance) as dist from mytable group by a_id) as x where mytable.a_id = x.a_id and mytable.distance > x.dist;
delete from mytable where "delete" = 1;
Which would give me a result table like this:
| id | a_id | b_id | distance | delete |
| 1 | 1 | 1 | 1 | 0 |
| 5 | 2 | 2 | 0.45 | 0 |
....
i.e. I need one row for each value of a_id, and that row should have the lowest value of distance for each a_id.
However, I want to keep the 10 closest points for each a_id. I could do this with a plpgsql function, but I'm curious if there is a more SQL-y way.
min() and max() return the smallest and largest values; if there were an aggregate function like nth() that returned the nth largest/smallest value, I could do this in a similar manner to the above.
I'm using PostgreSQL.
Try this:
SELECT *
FROM (
    SELECT a_id, (
        SELECT b_id
        FROM mytable mib
        WHERE mib.a_id = ma.a_id
        ORDER BY distance
        LIMIT 1 OFFSET s - 1
    ) AS b_id
    FROM (
        SELECT DISTINCT a_id
        FROM mytable mia
    ) ma, generate_series(1, 10) s
) ab
WHERE b_id IS NOT NULL
Checked on PostgreSQL 8.3
I love Postgres, so I took it as a challenge the second I saw this question.
So, for the table:
Table "pg_temp_29.foo"
Column | Type | Modifiers
--------+---------+-----------
value | integer |
With the values:
SELECT value FROM foo ORDER BY value;
value
-------
0
1
2
3
4
5
6
7
8
9
14
20
32
(13 rows)
You can do a:
SELECT value FROM foo ORDER BY value DESC LIMIT 1 OFFSET X
Where X = 0 for the highest value, 1 for the second highest, 2... And so forth.
This can be further embedded in a subquery to retrieve the value needed. So, to use the dataset provided in the original question we can get the a_ids with the top ten lowest distances by doing:
SELECT a_id, distance FROM mytable t1
WHERE id IN
    (SELECT id FROM mytable t2 WHERE t2.a_id = t1.a_id
     ORDER BY distance LIMIT 10)
ORDER BY a_id, distance;
a_id | distance
------+----------
1 | 0.2345
1 | 1
1 | 100
2 | 0.45
2 | 110
2 | 1343.2
Does PostgreSQL have the analytic function rank()? If so try:
select a_id, b_id, distance
from
( select a_id, b_id, distance, rank() over (partition by a_id order by distance) rnk
from mytable
) ranked where rnk <= 10;
This SQL should find you the Nth lowest value; it should work in SQL Server, MySQL, DB2, Oracle, Teradata, and almost any other RDBMS (note: low performance because of the subquery):
SELECT * /*This is the outer query part */
FROM mytable tbl1
WHERE (N-1) = ( /* Subquery starts here */
SELECT COUNT(DISTINCT(tbl2.distance))
FROM mytable tbl2
WHERE tbl2.distance < tbl1.distance)
The most important thing to understand in the query above is that the subquery is evaluated each and every time a row is processed by the outer query. In other words, the inner query cannot be processed independently of the outer query, since it also references tbl1's value.
In order to find the Nth lowest value, we just find the value that has exactly N-1 values lower than itself.
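Adapted to the question's per-a_id, 10-closest case, a sketch of the same counting idea (untested, assuming the mytable columns shown above) just restricts the correlation to the same a_id:

-- Keep a row when fewer than 10 distinct smaller distances exist for its a_id.
SELECT *
FROM mytable tbl1
WHERE 9 >= (SELECT COUNT(DISTINCT tbl2.distance)
            FROM mytable tbl2
            WHERE tbl2.a_id = tbl1.a_id
              AND tbl2.distance < tbl1.distance)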