SQL Query - SELECT distinct IDs with 2 extra columns

I'm working with an SQL query whose result looks like this (sorted by the order of the station visits):
TRAIN_ID TYPE STATION
111 'KC' New York
111 'KC' Washington
111 'KC' Boston
111 'KC' Denver
222 'FC' London
222 'FC' Paris
I'd like to SELECT the distinct trains, and each row must include the first and the last station, like this:
TRAIN_ID TYPE FIRSTSTATION LASTSTATION
111 'KC' New York Denver
222 'FC' London Paris
Can anyone give me a hand? Thanks in advance!

Assuming you find something to define an order on the stations so that you can identify the "last" and "first" one, the following should work:
WITH numbered_stations AS (
SELECT train_id,
type,
station,
row_number() over (partition by train_id order by some_order_column) as rn,
count(*) over (partition by train_id) as total_stations
FROM the_unknown_table
)
SELECT f.train_id,
f.type,
f.station as first_station,
l.station as last_station
FROM (SELECT train_id,
type,
station
FROM numbered_stations
WHERE rn = 1
) f
JOIN (SELECT train_id,
type,
station
FROM numbered_stations
WHERE rn = total_stations) l
ON f.train_id = l.train_id
ORDER BY f.train_id
This assumes that some_order_column can be used to identify the last and first station.
It also assumes that the type is always the same for all combinations of train_id and station.
The shown syntax is standard ANSI SQL and should work on most modern DBMS.
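As a rough, runnable illustration of the approach above, here is a minimal sketch using Python's built-in sqlite3 module. The table name train_stops and the ordering column visit_order are assumptions made for this example (the question does not name either), and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical schema: visit_order is an assumed column recording the
# order in which each train visits its stations.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE train_stops "
    "(train_id INTEGER, type TEXT, station TEXT, visit_order INTEGER)"
)
conn.executemany(
    "INSERT INTO train_stops VALUES (?, ?, ?, ?)",
    [
        (111, "KC", "New York", 1),
        (111, "KC", "Washington", 2),
        (111, "KC", "Boston", 3),
        (111, "KC", "Denver", 4),
        (222, "FC", "London", 1),
        (222, "FC", "Paris", 2),
    ],
)

query = """
WITH numbered_stations AS (
    SELECT train_id, type, station,
           ROW_NUMBER() OVER (PARTITION BY train_id ORDER BY visit_order) AS rn,
           COUNT(*) OVER (PARTITION BY train_id) AS total_stations
    FROM train_stops
)
SELECT f.train_id, f.type,
       f.station AS first_station,
       l.station AS last_station
FROM numbered_stations f
JOIN numbered_stations l ON l.train_id = f.train_id
WHERE f.rn = 1                  -- first stop of each train
  AND l.rn = l.total_stations   -- last stop of each train
ORDER BY f.train_id
"""
first_last = conn.execute(query).fetchall()
for row in first_last:
    print(row)
```

The sketch joins the CTE to itself directly instead of wrapping each side in a derived table; the filtering logic is the same.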

Related

How to check how many times some values are duplicated?

I have table like below:
city | segment
------------------
London | A
London | B
New York | A
Berlin | B
Barcelona | C
Barcelona | H
Barcelona | E
Each city should have only one segment, but as you can see there are two cities (London and Barcelona) that have more than one segment.
It is essential that the result table contains only those cities which have more than one segment.
As a result I need something like below:
city - city based on table above
no_segments - number of segments which have defined city based on table above
segments - segments of defined city based on table above
city | no_segments | segments
------------------------------------
London | 2 | A
 | | B
Barcelona | 3 | C
 | | H
 | | E
How can I do that in Oracle?
You can use the analytic functions COUNT(*) OVER (...) (to get the number of segments per city) and ROW_NUMBER() (to decide on which row the city and its count are displayed), such as:
WITH t1 AS
(
SELECT city,
segment,
COUNT(*) OVER (PARTITION BY city) AS no_segments,
ROW_NUMBER() OVER (PARTITION BY city ORDER BY segment) rn
FROM t
)
SELECT DECODE(rn,1,city) AS city,
DECODE(rn,1,no_segments) AS no_segments,
segment
FROM t1
WHERE no_segments > 1
ORDER BY t1.city, segment
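To try the same idea outside Oracle, here is a minimal sketch with Python's sqlite3 module. SQLite has no DECODE, so the sketch swaps in CASE expressions; the output aliases shown_city and shown_count are names chosen for this example, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (city TEXT, segment TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("London", "A"), ("London", "B"),
        ("New York", "A"),
        ("Berlin", "B"),
        ("Barcelona", "C"), ("Barcelona", "H"), ("Barcelona", "E"),
    ],
)

query = """
WITH t1 AS (
    SELECT city, segment,
           COUNT(*) OVER (PARTITION BY city) AS no_segments,
           ROW_NUMBER() OVER (PARTITION BY city ORDER BY segment) AS rn
    FROM t
)
SELECT CASE WHEN rn = 1 THEN city END        AS shown_city,   -- only on first row
       CASE WHEN rn = 1 THEN no_segments END AS shown_count,  -- only on first row
       segment
FROM t1
WHERE no_segments > 1      -- keep cities with more than one segment
ORDER BY city, segment
"""
report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

Rows after the first one of each city carry NULL (None in Python) in the first two columns, which mirrors the blank cells in the requested layout.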
Another way to do this is:
SELECT NULLIF(CITY, PREV_CITY) AS CITY,
SEGMENT
FROM (SELECT CITY,
LAG(CITY) OVER (ORDER BY CITY DESC) AS PREV_CITY,
SEGMENT,
COUNT(SEGMENT) OVER (PARTITION BY CITY) AS CITY_SEGMENT_COUNT
FROM CITY_SEGMENTS)
WHERE CITY_SEGMENT_COUNT > 1
Using LAG() to determine the "previous" CITY allows us to directly compare the CITY values, which in my mind is clearer than using ROW_NUMBER = 1.
;with cte as (
Select city, count(seg) as cntseg
From table1
Group by city having count(seg) > 1
)
Select a.city, b.cntseg, a.seg
From table1 as a join cte as b
On a.city = b.city

Find the most popular combinations SQL

I have 2 tables I want to join to explore the most popular combinations of location, by distinct id, ordered by count. I get location from l, date from d. The results from this join would be:
id loc_id location date
1 111 NYC 20200101
1 222 LA 20200102
2 111 NYC 20200103
2 333 LON 20200103
3 444 NYC 20200105
4 444 LA 20200106
4 555 PAR 20200107
5 111 NYC 20200110
5 222 LA 20200111
I would like to use STRING_AGG if possible, but I get an error with the WITHIN clause: 'expecting ')' but got WITHIN' (I'm on BigQuery for this). Here is what I've attempted so far:
SELECT t.combination, count(*) count
FROM (
SELECT
STRING_AGG(location, ',') WITHIN GROUP (ORDER BY d.date) combination
FROM location as l
JOIN date d
USING (loc_id)
GROUP BY id
) t
WHERE date BETWEEN 20190101 AND 20200228 GROUP BY t.combination
ORDER BY count DESC;
I want to end up with something like:
combination count
NYC, LA 3
NYC, LON 1
LA, PAR 1
NYC 1
If there's another method I'd be happy to change from string_agg.
The correct BQ syntax would be:
SELECT t.combination, count(*) count
FROM (SELECT STRING_AGG(location, ',' ORDER BY d.date) as combination
FROM location l JOIN
date d
USING (loc_id)
GROUP BY id
) t
WHERE date BETWEEN 20190101 AND 20200228
GROUP BY t.combination
ORDER BY count DESC;
Note that your JOIN condition still looks wrong.
And if you are using dates, then I would expect DATE constants.
And your date filtering code won't work in the outer query, because you haven't selected the dates in the inner query. You probably want the filtering in the inner query.
This answer does not address these issues.
BigQuery has quite good documentation. There is no WITHIN GROUP for STRING_AGG().
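To make the intended logic concrete without a BigQuery account, here is a small plain-Python sketch of what STRING_AGG(location, ',' ORDER BY date) grouped by id, followed by a count per combination, would compute. It is only an illustration, not BigQuery code: the function name combination_counts is made up for this example, the rows are the sample rows from the question, and the date filter is applied before aggregating, as suggested above.

```python
from collections import Counter, defaultdict

# Sample rows from the question: (id, loc_id, location, date as YYYYMMDD integer)
rows = [
    (1, 111, "NYC", 20200101),
    (1, 222, "LA", 20200102),
    (2, 111, "NYC", 20200103),
    (2, 333, "LON", 20200103),
    (3, 444, "NYC", 20200105),
    (4, 444, "LA", 20200106),
    (4, 555, "PAR", 20200107),
    (5, 111, "NYC", 20200110),
    (5, 222, "LA", 20200111),
]


def combination_counts(rows, start, end):
    """Count date-ordered location combinations per id within [start, end]."""
    per_id = defaultdict(list)
    for id_, _loc_id, location, date in rows:
        if start <= date <= end:  # filter rows before aggregating
            per_id[id_].append((date, location))

    combos = Counter()
    for visits in per_id.values():
        visits.sort(key=lambda v: v[0])  # stable sort: date ties keep input order
        combos[",".join(loc for _, loc in visits)] += 1
    return combos.most_common()  # most frequent combination first


counts = combination_counts(rows, 20190101, 20200228)
for combination, count in counts:
    print(combination, count)
```

Note that two locations visited on the same date (id 2 here) have no defined order unless a tie-breaker is added to the sort key, and the same applies to the ORDER BY inside STRING_AGG.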

source destination table without duplication

I'm trying to create a new data table using the table below:
ID City Language
1 Paris French
2 New York English
3 Delhi English
4 Berlin German
5 Marseille French
6 Hamburg German
The output should be something like this:
City1 City2 Language
Paris Marseille French
NY Delhi English
Berlin Hamburg German
The main idea here is to avoid two rows with the same pair of cities, for example Paris-Marseille and Marseille-Paris.
Please advise on how to achieve this.
Do you just want aggregation?
select min(city), max(city), language
from t
group by language;
If there are only two rows per language, then simple aggregation is enough:
select min(city) city1, max(city) city2, language
from mytable
group by language
If you want to handle more than two cities per language, and/or control the order in which they appear in the columns based on the id of the initial row, then you can use window functions and conditional aggregation:
select
language,
max(case when rn = 1 then city end) city1,
max(case when rn = 2 then city end) city2,
max(case when rn = 3 then city end) city3
from (
select t.*, row_number() over(partition by language order by id) rn
from mytable t
) t
group by language
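The window-function variant can be tried locally with Python's sqlite3 module. This is a sketch under a few assumptions: the table is called mytable as in the answer, only two city columns are produced because the sample data has two cities per language, and SQLite 3.25 or newer is available for ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, city TEXT, language TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "Paris", "French"),
        (2, "New York", "English"),
        (3, "Delhi", "English"),
        (4, "Berlin", "German"),
        (5, "Marseille", "French"),
        (6, "Hamburg", "German"),
    ],
)

query = """
SELECT language,
       MAX(CASE WHEN rn = 1 THEN city END) AS city1,  -- lowest id in the language
       MAX(CASE WHEN rn = 2 THEN city END) AS city2   -- second-lowest id
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY language ORDER BY id) AS rn
    FROM mytable t
) AS numbered
GROUP BY language
ORDER BY MIN(id)
"""
pairs = conn.execute(query).fetchall()
for row in pairs:
    print(row)
```

Because each language collapses into a single row, a pair such as Paris-Marseille can never reappear as Marseille-Paris.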

Oracle Sql : distinct value in a specific field [duplicate]

I have the following table :
Country Name Number
us John 45
us Jeff 35
fr Jean 31
it Luigi 25
fr Maxime 23
ca Justin 23
This table is ordered by Number. I want a query that, for each country, gives me the name with the highest number:
Country Name Number
us John 45
fr Jean 31
it Luigi 25
ca Justin 23
I tried to use DISTINCT, but I can't apply it to the country alone if I want to print the whole row.
Any ideas?
EDIT:
The table is obtained from a subquery.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE Countries AS
SELECT 'us' AS Country, 'John' AS Name, 45 AS "Number" FROM DUAL
UNION ALL SELECT 'us' AS Country, 'Jeff' AS Name, 35 AS "Number" FROM DUAL
UNION ALL SELECT 'fr' AS Country, 'Jean' AS Name, 31 AS "Number" FROM DUAL
UNION ALL SELECT 'it' AS Country, 'Luigi' AS Name, 25 AS "Number" FROM DUAL
UNION ALL SELECT 'fr' AS Country, 'Maxime' AS Name, 23 AS "Number" FROM DUAL
UNION ALL SELECT 'ca' AS Country, 'Justin' AS Name, 23 AS "Number" FROM DUAL;
Query 1:
SELECT Country,
MAX( Name ) KEEP ( DENSE_RANK FIRST ORDER BY "Number" DESC ) AS "Name",
MAX( "Number" ) AS "Number"
FROM Countries
GROUP BY Country
Results:
| COUNTRY | Name | Number |
|---------|--------|--------|
| ca | Justin | 23 |
| fr | Jean | 31 |
| it | Luigi | 25 |
| us | John | 45 |
I do not have an Oracle db handy but I got this working in my SQL Server db and am pretty sure it will work in Oracle (meaning I think I am using ANSI sql which should work in most db's):
SELECT m.Country,m.Name,m.number
FROM mytable m
INNER JOIN (
select country, MAX(number) as number
FROM mytable GROUP BY Country
) AS tmp ON m.Country = tmp.Country and m.Number = tmp.number
ORDER BY m.Number DESC
This has the added benefit that it should give you records when you have two people in a given country that have the same number.
You didn't give us a table name so I just called it mytable.
Try below query:
SELECT Country, MAX(numbeer) FROM Table_Name GROUP BY Country
Please find below the updated query, which also includes Name:
SELECT t1.* FROM table1 t1 INNER JOIN
(SELECT country, max(numbeer) as numbeer FROM table1 GROUP BY country) t2
ON t1.country=t2.country AND t1.numbeer=t2.numbeer;
Use row_number():
select t.Country, t.Name, t.Number
from (select t.*,
row_number() over (partition by country order by number desc) as seqnum
from table t
) t
where seqnum = 1;
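For a quick local check of the row_number() approach, here is a sketch using Python's sqlite3 module rather than Oracle. The table name countries and the alias ranked are assumptions for this example, the column is quoted as "Number" in the same way as in the schema setup shown earlier, and SQLite 3.25 or newer is needed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE countries (country TEXT, name TEXT, "Number" INTEGER)')
conn.executemany(
    "INSERT INTO countries VALUES (?, ?, ?)",
    [
        ("us", "John", 45),
        ("us", "Jeff", 35),
        ("fr", "Jean", 31),
        ("it", "Luigi", 25),
        ("fr", "Maxime", 23),
        ("ca", "Justin", 23),
    ],
)

query = """
SELECT country, name, "Number"
FROM (
    SELECT c.*,
           ROW_NUMBER() OVER (PARTITION BY country
                              ORDER BY "Number" DESC) AS seqnum
    FROM countries c
) AS ranked
WHERE seqnum = 1            -- keep the highest number per country
ORDER BY "Number" DESC
"""
top_per_country = conn.execute(query).fetchall()
for row in top_per_country:
    print(row)
```

ROW_NUMBER() returns exactly one row per country even when two names tie on the highest number, and which of the tied rows it keeps is arbitrary. If tied rows should all be returned, RANK() or the join against MAX would be the variants to consider.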

SELECT Top 1 ID, DISTINCT Field

I have a sample table as follows:
ID | City
--------------
1 | New York
2 | San Francisco
3 | New York
4 | Los Angeles
5 | Atlanta
I would like to select the distinct City AND the TOP ID for each. E.g., conceptually I would like to do the following
SELECT TOP 1 ID, DISTINCT City
FROM Cities
Should give me:
ID | City
--------------
1 | New York
2 | San Francisco
4 | Los Angeles
5 | Atlanta
Because New York appears twice, it's taken the first ID 1 in this instance.
But I get the error:
Column 'Cities.ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Try this way:
SELECT min(ID), City
FROM Cities
Group by City
The MIN function picks a single ID (the smallest one) from the two New York rows.
You need to have your city in a GROUP BY:
SELECT MIN(ID), City
FROM Cities
GROUP BY City
A more general solution is to use row_number, which also lets you return the other columns of the table (note that the derived table needs an alias):
select * from
(select *, row_number() over(partition by City order by ID) as rn from Cities) as t
where rn = 1
But for this particular table just grouping will do the work:
select City, Min(ID) as ID
from Cities
group by City
If you have a more complex scenario where GROUP BY cannot be used, you could use the ROW_NUMBER() function with a common table expression:
;WITH CTE AS
(
SELECT ID, City, ROW_NUMBER() OVER (PARTITION BY City ORDER BY Id) rn
FROM YourTable
)
SELECT Id, City
FROM CTE
WHERE rn = 1
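The CTE version can be checked quickly with Python's sqlite3 module. This is a sketch, not SQL Server code: it uses the Cities table from the question and assumes SQLite 3.25 or newer for ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Cities (ID INTEGER, City TEXT)")
conn.executemany(
    "INSERT INTO Cities VALUES (?, ?)",
    [
        (1, "New York"),
        (2, "San Francisco"),
        (3, "New York"),
        (4, "Los Angeles"),
        (5, "Atlanta"),
    ],
)

query = """
WITH cte AS (
    SELECT ID, City,
           ROW_NUMBER() OVER (PARTITION BY City ORDER BY ID) AS rn
    FROM Cities
)
SELECT ID, City
FROM cte
WHERE rn = 1      -- lowest ID within each city
ORDER BY ID
"""
first_ids = conn.execute(query).fetchall()
for row in first_ids:
    print(row)
```

For this table the result matches the simpler MIN(ID) with GROUP BY City; the ROW_NUMBER() form mainly pays off when more columns of the chosen row are needed.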