Select rows with the same attribute1 and different attribute2 - sql

I have a table
+----+------+---------+
| ID | CODE | COUNTRY |
+----+------+---------+
| 1 | 05 | France |
| 2 | 05 | France |
| 3 | 06 | Germany |
| 4 | 07 | France |
| 5 | 07 | Italy |
+----+------+---------+
and I need to select rows with the same code but different country.
So the result should be:
+------+---------+
| CODE | COUNTRY |
+------+---------+
| 07 | France |
| 07 | Italy |
+------+---------+
I tried
SELECT t1.code AS code, t1.country AS country
FROM countries AS t1, countries AS t2
WHERE t1.code = t2.code
AND t1.country <> t2.country;
and it works for the table in the example above.
But if the table looks like this:
+----+------+---------+
| ID | CODE | COUNTRY |
+----+------+---------+
| 1 | 05 | France |
| 2 | 05 | France |
| 3 | 06 | Germany |
| 4 | 07 | France |
| 5 | 07 | Italy |
| 6 | 07 | Italy |
+----+------+---------+
the result is:
+------+---------+
| CODE | COUNTRY |
+------+---------+
| 07 | Italy |
| 07 | Italy |
| 07 | France |
| 07 | France |
+------+---------+
but should be the same as above.
(I work with MS Access, so the query should work on Access)

Just add distinct
SELECT DISTINCT t1.code AS code, t1.country AS country
FROM countries AS t1, countries AS t2
WHERE t1.code = t2.code
AND t1.country <> t2.country;
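To see why the self-join repeats rows and what DISTINCT changes, here is a minimal sketch. It uses Python's sqlite3 module with an in-memory database as a stand-in for Access (an assumption; the SQL itself is the same as above) and loads the six-row table from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the Access table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (id INTEGER, code TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?, ?)",
    [(1, "05", "France"), (2, "05", "France"), (3, "06", "Germany"),
     (4, "07", "France"), (5, "07", "Italy"), (6, "07", "Italy")],
)

self_join = """
    SELECT {distinct} t1.code, t1.country
    FROM countries AS t1, countries AS t2
    WHERE t1.code = t2.code
      AND t1.country <> t2.country
    ORDER BY t1.country
"""

# Without DISTINCT: one output row per matching (t1, t2) pair.
plain_rows = conn.execute(self_join.format(distinct="")).fetchall()
# With DISTINCT: each (code, country) combination appears once.
distinct_rows = conn.execute(self_join.format(distinct="DISTINCT")).fetchall()

print(plain_rows)
print(distinct_rows)
```

The France row pairs with both Italy rows, and each Italy row pairs with the France row, which is where the four rows in the question come from.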

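You can use MIN and MAX to spot codes that have more than one country, then use that to select the matching rows (DISTINCT keeps each code/country pair once, as in the expected result):
SELECT DISTINCT Code, Country FROM countries WHERE Code IN (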
SELECT Code
FROM countries
GROUP BY Code
HAVING MIN(Country) < MAX(Country)
)
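A quick check of the MIN/MAX idea, again sketched with Python's sqlite3 as a stand-in for Access (an assumption). The outer query here selects DISTINCT code, country so that the output matches the result the question asks for; selecting * instead would also return the id column and both Italy rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE countries (id INTEGER, code TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO countries VALUES (?, ?, ?)",
    [(1, "05", "France"), (2, "05", "France"), (3, "06", "Germany"),
     (4, "07", "France"), (5, "07", "Italy"), (6, "07", "Italy")],
)

# Codes whose smallest and largest country names differ,
# i.e. codes that map to more than one country.
mixed_codes = conn.execute("""
    SELECT code
    FROM countries
    GROUP BY code
    HAVING MIN(country) < MAX(country)
""").fetchall()

# Use those codes to pick the code/country pairs to report.
rows = conn.execute("""
    SELECT DISTINCT code, country
    FROM countries
    WHERE code IN (
        SELECT code
        FROM countries
        GROUP BY code
        HAVING MIN(country) < MAX(country)
    )
    ORDER BY country
""").fetchall()

print(mixed_codes)
print(rows)
```

Compared with the self-join, this reads the table twice but never builds row pairs, so duplicated input rows do not multiply the output.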

Related

Make a master record out of several duplicate records for each id

My table structure is as shown below
Id | Name | City | Country | State
01 | Bob | NY | null | null
01 | Bob | null | US | null
01 | Bob | null | null | AL
02 | Roy | LA | null | null
02 | Roy | null | IN | null
02 | Roy | null | null | MG
I want to generate two output records from the above table like below.
Id | Name | City |Country | State
01 | bob | NY | US | AL
02 | Roy | LA | IN | MG
You can use aggregation:
select id, name, max(city), max(country), max(state)
from t
group by id, name;
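This works because aggregate functions such as MAX skip NULLs, so each group keeps its single non-null value per column. A small sketch using Python's sqlite3 (the table name t comes from the answer; the column types are an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id TEXT, name TEXT, city TEXT, country TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        ("01", "Bob", "NY", None, None),
        ("01", "Bob", None, "US", None),
        ("01", "Bob", None, None, "AL"),
        ("02", "Roy", "LA", None, None),
        ("02", "Roy", None, "IN", None),
        ("02", "Roy", None, None, "MG"),
    ],
)

# MAX ignores NULLs, so each column collapses to its one filled-in value.
master = conn.execute("""
    SELECT id, name, MAX(city), MAX(country), MAX(state)
    FROM t
    GROUP BY id, name
    ORDER BY id
""").fetchall()

print(master)
```

If a column could hold two different non-null values for the same id, MAX would silently pick the larger one, so this relies on the data looking like the sample.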

Select all values from last date that is shared between rows grouped by a value

I have a Postgresql table with a list of values for countries over time, and their continents. Values can be NULL. I’d like to get the sum for each continent over time, up to the latest date each continent has data for.
This is my table:
| continent | country | date | value | id |
| --------- | ------- | ---------- | ----- | --- |
| Europe | Germany | 2020-05-25 | 10 | 1 |
| Europe | Germany | 2020-05-26 | 11 | 2 |
| Europe | Germany | 2020-05-27 | 12 | 3 |
| Europe | Germany | 2020-05-28 | 13 | 4 |
| Europe | Italy | 2020-05-25 | 20 | 5 |
| Europe | Italy | 2020-05-26 | 21 | 6 |
| Europe | Italy | 2020-05-27 | 22 | 7 |
| Europe | Italy | 2020-05-28 | 23 | 8 |
| Europe | France | 2020-05-25 | 30 | 9 |
| Europe | France | 2020-05-26 | 31 | 10 |
| Europe | France | 2020-05-27 | 32 | 11 |
| Europe | France | 2020-05-28 | NULL | 12 |
| Africa | Congo | 2020-05-25 | 40 | 13 |
| Africa | Congo | 2020-05-26 | 41 | 14 |
| Africa | Congo | 2020-05-27 | NULL | 15 |
And this is what I’d like to get back. Note that Europe includes data up to the 27th, because France has no data for the 28th, and Africa up to the 26th, because that’s the last date its countries have data for.
| continent | date | value |
| --------- | ---------- | ----- |
| Europe | 2020-05-27 | 66 |
| Africa | 2020-05-26 | 41 |
| Europe | 2020-05-26 | 63 |
| Africa | 2020-05-25 | 40 |
| Europe | 2020-05-25 | 60 |
I managed to almost get there by including the number of countries per continent that have data on each date.
SELECT
countries.continent,
countries.date,
SUM(countries.value) AS value,
COUNT(countries.country) AS countries_count
FROM
countries
WHERE
countries.value IS NOT NULL
GROUP BY
countries.continent,
countries.date
ORDER BY
countries.date DESC,
countries.continent;
| continent | date | value | countries_count |
| --------- | ---------- | ----- | --------------- |
| Europe | 2020-05-28 | 36 | 2 |
| Europe | 2020-05-27 | 66 | 3 |
| Africa | 2020-05-26 | 41 | 1 |
| Europe | 2020-05-26 | 63 | 3 |
| Africa | 2020-05-25 | 40 | 1 |
| Europe | 2020-05-25 | 60 | 3 |
I also managed to get the number of countries per continent.
SELECT
countries.continent,
COUNT(DISTINCT countries.country) as number_of_countries
FROM
countries
GROUP BY
countries.continent;
| continent | number_of_countries |
| --------- | ------------------- |
| Africa | 1 |
| Europe | 3 |
I’m stuck on how to combine the two queries to filter out rows that haven’t got the full number of countries for the continent (e.g. select rows where countries_count is 3 for Europe and 1 for Africa).
This is the end result I’d like to get back:
| continent | date | value |
| --------- | ---------- | ----- |
| Europe | 2020-05-27 | 66 |
| Africa | 2020-05-26 | 41 |
| Europe | 2020-05-26 | 63 |
| Africa | 2020-05-25 | 40 |
| Europe | 2020-05-25 | 60 |
Or maybe there’s a completely different way to go about this?
You can compare the number of countries on the continent to the number that have a value on each date, and then keep only the dates where the two match ("complete data").
Unfortunately, Postgres does not support count(distinct) as a window function. But you can do:
SELECT c.continent, c.date,
SUM(c.value) AS value,
COUNT(c.country) AS countries_count
FROM (SELECT c.*,
COUNT(*) OVER (PARTITION BY continent, date) as num_on_date
FROM countries c
WHERE value IS NOT NULL
) c JOIN
(SELECT continent, COUNT(DISTINCT country) as num_countries
FROM countries
GROUP BY continent
) cc
ON cc.continent = c.continent
WHERE num_on_date = num_countries
GROUP BY c.continent, c.date
ORDER BY c.date DESC, c.continent;
You can also do this with a filter in the HAVING clause:
SELECT c.continent, c.date,
SUM(c.value) AS value,
COUNT(c.country) AS countries_count
FROM countries c
WHERE value IS NOT NULL
GROUP BY c.continent, c.date
HAVING COUNT(*) = (SELECT COUNT(DISTINCT c2.country)
FROM countries c2
WHERE c2.continent = c.continent
)
ORDER BY c.date DESC, c.continent;
This does the aggregation and then only keeps the rows where the number of rows matches the number of countries.
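The HAVING version can be tried out with the sample data. The sketch below uses Python's sqlite3 rather than Postgres (an assumption made so it runs without a server); the query uses only features the two share.

```python
import sqlite3

dates = ["2020-05-25", "2020-05-26", "2020-05-27", "2020-05-28"]
values = {
    ("Europe", "Germany"): [10, 11, 12, 13],
    ("Europe", "Italy"): [20, 21, 22, 23],
    ("Europe", "France"): [30, 31, 32, None],
    ("Africa", "Congo"): [40, 41, None],
}
# One row per (continent, country, date, value), as in the question's table.
sample = [
    (continent, country, d, v)
    for (continent, country), vals in values.items()
    for d, v in zip(dates, vals)
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE countries (continent TEXT, country TEXT, date TEXT, value INTEGER)"
)
conn.executemany("INSERT INTO countries VALUES (?, ?, ?, ?)", sample)

# Keep a (continent, date) group only if every country of that
# continent contributed a non-null value on that date.
complete = conn.execute("""
    SELECT c.continent, c.date, SUM(c.value) AS value
    FROM countries c
    WHERE c.value IS NOT NULL
    GROUP BY c.continent, c.date
    HAVING COUNT(*) = (SELECT COUNT(DISTINCT c2.country)
                       FROM countries c2
                       WHERE c2.continent = c.continent)
    ORDER BY c.date DESC, c.continent
""").fetchall()

for row in complete:
    print(row)
```

Because it counts countries rather than looking for NULLs, this approach also drops dates where a country has no row at all.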
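You can use NOT IN within your WHERE clause. Compare on (continent, date) rather than date alone, otherwise a NULL in one continent also removes that date for every other continent:
SELECT
c.continent,
c.date,
SUM(c.value) AS value,
COUNT(DISTINCT c.country) AS countries_count
FROM countries c
WHERE (c.continent, c.date) NOT IN
( SELECT continent, date
FROM countries
WHERE value IS NULL )
GROUP BY c.continent, c.date
ORDER BY c.date DESC, c.continent;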
You can filter with a HAVING clause to exclude groups where any value is NULL:
SELECT
continent,
date,
SUM(value) AS value
FROM countries
GROUP BY continent, date
HAVING BOOL_AND(value is not null)
ORDER BY date DESC, continent
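BOOL_AND is a Postgres aggregate. As a rough, portable illustration of the same filter, the sketch below swaps in HAVING COUNT(value) = COUNT(*), which is true exactly when no value in the group is NULL, and runs it with Python's sqlite3 (both the substitution and the use of SQLite are assumptions made so the example runs anywhere).

```python
import sqlite3

dates = ["2020-05-25", "2020-05-26", "2020-05-27", "2020-05-28"]
values = {
    ("Europe", "Germany"): [10, 11, 12, 13],
    ("Europe", "Italy"): [20, 21, 22, 23],
    ("Europe", "France"): [30, 31, 32, None],
    ("Africa", "Congo"): [40, 41, None],
}
sample = [
    (continent, country, d, v)
    for (continent, country), vals in values.items()
    for d, v in zip(dates, vals)
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE countries (continent TEXT, country TEXT, date TEXT, value INTEGER)"
)
conn.executemany("INSERT INTO countries VALUES (?, ?, ?, ?)", sample)

# COUNT(value) counts only non-null values, COUNT(*) counts all rows;
# the two agree only for groups without any NULL value.
no_nulls = conn.execute("""
    SELECT continent, date, SUM(value) AS value
    FROM countries
    GROUP BY continent, date
    HAVING COUNT(value) = COUNT(*)
    ORDER BY date DESC, continent
""").fetchall()

for row in no_nulls:
    print(row)
```

Like BOOL_AND, this only detects rows whose value is NULL. A country that is missing a row entirely for some date would go unnoticed.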
With SUM() window function:
select distinct c.continent, c.date,
sum(c.value) over (partition by c.continent, c.date) "value"
from countries c
where not exists (
select 1 from countries
where continent = c.continent and date = c.date and value is null
)
order by c.date desc, c.continent;
Results:
| continent | date | value |
| --------- | ------------------------ | ----- |
| Europe | 2020-05-27T00:00:00.000Z | 66 |
| Africa | 2020-05-26T00:00:00.000Z | 41 |
| Europe | 2020-05-26T00:00:00.000Z | 63 |
| Africa | 2020-05-25T00:00:00.000Z | 40 |
| Europe | 2020-05-25T00:00:00.000Z | 60 |

MS Access pull down column via query

I have the following table in MS Access:
ID | column1 | column2 | column3
---+---------+-------------------+--------------
1 | A | Publishers | Publishers
2 | 01 | Commercial |
3 | 02 | University Press |
4 | B | Place | Place
5 | 01 | United States |
6 | 04 | Western Europe |
7 | 05 | Other |
8 | C | Language | Language
9 | 01 | English |
10 | 02 | French |
I am looking for the following result
ID |column1 | column2 | column3
---+---------+-------------------+--------------
1 | A | Publishers | Publishers
2 | 01 | Commercial | Publishers
3 | 02 | University Press | Publishers
4 | B | Place | Place
5 | 01 | United States | Place
6 | 04 | Western Europe | Place
7 | 05 | Other | Place
8 | C | Language | Language
9 | 01 | English | Language
10 | 02 | French | Language
So basically I want to pull the column3 heading down into the rows below it. I have searched the net and asked friends with some MS Access knowledge, but couldn't find any "pull down" query. Copy/paste won't suffice, as this will be performed many times a day and on much larger data sets. Can this be done without VBA (I'm looking to get this done through a query)?
If you have a column that specifies the ordering (here ID), you can do this with a correlated subquery that picks the most recent non-empty heading at or above each row:
select t.id, t.column1, t.column2,
(select top 1 t2.column3
from t as t2
where t2.id <= t.id and
t2.column3 is not null
order by t2.id desc
) as column3
from t;
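Here is a minimal sketch of that correlated subquery run against the sample rows. It uses Python's sqlite3, so TOP 1 becomes LIMIT 1, and it assumes the blank column3 cells are stored as NULL rather than as empty strings (both are assumptions, not something the question states).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, column1 TEXT, column2 TEXT, column3 TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "A", "Publishers", "Publishers"),
        (2, "01", "Commercial", None),
        (3, "02", "University Press", None),
        (4, "B", "Place", "Place"),
        (5, "01", "United States", None),
        (6, "04", "Western Europe", None),
        (7, "05", "Other", None),
        (8, "C", "Language", "Language"),
        (9, "01", "English", None),
        (10, "02", "French", None),
    ],
)

# For each row, look back (by id) to the nearest row with a heading.
filled = conn.execute("""
    SELECT t.id, t.column1, t.column2,
           (SELECT t2.column3
            FROM t AS t2
            WHERE t2.id <= t.id AND t2.column3 IS NOT NULL
            ORDER BY t2.id DESC
            LIMIT 1) AS column3
    FROM t
    ORDER BY t.id
""").fetchall()

headings = [row[3] for row in filled]
print(headings)
```

If the blanks turn out to be empty strings, the IS NOT NULL test would need an extra condition such as column3 <> ''.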

Join between two tables

table1 - doctors
+---------+--------+------+
| country | state | doc |
+---------+--------+------+
| india | AP | 20 |
+---------+--------+------+
| india | TN | 30 |
+---------+--------+------+
| india | KA | 10 |
+---------+--------+------+
| US | LA | 30 |
+---------+--------+------+
| US | CA | 10 |
+---------+--------+------+
| US | NY | 50 |
+---------+--------+------+
table2 - engineers
+---------+--------+-------+
| country | state | engg |
+---------+--------+-------+
| india | AP | 100 |
+---------+--------+-------+
| india | TN | 400 |
+---------+--------+-------+
| india | KA | 250 |
+---------+--------+-------+
| US | LA | 140 |
+---------+--------+-------+
| US | CA | 120 |
+---------+--------+-------+
| US | NY | 150 |
+---------+--------+-------+
Desired output:
+---------+------+-------+
| country | doc | engg |
+---------+------+-------+
| india | 60 | 750 |
+---------+------+-------+
| US | 90 | 410 |
+---------+------+-------+
I tried the query below, but the doc and engg totals come out too high. Can someone please correct me?
select country, sum(a.doc), sum(b.engg)
from table1 a join table2 b on (a.country = b.country)
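I think your problem is that joining on country alone pairs every doctors row with every engineers row of the same country, so each value is counted several times.
Try a natural join, which matches on both shared columns (country and state), and group by country:
select country, sum(doc), sum(engg)
from doctors natural join engineers
group by country;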
You can use UNION ALL
SELECT
country,
SUM(doc) AS doc,
SUM(engg) AS engg
FROM
(SELECT
country,
doc,
0 AS engg
FROM
doctors
UNION ALL
SELECT
country,
0,
engg
FROM
engineers
) a
GROUP BY
country
You need to group by country in each table before joining, so that each side has one row per country:
select a.country, a.docSum, b.enggSum from
(select country, sum(doc) as docSum from doctors group by country) a
inner join
(select country, sum(engg) as enggSum from engineers group by country) b
on a.country = b.country
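The overcounting from joining on country alone, and the effect of aggregating each table before the join, can be checked with a small sketch. It uses Python's sqlite3 with the sample data; the aliases doc_sum and engg_sum are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doctors (country TEXT, state TEXT, doc INTEGER)")
conn.execute("CREATE TABLE engineers (country TEXT, state TEXT, engg INTEGER)")
conn.executemany("INSERT INTO doctors VALUES (?, ?, ?)", [
    ("india", "AP", 20), ("india", "TN", 30), ("india", "KA", 10),
    ("US", "LA", 30), ("US", "CA", 10), ("US", "NY", 50),
])
conn.executemany("INSERT INTO engineers VALUES (?, ?, ?)", [
    ("india", "AP", 100), ("india", "TN", 400), ("india", "KA", 250),
    ("US", "LA", 140), ("US", "CA", 120), ("US", "NY", 150),
])

# Joining on country only: 3 x 3 row pairs per country, so every
# value is summed three times.
inflated = conn.execute("""
    SELECT a.country, SUM(a.doc), SUM(b.engg)
    FROM doctors a JOIN engineers b ON a.country = b.country
    GROUP BY a.country
    ORDER BY a.country
""").fetchall()

# Aggregating each table first leaves one row per country on each side.
correct = conn.execute("""
    SELECT a.country, a.doc_sum, b.engg_sum
    FROM (SELECT country, SUM(doc) AS doc_sum
          FROM doctors GROUP BY country) a
    JOIN (SELECT country, SUM(engg) AS engg_sum
          FROM engineers GROUP BY country) b
      ON a.country = b.country
    ORDER BY a.country
""").fetchall()

print(inflated)
print(correct)
```

The inflated totals are exactly three times the correct ones here, because each country has three states in both tables.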

SQL: aggregate & transpose rows to columns

I am trying to transpose the data from the first table to the second.
original data (number of cars and states are limited):
+----+----------+-------+--------+
| id | car | state | tstamp |
+----+----------+-------+--------+
| 01 | toyota | new | 1900 |
| 02 | toyota | old | 1950 |
| 03 | toyota | scrap | 1980 |
| 04 | mercedes | new | 1990 |
| 05 | mercedes | old | 2010 |
| 06 | tesla | new | 2013 |
+----+----------+-------+--------+
query result:
+----------+------+------+-------+
| car | new | old | scrap |
+----------+------+------+-------+
| toyota | 1900 | 1950 | 1980 |
| mercedes | 1990 | 2010 | null |
| tesla | 2013 | null | null |
+----------+------+------+-------+
My SQL skills are somewhat rusty, so I would appreciate any help!
Something like this would work, depending on how your data is organised:
SELECT
car,
MAX(CASE WHEN state = 'new' THEN tstamp END) AS new,
MAX(CASE WHEN state = 'old' THEN tstamp END) AS old,
MAX(CASE WHEN state = 'scrap' THEN tstamp END) AS scrap
FROM
table
GROUP BY
car;
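As a rough check of the conditional-aggregation pivot, here is a sketch run with Python's sqlite3. The table name cars is made up for the example (the answer leaves the real name open), and ORDER BY MIN(id) is added only to reproduce the row order shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id TEXT, car TEXT, state TEXT, tstamp INTEGER)")
conn.executemany("INSERT INTO cars VALUES (?, ?, ?, ?)", [
    ("01", "toyota", "new", 1900),
    ("02", "toyota", "old", 1950),
    ("03", "toyota", "scrap", 1980),
    ("04", "mercedes", "new", 1990),
    ("05", "mercedes", "old", 2010),
    ("06", "tesla", "new", 2013),
])

# Each CASE yields tstamp for one state and NULL otherwise;
# MAX then picks the single non-null value per car, if any.
pivot = conn.execute("""
    SELECT car,
           MAX(CASE WHEN state = 'new'   THEN tstamp END) AS "new",
           MAX(CASE WHEN state = 'old'   THEN tstamp END) AS "old",
           MAX(CASE WHEN state = 'scrap' THEN tstamp END) AS "scrap"
    FROM cars
    GROUP BY car
    ORDER BY MIN(id)
""").fetchall()

for row in pivot:
    print(row)
```

This relies on the set of states being fixed and known in advance, as the question says, since each output column has to be written out by hand.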