SQL Query (SQLite3) on inner joins and SUM

I have two tables with schema
CREATE TABLE City
(
City CHAR,
Area INT
)
CREATE TABLE Revenue
(
Municipality CHAR,
City_Code CHAR,
[Road tax] INT,
[Water Tax] INT,
[Entertainment Tax] INT
)
I need to rank the top 3 cities in descending order of area while also printing the sum of the total tax obtained by the city. Something along the lines of (e.g.):
New York, 40
Chicago, 27
London, 30
The City field from table City and the City_Code field from table Revenue need to be used to identify cities between the tables.

Your schema doesn't make sense. There should be a common key in both your tables so that it becomes possible to fetch CITY name from CITY table by joining on the CITY_CODE column. I have made this assumption in the below query.
SELECT B.CITY, (A.ROAD_TAX + A.WATER_TAX + A.ENTERTAINMENT_TAX) AS TOTAL_TAX
FROM
REVENUE A
INNER JOIN
CITY B
ON A.CITY_CODE = B.CITY_CODE
ORDER BY (A.ROAD_TAX + A.WATER_TAX + A.ENTERTAINMENT_TAX) DESC
LIMIT 3;

Try this Code:
SELECT C.City, (R.[Road tax] + R.[Water Tax] + R.[Entertainment Tax]) AS [Total Tax]
FROM City C
JOIN Revenue R ON R.City_Code = C.City
ORDER BY C.Area DESC
LIMIT 3;
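Since Revenue is keyed by municipality, a city may have several revenue rows, so the taxes probably need to be summed per city before ranking. Here is a minimal sketch using Python's sqlite3 module. The sample rows are made up for illustration, top_cities_by_area is a hypothetical helper name, and the join on City_Code = City follows the question's description.

```python
import sqlite3

def top_cities_by_area(conn, limit=3):
    """Return (city, total_tax) rows for the largest cities by area."""
    sql = """
        SELECT c.City,
               SUM(r.[Road tax] + r.[Water Tax] + r.[Entertainment Tax]) AS total_tax
        FROM City AS c
        JOIN Revenue AS r ON r.City_Code = c.City
        GROUP BY c.City, c.Area          -- one output row per city
        ORDER BY c.Area DESC             -- rank by area, not by tax
        LIMIT ?
    """
    return conn.execute(sql, (limit,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE City (City CHAR, Area INT);
    CREATE TABLE Revenue (
        Municipality CHAR, City_Code CHAR,
        [Road tax] INT, [Water Tax] INT, [Entertainment Tax] INT
    );
    -- Made-up sample rows, for illustration only.
    INSERT INTO City VALUES ('New York', 900), ('Chicago', 600),
                            ('London', 500), ('Paris', 100);
    INSERT INTO Revenue VALUES
        ('M1', 'New York', 10, 5, 5),
        ('M2', 'New York', 10, 5, 5),
        ('M3', 'Chicago', 9, 9, 9),
        ('M4', 'London', 10, 10, 10),
        ('M5', 'Paris', 1, 1, 1);
""")
print(top_cities_by_area(conn))
```

Note that if any tax column can be NULL, the per-row sum becomes NULL and that row is skipped by SUM, so wrapping each column in COALESCE(col, 0) may be needed.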

Related

How do I select all values which satisfy certain aggregate properties?

Say, I have a table (named "Customers") which consists of:
CustomerName
Country
City
I am trying to list the names of all customers from cities where there are at least two customers.
This is my initial attempt:
SELECT CustomerName, City
FROM Customers
GROUP BY City
HAVING COUNT(City) > 1
This is the result that I got:
CustomerName | City
Person A | New York
Person C | Los Angeles
Here, Person A is the customer from New York who appears first in the table, and similarly Person C for Los Angeles. However, what I wanted was a listing of all customers from New York and Los Angeles.
When I tried:
SELECT COUNT(CustomerName), City
FROM Customers
GROUP BY City
HAVING COUNT(City) > 1
I got:
COUNT(CustomerName) | City
3 | New York
5 | Los Angeles
This means that the grouping is working properly, except that my original query only displays the first customer in the table from each of New York and Los Angeles. How do I resolve this issue?
Get cities with more than 1 customer in a subquery, and use that list to select the customers:
SELECT cst.CustomerName
FROM Customers cst
WHERE Cst.City in (
-- All cities where there are at least two customers
SELECT CstGE2.City
FROM Customers CstGE2
GROUP BY CstGE2.City
HAVING count(*) >= 2
)
How about this? I've taken the city out of the select part since you said you just wanted customer names.
SELECT a.CustomerName
FROM Customers a
WHERE (
SELECT COUNT(b.CustomerName)
FROM Customers b
WHERE b.City = a.City
) > 1
ORDER BY a.CustomerName
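The underlying reason the original attempt shows one person per city is that GROUP BY collapses each group into a single row, so the filter has to live in a subquery while the outer query stays ungrouped. A small sketch of the IN-subquery approach, run through Python's sqlite3 module with made-up customers (the names and the Boston row are assumptions for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustomerName TEXT, Country TEXT, City TEXT);
    -- Made-up sample rows: Boston has only one customer.
    INSERT INTO Customers VALUES
        ('Person A', 'USA', 'New York'),
        ('Person B', 'USA', 'New York'),
        ('Person C', 'USA', 'Los Angeles'),
        ('Person D', 'USA', 'Los Angeles'),
        ('Person E', 'USA', 'Boston');
""")

# The subquery finds qualifying cities; the outer query lists every
# customer in those cities without collapsing them into groups.
names = [row[0] for row in conn.execute("""
    SELECT CustomerName
    FROM Customers
    WHERE City IN (
        SELECT City FROM Customers GROUP BY City HAVING COUNT(*) >= 2
    )
    ORDER BY CustomerName
""")]
print(names)
```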

How can I count unique attribute values using two attributes and joining two tables?

I'm a beginner in SQL.
Simplified, I have two tables, districts and streetdistricts, which contain information about city districts and streets. Every district has a unique number dkey and every street has a unique street number stkey (as primary keys respectively).
Here's an example:
Table districts:
dkey | name
1 | Inner City
2 | Outer City
3 | Outskirts
Table streetdistricts:
stkey | dkey
113 | 1
126 | 2
148 | 2
148 | 3
152 | 3
154 | 3
What I want to do now is to find out how many streets are there per district that are located only in one single district. So that means I do not have to just remove duplicates (like street with stkey 148 here), but instead to remove streets that are situated in more than one district completely so that I only see the districts and the number of streets per district that are just located in one district only.
For this example, this would be:
name number_of_street_in_just_this_district
Inner City 1
Outer City 1
Outskirts 2
I've tried many things, but I always get stuck, mostly because when I SELECT the name of the district, SQL says it is also needed in GROUP BY, and when I add it, either the whole number of streets (here: 6) or at least the number including the streets shared between districts (here: 5) is displayed, but not the right total of 4 (1 + 1 + 2).
Or I'm not able to JOIN the tables correctly so to get the output I want. Here is my last try:
SELECT SUM(StreetDistricts.dkey) as d_number, StreetDistricts.stkey, COUNT(StreetDistricts.stkey) as numb
FROM StreetDistricts
INNER JOIN Districts ON Districts.dkey = StreetDistricts.dkey
GROUP BY StreetDistricts.stkey
HAVING COUNT(StreetDistricts.dkey) = 1
ORDER BY d_number DESC
This works to get me the correct sum of rows, but I was not able to combine/join it with the other table to receive name and number of unique streets.
First obtain the streets that are found in only one district (cte1). Then count just those streets per district. This should do it:
WITH cte1 AS (
SELECT stkey FROM StreetDistricts GROUP BY stkey HAVING COUNT(DISTINCT dkey) = 1
)
SELECT d.name, COUNT(*) AS n
FROM StreetDistricts AS s
JOIN Districts AS d
ON s.dkey = d.dkey
AND s.stkey IN (SELECT stkey FROM cte1)
GROUP BY d.dkey
;
Result:
+------------+---+
| name | n |
+------------+---+
| Inner City | 1 |
| Outer City | 1 |
| Outskirts | 2 |
+------------+---+
Note: I used the fact that dkey is the primary key of Districts to avoid having to GROUP BY d.name as well. This is guaranteed by functional dependence. If your database doesn't guarantee that with a constraint, just add d.name to the final GROUP BY terms.
The test case:
CREATE TABLE Districts (dkey int primary key, name varchar(30));
CREATE TABLE StreetDistricts (stkey int, dkey int);
INSERT INTO Districts VALUES
(1,'Inner City')
, (2,'Outer City')
, (3,'Outskirts')
;
INSERT INTO StreetDistricts VALUES
(113,1)
, (126,2)
, (148,2)
, (148,3)
, (152,3)
, (154,3)
;
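For anyone who wants to try this without a database server, here is a sketch that loads the same test case into an in-memory SQLite database via Python's sqlite3 module. It is not the original setup: the IN filter is moved to a WHERE clause and d.name is added to GROUP BY, which should be equivalent for an inner join, and the CTE name single_district is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Districts (dkey int primary key, name varchar(30));
    CREATE TABLE StreetDistricts (stkey int, dkey int);
    INSERT INTO Districts VALUES
        (1, 'Inner City'), (2, 'Outer City'), (3, 'Outskirts');
    INSERT INTO StreetDistricts VALUES
        (113, 1), (126, 2), (148, 2), (148, 3), (152, 3), (154, 3);
""")

rows = conn.execute("""
    WITH single_district AS (
        -- streets that appear in exactly one district
        SELECT stkey
        FROM StreetDistricts
        GROUP BY stkey
        HAVING COUNT(DISTINCT dkey) = 1
    )
    SELECT d.name, COUNT(*) AS n
    FROM StreetDistricts AS s
    JOIN Districts AS d ON s.dkey = d.dkey
    WHERE s.stkey IN (SELECT stkey FROM single_district)
    GROUP BY d.dkey, d.name
    ORDER BY d.dkey
""").fetchall()
print(rows)
```

One thing to be aware of: a district whose streets are all shared with other districts would not appear in the output at all, because the inner join leaves it with no rows.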

select result in order of maximum match in a string sql

I have a string column Countries in table tblCountry, for example 'India, Austrailia, US, UK'.
Whatever sequence is supplied, for example 'US, Germany, UK, India, Russia', I have to split the string on commas and then display all rows where even a single one of those countries is present in the Countries column value.
I have achieved this much with the query below. The difficult part is that I have to display first the rows whose column value has the most matches.
For example if below are the column values:
'India, Austrailia, US, Italy'
'India, Malaysia, Austrailia, US, UK'
'UK, Austrailia, France, Korea, India'
'China, India, US, UK'
and the input string is 'Austrailia, UK, India, Korea, Germany', then the rows with the most matching countries should come first, ordered by relevance, like this:
'UK, Austrailia, France, Korea, India'
'India, Malaysia, Austrailia, US, UK'
'India, Austrailia, US, Italy'
'China, India, US, Brazil'
declare @matchCount int = 0;
SET @matchCount = (SELECT count(Item)
FROM dbo.SplitCountry('Austrailia, UK, India, Korea, Germany' , ',')
where Countries like '%'+Item+'%')
SELECT Countries, CASE
WHEN @matchCount > 0 THEN @matchCount
ELSE 0
END
FROM tblCountry
I used a function, SplitCountry, to split the string. Also, I cannot use full-text search here.
You have to:
create temp table #t1 (CountriesID, Country)
create temp table #t2 (Country)
populate #t1 by using your split function, so you have:
CountriesID Country
1 India
1 Austrailia
1 US
1 Italy
2 India
2 Malaysia
2 Austrailia
2 US
2 UK
3 ... and so on
populate #t2 by using your split function, so #t2 contains the input values
finally:
SELECT #t1.CountriesID, c.Countries
FROM #t1
INNER JOIN #t2 on #t2.Country = #t1.Country
INNER JOIN tblCountry c ON c.CountriesID = #t1.CountriesID
GROUP BY #t1.CountriesID, c.Countries
ORDER BY COUNT(*) DESC
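The steps above can be sketched end to end as follows. This is only an illustration in Python with an in-memory SQLite database, not SQL Server: the splitting is done in Python instead of the SplitCountry function, plain tables t1 and t2 stand in for #t1 and #t2, and split_countries and rank_by_matches are hypothetical helper names. Ties are broken by CountriesID, which is my own choice.

```python
import sqlite3

def split_countries(text):
    """Split a comma-separated list and trim whitespace around each item."""
    return [part.strip() for part in text.split(",") if part.strip()]

def rank_by_matches(rows, wanted):
    """Return (Countries, match_count) ordered by most matches first."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tblCountry (CountriesID INTEGER PRIMARY KEY, Countries TEXT);
        CREATE TABLE t1 (CountriesID INTEGER, Country TEXT);  -- stands in for #t1
        CREATE TABLE t2 (Country TEXT);                       -- stands in for #t2
    """)
    for countries_id, countries in enumerate(rows, start=1):
        conn.execute("INSERT INTO tblCountry VALUES (?, ?)",
                     (countries_id, countries))
        conn.executemany("INSERT INTO t1 VALUES (?, ?)",
                         [(countries_id, c) for c in split_countries(countries)])
    # De-duplicate the input so a repeated country is not counted twice.
    conn.executemany("INSERT INTO t2 VALUES (?)",
                     [(c,) for c in set(split_countries(wanted))])
    return conn.execute("""
        SELECT c.Countries, COUNT(*) AS matches
        FROM t1
        JOIN t2 ON t2.Country = t1.Country
        JOIN tblCountry AS c ON c.CountriesID = t1.CountriesID
        GROUP BY c.CountriesID, c.Countries
        ORDER BY matches DESC, c.CountriesID
    """).fetchall()

sample_rows = [
    'India, Austrailia, US, Italy',
    'India, Malaysia, Austrailia, US, UK',
    'UK, Austrailia, France, Korea, India',
    'China, India, US, UK',
]
ranked = rank_by_matches(sample_rows, 'Austrailia, UK, India, Korea, Germany')
for countries, matches in ranked:
    print(matches, countries)
```

Joining on exact equality of the split values avoids a trap with LIKE '%...%' matching: a pattern such as '%US%' can also match inside 'Austrailia' under a case-insensitive collation.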
Finally, after a lot of research, I found an easy solution to my problem. The query below works perfectly for me:
DECLARE @patterns TABLE (
pattern VARCHAR(20)
);
INSERT INTO @patterns VALUES ('%india%'), ('%france%'), ('%US%'),('%UK%')
;WITH CTEORD AS
(
SELECT a.*,ROW_NUMBER() Over(Partition By country Order By p.pattern) AS RN FROM Countries a
JOIN @patterns p
ON (a.country LIKE p.pattern)
)
select MAX(RN) As RN,country
from CTEORD group by country
order by RN desc

Get top values from two columns

Let's say I have a table like this:
id | peru | usa
1 20 10
2 5 100
3 1 5
How can I get the top values from peru and usa as well as the specific ids? The result I want is:
usa_id: 2 | usa: 100 | peru_id: 1 | peru: 20
Is this possible in one query, or do I have to do two ORDER BY queries?
I'm using PostgreSQL.
You can do this with some subqueries and a cross join:
select
u.id usa_id,
u.usa,
p.id peru_id,
p.peru
from
(select id, usa from mytable where usa=(select max(usa) from mytable) order by id limit 1) u
cross join (select id, peru from mytable where peru=(select max(peru) from mytable) order by id limit 1) p
;
In the case that there are multiple rows with the same max value (for usa or peru, independently), this solution will select the one with the lowest id (I've assumed that id is unique).
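To check the query against the sample data from the question, here is a quick sketch using Python's sqlite3 module. It assumes SQLite accepts the same statement as PostgreSQL here, which it should since only subqueries, CROSS JOIN and LIMIT are involved. The table name mytable comes from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, peru INTEGER, usa INTEGER);
    INSERT INTO mytable VALUES (1, 20, 10), (2, 5, 100), (3, 1, 5);
""")

# Each derived table yields exactly one row, so the cross join
# produces a single combined row.
row = conn.execute("""
    SELECT u.id AS usa_id, u.usa, p.id AS peru_id, p.peru
    FROM (SELECT id, usa FROM mytable
          WHERE usa = (SELECT MAX(usa) FROM mytable)
          ORDER BY id LIMIT 1) AS u
    CROSS JOIN
         (SELECT id, peru FROM mytable
          WHERE peru = (SELECT MAX(peru) FROM mytable)
          ORDER BY id LIMIT 1) AS p
""").fetchone()
print(row)  # (usa_id, usa, peru_id, peru)
```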
SELECT
t1.id as peru_id, t1.peru
, t2.id as usa_id, t2.usa
FROM tab1 t1, tab1 t2
ORDER BY t1.peru desc, t2.usa desc
limit 1
http://sqlfiddle.com/#!15/0c12f/6
Basically, what this does is a simple Cartesian product, so I guess that performance WILL be poor for large datasets.
On the fiddle it took 196 ms for a 1k-row table. On a 10k-row table, SQL Fiddle hung.
You can consider using MAX aggregate function in conjunction with ARRAY type. Check this out:
CREATE TEMPORARY TABLE _test(
id integer primary key,
peru integer not null,
usa integer not null
);
INSERT INTO _test(id, peru, usa)
VALUES
(1,20,10),
(2,5,100),
(3,1,5);
SELECT MAX(ARRAY[peru, id]) AS max_peru, MAX(array[usa, id]) AS max_usa FROM _test;
SELECT x.max_peru[1] AS peru, x.max_peru[2] AS peru_id, x.max_usa[1]
AS usa, x.max_usa[2] AS usa_id FROM (
SELECT MAX(array[peru, id]) AS max_peru,
MAX(array[usa, id]) AS max_usa FROM _test ) as x;
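The array trick works because arrays compare element by element, so the maximum of [value, id] pairs is the pair with the largest value. Python tuples compare the same way, which gives a rough analogy (an illustration only, not PostgreSQL code; the variable names are mine):

```python
# (id, peru, usa) rows from the question
rows = [(1, 20, 10), (2, 5, 100), (3, 1, 5)]

# Tuples compare element by element, like the [value, id] arrays:
# the first element decides, and the id only breaks ties.
max_peru = max((peru, id_) for id_, peru, usa in rows)
max_usa = max((usa, id_) for id_, peru, usa in rows)

print(max_peru, max_usa)
```

A consequence worth knowing: when two rows tie on the value, this approach picks the one with the highest id, unlike the subquery answer, which picks the lowest.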

Invalid count and sum in cross tab query using PostgreSQL

I am using PostgreSQL 9.3 version database.
I have a situation where I want to count the number of sales per product, sum the product amounts, and also show, in one column per city, where each product has sales.
Example
Setup
create table products (
name varchar(20),
price integer,
city varchar(20)
);
insert into products values
('P1',1200,'London'),
('P1',100,'Melborun'),
('P1',1400,'Moscow'),
('P2',1560,'Munich'),
('P2',2300,'Shunghai'),
('P2',3000,'Dubai');
Crosstab query:
select * from crosstab (
'select name,count(*),sum(price),city,count(city)
from products
group by name,city
order by name,city
'
,
'select distinct city from products order by 1'
)
as tb (
name varchar(20),TotalSales bigint,TotalAmount bigint,London bigint,Melborun bigint,Moscow bigint,Munich bigint,Shunghai bigint,Dubai bigint
);
Output
name totalsales totalamount london melborun moscow munich shunghai dubai
---------------------------------------------------------------------------------------------------------
P1 1 1200 1 1 1
P2 1 3000 1 1 1
Expected Output:
name totalsales totalamount london melborun moscow munich shunghai dubai
---------------------------------------------------------------------------------------------------------
P1 3 2700 1 1 1
P2 3 6860 1 1 1
Your first mistake seems to be simple. According to the 2nd parameter of the crosstab() function, 'Dubai' must come as first city (sorted by city). Details:
PostgreSQL Crosstab Query
The unexpected values for totalsales and totalamount represent values from the first row for each name group. "Extra" columns are treated like that. Details:
Pivot on Multiple Columns using Tablefunc
To get sums per name, run window functions over your aggregate functions. Details:
Get the distinct sum of a joined table column
select * from crosstab (
'select name
,sum(count(*)) OVER (PARTITION BY name)
,sum(sum(price)) OVER (PARTITION BY name)
,city
,count(city)
from products
group by name,city
order by name,city
'
-- ,'select distinct city from products order by 1' -- replaced
,$$SELECT unnest('{Dubai,London,Melborun
,Moscow,Munich,Shunghai}'::varchar[])$$
) AS tb (
name varchar(20), TotalSales bigint, TotalAmount bigint
,Dubai bigint
,London bigint
,Melborun bigint
,Moscow bigint
,Munich bigint
,Shunghai bigint
);
Better yet, provide a static set as 2nd parameter. Output columns are hard-coded, so it may be unreliable to generate data columns dynamically. If you add another row with a new city, this would break.
This way you can also order your columns as you like. Just keep output columns and 2nd parameter in sync.
Honestly I think your database needs some drastic normalization and your results in several columns (one for each city name) is not something I would do myself.
Nevertheless if you want to stick to it you can do it this way.
For the first step, you need to get the correct amounts. This would do the trick quite fast:
select name, count(1) totalsales, sum(price) totalAmount
from products
group by name;
This will be your result:
NAME TOTALSALES TOTALAMOUNT
P2 3 6860
P1 3 2700
You would get the Products/City this way:
select name, city, count(1) totalCityName
from products
group by name, city
order by name, city;
This result:
NAME CITY TOTALCITYNAME
P1 London 1
P1 Melborun 1
P1 Moscow 1
P2 Dubai 1
P2 Munich 1
P2 Shunghai 1
If you really would like a column per city you could do something like:
select name,
count(1) totalsales,
sum(price) totalAmount,
(select count(1)
from Products a
where a.City = 'London' and a.name = p.name) London,
...
from products p
group by name;
But I would not recommend it!!!
This would be the result:
NAME TOTALSALES TOTALAMOUNT LONDON ...
P1 3 2700 1
P2 3 6860 0
Demonstration here.
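If a column per city is really required, conditional aggregation is another option that avoids one correlated subquery per city. This is a different technique from the ones above, not the crosstab() function, and it is sketched here against an in-memory SQLite database through Python's sqlite3 module rather than PostgreSQL. Only two of the six city columns are shown; the rest would follow the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (name varchar(20), price integer, city varchar(20));
    INSERT INTO products VALUES
        ('P1', 1200, 'London'),
        ('P1', 100,  'Melborun'),
        ('P1', 1400, 'Moscow'),
        ('P2', 1560, 'Munich'),
        ('P2', 2300, 'Shunghai'),
        ('P2', 3000, 'Dubai');
""")

# One pass over the table: each CASE counts only the rows for its city.
rows = conn.execute("""
    SELECT name,
           COUNT(*)   AS totalsales,
           SUM(price) AS totalamount,
           SUM(CASE WHEN city = 'London' THEN 1 ELSE 0 END) AS london,
           SUM(CASE WHEN city = 'Dubai'  THEN 1 ELSE 0 END) AS dubai
    FROM products
    GROUP BY name
    ORDER BY name
""").fetchall()
print(rows)
```

As with crosstab(), the city columns are hard-coded, so a new city in the data still needs a new column in the query.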