SQL ORACLE UNPIVOT

I cannot get this UNPIVOT to work.
select *
from (select c.country_id, r.region_id, count(*) liczba
from countries c join regions r on r.region_id = c.region_id
group by c.country_id, r.region_id)
UnPivot(
count(liczba) for r.region_id in (any))
order by c.country_id
This is my code, and it still does not work :(
select r.region_name, count(*)
from countries c join regions r on r.region_id = c.region_id
group by r.region_name
This one works properly. The UNPIVOT query above fails with "invalid identifier".

Your query does not make sense as:
UNPIVOT is used to transpose columns into rows; you only have 3 columns country_id, region_id and liczba.
The UNPIVOT syntax is:
UNPIVOT ( value FOR key IN ( column1 AS 'alias1', column2 AS 'alias2' ) )
You cannot have an aggregation expression as the value.
You need to specify which columns you are UNPIVOTing to rows; you cannot use any unless you have defined a column called any.
You appear to want to use PIVOT rather than UNPIVOT.
I would like to see, how many people work in each job_id
Use GROUP BY and COUNT:
SELECT job_id,
COUNT(DISTINCT EMPLOYEE_ID) AS number_of_employees
FROM employees
GROUP BY job_id
using unpivot
That's not what UNPIVOT is for; UNPIVOT converts columns to rows whereas you want to aggregate rows together.
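To make the distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the wide `sales` table are made up for illustration and are not from the question. SQLite has no UNPIVOT operator, so the columns-to-rows step is emulated with UNION ALL; in Oracle you would write a real UNPIVOT clause instead.

```python
import sqlite3


def demo_count_vs_unpivot():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, job_id TEXT);
        INSERT INTO employees VALUES (1, 'IT_PROG'), (2, 'IT_PROG'), (3, 'SA_REP');

        -- Hypothetical wide table: one column per quarter.
        CREATE TABLE sales (product TEXT, q1 INTEGER, q2 INTEGER);
        INSERT INTO sales VALUES ('widget', 10, 20);
    """)

    # Aggregating rows together is a job for GROUP BY, not UNPIVOT.
    counts = conn.execute("""
        SELECT job_id, COUNT(DISTINCT employee_id) AS number_of_employees
        FROM employees
        GROUP BY job_id
        ORDER BY job_id
    """).fetchall()

    # Turning columns into rows is what UNPIVOT does (emulated here).
    unpivoted = conn.execute("""
        SELECT product, 'Q1' AS quarter, q1 AS amount FROM sales
        UNION ALL
        SELECT product, 'Q2' AS quarter, q2 AS amount FROM sales
        ORDER BY quarter
    """).fetchall()

    conn.close()
    return counts, unpivoted
```

The first query collapses several rows into one per job_id; the second turns the two quarter columns of a single row into two rows.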

Related

SQL subquery and column naming issue

Dear all,
This is the correct code:
SELECT MAX(inflation_rate) AS max_inf
FROM (
SELECT name, continent, inflation_rate
FROM countries
INNER JOIN economies
USING (code)
WHERE year = 2015) AS subquery
GROUP BY continent;
This is the incorrect code:
SELECT MAX(subquery.economies.inflation_rate) AS max_inf
FROM
(SELECT countries.name, countries.continent, economies.inflation_rate
FROM countries
INNER JOIN economies
ON countries.code = economies.code
WHERE economies.year = 2015) AS subquery
GROUP BY subquery.countries.continent;
Why is the second one not allowed?
SELECT
MAX(subquery.economies.inflation_rate) AS max_inf -- 3
FROM (
SELECT
countries.name, -- 1
countries.continent,
economies.inflation_rate
FROM ...) AS subquery -- 2
GROUP BY
subquery.countries.continent; -- 3
You are using a subquery (2). This subquery returns three columns: name, continent, inflation_rate (1). Only these names are visible outside the subquery, nothing else. The outer query therefore knows nothing about where the columns came from; the original table or schema is irrelevant to it.
So for the outer query the only relevant information is the name of the subquery and its column names (3):
SELECT
MAX(subquery.inflation_rate) AS max_inf -- change
FROM (
SELECT
countries.name,
countries.continent,
economies.inflation_rate
FROM ...) AS subquery
GROUP BY
subquery.continent; -- change
Assuming this is PostgreSQL, you could simplify this and get rid of the subquery:
SELECT continent
, max(inflation_rate) as max_inf
FROM countries
INNER JOIN economies USING (code)
WHERE year = 2015
group by continent
You don't need to write subquery.countries.continent: the derived table has its own alias, so subquery.continent is enough.
SELECT MAX(subquery.inflation_rate) AS max_inf FROM
(SELECT countries.name, countries.continent, economies.inflation_rate
FROM countries INNER JOIN economies
ON countries.code = economies.code
WHERE economies.year = 2015) AS subquery
GROUP BY subquery.continent
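The naming rule from the answers above can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than the asker's database, and the three sample countries are invented for illustration. The exact error text differs between database systems, so the sketch only records that an error occurs.

```python
import sqlite3


def demo_subquery_names():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE countries (code TEXT, name TEXT, continent TEXT);
        CREATE TABLE economies (code TEXT, year INTEGER, inflation_rate REAL);
        INSERT INTO countries VALUES
            ('A', 'Aland', 'Europe'), ('B', 'Bland', 'Europe'), ('C', 'Cland', 'Asia');
        INSERT INTO economies VALUES
            ('A', 2015, 1.5), ('B', 2015, 3.0), ('C', 2015, 2.0);
    """)
    inner = """
        SELECT countries.name, countries.continent, economies.inflation_rate
        FROM countries
        INNER JOIN economies ON countries.code = economies.code
        WHERE economies.year = 2015
    """

    # Outside the subquery only <alias>.<column> is known.
    good = conn.execute(f"""
        SELECT subquery.continent, MAX(subquery.inflation_rate) AS max_inf
        FROM ({inner}) AS subquery
        GROUP BY subquery.continent
        ORDER BY subquery.continent
    """).fetchall()

    # Qualifying with the original table name is rejected.
    try:
        conn.execute(f"""
            SELECT MAX(subquery.economies.inflation_rate) AS max_inf
            FROM ({inner}) AS subquery
            GROUP BY subquery.countries.continent
        """)
        error = None
    except sqlite3.OperationalError as exc:
        error = str(exc)

    conn.close()
    return good, error
```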

Query all columns of table1 left join and count of the table2

I couldn't get this query working :
DOESN'T WORK
select
Region.*, count(secteur.*) count
from
Region
left join
secteur on secteur.region_id = Region.id
The solution I found is below. Is there a better solution using joins, or is this one fine performance-wise? I have a very large dataset of about 500K rows.
WORKS BUT AFRAID OF PERFORMANCE ISSUES
select
Region.*,
(select count(*)
from Secteur
where Secteur.Region_id = region.id) count
from
Region
I would suggest:
select region.*, count(secteur.region_id) as count
from region left join secteur on region.id = secteur.region_id
group by region.id, region.field2, region.field3....
Note that count(table.field) will ignore nulls, whereas count(*) will include them.
Alternatively, left join on a subquery and use coalesce to avoid nulls:
select region.*, coalesce(t.c, 0) as count
from region left join
(select region_id, count(*) as c from secteur group by region_id) t on region.id = t.region_id
I'd join region on an aggregate query of secteur:
SELECT r.*, COALESCE(s.cnt, 0)
FROM region r
LEFT JOIN (SELECT region_id, COUNT(*) AS cnt
FROM secteur
GROUP BY region_id) s ON s.region_id = r.id
I would go with this query:
select r.*,
(select count(*)
from Secteur s
where s.Region_id = r.id
) as num_secteurs
from Region r;
Then fix the performance problem by adding an index on Secteur(region_id):
create index idx_secteur_region on secteur(region_id);
You made two mistakes.
First: you tried to compute COUNT() over only one table (the second one). That does not work, because COUNT(), like any aggregate function, is evaluated over the whole set of joined rows, not over one of the joined tables.
In your first query, you can replace secteur.* with a plain asterisk, as in Region.region_id, count(*) AS count, and do not forget to add Region.region_id to the GROUP BY clause.
Second: besides the aggregate function, your query also selects other fields (select Region.*), but you do not list them in GROUP BY. Every column that appears in the SELECT list without an aggregate function applied to it must also appear in GROUP BY.
Addendum: no, GROUP BY Region.* will not work; you have to list the columns in GROUP BY by their actual names.
So the correct form looks like this:
SELECT
Region.col1
,Region.col2
, count(*) count
from Region
left join
secteur on secteur.region_id = Region.id
GROUP BY Region.col1, Region.col2
Or, if you don't want to type each column name, use a window function:
-- note: this returns one row per joined row, not one row per region
SELECT
Region.*
, count( * ) OVER (PARTITION BY region_id) AS count
from Region
left join
secteur on secteur.region_id = Region.id
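The difference between count(column) and count(*) after a LEFT JOIN is easy to miss, so here is a minimal sketch of it using Python's built-in sqlite3 module. The two regions and their rows are invented sample data, and the column `name` is an assumption; only `id` and `region_id` come from the question.

```python
import sqlite3


def demo_left_join_counts():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE region (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE secteur (id INTEGER PRIMARY KEY, region_id INTEGER);
        INSERT INTO region VALUES (1, 'North'), (2, 'South');
        INSERT INTO secteur VALUES (10, 1), (11, 1);  -- 'South' has no secteur
        CREATE INDEX idx_secteur_region ON secteur(region_id);
    """)

    # COUNT(s.region_id) skips the NULLs produced by the LEFT JOIN,
    # while COUNT(*) still counts the single NULL-extended row.
    joined = conn.execute("""
        SELECT r.id, r.name,
               COUNT(s.region_id) AS count_column,
               COUNT(*)           AS count_star
        FROM region r
        LEFT JOIN secteur s ON s.region_id = r.id
        GROUP BY r.id, r.name
        ORDER BY r.id
    """).fetchall()

    # Pre-aggregating secteur avoids listing every region column in GROUP BY.
    preaggregated = conn.execute("""
        SELECT r.id, r.name, COALESCE(t.cnt, 0) AS cnt
        FROM region r
        LEFT JOIN (SELECT region_id, COUNT(*) AS cnt
                   FROM secteur
                   GROUP BY region_id) t ON t.region_id = r.id
        ORDER BY r.id
    """).fetchall()

    conn.close()
    return joined, preaggregated
```

For the region without any secteur, count_column is 0 but count_star is 1, which is why count(*) is the wrong choice when empty regions should report zero.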

Select the countries with fewest number of tuples

At http://www.dofactory.com/sql/sandbox I'm experimenting with submitting my own SQL queries against their sample database to become better at SQL. What I want to do is to select all countries from Customer that have exactly the fewest number of tuples. Here is my query attempt:
SELECT a.Country
FROM [Customer] a, (SELECT COUNT(*) AS Tot
FROM [Customer]
GROUP BY Country) b
GROUP BY a.Country
HAVING COUNT(*) = MIN(b.Tot)
However, the website returns an empty table instead of the correct result which is (Ireland, Norway, Poland). The correct result is easily realized by grouping the table by country and using COUNT(*), and then looking at the countries that have the smallest COUNT(*) value out of all COUNT(*) values. I would like some advice on how to generate the correct result without any assumptions about the table's data.
I would do this using SELECT TOP 1 WITH TIES:
SELECT TOP 1 WITH TIES c.Country
FROM Customer c
GROUP BY c.Country
ORDER BY COUNT(*) ASC;
Two notes:
When using table aliases, make them abbreviations for the tables. This makes the query much easier to follow.
Never use commas in the FROM clause. Always use proper, explicit JOIN syntax.
Learned something new (WITH TIES) from Gordon Linoff, again...
Here is my solution without it:
Select a.Country from [Customer] a
group by a.Country
having count(*) = (select min(b.Tot) from (SELECT COUNT(*) AS Tot FROM [Customer] GROUP BY Country) b)
If you are not using SQL Server 2012, then:
declare @Fewer int = 2
;With CTE as
(
select c.*
,ROW_NUMBER() over (partition by countryid order by customerid) rn
from dbo.Customers C
)
select * from cte
where rn <= @Fewer
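The HAVING-based answer above is portable to engines without TOP ... WITH TIES. As a rough check, here is a sketch that runs it with Python's built-in sqlite3 module; the customer rows are invented so that three countries tie for the smallest count, and the lowercase table and column names are assumptions.

```python
import sqlite3


def demo_fewest_countries():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, country TEXT);
        INSERT INTO customer (country) VALUES
            ('Germany'), ('Germany'), ('Germany'),
            ('France'), ('France'),
            ('Ireland'), ('Norway'), ('Poland');
    """)

    # Keep every country whose row count equals the smallest per-country count.
    rows = conn.execute("""
        SELECT c.country
        FROM customer c
        GROUP BY c.country
        HAVING COUNT(*) = (SELECT MIN(t.tot)
                           FROM (SELECT COUNT(*) AS tot
                                 FROM customer
                                 GROUP BY country) t)
        ORDER BY c.country
    """).fetchall()

    conn.close()
    return [country for (country,) in rows]
```

The original attempt failed because the comma join multiplied every customer row by every row of the derived table, so COUNT(*) no longer reflected the per-country count.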

How to fetch all rows from table with count of specific group?

I have a simple table like this
spatialite> select id, group_id, object_id, object, param from controlled_object;
1|1|150|nodes|0.5
2|1|186|nodes|0.5
3|2|372|nodes|1.0
The second column is group_id. I want to retrieve all entries from the table, plus the count of the group.
1|1|150|nodes|0.5|2
2|1|186|nodes|0.5|2
3|2|372|nodes|1.0|1
I thought a cross join would be the way to go
SELECT
*
, cj.cnt
FROM
controlled_object
CROSS JOIN (
SELECT
COUNT(DISTINCT group_id) AS cnt
FROM
controlled_object
) AS cj
But that gives me
1|1|150|nodes|0.5|2|2
2|1|186|nodes|0.5|2|2
3|2|372|nodes|1.0|2|2
How do I fetch all rows from table including the count of a specific group?
Join source data with counters, grouped by group_id
select c.id, c.group_id, c.object_id, c.object, c.param,cnt from controlled_object c join
(select group_id,count(*) cnt from controlled_object group by group_id) p on c.group_id =p.group_id ;
This is not a very good idea for big tables.
SQLite is not a very good idea for big tables at all :-)
You can compute the count with a correlated subquery:
SELECT id,
group_id,
object_id,
object,
param,
(SELECT count(*)
FROM controlled_object AS co2
WHERE co2.group_id = controlled_object.group_id) AS cnt
FROM controlled_object;
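Both suggestions above (joining against grouped counts, and a correlated subquery) can be tried on the rows from the question. This is a small sketch using Python's built-in sqlite3 module; the column types are assumptions, since the question only shows the data.

```python
import sqlite3


def demo_group_counts():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE controlled_object (
            id INTEGER PRIMARY KEY, group_id INTEGER,
            object_id INTEGER, object TEXT, param REAL);
        INSERT INTO controlled_object VALUES
            (1, 1, 150, 'nodes', 0.5),
            (2, 1, 186, 'nodes', 0.5),
            (3, 2, 372, 'nodes', 1.0);
    """)

    # Variant 1: join every row to the size of its own group.
    joined = conn.execute("""
        SELECT c.id, c.group_id, c.object_id, c.object, c.param, p.cnt
        FROM controlled_object c
        JOIN (SELECT group_id, COUNT(*) AS cnt
              FROM controlled_object
              GROUP BY group_id) p ON p.group_id = c.group_id
        ORDER BY c.id
    """).fetchall()

    # Variant 2: correlated subquery evaluated per outer row.
    correlated = conn.execute("""
        SELECT id, group_id, object_id, object, param,
               (SELECT COUNT(*)
                FROM controlled_object AS co2
                WHERE co2.group_id = controlled_object.group_id) AS cnt
        FROM controlled_object
        ORDER BY id
    """).fetchall()

    conn.close()
    return joined, correlated
```

The cross join in the question returned 2 on every row because COUNT(DISTINCT group_id) counts how many groups exist, not how many rows each group has.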

Only one expression can be specified in the select list when the subquery is not introduced with EXISTS

My query is as follows, and contains a subquery within it:
select count(distinct dNum)
from myDB.dbo.AQ
where A_ID in
(SELECT DISTINCT TOP (0.1) PERCENT A_ID,
COUNT(DISTINCT dNum) AS ud
FROM myDB.dbo.AQ
WHERE M > 1 and B = 0
GROUP BY A_ID ORDER BY ud DESC)
The error I am receiving is ...
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
When I run the sub-query alone, it returns just fine, so I am assuming there is some issue with the main query?
You can't return two (or multiple) columns in your subquery to do the comparison in the WHERE A_ID IN (subquery) clause - which column is it supposed to compare A_ID to? Your subquery must only return the one column needed for the comparison to the column on the other side of the IN. So the query needs to be of the form:
SELECT * From ThisTable WHERE ThisColumn IN (SELECT ThatColumn FROM ThatTable)
You also want to add sorting so you can select just the top rows, but you don't need to return the COUNT as a column in order to sort by it; the ORDER BY clause is independent of the columns returned by the query.
Try something like this:
select count(distinct dNum)
from myDB.dbo.AQ
where A_ID in
(SELECT DISTINCT TOP (0.1) PERCENT A_ID
FROM myDB.dbo.AQ
WHERE M > 1 and B = 0
GROUP BY A_ID
ORDER BY COUNT(DISTINCT dNum) DESC)
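The same rule can be reproduced outside SQL Server. The sketch below uses Python's built-in sqlite3 module with invented rows; SQLite has no TOP ... PERCENT, so LIMIT 1 stands in for it here, and the error wording differs from SQL Server's, so only the presence of an error is recorded.

```python
import sqlite3


def demo_in_subquery():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE aq (a_id INTEGER, dnum INTEGER, m INTEGER, b INTEGER);
        INSERT INTO aq VALUES
            (1, 100, 2, 0), (1, 101, 2, 0), (1, 102, 2, 0),
            (2, 200, 2, 0),
            (3, 300, 1, 0);
    """)

    # Two columns inside IN (...): the engine cannot tell which one to compare.
    try:
        conn.execute("""
            SELECT COUNT(DISTINCT dnum)
            FROM aq
            WHERE a_id IN (SELECT a_id, COUNT(DISTINCT dnum) AS ud
                           FROM aq
                           WHERE m > 1 AND b = 0
                           GROUP BY a_id)
        """)
        error = None
    except sqlite3.OperationalError as exc:
        error = str(exc)

    # One column in the select list; the aggregate moves into ORDER BY.
    fixed = conn.execute("""
        SELECT COUNT(DISTINCT dnum)
        FROM aq
        WHERE a_id IN (SELECT a_id
                       FROM aq
                       WHERE m > 1 AND b = 0
                       GROUP BY a_id
                       ORDER BY COUNT(DISTINCT dnum) DESC
                       LIMIT 1)
    """).fetchone()[0]

    conn.close()
    return error, fixed
```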
A subquery used in a WHERE ... IN comparison must return exactly one column; a subquery whose result you assign to a variable must return exactly one column and one row. Example:
select * from table1 where Date in (select * from Dates) -- Wrong
select * from table1 where Date in (select Column1,Column2 from Dates) -- Wrong
select * from table1 where Date in (select Column1 from Dates) -- OK
It's complaining about
COUNT(DISTINCT dNum) AS ud
inside the subquery. Only one column can be returned from the subquery unless you are performing an EXISTS query. I'm not sure why you want to do a count on the same column twice; superficially it looks redundant. The subquery here is only a filter, not a join: you use it to restrict the data, not to specify which columns to get back.
Apart from very good responses here, you could try this as well if you want to use your sub query as is.
Approach:
1) Select the desired column (Only 1) from your sub query
2) Use where to map the column name
Code:
SELECT count(distinct dNum)
FROM myDB.dbo.AQ
WHERE A_ID in
(
SELECT A_ID
FROM (SELECT DISTINCT TOP (0.1) PERCENT A_ID, COUNT(DISTINCT dNum) AS ud
FROM myDB.dbo.AQ
WHERE M > 1 and B = 0
GROUP BY A_ID ORDER BY ud DESC
) a
)
Just in case it helps someone, here's what caused this error for me:
I needed a procedure to return JSON, but I left out the FOR JSON PATH:
set @jsonout = (SELECT ID, SumLev, Census_GEOID, AreaName, Worksite
from CS_GEO G (nolock)
join #allids a on g.ID = a.[value]
where g.Worksite = @worksite)
When I tried to save the stored procedure, it threw the error. I fixed it by adding FOR JSON PATH at the end of the query:
set @jsonout = (SELECT ID, SumLev, Census_GEOID, AreaName, Worksite
from CS_GEO G (nolock)
join #allids a on g.ID = a.[value]
where g.Worksite = @worksite for json path)
If the subquery has to project more than one column, introduce it with EXISTS:
SELECT t.col1,t.col2
FROM table1 t
WHERE EXISTS (SELECT st.col1,st.col2
FROM table2 st
WHERE st.fcol = t.fcol)