How to fetch all rows from a table with the count of a specific group?

I have a simple table like this
spatialite> select id, group_id, object_id, object, param from controlled_object;
1|1|150|nodes|0.5
2|1|186|nodes|0.5
3|2|372|nodes|1.0
The second column is group_id. I want to retrieve all entries from the table, plus the count of the group.
1|1|150|nodes|0.5|2
2|1|186|nodes|0.5|2
3|2|372|nodes|1.0|1
I thought a cross join would be the way to go
SELECT
*
, cj.cnt
FROM
controlled_object
CROSS JOIN (
SELECT
COUNT(DISTINCT group_id) AS cnt
FROM
controlled_object
) AS cj
But that gives me
1|1|150|nodes|0.5|2|2
2|1|186|nodes|0.5|2|2
3|2|372|nodes|1.0|2|2
How do I fetch all rows from table including the count of a specific group?

Join source data with counters, grouped by group_id
select c.id, c.group_id, c.object_id, c.object, c.param,cnt from controlled_object c join
(select group_id,count(*) cnt from controlled_object group by group_id) p on c.group_id =p.group_id ;
This is not a very good idea for big tables.
SQLite is not a very good idea for big tables at all :-)

You can compute the count with a correlated subquery:
SELECT id,
group_id,
object_id,
object,
param,
(SELECT count(*)
FROM controlled_object AS co2
WHERE co2.group_id = controlled_object.group_id) AS cnt
FROM controlled_object;
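
To check that the join against per-group counts and the correlated subquery return the same rows, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory copy of the sample table. The column types are assumptions, since the question only shows the data.

```python
import sqlite3

# In-memory copy of the sample table from the question (column types assumed).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE controlled_object ("
    "id INTEGER PRIMARY KEY, group_id INTEGER, object_id INTEGER, "
    "object TEXT, param REAL)"
)
conn.executemany(
    "INSERT INTO controlled_object VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 150, "nodes", 0.5), (2, 1, 186, "nodes", 0.5), (3, 2, 372, "nodes", 1.0)],
)

# Variant 1: join every row against the per-group counts.
join_sql = """
SELECT c.id, c.group_id, c.object_id, c.object, c.param, p.cnt
FROM controlled_object AS c
JOIN (SELECT group_id, COUNT(*) AS cnt
      FROM controlled_object
      GROUP BY group_id) AS p ON p.group_id = c.group_id
ORDER BY c.id
"""

# Variant 2: correlated subquery, evaluated once per outer row.
correlated_sql = """
SELECT c.id, c.group_id, c.object_id, c.object, c.param,
       (SELECT COUNT(*) FROM controlled_object AS co2
        WHERE co2.group_id = c.group_id) AS cnt
FROM controlled_object AS c
ORDER BY c.id
"""

join_rows = conn.execute(join_sql).fetchall()
correlated_rows = conn.execute(correlated_sql).fetchall()
for row in join_rows:
    print(row)
```

Both variants return the three original rows with 2, 2 and 1 appended as the group count, which matches the output asked for in the question.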

Related

How to get a result set containing the absence of a value?

Scenario: I have a table with four columns: District_Number, District_name, Data_Collection_Week and enrollments. Each week we get data, but sometimes we do not.
Task: My supervisor wants me to produce a query that tells us which districts did not submit data in a given week.
What I have tried is below, but I cannot get a NULL value for the districts that did not submit in a week.
SELECT DISTINCT DistrictNumber, DistrictName, DataCollectionWeek
into #test4
FROM EDW_REQUESTS.INSTRUCTION_DELIVERY_ENROLLMENT_2021
order by DistrictNumber, DataCollectionWeek asc
select DISTINCT DataCollectionWeek
into #test5
from EDW_REQUESTS.INSTRUCTION_DELIVERY_ENROLLMENT_2021
order by DataCollectionWeek
select b.DistrictNumber, b.DistrictName, b.DataCollectionWeek
from #test5 a left outer join #test4 b on (a.DataCollectionWeek = b.DataCollectionWeek)
order by b.DistrictNumber, b.DataCollectionWeek asc
One option uses a cross join of two select distinct subqueries to generate all possible combinations of districts and weeks, and then not exists to identify those that are not available in the table:
select d.districtnumber, w.datacollectionweek
from (select distinct districtnumber from edw_requests.instruction_delivery_enrollment_2021) d
cross join (select distinct datacollectionweek from edw_requests.instruction_delivery_enrollment_2021) w
where not exists (
select 1
from edw_requests.instruction_delivery_enrollment_2021 i
where i.districtnumber = d.districtnumber and i.datacollectionweek = w.datacollectionweek
)
This would be simpler (and much more efficient) if you had reference tables that store the districts and weeks: you would then use them directly instead of the select distinct subqueries.
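
The cross join plus not exists pattern can be tried on a small scale. The sketch below uses Python's sqlite3 module with a hypothetical, simplified `enrollment` table and made-up rows standing in for the real one. Only the query shape follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the real enrollment table.
conn.execute(
    "CREATE TABLE enrollment ("
    "districtnumber INTEGER, datacollectionweek INTEGER, enrollments INTEGER)"
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [
        (1, 1, 100), (1, 2, 110), (1, 3, 120),  # district 1 reported every week
        (2, 1, 200), (2, 3, 210),               # district 2 skipped week 2
        (3, 2, 300),                            # district 3 skipped weeks 1 and 3
    ],
)

# All district/week combinations, minus the ones that actually exist.
missing_sql = """
SELECT d.districtnumber, w.datacollectionweek
FROM (SELECT DISTINCT districtnumber FROM enrollment) AS d
CROSS JOIN (SELECT DISTINCT datacollectionweek FROM enrollment) AS w
WHERE NOT EXISTS (
    SELECT 1 FROM enrollment AS i
    WHERE i.districtnumber = d.districtnumber
      AND i.datacollectionweek = w.datacollectionweek
)
ORDER BY d.districtnumber, w.datacollectionweek
"""
missing = conn.execute(missing_sql).fetchall()
print(missing)
```

With this data the query reports district 2 for week 2 and district 3 for weeks 1 and 3. A week in which no district reported at all would not show up, because the list of weeks is itself derived from the data. That is one more reason to prefer separate tables of districts and weeks.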

Count multiple column without using many temporary tables then join PostgreSQL

I have data like this
I want to count (edit: distinct) group1 and group2, grouped by Time and Type. Currently I build a temporary table for each count and then full outer join them (on time and type), so that the desired columns look like:
Time Type Count_Group1 Count_Group2
Any shorter way to do this?
This answers the original version of the question:
You can use a lateral join and aggregation:
select time, type, sum(in1), sum(in2)
from t cross join lateral
(values (time1, group1, 1, 0), (time2, group2, 0, 1)
) v(time, grp, in1, in2)
group by time, type;
EDIT:
To count distinct values, use count(distinct):
select v.time, t.type, count(distinct t.group1), count(distinct t.group2)
from t cross join lateral
(values (t.time1), (t.time2)
) v(time)
group by v.time, t.type;

Postgres query with left joins takes too long

I have a problem with this query. It seems to run forever: after 15 minutes it still has not finished.
If I remove one of the left joins, it works.
What am I doing wrong?
Select distinct a.sito,
Count(distinct a.id_us) as us,
Count (distinct b.id_invmat) as materiali,
Count (distinct c.id_struttura) as Struttura,
Count(distinct d.id_tafonomia) as tafonomia
From us_table as a
Left join invetario_materiali as b on a.sito=b.sito
Left join struttura_table as c on a.sito=c.sito
Left join tafonomia_table as d on a.sito=d.sito
Group by a.sito
Order by us
thanks
E
This is a case where correlated subqueries might be the simplest approach:
select s.sito, s.us,
(select count(*) from invetario_materiali m where s.sito = m.sito) as materiali,
(select count(*) from struttura_table st where s.sito = st.sito) as Struttura,
(select count(*) from tafonomia_table t where s.sito = t.sito) as tafonomia
from (select sito, count(*) as us
from us_table
group by sito
) s
order by us;
This should be much, much faster than your version for two reasons. First, it avoids the outer aggregation. Second, it avoids the Cartesian products among the tables.
You can make this even faster by creating indexes on each of the secondary tables on sito.
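
To see why the chained left joins are slow, it helps to count the intermediate rows they produce. The sketch below is a rough illustration with Python's sqlite3 module rather than PostgreSQL. The table and column names follow the question, but the row counts are made up and there is only a single site, "A".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature versions of the four tables.
conn.executescript("""
CREATE TABLE us_table (id_us INTEGER PRIMARY KEY, sito TEXT);
CREATE TABLE invetario_materiali (id_invmat INTEGER PRIMARY KEY, sito TEXT);
CREATE TABLE struttura_table (id_struttura INTEGER PRIMARY KEY, sito TEXT);
CREATE TABLE tafonomia_table (id_tafonomia INTEGER PRIMARY KEY, sito TEXT);
""")
# Made-up row counts per table, all belonging to site "A".
sizes = {
    "us_table": 30,
    "invetario_materiali": 20,
    "struttura_table": 10,
    "tafonomia_table": 5,
}
for table, n in sizes.items():
    conn.executemany(f"INSERT INTO {table} (sito) VALUES (?)", [("A",)] * n)

# Rows generated by joining everything on sito before aggregating.
joined_rows = conn.execute("""
SELECT COUNT(*)
FROM us_table a
LEFT JOIN invetario_materiali b ON a.sito = b.sito
LEFT JOIN struttura_table c ON a.sito = c.sito
LEFT JOIN tafonomia_table d ON a.sito = d.sito
""").fetchone()[0]

# Counting each table separately touches only 30 + 20 + 10 + 5 rows.
counts = conn.execute("""
SELECT s.sito, s.us,
       (SELECT COUNT(*) FROM invetario_materiali m WHERE m.sito = s.sito) AS materiali,
       (SELECT COUNT(*) FROM struttura_table st WHERE st.sito = s.sito) AS struttura,
       (SELECT COUNT(*) FROM tafonomia_table t WHERE t.sito = s.sito) AS tafonomia
FROM (SELECT sito, COUNT(*) AS us FROM us_table GROUP BY sito) AS s
""").fetchall()

print(joined_rows)
print(counts)
```

Here the join produces 30 × 20 × 10 × 5 = 30000 rows for a single site before any counting starts, while the per-table counts come out as 30, 20, 10 and 5. With realistic table sizes the product grows fast enough to explain a query that never seems to finish.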
Assuming that id_us, id_invmat, id_struttura and id_tafonomia are all primary keys,
you should add indexes on the join columns (index names must be unique within a schema in PostgreSQL, so each index gets its own name):
CREATE INDEX ix_us_table_sito ON us_table (sito ASC);
CREATE INDEX ix_invetario_materiali_sito ON invetario_materiali (sito ASC);
CREATE INDEX ix_struttura_table_sito ON struttura_table (sito ASC);
CREATE INDEX ix_tafonomia_table_sito ON tafonomia_table (sito ASC);
Then you can reduce the complexity this way:
with
_us_table as (
select sito, count(distinct a.id_us) us
from us_table a
group by sito
),
_invetario_materiali as (
select sito, count(distinct b.id_invmat) materiali
from invetario_materiali b
group by sito
),
_struttura_table as (
select sito, count(distinct c.id_struttura) Struttura
from struttura_table c
group by sito
),
_tafonomia_table as (
select sito, count(distinct d.id_tafonomia) tafonomia
from tafonomia_table d
group by sito
)
Select a.sito, a.us, b.materiali, c.Struttura, d.tafonomia
From _us_table as a
Left join _invetario_materiali as b on a.sito=b.sito
Left join _struttura_table as c on a.sito=c.sito
Left join _tafonomia_table as d on a.sito=d.sito
Order by a.us;
This should be much faster.
Unfortunately COUNT(DISTINCT ...) is difficult to improve upon using an index. However, we can at least try adding indices which cover all the joins in your query:
CREATE INDEX inv_mat_idx ON invetario_materiali (sito, id_invmat);
CREATE INDEX strut_tbl_idx ON struttura_table (sito, id_struttura);
CREATE INDEX taf_tbl_idx ON tafonomia_table (sito, id_tafonomia);
Note that the above indices would only help the joins, and would not affect the aggregation step by sito and the distinct counts per group. As @jarlh has noted in the comments, SELECT DISTINCT is superfluous, since you are using GROUP BY, so just do a plain SELECT.

How can I subtract two selected columns in the same query?

select (select count(*) from tbl_account) as processed,
(select count(*) from tbl_rejected_account) as rejected,
processed - rejected as approved from tbl_control;
How can I get the 'approved' count without having to write out the two subqueries again and subtract them later?
EDIT:
The original query that I want to change:
select
ACTIVITY_DATE
,SYSTEM_NAME
,START_TIME
,END_TIME
,INSTANCE_NO as instance_number
,case status when '1' then 'Success'
when '2' then 'In process'
when '3' then 'Failed' end as status
,(select count(distinct account_number) from tbl_account_detail where coretype=a.system_name and INSTANCE_NO=a.instance_no and a.activity_date=to_char(upload_date,'dd-MON-yy'))+
(select count(distinct account_number) from tbl_account_detail_exception where system_name=a.system_name and INSTANCE_NO=a.instance_no and a.activity_date=to_char(upload_date,'dd-MON-yy'))
as AccountCount
,(select count(distinct account_number) from tbl_account_detail where CREATOR='SYSTEM' and APPROVER='SYSTEM' and system_name=a.system_name and INSTANCE_NO=a.instance_no and a.activity_date=to_char(upload_date,'dd-MON-yy'))
as AutoApprovedCount
,(select count(distinct account_number) from tbl_account_detail where coretype=a.system_name and INSTANCE_NO=a.instance_no and a.activity_date=to_char(upload_date,'dd-MON-yy')) +
(select count(distinct account_number) from tbl_account_detail_exception where system_name=a.system_name and INSTANCE_NO=a.instance_no and
a.activity_date=to_char(upload_date,'dd-MON-yy')) -
(select count(distinct account_number) from tbl_account_detail where a.activity_date=to_char(upload_date,'dd-MON-yy') and CREATOR='SYSTEM' and APPROVER='SYSTEM' and system_name=a.system_name and INSTANCE_NO=a.instance_no)
as MaintenanceCount
from tbl_control_file_status a where activity_type='MAIN' and activity_name='START';
Clearly this is not the proper way to do it; kindly provide an alternative solution.
You can use a subquery to introduce aliases for use in the outer query:
select SubQueryAlias.*
, processed - rejected as approved
from (
select (
select count(*)
from tbl_account
) as processed,
(
select count(*)
from tbl_rejected_account
) as rejected
from dual
) SubQueryAlias
;
It's often more readable to use a common-table expression (CTE) as in Alex Poole's answer.
You can't refer to a column alias in the same level of query in which it is defined, except in an order by clause. That is because of the way the query is processed and the result set constructed, but it also avoids ambiguity. This is mentioned in the documentation:
Specify an alias for the column expression. Oracle Database will use this alias in the column heading of the result set. The AS keyword is optional. The alias effectively renames the select list item for the duration of the query. The alias can be used in the order_by_clause but not other clauses in the query.
You could use a CTE for each subquery, or inline views:
select ta.processed, tra.rejected, ta.processed - tra.rejected as approved
from (
select count(*) as processed from tbl_account
) ta
cross join (
select count(*) as rejected from tbl_rejected_account
) tra
Or if you really have a correlation with the third table:
select tc.id, ta.processed, tra.rejected, ta.processed - tra.rejected as approved
from tbl_control tc
join (
select id, count(*) as processed from tbl_account group by id
) ta on ta.id = tc.id
join (
select id, count(*) as rejected from tbl_rejected_account group by id
) tra on tra.id = tc.id
You haven't said what the relationship is so I've assumed a common ID column. Using CTEs rather than inline views that would look like:
with ta as (
select id, count(*) as processed from tbl_account group by id
), tra as (
select id, count(*) as rejected from tbl_rejected_account group by id
)
select tc.id, ta.processed, tra.rejected, ta.processed - tra.rejected as approved
from tbl_control tc
join ta on ta.id = tc.id
join tra on tra.id = tc.id
You may need outer joins and nvl if either subquery table might not have matching rows.
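
The remark about outer joins can be illustrated with a small sketch. It runs on SQLite through Python's sqlite3 module, so it uses COALESCE, the standard SQL counterpart of Oracle's nvl. The shared id column is the same assumption as in the queries above, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a shared "id" column links the three tables.
conn.executescript("""
CREATE TABLE tbl_control (id INTEGER PRIMARY KEY);
CREATE TABLE tbl_account (id INTEGER);
CREATE TABLE tbl_rejected_account (id INTEGER);
INSERT INTO tbl_control VALUES (1), (2), (3);
INSERT INTO tbl_account VALUES (1), (1), (1), (2), (2);
INSERT INTO tbl_rejected_account VALUES (1);
""")

# Outer joins keep control rows without matches; COALESCE turns the
# resulting NULL counts into zeros so the subtraction still works.
sql = """
WITH ta AS (
    SELECT id, COUNT(*) AS processed FROM tbl_account GROUP BY id
), tra AS (
    SELECT id, COUNT(*) AS rejected FROM tbl_rejected_account GROUP BY id
)
SELECT tc.id,
       COALESCE(ta.processed, 0) AS processed,
       COALESCE(tra.rejected, 0) AS rejected,
       COALESCE(ta.processed, 0) - COALESCE(tra.rejected, 0) AS approved
FROM tbl_control tc
LEFT JOIN ta ON ta.id = tc.id
LEFT JOIN tra ON tra.id = tc.id
ORDER BY tc.id
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

With inner joins only id 1 would be returned, because ids 2 and 3 have no rejected rows. With the outer joins all three ids appear, and the approved counts are 2, 2 and 0.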
You don't really need to use subqueries, inline views or CTEs here though; you can just join the tables and put the aggregates in the top-level query, so that you duplicate only the count rather than the whole subquery. Because joining both detail tables to the same control row multiplies their rows together, the counts have to be distinct counts over a column that uniquely identifies each row; the account_number column below is an assumption for illustration:
select tc.id, count(distinct ta.account_number) as processed,
count(distinct tra.account_number) as rejected,
count(distinct ta.account_number) - count(distinct tra.account_number) as approved
from tbl_control tc
join tbl_account ta on ta.id = tc.id
join tbl_rejected_account tra on tra.id = tc.id
group by tc.id
You can add more joins and conditions as needed of course.

Adding rows to SQL query for each element in WHERE sequence

I have an SQL query like the following:
SELECT store_id, SUM(quantity_sold) AS count
FROM sales_table
WHERE store_id IN ('Store1', 'Store2', 'Store3')
GROUP BY store_id;
This returns a row for each store that has rows in sales_table, but does not return a row for those that do not. What I want is one row per store, with a 0 for count if it has no records.
How can I do this, assuming that I do not have access to a stores table?
with stores (store_id) as (
values ('Store1'), ('Store2'), ('Store3')
)
select st.store_id,
coalesce(sum(sal.quantity_sold), 0) as cnt
from stores st
left join sales_table sal on sal.store_id = st.store_id
group by st.store_id;
If you do have a stores table, then simply do an outer join to that one instead of "making one up" using the common table expression (with ..).
This can also be written without the CTE (common table expression):
select st.store_id,
coalesce(sum(sal.quantity_sold), 0) as cnt
from (
values ('Store1'), ('Store2'), ('Store3')
) st (store_id)
left join sales_table sal on sal.store_id = st.store_id
group by st.store_id;
(But I find the CTE version easier to understand)
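
A minimal sketch of the made-up stores list, run on SQLite through Python's sqlite3 module with invented sales rows. It shows the plain sum next to a coalesce-wrapped one, because the sum over a store with no sales is NULL rather than the 0 asked for in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_table (store_id TEXT, quantity_sold INTEGER)")
conn.executemany(
    "INSERT INTO sales_table VALUES (?, ?)",
    [("Store1", 5), ("Store1", 7), ("Store3", 2)],  # made-up data; Store2 has no sales
)

# The CTE plays the role of the missing stores table.
sql = """
WITH stores (store_id) AS (
    VALUES ('Store1'), ('Store2'), ('Store3')
)
SELECT st.store_id,
       SUM(sal.quantity_sold) AS raw_sum,
       COALESCE(SUM(sal.quantity_sold), 0) AS cnt
FROM stores st
LEFT JOIN sales_table sal ON sal.store_id = st.store_id
GROUP BY st.store_id
ORDER BY st.store_id
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Store2 comes back with a raw_sum of None (NULL) and a cnt of 0, while Store1 and Store3 show 12 and 2 in both columns.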
You can use unnest() to generate rows from array elements.
SELECT store, coalesce(sum(sales_table.quantity_sold), 0) AS count
FROM unnest(ARRAY['Store1', 'Store2', 'Store3']) AS store
LEFT JOIN sales_table ON (sales_table.store_id = store)
GROUP BY store;