mathematical operation between two select statements - sql

I'm currently trying to find the percentage of certain amount of preregistered users in my postgres db, the operation would be (1185 * 100) / 3104 = 38.17. To do that I'm using two select statements to retrieve each count, but I've been unable to operate between them:
+ count +
- 1185 -
- 3104 -
This is what I have:
select
count(*)
from crm_user_preregistred
left join crm_player on crm_user_preregistred."document" = crm_player."document"
left join crm_user on crm_player.user_id = crm_user.id
where crm_user.email is not null
union all
select
count(*)
from crm_user_preregistred
Thanks in advance for any hint or help.

you can use some with clause to simplifie your selects, substitute the values with your count(*) selects, maybe some formating to the result, and a check for 0 on value2
with temp_value1 as (
select 1185 as value1 ),
temp_value2 as (
select 3104 as value2 )
select (select temp_value1.value1::float * 100 from temp_value1) /
(select temp_value2.value2::float from temp_value2)
result :
38.17654639175258
with your selects:
with temp_value1 as (
select
count(*) as value1
from crm_user_preregistred
left join crm_player on crm_user_preregistred."document" = crm_player."document"
left join crm_user on crm_player.user_id = crm_user.id
where crm_user.email is not null
),
temp_value2 as (
select
count(*) as value2
from crm_user_preregistred
)
select (select temp_value1.value1::float * 100 from temp_value1) / (select temp_value2.value2::float from temp_value2)

You can do this in one query:
select count(*) filter (where cu.email is not null) * 100.0 / max(cup.cnt)
from (select cup.*, count(*) over () as cnt
from crm_user_preregistred cup
) cup left join
crm_player cp
on cup."document" = cp."document" left join
crm_user cu
on cp.user_id = cu.id
where cu.email is not null;
I suspect that the query could be simplified further, but without knowing your data model, it is hard to make specific suggestions.

Related

Unable to convert this legacy SQL into Standard SQL in Google BigQuery

I am not able to validate this legacy sql into standard bigquery sql as I don't know what else is required to change here(This query fails during validation if I choose standard SQL as big query dialect):
SELECT
lineitem.*,
proposal_lineitem.*,
porder.*,
company.*,
product.*,
proposal.*,
trafficker.name,
salesperson.name,
rate_card.*
FROM (
SELECT
*
FROM
dfp_data.dfp_order_lineitem
WHERE
DATE(end_datetime) >= DATE(DATE_ADD(CURRENT_TIMESTAMP(), -1, 'YEAR'))
OR end_datetime IS NULL ) lineitem
JOIN (
SELECT
*
FROM
dfp_data.dfp_order) porder
ON
lineitem.order_id = porder.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_proposal_lineitem) proposal_lineitem
ON
lineitem.id = proposal_lineitem.dfp_lineitem_id
JOIN (
SELECT
*
FROM
dfp_data.dfp_company) company
ON
porder.advertiser_id = company.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_product) product
ON
proposal_lineitem.product_id=product.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_proposal) proposal
ON
proposal_lineitem.proposal_id=proposal.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_rate_card) rate_card
ON
proposal_lineitem.ratecard_id=rate_card.id
LEFT JOIN (
SELECT
id,
name
FROM
dfp_data.dfp_user) trafficker
ON
porder.trafficker_id =trafficker.id
LEFT JOIN (
SELECT
id,
name
FROM
dfp_data.dfp_user) salesperson
ON
porder. salesperson_id =salesperson.id
Most likely the error you are getting is something like below
Duplicate column names in the result are not supported. Found duplicate(s): name
Legacy SQL adjust trafficker.name and salesperson.name in your SELECT statement into respectively trafficker_name and salesperson_name thus effectively eliminating column names duplication
Standard SQL behaves differently and treat both those columns as named name thus producing duplication case. To avoid it - you just need to provide aliases as in example below
SELECT
lineitem.*,
proposal_lineitem.*,
porder.*,
company.*,
product.*,
proposal.*,
trafficker.name AS trafficker_name,
salesperson.name AS salesperson_name,
rate_card.*
FROM ( ...
You can easily check above explained using below simplified/dummy queries
#legacySQL
SELECT
porder.*,
trafficker.name,
salesperson.name
FROM (
SELECT 1 order_id, 'abc' order_name, 1 trafficker_id, 2 salesperson_id
) porder
LEFT JOIN (SELECT 1 id, 'trafficker' name) trafficker
ON porder.trafficker_id =trafficker.id
LEFT JOIN (SELECT 2 id, 'salesperson' name ) salesperson
ON porder. salesperson_id =salesperson.id
and
#standardSQL
SELECT
porder.*,
trafficker.name AS trafficker_name,
salesperson.name AS salesperson_name
FROM (
SELECT 1 order_id, 'abc' order_name, 1 trafficker_id, 2 salesperson_id
) porder
LEFT JOIN (SELECT 1 id, 'trafficker' name) trafficker
ON porder.trafficker_id =trafficker.id
LEFT JOIN (SELECT 2 id, 'salesperson' name ) salesperson
ON porder. salesperson_id =salesperson.id
Note: if you have more duplicate names - you need to alias all of them too

How to use multiple count and where condition sql server 2008?

I have this two query
1.
select CL_Clients.cl_id,CL_Clients].cl_name,COUNT(*) AS number_of_orders
from CL_Clients,CLOI_ClientOrderItems
where CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id
2.
select CL_Clients.cl_id,count(cloi_current_status) as dis
from CLOI_ClientOrderItems,CL_Clients
where cloi_current_status]='12'
and CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id,CLOI_ClientOrderItems.cloi_current_status
i have this column i need to put count function and where condition
[cloi_current_status]
166
30
30
30
150
150
150
150
150
150
150
Quite simple, you just encapsulate the queries and give their result sets an alias and then do a JOIN between their aliases on the column that is common. (In the query below I assume you'll be joining by client id)
SELECT *
FROM (
SELECT CL_Clients.cl_id,
CL_Clients].cl_name,
COUNT(*) AS number_of_orders
FROM CL_Clients,
CLOI_ClientOrderItems
WHERE CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id
) A
INNER JOIN (
SELECT CL_Clients.cl_id,
count(cloi_current_status) AS dis
FROM CLOI_ClientOrderItems,
CL_Clients
WHERE cloi_current_status] = '12'
AND CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id,
CLOI_ClientOrderItems.cloi_current_status
) B
ON A.cl_id = B.cl_id
WHERE ...
GROUP BY ...
This will be treated as a separate result set, so you can also filter results with a WHERE or just a GROUP BY, just like in a normal SELECT.
UPDATE:
To answer the question in your comments, when you join two tables that have a column with the same value and use
SELECT * FROM A INNER JOIN B the * will show all columns returned by the join, meaning all columns from A and all columns from B, this is why you have duplicate columns.
If you want to filter the columns returned you can specifiy which columns you want returned. So, in your case, the top SELECT * can be replaced with
SELECT A.cl_id, A.cl_name, A.number_of_orders, B.dis so, your query becomes:
SELECT A.cl_id, A.cl_name, A.number_of_orders, B.dis
FROM (
SELECT CL_Clients.cl_id,
CL_Clients].cl_name,
COUNT(*) AS number_of_orders
FROM CL_Clients,
CLOI_ClientOrderItems
WHERE CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id
) A
INNER JOIN (
SELECT CL_Clients.cl_id,
count(cloi_current_status) AS dis
FROM CLOI_ClientOrderItems,
CL_Clients
WHERE cloi_current_status] = '12'
AND CL_Clients.cl_id = CLOI_ClientOrderItems.cl_id
GROUP BY CL_Clients.cl_name,
CL_Clients.cl_id,
CLOI_ClientOrderItems.cloi_current_status
) B
ON A.cl_id = B.cl_id
UPDATE #2:
For your last question, you need to GROUP BY at the end of the big query and use a HAVING condtion, like this:
GROUP BY A.cl_id, A.cl_name, A.number_of_orders, B.dis
HAVING COUNT(cloi_current_status) > 100
All depends on what data you are trying to get, but you can go about it like this.
SELECT Column_x, Column_y, etc..
FROM ClL_Clients a
JOIN (select CL_Clients.cl_id,CL_Clients].cl_name,COUNT(*) AS number_of_orders
from CL_Clients,CLOI_ClientOrderItems
where CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id) b
on a.cl_id = b.cl_id
JOIN (select CL_Clients.cl_id,count(cloi_current_status) as dis
from CLOI_ClientOrderItems,CL_Clients
where cloi_current_status]='12'
and CL_Clients.cl_id=CLOI_ClientOrderItems.cl_id
group by CL_Clients.cl_name,CL_Clients.cl_id,CLOI_ClientOrderItems.cloi_current_status) c
on a.cl_id = c.cl_id
Group by BLAH BLAH
Hope this gets you in the right direction.

Is there a way to make this query more efficient performance wise?

This query takes a long time to run on MS Sql 2008 DB with 70GB of data.
If i run the 2 where clauses seperately it takes a lot less time.
EDIT - I need to change the 'select *' to 'delete' afterwards, please keep it in mind when answering. thanks :)
select *
From computers
Where Name in
(
select T2.Name
from
(
select Name
from computers
group by Name
having COUNT(*) > 1
) T3
join computers T2 on T3.Name = T2.Name
left join policyassociations PA on T2.PK = PA.EntityId
where (T2.EncryptionStatus = 0 or T2.EncryptionStatus is NULL) and
(PA.EntityType <> 1 or PA.EntityType is NULL)
)
OR
ClientId in
(
select substring(ClientID,11,100)
from computers
)
Swapping IN for EXISTS will help.
Also, as per Gordon's answer: UNION can out-perform OR.
SELECT computers.*
FROM computers
LEFT
JOIN policyassociations
ON policyassociations.entityid = computers.pk
WHERE (
computers.encryptionstatus = 0
OR computers.encryptionstatus IS NULL
)
AND (
policyassociations.entitytype <> 1
OR policyassociations.entitytype IS NULL
)
AND EXISTS (
SELECT name
FROM (
SELECT name
FROM computers
GROUP
BY name
HAVING Count(*) > 1
) As duplicate_computers
WHERE name = computers.name
)
UNION
SELECT *
FROM computers As c
WHERE EXISTS (
SELECT SubString(clientid, 11, 100)
FROM computers
WHERE SubString(clientid, 11, 100) = c.clientid
)
You've now updated your question asking to make this a delete.
Well the good news is that instead of the "OR" you just make two DELETE statements:
DELETE
FROM computers
LEFT
JOIN policyassociations
ON policyassociations.entityid = computers.pk
WHERE (
computers.encryptionstatus = 0
OR computers.encryptionstatus IS NULL
)
AND (
policyassociations.entitytype <> 1
OR policyassociations.entitytype IS NULL
)
AND EXISTS (
SELECT name
FROM (
SELECT name
FROM computers
GROUP
BY name
HAVING Count(*) > 1
) As duplicate_computers
WHERE name = computers.name
)
;
DELETE
FROM computers As c
WHERE EXISTS (
SELECT SubString(clientid, 11, 100)
FROM computers
WHERE SubString(clientid, 11, 100) = c.clientid
)
;
Some things I would look at are
1. are indexes in place?
2. 'IN' will slow your query, try replacing it with joins,
3. you should use column name, I guess 'Name' in this case, while using count(*),
4. try selecting required data only, by selecting particular columns.
Hope this helps!
or can be poorly optimized sometimes. In this case, you can just split the query into two subqueries, and combine them using union:
select *
From computers
Where Name in
(
select T2.Name
from
(
select Name
from computers
group by Name
having COUNT(*) > 1
) T3
join computers T2 on T3.Name = T2.Name
left join policyassociations PA on T2.PK = PA.EntityId
where (T2.EncryptionStatus = 0 or T2.EncryptionStatus is NULL) and
(PA.EntityType <> 1 or PA.EntityType is NULL)
)
UNION
select *
From computers
WHERE ClientId in
(
select substring(ClientID,11,100)
from computers
);
You might also be able to improve performance by replacing the subqueries with explicit joins. However, this seems like the shortest route to better performance.
EDIT:
I think the version with join's is:
select c.*
From computers c left outer join
(select c.Name
from (select c.*, count(*) over (partition by Name) as cnt
from computers c
) c left join
policyassociations PA
on T2.PK = PA.EntityId and PA.EntityType <> 1
where (c.EncryptionStatus = 0 or c.EncryptionStatus is NULL) and
c.cnt > 1
) cpa
on c.Name = cpa.Name left outer join
(select substring(ClientID, 11, 100) as name
from computers
) csub
on c.Name = csub.name
Where cpa.Name is not null or csub.Name is not null;

Replace no result

I have a query like this:
SELECT TV.Descrizione as TipoVers,
sum(ImportoVersamento) as ImpTot,
count(*) as N,
month(DataAllibramento) as Mese
FROM PROC_Versamento V
left outer join dbo.PROC_TipoVersamento TV
on V.IDTipoVersamento = TV.IDTipoVersamento
inner join dbo.PROC_PraticaRiscossione PR
on V.IDPraticaRiscossioneAssociata = PR.IDPratica
inner join dbo.DA_Avviso A
on PR.IDDatiAvviso = A.IDAvviso
where DataAllibramento between '2012-09-08' and '2012-09-17' and A.IDFornitura = 4
group by V.IDTipoVersamento,month(DataAllibramento),TV.Descrizione
order by V.IDTipoVersamento,month(DataAllibramento)
This query must always return something. If no result is produced a
0 0 0 0
row must be returned. How can I do this. Use a isnull for every selected field isn't usefull.
Use a derived table with one row and do a outer apply to your other table / query.
Here is a sample with a table variable #T in place of your real table.
declare #T table
(
ID int,
Grp int
)
select isnull(Q.MaxID, 0) as MaxID,
isnull(Q.C, 0) as C
from (select 1) as T(X)
outer apply (
-- Your query goes here
select max(ID) as MaxID,
count(*) as C
from #T
group by Grp
) as Q
order by Q.C -- order by goes to the outer query
That will make sure you have always at least one row in the output.
Something like this using your query.
select isnull(Q.TipoVers, '0') as TipoVers,
isnull(Q.ImpTot, 0) as ImpTot,
isnull(Q.N, 0) as N,
isnull(Q.Mese, 0) as Mese
from (select 1) as T(X)
outer apply (
SELECT TV.Descrizione as TipoVers,
sum(ImportoVersamento) as ImpTot,
count(*) as N,
month(DataAllibramento) as Mese,
V.IDTipoVersamento
FROM PROC_Versamento V
left outer join dbo.PROC_TipoVersamento TV
on V.IDTipoVersamento = TV.IDTipoVersamento
inner join dbo.PROC_PraticaRiscossione PR
on V.IDPraticaRiscossioneAssociata = PR.IDPratica
inner join dbo.DA_Avviso A
on PR.IDDatiAvviso = A.IDAvviso
where DataAllibramento between '2012-09-08' and '2012-09-17' and A.IDFornitura = 4
group by V.IDTipoVersamento,month(DataAllibramento),TV.Descrizione
) as Q
order by Q.IDTipoVersamento, Q.Mese
Use COALESCE. It returns the first non-null value. E.g.
SELECT COALESCE(TV.Desc, 0)...
Will return 0 if TV.DESC is NULL.
You can try:
with dat as (select TV.[Desc] as TipyDesc, sum(Import) as ToImp, count(*) as N, month(Date) as Mounth
from /*DATA SOURCE HERE*/ as TV
group by [Desc], month(Date))
select [TipyDesc], ToImp, N, Mounth from dat
union all
select '0', 0, 0, 0 where (select count (*) from dat)=0
That should do what you want...
If it's ok to include the "0 0 0 0" row in a result set that has data, you can use a union:
SELECT TV.Desc as TipyDesc,
sum(Import) as TotImp,
count(*) as N,
month(Date) as Mounth
...
UNION
SELECT
0,0,0,0
Depending on the database, you may need a FROM for the second SELECT. In Oracle, this would be "FROM DUAL". For MySQL, no FROM is necessary

MySQL/SQL - When are the results of a sub-query avaliable?

Suppose I have this query
SELECT * FROM (
SELECT * FROM table_a
WHERE id > 10 )
AS a_results LEFT JOIN
(SELECT * from table_b
WHERE id IN
(SElECT id FROM a_results)
ON (a_results.id = b_results.id)
I would get the error "a_results is not a table". Anywhere I could use the re-use the results of the subquery?
Edit: It has been noted that this query doesn't make sense...it doesn't, yes. This is just to illustrate the question which I am asking; the 'real' query actually looks something like this:
SELECT SQL_CALC_FOUND_ROWS * FROM
( SELECT wp_pod_tbl_hotel . *
FROM wp_pod_tbl_hotel, wp_pod_rel, wp_pod
WHERE wp_pod_rel.field_id =12
AND wp_pod_rel.tbl_row_id =1
AND wp_pod.tbl_row_id = wp_pod_tbl_hotel.id
AND wp_pod_rel.pod_id = wp_pod.id
) as
found_hotel LEFT JOIN (
SELECT COUNT(*) as review_count, avg( (
location_rating + staff_performance_rating + condition_rating + room_comfort_rating + food_rating + value_rating
) /6 ) AS average_score, hotelid
FROM (
SELECT r. * , wp_pod_rel.tbl_row_id AS hotelid
FROM wp_pod_tbl_review r, wp_pod_rel, wp_pod
WHERE wp_pod_rel.field_id =11
AND wp_pod_rel.pod_id = wp_pod.id
AND r.id = wp_pod.tbl_row_id
AND wp_pod_rel.tbl_row_id
IN (
SELECT wp_pod_tbl_hotel .id
FROM wp_pod_tbl_hotel, wp_pod_rel, wp_pod
WHERE wp_pod_rel.field_id =12
AND wp_pod_rel.tbl_row_id =1
AND wp_pod.tbl_row_id = wp_pod_tbl_hotel.id
AND wp_pod_rel.pod_id = wp_pod.id
)
) AS hotel_reviews
GROUP BY hotel_reviews.hotelid
ORDER BY average_score DESC
AS sorted_hotel ON (id = sorted_hotel.hotelid)
As you can see, the sub-query which makes up the found_query table is repeated elsewhere downward as another sub-query, so I was hoping to re-use the results
You can not use a sub-query like this.
I'm not sure I understand your query, but wouldn't that be sufficient?
SELECT * FROM table_a a
LEFT JOIN table_b b ON ( b.id = a.id )
WHERE a.id > 10
It would return all rows from table_a where id > 10 and LEFT JOIN rows from table_b where id matches.