BigQuery/SQL cross join question: exclude from a column

I have two tables:
One player ID can engage with different features, and a player can engage with any number of features in a month.
What I want to achieve is the table below: a non-engagement table that lists, for each player ID, all the features that player has not engaged with.
WITH feature_table as (select distinct
Month,
Feature
From Feature_list),
engaged_player as(
select
'Y' Engaged_YN,
Month,
Engaged_Feature
FROM ga_data
),
select
from engaged_player
cross join ?????
)
My thinking is that I should probably start with a cross join.

Given your sample data, try this approach:
SELECT
'N' AS Engaged_YN,
gd.Month,
gd.Player_ID,
fl.Feature AS Not_Engaged_Feature
FROM mydataset.ga_data gd
CROSS JOIN mydataset.Feature_list fl
WHERE fl.Feature != gd.Engaged_Feature
ORDER BY Player_ID, Not_Engaged_Feature
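One caveat with the query above: the filter fl.Feature != gd.Engaged_Feature compares each feature against a single ga_data row, so a player who engaged with two or more features would still get those features listed as not engaged (each row only excludes its own feature). A minimal sketch of an anti-join variant follows. It assumes ga_data has the Month, Player_ID and Engaged_Feature columns used above, and that Feature_list is keyed by Month and Feature as in the question's feature_table CTE; matching on Month is an assumption, not something the question confirms.

```sql
-- Sketch only: one row per (Month, Player_ID, Feature) the player did NOT engage with.
WITH players AS (
  -- every player seen in a given month
  SELECT DISTINCT Month, Player_ID
  FROM mydataset.ga_data
),
features AS (
  -- every feature available in a given month
  SELECT DISTINCT Month, Feature
  FROM mydataset.Feature_list
)
SELECT
  'N' AS Engaged_YN,
  p.Month,
  p.Player_ID,
  f.Feature AS Not_Engaged_Feature
FROM players p
JOIN features f
  ON f.Month = p.Month          -- assumption: features are listed per month
LEFT JOIN mydataset.ga_data gd
  ON  gd.Player_ID = p.Player_ID
  AND gd.Month = p.Month
  AND gd.Engaged_Feature = f.Feature
WHERE gd.Player_ID IS NULL      -- keep only combinations with no engagement row
ORDER BY p.Player_ID, Not_Engaged_Feature;
```

If the feature list should apply to every month regardless of its Month column, replacing the join on Month with a CROSS JOIN against the distinct features would give that behaviour instead.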

Related

I need to match special character, how can I write query to get the required output?

I want only the emails that appear in both the suppliers and customers tables; I don't want supplier emails that have no match in the customers table. Unfortunately, the two tables don't share IDs to join on, so if possible I would like to join on the email only. If that is not possible, we can consider the common IDs and join on them.
The tables and the output I want are shown below for reference.
The proper solution is to normalize your database design to not store multiple values in the same column. However, I expect that is water under the bridge.
Given the data that you have, you can use STRING_SPLIT() to separate the emails from the SUPPLIERS table and then join with CUSTOMERS.
SELECT S.ID, C.EMAIL
FROM SUPPLIERS S
CROSS APPLY (
SELECT value AS EMAIL
FROM STRING_SPLIT(S.EMAIL, ',')
) E
JOIN CUSTOMERS C ON C.EMAIL = E.EMAIL
See this db<>fiddle
If your data may have any spaces mixed in, you may need to add TRIM() to the STRING_SPLIT() result - SELECT TRIM(value) AS EMAIL.
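For illustration, the same query with the trimming applied might look like the sketch below. It assumes the same SUPPLIERS and CUSTOMERS tables and a comma delimiter as above. TRIM() requires SQL Server 2017 or later; on SQL Server 2016, LTRIM(RTRIM(value)) should do the same job.

```sql
SELECT S.ID, C.EMAIL
FROM SUPPLIERS S
CROSS APPLY (
    -- split the comma-separated list and strip surrounding spaces from each piece
    SELECT TRIM(value) AS EMAIL
    FROM STRING_SPLIT(S.EMAIL, ',')
) E
JOIN CUSTOMERS C ON C.EMAIL = E.EMAIL;
```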
Select CC.ID, CC.email from
(
SELECT ROW_NUMBER() OVER (ORDER BY ID ASC) AS AID,a.*
FROM Suppliers a
) as AA
left outer join
(
Select BB.* from
(
SELECT ROW_NUMBER() OVER (ORDER BY ID ASC) AS BID,b.*
FROM Customers b
) as BB
) as CC
on AA.AID = CC.BID

How to get a result set containing the absence of a value?

Scenario: I have a table with four columns: District_Number, District_name, Data_Collection_Week, and enrollments. Each week we get data, BUT sometimes we do not.
Task: My supervisor wants me to produce a query that will let us know which districts did not submit in a given week.
What I have tried is below, but I cannot get a NULL value on those that did not submit a week.
SELECT DISTINCT DistrictNumber, DistrictName, DataCollectionWeek
into #test4
FROM EDW_REQUESTS.INSTRUCTION_DELIVERY_ENROLLMENT_2021
order by DistrictNumber, DataCollectionWeek asc
select DISTINCT DataCollectionWeek
into #test5
from EDW_REQUESTS.INSTRUCTION_DELIVERY_ENROLLMENT_2021
order by DataCollectionWeek
select b.DistrictNumber, b.DistrictName, b.DataCollectionWeek
from #test5 a left outer join #test4 b on (a.DataCollectionWeek = b.DataCollectionWeek)
order by b.DistrictNumber, b.DataCollectionWeek asc
One option uses a cross join of two select distinct subqueries to generate all possible combinations of districts and weeks, and then not exists to identify those that are not available in the table:
select d.districtnumber, w.datacollectionweek
from (select distinct districtnumber from edw_requests.instruction_delivery_enrollment_2021) d
cross join (select distinct datacollectionweek from edw_requests.instruction_delivery_enrollment_2021) w
where not exists (
select 1
from edw_requests.instruction_delivery_enrollment_2021 i
where i.districtnumber = d.districtnumber and i.datacollectionweek = w.datacollectionweek
)
This would be simpler (and much more efficient) if you had reference tables to store the districts and weeks: you would then use them directly instead of the select distinct subqueries.
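As a rough sketch of that idea, suppose two reference tables existed. The names districts and weeks and their columns are hypothetical here, since nothing in the question says such tables exist. The query could then read:

```sql
-- districts(districtnumber, districtname) and weeks(datacollectionweek)
-- are hypothetical reference tables, not part of the original schema.
select d.districtnumber, d.districtname, w.datacollectionweek
from districts d
cross join weeks w
where not exists (
    select 1
    from edw_requests.instruction_delivery_enrollment_2021 i
    where i.districtnumber = d.districtnumber
      and i.datacollectionweek = w.datacollectionweek
)
order by d.districtnumber, w.datacollectionweek;
```

A side benefit of this shape is that a district which never submitted at all would still show up, which the select distinct version cannot do because it only knows districts that appear in the data.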

Subtracting values of columns from two different tables

I would like to take values from one table column and subtract those values from another column from another table.
I was able to achieve this by joining those tables and then subtracting both columns from each other.
Data from first table:
SELECT max_participants FROM courses ORDER BY id;
Data from second table:
SELECT COUNT(id) FROM participations GROUP BY course_id ORDER BY course_id;
Here is some code:
SELECT max_participants - participations AS free_places FROM
(
SELECT max_participants, COUNT(participations.id) AS participations
FROM courses
INNER JOIN participations ON participations.course_id = courses.id
GROUP BY courses.max_participants, participations.course_id
ORDER BY participations.course_id
) AS course_places;
In general it works, but I was wondering whether there is a simpler way to do it, or whether my approach is incorrect and this code will fail under some conditions. Maybe it needs to be optimized.
I've read that you should not rely on the natural order of a result set in a database, and that is what raised my doubts.
If you want the values per course, I would recommend:
SELECT c.id, (c.max_participants - COUNT(p.id)) AS free_places
FROM courses c LEFT JOIN
participations p
ON p.course_id = c.id
GROUP BY c.id, c.max_participants
ORDER BY 1;
Note the LEFT JOIN to be sure all courses are included, even those with no participants.
The overall number is a little trickier. One method is to use the above as a subquery. Alternatively, you can pre-aggregate each table:
select c.max_participants - p.num_participants as free_places
from (select sum(max_participants) as max_participants from courses) c cross join
(select count(*) as num_participants from participations) p;
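The subquery method mentioned above could look roughly like this. It is a sketch that wraps the per-course query and sums its result; the aliases per_course and total_free_places are just illustrative names.

```sql
SELECT SUM(free_places) AS total_free_places
FROM (
    -- free places per course; LEFT JOIN keeps courses with no participations
    SELECT c.id, c.max_participants - COUNT(p.id) AS free_places
    FROM courses c
    LEFT JOIN participations p ON p.course_id = c.id
    GROUP BY c.id, c.max_participants
) per_course;
```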

Sub query to count number of time an id appears in another table

Using SQL Server 2012. I have a table called deals that contains a primary key called deal_id along with 10 other fields. I also have a table called deals_country that contains a foreign key called deal_id.
It's possible that a record in deals has numerous matching records in deals_country. What I want to do is count the number of times every deal_id from deals appears in deals_country.
Below is what I have tried without success.
select MA_DEALS.*, MA_DEALS_COUNTRY.mycount
from MA_DEALS cross apply
(
select count(MA_DEALS_COUNTRY.deal_id) as mycount
from MA_DEALS_COUNTRY
group by MA_DEALS_COUNTRY.deal_id
) MA_DEALS_COUNTRY
order by MA_DEALS.deal_id
Although you can use CROSS APPLY for this, I would start with the basic JOIN and GROUP BY query instead:
select d.*, dc.mycount
from MA_DEALS d left join
(select dc.deal_id, count(dc.deal_id) as mycount
from MA_DEALS_COUNTRY dc
group by dc.deal_id
) dc
on d.deal_id = dc.deal_id
order by d.deal_id;
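For comparison, a CROSS APPLY version would need the inner query to be correlated to the outer row, which is the piece missing from the attempt in the question. A minimal sketch, using the same table names (the alias c is introduced here for illustration):

```sql
select d.*, dc.mycount
from MA_DEALS d
cross apply (
    -- correlated on deal_id; an aggregate without GROUP BY always returns
    -- exactly one row, so deals with no countries get mycount = 0
    select count(*) as mycount
    from MA_DEALS_COUNTRY c
    where c.deal_id = d.deal_id
) dc
order by d.deal_id;
```

Note that the LEFT JOIN version returns NULL rather than 0 for deals with no countries unless the count is wrapped in COALESCE(dc.mycount, 0).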
Try this:
SELECT D.*,
DC.N
FROM MA_DEALS D
LEFT JOIN ( SELECT deal_id, COUNT(*) N
FROM MA_DEALS_COUNTRY
GROUP BY deal_id) DC
ON D.deal_id = DC.deal_id

Select SUM from multiple tables

I keep getting the wrong sum value when I join 3 tables.
Here is a pic of the ERD of the table:
(Original here: http://dl.dropbox.com/u/18794525/AUG%207%20DUMP%20STAN.png )
Here is the query:
select SUM(gpCutBody.actualQty) as cutQty , SUM(gpSewBody.quantity) as sewQty
from jobOrder
inner join gpCutHead on gpCutHead.joNum = jobOrder.joNum
inner join gpSewHead on gpSewHead.joNum = jobOrder.joNum
inner join gpCutBody on gpCutBody.gpCutID = gpCutHead.gpCutID
inner join gpSewBody on gpSewBody.gpSewID = gpSewHead.gpSewID
If you are only interested in the quantities of cuts and sews for all orders, the simplest way to do it would be like this:
select (select SUM(gpCutBody.actualQty) from gpCutBody) as cutQty,
(select SUM(gpSewBody.quantity) from gpSewBody) as sewQty
(This assumes that cuts and sews will always have associated job orders.)
If you want to see a breakdown of cuts and sews by job order, something like this might be preferable:
select joNum, SUM(actualQty) as cutQty, SUM(quantity) as sewQty
from (select joNum, actualQty, 0 as quantity
from gpCutBody
union all
select joNum, 0 as actualQty, quantity
from gpSewBody) sc
group by joNum
Mark's approach is a good one. I want to suggest the alternative of doing the group by's before the union, simply because this can be a more general approach for summing along multiple dimensions.
Your problem is that you have two dimensions that you want to sum along, and you are getting a cross product of the values in the join.
select coalesce(act.joNum, q.joNum) as joNum, act.quantity as ActualQty, q.quantity as Quantity
from (select joNum, sum(actualQty) as quantity
from gpCutBody
group by joNum
) act full outer join
(select joNum, sum(quantity) as quantity
from gpSewBody
group by joNum
) q
on act.joNum = q.joNum
(I have kept Mark's assumption that doing this by joNum is the desired output.)
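Both answers select joNum directly from the body tables, whereas the query in the question reaches joNum through gpCutHead and gpSewHead. If the body tables do not carry joNum themselves (an assumption, since the ERD is not reproduced here), the same pre-aggregation idea could be sketched using only the joins from the question:

```sql
select coalesce(c.joNum, s.joNum) as joNum, c.cutQty, s.sewQty
from (
    -- total cut quantity per job order
    select h.joNum, sum(b.actualQty) as cutQty
    from gpCutHead h
    inner join gpCutBody b on b.gpCutID = h.gpCutID
    group by h.joNum
) c
full outer join (
    -- total sewn quantity per job order
    select h.joNum, sum(b.quantity) as sewQty
    from gpSewHead h
    inner join gpSewBody b on b.gpSewID = h.gpSewID
    group by h.joNum
) s
on s.joNum = c.joNum;
```

Because each side is collapsed to one row per joNum before the join, the cut rows and sew rows no longer multiply each other.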