I'm attempting the following query,
DECLARE #EntityType varchar(25)
SET #EntityType = 'Accessory';
WITH Entities (
E_ID, E_Type,
P_ID, P_Name, P_DataType, P_Required, P_OnlyOne,
PV_ID, PV_Value, PV_EntityID, PV_ValueEntityID,
PV_UnitValueID, PV_UnitID, PV_UnitName, PV_UnitDesc, PV_MeasureID, PV_MeasureName, PV_UnitValue,
PV_SelectionID, PV_DropDownID, PV_DropDownName, PV_DropDownOptionID, PV_DropDownOptionName, PV_DropDownOptionDesc,
RecursiveLevel
)
AS
(
-- Original Query
SELECT dbo.Entity.ID AS E_ID, dbo.EntityType.Name AS E_Type,
dbo.Property.ID AS P_ID, dbo.Property.Name AS P_Name, DataType.Name AS P_DataType, Required AS P_Required, OnlyOne AS P_OnlyOne,
dbo.PropertyValue.ID AS PV_ID, dbo.PropertyValue.Value AS PV_Value, dbo.PropertyValue.EntityID AS PV_EntityID, dbo.PropertyValue.ValueEntityID AS PV_ValueEntityID,
dbo.UnitValue.ID AS PV_UnitValueID, dbo.UnitOfMeasure.ID AS PV_UnitID, dbo.UnitOfMeasure.Name AS PV_UnitName, dbo.UnitOfMeasure.Description AS PV_UnitDesc, dbo.Measure.ID AS PV_MeasureID, dbo.Measure.Name AS PV_MeasureName, dbo.UnitValue.UnitValue AS PV_UnitValue,
dbo.DropDownSelection.ID AS PV_SelectionID, dbo.DropDown.ID AS PV_DropDownID, dbo.DropDown.Name AS PV_DropDownName, dbo.DropDownOption.ID AS PV_DropDownOptionID, dbo.DropDownOption.Name AS PV_DropDownOptionName, dbo.DropDownOption.Description AS PV_DropDownOptionDesc,
0 AS RecursiveLevel
FROM dbo.Entity
INNER JOIN dbo.EntityType ON dbo.EntityType.ID = dbo.Entity.TypeID
INNER JOIN dbo.Property ON dbo.Property.EntityTypeID = dbo.Entity.TypeID
INNER JOIN dbo.PropertyValue ON dbo.Property.ID = dbo.PropertyValue.PropertyID AND dbo.PropertyValue.EntityID = dbo.Entity.ID
INNER JOIN dbo.DataType ON dbo.DataType.ID = dbo.Property.DataTypeID
LEFT JOIN dbo.UnitValue ON dbo.UnitValue.ID = dbo.PropertyValue.UnitValueID
LEFT JOIN dbo.UnitOfMeasure ON dbo.UnitOfMeasure.ID = dbo.UnitValue.UnitOfMeasureID
LEFT JOIN dbo.Measure ON dbo.Measure.ID = dbo.UnitOfMeasure.MeasureID
LEFT JOIN dbo.DropDownSelection ON dbo.DropDownSelection.ID = dbo.PropertyValue.DropDownSelectedID
LEFT JOIN dbo.DropDownOption ON dbo.DropDownOption.ID = dbo.DropDownSelection.SelectedOptionID
LEFT JOIN dbo.DropDown ON dbo.DropDown.ID = dbo.DropDownSelection.DropDownID
WHERE dbo.EntityType.Name = #EntityType
UNION ALL
-- Recursive Query?
SELECT E2.E_ID AS E_ID, dbo.EntityType.Name AS E_Type,
dbo.Property.ID AS P_ID, dbo.Property.Name AS P_Name, DataType.Name AS P_DataType, Required AS P_Required, OnlyOne AS P_OnlyOne,
dbo.PropertyValue.ID AS PV_ID, dbo.PropertyValue.Value AS PV_Value, dbo.PropertyValue.EntityID AS PV_EntityID, dbo.PropertyValue.ValueEntityID AS PV_ValueEntityID,
dbo.UnitValue.ID AS PV_UnitValueID, dbo.UnitOfMeasure.ID AS PV_UnitID, dbo.UnitOfMeasure.Name AS PV_UnitName, dbo.UnitOfMeasure.Description AS PV_UnitDesc, dbo.Measure.ID AS PV_MeasureID, dbo.Measure.Name AS PV_MeasureName, dbo.UnitValue.UnitValue AS PV_UnitValue,
dbo.DropDownSelection.ID AS PV_SelectionID, dbo.DropDown.ID AS PV_DropDownID, dbo.DropDown.Name AS PV_DropDownName, dbo.DropDownOption.ID AS PV_DropDownOptionID, dbo.DropDownOption.Name AS PV_DropDownOptionName, dbo.DropDownOption.Description AS PV_DropDownOptionDesc,
(RecursiveLevel + 1)
FROM Entities AS E2
INNER JOIN dbo.Entity ON dbo.Entity.ID = E2.PV_ValueEntityID
INNER JOIN dbo.EntityType ON dbo.EntityType.ID = dbo.Entity.TypeID
INNER JOIN dbo.Property ON dbo.Property.EntityTypeID = dbo.Entity.TypeID
INNER JOIN dbo.PropertyValue ON dbo.Property.ID = dbo.PropertyValue.PropertyID AND dbo.PropertyValue.EntityID = E2.E_ID
INNER JOIN dbo.DataType ON dbo.DataType.ID = dbo.Property.DataTypeID
INNER JOIN dbo.UnitValue ON dbo.UnitValue.ID = dbo.PropertyValue.UnitValueID
INNER JOIN dbo.UnitOfMeasure ON dbo.UnitOfMeasure.ID = dbo.UnitValue.UnitOfMeasureID
INNER JOIN dbo.Measure ON dbo.Measure.ID = dbo.UnitOfMeasure.MeasureID
INNER JOIN dbo.DropDownSelection ON dbo.DropDownSelection.ID = dbo.PropertyValue.DropDownSelectedID
INNER JOIN dbo.DropDownOption ON dbo.DropDownOption.ID = dbo.DropDownSelection.SelectedOptionID
INNER JOIN dbo.DropDown ON dbo.DropDown.ID = dbo.DropDownSelection.DropDownID
)
SELECT E_ID, E_Type,
P_ID, P_Name, P_DataType, P_Required, P_OnlyOne,
PV_ID, PV_Value, PV_EntityID, PV_ValueEntityID,
PV_UnitValueID, PV_UnitID, PV_UnitName, PV_UnitDesc, PV_MeasureID, PV_MeasureName, PV_UnitValue,
PV_SelectionID, PV_DropDownID, PV_DropDownName, PV_DropDownOptionID, PV_DropDownOptionName, PV_DropDownOptionDesc,
RecursiveLevel
FROM Entities
INNER JOIN [dbo].[Entity] AS dE
ON dE.ID = PV_EntityID
The problem is the second query, the "recursive one" is getting the data I expect since I can't do the LEFT JOINs like in the first query. (At least to my understanding).
If I remove the fetching of the data that requires the LEFT (Outer) JOINs then the recursion works perfectly. My problem is I need both. Is there a way I can accomplish this?
Per http://msdn.microsoft.com/en-us/library/ms175972.aspx you can not have a left/right/outer join in a recursive CTE.
For a recursive CTE you can't use a subquery either so I sugest following this example.
They use two CTE's. The first is not recursive and does the left join to get the data it needs. The second CTE is recursive and inner joins on the first CTE. Since CTE1 is not recursive it can left join and supply default values for the missing rows and is guarenteed to work in the inner join.
However, you can also duplicate a left join with a union and subselect though it isn't really useful normally but it is interesting.
In that case, you would keep your first statement how it is. It will match all rows that join successfully.
Then UNION that query with another query that removes the join, but has a
NOT EXISTS(SELECT 1 FROM MISSING_ROWS_TABLE WHERE MAIN_TABLE.JOIN_CONDITION = MISSING_ROWS_TABLE.JOIN_CONDITION)
This gets all the rows that failed the previous join condition in query 1. You can replace the colmuns you would get from MISSING_ROWS_TABLE with NULL. I had to do this once using a coding framework that didn't support outer joins. Since recursive CTE's don't allow subqueries you have to use the first solution.
Related
I am converting following oracle query to bigquery query but the results(record counts) are different, eventhough base tables involved in the query are having same number of records in both oracle and bq.
oracle :
SELECT
to_char(R_PROJECT_S.PROJECT_COPYRIGHT_YEAR),
R_PROJECT_S.PROJECT_TITLE,
to_char(R_PROJECT_S.EDITION),
R_PROJECT_S.CIRCULATION_DESC,
R_PROJECT_S.DISTRIBUTION_DESC,
R_PROJECT_S.PROJECT_ID,
DB.R_USAGE_INFO_S.OBJECT_ID,
UPPER(DB.R_INFO_S.PHOTOGRAPHER),
UPPER(DB.R_INFO_S.SOURCE_CAPTION),
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEAU,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHECPYR,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEED,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEPRDDE,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEGRDE,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHSODE,
R_PROJECT_S.CHARGE_TO_ISBN,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEPTIT,
DB.R_INFO_S.SOURCE_NAME,
R_PROJECT_S.LANGUAGE_DESC,
R_PROJECT_S.PROJECT_FORMAT_DESC,
DB.R_USAGE_INFO_S.USAGE_ID,
DB.R_USAGE_INFO_S.PAGE,
DB.R_USAGE_INFO_S.CHAPTER,
DB.R_INFO_S.WORK_PROJECT_ID,
DB.R_INFO_S.IMAGE_TYPE_DESC,
DB.R_INFO_S.IMAGE_DESC,
DB.R_USAGE_INFO_S.PERMISSION_TYPE_DESC,
DB.R_USAGE_INFO_S.PERMISSION_STATUS_DESC,
DB.R_USAGE_INFO_S.PERMISSION_USAGE_DESC,
DB.R_USAGE_INFO_S.USAGE_LABEL,
DB.R_USAGE_INFO_S.QUOTED_COST,
DB.R_INFO_S.SOURCE_OBJECT_ID,
DB.R_USAGE_INFO_S.USAGE_TYPE_DESC,
GHEPM_TITLE_PSPP.TITLE_DESCRIPTION,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHESOAB,
ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHEGRCD
FROM
DB.R_PROJECT_S_VW R_PROJECT_S,
DB.R_USAGE_INFO_S,
DB.R_INFO_S,
ADMIN.BIC_APHEISBN00_BO_VW,
DB.GHEPM_TITLE GHEPM_TITLE_PSPP
WHERE
( R_PROJECT_S.PROJECT_ID=DB.R_USAGE_INFO_S.PROJECT_ID(+)
)
AND ( DB.R_USAGE_INFO_S.OBJECT_ID=DB.R_INFO_S.OBJECT_ID )
AND ( R_PROJECT_S.PROJECT_ID=ADMIN.BIC_APHEISBN00_BO_VW.BIC_ZCHETIIS(+) )
AND ( R_PROJECT_S.PROJECT_ID=DB.GHEPM_TITLE_PSPP.ISBN10(+) )
AND UPPER(DB.R_USAGE_INFO_S.USAGE_LABEL) NOT LIKE UNISTR('%KILL%')
BQ:
SELECT
CAST(R_PROJECT_S.PROJECT_COPYRIGHT_YEAR AS string) COPYRIGHT_YEAR,
R_PROJECT_S.PROJECT_TITLE,
CAST(R_PROJECT_S.EDITION AS string) EDITION,
R_PROJECT_S.CIRCULATION_DESC,
R_PROJECT_S.DISTRIBUTION_DESC,
R_PROJECT_S.PROJECT_ID,
R_USAGE_INFO_S.OBJECT_ID,
UPPER(R_INFO_S.PHOTOGRAPHER) PHOTOGRAPHER,
UPPER(R_INFO_S.SOURCE_CAPTION) SOURCE_CAPTION,
BIC_APHEISBN00_BO._BIC_ZCHEAU,
BIC_APHEISBN00_BO._BIC_ZCHECPYR,
BIC_APHEISBN00_BO._BIC_ZCHEED,
BIC_APHEISBN00_BO._BIC_ZCHEPRDDE,
BIC_APHEISBN00_BO._BIC_ZCHEGRDE,
BIC_APHEISBN00_BO._BIC_ZCHSODE,
R_PROJECT_S.CHARGE_TO_ISBN,
BIC_APHEISBN00_BO._BIC_ZCHEPTIT,
R_INFO_S.SOURCE_NAME,
R_PROJECT_S.LANGUAGE_DESC,
R_PROJECT_S.PROJECT_FORMAT_DESC,
R_USAGE_INFO_S.USAGE_ID,
R_USAGE_INFO_S.PAGE,
R_USAGE_INFO_S.CHAPTER,
R_INFO_S.WORK_PROJECT_ID,
R_INFO_S.IMAGE_TYPE_DESC,
R_INFO_S.IMAGE_DESC,
R_USAGE_INFO_S.PERMISSION_TYPE_DESC,
R_USAGE_INFO_S.PERMISSION_STATUS_DESC,
R_USAGE_INFO_S.PERMISSION_USAGE_DESC,
R_USAGE_INFO_S.USAGE_LABEL,
R_USAGE_INFO_S.QUOTED_COST,
R_INFO_S.SOURCE_OBJECT_ID,
R_USAGE_INFO_S.USAGE_TYPE_DESC,
GHEPM_TITLE_PSPP.TITLE_DESCRIPTION,
BIC_APHEISBN00_BO._BIC_ZCHESOAB,
BIC_APHEISBN00_BO._BIC_ZCHEGRCD
FROM
`domain-rr.oracle_DB_DB.R_info_s` R_INFO_S
inner join
`domain-rr.oracle_DB_DB.R_usage_info_s` R_USAGE_INFO_S
on
R_USAGE_INFO_S.OBJECT_ID=R_INFO_S.OBJECT_ID
right outer join
`domain-rr.DB_RPT.R_PROJECT_S_VW` R_PROJECT_S
on
R_PROJECT_S.PROJECT_ID=R_USAGE_INFO_S.PROJECT_ID
left outer join
`domain-rr.DB_RPT.BIC_APHEISBN00_BO_VW` BIC_APHEISBN00_BO
ON
R_PROJECT_S.PROJECT_ID=BIC_APHEISBN00_BO._BIC_ZCHETIIS
left outer join
`domain-rr.oracle_DB_DB.ghepm_title` GHEPM_TITLE_PSPP
ON
R_PROJECT_S.PROJECT_ID=GHEPM_TITLE_PSPP.ISBN10
AND UPPER(R_USAGE_INFO_S.USAGE_LABEL) NOT LIKE '%KILL%'
Oracle count - 1553437
BQ count - 2414413
Please help me on how to get counts are same on both oracle and bq
Thanks,
Naren
Had you used more readable, shortened table aliases several differences can be illuminated:
Oracle does not attempt any RIGHT JOIN;
GBQ should run UPPER(...) expression in WHERE not on last LEFT JOIN clause or move expression to INNER JOIN on ui table (but without testing may not make a difference but readability);
Table order may make a difference especially with use of both INNER and OUTER joins;
Oracle (using the older, outdated implicit joins)
...
FROM
GRDW.RMS_IMAGE_PROJECT_S_VW p,
GRDW.RMS_IMAGE_USAGE_INFO_S ui,
GRDW.RMS_IMAGE_INFO_S i,
BOADMIN.BIC_APHEISBN00_BO_VW b,
GRDW.GHEPM_TITLE g
WHERE
( p.PROJECT_ID = ui.PROJECT_ID(+) -- LEFT JOIN
)
AND ( ui.OBJECT_ID = i.OBJECT_ID ) -- INNER JOIN
AND ( p.PROJECT_ID = b.BIC_ZCHETIIS(+) ) -- LEFT JOIN
AND ( p.PROJECT_ID = g.ISBN10(+) ) -- LEFT JOIN
AND UPPER(ui.USAGE_LABEL) NOT LIKE UNISTR('%KILL%')
Google BigQuery (using current standard of explicit joins)
...
FROM
`pearson-rr.oracle_grdw_grdw.rms_image_info_s` i
INNER JOIN
`pearson-rr.oracle_grdw_grdw.rms_image_usage_info_s` ui
ON ui.OBJECT_ID = i.OBJECT_ID
RIGHT OUTER JOIN
`pearson-rr.GRDW_RPT.RMS_IMAGE_PROJECT_S_VW` p
ON p.PROJECT_ID = ui.PROJECT_ID
LEFT OUTER JOIN
`pearson-rr.GRDW_RPT.BIC_APHEISBN00_BO_VW` b
ON p.PROJECT_ID = b._BIC_ZCHETIIS
LEFT OUTER JOIN
`pearson-rr.oracle_grdw_grdw.ghepm_title` g
ON p.PROJECT_ID = g.ISBN10
AND UPPER(ui.USAGE_LABEL) NOT LIKE '%KILL%'
Therefore, to account for table order and appropriate JOIN, consider below adjusted Google BigQuery:
...
FROM
`pearson-rr.GRDW_RPT.RMS_IMAGE_PROJECT_S_VW` p
LEFT OUTER JOIN
`pearson-rr.oracle_grdw_grdw.rms_image_usage_info_s` ui
ON p.PROJECT_ID = ui.PROJECT_ID
INNER OUTER JOIN
`pearson-rr.oracle_grdw_grdw.rms_image_info_s` i
ON ui.OBJECT_ID = i.OBJECT_ID AND UPPER(ui.USAGE_LABEL) NOT LIKE '%KILL%'
LEFT OUTER JOIN
`pearson-rr.GRDW_RPT.BIC_APHEISBN00_BO_VW` b
ON p.PROJECT_ID = b._BIC_ZCHETIIS
LEFT OUTER JOIN
`pearson-rr.oracle_grdw_grdw.ghepm_title` g
ON p.PROJECT_ID = g.ISBN10
I'm fairly new to sql and am probably over my head with this but I keep running into a syntax error in my join statement.
I am trying to get specific stats for a single character. I have added more parenthese to get rid of a missing operator error, and I have tried adding parenthese to only around the inner joins of the same tables. So far the join statement is the only thing throwing errors.
SELECT CHARACTER.CharacterName, CHARACTER.Alignment, INVENTORY.Equipped,
ITEMS.ItemName,
ITEMS.PhysDef, ITEMS.MDef, ITEMS.Dodge, ITEMS.Damage, ITEMS.CritMultiplier,
ITEMS.Range,
ITEMS.AttackSpeed, JOB_CHARACTER.JobLevel, RACE_CHARACTER.RacialLevel,
RACE.RaceName, RACE.Strength,
RACE.Skill, RACE.Vitality, RACE.Arcane, RACE.Spirit, RACE.Charisma,
RACE.Luck, JOB.JobName,
JOB.HP, JOB.AttackBonus, JOB.Agility, JOB.Might, JOB.SpellPower, JOB.Vital,
JOB.Nimble, JOB.Mental,
JOB.Curese, JOB.SpellCasting, JOB.ManaBase, JOB.ManaType, JOB.Ki,
SKILLS.Alchemy, SKILLS.Awareness,
SKILLS.Climb, SKILLS.Coach, SKILLS.Construction, SKILLS.Decieve,
SKILLS.DisarmMechanism,
SKILLS.DiscernTruth, SKILLS.Dishearten, SKILLS.Fly, SKILLS.Forge,
SKILLS.Gymnastics,SKILLS.Identify,
SKILLS.Leadership, SKILLS.Lore_9Realms, SKILLS.Lore_Alternative,
SKILLS.Lore_Arcane,
SKILLS.Lore_Arithmancy, SKILLS.Lore_Divine, SKILLS.Lore_Geography,
SKILLS.Lore_Nature,
SKILLS.Lore_Nobility, SKILLS.Lore_Religion, SKILLS.Lore_Spiritual,
SKILLS.Medical,
SKILLS.Performance, SKILLS.Ride, SKILLS.Steal, SKILLS.Stealth,
SKILLS.Subterfuge,
SKILLS.Swim, SKILLS.Tailor, SKILLS.UseContraption, SKILLS.Wilderness
FROM (((((((CHARACTER INNER JOIN RACE_CHARACTER ON
CHARACTER.CharacterID=RACE_CHARACTER.CharacterID)
LEFT JOIN JOB_CHARACTER ON CHARACTER.CharacterID=JOB_CHARACTER.CharacterID)
LEFT JOIN INVENTORY ON CHARACTER.CharacterID=INVENTORY.CharacterID)
INNER JOIN INVENTORY ON ITEMS.ItemID =INVENTORY.ItemID)
INNER JOIN JOB ON JOB.JobID=JOB_CHARACTER.JobID)
LEFT JOIN SKILLS ON JOB.JobID=SKILLS.JobID)
INNER JOIN RACE ON RACE.RaceID=RACE_CHARACTER.RaceID)
WHERE ((CHARACTER.CharacterID)=1) AND ((JOB.JobID)=3) AND
((RACE.RaceID)=6);
I expect this to output the stats for this single character, its name, and skills, however it is not currently outputting anything.
Your select statement is sourcing 8 fields from an ITEMS table:
SELECT
...
ITEMS.ItemName,
ITEMS.PhysDef,
ITEMS.MDef,
ITEMS.Dodge,
ITEMS.Damage,
ITEMS.CritMultiplier,
ITEMS.Range,
ITEMS.AttackSpeed,
...
However, the ITEMS table is not referenced by your from clause:
FROM
(
(
(
(
(
(
(
CHARACTER INNER JOIN RACE_CHARACTER ON
CHARACTER.CharacterID=RACE_CHARACTER.CharacterID
)
LEFT JOIN JOB_CHARACTER ON
CHARACTER.CharacterID=JOB_CHARACTER.CharacterID
)
LEFT JOIN INVENTORY ON
CHARACTER.CharacterID=INVENTORY.CharacterID
)
INNER JOIN INVENTORY ON ----------< INVENTORY table referenced twice
ITEMS.ItemID =INVENTORY.ItemID
)
INNER JOIN JOB ON
JOB.JobID=JOB_CHARACTER.JobID
)
LEFT JOIN SKILLS ON
JOB.JobID=SKILLS.JobID
)
INNER JOIN RACE ON
RACE.RaceID=RACE_CHARACTER.RaceID
)
I should imagine the SQL code should be changed to something like:
SELECT
CHARACTER.CharacterName,
CHARACTER.Alignment,
INVENTORY.Equipped,
ITEMS.ItemName,
ITEMS.PhysDef,
ITEMS.MDef,
ITEMS.Dodge,
ITEMS.Damage,
ITEMS.CritMultiplier,
ITEMS.Range,
ITEMS.AttackSpeed,
JOB_CHARACTER.JobLevel,
RACE_CHARACTER.RacialLevel,
RACE.RaceName,
RACE.Strength,
RACE.Skill,
RACE.Vitality,
RACE.Arcane,
RACE.Spirit,
RACE.Charisma,
RACE.Luck,
JOB.JobName,
JOB.HP,
JOB.AttackBonus,
JOB.Agility,
JOB.Might,
JOB.SpellPower,
JOB.Vital,
JOB.Nimble,
JOB.Mental,
JOB.Curese,
JOB.SpellCasting,
JOB.ManaBase,
JOB.ManaType,
JOB.Ki,
SKILLS.Alchemy,
SKILLS.Awareness,
SKILLS.Climb,
SKILLS.Coach,
SKILLS.Construction,
SKILLS.Decieve,
SKILLS.DisarmMechanism,
SKILLS.DiscernTruth,
SKILLS.Dishearten,
SKILLS.Fly,
SKILLS.Forge,
SKILLS.Gymnastics,
SKILLS.Identify,
SKILLS.Leadership,
SKILLS.Lore_9Realms,
SKILLS.Lore_Alternative,
SKILLS.Lore_Arcane,
SKILLS.Lore_Arithmancy,
SKILLS.Lore_Divine,
SKILLS.Lore_Geography,
SKILLS.Lore_Nature,
SKILLS.Lore_Nobility,
SKILLS.Lore_Religion,
SKILLS.Lore_Spiritual,
SKILLS.Medical,
SKILLS.Performance,
SKILLS.Ride,
SKILLS.Steal,
SKILLS.Stealth,
SKILLS.Subterfuge,
SKILLS.Swim,
SKILLS.Tailor,
SKILLS.UseContraption,
SKILLS.Wilderness
FROM
(
(
(
(
(
(
(
CHARACTER INNER JOIN RACE_CHARACTER ON
CHARACTER.CharacterID=RACE_CHARACTER.CharacterID
)
LEFT JOIN JOB_CHARACTER ON
CHARACTER.CharacterID=JOB_CHARACTER.CharacterID
)
LEFT JOIN INVENTORY ON
CHARACTER.CharacterID=INVENTORY.CharacterID
)
LEFT JOIN ITEMS ON
ITEMS.ItemID =INVENTORY.ItemID
)
LEFT JOIN JOB ON
JOB.JobID=JOB_CHARACTER.JobID
)
LEFT JOIN SKILLS ON
JOB.JobID=SKILLS.JobID
)
INNER JOIN RACE ON
RACE.RaceID=RACE_CHARACTER.RaceID
)
WHERE
CHARACTER.CharacterID = 1 AND
JOB.JobID = 3 AND
RACE.RaceID = 6
Note that I have changed a couple of the inner joins to left joins because if you use an inner join on a table which is the right of a left join (or on the left of a right join), then you will receive an ambiguous outer joins error.
Goal:
I wish to get the Count of how many times a WorkItem was re-assigned
From what I understand the proper query is the following:
SELECT
WorkItemDimvw.Id,
COUNT(WorkItemAssignedToUserFactvw.WorkItemAssignedToUser_UserDimKey) AS Assignments
FROM WorkItemDimvw INNER JOIN WorkItemAssignedToUserFactvw
ON WorkItemDimvw.WorkItemDimKey = WorkItemAssignedToUserFactvw.WorkItemDimKey
GROUP BY WorkItemDimvw.Id
The EXISTING query is below and I'm wondering / forgeting if I should:
Just add in COUNT(WorkItemAssignedToUserFactvw.WorkItemAssignedToUser_UserDimKey) AS Assignments since joins are existing, except it is group by WorkItemDimvw.Id
Should it instead be a subquery in the Select below?
Query:
SELECT
SRD.ID,
SRD.Title,
SRD.Description,
SRD.EntityDimKey,
WI.WorkItemDimKey,
IATUFact.DateKey
FROM
SLAConfigurationDimvw
INNER JOIN SLAInstanceInformationFactvw
ON SLAConfigurationDimvw.SLAConfigurationDimKey = SLAInstanceInformationFactvw.SLAConfigurationDimKey
RIGHT OUTER JOIN ServiceRequestDimvw AS SRD
INNER JOIN WorkItemDimvw AS WI
ON SRD.EntityDimKey = WI.EntityDimKey
LEFT OUTER JOIN WorkItemAssignedToUserFactvw AS IATUFact
ON WI.WorkItemDimKey = IATUFact.WorkItemDimKey
AND IATUFact.DeletedDate IS NULL
The trick is to aggregate the data on a sub query, before you join it.
SELECT
SRD.ID,
SRD.Title,
SRD.Description,
SRD.EntityDimKey,
WI.WorkItemDimKey,
IATUFact.DateKey,
IATUFact.Assignments
FROM
SLAConfigurationDimvw
INNER JOIN
SLAInstanceInformationFactvw
ON SLAConfigurationDimvw.SLAConfigurationDimKey = SLAInstanceInformationFactvw.SLAConfigurationDimKey
RIGHT OUTER JOIN
ServiceRequestDimvw AS SRD
ON <you're missing something here>
INNER JOIN
WorkItemDimvw AS WI
ON SRD.EntityDimKey = WI.EntityDimKey
LEFT OUTER JOIN
(
SELECT
WorkItemDimKey,
DateKey,
COUNT(WorkItemAssignedToUser_UserDimKey) AS Assignments
FROM
WorkItemAssignedToUserFactvw
WHERE
DeletedDate IS NULL
GROUP BY
WorkItemDimKey,
DateKey
)
IATUFact
ON WI.WorkItemDimKey = IATUFact.WorkItemDimKey
I can't understand the usage of the using word. Can you explain me?
SELECT 1
FROM CONF_RAGGR_OPZTAR ropt
JOIN TAR_OPZIONI_TARIFFARIE OPT using (OPT_OPZIONE_TARIFFARIA_ID)
JOIN CONF_RAGGRUPPAMENTI_FORN rgf using (RGF_RAGGRUPPAMENTO_FORN_ID)
JOIN CONF_FORNITURE_REL_RAGG forg using (RGF_RAGGRUPPAMENTO_FORN_ID)
JOIN CONF_FORNITURE forn using (FORN_FORNITURA_ID)
LEFT JOIN (
select *
from CONF_ELEMENTI_FATTURABILI
where ELF_FLAG_ANN = 'N'
AND ELF_DATA_VER_FIN = TO_DATE('31/12/9999','DD/MM/YYYY')
) elf **using** (ROPT_RAGGR_OPZTAR_ID,COID_CONTRATTUARIO_ID,ROPT_DATA_INI,EDW_PARTITION)
-- LEFT OUTER JOIN TAR_VOCI_FATTURABILI vof
-- ON (elf.VOF_VOCE_FATTURABILE_ID = vof.VOF_VOCE_FATTURABILE_ID)
-- LEFT OUTER JOIN BASE_FASCE_ORARIE fas
-- ON (fas.FAS_FASCIA_ORARIA_ID = elf.FAS_FASCIA_ORARIA_ID)
WHERE FORN_FORNITURA_ID = 'QJlXmOFZPF3eAlAG'
ORDER BY elf.ELF_VERSIONE DESC;
The using keyword indicates that this is a natural join. This means that the column names on both side of the join are identical.
In your case this means that you will join both sides on ROPT_RAGGR_OPZTAR_ID, COID_CONTRATTUARIO_ID, ROPT_DATA_INI and EDW_PARTITION.
I posted a query yesterday (see here) that was horrible (took over a minute to run, resulting in 18,215 records):
SELECT DISTINCT
dbo.contacts_link_emails.Email, dbo.contacts.ContactID, dbo.contacts.First AS ContactFirstName, dbo.contacts.Last AS ContactLastName, dbo.contacts.InstitutionID,
dbo.institutionswithzipcodesadditional.CountyID, dbo.institutionswithzipcodesadditional.StateID, dbo.institutionswithzipcodesadditional.DistrictID
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
INNER JOIN
dbo.institutionswithzipcodesadditional
ON dbo.contacts.InstitutionID = dbo.institutionswithzipcodesadditional.InstitutionID
LEFT OUTER JOIN
dbo.contacts_def_jobfunctions
INNER JOIN
dbo.contacts_link_jobfunctions
ON dbo.contacts_def_jobfunctions.JobID = dbo.contacts_link_jobfunctions.JobID
ON dbo.contacts.ContactID = dbo.contacts_link_jobfunctions.ContactID
WHERE
(dbo.contacts.JobTitle IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_1
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist))
OR
(dbo.contacts_link_jobfunctions.JobID IN
(SELECT JobID
FROM dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_2
WHERE (ParentJobID <> '1841')))
AND
(dbo.contacts_link_emails.Email NOT IN
(SELECT EmailAddress
FROM dbo.newsletterremovelist AS newsletterremovelist))
ORDER BY EMAIL
With a lot of coaching and research, I've tuned it up to the following:
SELECT contacts.ContactID,
contacts.InstitutionID,
contacts.First,
contacts.Last,
institutionswithzipcodesadditional.CountyID,
institutionswithzipcodesadditional.StateID,
institutionswithzipcodesadditional.DistrictID
FROM contacts
INNER JOIN contacts_link_emails ON
contacts.ContactID = contacts_link_emails.ContactID
INNER JOIN institutionswithzipcodesadditional ON
contacts.InstitutionID = institutionswithzipcodesadditional.InstitutionID
WHERE
(contacts.ContactID IN
(SELECT contacts_2.ContactID
FROM contacts AS contacts_2
INNER JOIN contacts_link_emails AS contacts_link_emails_2 ON
contacts_2.ContactID = contacts_link_emails_2.ContactID
LEFT OUTER JOIN contacts_def_jobfunctions ON
contacts_2.JobTitle = contacts_def_jobfunctions.JobID
RIGHT OUTER JOIN newsletterremovelist ON
contacts_link_emails_2.Email = newsletterremovelist.EmailAddress
WHERE (contacts_def_jobfunctions.ParentJobID <> 1841)
GROUP BY contacts_2.ContactID
UNION
SELECT contacts_1.ContactID
FROM contacts_link_jobfunctions
INNER JOIN contacts_def_jobfunctions AS contacts_def_jobfunctions_1 ON
contacts_link_jobfunctions.JobID = contacts_def_jobfunctions_1.JobID
AND contacts_def_jobfunctions_1.ParentJobID <> 1841
INNER JOIN contacts AS contacts_1 ON
contacts_link_jobfunctions.ContactID = contacts_1.ContactID
INNER JOIN contacts_link_emails AS contacts_link_emails_1 ON
contacts_link_emails_1.ContactID = contacts_1.ContactID
LEFT OUTER JOIN newsletterremovelist AS newsletterremovelist_1 ON
contacts_link_emails_1.Email = newsletterremovelist_1.EmailAddress
GROUP BY contacts_1.ContactID))
While this query is now super fast (about 3 seconds), I've blown part of the logic somewhere - it only returns 14,863 rows (instead of the 18,215 rows that I believe is accurate).
The results seem near correct. I'm working to discover what data might be missing in the result set.
Can you please coach me through whatever I've done wrong here?
Thanks,
Russell Schutte
The main problem with your original query was that you had two extra joins just to introduce duplicates and then a DISTINCT to get rid of them.
Use this:
SELECT cle.Email,
c.ContactID,
c.First AS ContactFirstName,
c.Last AS ContactLastName,
c.InstitutionID,
izip.CountyID,
izip.StateID,
izip.DistrictID
FROM dbo.contacts c
INNER JOIN
dbo.institutionswithzipcodesadditional izip
ON izip.InstitutionID = c.InstitutionID
INNER JOIN
dbo.contacts_link_emails cle
ON cle.ContactID = c.ContactID
WHERE cle.Email NOT IN
(
SELECT EmailAddress
FROM dbo.newsletterremovelist
)
AND EXISTS
(
SELECT NULL
FROM dbo.contacts_def_jobfunctions cdj
WHERE cdj.JobId = c.JobTitle
AND cdj.ParentJobId <> '1841'
UNION ALL
SELECT NULL
FROM dbo.contacts_link_jobfunctions clj
JOIN dbo.contacts_def_jobfunctions cdj
ON cdj.JobID = clj.JobID
WHERE clj.ContactID = c.ContactID
AND cdj.ParentJobId <> '1841'
)
ORDER BY
email
Create the following indexes:
newsletterremovelist (EmailAddress)
contacts_link_jobfunctions (ContactID, JobID)
contacts_def_jobfunctions (JobID)
Do you get the same results when you do:
SELECT count(*)
FROM
dbo.contacts_def_jobfunctions AS contacts_def_jobfunctions_3
INNER JOIN
dbo.contacts
INNER JOIN
dbo.contacts_link_emails
ON dbo.contacts.ContactID = dbo.contacts_link_emails.ContactID
ON contacts_def_jobfunctions_3.JobID = dbo.contacts.JobTitle
SELECT COUNT(*)
FROM
contacts
INNER JOIN contacts_link_jobfunctions
ON contacts.ContactID = contacts_link_jobfunctions.ContactID
INNER JOIN contacts_link_emails
ON contacts.ContactID = contacts_link_emails.ContactID
If so keep adding each join conditon on until you don't get the same results and you will see where your mistake was. If all the joins are the same, then look at the where clauses. But I will be surprised if it isn't in the first join because the syntax you have orginally won't even work on SQL Server and it is pretty nonstandard SQL and may have been incorrect all along but no one knew.
Alternatively, pick a few of the records that are returned in the orginal but not the revised. Track them through the tables one at a time to see if you can find why the second query filters them out.
I'm not directly sure what is wrong, but when I run in to this situation, the first thing I do is start removing variables.
So, comment out the where clause. How many rows are returned?
If you get back the 11,604 rows then you've isolated the problems to the joins. Work though the joins, commenting each one out (remove the associated columns too) and figure out how many rows are eliminated.
As you do this, aim to find what is causing the desired rows to be eliminated. Once isolated, consider the join differences between the first query and the second query.
In looking at the first query, you could probably just modify that to eliminate any INs and instead do a EXISTS instead.
Consider your indexes as well. Any thing in the where or join clauses should probably be indexed.