Optimize query Teradata

I would appreciate help with a problem I have.
I have this join:
SELECT *
FROM
T1_STAGING.(first_table) AS STG
JOIN T1_STAGING.(second_table) AS B
ON
(
STG.DLOF_ID_NO=B.DLOF_ID_NO_RU
)
This simple join takes more than 20 minutes to finish. Each table holds less than 600,000K of data. I tried the following:
I collected statistics on each table.
I changed the join columns to be the PRIMARY INDEX.
I created a JOIN INDEX for the second table, but it made no difference.
The query still runs for 20+ minutes. This looks like a data distribution problem in the second table, but I cannot change the data.
Please bear in mind that joining my first_table with any other table takes only seconds.
Can you suggest something to try? I need to optimize this query for better performance.
Here is the Teradata EXPLAIN output:
Explain SEL *
FROM
T1_STAGING.DLS_DLO_OWS_STAGE_STG AS STG
JOIN T1_STAGING.DLS_ACQUISITION_STG AS B
ON
(
STG.DLOF_ID_NO=B.DLOF_ID_NO_RU
)
1) First, we lock a distinct T1_STAGING."pseudo table" for read on a
RowHash to prevent global deadlock for T1_STAGING.STG.
2) Next, we lock a distinct T1_STAGING."pseudo table" for read on a
RowHash to prevent global deadlock for T1_STAGING.B.
3) We lock T1_STAGING.STG for read, and we lock T1_STAGING.B for read.
4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from T1_STAGING.B by way of
an all-rows scan with no residual conditions split into Spool
2 (all_amps) with a condition of ("DLOF_ID_NO_RU IN (:)") to
qualify skewed rows and Spool 3 (all_amps) with a condition
of ("DLOF_ID_NO_RU IN (:)") to qualify rows matching skewed
rows of the skewed relation and Spool 4 (all_amps) with
remaining rows fanned out into 2 hash join partitions. Spool
2 is built locally on the AMPs. Then we do a SORT to order
Spool 2 by row hash. The size of Spool 2 is estimated with
high confidence to be 303 rows. Spool 3 is built locally on
the AMPs. The size of Spool 3 is estimated with high
confidence to be 4,710 rows. Spool 4 is redistributed by
hash code to all AMPs. The size of Spool 4 is estimated with
high confidence to be 97,742 rows. The estimated time for
this step is 1.27 seconds.
2) We do an all-AMPs RETRIEVE step from T1_STAGING.STG by way of
an all-rows scan with no residual conditions split into Spool
6 (all_amps) with a condition of ("DLOF_ID_NO IN (:)") to
qualify skewed rows and Spool 5 (all_amps) with a condition
of ("DLOF_ID_NO IN (:)") to qualify rows matching skewed
rows of the skewed relation and Spool 7 (all_amps) with
remaining rows fanned out into 2 hash join partitions. Spool
6 is built locally on the AMPs. The size of Spool 6 is
estimated with high confidence to be 21,587 rows. Spool 5 is
built locally on the AMPs. The size of Spool 5 is estimated
with high confidence to be 7 rows. Spool 7 is redistributed
by hash code to all AMPs. The size of Spool 7 is estimated
with high confidence to be 301,682 rows. The estimated time
for this step is 4.20 seconds.
5) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 5 (Last Use) by
way of an all-rows scan into Spool 8 (all_amps), which is
duplicated on all AMPs. Then we do a SORT to order Spool 8
by the hash code of (T1_STAGING.STG.DLOF_ID_NO). The size of
Spool 8 is estimated with high confidence to be 336 rows (
640,080 bytes). The estimated time for this step is 0.01
seconds.
2) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by
way of an all-rows scan into Spool 9 (all_amps), which is
duplicated on all AMPs. The result spool file will not be
cached in memory. The size of Spool 9 is estimated with high
confidence to be 226,080 rows (391,796,640 bytes). The
estimated time for this step is 1.05 seconds.
6) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of a
RowHash match scan, which is joined to Spool 2 (Last Use) by way
of a RowHash match scan. Spool 8 and Spool 2 are joined using a
merge join, with a join condition of ("DLOF_ID_NO = DLOF_ID_NO_RU").
The result goes into Spool 1 (group_amps), which is built locally
on the AMPs. The result spool file will not be cached in memory.
The size of Spool 1 is estimated with low confidence to be 2,121
rows (11,491,578 bytes). The estimated time for this step is 0.03
seconds.
7) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of an
all-rows scan, which is joined to Spool 9 (Last Use) by way of an
all-rows scan. Spool 6 and Spool 9 are joined using a single
partition hash join, with a join condition of ("DLOF_ID_NO =
DLOF_ID_NO_RU"). The result goes into Spool 1 (group_amps), which
is built locally on the AMPs. The result spool file will not be
cached in memory. The size of Spool 1 is estimated with low
confidence to be 9,243,161 rows (50,079,446,298 bytes). The
estimated time for this step is 0.60 seconds.
8) We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of an
all-rows scan, which is joined to Spool 7 (Last Use) by way of an
all-rows scan. Spool 4 and Spool 7 are joined using a hash join
of 2 partitions, with a join condition of ("DLOF_ID_NO =
DLOF_ID_NO_RU"). The result goes into Spool 1 (group_amps), which
is built locally on the AMPs. The result spool file will not be
cached in memory. The size of Spool 1 is estimated with low
confidence to be 731,525 rows (3,963,402,450 bytes). The
estimated time for this step is 0.96 seconds.
9) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 6.84 seconds.

600,000K you mean 600M, right? ;) That's not that little.
1. The columns used in the JOIN should be indexed. (Looks like you've done that.)
2. Add a WHERE condition to select only the values you need.
3. Limit the SELECT result. Teradata has no LIMIT clause; use TOP n or SAMPLE n instead.
And finally, use EXPLAIN SELECT ... to understand the issue.
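The EXPLAIN in the question already splits out skewed join values (Spools 2, 3, 5 and 6), which suggests that a few values of DLOF_ID_NO / DLOF_ID_NO_RU occur many times on both sides. A minimal sketch for checking this, assuming the table and column names shown in the EXPLAIN; the aliases `s`, `b`, `rows_stg`, `rows_b` and `joined_rows` are made up for illustration:

```sql
-- Count rows per join value on each side, then multiply the counts:
-- an inner join returns rows_stg * rows_b rows for every shared value.
SELECT TOP 20
       s.DLOF_ID_NO,
       s.rows_stg,
       b.rows_b,
       CAST(s.rows_stg AS BIGINT) * b.rows_b AS joined_rows
FROM  (SELECT DLOF_ID_NO, COUNT(*) AS rows_stg
       FROM   T1_STAGING.DLS_DLO_OWS_STAGE_STG
       GROUP  BY 1) AS s
JOIN  (SELECT DLOF_ID_NO_RU, COUNT(*) AS rows_b
       FROM   T1_STAGING.DLS_ACQUISITION_STG
       GROUP  BY 1) AS b
  ON   s.DLOF_ID_NO = b.DLOF_ID_NO_RU
ORDER  BY joined_rows DESC;
```

If the top few values account for most of `joined_rows`, the run time is probably dominated by the size of the result (and by returning it with SELECT *) rather than by missing indexes.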

Related

No more spool space for query in Teradata

I am using Teradata database but I am quite new to its functioning. Could you please help me with making the below query more efficient so that it does not yield 'no more spool space' error? It is getting too heavy after I add the 2nd join.
SELECT
a.src_cmpgn_code,
a.cmc_name,
SUM(b.open_cnt)
FROM access_views.dw_cmc_lkp a
LEFT JOIN prs_restricted_v.mh_crm_engmnt_sd b
ON b.cmpgn_id = a.cmc_id
LEFT JOIN access_views.dw_cmc_instnc c
ON b.cmpgn_id = c.cmc_id
WHERE 1=1
AND b.trigger_dt BETWEEN '2019-01-01' AND '2019-12-31'
AND b.site_cntry_id = 1
AND a.cmpgn_group_name IN ('a', 'b', 'c', 'd')
AND c.dlvry_vhcl_id IN (1, 10)
AND c.chnl_id = 1
GROUP BY 1,2;
Explain looks like this:
This query is optimized using type 2 profile Cost_NoSlidingJ_Profile,
profileid 10007.
1) First, we lock mdm_tables.DW_CMC_INSTNC in view
access_views.dw_cmc_instnc for access, we lock MDM_TABLES.DW_CMC_LKP
in view access_views.dw_cmc_lkp for access, and we lock
PRS_T.MH_CRM_ENGMNT_SD in view prs_restricted_v.mh_crm_engmnt_sd for
access.
2) Next, we do an all-AMPs RETRIEVE step from 365 partitions
of PRS_T.MH_CRM_ENGMNT_SD in view prs_restricted_v.mh_crm_engmnt_sd
with a condition of ("(NOT (PRS_T.MH_CRM_ENGMNT_SD in view
prs_restricted_v.mh_crm_engmnt_sd.CMPGN_ID IS NULL )) AND
(((PRS_T.MH_CRM_ENGMNT_SD in view
prs_restricted_v.mh_crm_engmnt_sd.TRIGGER_DT <= DATE '2019-12-31') AND
(PRS_T.MH_CRM_ENGMNT_SD.TRIGGER_DT >= DATE '2019-01-01')) AND
(PRS_T.MH_CRM_ENGMNT_SD in view
prs_restricted_v.mh_crm_engmnt_sd.SITE_CNTRY_ID = 1. ))") into Spool 4
(all_amps), which is redistributed by the hash code of (
PRS_T.MH_CRM_ENGMNT_SD.CMPGN_ID) to all AMPs. The size of Spool 4 is
estimated with no confidence to be 329,656,959 rows ( 7,582,110,057
bytes). The estimated time for this step is 2.40 seconds.
3) We do an all-AMPs JOIN step from MDM_TABLES.DW_CMC_LKP in view
access_views.dw_cmc_lkp by way of an all-rows scan with a condition of
("MDM_TABLES.DW_CMC_LKP in view
access_views.dw_cmc_lkp.CMPGN_GROUP_NAME IN ('Bucks_Nectar_eBayPlus',
'DailyDeal','Other','STEP_User_Agreement')"), which is joined to Spool
4 (Last Use) by way of an all-rows scan. MDM_TABLES.DW_CMC_LKP and
Spool 4 are joined using a single partition hash join, with a join
condition of ("CMPGN_ID = MDM_TABLES.DW_CMC_LKP.CMC_ID"). The result
goes into Spool 5 (all_amps) fanned out into 5 hash join partitions,
which is built locally on the AMPs. The size of Spool 5 is estimated
with no confidence to be 79,119,821 rows (10,681,175,835 bytes). The
estimated time for this step is 0.19 seconds.
4) We do an all-AMPs RETRIEVE step from mdm_tables.DW_CMC_INSTNC in view
access_views.dw_cmc_instnc by way of an all-rows scan with a condition
of ("(mdm_tables.DW_CMC_INSTNC in view
access_views.dw_cmc_instnc.DLVRY_VHCL_ID IN (1 , 10 )) AND
((mdm_tables.DW_CMC_INSTNC in view access_views.dw_cmc_instnc.CHNL_ID
= 1) AND (mdm_tables.DW_CMC_INSTNC in view
access_views.dw_cmc_instnc.TRTMNT_TYPE_CODE <> 'I'))") into Spool 6
(all_amps) fanned out into 5 hash join partitions, which is
redistributed by the hash code of ( mdm_tables.DW_CMC_INSTNC.CMC_ID)
to all AMPs. The size of Spool 6 is estimated with no confidence to be
2,874,675 rows (48,869,475 bytes). The estimated time for this step is
0.58 seconds.
5) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of an
all-rows scan, which is joined to Spool 6 (Last Use) by
way of an all-rows scan. Spool 5 and Spool 6 are joined using a hash
join of 5 partitions, with a join condition of ("(CMPGN_ID = CMC_ID)
AND (CMC_ID = CMC_ID)"). The result goes into Spool 3 (all_amps),
which is built locally on the AMPs. The size of Spool 3 is estimated
with no confidence to be 5,353,507,625 rows ( 690,602,483,625 bytes).
The estimated time for this step is 14.82 seconds.
6) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan , grouping by field1 (
MDM_TABLES.DW_CMC_LKP.SRC_CMPGN_CODE ,MDM_TABLES.DW_CMC_LKP.CMC_NAME).
Aggregate Intermediate Results are computed globally, then placed in
Spool 7. The size of Spool 7 is estimated with no confidence to be
11,774 rows (5,286,526 bytes). The estimated time for this step is
24.51 seconds.
7) We do an all-AMPs RETRIEVE step from Spool 7 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is
built locally on the AMPs. The size of Spool 1 is estimated with no
confidence to be 11,774 rows (2,837,534 bytes). The estimated time for
this step is 0.01 seconds.
8) Finally, we send out an END TRANSACTION
step to all AMPs involved in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 42.50 seconds.
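Two things stand out in this plan. The WHERE conditions on `b` and `c` discard the NULL-extended rows, so both LEFT JOINs are executed as plain joins, and step 5 estimates about 5.3 billion rows in Spool 3 before any aggregation happens. A minimal sketch of one way to shrink that intermediate result, assuming each engagement row should be counted once: aggregate `b` per campaign first and test `c` with EXISTS. The derived-table alias and the column names `open_cnt_sum` and `total_open_cnt` are made up for illustration.

```sql
SELECT a.src_cmpgn_code,
       a.cmc_name,
       SUM(b.open_cnt_sum) AS total_open_cnt
FROM   access_views.dw_cmc_lkp AS a
JOIN  (-- One row per campaign instead of one row per engagement event.
       SELECT cmpgn_id,
              SUM(open_cnt) AS open_cnt_sum
       FROM   prs_restricted_v.mh_crm_engmnt_sd
       WHERE  trigger_dt BETWEEN DATE '2019-01-01' AND DATE '2019-12-31'
       AND    site_cntry_id = 1
       GROUP  BY 1) AS b
  ON   b.cmpgn_id = a.cmc_id
WHERE  a.cmpgn_group_name IN ('a', 'b', 'c', 'd')
AND    EXISTS (-- Only checks that a matching instance exists; adds no rows.
               SELECT 1
               FROM   access_views.dw_cmc_instnc AS c
               WHERE  c.cmc_id = a.cmc_id
               AND    c.dlvry_vhcl_id IN (1, 10)
               AND    c.chnl_id = 1)
GROUP  BY 1, 2;
```

This is not a drop-in replacement: if dw_cmc_instnc holds several matching rows per cmc_id, the original query counts open_cnt once per matching row, while this version counts it once. Whether that multiplication was intended is something only the author can confirm.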

Teradata - Insert into table takes long time - problem with plan execution?

I want to insert data into the target table, but it takes too much time even though there are only around 800,000 records.
I think the problem is the execution plan, bad indexes, or statistics. Could you look at the explain plan and see whether anything looks suspicious?
Here is the plan:
This query is optimized using type 2 profile nonested_cost, profileid
10003.
This request is eligible for incremental planning and execution (IPE)
but does not meet cost thresholds. The following is the static plan
for the request.
1) First, we lock Schema1.Target_Table in TD_MAP1
for write on a reserved RowHash to prevent global deadlock.
2) Next, we lock Schema1.Target_Table in TD_MAP1
for write, we lock Schema1.Table1 in view
Schema1.Table1_View in TD_MAP1 for access, and we lock
Schema1.Table2 in view
Schema1.Table3 in TD_MAP1 for access.
3) We do an all-AMPs RETRIEVE step in TD_MAP1 from
Schema1.Table2 in view
Schema1.Table3 by way of an all-rows scan
with a condition of ("(NOT (Schema1.Table2 in
view Schema1.Table3.CUSTOMER_ID IS NULL
)) AND (Schema1.Table2 in view
Schema1.Table3.TYPE = 2.)") into Spool 3
(all_amps) (compressed columns allowed), which is redistributed by
the hash code of (
Schema1.Table2.CUSTOMER_ID) to all AMPs in
TD_Map1. The size of Spool 3 is estimated with high confidence to
be 66,211 rows (66,939,321 bytes). The estimated time for this
step is 0.80 seconds.
4) We do an all-AMPs JOIN step in TD_Map1 from Spool 3 (Last Use) by
way of a RowHash match scan, which is joined to
Schema1.Table1 in view Schema1.Table1_View by way
of a RowHash match scan with no residual conditions. Spool 3 and
Schema1.Table1 are joined using a single partition hash
join, with a join condition of ("CUSTOMER_ID =
Schema1.Table1.CUSTOMER_ID"). The result goes into
Spool 4 (all_amps) (compressed columns allowed), which is
redistributed by the hash code of (
HERE IS THE LIST OF COLUMNS FROM TABLE 1 AND TABLE 2) to all AMPs in TD_Map1.
Then we do a SORT to order Spool 4 by row hash. The size of Spool
4 is estimated with index join confidence to be 66,211 rows (
73,163,155 bytes). The estimated time for this step is 0.11
seconds.
5) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step in TD_Map1 from Spool 4 by
way of an all-rows scan into Spool 5 (all_amps) (compressed
columns allowed) fanned out into 33 hash join partitions,
which is duplicated on all AMPs in TD_Map1. The size of
Spool 5 is estimated with index join confidence to be
28,603,152 rows (31,635,086,112 bytes). The estimated time
for this step is 1.69 seconds.
2) We do an all-AMPs RETRIEVE step in TD_MAP1 from
S.TAB_M by way of an all-rows scan with a
condition of ("(NOT (S.TAB_M.EQUIPMENT_ID IS NULL ))
AND (NOT (S.TAB_M.COUNTY_ID IS NULL ))")
into Spool 6 (all_amps) (compressed columns allowed) fanned
out into 33 hash join partitions, which is built locally on
the AMPs. The size of Spool 6 is estimated with high
confidence to be 120,648,009 rows (29,679,410,214 bytes).
The estimated time for this step is 2.68 seconds.
6) We do an all-AMPs JOIN step in TD_Map1 from Spool 6 (Last Use) by
way of an all-rows scan, which is joined to Spool 5 (Last Use) by
way of an all-rows scan. Spool 6 and Spool 5 are joined using a
hash join of 33 partitions, with a join condition of ("(NOT
(SubCustomer_ID IS NULL )) AND ((NOT (ORG_ID IS NULL )) AND
((EQUIPMENT_ID = SubCustomer_ID) AND ((COUNTY_ID )= (ORG_ID
(FLOAT, FORMAT '-9.99999999999999E-999')))))"). The result goes
into Spool 7 (all_amps) (compressed columns allowed), which is
redistributed by the hash code of (
HERE IS THE LIST OF COLUMNS FROM TABLE 1 AND TABLE 2) to all AMPs in
TD_Map1. Then we do a SORT to order Spool 7 by row hash. The
size of Spool 7 is estimated with index join confidence to be
88,525 rows (28,239,475 bytes).
7) We do an all-AMPs JOIN step in TD_Map1 from Spool 7 (Last Use) by
way of a RowHash match scan, which is joined to Spool 4 (Last Use)
by way of a RowHash match scan. Spool 7 and Spool 4 are
right outer joined using a merge join, with a join condition of (
"Field_1 = Field_1"). The result goes into Spool 2 (all_amps)
(compressed columns allowed), which is built locally on the AMPs.
The size of Spool 2 is estimated with index join confidence to be
88,525 rows (114,639,875 bytes). The estimated time for this step
is 1.68 seconds.
8) We do an all-AMPs SUM step in TD_Map1 to aggregate from Spool 2
(Last Use) by way of an all-rows scan, grouping by field1 (HERE IS THE LIST OF COLUMNS FROM TABLE 1 AND TABLE 2 AND S.TAB_M). Aggregate Intermediate
Results are computed locally, then placed in Spool 11 in TD_Map1.
The size of Spool 11 is estimated with low confidence to be 66,394
rows (330,575,726 bytes). The estimated time for this step is
0.23 seconds.
9) We do an all-AMPs RETRIEVE step in TD_Map1 from Spool 11 (Last
Use) by way of an all-rows scan into Spool 1 (all_amps)
(compressed columns allowed), which is redistributed by the hash
code of ((CASE WHEN (NOT (S.TAB_M.TIME_DELIVERY IS
NULL )) THEN (S.TAB_M.TIME_DELIVERY) WHEN (NOT
(S.TAB_M.TIME_DELIVERY_EXP IS NULL )) THEN
(S.TAB_M.TIME_DELIVERY_EXP) ELSE (0) END )(INTEGER)) to
all AMPs in TD_Map1. Then we do a SORT to order Spool 1 by row
hash. The size of Spool 1 is estimated with low confidence to be
66,394 rows (85,050,714 bytes). The estimated time for this step
is 0.05 seconds.
10) We do an all-AMPs MERGE step in TD_MAP1 into
Schema1.Target_Table from Spool 1 (Last Use).
The size is estimated with low confidence to be 66,394 rows. The
estimated time for this step is 4.00 seconds.
11) We spoil the parser's dictionary cache for the table.
12) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> No rows are returned to the user as the result of statement 1.
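Step 6 of this plan compares COUNTY_ID with ORG_ID converted to FLOAT, which suggests the two join columns are declared with different data types. A minimal sketch for checking the declared types in the data dictionary; the database names 'S' and 'Schema1' are the anonymized names from the plan, and which table ORG_ID lives in is an assumption, since the plan does not say:

```sql
-- ColumnType is a short code, for example I = INTEGER, D = DECIMAL,
-- F = FLOAT, CV = VARCHAR, CF = CHAR.
SELECT DatabaseName,
       TableName,
       ColumnName,
       ColumnType,
       ColumnLength,
       DecimalTotalDigits,
       DecimalFractionalDigits
FROM   DBC.ColumnsV
WHERE  DatabaseName IN ('S', 'Schema1')
AND    ColumnName   IN ('COUNTY_ID', 'ORG_ID')
ORDER  BY 1, 2, 3;
```

If the types differ, aligning them (or casting one side explicitly in the join) may be worth testing. It would not by itself explain the run time, and the duplication of Spool 5 to all AMPs in step 5 deserves the same attention.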

My query runs for about an hour then runs out of spool space

I have searched for ways to resolve this issue, to no avail. The query below keeps running out of spool space. I have tried everything from a subquery to a derived table but still get the same error. I noticed that the error started when I added the new join between the two INPATIENT tables.
SELECT
MEM.MBR_FST_NM
,MEM.MBR_LST_NM
,DT.FULL_DT ADMIT_DATE
,DT2.FULL_DT DISCHARGE_DATE
,MAX(case when DT3.Full_DT between dt2.full_dt and
(dt2.full_dt+7) then DT3.FULL_DT else Null end) as Seven_Day_Followup_DATE
,MAX(case when DT3.Full_DT between dt2.full_dt and
(dt2.full_dt+7) then 'Y' else Null end) as Seven_Day_Followup_flag
,MAX(case when DT4.FULL_DT BETWEEN (DT2.FULL_DT+1) AND
(DT2.FULL_DT+30) THEN DT4.FULL_DT END) AS READMIT_DATE
,MAX(case when DT4.Full_DT between (DT2.FULL_DT+1) AND
(DT2.FULL_DT+30) then 'Y' Else Null End) as Readmitted_Within_30_days
FROM
UHCDM001.HP_MEMBER MEM
INNER JOIN UHCDM001.INPATIENT IP
ON MEM.MBR_SYS_ID = IP.MBR_SYS_ID
LEFT JOIN UHCDM001.HP_DATE DT
ON IP.ADMIT_DT_SYS_ID = DT.DT_SYS_ID
LEFT JOIN UHCDM001.HP_DATE DT2
ON IP.HLTH_PLN_DSCHRG_DT_SYS_ID = DT2.DT_SYS_ID
INNER JOIN PSU_TEMP TMP
ON MEM.INDV_SYS_ID=TMP.IND_SYS_ID
LEFT JOIN UHCDM001.PHYSICIAN P
ON MEM.MBR_SYS_ID=P.MBR_SYS_ID
LEFT JOIN UHCDM001.HP_DATE DT3
ON P.FST_SRVC_DT_SYS_ID=DT3.DT_SYS_ID
INNER JOIN UHCDM001.INPATIENT IP2
ON IP.MBR_SYS_ID=IP2.MBR_SYS_ID
LEFT JOIN UHCDM001.HP_DATE DT4
ON IP2.ERLY_SRVC_DT_SYS_ID = DT4.DT_SYS_ID
WHERE dt3.full_dt>= dt2.full_dt
Group by MEM.MBR_FST_NM
,MEM.MBR_LST_NM
,DT.FULL_DT
,DT2.FULL_DT
,DT3.FULL_DT
,DT4.FULL_DT
--,TMP.MBR_STATE
--,TMP.IND_SYS_ID
order by MEM.MBR_FST_NM
,MEM.MBR_LST_NM
,admit_date Asc
This query is optimized using type 2 profile insert-sel, profileid
10001.
1) First, we lock UHCTB001.FACT_PHYSICIAN in view UHCDM001.PHYSICIAN
for access, we lock UHCTB001.FACT_INPATIENT in view
UHCDM001.INPATIENT for access, we lock UHCTB001.DIM_MEMBER in view
UHCDM001.HP_MEMBER for access, and we lock UHCTB001.DIM_DATE in
view UHCDM001.HP_DATE for access.
2) Next, we execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from UHCTB001.DIM_DATE in
view UHCDM001.HP_DATE by way of an all-rows scan with no
residual conditions into Spool 4 (all_amps) (compressed
columns allowed), which is duplicated on all AMPs. The size
of Spool 4 is estimated with high confidence to be 32,399,136
rows (745,180,128 bytes). The estimated time for this step
is 0.17 seconds.
2) We do an all-AMPs RETRIEVE step from UHCTB001.FACT_INPATIENT
in view UHCDM001.INPATIENT by way of an all-rows scan with no
residual conditions into Spool 5 (all_amps) (compressed
columns allowed), which is redistributed by the hash code of
(UHCTB001.FACT_INPATIENT.MBR_SYS_ID) to all AMPs. Then we do
a SORT to order Spool 5 by row hash. The size of Spool 5 is
estimated with high confidence to be 76,386,713 rows (
1,604,120,973 bytes). The estimated time for this step is
3.09 seconds.
3) We do an all-AMPs RETRIEVE step from MWOODS27.TMP by way of
an all-rows scan with no residual conditions into Spool 6
(all_amps) (compressed columns allowed) fanned out into 6
hash join partitions, which is duplicated on all AMPs. The
size of Spool 6 is estimated with high confidence to be
47,822,040 rows (1,530,305,280 bytes). The estimated time
for this step is 0.35 seconds.
3) We do an all-AMPs JOIN step from UHCTB001.DIM_MEMBER in view
UHCDM001.HP_MEMBER by way of a RowHash match scan with no residual
conditions, which is joined to Spool 5 (Last Use) by way of a
RowHash match scan. UHCTB001.DIM_MEMBER and Spool 5 are joined
using a merge join, with a join condition of (
"UHCTB001.DIM_MEMBER.MBR_SYS_ID = MBR_SYS_ID"). The result goes
into Spool 7 (all_amps) (compressed columns allowed) fanned out
into 6 hash join partitions, which is built locally on the AMPs.
The size of Spool 7 is estimated with low confidence to be
76,386,713 rows (3,361,015,372 bytes). The estimated time for
this step is 1.43 seconds.
4) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of an
all-rows scan, which is joined to Spool 7 (Last Use) by way of an
all-rows scan. Spool 6 and Spool 7 are joined using a hash join
of 6 partitions, with a join condition of ("(INDV_SYS_ID )=
(IND_SYS_ID (FLOAT, FORMAT '-9.99999999999999E-999'))"). The
result goes into Spool 8 (all_amps) (compressed columns allowed),
which is redistributed by the hash code of (
UHCTB001.FACT_INPATIENT.ADMIT_DT_SYS_ID) to all AMPs. The size of
Spool 8 is estimated with index join confidence to be 2,318,638
rows (90,426,882 bytes). The estimated time for this step is 0.50
seconds.
5) We do an all-AMPs JOIN step from UHCTB001.DIM_DATE in view
UHCDM001.HP_DATE by way of an all-rows scan with no residual
conditions, which is joined to Spool 8 (Last Use) by way of an
all-rows scan locking UHCTB001.DIM_DATE for access.
UHCTB001.DIM_DATE and Spool 8 are right outer joined using a
single partition hash join, with a join condition of (
"ADMIT_DT_SYS_ID = UHCTB001.DIM_DATE.DT_SYS_ID"). The result goes
into Spool 9 (all_amps) (compressed columns allowed), which is
built locally on the AMPs. The size of Spool 9 is estimated with
index join confidence to be 2,318,638 rows (95,064,158 bytes).
The estimated time for this step is 0.03 seconds.
6) We do an all-AMPs RETRIEVE step from UHCTB001.FACT_PHYSICIAN in
view UHCDM001.PHYSICIAN by way of an all-rows scan with no
residual conditions into Spool 13 (all_amps) (compressed columns
allowed) fanned out into 19 hash join partitions, which is built
locally on the AMPs. The size of Spool 13 is estimated with high
confidence to be 1,049,698,588 rows (19,944,273,172 bytes). The
estimated time for this step is 8.20 seconds.
7) We do an all-AMPs JOIN step from Spool 9 (Last Use) by way of an
all-rows scan, which is joined to Spool 4 by way of an all-rows
scan. Spool 9 and Spool 4 are joined using a single partition
hash join, with a join condition of ("HLTH_PLN_DSCHRG_DT_SYS_ID =
DT_SYS_ID"). The result goes into Spool 14 (all_amps) (compressed
columns allowed) fanned out into 19 hash join partitions, which is
duplicated on all AMPs. The size of Spool 14 is estimated with
index join confidence to be 1,168,593,552 rows (50,249,522,736
bytes). The estimated time for this step is 10.45 seconds.
8) We do an all-AMPs JOIN step from Spool 13 (Last Use) by way of an
all-rows scan, which is joined to Spool 14 (Last Use) by way of an
all-rows scan. Spool 13 and Spool 14 are joined using a hash join
of 19 partitions, with a join condition of ("MBR_SYS_ID =
MBR_SYS_ID"). The result goes into Spool 16 (all_amps)
(compressed columns allowed), which is built locally on the AMPs.
The size of Spool 16 is estimated with index join confidence to be
75,438,358 rows (3,092,972,678 bytes). The estimated time for
this step is 5.05 seconds.
9) We do an all-AMPs JOIN step from Spool 4 by way of an all-rows
scan, which is joined to Spool 16 (Last Use) by way of an all-rows
scan. Spool 4 and Spool 16 are joined using a single partition
hash join, with a join condition of ("(FULL_DT >= FULL_DT) AND
(FST_SRVC_DT_SYS_ID = DT_SYS_ID)"). The result is split into
Spool 17 (all_amps) with a condition of ("MBR_SYS_ID IN (:*)") to
qualify rows matching skewed rows of the skewed relation and Spool
18 (all_amps) with remaining rows fanned out into 3 hash join
partitions. Spool 17 is built locally on the AMPs. The size of
Spool 17 is estimated with index join confidence to be 32 rows (
1,376 bytes). Spool 18 is redistributed by hash code to all AMPs.
The size of Spool 18 is estimated with index join confidence to be
75,438,326 rows (3,243,848,018 bytes). The estimated time for
this step is 2.76 seconds.
10) We do an all-AMPs JOIN step from Spool 4 (Last Use) by way of an
all-rows scan, which is joined to UHCTB001.FACT_INPATIENT in view
UHCDM001.INPATIENT by way of an all-rows scan with no residual
conditions locking UHCTB001.FACT_INPATIENT for access. Spool 4
and UHCTB001.FACT_INPATIENT are right outer joined using a dynamic
hash join, with a join condition of (
"UHCTB001.FACT_INPATIENT.ERLY_SRVC_DT_SYS_ID = DT_SYS_ID"). The
result is split into Spool 19 (all_amps) with a condition of (
"MBR_SYS_ID IN (:*)") to qualify skewed rows and Spool 21
(all_amps) with remaining rows fanned out into 3 hash join
partitions. Spool 19 is built locally on the AMPs. Then we do a
SORT to order Spool 19 by row hash. The size of Spool 19 is
estimated with low confidence to be 55,223 rows (1,159,683 bytes).
Spool 21 is redistributed by hash code to all AMPs. The size of
Spool 21 is estimated with low confidence to be 76,331,490 rows (
1,602,961,290 bytes). The estimated time for this step is 1.99
seconds.
11) We do an all-AMPs RETRIEVE step from Spool 17 (Last Use) by way of
an all-rows scan into Spool 22 (all_amps) (compressed columns
allowed), which is duplicated on all AMPs. Then we do a SORT to
order Spool 22 by the hash code of (MBR_SYS_ID). The size of
Spool 22 is estimated with index join confidence to be 16,128 rows
(693,504 bytes). The estimated time for this step is 0.00 seconds.
12) We do an all-AMPs JOIN step from Spool 22 (Last Use) by way of a
RowHash match scan, which is joined to Spool 19 (Last Use) by way
of a RowHash match scan. Spool 22 and Spool 19 are joined using a
merge join, with a join condition of ("MBR_SYS_ID = MBR_SYS_ID").
The result goes into Spool 3 (all_amps), which is built locally on
the AMPs. The size of Spool 3 is estimated with index join
confidence to be 1,767,136 rows (79,521,120 bytes). The estimated
time for this step is 0.01 seconds.
13) We do an all-AMPs JOIN step from Spool 21 (Last Use) by way of an
all-rows scan, which is joined to Spool 18 (Last Use) by way of an
all-rows scan. Spool 21 and Spool 18 are joined using a hash join
of 3 partitions, with a join condition of ("MBR_SYS_ID =
MBR_SYS_ID"). The result goes into Spool 3 (all_amps), which is
built locally on the AMPs. The size of Spool 3 is estimated with
index join confidence to be 1,842,100,578 rows (82,894,526,010
bytes). The estimated time for this step is 14.91 seconds.
14) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan, and the grouping identifier in field 1.
Aggregate Intermediate Results are computed globally, then placed
in Spool 23. The size of Spool 23 is estimated with low
confidence to be 1,382,575,097 rows (139,640,084,797 bytes). The
estimated time for this step is 2 minutes and 9 seconds.
15) We do an all-AMPs RETRIEVE step from Spool 23 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs. Then we do a SORT to order Spool 1 by the sort key
in spool field1. The size of Spool 1 is estimated with low
confidence to be 1,382,575,097 rows (158,996,136,155 bytes). The
estimated time for this step is 37.94 seconds.
16) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 3 minutes and 36 seconds.
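In this plan, step 13 estimates about 1.8 billion joined rows, and because the GROUP BY includes DT3.FULL_DT and DT4.FULL_DT, the MAX() expressions collapse almost nothing (Spool 23 is still estimated at 1.38 billion rows). A hedged sketch of one possible restructuring: reduce the physician and readmission rows to distinct member/date pairs, apply the date windows in the join conditions, and group only by the four descriptive columns. The derived-table aliases `VIS` and `ADM` are made up for illustration.

```sql
SELECT MEM.MBR_FST_NM,
       MEM.MBR_LST_NM,
       DT.FULL_DT  AS ADMIT_DATE,
       DT2.FULL_DT AS DISCHARGE_DATE,
       MAX(VIS.FULL_DT) AS Seven_Day_Followup_DATE,
       MAX(CASE WHEN VIS.FULL_DT IS NOT NULL THEN 'Y' END) AS Seven_Day_Followup_flag,
       MAX(ADM.FULL_DT) AS READMIT_DATE,
       MAX(CASE WHEN ADM.FULL_DT IS NOT NULL THEN 'Y' END) AS Readmitted_Within_30_days
FROM   UHCDM001.HP_MEMBER AS MEM
JOIN   UHCDM001.INPATIENT AS IP
  ON   MEM.MBR_SYS_ID = IP.MBR_SYS_ID
JOIN   PSU_TEMP AS TMP
  ON   MEM.INDV_SYS_ID = TMP.IND_SYS_ID
LEFT JOIN UHCDM001.HP_DATE AS DT
  ON   IP.ADMIT_DT_SYS_ID = DT.DT_SYS_ID
LEFT JOIN UHCDM001.HP_DATE AS DT2
  ON   IP.HLTH_PLN_DSCHRG_DT_SYS_ID = DT2.DT_SYS_ID
LEFT JOIN (-- Distinct physician service dates per member.
           SELECT DISTINCT P.MBR_SYS_ID, D.FULL_DT
           FROM   UHCDM001.PHYSICIAN AS P
           JOIN   UHCDM001.HP_DATE   AS D
             ON   P.FST_SRVC_DT_SYS_ID = D.DT_SYS_ID) AS VIS
  ON   VIS.MBR_SYS_ID = MEM.MBR_SYS_ID
 AND   VIS.FULL_DT BETWEEN DT2.FULL_DT AND DT2.FULL_DT + 7
LEFT JOIN (-- Distinct inpatient service dates per member.
           SELECT DISTINCT I.MBR_SYS_ID, D.FULL_DT
           FROM   UHCDM001.INPATIENT AS I
           JOIN   UHCDM001.HP_DATE   AS D
             ON   I.ERLY_SRVC_DT_SYS_ID = D.DT_SYS_ID) AS ADM
  ON   ADM.MBR_SYS_ID = IP.MBR_SYS_ID
 AND   ADM.FULL_DT BETWEEN DT2.FULL_DT + 1 AND DT2.FULL_DT + 30
GROUP  BY 1, 2, 3, 4
ORDER  BY 1, 2, 3;
```

The results differ from the original in two ways that would need checking against the intent: the output has one row per member, admit date and discharge date, and admissions with no physician visit on or after discharge are kept (with NULL follow-up columns) instead of being filtered out by the WHERE clause. The derived tables still scan the large fact tables, so restricting them to the members in PSU_TEMP may help further.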

Error 2646 Teradata : Redistribution on a badly skewed column after FULL stats

There is a snippet of product code that does a row check. It is migrated code that came into Teradata, and no one has bothered to change it to be Teradata-savvy.
This code now throws
2646 : No More spool...
The error is not really a spool shortage; it is caused by data skew, as would be evident to any Teradata Master.
The code logic is plain stupid, but it is running in Prod. A code change is NOT an option now because this is production. I could rewrite it using a simple NOT EXISTS and the query would run fine.
EXPLAIN SELECT ((COALESCE(FF.SKEW_COL,-99999))) AS Cnt1,
COUNT(*) AS Cnt
FROM DB.10_BILLON_FACT FF
WHERE FF.SKEW_COL IN(
SELECT F.SKEW_COL
FROM DB.10_BILLON_FACT F
EXCEPT
SELECT D.DIM_COL
FROM DB.Smaller_DIM D
)
GROUP BY 1
It is failing because the optimizer wants to redistribute on SKEW_COL, and nothing I do changes that. SKEW_COL is 99% skewed.
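The spool limit is enforced per AMP, so a redistribution that sends most rows to one AMP can raise error 2646 even when total spool is plentiful. A minimal sketch for quantifying the skew, using the placeholder table and column names from the query above (the aliases `amp_no` and `row_cnt` are made up for illustration):

```sql
-- Rows each AMP would receive if the fact table were
-- redistributed by the hash of SKEW_COL.
SELECT HASHAMP(HASHBUCKET(HASHROW(F.SKEW_COL))) AS amp_no,
       COUNT(*) AS row_cnt
FROM   DB.10_BILLON_FACT AS F
GROUP  BY 1
ORDER  BY 2 DESC;

-- The individual values responsible for the skew.
SELECT TOP 10
       F.SKEW_COL,
       COUNT(*) AS row_cnt
FROM   DB.10_BILLON_FACT AS F
GROUP  BY 1
ORDER  BY 2 DESC;
```

The table name is the poster's placeholder; a real object name that starts with a digit would have to be enclosed in double quotes.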
Here is the explain. It fails on step 4.1.
This query is optimized using type 2 profile insert-sel, profileid
10001.
1) First, we lock a distinct DB."pseudo table" for read on a
RowHash to prevent global deadlock for DB.F.
2) Next, we lock a distinct DB."pseudo table" for read on a
RowHash to prevent global deadlock for DB.D.
3) We lock DB.F for read, and we lock DB.D for read.
4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from DB.F by way of an
all-rows scan with no residual conditions into Spool 6
(all_amps), which is redistributed by the hash code of (
DB.F.SKEW_COL) to all AMPs. Then we
do a SORT to order Spool 6 by row hash and the sort key in
spool field1 eliminating duplicate rows. The size of Spool 6
is estimated with low confidence to be 989,301 rows (
28,689,729 bytes). The estimated time for this step is 1
minute and 36 seconds.
2) We do an all-AMPs RETRIEVE step from DB.D by way of an
all-rows scan with no residual conditions into Spool 7
(all_amps), which is built locally on the AMPs. Then we do a
SORT to order Spool 7 by the hash code of (
DB.D.DIM_COL). The size of Spool 7 is
estimated with low confidence to be 6,118,545 rows (
177,437,805 bytes). The estimated time for this step is 0.11
seconds.
5) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of an
all-rows scan, which is joined to Spool 7 (Last Use) by way of an
all-rows scan. Spool 6 and Spool 7 are joined using an exclusion
merge join, with a join condition of ("Field_1 = Field_1"). The
result goes into Spool 1 (all_amps), which is built locally on the
AMPs. The size of Spool 1 is estimated with low confidence to be
494,651 rows (14,344,879 bytes). The estimated time for this step
is 3.00 seconds.
6) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by
way of an all-rows scan into Spool 5 (all_amps), which is
redistributed by the hash code of (
DB.F.SKEW_COL) to all AMPs. Then we
do a SORT to order Spool 5 by row hash. The size of Spool 5
is estimated with low confidence to be 494,651 rows (
12,366,275 bytes). The estimated time for this step is 0.13
seconds.
2) We do an all-AMPs RETRIEVE step from DB.FF by way of an
all-rows scan with no residual conditions into Spool 8
(all_amps) fanned out into 24 hash join partitions, which is
built locally on the AMPs. The size of Spool 8 is estimated
with high confidence to be 2,603,284,805 rows (
54,668,980,905 bytes). The estimated time for this step is
24.40 seconds.
7) We do an all-AMPs RETRIEVE step from Spool 5 (Last Use) by way of
an all-rows scan into Spool 9 (all_amps) fanned out into 24 hash
join partitions, which is duplicated on all AMPs. The size of
Spool 9 is estimated with low confidence to be 249,304,104 rows (
5,235,386,184 bytes). The estimated time for this step is 1.55
seconds.
8) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an
all-rows scan, which is joined to Spool 9 (Last Use) by way of an
all-rows scan. Spool 8 and Spool 9 are joined using a inclusion
hash join of 24 partitions, with a join condition of (
"SKEW_COL = SKEW_COL"). The
result goes into Spool 4 (all_amps), which is built locally on the
AMPs. The size of Spool 4 is estimated with index join confidence
to be 1,630,304,007 rows (37,496,992,161 bytes). The estimated
time for this step is 11.92 seconds.
9) We do an all-AMPs SUM step to aggregate from Spool 4 (Last Use) by
way of an all-rows scan , grouping by field1 (
DB.FF.SKEW_COL). Aggregate Intermediate
Results are computed globally, then placed in Spool 11. The size
of Spool 11 is estimated with low confidence to be 494,651 rows (
14,344,879 bytes). The estimated time for this step is 35.00
seconds.
10) We do an all-AMPs RETRIEVE step from Spool 11 (Last Use) by way of
an all-rows scan into Spool 2 (group_amps), which is built locally
on the AMPs. The size of Spool 2 is estimated with low confidence
to be 494,651 rows (16,323,483 bytes). The estimated time for
this step is 0.01 seconds.
11) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 2 are sent back to the user as the result of
statement 1. The total estimated time is 2 minutes and 52 seconds.
There are some 900K unique values of SKEW_COL. (Interestingly, there are 6 million unique values of DIM_COL, which is why I think the plan is veering towards the fact table column. But still, it knows from the low number of unique values in the bigger table that the column is badly skewed.)
My question: knowing that SKEW_COL is 99% skewed due to a constant value like -9999, why does the optimizer still redistribute by this skewed column instead of using the alternative PRPD approach? A similar (but not identical) situation happened in the past, but it went away when we upgraded to a faster box (more AMPs).
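To make the effect of hashing on such a column concrete, here is a small Python simulation. It is only a sketch: the AMP count of 50, the MD5-based `amp_for` function (a stand-in for the real row hash), the row counts, and the skew formula `100 * (1 - avg / max)` are all assumptions for illustration, not Teradata internals.

```python
import hashlib
from collections import Counter

NUM_AMPS = 50          # assumption: hypothetical AMP count
FLAG_VALUE = -9999     # constant flag value like the one described in the post


def amp_for(value, num_amps=NUM_AMPS):
    """Map a value to an AMP with a stable hash (stand-in for the real row hash)."""
    digest = hashlib.md5(str(value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def rows_per_amp(values, num_amps=NUM_AMPS):
    """Count how many rows land on each AMP after redistribution by hash."""
    counts = Counter(amp_for(v, num_amps) for v in values)
    return [counts.get(amp, 0) for amp in range(num_amps)]


def skew_percent(per_amp):
    """Skew expressed as 100 * (1 - average / maximum) rows per AMP."""
    avg = sum(per_amp) / len(per_amp)
    return 100.0 * (1.0 - avg / max(per_amp))


# 100,000 rows: 99% carry the flag value, 1% carry distinct keys.
skewed_column = [FLAG_VALUE] * 99_000 + list(range(1_000))
unique_column = list(range(100_000))

skewed_dist = rows_per_amp(skewed_column)
unique_dist = rows_per_amp(unique_column)

print(f"unique key skew:     {skew_percent(unique_dist):.1f}%")
print(f"flag-heavy key skew: {skew_percent(skewed_dist):.1f}%")
```

Because every row with the flag value hashes to the same AMP, that one AMP receives at least 99,000 of the 100,000 rows, and on 50 simulated AMPs the skew comes out at about 98%.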
Is there anything that comes to mind that would make it change plans? I tried most diagnostics, with no result. I created an SI (on a similar VT, but it will still skew). Skewing is inevitable. (I am aware that you can artificially change the data to minimize this, but that cannot be done after the fact. We are in PROD now; everything is done.) But even when the optimizer knows the column is skewed, why redistribute on it when other options are available?
It is not the NULL value that is skewing. It is a constant flag value (probably a stand-in for NULL, like -9999, as I mentioned in the post) that causes the skew. If you rewrite the query as in my update, it works fine. I preferred NOT EXISTS because it does not need NULL checking (as a matter of practice, though from my DD knowledge I know both columns are declared NOT NULL). I have updated the post with alternative code that works (though, as I explained, I finalized on the NOT EXISTS version):
Select count(*) , f.SKEW_COL
from (
select ff.SKEW_COL
from DB.10_BILLON_FACT ff
where ff.SKEW_COL not in (
select d.DIM_COL
from DB.Smaller_DIM d )) as f
Group by f.SKEW_COL
Can I not get the optimizer's query rewrite feature to think the query through and rewrite it with the above logic? The above will not redistribute, but just sort by the skewed column.
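The remark about NULL checking can be illustrated outside Teradata. The sketch below uses Python's built-in `sqlite3` module with made-up tables `fact` and `dim` (hypothetical names and data); it shows only the logical behaviour of NOT IN versus NOT EXISTS and says nothing about which plan the Teradata optimizer picks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact (skew_col INTEGER);
    CREATE TABLE dim  (dim_col  INTEGER);
    INSERT INTO fact VALUES (-9999), (-9999), (-9999), (1), (2), (3);
    INSERT INTO dim  VALUES (1), (2);
""")

NOT_IN = """
    SELECT f.skew_col, COUNT(*)
    FROM fact f
    WHERE f.skew_col NOT IN (SELECT d.dim_col FROM dim d)
    GROUP BY f.skew_col
    ORDER BY f.skew_col
"""

NOT_EXISTS = """
    SELECT f.skew_col, COUNT(*)
    FROM fact f
    WHERE NOT EXISTS (SELECT 1 FROM dim d WHERE d.dim_col = f.skew_col)
    GROUP BY f.skew_col
    ORDER BY f.skew_col
"""

# With no NULLs in either column, both forms return the same rows.
without_null = (conn.execute(NOT_IN).fetchall(),
                conn.execute(NOT_EXISTS).fetchall())

# A single NULL in the subquery column changes NOT IN but not NOT EXISTS.
conn.execute("INSERT INTO dim VALUES (NULL)")
with_null = (conn.execute(NOT_IN).fetchall(),
             conn.execute(NOT_EXISTS).fetchall())

print("no NULLs :", without_null)
print("with NULL:", with_null)
```

When both columns are NOT NULL, as in the post, the two forms agree. Once the subquery column contains a NULL, NOT IN returns no rows at all while NOT EXISTS is unaffected, which is why NOT EXISTS is the safer habit.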

Until you can replace the SQL, adding spool may be your only option.
Make sure your stats are current, or consider a join index with an alternative PI that covers this particular query without having to do the redistribution. You may end up with a skewed JI, but if the work can be done AMP-local you may be able to address the spool issue.

teradata SQL : redistribution by join column causes 98% skew

We are on Teradata 14.
Two tables are being left outer joined (LOJ). Each table's PI has only 1.5% skew. When either of these tables is redistributed on the other's join key, the spool table has 98% skew and the query just hangs on the merge join step.
sel A.x , count ('1') from A left outer join B on A.x=B.y where B.y is NULL
This was production code (written in the past) that failed. I can rewrite it using an exclusion join,
and it works just fine. But changing it is not an option.
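For reference, the two forms are logically interchangeable. The following sketch uses Python's built-in `sqlite3` with tiny made-up tables `A` and `B` (hypothetical data, and a GROUP BY added as in the EXPLAIN further down); it checks only that the outer-join form and a NOT EXISTS form return the same counts, not how Teradata executes either one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (x INTEGER);
    CREATE TABLE B (y INTEGER);
    INSERT INTO A VALUES (-9999), (-9999), (10), (20), (30);
    INSERT INTO B VALUES (10), (20), (20);
""")

# Original style: left outer join, then keep rows with no partner in B.
outer_join_form = conn.execute("""
    SELECT A.x, COUNT('1')
    FROM A LEFT OUTER JOIN B ON A.x = B.y
    WHERE B.y IS NULL
    GROUP BY A.x
    ORDER BY A.x
""").fetchall()

# Rewritten style: NOT EXISTS, which an optimizer can run as an exclusion join.
not_exists_form = conn.execute("""
    SELECT A.x, COUNT('1')
    FROM A
    WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.y = A.x)
    GROUP BY A.x
    ORDER BY A.x
""").fetchall()

print(outer_join_form)
print(not_exists_form)
```

Both queries report the same unmatched keys with the same counts, so the rewrite changes the plan, not the answer.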
It redistributes B.y by A.x. So if I create a table in spool with B.* and A.x (A.x is the PI), there is 98% skew. There is 98% skew the other way round too: A.*, B.y (B.y is the redistributing column).
Stats are up to date.
Without changing the original query, the only other thing I could do is run a diagnostic statement before the query; it then avoids the redistribution step (and runs in < 5 s). The other thing I could consider is dropping stats on the join column (?) in the hope that the optimizer avoids redistributing one table by the other.
My question is: how can I get the optimizer to choose the alternate plan that it used when I ran the diagnostic statement?
From PDCR, the tables being joined had the same size and the query ran in the past (though it used a good amount of spool). Now it hangs. This is a small 50-AMP system.
Update to the comment:
Here is the explain without the diagnostic. It hangs on the merge join step.
Explain
--diagnostic verboseexplain on for session ;
SELECT F.keyX ,
COUNT('1')
FROM DB.TB1 F LEFT OUTER JOIN DB.TB2 D
ON F.keyX = D.keyY
where (F.keyX IS NOT NULL
AND D.keyY IS NULL )
GROUP BY F.keyX;
This query is optimized using type 2 profile insert-sel, profileid
10001.
1) First, we lock a distinct DB."pseudo table" for read on a
RowHash to prevent global deadlock for DB.D.
2) Next, we lock a distinct DB."pseudo table" for read on a
RowHash to prevent global deadlock for DB.F.
3) We lock DB.D for read, and we lock DB.F for read.
4) We do an all-AMPs RETRIEVE step from DB.F by way of an
all-rows scan with no residual conditions into Spool 4 (all_amps)
(compressed columns allowed), which is redistributed by hash code
to all AMPs with hash fields ("DB.F.keyX")
and Field1 ("DB.F.ROWID"). Then we do a SORT to order
Spool 4 by row hash. The size of Spool 4 is estimated with high
confidence to be 270,555 rows (5,952,210 bytes). Spool AsgnList:
"Field_1" = "DB.F.ROWID",
"keyX" = "DB.F.keyX".
5) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 4 by way of an
all-rows scan into Spool 5 (all_amps) (compressed columns
allowed), which is duplicated on all AMPs with hash fields (
"Spool_4.keyX") and Field1 ("Spool_4.Field_1").
The size of Spool 5 is estimated with high confidence to be
24,349,950 rows (535,698,900 bytes). Spool AsgnList:
"Field_1" = "Spool_4.Field_1",
"keyX" = "Spool_4.keyX".
The estimated time for this step is 0.42 seconds.
2) We do an all-AMPs RETRIEVE step from DB.D by way of
an all-rows scan with no residual conditions into Spool 6
(all_amps) (compressed columns allowed), which is built
locally on the AMPs with hash fields (
"DB.D.keyY"). The size of Spool 6 is
estimated with high confidence to be 307,616 rows (5,229,472
bytes). Spool AsgnList:
"keyY" = "DB.D.keyY".
The estimated time for this step is 0.02 seconds.
6) We do an all-AMPs JOIN step (Global sum) from Spool 6 (Last Use)
by way of an all-rows scan, which is joined to Spool 5 (Last Use)
by way of an all-rows scan. Spool 6 is used as the hash table and
Spool 5 is used as the probe table in a joined using a single
partition classical hash join, with a join condition of (
"Spool_5.keyX = Spool_6.keyY"). The result
goes into Spool 7 (all_amps) (compressed columns allowed), which
is redistributed by hash code to all AMPs with hash fields (
"Spool_5.keyX") and Field1 ("Spool_5.Field_1"). Then
we do a SORT to order Spool 7 by row hash. The size of Spool 7 is
estimated with low confidence to be 270,555 rows (7,034,430 bytes).
Spool AsgnList:
"Field_1" = "Spool_5.Field_1",
"keyY" = "{LeftTable}.keyY",
"Field_3" = "Spool_5.keyX".
7) We do an all-AMPs JOIN step (Global sum) from Spool 7 (Last Use)
by way of a RowHash match scan, which is joined to Spool 4 (Last
Use) by way of a RowHash match scan. Spool 7 and Spool 4 are
right outer joined using a merge join, with a join condition of (
"Spool_7.Field_1 = Spool_4.Field_1"). The result goes into Spool
8 (all_amps) (compressed columns allowed), which is built locally
on the AMPs. The size of Spool 8 is estimated with low confidence
to be 270,555 rows (5,681,655 bytes). Spool AsgnList:
"keyY" = "{LeftTable}.keyY",
"keyX" = "{RightTable}.keyX".
The estimated time for this step is 0.49 seconds.
8) We do an all-AMPs RETRIEVE step from Spool 8 (Last Use) by way of
an all-rows scan with a condition of ("Spool_8.keyY IS
NULL") into Spool 3 (all_amps) (compressed columns allowed), which
is built locally on the AMPs with Field1 ("25614"). The size of
Spool 3 is estimated with low confidence to be 270,555 rows (
5,140,545 bytes). Spool AsgnList:
"Field_1" = "25614",
"Spool_3.keyX" = "{ Copy }keyX".
The estimated time for this step is 0.01 seconds.
9) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan, and the grouping identifier in field 1.
Aggregate Intermediate Results are computed globally, then placed
in Spool 11. The size of Spool 11 is estimated with low
confidence to be 296 rows (6,216 bytes). The estimated time for
this step is 0.04 seconds.
10) We do an all-AMPs RETRIEVE step from Spool 11 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs with Field1 ("UniqueId"). The size of Spool 1 is
estimated with low confidence to be 296 rows (8,584 bytes). Spool
AsgnList:
"Field_1" = "UniqueId",
"Field_2" = "Field_2 ,Field_3 (INTEGER),".
The estimated time for this step is 0.01 seconds.
11) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1.
This is the new explain after the diagnostic is run.
It runs in 5 seconds.
EXPLAIN SELECT F.keyY,
COUNT('1')
FROM DB.TB1 F LEFT OUTER JOIN DB.TB2 D
ON F.keyY = D.keyY
where (
D.keyY IS NULL )
GROUP BY F.keyY;
This query is optimized using type 2 profile insert-sel, profileid
10001.
1) First, we lock a distinct DB."pseudo table" for read on a
RowHash to prevent global deadlock for DB.D.
2) Next, we lock a distinct DB."pseudo table" for read on a
RowHash to prevent global deadlock for DB.F.
3) We lock DB.D for read, and we lock DB.F for read.
4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from DB.D by way of
an all-rows scan with no residual conditions split into Spool
4 (all_amps) with a condition of ("keyY IN (:*)")
to qualify rows matching skewed rows of the skewed relation
(compressed columns allowed) and Spool 5 (all_amps) with
remaining rows (compressed columns allowed). Spool 4 is
duplicated on all AMPs with hash fields (
"DB.D.keyY"). Then we do a SORT to order
Spool 4 by row hash. The size of Spool 4 is estimated with
high confidence to be 180 rows. Spool 5 is built locally on
the AMPs with hash fields ("DB.D.keyY").
The size of Spool 5 is estimated with high confidence to be
307,614 rows. Spool 4 AsgnList:
"keyY" = "DB.D.keyY".
Spool 5 AsgnList:
"keyY" = "DB.D.keyY".
The estimated time for this step is 0.02 seconds.
2) We do an all-AMPs RETRIEVE step from DB.F by way of
an all-rows scan with no residual conditions split into Spool
6 (all_amps) with a condition of ("keyY IN (:*)")
to qualify skewed rows (compressed columns allowed) and Spool
7 (all_amps) with remaining rows (compressed columns allowed).
Spool 6 is built locally on the AMPs with hash fields (
"DB.F.keyY"). Then we do a SORT to
order Spool 6 by row hash. The size of Spool 6 is estimated
with high confidence to be 231,040 rows. Spool 7 is
redistributed by hash code to all AMPs with hash fields (
"DB.F.keyY"). Then we do a SORT to
order Spool 7 by row hash. The size of Spool 7 is estimated
with high confidence to be 39,515 rows. Spool 6 AsgnList:
"keyY" = "DB.F.keyY".
Spool 7 AsgnList:
"keyY" = "DB.F.keyY".
The estimated time for this step is 0.32 seconds.
5) We do an all-AMPs JOIN step (Global sum) from Spool 4 (Last Use)
by way of a RowHash match scan, which is joined to Spool 6 (Last
Use) by way of a RowHash match scan. Spool 4 and Spool 6 are
right outer joined using a merge join, with a join condition of (
"Spool_6.keyY = Spool_4.keyY"). The result
goes into Spool 8 (all_amps), which is built locally on the AMPs.
The size of Spool 8 is estimated with low confidence to be 231,040
rows (4,851,840 bytes). Spool AsgnList:
"keyY" = "{LeftTable}.keyY",
"keyY" = "{RightTable}.keyY".
The estimated time for this step is 0.01 seconds.
6) We do an all-AMPs JOIN step (Local sum) from Spool 5 (Last Use) by
way of a RowHash match scan, which is joined to Spool 7 (Last Use)
by way of a RowHash match scan. Spool 5 and Spool 7 are
right outer joined using a merge join, with a join condition of (
"Spool_7.keyY = Spool_5.keyY"). The result
goes into Spool 8 (all_amps), which is built locally on the AMPs.
The size of Spool 8 is estimated with low confidence to be 39,515
rows (829,815 bytes). Spool AsgnList:
"keyY" = "{LeftTable}.keyY",
"keyY" = "{RightTable}.keyY".
The estimated time for this step is 0.02 seconds.
7) We do an all-AMPs RETRIEVE step from Spool 8 (Last Use) by way of
an all-rows scan with a condition of ("Spool_8.keyY IS
NULL") into Spool 3 (all_amps) (compressed columns allowed), which
is built locally on the AMPs with Field1 ("26086"). The size of
Spool 3 is estimated with low confidence to be 270,555 rows (
5,140,545 bytes). Spool AsgnList:
"Field_1" = "26086",
"Spool_3.keyY" = "{ Copy }keyY".
The estimated time for this step is 0.01 seconds.
8) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan, and the grouping identifier in field 1.
Aggregate Intermediate Results are computed globally, then placed
in Spool 13. The size of Spool 13 is estimated with low
confidence to be 296 rows (6,216 bytes). The estimated time for
this step is 0.04 seconds.
9) We do an all-AMPs RETRIEVE step from Spool 13 (Last Use) by way of
an all-rows scan into Spool 1 (group_amps), which is built locally
on the AMPs with Field1 ("UniqueId"). The size of Spool 1 is
estimated with low confidence to be 296 rows (8,584 bytes). Spool
AsgnList:
"Field_1" = "UniqueId",
"Field_2" = "keyY ,Field_3 (INTEGER),".
The estimated time for this step is 0.01 seconds.
10) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 0.41 seconds.
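The plan above splits each table in two: skewed rows of F stay local, the remaining F rows are redistributed, the few D rows matching the skewed values are duplicated on all AMPs, and the remaining D rows stay local. A rough Python simulation of that idea (partial redistribution, partial duplication) follows. Everything in it is an assumption for illustration: 50 AMPs, an MD5-based stand-in for the row hash, made-up row counts, and a hypothetical set `SKEWED_KEYS`.

```python
import hashlib
from collections import Counter

NUM_AMPS = 50                 # assumption: hypothetical AMP count
SKEWED_KEYS = {-9999, -1}     # assumption: values flagged as skewed


def amp_for(value):
    """Stable hash to an AMP number (stand-in for the real row hash)."""
    digest = hashlib.md5(str(value).encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS


def plain_redistribution(f_rows):
    """Every F row moves to the AMP chosen by the hash of its join key."""
    load = Counter(amp_for(key) for _, key in f_rows)
    return [load.get(amp, 0) for amp in range(NUM_AMPS)]


def prpd(f_rows, d_keys):
    """Skewed F rows stay on their home AMP, other F rows are redistributed,
    and D rows matching the skewed keys are duplicated to every AMP."""
    f_on_amp = [[] for _ in range(NUM_AMPS)]
    d_on_amp = [set() for _ in range(NUM_AMPS)]
    for home_amp, key in f_rows:
        target = home_amp if key in SKEWED_KEYS else amp_for(key)
        f_on_amp[target].append(key)
    for key in d_keys:
        if key in SKEWED_KEYS:
            for amp_keys in d_on_amp:
                amp_keys.add(key)
        else:
            d_on_amp[amp_for(key)].add(key)
    return f_on_amp, d_on_amp


def unmatched_counts(f_on_amp, d_on_amp):
    """AMP-local anti-join: count F keys that have no partner in D."""
    result = Counter()
    for f_keys_here, d_keys_here in zip(f_on_amp, d_on_amp):
        result.update(k for k in f_keys_here if k not in d_keys_here)
    return result


# F: 85% of the rows carry one of two flag values; home AMPs are evenly spread.
f_keys = [-9999] * 6_000 + [-1] * 2_500 + list(range(1_500))
f_rows = [(i % NUM_AMPS, key) for i, key in enumerate(f_keys)]
d_keys = [-1] + list(range(0, 3_000, 2))   # D holds -1 and the even keys

plain_load = plain_redistribution(f_rows)
f_on_amp, d_on_amp = prpd(f_rows, d_keys)
prpd_load = [len(rows) for rows in f_on_amp]

d_key_set = set(d_keys)
expected = Counter(k for k in f_keys if k not in d_key_set)
actual = unmatched_counts(f_on_amp, d_on_amp)

print("busiest AMP, plain redistribution:", max(plain_load))
print("busiest AMP, split plan:          ", max(prpd_load))
print("same anti-join result:            ", actual == expected)
```

In the simulation the busiest AMP holds thousands of rows under plain redistribution but only a few hundred under the split plan, and the AMP-local anti-join still returns the same counts. That matches the shape of the two EXPLAINs, where the only change is how the rows are placed before the merge join.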

Optimize query Teradata - optimization

By 600,000K you mean 600M, right? ;) That's not that little.
1. The columns used in the JOIN should be indexed. (Looks like you've done that.)
2. Add a WHERE condition to select only the values you need.
3. Limit the SELECT result. (Teradata has no LIMIT clause; TOP n is its equivalent.)
And finally, use EXPLAIN SELECT ... to understand the issue.
