Ignite caching/querying taking too much time

Ignite caching/querying taking too much time - ignite

I m trying to adopt ignite to resolve one of my needs by implementing an in memory data grid
Right know I m using a 3rd persistence read/write through mechanism to fetch data from my oracle db, in my topology I m using a single node in my activated cluster that are hosted in a VM having 8G ram and 120 G hdd.
my node is using local storage with a on-heap memory of 2G and 50G off-heap memory, all this with eviction and swapping enabled
DataStorageConfiguration dataStorageCfg = new DataStorageConfiguration();
DataRegionConfiguration dataRegionCfg = new DataRegionConfiguration();
// 2G initial size (RAM).
dataRegionCfg.setInitialSize(2L * 1024 * 1024 * 1024);
// 40 GB max size (RAM).
dataRegionCfg.setMaxSize(40L * 1024 * 1024 * 1024);
// Enabling RANDOM_LRU eviction for this region.
dataRegionCfg.setPageEvictionMode(DataPageEvictionMode.RANDOM_LRU);
//dataRegionCfg.setPersistenceEnabled(true);
final String swapPath ="/opt/ignite/swap";
dataRegionCfg.setSwapPath(swapPath);
dataStorageCfg.setDefaultDataRegionConfiguration(dataRegionCfg);
cfg.setDataStorageConfiguration(dataStorageCfg);
Caching this on my machine is taking too much time
my swap folder is about 13G when caching comes to end
concerning my SQL query there is no response
the same query in my tools takes 1min45s to respond but using ignite cache query method doesn't respond and doesn't trow any kind of error or exception
SqlFieldsQuery sqlQuery;
FieldsQueryCursor<List<?>> queryCursor;
Iterator<List<?>> resultIt;
System.out.println(">>> All caches loaded! in : " + total + " ms");
System.out.println("---------------------------------------------- ");
System.out.println("---------------------------------------------- ");
System.out.println("---------------------------------------------- ");
System.out.println("\\n \\n \\n ");
System.out.println("---------------------------------------------- ");
System.out.println("---------------------------------------------- ");
System.out.println("---------------------------------------------- ");
System.out.println("Checking join query POC first run");
start = System.currentTimeMillis();
sqlQuery = new SqlFieldsQuery(sql);
queryCursor = ignite.cache("MInoutlineCache").query(sqlQuery);
System.out.println("query result size is : "+queryCursor.getAll().size());
end = System.currentTimeMillis() - start;
total += end;
M I well using ignite? is using ignite in a one node fashion useful, or I was meant to build clusters with lot of nodes in a partitioned strategy ?
the number of lines cached is 10 million line
is there another way to achieve a good in memory data-grid in my context using the 3rd persistence strategy.
There is lot of question in this question I m sorry
NB: I m using the grid gain console to generate my configuration
I m also updating my caching schema names as public to execute query directly
Here is the query
SELECT bp.name,
CF.documentno,
CF.MOVEMENTDATE,
CF.m_product_id AS M_PRODUCT_ID,
CF.product,
CF.xx_lignegratuite,
CF.m_attributesetinstance_id,
CASE
WHEN cf.isreturntrx='Y'
THEN - CF.qtyentered
ELSE CF.qtyentered
END AS qtyentered,
CF.discount,
CF.DOCSTATUS,
CF.ISRETURNTRX,
CF.XX_REWARDAMT,
CF.OperID,
CF.clientId,
CASE
WHEN cf.xx_lignegratuite='N'
THEN
CASE
WHEN cf.isreturntrx='Y'
THEN -cf.prixVente*cf.qtyentered* (1-(cf.discount/100))
ELSE cf.prixVente*cf.qtyentered* (1-(cf.discount/100))
END
ELSE 0
END AS totalline,
CASE
WHEN cf.XX_StartegicalProduct='Y'
THEN (
CASE
WHEN cf.xx_lignegratuite='N'
THEN
CASE
WHEN cf.isreturntrx='Y'
THEN -cf.prixVente*cf.qtyentered* (1-(cf.discount/100))
ELSE cf.prixVente*cf.qtyentered* (1-(cf.discount/100))
END
ELSE 0
END)
ELSE 0
END AS totallineStar,
CASE
WHEN cf.xx_lignegratuite='N'
THEN
CASE
WHEN cf.typevente='W'
THEN
CASE
WHEN cf.isreturntrx='Y'
THEN -(cf.XX_REWARDAMT/nb_doc)
ELSE cf.XX_REWARDAMT/nb_doc
END
ELSE 0
END
ELSE 0
END AS totalreward,
CF.XX_StartegicalProduct,
CF.SALESREP_ID,
CF.C_DOCTYPE_ID,
CF.AD_ORG_ID,
CF.ad_orgtrx_id,
CF.xx_laboratory_id,
bp.c_bpartner_id,
CF.nb_doc,
CF.rate,
CF.poste_id,
CF.SalesRepTier_poste_id,
CF.recSupr,
CF.recSupr_poste_id,
(SELECT Objectif_CA_oper
FROM c_bpartner
WHERE issalesrep ='Y'
AND isemployee ='Y'
AND c_bpartner_id=CF.SalesRepTier
) AS ObjectifOp,
(SELECT Objectif_CA_oper
FROM c_bpartner
WHERE issalesrep ='Y'
AND isemployee ='Y'
AND c_bpartner_id=CF.SalesRepTier_poste_id
) AS ObjectifOp_poste_id,
CASE
WHEN cf.ISRETURNTRX='Y'
THEN -cf.QTYENTERED*prixRevient
ELSE cf.QTYENTERED*prixRevient
END AS consomation
FROM
(SELECT i.documentno,
i.MOVEMENTDATE,
p.m_product_id,
p.name AS product,
ol.xx_lignegratuite,
il.m_attributesetinstance_id,
il.qtyentered,
ol.discount,
i.DOCSTATUS,
i.isreturntrx,
ol.XX_REWARDAMT,
i.C_BPartner_ID AS clientId,
(SELECT u.C_BPARTNER_ID FROM AD_User u WHERE u.AD_User_ID = i.SALESREP_ID
) AS OperID,
(SELECT ai.Valuenumber
FROM M_AttributeInstance ai
INNER JOIN M_Attribute a
ON (ai.M_Attribute_ID =a.M_Attribute_ID
AND a.IsInstanceAttribute ='Y')
WHERE ai.M_AttributeSetInstance_ID=il.m_attributesetinstance_id
AND a.Name ='Prix Vente'
) AS prixVente,
(SELECT ai.Valuenumber
FROM M_AttributeInstance ai
INNER JOIN M_Attribute a
ON (ai.M_Attribute_ID =a.M_Attribute_ID
AND a.IsInstanceAttribute ='Y')
WHERE ai.M_AttributeSetInstance_ID=il.m_attributesetinstance_id
AND a.Name ='Prix Revient'
) AS prixRevient,
(SELECT ai.Valuenumber
FROM M_AttributeInstance ai
INNER JOIN M_Attribute a
ON (ai.M_Attribute_ID =a.M_Attribute_ID
AND a.IsInstanceAttribute ='Y')
WHERE ai.M_AttributeSetInstance_ID=il.m_attributesetinstance_id
AND a.Name ='Fournisseur'
) AS Fournisseur,
XX_StartegicalProduct,
i.SALESREP_ID,
i.C_DOCTYPE_ID,
i.AD_ORG_ID,
(SELECT o.AD_ORGTRX_ID
FROM c_order o
WHERE i.c_order_id=o.c_order_id
) AS ad_orgtrx_id,
p.xx_laboratory_id,
lt.rate,
--COUNT(*) over (partition BY il.c_orderline_id) AS nb_doc,
(
SELECT COUNT(*)
FROM m_inoutline ill
WHERE ill.c_orderline_id=il.c_orderline_id
) AS nb_doc,
ol.type AS typevente,
bpl.salesrep_id AS poste_id,
(SELECT u.c_bpartner_id FROM AD_User u WHERE u.AD_User_ID = i.salesrep_id
) AS SalesRepTier,
(SELECT u.c_bpartner_id FROM AD_User u WHERE u.AD_User_ID = bpl.salesrep_id
) AS SalesRepTier_poste_id,
(SELECT u.XX_RecSupervisor_ID FROM AD_User u WHERE u.AD_User_ID=i.SALESREP_ID
) AS recSupr,
(SELECT u.XX_RecSupervisor_ID
FROM AD_User u
WHERE u.AD_User_ID=bpl.salesrep_id
) AS recSupr_poste_id
FROM m_inoutline il
INNER JOIN m_inout i
ON il.m_inout_id=i.m_inout_id
INNER JOIN c_orderline ol
ON ol.c_orderline_id=il.c_orderline_id
INNER JOIN m_product p
ON p.m_product_id=il.m_product_id
INNER JOIN C_Bpartner bpl
ON (bpl.c_bpartner_id=i.c_bpartner_id)
LEFT OUTER JOIN xx_listetauxvaleur lt
ON p.xx_listetauxvaleur_id = lt.xx_listetauxvaleur_id
WHERE i.issotrx ='Y'
--AND p.m_attributeset_id IS NOT NULL
AND il.movementqty<>0
) CF
LEFT OUTER JOIN c_bpartner bp
ON (CF.Fournisseur=bp.c_bpartner_id)
ORDER BY bp.name,
documentno;

I see a lot of joins in your statement. Are you sure that you have indexes on all the fields present in those joins? I think it would take a lot of optimization until this one will perform.
I also recommend getting rid of swap since this is an outdated feature and its performance implications are not known.
If you expect to outperform Oracle on a single node when doing multiple joins, this is also not going to happen.

Consider general performance tips that should answer most of your questions and issues:
https://www.gridgain.com/docs/latest/perf-troubleshooting-guide/general-perf-tips
To achieve maximum performance you need to keep an entire data set in RAM. If there is a lack of resources then consider Ignite native persistence instead of the swap storage.

Related

Select from view takes too long

I have a query against a table that contains like 2 million rows using linked server.
Select * from OPENQUERY(LinkedServerName,
'SELECT
PV.col1
,PV.col2
,PV.col3
,VTR.col1
,CTR.col1
,PSR.col1
FROM
LinkedDbName.dbo.tbl1 PV
INNER JOIN LinkedDbName.dbo.tbl2 VTR
ON PV.col_id = VTR.col_id
INNER JOIN LinkedDbName.dbo.tbl3 CTR
ON PV.col_id = CTR.col_id
INNER JOIN LinkedDbName.dbo.tbl4 PSR
ON PV.col_id = PSR.col_id
WHERE
PV.col_id = ''80C53C9B-6272-11DA-BB34-000E0C7F3ED2''')
That query results into 365 rows and is executed within 0 second.
However when I make that query into a view it runs for about minimum of 20 seconds and sometimes it reaches to 40 seconds tops.
Here's my create view script
CREATE VIEW [dbo].[myview]
AS
Select * from OPENQUERY(LinkedServerName,
'SELECT
PV.col1
,PV.col2
,PV.col3
,VTR.col1
,CTR.col1
,PSR.col1
FROM
LinkedDbName.dbo.tbl1 PV
INNER JOIN LinkedDbName.dbo.tbl2 VTR
ON PV.col_id = VTR.col_id
INNER JOIN LinkedDbName.dbo.tbl3 CTR
ON PV.col_id = CTR.col_id
INNER JOIN LinkedDbName.dbo.tbl4 PSR
ON PV.col_id = PSR.col_id')
then
Select * from myview where PV.col_id = '80C53C9B-6272-11DA-BB34-000E0C7F3ED2'
Any idea ? Thanks !

Your queries are quite different. In the first, the where clause is part of the SQL statement passed to OPENQUERY(). This has two important effects:
The amount of data returned is much smaller, only being the rows that match the condition.
The query can be optimized with the WHERE clause.
If you need to share the table, I might suggest that you make a copy on the local server -- either using replication or scheduling a job to copy it over.

Select SQL View Slow with table alias

I am baffled as to why selecting my SQL View is so slow when using a table alias (25 seconds) but runs so much faster when the alias is removed (2 seconds)
-this query takes 25 seconds.
SELECT [Extent1].[Id] AS [Id],
[Extent1].[ProjectId] AS [ProjectId],
[Extent1].[ProjectWorkOrderId] AS [ProjectWorkOrderId],
[Extent1].[Project] AS [Project],
[Extent1].[SubcontractorId] AS [SubcontractorId],
[Extent1].[Subcontractor] AS [Subcontractor],
[Extent1].[ValuationNumber] AS [ValuationNumber],
[Extent1].[WorksOrderName] AS [WorksOrderName],
[Extent1].[NewGross],
[Extent1].[CumulativeGross],
[Extent1].[CreateByName] AS [CreateByName],
[Extent1].[CreateDate] AS [CreateDate],
[Extent1].[FinalDateForPayment] AS [FinalDateForPayment],
[Extent1].[CreateByEmail] AS [CreateByEmail],
[Extent1].[Deleted] AS [Deleted],
[Extent1].[ValuationStatusCategoryId] AS [ValuationStatusCategoryId]
FROM [dbo].[ValuationsTotal] AS [Extent1]
-this query takes 2 seconds.
SELECT [Id],
[ProjectId],
[Project],
[SubcontractorId],
[Subcontractor],
[NewGross],
[ProjectWorkOrderId],
[ValuationNumber],
[WorksOrderName],
[CreateByName],
[CreateDate],
[CreateByEmail],
[Deleted],
[ValuationStatusCategoryId],
[FinalDateForPayment],
[CumulativeGross]
FROM [dbo].[ValuationsTotal]
this is my SQL View code -
WITH ValuationTotalsTemp(Id, ProjectId, Project, SubcontractorId, Subcontractor, WorksOrderName, NewGross, ProjectWorkOrderId, ValuationNumber, CreateByName, CreateDate, CreateByEmail, Deleted, ValuationStatusCategoryId, FinalDateForPayment)
AS (SELECT vi.ValuationId AS Id,
v.ProjectId,
p.NAME,
b.Id AS Expr1,
b.NAME AS Expr2,
wo.OrderNumber,
SUM(vi.ValuationQuantity * pbc.BudgetRate) AS 'NewGross',
sa.ProjectWorkOrderId,
v.ValuationNumber,
up.FirstName + ' ' + up.LastName AS Expr3,
v.CreateDate,
up.Email,
v.Deleted,
v.ValuationStatusCategoryId,
sa.FinalDateForPayment
FROM dbo.ValuationItems AS vi
INNER JOIN dbo.ProjectBudgetCosts AS pbc
ON vi.ProjectBudgetCostId = pbc.Id
INNER JOIN dbo.Valuations AS v
ON vi.ValuationId = v.Id
INNER JOIN dbo.ProjectSubcontractorApplications AS sa
ON sa.Id = v.ProjectSubcontractorApplicationId
INNER JOIN dbo.Projects AS p
ON p.Id = v.ProjectId
INNER JOIN dbo.ProjectWorkOrders AS wo
ON wo.Id = sa.ProjectWorkOrderId
INNER JOIN dbo.ProjectSubcontractors AS sub
ON sub.Id = wo.ProjectSubcontractorId
INNER JOIN dbo.Businesses AS b
ON b.Id = sub.BusinessId
INNER JOIN dbo.UserProfile AS up
ON up.Id = v.CreateBy
WHERE ( vi.Deleted = 0 )
AND ( v.Deleted = 0 )
GROUP BY vi.ValuationId,
v.ProjectId,
p.NAME,
b.Id,
b.NAME,
wo.OrderNumber,
sa.ProjectWorkOrderId,
v.ValuationNumber,
up.FirstName + ' ' + up.LastName,
v.CreateDate,
up.Email,
v.Deleted,
v.ValuationStatusCategoryId,
sa.FinalDateForPayment)
SELECT Id,
ProjectId,
Project,
SubcontractorId,
Subcontractor,
NewGross,
ProjectWorkOrderId,
ValuationNumber,
WorksOrderName,
CreateByName,
CreateDate,
CreateByEmail,
Deleted,
ValuationStatusCategoryId,
FinalDateForPayment,
(SELECT SUM(NewGross) AS Expr1
FROM ValuationTotalsTemp AS tt
WHERE ( ProjectWorkOrderId = t.ProjectWorkOrderId )
AND ( t.ValuationNumber >= ValuationNumber )
GROUP BY ProjectWorkOrderId) AS CumulativeGross
FROM ValuationTotalsTemp AS t
Any ideas why this is?
The SQL query runs with table alias as this is generated from Entity Framework so I have no way of changing this. I will need to modify my SQL view to be able to handle the table alias without affecting performance.

The execution plans are very different.
The slow one has a part that leaps out as being problematic. It estimates a single row will be input to a nested loops join and result in a single scan of ValuationItems. In practice it ends up performing more than 1,000 such scans.
Estimated
Actual
SQL Server 2014 introduced a new cardinality estimator. Your fast plan is using it. This is shown in the XML as CardinalityEstimationModelVersion="120" Your slow plan isn't (CardinalityEstimationModelVersion="70").
So it looks as though in this case the assumptions used by the new estimator give you a better plan.
The reason for the difference is probably as the fast one is running cross database (references [ProbeProduction].[dbo].[ValuationsTotal]) and presumably the database you are executing it from has compatility level of 2014 so automatically gets the new CardinalityEstimator.
The slow one is executing in the context of ProbeProduction itself and I assume the compatibility level of that database must be < 2014 - so you are defaulting to the legacy cardinality estimator.
You can use OPTION (QUERYTRACEON 2312) to get the slow query to use the new cardinality estimator (changing the database compatibility mode to globally alter the behaviour shouldn't be done without careful testing of existing queries as it can cause regressions as well as improvements).
Alternatively you could just try and tune the query working within the limits of the legacy CE. Perhaps adding join hints to encourage it to use something more akin to the faster plan.

The two queries are different (column order!). It is reasonable to assume the first query uses an index and is therefore much faster. I doubt it has anything to do with the aliassing.

For grins would take out the where and give this a try?
I might be doing a bunch of loop joins and filtering at the end
This might get it to filter up front
FROM dbo.ValuationItems AS vi
INNER JOIN dbo.Valuations AS v
ON vi.ValuationId = v.Id
AND vi.Deleted = 0
AND v.Deleted = 0
-- other joins
-- NO where
If you have a lot of loop joins going on then try inner hash join (on all)

SQL shorten responds time using parallel or with clause

I have a query that takes 20 min to executed... I remember in one project we used /*+ PARALLEL(T,8) / or we would use the with clause and /+ materialize */ and it would make the query responds time really fast withing seconds. How can I do this to this query?
select count(*) from (
select hdr.ACCESS_IND,
hdr.SID,
hdr.CLLI,
hdr.DA,
hdr.TAPER_CODE,
hdr.CFG_TYPE as CFG_TYPE,
hdr.IP_ADDR,
hdr.IOS_VERSION,
hdr.ADMIN_STATE,
hdr.WIRE_CENTER,
substr(hdr.SID_IO_PRI, 1, 8) PRI_IO_CLLI,
substr(hdr.SID_IO_SEC, 1, 8) SEC_IO_CLLI,
hdr.VHO_CLLI ,
hdr.CFG_TYPE ,
-- dtl.MULTIPURPOSE_IND,
lkup.code3 as shelf_type
from RPT_7330_HDR hdr
INNER JOIN RPT_7330_DTL dtl on hdr.EID = dtl.EID
INNER JOIN CODE_LKUP2 lkup ON LKUP.CODE1 = hdr.ACCESS_IND
where LKUP.CATEGORY='ACCESS_MAPPING' and hdr.DT_MODIFIED = (select DT_MODIFIED
from LS_DT_MODIFIED
where NAME = 'RPT_7330_HDR')) n;

Try this, might be faster:
select count(*)
from RPT_7330_HDR hdr
JOIN LS_DT_MODIFIED LS ON LS.NAME = 'RPT_7330_HDR' AND hdr.DT_MODIFIED = LS.DT_MODIFIED
JOIN RPT_7330_DTL dtl on hdr.EID = dtl.EID
JOIN CODE_LKUP2 lkup ON LKUP.CODE1 = hdr.ACCESS_IND AND LKUP.CATEGORY='ACCESS_MAPPING'
The SQL engine can optimize JOINS to be parallel if you have the right indexes and such. It is often able to optimize joins when it can't optimize sub-queries.

If you want data then
SELECT /*+ PARALLEL(DTL,4) */
HDR.ACCESS_IND,
HDR.SID,
HDR.CLLI,
HDR.DA,
HDR.TAPER_CODE,
HDR.CFG_TYPE AS CFG_TYPE,
HDR.IP_ADDR,
HDR.IOS_VERSION,
HDR.ADMIN_STATE,
HDR.WIRE_CENTER,
SUBSTR ( HDR.SID_IO_PRI, 1, 8 ) PRI_IO_CLLI,
SUBSTR ( HDR.SID_IO_SEC, 1, 8 ) SEC_IO_CLLI,
HDR.VHO_CLLI,
HDR.CFG_TYPE,
LKUP.CODE3 AS SHELF_TYPE
FROM
RPT_7330_HDR HDR INNER JOIN RPT_7330_DTL DTL ON HDR.EID = DTL.EID
INNER JOIN CODE_LKUP2 LKUP ON LKUP.CODE1 = HDR.ACCESS_IND
INNER JOIN LS_DT_MODIFIED ON HDR.DT_MODIFIED = DT_MODIFIED
WHERE
LKUP.CATEGORY = 'ACCESS_MAPPING'
AND NAME = 'RPT_7330_HDR';
If you want count then
SELECT /*+ PARALLEL(DTL,4) */
COUNT (*)
FROM
RPT_7330_HDR HDR INNER JOIN RPT_7330_DTL DTL ON HDR.EID = DTL.EID
INNER JOIN CODE_LKUP2 LKUP ON LKUP.CODE1 = HDR.ACCESS_IND
INNER JOIN LS_DT_MODIFIED ON HDR.DT_MODIFIED = DT_MODIFIED
WHERE
LKUP.CATEGORY = 'ACCESS_MAPPING'
AND NAME = 'RPT_7330_HDR';
Note: Hint used on DTL table which is taking more cost for FTS. Number 4 means query fired on 8 CPU's parallely. Identify your pain points from query plan and decide on your hints for parallel or any other. You can also use parallel hint on multiple tables /*+ PARALLEL(table1 4) PARALLEL(table2 4) PARALLEL(table3 4) PARALLEL(table4 4)*/ . Also This works only on Enterprise edition and not on standard edition.

SELECT /*+ PARALLEL */ ...
You do not want to use a magic number and you do not want to list any objects.
A degree of parallelism of 8 is probably too high on your laptop, yet too low on your production server. If the query simply specifies PARALLEL, Oracle will automatically determine the DOP if it's configured for automatic parallelism. Or it will default to the number of CPUs * threads per CPU * number of instances. Judging by all the manuals, features, and whitepapers, Oracle has put a lot of thought into the degree of parallelism. If you're not going to benchmark it you should trust the defaults.
In 11gR2, when you don't list any objects you use statement level parallelism instead of object level parallelism. There's not need to try to determine exactly
which objects and access paths needs to be parallelized. If you change the query or alias names later there's no chance of the hint not working.
Here's a quick introduction to the parallel hint. There's also the
Using Parallel Execution chapter of the VLDB and Partitioning Guide.

Query Shows multiple listings of same info

I'm using this query to find all the machines on the network (using dell kace) that have an expired warranty according to their service tag.
However, when I run the query, some of the machines are listed twice but should only be listed once.
Here is an example of the output where machine example3 is correctly listed but example1 is listed twice.
# Machine Name Service Tag
1 example1 abcd123
2 example1 abcd123
3 example3 abcd124
Code:
SELECT
M.NAME AS MACHINE_NAME, M.CS_MODEL AS MODEL, DA.SERVICE_TAG,
DA.SHIP_DATE,M.USER_LOGGED AS LAST_LOGGED_IN_USER, DW.SERVICE_LEVEL_CODE,
DW.SERVICE_LEVEL_DESCRIPTION, DW.END_DATE AS EXPIRATION_DATE
FROM
DELL_WARRANTY DW
JOIN
DELL_ASSET DA ON (DW.SERVICE_TAG = DA.SERVICE_TAG)
JOIN
MACHINE M
ON (M.BIOS_SERIAL_NUMBER = DA.PARENT_SERVICE_TAG OR M.BIOS_SERIAL_NUMBER = DA.SERVICE_TAG)
LEFT JOIN
DELL_WARRANTY DW2 ON DW2.SERVICE_TAG=DW.SERVICE_TAG and DW2.END_DATE > NOW()
WHERE
M.CS_MANUFACTURER LIKE '%dell%'
AND
M.BIOS_SERIAL_NUMBER!=''
AND
DA.DISABLED != 1
AND
DW.END_DATE < NOW()
AND
DW2.SERVICE_TAG IS NULL;
Any ideas on how to make computers with the same machine name and service tags only output once? Thanks.

Let me make the assumption that you have a reasonable data model and reasonably populated data. That means that the duplicates are not coming from inappropriate data stored in the database.
Your query (formatted so I can read it) is:
SELECT M.NAME AS MACHINE_NAME, M.CS_MODEL AS MODEL, DA.SERVICE_TAG,
DA.SHIP_DATE, M.USER_LOGGED AS LAST_LOGGED_IN_USER, DW.SERVICE_LEVEL_CODE,
DW.SERVICE_LEVEL_DESCRIPTION, DW.END_DATE AS EXPIRATION_DATE
FROM DELL_WARRANTY DW JOIN
DELL_ASSET DA
ON DW.SERVICE_TAG = DA.SERVICE_TAG JOIN
MACHINE M
ON M.BIOS_SERIAL_NUMBER = DA.PARENT_SERVICE_TAG OR
M.BIOS_SERIAL_NUMBER = DA.SERVICE_TAG LEFT JOIN
DELL_WARRANTY DW2
ON DW2.SERVICE_TAG = DW.SERVICE_TAG and DW2.END_DATE > NOW()
WHERE M.CS_MANUFACTURER LIKE '%dell%' AND
M.BIOS_SERIAL_NUMBER <> '' AND
DA.DISABLED <> 1;
The suspect join is the one on Machine because it has an or. So, two machines might match different parts of the service tag, resulting in multiple very similar rows.
If your concern is specifically about machine names and service tags (the two columns you highlighted in the question), then you can fix those duplicates by ending the query with:
group by M.NAME, DA.SERVICE_TAG
(This assumes that you are using MySQL -- based on the syntax of your query. Most other databases would require aggregation functions around the rest of the columns in select.)

Try putting a DISTINCT infront of M.Name
OR playing with the joins like
SELECT
M.NAME AS MACHINE_NAME, M.CS_MODEL AS MODEL, DA.SERVICE_TAG,
DA.SHIP_DATE,M.USER_LOGGED AS LAST_LOGGED_IN_USER, DW.SERVICE_LEVEL_CODE,
DW.SERVICE_LEVEL_DESCRIPTION, DW.END_DATE AS EXPIRATION_DATE
FROM
DELL_WARRANTY DW
INNER JOIN
DELL_ASSET DA ON (DW.SERVICE_TAG = DA.SERVICE_TAG)
INNER JOIN
MACHINE M
ON (M.BIOS_SERIAL_NUMBER = DA.PARENT_SERVICE_TAG OR M.BIOS_SERIAL_NUMBER = DA.SERVICE_TAG)
INNER JOIN
DELL_WARRANTY DW2 ON DW2.SERVICE_TAG=DW.SERVICE_TAG and DW2.END_DATE > NOW()
WHERE
M.CS_MANUFACTURER LIKE '%dell%'
AND
M.BIOS_SERIAL_NUMBER!=''
AND
DA.DISABLED != 1
AND
DW.END_DATE < NOW()
AND
DW2.SERVICE_TAG IS NULL;

Oracle SQL Optimization : hierarchical query

I have this query that get list of records and traces the genealogy of each record but it runs forever. Can anyone help me improve the performance?
WITH root_nodes AS
(SELECT distinct dlot.dim_lot_key AS lot_key, facility, lot
FROM pedwroot.dim_lot dlot
JOIN AT_LOT a
ON (a.at_lot = dlot.lot AND a.at_facility = dlot.facility)
WHERE (dlot.has_test_lpt = 'Y'
or dlot.has_post_test_lpt = 'Y') and a.at_facility = 'MLA'),
upstream_genealogy AS
(SELECT /*+ INDEX(fact_link_lot IX_R_FLLOT_DLOT_2)*/DISTINCT CONNECT_BY_ROOT
fllot.dst_lot_key AS root_lot_key,
fllot.src_lot_key
FROM pedwroot.fact_link_lot fllot
CONNECT BY NOCYCLE PRIOR fllot.src_lot_key = fllot.dst_lot_key
START WITH fllot.dst_lot_key IN (SELECT lot_key FROM root_nodes)),
at_lst AS
(Select *
FROM pedwroot.dim_lot dlot_lst
JOIN upstream_genealogy upgen
ON (upgen.src_lot_key = dlot_lst.dim_lot_key)
where dlot_lst.has_assembly_lpt = 'Y')
SELECT distinct dlot_root.lot AS AT_LOT,
dlot_root.facility AS AT_FACILITY,
dfac_root.common_name AS AT_SITE,
dlot_root.LTC AS AT_LTC,
al.lot AS AT_LST,
dlot_src.lot AS FAB_LOT,
dlot_src.facility AS FAB_FACILITY,
dfac_src.common_name AS FAB_SITE
FROM upstream_genealogy upgen
JOIN at_lst al
ON (upgen.root_lot_key = al.root_lot_key)
JOIN pedwroot.dim_lot dlot_src
ON (upgen.src_lot_key = dlot_src.dim_lot_key)
JOIN pedwroot.dim_lot dlot_root
ON (upgen.root_lot_key = dlot_root.dim_lot_key)
JOIN pedwroot.fact_lot flot
ON (dlot_root.dim_lot_key = flot.lot_key)
JOIN pedwroot.dim_facility dfac_root
ON (flot.facility_key = dfac_root.dim_facility_key)
JOIN pedwroot.dim_facility dfac_src
ON (flot.fab_facility_key = dfac_src.dim_facility_key)
WHERE dlot_src.has_fab_lpt = 'Y';
Below is the explain plan of this query

The sudden change in cardinality from 11 million to 1 looks like a problem. Remove tables and predicates from the query until you find out exactly what is causing that poor estimate.
Most of the time these issue are caused by bad statistics, try gathering stats for all the related tables. (I can think of dozens of other potential problems, but it's probably not worth guessing until you can shrink the problem a little.)

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas