How to find duplicate records in SQL?

How to find duplicate records in SQL? - sql

I am trying to develop a query to insert unique records but am receiving the SQL Server Primary Key error for trying to insert duplicate records. I was able to insert some values with this query but not for this record (score_14).
So now I am trying to find duplicate record with the following query. The challenge is that my PK is based on 3 columns: StudentID, MeasureDate, and MeasureID--all from a different table not mentioned below.
But this only shows me count--instead I want to just return records with count > 1. How do I do that?
select count(a.score_14) as score_count, A.studentid, A.measuredate, B.measurename+' ' +B.LabelName
from [J5C_Measures_Sys] A
join [J5C_ListBoxMeasures_Sys] B on A.MeasureID = B.MeasureID
join sysobjects so on so.name = 'J5C_Measures_Sys'
join syscolumns sc on so.id = sc.id
join [J5C_MeasureNamesV2_Sys] v on v.Score_field_id = sc.name
where so.type = 'u' and sc.name = 'score_14' and a.score_14 is not null
AND A.STUDENTID IS NOT NULL AND A.MEASUREDATE IS NOT NULL AND B.MEASURENAME IS NOT NULL
--and count(a.score_14)>1
group by a.studentid, a.measuredate, B.measurename, B.LabelName, A.score_14
having count(a.score_14) > 1

Beth is correct - here's my re-write of your query:
SELECT a.studentid, a.measuredate, a.measureid
from [J5C_Measures_Sys] A
GROUP BY a.studentid, a.measuredate, a.measureid
HAVING COUNT(*) > 1
Previously:
SELECT a.studentid, a.measuredate, a.measureid
from [J5C_Measures_Sys] A
join [J5C_ListBoxMeasures_Sys] B on A.MeasureID = B.MeasureID
join sysobjects so on so.name = 'J5C_Measures_Sys'
AND so.type = 'u'
join syscolumns sc on so.id = sc.id
and sc.name = 'score_14'
join [J5C_MeasureNamesV2_Sys] v on v.Score_field_id = sc.name
where a.score_14 is not null
AND B.MEASURENAME IS NOT NULL
GROUP BY a.studentid, a.measuredate, a.measureid
HAVING COUNT(*) > 1

you need to take A.score_14 out of your group by clause if you want to count it

Related

SQL IN "Too many values"

I have a problem I want to get all of the DISTINCT CLIENTIDs in my LCMINV table in which the CLIENTID is in LAAPPL's CLIENTID and LACOBORW's COBORWID and also the APPLICATION STATUS is either 'PEN' OR 'REC'.
Here is my table:
++++++++++++++++++++++++++++++++++++++++++++
+ LAAPPL + LACOBORW + LCMINV +
++++++++++++++++++++++++++++++++++++++++++++
+ CLIENTID + APPLNO + CLIENTID +
+ APPLNO + COBORWID + +
+ STATUS + + +
++++++++++++++++++++++++++++++++++++++++++++
Here is my SQL Query I've done so far:
SELECT DISTINCT(CLIENTID)
FROM LCMINV
WHERE CLIENTID
IN (SELECT CLIENTID, COBORWID
FROM LAAPPL
LEFT OUTER JOIN LACOBORW ON LACOBORW.APPLNO = LAAPPL.APPLNO
WHERE STATUS = 'PEN' OR STATUS = 'REC')
Can it be done in one query or do I need separate queries? I tried my query above an I am getting an error "Too many values"

I would probably use a CTE to break the query up into smaller parts, and then use INNER JOINs to combine the tables. Since You query LAAPPL, I put both columns used into a single subquery, and then join on each column in the next CTE.
It would look something like this:
WITH CTE_LAAPPL AS
(
SELECT
CLIENTID
,APPLNO
FROM LAAPPL
WHERE STATUS IN ('PEN','REC')
)
,CLIENTIDs AS
(
SELECT DISTINCT
l.COBORWID AS CLIENTID
FROM CTE_LAAPPL c
INNER JOIN LACOBORW l ON l.APPLNO = c.APPLNO
UNION
SELECT DISTINCT
l.CLIENTID
FROM CTE_LAAPPL c
INNER JOIN LCMINV l ON l.CLIENTID = c.CLIENTID
)
SELECT
c.CLIENTID
,c.NAME
FROM LCCLIENT c
INNER JOIN CLIENTIDs i ON i.CLIENTID = c.CLIENTID
ORDER BY NAME

I think you just want:
SELECT DISTINCT i.CLIENTID
FROM LCMINV i
WHERE i.CLIENTID IN (SELECT a.CLIENTID
FROM LAAPPL a JOIN
LACOBORW c
ON c.APPLNO = a.APPLNO
WHERE a.STATUS IN ('PEN', 'REC')
);
I'm not really sure why you are joining in LACOBORW. You can probably remove that logic.

I would recommend left joins to get the data you need. Unfortunately, your question 3 does not seem to fit your model because your LCMINV table does not seem to have an id that matches to another table and I cannot understand where LCMINV.CLIENTID should be related to in another context. This works in MSSQL - cannot be sure about Oracle.
--this statement will give you all the names that have a status in the where clause
--you will receive NULL if there is no Co-BorrowerID
select
a.name
, b.applno
, b.status
, c.coborwid
from
lcclient a left join
laapl b on a.clientid = b.clientid left join
lacoborw c on b.applno = c.applno
where
b.status in ('PEN', 'REC')
order by
a.name

How to get the all table(s) name which are used in particular stored procedure?

I have a requirement to get all the database tables name which are used in specific stored procedure?
As an example, I have one stored procedure as given below.
CREATE PROCEDURE [dbo].[my_sp_Name]
#ID INT = NULL
AS
BEGIN
SELECT ID, NAME, PRICE
FROM tbl1
INNER JOIN tbl2 ON tbl1.ProductId = tbl2.ProductId
LEFT JOIN tbl3 ON tbl2.ProductSalesDate = tbl3.ProductSalesDate
LEFT JOIN tbl4 ON tbl1.ProductCode = tbl4.ItemCode
END
Expected output:
Used_Table_Name
tbl1
tbl2
tbl3
tbl4
Can any one suggest a way?

Below query will help you to get used database tables in stored procedure.
;WITH stored_procedures AS (
SELECT oo.name AS table_name,
ROW_NUMBER() OVER(partition by o.name,oo.name ORDER BY o.name,oo.name) AS row
FROM sysdepends d
INNER JOIN sysobjects o ON o.id=d.id
INNER JOIN sysobjects oo ON oo.id=d.depid
WHERE o.xtype = 'P' AND o.name = 'my_sp_Name'
)
SELECT Table_name AS 'Used_Table_Name' FROM stored_procedures
WHERE row = 1

Use below script to get table names in your store procedure :
SELECT DISTINCT [object_name] = SCHEMA_NAME(o.[schema_id]) + '.' + o.name
, o.type_desc
FROM sys.dm_sql_referenced_entities ('[dbo].[Your_procedurename]',
'OBJECT')d
JOIN sys.objects o ON d.referenced_id = o.[object_id]
WHERE o.[type] IN ('U', 'V')

Select qry to using 2 databases

I have the below query:
SELECT
--a.DateEntered,
--a.InventoryId,
a.SKU, a.QtyOnHand,
b.Dateentered AS VDateEntered,
b.GoLive, b.DateOnSite, b.CostPrice,
--a.CurrentPrice,
m.name AS Status,
hrf.category, hrf.department, hrf.BrandedOB, hrf.Division,
hrf.Fascia,
(a.QtyOnHand * b.CostPrice) AS Cost_Value,
NULL AS Item_Quantity, NULL AS Item,
NULL AS Season, hrf.Company,
(a.QtyOnHand * b.CurrentPrice) AS Sellilng_Value,
b.merchandisingseason, b.CostPrice AS Costprice_RP,
b.CurrentPrice, b.InventoryID,
-- a.AverageUnitCost,
-- a.AverageUnitCost AS RP_Stk_AverageUnitCost,
-- a.CurrentPrice AS RP_Stk_Current_Price,
-- a.Statusid,
-- a.Quantity_Sign,
(a.QtyOnHand * b.CostPrice) AS Cost_Value_RP,
(a.QtyOnHand * b.AverageUnitCost) AS AWC_Value
-- a.StockReconciliationId,
-- a.AverageUnitCost,
FROM
[dbo].[FC03QTY] a
JOIN
dbo.inventory b ON a.SKU = b.SKU
LEFT JOIN
(------Hierarchy-------
SELECT
ih.InventoryId, hry.category, hry.department,
hry.BrandedOB, hry.Division, hry.Fascia, hry.Company
FROM
(SELECT
ihn.HierarchyNodeId, ihn.InventoryId
FROM bm.InventoryHierarchyNode IHN
GROUP BY ihn.HierarchyNodeId, ihn.InventoryId) IH
JOIN
(SELECT
g.categoryid, g.category, h.department, i.BrandedOB,
j.Division, K.Fascia, L.company
FROM
Category g (NOLOCK)
JOIN
Department H ON g.departmentid = h.departmentID
JOIN
BrandedOB I (NOLOCK) ON h.BrandedOBID = i.BrandedOBID
JOIN
Division j (NOLOCK) ON i.divisionid = j.divisionid
JOIN
Fascia k (NOLOCK) ON j.fasciaid = k.fasciaID
JOIN
company l (NOLOCK) ON k.companyid = l.companyid
GROUP BY
g.categoryid, g.category, h.department,
i.BrandedOB, j.Division, K.Fascia, L.company) HRY ON ih.HierarchyNodeId = hry.CategoryId
GROUP BY
ih.InventoryId, hry.category, hry.department,
hry.BrandedOB, hry.Division, hry.Fascia, hry.Company) HRF ON b.inventoryid = hrf.inventoryid
JOIN
inventorystatus m (NOLOCK) ON b.statusid = m.statusid
It is using 2 tables -
[dbo].[FC03QTY] a
and
dbo.inventory b
that are joined at the SKU level.
[dbo].[FC03QTY] is on the scratch database and dbo.inventory is on the reports database.
How can I get my query to use these tables if they are on 2 different db?
Any advice greatly received.

In sql server the syntaxis for tables is [database name].[schema name].[table name]
So you need something like this:
SELECT A.*, B.*
FROM
Database1.dbo.Table1 as A,
Database2.dbo.Table2 as B

Trying to pull new SQL objects between dates from another table

I have a query that pulls the information from the sys tables to produce a list of the new objects that have been created in our environment. What I need to do is the last part of the query needs to be able to look at another table and pull the 3 most current records and use the date from those records to show only the objects that have been created since those 3 dates. In my testing I keep running into "Subquery returned more than 1 value" Any help would be greatly appreciated.
EDIT:
I am currently running SQL 2008 R2.
The query runs now as is, but only pulls the most recent date, I need it to pull everything from the last 3 dates.
SELECT
a.name AS ObjectName, b.name AS ParameterName, c.name AS DataType,
b.isnullable AS [Allow Nulls?], a.crdate AS CreateDate,
CASE WHEN d .name IS NULL THEN 0 ELSE 1 END AS [PKey?],
CASE WHEN e.parent_object_id IS NULL THEN 0 ELSE 1 END AS [FKey?],
CASE WHEN e.parent_object_id IS NULL THEN '-' ELSE g.name END AS [Ref Table],
CASE WHEN h.value IS NULL THEN '-' ELSE h.value END AS Description,
c.length AS FieldSize, a.replinfo AS IsReplicated,
CASE a.xtype WHEN 'V' THEN 'View' WHEN 'P' THEN 'StoredProcedure' WHEN 'FN' THEN 'ScalarFunction' WHEN 'F' THEN 'ForeignKey' WHEN 'U' THEN 'Table' WHEN
'TR' THEN 'Trigger' WHEN 'TT' THEN 'TableType' WHEN 'PK' THEN 'PrimaryKey' END AS ObjectType
FROM
sys.sysobjects AS a
INNER JOIN sys.syscolumns AS b ON a.id = b.id
INNER JOIN sys.systypes AS c ON b.xtype = c.xtype
LEFT OUTER JOIN
(SELECT so.id, sc.colid, sc.name
FROM
sys.syscolumns AS sc
INNER JOIN sys.sysobjects AS so ON so.id = sc.id
INNER JOIN sys.sysindexkeys AS si ON so.id = si.id AND sc.colid = si.colid
WHERE (si.indid = 1)) AS d ON a.id = d.id AND b.colid = d.colid
LEFT OUTER JOIN sys.foreign_key_columns AS e ON a.id = e.parent_object_id AND b.colid = e.parent_column_id
LEFT OUTER JOIN sys.objects AS g ON e.referenced_object_id = g.object_id
LEFT OUTER JOIN sys.extended_properties AS h ON a.id = h.major_id AND b.colid = h.minor_id
WHERE (a.type = 'U' OR
a.type = 'V' OR
a.type = 'F' OR
a.type = 'PK' OR
a.type = 'P' OR
a.type = 'FN' OR
a.type = 'TT' OR
a.type = 'TR') AND
(a.crdate >
(SELECT TOP (1) DeployDate
FROM OtherTable.dbo.Tracking
ORDER BY DeployDate DESC))
ORDER BY CreateDate DESC

If I'm understanding your question correctly, you want the minimum of the last 3 dates in the tracking table.
If so, you could do something like this...
(a.crdate >
(SELECT Min(topthree.DeployDate)
FROM (select Top 3 DeployDate
From dbo.Tracking
ORDER BY DeployDate DESC
) as topthree
)
)

Selecting Rowcount and number of columns with one sql query?

How can i club both the select count and select columns with one statement?

If you are just looking to see what "shape" your tables are in, something like this would work. But if you're wanting to know column/row counts for a given query, it may be possible but I don't know how to do it
; WITH ROW_COUNTS AS
(
SELECT
s.[Name] as [SchemaName]
, t.[name] as [TableName]
, SUM(p.rows) as [RowCounts]
FROM
sys.schemas s
LEFT JOIN
sys.tables t
ON s.schema_id = t.schema_id
LEFT JOIN
sys.partitions p
ON t.object_id = p.object_id
LEFT JOIN
sys.allocation_units a
ON p.partition_id = a.container_id
WHERE
p.index_id in(0,1) -- 0 heap table , 1 table with clustered index
AND p.rows is not null
AND a.type = 1 -- row-data only , not LOB
GROUP BY
s.[Name]
, t.[name]
)
, COLUMN_COUNTS AS
(
SELECT
s.[Name] as [SchemaName]
, t.[name] as [TableName]
, COUNT(c.column_id) as [ColumnCounts]
FROM
sys.schemas s
INNER JOIN
sys.tables t
ON s.schema_id = t.schema_id
INNER JOIN
sys.columns c
ON C.object_id = T.object_id
GROUP BY
s.[Name]
, t.[name]
)
SELECT
CC.SchemaName
, CC.TableName
, RC.RowCounts
, CC.ColumnCounts
FROM
COLUMN_COUNTS CC
INNER JOIN
ROW_COUNTS RC
ON RC.SchemaName = CC.SchemaName
AND RC.TableName = CC.TableName
ORDER BY
1,2
Results run against master
SchemaName TableName RowCounts ColumnCounts
dbo Hold_Cluster_Status 0 10
dbo MSreplication_options 3 6
dbo spt_fallback_db 0 8
dbo spt_fallback_dev 0 10
dbo spt_fallback_usg 0 9
dbo spt_monitor 1 11
dbo spt_values 2506 6

Using a COUNT() window aggregate you should be able to accomplish what you are looking to do. However, the DISTINCT against a very large table is not likely to be very performant:
SELECT DISTINCT a, b, c, COUNT(*) OVER (PARTITION BY 1) FROM <tablename>;
Another option would be but you are touching the table twice:
SELECT a,b,c, MyCount
FROM <tablename>
CROSS JOIN
(SELECT COUNT(*) AS MyCount
FROM <tablename>
)

I'm not 100% ure what you want, but here is a possible example...
SELECT
a,
b,
c,
(SELECT COUNT(*) FROM yourTable) as table_count
FROM
yourTable

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to find duplicate records in SQL? - sql

you need to take A.score_14 out of your group by clause if you want to count it

Related

SQL IN "Too many values"

How to get the all table(s) name which are used in particular stored procedure?

Select qry to using 2 databases

Trying to pull new SQL objects between dates from another table

Selecting Rowcount and number of columns with one sql query?

Categories

Resources