Retrieve Duplicate Names (not numbers) in a Table

I am going to attach a link that shows exactly what I am facing, with the code and a description of what should be happening.
See link: https://www.screencast.com/t/Zo3BH0i3v
In the event that you do not want to open that link, below I will explain the issue:
First and foremost, the code:
--Query Duplicate Document Names--
SELECT count(dcd.Name)'How Many Duplicates', dcf.Name 'Folder Name', dcf.ID
'Folder ID', dcd.ID 'Document ID', dcd.name 'Document Name',
'https://' + txtcityssldomainname + '/' + 'Admin/DocumentCenter/Folder/Index/'
+ Cast (FK_FolderID AS VARCHAR) AS Link_To_Folder
FROM tblcitysettings tblcs, DocumentCenterDocuments dcd
JOIN DocumentCenterFolders dcf on dcf.ID = dcd.FK_FolderID
WHERE dcf.FK_Status IN (40,10) AND (dcd.FK_Status IN (40,10)) AND
(dcd.IsArchived IN (0)) AND (dcf.Name NOT IN ('Content', 'Design', 'Banners',
'MyAccount'))
GROUP BY dcf.Name, dcf.ID, dcd.ID, dcd.Name, 'https://' + txtcityssldomainname
+ '/' + 'Admin/DocumentCenter/Folder/Index/' + Cast (FK_FolderID AS VARCHAR)
HAVING count(dcd.Name) = 1
ORDER BY dcd.ID
What should happen is that the first column returns a count of 1 for every document name that appears once. If a document name appears more than once, those rows should be collapsed into a single row whose count shows how many times the name appeared.
It is not working and I am completely at a loss.
Any help would be greatly appreciated!
Below is the Edited, Working copy:
Note: I realized that I was trying to merge rows on one column while the other selected columns held values that were not the same from row to row, so the rows could never be merged. I rewrote the query to display the bare minimum.
SELECT count(dcd.DisplayName) 'Number of Duplicate Entries', dcd.DisplayName 'Document Name'
FROM DocumentCenterDocuments dcd
JOIN DocumentCenterFolders dcf on dcf.ID = dcd.FK_FolderID
where dcd.IsArchived IN (0) AND (dcd.FK_Status IN (40, 10)) AND (dcf.FK_Status IN (40, 10)) AND (dcf.Name NOT IN ('Content', 'Design', 'Banners', 'MyAccount'))
GROUP BY dcd.DisplayName
Having count(dcd.DisplayName) > 1
Thank you!
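The difference between the two queries can be reproduced on a tiny table. This is only a sketch using Python's built-in sqlite3 module as a stand-in for SQL Server; the docs table, its columns, and the sample names are made up for illustration. Grouping by a unique ID together with the name puts every row in its own group, so every count is 1. Grouping by the name alone lets HAVING COUNT(*) > 1 pick out the duplicates.

```python
import sqlite3

# Hypothetical sample data: "Budget" appears three times, "Agenda" twice.
NAMES = ["Budget", "Agenda", "Budget", "Minutes", "Agenda", "Budget"]


def make_db(names):
    """Build an in-memory table with a unique id per document name."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, display_name TEXT)")
    con.executemany(
        "INSERT INTO docs (display_name) VALUES (?)", [(n,) for n in names]
    )
    return con


def counts_grouped_by_id_and_name(con):
    """Group on (id, name): each group holds exactly one row."""
    sql = "SELECT COUNT(*) FROM docs GROUP BY id, display_name"
    return [n for (n,) in con.execute(sql)]


def duplicate_names(con):
    """Group on the name only and keep names that occur more than once."""
    sql = (
        "SELECT display_name, COUNT(*) FROM docs "
        "GROUP BY display_name HAVING COUNT(*) > 1 "
        "ORDER BY display_name"
    )
    return con.execute(sql).fetchall()


con = make_db(NAMES)
print(counts_grouped_by_id_and_name(con))
print(duplicate_names(con))
```

The first call returns a list of six 1s, which matches the symptom in the original query. The second returns only the names that occur more than once, with their counts.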

Related

OPENJSON - modify statement to ignore first part of the string

We receive auto-generated emails from an application, and we export those to our database as they arrive at the Inbox. The table is called dbo.MailArchive.
Up until recently, the body of the email has always looked like this...
Status: Completed
Successful actions count: 250
Page load count: 250
...except with different numbers and statuses. Note that there is a carriage return on the blank line after Page load count.
The entirety of this data gets written to a field called Mail_Body - then we run the following statement using OPENJSON to parse those lines into their own columns in the record:
DECLARE @PI varchar(7) = '%[^' + CHAR(13) + CHAR(10) + ']%';
SELECT j.Status,
j.Successful_Actions_Count,
j.Page_Load_Count
FROM dbo.MailArchive m
CROSS APPLY(VALUES(REVERSE(m.Mail_Body),PATINDEX(@PI,REVERSE(m.Mail_Body)))) PI(SY,I)
CROSS APPLY(VALUES(REVERSE(STUFF(PI.SY,1,PI.I,''))))S(FixedString)
CROSS APPLY OPENJSON (CONCAT('{"', REPLACE(REPLACE(S.FixedString, ': ', '":"'), CHAR(13) + CHAR(10), '","'), '"}'))
WITH (Status varchar(100) '$.Status',
Successful_Actions_Count int '$."Successful actions count"',
Page_Load_Count int '$."Page load count"') j;
Beginning today, there are certain emails where the body of the email looks like this:
Agent did not meet defined success criteria on this run.
Status: Completed
Successful actions count: 250
Page load count: 250
To clarify, that's one new line at the top, a carriage return at the end of that line, and a carriage return on the blank line between the new line and the Status line. At this time, there is no consistent way to predict which emails will come in with this new line, and which ones won't.
How can I modify our OPENJSON statement to say, If this first line exists in the body, skip/ignore it and parse lines 3 through 5, else just do exactly what I have above? Or perhaps even better to future-proof it, always ignore everything before the word Status?

Since your data has new leading and trailing rows, I think a simple aggregation in concert with a string_split() and a CROSS APPLY would be more effective than my previous XML answer and the current JSON approach
Example or dbFiddle
Select A.ID
,Status = stuff(Pos1,1,charindex(':',Pos1),'')
,Action = try_convert(int,stuff(Pos2,1,charindex(':',Pos2),''))
,PageCnt = try_convert(int,stuff(Pos3,1,charindex(':',Pos3),''))
From YourTable A
Cross Apply (
Select [Pos1] = max(case when Value like 'Status:%' then value end)
,[Pos2] = max(case when Value like '%actions count:%' then value end)
,[Pos3] = max(case when Value like 'Page load count:%' then value end)
From string_split(SomeCol,char(10))
) B
Returns
ID Status Action PageCnt
1 Completed 250 250
Note: Use an OUTER APPLY if you want to see NULLs
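The idea behind this answer (split the body into lines, keep only the lines whose label you recognise, ignore everything else) can also be sketched outside the database. The Python below is an illustration, not part of the answer: the function name, the output keys, and the sample body are assumptions. Unparseable numbers become None, loosely mirroring try_convert.

```python
# Labels we care about, mapped to hypothetical output keys.
WANTED = {
    "Status": "status",
    "Successful actions count": "actions",
    "Page load count": "page_loads",
}


def to_int(text):
    """Return an int, or None when the text is not a number."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_mail_body(body):
    """Pick the known 'Label: value' lines out of a mail body.

    Any other line, such as a leading notice or a blank line, is skipped,
    so it does not matter how many extra lines precede 'Status'.
    """
    parsed = {}
    for line in body.splitlines():  # handles both \r\n and \n
        label, sep, value = line.partition(":")
        key = WANTED.get(label.strip())
        if sep and key:
            parsed[key] = value.strip()
    for key in ("actions", "page_loads"):
        if key in parsed:
            parsed[key] = to_int(parsed[key])
    return parsed


new_style = (
    "Agent did not meet defined success criteria on this run.\r\n"
    "\r\n"
    "Status: Completed\r\n"
    "Successful actions count: 250\r\n"
    "Page load count: 250\r\n"
    "\r\n"
)
print(parse_mail_body(new_style))
```

Because lines are matched by label rather than by position, the old three-line body and the new body with the extra leading line parse to the same result.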

Getting error "Only one expression can be specified in the select list..."

I am trying to add a column to a view with the following code:
SELECT ';' + CONTEXT as DriverNotes,
(STUFF((SELECT CustomerID FROM Notes E2 WHERE E2.CustomerID IN (Notes.CustomerID)
FOR XML PATH(''), TYPE, ROOT).value('root[1]','nvarchar(5)'),1,0,'')) as CustomerID FROM NOTES
On its own it works just fine. When I run it within a view, however, I get the following error:
"Only one expression can be specified in the select list when the subquery is not introduced with EXISTS."
I realize that the code here is trying to call two columns and that is what is giving me the error, but I only want one, and that would be CONTEXT. I need this to correlate with Notes.CustomerID but without the column appearing in the query.
I am still quite new to this, so any help would be greatly appreciated.

Check this query. I think this is what you want :
SELECT Notes.CustomerId,
STUFF(
(SELECT ';' + CONTEXT FROM Notes E2
WHERE E2.CustomerId = Notes.CustomerId
FOR XML PATH ('')), 1, 1, ''
) DriverNotes
FROM Notes /*Probably it should be Customer table */
GROUP BY Notes.CustomerId
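To make clear what the STUFF ... FOR XML PATH('') idiom is computing, here is the same result built by hand in Python: one entry per customer, with that customer's notes joined by ';' and no leading separator. The function name and sample rows are made up for illustration.

```python
from collections import defaultdict


def driver_notes(rows):
    """Join all note texts per customer with ';'.

    rows is an iterable of (customer_id, context) pairs, standing in for
    the Notes table.
    """
    grouped = defaultdict(list)
    for customer_id, context in rows:
        grouped[customer_id].append(context)
    # ';'.join puts separators only between items, which is what
    # STUFF(..., 1, 1, '') achieves by deleting the first ';'.
    return {cid: ";".join(notes) for cid, notes in grouped.items()}


sample = [(1, "call ahead"), (2, "gate code 42"), (1, "leave at dock")]
print(driver_notes(sample))
```

The inner query returns a single expression (the concatenated text), which is why the rewritten SQL avoids the "only one expression" error.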

Pyodbc and Access with query parameter that contains a period

I recently found a bug with some Access SQL queries that I can't seem to track down. I have a fairly straightforward SQL query that I use to retrieve data from an Access database that's "managed" in an older application (i.e. the data is already in the database and I have no real control over what's in there).
import pyodbc

MDB = '******.MDB'
DRV = '{Microsoft Access Driver (*.mdb)}'
PWD = ''
con = pyodbc.connect('DRIVER={};DBQ={};PWD={}'.format(DRV, MDB, PWD))

# Hypothetical wrapper: the snippet is the body of a function that is not shown.
def count_rows(**kwargs):
    sql = ('SELECT Estim.PartNo, Estim.Descrip, Estim.CustCode, Estim.User_Text1, Estim.Revision, ' +
           'Estim.Comments, Routing.PartNo AS RPartNo, Routing.StepNo, Routing.WorkCntr, Routing.VendCode, ' +
           'Routing.Descrip AS StepDescrip, Routing.SetupTime, Routing.CycleTime, ' +
           'Routing.WorkOrVend, ' +
           'Materials.PartNo as MatPartNo, Materials.SubPartNo, Materials.Qty, ' +
           'Materials.Unit, Materials.TotalQty, Materials.ItemNo, Materials.Vendor ' +
           'FROM (( Estim ' +
           'INNER JOIN Routing ON Estim.PartNo = Routing.PartNo ) ' +
           'INNER JOIN Materials ON Estim.PartNo = Materials.PartNo )')
    if 'PartNo' in kwargs:
        key = kwargs['PartNo']
        # leading space keeps WHERE separate from the closing parenthesis
        sql = sql + ' WHERE Estim.PartNo=?'
        cursor = con.cursor().execute(sql, key)
    # use this for debugging only
    num = 0
    for row in cursor.fetchall():
        num += 1
    return num
This works fine for all PartNo except when PartNo contains a decimal point. Curiously, when PartNo contains a decimal point AND a hyphen, I get the appropriate record(s).
kwargs['PartNo'] = "100.100-2" # returns 1 record
kwargs['PartNo'] = "200.100" # returns 0 records
Both PartNos exist when viewed in the other application, so I know there should be records returned for both queries.
My first thought was to ensure kwargs['PartNo'] is a string key = str(kwargs['PartNo']) with no change.
I also tried to place quotes around the 'PartNo' value with no success. key = '\'' + kwargs['PartNo'] + '\''
Finally, I tried to escape the . with no success (I realize this would break most queries, but I'm just trying to track down the issue with a single period) key = str(kwargs['partNo']).replace('.', '"."')
I know using query parameters should handle all the escaping for me, but at this point, I'm just trying to figure out what's going on. Any thoughts on this?

So the issue isn't with the query parameters - everything works as it should. The problem is with the SQL statement. I incorrectly assumed - and never checked - that there was a record in the Materials table that matched PartNo.
INNER JOIN Materials ON Estim.PartNo = Materials.PartNo
will only return a record if PartNo is found in both tables, which in this particular case it is not.
Changing it to
LEFT OUTER JOIN Materials ON Estim.PartNo = Materials.PartNo
produces the expected results. See this for info on JOINS. https://msdn.microsoft.com/en-us/library/bb243855(v=office.12).aspx
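The effect is easy to reproduce with a cut-down version of the tables. This sketch uses Python's built-in sqlite3 module instead of Access, with only the columns needed; the sample rows are assumptions chosen so that '200.100' exists in Estim but has no match in Materials.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE Estim (PartNo TEXT);
    CREATE TABLE Materials (PartNo TEXT, SubPartNo TEXT);
    INSERT INTO Estim VALUES ('100.100-2'), ('200.100');
    INSERT INTO Materials VALUES ('100.100-2', 'SUB-1');
    """
)


def count_rows(join_type, part_no):
    """Count result rows for one part using the given join type.

    join_type is interpolated only because it is a fixed keyword in this
    demo; the part number itself is still passed as a query parameter.
    """
    sql = (
        f"SELECT COUNT(*) FROM Estim {join_type} JOIN Materials "
        "ON Estim.PartNo = Materials.PartNo WHERE Estim.PartNo = ?"
    )
    return con.execute(sql, (part_no,)).fetchone()[0]


print(count_rows("INNER", "100.100-2"))
print(count_rows("INNER", "200.100"))
print(count_rows("LEFT OUTER", "200.100"))
```

The inner join returns no rows for '200.100', while the left outer join returns one row whose Materials columns are NULL. The period in the part number plays no role.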
As for print (repr(key)) - flask handles the kwarg type upstream properly
api.add_resource(PartAPI, '/api/v1.0/part/<string:PartNo>')
so when I ran this in the browser, I got the "full length" strings. When run in the cmd line using python -c ....... I was not handling the argument type properly as Gord pointed out, so it was truncating the trailing zeros. I didn't think the flask portion was relevant, so I never added that in the original question.
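One plausible way for the trailing zeros to get lost (an assumption, since the command-line invocation is not shown) is that the part number reached the code as a number instead of a string:

```python
# A part number written as a numeric literal is a float, and floats do not
# remember trailing zeros.
part_as_number = 200.100
part_as_text = "200.100"

print(str(part_as_number))  # 200.1
print(part_as_text)         # 200.100
```

Looking up '200.1' would then find nothing even though '200.100' exists, so it is worth checking repr(key) before blaming the query.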

Set alias for column values

Is it possible to set an alias for column values, the way we set one for column headers in SQL Server?
Or is there any other way to convert my column values to a readable format for clients?
I have the following System generated values:
BILL_DETAILS
BILLING_MENU
ComplaintNumberInput
CUSTOMER_ACCOUNT_NUMBER_INPUT
DEFAULTER
FAULTS_SHUTDOWN_MENU
KUNDA_CONNECTION
LOAD_SHEDDING_MENU
LOAD_SHEDDING_SCHEDULED
loadSheddingScheduleReplayer
loadSheddingStatus
loadSheddingStatusReplayer
MENU_CONTEXT_EVAL
POWER_COMPLAINTS_MENU
repaetComplaintStatus
Is it possible to change them to the following:
BILL DETAILS
BILLING MENU
COMPLAINT NUMBER INPUT
CUSTOMER ACCOUNT NUMBER INPUT
DEFAULTER
FAULTS SHUTDOWN MENU
KUNDA CONNECTION
LOAD SHEDDING MENU
LOAD SHEDDING SCHEDULED
LOAD SHEDDING SCHEDULE REPLAYER
LOAD SHEDDING STATUS
LOAD SHEDDING STATUS REPLAYER
MENU CONTEXT EVAL
POWER COMPLAINTS MENU
REPEAT COMPLAINT STATUS

In SQL, an alias is a different name for a database object. Values do not fall under this category, so it's impossible to alias them. You can, however, format the output of your query, though formatting is usually best done in the presentation layer and not in the data layer.
Having said that, there is a t-sql solution for your question:
SELECT REPLACE(ColumnName, '_', ' ') As ColumnName
FROM TableName
This will convert all underscores to spaces.
To handle the other format you can thank Jeff Moden for solving that problem as well (see this link).
SELECT COALESCE(STUFF(ColumnName, NULLIF(patindex('%[a-z][A-Z]%', ColumnName COLLATE Latin1_General_BIN), 0) + 1, 0, ' '), ColumnName) AS ColumnName
FROM TableName
So combining the 2 solutions your final sql should be something like this:
SELECT REPLACE(COALESCE(STUFF(ColumnName, NULLIF(patindex('%[a-z][A-Z]%', ColumnName COLLATE Latin1_General_BIN), 0) + 1, 0, ' '), ColumnName), '_', ' ') AS ColumnName
FROM TableName
This way you can handle these 2 formats in pure t-sql without having to change your query whenever a new value is added to the table.
Here is a test case with the values you posted:
DECLARE @t TABLE (Col VARCHAR(40))
INSERT INTO @t VALUES
('BILL_DETAILS'),
('BILLING_MENU'),
('ComplaintNumberInput'),
('CUSTOMER_ACCOUNT_NUMBER_INPUT'),
('DEFAULTER'),
('FAULTS_SHUTDOWN_MENU'),
('KUNDA_CONNECTION'),
('LOAD_SHEDDING_MENU'),
('LOAD_SHEDDING_SCHEDULED'),
('loadSheddingScheduleReplayer'),
('loadSheddingStatus'),
('loadSheddingStatusReplayer'),
('MENU_CONTEXT_EVAL'),
('POWER_COMPLAINTS_MENU'),
('repaetComplaintStatus')
SELECT Col
,UPPER(REPLACE(COALESCE(STUFF(Col, NULLIF(patindex('%[a-z][A-Z]%', Col COLLATE Latin1_General_BIN), 0) + 1, 0, ' '), Col), '_', ' ')) AS NewCol
FROM @t
Results:
Col NewCol
BILL_DETAILS BILL DETAILS
BILLING_MENU BILLING MENU
ComplaintNumberInput COMPLAINT NUMBERINPUT
CUSTOMER_ACCOUNT_NUMBER_INPUT CUSTOMER ACCOUNT NUMBER INPUT
DEFAULTER DEFAULTER
FAULTS_SHUTDOWN_MENU FAULTS SHUTDOWN MENU
KUNDA_CONNECTION KUNDA CONNECTION
LOAD_SHEDDING_MENU LOAD SHEDDING MENU
LOAD_SHEDDING_SCHEDULED LOAD SHEDDING SCHEDULED
loadSheddingScheduleReplayer LOAD SHEDDINGSCHEDULEREPLAYER
loadSheddingStatus LOAD SHEDDINGSTATUS
loadSheddingStatusReplayer LOAD SHEDDINGSTATUSREPLAYER
MENU_CONTEXT_EVAL MENU CONTEXT EVAL
POWER_COMPLAINTS_MENU POWER COMPLAINTS MENU
repaetComplaintStatus REPAET COMPLAINTSTATUS
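As the results show, the patindex approach inserts a space only at the first lowercase-to-uppercase boundary, so values with several camel-case words are split once. If the formatting is moved to the presentation layer, every boundary can be handled. The following is a minimal Python sketch; the function name is made up, and it does not correct misspellings such as "repaet" in the source data.

```python
import re


def humanize(value):
    """Turn SNAKE_CASE or camelCase identifiers into spaced upper case."""
    # Insert a space between a lowercase letter and the uppercase letter
    # that follows it, at every such boundary in the string.
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", value)
    return spaced.replace("_", " ").upper()


for raw in ("BILL_DETAILS", "ComplaintNumberInput", "loadSheddingScheduleReplayer"):
    print(raw, "->", humanize(raw))
```

Values that are already a single upper-case word, such as DEFAULTER, pass through unchanged.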

Use case statements for each value, like:
case old_column_name
when 'LOAD_SHEDDING_MENU'
then 'LOAD SHEDDING MENU'
when 'loadSheddingScheduleReplayer'
then 'LOAD SHEDDING SCHEDULE REPLAYER'
when ...........
then ...........
end as column_name

SQL Server 2005 Issue Column name or number of supplied values does not match table definition

I wish to DELETE the data from a table before performing an INSERT INTO, however I keep receiving an error stating:
Insert Error: Column name or number of supplied values does not match table definition.
I've also tried defining the columns the data should be entered into as part of the INSERT INTO statement, but then get issues with column names, even though they are correct. I have a feeling the issues relates to me selecting 2 PostCode entries and converting them into 1, but if someone could shed light on this it would be a big help.
My code can be found below. If you want me to add the code where I was specifying column names, let me know. For reference, the fields selected are all the fields in the Course table other than AutoNum, which is an auto-number primary key, and SSMA_TimeStamp, which is a TimeStamp.
BEGIN
DELETE dbo.Course
INSERT INTO dbo.Course
SELECT
RTRIM( CAST (sd.[RefNo] AS nvarchar(50))) AS 'Student Ref No',
sd.[FirstForeName] AS Forename,
sd.[Surname],
sd.[Address1],
sd.[Address2],
sd.[Address3],
sd.[Address4],
sd.[DateOfBirth] AS DOB,
sd.[PostCodeOut] + ' ' + sd.[PostCodeIn] AS 'Post Code',
o.[Name] AS 'Course Name',
o.[Code] As 'Course Code',
e.[StartDate] AS 'Start Date',
e.[ExpectedGLH] AS 'Exp GLH',
e.[ExpectedEndDate] AS 'Expected End Date',
e.[ActualEndDate] AS 'Actual End Date',
e.[Grade] AS 'Grade',
ou.[Description] AS Outcome,
cs.[Description] AS 'Completion Status',
sd.[Tel1] AS 'Tel 1'
FROM [xxxxxxx].[xxxxxx].[dbo].[StudentDetail] sd
INNER JOIN [xxxxxxx].[xxxxxx].[dbo].[Enrolment] e
ON sd.[StudentDetailID] = e.[StudentDetailID]
Inner JOIN [xxxxxxx].[xxxxxx].[dbo].[Offering] o
ON o.[OfferingID] = e.[OfferingID]
INNER JOIN [xxxxxxx].[xxxxxx].[dbo].[CompletionStatus] cs
ON cs.[CompletionStatusID] = e.[CompletionStatusID]
INNER JOIN [xxxxxxx].[xxxxxx].[dbo].[Outcome] ou
ON ou.[OutcomeID] = e.[OutcomeID]
WHERE sd.[AcademicYearID] = '09/10'
AND
o.[Code] LIKE '%-ee%'
AND
o.[Name] LIKE '%-%dl%'
ORDER BY
sd.[RefNo]

It sounds like your 'Course' table does not match your insert statement, either in the number or names of the columns specified (as per the error message).
Could you add the CREATE TABLE code for the 'Course' table? That will show where the discrepancy lies.
Thanks.

I would explicitly list the columns in the course table that you are inserting into - this may solve your problem/help find your issue, but also reduce maintenance problems in the future.

To fix this issue you need to explicitly specify the list of the table's columns in the INSERT INTO statement.

You should add a list of columns to the INSERT statement. See below, where you explicitly list each column from dbo.Course that you intend to populate in your INSERT:
INSERT INTO dbo.Course
---<<<<<
(col1, col2, col3, col4, col5....) ---<<<<<Add this here
---<<<<<
SELECT
RTRIM( CAST (sd.[RefNo] AS nvarchar(50))) AS 'Student Ref No',
sd.[FirstForeName] AS Forename,
sd.[Surname],
sd.[Address1],
sd.[Address2],
sd.[Address3],
sd.[Address4],
sd.[DateOfBirth] AS DOB,
sd.[PostCodeOut] + ' ' + sd.[PostCodeIn] AS 'Post Code',
o.[Name] AS 'Course Name',
o.[Code] As 'Course Code',
e.[StartDate] AS 'Start Date',
e.[ExpectedGLH] AS 'Exp GLH',
e.[ExpectedEndDate] AS 'Expected End Date',
e.[ActualEndDate] AS 'Actual End Date',
e.[Grade] AS 'Grade',
ou.[Description] AS Outcome,
cs.[Description] AS 'Completion Status',
sd.[Tel1] AS 'Tel 1'
FROM ....
Then make sure that each column in the SELECT list matches each of these columns, in the same order. From your error, it sounds like the SELECT returns too many or too few columns.
