Two Identical Tables, Two Identical SQL Queries, Two Completely Different Results

I was tasked with migrating an ancient (2007), supposedly ASP.NET site from a "private" server in the owners' office onto my company's hosting account on Rackspace. Everything went smoothly until we switched the DNS. It turned out the original programmer had hardcoded references to files, specifically to the file that generates and formats the navigation menu. When we replaced the hardcoded references, the menu suddenly wasn't behaving at all like it should. I tracked down the query he used to generate the XML table for the menu.
SELECT
parent.id,
parent.title,
'/page.aspx?id=' + isnull(cast(parent.id as varchar),'') + '&name=' + parent.name url,
siteMapNode.id, siteMapNode.title,
'/page.aspx?id=' + isnull(cast(siteMapNode.id as varchar),'') + '&name=' + siteMapNode.name url,
siteMapSubNode.title,
'/page.aspx?id=' + isnull(cast(siteMapSubNode.id as varchar),'') + '&name=' + siteMapSubNode.name url
FROM page parent
right join page siteMapNode on siteMapNode.pageid=parent.id and siteMapNode.active=1 and siteMapNode.hidden=0
left join page siteMapSubNode on siteMapSubNode.pageid=siteMapNode.id and siteMapSubNode.active=1 and siteMapSubNode.hidden=0
where SiteMapNode.name <> 'home' and parent.menu = '1' and parent.active = 1 and parent.hidden <> 1
order by parent.orderby, siteMapNode.orderby
for xml auto
I had backed up the db on the box in their office, restored that backup to my local testing db, and then imported it into Rackspace's db from my testing db. (All this middleman stuff is to get around their firewalls.) So to all intents and purposes, the source code, tables and queries used across all 3 servers are exact copies.
When I run that query in MSSQL, here's a short excerpt of the results I get:
Their server (Version unknown right now, I have to go through teamviewer to find out.) and my Server (MSSQL 2008 Server 10.0.2531 - I think maybe SP1)
Rackspace's Server (MSSQL 2008 server 10.0.4064 I think maybe SP2)
Any advice, hints, or ideas on why the Rackspace one acts so weirdly would be greatly appreciated. It seems obvious that it has something to do with the difference in servers, but I can't tell if it's the version, the SP, a setting, or what. If anyone has ever seen something similar I'd love to hear what you learned from it. I'm just a humble programmer, definitely not an SQL expert.
EDIT: Here is the schema of the table, id is the primary key, the poorly named pageid is actually more of a parent-page-id.
I've tried looking at it with and without for xml auto. When I take off for xml auto it returns the same results in a slightly different order, but when I change the 4th line of the query from siteMapNode.id to parent.pageid then the results show the same order. Adding xml auto back in shows the same results as the above images. I'll try experimenting with for xml path, thanks for the suggestion!
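The thread does not confirm a cause, but since the row order differs between servers, one thing that may be worth ruling out is order sensitivity in the nesting itself. As far as I understand it, FOR XML AUTO decides where a parent element starts by comparing adjacent rows, so the same rows in a different order can nest differently. A minimal Python sketch of that adjacency effect follows; it is not SQL Server's implementation, and the page titles are made up for illustration.

```python
from itertools import groupby

def nest_adjacent(rows):
    """Nest (parent, child) rows by adjacency only: a new parent group
    starts whenever the parent value differs from the previous row's."""
    return [(parent, [child for _, child in group])
            for parent, group in groupby(rows, key=lambda row: row[0])]

# Hypothetical page titles; the same three rows in two different orders.
rows_in_order = [("About", "Team"), ("About", "History"), ("Products", "Widgets")]
rows_interleaved = [("About", "Team"), ("Products", "Widgets"), ("About", "History")]

print(nest_adjacent(rows_in_order))     # "About" appears once with two children
print(nest_adjacent(rows_interleaved))  # "About" appears twice, split by "Products"
```

If that is what is happening, rows that tie on parent.orderby and siteMapNode.orderby have no guaranteed order, so adding a unique tie-breaker such as the id columns to the ORDER BY would be a cheap experiment.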

Related

Same code executed on different servers (same version) yields different results

Issue with Delphi legacy code. Added one line of code to correct one error and created a new error.
The new error is causing the same executable to yield different results on different servers (I switched the executable's pointer from the dev to the prod environment).
code:
sEscapedString:=stringreplace(sStringIn,'[','''+char(27)+''[',[rfReplaceAll]);
sEscapedString:=stringreplace(sEscapedString,']','''+char(27)+'']',[rfReplaceAll]);
sEscapedString:=stringreplace(sEscapedString,'''','''''',[rfReplaceAll]);// this line created new bug
result:=' like ''' + Trim(sEscapedString) + '%'''+' escape char(27) ';
When running the code against dev this query finds objects with the characters '[' and ']' in it
Against prod the query does not find those items:
The first thing I checked was the data: Exactly identical in both cases
The second thing I checked was SQL server versions (11.0.3128 on BOTH servers)
The third thing I am checking is settings on those servers:
DBCC USEROPTIONS; -- same on both
SELECT name, collation_Name FROM sys.databases -- same on both
select @@OPTIONS -- same on both.
Quoted identifiers are 'ON' for both servers
It comes down to the fact that I know one server is treating the escape character (chr(27)) differently than the other, but I do not know why.
Does anyone have a theory (or answer) as to why the 2 similar servers are treating the escape characters differently?
The goal here is getting the prod server to return values with '[' and ']', as setting up my system to work with the legacy code will take a LOT of additional time. I do have a fix for the code
sEscapedString:=stringreplace(sStringIn,'[','[[]',[rfReplaceAll]);
But the faster option would seem to be getting the server to read the values the same.
Update: We found the root cause of the difference and it was more mundane than what we expected, turns out the query we were running was actually executed twice. The second execution was missing the key piece on the production server.
The issue was resolved by moving the new line of code so that it executed first rather than last.
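A minimal Python sketch of why the position of the quote-doubling line matters at the string level. It mirrors the three stringreplace calls above under the assumption that they run in the order shown; the function names are hypothetical, and it says nothing about the double execution mentioned in the update.

```python
# The snippet the Delphi code inserts before each bracket: it closes the SQL
# string literal, concatenates char(27), and reopens the literal.
ESC = "'+char(27)+'"

def escape_quotes_last(text):
    """Order from the question: brackets first, quote doubling last."""
    text = text.replace("[", ESC + "[")
    text = text.replace("]", ESC + "]")
    return text.replace("'", "''")  # also doubles the quotes inside ESC

def escape_quotes_first(text):
    """Order after the fix: quote doubling first, brackets afterwards."""
    text = text.replace("'", "''")
    text = text.replace("[", ESC + "[")
    return text.replace("]", ESC + "]")

def like_clause(escaped):
    return " like '" + escaped.strip() + "%' escape char(27) "

print(like_clause(escape_quotes_last("a[1]")))
print(like_clause(escape_quotes_first("a[1]")))
```

With quote doubling last, the inserted snippet's own quotes are doubled, so +char(27)+ ends up as literal text inside the pattern instead of an escape character in front of each bracket. With quote doubling first, only quotes from the user's input are doubled and the concatenation survives.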

I would first try to find out if this SQL only causes different behaviour when it is sent from the application: by sending the SQL from an interactive SQL client tool to both servers.
To make sure that the manually tested SQL is exactly the same as in the application, I would try to log or capture the exact SQL as sent from the application as a text file and then paste its content to the SQL client tool.
If the server is the culprit, then using the SQL from a different client tool should cause the same difference with the two servers. If the client tool shows the same (correct) result on both servers, then something is going on in the Delphi application.
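A small Python sketch of the comparison step, assuming the SQL sent by the application has already been captured as text (how it is captured is not covered here). The function name and the sample statements are hypothetical; difflib is from the standard library.

```python
import difflib

def sql_diff(from_application, pasted_manually):
    """Return unified-diff lines between two SQL texts; empty list if identical."""
    return list(difflib.unified_diff(
        from_application.splitlines(),
        pasted_manually.splitlines(),
        fromfile="application",
        tofile="manual",
        lineterm="",
    ))

# Hypothetical captured statements, for illustration only.
captured = "SELECT name FROM items\nWHERE name like 'a[1]%' escape char(27)"
manual = "SELECT name FROM items\nWHERE name like 'a[1]%'"

for line in sql_diff(captured, manual):
    print(line)
```

An empty result means the manual test really does exercise the same statement as the application; any diff line points at the part that was retyped differently.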
p.s. upvoted, it is an interesting phenomenon

Select * not returning all columns - Coldfusion / SQL Server 2008

I am getting some strange behavior involving database queries that I have never seen before and I'm hoping you can illuminate the issue.
I have a table of data called myTable with some columns; thus far everything involving it has been fine. Now, I've just added a column called subTitle, and I notice that the SELECT * query that pulls in the data for a given record is not aware of that column (it says the returned query does not have a subTitle column), but if I explicitly name the column (select subTitle) it is. I thought perhaps that the ColdFusion server might be caching the query so I tried to work around it with cachedwithin="#CreateTimeSpan(0, 0, 0, 0)#" but no dice.
Consider the below code:
<cfquery name="getSub" datasource="#Application.datasourceName#">
SELECT subTitle
FROM myTable
WHERE RecordID = '674'
</cfquery>
<cfoutput>#getSub.subTitle#</cfoutput>
<cfquery name="getInfo" datasource="#Application.datasourceName#">
SELECT *
FROM myTable
WHERE RecordID = '674'
</cfquery>
<cfoutput>#getInfo.subTitle#</cfoutput>
Keeping in mind that record 674 has the string "test" in its subTitle column, the output of the above is
test
[[CRASH WITH ERROR]]
This doesn't make sense to me unless SQL Server 2008 has somehow cached the SELECT * query with the previous incarnation of the table, but the strange thing is if I run the query right from within SQL Management Studio there are no problems and it shows all columns with the select *
Frankly this one has me baffled; I know I can get around this by explicitly naming all the desired columns in the select query instead of using * (which is best practice anyway), but I want to understand why this is occurring.
I've worked with SQL Server 2005 for many years and never had something like this happen, which leads me to believe it might involve something new in SQL Server 2008; but then again the fact that the query works fine inside of the management studio doesn't jibe with that either.
===UPDATE===
Clearing the Template Cache in the CF admin will solve the issue

Yes, ColdFusion caches the <cfquery> SQL string. If the underlying table structure changes, the result might be an exception like you see it.
Work-arounds:
Recommended solutions:
If you have the development or enterprise version you can view your query cache in the Server Monitor and clear only the queries there. (comment from @Dpolehonski, thanks)
Otherwise, click Clear Template Cache Now in the ColdFusion Administrator (under Server Settings/Caching).
This will invalidate all cached CFML-Templates on the server and CF will re-compile them when necessary.
Quick and dirty:
Subtly change the query SQL, for example add a space somewhere. When you are on a development machine it's the quickest way to fix the issue.
This will invalidate the compiled version of this query only and force a re-compile.
(Note that removing the subtle change will trigger the error again since the old query text will remain cached.)
Brute-force:
Re-start the ColdFusion server. Brutal, but effective.

Or the quick and super dirty method:
<cfquery name="getInfo" datasource="#Application.datasourceName#">
SELECT
*, '#createUUID()#' as starQueryCacheFix
FROM
myTable
WHERE
RecordID = '674'
</cfquery>
Don't leave in production code though... it'll obsolete all of the query caching ColdFusion does. I did say it was super dirty ;)

Empty XML Columns during SQL Server replication

We have a merge replication setup on SQL Server that goes like this: 1 SQL server at the office, another SQL server traveling around the world. The publisher is the SQL server at the office.
In about 1% of the cases, two of our tables with a column of XML Data type (not bound to a schema) are replicated with rows containing empty XML columns. ( This only happened when data is sent from the "traveling server" back home, but then again, data seems to be changed more often there ). We only have this in prod. environment ( WAN replication ).
Things I have verified:
The row is replicated, as the last modification date on the row is refreshed but the xml column is empty. Of course it is not empty on the other SQL Server.
No conflicts are displayed in the replication conflicts UI.
It is not caused by the size of the data inside the XML Column as some are very small.
Usually, the problem occurs in batch. ( The xml column of 8-9 consecutive rows will be empty )
The problem occurs if a row was inserted OR updated. No pattern there.
The problem seems to occur, but this is pure speculation on my part when the connection is weaker. ( We've seen this problem happen more often when the server was far away as compared to when it was close by. )
Sorry if I have confused some things. I am not really a DBA, more of a dev with knowledge of SQL, but since the application using the database keeps getting blamed for the problems ( the XML column must not be empty!! ) I have taken it to heart to try and find the problem instead of just manually patching the data each time ( What's the use of replication if you have to do that? )
If anyone could help out with this problem, or at least suggest some ways of being able to debug / investigate this it would be greatly appreciated.
I did search a lot on Google and I did find a hot fix, but we do have the latest service pack and the problem seems a bit different.
fyi: We have a replication setup locally here but the problem never occurs. We will be trying a WAN simulator on it as well to see if that can help.
Thanks

Edit: hot fix is now available for my issue: http://support.microsoft.com/kb/2591902
After logging this issue with Microsoft, we were able to reproduce the problem without a slow link ( Big thanks to the competent escalation engineer at Microsoft ). The repro is a bit different from our scenario, but highlights the timing issue we were getting perfectly.
Create 2 tables – One parent one child (have a PK-FK relationship)
Insert 2 rows in the parent table
Set up replication – configure merge agent to run ON DEMAND
Sync
Once all is replicated:
On the PUBLISHER: delete one row from the parent table
On the SUBSCRIBER: Insert 2 rows of data that references the parentid you deleted above
Insert 5 rows of data that references the parentid that will stay in the table
Sync, Merge agent will fail, Sync again, Merge agent will succeed
Missing XML data on the publisher on the 5 rows.
Seems it is a bug that is in SQL Server 2005/2008 and 2008R2.
It will be addressed in a hot fix in 2008 and up. ( As SQL Server 2005 is no longer being altered )
Cheers.

You may want to start out by slapping a bandaid on this perplexing situation to buy some time to fully investigate and fix (or more likely get MS to fix it). SQL Data Compare is an excellent tool that might help.

Figured I'd put an update here as this issue got me a few gray hairs and I am somewhat closer to a solution now.
I finally had some time to work on this and managed to reproduce this issue in our test environment, using a WAN simulator and slowing down the link and injecting some random packet loss. ( to best simulate the production environment where the server is overseas on a really bad line ).
After doing some SQL tracing, and some verbose logging here are my conclusions:
When replicating a row with an XML column, the process is done in 2 steps. First an insert is done of the full row but with an empty string for the XML column. Right after, an update is done, this time with the XML column having data. Since the link is slow, in some situations a foreign key violation occurred.
In this scenario, Table2 depends on Table1. After finishing replicating Table1, and starting to replicate Table2 (enumeration of inserts/updates, which takes time on a slow link), some entries were added to Table1 and Table2. Therefore some inserts on Table2 failed because the Table1 entries were not in the database and were only going to be replicated in the next batch. The next time the replication occurred, no more foreign key violations occurred; however, when it tried to insert the row that had previously failed in Table2 ( XML column row ), the update part of it was missing ( I could see that in the SQL profiler ), and that is why the row ended up with an empty XML column after all was done.
Setting "Enforce for replication" to false on the foreign keys seems to address the problem, however I do still think that this whole process should work with the option set to true.
I logged a support call with Microsoft for this. I have sent the traces and logs to Microsoft and will see what they have to say.
I've read this article: http://msdn.microsoft.com/en-us/library/ms152529(v=SQL.90).aspx. But for me, setting this option to false is kind of a work around, no?
What do you guys think?
ps: Hope this is clear, tried to explain it the best I could. English is not my first language.

Problem with a MS Access query after a "Compact and repair" operation

I have an Access application that use the classical front-end/back-end approach. Yesterday, the backend got corrupted for a reason I don't know. So I opened the backend with Access 2003 and access asked me if I wanted to repair the file, I said yes and it seemed to work.
I can open the database see the tables contents and run most of the queries.
However there is an access query that doesn't work with a specific where clause.
Example :
// This works in the original DB, but not in the compacted one :
SELECT a, b, c
FROM tbl1 INNER JOIN tbl2 ON tbl1.d = tbl2.d
WHERE e = 3 AND tbl2.f = 1;
// This works in both the original and the compacted one :
SELECT a, b, c
FROM tbl1 INNER JOIN tbl2 ON tbl1.d = tbl2.d
WHERE e = 3;
When I try to run the failing query, nothing happens. The Access process starts to use most of the CPU and the GUI stops responding. If I run the query from the query editor, I can use Ctrl+Break to stop the execution. I tried giving the query a lot of time and it didn't help.
I've checked the execution plan in showplan.out and it seems correct (at least it should not take forever to execute).
I tried to compact the DB again. I tried to import the tables in a new DB. I even tried to import the tables and their data in a mdb file that was in a now good state (from a backup).
Anyone have an idea?

Sounds like an index was corrupted and when that happens, it's dropped during the compact. Check for a system table called MSysCompactErrors -- you'll have to show hidden objects and/or system objects in Tools | Options | VIEW.
Never compact a Jet MDB without making a backup beforehand. Because of that rule, the COMPACT ON CLOSE function is completely useless, as it's not cancellable, so you always make sure it's turned off in all MDBs.

I don't know what type of meta data Access brings along when it imports a table from one database into another one. If the meta data is corrupted, importing the table to another database wouldn't necessarily resolve the problem. If practical, you might try creating the tables from scratch in a brand new database and then just exporting and importing (or copying and paste appending) the data into the new database.
I've never seen a table get corrupted like this in such a small database, although with Access anything is possible. Could there be something wrong with the data?

I'd try recreating the query fresh (new name, etc.), and see what happens.
You could even try copying it (even within the same DB or to a brand new one). If that works, the worst case scenario is you have to copy all the objects across to a new DB.

Is there an index on the field tbl2.f?
Also try going into that table in datasheet view, sort tbl2.f in ascending sequence and see if there is anything really strange in the first or last records.

Do you have access to a SQL Server installation? You could use the Upsizing Wizard under the Tools -> Database Utilities menu to copy the data to SQL Server, and see if you get the same problem there.

SQL query giving wrong result on linked server

I'm trying to pull user data from 2 tables, one locally and one on a linked server, but I get the wrong results when querying the remote server.
I've cut my query down to
select * from SQL2.USER.dbo.people where persId = 475785
for testing and found that when I run it I get no results even though I know the person exists.
(persId is an integer, db is SQL Server 2000 and dbo.people is a table by the way)
If I copy/ paste the query and run it on the same server as the database then it works.
It only seems to affect certain user ids as running for example
select * from SQL2.USER.dbo.people where persId = 475784
works fine for the user before the one I want.
Strangely I've found that
select * from SQL2.USER.dbo.people where persId like '475785'
also works but
select * from SQL2.USER.dbo.people where persId > 475784
brings back records with persIds starting at 22519 not 475785 as I'd expect.
Hope that made sense to somebody
Any ideas ?
UPDATE:
Due to internal concerns about doing any changes to the live people table, I've temporarily moved my database so they're both on the same server and so the linked server issue doesn't apply. Once the whole lot is migrated to a separate cluster I'll be able to investigate properly. I'll update the update once this happens and I can work my way through all the suggestions. Thanks for your help.

The fact that LIKE operates is not a major clue: LIKE forces integers to string (so you can say WHERE field LIKE '2%' and you will get all records that start with a 2, even when field is of integer type). Your incorrect comparisons would lead me to think your indexes are corrupt, but you say they work when not used via the link... however, the selected index might be different depending on the use? (I seem to recall an instance when I had duplicate indexes and only one was stale, although that was too long ago to recall the exact cause).
Nevertheless, I would try rebuilding your index using the DBCC DBREINDEX (tablename) command. If it turns out that doing so fixes your query, you may want to rebuild them all: here is a script for rebuilding them all easily.

Is dbo.people a table or a view? I've seen something similar where the underlying table schema had been changed and dropping and recreating the view fixed the problem, although the fact that the query works if run directly on the linked server does indicate something index based..

Is the linked server using the same collation? Depending on the index used, I could see something like this perhaps happening if the servers were not collation compatible, but the linked server was set up with collation compatible (which tells Sql Server it can run the query on the remote server).

I would check the following:
Check your definition on the linked server, and confirm that SQL2 is the
server you expect it to be
Check and compare the execution plans both from the remote and local servers
Try linking by IP address rather than name, to ensure you have the proper machine
Put the code into a stored procedure on the remote machine, and try calling that instead

Sounds like a bug to me - I've read of some issues along these lines, but can't remember specifically what. What version of SQL Server are you running?
select * from SQL2.USER.dbo.people where persId = 475785
for a PersID which fails how does:
SELECT *
FROM OpenQuery(SQL2, 'SELECT * FROM USER.dbo.people WHERE persId = 475785')
behave?
