Creating XML Output with Carriage Returns from SQL Server 2014 - sql

I've written code in SQL Server to create an XML output. However, this exports with no carriage returns.
I initially built a workaround with a replace statement around the entire XML output code that would embed carriage returns between the nodes, but because that only allows me to export a small amount of data at a time, it's not sufficient long-term. When I try to run this on larger datasets, it truncates the text around 65000 characters.
I've tried to cast the entire statement as nvarchar(max) to increase the output size but that doesn't seem to work either. Does anybody have any recommendations for how to do this that isn't just find+replace once the file has already been output from SQL?

First, I would educate the client first. I would imagine it is to make it human readable, but it also expands the size of the returned set. They will likely stick to their guns, but education often stops people from spending money on stupid crap.
Second, I would not do this in SQL Server. This is a user interface type of task (including service endpoints as "user" interface here) and not a task to be done in the database. Doing it outside of SQL Server gives you better access to the XML DOM, which can help if they are truly CRLF and not the &#__; numeric equivalents. If the later, you will have to do a replace function.
If you HAVE to do this in SQL Server, grab the XML result and then replace. I would do this the easy way and replace > with >CRLF and see if that is acceptable, as it is less time consuming. Without the DOM it is difficult to know the difference between open tags and end tags. You can find the right tag using regex, if you want to go that far, but SQL Server's implementation is not as good as many programming languages, so this will be time consuming.
Ultimately, if they are willing to pay you for something that does not make a difference, then that is their baby, but it is a useless exercise IMO.

Related

Looking for an explanation of this attempted SQL injection query

Looking through my logs I found the following query string as an attempt to perform a SQL injection, probably from an automated tool:
(select*from(select+sleep(10)union/**/select+1)a)
From what I can tell, it’s attempting a timing based attack to see if any of the tables in my database start with “a” - the sleep function will only run if the union query matches something? But I am a bit confused about other parts of the attack:
Why are there plus signs between parts of the query?
Why is there a comment as part of the query string?
Would be interested in any answers - I’m fairly certain my site hasn’t been compromised as I haven’t scanned further activity on that query and can’t get it to execute myself, so just wondering if my intuition was correct. Cheers!
I don't know what the point of this is, nor what the point is of trying to figure out the point. Injections are easier to block than to reverse engineer, and the latter doesn't contribute much to the former.
The point of the + and the /**/ are probably pretty much the same, they separate tokens without the use of whitespace. Presumably someone thinks whitespace is going to trigger some kind of alarm or blockage.
The 'a' is just an alias, and is probably there to avoid the error 'ERROR: subquery in FROM must have an alias'
This won't work in stock PostgreSQL because there is no function spelled sleep. They might be targeting a different DBMS, or maybe PostgreSQL with a specific app/framework in use which creates its own sleep function.
The sleep is probably there in case the system doesn't return meaningful messages to the end user. If it takes 10 seconds to get a response, then you know the sleep got executed. If it immediately returns, you know it didn't execute, but don't know why it didn't.
This is meant to detect a SQL injection (probably through an HTML parameter) via a timing attack. The inserted comments (as other people have mentioned) are meant to remove whitespace while still allowing the query to parse in an attempt to fool custom (badly designed) sanitization. The "+" is likely meant to be decoded into a space after passing through HTML decoding.
If you replace the whitespace and add indentation it's easier to see what's going on:
select * <-- match any number of columns on the original query
from
(select <-- nested sub-query in the from clause
sleep(10) <-- timing attack meant to detect whether the SQL ran
union <-- not sure why the union is needed
select 1) a <-- alias the subquery to "a"
) <-- close off matching parens in injected SQL?
I don't think this is attempting to look for tables that start with a, simply run a sleep on a possible recursive query, which could cause your database trouble, if a bunch of them execute.
The + signs are likely an attempt to do some string concatenation... That would be my guess
Regardless I would strongly look at tracing back where this originated from and sanitizing your inputs on your site so raw inputs ( potential sql ) is not being dropped into queries.

Does redshift store failed queries?

I wish to analyze the queries executed on certain redshift warehouse (not mine).
In order to do so I'm using a query with a join on stl_querytext and stl_query.
My question is how come I'm also getting illegal queries (I.E queries with wrong sql syntax)?
When I've tried it in my local redshift I haven't seen those. Also, couldn't find relevant documentation.
Is this a configuration issue? And in case I'm supposed to those queries is there a way to know those are illegal ones?
Thanks,
Nir.
So stl_querytext breaks long queries into parts identified by sequence number. I hope you are reconstructing the parts into the original query as a first step. This can be done with listagg() function as long as the resulting query doesn't over the max tex field (about 320 parts).
Now this is not enough to get valid SQL back in all cases because you need to treat combining the parts differently depending if the section of the query is inside or outside a text string in the query. (Is white space needed between parts or not)
I've done this exact process a bunch so this is doable. I don't have a perfect process on the whitespace question, I get close. Maybe others know the expression to get an exact recreation of the query from stl_querytext.

.Net Parsing Fixed Width Data... From a Concatenated, Single, Fixed-Width Column

I was bored and looking at old code that runs like molasses on a cold day. I found that a group of tables in our accounting system - each with 500,000 records of ~20 datapoints - that use a single column of concatenated, fixed-width values instead of separate columns. (Fixing the tables isn't an option.) An old .net ETL project is grabbing all records, doing a bunch of substrings on each record to set an object's corresponding attributes, then sending the object to merge with production data via a stored proc.
The way it is working is fine. It works. And, to be perfectly honest, I doubt I'll be given the go-ahead to fix it even if I come up with a better solution, but I was curious to see if anyone knew of a better way of doing this, because it's not entirely unlikely that I'll face a situation like this in the future.
I was thinking that if there was a way to use the TextFieldParser to parse a static string instead of a file/stream that might be a valid idea. Or, instead, I could write the entire table to a text file and then use the TextFieldParser to send data to the SProc. http://www.dotnetperls.com/textfieldparser does show that TextFieldParser is quite a bit faster than split, which I would assume is tantamount to the string manipulation our project is currently doing with substring. So there may be something to that idea.
Or perhaps the whole, old project should be dumped for a shiny new SSIS project. Would it also have to write the records to a flat file before importing into SQL? Or can it import directly from the table?
Thank you in advance!

Replace Character in String with SQL Server Table Trigger on Insert\Update

**Answered
I am attempting to create a trigger that will replace a character ’ (MS Word Smart Quote) with a proper apostrophe ' when new data is inserted or updated by a user from our website.
The special apostrophe may be found anywhere on a 5000 NVarchar column and may be found multiple times in the same string.
Any easy replace statement for this?
REPLACE(Column,'’','''')
I'm going to argue that you should probably look at doing this in your applications instead of from within SQL Server. That's NOT the answer you're looking for - but it would probably make more sense.
Typically, when I see questions like this I instantly worry about devs trying to 'defeat' SQL Injection. If that's the case, this approach will NEVER work - as per:
http://sqlmag.com/database-security/sql-injection-beyond-basics
That said, if you're not focused on that and just need to get rid of 'pesky' characters, then REPLACE() will work (and likely be your best option), but I'd still argue that you're probably better off tackling 'formatting' issues like this from within your applications. Or in other words, treat SQL Server as your data repository - something that stores your raw data. Then, if you need to make it 'pretty' or 'tweak' it for various outputs/displays, then do that on the way out to your users by means of your application(s).

how to update field names automatically after updating SQL

I am changing the command text for a data set inside the .rdl ffile:
I would like to know how can I update the resulting fields that are returned by the select statement:
I know that these fields must be automatically generated, so I was wondering if it's possible to update them right after editing the SQL code inline??
Usually when someone wants to have a look at the data in command text they are wanting it for reference to an end user(from what I have seen). You may want to amend it but ultimately with reporting your first goal should be: "What am I doing this for?" If your goal is dynamic creation at runtime then I would avoid this and offer a few other suggestions:
Procertize it. Making a stored procedure if you have the know how in SQL Server is a convenient and fast way to get what you want and you can optimize it if you know what you are doing with your SQL FU to get good results. The downside would be if you work with multiple environments you have to deploy your code for the TSQL as well as the RDL file.
Use an expression to build the dataset at runtime. In cases where I have been told that the query itself was not properly optimized by other developers they have mentioned doing this. I myself do not always see the advantage of doing this versus just having your predicate construction work well with good indexing on the source engine. Regardless you can build your dataset at runtime. It would be similar to hitting 'fx' next to the text and then putting in something like this(assuming you have a variable named #Start):
="Select thing
from table
Where >= " & Parameters!Start.Value
Again I have not really seen if this is really that much faster than:
Select thing
from table
Where >= #Start
But it is there if you just want to build it dynamically.
You can try to build your expression dynamically from parameters being PART of the select statement. SSRS is all about the 'expressions' and what you can do with them. Once you jump in and learn how they apply to everything you can go nuts so to speak on using them. A general rule though is the more of them you use and rely on the slower your reports will become.
I hope some of this may help, I would ask first is something dynamic due to a need to be event driven or is performance related.