Spellcheck Data in SQL Server

Spellcheck Data in SQL Server - sql

We have a large database with hundreds of tables that have lookup text stored in them with spelling errors from when the application was originally written offshore. Is there a way to run a spellcheck on the data in sql server so we can find all of these errors quickly?

One thought - You could write a CLR function to access the spell checker included with Microsoft Word. See: Walkthrough: Accessing the Spelling Checker in Word for a starting point.

Export your data to Excel.
Distribute the sheets to multiple people to break up the work load
Have them run spell check to identify the misspelled words.
Gather up the bad records and create update statements from the db.
Don't offshore applications where English spelling is important.
EDIT: I should have pointed out, you might use the spell check to identify the issues but you want human eyes on the actual data as well as suggested fixes. Thus, being able to spread the work around to run the spell check is important. There won't be any fully automated solution that will catch everything, or worse it will catch too much and muck the data up worse.

Just stumbled across this one.
Another way to do this is using Microsoft Access, using ODBC, link to the tables on SQL side, then open the table in access and run the spellcheck in Access, you can update the corrections on the fly.

Related

Best way to save SQL versions while working on Tableau?

I am working using Tableau and have to write down multiple different SQL each time, while making new data sources.
I have to save all changes on SQL for every data source.
Currently I would paste the SQL on notepad and save them on separate folder in my computer, along with description of the changes.
Is there any better way to do this?

Assuming you have permission to create objects in the database, begin by creating database views, As #Nick.McDermaid commented.
Then, instead of using Custom SQL data source in Tableau, just connect to the View as if it were a table.
If you need to track the changes to these SQL views of your data, you will need to learn how to use source control for the .sql files that can be scripted from within SQL Server Management Studio:
Your company or school may have a preferred source control system already in use, in which case you should use that. If they don't, or if you are learning at home, then Git and Subversion are popular open source choices.
There are many courses available on learning platforms like Coursera that will teach you how to learn how to use those systems.

I had similar problem as you.
We ended up writing the queries in SQL Editor SQL Work bench (https://www.sql-workbench.eu/), then managed the code history and performed code peer-review (logic, error check, etc) in team shared space (like confluence).
The reasons we did that is
1) SQL queries are much easy to write on Work Bench
2) Code review is a must! You will find through implementing a review process more mistakes than you could ever think about
3) The shared space is just really convenient as it is accessible by everyone, and all errors are documented. After sometimes you get a lot of visible knowledge accumulated.
I also totally agree with Nick as this is one step to a reporting solution. But developing a whole reporting server is heavy, costly and takes time. Unless management are really convinced of the importance of developing a reporting solution, you may have to get a workaround with queries and Tableau (at least that was the case for us)

A little late to the party, but I would suggest you simply version the tableau workbook. The contents of the workbook are XML, so perfect for versioning using file based tools (Dropbox, One Drive, etc.) or source control (git, etc.). The workbooks themselves are usually quite small, so just make sure to keep the extract data separate if you use it.

Search entire database in Toad for searchterm

I need a way to search my entire Oracle database for a column that contains the value 'Beef'. What I need is the column name and table name so I can complete my query. Beef is an animal feed type and it is a known value in my database. I just don't know where.....
Essentially, we have a very old very clunky application that I am using SQL data sets generated from Toad freeware to get around. The application shows us laboratory testing information for our companies. The catch is you can only look at one company's lab report at a time, and as a I said, it takes FOREVER. We have over 700 companies we regulate so this is not an option (oh and you can't copy any of the fields).
I have already generated a query that gets me 99% of the information I need until I realized I was missing one column value that for some unearthly reason isn't included with the other attributes of the lab samples. We have around 100 or so tables and many of them aren't even in use. It's a poorly organized database and I've tried manually going through it and simply cannot find the stupid column and I have no idea what it could be named (naming conventions here seem not to apply).
A monkey wrench is: although I've done a decent amount of SQL coding for my job I'm not in IT. My job hooked me up with a read-only access of our database so I could run reports for them etc, but I'm not in the IT section so I don't get write privileges. So a lot of solutions I see that use DDL aren't available to me.
I guess it might be relevant, it looks like we're running oraClient10g.
I've tried this code that I got from here: community.Oracle
but as you can see I don't get any results.
I also tried the one suggested here at stackOverflow, but got a litany of errors so I abandoned that pretty quick. (I figured it's because of my read-only privileges and or my version of Oracle).
Any help would be greatly appreciated.

How to minimize bloating Access database having queries and Macros, no VBA code?

I am new to Access, I am a C programmer who also worked with Oracle. Now I am creating an Access database for a small business with Access front-end and SQL Server back-end. My database has about 30 small tables (a few hundreds records each) and a rather complicated algorithm.
I don't use VBA code because I don't have time for learning VBA, but I can write complicated SQL statements, so I use a lot of queries and macros.
I am trying to minimize the daily growth of my database. I've thought about splitting the Access DB. It doesn't make sense because my DB is rather small. After compacting its size is about 5 MB. The regular compact procedure is not convenient because my client's employees work from home any time they wish. So I need to create a DB that would bloat as slowly as possible.
I did some research and found a useful info: "the most common causes of db bloat are over-use of temporary tables and over-use of non-querydef SQL" (http://www.access-programmers.co.uk/forums/showthread.php?t=48759). Could somebody please clarify that for me? I have 3 questions about that:
1) I cannot help using temporary tables, I tried re-using the same table names in 2 ways:
a) first clear all records and then run an append query or
b) first run a Macro command "DeleteObject" (to free the space in full) and then re-create the temporary table.
Can somebody please advise which way is better in order to reduce the DB growth?
2)After running a stored query I cannot free the space like I did in C using VBA statement "query.close" (because I don't use VBA). But I can run Macro command "close query" after each "OpenQuery". Will it help or just double the length of my Macros?
3)Is it correct that I shouldn't use Macro commands RunSQL for simple SQL statements and create stored queries instead? Even though it will create additional stored queries.
Any help would be appreciated!

Ah the joys of going back to lego after being a brickie! :)
1) Access is essentially a text-based file system. When you delete a record or a table, is persists in the file but with a flag which marks it to be ignored. When you compact an Access db, the executable creates a new file, and moves everything unmarked into that, then deletes the old file. You can see this actually happening if you use Windows Explorer to monitor the folder during this process.
You mention you are using SQL Server, is there a reason you are not building the temp tables on the server? This would be a faster, cleaner and all-round more efficient solution - unless we've missed something. Failing that, you will have to make the move from macros, but truthfully, if you can figure out C, then VBA will be like writing a memo!
http://www.access-programmers.co.uk/forums/showthread.php?t=263390
2) issuing close commands for saved queries in Access has no impact on the file-bloat issue, they just look untidy
3) yes, always used saved queries, since this allows Access to compile the SQL in advance, and optimise execution.
ps. did you know you can call SQL Server Stored Procs from within an Access saved query?
https://accessexperts.com/blog/2011/07/29/sql-server-stored-procedure-guide-for-microsoft-access-part-1/
If at all possible, you should look for ways to dispense with the Access back-end, since you already have SQL Server as the backend - though I suspect you have your reasons for this.

what method is the most used and efficient to copy "database schema" to other servers not the db!

servers all sql server 2008, and win xp
i have the following task
create a huge database, DONE
distribute it to the 20 waiting servers!!
if there were two or three i would have taken the trouble of creating the db's on all of using sql server managemnt stdio
but i am guessing that there is an efficient way
please note,
only the copy of the database structure, the schema is needed not the values within the cells!
thank you

Or of course you should have been creating the scripts as you went along and putting them in Source control. Then you would have exactly which scripts you needed for this version of the software and be used to doing the same thing for later modifications. You would also script the data inserts for any lookup tables you need to build.
Not having that, you can script the entire database. or use a SQL compare tool. But I strongly urge you to start treating database code like all other code and scripting, storing it in source control and versioning it. Life is so much better when you do that.

What Gabriel McAdams has shown, or, Redgate SQL Compare does this very nicely also.

If you can spare the moolah, using a tool like Red Gate's SQL Packager is an option i have used in the past and it works well!
The tool can do a lot more as well and may not be worth the spend though if you do not need the other features!
In that case, Gabriels'option above is definitely the easiest one to go with!

Getting a significant amount of data into a SQL Server (Express) database at time of deployment

For most database-backed projects I've worked on, there is a need to get "startup" or test data into the database before deploying the project. Examples of startup data: a table that lists all the countries in the world or a table that lists a bunch of colors that will be used to populate a color palette.
I've been using a system where I store all my startup data in an Excel spreadsheet (with one table per worksheet), then I have a utility script in SQL that (1) creates the database, (2) creates the schemas, (3) creates the tables (including primary and foreign keys), (4) connects to the spreadsheet as a linked server, and (5) inserts all the data into the tables.
I mostly like this system. I find it very easy to lay out columns in Excel, verify foreign key relationships using simple lookup functions, perform concatenation operations, copy in data from web tables or other spreadsheets, etc. One major disadvantage of this system is the need to sync up the columns in my worksheets any time I change a table definition.
I've been going through some tutorials to learn new .NET technologies or design patterns, and I've noticed that these typically involve using Visual Studio to create the database and add tables (rather than scripts), and the data is typically entered using the built-in designer. This has me wondering if maybe the way I'm doing it is not the most efficient or maintainable.
Questions
In general, do you find it preferable to build your whole database via scripts or a GUI designer, such as SSMSE or Visual Studio?
What method do you recommend for populating your database with startup or test data and why?
Clarification
Judging by the answers so far, I think I should clarify something. Assume that I have a significant amount of data (hundreds or thousands of rows) that needs to find its way into the database. This data could be sourced from various places, such as text files, spreadsheets, web tables, etc. I've received several suggestions to script this process using INSERT statements, but is this really viable when you're talking about a lot of data?
Which leads me to...
New questions
How would you write a SQL script to take the country data on this page and insert it into the database?
With Excel, I could just copy/paste the table into a worksheet and run my utility script, and I'd basically be done.
What if you later realized you needed a new column, CapitalCity?
With Excel, I could take that information from this page, paste it into Excel, and with a quick text-to-column manipulation, I'd have the data in the format I need.
I honestly didn't write this question to defend Excel as the best way or even a good way to get data into a database, but the answers so far don't seem to be addressing my main concern--how to get all this data into your database. Writing a script with hundreds of INSERT statements by hand would be extremely time consuming and error prone. Somehow, this script needs to be machine generated, but how?

I think your current process is fine for seeding the database with initial data. It's simple, easy to maintain, and works for you. If you've got a good database design with adequate constraints then it doesn't really matter how you seed the initial data. You could use an intermediate tool to generate scripts but why bother?
SSIS has a steep learning curve, doesn't work well with source control (impossible to tell what changed between versions), and is very finicky about type conversions from Excel. There's also an issue with how many rows it reads ahead to determine the data type -- you're in deep trouble if your first x rows contain numbers stored as text.

1) I prefer to use scripts for several reasons.
• Scripts are easy to modify, and plus when I get ready to deploy my application to a production environment, I already have the scripts written so I'm all set.
• If I need to deploy my database to a different platform (like Oracle or MySQL) then it's easy to make minor modifications to the scripts to work on the target database.
• With scripts, I'm not dependent on a tool like Visual Studio to build and maintain the database.
2) I like good old fashioned insert statements using a script. Again, at deployment time scripts are your best friend. At our shop, when we deploy our applications we have to have scripts ready for the DBA's to run, as that's what they expect.
I just find that scripts are simple, easy to maintain, and the "least common denominator" when it comes to creating a database and loading up data to it. By least common denominator, I mean that the majority of people (i.e. DBA's, other people in your shop that might not have visual studio) will be able to use them without any trouble.
The other thing that's important with scripts is that it forces you to learn SQL and more specfically DDL (data definition language). While the hand-holding GUI tools are nice, there's no substitute for taking the time to learn SQL and DDL inside out. I've found that those skills are invaluable to have in almost any shop.

Frankly, I find the concept of using Excel here a bit scary. It obviously works, but it's creating a dependency on an ad-hoc data source that won't be resolved until much later. Last thing you want is to be in a mad rush to deploy a database and find out that the Excel file is mangled, or worse, missing entirely. I suppose the severity of this would vary from company to company as a function of risk tolerance, but I would be actively seeking to remove Excel from the equation, or at least remove it as a permanent fixture.
I always use scripts to create databases, because scripts are portable and repeatable - you can use (almost) the same script to create a development database, a QA database, a UAT database, and a production database. For this reason it's equally important to use scripts to modify existing databases.
I also always use a script to create bootstrap data (AKA startup data), and there's a very important reason for this: there's usually more scripting to be done afterward. Or at least there should be. Bootstrap data is almost invariably read-only, and as such, you should be placing it on a read-only filegroup to improve performance and prevent accidental changes. So you'll generally need to script the data first, then make the filegroup read-only.
On a more philosophical level, though, if this startup data is required for the database to work properly - and most of the time, it is - then you really ought to consider it part of the data definition itself, the metadata. For that reason, I don't think it's appropriate to have the data defined anywhere but in the same script or set of scripts that you use to create the database itself.
Test data is a little different, but in my experience you're usually trying to auto-generate that data in some fashion, which makes it even more important to use a script. You don't want to have to manually maintain an ad-hoc database of millions of rows for testing purposes.
If your problem is that the test or startup data comes from an external source - a web page, a CSV file, etc. - then I would handle this with an actual "configuration database." This way you don't have to validate references with VLOOKUPS as in Excel, you can actually enforce them.
Use SQL Server Integration Services (formerly DTS) to pull your external data from CSV, Excel, or wherever, into your configuration database - if you need to periodically refresh the data, you can save the SSIS package so it ends up being just a couple of clicks.
If you need to use Excel as an intermediary, i.e. to format or restructure some data from a web page, that's fine, but the important thing IMO is to get it out of Excel as soon as possible, and SSIS with a config database is an excellent repeatable method of doing that.
When you are ready to migrate the data from your configuration database into your application database, you can use SQL Server Management Studio to generate a script for the data (in case you don't already know - when you right click on the database, go to Tasks, Generate Scripts, and turn on "Script Data" in the Script Options). If you're really hardcore, you can actually script the scripting process, but I find that this usually takes less than a minute anyway.
It may sound like a lot of overhead, but in practice the effort is minimal. You set up your configuration database once, create an SSIS package once, and refresh the config data maybe once every few months or maybe never (this is the part you're already doing, and this part will become less work). Once that "setup" is out of the way, it's really just a few minutes to generate the script, which you can then use on all copies of the main database.

Since I use an object-relational mapper (Hibernate, there is also a .NET version), I prefer to generate such data in my programming language. The ORM then takes care of writing things into the database. I don't have to worry about changing column names in the data because I need to fix the mapping anyway. If refactoring is involved, it usually takes care of the startup/test data also.

Excel is an unnecessary component of this process.
Script the current version the database components that you want to reuse, and add the script to your source control system. When you need to make changes in the future, either modify the entities in the database and regenerate the script, or modify the script and regenerate the database.
Avoid mixing Visual Studio's db designer and Excel as they only add complexity. Scripts and SQL Management Studio are your friends.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas