I am getting a DB collation error when loading data - pentaho

Hi, I am loading data from a MySQL staging database to a MySQL destination.
I get this error: Illegal mix of collations (latin1_swedish_ci, COERCIBLE) and (latin1_german1_ci, COERCIBLE) for operation '='
Does this have anything to do with Pentaho? The same job runs fine on the production server but fails on the dev server.

Probably not Pentaho, since it works in one environment but not the other. Try:
Moving the code from your prod box to your dev box to make sure you didn't introduce any changes unintentionally.
Checking versions and drivers. Are your MySQL instances the same version? Are they supported by Pentaho? What about your drivers, and are they all stored in the correct places? Make sure you don't have two copies of the MySQL driver in different folders, which can cause conflicts.
Running the job with the log level set to Row level to get the most detailed messages about what is occurring. It could give you important clues.
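Because the job behaves differently on the two servers, it may also be worth comparing their collation settings side by side. Here is a minimal sketch, assuming you have copied the output of SHOW VARIABLES LIKE 'collation%' from each server into a dictionary. The function name diff_settings and all the values are hypothetical sample data, not taken from the question.

```python
def diff_settings(prod, dev):
    """Return {name: (prod_value, dev_value)} for settings that differ."""
    names = sorted(set(prod) | set(dev))
    return {
        name: (prod.get(name), dev.get(name))
        for name in names
        if prod.get(name) != dev.get(name)
    }


# Hypothetical values, as they might be copied from
# SHOW VARIABLES LIKE 'collation%' on each server.
prod_vars = {
    "collation_connection": "latin1_swedish_ci",
    "collation_database": "latin1_swedish_ci",
    "collation_server": "latin1_swedish_ci",
}
dev_vars = {
    "collation_connection": "latin1_swedish_ci",
    "collation_database": "latin1_german1_ci",
    "collation_server": "latin1_swedish_ci",
}

for name, (prod_value, dev_value) in diff_settings(prod_vars, dev_vars).items():
    print(f"{name}: prod={prod_value} dev={dev_value}")
```

Any setting that shows up in the output is a candidate explanation for why the same job fails only on dev.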

Related

Data Migration Assistant

I am using Data Migration Assistant (DMA) to assess compatibility issues when migrating a SQL Server database to Azure SQL. After running for a couple of minutes, it throws an error saying "The file contains the XML node type {0}. This type is unsupported or in an unsupported location." I have successfully assessed other databases using DMA, but this particular database always aborts with this error.
I decided to go ahead and migrate the database using the wizard in SSMS (Deploy Database to Microsoft Azure SQL Database) and ran into several compatibility issues that showed up as errors. The database had several triggers, created by a third-party database tool, that referred to table objects using three- and four-part names, which are not supported in Azure SQL. There were several other errors in addition to these, but I decided to delete these triggers first and run Data Migration Assistant again. This time it ran to completion and I got a compatibility report. I am not sure whether the original error was caused by the sheer number of issues found or by something in the triggers I deleted.

SSIS Error: VS_NEEDSNEWMETADATA

I'm currently updating all of our ETLs (originally built in BIDS 2008) using Visual Studio 2015 and redeploying them to a new reporting server running SQL Server 2016 (previously 2008 R2).
While updating one of the ETLs and trying to run it on the new server, I got this error:
The package execution failed. The step failed.
Sometimes it also produces this error:
Source: Load Fact Table SSIS.Pipeline Description: "Copy To Fact Table" failed validation and returned validation status "VS_NEEDSNEWMETADATA".
I've tried deleting and re-adding the OLE DB Destination and the connection strings, and opening the column mappings to refresh the metadata. I also recreated the whole Data Flow Task, but I'm still getting the same error.
The package runs fine on my local machine.
UPDATE:
I started taking the package apart and running only pieces of it to try and narrow down which part was failing. It seemed to be failing on loading into the staging table but I couldn't find out why.
I eventually decided to just try and re-create the whole thing. After re-creating the entire package, still no luck. The picture below is from the event viewer on the server itself but it didn't give me any new information.
[Image: package error from Event Viewer]
I tried all the solutions provided above and on other sites. Nothing worked.
A friend suggested a fix that worked for me.
Here are the steps:
Right click on the Source/Target Data flow component.
Go to Advanced Editor -> Component Properties
Find ValidateExternalMetadata and set it to False.
Try your luck. This is a pathetic issue that left me clueless for 2 days.
I finally found the issue and here's how I did it.
Because the error messages I was getting from SSMS weren't very insightful, I first connected to the server over Remote Desktop. Then I went to Administrative Tools > Event Viewer and then Windows Logs > Application to see if the failed event would provide greater detail.
It still didn't give me much.
The next step was to run the package from the command line, because the messages there should be more verbose. I opened cmd, changed to the directory my package was in, and ran:
DTEXEC /FILE YourPackageName.dtsx
Finally, the error message here showed a missing column in the tables the package was trying to write to. I added those columns and voila!
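If you need to repeat this for several packages, the command-line run can be scripted. The sketch below assumes dtexec is on the PATH of the machine where it runs; build_dtexec_command and run_package are hypothetical helper names. It adds dtexec's /REPORTING option with level V to request verbose reporting.

```python
import subprocess


def build_dtexec_command(package_path, verbose=True):
    """Build the argument list for running a package file with dtexec."""
    command = ["dtexec", "/FILE", package_path]
    if verbose:
        # /REPORTING V asks dtexec for verbose reporting.
        command += ["/REPORTING", "V"]
    return command


def run_package(package_path):
    """Run the package and return (exit_code, combined console output)."""
    result = subprocess.run(
        build_dtexec_command(package_path),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr


# Only prints the command; call run_package() on a machine with dtexec installed.
print(build_dtexec_command("YourPackageName.dtsx"))
```

Searching the captured output for the failing component name is usually quicker than reading the Event Viewer entry.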
As stated in comments,
if it runs ok in your development environment, then the problem isn't with the package, it's with the scheduled job on the server. Try recreating that.
If that doesn't work,
It seems like the server has a cached instance of the package it's using instead of the updated one. Try renaming your package and creating a new job with the new package name and see if that works.
If that doesn't work,
all I can recommend at that point is to cut the package down until it succeeds, then add the next step that fails.
From your solution, it sounds like the development environment is more forgiving of schema changes than the deployed one. Glad you were able to resolve it; eliminating clutter helps.
I had the same problem, and my issue was a difference between two environments: the same field in the same table was written with a capital letter in one and without in the other. So the name was the same apart from this small difference (e.g. isActive vs IsActive).
This came from a refactoring effort where we used VS database publish, which did not update the field name.
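A quick way to spot this kind of drift is to compare the column names of the same table in both environments. The sketch below is only an illustration: compare_columns is a hypothetical helper, and the column lists are made-up sample data that you would replace with names read from each database.

```python
def compare_columns(expected, actual):
    """Compare two column-name lists and classify the differences.

    Returns a dict with:
      missing   - expected names with no match in actual, even ignoring case
      case_only - (expected, actual) pairs that differ only in letter case
    """
    actual_exact = set(actual)
    actual_by_lower = {name.lower(): name for name in actual}
    missing, case_only = [], []
    for name in expected:
        if name in actual_exact:
            continue
        match = actual_by_lower.get(name.lower())
        if match is None:
            missing.append(name)
        else:
            case_only.append((name, match))
    return {"missing": missing, "case_only": case_only}


# Hypothetical column lists for the same table in two environments.
dev_columns = ["Id", "IsActive", "LoadDate"]
server_columns = ["Id", "isActive"]

print(compare_columns(dev_columns, server_columns))
```

The same check also catches columns that are missing entirely on the server, which is another cause reported in this thread.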
Have you tried deleting and re-creating the source? When I get this error, I can generally modify whichever object has the error, but then I have to delete and rebuild the paths between the objects. Sometimes, however, I have to delete everything in the data flow and re-create it.
A Proxy for SSIS Package Execution should be created under the SQL Server Agent. You should then change your job step (or steps) to Run As the Proxy you've created.
I had the same problem some time ago and the proxy fixed it.
Forgive me if you've already tried this.
It is very common to get that message when two columns in the source file are being inserted into the same field of the table.
For example:
My text file has "neighborhood" twice (the same label for two different columns) and my table has "neighborhood" and "neighborhoodb" (notice the "b" at the end). The import tries to load both text columns into the "neighborhood" field and ignores the "neighborhoodb" field, so it fails with the "VS_NEEDSNEWMETADATA" error.
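To rule this out before running the import, you can check the header row of the text file for repeated labels. This is a minimal sketch that assumes a delimited file with the column names in the first row; duplicate_headers is a hypothetical helper and the sample contents are invented.

```python
import csv
import io
from collections import Counter


def duplicate_headers(text, delimiter=","):
    """Return header names that appear more than once in the first row."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader, [])
    # Compare names case-insensitively and ignore surrounding spaces.
    counts = Counter(name.strip().lower() for name in header)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical file contents with "neighborhood" used for two columns.
sample = "id,neighborhood,neighborhood\n1,Downtown,Old Town\n"
print(duplicate_headers(sample))
```

For a real file, pass its contents (for example from open(path).read()) and the delimiter it uses.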
Re-creating the job worked for me. Some cached version of the job may have been causing the VS_NEEDSNEWMETADATA error. The package executed correctly on its own but failed when it was run by an Agent job.
This ended up being a permissions issue for me. The OLE DB Source was using a stored procedure that selected from a SQL view. This view joined to a table in another database and unfortunately the proxy account the SQL Agent job step was running the package under did not have SELECT permission to the table in that database. This is why the package ran fine in Visual Studio but not from a job when deployed to the server. I found the root cause of the error by taking the SELECT statement out of the stored procedure and putting it directly in the Source Query box of the OLE DB Source control which caused it to finally return the 'SELECT permission denied' error message. This error was apparently hidden from SSIS since the proxy account DID have execute permission on the stored procedure.
It worked for me after changing ValidateExternalMetadata to False. I was transferring data from an MSSQL database to a MySQL database, and I changed the setting on the "ADO NET Destination".
You may need to strongly type your source query.
Example:
If your destination database has a FullName field of type NVARCHAR(255)
and your source query contains
select firstname + lastname as FullName from...
try this instead:
select CONVERT(NVARCHAR(255), firstname + lastname) as FullName from...
When I go from database to database and both columns are already nvarchar(255), I don't have this issue. But if you are concatenating fields in your query, specify the data type and length explicitly.
This error can also occur when an entire SSIS project needs to be redeployed rather than just one of its packages (for VS versions that allow deploying a single package from a multi-package project), particularly when a project connection has been changed or added, for example if you've added or removed columns from a flat-file project connection. In that case, you need to deploy the entire project to push out the updated project connection properties. This can be true even if the project has only one package in it. In VS Solution Explorer, rather than clicking the package name to deploy, select the bolded project name at the top and then click Deploy.

SQL Agent Job error when trying to run SSIS package

I have an SSIS package that runs fine when I run it in VS. I have changed the protection level to DontSaveSensitive. I have also given the SQL Agent job account permission to overwrite any related files.
The error I am getting is:
Parsing XML with internal subset DTDs not allowed. Use CONVERT with style option 2 to enable limited internal subset DTD support.
So far I have the impression that I need to change the code, but if the code were really the problem, the package wouldn't run in the first place, which clearly isn't the case.
How can I resolve this?

Unknown HTML Data Entered in SQL Server Database

I recently found out that unknown HTML code had been inserted into my SQL Server database without my knowledge. Every cell contains something like this:
[my original database data]</title><style>.a2vf{position:absolute;clip:rect(475px,auto,auto,475px);}</style><div class=a2vf>These rules are bound <a href=http://paydayloansforsure.com >fast payday loans</a> unscrupulous len...
I initially thought my database password had been compromised, so I changed it to a stronger one, but after a couple of days the HTML appeared again. Does anyone know how it got into the database and how to prevent it?
UPDATE:
After some investigation, I suspect this was caused by software I downloaded to schedule SQL database backups. I reformatted my local machine and started over, and it has not happened since.
After some investigation, I suspect this was caused by software that I downloaded to my local machine to schedule SQL database backups. Since I reformatted my local machine, it has not happened again.

"Transactional publication with updatable subscriptions" - gives error - "The distributor has not been installed correctly"

I am new to replication. I have two SQL Server 2008 servers running on Windows Server 2008 R2. The servers are in two different locations and on two different domains. I have been able to use aliases to get both "Snapshot publication" and "Transactional publication" working perfectly. But what I need is "Transactional publication with updatable subscriptions", so that a change made on either server is replicated to the other.
When I run through the New Publication Wizard, I get through every page with no problem, but when I click the Finish button I get an error. There are three actions, and it fails on the first one, called "Creating Publication 'xxxx'". The message I get is "SQL Server could not create publication 'xxxx'. An exception occurred while executing a Transact-SQL statement or batch. The distributor has not been installed correctly."
I have searched for an answer and cannot find one. I think this is a permission problem between the two servers, but I have no idea how to solve it.
Any help would be appreciated.
In my experience, installing a different type of replication over the top of one that was previously implemented can cause issues.
If you can, I would suggest clearing out all replication and starting from scratch with your new approach.
You have to run a variety of stored procedures to remove it from the server completely. Using only the GUI doesn't do nearly as good a job of cleaning everything off.
This guide from Microsoft should get you started.
http://msdn.microsoft.com/en-us/library/ms152757.aspx