StreamSets data not landing in table created on Postgres DB

I am using StreamSets to build a pipeline that lands data from a table in a SQL Server DB into a table in a Postgres DB.
JDBC Query Consumer --> Timestamp --> JDBC Producer
The pipeline passes validation checks and runs successfully in preview mode. However, the data does not land in the Postgres table.
I have checked the connection string and credentials, and they appear to be correct.
This is the error thrown in the logs:
No parameters found for record with YY SELECT 'XX' AS fieldA, YY AS
fieldB, ZZ AS fieldC::rowCount:#; skipping
How can I resolve this issue?

'No parameters found' means that there were no fields on the record that could be mapped to database columns. Check your field-to-column mappings. If they look correct, it might be a problem with letter case; try enabling Enclose Object Names on the JDBC tab.
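For example, Postgres folds unquoted identifiers to lowercase, so a mixed-case column only matches when it is quoted, which is what Enclose Object Names does. A small illustration (hypothetical table and column names, not from the pipeline above):
CREATE TABLE target (fieldA text, "FieldB" text);
SELECT fielda FROM target;    -- works: unquoted fieldA was folded to fielda
SELECT "FieldB" FROM target;  -- works: quoted name matches exactly
SELECT FieldB FROM target;    -- fails: folded to fieldb, which does not exist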

Related

Facing error while accessing Cassandra data through Solr in DSE

When I issue this query on Solr (a separate DB), it works. But when I access Cassandra data through a Solr query (I am using DSE), it returns nothing and gives an error related to the field cache. So how do I enable useFieldCache in a Solr query?
Update
My Query is
SELECT * FROM trackfleet_db.location WHERE
solr_query='{"facet":{"pivot":"date,latitude,longitude"},"q":"*:*"}' ;
And I am getting following error
InvalidRequest: Error from server: code=2200 [Invalid query]
message="Field cache is disabled, set the field=date to be docValues=true
and reindex. Or if the field cache will not exceed the heap usage,
then place useFieldCache=true in the request parameters."
The best approach would be to enable docValues on the given field (date) and reindex the data.
But it looks like you have this field defined with the date type, which (per the documentation) doesn't support docValues, so you may need to change the field's type to timestamp (I'm not sure you can use a copy field with a different type).
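If reindexing isn't possible right away, the error message itself suggests a workaround: pass useFieldCache=true as a request parameter. A hedged sketch, assuming DSE accepts it as an extra key in the solr_query JSON:
SELECT * FROM trackfleet_db.location WHERE
solr_query='{"q":"*:*","facet":{"pivot":"date,latitude,longitude"},"useFieldCache":true}';
Note the error warns that the field cache counts against heap usage, so treat this as a stopgap rather than a fix.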

exception with Hive long create table statement

I have a "very long" create external table" statement that i try to run in Hive (200+ columns) but I end up with this error message.
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)
It's suppose to create an external table over an already populated hbase table. If reduce the number of column in my Hive statement it works.
So could it be the max number of column?, a connection timeout? , the lenght of the statement?
Please share your thought.
Regards,
Breach
Not sure if the number of columns is the real problem given the limited information provided, but this post should help you check whether it is:
Creating a hive table with ~40K columns
Change the type of the column "PARAM_VALUE" in the "SERDE_PARAMS" table in the metastore database.
Try this command if you are using MySQL to store the metastore DB:
ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE TEXT NOT NULL;
Hope it works for you.
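If your metastore database is Postgres rather than MySQL, the equivalent should be roughly the following; a sketch only, since the quoted uppercase names assume Hive's standard Postgres metastore schema and I haven't checked every schema version:
ALTER TABLE "SERDE_PARAMS" ALTER COLUMN "PARAM_VALUE" TYPE TEXT;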

How to Join two MongoDB Collections with the Unity JDBC driver?

From the Unity JDBC download page:
If the SQL query requires joins or functions not supported by MongoDB, then the query is promoted to UnityJDBC (trial version). The UnityJDBC trial version has no expiration date and is fully functioning except that it is limited to returning up to 100 results.
However, when I try to join two tables using any of the following syntaxes:
SELECT * from a, b WHERE a.id = b.id
SELECT * from a INNER JOIN b ON a.id = b.id
SELECT * from a INNER JOIN b USING (id)
I get the following:
java.sql.SQLException: ERROR: No schema defined. The default schema location is _schema in the current database. You need write permission to create this collection. Otherwise, use the schema parameter to set a file location (e.g. schema=mongo.xml) to store the schema. See connection parameters at http://www.unityjdbc.com/mongojdbc/ for more details.
at mongodb.conn.ServerConnection.processMongoWithUnity(Unknown Source)
at mongodb.conn.ServerConnection.executeQuery(Unknown Source)
at mongodb.jdbc.MongoStatement.executeQuery(Unknown Source)
at mongodb.ExampleMongoJDBC.doQuery(ExampleMongoJDBC.java:222)
at mongodb.ExampleMongoJDBC.main(ExampleMongoJDBC.java:66)
OK, so I took a look in the README and found that it mentions the code/test/dspec/ folder, which contains some schema-related files. I opened a few up; they are highly detailed XML files mapping all the collections to relational data types.
Do I have to write one of these out by hand, or is there a way to auto-generate it?
I received a (fast) response from the Unity team.
The MongoDB JDBC driver has two modes. For single collection queries, it does not build a schema. For queries involving joins or expressions it builds a schema and by default stores it in the _schema collection in the current Mongo database. If you do not have permission to write to the database, an error is thrown.
As mentioned in the error, you can set the schema parameter to be a local file name (such as mongo.xml) and it will store it on your computer rather than in the Mongo database. You could also use an account that has write permissions.
To make this work, add schema=mongo.xml to your connection URL like this:
jdbc:mongo://localhost/dbname?schema=mongo.xml&rebuildschema=true
After this is done the first time, you can remove the rebuildschema=true or it will rebuild it every time.
The only thing I'm confused about is that I'm using a database that doesn't require authentication. I even made a user with write permissions, connected as that user, and still received the above error.
--EDIT
I realized that you could also just do jdbc:mongo://localhost/dbname?rebuildschema=true. If the schema hasn't been created, then the previous error will be thrown.

Specifying database other than default with Impala JDBC driver

I'm using the Impala JDBC driver (or I guess it's actually the Hive Server 2 JDBC driver). I have a view created in another database -- let's call it "store55".
Let's say my view is defined as follows:
CREATE VIEW good_customers AS
SELECT * from customers WHERE good = true;
When I try to query this view using JDBC as follow:
SELECT * FROM store55.good_customers LIMIT 10
I get an error such as:
java.sql.SQLException: AnalysisException: Table does not exist: default.customers
Ideally, I'd like to specify the database name somewhere in the JDBC URL or as a parameter, but when I try this JDBC URL, I still get the same error:
jdbc:hive2://<host>:<port>/store55;auth=noSasl
Does the Hive2 JDBC driver just ignore the database part of the URL and assume all queries are executed against the default database?
The only way I was able to have the queries return is to change the view definition itself to include the database name:
CREATE VIEW good_customers AS
SELECT * from store55.customers WHERE good = true;
However, I'd like to keep the view definition free of database names.
Thanks!
You might want to execute a "USE xxxxx;" statement through JDBC before running the query.
Also, if you are already using the database, try an "INVALIDATE METADATA" statement.
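For example, running these through the same JDBC connection before the query (standard Impala statements, with the names from the question):
USE store55;
INVALIDATE METADATA;                    -- refresh Impala's cached metastore state
SELECT * FROM good_customers LIMIT 10;  -- now resolves against store55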
The URL is jdbc:hive2://<host>:<port>/store55;auth=noSasl, correct?
Can you run a few diagnostics, such as:
SHOW TABLES - to ensure that the view is created in store55
Are you using the USE command in the DDLs?

SQL Server reports 'Invalid column name', but the column is present and the query works through management studio

I've hit a bit of an impasse. I have a query that is generated by some C# code. The query works fine in Microsoft SQL Server Management Studio when run against the same database.
However when my code tries to run the same query I get the same error about an invalid column and an exception is thrown. All queries that reference this column are failing.
The column in question was recently added to the database. It is a date column called Incident_Begin_Time_ts.
An example that fails is:
select * from PerfDiag
where Incident_Begin_Time_ts > '2010-01-01 00:00:00';
Other queries like SELECT MAX(Incident_Begin_Time_ts) also fail when run from code because it thinks the column is missing.
Any ideas?
Just press Ctrl + Shift + R and see...
In SQL Server Management Studio, Ctrl+Shift+R refreshes the local IntelliSense cache.
I suspect that you have two tables with the same name. One is owned by the schema 'dbo' (dbo.PerfDiag), and the other is owned by the default schema of the account used to connect to SQL Server (something like userid.PerfDiag).
When you have an unqualified reference to a schema object (such as a table) — one not qualified by schema name — the reference must be resolved. Name resolution searches, in the following sequence, for an object of the appropriate type (table) with the specified name, and binds to the first match:
Under the default schema of the user.
Under the schema 'dbo'.
As a general recommended practice, one should always qualify references to schema objects, for performance reasons (see the example after this list):
An unqualified reference may invalidate a cached execution plan for the stored procedure or query, since the schema to which the reference was bound may change depending on the credentials executing the stored procedure or query. This results in recompilation of the query/stored procedure, a performance hit. Recompilations cause compile locks to be taken out, blocking others from accessing the needed resource(s).
Name resolution slows down query execution, as two probes are usually needed to resolve the name to the likely version of the object (the one owned by 'dbo'). A single probe resolves the name only when the current user owns an object of the specified name and type.
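For instance, with the table from the question:
-- Unqualified: may bind to userid.PerfDiag depending on who runs it
SELECT * FROM PerfDiag;
-- Schema-qualified: always binds to dbo.PerfDiag
SELECT * FROM dbo.PerfDiag WHERE Incident_Begin_Time_ts > '2010-01-01 00:00:00';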
[Edited to further note]
The other possibilities are (in no particular order):
You aren't connected to the database you think you are.
You aren't connected to the SQL Server instance you think you are.
Double-check your connection strings and ensure that they explicitly specify the SQL Server instance name and the database name.
In my case I restarted Microsoft SQL Server Management Studio and this worked for me.
If you are running this inside a transaction, and a SQL statement before it drops or alters the table, you can also get this message.
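A minimal illustration (hypothetical batch, reusing the question's table):
BEGIN TRANSACTION;
ALTER TABLE PerfDiag DROP COLUMN Incident_Begin_Time_ts;
-- Still inside the same transaction, the column no longer exists,
-- so this statement reports 'Invalid column name':
SELECT Incident_Begin_Time_ts FROM PerfDiag;
ROLLBACK;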
I eventually shut down and restarted Microsoft SQL Server Management Studio, and that fixed it for me. But at other times, just starting a new query window was enough.
If you are using variables with the same name as your column, it could be that you forgot the '@' variable marker. In an INSERT statement it will be detected as a column.
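For example (a hypothetical statement; in T-SQL a missing @ turns the variable into a column reference, and the same happens inside an INSERT):
DECLARE @StartTime datetime = '2010-01-01';
SELECT * FROM PerfDiag WHERE Incident_Begin_Time_ts > StartTime;  -- Invalid column name 'StartTime'
SELECT * FROM PerfDiag WHERE Incident_Begin_Time_ts > @StartTime; -- correct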
Just had the exact same problem. I renamed some aliased columns in a temporary table that is used further on by another part of the same code. For some reason, this was not picked up by SQL Server Management Studio and it complained about invalid column names.
What I did was create a new query, copy-paste the SQL code from the old query into the new one, and run it again. This seemed to refresh the environment correctly.
In my case I was trying to get the value from the wrong ResultSet when executing multiple SQL statements.
In my case it turned out to be a strange caching problem; the solutions above didn't work.
If your code was working fine, you added a column to one of your tables, you now get the 'invalid column name' error, and the solutions above don't work, try this: first run only the section of the script that creates the modified table, then run the whole script.
Including this answer because this was the top result for "invalid column name sql" on Google and I didn't see this answer here. In my case, I was getting Invalid Column Name, Id1 because I had used the wrong id in my .HasForeignKey statement in my Entity Framework C# code. Once I changed it to match the .HasOne() object's id, the error was gone.
I've gotten this error when running a scalar function using a table value, but the SELECT statement in my scalar function's RETURN clause was missing the "FROM table" portion. :facepalms:
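In other words, something like this (a hypothetical reconstruction with made-up names):
CREATE FUNCTION dbo.LatestIncident()
RETURNS datetime
AS
BEGIN
    -- Broken: RETURN (SELECT MAX(Incident_Begin_Time_ts));
    -- raises 'Invalid column name' because the FROM clause is missing.
    RETURN (SELECT MAX(Incident_Begin_Time_ts) FROM PerfDiag);
END;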
Also happens when you forget to change the ConnectionString and query a table that has no idea about the changes you're making locally.
I had this problem with a View, but the exact same SQL code worked perfectly as a query. In fact, SSMS threw up a couple of other problems with the View that it did not have with the query. I tried refreshing, closing the connection to the server and going back in, and renaming columns; nothing worked. Instead I created the query as a stored procedure and connected Excel to that rather than the View, and this solved the problem.