HiveMetaStoreClient thinks Derby but actually Postgres - hive

I'm trying to connect to a Postgres Hive Metastore through an Oozie Java Action using the code below.
I'm passing the hive-site.xml to the action, so it should have all the information it needs.
HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
log.info("Successfully created the HiveMetaStoreClient");
try {
    log.info(String.format("Loading the partitions for %s.%s", database, table));
    List<Partition> partitions = client.listPartitions(database, table, (short) 200);
    log.info(String.format("Processing %d partitions", partitions.size()));
    for (Partition partition : partitions) {
        StorageDescriptor sd = partition.getSd();
        String location = sd.getLocation();
        String newLocation = location.replace(from, to);
        log.info(String.format("Moving from %s to %s", location, newLocation));
        sd.setLocation(newLocation);
    }
} catch (TException e) {
    logExceptionStack(e);
}
The log doesn't exactly show an error, but it suggests that it's looking at some other metastore with a DERBY backend.
I'm stumped as to where to look for the issue and how to force HiveMetaStoreClient to point to the correct metastore.
2016-02-08 16:48:05,972 INFO [uber-SubtaskRunner] com.xxxxxxx.PartitionMigrator.Program: Attempting to create the HiveMetaStoreClient
2016-02-08 16:48:06,123 INFO [uber-SubtaskRunner] com.xxxxxxx.PartitionMigrator.Program: hiveconf metastoreURI: null
2016-02-08 16:48:06,194 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2016-02-08 16:48:06,222 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.metastore.ObjectStore: ObjectStore, initialize called
2016-02-08 16:48:06,385 INFO [uber-SubtaskRunner] DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
2016-02-08 16:48:06,385 INFO [uber-SubtaskRunner] DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
2016-02-08 16:48:06,506 WARN [uber-SubtaskRunner] DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
2016-02-08 16:48:06,840 WARN [uber-SubtaskRunner] DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
2016-02-08 16:48:08,339 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
2016-02-08 16:48:09,286 INFO [uber-SubtaskRunner] DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
2016-02-08 16:48:09,286 INFO [uber-SubtaskRunner] DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
2016-02-08 16:48:10,400 INFO [uber-SubtaskRunner] DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
2016-02-08 16:48:10,400 INFO [uber-SubtaskRunner] DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
2016-02-08 16:48:10,676 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
2016-02-08 16:48:10,677 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.metastore.ObjectStore: Initialized ObjectStore
2016-02-08 16:48:10,798 WARN [uber-SubtaskRunner] org.apache.hadoop.hive.metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.1.0
2016-02-08 16:48:10,928 WARN [uber-SubtaskRunner] org.apache.hadoop.hive.metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
2016-02-08 16:48:11,019 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.metastore.HiveMetaStore: Added admin role in metastore
2016-02-08 16:48:11,021 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.metastore.HiveMetaStore: Added public role in metastore
2016-02-08 16:48:11,097 INFO [uber-SubtaskRunner] org.apache.hadoop.hive.metastore.HiveMetaStore: No user is added in admin role, since config is empty
2016-02-08 16:48:11,193 INFO [uber-SubtaskRunner] com.xxxxxxx.PartitionMigrator.Program: Successfully created the HiveMetaStoreClient
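Not part of the original post: the log line "hiveconf metastoreURI: null" suggests that the HiveConf handed to the client never picked up hive-site.xml. When hive.metastore.uris is empty, HiveMetaStoreClient starts an embedded metastore, and javax.jdo.option.ConnectionURL falls back to a local Derby database by default. The Python sketch below is one way to check what a given hive-site.xml actually declares. The function names, the sample XML and the host name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Properties that decide which metastore a HiveMetaStoreClient ends up using.
KEYS = ("hive.metastore.uris", "javax.jdo.option.ConnectionURL")


def metastore_settings(xml_text):
    """Return the metastore-related properties found in a hive-site.xml string."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name in KEYS:
            found[name] = (prop.findtext("value") or "").strip()
    return found


def describe(settings):
    """Summarise how a client would connect, given the parsed settings."""
    if settings.get("hive.metastore.uris"):
        return "remote metastore via Thrift"
    url = settings.get("javax.jdo.option.ConnectionURL", "")
    if url:
        parts = url.split(":")
        backend = parts[1] if len(parts) > 1 else url
        return "embedded metastore using " + backend
    return "embedded metastore with the built-in default (Derby)"


# Hypothetical sample; the host name is a placeholder.
SAMPLE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:postgresql://db.example.invalid:5432/metastore</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(describe(metastore_settings(SAMPLE)))
    print(describe({}))  # what an empty configuration would fall back to
```

If the file that really reaches the Java action contains the Postgres settings but the client still reports DERBY, the configuration object is probably not loading that file at all.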

Related

Quarkus and duplicate liquibase master file

I've just moved up to Quarkus 2.11.1.Final from 2.6.2.Final and my native image is now failing to start up with the error:
2022-07-29 15:53:36.323 BST 2022-07-29 14:53:36,275 WARN [io.qua.ope.run.tra.LateBoundSampler] (vert.x-eventloop-thread-0) No Sampler delegate specified, no action taken.
2022-07-29 15:53:36.361 BST 2022-07-29 14:53:36,361 INFO [io.qua.sma.ope.run.OpenApiRecorder] (main) Default CORS properties will be used, please use 'quarkus.http.cors' properties instead
2022-07-29 15:53:36.398 BST 2022-07-29 14:53:36,398 INFO [liq.database] (main) Set default schema name to public
2022-07-29 15:53:36.432 BST 2022-07-29 14:53:36,432 ERROR [io.qua.run.Application] (main) Failed to start application (with profile prod): java.io.IOException: Found 2 files with the path 'db/changelog/liquibase-changelog-master.yml':
2022-07-29 15:53:36.432 BST - resource:/db/changelog/liquibase-changelog-master.yml
2022-07-29 15:53:36.432 BST - resource:/db/changelog/liquibase-changelog-master.yml#1
2022-07-29 15:53:36.432 BST Search Path:
I tried changing quarkus.liquibase.change-log to something very specific, just in case it was picking up a file of the same name from some third party, but it doesn't make any difference.
Could this be a bug, or could I have missed something when upgrading Quarkus?
Answer posted as a comment by @wabrit:
In older versions of Quarkus it was necessary to set quarkus.native.resources.include to ensure that additional Liquibase change files were present in the native image. That now seems to be done automatically, so including them in that property resulted in duplicates.

CLFRN1254E exception when syncing TDI for HCL Connections against OpenLDAP server

For a test environment, I want to set up HCL Connections 6.5 with OpenLDAP. This should be a more lightweight alternative, and easier to automate, than the full Domino server used in production. I created test users with the following attributes:
{ sn: Max, cn: Muster, uid: max, displayName: "Max Muster", userPassword: "ldap", mail: "max.muster@example.com" }
All have the objectClasses person, shadowAccount and inetOrgPerson. After executing collect_dns.sh, the following DN is present in collect.dns:
uid=max,ou=People,dc=cnx,dc=local
When syncing those users with ./populate_from_dn_file.sh I got a failed record. The log file logs/ibmdi.log shows
2020-05-21 09:41:07,703 DEBUG [org.springframework.beans.factory.support.DefaultListableBeanFactory] - Eagerly caching bean 'PostgreSQL' to allow for resolving potential circular references
2020-05-21 09:41:07,703 DEBUG [org.springframework.beans.factory.support.DefaultListableBeanFactory] - Finished creating instance of bean 'PostgreSQL'
2020-05-21 09:41:07,703 DEBUG [org.springframework.beans.factory.support.DefaultListableBeanFactory] - Creating shared instance of singleton bean 'Sybase'
2020-05-21 09:41:07,704 DEBUG [org.springframework.beans.factory.support.DefaultListableBeanFactory] - Creating instance of bean 'Sybase'
2020-05-21 09:41:07,704 DEBUG [org.springframework.beans.factory.support.DefaultListableBeanFactory] - Eagerly caching bean 'Sybase' to allow for resolving potential circular references
2020-05-21 09:41:07,704 DEBUG [org.springframework.beans.factory.support.DefaultListableBeanFactory] - Finished creating instance of bean 'Sybase'
2020-05-21 09:41:07,704 INFO [org.springframework.jdbc.support.SQLErrorCodesFactory] - SQLErrorCodes loaded: [DB2, Derby, H2, HSQL, Informix, MS-SQL, MySQL, Oracle, PostgreSQL, Sybase]
2020-05-21 09:41:07,704 DEBUG [org.springframework.jdbc.support.SQLErrorCodesFactory] - Looking up default SQLErrorCodes for DataSource [org.springframework.jdbc.datasource.TransactionAwareDataSourceProxy#64a644f9]
2020-05-21 09:41:07,705 DEBUG [org.springframework.jdbc.datasource.DataSourceUtils] - Fetching JDBC Connection from DataSource
2020-05-21 09:41:07,705 DEBUG [org.springframework.jdbc.datasource.DataSourceUtils] - Registering transaction synchronization for JDBC Connection
2020-05-21 09:41:07,706 DEBUG [org.springframework.jdbc.support.SQLErrorCodesFactory] - Database product name cached for DataSource [org.springframework.jdbc.datasource.TransactionAwareDataSourceProxy#64a644f9]: name is 'DB2/LINUXX8664'
2020-05-21 09:41:07,706 DEBUG [org.springframework.jdbc.support.SQLErrorCodesFactory] - SQL error codes for 'DB2/LINUXX8664' found
2020-05-21 09:41:07,706 DEBUG [org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator] - Translating SQLException with SQL state '23502', error code '-407', message [
--- The error occurred while applying a parameter map.
--- Check the Profile.createProfile-InlineParameterMap.
--- Check the statement (update failed).
--- Cause: com.ibm.db2.jcc.c.SqlException: DB2 SQL error: SQLCODE: -407, SQLSTATE: 23502, SQLERRMC: TBSPACEID=5, TABLEID=5, COLNO=7]; SQL was [] for task [SqlMapClient operation]
2020-05-21 09:41:07,707 DEBUG [org.springframework.jdbc.datasource.DataSourceUtils] - Returning JDBC Connection to DataSource
2020-05-21 09:41:07,707 DEBUG [org.springframework.jdbc.datasource.DataSourceTransactionManager] - Initiating transaction rollback
2020-05-21 09:41:07,707 DEBUG [org.springframework.jdbc.datasource.DataSourceTransactionManager] - Rolling back JDBC transaction on Connection [org.apache.commons.dbcp.PoolableConnection#a2d822e9]
2020-05-21 09:41:07,707 DEBUG [org.springframework.jdbc.datasource.DataSourceTransactionManager] - Releasing JDBC Connection [org.apache.commons.dbcp.PoolableConnection#a2d822e9] after transaction
2020-05-21 09:41:07,707 DEBUG [org.springframework.jdbc.datasource.DataSourceUtils] - Returning JDBC Connection to DataSource
2020-05-21 09:41:07,707 ERROR [com.ibm.lconn.profiles.api.tdi.connectors.ProfileConnector] - CLFRN1254E: An error occurred while performing findEntry: max.
2020-05-21 09:41:07,708 ERROR [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - !com.ibm.lconn.profiles.api.tdi.service.TDIException: CLFRN1254E: An error occurred while performing findEntry: max.!
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - [callSyncDB_mod] CTGDIS274I Skipping entry from [addorUpdateDB], CTGDIS393I Throwing this exception to tell the AssemblyLine to skip the current Entry. If used in an EventHandler, this exception tells the EventHandler to skip the remaining actions..
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - [callSyncDB_mod] CTGDIS075I Trying to exit TaskCallBlock.
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - [callSyncDB_mod] CTGDIS076I Succeeded exiting TaskCallBlock.
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - [callSyncDB_mod] CTGDIS057I Hook after_functioncall not enabled.
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - CTGDIS352I Use null Behavior for outputResult.
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - [callSyncDB_mod] CTGDIS504I *Result of attribute mapping*
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - [callSyncDB_mod] CTGDIS505I The 'conn' object
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - [callSyncDB_mod] CTGDIS003I *** Start dumping Entry
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - Operation: generic
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - Entry attributes:
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - displayName (replace): 'Max Muster'
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - $lookup_status (replace): 'success'
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - userPassword (replace): (\6c\64\61\70)
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - $lookup_operation (replace): 'lookup_user'
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - cn (replace): 'Muster'
2020-05-21 09:41:07,708 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - $_already_lookup_secretary (replace):
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - objectClass (replace): 'person' 'shadowAccount' 'inetOrgPerson'
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - entryUUID (replace): 'e74f6eec-2f22-103a-960a-770a291c4e47'
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - $secretary_uid (replace):
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - uid (replace): 'max'
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - $manager_uid (replace):
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - $_already_lookup_manager (replace):
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - syncExisting (replace):
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - $dn (replace): 'uid=max,ou=People,dc=cnx,dc=local'
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - mail (replace): 'max.muster@example.com'
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - sn (replace): 'Max'
2020-05-21 09:41:07,709 INFO [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - $operation (replace): 'add'
How can I fix this? From the error message alone, I have no idea what the problem is.
What I already tried
This blog post describes the same error and indicates that the mode field needs to be set, because the error was caused by it being null. To test whether this works, I set it to a custom function by inserting mode={func_mode} in map_dbrepos_from_source.properties. Additionally, I added this function to profiles_functions.js:
function func_mode(fieldname) {
    return 'internal';
}
This should handle all users as internal and avoid trouble because of null fields. With the debug logs, I could verify that this value was applied:
2020-05-21 09:41:07,587 DEBUG [AssemblyLine.AssemblyLines/populate_from_dns_file.1] - CLFRN0011I: Mapping result: mode = internal.
The other thing I tried is disabling validation for fields I don't have in my LDAP, like guid or isManager, by commenting out their validation functions in validate_dbrepos_fields.properties:
#distinguishedName=(x != null) && (x.length() > 0) && (x.length() <= 256)
#guid=(x != null) && (x.length() > 0) && (x.length() <= 256)
#isManager=(x == null) || (x == "Y") || (x == "N")
#surname=(x != null) && (x.length() > 0) && (x.length() <= 128)
Additionally, the mappings for those fields were set to null to avoid errors from fetching them from an LDAP entry where they don't exist:
grep "=null" map_dbrepos_from_source.properties
alternateLastname=null
blogUrl=null
bldgId=null
calendarUrl=null
countryCode=null
courtesyTitle=null
deptNumber=null
description=null
employeeNumber=null
employeeTypeCode=null
experience=null
faxNumber=null
freeBusyUrl=null
floor=null
groupwareEmail=null
ipTelephoneNumber=null
jobResp=null
loginId=null
logins=null
managerUid=null
mobileNumber=null
nativeFirstName=null
nativeLastName=null
orgId=null
pagerNumber=null
pagerId=null
pagerServiceProvider=null
pagerType=null
officeName=null
preferredFirstName=null
preferredLanguage=null
preferredLastName=null
profileType=null
secretaryUid=null
shift=null
telephoneNumber=null
tenantKey=null
timezone=null
title=null
workLocationCode=null
isManager=null
Verify that the DB exists
In the past, I had the same problem and found out that the databases were not created properly. So I checked this:
su - db2inst1
/opt/IBM/db2/V11.1/bin/db2 list db directory | grep "Database name"
Database name = OPNACT
Database name = METRICS
Database name = SNCOMM
Database name = PNS
Database name = WIKIS
Database name = FORUM
Database name = HOMEPAGE
Database name = DOGEAR
Database name = PEOPLEDB
Database name = MOBILE
Database name = FILES
Database name = XCC
Database name = BLOGS
All databases are present, in particular PEOPLEDB, where TDI stores the user profiles fetched from LDAP. The tables also seem to be there:
db2 => list tables for schema EMPINST#
Table/View Schema Type Creation time
------------------------------- --------------- ----- --------------------------
CHG_EMP_DRAFT EMPINST T 2020-05-20-22.48.28.416187
COUNTRY EMPINST T 2020-05-20-22.48.26.864072
DEPARTMENT EMPINST T 2020-05-20-22.48.26.635113
EMPLOYEE EMPINST T 2020-05-20-22.48.25.249286
EMP_DRAFT EMPINST T 2020-05-20-22.48.28.079615
EMP_ROLE_MAP EMPINST T 2020-05-20-22.48.29.296064
EMP_TYPE EMPINST T 2020-05-20-22.48.26.973100
EMP_UPDATE_TIMESTAMP EMPINST T 2020-05-20-22.48.29.539973
EVENTLOG EMPINST T 2020-05-20-22.48.28.764942
GIVEN_NAME EMPINST T 2020-05-20-22.48.25.723208
ORGANIZATION EMPINST T 2020-05-20-22.48.26.745316
PEOPLE_TAG EMPINST T 2020-05-20-22.48.26.477954
PHOTO EMPINST T 2020-05-20-22.48.27.097088
PHOTOBKUP EMPINST T 2020-05-20-22.48.27.311065
PHOTO_GUID EMPINST T 2020-05-20-22.48.27.519014
PROFILES_SCHEDULER_LMGR EMPINST T 2020-05-20-22.48.30.229810
PROFILES_SCHEDULER_LMPR EMPINST T 2020-05-20-22.48.30.340702
PROFILES_SCHEDULER_TASK EMPINST T 2020-05-20-22.48.29.873149
PROFILES_SCHEDULER_TREG EMPINST T 2020-05-20-22.48.30.108769
PROFILE_EXTENSIONS EMPINST T 2020-05-20-22.48.26.025818
PROFILE_EXT_DRAFT EMPINST T 2020-05-20-22.48.26.258480
PROFILE_LAST_LOGIN EMPINST T 2020-05-20-22.48.29.430376
PROFILE_LOGIN EMPINST T 2020-05-20-22.48.29.051552
PROFILE_PREFS EMPINST T 2020-05-20-22.48.29.183711
PROF_CONNECTIONS EMPINST T 2020-05-20-22.48.28.490983
PROF_CONSTANTS EMPINST T 2020-05-20-22.48.28.644499
PRONUNCIATION EMPINST T 2020-05-20-22.48.27.726899
SNPROF_SCHEMA EMPINST T 2020-05-20-22.48.25.020502
SURNAME EMPINST T 2020-05-20-22.48.25.875498
TENANT EMPINST T 2020-05-20-22.48.25.084242
USER_PLATFORM_EVENTS EMPINST T 2020-05-20-22.48.29.659806
WORKLOC EMPINST T 2020-05-20-22.48.27.953047
This matches the number of tables from the SQL file
$ grep -i "create table" /opt/cnx-install/cnx/wizard/connections.sql/profiles/db2/createDb.sql | wc -l
32
You asked the question in May so I assume this answer comes much too late. For future reference: "Skipping entry from [addorUpdateDB]" is a scripted message which means that the account did not pass the minimal requirements for a Profile entry. If I remember correctly, there are 4 essential fields without which a profile entry can't be created:
email
distinguishedName
guid
uid
Seeing that you left out a guid, the error is logical. You should have mapped your guid to your entryUUID.
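To make that concrete, here is a small Python sketch of the check, assuming the four fields listed above really are the minimum (the list is quoted from memory). It is not TDI code: REQUIRED, to_profile and missing_fields are hypothetical names, and the entry values are copied from the log dump in the question.

```python
# Fields named in this answer as essential for a Profiles entry (quoted from memory).
REQUIRED = ("email", "distinguishedName", "guid", "uid")


def to_profile(entry, guid_attr=None):
    """Hypothetical mapping from the LDAP attributes in the question to profile fields."""
    return {
        "email": entry.get("mail"),
        "distinguishedName": entry.get("$dn"),
        # guid stays empty unless it is mapped to some LDAP attribute
        "guid": entry.get(guid_attr) if guid_attr else None,
        "uid": entry.get("uid"),
    }


def missing_fields(profile):
    """Return the required profile fields that are absent or empty."""
    return [field for field in REQUIRED if not profile.get(field)]


# Attribute values taken from the entry dump in the question.
entry = {
    "mail": "max.muster@example.com",
    "$dn": "uid=max,ou=People,dc=cnx,dc=local",
    "entryUUID": "e74f6eec-2f22-103a-960a-770a291c4e47",
    "uid": "max",
}

unmapped = missing_fields(to_profile(entry))                      # guid=null mapping
mapped = missing_fields(to_profile(entry, guid_attr="entryUUID")) # guid=entryUUID mapping

if __name__ == "__main__":
    print("guid unmapped:", unmapped)
    print("guid mapped to entryUUID:", mapped)
```

With the guid left unmapped the entry fails the check, and mapping it to entryUUID clears it, which matches the behaviour described above.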

TDI for HCL Connections 6.5 synchronization fails with "bad SQL grammar [];" error

I'm using Tivoli Directory Integrator (TDI) to sync users from Domino LDAP to the local DB2 people database of HCL Connections. On a test installation, I got the following error when trying to initially sync the users:
[root@cnx65 tdisol]# LANG=en_US.utf8 ./sync_all_dns.sh
create synchronization lock
log4j:WARN No appenders could be found for logger (server).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
**********
CLFRN1275I: Begin to hash records in database.
CLFRN1269I: Finish hash records in database.
**********
"message": "CLFRN1254E: An error occurred while performing findEntry: {0}."
"exception": "com.ibm.lconn.profiles.api.tdi.service.TDIException: CLFRN1254E: An error occurred while performing findEntry: {0}."
Synchronize of Database Repository failed
HCL's documentation recommends checking the logs in case of CLFRN1254E. The file logs/SyncUpdates.log contains the following exception:
2020-01-21 07:50:03,803 INFO [org.apache.log4j.DailyRollingFileAppender.7431103d-4d0a-4d63-bdb7-61e274f23ed4] - CTGDIS092I Use entry provided at runtime as work entry (first pass only).
2020-01-21 07:50:11,723 ERROR [org.apache.log4j.DailyRollingFileAppender.7431103d-4d0a-4d63-bdb7-61e274f23ed4] - [hash_db_entries] CTGDIS181E Error while evaluating the hook 'Function error' in the component 'hash_db_entries (hash_db_entries.functioncall_fail).
com.ibm.lconn.profiles.api.tdi.service.TDIException: CLFRN1254E: An error occurred while executing findEntry: {0}.
at com.ibm.lconn.profiles.api.tdi.connectors.ProfileConnector$ProfileCodeBlock.handleRecoverable(ProfileConnector.java:1063)
at com.ibm.lconn.profiles.api.tdi.connectors.Util.TDICodeRunner.run(TDICodeRunner.java:41)
at com.ibm.lconn.profiles.api.tdi.connectors.ProfileConnector.getNextEntry(ProfileConnector.java:155)
at com.ibm.di.server.AssemblyLineComponent.executeOperation(AssemblyLineComponent.java:3370)
at com.ibm.di.server.AssemblyLineComponent.getnext(AssemblyLineComponent.java:932)
at com.ibm.di.server.AssemblyLine.msGetNextIteratorEntry(AssemblyLine.java:3689)
at com.ibm.di.server.AssemblyLine.executeMainStep(AssemblyLine.java:3388)
at com.ibm.di.server.AssemblyLine.executeMainLoop(AssemblyLine.java:3000)
at com.ibm.di.server.AssemblyLine.executeMainLoop(AssemblyLine.java:2983)
at com.ibm.di.server.AssemblyLine.executeAL(AssemblyLine.java:2952)
at com.ibm.di.server.AssemblyLine.run(AssemblyLine.java:1319)
Caused by: org.springframework.jdbc.BadSqlGrammarException: SqlMapClient operation; bad SQL grammar []; nested exception is com.ibatis.common.jdbc.exception.NestedSQLException:
--- The error occurred while applying a parameter map.
--- Check the TDIProfile.get-InlineParameterMap.
--- Check the statement (query failed).
--- Cause: com.ibm.db2.jcc.c.SqlException: DB2 SQL error: SQLCODE: -551, SQLSTATE: 42501, SQLERRMC: LCUSER;SELECT;EMPINST.EMPLOYEE
at org.springframework.jdbc.support.SQLStateSQLExceptionTranslator.doTranslate(SQLStateSQLExceptionTranslator.java:97)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:72)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80)
at org.springframework.jdbc.support.AbstractFallbackSQLExceptionTranslator.translate(AbstractFallbackSQLExceptionTranslator.java:80)
at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:212)
at org.springframework.orm.ibatis.SqlMapClientTemplate.executeWithListResult(SqlMapClientTemplate.java:249)
at org.springframework.orm.ibatis.SqlMapClientTemplate.queryForList(SqlMapClientTemplate.java:296)
at com.ibm.lconn.profiles.internal.service.store.sqlmapdao.TDIProfileSqlMapDao.get(TDIProfileSqlMapDao.java:50)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:88)
What could be the problem? How can I find out more about why this error occurs?
What I already tried
Increase log level
In profiles_tdi.properties I enabled debug logs for every component:
debug_collect=true
debug_draft=true
debug_fill_codes=true
debug_managers=true
debug_photos=true
debug_pronounce=true
debug_special=true
debug_update_profile=true
trace_profile_tdi_javascript=on
Since this had no effect, I set the log4j level to debug for the entire application in etc/log4j.properties:
log4j.rootCategory=DEBUG, Default
I also tried ALL instead of DEBUG. However, there is no change in the output. I expected to see the SQL query that caused the exception.
Set mode in properties
According to this post, the mode attribute is used to decide whether a user is internal or external. Since the example config says
Actually, any string other than "external" is interpreted as employee.
I set it to mode=memberType. I also tried mode=uid and mode=mail. Both are fields containing a string not equal to "external", so this should result in all members being imported as internal users.
Sync single users
Since my LDAP filter applies to around 60 users, I ran ./collect_dns.sh successfully and removed all users from the collect.dns file except my own. Then I synced the user from the DN file with ./populate_from_dn_file.sh. I repeated this for two other users, always with the same error:
CLFRN0027I: After operation, success records is 0, duplicate records 0, failure records is 1, and last successful entry is null.
CLFRN1280I: 20200121105123 Iterations total number: 1.
The only difference is that logs/PopulateDBFromDNFile.log contains more detailed information about the fetched attributes, mappings and so on. Unfortunately, it doesn't really help with the error, since it produces a similar message:
2020-01-21 10:55:27,530 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] [add_manager_data] [setup_if_lookup] CTGDIS126I Return false.
2020-01-21 10:55:27,530 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] [add_manager_data] [setup_if_lookup] CTGDIS123I Returned object class java.lang.Boolean.
2020-01-21 10:55:27,530 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] [add_manager_data] CTGDIS075I Trying to exit TaskCallBlock.
2020-01-21 10:55:27,531 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] [add_manager_data] CTGDIS076I Succeeded exiting TaskCallBlock.
2020-01-21 10:55:27,531 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] [add_manager_data] CTGDIS057I Hook after_functioncall not enabled.
2020-01-21 10:55:27,531 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] CTGDIS352I Use null Behavior for $_already_lookup_manager.
2020-01-21 10:55:27,531 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] CTGDIS351I Map Attribute $manager_uid [1].
2020-01-21 10:55:27,531 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] CTGDIS353I Script is: conn["$manager_uid"]
2020-01-21 10:55:27,531 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] CTGDIS352I Use null Behavior for $manager_uid.
2020-01-21 10:55:27,531 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] [add_manager_data] CTGDIS057I Hook functioncall_ok not enabled.
2020-01-21 10:55:27,531 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] [add_manager_data] CTGDIS057I Hook default_ok not enabled.
2020-01-21 10:55:27,538 INFO [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] Result: <My Name of the User in dn file>
2020-01-21 10:55:27,591 ERROR [com.ibm.di.log.FileRollerAppender.268b5e1d-d0fc-4a7c-9e12-4d742c44faa5] - [callSyncDB_mod] [ProfileConnector] SqlMapClient operation; bad SQL grammar []; nested exception is com.ibatis.common.jdbc.exception.NestedSQLException:
--- The error occurred while applying a parameter map.
--- Check the TDIProfile.get-InlineParameterMap.
--- Check the statement (query failed).
--- Cause: com.ibm.db2.jcc.c.SqlException: DB2 SQL error: SQLCODE: -551, SQLSTATE: 42501, SQLERRMC: LCUSER;SELECT;EMPINST.EMPLOYEE
It turned out that this was an unfortunate logic mistake on my part. The database is created using SQL files shipped with the Connections Installation Wizard, which I import automatically in a loop. Since that was very slow (about 30 minutes for all scripts), I tried to parallelize them by adding a & at the end of the command and a wait after the loop to make sure all scripts had finished.
- name: Check and create non existing DBs for CNX
  become: yes
  become_user: "{{ db2.instance.name }}"
  shell: |
    db={{ item.name }}
    scripts=({{ item.files | join(' ') }})
    existing_dbs=$(echo -e '{{ existing_dbs.stdout }}')
    echo "Check db ${db}"
    if ! echo ${existing_dbs} | grep -q ${db}; then
      echo "DB ${db} doesn't exist, execute scripts"
      for script in "${scripts[@]}"
      do
        echo "${db}: Execute script ${script}"
        {{ db2.target }}/bin/db2 -td@ -f {{ cnx_sql_dir }}/${script} &
      done
      wait
    fi
  register: db_check
  changed_when: "'execute scripts' in db_check.stdout"
  loop: "{{ cnx.db_scripts }}"
cnx.db_scripts is a mapping of database names to SQL files:
db_scripts:
  - name: PEOPLEDB
    files:
      - profiles/db2/createDb.sql
      - profiles/db2/appGrants.sql
  - name: FORUM
    files:
      # - ...
In retrospect, this was a terrible mistake because I missed the fact that those scripts depend on each other: when profiles/db2/appGrants.sql is executed before profiles/db2/createDb.sql has finished, it can't work because the database doesn't exist yet.
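One way to keep some of the speed-up without breaking that dependency is to run the scripts of a single database strictly in order and only parallelize across databases. The Python sketch below illustrates the idea. It assumes that scripts of different databases don't depend on each other, which I haven't verified. run_all, run_db_scripts and fake_runner are hypothetical names, EXAMPLEDB and its files are placeholders, and the runner is injected so the ordering can be checked without DB2 (a real runner would call the db2 command from the task above).

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_db_scripts(scripts, run_script):
    """Run the scripts of ONE database strictly in the given order."""
    for script in scripts:
        run_script(script)


def run_all(db_scripts, run_script, max_workers=4):
    """Databases run in parallel; scripts inside a database stay sequential."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_db_scripts, db["files"], run_script)
            for db in db_scripts
        ]
        for future in futures:
            future.result()  # re-raises the first failure, if any


# PEOPLEDB entries are from the playbook above; EXAMPLEDB is a made-up placeholder.
db_scripts = [
    {"name": "PEOPLEDB", "files": ["profiles/db2/createDb.sql",
                                   "profiles/db2/appGrants.sql"]},
    {"name": "EXAMPLEDB", "files": ["example/first.sql", "example/second.sql"]},
]

executed = []
_lock = threading.Lock()


def fake_runner(script):
    """Stand-in for the real db2 call; only records the execution order."""
    with _lock:
        executed.append(script)


run_all(db_scripts, fake_runner)

if __name__ == "__main__":
    print(executed)
```

The order between the two databases may vary from run to run, but createDb.sql always finishes before appGrants.sql starts for the same database.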
As a result, TDI's queries failed because the database and tables were only partly created. I didn't notice this immediately, since the machine was redeployed several times during development of the Ansible playbook. Strangely, TDI only failed in 2 of 10 deployments. It seems that DB2 queues the work somehow, so depending on the timing, the people database required for TDI was created successfully on some runs.
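In hindsight the DB2 error itself already pointed in this direction. SQLCODE -551 means that an authorization ID lacks a privilege on an object, and the SQLERRMC tokens name the ID, the operation and the object in that order. That fits an appGrants.sql that never took effect. A small Python sketch for pulling those tokens out of the log line follows; parse_sql551 is a hypothetical helper, not part of TDI or the DB2 driver.

```python
import re

# Matches the token list that follows "SQLERRMC:" for SQLCODE -551.
_SQL551 = re.compile(r"SQLCODE: -551,.*?SQLERRMC: ([^;\s]+);([^;]+);(\S+)")


def parse_sql551(message):
    """Return user, privilege and object from a DB2 -551 message, or None."""
    match = _SQL551.search(message)
    if not match:
        return None
    user, privilege, obj = match.groups()
    return {"user": user, "privilege": privilege, "object": obj}


# Error line copied from the log above.
LOG_LINE = ("--- Cause: com.ibm.db2.jcc.c.SqlException: DB2 SQL error: "
            "SQLCODE: -551, SQLSTATE: 42501, SQLERRMC: LCUSER;SELECT;EMPINST.EMPLOYEE")

parsed = parse_sql551(LOG_LINE)

if __name__ == "__main__":
    print("{user} lacks {privilege} on {object}".format(**parsed))
```

Read this way, the message says that LCUSER has no SELECT privilege on EMPINST.EMPLOYEE, so the grants are the first thing to check.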

Master startup cannot progress, in holding-pattern until region onlined

I've set up an HBase cluster with two nodes and I've noticed the warning "AssignmentManager: STUCK Region-In-Transition", which is preventing the master from starting up.
Node 1: observepreserve.corp.com (Master / Zookeeper)
Node 2: knewshoe.corp.com (Region Server)
Why is this happening and how can I fix it?
In the HBase UI I can see the message below.
b94eb458bf643b46deaf6b00998d1f95 hbase:namespace,,1542792846910.b94eb458bf643b46deaf6b00998d1f95. state=OPENING, ts=Wed Nov 21 09:39:46 UTC 2018 (PT18M9.696S ago), server=knewshoe.corp.com,16020,1542792833282
Logs:
2018-11-21 09:40:45,900 INFO [ReadOnlyZKClient-observepreserve.corp.com:2181#0x4068418f] zookeeper.ZooKeeper: Session: 0x167359e5ad60006 closed
2018-11-21 09:40:45,900 INFO [ReadOnlyZKClient-observepreserve.corp.com:2181#0x4068418f-EventThread] zookeeper.ClientCnxn: EventThread shut down for session: 0x167359e5ad60006
2018-11-21 09:40:49,266 WARN [master/observepreserve:16000:becomeActiveMaster] master.HMaster: hbase:namespace,,1542792846910.b94eb458bf643b46deaf6b00998d1f95. is NOT online; state={b94eb458bf643b46deaf6b00998d1f95 state=OPENING, ts=1542793186164, server=knewshoe.corp.com,16020,1542792833282}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
2018-11-21 09:41:46,095 WARN [ProcExecTimeout] assignment.AssignmentManager: STUCK Region-In-Transition rit=OPENING, location=knewshoe.corp.com,16020,1542792833282, table=hbase:namespace, region=b94eb458bf643b46deaf6b00998d1f95
2018-11-21 09:41:53,267 WARN [master/observepreserve:16000:becomeActiveMaster] master.HMaster: hbase:namespace,,1542792846910.b94eb458bf643b46deaf6b00998d1f95. is NOT online; state={b94eb458bf643b46deaf6b00998d1f95 state=OPENING, ts=1542793186164, server=knewshoe.corp.com,16020,1542792833282}; ServerCrashProcedures=false. Master startup cannot progress, in holding-pattern until region onlined.
Yes, reinstalling HBase causes this problem.
The old metadata was not removed, so you need to delete the HBase metadata from ZooKeeper
and restart HBase. After that everything should be OK. Good luck.

Spark cannot query Hive tables it can see?

I'm running the prebuilt version of Spark 1.2 for CDH 4 on CentOS. I have copied the hive-site.xml file into the conf directory in Spark so it should see the Hive metastore.
I have three tables in Hive (facility, newpercentile, percentile), all of which I can query from the Hive CLI. After I log into Spark and create the Hive Context like so: val hiveC = new org.apache.spark.sql.hive.HiveContext(sc) I am running into an issue querying these tables.
If I run the following command: val tableList = hiveC.hql("show tables") and do a collect() on tableList, I get this result: res0: Array[org.apache.spark.sql.Row] = Array([facility], [newpercentile], [percentile])
If I then run this command to get the count of the facility table: val facTable = hiveC.hql("select count(*) from facility"), I get the following output, which I take to mean that it cannot find the facility table to query it:
scala> val facTable = hiveC.hql("select count(*) from facility")
warning: there were 1 deprecation warning(s); re-run with -deprecation for details
14/12/26 10:27:26 WARN HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
14/12/26 10:27:26 INFO ParseDriver: Parsing command: select count(*) from facility
14/12/26 10:27:26 INFO ParseDriver: Parse Completed
14/12/26 10:27:26 INFO MemoryStore: ensureFreeSpace(355177) called with curMem=0, maxMem=277842493
14/12/26 10:27:26 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 346.9 KB, free 264.6 MB)
14/12/26 10:27:26 INFO MemoryStore: ensureFreeSpace(50689) called with curMem=355177, maxMem=277842493
14/12/26 10:27:26 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 49.5 KB, free 264.6 MB)
14/12/26 10:27:26 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.0.2.15:45305 (size: 49.5 KB, free: 264.9 MB)
14/12/26 10:27:26 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
14/12/26 10:27:26 INFO SparkContext: Created broadcast 0 from broadcast at TableReader.scala:68
facTable: org.apache.spark.sql.SchemaRDD =
SchemaRDD[2] at RDD at SchemaRDD.scala:108
== Query Plan ==
== Physical Plan ==
Aggregate false, [], [Coalesce(SUM(PartialCount#38L),0) AS _c0#5L]
Exchange SinglePartition
Aggregate true, [], [COUNT(1) AS PartialCount#38L]
HiveTableScan [], (MetastoreRelation default, facility, None), None
Any assistance would be appreciated. Thanks.
scala> val facTable = hiveC.hql("select count(*) from facility")
Great! You have an RDD, now what do you want to do with it?
scala> facTable.collect()
Remember that an RDD is an abstraction on top of your data and is not materialized until you invoke an action on it such as collect() or count().
You would get a very obvious error if you tried to use a non-existent table name.
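As a rough analogy in plain Python (this is not Spark code, and LazyQuery is a made-up class), the object you get back only remembers how to compute the result. Nothing runs until you ask for the data:

```python
class LazyQuery:
    """Toy stand-in for an RDD: stores the computation, runs it only on demand."""

    def __init__(self, compute):
        self._compute = compute
        self.runs = 0  # how many times the computation was actually executed

    def collect(self):
        self.runs += 1
        return self._compute()


# Hypothetical table contents standing in for the facility table.
rows = [("a",), ("b",), ("c",)]

# Building the query does no work yet, just like hiveC.hql(...) above.
count_query = LazyQuery(lambda: [(len(rows),)])
runs_before = count_query.runs

# The action triggers the computation.
result = count_query.collect()
runs_after = count_query.runs

if __name__ == "__main__":
    print(runs_before, result, runs_after)
```

The output in the question corresponds to the step before collect(): a description of the plan, not a failure.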