Data Science Experience responds with an empty Hive table - hive

From Data Science Experience, I am able to make a connection to the Hive database in BigInsights and read the table schema, but Data Science Experience does not seem to be able to read the table contents, as I get a count of zero! Here are some of my settings:
from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = (SparkConf().set("com.ibm.analytics.metadata.enabled", "false"))
spark = SparkSession.builder.enableHiveSupport().getOrCreate()

dash = {
    'jdbcurl': 'jdbc:hive2://nnnnnnnnnnn:10000/;ssl=true;',
    'user': 'xxxxxxxxxx',
    'password': 'xxxxxxxxx',
}

spark.conf

offers = spark.read.jdbc(dash['jdbcurl'],
                         table='offers',
                         properties={"user": dash["user"],
                                     "password": dash["password"]})
offers.count() returns 0, and offers.show() returns:
+-----------+----------+
|offers.name|offers.age|
+-----------+----------+
+-----------+----------+
Thanks.

Yes, I was able to see the same behaviour with the Hive JDBC connector.
I tried this Python connector instead and it returned the correct count:
https://datascience.ibm.com/docs/content/analyze-data/python_load.html
from ingest.Connectors import Connectors

HiveloadOptions = { Connectors.Hive.HOST              : 'bi-hadoop-prod-4222.bi.services.us-south.bluemix.net',
                    Connectors.Hive.PORT              : '10000',
                    Connectors.Hive.SSL               : True,
                    Connectors.Hive.DATABASE          : 'default',
                    Connectors.Hive.USERNAME          : 'charles',
                    Connectors.Hive.PASSWORD          : 'march14march',
                    Connectors.Hive.SOURCE_TABLE_NAME : 'student' }

HiveDF = sqlContext.read.format("com.ibm.spark.discover").options(**HiveloadOptions).load()

HiveDF.printSchema()
HiveDF.show()
HiveDF.count()
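For reference, the same options can be pointed at the offers table from the question; this is only a minimal sketch, reusing the masked host and credentials from the JDBC attempt above rather than real values:

from ingest.Connectors import Connectors

# Hypothetical adaptation of the options above to the asker's 'offers' table;
# host, username and password are the masked placeholders from the question.
offersLoadOptions = { Connectors.Hive.HOST              : 'nnnnnnnnnnn',
                      Connectors.Hive.PORT              : '10000',
                      Connectors.Hive.SSL               : True,
                      Connectors.Hive.DATABASE          : 'default',
                      Connectors.Hive.USERNAME          : 'xxxxxxxxxx',
                      Connectors.Hive.PASSWORD          : 'xxxxxxxxx',
                      Connectors.Hive.SOURCE_TABLE_NAME : 'offers' }

offersDF = sqlContext.read.format("com.ibm.spark.discover").options(**offersLoadOptions).load()
offersDF.count()   # should return the actual row count instead of 0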
Thanks,
Charles.

Related

Scala MatchError while joining a dataframe and a dataset

I have one dataframe and one dataset:
Dataframe 1:
+------------------------------+-----------+
|City_Name                     |Level      |
+------------------------------+-----------+
|{City -> Paris}               |86         |
+------------------------------+-----------+
Dataset 2:
+-----------------------------------+-----------+
|Country_Details                     |Temperature|
+-----------------------------------+-----------+
|{City -> Paris, Country -> France}  |31         |
+-----------------------------------+-----------+
I am trying to join them by checking whether the map in the column "City_Name" is included in the map of the column "Country_Details".
I am using the following UDF to check the condition:
val mapEqual = udf((col1: Map[String, String], col2: Map[String, String]) => {
  if (col2.nonEmpty) {
    col2.toSet subsetOf col1.toSet
  } else {
    true
  }
})
And I am making the join this way:
dataset2.join(dataframe1, mapEqual(dataset2("Country_Details"), dataframe1("City_Name"), "leftanti"))
However, I get the following error:
terminated with error scala.MatchError: UDF(Country_Details#528) AS City_Name#552 (of class org.apache.spark.sql.catalyst.expressions.Alias)
Has anyone got the same error before?
I am using Spark version 3.0.2 and SQLContext, with the Scala language.
There are two issues here. The first is that when you call your function you pass an extra parameter, leftanti (you meant to pass it to the join function, but passed it to the UDF instead).
The second is that the UDF logic won't work as expected; I suggest you use this instead:
val mapContains = udf { (col1: Map[String, String], col2: Map[String, String]) =>
  col2.keys.forall { key =>
    col1.get(key).exists(_ eq col2(key))
  }
}
Result:
scala> ds.join(df1, mapContains(ds("Country_Details"), df1("City_Name")), "leftanti").show(false)
+----------------------------------+-----------+
|Country_Details                   |Temperature|
+----------------------------------+-----------+
|{City -> Paris, Country -> France}|31         |
+----------------------------------+-----------+

How to delete some data from an External Table in Databricks

I am trying to delete some data from Azure SQL from Databricks using JDBC, and it generates an error each time. I have a very simple query: delete from table1 where date>'2022-05-01'.
I searched many documents online but did not find any appropriate solution for this. Please find the code below.
jdbcUsername = "userName"
jdbcPassword = "password"   # these come from Azure Key Vault
jdbcHostname = "host server name"
jdbcPort = "1433"
jdbcDatabase = "db_test"
jdbcUrl = "jdbc:sqlserver://{0}:{1};database={2}".format(jdbcHostname, jdbcPort, jdbcDatabase)

connectionProperties = {
    "user"     : jdbcUsername,
    "password" : jdbcPassword,
    "driver"   : "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

pushdown_delete_query = f"(delete from table1 where date>'2022-05-01') table_alias"
print(pushdown_delete_query)
spark.read.jdbc(url=jdbcUrl, table=pushdown_delete_query, properties=connectionProperties)
The query returns the error: com.microsoft.sqlserver.jdbc.SQLServerException: A nested INSERT, UPDATE, DELETE, or MERGE statement must have an OUTPUT clause
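spark.read.jdbc treats its table argument as something to SELECT from, so the DELETE ends up nested inside a SELECT and SQL Server rejects it with the OUTPUT-clause error. One way around this, sketched below only as an assumption (it is not from the original post), is to issue the DELETE over a plain JDBC connection through java.sql.DriverManager reached via the Spark JVM gateway, reusing the same jdbcUrl and credentials:

# Minimal sketch: run the DELETE over a direct JDBC connection instead of
# spark.read.jdbc. Assumes jdbcUrl, jdbcUsername and jdbcPassword from above.
driver_manager = spark._sc._gateway.jvm.java.sql.DriverManager
conn = driver_manager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
try:
    stmt = conn.createStatement()
    rows_deleted = stmt.executeUpdate("delete from table1 where date>'2022-05-01'")
    print(rows_deleted)   # number of rows removed
    stmt.close()
finally:
    conn.close()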

How to extract a value from a JSON format

I need to extract the email from an intricate 'dict' (I am new to SQL).
I have seen several previous posts on the same topic (e.g. this one); however, none seem to work on my data.
select au.details
from table_au au
result:
{
  "id": 3526,
  "contacts": [
    {
      "contactType": "EMAIL",
      "value": "name#email.be",
      "private": false
    },
    {
      "contactType": "PHONE",
      "phoneType": "PHONE",
      "value": "025/6251111",
      "private": false
    }
  ]
}
I need:
name#email.be
select d.value -> 0 -> 'value' as Email
from json_each('{"id":3526,"contacts":[{"contactType":"EMAIL","value":"name#email.be","private":false},{"contactType":"PHONE","phoneType":"PHONE","value":"025/6251111","private":false}]}') d
where d.key::text = 'contacts'
Output:
|   | email           |
|---|-----------------|
| 1 | "name#email.be" |
You can run it here: https://rextester.com/VHWRQ89385

Get all classes in Parse-server

I'm writing a backup job, and need to fetch all classes in Parse-server, so I can then query all rows and export them.
How do I fetch all classes?
Thanks
Query the schemas collection.
GET /parse/schemas
You probably need to use the master key on the query. I'm not sure what language you're writing your job in, but it should be simple to create a REST query, or to create a Node.js script and use the JavaScript/Node API.
-- Added after the comment below --
var Parse = require('parse/node').Parse;

Parse.serverURL = "http://localhost:23740/parse";
Parse.initialize('APP_ID', 'RESTKEY', 'MASTERKEY');

var Schema = Parse.Object.extend("_SCHEMA");
var query = new Parse.Query(Schema);

query.find({
  success: (results) => {
    console.log(JSON.stringify(results));
  },
  error: (err) => {
    console.log("err : " + JSON.stringify(err));
  }
});
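If the job is not written in Node, the same schemas endpoint can be called over plain REST; the following is only a minimal Python sketch, assuming the standard Parse REST headers and the requests package, and is not part of the original answer:

import requests

# Hypothetical Python equivalent of the call above; the server URL and keys are
# the placeholders from the Node example, and schema reads need the master key.
SERVER_URL = "http://localhost:23740/parse"
headers = {
    "X-Parse-Application-Id": "APP_ID",
    "X-Parse-Master-Key": "MASTERKEY",
}

resp = requests.get(SERVER_URL + "/schemas", headers=headers)
resp.raise_for_status()
for schema in resp.json().get("results", []):
    print(schema["className"])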

How to generate a list of tables for a DB using RoseDB

I have to list the tables for a given database using Rose::DB. I know the MySQL command for it:
SHOW TABLES IN DB_NAME;
How do I implement this in Rose::DB? Please help.
It's not really a Rose::DB-specific question. Simply use the database handle as you normally would in DBI:
package My::DB {
    use Rose::DB;
    our @ISA = qw(Rose::DB);

    My::DB->register_db(
        domain => 'dev',
        type   => 'main',
        driver => 'mysql',
        ...
    );

    My::DB->default_domain('dev');
    My::DB->default_type('main');
}

use Carp;

my $db  = My::DB->new();
my $sth = $db->dbh->prepare('SHOW TABLES');
$sth->execute || croak "query failed";

while (my $row = $sth->fetchrow_arrayref) {
    print "$row->[0]\n";
}