The data is in JSON format and I'm supposed to extract some fields from it. The task reads: "You have subscribed to the Gnip data feed for Twitter and have received the feeds in JSON format. Use the Hive JSON SerDe to extract the following fields using Hive."
data
I need a plan for solving this problem.
Can you show a sample of the data structure?
Once the JSON is stored in Hive as a string column, you can use the get_json_object function to extract individual fields from it.
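To make the idea of path-based extraction concrete, here is a minimal Python sketch of roughly what get_json_object does for simple `$.a.b` paths. It is only an analogue for illustration, not Hive's implementation, and the field names in the sample record (`actor`, `displayName`, `followersCount`) are hypothetical because the actual feed structure was not shown.

```python
import json

def get_json_field(json_text, path):
    """Rough analogue of Hive's get_json_object for simple '$.a.b' paths only."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    try:
        value = json.loads(json_text)
    except json.JSONDecodeError:
        return None  # treat malformed JSON as a missing value
    for key in path[1:].split("."):
        if not key:
            continue  # skip the empty piece that follows '$'
        if not isinstance(value, dict) or key not in value:
            return None  # missing key: return None instead of raising
        value = value[key]
    return value

# Hypothetical record; the real Gnip structure is not shown in the thread.
record = '{"id": "tag:1", "actor": {"displayName": "Ann", "followersCount": 42}, "body": "hello"}'

name = get_json_field(record, "$.actor.displayName")
followers = get_json_field(record, "$.actor.followersCount")
missing = get_json_field(record, "$.actor.location")
```

In Hive itself the equivalent would be one get_json_object call per field in the SELECT list, each with its own path.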
When I load Parquet files into a BigQuery table, the stored values look weird. It seems to be related to the encoding of BYTES fields, or something similar.
Here's the format of the created fields:
When I read the table with the fields cast, I get readable values.
I found the solution here
My question is: why is BigQuery behaving like this?
According to this GCP documentation, some Parquet data types can be converted into more than one BigQuery data type. A workaround is to annotate the Parquet column with the logical type you want BigQuery to use.
For example, to convert the Parquet INT32 data type to the BigQuery DATE data type, specify the following:
optional int32 date_col (DATE);
And another way is to add the schema to the bq load command:
bq load --source_format=PARQUET --noreplace --noautodetect --parquet_enum_as_string=true --decimal_target_types=STRING [project]:[dataset].[tables] gs://[bucket]/[file].parquet Column_name:Data_type
I am using Hive for JSON storage. I have created a table with a single string column that contains the whole JSON document. I have tested the get_json_object function that Hive offers, but I am not able to write a query that iterates over all the subdocuments in a list and finds a value in a specific field.
In MongoDB, this problem can be solved by using $elemMatch as the documentation says.
Is there any way to do something like this in Hive?
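To pin down the behaviour being asked for, here is a small Python sketch of $elemMatch-style matching: a row qualifies only if a single subdocument in the list satisfies every condition at once. This is not a Hive solution, only an illustration of the semantics; the field names (`results`, `product`, `score`) and the sample rows are hypothetical.

```python
import json

def elem_match(json_text, list_field, conditions):
    """Return True if any one subdocument in list_field meets all conditions."""
    doc = json.loads(json_text)
    items = doc.get(list_field, [])
    if not isinstance(items, list):
        return False
    return any(
        isinstance(item, dict)
        and all(item.get(key) == expected for key, expected in conditions.items())
        for item in items
    )

# Hypothetical rows, each standing in for the single string column of the table.
rows = [
    '{"id": 1, "results": [{"product": "abc", "score": 10}, {"product": "xyz", "score": 5}]}',
    '{"id": 2, "results": [{"product": "abc", "score": 5}]}',
    '{"id": 3, "results": [{"product": "xyz", "score": 10}, {"product": "abc", "score": 5}]}',
]

wanted = {"product": "xyz", "score": 5}
matching_ids = [json.loads(row)["id"] for row in rows if elem_match(row, "results", wanted)]
```

Row 3 is excluded even though "xyz" and 5 both appear in it, because they appear in different subdocuments; that per-element requirement is what separates $elemMatch from checking each field independently.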
I have a file named "Bal_123.csv". When I search its data in Splunk Web with the query sourcetype="Bal_123.csv", I get the latest indexed raw data in comma-separated format. For further processing, however, I need that data in JSON format.
Is there any way to get the data in JSON format directly? I know I can export the data as JSON, but I am using a REST call to get data from Splunk, and I need the JSON data in Splunk itself.
Can anyone help me with this?
Splunk will parse JSON, but will not display data in JSON format except, as you've already noted, in an export.
You may be able to play with the format command to get something close to JSON.
A better option might be to wrap your REST call in some Python that converts the results into JSON.
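A minimal sketch of that conversion step, assuming the REST call returns the results as CSV text with a header row. The REST call itself is left out, and the column names in the sample (`account`, `balance`) are hypothetical.

```python
import csv
import io
import json

def csv_text_to_json(csv_text):
    """Convert CSV text (first line is the header) into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps(list(reader), indent=2)

# Stand-in for the body returned by the REST call; the real columns will differ.
sample = "account,balance\nA-1,100\nA-2,250\n"

as_json = csv_text_to_json(sample)
records = json.loads(as_json)
```

Every value comes out as a string, because CSV carries no type information; convert numeric columns explicitly if the consumer needs numbers.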
When we create an ORC table in Hive, we can see that the data is compressed and not really readable in HDFS. So how is Hive able to convert that compressed data into the readable format we see when we run a simple SELECT * query on that table?
Thanks for suggestions!!
By using the ORC SerDe when creating the table. You have to provide the package name of the SerDe class.
ROW FORMAT ''.
A SerDe deserializes data stored in a particular format into objects that Hive can process, and serializes those objects again to store them back in HDFS.
Hive uses a "SerDe" (Serializer/Deserializer) to do that. When you create a table you specify the file format; in your case it is ORC, i.e. "STORED AS ORC". Hive uses the ORC library (a JAR file) internally to convert the data into a readable format. To learn more about Hive internals, search for "Hive SerDe" and you will see how the data is converted to objects and vice versa.
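The round trip can be illustrated with a toy Python sketch. It is not ORC and not Hive's SerDe interface: the class name `ToySerDe` is made up, and JSON plus zlib stand in for the real columnar format and compression. It only shows why the stored bytes look unreadable while a query still returns plain rows.

```python
import json
import zlib

class ToySerDe:
    """Toy stand-in for a SerDe: rows <-> compressed bytes (not the ORC format)."""

    def serialize(self, rows):
        # Write path: row objects become compressed bytes for storage.
        return zlib.compress(json.dumps(rows).encode("utf-8"))

    def deserialize(self, blob):
        # Read path: stored bytes become row objects the engine can work with.
        return json.loads(zlib.decompress(blob).decode("utf-8"))

serde = ToySerDe()
rows = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]

stored = serde.serialize(rows)        # what would sit in the file: opaque bytes
restored = serde.deserialize(stored)  # what a SELECT would hand back: readable rows
```

Looking at `stored` directly is comparable to opening the ORC file in HDFS, while `restored` corresponds to the rows returned by the query.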
I exported the data of a BigQuery dataset to a JSON file using the API, but in the JSON I downloaded the properties are saved as an array of objects with the key name "V" instead of the original property names.
I don't want to export the table of the dataset to Google Storage, nor to execute a specific query.
I need to get the table data of the dataset, with the original schema, into a JSON file using the API.
I am using the api:
Function:
Tabledata: list: Retrieves table data from a specified set of rows.
https://cloud.google.com/bigquery/docs/reference/v2/tabledata/list#request
Function:
Tables: get This method does not return the data in the table, it only returns the table resource, which describes the structure of this table.
https://cloud.google.com/bigquery/docs/reference/v2/tables/get#request
Thank you,
Best regards,
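One way to get the original property names back is to combine the two calls listed above: take the field names from the Tables: get response and zip them with the cell values from the Tabledata: list response, which returns each row as an `f` array of `{"v": ...}` cells. The sketch below assumes a flat schema with no nested or repeated fields, and both sample payloads are hypothetical stand-ins for the real responses.

```python
def rows_to_dicts(schema_fields, rows):
    """Pair each cell's 'v' value with the field name at the same position."""
    names = [field["name"] for field in schema_fields]
    return [
        dict(zip(names, (cell["v"] for cell in row["f"])))
        for row in rows
    ]

# Hypothetical excerpt of a Tables: get response (only the schema part is used).
table_resource = {
    "schema": {"fields": [
        {"name": "user_id", "type": "INTEGER"},
        {"name": "city", "type": "STRING"},
    ]}
}

# Hypothetical excerpt of a Tabledata: list response.
list_response = {
    "rows": [
        {"f": [{"v": "1"}, {"v": "Paris"}]},
        {"f": [{"v": "2"}, {"v": "Oslo"}]},
    ]
}

named_rows = rows_to_dicts(
    table_resource["schema"]["fields"],
    list_response.get("rows", []),
)
```

The values stay as strings, which is how they arrive in the response; casting them according to each field's `type` would be a separate step.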