I have a Hive table that has around 50 columns. Finding a specific column using the DESCRIBE command or by running a SELECT is becoming tedious.
Is there a way we can search for the existence of a column in a Hive table?
Also, can we search using a substring of the column name instead of the complete name? That would be much more useful.
Thanks in advance.
I believe there is no command to check this directly, but you have the options below.
1) Query the column by name; if you get a NOT FOUND exception, the column doesn't exist.
2) Run DESCRIBE first, then iterate through the results and search for the column you are looking for.
3) Use the Hive metastore client and call HiveMetastoreClient.getFields; this returns the list of columns, which you can iterate through and search for your column.
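For option 2), SHOW COLUMNS gives a cleaner list than DESCRIBE, and, if I remember correctly, Hive 3.0 added an optional wildcard pattern that covers your substring case directly (the table name below is just an example):
-- lists every column of the table
SHOW COLUMNS IN my_table;
-- Hive 3.0+ only: filter column names by a wildcard pattern
SHOW COLUMNS IN my_table LIKE '*amount*';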
I want to search with a field that matches any of the values in a lookup table.
For now, I have used the "where in" query below. But I still want to query with the lookup table instead of manually putting all those values in double quotes inside the in clause.
|where in(search,"abcd","bcda","efsg","zyca");
First, you need to create a lookup in the Splunk Lookup manager. Here you can specify a CSV file or KMZ file as the lookup and name the lookup definition. Be sure to share this lookup definition with the applications that will use it.
Once you have a lookup definition created, you can use it in a query with the Lookup Command. Say you named your lookup definition "my_lookup_csv", and your lookup column in your search is "event_column", and your csv column names are "column1", "column2", etc. Your search query will now end in:
| lookup my_lookup_csv column1 as event_column
I'm still relatively new to SQL and Pentaho.
I've pulled a table with two different IDs and need to run a query for each specific instance.
For example,
SELECT *
FROM Table
WHERE RecordA = 'value in column A'
AND RecordB = 'value in column B'
I need the results back, either appended to new columns in the original table or part of their own text file output.
I was initially looking at using a formula for this inside of Pentaho, but couldn't quite figure it out. Since I have the query written, I threw it into Excel and got the concatenated results (so a string of 350 or so queries that I need to run). I'm just not sure how to accomplish this - I tried the Execute SQL Script step inside of Pentaho, but it doesn't seem to produce output?
Any direction would be useful. I've searched a little but have come up short so far, possibly because I am still pretty new to this platform.
You can accomplish this in a lot of ways, with a "Database Lookup" step for example, but I usually do it in a quite simple way, and here is an example for your tests. I hope it helps.
The idea here is to have two Table Input steps; the first one will fetch the IDs we want to look at. For example, you may use a SQL query that returns just the ID column. The result will be a one-column stream of rows.
Next we have a second Table Input that reads the rows received and executes its query once for each incoming row. In its step options, point "Insert data from step" at the first Table Input and check "Execute for each row".
What it does is replace each placeholder '?' with the data that is received. If you need two columns, use two '?' placeholders, but remember that the first one will be replaced with the first column and the second one with the second column.
And you are good to go. Test it a couple of times and good luck.
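As a sketch with the table and column names from your question (the exact queries are assumptions, not something from your setup), the pair of Table Inputs could look like:
-- first Table Input: the stream of lookup values, one row per query to run
SELECT DISTINCT RecordA, RecordB
FROM Table;
-- second Table Input: each '?' is filled in from the incoming row, in column order
SELECT *
FROM Table
WHERE RecordA = ?
  AND RecordB = ?;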
I accidentally added a wrong column to my BigQuery table schema.
Instead of reloading the complete table (million of rows), I would like to know if the following is possible:
remove the bad rows (rows with values in the wrong column) by running a "select *" query on the table with some kind of filter, and saving the result to the same table.
remove the (now) unused column.
Is this functionality (or similar) supported?
Possibly the "save result to table" functionality can have a "compact schema" option.
The simplest and most time-saving way to remove a column from BigQuery, according to the documentation:
ALTER TABLE [table_name] DROP COLUMN IF EXISTS [column_name]
If your table does not contain record/repeated type fields, your simple option is:
1) Select the valid columns while filtering out the bad records into a new temp table:
SELECT < list of original columns >
FROM YourTable
WHERE < filter to remove bad entries here >
Write the above to a temp table - YourTable_Temp (a sketch follows these steps).
2) Make a backup copy of the "broken" table - YourTable_Backup.
3) Delete YourTable.
4) Copy YourTable_Temp to YourTable.
5) Check that all looks as expected and, if so, get rid of the temp and backup tables.
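A minimal sketch of step #1 in today's standard SQL (the dataset, column, and filter names are placeholders, not from your question):
-- step #1: keep only the original columns, drop the bad rows,
-- and write the result to the temp table
CREATE TABLE mydataset.YourTable_Temp AS
SELECT good_column_1, good_column_2  -- the original columns, without the wrong one
FROM mydataset.YourTable
WHERE wrong_column IS NULL;          -- the filter that removes the bad entries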
Please note: the cost of step #1 above is exactly the same as the action in the first bullet of your question. The rest of the actions (the copies) are free.
In case you have repeated/record fields, you can still execute the above plan, but in step #1 you will need to use some BigQuery user-defined functions to get the proper schema in the output.
You can see the examples below; of course this will require some extra dev work, but if you are in a critical situation, this should work for you.
Create a table with Record type column
create a table with a column type RECORD
I hope at some point the Google BigQuery team will add better support for cases like yours, where you need to manipulate and output repeated/record data, but for now this is the best workaround I have found, at least for myself.
Below is the code to do it. Let's say c is the column that you want to delete.
CREATE OR REPLACE TABLE transactions.test_table AS
SELECT * EXCEPT (c) FROM transactions.test_table;
The second method, and my favorite, is to follow the steps below.
1) Write a SELECT query that excludes the column you want to drop.
2) Go to Query Settings.
3) In the Destination section, check "Set a destination table for query results" and enter the project name, dataset name, and table name exactly the same as the table you queried in step 1.
4) Under "Destination table write preference", select "Overwrite table".
5) Save the query settings and run the query.
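For step 1, the EXCEPT form from the first method works as the query body, reusing the names from the example above:
-- everything except the unwanted column
SELECT * EXCEPT (c) FROM transactions.test_table;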
"Save results to table" is the way to go here. Try it on the big table with just the columns you are interested in, and you can apply a LIMIT to keep the result small while you test.
I was wondering if there was a way to search an entire SQLite database for one specific word. I do not know the column that it is in or even the table that it is in.
The table and row/column that contains this specific word also contains the other entries that I need to edit.
In-short:
Need to find a specific word
Can't query (I don't think I can, at least) since I don't know the table or column name where it's located.
I need to know where this specific word is referenced, in what table and row, so I can access the other entries that are alongside it.
Basically, is there a CTRL+F functionality for SQLite that searches the entirety of the SQLite file?
I have Mac/Windows/Linux machines. I am not limited by software if that is a solution.
Any such functionality would essentially be running queries that check every column of every table. You can do that via a script that runs the following SQL:
1) Get a list of all the tables:
select name from sqlite_master where type = 'table'
2) For each table, get all of its columns (the column name is available in the name field):
pragma table_info(cows)
3) Then for each table, generate a query that checks every field and run it:
select *
from cows
where name like '%Daisy%'
   or owner like '%Daisy%'
   or farm like '%Daisy%'
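In SQLite 3.16+, pragmas are also available as table-valued functions, so steps 1-3 can be collapsed into a single query that generates all the per-column searches (using the 'Daisy' example from above); run the statements it prints to find your word:
-- emits one ready-to-run search statement per column of every table
SELECT 'SELECT * FROM "' || m.name || '" WHERE "' || p.name || '" LIKE ''%Daisy%'';'
FROM sqlite_master AS m, pragma_table_info(m.name) AS p
WHERE m.type = 'table';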
Hi, I have a table which was designed by a lazy developer who did not create it in third normal form. He saved arrays in the table instead of using an M:M relation. And the application is running, so I cannot change the database schema.
I need to query the table like this:
SELECT * FROM myTable
WHERE usergroup = 20
where the usergroup field contains data like this: 17,19,20, or it could also be only 20 or only 19.
I could search with LIKE:
SELECT * FROM myTable
WHERE usergroup LIKE '%20%'
but in this case it would also match a field that contains 200, for example.
Does anybody have any idea?
Thanks
Fix the bad database design.
A short-term fix is to add a related table with the correct structure. Add a trigger to parse the info in the old field into the related table on insert and update. Then write a script to parse out the existing data. Now you can properly query, but you haven't broken any of the old code. Then you can search for the old code and fix it. Once you have done that, just change how data is inserted or updated in the original table to use the new table, and drop the old column.
Write a table-valued user-defined function (a UDF in SQL Server; I am sure it will have a different name in other RDBMSs) to parse the values of the column containing the list, which is stored as a string. For each item in the comma-delimited list, your function should return a row in the table result. When you are writing a query like this, query against the results returned from the UDF.
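As a sketch in SQL Server, where the built-in STRING_SPLIT (2016+) can stand in for a hand-written UDF, with the table and column names from the question:
-- expand each row into one row per list item, then keep exact matches on 20
SELECT t.*
FROM myTable AS t
CROSS APPLY STRING_SPLIT(t.usergroup, ',') AS s
WHERE s.value = '20';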
Write a function to convert a comma-delimited list to a table. It should be pretty simple. Then you can use IN().
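With such a function, the IN() form might look like this (again assuming STRING_SPLIT as the splitter; a hand-rolled function would slot in the same way):
-- true only when 20 appears as a whole item in the comma list
SELECT *
FROM myTable
WHERE '20' IN (SELECT value FROM STRING_SPLIT(usergroup, ','));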