I have many databases and tables in Hive; how can I search for and find a column of interest? - hive

Long story short, I realized that Hue (Hive/Impala) is not like Microsoft SQL Server, where you can run the following to look for a column of interest:
Select * from information_schema.columns where column_name like '%The_Column_of_Interest%'
1st scenario: I know which database I need, and I have to search through its tables to find the column of interest.
2nd scenario: I don't even know which database holds the right table, and therefore the column of interest.
I realized that in Hue there is no option to search for a column. All I can see is Table Search!
Having said that, there should be a way to find the column of interest in both scenarios.
Scenario 2 is of course difficult to approach; the 1st one looks a bit easier.
I did my research and came up with the idea that running some code on the shell command line might help find the target column. However, that requires further investigation in a layer that I am not very familiar with (the Metastore, etc.).
Therefore, here is my question.
Assume we are discussing the 1st scenario: how can I search for and find a column when I have no knowledge about the tables at all? I cannot guess and try every table until I find the one with the column I am looking for. What would you suggest, and what is your strategy? Thank you in advance. :)

Good Day H2019
Here are some commands that should help you out to explore the different tables that you have access to:
Find a table or a database
show tables like 'ben*';
Look at the table definition
show create table <table>;
Get table information
describe my_table_01;
Get even more information
describe extended table_name;
Get more information in a pretty format
describe formatted table_name;
If you have access to Apache Ranger, I also find it useful to look at table permissions (and see who's using what).
If you use Apache Atlas, it is helpful for seeing where data comes from. (It keeps data lineage information and may help you understand how things work.)
Don't forget that you can look at HDFS to find databases and tables if they're in /hive/warehouse/. This can also help you understand when things were created.
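For the 1st scenario, the show tables and describe commands above can be chained from a script, so nobody has to guess table by table. What follows is only a minimal sketch in Python: run_query is a hypothetical callable (not part of Hive or Hue) that sends one HiveQL statement through whatever client you have and returns the rows as tuples, and find_columns is a made-up helper name.

```python
from fnmatch import fnmatch


def find_columns(run_query, database, pattern):
    """Return (table, column) pairs in `database` whose column name matches `pattern`.

    `run_query` is a hypothetical callable: it takes one HiveQL statement
    and returns a list of row tuples. `pattern` uses shell-style wildcards.
    """
    matches = []
    for (table,) in run_query(f"SHOW TABLES IN {database}"):
        seen = set()
        for row in run_query(f"DESCRIBE {database}.{table}"):
            column = (row[0] or "").strip()
            # For partitioned tables DESCRIBE also returns blank rows,
            # '# ...' header rows, and the partition columns a second time.
            if not column or column.startswith("#") or column in seen:
                continue
            seen.add(column)
            if fnmatch(column.lower(), pattern.lower()):
                matches.append((table, column))
    return matches
```

Calling find_columns(run_query, "my_db", "*customer*") would list every table in my_db with a column whose name contains "customer". For the 2nd scenario the same loop could be wrapped in one more loop over the output of show databases, at the cost of many more DESCRIBE calls.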

Related

BigQuery Table Last Query Date

I use BigQuery heavily and there are now quite a number of intermediate tables. Because teammates can upload their own tables, I do not understand all of them well.
I want to check whether a table has not been used for a long time, and then decide whether it can be deleted manually.
Does anyone know how to do this?
Many thanks
You could use logs if you have access. Once you are familiar with how to filter log entries, you can find out about your usage quite easily: https://cloud.google.com/logging/docs/quickstart-sdk#explore
There's also the possibility of exporting logs to BigQuery, so you could analyze them using SQL, which I guess is even more convenient.
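You can get table-specific metadata via the __TABLES__ meta-table.
SELECT *, TIMESTAMP_MILLIS(LAST_MODIFIED_TIME) AS MODIFIED_DATE
FROM [DATASET].__TABLES__
This snippet should give you each table's last modification date. Note that this is when the table was last changed, not when it was last queried, so it is only a proxy for usage.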
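Once the rows from __TABLES__ are exported, picking out the candidates for cleanup is a small filter on last_modified_time, which is stored as milliseconds since the epoch. A rough sketch in Python, assuming the rows have already been fetched as (table_id, last_modified_time) pairs; stale_tables and its parameters are names invented for this illustration:

```python
from datetime import datetime, timedelta, timezone


def stale_tables(rows, days, now=None):
    """Return (table_id, date) pairs for tables not modified in the last `days` days.

    `rows` is an iterable of (table_id, last_modified_time_ms) pairs,
    for example the result of querying __TABLES__.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    stale = []
    for table_id, modified_ms in rows:
        # last_modified_time is in milliseconds since the Unix epoch
        modified = datetime.fromtimestamp(modified_ms / 1000, tz=timezone.utc)
        if modified < cutoff:
            stale.append((table_id, modified.date().isoformat()))
    # Oldest tables first
    return sorted(stale, key=lambda item: item[1])
```

A table that is read every day but never rewritten would still show up here, so the result is a list of tables to review, not a list of tables to delete.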

I'm being asked to create IN queries for different GUIDs...huh?

I'm a GIS intern.
I've been asked:
"Could you also create IN queries for the different sets of GUID’s? Here is an example:
"GlobalID" IN '{58BEE03F-1656-4BD5-B53D-B887E93A5287}', '{009C7364-8D77-46B3-A531-B60ED4E5B407}', '{0105263C-1305-4AB9-A00A-4BED01832177}')"
I'm not sure what that means or why I'd have to do it. What I can tell you is that I have several .shp files that I have geocoded and then created global IDs for.
I've googled this for hours now and am no closer to understanding the request than I was. It could be that the answer is staring me in the face but I don't think I know enough to know that.
Thank you,
Kathy
In order to create and understand IN queries, first you'll have to understand the basics of a query. It sounds like this might not be something you're familiar with, so I'll start with that.
There are 3 main parts to a query: SELECT, FROM, and WHERE.
SELECT is the information (or columns) you want to return. You can SELECT * to select all columns or SELECT specificColumn1, specificColumn2 to select specific columns.
The next step is the FROM statement. FROM determines what table(s) you will be querying. You can query multiple tables here if you like, and tables can also be aliased like so: FROM table1 t1.
The third statement is the WHERE statement, which specifies any conditions that the query is required to meet. In your case, this is where your IN statement will go. There are a ton of different keywords you can use here, but I'll just give a quick sample query for you (keep in mind I have no idea what your schema looks like).
SELECT *
FROM GUIDData
WHERE GlobalID IN ('{58BEE03F-1656-4BD5-B53D-B887E93A5287}', '{009C7364-8D77-46B3-A531-B60ED4E5B407}', '{0105263C-1305-4AB9-A00A-4BED01832177}');
This query will give you all the data for each item in the GUIDData table with a global ID of {58BEE03F-1656-4BD5-B53D-B887E93A5287}, {009C7364-8D77-46B3-A531-B60ED4E5B407}, or {0105263C-1305-4AB9-A00A-4BED01832177}.
Did this help?
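If there are many sets of GUIDs, typing each IN list by hand gets error-prone, so the expression can be generated. Here is a small sketch in Python that builds the text of the IN condition from a list of IDs and then checks it against an in-memory SQLite table. The GUIDData table comes from the sample query above; the in_clause helper, the name column, and the all-zero GUID are assumptions made up for the demonstration.

```python
import sqlite3


def in_clause(column, values):
    """Build the text '"column" IN ('v1', 'v2', ...)' from a list of strings."""
    # Double any single quote inside a value so the literal stays valid SQL
    quoted = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
    return f'"{column}" IN ({quoted})'


guids = [
    "{58BEE03F-1656-4BD5-B53D-B887E93A5287}",
    "{009C7364-8D77-46B3-A531-B60ED4E5B407}",
]
clause = in_clause("GlobalID", guids)

# Try the generated condition on a throwaway table
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE GUIDData ("GlobalID" TEXT, name TEXT)')
conn.executemany(
    "INSERT INTO GUIDData VALUES (?, ?)",
    [
        (guids[0], "a"),
        (guids[1], "b"),
        ("{00000000-0000-0000-0000-000000000000}", "c"),
    ],
)
rows = conn.execute(
    f"SELECT name FROM GUIDData WHERE {clause} ORDER BY name"
).fetchall()
print(clause)
print(rows)
```

Only the rows whose GlobalID appears in the list come back. The generated text is the part that would be pasted wherever the IN query is wanted.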

Find location of a cell's value without knowing table or column name in a sql server database

I need to find a certain table of data, but I do not know the name of the table or the columns in it. I have the data from the table at hand. The database is huge and I do not have the patience to go through it manually. Is there a query that can search through each of the thousands of tables for an exact value that I have?
Having worked as a system admin on systems I did not develop, this is an issue I face frequently. Here is how I approach it:
1) Is there a test system and a UI you can insert the values from? If so, do a profile trace or extended events to see where the data is going.
2) Is there a data dictionary for the product you can look through and hopefully find the table location?
3) The hardest way is to use information_schema.columns and information_schema.tables, make educated guesses as to what the table might be named, and review the data in it to see whether you have it right.
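Option 3 can also be automated instead of guessed: list every table and column from the catalog, then run one exact-match query per column. The sketch below shows the idea in Python, with SQLite standing in for SQL Server so that it is self-contained; sqlite_master and PRAGMA table_info play the role of information_schema.tables and information_schema.columns here, and find_value is a name made up for this example. On thousands of large tables this kind of brute-force scan can be slow.

```python
import sqlite3


def find_value(conn, value):
    """Return (table, column, row_count) for every column containing `value`."""
    hits = []
    # Catalog lookup: every user table in the database
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    for table in tables:
        # Second field of each table_info row is the column name
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            sql = f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" = ?'
            (count,) = conn.execute(sql, (value,)).fetchone()
            if count:
                hits.append((table, column, count))
    return hits
```

On SQL Server the same loop would read the table and column names from information_schema.columns and would probably restrict the search to columns whose data type can hold the value.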

How can I divide a single table into multiple tables in access?

Here is the problem: I am currently working with a Microsoft Access database that the previous employee created by just adding all the data into one table (yes, all the data into one table). There are about 186 columns in that one table.
I am now responsible for dividing each category of data into its own table. Everything is going fine although progress is too slow. Is there perhaps an SQL command that will somehow divide each category of data into its proper table? As of now I am manually looking at the main table and carefully transferring groups of data into each respective table along with its proper IDs making sure data is not corrupted. Here is the layout I have so far:
Note: I am probably one of the very few at my campus with database experience.
I would approach this as a classic normalisation process. Your single hugely wide table should contain all of the entities within your domain, so as long as you understand the domain you should be able to normalise the structure until you're happy with it.
To create your foreign key lookups, run distinct queries against the columns you're going to remove and then add the key values back in.
It sounds like you know what you're doing already. Are you just looking for reassurance that you're on the right track? (Which it looks like you are.)
Good luck though, and enjoy it - it sounds like a good little piece of work.
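To make the "distinct query, then add the keys back" step concrete, here is a small sketch in Python using an in-memory SQLite database as a stand-in for Access. The wide and department tables and their columns are invented for the example, and Access SQL differs in places (for instance, the UPDATE would typically be written with a join), so treat it as an outline of the steps, not as statements to paste in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wide (id INTEGER PRIMARY KEY, student TEXT, department TEXT);
INSERT INTO wide (student, department)
VALUES ('Ann', 'Physics'), ('Bob', 'History'), ('Cy', 'Physics');

-- 1. Build the lookup table from the distinct values of the column to remove.
CREATE TABLE department (department_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
INSERT INTO department (name)
SELECT DISTINCT department FROM wide ORDER BY department;

-- 2. Add the key column and fill it in by matching on the old text value.
ALTER TABLE wide ADD COLUMN department_id INTEGER
    REFERENCES department(department_id);
UPDATE wide
SET department_id = (SELECT department_id
                     FROM department
                     WHERE department.name = wide.department);
""")

# 3. Check that joining through the key reproduces the original data
#    before the old text column is dropped.
rows = conn.execute("""
    SELECT w.student, d.name
    FROM wide w JOIN department d ON d.department_id = w.department_id
    ORDER BY w.id
""").fetchall()
print(rows)
```

Repeating these three steps per category moves the data over in bulk, and the final join is a cheap way to confirm nothing was corrupted along the way.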

How would you give the user a preference for how a name from an SQL table is to be printed?

I've been given a task by a prospective employer which involves SQL tables. One requirement they mentioned is that they want the name retrieved from a table called "Employees" to come in the format of either "<LastName>, <FirstName>" OR "<FirstName> <MiddleName> <LastName> <Suffix>".
This is confusing to me because it sounds like they're asking me to make a function or something. I could probably do this in a programming language and have the information retrieved that way, but doing this exclusively in the SQL table seems weird to me. I'm rather new to SQL, and my familiarity with it doesn't exceed simple tasks such as creating databases, tables, and fields, inserting data into fields, updating fields in records, deleting records that meet a specific condition, and selecting fields from tables.
I hope that this isn't considered cheating since I mentioned that this was for a prospective employer, but if I was still in school then I could just outright ask a professor where I can find a clue for this or he would've outright told me in class. But, for a prospective job, I'm not sure who I would ask about any confusion. Thanks in advance for anyone's help.
A SQL query has a fixed column output: you can't change it. To achieve this, you could concatenate the parts inside a CASE statement to make one varchar column, but then you need something (a parameter) to switch the CASE.
So, this is presentation, not querying SQL.
I'd return all 4 columns mentioned and decide how I want them in the client.
Unless you have just been asked for 2 different queries on the same SQL table.
You haven't specified the RDBMS, but in SQL Server you could accomplish this using Computed Columns.
Typically, you would use a View over the table.
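As an illustration of the CASE-with-a-parameter idea, here is a minimal sketch in Python against an in-memory SQLite table. The Employees column names, the 'last_first' and 'full' style values, and the display_names helper are assumptions made for the example; SQLite concatenates with ||, where SQL Server would use + or CONCAT.

```python
import sqlite3

# The single ? parameter selects which branch of the CASE is used.
FORMAT_SQL = """
SELECT CASE ?
         WHEN 'last_first' THEN LastName || ', ' || FirstName
         ELSE TRIM(FirstName || ' ' || COALESCE(MiddleName || ' ', '')
                   || LastName || ' ' || COALESCE(Suffix, ''))
       END AS DisplayName
FROM Employees
ORDER BY EmployeeID
"""


def display_names(conn, style):
    """Return every employee name formatted according to `style`."""
    return [row[0] for row in conn.execute(FORMAT_SQL, (style,))]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, FirstName TEXT, "
    "MiddleName TEXT, LastName TEXT, Suffix TEXT)"
)
conn.executemany(
    "INSERT INTO Employees (FirstName, MiddleName, LastName, Suffix) "
    "VALUES (?, ?, ?, ?)",
    [("John", "Q", "Public", "Jr."), ("Jane", None, "Doe", None)],
)
print(display_names(conn, "last_first"))
print(display_names(conn, "full"))
```

The COALESCE calls keep a missing middle name or suffix from turning the whole concatenation into NULL. Either way the query still returns a single fixed column, which is why formatting in the client, or two separate views, may be the simpler answer.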