Query Map values in Hive

I have a table in Hive which is updated every hour by Spark (Parquet files):
CREATE TABLE IF NOT EXISTS user
(
    name STRING,
    creation_date DATE,
    cards MAP<STRING,STRING>
) STORED AS PARQUET;
Let's suppose that I want to query the number of Gobelin cards per user.
My query looks like this:
select * from user where cards["Gobelin"] IS NOT NULL;
The result looks like this:
KillerZord1001 2016-01-02 {"Archer":"2","Gobelin":"6"}
HalfAMill 2016-02-05 {"Witch":"7","Gobelin":"8"}
But what I would like to have is the value of the key that I am looking for, more like:
KillerZord1001 2016-01-02 6
HalfAMill 2016-02-05 8
Can Hive perform such queries?

You can simply do:
SELECT name, creation_date, cards["Gobelin"]
FROM user
WHERE cards["Gobelin"] IS NOT NULL;
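Since the map values are stored as strings, you may also want to cast them if you plan to treat the counts as numbers. A minimal sketch along the same lines, assuming the same user table (the gobelin_count alias is just illustrative):
SELECT name, creation_date, CAST(cards["Gobelin"] AS INT) AS gobelin_count
FROM user
WHERE cards["Gobelin"] IS NOT NULL;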

Related

How can I use an input from another table in my query?

I'm creating a new table using PostgreSQL, but I need to get a parameter from another table as an input.
This is the table I have (I called table_1):
id column_1
1 100
2 100
3 100
4 100
5 100
I want to create a new table, but only using ids that are higher than the highest id from the table above (table_1). Something like this:
insert into table_new
select id, column_1 from table_old
where id > (max(id) from table_1)
How can I do this? I tried searching, but I only found posts like https://community.powerbi.com/t5/Desktop/M-Query-Create-a-table-using-input-from-another-table/td-p/209923, "Take one table as input and output using another table BigQuery", and "sql query needs input from another table", which are not exactly what I need.
Just use where id > (select max(id) from table_1).
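Applied to the insert from the question, that would look something like this (a sketch reusing the question's table_new / table_old names and assuming table_new has matching id and column_1 columns):
insert into table_new (id, column_1)
select id, column_1
from table_old
where id > (select max(id) from table_1);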

Select Numbers Before and After String Values - Presto SQL

I am trying to create 2 new columns from a single column.
My data looks like this:
userid:5438888,locationid:84646646478,property:g
I want to make a new column for the userid, and a new column for the locationid. There are many more rows, and the userids and locationids aren't always going to be the same length throughout the dataset.
I am assuming there is a way to split the text after : and before , but I am not sure how it would work doing this twice inside the string. I don't care about the property part of the string. Solely userid and locationid.
You should be able to do this with the split_to_map() function:
WITH data(attribution_site_id) AS (
    VALUES 'userid:5438888,locationid:84646646478,property:g'
),
t AS (
    SELECT split_to_map(attribution_site_id, ',', ':') AS map
    FROM data
)
SELECT element_at(map, 'userid') AS userid,
       element_at(map, 'locationid') AS locationid
FROM t
which produces:
userid | locationid
---------+-------------
5438888 | 84646646478
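If building a map feels like overkill for just two fields, a regular expression can also pull the values out directly. A sketch of that alternative on the same sample data (the patterns assume the key names appear literally in the string):
WITH data(attribution_site_id) AS (
    VALUES 'userid:5438888,locationid:84646646478,property:g'
)
SELECT regexp_extract(attribution_site_id, 'userid:([^,]+)', 1) AS userid,
       regexp_extract(attribution_site_id, 'locationid:([^,]+)', 1) AS locationid
FROM data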

Create a table without knowing its columns in SQL

How can I create a table without knowing in advance how many and what columns it exactly holds?
The idea is that I have a table DATA that has 3 columns : ID, NAME, and VALUE
What I need is a way to get multiple values depending on the value of NAME - I can't do it with simple WHERE or JOIN (because I'll need other values - with other NAME values - later on in my query).
Because of the way this table is constructed I want to PIVOT it in order to transform every distinct value of NAME into a column so it will be easier to get to it in my later search.
What I want now is to somehow save this to a temp table / variable so I can use it later on to join with the result of another query...
So example:
Columns:
CREATE TABLE MainTab
(
    id int,
    nameMain varchar(max),
    notes varchar(max)
);
CREATE TABLE SecondTab
(
    id int,
    id_mainTab int,
    nameSecond varchar(max),
    notes varchar(max)
);
CREATE TABLE DATA
(
    id int,
    id_second int,
    name varchar(max),
    value varchar(max)
);
Now some example data from the table DATA:
| id | id_second | name       | value           |
|----|-----------|------------|-----------------|
| 1  | 5550      | number     | 111115550       |
| 2  | 6154      | address    | 1, First Avenue |
| 3  | 1784      | supervisor | John Smith      |
| 4  | 3467      | function   | Marketing       |
| 5  | 9999      | start_date | 01/01/2000      |
...
Now imagine that 'name' has a lot of different values, and in one query I'll need to get a lot of different values depending on the value of 'name'...
That's why I pivot it, so that number, address, supervisor, function, start_date, ... become columns.
I do this dynamically because of the number of possible columns - it would take me a while to write all of them in an 'IN' statement - and I don't want to have to remember to add them manually every time a new 'name' value gets added...
Therefore I followed http://sqlhints.com/2014/03/18/dynamic-pivot-in-sql-server/
The thing is, now I want the result of my execute(@query) to be stored in a temp table / variable, so I can use it later on to join it with MainTab...
It would be nice if I could use @cols (which holds the values of DATA.name), but I can't seem to figure out a way to do this.
ADDITIONALLY:
If I use the non-dynamic way (writing all the values manually after 'IN'), I still need to create a column called status. In this column (so far it's NULL everywhere, because that value doesn't exist in my unpivoted table) I want to have 'open' or 'closed', depending on the date (let's say I have start_date and end_date):
CASE WHEN end_date < GETDATE() THEN 'closed' ELSE 'open' END AS status
Where can I put this statement? Let's say my main query looks like this:
SELECT * FROM
    (SELECT id_second, name, value, id FROM DATA) src
    PIVOT (MAX(value) FOR name IN ([number], [address], [supervisor], [function], [start_date], [end_date], [status])) AS pivotTab
JOIN SecondTab ON SecondTab.id = pivotTab.id_second
JOIN MainTab ON MainTab.id = SecondTab.id_mainTab
WHERE pivotTab.status = 'closed';
Well, as far as I can understand, you have some select statement and just need to "dump" its result into a temporary table. In this case you can use the select ... into syntax, like:
select .....
into #temp_table
from ....
This will create a temporary table with columns matching the select statement and populate it with the data returned by that select statement.
See MSDN for reference.
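One wrinkle with the dynamic pivot from the question: a local #temp table created inside execute(@query) is dropped as soon as that execution finishes, so it won't be visible afterwards. A common workaround is to put the SELECT ... INTO inside the dynamic string and target a global temp table. A rough sketch, assuming @cols holds the bracketed column list built from DATA.name (the value below is just an example; in the real query it is built dynamically, as in the linked sqlhints article):
DECLARE @cols NVARCHAR(MAX) = N'[number],[address],[supervisor],[function],[start_date],[end_date]';
DECLARE @query NVARCHAR(MAX) = N'
    SELECT *
    INTO ##pivotTab   -- global temp table, so it survives the EXEC scope
    FROM (SELECT id_second, name, value FROM DATA) src
    PIVOT (MAX(value) FOR name IN (' + @cols + N')) AS p;';

EXEC (@query);

-- ##pivotTab can now be joined like any ordinary table:
SELECT *
FROM ##pivotTab p
JOIN SecondTab s ON s.id = p.id_second
JOIN MainTab m ON m.id = s.id_mainTab;
Remember to drop ##pivotTab when you're done, since global temp tables are visible to other sessions.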

Insert and update rows from a file in Oracle

I have a file in Linux; it looks something like this (I have millions of rows):
date number name id state
20131110 1089 name1 123 start
20131110 1080 name2 122 start
20131110 1082 name3 121 start
20131114 1089 name1 120 end
20131115 1082 name3 119 end
And I have a table in Oracle, init_table, with the following fields:
start_date
end_date
number
name
id
I read that I can insert the data with SQL*Loader. (I have millions of rows, so creating a temporary table for the insert and then updating the other table with a trigger is not a good option.) The problem is that each number has a start date and an end date: for example, number 1089 has the start date 20131110 and the end date 20131114. So I need to first insert the start_date into my table, and later, when I find the end_date, update the row for the number I am inserting (1089 in my example) with that end date (20131114).
How can I do this with a .ctl file, or with something else?
Thanks
What version of Oracle?
I would use an external table. Define an external table that exactly matches your flat file. Then, you should be able to solve this with two passes, one insert, one update.
Something like this should do it:
insert into init_table select to_date(date,'YYYYMMDD'),null,number,name,id from external_table where state='start';
update init_table set end_date=(select date from external_table where state='end' and init_table.number=external_table.number);
Note that you can't actually have columns named 'date' or 'number', so the SQL above isn't actually going to work as written. You'll have to change those column names.
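For reference, a rough sketch of what such an external table definition could look like for this file (the directory object, column names, and access parameters are placeholders, with date/number renamed to load_date/num for the reason above):
CREATE TABLE external_table (
    load_date VARCHAR2(8),
    num       NUMBER,
    name      VARCHAR2(100),
    id        NUMBER,
    state     VARCHAR2(10)
)
ORGANIZATION EXTERNAL (
    TYPE ORACLE_LOADER
    DEFAULT DIRECTORY data_dir
    ACCESS PARAMETERS (
        RECORDS DELIMITED BY NEWLINE
        SKIP 1
        FIELDS TERMINATED BY WHITESPACE
    )
    LOCATION ('employee.txt')
)
REJECT LIMIT UNLIMITED;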
Hope that helps...
If you use an external table approach then you can join the data in the external table to itself to produce a single record that can then be inserted. Although the join is expensive, overall it ought to be an efficient process as long as the hash join I'd expect to be used stays in memory.
So something like ...
insert into init_table
  (start_date, end_date, number, name, id)
select
  s.date, e.date, s.number, s.name, s.id
from external_table s
join external_table e on s.number = e.number
where s.state = 'start'
  and e.state = 'end'
That assumes that there will always be an end date for every start date, and that the number does not already exist in the table -- if either of those conditions is not true then an outer join would be required in the former case, and a merge required in the latter.
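For the merge case mentioned above, a rough sketch (again with the date/number columns renamed to load_date/num, in both the external table and init_table, since the original names are reserved words):
MERGE INTO init_table t
USING (
    SELECT s.num,
           s.name,
           s.id,
           TO_DATE(s.load_date, 'YYYYMMDD') AS start_date,
           TO_DATE(e.load_date, 'YYYYMMDD') AS end_date
    FROM external_table s
    LEFT JOIN external_table e
           ON e.num = s.num AND e.state = 'end'
    WHERE s.state = 'start'
) src
ON (t.num = src.num)
WHEN MATCHED THEN
    UPDATE SET t.end_date = src.end_date
WHEN NOT MATCHED THEN
    INSERT (start_date, end_date, num, name, id)
    VALUES (src.start_date, src.end_date, src.num, src.name, src.id);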
$ cat sqlldrNew.ctl
load data
infile '/home/Sameer/employee.txt'
into table employee
fields terminated by X'9'
( date, -->select number from employee where name=[Name from the file record], name, id, state )
$ sqlldr scott/tiger control=/home/Sameer/sqlldrNew.ctl
I think this should work.

Help with an SQL command: I want to list names which have given job IDs

I have a table consisting of the columns: id, name, job,
and I have rows stored with data like this:
id: 1
name: jason
job: 11,12
id: 2
name: mark
job: 11,14
I want to write an SQL command that fetches, from this table, only the names which have the value "14" stored in the job column.
How do I do that?
Thanks
You can do:
WHERE FIND_IN_SET(14, job)
But that is really not the correct way. The correct way is to normalize your database and separate the job field into its own table. Check this answer for extra information:
PHP select row from db where ID is found in group of IDs
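For completeness, a usage sketch of the FIND_IN_SET approach, assuming MySQL (where FIND_IN_SET is available) and a table name of MyTable (a placeholder, since the question doesn't name the table):
SELECT name
FROM MyTable
WHERE FIND_IN_SET('14', job);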
You shouldn't be storing multiple job ids in the same field. You want to normalise your data model. Remove the 'job' column from your names table, and have a second JOB table defined like this:
id | name_id | job_id
 1 |       1 |     11
 2 |       1 |     12
 3 |       2 |     11
 4 |       2 |     14
where name_id is the primary id ('id') of the entry in the names table.
Then you can do:
SELECT name_id, job_id FROM JOB WHERE name_id = 1;
for example. As well as making your data storage far more extensible (you can now assign an unlimited number of job_ids to each name, for example), it will also be much faster to execute queries, since all your entries are now ints and no string processing is required.
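Applied to the original question (names that have job 14), that becomes a simple join. A sketch, assuming the names table is called names and its primary key is id:
SELECT n.name
FROM names n
JOIN JOB j ON j.name_id = n.id
WHERE j.job_id = 14;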
SELECT *
FROM MyTable
WHERE job LIKE '14,%' OR job LIKE '%,14' OR job LIKE '%,14,%'
EDIT: Thanks to onedaywhen
SELECT *
FROM MyTable
WHERE (',' + job + ',') LIKE '%,14,%'