I have two tables with latitude and longitude points. I would like to create a new table which has information from both tables based on finding the closest points between tables. This is similar to a question previously asked; however one of the tables has arrays. The solution from the previously asked question did not seem to work with arrays.
Table A
|--------|-------------|-------------|-------------|
| id | latitude | longitude | address |
|--------|-------------|-------------|-------------|
| 1 | 39.79 | 86.03 | 123 Vine St |
|--------|-------------|-------------|-------------|
| 2 | 39.89 | 84.01 | 123 Oak St |
|--------|-------------|-------------|-------------|
Table B
|-------------|-------------|-------------|--------------|
| latitude | longitude | parameter1 | parameter2 |
|-------------|-------------|-------------|--------------|
| 39.74 | 86.33 | [1, 2, 3] | [.1, .2, .3] |
|-------------|-------------|-------------|--------------|
| 39.81 | 83.90 | [4, 5, 6] | [.4, .5, .6] |
|-------------|-------------|-------------|--------------|
I would like to create a new table, Table C, which has all the rows from TABLE A and adds the information from Table B. The information from Table B is added based on the closest point in Table B to the particular row in Table A.
Table C
|------|-------------|-------------|--------------|
| id_A | address | parameter1 | parameter2 |
|------|-------------|-------------|--------------|
| 1 | 123 Vine St | [1, 2, 3] | [.1, .2, .3] |
|------|-------------|-------------|--------------|
| 2 | 123 Oak St | [4, 5, 6] | [.4, .5, .6] |
|------|-------------|-------------|--------------|
Thank you in advance!
Below is for BigQuery Standard SQL
#standardSQL
SELECT AS VALUE
ARRAY_AGG(STRUCT(id, address, parameter1, parameter2) ORDER BY ST_DISTANCE(a.point, b.point) LIMIT 1)[OFFSET(0)]
FROM (SELECT *, ST_GEOGPOINT(longitude, latitude) point FROM `project.dataset.tableA`) a,
(SELECT *, ST_GEOGPOINT(longitude, latitude) point FROM `project.dataset.tableB`) b
GROUP BY id
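Applied to the sample data from your question:
#standardSQL
WITH `project.dataset.tableA` AS (
SELECT 1 id, 39.79 latitude, 86.03 longitude, '123 Vine St' address UNION ALL
SELECT 2, 39.89, 84.01, '123 Oak St'
), `project.dataset.tableB` AS (
SELECT 39.74 latitude, 86.33 longitude, [1, 2, 3] parameter1, [.1, .2, .3] parameter2 UNION ALL
SELECT 39.81, 83.90, [4, 5, 6], [.4, .5, .6]
)
SELECT AS VALUE
ARRAY_AGG(STRUCT(id, address, parameter1, parameter2) ORDER BY ST_DISTANCE(a.point, b.point) LIMIT 1)[OFFSET(0)]
FROM (SELECT *, ST_GEOGPOINT(longitude, latitude) point FROM `project.dataset.tableA`) a,
(SELECT *, ST_GEOGPOINT(longitude, latitude) point FROM `project.dataset.tableB`) b
GROUP BY id
the output matches the desired Table C (with the column named id rather than id_A): id 1 is paired with [1, 2, 3] and [.1, .2, .3], and id 2 with [4, 5, 6] and [.4, .5, .6].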
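To sanity-check the nearest-point logic outside BigQuery, here is a minimal Python sketch of the same idea. It assumes a haversine great-circle distance as a stand-in for ST_DISTANCE, so the exact distances may differ from BigQuery's; the names haversine_km and nearest_join are made up for this illustration.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, assuming a spherical Earth of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))

# Sample data copied from the question.
table_a = [
    {"id": 1, "latitude": 39.79, "longitude": 86.03, "address": "123 Vine St"},
    {"id": 2, "latitude": 39.89, "longitude": 84.01, "address": "123 Oak St"},
]
table_b = [
    {"latitude": 39.74, "longitude": 86.33, "parameter1": [1, 2, 3], "parameter2": [0.1, 0.2, 0.3]},
    {"latitude": 39.81, "longitude": 83.90, "parameter1": [4, 5, 6], "parameter2": [0.4, 0.5, 0.6]},
]

def nearest_join(rows_a, rows_b):
    """For every row of A, attach the array columns of the closest row of B."""
    out = []
    for a in rows_a:
        b = min(
            rows_b,
            key=lambda r: haversine_km(a["latitude"], a["longitude"], r["latitude"], r["longitude"]),
        )
        out.append({
            "id_A": a["id"],
            "address": a["address"],
            "parameter1": b["parameter1"],
            "parameter2": b["parameter2"],
        })
    return out

table_c = nearest_join(table_a, table_b)
```

Like the cross join in the query, this compares every row of A with every row of B, so it is only meant for checking small samples.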
Data is a flat, denormalised table:
|ID | Product selected | Product Code 1 | Product Code 2 | Product Code 3 | Cost of Product 1 | Cost of Product 2 | Cost of Product 3 | Rate of Product 1 | Rate of Product 2 | Rate of Product 3 |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|1 | ABCDEDFHIJKL | AAABBBCCCDDD | ABCDEDFHIJKL | DDDCCCBBBAAA | 995 | 495 | 0 | 4.4 | 6.3 | 7.8 |
|2 | DDDCCCBBBAAA | AAABBBCCCDDD | ABCDEDFHIJKL | DDDCCCBBBAAA | 995 | 495 | 0 | 4.4 | 6.3 | 7.8 |
What:
Using the product selected (ABCDEDFHIJKL), look across the rows to find the corresponding locations of columns with data relating to the product selected.
Desired Output:
| Product selected | Cost of Product | Rate of Product |
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| ABCDEDFHIJKL | 495 | 6.3 |
| DDDCCCBBBAAA | 0 | 7.8 |
Doing this in R is straightforward, and I'm sure that for someone more knowledgeable in SQL than I am, this will be easy.
You can use cross apply:
select t.product_selected, x.cost_of_product, x.rate_of_product
from mytable t
cross apply (values
(product_code_1, cost_of_product_1, rate_of_product_1),
(product_code_2, cost_of_product_2, rate_of_product_2),
(product_code_3, cost_of_product_3, rate_of_product_3)
) as x(product_selected, cost_of_product, rate_of_product)
where x.product_selected = t.product_selected
You can use UNPIVOT, UNION, or CROSS APPLY.
Unpivot sample
SELECT [Product selected], ProductCode ,ProductCost,ProductRate
FROM
(
SELECT *
FROM dbo.table
) AS cp
UNPIVOT
(
ProductCode FOR PC IN ([Product Code 1], [Product Code 2], [Product Code 3])
) AS up
UNPIVOT
(
ProductCost FOR Po IN ([Cost of Product 1], [Cost of Product 2], [Cost of Product 3])
) AS up2
UNPIVOT
(
ProductRate FOR Pr IN ([Rate of Product 1], [Rate of Product 2], [Rate of Product 3])
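) AS up3
-- Chained UNPIVOTs return every code/cost/rate combination, so keep only the
-- rows whose column suffixes (1, 2, 3) agree and whose code is the selected product.
WHERE RIGHT(PC, 1) = RIGHT(Po, 1)
AND RIGHT(PC, 1) = RIGHT(Pr, 1)
AND ProductCode = [Product selected];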
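The UNION option mentioned above is not shown, so here is a rough sketch of it. It runs against an in-memory SQLite database from Python purely so that it can be executed as-is; the table name mytable and the snake_case column names follow the cross apply answer and are assumptions about the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mytable (
        id INTEGER, product_selected TEXT,
        product_code_1 TEXT, product_code_2 TEXT, product_code_3 TEXT,
        cost_of_product_1 INTEGER, cost_of_product_2 INTEGER, cost_of_product_3 INTEGER,
        rate_of_product_1 REAL, rate_of_product_2 REAL, rate_of_product_3 REAL
    )
""")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "ABCDEDFHIJKL", "AAABBBCCCDDD", "ABCDEDFHIJKL", "DDDCCCBBBAAA", 995, 495, 0, 4.4, 6.3, 7.8),
        (2, "DDDCCCBBBAAA", "AAABBBCCCDDD", "ABCDEDFHIJKL", "DDDCCCBBBAAA", 995, 495, 0, 4.4, 6.3, 7.8),
    ],
)

# Stack the three (code, cost, rate) column groups into rows with UNION ALL,
# then keep the stacked row whose code equals the selected product.
UNION_QUERY = """
    SELECT t.id, t.product_selected, x.cost_of_product, x.rate_of_product
    FROM mytable AS t
    JOIN (
        SELECT id,
               product_code_1    AS product_code,
               cost_of_product_1 AS cost_of_product,
               rate_of_product_1 AS rate_of_product
        FROM mytable
        UNION ALL
        SELECT id, product_code_2, cost_of_product_2, rate_of_product_2 FROM mytable
        UNION ALL
        SELECT id, product_code_3, cost_of_product_3, rate_of_product_3 FROM mytable
    ) AS x
      ON x.id = t.id AND x.product_code = t.product_selected
    ORDER BY t.id
"""
rows = conn.execute(UNION_QUERY).fetchall()
```

The SQL inside UNION_QUERY is plain enough that it should port to most engines, though that has not been checked here.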
I have the following problem and wanted to ask if this is the correct way to do it or if there is a better way of doing it:
Assume I have the following table/data in my DB:
|---|----|------|-------------|---------|---------|
|id |city|street|street_number|lastname |firstname|
|---|----|------|-------------|---------|---------|
| 1 | ar | K1 | 13 |Davenport| Hector |
| 2 | ar | L1 | 27 |Cannon | Teresa |
| 3 | ar | A1 | 135 |Brewer | Izaac |
| 4 | dc | A2 | 8 |Fowler | Milan |
| 5 | fr | C1 | 18 |Kaiser | Ibrar |
| 6 | fr | C1 | 28 |Weaver | Kiri |
| 7 | ny | O1 | 37 |Petersen | Derrick |
I now get requests with the following structure: (city/street/street_number)
E.g.: {(ar,K1,13),(dc,A2,8),(ny,01,37)}
I want to retrieve the last name of the person living there. Since the number of requests is quite large, I don't want to process them one by one. My current implementation inserts the request data into a temporary table and joins on the values.
Is this the right approach or is there some better way of doing this?
You can construct a query using in with tuples:
select t.*
from t
where (city, street, street_number) in ( ('ar', 'K1', '13'), ('dc', 'A2', '8'), ('ny', '01', '37') );
However, if the data starts in the database, then a temporary table or subquery is better than bringing the results back to the application and constructing such a query.
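For comparison, the temporary-table approach described in the question can be sketched like this. It uses Python's sqlite3 module only to stay runnable; the table name t follows the answer above, the temporary table name requests is made up, and the third lookup uses street O1 as it appears in the table rather than the 01 written in the example request.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, city TEXT, street TEXT, street_number INTEGER, "
    "lastname TEXT, firstname TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "ar", "K1", 13, "Davenport", "Hector"),
        (2, "ar", "L1", 27, "Cannon", "Teresa"),
        (3, "ar", "A1", 135, "Brewer", "Izaac"),
        (4, "dc", "A2", 8, "Fowler", "Milan"),
        (5, "fr", "C1", 18, "Kaiser", "Ibrar"),
        (6, "fr", "C1", 28, "Weaver", "Kiri"),
        (7, "ny", "O1", 37, "Petersen", "Derrick"),
    ],
)

# Bulk-load all requested (city, street, street_number) triples at once.
lookups = [("ar", "K1", 13), ("dc", "A2", 8), ("ny", "O1", 37)]
conn.execute("CREATE TEMP TABLE requests (city TEXT, street TEXT, street_number INTEGER)")
conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", lookups)

# One join answers every request in a single query.
lastnames = [
    row[0]
    for row in conn.execute("""
        SELECT t.lastname
        FROM requests AS r
        JOIN t
          ON t.city = r.city
         AND t.street = r.street
         AND t.street_number = r.street_number
        ORDER BY t.id
    """)
]
```

With a large request set, an index on (city, street, street_number) in the main table would probably matter more than the choice between IN and a join.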
I think you can use a hierarchical query and string functions as follows:
WITH YOUR_INPUT_DATA AS
(SELECT '(ar,K1,13),(dc,A2,8),(ny,01,37)' AS INPUT_STR FROM DUAL),
--
CTE AS
( SELECT REGEXP_SUBSTR(STR,'[^(),]+',1,1) AS STR1,
REGEXP_SUBSTR(STR,'[^(),]+',1,2) AS STR2,
REGEXP_SUBSTR(STR,'[^(),]+',1,3) AS STR3
FROM (SELECT SUBSTR(INPUT_STR,
INSTR(INPUT_STR,'(',1,LEVEL),
INSTR(INPUT_STR,')',1,LEVEL) - INSTR(INPUT_STR,'(',1,LEVEL) + 1) STR
FROM YOUR_INPUT_DATA
CONNECT BY LEVEL <= REGEXP_COUNT(INPUT_STR,'\),\(') + 1))
--
SELECT * FROM YOUR_TABLE WHERE (city,street,street_number)
IN (SELECT STR1,STR2,STR3 FROM CTE);
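If the request string is parsed in the application instead of in SQL, the same split into (city, street, street_number) triples might look like the sketch below. The function name parse_requests is hypothetical, and every value is kept as a string, as in the input.

```python
import re

def parse_requests(input_str):
    """Split '(a,b,c),(d,e,f)' into a list of string tuples."""
    # Each parenthesised group is one request; commas separate its fields.
    groups = re.findall(r"\(([^()]*)\)", input_str)
    return [tuple(part.strip() for part in group.split(",")) for group in groups]

parsed = parse_requests("(ar,K1,13),(dc,A2,8),(ny,01,37)")
```

The resulting tuples can then be bound as query parameters or loaded into a temporary table.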
I have two tables as follows:
users table
==========================
| user_id name age |
|=========================
| 1 pete 20 |
| 2 sam 21 |
| 3 nash 22 |
==========================
hobbies table
======================================
| user_id hobby time_spent |
|=====================================
| 1 football 2 |
| 1 running 1 |
| 1 basketball 3 |
======================================
First question: I would like to make a single Hive query that can return rows in this format:
{ "user_id":1, "name":"pete", "hobbies":[ {"hobby": "football", "time_spent": 2}, {"hobby": "running", "time_spent": 1}, {"hobby": "basketball", "time_spent": 3} ] }
Second question: If the hobbies table were to be as follows:
========================================
| user_id hobby scores |
|=======================================
| 1 football 2,3,1 |
| 1 running 1,1,2,5 |
| 1 basketball 3,6,7 |
========================================
Would it be possible to get the row output where scores is a list in the output as shown below:
{ "user_id":1, "name":"pete", "hobbies":[ {"hobby": "football", "scores": [2, 3, 1]}, {"hobby": "running", "scores": [1, 1, 2, 5]}, {"hobby": "basketball", "scores": [3, 6, 7]} ] }
I was able to find the answer to my first question:
select u.user_id, u.name,
collect_list(
str_to_map(
concat_ws(",", array(
concat("hobby:", h.hobby),
concat("time_spent:", h.time_spent)
))
)
) as hobbies
from users as u
join hobbies as h on u.user_id=h.user_id
group by u.user_id, u.name;
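The second question is still open. As a way of pinning down the target shape, here is a small Python sketch of the transformation, not a Hive query. It assumes scores is stored as a comma-separated string; in Hive itself, the built-in split function would presumably play the role of str.split. The name nest_hobbies is made up.

```python
import json

# Sample rows copied from the question.
users = [
    {"user_id": 1, "name": "pete", "age": 20},
    {"user_id": 2, "name": "sam", "age": 21},
    {"user_id": 3, "name": "nash", "age": 22},
]
hobbies = [
    {"user_id": 1, "hobby": "football", "scores": "2,3,1"},
    {"user_id": 1, "hobby": "running", "scores": "1,1,2,5"},
    {"user_id": 1, "hobby": "basketball", "scores": "3,6,7"},
]

def nest_hobbies(users, hobbies):
    """Group hobbies per user and turn each scores string into a list of ints."""
    by_user = {}
    for h in hobbies:
        scores = [int(s) for s in h["scores"].split(",")]
        by_user.setdefault(h["user_id"], []).append({"hobby": h["hobby"], "scores": scores})
    # Inner-join semantics, like the Hive query: users without hobbies are dropped.
    return [
        {"user_id": u["user_id"], "name": u["name"], "hobbies": by_user[u["user_id"]]}
        for u in users
        if u["user_id"] in by_user
    ]

nested = nest_hobbies(users, hobbies)
first_row_json = json.dumps(nested[0])
```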
I have an Access database that I must use to manage interns and the places they work at. Right now, I have two tables: one for the persons, with their personal details and a link to where they work, and another table with the name of each workplace and its respective boss.
Like so:
(table 1, where the persons are listed)
Cadastro_de_estagiarios
id | Ativo | Nível | Lotação | Nome
1 | Verdadeiro | Superior | 1ª Vara Cível | Marina x
3 | Verdadeiro | Médio | 1ª Vara Cível | Raquel x
and so on...
(table 2, where the locations and bosses are specified)
Cadastro_de_varas_e_juizes
id | Vara | Juiz responsável | Vagas totais nível superior | Vagas totais nível médio
1 | 1ª Vara Cível | fist boss | 2 | 3
2 | 2ª Vara Cível | sec boss | 2 | 4
3 | 3ª Vara Cível | third boss | 2 | 3
and so on...
To clarify, I have two kinds of interns (nível superior and nível médio), as well as two kinds of job vacancies per workplace. For example, in 1ª Vara Cível, I can have 2 interns with "superior" and 3 with "médio".
What I need to do is get the info on how many interns are placed on each workplace per job type, and then have a query that tells me how many vacancies I still have per place and type.
I appreciate any help. Thanks!
Translating the tables
table1
id | Active | Education level of intern | Workplace | Name
table2
id | Workplace | Boss | Vacancies for college students | Vacancies for high school students
This should give you a starting point:
MySQL 5.6 Schema Setup:
CREATE TABLE table1 (
id INT,
Active VARCHAR(50),
Education_level VARCHAR(50),
Workplace VARCHAR(50),
Name VARCHAR(50)
);
CREATE TABLE table2 (
id INT,
Workplace VARCHAR(50),
Boss VARCHAR(50),
Vacancies_college INT,
Vacancies_high_school INT
);
INSERT INTO table1 VALUES
(1, 'Verdadeiro', 'Superior', '1ª Vara Cível', 'Marina x'),
(3, 'Verdadeiro', 'Médio', '1ª Vara Cível', 'Raquel x');
INSERT INTO table2 VALUES
(1, '1ª Vara Cível', 'fist boss', 2, 3),
(2, '2ª Vara Cível', 'sec boss', 2, 4),
(3, '3ª Vara Cível', 'third boss', 2, 3);
Query 1:
SELECT t1.Workplace, t1.Active, t1.Education_level,
(CASE
WHEN t1.Education_level = 'Superior' THEN t2.Vacancies_college
WHEN t1.Education_level = 'Médio' THEN t2.Vacancies_high_school
END) - COUNT(*) AS vacancies
FROM table1 t1 LEFT JOIN table2 t2
ON (t1.Workplace = t2.Workplace)
GROUP BY t1.Workplace, t1.Active, t1.Education_level
Results:
| Workplace | Active | Education_level | vacancies |
|---------------|------------|-----------------|-----------|
| 1ª Vara Cível | Verdadeiro | Médio | 2 |
| 1ª Vara Cível | Verdadeiro | Superior | 1 |
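The question also asks for the remaining vacancies per workplace, which should include workplaces that have no interns yet. One possible variation starts from table2 and left joins the interns, as sketched below. It runs on SQLite through Python only to be executable; the table and column names follow the schema setup above, the aliases college_left and high_school_left are made up, and treating Active as the text 'Verdadeiro' is an assumption (in Access it is likely a Yes/No field).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INT, Active TEXT, Education_level TEXT, Workplace TEXT, Name TEXT);
    CREATE TABLE table2 (id INT, Workplace TEXT, Boss TEXT,
                         Vacancies_college INT, Vacancies_high_school INT);
    INSERT INTO table1 VALUES
        (1, 'Verdadeiro', 'Superior', '1ª Vara Cível', 'Marina x'),
        (3, 'Verdadeiro', 'Médio', '1ª Vara Cível', 'Raquel x');
    INSERT INTO table2 VALUES
        (1, '1ª Vara Cível', 'fist boss', 2, 3),
        (2, '2ª Vara Cível', 'sec boss', 2, 4),
        (3, '3ª Vara Cível', 'third boss', 2, 3);
""")

# Start from the workplaces so that places without interns still appear.
# COUNT(CASE ...) counts only the active interns of the matching level.
REMAINING = """
    SELECT t2.Workplace,
           t2.Vacancies_college
             - COUNT(CASE WHEN t1.Education_level = 'Superior' THEN 1 END) AS college_left,
           t2.Vacancies_high_school
             - COUNT(CASE WHEN t1.Education_level = 'Médio' THEN 1 END) AS high_school_left
    FROM table2 AS t2
    LEFT JOIN table1 AS t1
           ON t1.Workplace = t2.Workplace
          AND t1.Active = 'Verdadeiro'
    GROUP BY t2.id, t2.Workplace, t2.Vacancies_college, t2.Vacancies_high_school
    ORDER BY t2.id
"""
remaining = conn.execute(REMAINING).fetchall()
```

Here "Superior" is matched with the college vacancies and "Médio" with the high school vacancies, following the translation of the column names given earlier.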
I am looking to do a left join between a table that has an array column called tags and a table that holds the definitions of the tags, tag_definitions. There will be at most one match per row in the Cities table. I can't join an array with a string and I'm not sure how to proceed.
Cities_Table
City_Code | State |Tags
NYC | NY | 1, 4, 5
SF | CA | 2,4, 6
CHI | IL | 3, 8, 10
Tag_Definitions
Tag_ID | Name
5 | East_Coast
6 | West_Coast
10 | MidWest
So I'm looking to get something like this...
City_Code | State |Tags | Tag_Descr
NYC | NY | 1, 4, 5 | East_Coast
SF | CA | 2,4, 6 | West_Coast
CHI | IL | 3, 8, 10 | MidWest
Depending on your database (the syntax might be different), you can do something like the following:
select *
from cities_table c
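-- replace() strips the spaces so that '1, 4, 5' is matched as ',1,4,5,'
left join tag_definitions t on concat(',', replace(c.tags, ' ', ''), ',') like concat('%,', t.tag_id, ',%')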
However, as noted, a better idea would be to create a City_Tags table and store the individual ids in that table. Generally, it's not a good idea to store comma-delimited data.
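A rough sketch of that normalised layout, run on SQLite through Python so it can be executed as-is. The table names cities and city_tags and the lower-case column names are assumptions made for this illustration; the one-off split of the existing comma-delimited strings is done in Python here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (city_code TEXT PRIMARY KEY, state TEXT);
    CREATE TABLE tag_definitions (tag_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city_tags (city_code TEXT, tag_id INTEGER);
    INSERT INTO cities VALUES ('NYC', 'NY'), ('SF', 'CA'), ('CHI', 'IL');
    INSERT INTO tag_definitions VALUES (5, 'East_Coast'), (6, 'West_Coast'), (10, 'MidWest');
""")

# One-off migration: one city_tags row per (city, tag) pair.
raw_tags = {"NYC": "1, 4, 5", "SF": "2,4, 6", "CHI": "3, 8, 10"}
conn.executemany(
    "INSERT INTO city_tags VALUES (?, ?)",
    [(city, int(tag)) for city, tags in raw_tags.items() for tag in tags.split(",")],
)

# With one tag per row, the lookup is an ordinary equality join;
# the outer LEFT JOIN keeps cities that have no defined tag.
rows = conn.execute("""
    SELECT c.city_code, c.state, m.name AS tag_descr
    FROM cities AS c
    LEFT JOIN (
        SELECT ct.city_code, d.name
        FROM city_tags AS ct
        JOIN tag_definitions AS d ON d.tag_id = ct.tag_id
    ) AS m ON m.city_code = c.city_code
    ORDER BY c.city_code
""").fetchall()
```

An equality join like this can use ordinary indexes, which a LIKE match on a concatenated string generally cannot.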