How to convert rows with different values into different columns? - sql

I have a query that returns something like this:
+----+------+-------------------+
| ID | Type | Address |
+----+------+-------------------+
| 1 | 0 | Some address text |
| 1 | 1 | Some address text |
| 1 | 3 | Some address text |
| 2 | 0 | Some address text |
| 2 | 1 | Some address text |
+----+------+-------------------+
The number of types is fixed: there are up to three of them. ID is not unique within this table, but there can't be more than three rows per ID (one row per type). What I want is to create a table as follows:
+----+-------------------+-------------------+-------------------+
| ID | AddressType0 | AddressType1 | AddressType2 |
+----+-------------------+-------------------+-------------------+
| 1 | Some address text | Some address text | Some address text |
| 2 | Some address text | Some address text | Some address text |
| 3 | Some address text | Some address text | Some address text |
| 4 | Some address text | Some address text | Some address text |
| 5 | Some address text | Some address text | Some address text |
+----+-------------------+-------------------+-------------------+
In the resulting table, ID should be unique. If the original table has no address of a given type for an ID, the resulting table should contain NULL in that field.

You can use conditional aggregation:
with cte as (
    <query here>
)
select id,
       max(case when type = 0 then address end) as [AddressType0],
       max(case when type = 1 then address end) as [AddressType1],
       max(case when type = 2 then address end) as [AddressType2]
from cte c
group by id;
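To try the pattern without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table name `addresses` and the sample address strings are made up for illustration, and the bracketed aliases are dropped because SQLite does not need them. Note that ID 2 has no type-2 address, so that column comes back as NULL (None in Python).

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addresses (id INTEGER, type INTEGER, address TEXT)")
conn.executemany(
    "INSERT INTO addresses VALUES (?, ?, ?)",
    [
        (1, 0, "home 1"),
        (1, 1, "work 1"),
        (1, 2, "other 1"),
        (2, 0, "home 2"),
        (2, 1, "work 2"),  # id 2 deliberately has no type-2 row
    ],
)

# One MAX(CASE ...) per target column; CASE without ELSE yields NULL,
# and MAX ignores NULLs, so each column picks the single matching address.
pivot_sql = """
SELECT id,
       MAX(CASE WHEN type = 0 THEN address END) AS AddressType0,
       MAX(CASE WHEN type = 1 THEN address END) AS AddressType1,
       MAX(CASE WHEN type = 2 THEN address END) AS AddressType2
FROM addresses
GROUP BY id
ORDER BY id
"""
pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
```

Selecting the original id (rather than a generated row number) keeps the link back to the source rows.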

Related

How can I 'flatten' a one to many table with URLs so that each additional URL shows up in a new column?

I'm trying to 'flatten' a one to many relationship using SQL to create a CSV of points and their associated photos to use with a web map.
Table 1 is a list of points and their locations, and Table 2 is a list of URLs of photos and their associated captions.
Table 1
+-------------+------------+-----------+-------------+
| LOCATION_ID | Name | Latitude | Longitude |
+-------------+------------+-----------+-------------+
| 1 | Dawson | 64.06 | -139.410833 |
| 2 | Whitehorse | 60.716667 | -135.05 |
+-------------+------------+-----------+-------------+
Table 2
+-------------+-------------------------+----------------------+
| LOCATION_ID | CAPTION | URL |
+-------------+-------------------------+----------------------+
| 1 | Photo of Dawson city | http://fakeurl.com/1 |
| 1 | Photo of Klondike River | http://fakeurl.com/2 |
| 1 | Photo of Yukon River | http://fakeurl.com/3 |
| 2 | Photo of Main Street | http://fakeurl.com/4 |
| 2 | Photo of Miles Canyon | http://fakeurl.com/5 |
+-------------+-------------------------+----------------------+
How do I write SQL code so that it creates a table that looks like this?
+-------------+------------+-----------+-------------+----------------------+----------------------+-------------------------+----------------------+----------------------+----------------------+
| LOCATION_ID | NAME | Latitude | Longitude | CAPTION1 | URL1 | CAPTION2 | URL2 | CAPTION3 | URL3 |
+-------------+------------+-----------+-------------+----------------------+----------------------+-------------------------+----------------------+----------------------+----------------------+
| 1 | Dawson | 64.06 | -139.410833 | Photo of Dawson city | http://fakeurl.com/1 | Photo of Klondike River | http://fakeurl.com/2 | Photo of Yukon River | http://fakeurl.com/3 |
| 2 | Whitehorse | 60.716667 | -135.05 | Photo of Main Street | http://fakeurl.com/4 | Photo of Miles Canyon | http://fakeurl.com/5 | | |
+-------------+------------+-----------+-------------+----------------------+----------------------+-------------------------+----------------------+----------------------+----------------------+
You want to pivot the data in table2. But to do so, you need a pivoting column, which can be generated using row_number().
I prefer to use conditional aggregation for pivoting, so:
select t1.LOCATION_ID, t1.Name, t1.Latitude, t1.Longitude,
max(case when seqnum = 1 then t2.caption end) as caption_1,
max(case when seqnum = 1 then t2.url end) as url_1,
max(case when seqnum = 2 then t2.caption end) as caption_2,
max(case when seqnum = 2 then t2.url end) as url_2,
max(case when seqnum = 3 then t2.caption end) as caption_3,
max(case when seqnum = 3 then t2.url end) as url_3
from table1 t1 left join
(select t2.*,
row_number() over (partition by location_id order by url) as seqnum
from table2 t2
) t2
on t1.location_id = t2.location_id
group by t1.LOCATION_ID, t1.Name, t1.Latitude, t1.Longitude;
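Here is a small runnable sketch of the same idea using Python's sqlite3 module. It assumes SQLite 3.25 or later (for window functions), keeps only the URL columns for brevity, and orders the photos by URL so the numbering is deterministic; that ordering is my assumption, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (location_id INTEGER, name TEXT);
CREATE TABLE table2 (location_id INTEGER, caption TEXT, url TEXT);
INSERT INTO table1 VALUES (1, 'Dawson'), (2, 'Whitehorse');
INSERT INTO table2 VALUES
    (1, 'Photo of Dawson city',    'http://fakeurl.com/1'),
    (1, 'Photo of Klondike River', 'http://fakeurl.com/2'),
    (1, 'Photo of Yukon River',    'http://fakeurl.com/3'),
    (2, 'Photo of Main Street',    'http://fakeurl.com/4'),
    (2, 'Photo of Miles Canyon',   'http://fakeurl.com/5');
""")

# Step 1 (subquery): number the photos within each location.
# Step 2 (outer query): spread the numbered rows across columns.
flattened = conn.execute("""
SELECT t1.location_id, t1.name,
       MAX(CASE WHEN p.seqnum = 1 THEN p.url END) AS url_1,
       MAX(CASE WHEN p.seqnum = 2 THEN p.url END) AS url_2,
       MAX(CASE WHEN p.seqnum = 3 THEN p.url END) AS url_3
FROM table1 t1
LEFT JOIN (
    SELECT location_id, url,
           ROW_NUMBER() OVER (PARTITION BY location_id ORDER BY url) AS seqnum
    FROM table2
) p ON t1.location_id = p.location_id
GROUP BY t1.location_id, t1.name
ORDER BY t1.location_id
""").fetchall()

for row in flattened:
    print(row)
```

Whitehorse has only two photos, so its third URL column is NULL. A location with more than three photos would silently lose the extras, since the number of output columns is fixed in the query.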

SQL - Sum of Unique Values From Reused Column

I need a sum of Shoes and Hats from a table containing a User, Filename, and Payload. Duplicate records should be ignored, where a duplicate is defined as a record with the same User, the same Payload, and the same portion of the Filename following the '/'. In the example table below, record #3 is a duplicate of record #2 under these rules. The desired result is a sum of Shoes and a sum of Hats, as in the example below.
Example Data
+---+------+----------+-----------+
| # | User | Filename | Payload |
+---+------+----------+-----------+
| 1 | A | a/123 | Shoes = 3 |
| 2 | A | a/123 | Hats = 2 |
| 3 | A | b/123 | Hats = 2 |
| 4 | B | a/123 | Shoes = 1 |
| 5 | B | a/123 | Hats = 1 |
+---+------+----------+-----------+
Expected Output
+-------+------+
| Shoes | Hats |
+-------+------+
| 4 | 3 |
+-------+------+
Hive happens to support substring_index(), so you can do:
select sum(case when payload like 'Shoes%'
                then cast(substring_index(payload, ' = ', -1) as int)
                else 0
           end) as num_shoes,
       sum(case when payload like 'Hats%'
                then cast(substring_index(payload, ' = ', -1) as int)
                else 0
           end) as num_hats
from (select t.*,
             row_number() over (partition by user, payload, substring_index(filename, '/', -1)
                                order by user
                               ) as seqnum
      from t
     ) t
where seqnum = 1;
I strongly suggest that you change your data model and not store the payload as a string. Numbers should be stored as numbers. Names should be stored as names. They should not be combined in a string, if that can be avoided.
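To illustrate the data-model suggestion, here is a sketch of what the query could look like if the payload were stored as separate columns. The table name `payloads` and the columns `user_name`, `file_key`, `item`, and `quantity` are hypothetical, and the example runs on SQLite via Python's sqlite3 module rather than Hive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE payloads (
    user_name TEXT,
    file_key  TEXT,     -- portion of the filename after '/'
    item      TEXT,     -- 'Shoes' or 'Hats'
    quantity  INTEGER   -- stored as a number, not inside a string
)
""")
conn.executemany(
    "INSERT INTO payloads VALUES (?, ?, ?, ?)",
    [
        ("A", "123", "Shoes", 3),
        ("A", "123", "Hats", 2),
        ("A", "123", "Hats", 2),   # record #3: duplicate of record #2
        ("B", "123", "Shoes", 1),
        ("B", "123", "Hats", 1),
    ],
)

# With clean columns, deduplication is a plain DISTINCT and no string
# parsing is needed before summing.
totals = conn.execute("""
SELECT SUM(CASE WHEN item = 'Shoes' THEN quantity ELSE 0 END) AS num_shoes,
       SUM(CASE WHEN item = 'Hats'  THEN quantity ELSE 0 END) AS num_hats
FROM (SELECT DISTINCT user_name, file_key, item, quantity FROM payloads) AS d
""").fetchone()

print(totals)  # (4, 3)
```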

Group key-value columns into a single row

I'm trying to extract data from a SQLite table that stores key-value pairs in dual columns. For example, with the keys foo, bar, man, and row, the table would look like:
| _id | external_id | key | value |
|-----|-------------|------|-------|
| 1 | 12345 | foo | cow |
| 2 | 12345 | bar | moo |
| 3 | 12345 | man | hole |
| 4 | 12345 | row | boat |
| 5 | 67890 | foo | abc |
| 6 | 67890 | bar | def |
| 7 | 67890 | man | ghi |
| 8 | 67890 | row | jkl |
I want to perform a query that gives me each external_id in a row with the keys as the columns and the values as the rows. Like this:
| external_id | foo | bar | man | row |
|-------------|-----|------|------|------|
| 12345 | cow | moo | hole | boat |
| 67890 | abc | def | ghi | jkl |
The only solution I've been able to come up with is a join for each key:
SELECT a.external_id, b.foo, c.bar, d.man, e.row
FROM myTable AS a
LEFT JOIN
    (SELECT external_id, value AS foo
     FROM myTable
     WHERE key = 'foo') AS b
ON a.external_id = b.external_id
...
LEFT JOIN
    (SELECT external_id, value AS row
     FROM myTable
     WHERE key = 'row') AS e
ON a.external_id = e.external_id
GROUP BY a.external_id
Is there a better way to do this?
The other available option is to use conditional aggregation:
SELECT external_id,
MAX(CASE WHEN key = 'foo' THEN value END) AS foo,
MAX(CASE WHEN key = 'bar' THEN value END) AS bar,
MAX(CASE WHEN key = 'man' THEN value END) AS man,
... etc
FROM mytable
GROUP BY external_id
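Since the question is about SQLite, here is a runnable sketch of the conditional aggregation using Python's sqlite3 module, with all four keys from the question written out. The identifiers `key` and `row` are double-quoted as a precaution because they are also SQL keywords; whether that is strictly required in your SQLite version is something I have not assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE mytable (_id INTEGER PRIMARY KEY, external_id INTEGER, "key" TEXT, value TEXT)'
)
conn.executemany(
    'INSERT INTO mytable (external_id, "key", value) VALUES (?, ?, ?)',
    [
        (12345, "foo", "cow"), (12345, "bar", "moo"),
        (12345, "man", "hole"), (12345, "row", "boat"),
        (67890, "foo", "abc"), (67890, "bar", "def"),
        (67890, "man", "ghi"), (67890, "row", "jkl"),
    ],
)

# One pass over the table, one MAX(CASE ...) per key; no self-joins.
wide = conn.execute("""
SELECT external_id,
       MAX(CASE WHEN "key" = 'foo' THEN value END) AS foo,
       MAX(CASE WHEN "key" = 'bar' THEN value END) AS bar,
       MAX(CASE WHEN "key" = 'man' THEN value END) AS man,
       MAX(CASE WHEN "key" = 'row' THEN value END) AS "row"
FROM mytable
GROUP BY external_id
ORDER BY external_id
""").fetchall()

for r in wide:
    print(r)
```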
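You can also use a collect() aggregate that builds a key-to-value map per group. Note that collect() is not a built-in SQLite function, so this only applies where such an aggregate is available (for example, as a user-defined function):
select external_id,
       collected_objects['foo'] as foo,
       collected_objects['bar'] as bar,
       collected_objects['man'] as man,
       collected_objects['row'] as row
from (
    select external_id,
           collect(key, value) as collected_objects
    from mytable
    group by external_id) t1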
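The map-building step that collect(key, value) performs can be sketched in plain Python. This only illustrates the idea (group rows by external_id, then look keys up in each group's map); it is not any engine's implementation, and the `rows` list stands in for the result of a plain `SELECT external_id, key, value` query.

```python
from collections import defaultdict

# Stand-in for rows fetched from the key-value table.
rows = [
    (12345, "foo", "cow"), (12345, "bar", "moo"),
    (12345, "man", "hole"), (12345, "row", "boat"),
    (67890, "foo", "abc"), (67890, "bar", "def"),
    (67890, "man", "ghi"), (67890, "row", "jkl"),
]

# Equivalent of "collect(key, value) ... group by external_id":
# one dict of key -> value per external_id.
collected = defaultdict(dict)
for external_id, key, value in rows:
    collected[external_id][key] = value

# Equivalent of collected_objects['foo'], ['bar'], ...;
# dict.get returns None for a missing key, mirroring NULL.
keys = ["foo", "bar", "man", "row"]
result = [
    (external_id, *(mapping.get(k) for k in keys))
    for external_id, mapping in sorted(collected.items())
]

for r in result:
    print(r)
```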

Sql: simultaneous aggregate from two tables

I have two tables: a Files table, which includes the file type, and a File Properties table, which references the file table via a foreign key. Sample Files table:
| id | name | type |
---------------------
| 1 | file1 | zip |
| 2 | file2 | zip |
| 3 | file3 | zip |
| 4 | file4 | jpg |
And the Properties table:
| file_id | property |
-----------------------
| 1 | x |
| 2 | x |
I want to write a query that shows the count of each file type and how many files of that type have a property.
So in the example, the result would be
| type | filecount | prop count |
----------------------------------
| zip | 3 | 2 |
| jpg | 1 | 0 |
I could accomplish this by
select f.type, (select count(id) from files where type = f.type), count(fp.id) from
files as f, file_properties as fp where f.id = fp.file_id group by f.type;
But this seems very suboptimal and is very slow. Any better way to do this?
select type, count(*) as filecount, count(pc.file_id) as [prop count]
from Files f
left outer join (
    select distinct file_id
    from Properties
) pc on f.id = pc.file_id
group by type
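Here is a runnable sketch on SQLite (via Python's sqlite3 module) of a variant that counts files having at least one property, using COUNT on the joined key so that types with no properties report 0 rather than NULL. The extra property row `(1, 'y')` is my addition to show that a file with several properties is still counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE properties (file_id INTEGER, property TEXT);
INSERT INTO files VALUES (1, 'file1', 'zip'), (2, 'file2', 'zip'),
                         (3, 'file3', 'zip'), (4, 'file4', 'jpg');
INSERT INTO properties VALUES (1, 'x'), (1, 'y'), (2, 'x');
""")

# The subquery yields at most one row per file, so the LEFT JOIN cannot
# multiply file rows. COUNT(pc.file_id) skips NULLs, i.e. files that
# have no property at all.
counts = conn.execute("""
SELECT f.type,
       COUNT(*)          AS filecount,
       COUNT(pc.file_id) AS propcount
FROM files f
LEFT JOIN (SELECT DISTINCT file_id FROM properties) pc
       ON f.id = pc.file_id
GROUP BY f.type
ORDER BY filecount DESC
""").fetchall()

for row in counts:
    print(row)
```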

Postgresql select from 2 tables. Joins?

I have 2 tables that look like this:
Table "public.phone_lists"
Column | Type | Modifiers
----------+-------------------+--------------------------------------------------------------------
id | integer | not null default nextval(('"phone_lists_id_seq"'::text)::regclass)
list_id | integer | not null
sequence | integer | not null
phone | character varying |
name | character varying |
and
Table "public.email_lists"
Column | Type | Modifiers
---------+-------------------+--------------------------------------------------------------------
id | integer | not null default nextval(('"email_lists_id_seq"'::text)::regclass)
list_id | integer | not null
email | character varying |
I'm trying to get the list_id, phone, and email values out of both tables in a single result. I'm looking for an output like:
list_id | phone | email
---------+-------------+--------------------------------
0 | | jqeron#wqwerweper.com
0 | | qwerox#wqwekeeper.com
0 | | erreon#fdfdeper.com
0 | | sfar#weasdfer.com
0 | | rawq#gdfefdgheper.com
1 | 15555555555 |
1 | 15555551806 |
1 | 15555555508 |
1 | 15055555506 |
1 | 15055555558 |
1 | | rfoasdfx#wefdaser.com
1 | | radfy#wfdfder.com
I've come up with
select pl.list_id, pl.phone, el.email from phone_lists as pl left join email_lists as el using (list_id);
but that's not quite right. Any suggestions?
SELECT list_id, phone, email
FROM (
    SELECT list_id, phone, NULL AS email, 1 AS set_id
    FROM phone_lists
    UNION ALL
    SELECT list_id, NULL AS phone, email, 2 AS set_id
    FROM email_lists
) q
ORDER BY
    list_id, set_id
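A join pairs every phone with every email for the same list_id, which is why it is "not quite right"; stacking the two tables with UNION ALL avoids that. Below is a small sketch on SQLite via Python's sqlite3 module, with made-up email addresses. Phones are tagged with set_id 1 so they sort before emails, as in the desired output, and the extra `pos` column (my assumption) uses the `sequence` column to keep phones in a stable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE phone_lists (list_id INTEGER, sequence INTEGER, phone TEXT, name TEXT);
CREATE TABLE email_lists (list_id INTEGER, email TEXT);
INSERT INTO email_lists VALUES (0, 'a@example.com'), (1, 'b@example.com');
INSERT INTO phone_lists VALUES (1, 1, '15555555555', NULL),
                               (1, 2, '15555551806', NULL);
""")

# Each branch pads the column it lacks with NULL; set_id and pos exist
# only to control the sort order and are not selected in the output.
combined = conn.execute("""
SELECT list_id, phone, email
FROM (
    SELECT list_id, phone, NULL AS email, 1 AS set_id, sequence AS pos
    FROM phone_lists
    UNION ALL
    SELECT list_id, NULL AS phone, email, 2 AS set_id, 0 AS pos
    FROM email_lists
) q
ORDER BY list_id, set_id, pos
""").fetchall()

for row in combined:
    print(row)
```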