What SQL magic do I need to turn one column into several?

I need to print some tickets, each of which has enough room to hold one set of customer details along with codes for up to five items ordered by that customer. Customers who have ordered more than five items get multiple tickets. So from an orders table like this,
Customer | Item
---------|------
Bob | FTMCH
Bob | ZORP
Bob | KLUGE
Carol | FTMCH
Carol | MEEP
Carol | ZORP
Ted | FOON
Ted | SMOCK
Alice | ORGO
Carol | SQICK
Carol | BLECH
Carol | KLUGE
Carol | GLURP
I need a query that returns this:
Customer | Item1 | Item2 | Item3 | Item4 | Item5
---------|-------|-------|-------|-------|------
Alice | ORGO | null | null | null | null
Bob | FTMCH | ZORP | KLUGE | null | null
Carol | FTMCH | MEEP | ZORP | SQICK | BLECH
Carol | KLUGE | GLURP | null | null | null
Ted | FOON | SMOCK | null | null | null
Can some kind soul help me with the SQL for this? HSQL embedded database in OpenOffice.org Base, if it makes a difference.

OK, this works well enough:
SELECT
"Customer",
MAX(CASE WHEN "Slot" = 0 THEN "Item" END) AS "Item1",
MAX(CASE WHEN "Slot" = 1 THEN "Item" END) AS "Item2",
MAX(CASE WHEN "Slot" = 2 THEN "Item" END) AS "Item3",
MAX(CASE WHEN "Slot" = 3 THEN "Item" END) AS "Item4",
MAX(CASE WHEN "Slot" = 4 THEN "Item" END) AS "Item5"
FROM (
SELECT
l."Customer" AS "Customer",
l."Item" AS "Item",
COUNT(r."Item") / 5 AS "Ticket",
MOD(COUNT(r."Item"), 5) AS "Slot"
FROM "Orders" AS l
LEFT JOIN "Orders" AS r
ON r."Customer" = l."Customer" AND r."Item" < l."Item"
GROUP BY "Customer", "Item"
)
GROUP BY "Customer", "Ticket"
ORDER BY "Customer", "Ticket"
It makes this:
Customer | Item1 | Item2 | Item3 | Item4 | Item5
---------|-------|-------|-------|-------|-------
Alice | ORGO | | | |
Bob | FTMCH | KLUGE | ZORP | |
Carol | BLECH | FTMCH | GLURP | KLUGE | MEEP
Carol | SQICK | ZORP | | |
Ted | FOON | SMOCK | | |
Thanks to all who helped, both here and at Ask Metafilter.
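For anyone who wants to poke at this, the same rank-and-pivot query runs essentially unchanged on SQLite (integer division and the `%` operator stand in for HSQL's MOD); a sketch via Python's sqlite3, using the toy data above:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Orders (Customer TEXT, Item TEXT);
INSERT INTO Orders VALUES
 ('Bob','FTMCH'),('Bob','ZORP'),('Bob','KLUGE'),
 ('Carol','FTMCH'),('Carol','MEEP'),('Carol','ZORP'),
 ('Ted','FOON'),('Ted','SMOCK'),('Alice','ORGO'),
 ('Carol','SQICK'),('Carol','BLECH'),('Carol','KLUGE'),('Carol','GLURP');
""")

rows = con.execute("""
SELECT Customer,
       MAX(CASE WHEN Slot = 0 THEN Item END) AS Item1,
       MAX(CASE WHEN Slot = 1 THEN Item END) AS Item2,
       MAX(CASE WHEN Slot = 2 THEN Item END) AS Item3,
       MAX(CASE WHEN Slot = 3 THEN Item END) AS Item4,
       MAX(CASE WHEN Slot = 4 THEN Item END) AS Item5
FROM (
    -- Rank each item by counting the same customer's items that sort
    -- before it, then split the rank into a ticket number and a slot.
    SELECT l.Customer AS Customer, l.Item AS Item,
           COUNT(r.Item) / 5 AS Ticket,
           COUNT(r.Item) % 5 AS Slot
    FROM Orders l
    LEFT JOIN Orders r
      ON r.Customer = l.Customer AND r.Item < l.Item
    GROUP BY l.Customer, l.Item
)
GROUP BY Customer, Ticket
ORDER BY Customer, Ticket
""").fetchall()

for row in rows:
    print(row)
```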
(Followup edit:)
Jesus, this just gets worse :-(
Turns out the business rules allow the same customer to order the same item on multiple occasions, and that all outstanding orders are to be included on the one set of tickets. So my toy table should have looked more like this:
ID | Customer | Item
159 | Bob | FTMCH
264 | Bob | ZORP
265 | Bob | KLUGE
288 | Carol | FTMCH
314 | Carol | MEEP
323 | Carol | ZORP
327 | Ted | FOON
338 | Ted | SMOCK
358 | Alice | ORGO
419 | Carol | SQICK
716 | Carol | MEEP
846 | Carol | BLECH
939 | Carol | MEEP
950 | Carol | GLURP
979 | Carol | KLUGE
Carol's multiple MEEPs bugger the ranking logic in the original solution, and I've ended up with the following hideous monster:
SELECT
"Customer",
MAX(CASE WHEN "Slot" = 0 THEN "Item" END) AS "Item0",
MAX(CASE WHEN "Slot" = 1 THEN "Item" END) AS "Item1",
MAX(CASE WHEN "Slot" = 2 THEN "Item" END) AS "Item2",
MAX(CASE WHEN "Slot" = 3 THEN "Item" END) AS "Item3",
MAX(CASE WHEN "Slot" = 4 THEN "Item" END) AS "Item4",
MAX(CASE WHEN "Slot" = 0 THEN "Quantity" END) AS "Qty0",
MAX(CASE WHEN "Slot" = 1 THEN "Quantity" END) AS "Qty1",
MAX(CASE WHEN "Slot" = 2 THEN "Quantity" END) AS "Qty2",
MAX(CASE WHEN "Slot" = 3 THEN "Quantity" END) AS "Qty3",
MAX(CASE WHEN "Slot" = 4 THEN "Quantity" END) AS "Qty4"
FROM (
SELECT
"Customer",
"Item",
COUNT("ID") AS "Quantity",
"Rank" / 5 AS "Ticket",
MOD("Rank", 5) AS "Slot"
FROM (
SELECT
main."ID" AS "ID",
main."Customer" AS "Customer",
main."Item" AS "Item",
COUNT(less."Item") AS "Rank"
FROM "Orders" AS main
LEFT JOIN (
SELECT DISTINCT
"Customer",
"Item"
FROM "Orders") AS less
ON less."Customer" = main."Customer" AND less."Item" < main."Item"
GROUP BY "ID", "Customer", "Item"
)
GROUP BY "Customer", "Item", "Rank"
)
GROUP BY "Customer", "Ticket"
which makes this:
Customer | Item0 | Item1 | Item2 | Item3 | Item4 | Qty0 | Qty1 | Qty2 | Qty3 | Qty4
Bob | FTMCH | KLUGE | ZORP | | | 1 | 1 | 1 | | |
Carol | BLECH | FTMCH | GLURP | KLUGE | MEEP | 1 | 1 | 1 | 1 | 3
Carol | SQICK | ZORP | | | | 1 | 1 | | | |
Ted | FOON | SMOCK | | | | 1 | 1 | | | |
Alice | ORGO | | | | | 1 | | | | |
It does the job, I guess, but I'm feeling pretty lucky that the database involved is always going to be quite small (a few thousand rows).
Spiritually I'm an embedded-systems guy, not a database guy. Can anybody who does this for a living tell me whether this kind of nonsense is common? Would a query with four nested SELECTs and a LEFT JOIN merit a mention on the Daily WTF?
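On an engine with window functions (HSQLDB 2.x and SQLite 3.25+ both qualify), ROW_NUMBER() replaces the self-join ranking and removes one level of nesting; grouping first collapses the repeat orders into a quantity, so duplicates can't break the slot numbers. A sketch of that rewrite, assuming SQLite via Python's sqlite3:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Orders (ID INTEGER PRIMARY KEY, Customer TEXT, Item TEXT);
INSERT INTO Orders VALUES
 (159,'Bob','FTMCH'),(264,'Bob','ZORP'),(265,'Bob','KLUGE'),
 (288,'Carol','FTMCH'),(314,'Carol','MEEP'),(323,'Carol','ZORP'),
 (327,'Ted','FOON'),(338,'Ted','SMOCK'),(358,'Alice','ORGO'),
 (419,'Carol','SQICK'),(716,'Carol','MEEP'),(846,'Carol','BLECH'),
 (939,'Carol','MEEP'),(950,'Carol','GLURP'),(979,'Carol','KLUGE');
""")

rows = con.execute("""
SELECT Customer,
       MAX(CASE WHEN Slot = 0 THEN Item END)     AS Item0,
       MAX(CASE WHEN Slot = 1 THEN Item END)     AS Item1,
       MAX(CASE WHEN Slot = 2 THEN Item END)     AS Item2,
       MAX(CASE WHEN Slot = 3 THEN Item END)     AS Item3,
       MAX(CASE WHEN Slot = 4 THEN Item END)     AS Item4,
       MAX(CASE WHEN Slot = 0 THEN Quantity END) AS Qty0,
       MAX(CASE WHEN Slot = 1 THEN Quantity END) AS Qty1,
       MAX(CASE WHEN Slot = 2 THEN Quantity END) AS Qty2,
       MAX(CASE WHEN Slot = 3 THEN Quantity END) AS Qty3,
       MAX(CASE WHEN Slot = 4 THEN Quantity END) AS Qty4
FROM (
    -- One row per distinct (Customer, Item); ROW_NUMBER does the
    -- ranking the self-join used to do.
    SELECT Customer, Item, COUNT(*) AS Quantity,
           (ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY Item) - 1) / 5
             AS Ticket,
           (ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY Item) - 1) % 5
             AS Slot
    FROM Orders
    GROUP BY Customer, Item
)
GROUP BY Customer, Ticket
ORDER BY Customer, Ticket
""").fetchall()
```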

I believe this is only available in T-SQL, but you can use PIVOT: http://msdn.microsoft.com/en-us/library/ms177410.aspx
I did something similar with a list of dates becoming the columns for calculations.

Not exactly what you asked, and MySQL rather than OpenOffice, but it might give you an idea, or someone else could work on it:
select
u.Customer,
group_concat(u.Item) items
from
(select
t.Item,
@n:=if(@c=t.Customer and @n<4,@n+1,0) c1,
@m:=if(@n,@m,@m+1) g,
@c:=t.Customer as Customer
from
t1 t, (select @m:=0, @n:=0, @c:='') init
order by t.Customer
) u
group by
u.g
Output :
+----------+------------------------------+
| Customer | items |
+----------+------------------------------+
| Alice | ORGO |
| Bob | FTMCH,ZORP,KLUGE |
| Carol | KLUGE,ZORP,BLECH,SQICK,GLURP |
| Carol | MEEP,FTMCH |
| Ted | FOON,SMOCK |
+----------+------------------------------+

This gets you most of the way there, but does not handle the duplicate order for Carol. That would be easy to do if there was something else to group on, like OrderID or OrderDate. Can you post the full schema?
select m1.Customer,
min(m1.Item) as Item1,
min(m2.item) as Item2,
min(m3.item) as Item3,
min(m4.item) as Item4,
min(m5.item) as Item5
from CustomerOrder m1
left outer join CustomerOrder m2 on m1.Customer = m2.Customer
and m2.item > m1.item
left outer join CustomerOrder m3 on m1.Customer = m3.Customer
and m3.item > m2.item
left outer join CustomerOrder m4 on m1.Customer = m4.Customer
and m4.item > m3.item
left outer join CustomerOrder m5 on m1.Customer = m5.Customer
and m5.item > m4.item
group by m1.Customer
Output:
Customer Item1 Item2 Item3 Item4 Item5
-------------- ---------- ---------- ---------- ---------- ----------
Alice ORGO NULL NULL NULL NULL
Bob FTMCH KLUGE ZORP NULL NULL
Carol BLECH FTMCH GLURP KLUGE MEEP
Ted FOON SMOCK NULL NULL NULL

The requirement is not uncommon, and can be supplied reasonably in SQL. But you have two issues blocking you.
1) You've entered an SQL tag, and that means ISO/IEC/ANSI Standard SQL. The correct method to use is a cursor or cursor substitute (a while loop, which does the same thing, but is faster). That avoids all these outer joins and handling massive result sets, then beating them into submission with GROUP BYs, etc. It also handles duplicates, mainly because it does not create them in the first place (via those five versions of the aliased table). And yes, it will keep getting worse, and when the database is reasonably populated it will be a performance hog.
2) Duplicates are not allowed in a Relational database, ie. in your source tables; you need to make the rows unique (and those keys/columns are not shown). No use trying to eliminate duplicates via code. If that is corrected, then all duplicates (real and created by the poor code) can be eliminated.
This requirement can also be supplied more elegantly using Subqueries; except that here you need two levels of nesting, one to build each Item column, and two to obtain the rank or position. And that (standard SQL construct) pre-supposes that you have a Relational database (no duplicate rows). High Eek factor if you are not used to SQL. Which is why most coders use a cursor or cursor substitute.
But if you do not have SQL and its basic capabilities (HSQL being some sub-standard implementation), then we are not using the same tool kit. The SQL code I can provide will not run for you, and we will keep going back and forth.
(Maybe we should have a "pseudo-SQL" tag.)
ID Column Prevents Duplicates ???
There is a myth that is prevalent in some parts of the industry, to that effect, due to books written by database beginners. As usual, myths have no scientific basis. Let's try a simple test.
CREATE TABLE Person (
PersonId IDENTITY NOT NULL
PRIMARY KEY,
FirstName CHAR(30) NOT NULL,
LastName CHAR(30) NOT NULL
)
INSERT Person VALUES ('Fred', 'Astaire')
1 row(s) affected
INSERT Person VALUES ('Ginger', 'Rogers')
1 row(s) affected
INSERT Person VALUES ('Fred', 'Astaire')
1 row(s) affected
SELECT * FROM Person
PersonId FirstName LastName
======== ============================== ==============================
1 Fred Astaire
2 Ginger Rogers
3 Fred Astaire
3 row(s) affected
That's a pure, unarguable duplicate row. The simple fact is: the Id column provides a row number, but does nothing to prevent duplicate rows. For that you need a Unique Index on the columns that determine uniqueness, as identified in the data model, for every relational table in the database (by definition, if the rows are not unique, it is not a Relational table). Otherwise it is just a file storage system.
CREATE UNIQUE NONCLUSTERED INDEX U_Name
ON Person (LastName, FirstName)
There is another form of data integrity problem (duplication) which I might identify while I am at it.
INSERT Person VALUES ('Fred', 'Astair')
1 row(s) affected
INSERT Person VALUES ('Astaire', 'Fred')
1 row(s) affected
All are preventable in SQL.
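The same test can be run verbatim in SQLite (using its INTEGER PRIMARY KEY as the surrogate "ID"); the duplicate row goes in without complaint until a unique index on the natural key exists:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE Person (
    PersonId  INTEGER PRIMARY KEY,   -- auto-assigned row number, nothing more
    FirstName TEXT NOT NULL,
    LastName  TEXT NOT NULL
)""")
con.executemany("INSERT INTO Person (FirstName, LastName) VALUES (?, ?)",
                [("Fred", "Astaire"), ("Ginger", "Rogers"), ("Fred", "Astaire")])

# The surrogate key happily admits a duplicate row:
dupes = con.execute("""
    SELECT FirstName, LastName, COUNT(*) FROM Person
    GROUP BY FirstName, LastName HAVING COUNT(*) > 1
""").fetchall()

# A unique index on the natural key is what actually prevents this:
con.execute("DELETE FROM Person WHERE PersonId = 3")
con.execute("CREATE UNIQUE INDEX U_Name ON Person (LastName, FirstName)")
try:
    con.execute("INSERT INTO Person (FirstName, LastName) VALUES ('Fred', 'Astaire')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```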

Related

SQL - specific requirement to compare tables

I'm trying to merge 2 queries into 1 (cuts the number of daily queries in half): I have 2 tables; I want to do a query against 1 table, then the same query against the other table, which has the same list, just fewer entries.
Basically it's a list of (let's call it, for obfuscation) people and hobbies. One table is ALL people & hobbies; the other, shorter list is people & hobbies that I've met. Everything in table 2 would be found in table 1. Table 1 includes entries (people I have yet to meet) not found in table 2.
The tables are synced up from elsewhere. What I'm looking to do is print a list of ALL people in the first column, then print the hobby ONLY of people that are on both lists. That way I can see the lists merged, and track the rate at which the gap between both lists is closing. I have tried a number of SQL combinations, but they either filter out the first table and match only items that are true for both (i.e. just giving me table 2) or just add table 2 to table 1.
Example of what I'm trying to do below:
+---------+----------+--+----------+---------+--+---------+----------+
| table1 | | | table2 | | | query | |
+---------+----------+--+----------+---------+--+---------+----------+
| name | hobby | | activity | person | | name | hobby |
| bob | fishing | | fishing | bob | | bob | fishing |
| bill | vidgames | | hiking | sarah | | bill | |
| sarah | hiking | | planking | sabrina | | sarah | hiking |
| mike | cooking | | | | | mike | |
| sabrina | planking | | | | | sabrina | planking |
+---------+----------+--+----------+---------+--+---------+----------+
Normally I'd just take the few days to learn SQL a bit better however I'm stretched pretty thin at work as it is!
I should mention that table 2 is flipped and the headings are all unique (don't think this matters)!
I think you just want a left join:
select t1.name, t2.activity as hobby
from table1 t1 left join
table2 t2
on t1.name = t2.person;
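A runnable version of that left join against the sample data, assuming SQLite and the table/column names from the question:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE table1 (name TEXT, hobby TEXT);
CREATE TABLE table2 (activity TEXT, person TEXT);
INSERT INTO table1 VALUES ('bob','fishing'),('bill','vidgames'),
                          ('sarah','hiking'),('mike','cooking'),
                          ('sabrina','planking');
INSERT INTO table2 VALUES ('fishing','bob'),('hiking','sarah'),
                          ('planking','sabrina');
""")

# Everyone from table1; hobby filled only where a table2 match exists.
rows = con.execute("""
    SELECT t1.name, t2.activity AS hobby
    FROM table1 t1
    LEFT JOIN table2 t2 ON t1.name = t2.person
""").fetchall()
```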

SQL - UNION vs NULL functions. Which is better?

I have three tables: ACCT, PERS, ORG. Each ACCT is owned by either a PERS or ORG. The PERS and ORG tables are very similar and so are all of their child tables, but all PERS and ORG data is separate.
I'm writing a query to get PERS and ORG information for each account in ACCT and I'm curious what the best method of combining the information is. Should I use a series of left joins and NULL functions to fill in the blanks, or should I write the queries separately and use UNION to combine?
I've already written separate queries for PERS ACCT's and another for ORG ACCT's and plan on using UNION. My question more pertains to best practice in the future.
I'm expecting both to give me my desired results, but I want to find the most efficient method, both in development time and run time.
EDIT: Sample Table Data
ACCT Table:
+---------+---------+--------------+-------------+
| ACCTNBR | ACCTTYP | OWNERPERSNBR | OWNERORGNBR |
+---------+---------+--------------+-------------+
| 555001 | abc | 3010 | |
| 555002 | abc | | 2255 |
| 555003 | tre | 5125 | |
| 555004 | tre | 4485 | |
| 555005 | dsa | | 6785 |
+---------+---------+--------------+-------------+
PERS Table:
+---------+--------------+---------------+----------+-------+
| PERSNBR | PHONE | STREET | CITY | STATE |
+---------+--------------+---------------+----------+-------+
| 3010 | 555-555-5555 | 1234 Main St | New York | NY |
| 5125 | 555-555-5555 | 1234 State St | New York | NY |
| 4485 | 555-555-5555 | 6542 Vine St | New York | NY |
+---------+--------------+---------------+----------+-------+
ORG Table:
+--------+--------------+--------------+----------+-------+
| ORGNBR | PHONE | STREET | CITY | STATE |
+--------+--------------+--------------+----------+-------+
| 2255 | 222-222-2222 | 1000 Main St | New York | NY |
| 6785 | 333-333-3333 | 400 4th St | New York | NY |
+--------+--------------+--------------+----------+-------+
Desired Output:
+---------+---------+--------------+-------------+--------------+---------------+----------+-------+
| ACCTNBR | ACCTTYP | OWNERPERSNBR | OWNERORGNBR | PHONE | STREET | CITY | STATE |
+---------+---------+--------------+-------------+--------------+---------------+----------+-------+
| 555001 | abc | 3010 | | 555-555-5555 | 1234 Main St | New York | NY |
| 555002 | abc | | 2255 | 222-222-2222 | 1000 Main St | New York | NY |
| 555003 | tre | 5125 | | 555-555-5555 | 1234 State St | New York | NY |
| 555004 | tre | 4485 | | 555-555-5555 | 6542 Vine St | New York | NY |
| 555005 | dsa | | 6785 | 333-333-3333 | 400 4th St | New York | NY |
+---------+---------+--------------+-------------+--------------+---------------+----------+-------+
Query Option 1: Write 2 queries and use UNION to combine them:
select a.acctnbr, a.accttyp, a.ownerpersnbr, a.ownerorgnbr, p.phone, p.street, p.city, p.state
from acct a
inner join pers p on p.persnbr = a.ownerpersnbr
UNION
select a.acctnbr, a.accttyp, a.ownerpersnbr, a.ownerorgnbr, o.phone, o.street, o.city, o.state
from acct a
inner join org o on o.orgnbr = a.ownerorgnbr
Option 2: Use NVL() or Coalesce to return a single data set:
SELECT a.acctnbr,
a.accttyp,
NVL(a.ownerpersnbr, a.ownerorgnbr) Owner,
NVL(p.phone, o.phone) Phone,
NVL(p.street, o.street) Street,
NVL(p.city, o.city) City,
NVL(p.state, o.state) State
FROM
acct a
LEFT JOIN pers p on p.persnbr = a.ownerpersnbr
LEFT JOIN org o on o.orgnbr = a.ownerorgnbr
There are way more fields in each of the 3 tables as well as many more PERS and ORG tables in my actual query. Is one way better (faster, more efficient) than another?
That depends on what you consider "better".
Assuming that you will always want to pull all rows from the ACCT table, I'd say go for the LEFT OUTER JOIN and no UNION. (If using UNION, then rather go for the UNION ALL variant.)
EDIT: As you've already shown your queries, mine is no longer required, and did not match your structures. Removing this part.
Why LEFT JOIN? Because with UNION you'd have to go through ACCT twice, based on "parent" criteria (whether separately or via INNER JOIN criteria), while with a plain LEFT OUTER JOIN you'll probably get just one pass through ACCT. In both cases, rows from the "parents" will most probably be accessed based on primary keys.
As you are probably considering performance, when looking for "better", as always: Test your queries and look at the execution plans with adequate and fresh database statistics in place, as depending on the data "layout" (histograms, etc.) the "better" may be something completely different.
I think you misunderstand what a Union does versus a join statement. A union takes the records from multiple tables, generally similar or the same structure and combines them into a single resultset. It is not meant to combine multiple dissimilar tables.
What I am seeing is that you have two tables PERS and ORG with some of the same data in it. In this case I suggest you union those two tables and then join to ACCT to get the sample output.
In this case to get the output as you have shown you would want to use Outer joins so that you don't drop any records without a match. That will give you nulls in some places but most of the time that is what you want. It is much easier to filter those out later.
Very rough sample code.
SELECT a.*, b.*
from Acct as a
FULL OUTER JOIN (
Select * from PERS UNION Select * from ORG
) as b
ON a.ID = b.ID
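To see that the two strategies do return the same rows, here is a cut-down sketch in SQLite, with the schema trimmed to one payload column and `COALESCE` standing in for Oracle's `NVL`:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE acct (acctnbr INT, accttyp TEXT, ownerpersnbr INT, ownerorgnbr INT);
CREATE TABLE pers (persnbr INT, phone TEXT);
CREATE TABLE org  (orgnbr INT, phone TEXT);
INSERT INTO acct VALUES (555001,'abc',3010,NULL),(555002,'abc',NULL,2255);
INSERT INTO pers VALUES (3010,'555-555-5555');
INSERT INTO org  VALUES (2255,'222-222-2222');
""")

# Option 1: two inner-join queries glued with UNION ALL.
union_rows = con.execute("""
    SELECT a.acctnbr, p.phone FROM acct a
    JOIN pers p ON p.persnbr = a.ownerpersnbr
    UNION ALL
    SELECT a.acctnbr, o.phone FROM acct a
    JOIN org o ON o.orgnbr = a.ownerorgnbr
    ORDER BY acctnbr
""").fetchall()

# Option 2: one pass over acct with two LEFT JOINs and COALESCE.
join_rows = con.execute("""
    SELECT a.acctnbr, COALESCE(p.phone, o.phone) AS phone
    FROM acct a
    LEFT JOIN pers p ON p.persnbr = a.ownerpersnbr
    LEFT JOIN org  o ON o.orgnbr  = a.ownerorgnbr
    ORDER BY a.acctnbr
""").fetchall()
```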

How to use XML Path to generate a grid

I need to output results of a query to a grid, rather a long list of values.
What I have right now is
(SELECT COLUMN1+' '+COLUMN2
FROM TABLE
FOR XML PATH) AS MyGrid
Results I have are displayed as
Bob s12345 Chuck s54321
I would like to have them displayed as
Bob s12345
Chuck s54321
Any help, please?
Added table records
CustID | CustName | StoreNumber | City
------+----------+--------------+-----------
1 | Bob | s12345 | Somewhere
2 | Chuck | s54321 | Town
3 | Paul | s19285 | BillaBong
4 | David | s65478 | North
5 | Arnold | s47381 | South
The MyGrid ALIAS is passed to Outlook as merge field.
You can use CROSS APPLY with VALUES:
select value1, value2 from table
cross apply
(values (value3, value4)) b (v1, v2)

Remove newest redundant row and update timestamp

I'm working with a SQLite database that receives large data dumps on a regular basis from several sources. Unfortunately, those sources aren't intelligent about what they dump, and I end up with a lot of repeated records from one time to the next. I'm looking for a way to remove these repeated records without affecting the records that have legitimately changed from the past dump to this one.
Here's the general structure of the data (_id is the primary key):
| _id | _dateUpdated | _dateEffective | _dateExpired | name | status | location |
|-----|--------------|----------------|--------------|------|--------|----------|
| 1 | 2016-05-01 | 2016-05-01 | NULL | Fred | Online | USA |
| 2 | 2016-05-01 | 2016-05-01 | NULL | Jim | Online | USA |
| 3 | 2016-05-08 | 2016-05-08 | NULL | Fred | Offline| USA |
| 4 | 2016-05-08 | 2016-05-08 | NULL | Jim | Online | USA |
| 5 | 2016-05-15 | 2016-05-15 | NULL | Fred | Offline| USA |
| 6 | 2016-05-15 | 2016-05-15 | NULL | Jim | Online | USA |
I'd like to be able to reduce this data to something like this:
| _id | _dateUpdated | _dateEffective | _dateExpired | name | status | location |
|-----|--------------|----------------|--------------|------|--------|----------|
| 1 | 2016-05-01 | 2016-05-01 | 2016-05-07 | Fred | Online | USA |
| 2 | 2016-05-15 | 2016-05-01 | NULL | Jim | Online | USA |
| 3 | 2016-05-15 | 2016-05-08 | NULL | Fred | Offline| USA |
The idea here is that rows 4, 5, and 6 exactly duplicate rows 2 and 3 except for the timestamps (I'd need to compare by all three fields - name, status, location). However, row 3 does not duplicate row 1 (status changed from Online to Offline), so the _dateExpired field is set in row 1, and row 3 becomes the most recent record.
I'm querying this table with something like this:
SELECT * FROM Data WHERE
date(_dateEffective) <= date("now")
AND (_dateExpired IS NULL OR date(_dateExpired) > date("now"))
Is this sort of reduction possible in SQLite?
I am still a beginner to SQL and database design in general, so it's possible that I haven't structured the database in the best way. I'm open to suggestions there as well...I'm going for the ability to query data at a given point in time - for example, "what was Jim's status around 2016-05-06?"
Thanks in advance!
Consider using a staging table where the dump file goes into a DumpTable (regularly cleaned out before each dump) and then an INSERT...SELECT query migrates to your final table.
Now the SELECT portion uses a correlated subquery (to calculate the new [_dateExpired] for the rows that need it) and a derived table subquery (to filter out dups according to your criteria). Finally, the LEFT JOIN...NULL against FinalTable ensures no duplicate records are appended, assuming [_id] is a unique identifier. Below is the routine:
Clean Out DumpTable
DELETE FROM DumpTable;
Run Dump Routine to be appended into DumpTable
Append Records to FinalTable
INSERT INTO FinalTable ([_id], [_dateUpdated], [_dateEffective], [_dateExpired],
[name], status, location)
SELECT d.[_id], d.[_dateUpdated], d.[_dateEffective],
(SELECT Min(date(sub.[_dateEffective], '-1 day'))
FROM DumpTable sub
WHERE sub.[name] = d.[name]
AND sub.[_dateEffective] > d.[_dateEffective]
AND sub.status <> d.status) AS calcExpired,
d.name, d.status, d.location
FROM DumpTable d
INNER JOIN
(SELECT Min(DumpTable.[_id]) AS min_id,
DumpTable.name, DumpTable.status
FROM DumpTable
GROUP BY DumpTable.name, DumpTable.status) AS c
ON (c.name = d.name)
AND (c.min_id = d.[_id])
AND (c.status = d.status)
LEFT JOIN FinalTable f
ON d.[_id] = f.[_id]
WHERE f.[_id] IS NULL;
-- INSERTED RECORDS:
-- _id _dateUpdated _dateEffective _dateExpired name status location
-- 1 2016-05-01 2016-05-01 2016-05-07 Fred Online USA
-- 2 2016-05-01 2016-05-01 Jim Online USA
-- 3 2016-05-08 2016-05-08 Fred Offline USA
Is this sort of reduction possible in SQLite?
The answer to any "reduction" question in SQL is always Yes. The trick is to find what axes you're reducing along.
Here's a partial solution to illustrate; it gives the first Online date for each name & location.
select min(_dateEffective) as start_date
, name
, location
from Data
where status = 'Online'
group by
name
, location
With an outer join back to the table (on name & location) where the status is 'Offline' and the _dateEffective is greater than start_date, you get your _dateExpired.
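That follow-up step can also be folded in as a correlated subquery: group to the first sighting of each (name, status, location), then expire each run the day before a row with a different status appears. A sketch against the sample data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Data (_id INTEGER PRIMARY KEY, _dateUpdated TEXT,
                   _dateEffective TEXT, name TEXT, status TEXT, location TEXT);
INSERT INTO Data (_dateUpdated, _dateEffective, name, status, location) VALUES
 ('2016-05-01','2016-05-01','Fred','Online','USA'),
 ('2016-05-01','2016-05-01','Jim','Online','USA'),
 ('2016-05-08','2016-05-08','Fred','Offline','USA'),
 ('2016-05-08','2016-05-08','Jim','Online','USA'),
 ('2016-05-15','2016-05-15','Fred','Offline','USA'),
 ('2016-05-15','2016-05-15','Jim','Online','USA');
""")

rows = con.execute("""
SELECT g.name, g.status, g.start_date,
       -- Day before the earliest later row with a different status.
       (SELECT MIN(date(x._dateEffective, '-1 day'))
        FROM Data x
        WHERE x.name = g.name AND x.location = g.location
          AND x.status <> g.status
          AND x._dateEffective > g.start_date) AS _dateExpired
FROM (SELECT name, status, location, MIN(_dateEffective) AS start_date
      FROM Data
      GROUP BY name, status, location) g
ORDER BY g.start_date, g.name
""").fetchall()
```

This only handles a single status flip per (name, location); a status that toggles back and forth would need run-detection (gaps-and-islands) logic instead.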
_id is the primary key
There is a commonly held misunderstanding that every table needs some kind of sequential "ID" number as a primary key. The key you really care about is known as a natural key: one or more columns in the data that uniquely identify each row. In your case, it looks to me like that's _dateEffective, name, status, and location. At the very least, declare them unique to prevent accidental duplication.

JOIN, aggregate and convert in postgres between two tables

Here are the two tables I have (all columns in both tables are of type "text"). The table names and the column names are in bold.
Names
--------------------------------
Name | DoB | Team |
--------------------------------
Harry | 3/12/85 | England
Kevin | 8/07/86 | England
James | 5/05/89 | England
Scores
------------------------
ScoreName | Score
------------------------
James-1 | 120
Harry-1 | 30
Harry-2 | 40
James-2 | 56
End result i need is a table that has the following
NameScores
---------------------------------------------
Name | DoB | Team | ScoreData
---------------------------------------------
Harry | 3/12/85 | England | "{"ScoreName":"Harry-1", "Score":"30"}, {"ScoreName":"Harry-2", "Score":"40"}"
Kevin | 8/07/86 | England | null
James | 5/05/89 | England | "{"ScoreName":"James-1", "Score":"120"}, {"ScoreName":"James-2", "Score":"56"}"
I need to do this using a single SQL command, which I will use to create a materialized view.
I have gotten as far as realising that it will involve a combination of string_agg, JOIN and JSON, but I haven't been able to crack it fully. Please help :)
I don't think the join is tricky. The complication is building the JSON object:
select n.name, n.dob, n.team,
json_agg(json_build_object('ScoreName', s.name,
'Score', s.score)) as ScoreData
from names n left join
scores s
on s.name like concat(n.name, '-%')
group by n.name, n.dob, n.team;
Note: json_build_object() was introduced in Postgres 9.4.
EDIT:
I think you can add a case statement to get the simple NULL:
(case when s.name is null then NULL
else json_agg(json_build_object('ScoreName', s.name,
'Score', s.score))
end) as ScoreData
Use json_agg() with row_to_json() to aggregate scores data into a json value:
select n.*, json_agg(row_to_json(s)) "ScoreData"
from "Names" n
left join "Scores" s
on n."Name" = regexp_replace(s."ScoreName", '(.*)-.*', '\1')
group by 1, 2, 3;
Name | DoB | Team | ScoreData
-------+---------+---------+---------------------------------------------------------------------------
Harry | 3/12/85 | England | [{"ScoreName":"Harry-1","Score":30}, {"ScoreName":"Harry-2","Score":40}]
James | 5/05/89 | England | [{"ScoreName":"James-1","Score":120}, {"ScoreName":"James-2","Score":56}]
Kevin | 8/07/86 | England | [null]
(3 rows)