In PostgreSQL database I have table which has columns like ITEM_ID and PARENT_ITEM_ID.
| ITEM_ID | ITEM_NAME | PARENT_ITEM_ID |
|---------|-----------|----------------|
| 1 | A | 0 |
| 2 | B | 0 |
| 3 | C | 1 |
My task to take all values from these columns and put them to one array. In the same time I need delete all duplicates. I started with such SQL query but what the best way to delete duplicates?
SELECT
ARRAY_AGG(ITEM_ID || ',' || PARENT_ITEM_ID)
FROM
ITEMS_RELATIONSHIP
GROUP BY
ITEM_ID
I want such result:
[1,0,2,3]
Right now I have such result:
|{1,0}|
|{2,0}|
|{3,1}|
If you want one array of all item IDs, don't group by item_id. Something like this might be what you want:
select
array_agg(item_id, ',') as itemlist
from
(
select item_id from items_relationship
union
select parent_item_id from items_relationship
) as allitems;
Here is one method to get the parent item ids in with the other item ids:
select array_agg(distinct item_id)
from items_relationship ir cross join lateral
(values (ir.item_id), (ir.parent_item_id)) v(item_id);
This unpivots the data using a lateral join and then aggregates.
I tried to search posts, but I only found solutions for SQL Server/Access. I need a solution in MySQL (5.X).
I have a table (called history) with 3 columns: hostid, itemname, itemvalue.
If I do a select (select * from history), it will return
+--------+----------+-----------+
| hostid | itemname | itemvalue |
+--------+----------+-----------+
| 1 | A | 10 |
+--------+----------+-----------+
| 1 | B | 3 |
+--------+----------+-----------+
| 2 | A | 9 |
+--------+----------+-----------+
| 2 | C | 40 |
+--------+----------+-----------+
How do I query the database to return something like
+--------+------+-----+-----+
| hostid | A | B | C |
+--------+------+-----+-----+
| 1 | 10 | 3 | 0 |
+--------+------+-----+-----+
| 2 | 9 | 0 | 40 |
+--------+------+-----+-----+
I'm going to add a somewhat longer and more detailed explanation of the steps to take to solve this problem. I apologize if it's too long.
I'll start out with the base you've given and use it to define a couple of terms that I'll use for the rest of this post. This will be the base table:
select * from history;
+--------+----------+-----------+
| hostid | itemname | itemvalue |
+--------+----------+-----------+
| 1 | A | 10 |
| 1 | B | 3 |
| 2 | A | 9 |
| 2 | C | 40 |
+--------+----------+-----------+
This will be our goal, the pretty pivot table:
select * from history_itemvalue_pivot;
+--------+------+------+------+
| hostid | A | B | C |
+--------+------+------+------+
| 1 | 10 | 3 | 0 |
| 2 | 9 | 0 | 40 |
+--------+------+------+------+
Values in the history.hostid column will become y-values in the pivot table. Values in the history.itemname column will become x-values (for obvious reasons).
When I have to solve the problem of creating a pivot table, I tackle it using a three-step process (with an optional fourth step):
select the columns of interest, i.e. y-values and x-values
extend the base table with extra columns -- one for each x-value
group and aggregate the extended table -- one group for each y-value
(optional) prettify the aggregated table
Let's apply these steps to your problem and see what we get:
Step 1: select columns of interest. In the desired result, hostid provides the y-values and itemname provides the x-values.
Step 2: extend the base table with extra columns. We typically need one column per x-value. Recall that our x-value column is itemname:
create view history_extended as (
select
history.*,
case when itemname = "A" then itemvalue end as A,
case when itemname = "B" then itemvalue end as B,
case when itemname = "C" then itemvalue end as C
from history
);
select * from history_extended;
+--------+----------+-----------+------+------+------+
| hostid | itemname | itemvalue | A | B | C |
+--------+----------+-----------+------+------+------+
| 1 | A | 10 | 10 | NULL | NULL |
| 1 | B | 3 | NULL | 3 | NULL |
| 2 | A | 9 | 9 | NULL | NULL |
| 2 | C | 40 | NULL | NULL | 40 |
+--------+----------+-----------+------+------+------+
Note that we didn't change the number of rows -- we just added extra columns. Also note the pattern of NULLs -- a row with itemname = "A" has a non-null value for new column A, and null values for the other new columns.
Step 3: group and aggregate the extended table. We need to group by hostid, since it provides the y-values:
create view history_itemvalue_pivot as (
select
hostid,
sum(A) as A,
sum(B) as B,
sum(C) as C
from history_extended
group by hostid
);
select * from history_itemvalue_pivot;
+--------+------+------+------+
| hostid | A | B | C |
+--------+------+------+------+
| 1 | 10 | 3 | NULL |
| 2 | 9 | NULL | 40 |
+--------+------+------+------+
(Note that we now have one row per y-value.) Okay, we're almost there! We just need to get rid of those ugly NULLs.
Step 4: prettify. We're just going to replace any null values with zeroes so the result set is nicer to look at:
create view history_itemvalue_pivot_pretty as (
select
hostid,
coalesce(A, 0) as A,
coalesce(B, 0) as B,
coalesce(C, 0) as C
from history_itemvalue_pivot
);
select * from history_itemvalue_pivot_pretty;
+--------+------+------+------+
| hostid | A | B | C |
+--------+------+------+------+
| 1 | 10 | 3 | 0 |
| 2 | 9 | 0 | 40 |
+--------+------+------+------+
And we're done -- we've built a nice, pretty pivot table using MySQL.
Considerations when applying this procedure:
what value to use in the extra columns. I used itemvalue in this example
what "neutral" value to use in the extra columns. I used NULL, but it could also be 0 or "", depending on your exact situation
what aggregate function to use when grouping. I used sum, but count and max are also often used (max is often used when building one-row "objects" that had been spread across many rows)
using multiple columns for y-values. This solution isn't limited to using a single column for the y-values -- just plug the extra columns into the group by clause (and don't forget to select them)
Known limitations:
this solution doesn't allow n columns in the pivot table -- each pivot column needs to be manually added when extending the base table. So for 5 or 10 x-values, this solution is nice. For 100, not so nice. There are some solutions with stored procedures generating a query, but they're ugly and difficult to get right. I currently don't know of a good way to solve this problem when the pivot table needs to have lots of columns.
SELECT
hostid,
sum( if( itemname = 'A', itemvalue, 0 ) ) AS A,
sum( if( itemname = 'B', itemvalue, 0 ) ) AS B,
sum( if( itemname = 'C', itemvalue, 0 ) ) AS C
FROM
bob
GROUP BY
hostid;
Another option,especially useful if you have many items you need to pivot is to let mysql build the query for you:
SELECT
GROUP_CONCAT(DISTINCT
CONCAT(
'ifnull(SUM(case when itemname = ''',
itemname,
''' then itemvalue end),0) AS `',
itemname, '`'
)
) INTO #sql
FROM
history;
SET #sql = CONCAT('SELECT hostid, ', #sql, '
FROM history
GROUP BY hostid');
PREPARE stmt FROM #sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
FIDDLE
Added some extra values to see it working
GROUP_CONCAT has a default value of 1000 so if you have a really big query change this parameter before running it
SET SESSION group_concat_max_len = 1000000;
Test:
DROP TABLE IF EXISTS history;
CREATE TABLE history
(hostid INT,
itemname VARCHAR(5),
itemvalue INT);
INSERT INTO history VALUES(1,'A',10),(1,'B',3),(2,'A',9),
(2,'C',40),(2,'D',5),
(3,'A',14),(3,'B',67),(3,'D',8);
hostid A B C D
1 10 3 0 0
2 9 0 40 5
3 14 67 0 8
Taking advantage of Matt Fenwick's idea that helped me to solve the problem (a lot of thanks), let's reduce it to only one query:
select
history.*,
coalesce(sum(case when itemname = "A" then itemvalue end), 0) as A,
coalesce(sum(case when itemname = "B" then itemvalue end), 0) as B,
coalesce(sum(case when itemname = "C" then itemvalue end), 0) as C
from history
group by hostid
I edit Agung Sagita's answer from subquery to join.
I'm not sure about how much difference between this 2 way, but just for another reference.
SELECT hostid, T2.VALUE AS A, T3.VALUE AS B, T4.VALUE AS C
FROM TableTest AS T1
LEFT JOIN TableTest T2 ON T2.hostid=T1.hostid AND T2.ITEMNAME='A'
LEFT JOIN TableTest T3 ON T3.hostid=T1.hostid AND T3.ITEMNAME='B'
LEFT JOIN TableTest T4 ON T4.hostid=T1.hostid AND T4.ITEMNAME='C'
use subquery
SELECT hostid,
(SELECT VALUE FROM TableTest WHERE ITEMNAME='A' AND hostid = t1.hostid) AS A,
(SELECT VALUE FROM TableTest WHERE ITEMNAME='B' AND hostid = t1.hostid) AS B,
(SELECT VALUE FROM TableTest WHERE ITEMNAME='C' AND hostid = t1.hostid) AS C
FROM TableTest AS T1
GROUP BY hostid
but it will be a problem if sub query resulting more than a row, use further aggregate function in the subquery
If you could use MariaDB there is a very very easy solution.
Since MariaDB-10.02 there has been added a new storage engine called CONNECT that can help us to convert the results of another query or table into a pivot table, just like what you want:
You can have a look at the docs.
First of all install the connect storage engine.
Now the pivot column of our table is itemname and the data for each item is located in itemvalue column, so we can have the result pivot table using this query:
create table pivot_table
engine=connect table_type=pivot tabname=history
option_list='PivotCol=itemname,FncCol=itemvalue';
Now we can select what we want from the pivot_table:
select * from pivot_table
More details here
My solution :
select h.hostid, sum(ifnull(h.A,0)) as A, sum(ifnull(h.B,0)) as B, sum(ifnull(h.C,0)) as C from (
select
hostid,
case when itemName = 'A' then itemvalue end as A,
case when itemName = 'B' then itemvalue end as B,
case when itemName = 'C' then itemvalue end as C
from history
) h group by hostid
It produces the expected results in the submitted case.
I make that into Group By hostId then it will show only first row with values,
like:
A B C
1 10
2 3
I figure out one way to make my reports converting rows to columns almost dynamic using simple querys. You can see and test it online here.
The number of columns of query is fixed but the values are dynamic and based on values of rows. You can build it So, I use one query to build the table header and another one to see the values:
SELECT distinct concat('<th>',itemname,'</th>') as column_name_table_header FROM history order by 1;
SELECT
hostid
,(case when itemname = (select distinct itemname from history a order by 1 limit 0,1) then itemvalue else '' end) as col1
,(case when itemname = (select distinct itemname from history a order by 1 limit 1,1) then itemvalue else '' end) as col2
,(case when itemname = (select distinct itemname from history a order by 1 limit 2,1) then itemvalue else '' end) as col3
,(case when itemname = (select distinct itemname from history a order by 1 limit 3,1) then itemvalue else '' end) as col4
FROM history order by 1;
You can summarize it, too:
SELECT
hostid
,sum(case when itemname = (select distinct itemname from history a order by 1 limit 0,1) then itemvalue end) as A
,sum(case when itemname = (select distinct itemname from history a order by 1 limit 1,1) then itemvalue end) as B
,sum(case when itemname = (select distinct itemname from history a order by 1 limit 2,1) then itemvalue end) as C
FROM history group by hostid order by 1;
+--------+------+------+------+
| hostid | A | B | C |
+--------+------+------+------+
| 1 | 10 | 3 | NULL |
| 2 | 9 | NULL | 40 |
+--------+------+------+------+
Results of RexTester:
http://rextester.com/ZSWKS28923
For one real example of use, this report bellow show in columns the hours of departures arrivals of boat/bus with a visual schedule. You will see one additional column not used at the last col without confuse the visualization:
** ticketing system to of sell ticket online and presential
This isn't the exact answer you are looking for but it was a solution that i needed on my project and hope this helps someone. This will list 1 to n row items separated by commas. Group_Concat makes this possible in MySQL.
select
cemetery.cemetery_id as "Cemetery_ID",
GROUP_CONCAT(distinct(names.name)) as "Cemetery_Name",
cemetery.latitude as Latitude,
cemetery.longitude as Longitude,
c.Contact_Info,
d.Direction_Type,
d.Directions
from cemetery
left join cemetery_names on cemetery.cemetery_id = cemetery_names.cemetery_id
left join names on cemetery_names.name_id = names.name_id
left join cemetery_contact on cemetery.cemetery_id = cemetery_contact.cemetery_id
left join
(
select
cemetery_contact.cemetery_id as cID,
group_concat(contacts.name, char(32), phone.number) as Contact_Info
from cemetery_contact
left join contacts on cemetery_contact.contact_id = contacts.contact_id
left join phone on cemetery_contact.contact_id = phone.contact_id
group by cID
)
as c on c.cID = cemetery.cemetery_id
left join
(
select
cemetery_id as dID,
group_concat(direction_type.direction_type) as Direction_Type,
group_concat(directions.value , char(13), char(9)) as Directions
from directions
left join direction_type on directions.type = direction_type.direction_type_id
group by dID
)
as d on d.dID = cemetery.cemetery_id
group by Cemetery_ID
This cemetery has two common names so the names are listed in different rows connected by a single id but two name ids and the query produces something like this
CemeteryID Cemetery_Name Latitude
1 Appleton,Sulpher Springs 35.4276242832293
You can use a couple of LEFT JOINs. Kindly use this code
SELECT t.hostid,
COALESCE(t1.itemvalue, 0) A,
COALESCE(t2.itemvalue, 0) B,
COALESCE(t3.itemvalue, 0) C
FROM history t
LEFT JOIN history t1
ON t1.hostid = t.hostid
AND t1.itemname = 'A'
LEFT JOIN history t2
ON t2.hostid = t.hostid
AND t2.itemname = 'B'
LEFT JOIN history t3
ON t3.hostid = t.hostid
AND t3.itemname = 'C'
GROUP BY t.hostid
I'm sorry to say this and maybe I'm not solving your problem exactly but PostgreSQL is 10 years older than MySQL and is extremely advanced compared to MySQL and there's many ways to achieve this easily. Install PostgreSQL and execute this query
CREATE EXTENSION tablefunc;
then voila! And here's extensive documentation: PostgreSQL: Documentation: 9.1: tablefunc or this query
CREATE EXTENSION hstore;
then again voila! PostgreSQL: Documentation: 9.0: hstore
I have following tables
create table top100
(
id integer not null,
top100ids integer[] not null
);
create table top100_data
(
id integer not null,
data_string text not null
);
Rows in table top100 look like:
1, {1,2,3,4,5,6...100}
Rows in table top100_data look like:
1, 'string of text, up to 500 chars'
I need to get the text values from table top100_data and join them with table top100.
So the result will be:
1, {'text1','text2','text3',...'text100'}
I am currenly doing this on application side by selecting from top100, then iterating over all array items and then selecting from top100_data and iterating again + transforming ids to their _data text values.
This can be very slow on large data sets.
Is is possible to get this same result with single SQL query?
You can unnest() and re-aggregate:
select t100.id, array_agg(t100d.data order by top100id)
from top100 t100 cross join
unnest(top100ids) as top100id join
top100_data t100d
on t100d.id = top100id
group by t100.id;
Or if you want to keep the original ordering:
select t100.id, array_agg(t100d.data order by top100id.n)
from top100 t100 cross join
unnest(top100ids) with ordinality as top100id(id, n) join
top100_data t100d
on t100d.id = top100id.id
group by t100.id;
Just use unnest and array_agg function in PostgreSQL, your final sql could be like below:
with core as (
select
id,
unnest(top100ids) as top_id
from
top100
)
select
t1.id,
array_agg(t1.data_string) as text_datas
from
top100 t1
join
core c on t1.id = c.top_id
The example of unnest as below:
postgres=# select * from my_test;
id | top_ids
----+--------------
1 | {1,2,3,4,5}
2 | {6,7,8,9,10}
(2 rows)
postgres=# select id, unnest(top_ids) from my_test;
id | unnest
----+--------
1 | 1
1 | 2
1 | 3
1 | 4
1 | 5
2 | 6
2 | 7
2 | 8
2 | 9
2 | 10
(10 rows)
The example of array_agg as below:
postgres=# select * from my_test_1 ;
id | content
----+---------
1 | a
1 | b
1 | c
1 | d
2 | x
2 | y
(6 rows)
postgres=# select id,array_agg(content) from my_test_1 group by id;
id | array_agg
----+-----------
1 | {a,b,c,d}
2 | {x,y}
(2 rows)
after some transformation I have a result from a cross join (from table a and b) where I want to do some analysis on. The table for this looks like this:
+-----+------+------+------+------+-----+------+------+------+------+
| id | 10_1 | 10_2 | 11_1 | 11_2 | id | 10_1 | 10_2 | 11_1 | 11_2 |
+-----+------+------+------+------+-----+------+------+------+------+
| 111 | 1 | 0 | 1 | 0 | 222 | 1 | 0 | 1 | 0 |
| 111 | 1 | 0 | 1 | 0 | 333 | 0 | 0 | 0 | 0 |
| 111 | 1 | 0 | 1 | 0 | 444 | 1 | 0 | 1 | 1 |
| 112 | 0 | 1 | 1 | 0 | 222 | 1 | 0 | 1 | 0 |
+-----+------+------+------+------+-----+------+------+------+------+
The ids in the first column are different from the ids in the sixth column.
In a row are always two different IDs that are matched with each other. The other columns always have either 0 or 1 as a value.
I am now trying to find out how many values(meaning both have "1" in 10_1, 10_2 etc) two IDs have on average in common, but I don't really know how to do so.
I was trying something like this as a start:
SELECT SUM(CASE WHEN a.10_1 = 1 AND b.10_1 = 1 then 1 end)
But this would obviously only count how often two ids have 10_1 in common. I could make something like this for example for different columns:
SELECT SUM(CASE WHEN (a.10_1 = 1 AND b.10_1 = 1)
OR (a.10_2 = 1 AND b.10_1 = 1) OR [...] then 1 end)
To count in general how often two IDs have one thing in common, but this would of course also count if they have two or more things in common. Plus, I would also like to know how often two IDS have two things, three things etc in common.
One "problem" in my case is also that I have like ~30 columns I want to look at, so I can hardly write down for each case every possible combination.
Does anyone know how I can approach my problem in a better way?
Thanks in advance.
Edit:
A possible result could look like this:
+-----------+---------+
| in_common | count |
+-----------+---------+
| 0 | 100 |
| 1 | 500 |
| 2 | 1500 |
| 3 | 5000 |
| 4 | 3000 |
+-----------+---------+
With the codes as column names, you're going to have to write some code that explicitly references each column name. To keep that to a minimum, you could write those references in a single union statement that normalizes the data, such as:
select id, '10_1' where "10_1" = 1
union
select id, '10_2' where "10_2" = 1
union
select id, '11_1' where "11_1" = 1
union
select id, '11_2' where "11_2" = 1;
This needs to be modified to include whatever additional columns you need to link up different IDs. For the purpose of this illustration, I assume the following data model
create table p (
id integer not null primary key,
sex character(1) not null,
age integer not null
);
create table t1 (
id integer not null,
code character varying(4) not null,
constraint pk_t1 primary key (id, code)
);
Though your data evidently does not currently resemble this structure, normalizing your data into a form like this would allow you to apply the following solution to summarize your data in the desired form.
select
in_common,
count(*) as count
from (
select
count(*) as in_common
from (
select
a.id as a_id, a.code,
b.id as b_id, b.code
from
(select p.*, t1.code
from p left join t1 on p.id=t1.id
) as a
inner join (select p.*, t1.code
from p left join t1 on p.id=t1.id
) as b on b.sex <> a.sex and b.age between a.age-10 and a.age+10
where
a.id < b.id
and a.code = b.code
) as c
group by
a_id, b_id
) as summ
group by
in_common;
The proposed solution requires first to take one step back from the cross-join table, as the identical column names are super annoying. Instead, we take the ids from the two tables and put them in a temporary table. The following query gets the result wanted in the question. It assumes table_a and table_b from the question are the same and called tbl, but this assumption is not needed and tbl can be replaced by table_a and table_b in the two sub-SELECT queries. It looks complicated and uses the JSON trick to flatten the columns, but it works here:
WITH idtable AS (
SELECT a.id as id_1, b.id as id_2 FROM
-- put cross join of table a and table b here
)
SELECT in_common,
count(*)
FROM
(SELECT idtable.*,
sum(CASE
WHEN meltedR.value::text=meltedL.value::text THEN 1
ELSE 0
END) AS in_common
FROM idtable
JOIN
(SELECT tbl.id,
b.*
FROM tbl, -- change here to table_a
json_each(row_to_json(tbl)) b -- and here too
WHERE KEY<>'id' ) meltedL ON (idtable.id_1 = meltedL.id)
JOIN
(SELECT tbl.id,
b.*
FROM tbl, -- change here to table_b
json_each(row_to_json(tbl)) b -- and here too
WHERE KEY<>'id' ) meltedR ON (idtable.id_2 = meltedR.id
AND meltedL.key = meltedR.key)
GROUP BY idtable.id_1,
idtable.id_2) tt
GROUP BY in_common ORDER BY in_common;
The output here looks like this:
in_common | count
-----------+-------
2 | 2
3 | 1
4 | 1
(3 rows)
I have found it quite hard to word what I want to do in the title so I will try my best to explain now!
I have two tables which I am using:
Master_Tab and Parts_Tab
Parts_Tab has the following information:
Order_Number | Completed| Part_Number|
| 1 | Y | 64 |
| 2 | N | 32 |
| 3 | Y | 42 |
| 1 | N | 32 |
| 1 | N | 5 |
Master_Tab has the following information:
Order_Number|
1 |
2 |
3 |
4 |
5 |
I want to generate a query which will return ALL of the Order_Numbers listed in the Master_Tab on the following conditions...
For each Order_Number I want to check the Parts_Tab table to see if there are any parts which aren't complete (Completed = 'N'). For each Order_Number I then want to count the number of uncompleted parts an order has against it. If an Order_Number does not have uncompleted parts or it is not in the Parts_Table then I want the count value to be 0.
So the table that would be generated would look like this:
Order_Number | Count_of_Non_Complete_Parts|
1 | 2 |
2 | 1 |
3 | 0 |
4 | 0 |
5 | 0 |
I was hoping that using a different kind of join on the tables would do this but I am clearly missing the trick!
Any help is much appreciated!
Thanks.
I have used COALESCE to convert NULL to zero where necessary. Depending on your database platform, you may need to use another method, e.g. ISNULL or CASE.
select mt.Order_Number,
coalesce(ptc.Count, 0) as Count_of_Non_Complete_Parts
from Master_Tab mt
left outer join (
select Order_Number, count(*) as Count
from Parts_Tab
where Completed = 'N'
group by Order_Number
) ptc on mt.Order_Number = ptc.Order_Number
order by mt.Order_Number
You are looking for a LEFT JOIN.
SELECT mt.order_number, count(part_number) AS count_noncomplete_parts
FROM master_tab mt LEFT JOIN parts_tab pt
ON mt.order_number=pt.order_number AND pt.completed='N'
GROUP BY mt.order_number;
It is also possible to put pt.completed='N' into a WHERE clause, but you have to be careful of NULLs. Instead of the AND you can have
WHERE pt.completed='N' OR pr.completed IS NULL
SELECT mt.Order_Number SUM(tbl.Incomplete) Count_of_Non_Complete_Parts
FROM Master_Tab mt
LEFT JOIN (
SELECT Order_Number, CASE WHEN Completed = 'N' THEN 1 ELSE 0 END Incomplete
FROM Parts_Tab
) tbl on mt.Order_Number = tbl.Order_Number
GROUP BY mt.Order_Number
Add a WHERE clause to the outer query if you need to filter for specific order numbers.
I think it's easiest to get a subquery in there. I think this should be self-explanitory, if not feel free to ask any questions.
CREATE TABLE #Parts
(
Order_Number int,
Completed char(1),
Part_Number int
)
CREATE TABLE #Master
(
Order_Number int
)
INSERT INTO #Parts
SELECT 1, 'Y', 64 UNION ALL
SELECT 2, 'N', 32 UNION ALL
SELECT 3, 'Y', 42 UNION ALL
SELECT 1, 'N', 32 UNION ALL
SELECT 1, 'N', 5
INSERT INTO #Master
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
SELECT 6
SELECT M.Order_Number, ISNULL(Totals.NonCompletedCount, 0) FROM #Master M
LEFT JOIN (SELECT P.Order_Number, COUNT(*) AS NonCompletedCount FROM #Parts P
WHERE P.Completed = 'N'
GROUP BY P.Order_Number) Totals ON Totals.Order_Number = M.Order_Number