Group by fields in single row [duplicate] - sql

Does anyone know how to create crosstab queries in PostgreSQL?
For example I have the following table:
Section | Status | Count
A | Active | 1
A | Inactive | 2
B | Active | 4
B | Inactive | 5
I would like the query to return the following crosstab:
Section | Active | Inactive
A | 1 | 2
B | 4 | 5
Is this possible?

Install the additional module tablefunc once per database, which provides the function crosstab(). Since Postgres 9.1 you can use CREATE EXTENSION for that:
CREATE EXTENSION IF NOT EXISTS tablefunc;
Improved test case
CREATE TABLE tbl (
  section text
, status text
, ct integer -- "count" is a reserved word in standard SQL
);
INSERT INTO tbl VALUES
  ('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
, ('C', 'Inactive', 7); -- ('C', 'Active') is missing
Simple form - not fit for missing attributes
crosstab(text) with 1 input parameter:
SELECT *
FROM crosstab(
  'SELECT section, status, ct
   FROM tbl
   ORDER BY 1,2' -- needs to be "ORDER BY 1,2" here
) AS ct ("Section" text, "Active" int, "Inactive" int);
Section | Active | Inactive
A | 1 | 2
B | 4 | 5
C | 7 | -- !!
No need for casting and renaming.
Note the incorrect result for C: the value 7 is filled in for the first column. Sometimes, this behavior is desirable, but not for this use case.
The simple form is also limited to exactly three columns in the provided input query: row_name, category, value. There is no room for extra columns like in the 2-parameter alternative below.
Safe form
crosstab(text, text) with 2 input parameters:
SELECT *
FROM crosstab(
  'SELECT section, status, ct
   FROM tbl
   ORDER BY 1,2' -- could also just be "ORDER BY 1" here
, $$VALUES ('Active'::text), ('Inactive')$$
) AS ct ("Section" text, "Active" int, "Inactive" int);
Section | Active | Inactive
A | 1 | 2
B | 4 | 5
C | | 7 -- !!
Note the correct result for C.
The second parameter can be any query that returns one row per attribute matching the order of the column definition at the end. Often you will want to query distinct attributes from the underlying table like this:
'SELECT DISTINCT status FROM tbl ORDER BY 1'
That's in the manual.
Since you have to spell out all columns in a column definition list anyway (except for pre-defined crosstabN() variants), it is typically more efficient to provide a short list in a VALUES expression like demonstrated:
$$VALUES ('Active'::text), ('Inactive')$$)
Or (not in the manual):
$$SELECT unnest('{Active,Inactive}'::text[])$$ -- short syntax for long lists
I used dollar quoting to make quoting easier.
You can even output columns with different data types with crosstab(text, text) - as long as the text representation of the value column is valid input for the target type. This way you might have attributes of different kind and output text, date, numeric etc. for respective attributes. There is a code example at the end of the chapter crosstab(text, text) in the manual.
Effect of excess input rows
The two forms handle excess input rows differently, that is, duplicate rows for the same ("row_name", "category") combination, which is (section, status) in the above example.
The 1-parameter form fills in available value columns from left to right. Excess values are discarded.
Earlier input rows win.
The 2-parameter form assigns each input value to its dedicated column, overwriting any previous assignment.
Later input rows win.
Typically, you don't have duplicates to begin with. But if you do, carefully adjust the sort order to your requirements - and document what's happening.
Or get fast arbitrary results if you don't care. Just be aware of the effect.
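The difference between "earlier input rows win" and "later input rows win" can be sketched with a toy model. This is not how tablefunc is implemented; the function names crosstab_1param and crosstab_2param are hypothetical, and the code only mimics the fill behavior described above on rows that are already sorted by row name.

```python
from itertools import groupby

def crosstab_1param(rows, n_value_cols):
    """Toy model of the 1-parameter form: fill value columns left to right."""
    out = []
    # rows must already be sorted by row_name, like "ORDER BY 1,2" in the SQL
    for row_name, group in groupby(rows, key=lambda r: r[0]):
        values = [value for _, _, value in group]
        values = values[:n_value_cols]                    # excess values are discarded
        values += [None] * (n_value_cols - len(values))   # pad missing columns with NULL
        out.append((row_name, *values))
    return out

def crosstab_2param(rows, categories):
    """Toy model of the 2-parameter form: each category has a dedicated column."""
    out = []
    for row_name, group in groupby(rows, key=lambda r: r[0]):
        slots = dict.fromkeys(categories)   # category -> value, initially None
        for _, category, value in group:
            if category in slots:           # categories not in the list are ignored
                slots[category] = value     # a later row overwrites an earlier one
        out.append((row_name, *slots.values()))
    return out

rows = [
    ("A", "Active", 1), ("A", "Active", 99),   # duplicate (section, status)
    ("A", "Inactive", 2),
    ("C", "Inactive", 7),                      # ('C', 'Active') is missing
]
one = crosstab_1param(rows, 2)
two = crosstab_2param(rows, ["Active", "Inactive"])
print(one)   # [('A', 1, 99), ('C', 7, None)]
print(two)   # [('A', 99, 2), ('C', None, 7)]
```

In the model, the 1-parameter form puts the duplicate 99 into the "Inactive" slot and drops the real Inactive value, while the 2-parameter form keeps each value in its own column and lets the later duplicate overwrite the earlier one.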
Advanced examples
Pivot on Multiple Columns using Tablefunc - also demonstrating mentioned "extra columns"
Dynamic alternative to pivot with CASE and GROUP BY
\crosstabview in psql
Postgres 9.6 added this meta-command to its default interactive terminal psql. You can run the query you would use as first crosstab() parameter and feed it to \crosstabview (immediately or in the next step). Like:
db=> SELECT section, status, ct FROM tbl \crosstabview
Similar result as above, but it's a representation feature on the client side exclusively. Input rows are treated slightly differently, hence ORDER BY is not required. Details for \crosstabview in the manual. There are more code examples at the bottom of that page.
Related answer on dba.SE by Daniel Vérité (the author of the psql feature):
How do I generate a pivoted CROSS JOIN where the resulting table definition is unknown?

SELECT section,
       SUM(CASE status WHEN 'Active' THEN count ELSE 0 END) AS active, -- each status value is pivoted into its own column explicitly
       SUM(CASE status WHEN 'Inactive' THEN count ELSE 0 END) AS inactive -- each status value is pivoted into its own column explicitly
FROM t -- assumed table name; the question does not name its table
GROUP BY section
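The CASE-based pivot needs no extension, so it can be tried on any engine. Below is a minimal sketch using Python's built-in sqlite3 module; the table name tbl and the column name ct are assumptions borrowed from the test case in the answer above, and row C is included to show what happens to a missing combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (section TEXT, status TEXT, ct INTEGER);
    INSERT INTO tbl VALUES
        ('A', 'Active', 1), ('A', 'Inactive', 2),
        ('B', 'Active', 4), ('B', 'Inactive', 5),
        ('C', 'Inactive', 7);   -- ('C', 'Active') is missing
""")

# One SUM(CASE ...) per status value; each becomes its own output column.
pivot = conn.execute("""
    SELECT section,
           SUM(CASE status WHEN 'Active'   THEN ct ELSE 0 END) AS active,
           SUM(CASE status WHEN 'Inactive' THEN ct ELSE 0 END) AS inactive
    FROM tbl
    GROUP BY section
    ORDER BY section
""").fetchall()
print(pivot)   # [('A', 1, 2), ('B', 4, 5), ('C', 0, 7)]
```

Note that the missing ('C', 'Active') combination comes out as 0 here, not NULL as with crosstab(). Dropping the ELSE 0 branch would yield NULL instead.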

You can use the crosstab() function of the additional module tablefunc, which you have to install once per database. Since PostgreSQL 9.1 you can use CREATE EXTENSION for that:
CREATE EXTENSION tablefunc;
In your case, I believe it would look something like this:
CREATE TABLE t (Section CHAR(1), Status VARCHAR(10), Count integer);
INSERT INTO t VALUES ('A', 'Active', 1);
INSERT INTO t VALUES ('A', 'Inactive', 2);
INSERT INTO t VALUES ('B', 'Active', 4);
INSERT INTO t VALUES ('B', 'Inactive', 5);
SELECT row_name AS Section,
category_1::integer AS Active,
category_2::integer AS Inactive
FROM crosstab('select section::text, status, count::text from t order by 1, 2', 2)
AS ct (row_name text, category_1 text, category_2 text);
Without CREATE EXTENSION tablefunc; you get this error:
ERROR: function crosstab(unknown, integer) does not exist
LINE 4: FROM crosstab('select section::text, status, count::text fro...
HINT: No function matches the given name and argument types. You might need to add explicit type casts.

Solution with JSON aggregation:
CREATE TABLE tbl ( -- the table name "tbl" is an assumption
  section text
, status text
, ct integer -- don't use "count" as column name.
);
INSERT INTO tbl VALUES
  ('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
, ('C', 'Inactive', 7);
SELECT section,
       (obj ->> 'Active')::int AS active,
       (obj ->> 'Inactive')::int AS inactive
FROM (SELECT section, json_object_agg(status,ct) AS obj
      FROM tbl
      GROUP BY section
     ) AS sub;

Sorry this isn't complete because I can't test it here, but it may get you off in the right direction. I'm translating from something I use that makes a similar query:
select mt.section, mt1.count as Active, mt2.count as Inactive
from mytable mt
left join (select section, count from mytable where status='Active')mt1
on mt.section = mt1.section
left join (select section, count from mytable where status='Inactive')mt2
on mt.section = mt2.section
group by mt.section, mt1.count, mt2.count
order by mt.section asc;
The code I'm working from is:
select m.typeID, m1.highBid, m2.lowAsk, m1.highBid - m2.lowAsk as diff, 100*(m1.highBid - m2.lowAsk)/m2.lowAsk as diffPercent
from mktTrades m
left join (select typeID,MAX(price) as highBid from mktTrades where bid=1 group by typeID)m1
on m.typeID = m1.typeID
left join (select typeID,MIN(price) as lowAsk from mktTrades where bid=0 group by typeID)m2
on m.typeID = m2.typeID
group by m.typeID, m1.highBid, m2.lowAsk
order by diffPercent desc;
which will return a typeID, the highest price bid and the lowest price asked and the difference between the two (a positive difference would mean something could be bought for less than it can be sold).

There's a different dynamic method that I've devised, one that employs a dynamic record type (a temp table, built via an anonymous procedure) and JSON. This may be useful for an end user who can't install the tablefunc/crosstab extension, but can still create temp tables or run anonymous procedures.
The example assumes all the crosstab columns are the same type (INTEGER), but the number of columns is data-driven and variable. That said, JSON aggregate functions do allow for mixed data types, so there's potential for innovation via the use of embedded composite (mixed) types.
The real meat of it can be reduced down to one step if you want to statically define the rec. type inside the JSON recordset function (via nested SELECTs that emit a composite type).

Crosstab function is available under the tablefunc extension. You'll have to create this extension one time for the database.
You can use the below code to create pivot table using cross tab:
create table test_Crosstab( section text,
status text,
count numeric);
insert into test_Crosstab values ( 'A','Active',1)
,( 'A','Inactive',2)
,( 'B','Active',4)
,( 'B','Inactive',5);
select * from crosstab(
'select section, status, count
from test_crosstab
order by 1, 2'
) as ctab ("Section" text,"Active" numeric,"Inactive" numeric);


Insert into 2 tables with single SQL statement instead of loop

I have to insert data that originates from several tables, which are themselves loaded from CSV files (COPY).
Until now I used a LOOP in a function to enter the data. I want to simplify this for the sake of maintainability and speed.
I need to insert data into a description table, which serves as both the title and description (and multi language).
Previously my code was as follows (extract from the loop):
insert into description (label, lang_id, poi_id,date_dernier_update, date_enregistrementbdd, date_derniere_lecture) values (label, lang_id, poi_id, now(), now(), now()) RETURNING id INTO _retour_id_titre;
insert into poi_titre_poi (poi_id, titre_poi_id, titre_poi_key) values (poi_id, _retour_id_titre, label_lang);
But my new attempt does not work:
with rows as (
insert into description (label, lang_id, poi_id)
select rdfslabelfrs, '1', (select id from poi where uri_id = csv_poi_rdf_fr.poi) as toto from csv_poi_rdf_fr RETURNING id
)
insert into poi_titre_poi (poi_id, titre_poi_id, titre_poi_key)
select description.poi_id, id , 'fr'
FROM description;
In fact, I cannot insert the 'poi_id' in the 'poi_titre_poi' table which corresponds to the one which was inserted in the description table.
I get this error message:
ERROR: more than one row returned by a subquery used as an expression
SQL state: 21000
Can I make this work, or do I need to loop?
Filling in missing bits with assumptions, it could work like this:
WITH description_insert AS (
   INSERT INTO description
         (label         , lang_id, poi_id)
   SELECT c.rdfslabelfrs, 1      , p.id
   FROM   csv_poi_rdf_fr c
   JOIN   poi p ON p.uri_id = c.poi
   RETURNING poi_id, id
   )
INSERT INTO poi_titre_poi (poi_id, titre_poi_id, titre_poi_key)
SELECT d.poi_id, d.id, 'fr'
FROM   description_insert d;
PostgreSQL multi INSERT...RETURNING with multiple columns
Insert data in 3 tables at a time using Postgres
Get Id from a conditional INSERT

Group By clause in SQLite

I would like to query the table to pick only the latest version of each item.
Why does Query1 work in SQLite? (I expected the GROUP BY clause to throw an error, because the SELECT list contains the column content, which is not part of the GROUP BY clause.)
Would Query1 throw an error in Oracle ?
Is Query1 better than Query2 ?
Is there a better way to write the query ?
-- Query1
select item_id,
       max(version_number),
       content
from item_version
group by item_id;
-- Query2
select iv.*
from item_version iv,
     (select item_id,
             max(version_number) latest_version_number
      from item_version
      group by item_id) liv
where iv.item_id = liv.item_id
and iv.version_number = liv.latest_version_number;
Setting up the table:
create table item_version(
  item_id varchar,
  version_number integer,
  content varchar,
  primary key (item_id, version_number)
);
-- string literals take single quotes; double quotes denote identifiers in standard SQL
insert into item_version values (1, 1, null);
insert into item_version values (2, 1, 'Content A');
insert into item_version values (2, 2, 'Content B');
insert into item_version values (3, 1, 'Content C');
insert into item_version values (3, 2, null);
insert into item_version values (4, 1, 'Content D');
insert into item_version values (4, 2, null);
From the documentation:
In most SQL implementations, output columns of an aggregate query may only reference aggregate functions or columns named in the GROUP BY clause. It does not make good sense to reference an ordinary column in an aggregate query because each output row might be composed from two or more rows in the input table(s).
SQLite does not impose this restriction. The output columns from an aggregate query can be arbitrary expressions that include columns not found in GROUP BY clause.
With SQLite (but not any other SQL implementation that we know of) if an aggregate query contains a single min() or max() function, then the values of columns used in the output are taken from the row where the min() or max() value was achieved. If two or more rows have the same min() or max() value, then the columns values will be chosen arbitrarily from one of those rows.
For example to find the highest paid employee:
SELECT max(salary), first_name, last_name FROM employee;
In the query above, the values for the first_name and last_name columns will correspond to the row that satisfied the max(salary) condition.
If a query contains no aggregate functions at all, then a GROUP BY clause can be added as a substitute for the DISTINCT ON clause. In other words, output rows are filtered so that only one row is shown for each distinct set of values in the GROUP BY clause. If two or more output rows would have otherwise had the same set of values for the GROUP BY columns, then one of the rows is chosen arbitrarily.
Your query 1 would cause an error in most databases, yes, but as long as you're only going to use it with sqlite, it's perfectly fine.
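The documented behavior is easy to check with Python's built-in sqlite3 module. This is a small sketch using a subset of the question's rows (items 2 and 3 only, with text ids); it assumes the bundled SQLite is recent enough to guarantee the min()/max() rule quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_version (
        item_id        TEXT,
        version_number INTEGER,
        content        TEXT,
        PRIMARY KEY (item_id, version_number)
    );
    INSERT INTO item_version VALUES
        ('2', 1, 'Content A'), ('2', 2, 'Content B'),
        ('3', 1, 'Content C'), ('3', 2, NULL);
""")

# "content" is a bare column: it is neither aggregated nor in GROUP BY.
# With a single max() in the query, SQLite takes it from the row holding the max.
latest = conn.execute("""
    SELECT item_id, max(version_number), content
    FROM item_version
    GROUP BY item_id
    ORDER BY item_id
""").fetchall()
print(latest)   # [('2', 2, 'Content B'), ('3', 2, None)]
```

Item 3 returns NULL content because its latest version has none; the non-null 'Content C' from version 1 is not carried over.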
An alternative for finding the highest version of each item uses the window functions added in SQLite 3.25:
SELECT item_id, version_number, content
FROM (SELECT item_id, version_number, content
, row_number() OVER (PARTITION BY item_id ORDER BY version_number DESC) AS rnk
FROM item_version) AS sq
WHERE rnk = 1
ORDER BY item_id;
item_id     version_number  content
----------  --------------  ----------
1           1
2           2               Content B
3           2
4           2
This one should work on other databases like Oracle, as long as they support window functions too.
Shawn does a really good job of explaining the issue. A typical way to solve this uses a correlated subquery:
select iv.*
from item_version iv
where iv.version_number = (select max(iv2.version_number)
                           from item_version iv2
                           where iv2.item_id = iv.item_id
                          );
With an index on item_version(item_id, version_number) this may be the fastest way to get the results that you want. You already have this index with your primary key definition.

Find all rows with the same exact relations as provided in another table

Given these tables:
Table: Test
testID int PK
name nvarchar(128) UNIQUE NOT NULL
Table: [Test-Inputs]
inputsTableName nvarchar(128) UNIQUE PK
testID int PK FK
Temporary Table: ##TestSearchParams
inputsTableName nvarchar(128) UNIQUE NOT NULL
I need to find Tests that have entries in Test-Inputs with inputsTableNames matching EXACTLY ALL of the entries in ##TestSearchParams; the resulting tests relationships must be exactly the ones listed in ##TestSearchParams.
Essentially I am finding tests with ONLY the given relationships, no more, no less. I am matching names with LIKE and wildcards, but that is a sidenote that I believe I can solve after the core logic is there for exact matching.
This is my current query:
Select *
From Tests As B
Where B.testID In (
    Select ti
    From (
        Select (
            Select Count(inputsTableName)
            From [Test-Inputs]
            Where [Test-Inputs].testID = B.testID
        ) - Count(Distinct i1) As delta,
        ti
        From (
            Select [Test-Inputs].inputsTableName As i1,
                   [Test-Inputs].testID As ti
            From ##TableSearchParams
            Join [Test-Inputs]
                On [Test-Inputs].inputsTableName Like ##TableSearchParams.inputsTableName
                And B.testID = [Test-Inputs].testID
        ) As A
        Group By ti
    ) As D
    Where delta = 0
);
The current problem is that this seems to retrieve Tests with a match to ANY of the entries in ##TableSearchParams. I have tried several other queries before this, with varying levels of success. I have working queries to find tests that match any of the parameters, all of the parameters, and none of the parameters; I just can't get this query working.
Here are some sample table values:
Tests:
1, Test1
2, Test2
3, Test3
[Test-Inputs]:
Table1, 1
Table2, 2
Table1, 3
Table2, 3
The given values should only return (3, Test3)
Here's a possible solution that works by getting the complete set of TestInputs for each record in Tests, left-joining to the set of search parameters, and then aggregating the results by test and making two observations:
First, if a record from Tests includes a TestInput that is not among the search parameters, then that record must be excluded from the result set. We can check this by seeing if there is any case in which the left-join described above did not produce a match in the search parameters table.
Second, if a record from Tests satisfies the first condition, then we know that it doesn't have any superfluous TestInput records, so the only problem it could have is if there exists a search parameter that is not among its TestInputs. If that is so, then the number of records we've aggregated for that Test will be less than the total number of search parameters.
I have made the assumption here that you don't have Tests records with duplicate TestInputs, and that you likewise don't use duplicate search parameters. If those assumptions are not valid then this becomes more complicated. But if they are, then this ought to work:
declare @Tests table (testID int, [name] nvarchar(128));
declare @TestInputs table (testID int, inputsTableName nvarchar(128));
declare @TestSearchParams table (inputsTableName nvarchar(128));
-- Sample data.
-- testID 1 has only a subset of the search parameters.
-- testID 2 matches the search parameters exactly.
-- testID 3 has a superset of the search parameters.
-- Therefore the result set should include testID 2 only.
insert @Tests values
    (1, 'Table A'),
    (2, 'Table B'),
    (3, 'Table C');
insert @TestInputs values
    (1, 'X'),
    (2, 'X'),
    (2, 'Y'),
    (3, 'X'),
    (3, 'Y'),
    (3, 'Z');
insert @TestSearchParams values
    ('X'),
    ('Y');
declare @ParamCount int;
select @ParamCount = count(1) from @TestSearchParams;
select
    Tests.testID,
    Tests.[name]
from
    @Tests Tests
    inner join @TestInputs Inputs on Tests.testID = Inputs.testID
    left join @TestSearchParams Search on Inputs.inputsTableName = Search.inputsTableName
group by
    Tests.testID,
    Tests.[name]
having
    -- If a group includes any record where Search.inputsTableName is null, it means that
    -- the record in Tests has a TestInput that is not among the search parameters.
    sum(case when Search.inputsTableName is null then 1 else 0 end) = 0 and
    -- If a group includes fewer records than there are search parameters, it means that
    -- there exists some parameter that was not found among the Tests record's TestInputs.
    count(1) = @ParamCount;
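The two checks described above (no unmatched input, and a row count equal to the number of search parameters) are not specific to SQL Server. Here is a minimal sketch of the same idea using Python's built-in sqlite3 module; the snake_case table and column names are assumptions made for the sketch, and the sample data mirrors the comments in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tests (test_id INTEGER, name TEXT);
    CREATE TABLE test_inputs (test_id INTEGER, inputs_table_name TEXT);
    CREATE TABLE search_params (inputs_table_name TEXT);
    INSERT INTO tests VALUES (1, 'Table A'), (2, 'Table B'), (3, 'Table C');
    INSERT INTO test_inputs VALUES
        (1, 'X'),                       -- subset of the parameters
        (2, 'X'), (2, 'Y'),             -- exact match
        (3, 'X'), (3, 'Y'), (3, 'Z');   -- superset of the parameters
    INSERT INTO search_params VALUES ('X'), ('Y');
""")

exact = conn.execute("""
    SELECT t.test_id, t.name
    FROM tests t
    JOIN test_inputs i ON i.test_id = t.test_id
    LEFT JOIN search_params s ON s.inputs_table_name = i.inputs_table_name
    GROUP BY t.test_id, t.name
    -- no input outside the parameter list ...
    HAVING SUM(CASE WHEN s.inputs_table_name IS NULL THEN 1 ELSE 0 END) = 0
    -- ... and as many inputs as there are parameters
       AND COUNT(*) = (SELECT COUNT(*) FROM search_params)
""").fetchall()
print(exact)   # [(2, 'Table B')]
```

As in the answer, this relies on there being no duplicate inputs per test and no duplicate search parameters.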

SQL Server where condition on column with separated values

I have a table with a column that can have values separated by ",".
Example column group:
id column group:
1 10,20,30
2 280
3 20
I want to write a SELECT with a WHERE condition on the group column so that, for example, searching for 20 returns rows 1 and 3, and searching for 20,280 returns rows 1 and 2.
Can you help me please?
As pointed out in the comments, storing multiple values in a single column is not a good idea.
Coming to your question, you can use a split-string function (dbo.SplitStrings_Numbers in the example) to split the comma-separated values into a table and then query them.
create table #temp
(
id int,
columnss varchar(100)
)
insert into #temp
values
(1, '10,20,30'),
(2, '280'),
(3, '20')
select *
from #temp
cross apply
(
select * from dbo.SplitStrings_Numbers(columnss,',')
) b
where item in (20)
id columnss Item
1 10,20,30 20
3 20 20
The short answer is: don't do it.
Instead normalize your tables to at least 3NF. If you don't know what database normalization is, you need to do some reading.
If you absolutely have to do it (e.g. this is a legacy system and you cannot change the table structure), there are several articles on string splitting with T-SQL, and at least a couple that have done extensive benchmarks on the various methods available.
Since you only want to search, you don't really need to split the strings, so you can write something like:
SELECT id, list
FROM t
WHERE ','+list+',' LIKE '%,'+@searchValue+',%'
Where t(id int, list varchar(max)) is the table to search and @searchValue is the value you are looking for. If you need to search for more than one value, you have to add those to a table and use a join or subquery.
E.g. if s(searchValue varchar(max)) is the table of values to search then:
SELECT DISTINCT t.id, t.list
FROM t
JOIN s
ON ','+t.list+',' LIKE '%,'+s.searchValue+',%'
If you need to pass those search values from ADO.NET, consider table-valued parameters.
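The delimiter-wrapping trick can be tried outside SQL Server as well. Below is a sketch using Python's built-in sqlite3 module, where || replaces T-SQL's + for concatenation; the helper ids_matching and the inline search list s are hypothetical names introduced for illustration, and the helper assumes a non-empty list of search values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, list TEXT);
    INSERT INTO t VALUES (1, '10,20,30'), (2, '280'), (3, '20');
""")

def ids_matching(values):
    """Return ids of rows whose comma-separated list contains any search value."""
    # One bound "(?)" row per search value forms the table of values to search.
    placeholders = ", ".join("(?)" for _ in values)
    sql = f"""
        WITH s(search_value) AS (VALUES {placeholders})
        SELECT DISTINCT t.id
        FROM t
        JOIN s ON ',' || t.list || ',' LIKE '%,' || s.search_value || ',%'
        ORDER BY t.id
    """
    return [row[0] for row in conn.execute(sql, [str(v) for v in values])]

print(ids_matching(["20"]))          # [1, 3]
print(ids_matching(["2"]))           # []  (no partial match against 20 or 280)
print(ids_matching(["30", "280"]))   # [1, 2]
```

Wrapping both the list and the search value in commas is what prevents 2 from matching 20 or 280. The pattern cannot use an ordinary index, so every row is scanned.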

