How to count the amount of entry in a column separated by a comma - sql

Currently I have a table as my database, and I want to create a Bar Chart to of out the Reasons Column. This is an example of my table:
Table Name: Survey
id
reasons
1
a,b,c
2
a,d,e
3
b,c,d
How to count total amount of each reasons like this table below?
reasons
total
a
2
b
2
c
2
d
1
e
1

You would use string_split():
select s.value as reason, count(*)
from t cross apply
string_split(reasons, ',') s
group by s.value
order by s.value;
That said, you should fix your data model. You should have a separate table with one row per reason.

Related

calculating average metric across two tables in Postgres SQL

I currently have two tables, A and B, where
Table A:-
col1 col2
a 1,2,3
b 1,4,5
c 4
Table B:-
ID metric 1
1 231.0
2 1123.1
3 110
4 1231
5 116
I have to find the mean value of metric 1 for each col1 value in Table A. The resulting table should contain col1 in descending order measured by avg(metric1) value from table B, using SQL
Result: -
col1 avg(metric1) count
c 1231 1
b 526 3
c 488 3
any ideas on how I can come up with a query for the same in Postgres SQL? I've tried the following query, but this does not work :
combined_stats AS(
select avg(metric1), count(*)
from table_b
where ID in (select col2 from table_a)
group by (select col1 from table_a)
Fix your data model! Do not store numbers in strings! Do not store multiple values in a string!
Let me assume that you are stuck with someone else's really bad data model. If so, you can split the results and join:
select a.col1, avg(b.metric1), count(b.id)
from a left join
b
on b.id = any (regexp_split_to_array(col2, ','))
group by a.col1;
Note: If b.id is a number, then you need to deal with type conversions, something like:
on b.id::text = any (regexp_split_to_array(col2, ','))
Here is a db<>fiddle.

How to concatenate multiple columns vertically efficiently

for the end goal, I want to create a table that looks like something like this:
Table 1
option_ID person_ID option
1 1 B
2 1
3 2 C
4 2 A
5 3 A
6 3 B
The idea is that a person can choose up to 2 options out of 3 (in this case person 1 only chose 1 option). However, when my raw data format puts the 3 options into one single column, ie:
Table 2
person_ID option
1 B
2 C,A
3 A,B
What I usually do is the use 'Text to Columns' function using the ',' delimiter in Excel, and manually concatenate the 2 columns vertically. However, I find this method to become impractical when faced with more options (say 10 or even 20). Is there a way for me to get from Table 2 to Table 1 efficiently using postgresql or some other methods?
use string_agg() function.
select person_ID, string_agg(option, ',') as option
from table1
group by person_ID
You can use regexp_split_to_table():
select row_number() over () as id,
t.person_id, v.option
from t cross join lateral
regexp_split_to_table(t.option, ',') option
order by person_id, option;
Here is a db<>fiddle.
Actually, if you want the exactly two rows per personid:
select row_number() over () as id, t.person_id, v.option
from t cross join lateral
(values (1, split_part(t.option, ',', 1)), (2, split_part(t.option, ',', 2))) v(pos, option)
order by person_id, pos;

How to write sql clause where column not equal and any id associated

I want to setup a simple query that will filter out any row that contains "A" in the ItemID, but my issue is I also do NOT want to display any journal ID from a different row since it matched "A". I tried googling the solution, but I am sure I am not using the right keywords to find it. I am using microsoft sql 2008, but I am not a database admin so I am not to familiar. I tried using distinct, and I also tried group by, but in this situation it does not work.
This is a simplified version of the table that I am working with:
JournalID ItemID PrimaryKEY
1 A 1
1 B 2
2 A 3
2 C 4
3 B 5
4 D 6
And here is how I would like to make it look:
JournalID ItemID PrimaryKEY
3 B 5
4 D 6
This will exclude any rows where the ItemID is 'A' and also any rows that have the same JournalID as a row where a ItemID was 'A'.
SELECT JournalID, ItemID, PrimaryKEY
FROM TABLE
WHERE JournalID NOT IN (Select JournalID FROM TABLE WHERE ItemID = 'A')
Try this:
SELECT *
FROM table_name
WHERE JournalID
NOT IN (SELECT JournalID
FROM table_name
WHERE ItemID = 'A')

sql server getting first value when grouping

I have a table with column a having not necessarily distinct values and column b having for each value of a a number of distinct values. I want to get a result having each value of a appearing only once and getting the first found value of b for that value of a. How do I do this in sql server 2000?
example table:
a b
1 aa
1 bb
2 zz
3 aa
3 zz
3 bb
4 bb
4 aa
Wanted result:
a b
1 aa
2 zz
3 aa
4 bb
In addition, I must add that the values in column b are all text values. I updated the example to reflect this.
Thanks
;with cte as
(
select *,
row_number() over(partition by a order by a) as rn
from yourtablename
)
select
a,b
from cte
where rn = 1
SQL does not know about ordering by table rows. You need to introduce order in the table structure (usually using an id column). That said, once you have an id column, it's rather easy:
SELECT a, b FROM test WHERE id in (SELECT MIN(id) FROM test GROUP BY a)
There might be a way to do this, using internal SQL Server functions. But this solution is portable and more easily understood by anyone who knows SQL.

Select values in SQL that do not have other corresponding values except those that i search for

I have a table in my database:
Name | Element
1 2
1 3
4 2
4 3
4 5
I need to make a query that for a number of arguments will select the value of Name that has on the right side these and only these values.
E.g.:
arguments are 2 and 3, the query should return only 1 and not 4 (because 4 also has 5). For arguments 2,3,5 it should return 4.
My query looks like this:
SELECT name FROM aggregations WHERE (element=2 and name in (select name from aggregations where element=3))
What do i have to add to this query to make it not return 4?
A simple way to do it:
SELECT name
FROM aggregations
WHERE element IN (2,3)
GROUP BY name
HAVING COUNT(element) = 2
If you want to add more, you'll need to change both the IN (2,3) part and the HAVING part:
SELECT name
FROM aggregations
WHERE element IN (2,3,5)
GROUP BY name
HAVING COUNT(element) = 3
A more robust way would be to check for everything that isn't not in your set:
SELECT name
FROM aggregations
WHERE NOT EXISTS (
SELECT DISTINCT a.element
FROM aggregations a
WHERE a.element NOT IN (2,3,5)
AND a.name = aggregations.name
)
GROUP BY name
HAVING COUNT(element) = 3
It's not very efficient, though.
Create a temporary table, fill it with your values and query like this:
SELECT name
FROM (
SELECT DISTINCT name
FROM aggregations
) n
WHERE NOT EXISTS
(
SELECT 1
FROM (
SELECT element
FROM aggregations aii
WHERE aii.name = n.name
) ai
FULL OUTER JOIN
temptable tt
ON tt.element = ai.element
WHERE ai.element IS NULL OR tt.element IS NULL
)
This is more efficient than using COUNT(*), since it will stop checking a name as soon as it finds the first row that doesn't have a match (either in aggregations or in temptable)
This isn't tested, but usually I would do this with a query in my where clause for a small amount of data. Note that this is not efficient for large record counts.
SELECT ag1.Name FROM aggregations ag1
WHERE ag1.Element IN (2,3)
AND 0 = (select COUNT(ag2.Name)
FROM aggregatsions ag2
WHERE ag1.Name = ag2.Name
AND ag2.Element NOT IN (2,3)
)
GROUP BY ag1.name;
This says "Give me all of the names that have the elements I want, but have no records with elements I don't want"