Select MAX of multiple Attributes - sql

I have a table that contains 3000 attributes (it's for a data mining experiment).
The table looks like:
id attr1 attr2 attr3
a 0 1 0
a 1 0 0
a 0 0 0
a 0 0 1
I wish to have it in the format:
id attr1 attr2 attr3
a 1 1 1
The values can only be 0 or 1, so I think taking the max of each column and grouping by the ID would achieve this.
However, I don't wish to type MAX(attrX) for each and every attribute.
Does anyone know a quick way of implementing this?
Thank you very much for your help in advance.

This is easy enough with group by:
select id, max(attr1) as attr1, max(attr2) as attr2, max(attr3) as attr3
from t
group by id
If you don't want to do all this typing, put your list of columns in Excel. Add in a formula such as =" max("&A1&") as "&A1&",". Then copy the cell down and copy the result to where your query is.
You can also do this in SQL, with something like:
select ' max('||column_name||') as '||column_name||','
from INFORMATION_SCHEMA.columns c
where table_name = <your table name here> and column_name like 'attr%'
When you do these last two, remember to remove the final comma from the last row.
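The same column-list generation can be scripted instead of pasted. Below is a minimal sketch, assuming SQLite driven from Python's sqlite3 module (the original question does not name a database) and the three-column sample table from the question. Joining with ", " places commas only between items, so there is no trailing comma to remove.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, attr1 INTEGER, attr2 INTEGER, attr3 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [("a", 0, 1, 0), ("a", 1, 0, 0), ("a", 0, 0, 0), ("a", 0, 0, 1)],
)

# PRAGMA table_info returns one row per column; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(t)")]
attr_columns = [c for c in columns if c.startswith("attr")]

# Build "MAX(attr1) AS attr1, MAX(attr2) AS attr2, ..." from the catalog.
select_list = ", ".join(f"MAX({c}) AS {c}" for c in attr_columns)
query = f"SELECT id, {select_list} FROM t GROUP BY id"

result = conn.execute(query).fetchall()
print(query)
print(result)
```

On other databases the column names would come from INFORMATION_SCHEMA.columns, as in the query above, rather than from the SQLite-specific PRAGMA.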

You have to use an aggregate function in order to select columns that are not in the GROUP BY clause, so there is no quicker way.

Related

SQL to fetch Unique values based on condition

I have the data below. The condition is that if an id has two different types, take Long, so that there are no duplicate ids.
**id type**
1 Short
1 Long
2 Short
3 Short
3 Long
4 Short
And I need output like this:
**id type**
1 Long
2 Short
3 Long
4 Short
Does this work for you:
select id,
case when count(id) > 1 then 'Long' else 'Short' end as type
from tmp
group by id
You can simply take the MIN of the Type column, using GROUP BY on the ID column. No CASE or COUNT is required. This works because 'Long' sorts before 'Short' alphabetically, so the following script is reliable as long as the Type column contains only the values "Short" and "Long".
SELECT ID,MIN(Type) Type
FROM your_table
GROUP BY ID
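To see the MIN approach on the sample data, here is a minimal sketch, assuming SQLite via Python's sqlite3 module (the question does not say which database is in use). It relies only on ordinary string ordering, where 'Long' compares lower than 'Short'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, "Short"), (1, "Long"), (2, "Short"), (3, "Short"), (3, "Long"), (4, "Short")],
)

# MIN on a text column picks the alphabetically first value per group.
rows = conn.execute(
    "SELECT id, MIN(type) AS type FROM your_table GROUP BY id ORDER BY id"
).fetchall()
print(rows)
```

If a third type value were ever added, its alphabetical position would decide the result, so this shortcut is tied to the two values in the question.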
You can do this:
Select id, case when count(id)>1 then 'Long' else min(Type) End as Type
from Tbl
group by id

SQL Rows to Columns if column values are unknown

I have a table that has demographic information about a set of users which looks like this:
User_id Category IsMember
1 College 1
1 Married 0
1 Employed 1
1 Has_Kids 1
2 College 0
2 Married 1
2 Employed 1
3 College 0
3 Employed 0
The result set I want is a table that looks like this:
User_Id|College|Married|Employed|Has_Kids
1 1 0 1 1
2 0 1 1 0
3 0 0 0 0
In other words, the table indicates the presence or absence of a category for each user. Sometimes the user will have a category where the value is false, and sometimes the user will have no row for a category, in which case IsMember is assumed to be false.
Also, from time to time additional categories will be added to the data set, and I'm wondering if it's possible to do this query without knowing up front all the possible category names. In other words, I won't be able to specify all the column names I want to count in the result. (Note that only user 1 has the category "Has_Kids" and user 3 is missing a row for the category "Married".)
(using Postgres)
Thanks.
You can use jsonb functions.
with titles as (
select jsonb_object_agg(Category, Category) as titles,
jsonb_object_agg(Category, -1) as defaults
from demog
),
the_rows as (
select null::bigint as id, titles as data
from titles
union
select User_id, defaults || jsonb_object_agg(Category, IsMember)
from demog, titles
group by User_id, defaults
)
select id, string_agg(value, '|' order by key)
from (
select id, key, value
from the_rows, jsonb_each_text(data)
) x
group by id
order by id nulls first
You can see a running example in http://rextester.com/QEGT70842
You can replace -1 with 0 for the default value and '|' with ',' for the separator.
You can install tablefunc module and use the crosstab function.
https://www.postgresql.org/docs/9.1/static/tablefunc.html
I found a Postgres function script called colpivot here which does the trick. Ran the script to create the function, then created the table in one statement:
select colpivot ('_pivoted', 'select * from user_categories', array['user_id'],
array ['category'], '#.is_member', null);
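A different technique from the three answers above is to build the pivot in two steps from client code: first read the distinct categories, then generate one conditional-aggregate column per category. The sketch below is an illustration under assumptions: it uses SQLite through Python's sqlite3 module as a stand-in for Postgres, the table name demog is taken from the first answer, and quote_ident is a hypothetical helper written here for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demog (User_id INTEGER, Category TEXT, IsMember INTEGER)")
conn.executemany(
    "INSERT INTO demog VALUES (?, ?, ?)",
    [
        (1, "College", 1), (1, "Married", 0), (1, "Employed", 1), (1, "Has_Kids", 1),
        (2, "College", 0), (2, "Married", 1), (2, "Employed", 1),
        (3, "College", 0), (3, "Employed", 0),
    ],
)

# Step 1: discover the categories that currently exist in the data.
categories = [
    r[0] for r in conn.execute("SELECT DISTINCT Category FROM demog ORDER BY Category")
]


def quote_ident(name):
    """Hypothetical helper: quote a value for use as a column alias."""
    return '"' + name.replace('"', '""') + '"'


# Step 2: one MAX(CASE ...) column per category. A missing row yields NULL,
# which COALESCE turns into 0. Category values are bound as parameters.
select_list = ", ".join(
    f"COALESCE(MAX(CASE WHEN Category = ? THEN IsMember END), 0) AS {quote_ident(c)}"
    for c in categories
)
query = f"SELECT User_id, {select_list} FROM demog GROUP BY User_id ORDER BY User_id"

cursor = conn.execute(query, categories)
header = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(header)
print(rows)
```

The columns come out in alphabetical order of category rather than the order shown in the question, and the query text has to be regenerated whenever a new category appears.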

Find certain values and show corresponding value from different field in SQL

So I found these 2 articles but they don't quite answer my question...
Find max value and show corresponding value from different field in SQL server
Find max value and show corresponding value from different field in MS Access
I have a table like this...
ID Type Date
1 Initial 1/5/15
1 Periodic 3/5/15
2 Initial 2/5/15
3 Initial 1/10/15
3 Periodic 3/6/15
4
5 Initial 3/8/15
I need to get all of the ID numbers whose type is "Periodic" or NULL, with the corresponding date. So I want to get query results that look like this...
ID Type Date
1 Periodic 3/5/15
3 Periodic 3/6/15
4
I've tried
select id, type, date1
from Table1 as t
where type in (select type
from Table1 as t2
where ((t2.type) Is Null) or "" or ("periodic"));
But this doesn't work... From what I've read about NULL you can't compare null values...
Why in SQL NULL can't match with NULL?
So I tried
SELECT id, type, date1
FROM Table1 AS t
WHERE type in (select type
from Table1 as t2
where ((t.Type)<>"Initial"));
But this doesn't give me the ID of 4...
Any suggestions?
Unless I'm missing something, you just want:
select id, type, date1
from Table1 as t
where (t.type Is Null) or (t.type = "") or (t.type = "periodic");
The or applies to boolean expressions, not to values being compared.
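The point about OR and NULL can be checked on the sample data. This is a minimal sketch, assuming SQLite via Python's sqlite3 module rather than the database from the question. SQLite uses single-quoted string literals and compares strings case-sensitively by default, so the literal below matches the stored capitalization.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (id INTEGER, type TEXT, date1 TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        (1, "Initial", "1/5/15"),
        (1, "Periodic", "3/5/15"),
        (2, "Initial", "2/5/15"),
        (3, "Initial", "1/10/15"),
        (3, "Periodic", "3/6/15"),
        (4, None, None),
        (5, "Initial", "3/8/15"),
    ],
)

# NULL = NULL evaluates to NULL (unknown), not true, so "=" and IN never match NULL.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# Each side of OR is a complete boolean expression with its own comparison.
rows = conn.execute(
    """
    SELECT id, type, date1
    FROM Table1
    WHERE type IS NULL OR type = '' OR type = 'Periodic'
    ORDER BY id
    """
).fetchall()
print(null_equals_null)
print(rows)
```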

In Oracle, get max value while ignoring negative matching records

I have a table that looks like the following:
ID Value
462338900 41040
462338900 -41040
462338900 50
462338900 0
What I would like to do is get the max value from this table where the value field does not have a matching negative record. In the example above, 41040 would be the max value. However, since it has a negative matching record of -41040, I want to "throw it out" and bring back the new max value of 50.
Here is a method using exists:
select id, max(value)
from t
where not exists (select 1
from t t2
where t2.id = t.id and t2.value = - t.value
)
group by id;
I upvoted Gordon's answer -- it's the right one. However, depending on how big your table is and how many ids you're going after at one time, this could perform better, since it only reads the table once. I.e., it doesn't require the ANTI JOIN operation that the not exists will require.
select id, max(value)
from (
select id, abs(value) value, count(case when value < 0 then 1 else null end) neg_count
from t
group by id, abs(value) )
where neg_count = 0
group by id;
Also, be careful.. you stated your requirements very specifically. If your data were
ID Value
462338900 41040
462338900 41040
462338900 -41040
462338900 50
462338900 0
... with value 41040 duplicated, the single occurrence of -41040 would exclude both from the results and the max would be 50. If you'd want the max to be 41040 in that case, it's a different query. My version would be more adaptable to that requirement than the not exists approach: you could calculate a pos_count similar to neg_count and change where neg_count=0 to where pos_count > neg_count.
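The pos_count variant described above might look like the following. It is a minimal sketch run against SQLite through Python's sqlite3 module rather than Oracle, and the helper name max_unmatched is made up for the example. It assumes zero should count on the positive side, since a zero can never have a negative partner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (462338900, 41040),
        (462338900, 41040),
        (462338900, -41040),
        (462338900, 50),
        (462338900, 0),
    ],
)


def max_unmatched(condition):
    """Run the single-pass query with the given filter on the per-value counts."""
    query = f"""
        SELECT id, MAX(abs_value)
        FROM (
            SELECT id,
                   ABS(value) AS abs_value,
                   COUNT(CASE WHEN value < 0 THEN 1 END) AS neg_count,
                   COUNT(CASE WHEN value >= 0 THEN 1 END) AS pos_count
            FROM t
            GROUP BY id, ABS(value)
        ) AS g
        WHERE {condition}
        GROUP BY id
    """
    return conn.execute(query).fetchall()


# Original rule: any negative match discards the value entirely.
strict = max_unmatched("neg_count = 0")
# Relaxed rule: each negative cancels only one positive occurrence.
relaxed = max_unmatched("pos_count > neg_count")
print(strict, relaxed)
```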

SQL (SQLite) count for null-fields over all columns

I've got a table called datapoints with about 150 columns and 2600 rows. I know, 150 columns is too much, but I got this db after importing a csv and it is not possible to shrink the number of columns.
I have to get some statistical stuff out of the data. E.g. one question would be:
Give me the total number of fields (of all columns), which are null. Does somebody have any idea how I can do this efficiently?
For one column it isn't a problem:
SELECT count(*) FROM datapoints tb1 where 'tb1'.'column1' is null;
But how can I solve this for all columns together, without doing it by hand for every column?
Best,
Michael
Building on Lamak's idea, how about this idea:
SELECT (N * COUNT(*)) - (
COUNT(COLUMN_1)
+ COUNT(COLUMN_2)
+ ...
+ COUNT(COLUMN_N)
)
FROM DATAPOINTS;
where N is the number of columns. The trick will be in making the summation series of COUNT(column), but that shouldn't be too terrible with a good text editor and/or spreadsheet.
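The summation series can also be generated from the table's own metadata instead of a text editor. Below is a minimal sketch, assuming Python's sqlite3 module and a three-column stand-in for the real datapoints table. The column names and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datapoints (column1, column2, column3)")
conn.executemany(
    "INSERT INTO datapoints VALUES (?, ?, ?)",
    [(1, None, "x"), (None, None, "y"), (3, 4, None)],
)

# PRAGMA table_info lists the columns; index 1 holds the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(datapoints)")]

# COUNT(col) skips NULLs, so N * COUNT(*) minus the sum of the per-column
# counts is the total number of NULL fields.
count_sum = " + ".join(f'COUNT("{c}")' for c in columns)
query = f"SELECT ({len(columns)} * COUNT(*)) - ({count_sum}) FROM datapoints"

total_nulls = conn.execute(query).fetchone()[0]
print(total_nulls)
```

With 150 columns the generated statement is long but still a single pass over the table.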
I don't think there is an easy way to do it. I'd get started on the 150 queries. You only have to replace one word (the column name) each time.
Well, COUNT (like most aggregate functions) ignores NULL values. In your case, since you are using COUNT(*), it counts every row in the table, but you can also count any single column. Something like this:
SELECT TotalRows-Column1NotNullCount, etc
FROM (
SELECT COUNT(1) TotalRows,
COUNT(column1) Column1NotNullCount,
COUNT(column2) Column2NotNullCount,
COUNT(column3) Column3NotNullCount ....
FROM datapoints) A
To get started it's often helpful to use a visual query tool to generate a field list and then use cut/paste/search/replace or manipulation in a spreadsheet program to transform it into what is needed. To do it all in one step you can use something like:
SELECT SUM(CASE WHEN COLUMN1 IS NULL THEN 1 ELSE 0 END) +
SUM(CASE WHEN COLUMN2 IS NULL THEN 1 ELSE 0 END) +
SUM(CASE WHEN COLUMN3 IS NULL THEN 1 ELSE 0 END) +
...
FROM DATAPOINTS;
With a visual query builder you can quickly generate:
SELECT COLUMN1, COLUMN2, COLUMN3 ... FROM DATAPOINTS;
You can then replace each comma with all the text that needs to appear between two field names, and fix up the first and last fields by hand. So in the example, search for "," and replace with " IS NULL THEN 1 ELSE 0 END) + SUM(CASE WHEN ", then add "SUM(CASE WHEN " before the first field and " IS NULL THEN 1 ELSE 0 END)" after the last. Note that the searched form CASE WHEN column IS NULL is needed here: the simple form CASE column WHEN NULL compares with =, which is never true for NULL, so it would always count 0.
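When counting NULLs with CASE, the form of the expression matters. A simple CASE (CASE col WHEN NULL ...) tests col = NULL, which never evaluates to true, while a searched CASE (CASE WHEN col IS NULL ...) does detect NULLs. Here is a minimal sketch of the difference, assuming SQLite via Python's sqlite3 module and a one-column stand-in table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datapoints (column1)")
conn.executemany("INSERT INTO datapoints VALUES (?)", [(1,), (None,), (None,)])

# Simple CASE: compares column1 = NULL, which is never true, so ELSE always wins.
simple_form = conn.execute(
    "SELECT SUM(CASE column1 WHEN NULL THEN 1 ELSE 0 END) FROM datapoints"
).fetchone()[0]

# Searched CASE: IS NULL is the test that actually matches NULL values.
searched_form = conn.execute(
    "SELECT SUM(CASE WHEN column1 IS NULL THEN 1 ELSE 0 END) FROM datapoints"
).fetchone()[0]

print(simple_form, searched_form)
```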