I want to make a table like the following:
+----+----------+----------+----------+----------------+
| ID | Sibling1 | Sibling2 | Sibling3 | Total_Siblings |
+----+----------+----------+----------+----------------+
| 1  | Tom      | Lisa     | Null     | 2              |
| 2  | Bart     | Jason    | Nelson   | 3              |
| 3  | George   | Null     | Null     | 1              |
| 4  | Null     | Null     | Null     | 0              |
+----+----------+----------+----------+----------------+
Sibling1, Sibling2 and Sibling3 are all nvarchar(50) (I can't change this; it is a requirement).
How can I calculate Total_Siblings in SQL so that it displays the number of siblings as shown above? I tried (Sibling1 + Sibling2), but it does not give the result I want.
Cheers
A query like this would do the trick.
SELECT ID,Sibling1,Sibling2,Sibling3
,COUNT(Sibling1)+Count(Sibling2)+Count(Sibling3) AS Total
FROM MyTable
GROUP BY ID
A little explanation is probably required here. COUNT with a column name counts the number of non-null values in that column. Since you are grouping by ID (and assuming ID is unique), each COUNT will only ever return 0 or 1. Now, if you're using anything other than MySQL, you'll have to replace
GROUP BY ID
with
GROUP BY ID,Sibling1,Sibling2,Sibling3
because most other databases require every selected column that is not inside an aggregate function to appear in the GROUP BY clause.
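To see the COUNT behaviour in action, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for whatever database you are actually using, and the table name MyTable and the TEXT column types are assumptions made for the demo.

```python
import sqlite3

# Hypothetical in-memory copy of the question's table (SQLite used for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, "
    "Sibling1 TEXT, Sibling2 TEXT, Sibling3 TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        (1, "Tom", "Lisa", None),
        (2, "Bart", "Jason", "Nelson"),
        (3, "George", None, None),
        (4, None, None, None),
    ],
)

# COUNT(column) skips NULLs, so each term is 0 or 1 inside a one-row group.
totals = conn.execute(
    """
    SELECT ID,
           COUNT(Sibling1) + COUNT(Sibling2) + COUNT(Sibling3) AS Total_Siblings
    FROM MyTable
    GROUP BY ID, Sibling1, Sibling2, Sibling3
    ORDER BY ID
    """
).fetchall()

print(totals)  # [(1, 2), (2, 3), (3, 1), (4, 0)]
```

The long GROUP BY form is used here because it is the one that should be accepted by the widest range of databases.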
Also, as an aside, you may want to consider changing your database schema to store the siblings in another table, so that each person can have any number of siblings.
You can do this by adding up individual counts:
select id,sibling1,sibling2,sibling3
,count(sibling1)+count(sibling2)+count(sibling3) as total_siblings
from mytable
group by 1,2,3,4;
However, your table structure scales poorly (what if an ID can have, say, 50 siblings?). If you store your data in a table with just id and sibling columns, one row per sibling, then the query becomes as simple as:
select id,count(sibling)
from mytable
group by id;
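A rough sketch of that one-row-per-sibling layout, again using Python's sqlite3 module purely as a test bed. The table name siblings is an assumption, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical normalized table: one row per (id, sibling) pair.
conn.execute("CREATE TABLE siblings (id INTEGER, sibling TEXT)")
conn.executemany(
    "INSERT INTO siblings VALUES (?, ?)",
    [
        (1, "Tom"), (1, "Lisa"),
        (2, "Bart"), (2, "Jason"), (2, "Nelson"),
        (3, "George"),
    ],
)

# One COUNT per id, no matter how many siblings a person has.
counts = conn.execute(
    "SELECT id, COUNT(sibling) FROM siblings GROUP BY id ORDER BY id"
).fetchall()

print(counts)  # [(1, 2), (2, 3), (3, 1)]
```

Note that ID 4 does not appear, because it has no rows in this table. To report a count of 0 you would need a separate table of people and a LEFT JOIN from it.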
Related
I am using Microsoft Access and in it, I have a table with data that is sometimes repeated. I'm not able to create an SQL query that removes duplicate data, leaving only distinct data in the table. Can someone help me?
My current table:
Date       | Level | Name
-----------+-------+-------
12/25/2021 | 2     | Jack
12/25/2021 | 2     | Jack
12/10/2021 | 3     | Ana
12/01/2021 | 1     | Lenon
12/01/2021 | 1     | Lenon
12/30/2021 | 3     | Ana
Expected result:
Date       | Level | Name
-----------+-------+-------
12/25/2021 | 2     | Jack
12/10/2021 | 3     | Ana
12/01/2021 | 1     | Lenon
12/30/2021 | 3     | Ana
PS: Ana appears twice in the expected result table because the dates of the two rows referring to Ana are different, so they are not duplicated values.
Just use select distinct:
select distinct t.*
from t;
I would add that tables should not have duplicate rows. Something is wrong with the table generation if you are getting duplicates -- either the query being used or the process for inserting rows into the table.
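You can group by the Date, Level and Name columns.
Use this query (the column names are bracketed because names such as Date and Name can clash with reserved words in Access):
SELECT [Date]
,[Level]
,[Name]
FROM <TableName>
GROUP BY [Date], [Level], [Name]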
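As a quick check that both approaches return the same rows, here is a small sketch in Python with sqlite3. It runs against SQLite rather than Access, the table name t is an assumption, and the dates are stored as plain text for simplicity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical copy of the question's table; dates kept as text for the demo.
conn.execute("CREATE TABLE t (Date TEXT, Level INTEGER, Name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("12/25/2021", 2, "Jack"),
        ("12/25/2021", 2, "Jack"),
        ("12/10/2021", 3, "Ana"),
        ("12/01/2021", 1, "Lenon"),
        ("12/01/2021", 1, "Lenon"),
        ("12/30/2021", 3, "Ana"),
    ],
)

# Approach 1: SELECT DISTINCT over all columns.
distinct_rows = conn.execute("SELECT DISTINCT Date, Level, Name FROM t").fetchall()

# Approach 2: GROUP BY every column.
grouped_rows = conn.execute(
    "SELECT Date, Level, Name FROM t GROUP BY Date, Level, Name"
).fetchall()

# Row order is not guaranteed by either query, so compare after sorting.
print(sorted(distinct_rows) == sorted(grouped_rows))
print(len(distinct_rows))
```

Both queries only return the distinct rows; neither deletes anything from the underlying table.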
I have a query which returns more than a million rows based on the Entity-Attribute-Value model. Note that each entity may have a different number of attributes, so I can't just look for a row ID. Here is an example table:
+----------+-----------+------------+
| EntityID | Attr_Name | Attr_Value |
+----------+-----------+------------+
| 1 | Age | 2 |
+----------+-----------+------------+
| 1 | Class | Spatial |
+----------+-----------+------------+
| 2 | Age | 3 |
+----------+-----------+------------+
| 2 | Class | Industrial |
+----------+-----------+------------+
| 3 | Class | Industrial |
+----------+-----------+------------+
I need to filter the EntityIDs by their Class. In this example, if I need all the EntityIDs that are Industrial, I want my query to return rows 3, 4 and 5 (that is, all rows associated with EntityID 2 and 3).
I thought about using a sub-select on the same query, grouping by EntityID and keeping only the EntityIDs that are Industrial in the WHERE clause (WHERE EntityID = (subquery)), but that is not efficient at all. The query has a lot of joins and unions, so it takes a lot of time. I'm open to any suggestions for a more efficient way of doing it (which I'm sure there is)!
Thanks.
You can use exists:
select t.*
from t
where exists (select 1
              from t t2
              where t2.entityid = t.entityid and
                    t2.attr_name = 'Class' and
                    t2.attr_value = 'Industrial'
             );
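Here is a minimal sketch of the EXISTS filter run against the sample data, using Python's sqlite3 module. The table name t comes from the query above; the column types and the ORDER BY are assumptions added so the output is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical EAV table matching the question's example.
conn.execute("CREATE TABLE t (entityid INTEGER, attr_name TEXT, attr_value TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "Age", "2"),
        (1, "Class", "Spatial"),
        (2, "Age", "3"),
        (2, "Class", "Industrial"),
        (3, "Class", "Industrial"),
    ],
)

# Keep every row whose entity has a Class = 'Industrial' row somewhere.
industrial_rows = conn.execute(
    """
    SELECT t.*
    FROM t
    WHERE EXISTS (SELECT 1
                  FROM t AS t2
                  WHERE t2.entityid = t.entityid
                    AND t2.attr_name = 'Class'
                    AND t2.attr_value = 'Industrial')
    ORDER BY t.entityid, t.attr_name
    """
).fetchall()

print(industrial_rows)
# [(2, 'Age', '3'), (2, 'Class', 'Industrial'), (3, 'Class', 'Industrial')]
```

On a large table, an index covering entityid, attr_name and attr_value would likely help the subquery, though that depends on your database and data.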
I have tables below as follows:
tbl_tasks
+---------+-------------+
| Task_ID | Assigned_ID |
+---------+-------------+
| 1 | 8 |
| 2 | 12 |
| 3 | 31 |
+---------+-------------+
tbl_resources
+---------+-----------+
| Task_ID | Source_ID |
+---------+-----------+
| 1 | 4 |
| 1 | 10 |
| 2 | 42 |
| 4 | 8 |
+---------+-----------+
A task is assigned to at least one person (denoted by the "assigned_ID") and then any number of people can be assigned as a source (denoted by "source_ID"). The ID numbers are all linked to names in another table. Though the ID columns are named differently, they all refer to the same table.
Would there be any way for me to combine the two tables based on ID so that I could search by someone's ID number? For example, if I search with WHERE User_ID = 8 to see which tasks person 8 is involved in, I would get back Task 1 and Task 4.
Right now, by joining all the tables together, I can easily filter on "Assigned" but not "Source" due to all the multiple entries in the table.
Use union all:
select distinct task_id
from ((select task_id, assigned_id as id
       from tbl_tasks
      ) union all
      (select task_id, source_id
       from tbl_resources
      )
     ) ti
where id = ?;
Note that this uses select distinct in case someone is assigned to the same task in both tables. If not, remove the distinct.
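A small sketch of the same idea, run with Python's sqlite3 module against the sample data. One difference from the query above: SQLite does not accept parentheses around the individual SELECTs of a UNION ALL, so they are dropped here. The ORDER BY is an addition to make the output predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_tasks (Task_ID INTEGER, Assigned_ID INTEGER)")
conn.execute("CREATE TABLE tbl_resources (Task_ID INTEGER, Source_ID INTEGER)")
conn.executemany("INSERT INTO tbl_tasks VALUES (?, ?)", [(1, 8), (2, 12), (3, 31)])
conn.executemany(
    "INSERT INTO tbl_resources VALUES (?, ?)",
    [(1, 4), (1, 10), (2, 42), (4, 8)],
)

# Stack both kinds of involvement into one (task_id, id) list, then filter.
query = """
    SELECT DISTINCT task_id
    FROM (SELECT task_id, assigned_id AS id FROM tbl_tasks
          UNION ALL
          SELECT task_id, source_id FROM tbl_resources) AS ti
    WHERE id = ?
    ORDER BY task_id
"""

tasks_for_8 = [row[0] for row in conn.execute(query, (8,))]
print(tasks_for_8)  # [1, 4]
```

Person 8 is the assignee of Task 1 and a source for Task 4, which matches the result asked for in the question.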
I am currently trying to SELECT the DISTINCT FirstNames in a GROUP, using Microsoft Access 2010.
The simplified relevant columns of my table look like this:
+----+-------------+-----------+
| ID | GroupNumber | FirstName |
+----+-------------+-----------+
| 1 | 1 | Peter |
| 2 | 1 | Bob |
| 3 | 1 | Peter |
| 4 | 2 | Rosemary |
| 5 | 2 | Jamie |
| 6 | 3 | Peter |
+----+-------------+-----------+
My actual table contains two columns to which I want to apply this process (separately), but I should be able to simply repeat the process for the other column. The column group number is a simplification, my table actually groups all rows in a ten day interval together, but I've already solved that problem.
And I would like it to return this:
+-------------+------------+
| GroupNumber | FirstNames |
+-------------+------------+
| 1 | Peter |
| 1 | Bob |
| 2 | Rosemary |
| 2 | Jamie |
| 3 | Peter |
+-------------+------------+
This means that I want all Distinct FirstNames for each Group.
A regular DISTINCT would ignore group boundaries and only mention Peter once. All aggregate functions reduce my output to only one value or don't work on strings at all. Access also doesn't support SELECTing columns that are not aggregates or in the GROUP BY statement.
All other answers I've found either want an aggregate, are not applicable to MS Access or are solved by working around the data in ways not applicable to my case. (Standardized languages are a nice thing, aren't they?)
My current (invalid) query looks like this:
SELECT GroupNumber,
DISTINCT FirstNames -- This is illegal, distinct applies to all
-- columns and doesn't respect groups.
FROM Example AS b
-- Complicated stuff to make the groups
GROUP BY GroupNumber;
This query is a one time thing and is used to analyze a 58000 row excel spreadsheet exported from another Database (not my fault), so optimizing for runtime is not necessary.
I would like to achieve this purely through SQL and without VBA if at all possible.
This should work:
SELECT DISTINCT GroupNumber, FirstNames
FROM Example AS b
A solution for this problem would be to group by the columns GroupNumber and FirstNames at the same time. The query is presented below:
Select GroupNumber, FirstNames
From input
Group By GroupNumber, FirstNames
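To check that both suggestions give the per-group distinct names, here is a sketch using Python's sqlite3 module instead of Access. The table name Example and the column name FirstNames follow the queries above (the sample table in the question spells it FirstName), and the ORDER BY clauses are additions for predictable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical copy of the question's sample data.
conn.execute(
    "CREATE TABLE Example (ID INTEGER, GroupNumber INTEGER, FirstNames TEXT)"
)
conn.executemany(
    "INSERT INTO Example VALUES (?, ?, ?)",
    [
        (1, 1, "Peter"), (2, 1, "Bob"), (3, 1, "Peter"),
        (4, 2, "Rosemary"), (5, 2, "Jamie"),
        (6, 3, "Peter"),
    ],
)

# DISTINCT applies to the (GroupNumber, FirstNames) pair, so Peter is kept
# once per group rather than once overall.
via_distinct = conn.execute(
    "SELECT DISTINCT GroupNumber, FirstNames FROM Example "
    "ORDER BY GroupNumber, FirstNames"
).fetchall()

# Grouping by both columns yields the same pairs.
via_group_by = conn.execute(
    "SELECT GroupNumber, FirstNames FROM Example "
    "GROUP BY GroupNumber, FirstNames "
    "ORDER BY GroupNumber, FirstNames"
).fetchall()

print(via_distinct)
# [(1, 'Bob'), (1, 'Peter'), (2, 'Jamie'), (2, 'Rosemary'), (3, 'Peter')]
```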
(Standardized languages are a nice thing, aren't they?)
I've created a form in PHP that collects basic information. I have a list box that allows multiple items to be selected (e.g. housing, rent, food, water). If multiple items are selected, they are stored in a field called Needs, separated by commas.
I have created a report ordered by each person's needs. The people who have only one need are sorted correctly, but the people who have several are sorted by the exact string passed to the database (e.g. housing, rent, food, water), which is not what I want.
Is there a way to separate the multiple values in this field using SQL to count each need instance/occurrence as 1 so that there are no comma delimitations shown in the results?
Your database is not in the first normal form. A non-normalized database will be very problematic to use and to query, as you are actually experiencing.
In general, you should be using at least the following structure. It can still be normalized further, but I hope this gets you going in the right direction:
CREATE TABLE users (
user_id int,
name varchar(100)
);
CREATE TABLE users_needs (
need varchar(100),
user_id int
);
Then you should store the data as follows:
-- TABLE: users
+---------+-------+
| user_id | name |
+---------+-------+
| 1 | joe |
| 2 | peter |
| 3 | steve |
| 4 | clint |
+---------+-------+
-- TABLE: users_needs
+---------+----------+
| need | user_id |
+---------+----------+
| housing | 1 |
| water | 1 |
| food | 1 |
| housing | 2 |
| rent | 2 |
| water | 2 |
| housing | 3 |
+---------+----------+
Note how the users_needs table is defining the relationship between one user and one or many needs (or none at all, as for user number 4.)
To normalise your database further, you should also use another table called needs, as follows:
-- TABLE: needs
+---------+---------+
| need_id | name |
+---------+---------+
| 1 | housing |
| 2 | water |
| 3 | food |
| 4 | rent |
+---------+---------+
Then the users_needs table should just refer to a candidate key of the needs table instead of repeating the text.
-- TABLE: users_needs (instead of the previous one)
+---------+----------+
| need_id | user_id |
+---------+----------+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 1 | 2 |
| 4 | 2 |
| 2 | 2 |
| 1 | 3 |
+---------+----------+
You may also be interested in checking out the following Wikipedia article for further reading about repeating values inside columns:
Wikipedia: First normal form - Repeating groups within columns
UPDATE:
To fully answer your question, if you follow the above guidelines, sorting, counting and aggregating the data should then become straight-forward.
To sort the result-set by needs, you would be able to do the following (using the users table and the first version of the users_needs table above):
SELECT users.name, users_needs.need
FROM users
INNER JOIN users_needs ON (users_needs.user_id = users.user_id)
ORDER BY users_needs.need;
You would also be able to count how many needs each user has selected, for example:
SELECT users.name, COUNT(users_needs.need) as number_of_needs
FROM users
LEFT JOIN users_needs ON (users_needs.user_id = users.user_id)
GROUP BY users.user_id, users.name
ORDER BY number_of_needs;
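Here is a rough sketch of the normalized layout in use, assuming the users table and the first version of the users_needs table (with the need text column) shown earlier. It uses Python's sqlite3 module as a stand-in for your database, and the extra users.name tie-breaker in ORDER BY is only there to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE users_needs (need TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "joe"), (2, "peter"), (3, "steve"), (4, "clint")],
)
conn.executemany(
    "INSERT INTO users_needs VALUES (?, ?)",
    [
        ("housing", 1), ("water", 1), ("food", 1),
        ("housing", 2), ("rent", 2), ("water", 2),
        ("housing", 3),
    ],
)

# How many needs each user selected; LEFT JOIN keeps users with none.
needs_per_user = conn.execute(
    """
    SELECT users.name, COUNT(users_needs.need) AS number_of_needs
    FROM users
    LEFT JOIN users_needs ON users_needs.user_id = users.user_id
    GROUP BY users.user_id, users.name
    ORDER BY number_of_needs, users.name
    """
).fetchall()

# How many users selected each need: every occurrence counts as 1.
users_per_need = conn.execute(
    "SELECT need, COUNT(*) FROM users_needs GROUP BY need ORDER BY need"
).fetchall()

print(needs_per_user)  # [('clint', 0), ('steve', 1), ('joe', 3), ('peter', 3)]
print(users_per_need)  # [('food', 1), ('housing', 3), ('rent', 1), ('water', 2)]
```

The second query is the per-need tally the question asks about, and it needs no string splitting once each need has its own row.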
I'm a little confused by the goal. Is this a UI problem or are you just having trouble determining who has multiple needs?
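The number of needs is the number of commas plus one, which you can get from the difference in length between the field and the field with its commas removed: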
Len([Needs]) - Len(Replace([Needs],',','')) + 1
Can you provide more information about the Sort you're trying to accomplish?
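For illustration, here is the comma-counting expression above tried out in SQLite through Python's sqlite3 module. SQLite spells the functions length and replace rather than Len and Replace, and the table name people and its rows are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the comma-separated Needs field.
conn.execute("CREATE TABLE people (name TEXT, Needs TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("joe", "housing,water,food"), ("steve", "housing")],
)

# Number of needs = number of commas + 1.
need_counts = conn.execute(
    """
    SELECT name,
           length(Needs) - length(replace(Needs, ',', '')) + 1 AS number_of_needs
    FROM people
    ORDER BY name
    """
).fetchall()

print(need_counts)  # [('joe', 3), ('steve', 1)]
```

One caveat: an empty Needs string would be counted as 1 by this expression, so empty values may need separate handling.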
UPDATE:
I think these Oracle-based posts may have what you're looking for: post and post. The only difference is that you would probably be better off using the method I list above to find the number of comma-delimited pieces rather than doing the translate(...) that the author suggests. Hope this helps - it's Oracle-based, but I don't see .