How to avoid group by? [closed] - sql

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
Say there are two tables, one table has a column name, and the other has a column occupation. I'm trying to find out how many people have more than 6 occupations in my records. I've tried to COUNT the occupations, but the problem is, when I do, I need to group by. When I group by the name, the problem arises when there are two "Alex Jones", each having 4 occupations, and so the resulting group by gives me "Alex Jones: 8".
I'm not sure how I can avoid this. Some advice would be great, thanks in advance!

If your problem is that grouping by "name" merges two identical names that refer to different people, then your "name" column is not unique. Group by a unique column instead, or by a combination of columns that together identify one person.
For example, you can group by name, other_column, where other_column is a column that, together with "name", uniquely identifies the person. Or, even better, group by personal_id if you have a unique column such as a social security number.
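To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (people, occupations), the person_id key and the sample rows are assumptions for illustration only, not the asker's real schema.

```python
import sqlite3

# Hypothetical schema: person_id is the unique key; names may repeat.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (person_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE occupations (person_id INTEGER, occupation TEXT);
""")
conn.executemany("INSERT INTO people VALUES (?, ?)",
                 [(1, "Alex Jones"), (2, "Alex Jones"), (3, "Sam Lee")])
rows = ([(1, f"job{i}") for i in range(4)]      # first Alex Jones: 4 jobs
        + [(2, f"job{i}") for i in range(4)]    # second Alex Jones: 4 jobs
        + [(3, f"job{i}") for i in range(7)])   # Sam Lee: 7 jobs
conn.executemany("INSERT INTO occupations VALUES (?, ?)", rows)

# Grouping by name merges both Alex Joneses into one group of 8.
by_name = conn.execute("""
    SELECT p.name, COUNT(o.occupation)
    FROM people p JOIN occupations o ON o.person_id = p.person_id
    GROUP BY p.name
    HAVING COUNT(o.occupation) > 6
    ORDER BY p.name
""").fetchall()

# Grouping by the unique key keeps the two people apart.
by_id = conn.execute("""
    SELECT p.person_id, p.name, COUNT(o.occupation)
    FROM people p JOIN occupations o ON o.person_id = p.person_id
    GROUP BY p.person_id, p.name
    HAVING COUNT(o.occupation) > 6
""").fetchall()

print(by_name)
print(by_id)
```

With this sample data, the name-based query wrongly reports "Alex Jones" as having more than 6 occupations, while the key-based query reports only the person who really does.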
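As another option, you can use window functions to count without grouping by. Partitioning by name still merges identical names, so partition by a unique column if one is available. For example: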
select
...
name,
COUNT(occupation) OVER(PARTITION BY name)
...
from
my_table
You can learn how to use it from here:
https://www.postgresql.org/docs/current/tutorial-window.html
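A small runnable sketch of the window-function approach, using Python's sqlite3 module (window functions need SQLite 3.25 or newer). The person_id column and the sample rows are assumptions added for illustration. The query partitions by that key rather than by name, so two people with the same name are counted separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: person_id is assumed to identify a person uniquely.
conn.execute(
    "CREATE TABLE my_table (person_id INTEGER, name TEXT, occupation TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", [
    (1, "Alex Jones", "baker"),
    (1, "Alex Jones", "pilot"),
    (2, "Alex Jones", "nurse"),
])

# Every row is kept; the count is attached to each row of its partition.
windowed = conn.execute("""
    SELECT person_id, name, occupation,
           COUNT(occupation) OVER (PARTITION BY person_id) AS occupation_count
    FROM my_table
    ORDER BY person_id, occupation
""").fetchall()

for row in windowed:
    print(row)
```

Unlike GROUP BY, this returns one row per occupation, so you would still need to filter or de-duplicate if you only want one line per person.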

Related

SQL Server combine 2 rows into 1 [closed]

In a SQL Server query, I'm trying to figure out how to combine two rows of data into one row for specific records.
The following is an example of table data. Below it is how I would like the data to be displayed. I want to display all available columns for each employee but on 1 row. I tried group by but that did not work as I want all the columns displayed.
I'd like to display only one row for certain employees who have two rows. I can use EMP ID because it is associated with a specific employee. Any suggestions for the best way to accomplish this in SQL Server?
In SQL Server, you can use the GROUP BY clause and an aggregate function to combine multiple rows of data into one for specific records. The following query, for example, will group the rows by EMP ID and return the sum and count of the specified column for each group:
SELECT EMP_ID, SUM(column_name) AS column_name, COUNT(column_name) AS column_name_count
FROM your_table
GROUP BY EMP_ID;
The data will be organized by employee ID, and a summary of the column specified for each group of records with the same employee ID will be provided.
If you simply wish to display the employee IDs without regard to any of the other columns in the table, you can do so with the DISTINCT keyword:
select distinct emp_id from your_table;
This will return employee IDs without any duplicate values being returned.
If you are looking to aggregate data (which I believe is your intention), then it is a case of using GROUP BY together with an aggregate function such as SUM. For example, given emp_id and a column x, one such query could be:
select emp_id, sum(x) from your_table group by emp_id order by emp_id;
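Since the question asks for all columns on one row, one possible variation is to group by emp_id and wrap every other column in MAX(), which ignores NULLs. The sketch below assumes that an employee's two rows hold complementary values (each column filled in only one of them). The phone and email columns and the data are made up, because the original table was not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical columns; the real table layout was not included in the question.
conn.execute("CREATE TABLE your_table (emp_id INTEGER, phone TEXT, email TEXT)")
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", [
    (10, "555-0100", None),            # employee 10, first row
    (10, None, "a@example.com"),       # employee 10, second row
    (20, "555-0199", "b@example.com"), # employee 20 already has one row
])

# MAX() skips NULLs, so each column keeps the one non-NULL value per employee.
merged = conn.execute("""
    SELECT emp_id, MAX(phone) AS phone, MAX(email) AS email
    FROM your_table
    GROUP BY emp_id
    ORDER BY emp_id
""").fetchall()

for row in merged:
    print(row)
```

If both rows can hold different non-NULL values in the same column, MAX() silently keeps only one of them, so this only fits data shaped like the assumption above.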

How to sum values from a column that is null? [closed]

I need to do a union between 2 tables, and there is a column in the first table that doesn't exist in the second table, so I would like to set up a column of nulls (e.g.: select null as column_name from table2). However, it involves a SUM and I can't use "null as Sum(column_name)" since it doesn't work. Since the values are null, the sum would be 0.
So, how do I select a sum from a column that doesn't exist in a table but I will insert null values on that column?
In your union, when you select from the table that does not have the column in question, just include null as <column name> in the appropriate position in your select statement. Or you could do 0 as <column name>, whatever works better for you.
You could apply ZEROIFNULL around that column to turn the nulls into 0:
SUM(ZEROIFNULL(UnionColumn))
This should work fine with Teradata.
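Here is a small sketch of the "null as column" idea, run against SQLite through Python's sqlite3 module. The table names, the amount column and the data are assumptions. It uses the standard COALESCE function in place of Teradata's ZEROIFNULL, because SQLite has no ZEROIFNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: table2 lacks the amount column that table1 has.
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, amount INTEGER);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO table1 VALUES (1, 5), (1, 7);
    INSERT INTO table2 VALUES (2), (2);
""")

# The second branch supplies NULL in the position of the missing column.
totals = conn.execute("""
    SELECT id,
           SUM(amount)              AS raw_sum,
           COALESCE(SUM(amount), 0) AS safe_sum
    FROM (
        SELECT id, amount FROM table1
        UNION ALL
        SELECT id, NULL AS amount FROM table2
    )
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in totals:
    print(row)
```

Note that a SUM over only NULL values yields NULL rather than 0, which is why the COALESCE (or ZEROIFNULL, or selecting 0 instead of null) matters if you need a number.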

How can I find duplicate records in clickhouse [closed]

I want to know how I can find duplicate data entries within one table in clickhouse.
I am investigating a MergeTree table and have already run OPTIMIZE statements against it, but that didn't do the trick. The duplicate entries still persist.
Preferred would be to have a universal strategy without referencing individual column names.
I only want to see the duplicate entries, since I am working on very large tables.
The straightforward way would be to run this query.
SELECT
*,
count() AS cnt
FROM myDB.myTable
GROUP BY *
HAVING cnt > 1
ORDER BY date ASC
If that query gets too big, you can run it in pieces.
SELECT
*,
count() AS cnt
FROM myDB.myTable
WHERE (date >= '2020-08-01') AND (date < '2020-09-01')
GROUP BY *
HAVING cnt > 1
ORDER BY date ASC
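The same "group by every column, keep groups with more than one row" idea can be tried locally. The sketch below is a stand-in that runs on SQLite via Python's sqlite3 module, not on ClickHouse. The table and its rows are made up, and the PRAGMA catalog lookup that builds the column list is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for myDB.myTable.
conn.execute("CREATE TABLE my_table (date TEXT, user_id INTEGER, value INTEGER)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", [
    ("2020-08-01", 1, 10),
    ("2020-08-01", 1, 10),   # exact duplicate of the row above
    ("2020-08-02", 2, 20),
])

# Read the column names from the catalog so none are written by hand.
columns = [row[1] for row in conn.execute("PRAGMA table_info(my_table)")]
column_list = ", ".join(f'"{c}"' for c in columns)

# Group by all columns; any group with more than one row is a duplicate.
duplicates = conn.execute(f"""
    SELECT {column_list}, COUNT(*) AS cnt
    FROM my_table
    GROUP BY {column_list}
    HAVING COUNT(*) > 1
    ORDER BY "date"
""").fetchall()

print(duplicates)
```

Only the duplicated rows come back, each with its repeat count, which keeps the result small even on large tables.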

Why this SQL statement is wrong? [closed]

Given the table below, called marks, why is the following statement wrong?
SELECT student_name, SUM(subject1)
FROM marks
WHERE student_name LIKE 'R%';
Following is 'marks' table
You need to add a GROUP BY whenever you select a non-aggregated column alongside an aggregate function such as SUM or AVG.
SELECT student_name, SUM(subject1)
FROM marks
WHERE student_name LIKE 'R%'
GROUP BY student_name;
Hope this works for you now!
Why is it wrong?
The SUM() makes this an aggregation query. An aggregation query is fine without a GROUP BY. It returns one row and all columns in the SELECT should be aggregation functions.
However, you have an unaggregated column student_name. This is not in a GROUP BY and it is not the argument to an aggregation function.
Presumably, you want GROUP BY student_name. But there are other possibilities, such as MIN(), MAX() or LISTAGG().
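The two valid shapes described above can be compared side by side. This is a sketch with made-up rows, since the original marks table was not included. It runs on SQLite through Python's sqlite3 module. SQLite happens to tolerate the original statement instead of rejecting it, so only the two unambiguous forms are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical contents for the marks table.
conn.execute("CREATE TABLE marks (student_name TEXT, subject1 INTEGER)")
conn.executemany("INSERT INTO marks VALUES (?, ?)", [
    ("Rita", 80), ("Rita", 10), ("Ravi", 70), ("Sam", 90),
])

# Aggregate only, no GROUP BY: a single row for all matching students.
overall = conn.execute("""
    SELECT SUM(subject1)
    FROM marks
    WHERE student_name LIKE 'R%'
""").fetchall()

# Aggregate plus GROUP BY: one row per student name.
per_student = conn.execute("""
    SELECT student_name, SUM(subject1)
    FROM marks
    WHERE student_name LIKE 'R%'
    GROUP BY student_name
    ORDER BY student_name
""").fetchall()

print(overall)
print(per_student)
```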

Picking unique records in SQL [closed]

Say I have a table with multiple records of people's names, and I draw a prize winner every month. What SQL query can I use so that I pick a unique record every month and that person does not get picked again the next month? I do not want to delete the record of that person.
Create a new flag column, named something like 'prizeFlag'. Make it a boolean (0 or 1) with a default value of 0, meaning the person has not won a prize yet.
When you select a random row, update this field to 1, meaning the person has won a prize.
When you select a random row the next month, add a condition to the WHERE clause requiring that prizeFlag is not equal to 1, to avoid picking the same person again.
One should store whether a person has already won. A date would make sense, so that people can become eligible again after, say, 10 years.
ALTER TABLE persons ADD won DATE;
A portable way would be to use a random number function outside the SQL.
SELECT MIN(id), MAX(id), COUNT(*) FROM persons
With a random number one can get the next valid ID.
SELECT MIN(ID) FROM persons WHERE won IS NULL AND ID >= ?
int randomno = minId + new Random().nextInt(maxId - minId + 1);
preparedStatement.setInt(1, randomno);
UPDATE persons SET won = CURRENT_DATE WHERE ID = ?
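The steps above can be sketched end to end as follows, using Python's sqlite3 and random modules in place of the Java snippet. The draw_winner helper, the name column and the sample people are assumptions for illustration. Unlike the queries above, the MIN/MAX lookup here is restricted to people who have not won yet, so the follow-up query always finds someone.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical persons table with the extra "won" date column.
conn.execute("CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, won DATE)")
conn.executemany("INSERT INTO persons (id, name) VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cy")])


def draw_winner(conn, rng=random):
    """Pick one person who has not won yet and record today's date."""
    min_id, max_id = conn.execute(
        "SELECT MIN(id), MAX(id) FROM persons WHERE won IS NULL").fetchone()
    if min_id is None:
        return None  # everyone has already won
    start = rng.randint(min_id, max_id)  # inclusive on both ends
    # Next eligible id at or after the random starting point.
    (winner_id,) = conn.execute(
        "SELECT MIN(id) FROM persons WHERE won IS NULL AND id >= ?",
        (start,)).fetchone()
    conn.execute("UPDATE persons SET won = CURRENT_DATE WHERE id = ?",
                 (winner_id,))
    return winner_id


# Four monthly draws over three people: the fourth finds nobody eligible.
winners = [draw_winner(conn) for _ in range(4)]
print(winners)
```

One caveat of this approach: when the IDs have gaps, the person right after a gap is more likely to be picked, so it is not a perfectly uniform draw.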