Count and Group By Returning Different Sets - Halfway solved :/ - sql

Can anyone help? My trouble is that people might have the same id or different and have different name spellings. If I group by id (which is not the primary key) I get a different amount of rows than if I group by ID and name. How do I just group by ID, while still having the ID and Name in the select?
Create Table Client(ID Int, Name Varchar(15))
Insert Into Client VALUES(11,'Batman'),(22,'Batman'),(33,'Robin'),(44,'Joker'),(44,'The Joker'),(33,'Robin')
Select Count(ID) From Client
Select * From Client
--This returns 4 rows as it should
Select Count (ID)
From Client
Group By ID
--This returns 5 rows because Joker and The Joker have different names, but the same ID. I want to count by ID and not the name, since so many have typos.
Select Count (ID), [Name] , ID
From Client
Group By ID, [Name]
How do I do this and have it work?
Select Count (ID), [Name] , ID
From Client
Group By ID --<< Always throws and error unless I include Name, which
--returns too many rows.
It should return
Count Name ID
1 Batman 11
1 Batman 22
2 Joker 44 --<< Correct
2 Robin 33
And not
Count Name ID
1 Batman 11
1 Batman 22
2 Robin 33
1 Joker 44 --Wrong
1 The Joker 44 --Wrong

using select count(*) from ClientLog will tell you exactly how many records there are in your table. If your ID field is the primary key, then select count(ID) from ClientLog should return the same number.
Your first query is a little confusing, because you're grouping by ID but not displaying the ID. So you're likely getting a row for each record, where the row value is 1.
Your 2nd query is also a bit confusing, because there's no aggregation happening (since your ID field is unique).
What specifically are you trying to obtain in your query (if anything besides just how many records you have in your table)?

Related

First name should randomly match with other FIRST name

All first name should randomly match with each other and when I tried to run query again the First Name should be match with others name. Not the match with FIRST time match.
For example I have 6 records in one table ...
First name column looks like:
JHON
LEE
SAM
HARRY
JIM
KRUK
So I want result like
First name1 First name2
Jhon. Harry
LEE. KRUK
HARRY SAM
The simplest solution is to first randomly sort the records, then calculate the grouping and a sequence number within the group and then finally select out the groups as rows.
You can follow along with the logic in this fiddle: https://dbfiddle.uk/9JlK59w4
DECLARE #Sorted TABLE
(
Id INT PRIMARY KEY,
FirstName varchar(30),
RowNum INT IDENTITY(1,1)
);
INSERT INTO #Sorted (Id, FirstName)
SELECT Id, FirstName
FROM People
ORDER BY NEWID();
WITH Pairs as
(
SELECT *
, (RowNum+1)/2 as PairNum
, RowNum % 2 as Ordinal
FROM #Sorted
)
SELECT
Person1.FirstName as [First name1], Person2.FirstName as [First name2]
FROM Pairs Person1
LEFT JOIN Pairs Person2 ON Person1.PairNum = Person2.PairNum AND Person2.Ordinal = 1
WHERE Person1.Ordinal = 0
ORDER BY Person1.PairNum
ORDER BY NEWID() is used here to randomly sort the records. Note that it is indeterminate and will return a new value with each execution. It's not very efficient, but is suitable for our requirement.
You can't easily use CTE's for producing lists of randomly sorted records because the result of a CTE is not cached. Each time the CTE is referenced in the subsequent logic can result in re-evaluating the expression. Run this fiddle a few times and watch how it often allocates the names incorrectly: https://dbfiddle.uk/rpPdkkAG
Due to the volatility of NEWID() this example stores the results in a table valued variable. For a very large list of records a temporary table might be more efficient.
PairNum uses the simple divide by n logic to assign a group number with a length of n
It is necessary to add 1 to the RowNum because the integer math will round down, see this in action in the fiddle.
Ordinal uses the modulo on the RowNumber and is a value we can use to differentiate between Person 1 and Person 2 in the pair. This helps us keep the rest of the logic determinate.
In the final SELECT we select first from the Pairs that have an Ordinal of 0, then we join on the Pairs that have an Ordinal of 1 matching by the PairNum
You can see in the fiddle I added a solution using groups of 3 to show how this can be easily extended to larger groupings.

Row returned by a query with ROWNUM and PRIMARY KEY column

I'm reading about 1 of the oracle pseudocolumns i.e
ROWNUM: which return number indicating the order in which oracle return the row.
I have encountered some behavior here,
Example used
1. Create Script:
CREATE TABLE CUSTOMERS(
ID INT NOT NULL,
NAME VARCHAR (20) NOT NULL,
AGE INT NOT NULL,
ADDRESS CHAR (25) ,
SALARY DECIMAL (18, 2),
PRIMARY KEY (ID)
);
2 Insert Script:
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (2, 'Max', 22, 'India', 4500.00 );
INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY)
VALUES (1, 'Maths', 22, 'US', 4500.00 );
select * from CUSTOMERS; executed in the same order as inserted above,
ID NAME AGE ADDRESS SALARY
------ ---- --- ------- ------
2 Max 22 India 4500.00
1 Math 22 US 4500.00
Here if I run the select query,
select
rownum,
customers.ID
from customers;
I get below output:
ROWNUW ID
------ --
1 1
2 2
Here ID = 2 is inserted first but oracle returns in the 2nd row,
But if you include any other column from a table with ROWNUM and PK like
select
rownum,
customers.ID,customers.Name
from customers;
I get correct output (Correct inserted order) :
ROWNUW ID NAME
------ -- ----
1 2 Max
2 1 Math
If run query without Name column i.e only ROWNUW (pseudocolumns) and PK (table column)
We get this,
ROWNUW ID
------ --
1 1
2 2
My Question is Why is that ID=2 is not first returned row.
if I query any table column with ROWNUM, I get the result back based on insertion order.
Example below
select
rownum,
(NAME / AGE / ADDRESS SALARY) any one of these columns
From CUSTOMERS
But if use ROWNUM with ID (Primary Key column) insertion order is not working. why is this behavior with the only Primary Key column?
ROWNUM values depends on how oracle access the resultset. Once resultset is fetched, rownum is assigned to the rows.
As there is no guarantee in what order the data is returned, there is no guarantee in what order rownum is assigned.
Maybe if you try running both of your queries without rownum, you might get the rows swapped as you showed, or they might not.
Rownum will always be in order of 1,2,3,4 .. no matter what you are getting in result set. It is always calculated after result set is returned from query.
Now, you have ID column as PRIMARY KEY which makes it eligible for default clustered index creation in database. So when you select only PRIMARY column, it will be sorted as it was stored in DB that way. Index created as a result of PRIMARY key creation is sorted in ASC which is default.
Here if you select only PRIMARY KEY column i.e. "ID", Database is returning it in sorted order and ROWNUM is writing 1 & 2 to it.
But if you are selecting some extra non-PRIMARY KEY as well then it is totally upto oracle, whichever way it finds data faster. Sometimes it will return data from cache memory. If not it will read disk and fetch data. While reading datafile, there are multiple things which will differ machine to machine i.e is your table partitioned? is your table sharing tablespace ? what organization you are using in table ? blah blah....
Bottom line is, don't trust ROWNUM. It gives 1,2,3.. after you have applied your brain to the query.

Querying SQL table with different values in same column with same ID

I have an SQL Server 2012 table with ID, First Name and Last name. The ID is unique per person but due to an error in the historical feed, different people were assigned the same id.
------------------------------
ID FirstName LastName
------------------------------
1 ABC M
1 ABC M
1 ABC M
1 ABC N
2 BCD S
3 CDE T
4 DEF T
4 DEG T
In this case, the people with ID’s 1 are different (their last name is clearly different) but they have the same ID. How do I query and get the result? The table in this case has millions of rows. If it was a smaller table, I would probably have queried all ID’s with a count > 1 and filtered them in an excel.
What I am trying to do is, get a list of all such ID's which have been assigned to two different users.
Any ideas or help would be very appreciated.
Edit: I dont think I framed the question very well.
There are two ID's which are present multiple time. 1 and 4. The rows with id 4 are identical. I dont want this in my result. The rows with ID 1, although the first name is same, the last name is different for 1 row. I want only those ID's whose ID is same but one of the first or last names is different.
I tried loading ID's which have multiple occurrences into a temp table and tried to compare it against the parent table albeit unsuccessfully. Any other ideas that I can try and implement?
SELECT
ID
FROM
<<Table>>
GROUP BY
ID
HAVING
COUNT(*) > 1;
SELECT *
FROM myTable
WHERE ID IN (
SELECT ID
FROM myTable
GROUP BY ID
HAVING MAX(LastName) <> MIN(LastName) OR MAX(FirstName) <> MIN(FirstName)
)
ORDER BY ID, LASTNAME

how to increase character sequence in sql if given random number of table rows?

I'm currently studying SQL language in Oracle.
After making very simple STUDENT table, I thought about how to make character sequence in ID field.
For example, if STUDENT table has 6 rows, I want the ID field to be inserted by 'a','b','c'...'f' characters respectively. And another condition is that the ID sequence should be ordered by age in ascending order.
The below explanation is about STUDENT table description and current inserted value (ID field is currently empty).
NAME AGE GRADE ID
hi 15 1
dui 12 2
giyu 16 3
hero 27 4
power 55 3
rai 37 4
///////////////////////////////////////////////////////////////////////////////////////////////////////
DESC STUDENT
NAME VARCHAR2(20)
AGE NUMBER(5)
GRADE NUMBER
ID VARCHAR2(12)
I hope many brilliant ideas come up here =)
until now, this is very easy to come up with making table ordered by age.
but inserting character sequence respectively is ... well .. idea doesn't come up now. And this is not homework. i just want to practice sql language.
update tableX X
set ID=(
select ID from (
select rowid as rid,
chr(mod((row_number() over (order by age))-1,26)+97) as ID
from tableX T
)
where rid=X.rowid
)
Required order of the ID set in the over(order by ) clause. Function row_number() gets sequence number of rows in given order. mod() gets remainder of the division (for 26 chars only). chr() get char by the ascii code.

Can IBM DB2 return a 0 (zero) when no records are found?

for example :
I have a table with student ID and student grades
-----------------------
ID | grades
-----------------------
1 | 80
2 | 28
-----------------------
I want to get 0 when I query about ID = 3
can I do that ?
like select grades from student where id = 3 .
I want to get 0 because ID is not in the table
Run a select command with the reserved function called count:
select count(*) from STUDENT.GRADES where ID=3
It should be just like that.
Maybe this will do what you want:
SELECT ID, MAX(Grades)
FROM (SELECT ID, Grade FROM Students WHERE ID = 3
UNION
VALUES (3, 0) -- Not certain of syntax here
)
GROUP BY ID
The basic idea is that students present in the table will have two rows and the MAX will pick their proper grade (assuming that there are no circumstances where the grade is coded as a negative value). Students that are not represented will have just the one row with a grade of 0. The repeated 3 is the ID of the student being sought.
Have fun chasing down the full syntax. I started at Queries in the DB2 9.7 Information Centre, but ran out of patience before I got a good answer — and I don't have DB2 to experiment on. You might need to write SELECT ID, Grades FROM VALUES (3, 0), or there might be some other magical incantation that does the job. You could probably use SELECT 3 AS ID, 0 AS Grades FROM SYSIBM.SYSTABLES WHERE TABID = 1, but that's a clumsy expression.
I've kept with the column name Grades (plural) even though it looks like it contains one grade. It is depressing how often people ask questions about anonymous tables.